J. Imaging, Volume 9, Issue 10 (October 2023) – 41 articles

Cover Story: Data augmentation is a fundamental machine learning technique that expands the size of training datasets. In recent years, the development of several libraries has simplified the utilization of diverse data augmentation strategies for different tasks. Through a curated taxonomy, we present an organized classification of the different approaches employed by these libraries for computer vision tasks. To ensure the accessibility of this valuable information, a dedicated public website named DALib has been created. By offering this comprehensive resource, we aim to empower practitioners and contribute to the advancement of computer vision research and applications through the effective utilization of data augmentation techniques.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
16 pages, 6517 KiB  
Article
Super-Resolved Dynamic 3D Reconstruction of the Vocal Tract during Natural Speech
by Karyna Isaieva, Freddy Odille, Yves Laprie, Guillaume Drouot, Jacques Felblinger and Pierre-André Vuissoz
J. Imaging 2023, 9(10), 233; https://doi.org/10.3390/jimaging9100233 - 20 Oct 2023
Cited by 1 | Viewed by 1347
Abstract
MRI is the gold standard modality for speech imaging. However, it remains relatively slow, which complicates imaging of fast movements; thus, MRI of the vocal tract is often performed in 2D. While 3D MRI provides more information, the quality of such images is often insufficient. The goal of this study was to test the applicability of super-resolution algorithms for dynamic vocal tract MRI. In total, 25 sagittal slices of 8 mm with an in-plane resolution of 1.6 × 1.6 mm² were acquired consecutively using a highly undersampled radial 2D FLASH sequence. The volunteers read a text in French under two different protocols. The slices were aligned using the simultaneously recorded sound. The super-resolution strategy was used to reconstruct 1.6 × 1.6 × 1.6 mm³ isotropic volumes. The resulting images were less sharp than the native 2D images but demonstrated a higher signal-to-noise ratio. Super-resolution was also shown to eliminate inter-slice inconsistencies, yielding smooth transitions between slices. Additionally, using visual stimuli and shorter text fragments was demonstrated to improve inter-slice consistency and super-resolved image sharpness. Therefore, with an appropriate choice of speech task, the proposed method allows for the reconstruction of high-quality dynamic 3D volumes of the vocal tract during natural speech. Full article
(This article belongs to the Section Medical Imaging)

46 pages, 18393 KiB  
Review
DALib: A Curated Repository of Libraries for Data Augmentation in Computer Vision
by Sofia Amarù, Davide Marelli, Gianluigi Ciocca and Raimondo Schettini
J. Imaging 2023, 9(10), 232; https://doi.org/10.3390/jimaging9100232 - 20 Oct 2023
Viewed by 2847
Abstract
Data augmentation is a fundamental technique in machine learning that plays a crucial role in expanding the size of training datasets. By applying various transformations or modifications to existing data, data augmentation enhances the generalization and robustness of machine learning models. In recent years, the development of several libraries has simplified the utilization of diverse data augmentation strategies across different tasks. This paper focuses on the exploration of the most widely adopted libraries specifically designed for data augmentation in computer vision tasks. Here, we aim to provide a comprehensive survey of publicly available data augmentation libraries, facilitating practitioners to navigate these resources effectively. Through a curated taxonomy, we present an organized classification of the different approaches employed by these libraries, along with accompanying application examples. By examining the techniques of each library, practitioners can make informed decisions in selecting the most suitable augmentation techniques for their computer vision projects. To ensure the accessibility of this valuable information, a dedicated public website named DALib has been created. This website serves as a centralized repository where the taxonomy, methods, and examples associated with the surveyed data augmentation libraries can be explored. By offering this comprehensive resource, we aim to empower practitioners and contribute to the advancement of computer vision research and applications through effective utilization of data augmentation techniques. Full article

16 pages, 5237 KiB  
Article
Spatio-Temporal Positron Emission Tomography Reconstruction with Attenuation and Motion Correction
by Enza Cece, Pierre Meyrat, Enza Torino, Olivier Verdier and Massimiliano Colarieti-Tosti
J. Imaging 2023, 9(10), 231; https://doi.org/10.3390/jimaging9100231 - 20 Oct 2023
Viewed by 1276
Abstract
The detection of cancer lesions of a size comparable to the typical system resolution of modern scanners is a long-standing problem in Positron Emission Tomography. In this paper, the effect of composing an image-registering convolutional neural network with the modeling of the static data acquisition (i.e., the forward model) is investigated. Two algorithms for Positron Emission Tomography reconstruction with motion and attenuation correction are proposed, and their performance is evaluated on the detectability of small pulmonary lesions. The evaluation is performed on synthetic data with respect to chosen figures of merit, visual inspection, and an ideal observer. The commonly used figures of merit (Peak Signal-to-Noise Ratio, Recovery Coefficient, and Signal Difference-to-Noise Ratio) give inconclusive responses, whereas visual inspection and the Channelised Hotelling Observer suggest that the proposed algorithms outperform current clinical practice. Full article
(This article belongs to the Section Medical Imaging)

22 pages, 6119 KiB  
Article
A Geometric Feature-Based Algorithm for the Virtual Reading of Closed Historical Manuscripts
by Rosa Brancaccio, Fauzia Albertin, Marco Seracini, Matteo Bettuzzi and Maria Pia Morigi
J. Imaging 2023, 9(10), 230; https://doi.org/10.3390/jimaging9100230 - 20 Oct 2023
Viewed by 1270
Abstract
X-ray Computed Tomography (CT), a commonly used technique in a wide variety of research fields, nowadays represents a unique and powerful procedure to discover, reveal and preserve a fundamental part of our patrimony: ancient handwritten documents. For modern and well-preserved ones, traditional document scanning systems are suitable for their correct digitization, and, consequently, for their preservation; however, the digitization of ancient, fragile and damaged manuscripts is still a formidable challenge for conservators. The X-ray tomographic approach has already proven its effectiveness in data acquisition, but the algorithmic steps from tomographic images to real page-by-page extraction and reading are still a difficult undertaking. In this work, we propose a new procedure for the segmentation of single pages from the 3D tomographic data of closed historical manuscripts, based on geometric features and flood fill methods. The achieved results prove the capability of the methodology in segmenting the different pages recorded starting from the whole CT acquired volume. Full article
(This article belongs to the Section Document Analysis and Processing)

20 pages, 5066 KiB  
Article
A Simplified Convex Optimization Model for Image Restoration with Multiplicative Noise
by Haoxiang Che and Yuchao Tang
J. Imaging 2023, 9(10), 229; https://doi.org/10.3390/jimaging9100229 - 20 Oct 2023
Viewed by 1283
Abstract
In this paper, we propose a novel convex variational model for image restoration under multiplicative noise. To preserve edges in the restored image, our model incorporates a total variation regularizer. Additionally, we impose an equality constraint on the data fidelity term, which simplifies the model selection process and promotes sparsity in the solution. We adopt the alternating direction method of multipliers (ADMM) to solve the model efficiently. To validate the effectiveness of our model, we conduct numerical experiments on images with both real and synthetic noise and compare its performance with existing methods. The experimental results demonstrate the superiority of our model in terms of PSNR and visual quality. Full article
(This article belongs to the Section Image and Video Processing)
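Several abstracts in this issue, including the one above, report restoration quality as PSNR. As an illustrative aside (not code from the paper), a minimal numpy sketch of the standard PSNR computation for a pair of same-sized images:

```python
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio (dB) between two same-shaped images."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: infinite PSNR
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: a constant offset of 16 on an 8-bit-range image gives MSE = 256,
# hence PSNR = 10 * log10(255^2 / 256) ≈ 24.05 dB.
clean = np.full((8, 8), 100.0)
noisy = clean + 16.0
print(round(psnr(clean, noisy), 2))  # 24.05
```

For 8-bit images the peak value is 255; for images normalized to [0, 1] it would be 1.0.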

14 pages, 4154 KiB  
Article
Measuring Power of Earth Disturbances Using Radio Wave Phase Imager
by Radwan N. K. Sharif and Rodney A. Herring
J. Imaging 2023, 9(10), 228; https://doi.org/10.3390/jimaging9100228 - 20 Oct 2023
Viewed by 1329
Abstract
Numerous studies have investigated ionospheric waves, also known as ionospheric disturbances. These disturbances exhibit complex wave patterns similar to those produced by solar, geomagnetic, and meteorological disturbances and by human activities within the Earth’s atmosphere. The radio wave phase imager described herein measures the power of ionospheric waves using their phase shift as seen in phase images produced by the Long Wavelength Array (LWA), a high-resolution radio camera at the New Mexico Observatory. Software-defined radio (SDR) was used to process the data into an amplitude image and a phase image. The phase image revealed the ionospheric waves, whereas the amplitude image did not. From the phase image produced from the carrier wave received at the LWA, the properties of the ionospheric waves have previously been characterized in terms of their energy and wave vector. In this study, their power was measured directly from the phase shift of the strongest set of ionospheric waves. The power of these waves, which originated at Albuquerque, the major local power consumer, was 15.3 W, producing a power density of 0.018 W/m². The calculated power density that should be generated by the local power stations around Albuquerque was also 0.018 W/m², in agreement with the experimentally measured value. This correspondence shows that the power generated by power stations and being consumed is not lost but is captured by the ionosphere. Full article
(This article belongs to the Special Issue Recent Advances in Image-Based Geotechnics II)

14 pages, 3480 KiB  
Article
Mapping Quantitative Observer Metamerism of Displays
by Giorgio Trumpy, Casper Find Andersen, Ivar Farup and Omar Elezabi
J. Imaging 2023, 9(10), 227; https://doi.org/10.3390/jimaging9100227 - 19 Oct 2023
Viewed by 1609
Abstract
Observer metamerism (OM) is the name given to the variability between the color matches that individual observers consider accurate. The standard color imaging approach, which uses color-matching functions of a single representative observer, does not accurately represent every individual observer’s perceptual properties. This paper investigates OM in color displays and proposes a quantitative assessment of the OM distribution across the chromaticity diagram. An OM metric is calculated from a database of individual LMS cone fundamentals and the spectral power distributions of the display’s primaries. Additionally, a visualization method is suggested to map the distribution of OM across the display’s color gamut. Through numerical assessment of OM using two distinct publicly available sets of individual observers’ functions, the influence of the selected dataset on the intensity and distribution of OM has been underscored. The case study of digital cinema has been investigated, specifically the transition from xenon-arc to laser projectors. The resulting heatmaps represent the “topography” of OM for both types of projectors. The paper also presents color difference values, showing that achromatic highlights could be particularly prone to disagreements between observers in laser-based cinema theaters. Overall, this study provides valuable resources for display manufacturers and researchers, offering insights into observer metamerism and facilitating the development of improved display technologies. Full article
(This article belongs to the Special Issue Advances in Color Imaging, Volume II)

9 pages, 2417 KiB  
Brief Report
Placental Vessel Segmentation Using Pix2pix Compared to U-Net
by Anouk van der Schot, Esther Sikkel, Marèll Niekolaas, Marc Spaanderman and Guido de Jong
J. Imaging 2023, 9(10), 226; https://doi.org/10.3390/jimaging9100226 - 16 Oct 2023
Viewed by 1447
Abstract
Computer-assisted technologies have made significant progress in fetoscopic laser surgery, including placental vessel segmentation. However, the intra- and inter-procedure variability of state-of-the-art segmentation methods remains a significant hurdle. To address this, we investigated the use of conditional generative adversarial networks (cGANs) for fetoscopic image segmentation and compared their performance with the benchmark U-Net technique for placental vessel segmentation. Two deep-learning models, U-Net and pix2pix (a popular cGAN model), were trained and evaluated using a publicly available dataset and an internal validation set. The overall results showed that the pix2pix model outperformed the U-Net model, with a Dice score of 0.80 [0.70; 0.86] versus 0.75 [0.60; 0.84] (p-value < 0.01) and an Intersection over Union (IoU) score of 0.70 [0.61; 0.77] compared to 0.66 [0.53; 0.75] (p-value < 0.01), respectively. The internal validation dataset further confirmed the superiority of the pix2pix model, which achieved Dice and IoU scores of 0.68 [0.53; 0.79] and 0.59 [0.49; 0.69] (p-value < 0.01), respectively, while the U-Net model obtained scores of 0.53 [0.49; 0.64] and 0.49 [0.17; 0.56], respectively. This study successfully compared U-Net and pix2pix models for placental vessel segmentation in fetoscopic images, demonstrating improved results with the cGAN-based approach. However, the challenge of achieving generalizability still needs to be addressed. Full article
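The Dice and Intersection over Union (IoU) scores quoted above are standard overlap metrics for binary segmentation masks. As an illustrative sketch (not the authors' evaluation code), both can be computed directly from boolean masks:

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) for boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * inter / denom if denom else 1.0

def iou_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """IoU (Jaccard) = |A∩B| / |A∪B| for boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

# Toy masks: the prediction recovers 3 of the 4 true pixels.
truth = np.array([[1, 1], [1, 1]])
pred = np.array([[1, 1], [1, 0]])
print(round(dice_score(pred, truth), 3))  # 2*3/(3+4) ≈ 0.857
print(iou_score(pred, truth))             # 3/4 = 0.75
```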

13 pages, 18698 KiB  
Article
Leveraging AI in Postgraduate Medical Education for Rapid Skill Acquisition in Ultrasound-Guided Procedural Techniques
by Flora Wen Xin Xu, Amanda Min Hui Choo, Pamela Li Ming Ting, Shao Jin Ong and Deborah Khoo
J. Imaging 2023, 9(10), 225; https://doi.org/10.3390/jimaging9100225 - 16 Oct 2023
Viewed by 1290
Abstract
Ultrasound-guided techniques are increasingly prevalent and represent a gold standard of care. Skills such as needle visualisation, optimising the target image and directing the needle require deliberate practice. However, training opportunities remain limited by patient case load and safety considerations. Hence, there is a genuine and urgent need for trainees to attain accelerated skill acquisition in a time- and cost-efficient manner that minimises risk to patients. We propose a two-step solution: First, we have created an agar phantom model that simulates human tissue and structures like vessels and nerve bundles. Moreover, we have adopted deep learning techniques to provide trainees with live visualisation of target structures and automate assessment of their user speed and accuracy. Key structures like the needle tip, needle body, target blood vessels, and nerve bundles, are delineated in colour on the processed image, providing an opportunity for real-time guidance of needle positioning and target structure penetration. Quantitative feedback on user speed (time taken for target penetration), accuracy (penetration of correct target), and efficacy in needle positioning (percentage of frames where the full needle is visualised in a longitudinal plane) are also assessable using our model. Our program was able to demonstrate a sensitivity of 99.31%, specificity of 69.23%, accuracy of 91.33%, precision of 89.94%, recall of 99.31%, and F1 score of 0.94 in automated image labelling. Full article
(This article belongs to the Special Issue Application of Machine Learning Using Ultrasound Images, Volume II)

15 pages, 21134 KiB  
Article
Explainable Image Similarity: Integrating Siamese Networks and Grad-CAM
by Ioannis E. Livieris, Emmanuel Pintelas, Niki Kiriakidou and Panagiotis Pintelas
J. Imaging 2023, 9(10), 224; https://doi.org/10.3390/jimaging9100224 - 14 Oct 2023
Cited by 1 | Viewed by 2918
Abstract
With the proliferation of image-based applications in various domains, the need for accurate and interpretable image similarity measures has become increasingly critical. Existing image similarity models often lack transparency, making it challenging to understand why two images are considered similar. In this paper, we propose the concept of explainable image similarity, the goal of which is to develop an approach capable of providing similarity scores along with visual factual and counterfactual explanations. Along this line, we present a new framework that integrates Siamese Networks and Grad-CAM to provide explainable image similarity, and we discuss the potential benefits and challenges of adopting this approach. In addition, we provide a comprehensive discussion of the factual and counterfactual explanations provided by the proposed framework for assisting decision making. The proposed approach has the potential to enhance the interpretability, trustworthiness and user acceptance of image-based systems in real-world image similarity applications. Full article

21 pages, 633 KiB  
Review
18F-FDG PET/MRI and 18F-FDG PET/CT for the Management of Gynecological Malignancies: A Comprehensive Review of the Literature
by Leila Allahqoli, Sevil Hakimi, Antonio Simone Laganà, Zohre Momenimovahed, Afrooz Mazidimoradi, Azam Rahmani, Arezoo Fallahi, Hamid Salehiniya, Mohammad Matin Ghiasvand and Ibrahim Alkatout
J. Imaging 2023, 9(10), 223; https://doi.org/10.3390/jimaging9100223 - 13 Oct 2023
Cited by 3 | Viewed by 1968
Abstract
Objective: Positron emission tomography with 2-deoxy-2-[fluorine-18]fluoro-D-glucose, integrated with computed tomography (18F-FDG PET/CT) or magnetic resonance imaging (18F-FDG PET/MRI), has emerged as a promising tool for managing various types of cancer. This review was conducted to investigate the role of 18F-FDG PET/CT and 18F-FDG PET/MRI in the management of gynecological malignancies. Search strategy: We searched for relevant articles in three databases: PubMed/MEDLINE, Scopus, and Web of Science. Selection criteria: All studies reporting data on FDG PET/CT and FDG PET/MRI in the management of gynecological cancer, performed anywhere in the world and published exclusively in the English language, were included in the present study. Data collection and analysis: We used EndNote software (EndNote X8.1, Thomson Reuters) to list the studies and screen them on the basis of the inclusion criteria. Data, including first author, publication year, sample size, clinical application, imaging type, and main result, were extracted and tabulated in Excel. The sensitivity, specificity, and diagnostic accuracy of the modalities were extracted and summarized. Main results: After screening 988 records, 166 studies published between 2004 and 2022 were included, covering various methodologies. Studies were divided into the following five categories concerning the role of FDG PET/CT and FDG PET/MRI in the management of: (a) endometrial cancer (n = 30); (b) ovarian cancer (n = 60); (c) cervical cancer (n = 50); (d) vulvar and vaginal cancers (n = 12); and (e) gynecological cancers in general (n = 14). Conclusions: FDG PET/CT and FDG PET/MRI have demonstrated potential as non-invasive imaging tools for enhancing the management of gynecological malignancies. Nevertheless, certain associated challenges warrant attention. Full article

13 pages, 3060 KiB  
Article
The Pattern of Metastatic Breast Cancer: A Prospective Head-to-Head Comparison of [18F]FDG-PET/CT and CE-CT
by Rosa Gram-Nielsen, Ivar Yannick Christensen, Mohammad Naghavi-Behzad, Sara Elisabeth Dahlsgaard-Wallenius, Nick Møldrup Jakobsen, Oke Gerke, Jeanette Dupont Jensen, Marianne Ewertz, Malene Grubbe Hildebrandt and Marianne Vogsen
J. Imaging 2023, 9(10), 222; https://doi.org/10.3390/jimaging9100222 - 12 Oct 2023
Viewed by 1538
Abstract
The study aimed to compare the metastatic pattern of breast cancer and the intermodality proportion of agreement between [18F]FDG-PET/CT and CE-CT. Women with metastatic breast cancer (MBC) were enrolled prospectively and underwent a combined [18F]FDG-PET/CT and CE-CT scan to diagnose MBC. Experienced nuclear medicine and radiology physicians evaluated the scans blinded to the opposite scan results. Descriptive statistics were applied, and the intermodality proportion of agreement was used to compare [18F]FDG-PET/CT and CE-CT. In total, 76 women with verified MBC were enrolled in the study. The reported number of site-specific metastases for [18F]FDG-PET/CT vs. CE-CT was 53 (69.7%) vs. 44 (57.9%) for bone lesions, 31 (40.8%) vs. 43 (56.6%) for lung lesions, and 16 (21.1%) vs. 23 (30.3%) for liver lesions, respectively. The proportion of agreement between imaging modalities was 76.3% (95% CI 65.2–85.3) for bone lesions; 82.9% (95% CI 72.5–90.6) for liver lesions; 57.9% (95% CI 46.0–69.1) for lung lesions; and 59.2% (95% CI 47.3–70.4) for lymph nodes. In conclusion, bone and distant lymph node metastases were reported more often by [18F]FDG-PET/CT than CE-CT, while liver and lung metastases were reported more often by CE-CT than [18F]FDG-PET/CT. Agreement between scans was highest for bone and liver lesions and lowest for lymph node metastases. Full article
(This article belongs to the Section Medical Imaging)
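The intermodality proportion of agreement used above is simply the fraction of patients for whom the two modalities make the same site-level call. A minimal sketch with hypothetical per-patient calls (the data below are invented for illustration, not taken from the study):

```python
def proportion_of_agreement(modality_a, modality_b):
    """Fraction of patients for whom two imaging modalities give the same
    positive/negative call for a given metastatic site."""
    assert len(modality_a) == len(modality_b)
    agree = sum(a == b for a, b in zip(modality_a, modality_b))
    return agree / len(modality_a)

# Hypothetical site-level calls for 10 patients (1 = lesion reported).
pet = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
cect = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
print(proportion_of_agreement(pet, cect))  # 7/10 = 0.7
```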

22 pages, 4250 KiB  
Article
Exploring the Limitations of Hybrid Adiabatic Quantum Computing for Emission Tomography Reconstruction
by Merlin A. Nau, A. Hans Vija, Wesley Gohn, Maximilian P. Reymann and Andreas K. Maier
J. Imaging 2023, 9(10), 221; https://doi.org/10.3390/jimaging9100221 - 11 Oct 2023
Cited by 2 | Viewed by 1959
Abstract
Our study explores the feasibility of quantum computing in emission tomography reconstruction, addressing a noisy, ill-conditioned inverse problem. In current clinical practice, this is typically solved by iterative methods minimizing an L2 norm. After reviewing quantum computing principles, we propose the use of a commercially available quantum annealer and employ corresponding hybrid solvers, which combine quantum and classical computing to handle larger problems. We demonstrate how to frame image reconstruction as a combinatorial optimization problem suited for these quantum annealers and hybrid systems. Using a toy problem, we analyze reconstructions of binary and integer-valued images with respect to image size and compare them to conventional methods. Additionally, we test our method’s performance under noise and data underdetermination. In summary, our method demonstrates competitive performance with traditional algorithms for binary images up to an image size of 32×32 on the toy problem, even under noisy and underdetermined conditions. However, scalability challenges emerge as image size and pixel bit range increase, restricting hybrid quantum computing as a practical tool for emission tomography reconstruction until significant advancements are made to address this issue. Full article

20 pages, 65655 KiB  
Article
A Spatially Guided Machine-Learning Method to Classify and Quantify Glomerular Patterns of Injury in Histology Images
by Justinas Besusparis, Mindaugas Morkunas and Arvydas Laurinavicius
J. Imaging 2023, 9(10), 220; https://doi.org/10.3390/jimaging9100220 - 11 Oct 2023
Viewed by 1198
Abstract
Introduction: The diagnosis of glomerular diseases is primarily based on visual assessment of histologic patterns. Semi-quantitative scoring of active and chronic lesions is often required to assess individual characteristics of the disease. The reproducibility of visual scoring systems remains debatable, while digital and machine-learning technologies present opportunities to detect, classify and quantify glomerular lesions, also considering their inter- and intraglomerular heterogeneity. Materials and methods: We performed a cross-validated comparison of three modifications of a convolutional neural network (CNN)-based approach for the recognition and intraglomerular quantification of nine main glomerular patterns of injury. Reference values provided by two nephropathologists were used for validation. For each glomerular image, visual attention heatmaps were generated with a probability of class attribution for further intraglomerular quantification. The quality of classifier-produced heatmaps was evaluated by the intersection over union (IoU) metric between predicted and ground truth localization heatmaps. Results: The proposed spatially guided modification of the CNN classifier achieved the highest glomerular pattern classification accuracies, with area under the curve (AUC) values up to 0.981. With regard to heatmap overlap area and intraglomerular pattern quantification, the spatially guided classifier achieved a significantly higher generalized mean IoU value compared to single-multiclass and multiple-binary classifiers. Conclusions: We propose a spatially guided CNN classifier that, in our experiments, reveals the potential to achieve high accuracy in the localization of intraglomerular patterns. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

20 pages, 5567 KiB  
Article
Evaluating Retinal Disease Diagnosis with an Interpretable Lightweight CNN Model Resistant to Adversarial Attacks
by Mohan Bhandari, Tej Bahadur Shahi and Arjun Neupane
J. Imaging 2023, 9(10), 219; https://doi.org/10.3390/jimaging9100219 - 11 Oct 2023
Cited by 1 | Viewed by 1795
Abstract
Optical Coherence Tomography (OCT) is an imperative diagnostic tool enabling the diagnosis of retinal diseases and anomalies. Manual assessment of those anomalies by specialists is the norm, but its labor-intensive nature calls for more proficient strategies. Consequently, the study recommends employing a Convolutional Neural Network (CNN) for the classification of OCT images derived from the OCT dataset into distinct categories, including Choroidal NeoVascularization (CNV), Diabetic Macular Edema (DME), Drusen, and Normal. The average k-fold (k = 10) training accuracy, test accuracy, validation accuracy, training loss, test loss, and validation loss values of the proposed model are 96.33%, 94.29%, 94.12%, 0.1073, 0.2002, and 0.1927, respectively. The Fast Gradient Sign Method (FGSM) is employed to introduce non-random noise aligned with the gradient of the cost function with respect to the input data, with varying epsilon values scaling the noise; the model correctly handles all noise levels below an epsilon of 0.1. The explainable AI algorithms Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are utilized to provide human-interpretable explanations approximating the behaviour of the model within the region of a particular retinal image. Additionally, two supplementary datasets, namely COVID-19 and Kidney Stone, are assimilated to enhance the model’s robustness and versatility, resulting in a level of precision comparable to state-of-the-art methodologies. Incorporating a lightweight CNN model with 983,716 parameters and 2.37 × 10⁸ floating-point operations (FLOPs), and leveraging explainable AI strategies, this study contributes to efficient OCT-based diagnosis, underscores its potential in advancing medical diagnostics, and offers assistance in the Internet of Medical Things. Full article
(This article belongs to the Special Issue Advances in Retinal Image Processing)
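The FGSM perturbation described in the abstract above has a standard closed form: the input is nudged by epsilon in the direction of the sign of the loss gradient. A minimal sketch with made-up array values; in practice the gradient would come from backpropagation through the trained CNN:

```python
import numpy as np

def fgsm_perturb(image, loss_grad, epsilon):
    """Fast Gradient Sign Method: x_adv = clip(x + eps * sign(dL/dx), 0, 1).

    `image` is a float array scaled to [0, 1]; `loss_grad` is the gradient of
    the cost function with respect to the input, obtained via backpropagation.
    """
    adversarial = image + epsilon * np.sign(loss_grad)
    return np.clip(adversarial, 0.0, 1.0)

# Toy 2x2 "image" and an arbitrary stand-in gradient (illustrative values).
x = np.array([[0.2, 0.8], [0.5, 0.99]])
g = np.array([[1.0, -3.0], [0.0, 2.0]])
x_adv = fgsm_perturb(x, g, epsilon=0.05)
```

By construction, no pixel moves by more than epsilon, which is why robustness is reported per epsilon level.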
19 pages, 4088 KiB  
Article
Threshold-Based BRISQUE-Assisted Deep Learning for Enhancing Crack Detection in Concrete Structures
by Sanjeetha Pennada, Marcus Perry, Jack McAlorum, Hamish Dow and Gordon Dobie
J. Imaging 2023, 9(10), 218; https://doi.org/10.3390/jimaging9100218 - 10 Oct 2023
Cited by 2 | Viewed by 1593
Abstract
Automated visual inspection has made significant advancements in the detection of cracks on the surfaces of concrete structures. However, low-quality images significantly affect the classification performance of convolutional neural networks (CNNs). Therefore, it is essential to evaluate the suitability of image datasets used in deep learning models, like Visual Geometry Group 16 (VGG16), for accurate crack detection. This study explores the sensitivity of the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) method to different types of image degradations, such as Gaussian noise and Gaussian blur. By evaluating the performance of the VGG16 model on these degraded datasets with varying levels of noise and blur, a correlation is established between image degradation and BRISQUE scores. The results demonstrate that images with lower BRISQUE scores achieve higher accuracy, F1 score, and Matthews correlation coefficient (MCC) in crack classification. The study proposes the implementation of a BRISQUE score threshold (BT) to optimise training and testing times, leading to reduced computational costs. These findings have significant implications for enhancing accuracy and reliability in automated visual inspection systems for crack detection and structural health monitoring (SHM). Full article
(This article belongs to the Special Issue Feature Papers in Section AI in Imaging)
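The threshold idea above amounts to a pre-filter on the dataset: images whose no-reference quality score exceeds the chosen threshold BT are excluded before training or testing. A minimal sketch with hypothetical scores (real BRISQUE scoring would come from an image-quality library and is not reimplemented here):

```python
def screen_by_quality(scores, threshold):
    """Keep indices of images whose no-reference quality score is at or
    below `threshold` (lower BRISQUE-style scores mean better quality)."""
    return [i for i, score in enumerate(scores) if score <= threshold]

# Hypothetical scores for five images; a threshold of 40 keeps the first three.
kept = screen_by_quality([12.5, 33.0, 40.0, 55.2, 71.8], threshold=40.0)
```

Shrinking the training and test sets this way is what yields the reported savings in computation time.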
17 pages, 505 KiB  
Article
Make It Less Complex: Autoencoder for Speckle Noise Removal—Application to Breast and Lung Ultrasound
by Duarte Oliveira-Saraiva, João Mendes, João Leote, Filipe André Gonzalez, Nuno Garcia, Hugo Alexandre Ferreira and Nuno Matela
J. Imaging 2023, 9(10), 217; https://doi.org/10.3390/jimaging9100217 - 10 Oct 2023
Viewed by 1832
Abstract
Ultrasound (US) imaging is used in the diagnosis and monitoring of COVID-19 and breast cancer. The presence of Speckle Noise (SN) is a downside to its usage since it decreases lesion conspicuity. Filters can be used to remove SN, but they involve time-consuming computation and parameter tuning. Several researchers have been developing complex Deep Learning (DL) models (150,000–500,000 parameters) for the removal of simulated added SN, without focusing on the real-world application of removing naturally occurring SN from original US images. Here, a simpler (<30,000 parameters) Convolutional Neural Network Autoencoder (CNN-AE) to remove SN from US images of the breast and lung is proposed. In order to do so, simulated SN was added to such US images, considering four different noise levels (σ = 0.05, 0.1, 0.2, 0.5). The original US images (N = 1227, breast + lung) were given as targets, while the noised US images served as the input. The Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) were used to compare the output of the CNN-AE and of the Median and Lee filters with the original US images. The CNN-AE outperformed the use of these classic filters for every noise level. To see how well the model removed naturally occurring SN from the original US images and to test its real-world applicability, a CNN model that differentiates malignant from benign breast lesions was developed. Several inputs were used to train the model (original, CNN-AE denoised, filter denoised, and noised US images). The use of the original US images resulted in the highest Matthews Correlation Coefficient (MCC) and accuracy values, while for sensitivity and negative predictive values, the CNN-AE-denoised US images (for higher σ values) achieved the best results.
Our results demonstrate that the application of a simpler DL model for SN removal results in fewer misclassifications of malignant breast lesions in comparison to the use of original US images and the application of the Median filter. This shows that the use of a less-complex model and the focus on clinical practice applicability are relevant and should be considered in future studies. Full article
(This article belongs to the Special Issue Application of Machine Learning Using Ultrasound Images, Volume II)
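The simulated speckle and the PSNR comparison metric used above are both simple to state: speckle is multiplicative Gaussian noise, and PSNR measures reconstruction error on a log scale. A sketch assuming images scaled to [0, 1] (SSIM is omitted for brevity; the flat test image is illustrative only):

```python
import numpy as np

def add_speckle(image, sigma, rng):
    """Simulate multiplicative speckle: I_noisy = I + I * n, n ~ N(0, sigma^2)."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + image * noise, 0.0, 1.0)

def psnr(reference, test, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for images on a [0, peak] scale."""
    mse = np.mean((reference - test) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)          # flat stand-in image
noisy = add_speckle(clean, sigma=0.2, rng=rng)
```

A denoiser is judged by how much it raises PSNR (and SSIM) of the noisy image back toward the clean reference.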
17 pages, 11760 KiB  
Article
Real-Time Obstacle Detection with YOLOv8 in a WSN Using UAV Aerial Photography
by Shakila Rahman, Jahid Hasan Rony, Jia Uddin and Md Abdus Samad
J. Imaging 2023, 9(10), 216; https://doi.org/10.3390/jimaging9100216 - 10 Oct 2023
Cited by 3 | Viewed by 4610
Abstract
Nowadays, wireless sensor networks (WSNs) have a significant and long-lasting impact on numerous fields that affect all facets of our lives, including governmental, civil, and military applications. WSNs contain sensor nodes linked together via wireless communication links that need to relay data instantly or subsequently. In this paper, we focus on unmanned aerial vehicle (UAV)-aided data collection in WSNs, where multiple UAVs collect data from a group of sensors. The UAVs may encounter static or moving obstacles (e.g., buildings, trees, vehicles) in their traveling path while collecting the data. In the proposed system, a UAV starts and ends its data collection tour at the base station and, while collecting data, captures images and videos using its aerial camera. After processing the captured aerial images and videos, a YOLOv8-based model is trained to detect obstacles in the UAVs’ traveling path. The detection results show that the proposed YOLOv8 model performs better than other baseline algorithms in different scenarios, reaching an F1 score of 96% after 200 epochs. Full article
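The F1 score reported above is the harmonic mean of detection precision and recall. A sketch computing it from raw detection counts (the counts below are hypothetical, not taken from the paper):

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical tally: 96 true positives, 4 false positives, 4 false negatives.
score = f1_score(tp=96, fp=4, fn=4)
```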
17 pages, 1959 KiB  
Article
Comparative Analysis of Machine Learning Models for Image Detection of Colonic Polyps vs. Resected Polyps
by Adriel Abraham, Rejath Jose, Jawad Ahmad, Jai Joshi, Thomas Jacob, Aziz-ur-rahman Khalid, Hassam Ali, Pratik Patel, Jaspreet Singh and Milan Toma
J. Imaging 2023, 9(10), 215; https://doi.org/10.3390/jimaging9100215 - 9 Oct 2023
Cited by 3 | Viewed by 2005
Abstract
(1) Background: Colon polyps are common protrusions in the colon’s lumen, with potential risks of developing colorectal cancer. Early detection and intervention of these polyps are vital for reducing colorectal cancer incidence and mortality rates. This research aims to evaluate and compare the performance of three machine learning image classification models in detecting and classifying colon polyps. (2) Methods: The performance of three machine learning image classification models, Google Teachable Machine (GTM), Roboflow3 (RF3), and You Only Look Once version 8 (YOLOv8n), in the detection and classification of colon polyps was evaluated using the testing split for each model. The external validity of the test was analyzed using 90 images that were not used to train, test, or validate the models. The study used a dataset of colonoscopy images of normal colon, polyps, and resected polyps. The study assessed the models’ ability to correctly classify the images into their respective classes using precision, recall, and F1 scores generated from confusion matrix analysis and performance graphs. (3) Results: All three models successfully distinguished between normal colon, polyps, and resected polyps in colonoscopy images. GTM achieved the highest accuracy (0.99), with consistent precision, recall, and F1 scores of 1.00 for the ‘normal’ class, 0.97–1.00 for ‘polyps’, and 0.97–1.00 for ‘resected polyps’. While GTM exclusively classified images into these three categories, both YOLOv8n and RF3 were able to detect and specify the location of normal colonic tissue, polyps, and resected polyps, with YOLOv8n and RF3 achieving overall accuracies of 0.84 and 0.87, respectively. (4) Conclusions: Machine learning, particularly models like GTM, shows promising results in ensuring comprehensive detection of polyps during colonoscopies. Full article
(This article belongs to the Section Medical Imaging)
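Per-class precision, recall, and F1, as reported above, fall out of the confusion matrix directly. A sketch with a hypothetical three-class matrix (rows are true classes, columns are predictions; counts are invented for illustration):

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, recall, and F1 from a confusion matrix where
    cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)   # column sums: all predictions of a class
    recall = tp / cm.sum(axis=1)      # row sums: all true members of a class
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for three classes: normal, polyp, resected polyp.
cm = [[30, 0, 0],
      [1, 28, 1],
      [0, 2, 28]]
precision, recall, f1 = per_class_metrics(cm)
```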
15 pages, 4126 KiB  
Article
Performance Comparison of Classical Methods and Neural Networks for Colour Correction
by Abdullah Kucuk, Graham D. Finlayson, Rafal Mantiuk and Maliha Ashraf
J. Imaging 2023, 9(10), 214; https://doi.org/10.3390/jimaging9100214 - 7 Oct 2023
Viewed by 1705
Abstract
Colour correction is the process of converting RAW RGB pixel values of digital cameras to a standard colour space such as CIE XYZ. A range of regression methods including linear, polynomial and root-polynomial least-squares have been deployed. However, in recent years, various neural network (NN) models have also started to appear in the literature as an alternative to classical methods. In the first part of this paper, a leading neural network approach is compared and contrasted with regression methods. We find that, although the neural network model supports improved colour correction compared with simple least-squares regression, it performs less well than the more advanced root-polynomial regression. Moreover, the relative improvement afforded by NNs, compared to linear least-squares, is diminished when the regression methods are adapted to minimise a perceptual colour error. Problematically, unlike linear and root-polynomial regressions, the NN approach is tied to a fixed exposure (and when exposure changes, the afforded colour correction can be quite poor). We explore two solutions that make NNs more exposure-invariant. First, we use data augmentation to train the NN for a range of typical exposures and second, we propose a new NN architecture which, by construction, is exposure-invariant. Finally, we look into how the performance of these algorithms is influenced when models are trained and tested on different datasets. As expected, the performance of all methods drops when tested with completely different datasets. However, we noticed that the regression methods still outperform the NNs in terms of colour correction, even though the relative performance of the regression methods does change based on the train and test datasets. Full article
(This article belongs to the Special Issue Imaging and Color Vision)
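The linear least-squares baseline discussed above fits a 3×3 matrix mapping RAW RGB to XYZ. A minimal sketch on synthetic data (the matrix values are made up, not from the paper's datasets):

```python
import numpy as np

rng = np.random.default_rng(1)
raw = rng.uniform(0.0, 1.0, size=(100, 3))   # synthetic camera RAW RGB samples

# Hypothetical ground-truth 3x3 camera-to-XYZ matrix (made-up values).
M_true = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.7, 0.1],
                   [0.0, 0.2, 0.8]])
xyz = raw @ M_true.T                          # corresponding CIE XYZ values

# Linear least-squares colour correction: find M such that xyz ≈ raw @ M.T
M_fit, *_ = np.linalg.lstsq(raw, xyz, rcond=None)
M_fit = M_fit.T
```

A linear map commutes with exposure scaling (k·raw maps to k·xyz), which is the exposure invariance the abstract contrasts with the fixed-exposure behaviour of a generic neural network.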
16 pages, 5170 KiB  
Article
Radiomics Analyses to Predict Histopathology in Patients with Metastatic Testicular Germ Cell Tumors before Post-Chemotherapy Retroperitoneal Lymph Node Dissection
by Anna Scavuzzo, Giovanni Pasini, Elisabetta Crescio, Miguel Angel Jimenez-Rios, Pavel Figueroa-Rodriguez, Albert Comelli, Giorgio Russo, Ivan Calvo Vazquez, Sebastian Muruato Araiza, David Gomez Ortiz, Delia Perez Montiel, Alejandro Lopez Saavedra and Alessandro Stefano
J. Imaging 2023, 9(10), 213; https://doi.org/10.3390/jimaging9100213 - 7 Oct 2023
Cited by 1 | Viewed by 1702
Abstract
Background: The identification of histopathology in metastatic non-seminomatous testicular germ cell tumors (TGCT) before post-chemotherapy retroperitoneal lymph node dissection (PC-RPLND) holds significant potential to reduce treatment-related morbidity in young patients, addressing an important survivorship concern. Aim: To explore this possibility, we conducted a study investigating the role of computed tomography (CT) radiomics models that integrate clinical predictors, enabling personalized prediction of histopathology in metastatic non-seminomatous TGCT patients prior to PC-RPLND. In this retrospective study, we included a cohort of 122 patients. Methods: Using dedicated radiomics software, we segmented the targets and extracted quantitative features from the CT images. Subsequently, we employed feature selection techniques and developed radiomics-based machine learning models to predict histological subtypes. To ensure the robustness of our procedure, we implemented a 5-fold cross-validation approach. When evaluating the models’ performance, we measured metrics such as the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision, and F-score. Result: Our radiomics model based on the Support Vector Machine achieved an optimal average AUC of 0.945. Conclusions: The presented CT-based radiomics model can potentially serve as a non-invasive tool to predict histopathological outcomes, differentiating among fibrosis/necrosis, teratoma, and viable tumor in metastatic non-seminomatous TGCT before PC-RPLND. It has the potential to be considered a promising tool to mitigate the risk of over- or under-treatment in young patients, although multi-center validation is critical to confirm the clinical utility of the proposed radiomics workflow. Full article
(This article belongs to the Section Medical Imaging)
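The 5-fold cross-validation used above partitions the cohort so that each patient is validated exactly once. A minimal index-splitting sketch (scikit-learn's KFold would normally be used; shown here without it):

```python
def kfold_splits(n_samples, k):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation;
    every sample appears in exactly one validation fold."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

splits = list(kfold_splits(122, 5))  # a 122-patient cohort, 5 folds
```

The reported AUC is then the average of the per-fold AUCs, which guards against an optimistic estimate from a single lucky split.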
12 pages, 5869 KiB  
Article
Near-Infrared Fluorescence Imaging in Preclinical Models of Glioblastoma
by Monserrat Llaguno-Munive, Wilberto Villalba-Abascal, Alejandro Avilés-Salas and Patricia Garcia-Lopez
J. Imaging 2023, 9(10), 212; https://doi.org/10.3390/jimaging9100212 - 6 Oct 2023
Viewed by 1337
Abstract
Cancer is a public health problem requiring ongoing research to improve current treatments and discover novel therapies. More accurate imaging would facilitate such research. Near-infrared fluorescence has been developed as a non-invasive imaging technique capable of visualizing and measuring biological processes at the molecular level in living subjects. In this work, we evaluate tumor activity in two preclinical glioblastoma models by using the fluorochrome IRDye 800CW coupled to different molecules: the tripeptide Arg-Gly-Asp (RGD), 2-amino-2-deoxy-D-glucose (2-DG), and polyethylene glycol (PEG). These molecules interact with pathological conditions of tumors, including their overexpression of αvβ3 integrins (RGD), elevated glucose uptake (2-DG), and enhanced permeability and retention effect (PEG). IRDye 800CW RGD gave the best in vivo fluorescence signal from the tumor area, which contrasted well with the low fluorescence intensity of healthy tissue. In the ex vivo imaging (dissected tumor), the accumulation of IRDye 800CW RGD could be appreciated at the tumor site. Glioblastoma tumors were thus detected with specificity and sensitivity by utilizing IRDye 800CW RGD, a near-infrared fluorophore combined with a marker of αvβ3 integrin expression. Further research is needed on its capacity to monitor tumor growth in glioblastoma after chemotherapy. Full article
(This article belongs to the Special Issue Fluorescence Imaging and Analysis of Cellular System)
24 pages, 16214 KiB  
Article
Qualification of the PAVIN Fog and Rain Platform and Its Digital Twin for the Evaluation of a Pedestrian Detector in Fog
by Charlotte Segonne and Pierre Duthon
J. Imaging 2023, 9(10), 211; https://doi.org/10.3390/jimaging9100211 - 3 Oct 2023
Viewed by 1540
Abstract
Vehicles featuring partially automated driving can now be certified within a guaranteed operational design domain. The verification in all kinds of scenarios, including fog, cannot be carried out in real conditions (risks or low occurrence). Simulation tools for adverse weather conditions (e.g., physical, numerical) must be implemented and validated. The aim of this study is, therefore, to verify what criteria need to be met to obtain sufficient data to test AI-based pedestrian detection algorithms. It presents analyses of both real and numerically simulated data. A novel method for test environment evaluation, based on a reference detection algorithm, was set up. The following parameters are taken into account in this study: weather conditions, pedestrian variety, the distance of pedestrians to the camera, fog uncertainty, the number of frames, and artificial fog vs. numerically simulated fog. Across all examined elements, the disparity between results derived from real and simulated data is less than 10%. The results obtained provide a basis for validating and improving standards dedicated to the testing and approval of autonomous vehicles. Full article
(This article belongs to the Special Issue Machine Learning for Human Activity Recognition)
11 pages, 1961 KiB  
Article
Radiofrequency Echographic Multispectrometry (REMS): A New Option in the Assessment of Bone Status in Adults with Osteogenesis Imperfecta
by Carla Caffarelli, Antonella Al Refaie, Caterina Mondillo, Alessandro Versienti, Leonardo Baldassini, Michela De Vita, Maria Dea Tomai Pitinca and Stefano Gonnelli
J. Imaging 2023, 9(10), 210; https://doi.org/10.3390/jimaging9100210 - 3 Oct 2023
Cited by 3 | Viewed by 1906
Abstract
This study aimed to estimate the utility of the Radiofrequency Echographic Multispectrometry (REMS) approach in the assessment of bone mineral density (BMD) in subjects with osteogenesis imperfecta (OI). In 41 subjects (40.5 ± 18.7 years) with OI and in 36 healthy controls, we measured BMD at the lumbar spine (LS-BMD), femoral neck (FN-BMD) and total hip (TH-BMD), employing a dual-energy X-ray absorptiometry (DXA) tool. Additionally, REMS scans were also performed at the lumbar and femoral sites. The presence and number of reported fractures were assessed in the study population. Patients with a history of fragility fractures represented 84.5% of the study population. OI subjects showed significantly reduced BMD values both at the lumbar spine and at the femoral subregions (p < 0.01) compared to healthy controls, whether measured with the DXA or the REMS method. Dividing OI patients on the basis of the Sillence classification, no differences were found in the DXA-measured LS-BMD values between the OI Type I group and the OI Type III and IV groups. On the contrary, the OI Type III and IV groups presented significantly lower values of both the Trabecular Bone Score (TBS) and the REMS-measured LS-BMD with respect to OI Type I patients (p < 0.05). Based on the data of this study, it is possible to conclude that the new REMS assessment, which does not use ionizing radiation, also represents an excellent method for studying bone status in subjects affected by OI. Full article
(This article belongs to the Section Medical Imaging)
9 pages, 5007 KiB  
Article
Photon-Counting CT Material Decomposition in Bone Imaging
by Abhisek Bhattarai, Ray Tanaka, Andy Wai Kan Yeung and Varut Vardhanabhuti
J. Imaging 2023, 9(10), 209; https://doi.org/10.3390/jimaging9100209 - 2 Oct 2023
Cited by 1 | Viewed by 1490
Abstract
The accurate screening of osteoporosis is important for identifying persons at risk. The diagnosis of bone conditions using dual X-ray absorptiometry is limited to extracting areal bone mineral density (BMD) and fails to provide any structural information. Computed tomography (CT) is excellent for morphological imaging but not ideal for material quantification. Advanced photon-counting detector CT (PCD-CT) possesses high spectral sensitivity and material decomposition capabilities to simultaneously determine qualitative and quantitative information. In this study, we explored the diagnostic utility of PCD-CT to provide high-resolution 3-D imaging of bone microarchitecture and composition for the sensitive diagnosis of bone in untreated and ovariectomized rats. PCD-CT accurately decomposed the calcium content within hydroxyapatite phantoms (r = 0.99). MicroCT analysis of tibial bone revealed significant differences in the morphological parameters between the untreated and ovariectomized samples. However, differences in the structural parameters of the mandible between the treatment groups were not observed. BMD determined with microCT and calcium concentration decomposed using PCD-CT differed significantly between the treatment groups in both the tibia and mandible. Quantitative analysis with PCD-CT is sensitive in determining the distribution of calcium and water components in bone and may have utility in the screening and diagnosis of bone conditions such as osteoporosis. Full article
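Spectral material decomposition, as used above, can be illustrated as solving a small linear system per voxel: the measured attenuation in each energy bin is modelled as a weighted sum of basis-material attenuations. The basis coefficients below are invented for illustration; real values come from calibration phantoms such as the hydroxyapatite phantoms in the study:

```python
import numpy as np

# Hypothetical basis matrix: rows = two energy bins, columns = basis
# materials (calcium, water). Illustrative values only.
A = np.array([[4.0, 0.30],
              [1.5, 0.25]])

def decompose(measured_attenuation):
    """Per-voxel two-material decomposition: solve A @ densities = measurement."""
    return np.linalg.solve(A, measured_attenuation)

# Forward-simulate a voxel with 0.2 units of calcium and 1.0 unit of water,
# then recover the material densities from its two-energy measurement.
measured = A @ np.array([0.2, 1.0])
densities = decompose(measured)
```

This per-voxel inversion is what lets PCD-CT report calcium and water distributions rather than a single attenuation number.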
28 pages, 2143 KiB  
Review
Digital Filtering Techniques Using Fuzzy-Rules Based Logic Control
by Xiao-Xia Yin and Sillas Hadjiloucas
J. Imaging 2023, 9(10), 208; https://doi.org/10.3390/jimaging9100208 - 30 Sep 2023
Cited by 1 | Viewed by 1675
Abstract
This paper discusses current formulations based on fuzzy-logic control concepts as applied to the removal of impulsive noise from digital images. We also discuss the various principles related to fuzzy-rule-based logic control techniques, aiming at preserving edges and digital image details efficiently. Detailed descriptions of a number of formulations for recently developed fuzzy-rule-based logic-controlled filters are provided, highlighting the merit of each filter. Fuzzy-rule-based filtering algorithms may be designed assuming the tailoring of specific functional sub-modules: (a) logical controlled variable selection, (b) the consideration of different methods for the generation of fuzzy rules and membership functions, and (c) the integration of the logical rules for detecting and filtering impulse noise from digital images. More specifically, we discuss impulse noise models and window-based filtering using fuzzy inference based on vector directional filters as associated with the filtering of RGB color images, and then explain how fuzzy vector fields can be generated using standard operations on fuzzy sets, taking into consideration fixed- or random-valued impulse noise and fuzzy vector partitioning. We also discuss how fuzzy cellular automata may be used for noise removal by adopting a Moore neighbourhood architecture. We also explain the potential merits of adopting a fuzzy-rule-based deep learning ensemble classifier composed of a convolutional neural network (CNN), a recurrent neural network (RNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU), all within a fuzzy min-max (FMM) ensemble. Fuzzy non-local means filter approaches are also considered. A comparison of various performance metrics for conventional and fuzzy-logic-based filters as well as deep learning filters is provided.
The algorithms discussed have the following advantageous properties: high-quality edge preservation, strong spatial noise suppression (especially for complex images), sound noise removal when mixed additive and impulse noise are present, and very fast computational implementation. Full article
(This article belongs to the Section Image and Video Processing)
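The detect-then-filter idea underlying the fuzzy filters surveyed above can be illustrated with a toy rule: a pixel's membership in the "impulse-corrupted" set grows with its deviation from the local median, and the output blends the original and the median by that membership. This is a sketch with an invented trapezoidal membership function, not any specific filter from the review:

```python
def fuzzy_impulse_filter(center, neighborhood, low=20.0, high=60.0):
    """Blend a pixel with its local median according to a fuzzy noise membership.

    Membership is 0 below `low` deviation, 1 above `high`, and linear in
    between (a simple trapezoidal rule; the thresholds are illustrative).
    """
    ordered = sorted(neighborhood)
    median = ordered[len(ordered) // 2]
    deviation = abs(center - median)
    if deviation <= low:
        mu = 0.0
    elif deviation >= high:
        mu = 1.0
    else:
        mu = (deviation - low) / (high - low)
    return (1.0 - mu) * center + mu * median

# An impulse (255) in a smooth 3x3 region is pulled to the local median,
# while an in-range pixel is left untouched, preserving edges and detail.
smooth = [100, 101, 99, 100, 102, 98, 100, 101, 100]
filtered_impulse = fuzzy_impulse_filter(255, smooth)
filtered_clean = fuzzy_impulse_filter(100, smooth)
```

The soft membership is what distinguishes this family from a plain median filter: uncorrupted pixels pass through unchanged instead of being smoothed.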
22 pages, 871 KiB  
Review
Developments in Image Processing Using Deep Learning and Reinforcement Learning
by Jorge Valente, João António, Carlos Mora and Sandra Jardim
J. Imaging 2023, 9(10), 207; https://doi.org/10.3390/jimaging9100207 - 30 Sep 2023
Cited by 7 | Viewed by 8159
Abstract
The growth in the volume of data generated, consumed, and stored, which is estimated to exceed 180 zettabytes in 2025, represents a major challenge both for organizations and for society in general. In addition to being larger, datasets are increasingly complex, bringing new theoretical and computational challenges. Alongside this evolution, data science tools have exploded in popularity over the past two decades due to their myriad of applications when dealing with complex data, their high accuracy, flexible customization, and excellent adaptability. When it comes to images, data analysis presents additional challenges because as the quality of an image increases, which is desirable, so does the volume of data to be processed. Although classic machine learning (ML) techniques are still widely used in different research fields and industries, there has been great interest from the scientific community in the development of new artificial intelligence (AI) techniques. The resurgence of neural networks has boosted remarkable advances in areas such as the understanding and processing of images. In this study, we conducted a comprehensive survey regarding advances in AI design and the optimization solutions proposed to deal with image processing challenges. Despite the good results that have been achieved, there are still many challenges to face in this field of study. In this work, we discuss the main and more recent improvements, applications, and developments when targeting image processing applications, and we propose future research directions in this field of constant and fast evolution. Full article
(This article belongs to the Section AI in Imaging)
11 pages, 1516 KiB  
Article
Creating Digital Watermarks in Bitmap Images Using Lagrange Interpolation and Bezier Curves
by Aigerim Yerimbetova, Elmira Daiyrbayeva, Ekaterina Merzlyakova, Andrey Fionov, Nazerke Baisholan, Mussa Turdalyuly, Nurzhan Mukazhanov and Almas Turganbayev
J. Imaging 2023, 9(10), 206; https://doi.org/10.3390/jimaging9100206 - 29 Sep 2023
Viewed by 1129
Abstract
The article is devoted to the introduction of digital watermarks, which form the basis for copyright protection systems. Methods in this area are aimed at embedding hidden markers that are resistant to various container transformations. This paper proposes a method for embedding a digital watermark into bitmap images using Lagrange interpolation and the Bezier curve formula for five points, called Lagrange interpolation along the Bezier curve 5 (LIBC5). As a means of steganalysis, the RS method was used, which applies a sensitive double-statistics technique based on spatial correlations in images. The output value of the RS analysis is the estimated length of the message in the image under study. The resistance of the developed LIBC5 method to detection of message transmission by the RS method was experimentally determined, and the method proved to be resistant to RS analysis. A study of the LIBC5 method showed an improvement in quilting resistance compared to that of the INMI image embedding method, which also uses Lagrange interpolation. Thus, the LIBC5 stegosystem can be successfully used to protect confidential data and copyrights. Full article
(This article belongs to the Topic Computer Vision and Image Processing)
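The two mathematical ingredients named in the method, Lagrange interpolation and a five-point (quartic) Bezier curve, have standard forms that can be sketched generically (the embedding scheme itself is not reproduced here):

```python
from math import comb

def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def bezier5(points, t):
    """Quartic Bezier curve defined by five control points (Bernstein form)."""
    n = len(points) - 1  # degree 4 for five control points
    return sum(comb(n, k) * (1 - t) ** (n - k) * t ** k * p
               for k, p in enumerate(points))
```

The Bezier curve starts at the first control point (t = 0) and ends at the last (t = 1), while Lagrange interpolation passes exactly through every given point; the LIBC5 scheme combines both properties.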
23 pages, 19046 KiB  
Article
A New CNN-Based Single-Ingredient Classification Model and Its Application in Food Image Segmentation
by Ziyi Zhu and Ying Dai
J. Imaging 2023, 9(10), 205; https://doi.org/10.3390/jimaging9100205 - 29 Sep 2023
Viewed by 1762
Abstract
It is important for food recognition to separate each ingredient within a food image at the pixel level. Most existing research has trained a segmentation network on datasets with pixel-level annotations to achieve food ingredient segmentation. However, preparing such datasets is exceedingly hard and time-consuming. In this paper, we propose a new framework for ingredient segmentation utilizing feature maps of the CNN-based Single-Ingredient Classification Model that is trained on the dataset with image-level annotation. To train this model, we first introduce a standardized biological-based hierarchical ingredient structure and construct a single-ingredient image dataset based on this structure. Then, we build a single-ingredient classification model on this dataset as the backbone of the proposed framework. In this framework, we extract feature maps from the single-ingredient classification model and propose two methods for processing these feature maps for segmenting ingredients in the food images. We introduce five evaluation metrics (IoU, Dice, Purity, Entirety, and Loss of GTs) to assess the performance of ingredient segmentation in terms of ingredient classification. Extensive experiments demonstrate the effectiveness of the proposed method, achieving a mIoU of 0.65, mDice of 0.77, mPurity of 0.83, mEntirety of 0.80, and mLoGTs of 0.06 for the optimal model on the FoodSeg103 dataset. We believe that our approach lays the foundation for subsequent ingredient recognition. Full article
(This article belongs to the Section AI in Imaging)
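Two of the five metrics the abstract lists, IoU and Dice, are standard region-overlap measures and can be sketched for multi-class label maps as follows. This is a minimal illustration, not the paper's implementation: the function name and the choice to average only over classes present in either map are assumptions, and the paper's own metrics (Purity, Entirety, Loss of GTs) are not reproduced here.

```python
import numpy as np

def iou_dice(pred: np.ndarray, gt: np.ndarray, num_classes: int):
    """Mean IoU and mean Dice between predicted and ground-truth label maps.

    Classes absent from both maps are skipped so they do not inflate the mean.
    """
    ious, dices = [], []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        if union == 0:  # class c appears in neither map
            continue
        ious.append(inter / union)
        dices.append(2.0 * inter / (p.sum() + g.sum()))
    return float(np.mean(ious)), float(np.mean(dices))
```

For example, with pred = [[0, 1], [1, 1]] and gt = [[0, 1], [0, 1]], class 0 scores IoU 1/2 and class 1 scores IoU 2/3, giving mIoU 7/12.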
16 pages, 932 KiB  
Article
Synthesizing Human Activity for Data Generation
by Ana Romero, Pedro Carvalho, Luís Côrte-Real and Américo Pereira
J. Imaging 2023, 9(10), 204; https://doi.org/10.3390/jimaging9100204 - 29 Sep 2023
Viewed by 1021
Abstract
Gathering sufficiently representative data on human actions, shapes, and facial expressions is costly and time-consuming, yet such data are needed to train robust models. This has motivated techniques such as transfer learning and data augmentation, which are often insufficient. To address this, we propose a semi-automated mechanism for generating and editing visual scenes in which synthetic humans perform various actions, with features such as background modification and manual adjustment of the 3D avatars that allow users to create data with greater variability. We also propose a two-fold methodology for evaluating the results obtained with our method: (i) running an action classifier on the output data produced by the mechanism and (ii) generating masks of the avatars and the actors and comparing them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and faithful to those of their respective input actors. The results also showed that although the action classifier concentrates on the pose and movement of the synthetic humans, it depends strongly on contextual information to recognize the actions precisely. Generating avatars for complex activities also proved problematic, both for action recognition and for producing clean, precise masks.
(This article belongs to the Special Issue Machine Learning for Human Activity Recognition)
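The second evaluation strand described in the abstract, comparing avatar masks against actor masks through segmentation, can be sketched as a simple silhouette-overlap score. The function name and the choice of the Dice coefficient as the similarity measure are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def mask_overlap(avatar_mask: np.ndarray, actor_mask: np.ndarray) -> float:
    """Dice overlap between two binary silhouette masks (nonzero = person pixel).

    Returns 1.0 for two empty masks (vacuous agreement), otherwise a value
    in [0, 1] where 1 means the silhouettes coincide exactly.
    """
    a = avatar_mask.astype(bool)
    b = actor_mask.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:  # both masks empty
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / denom)
```

A low score on frames where the action classifier also fails would be one way to flag the "complex activities" failure mode the abstract mentions.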