J. Imaging, Volume 10, Issue 8 (August 2024) – 32 articles

Cover Story: We propose a deep learning architecture that enables the real-time detection and segmentation of lesion regions from endoscopic video, with our experiments focused on autofluorescence bronchoscopy (AFB) for the lungs and colonoscopy for the intestinal tract. Our architecture, dubbed ESFPNet, draws on a pretrained Mix Transformer (MiT) encoder and a decoder structure that incorporates a new Efficient Stage-Wise Feature Pyramid (ESFP) to promote accurate lesion segmentation. In comparison to existing deep learning models, the ESFPNet model gave superior lesion segmentation performance for an AFB dataset. It also produced superior segmentation results for three widely used public colonoscopy databases and nearly the best results for two other public colonoscopy databases.
21 pages, 4230 KiB  
Article
Help-Seeking Situations Related to Visual Interactions on Mobile Platforms and Recommended Designs for Blind and Visually Impaired Users
by Iris Xie, Wonchan Choi, Shengang Wang, Hyun Seung Lee, Bo Hyun Hong, Ning-Chiao Wang and Emmanuel Kwame Cudjoe
J. Imaging 2024, 10(8), 205; https://doi.org/10.3390/jimaging10080205 - 22 Aug 2024
Abstract
While it is common for blind and visually impaired (BVI) users to use mobile devices to search for information, little research has explored the accessibility issues they encounter in their interactions with information retrieval systems, in particular digital libraries (DLs). This study represents one of the most comprehensive research projects, investigating accessibility issues, especially help-seeking situations BVI users face in their DL search processes. One hundred and twenty BVI users were recruited to search for information in six DLs on four types of mobile devices (iPhone, iPad, Android phone, and Android tablet), and multiple data collection methods were employed: questionnaires, think-aloud protocols, transaction logs, and interviews. This paper reports part of a large-scale study, including the categories of help-seeking situations BVI users face in their interactions with DLs, focusing on seven types of help-seeking situations related to visual interactions on mobile platforms: difficulty finding a toggle-based search feature, difficulty understanding a video feature, difficulty navigating items on paginated sections, difficulty distinguishing collection labels from thumbnails, difficulty recognizing the content of images, difficulty recognizing the content of graphs, and difficulty interacting with multilayered windows. Moreover, corresponding design recommendations are also proposed: placing meaningful labels for icon-based features in an easy-to-access location, offering intuitive and informative video descriptions for video players, providing structure information about a paginated section, separating collection/item titles from thumbnail descriptions, incorporating artificial intelligence image/graph recognition mechanisms, and limiting screen reader interactions to active windows. Additionally, the limitations of the study and future research are discussed. Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)

26 pages, 4676 KiB  
Article
Optimisation of Convolution-Based Image Lightness Processing
by D. Andrew Rowlands and Graham D. Finlayson
J. Imaging 2024, 10(8), 204; https://doi.org/10.3390/jimaging10080204 - 22 Aug 2024
Abstract
In the convolutional retinex approach to image lightness processing, an image is filtered by a centre/surround operator that is designed to mitigate the effects of shading (illumination gradients), which in turn compresses the dynamic range. Typically, the parameters that define the shape and extent of the filter are tuned to provide visually pleasing results, and a mapping function such as a logarithm is included for further image enhancement. In contrast, a statistical approach to convolutional retinex has recently been introduced, which is based upon known or estimated autocorrelation statistics of the image albedo and shading components. By introducing models for the autocorrelation matrices and solving a linear regression, the optimal filter is obtained in closed form. Unlike existing methods, the aim is simply to objectively mitigate shading, and so image enhancement components such as a logarithmic mapping function are not included. Here, the full mathematical details of the method are provided, along with implementation details. Significantly, it is shown that the shapes of the autocorrelation matrices directly impact the shape of the optimal filter. To investigate the performance of the method, we address the problem of shading removal from text documents. Further experiments on a challenging image dataset validate the method. Full article

20 pages, 4347 KiB  
Article
Automatic Classification of Nodules from 2D Ultrasound Images Using Deep Learning Networks
by Tewele W. Tareke, Sarah Leclerc, Catherine Vuillemin, Perrine Buffier, Elodie Crevisy, Amandine Nguyen, Marie-Paule Monnier Meteau, Pauline Legris, Serge Angiolini and Alain Lalande
J. Imaging 2024, 10(8), 203; https://doi.org/10.3390/jimaging10080203 - 22 Aug 2024
Abstract
Objective: In clinical practice, thyroid nodules are typically visually evaluated by expert physicians using 2D ultrasound images. Based on their assessment, a fine needle aspiration (FNA) may be recommended. However, visually classifying thyroid nodules from ultrasound images may lead to unnecessary fine needle aspirations for patients. The aim of this study is to develop an automatic thyroid ultrasound image classification system to prevent unnecessary FNAs. Methods: An automatic computer-aided artificial intelligence system is proposed for classifying thyroid nodules using a fine-tuned deep learning model based on the DenseNet architecture, which incorporates an attention module. The dataset comprises 591 thyroid nodule images categorized based on the Bethesda score. Thyroid nodules are classified as either requiring FNA or not. The challenges encountered in this task include managing variability in image quality, addressing the presence of artifacts in ultrasound image datasets, tackling class imbalance, and ensuring model interpretability. We employed techniques such as data augmentation, class weighting, and gradient-weighted class activation maps (Grad-CAM) to enhance model performance and provide insights into decision making. Results: Our approach achieved excellent results with an average accuracy of 0.94, F1-score of 0.93, and sensitivity of 0.96. The use of Grad-CAM gives insights into the decision making and thereby reinforces the reliability of the binary classification from the end-user perspective. Conclusions: We propose a deep learning architecture that effectively classifies thyroid nodules as requiring FNA or not from ultrasound images. Despite challenges related to image variability, class imbalance, and interpretability, our method demonstrated high classification accuracy with minimal false negatives, showing its potential to reduce unnecessary FNAs in clinical settings. Full article
(This article belongs to the Section Medical Imaging)
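For readers unfamiliar with the metrics quoted in the abstract above, the following is a minimal Python sketch of how accuracy, sensitivity, and F1-score are conventionally derived from a binary confusion matrix. It is not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. All names here are illustrative assumptions.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)   # also called recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, not taken from the paper.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

A high sensitivity corresponds to few false negatives, which is the property the abstract emphasizes for a system intended to decide whether FNA is required.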

15 pages, 2856 KiB  
Review
A Review of Advancements and Challenges in Liver Segmentation
by Di Wei, Yundan Jiang, Xuhui Zhou, Di Wu and Xiaorong Feng
J. Imaging 2024, 10(8), 202; https://doi.org/10.3390/jimaging10080202 - 21 Aug 2024
Abstract
Liver segmentation technologies play vital roles in clinical diagnosis, disease monitoring, and surgical planning due to the complex anatomical structure and physiological functions of the liver. This paper provides a comprehensive review of the developments, challenges, and future directions in liver segmentation technology. We systematically analyzed high-quality research published between 2014 and 2024, focusing on liver segmentation methods, public datasets, and evaluation metrics. This review highlights the transition from manual to semi-automatic and fully automatic segmentation methods, describes the capabilities and limitations of available technologies, and provides future outlooks. Full article
(This article belongs to the Section Medical Imaging)

14 pages, 3565 KiB  
Article
Artificial Intelligence (AI) and Nuclear Features from the Fine Needle Aspirated (FNA) Tissue Samples to Recognize Breast Cancer
by Rumana Islam and Mohammed Tarique
J. Imaging 2024, 10(8), 201; https://doi.org/10.3390/jimaging10080201 - 19 Aug 2024
Abstract
Breast cancer is one of the leading causes of new cancer cases worldwide annually. It is a malignant neoplasm that develops in the breast cells. The early screening of this disease is essential to prevent its metastasis. A mammogram X-ray image is the most common screening tool currently used when this disease is suspected; however, not all the breast lesions identified are malignant. The invasive fine needle aspiration (FNA) of a breast mass sample is the secondary screening tool to clinically examine cancerous lesions. The visual image analysis of the stained aspirated sample poses a challenge for the cytologist to identify the malignant cells accurately. The formulation of an artificial intelligence-based objective technique on top of the subjective assessment is essential to avoid misdiagnosis. This paper addresses several artificial intelligence (AI)-based techniques to diagnose breast cancer from the nuclear features of FNA samples. The Wisconsin Breast Cancer dataset (WBCD) from the UCI machine learning repository is used for this investigation. Significant statistical parameters are measured to evaluate the performance of the proposed techniques. The best detection accuracy of 98.10% is achieved with a two-layer feed-forward neural network (FFNN). Finally, the developed algorithm’s performance is compared with some state-of-the-art works in the literature. Full article
(This article belongs to the Section AI in Imaging)

27 pages, 14394 KiB  
Article
Celiac Disease Deep Learning Image Classification Using Convolutional Neural Networks
by Joaquim Carreras
J. Imaging 2024, 10(8), 200; https://doi.org/10.3390/jimaging10080200 - 16 Aug 2024
Abstract
Celiac disease (CD) is a gluten-sensitive immune-mediated enteropathy. This proof-of-concept study used a convolutional neural network (CNN) to classify hematoxylin and eosin (H&E) CD histological images, normal small intestine control, and non-specified duodenal inflammation (7294, 11,642, and 5966 images, respectively). The trained network classified CD with high performance (accuracy 99.7%, precision 99.6%, recall 99.3%, F1-score 99.5%, and specificity 99.8%). Interestingly, when the same network (already trained on the 3 image classes) analyzed duodenal adenocarcinoma (3723 images), the new images were classified as duodenal inflammation in 63.65%, small intestine control in 34.73%, and CD in 1.61% of the cases; when the network was retrained using the 4 histological subtypes, the performance was above 99% for CD and 97% for adenocarcinoma. Finally, 13,043 images of Crohn’s disease were added to include other inflammatory bowel diseases; a comparison between different CNN architectures was performed, and the gradient-weighted class activation mapping (Grad-CAM) technique was used to understand why the deep learning network made its classification decisions. In conclusion, the CNN-based deep neural system classified 5 diagnoses with high performance. Narrow artificial intelligence (AI) is designed to perform tasks that typically require human intelligence, but it operates within limited constraints and is task-specific. Full article

9 pages, 2933 KiB  
Opinion
Congenital Absence of Pericardium: The Swinging Heart
by Raffaella Marzullo, Alessandro Capestro, Renato Cosimo, Marco Fogante, Alessandro Aprile, Liliana Balardi, Mario Giordano, Gianpiero Gaio, Gabriella Gauderi, Maria Giovanna Russo and Nicolò Schicchi
J. Imaging 2024, 10(8), 199; https://doi.org/10.3390/jimaging10080199 - 14 Aug 2024
Abstract
Congenital absence of the pericardium (CAP) is an unusual condition discovered, in most cases, incidentally but can potentially lead to fatal complications, including severe arrhythmias and sudden death. Recently, the use of modern imaging technologies has increased the diagnosis of CAP, providing important findings for risk stratification. Nevertheless, there is not yet consensus regarding therapeutic decisions, and the management of patients with CAP remains challenging. In this paper, we discuss the pathophysiological implication of CAP, review the current literature and explain the role of multimodality imaging tools for its diagnosis, management and treatment. Full article
(This article belongs to the Section Medical Imaging)

13 pages, 3182 KiB  
Article
Simultaneous Stereo Matching and Confidence Estimation Network
by Tobias Schmähling, Tobias Müller, Jörg Eberhardt and Stefan Elser
J. Imaging 2024, 10(8), 198; https://doi.org/10.3390/jimaging10080198 - 14 Aug 2024
Abstract
In this paper, we present a multi-task model that predicts disparities and confidence levels in deep stereo matching simultaneously. We do this by combining successful models for each separate task, obtaining a multi-task model that can be trained with a proposed loss function. We show the advantages of this model compared to training and predicting disparity and confidence sequentially. This method enables an improvement of 15% to 30% in the area under the curve (AUC) metric when trained in parallel rather than sequentially. In addition, the effect of weighting the components in the loss function on the stereo and confidence performance is investigated. By improving the confidence estimate, the practicality of stereo estimators for creating distance images is increased. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)

19 pages, 9912 KiB  
Article
A Multi-Scale Target Detection Method Using an Improved Faster Region Convolutional Neural Network Based on Enhanced Backbone and Optimized Mechanisms
by Qianyong Chen, Mengshan Li, Zhenghui Lai, Jihong Zhu and Lixin Guan
J. Imaging 2024, 10(8), 197; https://doi.org/10.3390/jimaging10080197 - 13 Aug 2024
Abstract
Existing deep learning methods exhibit many limitations in multi-target detection, such as low accuracy and high rates of false detection and missed detections. This paper proposes an improved Faster R-CNN algorithm, aiming to enhance the algorithm’s capability in detecting multi-scale targets. This algorithm has three improvements based on Faster R-CNN. Firstly, the new algorithm uses the ResNet101 network for feature extraction of the detection image, which achieves stronger feature extraction capabilities. Secondly, the new algorithm integrates Online Hard Example Mining (OHEM), Soft non-maximum suppression (Soft-NMS), and Distance Intersection Over Union (DIOU) modules, which mitigate the imbalance between positive and negative samples and the problem of small targets being easily missed during model training. Finally, the Region Proposal Network (RPN) is simplified to achieve a faster detection speed and a lower miss rate. The multi-scale training (MST) strategy is also used to train the improved Faster R-CNN to achieve a balance between detection accuracy and efficiency. Compared to the other detection models, the improved Faster R-CNN demonstrates significant advantages in terms of mean average precision (mAP), F1-score, and log-average miss rate (LAMR). The model proposed in this paper provides valuable insights and inspiration for many fields, such as smart agriculture, medical diagnosis, and face recognition. Full article
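As a rough illustration of the Distance Intersection over Union measure named in the abstract above, the Python sketch below computes it for two axis-aligned boxes using the commonly cited definition: IoU minus the squared distance between box centres divided by the squared diagonal of the smallest enclosing box. It is not the authors' implementation; the function name `diou` and the `(x1, y1, x2, y2)` box convention are assumptions made for this example.

```python
def diou(box_a, box_b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2.

    Illustrative sketch only; names and box format are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: overlap area divided by union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between the two box centres.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both inputs.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2

    return iou - d2 / c2


print(diou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(diou((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, this measure still distinguishes near misses from distant ones through the centre-distance penalty.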

40 pages, 4079 KiB  
Article
Investigating Contrastive Pair Learning’s Frontiers in Supervised, Semisupervised, and Self-Supervised Learning
by Bihi Sabiri, Amal Khtira, Bouchra El Asri and Maryem Rhanoui
J. Imaging 2024, 10(8), 196; https://doi.org/10.3390/jimaging10080196 - 13 Aug 2024
Abstract
In recent years, contrastive learning has been a highly favored method for self-supervised representation learning, which significantly improves the unsupervised training of deep image models. Self-supervised learning is a subset of unsupervised learning in which the learning process is supervised by creating pseudolabels from the data themselves. Using supervised final adjustments after unsupervised pretraining is one way to take the most valuable information from a vast collection of unlabeled data and teach from a small number of labeled instances. This study aims firstly to compare contrastive learning with other traditional learning models; secondly to demonstrate by experimental studies the superiority of contrastive learning during classification; thirdly to fine-tune performance using pretrained models and appropriate hyperparameter selection; and finally to address the challenge of using contrastive learning techniques to produce data representations with semantic meaning that are independent of irrelevant factors like position, lighting, and background. Relying on contrastive techniques, the model efficiently captures meaningful representations by discerning similarities and differences between modified copies of the same image. The proposed strategy, involving unsupervised pretraining followed by supervised fine-tuning, improves the robustness, accuracy, and knowledge extraction of deep image models. The results show that even with a modest 5% of data labeled, the semisupervised model achieves an accuracy of 57.72%. However, the use of supervised learning with a contrastive approach and careful hyperparameter tuning increases accuracy to 85.43%. Further adjustment of the hyperparameters resulted in an excellent accuracy of 88.70%. Full article

24 pages, 410 KiB  
Article
Gastric Cancer Image Classification: A Comparative Analysis and Feature Fusion Strategies
by Andrea Loddo, Marco Usai and Cecilia Di Ruberto
J. Imaging 2024, 10(8), 195; https://doi.org/10.3390/jimaging10080195 - 10 Aug 2024
Abstract
Gastric cancer is the fifth most common and fourth deadliest cancer worldwide, with a bleak 5-year survival rate of about 20%. Despite significant research into its pathobiology, prognostic predictability remains insufficient due to pathologists’ heavy workloads and the potential for diagnostic errors. Consequently, there is a pressing need for automated and precise histopathological diagnostic tools. This study leverages Machine Learning and Deep Learning techniques to classify histopathological images into healthy and cancerous categories. By utilizing both handcrafted and deep features and shallow learning classifiers on the GasHisSDB dataset, we conduct a comparative analysis to identify the most effective combinations of features and classifiers for differentiating normal from abnormal histopathological images without employing fine-tuning strategies. Our methodology achieves an accuracy of 95% with the SVM classifier, underscoring the effectiveness of feature fusion strategies. Additionally, cross-magnification experiments produced promising results with accuracies close to 80% and 90% when testing the models on unseen testing images with different resolutions. Full article

22 pages, 1969 KiB  
Article
AIDA (Artificial Intelligence Dystocia Algorithm) in Prolonged Dystocic Labor: Focus on Asynclitism Degree
by Antonio Malvasi, Lorenzo E. Malgieri, Ettore Cicinelli, Antonella Vimercati, Reuven Achiron, Radmila Sparić, Antonio D’Amato, Giorgio Maria Baldini, Miriam Dellino, Giuseppe Trojano, Renata Beck, Tommaso Difonzo and Andrea Tinelli
J. Imaging 2024, 10(8), 194; https://doi.org/10.3390/jimaging10080194 - 9 Aug 2024
Abstract
Asynclitism, a misalignment of the fetal head with respect to the plane of passage through the birth canal, represents a significant obstetric challenge. High degrees of asynclitism are associated with labor dystocia, difficult operative delivery, and cesarean delivery. Despite its clinical relevance, the diagnosis of asynclitism and its influence on the outcome of labor remain matters of debate. This study analyzes the role of the degree of asynclitism (AD) in assessing labor progress and predicting labor outcome, focusing on its ability to predict intrapartum cesarean delivery (ICD) versus non-cesarean delivery. The study also aims to assess the performance of the AIDA (Artificial Intelligence Dystocia Algorithm) algorithm in integrating AD with other ultrasound parameters for predicting labor outcome. This retrospective study involved 135 full-term nulliparous patients with singleton fetuses in cephalic presentation undergoing neuraxial analgesia. Data were collected at three Italian hospitals between January 2014 and December 2020. In addition to routine digital vaginal examination, all patients underwent intrapartum ultrasound (IU) during protracted second stage of labor (greater than three hours). Four geometric parameters were measured using standard 3.5 MHz transabdominal ultrasound probes: head-to-symphysis distance (HSD), degree of asynclitism (AD), angle of progression (AoP), and midline angle (MLA). The AIDA algorithm, a machine learning-based decision support system, was used to classify patients into five classes (from 0 to 4) based on the values of the four geometric parameters and to predict labor outcome (ICD or non-ICD). Six machine learning algorithms were used: MLP (multi-layer perceptron), RF (random forest), SVM (support vector machine), XGBoost, LR (logistic regression), and DT (decision tree). Pearson’s correlation was used to investigate the relationship between AD and the other parameters. 
A degree of asynclitism greater than 70 mm was found to be significantly associated with an increased rate of cesarean deliveries. Pearson’s correlation analysis showed a weak to very weak correlation between AD and AoP (PC = 0.36, p < 0.001), AD and HSD (PC = 0.18, p < 0.05), and AD and MLA (PC = 0.14). The AIDA algorithm demonstrated high accuracy in predicting labor outcome, particularly for AIDA classes 0 and 4, with 100% agreement with physician-practiced labor outcome in two cases (RF and SVM algorithms) and slightly lower agreement with MLP. For AIDA class 3, the RF algorithm performed best, with an accuracy of 92%. AD, in combination with HSD, MLA, and AoP, plays a significant role in predicting labor dystocia and labor outcome. The AIDA algorithm, based on these four geometric parameters, has proven to be a promising decision support tool for predicting labor outcome and may help reduce the need for unnecessary cesarean deliveries, while improving maternal-fetal outcomes. Future studies with larger cohorts are needed to further validate these findings and refine the cut-off thresholds for AD and other parameters in the AIDA algorithm. Full article
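The correlation values reported in the abstract above are Pearson coefficients. As a minimal sketch of how such a coefficient is computed (not the study's analysis code), the Python function below implements the textbook formula; the function name `pearson_r` and the sample values are hypothetical and do not come from the paper's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Illustrative sketch only; assumes neither sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations.
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical measurements, invented for this example.
first = [1.0, 2.0, 3.0, 4.0]
second = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(first, second))
```

Values near 0, such as those reported between AD and the other parameters, indicate that the variables carry largely independent linear information.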

18 pages, 1265 KiB  
Review
Revolutionizing Cardiac Imaging: A Scoping Review of Artificial Intelligence in Echocardiography, CTA, and Cardiac MRI
by Ali Moradi, Olawale O. Olanisa, Tochukwu Nzeako, Mehregan Shahrokhi, Eman Esfahani, Nastaran Fakher and Mohamad Amin Khazeei Tabari
J. Imaging 2024, 10(8), 193; https://doi.org/10.3390/jimaging10080193 - 8 Aug 2024
Abstract
Background and Introduction: Cardiac imaging is crucial for diagnosing heart disorders. Methods like X-rays, ultrasounds, CT scans, and MRIs provide detailed anatomical and functional heart images. AI can enhance these imaging techniques with its advanced learning capabilities. Method: In this scoping review, following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) Guidelines, we searched PubMed, Scopus, Web of Science, and Google Scholar using related keywords on 16 April 2024. From 3679 articles, we first screened titles and abstracts based on the initial inclusion criteria and then screened the full texts. The authors made the final selections collaboratively. Result: The PRISMA chart shows that 3516 articles were initially selected for evaluation after removing duplicates. Upon reviewing titles, abstracts, and quality, 24 articles were deemed eligible for the review. The findings indicate that AI enhances image quality, speeds up imaging processes, and reduces radiation exposure with sensitivity and specificity comparable to or exceeding those of qualified radiologists or cardiologists. Further research is needed to assess AI’s applicability in various types of cardiac imaging, especially in rural hospitals where access to medical doctors is limited. Conclusions: AI improves image quality, reduces human errors and radiation exposure, and can predict cardiac events with acceptable sensitivity and specificity. Full article
(This article belongs to the Section AI in Imaging)

27 pages, 13847 KiB  
Article
RailTrack-DaViT: A Vision Transformer-Based Approach for Automated Railway Track Defect Detection
by Aniwat Phaphuangwittayakul, Napat Harnpornchai, Fangli Ying and Jinming Zhang
J. Imaging 2024, 10(8), 192; https://doi.org/10.3390/jimaging10080192 - 7 Aug 2024
Abstract
Railway track defects pose significant safety risks and can lead to accidents, economic losses, and loss of life. Traditional manual inspection methods are either time-consuming, costly, or prone to human error. This paper proposes RailTrack-DaViT, a novel vision transformer-based approach for railway track defect classification. By leveraging the Dual Attention Vision Transformer (DaViT) architecture, RailTrack-DaViT effectively captures both global and local information, enabling accurate defect detection. The model is trained and evaluated on multiple datasets including rail, fastener and fishplate, multi-faults, and ThaiRailTrack. A comprehensive analysis of the model’s performance is provided including confusion matrices, training visualizations, and classification metrics. RailTrack-DaViT demonstrates superior performance compared to state-of-the-art CNN-based methods, achieving the highest accuracies: 96.9% on the rail dataset, 98.9% on the fastener and fishplate dataset, and 98.8% on the multi-faults dataset. Moreover, RailTrack-DaViT outperforms baselines on the ThaiRailTrack dataset with 99.2% accuracy, quickly adapts to unseen images, and shows better model stability during fine-tuning. This capability can significantly reduce time consumption when applying the model to novel datasets in practical applications. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

24 pages, 3240 KiB  
Article
ESFPNet: Efficient Stage-Wise Feature Pyramid on Mix Transformer for Deep Learning-Based Cancer Analysis in Endoscopic Video
by Qi Chang, Danish Ahmad, Jennifer Toth, Rebecca Bascom and William E. Higgins
J. Imaging 2024, 10(8), 191; https://doi.org/10.3390/jimaging10080191 - 7 Aug 2024
Viewed by 940
Abstract
For patients at risk of developing either lung cancer or colorectal cancer, the identification of suspect lesions in endoscopic video is an important procedure. The physician performs an endoscopic exam by navigating an endoscope through the organ of interest, be it the lungs or intestinal tract, and performs a visual inspection of the endoscopic video stream to identify lesions. Unfortunately, this entails a tedious, error-prone search over a lengthy video sequence. We propose a deep learning architecture that enables the real-time detection and segmentation of lesion regions from endoscopic video, with our experiments focused on autofluorescence bronchoscopy (AFB) for the lungs and colonoscopy for the intestinal tract. Our architecture, dubbed ESFPNet, draws on a pretrained Mix Transformer (MiT) encoder and a decoder structure that incorporates a new Efficient Stage-Wise Feature Pyramid (ESFP) to promote accurate lesion segmentation. In comparison to existing deep learning models, the ESFPNet model gave superior lesion segmentation performance for an AFB dataset. It also produced superior segmentation results for three widely used public colonoscopy databases and nearly the best results for two other public colonoscopy databases. In addition, the lightweight ESFPNet architecture requires fewer model parameters and less computation than other competing models, enabling the real-time analysis of input video frames. Overall, these studies point to the combined superior analysis performance and architectural efficiency of the ESFPNet for endoscopic video analysis. Lastly, additional experiments with the public colonoscopy databases demonstrate the learning ability and generalizability of ESFPNet, implying that the model could be effective for region segmentation in other domains. Full article
(This article belongs to the Special Issue Advancements in Imaging Techniques for Detection of Cancer)

12 pages, 2015 KiB  
Article
Automatic Segmentation of Mediastinal Lymph Nodes and Blood Vessels in Endobronchial Ultrasound (EBUS) Images Using Deep Learning
by Øyvind Ervik, Ingrid Tveten, Erlend Fagertun Hofstad, Thomas Langø, Håkon Olav Leira, Tore Amundsen and Hanne Sorger
J. Imaging 2024, 10(8), 190; https://doi.org/10.3390/jimaging10080190 - 6 Aug 2024
Viewed by 780
Abstract
Endobronchial ultrasound (EBUS) is used in the minimally invasive sampling of thoracic lymph nodes. In lung cancer staging, the accurate assessment of mediastinal structures is essential but challenged by variations in anatomy, image quality, and operator-dependent image interpretation. This study aimed to automatically detect and segment mediastinal lymph nodes and blood vessels employing a novel U-Net architecture-based approach in EBUS images. A total of 1161 EBUS images from 40 patients were annotated. For training and validation, 882 images from 30 patients and 145 images from 5 patients were utilized. A separate set of 134 images was reserved for testing. For lymph node and blood vessel segmentation, the mean ± standard deviation (SD) values of the Dice similarity coefficient were 0.71 ± 0.35 and 0.76 ± 0.38, those of the precision were 0.69 ± 0.36 and 0.82 ± 0.22, those of the sensitivity were 0.71 ± 0.38 and 0.80 ± 0.25, those of the specificity were 0.98 ± 0.02 and 0.99 ± 0.01, and those of the F1 score were 0.85 ± 0.16 and 0.81 ± 0.21, respectively. The average processing and segmentation run-time per image was 55 ± 1 ms (mean ± SD). The new U-Net architecture-based approach (EBUS-AI) could automatically detect and segment mediastinal lymph nodes and blood vessels in EBUS images. The method performed well and was feasible and fast, enabling real-time automatic labeling. Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)
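The overlap metrics quoted in the abstract above can be illustrated with a minimal sketch. It assumes flat binary masks and a hypothetical helper named `segmentation_scores`; it is not the authors' evaluation code. For a single pair of binary masks, the Dice coefficient is algebraically identical to the pixel-wise F1 score; how the separately reported F1 values were obtained is not stated in the abstract.

```python
def segmentation_scores(pred, truth):
    """Return Dice, precision, sensitivity, and specificity for two binary masks.

    `pred` and `truth` are equal-length flat sequences of 0/1 pixel labels
    (a hypothetical input format chosen for this illustration).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Pixel-wise confusion counts.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = len(pred) - tp - fp - fn

    def ratio(num, den):
        # Undefined ratios (empty denominators) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy example: 8 pixels, 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
scores = segmentation_scores(pred, truth)
```

Per-image scores computed this way would then be averaged over a test set to give mean ± SD values of the kind reported.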

11 pages, 532 KiB  
Review
Insights into Ultrasound Features and Risk Stratification Systems in Pediatric Patients with Thyroid Nodules
by Carla Gambale, José Vicente Rocha, Alessandro Prete, Elisa Minaldi, Rossella Elisei and Antonio Matrone
J. Imaging 2024, 10(8), 189; https://doi.org/10.3390/jimaging10080189 - 5 Aug 2024
Viewed by 712
Abstract
Thyroid nodules in pediatric patients are less common than in adults but show a higher malignancy rate. Accordingly, the management of thyroid nodules in pediatric patients is more complex the younger the patient is, needing careful evaluation by physicians. In adult patients, specific ultrasound (US) features have been associated with an increased risk of malignancy (ROM) in thyroid nodules. Moreover, several US risk stratification systems (RSSs) combining the US features of the nodule were built to define the ROM. RSSs are developed for the adult population and their use has not been fully validated in pediatric patients. This study aimed to evaluate the available data about US features of thyroid nodules in pediatric patients and to provide a summary of the evidence regarding the performance of RSS in predicting malignancy. Moreover, insights into the management of thyroid nodules in pediatric patients will be provided. Full article

15 pages, 2742 KiB  
Article
Screening Mammography Diagnostic Reference Level System According to Compressed Breast Thickness: Dubai Health
by Entesar Z. Dalah, Maryam K. Alkaabi, Hashim M. Al-Awadhi and Nisha A. Antony
J. Imaging 2024, 10(8), 188; https://doi.org/10.3390/jimaging10080188 - 5 Aug 2024
Viewed by 738
Abstract
Screening mammography is considered to be the most effective means for the early detection of breast cancer. However, epidemiological studies suggest that longitudinal exposure to screening mammography may raise the risk of radiation-induced breast cancer, which calls for optimization and internal auditing. The present work aims to establish a comprehensive, well-structured Diagnostic Reference Level (DRL) system that can be confidently used to highlight healthcare centers in need of urgent action, as well as cases exceeding the dose notification level. Screening mammograms from a total of 2048 women examined at seven different healthcare centers were collected and retrospectively analyzed. The typical DRL for each healthcare center was established and defined as per (A) bilateral image view (left craniocaudal (LCC), right craniocaudal (RCC), left mediolateral oblique (LMLO), and right mediolateral oblique (RMLO)) and (B) structured compressed breast thickness (CBT) criteria. Following this, the local DRL value was established per the bilateral image views for each CBT group. Screening mammography data from a total of 8877 images were used to build this comprehensive DRL system (LCC: 2163, RCC: 2206, LMLO: 2288, and RMLO: 2220). CBTs were classified into groups of <20 mm, 20–29 mm, 30–39 mm, 40–49 mm, 50–59 mm, 60–69 mm, 70–79 mm, 80–89 mm, and 90–110 mm. Using the Kruskal–Wallis test, significant dose differences were observed between all seven healthcare centers offering screening mammography. The local DRL values defined per bilateral image views for the CBT group 60–69 mm were (1.24 LCC, 1.23 RCC, 1.34 LMLO, and 1.32 RMLO) mGy. The local DRL defined per bilateral image view for a specific CBT highlighted at least one healthcare center in need of optimization. Such a comprehensive DRL system is efficient, easy to use, and very clinically effective. Full article

20 pages, 3626 KiB  
Article
Semantic Segmentation in Large-Size Orthomosaics to Detect the Vegetation Area in Opuntia spp. Crop
by Arturo Duarte-Rangel, César Camacho-Bello, Eduardo Cornejo-Velazquez and Mireya Clavel-Maqueda
J. Imaging 2024, 10(8), 187; https://doi.org/10.3390/jimaging10080187 - 1 Aug 2024
Viewed by 751
Abstract
This study focuses on the semantic segmentation of Opuntia spp. crop orthomosaics, which is a significant challenge due to the inherent variability in the captured images. Manual measurement of Opuntia spp. vegetation areas can be slow and inefficient, highlighting the need for more advanced and accurate methods. For this reason, we propose to use deep learning techniques to provide a more precise and efficient measurement of the vegetation area. Our research focuses on the unique difficulties posed by segmenting high-resolution images exceeding 2000 pixels, a common problem in generating orthomosaics for agricultural monitoring. The research was carried out on an Opuntia spp. cultivation located in the agricultural region of Tulancingo, Hidalgo, Mexico. The images used in this study were obtained by drones and processed using advanced semantic segmentation architectures, including DeepLabV3+, UNet, and UNet Style Xception. The results offer a comparative analysis of the performance of these architectures in the semantic segmentation of Opuntia spp., thus contributing to the development and improvement of crop analysis techniques based on deep learning. This work sets a precedent for future research applying deep learning techniques in agriculture. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

20 pages, 6281 KiB  
Article
Overlapping Shoeprint Detection by Edge Detection and Deep Learning
by Chengran Li, Ajit Narayanan and Akbar Ghobakhlou
J. Imaging 2024, 10(8), 186; https://doi.org/10.3390/jimaging10080186 - 31 Jul 2024
Viewed by 724
Abstract
In the field of 2-D image processing and computer vision, accurately detecting and segmenting objects in scenarios where they overlap or are obscured remains a challenge. This difficulty is worse in the analysis of shoeprints used in forensic investigations because they are embedded in noisy environments such as the ground and can be indistinct. Traditional convolutional neural networks (CNNs), despite their success in various image analysis tasks, struggle with accurately delineating overlapping objects due to the complexity of segmenting intertwined textures and boundaries against a background of noise. This study introduces and employs the YOLO (You Only Look Once) model enhanced by edge detection and image segmentation techniques to improve the detection of overlapping shoeprints. By focusing on the critical boundary information between shoeprint textures and the ground, our method demonstrates improvements in sensitivity and precision, achieving confidence levels above 85% for minimally overlapped images and maintaining above 70% for extensively overlapped instances. Heatmaps of convolution layers were generated to show how the network converges towards successful detection using these enhancements. This research may provide a potential methodology for addressing the broader challenge of detecting multiple overlapping objects against noisy backgrounds. Full article

16 pages, 584 KiB  
Article
A Cortical-Inspired Contour Completion Model Based on Contour Orientation and Thickness
by Ivan Galyaev and Alexey Mashtakov
J. Imaging 2024, 10(8), 185; https://doi.org/10.3390/jimaging10080185 - 31 Jul 2024
Viewed by 646
Abstract
An extended four-dimensional version of the traditional Petitot–Citti–Sarti model of contour completion in the visual cortex is examined. The neural configuration space is considered as the group of similarity transformations, denoted as M = SIM(2). The left-invariant subbundle of the tangent bundle models possible directions for establishing neural communication. The sub-Riemannian distance is proportional to the energy expended in interneuron activation between two excited border neurons. According to the model, the damaged image contours are restored via sub-Riemannian geodesics in the space M of positions, orientations and thicknesses (scales). We study the geodesic problem in M using geometric control theory techniques. We prove the existence of a minimal geodesic between arbitrary specified boundary conditions. We apply the Pontryagin maximum principle and derive the geodesic equations. In special cases, we find explicit solutions. In the general case, we provide a qualitative analysis. Finally, we support our model with a simulation of the association field. Full article
(This article belongs to the Special Issue Modelling of Human Visual System in Image Processing)

15 pages, 3247 KiB  
Article
The Usefulness of a Virtual Environment-Based Patient Setup Training System for Radiation Therapy
by Toshioh Fujibuchi, Kosuke Kaneko, Hiroyuki Arakawa and Yoshihiro Okada
J. Imaging 2024, 10(8), 184; https://doi.org/10.3390/jimaging10080184 - 30 Jul 2024
Viewed by 791
Abstract
In radiation therapy, patient setup is important for improving treatment accuracy. The six-axis couch semi-automatically adjusts the patient’s position; however, adjusting for twisting of the patient is difficult. In this study, we developed and evaluated a virtual reality setup training tool for medical students to understand and improve their patient setup skills for radiation therapy. First, we set up a simulated patient in a virtual space to reproduce the radiation treatment room. A gyro sensor was attached to the patient phantom in real space, and the twist of the phantom was linked to the patient in the virtual space. Training was conducted for 24 students, and their operation records were analyzed and evaluated. The training’s efficacy was also evaluated through questionnaires provided at the end of the training. The total time required for the patient setup test decreased significantly, from 331.9 s before training to 146.2 s after training. In the questionnaire on the usability of the training for trainees, most aspects were rated highly. We found that training significantly improved students’ understanding of the patient setup. With the proposed system, trainees can experience a simulated setup that can aid in deepening their understanding of radiation therapy treatments. Full article

12 pages, 2150 KiB  
Article
Optimized Crop Disease Identification in Bangladesh: A Deep Learning and SVM Hybrid Model for Rice, Potato, and Corn
by Shohag Barman, Fahmid Al Farid, Jaohar Raihan, Niaz Ashraf Khan, Md. Ferdous Bin Hafiz, Aditi Bhattacharya, Zaeed Mahmud, Sadia Afrin Ridita, Md Tanjil Sarker, Hezerul Abdul Karim and Sarina Mansor
J. Imaging 2024, 10(8), 183; https://doi.org/10.3390/jimaging10080183 - 30 Jul 2024
Viewed by 804
Abstract
Agriculture plays a vital role in Bangladesh’s economy. It is essential to ensure the proper growth and health of crops for the development of the agricultural sector. In the context of Bangladesh, crop diseases pose a significant threat to agricultural output and, consequently, food security. This necessitates the timely and precise identification of such diseases to ensure the sustainability of food production. This study focuses on building a hybrid deep learning model for the identification of three specific diseases affecting three major crops: late blight in potatoes, brown spot in rice, and common rust in corn. The proposed model leverages EfficientNetB0’s feature extraction capabilities, known for achieving rapid high learning rates, coupled with the classification proficiency of SVMs, a well-established machine learning algorithm. This unified approach streamlines data processing and feature extraction, potentially improving model generalizability across diverse crops and diseases. It also aims to address the challenges of computational efficiency and accuracy that are often encountered in precision agriculture applications. The proposed hybrid model achieved 97.29% accuracy. A comparative analysis with other models (CNN, VGG16, ResNet50, Xception, Mobilenet V2, Autoencoders, Inception v3, and EfficientNetB0, which achieved accuracies of 86.57%, 83.29%, 68.79%, 94.07%, 90.71%, 87.90%, 94.14%, and 96.14%, respectively) demonstrated the superior performance of our proposed model. Full article
(This article belongs to the Special Issue Imaging Applications in Agriculture)

32 pages, 41999 KiB  
Review
Special Types of Breast Cancer: Clinical Behavior and Radiological Appearance
by Marco Conti, Francesca Morciano, Silvia Amodeo, Elisabetta Gori, Giovanna Romanucci, Paolo Belli, Oscar Tommasini, Francesca Fornasa and Rossella Rella
J. Imaging 2024, 10(8), 182; https://doi.org/10.3390/jimaging10080182 - 29 Jul 2024
Viewed by 964
Abstract
Breast cancer is a complex disease that includes entities with different characteristics, behaviors, and responses to treatment. Breast cancers are categorized into subgroups based on histological type and grade, and these subgroups affect clinical presentation and oncological outcomes. The subgroup of “special types” encompasses all those breast cancers with insufficient features to belong to the subgroup “invasive ductal carcinoma not otherwise specified”. These cancers account for around 25% of all cases, some of them having a relatively good prognosis despite high histological grade. The purpose of this paper is to review and illustrate the radiological appearance of each special type, highlighting insights and pitfalls to guide breast radiologists in their routine work. Full article

17 pages, 11358 KiB  
Article
Fiduciary-Free Frame Alignment for Robust Time-Lapse Drift Correction Estimation in Multi-Sample Cell Microscopy
by Stefan Baar, Masahiro Kuragano, Naoki Nishishita, Kiyotaka Tokuraku and Shinya Watanabe
J. Imaging 2024, 10(8), 181; https://doi.org/10.3390/jimaging10080181 - 29 Jul 2024
Viewed by 795
Abstract
When analyzing microscopic time-lapse observations, frame alignment is an essential task to visually understand the morphological and translation dynamics of cells and tissue. While in traditional single-sample microscopy, the region of interest (RoI) is fixed, multi-sample microscopy often uses a single microscope that scans multiple samples over a long period of time by laterally relocating the sample stage. Hence, the relocation of the optics induces a statistical RoI offset and can introduce jitter as well as drift, which results in a misaligned RoI for each sample’s time-lapse observation (stage drift). We introduce a robust approach to automatically align all frames within a time-lapse observation and compensate for frame drift. In this study, we present a sub-pixel precise alignment approach based on recurrent all-pairs field transforms (RAFT); a deep network architecture for optical flow. We show that the RAFT model pre-trained on the Sintel dataset performed with near perfect precision for registration tasks on a set of ten contextually unrelated time-lapse observations containing 250 frames each. Our approach is robust for elastically undistorted and translation displaced (x,y) microscopic time-lapse observations and was tested on multiple samples with varying cell density, obtained using different devices. The approach only performed well for registration and not for tracking of the individual image components like cells and contaminants. We provide an open-source command-line application that corrects for stage drift and jitter. Full article

17 pages, 1884 KiB  
Review
Image-Based 3D Reconstruction in Laparoscopy: A Review Focusing on the Quantitative Evaluation by Applying the Reconstruction Error
by Birthe Göbel, Alexander Reiterer and Knut Möller
J. Imaging 2024, 10(8), 180; https://doi.org/10.3390/jimaging10080180 - 24 Jul 2024
Viewed by 742
Abstract
Image-based 3D reconstruction enables laparoscopic applications such as image-guided navigation and (autonomous) robot-assisted interventions, which require high accuracy. The review’s purpose is to present the accuracy of different techniques in order to identify the most promising. A systematic literature search with PubMed and Google Scholar from 2015 to 2023 was applied by following the framework of “Review articles: purpose, process, and structure”. Articles were considered when presenting a quantitative evaluation (root mean squared error and mean absolute error) of the reconstruction error (Euclidean distance between real and reconstructed surface). The search yielded 995 articles, which were reduced to 48 articles after applying exclusion criteria. From these, a reconstruction error data set could be generated for the techniques of stereo vision, Shape-from-Motion, Simultaneous Localization and Mapping, deep-learning, and structured light. The reconstruction error varies from below one millimeter to higher than ten millimeters, with deep-learning and Simultaneous Localization and Mapping delivering the best results under intraoperative conditions. The high variance emerges from different experimental conditions. In conclusion, submillimeter accuracy is challenging, but promising image-based 3D reconstruction techniques could be identified. For future research, we recommend computing the reconstruction error for comparison purposes and using ex/in vivo organs as reference objects for realistic experiments. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
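As an illustration of the evaluation quantity described in the abstract above, the following minimal sketch computes the mean absolute error and root mean squared error of per-point Euclidean distances. It assumes that point correspondences between the reference and reconstructed surfaces are already known, which is a simplification; the function name `reconstruction_errors` and the sample coordinates are hypothetical.

```python
import math


def reconstruction_errors(reference_points, reconstructed_points):
    """Return (MAE, RMSE) of Euclidean distances between corresponding 3D points.

    Assumes the i-th reconstructed point corresponds to the i-th reference
    point; real pipelines usually need registration or nearest-neighbour
    matching first.
    """
    if not reference_points or len(reference_points) != len(reconstructed_points):
        raise ValueError("need two non-empty point lists of equal length")

    # Per-point reconstruction error: Euclidean distance in the same units as the input.
    distances = [math.dist(p, q) for p, q in zip(reference_points, reconstructed_points)]

    mae = sum(distances) / len(distances)
    rmse = math.sqrt(sum(d * d for d in distances) / len(distances))
    return mae, rmse


# Hypothetical points in millimeters; the distances are 1, 0, and 5.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reconstructed = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 4.0, 4.0)]
mae, rmse = reconstruction_errors(reference, reconstructed)
```

Because RMSE squares each distance before averaging, a single large outlier raises it more than it raises MAE, which is one reason to report both.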

12 pages, 8025 KiB  
Article
Deep Learning for Single-Shot Structured Light Profilometry: A Comprehensive Dataset and Performance Analysis
by Rhys G. Evans, Ester Devlieghere, Robrecht Keijzer, Joris J. J. Dirckx and Sam Van der Jeught
J. Imaging 2024, 10(8), 179; https://doi.org/10.3390/jimaging10080179 - 24 Jul 2024
Viewed by 808
Abstract
In 3D optical metrology, single-shot deep learning-based structured light profilometry (SS-DL-SLP) has gained attention because of its measurement speed, simplicity of optical setup, and robustness to noise and motion artefacts. However, gathering a sufficiently large training dataset for these techniques remains challenging because of practical limitations. This paper presents a comprehensive DL-SLP dataset of over 10,000 physical data couples. The dataset was constructed by 3D-printing a calibration target featuring randomly varying surface profiles and storing the height profiles and the corresponding deformed fringe patterns. Our dataset aims to serve as a benchmark for evaluating and comparing different models and network architectures in DL-SLP. We performed an analysis of several established neural networks, demonstrating high accuracy in obtaining full-field height information from previously unseen fringe patterns. In addition, the network was validated on unique objects to test the overall robustness of the trained model. To facilitate further research and promote reproducibility, all code and the dataset are made publicly available. This dataset will enable researchers to explore, develop, and benchmark novel DL-based approaches for SS-DL-SLP. Full article
(This article belongs to the Special Issue Deep Learning in Computer Vision)

19 pages, 2916 KiB  
Article
Iterative Tomographic Image Reconstruction Algorithm Based on Extended Power Divergence by Dynamic Parameter Tuning
by Ryuto Yabuki, Yusaku Yamaguchi, Omar M. Abou Al-Ola, Takeshi Kojima and Tetsuya Yoshinaga
J. Imaging 2024, 10(8), 178; https://doi.org/10.3390/jimaging10080178 - 23 Jul 2024
Viewed by 751
Abstract
Computed tomography (CT) imaging plays a crucial role in various medical applications, but noise in projection data can significantly degrade image quality and hinder diagnosis accuracy. Iterative algorithms for tomographic image reconstruction outperform transform methods, especially in scenarios with severe noise in projections. In this paper, we propose a method to dynamically adjust two parameters included in the iterative rules during the reconstruction process. The algorithm, named the parameter-extended expectation-maximization based on power divergence (PXEM), aims to minimize the weighted extended power divergence between the measured and forward projections at each iteration. Our numerical and physical experiments showed that PXEM surpassed conventional methods such as maximum-likelihood expectation-maximization (MLEM), particularly in noisy scenarios. PXEM combines the noise suppression capabilities of power divergence-based expectation-maximization with static parameters at every iteration and the edge preservation properties of MLEM. The experimental results demonstrated significant improvements in image quality in metrics such as the structural similarity index measure and peak signal-to-noise ratio. PXEM improves CT image reconstruction quality under high noise conditions through enhanced optimization techniques. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
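For orientation, the conventional MLEM baseline named in the abstract above can be sketched in a few lines. This is the standard textbook multiplicative update, not the PXEM algorithm proposed in the paper; the tiny system matrix, the measurements, and the function name `mlem` are assumptions made for illustration.

```python
def mlem(system_matrix, measured, iterations=50):
    """Standard MLEM update: x_j <- x_j / s_j * sum_i a_ij * y_i / (A x)_i.

    `system_matrix` is a list of rows (one per projection ray) and `measured`
    holds the non-negative projection values y_i.
    """
    n_rays = len(system_matrix)
    n_pixels = len(system_matrix[0])

    # Sensitivity s_j: column sums of the system matrix.
    sensitivity = [sum(system_matrix[i][j] for i in range(n_rays)) for j in range(n_pixels)]

    image = [1.0] * n_pixels  # strictly positive initial estimate
    for _ in range(iterations):
        # Forward projection of the current estimate.
        forward = [sum(a * x for a, x in zip(row, image)) for row in system_matrix]
        # Ratio of measured to estimated projections (0 where the estimate is 0).
        ratios = [y / f if f > 0 else 0.0 for y, f in zip(measured, forward)]
        # Back-project the ratios and apply the multiplicative correction.
        image = [
            image[j] / sensitivity[j] * sum(system_matrix[i][j] * ratios[i] for i in range(n_rays))
            if sensitivity[j] > 0 else 0.0
            for j in range(n_pixels)
        ]
    return image


# Toy problem: two pixels seen by three rays; noiseless data from x = [2, 3].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
estimate = mlem(A, y, iterations=50)
```

The update keeps the estimate non-negative by construction. According to the abstract, PXEM differs by dynamically tuning two parameters of a power-divergence-based iteration, which this sketch does not attempt to reproduce.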

9 pages, 643 KiB  
Article
Influence of Examiner Experience on the Measurement of Bone-Loss by Low-Dose Cone-Beam Computed Tomography: An Ex Vivo Study
by Maurice Ruetters, Korallia Alexandrou, Antonio Ciardo, Sinclair Awounvo, Holger Gehrig, Ti-Sun Kim, Christopher J. Lux and Sinan Sen
J. Imaging 2024, 10(8), 177; https://doi.org/10.3390/jimaging10080177 - 23 Jul 2024
Viewed by 589
Abstract
The aim of this study was to investigate the influence of examiner experience on measurements of bone-loss using high-dose (HD) and low-dose (LD) CBCT. Three diagnosticians with varying levels of CBCT interpretation experience measured bone-loss from CBCT scans of three cadaveric heads at 30 sites, conducting measurements twice. Between the first and second measurements, diagnostician 2 and diagnostician 3 received training in LD-CBCT diagnostics. The diagnosticians also classified the certainty of their measurements using a three-grade scale. The accuracy of bone-loss measurements was assessed using the absolute difference between observed and clinical measurements and compared among diagnosticians with different experience levels for both HD and LD-CBCT. At baseline, there was a significant difference in measurement accuracy between diagnostician 1 and diagnostician 2, and between diagnostician 1 and diagnostician 3, but not between diagnostician 2 and diagnostician 3. Training improved the accuracy of both HD-CBCT and LD-CBCT measurements in diagnostician 2, and of LD-CBCT measurements in diagnostician 3. Regarding measurement certainty, there was a significant difference among diagnosticians before training. Training enhanced the certainty for diagnosticians 2 and 3, with a significant improvement noted only for diagnostician 3. Examiner experience level significantly impacts the accuracy and certainty of bone-loss measurements using HD- and LD-CBCT. Full article

35 pages, 5499 KiB  
Review
Deep Learning for Pneumonia Detection in Chest X-ray Images: A Comprehensive Survey
by Raheel Siddiqi and Sameena Javaid
J. Imaging 2024, 10(8), 176; https://doi.org/10.3390/jimaging10080176 - 23 Jul 2024
Viewed by 1117
Abstract
This paper addresses the significant problem of identifying the relevant background and contextual literature related to deep learning (DL) as an evolving technology in order to provide a comprehensive analysis of the application of DL to the specific problem of pneumonia detection via chest X-ray (CXR) imaging, which is the most common and cost-effective imaging technique available worldwide for pneumonia diagnosis. This paper in particular addresses the key period associated with COVID-19, 2020–2023, to explain, analyze, and systematically evaluate the limitations of approaches and determine their relative levels of effectiveness. The context in which DL is applied as both an aid to and an automated substitute for existing expert radiography professionals, who often have limited availability, is elaborated in detail. The rationale for the undertaken research is provided, along with a justification of the resources adopted and their relevance. This explanatory text and the subsequent analyses are intended to provide sufficient detail of the problem being addressed, existing solutions, and the limitations of these, ranging in detail from the specific to the more general. Indeed, our analysis and evaluation agree with the generally held view that the use of transformers, specifically, vision transformers (ViTs), is the most promising technique for obtaining further effective results in the area of pneumonia detection using CXR images. However, ViTs require extensive further research to address several limitations, specifically the following: biased CXR datasets, data and code availability, the ease with which a model can be explained, systematic methods of accurate model comparison, the notion of class imbalance in CXR datasets, and the possibility of adversarial attacks, the latter of which remains an area of fundamental research.