J. Imaging, Volume 8, Issue 9 (September 2022) – 29 articles

Cover Story (view full-size image): Today it is common to compare different tools for a given task, such as segmentation or classification. What is the best way to compare them, however? This question is especially important in the context of AI, with its many loss functions, optimisers and plethora of architectures. This work presents a comparison framework, applied here to the classification of computed tomography images of individuals with COVID-19, with pneumonia, or disease-free. Five architectures (ResNet-50, ResNet-50r, DenseNet-121, MobileNet-v3 and the state-of-the-art CaiT-24-XXS-224), the Adam and AdamW optimisers, and cross-entropy and weighted cross-entropy losses were combined to form 20 experiments with 10 repetitions each, which were then bootstrapped for 1000 cycles. Performance was compared using the Friedman–Nemenyi non-parametric statistical test. View this paper
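The Friedman test at the heart of the comparison framework described above ranks each model within every repetition and tests whether the rankings differ consistently. A minimal sketch with SciPy, using simulated accuracies for three of the named architectures (the numbers are illustrative, not the paper's results):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical per-repetition accuracies for three architectures
# (simulated values, not data from the study)
scores = {
    "ResNet-50":    rng.normal(0.90, 0.01, 10),
    "DenseNet-121": rng.normal(0.91, 0.01, 10),
    "MobileNet-v3": rng.normal(0.85, 0.01, 10),
}

# Friedman test: do the models rank consistently differently
# across the 10 repetitions?
stat, p = stats.friedmanchisquare(*scores.values())
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")
```

A post hoc Nemenyi test (available, for example, in the scikit-posthocs package) would then identify which model pairs differ significantly.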
19 pages, 4837 KiB  
Article
Using Transparent Soils to Observe Soil Liquefaction and Fines Migration
by Jisun Chang and David Airey
J. Imaging 2022, 8(9), 253; https://doi.org/10.3390/jimaging8090253 - 19 Sep 2022
Cited by 2 | Viewed by 1620
Abstract
The cyclic liquefaction of soils and the associated mud-pumping can lead to costly repairs of roads, railways, and other heavy-haul infrastructure. Over the last decade, several laboratory studies have been conducted to investigate these phenomena, but, because soil is opaque, experimental observations of cyclic liquefaction have typically been limited to post-test observations of fines migration together with measurements of water pressure and soil settlement. In this paper, we show how partially transparent soil models can be used to visualize a moving saturation front, and how fully transparent models can be used to observe fines migration during the cyclic loading of a soil column. The changing degree of saturation was tracked using a correlation between the degree of saturation, soil transparency, and grayscale image values, while the movements of fines and larger particles were measured using a small number of fluorescent particles and particle tracking velocimetry. Another innovation of the work was the use of mixtures of ethyl benzoate and ethanol as a low-viscosity pore fluid with a refractive index matching that of the fused silica soil particles. The benefits and challenges of these visualization tests are discussed. Full article
(This article belongs to the Special Issue Recent Advances in Image-Based Geotechnics)

14 pages, 6345 KiB  
Article
Evaluation of an Object Detection Algorithm for Shrapnel and Development of a Triage Tool to Determine Injury Severity
by Eric J. Snider, Sofia I. Hernandez-Torres, Guy Avital and Emily N. Boice
J. Imaging 2022, 8(9), 252; https://doi.org/10.3390/jimaging8090252 - 19 Sep 2022
Cited by 5 | Viewed by 1658
Abstract
Emergency medicine in austere environments relies on ultrasound imaging as an essential diagnostic tool. Without extensive training, identifying abnormalities such as shrapnel embedded in tissue is challenging, and medical professionals with the appropriate expertise are scarce in resource-constrained environments. Incorporating artificial intelligence models to aid interpretation can reduce this skill gap, enabling identification of shrapnel and its proximity to important anatomical features for improved medical treatment. Here, we apply a deep learning object detection framework, YOLOv3, to detect shrapnel of various sizes and locations with respect to a neurovascular bundle. Ultrasound images were collected in a tissue phantom containing shrapnel, vein, artery, and nerve features. The YOLOv3 framework classifies the object types and identifies their locations. On the testing dataset, the model successfully identified each object class, with a mean Intersection over Union and average precision of 0.73 and 0.94, respectively. Furthermore, a triage tool was developed to quantify shrapnel distance from neurovascular features; it could notify the end user when a proximity threshold is surpassed that may warrant evacuation or surgical intervention. Overall, object detection models such as this will be vital to compensate for the lack of expertise in ultrasound interpretation, increasing its availability for emergency and military medicine. Full article
(This article belongs to the Special Issue Application of Machine Learning Using Ultrasound Images)
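The mean Intersection over Union reported in the abstract above scores how well each predicted bounding box overlaps its ground truth. A minimal sketch for axis-aligned boxes, with made-up coordinates:

```python
def iou(a, b):
    # Boxes given as (x1, y1, x2, y2) corner coordinates
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # intersection area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)       # intersection / union

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```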

33 pages, 5637 KiB  
Article
CLAIRE—Parallelized Diffeomorphic Image Registration for Large-Scale Biomedical Imaging Applications
by Naveen Himthani, Malte Brunn, Jae-Youn Kim, Miriam Schulte, Andreas Mang and George Biros
J. Imaging 2022, 8(9), 251; https://doi.org/10.3390/jimaging8090251 - 16 Sep 2022
Cited by 3 | Viewed by 2070
Abstract
We study the performance of CLAIRE—a diffeomorphic multi-node, multi-GPU image-registration algorithm and software—in large-scale biomedical imaging applications with billions of voxels. At such resolutions, most existing software packages for diffeomorphic image registration are prohibitively expensive. As a result, practitioners first significantly downsample the original images and then register them using existing tools. Our main contribution is an extensive analysis of the impact of downsampling on registration performance. We study this impact by comparing full-resolution registrations obtained with CLAIRE to lower-resolution registrations for synthetic and real-world imaging datasets. Our results suggest that registration at full resolution can yield superior registration quality—but not always. For example, downsampling a synthetic image from 1024³ to 256³ decreases the Dice coefficient from 92% to 79%. However, the differences are less pronounced for noisy or low-contrast high-resolution images. CLAIRE allows us not only to register images of clinically relevant size in a few seconds but also to register images at unprecedented resolution in reasonable time. The highest resolution considered is that of CLARITY images of size 2816×3016×1162. To the best of our knowledge, this is the first study on image registration quality at such resolutions. Full article
(This article belongs to the Topic Medical Image Analysis)
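The Dice coefficient that the study above uses to compare registrations measures the overlap between two segmentation masks. A minimal sketch on toy binary masks (not CLAIRE output):

```python
import numpy as np

def dice(a, b):
    # Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

a = np.zeros((8, 8), bool); a[2:6, 2:6] = True  # 16-pixel square
b = np.zeros((8, 8), bool); b[3:7, 3:7] = True  # shifted square, 9-pixel overlap
print(dice(a, b))  # 2*9 / 32 = 0.5625
```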

22 pages, 4793 KiB  
Article
Dual Autoencoder Network with Separable Convolutional Layers for Denoising and Deblurring Images
by Elena Solovyeva and Ali Abdullah
J. Imaging 2022, 8(9), 250; https://doi.org/10.3390/jimaging8090250 - 13 Sep 2022
Cited by 3 | Viewed by 3261
Abstract
A dual autoencoder employing separable convolutional layers for image denoising and deblurring is presented. Two autoencoders are combined to gain higher accuracy while separable convolutional layers simultaneously reduce the number of network parameters. In the proposed structure of the dual autoencoder, the first autoencoder denoises the image, while the second enhances the quality of the denoised image. The research covers Gaussian noise (Gaussian blur), Poisson noise, speckle noise, and random impulse noise. The advantages of the proposed neural network are the reduction in the number of trainable parameters and the increase in similarity between the denoised or deblurred image and the original, achieved by decreasing the mean square error and increasing the structural similarity index. These advantages are demonstrated by comparing the proposed network with a convolutional autoencoder and a dual convolutional autoencoder. Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications)
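The parameter savings from separable convolutional layers, the key to the reduced complexity claimed above, follow from a simple count: a depthwise-separable layer replaces one k×k×C_in×C_out kernel with a k×k depthwise stage plus a 1×1 pointwise stage. A quick check with illustrative layer sizes (biases ignored):

```python
# Parameter counts for a standard vs. a depthwise-separable conv layer
k, c_in, c_out = 3, 64, 64                 # kernel size and channel counts
standard = k * k * c_in * c_out            # one full 3x3x64x64 kernel
separable = k * k * c_in + c_in * c_out    # 3x3 depthwise + 1x1 pointwise
print(standard, separable, standard / separable)  # 36864 4672 ~7.9x fewer
```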

13 pages, 2179 KiB  
Article
Training Ultrasound Image Classification Deep-Learning Algorithms for Pneumothorax Detection Using a Synthetic Tissue Phantom Apparatus
by Emily N. Boice, Sofia I. Hernandez Torres, Zechariah J. Knowlton, David Berard, Jose M. Gonzalez, Guy Avital and Eric J. Snider
J. Imaging 2022, 8(9), 249; https://doi.org/10.3390/jimaging8090249 - 11 Sep 2022
Cited by 11 | Viewed by 2813
Abstract
Ultrasound (US) imaging is a critical tool in emergency and military medicine because of its portability and immediate nature. However, proper image interpretation requires skill, limiting its utility in remote applications for conditions such as pneumothorax (PTX), which requires rapid intervention. Artificial intelligence has the potential to automate ultrasound image analysis for various pathophysiological conditions. However, training such models requires large data sets, a means of real-time troubleshooting for ultrasound integration and deployment, and large animal models or clinical testing. Here, we detail the development of a dynamic synthetic tissue phantom model for PTX and its use in training image classification algorithms. The model comprises a synthetic gelatin phantom cast in a custom 3D-printed rib mold and a lung-mimicking phantom. When compared to PTX images acquired in swine, images from the phantom were similar in both PTX-negative and PTX-positive mimicking scenarios. We then used a deep learning image classification algorithm, which we previously developed for shrapnel detection, to accurately predict the presence of PTX in swine images by training only on phantom image sets, highlighting the utility of a tissue phantom for AI applications. Full article
(This article belongs to the Special Issue Application of Machine Learning Using Ultrasound Images)

33 pages, 6334 KiB  
Article
On the Quantification of Visual Texture Complexity
by Fereshteh Mirjalili and Jon Yngve Hardeberg
J. Imaging 2022, 8(9), 248; https://doi.org/10.3390/jimaging8090248 - 10 Sep 2022
Cited by 4 | Viewed by 2239
Abstract
Complexity is one of the major attributes of the visual perception of texture. However, very little is known about how humans visually interpret texture complexity. A psychophysical experiment was conducted to visually quantify the seven texture attributes of a series of textile fabrics: complexity, color variation, randomness, strongness, regularity, repetitiveness, and homogeneity. It was found that the observers could discriminate between the textures with low and high complexity using some high-level visual cues such as randomness, color variation, strongness, etc. The results of principal component analysis (PCA) on the visual scores of the above attributes suggest that complexity and homogeneity could be essentially the underlying attributes of the same visual texture dimension, with complexity at the negative extreme and homogeneity at the positive extreme of this dimension. We chose to call this dimension visual texture complexity. Several texture measures including the first-order image statistics, co-occurrence matrix, local binary pattern, and Gabor features were computed for images of the textiles in sRGB, and four luminance-chrominance color spaces (i.e., HSV, YCbCr, Ohta’s I1I2I3, and CIELAB). The relationships between the visually quantified texture complexity of the textiles and the corresponding texture measures of the images were investigated. Analyzing the relationships showed that simple standard deviation of the image luminance channel had a strong correlation with the corresponding visual ratings of texture complexity in all five color spaces. Standard deviation of the energy of the image after convolving with an appropriate Gabor filter and entropy of the co-occurrence matrix, both computed for the image luminance channel, also showed high correlations with the visual data. In this comparison, sRGB, YCbCr, and HSV always outperformed the I1I2I3 and CIELAB color spaces. 
The highest correlations between the visual data and the corresponding image texture features in the luminance-chrominance color spaces were always obtained for the luminance channel of the images, and one of the two chrominance channels always performed better than the other. This result indicates that the arrangement of the image texture elements that drives the observer's perception of visual texture complexity cannot be represented properly by the chrominance channels, which must be carefully considered when choosing an image channel to quantify visual texture complexity. Additionally, the good performance of the luminance channel in the five studied color spaces shows that variation in the luminance of the texture, or, as one could call it, the luminance contrast, plays a crucial role in creating visual texture complexity. Full article
(This article belongs to the Special Issue Color Texture Classification)
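The study's strongest simple predictor, the standard deviation of the luminance channel, is easy to sanity-check on synthetic textures (the noise images below are purely illustrative, not the study's fabric photographs):

```python
import numpy as np

rng = np.random.default_rng(2)
# Two hypothetical luminance images: low-contrast vs. high-contrast
smooth = rng.normal(128, 5, (64, 64))   # visually homogeneous texture
busy = rng.normal(128, 40, (64, 64))    # visually complex texture

# Standard deviation of the luminance channel, the simple measure the
# study found to correlate strongly with perceived texture complexity
print(smooth.std(), busy.std())
```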

22 pages, 14201 KiB  
Article
Local Contrast-Based Pixel Ordering for Exact Histogram Specification
by Kohei Inoue, Naoki Ono and Kenji Hara
J. Imaging 2022, 8(9), 247; https://doi.org/10.3390/jimaging8090247 - 10 Sep 2022
Viewed by 1783
Abstract
Histogram equalization is one of the basic image processing tasks for contrast enhancement, and its generalized version is histogram specification, which accepts arbitrary shapes of target histograms including uniform distributions for histogram equalization. It is well known that strictly ordered pixels in an image can be voted to any target histogram to achieve exact histogram specification. This paper proposes a method for ordering pixels in an image on the basis of the local contrast of each pixel, where a Gaussian filter without approximation is used to avoid the duplication of pixel values that disturbs the strict pixel ordering. The main idea of the proposed method is that the problem of pixel ordering is divided into small subproblems which can be solved separately, and then the results are merged into one sequence of all ordered pixels. Moreover, the proposed method is extended from grayscale images to color ones in a consistent manner. Experimental results show that the state-of-the-art histogram specification method occasionally produces false patterns, which are alleviated by the proposed method. Those results demonstrate the effectiveness of the proposed method for exact histogram specification. Full article
(This article belongs to the Special Issue Imaging and Color Vision)
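Exact histogram specification as described above depends on a strict total ordering of pixels. A minimal sketch using a plain value-based ordering with a stable index tiebreak (a stand-in for the paper's local-contrast ordering; the 2×2 image and uniform target histogram are toy inputs):

```python
import numpy as np

def exact_hist_spec(img, target_hist):
    # Strictly order pixels, then vote them into the target histogram
    # so the output histogram matches it exactly.
    flat = img.ravel()
    order = np.argsort(flat, kind="stable")  # stable sort breaks ties by index
    # Output levels, one run per histogram bin (e.g. [0, 1, 2, 3] here)
    levels = np.repeat(np.arange(len(target_hist)), target_hist)
    out = np.empty_like(flat)
    out[order] = levels  # lowest-ranked pixel gets the lowest level, etc.
    return out.reshape(img.shape)

img = np.array([[3, 7], [1, 9]])
target = np.array([1, 1, 1, 1])  # uniform target: one pixel per level 0..3
print(exact_hist_spec(img, target))  # [[1 2]
                                     #  [0 3]]
```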

19 pages, 6366 KiB  
Article
Towards More Accurate and Complete Heterogeneous Iris Segmentation Using a Hybrid Deep Learning Approach
by Yuan Meng and Tie Bao
J. Imaging 2022, 8(9), 246; https://doi.org/10.3390/jimaging8090246 - 10 Sep 2022
Cited by 1 | Viewed by 1857
Abstract
Accurate iris segmentation is a crucial preprocessing stage for computer-aided ophthalmic disease diagnosis. The quality of iris images taken under different camera sensors varies greatly, and thus accurate segmentation of heterogeneous iris databases is a huge challenge. At present, network architectures based on convolutional neural networks (CNNs) have been widely applied in iris segmentation tasks. However, due to the limited kernel size of convolution layers, iris segmentation networks based on CNNs cannot learn global and long-term semantic information interactions well, which makes it challenging to segment the iris region accurately. Inspired by the success of the vision transformer (ViT) and the Swin Transformer (Swin T), a hybrid deep learning approach is proposed to segment heterogeneous iris images. Specifically, we first propose a bilateral segmentation backbone network that combines the benefits of Swin T with CNNs. Then, a multiscale feature information extraction module (MFIEM) is proposed to extract multiscale spatial information at a more granular level. Finally, a channel attention mechanism module (CAMM) is used to enhance the discriminability of the iris region. Experimental results on a multisource heterogeneous iris database show that our network has a significant performance advantage compared with some state-of-the-art (SOTA) iris segmentation networks. Full article
(This article belongs to the Special Issue Current Methods in Medical Image Segmentation)

14 pages, 7399 KiB  
Article
Using Computer Vision to Track Facial Color Changes and Predict Heart Rate
by Salik Ram Khanal, Jaime Sampaio, Juliana Exel, Joao Barroso and Vitor Filipe
J. Imaging 2022, 8(9), 245; https://doi.org/10.3390/jimaging8090245 - 9 Sep 2022
Cited by 1 | Viewed by 2128
Abstract
Current technological advances have pushed the quantification of exercise intensity into a new era of physical exercise science. Monitoring physical exercise is essential in the process of planning, applying, and controlling loads for performance optimization and health. Although many studies have applied statistical approaches to estimate various physiological indices, to our knowledge none has investigated the relationship between facial color changes and increased exercise intensity. The aim of this study was to develop a non-contact method based on computer vision to determine the heart rate and, ultimately, the exercise intensity. The method analyzes facial color changes during exercise using the RGB, HSV, YCbCr, Lab, and YUV color models. Nine university students participated in the study (mean age = 26.88 ± 6.01 years, mean weight = 72.56 ± 14.27 kg, mean height = 172.88 ± 12.04 cm; six males and three females, all white Caucasian). The data analyses were carried out separately for each participant (personalized model) as well as for all participants at once (universal model). Multiple autoregressions and a multiple polynomial regression model were designed to predict the maximum heart rate percentage (maxHR%) from each color model. The results were analyzed and evaluated using the root mean square error (RMSE), F-values, and R-squared. The multiple polynomial regression using all participants exhibited the best accuracy, with an RMSE of 6.75 (R-squared = 0.78). Exercise prescription and monitoring can benefit from such methods, for example, to streamline online monitoring without the need for any other instrumentation. Full article
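The polynomial regression step described above, mapping color-channel statistics to the maximum heart rate percentage, can be sketched with a single illustrative feature; the data below are simulated, not the participants' measurements:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: mean facial red-channel intensity vs. %maxHR
red = rng.uniform(120, 180, 50)
maxhr = 0.004 * (red - 120) ** 2 + 0.3 * (red - 120) + 20 + rng.normal(0, 2, 50)

coeffs = np.polyfit(red, maxhr, deg=2)  # second-order polynomial fit
pred = np.polyval(coeffs, red)
rmse = np.sqrt(np.mean((maxhr - pred) ** 2))  # evaluation metric from the study
print(f"RMSE = {rmse:.2f}")
```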

20 pages, 22310 KiB  
Article
Fuzzy Color Aura Matrices for Texture Image Segmentation
by Zohra Haliche, Kamal Hammouche, Olivier Losson and Ludovic Macaire
J. Imaging 2022, 8(9), 244; https://doi.org/10.3390/jimaging8090244 - 8 Sep 2022
Cited by 3 | Viewed by 1662
Abstract
Fuzzy gray-level aura matrices have been developed from fuzzy set theory and the aura concept to characterize texture images. They have proven to be powerful descriptors for color texture classification. However, using them for color texture segmentation is difficult because of their high memory and computation requirements. To overcome this problem, we propose to extend fuzzy gray-level aura matrices to fuzzy color aura matrices, which would allow us to apply them to color texture image segmentation. Unlike the marginal approach that requires one fuzzy gray-level aura matrix for each color channel, a single fuzzy color aura matrix is required to locally characterize the interactions between colors of neighboring pixels. Furthermore, all works about fuzzy gray-level aura matrices consider the same neighborhood function for each site. Another contribution of this paper is to define an adaptive neighborhood function based on information about neighboring sites provided by a pre-segmentation method. For this purpose, we propose a modified simple linear iterative clustering algorithm that incorporates a regional feature in order to partition the image into superpixels. All in all, the proposed color texture image segmentation boils down to a superpixel classification using a simple supervised classifier, each superpixel being characterized by a fuzzy color aura matrix. Experimental results on the Prague texture segmentation benchmark show that our method outperforms the classical state-of-the-art supervised segmentation methods and is similar to recent methods based on deep learning. Full article
(This article belongs to the Special Issue Color Texture Classification)

20 pages, 13052 KiB  
Article
Use of Multispectral Microscopy in the Prediction of Coated Halftone Reflectance
by Fanny Dailliez, Mathieu Hébert, Lionel Chagas, Thierry Fournel and Anne Blayo
J. Imaging 2022, 8(9), 243; https://doi.org/10.3390/jimaging8090243 - 8 Sep 2022
Cited by 1 | Viewed by 1472
Abstract
When a print is coated with a transparent layer, such as a lamination film or a varnish layer, its color can be modified compared to the uncoated version due to multiple reflections between the layer-air interface and the inked substrate. These interreflections involve a multiple-convolution process between the halftone pattern and a ring-shaped luminous halo, and they are described by an optical model which we have developed. The challenge is to observe the impact of the coating layer on the print's spectral reflectances and to see whether it can be predicted. The approach is based on pictures of the print captured with a multispectral microscope, which are processed through the optical model to predict the spectral pictures of the coated print. The pictures, averaged over the spatial dimension, lead to spectral reflectances which can be compared with macroscale measurements performed with a spectrophotometer. Since comparison between macroscale measurements and microscale measurements with a multispectral microscope is delicate, specific care was taken to calibrate the instruments. This method resulted in fairly conclusive predictions, both at the macroscale with the spectral reflectances and at the microscale with an accurate prediction of the blurring effect induced by the multi-convolutive optical process. The tests carried out showed that the optical and visual effect of a coating layer on single-ink or multi-ink halftones with various patterns can be predicted with satisfactory accuracy. Hence, by measuring the spatio-spectral reflectance of the uncoated print and predicting the spatio-spectral reflectance of the coated print, we can predict the color changes due to the coating itself. The model could be included in color management workflows for printing applications that include a finishing coating. Full article
(This article belongs to the Special Issue Selected Papers from Computational Color Imaging Workshop 2022)

12 pages, 15255 KiB  
Article
Pore Segmentation Techniques for Low-Resolution Data: Application to the Neutron Tomography Data of Cement Materials
by Ivan Zel, Murat Kenessarin, Sergey Kichanov, Kuanysh Nazarov, Maria Bǎlǎșoiu and Denis Kozlenko
J. Imaging 2022, 8(9), 242; https://doi.org/10.3390/jimaging8090242 - 7 Sep 2022
Cited by 2 | Viewed by 1388
Abstract
The development of neutron imaging facilities provides a growing range of applications in different research fields. The significance of the obtained structural information, among others, depends on the reliability of phase segmentation. We focused on the problem of pore segmentation in low-resolution images and tomography data, taking into consideration possible image corruption in the neutron tomography experiment. Two pore segmentation techniques are proposed. They are the binarization of the enhanced contrast data using the global threshold, and the segmentation using the modified watershed technique—local threshold by watershed. The proposed techniques were compared with a conventional marker-based watershed on the test images simulating low-quality tomography data and on the neutron tomography data of the samples of magnesium potassium phosphate cement (MKP). The obtained results demonstrate the advantages of the proposed techniques over the conventional watershed-based approach. Full article
(This article belongs to the Special Issue Computational Methods for Neutron Imaging)

1 page, 163 KiB  
Correction
Correction: Tøttrup et al. A Real-Time Method for Time-to-Collision Estimation from Aerial Images. J. Imaging 2022, 8, 62
by Daniel Tøttrup, Stinus Lykke Skovgaard, Jonas le Fevre Sejersen and Rui Pimentel de Figueiredo
J. Imaging 2022, 8(9), 241; https://doi.org/10.3390/jimaging8090241 - 6 Sep 2022
Viewed by 769
Abstract
The authors wish to make the following corrections to the article [...] Full article
20 pages, 31395 KiB  
Article
Conditional Random Field-Guided Multi-Focus Image Fusion
by Odysseas Bouzos, Ioannis Andreadis and Nikolaos Mitianoudis
J. Imaging 2022, 8(9), 240; https://doi.org/10.3390/jimaging8090240 - 5 Sep 2022
Cited by 2 | Viewed by 1426
Abstract
Multi-focus image fusion is of great importance for coping with the limited depth of field of optical lenses. Since input images contain noise, multi-focus image fusion methods that support denoising are important. Transform-domain methods have been applied to image fusion; however, they are likely to produce artifacts. To cope with these issues, we introduce CRF-Guided fusion, a Conditional Random Field (CRF)-guided fusion method. A novel edge-aware centering method is proposed and employed to extract the low and high frequencies of the input images. The Independent Component Analysis (ICA) transform is applied to the high-frequency components, and a CRF model is created from the low frequencies and the transform coefficients. The CRF model is solved efficiently with the α-expansion method. The estimated labels are used to guide the fusion of the low-frequency components and the transform coefficients. Inverse ICA is then applied to the fused transform coefficients. Finally, the fused image is the sum of the fused low frequency and the fused high frequency. CRF-Guided fusion does not introduce artifacts during fusion and supports image denoising during fusion by applying transform-domain coefficient shrinkage. Quantitative and qualitative evaluation demonstrates the superior performance of CRF-Guided fusion compared to state-of-the-art multi-focus image fusion methods. Full article
(This article belongs to the Special Issue The Present and the Future of Imaging)

13 pages, 4421 KiB  
Article
Simple Color Calibration Method by Projection onto Retroreflective Materials
by Yusuke Nakamura and Takahiko Horiuchi
J. Imaging 2022, 8(9), 239; https://doi.org/10.3390/jimaging8090239 - 3 Sep 2022
Viewed by 1499
Abstract
Retroreflective materials have the property of directional reflection, reflecting light strongly in the direction of the light source, and have been used for road traffic signs. In recent years, retroreflective materials have been used in entertainment and industrial technologies, in combination with projection mapping technology. In general, color calibration is important when projectors are used to control reflected colors. In this study, we investigated a simple color calibration method for retroreflective materials with a 3D shape under the condition that they are observed in the same direction as the light source. Three types of retroreflective materials with different reflective properties were used. First, to measure the reflective properties of each reflective material, the reflective material was fixed to a flat plate and rotated, while the reflected light was measured in the same direction as the light source. It was then confirmed that the reflected light intensity varied smoothly with angular change, and appropriate measurement angles were investigated based on the AIC criterion, aiming to interpolate the reflectance characteristics from a small number of measurement angles. Color calibration was performed based on the reflectance characteristics obtained from the derived measurement angles, and the experiments showed that good color calibration was possible with fewer measurements. Full article
(This article belongs to the Special Issue Selected Papers from Computational Color Imaging Workshop 2022)
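The angle-selection idea in the abstract above, scoring candidate model complexities with the Akaike information criterion so that reflectance can be interpolated from a few measured angles, can be sketched as follows. The polynomial model family, the synthetic intensity curve, and the angle grid are illustrative assumptions, not the authors' actual measurement model:

```python
import numpy as np

def aic_for_fit(angles_deg, intensity, degree):
    """Fit a polynomial of the given degree to reflected-intensity
    measurements and return the Akaike information criterion."""
    coeffs = np.polyfit(angles_deg, intensity, degree)
    residuals = intensity - np.polyval(coeffs, angles_deg)
    n = len(intensity)
    rss = float(np.sum(residuals ** 2))
    k = degree + 1  # number of fitted parameters
    return n * np.log(rss / n) + 2 * k

# Synthetic retroreflector-like falloff: strong return near 0 degrees.
angles = np.linspace(0.0, 40.0, 9)
intensity = np.exp(-angles / 15.0) + 0.01 * np.cos(angles)

aics = {d: aic_for_fit(angles, intensity, d) for d in (1, 2, 3, 4)}
best_degree = min(aics, key=aics.get)
```

A lower AIC balances goodness of fit against model complexity, so the selected degree indicates how many measurement angles are actually needed.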
24 pages, 2540 KiB  
Article
A Novel Trademark Image Retrieval System Based on Multi-Feature Extraction and Deep Networks
by Sandra Jardim, João António, Carlos Mora and Artur Almeida
J. Imaging 2022, 8(9), 238; https://doi.org/10.3390/jimaging8090238 - 2 Sep 2022
Cited by 7 | Viewed by 2565
Abstract
Graphical Search Engines are conceptually used in many development areas surrounding information retrieval systems that aim to provide a visual representation of results, typically associated with retrieving images relevant to one or more input images. Since the 1990s, efforts have been made to improve result quality, whether through faster processing or more efficient graphical processing techniques that generate accurate representations of images for comparison. While many systems achieve timely results by combining high-level features, they still struggle when dealing with large datasets and abstract images. Image datasets concerning industrial property are an example of a hurdle for typical image retrieval systems, where the dimensions and characteristics of images make adequate comparison difficult. In this paper, we introduce an image retrieval system based on a multi-phase implementation of different deep learning and image processing techniques, designed to deliver highly accurate results regardless of dataset complexity and size. The proposed approach uses image signatures to provide a near-exact representation of an image, with abstraction levels that allow comparison with other signatures as a means to achieve a fully capable image comparison process. To overcome the performance penalty that the high complexity of image signatures imposes on multiple image searches, the proposed system incorporates a parallel processing block responsible for multi-image search scenarios. The system achieves image retrieval through a new compound similarity formula that accounts for all components of an image signature. The results show that the developed approach performs image retrieval with high accuracy, demonstrating that combining multiple image assets allows for more accurate comparisons across a broad spectrum of image typologies. The use of deep convolutional networks for feature extraction, as a means of semantically describing commonly encountered objects, allows the system to perform searches with a degree of abstraction. Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications)
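The compound similarity idea above, combining per-component similarities of an image signature into one score, might look like the following sketch. The component names, the use of cosine similarity, and the weights are hypothetical stand-ins; the paper's actual formula is not reproduced here:

```python
import numpy as np

def cosine_sim(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def compound_similarity(sig_a, sig_b, weights):
    """Weighted compound of per-component similarities between two
    image signatures (dicts of feature vectors)."""
    total = sum(weights.values())
    return sum(w * cosine_sim(sig_a[k], sig_b[k])
               for k, w in weights.items()) / total

# Hypothetical three-component signatures for two trademark images.
sig_a = {"cnn": [0.2, 0.9, 0.4], "shape": [1.0, 0.0], "colour": [0.3, 0.3, 0.4]}
sig_b = {"cnn": [0.25, 0.8, 0.5], "shape": [0.9, 0.1], "colour": [0.3, 0.2, 0.5]}
weights = {"cnn": 0.6, "shape": 0.25, "colour": 0.15}
score = compound_similarity(sig_a, sig_b, weights)
```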
24 pages, 1324 KiB  
Article
Comparison of Convolutional Neural Networks and Transformers for the Classification of Images of COVID-19, Pneumonia and Healthy Individuals as Observed with Computed Tomography
by Azucena Ascencio-Cabral and Constantino Carlos Reyes-Aldasoro
J. Imaging 2022, 8(9), 237; https://doi.org/10.3390/jimaging8090237 - 1 Sep 2022
Cited by 4 | Viewed by 2431
Abstract
In this work, the performance of five deep learning architectures in classifying COVID-19 in a multi-class set-up is evaluated. The classifiers were built on pretrained ResNet-50, ResNet-50r (with kernel size 5×5 in the first convolutional layer), DenseNet-121, MobileNet-v3 and the state-of-the-art CaiT-24-XXS-224 (CaiT) transformer. The cross entropy and weighted cross entropy were minimised with Adam and AdamW. In total, 20 experiments were conducted with 10 repetitions each, and the following metrics, followed by bootstrapping, were obtained: accuracy (Acc), balanced accuracy (BA), F1 and F2 from the general Fβ macro score, Matthews Correlation Coefficient (MCC), sensitivity (Sens) and specificity (Spec). The performance of the classifiers was compared using the Friedman–Nemenyi test. The results show that less complex architectures such as ResNet-50, ResNet-50r and DenseNet-121 achieved better generalization, with Matthews Correlation Coefficient rankings of 1.53, 1.71 and 3.05, respectively, while MobileNet-v3 and CaiT obtained rankings of 3.72 and 5.0, respectively. Full article
(This article belongs to the Special Issue The Present and the Future of Imaging)
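The Friedman ranking used above to compare classifiers can be reproduced in a few lines. The toy MCC scores below are invented for illustration, and ties are ignored for brevity:

```python
import numpy as np

def friedman_ranks(scores):
    """scores: (n_repetitions, n_classifiers) matrix of a metric where
    higher is better. Returns mean ranks (1 = best) and the Friedman
    chi-square statistic. Ties are not handled."""
    scores = np.asarray(scores, float)
    n, k = scores.shape
    # Rank within each row; rank 1 goes to the best (highest) score.
    order = (-scores).argsort(axis=1)
    ranks = np.empty_like(scores)
    rows = np.arange(n)[:, None]
    ranks[rows, order] = np.arange(1, k + 1)
    mean_ranks = ranks.mean(axis=0)
    chi2 = 12 * n / (k * (k + 1)) * np.sum(mean_ranks ** 2) - 3 * n * (k + 1)
    return mean_ranks, chi2

# Toy MCC scores for 3 classifiers over 4 repetitions.
mcc = [[0.82, 0.80, 0.70],
       [0.85, 0.79, 0.72],
       [0.81, 0.83, 0.69],
       [0.84, 0.78, 0.71]]
mean_ranks, chi2 = friedman_ranks(mcc)
```

The mean ranks are directly comparable to the rankings quoted in the abstract (lower is better); the Nemenyi post-hoc test then decides which rank differences are significant.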
14 pages, 2416 KiB  
Review
Dual-Energy CT of the Heart: A Review
by Serena Dell’Aversana, Raffaele Ascione, Marco De Giorgi, Davide Raffaele De Lucia, Renato Cuocolo, Marco Boccalatte, Gerolamo Sibilio, Giovanni Napolitano, Giuseppe Muscogiuri, Sandro Sironi, Giuseppe Di Costanzo, Enrico Cavaglià, Massimo Imbriaco and Andrea Ponsiglione
J. Imaging 2022, 8(9), 236; https://doi.org/10.3390/jimaging8090236 - 1 Sep 2022
Cited by 11 | Viewed by 5077
Abstract
Dual-energy computed tomography (DECT) is an emerging imaging technique based on the acquisition of two separate datasets with two different X-ray energy spectra. Several cardiac DECT applications have been assessed, such as virtual monoenergetic images, virtual non-contrast reconstructions, and iodine myocardial perfusion maps, which have been demonstrated to improve diagnostic accuracy and image quality while reducing both radiation dose and contrast media administration. This review summarizes the technical basis of DECT, reviews the principal cardiac applications currently adopted in clinical practice, and explores possible future applications. Full article
10 pages, 2763 KiB  
Article
Agar Gel as a Non-Invasive Coupling Medium for Reflectance Photoacoustic (PA) Imaging: Experimental Results on Wall-Painting Mock-Ups
by Antonina Chaban, George J. Tserevelakis, Evgenia Klironomou, Giannis Zacharakis and Jana Striova
J. Imaging 2022, 8(9), 235; https://doi.org/10.3390/jimaging8090235 - 30 Aug 2022
Cited by 4 | Viewed by 2014
Abstract
The new reflectance set-up configuration has extended the applicability of the photoacoustic (PA) imaging technique to art objects of any thickness and form. Until now, ultrasound gel or distilled water has been necessary as a coupling medium between the immersion-type transducer and the object's surface. These media can compromise the integrity of real artwork; therefore, known applications of reflectance PA imaging have been limited to experimental mock-ups. In this paper, we evaluate an alternative non-invasive PA coupling medium, agar gel, applied in two layers of different consistency: a rigid layer to protect the object's surface and a fluid layer for the transducer's immersion and movement. Agar gel is widely used in various conservation treatments on cultural heritage objects, and it has been proven to be safely applicable on delicate surfaces. Here, we quantify and compare the contrast and signal-to-noise ratio (SNR) of PA images obtained in water and in agar gel on the same areas, under equal experimental conditions. The results demonstrate that the technique's performance in agar is comparable to that in water. The study confirms the potential of the PA approach for revealing hidden features and shows that it is safely applicable in future real-case studies. Full article
(This article belongs to the Special Issue Spectral Imaging for Cultural Heritage)
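The contrast and SNR comparison above could be computed along these lines; the region-of-interest values and the exact contrast/SNR definitions are assumptions, since the paper's formulas are not given in the abstract:

```python
import numpy as np

def contrast_and_snr(signal_roi, background_roi):
    """Michelson-style contrast and a mean-over-noise SNR from two
    regions of a PA image; common choices, not necessarily the paper's."""
    s = float(np.mean(signal_roi))
    b = float(np.mean(background_roi))
    contrast = (s - b) / (s + b)
    snr = s / float(np.std(background_roi))
    return contrast, snr

rng = np.random.default_rng(0)
signal = rng.normal(100, 10, 1000)  # PA amplitudes over a hidden feature
noise = rng.normal(10, 2, 1000)     # background region
contrast, snr = contrast_and_snr(signal, noise)
```

Computing both quantities on the same areas for the water and agar acquisitions gives the paired comparison the study reports.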
18 pages, 5837 KiB  
Article
Adaptation to CT Reconstruction Kernels by Enforcing Cross-Domain Feature Maps Consistency
by Stanislav Shimovolos, Andrey Shushko, Mikhail Belyaev and Boris Shirokikh
J. Imaging 2022, 8(9), 234; https://doi.org/10.3390/jimaging8090234 - 30 Aug 2022
Viewed by 1746
Abstract
Deep learning methods provide significant assistance in analyzing coronavirus disease (COVID-19) in chest computed tomography (CT) images, including identification, severity assessment, and segmentation. Although the earlier developed methods address the lack of data and specific annotations, the current goal is to build a robust algorithm for clinical use, having a larger pool of available data. With larger datasets, the domain shift problem arises, affecting the performance of methods on unseen data. One of the critical sources of domain shift in CT images is the difference in the reconstruction kernels used to generate images from the raw data (sinograms). In this paper, we show a decrease in COVID-19 segmentation quality when a model is trained on smooth and tested on sharp reconstruction kernels. Furthermore, we compare several domain adaptation approaches to tackle the problem, such as task-specific augmentation and unsupervised adversarial learning. Finally, we propose an unsupervised adaptation method, called F-Consistency, that outperforms the previous approaches. Our method exploits a set of unlabeled CT image pairs which differ only in reconstruction kernels within every pair. It enforces the similarity of the network's hidden representations (feature maps) by minimizing the mean squared error (MSE) between paired feature maps. We show that our method achieves a 0.64 Dice Score on the test dataset with unseen sharp kernels, compared to the 0.56 Dice Score of the baseline model. Moreover, F-Consistency scores a 0.80 Dice Score between predictions on the paired images, almost doubling the baseline score of 0.46 and surpassing the other methods. We also show that F-Consistency generalizes better than the other methods trained on unlabeled data, both to unseen kernels and to images without COVID-19 lesions. Full article
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis, Volume II)
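The core of the F-Consistency objective described above, an MSE penalty between hidden feature maps of paired smooth-kernel and sharp-kernel images, reduces to a few lines. NumPy arrays stand in here for the network activations; layer shapes and noise levels are invented:

```python
import numpy as np

def f_consistency_loss(feats_smooth, feats_sharp):
    """Mean squared error between corresponding feature maps of the same
    network applied to a smooth-kernel and a sharp-kernel reconstruction
    of the same scan, averaged over the chosen layers."""
    losses = [float(np.mean((a - b) ** 2))
              for a, b in zip(feats_smooth, feats_sharp)]
    return sum(losses) / len(losses)

rng = np.random.default_rng(1)
# Pretend feature maps from two chosen layers: (channels, H, W).
f_smooth = [rng.normal(size=(8, 16, 16)), rng.normal(size=(16, 8, 8))]
# A kernel-induced perturbation of the same features.
f_sharp = [a + rng.normal(scale=0.1, size=a.shape) for a in f_smooth]

loss_paired = f_consistency_loss(f_smooth, f_sharp)
loss_self = f_consistency_loss(f_smooth, f_smooth)
```

Minimizing this term over unlabeled kernel pairs pushes the two domains toward the same hidden representation, which is what the Dice improvements on paired predictions measure.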
14 pages, 4927 KiB  
Article
Privacy-Preserving Semantic Segmentation Using Vision Transformer
by Hitoshi Kiya, Teru Nagamori, Shoko Imaizumi and Sayaka Shiota
J. Imaging 2022, 8(9), 233; https://doi.org/10.3390/jimaging8090233 - 30 Aug 2022
Cited by 10 | Viewed by 2590
Abstract
In this paper, we propose a privacy-preserving semantic segmentation method that uses encrypted images and models with the vision transformer (ViT), called the segmentation transformer (SETR). The combined use of encrypted images and SETR allows us not only to apply images without sensitive visual information to SETR as query images but also to maintain the same accuracy as that of using plain images. Previous privacy-preserving methods with encrypted images for deep neural networks have focused on image classification tasks, and they result in a lower accuracy than models trained with plain images due to the influence of image encryption. To overcome these issues, a novel method for privacy-preserving semantic segmentation is proposed that, for the first time, exploits the patch embedding structure of the ViT. In experiments, the proposed privacy-preserving semantic segmentation was demonstrated to achieve the same accuracy as that of using plain images when encrypted images were used. Full article
(This article belongs to the Topic Computer Vision and Image Processing)
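A common learnable-encryption scheme of the kind used with ViTs is key-based block scrambling, where the block size matches the ViT patch size so that the embedding layer can be adapted to the encrypted domain. The sketch below is a generic illustration, not necessarily the exact transformation used in the paper:

```python
import numpy as np

def block_scramble(img, patch, key):
    """Permute non-overlapping patch-size blocks with a secret key."""
    h, w, c = img.shape
    gh, gw = h // patch, w // patch
    blocks = img.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    blocks = blocks.reshape(gh * gw, patch, patch, c)
    perm = np.random.default_rng(key).permutation(gh * gw)
    blocks = blocks[perm]
    out = blocks.reshape(gh, gw, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return out.reshape(h, w, c)

def block_unscramble(enc, patch, key):
    """Invert the permutation, given the same secret key."""
    h, w, c = enc.shape
    gh, gw = h // patch, w // patch
    blocks = enc.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    blocks = blocks.reshape(gh * gw, patch, patch, c)
    inv = np.argsort(np.random.default_rng(key).permutation(gh * gw))
    blocks = blocks[inv]
    out = blocks.reshape(gh, gw, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return out.reshape(h, w, c)

img = np.arange(32 * 32 * 3, dtype=np.uint8).reshape(32, 32, 3)
enc = block_scramble(img, patch=8, key=42)
dec = block_unscramble(enc, patch=8, key=42)
```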
20 pages, 11391 KiB  
Article
Angle-Retaining Chromaticity and Color Space: Invariants and Properties
by Marco Buzzelli
J. Imaging 2022, 8(9), 232; https://doi.org/10.3390/jimaging8090232 - 29 Aug 2022
Cited by 3 | Viewed by 1961
Abstract
The angle-retaining color space (ARC) and the corresponding chromaticity diagram encode information following a cylindrical color model. Their main property is that angular distances in RGB are mapped into Euclidean distances in the ARC chromatic components, making the color space suitable for data representation in the domain of color constancy. In this paper, we present an in-depth analysis of various properties of ARC: we document the variations in numerical precision of two alternative formulations of the ARC-to-RGB transformation and characterize how various perturbations in RGB affect the ARC representation. This was done empirically for the ARC diagram, in a direct comparison against other commonly used chromaticity diagrams, and analytically for the ARC space with respect to its three components. We conclude by describing the color space in terms of perceptual uniformity, suggesting the need for new perceptual color metrics. Full article
(This article belongs to the Special Issue Selected Papers from Computational Color Imaging Workshop 2022)
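The angular RGB distance that ARC maps to a Euclidean distance in its chromatic components is simply the angle between two RGB vectors, the same quantity used as recovery angular error in color constancy:

```python
import numpy as np

def rgb_angular_distance(p, q):
    """Angle (radians) between two RGB triplets; intensity-invariant,
    since scaling either vector leaves the angle unchanged."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    cos = p @ q / (np.linalg.norm(p) * np.linalg.norm(q))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Same chromaticity at different intensities: zero angular distance.
d_same = rgb_angular_distance([0.2, 0.4, 0.6], [0.1, 0.2, 0.3])
# Pure red vs. pure green: orthogonal RGB vectors.
d_diff = rgb_angular_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```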
19 pages, 2261 KiB  
Article
Automatic Classification of Simulated Breast Tomosynthesis Whole Images for the Presence of Microcalcification Clusters Using Deep CNNs
by Ana M. Mota, Matthew J. Clarkson, Pedro Almeida and Nuno Matela
J. Imaging 2022, 8(9), 231; https://doi.org/10.3390/jimaging8090231 - 29 Aug 2022
Cited by 5 | Viewed by 2262
Abstract
Microcalcification clusters (MCs) are among the most important biomarkers for breast cancer, especially in cases of nonpalpable lesions. The vast majority of deep learning studies on digital breast tomosynthesis (DBT) focus on detecting and classifying lesions, especially soft-tissue lesions, in small, previously selected regions of interest. Only about 25% of the studies are specific to MCs, and all of them are based on the classification of small preselected regions. Classifying a whole image according to the presence or absence of MCs is a difficult task due to the small size of MCs and the amount of information present in an entire image. A completely automatic and direct classification, which receives the entire image without prior identification of any regions, is crucial for the usefulness of these techniques in a real clinical and screening environment. The main purpose of this work is to implement and evaluate the performance of convolutional neural networks (CNNs) for the automatic classification of a complete DBT image by the presence or absence of MCs, without any prior identification of regions. Four popular deep CNNs are trained and compared with a new architecture proposed by us. A public database of realistic simulated data was used, and the whole DBT image was taken as input, both without and with preprocessing (to study the impact of noise reduction and contrast enhancement methods on the evaluation of MCs with CNNs). The area under the receiver operating characteristic curve (AUC) was used to evaluate performance. Very promising results were achieved, with a maximum AUC of 94.19% for GoogLeNet. The second-best AUC, 91.17%, was obtained with our newly implemented network, CNN-a, which was also the fastest, making it a very interesting model for further studies. These outcomes are similar to those reported for the detection of larger lesions such as masses. Moreover, given the difficulty of visualizing MCs, which are often spread over several slices, this work may have an important impact on the clinical analysis of DBT images. Full article
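The AUC used above for evaluation equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and it can be computed directly from that definition. The scores and labels below are invented:

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC via the Mann-Whitney U statistic: the fraction of
    positive/negative pairs ranked correctly, ties counting half."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return float(wins) / (len(pos) * len(neg))

# Toy classifier outputs for six DBT cases (1 = MCs present).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_from_scores(scores, labels)
```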
11 pages, 7512 KiB  
Article
Application of Fractal Image Analysis by Scale-Space Filtering in Experimental Mechanics
by Anna Bauer, Wolfram Volk and Christoph Hartmann
J. Imaging 2022, 8(9), 230; https://doi.org/10.3390/jimaging8090230 - 26 Aug 2022
Cited by 2 | Viewed by 1592
Abstract
Increasingly complex numerical analyses require ever more precise, accurate and varied input parameters in order to achieve results that are as realistic and reliable as possible. Experimental analyses for material parameter identification are therefore of high importance and a driving force for further developments. In this work, the opportunities offered by applying fractal analysis to optical measurement data of a shear cutting process are investigated. The fractal analysis is based on a modification of the concept of scale-space filtering. Scale exponent fields are calculated for image sequences of the shear cutting process taken with a mobile microscope. A least-squares approximation is used for the automated evaluation of the local scale exponent values. In order to determine the change in the scale exponent of individual material points, digital image correlation is applied. Full article
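A local scale exponent of the kind described above can be estimated as the least-squares slope in log-log coordinates of filter response versus scale. The power-law data below are synthetic; the actual filter responses in the paper come from scale-space filtering of the microscope images:

```python
import numpy as np

def local_scale_exponent(scales, responses):
    """Least-squares slope of log(response) vs. log(scale), i.e. the
    exponent of an assumed power law response ~ scale**exponent."""
    x = np.log(np.asarray(scales, float))
    y = np.log(np.asarray(responses, float))
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)

scales = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
responses = 3.0 * scales ** 0.7   # synthetic power law, exponent 0.7
exponent = local_scale_exponent(scales, responses)
```

Evaluating this fit in a window around every pixel yields the scale exponent fields mentioned in the abstract; digital image correlation then tracks how the exponent of each material point changes over the sequence.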
25 pages, 1112 KiB  
Article
Shapley-Additive-Explanations-Based Factor Analysis for Dengue Severity Prediction Using Machine Learning
by Shihab Uddin Chowdhury, Sanjana Sayeed, Iktisad Rashid, Md. Golam Rabiul Alam, Abdul Kadar Muhammad Masum and M. Ali Akber Dewan
J. Imaging 2022, 8(9), 229; https://doi.org/10.3390/jimaging8090229 - 26 Aug 2022
Cited by 3 | Viewed by 2749
Abstract
Dengue is a viral disease that primarily affects tropical and subtropical regions and is especially prevalent in South-East Asia. This mosquito-borne disease sometimes triggers nationwide epidemics, which result in a large number of fatalities. Most severe cases involve the development of Dengue Haemorrhagic Fever (DHF), a large portion of which are detected in children under the age of ten, with severe conditions often progressing to a critical state known as Dengue Shock Syndrome (DSS). In this study, we analysed two separate datasets from two different countries, Vietnam and Bangladesh, which we refer to as VDengu and BDengue, respectively. The VDengu dataset is structured, so supervised learning models were effective for predictive analysis; among them, the gradient-boosted decision tree classifier XGBoost produced the best outcome. Furthermore, Shapley Additive Explanations (SHAP) were applied to the XGBoost model to assess the significance of individual attributes of the dataset. For the significant attributes, we used SHAP dependence plots to identify the range of each attribute associated with the number of DHF or DSS cases. The dataset from Bangladesh, in contrast, is unstructured; therefore, we applied an unsupervised learning technique, hierarchical clustering, to find clusters of vital blood components of the patients according to their complete blood count reports. The clusters were further analysed to find the attributes that led to DSS or DHF. Full article
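The unsupervised step for the BDengue data, hierarchical clustering of complete-blood-count profiles, can be sketched with SciPy. The feature values below are illustrative only (low platelets with raised haematocrit is the kind of profile associated with severe dengue), and the Ward linkage choice is an assumption:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Toy complete-blood-count features: [platelets (x10^3/uL), haematocrit (%)].
cbc = np.array([
    [250.0, 40.0], [240.0, 41.0], [260.0, 39.0],   # typical-looking profiles
    [60.0, 50.0], [55.0, 52.0], [70.0, 49.0],      # severe-looking profiles
])

# Standardize features, build a Ward dendrogram, cut into two clusters.
z = (cbc - cbc.mean(axis=0)) / cbc.std(axis=0)
labels = fcluster(linkage(z, method="ward"), t=2, criterion="maxclust")
```

Inspecting the attribute distributions within each resulting cluster is what links the clusters back to DHF/DSS outcomes in the study.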
20 pages, 615 KiB  
Review
AI in Breast Cancer Imaging: A Survey of Different Applications
by João Mendes, José Domingues, Helena Aidos, Nuno Garcia and Nuno Matela
J. Imaging 2022, 8(9), 228; https://doi.org/10.3390/jimaging8090228 - 26 Aug 2022
Cited by 11 | Viewed by 3957
Abstract
Breast cancer was the most diagnosed cancer in 2020, and several thousand women continue to die from this disease each year. A better and earlier diagnosis may be of great importance to improving prognosis, and that is where Artificial Intelligence (AI) could play a major role. This paper surveys different applications of AI in breast imaging. First, traditional Machine Learning and Deep Learning methods that can detect the presence of a lesion and classify it as benign or malignant, which could help to reduce reading time and improve accuracy, are analyzed. Following that, research on breast cancer risk prediction using mammograms, which may allow screening programs to be customized in both periodicity and modality, is reviewed. The subsequent section analyzes different augmentation techniques that help to overcome the lack of labeled data. Finally, still concerning the absence of large labeled datasets, the last section studies self-supervised learning, where AI models learn a representation of the input by themselves. This review gives a general view of what AI can offer in the field of breast imaging, discussing not only its potential but also the challenges that still have to be overcome. Full article
21 pages, 6664 KiB  
Article
3D Dynamic Spatiotemporal Atlas of the Vocal Tract during Consonant–Vowel Production from 2D Real Time MRI
by Ioannis K. Douros, Yu Xie, Chrysanthi Dourou, Karyna Isaieva, Pierre-André Vuissoz, Jacques Felblinger and Yves Laprie
J. Imaging 2022, 8(9), 227; https://doi.org/10.3390/jimaging8090227 - 25 Aug 2022
Cited by 1 | Viewed by 1719
Abstract
In this work, we address the problem of creating a 3D dynamic atlas of the vocal tract that captures the dynamics of the articulators in all three dimensions, in order to create a global speaker model independent of speaker-specific characteristics. The core steps of the proposed method are the temporal alignment of real-time MR images acquired in several sagittal planes and their combination with adaptive kernel regression. As a preprocessing step, a reference space was created to remove anatomical information specific to each speaker and keep only the variability in speech production for the construction of the atlas. The adaptive kernel regression makes the choice of atlas time points independent of the time points of the frames used as input for the construction. The atlas construction method was evaluated by mapping two new speakers to the atlas and checking how similar the resulting mapped images are. The use of the atlas helps to reduce subject variability. The results show that the proposed atlas can capture the dynamic behavior of the articulators and is able to generalize the speech production process by creating a universal-speaker reference space. Full article
(This article belongs to the Special Issue Spatio-Temporal Biomedical Image Analysis)
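Plain (non-adaptive) kernel regression of the kind underlying the atlas construction above is a Gaussian-weighted average of frames evaluated at a chosen atlas time point; the paper's adaptive variant adjusts the bandwidth per time point. The frames, times, and fixed bandwidth below are illustrative:

```python
import numpy as np

def kernel_regression(frame_times, frames, t, bandwidth):
    """Nadaraya-Watson estimate of the atlas image at time t: a
    Gaussian-weighted average of input frames acquired at frame_times."""
    frame_times = np.asarray(frame_times, float)
    w = np.exp(-0.5 * ((frame_times - t) / bandwidth) ** 2)
    w /= w.sum()
    return np.tensordot(w, np.asarray(frames, float), axes=1)

# Four 2x2 "midsagittal frames" acquired at irregular time points.
times = [0.0, 0.3, 0.7, 1.0]
frames = [np.full((2, 2), v) for v in (0.0, 3.0, 7.0, 10.0)]
atlas_mid = kernel_regression(times, frames, t=0.5, bandwidth=0.2)
```

Because the estimate exists for any t, atlas time points can be chosen independently of the acquisition times of the input frames, as the abstract notes.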
11 pages, 1453 KiB  
Article
X-ray Dark-Field Imaging for Improved Contrast in Historical Handwritten Literature
by Bernhard Akstaller, Stephan Schreiner, Lisa Dietrich, Constantin Rauch, Max Schuster, Veronika Ludwig, Christina Hofmann-Randall, Thilo Michel, Gisela Anton and Stefan Funk
J. Imaging 2022, 8(9), 226; https://doi.org/10.3390/jimaging8090226 - 24 Aug 2022
Viewed by 1784
Abstract
If ancient documents are too fragile to be opened, X-ray imaging can be used to recover their content non-destructively. As an extension of conventional attenuation imaging, dark-field imaging provides access to microscopic structural information about an object, which can be especially advantageous for materials with weak attenuation contrast, such as certain metal-free inks on paper. Using cotton paper and different self-made inks based on authentic recipes, we produced test samples for attenuation and dark-field imaging at a metal-jet X-ray source. The resulting images show letters written in metal-free ink that were recovered via grating-based dark-field imaging. Without the need for synchrotron-like beam quality, these results lay the groundwork for a mobile dark-field imaging setup that could be brought to a library for document scanning, avoiding long transport routes for valuable historic documents. Full article
23 pages, 3511 KiB  
Review
Three-Dimensional Reconstruction from a Single RGB Image Using Deep Learning: A Review
by Muhammad Saif Ullah Khan, Alain Pagani, Marcus Liwicki, Didier Stricker and Muhammad Zeshan Afzal
J. Imaging 2022, 8(9), 225; https://doi.org/10.3390/jimaging8090225 - 23 Aug 2022
Cited by 4 | Viewed by 4324
Abstract
Performing 3D reconstruction from a single 2D input is a challenging problem that is trending in the literature. Until recently, it was an ill-posed optimization problem, but with the advent of learning-based methods, the performance of 3D reconstruction has significantly improved. Infinitely many different 3D objects can be projected onto the same 2D plane, which makes the reconstruction task very difficult; it is even harder for objects with complex deformations or no texture. This paper reviews recent literature on 3D reconstruction from a single view, with a focus on deep learning methods from 2018 to 2021. Due to the lack of standard datasets and 3D shape representations, it is hard to compare all reviewed methods directly, but the paper covers different approaches for reconstructing 3D shapes as depth maps, surface normals, point clouds, and meshes, along with the various loss functions and metrics used to train and evaluate these methods. Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images)