 
 

J. Imaging, Volume 8, Issue 3 (March 2022) – 32 articles

Cover Story: Visual tracking is still an open challenge in computer vision, especially with mobile cameras and in the wild. Errors in the target’s bounding-box estimates accumulate over time and lead to drifting of the tracker. Hence, bounding-box estimates must be as precise as possible in each frame. This article proposes a new iterative procedure that locates the target in the image by gradually refining its bounding box. It also introduces the idea of non-conflicting bounding-box transformations, which allows multiple refinements to be applied to the target’s bounding box without introducing ambiguities when learning parameters. The empirical results demonstrate that the proposed approach improves on single iterative refinement in terms of tracking accuracy.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
12 pages, 1781 KiB  
Article
A New Approach in Detectability of Microcalcifications in the Placenta during Pregnancy Using Textural Features and K-Nearest Neighbors Algorithm
by Mihaela Miron, Simona Moldovanu, Bogdan Ioan Ștefănescu, Mihai Culea, Sorin Marius Pavel and Anisia Luiza Culea-Florescu
J. Imaging 2022, 8(3), 81; https://doi.org/10.3390/jimaging8030081 - 19 Mar 2022
Cited by 3 | Viewed by 2544
Abstract
(1) Background: Ultrasonography is the main method used during pregnancy to assess the fetal growth, amniotic fluid, umbilical cord and placenta. The placenta’s structure suffers dynamic modifications throughout the whole pregnancy and many of these changes, in which placental microcalcifications are by far the most prominent, are related to the process of aging and maturation and have no effect on fetal wellbeing. However, when placental microcalcifications are noticed earlier during pregnancy, they could suggest a major placental dysfunction with serious consequences for the fetus and mother. For better detectability of microcalcifications, we propose a new approach based on improving the clarity of details and the analysis of the placental structure using first and second order statistics, and fractal dimension. (2) Methods: The methodology is based on four stages: (i) cropping the region of interest and preprocessing steps; (ii) feature extraction, first order—standard deviation (SD), skewness (SK) and kurtosis (KR)—and second order—contrast (C), homogeneity (H), correlation (CR), energy (E) and entropy (EN)—are computed from a gray level co-occurrence matrix (GLCM) and fractal dimension (FD); (iii) statistical analysis (t-test); (iv) classification with the K-Nearest Neighbors algorithm (K-NN algorithm) and performance comparison with results from the support vector machine algorithm (SVM algorithm). (3) Results: Experimental results obtained from real clinical data show an improvement in the detectability and visibility of placental microcalcifications. Full article
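Stages (ii) and (iv) of the four-stage pipeline above translate directly into a few lines of NumPy. The sketch below is illustrative only: the quantization depth, pixel offset, feature subset and k are assumptions, not the paper's settings.

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one pixel offset, plus four
    second-order features from the abstract: contrast, homogeneity,
    energy and entropy."""
    q = (img.astype(float) / img.max() * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    homogeneity = np.sum(p / (1 + (i - j) ** 2))
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return np.array([contrast, homogeneity, energy, entropy])

def knn_predict(train_X, train_y, x, k=3):
    """Stage (iv): K-NN classification by majority vote of the k
    nearest feature vectors."""
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()
```

Correlation and the first-order statistics (SD, skewness, kurtosis) follow the same pattern, and the t-test stage of the pipeline can reuse the resulting feature vectors.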
(This article belongs to the Special Issue Radiomics and Texture Analysis in Medical Imaging)
Show Figures

Figure 1

12 pages, 4461 KiB  
Article
Neutron Tomography Studies of Two Lamprophyre Dike Samples: 3D Data Analysis for the Characterization of Rock Fabric
by Ivan Zel, Bekhzodjon Abdurakhimov, Sergey Kichanov, Olga Lis, Elmira Myrzabekova, Denis Kozlenko, Mannab Tashmetov, Khalbay Ishbaev and Kuatbay Kosbergenov
J. Imaging 2022, 8(3), 80; https://doi.org/10.3390/jimaging8030080 - 19 Mar 2022
Viewed by 2142
Abstract
The rock fabric of two lamprophyre dike samples from the Koy-Tash granitoid intrusion (Koy-Tash, Jizzakh region, Uzbekistan) has been studied using the neutron tomography method. We have performed virtual segmentation of the reconstructed 3D model of the tabular igneous intrusion and the corresponding determination of the dike margin orientations. Spatial distributions of inclusions in the dike volume, as well as size distributions and shape orientations of the inclusions, have been obtained. The observed shape-preferred orientations of the inclusions serve as evidence of magma flow-related fabric. The obtained structural data are discussed in the framework of models of rigid-particle motion and the straining of vesicles in a moving viscous fluid. Full article
(This article belongs to the Special Issue Computational Methods for Neutron Imaging)
Show Figures

Figure 1

13 pages, 26935 KiB  
Article
Comparing Desktop vs. Mobile Interaction for the Creation of Pervasive Augmented Reality Experiences
by Tiago Madeira, Bernardo Marques, Pedro Neves, Paulo Dias and Beatriz Sousa Santos
J. Imaging 2022, 8(3), 79; https://doi.org/10.3390/jimaging8030079 - 18 Mar 2022
Cited by 8 | Viewed by 2749
Abstract
This paper presents an evaluation and comparison of interaction methods for the configuration and visualization of pervasive Augmented Reality (AR) experiences using two different platforms: desktop and mobile. AR experiences consist of the enhancement of real-world environments by superimposing additional layers of information, real-time interaction, and accurate 3D registration of virtual and real objects. Pervasive AR extends this concept through experiences that are continuous in space, being aware of and responsive to the user’s context and pose. Currently, the time and technical expertise required to create such applications are the main reasons preventing its widespread use. As such, authoring tools which facilitate the development and configuration of pervasive AR experiences have become progressively more relevant. Their operation often involves the navigation of the real-world scene and the use of the AR equipment itself to add the augmented information within the environment. The proposed experimental tool makes use of 3D scans from physical environments to provide a reconstructed digital replica of such spaces for a desktop-based method, and to enable positional tracking for a mobile-based one. While the desktop platform represents a non-immersive setting, the mobile one provides continuous AR in the physical environment. Both versions can be used to place virtual content and ultimately configure an AR experience. The authoring capabilities of the different platforms were compared by conducting a user study focused on evaluating their usability. Although the AR interface was generally considered more intuitive, the desktop platform shows promise in several aspects, such as remote configuration, lower required effort, and overall better scalability. Full article
(This article belongs to the Special Issue Advanced Scene Perception for Augmented Reality)
Show Figures

Figure 1

24 pages, 6178 KiB  
Article
The Capabilities of Dedicated Small Satellite Infrared Missions for the Quantitative Characterization of Wildfires
by Winfried Halle, Christian Fischer, Dieter Oertel and Boris Zhukov
J. Imaging 2022, 8(3), 78; https://doi.org/10.3390/jimaging8030078 - 18 Mar 2022
Viewed by 2320
Abstract
The main objective of this paper was to demonstrate the capability of dedicated small satellite infrared sensors with cooled quantum detectors, such as those successfully utilized three times in Germany’s pioneering BIRD and FireBIRD small satellite infrared missions, in the quantitative characterization of high-temperature events such as wildfires. The Bi-spectral Infrared Detection (BIRD) mission was launched in October 2001. The space segment of FireBIRD consists of the small satellites Technologie Erprobungs-Träger (TET-1), launched in July 2012, and Bi-spectral InfraRed Optical System (BIROS), launched in June 2016. These missions also significantly improved the scientific understanding of space-borne fire monitoring with regard to climate change. The selected examples compare the evaluation of quantitative fire characteristics using data from BIRD or FireBIRD and from the operational polar-orbiting IR sensor systems MODIS, SLSTR and VIIRS. Data from the geostationary satellite “Himawari-8” were compared with simultaneously obtained FireBIRD data. The geostationary Meteosat Third Generation-Imager (MTG-I) is foreseen to be launched at the end of 2022. For fire applications, the MTG-I’s Flexible Combined Imager (FCI) will provide fire-related spectral bands at 3.8 µm and 10.5 µm with a ground sampling distance (GSD) at the sub-satellite point (SSP) of 1 km or 2 km, depending on the FCI imaging mode used. BIRD wildfire data, obtained over Africa and Portugal, were used to simulate the fire detection and monitoring capability of MTG-I/FCI. A new quality of fire monitoring is predicted if the 1 km resolution wildfire data from MTG-I/FCI are used together with the co-located fire data acquired by the polar-orbiting Visible Infrared Imaging Radiometer Suite (VIIRS), and possibly by prospective FireBIRD-type compact IR sensors flying on several small satellites in various low Earth orbits (LEOs). Full article
(This article belongs to the Special Issue Infrared-Image Processing for Climate Change Monitoring from Space)
Show Figures

Figure 1

14 pages, 10260 KiB  
Article
Metal Artifact Reduction in Spectral X-ray CT Using Spectral Deep Learning
by Matteo Busi, Christian Kehl, Jeppe R. Frisvad and Ulrik L. Olsen
J. Imaging 2022, 8(3), 77; https://doi.org/10.3390/jimaging8030077 - 17 Mar 2022
Cited by 7 | Viewed by 3400
Abstract
Spectral X-ray computed tomography (SCT) is an emerging method for non-destructive imaging of the inner structure of materials. Compared with conventional X-ray CT, this technique provides spectral photon energy resolution in a finite number of energy channels, adding a new dimension to the reconstructed volumes and images. While this mitigates energy-dependent distortions such as beam hardening, metal artifacts due to photon starvation effects are still present, especially for low-energy channels where the attenuation coefficients are higher. We present a correction method for metal artifact reduction in SCT that is based on spectral deep learning. The correction efficiently reduces streaking artifacts in all the measured energy channels. We show that the additional information in the energy domain helps restore the quality of low-energy reconstructions affected by metal artifacts. The correction method is parameter-free and takes only around 15 ms per energy channel, satisfying the near-real-time requirements of industrial scanners. Full article
(This article belongs to the Special Issue Advances in Deep Neural Networks for Visual Pattern Recognition)
Show Figures

Figure 1

26 pages, 2722 KiB  
Article
Microsaccades, Drifts, Hopf Bundle and Neurogeometry
by Dmitri Alekseevsky
J. Imaging 2022, 8(3), 76; https://doi.org/10.3390/jimaging8030076 - 17 Mar 2022
Cited by 1 | Viewed by 2159
Abstract
The first part of the paper contains a short review of image processing in early vision, both in statics, when the eyes and the stimulus are stable, and in dynamics, when the eyes participate in fixation eye movements. In the second part, we give an interpretation of Donders’ and Listing’s laws in terms of the Hopf fibration of the 3-sphere over the 2-sphere. In particular, it is shown that the configuration space of the eyeball (when the head is fixed) is the 2-dimensional hemisphere SL+, called the Listing hemisphere, and saccades are described as geodesic segments of SL+ with respect to the standard round metric. We study fixation eye movements (drift and microsaccades) in terms of this model and discuss the role of fixation eye movements in vision. A model of fixation eye movements is proposed that explains the presaccadic shift of receptive fields. Full article
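The Hopf fibration invoked above has a standard explicit textbook form (the notation below is not taken from the paper): writing a point of the 3-sphere inside C² as a pair of complex numbers, the Hopf map sends it to a point of the 2-sphere.

```latex
% Standard Hopf map h : S^3 \to S^2. A point of S^3 \subset \mathbb{C}^2 is
% a pair (z_1, z_2) with |z_1|^2 + |z_2|^2 = 1, and
h(z_1, z_2) \;=\; \bigl(\, 2\,\mathrm{Re}(z_1 \bar{z}_2),\;
                         2\,\mathrm{Im}(z_1 \bar{z}_2),\;
                         |z_1|^2 - |z_2|^2 \,\bigr) \;\in\; S^2 .
```

One way to read the paper's statement: the fiber over a gaze direction is the circle of torsional eye orientations sharing that direction, and Listing's law singles out one orientation per direction, i.e., a section of the fibration over the Listing hemisphere.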
Show Figures

Figure 1

10 pages, 20722 KiB  
Review
Visualization of Inferior Alveolar and Lingual Nerve Pathology by 3D Double-Echo Steady-State MRI: Two Case Reports with Literature Review
by Adib Al-Haj Husain, Daphne Schönegg, Silvio Valdec, Bernd Stadlinger, Thomas Gander, Harald Essig, Marco Piccirelli and Sebastian Winklhofer
J. Imaging 2022, 8(3), 75; https://doi.org/10.3390/jimaging8030075 - 17 Mar 2022
Cited by 5 | Viewed by 4571
Abstract
Injury to the peripheral branches of the trigeminal nerve, particularly the lingual nerve (LN) and the inferior alveolar nerve (IAN), is a rare but serious complication that can occur during oral and maxillofacial surgery. Mandibular third molar surgery, one of the most common surgical procedures in dentistry, is most often associated with such a nerve injury. Proper preoperative radiologic assessment is hence key to avoiding neurosensory dysfunction. In addition to the well-established conventional X-ray-based imaging modalities, such as panoramic radiography and cone-beam computed tomography, radiation-free magnetic resonance imaging (MRI) with the recently introduced black-bone MRI sequences offers the possibility to simultaneously visualize osseous structures and neural tissue in the oral cavity with high spatial resolution and excellent soft-tissue contrast. Fortunately, most LN and IAN injuries recover spontaneously within six months. However, permanent damage may cause significant loss of quality of life for affected patients. Therefore, therapy should be initiated early in indicated cases, despite the inconsistency in the literature regarding the therapeutic time window. In this report, we present the visualization of two cases of nerve pathology using 3D double-echo steady-state MRI and evaluate evidence-based decision-making for iatrogenic nerve injury regarding a wait-and-see strategy, conservative drug treatment, or surgical re-intervention. Full article
(This article belongs to the Special Issue New Frontiers of Advanced Imaging in Dentistry)
Show Figures

Figure 1

10 pages, 6957 KiB  
Article
Investigation of Nonlinear Optical Properties of Quantum Dots Deposited onto a Sample Glass Using Time-Resolved Inline Digital Holography
by Andrey V. Belashov, Igor A. Shevkunov, Ekaterina P. Kolesova, Anna O. Orlova, Sergei E. Putilin, Andrei V. Veniaminov, Chau-Jern Cheng and Nikolay V. Petrov
J. Imaging 2022, 8(3), 74; https://doi.org/10.3390/jimaging8030074 - 16 Mar 2022
Cited by 3 | Viewed by 2043
Abstract
We report on the application of time-resolved inline digital holography in the study of the nonlinear optical properties of quantum dots deposited onto sample glass. The Fresnel diffraction patterns of the probe pulse due to noncollinear degenerate phase modulation induced by a femtosecond pump pulse were extracted from the set of inline digital holograms and analyzed. The absolute values of the nonlinear refractive index of both the sample glass substrate and the deposited layer of quantum dots were evaluated using the proposed technique. To characterize the inhomogeneous distribution of the samples’ nonlinear optical properties, we proposed plotting an optical nonlinearity map calculated as a local standard deviation of the diffraction pattern intensities induced by noncollinear degenerate phase modulation. Full article
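The optical-nonlinearity map proposed above is a local standard deviation of the diffraction-pattern intensities. A minimal NumPy sketch follows; the window size and padding mode are assumptions, not values from the paper.

```python
import numpy as np

def local_std_map(img, win=7):
    """Map of the local standard deviation over win x win neighborhoods:
    large local variation of diffraction intensity marks strong induced
    phase modulation. Uses the integral-image trick for the local mean
    and local mean of squares."""
    pad = win // 2
    p = np.pad(img.astype(float), pad, mode="reflect")

    def box_mean(a):
        # Summed-area table; differencing yields every win x win box sum.
        c = np.cumsum(np.cumsum(a, axis=0), axis=1)
        c = np.pad(c, ((1, 0), (1, 0)))
        s = c[win:, win:] - c[:-win, win:] - c[win:, :-win] + c[:-win, :-win]
        return s / win ** 2

    m = box_mean(p)
    m2 = box_mean(p ** 2)
    # Clamp tiny negative round-off before the square root.
    return np.sqrt(np.maximum(m2 - m ** 2, 0))
```

Applied to the reconstructed intensity of each inline hologram, the peaks of this map localize the regions of strongest nonlinear response.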
(This article belongs to the Special Issue Digital Holography: Development and Application)
Show Figures

Figure 1

14 pages, 3877 KiB  
Article
Fabrication of a Human Skin Mockup with a Multilayered Concentration Map of Pigment Components Using a UV Printer
by Kazuki Nagasawa, Shoji Yamamoto, Wataru Arai, Kunio Hakkaku, Chawan Koopipat, Keita Hirai and Norimichi Tsumura
J. Imaging 2022, 8(3), 73; https://doi.org/10.3390/jimaging8030073 - 15 Mar 2022
Cited by 2 | Viewed by 2270
Abstract
In this paper, we propose a pipeline that reproduces human skin mockups using a UV printer by obtaining the spatial concentration map of pigments from an RGB image of human skin. The pigment concentration distributions were obtained by separating the skin pigment components from the skin image using independent component analysis. This method can extract the concentrations of the melanin and hemoglobin components, which are the main pigments that make up skin tone. Based on these concentrations, we developed a procedure to reproduce a skin mockup with a multi-layered structure that is determined by mapping the absorbances of melanin and hemoglobin to CMYK (Cyan, Magenta, Yellow, Black) subtractive color mixing. In our proposed method, the multi-layered structure with different pigments in each layer contributes greatly to the accurate reproduction of skin tones. We use a UV printer because it is capable of layered fabrication using UV-curable inks. As a result, subjective evaluation showed that the artificial skin reproduced by our method has a more skin-like appearance than that produced using conventional printing. Full article
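The separation step rests on a linear mixing model in optical-density (log-RGB) space. The sketch below solves that model by least squares with hypothetical, uncalibrated absorbance vectors; the paper instead estimates the pigment directions with independent component analysis, so both `A` and the function name are illustrative assumptions.

```python
import numpy as np

# Hypothetical unit absorbance directions for melanin and hemoglobin in
# optical-density space -- illustrative numbers, not calibrated values.
A = np.array([[0.74, 0.57, 0.36],    # melanin
              [0.43, 0.77, 0.47]]).T  # hemoglobin -> shape (3, 2)

def pigment_concentrations(rgb):
    """Per-pixel pigment concentration map from an RGB skin image.
    rgb: float array (H, W, 3) with values in (0, 1]."""
    od = -np.log(np.clip(rgb, 1e-6, 1.0))      # Beer-Lambert optical density
    flat = od.reshape(-1, 3).T                 # (3, N) pixel columns
    conc, *_ = np.linalg.lstsq(A, flat, rcond=None)
    return conc.T.reshape(rgb.shape[:2] + (2,))  # [..., 0]=melanin, [..., 1]=hemoglobin
```

The resulting two-channel concentration map is what the pipeline then remaps to CMYK layers for the UV printer.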
(This article belongs to the Special Issue Intelligent Media Processing)
Show Figures

Figure 1

20 pages, 13517 KiB  
Article
Comparison of 2D Optical Imaging and 3D Microtomography Shape Measurements of a Coastal Bioclastic Calcareous Sand
by Ryan D. Beemer, Linzhu Li, Antonio Leonti, Jeremy Shaw, Joana Fonseca, Iren Valova, Magued Iskander and Cynthia H. Pilskaln
J. Imaging 2022, 8(3), 72; https://doi.org/10.3390/jimaging8030072 - 14 Mar 2022
Cited by 6 | Viewed by 2507
Abstract
This article compares measurements of particle shape parameters from three-dimensional (3D) X-ray micro-computed tomography (μCT) and two-dimensional (2D) dynamic image analysis (DIA) from the optical microscopy of a coastal bioclastic calcareous sand from Western Australia. This biogenic sand from a high energy environment consists largely of the shells and tests of marine organisms and their clasts. A significant difference was observed between the two imaging techniques for measurements of aspect ratio, convexity, and sphericity. Measured values of aspect ratio, sphericity, and convexity are larger in 2D than in 3D. Correlation analysis indicates that sphericity is correlated with convexity in both 2D and 3D. These results are attributed to inherent limitations of DIA when applied to platy sand grains and to the shape being, in part, dependent on the biology of the grain rather than a purely random clastic process, like typical siliceous sands. The statistical data have also been fitted to the Johnson bounded distribution for ease of future use. Overall, this research demonstrates the need for high-quality 3D microscopy when conducting a micromechanical analysis of biogenic calcareous sands. Full article
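For readers unfamiliar with the 2D descriptors being compared, they can be computed directly from a particle outline. A sketch under the assumption of a polygonal outline, using circularity 4πA/P² as the common 2D analogue of sphericity:

```python
import numpy as np

def shape_descriptors_2d(poly):
    """2-D (projection-based) shape descriptors for a closed polygon
    given as an (N, 2) array of vertices in order: area (shoelace
    formula), perimeter, circularity 4*pi*A/P^2, and the bounding-box
    aspect ratio (short side / long side)."""
    x, y = poly[:, 0], poly[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    closed = np.vstack([poly, poly[:1]])                 # repeat first vertex
    per = np.sum(np.linalg.norm(np.diff(closed, axis=0), axis=1))
    circularity = 4 * np.pi * area / per ** 2
    w, h = np.ptp(x), np.ptp(y)
    aspect = min(w, h) / max(w, h)
    return area, per, circularity, aspect
```

A platy grain viewed face-on illustrates the article's point: its 2D projection can score high on such descriptors even though the 3D particle is far from spherical.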
(This article belongs to the Special Issue Recent Advances in Image-Based Geotechnics)
Show Figures

Figure 1

9 pages, 6384 KiB  
Article
Multi-Modality Microscopy Image Style Augmentation for Nuclei Segmentation
by Ye Liu, Sophia J. Wagner and Tingying Peng
J. Imaging 2022, 8(3), 71; https://doi.org/10.3390/jimaging8030071 - 11 Mar 2022
Cited by 4 | Viewed by 2684
Abstract
Annotating microscopy images for nuclei segmentation by medical experts is laborious and time-consuming. To leverage the few existing annotations, also across multiple modalities, we propose a novel microscopy-style augmentation technique based on a generative adversarial network (GAN). Unlike other style transfer methods, it can not only deal with different cell assay types and lighting conditions, but also with different imaging modalities, such as bright-field and fluorescence microscopy. Using disentangled representations for content and style, we can preserve the structure of the original image while altering its style during augmentation. We evaluate our data augmentation on the 2018 Data Science Bowl dataset consisting of various cell assays, lighting conditions, and imaging modalities. With our style augmentation, the segmentation accuracy of the two top-ranked Mask R-CNN-based nuclei segmentation algorithms in the competition increases significantly. Thus, our augmentation technique renders the downstream task more robust to the test data heterogeneity and helps counteract class imbalance without resampling of minority classes. Full article
Show Figures

Figure 1

14 pages, 4785 KiB  
Article
A Novel Deep-Learning-Based Framework for the Classification of Cardiac Arrhythmia
by Sonain Jamil and MuhibUr Rahman
J. Imaging 2022, 8(3), 70; https://doi.org/10.3390/jimaging8030070 - 10 Mar 2022
Cited by 10 | Viewed by 3459
Abstract
Cardiovascular diseases (CVDs) are the primary cause of death. Every year, many people die due to heart attacks. The electrocardiogram (ECG) signal plays a vital role in diagnosing CVDs. ECG signals provide us with information about the heartbeat. ECGs can detect cardiac arrhythmia. In this article, a novel deep-learning-based approach is proposed to classify ECG signals as normal and into sixteen arrhythmia classes. The ECG signal is preprocessed and converted into a 2D signal using continuous wavelet transform (CWT). The time–frequency domain representation of the CWT is given to the deep convolutional neural network (D-CNN) with an attention block to extract the spatial features vector (SFV). The attention block is proposed to capture global features. For dimensionality reduction in SFV, a novel clump of features (CoF) framework is proposed. The k-fold cross-validation is applied to obtain the reduced feature vector (RFV), and the RFV is given to the classifier to classify the arrhythmia class. The proposed framework achieves 99.84% accuracy with 100% sensitivity and 99.6% specificity. The proposed algorithm outperforms state-of-the-art techniques in accuracy, F1-score, and sensitivity. Full article
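The CWT step that turns the 1-D ECG into a 2-D time–frequency image can be sketched in plain NumPy with a complex Morlet wavelet; the wavelet choice, center frequency and scale grid are assumptions, as the abstract does not specify the exact CWT settings.

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    """Magnitude of a continuous wavelet transform (scalogram) of a 1-D
    signal, via direct convolution with a complex Morlet wavelet at each
    scale. Returns an (n_scales, n_samples) image, i.e., the 2-D input
    a CNN could consume."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)                    # +/- 4 envelope widths
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        wavelet /= np.sqrt(s)                               # scale normalization
        out[i] = np.abs(np.convolve(signal, np.conj(wavelet), mode="same"))
    return out
```

For a sampling rate fs, scale s responds most strongly near frequency w0·fs/(2πs) Hz, so a geometric grid of scales covers the ECG's band of interest.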
Show Figures

Figure 1

15 pages, 1386 KiB  
Article
Seamless Copy–Move Replication in Digital Images
by Tanzeela Qazi, Mushtaq Ali, Khizar Hayat and Baptiste Magnier
J. Imaging 2022, 8(3), 69; https://doi.org/10.3390/jimaging8030069 - 10 Mar 2022
Cited by 3 | Viewed by 2116
Abstract
The importance and relevance of digital-image forensics has attracted researchers to establish different techniques for creating and detecting forgeries. The core category in passive image forgery is copy–move image forgery, which affects the originality of an image by applying various transformations. In this paper, a frequency-domain image-manipulation method is presented. The method exploits the localized nature of the discrete wavelet transform (DWT) to identify the region of the host image to be manipulated. Both the patch and the host image are subjected to DWT at the same level l to obtain 3l+1 sub-bands, and each sub-band of the patch is pasted into the identified region in the corresponding sub-band of the host image. The resulting manipulated host sub-bands are then subjected to the inverse DWT to obtain the final manipulated host image. The proposed method shows good resistance against detection by two frequency-domain forgery detection methods from the literature. The purpose of this research work is to create a forgery and highlight the need to produce forgery detection methods that are robust against malicious copy–move forgery. Full article
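The sub-band pasting idea can be sketched for the simplest case, level l = 1 (giving 3·1 + 1 = 4 sub-bands) with a Haar wavelet and block-aligned coordinates; the paper's wavelet family and level are not assumed here.

```python
import numpy as np

def haar2(x):
    """One-level 2-D Haar DWT: returns the four sub-bands LL, LH, HL, HH."""
    a = (x[0::2] + x[1::2]) / 2          # row averages
    d = (x[0::2] - x[1::2]) / 2          # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2
    LH = (a[:, 0::2] - a[:, 1::2]) / 2
    HL = (d[:, 0::2] + d[:, 1::2]) / 2
    HH = (d[:, 0::2] - d[:, 1::2]) / 2
    return LL, LH, HL, HH

def ihaar2(LL, LH, HL, HH):
    """Exact inverse of haar2."""
    h, w = LL.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    x = np.empty((2 * h, 2 * w))
    x[0::2] = a + d; x[1::2] = a - d
    return x

def paste_in_wavelet_domain(host, patch, y, x):
    """Paste `patch` into `host` at even-aligned (y, x) by copying every
    sub-band of the patch into the corresponding sub-band region of the
    host, then inverting the transform."""
    bands_h = list(haar2(host))
    bands_p = haar2(patch)
    ys, xs = y // 2, x // 2
    ph, pw = bands_p[0].shape
    for b in range(4):
        bands_h[b][ys:ys + ph, xs:xs + pw] = bands_p[b]
    return ihaar2(*bands_h)
```

Because level-1 Haar coefficients are local to 2×2 blocks, the block-aligned paste is seamless by construction; at higher levels and with longer wavelets, the coefficient overlap is exactly what blends the patch boundary.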
(This article belongs to the Special Issue Edge Detection Evaluation)
Show Figures

Figure 1

14 pages, 7021 KiB  
Article
Photo2Video: Semantic-Aware Deep Learning-Based Video Generation from Still Content
by Paula Viana, Maria Teresa Andrade, Pedro Carvalho, Luis Vilaça, Inês N. Teixeira, Tiago Costa and Pieter Jonker
J. Imaging 2022, 8(3), 68; https://doi.org/10.3390/jimaging8030068 - 10 Mar 2022
Cited by 1 | Viewed by 2224
Abstract
Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far addresses essentially textual content or the identification and selection of specific types of content. A wealth of possibilities are yet to be explored by bringing the use of ML into the multimedia creative process, allowing the knowledge inferred by the former to influence automatically how new multimedia content is created. The work presented in this article provides contributions in three distinct ways towards this goal: firstly, it proposes a methodology to re-train popular neural network models in identifying new thematic concepts in static visual content and attaching meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be automatically called upon to apply such effects in a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow by offering to the user the possibility of manipulating the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The final result strongly contrasts with current standard approaches of creating random movements, by implementing an intelligent content- and context-aware video. Full article
(This article belongs to the Special Issue Intelligent Media Processing)
Show Figures

Figure 1

16 pages, 98627 KiB  
Article
Perceptually Optimal Color Representation of Fully Polarimetric SAR Imagery
by Georgia Koukiou
J. Imaging 2022, 8(3), 67; https://doi.org/10.3390/jimaging8030067 - 7 Mar 2022
Cited by 2 | Viewed by 2718
Abstract
The four bands of fully polarimetric SAR data convey scattering characteristics of the Earth’s background, but perceptually are not very easy for an observer to use. In this work, the four different channels of fully polarimetric SAR images, namely HH, HV, VH, and VV, are combined so that a color image of the Earth’s background is derived that is perceptually excellent for the human eye and at the same time provides accurate information regarding the scattering mechanisms in each pixel. Most of the elementary scattering mechanisms are related to specific color and land cover types. The innovative nature of the proposed approach is due to the two different consecutive coloring procedures. The first one is a fusion procedure that moves all the information contained in the four polarimetric channels into three derived RGB bands. This is achieved by means of Cholesky decomposition and brings to the RGB output the correlation properties of a natural color image. The second procedure moves the color information of the RGB image to the CIELab color space, which is perceptually uniform. The color information is then evenly distributed by means of color equalization in the CIELab color space. After that, the inverse procedure to obtain the final RGB image is performed. These two procedures bring the PolSAR information regarding the scattering mechanisms on the Earth’s surface onto a meaningful color image, the appearance of which is close to Google Earth maps. Simultaneously, they give better color correspondence to various land cover types compared with existing SAR color representation methods. Full article
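The first coloring procedure, imposing the correlation structure of a natural color image via Cholesky decomposition, can be sketched in NumPy. This is a simplified illustration, not the paper's exact pipeline: it assumes the four polarimetric channels have already been reduced to three (e.g., by PCA), whitens them, and then applies the Cholesky factor of a target natural-image covariance.

```python
import numpy as np

def cholesky_recolor(channels, target_cov):
    """Give three fused channels the second-order statistics of a
    natural RGB image. channels: (N, 3) samples; target_cov: (3, 3)
    covariance of a reference natural color image."""
    # Whiten the input: zero mean, identity covariance.
    X = channels - channels.mean(axis=0)
    cov = np.cov(X.T)
    vals, vecs = np.linalg.eigh(cov)
    white = X @ vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    # Impose the target covariance via its Cholesky factor L (LL^T = C).
    L = np.linalg.cholesky(target_cov)
    return white @ L.T
```

The second procedure in the abstract, histogram equalization in CIELab, would then be applied to the recolored result before converting back to RGB.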
(This article belongs to the Special Issue Advances in Color Imaging)
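The first coloring procedure described in the abstract, fusing the four polarimetric channels into three RGB bands that inherit the correlation structure of a natural color image via Cholesky decomposition, can be sketched numerically. The 4-to-3 projection, the target correlation matrix, and the random data below are illustrative assumptions, not the paper's actual values; only the Cholesky-based whitening/recoloring step mirrors the described method.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 64, 64
pol = rng.standard_normal((4, h * w))          # stand-in HH, HV, VH, VV data

# Step 1: reduce four bands to three (a fixed linear projection is used here
# as a stand-in for the paper's fusion rule), then whiten the result.
proj = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 0.5, 0.5, 0.0],
                 [0.0, 0.0, 0.0, 1.0]])
x = proj @ pol
x -= x.mean(axis=1, keepdims=True)
L = np.linalg.cholesky(np.linalg.inv(np.cov(x)))
white = L.T @ x                                # decorrelated bands

# Step 2: recolor with the Cholesky factor of a target "natural image"
# correlation matrix, so the RGB output inherits its correlation structure.
target = np.array([[1.0, 0.8, 0.6],
                   [0.8, 1.0, 0.8],
                   [0.6, 0.8, 1.0]])
rgb = np.linalg.cholesky(target) @ white
print(np.round(np.corrcoef(rgb), 2))           # ≈ target correlation
```

Because the sample covariance is whitened exactly, the correlation of the fused bands matches the target by construction; with real PolSAR data the match is approximate.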

12 pages, 14457 KiB  
Article
An Empirical Evaluation of Convolutional Networks for Malaria Diagnosis
by Andrea Loddo, Corrado Fadda and Cecilia Di Ruberto
J. Imaging 2022, 8(3), 66; https://doi.org/10.3390/jimaging8030066 - 7 Mar 2022
Cited by 16 | Viewed by 2981
Abstract
Malaria is a globally widespread disease caused by parasitic protozoa transmitted to humans by infected female mosquitoes of Anopheles. In humans it is caused only by the parasite Plasmodium, further classified into four different species. Malaria parasites can be identified by analysing digital microscopic blood smears, a process that is tedious, time-consuming and error prone, so automating it has assumed great importance, as it relieves the laborious manual work of review and diagnosis. This work focuses on deep learning-based models. Its main contributions to research in this field can be summarized as follows: (i) comparing off-the-shelf architectures in the task of classifying healthy and parasite-affected cells, (ii) investigating the four-class classification of the Plasmodium falciparum stages of life and (iii) evaluating the robustness of the models with cross-dataset experiments. Eleven well-known convolutional neural networks were exploited on two public datasets. The results show that the networks achieve great accuracy in binary classification, even with few samples per class: ResNet-18 achieved up to 97.68% accuracy in binary classification, while DenseNet-201 reached 99.40% accuracy in multiclass classification. The cross-dataset experiments, however, exhibit the limitations of deep learning approaches in such a scenario, even though combining the two datasets permitted DenseNet-201 to reach 97.45% accuracy; this naturally needs further investigation to improve robustness. In general, DenseNet-201 seems to offer the most stable and robust performance, making it a strong candidate for further development and modification. Moreover, the mobile-oriented architectures showed promising and satisfactory performance in classifying malaria parasites. The results obtained enable extensive improvements, specifically oriented to the application of object detectors for recognizing the type and stage of life, even in mobile environments. Full article
(This article belongs to the Topic Medical Image Analysis)

18 pages, 1445 KiB  
Review
Review of Machine Learning in Lung Ultrasound in COVID-19 Pandemic
by Jing Wang, Xiaofeng Yang, Boran Zhou, James J. Sohn, Jun Zhou, Jesse T. Jacob, Kristin A. Higgins, Jeffrey D. Bradley and Tian Liu
J. Imaging 2022, 8(3), 65; https://doi.org/10.3390/jimaging8030065 - 5 Mar 2022
Cited by 27 | Viewed by 5120
Abstract
Ultrasound imaging of the lung has played an important role in managing patients with COVID-19–associated pneumonia and acute respiratory distress syndrome (ARDS). During the COVID-19 pandemic, lung ultrasound (LUS) or point-of-care ultrasound (POCUS) has been a popular diagnostic tool due to its unique imaging capability and logistical advantages over chest X-ray and CT. Pneumonia/ARDS is associated with the sonographic appearances of pleural line irregularities and B-line artefacts, which are caused by interstitial thickening and inflammation, and increase in number with severity. Artificial intelligence (AI), particularly machine learning, is increasingly used as a critical tool that assists clinicians in LUS image reading and COVID-19 decision making. We conducted a systematic review from academic databases (PubMed and Google Scholar) and preprints on arXiv or TechRxiv of the state-of-the-art machine learning technologies for LUS images in COVID-19 diagnosis. Openly accessible LUS datasets are listed. Various machine learning architectures have been employed to evaluate LUS and showed high performance. This paper will summarize the current development of AI for COVID-19 management and the outlook for emerging trends of combining AI-based LUS with robotics, telehealth, and other techniques. Full article
(This article belongs to the Special Issue Application of Machine Learning Using Ultrasound Images)

23 pages, 417 KiB  
Article
Rethinking Weight Decay for Efficient Neural Network Pruning
by Hugo Tessier, Vincent Gripon, Mathieu Léonardon, Matthieu Arzel, Thomas Hannagan and David Bertrand
J. Imaging 2022, 8(3), 64; https://doi.org/10.3390/jimaging8030064 - 4 Mar 2022
Cited by 12 | Viewed by 2902
Abstract
Introduced in the late 1980s for generalization purposes, pruning has now become a staple for compressing deep neural networks. Despite many innovations in recent decades, pruning approaches still face core issues that hinder their performance or scalability. Drawing inspiration from early work in the field, and especially the use of weight decay to achieve sparsity, we introduce Selective Weight Decay (SWD), which carries out efficient, continuous pruning throughout training. Our approach, theoretically grounded on Lagrangian smoothing, is versatile and can be applied to multiple tasks, networks, and pruning structures. We show that SWD compares favorably to state-of-the-art approaches, in terms of performance-to-parameters ratio, on the CIFAR-10, Cora, and ImageNet ILSVRC2012 datasets. Full article
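The continuous-pruning idea behind Selective Weight Decay can be illustrated with a toy sketch: at each update, an extra decay penalty is applied only to the weights currently selected for pruning (here, the smallest-magnitude fraction), driving them smoothly toward zero during training. The ramp schedule, learning rate, and target sparsity below are assumptions for illustration, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(100)                    # toy "network weights"
target_sparsity = 0.5
lr, base_decay = 0.1, 1e-4

for step in range(200):
    a = 10.0 * (step / 200)                     # ramp up the selective penalty
    k = int(target_sparsity * w.size)
    pruned = np.argsort(np.abs(w))[:k]          # current pruning candidates
    grad = base_decay * w                       # standard weight decay on all
    grad[pruned] += a * w[pruned]               # extra decay on candidates only
    w -= lr * grad

# Candidates end up numerically zero and can be removed without a loss jump,
# while the remaining weights are essentially untouched.
print(np.abs(np.sort(np.abs(w))[:k]).max(), np.abs(w).max())
```

The key property, in contrast with hard one-shot pruning, is that the penalty is continuous: candidates can re-enter the kept set early in training if their magnitude recovers.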

15 pages, 3756 KiB  
Article
An Exploration of Pathologies of Multilevel Principal Components Analysis in Statistical Models of Shape
by Damian J. J. Farnell
J. Imaging 2022, 8(3), 63; https://doi.org/10.3390/jimaging8030063 - 4 Mar 2022
Cited by 1 | Viewed by 1702
Abstract
3D facial surface imaging is a useful tool in dentistry, both for diagnostics and for treatment planning. Between-group PCA (bgPCA) is a method that has been used to analyse shapes in biological morphometrics, although various “pathologies” of bgPCA have recently been proposed. Monte Carlo (MC) simulated datasets were created here in order to explore “pathologies” of multilevel PCA (mPCA), where mPCA with two levels is equivalent to bgPCA. The first set of MC experiments involved 300 uncorrelated normally distributed variables, whereas the second set used correlated multivariate MC data describing 3D facial shape. We confirmed results of numerical experiments from other researchers indicating that bgPCA (and so also mPCA) can give a false impression of strong differences in component scores between groups when there is none in reality. These spurious differences in component scores via mPCA decreased significantly as the sample sizes per group were increased. Eigenvalues via mPCA were also found to be strongly affected by imbalances in sample sizes per group, although this problem was removed by using weighted forms of covariance matrices suggested by the maximum-likelihood solution of the two-level model. However, this did not solve the problem of spurious differences between groups in these simulations, which were driven by very small sample sizes in one group. As a “rule of thumb” only, all of our experiments indicate that reasonable results are obtained when the sample size in every group is at least equal to the number of variables. Interestingly, the sum of all eigenvalues over both levels via mPCA scaled approximately linearly with the inverse of the sample size per group in all experiments. Finally, between-group variation was added explicitly to the MC data-generation model in two of the experiments considered here; in this case, the sum of all eigenvalues via mPCA correctly predicted the asymptotic value of the total variance, whereas standard “single-level” PCA underestimated this quantity. Full article
(This article belongs to the Section Medical Imaging)
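The spurious-separation pathology investigated above can be reproduced in a few lines: draw two groups from the same distribution with far fewer samples per group than variables, project individuals onto the between-group axis (the two-group case of bgPCA), and the component scores appear strongly separated even though no real group difference exists. The sizes below are illustrative, echoing the 300-variable setting.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 300, 10                          # 300 variables, only 10 samples/group
g1 = rng.standard_normal((n, p))        # both groups drawn from the SAME
g2 = rng.standard_normal((n, p))        # distribution: H0 is true

# With two groups, the (single) between-group PCA axis is the difference of
# the group means; project individuals onto it to get component scores.
axis = g1.mean(axis=0) - g2.mean(axis=0)
axis /= np.linalg.norm(axis)
s1, s2 = g1 @ axis, g2 @ axis

# The standardized gap between score distributions is large despite H0,
# because the axis overfits the high-dimensional noise in the group means.
gap = (s1.mean() - s2.mean()) / max(s1.std(), s2.std())
print(gap)
```

Increasing `n` toward `p` shrinks this gap, which is the "rule of thumb" behaviour the abstract reports.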

13 pages, 11653 KiB  
Article
A Real-Time Method for Time-to-Collision Estimation from Aerial Images
by Daniel Tøttrup, Stinus Lykke Skovgaard, Jonas le Fevre Sejersen and Rui Pimentel de Figueiredo
J. Imaging 2022, 8(3), 62; https://doi.org/10.3390/jimaging8030062 - 3 Mar 2022
Cited by 6 | Viewed by 2589 | Correction
Abstract
Large vessels such as container ships rely on experienced pilots with extensive knowledge of the local streams and tides, who are responsible for maneuvering the vessel to its desired location. This work proposes estimating the time-to-collision (TTC) between moving objects (i.e., vessels) using real-time video data captured from aerial drones in dynamic maritime environments. Our deep-learning-based methods utilize features optimized with realistic virtually generated data for reliable and robust object detection, segmentation, and tracking. Furthermore, we use rotated bounding-box representations, obtained from fine semantic segmentation of objects, for enhanced TTC estimation accuracy. We intuitively present collision estimates as collision arrows that gradually change color to red to indicate an imminent collision. Experiments conducted in a realistic dockyard virtual environment show that our approaches precisely, robustly, and efficiently predict the TTC between dynamic objects seen from a top view, with a mean error and standard deviation of 0.358 s and 0.114 s, respectively, in a worst-case scenario. Full article
(This article belongs to the Special Issue Visual Localization)
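Under the simplifying assumption of constant relative velocity between two tracked objects, TTC reduces to closing distance divided by closing speed; the pipeline described above obtains the underlying top-view positions from drone video via detection, segmentation, and tracking. A minimal sketch with made-up positions and velocities:

```python
import math

def ttc(p_a, v_a, p_b, v_b):
    """TTC = closing distance / closing speed; math.inf if diverging."""
    rel_p = (p_b[0] - p_a[0], p_b[1] - p_a[1])
    rel_v = (v_b[0] - v_a[0], v_b[1] - v_a[1])
    dist = math.hypot(*rel_p)
    # Closing speed: negated radial component of the relative velocity.
    closing = -(rel_p[0] * rel_v[0] + rel_p[1] * rel_v[1]) / dist
    return dist / closing if closing > 0 else math.inf

# Vessel B is 100 m east of A and A approaches at 5 m/s → 20 s to collision.
print(ttc((0, 0), (5, 0), (100, 0), (0, 0)))   # → 20.0
```

A color-coded "collision arrow" as used in the paper would then simply map this TTC value onto a green-to-red scale.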

16 pages, 1198 KiB  
Article
Iterative Multiple Bounding-Box Refinements for Visual Tracking
by Giorgio Cruciata, Liliana Lo Presti and Marco La Cascia
J. Imaging 2022, 8(3), 61; https://doi.org/10.3390/jimaging8030061 - 3 Mar 2022
Cited by 1 | Viewed by 2334
Abstract
Single-object visual tracking aims at locating a target in each video frame by predicting the bounding box of the object. Recent approaches have adopted iterative procedures to gradually refine the bounding box and locate the target in the image. In such approaches, the deep model takes as input the image patch corresponding to the currently estimated target bounding box, and provides as output the probability associated with each of the possible bounding-box refinements, generally defined as a discrete set of linear transformations of the bounding box center and size. At each iteration, only one transformation is applied, and supervised training of the model may introduce an inherent ambiguity by giving some transformations priority over others. This paper proposes a novel formulation of the problem of selecting the bounding-box refinement. It introduces the concept of non-conflicting transformations, which allows multiple refinements to be applied to the target bounding box at each iteration without introducing ambiguities while learning the model parameters. Empirical results demonstrate that the proposed approach improves on iterative single refinement in terms of the accuracy and precision of the tracking results. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

12 pages, 5397 KiB  
Article
Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating
by Kalthoum Adam, Somaya Al-Maadeed and Younes Akbari
J. Imaging 2022, 8(3), 60; https://doi.org/10.3390/jimaging8030060 - 1 Mar 2022
Cited by 4 | Viewed by 2319
Abstract
Automatic dating tools for historical documents can greatly assist paleographers and save them time and effort. This paper describes a novel method for estimating the date of historical Arabic documents that employs hierarchical fusion of multiple features. A set of traditional features and features extracted by a residual network (ResNet) are fused hierarchically using joint sparse representation. To address noise during the fusion process, a new approach based on subsets of multiple features is proposed. Supervised and unsupervised classifiers are then used for classification. We show that hierarchical fusion based on subsets of multiple features produces promising results on the KERTAS dataset and significantly improves dating accuracy. Full article

15 pages, 3511 KiB  
Article
Glossiness Index of Objects in Halftone Color Images Based on Structure and Appearance Distortion
by Donghui Li, Midori Tanaka and Takahiko Horiuchi
J. Imaging 2022, 8(3), 59; https://doi.org/10.3390/jimaging8030059 - 27 Feb 2022
Viewed by 2122
Abstract
This paper proposes an objective glossiness index for objects in halftone color images. The proposed index considers the characteristics of the human visual system (HVS) and combines the image’s structural distortion with its statistical information. Depending on how many strategies the HVS is assumed to adopt when judging the difference between images, such modeling can be divided into single-strategy and multi-strategy approaches. In this study, we advocate multiple strategies for judging glossy versus non-glossy quality, assuming that the HVS uses different visual mechanisms to evaluate glossy and non-glossy objects. For non-glossy images, the image structure dominates, so the HVS tries to use structural information to judge distortion (a strategy based on structural-distortion detection). For glossy images, the glossy appearance dominates, so the HVS tries to search for the glossiness difference (an appearance-based strategy). Herein, we present an index for glossiness assessment that explicitly models structural dissimilarity and appearance distortion. We used the contrast sensitivity function to account for the mechanism by which the human eye views halftone images. For the first strategy, we estimated structural distortion using local luminance and contrast masking; for the second, appearance distortion was estimated from local changes in the skewness and standard deviation of the spatial-frequency components. Experimental results showed that these two mixed distortion-measurement strategies were consistent with the subjective ratings of glossiness in halftone color images. Full article
(This article belongs to the Special Issue Intelligent Media Processing)
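The role of skewness statistics in the appearance-based strategy can be illustrated with a toy example: a surface with sparse specular highlights has a positively skewed luminance histogram, whereas a matte surface's histogram is roughly symmetric. The synthetic "images" below are assumptions for demonstration only; the paper's index additionally involves the contrast sensitivity function and halftone-specific processing.

```python
import numpy as np

rng = np.random.default_rng(4)

def skewness(x):
    """Third standardized moment of a sample."""
    x = x - x.mean()
    return float((x ** 3).mean() / (x.var() ** 1.5))

# A matte "image": symmetric luminance distribution around mid-gray.
matte = rng.normal(0.5, 0.1, 10_000).clip(0, 1)

# A glossy "image": same base, plus sparse saturated specular highlights.
glossy = matte.copy()
glossy[rng.random(10_000) < 0.02] = 1.0

print(skewness(matte), skewness(glossy))   # glossy skew is much larger
```

An appearance-distortion term can then compare such statistics between a reference and a halftoned reproduction, penalizing changes that alter the perceived gloss.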

21 pages, 6615 KiB  
Review
Scanning Hyperspectral Imaging for In Situ Biogeochemical Analysis of Lake Sediment Cores: Review of Recent Developments
by Paul D. Zander, Giulia Wienhues and Martin Grosjean
J. Imaging 2022, 8(3), 58; https://doi.org/10.3390/jimaging8030058 - 25 Feb 2022
Cited by 10 | Viewed by 3967
Abstract
Hyperspectral imaging (HSI) in situ core scanning has emerged as a valuable and novel tool for rapid and non-destructive biogeochemical analysis of lake sediment cores. Variations in sediment composition can be assessed directly from fresh sediment surfaces at ultra-high-resolution (40–300 μm measurement resolution) based on spectral profiles of light reflected from sediments in visible, near infrared, and short-wave infrared wavelengths (400–2500 nm). Here, we review recent methodological developments in this new and growing field of research, as well as applications of this technique for paleoclimate and paleoenvironmental studies. Hyperspectral imaging of sediment cores has been demonstrated to effectively track variations in sedimentary pigments, organic matter, grain size, minerogenic components, and other sedimentary features. These biogeochemical variables record information about past climatic conditions, paleoproductivity, past hypolimnetic anoxia, aeolian input, volcanic eruptions, earthquake and flood frequencies, and other variables of environmental relevance. HSI has been applied to study seasonal and inter-annual environmental variability as recorded in individual varves (annually laminated sediments) or to study sedimentary records covering long glacial–interglacial time-scales (>10,000 years). Full article
(This article belongs to the Special Issue Hyperspectral Imaging and Its Applications)

14 pages, 2997 KiB  
Article
PRNU-Based Video Source Attribution: Which Frames Are You Using?
by Pasquale Ferrara, Massimo Iuliani and Alessandro Piva
J. Imaging 2022, 8(3), 57; https://doi.org/10.3390/jimaging8030057 - 25 Feb 2022
Cited by 8 | Viewed by 2628
Abstract
Photo Response Non-Uniformity (PRNU) is considered the most successful trace for identifying the source of a digital video. However, its effectiveness is limited mainly by compression and by the electronic image stabilization recently introduced on several devices. In the last decade, several approaches were proposed to overcome both these issues, mainly by selecting the video frames considered most informative. However, the two problems were always treated separately, and the combined effect of compression and digital stabilization was never considered. This separate analysis makes it hard to understand whether the conclusions reached still hold for digitally stabilized videos and whether those choices represent a generally optimal strategy for video source attribution. In this paper, we explore whether an optimal strategy exists for selecting frames based on their type and their position within the groups of pictures. We therefore systematically analyze the PRNU contribution provided by all frames belonging to both digitally stabilized and non-stabilized videos. Results on the VISION dataset provide insights into optimizing video source attribution in different use cases. Full article
(This article belongs to the Section Biometrics, Forensics, and Security)
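A much-simplified sketch of PRNU-based attribution: the camera fingerprint is estimated by averaging noise residuals over many frames, and a query frame is attributed via normalized correlation with that fingerprint. The flat scene and ideal denoiser (subtracting the known scene) are toy assumptions; real pipelines use wavelet denoising and must cope with exactly the compression and stabilization effects that motivate the frame-selection analysis above.

```python
import numpy as np

rng = np.random.default_rng(3)
k_true = 0.02 * rng.standard_normal((32, 32))    # the sensor's PRNU pattern
scene = 100.0                                    # flat scene intensity

def shoot(k):
    """Toy imaging model: multiplicative PRNU plus additive shot noise."""
    return scene * (1 + k) + rng.standard_normal((32, 32))

# Fingerprint estimation: average the noise residuals of many frames.
residuals = [shoot(k_true) - scene for _ in range(50)]
k_est = np.mean(residuals, axis=0) / scene

def ncc(a, b):
    """Normalized cross-correlation between two residual patterns."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

same = ncc(shoot(k_true) - scene, k_est)                           # match
other = ncc(shoot(0.02 * rng.standard_normal((32, 32))) - scene, k_est)
print(same, other)   # same-camera correlation is far above the other camera's
```

Frame selection matters because, in real videos, each frame type (I, P, B) contributes a residual of different quality to `k_est`, which is the trade-off the paper quantifies.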

17 pages, 1222 KiB  
Article
Principal Component Analysis versus Subject’s Residual Profile Analysis for Neuroinflammation Investigation in Parkinson Patients: A PET Brain Imaging Study
by Rostom Mabrouk
J. Imaging 2022, 8(3), 56; https://doi.org/10.3390/jimaging8030056 - 25 Feb 2022
Cited by 3 | Viewed by 2426
Abstract
Dysfunction of neurons in the central nervous system is the primary pathological feature of Parkinson’s disease (PD). Despite different triggers, emerging evidence indicates that neuroinflammation, revealed through microglia activation, is critical in PD. Moreover, recent investigations have sought a potential relationship between the Lrrk2 genetic mutation and microglia activation. In this paper, neuroinflammation in sporadic PD, Lrrk2-PD and unaffected Lrrk2 mutation carriers was investigated. Principal component analysis (PCA) and subject’s residual profile (SRP) analysis were performed on multiple groups and regions of interest across 22 brain regions. The 11C-PBR28 binding profiles were compared across the four genotype-dependent groups, i.e., HC, sPD, Lrrk2-PD and UC, using the PCA and SRP scores. The genotype effect was found to be a principal feature of group-dependent 11C-PBR28 binding, and preliminary evidence of a MAB-Lrrk2 mutation interaction in manifest Parkinson’s and subjects at risk was found. Full article

31 pages, 1204 KiB  
Article
Kidney Tumor Semantic Segmentation Using Deep Learning: A Survey of State-of-the-Art
by Abubaker Abdelrahman and Serestina Viriri
J. Imaging 2022, 8(3), 55; https://doi.org/10.3390/jimaging8030055 - 25 Feb 2022
Cited by 18 | Viewed by 8258
Abstract
Cure rates for kidney cancer vary according to stage and grade; hence, accurate diagnostic procedures for early detection and diagnosis are crucial. The difficulties of manual segmentation have necessitated the use of deep learning models to assist clinicians in effectively recognizing and segmenting tumors. Deep learning (DL), particularly convolutional neural networks, has achieved outstanding success in classifying and segmenting images, and researchers in medical image segmentation employ DL approaches to solve problems such as tumor, cell, and organ segmentation. Semantic segmentation of tumors is critical in radiation and therapeutic practice. This article discusses current advances in DL-based kidney tumor segmentation systems. We discuss the various types of medical images and segmentation techniques and the assessment criteria for segmentation outcomes in kidney tumor segmentation, highlighting their building blocks and various strategies. Full article
(This article belongs to the Special Issue Current Methods in Medical Image Segmentation)

13 pages, 7486 KiB  
Article
Monochrome Camera Conversion: Effect on Sensitivity for Multispectral Imaging (Ultraviolet, Visible, and Infrared)
by Jonathan Crowther
J. Imaging 2022, 8(3), 54; https://doi.org/10.3390/jimaging8030054 - 25 Feb 2022
Cited by 3 | Viewed by 3751
Abstract
Conversion of standard cameras to enable them to capture images in the ultraviolet (UV) and infrared (IR) spectral regions has applications ranging from the purely artistic to science and research. Taking the modification of the camera a step further and removing the color filter array (CFA) results in a monochrome camera. The spectral sensitivities of a range of cameras with different sensors that were converted to monochrome were measured and compared with standard multispectral camera conversions, with an emphasis on their behavior from the UV through to the IR regions. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

18 pages, 1324 KiB  
Review
A Survey of 6D Object Detection Based on 3D Models for Industrial Applications
by Felix Gorschlüter, Pavel Rojtberg and Thomas Pöllabauer
J. Imaging 2022, 8(3), 53; https://doi.org/10.3390/jimaging8030053 - 24 Feb 2022
Cited by 7 | Viewed by 4284
Abstract
Six-dimensional object detection of rigid objects is a problem especially relevant for quality control and robotic manipulation in industrial contexts. This work is a survey of the state of the art of 6D object detection with these use cases in mind, specifically focusing on algorithms trained only with 3D models or renderings thereof. Our first contribution is a listing of requirements typically encountered in industrial applications. The second contribution is a collection of quantitative evaluation results for several different 6D object detection methods trained with synthetic data and the comparison and analysis thereof. We identify the top methods for individual requirements that industrial applications have for object detectors, but find that a lack of comparable data prevents large-scale comparison over multiple aspects. Full article
(This article belongs to the Special Issue Advanced Scene Perception for Augmented Reality)

20 pages, 35766 KiB  
Article
Qualitative Comparison of Image Stitching Algorithms for Multi-Camera Systems in Laparoscopy
by Sylvain Guy, Jean-Loup Haberbusch, Emmanuel Promayon, Stéphane Mancini and Sandrine Voros
J. Imaging 2022, 8(3), 52; https://doi.org/10.3390/jimaging8030052 - 23 Feb 2022
Cited by 4 | Viewed by 3714
Abstract
Multi-camera systems were recently introduced into laparoscopy to increase the narrow field of view of the surgeon. The video streams are stitched together to create a panorama that is easier for the surgeon to comprehend. Multi-camera prototypes for laparoscopy use quite basic algorithms and have only been evaluated on simple laparoscopic scenarios. The more recent state-of-the-art algorithms, mainly designed for the smartphone industry, have not yet been evaluated in laparoscopic conditions. We developed a simulated environment to generate a dataset of multi-view images displaying a wide range of laparoscopic situations, which is adaptable to any multi-camera system. We evaluated classical and state-of-the-art image stitching techniques used in non-medical applications on this dataset, including one unsupervised deep learning approach. We show that classical techniques that use global homography fail to provide a clinically satisfactory rendering and that even the most recent techniques, despite providing high quality panorama images in non-medical situations, may suffer from poor alignment or severe distortions in simulated laparoscopic scenarios. We highlight the main advantages and flaws of each algorithm within a laparoscopic context, identify the main remaining challenges that are specific to laparoscopy, and propose methods to improve these approaches. We provide public access to the simulated environment and dataset. Full article
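The "global homography" technique that the evaluation finds clinically unsatisfactory maps every pixel of one view into the reference view with a single 3×3 matrix, which is only valid when the scene is (approximately) planar; non-planar laparoscopic anatomy is exactly where this assumption breaks. A minimal sketch with a synthetic homography (the matrix values are made up for illustration):

```python
import numpy as np

# A synthetic 3x3 homography mapping points of one camera view into the
# reference view's pixel coordinates (values chosen for illustration).
H = np.array([[1.0,  0.02, 30.0],
              [0.01, 1.0,  -5.0],
              [1e-4, 0.0,   1.0]])

def warp_point(H, x, y):
    """Apply a homography to a 2D point via homogeneous coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]   # perspective divide

print(warp_point(H, 100.0, 50.0))
```

Stitching then consists of warping the whole secondary image through `H` and blending the overlap; methods that instead use local (mesh- or flow-based) warps are the state-of-the-art alternatives the paper evaluates.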
