J. Imaging, Volume 7, Issue 8 (August 2021) – 40 articles

Cover Story: Identification and authentication of printed documents is a critical issue for security purposes. The manufacturing process of common paper sheets involves wood particles whose random arrangement makes each sheet almost unique, making it possible, under certain conditions, to extract a fingerprint from the resulting pattern. In this paper, a method to generate a robust fingerprint for document identification based on binary patterns was proposed; it also includes a PCA-based and block-division fingerprint reduction strategy, a low-cost framework for digital pattern acquisition and a public dataset of images acquired with both low-cost and high-end devices. Comparison with the state of the art confirmed the effectiveness of the method.
23 pages, 2346 KiB  
Article
A Benchmark Evaluation of Adaptive Image Compression for Multi Picture Object Stereoscopic Images
by Alessandro Ortis, Marco Grisanti, Francesco Rundo and Sebastiano Battiato
J. Imaging 2021, 7(8), 160; https://doi.org/10.3390/jimaging7080160 - 23 Aug 2021
Cited by 1 | Viewed by 2239
Abstract
A stereopair consists of two pictures related to the same subject taken from two different points of view. Since the two images contain a high amount of redundant information, new compression approaches and data formats are continuously proposed, which aim to reduce the space needed to store a stereoscopic image while preserving its quality. A standard for multi-picture image encoding is represented by the MPO format (Multi-Picture Object). The classic stereoscopic image compression approaches compute a disparity map between the two views, which is stored with one of the two views together with a residual image. An alternative approach, named adaptive stereoscopic image compression, encodes just the two views independently with different quality factors. Then, the redundancy between the two views is exploited to enhance the low-quality image. In this paper, the problem of stereoscopic image compression is presented, with a focus on the adaptive stereoscopic compression approach, which allows us to obtain a standardized format of the compressed data. The paper presents a benchmark evaluation on large and standardized datasets including 60 stereopairs that differ by resolution and acquisition technique. The method is evaluated by varying the amount of compression, as well as the matching and optimization methods resulting in 16 different settings. The adaptive approach is also compared with other MPO-compliant methods. The paper also presents a Human Visual System (HVS)-based assessment experiment which involved 116 people in order to verify the perceived quality of the decoded images. Full article
(This article belongs to the Special Issue New and Specialized Methods of Image Compression)

17 pages, 9999 KiB  
Article
Usability of Graphical Visualizations on a Tool-Mounted Interface for Spine Surgery
by Laura Schütz, Caroline Brendle, Javier Esteban, Sandro M. Krieg, Ulrich Eck and Nassir Navab
J. Imaging 2021, 7(8), 159; https://doi.org/10.3390/jimaging7080159 - 21 Aug 2021
Cited by 7 | Viewed by 3029
Abstract
Screw placement in the correct angular trajectory is one of the most intricate tasks during spinal fusion surgery. Due to the crucial role of pedicle screw placement for the outcome of the operation, spinal navigation has been introduced into the clinical routine. Despite its positive effects on the precision and safety of the surgical procedure, local separation of the navigation information and the surgical site, combined with intricate visualizations, limits the benefits of the navigation systems. Instead of a tech-driven design, a focus on usability is required in new research approaches to enable advanced and effective visualizations. This work presents a new tool-mounted interface (TMI) for pedicle screw placement. By fixing a TMI onto the surgical instrument, physical de-coupling of the anatomical target and navigation information is resolved. A total of 18 surgeons participated in a usability study comparing the TMI to the state-of-the-art visualization on an external screen. With the usage of the TMI, significant improvements in system usability (Kruskal–Wallis test p < 0.05) were achieved. A significant reduction in mental demand and overall cognitive load, measured using the NASA-TLX (p < 0.05), was observed. Moreover, a general improvement in performance was shown by means of the surgical task time (one-way ANOVA p < 0.001). Full article

19 pages, 7888 KiB  
Article
A Dataset of Annotated Omnidirectional Videos for Distancing Applications
by Giuseppe Mazzola, Liliana Lo Presti, Edoardo Ardizzone and Marco La Cascia
J. Imaging 2021, 7(8), 158; https://doi.org/10.3390/jimaging7080158 - 21 Aug 2021
Cited by 7 | Viewed by 3463
Abstract
Omnidirectional (or 360°) cameras are acquisition devices that, in the next few years, could have a big impact on video surveillance applications, research, and industry, as they can record a spherical view of a whole environment from every perspective. This paper presents two new contributions to the research community: the CVIP360 dataset, an annotated dataset of 360° videos for distancing applications, and a new method to estimate the distances of objects in a scene from a single 360° image. The CVIP360 dataset includes 16 videos acquired outdoors and indoors, annotated by adding information about the pedestrians in the scene (bounding boxes) and the distances to the camera of some points in the 3D world by using markers at fixed and known intervals. The proposed distance estimation algorithm is based on geometric facts regarding the acquisition process of the omnidirectional device, and is uncalibrated in practice: the only required parameter is the camera height. The proposed algorithm was tested on the CVIP360 dataset, and empirical results demonstrate that the estimation error is negligible for distancing applications. Full article
(This article belongs to the Special Issue 2020 Selected Papers from Journal of Imaging Editorial Board Members)
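The idea of estimating distance from a single 360° image using only the camera height can be illustrated with a minimal sketch. It assumes an equirectangular frame, a level camera, a flat ground plane, and a pixel row where the object touches the ground. The function name, its parameters, and the formula are illustrative assumptions and are not taken from the paper.

```python
import math

def ground_distance(row, image_height, camera_height_m):
    """Estimate horizontal distance to a ground point seen at pixel `row`.

    Hypothetical setup: an equirectangular image whose rows span +90 degrees
    (top) to -90 degrees (bottom), with the horizon on the middle row.
    """
    # Angle below the horizon, in radians, for this pixel row.
    depression = (row / image_height - 0.5) * math.pi
    if depression <= 0:
        raise ValueError("row must lie below the horizon to hit the ground")
    # Right triangle: camera height is the opposite side, distance the adjacent.
    return camera_height_m / math.tan(depression)
```

Under these assumptions, a ground point seen 45° below the horizon lies at a distance equal to the camera height, which gives a quick sanity check for the mapping.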

12 pages, 569 KiB  
Article
Multimodal Emotion Recognition from Art Using Sequential Co-Attention
by Tsegaye Misikir Tashu, Sakina Hajiyeva and Tomas Horvath
J. Imaging 2021, 7(8), 157; https://doi.org/10.3390/jimaging7080157 - 21 Aug 2021
Cited by 21 | Viewed by 3649
Abstract
In this study, we present a multimodal emotion recognition architecture that uses both feature-level attention (sequential co-attention) and modality attention (weighted modality fusion) to classify emotion in art. The proposed architecture helps the model to focus on learning informative and refined representations for both feature extraction and modality fusion. The resulting system can be used to categorize artworks according to the emotions they evoke; recommend paintings that accentuate or balance a particular mood; search for paintings of a particular style or genre that represents custom content in a custom state of impact. Experimental results on the WikiArt emotion dataset showed the efficiency of the approach proposed and the usefulness of three modalities in emotion recognition. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)

20 pages, 10112 KiB  
Article
Documenting Paintings with Gigapixel Photography
by Pedro M. Cabezos-Bernal, Pablo Rodriguez-Navarro and Teresa Gil-Piqueras
J. Imaging 2021, 7(8), 156; https://doi.org/10.3390/jimaging7080156 - 21 Aug 2021
Cited by 12 | Viewed by 3330
Abstract
Digital photographic capture of pictorial artworks with gigapixel resolution (around 1000 megapixels or greater) is a novel technique that is beginning to be used by some important international museums as a means of documentation, analysis, and dissemination of their masterpieces. This line of research is extremely interesting, not only for art curators and scholars but also for the general public. The results can be disseminated through online virtual museum displays, offering a detailed interactive visualization. These virtual visualizations allow the viewer to delve into the artwork in such a way that it is possible to zoom in and observe those details, which would be negligible to the naked eye in a real visit. Therefore, this kind of virtual visualization using gigapixel images has become an essential tool to enhance cultural heritage and to make it accessible to everyone. Since today’s professional digital cameras provide images of around 40 megapixels, obtaining gigapixel images requires some special capture and editing techniques. This article describes a series of photographic methodologies and equipment, developed by the team of researchers, that have been put into practice to achieve a very high level of detail and chromatic fidelity, in the documentation and dissemination of pictorial artworks. The result of this research work consisted in the gigapixel documentation of several masterpieces of the Museo de Bellas Artes of Valencia, one of the main art galleries in Spain. The results will be disseminated through the Internet, as will be shown with some examples. Full article

15 pages, 1618 KiB  
Article
Dataset Growth in Medical Image Analysis Research
by Nahum Kiryati and Yuval Landau
J. Imaging 2021, 7(8), 155; https://doi.org/10.3390/jimaging7080155 - 20 Aug 2021
Cited by 16 | Viewed by 3128
Abstract
Medical image analysis research requires medical image datasets. Nevertheless, due to various impediments, researchers have been described as “data starved”. We hypothesize that implicit evolving community standards require researchers to use ever-growing datasets. In Phase I of this research, we scanned the MICCAI (Medical Image Computing and Computer-Assisted Intervention) conference proceedings from 2011 to 2018. We identified 907 papers involving human MRI, CT or fMRI datasets and extracted their sizes. The median dataset size grew by 3–10 times from 2011 to 2018, depending on imaging modality. Statistical analysis revealed exponential growth of the geometric mean dataset size with an annual growth of 21% for MRI, 24% for CT and 31% for fMRI. On this basis, we issued a forecast for dataset sizes in MICCAI 2019 well before the conference. In Phase II of this research, we examined the MICCAI 2019 proceedings and analyzed 308 relevant papers. The MICCAI 2019 statistics compare well with the forecast. The revised annual growth rates of the geometric mean dataset size are 27% for MRI, 30% for CT and 32% for fMRI. We predict the respective dataset sizes in the MICCAI 2020 conference (that we have not yet analyzed) and the future MICCAI 2021 conference. Full article
(This article belongs to the Special Issue Intelligent Strategies for Medical Image Analysis)
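To make the quoted annual growth rates concrete, the following minimal sketch compounds them over the seven years from 2011 to 2018. The function name and the dictionary are illustrative; only the three rates come from the abstract, and a constant annual rate is assumed.

```python
def growth_factor(annual_rate, years):
    """Multiplicative growth after `years` at a constant annual rate."""
    return (1.0 + annual_rate) ** years

# Phase I annual growth rates of the geometric mean dataset size (from the abstract).
phase1_rates = {"MRI": 0.21, "CT": 0.24, "fMRI": 0.31}

for modality, rate in phase1_rates.items():
    # 2011 to 2018 spans seven yearly steps.
    print(f"{modality}: x{growth_factor(rate, 7):.1f} over 7 years")
```

The resulting factors are roughly 3.8 for MRI, 4.5 for CT and 6.6 for fMRI. The abstract's 3–10 times figure refers to the median rather than the geometric mean, so the two are related but not directly interchangeable.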

15 pages, 1593 KiB  
Article
Design of an Ultrasound-Navigated Prostate Cancer Biopsy System for Nationwide Implementation in Senegal
by Gabor Fichtinger, Parvin Mousavi, Tamas Ungi, Aaron Fenster, Purang Abolmaesumi, Gernot Kronreif, Juan Ruiz-Alzola, Alain Ndoye, Babacar Diao and Ron Kikinis
J. Imaging 2021, 7(8), 154; https://doi.org/10.3390/jimaging7080154 - 20 Aug 2021
Cited by 1 | Viewed by 3124
Abstract
This paper presents the design of NaviPBx, an ultrasound-navigated prostate cancer biopsy system. NaviPBx is designed to support an affordable and sustainable national healthcare program in Senegal. It uses spatiotemporal navigation and multiparametric transrectal ultrasound to guide biopsies. NaviPBx integrates concepts and methods that have been independently validated previously in clinical feasibility studies and deploys them together in a practical prostate cancer biopsy system. NaviPBx is based entirely on free open-source software and will be shared as a free open-source program with no restriction on its use. NaviPBx is set to be deployed and sustained nationwide through the Senegalese Military Health Service. This paper reports on the results of the design process of NaviPBx. Our approach concentrates on “frugal technology”, intended to be affordable for low- and middle-income countries (LMICs). Our project promises the wide-scale application of prostate biopsy and will foster time-efficient development and programmatic implementation of ultrasound-guided diagnostic and therapeutic interventions in Senegal and beyond. Full article

26 pages, 81868 KiB  
Article
Spline-Based Dense Medial Descriptors for Lossy Image Compression
by Jieying Wang, Jiří Kosinka and Alexandru Telea
J. Imaging 2021, 7(8), 153; https://doi.org/10.3390/jimaging7080153 - 19 Aug 2021
Cited by 5 | Viewed by 2357
Abstract
Medial descriptors are of significant interest for image simplification, representation, manipulation, and compression. On the other hand, B-splines are well-known tools for specifying smooth curves in computer graphics and geometric design. In this paper, we integrate the two by modeling medial descriptors with stable and accurate B-splines for image compression. Representing medial descriptors with B-splines can not only greatly improve compression but is also an effective vector representation of raster images. A comprehensive evaluation shows that our Spline-based Dense Medial Descriptors (SDMD) method achieves much higher compression ratios at similar or even better quality to the well-known JPEG technique. We illustrate our approach with applications in generating super-resolution images and salient feature preserving image compression. Full article
(This article belongs to the Special Issue New and Specialized Methods of Image Compression)

24 pages, 386 KiB  
Article
A Novel Methodology for Measuring the Abstraction Capabilities of Image Recognition Algorithms
by Márton Gyula Hudáky, Péter Lehotay-Kéry and Attila Kiss
J. Imaging 2021, 7(8), 152; https://doi.org/10.3390/jimaging7080152 - 19 Aug 2021
Viewed by 2033
Abstract
Creating a widely accepted model for measuring intelligence has become inevitable given the abundance of different intelligent systems. Measuring intelligence would provide feedback for the developers and ultimately lead us to create better artificial systems. In the present paper, we show a solution where learning as a process is examined, aiming to detect pre-written solutions and separate them from the knowledge acquired by the system. In our approach, we examine image recognition software by executing different transformations on objects and detecting whether the software is resilient to them. A system with the required intelligence is supposed to become resilient to a transformation after experiencing it several times. The method is successfully tested on a simple neural network, which is not able to learn most of the transformations examined. The method can be applied to any image recognition software to test its abstraction capabilities. Full article
(This article belongs to the Section AI in Imaging)

16 pages, 4702 KiB  
Article
A Virtual Reality System for Improved Image-Based Planning of Complex Cardiac Procedures
by Shujie Deng, Gavin Wheeler, Nicolas Toussaint, Lindsay Munroe, Suryava Bhattacharya, Gina Sajith, Ei Lin, Eeshar Singh, Ka Yee Kelly Chu, Saleha Kabir, Kuberan Pushparajah, John M. Simpson, Julia A. Schnabel and Alberto Gomez
J. Imaging 2021, 7(8), 151; https://doi.org/10.3390/jimaging7080151 - 19 Aug 2021
Cited by 11 | Viewed by 3737
Abstract
The intricate nature of congenital heart disease requires understanding of the complex, patient-specific three-dimensional dynamic anatomy of the heart, from imaging data such as three-dimensional echocardiography, for successful outcomes from surgical and interventional procedures. Conventional clinical systems use flat screens, and therefore, display remains two-dimensional, which undermines the full understanding of the three-dimensional dynamic data. Additionally, the control of three-dimensional visualisation with two-dimensional tools is often difficult, so it is used only by imaging specialists. In this paper, we describe a virtual reality system for immersive surgery planning using dynamic three-dimensional echocardiography, which enables fast prototyping for visualisation such as volume rendering, multiplanar reformatting, flow visualisation and advanced interaction such as three-dimensional cropping, windowing, measurement, haptic feedback, automatic image orientation and multiuser interactions. The available features were evaluated by imaging and nonimaging clinicians, showing that the virtual reality system can help improve the understanding and communication of three-dimensional echocardiography imaging and potentially benefit congenital heart disease treatment. Full article

19 pages, 85081 KiB  
Article
Hue-Preserving Saturation Improvement in RGB Color Cube
by Kohei Inoue, Minyao Jiang and Kenji Hara
J. Imaging 2021, 7(8), 150; https://doi.org/10.3390/jimaging7080150 - 18 Aug 2021
Cited by 13 | Viewed by 3817
Abstract
This paper proposes a method for improving saturation in the context of hue-preserving color image enhancement. The proposed method handles colors in an RGB color space, which has the form of a cube, and enhances the contrast of a given image by histogram manipulation, such as histogram equalization and histogram specification, of the intensity image. Then, the color corresponding to a target intensity is determined in a hue-preserving manner, where a gamut problem should be taken into account. We first project any color onto a surface in the RGB color space, which bisects the RGB color cube, to increase the saturation without a gamut problem. Then, we adjust the intensity of the saturation-enhanced color to the target intensity given by the histogram manipulation. The experimental results demonstrate that the proposed method achieves higher saturation than that given by related methods for hue-preserving color image enhancement. Full article
(This article belongs to the Special Issue Advances in Color Imaging)

12 pages, 1246 KiB  
Article
Classification of Geometric Forms in Mosaics Using Deep Neural Network
by Mridul Ghosh, Sk Md Obaidullah, Francesco Gherardini and Maria Zdimalova
J. Imaging 2021, 7(8), 149; https://doi.org/10.3390/jimaging7080149 - 18 Aug 2021
Cited by 16 | Viewed by 2762
Abstract
The paper addresses an image processing problem in the field of fine arts. In particular, a deep learning-based technique to classify geometric forms of artworks, such as paintings and mosaics, is presented. We proposed and tested a convolutional neural network (CNN)-based framework that autonomously quantifies the feature map and classifies it. Convolution, pooling and dense layers are three distinct categories of levels that generate attributes from the dataset images by introducing certain specified filters. As a case study, a Roman mosaic is considered, which is digitally reconstructed by close-range photogrammetry based on standard photos. During the digital transformation from a 2D perspective view of the mosaic into an orthophoto, each photo is rectified (i.e., it is an orthogonal projection of the real photo on the plane of the mosaic). Image samples of the geometric forms, e.g., triangles, squares, circles, octagons and leaves, even if they are partially deformed, were extracted from both the original and the rectified photos and originated the dataset for testing the CNN-based approach. The proposed method has proved to be robust enough to analyze the mosaic geometric forms, with an accuracy higher than 97%. Furthermore, the performance of the proposed method was compared with standard deep learning frameworks. Due to the promising results, this method can be applied to many other pattern identification problems related to artworks. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)

21 pages, 1594 KiB  
Article
An Iterative Algorithm for Semisupervised Classification of Hotspots on Bone Scintigraphies of Patients with Prostate Cancer
by Laura Providência, Inês Domingues and João Santos
J. Imaging 2021, 7(8), 148; https://doi.org/10.3390/jimaging7080148 - 17 Aug 2021
Cited by 4 | Viewed by 4126
Abstract
Prostate cancer (PCa) is the second most diagnosed cancer in men. Patients with PCa often develop metastases, with more than 80% of these metastases occurring in bone. The most common imaging technique used for screening, diagnosis and follow-up of disease evolution is bone scintigraphy, due to its high sensitivity and widespread availability at nuclear medicine facilities. To date, the assessment of bone scans relies solely on the interpretation of an expert physician who visually assesses the scan. Besides being time-consuming, this assessment is also subjective, as there are no absolute criteria either to identify bone metastases or to quantify them by a straightforward and universally accepted procedure. In this paper, a new algorithm for reducing false positives among automatically detected hotspots in bone scintigraphy images is proposed. The motivation lies in the difficulty of building a fully annotated database. Accordingly, our algorithm is a semisupervised method that works iteratively. The ultimate goal is to provide the physician with a fast, precise and reliable tool to quantify bone scans and evaluate disease progression and response to treatment. The algorithm is tested on a set of bone scans manually labeled according to the patient’s medical record. The achieved classification sensitivity, specificity and false negative rate were 63%, 58% and 37%, respectively. Comparison with other state-of-the-art classification algorithms shows superiority of the proposed method. Full article
(This article belongs to the Special Issue Advanced Computational Methods for Oncological Image Analysis)
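The three rates reported in the abstract follow from a confusion matrix, as the minimal sketch below illustrates. The counts used in the example are hypothetical values chosen only to reproduce the quoted percentages; they are not the paper's data, and the function name is illustrative.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, false negative rate) from raw counts."""
    sensitivity = tp / (tp + fn)           # true positives among actual positives
    specificity = tn / (tn + fp)           # true negatives among actual negatives
    false_negative_rate = fn / (tp + fn)   # missed positives among actual positives
    return sensitivity, specificity, false_negative_rate

# Hypothetical counts (not from the paper) matching 63%, 58% and 37%.
sens, spec, fnr = classification_rates(tp=63, fn=37, tn=58, fp=42)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} FNR={fnr:.2f}")
```

Sensitivity and false negative rate share the same denominator, so they always sum to 1, which is consistent with the 63% and 37% figures quoted above.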

9 pages, 6782 KiB  
Article
Suppression of Cone-Beam Artefacts with Direct Iterative Reconstruction Computed Tomography Trajectories (DIRECTT)
by Sotirios Magkos, Andreas Kupsch and Giovanni Bruno
J. Imaging 2021, 7(8), 147; https://doi.org/10.3390/jimaging7080147 - 15 Aug 2021
Cited by 4 | Viewed by 2304
Abstract
The reconstruction of cone-beam computed tomography data using filtered back-projection algorithms unavoidably results in severe artefacts. We describe how the Direct Iterative Reconstruction of Computed Tomography Trajectories (DIRECTT) algorithm can be combined with a model of the artefacts for the reconstruction of such data. The implementation of DIRECTT results in reconstructed volumes of superior quality compared to the conventional algorithms. Full article
(This article belongs to the Special Issue X-ray Digital Radiography and Computed Tomography)

15 pages, 1907 KiB  
Article
Investigating Semantic Augmentation in Virtual Environments for Image Segmentation Using Convolutional Neural Networks
by Joshua Ganter, Simon Löffler, Ron Metzger, Katharina Ußling and Christoph Müller
J. Imaging 2021, 7(8), 146; https://doi.org/10.3390/jimaging7080146 - 14 Aug 2021
Viewed by 2018
Abstract
Collecting real-world data for the training of neural networks is enormously time-consuming and expensive. As such, the concept of virtualizing the domain and creating synthetic data has been analyzed in many instances. This virtualization offers many possibilities of changing the domain, and with that, enabling the relatively fast creation of data. It also offers the chance to enhance necessary augmentations with additional semantic information when compared with conventional augmentation methods. This raises the question of whether such semantic changes, which can be seen as augmentations of the virtual domain, contribute to better results for neural networks, when trained with data augmented this way. In this paper, a virtual dataset is presented, including semantic augmentations and automatically generated annotations, as well as a comparison between semantic and conventional augmentation for image data. It is determined that the results differ only marginally for neural network models trained with the two augmentation approaches. Full article
(This article belongs to the Special Issue Deep Learning for Visual Contents Processing and Analysis)

15 pages, 10020 KiB  
Article
Real-Time 3D Multi-Object Detection and Localization Based on Deep Learning for Road and Railway Smart Mobility
by Antoine Mauri, Redouane Khemmar, Benoit Decoux, Madjid Haddad and Rémi Boutteau
J. Imaging 2021, 7(8), 145; https://doi.org/10.3390/jimaging7080145 - 12 Aug 2021
Cited by 14 | Viewed by 4089
Abstract
For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, we introduce in this work a new real-time deep learning approach for 3D multi-object detection for smart mobility not only on roads, but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation of our method on these two datasets shows good accuracy, but more importantly that it can be used in real-time conditions, in road and rail traffic environments. Through our experimental results, we also show the importance of the accuracy of prediction of the regions of interest (RoIs) used in the estimation of 3D bounding box parameters. Full article
(This article belongs to the Special Issue Visual Localization)

11 pages, 2109 KiB  
Article
Data Augmentation Using Background Replacement for Automated Sorting of Littered Waste
by Arianna Patrizi, Giorgio Gambosi and Fabio Massimo Zanzotto
J. Imaging 2021, 7(8), 144; https://doi.org/10.3390/jimaging7080144 - 12 Aug 2021
Cited by 9 | Viewed by 4692
Abstract
The introduction of sophisticated waste treatment plants is making the process of trash sorting and recycling more and more effective and eco-friendly. Studies on Automated Waste Sorting (AWS) are greatly contributing to making the whole recycling process more efficient. However, a relevant issue, which remains unsolved, is how to deal with the large amount of waste that is littered in the environment instead of being collected properly. In this paper, we introduce BackRep: a method for building waste recognizers that can be used for identifying and sorting littered waste directly where it is found. BackRep consists of a data-augmentation procedure, which expands existing datasets by cropping solid waste in images taken on a uniform (white) background and superimposing it on more realistic backgrounds. For our purpose, realistic backgrounds are those representing places where solid waste is usually littered. To experiment with our data-augmentation procedure, we produced a new dataset in realistic settings. We observed that waste recognizers trained on augmented data actually outperform those trained on existing datasets. Hence, our data-augmentation procedure seems a viable approach to support the development of waste recognizers for urban and wild environments. Full article
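The cropping-and-superimposing step described in this abstract can be illustrated with a simple mask-based composite. The sketch below is not the authors' implementation: it assumes grayscale images stored as nested lists, and the function name `replace_background` and the `white_threshold` parameter are hypothetical choices made only for illustration.

```python
def replace_background(foreground, background, white_threshold=240):
    """Paste non-white foreground pixels onto a new background.

    Both images are lists of rows of grayscale values in 0..255 and
    must have the same shape. Pixels at or above white_threshold are
    treated as the uniform white background (an assumption of this
    sketch) and are replaced by the corresponding background pixel.
    """
    if len(foreground) != len(background):
        raise ValueError("images must have the same height")
    result = []
    for fg_row, bg_row in zip(foreground, background):
        if len(fg_row) != len(bg_row):
            raise ValueError("images must have the same width")
        result.append([
            bg if fg >= white_threshold else fg
            for fg, bg in zip(fg_row, bg_row)
        ])
    return result


# Toy example: a 2x3 "object" photographed on white, pasted onto a scene.
waste_on_white = [[255, 30, 255],
                  [255, 40, 50]]
street_scene = [[100, 110, 120],
                [130, 140, 150]]
augmented = replace_background(waste_on_white, street_scene)
```

A real pipeline would more likely use a proper segmentation mask than a fixed brightness threshold, since bright regions of the object itself would otherwise be overwritten.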

16 pages, 5567 KiB  
Article
Unsupervised Approaches for the Segmentation of Dry ARMD Lesions in Eye Fundus cSLO Images
by Clément Royer, Jérémie Sublime, Florence Rossant and Michel Paques
J. Imaging 2021, 7(8), 143; https://doi.org/10.3390/jimaging7080143 - 11 Aug 2021
Cited by 6 | Viewed by 2630
Abstract
Age-related macular degeneration (ARMD), a major cause of sight impairment for elderly people, is still not well understood despite intensive research. The size of the lesions in the fundus is the main biomarker of the severity of the disease and as such is widely used in clinical trials, yet its measurement still relies only on manual segmentation. Artificial intelligence, in particular automatic image analysis based on neural networks, has a major role to play in better understanding the disease, by analyzing the intrinsic optical properties of dry ARMD lesions from patient images. In this paper, we propose a comparison of automatic segmentation methods (a classical computer vision method, a machine learning method and a deep learning method) in an unsupervised context applied to cSLO IR images. Among the methods compared, we propose an adaptation of a fully convolutional network, called W-net, as an efficient method for the segmentation of ARMD lesions. Unlike supervised segmentation methods, our algorithm does not require annotated data, which are very difficult to obtain in this application. Our method was tested on a dataset of 328 images and was shown to reach higher-quality results than the other unsupervised methods compared, with an F1 score of 0.87, while having a more stable model, even though in some specific cases texture/edge-based methods can produce relevant results. Full article
(This article belongs to the Special Issue Frontiers in Retinal Image Processing)

14 pages, 1736 KiB  
Article
Synthesising Facial Macro- and Micro-Expressions Using Reference Guided Style Transfer
by Chuin Hong Yap, Ryan Cunningham, Adrian K. Davison and Moi Hoon Yap
J. Imaging 2021, 7(8), 142; https://doi.org/10.3390/jimaging7080142 - 11 Aug 2021
Cited by 4 | Viewed by 3268
Abstract
Long video datasets of facial macro- and micro-expressions remain in strong demand with the current dominance of data-hungry deep learning methods. Few methods exist for generating long videos that contain micro-expressions. Moreover, there is a lack of performance metrics to quantify the generated data. To address these research gaps, we introduce a new approach to generate synthetic long videos and recommend assessment methods to inspect dataset quality. For synthetic long video generation, we use the state-of-the-art generative adversarial network style transfer method, StarGANv2. Using StarGANv2 pre-trained on the CelebA dataset, we transfer the style of a reference image from SAMM long videos (a facial micro- and macro-expression long video dataset) onto a source image of the FFHQ dataset to generate a synthetic dataset (SAMM-SYNTH). We evaluate SAMM-SYNTH by conducting an analysis based on the facial action units detected by OpenFace. For quantitative measurement, our findings show high correlation on two Action Units (AUs), i.e., AU12 and AU6, of the original and synthetic data with a Pearson’s correlation of 0.74 and 0.72, respectively. This is further supported by the evaluation method proposed by OpenFace on those AUs, which also have high scores of 0.85 and 0.59. Additionally, optical flow is used to visually compare the original facial movements and the transferred facial movements. With this article, we publish our dataset to enable future research and to increase the data pool of micro-expressions research, especially in the spotting task. Full article
(This article belongs to the Special Issue Imaging Studies for Face and Gesture Analysis)
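The Pearson's correlation quoted in this abstract compares two per-frame series, such as an action-unit intensity in the original and in the synthetic video. A minimal sketch of that computation follows; the function name and the sample intensity values are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products and squares (the common 1/n factor cancels).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-frame AU12 intensities for an original and a synthetic clip.
original_au12 = [0.0, 0.5, 1.0, 1.5, 1.0]
synthetic_au12 = [0.1, 0.4, 1.2, 1.3, 0.9]
r = pearson_correlation(original_au12, synthetic_au12)
```

Values of `r` close to 1 indicate that the synthetic sequence follows the same rise and fall as the original, which is the sense in which the abstract reports "high correlation".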

17 pages, 953 KiB  
Article
Direct and Indirect vSLAM Fusion for Augmented Reality
by Mohamed Outahar, Guillaume Moreau and Jean-Marie Normand
J. Imaging 2021, 7(8), 141; https://doi.org/10.3390/jimaging7080141 - 10 Aug 2021
Cited by 4 | Viewed by 2630
Abstract
Augmented reality (AR) is an emerging technology that is applied in many fields. One of the limitations that still prevents AR from being even more widely used relates to the accessibility of devices. Indeed, the devices currently used are usually high-end, expensive glasses or mobile devices. vSLAM (visual simultaneous localization and mapping) algorithms circumvent this problem by requiring only relatively cheap cameras for AR. vSLAM algorithms can be classified as direct or indirect methods based on the type of data used. Each class of algorithms works optimally on one type of scene (e.g., textured or untextured) but unfortunately with little overlap. In this work, a method is proposed to fuse a direct and an indirect method in order to achieve higher robustness and to offer the possibility for AR to move seamlessly between different types of scenes. Our method is tested on three datasets against state-of-the-art direct (LSD-SLAM), semi-direct (LCSD) and indirect (ORBSLAM2) algorithms in two different scenarios: a trajectory planning and an AR scenario where a virtual object is displayed on top of the video feed; furthermore, a similar method (LCSD SLAM) is also compared to our proposal. Results show that our fusion algorithm is generally as efficient as the best algorithm both in terms of trajectory (mean errors with respect to ground truth trajectory measurements) and in terms of quality of the augmentation (robustness and stability). In short, we propose a fusion algorithm that, in our tests, takes the best of both the direct and indirect methods. Full article
(This article belongs to the Special Issue Advanced Scene Perception for Augmented Reality)

16 pages, 507 KiB  
Article
Identification of Social-Media Platform of Videos through the Use of Shared Features
by Luca Maiano, Irene Amerini, Lorenzo Ricciardi Celsi and Aris Anagnostopoulos
J. Imaging 2021, 7(8), 140; https://doi.org/10.3390/jimaging7080140 - 8 Aug 2021
Cited by 13 | Viewed by 2726
Abstract
Videos have become a powerful tool for spreading illegal content such as military propaganda, revenge porn, or bullying through social networks. To counter these illegal activities, it has become essential to try new methods to verify the origin of videos from these platforms. However, collecting datasets large enough to train neural networks for this task has become difficult because of the privacy regulations that have been enacted in recent years. To mitigate this limitation, in this work we propose two different solutions based on transfer learning and multitask learning to determine whether a video has been uploaded from or downloaded to a specific social platform through the use of shared features with images trained on the same task. By transferring features from the shallowest to the deepest levels of the network from the image task to videos, we measure the amount of information shared between these two tasks. Then, we introduce a model based on multitask learning, which learns from both tasks simultaneously. The promising experimental results show, in particular, the effectiveness of the multitask approach. To our knowledge, this is the first work that addresses the problem of social media platform identification of videos through the use of shared features. Full article
(This article belongs to the Special Issue Image and Video Forensics)

14 pages, 6437 KiB  
Article
A Green Prospective for Learned Post-Processing in Sparse-View Tomographic Reconstruction
by Elena Morotti, Davide Evangelista and Elena Loli Piccolomini
J. Imaging 2021, 7(8), 139; https://doi.org/10.3390/jimaging7080139 - 7 Aug 2021
Cited by 9 | Viewed by 2595
Abstract
Deep Learning is producing tools that are of great interest for inverse imaging applications. In this work, we consider a medical imaging reconstruction task from subsampled measurements, which is an active research field where Convolutional Neural Networks have already revealed their great potential. However, the commonly used architectures are very deep and, hence, prone to overfitting and infeasible for clinical use. Inspired by the ideas of the green AI literature, we propose a shallow neural network to perform efficient Learned Post-Processing on images roughly reconstructed by the filtered backprojection algorithm. The results show that the proposed inexpensive network computes images of comparable (or even higher) quality in about one-fourth of the time and is more robust than the widely used and very deep ResUNet for tomographic reconstructions from sparse-view protocols. Full article
(This article belongs to the Special Issue Inverse Problems and Imaging)

12 pages, 1586 KiB  
Article
Can Liquid Lenses Increase Depth of Field in Head Mounted Video See-Through Devices?
by Marina Carbone, Davide Domeneghetti, Fabrizio Cutolo, Renzo D’Amato, Emanuele Cigna, Paolo Domenico Parchi, Marco Gesi, Luca Morelli, Mauro Ferrari and Vincenzo Ferrari
J. Imaging 2021, 7(8), 138; https://doi.org/10.3390/jimaging7080138 - 5 Aug 2021
Cited by 3 | Viewed by 2475
Abstract
Wearable Video See-Through (VST) devices for Augmented Reality (AR) and for obtaining a Magnified View are taking hold in the medical and surgical fields. However, these devices are not yet usable in daily clinical practice, due to focusing problems and a limited depth of field. This study investigates the use of liquid-lens optics to create an autofocus system for wearable VST visors. The autofocus system is based on a Time of Flight (TOF) distance sensor and an active autofocus control system. The integrated autofocus system in the wearable VST viewers showed good potential in terms of providing rapid focus at various distances and a magnified view. Full article

13 pages, 16018 KiB  
Article
Evaluation of 360° Image Projection Formats; Comparing Format Conversion Distortion Using Objective Quality Metrics
by Ikram Hussain and Oh-Jin Kwon
J. Imaging 2021, 7(8), 137; https://doi.org/10.3390/jimaging7080137 - 5 Aug 2021
Cited by 1 | Viewed by 3004
Abstract
Currently available 360° cameras normally capture several images covering a scene in all directions around a shooting point. The captured images are spherical in nature and are mapped to a two-dimensional plane using various projection methods. Many projection formats have been proposed for 360° videos. However, standards for the quality assessment of 360° images are limited. In this paper, various projection formats are compared to explore the problem of distortion caused by a mapping operation, which has been a considerable challenge in recent approaches. The performances of various projection formats, including equi-rectangular, equal-area, cylindrical, cube-map, and their modified versions, are evaluated based on which conversion causes the least distortion when the format is changed. The evaluation is conducted using sample images selected based on several attributes that determine the perceptual image quality. The evaluation results based on the objective quality metrics have proved that the hybrid equi-angular cube-map format is the most appropriate solution as a common format in 360° image services where format conversions are frequently required. This study presents findings ranking these formats that are useful for identifying the best image format for a future standard. Full article
(This article belongs to the Special Issue Image and Video Quality Assessment)

13 pages, 7969 KiB  
Article
Low-Cost Hyperspectral Imaging with A Smartphone
by Mary B. Stuart, Andrew J. S. McGonigle, Matthew Davies, Matthew J. Hobbs, Nicholas A. Boone, Leigh R. Stanger, Chengxi Zhu, Tom D. Pering and Jon R. Willmott
J. Imaging 2021, 7(8), 136; https://doi.org/10.3390/jimaging7080136 - 5 Aug 2021
Cited by 27 | Viewed by 8940
Abstract
Recent advances in smartphone technologies have opened the door to the development of accessible, highly portable sensing tools capable of accurate and reliable data collection in a range of environmental settings. In this article, we introduce a low-cost smartphone-based hyperspectral imaging system that can convert a standard smartphone camera into a visible wavelength hyperspectral sensor for ca. £100. To the best of our knowledge, this represents the first smartphone capable of hyperspectral data collection without the need for extensive post processing. The Hyperspectral Smartphone’s abilities are tested in a variety of environmental applications and its capabilities directly compared to the laboratory-based analogue from our previous research, as well as the wider existing literature. The Hyperspectral Smartphone is capable of accurate, laboratory- and field-based hyperspectral data collection, demonstrating the significant promise of both this device and smartphone-based hyperspectral imaging as a whole. Full article
(This article belongs to the Special Issue Hyperspectral Imaging and Its Applications)

20 pages, 33352 KiB  
Article
CNN-Based Multi-Modal Camera Model Identification on Video Sequences
by Davide Dal Cortivo, Sara Mandelli, Paolo Bestagini and Stefano Tubaro
J. Imaging 2021, 7(8), 135; https://doi.org/10.3390/jimaging7080135 - 5 Aug 2021
Cited by 12 | Viewed by 3098
Abstract
Identifying the source camera of images and videos has gained significant importance in multimedia forensics. It allows tracing data back to their creator, thus helping to solve copyright infringement cases and expose the authors of hideous crimes. In this paper, we focus on the problem of camera model identification for video sequences, that is, given a video under analysis, detecting the camera model used for its acquisition. To this end, we develop two different CNN-based camera model identification methods, working in a novel multi-modal scenario. Unlike mono-modal methods, which use only the visual or audio information from the investigated video to tackle the identification task, the proposed multi-modal methods jointly exploit audio and visual information. We test our proposed methodologies on the well-known Vision dataset, which collects almost 2000 video sequences belonging to different devices. Experiments are performed considering native videos directly acquired by their acquisition devices and videos uploaded on social media platforms, such as YouTube and WhatsApp. The achieved results show that the proposed multi-modal approaches significantly outperform their mono-modal counterparts, representing a valuable strategy for the tackled problem and opening future research to even more challenging scenarios. Full article
(This article belongs to the Special Issue Image and Video Forensics)

12 pages, 3196 KiB  
Article
A Detection Method of Operated Fake-Images Using Robust Hashing
by Miki Tanaka, Sayaka Shiota and Hitoshi Kiya
J. Imaging 2021, 7(8), 134; https://doi.org/10.3390/jimaging7080134 - 5 Aug 2021
Cited by 9 | Viewed by 3509
Abstract
SNS providers are known to carry out the recompression and resizing of uploaded images, but most conventional methods for detecting fake images/tampered images are not robust enough against such operations. In this paper, we propose a novel method for detecting fake images, including distortion caused by image operations such as image compression and resizing. We select a robust hashing method, which retrieves images similar to a query image, for fake-image/tampered-image detection, and hash values extracted from both reference and query images are used to robustly detect fake-images for the first time. If there is an original hash code from a reference image for comparison, the proposed method can more robustly detect fake images than conventional methods. One of the practical applications of this method is to monitor images, including synthetic ones sold by a company. In experiments, the proposed fake-image detection is demonstrated to outperform state-of-the-art methods under the use of various datasets including fake images generated with GANs. Full article
(This article belongs to the Special Issue Intelligent Media Processing)

13 pages, 17217 KiB  
Article
Enhanced Magnetic Resonance Image Synthesis with Contrast-Aware Generative Adversarial Networks
by Jonas Denck, Jens Guehring, Andreas Maier and Eva Rothgang
J. Imaging 2021, 7(8), 133; https://doi.org/10.3390/jimaging7080133 - 4 Aug 2021
Cited by 7 | Viewed by 2773
Abstract
A magnetic resonance imaging (MRI) exam typically consists of the acquisition of multiple MR pulse sequences, which are required for a reliable diagnosis. With the rise of generative deep learning models, approaches for the synthesis of MR images are developed to either synthesize additional MR contrasts, generate synthetic data, or augment existing data for AI training. While current generative approaches allow only the synthesis of specific sets of MR contrasts, we developed a method to generate synthetic MR images with adjustable image contrast. Therefore, we trained a generative adversarial network (GAN) with a separate auxiliary classifier (AC) network to generate synthetic MR knee images conditioned on various acquisition parameters (repetition time, echo time, and image orientation). The AC determined the repetition time with a mean absolute error (MAE) of 239.6 ms, the echo time with an MAE of 1.6 ms, and the image orientation with an accuracy of 100%. Therefore, it can properly condition the generator network during training. Moreover, in a visual Turing test, two experts mislabeled 40.5% of real and synthetic MR images, demonstrating that the image quality of the generated synthetic and real MR images is comparable. This work can support radiologists and technologists during the parameterization of MR sequences by previewing the yielded MR contrast, can serve as a valuable tool for radiology training, and can be used for customized data generation to support AI training. Full article
(This article belongs to the Special Issue Intelligent Strategies for Medical Image Analysis)

32 pages, 12921 KiB  
Review
Imaging with Coherent X-rays: From the Early Synchrotron Tests to SYNAPSE
by Giorgio Margaritondo and Yeukuang Hwu
J. Imaging 2021, 7(8), 132; https://doi.org/10.3390/jimaging7080132 - 4 Aug 2021
Cited by 5 | Viewed by 2514
Abstract
The high longitudinal and lateral coherence of synchrotron X-ray sources radically transformed radiography. Before them, image contrast was based almost exclusively on absorption. Coherent synchrotron sources transformed radiography into a multi-faceted tool that can also extract information from “phase” effects. Here, we report a very simple description of the new techniques, presenting them to potential new users without requiring a sophisticated background in advanced physics. We then illustrate the impact of such techniques with a number of examples. Finally, we present the international collaboration SYNAPSE (Synchrotrons for Neuroscience—an Asia-Pacific Strategic Enterprise), which targets the use of phase-contrast radiography to map one full human brain in a few years. Full article
(This article belongs to the Special Issue X-ray Digital Radiography and Computed Tomography)

11 pages, 1711 KiB  
Article
Customized Efficient Neural Network for COVID-19 Infected Region Identification in CT Images
by Alessandro Stefano and Albert Comelli
J. Imaging 2021, 7(8), 131; https://doi.org/10.3390/jimaging7080131 - 4 Aug 2021
Cited by 41 | Viewed by 3167
Abstract
Background: In the field of biomedical imaging, radiomics is a promising approach that aims to provide quantitative features from images. It is highly dependent on accurate identification and delineation of the volume of interest to avoid mistakes in the implementation of the texture-based prediction model. In this context, we present a customized deep learning approach aimed at the real-time and fully automated identification and segmentation of COVID-19 infected regions in computed tomography images. Methods: In a previous study, we adopted ENET, originally used for image segmentation tasks in self-driving cars, for whole parenchyma segmentation in patients with idiopathic pulmonary fibrosis, which has several similarities to COVID-19 disease. To automatically identify and segment COVID-19 infected areas, a customized ENET, namely C-ENET, was implemented and its performance compared to the original ENET and some state-of-the-art deep learning architectures. Results: The experimental results demonstrate the effectiveness of our approach. Considering the performance obtained in terms of similarity of the result of the segmentation to the gold standard (dice similarity coefficient ~75%), our proposed methodology can be used for the identification and delineation of COVID-19 infected areas without any supervision of a radiologist, in order to obtain a volume of interest independent from the user. Conclusions: We demonstrated that the proposed customized deep learning model can be applied to rapidly identify and segment COVID-19 infected regions to subsequently extract useful information for assessing disease severity through radiomics analyses. Full article
(This article belongs to the Special Issue Intelligent Strategies for Medical Image Analysis)
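The dice similarity coefficient cited in this abstract measures the overlap between a predicted segmentation mask and a gold-standard mask as twice the intersection divided by the sum of the two mask sizes. A minimal sketch follows; the function name and the toy masks are hypothetical, and the toy value is not the paper's result.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two flat binary (0/1) masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    # Count pixels marked as foreground in both masks.
    intersection = sum(p * r for p, r in zip(predicted, reference))
    total = sum(predicted) + sum(reference)
    if total == 0:
        # Both masks are empty; treated as perfect agreement by convention here.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 overlapping pixels, 3 predicted, 5 in the reference.
predicted_mask = [1, 1, 1, 0, 0, 0]
reference_mask = [1, 1, 1, 1, 1, 0]
score = dice_coefficient(predicted_mask, reference_mask)  # 2*3 / (3+5) = 0.75
```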
