Augmented Reality, Virtual Reality & Semantic 3D Reconstruction

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Optics and Lasers".

Deadline for manuscript submissions: closed (5 November 2020) | Viewed by 84525

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Dr. Jing-Yan Wang
Guest Editor
PEGASUS FZ LLC, United Arab Emirates
Interests: machine learning research & applications; convolutional neural networks; sparse coding; learning-to-rank

Prof. Neeraj Kumar
Guest Editor
Computer Science and Engineering, Thapar Institute of Engineering and Technology, Deemed University, Patiala 147004, India
Interests: SDN; cyber physical systems; security; smart cities; deep learning; blockchain

Prof. Jaime Lloret Mauri
Guest Editor
Integrated Management Coastal Research Institute, Universitat Politecnica de Valencia, 46022 Valencia, Spain
Interests: network protocols; network algorithms; wireless sensor networks; ad hoc networks; multimedia streaming

Special Issue Information

Dear Colleagues,

Augmented reality (AR) is a key technology that will facilitate a major paradigm shift in the way users interact with data, and it has only recently been recognized as a viable solution for many critical needs. AR can be used to visualize data from hundreds of sensors simultaneously, overlaying relevant, actionable information on the user's environment through a headset. Semantic 3D reconstruction makes AR technology far more promising by enriching reconstructed scenes with semantic information. Although several post-processing methods are currently available for extracting semantic information from reconstructed 3D models, their results are uncertain and often incorrect. It is therefore necessary to explore novel 3D reconstruction approaches that automatically recover the 3D geometry of a scene and obtain its semantic information simultaneously.

The rapid advent of deep learning has brought new opportunities to the field of semantic 3D reconstruction from photo collections. Deep learning-based methods can not only extract semantic information but also enhance fundamental techniques in semantic 3D reconstruction, including feature matching and tracking, stereo matching, camera pose estimation, and multi-view stereo. Moreover, deep learning can be used to extract priors from photo collections, and the obtained information can in turn improve the quality of 3D reconstruction.

The aim of this Special Issue is to provide a platform for researchers to share innovative work in the fields of semantic 3D reconstruction, virtual reality, and augmented reality, including deep learning-based approaches to 3D reconstruction and deep learning software platforms for virtual and augmented reality.

This Special Issue will focus on (but is not limited to) the following topics:

  • Virtual reality
  • Augmented reality
  • Semantic 3D reconstruction
  • Color transfer in virtual reality
  • Color consistency in augmented reality
  • Feature detection and matching in 3D reconstruction
  • Dynamic Simultaneous Localization and Mapping
  • Large-scale Structure from Motion
  • Augmented reality software platform
  • Virtual reality hardware platform
  • Applications, including cultural heritage, environmental recording, etc.
Dr. Zhihan Lv
Dr. Jing-Yan Wang
Prof. Neeraj Kumar
Prof. Jaime Lloret Mauri
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Augmented reality
  • Virtual reality
  • Semantic 3D reconstruction
  • Color transfer in virtual reality
  • Color consistency in augmented reality
  • Feature detection and matching in 3D reconstruction
  • Dynamic Simultaneous Localization and Mapping
  • Large-scale Structure from Motion
  • Augmented reality software platform
  • Virtual reality hardware platform
  • Applications, including cultural heritage, environmental recording, etc.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (20 papers)


Editorial

6 pages, 197 KiB  
Editorial
Special Issue on “Augmented Reality, Virtual Reality & Semantic 3D Reconstruction”
by Zhihan Lv, Jing-Yan Wang, Neeraj Kumar and Jaime Lloret
Appl. Sci. 2021, 11(18), 8590; https://doi.org/10.3390/app11188590 - 16 Sep 2021
Cited by 2 | Viewed by 1810
Abstract
Augmented Reality is a key technology that will facilitate a major paradigm shift in the way users interact with data and has only just recently been recognized as a viable solution for solving many critical needs [...]

Research

11 pages, 5417 KiB  
Article
A Novel Real-Time Virtual 3D Object Composition Method for 360° Video
by Jaehyun Lee, Sungjae Ha, Philippe Gentet, Leehwan Hwang, Soonchul Kwon and Seunghyun Lee
Appl. Sci. 2020, 10(23), 8679; https://doi.org/10.3390/app10238679 - 4 Dec 2020
Cited by 1 | Viewed by 2971
Abstract
As highly immersive virtual reality (VR) content, 360° video allows users to observe all viewpoints within the desired direction from the position where the video is recorded. In 360° video content, virtual objects are inserted into recorded real scenes to provide a higher sense of immersion. These techniques are called 3D composition. For a realistic 3D composition in a 360° video, it is important to obtain the internal (focal length) and external (position and rotation) parameters from a 360° camera. Traditional methods estimate the trajectory of a camera by extracting feature points from the recorded video. However, incorrect results may occur owing to stitching errors from a 360° camera that combines several high-resolution cameras, and a large amount of time is spent on feature tracking owing to the high resolution of the video. We propose a new method for pre-visualization and 3D composition that overcomes the limitations of existing methods. This system achieves real-time position tracking of the attached camera using a ZED camera and a stereo-vision sensor, and real-time stabilization using a Kalman filter. The proposed system shows high time efficiency and accurate 3D composition.
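
The stabilization step mentioned above can be illustrated with a minimal constant-velocity Kalman filter that smooths a tracked 3D camera position. This is a sketch under assumptions: the state layout, time step, and noise magnitudes below are illustrative values, not the authors' configuration.

```python
import numpy as np

# Minimal constant-velocity Kalman filter for smoothing a tracked 3D camera
# position. State is [x, y, z, vx, vy, vz]; dt and noise values are
# illustrative, not taken from the paper.
class PositionKalmanFilter:
    def __init__(self, dt=1.0 / 30.0, process_var=1e-3, meas_var=1e-2):
        self.x = np.zeros(6)                               # state estimate
        self.P = np.eye(6)                                 # state covariance
        self.F = np.eye(6)                                 # transition model
        self.F[:3, 3:] = dt * np.eye(3)                    # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # we observe position only
        self.Q = process_var * np.eye(6)                   # process noise
        self.R = meas_var * np.eye(3)                      # measurement noise

    def update(self, z):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct with the measured position z (3-vector from the tracker).
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]                                  # smoothed position

kf = PositionKalmanFilter()
noisy_track = np.cumsum(np.random.randn(100, 3) * 0.01, axis=0)  # fake trajectory
smoothed = np.array([kf.update(z) for z in noisy_track])
```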

16 pages, 2221 KiB  
Article
Skeleton-Based Dynamic Hand Gesture Recognition Using an Enhanced Network with One-Shot Learning
by Chunyong Ma, Shengsheng Zhang, Anni Wang, Yongyang Qi and Ge Chen
Appl. Sci. 2020, 10(11), 3680; https://doi.org/10.3390/app10113680 - 26 May 2020
Cited by 24 | Viewed by 5183
Abstract
Dynamic hand gesture recognition based on one-shot learning requires full assimilation of the motion features from a few annotated data. However, how to effectively extract the spatio-temporal features of the hand gestures remains a challenging issue. This paper proposes a skeleton-based dynamic hand gesture recognition method using an enhanced network (GREN) based on one-shot learning, which improves the memory-augmented neural network so that it can rapidly assimilate the motion features of dynamic hand gestures. In addition, the network effectively combines and stores the shared features between dissimilar classes, which lowers the prediction error caused by unnecessary hyper-parameter updating and improves the recognition accuracy as the number of categories increases. In this paper, the public dynamic hand gesture database (DHGD) is used for experimental comparison with the state-of-the-art performance of the GREN network; although only 30% of the dataset was used for training, the accuracy of skeleton-based dynamic hand gesture recognition reached 82.29% based on one-shot learning. Experiments with the Microsoft Research Asia (MSRA) hand gesture dataset verified the robustness of the GREN network. The experimental results demonstrate that the GREN network is feasible for skeleton-based dynamic hand gesture recognition based on one-shot learning.
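
The memory-augmented mechanism that GREN builds on can be sketched as a cosine-similarity read over stored feature/label slots, which is how such networks classify a new gesture from a single stored example. This is an illustration of the general mechanism only, not the paper's architecture; the gesture features are random placeholders standing in for an upstream skeleton encoder.

```python
import numpy as np

# Toy memory read for one-shot classification: one stored feature vector per
# class, prediction by cosine-similarity attention over the memory slots.
rng = np.random.default_rng(0)
dim, n_slots = 128, 5

memory_keys = rng.normal(size=(n_slots, dim))   # one stored feature per class
memory_labels = np.arange(n_slots)              # class label of each slot

def read(query, temperature=1.0):
    # Cosine similarity between the query and every memory slot.
    sims = memory_keys @ query / (
        np.linalg.norm(memory_keys, axis=1) * np.linalg.norm(query) + 1e-8)
    # Softmax attention over slots; predict the best-matching slot's label.
    w = np.exp(sims / temperature)
    w /= w.sum()
    return memory_labels[np.argmax(w)], w

query = memory_keys[2] + 0.1 * rng.normal(size=dim)  # noisy view of class 2
label, weights = read(query)
print(label)  # -> 2 for this near-duplicate query
```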

15 pages, 2324 KiB  
Article
Exploring Visual Perceptions of Spatial Information for Wayfinding in Virtual Reality Environments
by Ju Yeon Kim and Mi Jeong Kim
Appl. Sci. 2020, 10(10), 3461; https://doi.org/10.3390/app10103461 - 17 May 2020
Cited by 9 | Viewed by 3970
Abstract
Human cognitive processes in wayfinding may differ depending on the time taken to accept visual information in environments. This study investigated users’ wayfinding processes using eye-tracking experiments, simulating a complex cultural space to analyze human visual movements in the perception and the cognitive processes through visual perception responses. The experiment set-up consisted of several paths in COEX Mall, Seoul—from the entrance of the shopping mall Starfield to the Star Hall Library to the COEX Exhibition Hall—using visual stimuli created by virtual reality (four stimuli and a total of 60 seconds stimulation time). The participants in the environment were 24 undergraduate or graduate students, with an average age of 24.8 years. Participants’ visual perception processes were analyzed in terms of the clarity and the recognition of spatial information and the activation of gaze fixation on spatial information. That is, the analysis of the visual perception process was performed by extracting “conscious gaze perspective” data comprising more than 50 consecutive 200 ms continuous gaze fixations; “visual understanding perspective” data were also extracted for more than 300 ms of continuous gaze fixation. The results show that the methods for analyzing the gaze data may vary in terms of processing, analysis, and scope of the data depending on the purpose of the virtual reality experiments. Further, they demonstrate the importance of what purpose statements are given to the subject during the experiment and the possibility of a technical approach being used for the interpretation of spatial information.
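
As a toy illustration of the two duration thresholds quoted above (200 ms for the "conscious gaze perspective" and 300 ms for the "visual understanding perspective"), the sketch below filters a fixation list by duration. The (x, y, duration) export format is a hypothetical assumption, not the eye tracker's actual output.

```python
# Duration-based filtering of gaze fixations, assuming the eye tracker
# exports fixations as (x_px, y_px, duration_ms) tuples. The 200 ms and
# 300 ms thresholds are the ones quoted in the abstract.
fixations = [
    (412, 300, 180),
    (420, 305, 240),
    (615, 410, 350),
    (618, 415, 90),
]

conscious_gaze = [f for f in fixations if f[2] >= 200]        # perception-level
visual_understanding = [f for f in fixations if f[2] >= 300]  # comprehension-level

print(len(conscious_gaze), len(visual_understanding))  # 2 1
```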

25 pages, 4863 KiB  
Article
Semi-Immersive Virtual Reality as a Tool to Improve Cognitive and Social Abilities in Preschool Children
by Maria Luisa Lorusso, Simona Travellini, Marisa Giorgetti, Paola Negrini, Gianluigi Reni and Emilia Biffi
Appl. Sci. 2020, 10(8), 2948; https://doi.org/10.3390/app10082948 - 24 Apr 2020
Cited by 20 | Viewed by 6770
Abstract
Virtual reality (VR) creates computer-generated virtual environments where users can experience and interact in a similar way as they would do in real life. VR systems are increasingly being used for rehabilitation goals, mainly with adults, but also with children, extending their application to the educational field. This report concerns a study of the impact of a semi-immersive VR system on a group of 25 children in a kindergarten context. The children were involved in several different games and activity types, specifically developed with the aim of teaching specific skills and fostering team collaboration. Their reactions and behaviors were recorded by their teachers and by trained psychologists through observation grids addressing task comprehension, participation and enjoyment, interaction and cooperation, conflict, strategic behaviors, and adult-directed questions concerning the activity, the device or general help requests. The grids were compiled at the initial, intermediate, and final timepoints of each session. The results show that the activities are easy to understand, enjoyable, and stimulate strategic behaviors, interaction and cooperation, while they do not elicit the need for many explanations. These results are discussed within a neuroconstructivist educational framework, and the suitability of semi-immersive, virtual-reality-based activities for cognitive empowerment and rehabilitation purposes is discussed.

12 pages, 2077 KiB  
Article
FCN-Based 3D Reconstruction with Multi-Source Photometric Stereo
by Ruixin Wang, Xin Wang, Di He, Lei Wang and Ke Xu
Appl. Sci. 2020, 10(8), 2914; https://doi.org/10.3390/app10082914 - 23 Apr 2020
Cited by 3 | Viewed by 2589
Abstract
As a classical method widely used in 3D reconstruction tasks, multi-source Photometric Stereo can obtain more accurate 3D reconstruction results than basic Photometric Stereo, but its complex calibration and solution process reduces the efficiency of the algorithm. In this paper, we propose a multi-source Photometric Stereo 3D reconstruction method based on a fully convolutional network (FCN). We represent the 3D shape of the object as a per-pixel depth map, which serves as the optimization target. After training in an end-to-end manner, our network can efficiently obtain 3D information on the object surface. In addition, we add two regularization constraints to the general loss function, which effectively help the network to optimize. Under the same light source configuration, our method can obtain higher accuracy than classic multi-source Photometric Stereo. At the same time, our new loss function helps the deep learning method achieve more realistic 3D reconstruction results. We have also used our own real dataset to experimentally verify our method. The experimental results show that our method is effective at solving the main problems faced by the classical method.
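
The abstract does not spell out the two regularization constraints, so the sketch below shows only the general pattern: a per-pixel depth data term plus two added regularizers. The first- and second-order smoothness terms and the weights are hypothetical placeholders, not the paper's actual loss.

```python
import torch
import torch.nn.functional as F

# Sketch of a per-pixel depth regression loss with two added regularizers,
# mirroring the structure described above. The two smoothness terms are
# illustrative stand-ins for the paper's (unnamed) constraints.
def depth_loss(pred, target, lam1=0.1, lam2=0.05):
    data = F.l1_loss(pred, target)                   # per-pixel data term

    dx = pred[:, :, :, 1:] - pred[:, :, :, :-1]      # horizontal depth gradient
    dy = pred[:, :, 1:, :] - pred[:, :, :-1, :]      # vertical depth gradient
    smooth1 = dx.abs().mean() + dy.abs().mean()      # first-order smoothness

    dxx = dx[:, :, :, 1:] - dx[:, :, :, :-1]
    dyy = dy[:, :, 1:, :] - dy[:, :, :-1, :]
    smooth2 = dxx.abs().mean() + dyy.abs().mean()    # second-order smoothness

    return data + lam1 * smooth1 + lam2 * smooth2

pred = torch.rand(2, 1, 64, 64, requires_grad=True)  # FCN output: depth map
target = torch.rand(2, 1, 64, 64)
depth_loss(pred, target).backward()
```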

8 pages, 248 KiB  
Article
Is an ADHD Observation-Scale Based on DSM Criteria Able to Predict Performance in a Virtual Reality Continuous Performance Test?
by Débora Areces, Celestino Rodríguez, Trinidad García and Marisol Cueli
Appl. Sci. 2020, 10(7), 2409; https://doi.org/10.3390/app10072409 - 1 Apr 2020
Cited by 6 | Viewed by 3135
Abstract
The diagnosis of Attention Deficit/Hyperactivity Disorder (ADHD) requires an exhaustive and objective assessment in order to design an intervention that is adapted to the peculiarities of the patients. The present study aimed to determine if the most commonly used ADHD observation scale—the Evaluation of Attention Deficit and Hyperactivity (EDAH) scale—is able to predict performance in a Continuous Performance Test based on Virtual Reality (VR-CPT). One hundred and fifty students (76% boys and 24% girls) aged 6–16 (M = 10.35; SD = 2.39) participated in the study. Regression analyses showed that only the EDAH subscale referring to inattention symptoms was a statistically significant predictor of performance in a VR-CPT. More specifically, this subscale showed 86.5% prediction accuracy regarding performance in the Omissions variable, 80.4% in the Commissions variable, and 74.5% in the Response-time variable. The EDAH subscales referring to impulsivity and hyperactivity were not statistically significant predictors of any variables in the VR-CPT. Our findings may partially explain why impulsive-hyperactive and combined presentations of ADHD might be considered unique and qualitatively different sub-categories of ADHD. These results also highlight the importance of measuring not only the observable behaviors of ADHD individuals, but also the scores in performance tests attained by the patients themselves.
27 pages, 6065 KiB  
Article
Development and Assessment of a Sensor-Based Orientation and Positioning Approach for Decreasing Variation in Camera Viewpoints and Image Transformations at Construction Sites
by Mohsen Foroughi Sabzevar, Masoud Gheisari and James Lo
Appl. Sci. 2020, 10(7), 2305; https://doi.org/10.3390/app10072305 - 27 Mar 2020
Cited by 4 | Viewed by 2783
Abstract
Image matching techniques offer valuable opportunities for the construction industry. Image matching, a fundamental process in computer vision, is required for different purposes such as object and scene recognition, video data mining, reconstruction of three-dimensional (3D) objects, etc. During the image matching process, two images that are randomly (i.e., from different positions and orientations) captured from a scene are compared using image matching algorithms in order to identify their similarity. However, this process is complex and error prone, because pictures that are randomly captured from a scene vary in viewpoint. Therefore, some main features in images, such as the position, orientation, and scale of objects, are transformed. Sometimes, image matching algorithms cannot correctly identify the similarity between these images. Logically, if these features remain unchanged during the picture capturing process, then image transformations are reduced, similarity increases, and consequently the chances of algorithms successfully conducting the image matching process increase. One way to improve these chances is to hold the camera at a fixed viewpoint. However, in messy, dusty, and temporary locations such as construction sites, holding the camera at a fixed viewpoint is not always feasible. Is there any way to repeat and retrieve the camera’s viewpoints during different captures at locations such as construction sites? This study developed and evaluated an orientation and positioning approach that decreased the variation in camera viewpoints and image transformation on construction sites. The results showed that images captured using this approach exhibited less image transformation than images captured without it.

13 pages, 1875 KiB  
Article
Generative Adversarial Network for Image Super-Resolution Combining Texture Loss
by Yuning Jiang and Jinhua Li
Appl. Sci. 2020, 10(5), 1729; https://doi.org/10.3390/app10051729 - 3 Mar 2020
Cited by 19 | Viewed by 4518
Abstract
Objective: Super-resolution reconstruction is an increasingly important area in computer vision. To alleviate the problems that super-resolution reconstruction models based on generative adversarial networks are difficult to train and contain artifacts in their reconstruction results, we propose a novel and improved algorithm. Methods: This paper presents TSRGAN (Super-Resolution Generative Adversarial Networks Combining Texture Loss), a model also based on generative adversarial networks. We redefined the generator and discriminator networks. Firstly, regarding the network structure, residual dense blocks without excess batch normalization layers were used to form the generator network, and the Visual Geometry Group (VGG)19 network was adopted as the basic framework of the discriminator network. Secondly, regarding the loss function, a weighted sum of four losses (texture loss, perceptual loss, adversarial loss, and content loss) was used as the generator's objective. Texture loss was proposed to encourage local information matching. Perceptual loss was enhanced by computing it on features before the activation layer. Adversarial loss was optimized based on WGAN-GP (Wasserstein GAN with Gradient Penalty) theory. Content loss was used to ensure the accuracy of low-frequency information. During the optimization process, the target image information was reconstructed from the different angles of high and low frequencies. Results: The experimental results showed that our method raised the average Peak Signal-to-Noise Ratio of reconstructed images to 27.99 dB and the average Structural Similarity Index to 0.778 without losing too much speed, which was superior to comparison algorithms on objective evaluation indices. What is more, TSRGAN significantly improved subjective visual evaluations such as brightness information and texture details. We found that it could generate images with more realistic textures and more accurate brightness, which were more in line with human visual evaluation. Conclusions: Our improvements to the network structure reduce the model’s computation and stabilize the training direction. In addition, the loss function we present for the generator provides stronger supervision for restoring realistic textures and achieving brightness consistency. Experimental results prove the effectiveness and superiority of the TSRGAN algorithm.
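
The four-term generator objective described above can be sketched as a weighted sum in PyTorch. The weights, the stand-in feature extractor and discriminator, and the Gram-matrix form of the texture loss are assumptions (Gram matrices are a standard texture-matching choice), not the paper's exact formulation.

```python
import torch

# Sketch of a TSRGAN-style generator objective: weighted content,
# perceptual, texture, and adversarial terms. vgg_features should return
# pre-activation VGG19 features; here a pooling lambda stands in for it.
def gram(feat):                                  # (B, C, H, W) -> (B, C, C)
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def generator_loss(sr, hr, vgg_features, discriminator,
                   w_content=1.0, w_percep=1.0, w_texture=1.0, w_adv=5e-3):
    content = torch.nn.functional.l1_loss(sr, hr)         # low-frequency accuracy
    f_sr, f_hr = vgg_features(sr), vgg_features(hr)       # feature maps
    percep = torch.nn.functional.mse_loss(f_sr, f_hr)     # perceptual term
    texture = torch.nn.functional.mse_loss(gram(f_sr), gram(f_hr))  # local texture
    adv = -discriminator(sr).mean()                       # WGAN-GP generator term
    return w_content * content + w_percep * percep + w_texture * texture + w_adv * adv

# Smoke test with stand-in networks (placeholders, not real VGG19 / critic).
sr = torch.rand(2, 3, 96, 96, requires_grad=True)
hr = torch.rand(2, 3, 96, 96)
fake_vgg = lambda x: torch.nn.functional.avg_pool2d(x, 4)
fake_disc = lambda x: x.mean(dim=(1, 2, 3))
generator_loss(sr, hr, fake_vgg, fake_disc).backward()
```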

11 pages, 8632 KiB  
Article
The Imperial Cathedral in Königslutter (Germany) as an Immersive Experience in Virtual Reality with Integrated 360° Panoramic Photography
by Alexander P. Walmsley and Thomas P. Kersten
Appl. Sci. 2020, 10(4), 1517; https://doi.org/10.3390/app10041517 - 23 Feb 2020
Cited by 50 | Viewed by 6662
Abstract
As virtual reality (VR) and the corresponding 3D documentation and modelling technologies evolve into increasingly powerful and established tools for numerous applications in architecture, monument preservation, conservation/restoration and the presentation of cultural heritage, new methods for creating information-rich interactive 3D environments are increasingly in demand. In this article, we describe the development of an immersive virtual reality application for the Imperial Cathedral in Königslutter, in which 360° panoramic photographs were integrated within the virtual environment as a novel and complementary form of visualization. The Imperial Cathedral (Kaiserdom) of Königslutter is one of the most important examples of Romanesque architecture north of the Alps. The Cathedral had previously been subjected to laser-scanning and recording with 360° panoramic photography by the Photogrammetry & Laser Scanning lab of HafenCity University Hamburg in 2010. With the recent rapid development of consumer VR technology, it was subsequently decided to investigate how these two data sources could be combined within an immersive VR application for tourism and for architectural heritage preservation. A specialised technical workflow was developed to build the virtual environment in Unreal Engine 4 (UE4) and integrate the panorama photographs so as to ensure the seamless integration of these two datasets. A simple mechanic was developed using the native UE4 node-based programming language to switch between these two modes of visualisation.

14 pages, 9172 KiB  
Article
Semantic 3D Reconstruction with Learning MVS and 2D Segmentation of Aerial Images
by Zizhuang Wei, Yao Wang, Hongwei Yi, Yisong Chen and Guoping Wang
Appl. Sci. 2020, 10(4), 1275; https://doi.org/10.3390/app10041275 - 14 Feb 2020
Cited by 6 | Viewed by 5052
Abstract
Semantic modeling is a challenging task that has received widespread attention in recent years. With the help of mini Unmanned Aerial Vehicles (UAVs), multi-view high-resolution aerial images of large-scale scenes can be conveniently collected. In this paper, we propose a semantic Multi-View Stereo (MVS) method to reconstruct 3D semantic models from 2D images. Firstly, the 2D semantic probability distribution is obtained by a Convolutional Neural Network (CNN). Secondly, the calibrated camera poses are determined by Structure from Motion (SfM), while the depth maps are estimated by learning-based MVS. Combining 2D segmentation and 3D geometry information, dense point clouds with semantic labels are generated by a probability-based semantic fusion method. In the final stage, the coarse 3D semantic point cloud is optimized by both local and global refinements. By making full use of the multi-view consistency, the proposed method efficiently produces a fine-level 3D semantic point cloud. The experimental result evaluated by re-projection maps achieves 88.4% Pixel Accuracy on the Urban Drone Dataset (UDD). In conclusion, our graph-based semantic fusion procedure and refinement based on local and global information can suppress and reduce the re-projection error.
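
A minimal sketch of probability-based semantic fusion, as a rough illustration of the step described above: each 3D point accumulates the log-probabilities of the 2D semantic segmentations in every view that observes it, then takes the most likely class. The point-to-view visibility and the per-pixel class distributions are random placeholders here; in practice they come from the MVS and CNN stages.

```python
import numpy as np

# Fuse per-view 2D class distributions into one semantic label per 3D point.
rng = np.random.default_rng(1)
n_points, n_views, n_classes = 1000, 5, 4

# prob_maps[v][i] = class distribution of point i's pixel in view v
# (placeholder Dirichlet samples standing in for sampled CNN outputs).
prob_maps = rng.dirichlet(np.ones(n_classes), size=(n_views, n_points))
visible = rng.random((n_views, n_points)) > 0.3   # which views see which points

log_sum = np.zeros((n_points, n_classes))
for v in range(n_views):
    # Accumulate log-probabilities only for points visible in this view.
    log_sum[visible[v]] += np.log(prob_maps[v][visible[v]] + 1e-12)

labels = log_sum.argmax(axis=1)                   # fused semantic label per point
```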

13 pages, 2724 KiB  
Article
Semantic 3D Reconstruction for Robotic Manipulators with an Eye-In-Hand Vision System
by Fusheng Zha, Yu Fu, Pengfei Wang, Wei Guo, Mantian Li, Xin Wang and Hegao Cai
Appl. Sci. 2020, 10(3), 1183; https://doi.org/10.3390/app10031183 - 10 Feb 2020
Cited by 9 | Viewed by 4497
Abstract
Three-dimensional reconstruction and semantic understanding have attracted extensive attention in recent years. However, current reconstruction techniques mainly target large-scale scenes, such as indoor environments or self-driving cars. There are few studies on small-scale, high-precision scene reconstruction for manipulator operation, which plays an essential role in decision-making and intelligent control systems. In this paper, a group of images captured from an eye-in-hand vision system carried on a robotic manipulator is segmented by deep learning and geometric features, and a semantic 3D reconstruction is created using a map stitching method. The results demonstrate that the quality of the segmented images and the precision of the semantic 3D reconstruction are effectively improved by our method.

9 pages, 2373 KiB  
Article
3D Face Model Super-Resolution Based on Radial Curve Estimation
by Fan Zhang, Junli Zhao, Liang Wang and Fuqing Duan
Appl. Sci. 2020, 10(3), 1047; https://doi.org/10.3390/app10031047 - 5 Feb 2020
Cited by 3 | Viewed by 2300
Abstract
Consumer depth cameras bring about cheap and fast acquisition of 3D models. However, the precision and resolution of these consumer depth cameras cannot satisfy the requirements of some 3D face applications. In this paper, we present a super-resolution method for reconstructing a high resolution 3D face model from a low resolution 3D face model acquired from a consumer depth camera. We used a group of radial curves to represent a 3D face. For a given low resolution 3D face model, we first extracted radial curves on it, and then estimated their corresponding high resolution ones by radial curve matching, for which Dynamic Time Warping (DTW) was used. Finally, a reference high resolution 3D face model was deformed to generate a high resolution face model by using the radial curves as the constraining feature. We evaluated our method both qualitatively and quantitatively, and the experimental results validated our method.
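
The DTW step used for radial curve matching can be illustrated with a minimal implementation: align two curves sampled as point sequences and return the accumulated alignment cost. The test curves below are synthetic placeholders, not face data.

```python
import numpy as np

# Minimal dynamic time warping (DTW) between two curves, each sampled as a
# sequence of 3D points. Returns the accumulated alignment cost.
def dtw(curve_a, curve_b):
    n, m = len(curve_a), len(curve_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(curve_a[i - 1] - curve_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

t = np.linspace(0, 1, 50)
low_res = np.stack([t, np.sin(2 * np.pi * t), np.zeros_like(t)], axis=1)
candidate = low_res[::2] + 0.01     # coarser, slightly shifted version
print(dtw(low_res, candidate))     # small cost: the curves align well
```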

13 pages, 13598 KiB  
Article
Projection-Based Augmented Reality Assistance for Manual Electronic Component Assembly Processes
by Marco Ojer, Hugo Alvarez, Ismael Serrano, Fátima A. Saiz, Iñigo Barandiaran, Daniel Aguinaga, Leire Querejeta and David Alejandro
Appl. Sci. 2020, 10(3), 796; https://doi.org/10.3390/app10030796 - 22 Jan 2020
Cited by 31 | Viewed by 5284
Abstract
Personalized production is moving the progress of industrial automation forward, and demanding new tools for improving the decision-making of the operators. This paper presents a new, projection-based augmented reality system for assisting operators during electronic component assembly processes. The paper describes both the hardware and software solutions, and depicts the results obtained during a usability test with the new system.

12 pages, 1867 KiB  
Article
Automatic Lip Reading System Based on a Fusion Lightweight Neural Network with Raspberry Pi
by Jing Wen and Yuanyao Lu
Appl. Sci. 2019, 9(24), 5432; https://doi.org/10.3390/app9245432 - 11 Dec 2019
Cited by 5 | Viewed by 4834
Abstract
Virtual Reality (VR) is a kind of interactive experience technology. Human vision, hearing, expression, voice and even touch can be added to the interaction between humans and machines. Lip reading recognition is a new technology in the field of human-computer interaction with broad development prospects. It is particularly important in noisy environments and for the hearing-impaired population, and it works by using visual information from a video to make up for the deficiency of voice information. This information is a visual language that benefits from Augmented Reality (AR). The purpose is to establish an efficient and convenient way of communication. However, the traditional lip reading recognition system has high requirements on the running speed and performance of the equipment because of its long recognition process and large number of parameters, so it is difficult to meet the requirements of practical application. In this paper, a mobile lip-reading recognition system based on the Raspberry Pi is implemented for the first time, and the recognition application represents the latest level of our research. Our mobile lip-reading recognition system can be divided into three stages: First, we extract key frames from our own independent database, and then use a multi-task cascaded convolutional network (MTCNN) to correct the face, so as to improve the accuracy of lip extraction. In the second stage, we use MobileNets to extract lip image features and long short-term memory (LSTM) to extract sequence information between key frames. Finally, we compare three lip reading models: (1) a fusion model of Bi-LSTM and AlexNet; (2) a fusion model with an attention mechanism; and (3) the LSTM and MobileNets hybrid network model proposed by us. The results show that our model has fewer parameters and lower complexity. The accuracy of the model on the test dataset is 86.5%. Therefore, our mobile lip reading system is simpler and smaller than those of other PC platforms and saves computing resources and memory space.
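
The MobileNets + LSTM hybrid described above follows a common pattern: a lightweight CNN encodes each lip key frame, an LSTM models the frame sequence, and a linear head classifies the word. The sketch below uses torchvision's mobilenet_v2 as a stand-in backbone with illustrative sizes; it is not the paper's exact network.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

# CNN-per-frame + LSTM-over-sequence word classifier (illustrative sizes).
class LipReader(nn.Module):
    def __init__(self, n_words=10, feat_dim=1280, hidden=256):
        super().__init__()
        self.cnn = mobilenet_v2(weights=None).features  # per-frame encoder
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_words)

    def forward(self, frames):                 # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        f = self.cnn(frames.flatten(0, 1))     # (B*T, 1280, h, w)
        f = self.pool(f).flatten(1).view(b, t, -1)
        out, _ = self.lstm(f)                  # temporal features per step
        return self.head(out[:, -1])           # classify from the last step

model = LipReader()
logits = model(torch.rand(2, 8, 3, 112, 112))  # 8 key frames per clip
print(logits.shape)                             # torch.Size([2, 10])
```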

16 pages, 2273 KiB  
Article
Design, Application and Effectiveness of an Innovative Augmented Reality Teaching Proposal through 3P Model
by Alejandro López-García, Pedro Miralles-Martínez and Javier Maquilón
Appl. Sci. 2019, 9(24), 5426; https://doi.org/10.3390/app9245426 - 11 Dec 2019
Cited by 14 | Viewed by 4524
Abstract
Augmented reality (AR) has evolved hand in hand with advances in technology, and today is considered an emerging technique in its own right. The aim of our study was to analyze students’ perceptions of how useful AR is in the school environment. A non-experimental quantitative design was used in the form of a questionnaire in which 106 primary sixth-grade students from six schools in the Region of Murcia (Spain) participated. During the study, a teaching proposal using AR related to the content of some curricular areas was put forward in the framework of the 3P learning model. The participants’ perceptions of this technique were analyzed according to each variable, both overall and by gender, via a questionnaire of our own making, which had previously been validated by AR experts, analyzing its psychometric qualities. The initial results indicate that this technique is, according to the students, useful for teaching the curriculum. The conclusion is that AR can increase students’ motivation and enthusiasm while enhancing teaching and learning at the same time.

12 pages, 4228 KiB  
Article
Aroma Release of Olfactory Displays Based on Audio-Visual Content
by Safaa Alraddadi, Fahad Alqurashi, Georgios Tsaramirsis, Amany Al Luhaybi and Seyed M. Buhari
Appl. Sci. 2019, 9(22), 4866; https://doi.org/10.3390/app9224866 - 14 Nov 2019
Cited by 11 | Viewed by 2872
Abstract
The approaches used to release scents in most recent olfactory displays rely on timing for decision making. The applicability of such approaches is questionable in scenarios like video games or virtual reality applications, where the specific content is dynamic in nature and thus not known in advance. All of these applications aim to enhance the experience and involvement of the user while watching or participating virtually in 4D cinemas or fun parks, associated with short films. Recently, associating the release of scents with the visual content of the scenario has been studied. This research enhances one such work by considering the auditory content along with the visual content. Minecraft, a computer game, was used to collect the necessary dataset with 1200 audio segments. The Inception v3 model was used to classify the sound and image datasets. Further ground-truth classification on this dataset resulted in four classes: grass, fire, thunder, and zombie. Higher accuracies of 91% and 94% were achieved using the transfer learning approach for the sound and image models, respectively.
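
The transfer learning setup described above can be sketched with torchvision: reuse an ImageNet-pretrained Inception v3, freeze the backbone, and retrain only the final layer for the four classes. The optimizer, learning rate, and random placeholder batch are assumptions; the paper applies the same idea to both image and sound (e.g., spectrogram) inputs.

```python
import torch
import torch.nn as nn
from torchvision.models import inception_v3, Inception_V3_Weights

# Freeze a pretrained Inception v3 backbone; retrain a 4-class head
# (grass, fire, thunder, zombie). Weights download on first use.
model = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 4)  # new trainable head
model.aux_logits = False                        # return plain logits in training

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(4, 3, 299, 299)                  # Inception v3 expects 299x299
y = torch.tensor([0, 1, 2, 3])                  # placeholder labels

model.train()
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()                                 # gradients flow only into fc
opt.step()
```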

14 pages, 14546 KiB  
Article
Construction Hazard Investigation Leveraging Object Anatomization on an Augmented Photoreality Platform
by Hai Chien Pham, Nhu-Ngoc Dao, Sungrae Cho, Phong Thanh Nguyen and Anh-Tuan Pham-Hang
Appl. Sci. 2019, 9(21), 4477; https://doi.org/10.3390/app9214477 - 23 Oct 2019
Cited by 28 | Viewed by 3813
Abstract
Hazard investigation education plays a crucial role in equipping students with adequate knowledge and skills to avoid or eliminate construction hazards in the workplace. With the emergence of various visualization technologies, virtual photoreality as well as 3D virtual reality have been adopted and proven advantageous in various educational disciplines. Despite the significant benefits of providing an engaging and immersive learning environment to promote construction education, recent research has also pointed out that virtual photoreality lacks 3D object anatomization tools to support learning, while 3D virtual reality cannot provide a real-world environment. In recent years, research efforts have studied these virtual reality applications separately, and there is a lack of research integrating the two technologies to overcome their limitations and maximize their advantages for enhancing learning outcomes. In this regard, this paper develops a construction hazard investigation system leveraging object anatomization on an Interactive Augmented Photoreality platform (iAPR). The proposed iAPR system integrates virtual photoreality with 3D virtual reality. The iAPR consists of three key learning modules, namely the Hazard Understanding Module (HUM), Hazard Recognition Module (HRM), and Safety Performance Module (SPM), which adopt the revised Bloom’s taxonomy theory. A prototype was developed and evaluated objectively through interactive system trials with educators, construction professionals, and learners. The findings demonstrate that the iAPR platform provides significant pedagogic methods for improving learners’ construction hazard investigation knowledge and skills, which improves safety performance.

21 pages, 10609 KiB  
Article
Superpixel-Based Feature Tracking for Structure from Motion
by Mingwei Cao, Wei Jia, Zhihan Lv, Liping Zheng and Xiaoping Liu
Appl. Sci. 2019, 9(15), 2961; https://doi.org/10.3390/app9152961 - 24 Jul 2019
Cited by 2 | Viewed by 3047
Abstract
Feature tracking in image collections significantly affects the efficiency and accuracy of Structure from Motion (SFM). Insufficient correspondences may result in disconnected structures and incomplete components, while redundant correspondences containing incorrect ones may yield folded and superimposed structures. In this paper, we present a superpixel-based feature tracking method for structure from motion. In the proposed method, we first use a joint approach to detect local keypoints and compute descriptors. Second, a superpixel-based approach is used to generate labels for the input image. Third, we combine Speeded-Up Robust Features (SURF) and binary tests in the generated label regions to produce a set of combined descriptors for the detected keypoints. Fourth, locality-sensitive hashing (LSH)-based k-nearest-neighbor (KNN) matching is utilized to produce feature correspondences, and the ratio test is then used to remove outliers from the matching collection. Finally, we conduct comprehensive experiments on several challenging benchmark datasets, including highly ambiguous and duplicated scenes. Experimental results show that the proposed method performs better than state-of-the-art methods.
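
The matching stage described above (binary descriptors, LSH-based KNN, ratio test) can be sketched with OpenCV. ORB stands in for the paper's combined SURF/binary-test descriptor, which is not reproduced here, and the random images are stand-ins for a real overlapping photo pair.

```python
import cv2
import numpy as np

# Stand-in images; in practice, load two photographs of the same scene.
img1 = np.random.randint(0, 255, (480, 640), np.uint8)
img2 = np.random.randint(0, 255, (480, 640), np.uint8)

# Binary keypoint descriptors (ORB stands in for the paper's descriptor).
orb = cv2.ORB_create(nfeatures=4000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# FLANN with an LSH index performs approximate KNN on binary descriptors.
FLANN_INDEX_LSH = 6
matcher = cv2.FlannBasedMatcher(
    dict(algorithm=FLANN_INDEX_LSH, table_number=6, key_size=12,
         multi_probe_level=1),
    dict(checks=50))
knn = matcher.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if it clearly beats the runner-up.
good = []
for pair in knn:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

print(f"{len(good)} correspondences survive the ratio test")
```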

Review

17 pages, 2416 KiB  
Review
Analysis of the Productive, Structural, and Dynamic Development of Augmented Reality in Higher Education Research on the Web of Science
by Jesús López Belmonte, Antonio-José Moreno-Guerrero, Juan Antonio López Núñez and Santiago Pozo Sánchez
Appl. Sci. 2019, 9(24), 5306; https://doi.org/10.3390/app9245306 - 5 Dec 2019
Cited by 55 | Viewed by 4396
Abstract
Augmented reality is an emerging technology that has gained great relevance thanks to the benefits of its use in learning spaces. The present study focuses on determining the performance and scientific production of augmented reality in higher education (ARHE). A bibliometric methodology for scientific mapping was used, based on processes of estimation, quantification, analytical tracking, and evaluation of scientific research, taking as its reference the analysis protocols included in the Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P) matrix. A total of 552 scientific publications on the Web of Science (WoS) were analyzed. Our results show that scientific production on ARHE is not abundant, tracing its beginnings to the year 1997, with its most productive period beginning in 2015. The most abundant studies are communications and articles (generally in English), with a wide thematic variety in which the bibliometric indicators “virtual environments” and “higher education” stand out. The main sources of origin are the International Technology, Education and Development Conference (INTED) Proceedings and the Education and New Learning Technologies (EDULEARN) Proceedings, although Spanish institutions are the most prolific. In conclusion, studies related to ARHE in the WoS have become increasingly abundant since the field’s research inception in 1997 (and especially since 2009), dealing with a wide thematic variety focused on “virtual environments” and “higher education”; most manuscripts are written in English (communications and articles) and originate from Spanish institutions. The main limitation of the study is that the results only reveal the status of this issue in the WoS database.
