Article

Real-Time Motion Adaptation with Spatial Perception for an Augmented Reality Character

1 School of IT Convergence, University of Ulsan, Ulsan 44610, Republic of Korea
2 VR/AR Content Research Section, Communications & Media Research Laboratory, Electronics and Telecommunications Research Institute (ETRI), Daejeon 34129, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(2), 650; https://doi.org/10.3390/app14020650
Submission received: 21 November 2023 / Revised: 5 January 2024 / Accepted: 10 January 2024 / Published: 12 January 2024
(This article belongs to the Special Issue Future Information & Communication Engineering 2023)

Abstract

Virtual characters are now widely used in games, computer-generated (CG) movies, virtual reality (VR), and communication media. Continued technological innovation in motion capture means that increasingly natural representations of a three-dimensional character's motion are achievable. Many researchers have investigated how virtual characters interact with their surrounding environment through spatial relationships, which have been used to adapt and preserve character motion. However, several technical problems must still be resolved to control characters in augmented reality (AR) environments that are combined with the real world, in particular adapting original motion datasets to environmental differences. In this paper, we investigate a novel method of automatic motion adaptation for a virtual character in AR environments. We used recognition of specific objects (e.g., a puddle) and the spatial properties of the user's surrounding space, such as object types and positions, and ran validation experiments to confirm that accurate motion improves the AR experience. Our experimental study showed positive results in terms of smooth motion in AR configurations. We also found that participants using AR felt a greater sense of co-presence with the character when adapted motion was used.

1. Introduction

Augmented reality (AR) is increasingly used as a technology that integrates digital information and virtual objects with real environments [1,2]. AR systems have recently been applied in various fields, using digital content to combine the real and virtual worlds [3,4]. For example, Pokémon Go merged real-world activity with interaction with digital objects (e.g., virtual characters) in the user's surroundings [5]. More recently, many researchers in the AR field have begun to actively study how to improve users' perception by resolving the environmental differences between real and virtual spaces, such as lighting conditions [6] and geometric relationships [7]. AR systems based on a virtual character that moves and interacts allow a stronger sense of being there with the character; however, it is difficult to match the character's motion to the real space [8]. Recent research has made many attempts to resolve differences in the spatial configuration of the real world through motion adaptation [9], but there has been little study of how to perform such adaptation using spatial perception in practical deployments. In particular, it is important to match the virtual character's motion to the spatial configuration so that it reflects the environmental conditions of the real world [10]. To overcome this problem, we present a novel method of real-time motion adaptation for an AR character using spatial perception, based on the relationships between a virtual character and the configuration of its surrounding environment, and we introduce a procedural approach for its implementation, which has not been studied previously. Thus, we investigate a method of automatic motion adaptation that adjusts original motion datasets to environmental differences. This paper also proposes a context-aware AR adaptation method for improving participants' AR experience.
Figure 1 shows an example of adapting character motion in the AR system. The character can naturally match and adapt to the spatial configuration of the participant's environment. For example, a participant in the AR environment can interact with the virtual character, and the character should then exhibit natural movements adapted to the real configuration (e.g., ice vs. puddle). Our method focuses on adaptive motion in the AR space rather than on geometrically exact motion. We identified real objects in the environment, and the AR character was then made to move according to the real environmental situation in the context of the AR participant. To adapt the motion of the AR character to the real configuration, we predicted real objects using a detection process, selected the adaptive motion by considering the location of the detected object and its characteristics, and finally estimated the 3D position for the AR character.
The remainder of this paper is structured as follows: Section 2 reviews related work. Section 3 presents the overall system configuration, and Section 4 provides a detailed explanation of the implementation of the proposed approach. Section 5 presents the evaluation of our approach in terms of adapting to environmental conditions and reports the main results, including those related to co-presence. Finally, Section 6 concludes the paper and suggests future research directions.

2. Related Work

Here, we describe three topics related to the main theme of this work: interactive augmented reality technologies, AR character models, and motion adaptation in AR environments.

2.1. Interactive Augmented Reality Technologies

To develop interactive AR content, earlier representative studies used fiducial marker-based methods [11], merging a virtual character with a specific marker in the real environment. This approach was used to obtain spatially registered results (or virtual objects) from the participant's viewpoint or from captured images. Recently, advances in sensing technologies have made markerless AR tracking systems much more feasible [12]. Reitmayr and Drummond investigated a method for rendering a virtual object registered to the shape of a physical object itself, allowing the AR system to determine the position and orientation of the virtual object using the natural feature points of the object's shape [13]. Other researchers have focused on realistic representation to visualize virtual objects that are indistinguishable from the perceived real space, for example, estimating real-world lighting [14,15], simulating shadows [16], occlusion-based rendering [17], and real-time global illumination [18]. In this study, we are interested in the motion of the virtual character in the AR environment. In particular, further research is needed to improve the accuracy of spatial configurations for motion adaptation of AR characters, as there have been no applicable results that apply motion adaptation to a given spatial configuration of the AR environment.

2.2. Augmented Reality (AR) Character

With recent developments in sensing methods for creating 3D character datasets, such as reconstructed characters that resemble actual participants, AR character systems have evolved to create computer-generated content in a realistic way [19]. Feng et al. presented an example of rapid capture that reconstructs a scanned 3D character, including texture information, using a depth sensor [20]. The same authors also presented an automatic rigging method for creating a character's animation by embedding a suitable skeleton into the 3D character datasets [21].
Another concern is the use of motion datasets for character animation. Wang et al. investigated optical motion capture systems that record movements from specific locations on the body [22]. Recent advances in deep-learning-based human pose estimation and in motion retargeting, which preserves captured results across the differences between the actor's body size and the target character's skeleton, have become quite mature [23]. With these approaches, augmented reality (AR) characters spatially superimposed onto the real world can interact with the participant's surrounding environment and enhance the sense of co-presence, namely the sense of being in the same place together [24]. Here, we present a method of automatic motion adaptation that provides accurate motion of an AR character using the spatial properties of the user's surrounding space to enhance the AR experience. We hope that this study will serve as a reference for future AI-based motion control research in real-world situations.

2.3. Motion Adaptation in AR Environments

Previously, many researchers in the AR field strove to register a virtual object at an estimated position and orientation using various physical objects (e.g., a fiducial marker or the 3D physical shape itself) [25]. More recently, several studies have focused on adapting a virtual object to corresponding information from real situations (e.g., lighting conditions, real objects) [14]. Interestingly, recent results on adaptation to physical situations showed that it benefits participants' experience in AR environments [26]. In one notable piece of work, Jo et al. introduced motion adaptation of a teleported character to a physical configuration, addressing the differences in the real environments of two remote sites for an AR-based 3D teleconference [12]. The authors based their adaptation technique on spatial properties of the real world and found that it could provide an enhanced AR experience. However, adaptive motion studies have usually been conducted in VR environments for retargeting, and no comprehensive work in the AR area has considered the situation of given objects in the real world. Therefore, further research is needed to clarify implementation methods.

3. System Overview and Approach

Figure 2 illustrates an example of adaptive motion control for a puddle in a real situation. The movements of the adapted AR character depend on the spatial configuration. We also needed to identify real-world objects and estimate their 3D positions in AR [27], assuming that the character could be positioned in various real environments. After transformations such as the position and rotation of the character are decided, the character adapts its motion to its registered position in the AR environment. To represent natural movement, the AR character's motion is adapted to real-world situations based on the context of the real space, and the 3D position of the AR character is estimated.
Figure 3 describes our automatic motion adaptation process for the AR character in real environments. To adjust the motion to match the real situation, we first predict the objects in the real world using a detection process. If the score of a detected object is higher than a pre-defined threshold (T) for object recognition, adaptive motion is applied in the AR environment instead of the character's original motion. The threshold value, which reflects the expected performance of object recognition, is set depending on environmental conditions such as lighting. We then select the corresponding character motion by considering the location of the detected object and its characteristics, and estimate the 3D position from 2D images (or video) to determine the location where the virtual character should be superimposed. For example, an AR character with an original movement (e.g., normal walking) generates a suitable motion sequence (e.g., careful walking) at the location of the recognized object. As shown in Figure 3, we recognized real objects, such as puddles, from a large number of real images and selected the character motion that matched the real-world context from several motion candidates; a minimal sketch of this decision step is shown below. Detailed information is provided in Section 4.
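The decision step in Figure 3 can be summarized as a simple rule: if a detected object's confidence exceeds the threshold T, the adaptive motion associated with that object type replaces the original motion; otherwise, the original motion is kept. The following Python sketch illustrates this logic under simplified assumptions; the function and clip names (e.g., careful_walk) are hypothetical and not part of the paper's implementation.

```python
# Illustrative sketch (not the authors' code): choosing between original and
# adaptive motion based on object detection confidence, as in Figure 3.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str          # e.g., "puddle"
    confidence: float   # detector score in [0, 1]
    bbox: tuple         # (x_min, y_min, x_max, y_max) in image pixels

# Hypothetical mapping from recognized object types to adaptive motion clips.
ADAPTIVE_MOTIONS = {
    "puddle": "careful_walk",
    "mud": "detour_walk",
}

def select_motion(detections, original_motion="normal_walk", threshold=0.5):
    """Return the motion clip to play and the detection that triggered it."""
    best = max(detections, key=lambda d: d.confidence, default=None)
    if best is not None and best.confidence > threshold and best.label in ADAPTIVE_MOTIONS:
        return ADAPTIVE_MOTIONS[best.label], best
    return original_motion, None

# Example usage with a mock detection result.
dets = [Detection("puddle", 0.82, (140, 260, 320, 400))]
motion, trigger = select_motion(dets, threshold=0.5)
print(motion)  # -> "careful_walk"
```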

4. Implementation

Algorithm 1 shows the overall implementation. First, we performed object recognition to express the AR character's movements and apply them to real-world situations using context information. Note that object detection identifies the location of a recognized object in an image or video [28]. Figure 4 shows examples of our results for puddle objects detected in real-life images. Among the many possible real-world objects, we selected the puddle as the most suitable object for adaptive motion control of the AR character, and we used 1000 labeled puddle images to train the object detector to identify this specific object in an image. When a client running our AR content sends a request for object recognition, the server processes it using the trained YOLOv5 model [29]. YOLOv5 is an object detection model from the You Only Look Once family of computer vision methods, commonly applied to images or video captured in the real world. The AR environment requires real-time object detection with high recognition accuracy [30], and the object recognition method must also be able to detect multiple objects that may affect the character simultaneously [31]. Because of these constraints in AR content, we selected YOLOv5 from among the related object recognition methods [32]. In the recognition results, the area of each recognized object is indicated by a red square (or bounding box), and the location of the detected object is returned by the server. We then configured the recognition result as the X and Y position of the top-left corner, the X and Y position of the bottom-right corner, and the confidence level for the object in the given image.
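As an illustration of the server-side detection step described above, the following sketch loads a YOLOv5 model through the standard ultralytics/yolov5 PyTorch Hub interface and extracts bounding boxes in the (top-left X/Y, bottom-right X/Y, confidence) form just described. This is a minimal example assuming a custom-trained puddle weight file (the name puddle_yolov5.pt is hypothetical), not the authors' actual server code.

```python
# Minimal sketch of puddle detection with YOLOv5 via PyTorch Hub.
# Assumes the ultralytics/yolov5 repository interface; the weight file name
# "puddle_yolov5.pt" is hypothetical.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="puddle_yolov5.pt")

def detect_puddles(image_path, conf_threshold=0.5):
    """Run detection and return (x_min, y_min, x_max, y_max, confidence) tuples."""
    results = model(image_path)
    boxes = []
    # results.xyxy[0] rows: x1, y1, x2, y2, confidence, class index
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        if conf >= conf_threshold:
            boxes.append((x1, y1, x2, y2, conf))
    return boxes

if __name__ == "__main__":
    for box in detect_puddles("street_scene.jpg"):
        print(box)
```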
Algorithm 1. Motion adaptation algorithm of the AR character determined by spatial properties of the real-world configuration
Input: Original motions and configurations of the AR character’s surrounding environment
Output: Adapted motions
Function Adapted_motion(detected_object, original_motion)
    // Estimate the newly updated motion to apply to the detected object
    if motion_transition_value > detection_threshold then
        // Calculate the 3D position/rotation/scale factors to be superimposed
        3D_transformation ← depth maps estimated from multi-view 2D images
        for each motion m in all motions M do
            // Match the character behavior using fuzzy inference
            updated_character_motion ← original_motion + detected_object
        end for
    end if
    return adapted motion for the animation controller, 3D transformation data
End Function
Second, we estimated the 3D position from 2D images (or video) to calculate the location where the AR character would be registered and superimposed. We adopted Godard et al.'s method of monocular depth estimation [33], which predicts the 3D shape of the scene from the viewpoints of several input images and uses a view-synthesis approach, estimating the appearance of a target image from the viewpoint of another image, to construct depth maps. Using the reconstructed depth map (or spatial map) captured from the actual environment together with the object recognition result, we estimated the relative spatial quantities, namely the six-degree-of-freedom (DOF) position and orientation of the camera in the user's mobile computing device. In short, we estimated depth in the given space from relative poses using multi-view methods and calculated the 3D spatial location of the camera; we also used interpolation to correct 3D positioning errors. The virtual object, namely the AR character, was then registered by applying transformations (e.g., position, rotation, and scale) based on the predicted camera position, which is the process of synthesizing a virtual object accurately at a specific location in the real environment.
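To make the registration step concrete, the sketch below back-projects the center of a detected bounding box into a 3D camera-space point using an estimated depth map and pinhole camera intrinsics. This is a generic illustration of the geometry involved, with assumed intrinsic values, not the specific estimation pipeline used in the paper.

```python
# Illustrative back-projection of a detected object's bounding-box center into
# a 3D camera-space point, given a per-pixel depth map and pinhole intrinsics.
# The intrinsic values below are assumed for illustration only.
import numpy as np

def bbox_center(box):
    """box = (x_min, y_min, x_max, y_max) in pixels."""
    x_min, y_min, x_max, y_max = box
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0

def backproject(u, v, depth_map, fx, fy, cx, cy):
    """Return the 3D point (X, Y, Z) in camera coordinates for pixel (u, v)."""
    z = float(depth_map[int(round(v)), int(round(u))])  # depth in meters
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example with a synthetic 480x640 depth map (constant 3 m) and assumed intrinsics.
depth = np.full((480, 640), 3.0)
fx = fy = 525.0
cx, cy = 320.0, 240.0
u, v = bbox_center((140, 260, 320, 400))
anchor_point = backproject(u, v, depth, fx, fy, cx, cy)
print(anchor_point)  # camera-space position where the AR character could be placed
```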
As a final step in applying adaptive motion, the AR character with its original movements (e.g., walking) generates appropriate animation sequences at the location of the recognized object. As already mentioned, we focused on applying a specific movement according to context rather than to the geometric 3D shape of the real environment. To match the AR character's animation, we modified a toolkit developed in previous research to suit the purpose of this study [24]. Our revised toolkit for AR character control produces suitable animation data via fuzzy logic, and the resulting character responses are expressed as natural movements over the detected area. Using this form of many-valued logic with fuzzy inference, we matched adaptive motions and adjusted the AR character's behavior according to the context of the real-life condition [24]; an illustrative sketch of such rule-based selection is given after this paragraph. We updated the transformation in the Unity3D C# scripting language, applying object manipulation to the geometric descriptions (e.g., position, rotation, and size) of a Mixamo 3D model [34,35]; the animated Mixamo character also moves similarly to a real person, and we scaled the character to life size so that it looks like a person in the real environment. Figure 5 shows an example of the test conditions for the adaptive motion. In this study, the AR participant used a Galaxy Tab S8 Ultra 5G device (Samsung, Suwon, Republic of Korea), which can be used portably in an outdoor environment, running Android 13 to view and interact with the character and the real environment in Unity3D [36]. As noted earlier, we intend to conduct future work on adapting animated movements to match 3D geometric spatial information, such as when multiple objects are recognized simultaneously or are occluded by other objects. Additionally, the AR character should exhibit behavior that naturally matches the user's environment [37], which requires improved motions with real-life adaptation in various situations for practical applicability. Moreover, future research should pursue a more suitable expression of the AR character using a deep-learning-based approach.
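As an example of how a fuzzy-style rule base can map context to a motion choice, the sketch below defines simple triangular membership functions over detection confidence and the character's distance to the detected object, and combines two rules to pick between the original and adapted walking clips. The membership shapes, rule set, and clip names are illustrative assumptions; they are not the rules used in the authors' toolkit [24].

```python
# Illustrative fuzzy-style selection between original and adapted motion.
# Membership functions, rules, and clip names are assumptions for this sketch.

def tri(x, a, b, c):
    """Triangular membership function peaking at b over the interval [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def choose_motion(confidence, distance_m):
    """Return ('careful_walk' | 'normal_walk', activation strength)."""
    conf_high = tri(confidence, 0.4, 1.0, 1.6)    # degree to which confidence is "high"
    dist_near = tri(distance_m, -1.0, 0.0, 2.5)   # degree to which the object is "near"

    # Rule 1: confidence is high AND object is near -> adapted (careful) walking.
    adapted = min(conf_high, dist_near)
    # Rule 2: otherwise keep the original walking motion.
    original = 1.0 - adapted

    return ("careful_walk", adapted) if adapted > original else ("normal_walk", original)

print(choose_motion(confidence=0.85, distance_m=1.0))  # -> ('careful_walk', 0.6)
print(choose_motion(confidence=0.30, distance_m=5.0))  # -> ('normal_walk', 1.0)
```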
Figure 5 also shows the AR character's motion adjusted to the detected object in the real environment. The AR character can walk in a desired place and express natural behavior that adapts to the context of the environment as determined by the spatial properties. In the next section, we discuss the experiments conducted to validate our results with the AR character.

5. Experiments and Results

We developed real-time motion adaptation that provides appropriate character animation behavior using spatial perception for an AR character, focusing on context awareness using object detection in the real environment. We conducted two comparative experiments: one on object detection accuracy and one on participants' co-presence.

5.1. Experiment 1: Object Detection Accuracy for an AR Environment

First, we evaluated object detection accuracy to estimate performance in a real-world situation, using puddles as examples of the real environment. Our evaluation determined which situations provided more accurate object detection. Figure 6 shows the accuracy as a function of the position and size of the objects in a given image of the real environment (one way to compute these normalized values is sketched below). More accurate results were obtained when the target object was in the center of the input image (see Figure 6, left). In this test condition, the experiment was set up with the same sliding window size (i.e., 50 × 50). We also found that detection performance was better when smaller objects in the image were recognized (see Figure 6, right). Darker colors in Figure 6 indicate more accurate results.
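For clarity, normalized position and size values of the kind reported in Figure 6 can be computed from a detection's bounding box and the image dimensions as follows; this is a small illustrative helper, not code from the study, and the exact binning used for Figure 6 may differ.

```python
# One possible normalization of bounding-box position and size
# (the exact definition used for Figure 6 may differ).
def normalized_position_and_size(box, image_width, image_height):
    """box = (x_min, y_min, x_max, y_max) in pixels; returns values in [0, 1]."""
    x_min, y_min, x_max, y_max = box
    center_x = ((x_min + x_max) / 2.0) / image_width
    center_y = ((y_min + y_max) / 2.0) / image_height
    rel_width = (x_max - x_min) / image_width
    rel_height = (y_max - y_min) / image_height
    return (center_x, center_y), (rel_width, rel_height)

# Example: a 128 x 96 px puddle detection near the center of a 640 x 480 image.
print(normalized_position_and_size((256, 192, 384, 288), 640, 480))
# -> ((0.5, 0.5), (0.2, 0.2))
```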
Figure 7 shows an example of detection errors in various situations. In the experiment, performance was evaluated in three easily confused situations (i.e., water/puddle, slightly wet ground, and dry road). The right side of Figure 7 shows a confusion matrix of the overall performance, with darker colors indicating more frequently confused cases in our detection scenario. These errors were caused by the influence of reflected light. For example, the probability of wrongly recognizing wet ground in an image as an ordinary road was 65%. In certain situations, it was possible to determine what the object was but not exactly where the object's area was (see Figure 7, left). There were many situations in which our predictions turned out to be incorrect; for instance, water was often incorrectly predicted as road. We leave overcoming this difficulty to future research. As the work most closely related to ours, Hui et al. focused on user-defined motion gestures for a virtual character interacting with real environments in augmented reality (AR) and generated corresponding motions using a mobile AR interface [10]. In a usability study, they evaluated the accuracy and editing time of user-created character motions in given real environments. In our work, we investigated the accuracy of object detection and the co-presence of the AR character for automated motion generation. The next section presents the results of the user experiment on co-presence.

5.2. Experiment 2: Qualitative Comparison with Co-Presence

One essential way to evaluate AR environments is to measure the feeling of being in the space with the AR character [38]. To measure participant co-presence, we investigated how the character responded under two conditions (adapted motion on or off) in two different environments (puddle or muddy conditions in a real environment). In the experiment, we presented the participants with given scenarios on a portable AR device, similar to those used in previous interaction experiments with virtual characters [10]. Most participants were university students; twenty paid participants (15 men and 5 women) with an average age of 26.5 years took part in the within-subject experiment, which was conducted in a single session. For consistency, sessions were run at the same time of day on several days under the same weather conditions (i.e., sunny days). All participants reported prior experience with immersive virtual reality headsets. As previously mentioned, the AR participant used the Galaxy Tab S8 Ultra 5G device, which can easily be used in an outdoor environment. To measure the feeling of being in the same space with the AR character, which is called co-presence, the AR character was operated via a context-aware control method that adapts its motion to the real environment. After the participants had completed all tasks, their perceptions were measured using a qualitative questionnaire in which they expressed how much they agreed or disagreed on a seven-point Likert scale [39]. Figure 8 shows the participants' co-presence results for the different conditions (i.e., adapted or nominal/original motion) and scenarios (i.e., puddle or muddy situations), with pairwise comparisons using two-sample t-tests; a minimal example of this analysis is sketched below. In the puddle scenario, the two-sample t-test revealed a main effect at p < 0.05 in the level of co-presence (see Figure 8, left). However, we found no statistically significant main effect in the muddy scenario (see Figure 8, right). In certain cases, participants showed greater co-presence when adapted motion was used. When asked how they evaluated the resulting score, 12 participants (60%) answered that they judged it by similarity to the actual situation. For example, some subjects answered that the character's motion should be adapted to situations similar to everyday life: in the puddle example, participants expected the AR character to walk carefully to avoid splashing, whereas in the muddy case, they answered that the character should not walk through the muddy space but should take a different path. Accordingly, additional research on the interaction responses of a virtual object such as a 3D character, depending on everyday situations, will be needed. In the future, we will also need to analyze the situations in which higher co-presence can be achieved, for example, by finding out where automatic motion adaptation is most helpful in real-world scenarios.
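For reference, a two-sample t-test of the kind used for the pairwise comparisons in Figure 8 can be run as follows; the score arrays here are made-up placeholder values on a seven-point scale, not the collected data.

```python
# Illustrative two-sample t-test for co-presence scores (placeholder data only).
from scipy import stats

# Hypothetical seven-point Likert ratings for one scenario (NOT the study data).
adapted_motion = [6, 5, 6, 7, 5, 6, 6, 5, 7, 6]
original_motion = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]

t_stat, p_value = stats.ttest_ind(adapted_motion, original_motion)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference in co-presence is statistically significant at p < 0.05.")
```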
Through open follow-up questions, we also learned that participants' prior experiences can significantly influence co-presence in AR environments. For example, participants who had developed mobile AR expected the AR character to naturally adapt its motion to real-world situations; otherwise, they said, co-presence with the AR character was broken. Therefore, future research should analyze how the required accuracy of the character motion depends on user experience, and research on motion editing for natural movements will also be needed.

5.3. Discussion and Limitations

We studied a practical approach to adapted motion and its benefits for an AR character. In this study, we ran comparative validation experiments to measure detection accuracy and the effect of adapted motion. We found that the sliding window size and the position of the detection area have a significant effect on object recognition performance in an augmented reality environment. We also found that the adapted motion of the augmented reality character had a positive effect on AR participants. However, several difficulties arose when using the approach in real-world applications because of accuracy issues in object recognition, such as light reflections. It will therefore be necessary to make accurate object predictions based on sufficient datasets covering various real-world situations. More accurate AR registration in outdoor environments will also require methods based on more sensors (e.g., multiple cameras, depth sensors) and more datasets (e.g., image data, labeling, and point clouds) for deep-learning approaches, as in the field of computer vision [40]. Moreover, realistic visualization issues in the 3D representation of the physical environment, related to depth perception between real-world objects and the virtual character, will need to be resolved; for example, it would be helpful to apply the occlusion handling approach suggested by Alfakhori et al. [17].
Figure 9 shows examples of how the AR character's motion could be adapted more suitably. Our system needs to handle cases that require more accurate context than the recognition capability of object detection in the real environment provides (e.g., adapting motion speed and jump height to the situation). Thus, we will refine the details of the AR character's movements for more natural responses, and we will extend our method to account for the friction coefficient of the terrain (e.g., the difference between ice and mud). With these results, we have presented a preliminary approach to adapting a character's motions in the AR environment and have partly evaluated its effectiveness for a future AR scenario. To illustrate the significance of the study, we performed two types of comparative experiments. We recognize that object recognition accuracy needs to be improved, as in fundamental computer vision problems. In particular, in qualitative aspects such as the participants' level of co-presence with the AR character, we found that adaptive motion produced positive results in AR setups.

6. Conclusions and Future Research Works

Augmented reality (AR) content is an important medium for providing helpful information about real-life situations. Recently, AR systems based on the movement of three-dimensional characters have begun to be developed more actively, and many approaches have sought to effectively represent natural movement to enhance the user experience. In this study, we focused on adapted motion to improve the participant's sense of co-presence with the AR character and established a type of character motion that adapts to real-world situations. We used object recognition and the context of the real space, ran validation experiments, and evaluated the users' experience of co-presence. Our experimental study revealed positive results in terms of adaptive motion in AR situations. Many aspects still need to be improved for real-life use, including detection quality (e.g., how to recognize situations that may occur in everyday life) and context-based natural motion (e.g., applying physical properties such as friction, gravity, and collision due to the influence of real objects). Nevertheless, our study of the AR character and its potential impact can provide insights into approaches for applications in various fields, such as AR games, teleconference systems, and remote educational systems.
In future work, we plan to expand this system to applications involving motion adaptation in various situations, such as scenarios in which multiple AR characters interact with one another (e.g., crowd simulation using deep reinforcement learning) in the AR user's surroundings, or an AR character who uses a wheelchair. We will also need to improve usability methods (e.g., blurring real people in the video) regarding ethics and privacy, which are among the fundamental issues in augmented reality technologies.

Author Contributions

D.K. conducted the experiments; H.C. implemented the prototype; Y.K. performed the statistical analysis of the results; J.C. organized the research; K.-H.K. designed the study conceptualization; and D.J. contributed to the writing of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Culture, Sports, and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports, and Tourism in 2023 (Project Name: Development of a Virtual Reality Performance Platform Supporting Multiuser Participation and Real-Time Interaction, Project Number: R2021040046, Contribution Rate: 100%).

Institutional Review Board Statement

The participants’ information used in the experiment did not include personal information. Ethical review and approval were not required for the study because all participants were adults and participated willingly. Also, in our study, procedures involved no increase in the level of risk or discomfort associated with normal, routine educational practices in exempt categories.

Informed Consent Statement

Written informed consent was obtained for the user interviews to be used to publish this paper.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available; release on request is subject to the consent of all authors.

Acknowledgments

The authors express sincere gratitude to the users who participated in our experiments. The authors also thank the reviewers for their valuable contributions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rokhsaritalemi, S.; Niaraki, A.; Choi, S. A review on mixed reality: Current trends, challenges, and prospects. Appl. Sci. 2020, 10, 636. [Google Scholar] [CrossRef]
  2. Pejsa, T.; Kantor, J.; Benko, H.; Ofek, E.; Wilson, A. Room2Room: Enabling life-size telepresence in a projected augmented reality environment. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), San Francisco, CA, USA, 27 February–2 March 2016. [Google Scholar]
  3. Jo, D.; Kim, K.-H.; Kim, G.J. Spacetime: Adaptive control of the teleported avatar for improved AR tele-conference experience. Comput. Animat. Virtual Worlds 2015, 26, 259–269. [Google Scholar] [CrossRef]
  4. Yoon, D.; Oh, A. Design of metaverse for two-way video conferencing platform based on virtual reality. J. Inf. Commun. Converg. Eng. 2022, 20, 189–194. [Google Scholar] [CrossRef]
  5. Paavilainen, J.; Korhonen, H.; Alha, K.; Stenros, J.; Joshinen, E.; Mayra, F. The Pokemon GO experience: A location-based augmented reality mobile game goes mainstream. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 6–11 May 2017. [Google Scholar]
  6. Liu, C.; Wang, L.; Li, Z.; Quan, S.; Xu, Y. Real-time lighting estimation for augmented reality via differentiable screen-space rendering. IEEE Trans. Vis. Comput. Graph. 2022, 29, 2132–2145. [Google Scholar] [CrossRef] [PubMed]
  7. Ihsani, A.; Sukardi, S.; Soenarto, S.; Agustin, E. Augmented reality (AR)-based smartphone application as student learning media for javanese wedding make up in central java. J. Inf. Commun. Converg. Eng. 2021, 19, 248–256. [Google Scholar]
  8. Raskar, R.; Welch, G.; Cutts, M.; Lake, A.; Stesin, L.; Fuchs, H. The office of the future: A unified approach to image-based modeling and spatially immersive displays. In Proceedings of the SIGGRAPH, Orlando, FL, USA, 19–24 July 1998. [Google Scholar]
  9. Lehment, N.; Merget, D.; Rigoll, G. Creating automatically aligned consensus realities for AR videoconferencing. In Proceedings of the 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany, 10–12 September 2014. [Google Scholar]
  10. Hui, Y.; Chung, K.; Wanchao, S.; Hongbo, F. ARAnimator: In-situ character animation in mobile AR with user-defined motion gestures. ACM Trans. Graph. 2020, 39, 1–12. [Google Scholar]
  11. Jack, C.; Keyu, C.; Weiwei, C. Comparison of marker-based AR and markerless AR: A case study on indoor decoration system. In Proceedings of the Joint Conference on Computing in Construction, Heraklion, Greece, 4–7 July 2017. [Google Scholar]
  12. Jo, D.; Choi, M. A real-time motion adaptation method using spatial relationships between a virtual character and its surrounding environment. J. Korea Soc. Comput. Inf. 2019, 24, 45–50. [Google Scholar]
  13. Reitmayr, G.; Drummond, T. Going out: Robust model-based tracking for outdoor augmented reality. In Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, Santa Barbara, CA, USA, 22–25 October 2006. [Google Scholar]
  14. Alhakamy, A.; Tuceryan, M. AR360: Dynamic illumination for augmented reality with real-time interaction. In Proceedings of the IEEE 2nd International Conference on Information and Computer Technologies, Kahului, HI, USA, 14–17 March 2019. [Google Scholar]
  15. Maimone, A.; Yang, X.; Dierk, N.; State, A.; Dou, M.; Fuchs, H. General-purpose telepresence with head-worn optical see-through displays and projector-based lighting. In Proceedings of the IEEE Virtual Reality, Orlando, FL, USA, 16–20 March 2013. [Google Scholar]
  16. Osti, F.; Santi, G.; Caligiana, G. Real time shadow mapping for augmented reality photorealistic renderings. Appl. Sci. 2019, 9, 2225. [Google Scholar] [CrossRef]
  17. Alfakhori, M.; Barzallo, J.; Coors, V. Occlusion handling for mobile AR applications in indoor and outdoor scenarios. Sensors 2023, 23, 4245. [Google Scholar] [CrossRef] [PubMed]
  18. Kan, P.; Kaufmann, H. DeepLight: Light source estimation for augmented reality using deep learning. Vis. Comput. 2019, 35, 873–883. [Google Scholar] [CrossRef]
  19. Beck, S.; Kunert, A.; Kulik, A.; Froehlich, B. Immersive group-to-group telepresence. IEEE Trans. Vis. Comput. Graph. 2013, 19, 616–625. [Google Scholar] [CrossRef] [PubMed]
  20. Feng, A.; Shapiro, A.; Ruizhe, W.; Bolas, M.; Medioni, G.; Suma, E. Rapid avatar capture and simulation using commodity depth sensors. In Proceedings of the SIGGRAPH, Vancouver, BC, Canada, 10–14 August 2014. [Google Scholar]
  21. Feng, A.; Casas, D.; Shapiro, A. Avatar reshaping and automatic rigging using a deformable model. In Proceedings of the 8th ACM SIGGRAPH Conference on Motion in Games (MIG), Paris, France, 16–18 November 2015. [Google Scholar]
  22. Wang, L.; Li, Y.; Xiong, F.; Zhang, W. Gait recognition using optical motion capture: A decision fusion based method. Sensors 2021, 21, 3496. [Google Scholar] [CrossRef] [PubMed]
  23. Chatzitofis, A.; Zarpalas, D.; Kollias, S.; Daras, P. DeepMoCap: Deep optical motion capture using multiple depth sensors and retro-reflectors. Sensors 2019, 19, 282. [Google Scholar] [CrossRef] [PubMed]
  24. Kim, D.; Jo, D. Effects on co-presence of a virtual human: A comparison of display and interaction types. Electronics 2022, 11, 367. [Google Scholar] [CrossRef]
  25. Kostak, M.; Slaby, A. Designing a simple fiducial marker for localization in spatial scenes using neural networks. Sensors 2021, 21, 5407. [Google Scholar] [CrossRef]
  26. Wang, X.; Ye, H.; Sandor, C.; Zhang, W.; Fu, H. Predict-and-drive: Avatar motion adaption in room-scale augmented reality telepresence with heterogeneous spaces. IEEE Trans. Vis. Comput. Graph. 2022, 28, 3705–3714. [Google Scholar] [CrossRef]
  27. Ho, E.; Komura, T.; Tai, C. Spatial relationship preserving character motion adaptation. ACM Trans. Graph. 2010, 29, 1–8. [Google Scholar] [CrossRef]
  28. Wang, C.; Zhou, Q.; Fitzmaurice, G.; Anderson, F. VideoPoseVR: Authoring virtual reality character animations with online videos. In Proceedings of the ACM on Human-Computer Interaction, New Orleans, LA, USA, 30 April–5 May 2022. [Google Scholar]
  29. Karthi, M.; Muthulakshmi, V.; Priscilla, R.; Praveen, P.; Vanisri, K. Evolution of YOLO-V5 algorithm for object detection: Automated detection of library books and performance validation of dataset. In Proceedings of the 2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), Chennai, India, 24–25 September 2021. [Google Scholar]
  30. Ghasemi, Y.; Jeong, H.; Choi, S.; Park, K.; Lee, J. Deep learning-based object detection in augmented reality: A systematic review. Comput. Ind. 2022, 139, 103661. [Google Scholar] [CrossRef]
  31. Thalmann, N.M.; Yumak, Z.; Beck, A. Autonomous virtual humans and social robots in telepresence. In Proceedings of the 16th International Workshop on Multimedia Signal Processing (MMSP), Jakarta, Indonesia, 22–24 September 2014. [Google Scholar]
  32. Hendrawan, A.; Gernowo, R.; Nurhayati, O.; Warsito, B.; Wibowo, A. Improvement object detection algorithm based on YoloV5 with BottleneckCSP. In Proceedings of the 2022 IEEE International Conference on Communication, Networks and Satellite (COMNETSAT), Solo, Indonesia, 3–5 November 2022. [Google Scholar]
  33. Godard, C.; Aodha, O.; Firman, M.; Brostow, G. Digging into self-supervised monocular depth estimation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar]
  34. Singh, N.; Sharma, B.; Sharma, A. Performance analysis and optimization techniques in Unity3D. In Proceedings of the 3rd International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 20–22 October 2022. [Google Scholar]
  35. Villegas, R.; Yang, J.; Ceylan, D.; Lee, H. Neural kinematic networks for unsupervised motion retargetting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar]
  36. Grahn, I. The Vuforia SDK and Unity3D game engine. Bachelor Thesis, Linkoping University, Linköping, Sweden, 2017. [Google Scholar]
  37. Paludan, A.; Elbaek, J.; Mortensen, M.; Zobbe, M. Disguising rotational gain for redirected walking in virtual reality: Effect of visual density. In Proceedings of the IEEE Virtual Reality, Greenville, SC, USA, 19–23 March 2016. [Google Scholar]
  38. Niklas, O.; Michael, P.; Oliver, B.; Gordon, G.B.; Marc, J.; Nicolas, K. The role of social presence for cooperation in augmented reality on head mounted devices: A literature review. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021. [Google Scholar]
  39. Witmer, B.G.; Singer, M.J. Measuring presence in virtual environments: A presence questionnaire. Presence Teleoperators Virtual Environ. 1998, 7, 225–240. [Google Scholar] [CrossRef]
  40. Lee, T.; Jung, C.; Lee, K.; Seo, S. A study on recognizing multi-real world object and estimating 3D position in augmented reality. J. Supercomput. 2022, 78, 7509–7528. [Google Scholar] [CrossRef]
Figure 1. Comparison of the adapted motion of an AR character; for example, a character standing in the incorrect position in the real environment (left), and the character with adaptive motion in a puddle (right). The AR character should effectively represent natural movements matching the real-world configuration to enhance the user experience. The red square in the two images indicates an AR character synthesized in the real world.
Figure 2. Comparison between original motion and our adaptive control using a real puddle: an AR character walking without acknowledging the context of the real environment (left) and the character using adaptive motion to cross the puddle (right).
Figure 3. Automatic motion adaptation process for the AR character.
Figure 4. Puddle objects detected in real-life images. The letters represent the recognized objects, and the numbers show the confidence level of the results. Here, the object recognition results are expressed based on a sliding window, so the results are displayed overlapping.
Figure 5. Examples of AR interaction between a participant and a character. The AR character is represented on a portable smart pad that operates in an outdoor environment and can express the adaptive motion (e.g., suitable walking) applied to the given environment.
Figure 6. Quantitative accuracy of object detection to evaluate performance in a real-world situation, using puddles as examples. These results represent probabilistic recognition performance. The top left of the figure shows the performance with respect to the object's location in the given image, and the top right shows the performance with respect to the size of the target object; darker colors indicate more accurate results. The bottom left shows the x and y position, marked in red, in a real image, and the bottom right shows an example for understanding the width and height. The objects had the highest scores when they were located between 0.4 and 0.6 in the image; for size, the best results were between 0.05 and 0.15. The numbers for the experimental results represent relative normalized values.
Figure 7. Example of our results related to detection errors. The figure shows that an error occurred due to light reflection (left). The right side shows a confusion matrix of our overall performance, where darker colors indicate higher values. For example, the error probability of recognizing the wet ground in the image as an ordinary road was found to be 65% (see the block in the bottom middle of the right image).
Figure 8. Survey results regarding participants’ experience of co-presence: adapted motion versus motion without adaptation (or nominal/original motion). Here, scenario #1 involved a puddle, and scenario #2 was performed in a muddy situation.
Figure 9. Examples of movements that can be made more automatic for the AR character: (top left) jumping on a stone bridge, (top right) jumping when the character can, (bottom left) avoiding walking through a small puddle, and (bottom right) falling into a deep puddle.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

