A Platform Based on Personalized Exergames and Natural User Interfaces to Promote Remote Physical Activity and Improve Healthy Aging in Elderly People

In recent years, there has been a significant growth in the number of research works focused on improving the lifestyle and health of elderly people by means of technology. Telerehabilitation and the promotion of physical activity at home have been two of the fields that have attracted more attention, especially currently due to the COVID-19 pandemic. However, elderly people are sometimes reluctant to use technology at home, mainly due to fear of technology and lack of familiarity. In this context, this article presents a low-cost platform that relies on exergames and natural user interfaces to promote physical activity at home and improve the quality of life in elderly people. The underlying system is easy to use and accessible, offering a number of interaction mechanisms that guide users through the execution of routines and exercises. A relevant feature of the proposal is the ability to customize the exergames, making it possible for the therapist to adapt them according to the user's needs. Motivation is also addressed within the developed platform to maintain the user's engagement level as time passes by. An empirical experiment is conducted to measure the usability and motivational aspects of the proposal, which was evaluated by 17 users between 62 and 89 years of age. The obtained results showed that the proposal was well received, considering that most of the users were not experienced at all with exergame-based


Introduction
According to the World Health Organization [1], the population is aging in both absolute and relative terms, that is, considering both the number of people over 65 years of age and the proportion of this group with respect to other age groups. In particular, it is estimated that in 2050, 16% of the world's population will be over 65 years of age, which is approximately double the current figure and five times the rate in 1950. Thus, the number of people over 60 will reach 2 billion by 2050, having already surpassed the 1 billion threshold in 2020. While the increase in longevity represents, in itself, a great success in recent history, it is also the root of a major socioeconomic problem, one that is aggravated by the decline in the fertility rate. In this sense, population aging presents a number of challenges for health systems and countries' economies because older people tend to need more care than younger people and are less likely to continue working if their health deteriorates [2]. Therefore, it is clear that we are
In this research work, we present a platform aimed at promoting and enhancing physical activity at home by the elderly, whose design takes into account the difficulties and barriers previously introduced. Figure 1 shows the context or scenario in which this proposal is framed. From the hardware point of view, the platform is composed of a laptop computer connected to a multisensor kit capable of integrating and analyzing voice and computer vision models. The former is used to run the overall software of the platform, while the latter allows the tracking of the user's skeleton and facilitates interaction with the platform through natural user interfaces.
On the other hand, the platform integrates a set of activities or games that can be customized according to the physical routine to be established. These games incorporate gamification mechanisms to motivate the use of the platform over time. Thus, the main contributions of the proposal are the following: (i) integration of a scalable mechanism for defining customized games, based on a language that enables their automatic generation, (ii) use of accessible mechanisms, based on minimalist interfaces and voice commands, to facilitate and simplify the use of the platform, (iii) adoption of natural user interfaces so that the platform recognizes, in an accurate and agile way, the movements of users without the need to use physical sensors, and (iv) integration of a module based on web technology to facilitate remote monitoring, if necessary, of the activity developed by the users of the platform.

User Interfaces
Section 1 introduced the importance of overcoming the barrier that technology can impose on older people who do not use it frequently. In particular, it mentioned the usability and interaction scheme offered by a system that aims to promote active aging through exercises performed autonomously at home. In this context, the present subsection focuses on the natural interaction mechanisms designed and integrated in the present proposal. At a general level, these mechanisms make it possible to use the system (i) without a physical interaction device and (ii) without the user having to attach physical sensors to their body. In this sense, the user's own body serves as a communication mechanism, since it is possible to interact with the system through voice commands (e.g., the word 'OK') or through physical movements (e.g., moving the right hand to a point in 3D space that activates a menu), which the system itself can recognize.
In this research work, we adopt a natural interaction between the user and the system, eliminating the wearable devices or color bands often used to acquire skeleton tracking data, i.e., the positions and orientations in 3D space of its joints (see Figure 2). Instead, the system integrates a hardware/software mechanism that extracts the patient's skeleton from the color and depth information of the captured video. This natural interaction has been adopted across the whole system, so it can be used both when performing the physical exercises and when interacting with the system menu. One of the advantages of this decision is the increased consistency when using the system, since the interaction mechanism is unique and global.
In terms of voice commands, the system makes use of the Microsoft SpeechSDK tool (https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/, accessed on 10 June 2021) to detect certain keywords used to navigate the functionality offered by the system and to interact with the system when exercising. The approach is simple and is based on a set of predefined voice commands that the system is able to identify in several languages. In addition, the system incorporates a contextual menu, which is visually reflected in the user interface and lets the user know which voice commands can be issued at any given moment. As an example, Figure 3b shows, in the upper right part, the list of voice commands available in the general menu. At the top center, the system provides a VU meter that changes appearance when the user is speaking, so that the user knows that the system is detecting his/her voice.
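The contextual voice menu described above can be pictured as a lookup from the current screen to the keywords that are valid there. The following Python sketch is illustrative only: the screen names, the keywords, and the functions `available_commands` and `handle_keyword` are hypothetical, and the recognized keyword is assumed to arrive as plain text from whatever speech recognizer is in use (no Speech SDK calls are shown).

```python
# Hypothetical screens and keywords; the paper does not list the actual command set.
COMMANDS_BY_SCREEN = {
    "main_menu": {"exercises": "exercise_list", "games": "game_list", "exit": "closed"},
    "exercise_list": {"ok": "exercise_running", "back": "main_menu"},
    "exercise_running": {"pause": "exercise_paused", "stop": "exercise_list"},
    "exercise_paused": {"ok": "exercise_running", "stop": "exercise_list"},
}


def available_commands(screen):
    """Keywords to display in the contextual menu for the given screen."""
    return sorted(COMMANDS_BY_SCREEN.get(screen, {}))


def handle_keyword(screen, keyword):
    """Return the next screen, or stay on the current one if the keyword is not valid here."""
    transitions = COMMANDS_BY_SCREEN.get(screen, {})
    return transitions.get(keyword.strip().lower(), screen)
```

Keeping the displayed list and the accepted keywords in the same table means the on-screen menu cannot drift out of sync with what the system actually accepts, which is one plausible way to obtain the consistency the text describes.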

Architecture
The architecture that supports the proposed platform is shown in Figure 4 and is structured in two main layers:
• Hardware layer. This layer integrates the physical devices used in the platform, which, in turn, run the various software modules of the other major conceptual layer. In particular, this layer includes a laptop with an Nvidia graphics card and a Microsoft Azure Kinect DK™ device.
• Software layer. This layer integrates the different modules and software libraries that make up the architecture. As can be seen, a modular design has been proposed to facilitate scalability and maintainability when making modifications or increasing the offered functionality.
With respect to the hardware layer, two relevant issues stand out. On the one hand, the tracking device used, Microsoft Azure Kinect DK™, integrates a high-quality depth camera, a 360° microphone array, a 12-megapixel RGB camera, and an orientation sensor for the construction of advanced computer vision and speech recognition models. The top right of Figure 5 shows the different components integrated in this device. The depth camera makes it possible to obtain, in real time, the positions and orientations of the joints of the human body in 3D space, as shown in the left part of Figure 5. On the other hand, the hardware layer requires a laptop computer with an Nvidia graphics card, as this is a requirement of the body tracking SDK. The laptop currently employed is shown at the bottom of Figure 5.
With respect to the software layer, the proposed design contemplates several modules and components, as described below:
• Microsoft .NET Framework. This component provides the runtime environment for the entire platform, offering independence and transparency with respect to the underlying hardware and communications networks.
• Unity Game Engine. This component has been integrated into the software layer to facilitate and streamline the development process from a general point of view. In essence, Unity is a cross-platform game engine that can be used to create 2D, 3D, virtual reality, and augmented reality applications.
• Persistence module. This module is responsible for managing the database in which all the platform information is stored, most notably the progress information of the system users.
• Exergame generation and processing module. This module provides the functionality needed to generate and validate exergames automatically from a formal specification written in a high-level language, for their subsequent integration into the platform. Section 2.3 introduces this language at a general level. It is beyond the scope of this paper to discuss the different submodules responsible for the validation and the syntactic and semantic interpretation of the exergame specifications built with this language.
• Capturing module. This module is responsible for capturing the data provided by the tracking device integrated in the platform. This data contains the positions and orientations, in 3D space, of the user's joints. It is possible to explicitly indicate which joints to monitor. This module also integrates the software necessary to capture and recognize the voice commands issued by the user when interacting with the system.
• Assessment module. This module is one of the most important of the architecture, since it provides the basic functionality to evaluate and classify the movements or physical exercises performed by the user. This module is also responsible for calculating the information needed to provide feedback to the user, based on their performance.
• GUI module. This module takes as input the information generated by the assessment module and presents it to the user as multimedia feedback, i.e., by means of visual and sound information. The ultimate goal of this module is to motivate and engage the user so that he/she uses the platform continuously over time. For this purpose, gamification aspects are integrated, such as scores and information on the user's level of progress.

Definition of Personalized Exergames
The proposed platform makes use of a high-level language, called Personalized Exergames Language (PEL) [18], which enables the specification of exergames adapted to a user based on his or her physical condition and ability to perform certain physical movements. Thus, from a high-level point of view, two fundamental processes can be distinguished: (i) the definition of the exergame itself, either through direct constructions of the language itself or through a graphic tool that supports the visual editing process of the exergame, and (ii) the automatic generation of the exergame from the previous definition.
The definition of an exergame includes the following steps:
1. Definition of the exergame considering the benefits obtained from its execution. As an example, in this step, one could think of improving specific capacities, such as muscular strength or mobility. In this sense, this step is associated with the therapeutic nature of the exergame.
2. Choice of the interaction mechanism between the user and the platform. At this point, preliminary aspects to the execution of the exergame are considered, such as the position of the user or of the virtual camera, and aspects of real use of the system, such as the natural interaction itself that relates the user's physical movements with the virtual effects of the same in the exergame.
3. Specification of the motivation mechanisms. In this step, the exergame integrates basic visual and sound feedback elements to provide feedback to the user when he/she is executing the exergame and when he/she finishes it. This feature is intended to increase the chances that the user will continue to use the system later on.
4. Definition of metrics to measure the user's progress level. This step, closely related to the previous one, is focused on storing information about the user's performance after the exergame has been completed. A simple example of metric could be the amount of time spent to perform the exergame.
The aspects listed above are materialized through language constructs, i.e., there are sentences that enable their definition by a nontechnical user, who would usually play the role of clinical supervisor of the platform. Section 2.4 describes how an exergame can be defined by using this language. This very same example is shown in Appendix A.
The implementation of this language is based on the GL Transmission Format (glTF) specification [19]. glTF is an open standard based on JSON, a format widely used for information exchange, and was conceived to work with information from 3D models. This standard offers a set of constructs that greatly simplify the specification of exergames with the proposed language, so that the peculiarities linked to the specification of exergames, i.e., domain knowledge, can be realized in the form of extensions to glTF. The glTF syntax facilitates the definition of issues traditionally linked to interactive graphics applications in 3D space, such as collisions between virtual objects. In this sense, the basic interaction mechanism provided by the platform is precisely the interaction of virtual objects, considering the virtual representation of the user's joints and the representation of virtual objects in 3D space.
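To make the idea of a glTF extension combined with joint-to-object interaction more concrete, the following Python sketch defines a glTF 2.0-shaped document and checks a joint trajectory against target spheres. It is a sketch under stated assumptions: `asset`, `nodes`, `extensionsUsed`, and `extensions` are standard glTF top-level properties, but the extension name `EXAMPLE_exergame`, its fields, the joint name, and all coordinates are hypothetical and do not reproduce the actual PEL schema.

```python
import json
import math

# glTF 2.0-shaped document. The "EXAMPLE_exergame" extension and its contents
# are hypothetical placeholders, not the real PEL constructs.
EXERGAME_JSON = """
{
  "asset": {"version": "2.0"},
  "extensionsUsed": ["EXAMPLE_exergame"],
  "nodes": [
    {"name": "target_hip", "translation": [0.20, 0.90, 0.0]},
    {"name": "target_mid", "translation": [0.35, 1.15, 0.0]},
    {"name": "target_shoulder", "translation": [0.45, 1.40, 0.0]}
  ],
  "extensions": {
    "EXAMPLE_exergame": {
      "trackedJoint": "HAND_RIGHT",
      "repetitions": 3,
      "targets": [
        {"node": 0, "radius": 0.15},
        {"node": 1, "radius": 0.12},
        {"node": 2, "radius": 0.10}
      ]
    }
  }
}
"""


def load_targets(document):
    """Return (center, radius) pairs for the target spheres, in visiting order."""
    spec = document["extensions"]["EXAMPLE_exergame"]
    return [(document["nodes"][t["node"]]["translation"], t["radius"])
            for t in spec["targets"]]


def count_repetitions(joint_positions, targets):
    """Count how many times the joint passes through all spheres in order."""
    next_target, repetitions = 0, 0
    for position in joint_positions:
        center, radius = targets[next_target]
        if math.dist(position, center) <= radius:  # joint is inside the sphere
            next_target += 1
            if next_target == len(targets):
                repetitions += 1
                next_target = 0
    return repetitions


document = json.loads(EXERGAME_JSON)
targets = load_targets(document)
# Synthetic hand trajectory: one complete pass, then an incomplete one.
trajectory = [(0.20, 0.90, 0.0), (0.28, 1.00, 0.0), (0.35, 1.15, 0.0),
              (0.45, 1.40, 0.0), (0.20, 0.90, 0.0), (0.35, 1.15, 0.0)]
print(count_repetitions(trajectory, targets))
```

For this synthetic trajectory the script prints 1, since only the first pass reaches the last sphere. A point-in-sphere test is the simplest possible collision check; a game engine would normally supply its own collision detection.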

Experimentation
In order to perform a preliminary evaluation of the proposed platform, an intervention has been carried out with a set of 17 random users, aged between 62 and 89 years, who fit the profile of people who can benefit from the concept of healthy aging.
This group belongs to the Association of People with Physical and Sensory Disabilities (COCEMFE; https://www.cocemfe.es/, accessed on 10 June 2021). It is a public service for the care of elderly people with a Grade I level of dependency. This association is located in Talavera de la Reina, Spain. The participants were recruited through both the aforementioned association and the Service for the Promotion of Personal Autonomy (SEPAP-MejoraT; sepap-mejoraT, accessed on 10 June 2021).
They were attended by SEPAP-MejoraT in the rural areas of Velada and Torralba de Oropesa and met the following inclusion criteria: (i) older than 60 years; (ii) Grade I level of dependency in activities of daily living; (iii) upper or lower limb motor impairment; (iv) no serious and disabling conditions; and (v) living at home. Regarding the exclusion criteria, four were defined: (1) presence of cognitive deficit; (2) psychiatric conditions; (3) visual or attention deficit; and (4) written nonacceptance of informed consent.
Two intervention groups were configured, according to the place of residence: Velada (n = 10) and Torralba de Oropesa (n = 7). The underlying disease presented by each participant was not considered, but rather its functional consequences and the participant's degree of dependence, as defined by the inclusion and exclusion criteria. Figure 6 shows different photos taken on the day when the system test took place. Prior to the execution of this activity, the participants were explicitly informed that the data collected would be treated confidentially and used exclusively in the present study. The ethics approval and consent form statement, which is available for the reader (https://www.esi.uclm.es/www/dvallejo/SI_Elderly_Life, accessed on 10 June 2021), was filled out by every participant before conducting the experiment.
An intervention was designed with the proposed system and applied to all study participants. For data collection, an ad hoc questionnaire was designed, in which all the study variables were included, in addition to the data provided by the system. Each session lasted 40 min and was divided into three parts:
• Preparation. An instructor presented the system to each participant for about 10 min. During the explanation, an example was projected on the wall so that the participant could follow the explanation and understand the activity to be performed.
• Development. Each participant performed an exercise routine included in the system and tested one of the available exergames.
• Evaluation. Once the participant had completed the previous step, he or she was encouraged to fill in a questionnaire to evaluate the exergames and the software system using Microsoft One Drive Forms (https://forms.microsoft.com/, accessed on 10 June 2021), to facilitate their subsequent digital processing.
In the stage of preparation and contact with the platform, the instructor made a preliminary tour of the functionality offered by the platform, so that each participant knew how to use it and had an initial reference of the graphical aspect and the functionality provided by it. In this context, Figure 3 shows the main views of the software prototype supported by the proposed platform. It should be noted that the system supports multiple languages, and that due to the fact that the experiment was carried out in Spain, the interface shown below is in Spanish.
As introduced above, in the development stage, each user had the opportunity to execute both the physical exercises and the exergames integrated in the platform (see Figure 7). Two physical exercises were performed:
(1) Starting from an upright standing posture, the user raises the right or left arm three times from the hip to the shoulder, passing the hand, shown in red, through the spheres placed in the 3D world that draw a trajectory. The hand must first pass through the largest sphere, close to the hip, and the repetition ends when the colored joint reaches the smallest sphere, close to the shoulder.
(2) As in the previous exercise, the user starts from an upright standing position and raises the right or left arm from the hip to the shoulder with an elbow flexion movement. The arm should be raised in a straight line.
The proposed functional exercise (exergame) takes place in a virtual restaurant, in which the participant pretends to be the waiter. As can be seen in Figure 7c, the dynamics of the exergame consist of moving virtual objects, such as a can of soda or a piece of cake, from a certain point to the central tray that is positioned on the bar of the restaurant. In this particular exergame, the user controls the upper limbs of the virtual avatar that can be seen in the foreground. These limbs move according to the physical movements made by the participant in the real world, which are captured by the system through its natural user interface.
Figure A1 in Appendix A shows the full PEL definition of the right hand to head exercise. PEL is the language previously introduced to specify exergames, so that the written sentences can be automatically processed to generate exergames which can be added to the proposed platform. In this concrete exercise, the following aspects were considered:
Once each participant finished the exercise, they were asked to complete a questionnaire with questions related to the usability of the system and to whether it helps them promote their personal autonomy.
The questionnaire comprised 25 closed-answer questions (see Table A1 in Appendix B), scored with a Likert scale (1: totally disagree; 5: totally agree), grouped in five dimensions: AP (activity perception), CL (cognitive load), UT (utility), GE (game elements), and TAM (Technology Acceptance Model) framework. These five dimensions are briefly described below.
• Activity perception (AP). This dimension contains questions related to interest (INT), effort (EFF), and ease of learning to use the system (LEA).
• Cognitive load (CL). This group includes questions inspired by the Cognitive Load Theory (CLT) [20]. It has been used to measure the complexity of the task (TD) and the complexity required by the system (DD). Additionally, two questions were introduced to measure the participant's effort with respect to the activity performed (E).
• Utility (UT). The third group consists of questions related to the participant's opinion regarding the usefulness of the system. That is, if they would use it to complement the tasks performed in a rehab center (COMP), if the system encourages them to be more consistent when performing exercises (CONS), if the system enhances motivation (PAS), and if they like the application to be a game (GAM).
• Game elements (GE). This dimension contains questions regarding the degree of suitability of each of the elements included both in the user interface and in the exergame scene itself, that is, elements such as the avatar representation, the number of repetitions, the score, and the remaining time or music, among others.
• TAM. The fifth dimension consists of questions based on the TAM framework [21]. It helps measure the perception of system usability (PEU), the utility perception (PU), and the intention of use (ITU).
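The grouping of questionnaire items into dimensions can be sketched as a simple mapping, from which per-item and per-dimension summaries follow directly. The Python sketch below is only an illustration: the item codes come from the text, but only two of the five dimensions are shown, the responses are made-up values rather than study data, and the helper names `describe_item` and `dimension_mean` are hypothetical.

```python
from statistics import mean, mode, stdev

# Item codes are taken from the text; only two dimensions are shown here.
DIMENSIONS = {
    "AP": ["INT1", "INT2", "EFF", "LEA"],
    "CL": ["TD", "DD", "E1", "E2"],
}

# One dict per participant: item code -> Likert score (1..5). Made-up values.
responses = [
    {"INT1": 5, "INT2": 4, "EFF": 5, "LEA": 4, "TD": 4, "DD": 2, "E1": 4, "E2": 2},
    {"INT1": 4, "INT2": 4, "EFF": 5, "LEA": 5, "TD": 3, "DD": 1, "E1": 4, "E2": 3},
    {"INT1": 5, "INT2": 5, "EFF": 4, "LEA": 4, "TD": 4, "DD": 2, "E1": 5, "E2": 2},
]


def describe_item(item):
    """Mean, sample standard deviation, and mode of one item across participants."""
    scores = [r[item] for r in responses]
    return {"mean": mean(scores), "sd": stdev(scores), "mode": mode(scores)}


def dimension_mean(dimension):
    """Mean score over all items of a dimension and all participants."""
    return mean(r[item] for r in responses for item in DIMENSIONS[dimension])
```

Note that `stdev` computes the sample standard deviation; the text does not state whether the sample or the population form was used in the study.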

Statistical Analysis
The statistical analysis was performed in RStudio® version 1.4. First, a descriptive statistical analysis was carried out to help understand the data: the mean (x̄), standard deviation (σ), and mode (µ) were computed to describe the variables. The Shapiro-Wilk test was used to check data normality and the behavior of the variables. In view of its result, a nonparametric test, the Spearman correlation test, was applied to measure the degree of association between variables, that is, how closely changes in one item go along with changes in another. The statistical significance level was set to 0.05.
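For readers unfamiliar with the Spearman coefficient, the following Python sketch shows how it can be computed as the Pearson correlation of tie-averaged ranks, which matters for Likert data because ties are frequent. This is not the authors' analysis script (the study used R, where `shapiro.test` and `cor.test` with `method = "spearman"` cover these steps); the function names and the data are illustrative, and no p-value is computed.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx)
    spread_y = sum((b - my) ** 2 for b in ry)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return covariance / (spread_x * spread_y) ** 0.5


# Made-up Likert scores for two items across six participants.
item_a = [5, 4, 5, 3, 4, 5]
item_b = [5, 4, 4, 3, 4, 5]
rho = spearman_rho(item_a, item_b)
```

A coefficient close to 1 or -1 indicates a strong monotonic association between two items; it does not, by itself, show that one item causes changes in the other.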

Results and Discussion
The aim of this study was to determine empirically how elderly people view the use of video games combined with virtual reality to perform physical exercises at home so as to promote their autonomy.
The results collected from the questionnaire to evaluate the usability of the system are presented, as a descriptive analysis, in Table 1. Generally, the system received positive feedback, achieving a good overall score in all dimensions. It should be pointed out that the participants were not experienced at all with exergaming systems, since they indicated in the first question that they had never used a system with the characteristics presented in this paper.
Regarding the first dimension, AP, the participants generally found the activity fun (INT1) and interesting (INT2). Furthermore, they were involved in the task and tried to do it well (EFF). They also rated the system as easy to use (LEA). This seems logical for people who usually perform repetitive tasks in sessions to promote autonomy. With the use of technology, sessions are different than usual, and they seem to be more interesting, enjoyable, and helpful for elderly people.
When the subjects were asked the CL questions, they considered themselves to be focused on the task, as it required a certain level of concentration (TD, E1). This can be attributed to the fact that, as stated in their answer to the first question, the participants had never used a virtual rehabilitation system. Therefore, the users were totally inexperienced in using technology to perform physical activity. Item E2 was rated relatively low, which may be interpreted as the participants not needing much effort to complete the exergames because they were intuitive enough. This may be related to the fact that the exergames were designed and adapted with a focus on elderly people with low-to-medium mobility. On the other hand, the overall response to question DD was, unsurprisingly, quite good, as this item indicates that they found it reasonably easy to perform the physical exercises. This may again be related to the fact that the exergames were designed to be easy to use and intuitive enough for people who are not very familiar with technology and, above all, virtual reality.
The UT dimension received notably positive scores: the participants rated the exergames as motivating because the system is presented as a game. Surprisingly, the majority scored the question COMP positively. This was not expected, as the target population is elderly people, who are generally not familiar with technology and are normally reluctant to use it. The reason for this high rating may be that they considered that our system may help them be more consistent in improving their autonomy (CONS). This may also be related to the TAM dimension, in which they remarked that the system is intuitive and easy to use. Moreover, our system appears to have made an impression on them, judging by the scores of the items ITU1 and ITU2, which indicate their intention to use it at home. In view of this, it seems logical that they stated that they would use our system at home, as they believed the system can adequately guide them to perform exercises on their own and improve their autonomy.
On average, we found relatively good scores in the GE dimension for questions GE1, GE2, GE3, GE4, GE6, GE7, and GE8. This was expected, as the elements of the interface were designed and included to make the system as intuitive and user-friendly as possible so that users can use the system at home on their own. However, the score for item GE5, which corresponds to the video tutorial shown during the play mode, was quite low. This result was not expected. A possible explanation may be the vision problems that some participants manifested during the experiment because of their age.
Correlational analyses were used to examine the relationship between items (see Figure 8). The most interesting findings are discussed below.
There was a significant positive correlation between INT1/INT2 and PSA items (r = 0.97, p < 0.05), which suggests that enjoyment and interest are factors closely related to motivation. Likewise, we have found that the first two items (INT1/INT2) maintain a positive correlation with the variable GAM (r = 0.99, p < 0.05), showing that exergames contribute to motivating people to perform more exercises. In relation to the perception of the participants regarding the usability of the system, we have found that intuitiveness and ease of use correlated positively with the INT1 (r = 0.85, p < 0.05) and INT2 (r = 0.84, p < 0.05) variables. This means that the perception of the activity, i.e., whether it was interesting or fun, was closely tied to the opinion about the usage of the tool, which seems to make sense. Interestingly, the item GE6 correlated perfectly with the INT1 and INT2 variables (r = 1, p < 0.05). This seems to indicate that including a timer to perform an exercise makes the activity more enjoyable for them. Furthermore, it may be attributed to the fact that they can make use of the time to compete with each other. In turn, the item PSA correlated positively with GE1 (r = 0.76, p < 0.05), GE2 (r = 0.66, p < 0.05), GE3 (r = 0.71, p < 0.05), GE4 (r = 0.72, p < 0.05), GE6 (r = 0.97, p < 0.05), GE7 (r = 0.75, p < 0.05) and GE8 (r = 0.87, p < 0.05). This indicates that the interface elements contributed to motivating the elderly people to perform physical activity. Remarkably, the correlations between PSA and the ITU2 variable (r = 0.9, p < 0.05), and between GAM and the ITU2 variable (r = 0.93, p < 0.05), indicate the intention of elderly people to use the system. In other words, the more motivating they find it, the more they would use it at home.
However, the apparent lack of correlation between DD and EFF (r = 0.05, p < 0.05) can be attributed to the fact that the participants did not need much effort to perform the proposed tasks, which indicates that the exergames were adequately adapted to their condition. The negative correlations also make sense, as they are all related to the DD and E2 items. This means that the more difficult the users find it to perform exercises with the system, the more reluctant they will be to use this technology. In this sense, it is essential to design assistive technologies well, so that they become not a barrier but a solution that meets users' needs and users do not abandon their therapy. However, given that the results are based on a limited number of people, they should be treated with caution. This is a limitation of our study, so the findings obtained from this analysis may not be representative enough.

Conclusions
One of the major advantages of the presented proposal is the capacity to personalize the exergames integrated into the platform, which offers therapists and physicians the ability to adapt the exercise routine according to the user's needs. The gamification components have been designed to maintain the user's engagement level over time. Finally, the ease of use and flexibility provided by the interaction mechanisms integrated in the platform make it possible to use it at home, autonomously and independently.
The platform has been evaluated by a set of 17 random users who can benefit from the concept of healthy aging through a quasiexperiment that analyzed characteristics such as the system usability, the utility, and the intention of use. We were particularly interested in measuring these items because they are strongly related to some of the barriers that elderly people face when adopting technology-based solutions, which were introduced in Section 1. The collected results reflected positive feedback from the users, with a good overall score in all the assessed dimensions. It is important to point out that most of the users that took part in this experiment did not have experience in using similar healthy aging platforms.
As future lines of research, and once the proposal has been validated in terms of usability and motivation, we intend to run a clinical trial that considers a significant number of users during a longer period of time. Its aim is to objectively evaluate the quality of life of users along with their improvement of physical condition. In this trial, we are likely to find subgroups of patients who will respond well and others who will not respond as expected.
Therefore, one question that remains to be answered is related to the acceptance of the platform when used on a continuous basis over time by users who may not necessarily be familiar with the daily use of technology. Lastly, another clinical trial will be conducted to analyze the efficacy of the system compared to traditional methods. The study of this data will help us improve the proposal.
We are also interested in integrating automatic exercise recommendation into the platform. This feature will allow the platform to dynamically adapt to the user's progress level, which is also strongly related to keeping the users motivated and engaged.