Applying Vision-Based Pose Estimation in a Telerehabilitation Application

Abstract: In this paper, ExerCam, an augmented reality mirror application based on vision-based human pose detection, is presented. ExerCam does not need any special controllers or sensors for its operation, as it works with a simple RGB camera (webcam type), which makes the application fully accessible and low cost. The application also includes a web-based system for managing patients, tasks and games, with which a therapist can manage their patients ubiquitously and fully remotely. We conclude that the application is viable as a telerehabilitation tool: it offers a task mode for calculating the range of motion (ROM) and a game mode that encourages patients to improve their performance during therapy, with positive results obtained in this respect.


Introduction
With a considerable increase in the number of chronic, disabled or mobility-impaired patients, virtual rehabilitation stands out as a modern alternative to treat these patients through the use of virtual reality, augmented reality and motion capture technologies. In this type of rehabilitation, patients are immersed in a virtual environment in which they perform the exercises corresponding to their treatment.
In addition, the COVID-19 pandemic, triggered by the SARS-CoV-2 virus, has caused an unprecedented crisis in all areas, with the health sector being among the sectors that have suffered the most. Understandably, many healthcare providers are now more focused on attending to people's urgent and life-threatening health needs.
One of the main measures taken worldwide in response to the pandemic has been the reduction of face-to-face activities at all levels, which has given rise to three main fields of action: the implementation of an online or remote consultation modality, which makes use of different platforms (technological or nontechnological) and formats; the mobilization and support of health personnel; and attention to the health and well-being of patients.
In this context, the use of technologies that allow telecare, including rehabilitation treatments, is offering a safe alternative, and in many cases, the only possible one.
On the other hand, given the lack of motivation of patients to accomplish treatment at home, a new methodology based on gamification has begun to be used. Incorporating an element of entertainment through gamification of these activities can stimulate the patient to perform the activities while playing the game. These types of games that include physical activity are called "exergames" [1][2][3].
This new methodology makes use of exergames, which increase the patient's motivation to perform telerehabilitation exercises. Exergames aim to stimulate the mobility of the body through the use of interactive environments with immersive experiences that simulate different sensations of presence and require physical effort to play [4]. Research indicates that they can produce improvements in fitness [5], posture [6], cognitive function [7], balance [8] and other nonphysical effects [9]. The fact that exergames are presented as games and gamify tasks provides the extra potential to increase motivation and, therefore, the continuity of exercise or rehabilitation programs by patients. Consequently, diverse exergames have been successfully applied to medical conditions such as stroke [10], autism [11] or motor function impairments [12,13], among many others.
There are different examples in the literature about the use of exergames in telerehabilitation [14]. Some of them have adopted tools and technologies of video consoles to manage the monitoring of patient movements. In this sense, we can find works such as [15][16][17], where they make use of the Microsoft Kinect sensors [18] of the Xbox console. This sensor allows the movements of the patient to be monitored in a 3D space, allowing the patient to interact with the application directly with their body, without using additional controls. In [19][20][21], they use the Wii remote or the Wii Balance Board of the Nintendo Wii console, although in these cases, the use of patient control devices is necessary. In [22][23][24], virtual reality systems are used with promising results in rehabilitation, although they are different technologies from those proposed. They are of interest to the work because they make use of gamification and vision systems. In these referenced works, it has been necessary to use sensors such as LeapMotion, VR glasses (for example, Oculus Quest) and other complex sensors to visualize and track movements.
In recent years, these methodologies have been complemented with other technologies such as virtual reality. Augmented reality (AR) is the term used to describe the set of technologies that allow a user to visualize part of the real world through a technological device with graphic information added by it. The device, or set of devices, adds virtual information to the already existing physical information (that is, a virtual part appears in reality). In this way, tangible physical elements are combined with virtual elements, thus creating an augmented reality in real time. These technologies can be used in different formats and make use of various devices, from mobile phones to complex sensors [25].
However, most low-cost exergames, such as those using smartphones, promote healthy lifestyles more than specific exercises [26]. There are other research works more in line with the proposal of this paper, in which augmented reality mirrors are applied. Sleeve AR [17] provides a platform for exercising the upper extremities, where the patient is shown a real-time guide to exercise by means of AR information projected on the patient's arm and on the floor. Physio@Home [27] displays visual objects such as arrows and arm traces to guide patients through exercises and movements so that they can perform physical therapy exercises at home without the presence of a therapist. Finally, the NeuroR [28] system applies an augmented reality mirror to exercise the upper limbs. The system overlays a virtual arm on top of the body of the patient and uses an electromyography device to control its movement. The problem with these approaches and proposals is that they need sensors, devices and technologies that are not easily accessible in a middle-class home.
In other proposals [29][30][31], mobile devices such as smartphones or tablets are used, but most of them are intended as exergames for outdoor activities rather than for telerehabilitation. Therefore, we think that the alternative is the development of exergames that allow the monitoring of patient movements, especially indoors, and that are as accessible as possible. The most accessible alternative is to estimate the human pose using a single RGB camera.
Recent advances in computer vision techniques and in learning methods based on convolutional neural networks (CNNs) have improved the performance of human pose estimation and, therefore, increased interest in it. According to the dimensionality of the estimated pose, existing works can be classified as 2D or 3D pose detection; ExerCam relies on 2D pose detection, for which several libraries are available [32]. Among the most popular bottom-up approaches for estimating the pose of several people, a number of open-source models and deep-learning tools stand out, among which we highlight the OpenPose library [33], DensePose [34] and the image recognition software wrnchAI [35]. As with many bottom-up approaches, OpenPose first detects the parts (key points) that belong to each person in the image, and then assigns the parts to different individuals. WrnchAI is a high-performance deep-learning runtime engine, engineered to extract human motion and behavior from standard video. DensePose proposes DensePose-RCNN, a variant of Mask-RCNN that densely regresses the UV coordinates of specific parts in each region of the human body at multiple frames per second. In [36], a recent comparison of the alternatives is presented, where both cover the needs and requirements of ExerCam. After testing the proposals, we finally decided to use OpenPose in our work, given its ease of integration with our project. What made us opt for OpenPose in particular is that it is an open-source library with a free license, whereas wrnchAI requires an annual license of USD 5000.
In this paper, an augmented reality mirror application based on 2D body pose detection, called ExerCam, is presented. ExerCam superimposes augmented information in real time on the video. In this way, the patient can observe on the screen a virtual skeleton on their body, where the areas to be treated are highlighted. Additionally, virtual objects are superimposed on the video (objects to hit, decoration, instructions, user interface, etc.), which allows a gamified scene to be created. ExerCam does not need any special controller or sensor for its operation, as it works with a simple, webcam-type RGB camera, which makes the application fully accessible and low cost. Therefore, the contribution of our work is the presentation of ExerCam as a low-cost system based on 2D vision, which could be an alternative tool to be taken into account in telerehabilitation therapies. A study of healthy subjects has been conducted in order to assess the technical viability and usability of the developed application.
The paper is organized as follows: Section 2 explains the ExerCam application developed in depth. In Section 3, the results obtained are presented and discussed, highlighting the validity of the use of ExerCam as a tool for measuring the range of motion (ROM), necessary to be able to perform an evaluation of the evolution of the patients. Finally, Section 4 presents our conclusions.

Materials and Methods
For the purpose of this research, the ExerCam application was developed. The main objective of this system is to stimulate patients to perform the therapy prescribed by specialists. In addition, we want to offer specialists (doctors, therapists) a tool to track the progress of their patients.
ExerCam is designed as an application for telerehabilitation-based exergames. For this, we make use of the augmented reality mirror-type technology based on 2D body pose detection. The system, as seen in Figure 1, makes use of an RGB camera (for example, webcam type) installed in the patient's computer. This camera provides the necessary video input to be able to apply techniques of human pose estimation and augmented reality performed in the ExerPlayer module, which allows the gamified scene to be created. ExerPlayer will show the video on the patient's screen, along with the different virtual elements of augmented reality, the main element being a virtual skeleton superimposed on the patient's body. Depending on the mode of use (calibration, game or task), other virtual objects (objects to hit, decoration, instructions, user interface, etc.) are also superimposed on the video. All the information about the results of the tasks, exercises and patient data is stored in a database on the cloud, together with the information of the tasks and exercises prescribed by the therapist, which will allow the information to be available and updated in both the ExerPlayer and ExerWeb modules. At the other end of the system is the ExerWeb module, where thanks to its web interface and the different algorithms implemented, the therapist can monitor the patient's evolution from a distance and set new types of exercises or tasks. In the following subsections, the libraries and modules used in the ExerCam architecture are explained in greater detail.

OpenPose 2D Library
As previously mentioned, the basis of the application is the OpenPose library [37], which performs 2D human pose estimation. In [38], an online and real-time multiperson 2D pose estimator and tracker is proposed. Figure 2 shows the network structure, where the part affinity fields (PAFs) and confidence maps are predicted iteratively. The PAFs predicted in the first stage are represented by L^t, and the confidence maps predicted in the second stage are represented by S^t. To achieve this, OpenPose uses a two-step process (see Figure 2). First, it extracts features from the image using a VGG-19 model and passes those features through a pair of convolutional neural networks (CNNs) running in parallel. One CNN of the pair calculates confidence maps (S) to detect body parts. The other calculates a set of 2D vector fields called part affinity fields (L), which are used to combine the parts into the skeleton of each person's pose. These parallel branches can be repeated many times to refine the PAF and confidence map predictions. The part confidence maps (S) and part affinity fields (L) are then:
$$S = (S_1, S_2, \ldots, S_J), \quad S_j \in \mathbb{R}^{w \times h}, \; j \in \{1, \ldots, J\} \tag{1}$$
$$L = (L_1, L_2, \ldots, L_C), \quad L_c \in \mathbb{R}^{w \times h \times 2}, \; c \in \{1, \ldots, C\} \tag{2}$$
where J is the number of body parts, C is the number of limbs (part pairs) and w × h is the size of the input image.

ExerPlayer Module
ExerPlayer runs as a desktop application, installed on the patient's computer. The central component of the ExerPlayer module is the OpenPose library, which allows the real-time human pose estimation to be obtained through the patient's webcam. ExerPlayer makes use of the REST API to transmit and receive data regarding the tasks and exercises of patients. This module allows the patient to perform exercises from home thanks to the application developed as an augmented reality mirror type, where the patient is reflected on the screen, along with other virtual objects. In this way, it provides the means to stimulate physical activity, by inducing movement, or gross motor learning, as it is able to provide the three key principles in motor learning: repetition, feedback and motivation [39]. ExerPlayer has three types of virtual objects (see Figure 3a):
• Real-time superposition of a virtual skeleton on the user's body, where the joints are highlighted. A human pose skeleton represents the orientation of a person in a graphical format (that is, it is a set of coordinates that can be connected to describe the pose of the person). Each coordinate in the skeleton is known as a joint or a key point (or part). A valid connection between two joints is known as a pair (or a limb). The parts of the body that can be used correspond to those provided by the COCO model [40]. This model provides an image data set with 17 key points for each person. They can be seen in Figure 3b.
• Target objects, represented by several concentric circles, are targets that the players must reach with their joints. When the targets are hit, a sound is played that provides positive feedback and increases the feeling of immersion and motivation in the game [41]. The targets can be assigned to a specific joint, in which case the target will be shown in the same color as the one assigned to the joint, and the player will have to use that joint to reach the target. The targets can be fixed or mobile (with predefined trajectories). In addition, they can be configured to appear or disappear at a certain time or in response to events such as the triggering of timers or the impact of previous objectives.
• Visual aid, decoration and instructions. They can be:
o Text with instructions, score, time elapsed, etc.;
o Progress bar-like buttons to start an exercise;
o Geometric shapes to help place the user and calibrate the system.
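To make the target mechanism described above more concrete, the following minimal Python sketch checks whether a joint of the estimated skeleton has reached a target. It is only an illustration under stated assumptions: the function name `target_hit`, the dictionary layout of the pose and of the target, the pixel radius and the confidence threshold are hypothetical and are not taken from the ExerCam implementation. Only the 17 key-point names follow the COCO data set.

```python
import math

# The 17 COCO key points, in the order used by the COCO data set.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def target_hit(pose, target, min_confidence=0.3):
    """Return the name of the first joint that reaches the target, or None.

    pose:   dict mapping joint name -> (x, y, confidence), in pixels
            (hypothetical layout).
    target: dict with "center" (x, y), "radius" in pixels and an optional
            "joint" key naming the only joint allowed to hit the target.
    """
    # A target assigned to a joint can only be hit with that joint.
    allowed = [target["joint"]] if target.get("joint") else COCO_KEYPOINTS
    cx, cy = target["center"]
    for name in allowed:
        if name not in pose:
            continue
        x, y, confidence = pose[name]
        if confidence < min_confidence:
            continue  # ignore joints the pose estimator is unsure about
        if math.hypot(x - cx, y - cy) <= target["radius"]:
            return name
    return None
```

A game loop would call such a function once per frame for every visible target, then play the feedback sound and update the score when a joint name is returned.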
To access the ExerPlayer module, the patient must be registered in the system and enter their credentials. Once the application is accessed, the patient has several modes of use: (1) calibration mode, (2) task mode and (3) game mode.

ExerPlayer Calibration Mode
One of the main problems in vision systems is usually due to changes in the conditions of the environment (lighting, distances, camera orientation, etc.). For ExerCam, these problems are taken into account, and that is why in each session, both for performing tasks and for playing games, a previous calibration process should be carried out. ExerCam automatically launches an obligatory static calibration before executing a task or game.
In calibration mode, a static posture evaluation (SPE) [42] is performed for the patient. For this, the patient must adopt the neutral or zero position (look forward, arms hanging at the side of the body, thumbs directed forward and lower limbs side by side with knees in full extension), which will allow the application to calibrate the exact measurements and characteristics of the patient. Mainly, the system records the distances between joints (the connection between two joints is known as a pair or limb) and possible malformations of the patient (for example, missing limbs). This information is sent, encapsulated in JavaScript Object Notation (JSON) format, to the ExerCam database via the REpresentational State Transfer (REST) API [43].
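As an illustration of the kind of data recorded during calibration, the sketch below measures a few limbs from the joint coordinates and serializes them as JSON. The selection of limbs, the field names (`patient_id`, `limb_lengths_px`, `missing_limbs`) and the function name are assumptions made for this example; the actual payload format of ExerCam is not described here.

```python
import json
import math

# Hypothetical subset of limbs (joint pairs) measured in the neutral position.
LIMBS = [
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
    ("left_hip", "left_knee"),
    ("left_knee", "left_ankle"),
]


def calibration_payload(patient_id, pose):
    """Build a JSON string with limb lengths measured during calibration.

    pose: dict mapping joint name -> (x, y) in pixels (hypothetical layout).
    Limbs with an undetected joint are reported in "missing_limbs".
    """
    lengths = {}
    missing = []
    for joint_a, joint_b in LIMBS:
        key = f"{joint_a}-{joint_b}"
        if joint_a in pose and joint_b in pose:
            (ax, ay), (bx, by) = pose[joint_a], pose[joint_b]
            lengths[key] = round(math.hypot(bx - ax, by - ay), 2)
        else:
            missing.append(key)
    payload = {
        "patient_id": patient_id,
        "limb_lengths_px": lengths,
        "missing_limbs": missing,
    }
    return json.dumps(payload, sort_keys=True)
```

The resulting string could then be sent as the body of a REST request. Because the lengths are in pixels, they are only valid for the camera distance and orientation used in that session, which is consistent with calibrating before every task or game.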
To simplify the calibration, the application has visual aids to help patients place themselves in the right position and make sure that the camera is oriented in the right direction (see Figure 3a). This aid consists of a rectangle on the screen. Patients can place their body inside the rectangle, and in the case that patients move from their original position during the exercise, they can move back to it. In this way, the camera will have a different distance for each patient. It is also recommended that the camera is oriented orthogonally to the patient, although the system works correctly even with the camera oriented with a certain inclination angle. Once the system is calibrated and a task or game session is started, the camera orientation cannot be changed, neither can the distance between the patient and the camera (in the initial or resting position, during the game, they can move). For future works, a dynamic calibration is proposed, as opposed to static calibration, to avoid possible maladjustment in the measurements in the case of changes in the environment.

ExerPlayer Task Mode
The objective of the task mode is to provide the therapist with a mechanism to evaluate the range of motion (ROM) of specific joints and the reaction time of each patient. Additionally, it can be used as a goniometer. The task is to perform a specific movement that has been previously prescribed by the therapist (in Figure 4, an example of the shoulder abduction task is shown), and for this, the application gives visual support through a band of targets and the virtual skeleton (see Figure 4a,b). Through the REST API, the data of the tasks prescribed by the therapist are received via HTTP. With a GET request, the exact positions of all the targets of the exercise are obtained.
The targets and their position are automatically generated by the target location algorithm (TLA). The TLA algorithm developed for ExerCam calculates the exact position where the targets should be and ensures that the movement performed is correct. The TLA algorithm adapts the distance and location of the targets to each patient and each session, taking into account the main joint of the task and the parameters collected in the calibration mode. The number of targets to be displayed is set by a parameter configurable by the therapist.  For the development of the TLA algorithm, the movements of the articulations included in the tasks were studied in depth. What all the tasks have in common is that joint movements describe a circular movement so that the targets will form a semicircle by taking as a central point the joint to be measured. In this sense, the placement algorithm will require a central point (it will be the main articulation of the task), a radius (distance between joints) and a start and end angle. Table 1 shows, for each of the tasks implemented in ExerPlayer, the maximum and minimum values assigned for each joint. These values were obtained from theoretical values marked by the articular scales established in the R.D. 1971/1999 (based on the American Medical Association guide) [44]. As an aid, the TLA algorithm sets the initial position of the main joint of the task and shows a target of the color of the corresponding main joint (see Figure 4a). In this way, the patient can be positioned correctly.
For the example of the left shoulder abduction task, the TLA takes the patient's shoulder joint (Center 2) as its central point and shows the shoulder target in the center of the positioning zone (see Figure 4a). For other joints, the target is drawn in other positions. The algorithm calculates the value of the radius r, as shown in Equation (3), where the radius is equal to the distance between the shoulder joint, represented by C(h, k), and the wrist, represented by P(x, y). The joints involved in each task are reflected in the Radius column of Table 1. Once the radius is obtained, the incremental method of drawing a circle [45] is applied and, given a constant angular step selected by the therapist (dθ), the coordinates of the centers of the targets are obtained incrementally. The calculation of two consecutive center points is shown in Equation (4).
Once the task is completed, the application generates a JSON file with all the data obtained (patient data, task data, time spent, targets reached, coordinates of the joints involved in the task). This file is sent via the REST API to the ExerCam database and is used later by the evaluation algorithm to show the results of the task (the ROM calculation) to the therapist. The evaluation algorithm and the ROM calculation are explained in Section 2.3.1.
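The placement of targets along an arc can be sketched in a few lines of Python. This is a hedged illustration, not the TLA itself: the function name `place_targets` and its parameters are hypothetical, the radius is taken as the Euclidean distance between the central joint and the radius joint, and the incremental step is implemented as a plane rotation by the angular step, which is one common form of the incremental circle method and is assumed here. Angles are measured counterclockwise from the positive x axis with y pointing up; in image coordinates, where y points down, the sign of the y offset would have to be flipped.

```python
import math


def place_targets(center, radius_point, start_deg, end_deg, step_deg):
    """Return the centers of the targets on an arc around `center`.

    center:       (h, k), main joint of the task (e.g., the shoulder).
    radius_point: (x, y), joint that defines the radius (e.g., the wrist).
    start_deg, end_deg, step_deg: arc limits and angular step in degrees;
    assumes step_deg > 0 and end_deg >= start_deg.
    """
    h, k = center
    # Radius: Euclidean distance between the two joints.
    r = math.hypot(radius_point[0] - h, radius_point[1] - k)

    # Offset of the first target relative to the center.
    dx = r * math.cos(math.radians(start_deg))
    dy = r * math.sin(math.radians(start_deg))

    # The sine and cosine of the step are computed once; every new target
    # is obtained by rotating the previous offset by the angular step.
    cos_step = math.cos(math.radians(step_deg))
    sin_step = math.sin(math.radians(step_deg))

    count = int(round((end_deg - start_deg) / step_deg)) + 1
    targets = []
    for _ in range(count):
        targets.append((h + dx, k + dy))
        dx, dy = dx * cos_step - dy * sin_step, dx * sin_step + dy * cos_step
    return targets
```

For instance, `place_targets(shoulder, wrist, 0, 180, 10)` would produce 19 target centers, all at the shoulder-to-wrist distance from the shoulder. Because the radius comes from the joint positions of the current session, the arc adapts to each patient.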

ExerPlayer Game Mode
As already mentioned, the use of rehabilitative devices and telerehabilitation reduces the cost of rehabilitation. However, patient motivation is critical for efficient recovery [46]. This is why ExerCam offers a game-based telerehabilitation service.
In the game mode, the patient accesses their section of games planned and customized by the therapist. A game is based on an augmented reality session, where virtual targets will appear strategically placed in the scene. The game mode executes the gamified activities, aimed at the realization of concrete movements and with a number of repetitions established by the therapist, with the aim of working different motor functions. When the patient tries to reach or avoid the targets (depending on the game), the specific movements of their therapy are performed. The objective of the game mode is to improve balance (both standing and sitting), coordination, load transfers, range and alignment of movements, strengthen the nondominant side and hand-eye coordination, reaction time and muscle strengthening. To design these games, we followed the indications of the therapist and the guidelines for the design of rehabilitation activities outlined by the Commissioning Guidance for Rehabilitation of NHS England [47].
In game mode, as shown in Figure 5, four different games are available: two of them are oriented to the stimulation of the upper body and the other two to the stimulation of the lower body. The four games developed are the following:
• Game 1: Crazy Targets. This is a very simple game where the patient must touch the circular targets that appear on the scene. Although in this game targets can be placed throughout the screen, it is specially designed to stimulate the upper extremities. In addition, the game can be configured with different levels of difficulty, where the difficulty is imposed by applying movement to the targets (defining the trajectory and speed of the targets), specifying the joint with which the target must be reached (in this case the targets are shown in the same color as the assigned joint), limiting the lifetime of the target (the target disappears when it is hit or when it reaches the maximum lifetime) or limiting the time to complete the game.
• Game 2: Boxing. This game is designed for stimulation of the upper extremities. In this game, the patient becomes a great boxer and must hit the key points of the opponent until they are knocked out. In this game, the targets are static, and they appear in different positions, one by one, as they are hit. The level of difficulty can be increased by identifying the specific joint with which the target should be reached (in this case the targets are shown in the same color as the assigned joint), increasing the number of targets that appear at the same time, limiting the lifetime of the target or limiting the time to complete the game.
• Game 3: Football. This game is designed for stimulation of the lower body in a football atmosphere. In this game, the patient must "kick the ball." This game is similar to Game 1. The patient has to reach the targets that appear on the screen, but this time, the targets will look like a soccer ball. As in Game 1, the targets can be configured to increase the level of difficulty by providing movement of the balls, specifying the joint with which the target should be reached, limiting the lifetime of the target or limiting the time to complete the game.
• Game 4: Bars. This game is designed for lower body stimulation. Unlike the other three games, in this game, the patient has to prevent the bars (formed by a set of aligned targets) from reaching them. The bars move around the screen, forcing the patient to make lateral displacements, jumps or squats. In this game, the difficulty can be increased by increasing the speed with which the bars move or with the ranges of the trajectories (start and end of the sweep performed by the bars).
For the placement of the targets in any of the four games, the therapist can select one of the predefined distributions for each game or can design a specific distribution for each prescribed session. After each game, a score is given so that the patient has immediate feedback about their performance.
It is important to note that in game mode the patient has greater freedom of movement: rather than perfectly repeating a certain movement, the idea is to encourage the exercise of a certain part of the body.

ExerWeb Module
The ExerWeb module provides a dynamic web interface (see Figure 6) from where the therapist can manage patients, prescribe tasks or games and consult or analyze the results in a ubiquitous and totally remote way with the patient. Once the therapist is authenticated on the website, ExerWeb has a top menu accessible from any page of the website. From this menu, the therapist can access the Therapist section and the enabled management mechanisms.

Patient Management
ExerWeb has an ExerCam Patient Management web page. This page is the default page accessed when a therapist enters the website (see Figure 7a). The Patient Management page offers a preview with the most relevant patient data in a table format. This section includes links to create, edit, delete, access the results of tasks and games and prescribe tasks and games to the patient. Both the creation mechanisms and the prescription of tasks are managed through forms (see Figure 7b).
When the results of a patient are accessed, the history of tasks and games prescribed to that patient is shown (see Figure 7a,b). Each of the prescribed tasks or games has an associated file of results that were generated with the evaluation algorithm and can be downloaded in different formats (txt, JSON or CSV). Likewise, two graphs are displayed (or fewer if there are not enough data) with the evolution of the last two tasks performed and two graphs (or fewer if there are not enough data) with the evolution of the last two games performed. The evaluation algorithm applied to the games consists of showing the degree of improvement set by the therapist when prescribing the game (time to overcome the activity, minimum necessary points, etc.). In the case of the tasks, the evaluation algorithm consists of calculating the ROM value.
For the calculation of the ROM value, two possible solutions were considered: (1) a first solution based on the targets reached and (2) a second solution based on the variation of joint coordinates. The operations carried out are the following:
• Target ROM: value obtained taking into account the targets reached. During the prescription of the task, the therapist sets the distance in degrees between consecutive targets (dθ). Therefore, the ROM can be calculated by subtracting the angle at which the first target reached (DI) is located from the angle at which the last target reached (DF) is located (see Equation (5)). We can take as an example the task of shoulder abduction shown in Figure 4c, with an angular step of 10°. If the first target reached was 3 and the last target reached was 14 (see Figure 4c), then we can calculate the ROM as seen in Equation (6).
$$\mathrm{ROM} = \mathrm{DF} \cdot d\theta - \mathrm{DI} \cdot d\theta \tag{5}$$
$$\mathrm{ROM} = 14 \cdot 10^{\circ} - 3 \cdot 10^{\circ} = 110^{\circ} \tag{6}$$
• Joint ROM: value obtained by subtracting the initial angle θ1 from the final angle θ2, where each angle is formed by the vectors v1 and v2 that correspond to the segments meeting at the joint (see Equation (7)). The magnitude of the angle between the two vectors is given by Equation (8).
For the example of shoulder abduction, the vectors involved correspond to those shown in Equations (9) and (10).
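Both ROM calculations can be sketched in Python as follows. This is an illustrative sketch rather than the evaluation algorithm itself: the function names are hypothetical, and the angle between the two vectors is computed with the standard dot-product formula, which is assumed here to be the kind of relation the text refers to for the magnitude of the angle. The choice of the hip and wrist as the end points of the two segments in the usage note is likewise an assumption.

```python
import math


def target_rom(first_target, last_target, step_deg):
    """Target ROM: angle of the last target reached minus the first one."""
    return last_target * step_deg - first_target * step_deg


def angle_between(v1, v2):
    """Angle in degrees between two 2D vectors (dot-product formula)."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norms = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    if norms == 0:
        raise ValueError("zero-length vector: joint positions coincide")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cos_theta))


def joint_angle(joint, end_a, end_b):
    """Angle at `joint` between the segments joint->end_a and joint->end_b."""
    v1 = (end_a[0] - joint[0], end_a[1] - joint[1])
    v2 = (end_b[0] - joint[0], end_b[1] - joint[1])
    return angle_between(v1, v2)


def joint_rom(initial_angle, final_angle):
    """Joint ROM: final angle minus initial angle, in degrees."""
    return final_angle - initial_angle
```

With the values of the shoulder abduction example, `target_rom(3, 14, 10)` returns 110. For the joint-based variant, `joint_angle(shoulder, hip, wrist)` would be evaluated on the first and last frames of the task and the two results passed to `joint_rom`.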

Task Management
From the Task Management section, the therapist can design specific tasks to later prescribe them to patients. For both the creation of a new task and the prescription of tasks to patients, a form is used (see Figure 8b). In the creation of a new task, the therapist will indicate the joint on which the task is carried out, the movement to be carried out, the initial angle, the final angle and the increment in degrees between targets. For the prescription of tasks, the therapist will select a task and a patient, and indicate a deadline to complete the task, the maximum initial angle and the minimum final angle required to complete the task. The result of each task prescribed to a patient can be consulted from the Patient Management section (see Figure 9b).

Game Management
Game management consists only of prescribing games to patients, as it currently does not allow the creation of new games. In the prescription of games, the therapist will select one of the games and a patient, indicate a deadline to complete the game and customize the game to the needs of the patient. A game can be customized by indicating the duration, repetitions, difficulty, number of targets and score needed to pass the game, and by adding a file with the positions in which the targets should appear.
Once exercises are performed by patients, ExerCam provides actionable information to therapists and patients through the detailed results. The result of each game prescribed to a patient can be consulted, as with the tasks, from the Patient Management section (see Figure 9a).
On the results page of a patient, a list of the tasks prescribed to that patient is shown first, with an icon indicating whether each task has been passed or not (a task is considered successfully completed if the joint ROM exceeds the minimum set for that task). Following the list are up to two graphs with the historical results of the last two tasks. If the patient does not have results for two different tasks, a single graph is shown, and none if there are no patient data. Following the task results section, a section with the results of the games is shown, indicating in this case whether the game has been passed based on the score obtained.

Results
In order to check the system's viability, a preliminary study was carried out with 20 participants of different ages. Six of the subjects were aged between 60 and 75 years, another six were aged between 40 and 59 years and the remaining eight were between 25 and 39 years. All participants gave their informed consent, in accordance with the ethics regulations for research on humans in the Declaration of Helsinki. The study has a favorable report from the Research Ethics Committee of the Universidad Politécnica de Cartagena (CEI19_009).
The study consisted of two parts. In the first, the precision of the ROM calculation using ExerCam in task mode was evaluated; in the second, the subjective usability of the system was evaluated by the participants and the therapist.
The tests were carried out mainly on laptop computers equipped with a webcam. Table 2 lists the specifications of the lowest-capacity equipment used, although it must be clarified that no study was carried out to determine the minimum requirements of ExerCam.

ROM Precision Evaluation
Measuring range of motion (ROM) is usually one of the first steps in an evaluation in musculoskeletal rehabilitation [48]. This measure is used to evaluate and classify joint deficiencies in patients and to indicate the results of rehabilitation programs. For ROM measurements for clinical practice, doctors and researchers mainly use goniometers, inclinometers and motion analysis systems based on markers in a controlled environment under the instructions of the medical staff [49].
The main objective of this part of the study was to analyze the accuracy of the ROM measurement obtained through the application in task mode. To this end, the ROM was measured for four representative tasks (movements): (1) left shoulder abduction (Table 3), (2) left elbow flexion (Table 4), (3) left hip abduction (Table 5) and (4) right knee flexion (Table 6). Each measurement was taken both with the ExerCam task mode and manually with a PVC goniometer. Measurements were taken for a series of three repetitions per task (S1, S2, S3), with the tasks parameterized with the initial and final angles indicated in Table 1 and a ten-degree increment between targets.
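This section does not detail how a joint angle is derived from the 2D pose. As a minimal sketch, assuming that the pose estimator returns (x, y) image coordinates for three keypoints (for example shoulder, elbow and wrist for elbow flexion), the angle at the middle joint can be computed from the two limb vectors, and a ROM value can be taken as the difference between the largest and smallest angle observed in a repetition. The function names and the keypoint format are assumptions, not the ExerCam implementation.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees (0 to 180) at vertex b formed by the points a-b-c.

    Each point is an (x, y) pair, assumed to come from a 2D pose estimator.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))


def range_of_motion(angles):
    """ROM of one repetition: spread between the extreme joint angles."""
    if not angles:
        raise ValueError("at least one angle is required")
    return max(angles) - min(angles)
```

Because only image coordinates are used, this estimate is reliable only when the movement happens roughly in the image plane, which is the usual limitation of a 2D approach.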
The following tables (Tables 3-6) show the measurements taken for each of the tasks (one measurement per repetition).

ExerPlayer Usability Evaluation
ExerCam, as well as other exergames, is an application oriented toward users, and as such, experience and satisfaction are important factors to take into account. For that reason, after the execution of the exercises, the system usability scale (SUS) [50] test was given to the therapist (see Table 7) and participants (see Table 8), allowing the assessment of the usability of the system from the perspective of the user.
The SUS test has a reasonable number of questions (10), is easily understandable and has been widely validated and adopted for technology-based systems such as ExerCam. Each question is rated on a Likert scale from 1 ("totally disagree") to 5 ("totally agree") measuring usability. To respond to this test, all participants were required to have tested the complete system (in other words, calibration mode, task mode and game mode). Additionally, a comments section was added to allow the participants and therapist to leave any other comment they considered had not been reflected in the questions. To evaluate the results of the SUS test, we used the adjectives method (see Figure 10), which associates the SUS score with an adjective scale (worst, poor, ok, good, excellent, best). Figure 11 shows the results obtained from each participant and their average. Values are considered acceptable when they correspond to the adjectives good, excellent or best (that is, a SUS score greater than 68); this threshold is indicated by a horizontal red line in Figure 11.
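For reference, the SUS score of one respondent can be computed from the ten Likert answers. The sketch below assumes the standard SUS scoring rule (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5 to give a value between 0 and 100); the function names are illustrative, and the acceptability threshold of 68 is the one stated above.

```python
def sus_score(responses):
    """SUS score (0 to 100) from ten answers on a 1 to 5 Likert scale.

    responses[0] is item I1, responses[9] is item I10.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if item % 2 == 1:
            total += answer - 1   # positively worded items (I1, I3, ...)
        else:
            total += 5 - answer   # negatively worded items (I2, I4, ...)
    return total * 2.5


def is_acceptable(score, threshold=68.0):
    """Acceptable usability: score above the threshold used in this study."""
    return score > threshold
```

Averaging `sus_score` over all respondents gives the mean value that is compared with the adjective scale.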

Conclusions
This paper studies the feasibility of applying 2D body pose estimation to the development of augmented reality exergames. To that end, the ExerCam application was developed. It works as an augmented reality mirror that overlays virtual objects on images captured by a webcam. Compared to the other works discussed in Section 1, ExerCam offers a new telerehabilitation system for both patients and therapists. Patients can perform therapy from home using a laptop, carrying out the tasks and exercises programmed by therapists. Therapists, in turn, can carry out a continuous and remote evaluation of the evolution of the patients. A 3D system could acquire patient information and monitor movements in game mode more accurately, but it would require a more complex setup. The novelty of the system resides in the use of a low-cost 2D vision system for human pose estimation to obtain an augmented reality mirror application. ExerCam can be executed on a laptop (and, in the future, on a tablet) without advanced devices, which makes it accessible to most people.
On the other hand, it has been possible to adapt a series of proposed exercises to a gamified scene, which avoids the feeling of monotony and boredom of traditional rehabilitation and motivates the participants to continue performing therapy at home. Although the accuracy of the ROM calculation through ExerCam has been demonstrated, its use in specific medical conditions has not been studied. Nevertheless, the application allows the creation of tasks that could promote physical exercise and gross motor skills.
As indicated in Section 3.1, in this first study a distance of ten degrees between targets was chosen for the tasks in task mode. This choice will ultimately be in the hands of the therapist who prescribes the tasks; for these first tests, 10 degrees was chosen because the maximum standardized error for ROM calculation applications is usually 12 degrees. With this spacing, the error of the ROM calculated in task mode from the target positions is at most 10 degrees. Changing the distance between targets changes this bound accordingly: with a spacing of 5 degrees, the maximum error in task mode with targets would be five degrees. It is important to note that, as expected, the highest precision with ExerCam is obtained when the ROM is calculated from the variation of the joint coordinates.
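The relation between target spacing and maximum error can be illustrated with a short sketch. It assumes that targets are placed at equally spaced angles from the initial to the final angle of the task and that the target-based ROM is the angle of the last target reached; both function names and this reading of task mode are assumptions made for illustration.

```python
def target_angles(start_deg, end_deg, step_deg):
    """Equally spaced target angles from start_deg up to end_deg.

    Assumes end_deg >= start_deg and a positive spacing.
    """
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    if end_deg < start_deg:
        raise ValueError("end_deg must not be smaller than start_deg")
    count = int((end_deg - start_deg) // step_deg)
    return [start_deg + i * step_deg for i in range(count + 1)]


def rom_from_targets(targets, true_angle_deg):
    """Angle of the last target reached, or None if no target was reached."""
    reached = [t for t in targets if t <= true_angle_deg]
    return reached[-1] if reached else None


# Illustrative values only (not study data): a movement that really reaches
# 87 degrees is reported as 80 with 10-degree spacing and 85 with 5-degree
# spacing, so the error stays below the spacing in both cases.
coarse = rom_from_targets(target_angles(0, 90, 10), 87)
fine = rom_from_targets(target_angles(0, 90, 5), 87)
```

Under these assumptions the target-based estimate never overshoots the true angle and underestimates it by less than one spacing, which is why a smaller spacing tightens the bound at the cost of showing more targets.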
In the results obtained from the calculation of the ROM using the variation of the joint coordinates, the measurement error of ExerCam is less than 3%, so the system is valid as a resource for measuring and evaluating the evolution of participants. It should also be taken into account that manual measurements carry a bias error that tends to decrease as the number of samples increases [51].
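A percentage error of this kind can be obtained by comparing each ExerCam measurement with the corresponding goniometer reading. The sketch below assumes that the goniometer value is taken as the reference and that the error is averaged over the repetitions of a task; the exact definition used in the study is not stated here, so this is an illustration rather than the authors' procedure, and the function names are hypothetical.

```python
from statistics import mean


def relative_error_percent(measured_deg, reference_deg):
    """Absolute error of a measurement as a percentage of the reference."""
    if reference_deg == 0:
        raise ValueError("reference angle must be non-zero")
    return 100.0 * abs(measured_deg - reference_deg) / abs(reference_deg)


def mean_relative_error_percent(measured, reference):
    """Mean percentage error over paired repetitions (e.g. S1, S2, S3)."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("expected two non-empty lists of equal length")
    return mean(
        relative_error_percent(m, r) for m, r in zip(measured, reference)
    )
```

For example, a reading of 97 degrees against a goniometer value of 100 degrees gives an error of 3%.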
Since user experience and satisfaction are important factors in a user-oriented application such as ExerCam, the SUS questionnaire was given to the participants and therapist after the execution of the exercises, allowing them to assess the usability of the system from the perspective of the user. In the results of the questionnaire, shown in Tables 7 and 8, it can be observed that most of the users answered that the application was easy to use (with Items I2, I3 and I8 having scores greater than 4). According to I7, most of the participants think that other people would be able to learn to use the application quickly. From the answers to I10, they do not think that it is necessary to acquire new knowledge or skills in order to use the application properly. According to I5 and I6, participants also think that the application is well engineered. Finally, the results obtained for I4 are notable, as some of the users think that they would need the support of a technical person to be able to use the application, whereas other users do not agree. This is due to the heterogeneity of the participants in the study, particularly their generational difference, with the oldest people being those who answered that they would need support. Regarding the responses to the SUS test given by the therapist, Table 7 shows that the therapist considers ExerCam useful and easy to use and would like to use the system frequently. The therapist also reinforces the hypothesis of the usefulness of this system as a telerehabilitation tool.
As shown in Figure 11, the total score obtained by each participant was positive, with an average of 83.5 (excellent according to the adjective scale method). This result is significant and provides a level of confidence in the use of ExerCam.
It is also worth mentioning that a generic usability test was used because, at this stage, the application is intended as a prototype to demonstrate the feasibility of 2D pose estimation in augmented reality exergames. Depending on the purpose of the applications developed, however, other questionnaires should be used. For example, for virtual rehabilitation applications, the USEQ test is available [52].
It would also be interesting to carry out extensive and prolonged clinical studies that allow the effects of the rehabilitation system to be verified experimentally.
Furthermore, ExerCam can be used for different purposes, allowing trainers or therapists to define tasks based on the appearance of a virtual target on the screen, which the player has to reach.
Author Contributions: All authors have collaborated and worked on the enhancement of the application, the study and writing of the article. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: To access the study data, contact paqui.rosique@upct.es.