CogniViTra, a Digital Solution to Support Dual-Task Rehabilitation Training

This article focuses on an eHealth application, CogniViTra, to support cognitive and physical training (i.e., dual-task training), which can be done at home with supervision of a health care provider. CogniViTra was designed and implemented to take advantage of an existing Platform of Services supporting a Cognitive Health Ecosystem and comprises several components, including the CogniViTra Box (i.e., the patient terminal equipment), the Virtual Coach to provide assistance, the Game Presentation for the rehabilitation exercises, and the Pose and Gesture Recognition to quantify responses during dual-task training. In terms of validation, a functional prototype was exhibited at a highly specialized event related to healthy and active ageing, and key stakeholders were invited to test it and share their insights. Fifty-seven specialists in information-technology-based applications to support healthy and active ageing were involved, and the results indicated that the functional prototype presents good performance in recognizing poses and gestures such as moving the trunk to the left or to the right, and that most of the participants would use or suggest the utilization of CogniViTra. In general, participants considered that CogniViTra is a useful tool and may represent an added value for remote dual-task training.


Introduction
Cognitive training is a non-pharmacological therapeutic approach aimed at the maintenance (e.g., older adults with neurological diseases) or the improvement (e.g., children in the school setting, to ameliorate problems associated with learning difficulties) of particular aspects of cognitive domains [1], such as attention, memory and learning, language, executive functions and calculation, and constructional and perceptive abilities [2].
A cognitive training program is usually made up of a set of different types of exercises, such as word searches, letters and numbers, or the recognition of patterns and geometric figures. The exercises can be performed individually or in groups, although their selection should comply with an individualized intervention explicitly focusing on a person's needs, with the difficulty level of activities adapted to the individual functioning, and must be or the onset of the disease) and with the intensity, quality, and duration appropriate for the stimulation of brain tissue plasticity [38].
Bearing in mind the enormous impact associated with neurological diseases, it is now widely recognized that the services and resources available at the level of health care systems are disproportionately scarce in this area [23,39]. Moreover, these systems are unable to cope with the pressure imposed by the increased prevalence of these diseases associated with the increased average life expectancy and population aging.
Despite these serious problems, there have been successive scientific advances that have resulted in the production of new knowledge in terms of genetic, molecular, cellular, physiological and behavioral processes, and neural networks that support brain plasticity and recovery [40,41]. This new knowledge has given rise to new types of therapies, namely therapies centered on the promotion and modulation of neuroplasticity, on learning and memory mechanisms or on neurogenesis and axonal regeneration, whether through pharmacological or non-pharmacological strategies [42][43][44]. In this context, neurorehabilitation is also changing as a result of the practical applicability of research in neurosciences and, consequently, is largely dependent on the development of more effective technologies that facilitate the extensive application of all the knowledge produced [45,46]. Training-based rehabilitation programs, regardless of their modality (e.g., cognitive, behavioral, or motor), the place of application (e.g., at the patient's home or at the institution), the mode of application (e.g., alone or in combination with pharmacological strategies), or the type of disease to be treated, require specific and controlled interventions, monitoring of patients, as well as clinical trials supported by direct measures of cognitive and motor functioning [46][47][48].
Dual-task training is being included in training-based rehabilitation programs, not only because the scientific literature reports positive effects [10][11][12][13][14][15][16][17][18], but also because daily living requires the ability to perform multiple cognitive and physical tasks simultaneously. In this respect, dual-task training is seen as an approach to prepare patients for adequately returning to community living (e.g., household, family, work, or leisure) [18,49]. Moreover, dual-task training has been proposed as a potentially effective strategy for reducing the risk of falling in the elderly population [50].
CogniViTra presents backend applications that are fundamental to helping practitioners with the management of clinical interventions, including the preparation of individualized intervention plans (e.g., definition of objectives, areas of cognitive intervention, individual composition of training sessions, and duration and intensity of treatment) and the monitoring of the intervention results. These features are not present in the identified studies and are impossible in some of them (e.g., the solution reported by [56] is a single-exercise solution, and the solution reported by [51] uses four commercial adventure games). Moreover, CogniViTra is supported by a clinically validated exercise battery, and its clinical data structures together with security mechanisms promote the integration of the corresponding interventions with the clinical care workflows. This means, for instance, that CogniViTra clinical assessments of specific patients might be available, when required, to the authorized multidisciplinary care team, safeguarding the privacy, integrity, and confidentiality of the information.
Furthermore, one of the most relevant features of the CogniViTra application is the automatic quantification and registration of the cognitive performance of the patients. This is fundamental for the practitioner assessment of the clinical intervention and surpasses difficulties associated with usual clinical cognitive interventions.
The aforementioned CogniViTra features result from the approach followed for the definition of its functional requirements. Contrary to other solutions, a major concern to consider for the development of CogniViTra was to contribute to bridging the gap between technological advancement and clinical relevance, a problem associated with dual-task rehabilitation programs, as indicated by [19].

Cognitive Health Ecosystem
To respond to the challenges of cognitive rehabilitation, some of the authors of this article were involved in the development of several eHealth applications that, taken together, constitute a Cognitive Health Ecosystem [21,22]. This development started in 2005 in a memory clinic that organized and delivered care to a 400,000 inhabitant population and was anchored in a hospital with clinical and research obligations.
An important contribution of this ecosystem is the minimization of critical sustainability issues for the cognitive health management of the population by increasing the capacity to identify early and monitor all individuals at risk for cognitive decline and providing early cognitive interventions with the adequate quality, intensity, and duration to obtain relevant functional effects. For this purpose, the Cognitive Health Ecosystem aims to disseminate innovative technological solutions to support different clinical domains:

• Screening: sustainable population-based cognitive screening strategies to allow the population at risk to be tracked, without requiring physical travel to specialized clinical centers or expensive radiology, nuclear, or molecular medicine exams;
• Diagnosis: solutions to optimize the global neuropsychological assessment process of patients and to improve the collection of data on cognitive functioning to reduce patient fatigue and the duration of assessments;
• Rehabilitation: strategies to allow individual or group cognitive training programs, using cognitive tasks and others that involve exercises and movement, ideally at home or in community-based institutions;
• Research: multicentric scientific studies facilitated by a translational network environment that promotes large sample sizes, while simultaneously shortening the time needed to recruit patients and complete the study. These studies carried out within the ecosystem aim to facilitate the rapid implementation of innovative processes and the mobility of the knowledge produced;
• Impact: articulation between the various domains described above to have a significant impact in terms of the cognitive health of the population served, measured by the levels of intellectual performance, social participation, and quality of life of the citizens.
In strategic terms, Cogweb [21] is the ultimate exponent of the available services, and approximately 10,000 patients had access to Cogweb-mediated remote cognitive rehabilitation.

Platform of Services
The key element of the Cognitive Health Ecosystem is a Platform of Services that supports different eHealth applications [21,22] and was based on two important requirements: (i) to increase the productivity and intervention capacity of the professionals and (ii) to move the clinical decision to the community, to ensure increased adherence and quality of the services.
Without a common and shared architectural standard for software, data formats, storage systems, modules, and information resources, development and integration are difficult and often promote replicated data in diverse data formats and storage systems. To overcome these difficulties, a service oriented architecture (SOA) [66] was considered. This architecture presents several advantages since it can optimize flexibility and interoperability of the components over the longer term. Furthermore, the use of an n-tier SOA allows the separation of business rules from the applications and the technologies that interpret them, which maximizes flexibility and minimizes the cost of accommodating changes in business rules and therefore allows the development of specific applications without significant development effort.
The generic architecture of the Platform of Services is divided into three layers (Figure 1): (i) the User Application Layer comprises a set of applications to fulfil the needs of all those involved in the Cognitive Health Ecosystem; (ii) the Backend Services Layer is responsible for all logic related to the interactions between the different applications and the data being persisted; and (iii) the Data Layer ensures the persistence of all information used in the remaining layers, and it is divided into a set of databases, each housing specific data about a given goal.

User Application Layer
All information collection and communication procedures are processed through the components of the User Application Layer. This layer is understood in a bidirectional sense in terms of information communication and admits specialization according to the characteristics of the users' profiles and the hardware used for access (e.g., a desktop or a tablet computer). Concerning users' profiles, besides the patients and the specialized practitioners (e.g., neuropsychologists), other types of users include informal caregivers, physicians, researchers (e.g., data analysts or biostatisticians), or those that are responsible for the ecosystem management. The specificities of the various types of users are omitted for clarity.
The cognitive interventional approaches are conducted by practitioners that play a central role in defining the degree of supervision and the type of patient exposure to the treatment, namely the definition of objectives, areas of cognitive intervention, individual composition of training sessions, or duration and intensity of treatment. Although the presence of practitioners is not continuously necessary for the training, they can actively direct all activities via online interaction and periodic (e.g., daily, weekly, or monthly) meetings.
Concerning the patients, the User Application Layer, in addition to allowing access to the online training area, contains educational content targeting the general population and a blog. The aim of the site is to provide scientific and pedagogical information about cognitive functioning and its changes, and the possibilities and indications for cognitive training.

Backend Services Layer
The main element of the Backend Services Layer is the Cognitive Games Engine, which comprises a pool of more than 100 original exercises (i.e., games) that are grouped according to the major cognitive function stimulated (i.e., attention, memory and learning, language, executive functions and calculation, constructional and perceptive abilities); cover different degrees of impairment, from normal function to moderate deficits; have sequential levels of difficulty, the progression through levels being automatic and determined by patients' performance; and are prepared not only for sessions with single patients, but also for group sessions.
A typical cognitive exercise is the digit span backwards [67], a mental tracking task that requires sustained attention, working memory, and information processing speed. A string of digits (e.g., 5-7-1-6-3) is presented at a fixed rate (e.g., one per second), and the patients are requested to repeat the string in reverse order. Each patient's digit span length is determined by the largest sequence length for which the patient scores at least 75%. Figure 2 presents four other possible exercises: faces and names, match the color, changing letters, and arrange the words. In the first exercise, faces and names, photos of several individuals with their corresponding names are presented in the beginning of the exercise; then, the different photos are individually presented, and the participants should indicate whether the given name is correct or not, considering the set of photos initially presented. In the second exercise, match the color, a sequence of two-word sets is presented, and the participants should indicate whether the words have the same color or not. Concerning the concrete example presented in Figure 2 of the changing letters exercise, the participants should replace the letter R with the letter N when the traffic light is green and, when the traffic light turns red, the participants should replace the letter Q with the letter P. Finally, in the arrange the words exercise, a sequence of words is presented, and the participants should classify them. For instance, considering the specific instance of a sequence of words presented in the arrange the words exercise of Figure 2, the participants should indicate that the dolphin is a marine animal and not an insect or a vegetable.
Some of these exercises are computerized versions of existing paper-and-pencil exercises, and others were created to meet specific requirements expressed by the practitioners and to exploit computer functionalities that would be particularly difficult to reproduce with a paper-and-pencil approach. During the dynamic generation of the exercises, the individual patients' performance data (e.g., accuracy and number of clues required) are analyzed to set the appropriate difficulty level. For each exercise and each level, thresholds are defined to allow difficulty levels to be progressively increased. Therefore, for each training session different parameters are retrieved, such as type of exercise, difficulty level, accuracy, and response time, which are used to assess both the overall outcome of a session and the global trend of the rehabilitation.
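The scoring of the digit span backwards exercise and the threshold-based progression through difficulty levels can be illustrated with a minimal Python sketch. It is not the Cognitive Games Engine implementation: the function names, the promotion threshold of 0.8, and the maximum level of 10 are hypothetical values chosen for illustration, and only the 75% pass mark comes from the description above.

```python
from collections import defaultdict


def is_reversed_correctly(presented, response):
    """A digit span backwards answer is correct when it is the presented string reversed."""
    return list(response) == list(reversed(presented))


def digit_span(trials, pass_mark=0.75):
    """Return the largest sequence length whose accuracy is at least pass_mark.

    trials is an iterable of (sequence_length, was_correct) pairs.
    Returns 0 when no sequence length reaches the pass mark.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for length, ok in trials:
        totals[length] += 1
        correct[length] += int(ok)
    passed = [n for n in totals if correct[n] / totals[n] >= pass_mark]
    return max(passed, default=0)


def next_level(level, accuracy, promote_at=0.8, max_level=10):
    """Move up one difficulty level once accuracy reaches the threshold.

    promote_at and max_level are hypothetical; the source only states that
    per-exercise, per-level thresholds exist.
    """
    if accuracy >= promote_at and level < max_level:
        return level + 1
    return level


# Four trials per sequence length: 4/4 correct at length 3, 3/4 at length 4, 1/4 at length 5.
trials = (
    [(3, True)] * 4
    + [(4, True)] * 3 + [(4, False)]
    + [(5, True)] + [(5, False)] * 3
)
span = digit_span(trials)
```

With these trials, the accuracy at length 4 is exactly 75%, so the computed span is 4, while length 5 (25%) does not count.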
Since the Platform of Services supports a collaborative network and can be accessed by a broad range of users that are distributed, both in geographical and in administrative terms, privacy, integrity, and confidentiality of the information need to be ensured in a transparent, easy-to-maintain way, supported by a fine granularity in the definition of the different authorization levels. For this purpose, the Backend Services Layer presents the following services:
• Authentication, to provide the identification of the users;
• Authorization, to regulate access to the information, including the establishment of access controls to limit personnel access, which is challenged by a diverse set of policies, complexity of workflows, and high risk of denying access to key information;
• Logging and Auditing, to trace which users look at which records so an auditor can use this information to detect abuses.
Concerning Authentication, users' credentials are analyzed, and notifications are stored by the Logging and Auditing service. Furthermore, for each authentication process, a JavaScript Object Notation (JSON) Web Token (JWT) is generated with several identifying elements of the entity trying to access the Platform of Services, to be analyzed by the Authorization service.
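To make the token step concrete, the following minimal sketch issues and verifies a JWT signed with HMAC-SHA256 (the HS256 algorithm) using only the Python standard library. It is an illustration under stated assumptions rather than the platform's code: the claim names (`sub`, `role`, `exp`), the choice of HS256, and the function names are hypothetical, since the article does not list the identifying elements carried in the token.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used by the JWT compact form."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the padding that was stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) < current:
        raise ValueError("token expired")
    return claims


# Hypothetical claims for a practitioner session.
secret = b"demo-secret-not-for-production"
token = issue_token({"sub": "practitioner-42", "role": "neuropsychologist", "exp": 2000}, secret)
claims = verify_token(token, secret, now=1000)
```

Pinning the accepted algorithm on the verifier side, instead of trusting whatever the token header announces, is a deliberate choice in this sketch.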
The main purpose of the Authorization service is to provide general mechanisms to control data access. In terms of the implementation, both role-based access control (RBAC) and attribute-based access control (ABAC) mechanisms were considered for the access to the data resources [68]. Each member of the cognitive intervention team has access to the clinical information of their patients. However, different types of practitioners (e.g., psychologists, neuropsychologists, speech therapists, and neurologists) have different roles according to their competencies and responsibilities, which is translated into different privileges (i.e., RBAC mechanisms), mainly in terms of information writing. However, when required in terms of integration with clinical workflows, subsets of the clinical information of specific patients should be shared with clinicians that are not members of the cognitive intervention team, according to established protocols. Conceptually, although it was not implemented, the patients themselves could authorize the access of parts of their clinical information to their caregivers (e.g., their general practitioners). Consequently, in addition to the RBAC mechanisms, ABAC mechanisms were also implemented.
The application of the RBAC and ABAC mechanisms requires the definition of information access policies, so that requests related to specific patients, information, or actions are analyzed through a set of pre-established rules (of a certain policy). In this respect, the eXtensible Access Control Markup Language (XACML) [69] was used since it is a standard for handling requests for actions, regarding information access, using ABAC policies, and with additional application logic it can also be used to implement RBAC mechanisms. With XACML, policies and their corresponding rules can have different levels of detail, according to the different number of attributes needed before reaching a decision.
Pre-established rules are divided into different categories. The decision for the access request is obtained after comparing all the values of the attributes of the appropriate categories, between the request and the previously defined rules. Depending on how a specific decision rule was defined, the result of a request might be a denial or a permit. Once a decision is achieved, it is communicated by the Authorization service to the Logging and Auditing services, to the entity that requested the information access, and to the logic that supports the Patient Health Record (Figure 1).
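The attribute comparison described above can be sketched in a few lines of Python. This is a simplified model and not XACML itself: the attribute names (`role`, `team_member`, `shared_by_protocol`), the three categories, and the example policy are hypothetical, and the sketch returns a denial when no rule applies, whereas a full XACML engine distinguishes further outcomes.

```python
from dataclasses import dataclass, field

PERMIT, DENY = "Permit", "Deny"
CATEGORIES = ("subject", "resource", "action")


@dataclass
class Rule:
    """A rule lists, per category, the attribute values a request must carry."""
    effect: str
    subject: dict = field(default_factory=dict)
    resource: dict = field(default_factory=dict)
    action: dict = field(default_factory=dict)

    def matches(self, request: dict) -> bool:
        for category in CATEGORIES:
            required = getattr(self, category)
            supplied = request.get(category, {})
            if any(supplied.get(name) != value for name, value in required.items()):
                return False
        return True


def decide(policy, request):
    """The first matching rule determines the decision; no match means Deny."""
    for rule in policy:
        if rule.matches(request):
            return rule.effect
    return DENY


# Hypothetical policy: the role restricts writing (RBAC-like rule), team members
# may read, and an attribute on the resource opens protocol-based sharing (ABAC-like rule).
policy = [
    Rule(PERMIT, subject={"role": "neuropsychologist", "team_member": True},
         resource={"type": "assessment"}, action={"id": "write"}),
    Rule(PERMIT, subject={"team_member": True}, action={"id": "read"}),
    Rule(PERMIT, subject={"role": "general_practitioner"},
         resource={"type": "assessment_summary", "shared_by_protocol": True},
         action={"id": "read"}),
]

therapist_write = decide(policy, {
    "subject": {"role": "speech_therapist", "team_member": True},
    "resource": {"type": "assessment"},
    "action": {"id": "write"},
})
gp_read_shared = decide(policy, {
    "subject": {"role": "general_practitioner", "team_member": False},
    "resource": {"type": "assessment_summary", "shared_by_protocol": True},
    "action": {"id": "read"},
})
```

In this example the speech therapist's write request is denied because no rule grants that role write access, while the general practitioner can read a summary that was shared under an established protocol.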
For the Logging and Auditing service, based on a previous work [70], a cryptographically secure blockchain was proposed to store the events and the blocks (of the blockchain) in a specific database, which can be accessed for auditing. The implementation of this blockchain mechanism did not rely on available frameworks such as Hyperledger, due to the intention to avoid external dependency in the long run on a commercial product. Instead, a custom solution was implemented.
The communication of any ecosystem entity with the Logging and Auditing service is first initiated with a process that aims to establish a security association. This process consists of the exchange of public keys, which are stored for the remainder of the time. The Logging and Auditing service may authorize or reject the association, as configured by the system administration, so that only authentic entities belonging to the infrastructure may generate log entries. For an authentic entity, the process culminates in the attribution of a unique shared cryptographic key, generated by the Logging and Auditing service for the purposes of encrypting the logs sent for storage. Each log entry is then encrypted and signed by the entity private key. The keys can be refreshed periodically, and a wide range of ciphers can be used. In the current implementation we resorted to Advanced Encryption Standard (AES) in the Galois/Counter Mode (GCM), as it provides encryption and integrity control. In addition, Hypertext Transfer Protocol Secure may also be used between the entities, but the cryptographic material will only be used for the establishment of a secure communication tunnel and will have no impact on the log entries.
Log insertion is triggered when an event occurs, usually as a consequence of an action executed by a user. The data describing the event are sent to the Logging and Auditing service in a Hypertext Transfer Protocol (HTTP) POST request, carried in the request body as a JSON object. Since requests must also have a JWT, the verification of signatures is made using the algorithm specified in the JWT together with a secret key, which allows the validation of the requested events. Moreover, the Logging and Auditing service decrypts the object and processes its contents, indexing different elements involved in the event (e.g., date, user identifier, application/service identifier) to construct an appropriate document and the corresponding metadata. Log integrity is validated locally on block insertion at fixed intervals as stated by a timer. Inserting a block requires both validating previous blocks in the chain, which can be done by the Logging and Auditing service or by an offloaded service, and then registering pending log entries. This creates an interdependence between log entries from multiple sources, limiting single source spoofing.
Each block contains the set of event entries that were received since the insertion of the last block. If there is a period without activity, a block will be also created, but its contents will be a random nonce, concatenated with the current timestamp. This way, nobody external to the ecosystem will know if there really was activity or not, since only the block's metadata are shown. The block effectively seals the chain, asserting its state at a given instant. If there are data, a cryptographic hash of the block is calculated, which also considers the content of the previous block, through its hash value. The block is then inserted into document-based storage, with multiple indexes, using the ElasticSearch indexer.
Auditing of log entries is the final process and can be conducted either automatically or by human observers. Existing tools such as Kibana can be used, and the integrity of the data observed can be ensured. The integrity of the chain can be verified at any time to check for any inconsistencies. Since every block of the chain has a clear identification, with fingerprints of the log data as well as time stamps of the first and last event during that time period, the hash of all the events that occurred between two checkpoints can be recalculated to check if it matches the one stored in the block. The next and previous hash fields on every block can also be used to verify that the block itself has not been tampered with.
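The block sealing and chain verification described above can be sketched as follows. This is a minimal illustration under several assumptions, not the implemented service: SHA-256 is assumed as the hash function, entries are treated as opaque strings (in the platform they would be encrypted log entries), every block is hashed including idle ones, only a previous-hash link is kept, and the field and function names are hypothetical.

```python
import hashlib
import json
import os
import time

GENESIS_HASH = "0" * 64  # placeholder link for the first block


def _digest(body: dict) -> str:
    """Hash a block body deterministically (keys sorted before serialization)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def seal_block(chain, pending_entries, now=None):
    """Append a block holding the entries received since the last block.

    When nothing was received, the content is a random nonce concatenated with
    the timestamp, so idle periods still produce a block.
    """
    now = time.time() if now is None else now
    content = list(pending_entries) or [os.urandom(16).hex() + str(now)]
    body = {
        "index": len(chain),
        "timestamp": now,
        "previous_hash": chain[-1]["hash"] if chain else GENESIS_HASH,
        "content": content,
    }
    block = dict(body, hash=_digest(body))
    chain.append(block)
    return block


def verify_chain(chain) -> bool:
    """Recompute every hash and check each link to the previous block."""
    previous_hash = GENESIS_HASH
    for block in chain:
        body = {key: value for key, value in block.items() if key != "hash"}
        if block["previous_hash"] != previous_hash or _digest(body) != block["hash"]:
            return False
        previous_hash = block["hash"]
    return True


chain = []
seal_block(chain, ["entry-1", "entry-2"], now=1.0)
seal_block(chain, [], now=2.0)           # idle period: nonce-only block
seal_block(chain, ["entry-3"], now=3.0)
intact = verify_chain(chain)
```

Because each block's hash covers the previous block's hash, altering any stored entry invalidates the block that holds it, and repairing that block's hash would in turn break the link stored in the following block.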
The Backend Services Layer also includes a data analytics framework of unidentified data generated by the users of the platform. Its main purpose is to analyze the quality of processes according to the standards established by a board of consulting clinicians. Long-term monitoring tools were incorporated into the data analytics to supervise clinical evolution and adjust programs according to the patients' progression (e.g., alerts, reports, benchmarking instruments, longitudinal analysis of evolution, prognostic and predictive models). Moreover, management algorithms were also implemented to guarantee scalability, so that some changes occur automatically, according to the results of the data analytics algorithms, such as evidence for the substitution of a useless exercise, the changing of system features that lead to errors, setting new automatisms such as alert signs for some clinical situations, or the preparation of specific educational campaigns for practitioners.

Data Layer
The Data Layer is divided into a set of databases, each housing specific data about a given goal. This division is intended to: (i) separate different types of information; (ii) ensure the autonomy of information that is persisted; (iii) increase security levels; and (iv) allow different technologies to be adopted for the implementation of different databases.
The main database is the Patient Health Record, where the practitioners may add patients' information to the system and manage it later. It includes the medical form (e.g., identification data, data on the duration of the license to use the system and patients' credentials to access it, and clinical data), the neuropsychological assessment (e.g., the general descriptive data of the evaluation and quantitative results obtained in each neuropsychological test, with the possibility to record several evaluations over time), and the intervention plan (e.g., treatment plan, including duration, main cognitive domains, and the expected intensity of training, which can be used for the detailed evaluation of the quality of the tasks prescribed to the patient in the next session).
In addition to the Patient Health Record, the Data Layer also includes all the databases required by the ecosystem, namely the users' database (including the practitioners), the games database (i.e., information on rehabilitation activities including complementary information, such as indication of a general-purpose exercise or specific use exercise, type of pathology, type of user, form of use, versioning, and translations) and the security database (e.g., credential, profiles, access policies or Logging and Auditing).

CogniViTra
The CogniViTra project seeks to develop and validate an eHealth application to support dual-task training both in clinical settings and at home. The CogniViTra implementation takes advantage of the flexibility of the SOA of the Platform of Services supporting the Cognitive Health Ecosystem.
Therefore, as can be seen in Figure 3, the CogniViTra implementation was based on the resources already available in the Platform of Services, including the components of the Backend Services (e.g., the security services, or the Cognitive Games Engine, which is responsible for the management of the various games, as well as the subsequent storage of the patients' results) and the Data Layer (e.g., patient, game, and security mechanism data). In terms of User Application Layer, the practitioners' interface is based on the already existing interface.  However, concerning the patients, new developments were made for the User Application Layer, including the CogniViTra Box, a hardware component to quantify individual and group responses and interactions during dual-task training (i.e., Pose and Gesture Recognition).
To take advantage of the CogniViTra Box, the CogniViTra Patient Interaction was also developed, which is responsible for the management of the data received from the CogniViTra Box and the presentation of the rehabilitation exercises. Additionally, given that the inclusion of physical exercises necessarily implies a greater physical distance from the patient in relation to the terminal equipment, an attempt was made to make the interaction mechanisms more flexible, namely in terms of availability of multimodal strategies and of an embodied conversational agent, which aims to provide a unified, accessible, and easy-to-use interface to ensure that the patients have a seamless interaction with the CogniViTra application. Figure 4 presents a simplified diagram of the CogniViTra Patient Interaction application with some of the components of the Platform of Services, the generic architecture of which is introduced in Figure 1.

CogniViTra Box
The concept that supported the design of the CogniViTra Box was the creation of an aesthetic and functional hub, serving as interface between the user and the CogniViTra application, without the need for a bulky computer.
As presented in Figure 5, the CogniViTra Box is connected to a TV. Moreover, a Universal Serial Bus (USB) keyboard and mouse, a Wi-Fi network, and an external speaker can be added. However, if the TV is connected through High-Definition Multimedia Interface (HDMI), the speaker will not be necessary. Figures 6 and 7 represent the CogniViTra Box in a higher level of detail, with the components that are encased, specifically the single board computer (SBC) (i.e., Intel/Aaeon Up Squared), the vision processing unit (VPU) (i.e., Intel Movidius Myriad X), the uninterruptible power supply (UPS) and perception devices (i.e., video and audio).

The hardware components require specific drivers or operating system versions and kernels. Therefore, a specific configuration is required for the components to work together:
• Because the UPS requires the latest version of the Ubuntu operating system (18.04 Long Term Support) and the VPU requires at least version 16, the first step is to install and activate the Ubuntu operating system on the main board;
• Once the operating system and kernel are installed, all peripheral ports are available, and the remaining drivers can be installed in any order, namely the latest version of OpenVINO, OpenCV 3.4.4, Python 3.7, Virtualenv, and the Intel RealSense Software Development Kit 2.0.
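As an illustration of how such a configuration could be verified after installation, the following minimal Python sketch checks the interpreter version and whether the expected packages can be found. It is not part of CogniViTra: the function name `check_environment` and the requirement table are hypothetical, and the module names (`cv2`, `openvino`, `pyrealsense2`) are assumed to be the Python packages through which the listed components are accessed.

```python
import importlib.util
import platform
import sys

# Hypothetical requirement table (an assumption for illustration):
# Python module name -> human-readable component name.
REQUIRED_MODULES = {
    "cv2": "OpenCV",
    "openvino": "OpenVINO",
    "pyrealsense2": "Intel RealSense SDK 2.0 (Python wrapper)",
}
MIN_PYTHON = (3, 7)


def check_environment(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a dict mapping each prerequisite to True (found) or False."""
    report = {"python": sys.version_info[:2] >= min_python}
    for module_name, label in modules.items():
        # find_spec returns None when a top-level module is not installed.
        report[label] = importlib.util.find_spec(module_name) is not None
    return report


if __name__ == "__main__":
    print("Platform:", platform.platform())
    for item, ok in check_environment().items():
        print(f"{item}: {'found' if ok else 'MISSING'}")
```

Such a check only confirms that the packages are importable; it does not verify driver or kernel compatibility, which still depends on the installation order described above.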
For the patient side of the system, it is preferable to have the software start-up as automatic as possible, with minimal human interaction or patient training required. To achieve this, because the CogniViTra Box is Ubuntu-based, the boot process was simplified to the minimum, and the only interaction required is the user login.
The start-up script includes initialization, the use of the correct workspace, the launching of the required software as a background task, and the activation of a browser with the CogniViTra website already open.
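The sequence of start-up steps can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the actual CogniViTra start-up script: the names `StartupConfig` and `start`, the workspace path, the service command, and the website address are all hypothetical placeholders.

```python
import subprocess
import webbrowser
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class StartupConfig:
    workspace: Path          # directory the services run from (hypothetical)
    service_cmd: List[str]   # command of the background software (hypothetical)
    url: str                 # address of the website to open (placeholder)


def start(config: StartupConfig, launch=subprocess.Popen, open_url=webbrowser.open):
    """Run the start-up steps in order and return the background process."""
    # 1. Initialization: make sure the workspace exists.
    config.workspace.mkdir(parents=True, exist_ok=True)
    # 2. Launch the required software as a background task in that workspace.
    #    Popen returns immediately, so the call does not block.
    process = launch(config.service_cmd, cwd=str(config.workspace))
    # 3. Open a browser with the website already loaded.
    open_url(config.url)
    return process
```

The `launch` and `open_url` parameters default to the standard library functions but can be replaced, which makes the sequence testable without starting real processes.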

Interaction Management
CogniViTra Patient Interaction (Figure 8) comprises three parts: (i) Digital Coach; (ii) Games Presentation; and (iii) Pose and Gesture Recognition. The first two components are presented in two iframes, one with fixed content (i) and the other with dynamic content (ii). The dynamic content iframe presents all the games as well as the web pages from which exercises can be selected, which work as normal Hypertext Markup Language (HTML) documents. The third component, Pose and Gesture Recognition, gathers feedback from the patients by recognizing predefined static poses and gestures and is configurable for use in a home or a clinical setting.

Digital Coach
The Digital Coach is based on a 3D human embodied conversational agent (named Rachel) designed to transmit a comfortable feeling and able to communicate using natural language, with which the patients can interact using a multimodal interface, including automatic speech recognition and a graphical touch-based user interface. To closely simulate the human conversational behavior, the Digital Coach includes components for speech recognition, synthesis of speech, sound and movement, synchronized non-verbal behaviors such as head nods and facial expressions, and dialogue management. To enable speech interaction, the microphone and high-definition speakers of the CogniViTra Box are used.
The Digital Coach was developed using SmartBodyJS.js (a compiled JavaScript library based on SmartBody [71] that allows quick development of virtual agents). Behavior and locomotion are controlled using Behavioral Markup Language (BML) tags. Poses and gestures are important to strengthen the interaction between the user and the Digital Coach, as they are a form of non-verbal communication. In this respect, SmartBody has a set of pre-compiled character behaviors such as postures, animations (e.g., jogging, walking, or playing guitar), and gazes (e.g., staring at an object). Moreover, in terms of speech synthesis, SmartBody allows the synchronization of the audio input with the movements of the embodied conversational agent, namely mouth movements, which contributes to the naturalness of the interaction.

Games Presentation
The Games Presentation was developed primarily with web technology, using a combination of HTML, Hypertext Preprocessor (PHP), and JavaScript. All pages are rendered on the Platform of Services side to guarantee that all sensitive and private information remains safe on the backend. For this reason, PHP is used extensively to communicate with the Platform of Services, and, in turn, from the Platform of Services side, JavaScript was used to implement the Cognitive Game Engine, which handles the most dynamic details of the pages and controls and operates all cognitive games available in the system (Figure 9). These three elements are internally linked by the PHP session system, which retains all information regarding client authentication and identification during the operation period.
Looking at the Cognitive Game Engine from the Platform of Services side, the JavaScript elements are:
• Game Animation Library, which uses CreateJS, a set of modular libraries and tools that work together or independently to enable interactive web content;
• Cognitive Game Engine, the central point of the system, responsible for managing several important elements of the exercise system, such as game loading and subsequent presentation on the canvas;
• Game Script, the file with the specific logical instructions of the game (e.g., operation or winning conditions), developed and generated in Adobe Animate.
The Cognitive Game Engine provides classes and methods responsible for various tasks related to the game presentation process, graphic interface around the game (inside the canvas), and communication with the Platform of Services. These tasks vary over the different phases of the games' operation as presented in Figure 10.
During the initial stage, the game canvas is loaded on the page, and the Cognitive Game Engine commences. Then, the basic game information and the necessary resources for the game such as images, sounds, and message texts are loaded as well as the game script.
The game starts with the presentation of the title on the display for a few seconds (e.g., exercise title screen of Figure 10), which is followed by the presentation of the game instructions on the screen (e.g., exercise instruction screen of Figure 10). Then, the game enters a standby mode and only starts when user input occurs.
After the user input occurs, the game script becomes responsible for the game, while the engine works in the background: it handles the click events of the pause and instruction buttons by displaying the corresponding pause or instruction window (e.g., Figure 10), monitors the responses given by patients to update the results and to adjust the game difficulty level between attempts, and determines whether the game will end. When the game ends, the Cognitive Game Engine evaluates whether the exercise session has ended. If not, the cycle starts over for the next game. Otherwise, the session ends, the backend receives the end-of-session information, and the user is sent to the patient portal.
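The cycle of phases described above can be summarized as a small state machine. The sketch below is written in Python for illustration only (the actual engine is implemented in JavaScript), and the class name `EngineSketch`, the fixed number of attempts per game, and the rule that raises or lowers the difficulty by one level per attempt are assumptions rather than the documented behavior of the Cognitive Game Engine.

```python
from enum import Enum, auto


class Phase(Enum):
    LOADING = auto()        # canvas, resources, and game script are loaded
    TITLE = auto()          # title shown for a few seconds
    INSTRUCTIONS = auto()   # instructions shown on the screen
    STANDBY = auto()        # waiting for user input
    PLAYING = auto()        # game script in control, engine monitoring
    SESSION_OVER = auto()   # end of session reported to the backend


class EngineSketch:
    """Illustrative model of the game cycle; not the real engine API."""

    _AUTOMATIC = [Phase.LOADING, Phase.TITLE, Phase.INSTRUCTIONS, Phase.STANDBY]

    def __init__(self, games, attempts_per_game=3):
        self.games = list(games)
        self.attempts_per_game = attempts_per_game  # assumed fixed per game
        self.index = 0          # position of the current game in the session
        self.attempts = 0       # attempts made in the current game
        self.level = 1          # current difficulty level
        self.results = []       # (game, level, correct) for each attempt
        self.phase = Phase.LOADING

    def advance(self):
        """Step through the automatic phases that precede play."""
        if self.phase in self._AUTOMATIC[:-1]:
            self.phase = self._AUTOMATIC[self._AUTOMATIC.index(self.phase) + 1]

    def user_input(self):
        """The game only starts when user input occurs in standby."""
        if self.phase is Phase.STANDBY:
            self.phase = Phase.PLAYING

    def record_attempt(self, correct):
        """Store a response, adjust difficulty, and check for game end."""
        if self.phase is not Phase.PLAYING:
            return
        self.results.append((self.games[self.index], self.level, correct))
        # Assumed rule: one level up after a correct answer, one down otherwise.
        self.level = self.level + 1 if correct else max(1, self.level - 1)
        self.attempts += 1
        if self.attempts >= self.attempts_per_game:
            self._end_game()

    def _end_game(self):
        """Start the cycle again for the next game or end the session."""
        self.index += 1
        self.attempts = 0
        self.level = 1
        if self.index < len(self.games):
            self.phase = Phase.LOADING
        else:
            self.phase = Phase.SESSION_OVER
```

Keeping the phase transitions in one place mirrors the division of responsibilities in the text: the game script decides what a correct response is, while the engine decides when a game or a session is over.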

Pose and Gesture Recognition
Pose and gesture language such as head nodding, body postures, and hand gestures is an effective communication channel in human-human collaboration and can be categorized into three types with respect to which part of the body is engaged [72]. According to the requirements of CogniViTra, the Pose and Gesture Recognition module was designed to recognize all three types of poses and gestures: full body poses, actions, or motions (e.g., 'trunk displacement to the left' or 'trunk displacement to the right'), arm poses and hand gestures (e.g., 'arms raised', one arm or both), and nodding or shaking the head (e.g., 'head tilt to the left' or 'head tilt to the right').
The process of Pose and Gesture Recognition can be divided into four essential steps:
• Sensor data collection: the raw data of poses and gestures are captured by sensors;
• Pose and gesture identification: in each frame, a pose or a gesture is identified from the raw data;
• Pose and gesture tracking: the located skeleton is tracked during body movement;
• Pose and gesture classification: the tracked pose or gesture is classified according to predefined pose and gesture types.
The sensors that can be utilized in Pose and Gesture Recognition can be classified into two main groups, image-based and non-image-based [73]. In turn, the image-based technologies can be further categorized into four different classes, depending on the type of sensor being used: approaches using markers [72], a single camera [74], stereo cameras [75], and depth sensors [76][77][78]. Although image-based methods have been dominant in recognizing poses and gestures for several decades, recent developments in sensors have opened new possibilities for non-image-based Pose and Gesture Recognition methods. Several approaches rely on wearable sensors, such as glove-based gestural interfaces [79] and band-based sensors using a wristband or a similar wearable device [80][81][82][83]. Furthermore, a third type of non-image-based technology adopts non-wearable sensors, which can detect poses and gestures without contacting the human body, for example using radio frequency sensors or radar [84][85][86][87][88]. In the CogniViTra implementation, the CogniViTra Box provides a single camera and depth sensors to perform the required data collection.
Identification is the first stage in Pose and Gesture Recognition, after the raw data acquisition by sensors. In the present implementation, a skeleton model approach was applied, which employs a human skeleton to distinguish human body poses and to simplify pose and gesture classification, as shown in Figure 11 [89]. Pose and gesture tracking, i.e., the process of finding temporal correspondences between frames and continuously tracking a pose or gesture identified in previous frames into the current frame, is not required in the current implementation of CogniViTra, since the focus is on static poses and gestures that can be represented by a single frame.
Pose and gesture classification is the last and arguably the most decisive step in recognition. It can be addressed with many popular artificial intelligence and machine learning algorithms, including K-nearest neighbors [90,91], hidden Markov models [92,93], support vector machines [94][95][96][97], artificial neural networks [98][99][100], and deep learning algorithms, of which two are currently popular, namely convolutional neural networks and recurrent neural networks [101].
In CogniViTra, our approach was to adopt existing open-source frameworks and tools, in particular the available implementations based on convolutional neural networks, for which OpenPose laid the foundations with the part affinity field (PAF) algorithm proposed by the authors of this tool [102]. In Figure 12, we depict the classes that are used in the software to analyze the patients' poses and gestures.
The algorithm that maps the skeleton output to a pose is based on the positions of the 18 skeleton key points relative to each other. This approach gives the patients more freedom of movement when answering the cognitive challenges, without restricting them to a very rigid position, since there is an area within which each specific pose is considered valid. For example, by requiring the wrist key point to be above the eye key point by a certain threshold distance, there is a large area in which the 'arm up' answers are recognized. This minimizes incorrect answers due to the patient's physical difficulties (e.g., shoulder impingements).
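The 'arm up' rule can be illustrated with a short Python sketch. It is a simplified example under stated assumptions and not the CogniViTra classifier: the key point names (`left_wrist`, `left_eye`, and so on), the function names, and the 20-pixel margin are hypothetical, and image coordinates are assumed to have the y axis growing downward, as is usual for camera frames.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels; y grows downward


def is_arm_up(keypoints: Dict[str, Optional[Point]], side: str,
              margin: float = 20.0) -> bool:
    """True when the wrist is at least `margin` pixels above the eye."""
    wrist = keypoints.get(f"{side}_wrist")
    eye = keypoints.get(f"{side}_eye")
    if wrist is None or eye is None:
        # A key point that was not detected cannot support a decision.
        return False
    # "Above" means a smaller y value in image coordinates.
    return eye[1] - wrist[1] >= margin


def classify_arms(keypoints: Dict[str, Optional[Point]],
                  margin: float = 20.0) -> str:
    """Map the skeleton to one of the 'arms raised' answers."""
    left = is_arm_up(keypoints, "left", margin)
    right = is_arm_up(keypoints, "right", margin)
    if left and right:
        return "both arms raised"
    if left:
        return "left arm raised"
    if right:
        return "right arm raised"
    return "no arm raised"
```

Because only the vertical relation between two key points is tested, any wrist position above the threshold line is accepted, which corresponds to the large valid area mentioned in the text.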
In the case of 3D movements such as punches, the approach is similar; however, both frames (i.e., the red, green, and blue (RGB) frame and the depth frame) are aligned and their data merged in order to determine the depth of each key point, making it possible to detect, for instance, a wrist with a lower depth than the neck.
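A depth-based test of this kind can be sketched as follows. The sketch assumes that the depth frame has already been aligned to the RGB frame, so that the pixel coordinates of a key point index both images, and that depth is expressed in meters with non-positive values marking invalid readings. The function names and the 0.30 m reach threshold are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates in the aligned frames


def depth_at(depth_image: Sequence[Sequence[float]], point: Point) -> Optional[float]:
    """Read the depth (in meters) at a key point, or None if out of bounds."""
    x, y = int(round(point[0])), int(round(point[1]))
    if 0 <= y < len(depth_image) and 0 <= x < len(depth_image[0]):
        return depth_image[y][x]
    return None


def is_punch(wrist_depth: Optional[float], neck_depth: Optional[float],
             min_reach: float = 0.30) -> bool:
    """True when the wrist is at least `min_reach` meters closer than the neck."""
    if wrist_depth is None or neck_depth is None:
        return False
    if wrist_depth <= 0 or neck_depth <= 0:
        # Assumption: non-positive values mark pixels without a valid reading.
        return False
    return neck_depth - wrist_depth >= min_reach
```

For example, with the neck at 2.0 m and the wrist at 1.5 m from the sensor, the difference of 0.5 m exceeds the assumed threshold and the movement counts as a punch.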
Hence, the accuracy of the exercises is not measured in terms of very specific pose patterns but rather is evaluated within an area or a volume in a broader space referenced in the sensor's frames. This approach was considered adequate for the specific application scenario of CogniViTra, since it is acceptable to compromise precision while ensuring accuracy: it is preferable to identify the pose, even if it deviates from the ideal form, instead of discarding it and defeating the intended aim of the rehabilitation exercise being performed.

Methods
A peer review assessment [103] was conducted to gather the opinion of peers about CogniViTra. A prototype was prepared to be exhibited at a highly specialized event related to healthy and active ageing, and key stakeholders were invited to test it and to share their opinion. The event chosen was the AAL FORUM 2019 [104], which took place in the city of Aarhus in Denmark in September 2019. The audience included technological developers with an interest in healthy and active ageing, health care and social care providers, investors looking for new solutions and innovations, individuals participating in research projects, and European, national, and regional decision-makers in the fields of health care, social issues, and technological innovations [20].
Participants comprised a convenience sample of individuals recruited using the hallway technique (i.e., the participants that walked by the CogniViTra stand were invited to participate). The defined inclusion criteria were being a participant in the AAL Forum and consequently being a stakeholder for the development of applications based on information technologies to support healthy and active ageing, understanding the study, accepting to voluntarily participate in the study, and giving informed consent.
All steps were taken to protect participants' privacy, and all relevant rules on data privacy were followed. Information allowing identification of participants was not captured. The participants were distinguished in the study documents by a unique running number. The researchers involved in conducting and reporting the study were bound by professional secrecy. The principal investigator was responsible for making sure that all members of the team were aware that they must not reveal information obtained through access to research data to anyone outside the scientific research team.
Prior to the interaction with the application, the participants received all information about CogniViTra, the assessment objectives, duration, and methods. A member of the research team explained that participants could request additional information about the study at any moment and abandon the study at any time without any explanation or personal prejudice. Moreover, all participants completed and signed an informed consent form.
Each test followed three stages: (i) pre-test: the participants received all the information about CogniViTra and clarified all the questions they had; (ii) test: the participants interacted freely with CogniViTra for as long as they wanted and performed a dual-task exercise (i.e., answering a set of cognitive tasks with physical movements); and (iii) post-test: the participants were asked to fill in an opinion questionnaire on a tablet and to share their insights regarding the CogniViTra prototype.
The questionnaire used to gather the data was specifically created for this study and included sociodemographic information, namely country, age and gender, opinions about CogniViTra, expectation of use, perception of its market value, suggestions for improvement, and other comments.

Prototype Setup
Different living room configurations were considered for the two testing setups, resulting in a 'high setup' and a 'low setup' (Figures 13 and 14). The two configurations differ only in the height at which the CogniViTra Box is placed in front of the user: about 0.9 m in the 'high setup' and 0.45 m in the 'low setup', measured from the ground plane. A distance of two meters from the CogniViTra Box is recommended to optimize pose and gesture detection.
In the exercise selected for the experiment (Figure 15), the participants had to perform an attention task, comparing the figures in the left and right boards and choosing between the 'equal' or 'different' options by performing two different movements.
To prepare the experimental set-up, a set of nine different poses and gestures was considered: 'arms raised' (singular or both), 'head tilt/twist' (both sides), 'punch movement' (both arms), and 'trunk displacement' (to the left and to the right). As an example, Figure 16 presents the pose and gesture 'arm raised' (singular and both), where the images represent the inference output from a pre-trained model with depth expansion, over the RGB stream with a depth-based background subtraction.
As a testing procedure, we conducted trials with five different subjects performing a sequence of pose and gesture combinations. Each pose and gesture combination was repeated 20 times in each of the two box installation setups. The conditions of the test sites simulated different living room environments (e.g., cluttered background, different lighting conditions, or different furniture layout). To count as a positive recognition of a pose and gesture, the same classification result had to be maintained over five frames, so that the system could identify a clear result.
Additionally, as a pre-condition to the testing, the users were instructed on how to correctly perform a pose and gesture, and they had a feedback mechanism available (e.g., visual feedback of their skeleton model as perceived by the system) to avoid errors due to the human factor. Accordingly, we conducted the trials always making sure that the user was performing the sequences, holding still for one second (i.e., corresponding to grabbing five frames), and not being influenced by uncertainty that could be caused by incorrect skeleton detection.
The testing results were different in the two set-ups. While in the 'low setup' CogniViTra was 100% successful for all trials of all participants, in the 'high setup' the same performance was achieved only for the poses and gestures 'arm raised' (singular and both), 'look left', and 'look right'. For the remaining poses and gestures, the success rate was also high, being above 90%. According to these results, a comprehensive set of postures and gestures could be used for the peer review. After brainstorming, it was decided that the participants in the peer review should move the trunk to the left if they wanted to select the 'equal' option or to the right if they wanted to select the 'different' option.
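The five-frame criterion used to accept a recognition can be sketched as a small filter over the per-frame classification results. This is an illustrative Python sketch rather than the code used in the trials; the class name `StablePoseFilter` is hypothetical, and a frame without a recognized pose is assumed to be reported as `None`.

```python
from collections import deque


class StablePoseFilter:
    """Accept a pose only after it is held for N consecutive frames."""

    def __init__(self, required_frames=5):
        self.required_frames = required_frames
        # The deque keeps only the most recent `required_frames` labels.
        self.window = deque(maxlen=required_frames)

    def update(self, label):
        """Feed one per-frame label; return it once stable, else None."""
        self.window.append(label)
        stable = (
            label is not None
            and len(self.window) == self.required_frames
            and all(item == label for item in self.window)
        )
        return label if stable else None
```

With five frames grabbed per second, as in the trials, a pose has to be held still for about one second before it is accepted, and a single deviating frame restarts the count.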

Results of the Conceptual Validation
In the conceptual validation, the methodological approach of which is presented in [20], a sample of 57 peers from 16 different countries participated (Figure 17). The average age was 41.5 years (Standard Deviation = 11.4); the oldest participant was 66 years old, and the youngest was 24 years old. Regarding gender, 32 participants (56%) were male, and 25 (43%) were female.
Most of the participants (n = 53) expressed that they would use or suggest the utilization of CogniViTra. Of those, six participants (10%) stated that they would use it themselves for cognitive and physical training, while 23 (40%) mentioned that they would suggest this to a family member or a friend. Furthermore, 14 participants (25%) mentioned the intention to use it in their professional activities, and 10 (18%) would adopt CogniViTra in their organization so other staff members could use it (Figure 18).

Regarding the problems that the participants considered CogniViTra helps to solve (Figure 19), the most valued aspect was the possibility of remote follow-up by health care providers (27%), followed by the possibility to extend the clinical setting to the home (24%) and increasing the number of training sessions (22%). The least valued aspects were the reduction of the costs of transportation to clinical settings (12%) and the closer follow-up from family and other informal carers (15%). Finally, concerning the price that each participant would be willing to pay to use the system, most participants answered up to EUR 25, followed by the range of EUR 25 to EUR 50.
Participants made relevant comments and suggestions about the prototype: (i) inclusion of different difficulty levels to cover the entire performance spectrum, from very easy levels so that people with cognitive or physical limitations can use it, to very difficult levels so that high-performing people are challenged to better themselves; (ii) improvement of the system response performance; (iii) revision of the icons and instructions, making them more appealing and intuitive; (iv) improvement of the exercise duration by including the option of doing shorter exercises, with 10 or 20 trials; and (v) improvement of the user interaction, namely the interaction with the Digital Coach, which should be further developed to greet the user, suggest drinking water, and give instructions.

Conclusions and Future Work
To achieve cognitive health benefits in the population, key clinical operations such as monitoring and assessment of cognitive functions and cognitive rehabilitation must be optimized. In this respect, the use of information technologies is seen as an opportunity to incorporate changes in operating processes of health systems. However, this incorporation currently suffers from a serious implementation problem [105], and there is an enormous difficulty for implementation and dissemination on a sufficient scale to allow the translation of value to the health level of the populations. This problem is not exclusive to cognitive health; it occurs in all fields of medicine and research, and for both pharmacological and non-pharmacological solutions or even innovations in terms of technologies, processes, and organization.
In this context, the study reported by this article aimed to take advantage of the flexibility of the existing resources of a Platform of Services, namely backend services, to optimize the implementation of an application for dual-task training, as well as to promote the integration of an existing Cognitive Health Ecosystem. This approach not only allowed the optimization of the quality of the implementation and the minimization of the associated development costs, but also the availability of a set of clinically validated cognitive exercises.
The major development efforts were concentrated in the interaction mechanism for the patients. A new user interface was developed that comprises a Digital Coach and two additional modules, Game Presentation and Pose and Gesture Recognition. The patients' interaction is supported by specific hardware, the CogniViTra Box, which replaces the traditional computer and provides the hardware features to recognize poses and gestures.
An observational study was conducted to verify the viability of the dual-task training application. For this purpose, a functional prototype was exposed in a highly specialized event related to healthy and active ageing, and 57 participants were invited to share their opinion after testing the application. Positive feedback for the CogniViTra application was obtained from the participants. In general, participants considered that it is a useful application and may represent an added value for dual-task training, including cognitive and physical activities.
The fact that more than 90% of the participants expressed that they would use CogniViTra or suggest its utilization is an important indicator of a high level of acceptance and of the great potential of this application. Another interesting result is that some of the participants mentioned that they would use it themselves, which suggests a change in mentality toward investing in preventive health and in cognitive and physical training throughout the lifespan.
Another interesting contribution was the list of problems that CogniViTra helps to solve. The most valued aspects were the possibility of remote follow-up by health care providers, the extension of the clinical setting to the home, and the increase in the number of training sessions, followed by a closer follow-up from family and informal carers and the reduction of transportation costs.
Although the results of this study were very positive and encourage further development, improvements to the system should be implemented before starting tests with real users. The participants' suggestions focused on the difficulty level, the system response performance, the intuitiveness of icons and instructions, the user interaction, the exercise duration, and the inclusion of a suggestion to drink water. Most suggestions were in line with the requirements previously defined for CogniViTra and were already planned to be implemented in future versions of the prototype.
This article reports the first stage of the CogniViTra assessment to verify the viability of the application. For the second stage, a study involving users interacting with the application prototype in a controlled environment is planned to assess its usability. Finally, in the third stage, a multicentric clinical trial will be conducted in Portugal, Luxembourg, and Spain to assess the adherence, efficiency, and efficacy (i.e., lifestyles and quality of life changes) of the dual-task training supported by CogniViTra.
A total of 180 participants (90 for the experimental group and another 90 for the control group) are estimated to be enrolled. According to the trial design, the intervention plan will have a duration of 12 weeks and include individual sessions (i.e., CogniViTra activities performed by a single participant in a clinical setting or at home) and group sessions (i.e., CogniViTra activities performed by various participants in a clinical setting). In terms of exercises, the dual-task training will encompass exercises related to different cognitive domains, and the patients' interactions will be based on physical exercises with different complexity levels (e.g., to move the arm towards the opposite hand, to move an arm towards the floor, or to crouch with hands on the knees and stand up while opening the arms).
In terms of outcomes, the clinical trial will compare the CogniViTra strategy to standard of care, in terms of (i) time spent on cognitive training and physical and social stimulation activities per participant; (ii) access to specialized activities per hour of specialized human resources; (iii) neuropsychiatric morbidity; (iv) quality of life; and (v) compliance with behaviors that reduce the individual risk for a brain disease.
CogniViTra was conceptualized and designed prior to the emergence of the coronavirus pandemic, which led to the confinement of part of the world population. This situation made the need for integrated solutions that allow continued care at home with provider support, and for advanced, easy-to-use eHealth applications such as CogniViTra, even more imperative.