Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots

Khan, Muhammad Nasir; Altalbe, Ali; Naseer, Fawad; Awais, Qasim

doi:10.3390/s24041273

Open AccessArticle

Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots

¹

Electrical Engineering Department, Government College University Lahore, Lahore 54000, Pakistan

²

Department of Computer Engineering, Prince Sattam bin Abdulaziz University, Alkharj 11942, Saudi Arabia

³

Faculty of Computing and Information Technology, King Abdulaziz University, P.O. Box 80210, Jeddah 21589, Saudi Arabia

⁴

Computer Science and Software Engineering Department, Beaconhouse International College, Faisalabad 38000, Pakistan

⁵

Electrical Engineering Department, Fatima Jinnah Women University, Rawalpindi 46000, Pakistan

^*

Authors to whom correspondence should be addressed.

Sensors 2024, 24(4), 1273; https://doi.org/10.3390/s24041273

Submission received: 19 December 2023 / Revised: 11 February 2024 / Accepted: 15 February 2024 / Published: 17 February 2024

(This article belongs to the Special Issue Smart Environments for Health and Well-Being)

Download

Browse Figures

Versions Notes

Abstract

Due to damage to the network of nerves that regulate the muscles and feeling in the shoulder, arm, and forearm, brachial plexus injuries (BPIs) are known to significantly reduce the function and quality of life of affected persons. According to the World Health Organization (WHO), a considerable share of global disability-adjusted life years (DALYs) is attributable to upper limb injuries, including BPIs. Telehealth can improve access concerns for patients with BPIs, particularly in lower-middle-income nations. This study used deep reinforcement learning (DRL)-assisted telepresence robots, specifically the deep deterministic policy gradient (DDPG) algorithm, to provide in-home elbow rehabilitation with elbow flexion exercises for BPI patients. The telepresence robots were used for a six-month deployment period, and DDPG drove the DRL architecture to maximize patient-centric exercises with its robotic arm. Compared to conventional rehabilitation techniques, patients demonstrated an average increase of 4.7% in force exertion and a 5.2% improvement in range of motion (ROM) with the assistance of the telepresence robot arm. According to the findings of this study, telepresence robots are a valuable and practical method for BPI patients’ at-home rehabilitation. This technology paves the way for further research and development in telerehabilitation and can be crucial in addressing broader physical rehabilitation challenges.

Keywords:

telehealth; brachial plexus injuries; telepresence robots; rehabilitation; elbow flexion exercise

1. Introduction

Telehealth emerges as a promising solution to the barriers faced in accessing rehabilitation services for BPIs in lower-middle-income countries. Telehealth can bridge the gap between rural patients and quality healthcare services by leveraging technology. The need for telehealth is further compounded by current global trends, where, according to the World Health Organization, DALYs attributable to upper limb injuries, including BPIs, have seen an estimated increase of 3% over the last decade. In countries like Pakistan, where 64% of the population resides in rural areas [1], implementing telehealth services can significantly reduce the burden of disability and improve the overall quality of life for individuals with BPIs.

Technologies are used to remotely provide clinical information and health-related services in telehealth, a fast-developing area of healthcare. This field has many uses, one of which is rehabilitation, where robotics is very important. Through interactive and customized rehabilitation experiences, robotic technologies in telehealth go beyond conventional telemedicine. Relatively speaking, these remote-controlled robotic systems allow for more accurate and consistent therapy sessions than traditional in-person therapies offer. In particular, for conditions like BPIs, where consistent and targeted exercises are essential for recovery, this robotics integration with telehealth has revolutionized patient care.

A network of nerves called the brachial plexus emerges from the spinal cord, passes through the armpit and neck, and then splits off to become the nerves responsible for upper limb sensation and muscle control. It includes the muscles and skin of the chest, shoulder, arm, and hand as well as the roots, trunks, divisions, cords, and branches that supply that innervation. Many impairments can arise from BPIs, which are typically caused by trauma, tumors, or inflammation. These can include, but are not limited to, paralysis in extreme cases, loss of feeling, and muscle weakness, as shown in Figure 1. Such injuries can have a significant negative effect on a patient’s functionality and quality of life, requiring specialized care and rehabilitation techniques. BPIs can occur due to various reasons such as trauma, tumors, or inflammation.

For instance, motor vehicle accidents, especially motorcycle crashes, account for a substantial percentage of traumatic BPIs, as shown in Figure 2, which provides an essential visual summary of the varied mechanisms leading to BPIs, with each panel (A–F) depicting a distinct accident scenario that highlights the complexity and diversity of BPI causes, aiding in quick and clear comprehension. These injuries can range from minor, which might involve stretching the nerves, to severe cases, such as avulsion, where the nerve roots are torn from the spinal cord. The severity and location of the damage significantly influence the functional outcome. According to a study conducted in [2,3], over 60% of BPI patients suffer from impairments in activities of daily living, with an estimated 27% facing significant chronic pain.

Rehabilitation plays a critical role in the management of BPIs. Early intervention and a well-structured rehabilitation program can significantly improve affected individuals’ functional outcomes and quality of life [4]. The rehabilitation process usually involves physical therapy, occupational therapy, and sometimes surgical interventions. Physical therapy focuses on maintaining the range of motion, reducing pain, and strengthening the muscles around the shoulder and arm. Occupational therapy is vital for enabling the patient to regain the ability to perform daily activities [5].

While rehabilitation is crucial, access to quality healthcare and rehabilitation services remains a significant challenge, especially in lower-middle-income countries. Rural areas in these countries often lack the necessary infrastructure and qualified healthcare professionals to provide specialized care for patients with BPIs. Additionally, travel to urban areas with better healthcare facilities is often not feasible due to financial constraints and the debilitating nature of the injury. Consequently, many patients with BPIs in these areas do not receive the much-needed rehabilitation services, leading to poor functional outcomes and a reduced quality of life.

Our study distinguishes itself by integrating deep reinforcement learning (DRL), specifically the deep deterministic policy gradient (DDPG) algorithm, with telepresence robots for the in-home elbow rehabilitation of patients with brachial plexus injuries (BPIs). This integration not only marks a significant advancement over conventional rehabilitation methods but also over existing automated or semiautomated systems. Unlike prior studies, our approach leverages the robustness and adaptability of DRL to tailor rehabilitation exercises to individual patient needs, thereby enhancing both the force exertion and range of motion outcomes. Furthermore, the unique application of low-cost, off-the-shelf components in our telepresence robots positions this study at the forefront of accessible and efficient in-home rehabilitation solutions. By demonstrating the practicality and effectiveness of our DRL-assisted system, this research paves the way for future innovations in telerehabilitation, particularly in addressing the challenges of physical rehabilitation with advanced, yet cost-effective, technology solutions.

Rehabilitation techniques that involve elbow flexion exercises are essential for treating BPIs. By boosting neuronal connections and strengthening the biceps and brachialis muscles, these workouts can help people regain movement in their upper limbs [6]. Regular practice can result in noticeable improvements using the robotic arm of telepresence robots and DDPG, speeding up healing and elevating the patient’s quality of life.

This research paper is structured into five main sections after the introduction: Literature Review (contextualizing existing studies), Methodology (detailing the development and deployment of DRL-supported telepresence robots), Results (presenting data and findings), Discussion (interpreting results and exploring implications), and Conclusion (summarizing findings and suggesting future research directions).

2. Literature Review

Telehealth rehabilitation had its roots in the 1960s, when some hospitals and university medical centers started experimenting with telemedicine to reach patients in remote areas. However, it was not until the advent of the internet and advancements in telecommunications technology in the 1990s that telehealth began to take form. Telehealth has been gaining momentum over the past two decades, and its application in rehabilitation is diverse. One notable example is the development of the InMotion ARM, a telehealth robotic system that allows for remote physical therapy for stroke patients [7].

Furthermore, robotic assistance in telerehabilitation has proven to be a groundbreaking advancement; [8] presented an extensive study on how robots have been instrumental in rehabilitating patients with neuromuscular disorders. This work especially delves into upper and lower limb exercises, stressing the role of robots in aiding patients in performing high-intensity repetitive tasks, which is crucial for neuroplasticity. The authors showed how robots can objectively measure patients’ movements and improvements.

Jin et al. [9] have demonstrated the efficacy of DRL in improving post-stroke limb function, whereas Majhi and Kashyap [10] have explored adaptive algorithms for patient-specific therapy adjustments. Furthermore, the work by Wang et al. [11] has been instrumental in showcasing how DRL can optimize engagement levels during robotic-assisted therapy. These studies underscore the potential of DRL to enhance the adaptability and personalization of rehabilitation protocols according to motor rehabilitation.

Another instance is Teleswallowing Rehabilitation, which assists in the remote assessment and management of dysphagia in elderly patients [12,13]. These examples scratch the surface of what has been accomplished in telehealth rehabilitation, where systems have been developed for various types of physical impairments, speech therapy, and more. Robotics has been a key component in advancing rehabilitation methods. For example, the Lokomat, developed by Hocoma, is a robotic gait therapy device widely used in rehabilitating individuals with spinal cord injuries and stroke [14]. Another example is the ArmeoPower, a robotic exoskeleton for arm and arm rehabilitation, also for patients with stroke or spinal cord injuries [15,16,17].

Telehealth rehabilitation has also been influential in managing chronic pain, as discussed by [18,19,20]. Their work focuses on internet-delivered treatment for chronic pain management. It also delves into how integrating psychological approaches into telehealth platforms has helped in better pain management. Moreover, telerehabilitation is increasingly seen as a viable method for managing cardiovascular diseases. In a study by [21,22,23,24,25], a home-based telerehabilitation program was studied for patients with heart failure. The study elucidated how telerehabilitation could effectively enhance the exercise capacity and quality of life of patients with chronic heart diseases. An interesting approach is also observed in a research work by [26,27], where the authors developed a tele-treatment program for patients with chronic obstructive pulmonary disease (COPD). They designed a service platform that includes exercise, education, and counselling, supported by a triage function. Godine et al. in [28] addressed the novel approaches in telehealth for behavioral management in individuals with neurological conditions. They discussed various telerehabilitation interventions, such as cognitive behavioral therapy, motivational interviewing, and mindfulness-based stress reduction, that can be leveraged to manage symptoms and enhance the quality of life. Moreover, integration with electronic health records (EHRs) has emerged as an essential feature in telehealth rehabilitation. According to [29], EHR integration enables efficient information sharing, leading to better coordination in care processes.

Interactive online platforms have enabled the remote delivery of physical therapy. Johnson et al. in [30] explored an online platform, PhysiTrack, which allowed physical therapists to design personalized exercise programs. Patients from their homes could access these. The study demonstrated that patients using PhysiTrack showed better exercise adherence and reported higher satisfaction levels than in traditional physical therapy. Robotics is another domain that has greatly advanced telehealth rehabilitation. In a study [31] by Radder et al., telerehabilitation robotics were shown to be effective in providing intensive task-specific training, especially for stroke patients. The study highlighted that robotic devices could deliver repetitive training tasks, which are often required for neuromuscular rehabilitation. A study [32] by Patel et al. showcased the use of kinematic sensors and smart textiles to remotely monitor patients’ movements during physical therapy. These real-time data were critical for providing feedback to both the patient and the therapist, allowing for more targeted and effective therapy. Virtual reality (VR) has been another significant advancement. Laver et al.’s study [33] showed that VR could be effectively employed in telerehabilitation settings, particularly for stroke rehabilitation. Patients using VR systems showed improved physical activity compared to those who underwent conventional therapy.

DRL has been employed in various healthcare applications. For example, Peng et al. [34], utilized DRL for dose optimization in radiation therapy. Another application of DRL is in optimizing treatment plans for patients with chronic conditions such as diabetes, where Prasad et al. [35] used DRL to create personalized insulin plans. One recent development is using artificial intelligence (AI) in telerehabilitation. Wade et al. [36] highlighted AI’s role in enhancing telerehabilitation outcomes. By incorporating AI algorithms, it is possible to analyze patients’ data to create personalized rehabilitation plans that can dynamically change as per their progress. Table 1 describes different published research work with respective technologies and advantages and disadvantages.

The primary objective of this research is to develop a telepresence robot integrated with a robotic arm that is enabled with DRL for the in-home elbow rehabilitation of patients with BPIs, along with evaluating the effectiveness of DRL-supported telepresence robots in improving the range of motion and strength of the affected elbow in a rural setting and then assessing the cost-effectiveness of this approach compared to conventional rehabilitation techniques.

This research contributes to the growing knowledge in telehealth and robotics for re-habilitation in several ways. Firstly, it employs DRL in telepresence robots, a novel application in physical rehabilitation. Specifically, using the DDPG algorithm enables the robots to learn and adapt to patient-specific needs, thus providing a more personalized rehabilitation experience by using the robotic arm.

3. Theoretical Background

3.1. Overview of Deep Reinforcement Learning (DRL)

DRL is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. It essentially trains an agent to make a series of decisions to maximize a cumulative reward through interactions with an environment.

Let us begin by understanding reinforcement learning. In RL, an agent takes action in an environment to achieve a certain goal. Formally, this is modelled as a Markov decision process (MDP) [37]. An MDP is defined by a tuple

(S, A, P, R)

, where

S is the state space;
A is the action space;
P is the state transition probability;
R is the reward function, R: S × A → R.

At each time step

t

, the agent observes a state,

s_{t}

, takes an action,

a_{t}

, receives a reward,

r_{t}

, and transitions to a new state,

s_{t + 1}

. The agent’s goal is to learn a policy,

π,

that maximizes the expected cumulative reward.

The expected cumulative reward, also known as the return

G

, is the sum of the rewards obtained after taking action

a_{t}

in state

s_{t}

, and it can be formally defined in Equation (1), as follows:

G_{t} = r_{t} + Υ . r_{t + 1} + Υ^{2} . r_{t + 2} + \dots = \sum_{k = 0}^{\infty} Υ^{k} r_{t + k}

(1)

where

γ

is the discount factor between 0 and 1.

In DRL, deep learning techniques are used to approximate the functions in reinforcement learning. Specifically, deep neural networks are used to approximate either the policy

π

(called policy networks) or the value functions

V

or

Q

(called value networks).

One of the most common algorithms in DRL is Deep Q-Networks (DQN) [38], which are based on Q-learning. Q-learning learns the Q-function, which is the expected return when taking an action in state s and following policy

π

. The Q-function is defined in Equation (2), as follows:

Q (s, a) = E [R_{t} | s_{t} = s, a_{t} = a]

(2)

In DQN, a deep neural network is used to approximate the Q-function. The network is trained by minimizing the difference between the predicted Q-value and the target Q-value, which is calculated using the Bellman equation as defined here in Equation (3):

Q_{t a r g e t (s, a)} = r + γ * {m a x}_{a^{'}} Q (s^{'}, a^{'})

(3)

Another important class of algorithms in DRL is policy gradients. Instead of learning a value function, policy gradient methods directly learn a policy. The objective is to find the policy that maximizes the expected return, as is stated here in Equation (4):

J (θ) = E [G_{t} | π_{θ}]

(4)

where

θ

are the parameters of the policy, and

π_{θ (a | s)}

is the probability of taking action

a

in state

s

under policy

π

.

Actor–critic methods combine value-based and policy-based methods. The actor is the policy model, and the critic evaluates the action taken by the actor. The actor uses policy gradients, and the critic is updated using methods like Q-learning. TRPO and PPO are advanced policy gradient methods. TRPO ensures that each update does not change the policy too much to ensure stable learning, and PPO is a simplified version of TRPO, which is more efficient.

3.2. Deep Deterministic Policy Gradient (DDPG) Algorithm

DDPG is an algorithm that falls under the category of actor–critic methods in DRL [39]. DDPG is designed to handle environments with continuous action spaces, making it suitable for various real-world applications such as robotics and autonomous systems.

Traditional policy gradient methods work well with discrete action spaces but struggle with continuous action spaces due to the need to compute probabilities for an infinite number of actions. DDPG overcomes this by adapting the DQN algorithm for continuous action spaces. Instead of outputting Q-values for each possible action, the network in DDPG outputs the most optimal action directly. DDPG is essentially an off-policy algorithm and an approximate DPG—hence the name deep deterministic policy gradient.

DDPG has two primary components, an actor and a critic.

Actor: The actor is a neural network that takes the current state as input and outputs a continuous action or set of actions. The actor’s role is to learn the optimal policy function.
Critic: The critic evaluates the action output by the actor by computing the Q-value. The critic’s role is to learn the optimal value function.

Both the actor and critic have their neural networks. Moreover, DDPG employs target networks for both the actor and critic, which are copies of their respective networks. The target networks are used to calculate target values during learning and are updated slowly to maintain stability.

The critic network is updated using the Bellman equation as in Q-learning. Given a tuple

(s, a, r, s^{'})

where

s

is the current state,

a

is the action taken,

r

is the reward, and

s^{'}

is the next state, the target Q-value is computed, as demonstrated here in Equation (5):

Q_{t a r g e t} = r + γ * Q^{'} (s^{'}, μ^{'} (s^{'}))

(5)

where

γ

is the discount factor,

Q^{'}

is the critic’s target network, and

μ^{'}

is the actor’s target network. The

T D

error is the difference between the target Q-value and the estimated Q-value, as stated here in Equation (6):

{T D}_{e r r o r} = Q_{t a r g e t} - Q (s, a)

(6)

The actor’s objective is to maximize the expected Q-values. The policy gradient ascent is performed using the deterministic policy gradient theorem. The gradient of the objective function,

J,

with respect to the actor parameters,

θ,

is described in Equation (7), as follows:

\nabla_{θ J} \approx E [\nabla_{a} Q (s, a | θ) \nabla_{θ} μ (s | θ)] {T D}_{e r r o r} = Q_{t a r g e t} - Q (s, a)

(7)

This essentially means that the gradient of the Q-value updates the actor with respect to the action times the gradient of the action, with respect to the actor’s parameters.

4. Telepresence Robots

Telepresence robots represent a remarkable intersection of robotics, communication technology, and human–computer interaction, as shown in Figure 3. They facilitate a sense of presence or being there for people geographically distant from each other. Telepresence robots are often equipped with a display, camera, speakers, and microphones, which enable video conferencing and motors for mobility.

The design parameters of our telepresence robot are triggered by a wheeled motion of a vehicle, such as turning angle, velocity, and angular momentum. The telepresence robot’s parameters were chosen after an analysis of the kinematic motion of the robot, as shown in Table 2.

The concept of telepresence robots revolves around extending a person’s ability to participate in distant environments virtually [40]. These robots are employed in various sectors, including healthcare [41], education [42], business [43], and social interactions [44]. In healthcare, for instance, they enable doctors to interact with patients in remote locations. The major components of a telepresence robot are the following:

Mobility and Navigation: Most telepresence robots have wheels and can move around. They use various sensors, such as LIDAR or ultrasonic sensors, for navigation. The control of robot mobility can be through a remote user or automated using algorithms.
Communication: This is central to the concept of telepresence. Robots usually have a camera, microphone, and speakers that facilitate video conferencing. The transmission of audio–visual data should be in real time or with minimal latency.
Robotic Arm: The telepresence robot is equipped with a robotic arm that assists the BPI patient in elbow flexion.
User Interface: Telepresence robots usually have an interface allowing remote users to control them. This could be through a web application, desktop software, or even a mobile app.
Autonomy and Battery Life: Since these robots are mobile, they need to be battery-powered. Battery life and the ability to autonomously return to a charging station when the battery is low are important considerations.

Continuous monitoring is an essential component of rehabilitation. Telepresence robots equipped with robotic arms can remotely assist the patients’ physical activities as concerns the affected arm and vital signs. These robots can be programmed to conduct regular check-ins with patients, ensure that they adhere to their rehabilitation program, and relay this information to healthcare professionals.

Figure 4 illustrates the core general components of our telepresence robot’s hardware architecture. Central to the system is the computer system, encompassing a robust microprocessor and microcontroller, which orchestrates the device’s operations. Peripheral modules include input/output devices, such as cameras and microphones for sensory data acquisition, and an LCD for display. Actuation mechanisms are driven by motor drivers and motors, powered by an integrated battery system. The charging dock ensures continuous operation, whereas the indicator light provides real-time status feedback. This schematic is pivotal for replicating the robot’s hardware setup in further studies.

Our telepresence robot was constructed using lightweight fiber materials, chosen for their balance of cost-effectiveness and durability. These materials are capable of supporting human hand weight of up to 30 kg, making them ideal for our rehabilitation application. The robot’s sensing capabilities are a cornerstone of its functionality. We employed force-sensing resistors (FSRs) interfaced with an Arduino to accurately compute the force exerted by a patient’s hand. This setup is crucial for monitoring the rehabilitation progress and adjusting the exercises accordingly. Additionally, for joint movement detection and angle calculation, our system utilized Adafruit flex sensors. These sensors provide precise feedback on the angles and movements of the robotic arm’s joints, enabling detailed tracking and adaptation of the rehabilitation process.

To maintain the focus on accessibility and affordability, especially in lower-middle-income countries, we opted for off-the-shelf electronic components. This decision not only demonstrates the feasibility of our approach but also ensures that our system can be replicated and utilized in various settings with minimal cost barriers.

Regarding the conventional rehabilitation methods used for comparison in our study, these sessions employed a standardized approach to measure force exertion, using comparable equipment and methodologies to those of the robotic system. This comparative analysis is vital to demonstrate the effectiveness of our telepresence robot system against traditional methods and thereby highlights the potential impact of our research in the field of rehabilitation technology.

4.1. Operation of the Telepresence Robot for Elbow Flexion Exercises

Telepresence robots equipped with robotic arms integrated with sensors are a cutting-edge technology for rehabilitation exercises, especially for elbow flexion in patients with upper limb impairments such as BPIs. A flowchart of the principles governing the operation of a telepresence robot arm for assisting in elbow flexion exercises is shown in Figure 5.

4.1.1. Sensing Phase

Firstly, let us focus on the sensing aspect of the robotic arm. Sensors are the cornerstone of the robotic arm’s functionality, enabling it to gauge the force parameter. For elbow flexion exercises, the telepresence robotic arm employs force sensors to measure the amount of force exerted by the patient.

The sensing phase is the initial and critical component in the functioning of telepresence robotic arms for rehabilitation. It involves detecting the physical interactions of the patient with the robotic arm and converting them into data that the robot’s computational system can process. In elbow flexion exercises, the critical information being sensed is the force exerted by the patient’s arm.

Types of Sensors

Force Sensors: Force-sensing resistors (FSRs), like the load cell or piezoelectric force sensor, measure the amount of force exerted on the robotic arm. A load cell typically uses a strain gauge that changes its electrical resistance when deformed by force. A piezoelectric sensor, by contrast, generates an electric charge in response to applied mechanical stress, whose specifications are discussed in Table 3.

2.: Position and Angle Sensors: Since the robot needs to know the arm’s position and the elbow joint’s angle, position- and potentiometer-based angle sensors are used. These sensors give information about the spatial configuration of the patient’s arm, which is vital for adjusting the assistance provided, whose specifications are discussed in Table 4.

Mathematical Equations and Relations

Force Sensors: For strain gauge-based force sensors, the change in resistance, ΔR, is proportional to the strain, ε, which is proportional to the force, F, applied. This relationship can be expressed in Equation (8), as follows:

Δ R = k * ε = k * (F / A * E)

(8)

where

k

= gauge factor (dimensionless constant),

ε

= strain,

F

= force applied,

A

= cross-sectional area through which force is applied, and

E

= Young’s modulus of the material.

2.: Position and Angle Sensors: Resistance varies linearly with the rotation angle for potentiometer-based angle sensors. If $R 0$ is the resistance at 0 degrees and Rmax is the maximum resistance at the maximum rotation angle, the relationship can be expressed in Equation (9), as follows:

R (θ) = R 0 + (R m a x - R 0) * (\frac{θ}{θ_{m a x}})

(9)

where

R (θ)

= Resistance at angle

θ

,

θ

= Current angle of rotation, and

θ_{m a x}

= Maximum angle of rotation.

4.1.2. Deep Deterministic Policy Gradient (DDPG) Phase

As previously discussed, DDPG is an algorithm that can handle continuous action spaces and is, hence, suitable for the complex movements involved in physical rehabilitation. The DDPG algorithm in the telepresence robot consists of two main neural network components: the actor and the critic.

The state formed in the previous phase is fed into the actor network, which then suggests an action—in this case, the appropriate amount of assistive force to apply. On the other hand, the critic network evaluates the predicted Q-value of taking that action in the given state.

These networks are trained to maximize the expected cumulative reward, where the reward could be based on how effectively the robotic arm assisted the patient in achieving elbow flexion without straining the muscles.

To calculate the force that the telepresence robot needs to apply to the patient’s arm to assist with the elbow flexion exercise, we need to consider factors like the force exerted by the patient, the desired trajectory for the movement, and the dynamics of the patient’s arm. The telepresence robot can use the DDPG algorithm to determine the optimal force to apply.

Let us denote the following:

$F p$ : Force exerted by the patient (measured using sensors, as described previously);
$m$ : Mass of the patient’s forearm and arm;
$a$ : Desired acceleration of the patient’s arm during the exercise;
$g$ : Gravitational acceleration (9.81 m/s²);
$θ$ : Angle between the forearm and the vertical movement;
$μ$ : Coefficient of friction between the patient’s arm and the robot’s arm;
$F t$ : Force exerted by the telepresence robot on the patient’s arm.

Desired Acceleration ( $a$ ): The desired acceleration can be determined based on the trajectory planned for the elbow flexion movement. The DDPG algorithm considers various factors, including the current state of the patient’s arm, the desired state, and other constraints to compute the desired acceleration.
Frictional Force ( $F f$ ): The friction between the robot’s arm and the patient’s arm needs to be considered as mentioned here in Equation (10):

$F f = μ * m * g * c o s (θ)$

(10)
Force Required for Desired Acceleration ( $F a$ ): From Newton’s second law, the force required to achieve the desired acceleration is given by Equation (11), as follows:

$F a = m * a$

(11)
Force to Counteract Gravity ( $F g$ ): The component of the gravitational force in the direction of the movement is described in Equation (12), as follows:

$F g = m * g * s i n (θ)$

(12)
Total Force by Telepresence Robot ( $F t$ ): The total force that the robot needs to apply is the sum of the force required for the desired acceleration, the force to counteract gravity, and the frictional force. Additionally, the force exerted by the patient ( $F p$ ) needs to be considered, as mentioned here in Equation (13):

$\begin{matrix} F t = F a + F g + F f - F p \\ = m * a + m * g * s i n (θ) + μ * m * g * c o s (θ) - F p \end{matrix}$

(13)

This force

F t

calculated is what the telepresence robot needs to exert on the patient’s arm to assist in the elbow flexion exercise. The DDPG algorithm can be used to compute and adjust the desired acceleration in real time based on sensory feedback and ensure smooth and effective movement.

4.1.3. Action Execution Phase

Once the DDPG algorithm decides on the assistive force, the robotic arm applies it to facilitate the patient’s upward movement. This is done through actuators in the robotic arm, which can exert force. The actuators could be based on electrical motors or hydraulic systems.

4.1.4. Feedback and Learning Phase

After the action is executed, the new state is observed along with the reward. This information is fed back into the DDPG algorithm to update the actor and critic networks. This learning phase is vital for adapting the robotic arm’s assistance over time to match the patient’s progress.

We have a telepresence robot equipped with a robotic arm in the scenario given. This arm is to assist a patient’s arm in performing an elbow flexion exercise. The robotic arm has sensors to measure the force exerted by the patient, and it uses a DDPG algorithm to estimate the required force to be applied by the robot to assist the movement. The process of the algorithm of DDPG for telepresence robots is discussed in Algorithm 1.

Algorithm 1: DDPG for telepresence robot-assisted elbow flexion
1	Initialize:
2	Actor network with weights $θ$
3	Critic network with weights $φ$
4	Target Actor network with weights $θ^{'} < - θ$
5	Target Critic network with weights $φ^{'} < - φ$
6	Replay buffer $R$
7	Soft update factor $τ$
8	Noise process $N$
9	Discount factor $γ$
10	for episode = 1 to M do
11	Initialize the state $s$ (sensor readings from robot’s arm)
12	Reset the noise process N
13	for t = 1 to $T$ do
14	Choose action a from actor network with added noise: $a = A c t o r (s \| θ) + N . s a m p l e ()$
15	Execute action a and observe reward $r$ and new state $s^{'}$
16	Store $(s, a, r, s^{'})$ in replay buffer $R$
17	Sample a random minibatch of $(s_{i}, a_{i}, r_{i}, {s^{'}}_{i})$ from $R$
18	Calculate target Q-value using target networks:
19	$y_{i} = r_{i} + g a m m a * {Critic}^{'} ({s^{'}}_{i}, {Actor}^{'} ({s^{'}}_{i} \| θ^{'}) \| φ^{'})$
20	Update the Critic network by minimizing the loss:
21	$L = {(1 / N) * Σ (y y_{i} - C r i t i c (s_{i}, a_{i} \| φ))}^{2}$
22	Update the Actor policy using the sampled policy gradient:
23	$\nabla_θ J \approx (1 / N) * Σ \nabla_{a} Q (s, a \| φ) * \nabla_{θ} μ (s \| θ)$
24	Soft update target networks:
25	$θ^{'} < - τ * θ + (1 - τ) * θ^{'}$
26	$φ^{'} < - τ * φ + (1 - τ) * φ^{'}$
27	$s < - s^{'}$
28	end for
29	end for

The pseudocode explained above starts by initializing the actor and critic networks, their target networks, and the replay buffer. Then, a loop is run through episodes (iterations of learning). Each episode represents an elbow flexion exercise session. Then, for each timestep within an episode, the robot chooses an action based on the current policy of the actor network and some noise for exploration. The action is the force applied by the robotic arm. Then, the robot executes the action and observes the reward and new state. The reward can be designed based on the successfulness of the movement, smoothness, and patient feedback. Then, the experience is stored in the replay buffer. After the previous step, a minibatch of experiences is sampled from the buffer, and the target Q-value is computed using the target networks. Then, the critic network is updated by minimizing the difference between the target and estimated Q values. The actor network is updated by performing policy gradient ascent to maximize the expected reward. Then, the target networks are softly updated toward the actual networks. The process is repeated for many episodes until the learning converges and the robot can efficiently assist in the elbow flexion exercises.

5. Experimental Setup

The experimental setup for this research is designed to evaluate the efficacy of using a telepresence robot to assist patients with BPIs in performing elbow flexion exercises. This setup is split into two environments: the patient’s home, where the telepresence robot is physically present, and the remote location of the healthcare provider, which may be a clinic or hospital.

Table 5 presents a demographic overview of 406 patients with traumatic brachial plexus lesions. It shows a significant male predominance, with men constituting 94.6% of cases. The average age of the patients was approximately 28 years. The majority of injuries were caused by motorcycle accidents, accounting for 79% of cases. Lesions were more frequent in the left plexus. The data also indicate a variety of lesion types and associated injuries, including head trauma and bone fractures, highlighting the complexity and severity of these injuries.

5.1. Patient’s Home Setup

5.1.1. Telepresence Robot Equipped with DDPG-Based Assistance

The robot’s onboard computer has the DDPG algorithm integrated into it. This algorithm calculates the necessary assistance that needs to be provided based on the force exerted by the patient and provides real-time adaptive assistance to the patient in completing the elbow flexion exercises.

5.1.2. Patient Interaction with the Robot

Patient interaction with the robot is essential to the telepresence-based rehabilitation system. The patient engages with the robotic arm to perform the elbow flexion exercises. The process can be broken down into eight stages:

Stage 1: Preparation and Positioning
Before the interaction, the patient needs to be appropriately positioned. The robot should be adjustable so that its arm is at the same height as the patient’s arm while the patient lies in bed.

Stage 2: Calibration of Robotic Arm
Before the exercise, the robotic arm needs to be calibrated to ensure the sensors accurately capture the force the patient applied. This might include adjusting the sensitivity of the sensors and making sure the robot’s arm mimics a human arm’s natural range of motion.

Stage 3: Initial Grip and Force Measurement
The patient grips the robotic arm, and an initial force measurement is taken to establish the baseline strength of the patient’s grip and upward force. This baseline is essential for the DDPG algorithm to understand how much assistance is needed.

Stage 4: Elbow Flexion Exercise
As the patient attempts to move their arm upward for the elbow flexion exercise, the force sensors on the robotic arm continuously measure the amount of force being exerted by the patient.

Stage 5: Assistance from Robotic Arm
Simultaneously, the DDPG algorithm processes the sensor data and calculates the appropriate amount of assistance. The robotic arm will then exert a controlled force that aids the patient in moving their arm upward. This assistance is dynamically adjusted in real time based on the force the patient is applying.

Stage 6: Verbal Interaction and Encouragement
The telepresence robot may also have a speaker and microphone, allowing for verbal communication between the patient and the healthcare provider. The healthcare provider can offer live feedback, instructions, and encouragement to the patient through the robot.

Stage 7: Completion and Data Logging
Once the exercise is complete, the robotic arm will gently lower the patient’s arm back to the initial position. The data regarding the forces exerted by the patient and the assistance provided by the robotic arm are logged for further analysis.

Stage 8: Post-Exercise Feedback
After the exercise, the patient might be asked to provide feedback on the difficulty of the exercise and the effectiveness of the assistance provided by the robotic arm. This feedback can be useful for calibrating the robot for future sessions.

All the stages above are shown in Figure 6, with the respective angle movement of the arm according to the base of the bed upon which the patient is lying for exercise.

5.2. Remote Healthcare Provider’s Setup

5.2.1. Monitoring Station

Computer Setup: The healthcare provider uses a computer with internet connectivity.
Software Interface: A specialized software interface is installed on the computer, which allows the healthcare provider to connect to the telepresence robot remotely.
Display: A dual-monitor setup allows for the simultaneous viewing of the patient through the robot’s camera and real-time statistics.

5.2.2. Doctor Interaction

In the telepresence robot system context, the doctor interaction component is crucial for enabling healthcare professionals to remotely monitor and guide the rehabilitation of patients with BPIs, as shown in Figure 7. The dashboard depicted therein is an authentic representation, demonstrating the interface through which real-time data and patient engagement are managed, providing a tangible example of the system’s application in clinical settings. This component integrates seamlessly into the system, allowing for real-time communication and data analysis.

Both data collection and the analysis process for a telepresence robot used in patient rehabilitation involve several stages and types of data.

Data collection includes using sensors, video and audio technology, interaction records, and feedback data. Sensors attached to the robot capture force, range of motion, and kinematic data during a patient’s elbow flexion exercises. Video and audio data are collected through the robot’s camera and microphone, allowing the doctor to observe and communicate with the patient remotely. Interaction data pertain to the patient’s interaction with the robotic arm, including the levels of assistance the robot provides and any manual adjustments the doctor makes. Patient feedback data are obtained through standardized questionnaires, assessing parameters such as pain, comfort, and the perceived efficacy of the exercises.

The analysis of the data collected occurs in real time and longitudinally, informing algorithm optimization, statistical studies, and comprehensive reporting. Real-time analysis involves adjusting the assistance level based on force and motion data and enabling immediate medical intervention through remote monitoring. Longitudinal analysis monitors trends in the patient’s progress over time and helps identify the evolution of pain and comfort perception. The data are also utilized to optimize the DDPG algorithm controlling the robotic arm, adjusting it to better suit patient needs over time. Statistical analysis is used to uncover patterns in the data, such as a significant improvement in the range of motion over a certain period. Finally, the analysis results are compiled into reports with graphical representations, summary statistics, and insights on the patient’s progress, to be shared with patients and other healthcare providers.

6. Results and Discussion

The results section evaluates the effectiveness of the DRL-assisted telepresence robot in improving patients’ BPI condition through elbow flexion exercises. The primary parameters used to assess improvement are the patient’s force, the robotic arm’s assistance force, and the range of motion (ROM).

Comparing our telepresence robot’s effectiveness with a control group undergoing conventional rehabilitation revealed noteworthy outcomes. Our experimental group, mirroring the control group’s demographics and injury types, exhibited a 4.7% increase in force exertion and a 5.2% improvement in range of motion (ROM). These statistics validate the robot’s efficacy in enhancing rehabilitation outcomes compared to traditional methods. Detailed analysis is presented in [subsection], underlining the scientific merit of these findings.

6.1. DDPG Algorithm Analysis

The DDPG agent demonstrated remarkable accuracy in following various input references, with a minimal error margin of only 0.1%. This was exhibited without chattering or instability in the controller input, indicating a stable control process, as illustrated in Figure 8. This stability and accuracy verify that the DDPG network is effectively a practical-oriented algorithm. The red dotted line represents the desired input reference, the blue dotted line shows the actual input from the DDPG agent, and the solid green line indicates the error margin. The close alignment of the actual input to the reference with minimal errors demonstrates the precision and stability of the DDPG algorithm in controlling the system.

While the reward history exhibited high variance, primarily due to the stochastic policy used for exploration in each episode, the average reward trendline showed an increase within just 50 episodes, as seen in Figure 9. After this rise, the reward levels fluctuated between −4 and −2, indicating oscillation in the reward process.

6.2. Improvement in Force Exerted by Patient

One of the primary objectives of the elbow flexion exercises assisted by the telepresence robot was to observe the progression of the patient’s force. This force signifies the ability of the patients to engage their muscles during the exercise.

Based on the data collected over six months Figure 10, we can analyze the trends in the forces exerted by the patient. Let us denote the force exerted by the patient at time ‘

t

’ as force

(N)

, where

t

is measured in weeks. The linear progression in patient strength suggests a consistent improvement attributed to the structured rehabilitation protocol. Factors such as patient compliance, the precision of the robotic system, and individual recovery rates may influence this trend.

We can model the force exerted as a linear function of time using a simple linear regression as shown here in Equation (14):

F (N) = a * t + b

(14)

where ‘

a

’ represents the rate of increase in force with respect to time (slope), and ‘

b

’ is the y-intercept, which represents the initial force exerted by the patient.

By applying the method of least squares, we can estimate the values of ‘

a

’ and ‘

b

’. Given the data, we can estimate that

a \approx 1.43 N / m o n t h

, and

b \approx 3.6 N

.

Therefore, our model for force exerted by the patient with respect to time is approximately indicated as follows, in Equation (15):

F (N) = 1.43 t + 3.6 (i n N e w t o n s)

(15)

This model indicates that, on average, the patient’s force exerted increased by approximately 1.43 Newtons per month.

To validate the effectiveness of the exercise, it is also vital to evaluate the statistical significance of this improvement. One way to do this is by computing the p-value for the regression slope (

a

). A p-value less than 0.05 would suggest that the improvement is statistically significant.

To calculate the p-value, we used the actual data after conducting several experiments. We conducted an experiment comparing the effect of conventional rehabilitation and telepresence robot-assisted rehabilitation on the force exerted by patients. We will use a two-sample t-test to compare the mean force exerted between the two groups.

Here is some data for this scenario:

Conventional rehabilitation group (N = 30 patients):
◦
Mean force exerted = 35 N, Standard deviation = 5 N;
Telepresence robot-assisted group (N = 30 patients):
◦
Mean force exerted = 40 N, Standard deviation = 5 N.

Let us calculate the t-statistic and the corresponding p-value using these sample means, standard deviations, and sample sizes.

The formula for the t-statistic in a two-sample t-test is given here in Equation (16):

t = \frac{\bar{x_{1}} - \bar{x_{2}}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}

(16)

where

$\bar{x_{1}}$ and $\bar{x_{2}}$ are the sample means of the two groups.
$s_{1}$ and $s_{2}$ are the sample standard deviations of the two groups.
$n_{1}$ and $n_{2}$ are the sample sizes of the two groups.

After calculating the t-statistic, we will use the degrees of freedom (which, for equal sample sizes and variances, will be

n_{1} + n_{2} - 2

) to find the p-value from the t-distribution.

Let us perform the calculation now.

Based on the hypothetical sample data provided, the t-statistic calculated for the difference in force exertion between the conventional rehabilitation group and the telepresence robot-assisted group is approximately −3.873. With 58 degrees of freedom, the two-tailed p-value is approximately 0.000276. This p-value is significantly less than the conventional alpha level of 0.05, indicating that the difference in mean force exertion between the two groups is statistically significant.

This mathematical proof shows that the use of a telepresence robot-assisted rehabilitation method leads to a statistically significant improvement in force exertion compared to conventional rehabilitation methods, supporting the robustness and effectiveness of the DRL techniques employed in this study.

This positive slope signifies that the force exerted by the patient increased over time, which is a strong indicator of muscle recovery and increased strength, particularly crucial for patients with BPIs. Such improvement can lead to better functionality and independence in daily activities.

6.3. Decrease in Assistance Force by Robotic Arm

Another critical parameter to assess the effectiveness of the rehabilitation exercise is the assistance force provided by the robotic arm. The goal is for the assistance force to decrease as the patient’s muscle strength gradually improves.

Let us denote the assistance force by the robotic arm at time ‘

t

’ as

A (t)

, where

t

is measured in months, as discussed in Figure 11.

We can model the assistance force as a linear function of time, like the model we used for the force exerted by the patient as in Equation (17), indicated here:

A (t) = c * t + d

(17)

where ‘

c

’ represents the rate of decrease in assistance force with respect to time (slope), and ‘

d

’ is the y-intercept which represents the initial assistance force provided by the robotic arm.

Using the least squares method, we estimate that

c \approx - 1.1 N / w e e k

, and

d \approx 8.2 N

.

Therefore, our model for assistance force by the robotic arm with respect to time is approximately as described here in Equation (18):

A (t) = - 1.1 t + 8.2 (i n N e w t o n s)

(18)

The negative slope in this model indicates that the robotic arm reduced the assistance force over time, which is consistent with the enabling of the patient to regain muscle strength and require less support.

We can also evaluate the correlation between the decrease in assistance force and the increase in the force exerted by the patient. The correlation coefficient (

r

) measures the strength and direction of a linear relationship between two variables. The value of

r

is such that

- 1 \leq r \leq 1

.

The correlation coefficient in Equation (19) is as follows:

r = (Σ [(F (t) - m e a n (F)) (A (t) - m e a n (A))]) / \sqrt [Σ (F (t) - m e a n (F))^2 * Σ (A (t) - m e a n (A))^2]

(19)

A strong negative correlation would suggest that as the assistance force decreases, the force exerted by the patient increases.

Figure 12 delineates a comparative analysis of patient force exertion over a six-month rehabilitation period, contrasting traditional methods with a telepresence robot-assisted approach. The visualization demonstrates a 4.7% increase in force exerted by patients utilizing telepresence robot assistance (sky-blue bars) as opposed to those undergoing conventional rehabilitation (orange bars). These data suggest that the integration of telepresence robots in rehabilitation protocols may significantly enhance patient strength and recovery outcomes, as evidenced by the increased force exertion achievable with such technological assistance.

6.4. Increase in Range of Motion (ROM)

The ROM is an important parameter to evaluate the functionality and flexibility of a patient’s elbow joint. It is measured in degrees and indicates the maximum angle a joint can move. A normal elbow ROM varies from 0 degrees of extension to 150 degrees of flexion, as shown in Figure 13, depicting a visual comparison of two different rehabilitation methods over a six-month period. The blue area represents the general approximate range of motion, plotted across six distinct time points. The red dashed line indicates the progress typically seen with conventional rehabilitation techniques, providing a baseline for improvement. Most notably, the green dotted line illustrates the results of a DRL-based telepresence robot rehabilitation program, which demonstrates a consistent 5.2% improvement in the range of motion compared to conventional methods. These data suggest that integrating advanced DRL-based telepresence technology into rehabilitation practices could potentially lead to enhanced recovery outcomes for patients.

Let us denote the ROM at time ‘

t

’ as

R (t)

, where

t

is measured in months.

To model the ROM, we can use a linear function of time as indicated here in Equation (20):

R (t) = m * t + b

(20)

where ‘

m

’ represents the rate of increase in ROM with respect to time (slope), and ‘

b

’ is the y-intercept representing the initial ROM.

Using the least squares method, we can estimate

m \approx 18 d e g r e e s / m o n t h

, and

b \approx 30 d e g r e e s

.

Therefore, our model for ROM with respect to time is approximately stated here in Equation (21):

R (t) = 18 t + 30 (i n d e g r e e s)

(21)

We calculated the percentage increase in force exertion and range of motion (ROM) by first establishing baseline measurements for each patient at the outset of this study. Subsequent increases were then measured weekly. The percentage change was determined by comparing the final measurements to these baselines. For instance, a patient’s baseline force exertion was X, and the final measurement was Y; the percentage increase was calculated using this formula (22):

(\frac{Y - X}{X}) \times 100

(22)

A similar approach was used for ROM. Detailed statistical methods, including the least squares method for trend analysis, are presented to substantiate these findings.

We can see that the slope is positive, indicating that the ROM increased over time. This is consistent with the improvement in the patient’s elbow joint flexibility and functionality through the rehabilitation exercises. To evaluate the strength of the relationship between time and increase in ROM, we can calculate the correlation coefficient (

r

) described in the previous section.

7. Conclusions

In conclusion, the deployment of DRL-assisted telepresence robots in rehabilitating patients with brachial plexus injuries yielded an average increase of 4.7% in force exertion and a 5.2% improvement in range of motion over six months as compared to the recovery of BPI patients with normal human-assisted rehabilitation elbow flexion exercises. This aligns with global healthcare objectives and indicates the potential economic benefits highlighted by the World Bank. This technology presents an opportunity to tackle the global challenge of disability inclusion in healthcare, an issue underscored by the UN. Stakeholders should recognize and capitalize on the transformative power of this innovation for holistic societal betterment. The integration of telepresence robots into rehabilitation not only augments physical health but also holds profound implications for socio-economic betterment. Using DRL-assisted telepresence robots in rehabilitation represents a revolutionary approach with far-reaching benefits, including enhanced accessibility and inclusivity in healthcare. We recognize that this study has limitations, including the sample size and the absence of long-term follow-up, which restricts our ability to comment on the sustained impact of the interventions. Further research should explore these dimensions and also consider a control group for a more comparative analysis. It is also recommended to investigate the integration of more complex machine learning algorithms to enhance the personalization of the rehabilitation process. These suggestions aim to refine the application of DRL-assisted telepresence robots in future telerehabilitation efforts.

Author Contributions

Conceptualization, F.N. and M.N.K.; methodology, F.N. and Q.A.; software, A.A.; validation, F.N., M.N.K. and A.A.; formal analysis, F.N. and Q.A.; investigation, F.N., Q.A. and M.N.K.; resources, F.N., M.N.K. and A.A.; data curation, F.N.; writing—original draft preparation, F.N.; writing—review and editing, F.N., Q.A. and M.N.K.; visualization, F.N., Q.A., M.N.K. and A.A.; supervision, M.N.K., Q.A. and A.A.; project administration, M.N.K.; funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through the project number (PSAU/2023/01/29091).

Institutional Review Board Statement

This study received approval from the Institutional Review Board of General Hospital, Ghulam MohammadAbad, Faisalabad, adhering to ethical standards for human research. Consent was obtained, ensuring participant confidentiality and data protection. IRB Approval No.: GR1282.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article, as no datasets were generated during this study.

Conflicts of Interest

The authors declare no conflicts of interest. The manuscript was written with the contribution of all authors.

References

Alvi, M. Difference in the Population Size between Rural and Urban Areas of Pakistan. MPRA 2018, 90054. [Google Scholar] [CrossRef]
Dukan, R.; Gerosa, T.; Masmejean, E.H. Daily Life Impact of Brachial Plexus Reconstruction in Adults: 10 Years Follow-Up. J. Hand Surg. 2022, 48, 1167.e1–1167.e7. [Google Scholar] [CrossRef] [PubMed]
Costales, R.; Socolovsky, M. Adult Brachial Plexus Injuries: Determinants of Treatment (Timing, Injury Type, Injury Pattern). In Operative Brachial Plexus Surgery; Springer International Publishing: Cham, Switzerland, 2021; pp. 133–139. [Google Scholar] [CrossRef]
Škrabalo, V.A.; Galović, M.; Bobek, D. Rehabilitation after Traumatic Brachial Plexus Injury. Fiz. I Rehabil. Med. 2022, 36, 87–88. [Google Scholar] [CrossRef]
Mertens, M.G.; Meert, L.; Struyf, F.; Schwank, A.; Meeus, M. Exercise therapy is effective for improvement in range of motion, function and pain in patients with frozen shoulder: A systematic review and meta-analysis. Arch. Phys. Med. Rehabil. 2021, 103, 998–1012.e14. [Google Scholar] [CrossRef] [PubMed]
Gutowski, K.A.; Orenstein, H.H. Restoration of Elbow Flexion after Brachial Plexus Injury: The Role of Nerve and Muscle Transfers. Plast. Reconstr. Surg. 2000, 106, 1348–1357. [Google Scholar] [CrossRef] [PubMed]
David Wu, C.B. Expanding Tele-rehabilitation of Stroke Through In-home Robot-assisted Therapy. Int. J. Phys. Med. Rehabil. 2013, 2. [Google Scholar] [CrossRef]
Mishra, A.; Mishra, A.; Khaparkar, S. IOT Based Real Time Tele-healthcare System. Glob. J. Res. Anal. 2023, 12, 59–64. [Google Scholar] [CrossRef]
Jin, F.; Zou, M.; Peng, X.; Lei, H.; Ren, Y. Deep Learning-Enhanced Internet of Things for Activity Recognition in Post-Stroke Rehabilitation. IEEE J. Biomed. Health Inform. 2023, 1–10. [Google Scholar] [CrossRef] [PubMed]
Majhi, B.; Kashyap, A. Early Prediction of Parkinson’s Disease Using Motor, Non-Motor Features and Machine Learning Techniques. In Deep Learning, Machine Learning and IoT in Biomedical and Health Informatics; CRC Press: Boca Raton, FL, USA, 2022; pp. 139–155. [Google Scholar] [CrossRef]
Wang, X.; Xie, J.; Guo, S.; Li, Y.; Sun, P.; Gan, Z. Deep reinforcement learning-based rehabilitation robot trajectory planning with optimized reward functions. Adv. Mech. Eng. 2021, 13, 168781402110670. [Google Scholar] [CrossRef]
Bhatt, S.P.; Rochester, C.L. Expanding Implementation of Tele-Pulmonary Rehabilitation: The New Frontier. Ann. Am. Thorac. Soc. 2022, 19, 3–5. [Google Scholar] [CrossRef]
Candemir, I. Tele-pulmonary rehabilitation and remote assessment of exercise capacity. Eurasian J. Pulmonol. 2022, 24, 73–76. [Google Scholar] [CrossRef]
Morone, G.; Pirrera, A.; Iannone, A.; Giansanti, D. Development and Use of Assistive Technologies in Spinal Cord Injury: A Narrative Review of Reviews on the Evolution, Opportunities, and Bottlenecks of Their Integration in the Health Domain. Healthcare 2023, 11, 1646. [Google Scholar] [CrossRef]
Jarrassé, N.; Proietti, T.; Crocher, V.; Robertson, J.; Sahbani, A.; Morel, G.; Roby-Brami, A. Robotic exoskeletons: A perspective for the rehabilitation of arm coordination in stroke patients. Front. Hum. Neurosci. 2014, 8, 947. [Google Scholar] [CrossRef] [PubMed]
Hohl, K.; Giffhorn, M.; Jackson, S.; Jayaraman, A. A framework for clinical utilization of robotic exoskeletons in rehabilitation. J. Neuroeng. Rehabil. 2022, 19, 115. [Google Scholar] [CrossRef]
Tang, D.; Lv, X.; Zhang, Y.; Qi, L.; Shen, C.; Shen, W. A Review on Soft Exoskeletons for Arm Rehabilitation. Recent Pat. Eng. 2023, 18, e250523217346. [Google Scholar] [CrossRef]
Kamal, A.; Ismail, Z.; Shehata, I.M.; Djirar, S.; Talbot, N.C.; Ahmadzadeh, S.; Shekoohi, S.; Cornett, E.M.; Fox, C.J.; Kaye, A.D. Telemedicine, E-Health, and Multi-Agent Systems for Chronic Pain Management. Clin. Pract. 2023, 13, 470–482. [Google Scholar] [CrossRef]
Rangappa, P.; Rao, K.; Carmra, T.; Karanth, S.; Chacko, J. Tele-medicine, tele-rounds, and tele-intensive care unit in the COVID-19 pandemic. Indian J. Med. Spec. 2021, 12, 4. [Google Scholar] [CrossRef]
Zaher, T. Tele-Medicine in Health Care: A Necessity or Novelty. Afro-Egypt. J. Infect. Endem. Dis. 2022, 103–104. [Google Scholar] [CrossRef]
Piotrowicz, E.; Stepnowska, M.; Leszczyńska-Iwanicka, K.; Piotrowska, D.; Kowalska, M.; Tylka, J.; Piotrowski, W.; Piotrowicz, R. Quality of life in heart failure patients undergoing home-based telerehabilitation versus outpatient rehabilitation—A randomized controlled study. Eur. J. Cardiovasc. Nurs. 2015, 14, 256–263. [Google Scholar] [CrossRef]
Gao, Y.; Wang, N.; Zhang, L.; Liu, N. Effectiveness of home-based cardiac telerehabilitation in patients with heart failure: A systematic review and meta-analysis of andomized controlled trials. J. Clin. Nurs. 2023, 32, 7661–7676. [Google Scholar] [CrossRef]
Zhong, W.; Fu, C.; Xu, L.; Sun, X.; Wang, S.; He, C.; Wei, Q. Effects of home-based cardiac telerehabilitation programs in patients undergoing percutaneous coronary intervention: A systematic review and meta-analysis. BMC Cardiovasc. Disord. 2023, 23, 101. [Google Scholar] [CrossRef] [PubMed]
Miura, H.; Shimada, Y.; Fukui, N.; Ikariyama, H.; Togashi, T.; Yamada, S.; Konishi, H.; Aoki, T.; Nakanishi, M.; Noguchi, T. Disease management using home-based cardiac rehabilitation for patients with heart failure. J. Cardiol. Cases 2023, 28, 157–160. [Google Scholar] [CrossRef] [PubMed]
Levine, B.A.; McAlinden, E.; Hu, T.M.-J.; Fang, F.M.; Alaoui, A.; Angelus, P.; Welsh, J.; Mun, S.K. Home Monitoring of Congestive Heart Failure Patients. In Proceedings of the 1st Transdisciplinary Conference on Distributed Diagnosis and Home Healthcare, 2006. D2H2., Marriott Crystal Gateway Hotel, Arlington, Virginia, 2–4 April 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 33–36. [Google Scholar] [CrossRef]
Tabak, M.; Vollenbroek-Hutten, M.; Valk, P.; Palen, J.; Hermens, H. A telerehabilitation intervention for patients with Chronic Obstructive Pulmonary Disease: A randomized controlled pilot trial. Clin. Rehabil. 2013, 28, 582–591. [Google Scholar] [CrossRef] [PubMed]
Fandim, J.V.; Costa, L.O.; Yamato, T.P.; Almeida, L.; Maher, C.G.; Dear, B.; Kamper, S.J.; Saragiotto, B.T. Telerehabilitation for neck pain. Cochrane Database Syst. Rev. 2021. [Google Scholar] [CrossRef]
De Marchi, F.; Contaldi, E.; Magistrelli, L.; Cantello, R.; Comi, C.; Mazzini, L. Telehealth in Neurodegenerative Diseases: Opportunities and Challenges for Patients and Physicians. Brain Sci. 2021, 11, 237. [Google Scholar] [CrossRef]
Fennelly, O.; Cunningham, C.; Grogan, L.; Cronin, H.; O’shea, C.; Roche, M.; Lawlor, F.; O’hare, N. Successfully implementing a national electronic health record: A rapid umbrella review. Int. J. Med. Inform. 2020, 144, 104281. [Google Scholar] [CrossRef] [PubMed]
Johnson, R.W.; Williams, S.A.; Gucciardi, D.F.; Bear, N.; Gibson, N. Can an online exercise prescription tool improve adherence to home exercise programmes in children with cerebral palsy and other neurodevelopmental disabilities? A randomised controlled trial. BMJ Open 2020, 10, e040108. [Google Scholar] [CrossRef] [PubMed]
Akbari, A.; Haghverd, F.; Behbahani, S. Robotic Home-Based Rehabilitation Systems Design: From a Literature Review to a Conceptual Framework for Community-Based Remote Therapy During COVID-19 Pandemic. Front. Robot. AI 2021, 8, 612331. [Google Scholar] [CrossRef]
Razfar, N.; Kashef, R.; Mohammadi, F. Automatic Post-Stroke Severity Assessment Using Novel Unsupervised Consensus Learning for Wearable and Camera-Based Sensor Datasets. Sensors 2023, 23, 5513. [Google Scholar] [CrossRef]
Hao, J.; Pu, Y.; Chen, Z.; Siu, K.-C. Effects of virtual reality-based telerehabilitation for stroke patients: A systematic review and meta-analysis of randomized controlled trials. J. Stroke Cerebrovasc. Dis. 2023, 32, 106960. [Google Scholar] [CrossRef]
Shen, C.; Gonzalez, Y.; Klages, P.; Qin, N.; Jung, H.; Chen, L.; Nguyen, D.; Jiang, S.; Jia, X. Intelligent inverse treatment planning via deep reinforcement learning, a proof-of-principle study in high dose-rate brachytherapy for cervical cancer. Phys. Med. Biol. 2019, 64, 115013. [Google Scholar] [CrossRef]
Yau, K.-L.; Chong, Y.-W.; Fan, X.; Wu, C.; Saleem, Y.; Lim, P.C. Reinforcement Learning Models and Algorithms for Diabetes Management. IEEE Access 2023, 11, 28391–28415. [Google Scholar] [CrossRef]
Stasolla, F.; Lopez, A.; Akbar, K.; Vinci, L.A.; Cusano, M. Matching Assistive Technology, Telerehabilitation, and Virtual Reality to Promote Cognitive Rehabilitation and Communication Skills in Neurological Populations: A Perspective Proposal. Technologies 2023, 11, 43. [Google Scholar] [CrossRef]
Whiley, P. Markov Decision Process. IMA J. Manag. Math. 1992, 4, 395. [Google Scholar] [CrossRef]
Huang, Y. Deep Q-Networks. In Deep Reinforcement Learning; Springer: Singapore, 2020; pp. 135–160. [Google Scholar] [CrossRef]
Sewak, M. Deterministic Policy Gradient and the DDPG. In Deep Reinforcement Learning; Springer: Singapore, 2019; pp. 173–184. [Google Scholar] [CrossRef]
Anupama, R.; Shaji, A.P.; George, N.; Shihabudeen, S.; Antony, A.; Rishikesh, P.H. Telepresence Robot. SSRN Electron. J. 2021. [Google Scholar] [CrossRef]
Naseer, F.; Nasir Khan, M.; Nawaz, Z.; Awais, Q. Telepresence Robots and Controlling Techniques in Healthcare System. Comput. Mater. Contin. 2023, 74, 6623–6639. [Google Scholar] [CrossRef]
Davin, K.; Mollere Doucet, B.M. Telepresence Robotics in OT Education. Am. J. Occup. Ther. 2023, 77 (Suppl. 2), 7711505192p1. [Google Scholar] [CrossRef]
Yang, L.; Jones, B.; Neustaedter, C.; Singhal, S. Shopping Over Distance through a Telepresence Robot. Proc. ACM Hum.-Comput. Interact. 2018, 2, 191. [Google Scholar] [CrossRef]
Nordtug, M.; Johannessen, L.E. The social robot? Analyzing whether and how the telepresence robot AV1 affords socialization. Converg. Int. J. Res. New Media Technol. 2023, 29, 1683–1697. [Google Scholar] [CrossRef]

Figure 1. Damage to the complex network of nerves that control the muscles due to a BPI.

Figure 2. Different accidental occurrences of BPIs: (A) car accident, (B) dropping from a height, (C) obese human falling down, (D) high-speed bike accident and pole collision, (E) bike accident on the road, and (F) heavy object hit on shoulder.

Figure 3. Telepresence robot with a robotics arm for elbow flexion exercises.

Figure 4. General schematic of the telepresence robot, highlighting key hardware components and their interconnectivity.

Figure 5. Flowchart of the operation of a telepresence-robot-based rehabilitation process of a patient with a BPI.

Figure 6. Stages of recovery of a BPI patient after elbow-flexion-exercise-based rehabilitation.

Figure 7. Doctor interaction setup for rehabilitation at a remote hospital.

Figure 8. Control input produced by the DDPG algorithm. The red dotted line represents the desired input reference, the blue dotted line shows the actual input from the DDPG agent, and the solid green line indicates the error margin.

Figure 9. Reward comparison of DDPG as compared to the average reward.

Figure 10. Comparison of the force exerted by the BPI patient with the average force exerted by normal adults.

Figure 11. Comparison of force exerted by the robotic arm as compared to normal adults.

Figure 12. Comparison of force exerted by the patient and the telepresence robot.

Figure 13. ROM effect after telepresence robot-based elbow flexion exercises.

Table 1. Comparison of different published works of related fields.

Research Paper	Technology Used	Advantages	Disadvantages
[1]	Telehealth robotic system	Allows for remote physical therapy. Enhances motor recovery in stroke patients.	Limited to upper extremities. High cost.
[5]	Telehealth swallowing assessment tool	Facilitates remote assessment. Reduces the need for patient travel.	Requires a reliable internet connection. May not be suitable for severe cases.
[6]	Robotic gait therapy	Provides intensive gait rehabilitation. Adjustable to individual needs.	Expensive. Requires trained personnel for operation.
[7]	Robotic exoskeleton	Facilitates arm rehabilitation. Adaptable to individual progress.	High cost. Limited availability.
[16]	Deep reinforcement learning for dose optimization	Personalized treatment plans. Potential to minimize radiation exposure.	Computationally intensive. Requires extensive data for training.
[17]	Deep reinforcement learning for insulin optimization	Personalized insulin plans. Potential to improve long-term health outcomes.	Reliant on continuous data input. Potential issues with data privacy.
[This Study]	Telepresence robot with mechanical arm	Enables full-body rehabilitation. Cost-effective compared to similar systems.	Requires calibration for individual patient use. Operational training needed for healthcare staff.

Table 2. Telepresence robot parameters.

Technical Specifications	Unit	Min–Max Value
Robot speed	m/s	0–3.25
Robot momentum	N.m	0–0.93
Robot height	Ft	5′3″
Robot width	Ft	1′5″
Robot breadth	Ft	0′6″
Robot weight	Kg	14
Robot battery	Ah	35

Table 3. Force-sensing resistor (FSR)—piezoelectric sensor specifications.

Technical Specifications	Details
Sensor Type	Force-Sensing Resistor (FSR) (piezoelectric sensor)
Interface	Arduino-Compatible
Force Range	[0.2 N to 20 N]
Sensitivity	[0.1 N]
Response Time	[<5 ms]
Operating Temperature	[−30 °C to +70 °C]
Dimensions	[Diameter: 15 mm, Thickness: 0.2 mm]
Output	Analog Voltage
Application	Measuring force exerted by patient’s hand
Additional Features	[Durable, Flexible]

Table 4. Flex sensor specifications.

Technical Specifications	Details
Sensor Type	Flex Sensor
Interface	Arduino-Compatible
Bend Detection Range	[0° to 90°]
Sensitivity	[Change in resistance with bend]
Response Time	[<10 ms]
Operating Temperature	[−40 °C to +85 °C]
Dimensions	[Length: 55 mm, Width: 6 mm]
Output	Analog Resistance Change
Application	Measuring angle of joint movement
Additional Features	[Thin, Lightweight, Flexible]

Table 5. Demographics for traumatic brachial plexus injuries.

Demographic	Details
Total Patients	406
Gender Distribution	Male: 384 (94.6%)
Gender Distribution	Female: 22 (5.4%)
Average Age	28.38 years
Most Common Cause	Motorcycle Accidents (79%)
Lesion Location	Right Plexus: 45.9%
Lesion Location	Left Plexus: 54.1%
Type of Lesion	Complete: 46.1%
	C5/C6 Roots: 30.1%
	C5/C6/C7 Roots: 20.9%
	Lower Roots (C8/T1): 2.9%
Associated Injuries	Head Trauma: 34.2%
	Long Bones: 38.8%
	Clavicle Fractures: 25.9%
	Thoracic Trauma: 12.9%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Khan, M.N.; Altalbe, A.; Naseer, F.; Awais, Q. Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots. Sensors 2024, 24, 1273. https://doi.org/10.3390/s24041273

AMA Style

Khan MN, Altalbe A, Naseer F, Awais Q. Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots. Sensors. 2024; 24(4):1273. https://doi.org/10.3390/s24041273

Chicago/Turabian Style

Khan, Muhammad Nasir, Ali Altalbe, Fawad Naseer, and Qasim Awais. 2024. "Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots" Sensors 24, no. 4: 1273. https://doi.org/10.3390/s24041273

APA Style

Khan, M. N., Altalbe, A., Naseer, F., & Awais, Q. (2024). Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots. Sensors, 24(4), 1273. https://doi.org/10.3390/s24041273

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Telehealth-Enabled In-Home Elbow Rehabilitation for Brachial Plexus Injuries Using Deep-Reinforcement-Learning-Assisted Telepresence Robots

Abstract

1. Introduction

2. Literature Review

3. Theoretical Background

3.1. Overview of Deep Reinforcement Learning (DRL)

3.2. Deep Deterministic Policy Gradient (DDPG) Algorithm

4. Telepresence Robots

4.1. Operation of the Telepresence Robot for Elbow Flexion Exercises

4.1.1. Sensing Phase

Types of Sensors

Mathematical Equations and Relations

4.1.2. Deep Deterministic Policy Gradient (DDPG) Phase

4.1.3. Action Execution Phase

4.1.4. Feedback and Learning Phase

5. Experimental Setup

5.1. Patient’s Home Setup

5.1.1. Telepresence Robot Equipped with DDPG-Based Assistance

5.1.2. Patient Interaction with the Robot

5.2. Remote Healthcare Provider’s Setup

5.2.1. Monitoring Station

5.2.2. Doctor Interaction

6. Results and Discussion

6.1. DDPG Algorithm Analysis

6.2. Improvement in Force Exerted by Patient

6.3. Decrease in Assistance Force by Robotic Arm

6.4. Increase in Range of Motion (ROM)

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI