Designing Closed-Loop Brain-Machine Interfaces Using Model Predictive Control †

Brain-machine interfaces (BMIs) are broadly defined as systems that establish direct communications between living brain tissue and external devices, such as artificial arms. By sensing and interpreting neuronal activities to actuate an external device, BMI-based neuroprostheses hold great promise in rehabilitating motor disabled subjects, such as amputees. In this paper, we develop a control-theoretic analysis of a BMI-based neuroprosthetic system for voluntary single joint reaching task in the absence of visual feedback. Using synthetic data obtained through the simulation of an experimentally validated psycho-physiological cortical circuit model, both the Wiener filter and the Kalman filter based linear decoders are developed. We analyze the performance of both decoders in the presence and in the absence of natural proprioceptive feedback information. By performing simulations, we show that the performance of both decoders degrades significantly in the absence of the natural proprioception. To recover the performance of these decoders, we propose two problems, namely tracking the desired position trajectory and tracking the firing rate trajectory of neurons which encode the proprioception, in the model predictive control framework to design optimal artificial sensory feedback. Our results indicate that while the position trajectory based design can only recover the position and velocity trajectories, the firing rate trajectory based design can recover the performance of the motor task along with the recovery of firing rates in other cortical regions. Finally, we extend our design by incorporating a network of spiking neurons and designing artificial sensory feedback in the form of a charged balanced biphasic stimulating current.


Introduction
Brain-machine interfaces (BMIs) [1,2] are broadly defined as systems that establish direct communications between living brain tissue and external devices such as artificial arm.The major components of these systems include measurements of cortical neuronal activity, extraction of task-relevant motor intention (decoder), and an encoder that feeds back the motor relevant sensory information back to the brain.Thus the brain, the BMI and the prosthetic device together act as a closed-loop BMI. Figure 1 shows a closed-loop BMI design (also known as brain-machine-brain interface (BMBI) [3]).In the last two decades, BMI based motor intended neural prosthetic systems have been studied extensively [4][5][6][7][8][9][10][11].In most of these studies, healthy subjects are trained to perform a specific motor task such as reaching or grasping.Recorded data during the performance of the task are then used to develop a mathematical model called decoder.The decoder extracts the kinetic as well as the kinematic motor information from a continuously recorded firing activity of motor relevant cortical neurons.The performance of the decoder is typically measured by applying the decoded information to a prosthetic arm.The online movement based error correction during the reaching task is accomplished by the subject using the available visual feedback information in the absence of the natural proprioception.Therefore these BMIs are considered as partially closed-loop systems in their current formulations where the incorporation of artificial proprioception is neglected in their designs.
In the absence of tactile feedback, these BMIs can fail to differentiate visually similar textures.Similarly, in the absence of proprioception, these BMIs are unable to provide the natural sensation of the arm movement which are both experienced and used by healthy subjects in controlling their natural limb movements.It has been recognized in the BMIs community that the inclusion of sensory feedback from the actuated artificial limb in BMIs is necessary to improve the versatility of motor-based BMIs [12].It has also recently been shown in [13] that kinesthetic feedback together with the visual feedback can significantly improve the BMI performance.
Recently, attempts have been made towards closing the BMI loop by incorporating artificial texture [3] and proprioception [14,15] information.In these studies, a intra-cortical micro-stimulation (ICMS) technique has been investigated as a promising approach in providing artificial sensation of motor tasks to the brain.The approach relies on a learning paradigm where the subject is trained to differentiate [3] or learn [14,15] artificial sensory feedback in a task-dependent context.Even though the approach is promising for developing future BMIs, the experimental trial and error approach in designing appropriate stimulating sensory input currents may change the natural functionality of the brain.Therefore, a systematic approach that uses optimal feedback control theory is highly desirable towards developing stimulation enhanced next generation BMIs.This approach provides flexibility in designing optimal stimulating sensory input currents and analyzing the closed-loop BMI under various feedback scenarios.It may also reduce the learning effort of the motor task by guiding the movement in the initial learning of the BMI motor task.
The intellectual merit of studying BMIs by taking control-theoretic approach is to exploit all the available degrees of freedom in developing the next generation of BMI-based feedback-enabled neuroprosthetic devices.The development of these feedback-enabled neuroprosthetic devices is necessary for making the prosthetic devices prone to error in decoding and targeting the intended action of the neurons.Moreover, issues such as prosthetic system stability, system reliability, impact of transmission loss, latency and time delays, impact of model complexity and uncertainty, optimality of the modeling and control framework, etc. are critical to the clinical deployment of next generation BMIs.
In recognition of these merits, systematic control-theoretic approaches have recently been taken to optimize the learning process in BMIs [7], rigorously analyze the BMI systems [16][17][18] and design artificial sensory feedback optimally [19,20] by developing a theoretical framework based on optimal feedback control (OFC) theory.For instance, the authors in [7] proposed a learning model to maximize the performance of closed-loop BMIs by leveraging tools such as inverse models and feedback control which are formally embedded in control theory.The authors in [18] developed a theoretical model of BMI experiments using optimal feedback control as a policy for brain control during BMI motor tasks.The framework incorporates visual and proprioceptive feedback in estimating states which are then used to compute optimal control inputs for stimulating neuronal models of the primary motor cortex and the premotor cortex neurons.Using this framework, the authors showed that the experimentally observed abrupt changes in neural modulations when switching to BMI control can be explained using optimal feedback control.In [16], the authors investigated the importance of visual and proprioceptive feedback in BMIs by using a framework of model predictive control in designing optimal stimulus for a single spiking neuron for a single joint movement task based closed-loop BMI.In [17], the authors proposed a stochastic optimal feedback controller as the closed-loop operation of the brain during a BMI performance.Using this framework, they analyzed the performance of open and closed-loop BMIs and explained key phenomenon in closed-loop BMI operation.In particular, they explained the experimentally observed parametric variations in closed-loop operation of BMIs such as the performance deterioration with increasing bin width and diminishing effect of decoder bias in closed-loop BMIs.In [19], the authors showed the first systematic approach in designing optimal artificial sensory feedback in closed-loop BMIs.In particular, the authors proposed an optimal design of feedback-enabled closed-loop BMI and designed artificial proprioceptive feedback using a model predictive controller in a single joint reaching task.The authors further extended their firing rate-based approach [19] to ICMS [20].
In this article, we theoretically demonstrate the recovery of closed-loop performance of a BMI for voluntary single joint extension task by designing an optimal artificial sensory feedback in the absence of the natural proprioceptive feedback pathways.Throughout our analysis, we exclude the treatment of visual feedback as well any form of cortical learning.A theoretical experiment is performed on a firing rate based cortical circuit model of a voluntary control of a single joint extension task to design a BMI.In particular, we design both the Wiener filter and the Kalman filter based decoders and compare their performances.We emphasize the degraded online performance of both decoders in the absence of the proprioceptive feedback.An optimal artificial sensory feedback in the framework of model predictive control is designed to compensate the loss of the natural proprioceptive feedback pathways.We extend our design by including a recurrent network of spiking neurons in the firing rate based neurophysiological cortical circuit model which allow us to design artificial sensory feedback in the form of a charge-balanced biphasic waveform of stimulating input current.We demonstrate the efficacy of our designs by performing simulations.
The remainder of this paper is organized as follows.In Section 2, we describe the neurophysiological cortical circuit model of a voluntary single joint extension task which we used to generate the synthetic data in Section 3.This is followed by the design of both the Wiener filter and the Kalman filter based linear decoders in Section 4. The performance analysis of the designed decoder in the presence and in the absence of the natural proprioception are described in Section 5. We formulate the model predictive control problems and design the artificial sensory feedback in Section 6.The paper ends with discussion.

Psycho-Physiological Cortical Circuit Model of Single Joint Movement
We used an average firing rate based psycho-physiological cortical circuit model, proposed by [21] and shown in Figure 2, for voluntary control of a single joint movement to design a closed-loop BMI.This minimal model captures the essential cortical pathways as well as the proprioceptive feedback pathways which are relevant during voluntary extension or flexion of a single joint such as elbow.Although the model excludes the treatment of visual feedback during the movement, the model has shown its capability in a qualitative reproduction of several experimentally observed results on voluntary control of a single joint movement.The details of the model and its connection with neurophysiology of a single joint voluntary movement can be found in [21].Nomenclature (adopted from [21]): "GO" is a scalable gating signal; "DVV" is the desired velocity vector; "OPV" is the outflow position vector; "OFPV" is the outflow force and position vector; "SFV" is the static force vector; "IFV" is the inertial force vector; "PPV" is the perceived position vector; "DV" is the difference vector; "TPV" is the target position vector; "γ d " and "γ s " are dynamic and static gamma motoneurons respectively; "α" is alpha motoneuron; "Ia" and "I I" are type Ia and II afferent fibers; − represents inhibitory feedback.The rest of the connections are excitatory.
Briefly, a population of area 5 (the parietal cortex) "DV" neurons computes the difference between the target and the perceived limb position vectors.The average firing activity of a population of these neurons is represented as Here, 0 ≤ r i (t) ≤ 1 represents the average firing activity of a population of "DV" neurons associated with the agonist muscle i and shows a phasic behavior during the movement.Throughout the paper, we will denote the average firing activity of neurons associated with the agonist muscle i by the subscript i and the corresponding antagonist muscle by the subscript j.T i is the target position vector ("TPV") command for the target position of the agonist muscle i. x i (t) is the average firing activity of a population of area 5 "PPV" neurons.These neurons continuously compute the present position of the agonist muscle i.B r is the base firing activity of the "DV" neurons.Continuously computed difference vector information by the area 5 "DV" neurons is then scaled by a population of area 4 (the primary motor cortex (M1)) "DVV" neurons as u i (t) = max{g(t).(ri (t) − r j (t)) + B u , 0}. ( Here, u i (t) is the average firing activity of a population of area 4 "DVV" neurons.B u is the base firing activity of the "DVV" neurons.g(t) is an internal "GO" signal which is assumed to be originated from the basal ganglia."DVV" neurons fire only during the movement and thus their average firing activity shows a phasic-movement time (MT) behavior.The dynamics of the internal "GO" signal is modeled as Here, represents a slow integration rate and is treated as constant.C is a constant value at which the "GO" neurons saturate.The area 4 "OPV" neurons receive information from the area 4 "DVV" neurons as well as the area 5 "PPV" neurons and show tonic firing activity.The average firing activity of a population of "OPV" neurons is modeled as Here, η is a scaling factor.The average firing activity of a population of static (γ S i (t)) and dynamic (γ D i (t)) gamma motoneurons are modeled as Here, ρ is a scaling parameter.The average firing activity of the primary ("Ia") and the secondary ("II") muscle spindles afferents are modeled as Here, s 1 i (t) and s 2 i (t) are the primary and the secondary spindles afferents average firing activity respectively.p i is the position of the agonist muscle i. θ is the sensitivity of the static nuclear bag and chain fibers.φ is the sensitivity of the dynamic nuclear bag fibers.The saturation of spindles afferents activity is given by the function S(ω) = ω/(1 + 100ω 2 ).The average firing activity x i (t) of a population of area 5 "PPV" neurons is modeled as Here, τ is the delay time of the spindles feedback.Θ is a constant gain.The average firing activity q i (t) of a population of area 4 "IFV" neurons is modeled as Here, Λ is a constant threshold.The average firing activity f i (t) of a population of area 4 "SFV" neurons is modeled as Here, h is a constant gain which controls the strength and speed of an external load compensation.
ψ is an inhibitory scaling parameter.The average firing activity a i (t) of a population of the area 4 "OFPV" neurons is modeled as The average firing activity of these neurons shows a phasic-tonic behavior.The average firing activity α i (t) of alpha motoneurons is modeled as where δ is a stretch reflex gain.The limb dynamics is described by Here p i (t) is the position of the agonist muscle i within its range of origin-to-insertion distances.p j (t) is the position of the antagonist muscle such that p i (t) + p j (t) = 1.I is the moment of inertia of the limb.V is the joint viscosity.E i is the external force applied to the joint.M(c i (t), p i (t)) = max{c i (t) − p i (t), 0} represents the total force generated by the agonist muscle i. c i (t) is the muscle contraction activity dynamics of which is given by

Synthetic Experimental Data Generation for Extension Task
In a typical non-human primate experiment, a monkey is trained to accomplish a given motor task such as reaching or grasping.After the training, spiking activity of single neurons are recorded through implanted multi-channel electrodes from various motor relevant cortical areas such as the primary motor cortex (M1), the premotor area (PMv, PMd), and the primary somatosensory area (S1).Simultaneously, kinetic and kinematic information such as joint torque, velocity and position of the real arm are measured to generate a data set to design a decoder.
In this work, we generated a synthetic experimental data set for voluntary control of a single joint extension task by simulating the system model ( 1)- (13) in MATLAB (Version R2011b, The MathWorks, Inc., Natick, MA, USA).The target position of the agonist muscle i was set to the desired one at t = 0.The "GO" signal was turned on at t = 50 ms.During the initial 50 ms, the system was at the priming state.The initial condition of variables was set to 0 except x i (0) = x j (0) = 0.5, y i (0) = y j (0) = 0.5, p i (0) = p j (0) = 0.5, u i (0) = u j (0) = B u and r i (0) = r j (0) = B r .For the simulation, we used the following model parameters [21]: In BMI experiments, a trial is considered successful if the trained monkey accomplishes the specified motor task in a given time duration.Thus the accomplishment time of the task in successful trials is allowed to vary.In our case, the "GO" signal controls the velocity of the joint movement and thus the accomplishment time of a given task.Therefore we assumed that there is a trial-to-trial variability in the internal "GO" signal.To introduce the trial-to-trial variability in the "GO" signal, we modeled g 0 as a Gaussian distributed random variable with mean 0.75 and variance 0.0025.For a given trial, g 0 is constant.It should be noted that the "GO" signal has no effect on the accuracy of the movement.
We simulated the model and generated synthetic data for 1600 independent trials of the voluntary single joint extension task.In each of these trials, the simulation was performed for the duration of 1.45 s which includes a variable holding period at the target position after the accomplishment of the task.To generate a synthetic data set, we measured the average firing activity of a population of area 4 "DVV", "OPV", and "OFPV" agonist and the corresponding antagonist neurons sampled at every 10 ms.Simultaneously, we measured the agonist muscle position p i (t), the agonist muscle velocity v i (t) = dp i (t)/dt, and the total force difference between the agonist and the corresponding antagonist muscle ∆M(t) = M(c i (t), p i (t)) − M(c j (t), p j (t)) at every 10 ms.With this, we created a data set of 233, 600 samples by embedding recorded data from 1600 trials.

Wiener and Kalman Filters Based Decoder Designs
Neurophysiological as well as BMI experimental studies have shown the encoding of task-relevant kinetic as well as kinematic motor information in the spike trains of the cortical area 4 neurons.For a given motor task, this motor information is extracted from a continuously recorded spike train of the cortical area 4 neurons by developing a mathematical model called decoder.In the past, several linear and nonlinear decoders have been developed in BMI studies.Among them, the most popular are based on the Wiener filter [22], the Kalman filter and its variations [23][24][25][26][27][28], artificial neural networks [29] and recurrent neural networks [30].
We designed both the adapted Wiener filter [22] and the Kalman filter [28]) based decoder models (linear decoders) to extract the total force difference between the agonist and the corresponding antagonist muscle (∆M(k)), the agonist muscle position (p i (k)), and the agonist muscle velocity (v i (k)) from the recorded firing activity of the area 4 "DVV", "OPV", and "OFPV" neurons and performed comparisons between the performance of these two decoders.Here, k = 1, 2, • • • is a discrete sample time at which data were recorded for a given trial.k = 1 corresponds to t = 0 ms, k = 2 corresponds to t = 10 ms, and so on.We emphasize that these neurons have direct contribution to the spinal cord circuit of the neurophysiological system shown in Figure 2.

Wiener Filter Based Decoder Design
In a discrete-time adapted Wiener filter based decoder design [22], the relation between ∆M(k) and the average firing activity of area 4 neurons, i.e., "DVV", "OPV", and "OFPV" neurons can be expressed as Here, w is a (L.N) × 1 weight vector.L is the number of delay elements.(•) T is the transpose of a vector.
represents the average firing activity of the population m delayed by l samples.For our system, z 1 = y i , z 2 = y j , z 3 = u i , z 4 = u j , z 5 = a i , and z 6 = a j .Thus, N = 6.We assumed the number of delay elements L = 10.Thus the weight vector w has a dimension of 60 × 1.We also assumed that there is no measurement noise in obtaining data i.e., n 1 (k) = 0. Similarly, the relation between [p i (k), v i (k)] T and the average firing activity of area 4 neurons i.e., "DVV", "OPV", and "OFPV" neurons can be expressed as Here, W T is a weight matrix of dimension 60 × 2. We again assumed that there is no measurement noise in obtaining data.
For consistency with BMI experiments, in this work, we used 220,000 samples of the recorded synthetic data to train the weight vector "w" and the weight matrix "W" of the designed decoders (Equations ( 14) and ( 15)).For this, we used the following normalized least mean squares algorithm [22]: (16a) Here, η and β are constants.|| • || represents the Euclidean norm.In Equation (16a), e(k) represents a scalar error between the recorded ∆M(k) and the estimated value through Equation ( 14).In Equation (16b), e(k) represents the error vector between the recorded [p i (k), v i (k)] T and the estimated value through Equation (15).For our study, we set η = 0.01 and β = 1.After the training, we froze the weight vector "w" and the weight matrix "W" to the final adapted value.Then we used the rest of 13,600 samples to validate the performance of both decoders.
Figure 3A,C shows the offline performance of the adapted Wiener filter based decoder in decoding the joint position p i (k) and the joint velocity v i (k) respectively, as defined by Equation ( 15), on the test data for 1000 samples.Figure 3E shows the offline performance of the adapted Wiener filter based decoder in decoding the force difference between the agonist and the corresponding antagonist muscle, ∆M(k), as defined by Equation ( 14), on the test data for 1000 samples.

Kalman Filter Based Decoder Design
The following set of dynamical equations describes the design of the Kalman filter based decoder [28]: x Here, x(k | k − 1) and x(k) represent a priori and a posteriori estimate of the state vector x(k) of dimension n × 1 at time k respectively.P(k | k − 1) and P(k) are the estimate of a priori and a posteriori covariance matrix respectively.z(k) is the observation (firing rate) vector of dimension r × 1.K k is the Kalman gain and I is an identity matrix.A ∈ R n×n is the state matrix and is given by C ∈ R r×n represents the observation matrix and is given by T are covariance matrices of Gaussian noise sources with mean zero to the state and the observation vectors respectively.The structure of matrices Z, X, X 1 , and X 2 is given by Here, z i,j represents the j th firing rate data of the i th neuron.x i,j represents the j th data of the i th state.
For our system, x(k) ≡ ∆M(k) (n = 1) if the total force difference between the agonist and the corresponding antagonist muscle (∆M(k)) is extracted, and if the position and the velocity of the agonist muscle are extracted from the average firing activity of area 4 neurons i.e., "DVV", "OPV", and "OFPV" neurons.r = 6 and D = 220,000.
For consistency with the Wiener filter based decoder design, we used 220,000 samples of the recorded synthetic data to compute A, C, R, and Q for both x(k) ≡ ∆M(k) and x(k) ≡ [p i (k), v i (k)] T .Then we used the rest of 13, 600 samples to validate the performance of the decoder for both cases.
Figure 3F shows the offline performance of the Kalman filter based decoder on the test data for 1000 samples when x(k) ≡ ∆M(k) and Figure 3B,D shows the offline performance of the Kalman filter based decoder on the test data for 1000 samples when x(k) ≡ [p i (k), v i (k)] T .Clearly, the Kalman filter based decoder performed better than the Wiener filter based decoder on the test data.

Comparison of Designed Decoders
We compared the performance of the designed decoders (i.e., the Wiener filter and the Kalman filter) in an open-loop BMI system design shown in Figure 4.As shown in Figure 4, the average firing activity of area 4 neurons i.e., "DVV", "OPV", and "OFPV" neurons are used by the decoder to extract either the total force (∆M(k)) or the position (p i (k)) and the velocity (v i (k)) of the agonist muscle.To compare the performance of both decoders (i.e., the Wiener filter and the Kalman filter based decoders), We simulated (1)- (10) along with the particular decoder model (i.e., ( 14)-( 16) for the Wiener filter and ( 17) and (18) for the Kalman filter) and s = 0 and g 0 = 0.75.Remaining model parameters are the same as given in Section 3.
Figure 5A,B compare the open-loop performance of the Wiener filter based decoder and the Kalman filter based decoder with the performance of the closed-loop real system shown in Figure 2 when the extracted information from both decoders is the position (p i (t)) and the velocity (v i (t)) of the agonist muscle in real time t respectively.As shown clearly in these figures, the performance of both decoders degrades substantially in the absence of the natural proprioception information.Moreover, the Kalman filter performed better over the Wiener filter in decoding position compared to the velocity of the agonist muscle.
Figure 5C compares the open-loop performance of the Wiener filter based decoder and the Kalman filter based decoder with the performance of the closed-loop real system shown in Figure 2 when the extracted information from both the decoders is the total force (∆M(t)) in real time t.As shown in Figure 5C, the performance of both decoders degrades substantially in the absence of the natural proprioception information.
Without the loss of generality, in the rest of the paper, we considered the Wiener filter based decoder design with ∆M(t) as the extracted information to design closed-loop BMIs.Note that the described approach can be implemented to other forms of decoders as well.

Need of a Closed-Loop BMI
We studied the online performance of the Wiener filter based decoder with ∆M(k) as the extracted information in the presence and in the absence of the natural proprioceptive feedback information when the decoder interacts with the dynamics of the muscle as shown in Figure 6.To make a realistic comparison of the performance of the decoder with the neurophysiological (psycho-physiological) system, we first carried out our investigation by studying the performance of the neurophysiological system shown in Figure 2 in the presence and in the absence of the natural proprioceptive feedback i.e., the sensory feedback.For both cases, we simulated (1)-( 13) with g 0 = 0.75.The remaining model parameters are the same as given in Section 3 for both cases except θ = 0 and ρ = 0 in the case of no proprioception.This means that the primary ("Ia") and the secondary ("II") muscle spindles afferents become inactive (see (6)) in the absence of proprioception.
Figure 7A shows the position trajectory of the agonist muscle i (p i (t)) in the presence and in the absence of the natural proprioception for the neurophysiological system.shows the position trajectory for the neurophysiological system ("Psycho-Physiological System") while (B) shows the position trajectory for the designed BMI ("Brain-Machine Interface").The desired position target (T i ) for the agonist muscle i in (A,B) is 0.7; (C) shows the average firing activity of a population of agonist area 4 "DVV" (u i (t)), area 4 "OPV" (y i (t)), area 4 "OFPV" (a i (t)) and area 5 "PPV" (x i (t)) neurons in the presence (solid line) and in the absence (dotted line) of the natural proprioceptive feedback information for the designed BMI.
As shown in Figure 7A, the desired position of the agonist muscle i has been achieved in both cases for the neurophysiological system.The result is consistent with a prior neurophysiological experiment where it was shown that a trained monkey (in the absence of visual feedback) can reach the desired target position in the presence and in the absence of proprioception [31].
Next we studied the performance of the closed-loop BMI (decoder) (in the presence of the natural proprioceptive feedback information) and the open-loop BMI (decoder) (in the absence of the natural proprioceptive feedback information) shown in Figure 6.For this, we simulated (1)-( 10), ( 12) and ( 14).Here we assumed that the limb dynamics is the same for the neurophysiological system and the BMI.For the open-loop BMI, we again set θ = 0 and ρ = 0.
Figure 7B shows the position trajectory of the agonist muscle i (p i (t)) in the presence and in the absence of the natural proprioception for the closed-loop and the open-loop BMI.
It is clear from Figure 7B that the decoder performance degrades substantially when the decoder, trained with the closed-loop data, is applied on the open-loop system.Since the decoder was trained with the closed-loop firing activity of the area 4 "OPV", "DVV" and "OFPV" neurons, the firing activity of these neurons must have changed significantly in the absence of the natural proprioception feedback information.To see this, we plotted the firing activity of these neurons in the presence and in the absence of the natural proprioception feedback information for the BMI design shown in Figure 6. Figure 7C shows the firing activity of the area 4 "OPV", "DVV" and "OFPV" neurons and the area 5 "PPV" neurons in the presence and in the absence of the natural proprioceptive feedback information.
As shown in Figure 7C, the firing activity of cortical neurons deviates significantly from the closed-loop activity in the absence of the natural proprioceptive feedback information.Since the weights of the designed decoder were not adapted to accommodate these significant deviations in the firing activity of the area 4 neurons, the decoder performance degrades substantially in the absence of the natural proprioceptive feedback information.These results clearly show that there is a necessity for designing an artificial proprioceptive feedback to regain the closed-loop performance of the designed decoder in the absence of the natural proprioceptive feedback.

Artificial Proprioceptive Feedback Design
As shown in Figure 2, the area 5 "PPV" neurons receive the position feedback information through the primary (Ia) muscle spindles afferents.These neurons then use this proprioceptive feedback information to compute the present position vector command.In the absence of the natural proprioceptive feedback pathways, this feedback information is lost.In order to compensate the lost feedback information of the area 5 "PPV" neurons, we design an artificial sensory feedback in a model predictive control framework.The goal is to recover the closed-loop performance of the decoder (the Wiener filter based decoder with ∆M(k) as the extracted information) by providing the designed optimal artificial sensory feedback to the "PPV" neurons in the absence of the natural proprioceptive feedback pathways.Note that we are not designing artificial feedback to compensate the loss of sensory feedback to the area 4 "IFV" and "SFV" neurons in this study.Thus in the absence of the natural sensory feedback, these neurons remain inactive during our analysis.

Model Predictive Control
Model predictive control (MPC) is an optimal control strategy that explicitly incorporates a dynamic model of the system as well as constraints in determining control actions.At each time k, the system measurements are obtained and a model of the system is used to predict future outputs of the system O k+l+1|k , l = 0, 1, 2, • • • , N p − 1 as a function of current and future control moves How far ahead in the future the predictions are computed is called the prediction horizon N p and how far ahead the control moves are computed is called the control horizon N c . Figure 8 illustrates the idea of prediction and control horizon in a model-based receding horizon control strategy.Using the predictions from the model, the N c control moves I k+l|k , l = 0, 1, • • • , N c − 1 are optimally computed by minimizing a cost function J k over the prediction horizon N p subject to constraints on the control inputs as well as any other constraints on the internal states and outputs of the system as follows: min subjects to constraints on control inputs and the system.A typical quadratic objective cost function may be of the form Here, P k+l+1|k is the output to be tracked.The matrices Q 1 and R 1 are positive semidefinite and positive definite weighting matrices respectively which determine the relative importance of the tracking objective versus control effort.Only the first optimally computed move I k|k is implemented out of m computed optimal moves at time k.At the next time k + 1, new system measurements are obtained and the optimization problem is solved again with the new measurements.Thus, the control and prediction horizon recede by one step as time moves ahead by one step.The measurements at each sampling time provide feedback for rejecting inter-sample disturbances, model uncertainty and noises, all of which cause the model predictions to be different from the true system output.

Firing Rate Based Closed-Loop BMI Design
In order to recover the closed-loop (natural) performance of the decoder, we formulate two control problems in this section.In the first problem, we design an optimal artificial sensory feedback to stimulate the population of area 5 "PPV" neurons such that the position trajectory of the agonist muscle i matches the position trajectory obtained in the presence of the natural proprioception during the reaching task.We call this "Problem 1".Although we are not treating the visual feedback in this work, this position trajectory tracking problem can be considered equivalent to the BMI experiments where the visual feedback is used by the subject to make online corrections in the trajectory during the reaching task in the absence of proprioception.In the second problem, the goal is to match the average firing activity of the agonist population of "PPV" neurons to its natural firing activity by designing an optimal artificial feedback to stimulate the agonist population of the area 5 "PPV" neurons.We call this "Problem 2".This firing rate trajectory tracking problem can be considered equivalent to the BMI experiments where the primary somatosensory area (S1) neurons are stimulated artificially to restore the natural proprioception information.Figure 9A,B show the design of a closed-loop BMI operation during the reaching task for problem 1 and problem 2 respectively.
As shown in Figure 9A,B, for both problems we use the MPC strategy (described above) to design the optimal artificial sensory feedback and formulate the following control problem for the systems shown in Figure 9A,B: min such that Here, 20)).In case of Problem 1 (see Figure 9A), the measured output of the system O(k | k) at a given time k is the position of the agonist muscle i, i.e., p i (k | k).P(•) represents the desired position trajectory.In case of Problem 2 (see Figure 9B), the measured output of the system O(k | k) at a given time k is the average firing activity of the area 5 "PPV" neurons associated with the agonist muscle i, i.e., x i (k | k).P(•) represents the desired average firing activity of the area 5 "PPV" neurons.
To solve the control problem (21), we first compute the desired position and firing activity trajectory for Problem 1 and 2 respectively.For this, we simulate (1)-( 5), (6a), ( 7), ( 10), ( 12) and ( 14).It should be noted that we have included only the natural sensory feedback (Ia) to the area 5 "PPV" neurons through Equation (6a) for computing the desired trajectory.Thus in the absence of natural proprioception to the area 4 "IFV" and "SFV" neurons, q i (t) = f i (t) = 0 in (10).Next we compute Problem 1 and 2 respectively.For this, we use a model of the system given by ( 1)-( 4), ( 10), ( 12) and ( 14) along with the following modified firing activity dynamics of the "PPV" neurons: Here, I(k + l | k) is constant during t and t + 10 ms i.e., between the sample time.Note that the average firing activity of the primary ("Ia") muscle spindles afferents is set to 0 in (22) (see (7) for the comparison).(A) shows the design for "Problem 1".Here the model predictive controller designs the "Artificial Feedback" to stimulate "PPV" neurons such that the system output ("Single Joint Position" trajectory) mimics the "Desired Joint Position" trajectory; (B) shows the design for "Problem 2".Here the model predictive controller designs the "Artificial Feedback" to stimulate "PPV" neurons such that the system output ("PPV Neurons Firing Rate" ) mimics the "Desired PPV Neurons Firing Rate".
With this, we solve the optimization problem (21) numerically in MATLAB for both problems.We use the MATLAB optimization function "fmincon" with the sequential quadratic programming "sqp" algorithm.For both problems, we set N c = 5 and N p = 30.Figure 10A shows the performance of the controller in tracking the desired position trajectory of the agonist muscle i for "Problem 1".9A).(A) shows the position trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the velocity trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the average firing activity of a population of the cortical area 4 "DVV" neurons (u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of the artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red).

Artificial Proprioception
As shown in Figure 10A, the controller performs well in tracking the desired position trajectory.Also the stimulation of the area 5 "PPV" neurons by the designed optimal artificial sensory feedback recovers the closed-loop velocity trajectory, as shown in Figure 10B.Thus the designed optimal artificial sensory feedback is sufficient in recovering the closed-loop performance of the decoder in the absence of the natural proprioceptive feedback pathways.Next we wonder if the designed optimal artificial sensory feedback in "Problem 1" also recovers the natural average firing activity of the cortical neurons.
Figure 10C shows the average firing activity of the cortical area 4 and 5 neurons during the reaching task.As shown in this figure, the average firing activity of the cortical area 4 and 5 neurons during the artificial stimulation differs significantly from the natural one.This shows that although the artificial stimulation of the "PPV" neurons through the design of "Problem 1" recovers the closed-loop performance of the decoder, it fails to recover the natural firing activity of the cortical neurons.
Next we study the performance of the designed controller for "Problem 2" (see Figure 9B).Remember that the objective of the controller here is to track the natural average firing activity of the area 5 "PPV" neurons by designing optimal artificial sensory input to the "PPV" neurons.Figure 11A shows the average firing activity of the cortical area 4 and 5 neurons.
As shown in the bottom right plot of Figure 11A, the designed controller performs well in tracking the natural firing activity of the area 5 "PPV" neurons.Moreover, the stimulation results in recovering the natural firing activity of the area 4 cortical neurons.
Figure 11B,C show the position and velocity trajectory of the agonist muscle i during the movement.As shown in these figures, the decoder recovers the closed-loop (natural) performance in this case.This shows that the designed controller in "Problem 2" not only recovers the closed-loop performance of the decoder but also recovers the natural firing activity of the cortical neurons through the optimal artificial stimulation.

Intracortical Micro-Stimulation Based Closed-Loop BMI Design
In the previous section, we focused on designing optimal artificial proprioception stimuli in the form of average firing rates.However in experimental BMIs studies, intracortical micro-stimulation (ICMS) based currents in a charge-balanced biphasic waveform are typically used to stimulate cortical sensory neurons and thus to provide artificial sensory feedback during closed-loop operation of BMIs [3,14,32].
Therefore, we modified the MPC-based closed-loop BMI design shown in Figure 9A which allowed us to use such waveform of currents in the present firing activity based framework, as shown in Figure 12.We modeled the "Network of spiking neurons" in Figure 12 by a recurrent network of the Integrate-and-Fire neurons with dynamics [33] where v k (t) is the membrane potential of neuron k at time t, τ k is the membrane time constant of neuron k and R s is the membrane resistance.I k (t) is the total input current to the neuron k and is the sum of the synaptic current I s k (t) and the stimulating current I E (t).v k (t) is reset to v r whenever v k (t) exceeds a constant firing threshold v th and an action potential is assumed at this time.g k (t) ≥ 0 is the excitatory conductance and E is the excitatory membrane reversal potential.N k is the total number of presynaptic neurons to the neuron k, w l is the weight of the synapse from the l th neuron to the post-synaptic neuron k, t f l is the time of the f th incoming action potential from the l th presynaptic neuron, K(t − t f l ) is the stereotypical time course of postsynaptic conductance following presynaptic spikes, q l is the maximum conductance transmitted by the l th neuron, and Θ(t − t f l ) is the heavy-side function.We considered a fully recurrent network of 3 excitatory neurons to represent both the agonist population and the antagonist population of neurons.Network parameters for both populations are provided in Appendix A.1.We computed the average firing rate from the spiking activity of both agonist and antagonist populations as the ratio of the number of spikes in a time window of T s = 30 ms and the total number of neurons in individual population, i.e., 3.
We used a rectangular charge-balanced biphasic waveform shown in Figure 13 to design the stimulating current.Here a 1 , a 2 , d 1 , d 2 , d 3 and d 4 characterize a single biphasic pulse of the current I E (t).The symmetric form of this current minimizes tissue damage, prevents irreversible electrode corrosion and avoids release of toxic metal oxides [34].This form of current is also the accepted practice in various microstimulation applications such as vestibular prosthesis [35], cochlear implant [36], FES [37,38], DBS [39], retinal prosthesis [40], and BMIs [3,14].We formulated the following optimal control problem in the MPC framework to recover the closed-loop (natural) performance of the decoder: such that Here, is the cost function.N c and N p are the control and the prediction horizon respectively.a In this work, we set A max = 10,000, A min = −10,000, T s = 30 ms, n t (k + l | k) = 1, N p = 30, and N c = 15.Using the constraints (24f) and (24g) and n t (k + l | k) = 1, our optimization variables reduced to We used a modified particle swarm optimization (PSO) algorithm within the parallel computing toolbox of MATLAB to solve the optimization problem (24) (see Appendix A.2 for details).Figure 14A,B show the performance of the designed controller (24) in tracking the desired position and velocity trajectories of the agonist muscle i.
As shown in Figure 14A, the controller performs well in tracking the desired position trajectory (although not perfect).Also the stimulation of the area 5 "PPV" neurons by the designed optimal artificial sensory feedback in the form of biphasic waveform recovers the closed-loop velocity trajectory, as shown in Figure 14B.We note that the controller performance is poor compared to the firing rate design shown in Figure 10.
Figure 14C shows the average firing activity of the cortical area 4 and 5 neurons during the reaching task.As shown in this figure, the average firing activity of the cortical area 4 and 5 neurons during the artificial stimulation differs significantly from the natural as we also observed in Figure 10C.This shows that although the artificial stimulation of the "PPV" neurons through the design of ICMS current (approximately) recovers the closed-loop performance of the decoder, it fails to recover the natural firing activity of the cortical neurons.Figure 14D shows the firing rate trajectory of the agonist and antagonist population of the spiking network.12).(A) shows the position trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the velocity trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the average firing activity of a population of the cortical area 4 "DVV" neurons (u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of the artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (D) Firing trajectory of the agonist and the antagonist population of the spiking network.

Tracking Firing Rate of Neurons which Encode Sensory Information Recovers the Natural Performance of BMI
In this article, we have designed optimal artificial sensory feedback in a control-theoretic framework to recover the closed-loop performance of a brain-machine interface (BMI) during voluntary single joint extension task.This is the first systematic attempt to incorporate artificial proprioception in BMIs towards stimulation enhanced next generation BMIs.Our approach differs from recent closed-loop BMI system modeling approaches where the focus has mainly been on the analysis of closed-loop BMIs within optimal feedback control framework [17,18].Two control problems, namely, the position trajectory tracking problem and the cortical sensory neurons average firing rate tracking problem, have been investigated towards designing an optimal artificial sensory feedback for the BMI in the receding horizon control framework.Our position trajectory tracking problem designs the artificial proprioceptive feedback which differs from the natural proprioceptive feedback and can potentially be learned by the BMI user as shown in [3,15].The cortical sensory neurons average firing rate tracking problem designs the natural proprioceptive feedback through external stimulations [14].Our results shows that tracking the natural firing activity of the cortical sensory neurons using an external stimulating controller is the appropriate approach towards recovering the natural performance of the motor task.Such an approach requires a complete understanding of how our brain encodes the sensory information.In the absence of a complete understanding of neural encoding, the position trajectory problem provides a more practical way to incorporate artificial sensory feedback in BMIs and thus to close the BMI-loop as also recognized in [15].

Generalization beyond Tracking Problems
We formulated trajectory tracking control problems to design optimal artificial sensory feedback wherein we assumed that the desired trajectory (position or firing rate) is known.Obtaining such an information a priori may not be feasible in BMI motor tasks.Nevertheless, our proposed framework, based on optimal feedback control, can be generalized to design the necessary sensory feedback and thus close the BMI-loop using the available information such as the desired end-point of the arm effector and various realistic constraints.For example, one can formulate a control problem in the model predictive control framework to minimize the error between the current position and the desired position along with penalizing the derivative of the total force to stabilize the position while ensuring the smoothness of the movement.Such control problems within the optimal control framework have been studied in the context of sensorimotor control [18,41].

Limitations
In this article, we used an experimentally validated psycho-neurophysiological cortical circuit model of a single joint extension/flexion task to design a BMI.The model provided a basis to investigate the role of proprioceptive feedback in single joint tasks and to design optimal artificial sensory feedback in a BMI using the model predictive control framework.However, this model limited our investigation to a single sensory modality as it does not include visual feedback and learning.For example, we have not investigated how our design will get affected in the presence of visual feedback and any form of cortical learning of a motor task.Since our designed controller is adaptive in nature, the designed artificial sensory feedback will adapt with the learning.It may also be possible that the design controller can help in the initial learning of the BMI task by guiding the movement and thus can reduce the learning effort of the motor task.

Figure 2 .
Figure 2. A psycho-physiological cortical circuit model for voluntary control of single joint movement: The diagram has been redrawn from Bullock et al. [21], Figure 1.1.Nomenclature (adopted from[21]): "GO" is a scalable gating signal; "DVV" is the desired velocity vector; "OPV" is the outflow position vector; "OFPV" is the outflow force and position vector; "SFV" is the static force vector; "IFV" is the inertial force vector; "PPV" is the perceived position vector; "DV" is the difference vector; "TPV" is the target position vector; "γ d " and "γ s " are dynamic and static gamma motoneurons respectively; "α" is alpha motoneuron; "Ia" and "I I" are type Ia and II afferent fibers; − represents inhibitory feedback.The rest of the connections are excitatory.

Figure 3 .
Figure 3.Comparison of the offline performances of the Wiener filter and the Kalman filter based decoders for the single joint reaching task on a sample part of the test data.(A,C,E) show the comparison between the experimental (dotted line, red) and the decoded (solid line, blue) joint position p i (k), joint velocity, v i (k) and force difference between the agonist and the corresponding antagonist muscle, ∆M(k) respectively for the Wiener filter based decoder; (B,D,F) show the comparison between the experimental (dotted line, red) and the decoded (solid line, blue) joint position p i (k), joint velocity, v i (k) and force difference between the agonist and the corresponding antagonist muscle, ∆M(k) respectively for the Kalman filter based decoder.

Figure 4 .
Figure 4.An open-loop BMI system design.

Figure 5 .
Figure 5.Comparison of the open-loop performances of the Wiener filter based decoder and the Kalman filter based decoder with the performance of the closed-loop real system shown in Figure 2. (A) compares the open-loop decoder performance and the performance of the closed-loop real system when the extracted information from both decoders was the joint position, p i (t); (B) shows the open-loop decoder performance and the performance of the closed-loop real system when the extracted information from both decoders was the joint velocity, v i (t); (C) compares the open-loop decoder performance and the performance of the closed-loop real system when the extracted information from both decoders was the force difference between the agonist and the corresponding antagonist muscle, ∆M(t).

Figure 6 .
Figure 6.A closed-loop BMI system design using the natural proprioceptive feedback information (sensory feedback).

Figure 7 .
Figure 7. Degradation in the performance of BMI in the absence of proprioception.(A,B) show the position trajectory of the agonist muscle i (p i (t)) as a function of time (t) in the presence (solid line) and in the absence (dotted line) of the natural proprioceptive feedback information.(A) shows the position trajectory for the neurophysiological system ("Psycho-Physiological System") while (B) shows the position trajectory for the designed BMI ("Brain-Machine Interface").The desired position target (T i ) for the agonist muscle i in (A,B) is 0.7; (C) shows the average firing activity of a population of agonist area 4 "DVV" (u i (t)), area 4 "OPV" (y i (t)), area 4 "OFPV" (a i (t)) and area 5 "PPV" (x i (t)) neurons in the presence (solid line) and in the absence (dotted line) of the natural proprioceptive feedback information for the designed BMI.

Figure 8 .
Figure 8. Prediction and controller move optimality in MPC.

Figure 9 .
Figure 9. Model Predictive Control (MPC) based closed-loop BMI for Problem 1 and Problem 2.(A) shows the design for "Problem 1".Here the model predictive controller designs the "Artificial Feedback" to stimulate "PPV" neurons such that the system output ("Single Joint Position" trajectory) mimics the "Desired Joint Position" trajectory; (B) shows the design for "Problem 2".Here the model predictive controller designs the "Artificial Feedback" to stimulate "PPV" neurons such that the system output ("PPV Neurons Firing Rate" ) mimics the "Desired PPV Neurons Firing Rate".

Figure 10 .
Figure10.Performance of the designed closed-loop BMI in tracking the desired position trajectory ("Problem 1", Figure9A).(A) shows the position trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the velocity trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the average firing activity of a population of the cortical area 4 "DVV" neurons (u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of the artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red).

Figure 11 .
Figure11.Performance of the designed closed-loop BMI in tracking the desired firing rate trajectory ("Problem 2", Figure9B).(A) shows the average firing activity of the cortical area 4 "DVV" neurons (u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the position trajectory of the agonist muscle i (p i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the velocity trajectory of the agonist muscle i (v i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red).

Figure 12 .
Figure 12.Receding horizon controller based closed-loop BMI design I.Here the receding horizon controller designs the "Artificial Feedback" stimulating current in a charge-balanced biphasic waveform to stimulate "Network of Spiking Neurons" such that the system output ("Single Joint Position" trajectory) mimics the "Desired Joint Position" trajectory.

Figure 14 .
Figure14.Performance of the designed closed-loop BMI in tracking the desired position trajectory (Figure12).(A) shows the position trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the velocity trajectory of the agonist muscle i in the presence of the designed artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the average firing activity of a population of the cortical area 4 "DVV" neurons (u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of the artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (D) Firing trajectory of the agonist and the antagonist population of the spiking network.
u i (t)), "OPV" neurons (y i (t)), "OFPV" neurons (a i (t)), and the cortical area 5 "PPV" neurons (x i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (B) shows the position trajectory of the agonist muscle i (p i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red); (C) shows the velocity trajectory of the agonist muscle i (v i (t)) in the presence of artificial sensory feedback (solid line, blue) and the natural sensory feedback (dotted line, red).