Evaluation of Teaching Signals for Motor Control in the Cerebellum during Real-World Robot Application

Motor learning in the cerebellum is believed to entail plastic changes at synapses between parallel fibers and Purkinje cells, induced by the teaching signal conveyed in the climbing fiber (CF) input. Despite the abundant research on the cerebellum, the nature of this signal is still a matter of debate. Two types of movement error information have been proposed to be plausible teaching signals: sensory error (SE) and motor command error (ME); however, their plausibility has not been tested in the real world. Here, we conducted a comparison of different types of CF teaching signals in real-world engineering applications by using a realistic neuronal network model of the cerebellum. We employed a direct current motor (simple task) and a two-wheeled balancing robot (difficult task). We demonstrate that SE, ME or a linear combination of the two is sufficient to yield comparable performance in a simple task. When the task is more difficult, although SE slightly outperformed ME, these types of error information are all able to adequately control the robot. We categorize granular cells according to their inputs and the error signal revealing that different granule cells are preferably engaged for SE, ME or their combination. Thus, unlike previous theoretical and simulation studies that support either SE or ME, it is demonstrated for the first time in a real-world engineering application that both SE and ME are adequate as the CF teaching signal in a realistic computational cerebellar model, even when the control task is as difficult as stabilizing a two-wheeled balancing robot.


Introduction
Cerebellar research in the last few decades has been accompanied by general consensus about its role in motor control, motor learning and motor coordination, yet abundant questions regarding the fundamental operations handled by this neural structure remain elusive [1]. One of the points of avid debate is the mechanisms of plasticity in the cerebellum [2]. In the original work by Marr-Albus [3,4], it was suggested that the cerebellum behaves as an adaptive filter in which mossy fiber inputs to the cerebellum were analyzed into sparse components at the granular cell layer and then conveyed by parallel fibers to the Purkinje cell. In this model, the location of plasticity corresponds to synapses between the numerous parallel fibers and the Purkinje cell, which change their strength according to a teaching signal conveyed by the climbing fiber (CF) input. Although a large repertoire of cerebellar models has been proposed since the early days of Marr, parallel fiber synapses remained the most commonly suggested location for cerebellar plasticity [5][6][7][8][9][10][11][12]. The debate pivots around the nature of the teaching signal in the CF.

Neuronal Network Model of the Cerebellum
Inspired by the neuronal circuit of the cerebellar cortex, in particular that of flocculus, known to be involved in the horizontal vestibulo-ocular reflex ( Figure 1A), we developed a neural network model of the cerebellum [9,11] (Figure 1B). As the cerebellar cortex has a homogeneous structure all over its volume, its basic learning algorithm is sometimes called a generic algorithm of the cerebellum [20]. We focused on the flocculus and the vestibulo-ocular reflex because they have been widely studied in anatomy and physiology [20]. The model comprises cell types found so far for which their physiological and anatomical properties are well understood [20]. Those are granule cells, Golgi cells, basket/stellate cells and Purkinje cells. We do not differentiate basket and stellate cells in our model. The model receives two types of input carried by mossy fibers and CF as in the real cerebellum. Mossy fiber inputs are postulated to carry desired motion signals, the efference copy of motor commands and sensory error signals (desired trajectory minus actual trajectory) [21][22][23]. These inputs innervate granule cells and Golgi cells. Golgi cells receive excitatory input from granule cells while projecting back to granule cells via inhibitory synapses, forming a negative feedback loop to regulate the activity of granule cells. Axons of granule cells, called parallel fibers, bifurcate and innervate basket cells and Purkinje cells via excitatory synapses. Basket cells project to Purkinje cells via inhibitory synapses, forming a negative feedforward pathway from granule cells to Purkinje cells. Purkinje cells are the sole output cell in the cerebellar cortex sending inhibitory projections down to the vestibular nucleus. This model is available for download at [24] or by accessing the model database of the cerebellar platform of the Japanese Neuroinformatics Node [25]. A proportional and derivative (PD) controller that is a feedback controller widely used in industry and other applications [26] is employed in tandem with the CNN model ( Figure 1B), representing the non-cerebellar pathway in the vestibulo-ocular reflex neuronal circuit in Figure 1A as in previous simulation studies [27,28]. It provides the non-adaptive motor command to the control plant. The PD controller also generates the ME signal in this context since it is a simple gain term that approximates the inverse model of the control plant transforming SE into ME as proposed in Kawato's feedback error learning scheme [6]. The parameters of the PD controller (proportional gain k p and derivative gain k d ) were designed following optimal settings for automatic controllers [29], so that the PD controller alone can stably operate the control plant during a simple desired motion (π sin(2π0.1t)).
The structure of our CNN model and its connectivity with the control plant, the non-cerebellar pathway and the vestibular nucleus remain the same disregarding the content of the error signal in our experiments. However, to be inline with the adaptive filter architecture proposed by Porrill et al. [19], our model would require the modification of the cerebellar structure so that the output of the Purkinje cell is projected to the input of the non-cerebellar pathway. We do not modify our cerebellar structure, maintain the architecture of the model as reported in the flocculus and only change the CF content.

Equations of the Model
The CNN model includes 755 granule cells, 5 Golgi cells, 15 basket cells and 1 Purkinje cell. These numbers preserve the ratio of the cell population and the convergence/divergence ratios of each cell type close to the cerebellar neuronal circuit [28] (shown in Table 1) and a sufficient number of granule cells for adequate control of the plants considered here. For a detailed study on the effects of scaling the number of neurons in the CNN and the output performance, refer to [30]. Each neuron model is described as follows: x PC = y PF W PF-PC + y BA W BA-PC (7) is the activity vector of all granule cells before being processed by the sigmoidal activation function y PF (Equation (2) are the matrix of synaptic weights between mossy fibers-granule cells and Golgi-granule cells, M MF-GC ∈ N N MF ×N GC , M GO-GC ∈ N N GO ×N GC are masking matrices of zeros and ones that embed the convergence/divergence ratio for granule cells referring to Table 1.
follow the same notation for Golgi cells, basket cells and the Purkinje cell, and N MF , N GC , N GO , N BA , N PC are the number of mossy fibers, Golgi, granule, basket and Purkinje cells, respectively; where R + = {x ∈ R : 0 < x ≤ 1} and R − = {x ∈ R : −1 < x ≤ 0} for excitatory and inhibitory synapses, respectively. Firing rate vectors are all R + , but y C is R = {x : −0.5 < x ≤ 0.5}, so that it can be subtracted from the motor command produced by the PD and sent to the plant. In order to standardize the contribution of each mossy fiber input to the cerebellar model, pre-cerebellar gains were set as normalizing scaling constants for each of them. These values were taken from the maximum values in the description of the robot hardware. Table 2 shows the scaling gains for each control plant. Then, the scaled mossy fibers were passed through an activation function (y MF = 1/1 + e −σ(x MF −µ) ) to obtain firing rates compatible with the model (i.e., R + ), where σ and µ are as defined previously.  The learning rule regulated by the activity in the CF input that modifies the weights of parallel fiber-Purkinje cell synapses in the model is computed every sampling time according to the following equations: where ∆W PF-PC is the matrix of changes in the synaptic weights of parallel fiber-Purkinje cell synapses, W PF-PC (t) ∈ R + is the synaptic weight matrix in these synapses at sampling time t and γ is the learning rate. In this algorithm, when the signs of (y PF (t) − 0.5) and CF(t) are the same, the synaptic weights are decreased (long-term depression), whereas when their signs are different, the weights are increased (long-term potentiation). Note that the constant 0.5 operating y PF is required to convert the y PF firing rate from R + → R. Learning rate γ was set to be 0.008, which was chosen by considering a trade off between learning speed and convergence [9,11].

Control Plants
Two control plants, a brushed direct current motor and a two-wheeled balancing robot, are employed. The 2-W brushed motor ( Figure 1C, RC-280SA, Mabuchi Motor, Chiba prefecture, Japan) generates a torque directly from the current supplied. The motor's shaft is interfaced with an encoder circuit (ZMP Inc., Tokyo, Japan) for providing angular position information and a microcontroller board (e-nuvo CPU board, ZMP Inc., Tokyo, Japan) in charge of communication with the implementation computer via a serial interface. The mossy fiber inputs to the CNN model for this control are shown in Table 2. The PD controller for this plant is a position controller with k p = 0.8 and k d = 0.01 as the proportional and derivative constants, respectively. A virtual dynamical model simulation for this motor has been included in the repository of the CNN model as an example [24].
The two-wheeled balancing robot ( Figure 1D, e-nuvo wheel, ZMP Inc., Tokyo, Japan) is an inverted pendulum system that is highly unstable and widely used in control engineering for testing control strategies [26]. It is equipped with a set of sensors, including a motor encoder and a gyroscope, which provide wheel angle ( Figure 1D; φ(t)) and body tilt angle ( Figure 1D; θ(t)), respectively. The robot is also equipped with a serial interface to allow communication with the computer (see its specification below) on which the model was implemented. The motion of the robot is driven by a single motor connected to the wheels of the robot that share the same shaft (single-input multiple-output system). The mossy fiber inputs for this control plant carry the signals described in Table 2. The PD controller in this control object is a parallel configuration of two controllers (body position controller: k p = 5 and k d = 0.5; and wheel position controller: k * p = 0.2 and k * d = 0.05) designed by following optimal settings for automatic controllers [26,29], so that the addition of both outputs (i.e., PD controller output) alone can stably operate the robot during a simple task (φ des. (t) = π sin(2π0.1t), where φ des. (t) is the desired wheel angular position).
The experiments were carried out using LabVIEW 2010 (National Instrument, Austin, TX, USA) running on a Windows computer (4 × 3.33 GHz Intel Core-i7 processor, memory: 16 GB) and communicating with the control plants via serial protocol at 57.6 kbps. The CPU board in both the motor and the two-wheeled balancing robot has a sampling time of t s = 10 ms that is the same time interval as used in the present configuration of the CNN model (for different setups and sampling times, see [31]).

Two Types of Error Information in CF
Sources of SE and ME, shown in Figure 1B, are generated from the non-cerebellar pathway. That is, the SE signal is computed from the difference between the desired and actual movement [20,32]. It carries position and velocity error components in kinematic coordinates (expressed in angle units: rad), whereas ME is computed as the output of the PD feedback controller, which represents error in motor command dynamic coordinates (expressed in electric current units: A) as proposed in the theory of the feedback-error-learning algorithm [6,17]. Additionally, the linear combination of the two types of signals (SE + ME) is also tested, as it has been suggested to be a plausible case in the real cerebellum [33]. The equations of SE and ME for each of the control plants are as follows: where subscript "e" denotes error and corresponds to the difference between the desired and yielded motion (e.g., are scaling values intended to equalize the contribution of each error component to the SE and ME signals so that all have the same importance. These values were calculated with the CNN model disabled (i.e., plant controlled only by the PD controller and desired motion φ des. (t) = π sin(2π0.1t)). ME in Equation (13) remained the same for both control plants. The combination of SE and ME is computed directly from Equations (11)-(13).

Performance Assessment Methodology
The desired motion φ des. (t) for the control plants used to assess the performance of the cerebellar model is a sinusoidal motion. The sinusoidal wave was generated at frequencies ranging from 0.2 Hz-0.4 Hz and an amplitude of π (maximum angular velocity: 7.89 rad/s). The frequencies of the stimulus were chosen below the maximum controllable velocity of the robot (9.82 rad/s). The desired body tilt angle θ des. (t) was set to zero degrees (90 degrees with the ground), so that the robot is commanded to remain vertical while following the desired wheel trajectory. Each stimulus was repeated up to 100 cycles. Considering that the random initialization of synaptic weights in the cerebellar neuronal network can be a source of variability in the performance of the model, five different initial sets of random synaptic weights were created for each set of weighted connections (W PF-PC , W MF-GC , W MF-GO , W GO-GC , W PF-GO , W PF-BA , and W BA-PC ) sampled from a normal distribution in R + and R − with a standard deviation of 1 and a mean value of 0.5 and −0.5 for excitatory and inhibitory synapses, respectively. Special care is required to avoid over inhibition of the Purkinje cell when W BA-PC is large or instability in the Purkinje cell output when the feedback loop formed by W PF-GO and W GO-GC is large. Excluding W PF-PC , all of the synaptic weights remained fixed during the experiments because they are considered the main synaptic plasticity locations of cerebellar motor learning [20,34]. The performance of the cerebellar neuronal network controller was measured as the root mean square error (RSE) of each control variable. In the case of the motor, there was one control variable (the shaft angle φ(t); Figure 1C), whereas there were two control variables for the two-wheeled balancing robot (wheel angle φ(t) and body tilt angle θ(t); Figure 1D).

Simple Control Scenario
First, we tested our CNN model with different CF information contents in the easiest control scenario in our setup, that is the control of the angular position φ(t) of a metal shaft directly connected to a motor with a sinusoidal desired motion (φ des. (t) = π sin(2π0.5t)) (the experimental protocol is described in Materials and Methods). Three types of CF to the cerebellar model were tested separately; namely, CF carrying SE (Equation (11)), ME (Equation (13)) and the combination of the two, i.e., SE + ME. Figure 2 shows the behavioral consequences in control performance under each CF condition (SE, ME, SE + ME) in terms of the RSE of φ(t) (Figure 2A) and yielded shaft motions ( Figure 2B). The performance obtained when the CNN model was disabled, i.e., the motor is controlled only by the PD, is included as "PD" in gray lines (Figure 2A,B). The RSE of φ(t) (Figure 2A) evidences the superior performance with the CNN model disregarding the CF content compared with the PD. The improvement was on average 0.015 rad or 30% of the initial value of 0.05 rad. Thus, using a CF with SE, ME or the combination of both produces similar performance in this simple control scenario. Not surprisingly, the temporal profiles of these CFs also look alike (Appendix Figure A1). Differences between the motion caused in the motor by the CNN model and that by the PD controller alone can be seen in the XY planes shown in Figure 2B. These XY planes are constructed by positioning the desired (X-axis) and yielded (Y-axis) motions in an XY plane rotated −45 degrees. In the ideal case, the desired and yielded motions are equal with a trajectory that lies in an horizontal line at y = 0 ( Figure 2B, dashed lines). Erroneous motions in the clockwise (CW) and counterclockwise (CCW) rotations of the motor shaft are mapped in the planes y > 0 and y < 0, respectively. To show the adaptation in the CNN model, the motions produced at Cycles #2, #50 and #95 are shown. Cycle #2 shows that the CNN model caused the motor to rotate in excess in the CW direction. This excess produced by the initialization conditions and the number of granule cells in the CNN model was corrected by Cycle #5. We have shown that increasing the number of granule cells brings robustness against initial conditions in a bi-hemispheric CNN [35]. XY planes corresponding to Cycles #50 and #95 show that the PD caused the motor to rotate in excess when transitioning from CW to CCW and from CCW to CW. In contrast, the CNN model caused the opposite effect in the shaft motion by reducing the rotation speed just before the transitions. As exemplified here, CF inputs carrying SE, ME or their combination are adequate error signals to drive plasticity at parallel fibers-Purkinje cell synapses in the CNN model and make little difference in the control performance if the control scenario is very simple (a shaft with 1 deg of freedom driven by a motor to follow a 0.5-Hz sinusoidal).

Difficult Control Scenario
After testing the easiest control scenario in our setup, the CF signals seemed to have similar temporal profiles, and thus, the relationship between the error signal used and the control performance attained is yet to be clarified. In this section, we tested our setup in a more difficult control scenario to elucidate this relationship.
The control scenario employs the two-wheeled balancing robot with sinusoidal desired motions for the wheel angle. The frequencies of the sinusoidal motion φ des. (t) range from 0.2 Hz-0.4 Hz (the experimental protocol is described in Materials and Methods). Figure 3, in the same format as Figure 2, summarizes the control performance achieved when the frequency of the desired motion is 0.2 Hz. Similarly to the previous case, controlling the two-wheeled balancing robot with the CNN outperforms the PD controller alone. The wheel positions generated at Cycles #2, #50 and #95 are shown in Figure 3B. These trajectories evidence that the CNN progressively improved the yielded wheel motion so that the robot motion approached the desired motion (φ des. (t)). In contrast, the wheel motion generated by the PD shows considerably larger hysteresis, meaning that the yielded motion differs from the desired motion ( Figure 3B, gray lines). Subtle differences can be seen in the control performance obtained by using SE, ME or their combination. Increasing the frequency of the desired motion from 0.2-0.3 and 0.4 Hz strengthened the differences between SE and ME. Figure 4 shows the temporal profile of the different CFs, the firing rate of the Purkinje cell and the average control performance at the three frequencies of φ des. (t). The temporal profiles of CF carrying SE ( Figure 4A, blue lines), ME (red lines) and SE + ME (green lines) show that at 0.2 Hz, all of the CFs look alike. However, increasing the frequency of φ des. (t) reveals that ME is delayed with respect to the other types of CFs ( Figure 4A, black arrows).  The firing rates of the Purkinje cell and the vestibular nucleus ( Figure 4B) also evidence clear differences when using SE or ME in the CF. At 0.4 Hz, the firing rate of the Purkinje cell caused by ME and SE is out of phase and presents distinctive shapes. The outputs of the vestibular nucleus due to these contributions by the Purkinje cell also differ depending on the CF types ( Figure 4B, black arrows). Figure 4C summarizes the control performance in terms of the average RSE of φ(t) at each one of the frequencies tested at the beginning (Cycle #5) and the end (Cycle #95) of the experiments. This figure shows that the performance with ME deteriorates ( Figure 4C, red bars) as the frequency increases, to such an extent that at 0.3 and 0.4 Hz, there is no improvement over the initial error ( Figure 4C, asterisks). On the contrary, the performance with SE always reduces the initial error and is better than using the PD or the combination SE + ME. Thus, SE is the best error signal in our setup during the control of a highly unstable two-wheeled balancing robot at high frequencies (>0.3 Hz).

Neural Consequences of the Different CF Types
In this section, we analyze the neural consequences of using SE, ME or SE + ME in the CNN model during the control scenario with the two-wheeled balancing robot. We show the adaptation produced at parallel fibers-Purkinje cell synapses, which are the only plastic locus in our model, and the correlation of mossy fibers, granule cells and the CF used. The correlation values are shown to investigate the information in the granule cells that, in combination with the CF input, resulted in the learned parallel fibers-Purkinje cell synaptic weights. Figure 5 shows the 755 parallel fibers-Purkinje cell synaptic weights (W PF-PC ) during one of the experiments with CF carrying SE. W PF-PC with the other CF contents showed similar trends. The gray area in the figure shows the weights decreased by long-term depression from the initial value (black line). Those weights increased by long-term potentiation are shown above the black line. Some representative W PF-PC are presented in red lines. The number of potentiated and depressed weights varied depending on the types of CF. With CF carrying SE, ME and SE + ME, 65%, 71% and 68% of the W PF-PC weights were depressed, respectively. Across all of the experiments, the same trend was observed, namely the number of depressed synapses was larger by using ME than SE. The insets in Figure 5 show the normalized firing rate of the granule cells whose synaptic weights were the most potentiated (labeled as "a") and most depressed (labeled as "b") for each type of CF. The firing rate of potentiated granule cells are negatively correlated with the CF, whereas those depressed are positively correlated with the CF. Figure 6A shows the coefficient of correlation of the CF and the granule cell's activity. The moving average (N = 13) is shown in bold lines. Granule cells have been sorted in the x-axis from the most depressed (x = 0) to the most potentiated (x = 755). The normalized synaptic weight of each sorted granule cell is shown (dashed line). Shadowed area shows those cells whose activity was depressed. The coefficient of correlation when using SE, ME or SE + ME shows that those granule cells potentiated are negatively correlated with the CF, meaning that increasing their activity reduces the error signal, whereas those GCs depressed are positively correlated with the error signal. is the initialization synaptic weight. Insets show the normalized firing rate (NFR) of granule cells whose W PF-PC synapses were predominately potentiated "a" or depressed "b" when climbing fiber (CF) carried sensory error (SE) (blue lines), motor error (ME) (red lines) or the combination of both (green lines).
The correlation of the granule activities with the mossy fiber inputs reveals intrinsic differences when using SE (blue lines) and ME (red lines). Figure 6B shows the coefficient of correlation of the mossy fiber carrying wheel position error (φ e (t)) and the activities of granule cells. In this correlation, those granule cells positively correlated were preferably potentiated when using SE, contrary to when using ME. Figure 6C shows the coefficient of correlation of the mossy fibers carrying wheel desired position (φ des. (t)) and the activities of granule cells. In this correlation, those granule cells negatively correlated were preferably potentiated when using SE, contrary to when using ME. These two correlations evidence intrinsic differences when using SE and ME in our setup. Granule cells with firing activity in-phase with error components (i.e., φ e (t) andφ e (t)) and those out of phase with desired motions (i.e., φ des. (t) andφ des. (t)) were preferably engaged to produce the Purkinje cell activity when using SE. When using ME, the opposite relationship was more prominent among the granule cells potentiated. Other mossy fibers, such as the efference copy, did not show differences between SE and ME. Employing the combination of SE and ME seems to soften the correlations observed with SE and ME to an intermediate point ( Figure 6, green lines); therefore, it is not a surprise that using SE + ME as CF produces intermediate control performance (Figure 4, green lines).

Discussion
Debate about the error information encoded in the CF input to the cerebellum has been going on for several years [2,5,16,36]. Two types of error have been greatly defended: SE and ME. Behavioral and neurophysiological support has been presented for SE in rabbits [13] and monkeys [14] and similarly for ME in rabbits [15] and monkeys [16]. Consensus has been difficult to reach because these behavioral and neurophysiological experiments employ simple movement tasks where the temporal patterns of SE and ME are alike. To the best knowledge of the authors, this study is the first direct comparison of the types of error information encoded in the CF input to a CNN and with the motor performance attained during a real-world engineering application. Our setup allowed us to configure the difficulty of the control task, change the type of error encoded by the CF and evaluated the yielded control performance, thus enabling us to effectively disassociate SE and ME.

Type of Error Signal and the Yielded Robot Control Performance
Our experimental results showed that both SE and ME, despite producing unique behavioral and neural changes in the CNN model, especially at the Purkinje cell output ( Figure 6B), are adequate error signals to govern successfully the plants in a diversity of control tasks (simple and complex task) (Figures 2-6). The practical implications of these results are that under certain experimental setups, investigating the consequences in motor learning when using motor or error information in the CF might yield positive, but inconclusive results. Thus, the proper selection of control plants and stimuli that allow the dissociation of motor and control error contents is imperative. This is the first direct demonstration suggesting that both SE and ME can be an adequate error signal to teach a realistic cerebellar computational model in real-world engineering applications. These results are in agreement with biological evidence from the horizontal vestibulo-ocular reflex (VOR) and optokinetic reflex (OKR) systems, a feed-forward and a feedback system sharing the same controller (i.e., cerebellar microcomplex) and the same control object (i.e., eye plant) with CF carrying SE and ME, respectively [2]. However, further evaluation using the two-wheeled balancing robot with sinusoidal desired motion at different frequencies revealed quantitative differences in control performance caused by SE and ME ( Figure 6). Namely, SE yielded better control performance than ME ( Figure 6C). This might be trivial because we evaluated the control performance in terms of the RSE of SE (i.e., RSE of φ(t)), and the CNN model was trained to reduce SE. Since the ultimate goal of a control task is to reduce the error between the desired motion and the yielded motion (i.e., reduce SE) in biological systems and engineering applications, we employed the RSE of SE as a measure of the goodness of the control performance. In addition, SE signals are observable. In the case of ME, the desired motor commands are not observable, and therefore, the performance cannot be computed directly in terms of ME.
The correlation of the firing rate activity of granule cells, CF and mossy fibers also revealed intrinsic differences caused by using SE, ME or their combination ( Figure 6). In particular, granule cell activity preferably potentiated by LTP at parallel fibers-Purkinje cell synapses with SE and ME differed in the correlation with the mossy fiber inputs. This means that to produce the Purkinje cell activities shown in Figure 4 with SE and ME, granule cells with different mossy fiber inputs were engaged, thus the differences in control performance. In particular, granule cells carrying error information were preferentially potentiated by sensory error, and granule cells carrying desired motions were preferentially potentiated by motor error. In our setup, the relation between granule cells activity and the CF content holds true for the two control plants used; however, extrapolation to other systems is not evident since changing the number of mossy fiber inputs and the size of the population of granule cells would result in a different mapping of inputs to parallel fibers, which in turn also affects the information that reaches the Purkinje cell. Clarifying this relation requires further study of the mossy fiber inputs to each granule cell and their correlation with the CF error content. For a study in this direction with a bi-hemispherical version of the CNN, refer to [31].
We have employed a PD controller as the non-cerebellar pathway that produces ME not only because it is one of the most flexible, effective and popular feedback controllers [26,29], but because it has been proposed as a simple approximation to the inverse model of the controlled plant in the feedback-error-learning theory [18,37,38]. The PD may be actually physiologically plausible [19] since its computation requires sensory information and its first derivative (e.g., eye velocity and eye acceleration), which are available at brain stem [39] and the cerebellum [40]. However, as the complexity of the plant increases, this approximation is deficient, as we have shown in the experiments with the two-wheel balancing robot ( Figure 6). We observed that increasing the frequency of the desired motion (i.e., increasing the complexity of the control task) produced a delayed ME signal, and performance weakened ( Figure 6A). This is produced by the frequency characteristics of the PD controller and the fact that it is not a close approximation to the inverse model of the two-wheeled balancing robot. These results confirm a major drawback of using ME as the teaching signal in a realistic cerebellar model as proposed in the feedback-error-learning theory since the complexity of the reference structures transforming SE into ME must increase with the complexity of the control plant, which could become troublesome in highly elaborated control scenarios.
Our results also showed that a linear combination of SE and ME encoded in the CF is an adequate error signal in our CNN model to control the plants successfully. The performance attained was always better than when the CNN model was disabled (Figures 3-5, magenta lines). These results are in-line with physiological evidence in monkeys during arm reaching experiments that has suggested that CF encodes both SE and ME at different times of the arm movement [33], and thus, CF carries the error signal required for real-time control and learning of movements.

Generalization and Limitation of the Current Results
The control performance with the CNN model was stable and consistent while considering a basic control plant (DC motor) and one of the most challenging plants in control engineering (the two-wheeled balancing robot) [26], combined with simple and complex stimulus (low and high frequency sinusoidal desired motion) within the operation boundaries of our setup (real-time operation at a sampling frequency of 10 ms). The characteristics of the control scenarios and our setup made it difficult to reproduce the delay of 100 ms (10× the sampling frequency) shown in the visual inputs in the real cerebellum [34], which correspond to φ(t) and θ(t) in our model. However, under different circumstances, we believe our CNN model would be able to learn and produce adequate motor commands. These control plants have one and two degrees of freedom, respectively. Therefore, we may not be able to simply extend the current results to those with higher numbers of degrees of freedom. There is a series of evidence in cerebellar control of eye movement showing that different zones in cerebellar flocculus control different single axes of rotation of the eye (i.e., yaw and pitch rotations) [41]. Thus, our current setup to control objects is inline with this view. To apply the model to controlling other objects with higher numbers of degrees of freedom, we may employ the same number of cerebellar models for controlling each axis of motion [35]. Evaluating the nature of error signals suitable for this situation is in our scope of future studies.
In the current configuration of our CNN model, we have only considered plasticity at parallel fibers-Purkinje cell synapses, which are regarded as the loci of fast adaptation in the cerebellum [42], as a minimum requirement for the motor learning algorithm [2,20]. Multiple sites of plasticity in the cerebellum have been reported, and their involvement in motor learning has been argued [10,43,44]. Therefore, evaluating the effects in motor performance during robot control when the CNN model includes multiple sites of plasticity remains a future study of our CNN model.

Comparison with Other Cerebellar Models
Computational models of the cerebellum and their successful interpolation into engineering applications have extensively been reported. In general, these models include a form of synaptic plasticity at parallel fibers-Purkinje cell synapses driven by the CF input as presented in our CNN model, and that corresponds to the basis for cerebellar motor learning. Examples of CF encoding SE include models used for simulation of the eye plant [45], control of pneumatic muscles [46] and control of robotic arms [10,47]. Similarly, examples of cerebellar models with ME in their CF input include robotic arms [18] and inverted pendulum systems [11,38]. We have demonstrated that the CF input in our CNN model can be configured to include SE or ME information for driving plasticity at parallel fibers-Purkinje cell synapses and control the robot plant successfully. To the best knowledge of the authors, this study is the first direct comparison of the type of teaching signal encoded in the CF input to a cerebellar model and with the motor performance attained during a real-world engineering application. In contrast to spiking models of the cerebellum [7,12,48,49], due to the level of abstraction in our CNN model (i.e., firing rate neuron models), spike patterns and temporal or spatial effects were impossible to evaluate. This would require the construction of a cerebellar network with spiking neuronal models that could endanger the real-time real-world application in control engineering and a network with realistic physical properties [28]. Nonetheless, large-scale cerebellum-like spiking model has also been demonstrated to run in real time [12], and its application into robotics has been suggested. We have also extended our CNN model to a bi-hemispherical configuration (biCNN) that presents additional benefits and reproduces a form of cerebellar asymmetrical motor adaptation using SE information in the CF input [30].

Conclusions
The current study demonstrated for the first time in a real-world engineering application that both sensory error (SE) and motor error (ME) are adequate as the climbing fiber teaching signal in a realistic computational cerebellar model not only when the control task is simple, but also when the control task is as difficult as stabilizing a two-wheeled balancing robot.