Neuromorphic Robotic Platform with Visual Input, Processor and Actuator, Based on Spiking Neural Networks

This paper describes the design and modus of operation of a neuromorphic robotic platform based on SpiNNaker, and its implementation on the goalkeeper task. The robotic system utilises an address event representation (AER) type of camera (dynamic vision sensor (DVS)) to capture features of a moving ball, and a servo motor to position the goalkeeper to intercept the incoming ball. At the backbone of the system is a microcontroller (Arduino Due) which facilitates communication and control between different robot parts. A spiking neuronal network (SNN), which is running on SpiNNaker, predicts the location of arrival of the moving ball and decides where to place the goalkeeper. In our setup, the maximum data transmission speed of the closed-loop system is approximately 3000 packets per second for both uplink and downlink, and the robot can intercept balls whose speed is up to 1 m/s starting from the distance of about 0.8 m. The interception accuracy is up to 85%, the response latency is 6.5 ms and the maximum power consumption is 7.15 W. This is better than previous implementations based on PC. Here, a simplified version of an SNN has been developed for the ‘interception of a moving object’ task, for the purpose of demonstrating the platform, however a generalised SNN for this problem is a nontrivial problem. A demo video of the robot goalie is available on YouTube.


Introduction
Neuromorphic engineering is inspired by the human neural system to address abstraction and automate human-type activities [1]. Spiking neuronal networks (SNNs) can be implemented in neuromorphic systems and together they are a part of the third generation of artificial intelligence [2,3]. The information in SNNs is encoded by the signal itself and its timing, and the signal only typically has two states: logical '1 and logical '0 . This encoding works towards reduced computational complexity in comparison to conventional deep neural networks which deal with real values. This type of encoding is also in coherence with address event representation (AER) which is often used in sensory neuromorphic systems [4].
The inherent sparseness of SNNs make them suitable for event-based image processing. Conventional cameras capture 25 or more still frames per second to present motion. Each frame in this representation is independent and normally the identical background is repeatedly recorded, which increases computational complexity and generates excessive and often useless data. However, pixels in event-driven cameras only generate data when they detect changes that are above a pre-defined threshold. This presentation enables neural networks to process the information it needs, Deep neuronal networks applied in low-latency conventional vision-based robotics require high computational power and are dependent on a powerful, graphics processing unit (GPU)-driven PC. For a camera with say 50 frames per second, the response latency is not less than 20 ms, which is far from enough to support a fast-reacting robot. If the frame rate is increased to 5 kHz, the corresponding data rate will be 234.375 MB/s for a resolution of 128 × 128 pixels. An alternative is to use an AER type camera device [22]. A robotic goalkeeper based on the DVS128, designed by Delbruck and Lang [23], achieves 3 ms reaction time. However, this system is using a PC to process data, which is reducing its mobility and power efficiency, and it cannot capture rich non-linear structures of visual input since neuronal networks are not applied in the system.
In order to utilise SNN for motion processing and implement the interface between SpiNNaker and external devices, a neuromorphic robotic platform consisting of a DVS, SpiNNaker and a servo motor has been designed. It is a ready-to-use neuromorphic platform for running any SNN to process motion in real-time, for example in the field of neuroprosthetics such as motor neuroprothetics, which allows people to bypass the spine and directly control the prosthesis, which is vital for patients with spinal cord injury. In order to evaluate the performance of its closed-loop execution and also demonstrate the benefit of using SNN in such applications, the platform is implemented on a robot goalkeeper. A similar problem of trajectory prediction for robotics, which utilises an event-based camera and a neural network trained to predict the ball's position has been also investigated by Monforte et al. [24].
In this paper, the architecture of the system and the principle of hardware communication are described and demonstrated. An evaluation based on efficiency and accuracy is also presented and discussed. The interface developed in this paper uses a single microcontroller to establish low-latency communication between DVS, SpiNNaker and a servo motor. It is the first neuromorphic interface based on SpiNNaker that supports a servo to precisely control the position of the robotic arm. It does not require a second chip to convert the communication protocol between microcontroller and SpiNNaker, which reduces power consumption. Furthermore, an Arduino microcontroller is used. It has a number of built-in libraries for external devices, which means more sensors and actuators supported by the Arduino platform can be added easily into the platform. The components that comprise the system are off-the-shelf, and the software implemented only uses the standard libraries. This neuromorphic system offers a platform for many other applications at low-cost and simple for further development.

Materials and Methods
In the designed system, the microcontroller performs communication protocol conversion between DVS, SpiNNaker and the servomotor. The data formats are different for the three components, and are referred to as follows: • Events: visual information captured by DVS and represented by AER protocol (between microcontroller and DVS). • Packets: spikes injected to and received from SpiNNaker (between microcontroller and DVS).

•
Commands: data used to control the servo (between microcontroller and servo).
As shown in Figure 1, the microcontroller receives events from DVS, the events are then converted to packets and injected into SpiNNaker. The microcontroller receives packets (spikes) back from the SNN running on SpiNNaker. Finally, it generates servomotor control commands based on the received spikes.
The neuromorphic processor used is SpiNNaker SpiNN-3 hosted on a development board. It is a multi-chip platform, where each chip has 18 identical processing cores and six bidirectional inter-chip communication links. The SpiNN-3 has 4 SpiNNaker chips and can simulate a SNN with up to 6400 neurons. These chips share a 128 MB SDRAM, which gives over 1 GB/s sustained block transfer rate [9]. The router can route the spikes from a neuron to many connected neurons. SpiNNaker is designed to optimise the simulations of SNNs, and implements some common neuron models such as leaky integrate and fire model. designed to optimise the simulations of SNNs, and implements some common neuron models such as leaky integrate and fire model. The DVS used in our platform is DVS128 from iniLabs, Zurich, which has a resolution of 128 × 128. The DVS camera has low latency down to 15 µs and high dynamic range of up to 120 dB and can work at low power at 20 mW [5,22]. Figure 2 shows a DVS recording when a ping-pong ball is moving. The outline of the ball will produce spikes, but also some random noise is generated, which can be decreased by increasing the threshold of events.

From DVS to SpiNNaker
DVS only sends the addresses of pixels whose temporal change of intensity is higher than a preset threshold [5]. Therefore, the microcontroller receives the horizontal (x) and vertical (y) coordinates of events. The resolution of DVS is 128 × 128, which requires 7 bits to encode the x and y coordinates of the event address. For example, in Table 1, the packet corresponding to the event at pixel at the position (68,98) is encoded as:   The DVS used in our platform is DVS128 from iniLabs, Zurich, which has a resolution of 128 × 128. The DVS camera has low latency down to 15 µs and high dynamic range of up to 120 dB and can work at low power at 20 mW [5,22]. Figure 2 shows a DVS recording when a ping-pong ball is moving. The outline of the ball will produce spikes, but also some random noise is generated, which can be decreased by increasing the threshold of events.
designed to optimise the simulations of SNNs, and implements some common neuron models such as leaky integrate and fire model. The DVS used in our platform is DVS128 from iniLabs, Zurich, which has a resolution of 128 × 128. The DVS camera has low latency down to 15 µs and high dynamic range of up to 120 dB and can work at low power at 20 mW [5,22]. Figure 2 shows a DVS recording when a ping-pong ball is moving. The outline of the ball will produce spikes, but also some random noise is generated, which can be decreased by increasing the threshold of events.

From DVS to SpiNNaker
DVS only sends the addresses of pixels whose temporal change of intensity is higher than a preset threshold [5]. Therefore, the microcontroller receives the horizontal (x) and vertical (y) coordinates of events. The resolution of DVS is 128 × 128, which requires 7 bits to encode the x and y coordinates of the event address. For example, in Table 1, the packet corresponding to the event at pixel at the position (68,98) is encoded as:

From DVS to SpiNNaker
DVS only sends the addresses of pixels whose temporal change of intensity is higher than a preset threshold [5]. Therefore, the microcontroller receives the horizontal (x) and vertical (y) coordinates of events. The resolution of DVS is 128 × 128, which requires 7 bits to encode the x and y coordinates of the event address. For example, in Table 1, the packet corresponding to the event at pixel at the position (68,98) is encoded as: The first bit of header is the parity bit, it is set to ensure the whole packet has odd parity. The rest of header bits and the two unused bits in the packet data are set to 0.
For SpiNNaker, the spikes are encoded as packets, in the following form ( Table 2): A packet contains: Header to indicate its specification, Packet Data which stores information about the spikes, and an optional PayLoad to carry more information, such as membrane voltage [8]. The packet format used in the designed system is 40-bit multi-cast packet without payload. Furthermore, SpiNNaker regards external devices as virtual chips on its system [25]. Thus, a virtual routing key should be defined as the delivery address of the injected spikes. In the designed system, the virtual routing key consists of four Hex numbers ('0 × 1 , '0 × 2 , '0 × 3 and '0 × 4 ).
The decimal to binary conversion of SpiNNaker packets starts from the least significant bit, and the order of virtual routing key is reversed. The reason is that SpiNNaker reads packets from the end.
The events received by the microcontroller can be sub-sampled or preprocessed to increase the processing speed and filter random noise. The resolution reduction is also important for reducing the size of the neural network needed to handle the DVS events. As shown in Figure 3, an average mask of 8 × 8 has been applied to the received events and a new threshold is set to the averaged pixel value to define whether the 'superpixel' is generating an event or not.
Appl. Syst. Innov. 2020, 3, x FOR PEER REVIEW 5 of 16 The first bit of header is the parity bit, it is set to ensure the whole packet has odd parity. The rest of header bits and the two unused bits in the packet data are set to 0.
For SpiNNaker, the spikes are encoded as packets, in the following form (Table 2): A packet contains: Header to indicate its specification, Packet Data which stores information about the spikes, and an optional PayLoad to carry more information, such as membrane voltage [8]. The packet format used in the designed system is 40-bit multi-cast packet without payload. Furthermore, SpiNNaker regards external devices as virtual chips on its system [25]. Thus, a virtual routing key should be defined as the delivery address of the injected spikes. In the designed system, the virtual routing key consists of four Hex numbers ('0 × 1′, '0 × 2′, '0 × 3′ and '0 × 4′).
The decimal to binary conversion of SpiNNaker packets starts from the least significant bit, and the order of virtual routing key is reversed. The reason is that SpiNNaker reads packets from the end.
The events received by the microcontroller can be sub-sampled or preprocessed to increase the processing speed and filter random noise. The resolution reduction is also important for reducing the size of the neural network needed to handle the DVS events. As shown in Figure 3, an average mask of 8 × 8 has been applied to the received events and a new threshold is set to the averaged pixel value to define whether the 'superpixel' is generating an event or not. . DVS data are preprocessed, and overall resolution reduced from 128 × 128 to 16 × 16, where each 'superpixel' represents a block of 64 (8 × 8) original pixels. The decision whether there is spike at a 'superpixel' is based on having a set number of spikes in the pixels that it covers within a specific time window. If the first layer of the SNN has one neuron for each pixel, then required number of neurons is reduced from 16,384 to 256-some loss of resolution but a significant advantage for the neural network training phase.
Through the resolution reduction, only 4 bits are needed to encode x and y of the events, respectively. For instance, the packet corresponding to the event at superpixel (3,15) is shown in Table 3.  . DVS data are preprocessed, and overall resolution reduced from 128 × 128 to 16 × 16, where each 'superpixel' represents a block of 64 (8 × 8) original pixels. The decision whether there is spike at a 'superpixel' is based on having a set number of spikes in the pixels that it covers within a specific time window. If the first layer of the SNN has one neuron for each pixel, then required number of neurons is reduced from 16,384 to 256-some loss of resolution but a significant advantage for the neural network training phase.
Through the resolution reduction, only 4 bits are needed to encode x and y of the events, respectively. For instance, the packet corresponding to the event at superpixel (3,15) is shown in Table 3. After receiving the raw pixel data from the DVS, the microcontroller is converting them first into superpixel events, and then into 40-bit packets appropriate for the SpiNNaker system. The microcontroller will send the packets into SpiNNaker by using 2-of-7 coding and 2-phase handshaking protocol. Each packet is sent as 10 symbols followed by an 'end-of-packet' (EOP) symbol. Except for the EOP symbol, the 10 symbols are selected from 16 Hex values, and each Hex value is 4-bit.
Both SpiNNaker input and output data buses are 7 bits. To send a symbol, the microcontroller only changes the state of two wires and keeps logical levels of the rest five wires unchanged, i.e., uses the 2-of-7 coding method [26]. In Table 4, '1' is represented by change of the state on the that line and '0' is represented with no change of the state. In other words, logical-1 is physically represented by a change of the voltage level and not by a specific value for the digital signal. Similarly, logical-0 is represented by no-change of the physical voltage on the wire, not by a specific value. After sending a symbol, the voltage levels of the 7 wires will not return to the initial state [27]. This mechanism increases the transmission speed and reduces power consumption. Table 4. 2-of-7 coding: converting a symbol into 2 changing bits. '1 and '0 do not represent physical voltage levels corresponding to logical-1 and logical-0, but symbolise change of state ('1 ) or no change of state ('0 ). If less confusing, the symbol 'C' could be used for change and 'N' for no-change, to encode '1 and '0 . After the microcontroller changed the logical levels of two wires, SpiNNaker will send back an acknowledge (ACK) signal to indicate the symbol (data) has been received. For the microcontroller, it will send the next symbol, once it receives the ACK signal from SpiNNaker. Through this 2-phase handshaking protocol, packets are sent serially.

From SpiNNaker to Servomotor
After injecting spikes into SpiNNaker, the next hardware communication is to receive spikes back from SpiNNaker i.e., the output layer of the SNN that it runs, and control the servo on the bases of the received spikes. The coding method and communication protocol is the same as for injecting spikes into the SNN, and the IDs of spiking neurons in output population are also sent in 40-bit packets. In our case, the single-axis robotic arm of the neuromorphic system is set to have eight different movable positions. Thus, the output layer of the SNN has eight neurons that correspond to eight positions. What the microcontroller receive is the spiking neuron ID. It is possible to set any other number of the goalkeeper positions using this digital motor.
The rotation range of the servomotor is 120 • , from −60 • to 60 • . It is divided into eight equal steps, with the step angle of 15 • , and each postion corresponds to a neuron ID. The speed of the servomotor is 120 • /150 ms = 0.8 • /ms (for the weight of our 'goalkeeper'). Thus, servo commands are executed at most every 150 ms. The position of the robotic arm is precisely controlled by using pulse width modulation (PWM) [28]. The range of pulse widths is from 1 ms to 2 ms, which is also divided into eight equal parts. The method of controlling the servo is described in the algorithm (Figure 4) below. The position of the robotic arm is changed after minimum 20 neuron IDs. The microcontroller keeps storing the received neuron IDs into a buffer until the number of IDs is 20. Next, the number of the most frequent ID in the buffer is found. If the number is equal to or greater than 10, a servo command corresponding to the ID will be generated. Finally, the command will be executed if the time difference from the latest execution is equal to or greater than 150 ms.

Implementation
The microcontroller must handle four tasks in parallel: • receiving and processing events from DVS, • encoding and injecting spikes to SpiNNaker, • receiving the output spikes from SpiNNaker, and • control the servomotor and the goalkeeper position.
Since the corresponding commands are processed virtually in parallel, it is important to synchronise all of them. The transmission speed of the AER events is not fixed, but it is proportional to the movement captured by DVS. The resulting speed of injecting spikes into SpiNNaker is also not fixed. Furthermore, the output spikes from SNNs are sparse and not at a fixed rate, which causes a varying speed of generating commands.

Synchronization Mechanism
To ensure the SNN receives the latest events and the command to the servo is generated based on the latest output spikes, a synchronisation mechanism is proposed, as shown in Figure 5. This mechanism sets a limit for the maximum transmission speed of injecting spikes. For example, this limit is set to 2000 packets per second. As shown in Figure 5, the microcontroller keeps receiving events but only stores one event into the buffer every 500 µs. For injecting spikes, the period is also 500 µs and the microcontroller will reset the buffer if the time difference between current stored event and sent spike is greater than 1000 µs. For servo command, it will be overwritten and not be executed until the time difference from last executed command is equal to or greater than 150 ms. Through this mechanism, events, spikes and commands are synchronised to reduce response latency.

Implementation
The microcontroller must handle four tasks in parallel: • receiving and processing events from DVS, • encoding and injecting spikes to SpiNNaker, • receiving the output spikes from SpiNNaker, and • control the servomotor and the goalkeeper position.
Since the corresponding commands are processed virtually in parallel, it is important to synchronise all of them. The transmission speed of the AER events is not fixed, but it is proportional to the movement captured by DVS. The resulting speed of injecting spikes into SpiNNaker is also not fixed. Furthermore, the output spikes from SNNs are sparse and not at a fixed rate, which causes a varying speed of generating commands.

Synchronization Mechanism
To ensure the SNN receives the latest events and the command to the servo is generated based on the latest output spikes, a synchronisation mechanism is proposed, as shown in Figure 5. This mechanism sets a limit for the maximum transmission speed of injecting spikes. For example, this limit is set to 2000 packets per second. As shown in Figure 5, the microcontroller keeps receiving events but only stores one event into the buffer every 500 µs. For injecting spikes, the period is also 500 µs and the microcontroller will reset the buffer if the time difference between current stored event and sent spike is greater than 1000 µs. For servo command, it will be overwritten and not be executed until the time difference from last executed command is equal to or greater than 150 ms. Through this mechanism, events, spikes and commands are synchronised to reduce response latency.
The buffering process for events enables the system to adapt to changing the speed of receiving events but also brings some limitations. Compared to injecting events directly, the microcontroller spends computation power on writing, reading and resetting the buffer. Thus, the maximum transmission speed is further limited. Furthermore, information will be partially dropped out if the speed of receiving events is higher than the pre-defined transmission speed. This will reduce the accuracy of the system. The buffering process for events enables the system to adapt to changing the speed of receiving events but also brings some limitations. Compared to injecting events directly, the microcontroller spends computation power on writing, reading and resetting the buffer. Thus, the maximum transmission speed is further limited. Furthermore, information will be partially dropped out if the speed of receiving events is higher than the pre-defined transmission speed. This will reduce the accuracy of the system.
The setup of the neuromorphic goalkeeper is shown in Figure 6a. The width of the baffle is approximately 4 cm and the movable range of the robot arm is 30 cm. Figure 6b shows a photo of the connections between the different modules.

Neural Network Model
For the target application described in this paper, a relatively simple SNN has been developed to run on SpiNNaker, since the aim is to demonstrate our hardware platform. Development of a general SNN for the task of predicting the end position of a ball based on its initial motion (the 'goalkeeper' task) is a non-trivial problem and it is beyond the scope of this paper. A generalised methodology for developing an SNN to run on a DVS-SpiNNaker platform, using either supervised or unsupervised training algorithm, is described in Appendix A. Here the developed SNN is described to demonstrate our platform, and the model is shown and explained in Figure 7a.
The network has three layers as shown in Figure 7b. The first (input) layer of the SNN is connected to processed DVS output, which has a resolution of 16 × 16 after downsampling (explained The setup of the neuromorphic goalkeeper is shown in Figure 6a. The width of the baffle is approximately 4 cm and the movable range of the robot arm is 30 cm. Figure 6b shows a photo of the connections between the different modules. The buffering process for events enables the system to adapt to changing the speed of receiving events but also brings some limitations. Compared to injecting events directly, the microcontroller spends computation power on writing, reading and resetting the buffer. Thus, the maximum transmission speed is further limited. Furthermore, information will be partially dropped out if the speed of receiving events is higher than the pre-defined transmission speed. This will reduce the accuracy of the system.
The setup of the neuromorphic goalkeeper is shown in Figure 6a. The width of the baffle is approximately 4 cm and the movable range of the robot arm is 30 cm. Figure 6b shows a photo of the connections between the different modules.

Neural Network Model
For the target application described in this paper, a relatively simple SNN has been developed to run on SpiNNaker, since the aim is to demonstrate our hardware platform. Development of a general SNN for the task of predicting the end position of a ball based on its initial motion (the 'goalkeeper' task) is a non-trivial problem and it is beyond the scope of this paper. A generalised methodology for developing an SNN to run on a DVS-SpiNNaker platform, using either supervised or unsupervised training algorithm, is described in Appendix A. Here the developed SNN is described to demonstrate our platform, and the model is shown and explained in Figure 7a.
The network has three layers as shown in Figure 7b. The first (input) layer of the SNN is connected to processed DVS output, which has a resolution of 16 × 16 after downsampling (explained

Neural Network Model
For the target application described in this paper, a relatively simple SNN has been developed to run on SpiNNaker, since the aim is to demonstrate our hardware platform. Development of a general SNN for the task of predicting the end position of a ball based on its initial motion (the 'goalkeeper' task) is a non-trivial problem and it is beyond the scope of this paper. A generalised methodology for developing an SNN to run on a DVS-SpiNNaker platform, using either supervised or unsupervised training algorithm, is described in Appendix A. Here the developed SNN is described to demonstrate our platform, and the model is shown and explained in Figure 7a.
The network has three layers as shown in Figure 7b. The first (input) layer of the SNN is connected to processed DVS output, which has a resolution of 16 × 16 after downsampling (explained in Figure 3). The second (hidden) layer has eight neurons, and each neuron is connected to two columns of the DVS layer, and hence has 32 input synapses. The third (output) layer also has eight neurons, which correspond to eight possible positions of the goalkeeper. The neurons in the layers two and three are connected using one-to-one connectivity. The total number of synapses in the SNN is 32 × 8 + 8 = 264. The SNN has been developed using PyNN [29] with sPyNNeker extension [30] and run on the SpiNNaker. The neuron model used in the SNN is standard leaky integrate-and-fire (LIF).
Appl. Syst. Innov. 2020, 3, x FOR PEER REVIEW 9 of 16 in Figure 3). The second (hidden) layer has eight neurons, and each neuron is connected to two columns of the DVS layer, and hence has 32 input synapses. The third (output) layer also has eight neurons, which correspond to eight possible positions of the goalkeeper. The neurons in the layers two and three are connected using one-to-one connectivity. The total number of synapses in the SNN is 32 × 8 + 8 = 264. The SNN has been developed using PyNN [29] with sPyNNeker extension [30] and run on the SpiNNaker. The neuron model used in the SNN is standard leaky integrate-and-fire (LIF).
(a) (b) For the input real-time captured events form DVS has been used, which was recording a ball rolling (pushed by a hand towards the goal from the opposite side of the Table). The weights of all synapses were adjusted manually, without using any training algorithms, based on the model explained in Figure 7.
As shown in Figure 7b, lateral inhibition is not implemented in the output layer. This means the output neurons can spike simultaneously, and in that case the resulting output could not guide the robot arm to a unique location. Therefore, a voting mechanism (Figure 4) has been introduced. It is setting a threshold of 10 spikes as a criterion for deciding which neuron will determine the goalkeeper′s position. Additionally, SpiNNaker will not send any packets unless a neuron in the output population is spiking. Furthermore, this threshold increases the robustness of the network to random spikes in the input layer representing the DVS pixels.

The Workflow
The workflow for the proposed method is shown in Figure 8. There are five main steps in the closed-loop execution: 1. DVS captures the events related to a moving object; 2. The microcontroller receives the events in AER format, converts them to SpiNNaker packets and sends to SpiNNaker; 3. SpiNNaker receives packets and injects spikes into the SNN, and after processing by SNN sends the outputs back to the microcontroller; 4. The microcontroller receives the IDs of output spikes and generates the servo command by using the algorithm (Figure 4); 5. The position of the robotic arm is adjusted by changing the pulse width of control signal received by the digital servomotor. For the input real-time captured events form DVS has been used, which was recording a ball rolling (pushed by a hand towards the goal from the opposite side of the table). The weights of all synapses were adjusted manually, without using any training algorithms, based on the model explained in Figure 7.
As shown in Figure 7b, lateral inhibition is not implemented in the output layer. This means the output neurons can spike simultaneously, and in that case the resulting output could not guide the robot arm to a unique location. Therefore, a voting mechanism (Figure 4) has been introduced. It is setting a threshold of 10 spikes as a criterion for deciding which neuron will determine the goalkeeper s position. Additionally, SpiNNaker will not send any packets unless a neuron in the output population is spiking. Furthermore, this threshold increases the robustness of the network to random spikes in the input layer representing the DVS pixels.

The Workflow
The workflow for the proposed method is shown in Figure 8. There are five main steps in the closed-loop execution: 1.
DVS captures the events related to a moving object; 2.
The microcontroller receives the events in AER format, converts them to SpiNNaker packets and sends to SpiNNaker; 3.
SpiNNaker receives packets and injects spikes into the SNN, and after processing by SNN sends the outputs back to the microcontroller; 4.
The microcontroller receives the IDs of output spikes and generates the servo command by using the algorithm (Figure 4); 5.
The position of the robotic arm is adjusted by changing the pulse width of control signal received by the digital servomotor.
Appl. Syst. Innov. 2020, 3, x FOR PEER REVIEW 10 of 16 Figure 8. Workflow of the robotic system used for interception of a moving object.

Results
The SNN has been coded in PyNN and downloaded to SpiNNaker. After the network was downloaded, the SpiNNaker board (i.e., the ethernet link) has been disconnected from the laptop, to demonstrate PC-independent operation. Once the simulation starts, the SNN will run for arbitrary time (typically 60 s in our experiments) with a time step of 1 millisecond. The input layer receives the injected spikes from the microcontroller, and the output layer has eight neurons corresponding to the eight positions. The network will not send any packets unless a neuron in the output population is spiking. A demo video of the robot goalie action can be found at the following link: https://www.youtube.com/watch?v=135fH21QXtg. The neuromorphic goalkeeper achieves an accuracy of up to 85% on the condition that the speed of the ball is up to 1 m/s and the initial distance more than 80 cm.
For the hardware communication, the logical levels of the ACK signals of injecting spikes (uplink) and receiving spikes (downlink) are monitored to measure the transmission speed, Figure 9. As can be observed both ACK signals of uplink and downlink have changed 11 times to send a packet, which indicates that data is in the form of 40-bit packets followed by an EOP symbol. According to the reading of the oscilloscope, the frequencies of uplink and downlink ACK signals are 16.951 kHz and 16.970 kHz, respectively. Two symbols are sent in one period, since the logical levels of ACK signals change twice every period. Therefore, the uplink speed is: The microcontroller receives events in parallel but injects and receives spikes in serial. Thus, the speed of receiving events is higher than the speed of injecting and receiving spikes. Since each process are parallelly executed and the speed of injecting spikes is higher than the speed of receiving spikes, the response latency is dominated by the time cost to inject spikes. Furthermore, the servo command is generated based on 20 spikes. Thus, the response latency is: Figure 8. Workflow of the robotic system used for interception of a moving object.

Results
The SNN has been coded in PyNN and downloaded to SpiNNaker. After the network was downloaded, the SpiNNaker board (i.e., the ethernet link) has been disconnected from the laptop, to demonstrate PC-independent operation. Once the simulation starts, the SNN will run for arbitrary time (typically 60 s in our experiments) with a time step of 1 millisecond. The input layer receives the injected spikes from the microcontroller, and the output layer has eight neurons corresponding to the eight positions. The network will not send any packets unless a neuron in the output population is spiking. A demo video of the robot goalie action can be found at the following link: https://www.youtube.com/watch?v=135fH21QXtg. The neuromorphic goalkeeper achieves an accuracy of up to 85% on the condition that the speed of the ball is up to 1 m/s and the initial distance more than 80 cm.
For the hardware communication, the logical levels of the ACK signals of injecting spikes (uplink) and receiving spikes (downlink) are monitored to measure the transmission speed, Figure 9. As can be observed both ACK signals of uplink and downlink have changed 11 times to send a packet, which indicates that data is in the form of 40-bit packets followed by an EOP symbol.
Appl. Syst. Innov. 2020, 3, x FOR PEER REVIEW 10 of 16 Figure 8. Workflow of the robotic system used for interception of a moving object.

Results
The SNN has been coded in PyNN and downloaded to SpiNNaker. After the network was downloaded, the SpiNNaker board (i.e., the ethernet link) has been disconnected from the laptop, to demonstrate PC-independent operation. Once the simulation starts, the SNN will run for arbitrary time (typically 60 s in our experiments) with a time step of 1 millisecond. The input layer receives the injected spikes from the microcontroller, and the output layer has eight neurons corresponding to the eight positions. The network will not send any packets unless a neuron in the output population is spiking. A demo video of the robot goalie action can be found at the following link: https://www.youtube.com/watch?v=135fH21QXtg. The neuromorphic goalkeeper achieves an accuracy of up to 85% on the condition that the speed of the ball is up to 1 m/s and the initial distance more than 80 cm.
For the hardware communication, the logical levels of the ACK signals of injecting spikes (uplink) and receiving spikes (downlink) are monitored to measure the transmission speed, Figure 9. As can be observed both ACK signals of uplink and downlink have changed 11 times to send a packet, which indicates that data is in the form of 40-bit packets followed by an EOP symbol. According to the reading of the oscilloscope, the frequencies of uplink and downlink ACK signals are 16.951 kHz and 16.970 kHz, respectively. Two symbols are sent in one period, since the logical levels of ACK signals change twice every period. Therefore, the uplink speed is: The microcontroller receives events in parallel but injects and receives spikes in serial. Thus, the speed of receiving events is higher than the speed of injecting and receiving spikes. Since each process are parallelly executed and the speed of injecting spikes is higher than the speed of receiving spikes, the response latency is dominated by the time cost to inject spikes. Furthermore, the servo command is generated based on 20 spikes. Thus, the response latency is: According to the reading of the oscilloscope, the frequencies of uplink and downlink ACK signals are 16.951 kHz and 16.970 kHz, respectively. Two symbols are sent in one period, since the logical levels of ACK signals change twice every period. Therefore, the uplink speed is: The microcontroller receives events in parallel but injects and receives spikes in serial. Thus, the speed of receiving events is higher than the speed of injecting and receiving spikes.
Since each process are parallelly executed and the speed of injecting spikes is higher than the speed of receiving spikes, the response latency is dominated by the time cost to inject spikes. Furthermore, the servo command is generated based on 20 spikes. Thus, the response latency is: t latency = 20 × 1 3082 s 6.489 ms The SNN was running for 1 min in multiple trials, and its output spikes were recorded. Since the servo motor was set to have eight movable positions, the abscissa of the coordinate where the ball can move is also divided equally into eight parts. The microcontroller obtains current coordinate of the moving ball from DVS and inputs it to the SNN. The classification of the coordinate will then be output by the SNN. For example, neuron 6 in the output layer of the SNN will spike if current abscissa of the moving ball is in the 6th position. An example of the SNN activity for some randomly released balls is shown in Figure 10. The output spikes are concentrated at one or two neuron IDs in a short period, which gives strong and effective guidance for the microcontroller to generate servo commands. Because of the voting method applied in Figure 4, the effect of random noise has been reduced. The SNN was running for 1 min in multiple trials, and its output spikes were recorded. Since the servo motor was set to have eight movable positions, the abscissa of the coordinate where the ball can move is also divided equally into eight parts. The microcontroller obtains current coordinate of the moving ball from DVS and inputs it to the SNN. The classification of the coordinate will then be output by the SNN. For example, neuron 6 in the output layer of the SNN will spike if current abscissa of the moving ball is in the 6th position. An example of the SNN activity for some randomly released balls is shown in Figure 10. The output spikes are concentrated at one or two neuron IDs in a short period, which gives strong and effective guidance for the microcontroller to generate servo commands. Because of the voting method applied in Figure 4, the effect of random noise has been reduced. The neuromorphic robotic system has a maximum power consumption of 7.15 W, which is calculated in the following way. For DVS at high activity, its supply voltage is 5 V and the current is 100 mA. For the microcontroller, its operational voltage is 3.3 V and it has a maximum fuse current of 500 mA. For SpiNN-3 board, its power supply is 5 V, idle current is 400 mA and peak current is 1 A [31]. Therefore, the maximum power of the system is: = 5 × 0.1 + 3.3 × 0.5 + 5 × 1 = 7.15 .
In our case the neural network can be run on only one chip (one of four) on SpiNNaker, which requires less than 2 mW of power, so the average power consumption in our example is approximately 4 W.
The developed system has high efficiency due to the inherently sparse events and spikes, impulse propagation of SNNs [32], and the synchronisation mechanism. The transmission speed of hardware communication is not fixed, it only has an assumed limit for the maximum speed in our realisation, but it could be relaxed. The speed is proportional to the events captured by DVS. Because of the adaptive speed, the efficiency is significantly improved.

Discussion
The platform developed in this paper has successfully demonstrated the closed-loop communication between DVS, SpiNNaker and a servo motor by using a single microcontroller. Compared to unidirectional communication of the FPGA interface, the developed solution supports bidirectional data flow and can control external actuators according to the received spikes from SNNs. The FPGA interface is based on a high-cost RoggedStone2 development kit, while our solution is based on the Arduino platform, which is more economical.
Apart from this solution, another microcontroller interface developed by the UM and TUM also supports bidirectional communication and control actuators. However, it requires the assistance of a CPLD, which increases the data transmission speed between SpiNNaker and the microcontroller but also increases the difficulty of further development. Their solution is based on an ARM Cortex-M4 microcontroller with a 168 MHz clock speed, whilst we use a Cortex-M3 microcontroller with an 86 The neuromorphic robotic system has a maximum power consumption of 7.15 W, which is calculated in the following way. For DVS at high activity, its supply voltage is 5 V and the current is 100 mA. For the microcontroller, its operational voltage is 3.3 V and it has a maximum fuse current of 500 mA. For SpiNN-3 board, its power supply is 5 V, idle current is 400 mA and peak current is 1 A [31]. Therefore, the maximum power of the system is: Apart from this solution, another microcontroller interface developed by the UM and TUM also supports bidirectional communication and control actuators. However, it requires the assistance of a CPLD, which increases the data transmission speed between SpiNNaker and the microcontroller but also increases the difficulty of further development. Their solution is based on an ARM Cortex-M4 microcontroller with a 168 MHz clock speed, whilst we use a Cortex-M3 microcontroller with an 86 MHz clock speed. Thus, the data transition of our solution is slower than the speed of the interface designed by UM and TUM. On the other side, our solution does not need the second chip, which means lower power consumption and higher stability. Additionally, the microcontroller developed by UM and TUM has 5 pre-defined peripheral ports for 2 DVSs and 3 motors. This design simplifies the connections with DVSs and motors but results in a limited capacity for other actuators and sensors. The microcontroller board used in our solution has 54 GPIOs and built-in libraries for external devices.
Compared to the robot goalie based on a PC (and DVS), the neuromorphic system that uses SpiNNaker as the processor has a lower power consumption, and it is completely controlled by an SNN running on a SpiNNaker, not using any PC processing power. Therefore, our system is portable and readily implementable on autonomous, battery-operated robotic applications. Admittedly for a small neural network as ours it would be possible to implement it on a microcontroller and thus the power consumption could be even lower than by using SpiNNaker. However, SpiNNaker is much easier to be used in developing a new SNN for our platform because of already existing PyNN software interface, and the power consumption is still acceptable for robotic applications.
Our solution can also be compared with other neuromorphic systems for tracking objects by using artificial neural networks. For example, Monforte et al. [24] have developed a recurrent neural network (RNN) for predicting the trajectory of a bouncing ball (the hardware used was ATIS event camera and iCub robot [16,21,24]). The input events of this neural network were also pre-processed by sub-sampling. This strategy reduces the computational complexity, but can affect the resolution precision. The RNN was using the long-short term memory (LSTM) learning rule [24], which indicates the current output is depended on previous input events. It is similar to our algorithm where the current prediction of the position of ball's arrival is based on 20 latest spikes received from SpiNNaker ( Figure 4). Prediction of the upcoming trajectory of a tracked object was done in an asynchronous fashion. The RNN gives both horizontal and vertical prediction, but our algorithm only focuses on horizontal information. The training set and the testing set of the RNN are recorded data, and its performance of real-time events has not been investigated.
The camera (DVS) used to capture the information of moving objects is of the neuromorphic type, which exploits data-driven, event-based updates-the same as SNNs. It generates inherently sparse data, which is important for low latency and energy-efficient applications. Compared to frame-based cameras, bandwidth and computational complexity are greatly reduced. Furthermore, DVS is much better for the task of object tracking since the static background does not generate data.
The motor control mechanism used in the developed system is PWM, and it can precisely place the robotic arm. Compared to the pulse-frequency-modulation (PFM) motor control for neuromorphic robotics [19], PWM is in the standard library and does not require extra hardware. However, PFM has lower electromagnetic interferences and response latency at the expense of high-cost and complex design.
Although a relatively simple neural network model is used in this work, it is surprisingly efficient, as demonstrated in our YouTube video, as long as the incoming ball's trajectories are not deliberately chosen to play on the weakness of the model, which is to change the 'lanes' of motion. The goalkeeper can only focus on one ball at the time, hence the accuracy will be reduced if more than one moving ball appear in the field of view of DVS simultaneously. However, this constraint is more due to the simplicity of the used SNN rather than being a hardware limitation. For further improvement of the robot for this task, an advanced SNN could be developed, based on, for example, the human ball-catching models.
Another possible limitation of our SNN and potentially use of a DVS in general is the situation when the background is changing. In that case the number of events will increase rapidly what might overload the capacity of the microcontroller to handle them, and also to force all pixels to spike almost simultaneously after post-processing and resolution reduction. Some moderate change of the background including some other small objects moving in the field of view or some noise is currently eliminated by adjusting the spiking threshold in the DVS, as well as with our post-processing of events (decision when a super-pixel spikes) and our algorithm (Figure 4) which sets a minimum threshold in the number of spikes in the output layer for making the decision about where to move the goalkeeper.

Conclusions
This paper focuses on the development of a neuromorphic robotic system and its demonstration on the robot goalkeeper task. The developed interface is the first neuromorphic interface that can precisely control the position of the robotic arm by using an SNN. It can allocate different positions corresponding to the behaviour of each neuron in the output population. The system is neuromorphic and controlled by an SNN run on SpiNNaker detached from a PC. It has high efficiency, due to the sparse data, use of SNNs and the adaptive transmission speed.
The designed system provides a low-cost and user-friendly platform for various neuromorphic robotics applications. The communication between the neuromorphic processor and the microcontroller is executed automatically. Developers only need to connect sensors to the microcontroller and get the output spikes through the microcontroller. The designed system is a prototype of general-purpose neuromorphic robotics platform. An SNN has been successfully applied in this closed-loop hardware system to solve a real-world problem: interception of a moving object. In further development, a microcontroller with a higher speed could be used to increase the data transmission speed and an advanced SNN could be created to increase the accuracy and responsiveness of the system. Regarding the applications of such a neuromorphic sensory-processing-actuation systems, one potential area is also in neuroprosthetic systems, such as sensory prosthesis (visual, cochlear, e.g., [33]) or sensory substitution systems [34]. Future work will also include development of deep learning realised on SNNs [35], and for that purpose our platform can serve as an excellent testing ground. Deep learning has been achieving exceptional progress, but for real-time reaction in real environment the efficiency and latency can be significantly improved by deploying the deep SNNs.