An FPGA-Based Neuron Activity Extraction Unit for a Wireless Neural Interface

Abstract: As the development of computational and functional brain models depends entirely on the data acquired from the neural interface, this device plays a vital role in both prosthetic development and neurological experiments. A wireless neural interface is preferred over a traditional wired one because it maximizes the comfort of the subject and ensures freedom of movement while in use. This paper describes the field programmable gate array (FPGA) prototype design of a low-power multichannel neuron activity extraction unit suitable for a wireless neural interface. To meet the low-power requirement, we propose a novel neural signal extraction algorithm which can reduce the transmission rate by up to 6000X relative to the input signal. Consequently, this technique offers at least a 2X power reduction compared to state-of-the-art systems. We implemented this scheme on a Xilinx Zynq-7000 FPGA, which can be used as an intermediate step towards an application specific integrated circuit (ASIC) design for on-chip neural signal processing. The proposed FPGA prototype offers reconfigurable computability, which means the model can be modified and verified according to prerequisites before the final ASIC design. This prototype consists of a signal filtering unit and a signal extraction unit which can be used either as stand-alone units or combined as a complete system. Our proposed scheme can also work as a single-channel or a scalable multichannel interface based on user demands. We collected practical neural signals from rat brains and validated the efficacy of the implemented system using in-silico signal processing.


Introduction
A neural interface is used for gaining access to the brain's circuits. As a gateway component of neural devices, it creates a direct information pathway between the brain and the outside world [1]. With the recent rapid advancement in experimental neuroscience, the neural interface is becoming sophisticated and miniaturized [2][3][4][5][6]. Longer-term procedures especially require miniaturized and wireless devices to ensure patients' comfort and flexibility [7][8][9]. The wireless neural interface is also suitable for preclinical experiments with behaving nonhuman animals [10][11][12][13]. The wireless nature of the experiment ensures untethered movement during the procedure [14]; hence, more naturalistic brain signal recording is possible.
As most brain signals are collected using multiple channels, the raw neural signal contains a large volume of information which needs to be transmitted for further processing [15][16][17]. For example, if we consider 8 bits/sample for a 64-channel system, the required data transmission rate will be more than 11 Mbps [18]. Traditional wired neural devices can transmit such a high volume of data with ease using an appropriate serial or parallel communication protocol. However, for wireless neural devices, this massive amount of data creates a bottleneck, as the system will then consume significant battery power. Hence, a standalone wireless system would have to undergo frequent battery replacement, which would disrupt a continuous procedure. Additionally, higher power consumption is associated with a considerable amount of heat dissipation, which may cause critical tissue damage. Therefore, a systematic approach is needed to lower the transmission rate by filtering out redundant parts of the signals. In this work, we developed a novel algorithm to ensure a lower data transmission rate for wireless neural signal transmission.
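As a rough illustration of this bandwidth problem, the raw data rate of an uncompressed recording can be estimated as channels × bits per sample × sampling rate. The Python sketch below uses a hypothetical helper, `raw_rate_bps`, and an assumed per-channel sampling rate of 10 kHz; the sampling rate behind the figure cited from [18] is not stated in this section, and packet overhead is ignored.

```python
def raw_rate_bps(num_channels, bits_per_sample, sampling_rate_hz):
    """Uncompressed data rate in bits per second (no packet overhead)."""
    return num_channels * bits_per_sample * sampling_rate_hz


# Assumed example: 64 channels, 8 bits/sample, 10 kHz per channel.
rate = raw_rate_bps(64, 8, 10_000)
print(f"{rate / 1e6:.2f} Mbps")  # 5.12 Mbps
```

Under this simple model, a rate above 11 Mbps for 64 channels at 8 bits/sample would correspond to a per-channel sampling rate above roughly 21.5 kHz.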
To implement the proposed algorithm at the circuit level, we used a field programmable gate array (FPGA) to design the hardware and verify the algorithm. An FPGA is an integrated circuit which can be reconfigured according to the users' requirements [19]. Moreover, when multiple signals must be processed at the same time, an FPGA performs notably faster due to its inherent capacity for parallel computation [20,21]. As brain signals are typically collected using multiple channels, the FPGA is a strong candidate for designing the system. We used the ZedBoard development board for the implementation of our design. This board uses a Xilinx Zynq-7000 SoC as the FPGA chip for signal processing. In this paper, we present this FPGA prototype design, which we verified in different configurations; this intermediary platform can then be converted into an ASIC according to specific user demands.
The wireless neural interface is gaining popularity due to its convenience and flexibility. Most studies concentrate primarily on wireless communication technology [11,12], wireless power transmission techniques [13,22] and compatible electrode development [23,24]. Only a few of them focus on dedicated signal processing algorithms for neural data [25,26]. To the best of our knowledge, none of these studies presented a design which is reconfigurable according to the end-user's demands. The major concern of this research was to develop circuit-level implementations of data reduction algorithms that lower the data transfer rate, which would in turn decrease the power consumption of wireless systems. Herein, we also introduce a digital filtering scheme that can be incorporated with the signal extraction unit if the user wants to omit some signal preprocessing steps. In addition, our prototype can serve as either a single-channel or a multichannel scheme (with a scalable number of channels) based on design requirements. Finally, we evaluated our design with neural signals collected from rat brains to validate the prototype.

Methodology
As a multichannel input with a high sampling frequency, the neural signal produces a large amount of data per second, which contains information that may be redundant for a specific application or neural experiment. For designing the prototype, we considered an experiment which collects single-neuron activity signals from rat brain cells using a wireless neural interface.
As it is a standalone transmission system, the high data transmission rate consumes a large amount of power and consequently lowers the battery life. Therefore, the total possible time for conducting such a continuous experiment is short. To increase the duration of this procedure, we propose an algorithm that can reduce the data transmission rate. We considered three factors for this design. Firstly, single-neuron activity in the rat brain is predominantly a high-frequency signal (>300 Hz) [27]. Secondly, due to the sparse nature of the neural signal, it is rare that all channels are activated at the same time [28].
Finally, signal epochs containing neural activities can be detected with a threshold voltage. This threshold depends on the noise level of the signal acquisition system [29]. Therefore, we only need to transmit when a particular channel is recording a high-frequency signal above a certain voltage, and we need to know the starting time and peak amplitude of that signal epoch. Considering these parameters, we designed our proposed system as shown in Figure 1. This system has three subunits. First, the signal goes through the high-pass filter, which removes the lower-frequency components. The filtered signal then passes through the signal extraction unit, which excludes the insignificant low-voltage signal. A signal higher than a predefined threshold voltage can pass through this unit, and the channel activation register unit records the identity of the relevant channel. The output contains the waveform when the neuron is activated. It also includes the timestamp of the neuron activation, the peak amplitude during each activated epoch and the identity of the associated channel. It should be noted that, as an FPGA prototype, our design is easy to reconfigure according to the user's demands before ASIC implementation. For example, the filter parameters can be modified to select a desired passband, or the threshold value of the signal extraction unit can be adjusted. Additionally, we can choose the number of channels of the system to make it compatible with the parameters of a particular experiment. This scalability is one of the unique features of the proposed design.

System Architecture
For ease of explanation, we divided the prototype into three subsystems: (i) a neural signal filter unit; (ii) a neuron activity extraction unit; and (iii) a channel activity register. The extraction unit is the fundamental part of this system and cannot be omitted in any design. The other two units can be excluded depending on the design requirements. For example, if a built-in analogue filter is available with the signal collecting electrode assembly, then the filter unit becomes redundant. Similarly, if there is only one channel for signal acquisition, then the channel activation register is not required.
For this research, we incorporated all three subunits to meet our desired specifications. Xilinx System Generator, a MathWorks Simulink toolbox, was used for FPGA programming. The system architecture is described in the following three subsections. It should be noted that this is a reconfigurable hardware model. Therefore, anyone can modify this FPGA design according to their needs before the final ASIC implementation.

Neural Signal Filter Unit
As mentioned in the Methodology section, we need to filter out the low-frequency (≤300 Hz) components to extract single-neuron activity from rat brain signals, as these components carry no relevant information for this purpose. To this end, we designed a high-pass equiripple FIR filter. The design parameters were selected as follows: stop-band frequency, F_stop = 300 Hz; pass-band frequency, F_pass = 400 Hz; stop-band attenuation, A_stop = 80 dB; pass-band attenuation, A_pass = 1 dB; density factor, D = 16. Considering a minimum-order design, we obtain the frequency response of this filter as shown in Figure 2.
Here the cutoff frequency is approximately 375 Hz, which is sufficient [27] for neural activity extraction.
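The filter in this work is an equiripple design; the sketch below is not that design. As a simpler stand-in that needs only the Python standard library, it builds a Hamming-windowed-sinc high-pass FIR filter by spectral inversion. The 10 kHz sampling rate, the 401 taps and the 350 Hz cutoff (placed midway between F_stop and F_pass) are assumptions, and the function names are hypothetical. A Hamming window reaches only roughly 50 dB of stop-band attenuation, so this stand-in does not meet the 80 dB specification given above.

```python
import cmath
import math


def highpass_fir(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed-sinc high-pass FIR via spectral inversion."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs_hz  # normalized cutoff in cycles per sample
    lowpass = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response, truncated to num_taps samples.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        lowpass.append(ideal * window)
    total = sum(lowpass)
    lowpass = [c / total for c in lowpass]  # unity gain at DC
    # Spectral inversion: high-pass = delayed unit impulse - low-pass.
    highpass = [-c for c in lowpass]
    highpass[mid] += 1.0
    return highpass


def gain(taps, freq_hz, fs_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / fs_hz
    return abs(sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(taps)))


FS_HZ = 10_000.0  # assumed sampling rate
taps = highpass_fir(401, 350.0, FS_HZ)
for f_hz in (100.0, 2000.0):
    print(f"{f_hz:6.0f} Hz: gain {gain(taps, f_hz, FS_HZ):.4f}")
```

The printed gains should be close to 0 at 100 Hz (stop band) and close to 1 at 2 kHz (pass band). An equiripple design tool would be needed to reproduce the specification above with a minimum-order filter.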

Neuron Activity Extraction Unit
In this subsection, the design of the neural activity extractor is discussed. As mentioned in the previous section, the brain signal indicates neural activity only when its value is greater than the noise voltage level. In our experiment, the collected signals become significant when they cross the 40 µV level. Therefore, we only need to transmit the signal when the input crosses this threshold voltage. The data transmission rate can be lowered further by transmitting only the timestamps and peak amplitudes of the neural activation events. This can drastically reduce the data transmission rate and power consumption. However, the processed data will then contain only a fraction of the information in the original signal. Depending on the application requirements, the user can decide what type of output is needed.
Algorithm 1 characterizes the proposed methodology of neural activity extraction with three output components which are computed within a loop: (i) the timestamp (OP_Time) is represented by lines 5 and 6; (ii) the peak amplitude (OP_Amp) is found by lines 7-13; and (iii) the brain wave throughput (OP_Wave) is presented by lines 3 to 9. First, the input signal is compared with the threshold voltage. We chose 40 µV as the threshold voltage for this research. If the input voltage is higher than this level, the signal is passed, and the starting time of this event is recorded. A separate memory block is initialized during each episode to keep track of the peak amplitude.
The system architecture of this subsystem is shown in Figure 3. Here, the input signal (Signal_In) of this subsystem is the filtered neural signal. This signal may come directly from an analog filter or from the filter described in the previous subsection. The signal extraction process starts by comparing the input signal with the threshold voltage. When the input signal becomes higher than that voltage, it is allowed to produce the output signal (OP_Wave). At the same time, a count-up timer circuit is used to keep track of time, and the starting time of each neural event is transmitted (OP_Time). Subsequently, a memory block is activated in every epoch to record the peak amplitude (OP_Amp). This value is computed throughout a neural activity, and the block only updates its stored data if the incoming neural sample is greater than the stored value. The final value is then transmitted right after each epoch is completed.
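The extraction procedure described above can be sketched in software as follows. This is a minimal Python illustration written from the prose description, not a transcription of Algorithm 1 or of the hardware design. It assumes samples in µV, a positive-going threshold only, and timestamps expressed as sample indices; the function name `extract_activity` is hypothetical.

```python
def extract_activity(samples_uv, threshold_uv=40.0):
    """Return (op_wave, events) for one filtered channel.

    op_wave mirrors OP_Wave: the sample while above threshold, else 0.0.
    events holds one (OP_Time, OP_Amp) pair per epoch: the start index
    and the peak amplitude of that epoch.
    """
    op_wave = []
    events = []
    in_epoch = False
    start = 0
    peak = 0.0
    for t, x in enumerate(samples_uv):
        if x > threshold_uv:
            if not in_epoch:
                # New epoch: latch the start time, reset the peak memory.
                in_epoch, start, peak = True, t, x
            elif x > peak:
                peak = x  # update only when the new sample is larger
            op_wave.append(x)
        else:
            if in_epoch:
                events.append((start, peak))  # emit at epoch completion
                in_epoch = False
            op_wave.append(0.0)
    if in_epoch:  # the signal ended while still above threshold
        events.append((start, peak))
    return op_wave, events


wave, events = extract_activity([0, 10, 45, 80, 60, 20, 0, 50, 41, 5])
print(events)  # [(2, 80), (7, 50)]
```

Transmitting only `events` instead of `op_wave` corresponds to the lower-rate output option discussed at the start of this subsection.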

Channel Activity Register
For a multichannel system, we need to include an additional component: a channel activation register. It records the channel identity during a neural activation epoch. This register enables the user to find out from which electrode the neural signal is recorded during an activity epoch. Algorithm 2 presents the working principle of this subsystem. If the signal from any channel exceeds the predefined threshold (in this case 40 µV), this unit outputs that channel number once per epoch, at the beginning of each neural event. This subsystem also passes the input signal when it is higher than that threshold. This signal acts as the input of the extraction subsystem in a multichannel system.
Figure 4 illustrates the proposed channel activation register architecture for a two-channel system. This design can be replicated for any number of input channels specified by the user. The proposed subsystem takes the filtered signal as its input, and if it has a value higher than the threshold, the subsystem passes the associated channel identification information (Ch_ID) to the buffer register for transmission. Additionally, this unit works as a multiplexer, as it takes multichannel inputs and produces a single-channel output (Ch_Out). As previously mentioned, a neural signal is sparse in nature, so the design assumes that only one channel is activated during a neural epoch. Here, this single-channel output (Ch_Out) is the activated neural signal, which serves as the input for the neural activity extraction subsystem of a complete multichannel system.
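The behaviour of the channel activation register can be sketched in the same way. The Python example below is an illustration based on the description above rather than a transcription of Algorithm 2. It assumes 1-based channel numbers, equal-length sample lists in µV, and, as a tie-break that the text does not specify, that the lowest-numbered channel wins if several channels exceed the threshold at once. The function name is hypothetical.

```python
def channel_activity_register(channels_uv, threshold_uv=40.0):
    """Multiplex several filtered channels into one output.

    Returns (ch_out, ch_id_events): ch_out mirrors Ch_Out, and
    ch_id_events lists (sample index, channel number) once per epoch,
    at the beginning of each neural event, mirroring Ch_ID.
    """
    ch_out = []
    ch_id_events = []
    active = None  # channel that was active at the previous sample
    for t in range(len(channels_uv[0])):
        current = None
        for ch, samples in enumerate(channels_uv, start=1):
            if samples[t] > threshold_uv:
                current = ch  # assumed tie-break: lowest channel wins
                break
        if current is None:
            ch_out.append(0.0)
        else:
            ch_out.append(channels_uv[current - 1][t])
            if current != active:
                ch_id_events.append((t, current))
        active = current
    return ch_out, ch_id_events


ch1 = [0, 50, 60, 0, 0, 0]
ch2 = [0, 0, 0, 0, 45, 70]
out, ids = channel_activity_register([ch1, ch2])
print(ids)  # [(1, 1), (4, 2)]
```

In a complete multichannel system, `out` would then be fed to the extraction stage, which recovers the timestamp and peak amplitude of each epoch.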

Complete Model
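We have already discussed the design of the three subsystems which are the essential components of the complete model of the prototype. There are four possible configurations for the complete setup based on the input parameters: (1) a single-channel system with filtered input; (2) a single-channel system with unfiltered input; (3) a multichannel system with filtered input; and (4) a multichannel system with unfiltered input.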
We can reconfigure the FPGA to choose any of these configurations and test the efficacy of the system before the final ASIC implementation. If the system has one channel with a filtered input, the signal extractor unit alone is adequate as the complete model. Figure 5 illustrates this design setup. For a single-channel system with an unfiltered input, we need to concatenate the filter unit with the signal extractor unit, as presented in Figure 6. For multichannel systems, we need to include the channel activation register to record the channel identity during a neural activation epoch. Two-channel systems are presented here as examples of multichannel systems. The configuration of a multichannel system with filtered input is shown in Figure 7. The design of the multichannel system with unfiltered input is similar to its filtered counterpart except for the addition of the filtering subsystems, as shown in Figure 8.
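The selection rule behind these four configurations can be summarized in a few lines. The Python sketch below is only an illustration: the function name and the subsystem labels are hypothetical, and the stage order (filter, then channel activity register, then extraction) follows the signal flow described in the preceding subsections.

```python
def required_subsystems(num_channels, input_is_filtered):
    """List the subsystems needed for a given setup, in signal-flow order."""
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    units = []
    if not input_is_filtered:
        units.append("filter")  # digital high-pass filter unit
    if num_channels > 1:
        units.append("channel_activity_register")
    units.append("extraction")  # always required
    return units


for channels, filtered in [(1, True), (1, False), (2, True), (2, False)]:
    print(channels, filtered, required_subsystems(channels, filtered))
```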

Implementation and Results
We implemented the prototype system on a Xilinx Zynq-7000 (Artix-7) FPGA on the ZedBoard. Although this FPGA development board provides a cost-effective [30] solution for our system implementation, the design is not limited to this board; it can be implemented on any modern FPGA. As described in the previous section, there are four possible configurations (SCF, SCUF, MCF and MCUF, shown respectively in Figures 5-8) from which the user can select the required design. Table 1 shows the hardware resources required for each of these setups. In this table, LUT, LUTRAM, FF, BRAM and DSP stand for look-up table, look-up table RAM, flip-flop, block RAM and digital signal processing blocks, respectively. From Table 1, it is evident that the systems with unfiltered inputs require more resources than the systems with filtered inputs. This is because the neural filter subsystem requires additional LUT, RAM and DSP blocks for signal processing. As more system resources consume more power and require extra floor space in the ASIC chip implementation [31], we suggest using filtered input for better performance if the design permits. Since analogue filters are inexpensive and capable of real-time filtering [32], they are suitable for the proposed neural signal extraction system. However, digital filters generally perform better than their analogue counterparts [33]. Therefore, in our reconfigurable design, the user can decide which configuration best suits a specific system requirement.
It should be noted that we used two-channel systems as representatives of multichannel models. To analyze the effect of additional input channels on resource utilization, we also implemented four-, eight-, sixteen-, thirty-two- and sixty-four-channel filtered input systems.
Figure 9 shows this resource evaluation. Here, IO represents the number of input-output blocks in the FPGA. To explain the comparative resource utilization of the systems with different numbers of input channels, we need to focus on their design. As discussed in the previous section, the prime difference between a single-channel filtered input system and a two-channel filtered input system is the inclusion of a channel activity register unit. Additionally, the two-channel filtered input system has an additional input and output (channel ID) compared to its single-channel counterpart. Therefore, the two-channel system needs one extra LUT and two IO blocks. However, when the number of input channels increases from two to sixty-four, no further logic components are needed; only the usage of IO blocks gradually increases. This analysis indicates that we can increase the number of input channels based on our requirements without overwhelming the system's resources. This ensures the scalability of the implemented prototype.
To assess the FPGA prototype, we recorded spontaneous neural activity from the CA1 region of the rat hippocampus using acutely implanted microelectrodes at the Biomedical Engineering Department of USC. We used these signals to formulate in-silico datasets for our experiment. The original neural signals include the broadband raw data from multichannel recordings along with the filtered output from a high-pass filter with a 300 Hz cut-off frequency.
The resultant outputs from the single-channel unfiltered input system (Figure 6) and the multichannel filtered input system (Figure 7) are shown in Figures 10 and 11, respectively. To demonstrate the outcome of the filtering subsystem, an internal signal (filter output) is added in Figure 10. The practical output consists of the waveform during neural activity, its timestamp and its peak amplitude. Here, the threshold is for triggering transmission, not for spike sorting, although, depending on the application, the timestamp and peak amplitude can be used as a simple spike sorting method. However, complete spike sorting can be performed after wireless transmission of the signal during neural activity; see Figure 10c. Figure 11 illustrates the signals from a multichannel system with two inputs. It has an additional channel identification output to display the associated channel number of any neural activity. This output is illustrated in Figure 11f.
If a wireless neural interface continuously transmits the raw signal collected from the brain, it will reduce the duration of an uninterrupted experiment by rapidly draining the battery. As previously mentioned, an 8 bits/sample, sixty-four-channel system requires an 11 Mbps transmission rate [18]. However, reference [34] reports that the average neural signal spiking rate is less than 0.5 Hz for awake rats. We programmed the neural interface to transmit only when neural activity is occurring. Thus, our system can lower the transmission rate to as little as 1.6 kbps for a 64-channel system (considering a 10 kHz sampling rate), a 6000X transmission rate reduction compared with the input.
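The reduction factor follows directly from the two rates quoted above. As a quick check, the Python snippet below divides the raw rate by the reduced rate; the function name is hypothetical and the inputs are the figures stated in the text.

```python
def reduction_factor(raw_rate_bps, reduced_rate_bps):
    """Ratio between the raw and the reduced transmission rates."""
    return raw_rate_bps / reduced_rate_bps


factor = reduction_factor(11e6, 1.6e3)  # 11 Mbps versus 1.6 kbps
print(round(factor))  # 6875
```

The ratio of about 6.9 × 10^3 is of the same order as the 6000X figure quoted above.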
According to the post-implementation report from the Xilinx Vivado simulator, the on-chip dynamic power consumption of the prototype (a 64-channel filtered input system) is 3 mW, at least a 2X power reduction compared with state-of-the-art systems. As an FPGA consumes more energy than an application-specific IC, the power consumption is expected to decrease further after ASIC implementation.
A comparative study between the implemented prototype and related previous works is presented in Table 2. Several research groups across the world are working on the development of wireless neural interfaces. Because various components are required to build the complete system, contributions in this field address multiple aspects, as seen in Table 2. Only a few research groups ([26,35]) worked on the signal processing perspective of the wireless neural interface. Among these works, our implemented system offers the minimum data transmission rate at the lowest power consumption. Moreover, apart from our proposed method, none of these previous works is reconfigurable and scalable. Therefore, our prototype's neural activity extraction unit is suitable for customizable system-level applications in wireless neural interfaces.

Figure 1. Simplified block diagram of the proposed system.

Figure 2. Frequency response of the neural signal FIR filter.

Figure 5. Design of a single-channel filtered input system (SCF) in SysGen.

Figure 6. Design of a single-channel unfiltered input system (SCUF) in SysGen.

Figure 7. Design of a multichannel filtered input system (MCF) in SysGen.

Figure 9. Comparative resource utilization for different numbers of channels.

Table 2. Comparison with previous works.