A Multi-Channel Electromyography, Electrocardiography and Inertial Wireless Sensor Module Using Bluetooth Low-Energy

Abstract: This paper proposes a wireless sensor device for the real-time acquisition of bioelectrical signals such as electromyography (EMG) and electrocardiography (ECG), coupled with an inertial sensor, to provide a comprehensive stream of data suitable for human activity detection, motion analysis, and technology-assisted nursing of persons with physical or cognitive impairments. The sensor is able to acquire up to three independent bioelectrical channels (six electrodes), each with 24 bits of resolution and a sampling rate up to 3.2 kHz, and has a 6-DoF inertial platform measuring linear acceleration and angular velocity. The Bluetooth Low Energy (BLE) wireless link was chosen because it allows easy interfacing with many consumer electronics devices, such as smartphones or tablets, that can work as data aggregators, but it also imposes data rate restrictions. These restrictions are investigated in this paper as well, together with the strategy we adopted to maximize the available bandwidth and reliability of the transmission within the limits imposed by the protocol.


Introduction
Nowadays, wearable wireless sensors placed on the skin are becoming very common in many application fields such as healthcare, sport, fitness, ambient assisted living, entertainment, and autonomous driving systems [1,2]. In particular, three body signals that can be easily captured by these sensors, namely electromyography (EMG), electrocardiography (ECG), and acceleration (ACC), have been shown to provide accurate and reliable information on people's activities and behaviors. Therefore, low-cost and low-energy wearable wireless EMG/ECG/ACC sensors can be efficiently used for all-day telemonitoring of patients affected by dementia or neurodegenerative disorders characterized by a progressive decline in cognitive and functional abilities. Behavioural and psychiatric symptoms are often present in these cases and make patient management very hard. The quality of life (QoL) experienced by family caregivers has been shown to be lower than the QoL of caregivers for persons who do not have dementia [3,4]. Research has shown that lower QoL can increase absences from work and reduce job productivity.
Identifying a system able to help caregivers in daily patient management will therefore be of value. It would improve the quality of life of patients and their caregivers and reduce the social cost linked to these diseases. In particular, caregivers designate certain activities as dangerous, and it is of paramount importance to detect such activities in order to provide alarm functions to these patients [5,6]. An accurate detection of activities can be achieved by acquiring information on both acceleration and other biosignals, such as ECG and EMG signals, at the same time [7,8]. With these applications with wireless powering and compressed sensing. The entire implementation contains the analog front-end (AFE), analog-to-digital converter (ADC), digital signal processor (DSP), power management, and an RF-to-DC rectifier, and its size is 1.7 mm × 2.5 mm.
Recently, in [52], a wearable hardware platform to measure ECG and EMG, with an additional IMU sensor to detect motion artifacts, has been developed. Bringing all the sensors onto a single platform resolves the sensor fusion problems. Additionally, the developed platform successfully detected the motion artifacts during all the events, and the necessary features of the ECG and EMG waveforms were extracted well by using signal processing techniques. The work is a contribution towards the development of integrated wearable biomedical sensors for long-term monitoring.
Finally, in [53], a novel ExG measurement system built around a custom-designed ultra-low-noise instrumentation amplifier (ULN-INA) has been proposed. Two versions of the prototype were presented: (a) one with a single INA that measures all the ExG signals in a time-multiplexed manner; (b) one with four INAs that can measure up to four parallel ExG channels simultaneously. Both have programmable gain and bandwidth features, which allow the measurement of different biopotentials such as ECG, EMG, EEG, and EOG.
Despite all the research that has been done on these body sensors, most has been focused on sensing and signal processing for compression and recognition, while very little attention has been paid to the transport of the acquired signals through the wireless channel and to the overall system optimization. Unoptimized protocols are often used, resulting in sub-optimal performance, like 100 ms latency and only 6 h operation in [18], or low throughput in [50], to name a few that employ BLE. In our work, the communication protocol, though remaining fully standards-compliant to BLE, has been adapted to ensure reliable, controlled-latency transmission of the raw data acquired by the sensors, resulting in latencies lower than 10 ms and net data rates (not counting protocol overhead) in excess of 80 kbit/s, while keeping data integrity and tight time synchronization even in the presence of severe RF interference, and allowing days of battery-powered continued operation.

Hardware and System Description
As the system is expected to be used for prolonged amounts of time, its design has been driven by the requirement of lowering its power consumption as much as possible, so as to maximize battery life. Hence, particular emphasis will be given to the low-power operation of the whole sensor, which is achieved by a careful selection of hardware components and features, combined with a tailored power management strategy developed in software.
In the following two subsections an overview of the hardware components selected to meet our goals is provided, together with a description of the device operating modes, which include an acceleration-based wake-up system designed so that the unit can just be "worn and forgotten" (at least for a few days, until the battery needs recharging).

Circuit Implementation
The wireless sensor module is composed of the following main components:
• Microcontroller with integrated wireless transceiver. Based on the versatile nRF52840 System-on-Chip (SoC) by Nordic Semiconductor, programmed to use the BLE radio protocol stack, it offers the needed wireless connectivity together with wired (USB) connectivity for battery charging, simplified and secure pairing, etc.
• Biopotential analog front-end. Based on the highly integrated Texas Instruments ADS1293 front-end (FE), which includes configurable digital filters, instrumentation amplifiers and ADCs, augmented with an external biasing network suited for both EMG and ECG signal acquisition through either DC- or AC-coupled electrodes.
• Inertial platform. Built around the ultra-low-power LSM6DSO digital accelerometer and gyroscope by ST Microelectronics, it features an always-on accelerometer used to wake up the system when motion is detected, and is used to stream 6-axis motion data when the system is operational.
• Power supply. It consists of a lithium-polymer battery charger and a set of linear and switching regulators that provide power to the different subsystems only when actually needed, and is designed to maximize energy efficiency.
A block diagram of the system, with the power path highlighted, is shown in Figure 1.
Figure 1. Block diagram of the sensor node highlighting the various power supply subsystems exploited to minimize energy consumption and the data interfaces. The ADS1293 acquires the analog signals and transmits them to the SoC through a serial peripheral interface (SPI) when enabled by the general-purpose input/output (GPIO) line. The LSM6DSO integrates the inertial platform and streams its data to the microcontroller (uC) through an inter-integrated circuit (I2C) interface.

The nRF52840 SoC already integrates a number of voltage regulators, specifically one low-dropout (LDO) linear regulator dedicated to the USB transceiver (XCVR) (inactive during battery-powered operation), a high-voltage regulator (REG0) that can accept the voltage range of a lithium-polymer battery to produce the main I/O supply voltage $V_{DD}$, and the normal-voltage regulator (REG1) that further reduces $V_{DD}$ to the internal supply used by the microcontroller core and radio. The latter two regulators can be configured to operate either as LDOs or as switching regulators.
Although REG0 can be controlled by software to produce a voltage from 1.8 V to 3.3 V in 0.3 V steps, and so it could in principle be used to also power the EMG front-end (which accepts a 2.7 V to 5.5 V range), an external low-noise LDO was added to the system to improve the noise performance of the analog front-end and also to reduce current consumption when the ADS1293 does not need to be powered, as the LDO provides a much lower stand-by current than the ADS1293 itself.
The ADS1293 contains the analog FE, suitable for sensing either ECG or EMG signals. It has three fully differential simultaneous ADCs, so it can stream three different signals at once, with digitally controllable filters that simultaneously adjust the bandwidth and the output data rate. These filters can thus be programmed according to the type of signal (ECG or EMG) being acquired. Each ADC differential channel can then be connected to any two of the six input leads, under software control, for maximum usage flexibility.
System timing is provided by two crystal oscillators controlled by the SoC: a 32.768 kHz low-frequency crystal oscillator (LFXO), which is used for the Bluetooth radio slot scheduling and overall application timing, and a 64 MHz high-frequency crystal oscillator (HFXO), used when accurate high-frequency clocks are necessary (radio, USB, etc.) and to timestamp sampled data. The 409.6 kHz clock necessary for the sigma-delta ADC in the analog front-end is also derived from the HFXO by means of a fractional divider.
A photograph of an assembled prototype circuit board is shown in Figure 2. The overall circuit occupies 51 mm × 27 mm, but a much tighter integration is clearly possible for the final version. Electrodes will be connected (possibly through shielded cables) to contacts J1-J6.
Figure 2. Photograph of the sensor prototype built for testing and evaluation purposes. The EMG/ECG sensing chip is U2, the inertial sensor is U5.

System Operational Modes
At any given time the system can be in one of five different operational modes (OFF, IDLE, ADVERTISING, CONNECTED, STREAMING), each of which is characterized by different power requirements:
• OFF: This is the lowest power state, with the system completely disabled, meant to be used for long-term storage, with a current drain comparable to or lower than the battery self-discharge rate. It is entered upon request from the host, or when a low-battery condition is detected. It can only be exited by attaching the device to the battery charger. In this state, the CPU and accelerometer are programmed in the power-down state for lowest current consumption. The EMG front-end is not powered at all.
• IDLE: This is entered after a period of inactivity, i.e., after advertising for a long time without any BLE central requesting a connection. In this state the accelerometer is kept active in an ultra-low-power state at a low (12.5 Hz) sampling rate. When it detects a change in the acceleration measured on any of its three axes (e.g., when the sensor is rotated around an axis not parallel to gravity, or it is slightly shaken), it resets the CPU and moves the system to the advertising state. The EMG front-end is not powered at all, as in the OFF state.
• ADVERTISING: This state is entered after a wake-up from IDLE, a reset from OFF, or after the host disconnects the BLE link. In this state the accelerometer is kept as in IDLE, to prevent the advertising from timing out if there is movement, and the EMG front-end is still unpowered. The CPU is timed from an external LFXO to reduce power consumption with respect to the internal RC oscillator.
• CONNECTED: Entered when a BLE central establishes a connection. In this state the host can control most of the functions of the sensor. By default, only the battery monitor is enabled and the LSM6DSO is powered down, while the ADS1293, which has a relatively long start-up time due to the high-value biasing resistors, is powered on and kept in stand-by, waiting for the host to request data streaming. The CPU is still timed by the LFXO.
• STREAMING: Entered when the host enables notifications for a specific service. In this state the requested data source is enabled at the prescribed data rate, and the uncompressed data stream is sent over the BLE link. For accurate data timestamping, the high-frequency crystal oscillator (HFXO) is kept active at all times, and not only during radio activity as is done in the previous two states. This results in a slightly higher current consumption, but it is necessary to ensure proper data synchronization.
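The mode transitions listed above can be summarized as a small state machine. The following is only a minimal sketch: the event names are hypothetical labels chosen for illustration, and the transition from STREAMING back to CONNECTED when notifications are disabled is an assumption not stated in the text.

```python
from enum import Enum, auto


class Mode(Enum):
    OFF = auto()
    IDLE = auto()
    ADVERTISING = auto()
    CONNECTED = auto()
    STREAMING = auto()


# (current mode, event) -> next mode. Event names are hypothetical labels.
TRANSITIONS = {
    (Mode.OFF, "charger_attached"): Mode.ADVERTISING,
    (Mode.IDLE, "motion_detected"): Mode.ADVERTISING,
    (Mode.ADVERTISING, "advertising_timeout"): Mode.IDLE,
    (Mode.ADVERTISING, "central_connected"): Mode.CONNECTED,
    (Mode.CONNECTED, "notifications_enabled"): Mode.STREAMING,
    (Mode.CONNECTED, "link_disconnected"): Mode.ADVERTISING,
    (Mode.CONNECTED, "host_off_request"): Mode.OFF,
    (Mode.STREAMING, "link_disconnected"): Mode.ADVERTISING,
    (Mode.STREAMING, "host_off_request"): Mode.OFF,
    # Assumption: disabling notifications returns to CONNECTED.
    (Mode.STREAMING, "notifications_disabled"): Mode.CONNECTED,
}


def next_mode(mode, event):
    """Return the mode reached from `mode` when `event` occurs."""
    # A low-battery condition forces OFF from any powered mode.
    if event == "low_battery" and mode is not Mode.OFF:
        return Mode.OFF
    # Events that are not relevant in the current mode leave it unchanged.
    return TRANSITIONS.get((mode, event), mode)
```

Note that in OFF only the charger event has any effect, which mirrors the statement that this mode can only be exited by attaching the battery charger.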
A summary of the expected current consumption for the three main subsystems in the different modes is reported in Table 1. The "ADS1293" column also includes the quiescent current of its dedicated external low-noise voltage regulator. The "min" and "max" streaming currents refer to the number of active channels: from one to three EMG channels for the ADS1293, linear acceleration only or linear acceleration plus angular velocity (gyroscope) for the LSM6DSO.

Data Communication and Processing
The developed hardware platform is but a small part of the whole system design: streaming all the data the sensors can acquire over a BLE 4.0 link requires a carefully crafted protocol to make the most of the available bandwidth while ensuring reliable transmission. This section analyses the challenges and details the solutions we implemented to achieve our overall goal.

Protocol Description
As the system is designed to be highly configurable, with different sampling frequencies available, different number of channels, and simultaneous EMG or ECG and inertial sensor acquisition, a versatile data transmission protocol was chosen. Indeed, since the only difference between ECG and EMG signal acquisition lies in the setting of the sampling frequency (ODR), as the integrated filters adjust automatically, there is no need to differentiate the two cases in the protocol, and in the following we will only refer to "EMG" for both cases.
The sensor implements the BLE generic attribute profile (GATT) server, which exposes three main custom services, besides the customary device information and battery services. The first custom service is the electromyography service, the second is the inertial platform service, the third is a system configuration and telemetry service. Each custom service provides a characteristic from which notifications of sampled data are sent, and a writable control characteristic to set operating parameters, like output data rate (ODR), full-scale settings, number of active channels, etc.
This enables both flexibility and ease of implementation at the receiver end. If the default data rates (800 Hz for EMG and 104 Hz for the inertial platform) are used, the receiver only has to enable notifications for the desired services after having established a connection, and decode the incoming stream of notifications. As an example, a schematic representation of the data format for the three custom characteristics is shown in Figure 3.
Figure 3. Fields in a gray background are present only in a few packets and contain metadata with structure and timing information. The EMG data format is flexible so that different combinations of active channels and samples per packet are allowed, as described by the associated metadata, and can result in shorter packets, with anything between one and six samples per packet.
The main data channels contain first an 8 bit sequence number, needed to properly reorder samples at the receiver in case of packet loss and retransmission, as will be detailed later, and last some optional metadata split into different "subchannels". Each subchannel is a 32 bit word of information, describing aspects of the stream such as ODR and channel config (subchannel 0), data sampling timestamps (subchannel 2), and packet transmission timestamps (subchannel 4), as depicted in Table 2.
For the EMG characteristic, since only 8 bits per packet are available, each subchannel word is split across four consecutive packets, which are identified by the sequence number $n$ in the packet: subchannel $m$ goes into the packets with $m = \lfloor n/4 \rfloor$, where $\lfloor \cdot \rfloor$ denotes the floor rounding operator. For the other characteristic, where 32 bits per packet are available, it is simply $m = n$.
To maximize Bluetooth bandwidth efficiency, multiple samples from the sensors are possibly accumulated and sent in a single notification packet, and multiple packets are sent within a BLE connection interval. This is depicted in Figure 4.
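As an illustration of the subchannel mapping for the EMG stream, the sketch below reassembles 32 bit subchannel words from the single metadata bytes carried by consecutive packets. The byte order within a word is not specified in the text, so least-significant byte first is assumed here; the function name and the input format are hypothetical.

```python
def reassemble_emg_subchannels(packets):
    """Rebuild 32 bit subchannel words from per-packet metadata bytes.

    `packets` is an iterable of (seq, metadata_byte) pairs, where `seq` is
    the 8 bit sequence number of an EMG packet that carried metadata.
    Returns a dict mapping subchannel index m to its 32 bit word.
    """
    fragments = {}
    for seq, byte in packets:
        m, k = divmod(seq, 4)  # m = floor(seq / 4), k = position within word
        fragments.setdefault(m, {})[k] = byte & 0xFF

    words = {}
    for m, frag in fragments.items():
        if len(frag) == 4:  # only complete words are returned
            # ASSUMPTION: least-significant byte is sent first.
            words[m] = sum(frag[k] << (8 * k) for k in range(4))
    return words
```

For example, the four bytes carried by packets 16 to 19 are combined into subchannel 4, while a word for which fewer than four fragments have arrived is not reported.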
As each physical sensor samples data, the samples are timestamped and put in a small ring buffer. The clock used for these timestamps is derived from the main HFXO, has a resolution of 1 µs, and is latched in hardware when the "data ready" signal from the sensor is asserted, so that it is not affected by jitter and delays introduced by firmware code execution. When enough samples have been accumulated to fill a BLE packet, the packet is assembled, a sequence number is assigned, and the resulting formatted data packet is "tentatively" scheduled for transmission.
"Tentatively" here means that, to deal with the possibility of congestion in the RF channel, we keep track of the number of packets Q in the TX queue, and avoid adding another packet to it if it is too full (Q ≥ 6). This can be done because packets are added to the queue by the application code and removed when a BLE link-layer acknowledgment (ACK) is received from the central: although we use notifications and not indications, link-layer ACKs are always mandated by the BLE protocol. The BLE stack also retries a packet indefinitely, until either an ACK is received or the connection is considered lost. To better demonstrate a typical sequence of events, Figure 5 shows a couple of examples without (top) and with (bottom) some RF interference.
Table 2. Subchannel definitions. Metadata is optionally added to packets to describe acquisition parameters such as data rate, number of active channels, full-scale ranges, and timing information. The type and format of metadata depends on which subchannel it is transmitted on. Subchannels are identified by the sequence number of the packet containing them. Four consecutive packets are needed to convey a single subchannel for the EMG stream, as only 8 extra bits are available in the data packets. For the GYR stream this splitting is not needed and a whole subchannel is included in a single data packet. Care has been taken in choosing subchannel indices, as of course configuration metadata is needed to interpret the main data and must come with the first packet, while transmission timestamps (which refer to the time packet #0 was transmitted) are available after the fact and their inclusion in the stream must be postponed to account for queuing delays.
As can be seen, the total supply current measurement provides many insights into the operation of the system, with clearly visible spikes corresponding to packet transmission. Spikes are grouped in bursts corresponding to each connection event.
Under normal operation, three EMG packets are transmitted during each connection event, and one GYR packet about every four EMG packets, as expected at the default data rates. RX timestamps are obtained in software on the PC, so there is a small amount of delay and jitter, but the sequence is clearly correct. On the bottom panel, on the other hand, it is possible to see that several connection events fail altogether (they could also be cut short instead of totally failing, though this did not happen in the time span shown). In this case, queued packets are simply delayed until the next connection event, when they are transmitted together with (actually, before) the newly arrived packets. As can be seen, a connection event has room for up to nine packets of this size, so there is quite some room available for retransmissions, since only three or four slots are normally used.
Figure 5. Total supply current to the sensor node and packet reception timestamps. Spikes around 15 mA correspond to the activation of the radio transmitter, while the lower spikes (around 8 mA) correspond to CPU activity (mostly sensor data readout). The baseline is higher than normal because the measurement was performed with a debug probe attached to better analyse timing and firmware behavior, but it does not otherwise affect the results. Stars (placed at an arbitrary height) denote the time at which a packet was received by the PC acting as central. The time scales of the current measurement instrument and of the PC with the Bluetooth receiver had been synchronized by servoing both clocks to the same local time server. A BLE 4.0 dongle was used for the measurement, so the on-air bit rate is 1 Mb/s.
This automatic retransmission scheme inherent in BLE ensures that a queued packet will eventually always reach the central, unless the connection is severed. However, it also means that the latency is potentially unbounded, which is undesirable in the present application, and in case of severe RF interference the TX queue can also fill up, leading to loss of the most recent packets in favor of the older ones already queued, another undesirable behavior.

To mitigate these problems, a separate retransmission FIFO is held by the application firmware, where packets that are not immediately sent to the TX queue (during congestions) are temporarily stored. This FIFO is much bigger than the TX queue (currently 256 entries in the FIFO and 16 in the TX queue) and operates in "overwrite" mode, with new packets overwriting the oldest ones in case it fills up. These packets will then be removed from the FIFO and queued for transmission only after the congestion is resolved, i.e., when Q < 2 as shown in the figure. Typically, one extra packet is inserted per connection event. These "late" packets are specially marked as such, and the bits normally reserved for subchannel data are instead used to extend the sequence number from 8 bit to 15 bit, so as to allow proper reordering at the receiver side in case they get delayed by more than the time it takes for the standard 8 bit counter to wrap around.
This strategy increases the likelihood that the most recent samples reach the receiver in a timely manner, while ensuring that, if some additional latency is tolerated, packets that could not be sent due to RF channel congestion/interference have a chance to be transmitted later and placed in the correct position in the data stream.
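The queueing strategy described above can be sketched as follows. This is only an illustrative model, not the firmware: the class and method names are hypothetical, the BLE stack TX queue is replaced by a plain list, and the exact moment at which "late" packets are drained (here, on each ACK that leaves Q < 2) is an assumption.

```python
from collections import deque

Q_FULL = 6       # do not queue new packets when Q >= 6
Q_RESUME = 2     # drain the retransmission FIFO only when Q < 2
FIFO_SIZE = 256  # retransmission FIFO entries


class TxScheduler:
    def __init__(self):
        self.q = 0  # packets currently in the BLE TX queue (Q)
        # deque(maxlen=...) drops the oldest entry when full ("overwrite" mode).
        self.fifo = deque(maxlen=FIFO_SIZE)
        self.sent = []  # stand-in for the BLE stack TX queue (hypothetical)

    def submit(self, index, payload):
        """Try to queue a freshly assembled packet; return True if queued."""
        if self.q >= Q_FULL:
            self.fifo.append((index, payload))  # congestion: park it
            return False
        self._enqueue(index & 0xFF, payload, late=False)  # 8 bit sequence
        return True

    def on_ack(self):
        """Called when the link layer acknowledges one queued packet."""
        self.q -= 1
        if self.q < Q_RESUME and self.fifo:
            index, payload = self.fifo.popleft()
            # Late packets carry an extended 15 bit sequence number.
            self._enqueue(index & 0x7FFF, payload, late=True)

    def _enqueue(self, seq, payload, late):
        self.q += 1
        self.sent.append((seq, late, payload))
```

Fresh packets therefore always take priority while the queue has room, and parked packets are only reinserted once the queue has almost emptied.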
As previously said, the protocol is also flexible in terms of sampling rates and EMG channel configurations. Figure 4 showed the case where a packet contains $N_S = 2$ samples of data, which happens, e.g., for a 3-channel EMG signal sampled at $F_S = 800$ Hz with a resolution of $N_B = 24$ bit. As each sample thus produces 9 bytes of data, two of them fit nicely into the maximum 20-byte GATT payload of BLE version 4.0, with room to spare for the packet index and one byte of subchannel data. A new packet is thus ready to be queued every 2.5 ms and, using the minimum allowed connection interval $T_I = 7.5$ ms, on average $N_P = 3$ packets will be sent per connection event. So, although the number of packets sent per connection event is not exactly fixed, because the BLE timing is controlled by the central, whose clock is not synchronized to the clock that drives the ADC on the peripheral, under normal circumstances there is still plenty of airtime for other data (inertial sensor, retransmissions, etc.).
Table 3 shows a selection of different possible buffering schemes allowed by the presented strategy. $N_C$ is the number of active channels, $F_S$ the specified ODR, and $N_B$ the resolution of each sample in bits. $N_S$ is then the number of samples per packet, chosen so that $N_G < 20$ bytes, where $N_G = N_C N_B N_S / 8$ is the number of data bytes per packet. The average number of packets per connection event is also computed as $N_P = T_I F_S / N_S$. Simultaneous EMG and ACC or ACC + GYRO data streaming is allowed as long as the total $N_P$ does not exceed the available airtime or the receiver queue handling capabilities, with six being the recommended limit from the SoC manufacturer.
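The packet budget arithmetic above can be checked with a few lines of code. The sketch below assumes that two bytes of the 20-byte payload are reserved for the sequence number and the subchannel byte, and that at most six samples are packed per packet; it simply picks the largest number of samples that fits, which may not match every buffering scheme actually listed in Table 3. The function name is hypothetical.

```python
MAX_PAYLOAD = 20  # GATT payload bytes available in BLE 4.0
OVERHEAD = 2      # ASSUMPTION: 1 byte sequence number + 1 byte subchannel
MAX_SAMPLES = 6   # upper bound on samples per packet


def buffering_scheme(n_channels, odr_hz, bits=24, conn_interval_s=7.5e-3):
    """Return (N_S, N_G, N_P) for a given channel count and data rate."""
    bytes_per_sample = n_channels * bits // 8
    n_s = min(MAX_SAMPLES, (MAX_PAYLOAD - OVERHEAD) // bytes_per_sample)
    n_g = n_s * bytes_per_sample            # data bytes per packet
    n_p = conn_interval_s * odr_hz / n_s    # packets per connection event
    return n_s, n_g, n_p
```

With three channels at 800 Hz this reproduces the case discussed in the text: two samples and 18 data bytes per packet, and three packets per 7.5 ms connection event on average.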

Receiver-Side Processing
The job of the receiver is to decode the incoming stream of notifications, interpret metadata, and reorder and synchronize the data streams so that they can be sent for easy processing to the data consumer(s).
Data consumers can be any module suited for real-time display of the signals, disk-recording of the streams for offline processing and analysis, artificial intelligence (AI) modules for automated real-time signal analysis, etc. They are outside the scope of the present paper; here we will only describe the processing needed to provide them with ready-to-use data streams.
Different consumers may pose different requirements on allowed latency. For instance, AI modules that then send alerts or otherwise react to critical situations might require very low latencies. Data display modules may tolerate latencies up to fractions of a second, while disk-recording is not sensitive to latency at all.
Hence, the receiver software we implemented provides multiple output ports from which data can be received by the consumers, and each port has an independently configurable latency. This ranges from "minimum latency" (around 9 ms, see later), where a direct path from receiver to consumer is formed, discarding delayed packets and with a coarse clock synchronization, to "infinite latency", suitable for off-line analysis. Of course, the higher the latency, the more time is given to the transmitter to perform "retransmissions" in case of packet loss, thus filling the gaps in the signal, and the more time the filters used for clock synchronization have to settle and reduce jitter. Time-critical applications must instead deal with the possibility of a higher packet-loss rate. This is unavoidable in a possibly lossy wireless link such as BLE.
A block diagram of the data flow within the receiver software is shown in Figure 6.
Figure 6. Block diagram of the data flow within the receiver software. Only one output port is shown; the "latency control" block is replicated for each port using a different value of the latency $t_{pn}$, $n$ being the port number.
The BLE stack provides the incoming stream of notifications to the receiver. Each notification is tagged with its reception timestamp (as measured by the PC clock, assumed synchronized to a real-time clock) and the GATT handle through which it was received, so that the packet type can be recognized. The first step is to extract metadata from the GATT payload, namely, the 8 bit sequence number and subchannel data (configuration information, sampling timestamps $t_{AD}$, and packet transmission timestamps $t_{TX}$).
Next, the sequence number must be unwrapped to form the complete packet index (32 bit in our implementation, but it can easily be extended if need be). Transmit-side retransmissions and the possibility of packet loss (packets might also be lost at the receiver side if the CPU is under stress from unrelated tasks) must be accounted for. Thus, the reconstruction procedure makes use of the packet reception timestamps $t_{RX}$ to rebuild the higher-order bits of the index as follows. Let $\Delta T$ be the time interval between the reception of the previous packet and the current packet, and let $T_P$ be the expected (average) time interval between two packets. Let $n_1$ be the current packet sequence number (LSB only), and $n_0$ the same for the previous packet, so that we can define $\Delta n = (n_1 - n_0)_{256}$, where the symbol $(\cdot)_{256}$ denotes signed modular arithmetic, such that $-128 \le \Delta n < +128$. Let us then finally define
$$N = \frac{\Delta T}{T_P} - \Delta n$$
as the difference between the expected number of packets within the time elapsed and the actual increment of the sequence number. Normally, $N$ should be close to 0. Specifically, since packets are transmitted in bursts and up to six packets might be queued for transmission waiting for the next connection event, it should be $-6 < N < 6$, even when a counter wraparound occurs, as this is taken care of by the modular arithmetic in computing $\Delta n$. In case of a wraparound during a long sequence of missed packets, on the other hand, $N$ might be close to an integral multiple of 256. It thus suffices to add
$$\Delta i = \Delta n + 256 \left[ \frac{N}{256} \right]$$
(where $[\,\cdot\,]$ denotes rounding to the nearest integer) to the current value of the packet index $i$ kept in the receiver memory to obtain the correct, unwrapped, index.
Of course, the above procedure does not make sense in case the packet being processed is an automatic retransmission of a previously lost packet. In such a case, $\Delta T$ is meaningless. These packets can be detected by a special flag bit set in the sequence number itself, together with the presence of subchannel data. This allows 15 bit of sequence number to be transmitted, enough to avoid wraparounds for over 80 s. Packets older than that will simply be discarded at the transmitter side, as it would not make much sense to keep retrying packets that old.
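The index unwrapping step for regular (non-late) packets can be sketched as below, following the verbal definitions given above: the signed modular difference of the 8 bit sequence numbers, the mismatch between the expected and the observed increment, and a correction by the nearest multiple of 256. Function and argument names are hypothetical.

```python
def signed_mod256(x):
    """Signed modular difference, mapped to the range -128..127."""
    return (x + 128) % 256 - 128


def unwrap_index(prev_index, prev_seq, seq, delta_t, t_packet):
    """Return the full packet index for a packet with 8 bit number `seq`.

    prev_index -- full index of the previously received packet
    prev_seq   -- its 8 bit sequence number
    delta_t    -- time elapsed between the two receptions (s)
    t_packet   -- expected average time between packets (s)
    """
    dn = signed_mod256(seq - prev_seq)
    # Mismatch between expected and observed increment; close to a
    # multiple of 256 when whole wraparounds were missed.
    n = delta_t / t_packet - dn
    return prev_index + dn + 256 * round(n / 256)
```

As a usage example, if 300 packets elapse between two receptions at one packet every 2.5 ms, the sequence number only advances by 44 modulo 256, and the time-based term restores the missing 256.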
Of course, separate indices are kept for the two main data services. As also depicted in Figure 6, these indices are then used to store the data part of the received packet in the proper position within dedicated reordering buffers, so that retransmitted packets end up in the correct order.
The output port control logic for port $n$ then extracts packets from these buffers after a specified latency $t_{pn}$, i.e., when $t > t_{AD} + t_{pn}$, where $t_{AD}$ is the estimated sampling time derived from the packet index, to allow time for retransmissions to arrive and to let the clock recovery block do its calculations and filtering. Extracted packets are then decoded from their on-air format to physical units, and individual samples are sent out to the appropriate data consumer together with their estimated sampling timestamp $\hat{t}_{AD}$, obtained from $t_{AD}$ after adjustment of the timescale of the sender to that of the receiver, as will be shown in the next section.

Clock Synchronization
This last task is performed by the "clock recovery" block, whose operation can be described with the aid of Figure 7. In the figure, dots represent the difference $\delta = t_{RX} - t_{TX}$, where $t_{RX}$ represents the real time at which a particular packet (all those with sequence number equal to 0) was received, and $t_{TX}$ represents the time, according to the device clock, at which the same packet was transmitted. Note that the BLE protocol itself does not guarantee latency, so transmit timestamps can only be obtained after the fact, when the packet has already been transmitted. For this reason these timestamps are sent in subchannel 4 (packets 16-19 of the EMG sequence), to allow enough time for packet 0 to actually be transmitted under normal circumstances. This difference allows two essential pieces of information to be extracted: the absolute difference $d$ between transmit clock and real time (not shown in the figure because it varies randomly between runs, as the clock does not have a fixed starting epoch), and the clock drift $r \approx -\partial\delta/\partial t_{RX}$ (the slope of the interpolating curve).
Individual measures are inevitably noisy (RX software timestamping cannot be very accurate), so, after removing outliers, they are cleaned by a double exponential smoothing filter that aims to approximate in real time the true regression line. A double exponential smoothing filter was employed to avoid the delays introduced by traditional low-pass filters when processing ramp-like signals, but for its proper operation an estimate of the drift is needed. Equation (4) will adjust and track slow drift variations, such as those due to crystal ageing and temperature variations, but with such a big $\tau_r$ it would take hours to settle initially. The initial drift estimate is thus performed at calibration time by linear regression and stored within the system, so that (4) will only need to track small, slow changes.
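To illustrate the idea, a minimal sketch of a double exponential (Holt-style) smoothing filter tracking the offset and the drift is given below. It is not the filter defined by the equations of this paper: the update form, the time constants `tau_d` and `tau_r`, and all names are assumptions made for illustration only. The sign convention follows the definition of the drift as the negative slope of the offset.

```python
import math


class DriftTracker:
    """Tracks the smoothed offset d and the drift r = -d(delta)/dt."""

    def __init__(self, delta0, t0, drift0=0.0, tau_d=10.0, tau_r=3600.0):
        self.d = delta0   # smoothed offset estimate
        self.r = drift0   # drift estimate (e.g., from calibration)
        self.t = t0       # time of the last update
        self.tau_d = tau_d  # ASSUMED time constant for the offset (s)
        self.tau_r = tau_r  # ASSUMED time constant for the drift (s)

    def update(self, t_rx, delta):
        """Feed one measurement delta = t_rx - t_tx; return smoothed offset."""
        dt = t_rx - self.t
        if dt <= 0:
            return self.d
        predicted = self.d - self.r * dt  # ramp extrapolation
        err = delta - predicted
        a = 1.0 - math.exp(-dt / self.tau_d)
        b = 1.0 - math.exp(-dt / self.tau_r)
        self.d = predicted + a * err      # level update
        self.r -= b * a * err / dt        # slope (drift) update
        self.t = t_rx
        return self.d
```

Because the prediction step already follows the ramp, a noise-free measurement sequence with the correct initial drift is tracked without lag, which is the property that motivates this kind of filter over a plain low-pass filter.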
The smoothed curve d[i] provides the necessary information to convert the sampling timestamps as transmitted in subchannel 2 into the corresponding real-time timestamps, by inverting the relation
To ease the process, the received measures, which due to noise may not be uniformly sampled along either the RX or the TX time base, are first resampled to a uniform interval T 0 along the t TX time scale using linear interpolation. Let us call t RT [n] the interpolated t RX real time corresponding to transmit time t TX = n T 0 .
Using a uniform sampling time T 0 = 1 s makes the resampling process effectively neutral from a noise perspective, as shown in Figure 8, which reports samples of the t TX → t RX relation both before (dots) and after (stars) resampling. To better highlight the curves, only the difference between the estimated real-time t RX and the a posteriori computed regression line is shown.
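
The resampling step can be sketched as a plain linear interpolation of the (t TX, t RX) pairs at the grid points t TX = n T 0. The function below is a minimal illustration with hypothetical names; it assumes at least two input points with strictly increasing t TX.

```python
import math


def resample_uniform(t_tx, t_rx, t0=1.0):
    """Linearly interpolate t_rx at t_tx = n * t0.

    t_tx must be strictly increasing with at least two points.
    Returns (n_first, t_rt), where t_rt[k] is the interpolated real time
    corresponding to transmit time (n_first + k) * t0.
    """
    n_first = math.ceil(t_tx[0] / t0)
    n_last = math.floor(t_tx[-1] / t0)
    t_rt = []
    j = 0
    for n in range(n_first, n_last + 1):
        target = n * t0
        # Advance to the segment [t_tx[j], t_tx[j + 1]] that contains target.
        while j + 2 < len(t_tx) and t_tx[j + 1] < target:
            j += 1
        frac = (target - t_tx[j]) / (t_tx[j + 1] - t_tx[j])
        t_rt.append(t_rx[j] + frac * (t_rx[j + 1] - t_rx[j]))
    return n_first, t_rt
```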
The uniformly sampled points are then interpolated (or extrapolated in case of low-latency output ports) to associate the estimated sampling timestamp according to the receiver-side time base. The quality of this interpolation/extrapolation procedure clearly depends on the requested output latency. In the lowest latency ports, it is only possible to extrapolate the latest measure using the current drift estimate, leading to the horizontal tracts of the "staircase" line in Figure 8 (tracts appear horizontal instead of approximately 45° because the overall regression line had been subtracted for graphical purposes). Increasing the latency to above 1 s allows linear interpolation between the last two samples to be performed (red line). Higher latencies, lastly, allow more aggressive filtering to be performed. In this case we interpolated the data using Gaussian FIR low-pass filtering with impulse responses twice as long as the requested latency, so as to allow a high degree of smoothing while keeping the filter causal as required for real-time implementation, as follows.
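
The two simplest cases, extrapolation from the latest sample and linear interpolation between two samples, can be sketched as follows. The function is a hypothetical illustration that assumes the uniformly resampled sequence t RT [n] starts at n = 0; the extrapolation uses the relation dt RX = dt TX /(1 + r), which follows from the definition of the drift r.

```python
def tx_to_rx_time(t_ad, t_rt, t0, drift):
    """Map a sender-side time t_ad to the receiver time base.

    t_rt[n] is the receiver time for sender time n * t0 (n = 0, 1, ...).
    Interpolates linearly when both bracketing samples exist; otherwise
    extrapolates from the latest sample using T_D = t0 / (1 + drift).
    """
    n = int(t_ad // t0)
    if 0 <= n < len(t_rt) - 1:
        frac = (t_ad - n * t0) / t0
        return t_rt[n] + frac * (t_rt[n + 1] - t_rt[n])
    last = len(t_rt) - 1
    return t_rt[last] + (t_ad - last * t0) / (1.0 + drift)
```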
Given a t AD time to convert, the filter operates similarly to the double exponential filter, except that the drift estimate is already available, and the differences are filtered using a FIR filter instead of an IIR filter. Indeed, in this case it is not possible to directly filter the ramp-like signal, as the residual ringing of the filter would then increase in amplitude with time, eventually exceeding the amount of wander it was supposed to reduce in the first place. Let n = t AD /T 0 be the "center" point of the FIR filter, then, letting T D = T 0 /(1 + r) be the real-time equivalent of the interval T 0 , estimated using the current drift estimate r, the difference between t RT and the interpolating straight line can be computed as
with i spanning the interval n − L, . . . , n + L. This difference e[i] can then be filtered by convolution with the Gaussian impulse response g(τ), and the final RX-side time estimate is then obtained by re-adding the straight line
The Gaussian impulse response is defined as
where w(τ) is a suitable raised-cosine window with a small roll-off factor β = 1/4, employed to reduce the effects of truncation errors. The Gaussian response was chosen because it limits ringing in its step response, which helps reduce the residual wander. As can be seen from the figure, jitter is also effectively completely eliminated by the Gaussian filters.
Of course, in order for (6) to be computable, the output port latency t p must satisfy t p > L T D , otherwise some samples of t RT [i] might not already be available. Actually, even with proper latency, some of these will not be available near the beginning of the transmission, and after it is stopped, when data buffers are flushed to the output ports with no other incoming data. During these transients, to allow the system to operate, missing samples are extrapolated using the current drift estimate as usual.
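
The following sketch shows one possible reading of this procedure for an integer center index n: residuals with respect to a straight line of slope T D are filtered with windowed Gaussian taps, and the line is then re-added. Several details are assumptions made for illustration only: the line is taken through t RT [n], the Gaussian width `sigma` (in samples) is a free parameter, the window is a flat-top raised cosine on the normalized support, and the taps are normalized to unit DC gain. All function names are hypothetical.

```python
import math


def raised_cosine_window(x, beta=0.25):
    """Window on normalized |x| <= 1: flat top, cosine roll-off of width beta."""
    x = abs(x)
    if x <= 1.0 - beta:
        return 1.0
    if x >= 1.0:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (x - (1.0 - beta)) / beta))


def gaussian_taps(half_len, sigma, beta=0.25):
    """Windowed Gaussian taps g[-L..L], normalized to unit DC gain."""
    taps = [
        math.exp(-0.5 * (k / sigma) ** 2) * raised_cosine_window(k / half_len, beta)
        for k in range(-half_len, half_len + 1)
    ]
    total = sum(taps)
    return [t / total for t in taps]


def smooth_time(t_rt, n, half_len, sigma, drift, t0=1.0):
    """Filter the residuals around index n and re-add the straight line."""
    t_d = t0 / (1.0 + drift)          # real-time equivalent of t0
    taps = gaussian_taps(half_len, sigma)

    def line(i):
        # Assumed reference line: slope t_d, passing through t_rt[n].
        return t_rt[n] + (i - n) * t_d

    resid = [t_rt[i] - line(i) for i in range(n - half_len, n + half_len + 1)]
    return line(n) + sum(g * e for g, e in zip(taps, resid))
```

Because the taps sum to one, the result does not depend on the vertical offset of the reference line; only its slope, set by the drift estimate, matters. Filtering the residuals rather than the ramp itself keeps the filtered quantity bounded, in line with the motivation given in the text.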
Lastly, to obtain the proper t AD to transform, a relationship between packet index (or sample number) and sampling timestamp must be derived. In the case of the EMG data, the relationship between sequence number and timestamp is fixed, since the clocks used for timestamping and for sampling the data are both derived from the same HFXO. This allows an accurate signal alignment to be made at the receiver side even in case of packet loss and clock drifts between central and peripheral, because sampling timestamps can be computed exactly from packet index and the initial offset transmitted in subchannel 2. For GYR data, on the other hand, there is not an exact linear relationship between packet index and sampling time, because sampling is governed by an internal clock within the sensor chip that is not externally synchronizable. For these data, sampling timestamps can still be accurately estimated using the same interpolation procedure described above, from the metadata always transmitted in subchannel 2. Given the typically lower data rates of GYR data with respect to EMG data, a slightly higher synchronization error is usually not perceivable.
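
For the EMG case, the fixed relationship between packet index and sampling time can be sketched as below. The function assumes a constant number of samples per packet; `samples_per_packet` and `t_offset` are hypothetical parameter names, and the actual packet layout is the one defined by the transmission protocol.

```python
def emg_sample_times(packet_index, samples_per_packet, odr_hz, t_offset=0.0):
    """Sender-side sampling times of the samples carried by one packet.

    Assumes every packet carries samples_per_packet consecutive samples
    taken at a constant output data rate odr_hz, starting at t_offset.
    """
    first = packet_index * samples_per_packet
    return [t_offset + (first + k) / odr_hz for k in range(samples_per_packet)]
```

Since the times depend only on the packet index, a lost packet leaves a gap in the data but does not shift the timestamps of the packets that follow.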

Results
In this section the operation and performance of the prototype built will be analysed both with artificial signals, to characterize the system frequency response, noise floor, overall latency and synchronization, and power consumption, and with natural signals, to demonstrate its operation in a real scenario.

Frequency Response
The system frequency response was determined using a sinusoidal function generator to produce frequencies from 0.1 Hz to 1 kHz, using three discrete steps per decade at an amplitude of 5 mV. The signal was then acquired in the digital domain via the bluetooth connection, and digitally demodulated by means of a Hilbert transform to determine the exact frequency and amplitude for each step.
Results are shown in Figure 9 for the three different data rates. As can be seen, the passband is almost exactly flat, and the signal bandwidth, which can be estimated by computing the intersections of the interpolating curves to the −3 dB line, is approximately 1/5 of the output data rate, namely 159 Hz, 318 Hz, and 633 Hz respectively for 800 Hz, 1600 Hz, and 3200 Hz ODRs.
Figure 9. Measured frequency response of the EMG signal path, with the output data rate set to 800 Hz, 1600 Hz, and 3200 Hz (markers denote measured data points, solid lines are interpolated).
Figure 10 reports the power spectral density of the noise floor, measured connecting the inputs of an EMG channel to a reference voltage source. The RMS noise is 2.1 µV, 2.9 µV, and 4.3 µV for data rates of 800 Hz, 1600 Hz, and 3200 Hz, respectively. For comparison, the power spectral density of a typical EMG signal recorded from the biceps brachii during a dumbbell exercise (see next section for details) is also shown in black on the same figure. It is worth noticing that there are no line frequency disturbances, as all precautions have been taken in the design of the circuit board to balance input lines to avoid common-mode to differential conversion.
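
Returning to the bandwidth figures quoted for Figure 9, the −3 dB intersection can be located from a handful of measured points as sketched below. The sketch assumes linear interpolation of the gain (in dB) over log-frequency, which may differ from the interpolating curves actually used for the figure; the function name is hypothetical.

```python
import math


def bandwidth_minus_3db(freqs_hz, gains_db, level_db=-3.0):
    """First downward crossing of level_db, interpolated in log-frequency.

    freqs_hz must be increasing and positive. Returns None if the gain
    never drops below level_db within the measured range.
    """
    points = list(zip(freqs_hz, gains_db))
    for (f0, g0), (f1, g1) in zip(points, points[1:]):
        if g0 >= level_db > g1:
            frac = (g0 - level_db) / (g0 - g1)
            log_f = math.log10(f0) + frac * (math.log10(f1) - math.log10(f0))
            return 10.0 ** log_f
    return None
```

The reported bandwidths correspond to ratios of 159/800 ≈ 0.199, 318/1600 ≈ 0.199, and 633/3200 ≈ 0.198 of the output data rate.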

Latency and Synchronization
In some applications involving EMG sensors to detect muscle activity and then operate in the environment, or assist the patient, latency may be of concern.
To evaluate the system performance in this regard, a synthetic signal consisting of a pulse train, with 10,000 pulses spaced by 29.9 ms (about 33.4448 Hz), and accurately synchronized to the clock of the receiver (with a measured average synchronization error of just 154 ns), was generated. The pulse spacing was selected to not be a multiple of the connection interval, so that pulses occur at random times with respect to the BLE timing. At the receiver, each received packet is timestamped and the instant at which the signal crosses a threshold (set at mid-height of the pulse) is interpolated from these time stamps. The latency so defined thus includes all the effects of the amplifiers, converters, digital filters, buffering, transmission, and reception.
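
The interpolation of the threshold-crossing instant can be sketched as follows, assuming simple linear interpolation between the two samples that straddle the threshold. The function name is hypothetical, and the per-sample times are assumed to have already been derived from the packet timestamps.

```python
def crossing_time(times, values, threshold):
    """Time of the first upward crossing of threshold, linearly interpolated.

    Returns None if the signal never crosses the threshold upwards.
    """
    for i in range(1, len(values)):
        if values[i - 1] < threshold <= values[i]:
            frac = (threshold - values[i - 1]) / (values[i] - values[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None
```

The latency of one pulse is then the difference between this crossing time and the known emission time of the pulse.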
The distribution of the observed latencies is shown in Figure 11, the average being 8.8 ms. Please note that the ADS1293 incorporates a three-stage fifth-order sinc digital filter (sinc⁵), and a sinc⁵ filter at an 800 Hz data rate is on average responsible for a 2.5 ms latency.
Figure 11. Histogram of the measured latency of the EMG signal path, with the output data rate set to 800 Hz, tested using 10,000 occurrences. The bluetooth connection interval was set to the minimum allowed by the specifications, i.e., 7.5 ms.
Of course, the data in the above-referenced figure reflect what can be achieved from a "real-time" (zero latency) output port, and represent the elapsed time from the physical event to when data is actually available. Timestamps associated with such data are still compensated for transmission delays and can be used to precisely locate events in time, exploiting the clock synchronization algorithm detailed above.
To demonstrate this, a similar experiment was performed, wherein 10 bursts of 100 impulses, each 10 ms wide, were applied at random times t j , and the measured signal plotted using t AD − t j as the horizontal axis. An ensemble showing one impulse for each burst is reported in Figure 12. The delay from the input edge to the mid-swing crossing, averaged over all 1000 realizations, is 2.53 ms, and the RMS timing error is 6.6 µs.
Similarly, accurate synchronization is also achieved on the inertial platform. To test this, a stimulus which elicits simultaneous electrical and mechanical responses was applied by tapping the on-board EMG electrode connectors with a small, solid, conductive rod. The tap is meant to cause the board to rotate, thus the motion can be registered by the gyroscope, and the electrical contact is registered as a train of disturbances on the EMG channel as the rod makes contact and rebounds on the connector. Figure 13 shows the resulting signals, recorded after 15 min of system operation to allow ample time for a possible oscillator drift to manifest itself. Actually, the estimated drifts are such as to cause more than a 5 s shift between the pulses, if uncorrected. After reclocking, instead, the onsets on the two signals appear simultaneous, as they should be.
Finally, it is of interest to analyse the effects of radio packet loss, and of the retransmission strategy adopted, on the system latency. To this end, Figure 14 shows the results of a simple "transmission quality" test performed over a 10 min time span. For the experiment, the sensor was kept within normal distance from the bluetooth receiver (a few meters) for the first 100 s. Then, the antenna was shielded to cause a surge in packet loss rate from about 100 s to 200 s. Then again, from 300 s to about 500 s the sensor was brought away from the receiver until an average 10 % TX loss rate was observed.
A last move of the sensor at the edge of the useful radio range before bringing it back to normal distance caused the last increment in TX loss rate and the ramp in delay. It did not cause a large RX loss only because of its relatively short duration that did not cause the retransmission buffer to overflow. In detail, the top left panel of the figure shows a dot for each retransmitted packet (from our retransmission buffer), each representing the delay between the time it should have been received and the actual reception time. The bottom left panel shows the percentage of the packets lost at the TX side (packets being discarded by our protocol because the TX queue is already full, line labeled "TX"), along with the percentage of packets still missed at the receiver after allowing enough time for retransmissions to eventually arrive (line labeled "RX"). The lines are moving averages over a 5 s span.
Since it is not easy to generate a steady state of transmission errors with a prescribed probability using the real hardware, results from a system-level simulation of the retransmission protocol are also shown on the right, as a function of the raw packet loss rate, so that they can be used as a reference. This was a simple simulation to verify the correctness of the protocol implementation and its effectiveness in mitigating radio channel congestion situations, and is by no means meant to be extremely accurate. Indeed, it was performed using the SystemC framework to model the behavior of the sensor node in queuing and transmitting packets, implementing the very same algorithm implemented in firmware as detailed earlier in this paper, while the radio channel was simulated using a very simple packet loss model, which assumes the same raw packet loss probability p both as the probability of missing a connection event altogether, and the probability of losing a data packet within the connection event, independently of packet length. Despite these simplifications, the model matches experimental results quite well, as can be seen by comparing the relative heights of the blue and red lines during the surge of errors in the experiment (left) and the simulation (right).
Moreover, as can be seen in the bottom right panel, which reports steady state averages for varying p, retransmissions effectively avoid losing data for p < 17 %. The "TX" curve is usually lower than p because automatic link-layer retransmissions are not counted (nor are they accessible by firmware with the stack used) and we are counting the packets "set aside" by the transmitter. As a reference, the curve p + (1 − p) p is also plotted (labeled "air"), to denote the air-time capacity reduction due to the possibility of losing connection events. Since our protocol only retransmits one packet per connection event, and there are on average three fresh data packets per connection event with the settings adopted, only a 1/3 reduction in channel capacity can be recovered. Indeed, the "air" curve crosses the 1/3 level at the same p at which the "RX" curve starts to rise from zero.
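
The crossing point of the "air" curve can be checked directly, since p + (1 − p) p = 1 − (1 − p)². The short sketch below, with hypothetical function names, solves this for a given capacity reduction.

```python
import math


def air_loss(p):
    """Air-time capacity reduction p + (1 - p) * p for raw loss probability p."""
    return p + (1.0 - p) * p


def loss_at_capacity_reduction(level):
    """Value of p in [0, 1] for which 1 - (1 - p)**2 equals level."""
    return 1.0 - math.sqrt(1.0 - level)


p_limit = loss_at_capacity_reduction(1.0 / 3.0)
print(f"air curve reaches 1/3 at p = {p_limit:.4f}")
```

The closed form gives p = 1 − √(2/3) ≈ 0.18, which is close to the threshold of about 17 % quoted above for the simulated "RX" curve.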
Finally, in the top right panel the expected ("mean") and maximum ("max") value of the simulated delays are reported versus p. These can be useful to set the latency of the output ports. As can be seen, a 5 s latency allows for every retransmission to arrive, but 0.5 s is enough for p < 15 %.
Table 4 reports the measured current consumed by the system (from the battery) in its different operational states. Of course, power consumption in the active states is strongly dependent on the firmware being run, and further reductions are expected by optimizing critical code paths so that the processor can spend more time sleeping. As it stands, with a small 250 mAh battery, the sensor can be used to stream three channels continuously for over 2 days, and can sit in idle mode for over a year.
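
As a rough consistency check of these battery-life figures, the sketch below relates capacity, average current, and run time. It assumes that the quoted battery rating is a capacity of 250 mAh and that the battery behaves ideally (no self-discharge or derating); the function names are hypothetical and the currents of Table 4 are not reproduced here.

```python
def battery_life_hours(capacity_mah, average_current_ma):
    """Ideal run time, in hours, at a constant average current."""
    return capacity_mah / average_current_ma


def max_average_current_ma(capacity_mah, hours):
    """Largest average current that still lasts the given number of hours."""
    return capacity_mah / hours


streaming_limit_ma = max_average_current_ma(250.0, 48.0)               # two days
idle_limit_ua = max_average_current_ma(250.0, 365.0 * 24.0) * 1000.0   # one year
print(f"streaming: below {streaming_limit_ma:.2f} mA, idle: below {idle_limit_ua:.1f} uA")
```

Under these assumptions, streaming for over 2 days implies an average current below about 5.2 mA, and idling for over a year implies an average current below about 28.5 µA.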

Real-Life Example
To test the operation and performance of the device in a real-life situation, a data recording session was performed in the laboratory with the device worn by the experimenter on the upper portion of the right arm while performing three different simple exercises commonly used during fitness training or rehabilitation, namely, biceps curls, lateral raises, and vertical raises. Of course, the experimenter signed a written informed consent form before performing the tests.
The EMG signals were recorded from the biceps brachii, deltoideus medium, and triceps brachii, using standard 26 mm Ag/AgCl pre-gelled ECG electrodes (model PG10C made by FIAB). They have a nominal impedance of 55 Ω at 10 Hz and manifest a very low offset voltage of only 2.1 mV. The positioning was chosen so that the three different exercises should result in three different, clearly identifiable, activation patterns of the muscle groups. The EMG signal was acquired at the default 800 Hz sampling rate, which happens to be the maximum at which three channels can simultaneously be streamed over BLE 4.0, and at this ODR the frontend has a resolution of about 0.16 µV, with an integrated RMS noise of 2.1 µV.
Inertial data was also simultaneously recorded from the sensor, at the default 104 Hz ODR, with a 500 °/s gyroscope full-scale sensitivity and an 8 g accelerometer full-scale sensitivity. The sensor was secured to the arm with a hook-and-loop closure so as to limit spurious oscillations and vibrations of the device during movements. The setup is shown in Figure 15, together with a description of the orientation of the inertial sensor.
Figure 15. Positioning of the electrodes and device on the upper portion of the right arm. The electrodes identified by the yellow tab contact the biceps brachii, those identified with the orange tab the triceps brachii, and those with the blue tab the deltoideus medium. The orientation of the mechanical axes is visible in the right picture, showing the interior of the case with the battery on top of the circuit board. Once worn, axes are oriented so that the Y direction is along the arm, pointing downwards at rest, X points towards the front, parallel to the sagittal plane, and Z points towards the arm, along the coronal plane.
The acquisition was performed using an ordinary laptop PC as a bluetooth receiver, using the integrated wireless adapter compatible with the 4.0 specifications, and the custom-made software described above to record and synchronize the stream of incoming notifications.
The raw recorded EMG signals are shown in Figure 16 (left panel), from which it is possible to see that the biasing network operates as expected, as the signals are nearly exactly centered around zero. Since the ADS1293 input stage can accept common mode voltages up to 0.95 V from the rails, there is ample headroom for possible DC wander induced by cable movements or skin stretch. Indeed, since the system shall also be used to record ECG signals, no analog high-pass filtering was included, as is instead customary for EMG signals. The high common-mode compliance and large differential-mode full-scale of the converter, combined with the biasing strategy, make the analog filter unnecessary.
Yet, to better appreciate the EMG signal due to muscle contraction and remove the DC wander, a high-pass digital filter with a cut-off frequency of 1 Hz was applied to the recorded signals. The result is shown on the right panel of the same Figure 16, from which it is easier to assess muscle involvement during the three different exercises. As can be seen, during biceps curls, only the biceps brachii is involved, with just a minor activation of the triceps brachii for stabilization. All the three muscles are somewhat involved during the other exercises, with the deltoideus medium being exerted the most during lateral raises, as it should be.
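
The type and order of the digital high-pass filter are not specified in the text. As a minimal sketch of the idea, the following first-order (RC-type) high-pass filter removes a constant offset while leaving the EMG band essentially untouched; the function name and the choice of a single-pole filter are assumptions.

```python
import math


def highpass_1pole(samples, cutoff_hz, fs_hz):
    """First-order high-pass filter: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # equivalent RC time constant
    a = rc / (rc + 1.0 / fs_hz)
    out = [0.0]                               # start from rest
    for prev, cur in zip(samples, samples[1:]):
        out.append(a * (out[-1] + cur - prev))
    return out
```

For example, `highpass_1pole(x, 1.0, 800.0)` applies a 1 Hz cut-off to a signal sampled at 800 Hz. A higher-order or zero-phase filter could be substituted when a sharper transition or no phase distortion is required.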
The raw data coming from the inertial sensor is shown in Figure 17. The angular velocity (left panel) clearly shows that the upper arm was kept essentially at rest during the first exercise (biceps curls), and the concentric and eccentric phases of the other exercises can easily be identified by just looking at the sign of the velocity. Acceleration can also be used to infer the inclination of the arm, as easily seen in the right panel of the same Figure 17. Of course, the two sets of data can be combined together to compute an orientation vector, possibly with the help of a magnetometer (not included in this first prototype), but this is outside the scope of the present experiment.
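
As an illustration of how inclination can be inferred from acceleration, the sketch below computes the angle between the sensor Y axis and the downward vertical under quasi-static conditions. It assumes that an axis pointing upwards reads +1 g at rest, so that the Y axis, which points downwards with the arm at rest, reads −1 g; the actual sign convention of the sensor may differ. The function name is hypothetical.

```python
import math


def arm_inclination_deg(ax, ay, az):
    """Angle between the Y axis and the downward vertical, in degrees.

    Valid only when the measured acceleration is dominated by gravity.
    Assumed convention: an axis pointing upwards reads +1 g at rest.
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("zero acceleration vector")
    cos_angle = max(-1.0, min(1.0, -ay / norm))
    return math.degrees(math.acos(cos_angle))
```

With this convention the function returns 0° with the arm hanging at rest and 90° with the arm raised to the horizontal. During fast movements the linear acceleration of the arm adds to gravity and the estimate degrades.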
To complete the experimentation, the capability of the sensor to acquire an ECG signal was also tested. Figure 18 shows the unfiltered ECG signal as acquired using a two-lead configuration from the chest. As can be seen, the signal is quite clean and no visible artifacts are present.

Conclusions
This work presented a wireless sensor able to acquire either EMG or ECG signals, with up to three channels, while simultaneously capturing movement information from an inertial platform. It uses a bluetooth low-energy radio for real-time data transmission so as to minimize energy consumption, and a carefully designed transmission scheduling maximizes the data throughput while keeping latency reasonably low, as demonstrated by extensive experimentation. This newly proposed adaptation layer on top of BLE also allows tight time synchronization, and provides reliability of the data transmission by ensuring lossless communication with up to 1/3 of the radio channel capacity compromised, which are all important features if the sensor is to be used to perform research or real-time monitoring. The acquired signal quality was assessed both by objective measurements and in a real-life scenario, and it met design expectations.
The design of the whole system was driven by energy efficiency and ease of use as primary objectives. Indeed, using the motion-triggered wake-up capability provided by the accelerometer helped achieve both goals simultaneously, as the system just activates itself when worn, without needing any operator intervention. In addition, the use of a very widespread radio technology such as bluetooth allows commodity hardware such as smartphones or laptops to serve as data receivers, further simplifying deployment.