#### 3.1. Compression

The plots in Figure 8a are representative waveforms of the raw ADC codes for GCG-Z, SCG-Z, finger PPG, chest PPG 1, and ECG recorded from Subject 4 at 3200 Hz. Some interference was present in the SCG-Z signal (in the form of impulsive noise), which was likely due to the unshielded cables used in the experiments. The amplitudes and signal quality of the GCG-Z and SCG-Z are rather low, but the ECG, chest PPG, and finger PPG waveforms all exhibit markedly higher signal quality. The finger PPG signal is very clean and has such a high dynamic range that the waveform is close to saturating beat-to-beat. The amplitudes of the chest PPG and ECG are also quite strong, with clean signal features visible.

Table 1 reveals that our proposed method enabled a table size decrease of 67–99% across all sensing modalities and sampling rates, thereby reducing the MCU's memory consumption. In particular, the average table size for the standard method is 1728 entries for SCG-Z data recorded at 3200 Hz, which demonstrates why that method is impractical for memory-constrained embedded systems: it would require, at a bare minimum, 19 kB of memory for the SCG-Z channel alone (11 B for each table entry: 2 B for the key, 4 B for the code, 1 B for the code length, and 4 B for the pointer to the next entry in the hash table bucket). In contrast, with the same 11 B allocation per table entry, the worst-case memory usage for our proposed method is only 330 B. The reason that the table sizes are much larger for SCG data than for the other sensing modalities is likely the impulsive noise, which violates the underlying assumption of differential pulse code modulation that adjacent samples in naturally occurring signals are highly correlated. The impulse-like QRS complex in the ECG similarly causes its table sizes (with the standard method) to be significantly larger than those of the other sensing modalities.
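The memory figures above follow directly from the per-entry layout described in the text; the short sketch below reproduces the arithmetic (the function name is ours, for illustration only).

```python
# Memory footprint of a hash-table-based Huffman code table, using the
# per-entry layout described above: 2 B key + 4 B code + 1 B code length
# + 4 B pointer to the next entry in the bucket.
ENTRY_BYTES = 2 + 4 + 1 + 4  # = 11 B per table entry

def table_memory(num_entries: int) -> int:
    """Return the table's memory footprint in bytes."""
    return num_entries * ENTRY_BYTES

print(table_memory(1728))  # standard method, SCG-Z @ 3200 Hz: 19008 B (~19 kB)
print(table_memory(30))    # proposed method, worst case (k2 = 30): 330 B
```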

Another interesting observation from the results in Table 1 is that the hyperparameter search algorithm does not always greedily select a table of size ${k}_{2}$ (the specified maximum table size of 30), which may be due to either of the following reasons: (1) the trivial case when the requested table size is larger than possible, i.e., ${k}_{2}>\left|supp\left({H}_{{m}^{*}}^{\mathcal{T}}\right)\right|$; or (2) $\mathsf{\Phi}[{m}^{*},{k}_{2}]\ge \mathsf{\Phi}[{m}^{*},{k}^{*}]$. During post-processing, we only had access to $supp\left({H}_{m}^{\mathcal{T}}\right)$ for $m=0$ (in the form of the keys of the table for the standard method). However, for ${m}^{*}\ge 0$, Proposition A1 was used to check whether the trivial case could definitely be ruled out, which was indeed the case for 64 of the 65 times when the selected table size was less than ${k}_{2}$.

The second case, $\mathsf{\Phi}[{m}^{*},{k}_{2}]\ge \mathsf{\Phi}[{m}^{*},{k}^{*}]$, arises when increasing the table size results in either a tie or an increase in the total number of bits. The latter may occur as a consequence of the prefix-free property of Huffman codes, which states that no valid code is a prefix of another. To maintain this invariant, the code lengths of an initial set of probability classes may increase as the table size is increased. This can be seen in the simulated example in Figure 3, where the codes for probability Classes 1 and 2 would have been 0 and 1 (or possibly flipped), regardless of their relative frequencies, had they been the only entries in the table. However, the code length for Class 2 is two because of the other probability classes added to the table. If the relative frequencies of the additional classes in the validation data are not consistent with the training data (for example, if none of those additional table entries is present in the validation data), then the total number of bits increases, because the larger code length of samples in Class 2 is not compensated for by gains from the other table entries. It should also be noted that the codes of classes present in the validation data but not in the training data are of fixed length ($1+M$). Both observations demonstrate how, by splitting the data into training and validation datasets, the hyperparameter search algorithm selects codes that are more robust to noise than codes generated without the validation data.
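The code length growth described above can be reproduced with a toy Huffman construction. The frequencies below are made up for illustration and are not from the paper's tables; with only two classes, both receive one-bit codes, but adding rare classes lengthens the code of the second-most-frequent class.

```python
import heapq

def huffman_code_lengths(freqs: dict) -> dict:
    """Return the Huffman code length of each symbol for the given frequencies."""
    if len(freqs) == 1:
        return {s: 1 for s in freqs}
    # Heap entries: (subtree frequency, tie-breaker, list of member symbols).
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    counter = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        # Every symbol under a merged node gains one bit of code length.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, counter, s1 + s2))
        counter += 1
    return lengths

# With only two classes, both codes are a single bit (0 and 1)...
print(huffman_code_lengths({"class1": 45, "class2": 40}))
# ...but adding three rare classes pushes class2's code length to 2 bits,
# to preserve the prefix-free property.
print(huffman_code_lengths({"class1": 45, "class2": 40, "c3": 5, "c4": 5, "c5": 5}))
```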

The main observation that can be drawn from the results in Table 2 is that the compression ratios obtained with the proposed method match those of the standard approach, even with much smaller tables. In addition, the compression ratios for the GCG and chest PPG are, in general, larger than those of the finger PPG, ECG, and SCG, which suggests that impulsive signals and those with a high dynamic range compress less well than signals with low amplitudes or slowly varying waveforms. It should be noted that the reported compression ratios for the SCG, ECG, and PPG data are relative to the ADC's 14-bit resolution, so the actual compression ratios are higher if one considers the two unused bits in the 16-bit representation of the raw ADC codes.

There was an across-the-board increase in compression ratios for the SCG, PPG, and ECG data when the sampling rate was increased from 1600 to 3200 Hz, which may be due to a decrease in the magnitude of the residuals at the higher sampling rate. However, this relationship is reversed for the GCG data. It is not immediately clear why, but one reasonable explanation is that the low signal levels of the GCG make the residuals more susceptible to being dominated by random noise. Nevertheless, the changes in compression ratios in both directions demonstrate that the performance of the compression tables depends on the sampling rate, among other factors, which illustrates why learning the compression tables in real time is superior to computing them offline.
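The expected shrinkage of residuals with sampling rate can be illustrated with a synthetic sinusoid (a toy stand-in, not the study's data): doubling the sampling rate roughly halves the step between adjacent samples, so the first-difference (DPCM) residuals become smaller and easier to compress.

```python
import math

def max_dpcm_residual(signal):
    """Largest first-difference (DPCM residual) magnitude in a sampled signal."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

def sample_sine(fs, f=1.0, duration=1.0, amplitude=8192):
    """Sample an amplitude-scaled sinusoid at fs Hz for the given duration."""
    n = int(fs * duration)
    return [amplitude * math.sin(2 * math.pi * f * k / fs) for k in range(n)]

r_1600 = max_dpcm_residual(sample_sine(1600))
r_3200 = max_dpcm_residual(sample_sine(3200))
# The residuals at 3200 Hz are roughly half those at 1600 Hz.
assert r_3200 < r_1600
```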

We present all the chosen values of ${m}^{*}$, the selected bin width parameter, as opposed to mean ± std statistics, because outliers hide important trends in the results. Specifically, the high degree of intra-channel consistency and inter-channel variance demonstrate the hyperparameter search algorithm's ability to adapt to the underlying signal. For example, for the GCG-X channel, ${m}^{*}=1$ for all subjects. Furthermore, the mode was selected at least $50\%$ of the time for nine out of ten channels, and at least $70\%$ of the time for six out of ten channels. Another key observation is that ${m}^{*}$ is typically larger for impulsive signals or those with a high dynamic range, which explains the outlier in Table 3, where ${m}^{*}=3$ for PPG-C-1 recorded at 1600 Hz from Subject 1, a signal that was saturating beat-to-beat. This behavior was not observed in other measurements, so it was likely a consequence of the contact pressure applied by the subject during that experiment.

In Figure 9, the cumulative distribution function (CDF) of the finger PPG lies below that of the chest PPG, which shows that the distribution of residuals for the finger PPG is more spread out; this spread causes the algorithm to select a larger bin width to compensate. The compression table of the finger PPG recorded from Subject 4 has 11 and 340 entries for the proposed and standard methods, respectively, although the compression ratios obtained with both methods are practically identical. The reason is that the table is able to encode 88 unique residual magnitudes given the bin width of eight and the table size of 11, and the CDF evaluated at 88 exceeds 0.98, which demonstrates how a very compact compression table may be used to achieve a very high code density.
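The coverage figure quoted above follows from the bin width and table size, assuming (as the quoted numbers imply) that each table entry covers one bin of consecutive residual magnitudes; the function below is ours, for illustration.

```python
def covered_magnitudes(bin_width: int, table_size: int) -> int:
    """Number of distinct residual magnitudes encodable when each of the
    table's entries covers one bin of `bin_width` consecutive magnitudes."""
    return bin_width * table_size

# Finger PPG from Subject 4: bin width of eight, 11 table entries.
print(covered_magnitudes(8, 11))  # 88 unique residual magnitudes
```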

The combined duration for executing the hyperparameter search algorithm for all ten channels was $1050\pm 158$ ms and $2012\pm 151$ ms for the 1600 Hz and 3200 Hz sampling rates, respectively. Although not explicitly shown in Algorithm 2, the hyperparameter search algorithm terminates early once ${k}_{1}>\left|supp\left({H}_{m}^{\mathcal{T}}\right)\right|$, i.e., once the minimum acceptable table size is greater than the number of histogram bins, so we also recorded the combined number of candidate $\{m,k\}$ pairs that $\mathsf{\Phi}$ was evaluated with across all channels, which was $1254\pm 103$ and $1305\pm 47$ for the 1600 and 3200 Hz sampling rates, respectively. The short duration required to compute the tables and the large number of searches, which indicates that the algorithm did not terminate after only a few iterations, demonstrate the performance of the proposed hyperparameter search method even with large datasets of 192,000 and 384,000 samples for the 1600 and 3200 Hz sampling rates, respectively. Moreover, the cost of constructing ${H}_{0}^{\mathcal{T}}$ and ${H}_{0}^{\mathcal{V}}$ likely dominates the cost of the actual search; therefore, for larger datasets, the algorithm can be made to run more quickly by constructing them with the incremental approach described in Section 2.1.2.
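One reason the search itself is cheap once ${H}_{0}$ exists is that each coarser histogram ${H}_{m}$ can be derived by merging bins of ${H}_{0}$ rather than re-binning the raw samples. The sketch below assumes, as an inference not stated explicitly here, that bin width parameter $m$ corresponds to a bin width of $2^{m}$ (consistent with the bin width of eight quoted alongside ${m}^{*}=3$).

```python
def coarsen_histogram(h0, m):
    """Derive the bin-width-2**m histogram H_m from the base histogram H_0
    by summing runs of 2**m adjacent bins, instead of re-binning raw data."""
    w = 2 ** m
    return [sum(h0[i:i + w]) for i in range(0, len(h0), w)]

h0 = [5, 3, 0, 2, 7, 1, 0, 0]   # toy base histogram of residual magnitudes
print(coarsen_histogram(h0, 1))  # [8, 2, 8, 0]
print(coarsen_histogram(h0, 2))  # [10, 8]
```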

#### 3.3. Limitations

In this work, the compression tables on the MCU were unconditionally updated after learning them on the smartphone. However, there is no guarantee that a new table will perform at least as well as the previous one (except in the trivial case when the previous table is empty). One way to guard against replacing a compression table with an inferior one is to evaluate both tables on data recorded after the data used to compute the new table (this may also be done by partitioning the data into training, validation, and test datasets, with the test data used for the assessment). A decision can then be made as to whether the compression tables should be updated based on the outcome of this evaluation.
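The guard described above could be sketched as follows. This is a hypothetical extension, not part of the deployed system (which updates unconditionally); `encoded_bits` here is a toy stand-in for the actual encoder, with tables as dicts mapping residuals to code lengths and a fixed-length escape code for unknown residuals.

```python
def should_update(old_table, new_table, test_data, encoded_bits):
    """Accept the new compression table only if it encodes held-out test
    data in no more bits than the current table."""
    if not old_table:  # trivial case: any table beats an empty one
        return True
    return encoded_bits(new_table, test_data) <= encoded_bits(old_table, test_data)

# Toy stand-in for the encoder: unknown residuals cost a 10-bit escape code.
def encoded_bits(table, data):
    return sum(table.get(r, 10) for r in data)

old = {0: 1, 1: 2}
new = {0: 3, 1: 3}
print(should_update(old, new, [0, 0, 1], encoded_bits))  # False: new table needs more bits
print(should_update({}, new, [0, 0, 1], encoded_bits))   # True: previous table was empty
```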

The lossless communications method developed in this work is limited to correcting random packet loss, and is not well-suited to more significant communication failures like long-term Bluetooth connection loss because the MCU’s 256 KB RAM only provides a few seconds of buffering capacity. In practice, however, this constraint has not been an issue and we have successfully tested the prototype by continuously streaming data for two hours.

The choice of a D-type flip-flop, as opposed to a more appropriate D-type latch, was dictated by what was available (at the time of prototyping) in the limited quantities required for device development. Nevertheless, for this application, a D-type flip-flop can emulate a D-type latch because the only value that needs to be latched is a logic high, which can be achieved by permanently tying the D input high. It should also be noted that only the following discrete components are actually required: a D-type latch, a resistor, and a capacitor. That is because most MCUs, including the MSP432P4111, have on-board comparators, and there are D-type latches with inverting clock inputs, which eliminates the need for a discrete comparator and an inverter.

Finally, the performance of the approach in persons with cardiovascular diseases and arrhythmias must be characterized to understand the generalizability of the methods. Specifically, the impact of arrhythmias on the compression algorithm will need to be studied in future work. Broadly speaking, the signals measured in this work have been demonstrated to be of high quality even in patients with severe left ventricular dysfunction, i.e., patients with advanced heart failure [40,41], and thus the potential for this work to be applicable to such populations holds merit.