Compression of Bio-Signals Using Block-Based Haar Wavelet Transform and COVIDOA for IoMT Systems

Background: Bio-signals are the essential data that smart healthcare systems require for diagnosing and treating common diseases. However, the amount of these signals that need to be processed and analyzed by healthcare systems is huge. Dealing with such a vast amount of data presents difficulties, such as the need for high storage and transmission capabilities. In addition, retaining the most useful clinical information in the input signal is essential while applying compression. Methods: This paper proposes an algorithm for the efficient compression of bio-signals for IoMT applications. This algorithm extracts the features of the input signal using block-based HWT and then selects the most important features for reconstruction using the novel COVIDOA. Results: We utilized two different public datasets for evaluation: MIT-BIH arrhythmia and EEG Motor Movement/Imagery, for ECG and EEG signals, respectively. The proposed algorithm’s average values for CR, PRD, NCC, and QS are 18.06, 0.2470, 0.9467, and 85.366 for ECG signals and 12.6668, 0.4014, 0.9187, and 32.4809 for EEG signals. Further, the proposed algorithm shows its efficiency over other existing techniques regarding processing time. Conclusions: Experiments show that the proposed method successfully achieved a high CR while maintaining an excellent level of signal reconstruction in addition to its reduced processing time compared with the existing techniques.


Introduction
Smart healthcare systems deal with massive amounts of medical data daily for healthcare monitoring and early detection and diagnosis of diseases [1]. Bio-signals are records of biological events inside the human body, such as a heartbeat or muscle contraction. These signals are used to detect whether there is a problem or disorder in a human organ. There are many kinds of bio-signals used for various clinical purposes, such as ECG, which is used for recording human heart activity, EEG for recording the electrical activity of the brain, EMG for evaluating the electrical activity of skeletal muscles, ERG for measuring the electrical responses of various cell types in the retina, and EGG for recording the myoelectrical signal generated by the movement of the smooth muscle of the stomach [2]. ECG and EEG are the most widely used bio-signals for diagnoses of cardiac and brain disturbances [3]. Three main elements represent a typical heartbeat signal: the P wave, which indicates depolarization of the atria, the QRS complex, which shows depolarization of the ventricles, and the T wave, which represents repolarization of the ventricles, as shown in Figure 1. On the other hand, EEG signals record the brain's electrical activity. Several sensors are positioned on various parts of the scalp to record EEG signals, as shown in Figure 2. EEG signals help to identify various common diseases, such as epilepsy and autism spectrum disorder [4].  In smart healthcare systems, the bio-signals are recorded by sensors attached to the patient's body and then digitized. The digital bio-signals are processed using digital computers or computer-based medical devices [5]. An effective compression method is a basic need in such systems to minimize the volume of medical data and enhance the transmission's efficiency [6]. 
However, obtaining a high compression ratio alone is insufficient for an efficient compression algorithm; data quality must also be maintained, since the loss of medical data could lead to misdiagnosis. For these reasons, we propose an algorithm for the efficient compression of bio-signals that can achieve a very high CR (CR = 32) while preserving the diagnostic features of the input signal. This algorithm is based on a block-based HWT and COVIDOA. The HWT is used to obtain the features of the signal for these reasons [7]:

• The HWT can extract local spectral and temporal information simultaneously.
• Wavelet-based coding allows for progressive data transmission and is more robust to transmission and decoding failures.
• The HWT is conceptually simple and fast.
• The HWT is completely reversible and does not suffer from the edge effects that are an issue with other wavelet transformations.
• Block-based HWT and inverse transform can be performed by applying matrix multiplication.
We apply the HWT to the input signal and then use COVIDOA to select the best-fit coefficients for reconstructing the digital signal according to a predefined objective function. The PRD is selected as the objective function, so the coefficients that lead to the minimum PRD value are selected for reconstruction. The PRD is calculated as follows:

$$\mathrm{PRD} = \sqrt{\frac{\sum_{x} \left(f(x) - F(x)\right)^{2}}{\sum_{x} f(x)^{2}}} \qquad (1)$$

where F and f are the reconstructed and original signals, respectively.

The rest of the paper is organized as follows: A concise literature review is presented in Section 2. The calculation of the HWT is explained in Section 3. In Section 4, a brief overview of COVIDOA is presented. The proposed compression/decompression algorithm using HWT and COVIDOA is described in Section 5. Experiments, results, and discussion are presented in Section 6. The conclusion and recommendations for future work are given in the last section.

Literature Review
Over the last few decades, various algorithms have been proposed to compress medical data. These algorithms are either lossless or lossy. Lossless compression methods achieve small compression ratios with no data loss, while lossy algorithms achieve much higher compression ratios at the cost of losing some information [8]. Data quality is crucial in the medical field, and losing some features may significantly impact the diagnosis process. However, if the data loss is within an acceptable limit and does not affect the data's visual appearance, lossy compression techniques are a good choice due to the high compression ratios they can achieve [9]. In [10], an ASCII character-encoding-based lossless compression method was proposed. In [11], Chen and Wang used two Huffman coding tables to develop a lossless compression method that reduces the storage and transmission demands for ECG signals. This algorithm has the advantages of low cost and low power consumption. Rzepka [12] used selective linear prediction to compress multi-channel ECG.
Most lossy compression algorithms are based on transform coding, where a specific transform is applied to the input signal; some of the resulting coefficients are discarded, while the others are used in the reconstruction process. The result of this process is not identical to the original input, but it should be close enough for the application's purpose. The most popular transform-based compression techniques involve the DCT [13], DWT [7], and moment-based transforms [14]. For bio-signal compression, Batista et al. [15] utilized Golomb-Rice coding with optimum DCT coefficients to compress ECG signals. They used the well-known MIT-BIH Arrhythmia database to evaluate their algorithm, where a CR of 10.4:1 and PRD ≈ 2.5% were achieved. Jha and Kolekar [16] proposed another DCT-based algorithm to compress ECG signals. They employed DOST and dead-zone quantization to transform coefficients. Recent work includes the technique proposed in [17] for assessing compressed and decompressed ECG databases. That algorithm used DCT, 16-bit quantization, and run-length encoding for compression, and a convolutional neural network for classification. The obtained CR was 2.56, and the classification accuracies were 0.966 and 0.990 for the compressed and decompressed databases, respectively. Further, Pal et al. [18] proposed a compression algorithm for 2D ECG signals based on the combination of DCT and embedded zero-tree wavelet. The results showed that the suggested approach could raise the sparsity of the transform domain, which boosts compression effectiveness with a small degradation in reconstruction quality. Other DCT-based signal compression algorithms are proposed in [19,20].
The WT is a powerful tool for signal analysis because of its compact representation of signals and images, and its most popular applications are denoising and compression of signals [21]. Recently, OMs, such as Tchebichef and Hahn moments, have been used in signal reconstruction and compression due to their ability to represent signals [22]. Signal compression techniques based on orthogonal moments are presented in [23,24]. It is observed from the state of the art that the wavelet-based algorithms provide superior performance compared to the other compression methods [21]. Jha and Kolekar [25] used the DWT to select an appropriate mother wavelet to compress the ECG signal while guaranteeing quality. The same authors employed EMD and DWT to develop another ECG compression algorithm [26]. Based on the obtained results, it is noticed that the suggested method performs better than several current ECG compressors. Singhai et al. [27] used DWT and PSO to design a compression algorithm where the PSO selects threshold values and the optimal wavelet parameters. The compression ratio obtained using this algorithm was 28.43 at PRD = 2.63. Kolekar et al. [28] proposed an ECG compression technique based on the modified run-length encoding of wavelet coefficients. The proposed approach used dead-zone quantization for WT coefficients, and the obtained coefficients were encoded using modified run-length encoding.
Shi et al. [29] proposed a new ECG compression method based on a binary convolutional auto-encoder (BCAE) equipped with residual error compensation (REC). The proposed method aimed to achieve efficient ECG compression through deep learning while ensuring high signal quality. The performance is tested using several measures, such as PRD, QS, SNR, and CR. The average performance in CR, PRD, NPRD, and SNR is 17.18, 3.92, 6.36, and 28.27 dB, respectively, for 48 ECG records. The achieved results in CR and PRD are 117.33 and 7.76, respectively. Recently, Singhai et al. [30] presented an algorithm for ECG compression based on DWT and nature-inspired optimization algorithms. The algorithm used the optimization algorithm to find the optimal wavelet design parameter values and optimal threshold levels. The results show the capability of this technique to provide high compression ratios with high signal quality.
Lossy compression depends on using only some features and omitting others in return for reduced size. The question is, therefore, which features should be selected and which should not. The features that carry the most important clinical information and lead to the highest reconstruction quality should be selected, and the remaining features should be neglected. An optimization algorithm is very helpful in selecting the most important feature subset. Motivated by the simplicity and efficiency of the HWT in signal and image processing and the efficiency of COVIDOA in solving various optimization tasks, we utilized the HWT in combination with COVIDOA to develop an efficient compression algorithm for bio-signals. In this approach, the signal is transformed using the HWT, and COVIDOA then selects the best subset of wavelet coefficients to be used for reconstructing the signal.

HWT
This section explains how a fast block-based HWT is applied to a one-dimensional signal. Suppose the signal is divided into K blocks denoted by $B_i$, $i = 1, 2, 3, \ldots, K$, where $B_i$ is the $i$th block of size $1 \times N$. The following formula can perform the forward HWT for each block: where B is the signal block and A is the Haar matrix. The Haar matrix can be obtained using the following formula [31]: and where $i = 1, 2, \ldots, N - 1$, and $N = 2^j$ ($j = 0, 1, \ldots, J$) refers to the wavelet level. J represents the resolution. j and k represent the integer decomposition of the index i, where $i = N + k - 1$ and $k = 1, 2, \ldots, 2^j$. $A_0(x)$ represents the scaling function, while $A_1(x)$ is the mother wavelet function. The remaining Haar wavelet functions can be obtained from the mother function $A_1(x)$ by applying translation and dilation processes. According to the previous formula, the kernel matrix for the HWT can be generated as follows: The original signal block B can then be reconstructed from its transform by applying the inverse Haar transform as follows: where T and R are the reconstructed signal block and the transform coefficients, respectively.
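As a rough illustration of how a block-based Haar transform reduces to matrix multiplication, the following Python sketch builds an orthonormal Haar matrix recursively with Kronecker products and applies it to 1 × N blocks. The function names (`haar_matrix`, `block_hwt`, `block_ihwt`), the orthonormal scaling, and the row-vector convention (coefficients = block × Aᵀ) are assumptions made for this example and may differ from the exact kernel used in the paper. The signal length is assumed to be a multiple of N.

```python
import numpy as np


def haar_matrix(n: int) -> np.ndarray:
    """Return an orthonormal n x n Haar matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        m = h.shape[0]
        top = np.kron(h, [1.0, 1.0])               # existing rows stretched to double length
        bottom = np.kron(np.eye(m), [1.0, -1.0])   # finest-scale wavelet rows
        h = np.vstack([top, bottom]) / np.sqrt(2.0)
    return h


def block_hwt(signal, n: int) -> np.ndarray:
    """Forward transform: one row of Haar coefficients per 1 x n block."""
    blocks = np.asarray(signal, dtype=float).reshape(-1, n)
    return blocks @ haar_matrix(n).T


def block_ihwt(coeffs: np.ndarray) -> np.ndarray:
    """Inverse transform: rebuild the 1-D signal from the block coefficients."""
    n = coeffs.shape[1]
    return (coeffs @ haar_matrix(n)).ravel()
```

Because the matrix in this sketch is orthonormal, the inverse is simply multiplication by the same matrix without the transpose, so a forward transform followed by the inverse returns the original samples up to floating-point error.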

COVIDOA
COVIDOA is a recent metaheuristic inspired by the replication life cycle of the novel Coronavirus particles inside the human body [32]. COVIDOA is divided into four stages as follows:

a. Virus entry and uncoating
The virus particle tries to enter the human body cell through a special structural protein called spike protein. After entry, the virus genome is uncoated inside the cell.

b. Virus replication
The virus uses the frameshifting technique to generate millions of copies to hijack as many human cells as possible. The most popular frameshifting technique is +1 frameshifting, in which the elements of the parent sequence are moved forward by one step. As a result, the first position of the sequence is vacated and is filled with a random value between lb and ub as follows:

$$V_t(1) = \mathrm{rand}(lb, ub) \qquad (9)$$

$$V_t(2:D) = P(1:D-1) \qquad (10)$$

where P is the parent sequence, $V_t$ is the $t$th generated viral protein, lb and ub are the lower and upper bounds, and D is the problem dimension.

c. Virus mutation
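The virus tries to mutate to evade the immune system as follows:

$$Z_i = \begin{cases} r, & \text{if } \mathrm{rand}(0,1) < MR \\ X_i, & \text{otherwise} \end{cases} \qquad (11)$$

where Z and X are the mutated and non-mutated solutions, $i = 1, \ldots, D$, r is a random number in the range [lb, ub], and MR is the mutation rate, which takes a value from 0.005 to 0.5.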

d. New virion formation and release
The previous steps generate many new virus particles called virions, which are then released from the infected cell and directed to new cells. The pseudocode of COVIDOA is shown in Algorithm 1; in it, each new solution is mutated using Equation (11), and the loop repeats until t ≥ MaxItr.
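The replication and mutation stages described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation of COVIDOA. The function names `plus_one_frameshift` and `mutate` are hypothetical. The per-element mutation rule (replace an element with a random value with probability MR) is an assumption based on the description of Z, X, r, and MR.

```python
import numpy as np


def plus_one_frameshift(parent, lb, ub, rng):
    """+1 frameshifting: shift the parent forward by one position."""
    parent = np.asarray(parent, dtype=float)
    protein = np.empty_like(parent)
    protein[0] = rng.uniform(lb, ub)   # V_t(1): random value between lb and ub
    protein[1:] = parent[:-1]          # V_t(2:D) = P(1:D-1)
    return protein


def mutate(x, lb, ub, mr, rng):
    """Assumed rule: replace each element by a random value with probability mr."""
    x = np.asarray(x, dtype=float)
    mask = rng.random(x.shape) < mr           # which positions mutate
    r = rng.uniform(lb, ub, size=x.shape)     # candidate replacement values
    return np.where(mask, r, x)


rng = np.random.default_rng(0)
parent = np.array([1.0, 2.0, 3.0, 4.0])
protein = plus_one_frameshift(parent, lb=0.0, ub=10.0, rng=rng)
mutated = mutate(protein, lb=0.0, ub=10.0, mr=0.1, rng=rng)
```

With a small MR such as 0.1, most elements of the protein pass through unchanged, which keeps the search close to the parent while still allowing occasional jumps.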

The Proposed Compression/Decompression Algorithm
In the proposed compression algorithm, we use a simplified block-based HWT to obtain the Haar coefficients of the signal; COVIDOA is then used to select the subset of coefficients needed for signal reconstruction. The size of the selected subset is determined according to the desired compression ratio (CR) using the following formula:

$$SS = \frac{N}{CR} \qquad (12)$$

where SS refers to the size of the selected subset of coefficients, CR is the desired compression ratio, and N is the signal block size. The proposed compression algorithm can be summarized in the following steps:

1. The signal is split into blocks of size 1 × N; N can be 8, 16, 32, or 64.
2. The required size of the coefficient subset is calculated using Equation (12).
3. For each signal block:
   a. Calculate the block-based HWT to obtain the Haar coefficients using Equation (2).
   b. Use COVIDOA to select the optimal coefficients according to the PRD objective function in Equation (1) as follows:
      i. Generate an initial random population of solutions and compute the objective function for each solution.
      ii. Select a parent solution using tournament selection and apply the frameshifting technique to generate several proteins using Equations (9) and (10).
      iii. Apply crossover between the generated proteins to generate a new virion.
      iv. Apply mutation to the previously generated solution to obtain a new mutated solution.
      v. Replace the parent solution with the new solution if the new solution is fitter than the parent. Otherwise, the parent solution remains.
      vi. Repeat steps ii to v until MaxItr is reached.
      vii. Select the optimal solution achieved so far.
   c. From the coefficients obtained in step a, only those whose positions correspond to the values in the optimum solution are selected, and the remaining coefficients are ignored (set to zero).
   d. Apply the inverse transform to the optimum coefficients obtained in the previous step to obtain the reconstructed signal block using Equation (8).
4. Concatenate the reconstructed blocks to obtain the reconstructed signal.
5. Evaluate the algorithm's performance using the CR, PRD, NCC, and QS metrics.

A diagram of the proposed compression/decompression approach is displayed in Figure 3.
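The steps above can be sketched end to end in Python. To keep the example short and deterministic, the COVIDOA search in step 3b is replaced by a simple stand-in that keeps the SS largest-magnitude coefficients. For an orthonormal transform this choice minimizes the squared reconstruction error, but it is not the optimizer used in this paper. The names `haar_matrix`, `compress_block`, and `compress_signal`, the orthonormal Haar scaling, and the subset-size rule SS = N / CR (rounded down, at least 1) are assumptions of this sketch.

```python
import numpy as np


def haar_matrix(n: int) -> np.ndarray:
    """Orthonormal n x n Haar matrix (n must be a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        m = h.shape[0]
        h = np.vstack([np.kron(h, [1.0, 1.0]),
                       np.kron(np.eye(m), [1.0, -1.0])]) / np.sqrt(2.0)
    return h


def compress_block(block: np.ndarray, cr: float) -> np.ndarray:
    """Transform one block, keep SS coefficients, and reconstruct it."""
    n = block.size
    ss = max(1, int(n // cr))                 # subset size, assumed SS = N / CR
    a = haar_matrix(n)
    coeffs = block @ a.T                      # forward block HWT
    # Stand-in for COVIDOA: positions of the ss largest-magnitude coefficients.
    keep = np.argsort(np.abs(coeffs))[-ss:]
    selected = np.zeros_like(coeffs)
    selected[keep] = coeffs[keep]             # all other coefficients set to zero
    return selected @ a                       # inverse block HWT


def compress_signal(signal, n: int = 64, cr: float = 32) -> np.ndarray:
    """Split into 1 x n blocks, process each block, and concatenate."""
    blocks = np.asarray(signal, dtype=float).reshape(-1, n)
    return np.concatenate([compress_block(b, cr) for b in blocks])
```

With N = 64 and CR = 32, only two coefficients per block are kept in this sketch, which shows why the choice of positions matters so much for the reconstruction error.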

Results
This section provides a brief overview of the utilized datasets, the evaluation criteria, and the numerical results and discussions about the experiments performed by the proposed algorithm as follows:

Datasets
Two separate bio-signal datasets are utilized to evaluate the performance of the proposed compression algorithm. The MIT-BIH arrhythmia dataset is used for testing ECG compression [33]. Twenty-five ECG records that contain cardiac information for a group of volunteers are selected for evaluation. The selected signals have a sampling rate of 360 samples per second and a resolution of 11 bits. The second dataset is the Motor Movement/Imagery Dataset, used for testing EEG compression [34]. This dataset contains over 1500 one- and two-minute EEG recordings collected from 109 volunteers. Twenty single-channel EEG signals are selected for testing. The sampling frequency of the selected EEG signals is 160 samples per second.

Evaluation Criteria
The compression ratio and the quality of the reconstructed signal must be measured to evaluate the proposed algorithm's performance. The following metrics are utilized for evaluation:

• CR
The achieved CR can be measured as follows:

$$CR = \frac{N_O}{N_R}$$

where $N_O$ and $N_R$ represent the number of bits in the original and reconstructed signals, respectively.

• PRD
This metric is used to quantify the difference between the original and reconstructed signals as follows:

$$\mathrm{PRD} = \sqrt{\frac{\sum_{x} \left(f(x) - F(x)\right)^{2}}{\sum_{x} f(x)^{2}}}$$

• NCC
NCC is used to measure the correlation between two signals. Its value ranges from −1 to 1, where 1 represents a complete positive correlation and −1 represents a complete negative correlation. The NCC between the original signal f(x) and the reconstructed signal F(x) is calculated as follows:

$$\mathrm{NCC} = \frac{\sum_{x} \left(f(x) - \bar{f}\right)\left(F(x) - \bar{F}\right)}{\sqrt{\sum_{x} \left(f(x) - \bar{f}\right)^{2} \sum_{x} \left(F(x) - \bar{F}\right)^{2}}}$$

where $\bar{f}$ and $\bar{F}$ are the mean values of the original and reconstructed signals, respectively.

• QS
QS is used to measure the overall performance of the compression algorithm. The higher the value of QS, the better the algorithm's performance. QS can be calculated as follows:

$$QS = \frac{CR}{\mathrm{PRD}}$$

For a fair comparison, the proposed and competing algorithms were tested on a PC with the following specifications: Intel(R) Core(TM) i7-1065G7 CPU, 8 GB RAM, Windows 10 operating system, and MATLAB R2016a development environment.
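The evaluation metrics described in this subsection can be sketched in Python as follows. The function names are hypothetical, and the formulas use the standard textbook definitions. PRD is returned here as a fraction; some definitions multiply it by 100 to express it in percent, which would scale QS accordingly.

```python
import numpy as np


def compression_ratio(n_original_bits: float, n_compressed_bits: float) -> float:
    """CR: bits in the original signal divided by bits after compression."""
    return n_original_bits / n_compressed_bits


def prd(f, F) -> float:
    """Root-mean-square difference between original f and reconstruction F."""
    f = np.asarray(f, dtype=float)
    F = np.asarray(F, dtype=float)
    return float(np.sqrt(np.sum((f - F) ** 2) / np.sum(f ** 2)))


def ncc(f, F) -> float:
    """Normalized cross-correlation of the mean-removed signals, in [-1, 1]."""
    f = np.asarray(f, dtype=float)
    F = np.asarray(F, dtype=float)
    fc, Fc = f - f.mean(), F - F.mean()
    return float(np.sum(fc * Fc) / np.sqrt(np.sum(fc ** 2) * np.sum(Fc ** 2)))


def quality_score(cr: float, prd_value: float) -> float:
    """QS: compression ratio divided by PRD (higher is better)."""
    return cr / prd_value
```

A perfect reconstruction gives a PRD of zero and an NCC of one, so QS is undefined in that limiting case and is only meaningful for lossy results.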

Numerical Results and Discussion
This section presents the numerical outcomes of the suggested ECG and EEG compression algorithm. For ECG compression, 25 records from the MIT-BIH arrhythmia dataset are compressed using the proposed approach, and the results are shown in Table 1. It can be noticed from the table that the proposed approach has efficient compression performance, as it can achieve a high CR (CR = 32) with excellent signal reconstruction quality (PRD = 0.2665 and NCC = 0.9436). Additionally, the high quality score results indicate the efficient overall performance of the proposed compression approach. For example, records 108, 112, 117, 118, and 121 have the highest QS results of 119, 166, 110, 142, and 211, respectively. The average results of the proposed approach in terms of CR, PRD, NCC, and QS are 18.06, 0.2470, 0.9467, and 85.366, respectively. Figure 4 shows examples of ECG signals after compression and decompression by the proposed algorithm. The excellent quality of the reconstructed signals is clear from the figure, as the reconstructed signals are very similar to the originals.
For EEG compression, 20 single-channel EEG signals from the Motor Movement/Imagery Dataset are compressed and decompressed by the proposed approach. The results in terms of CR, PRD, NCC, and QS are displayed in Table 2. The table shows that the proposed algorithm can achieve high CR values while maintaining signal quality for EEG signals. The best EEG compression results for the proposed algorithm are 16, 0.2423, 0.9725, and 66.033 for CR, PRD, NCC, and QS, respectively. The proposed algorithm achieves average CR, PRD, NCC, and QS values of 12.6668, 0.4014, 0.9187, and 32.4809, respectively. Figure 5 shows samples of EEG signals before and after compression by the proposed algorithm.
To assess the efficiency of the proposed compression algorithm relative to state-of-the-art techniques, a comparison with some existing approaches [22-24] is conducted in terms of CR, PRD, and QS, as shown in Table 3. The table shows that, at the same CR, the proposed approach achieves the minimum PRD and maximum QS values, which demonstrates the proposed algorithm's superiority over the competing methods.
In remote healthcare monitoring systems, the compression speed is very important, as a higher compression speed means reduced energy consumption. Along with the evaluation measures already mentioned, we also compared the proposed algorithm with the other techniques in terms of processing time. Tables 4 and 5 show the processing time in seconds for the proposed and existing compression techniques for ECG and EEG signals. The tables demonstrate that the proposed algorithm has a lower processing time than the other techniques in most cases. However, in some cases, the technique in [24], which uses Tchebichef moments with the ABC algorithm, has the lowest processing time, especially at higher compression ratios. We conclude from these experiments that the proposed compression algorithm achieves excellent performance for bio-signal compression.

Conclusions
In this paper, an efficient compression algorithm is proposed for bio-signals. Because of the simplicity and efficiency of the HWT in extracting signal information, the proposed technique uses the block-based HWT to extract the signal's features. The novel COVIDOA then selects the best wavelet coefficient subset to achieve the desired compression ratio. The optimum coefficient subset is selected according to the PRD objective function: the subset of coefficients that achieves the minimum PRD value is considered the optimum and is used for signal reconstruction. The MIT-BIH arrhythmia and Motor Movement/Imagery datasets are used to test the performance of the proposed algorithm in ECG and EEG signal compression. The results showed that the proposed compression approach could achieve high CRs while maintaining signal quality. A comparison with existing compression algorithms was conducted according to CR, PRD, NCC, QS, and processing time. The comparison showed the superiority of the proposed algorithm in ECG and EEG compression.
Future work may include applying the proposed algorithm to other bio-signals, such as EMG, ERG, and EGG. Further, the proposed approach may be applied to 2D and 3D medical image compression to minimize the storage and transmission capabilities required by healthcare systems.