Design and Optimization of ECG Modeling for Generating Different Cardiac Dysrhythmias

The electrocardiogram (ECG) has significant clinical importance for analyzing most cardiovascular diseases. ECG beat morphologies, beat durations, and amplitudes vary from subject to subject and from disease to disease. Therefore, ECG morphology-based modeling has been a long-standing research interest. This work aims to develop a simplified ECG model based on a minimum number of parameters that can correctly represent ECG morphology in different cardiac dysrhythmias. A simple mathematical model based on the sum of two Gaussian functions is proposed. However, fitting more than one Gaussian function in a deterministic way has accuracy and localization problems. To solve these fitting problems, two hybrid optimization methods have been developed to select the optimal ECG model parameters. The first method combines an approximation with a global search technique (ApproxiGlo), and the second combines an approximation with a multi-start search technique (ApproxiMul). The proposed model and optimization methods have been applied to real ECGs in different cardiac dysrhythmias, and the model performance was measured in the time, frequency, and time-frequency domains. The model fit different types of ECG beats representing different cardiac dysrhythmias with high correlation coefficients (>0.98). Compared to the nonlinear fitting method, ApproxiGlo and ApproxiMul are 3.32 and 7.88 times better in terms of root mean square error (RMSE), respectively. Regarding optimization, ApproxiMul performs better than ApproxiGlo in many metrics. Different uses of this model are possible; for example, a synthetic ECG generator with a graphical user interface has been developed and tested. In addition, the model can be used for lossy compression with a variable compression rate. A compression ratio of 20:1 can be achieved with a 1 kHz sampling frequency and 75 beats per minute.
These optimization methods can be used in different engineering fields where the sum of Gaussians is used.


Introduction
Cardiovascular disease (CVD) is one of the leading causes of morbidity and mortality around the globe. About 80% of CVD deaths take place globally in low- and middle-income countries. About 92.1 million U.S. adults are currently suffering from CVDs. A total of 17.3 million people died globally in 2013 due to CVDs [1]. By 2030, the projected percentage of having some form of CVDs is 43.9%. Therefore, to save millions of people,
• A simplified mathematical model based on the sum of two Gaussians has been proposed, which does not require baseline adjustment [23] and is straightforward compared to the current models.
• Most fitting or optimizing techniques are designed to calculate only one Gaussian function, which is not ideal for the proposed method. Therefore, to solve these problems along with the model, two hybrid optimization methods have been proposed.
• The proposed model's performance is evaluated using time-domain, frequency-domain, and time-frequency-domain analysis.
• Among various applications, one of the most promising is the ECG generator for education and research purposes. A graphical user interface (GUI) has been developed to show the proposed system's potential use. In addition, noisy ECG generation and data compression are presented.
This paper is organized as follows. Firstly, this study's background problem is formulated with the proposed ECG model through hybrid optimization methods in Section 2. In Section 3, data collection and processing for the performance evaluation of the test model and hybrid optimization are discussed. The results and comparison with real data are presented in Section 4. In Section 5, two applications have been discussed. The discussion and conclusion are presented in Sections 6 and 7, respectively.

Background Problems and Proposed Model
The overall flow diagram of the study is shown in Figure 1. First, the proposed ECG model and hybrid optimization methods are presented with their background problems. After that, the proposed model is fitted to real ECG using the hybrid optimization methods. Finally, the performance measurements and applications of the proposed model are discussed in detail. An ideal ECG is a combination of P, Q, R, S, and T waves; thus, the proposed simplified mathematical model is generated by constructing and assembling these waves. ECG modeling has two problems: (1) designing a simplified mathematical model with a minimum number of model parameters representing a full ECG signal, and (2) finding optimal model parameters representing different cardiac dysrhythmias. The proposed model, along with the limitations of the current model (Section 2.1), and the proposed hybrid optimization technique (Section 2.2) are discussed below.

Proposed Model
ECG components, i.e., P, Q, R, S, and T waves, have an approximately symmetric "bell curve" shape that quickly falls off toward both sides. This "bell curve" shape is one of the reasons why the Gaussian wave is widely used [24]. Moreover, a Gaussian function can be represented by a small number of parameters and has also been used in other biomedical signal modeling, e.g., the photoplethysmogram (PPG) [25]. Awal et al. [24] use a non-uniform number of Gaussian functions to represent ECG components. One Gaussian function is used for the P wave, while the sum of two Gaussians is used to describe the Q wave. This non-uniformity in the number of Gaussians creates an extra classification problem, because the different ECG signal components need to be classified before modeling.
A uniform equation for all ECG components $i \in \{P, Q, R, S, T\}$ is proposed, and the equation for generating an ECG wave can be written as:

$$ECG_i(t) = \sum_{j=1}^{2} A_{ij} \exp\left(-\frac{(t - t_{ij})^2}{2\sigma_{ij}^2}\right) + c_i \qquad (1)$$

where, for the $j$-th Gaussian of component $i$, $A_{ij}$ is the height of the curve's peak, $t_{ij}$ controls the center position of the peak, $\sigma_{ij}$ controls the width of the ECG wave, and $j = 1, 2$ indexes the two Gaussians. Optimal $A_{ij}$, $t_{ij}$, and $\sigma_{ij}$ need to be selected, which is an optimization problem discussed in the next section. Note that $c_i$ is an additional parameter that can be used to control the baseline and for noise modeling.
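As an illustration of the sum-of-two-Gaussians idea, the following minimal Python sketch evaluates one ECG component as two Gaussians plus an offset. It assumes the standard Gaussian form with $2\sigma^2$ in the exponent; the function names and the P-wave parameter values are hypothetical and are not taken from the paper.

```python
import math

def gaussian(t, amplitude, center, width):
    """Single Gaussian; the 2*width**2 denominator is an assumed convention."""
    return amplitude * math.exp(-((t - center) ** 2) / (2.0 * width ** 2))

def ecg_component(t, gaussians, offset=0.0):
    """Sum of Gaussians (A_j, t_j, sigma_j) plus a baseline/noise offset c_i."""
    return sum(gaussian(t, a, c, s) for a, c, s in gaussians) + offset

# Hypothetical P-wave parameters: (amplitude in mV, center in s, width in s).
p_wave = [(0.10, 0.08, 0.015), (0.05, 0.10, 0.020)]

fs = 360                      # sampling frequency in Hz
n_samples = round(0.2 * fs)   # a 0.2 s window
samples = [ecg_component(n / fs, p_wave) for n in range(n_samples)]
```

Because each component uses the same functional form, a full beat can be assembled by evaluating five such components (P, Q, R, S, T), each with its own parameter set.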

Optimization Problem Formulation and Proposed Optimization Method
The performance of the mathematical model depends heavily on the choice of parameters. Generally, the parameters are chosen by fitting the mathematical model to real-world ECG signals. Consider uniformly spaced discrete-time data points $\{t = 1, 2, \ldots, N\}$ associated with the data values of ECG $\in \{P, Q, R, S, T\}$, where $N$ is the total number of discrete-time points or samples in each ECG beat. The sum of two Gaussians stated in Equation (1) must be fitted to the real ECG data so as to minimize the root mean square error (RMSE). Mathematically, it can be written as:

$$\min \; RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( ECG_{real}(t) - ECG_{Model}(t) \right)^2} \qquad (2)$$

where $ECG_{Model}(t)$ is the proposed model having seven controlling parameters, $\{A_1, t_1, \sigma_1, A_2, t_2, \sigma_2, c_i\}$, for each ECG component, i.e., P, Q, R, S, T. $ECG_{Model}(t)$ needs to fit $ECG_{real}(t)$ so that the RMSE given in Equation (2) is minimized.
The proposed model is the sum of two Gaussians. Goshtasby and O'Neill showed that the Gaussian parameters ($A_i$, $t_i$, $\sigma_i$) can be determined quickly and accurately from the zero-crossings when a model or function contains only one Gaussian [26], or in a deterministic way, for example, using Caruana's, Guo's, and Roonizi's methods [27][28][29]. However, it is difficult to accurately determine the position and standard deviation of all Gaussians from the zero-crossings when a function contains two or more Gaussians [26]. Because the proposed model consists of the sum of two Gaussians, a better optimization method is required to tune the parameters.
Two hybrid optimization methods (ApproxiGlo and ApproxiMul) have been proposed. Both methods consist of two steps (see Figure 2): (1) determination of the initial parameters and the corresponding lower and upper bounds, i.e., the approximation method; (2) determination of the final optimal Gaussian parameters by a global optimization solver, starting from the approximate parameters and corresponding limits calculated in step 1. In this paper, two global optimization solvers are explored: (a) Multi-start and (b) Global search.
In summary, ApproxiGlo comprises the Approximation method and Global search, whereas ApproxiMul comprises the Approximation method and Multi-start; see Figure 2.
Step 1: Approximation of the Gaussian parameters. In this step, the effect of adding multiple Gaussians is not taken into account.
Step 2: These estimated parameters are regarded as the initial parameters of the (b) Multi-start and (c) Global search optimization solvers, which determine the optimal Gaussian parameters.
Sensors 2021, 21, 1638
Step 1: An approximation method is used to select the initial model parameters. These initial parameters are used as input parameters for the global optimization solvers in Step 2. A flow diagram of the approximation method is shown in Figure 2a, and a brief description is given below:
Step 1.1 After segmenting the ECG beat into P, Q, R, S, and T components, i.e., ECG $\in \{P, Q, R, S, T\}$, the minimum and maximum values of $\sigma_1$ are calculated based on the number of samples. Assuming the total time duration of an ECG component is x R and the number of samples in this time duration is $N_s$, $\sigma_{1\,min}$ and $\sigma_{1\,max}$ can be approximated by the following equations:
Step 1.2 Increment the value of $\sigma_1$ from $\sigma_{1\,min}$ to $\sigma_{1\,max}$ at an interval of 0.3 and run through all the different values. Then, construct a Gaussian filter with the Gaussian as follows: where $S$ is a vector calculated from here, $\lceil \cdot \rceil$ denotes the ceiling function, used to convert a floating-point number into an integer. This arrangement allows integer increments and reduces the number of iterations.
Convolve the ECG with the Gaussian filter and find the maximum response using the matched filtering approach. After that, find the position $t_1$ and amplitude $A_1$ of the Gaussian for the corresponding $\sigma_1$ and determine the square error.
Step 1.3 Store the corresponding RMSE and corresponding parameters.
Step 1.4 Repeat step 1.2 and step 1.3 for all values of σ 1 .
Step 1.5 Select the parameters ($A_1$, $\sigma_1$, $t_1$) for which the RMSE is the lowest.
Step 1.6 Finally, replicate the parameters of the second Gaussian from the calculated Gaussian parameters. Because the parameters of the second Gaussian are approximated from the first Gaussian, this step is called the Approximation method.
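The grid search of Steps 1.2 to 1.5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the bounds `sigma_min` and `sigma_max` are passed in directly, $\sigma$ is measured in samples, the kernel support of $\lceil 3\sigma \rceil$ samples per side and the amplitude estimate (the signal value at the matched-filter peak) are choices made here, and the largest response magnitude is used so that negative waves are handled. The replication of Step 1.6 is not included.

```python
import math

def gaussian_kernel(sigma, half_width):
    """Sampled Gaussian on the integer offsets -half_width..half_width."""
    return [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-half_width, half_width + 1)]

def matched_filter_peak(signal, kernel):
    """Index where the correlation of signal and kernel is largest in magnitude."""
    half = len(kernel) // 2
    best_index, best_response = 0, -1.0
    for n in range(len(signal)):
        response = 0.0
        for k, weight in enumerate(kernel):
            m = n + k - half
            if 0 <= m < len(signal):
                response += signal[m] * weight
        if abs(response) > best_response:
            best_index, best_response = n, abs(response)
    return best_index

def rmse(real, model):
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(real, model)) / len(real))

def approximate_first_gaussian(signal, sigma_min, sigma_max, step=0.3):
    """Grid search over sigma; returns (rmse, A1, t1, sigma1) with the lowest RMSE."""
    best = None
    n_steps = int(math.floor((sigma_max - sigma_min) / step + 1e-9))
    for i in range(n_steps + 1):
        sigma = sigma_min + i * step
        kernel = gaussian_kernel(sigma, math.ceil(3 * sigma))  # assumed support
        center = matched_filter_peak(signal, kernel)
        amplitude = signal[center]                             # assumed estimate
        model = [amplitude * math.exp(-((n - center) ** 2) / (2.0 * sigma ** 2))
                 for n in range(len(signal))]
        error = rmse(signal, model)
        if best is None or error < best[0]:
            best = (error, amplitude, center, sigma)
    return best

# Synthetic single-Gaussian "wave", used only to exercise the sketch.
wave = [math.exp(-((n - 50) ** 2) / (2.0 * 5.0 ** 2)) for n in range(100)]
error, a1, t1, sigma1 = approximate_first_gaussian(wave, sigma_min=2.0, sigma_max=8.0)
```

On the synthetic wave, the search recovers the generating center and width because one grid value coincides with the true width; on real ECG components the result is only a starting point for Step 2.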
Step 2: To increase the accuracy of the Approximation method, a global optimization solver is chosen. Among various global optimization solvers, such as multi-start, global search, genetic algorithm, simulated annealing, and particle swarm, multi-start and global search are simple, fast, and easy to use. Hence, these are adopted to obtain the optimal Gaussian parameters. Multi-start and global search take a similar approach to finding multiple minima or the global minimum: both algorithms start a local solver from multiple start points. The multi-start optimizer uses uniformly distributed start points within the bounds or user-supplied start points, whereas the global search uses a scatter-search mechanism for generating start points.
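The multi-start idea described above (a local solver launched from a user-supplied start point plus uniformly distributed start points within the bounds) can be sketched as follows. This is not the solver used in the paper: a simple derivative-free compass search stands in for the local solver, and all names, bounds, and parameter values are hypothetical.

```python
import math
import random

def two_gaussian(n, p):
    """Sum of two Gaussians; p = [A1, t1, sigma1, A2, t2, sigma2]."""
    a1, c1, s1, a2, c2, s2 = p
    return (a1 * math.exp(-((n - c1) ** 2) / (2.0 * s1 ** 2))
            + a2 * math.exp(-((n - c2) ** 2) / (2.0 * s2 ** 2)))

def rmse(params, signal):
    total = sum((y - two_gaussian(n, params)) ** 2 for n, y in enumerate(signal))
    return math.sqrt(total / len(signal))

def compass_search(objective, start, lower, upper, frac=0.1, tol=1e-3, max_iter=200):
    """Stand-in local solver: try +/- steps per coordinate, halve the step on failure."""
    x, fx = list(start), objective(start)
    spans = [hi - lo for lo, hi in zip(lower, upper)]
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for direction in (1.0, -1.0):
                trial = list(x)
                step = direction * frac * spans[i]
                trial[i] = min(upper[i], max(lower[i], x[i] + step))  # stay in bounds
                f_trial = objective(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            frac /= 2.0
            if frac < tol:
                break
    return x, fx

def multi_start(objective, lower, upper, n_starts=5, seed=0, initial=None):
    """Run the local solver from `initial` and from uniform random start points."""
    rng = random.Random(seed)
    starts = [list(initial)] if initial is not None else []
    starts += [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
               for _ in range(n_starts)]
    best_x, best_f = None, math.inf
    for start in starts:
        x, f = compass_search(objective, start, lower, upper)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

# Synthetic target and a rough "approximation" start with duplicated Gaussians.
signal = [two_gaussian(n, [1.0, 30.0, 4.0, 0.6, 45.0, 6.0]) for n in range(80)]
lower = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
upper = [2.0, 79.0, 15.0, 2.0, 79.0, 15.0]
initial = [0.9, 33.0, 5.0, 0.9, 33.0, 5.0]
objective = lambda p: rmse(p, signal)
best_params, best_error = multi_start(objective, lower, upper, seed=1, initial=initial)
```

Because the approximate parameters are always included among the start points and the local solver never accepts a worse point, the returned error cannot exceed the error of the approximation itself.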
A real ECG P wave fitted with the model parameters calculated by this approximation method is shown in Figure 2. From Equation (3), it can be seen that the parameters $\sigma_1$ and $\sigma_2$ are approximated; hence the name Approximation method. Figure 2 (output of (a)) illustrates the model fitting using the approximation technique, and Figure 2 (output of (c) and output of (d)) shows the model fitting results of the proposed hybrid optimization techniques on a real ECG component, the P wave.

Databases and Pre-Processing
Denoising is required because real data is prone to noise, and data formatting is necessary to fit the model. The collected data are pre-processed, and then ECG components are extracted for modeling. A brief description of the data collection and pre-processing is given below.

Data Collection
Real ECG was used for fitting the model and testing its accuracy. To ensure the diversity of data sources and provide the model with different cardiac dysrhythmias, ECG data was collected from (i) the Biomedical Signal Processing Lab, Khulna University of Engineering and Technology (KUET), Khulna, Bangladesh, and (ii) two publicly available online databases. The first was collected from a volunteer, and the others were obtained from accessible online databases.

Experimental Data Collection
Experimental ECG data was collected from a 26-year-old volunteer with no known cardiovascular disorder (normal subject). The volunteer was informed about the experiment and asked to relax to guarantee lower motion artifact and EMG signal on the data. BIOPAC data acquisition unit (MP36) [30] with BIOPAC electrode lead set (SS2L) and disposable vinyl electrodes (EL503) were used for data collection setting; details can be found in [31].

Data Collection from Online
The data collected at KUET came from a completely healthy subject, i.e., the ECG was normal. It is therefore not possible to model different cardiac dysrhythmias from this normal ECG alone. Thus, ECGs from the MIT-BIH Arrhythmia Database, recorded by the Beth Israel Hospital Arrhythmia Laboratory between 1975 and 1979, were also used; 60% of the total data are from patients who exhibited different cardiovascular diseases. This database contains both normal and abnormal beats, and both were used to assess how well the proposed methods work on different cardiac dysrhythmias. The database is commonly used for ECG modeling and classification at its original sampling frequency [32]. For this work, the 100 series of the MIT-BIH Arrhythmia Database was used. It has 23 datasets with a 360 Hz sampling frequency, each slightly over 30 minutes long. These files are also annotated for different cardiac dysrhythmias, together with age, sex, and medication [33]. A second publicly available database, the University College Dublin Sleep Apnea (UCDSA) database, was also used. This polysomnographic data contains ECG and other physiological signals such as the electroencephalogram (EEG), electrooculogram (EOG), etc. We used only the ECGs, sampled at 128 Hz, from ten subjects (Record Numbers: ucddb002, ucddb003, ucddb005, ucddb007, ucddb009, ucddb010, ucddb011, ucddb013, ucddb014, ucddb015). Note that each dataset was kept at its original sampling frequency; no up-sampling or down-sampling was performed. Our intention is to show how well the proposed model and optimization methods work under different sampling frequencies.
This indicates that the model is independent of the sampling frequency and has the generalization ability to replicate different cardiac dysrhythmias under diverse sampling frequencies.

Denoising
It is inevitable that the wanted signal is mixed with various noises, such as white noise, pink noise, baseline wander, muscle noise, and motion artifact, which to varying degrees cause misjudgment and omission in conventional ECG identification. In this study, the noise was reduced by Discrete Wavelet Transform (DWT) based filtering. The Coiflet mother wavelet [34] of order 6 with 8 levels of decomposition and an adaptive shrinkage rule [31], together with single-level rescaling and a soft thresholding strategy, was used for denoising.

ECG Components Extraction
The proposed model is based on a single ECG beat, so isolating single ECG beats is essential; this was done using the beat time calculated from the QRS complex peaks. First, a starting point (Sp) of the database record was chosen visually, assuming it to be the beginning of an ECG beat, because a record can start in the middle of a beat. Next, the peaks of the QRS complexes were located ($T_{p_1}, T_{p_2}, \ldots, T_{p_N}$) and the ECG beat was isolated by: where $u(t)$ is a unit step function, and $t$ is the time. This concept is similar to [28], except that there it is used for P-wave detection. Instead of defining the starting point visually, they started from zero crossings, as shown in Figure 3. With the zero-crossing technique, the automatic starting point is prone to misdetecting the true start of the P wave for several reasons, such as baseline wander, power-line interference, and other noises. A practical scenario presented in Figure 3 reveals that there can be several false zero-crossings before the start of the P wave. Therefore, due to accuracy concerns, a visual starting point was chosen instead of the automatic starting point.

Results
The optimization performance of ApproxiGlo and ApproxiMul is presented first. Then, the results of different cardiac dysrhythmias modeling are evaluated and compared with contemporary state-of-the-art methods.
A healthy subject's ECG recorded by the BIOPAC data acquisition system is used to evaluate the optimization method's performance and the proposed model. It also uses an ECG waveform selected from the MIT-BIH database [35]. The model parameters are calculated by fitting the model using hybrid methods. For example, the ECG P wave presented in Figure 2 shows how the proposed model with the optimization method fits with the real ECG P wave. The same technique is used for all ECG components (see Figure 4).

To demonstrate the proposed model's efficiency, a wide variety of ECG beats is used, such as normal, atrial premature, paced, and premature ventricular contraction beats collected from the MIT-BIH database; see Figure 5. It is found that the proposed model can adequately fit not only the normal ECG beat but also different cardiac dysrhythmias. In addition, a 1-min ECG collected from UCDSA is presented in Figure 6 to show the performance of the proposed method in a long, multi-beat situation. The inset of Figure 6 shows a zoomed version.


Performance in Time Domain
The proposed model is evaluated by visual inspection and assessed in the time, frequency, and time-frequency domains. The two proposed hybrid optimization methods tune the model parameters, and ECG beats are then created from the optimized model parameters to evaluate the average model performance in the different domains. The relationship between the model parameters and the physiologically and morphologically different ECG waveforms is evaluated through the goodness of fit in the time domain, using the Mean Square Error (MSE), Normalized MSE (NMSE), Root MSE (RMSE), and Normalized Root MSE (NRMSE). The equations for MSE, NMSE, RMSE, NRMSE, and the correlation coefficient (CORR) are given in Appendix A.1. Minimum values of MSE, NMSE, and RMSE and a maximum value of CORR indicate that the model can mimic different real-world ECG waveforms with physiological accuracy. Table 1 presents the average goodness-of-fit values when the model is fitted, using ApproxiGlo and ApproxiMul, to normal ECGs collected from the MIT-BIH Arrhythmia database and the UCDSA database. It also gives a comparison with the nonlinear fitting method presented in [24]. Both ApproxiGlo and ApproxiMul provide improved results compared to the nonlinear fitting method.
Table 1. Comparison between the proposed ApproxiGlo and ApproxiMul and the nonlinear fitting method described in [24] for normal ECGs.
Besides the normal ECG beats collected from the MIT-BIH Arrhythmia database and the UCDSA database, atrial premature beats, paced beats, and premature ventricular contraction beats are also used to fit the proposed model. Optimization using ApproxiMul yields a lower error than ApproxiGlo (Tables 2 and 3). The difference between ApproxiGlo and ApproxiMul is not significant, however, and in the case of premature ventricular contraction ApproxiGlo has the lower error. Both methods have a high correlation coefficient of greater than 0.9.
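The goodness-of-fit measures named above can be computed as in the following Python sketch. The exact definitions used in the paper are in Appendix A.1 and are not reproduced here, so the normalizations are assumptions: NMSE is taken as MSE divided by the mean power of the real signal, NRMSE as RMSE divided by the range of the real signal, and CORR as the Pearson correlation coefficient.

```python
import math

def mse(real, model):
    return sum((r - m) ** 2 for r, m in zip(real, model)) / len(real)

def rmse(real, model):
    return math.sqrt(mse(real, model))

def nmse(real, model):
    """Assumed normalization: MSE divided by the mean power of the real signal."""
    return mse(real, model) / (sum(r * r for r in real) / len(real))

def nrmse(real, model):
    """Assumed normalization: RMSE divided by the range of the real signal."""
    return rmse(real, model) / (max(real) - min(real))

def corr(real, model):
    """Pearson correlation coefficient between real and modeled samples."""
    mean_r = sum(real) / len(real)
    mean_m = sum(model) / len(model)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(real, model))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_m = sum((m - mean_m) ** 2 for m in model)
    return cov / math.sqrt(var_r * var_m)

# Toy data, used only to show the calls.
real = [0.0, 1.0, 2.0, 3.0]
model = [0.0, 1.0, 2.0, 4.0]
scores = {"MSE": mse(real, model), "RMSE": rmse(real, model),
          "NMSE": nmse(real, model), "NRMSE": nrmse(real, model),
          "CORR": corr(real, model)}
```

Lower error values and a correlation closer to 1 indicate a better fit; other normalization conventions would change the NMSE and NRMSE values but not the ranking of two fits on the same beat.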

Performance in Frequency Domain
The significant frequency components of the ECG lie between 0 Hz and 50 Hz [6,36]. Both the model and the real ECG signals offer almost the same frequency response (Figure 7). The normal ECG has a higher amplitude with a lower dominant frequency band of 0 Hz to 10 Hz, in contrast to the atrial premature beat, where the dominant frequency band lies between 0 Hz and 80 Hz. The paced beat shows a higher amplitude with lower dominant frequency content. The premature ventricular contraction beat shows a moderate amplitude with a dominant frequency band between 0 Hz and 30 Hz. Other frequency-domain measurements, such as the power spectral density (PSD), can provide more information. Figure 8 shows that, below 50 Hz, normal and atrial premature beats have almost the same frequency response for both methods and the real ECGs. However, with ApproxiGlo, a severe downward peak occurs at 14.06 Hz and at 46.41 Hz in the PSD plots for the paced beat and the premature ventricular contraction beat, which is unwanted. This does not happen with ApproxiMul, which indicates better synchronization than ApproxiGlo. It enables the ECG model to stay within its allocated spectrum or frequency band and avoid interfering with other frequency components. If proper timing is not maintained, the amplitude and width of the P, Q, R, S, and T waves in the ECG change, which can lead to misdiagnosis. Moreover, cardiac dysrhythmias such as tachycardia and bradycardia are essentially differences in period or frequency, which must be preserved to ensure accurate modeling. Because the model can match the modeled ECG frequency content to that of the real ECG, whatever its frequency components, it is suitable for both normal and abnormal ECG.

Performance in Time-Frequency Domain
For the time-frequency measurement, the scalogram difference (ScD) is chosen to show the time-varying spectral difference between the real and the modeled ECG. To illustrate, in the case of normal ECG modeling using ApproxiGlo (Figure 9a), the Q, R, and S waves exhibit a higher ScD, in the range between $2.5 \times 10^{-4}$ and $5 \times 10^{-4}$, over a more extended scale interval of 60 to 120 scales. The ScD is lower for the P and T waves. This happens because the QRS complex has a higher amplitude and higher frequency content than the P and T waves [37]. In normal ECG modeling, the ScD is lower using ApproxiMul (Figure 9b) than using ApproxiGlo. The atrial premature beat and other physiological or pathophysiological conditions can be interpreted in the same way. Overall, ApproxiGlo has a larger energy difference in the time-frequency domain than ApproxiMul, except for the premature ventricular contraction. The worst case occurred for the paced beat when compared with ApproxiMul. However, in the case of the premature ventricular contraction shown in Figure 9d, the energy of the ScD is higher using ApproxiMul than ApproxiGlo, which is also expected from the time and frequency analysis.

Applications of the Proposed Model
There are several applications for the proposed model. The two most obvious applications are: (i) the ECG generator for simulation purposes and (ii) ECG compression.


Synthetic ECG Generator
A graphical user interface (GUI) in Matlab has been developed to show the proposed system's potential use (Figure 10). Using the parameters of the sum-of-two-Gaussians model, ECG signals are created according to user requirements, such as the type of ECG signal, the sampling frequency, the noise type and strength, and a user-defined beat rate (BPM).
(1) Type of ECG: The GUI has four different ECG types: normal, atrial premature beat, paced beat, and premature ventricular contraction. The ECG coefficients are calculated from the data collected using BIOPAC [30] and from the MIT-BIH database [33]. The coefficient values used to generate a single ECG beat are given in the appendix.
(2) Time Duration: If the user wants to generate the ECG with a fixed time, it is possible using this text box. It accepts positive integer values in seconds. By default, the system always generates ECG for 10 seconds.
(3) Add noise: Checking and unchecking this box enables and disables the noise-adding options. The Signal-to-Noise Ratio (SNR, 4 in Figure 10) and type of noise (5 in Figure 10) settings take effect only when this option is enabled. Without enabling it, the values and options the user sets for the SNR and the type of noise do not affect the generated ECG.
(4) SNR (dB): To give users more flexibility, the SNR of the generated ECG can be controlled by entering a value in the text box. The GUI interprets the value in dB and calculates the noise power using Equation (7):

$$\text{Noise Power} = \frac{P_y}{10^{SNR/10}} \qquad (7)$$

Here, $P_y$ denotes the signal power. For an $N$-point ECG signal $y(n)$, the signal power is measured as the energy per sample, and mathematically it can be expressed as $P_y = \frac{1}{N} \sum_{n=0}^{N-1} |y(n)|^2$.
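The noise-power calculation can be illustrated with a short Python sketch: the signal power is the mean squared sample value, and the noise power follows from the requested SNR in dB. The white-noise generator and all function names here are illustrative assumptions, not the GUI's actual code.

```python
import math
import random

def signal_power(y):
    """Energy per sample: P_y = (1/N) * sum(|y(n)|^2)."""
    return sum(abs(v) ** 2 for v in y) / len(y)

def noise_power_for_snr(y, snr_db):
    """Noise power that yields the requested SNR (in dB) for signal y."""
    return signal_power(y) / (10.0 ** (snr_db / 10.0))

def add_white_noise(y, snr_db, seed=None):
    """Add zero-mean Gaussian white noise whose variance equals the noise power."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_power_for_snr(y, snr_db))
    return [v + rng.gauss(0.0, sigma) for v in y]

# Toy signal with unit power, used only to show the calls.
clean = [1.0, -1.0, 1.0, -1.0]
p_noise_10db = noise_power_for_snr(clean, 10.0)
noisy = add_white_noise(clean, snr_db=10.0, seed=42)
```

For a finite noise realization, the achieved SNR only approximates the requested value; the match improves with the number of samples.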
(5) Type of noise: As noise cancellation in ECG is a popular research topic, a noise-adding option is included in the GUI. ECG generation is the reverse of the calculation of the model parameters, so $c_i$ in Equation (1) can be used as a noise parameter, $d(t)$.
Using this parameter as a noise parameter, the model can support both synthetic (simulated) noise and real noise. This noise parameter helps to generate a more realistic ECG with different types of noise. Six types of noise are chosen to show the effectiveness of this approach: White Noise (WN) and Colored Noise (CN), generated from a mathematical model, and Real Muscle Artifacts (MA), Real Electrode Movements (EM), Real Baseline Wander (BW), and a Mixture of BW, EM, and MA (MX), taken from the MIT-BIH noise stress test database [38]. Figure 11 shows normal ECG generated using the model parameters with different noises; the system uses the same process as [31].
Figure 11. Normal ECG with different types of noise using the model parameters.
(6) Wavelet Analysis: Wavelet analysis can also be performed using this GUI.
(7) Sampling frequency: The user can choose among four sampling frequencies in this GUI: 256 Hz, 360 Hz, 512 Hz, and 1000 Hz. Though the model's sources have fixed sampling frequencies (BIOPAC at 1 kHz and MIT-BIH at 360 Hz), the frequency is varied by resampling, which applies an antialiasing FIR lowpass filter at the desired frequency. This shows that the model is independent of the source sampling frequency.
(8) Plot: The plot button is for showing the generated ECGs.
(9) Save Result: This button saves the generated ECG in mat format so that the user can use it later as needed.
(10) FFT & PSD: This checkbox enables and disables the FFT and Power spectral density (PSD). By seeing these two, the user can understand the frequency domain property of the model generated ECG.
(11) BPM: This text box sets the beats per minute (BPM) of the generated ECG; by default, the system generates a 72 BPM ECG signal. As bradycardia and tachycardia correspond to a low or high BPM, changing the beat rate makes it possible to create bradycardia and tachycardia, as in Figure 12.


ECG Compression
Modeling converts the ECG data into model parameters, from which the ECG beat can later be recreated. This transform can therefore be treated as lossy compression [39]. It also removes noise without any extra work. The compression ratio (CR) is the ratio of the sizes of the two signal representations. For discussion, a one-minute ECG signal is used because heart rate is usually expressed in beats per minute (bpm). If a subject has a heart rate of HR bpm and the signal is sampled at $N_{sample}$ samples per second (Hz), then the CR can be written as:

$$\mathrm{CR} = \frac{60 \, N_{sample} \, B_p}{\left(N_G N_S + N_{SS}\right) HR \, B_p}$$

For the original one-minute ECG, $N_{sample}$ is multiplied by 60, which gives the number of values needed to represent the one-minute signal. Each of these values is multiplied by $B_p$, the precision in bits. The modeled ECG representation, in contrast, does not depend on the number of samples; it depends on the number of model parameters and the heart rate. To recover an ECG beat, the proposed model needs seven Gaussian parameters ($N_G = 7$) for each segment (P, Q, R, S, T), multiplied by the number of segments ($N_S = 5$), plus one segment-size parameter per segment ($N_{SS} = N_S$). The result is multiplied by the number of beats (HR) and by $B_p$. Because the model parameters and the ECG samples are both floating-point values, the same precision $B_p$ is assumed for both, so it cancels in the ratio.
In a practical scenario, the number of beats in one minute can be fractional (for example, 70.80). In that case, the number of segments is rounded up to the next segment of the last ECG beat; in the worst case, it is rounded up to the next whole beat, $\lceil HR \rceil$. Here, the worst-case heart rate $\lceil HR \rceil$ is used instead of the regular HR. It is assumed that the sampling frequency is known and that both representations have an equal header in the file. The studies presented in Table 4 show that different ECG sampling frequencies can support a correct diagnosis. Mahdiani et al. showed that a 50 Hz sampling rate is sufficient for visual inspection and for calculating time-domain heart rate variability parameters, albeit with R-peak deformity [40]. Similarly, a sampling rate of 250 samples per second was found to cause no significant differences, with a reduction in peak amplitudes [41]. On the other hand, Abboud et al. [23] showed that a high sampling rate (1 kHz) is necessary for spectral analysis, and the normal ECG collected in this work is sampled at 1 kHz. If the heart rate ranges from bradycardia (for example, HR = 50) to tachycardia (for example, HR = 120), the CR at 1 kHz ranges from 30 to 12.5. A similar type of compression with a different model achieves a compression ratio of more than 7:1 at a 128 Hz sampling frequency [39]. Clifford et al. [39] thus obtain a better compression ratio than the proposed method. However, their dynamic model is much more complex, and symmetric and asymmetric turning points need to be identified. A typical heart rate of 75 BPM is assumed when comparing the proposed model with other methods in Table 4.
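The compression-ratio arithmetic described above can be checked with a short sketch. The function name and argument names below are illustrative assumptions, not part of the original work; the parameter counts (seven Gaussian parameters, five segments, one size value per segment) and the worst-case rounding of the heart rate follow the text. The precision $B_p$ is omitted because it appears in both the numerator and the denominator.

```python
import math

def compression_ratio(fs_hz, heart_rate_bpm, n_gauss=7, n_segments=5, n_size=None):
    """Ratio of raw one-minute ECG values to stored model parameters.

    fs_hz          -- sampling frequency in samples per second
    heart_rate_bpm -- heart rate; rounded up to the next whole beat (worst case)
    n_gauss        -- Gaussian parameters per segment (7 in the text)
    n_segments     -- segments per beat: P, Q, R, S, T (5 in the text)
    n_size         -- segment-size values per beat (equal to n_segments in the text)
    """
    if n_size is None:
        n_size = n_segments
    beats = math.ceil(heart_rate_bpm)            # worst-case number of beats
    raw_values = 60 * fs_hz                      # samples in one minute
    model_values = (n_gauss * n_segments + n_size) * beats
    return raw_values / model_values

# Values quoted in the text for a 1 kHz recording.
cr_typical = compression_ratio(1000, 75)       # 20.0
cr_brady = compression_ratio(1000, 50)         # 30.0
cr_tachy = compression_ratio(1000, 120)        # 12.5
```

With these counts each beat costs 40 stored values regardless of the sampling frequency, which is why the ratio grows with the sampling rate and shrinks as the heart rate increases.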

Discussions
In this paper, a simplified model for generating different patterns of cardiac dysrhythmias is proposed, together with two hybrid optimization methods for selecting the model parameters. The model can produce different beats such as normal, atrial premature beat, paced beat, and premature ventricular contraction. This section discusses the salient features of the proposed model, its limitations, and future work. One salient feature is the ability to model asymmetric ECG components. For example, the ECG T-wave is asymmetric, and the P wave is slightly asymmetric. The proposed model can replicate such asymmetry because it uses the sum of two Gaussians. A single Gaussian is symmetric and therefore cannot reproduce an asymmetric wave. Moreover, abnormal ECG P-waves such as P mitrale (a sign of left atrial enlargement, usually due to mitral stenosis), P pulmonale (a sign of right atrial enlargement, usually due to pulmonary hypertension, e.g., in chronic respiratory disease), and multifocal atrial rhythms cannot be produced by a single Gaussian wave but can be produced by our model.
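The asymmetry argument can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the parameter layout (two amplitude, center, and width triples plus a constant baseline, seven values in total) and all numeric values are hypothetical choices for illustration and are not taken from the fitted coefficients of this work.

```python
import math

def gaussian(x, amp, center, width):
    """Single Gaussian; symmetric about `center`."""
    return amp * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def two_gaussian(x, a1, c1, w1, a2, c2, w2, baseline=0.0):
    """Sum of two Gaussians plus a constant baseline (assumed 7-parameter layout)."""
    return gaussian(x, a1, c1, w1) + gaussian(x, a2, c2, w2) + baseline

def asymmetry(f, peak, offset):
    """Curve value `offset` to the right of the peak minus the value to the left."""
    return f(peak + offset) - f(peak - offset)

# Hypothetical T-wave-like shape: a broad component plus a narrower, later one.
def t_wave(x):
    return two_gaussian(x, 0.25, 0.30, 0.06, 0.15, 0.36, 0.025)

def single(x):
    return gaussian(x, 0.25, 0.30, 0.06)

grid = [i / 1000.0 for i in range(601)]          # 0 to 0.6 s in 1 ms steps
t_peak = max(grid, key=t_wave)                   # location of the T-wave maximum

asym_single = asymmetry(single, 0.30, 0.05)      # essentially zero
asym_double = asymmetry(t_wave, t_peak, 0.05)    # negative: slow rise, fast fall
```

The single Gaussian returns the same value on both sides of its peak, whereas the two-Gaussian sum falls off faster on the right than it rises on the left, which is the kind of shape a single Gaussian cannot represent.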
Most existing methods are complex, and their results are shown only qualitatively through graphical presentations [9,39,46]. A comparison is presented in Table 5 with additional information and the unique characteristics of each study. Suppaplola et al. used M Gaussians, where M is determined by zero-crossings and based on the NRMSE [6]. Hence, the model is stochastic in nature, and the number of model parameters grows because M is not fixed [6]. The same issues arise in Parvaneh's study [47], in which the number of Gaussians was extended up to 133 for better accuracy [47]. That model has no baseline wander parameters.
On the other hand, Clifford et al. used a 3D state-space model with baseline wander parameters. They used 5 to 7 Gaussians to model the whole ECG signal [9,39,46,48]. Clifford et al. [39] used 6 Gaussian functions to model a one-beat ECG signal with a nonlinear least-squares solver. Clifford et al. [46] used n + 2m Gaussian functions, where n is the number of symmetric turning points and m is the number of asymmetric turning points. They represented each ECG component with a single Gaussian except for the T wave; because of its asymmetric turning point, two Gaussians were employed to model the T wave only. However, the T wave is not the only asymmetric component: other ECG components, such as the P wave, are also slightly asymmetric [49]. Roonizi and Ebadollah [28] developed a faster non-iterative method to fit a Gaussian function riding on a polynomial background. Dubois et al. [12] mainly used the Gaussian Mesa Function (GMF); for the T-wave, however, a bi-Gaussian function (BGF) was used for better performance. Badilini et al. [50] used 6 GMFs to represent the ECG, and the fitting was done by Generalized Orthogonal Forward Regression (GOFR). A GMF has more parameters than a single Gaussian function: a single Gaussian function has three parameters, whereas a GMF has five.

Table 5 (excerpt). End of preceding entry: experimental search; presented visually. Clifford [46]: adaptive determination of the number of Gaussians, p = n + 2m (n = symmetric turning points, m = asymmetric turning points); nonlinear least-squares solver using the lsqnonlin function; a dynamical model in which the asymmetry of the T-wave is considered; can become stuck in local minima because a local optimizer, i.e., lsqnonlin, is used.

The same type of work has also been done by Dubois et al. [51]. Some other works, such as Elda et al. [32], use polynomial functions rather than Gaussian functions to model the ECG. The number of Gaussians used by the proposed model is higher than in some published works. In comparison with the literature, however, the proposed model handles the asymmetry of ECG components by using the sum of two Gaussians. It is also uniform across all ECG components, so there is no need to classify or identify the different components separately. The developed hybrid optimization methods perform better than the most commonly used nonlinear fitting methods. A summary is presented in Table 5. In addition, we compare our two optimization methods with other state-of-the-art Gaussian fitting methods, namely Caruana's method [27], Guo's method [52], and the Fast, Accurate, and Separable (FAS) method [29]. Different performance metrics are calculated, and the results are shown in Table 6. Run-time is included as a measure of computational load. Each program was run 100 times on a normal ECG, and the average results are presented. The run-time was measured on a Core i9 processor with 64 GB of RAM. Table 6 shows that the proposed ApproxiMul method provided the best performance in all metrics except run-time. The FAS method had the lowest run-time but delivered the worst performance in MSE, NMSE, RMSE, and CORR. Therefore, when accuracy and precision are given more importance than run-time, the proposed ApproxiMul method provides the best results. A closer look at ApproxiMul shows that it comprises an approximation method and a multi-start method. The approximation step is fast; most of the time is consumed by the multi-start optimization, which is a global optimization algorithm used to find the global minimum.
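The reason a multi-start stage helps can be seen in a toy example. The sketch below is not the ApproxiMul implementation: it uses a hand-written compass search as a stand-in for the local solver, a one-parameter fit (only the center of a unit Gaussian is optimized), and assumed signal values. It only illustrates how a local search started in the wrong place settles on the wrong wave, while repeating the search from several starting points recovers the lower-error solution.

```python
import math

def gaussian(x, amp, center, width):
    return amp * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Toy signal (assumed values): a tall wave at 0.3 s and a smaller one at 0.7 s.
XS = [i / 500.0 for i in range(501)]
TARGET = [gaussian(x, 1.0, 0.3, 0.03) + gaussian(x, 0.4, 0.7, 0.03) for x in XS]

def sse(center):
    """Sum of squared errors for a unit-amplitude Gaussian placed at `center`."""
    return sum((t - gaussian(x, 1.0, center, 0.03)) ** 2
               for x, t in zip(XS, TARGET))

def local_search(center, step=0.05, tol=1e-6, lo=0.0, hi=1.0):
    """Compass search: move to a better neighbour, otherwise halve the step."""
    best = sse(center)
    while step > tol:
        moved = False
        for cand in (center - step, center + step):
            if lo <= cand <= hi:
                val = sse(cand)
                if val < best:
                    center, best, moved = cand, val, True
                    break
        if not moved:
            step /= 2.0
    return center, best

def multi_start(starts):
    """Run the local search from every start and keep the lowest-error result."""
    return min((local_search(s) for s in starts), key=lambda r: r[1])

single_center, single_err = local_search(0.8)                        # one start
best_center, best_err = multi_start([i / 10.0 for i in range(11)])  # 11 starts
```

Started at 0.8 s, the local search locks onto the smaller wave near 0.7 s; the multi-start run returns the center near 0.3 s with a lower error. The price is the run-time of the repeated local searches, which is consistent with the observation above that most of the time is spent in the multi-start stage.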
Our proposed method is simple, easy to implement, and has localization capability, meaning that it can simulate a long-duration ECG signal on a beat-by-beat basis and even on a component-by-component basis. This feature can be helpful in detecting and diagnosing some conditions such as sleep apnea. However, this study has some limitations. The limitations and possible extensions are listed below:

• In this study, the ECG components (P, Q, R, S, T) are extracted manually. However, this problem can be solved by the methods presented in [53][54][55].

• Another limitation of this study is the number of ECG beats used. Only a few ECG beats were taken into account for model fitting and optimization. However, the aim of this work is to find an optimization method for ECG model fitting and to explore the use of the optimized model in simulating different cardiac dysrhythmias, for example, atrial premature beat, paced beat, etc.

• The automatic classification of different types of ECG beats could be implemented with the proposed method. In that case, the model parameters can be treated as features. These features can be used to train classifiers such as fuzzy-hybrid neural networks [56], support vector machines (SVM) [57,58], a light gradient boosting machine [59], and the Bayes maximum-likelihood (ML) classifier [32]. In addition, prominent features can be selected by feature selection algorithms such as the mRMR method and the Jaya algorithm [60][61][62] to increase the classification accuracy.

• A dictionary can be built based on our model to represent and classify cardiac dysrhythmias (an approach called matching pursuit) [63].

• The proposed model can be used in model-based signal denoising [39].

Conclusions
This paper proposes a simplified mathematical model for generating ECG signals in different cardiac dysrhythmias. In addition, two hybrid optimization methods are proposed and compared with nonlinear fitting and other optimization algorithms. The experimental results show that the proposed model can replicate the essential features of the human ECG. The model and optimization methods were tested on three datasets with different sampling frequencies and outperformed the compared methods on every dataset. This indicates that the model is independent of the sampling frequency and generalizes to different cardiac dysrhythmias. The model fits the normal ECG with an average MSE of 0.0023 and atrial fibrillation with an average MSE of 0.0291, which indicates the effectiveness of this simplified model. Moreover, many morphological changes, such as atrial premature beat, paced beat, and premature ventricular contraction, can be fitted by selecting proper model parameters. With the baseline drift factor included, the model can fit real ECGs effectively. A MATLAB-based GUI was developed to show the potential use of the proposed model. The model can also achieve a data compression ratio of 20:1 at a 1 kHz sampling frequency and 75 BPM, and it outperforms other studies at high sampling rates. Only a small number of ECG beat types were taken into account for model fitting and optimization. The aim of this work is to find an optimization method for ECG model fitting and to explore the possibility of using the optimized model to simulate cardiac dysrhythmias other than the fitted ones, for example, ventricular hypertrophy, ventricular fusion beat, etc. The proposed model can be used as a supplementary tool for medical education, for testing, and for simulating intracardiac signals. In this work, the physiological problem of beat-to-beat changes over time is not discussed.
Some diagnostic problems, e.g., sleep apnea, become visible in long-duration ECG, such as 1-minute or 5-minute recordings, by detecting bradycardia and tachycardia events. The model can replicate such long-duration phenomena by simulating each ECG beat and fitting the time-varying beats one at a time. A long-duration ECG comprising normal and abnormal beats can likewise be simulated by the model on a beat-by-beat basis. Even changes in individual ECG components can be simulated because of the localized nature of the model. Finally, the optimization of Gaussians is not limited to ECG signals; it can be widely used to represent many natural phenomena and industrial processes. For example, Gaussians can model an approximation of the Airy disk in image processing and microscopy applications, fluorescence dispersion in flow cytometric DNA histograms, and the laser heat source in laser transmission welding. Therefore, the proposed model and optimization methods can also be used in those applications.
The normalized form of the MSE (NMSE) is also used. Another measurement is the root mean square error (RMSE), which is the square root of the MSE, together with its normalized version (NRMSE).
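As a reference for the time-domain metrics named above, a minimal sketch is given below. The defining equations are not reproduced in this section, so the normalizations are assumptions: the NMSE divides the error energy by the energy of the real signal, and the NRMSE divides the RMSE by the peak-to-peak range of the real signal. Other conventions exist, and the published definitions may differ. The correlation coefficient (CORR) is the usual Pearson coefficient.

```python
import math

def mse(real, model):
    """Mean square error between the real and modeled samples."""
    return sum((r - m) ** 2 for r, m in zip(real, model)) / len(real)

def nmse(real, model):
    """Assumed normalization: error energy over the energy of the real signal."""
    err = sum((r - m) ** 2 for r, m in zip(real, model))
    return err / sum(r ** 2 for r in real)

def rmse(real, model):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(real, model))

def nrmse(real, model):
    """Assumed normalization: RMSE over the peak-to-peak range of the real signal."""
    return rmse(real, model) / (max(real) - min(real))

def corr(real, model):
    """Pearson correlation coefficient between the real and modeled samples."""
    mean_r = sum(real) / len(real)
    mean_m = sum(model) / len(model)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(real, model))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_m = sum((m - mean_m) ** 2 for m in model)
    return cov / math.sqrt(var_r * var_m)

# Tiny illustrative vectors (not ECG data).
real_signal = [0.0, 1.0, 2.0, 3.0]
model_signal = [0.0, 1.0, 2.0, 5.0]
metrics = {
    "MSE": mse(real_signal, model_signal),
    "NMSE": nmse(real_signal, model_signal),
    "RMSE": rmse(real_signal, model_signal),
    "NRMSE": nrmse(real_signal, model_signal),
    "CORR": corr(real_signal, model_signal),
}
```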

Appendix A.1.2. Frequency-Domain Metrics
To evaluate the performance in the frequency domain, the model and the signal are transformed with the Fourier transform (FT). The Fourier transform of a Gaussian function is also a Gaussian function, as expressed in Equation (A6). As can be observed from Equation (A6), the width $b_i$ in the time domain is inversely related to the width in the frequency domain, and the frequency-domain Gaussian function $X(\omega)$ is not shifted. Note that Equation (A6) gives the FT of a single Gaussian function. Because the FT is a linear operator, the sum of two Gaussians in Equation (1) is also a sum of two Gaussians in the frequency domain. Mathematically, $\mathcal{F}\{A_1 f_1(x) + A_2 f_2(x)\} = A_1 F_1(X) + A_2 F_2(X)$, where $F_1(X)$ and $F_2(X)$ are the FTs of the first and second Gaussian functions $f_1(x)$ and $f_2(x)$, and $A_1$ and $A_2$ are the amplitudes of the Gaussians, respectively. We also used the power spectral density (PSD).
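As an illustration of the width relationship described above, the standard transform pair can be written out under two assumptions that are not stated in this section: each Gaussian component has the form $f_i(x) = A_i \exp\left(-(x-\mu_i)^2 / (2 b_i^2)\right)$ with a hypothetical center $\mu_i$, and the transform convention is $X(\omega) = \int f(x)\, e^{-j\omega x}\, dx$. The published Equation (A6) may use a different parameterization or scaling.

```latex
\begin{align}
  X_i(\omega) &= \int_{-\infty}^{\infty} A_i \exp\!\left(-\frac{(x-\mu_i)^2}{2 b_i^2}\right) e^{-j\omega x}\, dx
               = A_i\, b_i \sqrt{2\pi}\; \exp\!\left(-\frac{b_i^2 \omega^2}{2}\right) e^{-j\omega\mu_i}, \\
  \lvert X_i(\omega) \rvert &= A_i\, b_i \sqrt{2\pi}\; \exp\!\left(-\frac{\omega^2}{2\,(1/b_i)^2}\right).
\end{align}
```

Under these assumptions the magnitude spectrum is a Gaussian centered at $\omega = 0$ with width $1/b_i$, so a narrow wave in time (small $b_i$, such as the R wave) has a broad spectrum, and the time shift $\mu_i$ appears only in the phase term $e^{-j\omega\mu_i}$.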
Appendix A.1.3. Time-Frequency Domain Measure
A qualitative distortion measure named scalogram difference (ScD) is used to assess the performance in the time-frequency domain. ScD is computed using the continuous wavelet transform (CWT) and gives the percentage of energy difference for each coefficient of the real and model signals. The CWT is chosen because of its good time and frequency localization, which helps localize and visualize the time-varying spectral changes in the ECG signal. The CWT does not suffer from cross-term interference, and by applying a variable window it presents a signal in the time-frequency plane more flexibly than the short-time Fourier transform (STFT) or spectrogram [64]. The ScD is a two-dimensional time-scale matrix and is therefore a handy tool for evaluating model performance in the time-scale domain: one can easily visualize and locate the magnitude of change between the real and model ECG. A lower value represents a higher quality of fit [24]. ScD at scale a is defined in terms of $CWT_{org}$, the wavelet coefficients of the original (real) signal at scale a, $CWT_{model}$, the wavelet coefficients of the reconstructed signal at that scale, and $n_a$, the total number of coefficients at scale a.

Appendix A.2. Coefficients Used to Fit the Models
The coefficients used for the GUI are shown in Tables A1-A4. Note that the model requires only the first seven parameters, i.e., X(1) to X(7).