A Novel Approach for Multi-Lead ECG Classification Using DL-CCANet and TL-CCANet

Cardiovascular disease (CVD) has become one of the most serious diseases threatening human health. Over the past decades, over 150 million humans have died of CVDs. Hence, timely prediction of CVDs is especially important. Deep learning-based CVD diagnosis methods are now widely used; however, most of them can only utilize single-lead ECGs, so the potential information in the other leads is left unused. To address this issue, we have developed novel methods for diagnosing arrhythmia. In this work, DL-CCANet and TL-CCANet are proposed to extract abstract discriminating features from dual-lead and three-lead ECGs, respectively. A linear support vector machine, which is well suited to high-dimensional features, is then used as the classifier. On the MIT-BIH database, an overall accuracy of 95.2% is obtained in detecting 15 types of heartbeats using DL-CCANet. On the INCART database, overall accuracies of 94.01% (II and V1 leads), 93.90% (V1 and V5 leads) and 94.07% (II and V5 leads) are achieved in detecting seven types of heartbeats using DL-CCANet, while TL-CCANet yields a higher overall accuracy of 95.52% using all three leads. In addition, all of the above experiments are carried out on noisy ECG data. The proposed methods have the potential to be applied in clinical settings and on mobile devices.


Introduction
Cardiovascular disease (CVD) is one of the most common causes of death worldwide. According to the relevant statistical data, more than 17 million people die of cardiovascular disease each year, accounting for approximately 1/3 of total deaths [1]. More seriously, some experts predict that the number of CVD patients is still growing and will reach 23 million by the year 2030 [2]. Hence, it is of profound significance to diagnose and treat CVDs in a timely manner. Currently, electrocardiograms (ECGs) are widely used by medical specialists to diagnose abnormal states of the heart. However, the ECG acquisition process is inevitably affected by interference and noise. Meanwhile, the morphological changes of ECGs are not easily observable under certain disease conditions, such as atrial premature beats, because capturing visually subtle changes in long-term ECG signals is hard and time-consuming. To address these issues, the development of computer-aided devices for monitoring long-term ECGs has been receiving increasing attention from researchers in the field.

90%
Pławiak [18] | 2018 | Frequency components of the power spectral density of the ECG signal | Genetic ensemble of SVM classifiers optimized by sets | 17 | 91%
Yildirim et al. [14] | 2018 | Rescaling raw data | 1D-CNN | 17 | 91%
Pławiak [5] | 2019 | Frequency components of the PSD of ECG signal | Deep genetic ensemble of classifiers (DGEC) | 17 | 95%

Although many methods exist for the detection of arrhythmias, it is still essential to further explore this field. Firstly, ECGs can be recorded as different leads according to the location of acquisition on the body, so that different leads record different information about the state of the heart. Among these leads, the limb leads (I, II, and III) reflect changes in the ECG from the front of the heart, while the chest leads (V1~V6) record changes of the ECG in the cross-section of the heart. In addition, within the same type of lead (limb leads or chest leads), a change in the acquisition location leads to certain differences in the potential information provided by different leads. For example, the ECGs of the V1 and V5 leads are both collected from the chest. However, the V1 lead contains more changes in the state of the left ventricle, while the V5 lead mainly records the potential information of the right ventricle. Hence, compared to one-lead ECGs, multi-lead ECGs can better reflect the state of the heart. Since the above methods can only process single-lead ECG signals, the information in the other leads, which reflects differences in cardiac state, is not effectively utilized. Meanwhile, the quality of the signal collected from a certain lead may be poor, leading to possible difficulties in real-scenario arrhythmia detection using one-lead ECGs. Secondly, most of the above classification methods require denoising algorithms to be applied to the ECG signals in advance [19], possibly because they can hardly deal with noisy data. To address these

Preprocessing
In this work, we adopted the R peak positions recorded in the annotations of the ECG signals as the fiducial points. For each signal, we take $S_1$ samples to the left and $S_2$ samples to the right of each R peak as a heartbeat. Next, each sample $x_i^h$ of a heartbeat is normalized as per Equation (1), where $NEW\,x_i^h$ represents the normalized i-th sample in a heartbeat, and $\min(x^h)$ and $\max(x^h)$ represent the samples with minimum and maximum amplitudes, respectively. The values of $S_1$ and $S_2$ for the MIT-BIH database and the INCART database are shown in Table 2. The mapminmax function of MATLAB is used for normalization.
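The segmentation and normalization steps above can be sketched in Python as follows. This is an illustration only, not the authors' MATLAB code: the function names, the convention that the window covers samples r − S1 to r + S2 − 1, and the default output range [0, 1] are assumptions. MATLAB's mapminmax maps to [−1, 1] by default, which corresponds to `lo=-1.0, hi=1.0` here.

```python
def segment_heartbeat(signal, r_peak, s1, s2):
    """Cut one heartbeat around an annotated R peak.

    Assumed convention: s1 samples before the R peak, then s2 samples
    starting at the R peak itself. Returns None if the window would
    run past either end of the recording.
    """
    start, stop = r_peak - s1, r_peak + s2
    if start < 0 or stop > len(signal):
        return None
    return list(signal[start:stop])


def minmax_normalize(beat, lo=0.0, hi=1.0):
    """Min-max normalize one heartbeat to the range [lo, hi]."""
    b_min, b_max = min(beat), max(beat)
    if b_max == b_min:
        # Flat segment: avoid division by zero, map everything to lo.
        return [lo for _ in beat]
    scale = (hi - lo) / (b_max - b_min)
    return [lo + (x - b_min) * scale for x in beat]


# Toy signal with an R peak at index 3 (hypothetical values).
toy_signal = [0, 1, 2, 10, 3, 1, 0, 0]
toy_beat = segment_heartbeat(toy_signal, r_peak=3, s1=2, s2=3)
toy_norm = minmax_normalize(toy_beat)
```

Beats whose window would fall outside the recording are skipped here (returned as `None`); how the source handles such boundary beats is not stated.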

DL-CCANet
CCANet, designed for processing two-view images, was first proposed by Yang et al. in 2017 [20]. Compared to the one-view image-based PCANet and RandNet, CCANet yields superior performance in recognizing images. A standard CCANet consists of two cascaded convolutional layers and an output layer: (1) in the convolutional layers, the CCA technique is used to extract dual-lead filter banks; (2) in the output layer, the CCA features extracted from the second convolutional layer are mapped into the final feature vector. In this work, we adopted CCANet as the feature extraction method for dual-lead ECGs. Because the 2D convolutional process in CCANet can only handle 2D matrices, an input layer is introduced before the first convolutional layer, by which the complete dual-lead CCANet (DL-CCANet) is obtained. The structure of the DL-CCANet is shown in Figure 2.
Sensors 2019, 19, 3214
(1) Input layer
In this layer, each heartbeat containing m × n samples is reshaped into an ECG matrix $I_i^h$, $i = 1, 2, \dots, N$, $h = 1, 2$, with a size of m × n, which represents the i-th heartbeat in the h-th ECG lead.
(2) First convolutional layer
Initial stage: Given the two-lead ECG matrices $I_i^1$ and $I_i^2$, a $t_1 \times t_2$ patch is adopted to extract a series of local feature blocks by centering on each pixel. We then carry out zero-center and vectorization operations on each local block, by which a set of pending vectors $\bar{x}^h_{i,1}, \bar{x}^h_{i,2}, \dots, \bar{x}^h_{i,mn} \in \mathbb{R}^{t_1 t_2}$ is obtained. Next, the pending vectors of the i-th heartbeat in the h-th lead are concatenated into $X_i^h = [\bar{x}^h_{i,1}, \bar{x}^h_{i,2}, \dots, \bar{x}^h_{i,mn}] \in \mathbb{R}^{t_1 t_2 \times mn}$, and those of all the heartbeats into the pending matrix $X^h = [X_1^h, X_2^h, \dots, X_N^h] \in \mathbb{R}^{t_1 t_2 \times Nmn}$.
Filter extraction stage: The CCA filters are obtained by reshaping several canonical vectors. With the constraints $a_1^T S_{11} a_1 = 1$ and $b_1^T S_{22} b_1 = 1$, the first pair of canonical vectors is calculated by the CCA model $\max \rho(a_1, b_1) = a_1^T S_{12} b_1$, where $S_{ij} = X^i (X^j)^T$, and $a_1$ and $b_1$ are the first canonical vectors for the two ECG leads, respectively. As per Equation (2), a Lagrange multiplier technique with multipliers λ and ν is adopted to solve the above CCA model. To obtain the maximum value θ, we need to solve Equation (3).
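For reference, the Lagrangian and its stationarity conditions can be written in the following standard CCA form. This is a sketch rather than a reproduction of the source's Equations (2) and (3): the use of θ as the name of the Lagrangian and the scaling of the multipliers by 1/2 are assumptions, chosen so that the result agrees with the eigenvalue problem $S_{11}^{-1} S_{12} S_{22}^{-1} S_{21} a_1 = \lambda^2 a_1$ given in the text.

```latex
\begin{align*}
\theta(a_1, b_1, \lambda, \nu) &= a_1^{T} S_{12} b_1
  - \frac{\lambda}{2}\left(a_1^{T} S_{11} a_1 - 1\right)
  - \frac{\nu}{2}\left(b_1^{T} S_{22} b_1 - 1\right), \\
\frac{\partial \theta}{\partial a_1} &= S_{12} b_1 - \lambda S_{11} a_1 = 0, \qquad
\frac{\partial \theta}{\partial b_1} = S_{21} a_1 - \nu S_{22} b_1 = 0 .
\end{align*}
```

Left-multiplying the first condition by $a_1^T$ and the second by $b_1^T$ and applying the two constraints gives $\lambda = \nu = a_1^T S_{12} b_1$. Substituting $b_1 = \lambda^{-1} S_{22}^{-1} S_{21} a_1$ into the first condition then yields $S_{11}^{-1} S_{12} S_{22}^{-1} S_{21} a_1 = \lambda^2 a_1$, and the symmetric argument gives the corresponding equation for $b_1$.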
Then, Equation (3) is converted into the forms $S_{11}^{-1} S_{12} S_{22}^{-1} S_{21} a_1 = \lambda^2 a_1$ and $S_{22}^{-1} S_{21} S_{11}^{-1} S_{12} b_1 = \lambda^2 b_1$, by which we obtain the canonical vectors $a_1$ and $b_1$. Next, the subsequent canonical vectors $a_l$ and $b_l$, $l > 1$, can be calculated by solving the CCA model with the combination of the Lagrange multiplier technique and the new constraints $a_i^T S_{11} a_l = b_i^T S_{22} b_l = 0$, $i < l$. As per Equation (4), all the $2L_1$ canonical vectors ($a_l$ and $b_l$, $l = 1, 2, \dots, L_1$) are mapped into matrices $W_l^1$ and $W_l^2$, $l = 1, 2, \dots, L_1$, which are adopted as the CCA filters corresponding to the two ECG leads, respectively:
$$W_l^1 = \mathrm{mat}_{k_1 k_2}(a_l), \quad W_l^2 = \mathrm{mat}_{k_1 k_2}(b_l), \quad l = 1, 2, \dots, L_1, \tag{4}$$
where $\mathrm{mat}_{k_1 k_2}(\cdot)$ reshapes the vectors $a_l$ and $b_l$ into the matrices $W_l^1$ and $W_l^2$, and $L_1$ is the number of CCA filters.
Convolutional stage: The preliminary feature blocks (PFBs) are obtained as per $I_{i,l}^h = I_i^h * W_l^h$, $l = 1, 2, \dots, L_1$, where $*$ denotes convolution.
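The three stages of the first convolutional layer can be sketched with NumPy as below. This is a hedged illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, patches are zero-padded at the borders, patch sizes are assumed odd, a small ridge term `reg` is added to $S_{11}$ and $S_{22}$ so that they are invertible, and random arrays stand in for real heartbeats.

```python
import numpy as np


def extract_patches(img, t1, t2):
    """Return a (t1*t2, m*n) matrix of zero-centered, vectorized patches,
    one patch centered on each pixel of the zero-padded m x n image."""
    m, n = img.shape
    padded = np.pad(img, ((t1 // 2, t1 // 2), (t2 // 2, t2 // 2)))
    cols = []
    for r in range(m):
        for c in range(n):
            patch = padded[r:r + t1, c:c + t2].reshape(-1)
            cols.append(patch - patch.mean())
    return np.stack(cols, axis=1)


def cca_filters(X1, X2, num_filters, t1, t2, reg=1e-6):
    """Leading canonical vectors of (X1, X2), reshaped into t1 x t2 filters."""
    d = X1.shape[0]
    S11 = X1 @ X1.T + reg * np.eye(d)  # ridge term: an assumption for stability
    S22 = X2 @ X2.T + reg * np.eye(d)
    S12 = X1 @ X2.T
    # Eigenproblem S11^-1 S12 S22^-1 S21 a = lambda^2 a
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S12.T)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:num_filters]
    A = eigvecs[:, order].real
    B = np.linalg.solve(S22, S12.T @ A)  # b is proportional to S22^-1 S21 a
    # Rescale so that a^T S11 a = 1 and b^T S22 b = 1.
    A = A / np.sqrt(np.sum(A * (S11 @ A), axis=0))
    B = B / np.sqrt(np.sum(B * (S22 @ B), axis=0))
    W1 = [A[:, l].reshape(t1, t2) for l in range(num_filters)]
    W2 = [B[:, l].reshape(t1, t2) for l in range(num_filters)]
    return W1, W2


def convolve_same(img, filt):
    """Same-size 2D convolution with zero padding (odd-sized kernel)."""
    t1, t2 = filt.shape
    m, n = img.shape
    padded = np.pad(img, ((t1 // 2, t1 // 2), (t2 // 2, t2 // 2)))
    flipped = filt[::-1, ::-1]  # convolution flips the kernel
    out = np.empty((m, n))
    for r in range(m):
        for c in range(n):
            out[r, c] = np.sum(padded[r:r + t1, c:c + t2] * flipped)
    return out


# Synthetic stand-in data: N heartbeats per lead, each reshaped to m x n.
rng = np.random.default_rng(0)
N, m, n, t1, t2, L1 = 20, 6, 8, 3, 3, 2
lead1 = rng.standard_normal((N, m * n)).reshape(N, m, n)
lead2 = 0.7 * lead1 + 0.3 * rng.standard_normal((N, m, n))

X1 = np.hstack([extract_patches(I, t1, t2) for I in lead1])
X2 = np.hstack([extract_patches(I, t1, t2) for I in lead2])
W1, W2 = cca_filters(X1, X2, L1, t1, t2)
pfbs_lead1 = [[convolve_same(I, W) for W in W1] for I in lead1]
pfbs_lead2 = [[convolve_same(I, W) for W in W2] for I in lead2]
```

The eigen-decomposition is used here instead of solving for each canonical pair one at a time; for distinct eigenvalues, the eigenvectors it returns satisfy the orthogonality constraints on successive pairs up to the effect of the ridge term.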
(3) Second convolutional layer
Initial stage: The PFBs $I_{i,l}^h = I_i^h * W_l^h$, $l = 1, 2, \dots, L_1$, are employed as the input of this layer, and an initial operation similar to that of the previous layer is carried out to obtain the pending matrices $Y^h$, $h = 1, 2$, with $Y^1 = [Y_1^1, Y_2^1, \dots]$ and $Y^2$ formed in the same way.
Filter extraction stage: Similarly, taking $Y^h$, $h = 1, 2$, as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with $S_{ij} = Y^i (Y^j)^T$, by which the canonical vectors $c_\ell$ and $d_\ell$ can be obtained. Then, we calculate the CCA filters $V_\ell^1$ and $V_\ell^2$, $\ell = 1, 2, \dots, L_2$, as per Equation (5):
$$V_\ell^1 = \mathrm{mat}_{k_1 k_2}(c_\ell), \quad V_\ell^2 = \mathrm{mat}_{k_1 k_2}(d_\ell), \quad \ell = 1, 2, \dots, L_2, \tag{5}$$
where the meanings of $\mathrm{mat}_{k_1 k_2}(\cdot)$ and $L_2$ are the same as in Equation (4).
Convolutional stage: Similarly, the secondary feature blocks (SFBs) are calculated as per $\{I_{i,l}^1 * V_\ell^1,\ I_{i,l}^2 * V_\ell^2\}$, $\ell = 1, 2, \dots, L_2$, where $I_{i,l}^1 * V_\ell^1$ and $I_{i,l}^2 * V_\ell^2$ are concatenated as a whole.
(4) Output layer
In this layer, each SFB is converted into several decimal matrices as per $T_{i,l} = \sum_{\ell=1}^{L_2} 2^{\ell-1} H([I_{i,l}^1 * V_\ell^1,\ I_{i,l}^2 * V_\ell^2])$, where the function $H(\cdot)$ maps $I_{i,l}^1 * V_\ell^1$ and $I_{i,l}^2 * V_\ell^2$ onto binary images as per Equation (6). Ultimately, the final feature vector is obtained as per $f_i = [\mathrm{Bhist}(T_1), \mathrm{Bhist}(T_2), \dots]$, where Bhist(·) denotes block segmentation (with a block size of $u_1 \times u_2$) at a fixed overlap rate R followed by a histogram statistics approach, and B is the number of blocks collected from $T_{i,j}$. The workflow of the DL-CCANet is summarized in Algorithm 1.
Algorithm 1. The algorithm of two convolutional stages of the DL-CCANet.

Input: Raw two-lead heartbeats $A_i^h$, $h = 1, 2$, $i = 1, 2, \dots, N$
Output: $f_i$
1: Form ECG matrix $I_i^h$
2: for the first convolutional stage do
3:   Form the two-lead pending matrices $X^h$
4:   Compute the covariance matrix $S_{ij}$ of $X^i$ and $X^j$
5:   Solve the CCA model by the Lagrange multiplier technique to obtain the two-lead projection directions a, b
6:   Construct two-lead filter banks $W_l^h$, $h = 1, 2$, $l = 1, 2, \dots, L_1$
7:   Calculate the preliminary feature blocks of the first convolutional stage $I_{i,l}^h = I_i^h * W_l^h$
8: end for
9: for the second convolutional stage do
10:  Form the two-lead pending matrices $Y^h$
11:  Compute the covariance matrix $S_{ij}$ of $Y^i$ and $Y^j$
12:  Solve the CCA model to obtain the two-lead projection directions c, d
13:  Construct two-lead filter banks $V_\ell^h$, $h = 1, 2$, $\ell = 1, 2, \dots, L_2$
14:  Calculate the output of the second convolutional stage
15: end for
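The output layer, which follows the two convolutional stages of Algorithm 1, can be illustrated with a short pure-Python sketch of binary hashing followed by block-wise histograms. Several details are assumptions rather than statements from the source: H(·) is taken to be the usual step function (1 for positive inputs, 0 otherwise), the weights are powers of two starting at 2^0, the block stride is derived from the overlap rate as round(u · (1 − R)), and the function names are hypothetical. The input `feature_maps` is assumed to hold the L2 already-concatenated two-lead responses for one PFB.

```python
def heaviside(x):
    """Step function assumed for H(.): 1 for positive inputs, else 0."""
    return 1 if x > 0 else 0


def hash_to_decimal(feature_maps):
    """Combine L2 same-sized maps into one decimal-valued map.

    Each output pixel is sum over l of 2**l * H(map_l[pixel]) with l
    starting at 0, so values lie in the range 0 .. 2**L2 - 1.
    """
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    out = [[0] * cols for _ in range(rows)]
    for ell, fmap in enumerate(feature_maps):
        weight = 2 ** ell
        for r in range(rows):
            for c in range(cols):
                out[r][c] += weight * heaviside(fmap[r][c])
    return out


def block_histograms(decimal_map, u1, u2, overlap, num_bins):
    """Slide a u1 x u2 block over the map and concatenate the histograms.

    The stride along each axis is round(u * (1 - overlap)), at least 1.
    """
    rows, cols = len(decimal_map), len(decimal_map[0])
    step_r = max(1, round(u1 * (1 - overlap)))
    step_c = max(1, round(u2 * (1 - overlap)))
    feats = []
    for r0 in range(0, rows - u1 + 1, step_r):
        for c0 in range(0, cols - u2 + 1, step_c):
            hist = [0] * num_bins
            for r in range(r0, r0 + u1):
                for c in range(c0, c0 + u2):
                    hist[decimal_map[r][c]] += 1
            feats.extend(hist)
    return feats


# Two toy 2 x 2 response maps (L2 = 2), hypothetical values.
toy_maps = [
    [[0.5, -1.0], [2.0, -0.1]],
    [[-0.3, 0.2], [1.0, -4.0]],
]
toy_decimal = hash_to_decimal(toy_maps)
toy_feature = block_histograms(toy_decimal, u1=2, u2=2, overlap=0.0, num_bins=4)
```

With L2 maps, each decimal value needs 2^L2 histogram bins, so the feature length grows as (number of blocks) × 2^L2 per decimal matrix.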

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).
Convolutional stage: Similarly, the secondary feature blocks (SFBs) are calculated as per , = , * ℓ , , * ℓ ℓ , = 1,2, … , , where , * ℓ and , * ℓ are concatenated as a whole. , where Bhist(•) denotes the block segmentation (the size of the block is u1 × u2)fhg with a fixed overlap rate R and a histogram statistics approach, and B is the number of blocks collected from Ti,j. The laconic workflow of the DL-CCANet is illustrated in Algorithm 1. Algorithm 1. The algorithm of two convolutional stages of the DL-CCANet.

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).
Convolutional stage: Similarly, the secondary feature blocks (SFBs) are calculated as per , = , * ℓ , , * ℓ ℓ , = 1,2, … , , where , * ℓ and , * ℓ are concatenated as a whole. , where Bhist(•) denotes the block segmentation (the size of the block is u1 × u2)fhg with a fixed overlap rate R and a histogram statistics approach, and B is the number of blocks collected from Ti,j. The laconic workflow of the DL-CCANet is illustrated in Algorithm 1. Algorithm 1. The algorithm of two convolutional stages of the DL-CCANet.

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj Filter extraction stage: Similarly, taking Y h , h = 1,2 as the object to be processed, we employ the Lagrange multiplier technique to solve the CCA model with = ( )( ) , by which the canonical vectors cl and dl can be obtained. Then, we calculate the CCA filters ℓ , = 1,2, … , and ℓ , = 1,2, … , as per Equation (5): where the meanings of (•) and L2 are the same as in Equation (4).
Convolutional stage: Similarly, the secondary feature blocks (SFBs) are calculated as per , = , * ℓ , , * ℓ ℓ , = 1,2, … , , where , * ℓ and , * ℓ are concatenated as a whole. , where Bhist(•) denotes the block segmentation (the size of the block is u1 × u2)fhg with a fixed overlap rate R and a histogram statistics approach, and B is the number of blocks collected from Ti,j. The laconic workflow of the DL-CCANet is illustrated in Algorithm 1. Algorithm 1. The algorithm of two convolutional stages of the DL-CCANet.

7:
Calculate the preliminary feature blocks of the first convolutional stage , = * 8: end for 9: for the second convolutional stage do 10: Form the two-lead pending matrices 11: Compute the covariance matrix sij of Yi and Yj )

18:
Construct the histogram vector f i
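The output-layer operations described above (binarization, weighted summation into decimal matrices, and block-wise histograms) can be illustrated with a small sketch. This is a minimal pure-Python illustration under stated assumptions, not the authors' MATLAB implementation: the function names, the toy response values, and the conversion of the overlap rate R into a stride are all hypothetical, and each response matrix is assumed to already hold the concatenated two-lead map.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def heaviside(x: float) -> int:
    """Binarize one filter response: 1 for positive values, 0 otherwise."""
    return 1 if x > 0 else 0


def hash_to_decimal(responses: List[Matrix]) -> List[List[int]]:
    """Combine L2 response maps into one decimal-valued matrix.

    responses[k] is the (assumed, already concatenated) map produced by the
    (k+1)-th second-stage filter; its binarized value gets weight 2**k.
    """
    rows, cols = len(responses[0]), len(responses[0][0])
    decimal = [[0] * cols for _ in range(rows)]
    for k, response in enumerate(responses):
        weight = 2 ** k
        for r in range(rows):
            for c in range(cols):
                decimal[r][c] += weight * heaviside(response[r][c])
    return decimal


def stride_from_overlap(block_len: int, overlap_rate: float) -> int:
    """Hypothetical helper: turn an overlap rate R into a step of at least 1."""
    return max(1, int(round(block_len * (1.0 - overlap_rate))))


def block_histogram(decimal: List[List[int]], block: Tuple[int, int],
                    stride: Tuple[int, int], n_bins: int) -> List[int]:
    """Slide a block over the decimal matrix and concatenate the histograms."""
    rows, cols = len(decimal), len(decimal[0])
    block_h, block_w = block
    step_h, step_w = stride
    feature: List[int] = []
    for top in range(0, rows - block_h + 1, step_h):
        for left in range(0, cols - block_w + 1, step_w):
            hist = [0] * n_bins
            for r in range(top, top + block_h):
                for c in range(left, left + block_w):
                    hist[decimal[r][c]] += 1
            feature.extend(hist)
    return feature


# Toy example with L2 = 2 response maps of size 2 x 4 (made-up values).
resp1 = [[0.5, -1.0, 2.0, -0.1], [-0.3, 0.7, 0.0, 1.2]]
resp2 = [[-0.2, 0.4, 1.0, -3.0], [0.9, 0.1, -0.5, -0.6]]
decimal = hash_to_decimal([resp1, resp2])   # [[1, 2, 3, 0], [2, 3, 0, 1]]
feature = block_histogram(decimal, block=(2, 2), stride=(2, 2), n_bins=4)
# feature == [0, 1, 2, 1, 2, 1, 0, 1]
```

With L2 filters, each decimal matrix takes values in the range 0 to 2^L2 - 1, so every block contributes a histogram with 2^L2 bins; the final feature vector concatenates these histograms over all blocks and all decimal matrices.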

TL-CCANet
In this work, we developed TL-CCANet to extract features from three-lead ECGs. TL-CCANet contains an input layer, two cascaded convolutional layers, and an output layer. Unlike DL-CCANet, TL-CCANet has three input channels, and the CCA processing is performed alternately on pairs of leads in the two cascaded convolutional layers. Figure 3 presents the structure of TL-CCANet.

(1) Input layer
In this layer, ECG matrices $I^h_i$, $h = 1, 2, 3$, with a size of $m \times n$ are obtained by reshaping all the heartbeats.

(2) First convolutional layer
Initial stage: The operation of this stage is the same as that in DL-CCANet, and the pending matrix $X^h = [X^h_1, X^h_2, \ldots, X^h_N] \in \mathbb{R}^{t_1 t_2 \times mn}$ corresponding to the h-th lead is then obtained.
Filter extraction stage: In this step, the CCA filters are obtained from three pairwise combinations of the $X^h$. The allocation scheme is shown in Table 3.

Table 3. The specific allocation results.

Lead | Combination 1 | Combination 2 | Combination 3
1st lead | … | … | …

Taking combination 1 as an example, we first calculate the canonical vectors $a_l$ and $b_l$ by solving the CCA model between $X^1$ and $X^2$ using the Lagrange multiplier technique. Then, we retain the $a_l$ corresponding to $X^1$, and the CCA filters are obtained as per Equation (7):

$$W^1_l = \mathrm{mat}_{k_1 k_2}(a_l), \qquad l = 1, 2, \ldots, L_1, \qquad (7)$$

where the function $\mathrm{mat}_{k_1 k_2}(\cdot)$ is similar to that in DL-CCANet. After processing all three combinations, three sets of filters $W^1_l$, $W^2_l$, and $W^3_l$ are obtained.
Convolutional stage: Based on the three ECG leads, we calculate the preliminary feature blocks (PFBs) according to $I^h_{i,l} = I^h_i * W^h_l$, $l = 1, 2, \ldots, L_1$, $h = 1, 2, 3$.
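For readers less familiar with CCA, the Lagrange multiplier step used for combination 1 can be sketched as follows. This is the textbook formulation rather than a transcription of the paper's own equations: the covariance symbols $S_{11}$, $S_{22}$, $S_{12}$ (with $S_{21} = S_{12}^{T}$) and the multipliers $\lambda_1$, $\lambda_2$ are notation assumed here for illustration.

```latex
\begin{aligned}
&\max_{a,\,b}\; a^{T} S_{12}\, b
  \quad \text{s.t.} \quad a^{T} S_{11}\, a = 1,\;\; b^{T} S_{22}\, b = 1, \\[4pt]
&\mathcal{L}(a, b, \lambda_1, \lambda_2)
  = a^{T} S_{12}\, b
  - \tfrac{\lambda_1}{2}\left(a^{T} S_{11}\, a - 1\right)
  - \tfrac{\lambda_2}{2}\left(b^{T} S_{22}\, b - 1\right), \\[4pt]
&\frac{\partial \mathcal{L}}{\partial a} = S_{12}\, b - \lambda_1 S_{11}\, a = 0,
  \qquad
  \frac{\partial \mathcal{L}}{\partial b} = S_{21}\, a - \lambda_2 S_{22}\, b = 0, \\[4pt]
&\Rightarrow\; \lambda_1 = \lambda_2 = \lambda = a^{T} S_{12}\, b,
  \qquad
  S_{11}^{-1} S_{12} S_{22}^{-1} S_{21}\, a = \lambda^{2} a,
  \qquad
  b = \tfrac{1}{\lambda}\, S_{22}^{-1} S_{21}\, a .
\end{aligned}
```

Under this formulation, the canonical vectors $a_l$ are the eigenvectors associated with the $L_1$ largest eigenvalues $\lambda^2$, and each $b_l$ follows from the corresponding $a_l$; reshaping $a_l$ then gives one filter per canonical direction.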

(3) Second convolutional layer
Initial stage: The PFBs $I^h_{i,l} = I^h_i * W^h_l$, $l = 1, 2, \ldots, L_1$, are employed as the input of this layer, and an operation similar to that of the previous layer is used to obtain the pending matrices $Y^h$.
Filter extraction stage: Taking $Y^1$ and $Y^2$ of combination 1 in Table 4 as an example, we solve the CCA model using the Lagrange multiplier technique, by which the canonical vectors $c_\ell$ and $d_\ell$ are obtained. Next, the CCA filters of $Y^1$ are obtained as per Equation (8).
Convolutional stage: Similarly, the secondary feature blocks (SFBs) are calculated as per Equation (9).

The workflow of TL-CCANet is summarized in Algorithm 2. DL-CCANet and TL-CCANet are implemented using the PCANet [24] and the canoncorr function in MATLAB. Their parameters are shown in Table 5.

Algorithm 1. The algorithm of two convolutional stages of the DL-CCANet.
Input: Raw two-lead heartbeats $A^h_i$, $h = 1, 2$, $i = 1, 2, \ldots, N$
Output: $f_i$
1: Form ECG matrix $I^h_i$
2: for the first convolutional stage do
3: Form the two-lead pending matrices $X^h$
4: Compute the covariance matrix $s_{ij}$ of $X_i$ and $X_j$
5: Solve the CCA model by the Lagrange multiplier technique to obtain the two-lead projection directions $a$, $b$
6: Construct two-lead filter banks $W^h_l$, $h = 1, 2$, $l = 1, 2, \ldots, L_1$
7: Calculate the preliminary feature blocks of the first convolutional stage $I^h_{i,l} = I^h_i * W^h_l$
8: end for
9: for the second convolutional stage do
10: Form the two-lead pending matrices $Y^h$
11: Compute the covariance matrix $s_{ij}$ of $Y_i$ and $Y_j$
12: Solve the CCA model by the Lagrange multiplier technique to obtain the two-lead projection directions $c$, $d$
13: Construct two-lead filter banks $V^h_\ell$, $h = 1, 2$, $\ell = 1, 2, \ldots, L_2$
14: Calculate the output of the second convolutional stage
15–17: …
18: Construct the histogram vector $f_i$

Algorithm 2.
Input: Raw three-lead heartbeats $A^h_i$, $h = 1, 2, 3$, $i = 1, 2, \ldots, N$
Output: $f_i$
1: Form ECG matrix $I^h_i$
2: for the first convolutional stage do
3: Form the three-lead pending matrices $X^h$
4: Compute the covariance matrix $s^h_{ij}$ of $X^h_i$ and $X^h_j$
5: Solve the CCA model by the Lagrange multiplier technique to obtain the three-lead projection directions $a^h$, $h = 1, 2, 3$, and $b^h$, $h = 1, 2, 3$
6: Construct three-lead filter banks $W^h_l$, $h = 1, 2, 3$, $l = 1, 2, \ldots, L_1$
7: Calculate the preliminary feature blocks of the first convolutional stage $I^h_{i,l} = I^h_i * W^h_l$
8: end for
9: for the second convolutional stage do
10: Form the three-lead pending matrices $Y^h$
11: Compute the covariance matrix $s^h_{ij}$ of $Y^h_i$ and $Y^h_j$
…

Classification
In this work, we employ a linear support vector machine (SVM) as the classifier. Linear SVMs are easy to use and have a relatively simple prediction procedure. A linear SVM classifies samples by computing a decision hyperplane with a linear kernel and can handle large amounts of data efficiently. Hence, it is well suited to the high-dimensional, sparse CCANet features. For a linear SVM, the error penalty factor C is a crucial parameter that represents the tolerance to errors, and changing it strongly affects the prediction results. In our experiments, we used the LIBLINEAR toolkit [25] to implement the linear SVM [26].
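To make the role of the penalty factor C concrete, the sketch below evaluates a binary linear SVM objective with hinge loss and L2 regularization. It is an illustration under stated assumptions only: the function names and toy data are hypothetical, LIBLINEAR offers several loss and regularization variants, and the specific variant used in the experiments is not stated here.

```python
from typing import List, Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score of a sample relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Assign label +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def svm_objective(w: Sequence[float], b: float,
                  X: List[Sequence[float]], y: Sequence[int], C: float) -> float:
    """0.5 * ||w||^2 + C * (sum of hinge losses over the samples)."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(max(0.0, 1.0 - yi * decision_value(w, b, xi))
                for xi, yi in zip(X, y))
    return regularizer + C * hinge


# Toy data: the third sample lies on the wrong side of the hyperplane.
w, b = [1.0, -1.0], 0.0
X = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
y = [1, -1, -1]
low_penalty = svm_objective(w, b, X, y, C=1.0)    # 1.0 + 1 * 1.5 = 2.5
high_penalty = svm_objective(w, b, X, y, C=10.0)  # 1.0 + 10 * 1.5 = 16.0
```

A larger C makes each margin violation more costly relative to the regularizer, so the trained hyperplane tolerates fewer training errors; a smaller C favors a wider margin at the price of more violations.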

Experimental Setup
The experiments were run on a personal computer with Windows 10, 32 GB of RAM, and an i7-8750 CPU with a clock speed of 2.20 GHz, and were implemented in MATLAB 2018a. The proposed methodology is validated on the MIT-BIH database and the INCART database. In our study, we employ DL-CCANet and TL-CCANet to extract fused features from two leads and three leads, respectively. Their parameters and the C of the LIBLINEAR toolkit (linear SVM) are shown in Tables 6 and 7. These parameters of DL-CCANet, TL-CCANet, and the linear SVM were determined through extensive experiments on the training data, varying the parameters under 10-fold cross-validation (in turn, one part for validation and nine parts for training).

Table 6. The parameters set for DL-CCANet and TL-CCANet.

Network Type | Layer | Value

To evaluate the proposed methods, we selected 15 and eight detailed categories from the MIT-BIH database and the INCART database, respectively. From the MIT-BIH database, we randomly selected 3350 heartbeats; from the INCART database, 1720 randomly selected heartbeats are used. The number of heartbeats in each category is shown in Tables 8 and 9, respectively.

In this work, we use k-fold cross-validation [26] to perform our experiments. Specifically, the heartbeats are divided into k equal parts. In each experiment, one part in turn is employed as the validation data, and the others are used as training data. After the k experiments, the k confusion matrices obtained are added together. Based on the overall confusion matrix, we calculate several indicators: accuracy (Acc), sensitivity (Sen), precision (Ppv), specificity (Spe), and F1-score. They are obtained according to the following equations [15]:

$$Acc = \frac{TP + TN}{TP + TN + FP + FN}, \qquad Sen = \frac{TP}{TP + FN}, \qquad Ppv = \frac{TP}{TP + FP},$$

$$Spe = \frac{TN}{TN + FP}, \qquad F1 = \frac{2 \times Sen \times Ppv}{Sen + Ppv},$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively. These values are recorded in the confusion matrix of each experiment.
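The evaluation procedure described above, summing the per-fold confusion matrices and deriving per-class indicators from the total, can be sketched in a few lines. This is a minimal illustration, assuming rows index the true class and columns the predicted class; the function names and the two toy fold matrices are hypothetical and do not reproduce any result from the experiments.

```python
from typing import Dict, List

Confusion = List[List[int]]


def sum_confusions(matrices: List[Confusion]) -> Confusion:
    """Add the per-fold confusion matrices element by element."""
    n = len(matrices[0])
    return [[sum(m[r][c] for m in matrices) for c in range(n)] for r in range(n)]


def safe_div(num: float, den: float) -> float:
    """Return 0.0 instead of failing when a class has an empty denominator."""
    return num / den if den else 0.0


def overall_accuracy(cm: Confusion) -> float:
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    return safe_div(sum(cm[k][k] for k in range(len(cm))), total)


def class_metrics(cm: Confusion, k: int) -> Dict[str, float]:
    """One-vs-rest Acc, Sen, Ppv, Spe, and F1 for class k."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # true class k, predicted otherwise
    fp = sum(row[k] for row in cm) - tp   # predicted k, true class otherwise
    tn = total - tp - fn - fp
    sen = safe_div(tp, tp + fn)
    ppv = safe_div(tp, tp + fp)
    return {
        "Acc": safe_div(tp + tn, total),
        "Sen": sen,
        "Ppv": ppv,
        "Spe": safe_div(tn, tn + fp),
        "F1": safe_div(2 * sen * ppv, sen + ppv),
    }


# Two made-up folds of a two-class problem.
fold_1 = [[8, 1], [2, 9]]
fold_2 = [[9, 0], [1, 10]]
total_cm = sum_confusions([fold_1, fold_2])   # [[17, 1], [3, 19]]
metrics_0 = class_metrics(total_cm, 0)
```

Computing the indicators once from the summed matrix, rather than averaging per-fold values, weights every heartbeat equally regardless of which fold it fell into.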

Experiments on MIT-BIH Database Using DL-CCANet
The classification results for the MIT-BIH database are presented in Table 10. Overall, 95.25% of the heartbeats were correctly classified, giving a 4.75% error rate. Approximately 2.7% of the normal heartbeats (Nb) were wrongly classified as abnormal heartbeats (Ab), whereas 1.8% of Ab were wrongly classified as Nb. The average Acc, Sen, Ppv, and Spe were 99.4%, 94.6%, 96.3%, and 99.6%, respectively. For the majority of the categories, a sensitivity of over 90% was obtained. Except for class a, the Ppv of every class exceeded 90%. More importantly, the specificity of every category was over 97%, indicating that the proposed approach also reliably rejects heartbeats that do not belong to a given category. In summary, the combination of DL-CCANet and linear SVM performed well on the MIT-BIH database.

DL-CCANet
Tables 11-13 present the classification results of DL-CCANet. These results were obtained with different combinations of ECG leads: combination 1 (II and V1 leads), combination 2 (V1 and V5 leads), and combination 3 (II and V5 leads). The Acc and Spe for every category were over 95%. In terms of Sen and Ppv, the worst result was obtained for n-type heartbeats, which may be due to the small amount of training data for this class. In addition, the Ppv of every ECG category and the Sen of most categories exceeded 80%. Taken as a whole, the average Acc and Spe obtained using combinations 1, 2, and 3 are all over 98%, while the average Sen and Ppv of the three schemes exceed 90% and 93%, respectively. Overall, approximately 94% of the heartbeats were correctly classified, with a false detection rate of about 6%. Hence, the combination of DL-CCANet and linear SVM can effectively identify heartbeats of the N, V, A, F, n, R, and j types on the INCART database.

Table 14 shows the confusion matrix and evaluation indicators of the classification results obtained using TL-CCANet with the II, V1, and V5 leads. Overall, 95.52% of the heartbeats were correctly identified, with a false detection rate of 4.48%. Except for class n, a Sen of over 88% and a Ppv of over 90% were obtained for every category. In terms of Spe, the value is more than 98% for each abnormal heartbeat class, and the average value reached 99.16%. In addition, the average Acc, Sen, and Ppv were 98.76%, 92.71%, and 94.88%, respectively. Compared to DL-CCANet, TL-CCANet achieved an approximately 2% higher average Sen and correctly identified approximately 1.5% more heartbeats. The other evaluation indicators obtained by TL-CCANet are also higher.
Figure 4 shows the comparison of overall accuracies using DL-CCANet, PCANet, and RandNet. Among these methods, PCANet and RandNet were employed to process one-lead ECGs, namely lead II or lead V1. The lead II-based PCANet and RandNet yielded higher recognition accuracy than their lead V1-based counterparts, which may be because lead II and lead V1 in the MIT-BIH database were recorded with different levels of noise: as can be seen in Figure 5, the lead II ECGs have a clearer waveform than the lead V1 ECGs. However, the two-lead DL-CCANet yielded markedly better results than both the lead II-based and the lead V1-based PCANet and RandNet, indicating that DL-CCANet effectively exploits the correlation between the two leads to compensate for the lower quality of lead V1. Hence, DL-CCANet can mitigate the problem of poor-quality one-lead ECG signals.

The Comparison on INCART Database
Figures 6 and 7 illustrate the ECG classification results achieved using TL-CCANet, DL-CCANet, PCANet, and RandNet. Among them, PCANet and RandNet were used to classify heartbeats from the II, V1, and V5 leads separately. In Figure 6, the line charts show the recognition results on each fold. TL-CCANet clearly yielded higher recognition accuracy than the other methods.

Figure 7 presents the overall accuracies of these methods across the five folds as bar charts. All methods achieved recognition accuracies above 90%. The one-lead PCANet and RandNet produced different classification results for leads II, V1, and V5; lead V1 gave the lowest accuracies, only 92.1% and 90.76%, respectively. Relative to the one-lead methods, DL-CCANet (II and V1 leads) and DL-CCANet (V1 and V5 leads) achieved better results (94.01% and 93.90%), indicating that the limited useful heartbeat information in lead V1 was compensated by that in leads II and V5. In addition, the highest recognition accuracy was obtained by TL-CCANet using three-lead (II, V1, and V5) ECGs. This suggests that, beyond the correlation information mined from dual-lead ECGs, TL-CCANet also exploited the useful components of the third lead for arrhythmia diagnosis. In summary, DL-CCANet and TL-CCANet can effectively utilize multi-lead ECGs for ECG classification.
Figure 8 shows the ECG classification results obtained with different numbers of filters for DL-CCANet, TL-CCANet, PCANet, and RandNet on the INCART database. TL-CCANet achieved the best results for every number of filters tested. The number of CCA filters clearly influences recognition performance: as the number of filters increases, the overall recognition accuracy improves markedly and then tends to stabilize. However, a large number of filters leads to higher-dimensional features, so the linear support vector machine takes longer to make decisions and occupies more computer memory. After extensive experiments, 9 was selected as the optimal number of CCA filters.
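The cost of adding filters can be made concrete with a rough count of the feature dimension. The sketch below assumes a PCANet-style output stage [28]: two filter stages, binary hashing of the second-stage maps, and block-wise histograms, computed for each of two views. The function name `feature_dim`, the block count of 4, and the two-view factor are illustrative assumptions and are not values taken from the experiments.

```python
def feature_dim(n_stage1, n_stage2, n_blocks, n_views=2):
    """Feature-vector length under an assumed PCANet-style output stage.

    Each of the n_stage1 first-stage maps is hashed from n_stage2 binary
    maps into integers in [0, 2**n_stage2), and one histogram with
    2**n_stage2 bins is built per block. All arguments are hypothetical.
    """
    bins_per_histogram = 2 ** n_stage2
    return n_views * n_stage1 * n_blocks * bins_per_histogram


# Same number of filters in both stages, 4 histogram blocks (assumed).
for n_filters in (3, 5, 7, 9, 11):
    print(n_filters, feature_dim(n_filters, n_filters, n_blocks=4))
```

Under these assumptions the histogram length doubles with every additional second-stage filter, which is consistent with the observation that accuracy saturates while classifier time and memory keep growing.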

Robustness to Noise
As shown in Table 1, researchers in this field have applied various methods and studied different types of arrhythmias. The ability of a method to process noisy data is important: in a real environment, the collected ECG signals typically contain noise of varying types and levels. Removing the noise completely generally requires carefully tuning the parameters of the denoising algorithm for each specific signal, which costs considerable time and effort. Hence, evaluating the noise robustness of the proposed algorithm is critical. However, the majority of the methods in Table 1 were used to classify noise-free heartbeats, so their ability to deal with noisy heartbeats was not demonstrated. In our study, the proposed method is applied directly to raw ECGs in the MIT-BIH and INCART databases and still yields high accuracy. This may be because CCA extracts the correlation information between different leads. The noise in collected ECGs mainly consists of baseline drift and high-frequency noise (such as EMG interference) [27], which generally originate from the acquisition equipment and muscle tremors. For example, baseline drift is generally caused by displacement between the electrodes and the human body, and the degree and type of EMG interference differ across ECG sampling positions on the body. Hence, the noise components of different-lead ECGs have minimal correlation, and most of the noise is filtered out when the data are processed by the CCA filters. The CCA features obtained from the first convolutional layer using the ECG matrices of leads II, V1, and V5 are illustrated in Figure 9. In addition, similar to PCANet [28], the multi-layer convolutional structure of CCANet reinforces the noise resistance and information extraction ability of CCA, supporting its application to noisy ECGs.
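The argument that lead-specific noise is suppressed by CCA can be illustrated with a small synthetic experiment. The following is a minimal sketch, not the DL-CCANet pipeline: it applies plain CCA to overlapping 1-D patches of two simulated leads that share a crude pulse train and carry independent white noise. The sampling rate, patch length, gains, noise level, and the helper names `first_canonical_pair` and `make_patches` are all assumptions chosen for illustration.

```python
import numpy as np


def first_canonical_pair(X, Y, reg=1e-8):
    """First pair of CCA projection vectors and their canonical correlation.

    X, Y: arrays of shape (n_samples, n_features). A small ridge term `reg`
    keeps the covariance matrices positive definite.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)        # Lx^{-1} Sxy
    T = np.linalg.solve(Ly, T.T).T      # Lx^{-1} Sxy Ly^{-T}
    U, s, Vt = np.linalg.svd(T)
    wx = np.linalg.solve(Lx.T, U[:, 0])  # map back from whitened space
    wy = np.linalg.solve(Ly.T, Vt[0])
    return wx, wy, s[0]


def make_patches(x, patch_len):
    """Stack all overlapping windows of length patch_len as rows."""
    n_patches = x.size - patch_len + 1
    idx = np.arange(n_patches)[:, None] + np.arange(patch_len)[None, :]
    return x[idx]


rng = np.random.default_rng(0)
fs, seconds, patch_len = 360, 20, 24      # assumed values
t = np.arange(fs * seconds) / fs

# Shared component: a narrow pulse once per beat (1.2 beats per second).
beat_phase = (t * 1.2) % 1.0
shared = np.exp(-0.5 * ((beat_phase - 0.5) / 0.02) ** 2)

# Independent noise per lead, standing in for lead-specific interference.
noise_a = 0.3 * rng.standard_normal(t.size)
noise_b = 0.3 * rng.standard_normal(t.size)
lead_a = 1.0 * shared + noise_a
lead_b = 0.7 * shared + noise_b

X, Y = make_patches(lead_a, patch_len), make_patches(lead_b, patch_len)
wx, wy, rho = first_canonical_pair(X, Y)

# Sample-by-sample correlation of the raw noisy leads, for comparison.
raw_corr = np.corrcoef(X[:, 0], Y[:, 0])[0, 1]
# Correlation of the noise-only parts after applying the learned filters.
noise_corr = np.corrcoef(make_patches(noise_a, patch_len) @ wx,
                         make_patches(noise_b, patch_len) @ wy)[0, 1]

print(f"raw lead correlation:          {raw_corr:.3f}")
print(f"first canonical correlation:   {rho:.3f}")
print(f"filtered noise-only correlation: {noise_corr:.3f}")
```

In this toy setting the learned projections weight the component common to both leads, so the canonical correlation exceeds the raw sample-wise correlation while the filtered noise-only parts remain weakly correlated. Real baseline drift and EMG interference are not white, so the sketch only illustrates the mechanism and says nothing about its strength on clinical recordings.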

Conclusions
In this work, we have presented a fully automated method for classifying heartbeats under various abnormal cardiac conditions. Specifically, to address some typical issues of previous studies, we propose two multi-lead ECG-based feature extraction methods, DL-CCANet and TL-CCANet, which extract correlation information from dual-lead and three-lead ECGs, respectively. In the classification stage, a linear support vector machine, which is well suited to high-dimensional features, was used as the classifier. To verify the performance of the proposed methods, two databases from PhysioNet PhysioBank were employed: the MIT-BIH database and the INCART database. Using DL-CCANet, the highest overall accuracies of 95.25% and 94.05% were obtained in classifying detailed classes in the MIT-BIH and INCART databases, respectively. Using three-lead ECGs, TL-CCANet achieved the highest accuracy of 95.52%, which is better than the results of DL-CCANet using two-lead ECGs in the INCART database. We also used PCANet and RandNet to classify one-lead ECGs; the proposed methods performed clearly better than both.
Furthermore, there is an obvious difference in signal quality between the two leads in the MIT-BIH database. The performance of the multi-lead methods indicates that DL-CCANet effectively extracts the correlation between the two leads to compensate for the poor quality of lead V1. More importantly, we used raw ECGs as the experimental material, which verifies that TL-CCANet and DL-CCANet can classify multi-lead ECGs well even under noisy conditions. In terms of application, they have few parameters to adjust and do not require detection of the P, Q, S, and T points. The proposed methods could therefore be applied conveniently in the clinic or on mobile devices.
The advantages of the proposed methods are: (1) high overall accuracy; (2) classification of detailed classes (15 classes on the MIT-BIH database and seven classes on the INCART database); (3) utilization of the correlation among multi-lead ECG signals; (4) a detailed description of the methods (DL-CCANet and TL-CCANet); (5) possible application on mobile devices; (6) a small number of parameters to be adjusted.
The limitations include: (1) lower recognition performance for several classes containing very few heartbeats, such as type n in Table 10; (2) the size of the ECG matrix needs to be adjusted for different databases because of their different sampling rates.
In future work, we will try to address these limitations by utilizing more ECG leads and adopting a resampling algorithm. We expect this work to contribute to the recognition of abnormal ECGs.

Conflicts of Interest:
The authors declare no conflict of interest.