An Improved Recognition Approach for Noisy Multispectral Palmprint by Robust L2 Sparse Representation with a Tensor-Based Extreme Learning Machine

Over the past decades, multispectral palmprint recognition has attracted growing attention because multispectral images carry richer spatial and spectral characteristics than single-spectrum images. Motivated by this, we propose a robust L2 sparse representation with tensor-based extreme learning machine (RL2SR-TELM) algorithm that uses an adaptive image level fusion strategy to accomplish multispectral palmprint recognition. Firstly, we construct a robust L2 sparse representation (RL2SR) optimization model to calculate the linear representation coefficients. To suppress the effect of noise contamination, we introduce a logistic function into the RL2SR model to evaluate the representation residual. Secondly, we propose a novel weighted sparse and collaborative concentration index (WSCCI) to calculate the fusion weights adaptively. Finally, we put forward a TELM approach to carry out the classification task. It can handle high-dimensional data directly and preserves the spatial information of the image well. Extensive experiments are conducted on the benchmark multispectral palmprint database provided by PolyU. The experimental results show that our RL2SR-TELM algorithm outperforms a number of state-of-the-art multispectral palmprint recognition algorithms, both when the images are noise-free and when they are contaminated by different types of noise.


Introduction
Palmprint recognition has emerged as a novel biometric approach and has attracted increasing attention in recent years. Compared with some other biometric features (e.g., the iris and fingerprints), palmprints have a larger collection area with more abundant information. In addition, palmprints are unique, stable and scalable, and they can be acquired without contact. As a consequence, they offer strong anti-noise capability and efficient discrimination performance.
Current palmprint recognition algorithms can be roughly grouped into several categories, such as subspace-based methods, feature-based methods and sparse representation-based classification (SRC) methods. The subspace-based methods [1][2][3][4][5][6][7][8][9][10] adopt dimension reduction theory to accomplish the feature space transformation. This reduces the data complexity and efficiently improves the discrimination of image characteristics. The conventional subspace transformation methods mainly include principal component analysis (PCA) [1], linear discriminant analysis (LDA) [4] and independent component analysis (ICA) [7]. However, because of their sensitivity to lighting, noise and other contaminations, the conventional linear discriminant methods no longer meet the requirements of practical palmprint recognition problems. To address these issues, a nonlinear spatial structure transformation technique, namely the kernel PCA method [9,10], was introduced into the palmprint recognition field. In addition, many feature-based methods have been presented for palmprint recognition. For instance, coding-based methods [11][12][13][14][15][16][17] have been extensively researched in the past decades. In these studies, the palmprint features were extracted by coding the results of certain filters. Common coding methods include binarized statistical image features (BSIF) [13], double-orientation code (DOC) [14] and block dominant orientation code [17]. Minaee et al. [18] proposed a palmprint recognition algorithm based on the deep scattering network, which achieved fine recognition performance. Some other feature-based methods [19][20][21][22][23][24] mainly take advantage of statistical characteristics, such as the mean, variance and covariance, to implement palmprint recognition.
In recent years, linear representation methods based on sparse theory [25] have been proposed and widely applied to the palmprint recognition problem [26][27][28][29][30][31]. These methods treat a testing sample as a linear representation of the training set. That is, a given testing sample is expected to be approximately expressed by the training samples that belong to a single class. This can be effectively accomplished by imposing a sparseness constraint on the approximate representation over the training samples.
To achieve higher recognition accuracy, several multispectral palmprint recognition methods [32][33][34][35][36][37][38][39][40][41][42][43][44] have been studied. Because the images collected under different spectra contain richer feature information, the recognition rate can be effectively improved. In these studies, different fusion strategies were used to increase the recognition accuracy. The conventional multispectral palmprint recognition methods can be mainly categorized into image level fusion strategies and matching score level fusion strategies. The basic idea of image level fusion is to first decompose the images under different spectra, then integrate the separate decompositions into a compound approximation, and finally reconstruct the fused image through the inverse transformation to carry out the recognition task. Based on this, Han et al. [32] used the discrete wavelet transform (DWT) to decompose palmprint images acquired under different spectra, and then reconstructed the fused palmprint image to accomplish multispectral palmprint recognition. Xu et al. [37] introduced the quaternion matrix to represent the palmprint images under different spectra, and then extended PCA and DWT into the quaternion domain to implement feature extraction. Finally, the Euclidean distance was used to perform the recognition task. Gumaei et al. [38] employed an autoencoder with the regularized extreme learning machine (AE-RELM) to accomplish multispectral palmprint recognition and effectively improved the accuracy. Xu et al. [39] presented a multispectral palmprint recognition algorithm that used the digital shearlet transform (DST) to implement the image fusion and a multiclass projection ELM (MPELM) to accomplish the classification task.
In score level fusion, a comparator first produces matching scores separately for the different spectral bands; the scores are then fused according to some rule, and classification is based on the fused score. Zhang et al. [41] presented an algorithm named line orientation-based coding (LOC) to extract the features of palmprint images under different spectra, and then carried out the recognition task with a matching score level fusion rule. Minaee et al. [42] used the co-occurrence matrix to extract texture features, and then employed the minimum distance classifier (MDC) and a weighted majority voting system (WMV) to accomplish multispectral palmprint recognition. Minaee et al. [43] presented a set of wavelet-DCT features for multispectral palmprint recognition. Although many achievements have been made in the study of multispectral palmprint recognition, many open questions remain, for example, how to increase the recognition accuracy when the collected images are contaminated by different types of noise.
Inspired by these studies, in this article we present a novel robust L2 sparse representation with a tensor-based extreme learning machine (RL2SR-TELM) algorithm that uses an adaptive image level fusion strategy to accomplish multispectral palmprint recognition. The key contributions of our algorithm can be summarized as follows. Firstly, a robust L2 norm-based sparse representation model is constructed to calculate the linear representation coefficients. It overcomes the high computational complexity of L1 norm regularization and the lack of robustness to noise contamination. Secondly, an adaptive weighted method is presented to fuse the multispectral palmprint images at the image level. In this method, a weighted sparse and collaborative concentration index (WSCCI) is proposed that efficiently quantifies the discrimination of each multispectral palmprint image. Using the robust sparse coefficients and the WSCCI, an adaptive weighted fusion strategy is proposed to reconstruct the fused palmprint image. Finally, aiming at the high order signal classification problem, we extend the conventional ELM [45] into the tensor space and put forward a novel TELM method. It inherits the advantages of the conventional ELM (i.e., excellent learning speed and generalization performance) and achieves outstanding recognition efficiency.
The rest of this paper is organized as follows: in Section 2, we introduce the principle of the multispectral palmprint acquisition device. Then we discuss our proposed RL2SR-TELM algorithm in Section 3. In Section 4, simulation experiments and the result analysis of our proposed algorithm are presented in detail. Section 5 concludes this paper.

Acquisition Device of Multispectral Palmprint Images
The Biometrics Research Centre (BRC) of Hong Kong Polytechnic University (PolyU) has developed an acquisition device [46] for multispectral palmprints. It can collect palmprint images under the Blue, Green, Red and Near Infrared (NIR) spectra. Figure 1 illustrates the principle of the acquisition device. It mainly consists of a multispectral light source module, a light source control module, a CCD imaging sensor, an image acquisition module (A/D conversion module) and an image display module. The multispectral light source module is located at the bottom of the device and consists of four monochromatic light sources. The light source control module controls the multispectral light and enables the CCD imaging sensor to acquire palmprint images under different spectra. The image acquisition module captures the multispectral palmprint images and converts the analog images into digital ones by A/D conversion. Figure 2 shows the palmprint images acquired under different spectra.

Proposed Algorithm
Figure 3 illustrates the flowchart of the presented RL2SR-TELM algorithm. It consists of the following main steps. Firstly, the acquired multispectral palmprint image is preprocessed to obtain the region of interest (ROI) of the image. Then, we calculate the sparse representation coefficients of the sample images under different spectra by using the proposed robust L2 sparse representation method. After that, an adaptive weighted fusion strategy is applied to obtain the fused images. Finally, by integrating tensor theory with ELM, we propose a TELM method to complete the recognition task.

SRC Model
The sparse representation idea was first introduced into biometric recognition in 2009 by Wright et al. [25]. Let the training set matrix be $X = [x_1, x_2, \ldots, x_n] \in R^{d \times n}$, where $x_i$ is a training sample, and $d$ and $n$ denote the dimension and the number of the training samples, respectively. For any given testing sample $y \in R^d$, we suppose that it can be coded approximately over the training matrix $X$. The SRC model can then be described as:
$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad y = X\alpha \tag{1}$$
where $\alpha \in R^n$ is the representation coefficient and $\|\alpha\|_0$ denotes the L0 norm, which counts the number of nonzero elements of the vector. The objective of SRC is to find, as quickly as possible, the sparsest coefficient $\alpha$ that represents the testing sample over the training set. Model (1) is an NP-hard problem and is intractable in general. Reference [47] has proved that when the representation coefficient is sparse enough, the L0 norm can be approximately replaced by the L1 norm. On the basis of this theory, Wright et al. proposed the following model:
$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad y = X\alpha \tag{2}$$
This is a classical model and it has been extensively used in various areas, including image reconstruction, image de-noising, compressive sensing and machine learning. Although many scholars have studied this algorithm and proposed many improvements, its inefficiency has still not been completely resolved.
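To make the cost of L1 regularization concrete, the following minimal sketch computes a sparse code iteratively. It is an illustration under stated assumptions, not the solver used in the paper: the equality-constrained L1 model is replaced by the common penalized (LASSO) form $\frac{1}{2}\|y - X\alpha\|_2^2 + \tau\|\alpha\|_1$, which is solved with plain iterative soft thresholding (ISTA). The function names, the value of `tau`, the iteration count and the synthetic data are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(X, y, alpha, tau):
    """Penalized L1 objective: 0.5 * ||y - X alpha||_2^2 + tau * ||alpha||_1."""
    return 0.5 * np.sum((y - X @ alpha) ** 2) + tau * np.sum(np.abs(alpha))

def ista_sparse_code(X, y, tau=0.01, n_iter=500):
    """Minimize the penalized L1 objective by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    alpha = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ alpha - y)         # gradient of the quadratic term
        alpha = soft_threshold(alpha - step * grad, step * tau)
    return alpha

# Synthetic example: 60 unit-norm training samples of dimension 30,
# test sample built from two of them.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 60))
X /= np.linalg.norm(X, axis=0)
alpha_true = np.zeros(60)
alpha_true[[3, 17]] = [1.0, -0.5]
y = X @ alpha_true
alpha_hat = ista_sparse_code(X, y)
print("nonzero coefficients:", np.count_nonzero(alpha_hat))
```

Every test sample requires hundreds of such matrix-vector iterations, which is the kind of inefficiency that motivates the closed-form L2 approach discussed next.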

Robust L2 Sparse Representation Method
An SRC model implicitly assumes that the coding residual follows a Gaussian or Laplacian distribution. However, this assumption is not always accurate enough in practice. In addition, SRC needs to solve an L1 regularization problem, which is computationally slow. To address these drawbacks, many improved SRC algorithms have been proposed. For example, Yang et al. [48] proposed a sparse representation method that solves the sparse representation problem by maximum likelihood estimation (MLE). It can deal with occlusion and outliers more robustly. Xu et al. [49] made use of L2 regularization to acquire the sparse coefficient and proposed a new discriminative sparse representation method (DSRM). Inspired by these ideas, we propose a novel robust L2 regularization based sparse representation method, namely RL2SR.
Suppose that there are $s$ different spectral bands, that each spectrum contains $C$ palmprint classes, and that each class has $m$ training samples. Thus, there are $N = mC$ training samples for each spectrum. Vectorizing each training sample into a $d$-dimensional column vector, the training sample matrix can be denoted as $X = [X_1, \ldots, X_i, \ldots, X_C]$, where $X_i = [x^1_{m(i-1)+1}, x^2_{m(i-1)+1}, \ldots, x^s_{m(i-1)+1}, \ldots, x^1_{mi}, x^2_{mi}, \ldots, x^s_{mi}]$ is the training sample sub-matrix of the $i$-th class and $x^1_{mi}, x^2_{mi}, \ldots, x^s_{mi}$ are the $(m \times i)$-th training samples under the different spectra. Then, given any testing sample $y^l$, where $l = 1, 2, \ldots, s$ indexes the spectral bands, we can construct the following optimization problem:
$$\hat{A}^l = \arg\min_{A^l} \ \rho(y^l - XA^l) + \lambda\phi(A^l) \tag{3}$$
where $\lambda > 0$ is the regularization parameter, a constant that balances the representation residual term and the regularization term. Here, $A^l = [A^l_1; A^l_2; \cdots; A^l_C]$ is the linear representation coefficient of the testing sample $y^l$ over the training set.
The first term of the objective function (3) can be written as
$$\rho(y^l - XA^l) = \sum_{k=1}^{d} \rho(e^l_k) \tag{4}$$
where $e^l_k = y^l_k - X_kA^l$ denotes the $k$-th element of the residual between $y^l$ and its approximate linear representation $XA^l$, and $y^l_k$ and $X_k$ are the $k$-th element of the testing sample and the $k$-th row of the training set matrix, respectively. In general, the residual function $\rho(\cdot)$ is designed to minimize the effect of occlusion and outliers. The Huber, Cauchy and Welsch functions can all be used as the residual function. In reference [48], Yang et al. used the logistic function to describe the residual information and obtained satisfactory performance. The logistic function is given by Equation (5), in which $\mu$ and $\delta$ are positive parameters. The selection of $\mu$ and $\delta$ is discussed in Section 4.2. In order to solve problem (3), we differentiate $\rho(y^l - XA^l)$ with respect to $A^l$, which gives Equation (6), where $W^l = \mathrm{diag}(\omega(e^l_1), \omega(e^l_2), \ldots, \omega(e^l_d))$ denotes the residual function. Equation (6) can be regarded as the derivative of $\frac{1}{2}\|(W^l)^{1/2}e^l\|_2^2$. By using Equation (5), the residual function can be calculated as in Equation (7). The residual matrix $W^l$ is computed by the following procedure:
Step 1: Initialize $W^{l,1} = \mathrm{diag}(1, 1, \ldots, 1)$ and calculate the collaborative code $\gamma^l$ of each testing sample by using the collaborative representation model.
Step 2: Substitute the collaborative residual $e^l_k = y^l_k - X_k\gamma^l$ $(k = 1, \ldots, d)$ into Equation (7) and obtain the residual matrix $W^l$.
Step 3: If $W^l$ has not converged, repeat Step 1 and Step 2; otherwise output $W^l$.
With the residual matrix $W^l$ calculated, Equation (3) can be rewritten as follows:
$$\hat{A}^l = \arg\min_{A^l} \ \frac{1}{2}\|(W^l)^{1/2}(y^l - XA^l)\|_2^2 + \lambda\phi(A^l) \tag{8}$$
Because the parameter $\lambda$ is free, the coefficient in front of the first term can be omitted and Equation (8) becomes:
$$\hat{A}^l = \arg\min_{A^l} \ \|(W^l)^{1/2}(y^l - XA^l)\|_2^2 + \lambda\phi(A^l) \tag{9}$$
For the second term $\phi(A^l)$ of the objective function, SRC [25] adopted the L1 norm to realize the sparseness of the linear representation coefficient. In general, an iterative algorithm is employed to solve the L1 norm regularization based sparse representation problem. Many well-known algorithms [50] implement this iteration, such as L1 regularized least squares (L1LS), the homotopy method, the augmented Lagrangian method (ALM), orthogonal matching pursuit (OMP) [51] and the fast iterative shrinkage thresholding algorithm (FISTA). However, these methods still suffer from low efficiency. To address this issue, Zhang et al. [52] introduced collaborative representation-based classification (CRC), which uses L2 regularization to obtain the representation coefficient. Although CRC provides an efficient algorithm, it does not give full consideration to the sparseness of the linear representation. Reference [49] employed L2 regularization to implement face recognition with a discriminative sparse representation method. Inspired by this, the L2 regularization term is introduced into our model and a novel RL2SR model is proposed as follows:
$$\hat{A}^l = \arg\min_{A^l} \ \|(W^l)^{1/2}(y^l - XA^l)\|_2^2 + \lambda\sum_{i=1}^{C}\sum_{j=1}^{C}\|X_iA^l_i + X_jA^l_j\|_2^2 \tag{10}$$
Since:
$$\|X_iA^l_i + X_jA^l_j\|_2^2 = \|X_iA^l_i\|_2^2 + \|X_jA^l_j\|_2^2 + 2(X_iA^l_i)^T(X_jA^l_j) \tag{11}$$
Equation (11) can be separated into two parts. Minimizing $(X_iA^l_i)^T(X_jA^l_j)$ implies that the correlation between the $i$-th class and the $j$-th class is also minimal with respect to the linear representation. This gives the linear approximation combination the best discrimination ability. Thus, this cross term has the capability of decorrelating the linear representation combinations of different classes.
Correspondingly, minimizing the sum of the terms $(X_iA^l_i)^T(X_jA^l_j)$ over all class pairs, instead of any individual term, achieves the decorrelation effect for the different classes. Consequently, this approach can assign the testing sample to its truly nearest class. Minimizing $\|X_iA^l_i\|_2$ $(i = 1, 2, \ldots, C)$ means that the norm of the linear representation combination of each class is also small. As in existing linear representation approaches such as SRC and CRC, there is a competitive relationship between the different classes of training samples. In other words, the testing sample is expressed as a weighted sum of the training samples from all of the classes, so every class contributes to representing the testing sample. Competition in representation implies that when one class makes an important contribution to the linear representation, the remaining classes contribute considerably less.
The objective function shown in Equation (10) can be rewritten in the form of Equation (12). In the first term of objective function (12), minimizing the residual with argmin, rather than imposing the constraint $y^l = XA^l$, implies that $XA^l$ is a linear approximation of the test image. That is to say, this model can tolerate considerable noise contamination. Meanwhile, the residual function measures the linear representation residual well and enhances the noise robustness of the proposed model. In order to optimize the presented model, we introduce the following theorem:
Theorem 1. The objective function (12) is convex and differentiable with respect to the coefficient $A^l$, and it has a closed-form solution.
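Proof. Firstly, the objective function (12) can be considered as a combination of two L2 regularization terms, i.e., $\|(W^l)^{1/2}(y^l - XA^l)\|_2^2$ and $\phi(A^l) = \sum_{i=1}^{C}\sum_{j=1}^{C}\|X_iA^l_i + X_jA^l_j\|_2^2$. By the properties of the L2 norm, the convexity and differentiability of the proposed model (12) can be easily proved.
Secondly, the derivative of the function $\|(W^l)^{1/2}(y^l - XA^l)\|_2^2$ can be computed as follows:
$$\frac{d}{dA^l}\|(W^l)^{1/2}(y^l - XA^l)\|_2^2 = -2X^TW^l(y^l - XA^l)$$
On the other hand, the second term $\phi(A^l)$ does not contain the coefficient $A^l$ explicitly, only its class blocks, so we cannot compute its derivative directly. To address this issue, we compute the partial derivatives of $\phi$ with respect to each block $A^l_k$ $(k = 1, 2, \ldots, C)$, and we have:
$$\frac{\partial\phi}{\partial A^l_k} = 4CX_k^TX_kA^l_k + 4X_k^TXA^l$$
Then, by stacking these partial derivatives, we can obtain the derivative as follows:
$$\frac{d\phi}{dA^l} = \left[\frac{\partial\phi}{\partial A^l_1}; \frac{\partial\phi}{\partial A^l_2}; \cdots; \frac{\partial\phi}{\partial A^l_C}\right]$$
By denoting the block-diagonal matrix:
$$M = \mathrm{diag}(X_1^TX_1, X_2^TX_2, \ldots, X_C^TX_C)$$
we have $\frac{d\phi}{dA^l} = 4(CM + X^TX)A^l$.
As a consequence, the derivative of objective function (12) with respect to $A^l$ is:
$$-2X^TW^l(y^l - XA^l) + 4\lambda(CM + X^TX)A^l$$
Setting this derivative to zero, as required at the optimal solution, yields the closed-form solution of objective function (12):
$$\hat{A}^l = \left(X^TW^lX + 2\lambda(CM + X^TX)\right)^{-1}X^TW^ly^l$$
The proof of Theorem 1 is thus completed.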
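The RL2SR procedure can be summarized as follows.
While the error has not converged, do:
1. Calculate the collaborative representation code $\gamma^l$ by solving the collaborative representation model.
2. Calculate the residual $e^l_k = y^l_k - X_k\gamma^l$ $(k = 1, \ldots, d)$.
3. Calculate the residual function $W^l$ by using Equation (7).
Then, for each spectral testing sample $y^l$ $(l = 1, 2, \ldots, s)$, calculate $A^l$ by using the closed-form solution of Theorem 1.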
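The procedure above can be sketched in NumPy. This is a minimal illustration under several assumptions that are not confirmed by the text: the logistic weight is taken in the form used by Yang et al. [48], $\omega(e) = \exp(\mu\delta - \mu e^2)/(1 + \exp(\mu\delta - \mu e^2))$; the collaborative code in the loop is computed as a weighted ridge regression; the regularizer is the pairwise class term $\sum_i\sum_j\|X_iA_i + X_jA_j\|_2^2$; and the values of `lam`, `mu` and `delta`, like all function names, are hypothetical.

```python
import numpy as np

def logistic_weights(e, mu, delta):
    """Assumed logistic weight exp(z) / (1 + exp(z)) with z = mu * (delta - e^2)."""
    z = np.clip(mu * (delta - e ** 2), -50.0, 50.0)  # clip to avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-z))

def class_block_matrix(X, class_sizes):
    """M = blockdiag(X_1^T X_1, ..., X_C^T X_C)."""
    n = X.shape[1]
    M = np.zeros((n, n))
    start = 0
    for size in class_sizes:
        Xi = X[:, start:start + size]
        M[start:start + size, start:start + size] = Xi.T @ Xi
        start += size
    return M

def rl2sr_objective(X, y, A, w, class_sizes, lam):
    """Weighted residual plus the assumed pairwise class regularizer (explicit double sum)."""
    bounds = np.cumsum([0] + list(class_sizes))
    parts = [X[:, a:b] @ A[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    reg = sum(np.sum((p + q) ** 2) for p in parts for q in parts)
    return np.sum(w * (y - X @ A) ** 2) + lam * reg

def rl2sr_code(X, y, class_sizes, lam=0.1, mu=4.0, delta=2.0, max_iter=20, tol=1e-6):
    d, n = X.shape
    C = len(class_sizes)
    w = np.ones(d)                                   # W = diag(1, ..., 1)
    for _ in range(max_iter):
        Xw = X * w[:, None]                          # rows of X scaled by the weights (W X)
        # Step 1: collaborative code, here a weighted ridge regression (assumption)
        gamma = np.linalg.solve(X.T @ Xw + lam * np.eye(n), Xw.T @ y)
        e = y - X @ gamma                            # Step 2: residual
        w_new = logistic_weights(e, mu, delta)       # Step 3: updated weights
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    # Closed-form coefficient for the final weights
    Xw = X * w[:, None]
    M = class_block_matrix(X, class_sizes)
    A = np.linalg.solve(X.T @ Xw + 2.0 * lam * (C * M + X.T @ X), Xw.T @ y)
    return A, w

# Synthetic example: 3 classes with 2 samples each, three grossly corrupted pixels.
rng = np.random.default_rng(1)
class_sizes = [2, 2, 2]
X = rng.standard_normal((20, 6))
y = X[:, 0] + 0.01 * rng.standard_normal(20)
y[:3] += 5.0
A, w = rl2sr_code(X, y, class_sizes)
print("weights of the corrupted pixels:", w[:3])
```

Because the final step is a single linear solve, no iterative L1 solver is needed once the weights are fixed. Comparing `rl2sr_objective` at `A` with its value at slightly perturbed coefficients is a simple way to check that the closed form is indeed the minimizer of the assumed objective.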

Image Fusion Based on Adaptive Weighted Method
In this section, a weighted sparse and collaborative concentration index is introduced to quantify the discrimination of each spectral testing sample, and an adaptive weighted fusion method is proposed to construct the fused palmprint image.
Definition 1 [25] (sparse concentration index (SCI)). The SCI of a coefficient vector $\alpha \in R^n$ is defined as:
$$SCI(\alpha) = \frac{C \cdot \max_i \|\delta_i(\alpha)\|_1 / \|\alpha\|_1 - 1}{C - 1} \in [0, 1]$$
where $C$ is the class number and $\delta_i(\alpha)$ is an indicator function defined on $R^n$ which keeps the coefficients associated with the $i$-th class and sets all the other coefficients to zero.
Obviously, SCI(α) = 1 implies that the training samples from a single class can express the testing sample well. On the contrary, SCI(α) = 0 means that the coefficients are spread evenly over all classes, so every class contributes equally to representing the testing sample. Therefore, SCI can efficiently measure both the sparseness of the linear representation coefficient and the discrimination ability of the testing sample. If SCI(α) = 1, the testing sample has the strongest discrimination ability and it can be easily classified into the correct class. If SCI(α) = 0, the testing sample has the weakest discrimination ability and we cannot determine the class to which it belongs.
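The two extreme cases can be checked with a few lines of Python. The sketch below implements the SCI as defined by Wright et al. [25]; the function name, the label encoding (one integer class label per coefficient) and the toy coefficient vectors are hypothetical.

```python
def sci(alpha, labels, num_classes):
    """Sparse concentration index; labels[k] is the class index of coefficient k."""
    total = sum(abs(a) for a in alpha)             # ||alpha||_1
    per_class = [0.0] * num_classes
    for a, c in zip(alpha, labels):
        per_class[c] += abs(a)                     # ||delta_c(alpha)||_1
    return (num_classes * max(per_class) / total - 1.0) / (num_classes - 1.0)

labels = [0, 0, 1, 1, 2, 2]                        # 3 classes, 2 training samples each
concentrated = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]      # only class 0 is used
spread = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]            # all classes used equally
print(sci(concentrated, labels, 3), sci(spread, labels, 3))
```

The first vector reaches the upper bound of the index and the second the lower bound, matching the two cases discussed above.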
The SCI uses the L1 norm to evaluate the sparseness of the linear representation coefficient. Because our RL2SR method uses L2 norm regularization, the SCI cannot efficiently evaluate the coefficient it produces: the RL2SR coefficient reflects not only the sparseness but also the collaborative representation information. To address this issue, the definition of SCI is extended and a weighted sparse and collaborative concentration index, namely WSCCI, is proposed to evaluate the representation coefficient obtained by our RL2SR model.

Definition 2 (weighted sparse and collaborative concentration index (WSCCI)). The WSCCI of a coefficient vector $\alpha \in R^n$ is defined in terms of the class number $C$ and two nonnegative parameters $\mu_1$ and $\mu_2$.
In WSCCI, a weighted fusion of the sparse concentration index, defined by the L1 norm, and the collaborative concentration index, defined by the L2 norm, is used to evaluate the discriminative performance of the given sample. As a consequence, it can be regarded as the weighted sum of SCI and CCI (i.e., the collaborative concentration index). Based on the above analysis, the proposed WSCCI is used to build our adaptive weighted fusion method.
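A possible reading of the index as a weighted sum of SCI and CCI is sketched below. The exact formulas are assumptions: the CCI is taken to have the same shape as the SCI with the L1 norm replaced by the L2 norm, the WSCCI is taken as `mu1 * SCI + mu2 * CCI`, and the fusion weights are obtained by normalizing the per-spectrum scores so that they sum to one. The function names, parameter values and example scores are hypothetical.

```python
def concentration_index(alpha, labels, num_classes, p):
    """Concentration index based on the Lp norm (p=1: SCI, p=2: assumed CCI)."""
    class_sums = [0.0] * num_classes
    for a, c in zip(alpha, labels):
        class_sums[c] += abs(a) ** p
    total = sum(abs(a) ** p for a in alpha) ** (1.0 / p)       # ||alpha||_p
    best = max(s ** (1.0 / p) for s in class_sums)             # max_i ||delta_i(alpha)||_p
    return (num_classes * best / total - 1.0) / (num_classes - 1.0)

def wscci(alpha, labels, num_classes, mu1=0.5, mu2=0.5):
    """Assumed WSCCI: weighted sum of the L1-based SCI and the L2-based CCI."""
    return (mu1 * concentration_index(alpha, labels, num_classes, 1)
            + mu2 * concentration_index(alpha, labels, num_classes, 2))

def fusion_weights(scores):
    """Hypothetical normalization of per-spectrum scores into fusion weights."""
    total = sum(scores)
    return [s / total for s in scores]

labels = [0, 0, 1, 1, 2, 2]
concentrated = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
spread = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
scores = [wscci(concentrated, labels, 3), wscci(spread, labels, 3)]
print(scores, fusion_weights(scores))
```

Note that with this assumed L2-based form the CCI of a perfectly even coefficient vector is positive rather than zero, so the index still ranks the two vectors correctly but is not normalized to [0, 1] in the same way as the SCI.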

Principle of Tensor Based ELM
ELM can be considered as a generalized single hidden layer feedforward neural network (SLFN). Since ELM randomly chooses the initial values of the hidden nodes and analytically calculates the output weights, its learning speed is extremely fast compared with conventional supervised learning algorithms (e.g., the support vector machine (SVM) [53] and the k-nearest neighbor (KNN) algorithm). In addition, its generalization ability is better than that of many back propagation neural network algorithms. Consequently, ELM has been extensively studied and widely applied in many areas (such as pattern classification, clustering analysis and regression), and plenty of research results have been obtained. Inspired by this idea, we present a novel TELM by extending the conventional ELM to the tensor space, so that it can treat the image as a tensor when executing the recognition task.

ELM
Given a training set with $N$ different training samples $(x_j, t_j) \in R^d \times R^m$ $(j = 1, 2, \ldots, N)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jd}]^T \in R^d$ denotes the $j$-th training sample and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^T \in R^m$ represents the target of sample $x_j$, a classical SLFN can be defined by:
$$\sum_{i=1}^{L} \beta_i f(a_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \ldots, N \tag{18}$$
In this model, the number of hidden nodes is $L$ and the activation function is $f(x)$. $a_i = [a_{i1}, a_{i2}, \ldots, a_{id}]^T$ denotes the input weight vector that connects the input nodes with the $i$-th hidden node. $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ denotes the output weight vector that connects the $i$-th hidden node with the output nodes. $b_i$ denotes the bias of the $i$-th hidden node, and $a_i \cdot x_j$ is the dot product of $a_i$ and $x_j$. The classical SLFN can approximate the given training sample set with the minimum residual.
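Obviously, Equation (18) is a system of linear equations. In matrix form, it can be rewritten as follows:
$$H\beta = T \tag{19}$$
where:
$$H = \begin{bmatrix} f(a_1 \cdot x_1 + b_1) & \cdots & f(a_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ f(a_1 \cdot x_N + b_1) & \cdots & f(a_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}$$
Theorem 2. Consider a standard SLFN with $L$ hidden nodes and an activation function $f: R \to R$ that is infinitely differentiable on its domain. Given a training set with $N$ different samples $(x_j, t_j)$, where $x_j \in R^d$ denotes the sample data and $t_j \in R^m$ represents the target of $x_j$, then for any randomly assigned weights $a_i$ and biases $b_i$, the output matrix $H$ of the hidden layer can be obtained by the pseudo-inverse and satisfies $H\beta - T = 0$ with probability one with respect to any continuous probability distribution.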
For the proof of Theorem 2, readers can refer to [45]. Based on this theory, ELM can be described as follows. With the input weight vectors and the biases of the hidden layer nodes determined by random assignment, we can obtain the output matrix $H$ of the hidden layer from the input samples. Therefore, the training procedure of ELM is transformed into a classical least squares problem for a system of linear equations, i.e.,
$$\min_{\beta} \|H\beta - T\| \tag{20}$$
We can obtain the least squares solution of Equation (20) as follows:
$$\hat{\beta} = H^{+}T \tag{21}$$
where $H^{+}$ refers to the Moore-Penrose pseudo-inverse of the matrix $H$.
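The training procedure just described fits in a few lines of NumPy. The following is a minimal sketch, assuming a sigmoid activation, standard normal random input weights and biases, and a synthetic two-class toy problem; the function names and all numerical settings are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng):
    """X: (N, d) samples as rows, T: (N, m) targets. Returns (A, b, beta)."""
    d = X.shape[1]
    A = rng.standard_normal((d, n_hidden))   # random input weights a_i (one column per hidden node)
    b = rng.standard_normal(n_hidden)        # random hidden biases b_i
    H = sigmoid(X @ A + b)                   # hidden layer output matrix, shape (N, L)
    beta = np.linalg.pinv(H) @ T             # least squares output weights: beta = H^+ T
    return A, b, beta

def elm_predict(X, A, b, beta):
    return sigmoid(X @ A + b) @ beta

# Toy problem: two well-separated Gaussian blobs with one-hot targets.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
               rng.normal(2.0, 0.5, size=(20, 2))])
labels = np.array([0] * 20 + [1] * 20)
T = np.eye(2)[labels]
A, b, beta = elm_train(X, T, n_hidden=10, rng=rng)
pred = elm_predict(X, A, b, beta).argmax(axis=1)
accuracy = (pred == labels).mean()
print("training accuracy:", accuracy)
```

Only `beta` is learned, and it is obtained in one pseudo-inverse computation, which is the source of the fast training speed attributed to ELM.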

Tensor Based ELM
Although the conventional ELM deals well with one-dimensional signals, two-dimensional images need to be vectorized and processed in the one-dimensional space. This transformation easily loses the spatial structure information of the image. In order to solve this problem, we extend the conventional ELM to the tensor space and put forward a novel tensor-based ELM to deal with high-dimensional signals.
In view of the high-dimensional characteristics of the palmprint image, we regard the fused image as a second-order tensor and classify it by the proposed TELM. In our method, the high order singular value decomposition (HOSVD) algorithm [54] is utilized to decompose the fused palmprint image and construct the input weight values of the TELM model.
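Given an $M$-order tensor $F \in R^{I_1 \times I_2 \times \cdots \times I_M}$ and a matrix $U \in R^{J_m \times I_m}$, we define $B \in R^{I_1 \times \cdots \times I_{m-1} \times J_m \times I_{m+1} \times \cdots \times I_M}$ as the mode-$m$ product of $F$ and $U$. The elements of $B$ can be calculated by:
$$B_{i_1 \cdots i_{m-1} j_m i_{m+1} \cdots i_M} = \sum_{i_m=1}^{I_m} F_{i_1 \cdots i_{m-1} i_m i_{m+1} \cdots i_M} U_{j_m i_m} \tag{22}$$
so the mode-$m$ tensor product can be simply denoted by:
$$B = F \times_m U \tag{23}$$
The HOSVD algorithm can be implemented by using the tensor product. Given an $M$-order tensor $F \in R^{I_1 \times I_2 \times \cdots \times I_M}$, we can use the tensor product to decompose $F$ as follows:
$$F = S \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_M U^{(M)} \tag{24}$$
where $S$ denotes an $M$-order tensor called the core tensor, and $U^{(1)} \in R^{I_1 \times I_1}, U^{(2)} \in R^{I_2 \times I_2}, \ldots, U^{(M)} \in R^{I_M \times I_M}$ are unitary matrices whose columns correspond to the orthogonal bases of the unfolded matrices $F_{(1)}, F_{(2)}, \ldots, F_{(M)}$. The low rank approximation of the tensor $F$ can be calculated by HOSVD, i.e.,
$$F \approx \hat{S} \times_1 U^{(1)}_{q_1} \times_2 U^{(2)}_{q_2} \cdots \times_M U^{(M)}_{q_M} \tag{25}$$
where $\hat{S} \in R^{q_1 \times q_2 \times \cdots \times q_M}$ represents the principal component core tensor and $U^{(i)}_{q_i}$ represents the truncation matrix composed of the first $q_i$ columns of $U^{(i)}$, $i = 1, 2, \ldots, M$.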
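The mode-$m$ product and the truncated HOSVD can be illustrated with NumPy. This is a minimal sketch, not the implementation of [54]: the factor matrices are taken as the left singular vectors of each mode unfolding, the core tensor is obtained by multiplying with their transposes, and the function names, tensor size and ranks are hypothetical.

```python
import numpy as np

def unfold(F, mode):
    """Mode unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(F, mode, 0).reshape(F.shape[mode], -1)

def mode_product(F, U, mode):
    """Mode product F x_mode U for a matrix U of shape (J, I_mode)."""
    B = np.tensordot(U, F, axes=(1, mode))   # contracted axis is replaced by a leading J axis
    return np.moveaxis(B, 0, mode)           # move the new axis back to position `mode`

def hosvd(F, ranks=None):
    """Return (core, factors); `ranks` truncates each factor to its first q columns."""
    factors = []
    for m in range(F.ndim):
        Um, _, _ = np.linalg.svd(unfold(F, m), full_matrices=False)
        q = F.shape[m] if ranks is None else ranks[m]
        factors.append(Um[:, :q])
    core = F
    for m, Um in enumerate(factors):
        core = mode_product(core, Um.T, m)   # project onto the factor basis
    return core, factors

def reconstruct(core, factors):
    F = core
    for m, Um in enumerate(factors):
        F = mode_product(F, Um, m)
    return F

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 5, 3))
core, factors = hosvd(F)                      # full decomposition
core_t, factors_t = hosvd(F, ranks=(2, 3, 2)) # truncated decomposition
F_approx = reconstruct(core_t, factors_t)
print("relative error of the truncated approximation:",
      np.linalg.norm(F - F_approx) / np.linalg.norm(F))
```

With full ranks the reconstruction reproduces `F` up to rounding, while the truncated version keeps only a `2 x 3 x 2` core and therefore gives an approximation.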
According to the above discussion, we summarize the detailed process of the tensor-based ELM as follows. Let $G_i \in R^{s \times t}$ $(i = 1, 2, \ldots, N)$ be the $i$-th fused training palmprint image and $t_i \in R^m$ $(i = 1, 2, \ldots, N)$ be the target of sample $G_i$. Denote the training sample set as $G \in R^{s \times t \times N}$. The HOSVD decomposition of $G_i \in R^{s \times t}$ $(i = 1, 2, \ldots, N)$ is formulated in Equation (26), in which the two truncation matrices have $L_1$ and $L_2$ columns, respectively. TELM is then defined by Equation (27), where $L_1$ and $L_2$ denote the numbers of hidden layer nodes along the two tensor directions. Consequently, there are $L = L_1 \times L_2$ hidden layer nodes in total. $u_{l_1}$ and $v_{l_2}$ denote the input weight vectors of the hidden layer along the two tensor directions, respectively. $\beta_{l_2 + (l_1 - 1)L_1}$ denotes the weight between the $(l_2 + (l_1 - 1)L_1)$-th node in the hidden layer and the output nodes. As in the ELM algorithm, $g(\cdot)$ denotes the activation function. Finally, the output weight $\beta$ can be obtained from Equation (27) by the least squares method.
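One way to read this construction is sketched below. Several details are assumptions rather than facts from the text: the input weight vectors $u_{l_1}$ and $v_{l_2}$ are taken as the leading left singular vectors of the mode unfoldings of the stacked training images, each hidden node computes $g(u_{l_1}^T G_i v_{l_2})$ with a sigmoid $g$ and no bias, and the hidden nodes are ordered row by row. The function names and the synthetic images are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leading_left_vectors(G, axis, k):
    """First k left singular vectors of the unfolding of G along `axis`."""
    unfolded = np.moveaxis(G, axis, 0).reshape(G.shape[axis], -1)
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return U[:, :k]

def telm_hidden(G, U, V):
    """H[n, l1 * L2 + l2] = g(u_l1^T G_n v_l2) for an image stack G of shape (N, s, t)."""
    Z = np.einsum('sa,nst,tb->nab', U, G, V)   # bilinear projections of every image
    return sigmoid(Z.reshape(G.shape[0], -1))

def telm_train(G, T, L1, L2):
    U = leading_left_vectors(G, 1, L1)         # (s, L1) weights along the row direction
    V = leading_left_vectors(G, 2, L2)         # (t, L2) weights along the column direction
    beta = np.linalg.pinv(telm_hidden(G, U, V)) @ T   # least squares output weights
    return U, V, beta

def telm_predict(G, U, V, beta):
    return telm_hidden(G, U, V) @ beta

# Toy images: class 0 is bright in the top half, class 1 in the bottom half.
rng = np.random.default_rng(0)
G = 0.1 * rng.standard_normal((40, 8, 8))
G[:20, :4, :] += 1.0
G[20:, 4:, :] += 1.0
labels = np.array([0] * 20 + [1] * 20)
T = np.eye(2)[labels]
U, V, beta = telm_train(G, T, L1=3, L2=3)
H = telm_hidden(G, U, V)
pred = telm_predict(G, U, V, beta).argmax(axis=1)
accuracy = (pred == labels).mean()
print("hidden layer shape:", H.shape, "training accuracy:", accuracy)
```

The images are never vectorized: each hidden node sees the full two-dimensional image through a pair of row and column weight vectors, which is how the spatial structure is retained in this reading.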

Experiments
In this section, we evaluate the presented multispectral palmprint recognition algorithm on the publicly available benchmark database provided by PolyU. Extensive experiments are carried out to demonstrate the effectiveness of the presented RL2SR method, the adaptive fusion strategy and TELM. In all experiments, the fused palmprint image is used as the input of the TELM classifier. The experiments are run on a PC with Windows 7, an Intel Core i5-2320 CPU (3.0 GHz) and 6 GB RAM, and the algorithm is programmed in MATLAB 2017a.

The PolyU Multispectral Palmprint Database
The PolyU multispectral palmprint database was collected from 250 volunteers (195 males and 55 females), most of whom were between 20 and 60 years old. To capture the variation among acquired palmprints, the images were acquired in two separate phases. The time interval between the two phases was 5-15 days and each phase lasted about 9 days. In each phase, each hand of every volunteer was imaged six times under each of four spectra: Blue (470 nm), Green (525 nm), Red (660 nm) and NIR (880 nm). For each spectrum, 500 different palms were thus acquired from the 250 volunteers in the two phases, so the database contains 500 × 6 × 2 = 6000 palmprint images under each spectrum. That is, the multispectral palmprint database contains 6000 × 4 = 24,000 images in total. Reference [46] provided the ROI extraction process for the acquired multispectral palmprint images and established the PolyU multispectral palmprint database (see Figure 4).
Figure 6a shows the images contaminated by white Gaussian noise with mean 0 and standard deviation 25, while Figure 6b shows the images contaminated by 50% salt & pepper noise. Rows 1-4 of Figure 6 exhibit the noisy palmprint images under the Blue, Green, Red and NIR spectra, respectively.
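The two contamination types used in the experiments can be reproduced approximately as follows. This is a sketch under stated assumptions, not the authors' code: images are assumed to be 8-bit grayscale arrays, Gaussian-contaminated pixels are clipped to [0, 255], and the salt & pepper density is assumed to be split evenly between black and white pixels. The function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(image, sigma=25.0, rng=None):
    """Add zero-mean white Gaussian noise with standard deviation `sigma`."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(float) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)  # assumed 8-bit range

def add_salt_pepper_noise(image, density=0.5, rng=None):
    """Replace a fraction `density` of the pixels with 0 (pepper) or 255 (salt)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    r = rng.random(image.shape)
    noisy[r < density / 2] = 0                          # pepper
    noisy[(r >= density / 2) & (r < density)] = 255     # salt
    return noisy
```

Calling `add_salt_pepper_noise(roi, density=0.5)` corresponds to the 50% setting of Figure 6b, and `add_gaussian_noise(roi, sigma=25.0)` to the setting of Figure 6a, where `roi` stands for any extracted ROI image.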

Selection of µ and δ for Residual Function
We now discuss the selection of the parameters $\mu$ and $\delta$ for the residual function in Equation (7). It can be seen from Equation (7) that $\omega(e^l_k) \to \exp(\mu\delta)/(1 + \exp(\mu\delta))$ when $e^l_k \to 0$. Similarly, when $e^l_k \to \infty$, $\omega(e^l_k) = \exp(\mu\delta)/(\exp(\mu (e^l_k)^2) + \exp(\mu\delta)) \to 0$. So that $\omega$ spans the range $(0, 1)$, the product $\mu\delta$ is set large enough that $\omega(e^l_k) \approx \exp(\mu\delta)/\exp(\mu\delta) = 1$ for small residuals. For simplicity, we denote $T = \mu\delta$. Since $\exp(7) > 1000$, we set $T = \mu\delta > 7$ so that $\omega(e^l_k) \to 1$ when $e^l_k \to 0$. From Equation (7), $\omega(e^l_k) = 1/2$ when $\delta = (e^l_k)^2$, so the parameter $\delta$ determines the boundary point of the residual function, that is, the point at which the weight $\omega$ passes through 0.5. To enhance the robustness of the model to outliers and noise contamination, a novel method of selecting the parameter $\delta$ is presented as follows. First, arrange the elements of the residual vector in descending order and denote the new vector by $e^l$. Denoting its maximum element by $M$ and its minimum element by $m$, set $\tau_1 = (1 - \theta)m + \theta M$, where $\theta$ is a constant and $\theta \in [0.6, 0.8]$. Since the dimension of $e^l$ is $d$, let $s$ be the nearest integer to $\theta d$ and select the $s$th largest element of $e^l$ as $\tau_2$. Finally, let $\delta = (\tau_1 + \tau_2)/2$. Once $\delta$ is selected, the parameter $\mu$ is calculated by $\mu = T/\delta$. In our experiments, the constant is set to $T = 8$.
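The selection rule for δ and µ can be sketched as below. The sketch assumes that the residual vector holds squared residuals, since δ is compared with the squared residual in Equation (7); the source text does not make this explicit. The function names and the default θ = 0.7 are illustrative choices within the stated range [0.6, 0.8].

```python
import numpy as np

def select_delta_mu(sq_residuals, theta=0.7, T=8.0):
    """Pick delta and mu from a vector of (assumed squared) residuals."""
    e = np.sort(np.asarray(sq_residuals, dtype=float))[::-1]  # descending order
    M, m = e[0], e[-1]
    tau1 = (1.0 - theta) * m + theta * M
    d = e.size
    s = min(max(int(round(theta * d)), 1), d)  # nearest integer to theta*d, kept in 1..d
    tau2 = e[s - 1]                            # s-th largest element
    delta = 0.5 * (tau1 + tau2)
    mu = T / delta
    return delta, mu

def residual_weight(e_k, mu, delta):
    """omega(e) = exp(mu*delta) / (exp(mu*e^2) + exp(mu*delta)), in a stable form."""
    z = mu * (np.square(e_k) - delta)
    # Algebraically equal to 1 / (1 + exp(z)); logaddexp avoids overflow for large z.
    return np.exp(-np.logaddexp(0.0, z))
```

With T = 8 the weight of a zero residual is exp(8)/(1 + exp(8)), which is close to 1, and the weight equals 0.5 exactly where the squared residual equals δ.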

Selection of the Hidden Node Numbers Along the Directions of TELM
To evaluate the effect of the hidden node numbers along the directions of TELM, experiments are carried out with the hidden node numbers varying from 1 to 20 in the noise-free case and under different noise contaminations. The recognition performance is illustrated in Figures 7-9. Figures 7 and 8 illustrate that our algorithm converges rapidly as the hidden node numbers increase. When the hidden node numbers are both greater than 7, our algorithm achieves perfect performance. From Figure 9, although the convergence is slower than in the noise-free case, our algorithm still converges quickly. As a consequence, appropriate hidden node numbers can be selected according to the above analysis. For simplicity, the hidden node numbers in our experiments are set as $L_1 = L_2 = 10$.

Experiment Results and Analysis
In this subsection, experiments are carried out to validate the efficiency of the presented algorithm with respect to the sparse representation, the fusion strategy, the classification approach and the overall algorithm. To demonstrate the robustness of the presented RL2SR model, we compare it with several other models, namely SRC, CRC and DSRM. The recognition rates are shown in Table 2. Table 2 shows that each algorithm achieves its highest recognition rate in the noise-free case and its lowest under salt & pepper noise contamination. Since our adaptive weighted fusion process approximates a spatial smoothing filter, the decrease of the recognition rate under white Gaussian noise contamination is small. Furthermore, by using our RL2SR coefficients for fusion, the recognition rates reach 99.68%, 99.20% and 97.24% in the noise-free, white Gaussian noise and salt & pepper noise cases, respectively. These rates are 1.72% and 2.52% higher than those of DSRM in the noise-free and white Gaussian noise cases, and 2.96% higher than that of SRC in the salt & pepper noise case. This indicates that our RL2SR is robust to different noises, which improves the discriminative ability and increases the recognition rate of the fused image.
To evaluate the efficiency of the presented adaptive fusion strategy, comparison experiments with other fusion strategies (i.e., the sum and min-max fusion strategies) are carried out and the recognition performance is listed in Table 3. In this experiment, the number of training samples per class varies from 2 to 4. Table 3 illustrates that the recognition accuracy under every fusion strategy increases with the number of training samples. In particular, our fusion strategy achieves the highest recognition accuracies of 100%, 99.95% and 99.05% when the number of training samples is 4. Even when the number of training samples drops to 2, our approach achieves an accuracy of 92.27% in the case of salt & pepper noise contamination, which is 19.74% higher than that of the min-max fusion strategy (72.53%). This implies that our fusion strategy is the most robust among the compared sum and min-max fusion methods.
To demonstrate the classification efficiency of the presented TELM, we compare it with other classifiers, namely NN, KNN, ELM, MPELM and RELM. For these comparison classifiers, we vectorize the fused image and take this vector as the input. For each classifier, 3-6 training samples are selected to complete the recognition experiments and the classification accuracy curves are plotted in Figure 10. The curves in Figure 10 indicate that when the number of training samples is greater than or equal to 4, all the algorithms achieve excellent recognition rates. The experimental results also show that, in the noise-free case, the recognition rate of our TELM algorithm gradually increases with the number of training samples. Moreover, TELM achieves higher recognition rates than the other algorithms, although the improvement is not significant because the recognition rates are close to or even reach 100%. From the above analysis, the presented TELM algorithm achieves efficient recognition performance and strong stability compared with the other classifiers. Furthermore, more experiments are carried out on the multispectral palmprint database contaminated by the aforementioned noise. The recognition performance is given in Table 4.
It is observed from Table 4 that, in the case of white Gaussian noise contamination, TELM outperforms the other classifiers. Meanwhile, the recognition accuracy of TELM is remarkably higher than that of the other methods under salt & pepper noise contamination. Because of its impulsive character, salt & pepper noise strongly affects the distance measurement between different samples. When the testing samples are contaminated by salt & pepper noise, the recognition accuracy of the KNN method is 38.92%, which is significantly lower than that of our TELM algorithm (97.24%). Since TELM discards the eigenvectors corresponding to the smaller eigenvalues, which are more strongly correlated with the noise contamination, and retains the principal components corresponding to the major eigenvalues, it has a noise reduction capability and better discrimination ability. The experimental results in Table 4 also validate that our algorithm achieves a higher recognition rate and stronger robustness to noise contamination than the other classifiers.
To further validate the robustness of the proposed TELM algorithm, we add different degrees of salt & pepper noise to the testing samples and repeat the recognition experiment. Figure 11 shows some multispectral palmprint images contaminated by 10% to 80% salt & pepper noise. Figure 11a shows the original images under different spectra. Figure 11b-i show the contaminated images under different spectra as the degree of salt & pepper noise varies from 10% to 80%. Figure 12 illustrates the recognition rate curves of our TELM algorithm and some of the aforementioned comparison classifiers. The recognition rate curves of ELM, MPELM, RELM and our algorithm drop significantly when the percentage of noise contamination is greater than 60%. In contrast, the recognition rate curves of the NN and KNN methods are clearly lower than those of the other algorithms once the palmprint image is contaminated by more than 20% salt & pepper noise, that is, the accuracy of NN and KNN declines quickly. These curves show that our TELM algorithm outperforms the comparison classifiers at different percentages of noise contamination and possesses stronger robustness.
Table 5 lists the average classification times of the aforementioned classifiers on the whole database. Although our TELM classifier is slower than the ELM method, the difference (i.e., 0.08 s) is very small. Moreover, it is distinctly faster than the NN, KNN, MPELM and RELM classifiers. In particular, the classification time of NN is about five times that of our TELM. In addition, the above experimental results demonstrate that the recognition performance of our classifier exceeds that of the NN, KNN, ELM, MPELM and RELM classifiers. This validates the recognition ability and efficiency of our algorithm.
Table 6 lists the recognition rates of our RL2SR-TELM algorithm with different spectral combinations. This experiment is carried out in the noise-free, white Gaussian noise and 50% salt & pepper noise cases, with 4 training samples per class. Table 6 shows the excellent performance of our algorithm in the noise-free and white Gaussian noise cases. In the noise-free case, the recognition accuracy reaches 100% for most of the spectral combinations. Even when the sample is contaminated by white Gaussian noise, our algorithm achieves an accuracy of more than 99.50% for all of the spectral combinations and 99.95% for the combination of the Blue, Green, Red and NIR spectra. When the testing sample is contaminated by salt & pepper noise, the recognition rate declines significantly, with the lowest rate (76.75%) obtained under the NIR spectrum. Meanwhile, under the combination of the Blue, Green, Red and NIR spectra, our RL2SR-TELM algorithm achieves recognition rates of 100%, 99.95% and 99.05% in the noise-free, white Gaussian noise and salt & pepper noise cases, respectively. This indicates that our RL2SR-TELM algorithm has excellent robustness to noise contamination.
Table 7 compares the recognition rates of our RL2SR-TELM algorithm with those of some state-of-the-art palmprint recognition methods, such as the deep scattering network method [18], the texture feature-based method [42] and the DCT-based features method [43]. Our algorithm achieves excellent recognition performance for the different numbers of training samples. Although its recognition accuracy is 0.32% lower than that of the deep scattering network method when the number of training samples is three, it is higher than that of the texture feature-based method and the DCT-based features method when the number of training samples is four. In particular, the recognition rate of our algorithm reaches 100% when the number of training samples is greater than four.
Table 8 compares the recognition rates of our RL2SR-TELM algorithm with those of some state-of-the-art multispectral palmprint recognition algorithms, namely matching score-level fusion by the LOC method, the DST-MPELM method, the AE-RELM method, quaternion PCA using the quaternion DWT method and image-level fusion by the DWT method. In this experiment, three samples per class constitute the training set. The experimental results in Table 8 illustrate that our algorithm achieves excellent recognition accuracy both in the noise-free case (99.68%) and under white Gaussian and salt & pepper noise contamination (99.20% and 97.24%, respectively). When the sample is contaminated by salt & pepper noise, the advantage of the presented algorithm is more obvious, with an accuracy that is 0.76%, 7.26%, 1.48%, 7.08% and 14.49% higher than those of the five comparison algorithms, respectively.
Table 9 reports the time cost of our RL2SR-TELM multispectral recognition algorithm for each test sample. Our RL2SR-TELM algorithm takes about 0.10945 s to recognize one test sample.
To further demonstrate the performance of our presented RL2SR-TELM method, in the case of salt & pepper noise contamination, we plot the cumulative match characteristic (CMC) curves generated by our RL2SR-TELM method and the aforementioned comparison methods. Figure 13 shows the CMC curves.
From Figure 13, our RL2SR-TELM method has the highest rank-1 recognition accuracy. Meanwhile, its CMC curve lies closest to the upper left corner of the coordinate system among the compared multispectral palmprint recognition approaches, which means that it converges fastest. This implies that our algorithm outperforms the others in recognition accuracy and noise robustness, which is consistent with the aforementioned experimental results and analysis.
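For readers unfamiliar with CMC curves, the following sketch shows how one is typically computed from a matrix of matching scores. It is a generic illustration, not the authors' evaluation code; the function name `cmc_curve` and the convention that a higher score means a better match are assumptions.

```python
import numpy as np

def cmc_curve(scores, true_labels, max_rank=None):
    """scores: (n_probes, n_classes), higher is better; returns hit rate at ranks 1..max_rank."""
    scores = np.asarray(scores, dtype=float)
    true_labels = np.asarray(true_labels)
    n_probes, n_classes = scores.shape
    max_rank = n_classes if max_rank is None else max_rank
    order = np.argsort(-scores, axis=1)  # candidate classes sorted from best to worst
    # 0-based position at which the true class appears in each sorted list.
    positions = np.argmax(order == true_labels[:, None], axis=1)
    # Fraction of probes whose true class is within the top r candidates.
    return np.array([(positions < r).mean() for r in range(1, max_rank + 1)])
```

The first value of the returned array is the rank-1 recognition accuracy, and a curve that rises to 1 within a few ranks lies close to the upper left corner of the plot.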

Conclusions
In this paper, a novel RL2SR-TELM algorithm is presented for multispectral palmprint recognition. Since the L2 regularization term is employed, the regularized objective function is convex and a closed-form solution can be obtained efficiently. In addition, a new measurement, namely WSCCI, and an adaptive fusion framework are proposed to construct the fused multispectral palmprint images. For the classification task, we extend the conventional extreme learning machine to the tensor domain and present a TELM algorithm. It deals with the palmprint image in two-dimensional space directly and makes the best use of its spatial structure to enhance the classification ability. Extensive experiments on the PolyU multispectral palmprint database confirm the strong robustness, excellent recognition accuracy and high efficiency of our proposed algorithm.