Article

A Dimension Reduction Framework for HSI Classification Using Fuzzy and Kernel NFLE Transformation

1 Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan
2 Department of Information Management, National United University, Miaoli 36063, Taiwan
3 Department of Computer Science and Information Engineering, National United University, Miaoli 36063, Taiwan
* Author to whom correspondence should be addressed.
Remote Sens. 2015, 7(11), 14292-14326; https://doi.org/10.3390/rs71114292
Submission received: 23 May 2015 / Revised: 16 October 2015 / Accepted: 22 October 2015 / Published: 29 October 2015
(This article belongs to the Special Issue Earth Observations for the Sustainable Development)

Abstract

In this paper, a general nearest feature line (NFL) embedding (NFLE) transformation called fuzzy-kernel NFLE (FKNFLE) is proposed for hyperspectral image (HSI) classification, in which kernelization and fuzzification are considered simultaneously. Though NFLE has successfully demonstrated its discriminative capability, the non-linear manifold structure cannot be characterized efficiently by the linear scatters of the linear NFLE method. In the proposed scheme, samples are projected into a kernel space and assigned larger membership weights when their neighbors belong to the same class. The within-class and between-class scatters are calculated using these fuzzy weights, and the best transformation is obtained by maximizing the Fisher criterion in the kernel space. In that way, the kernelized manifold learning preserves the local manifold structure in a Hilbert space as well as the locality of the manifold structure in the reduced low-dimensional space. The proposed method was compared with various state-of-the-art methods on three benchmark data sets. Based on the experimental results, the proposed FKNFLE outperformed the other, more conventional methods.


1. Introduction

Dimensionality reduction (DR) is a critical issue in hyperspectral image (HSI) classification because multispectral, hyperspectral, and ultraspectral sensors generate high-dimensional data with abundant spectral bands. Classifying these spectral data is challenging because a vast number of samples has to be collected for training beforehand. In addition, the spectral properties of different land covers are often too similar to separate them clearly. Hence, an effective DR method is an essential step for extracting salient features for classification.
Recently, a number of DR methods have been proposed; they can be classified into three categories: linear analysis, manifold learning, and kernelization. Linear analysis methods model the linear variation of samples and find a transformation that maximizes or minimizes a scatter matrix, e.g., principal component analysis (PCA) [1], linear discriminant analysis (LDA) [2], and discriminant common vectors (DCV) [3]. Sample scatters are represented in the global Euclidean structure in these methods. They work well for DR or classification if samples are linearly separable or follow a Gaussian distribution. However, when samples are distributed on a manifold structure, the local structure of a sample in a high-dimensional space is not apparent under global measurements. In addition, the classification performance of linear analysis methods deteriorates when the decision boundaries are predominantly nonlinear [4]. Manifold learning methods have been proposed to reveal the local structure of samples. He et al. [5] proposed the locality preserving projection (LPP) method to preserve the local structure of training samples for face recognition. Since LPP represents sample scatter using the relationship between neighbors, the local manifold structure is preserved and the performance is more effective than that of the linear analysis methods. Tu et al. [6] used the Laplacian eigenmap (LE) method for land cover classification using polarimetric synthetic aperture radar data; the LE algorithm reduces features from a high-dimensional polarimetric manifold space to an intrinsic low-dimensional manifold space. Wang and He [7] investigated LPP for DR in HSI classification. Kim et al. [8] utilized the locally linear embedding (LLE) method to reduce the dimensionality of HSIs. Li et al. [9,10] used the local Fisher discriminant analysis (LFDA) method, which integrates the properties of LDA and LPP, to reduce the dimensionality of HSI data. Luo et al. [11] proposed a discriminative and supervised neighborhood preserving embedding (NPE) method for feature extraction in HSI classification. Zhang et al. [12] proposed a manifold regularized sparse low-rank approximation, which treats the hyperspectral image as a data cube, for HSI classification. These manifold learning methods all preserve the local structure of samples and improve on the performance of conventional linear analysis methods. However, according to Boots and Gordon [13], the applicability of linear manifold learning is limited by noise. Generally, the discriminative salient features of training samples are extracted using certain evaluation processes, and an appropriate kernel function can improve the performance of a given method [14]. Kernelization approaches have therefore been proposed for improving the performance of HSI classification. Boots and Gordon [13] introduced a kernelization method to alleviate the limitation of manifold learning. Scholkopf et al. [15] proposed a kernel PCA (KPCA) method for nonlinear DR; KPCA generates a high-dimensional Hilbert space to extract the non-linear structure that is missed by PCA. Furthermore, Lin et al. [16] proposed a general framework for multiple kernel learning during DR. They unify the multiple kernel representation, and the multiple feature representations of data are consequently revealed in a low dimension.
On the other hand, a composite kernel scheme, which is a linear combination of multiple kernels, extracts both spectral and spatial information [17]. Chen et al. [18] presented a sparse representation of kernels for HSI classification, in which a query sample is represented via all training samples in an induced kernel space, and pixels within a local neighborhood are also represented by combinations of training samples. Similar to the idea of multiple kernels, Zhang et al. [19] proposed a multiple-feature combination method for HSI classification, which combined spectral, texture, and shape features to increase the HSI classification performance.
In previous works, the nearest feature line (NFL) strategy was embedded in a linear transformation for dimension reduction in face recognition [20] and HSI classification [21]. However, nonlinear and non-Euclidean structures are not efficiently extracted by a linear transformation. Fuzzification and kernelization are two efficient tools for handling such nonlinear spaces, and the fuzzy methodology was adopted in previous work [26]. In this study, a general NFLE transformation, called fuzzy-kernel NFLE (FKNFLE), is proposed for feature extraction in which kernelization and fuzzification are simultaneously considered. In addition, more extensive experimental analysis was conducted: three benchmark data sets were evaluated instead of the single set in [26], and the proposed method was compared with state-of-the-art algorithms for performance evaluation.
The rest of this paper is organized as follows: Some related works are reviewed in Section 2. In Section 3, the kernelization and fuzzification strategies are introduced and incorporated into the NFLE algorithm. Several experiments were conducted to show the effectiveness of the proposed method as reported in Section 4. Furthermore, the comparisons with several state-of-the-art HSI classification methods are given. Finally, conclusions are given in Section 5.

2. Related Works

In this study, three approaches, nearest feature line embedding (NFLE) [20,21], kernelization [15], and the fuzzy k nearest neighbor (FKNN) rule [22], were considered to reduce the feature dimensions for HSI classification. Before introducing the proposed method, brief reviews of NFLE and kernelization are presented in the following. Given $N$ $d$-dimensional training samples $X = [x_1, x_2, \dots, x_N] \in \mathbb{R}^{d \times N}$ consisting of $N_C$ land-cover classes $C_1, C_2, \dots, C_{N_C}$, the new samples in a low-dimensional space are obtained by the linear projection $y_i = w^T x_i$, where $w$ is the learned linear projection matrix for DR.

2.1. Nearest Feature Line Embedding (NFLE)

NFLE is a linear transformation for DR. The sample scatters are represented in a Laplacian matrix form by using the point-to-line strategy which originated from the nearest linear combination (NLC) approach [23]. The objective function is defined and minimized as follows:
$$O = \sum_i \Big( \sum_{i \neq m \neq n} \| y_i - L_{m,n}(y_i) \|^2 \, l_{m,n}(y_i) \Big) = \sum_i \Big\| y_i - \sum_j M_{i,j} y_j \Big\|^2 = \mathrm{tr}\big( Y (I - M)^T (I - M) Y^T \big) = \mathrm{tr}\big( w^T X (D - W) X^T w \big) = \mathrm{tr}\big( w^T X L X^T w \big) \quad (1)$$
Here, $L_{m,n}(y_i)$ is the projection point of $y_i$ onto the line $L_{m,n}$, and the weight $l_{m,n}(y_i)$ (being 1 or 0) represents the connectivity relationship from point $y_i$ to the feature line $L_{m,n}$ that passes through two points $y_m$ and $y_n$. The projection point $L_{m,n}(y_i)$ is represented as a linear combination of points $y_m$ and $y_n$: $L_{m,n}(y_i) = y_m + t_{m,n}(y_n - y_m)$, in which $t_{m,n} = (y_i - y_m)^T (y_n - y_m) / \big( (y_n - y_m)^T (y_n - y_m) \big)$ and $i \neq m \neq n$. Using simple algebraic operations, the discriminant vector from point $y_i$ to the projection point $L_{m,n}(y_i)$ can be represented as $y_i - \sum_j M_{i,j} y_j$, in which two values in the $i$th row of matrix $M$ are set as $M_{i,m} = t_{n,m}$ and $M_{i,n} = t_{m,n}$, with $t_{n,m} + t_{m,n} = 1$, when the weight $l_{m,n}(y_i) = 1$. The other values in the $i$th row are set to zero, i.e., $M_{i,j} = 0$ if $j \neq m, n$. The mean squared distance in Equation (1) from all training points to their NFLs is then obtained as $\mathrm{tr}(w^T X L X^T w)$, in which $L = D - W$ and $D$ is a diagonal matrix whose entries are the column sums of the similarity matrix $W$. From the results of Yan et al. [24], matrix $W$ is defined as $W_{i,j} = (M + M^T - M^T M)_{i,j}$ when $i \neq j$, and is zero otherwise; in addition, $\sum_j M_{i,j} = 1$. Matrix $L$ in Equation (1) is thus a Laplacian matrix. For more details, refer to [20,21].
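To make the point-to-line projection concrete, the short sketch below (in Python/NumPy, not part of the original paper) computes $t_{m,n}$ and the projected point $L_{m,n}(y_i)$ for a query point and a feature line through two prototype points; all variable names are illustrative only.

```python
import numpy as np

def nfl_projection(y_i, y_m, y_n):
    """Project y_i onto the feature line passing through y_m and y_n.

    Returns the interpolation parameter t_mn and the projection point
    L_mn(y_i) = y_m + t_mn * (y_n - y_m), following Section 2.1.
    """
    direction = y_n - y_m
    t_mn = np.dot(y_i - y_m, direction) / np.dot(direction, direction)
    proj = y_m + t_mn * direction
    return t_mn, proj

# Toy usage: a 3-band "pixel" projected onto the line through two prototypes.
y_i = np.array([0.2, 0.5, 0.9])
y_m = np.array([0.1, 0.4, 0.8])
y_n = np.array([0.3, 0.7, 1.0])
t_mn, L_mn = nfl_projection(y_i, y_m, y_n)
print(t_mn, L_mn, np.linalg.norm(y_i - L_mn))  # point-to-line distance
```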
Considering the class labels in supervised classification, two parameters $K_1$ and $K_2$ are manually determined for calculating the within-class scatter $S_w$ and the between-class scatter $S_b$, respectively:
$$S_w = \sum_{k=1}^{N_C} \sum_{x_i \in C_k} \sum_{L_{m,n} \in F_{K_1}(x_i, C_k)} \big( x_i - L_{m,n}(x_i) \big) \big( x_i - L_{m,n}(x_i) \big)^T, \quad \text{and} \quad (2)$$
$$S_b = \sum_{k=1}^{N_C} \sum_{x_i \in C_k} \sum_{l=1, l \neq k}^{N_C} \sum_{L_{m,n} \in F_{K_2}(x_i, C_l)} \big( x_i - L_{m,n}(x_i) \big) \big( x_i - L_{m,n}(x_i) \big)^T \quad (3)$$
$F_{K_1}(x_i, C_k)$ indicates the set of $K_1$ NFLs within the same class $C_k$ as point $x_i$, i.e., those with $l_{m,n}(x_i) = 1$, and $F_{K_2}(x_i, C_l)$ is a set of $K_2$ NFLs belonging to classes different from that of point $x_i$. The Fisher criterion $\mathrm{tr}(w^T S_b w / w^T S_w w)$ is then maximized to find the projection matrix $w$, which is composed of the eigenvectors with the largest corresponding eigenvalues. A new sample in the low-dimensional space can be obtained by the linear projection $y = w^T x$, and the nearest neighbor (one-NN) matching rule is applied for template matching.
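Once $S_w$ and $S_b$ have been accumulated, maximizing the Fisher criterion amounts to a generalized eigenvalue problem $S_b w = \lambda S_w w$. A minimal sketch follows, assuming the scatters are already available as NumPy arrays; the small ridge regularizer is an assumption for numerical stability, not part of the paper's formulation.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_projection(S_b, S_w, n_components, reg=1e-6):
    """Solve S_b w = lambda * S_w w and keep the top eigenvectors.

    reg adds a small ridge to S_w so the generalized problem stays
    well conditioned (an assumption, not from the paper).
    """
    d = S_w.shape[0]
    eigvals, eigvecs = eigh(S_b, S_w + reg * np.eye(d))
    order = np.argsort(eigvals)[::-1]            # largest eigenvalues first
    return eigvecs[:, order[:n_components]]      # d x n_components matrix w

# Usage: a sample x is embedded in the reduced space as y = w.T @ x.
```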

2.2. Kernelization of LDA

In kernel LDA, considering the nonlinear mapping function from a space $X$ to a Hilbert space $H$, $\phi: x \in X \rightarrow \phi(x) \in H$, the within-class and between-class scatters in space $H$ are calculated as
$$S_w^{\phi} = \sum_{k=1}^{N_C} \sum_{x_i \in C_k} \big( \phi(x_i) - \bar{\phi}_k \big) \big( \phi(x_i) - \bar{\phi}_k \big)^T, \quad \text{and} \quad (4)$$
$$S_b^{\phi} = \sum_{k=1}^{N_C} \big( \bar{\phi}_k - \bar{\phi} \big) \big( \bar{\phi}_k - \bar{\phi} \big)^T \quad (5)$$
Here, $\bar{\phi}_k = \frac{1}{n_k} \sum_{i=1}^{n_k} \phi(x_i)$ and $\bar{\phi} = \frac{1}{N} \sum_{i=1}^{N} \phi(x_i)$ represent the class mean and the population mean in space $H$, respectively. To generalize LDA to the nonlinear case, the dot product trick is exclusively used. The dot product in the Hilbert space $H$ is given by the kernel function $k(x_i, x_j) = k_{i,j} = \phi^T(x_i) \phi(x_j)$. Let the symmetric $N \times N$ matrix $K$ be composed of the dot products in feature space $H$, i.e., $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) = (k_{i,j})$, $i, j = 1, 2, \dots, N$. The kernel operator $K$ makes the construction of a linear separating function in space $H$ equivalent to that of a nonlinear separating function in space $X$. Kernel LDA also maximizes the between-class scatter and minimizes the within-class scatter, i.e., $\max (w^T S_b^{\phi} w / w^T S_w^{\phi} w)$. This maximization is equivalent to the following eigenvector resolution: $\lambda S_w^{\phi} w = S_b^{\phi} w$. There is a set of coefficients $\alpha$ for $w = \sum_{i=1}^{N} \alpha_i \phi(x_i)$ such that the largest eigenvalue gives the maximum of the scatter quotient $\lambda = w^T S_b^{\phi} w / w^T S_w^{\phi} w$.
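As a concrete illustration of the dot-product (kernel) trick, the sketch below builds the $N \times N$ Gram matrix $K$ with an RBF kernel, the kernel family also used later for kernel NFLE. This is an illustrative sketch only; the bandwidth value is an arbitrary assumption.

```python
import numpy as np

def rbf_gram_matrix(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for columns of X.

    X has shape (d, N): d spectral bands, N training pixels.
    """
    sq_norms = np.sum(X ** 2, axis=0)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    return np.exp(-gamma * np.clip(sq_dists, 0.0, None))

X = np.random.rand(220, 100)   # e.g., 100 pixels with 220 bands
K = rbf_gram_matrix(X, gamma=0.5)
print(K.shape, np.allclose(K, K.T))  # (100, 100), symmetric
```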

3. Fuzzy Kernel Nearest Feature Line Embedding (FKNFLE)

According to the analyses above, an effective DR scheme should extract the discriminant features of a non-Euclidean and non-linear space. To this end, fuzzy kernel nearest feature line embedding (FKNFLE) is proposed for HSI classification. The idea of FKNFLE is to incorporate fuzziness and kernelization into the manifold learning method. The kernel function not only generates a non-linear feature space for discriminant analysis, but also increases the robustness to noise during the training phase. Manifold learning methods preserve the local structure of samples in the Hilbert space. On the other hand, the fuzzy kernel nearest neighbor method extracts the non-Euclidean structures of the training samples to enhance the discriminative capability. NFLE has been successfully applied in HSI classification, but noise variations and highly non-linear data distributions limit the performance of such manifold learning. A kernel trick is used to alleviate this problem, as introduced in the following.

3.1. Kernelization of NFLE

The kernelization function adopted in this study was inspired by that in [15]. Let $\phi: x \in X \rightarrow \phi(x) \in H$ be a nonlinear mapping from a low-dimensional space to a high-dimensional Hilbert space $H$. The mean squared distance from all training points to their NFLs in the Hilbert space is written as follows:
$$\sum_i \big\| \phi(y_i) - L_{m,n}(\phi(y_i)) \big\|^2 = \sum_i \Big\| \phi(y_i) - \sum_j M_{i,j} \phi(y_j) \Big\|^2 = \mathrm{tr}\big( \phi^T(Y) (I - M)^T (I - M) \phi(Y) \big) = \mathrm{tr}\big( \phi^T(Y) (D - W) \phi(Y) \big) = \mathrm{tr}\big( w^T \phi(X) L \phi^T(X) w \big) \quad (6)$$
Then, the objective function in Equation (6) is minimized and expressed in a Laplacian matrix form. The eigenvector problem of kernel NFLE in the Hilbert space is expressed as:
$$\big[ \phi(X) L \phi^T(X) \big] w = \lambda \big[ \phi(X) D \phi^T(X) \big] w \quad (7)$$
To extend NFLE to its kernel version, the implicit feature vector $\phi(x)$ does not need to be obtained explicitly. The dot product of two samples in the Hilbert space is expressed exclusively through a kernel function: $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$. The eigenvectors of Equation (7) are represented by linear combinations of $\phi(x_1), \phi(x_2), \dots, \phi(x_N)$, i.e., $w = \sum_{i=1}^{N} \alpha_i \phi(x_i) = \phi(X) \alpha$ with coefficients $\alpha = [\alpha_1, \alpha_2, \dots, \alpha_N]^T \in \mathbb{R}^N$. The eigenvector problem then becomes:
$$K L K \alpha = \lambda K D K \alpha \quad (8)$$
Let the coefficient vectors $\alpha^1, \alpha^2, \dots, \alpha^N$ (in column format) be the solutions of Equation (8). Given a testing point $z$, its projections onto the eigenvectors $w^k$ are obtained as follows:
$$w^k \cdot \phi(z) = \sum_{i=1}^{N} \alpha_i^k \langle \phi(z), \phi(x_i) \rangle = \sum_{i=1}^{N} \alpha_i^k K(z, x_i), \quad (9)$$
where $\alpha_i^k$ is the $i$th element of the coefficient vector $\alpha^k$. The RBF (radial basis function) kernel is used in this study. The within-class and between-class scatters in the kernel space are thus defined as follows:
$$S_w^{\phi} = \sum_{k=1}^{N_C} \sum_{\phi(x_i) \in C_k} \sum_{L_{m,n} \in F_{K_1}(\phi(x_i), C_k)} \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big) \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big)^T, \quad \text{and} \quad (10)$$
$$S_b^{\phi} = \sum_{k=1}^{N_C} \sum_{\phi(x_i) \in C_k} \sum_{l=1, l \neq k}^{N_C} \sum_{L_{m,n} \in F_{K_2}(\phi(x_i), C_l)} \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big) \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big)^T. \quad (11)$$
The kernelized manifold learning preserves the non-linear local structure in a Hilbert space. The distances in the NFLE approach are calculated by the Euclidean distance-based measurement. On the other hand, the non-Euclidean structure of training samples can be further extracted by fuzzification. The FKNN algorithm [22] enhances the discriminant power among samples by assigning the higher membership grades to the samples whose neighbors are within the same class. By doing so, the non-Euclidean structures are extracted, and the discriminative power of samples can be enhanced.
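Before turning to fuzzification, note that classifying an unseen pixel with the kernelized transform only needs kernel evaluations against the training set, as in Equation (9). The sketch below is illustrative only; it assumes the coefficient vectors from Equation (8) are already available and uses an arbitrary RBF bandwidth.

```python
import numpy as np

def rbf_kernel_vector(z, X_train, gamma=0.5):
    """k(z, x_i) for every training pixel x_i (columns of X_train)."""
    diffs = X_train - z[:, None]
    return np.exp(-gamma * np.sum(diffs ** 2, axis=0))

def kernel_project(z, X_train, alphas, gamma=0.5):
    """Equation (9): coordinates of z along each kernel eigenvector.

    alphas has shape (N, r): one coefficient vector alpha^k per column.
    """
    k_z = rbf_kernel_vector(z, X_train, gamma)   # shape (N,)
    return alphas.T @ k_z                        # shape (r,)

# Usage: y = kernel_project(test_pixel, X_train, alphas); then 1-NN matching.
```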

3.2. Fuzzification of NFLE

Consider $N$ samples in the reduced space, $Y = [y_1, y_2, \dots, y_N]$, and their corresponding fuzzy membership grades $\pi(y_i)$ for each sample $y_i$. The objective function is re-defined as follows:
$$O = \sum_i \pi(y_i) \Big( \sum_{i \neq m \neq n} \| y_i - L_{m,n}(y_i) \|^2 \, l_{m,n}(y_i) \Big) = \sum_i \pi(y_i) \Big\| y_i - \sum_j M_{i,j} y_j \Big\|^2 = \mathrm{tr}\big( Y^T (F E \odot I - F E \odot M)^T (F E \odot I - F E \odot M) Y \big) = \mathrm{tr}\big( Y^T (F E \odot D - F E \odot W) Y \big) = \mathrm{tr}\big( Y^T (D_{fuzzy} - W_{fuzzy}) Y \big) = \mathrm{tr}\big( w^T X L_{fuzzy} X^T w \big) \quad (12)$$
Here, each sample is assigned a fuzzy grade $\pi(y_i)$. Element $M_{i,j}$ denotes the connectivity relationship between point $y_i$ and line $L_{m,n}$, the same as in Equation (1). Two non-zero terms, $M_{i,n} = t_{m,n}$ and $M_{i,m} = t_{n,m}$, are set, and $\sum_j M_{i,j} = 1$. Using simple algebraic operations, the objective function with fuzzification is represented by a Laplacian matrix, in which the fuzzy terms $\pi(y_i)$ constitute the column vector $F$ of size $N \times 1$, and $E$ is a $1 \times N$ row vector of all ones.
Similarly, given $N$ samples $\phi(X) = \{\phi(x_1), \phi(x_2), \dots, \phi(x_N)\}$ in a Hilbert space, the membership grade of a specified sample $\phi(x_i)$ with respect to its $K_3$ neighbors is designed as in the following equation for computing the within-class scatter:
$$\pi(x_i) = \begin{cases} 0.51 + 0.49\,(q_i / K_3), & \text{if } q_i \geq \theta_{within} \\ 0.49\,(q_i / K_3), & \text{otherwise} \end{cases} \quad (13)$$
Here, $q_i$ is the number of samples whose labels are the same as that of $\phi(x_i)$ among its $K_3$ nearest neighbors, and $\theta_{within}$ is a manually set threshold. If $q_i = K_3$, then $\pi(x_i)$ equals 1, i.e., all neighbors are in the same class. Adding the fuzzy term $\pi(x_i)$, the within-class scatter matrix becomes:
$$S_w^{\phi F} = \sum_{k=1}^{N_C} \sum_{\phi(x_i) \in C_k} \pi(x_i) \sum_{L_{m,n} \in F_{K_1}(\phi(x_i), C_k)} \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big) \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big)^T \quad (14)$$
Similarly, a fuzzy term $\lambda(x_i)$ is adopted to evaluate the membership grade of $\phi(x_i)$ with respect to its neighbors during the computation of the between-class scatter, as follows:
$$\lambda(x_i) = \begin{cases} 0.51 + 0.49\,(p_i / K_4), & \text{if } p_i \geq \theta_{between} \\ 0.49\,(p_i / K_4), & \text{otherwise} \end{cases} \quad (15)$$
Here, $p_i$ is the number of samples with labels different from that of $\phi(x_i)$ among its $K_4$ nearest neighbors, and $\theta_{between}$ is a given threshold. If $p_i = K_4$, the term $\lambda(x_i)$ equals 1, meaning that all neighbors have labels different from that of $\phi(x_i)$. The fuzzy term $\lambda(x_i)$ is added into the between-class scatter matrix to generate a new one:
$$S_b^{\phi F} = \sum_{k=1}^{N_C} \sum_{\phi(x_i) \in C_k} \lambda(x_i) \sum_{l=1, l \neq k}^{N_C} \sum_{L_{m,n} \in F_{K_2}(\phi(x_i), C_l)} \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big) \big( \phi(x_i) - L_{m,n}(\phi(x_i)) \big)^T. \quad (16)$$
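For clarity, the sketch below evaluates the membership grades of Equations (13) and (15) from the labels of the $K_3$ (or $K_4$) nearest neighbors. It is illustrative only: the neighbor search itself is assumed to be done elsewhere, and the comparison direction in the (garbled) threshold condition is assumed to be "greater than or equal to".

```python
import numpy as np

def within_class_grade(neighbor_labels, own_label, K3, theta_within):
    """Equation (13): pi(x_i) from the K3 nearest neighbors of x_i."""
    q_i = np.sum(neighbor_labels == own_label)     # same-class neighbors
    if q_i >= theta_within:
        return 0.51 + 0.49 * (q_i / K3)
    return 0.49 * (q_i / K3)

def between_class_grade(neighbor_labels, own_label, K4, theta_between):
    """Equation (15): lambda(x_i) from the K4 nearest neighbors of x_i."""
    p_i = np.sum(neighbor_labels != own_label)     # different-class neighbors
    if p_i >= theta_between:
        return 0.51 + 0.49 * (p_i / K4)
    return 0.49 * (p_i / K4)

# Example: 14 neighbors, 12 of which share the label of x_i.
labels = np.array([1] * 12 + [2] * 2)
print(within_class_grade(labels, 1, K3=14, theta_within=7))   # 0.93
```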
Hence, kernelization and fuzzification are simultaneously integrated into the NFLE transformation for feature extraction. The pseudo-code of algorithm FKNFLE is listed in Table 1. In this way, a general form of the NFLE learning method using kernelization and fuzzification is obtained for DR. The advantages of the proposed method are threefold: the kernelization strategy generates a non-linear feature space for the discriminant analysis and increases the robustness to noise of manifold learning; the kernelized manifold learning preserves the local manifold structure in a Hilbert space as well as the locality of the manifold structure in the reduced low-dimensional space; and non-Euclidean structures are extracted to improve discriminative ability using the FKNN strategy.
Table 1. The pseudo-codes of FKNFLE (fuzzy-kernel nearest feature line) training algorithm.
Input: A $d$-dimensional training set $X = [x_1, x_2, \dots, x_N]$ consisting of $N_C$ classes, projected into a Hilbert space $\phi(X) = [\phi(x_1), \phi(x_2), \dots, \phi(x_N)]$, and parameters $K_1$, $K_2$, $K_3$, $K_4$.
Output: The projection transformation $w$.
Step 1: PCA projection: Samples are transformed from a high-dimensional space into a low-dimensional subspace by matrix $w_{PCA}$.
Step 2: Computation of the within-class scatter: The possible feature lines $L_{m,n}$ are generated from the samples within the same class for a specified point $\phi(x_i)$. Find the $K_3$ nearest neighbors of point $\phi(x_i)$ to calculate the fuzzy membership value $\pi(x_i)$ by Equation (13). Select the $K_1$ vectors $\phi(x_i) - L_{m,n}(\phi(x_i))$ with the smallest distances, and compute the within-class scatter $S_w^{\phi F}$ by Equation (14).
Step 3: Computation of the between-class scatter: Generate the feature lines from the samples whose labels are different from that of point $\phi(x_i)$. Find the $K_4$ nearest neighbors of point $\phi(x_i)$ to calculate the fuzzy membership value $\lambda(x_i)$ by Equation (15). Select the $K_2$ discriminant vectors $\phi(x_i) - L_{m,n}(\phi(x_i))$ with the smallest distances from point $\phi(x_i)$ to the feature lines. The between-class scatter $S_b^{\phi F}$ is obtained from Equation (16).
Step 4: Fisher criterion maximization: The Fisher criterion $w^* = \arg\max_w S_b^{\phi F} / S_w^{\phi F}$ is maximized to obtain the best transformation matrix, which is composed of the $\gamma$ eigenvectors with the largest eigenvalues.
Step 5: Output the final transformation matrix: $w = w_{PCA} w^*$.
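Steps 1 and 5 amount to composing a PCA pre-projection with the learned discriminant transform. A minimal illustration is given below, assuming $w^*$ (here w_star) has already been obtained in the PCA-reduced space; it is a sketch, not the authors' implementation.

```python
import numpy as np

def pca_basis(X, n_components):
    """Columns of the returned matrix span the top principal directions.

    X has shape (d, N) with pixels as columns (Step 1 of Table 1).
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components]                    # d x n_components

# Step 5: the final mapping is the composition w = w_PCA @ w_star,
# so a pixel x is embedded as y = w.T @ x.
```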

4. Experimental Results

4.1. Description of Data Sets

In this section, the experimental results are discussed to demonstrate the effectiveness of the proposed method for HSI classification. Three HSI benchmarks were used for evaluation. The first data set, the Indian Pines Site (IPS) image, was generated from AVIRIS (Airborne Visible/Infrared Imaging Spectrometer), captured by the Jet Propulsion Laboratory and NASA/Ames in 1992. The IPS image was captured over a six-mile area in the western portion of Northwest Tippecanoe County (NTC). A false color IR image of dataset IPS is shown in Figure 1a. The IPS dataset contained 16 land-cover classes with 220 bands: Alfalfa (46), Corn-notill (1428), Corn-mintill (830), Corn (237), Grass-pasture (483), Grass-trees (730), Grass-pasture-mowed (28), Hay-windrowed (478), Oats (20), Soybeans-notill (972), Soybeans-mintill (2455), Soybeans-cleantill (593), Wheat (205), Woods (1265), Bldg-Grass-Tree-Drives (386), and Stone-Steel-Towers (93), where the numbers in parentheses are the numbers of collected pixels. The ground truths of 10,249 pixels in dataset IPS were manually labeled for training and testing. In order to analyze the performance of the various algorithms, the 10 classes with more than 300 samples each were adopted in the experiments, forming a subset IPS-10 of 9620 pixels. Nine hundred training samples of the 10 classes in subset IPS-10 were randomly chosen from the 9620 pixels, and the remaining samples were used for testing.
Figure 1. False color of IR images for datasets (a) Indian Pines Site (IPS); (b) Pavia University; and (c) Pavia City Center.
The other two HSI data sets adopted in the experiments were obtained from the Reflective Optics System Imaging Spectrometer (ROSIS) instrument covering the city of Pavia, Italy. The two scenes, the university area and the Pavia city center, contained 103 and 102 data bands, respectively, both with a spectral coverage from 0.43 to 0.86 μm and a spatial resolution of 1.3 m. The image sizes of these two areas were 610 × 340 and 1096 × 715 pixels, respectively. Figure 1b,c show the false color IR images of these two data sets. Nine land-cover classes were available in each data set, and the samples in each data set were separated into a training and a testing subset. For the Pavia University data set, 90 training samples per class were randomly collected for training, and the 8046 remaining samples were used for performance evaluation. Similarly, the numbers of training and testing samples used for the Pavia City Center data set were 810 and 9529, respectively.
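The per-class random sampling described above (e.g., 90 training pixels per class for Pavia University) can be sketched as follows; this is illustrative code under stated assumptions, not the authors' procedure.

```python
import numpy as np

def split_per_class(labels, n_train_per_class, seed=0):
    """Return (train_idx, test_idx) with a fixed number of randomly chosen
    training pixels from every labeled class."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train_idx.append(idx[:n_train_per_class])
        test_idx.append(idx[n_train_per_class:])
    return np.concatenate(train_idx), np.concatenate(test_idx)

# e.g., 90 training samples per class for the Pavia University scene:
# tr, te = split_per_class(ground_truth_labels, 90)
```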

4.2. Toy Examples

Two toy examples are given to illustrate the discriminative power of FKNFLE. Firstly, 561 samples with 220 dimensions from three classes (Grass/pasture, Woods, and Grass/trees) were collected from a hyperspectral image. The samples were projected onto the first three axes using eight algorithms: PCA, LDA, supervised LPP, LFDA [28], NFLE, FNFLE, KNFLE, and FKNFLE, as shown in Figure 2. The class samples are represented by green triangles (class G), blue stars (class B), and red circles (class R). A simple analysis was done by observing the sample distributions in the reduced spaces. Since the global Euclidean structure criterion was considered during the PCA and LDA training phases, the samples from the three classes were mixed after the PCA and LDA projections, as shown in Figure 2a,b. Since the samples were distributed on a manifold structure in the original space, the manifold learning algorithms, e.g., supervised LPP, LFDA, and NFLE, were executed to preserve the local structure of the samples. The sample distributions projected by supervised LPP, LFDA, and NFLE are displayed in Figure 2c–e, respectively; the three classes were separated much more effectively than those in Figure 2a,b. The class boundaries, however, were still unclear due to the non-linear and non-Euclidean sample distributions in the original space. Kernelization and fuzzification were therefore applied to handle the original non-Euclidean and non-linear space. Comparing the sample distributions in Figure 2e,h, the boundary between classes G and R in Figure 2e is still unclear using the NFLE transformation. The sample distributions of FNFLE and KNFLE shown in Figure 2f,g were obtained when the fuzzification and kernelization strategies were used, respectively. Obviously, classes G and R were more effectively separated than in Figure 2e: the local structures of the samples were preserved, and the class separability improved, although several points located at the boundaries were still misclassified. When both strategies were adopted in FKNFLE, only one red point was mislocated in class G, and classes G and R were clearly separated. From this analysis, both the fuzzification and kernelization strategies enhanced the discriminative power of the manifold learning methods.
Figure 2. The first toy sample distributions projected on the first three axes using algorithms (a) PCA (principal component analysis); (b) LDA (linear discriminant analysis); (c) supervised LPP (locality preserving projection); (d) LFDA (local Fisher discriminant analysis); (e) NFLE (nearest feature line (NFL) embedding); (f) FNFLE (fuzzy nearest feature line embedding); (g) KNFLE (kernel nearest feature line embedding); and (h) FKNFLE (fuzzy-kernel nearest feature line).
Secondly, 561 samples with 220 dimensions from three classes (Corn-notill, Soybeans-mintill, and Soybeans-notill) were collected from a hyperspectral image. The samples were projected onto the first three axes by the same eight algorithms: PCA, LDA, supervised LPP, LFDA, NFLE, FNFLE, KNFLE, and FKNFLE, as shown in Figure 3. The class samples are again represented by green triangles (class G), blue stars (class B), and red circles (class R), and a simple analysis was done by observing the sample distributions in the reduced spaces. Since the global Euclidean structure criterion was considered during the PCA and LDA training phases, the samples of the three classes were mixed after the PCA and LDA projections, as shown in Figure 3a,b. Since the samples were distributed on a manifold structure in the original space, the manifold learning algorithms, e.g., supervised LPP, LFDA, and NFLE, were executed to preserve the local structure of the samples; their sample distributions are displayed in Figure 3c–e, respectively. Due to the strong overlap of classes G, R, and B, the classes were mixed, and the separation was relatively low compared with Figure 2c–e. However, when the kernelization and fuzzification strategies were used, class B was more effectively separated than in Figure 3c–e. According to this analysis, both fuzzification and kernelization enhanced the discriminative power of the manifold learning methods even in the case of strong class overlap.
Figure 3. The second toy sample distributions projected on the first three axes using algorithms (a) PCA; (b) LDA; (c) supervised LPP; (d) LFDA; (e) NFLE; (f) FNFLE; (g) KNFLE; and (h) FKNFLE.

4.3. Classification Results

The proposed methods, NFLE [20,21], KNFLE, FNFLE [26], and FKNFLE, were compared with two state-of-the-art algorithms, i.e., the nearest regularized subspace (NRS) [25] and NRS-LFDA [25]. The parameter configurations for algorithms NRS [29] and NRS-LFDA followed [25]. The gallery samples were randomly chosen for training the transformation matrix, and the query samples were matched with the gallery samples using the nearest neighbor (NN) matching rule. Each algorithm was run 30 times to obtain the average rates. To determine an appropriate reduced dimension for FKNFLE, the available training samples were used to evaluate the overall accuracy (OA) versus the reduced dimension on the benchmark datasets. As shown in Figure 4, the best dimensions of algorithm FKNFLE for datasets IPS-10, Pavia University, and Pavia City Center were 25, 50, and 50, respectively. The proposed FKNFLE and KNFLE algorithms are both extensions of algorithm NFLE. From the classification results shown in Figure 4, though FKNFLE achieves the best results at specific reduced dimensions on the three datasets, its OA rates vary considerably with the reduced dimension. Moreover, two additional parameters, $K_3$ and $K_4$, need to be tuned for the fuzzification. On the other hand, the performance of KNFLE is more robust than that of FKNFLE: KNFLE usually achieves a high performance even at low reduced dimensions, e.g., five or 10, and it outperforms the other algorithms at all reduced dimensions on datasets IPS-10 and Pavia City Center. Compared with NRS-LFDA, slightly lower OA rates were obtained on dataset Pavia University. From this analysis, algorithm KNFLE is more competitive than FKNFLE in HSI classification.
Figure 4. The classification accuracy versus the reduced dimension on three benchmark datasets using the various algorithms: (a) IPS-10; (b) Pavia University; (c) Pavia City Center.
The average classification rates versus the number of training samples on dataset IPS-10 are shown in Figure 5a; algorithms FKNFLE and KNFLE outperformed the other methods. The accuracy rate of FKNFLE was 4% higher than that of FNFLE, showing that the kernelization strategy effectively enhanced the discriminative power. The performance of FKNFLE was better than that of KNFLE by 0.8%, and the rate of FNFLE was higher than that of NFLE by 0.7%, which shows that the fuzzification strategy slightly enhanced the performance. Figure 5b,c also show the overall accuracy versus the number of training samples for the benchmark datasets Pavia University and Pavia City Center, respectively. According to the classification rates on these two datasets, algorithm FKNFLE outperformed the other methods; in addition, the classification results were insensitive to the number of training samples. Next, the maps of the classification results for dataset IPS-10 are given in Figure 6. The classification results of algorithms FKNFLE, KNFLE, FNFLE, NFLE, NRS, and NRS-LFDA are given on maps of 145 × 145 pixels together with the ground truth. The speckle-like errors of FKNFLE were fewer than those of the other algorithms. Figure 7 and Figure 8 give the maps of the classification results for datasets Pavia University and Pavia City Center, respectively. Once again, the speckle-like errors of FKNFLE were fewer than in the case of the other algorithms. In addition, the thematic maps of Pavia University and Pavia City Center obtained using the proposed FKNFLE method are shown in Figure 9a,b, respectively. Observing the results in Figure 9a, the roads, buildings, and the areas of the university were clearly classified even though there was some speckle-like noise in the images; the roads, rivers, buildings, small islands, and the areas in the city center were classified in the same way (see Figure 9b). Algorithm FKNFLE effectively classified the land cover even with limited training samples.
Figure 5. The accuracy rates versus the number of training samples for datasets (a) IPS-10; (b) Pavia University; and (c) Pavia City Center.
Figure 6. The classification maps of dataset IPS using various algorithms: (a) The ground truth; (b) FKNFLE; (c) KNFLE; (d) FNFLE; (e) NFLE; (f) NRS (nearest regularized subspace); (g) LFDA-NRS (local Fisher discriminant analysis-NRS); (h) LFDA; (i) supervised LPP; (j) LDA; and (k) PCA.
Figure 7. The classification maps of dataset Pavia University using various algorithms: (a) The ground truth; (b) FKNFLE; (c) KNFLE; (d) FNFLE; (e) NFLE; (f) NRS; (g) LFDA-NRS; (h) LFDA; (i) supervised LPP; (j) LDA; and (k) PCA.
Figure 8. The classification maps of dataset Pavia City Center using various algorithms: (a) The ground truth, (b) FKNFLE; (c) KNFLE; (d) FNFLE; (e) NFLE; (f) NRS; (g) LFDA-NRS; (h) LFDA; (i) supervised LPP; (j) LDA; and (k) PCA.
Figure 9. The thematic maps of (a) Pavia University, and (b) Pavia City Center using the proposed FKNFLE algorithm.
The proposed method was also compared with the other classification methods in terms of computational time. All methods were implemented in MATLAB on a personal computer with an i7 2.93-GHz CPU and 12.0 GB of RAM. The computational times of the various algorithms are tabulated in Table 2 for the IPS-10, Pavia University, and Pavia City Center datasets. Considering the training time, the proposed FKNFLE algorithm was generally about two times and 15 times faster than NRS and NRS-LFDA, respectively. Due to the fuzzification process, algorithms FKNFLE and FNFLE were about 13 times and 15 times slower than KNFLE and NFLE, respectively.
Table 2. The training and testing times of various algorithms for the benchmark datasets (s).
Dataset | IPS-10 (Training / Testing) | Pavia University (Training / Testing) | Pavia City Center (Training / Testing)
Number of samples | 900 / 8720 | 810 / 8046 | 810 / 9529
NFLE-NN | 10 / 18 | 9 / 16 | 9 / 20
KNFLE-NN | 12 / 18 | 11 / 16 | 11 / 20
FNFLE-NN | 155 / 18 | 140 / 16 | 140 / 20
FKNFLE-NN | 156 / 18 | 141 / 16 | 141 / 20
NRS | 326 / 326 | 294 / 300 | 294 / 351
LFDA-NRS | 2331 / 327 | 2098 / 301 | 2098 / 352
In Table 3, Table 4 and Table 5, the producer's accuracy, overall accuracy, kappa coefficient, and user's accuracy defined from the error matrices (or confusion matrices) [27] were calculated for performance evaluation. They are briefly defined as follows. The user's accuracy and the producer's accuracy are two widely used measures of class accuracy. The user's accuracy is defined as the ratio of the number of correctly classified pixels in each class to the total number of pixels classified into that class; it is a measure of commission error. The producer's accuracy measures the error of omission and indicates the probability that samples of a given class on the ground are actually classified as such. The kappa coefficient, also called the kappa statistic, is a measure of the difference between the actual agreement and the chance agreement. The overall accuracies of the proposed method were 83.34% on IPS-10, 91.31% on Pavia University, and 97.59% on Pavia City Center, with kappa coefficients of 0.821, 0.910, and 0.971, respectively. Subset IPS-10 of 10 classes was used for fair comparisons with the other algorithms. An alternative classification on the whole IPS dataset of 16 classes was also performed: ten percent of the samples of each class were randomly chosen for training from the 10,249 pixels, except for class Oats, for which only three training samples were randomly chosen because of the small number of samples in this class. The remaining samples were used for testing. The classification error matrix is given in Table 6, in which the overall accuracy and kappa coefficient are 83.85% and 0.826, respectively.
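These accuracy measures can be computed directly from a confusion matrix of raw counts. The short sketch below is not from the paper; it shows one common way to obtain the overall accuracy, user's and producer's accuracy, and the kappa coefficient.

```python
import numpy as np

def accuracy_metrics(cm):
    """cm[i, j] = number of pixels of reference class i assigned to class j."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    diag = np.diag(cm)
    overall = diag.sum() / total
    producers = diag / cm.sum(axis=1)   # omission-error view (per reference class)
    users = diag / cm.sum(axis=0)       # commission-error view (per mapped class)
    chance = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total ** 2
    kappa = (overall - chance) / (1.0 - chance)
    return overall, users, producers, kappa

cm = np.array([[50, 3, 2],
               [4, 45, 1],
               [2, 2, 41]])
oa, ua, pa, kappa = accuracy_metrics(cm)
print(round(oa, 3), round(kappa, 3))
```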
Table 3. The classification error matrix for data set IPS-10 (in percentage).
Classes | Reference Data | User's Accuracy
 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
1 | 79.20 | 3.43 | 0.28 | 0.35 | 0 | 5.46 | 9.73 | 1.54 | 0 | 0 | 79.20
2 | 5.90 | 81.81 | 0 | 0.12 | 0 | 1.33 | 6.39 | 4.34 | 0 | 0.12 | 81.81
3 | 0 | 0 | 97.49 | 1.46 | 0.21 | 0.42 | 0 | 0.21 | 0.42 | 0.84 | 97.49
4 | 0 | 0 | 0.27 | 96.30 | 0 | 0 | 0 | 0 | 0 | 3.42 | 96.30
5 | 0 | 0 | 0.42 | 0 | 99.58 | 0 | 0 | 0 | 0 | 0 | 99.58
6 | 5.14 | 0.21 | 0.10 | 0.41 | 0 | 88.89 | 4.42 | 0.72 | 0 | 0.10 | 88.89
7 | 10.59 | 5.58 | 0.29 | 0.33 | 0.04 | 9.78 | 69.98 | 3.30 | 0 | 0.12 | 69.98
8 | 1.35 | 4.05 | 1.52 | 0.34 | 0 | 1.69 | 1.85 | 88.53 | 0 | 0.67 | 88.53
9 | 0 | 0 | 3.32 | 0.16 | 0 | 0 | 0 | 0 | 90.83 | 5.69 | 90.83
10 | 0 | 0 | 3.89 | 5.70 | 0 | 0 | 0 | 0.26 | 10.88 | 79.27 | 79.27
Producer's Accuracy | 77.51 | 86.04 | 90.62 | 91.57 | 99.75 | 82.63 | 75.76 | 89.51 | 88.94 | 87.85 |
Kappa Coefficient: 0.821 | Overall Accuracy: 83.34%
Table 4. The classification error matrix for data set Pavia University (in percentage).
Classes | Reference Data | User's Accuracy
 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
1 | 90.18 | 3.15 | 0 | 0 | 0 | 3.24 | 1.35 | 1.26 | 0.81 | 90.18
2 | 2.31 | 92.50 | 0 | 2.31 | 0 | 1.85 | 0 | 1.01 | 0 | 92.50
3 | 0 | 0 | 90.07 | 2.38 | 1.58 | 0.99 | 2.97 | 0.99 | 0.99 | 90.07
4 | 0 | 1.23 | 2.84 | 90.24 | 1.42 | 1.42 | 1.51 | 1.32 | 0 | 90.24
5 | 0.63 | 1.13 | 0.75 | 1.26 | 91.91 | 0.63 | 1.64 | 0.88 | 1.13 | 91.91
6 | 1.10 | 1.19 | 1.38 | 1.56 | 1.19 | 92.54 | 0.55 | 0.46 | 0 | 92.54
7 | 0 | 1.12 | 0.51 | 0.61 | 2.24 | 0 | 93.25 | 1.22 | 1.02 | 93.25
8 | 0.47 | 1.42 | 0.95 | 1.42 | 2.38 | 1.90 | 0 | 90.76 | 0.66 | 90.76
9 | 1.14 | 0 | 2.15 | 2.01 | 0 | 2.29 | 0 | 2.15 | 90.22 | 90.22
Producer's Accuracy | 94.10 | 90.92 | 91.30 | 88.65 | 91.25 | 88.25 | 92.08 | 90.71 | 95.14 |
Kappa Coefficient: 0.910 | Overall Accuracy: 91.31%
Table 5. The classification error matrix for data set Pavia City Center (in percentage).
Classes | Reference Data | User's Accuracy
 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
1 | 98.61 | 0.17 | 0.51 | 0.34 | 0.34 | 0 | 0 | 0 | 0 | 98.61
2 | 1.04 | 97.47 | 0.43 | 0 | 0 | 0.34 | 0.17 | 0.52 | 0 | 97.47
3 | 0.59 | 0.82 | 96.23 | 0.69 | 0.99 | 0 | 0 | 0 | 0.69 | 96.23
4 | 0 | 0.56 | 0.66 | 96.68 | 0.37 | 0.47 | 0.66 | 0.56 | 0 | 96.68
5 | 0 | 0 | 0.43 | 0.34 | 97.73 | 0.26 | 0.34 | 0.34 | 0.52 | 97.73
6 | 0.35 | 0.26 | 0.61 | 0 | 0 | 98.15 | 0 | 0.26 | 0.35 | 98.15
7 | 0.35 | 0.26 | 0 | 0.35 | 0 | 0.44 | 98.23 | 0.35 | 0 | 98.23
8 | 0 | 0 | 0.37 | 0.30 | 0.37 | 0.52 | 0.45 | 97.43 | 0.52 | 97.43
9 | 0.39 | 0.59 | 0.79 | 0.29 | 0.29 | 0 | 0 | 0 | 97.60 | 97.60
Producer's Accuracy | 97.32 | 97.34 | 96.20 | 97.67 | 97.64 | 97.97 | 98.38 | 97.96 | 97.91 |
Kappa Coefficient: 0.971 | Overall Accuracy: 97.59%
Table 6. The classification error matrix for data set IPS of 16 classes (in percentage).
Classes | Reference Data | UA
 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
1 | 78.22 | 0 | 0 | 0 | 4.35 | 0 | 0 | 17.43 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 78.22
2 | 0 | 77.15 | 2.22 | 0.69 | 0 | 0.18 | 0 | 0 | 0.07 | 5.17 | 13.26 | 1.19 | 0 | 0 | 0.07 | 0 | 77.15
3 | 0 | 3.32 | 73.03 | 3.04 | 0 | 0 | 0 | 0 | 0 | 0.71 | 15.15 | 4.75 | 0 | 0 | 0 | 0 | 73.03
4 | 0 | 13.91 | 8.84 | 65.83 | 0.42 | 0 | 0 | 0.82 | 0 | 1.29 | 7.59 | 1.29 | 0 | 0 | 0 | 0 | 65.83
5 | 0 | 0.21 | 0.23 | 0.24 | 94.61 | 0.22 | 0 | 0 | 0 | 0.80 | 0.81 | 1.04 | 0 | 1.85 | 0 | 0 | 94.61
6 | 0 | 0.12 | 0.14 | 0 | 0.19 | 97.11 | 0 | 0 | 0 | 0 | 0.68 | 0 | 0 | 0.58 | 1.18 | 0 | 97.11
7 | 0 | 0 | 0 | 0 | 3.61 | 0 | 92.81 | 3.58 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 92.81
8 | 1.81 | 0 | 0 | 0 | 0 | 0 | 0 | 98.19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 98.19
9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 94.99 | 0 | 0 | 0 | 5.01 | 0 | 0 | 0 | 94.99
10 | 0 | 3.83 | 0.31 | 0 | 0.32 | 0.31 | 0 | 0 | 0.11 | 81.74 | 12.95 | 0.43 | 0 | 0 | 0 | 0 | 81.74
11 | 0 | 4.62 | 3.52 | 0.22 | 0.32 | 0.31 | 0 | 0 | 0.08 | 5.45 | 83.95 | 1.37 | 0 | 0 | 0.16 | 0 | 83.95
12 | 0 | 4.93 | 7.93 | 0.61 | 0.12 | 0.14 | 0 | 0 | 0 | 2.05 | 9.79 | 74.25 | 0 | 0 | 0.17 | 0 | 74.25
13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.49 | 99.51 | 0 | 0 | 0 | 99.51
14 | 0 | 0 | 0 | 0 | 0.47 | 0.08 | 0 | 0 | 0 | 0 | 0 | 0 | 0.08 | 96.03 | 3.34 | 0 | 96.03
15 | 0 | 0 | 0.54 | 0.54 | 7.25 | 15.02 | 0 | 0 | 0.20 | 1.85 | 2.59 | 0.25 | 0.25 | 16.85 | 54.66 | 0 | 54.66
16 | 0 | 1.03 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.06 | 3.28 | 0 | 0 | 0 | 0 | 94.63 | 94.63
PA | 97.73 | 70.70 | 75.47 | 92.49 | 84.73 | 85.65 | 1 | 81.81 | 99.51 | 81.64 | 56.21 | 87.29 | 94.90 | 83.27 | 91.74 | 1 |
Kappa Coefficient: 0.826 | Overall Accuracy: 83.85%
UA: User’s Accuracy, PA: Producer’s Accuracy
In this study, since we focused on the performance of kernelization and fuzzification, the k-NN classifier was adopted rather than the more complex support vector machine (SVM) classifier. An analysis with various k values is given in Table 7 to demonstrate the performance of the k-NN classifier. Here, k was set to 1, 3, and 4, and a voting strategy was used. Clearly, a suitably larger k value for the k-NN classifier can achieve a more competitive performance. Next, the empirical parameters $K_1$, $K_2$, $K_3$, and $K_4$ were determined by a cross-validation technique. The training samples were separated into two groups, a training subset and a validation subset, e.g., 50% of the samples for training and the rest for validation. The validation results were generated under various parameter settings, and the proper setting was determined by selecting the best results. From the cross-validation experiment, the parameters $K_1 = 8$, $K_2 = 28$, $K_3 = 14$, and $K_4 = 28$ were chosen. After that, the transformation was obtained from the whole training set. A sensitivity analysis on the four parameters $K_1$, $K_2$, $K_3$, and $K_4$ is shown in Figure 10. In Figure 10a, the variance of the classification rates was relatively low for parameter $K_1$ versus $K_2$. In contrast, as shown in Figure 10b–f, parameters $K_3$ and $K_4$ resulted in a higher variance of classification rates. In other words, the classification rates were not sensitive to the NFLE parameters $K_1$ and $K_2$, but were sensitive to the fuzzy k nearest neighbor parameters $K_3$ and $K_4$. According to the results of the sensitivity analysis, the parameters selected for the proposed algorithm were $K_1 = 8$, $K_2 = 28$, $K_3 = 14$, and $K_4 = 28$, which are consistent with the parameters from the cross-validation test.
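The k-NN voting rule used for this analysis can be sketched as follows; this is an illustrative Python sketch, not the authors' MATLAB implementation, and it assumes the projected training features, their labels, and a projected query are already available.

```python
import numpy as np

def knn_vote(Y_train, labels, y_query, k=3):
    """Classify y_query by majority vote among its k nearest projected neighbors.

    Y_train: (r, N) projected training features; labels: (N,) class indices.
    """
    dists = np.linalg.norm(Y_train - y_query[:, None], axis=0)
    nearest = np.argsort(dists)[:k]
    votes = labels[nearest]
    classes, counts = np.unique(votes, return_counts=True)
    return classes[np.argmax(counts)]

# With k = 1 this reduces to the 1-NN matching rule used elsewhere in the paper.
```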
Table 7. The classification performance using various k-NN for data set IPS-10 (in percentage).
Dataset | FKNFLE (k = 1 / 3 / 4) | KNFLE (k = 1 / 3 / 4) | FNFLE (k = 1 / 3 / 4) | NFLE (k = 1 / 3 / 4)
IPS-10 | 83.34 / 84.19 / 85.11 | 83.07 / 83.55 / 84.19 | 78.37 / 78.98 / 79.10 | 77.59 / 78.89 / 78.93
Pavia City Center | 97.59 / 98.18 / 98.24 | 96.55 / 96.84 / 96.88 | 95.08 / 95.32 / 95.51 | 94.58 / 95.26 / 95.41
Pavia University | 91.31 / 92.13 / 92.36 | 89.50 / 90.04 / 90.19 | 85.10 / 86.05 / 86.57 | 83.80 / 84.63 / 85.05
Figure 10. The sensitivity analysis of the four parameters $K_1$, $K_2$, $K_3$, $K_4$: (a) $K_1$ vs. $K_2$; (b) $K_1$ vs. $K_3$; (c) $K_1$ vs. $K_4$; (d) $K_2$ vs. $K_3$; (e) $K_2$ vs. $K_4$; (f) $K_3$ vs. $K_4$.
Furthermore, since the proposed algorithm is based on kernelization and fuzzification, a performance comparison between the proposed algorithms and the well-known kernelization-based algorithm GCK-MLR (generalized composite kernel multinomial logistic regression) [17,30] is given in Table 8 and Table 9. Algorithm GCK-MLR is a multinomial logistic regression (MLR)-based classifier with composite kernels, in which four kernels, i.e., the spectral, spatial, spectral-spatial cross-information, and spatial-spectral cross-information kernels, strongly affect the classification results. The training configurations in [17] were quite different from ours, and it would be unfair to compare the results of a single-kernel method (KNFLE) with those of a multi-kernel method (GCK-MLR). In the experiment, we therefore re-trained our classifiers using the same configurations as in [17], whose training configurations and classification results for GCK-MLR were adopted directly. Moreover, only the GCK-MLR results using the single spectral kernel $K_\omega$ were used for a fair comparison. The IPS dataset of 16 classes and the Pavia University dataset were evaluated, as shown in Table 8 and Table 9, respectively. Considering the IPS dataset of 16 classes in Table 8, algorithm GCK-MLR outperforms the proposed method in terms of overall accuracy, while its average accuracy is lower than those of algorithms FKNFLE and KNFLE. In Table 9, the overall accuracy and average accuracy of the proposed method are both higher than those of algorithm GCK-MLR.
Table 8. The comparison between algorithm GCK-MLR ($K_\omega$) and the proposed method for dataset IPS of 16 classes (in percent).
Class | Train | Test | GCK-MLR ($K_\omega$) | FKNFLE | KNFLE
Alfalfa | 3 | 51 | 47.06 ± 15.41 | 65.22 ± 15.32 | 56.52 ± 16.42
Corn-no till | 71 | 1363 | 78.24 ± 3.01 | 70.66 ± 3.05 | 67.44 ± 3.03
Corn-min till | 41 | 793 | 64.17 ± 3.01 | 67.71 ± 3.04 | 71.08 ± 3.05
Corn | 11 | 223 | 48.211 ± 1.76 | 43.88 ± 11.54 | 47.68 ± 12.14
Grass/pasture | 24 | 473 | 87.76 ± 2.27 | 84.47 ± 2.18 | 87.16 ± 2.58
Grass/tree | 37 | 710 | 95.13 ± 1.40 | 96.58 ± 1.42 | 94.79 ± 1.32
Grass/pasture-mowed | 3 | 23 | 53.04 ± 11.74 | 92.86 ± 11.88 | 92.82 ± 10.68
Hay-windrowed | 24 | 465 | 98.84 ± 0.61 | 97.28 ± 0.59 | 98.12 ± 0.62
Oats | 3 | 17 | 68.82 ± 17.33 | 65.10 ± 16.31 | 70.12 ± 15.35
Soybeans-no till | 48 | 920 | 68.42 ± 5.22 | 70.27 ± 5.12 | 66.87 ± 5.42
Soybeans-min till | 123 | 2245 | 82.56 ± 1.26 | 77.43 ± 1.31 | 73.93 ± 1.25
Soybeans-clean till | 30 | 584 | 74.52 ± 5.35 | 62.56 ± 5.32 | 61.89 ± 5.52
Wheat | 10 | 202 | 99.36 ± 0.52 | 91.71 ± 0.54 | 94.63 ± 0.51
Woods | 64 | 1230 | 95.46 ± 1.53 | 96.60 ± 1.49 | 97.15 ± 1.54
Bldg-grass-tree-drives | 19 | 361 | 50.75 ± 3.49 | 38.34 ± 3.18 | 48.19 ± 3.38
Stone-steel towers | 4 | 91 | 62.09 ± 6.95 | 82.80 ± 6.89 | 83.87 ± 6.59
Overall accuracy | | | 80.16 ± 0.73 | 77.19 ± 0.71 | 76.43 ± 0.73
Average accuracy | | | 73.40 ± 1.26 | 75.22 ± 1.21 | 75.76 ± 1.25
Table 9. The comparison between algorithm GCK-MLR ($K_\omega$) and the proposed method for dataset Pavia University of nine classes (in percent).
Class | Train | Test | GCK-MLR ($K_\omega$) | FKNFLE | KNFLE
Asphalt | 548 | 6631 | 82.64 | 83.14 | 82.64
Bare soil | 540 | 18,649 | 68.62 | 82.89 | 82.07
Bitumen | 392 | 2099 | 75.04 | 81.75 | 79.32
Bricks | 524 | 3064 | 97.00 | 93.21 | 92.95
Gravel | 265 | 1345 | 99.41 | 99.93 | 99.93
Meadows | 532 | 5029 | 93.88 | 80.47 | 79.48
Metal Sheets | 375 | 1330 | 90.08 | 92.26 | 92.41
Shadows | 514 | 3682 | 91.36 | 85.61 | 85.17
Trees | 231 | 947 | 97.57 | 99.89 | 99.89
Overall accuracy | | | 80.34 | 85.76 | 84.04
Average accuracy | | | 88.40 | 88.79 | 88.20

5. Conclusions

In this paper, a general NFLE transformation, FKNFLE, has been proposed for HSI classification. Kernelization and fuzzification were both incorporated into NFLE in order to extract non-linear and non-Euclidean structures. The proposed FKNFLE was compared with NFLE and with the state-of-the-art algorithms NRS and NRS-LFDA on three land-cover benchmarks, IPS-10, Pavia University, and Pavia City Center. From the experimental results, algorithm FKNFLE outperformed the other algorithms. More specifically, using the 1-NN classifier, the rates of FKNFLE were higher than those of NFLE by 5.75%, 3.01%, and 7.51% for datasets IPS-10, Pavia City Center, and Pavia University, respectively. Though FKNFLE achieved high classification rates using the features of a single pixel, there was some speckle-like noise in the resulting maps. In the future, the features of spatial neighbors will be adopted for better classification and segmentation.

Acknowledgments

The work was supported by Ministry of Science and Technology of Taiwan under Grant nos. MOST104-2221-E-008-030-MY3 and MOST103-2221-E-008-058-MY3.

Author Contributions

The idea was conceived by Ying-Nong Chen and Chin-Chuan Han, performed by Ying-Nong Chen, Cheng-Ta Hsieh, and Ming-Gang Wen, analyzed by Chin-Chuan Han, Ying-Nong Chen, Cheng-Ta Hsieh, and Ming-Gang Wen, written by Chin-Chuan Han and Ying-Nong Chen, and revised by Ying-Nong Chen, Chin-Chuan Han and Kuo-Chin Fan.

Conflicts of Interest

We have no conflict of interest to declare.

References

  1. Turk, M.; Pentland, A.P. Face recognition using eigenfaces. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '91), Maui, HI, USA, 3–6 June 1991; pp. 586–591.
  2. Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720. [Google Scholar] [CrossRef]
  3. Cevikalp, H.; Neamtu, M.; Wikes, M.; Barkana, A. Discriminative common vectors for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 4–13. [Google Scholar] [CrossRef] [PubMed]
  4. Prasad, S.; Mann Bruce, L. Information fusion in kernel-induced spaces for robust subpixel hyperspectral ATR. IEEE Trans. Geosci. Remote Sens. Lett. 2009, 6, 572–576. [Google Scholar] [CrossRef]
  5. He, X.; Yan, S.; Ho, Y.; Niyogi, P.; Zhang, H.J. Face recognition using Laplacianfaces. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 328–340. [Google Scholar] [PubMed]
  6. Tu, S.T.; Chen, J.Y.; Yang, W.; Sun, H. Laplacian eigenmaps-based polarimetric dimensionality reduction for SAR image classification. IEEE Trans. Geosci. Remote Sens. 2011, 50, 170–179. [Google Scholar] [CrossRef]
  7. Wang, Z.; He, B. Locality preserving projections algorithm for hyperspectral image dimensionality reduction. In Proceedings of the 2011 19th International Conference on Geoinformatics, Shanghai, China, 24–26 June 2011; pp. 1–4.
  8. Kim, D.H.; Finkel, L.H. Hyperspectral image processing using locally linear embedding. In Proceedings of the 1st International IEEE EMBS Conference on Neural Engineering, Italy, 20–22 March 2003; pp. 316–319.
  9. Li, W.; Prasad, S.; Fowler, J.E.; Bruce, L.M. Locality-preserving discriminant analysis in kernel-induced feature spaces for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2011, 8, 894–898. [Google Scholar] [CrossRef]
  10. Li, W.; Prasad, S.; Fowler, J.E.; Bruce, L.M. Locality-preserving dimensionality reduction and classification for hyperspectral image analysis. IEEE Trans. Geosci. Remote Sens. 2012, 50, 1185–1198. [Google Scholar] [CrossRef]
  11. Luo, R.B.; Liao, W.Z.; Pi, Y.G. Discriminative supervised neighborhood preserving embedding feature extraction for hyperspectral-image classification. Telkomnika 2012, 10, 1051–1056. [Google Scholar] [CrossRef]
  12. Zhang, L.; Zhang, Q.; Zhang, L.; Tao, D.; Huang, X.; Du, B. Ensemble manifold regularized sparse low-rank approximation for multi-view feature embedding. Pattern Recognit. 2015, 48, 3102–3112. [Google Scholar] [CrossRef]
  13. Boots, B.; Gordon, G.J. Two-manifold problems with applications to nonlinear system Identification. In Proceedings of the 29th International Conference on Machine Learning, Edinburgh, UK, 26 June–1 July 2012.
  14. Odone, F.; Barla, A.; Verri, A. Building kernels from binary strings for image matching. IEEE Trans. Image Process. 2005, 14, 169–180. [Google Scholar] [CrossRef] [PubMed]
  15. Scholkopf, B.; Smola, A.; Muller, K.R. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998, 10, 1299–1319. [Google Scholar] [CrossRef]
  16. Lin, Y.Y.; Liu, T.L.; Fuh, C.S. Multiple kernel learning for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1147–1160. [Google Scholar] [CrossRef] [PubMed]
  17. Li, J.; Reddy Marpu, P.; Plaza, A.; Bioucas-Dias, J.M.; Atli Benediktsson, J. Generalized composite kernel framework for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2013, 51, 4816–4829. [Google Scholar] [CrossRef]
  18. Chen, Y.; Nasrabadi, N.M.; Tran, T.D. Hyperspectral image classification via kernel sparse representation. IEEE Trans. Geosci. Remote Sens. 2013, 51, 217–231. [Google Scholar] [CrossRef]
  19. Zhang, L.; Zhang, L.; Tao, D.; Huang, X. On combining multiple features for hyperspectral remote sensing image classification. IEEE Trans. Geosci. Remote Sens. 2012, 50, 879–893. [Google Scholar] [CrossRef]
  20. Chen, Y.N.; Han, C.C.; Wang, C.T.; Fan, K.C. Face recognition using nearest feature space embedding. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1073–1086. [Google Scholar] [CrossRef] [PubMed]
  21. Chang, Y.L.; Liu, J.N.; Han, C.C.; Chen, Y.N. Hyperspectral image classification using nearest feature line embedding approach. IEEE Trans. Geosci. Remote Sens. 2014, 52, 278–287. [Google Scholar] [CrossRef]
  22. Keller, J.J.M.; Gray, M.R.; Givens, J.A., Jr. A fuzzy k-nearest neighbor algorithm. IEEE Trans. Syst. Man Cybern. 1985, 15, 580–585. [Google Scholar] [CrossRef]
  23. Li, S.Z. Face recognition based on nearest linear combinations. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, USA, 23–25 June 1998; pp. 839–844.
  24. Yan, S.; Xu, D.; Zhang, B.; Zhang, H.J.; Yang, Q.; Lin, S. Graph embedding and extensions: a framework for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 40–51. [Google Scholar] [PubMed]
  25. Li, W.; Tramel, E.W.; Prasad, S.; Fowler, J.E. Nearest regularized subspace for hyperspectral classification. IEEE Trans. Geosci. Remote Sens. 2014, 52, 477–489. [Google Scholar] [CrossRef]
  26. Chen, Y.N.; Han, C.C.; Fan, K.C. Use fuzzy nearest feature line embedding for hyperspectral image classification. In Proceedings of the 4th International Conference Earth Observations and Societal Impacts, Miaoli, Taiwan, 22–24 June 2014.
  27. Lillesand, T.M.; Kiefer, R.W. Remote Sensing and Image Interpretation; Wiley: New York, NY, USA, 2000. [Google Scholar]
  28. Sugiyama-Sato Lab at the University of Tokyo. Available online: http://www.ms.k.u-tokyo.ac.jp/software.html (accessed on 26 October 2015).
  29. Github. Available online: https://github.com/eric-tramel/NRSClassifier (accessed on 15 May 2015).
  30. IEEE Publications. Available online: http://www.lx.it.pt/~jun/publications.html (accessed on 15 May 2015).
