Hierarchical Unsupervised Partitioning of Large Size Data and Its Application to Hyperspectral Images

Abstract: In this paper, we propose a truly unsupervised method to partition large-size images, where the number of classes, training samples, and other a priori information are not known. Partitioning an image without any such knowledge is a great challenge. This novel adaptive and hierarchical classification method is based on affinity propagation, where all criteria and parameters are adaptively calculated from the image to be partitioned. It can objectively discover the classes of an image without user intervention and therefore satisfies all the objectives of an unsupervised method. The hierarchical partitioning adopted allows the user to analyze and interpret the data very finely. The optimal partition, maximizing an objective criterion, provides the number of classes and the exemplar of each class. The efficiency of the proposed method is demonstrated through experimental results on hyperspectral images. The obtained results show its superiority over the most widely used unsupervised and semi-supervised methods. The developed method can be used in several application domains to partition large-size images or data. It allows the user to consider all or part of the obtained classes and makes it possible to select samples in an objective way during a learning process.


Introduction
The partitioning of large-size images, in order to exploit their rich information content for the construction of reliable decision-making systems, is of growing interest in many application fields. Indeed, hyperspectral images are today among the most widely used data thanks to their large spectral range (several hundred spectral bands covering the visible and infrared domains) and their fine spatial resolution (a few tens of centimeters). Due to this richness of information, which allows for better discrimination of objects, interest in hyperspectral images has increased in recent years in many application fields. These fields include geology [1], medicine [2][3][4][5][6][7], industrial production [8,9], safety [10], and the environment [11][12][13][14][15][16][17][18][19][20]. In this latter field, hyperspectral imagery has been of great interest and several applications have been treated, including the inventory of vegetation species [11,12], the early detection of vegetation diseases [13,14] and invasive species [15,16], the identification of marine algae [17,18], and the assessment of human and animal impacts on the environment [19,20]. Thus, to efficiently meet the needs of all these areas, any partitioning method for hyperspectral images must respect the physical nature of the information provided by these images. A plethora of published methods do not respect the precise information given by these images, where the ground truth (GT) data used is simplified [21,22], and the categorization of existing methods is sometimes ambiguous and inconsistent. For example, in [23], the authors propose a method using learning samples while defining it as semi-supervised. Other authors introduce, in [24], the number of classes in the partitioning process and qualify their method as unsupervised. Before reviewing these categories, we recall the properties that any partition of a set X of objects into classes C_j must satisfy:
Completeness: every object must belong to a class, i.e., ∀ x_i ∈ X, ∃ C_j such that x_i ∈ C_j.
Separability: the classes must be sufficiently differentiable so that an object can only be associated with one class.
Relevance: the association of an object with a class must be carried out according to an objective optimization criterion.
The different partitioning methods can be grouped into three categories [26], which we define as follows:
- Supervised methods require training samples to accomplish the partitioning task. Maximum Likelihood [27] and Support Vector Machines [28] are the most common methods in this category. We note that the information required by the methods of this category is not always available for many applications and, even when it exists, is not always reliable [21,22].
- Semi-supervised methods, often considered unsupervised in the literature, require the number of classes and threshold values or other empirical parameters from the operator. K-means [29] is a basic semi-supervised algorithm that classifies each object according to its similarity/dissimilarity to class centers, requiring the number of classes to be fixed by the user. Furthermore, it is sensitive to the random initialization of class centers, which makes it unstable [30][31][32]. Several extensions of K-means have been developed [33][34][35][36] but are still affected by the initial choice of class centers. The Fuzzy C-Means (FCM) algorithm [37][38][39][40], derived from K-means, is another semi-supervised algorithm; it is also sensitive to the initial class centers and to the choice of the fuzzification parameter.
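The sensitivity of K-means to the random initialization of class centers can be reproduced in a few lines. The sketch below is an illustrative, plain NumPy implementation of Lloyd's algorithm run from several random seeds on toy overlapping groups (standing in for pixel spectra); the final within-class scatter, and hence the partition, may differ from run to run.

```python
import numpy as np

def kmeans(X, k, seed, iters=50):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each object to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Recompute centers; keep the old one if a class empties.
        new = np.array([X[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1), d.min(1).sum()  # partition, within-class scatter

# Three overlapping 2-D groups; run from different random seeds.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(2, 1, (60, 2)),
               rng.normal(4, 1, (60, 2))])
scatters = [kmeans(X, 3, seed)[1] for seed in range(8)]
print(round(min(scatters), 1), round(max(scatters), 1))
```

A spread between the minimum and maximum scatter indicates that different initializations converged to different local optima, i.e., different partitions.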
In practice, the knowledge of the number of classes required by supervised and semisupervised methods is imprecise and not often accessible, in particular for hyperspectral aerial images with large landscape areas. As stated in [21,22,32], prior knowledge of the number of classes is a subjective process in nature, which precludes an absolute judgment as to the relevance of all data analysis. Consequently, these methods do not allow for the discovery of novel relevant classes. In this case, the knowledge of the number of classes in semi-supervised and supervised methods can be considered as a constraint and does not often reflect reality.
In order to partition objects or form learning classes, unsupervised partitioning methods present many advantages with respect to supervised and semi-supervised methods, as defined below.
- Unsupervised methods do not require the number of classes, associated learning samples, or any other prior knowledge. The number of classes is objectively estimated according to one or several optimization criteria. This category is called exploratory data analysis by Xu and Wunsch [32]. It is better suited to exploratory data analysis, to learning novel objects, and to discovering novel phenomena. Indeed, the nature of each class is identified after the partitioning process, not before. This category leads to a fine and complete analysis of the observed data and does not limit the analysis to known classes. For these reasons, it meets a real need in some application domains, such as Earth observation, where the areas to be analyzed are very large or difficult to access. It provides accurate, objective, and consistent solutions that reflect the real information content of images independently of the GT data or learning samples, which can be biased or simplified [21,22].
The definition of unsupervised methods given above ensures that no subjective knowledge is introduced by the user.
Several unsupervised methods have been proposed for data partitioning. In [41], an unsupervised version of the K-means algorithm, named Modified Linde-Buzo-Gray (MLBG), is proposed. This method improves the different stages of the Linde-Buzo-Gray (LBG) algorithm and is able to automatically determine the number of classes; however, it requires a long computing time. In [26], an optimized version of the baseline FCM algorithm (FCMO) was presented in order to make it stable and deterministic. Stability is achieved by initializing the class centers through an adaptive incremental procedure. In addition, an unsupervised evaluation criterion, based on the within- and between-class disparities, is introduced to estimate the optimal number of classes. This method can be used in unsupervised or semi-supervised mode.
One of the most elaborate unsupervised partitioning methods is Affinity Propagation (AP) [42]. This method has six advantages: (i) no a priori knowledge is required, in particular no training samples, which gives the user the possibility to detect and locate both known classes and new classes called "discovery classes"; (ii) stability of the results thanks to its deterministic character; (iii) the possibility to objectively select the samples of the classes in a learning system in order to detect them afterward; (iv) applicability to several domains without learning constraints; (v) usability in unsupervised or semi-supervised mode; and (vi) insensitivity to initialization. Due to these advantages, it is widely used in several application areas, such as environmental monitoring and safety [43][44][45][46][47], multimedia data management, and pattern recognition [48][49][50][51][52][53][54][55][56][57].
The main drawback of the standard AP method is that it is not applicable to partitioning large-size hyperspectral images. Indeed, the memory space required for each of the four main matrices grows quadratically with the number of pixels to be partitioned. In [25], Chehdi et al. suggested a reduction in the number of pixels to allow the application of AP to partition large-size images. To reduce this number, the hyperspectral image is divided into blocks and the reduction step is applied independently within each block. Then, the AP method is applied only to the exemplars of the duplicated pixels and to the non-duplicated pixels. However, in its current version, AP does not take into account the presence of identical objects after the reduction step in the calculation of the criteria used. Other drawbacks of the standard AP method are: (i) some parameters are not computed adaptively; and (ii) the criterion for identifying class exemplars may create classes that are not perfectly homogeneous, as demonstrated in Section 3.1.2.
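The quadratic memory growth mentioned above is easy to quantify. The snippet below estimates the footprint of a single N × N double-precision matrix for a few illustrative image sizes (the figures are ours, for illustration; the standard AP maintains four matrices of this size).

```python
# Memory needed for one N x N matrix of 8-byte floats, for images of
# increasing side length; N is the number of pixels to partition.
for side in (100, 500, 1000):
    n = side * side              # number of pixels
    gib = n * n * 8 / 2**30      # one matrix, double precision, in GiB
    print(f"{side}x{side} image: {gib:,.1f} GiB per matrix")
```

Already for a modest 500 × 500-pixel image, a single matrix exceeds the memory of any workstation, which motivates the block-based strategy described above.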
To extend the AP method to large-size data, such as hyperspectral aerial images, and to optimize it by adaptively calculating the criteria and parameters used, we propose here a new version without any prior knowledge introduced by the user. We recall that a relevant partitioning method, whatever its nature (supervised, semi-supervised, or unsupervised), must provide results to the end-user, taking into account the physical characteristics provided by measuring instruments [21,22].
The main contributions of this work are: (1) modification and optimization of several steps of the standard AP algorithm; (2) introduction of a new approach to partition large size images and proposal of a fusion procedure of the classes obtained in the blocks with stable results regardless of the chosen block size; and (3) generation of hierarchical partitioning. For AP optimization, the improvements are the estimation of the preference parameter value for each object; taking into account the presence of identical objects in the calculation of similarity, responsibility, and availability criteria; updating procedure of estimating the values of the responsibility and the availability criteria; and finally, modification of the decisional criterion used to estimate the number of classes and identify their exemplars. The proposed hierarchical method allows the user to obtain several partitions while indicating the most optimal one. These partitioning results are very important and necessary in several application fields to perform a fine analysis and interpretation of the formed classes.
The remainder of the paper is organized as follows: Section 2 provides an overview of the standard AP algorithm and related studies. Section 3 describes the proposed method, named "Unsupervised Partitioning by Optimized Affinity Propagation" (UP-OAP) and its hierarchical version, named "Hierarchical Unsupervised Partitioning approach by OAP" (HUP-OAP). Section 4 presents the assessment results of the proposed method on a synthetic hyperspectral image constructed from a real aerial image acquired by our platform. It is also evaluated on a real aerial large-size hyperspectral image. The target application of these images is the identification of marine algae species. To assess the efficiency of the proposed unsupervised method, each image is provided with validated GT data. Finally, Section 5 concludes this paper and provides some perspectives.

Review of the Standard Affinity Propagation Algorithm and Related Studies
Unsupervised partitioning methods, as defined in the introduction, have many advantages over supervised and semi-supervised methods: (i) they do not require any prior knowledge to aggregate objects into classes (neither the number of classes to discriminate nor learning samples); the number of classes is estimated according to a given optimization criterion; and (ii) they respect the physical characteristics of the objects in the formation of classes. Thus, unsupervised methods provide more relevant results because the decision criterion for object aggregation is independent of the GT data or learning samples, which can be biased or simplified in some cases [21,22]. For these reasons, we are interested in the development of an unsupervised partitioning method that excludes user intervention.

Overview of the Standard AP Algorithm
In the standard AP algorithm developed by Frey and Dueck [42], two message-passing procedures, called responsibility and availability, are used to exchange messages between objects. These messages identify, in an iterative manner, the best exemplar of each class that may exist. The responsibility, r(x_i, x_k), is sent from object x_i to candidate exemplar x_k and reflects how well-suited x_k would be as the exemplar of object x_i. The availability, a(x_i, x_k), is sent from candidate exemplar x_k to object x_i and reflects how appropriate it would be for object x_i to choose candidate exemplar x_k as its exemplar.

For the set X = {x_1, x_2, ..., x_N} of N objects to be partitioned, where each object x_i is characterized by a set of B features, let S, R, and A denote the similarity, responsibility, and availability matrices of size N × N, respectively; s(x_i, x_k), r(x_i, x_k), and a(x_i, x_k) are their respective elements for objects x_i and x_k. The similarity is defined as the opposite of the squared Euclidean distance:

s(x_i, x_k) = −‖x_i − x_k‖², (1)

Mathematically, responsibility and availability are defined as follows [42]:

r(x_i, x_k) = s(x_i, x_k) − max_{k′≠k} { a(x_i, x_k′) + s(x_i, x_k′) }, (2)

a(x_i, x_k) = min{ 0, r(x_k, x_k) + Σ_{i′∉{i,k}} max(0, r(x_i′, x_k)) }, for i ≠ k, (3)

a(x_k, x_k) = Σ_{i′≠k} max(0, r(x_i′, x_k)), (4)

A global preference parameter p, whose value is generally set to the minimum or the median value of the similarity matrix S, implicitly controls the number of classes: in this method, all diagonal elements s(x_k, x_k) of S are set to the value p instead of zero.
At each iteration m, the responsibility and availability matrices are estimated as follows:

R_m = (1 − λ) R_m′ + λ R_{m−1}, (5)

A_m = (1 − λ) A_m′ + λ A_{m−1}, (6)

where R_m′ and A_m′ are the responsibilities and availabilities computed at iteration m from Equations (2)-(4), and λ ∈ ]0, 1[ is a damping factor. At any step of the iterative process, responsibilities and availabilities are combined to identify the exemplar of each class to be formed. The criterion that identifies object x_k as the exemplar of object x_i is:

E*(x_i) = argmax_k [ r(x_i, x_k) + a(x_i, x_k) ]. (7)
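For concreteness, the message-passing scheme above can be rendered in a few lines of NumPy. This is a minimal, unoptimized sketch of the standard AP updates (similarity, responsibility, availability, damping) on a tiny toy dataset, not the optimized version proposed later in this paper.

```python
import numpy as np

def affinity_propagation(X, damping=0.5, iters=200):
    N = len(X)
    # Similarity: opposite of squared Euclidean distance (Eq. (1)).
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # Global preference on the diagonal: median of the similarities.
    np.fill_diagonal(S, np.median(S))
    R = np.zeros((N, N)); A = np.zeros((N, N))
    for _ in range(iters):
        # Responsibility: s(i,k) minus the best competing a+s (Eq. (2)).
        AS = A + S
        idx = AS.argmax(1)
        first = AS[np.arange(N), idx]
        AS[np.arange(N), idx] = -np.inf
        second = AS.max(1)
        Rnew = S - first[:, None]
        Rnew[np.arange(N), idx] = S[np.arange(N), idx] - second
        R = damping * R + (1 - damping) * Rnew          # damped update (Eq. (5))
        # Availability: self-responsibility plus positive support (Eqs. (3)-(4)).
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        Anew = Rp.sum(0)[None, :] - Rp
        dA = Anew.diagonal().copy()
        Anew = np.minimum(Anew, 0)
        np.fill_diagonal(Anew, dA)
        A = damping * A + (1 - damping) * Anew          # damped update (Eq. (6))
    # Exemplar of each object: argmax of r + a (Eq. (7)).
    return (R + A).argmax(1)

# Two tight pairs of 2-D objects: AP should find two classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = affinity_propagation(X)
print(labels)  # objects in the same class share the same exemplar index
```

Objects 0 and 1 end up sharing one exemplar and objects 2 and 3 another, with each exemplar pointing to itself.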
To adapt the preference parameter, a solution is proposed in [47]. Each preference parameter, p_j, is automatically adjusted during the iteration process according to the data distribution by fixing two thresholds. With this method, the problem of partitioning large-size images remains. Furthermore, the introduction of these two thresholds makes the method parametric. In [58], an extension of the AP algorithm is presented in order to reduce the computing time and the memory space, but the values of the preference parameter and the damping factor must be set by the user. Other propositions are described in [56,59,60], but they are also not adapted to partitioning large-size data, and the damping factor and the preference parameter are fixed by the user. Finally, in [61,62], the AP algorithm was combined with other methods to improve the learning task. For these last two methods, prior knowledge is required for the learning task, the preference parameter is not adaptive, and the damping factor must also be chosen by the user.

Proposed Hierarchical Unsupervised Partitioning Method
In this section, we first present a new unsupervised partitioning method based on an Optimization of Affinity Propagation [42], which we name "UP-OAP". Next, we describe the main steps of the hierarchical partitioning version using the UP-OAP method which we call "HUP-OAP".

Unsupervised Partitioning Method by Optimized Affinity Propagation (UP-OAP)
In contrast to the standard AP, in the UP-OAP method, all the parameters and criteria are calculated in an adaptive way, taking into account the presence of identical objects in the dataset to be partitioned. In addition, the criterion for identifying the exemplar of each class is reformulated. The flowchart of the UP-OAP method is shown in Figure 1.

Preference Parameter and Responsibility-Availability Criteria
In the standard version of AP, the preference parameter, p, is set to the minimum or the median value of the similarity matrix and does not take into account the variations of the similarity values between the objects of the dataset X. To associate with each object x_i its own preference parameter p_i, computed from row i of the similarity matrix S calculated on X when s(x_i, x_k) ≠ 0 for i ≠ k, this parameter is calculated as follows, where s(x_i, x_k) denotes the elements of matrix S and N is the size of X.
To take into account the presence of identical objects in the dataset to be partitioned, that is, s(x_i, x_k) = 0 for some i ≠ k, two modifications are made: (i) assigning the value of the preference parameter p_k to all the null elements of the matrix S, in the same way as for those on its diagonal; and (ii) recalculating the elements of R and A accordingly. To update R and A in Equations (5) and (6) without having to choose the damping factor λ, a smoothing operation is introduced. This operation provides less biased estimates of the R and A matrices than those obtained in [42,63].
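The paper's exact formulas (Equations (8)-(12)) are not reproduced here, but the idea of a per-object preference and of treating null similarities between identical objects like diagonal elements can be illustrated as follows. The choice of the row mean as p_i is an assumption made for this sketch only, not the paper's Equation (8).

```python
import numpy as np

def per_object_preferences(S):
    """Illustrative per-object preference: a statistic of row i of S.
    The row mean is an assumption, not the paper's Equation (8)."""
    return np.mean(S, axis=1)

# Dataset with two identical objects (rows 0 and 1): their mutual
# similarity is 0, exactly like the diagonal elements.
X = np.array([[1.0, 2.0], [1.0, 2.0], [4.0, 0.0]])
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
p = per_object_preferences(S)
np.fill_diagonal(S, p)              # diagonal receives p_i
# Null off-diagonal entries (identical objects) receive the same
# treatment as the diagonal: the preference p_k of the column object.
mask = (S == 0)
S[mask] = p[np.nonzero(mask)[1]]
print(S)
```

After this step, no element of S is zero, so identical objects no longer produce degenerate messages in the responsibility and availability updates.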

Identification Criterion of Exemplars
The decision criterion E* (see Equation (7)), maximizing (R + A) to identify exemplars, is inappropriate because the availability A is in some cases incoherent with the responsibility R. Consequently, some exemplars are not detected, which leads to the aggregation of a truly existing class with another. To overcome this drawback, only the criterion using the responsibility is used, as justified in the following.

Proposition 1. Let x_i, x_j ∈ X be two highly similar objects (s(x_j, x_i) ≅ 0) such that ∀ x_q ∈ X, s(x_j, x_i) ≫ s(x_j, x_q), where x_j and x_q are dissimilar (s(x_j, x_q) ≪ 0); that is, x_j can only aggregate with x_i. Assume that x_i and x_q are two candidate exemplars for x_j, where x_j is better represented by x_i than by x_q: r(x_j, x_i) > r(x_j, x_q). Moreover, assume that x_i was not chosen as an exemplar for any object, i.e., r(x_i, x_i) + a(x_i, x_i) < 0, and therefore r(x_j, x_i) + a(x_j, x_i) < 0. Then, x_j will be aggregated with another exemplar than x_i.

This proposition shows that when the responsibility between two objects x_j and x_i is positive and larger than all other responsibilities, but the candidate exemplar x_i is not chosen as an exemplar for any object, the availability in absolute value exceeds the responsibility value and assigns object x_j to another class, even if these objects cannot be aggregated.
Proof of Proposition 1. Let x_i, x_j, x_q ∈ X. Assume that r(x_j, x_i) > 0 and r(x_j, x_q) > 0, with r(x_j, x_i) > r(x_j, x_q) (i.e., it is better for x_j to be represented by x_i than by x_q). Moreover, assume that x_i was not chosen as an exemplar for any object. Because x_i is not chosen as an exemplar, r(x_i, x_i) + a(x_i, x_i) < 0, and hence r(x_j, x_i) + a(x_j, x_i) < 0, that is, a(x_j, x_i) < −r(x_j, x_i) < 0. Because a(x_j, x_i) < 0 and r(x_j, x_i) > 0, under the assumptions of Proposition 1, |a(x_j, x_i)| > r(x_j, x_i), and the criterion E* assigns x_j to another exemplar than x_i. □

The presence of the availability in the criterion E*(x_i) for the search for exemplars can thus assign an exemplar to an object even if they are highly dissimilar. This disturbs the final decision and prevents the correct detection of the classes actually present.
Under these conditions, the identification of exemplars by maximizing only R contributes to the formation of homogeneous classes representative of the observed data.
This leads us to modify the decision criterion E* so that exemplars are identified by using only the responsibility R:

E*(x_i) = argmax_k [ r(x_i, x_k) ]. (13)

The steps of the UP-OAP method are shown in Algorithm 2.
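The situation described in Proposition 1 can be reproduced with two hand-set message values. The numbers below are hypothetical, chosen only so that the availability toward the best candidate is strongly negative, as happens when that candidate is chosen as an exemplar by no object.

```python
import numpy as np

# Messages received by one object x_j from two candidate exemplars:
# index 0 is x_i (the truly best candidate), index 1 is x_q.
# Values are hypothetical, illustrating Proposition 1 only.
r_j = np.array([5.0, 1.0])    # r(x_j, x_i) > r(x_j, x_q) > 0
a_j = np.array([-7.0, -0.5])  # a(x_j, x_i) very negative: x_i chosen by no object

print("argmax of r + a:", int(np.argmax(r_j + a_j)))  # picks x_q (index 1)
print("argmax of r    :", int(np.argmax(r_j)))        # picks x_i (index 0)
```

Maximizing r + a reassigns x_j to the dissimilar candidate x_q, whereas maximizing the responsibility alone keeps x_j with its most similar candidate x_i, which is the behavior retained in Equation (13).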

Algorithm 2 UP-OAP
Input: Dataset X of N objects to be partitioned
1. Calculation of the similarity matrix S
2. Calculation of the preference parameter p_i of each object x_i
3. Replacement of the null elements of S by the value of p_i
4. Calculation of all responsibilities given the availabilities according to Equations (1), (9), and (11)
5. Calculation of all availabilities given the responsibilities according to Equations (3), (10), and (12)
6. Identification of the exemplars x_k that maximize E* = argmax[r(x_i, x_k)] (Equation (13))
7. If the exemplars do not change, proceed to step (8); else repeat steps (4) to (6) until convergence
8. Merging of each object with its nearest exemplar
Output: Partition P of K classes and exemplar I_j of each class C_j

Hierarchical Partitioning of Large Size Hyperspectral Images (HUP-OAP)
In this section, we detail the main steps of the hierarchical unsupervised partitioning method of large-size images based on the UP-OAP algorithm. The two main steps of this hierarchical HUP-OAP method are respectively the formation of the first and the other partitions by identifying the most relevant one according to an optimization criterion. These steps are described below.

First Partition of the Original Image
The application of the UP-OAP method to large-size images, such as hyperspectral aerial images, requires block partitioning and merging of the block partitioning results. This subsection details the main steps of the formation of the first partition by partitioning the blocks of the original image with UP-OAP and fusing the classes of all blocks, as shown in the flowchart of Figure 2.

To be able to partition all the pixels of an image in a simple way, the image is divided into regular blocks of the same size, without overlap between the blocks.

Definition 2. The number of pixel exemplars, N_0, considered for partitioning is defined by:

N_0 = Σ_{i=1}^{M_1} Σ_{j=1}^{M_2} N_ij, (14)

where N_ij is the number of classes obtained by UP-OAP on block B_ij, M_1 is the number of blocks in a row, and M_2 is the number of blocks in a column.

Proposition 2. If S_0 is the similarity matrix of size N_0 × N_0 calculated on the new dataset X_0 of size N_0, formed by the exemplars of the block classes, then N_0 ≪ N.

This proposition shows that the size of the similarity matrix using the exemplars is much smaller than the one calculated on the whole original image, which makes it possible to apply UP-OAP to very large-size images.

Algorithm 3 details the steps to obtain this first partition.
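The two-stage scheme (partition each block, then partition the block exemplars) can be sketched as follows. Here `partition` is a trivial placeholder standing in for UP-OAP (one class per rounded pixel value), so only the block/merge mechanics of Definition 2 and Proposition 2 are shown.

```python
import numpy as np

def partition(pixels):
    """Placeholder for UP-OAP: one 'class' per rounded band value.
    Returns the exemplar (class mean) of each class found."""
    labels = np.round(pixels[:, 0]).astype(int)
    return np.array([pixels[labels == c].mean(0) for c in np.unique(labels)])

# A toy one-band "image" of 6 x 6 pixels, split into 2 x 2 blocks of 3 x 3.
rng = np.random.default_rng(1)
image = rng.choice([0.0, 5.0], size=(6, 6, 1)) + rng.normal(0, 0.1, (6, 6, 1))

exemplars = []
for bi in range(0, 6, 3):          # blocks along the rows (M1)
    for bj in range(0, 6, 3):      # blocks along the columns (M2)
        block = image[bi:bi+3, bj:bj+3].reshape(-1, 1)
        exemplars.append(partition(block))   # N_ij exemplars per block
X0 = np.vstack(exemplars)          # N_0 = sum of the N_ij (Eq. (14))

# Second stage: partition only the N_0 exemplars (N_0 << N = 36 here).
final = partition(X0)
print(len(X0), "block exemplars ->", len(final), "merged classes")
```

The similarity matrix of the second stage is N_0 × N_0 rather than N × N, which is precisely what makes the approach tractable on large images.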

Hierarchical Partitioning
In order to give the user the ability to conduct a finer analysis of datasets, a hierarchical link between the classes of the partitions is created. These partitions are generated by using the UP-OAP method from exemplars of classes of the first partition, P 1 . Thus, to obtain several partitions in a hierarchical way, the UP-OAP method is applied iteratively at each level, i (i ≥ 2), to exemplars of classes of partition P i−1 corresponding to level i − 1. This "exemplars-partitioning" operation is repeated as long as the partitioning results are not stable. At each level, the number of classes of the partition is automatically estimated. The optimal partition and the estimated number of classes are given by the partition that maximizes the Levine and Nazif criterion [64], denoted as LN. This procedure is detailed in Algorithm 4.
The method developed here for partitioning large size images, named Hierarchical Unsupervised Partitioning by Optimized Affinity Propagation (HUP-OAP), is composed of the three algorithms presented before: the generation of the First partition (Algorithm 3) by using the UP-OAP method (Algorithm 2) and the hierarchical partitioning (Algorithm 4). The main steps of this method are summarized in Algorithm 5.

Algorithm 3 First partition merging block classes
Input:
- Original image I_m or data table to be partitioned
- Maximum block size (Y_1 × Y_2) allowing application of the UP-OAP method
Procedure:
1. Division of the image into M_1 × M_2 regular blocks B_ij of the same size, where M_1 is the number of blocks in a row and M_2 is the number of blocks in a column
2. Application of the UP-OAP method on each block:
for i = 1 to M_1 do
for j = 1 to M_2 do
Partitioning of block B_ij by the UP-OAP method
Let P_ij be the obtained partition of block B_ij, with its set of exemplars and its number of classes N_ij
end for
end for
3. Merging of the classes of the blocks B_ij by application of the UP-OAP method on the exemplars of the blocks
4. Formation of the partition P_1: merging of each object with its exemplar
Output: First partition P_1 of N_1 classes and the exemplar of each class

Algorithm 4 Hierarchical partitioning
Input: Data table X_1 (N_1 exemplars × B features), composed of the exemplars of the first partition P_1 of N_1 classes C_j
1. Application of UP-OAP on the dataset X_1
2. Repeat:
Application of UP-OAP on the new dataset X_i (i ≥ 2), where X_i is composed of the exemplar I_j^{i−1} of each class C_j of the partition P_{i−1}
Formation of the partition P_i: merging of each object with its exemplar
Until the stability of the partition P_i
3. Choice of the optimal partition that maximizes the LN criterion
Output: The hierarchical partitions of the original dataset, the optimal partition, and a set of exemplars of its classes

Algorithm 5 HUP-OAP
Input: Image or data table to be partitioned
1. Application of Algorithm 3 to obtain the first partition P_1 and its exemplars
2. Formation of the dataset X_1 composed of the exemplars of P_1
3. Application of Algorithm 4 on the dataset X_1
Output: The hierarchical partitions of the image, the optimal partition, and a set of exemplars of its classes

Numerical Assessment
In this section, we present the assessment of the proposed method on two hyperspectral images. The first one is a small synthetic image and the second one is a real aerial large-size image. Through these assessments, the application addressed is the localization and identification of marine algae classes.

Partitioning of a Synthetic Hyperspectral Image
The synthetic hyperspectral image presented in Figure 3 is used to assess the partitioning result of the proposed method. The size of this image is limited to 60 × 60 pixels (100 spectral bands), with wavelengths ranging from 404.2 nm to 978.5 nm. This image was generated from pixel samples of nine GT classes of a real hyperspectral image acquired by our aerial platform in 2013. The samples of each class are randomly selected from the GT data accompanying the real acquired hyperspectral image. The nine classes of this image can be aggregated into four main classes, as detailed in Figure 4. The four classes of the GT are Water, Substrate, Algae, and Mixed. The Water class comprises three sub-classes (Deep, Shallow, and Turbid), the Substrate class two sub-classes (Pebble and Sand), and the Algae class three sub-classes (Ulva, Enteromorpha, and Fucus). Figure 5 shows the optimal partitioning result (LN criterion: 0.25) obtained at level 2 by the proposed HUP-OAP method (Algorithm 5), where the estimated number of classes is ten. For this evaluation, the set of 100 features corresponding to the spectral signature of each pixel is considered, and the image is divided into 16 blocks of 15 × 15 pixels each to cover the whole image.
The confusion matrix of Table 1 highlights the repartition of the pixels of the GT classes among the classes formed by the HUP-OAP method. The quality of the partitioning result is evaluated with the correct classification rate (CCR), which is calculated from the confusion matrix as follows:

CCR = (Σ_{i=1}^{N_c} Z_i / Z) × 100,

where Z is the total number of GT points, N_c is the number of GT classes, and Z_i is the number of GT points correctly classified in class i of the GT. The CCR obtained by the proposed method is 96.89%. This rate rises to 99.91% if we consider the homogeneity of class 10, which is formed only by a subset of pixels of the GT class C_6, as shown in Table 1. The example of the GT class C_6 perfectly illustrates the interest of an unsupervised method. Indeed, this class corresponds to a Green algae, but the proposed method clearly divides it into two subclasses. This is because the two variants of this class were not distinguished during the elaboration of the GT map. Classes 6 and 10 formed by the unsupervised HUP-OAP method both correspond to Green algae classes, but with variations: for example, according to the depth of the water (the spectrum of Green algae seen through the water column), subclasses can be discriminated. It is therefore thanks to the unsupervised nature of the proposed method that the wealth of information provided by hyperspectral imagery in the near-infrared (NIR), compared with that provided in the visible domain alone, can be objectively highlighted. The formed class 10 may reflect, for example, the presence of algae in deeper water than the formed class 6.
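The CCR computation from a confusion matrix can be written directly. The matrix below is a made-up 3-class example, not Table 1; it only illustrates the formula above.

```python
import numpy as np

# Rows: GT classes; columns: classes formed by the method (after
# matching each formed class to a GT class). Z_i is the diagonal entry,
# i.e. the correctly classified points of GT class i.
confusion = np.array([[48,  2,  0],
                      [ 1, 39,  0],
                      [ 0,  3, 57]])
Z = confusion.sum()            # total number of GT points
Zi = np.diag(confusion)        # correctly classified points per GT class
ccr = 100.0 * Zi.sum() / Z
print(f"CCR = {ccr:.2f}%")
```

When the method forms more classes than the GT contains (as with class 10 above), homogeneous extra classes must first be matched to their parent GT class before the diagonal is read off.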
Figure 5. Partitioning results of the synthetic image by unsupervised and semi-supervised methods. (*) Optimal hierarchical partition.

Table 1. Confusion matrix of the GT classes against the classes formed by HUP-OAP (overall CCR: 96.89%; 99.91% with homogeneous class 10).
Table 2 shows that, for the image in Figure 3, the partitioning results are the same regardless of the block size. The partitioning result of the image without division into blocks is identical to that obtained with division into blocks, and the smaller the block size, the lower the CPU time and memory space. In order to evaluate the performance of the developed method against others, we chose unsupervised methods (standard AP and U-FCMO) and methods that require a minimum of a priori knowledge, i.e., the number of classes but no training samples (Stable FCMO (S-FCMO), FCM, and K-means).
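The block division on which this experiment relies can be sketched as plain index bookkeeping: a 60 × 60 image tiled into 16 non-overlapping blocks of 15 × 15 pixels, each partitioned independently. The function name is ours, not the paper's:

```python
# Sketch of the block division used before per-block partitioning.
# Returns (x, y, width, height) tuples for each non-overlapping block.

def split_into_blocks(width, height, bw, bh):
    assert width % bw == 0 and height % bh == 0, "blocks must tile the image exactly"
    return [(x, y, bw, bh)
            for y in range(0, height, bh)
            for x in range(0, width, bw)]

blocks = split_into_blocks(60, 60, 15, 15)
print(len(blocks))  # 16 blocks, each of 15 x 15 pixels
```

Since each block is processed independently, this layout is also what makes the parallelization mentioned later in the paper straightforward.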

Semi-Supervised Methods (Fixed Number of Classes)
For the semi-supervised methods, the number of classes was set to 9 to match the number of classes of the GT image, and for the FCM, U-FCMO, and S-FCMO methods, the fuzzification parameter was fixed at 2. For the K-means and FCM methods, the rates given are the average of five fluctuating results due to their instability with respect to initialization. For the state-of-the-art methods, the metric used to calculate the similarity matrix is the Euclidean distance (d2). Table 3 gives the performances of the three unsupervised and three semi-supervised methods according to four criteria: CCR (%), CCR with homogeneous output classes (%), CPU time (s), and memory space (MB).

Table 3. Performances of the developed method and the five other compared methods.

These results show that the developed method gives the best results according to three criteria (CCR, CCR with homogeneous output classes, and CPU time), in addition to its unsupervised advantage. On the other hand, it requires more memory space than the U-FCMO, S-FCMO [26], FCM, and K-means methods, because of the responsibility and availability matrices, but considerably less than the standard AP method. The semi-supervised FCM and K-means methods give overall the least interesting results according to the CCR criterion; in addition, their results are not stable from one run to another, despite the introduction of the number of classes.
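For the Euclidean similarity matrix mentioned above, a minimal sketch is shown below. Affinity propagation conventionally uses the negative squared Euclidean distance, s(i, k) = -||x_i - x_k||²; the function name and the two-point example are ours:

```python
# Pairwise similarity matrix between spectral signatures using the
# negative squared Euclidean distance (AP convention). The diagonal
# (self-similarity, i.e., the "preference") is usually set separately.

import numpy as np

def similarity_matrix(X):
    """X: (n_pixels, n_bands) array of spectral signatures."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    return -np.maximum(d2, 0.0)  # clip tiny negatives from rounding

X = np.array([[0.0, 0.0], [3.0, 4.0]])
S = similarity_matrix(X)
print(S)  # off-diagonal entries are -25.0 (distance 5, squared)
```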

Partitioning of a Real Large Size Hyperspectral Aerial Image
The objective with this experimental dataset is to identify two main algae species (Green and Brown) and to provide an accurate mapping of their coverage rates.
We emphasize that the HUP-OAP method is designed to partition large-size images, which can be much larger than the example treated here.
The large-size image (630 × 1800 pixels) of Figure 6a used for this assessment was acquired on 27 May 2013 (part of the French seashore) using the AISA Eagle sensor integrated in the aerial acquisition platform available at the TSI2M Laboratory. The ground spatial resolution of this image is 0.6 m, and the number of spectral bands is 100, covering the V-NIR spectral range from 404.2 nm to 978.5 nm.
To allow the evaluation and validation of the results of the proposed unsupervised HUP-OAP method, we used data from a field campaign performed at the same time as the aerial survey. The field spectra measurements were acquired with a spectroradiometer coupled with a GPS. After this step, the GT points were validated [21,22], where only ground points with a spectral signature similar to that of their corresponding pixels in the original aerial hyperspectral image were selected.
This example shows that it is impossible in practice to elaborate a GT for all the pixels of a large-size image. For this reason, we limited ourselves to a few survey points in the field to assess and validate the unsupervised partitioning method developed in this paper. Figure 6b shows the location of the field measurements of four classes over the original hyperspectral image. Figure 7 highlights the validated GT spectral signatures (average ± standard deviation, in luminance) of these four main classes: Brown algae, Green algae, Rocks and Pebbles, and Sand. These last two classes can be aggregated into a single substrate class.
To partition this large-size hyperspectral image (630 × 1800 pixels × 100 spectral bands, i.e., a data table of 1,134,000 pixels × 100 features) with the HUP-OAP method (Algorithm 5), the chosen size of each block is 63 × 90 pixels (200 blocks). Figure 8 shows the optimal partitioning result of the hyperspectral image in Figure 6a, maximizing the LN criterion, which is obtained at level 4. The estimated number of classes for this partition is 5. Table 4 gives, for each partitioning level, the number of classes estimated by the HUP-OAP method and the value of the LN optimization criterion. The GT points of the four classes (Green algae, Brown algae, Rocks and Pebbles, and Sand) belong to four different formed classes. This result highlights a fifth class, whose spectral signature corresponds to that of water.
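The block layout quoted for the real image can be checked with a line of arithmetic: tiling 630 × 1800 pixels with 63 × 90 blocks gives a 10 × 20 grid, i.e., the 200 blocks mentioned in the text:

```python
# Arithmetic check of the block layout for the 630 x 1800 aerial image,
# tiled by blocks of 63 x 90 pixels (figures taken from the text).

rows, cols = 630 // 63, 1800 // 90
print(rows * cols)   # 200 blocks
print(63 * 90)       # 5670 pixel vectors (of 100 features) per block
print(630 * 1800)    # 1134000 pixels in the full data table
```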
We observe in Figure 9 that the average spectral signature ± standard deviation of each formed class differs from the others and can be used as reference learning samples. The optimal partition of level 4 gives a CCR of 100%, verified by checking the positions of the 23 points of the four GT classes within the formed classes.
The method thus developed gives, in addition to the optimal partitioning, other partitions which can contribute to the fine analysis and interpretation of the data according to the users' needs. It is also important to stress that the several experiments conducted show that the partitioning result is independent of the choice of block size.
Based on the reference spectral signatures, the algae coverage rate given by the optimal partition is 44.61% (19% for Brown algae and 25.61% for Green algae), as shown in Table 5.
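The coverage-rate computation behind a table like Table 5 reduces to the share of pixels assigned to each class in the optimal partition's label map. The labels and pixel counts below are illustrative, not the paper's actual data:

```python
# Sketch of per-class coverage rates from a flattened label map.
# Counts are hypothetical (1000 pixels total).

from collections import Counter

def coverage_rates(label_map):
    counts = Counter(label_map)
    total = sum(counts.values())
    return {lab: 100.0 * n / total for lab, n in counts.items()}

labels = ["green"] * 256 + ["brown"] * 190 + ["substrate"] * 404 + ["water"] * 150
rates = coverage_rates(labels)
print(round(rates["green"] + rates["brown"], 1))  # 44.6 (% total algae cover here)
```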

The computing time and memory space needed to partition this image with the proposed method (data table of size 1,134,000 pixels × 100 features) on an Intel(R) Core(TM) i7-7700 CPU at 3.6 GHz with 16 GB of memory are given in Table 6 for two different block sizes. Both decrease with the size of the blocks. The indicative times given here can be greatly reduced, since the block partitioning can be performed in parallel on a multiprocessor machine.

Table 6. CPU time and memory space for the partitioning of the image of Figure 6 by HUP-OAP.

In comparison (see Table 7), the developed unsupervised method yields better performance than FCMO [26] in its unsupervised and semi-supervised versions, denoted U-FCMO and S-FCMO respectively, and than the semi-supervised K-means and FCM methods.
The number of classes for these last three methods was set to five, and the metric used to calculate the similarity matrix is the Euclidean distance (d2). We recall that the standard AP algorithm cannot be applied to partition this large-size image. The analysis of the U-FCMO result shows that the estimated classes correspond to the Green algae, Brown algae, Substrate, and Water classes. In this case, the GT points belonging to Rocks and Pebbles and to Sand are aggregated into the same class. This means that the GT points 48, 49, and 50 of the Sand class are misclassified, which gives a rate of 86.95%. If the discrimination between these two substrate classes is not taken into account, the rate reaches 100%. However, this result is less accurate than that of the HUP-OAP method, which gives more detail by splitting the substrate class into two subclasses. Another interesting piece of information is given by the partition of level 5, where the algae classes are merged into one class and the others (substrates and water) into another; the user can also use the other partitions of the hierarchy for more details. In the case of the S-FCMO method, the GT points 19 and 44 of the Rocks and Pebbles class were aggregated into the Sand class, which gives a CCR of 91.30%.
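The CCR figures quoted for U-FCMO and S-FCMO follow directly from the 23 validated GT points; a quick point-wise check (the function name is ours):

```python
# Point-wise CCR over the 23 validated GT points of the real image.

def point_ccr(n_correct, n_total):
    return 100.0 * n_correct / n_total

# U-FCMO: 3 Sand points misclassified -> 20/23, approx. 86.96%
# (quoted as 86.95% in the text, likely a truncation)
print(round(point_ccr(20, 23), 2))
# S-FCMO: 2 Rocks and Pebbles points misclassified -> 21/23
print(round(point_ccr(21, 23), 2))  # 91.3
```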

Conclusions
In this paper, we presented a new unsupervised hierarchical partitioning method adapted to large-size datasets, such as hyperspectral aerial imagery. This method has eight main advantages that can be objectively listed: (1) no a priori knowledge is required, in particular no training samples, giving the user the possibility to detect and locate known classes as well as new classes called "discovery classes"; (2) stability of the results thanks to its deterministic character; (3) the selection of the exemplar of each class and the assignment of a pixel (or an object) to a class are done in a very elaborate way according to optimization criteria; (4) very low computing time with block processing, in contrast to the compared methods; (5) applicability to data or images of any size, with the possibility of parallelizing the block partitioning; (6) the possibility of elaborating several hierarchical partitions while indicating the most relevant one according to an objective criterion; (7) the possibility of objectively selecting the samples of the classes in a learning system in order to detect them afterwards; and finally, (8) applicability to several domains without learning constraints. However, it requires more memory space.
Evaluations of the developed method on synthetic and real hyperspectral images show that the results are relevant without any intervention of the end-user, and its application to large-size images gives the same optimal result regardless of the block size used.
The correct classification rates (CCR) obtained by our method are better than those of semi-supervised methods such as Stable FCMO (S-FCMO), K-means, and FCM, despite its unsupervised character (estimation of the number of classes and classification of the pixels without any a priori knowledge). It also outperforms the compared unsupervised methods, such as U-FCMO and the standard AP method.
This true unsupervised method meets the partitioning requirements of large-size images provided by modern hyperspectral sensors. Moreover, it can be applied to a wide range of applications to objectively highlight all existing classes in an image [22]. From this complete partition, the user can exploit all or some of the obtained classes. It also offers the possibility to use the samples of the classes in learning processes.
In future work, this method will be assessed and validated on other databases extended to several application fields in order to prove the relevance and the benefits of its unsupervised operating mode. An optimization of the memory space also remains to be performed.