Land Cover Classification Based on Double Scatterer Model and Neural Networks

Abstract: In this paper, a supervised land cover classification is presented based on information extracted from polarimetric synthetic aperture radar (PolSAR) images. The polarimetric scattering matrix is analyzed according to the Double Scatterer Model, which interprets each PolSAR cell by a pair of elementary scattering mechanisms. Subsequently, by utilizing the contribution rates of the two fundamental scatterers, a novel data representation with rich informational content is obtained. The main aim of the research is to highlight this robust new feature-tool and then to present a classification scheme exploiting a fully connected artificial neural network (ANN). The PolSAR images used to verify the proposed method were acquired by RADARSAT-2, and the experimental results confirm the effectiveness of the presented methodology with an overall classification accuracy of 93%, which is considered satisfactory since only four feature-vectors are used.


Introduction
In recent years, the human impact on the global ecology has increased dramatically. Therefore, the need to control the rate of anthropogenic change has intensified and, with it, the need to gather accurate and timely land cover information. Many remote sensing techniques have been introduced to continuously monitor the Earth's surface in order to obtain useful data for improved environmental management.
Land cover classification is one of the major topics being investigated in the field of remote sensing [1]. Advanced sensors on board satellites continuously provide high-quality information such as hyperspectral images [2] and InSAR and PolSAR data [3], which greatly increase the accuracy of land cover classification procedures. PolSAR is an active radar system that transmits and receives microwaves, providing all-weather, day-and-night imaging capability. In this research, data from fully polarimetric SAR, also known as quad-polarization SAR, are utilized. Quad-pol is a SAR imaging mode which transmits and receives signals in multiple polarization states (HH, HV, VH, VV), capturing all the polarization information available in the backscattered wave. The acquired data are usually represented in matrix form. Algorithms known as polarimetric target decomposition techniques have been introduced to obtain features which describe PolSAR images in multiple aspects, to be utilized in classification and target detection procedures. In the last decade, several land cover classification techniques have been developed utilizing quad-pol data. Most of them combine features extracted using decomposition techniques with machine learning algorithms, achieving remarkable results.
In this work, a new land cover classification scheme is introduced. Specifically, the present research focuses on two novel procedures:
• Firstly, the newly established method for PolSAR data processing known as the Double Scatterer Model is implemented, in order to utilize the primary and secondary scatterers together with their contribution rates. Based on these parameters, a new PolSAR data form is presented.
• Secondly, a fully interconnected neural network is designed for the specific supervised classification task. A pixel window is used to determine the input pixels and thereby assimilate data of spatially adjacent pixels in both the training and test sets.
It is noteworthy that in this research, the first technique of exploiting the Double Scatterer Model is presented and it is combined with an algorithm specially designed to exploit the greatest possible amount of information to be used in the classification process.

Related Works
The first target decomposition method was developed by S. Chandrasekhar [4] in his work on light scattering by small anisotropic particles. In particular, an analysis of the phase matrix into a sum of three independent components was presented. In J.R. Huynen's PhD dissertation [5], a decomposition of an average Mueller matrix into the sum of a matrix corresponding to a pure target and an N-target matrix was introduced.
S.R. Cloude [6] was the first to consider an incoherent target decomposition method based on eigenvector analysis of a coherency matrix, which extracts the dominant scattering mechanism from a PolSAR cell through the calculation of the largest eigenvalue. Subsequently, S.R. Cloude and E. Pottier [7] proposed the representation of scattering characteristics in the space of entropy H and the averaged scattering angle α. Information equivalent to that of the Cloude-Pottier decomposition is extracted in [8] by utilizing a unique target scattering type parameter which jointly uses the Barakat degree of polarization and the elements of the polarimetric coherency matrix.
A model-based decomposition approach has been introduced by A. Freeman and S.L. Durden [9] which fits a physically based three-component scattering model to the polarimetric SAR data. This technique decomposes the covariance matrix into three categories, surface, volume and even-bounce scattering. One basic assumption of this model is the reflection symmetry, which limits its applicability to only reflection symmetric targets. To overcome this problem, Y. Yamaguchi et al. [10] proposed a four-component scattering model by introducing an additional term corresponding to non-reflection symmetric targets. Recently, a model-free four-component scattering power decomposition that alleviates the compensations of the parameter of the orientation angle about the radar line of sight and the occurrence of negative power components was introduced in [11].
Coherent target decompositions analyze the scattering matrix as a weighted combination of the scattering responses of simple or canonical objects. One of the most widely used is the decomposition based on the Pauli spin matrices. Based on both the Pauli decomposition and Huynen's work [5], W.L. Cameron and L.K. Leung [12] proposed a stepwise coherent decomposition of the scattering matrix utilizing the properties of reciprocity and symmetry, including a classification scheme.
Lately, K. Karachristos et al. [13] exploited Cameron's coherent decomposition to introduce the Double Scatterer Model, a novel method representing PolSAR cells' information by a pair of elementary scattering mechanisms, each one contributing to the scattering behavior with its own weight. Thus, a new feature-tool was established, well suited for both detection and classification tasks.
In [14], more than 20 polarimetric decomposition methods were used to extract a set of features which were optimized to improve land cover classification using the object-oriented RF-SFS algorithm. X. Liu et al. [15] utilized a polarimetric convolutional network for PolSAR image classification, introducing a new encoding method for the scattering matrix with remarkable results. L. Zhang et al. [16] employed a multiple-component scattering model, Cloude-Pottier decomposition and the gray-level co-occurrence matrix to obtain features for PolSAR image description, in order to classify five land cover types based on sparse representation. G. Koukiou and V. Anastassopoulos classified land cover types by implementing Markov chains on features extracted by Cameron's decomposition [17]. The application of hidden Markov models for supervised classification combined with Cameron's scattering technique was carried out by K. Karachristos et al. [18]. The results of all the above methods confirm the potential of combining polarimetric decomposition algorithms with machine learning.
This paper is structured as follows: Section 3 describes the employed fully polarimetric data and the preprocessing that was used. Section 4 provides the background on Cameron's coherent decomposition and introduces the Double Scatterer Model. In Section 5, a brief review of artificial neural networks is made. Our proposed classification procedure is analyzed in Section 6 and the results are presented in Section 7, while the conclusions are drawn in the final Section 8.

Data Description and Preprocessing
The classification procedure was carried out using a fully polarimetric single-look complex (SLC) dataset obtained by the RADARSAT-2 satellite mission in April 2008 [19,20]. The PolSAR images depict the broader area of Vancouver, BC, Canada in C-band, using the fine quad-pol beam mode, which provides fully polarimetric imaging with a nominal resolution of 5.2 m × 7.6 m (range × azimuth) and a swath width of approximately 25 km.
To properly work with SLC data, a specific preprocessing was needed [21,22]. Firstly, radiometric calibration was employed to convert raw digital image data from the satellite to a common physical scale based on known reflectance measurements taken from objects on the ground surface. The image is in the acquisition geometry of the sensor, resulting in distortions related to the side-looking geometry. Thus, there is a need for geocoding. This is accomplished by employing the range Doppler orthorectification method, provided by the SNAP application platform, as well as the calibration process. The range Doppler terrain correction operator makes use of the available orbit state vector information in the metadata, the radar timing annotations and the slant to ground range conversion parameters together with the reference digital elevation model data to derive the precise geolocation information. Data transformation is depicted in Figure 1.
The partition into land cover types was based on the geological features of the specific area according to previous studies [23]. Four land cover types were selected: water, urban/built up area, dense vegetation area and agriculture/pasture. The selected areas are shown in Figure 2.

Figure 1. Intensities of HV, HH, VH and VV channels of PolSAR SLC data depicting the broader area of Vancouver, before and after preprocessing (radiometric calibration and geometric correction).

Figure 2. Broader area of Vancouver, by Google Earth. Sea land cover is blue, urban/built up area is red, suburban area is orange, dense vegetation is green and agriculture/pasture is beige.

Double Scatterer Model
The Double Scatterer Model could be introduced as an extension of Cameron's coherent decomposition, so that each PolSAR cell is interpreted by a pair of fundamental scattering mechanisms.
In particular, W.L. Cameron and L.K. Leung [12] presented a technique of decomposing the polarization scattering matrix into three parts, based on the properties of reciprocity and symmetry. The three parts are non-reciprocal, asymmetric and symmetric. According to this separation, 11 classes can arise to characterize the scattering matrix under examination, namely the non-reciprocal, the asymmetric, left and right helix and 7 classes of symmetric elementary scattering mechanisms, 6 of which are known geometrical structures (trihedral, dihedral, dipole, 1/4 wave device, cylinder, narrow diplane) and the last one corresponds to a symmetric class with unknown structure.
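Cameron's stepwise algorithm proceeds as follows. Firstly, the scattering matrix $S$ is expressed on the Pauli basis:
$$S = \alpha S_a + \beta S_b + \gamma S_c + \delta S_d,$$
where
$$S_a = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},\quad S_b = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix},\quad S_c = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\quad S_d = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 & -j \\ j & 0 \end{bmatrix}$$
and $\alpha$, $\beta$, $\gamma$, $\delta$ are complex coefficients.
Following Cameron's algorithm, it is sometimes convenient to view the matrix $S$ as a vector $\vec{S} \in \mathbb{C}^4$:
$$\vec{S} = [S_{HH},\ S_{HV},\ S_{VH},\ S_{VV}]^T.$$
The vector $\vec{S}$ is related to the matrix $S$ by a pair of mutually inverse operators, $\vec{S} = V(S)$ and $S = M(\vec{S})$, leading to the expression
$$\vec{S} = \alpha \hat{S}_a + \beta \hat{S}_b + \gamma \hat{S}_c + \delta \hat{S}_d,$$
where the hat denotes a unit vector ($|\hat{S}| = 1$, with $|\cdot|$ standing for vector magnitude).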
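Secondly, based on the reciprocity theorem, according to which $S_{HV} = S_{VH}$, Cameron classifies the respective target as reciprocal or non-reciprocal. This is carried out by calculating the projection angle of the scattering matrix onto the reciprocal subspace:
$$\theta_{rec} = \arccos \lVert P_{rec}\hat{S} \rVert, \qquad P_{rec} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1/2 & 1/2 & 0 \\ 0 & 1/2 & 1/2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
where $\hat{S} = \vec{S}/|\vec{S}|$. If the projection angle is less than 45°, the elementary scatterer is considered reciprocal; otherwise it is taken as non-reciprocal. The scattering matrix of a reciprocal scatterer is then decomposed as
$$S_{rec} = \alpha S_a + \beta S_b + \gamma S_c,$$
since the only non-reciprocal component of the Pauli expansion is $S_d$, whose off-diagonal elements are opposite. Ultimately, the reciprocal scatterer in vector form is expressed as follows:
$$\vec{S}_{rec} = \alpha \hat{S}_a + \beta \hat{S}_b + \gamma \hat{S}_c.$$
Lastly, Cameron further decomposes the matrix which corresponds to a reciprocal elementary scattering mechanism into a symmetric and an asymmetric component. A scatterer can be identified as symmetric when the target has an axis of symmetry in the plane perpendicular to the radar line of sight (LOS), or alternatively if there exists a rotation $\psi_c$ that cancels out the projection of $S_{rec}$ on the antisymmetric component $S_c$. The maximum symmetric component is
$$\vec{S}^{max}_{sym} = \alpha \hat{S}_a + \varepsilon\,[\cos(\chi)\,\hat{S}_b + \sin(\chi)\,\hat{S}_c],$$
with
$$\varepsilon = \beta\cos(\chi) + \gamma\sin(\chi) \qquad (15)$$
and
$$\tan(2\chi) = \frac{\beta\gamma^* + \beta^*\gamma}{|\beta|^2 - |\gamma|^2}.$$
The degree of symmetry expresses the degree to which $S$ deviates from $S^{max}_{sym}$ and can be calculated as follows:
$$\tau_{sym} = \arccos\left(\frac{\left|\left(\vec{S}_{rec},\ \vec{S}^{max}_{sym}\right)\right|}{\lVert\vec{S}_{rec}\rVert\,\lVert\vec{S}^{max}_{sym}\rVert}\right),$$
where $\lVert\cdot\rVert$ stands for the norm of the complex vector to which the matrix corresponds.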
If $\tau_{sym} = 0$, the scattering matrix corresponds to a perfectly symmetric target. If $\tau_{sym} = \pi/4$, the target that backscatters the radiation is considered asymmetric. Cameron considers as symmetric any elementary scatterer with angle $\tau_{sym} \le \pi/8$. Thus, an arbitrary scatterer $S$ that obeys the reciprocity and symmetry criteria as formulated in Cameron's method can be decomposed according to
$$\vec{S} = a\,e^{j\phi}\,R(\psi)\,\hat{\Lambda}(z),$$
where $a$ indicates the amplitude of the scattering matrix, $\phi$ is the nuisance phase and $\psi$ is the scatterer orientation angle. The matrix $R(\psi)$ denotes the rotation operator and $\hat{\Lambda}(z)$ is given by
$$\hat{\Lambda}(z) = \frac{1}{\sqrt{1 + |z|^2}}\,[1,\ 0,\ 0,\ z]^T,$$
with $z$ being a complex parameter that ultimately determines the scattering mechanism.
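The reciprocity and symmetry tests described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the Pauli basis is assumed to be normalized by 1/√2, and the two angles follow the usual formulation of Cameron's decomposition.

```python
import math

def pauli_coefficients(s_hh, s_hv, s_vh, s_vv):
    """Coefficients (alpha, beta, gamma, delta) on the normalized Pauli basis (assumed)."""
    r2 = math.sqrt(2.0)
    alpha = complex(s_hh + s_vv) / r2
    beta = complex(s_hh - s_vv) / r2
    gamma = complex(s_hv + s_vh) / r2
    delta = 1j * complex(s_hv - s_vh) / r2
    return alpha, beta, gamma, delta

def reciprocity_angle(s_hh, s_hv, s_vh, s_vv):
    """Angle between the scattering vector and the reciprocal subspace, in radians."""
    alpha, beta, gamma, delta = pauli_coefficients(s_hh, s_hv, s_vh, s_vv)
    reciprocal_power = abs(alpha) ** 2 + abs(beta) ** 2 + abs(gamma) ** 2
    total_power = reciprocal_power + abs(delta) ** 2
    if total_power == 0.0:
        raise ValueError("zero scattering matrix")
    return math.acos(min(1.0, math.sqrt(reciprocal_power / total_power)))

def symmetry_angle(s_hh, s_hv, s_vh, s_vv):
    """Angle tau_sym between the reciprocal part and its largest symmetric component."""
    alpha, beta, gamma, _ = pauli_coefficients(s_hh, s_hv, s_vh, s_vv)
    reciprocal_power = abs(alpha) ** 2 + abs(beta) ** 2 + abs(gamma) ** 2
    if reciprocal_power == 0.0:
        raise ValueError("no reciprocal component")
    # tan(2*chi) = (beta*conj(gamma) + conj(beta)*gamma) / (|beta|^2 - |gamma|^2)
    cross = 2.0 * (beta * gamma.conjugate()).real
    chi = 0.5 * math.atan2(cross, abs(beta) ** 2 - abs(gamma) ** 2)
    epsilon = beta * math.cos(chi) + gamma * math.sin(chi)
    symmetric_power = abs(alpha) ** 2 + abs(epsilon) ** 2
    return math.acos(min(1.0, math.sqrt(symmetric_power / reciprocal_power)))

def is_reciprocal_and_symmetric(s_hh, s_hv, s_vh, s_vv):
    """Thresholds quoted in the text: theta_rec < 45 degrees and tau_sym <= pi/8."""
    return (reciprocity_angle(s_hh, s_hv, s_vh, s_vv) < math.pi / 4
            and symmetry_angle(s_hh, s_hv, s_vh, s_vv) <= math.pi / 8)
```

For example, a dihedral-like matrix (diagonal entries 1 and -1) passes both tests, while a helix-like matrix is reciprocal but reaches the asymmetric limit of π/4.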
Table 1 gives the complex vectors $\hat{\Lambda}(z)$ and the corresponding values of $z$ for the symmetric elementary scattering mechanisms. The range of the $z$ parameter implies that the scattering matrix can be represented by a point on the unit disk of the complex plane. The positions of the various types of elementary scattering mechanisms are shown on the unit disk in Figure 3, along with the regions of the disk which are considered as belonging to each mechanism. According to the values of $z$ given in Table 1, all elementary scatterers lie on the real diameter of the unit disk except for the 1/4 wave devices, which lie on the imaginary axis.
In order to determine the scattering behavior of an unknown scatterer $z$, Cameron considered the following distance metric [12]:
$$d(z, z_{ref}) = \arccos\left(\frac{\max\left(|1 + z^* z_{ref}|,\ |z + z^*_{ref}|\right)}{\sqrt{1 + |z|^2}\,\sqrt{1 + |z_{ref}|^2}}\right). \qquad (20)$$
Cameron et al. [24] noticed the need for a closed surface rather than the disk, as a result of the double presence of the 1/4 wave device. Ideally, the symmetric space could be the unit sphere. This was thoroughly demonstrated by a mapping procedure proposed in [24] and depicted in Figure 4. Specifically, in the new topology, they associated each point $(x, y)$ of the unit disk with a circular arc $a(x, y)$ on the unit sphere containing the points $(-1, 0)$, $(x, y)$ and $(1, 0)$.
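To make the role of the distance metric concrete, the sketch below assigns a test value of $z$ to its nearest reference mechanism. It is illustrative only: the function names are hypothetical, the metric is written in the form usually quoted for Cameron's decomposition, and the reference values of $z$ are the customary ones (Table 1 is not reproduced in this text, so they should be treated as assumptions).

```python
import math

# Assumed reference values of z; treat them as placeholders for Table 1.
REFERENCE_Z = {
    "trihedral": 1 + 0j,
    "dihedral": -1 + 0j,
    "dipole": 0j,
    "cylinder": 0.5 + 0j,
    "narrow diplane": -0.5 + 0j,
    "1/4 wave device": 1j,
}

def cameron_distance(z, z_ref):
    """Angular distance between two symmetric scatterers given by their z values."""
    z, z_ref = complex(z), complex(z_ref)
    numerator = max(abs(1 + z.conjugate() * z_ref), abs(z + z_ref.conjugate()))
    denominator = math.sqrt(1 + abs(z) ** 2) * math.sqrt(1 + abs(z_ref) ** 2)
    # Clamp to 1 to absorb floating-point rounding before taking arccos.
    return math.acos(min(1.0, numerator / denominator))

def nearest_mechanism(z):
    """Name of the reference mechanism closest to z under the metric above."""
    return min(REFERENCE_Z, key=lambda name: cameron_distance(z, REFERENCE_Z[name]))
```

Note that both $z = j$ and $z = -j$ are at (numerically) zero distance from the single 1/4 wave device entry, which reflects the double presence of that mechanism on the disk.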
Obviously, for a point $(x, y)$ not on the rim of the disk, the arc length is less than $\pi$. In such a case, the arc is "stretched" to have length equal to $\pi$ and to be part of a great circle. By associating each point $(x, y)$ with a semi-circle, the way this mapping works is easily depicted: the semi-circles are placed tangent on the sphere's surface, with the initial position $(x, y)$ of the point on the unit disk determining the latitude $\phi_s$ and longitude $\theta_s$ of the point on the unit sphere. This mapping is represented in Figure 4, with the spherical coordinates $\theta_s$ and $\phi_s$ computed as described in [24]. The distance $d$ between a test scatterer $z$ and each of the reference scattering mechanisms of Table 1 is then measured on the sphere, in a form equivalent to (20) but more intuitive.
With a view to collecting the highest amount of polarimetric information, K. Karachristos et al. [13] presented the Double Scatterer Model, an extension of Cameron's stepwise procedure. Basically, they proposed a method to interpret each PolSAR cell as a contribution of the two most dominant elementary scattering mechanisms in order to extract rich information content. Specifically, the main steps of the method are the following:

1. For each scattering matrix, the complex parameter $z$ is computed. If the criteria of reciprocity and symmetry are met, the real and imaginary parts of $z$ determine a point on the complex unit disk, according to Cameron's algorithm.

2. The point is then mapped onto the surface of the unit sphere. The PolSAR cell under examination and its scattering matrix are now represented by the longitude $\theta$ and the latitude $\phi$ on the unit sphere (Figure 5).

3. According to Poelman [25], the elementary scattering mechanisms of cylinder and narrow diplane can be obtained as linear combinations of the rest of the elementary scatterers. Since the cylinder and narrow diplane can be composed of trihedral, dipole and dihedral, these three together with the 1/4 wave device can be characterized as the fundamental scattering mechanisms. This led us to disregard the cylinder and narrow diplane as being of minimum importance and to update the spherical topology as depicted in Figure 5.

4. The location of the right-angled spherical triangle depends on the angle values $(\theta, \phi)$ of the point under examination. Whether the point is above or below the equator, one vertex of the triangle is always a pole of the sphere and the other two are the nearest scattering mechanisms, found using the orthodromic (great-circle) distance $D$:
$$D = \arccos\left(\sin\phi_1 \sin\phi_2 + \cos\phi_1 \cos\phi_2 \cos(\Delta\theta)\right).$$

5. The vector with its initial point at the sphere's center and its terminal point given by the coordinates on the spherical shell is projected onto the plane of the equator, to which the reference scattering mechanisms belong, based on the angle $\phi$ (Figure 5). Specifically, the projection is contained in the quadrant enclosed by the center of the sphere and the two scatterers closest to the point under examination.

6. The projection of the vector is then resolved into two perpendicular components along the two nearest scatterers.

Based on the above, each target scatterer $S_t$ is interpreted as a mixture of the two nearest fundamental scattering mechanisms, each weighted by its contribution degree $P_i$ ($i = 1, 2$). When $P_i$ approaches 1, or 100%, the target scatterer $S_t$ is fully described by one of the four fundamental scattering mechanisms.
In the marginal case where $\phi = 90^\circ$, the scatterer can be assumed as undetermined and be classified as "non-categorizable". The same class is used for asymmetric scatterers.
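The search for the two nearest scattering mechanisms in step 4 relies only on the great-circle distance, which can be sketched as follows. The helper names are hypothetical, and the reference positions in the usage example are placeholders: the actual positions of the fundamental scatterers on the sphere are defined in Figure 5, which is not reproduced here.

```python
import math

def great_circle_distance(lat1, lon1, lat2, lon2):
    """Orthodromic distance D on the unit sphere; all angles are in radians."""
    cos_d = (math.sin(lat1) * math.sin(lat2)
             + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    # Clamp to [-1, 1] to absorb floating-point rounding before taking arccos.
    return math.acos(max(-1.0, min(1.0, cos_d)))

def two_nearest(point, references):
    """Names of the two reference points closest to `point`.

    point: (latitude, longitude); references: dict mapping a name to (latitude, longitude).
    """
    lat, lon = point
    ranked = sorted(
        references,
        key=lambda name: great_circle_distance(lat, lon, *references[name]),
    )
    return ranked[0], ranked[1]

# Placeholder layout (an assumption): four reference points spaced 90 degrees
# apart on the equator, labelled A to D rather than with mechanism names.
PLACEHOLDER_REFERENCES = {
    "A": (0.0, 0.0),
    "B": (0.0, math.pi / 2),
    "C": (0.0, math.pi),
    "D": (0.0, 3 * math.pi / 2),
}

nearest_pair = two_nearest((0.2, 0.3), PLACEHOLDER_REFERENCES)
```

With this placeholder layout, a point at latitude 0.2 rad and longitude 0.3 rad is closest to the references labelled A and B, which would then play the roles of primary and secondary scatterer.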

Artificial Neural Networks (ANNs)
In its most general form, a neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest. To achieve good performance, neural networks employ a massive interconnection of simple computing cells referred to as "neurons" or processing units. S. Haykin [26] offers the following definition of a neural network viewed as an adaptive machine: A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

1. Knowledge is acquired by the network from its environment through a learning process.

2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
The procedure used to perform the learning process is called a learning algorithm, the function of which is to modify the synaptic weights of the network in an orderly fashion to attain a desired design objective.
The multilayer perceptron (MLP) neural network, a model that uses a single- or multilayer perceptron to approximate the inherent input-output relationships, is the most commonly used network model for image classification in remote sensing [27,28]. Typically, the network consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computational units and one output layer (Figure 6). The essential components of an MLP are the architecture, involving the numbers of neurons and layers of the network, and the learning algorithm. MLP networks are usually trained with the supervised backpropagation (BP) algorithm [29]. This algorithm is based on the error-correction learning rule. Basically, error backpropagation learning consists of two passes through the different layers of the network: a forward pass and a backward one. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, the synaptic weights of the network are all fixed. During the backward pass, the synaptic weights are all adjusted in accordance with an error-correction rule. Specifically, the actual response of the network is subtracted from a desired response to produce an error signal. This error signal is then propagated backward through the network, hence the name "error backpropagation". The synaptic weights are adjusted to make the actual response of the network move closer to the desired response, in a statistical sense. Apart from the architecture of the MLP and the learning algorithm, operational factors such as data characteristics and training parameters can affect the model performance. However, these factors are application-dependent and best addressed on a case-by-case basis.
Therefore, the operational issues will be discussed in concert with the case study in the next section.
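The two passes described above can be illustrated on the smallest possible case, a single sigmoid neuron trained with the error-correction (delta) rule. This is only a sketch: the function names and the learning rate are assumptions, and a full MLP would propagate the same kind of error signal through its hidden layers via the chain rule.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(weights, bias, inputs, target, learning_rate=0.5):
    """One forward pass and one backward pass for a single sigmoid neuron."""
    # Forward pass: the weights stay fixed while the response is computed.
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = sigmoid(net)
    # Backward pass: the error signal is the desired minus the actual response.
    error = target - output
    local_gradient = error * output * (1.0 - output)
    new_weights = [w + learning_rate * local_gradient * x
                   for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * local_gradient
    return new_weights, new_bias, error

# Repeated updates on one training pattern move the response toward the target.
weights, bias = [0.0, 0.0], 0.0
errors = []
for _ in range(200):
    weights, bias, err = train_step(weights, bias, [1.0, 0.5], target=1.0)
    errors.append(err)
```

The recorded errors shrink from one update to the next, which is the sense in which the actual response moves closer to the desired one.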
Figure 6. The structure of a fully connected automated neural network.

Proposed Classification Procedure
In the proposed classification scheme, the Double Scatterer Model is first applied to the preprocessed data. Each PolSAR cell/pixel is now represented by a weighted composition of two fundamental scattering mechanisms, in a form analogous to the following:
Cell = Primary Scatterer × weight + Secondary Scatterer × weight.
Each fundamental scattering mechanism can be assigned to an integer number from 1 to 5, as it is shown in Table 2. The second and the third stages involve the key innovation points of our research that lie in the assignment of specific intervals of real numbers to each scattering mechanism, thus determining the purity of scattering behavior to each cell.
Fundamental scatterers are now represented by continuous values rather than by a unique integer number, as is shown in Table 3. This is accomplished by focusing on the contribution of the dominant scattering mechanism in each cell. Specifically, based on the fact that the contribution rate of the primary scatterer is always greater than 0.5, we can assume that each cell is represented by a number resulting from the formula below:
Cell value = Primary Scatterer × Contribution Rate. (34)
Applying the above to each scattering mechanism located in an interval of continuous values and performing the appropriate scaling, we can transform these intervals so that each elementary scattering mechanism is identified with a unique continuous range, without overlaps between the intervals.
A much more detailed representation of PolSAR data has now been achieved. Subsequently, in order to utilize the informational content of the secondary scattering mechanism, the difference between the weights/contribution rates of the primary and secondary scatterers is calculated for each cell. In this sense, the purity of the scattering behavior is determined:
Purity = Primary Scatterer's weight − Secondary Scatterer's weight. (35)
According to the above, each cell/pixel corresponds to 2 values: a real number from the interval [0, 1] that determines the fundamental scatterer and a value that represents the purity of the scattering behavior. These features, extracted based on the Double Scatterer Model, are used in the classification procedure.
The last stage of our process is the classification procedure performed by an ANN. In order to exploit the spatial associations, a 7 × 7 pixel window is used to calculate the mean values and the standard deviations of the 2 features in local neighborhoods. Notably, in each land cover a 7 × 7 pixel sliding window determines the mean value and the standard deviation of 49 values that correspond to the fundamental scatterer intervals (34) and 49 values that represent the feature of purity (35). In total, 4 features for each neighborhood of 7 × 7 pixels/cells will be used in the classifier.
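The four features of one neighborhood can be computed as in the following sketch. The function name is hypothetical, the two inputs are assumed to be 2-D lists holding the fundamental-scatterer value and the purity of every cell, and the population standard deviation is used, since the text does not say which estimator was chosen.

```python
import math

def window_features(cell_values, purity, row, col, size=7):
    """Mean and standard deviation of both features in a size x size window.

    Returns [mean(cell value), std(cell value), mean(purity), std(purity)].
    The window must lie entirely inside the image (no border padding here).
    """
    half = size // 2
    n_rows, n_cols = len(cell_values), len(cell_values[0])
    if row - half < 0 or col - half < 0 or row + half >= n_rows or col + half >= n_cols:
        raise ValueError("window does not fit inside the image")

    def mean_and_std(image):
        values = [image[r][c]
                  for r in range(row - half, row + half + 1)
                  for c in range(col - half, col + half + 1)]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        return mean, math.sqrt(variance)

    mean_cell, std_cell = mean_and_std(cell_values)
    mean_purity, std_purity = mean_and_std(purity)
    return [mean_cell, std_cell, mean_purity, std_purity]
```

Each call yields the 4-element feature vector for one 7 × 7 neighborhood, computed from 49 values per feature.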
The neural network designed for this research is a fully interconnected arrangement of three layers. The input layer is composed of 4 neurons since we use 4 features. Using the 7 × 7 window, the network is able to assimilate data of spatially adjacent pixels, a fact that significantly affects the accuracy of our task since it utilizes the local neighborhood information of each cell/pixel. The visual classification of imagery by a human involves the use of both spectral and spatial associations, a combination which we attempt to exploit in this study. The hidden layer is a single layer of 3 neurons. The selection of both the number of hidden layers and the number of neurons was based on tests on the specific task and on our goal to construct a simple network with the fewest possible parameters. Increasing the number of either neurons or hidden layers increased the model's complexity and the computational time of the process, without leading to better results. The output layer is composed of 4 neurons representing the target land cover classes (sea, urban/built up, dense vegetation, agriculture/pasture) to be produced by the network. Every neuron within one layer is fully interconnected with the neurons in the adjacent layers. The response propagated through these interconnections, known as synapses as mentioned in the previous section, is determined by the activation function, which in our task is the sigmoid function:
$$\sigma(x) = \frac{1}{1 + e^{-x}}.$$
The fact that the sigmoid function is monotonic, continuous and differentiable everywhere, coupled with the property that its derivative can be expressed in terms of itself, $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$, makes it easy to derive the update equations for learning the weights in a neural network when using the backpropagation algorithm, as in the network we developed. Typically, the backpropagation algorithm uses a gradient-based algorithm to learn the weights of a neural network.
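A forward pass through a network of this shape (4 inputs, 3 hidden neurons, 4 outputs) might look like the sketch below. It is not the authors' code: the weights are random placeholders rather than trained values, the helper names are hypothetical, and applying the sigmoid in the output layer as well as the hidden layer is an assumption.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """The derivative expressed through the function itself: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def init_layer(n_inputs, n_neurons, rng):
    """One weight list per neuron; the last entry of each list is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]

def forward(network, features):
    """Propagate a feature vector through every layer in turn."""
    activations = features
    for layer in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron[:-1], activations)) + neuron[-1])
            for neuron in layer
        ]
    return activations

rng = random.Random(0)
network = [init_layer(4, 3, rng), init_layer(3, 4, rng)]   # hidden layer, output layer
scores = forward(network, [0.4, 0.1, 0.7, 0.2])            # 4 illustrative feature values
predicted_class = scores.index(max(scores))
n_parameters = sum(len(neuron) for layer in network for neuron in layer)
```

With biases included, this layout has (4 + 1) × 3 + (3 + 1) × 4 = 31 trainable parameters, which illustrates how small the network is.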
In our case, we chose the "adam" optimizer; a thorough description of this optimization method can be found in the research published by Diederik P. Kingma and Jimmy Lei Ba [30]. In the designed MLP network, epoch training is used as it is more efficient and stable than pixel-by-pixel training [31]. One epoch is a single forward and backward pass of the entire dataset through the neural network. Since one epoch is too big to feed to the computer at once, we divide it into several smaller batches. Batch size is the last hyperparameter determined in the proposed ANN; it corresponds to the number of training samples that are passed through the network at one time. In this study, the number of epochs was chosen to be 10,000, which is large enough to gain sufficient knowledge of class membership from the training dataset but not so large as to overfit the training data, while the batch size of 128 allows the model to generalize well on the data. To evaluate the performance of our model on the available dataset, k-fold cross-validation was implemented. The selected data depicted in colors in Figure 2 are split into K folds with a ratio of 70/30 train/test set, respectively, and are used to evaluate the model's ability when given new data. K refers to the number of groups the data sample is split into. In our study, the k-value is 5, so we can call this a 5-fold cross-validation.
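The relationship between folds, epochs and batches can be sketched as follows. This is a generic k-fold splitter, not the authors' code: the sample count is hypothetical, and in standard k-fold each fold holds out 1/k of the samples (20% for k = 5), so how the 70/30 ratio mentioned above was combined with the five folds is left open here.

```python
import math

import numpy as np


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


def batches_per_epoch(n_train, batch_size=128):
    # One epoch visits every training sample once, in chunks of batch_size.
    return math.ceil(n_train / batch_size)


n_samples = 1000  # hypothetical number of labelled feature vectors
splits = list(kfold_indices(n_samples))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx), batches_per_epoch(len(train_idx)))
    # prints "800 200 7" for each of the 5 folds
```

The reported accuracy would then be the mean of the per-fold test accuracies.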

Experimental Results
The results of the proposed classification scheme exceeded our expectations. The process of training and presenting results with the specific parameters mentioned in Section 6 takes only 11 min. In particular, the average accuracy of the 5-fold cross-validation, using the 7 × 7 pixel window, is estimated to be ∼ 93%. A more in-depth analysis of the results is depicted in the confusion matrix below, in Figure 7. In order to present the confusion matrix, a random train/test split was used with the ratio of 70/30 for each land cover.
The accuracy rates are very high. The only class that presents lower classification accuracy is the agriculture/pasture land cover, which was confused with the sea land cover. This confusion has already been observed in [18] and can be explained. In particular, the scattering behavior of agriculture areas and sea land cover is similar, since both are flat surfaces. Moreover, the explanation is supported by Figure 8, in which the dominance of the trihedral scatterer is clear. The significant difference is the higher contribution rate of the trihedral scattering mechanism in sea land cover compared to agriculture/pasture areas, which leads to an accurate discrimination. In Table 4, the results from recent and well-established land cover classification procedures are reported so that a brief comparison with the present technique can be made for better evaluation.
It can be stated that the overall accuracies (OAs) are very high, but the accuracy per class gives the greatest detail that characterizes each classifier. The proposed classification scheme accomplished an OA of 0.9287 with high rates in each land cover class as well, with the least possible complexity.
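For readers less familiar with these measures, the sketch below shows how the overall accuracy and the per-class accuracy follow from a confusion matrix. The labels are invented toy data (ten samples, with one agriculture/pasture sample predicted as sea) and are not the results of this study.

```python
import numpy as np

CLASSES = ["sea", "urban/built up", "dense vegetation", "agriculture/pasture"]


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return cm


# Invented toy labels: the second agriculture/pasture sample is predicted as sea.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 3, 3, 3])
y_pred = np.array([0, 0, 0, 1, 1, 2, 2, 3, 0, 3])

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
overall_accuracy = np.trace(cm) / cm.sum()         # correct samples / all samples
per_class_accuracy = np.diag(cm) / cm.sum(axis=1)  # correct samples / true samples per class

for name, acc in zip(CLASSES, per_class_accuracy):
    print(f"{name}: {acc:.2f}")
print(f"OA: {overall_accuracy:.2f}")  # OA: 0.90
```

In this toy case the OA stays high at 0.90 while the agriculture/pasture accuracy drops to about 0.67, which is why the per-class rates are the more informative measure.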

Conclusions
The proposed methodology presents a complete toolchain that exploits the robustness and elegance of the Double Scatterer Model to accomplish information extraction from PolSAR data and utilizes the well-established classifier of a simple ANN to achieve remarkable results in land cover classification. The idea of using the spatial content information was embodied in the sliding pixel/cell window and provides a new perspective on how the features extracted according to the Double Scatterer Model can be exploited. The use of both a very simple neural network and only four values as features to accomplish these high accuracy rates makes analogous studies very promising. More sophisticated networks, combined with more features extracted by methods similar to the presented procedure, are very likely to be able to classify a multitude of classes with high accuracy rates and to be used in target detection tasks.
Author Contributions: K.K. and V.A. have equally contributed to Conceptualization, Methodology, Validation, and Writing-Original Draft Preparation. All authors have read and agreed to the published version of the manuscript.