Highlights
What are the main findings?
- Explore the overfitting phenomenon in a finite mixture model (FMM) and find that blindly increasing the number of mixture components is not a reasonable approach.
- Propose a stepwise-regression-based finite mixture model (SRFMM) to alleviate the overfitting issue and employ the particle swarm optimization (PSO) algorithm for parameter and coefficient estimation due to its robustness and parallelism.
What are the implications of the main findings?
- The proposed SRFMM can accurately, stably, and efficiently model the isotropic and anisotropic regions in multi-aspect synthetic aperture radar (SAR) images under various observation aspects and aperture angles, and the PSO algorithm is efficient and robust.
Abstract
Compared with conventional synthetic aperture radar (SAR), multi-aspect SAR can observe a scene from various aspects, thus providing a more detailed and comprehensive analysis and description of the target. As a result, an accurate, stable, and efficient model is required to adaptively model multi-aspect SAR images according to the precision requirements. To address this challenge, we propose a stepwise-regression-based finite mixture model (SRFMM), which aims to construct a finite mixture model (FMM) by combining the fewest single parametric models that meet a specified accuracy demand. The SRFMM first employs a voting-based ranking strategy to determine the order in which the single parametric models are added to the FMM. It then linearly combines the single parametric models one by one in the determined order until the desired accuracy is achieved or overfitting occurs, yielding the final FMM. In the implementation of the SRFMM, we employ the particle swarm optimization (PSO) algorithm for parameter and coefficient estimation due to its robustness and parallelism. We evaluated the SRFMM experimentally using C-band circular SAR (CSAR) data, and the results indicate that the SRFMM can accurately, stably, and efficiently model the isotropic and anisotropic regions in multi-aspect SAR images under various observation aspects and aperture angles. Evaluation on X-band CSAR data also indicates the applicability of the SRFMM.
1. Introduction
Synthetic aperture radar (SAR) is an active remote sensing radar capable of effectively operating day and night, in all weather conditions [1]. It has diverse applications, including disaster management, environmental monitoring, and agricultural assessment [2,3,4]. Multi-aspect SAR can conduct azimuthal observations of the target scene from various aspects because the platform flies in a curved line. For example, circular SAR (CSAR), a special multi-aspect SAR, flies along a circular trajectory and can provide a 360° continuous observation of the scene [5]. Multi-aspect SAR can overcome the issues in conventional SAR such as layovers, shadows, foreshortening, and limited observation aspects; obtain more backscattering information; and provide a more detailed and comprehensive analysis and description of the target [5,6,7,8]. However, this makes the statistical distribution characteristics of multi-aspect SAR images more complex. Precisely modeling multi-aspect SAR images plays a key role in their application such as in change detection, target detection, moving target detection, and image classification [9,10,11,12,13,14]. For example, the performance of moving target detection can be improved when the background clutter is modeled precisely [5,14]. However, multi-aspect SAR image modeling faces the following challenges: (1) Multi-aspect SAR datasets contain a large number of images under different observation aspects, requiring fast and efficient data processing. (2) The statistical distribution characteristics of multi-aspect SAR images may vary with the observation aspect, requiring a flexible model. Therefore, an accurate, stable, and efficient model is needed to adaptively and precisely model the multi-aspect SAR images according to the precision requirements. Due to the significance of SAR image modeling, there have been many statistical modeling methods to describe the conventional SAR images [15,16,17]. 
Among these, parametric methods such as Weibull [18], Gamma [19], G0 [20], K [21], and Lognormal [22] have been widely studied due to their simplicity and clear analytical expressions [5]. However, the applicability of parametric models is limited. On the one hand, these methods are effective only for specific terrain types and lack universality [23]. For example, the Lognormal distribution can accurately model some specific urban regions but performs relatively poorly in other areas. On the other hand, these models are not sufficiently accurate for describing the complex scenes in high-resolution SAR images [9,17,24]. This limitation becomes even more obvious for multi-aspect SAR images [5].
To achieve a high-accuracy description of the statistical distribution characteristics of SAR images, many works have explored nonparametric models such as Parzen windows [25], artificial neural networks [26], and support vector machines [27]. These methods make no assumptions about the statistical distribution characteristics of SAR images; instead, they rely on a large number of training samples to provide an accurate description [8,13]. However, they suffer from poor interpretability and heavy computation. The high computational cost can limit the application of these nonparametric models to multi-aspect SAR, which contains a larger number of images from different observation aspects and aperture angles.
To combine the advantages of parametric and nonparametric models, some works proposed semiparametric models. As a representative of semiparametric models, the finite mixture model (FMM) attracts a great deal of attention due to its flexibility and effectiveness. The FMM linearly combines multiple single parametric models to describe the unknown probability density function (PDF) of SAR image [9,17,28]. The main differences among finite mixture models lie in the mixture components, the model selection strategies, and the parameter estimation methods, which affect modeling accuracy and computational efficiency. FMM can achieve a great improvement in accuracy over single parametric models with only a slight increase in computational overhead. However, combining multiple parametric models into one increases the number of parameters and introduces coefficients for each mixture component. Estimating both the parameters and the coefficients is challenging. Karl Pearson first studied the mixture model in 1894 [29]. Dempster et al. proposed the expectation–maximization (EM) algorithm and provided a solid foundation for the subsequent development of mixture models [30]. Later, in 1985, Celeux et al. proposed a stochastic version of the EM algorithm, known as SEM, which randomly samples the given data according to the estimated posterior probabilities of each component of FMM [31]. This sampling strategy makes the SEM more efficient than the EM and helps alleviate the problem of getting trapped in local optima.
To explore the advantages of the FMM in the field of SAR, Blake et al. proposed K-mixture and Lognormal-mixture models, which fit the probability density histogram of given data better than single parametric models [32,33]. To select the components of the FMM automatically, Moser et al. combined SEM with the method of log-cumulants (MoLC), proposing a dictionary-based stochastic expectation–maximization (DSEM) method in 2006. DSEM can effectively select the components and estimate the parameters of the FMM [9]. Later, Krylov et al. introduced an enhanced DSEM (EDSEM), expanding the dictionary to extend its applicability to very-high-resolution (VHR) satellite SAR images and further refining the dictionary to reduce the time consumption of the EDSEM [17]. However, these FMM methods were developed for conventional SAR.
To investigate the potential of the FMM to precisely describe the statistical properties of multi-aspect SAR images, our previous work proposed an FMM containing five single parametric models and applied the simulated annealing (SA) algorithm to estimate the parameters and coefficients [5]. Compared with single parametric models, this FMM has the potential to achieve better modeling performance for multi-aspect SAR images. However, this model still has two limitations. First, it fixes the number of components in the mixture and does not fully consider how the fitting performance of the FMM varies with the number of components. As a result, it may suffer from overfitting in certain cases; in other words, the fitting performance of the FMM may decrease as the number of mixed single parametric models increases. Second, the parameter and coefficient estimation process of this FMM is time-consuming, which restricts the efficiency and practicality of the FMM for multi-aspect SAR images. To thoroughly explore how the number of single parametric models in the FMM influences its fitting performance, and to accurately, stably, and efficiently model the statistical distribution characteristics of multi-aspect SAR images, we introduce the stepwise-regression strategy into the model selection process, proposing a stepwise-regression-based finite mixture model named SRFMM. The SRFMM first employs a voting-based ranking strategy to determine the order in which single parametric models are added to the FMM. Then, it incrementally combines these models in the determined order until the desired accuracy is reached or overfitting occurs, thereby obtaining the final FMM. In the implementation of the SRFMM, we employ the particle swarm optimization (PSO) algorithm for parameter and coefficient estimation due to its robustness and parallelism.
In summary, this paper makes the following three main contributions:
- Explore the overfitting phenomenon in FMM and propose a stepwise-regression-based finite mixture model named SRFMM to alleviate the overfitting issue.
- Employ the PSO algorithm for parameter and coefficient estimation in the SRFMM and compare it with the EM and SA algorithms.
- Evaluate the SRFMM using the C-band circular SAR (CSAR) data and explore the applicability of the SRFMM to the X-band CSAR data.
The remainder of this paper is organized as follows: Section 2 presents the proposed SRFMM and its detailed modeling process. Section 3 analyzes the overfitting phenomenon in FMM and evaluates the SRFMM using the C-band CSAR data. Section 4 explores the applicability of the SRFMM to the X-band CSAR data. Section 5 concludes the whole paper.
2. Stepwise-Regression-Based Finite Mixture Model
To investigate how the fitting performance of the FMM varies with the number of single parametric models in the mixture and to accurately, stably, and efficiently model the statistical distribution characteristics of multi-aspect SAR images, we propose a stepwise-regression-based finite mixture model (SRFMM). It first ranks single parametric models according to their fitting performance, and it then incrementally combines them in the determined order until the desired accuracy is achieved or overfitting occurs to obtain the final FMM.
In this section, we first provide a definition of the FMM and introduce the five single parametric models included in the mixture, followed by a description of the adjusted coefficient of determination (i.e., adjusted R-squared) used to evaluate the fitting performance of the FMM. We then illustrate the voting-based ranking strategy for determining the order in which single parametric models are added to the FMM. After that, we provide a detailed introduction to the proposed SRFMM. Finally, we present the PSO algorithm employed to estimate the parameters and coefficients of each mixed single parametric model.
2.1. FMM, Single Parametric Models, and Adjusted R-Squared
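The FMM linearly combines multiple single parametric models to describe the unknown PDF of the SAR image. It can be defined as
$$p(r \mid \Theta) = \sum_{i=1}^{K} \alpha_i \, p_i(r \mid \theta_i), \tag{1}$$
in which $r$ is the SAR image amplitude or intensity; $K$ is the number of mixed single parametric models; $p_i(r \mid \theta_i)$ is a single parametric model dependent on a parameter vector $\theta_i$; $\alpha_i$ denotes the coefficient of the $i$-th single parametric model, such that
$$\sum_{i=1}^{K} \alpha_i = 1, \quad 0 \le \alpha_i \le 1, \tag{2}$$
and $\Theta$ is a vector containing all the parameters and coefficients of the single parametric models, i.e.,
$$\Theta = \{\alpha_1, \ldots, \alpha_K, \theta_1, \ldots, \theta_K\}. \tag{3}$$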
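As a minimal sketch of this definition, the following Python snippet evaluates a two-component mixture as a weighted sum of component densities. The function names, the shape/scale form of the Gamma density, and the numerical values are illustrative assumptions; they are not the expressions of Table 1.

```python
import math
import numpy as np

def gamma_pdf(r, shape, scale):
    """Gamma density in the textbook shape/scale form (assumed parameterization)."""
    r = np.asarray(r, dtype=float)
    log_pdf = ((shape - 1.0) * np.log(r) - r / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return np.exp(log_pdf)

def lognormal_pdf(r, mu, sigma):
    """Lognormal density with log-mean mu and log-standard-deviation sigma."""
    r = np.asarray(r, dtype=float)
    return (np.exp(-(np.log(r) - mu) ** 2 / (2.0 * sigma ** 2))
            / (r * sigma * math.sqrt(2.0 * math.pi)))

def mixture_pdf(r, components, coefficients):
    """Weighted sum of component densities.

    components   : list of (pdf_function, parameter_tuple) pairs
    coefficients : non-negative mixing weights that must sum to 1
    """
    coefficients = np.asarray(coefficients, dtype=float)
    if np.any(coefficients < 0) or not math.isclose(coefficients.sum(), 1.0):
        raise ValueError("coefficients must be non-negative and sum to 1")
    return sum(a * pdf(r, *params)
               for a, (pdf, params) in zip(coefficients, components))

# Hypothetical parameter values, chosen only for illustration.
r = np.linspace(1e-6, 60.0, 200001)
components = [(gamma_pdf, (2.0, 1.5)), (lognormal_pdf, (1.0, 0.5))]
density = mixture_pdf(r, components, [0.6, 0.4])
```

Because the coefficients sum to 1 and each component is a valid density, the mixture itself integrates to 1 over the positive axis.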
In this paper, we select five single parametric models—Weibull, Gamma, Lognormal, K, and G0—to construct the FMM, because they can respectively model various types of terrain with different homogeneity and anisotropy in SAR images of different resolutions well [5]. Table 1 summarizes their expressions and applicable terrain types.
Table 1.
Expressions and applicable terrain types of the five single parametric models.
In this paper, we use the adjusted coefficient of determination [5,34] (i.e., adjusted R-squared) to evaluate the fitting performance of the FMM. The adjusted R-squared can indicate the overfitting phenomenon because it takes into account the number of parameters and coefficients in the FMM, thereby revealing the impact of the model complexity on the modeling results. It can be defined as
$$R_{adj}^2 = 1 - \frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2 / (n - m - 1)}{\sum_{j=1}^{n} (y_j - \bar{y})^2 / (n - 1)}, \tag{4}$$
where $y_j$ denotes the true value of the probability density histogram in the $j$-th interval, $\hat{y}_j$ represents the corresponding predicted value of the discretized PDF, $\bar{y}$ is the mean of the true values of the probability density histogram, $n$ is the number of intervals in the histogram, and $m$ is the total number of parameters and coefficients in the FMM. The closer $R_{adj}^2$ is to 1, the better the fitting performance of the FMM.
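The penalty on model complexity can be illustrated with a short sketch. It assumes the standard form of the adjusted R-squared, with n histogram intervals and m parameters and coefficients; the function name and the toy data are hypothetical.

```python
import numpy as np

def adjusted_r_squared(hist_density, fitted_density, num_params):
    """Standard adjusted R-squared between a histogram and a discretized PDF."""
    y = np.asarray(hist_density, dtype=float)
    y_hat = np.asarray(fitted_density, dtype=float)
    n = y.size
    if n - num_params - 1 <= 0:
        raise ValueError("need more histogram intervals than parameters")
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - (ss_res / (n - num_params - 1)) / (ss_tot / (n - 1))

# Toy data: identical residuals, different numbers of parameters.
y = np.arange(1.0, 11.0)
y_hat = y + 0.1 * np.array([1, -1] * 5)
score_small_model = adjusted_r_squared(y, y_hat, num_params=2)
score_large_model = adjusted_r_squared(y, y_hat, num_params=5)
```

With the same residuals, the model with more parameters receives the lower score, which is the property that lets the measure flag overfitting.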
2.2. The Voting-Based Ranking Strategy
The proposed SRFMM incrementally adds the single parametric models into the FMM until the desired accuracy is reached or overfitting occurs. Intuitively, prioritizing the single parametric models with better fitting performance can enable the FMM to meet the accuracy requirements with the fewest components. Based on this insight, we employ a voting-based ranking strategy to determine the order in which the single parametric models are added to the FMM.
Specifically, the voting-based ranking strategy begins by randomly selecting a set of regions from the multi-aspect SAR image. Then, it models the statistical distribution characteristics of each selected region using the five single parametric models separately. For all selected regions, it calculates the adjusted R-squared values of the fitting results for each observation aspect and aperture angle. Finally, the five single parametric models are ranked according to the number of adjusted R-squared values that exceed a predefined threshold (referred to as the first threshold in this paper).
This strategy is based on the following considerations: randomly selecting a set of regions to rank the single parametric models can cover all terrain types of the multi-aspect SAR image. Therefore, the ranking results can be applicable to all the regions of interest and the ranking time is fixed regardless of the number of regions of interest to be modeled in the multi-aspect SAR images.
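The ranking step can be sketched as a simple vote count. The function name, the input layout (one list of adjusted R-squared values per model), and the scores below are illustrative assumptions, not measured results.

```python
def rank_models_by_votes(scores, threshold):
    """Rank models by how many adjusted R-squared values exceed the threshold.

    scores    : dict mapping model name -> list of adjusted R-squared values,
                one per (region, observation aspect, aperture angle)
    threshold : the first threshold
    Returns the ranked model names and the vote counts. Ties keep the
    insertion order of `scores` because Python's sort is stable.
    """
    votes = {name: sum(1 for s in values if s > threshold)
             for name, values in scores.items()}
    ranked = sorted(votes, key=lambda name: votes[name], reverse=True)
    return ranked, votes

# Made-up scores for three models over three fits each.
example_scores = {
    "Lognormal": [0.900, 0.950, 0.980],
    "Gamma": [0.995, 0.992, 0.970],
    "G0": [0.991, 0.980, 0.970],
}
ranked, votes = rank_models_by_votes(example_scores, threshold=0.99)
```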
2.3. The Proposed SRFMM
By increasing the model parameters, the FMM enhances its degrees of freedom and achieves better fitting performance than single parametric models. However, as more single parametric models are mixed into the FMM, it becomes susceptible to overfitting, which has not been well studied at present. To study how the fitting performance of the FMM changes with the number of single parametric models in the mixture, we introduce the stepwise-regression strategy into its model selection process. This strategy enables the FMM to model the multi-aspect SAR images more accurately, stably, and efficiently. Based on this, we propose a stepwise-regression-based finite mixture model, named SRFMM. Specifically, given the ranking order of the single parametric models determined by the voting-based strategy, the SRFMM incrementally adds them into the FMM until the desired accuracy is achieved or overfitting occurs. Figure 1 illustrates the SRFMM modeling process for multi-aspect SAR images, including model ranking, model selection, and parameter and coefficient estimation.
Figure 1.
The SRFMM modeling process for multi-aspect synthetic aperture radar (SAR) images, including model ranking, model selection, and parameter and coefficient estimation.
As can be seen in Figure 1, the SRFMM has four steps as follows:
Step 1: The SRFMM aims to model the probability density histogram of the original multi-aspect SAR image amplitude or intensity, rather than the raw data themselves. This step divides the amplitude or intensity values into several equally spaced intervals and counts the number of data points in each interval to obtain the final histogram.
Step 2: Given a set of single parametric models $p_1$, $p_2$, …, $p_K$, this step applies the voting-based ranking strategy described in Section 2.2 to reorder them as $p_{(1)}$, $p_{(2)}$, …, $p_{(K)}$, based on their fitting performance for the randomly selected regions.
Step 3: In this step, the $i$-th ranked single parametric model $p_{(i)}$ multiplied by a coefficient $\alpha_i$ is added to the FMM.
Step 4: This step employs the PSO algorithm to estimate the parameters and coefficients of the FMM due to its robustness and parallelism. If the adjusted R-squared value of the fitting result exceeds a predefined threshold (referred to as the second threshold in this paper) or falls below the adjusted R-squared value obtained in the previous iteration, this step outputs the final FMM. Otherwise, the next single parametric model is selected and the process returns to Step 3 for the next iteration. The final FMM consists of two parts as follows:
- An ordered set of selected single parametric models $p_{(1)}$, $p_{(2)}$, …, $p_{(i)}$, in which $i$ is the smallest number of components such that the fitting performance of the FMM exceeds the second threshold, or such that the accuracy of adding the $(i+1)$-th model is lower than that of using $i$ models.
- A vector $\Theta$ containing all the parameters and coefficients of each single parametric model, expressed as $\Theta = \{\alpha_1, \ldots, \alpha_i, \theta_1, \ldots, \theta_i\}$. It ensures that the SRFMM accurately models the probability density histogram of the multi-aspect SAR image amplitude or intensity.
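The stopping rule of Steps 3 and 4 can be sketched as a short loop. Here `fit_and_score` is a hypothetical callback that stands in for the PSO estimation and returns the estimated parameters together with the adjusted R-squared; the scores in the usage example are invented to exercise both stopping conditions.

```python
def stepwise_select(ranked_models, fit_and_score, second_threshold):
    """Add ranked models one by one until accuracy is reached or the score drops.

    Returns (selected_models, parameters, score) of the final mixture.
    """
    best = None
    for k in range(1, len(ranked_models) + 1):
        candidate = ranked_models[:k]
        params, score = fit_and_score(candidate)
        if best is not None and score < best[2]:
            # Adding the k-th model lowered the score: keep the previous mixture.
            return best
        best = (candidate, params, score)
        if score > second_threshold:
            # Desired accuracy reached with the fewest components so far.
            return best
    return best

# Usage with invented scores indexed by the number of components.
fake_scores = {1: 0.980, 2: 0.993, 3: 0.996, 4: 0.994, 5: 0.990}

def fake_fit_and_score(models):
    return {"num_components": len(models)}, fake_scores[len(models)]

ranked = ["Gamma", "G0", "K", "Weibull", "Lognormal"]
models_a, _, score_a = stepwise_select(ranked, fake_fit_and_score, 0.995)
models_b, _, score_b = stepwise_select(ranked, fake_fit_and_score, 0.999)
```

In the first call the loop stops because the threshold is exceeded; in the second it stops because the fourth component lowers the score, so the three-component mixture is returned.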
2.4. The Parameter and Coefficient Estimation Method for the FMM
The FMM contains more parameters than single parametric models and introduces an additional coefficient for each mixture component. Consequently, the estimation of parameters and coefficients becomes considerably more challenging, and an appropriate method is needed to estimate them efficiently and accurately. Traditional methods, such as maximum likelihood estimation (MLE) and the expectation–maximization (EM) algorithm, suffer from low precision, long time expenditure, and strong dependence on initial values. In contrast, heuristic algorithms can achieve high precision and stable results. Among them, the simulated annealing (SA) algorithm has been employed to estimate the parameters and coefficients of the FMM and has produced good results. However, it still requires a considerable amount of time. The particle swarm optimization (PSO) algorithm is robust and highly parallelizable. Therefore, we employ the PSO algorithm to overcome the problems of previous methods and achieve efficient parameter and coefficient estimation for the SRFMM.
PSO is a population-based stochastic optimization algorithm proposed by Kennedy and Eberhart in 1995, originally designed to simulate social behaviors [35]. In PSO, a group of candidate solutions, called particles, iteratively explore the solution space based on their current velocities, personal bests, and the global best. Each iteration can be represented as a movement process, as illustrated in Figure 2. Each iteration updates the current solutions (represented by the position of each particle) through a combination of three components as follows:
Figure 2.
The iteration of particle swarm optimization (PSO), which combines the following three components: the social experience, the personal experience, and the inertia.
Inertia (determined by the current velocity $v_t$): The particle tends to maintain its previous motion. This component is obtained by multiplying the current velocity by an inertia weight $w$, resulting in $w v_t$.
Personal experience (determined by the best position $p_{best}$ of each particle): The particle is attracted toward the best position it has previously experienced. This component is calculated as $c_1 r_1 (p_{best} - x_t)$, in which $x_t$ is the current position, $c_1$ is the personal learning rate, and $r_1$ is a random number.
Social experience (determined by the best position $g_{best}$ of all particles): The particle is also attracted toward the global best position. This component is calculated as $c_2 r_2 (g_{best} - x_t)$, in which $c_2$ is the global learning rate and $r_2$ is a random number.
As shown in Figure 2, the new velocity $v_{t+1}$ can be calculated by summing these three components as follows:
$$v_{t+1} = w v_t + c_1 r_1 (p_{best} - x_t) + c_2 r_2 (g_{best} - x_t), \tag{5}$$
and the new position $x_{t+1}$ of the particle can be updated accordingly, as follows:
$$x_{t+1} = x_t + v_{t+1}. \tag{6}$$
This process can be regarded as a movement resulting from the balance between exploration (following inertia) and exploitation (moving according to the personal and social experience).
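A single iteration of this movement can be sketched with NumPy as follows. The function name and array layout (one row per particle) are assumptions, and the random numbers are drawn uniformly from [0, 1) per coordinate, as is common in PSO implementations.

```python
import numpy as np

def pso_step(position, velocity, personal_best, global_best, w, c1, c2, rng):
    """One PSO update: inertia + personal experience + social experience.

    position, velocity, personal_best : arrays of shape (num_particles, dim)
    global_best                       : array of shape (dim,)
    """
    r1 = rng.random(position.shape)   # random factors for personal experience
    r2 = rng.random(position.shape)   # random factors for social experience
    new_velocity = (w * velocity
                    + c1 * r1 * (personal_best - position)
                    + c2 * r2 * (global_best - position))
    new_position = position + new_velocity
    return new_position, new_velocity

# Hypothetical swarm of three particles in a five-dimensional search space.
rng = np.random.default_rng(0)
x = rng.random((3, 5))
v = np.zeros_like(x)
new_x, new_v = pso_step(x, v, personal_best=x.copy(), global_best=x[0],
                        w=0.7, c1=2.0, c2=2.0, rng=rng)
```

In this usage the first particle already sits at both its personal best and the global best with zero velocity, so it does not move, while the other particles are pulled toward the global best.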
Example. We use the FMM composed of Gamma and Lognormal distributions to illustrate how the parameters and coefficients are updated through the above iteration. The specific process is shown in Figure 3.
Figure 3.
Example of updating the parameters and coefficients of the finite mixture model (FMM).
As shown in Figure 3, the PSO algorithm uses three particles, each representing a solution of the FMM. Each solution includes the coefficients of the Gamma and Lognormal components ( and ) as well as their parameters (, , and ). Given the current position , velocity , the personal best position , and the global best position , PSO can update the velocity using Formula (5). In this example, , , , , and . It is worth noting that, in practice, each particle uses different random numbers (including and ) for each coefficient and parameter update, and the initial coefficients and parameters are also generated randomly. However, for simplicity, in this example, we assume that the random numbers are identical within the same component. After obtaining the next velocity , PSO calculates the next position for each particle using Formula (6). However, the sum of the coefficients for the Gamma and Lognormal components may not equal 1. To correct this, the PSO adjusts the coefficients as follows to obtain the final position for the current iteration:
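One natural way to restore the sum-to-one constraint, assumed here purely for illustration, is to divide each coefficient by the sum of all coefficients. The sketch below also clips negative coefficients to zero first, which is an additional assumption rather than a step described in the text.

```python
import numpy as np

def normalize_coefficients(coefficients):
    """Rescale mixing coefficients so they are non-negative and sum to 1."""
    a = np.clip(np.asarray(coefficients, dtype=float), 0.0, None)
    total = a.sum()
    if total == 0.0:
        # Degenerate case: fall back to equal weights (assumed behavior).
        return np.full(a.shape, 1.0 / a.size)
    return a / total

# Hypothetical coefficients of the Gamma and Lognormal components after a move.
adjusted = normalize_coefficients([0.7, 0.5])
```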
In the experiments presented in Section 3, the inertia weight w decreases from to linearly with the iteration, balancing exploration and exploitation. The personal learning rate and global learning rate are set to [36]. The number of particles is set to 1000. These parameters are selected according to the specific experimental requirements to ensure stability and accuracy. The flow of the PSO algorithm is like that in [37].
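A linearly decreasing inertia weight can be sketched as below. The start and end values of 0.9 and 0.4 are placeholders commonly seen in the PSO literature and are not taken from the experimental settings above; the function name is likewise hypothetical.

```python
def inertia_weight(iteration, max_iterations, w_start=0.9, w_end=0.4):
    """Linearly interpolate the inertia weight from w_start to w_end.

    iteration runs from 0 to max_iterations - 1.
    """
    if max_iterations < 2:
        raise ValueError("max_iterations must be at least 2")
    fraction = iteration / (max_iterations - 1)
    return w_start - (w_start - w_end) * fraction

# A larger weight early favors exploration; a smaller one later favors exploitation.
schedule = [inertia_weight(t, 101) for t in range(101)]
```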
3. The Evaluation of the Proposed SRFMM
To study the overfitting phenomenon of the FMM as well as to accurately, stably, and efficiently model the statistical distribution characteristics of multi-aspect SAR images, we introduce the stepwise-regression strategy into the model selection process, proposing a stepwise-regression-based finite mixture model named SRFMM. To examine the impact of the number of mixture components in the FMM on its fitting performance and to evaluate the effectiveness of the proposed SRFMM, three research questions need to be explored as follows:
RQ-1: How does the fitting performance of the FMM change as the number of single parametric models mixed in the FMM increases?
RQ-2: Can the proposed SRFMM accurately, stably, and efficiently model the multi-aspect SAR images under various observation aspects and aperture angles?
RQ-3: Is the PSO algorithm stable and efficient?
To explore the three research questions, we conducted comprehensive experiments on the C-band CSAR data. The remainder of this section is organized as follows: Section 3.1 introduces the CSAR dataset used in this study. Section 3.2 compares the fitting performance of FMMs consisting of one to five single parametric models. Section 3.3 discusses the number of single parametric models required to achieve the desired accuracy and experimentally analyzes the fitting performance of the SRFMM under various observation aspects and aperture angles. Section 3.4 evaluates the stability and efficiency of the PSO algorithm. Section 3.5 compares the fitting performance and computational efficiency of the PSO, EM, and SA algorithms for the SRFMM parameter estimation.
3.1. The C-Band Circular SAR Data
In this paper, we use the C-band CSAR data acquired from the first C-band airborne circular SAR flight experiment in Zhuhai, Guangdong Province, China, to explore the impact of the number of single parametric models in the FMM on its fitting performance and to evaluate the effectiveness of the proposed SRFMM. The detailed parameters of this experiment are listed in Table 2.
Table 2.
Experimental parameters of the C-band circular SAR (CSAR).
During the experiment, the aircraft maintained a constant elevation and flew along a circular trajectory above the target scene, emitting electromagnetic waves to the scene and receiving echoes that returned from different azimuth aspects. The received echo data were then processed with the back-projection (BP) imaging algorithm, which is a typical SAR processing algorithm, to obtain ground range sub-aperture images in a local Cartesian coordinate system. To ensure that the azimuth resolution corresponds to the range resolution, the sub-aperture length was set to 5° during imaging. Subsequently, the full-aperture SAR image was obtained by accumulating all of the sub-aperture images. Figure 4 shows the optical image and the full-aperture CSAR image of the whole imaging scene.
Figure 4.
Optical image and full-aperture C-band CSAR image of the imaging scene.
3.2. Fitting Performance of FMM with Varying Numbers of Single Parametric Models
To investigate the overfitting phenomenon as the number of single parametric models in the FMM increases, we incrementally added single parametric models to construct a series of FMMs, denoted as FMM1 to FMM5 (FMMn denotes the FMM obtained by mixing the first n single parametric models from the determined ordered set). As mentioned in Section 2.2, we employed a voting-based ranking strategy to determine the order of forming FMM1 to FMM5. Specifically, we randomly selected 200 regions with a window size of 160 from the CSAR data, which together cover most region types in the entire image, and applied the five single parametric models (Weibull, Gamma, Lognormal, K, and G0) to model the statistical distribution characteristics of these regions for all observation aspects and aperture angles. For each model, we then counted how many adjusted R-squared values exceeded a given threshold (the first threshold, set to 0.99 in this paper), and we ranked the models according to these counts. Gamma achieved the highest number of adjusted R-squared values exceeding the first threshold, thus ranking first, followed by G0, K, Weibull, and Lognormal.
To thoroughly examine how the fitting performance changes as the number of single parametric models in the FMM increases, we selected 16 regions with a window size of 160 from the CSAR data (8 isotropic regions such as land, and 8 anisotropic regions such as buildings), and we then employed FMM1 to FMM5 to model these regions. Figure 5 presents the 16 selected regions from the entire scene, where the isotropic regions are enclosed by green boxes and the anisotropic regions are enclosed by red boxes. The enlarged full-aperture CSAR images of these selected regions are shown in Figure 6. These regions were selected because they cover the typical targets in the whole scene.
Figure 5.
The selected isotropic and anisotropic regions from the entire scene.
Figure 6.
The enlarged full-aperture CSAR images of the selected regions.
After modeling the 16 regions using FMM1 to FMM5, we calculated the adjusted R-squared values of the fitting results. Then, we calculated the average values of these adjusted R-squared values over all observation aspects and aperture angles for isotropic and anisotropic regions, respectively. The average fitting results of FMM1 to FMM5, along with example fitting curves, are presented in Figure 7.
Figure 7.
The average fitting results of FMM1 to FMM5 along with example fitting curves. (The first row of line charts presents the fitting results for isotropic and anisotropic regions under the observation aspect and aperture angle. Below each line chart, the probability density histogram of an example angle is presented along with the fitting curves from FMM1 to FMM5).
As shown in Figure 7, the average adjusted R-squared values across observation aspects and aperture angles increase from FMM1 to FMM3 for the isotropic regions and from FMM1 to FMM4 for the anisotropic regions. This indicates that the fitting performance of these FMMs improves as the number of mixed single parametric models increases. However, after adding the remaining single parametric model(s) (i.e., FMM4 and FMM5 for isotropic regions and FMM5 for anisotropic regions), the fitting performance deteriorates instead. This result indicates that the fitting performance of the FMM initially improves as more single parametric models are added, but too many models lead to overfitting and a reduction in fitting performance (RQ-1). To provide an intuitive illustration of how the fitting performance varies with the number of mixture components (FMM1 to FMM5), the fitting curves of the models with different numbers of components are presented beneath the corresponding line chart, together with the probability density histogram. As shown in the examples, for the isotropic region, the fitting performance improves as the number of single parametric models increases from FMM1 to FMM3, but it starts to decline from FMM4. For the anisotropic region, the performance improves from FMM1 to FMM4, and it then decreases when the last single parametric model (i.e., FMM5) is added. These trends are consistent with the variation of the average fitting results as the number of models changes.
It is important to note that, from FMM2, the improvement in fitting performance becomes marginal as the number of single parametric models increases. In fact, adding more mixture components to the FMM increases the time expenditure for parameter and coefficient estimation. Therefore, blindly increasing the number of mixture components is not a reasonable approach. Taking this into account, we introduce the stepwise-regression strategy into the model selection process, which will be further applied to the SRFMM in the next subsection.
3.3. Evaluation of the SRFMM
As discussed in Section 3.2, it is unreasonable to blindly increase the number of mixed single parametric models. On the one hand, adding too many single parametric models to the FMM may cause overfitting. On the other hand, starting from FMM2, increasing the number of mixture components does not significantly improve the fitting performance, while it inevitably introduces additional time expenditure. Considering these, we introduce the stepwise regression strategy into the model selection process of the SRFMM, which progressively adds single parametric models to the FMM until the desired accuracy is achieved or overfitting occurs. In this subsection, we first calculate the average number of single parametric models required to achieve the desired accuracy and the average fitting results. Then, we select two isotropic and two anisotropic regions to demonstrate the fitting performance of the proposed SRFMM under various observation aspects and aperture angles.
To illustrate how the desired accuracy (i.e., the second threshold introduced in Section 2.3) affects the number of single parametric models mixed in the FMM, we selected two values for this experiment: 0.995 and 0.999. The values can be adjusted according to user requirements. We selected these two values because they represent relatively low and high accuracy, allowing for a comprehensive evaluation of the SRFMM. For each value, we applied the SRFMM to model the 16 regions selected from the CSAR data (8 isotropic regions and 8 anisotropic regions), as described in Section 3.2. Then, we calculated the average number of single parametric models mixed in the FMMs over all observation aspects and aperture angles for each isotropic region and anisotropic region. Figure 8 shows the average number of single parametric models of the SRFMM in different regions with different second-threshold values.
Figure 8.
The average number of single parametric models of the stepwise-regression-based finite mixture model (SRFMM) in different regions with different desired-accuracy values.
From Figure 8, we can draw the following conclusions:
- Increasing the desired accuracy within a certain range requires mixing more single parametric models into the FMM.
- There is no need to combine all single parametric models into the FMM to achieve high accuracy. Even with an accuracy as high as 0.999, only an average of 2.63 single parametric models are required (for the first anisotropic region under aperture angle). Therefore, it is crucial to introduce the stepwise-regression strategy into the model selection process of the SRFMM. This strategy minimizes the number of mixed single parametric models and can alleviate overfitting.
In addition, we calculated the average adjusted R-squared values of the fitting results over all observation aspects and aperture angles for each isotropic region and anisotropic region. The average fitting results of the SRFMM for different regions with different desired-accuracy values are shown in Figure 9.
Figure 9.
The average fitting results of the SRFMM for different regions with different desired-accuracy values.
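For reference, the adjusted R-squared between an observed histogram and a fitted model can be computed as in the sketch below. It assumes the common definition that penalizes the ordinary R-squared by the number of free parameters; the function name, the argument names, and the choice of histogram bins as samples are illustrative assumptions, and the exact definition used in this paper is the one given in Section 2.

```python
from typing import Sequence


def adjusted_r_squared(
    observed: Sequence[float], fitted: Sequence[float], num_params: int
) -> float:
    """Adjusted R-squared of fitted values against observed values.

    observed and fitted are assumed to be evaluated on the same histogram bins;
    num_params is the number of free parameters and coefficients of the model.
    """
    n = len(observed)
    if n != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    if n - num_params - 1 <= 0:
        raise ValueError("too few samples for the given number of parameters")
    mean_obs = sum(observed) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values must not be constant")
    r_squared = 1.0 - ss_res / ss_tot
    # Penalize model complexity: more parameters lower the adjusted value.
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - num_params - 1)


# Toy data (hypothetical): one bin is off by 1.
print(adjusted_r_squared([1, 2, 3, 4, 5], [1, 2, 3, 4, 6], num_params=1))
```

Because the penalty grows with `num_params`, adding a mixture component that barely reduces the residual can lower the adjusted value, which is why it suits a stepwise stopping rule.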
As shown in Figure 9, the fitting results of the proposed SRFMM achieved the desired accuracy in all cases. We further calculated the average computation time for each SAR image. The SRFMM required a total of 347 and 540 seconds for desired-accuracy values of 0.995 and 0.999, respectively, corresponding to 0.15 and 0.23 seconds per SAR image. These results show that the SRFMM is accurate, stable, and efficient in modeling the multi-aspect SAR images (RQ-2).
We also evaluated the fitting performance for the 16 regions across different observation aspects and aperture angles, and the results were consistent. To further illustrate the accuracy and stability of the proposed SRFMM, we selected two isotropic regions and two anisotropic regions from the 16 regions for display. Figure 10 shows the full-aperture CSAR images of the four selected regions. We chose these regions because they represent natural isotropic targets and man-made anisotropic targets in real-world scenarios, respectively. Figure 11 and Figure 12 show the fitting performance of the SRFMM under various angular conditions for the selected isotropic and anisotropic regions, respectively, for desired-accuracy values of 0.995 and 0.999.
Figure 10.
Full-aperture CSAR images of the selected regions.
Figure 11.
Fitting performance of the SRFMM under various angular conditions for the isotropic regions in multi-aspect SAR images with different desired-accuracy values.
Figure 12.
Fitting performance of the SRFMM under various angular conditions for the anisotropic regions in multi-aspect SAR images with different desired-accuracy values.
As shown in Figure 11 and Figure 12, for the selected isotropic and anisotropic regions, the fitting performance remains stable under various angular conditions. These results indicate that the proposed SRFMM can accurately and stably model the isotropic and anisotropic regions in multi-aspect SAR images under various angular conditions according to the accuracy requirements (RQ-2).
3.4. Evaluation of the PSO Algorithm
In this paper, we employ the PSO algorithm for parameter and coefficient estimation, taking advantage of its robustness and parallelism. All experiments were conducted on a laptop equipped with an Intel Core i9-14900HX CPU (2.20 GHz), 64 GB of RAM, and an NVIDIA GeForce RTX 4060 Laptop GPU with 8 GB of memory. Specifically, considering the available hardware resources, different observation aspects and aperture angles are processed in parallel using eight CPU threads. For each CPU thread, the solutions of 1000 PSO particles are computed in parallel on the GPU using MATLAB (version: R2024b) gpuArray. In this subsection, we conduct two experiments to validate the robustness and efficiency of the PSO algorithm.
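To make the estimation step concrete, the following is a minimal serial sketch of a standard global-best PSO that minimizes a generic objective. It is not the implementation used in this paper, which evaluates 1000 particles in parallel on the GPU through MATLAB gpuArray. The function name `pso_minimize`, the inertia and acceleration coefficients, the particle count, and the toy objective are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    num_particles: int = 50,
    num_iters: int = 200,
    inertia: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Global-best PSO; returns the best position found and its objective value."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(num_particles)]
    vel = [[0.0] * dim for _ in range(num_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(num_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(num_iters):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (
                    inertia * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a fitting error; minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, best_val = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

In the fitting problem, the objective would be an error measure between the mixture model and the observed histogram, and each particle would encode one candidate set of parameters and mixing coefficients. Since the objective is evaluated independently for each particle, this per-particle evaluation is the natural part to run in parallel.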
We first generated 100 random sets of initial parameters and coefficients for each observation aspect and each aperture angle to examine how different initial states affect the fitting performance for isotropic and anisotropic regions under various angular conditions. We selected FMM3 for isotropic regions and FMM4 for anisotropic regions in this experimental evaluation because they achieve the best performance for the respective region types, as discussed in Section 3.2. Figure 13 and Figure 14 show boxplots of the fitting results of different regions in multi-aspect SAR images to analyze the robustness of the PSO algorithm, and Figure 15 shows how the average time consumption varies with different initial states in different regions.
Figure 13.
Robustness analysis of the PSO algorithm for isotropic regions in C-band SAR images under various observation aspects and aperture angles. (Each boxplot is derived from the statistical analysis of the adjusted R-squared values for 100 different initial states under each observation aspect or aperture angle).
Figure 14.
Robustness analysis of the PSO algorithm for anisotropic regions in C-band SAR images under various observation aspects and aperture angles. (Each boxplot is derived from the statistical analysis of the adjusted R-squared values for 100 different initial states under each observation aspect or aperture angle).
Figure 15.
Average time consumption for the PSO algorithm with different initial states in different regions.
As shown in Figure 13 and Figure 14, the adjusted R-squared values remain stable under various angular conditions for both isotropic and anisotropic regions. The median values exhibit minimal variation, and the relatively narrow interquartile ranges indicate that the distributions of the fitting results are concentrated, with only a few outliers. These results suggest that the fitting performance is only marginally affected by the initial states, demonstrating the stability of the PSO algorithm (RQ-3).
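The boxplot quantities discussed above (median, interquartile range, and outliers) can be computed from a set of fitting results as in the sketch below. It assumes the common convention of flagging points beyond 1.5 times the interquartile range from the quartiles as outliers; the function name and the sample values are hypothetical and are not results from this paper.

```python
import statistics
from typing import Dict, Sequence


def boxplot_summary(values: Sequence[float]) -> Dict[str, object]:
    """Median, interquartile range, and 1.5*IQR outliers of a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"median": median, "iqr": iqr, "outliers": outliers}


# Hypothetical adjusted R-squared values from repeated runs with random initial states.
runs = [0.9991, 0.9992, 0.9990, 0.9993, 0.9991, 0.9992, 0.9990, 0.9993, 0.9991, 0.9950]
print(boxplot_summary(runs))
```

A narrow interquartile range with few flagged points corresponds to the behavior reported for Figure 13 and Figure 14, where the initial state has little influence on the final fit.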
As shown in Figure 15, the average time consumption for each SAR image remains stable across different initial parameters and coefficients for both isotropic regions and anisotropic regions. This indicates that the initial values have little impact on the average time consumption. In addition, the average computation time for anisotropic regions is longer than that for isotropic regions. This is because the statistical distribution characteristics of anisotropic regions are more complex than those of isotropic regions and therefore require more time for parameter and coefficient estimation.
To further demonstrate the efficiency that the PSO algorithm gains from its parallelism, we calculated the average time consumption under different numbers of particles. Figure 16 shows the average time consumption for different numbers of particles with different desired-accuracy values.
Figure 16.
Average time consumption for different numbers of particles with different desired-accuracy values in C-band. (Each average time contains an observation aspect and an aperture angle.)
As shown in Figure 16, the average computation time is stable for low precision requirements and shows a slight upward trend with fluctuations for high precision requirements when the number of particles increases. This is because all particles can be processed in parallel on the GPU; thus, the overall computation time is only marginally affected by the particle count. Therefore, the PSO algorithm is quite efficient (RQ-3).
3.5. Comparative Experiments
We employed the EM, SA, and PSO algorithms for the parameter estimation of the SRFMM and conducted experiments on the two isotropic regions and two anisotropic regions shown in Figure 10. To evaluate the fitting performance of the models more objectively, the adjusted R-squared, the Kolmogorov–Smirnov (KS) distance [9,38], and the correlation coefficient [5,9] were selected as evaluation metrics. The adjusted R-squared primarily reflects the fitting performance over the main body of the distribution; the KS distance measures the maximum difference between the cumulative distribution functions (CDFs) of the model and the observed data; and the correlation coefficient describes the consistency of variation trends and the linear relationship between them. Table 3, Table 4, Table 5 and Table 6 show the mean and standard deviation of each metric for the four regions. For each region, the statistics are computed separately over observation aspects and aperture angles.
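The KS distance and the correlation coefficient described above can be sketched as follows. The sketch assumes that the model and the observed data are sampled on the same grid and that the correlation coefficient is the Pearson coefficient; the function names and the toy inputs are illustrative and are not taken from the paper.

```python
import math
from typing import Sequence


def ks_distance(model_cdf: Sequence[float], empirical_cdf: Sequence[float]) -> float:
    """Maximum absolute difference between two CDFs sampled on the same grid."""
    if len(model_cdf) != len(empirical_cdf):
        raise ValueError("CDFs must be sampled on the same grid")
    return max(abs(m - e) for m, e in zip(model_cdf, empirical_cdf))


def correlation_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Toy inputs (hypothetical): CDF samples and PDF samples on a shared grid.
print(ks_distance([0.10, 0.50, 0.90, 1.00], [0.15, 0.48, 0.92, 1.00]))
print(correlation_coefficient([0.1, 0.4, 0.3, 0.2], [0.12, 0.38, 0.31, 0.19]))
```

A smaller KS distance and a correlation coefficient closer to 1 both indicate a better fit, which is the direction of comparison used for the tables.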
Table 3.
Mean and standard deviation of different metrics for the isotropic region A, computed separately over observation aspects and aperture angles.
Table 4.
Mean and standard deviation of different metrics for the isotropic region B, computed separately over observation aspects and aperture angles.
Table 5.
Mean and standard deviation of different metrics for the anisotropic region A, computed separately over observation aspects and aperture angles.
Table 6.
Mean and standard deviation of different metrics for the anisotropic region B, computed separately over observation aspects and aperture angles.
As shown in Table 3, Table 4, Table 5 and Table 6, compared with the EM and SA algorithms, the PSO algorithm has the lowest average KS distance and the highest average adjusted R-squared and correlation coefficient across all four regions. In addition, the standard deviations of all metrics for PSO are smaller than those of the EM and SA algorithms. Therefore, employing the PSO algorithm for parameter estimation in the SRFMM provides more accurate and more stable modeling of the selected regions under different angular conditions.
To further demonstrate the efficiency of the PSO algorithm, we calculated the total computation time and the average time consumption per SAR image for different parameter estimation methods. The results are presented in Table 7.
Table 7.
The total computation time and the average time consumption per SAR image for different parameter estimation methods.
As shown in Table 7, compared with the EM and the SA algorithms, the PSO algorithm achieves the lowest computation time. Therefore, it is the most efficient method for parameter estimation in the SRFMM (RQ-3).
4. Additional Experiments
In this section, we conduct experiments on the X-band GOTCHA dataset to explore the applicability of the proposed method. The CSAR data were acquired at X-band with a 640 MHz bandwidth, covering full polarization and full azimuth. The imaging scene contains multiple civilian vehicles and calibration targets [39]. Figure 17 shows the optical image and the full-aperture X-band CSAR image of the whole imaging scene. As shown in Section 3.5, the PSO algorithm outperforms the EM and SA algorithms for parameter estimation in the SRFMM. Therefore, this section uses the PSO algorithm as the parameter estimation method for the SRFMM.
Figure 17.
Optical image and full-aperture X-band CSAR image of the imaging scene.
We evaluated the proposed method on various regions in the X-band data, and the results were generally consistent with the C-band results. To further illustrate the applicability of the proposed method, we selected an isotropic region and an anisotropic region for illustration. Figure 18 shows the full-aperture X-band CSAR images of the two selected regions. In the entire scene, the isotropic region is enclosed by a green box and the anisotropic region is enclosed by a red box. Figure 19 shows the fitting performance of the SRFMM for the selected regions in the X-band multi-aspect SAR images under various observation aspects and aperture angles for desired-accuracy values of 0.995 and 0.999.
Figure 18.
The selected isotropic and anisotropic regions from the entire scene in X-band.
Figure 19.
Fitting performance of the SRFMM under various angular conditions for the selected regions in multi-aspect SAR images with different desired-accuracy values.
As illustrated in Figure 19, the SRFMM also provides accurate and stable modeling of isotropic and anisotropic regions in the X-band multi-aspect SAR images under various observation aspects and aperture angles.
As in Section 3.4, we also conducted two experiments on the X-band data to validate the robustness and efficiency of the PSO algorithm. The experimental results are shown in Figure 20 and Figure 21.
Figure 20.
Robustness analysis of the PSO algorithm for isotropic and anisotropic regions in X-band SAR images under various observation aspects and aperture angles. (Each boxplot is derived from the statistical analysis of the adjusted R-squared values for 100 different initial states under each observation aspect or aperture angle).
Figure 21.
Average time consumption for the PSO algorithm with different initial states in different regions of X-band SAR images.
As shown in Figure 20, while the X-band fitting performance is more affected by the initial states than that of the C-band, the variation remains within 0.005, demonstrating the stability of the PSO algorithm. Similar to Figure 15, Figure 21 indicates that the initial values have little impact on the average time consumption in the X-band, consistent with the C-band results.
5. Conclusions
In this paper, we propose a stepwise-regression-based finite mixture model (SRFMM) to explore how the number of single parametric models affects the fitting performance of the FMM and effectively model the multi-aspect SAR images. The SRFMM applies a voting-based ranking strategy to determine the model order and incrementally adds the single parametric models until the desired accuracy is reached or overfitting occurs. In the implementation of SRFMM, we used the PSO algorithm for parameter and coefficient estimation due to its robustness and parallelism. Experiments on the C-band and X-band CSAR data indicate that the SRFMM can accurately, stably, and efficiently model the isotropic and anisotropic regions in multi-aspect SAR images under various observation aspects and aperture angles. Although the PSO algorithm improves the speed of parameter estimation, the overall efficiency remains to be further improved for subsequent practical applications. Future work will focus on acquiring more multi-aspect SAR data and conducting practical applications based on these datasets.
Author Contributions
Methodology, R.Z.; Software, R.Z.; Resources, W.H.; Data curation, R.Z.; Writing—original draft, R.Z.; Writing—review and editing, F.T. and W.H.; Supervision, F.T. and W.H.; Project administration, F.T. and W.H.; Funding acquisition, F.T. and W.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China under Grant U25B6001.
Data Availability Statement
The C-band dataset presented in this article is not publicly available because the data are part of an ongoing study. Requests to access the dataset should be directed to tengfei@aircas.ac.cn. The GOTCHA volumetric SAR dataset is available at https://www.sdms.afrl.af.mil (accessed on 10 April 2023).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A Tutorial on Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43.
- Costa, F.A.L.; Rocha, F.H.F.; Gotelip, M.R. Lessonia-1 SAR Project for Improving the Disaster Management in Brazil. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2024, 48, 21–26.
- Russo, L.; Sorriso, A.; Ullo, S.L.; Gamba, P. A Deep Learning Architecture for Land Cover Mapping Using Spatio-Temporal Sentinel-1 Features. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2025, 18, 10562–10581.
- Flores, L.; Nendel, C.; Bookhagen, B.; Oviedo Reyes, J.A.; Smith, T.; Ghazaryan, G. The Potential of Sentinel-1 Time Series for Large-Scale Assessment of Maize and Wheat Phenology across Germany. GISci. Remote Sens. 2025, 62, 2531593.
- Zhu, R.; Teng, F.; Hong, W. Analysis and Modeling of Statistical Distribution Characteristics for Multi-Aspect SAR Images. Remote Sens. 2025, 17, 1295.
- Hong, W. Progress in Circular SAR Imaging Technique. J. Radars. 2012, 1, 124–135.
- Teng, F.; Lin, Y.; Wang, Y.; Shen, W.; Feng, S.; Hong, W. An Anisotropic Scattering Analysis Method Based on the Statistical Properties of Multi-Angular SAR Images. Remote Sens. 2020, 12, 2152.
- Yue, X.; Teng, F.; Lin, Y.; Hong, W. Target Scattering Feature Extraction Based on Parametric Model Using Multi-Aspect SAR Data. Remote Sens. 2023, 15, 1883.
- Moser, G.; Zerubia, J.; Serpico, S.B. Dictionary-Based Stochastic Expectation-Maximization for SAR Amplitude Probability Density Function Estimation. IEEE Trans. Geosci. Remote Sens. 2006, 44, 188–200.
- Li, H.C.; Hong, W.; Wu, Y.R.; Fan, P.Z. An Efficient and Flexible Statistical Model Based on Generalized Gamma Distribution for Amplitude SAR Images. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2711–2722.
- Ban, Y.; Yousif, O.A. Multitemporal Spaceborne SAR Data for Urban Change Detection in China. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2012, 5, 1087–1094.
- Gao, G.; Ouyang, K.; Luo, Y.; Liang, S.; Zhou, S. Scheme of Parameter Estimation for Generalized Gamma Distribution and Its Application to Ship Detection in SAR Images. IEEE Trans. Geosci. Remote Sens. 2017, 55, 1812–1832.
- Tison, C.; Nicolas, J.M.; Tupin, F.; Maitre, H. A New Statistical Model for Markovian Classification of Urban Areas in High-Resolution SAR Images. IEEE Trans. Geosci. Remote Sens. 2004, 42, 2046–2057.
- Shen, W.; Lin, Y.; Yu, L.; Xue, F.; Hong, W. Single Channel Circular SAR Moving Target Detection Based on Logarithm Background Subtraction Algorithm. Remote Sens. 2018, 10, 742.
- Gao, G. Statistical Modeling of SAR Images: A Survey. Sensors 2010, 10, 775–795.
- Zhou, X.; Peng, R.; Wang, C. A Two-Component K–Lognormal Mixture Model and Its Parameter Estimation Method. IEEE Trans. Geosci. Remote Sens. 2015, 53, 2640–2651.
- Krylov, V.A.; Moser, G.; Serpico, S.B.; Zerubia, J. Enhanced Dictionary-Based SAR Amplitude Distribution Estimation and Its Validation with Very High-Resolution Data. IEEE Geosci. Remote Sens. Lett. 2011, 8, 148–152.
- Boothe, R.R. The Weibull Distribution Applied to The Ground Clutter Backscatter Coefficient; Defense Technical Information Center: Fort Belvoir, VA, USA, 1969.
- Nezry, E.; Lopes, A.; Ducrot-Gambart, D.; Nezry, C.; Lee, J.S. Supervised Classification of K-Distributed SAR Images of Natural Targets and Probability of Error Estimation. IEEE Trans. Geosci. Remote Sens. 1996, 34, 1233–1242.
- Frery, A.C.; Muller, H.J.; Yanasse, C.C.F.; Sant’Anna, S.J.S. A Model for Extremely Heterogeneous Clutter. IEEE Trans. Geosci. Remote Sens. 1997, 35, 648–659.
- Jakeman, E.; Pusey, P.N. Significance of K Distributions in Scattering Experiments. Phys. Rev. Lett. 1978, 40, 546–550.
- George, S.F. The Detection of Nonfluctuating Targets in Log-Normal Clutter; Technical Report 6796; Naval Research Laboratory: Washington, DC, USA, 1968.
- Oliver, C.; Quegan, S. Understanding Synthetic Aperture Radar Images; SciTech Publishing: Raleigh, NC, USA, 2004.
- Yue, D.X.; Xu, F.; Frery, A.C.; Jin, Y.Q. Synthetic Aperture Radar Image Statistical Modeling: Part One-Single-Pixel Statistical Models. IEEE Geosci. Remote Sens. Mag. 2021, 9, 82–114.
- Parzen, E. On Estimation of a Probability Density Function and Mode. Ann. Math. Statist. 1962, 33, 1065–1076.
- Bishop, C.M. Neural Networks for Pattern Recognition; Oxford Univ. Press: Oxford, UK, 1995.
- Mantero, P.; Moser, G.; Serpico, S.B. Partially Supervised Classification of Remote Sensing Images through SVM-Based Probability Density Estimation. IEEE Trans. Geosci. Remote Sens. 2005, 43, 559–570.
- Petrou, M.; Giorgini, F.; Smits, P. Modeling the Histograms of Various Classes in SAR Images. Pattern Recognit. Lett. 2002, 23, 1103–1107.
- Pearson, K. Contributions to The Mathematical Theory of Evolution. Philos. Trans. R. Soc. Lond. A 1894, 185, 71–110.
- Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data Via the EM Algorithm. J. R. Stat. Soc. Ser. B. (Methodol.) 1977, 39, 1–22.
- Celeux, G.; Chrétien, S.; Forbes, F.; Mkhadri, A. A Component-Wise EM Algorithm for Mixtures; Research Report 3746; INRIA: Le Chesnay-Rocquencourt, France, 1999.
- Ward, K.D. Compound Representation of High Resolution Sea Clutter. Electron. Lett. 1981, 17, 561–563.
- Blake, A.P.; Blacknell, D.; Oliver, C.J. High Resolution SAR Clutter Textural Analysis. In Proceedings of the IEE Colloquium on Recent Developments in Radar and Sonar Imaging Systems: What Next? Springer: Berlin/Heidelberg, Germany, 1995; pp. 10/1–10/9.
- Mittlböck, M.; Waldhör, T. Adjustments for R2-measures for Poisson Regression Models. Comput. Stat. Data Anal. 2000, 34, 461–472.
- Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the ICNN’95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
- Zhou, H.; Wei, X. Particle Swarm Optimization Based on a Novel Evaluation of Diversity. Algorithms 2021, 14, 29.
- Gui, L.; Hai, Y.; Wu, J.; Li, Z. A Sub-aperture Partition Method for Airborne SAR Based on Particle Swarm Optimization. In Proceedings of the 2021 CIE International Conference on Radar (Radar), Haikou, China, 15–19 December 2021; pp. 382–385.
- Li, H.C.; Krylov, V.A.; Fan, P.Z.; Zerubia, J.; Emery, W.J. Unsupervised Learning of Generalized Gamma Mixture Model With Application in Statistical Modeling of High-Resolution SAR Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 2153–2170.
- Casteel, C.H., Jr.; Gorham, L.A.; Minardi, M.J.; Scarborough, S.M.; Naidu, K.D.; Majumder, U.K. A Challenge Problem for 2D/3D Imaging of Targets from a Volumetric Data Set in an Urban Environment. In Proceedings of the Algorithms for Synthetic Aperture Radar Imagery XIV; Society of Photo Optical: Bellingham, WA, USA, 2007; Volume 6568, p. 65680D.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.