
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Speckle noise is inherent to synthetic aperture radar (SAR) images; it gives them a granular, salt-and-pepper appearance and complicates image classification. In SAR image analysis, spatial information can be particularly beneficial for denoising and for mapping classes characterized by a statistical distribution of pixel intensities arising from a complex and heterogeneous response. This paper proposes the Probability Density Components Analysis (PDCA), a new approach that combines filtering and the frequency histogram to improve the classification of single-channel SAR images. The method was tested on L-band SAR data from the Advanced Land Observing Satellite (ALOS) Phased Array L-band Synthetic Aperture Radar (PALSAR) sensor. The study area is located in the Brazilian Amazon rainforest, northern Rondônia State (municipality of Candeias do Jamari), and contains forest and land use patterns. The proposed algorithm moves a window over the image and estimates the probability density curve, which is stored in different image components. Therefore, a single input image generates a multi-component output. The multi-components should first be treated by noise-reduction methods, such as maximum noise fraction (MNF) or noise-adjusted principal components (NAPCs). Both methods reduce noise and order the multi-component data in terms of image quality. In this paper, the NAPC transform applied to the multi-components provided large reductions in the noise levels, and color composites of the first NAPCs enhanced the discrimination of different surface features. The Spectral Correlation Mapper and Minimum Distance were used for the spectral classification. The results were similar to the visual interpretation of optical images from TM-Landsat and Google Maps.

The presence of speckle noise in synthetic aperture radar (SAR) images gives them a granular, salt-and-pepper appearance, which complicates direct image classification. Thus, a basic problem in SAR image analysis is to develop accurate models for the statistics of the pixel intensities in order to obtain procedures for denoising and classification. Several techniques seek a stationary model for distinct land cover types in the image data, eliminating the variations in the scene caused by surface roughness, topography and the dielectric constant, among others. The stationarity assumption is most plausible for a set of pixels of the same land cover class.

One of the main strategies used in SAR image classification is to apply a preprocessing step by speckle filtering, such as the median filter [

Another approach is the use of the Probability Density Function (PDF) for the discrimination of different types of distributed scatters [

This paper proposes a method that combines filtering and probability density to improve the classification of single-channel SAR images. In particular, we focus on the frequency histogram, which is the most basic spatial information for describing distributed SAR scattering. The area occupied by the vertical bars of a histogram gives an idea of the probability, since the sum of the areas of all the bars represents 100% of the data. Therefore, dividing the count in each histogram class by the total number of samples provides a probability density for the distribution of radar data for the target. The proposed algorithm uses a moving window, calculates the probability density curve within it and records each histogram category in a specific image component. Thus, a single input image generates several output image components. Each output component represents a specific category of the histogram, so the spectrum (the z component of the image) shows the data distribution curve for the central pixel of the window. This histogram curve can be classified with a spectral classifier, such as the Spectral Angle Mapper (SAM) [

The study area is located in the Amazon rainforest, where the acquisition of optical images without cloud cover and adequate sunlight is very difficult. This restriction is the main incentive for the use of synthetic aperture radar (SAR) images for the Amazon rainforest, which can be acquired independently of sunlight or cloud cover. Thus, SAR images offer the potential for continuous monitoring of the Amazon forest cover.

The study area is located in northern Rondônia State, in the municipality of Candeias do Jamari, 20 km from the capital Porto Velho (8°29′06″S–9°26′85″S latitudes and 63°42′59″W–63°48′15″W longitudes) (

The area has a tropical rainforest climate, with a dry season from June to August, and a wet season from December to March [

The study area contains both large, continuous forest and regions that were deforested over the years, mostly for subsistence agriculture and livestock. Therefore, the area has pasture, agriculture, secondary vegetation and regeneration in the pasture. Furthermore, the study site is bounded to the west by the reservoir of the Samuel Hydroelectric plant, which is the largest artificial lake in the state, with a flooded area of 584.6 km² and a mean depth of around 6 m. The altitude at the dam center is 87 m above sea level. Occupation occurred around the center of the municipality, in areas accessible by road, and along the Samuel Reservoir. The rapid conversion of forests causes generalized habitat fragmentation [

In 2006, the Japan Aerospace Exploration Agency (JAXA) launched the Advanced Land Observing Satellite (ALOS), carrying the Phased Array L-Band Synthetic Aperture Radar (PALSAR). In this study, L-band SAR data were acquired by the ALOS-PALSAR sensor over the test site on 19 June 2009 in Fine Beam Dual polarization mode (FBD; look angle: 34.3°, HH and HV polarizations) at processing level 1.5 (georeferenced and geocoded image) with a pixel spacing of 12.5 m. The output image was in Geographic Tagged Image File Format (GeoTIFF), projected to the Universal Transverse Mercator (UTM) coordinate system with the World Geodetic System 84 (WGS84) datum.

The PALSAR instrument provides enhanced sensor characteristics, including full polarimetry, variable off-nadir viewing and ScanSAR operations, as well as significantly improved radiometric and geometric performance. It operates in the L-band with a 1270-MHz (23.6 cm) center frequency and 14- and 28-MHz bandwidths [

The histogram is a graphical representation of a frequency distribution, showing the number of observations that fall in each category (known as a bin). Dividing the count in each bin by the total number of samples generates a probability density graph, in which the bin values sum to one.

This paper introduces the probability density component analysis (PDCA) method, which calculates the frequency histogram for a moving window and distributes the value of each category in a specific image. Thus, a sequence of images is generated, where the total number of images is equal to the number of histogram bins (

The proposed methodology considers the following input variables: (a) the number of histogram bins (

The window size determines the total number of samples used to build the probability density curve. The best window size should be evaluated according to the data under study. Increasing the window size decreases the spatial resolution. In contrast, decreasing the window size can leave the samples too sparsely distributed among the bins of the probability density curve.
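The moving-window procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' C++ software: the function name `pdca`, the use of equal-width bins between the global minimum and maximum of the image, and the clipping of the window at the image borders are all assumptions made for the example.

```python
def pdca(image, window=11, bins=16):
    """Probability density components of a 2-D image given as a list of rows.

    Returns `bins` images; component k at (r, c) holds the fraction of pixels
    in the window centred on (r, c) whose value falls in histogram bin k.
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd")
    rows, cols = len(image), len(image[0])
    half = window // 2
    # Assumption: equal-width bins between the global minimum and maximum.
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    width = (hi - lo) / bins or 1.0  # a constant image falls entirely in bin 0
    components = [[[0.0] * cols for _ in range(rows)] for _ in range(bins)]
    for r in range(rows):
        for c in range(cols):
            counts = [0] * bins
            n = 0
            # Assumption: the window is clipped at the image borders.
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    # The maximum value is assigned to the last bin.
                    k = min(int((image[rr][cc] - lo) / width), bins - 1)
                    counts[k] += 1
                    n += 1
            # Dividing by the number of samples makes the curve sum to one.
            for k in range(bins):
                components[k][r][c] = counts[k] / n
    return components
```

Reading the values `components[0][r][c]` to `components[bins - 1][r][c]` in sequence gives the probability density curve of the pixel at row `r` and column `c`, which is the "spectrum" that the later processing steps operate on.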

Similarly, for the design of a hyperspectral cube [

The MNF transform adopts arguments similar to those of PCA to derive its components. This method is a linear transformation that uses a signal-to-noise ratio to sort images,

NAPC transform is mathematically equivalent to MNF transform, but the former transform can be implemented using standard principal components algorithm, without the need for matrix inversion and eigenanalysis of a nonsymmetric matrix [_{D}_{N}) can be estimated using the following equation:

In this algorithm, a “D” equal to 1 was used. The _{D}_{D}

The endmembers consist of pure elements in the image that, by mixing, form all other spectra present. The techniques for endmember detection were developed for hyperspectral sensors, but have been employed for multispectral sensors and are used in this study on the PDCs. The algorithms used to detect the endmembers implicitly or explicitly assume the convex geometry and the linear mixing model [

The most widely used algorithm is the method proposed by Boardman and Kruse [

The spectral classifiers compare image spectra to a reference spectrum from spectral libraries or to spectral endmembers [
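As a rough illustration of how a probability density curve can be compared with endmember curves, the sketch below computes a Pearson-type correlation, which is how the Spectral Correlation Mapper is commonly defined, and the Euclidean distance used by a minimum-distance classifier. The function names and the toy endmember curves in the usage note are hypothetical and are not taken from the study.

```python
import math

def scm(curve, reference):
    """Pearson correlation between two curves (value in [-1, 1])."""
    mean_c = sum(curve) / len(curve)
    mean_r = sum(reference) / len(reference)
    num = sum((a - mean_c) * (b - mean_r) for a, b in zip(curve, reference))
    den = math.sqrt(sum((a - mean_c) ** 2 for a in curve)
                    * sum((b - mean_r) ** 2 for b in reference))
    return num / den if den else 0.0  # flat curves carry no shape information

def euclidean(curve, reference):
    """Euclidean distance between two curves."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve, reference)))

def classify(curve, endmembers):
    """Return (class with highest SCM, class with smallest distance)."""
    by_scm = max(endmembers, key=lambda name: scm(curve, endmembers[name]))
    by_distance = min(endmembers, key=lambda name: euclidean(curve, endmembers[name]))
    return by_scm, by_distance
```

For example, with the hypothetical endmembers `{"water": [0.9, 0.1, 0.0, 0.0], "forest": [0.0, 0.1, 0.5, 0.4]}`, the curve `[0.8, 0.2, 0.0, 0.0]` is assigned to `"water"` by both measures.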

The classification obtained by the proposed method was compared to a classification by visual interpretation of Landsat-TM images on 15 July 2009 (_{ii}_{i+}_{+i}

Radar imaging systems are based on backscattering from land cover, which generates a textural pattern characterized by different proportions of dark and light pixels, which prevent the direct use of classical techniques applicable to optical images. Thus, SAR images have specific characteristics that are quite different from optical remote sensing, depending on the scatter of ground objects and textural patterns, which vary with different targets. Normally, the traditional procedures for the classification of radar images are preceded by a step of filtering or texture analysis. Therefore, several texture descriptors have been adopted for SAR image classification.

A comparison of the proposed method should consider other methods that have univariate data as the input and multivariate data as the output, which generates multi-channels of textural attributes to describe a particular pixel. Thus, the multi-channel filtering approach decomposes the image into a number of filtered images, each of which contains intensity variations over a frequency and orientation [

Thus, the proposed method is compared with traditional methods: the gray-level co-occurrence matrix (GLCM) and Gabor filters. The multi-channel methods allow the use of spectral classifiers (e.g., SCM and SAM) from a single image. In these studies, the local spectrum is established through features that are obtained by filtering the input image with a set of textural operators.

The gray level co-occurrence matrix (GLCM) [_{(i,j)}” describe the relative frequencies of all pairwise combinations of grey levels (_{ij}

Therefore, a singular GLCM is required for each parameter combination (

A co-occurrence window width is an important parameter for classification purposes, relating to the target dimension in the studied image [

Originally, 14 statistical parameters were extracted from GLCM [
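To make the co-occurrence idea concrete, the following sketch builds a symmetric, normalized GLCM for one pixel offset and computes three of the descriptors named in this paper (contrast, dissimilarity and homogeneity) using their commonly cited definitions. The function names, the default offset and the assumption that the window is already quantized to integer grey levels are choices made for the example only.

```python
def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy).

    `window` is a list of rows holding integer grey levels in [0, levels).
    """
    rows, cols = len(window), len(window[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the window")
    return [[v / total for v in row] for row in counts]

def glcm_descriptors(p):
    """Contrast, dissimilarity and homogeneity of a normalized GLCM."""
    contrast = dissimilarity = homogeneity = 0.0
    for i, row in enumerate(p):
        for j, p_ij in enumerate(row):
            d = abs(i - j)
            contrast += p_ij * d * d
            dissimilarity += p_ij * d
            homogeneity += p_ij / (1 + d * d)
    return {"contrast": contrast, "dissimilarity": dissimilarity,
            "homogeneity": homogeneity}
```

Applying these two functions inside a moving window yields one image per descriptor, which is how a single radar image becomes a multi-channel textural data set.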

In this paper, we used a “

A Gabor function is a Gaussian modulated complex sinusoid in the spatial domain, which is widely applied to image processing, computer vision and pattern recognition [_{x}_{y}

The spatial frequency bandwidth (

A multi-channel filtering approach can be implemented, where an image is filtered by a set of suitable Gabor filters at different orientations and spatial frequencies, resulting in a filter bank design used for analysis, classification or segmentation [
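A filter bank of this kind can be sketched as follows. The kernel is a Gaussian envelope multiplied by a complex sinusoid along the rotated x axis; the parameter names, the omission of the Gaussian normalization constant and the evenly spaced orientations are assumptions for illustration rather than the settings used in this study.

```python
import cmath
import math

def gabor_kernel(size, frequency, theta, sigma_x, sigma_y):
    """Complex Gabor kernel of odd `size`, oriented at `theta` radians."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so the sinusoid runs along direction theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            # Unnormalized Gaussian envelope (normalization constant omitted).
            envelope = math.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
            row.append(envelope * cmath.exp(2j * math.pi * frequency * xr))
        kernel.append(row)
    return kernel

def gabor_bank(size, frequencies, n_orientations, sigma_x, sigma_y):
    """Kernels for every (frequency, orientation index) combination."""
    return {(f, k): gabor_kernel(size, f, k * math.pi / n_orientations,
                                 sigma_x, sigma_y)
            for f in frequencies for k in range(n_orientations)}
```

Convolving the input image with each kernel in the bank and taking the magnitude of the response gives one feature image per frequency and orientation.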

For the use of the PDCA method in ALOS-PALSAR images, we considered a window size of 11 × 11 and the number of bins to be 16.

The PDC images were subjected to NAPC transformation. PDC-NAPCs show an increasing noise fraction from the first toward the last. The signal fraction is grouped into the first five components (

The color composites considering the first NAPCs enhance the reservoir, land use areas and forest (

In order to eliminate noisy features in the probability density curve, the inverse NAPC rotation was performed, considering the six signal components. This procedure allows an intense smoothing of the probability density curve, without serious signal degradation (

The endmembers were identified by visualization in the n-dimensional scatter plot of the first three NAPCs. The point distribution shows a configuration in the shape of an “S” (

The water curve has its frequency accumulated in the first bin, because when the pulse hits the smooth surface, most of the energy is reflected specularly away from the receiver, so little energy is recorded (

The images were classified using two spectral measures: SCM and Euclidean distance. The classes related to the B and C endmembers represent the same group when compared with the optical images (

The accuracy indices are slightly higher for the SCM classification (kappa coefficient = 0.8137 and overall accuracy = 87.08%) than for the Minimum Distance classification (kappa coefficient = 0.8035 and overall accuracy = 86.43%). Therefore, the resulting image classification shows great similarity to the visual image interpretation (

Furthermore, along the boundaries between the pasture and forest classes, a narrow strip of secondary forest and disturbed forest was erroneously detected. This error is due to the presence of two classes in the moving window, which generates an intermediate curve similar to that of secondary forest.

The textural classifications (GLCM and Gabor filter) were tested with the same window size (11 × 11), endmembers and spectral classifiers (similarity and distance measures) in order to limit the comparison to the textural descriptors alone. The comparison among the methods considered the accuracy indices (kappa coefficient and overall accuracy) between the textural classification and the visual interpretation of high-resolution optical images.
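For reference, both accuracy indices can be computed from a confusion matrix of pixel counts with the standard formulas, as in the sketch below. The function name and the counts in the usage note are hypothetical; the confusion matrix published with this paper is expressed in column percentages, so it cannot be fed to this function without the underlying sample counts.

```python
def accuracy_indices(matrix):
    """Overall accuracy and kappa coefficient from a square matrix of counts."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(size))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Sum of products of matching row and column totals (chance agreement term).
    chance = sum(r * c for r, c in zip(row_totals, col_totals))
    denominator = n * n - chance
    if denominator == 0:
        raise ValueError("kappa is undefined when all samples fall in one class")
    overall = diagonal / n
    kappa = (n * diagonal - chance) / denominator
    return overall, kappa
```

With the hypothetical counts `[[40, 10], [10, 40]]`, the overall accuracy is 0.8 and the kappa coefficient is 0.6, which shows how kappa discounts the agreement expected by chance.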

Co-occurrence images present a high correlation, particularly with respect to Dis and Con (

The sequential search method for the selection of the best textural descriptors was applied to the images classified by Euclidean distance.

In the scheme shown in

The classifications from all spatial frequencies (

The successive elimination of higher frequencies caused an improvement in classification accuracy. As might be expected, the higher frequency images highlighted the edge detection, but contributed little to distinguishing the land-cover classes. The higher the filter frequency, the greater the interference of impulse noise in the image [

The methods discussed are available in software developed in the C++ language (

The software is divided into three modules: (1) the PDCA method; (2) the forward and inverse NAPC transformations; and (3) the SCM classifier. The PDCA method takes the following input parameters: a single radar image, the number of bins (which equals the number of components) and the size of the moving window. The PDCA generates the following output data: an image composed of all components and a single grayscale image for the bins selected by the user. The forward NAPC transformation takes as input the PDCs and the number of neighborhoods to be used in calculating the noise matrix. The inverse NAPC transformation takes as input the signal components of the NAPC and the eigenvector matrix of the forward NAPC rotation. Finally, the SCM method takes as input the endmembers and the components after noise removal.
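The noise matrix needed by the forward NAPC step is commonly estimated from differences between neighboring pixels. The sketch below shows one such shift-difference estimator, in which the noise covariance is taken as half the covariance of the differences between each pixel and its horizontal neighbor. It is an assumption for illustration only: the function name, the horizontal direction of the shift and the use of the population covariance are not confirmed by the text, and the remaining NAPC steps (noise whitening followed by a standard principal components rotation) are not shown.

```python
def shift_difference_noise_cov(cube, shift=1):
    """Estimate the noise covariance of a multi-component image.

    `cube[b][r][c]` is the value of component b at row r, column c.
    Returns a bands x bands matrix equal to half the covariance of the
    differences between each pixel and its neighbor `shift` columns away.
    """
    bands, rows, cols = len(cube), len(cube[0]), len(cube[0][0])
    # One list of horizontal differences per component.
    diffs = [[cube[b][r][c] - cube[b][r][c + shift]
              for r in range(rows) for c in range(cols - shift)]
             for b in range(bands)]
    n = len(diffs[0])
    means = [sum(d) / n for d in diffs]
    # The factor 0.5 assumes uncorrelated noise and equal signal in neighbors.
    return [[0.5 * sum((diffs[a][k] - means[a]) * (diffs[b][k] - means[b])
                       for k in range(n)) / n
             for b in range(bands)]
            for a in range(bands)]
```

The factor of one half follows from the fact that the difference of two independent noise samples has twice the variance of a single sample.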

All inputs and results are shown in the file list, so they can be visualized as a “gray scale” or “RGB” composite. The display interface provides basic functions for image visualization, such as zooming and inspecting pixel values. Moreover, the results (output files) can be read by other viewers of binary images.

In examining SAR data, a crucial issue is the challenge of developing accurate models for the statistics of pixel intensities, with the purpose of classifying or filtering. In this article, the main objective was to present a new algorithm that improves SAR image classification. The algorithm combines different methods: some procedures have already been applied to radar images (PDF and moving-window filters), whereas others, taken from hyperspectral image processing (MNF, pixel purity index and SCM), had not yet been applied to them. One of the main innovations compared to previously published strategies is the construction of a probability density curve for each image pixel using a moving-window approach, which generates multi-components that minimize noise and enable the application of spectral classification techniques. Unlike other moving-window filters [

The proposed algorithm has two free parameters, which must be adjusted to obtain a histogram with a well-populated sample distribution. In the present study, we used a window size of 11 × 11 pixels, which generates 121 samples that are distributed among 16 bins. Although this sample distribution gave good results, new studies may test other combinations in order to improve performance. Once the radar image is converted to the PDC components, the original image cannot be recovered by an inverse transform.

The NAPC transformation is applied to the PDC components in order to minimize the noise of the probability density curves. This procedure is very different from the conventional methods of noise removal in radar images, which operate on a single image. Therefore, this new approach using multi-components has no similarity with other methods applied to radar data, but is compatible with the procedures used for hyperspectral images [

We tested the method on a well-known and representative area of the Amazon Forest, where the model detected the targets accurately. Forest mapping using radar data is particularly important in the tropics, where optical sensors are often constrained by the presence of haze, smoke or clouds. However, the proposed method should be further tested in other environments in order to evaluate its performance for different targets. To this end, the algorithm is available as free software for those who want to test it.

The PDCA method was compared with the GLCM and Gabor methods, widely used in the SAR image classification and texture analysis. These methods present various free variables that provide a wide variety of results. In this study, different combinations of descriptors were tested in order to obtain better classification accuracy. The best combination of GLCM descriptors was composed of dissimilarity, contrast, and homogeneity. However, the accuracy values among the best descriptor set and other subsets are statistically very close, because of the high multicollinearity of GLCM images. The best descriptors from Gabor filter images are the lower spatial frequency (

A limitation of the proposed method is the classification along the boundaries between different classes. In these locations, the probability density curve displays a hybrid behavior of the two classes, which makes classification difficult. Future work will mainly cover the development of new algorithms for this specific problem. In addition, other radar bands or methods of noise elimination, endmember identification and classification can be tested.

The PDCA method is conceptually simple and effective in speckle filtering and allows a supervised classification from the probability density curves. The probability density curve describes a stationary condition, mainly after the noise elimination with the NAPC transform, which facilitates the classification. In this case, the spatial frequencies of the radar images are distributed among different images, which improves their representation, unlike other methods that are limited to a single output image. In this sense, the color composition of the PDC-NAPC signal components is an interesting enhancement technique that highlights the main surface targets and significantly enhances single-channel radar images.

The PDCs can be treated similarly to the multispectral images from optical sensor systems or aerial gamma-ray surveys. Endmembers of probability density curves are identified using the n-dimensional scatter plot, and the classification is accomplished through the SCM method. The probability distribution for the amplitude and intensity SAR data is far from being symmetric because of the Rayleigh distribution. However, in this study, the asymmetry attribute is a factor for endmember individualization. The method also has no difficulty in identifying targets with strong returns from point scatterers or targets with very low intensities and very small ranges. Thus, the method is suitable for discriminating different types of scattering. However, several issues can still be investigated in future research in order to achieve better performance, such as the sensitivity analysis of free parameters, comparisons with other methods and applications to other environments. All these factors are challenges for SAR image processing, and the spectral-analysis algorithms extend the processing alternatives.

This study was funded through a project from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). The authors are thankful for the financial support from a CNPq fellowship (Osmar Abílio de Carvalho Júnior, Renato Fontes Guimarães and Roberto Arnaldo Trancoso Gomes). Special thanks are given to Financiadora de Estudos e Projetos (FINEP) for additional support.

Osmar Abílio de Carvalho Júnior wrote the manuscript and was responsible for the research design, mathematical model, data preparation and analysis. Luz Marilda de Moraes Maciel and Ana Paula Ferreira de Carvalho provided some of the data, conducted the fieldwork and gave relevant technical support. Renato Fontes Guimarães and Roberto Arnaldo Trancoso Gomes supported the analysis and interpretation of the results. Nilton Correia da Silva and Cristiano Rosa Silva provided significant input to the numerical analysis, computational optimization and object-oriented programming. All of the authors contributed to editing and reviewing the manuscript.

The authors declare no conflict of interest.

Study location in the municipality of Candeias do Jamari, RO.

Procedures to generate the Probability Density Components: establishment of a moving window (for example, with a dimension of 5 × 5), calculating the frequency histogram, and the establishment of images of the probability density components.

The concept of probability density component analysis is shown with a spectrum calculated for each spatial element in an image. The curves describe different targets.


3D image cube composed of probability density components (PDCs), where a cube face is an RGB color composition (1PDC, 8PDC and 16PDC).

Probability density components from a window size of 11 × 11 and 16 bins: (

Noise-adjusted principal component (NAPC) transform of the probability density components: (


Noise treatment of the PDC by the use of inverse NAPC rotation: (

Endmember identification using the n-dimensional scatter plot.

Probability density signatures in the study area (A, B, C, D and E curves).

Classified maps using (

Rule images of the Spectral Correlation Mapper (SCM) method, considering the endmembers of

GLCM-descriptor signatures for the study area: (A) Curve A; (B) Curve B; (C) Curve C; (D) Curve D; and (E) Curve E.

Sequential search scheme for the selection of the best GLCM descriptors.

Confusion or error matrix for four classes.

Class | Pasture and agriculture | Secondary Forest | Forest | Water |
---|---|---|---|---|
Pasture and agriculture | 87.33 | 19.43 | 4.78 | 3.38 |
Secondary Forest | 8.61 | 66.78 | 18.54 | 0.45 |
Forest | 1.27 | 13.38 | 75.62 | 0.14 |
Water | 2.79 | 0.42 | 1.06 | 96.04 |
Total | 100.00 | 100.00 | 100.00 | 100.00 |

Correlation matrix among the gray-level co-occurrence matrix (GLCM) descriptors. Dis, dissimilarity; Con, contrast; Ent, entropy; Var, variance; SM, second moment; Hom, homogeneity; Cor, correlation.

Descriptor | Dis | Con | Ent | Var | SM | Hom | Cor |
---|---|---|---|---|---|---|---|
Dis | 1.00 | 0.97 | 0.89 | 0.85 | −0.76 | −0.84 | −0.88 |
Con | 0.97 | 1.00 | 0.78 | 0.86 | −0.66 | −0.79 | −0.83 |
Ent | 0.89 | 0.78 | 1.00 | 0.74 | −0.80 | −0.73 | −0.81 |
Var | 0.85 | 0.86 | 0.74 | 1.00 | −0.63 | −0.71 | −0.64 |
SM | −0.76 | −0.66 | −0.80 | −0.63 | 1.00 | 0.92 | 0.77 |
Hom | −0.84 | −0.79 | −0.73 | −0.71 | 0.92 | 1.00 | 0.83 |
Cor | −0.88 | −0.83 | −0.81 | −0.64 | 0.77 | 0.83 | 1.00 |