Remote Sens. 2019, 11(12), 1414; https://doi.org/10.3390/rs11121414
Region Merging Method for Remote Sensing Spectral Image Aided by Inter-Segment and Boundary Homogeneities
School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
Key Laboratory of Photoelectronic Imaging Technology and System, Beijing Institute of Technology, Beijing 100081, China
School of Physics and Optoelectronic Engineering, Foshan University, Guangdong 528000, China
Author to whom correspondence should be addressed.
Received: 11 May 2019 / Accepted: 6 June 2019 / Published: 14 June 2019
Image segmentation is extensively used in remote sensing spectral image processing. Most existing region merging methods assess heterogeneity or homogeneity using global or pre-defined parameters, which lack the flexibility to further improve the goodness-of-fit. Recently, a local spectral angle (SA) threshold was used to produce promising segmentation results. However, this method does not consider the inherent relationship between adjacent segments. To overcome this limitation, an adaptive SA threshold method, which combines the inter-segment and boundary homogeneities of adjacent segment pairs by their respective weights to refine a predetermined SA threshold, is employed in a hybrid segmentation framework to enhance the image segmentation accuracy. Compared with conventional segmentation approaches based on global SA and local SA thresholds, the proposed method improves the segmentation accuracy for different kinds of reference objects. Visual comparison also shows that our method matches reference polygons of varied sizes and types more accurately.
Keywords: region merging; local spectral angle threshold; multi-band watershed transformation; geographic object-based image analysis (GEOBIA)
1. Introduction
With the rapid development of high resolution remote sensing imaging techniques, geographic object-based image analysis (GEOBIA) has become a promising paradigm to extract accurate and reliable ground information from various detectors [1,2]. The GEOBIA framework typically encompasses several sub-procedures such as image segmentation, geo-object recognition, feature extraction and image classification [3,4,5,6,7,8,9]. Image segmentation relies on spectral or spatial knowledge to produce a spatial partition with contiguous and homogeneous characteristics, which forms the basis or information carrier for the following processing steps [10,11]. Thus, image segmentation plays a crucial role in the overall performance of GEOBIA in geographic information science.
In the past, edge-based and region-based partition strategies were proposed to implement image segmentation. Edge-based algorithms partition the underlying images based on perceivable edges inferred from the dissimilarity between neighboring pixels. However, edge-based algorithms are sensitive to noise and texture variation, and are thus apt to produce over-segmentation around textured regions. On the other hand, region-based algorithms exploit the homogeneity or heterogeneity of adjacent regions to improve the robustness of the segmentation results against noise [13,15]. The segmentation errors of region-based algorithms often occur along the boundaries, which tends to bias the solution towards under-segmentation. To take advantage of both sides, a set of hybrid segmentation methods have been proposed to jointly exploit the principles of edge-based and region-based algorithms [17,18,19].
Recently, a new hybrid framework, referred to as the segment-merging method, has been developed to solve the segmentation problem. In its procedure, the edge information is first utilized to produce the initial partitions of the image, and then region-based algorithms are used to conduct the merging stage based on the interior information within each patch. Benefiting from the complementarity between the edge-based and region-based algorithms, the segment-merging method is deemed a promising framework to explore the integral information in a geo-object way [16,20]. In the segment-merging method, the initial segmentation step can be implemented by a variety of methods, such as the support vector machine, watershed transformation, superpixel method, fractal net evolution approach, and mean-shift based method. During the subsequent merging process, merging order (MO) and merging criteria (MC) are two key factors to be considered in the algorithm design and optimization. According to the degree of strictness, MO can be classified, from relaxed to strict, as follows: fitting, best-fitting, local-mutual best-fitting and global-mutual best-fitting. In practice, the merging algorithms can be optimized by relying on one or more of the MOs mentioned above. Compared to MO, MC has a more significant influence on the performance of merging. Benz et al. concluded that the object feature heterogeneity used in MC should take into account the color and shape information. They proposed an MC based on spectral and shape heterogeneity, which was adopted by the eCognition software. Subsequent research found that the intra-segment homogeneity should not be neglected. For instance, He et al. and Wang et al. optimized the MC to encompass both intra-segment homogeneity and inter-segment heterogeneity [27,28]. Yang et al. proposed to design the MC based only on inter-segment homogeneity, without using heterogeneity characteristics.
In addition, boundary information has also been involved in the MC to support image segmentation. Zhang et al. showed that adding edge strength features helps to avoid under-segmentation. Chen et al. included the edge penalty and the constrained spectral variance difference in the MC to consider the difference in spectral heterogeneity between two neighboring objects.
Nowadays, segment-merging methods that employ a local spectral angle (SA) threshold as the MC attract considerable attention. Compared with conventional strategies using a global SA threshold, these methods refine the preset threshold with local homogeneity to obtain an adaptive local SA threshold. By these means, geo-objects of various sizes and types can be segmented correctly. However, most segment-merging methods with a local SA threshold neglect the inherent relations among adjacent regions when determining the local MC.
This paper proposes an adaptive SA threshold based on both inter-segment and boundary homogeneities to improve the accuracy of segment-merging methods. The goal is achieved in the following four steps. Firstly, the image is initially partitioned into regions using a watershed transformation. Then, we assess the spectral distance between adjacent regions through their SA and construct an initial MC based on the preset SA threshold. After that, following the suggestion in related works, the preset SA threshold is refined by the local homogeneity of adjacent regions, which is quantified by the weighted average of inter-segment and boundary homogeneities in terms of the relative areas of the boundary regions and internal regions, to obtain the adaptive local SA threshold. Finally, the underlying regions are merged when the minimum spectral distance is less than the adaptive local SA threshold. The proposed algorithm goes over all regions and repeats the aforementioned steps to complete the merging stage and produce the final image segmentation results.
2. Methods
This section describes the proposed hybrid segment-merging method in four steps. First, we employed the watershed transformation method to accommodate the requirements of remote sensing image processing and obtain the initial segmentation. Secondly, we utilized the spectral distance between adjacent regions to represent their heterogeneity and employed an SA threshold that is affected by the local homogeneity as the MC. Thirdly, the local SA thresholds were refined based on the inter-segment homogeneities and boundary homogeneities. Finally, according to the local-mutual best-fitting strategy, regions were iteratively merged until no candidate pair had a spectral distance below its adaptive local SA threshold. The above framework is shown in Figure 1. In addition, we provide a method to evaluate the accuracy of the segmentation results.
2.1. Initial Segmentation
As mentioned above, initial segmentation can be implemented by various methods, which can produce similar segmentation results. Here, we adopted the watershed transformation algorithm due to its advantage in detecting edge information through a gradient image, which is suitable for the subsequent merging step and for the comparison with relevant methods.
Most traditional watershed-transformation based methods were derived from a panchromatic image or a single band of a multispectral image. As spectral information plays an increasingly significant role in geoscience, researchers attempt to make full use of the edge information in all spectral bands. In the past, Yang et al. developed a multi-band watershed segmentation method to compromise between a large amount of spectral information and computational complexity. Here, we follow this line of thought and acquire the initial segmentation by the following steps.
Firstly, the scalar gradient value of each pixel is measured by the spectral distance, taking all spectral bands into account. In spectral space, SA is a common spectral distance metric. The SA between two pixels is defined as:
\[ \theta(\mathbf{x}, \mathbf{y}) = \arccos\left( \frac{\sum_{l=1}^{L} x_l y_l}{\sqrt{\sum_{l=1}^{L} x_l^2}\,\sqrt{\sum_{l=1}^{L} y_l^2}} \right) \qquad (1) \]
where $L$ is the number of spectral bands within the image, and $x_l$ and $y_l$ respectively represent the spectral responses of two different pixels in the $l$-th spectral band. The maximum SA for a given pixel located at $(m, n)$ is defined as:
\[ G(m, n) = \max_{(\mathbf{x}, \mathbf{y}) \in W} \theta(\mathbf{x}, \mathbf{y}) \qquad (2) \]
where $\theta(\mathbf{x}, \mathbf{y})$ stands for the spectral angle between any pair of adjacent pixels within a moving window $W$ (four or eight neighboring pixels) centered at the coordinate $(m, n)$, and $G(m, n)$ denotes the magnitude of the scalar gradient.
Secondly, the gradient image was obtained. Using Equations (1) and (2), the spatial dimensions of images were scanned with the window, and gradient values corresponding to all pixels were obtained. These gradient values form a gradient map according to the spatial positions of their corresponding pixels.
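The gradient computation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `spectral_angle` and `sa_gradient` are hypothetical, the window is taken to be the four-neighborhood, the angle is measured between the center pixel and each neighbor, and border pixels are handled by edge replication.

```python
import numpy as np

def spectral_angle(x, y, eps=1e-12):
    """Spectral angle in degrees between spectra stored along the last axis."""
    dot = np.sum(x * y, axis=-1)
    norm = np.linalg.norm(x, axis=-1) * np.linalg.norm(y, axis=-1)
    cos = np.clip(dot / np.maximum(norm, eps), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def sa_gradient(image):
    """Maximum spectral angle between each pixel and its 4 neighbours.

    image: array of shape (rows, cols, bands).
    Assumption: borders are padded by edge replication.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape[:2]
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    grad = np.zeros((rows, cols))
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # Shifted view holding the neighbour of every pixel in one direction.
        neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
        grad = np.maximum(grad, spectral_angle(img, neighbour))
    return grad
```

The resulting array plays the role of the gradient map; any watershed implementation that accepts a scalar gradient image could then be applied to it.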
Thirdly, the gradient image was segmented by the watershed transformation algorithm to get the initial segmentation.
In this step, we do not use any pre-processing method so as to obtain an over-segmentation result, which is required by the subsequent region merging step.
2.2. Merging Criteria
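Relying on the region adjacency graph (RAG), we defined the spatially adjacent regions and obtained their spatial relationships. The spectral distance between adjacent segments $R_i$ and $R_j$ is then defined as:
\[ D(R_i, R_j) = \arccos\left( \frac{\sum_{l=1}^{L} \bar{x}_{i,l}\, \bar{x}_{j,l}}{\sqrt{\sum_{l=1}^{L} \bar{x}_{i,l}^2}\,\sqrt{\sum_{l=1}^{L} \bar{x}_{j,l}^2}} \right) \qquad (3) \]
Compared to Equation (1), $x_l$ and $y_l$ are respectively replaced by $\bar{x}_{i,l}$ and $\bar{x}_{j,l}$, which represent the average spectral responses of the two segments $R_i$ and $R_j$ in band $l$. $D(R_i, R_j)$ is expressed in the form of an SA and its value varies from 0 to 90 degrees. A lower value of $D(R_i, R_j)$ means a smaller spectral distance and a lower heterogeneity between the adjacent segments.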
Since spectral distance can measure the spectral similarity of adjacent segments, it can be used for region merging [29,30]. The spectral distances between each segment and its neighbors were calculated and the minimum spectral distance was selected. According to the MO of local-mutual best-fitting, if the spectral distance between segment $R_i$ and segment $R_j$ was the smallest of the spectral distances between each of them and its adjacent regions, then $R_i$ and $R_j$ were mutually most similar to each other and were selected as a pair of candidates to be merged.
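The candidate selection under local-mutual best-fitting can be illustrated with a small sketch. It assumes that each segment is summarized by its mean spectrum and that adjacency is stored as a dictionary of neighbor sets; the names `region_distance` and `mutual_best_pairs` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def region_distance(mean_i, mean_j):
    """Spectral angle in degrees between two mean spectra."""
    cos = np.dot(mean_i, mean_j) / (np.linalg.norm(mean_i) * np.linalg.norm(mean_j))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def mutual_best_pairs(means, adjacency):
    """Return pairs (i, j), i < j, that are each other's closest neighbour.

    means: dict mapping segment id -> mean spectrum.
    adjacency: dict mapping segment id -> set of adjacent segment ids.
    """
    best = {}
    for i, nbrs in adjacency.items():
        if nbrs:
            best[i] = min(nbrs, key=lambda j: region_distance(means[i], means[j]))
    # Keep a pair only when the preference is mutual.
    return sorted((i, j) for i, j in best.items() if i < j and best.get(j) == i)
```

Only the returned pairs would then be compared against the SA threshold; a segment whose closest neighbor prefers a different segment is not a candidate in that pass.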
When the spectral distance between adjacent regions was smaller than a given threshold, the adjacent segments were merged. Unlike other methods that use less intuitive metrics as the MC [14,15,19], we employed an intuitive and physically defined metric, the SA, as the MC. A preset SA threshold was needed to determine whether the spectral distance between two segments is small enough for them to be merged. As concluded by Yang et al., a higher SA threshold level should be applied to homogeneous segments than to heterogeneous segments during region merging. The relationship between the local homogeneity of segment i and its local SA threshold is expressed by Equation (4).
2.3. Local Adaptive SA Threshold Aided by Inter-Segment and Boundary Homogeneities
In this section, we examine the impact of inter-segment and boundary homogeneities on the local SA threshold, which is conceptually illustrated in Figure 2. Rectangles with dotted red frames represent the two object segments being compared, and each of the three rows compares the spectral distances of two segment pairs, each formed by an object segment and one of its adjacent segments. In every row, the two pairs have the same spectral distance, which is also the smallest among their adjacent segments. In the first row, one object segment is more homogeneous than the other and therefore has a higher SA threshold for the subsequent merging operations. In the second row, the two object segments are equally homogeneous, but the adjacent segment of one is more homogeneous than that of the other; the pair with the more homogeneous adjacent segment therefore has a higher SA threshold for merging. Comparing the scenarios in the first and second rows, we can conclude that the inter-segment homogeneity should encompass not only the object segment but also its adjacent segment. In the third row, the two pairs have the same inter-segment homogeneity, but the boundary (rectangles with solid red frames), which is defined as the set of pixels along the common edge between two adjacent segments, is more homogeneous for one pair than for the other. Therefore, a higher threshold should be permitted for the pair with the more homogeneous boundary. In the end, the pair whose spectral distance is below its threshold is merged, while the pair whose spectral distance is not below its threshold is left unmerged. As shown in Figure 2 and the interpretation above, the inter-segment and boundary homogeneities of neighboring segments should be considered simultaneously in region merging. Therefore, we modify Equation (4) into Equation (5), in which the homogeneity of a single segment is replaced by the homogeneity of a pair of adjacent segments, yielding the corresponding local SA threshold for that pair. Hereafter, the influence of homogeneity is quantified and modeled.
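The homogeneity of the segment $R_i$ is evaluated by the standard deviation (STDV) of the digital number (DN) values, averaged over all spectral bands, within the segment:
\[ \sigma_i = \operatorname{STDV}_{p \in R_i}\left( \frac{1}{L} \sum_{l=1}^{L} x_l(p) \right) \qquad (6) \]
where $p$ is one of the pixels within the segment $R_i$ and $x_l(p)$ is the spectral response of pixel $p$ in spectral band $l$. Note that a lower value of $\sigma_i$ indicates that $R_i$ is a more homogeneous segment. In the same way, the inter-segment homogeneity between segment $R_i$ and its adjacent segment $R_j$ is given by:
\[ \sigma_{ij} = \operatorname{STDV}_{p \in R_i \cup R_j}\left( \frac{1}{L} \sum_{l=1}^{L} x_l(p) \right) \qquad (7) \]
The boundary homogeneity between $R_i$ and its adjacent segment $R_j$ is defined in a similar way:
\[ \sigma^{B}_{ij} = \operatorname{STDV}_{p \in B_{ij}}\left( \frac{1}{L} \sum_{l=1}^{L} x_l(p) \right) \qquad (8) \]
where $B_{ij}$ is the boundary region of segment i and segment j.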
A global homogeneity measurement was set as the summation of the homogeneities of all initial segments weighted by their areas:
\[ H_g = \sum_{i=1}^{N} A_i \sigma_i \qquad (9) \]
where $A_i$ is the area of segment $R_i$, quantified by the number of pixels contained in the segment, $\sigma_i$ is the homogeneity of that segment, and $N$ is the number of initial segments. In order to consider every segment equally, the global homogeneity was divided by the sum of the areas of all segments to obtain the relative global homogeneity:
\[ \bar{H}_g = \frac{\sum_{i=1}^{N} A_i \sigma_i}{\sum_{i=1}^{N} A_i} \qquad (10) \]
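The STDV-based homogeneity measures above can be sketched with boolean masks over a label image. This is a hedged illustration rather than the authors' code: the helper names `stdv_homogeneity` and `relative_global_homogeneity` are hypothetical, the population standard deviation is assumed, and the inter-segment and boundary variants are obtained simply by passing the union mask or a user-supplied boundary mask.

```python
import numpy as np

def stdv_homogeneity(image, mask):
    """STDV of the band-averaged DN over the pixels selected by mask.

    image: array of shape (rows, cols, bands); mask: boolean (rows, cols).
    Assumption: population standard deviation (ddof=0).
    """
    band_mean = np.asarray(image, dtype=float).mean(axis=-1)
    return float(band_mean[mask].std())

def relative_global_homogeneity(image, labels):
    """Area-weighted mean of the per-segment STDV values."""
    ids = np.unique(labels)
    areas = np.array([(labels == k).sum() for k in ids])
    stds = np.array([stdv_homogeneity(image, labels == k) for k in ids])
    return float((areas * stds).sum() / areas.sum())
```

For a pair of adjacent segments with labels `i` and `j`, the inter-segment value would be `stdv_homogeneity(image, (labels == i) | (labels == j))`, and the boundary value would use a mask covering the boundary pixels of the pair.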
On the other hand, the local inter-segment homogeneity is defined as:
A smaller value of the local inter-segment homogeneity means less discrepancy between the adjacent segments $R_i$ and $R_j$. If segments $R_i$ and $R_j$ were merged, they would form a more homogeneous region compared with other adjacent segments. Higher local SA thresholds should therefore be allowed for adjacent segments with a smaller local inter-segment homogeneity. Different from the definition of local inter-segment homogeneity, local boundary homogeneity is defined as the ratio of the smoothness of the boundary region to the inter-segment homogeneity of its corresponding adjacent segments:
\[ H^{B}_{ij} = \frac{\sigma^{B}_{ij}}{\sigma_{ij}} \qquad (12) \]
where $\sigma^{B}_{ij}$ is the STDV over the boundary region of the pair and $\sigma_{ij}$ is the inter-segment STDV of the pair. A smaller value of $H^{B}_{ij}$ indicates that the boundary is a smoother transition region. Compared to a boundary with less homogeneity, merging the adjacent segments with a smaller $H^{B}_{ij}$ will produce a better result. So, a higher local SA threshold should be allowed for those adjacent segments with a smaller $H^{B}_{ij}$.
Based on the above analysis of inter-segment and boundary region homogeneities, the local homogeneity is formulated as the weighted average of the former and the latter in terms of the relative areas of the boundary regions and the internal regions of the initial partitions:
Finally, the adjusted local SA threshold for adjacent segments is refined by Equation (5). Adjacent segments with higher inter-segment and boundary homogeneities are thus allowed higher local SA thresholds and are more likely to be merged with each other.
2.4. Region Merging Using Adjusted Local SA Threshold
As described in Table 1, the only required input parameter of the proposed method was the preset SA threshold. The spectral distance of a pair of candidates is compared with their local SA threshold, and they are merged if and only if the spectral distance is smaller than the local SA threshold. The RAG was then updated, and the aforementioned steps were repeated until no pair of candidate segments had a spectral distance below its local SA threshold. It is worth noting that the preset SA threshold needs to be varied from 1 to 10 (at an interval of 1) to obtain the best-fitting segmentation.
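The iterative merging loop can be sketched as follows. This is a simplified illustration, not the procedure of Table 1: the names `angle_deg`, `merge_regions` and `threshold_fn` are hypothetical, segments are represented only by their mean spectrum and area, one pair is merged per pass, and the local SA threshold is supplied by the caller as a function of the two segment ids because its exact form (Equation (5)) is not reproduced here. Passing a constant function corresponds to a single global threshold.

```python
import numpy as np

def angle_deg(a, b):
    """Spectral angle in degrees between two mean spectra."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def merge_regions(means, areas, adjacency, threshold_fn):
    """Merge mutually best-fitting neighbours whose distance is below threshold.

    means, areas, adjacency: dicts keyed by integer segment id.
    threshold_fn(i, j): caller-supplied local SA threshold in degrees (assumption).
    """
    means = {k: np.asarray(v, dtype=float) for k, v in means.items()}
    areas = dict(areas)
    adj = {k: set(v) for k, v in adjacency.items()}
    while True:
        best = {i: min(nbrs, key=lambda j: angle_deg(means[i], means[j]))
                for i, nbrs in adj.items() if nbrs}
        candidates = [(angle_deg(means[i], means[j]), i, j)
                      for i, j in best.items() if i < j and best.get(j) == i]
        candidates = [c for c in candidates if c[0] < threshold_fn(c[1], c[2])]
        if not candidates:
            return means, areas, adj
        _, i, j = min(candidates)  # merge the closest qualifying pair first
        total = areas[i] + areas[j]
        means[i] = (areas[i] * means[i] + areas[j] * means[j]) / total
        areas[i] = total
        # Update the adjacency graph: neighbours of j become neighbours of i.
        for k in adj.pop(j):
            adj[k].discard(j)
            if k != i:
                adj[k].add(i)
                adj[i].add(k)
        del means[j], areas[j]
```

Recomputing the candidates after every merge keeps the sketch easy to verify, at the cost of efficiency; a practical implementation would update only the affected graph edges.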
The homogeneity of each segment and boundary was gradually decreased as the segments were iteratively merged. Consequently, the adaptive local SA thresholds decreased steadily and the merging process automatically converged to the final segmentation result.
2.5. Segment Evaluation
Numerous assessment methods have been proposed to evaluate the capability of image segmentation algorithms [35,36]. Among them, a region overlapping metric is often used due to its accurate expression of the geometric discrepancy. The over-segment error (OSE) and under-segment error (USE) are defined by Equations (14) and (15) in terms of the i-th reference polygon and its overlapped area with the j-th segment polygon. Both are defined within the range [0, 1], where 1 indicates perfectly matched segments. In GEOBIA, we need a one-to-one correspondence between the reference polygons and the corresponding segments, meaning that the most suitable segment must be selected to match each reference polygon from all candidates (any segment that overlaps that reference polygon). Therefore, the OSE and USE are combined to define the matching index (MI) to identify which candidate fits best to each reference polygon:
In the above equation, the MI varies from zero to one, where one indicates perfect overlap and zero means no overlap. The candidate polygon refers to the segment with the highest MI value among all candidates.
Since all fitting segments for the reference polygons are identified by Equation (16), the next question is how to evaluate the global geometric discrepancy. In this paper, in order to make the evaluation results easy to compare with relevant papers and the conclusions more convincing, the quality rate (QR) described by Weidner is adopted and modified as follows:
\[ QR = \frac{1}{N_r} \sum_{i=1}^{N_r} \left( 1 - \frac{|r_i \cap s_i|}{|r_i \cup s_i|} \right) \qquad (17) \]
where $N_r$ is the number of reference polygons and $s_i$ refers to the candidate polygon corresponding to reference polygon $r_i$. Note that although the areas of the reference polygons were different, all of them were given the same weight. The QR value varies linearly within the range [0, 1]. A lower value indicates less discrepancy with respect to the reference polygons, and thus a more accurate segmentation.
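As a hedged illustration of the evaluation step, the sketch below computes an equally weighted quality rate from boolean masks, assuming the commonly used form of Weidner's quality rate, one minus the ratio of intersection area to union area. The function name `quality_rate` and the mask-based representation of polygons are assumptions made for this example.

```python
import numpy as np

def quality_rate(reference_masks, segment_masks):
    """Mean of 1 - |r ∩ s| / |r ∪ s| over matched reference/segment pairs.

    Each mask is a boolean array of the image shape; the i-th segment mask
    is the candidate already matched to the i-th reference mask.
    """
    rates = []
    for ref, seg in zip(reference_masks, segment_masks):
        inter = np.logical_and(ref, seg).sum()
        union = np.logical_or(ref, seg).sum()
        rates.append(1.0 - inter / union)
    return float(np.mean(rates))
```

Because every reference polygon contributes one term to the mean, small and large references carry the same weight, which matches the equal-weight note in the text.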
3. Experiment Setups
In this section, we select a set of high resolution remote sensing images to validate the effectiveness of the proposed methods. For comparison, the relevant methods and their strategies are also described.
3.1. Data Sets
In order to accurately assess the relevant methods, geo-objects of various sizes and types were extracted from the test images as references. Two experienced experts in geographical mapping were hired to conduct this work. First, they each extracted the same set of references, then cross-checked the marked images and selected the controversial references. Finally, they discussed the controversial ones and decided whose partition was more reasonable. To reduce the discrepancy, a reference was discarded if they could not reach an agreement on it. In this way, the influence of subjective effects on the retained references was reduced.
Farmland division is a typical application scenario for image segmentation approaches. We chose a SPOT6 scene acquired on 15 December 2015, which was located in Al Wihda of Iraq. The size of the subset image consisted of pixels with a spatial resolution of 6 meters. In addition, it contains blue (450–525 nm), green (530–590 nm), red (625–695 nm), and near infrared (760–890 nm) spectral bands. Figure 3a illustrates the false-color image with a combination of near infrared, red, and green spectral components as R, G, and B (on WGS84 UTM coordinate system). Figure 3b shows the ground truth segments of 50 reference geo-objects.
Urban geo-object recognition is another significant application of image segmentation methods. A WorldView-2 subset image with pixels, acquired on 9 February, was selected as the typical image. The area captured by the image was located in Washington, DC, USA. It had a spatial resolution of 1.84 m and 8 spectral bands, including the coastal (400–450 nm), blue (450–510 nm), green (510–580 nm), yellow (585–625 nm), red (630–690 nm), red edge (705–745 nm), near infrared 1 (770–895 nm), and near infrared 2 (860–1040 nm) spectral bands. Figure 4a illustrates the false-color image with a combination of near infrared 1, red, and green as R, G, and B (on WGS84 UTM coordinate system). Figure 4b shows the ground truth segments of 50 reference geo-objects.
Compared with the above two applications, the segmentation of ground objects in rural areas usually encounters the problem of complex object types. A region called Rambla Can Bell near Barcelona was selected as the study area. Its remote sensing image (Figure 5a) was obtained by the SPOT6 satellite mentioned above on 7 December 2012 with a size of pixels and a spatial resolution of 6 m. To evaluate the proposed method efficiently, two subarea images (Figure 5b,d) with a size of pixels were chosen. The ground truth segments of 25 reference geo-objects for each subarea image are shown in Figure 5c,e.
As shown in Figure 3b, Figure 4b and Figure 5c,e, all images contain geo-objects of various sizes and types, which are sufficient to support the evaluation of image segmentation. Among them, the rural areas show more types of geo-objects, which indicates that segmenting rural images is more challenging. Their basic information is provided in Table 2. Figure 6 shows histograms of the areas of the reference polygons in the farmland (Figure 6a), urban (Figure 6b), and rural (Figure 6c) images.
3.2. Methods for Comparison
To the best of our knowledge, the method proposed by Yang et al. was the first one focusing on how to merge initial segments based on local SA thresholds. Thus, we compare the method proposed in this paper to theirs. Yang et al. described the relation between local homogeneity and local SA threshold by Equation (4). The local homogeneity can be formulated as:
A lower value of the local homogeneity indicates that the segment is a more homogeneous region and deserves a higher local threshold.
A global SA threshold method was also implemented in a similar way to typical methods [6,24]. By setting the local homogeneity in Equation (4) to 1, the global SA threshold is deduced:
Note that the global SA threshold method neglects the influence of the interior and boundary information on region merging, and the accuracy of the method relies only on the preset threshold.
4. Results
In this section, we assess and compare the performance of the global SA threshold (hereinafter referred to as GSA) method, the local SA threshold (hereinafter referred to as LSA) method and the adaptive local SA threshold aided by inter-segment and boundary homogeneities (hereinafter referred to as LSAH) method. The effectiveness of the proposed method is demonstrated by both indicator evaluation and visual evaluation.
4.1. Farmland Area
According to the evaluation of QR shown in Figure 7, the best segmentation was obtained when the preset SA thresholds were 3, 3 and 4, respectively, for the GSA, LSA, and LSAH methods. Compared to the GSA and LSA methods (QR: 0.2362 and 0.2128), the LSAH method (QR: 0.1637) reduced the discrepancy in areas by 0.0725 and 0.0491. In addition, the LSA method was more accurate than the GSA method (QR of 0.2128 vs. 0.2362). The best segmentation results obtained by the aforementioned three methods were selected for detailed analysis of the MI and QR values of every reference polygon, where the MI and QR values were calculated by Equations (16) and (17). As shown in the first column of Figure 8, the scatter plots were produced from the MI and QR values of all candidate polygons. In the second and third columns of Figure 8, the reference polygons were drawn according to their MI and QR values, respectively.
To further evaluate the three segmentation methods, two subsets were selected from the image for visual comparison. Each reference polygon and the corresponding segment in the two subsets were labeled with a yellow solid line and a white solid line, respectively. As can be seen from the segmentation results of the GSA and LSA methods, although they successfully recognize some references, over-segmentation (Figure 9c,f) and under-segmentation (Figure 9b,g) still exist. The LSAH method performs better than the GSA and LSA methods because it distinguishes geo-objects of different types and sizes well (Figure 9d,h), and the segments it produces are closer to the reference objects.
4.2. Urban Area
As shown in Figure 10, according to the evaluation of the QR value, the GSA, LSA and LSAH methods obtained the best segmentation when the preset SA thresholds were 3, 2 and 4, respectively. Compared to the GSA and LSA methods (QR: 0.2275 and 0.1945), the LSAH method (QR: 0.1615) reduced the discrepancy in area by 0.066 and 0.033. Moreover, the LSA method performed better than the GSA method (QR of 0.1945 vs. 0.2275). To further describe and analyze the MI and QR values of each reference polygon when the three methods obtained their best segmentation results, the scatter plots were produced in the first column of Figure 11 and the reference polygons were colored in the second and third columns of Figure 11 according to the corresponding MI and QR values.
In order to compare the segmentation results visually, we chose three subset images from the segmentation results and labeled each reference polygon with a yellow solid line and each corresponding segment with a white solid line. The GSA method performed worse than the LSA method because one of the rooftops was under-segmented in Figure 12b, while it matched the reference in Figure 12c. The LSAH method was the only one that produced the matching tree crown and building in Figure 12d,l, while the GSA method over-segmented the asphalt and woods in Figure 12b,j, and the LSA method under-segmented the tree crown in Figure 12c but over-segmented the building in Figure 12k.
4.3. Rural Area
As indicated by the QR values of the three methods in Figure 13, the GSA and LSA methods both obtained their best segmentation when the preset SA threshold was 3, while the LSAH method obtained its best segmentation when the preset SA threshold was 5. Compared to the GSA and LSA methods (QR: 0.3778 and 0.3277), LSAH received the best evaluation (QR: 0.2711), which reduced the discrepancy in area by 0.1067 and 0.0566. Consistent with the results above, the LSA method was more accurate than the GSA method (QR of 0.3277 vs. 0.3778). The best segmentations of the three methods were further analyzed and displayed through the MI and QR values of every reference geo-object. The scatter plots were produced in the first column of Figure 14 and the reference polygons were drawn in the second and third columns of Figure 14.
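The QR reductions reported for the three study areas can be re-derived from the best QR values quoted in the text. The short sketch below only repeats that arithmetic; the dictionary layout and the function name `qr_reduction` are illustrative choices, and the numbers are those stated in Sections 4.1 to 4.3.

```python
# Best QR values per study area, as quoted in the text.
best_qr = {
    "farmland": {"GSA": 0.2362, "LSA": 0.2128, "LSAH": 0.1637},
    "urban": {"GSA": 0.2275, "LSA": 0.1945, "LSAH": 0.1615},
    "rural": {"GSA": 0.3778, "LSA": 0.3277, "LSAH": 0.2711},
}

def qr_reduction(scores, baseline, proposed="LSAH"):
    """Absolute QR reduction of the proposed method relative to a baseline."""
    return round(scores[baseline] - scores[proposed], 4)

for area, scores in best_qr.items():
    print(area, qr_reduction(scores, "GSA"), qr_reduction(scores, "LSA"))
```

The printed differences agree with the reductions stated for each area (for example, 0.0725 and 0.0491 for the farmland image).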
In Figure 15, two subgraphs were selected from the segmentation results of T1 and T2 for visual comparison, respectively. We labeled each reference polygon with a yellow solid line and each corresponding segment with a white solid line. In the first and fourth row of Figure 15, the GSA method over-segmented the cropland and grass while LSA and LSAH methods obtained relatively matching segmentations compared to reference polygons. In the second and third rows of Figure 15, buildings and woods were segmented well using LSAH method. In contrast, some of these areas were under-segmented by GSA method in Figure 15f,j, and the segmentation results produced by LSA method showed both over-segmentation and under-segmentation in Figure 15g,k.
5. Discussion
Geo-objects of different sizes and types have to be recognized in different GEOBIA applications [39,40,41,42]. Therefore, image segmentation, which is a necessary prerequisite step in GEOBIA and has a significant influence on the subsequent image processing, should be able to handle this problem. In recent years, researchers have focused on multiscale segmentation methods [10,43,44]. One strategy among them, referred to as segment-merging, employs the edge information to produce initial partitions and then uses the interior information to conduct the merging stage, and it produces promising results. Under this framework, we utilized the watershed transformation to acquire initial segments and the SA threshold to guide region merging. In particular, we studied the effect of inter-segment and boundary region homogeneities on the local SA threshold and demonstrated the superiority of the proposed method over the GSA and LSA methods.
As observed from Figure 7, Figure 10 and Figure 13, although the GSA, LSA and LSAH methods start from the same initial segments, the latter two methods perform better than the former at their optimal segmentations, and they also perform better at most other SA thresholds. This indicates the importance and effectiveness of considering homogeneity during the region merging process using SA thresholds. Although the latter two methods both consider the homogeneity of segments, the LSAH method achieves better segmentation than the LSA method. This supports the benefit of refining the local SA threshold by the inter-segment and boundary region homogeneities weighted by their relative areas.
As shown in Figure 8, Figure 11 and Figure 14, the MI and QR values of the above three methods behave differently as the size of the geo-objects increases. Although the GSA method performs worse than the latter two, its performance is relatively stable, because it judges whether adjacent regions should be merged by comparing the preset SA threshold with the spectral distance between regions, which is not susceptible to the size of geo-objects. Of the latter two, the LSA method is more sensitive to the sizes of reference polygons when their areas are small. This may be due to the fact that the merging criterion of the LSA method is limited only by the internal homogeneities of regions once the preset SA threshold is determined, while the internal homogeneity is easily affected by aberration-like phenomena when the region area is small. LSAH is more robust to this phenomenon due to the additional consideration of the homogeneity of adjacent regions and boundaries.
As can be seen from the visual comparison, such as the cropland and the building in Figure 9a, the roof and asphalt in Figure 12i, and the woods and open field in Figure 15i, the LSAH method was able to delineate geo-objects of more varied sizes in different applications than the GSA and LSA methods. It can also be seen in the scatter plots in the first column of Figure 8, Figure 11 and Figure 14 that the MI and QR values of the LSAH method were more robust to the variation in size of the reference polygons than those of the GSA and LSA methods. Therefore, we expected that the LSAH method is able to delineate small and large geo-objects simultaneously in different scenes, and that this ability is stronger than that of the other two methods. To further support this view, we calculated the standard deviations of the segment sizes of the best segmentations obtained by the GSA, LSA, and LSAH methods for the farmland, urban and rural areas, respectively (Figure 16). Among all methods, LSAH has the highest standard deviation in each study area, which means it divides the image into segments of more varied sizes. In addition, it should be noted that the standard deviations of the three methods are very close, which explains from another angle why the QR values of the three methods do not show large differences.
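The segment-size statistic used in this comparison can be computed directly from a label image. The following is a minimal sketch, assuming the segmentation is stored as an integer label array, that size is counted in pixels, and that the population standard deviation is used; the function name `segment_size_std` is hypothetical.

```python
import numpy as np

def segment_size_std(labels):
    """Standard deviation of segment areas (pixel counts) in a label image."""
    _, counts = np.unique(labels, return_counts=True)
    return float(counts.std())
```

A larger value indicates that the segmentation contains a wider spread of segment sizes, which is the property compared across methods in Figure 16.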
For the GSA method, which is guided by a single preset SA threshold, a lower SA threshold would be needed to prevent dissimilar adjacent regions from merging and a higher one to encourage similar adjacent regions to merge. In the LSA method, the adaptive SA threshold is affected only by the interior homogeneity of the regions and does not reflect the discrepancy between adjacent regions. As a result, a region receives the same adaptive SA threshold with respect to all of its adjacent regions, which may not effectively prevent inappropriate merging. In contrast, the LSAH method considers both the inter-segment homogeneities and the boundary homogeneities of the regions, so it can produce a different adaptive SA threshold for a region and each of its adjacent regions and thereby better describe the influence of the discrepancy between adjacent regions on the SA threshold. In this way, erroneous merges are prevented to some extent, which leads to better performance under the QR criterion. The success of the LSAH method shows that combining boundary homogeneity and inter-segment homogeneity is a better way to evaluate the homogeneity of regions than the approaches used by the GSA and LSA methods.
In this paper, we propose a region merging method guided by local SA thresholds. The method assigns an adaptive spectral angle threshold to each pair of adjacent segments by adjusting a preset spectral angle threshold according to the homogeneity of the adjacent segments and their boundary regions. Our work can be summarized as follows. First, based on the initial segments obtained by the watershed transformation, we employ the region adjacency graph (RAG) to define adjacent regions. Next, we quantify the spectral distance and set an initial SA threshold. Then, the preset SA threshold is refined by the local homogeneity of adjacent regions, which is quantified as the average of the inter-segment and boundary homogeneities weighted by the relative areas of the boundary and internal regions, to obtain adaptive local SA thresholds. After that, the region under consideration is merged when the minimum spectral distance is less than the adaptive local SA threshold. Finally, the algorithm scans all regions and repeats the aforementioned steps to complete the merging stage and produce the final segmentation result. The superiority of the proposed method over the GSA and LSA methods was verified in typical remote sensing applications. Experimental results also show that the proposed method can effectively recognize different kinds of reference objects, which illustrates its potential to benefit many subsequent applications in the GEOBIA framework.
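The merging stage summarized above can be sketched as a greedy loop over a region adjacency graph. This is an illustrative simplification, not the authors' implementation: the data structures, the function names and the `pair_threshold` callback are hypothetical, and the exact rule that turns the local homogeneity into an adaptive threshold is left to the caller because it is not restated in this section. The area-weighted average in `local_homogeneity` is one reading of the weighting described in the text.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle in radians between two spectra; smaller means more similar."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def local_homogeneity(h_inter, h_boundary, area_inner, area_boundary):
    """Average of the two homogeneities weighted by relative area (assumption)."""
    total = area_inner + area_boundary
    return (area_inner * h_inter + area_boundary * h_boundary) / total

def merge_regions(means, areas, rag, pair_threshold):
    """Greedy merging; returns {initial region id: final region id}.

    means: {id: mean spectrum}, areas: {id: pixel count},
    rag: {id: set of adjacent ids},
    pair_threshold(i, j): hypothetical callback giving the adaptive SA
    threshold (radians) for the adjacent pair (i, j).
    """
    means = {r: np.asarray(m, dtype=float) for r, m in means.items()}
    areas = dict(areas)
    rag = {r: set(n) for r, n in rag.items()}
    owner = {r: r for r in means}
    changed = True
    while changed:  # rescan all regions until no merge happens
        changed = False
        for i in sorted(means):
            if i not in means or not rag[i]:
                continue
            # Most similar neighbour, i.e. minimum spectral distance.
            j = min(rag[i], key=lambda n: spectral_angle(means[i], means[n]))
            if spectral_angle(means[i], means[j]) < pair_threshold(i, j):
                total = areas[i] + areas[j]
                means[i] = (areas[i] * means[i] + areas[j] * means[j]) / total
                areas[i] = total
                for n in rag[j]:  # rewire the neighbours of j to i
                    rag[n].discard(j)
                    if n != i:
                        rag[n].add(i)
                        rag[i].add(n)
                del means[j], areas[j], rag[j]
                for r, o in owner.items():
                    if o == j:
                        owner[r] = i
                changed = True
    return owner

# Toy chain of four initial segments: 0 - 1 - 2 - 3.
means = {0: [1.0, 0.10], 1: [1.0, 0.12], 2: [0.10, 1.0], 3: [0.12, 1.0]}
areas = {0: 10, 1: 10, 2: 10, 3: 10}
rag = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
owner = merge_regions(means, areas, rag, lambda i, j: 0.1)
print(owner)
```

With a constant callback, as in this toy call, the loop reduces to a global-threshold rule; a per-pair callback built on `local_homogeneity` is where the adaptive behaviour would enter.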
However, as with other refined-threshold methods, more research is needed on how to select the optimal preset SA threshold automatically, which would bring this kind of method closer to an unsupervised strategy. Other open issues are how to balance the weights of inter-segment homogeneity and boundary homogeneity in region merging and how to use other metrics to evaluate local homogeneity.
Y.Z., X.M. and T.X. conceived of the region merging method aided by inter-segment and boundary homogeneities. Y.Z. and X.W. performed the experiments. C.X. and H.T. were responsible for data analysis. Y.Z. wrote the manuscript and all authors revised the final manuscript. In addition, T.X. is the corresponding author.
This work was supported by the Major Science Instrument Program of the National Natural Science Foundation of China under Grant 61527802, the General Program of the National Natural Science Foundation of China under Grants 61371132 and 61471043, and the National Natural Science Foundation of China under Grant 61471123.
The authors would like to thank Tingfa Xu for the support.
Conflicts of Interest
The authors declare that they have no competing interests.
- Blaschke, T.; Hay, G.J.; Kelly, M.; Lang, S.; Hofmann, P.; Addink, E.; Feitosa, R.Q.; Meer, F.V.D.; Werff, H.V.D.; Coillie, F.V. Geographic Object-Based Image Analysis—Towards a new paradigm. ISPRS J. Photogramm. Remote Sens. 2014, 87, 180–191.
- Chen, G.; Weng, Q.; Hay, G.J.; He, Y. Geographic object-based image analysis (GEOBIA): Emerging trends and future opportunities. Gisci. Remote Sens. 2018, 55.
- Blaschke, T.; Lang, S.; Hay, G. Object-Based Image Analysis: Spatial Concepts For Knowledge-Driven Remote Sensing Applications; Springer Science & Business Media: Berlin, Germany, 2008.
- Lang, S. Object-Based Image Analysis for Remote Sensing Applications: Modeling Reality—Dealing with Complexity; Springer: Berlin, Germany, 2008; pp. 3–27.
- Blaschke, T. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 2010, 65, 2–16.
- Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geosci. Model Dev. Discuss. 2015, 8, 2271–2312.
- Berhane, T.M.; Lane, C.R.; Wu, Q.; Anenkhonov, O.A.; Chepinoga, V.V.; Autrey, B.C.; Liu, H. Comparing Pixel-and Object-Based Approaches in Effectively Classifying Wetland-Dominated Landscapes. Remote Sens. 2017, 10, 46.
- Inglada, J. Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features. ISPRS J. Photogramm. Remote Sens. 2007, 62, 236–248.
- Jia, H.; Xing, Z.; Song, W. Three Dimensional Pulse Coupled Neural Network Based on Hybrid Optimization Algorithm for Oil Pollution Image Segmentation. Remote Sens. 2019, 11, 1046.
- Trias-Sanz, R.; Stamon, G.; Louchet, J. Using colour, texture, and hierarchial segmentation for high-resolution remote sensing. ISPRS J. Photogramm. Remote Sens. 2008, 63, 156–168.
- Radoux, J.; Bourdouxhe, A.; Coos, W.; Dufrêne, M.; Defourny, P. Improving Ecotope Segmentation by Combining Topographic and Spectral Data. Remote Sens. 2019, 11, 354.
- Canny, J. A Computational Approach to Edge Detection. In Readings in Computer Vision; Morgan Kaufmann: San Francisco, CA, USA, 1987; pp. 184–203.
- Bischof. Seeded Region Growing. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 16, 641–647.
- Chen, B.; Qiu, F.; Wu, B.; Du, H. Image Segmentation Based on Constrained Spectral Variance Difference and Edge Penalty. Remote Sens. 2015, 7, 5980–6004.
- Baatz, M. Multi resolution Segmentation: An optimum approach for high quality multi scale image segmentation. In Beiträge zum AGIT-Symposium; Springer: Salzburg, Germany, 2000; pp. 12–23.
- Liu, J.; Li, P.; Wang, X. A new segmentation method for very high resolution imagery using spectral and morphological information. ISPRS J. Photogramm. Remote Sens. 2015, 101, 145–162.
- Pavlidis, T.; Liow, Y.T. Integrating region growing and edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 225–233.
- Cortez, D.; Nunes, P.; Sequeira, M.M.D.; Pereira, F. Image segmentation towards new image representation methods. Signal Process. Image Commun. 1995, 6, 485–498.
- Zhang, X.; Xiao, P.; Feng, X. Fast Hierarchical Segmentation of High-Resolution Remote Sensing Image with Adaptive Edge Penalty. Photogramm. Eng. Remote Sens. 2015, 80, 71–80.
- Zhang, X.; Xiao, P.; Feng, X.; Wang, J.; Wang, Z. Hybrid region merging method for segmentation of high-resolution remote sensing images. ISPRS J. Photogramm. Remote Sens. 2014, 98, 19–28.
- Mitra, P.; Shankar, B.U.; Pal, S.K. Segmentation of multispectral remote sensing images using active support vector machines. Pattern Recognit. Lett. 2004, 25, 1067–1074.
- Beucher, S.; Mathmatique, C.D.M. The Watershed Transformation Applied To Image Segmentation. Scanning Microsc. Suppl. 1992, 6, 299–314.
- Toro, C.; Martín, C.; Pedrero, A.; Ruiz, E. Superpixel-Based Roughness Measure for Multispectral Satellite Image Segmentation. Remote Sens. 2015, 7, 14620–14645.
- Benz, U.C.; Hofmann, P.; Willhauck, G.; Lingenfelder, I.; Heynen, M. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS J. Photogramm. Remote Sens. 2004, 58, 239–258.
- Comaniciu, D.; Meer, P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
- Su, T. A novel region-merging approach guided by priority for high resolution image segmentation. Remote Sens. Lett. 2017, 8, 771–780.
- Yang, J.; He, Y.; Weng, Q. An Automated Method to Parameterize Segmentation Scale by Enhancing Intrasegment Homogeneity and Intersegment Heterogeneity. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1282–1286.
- Wang, Y.; Meng, Q.; Qi, Q.; Yang, J.; Liu, Y. Region Merging Considering Within- and Between-Segment Heterogeneity: An Improved Hybrid Remote-Sensing Image Segmentation Method. Remote Sens. 2018, 10, 781.
- Yang, J.; He, Y.; Caspersen, J. Region merging using local spectral angle thresholds: A more accurate method for hybrid segmentation of remote sensing images. Remote Sens. Environ. 2017, 190, 137–148.
- Yang, J.; He, Y.; Caspersen, J. A multi-band watershed segmentation method for individual tree crown delineation from high resolution multispectral aerial image. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; pp. 1588–1591.
- Li, P.; Guo, J.; Song, B.; Xiao, X. A Multilevel Hierarchical Image Segmentation Method for Urban Impervious Surface Mapping Using Very High Resolution Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2011, 4, 103–116.
- Kruse, F.A.; Lefkoff, A.; Boardman, J.; Heidebrecht, K.; Shapiro, A.; Barloon, P.; Goetz, A. The spectral image processing system (SIPS)—interactive visualization and analysis of imaging spectrometer data. Remote Sens. Environ. 1993, 44, 145–163.
- Vincent, L.; Soille, P. Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991, 13, 583–598.
- Trémeau, A.; Colantoni, P. Regions adjacency graph applied to color image segmentation. IEEE Trans. Image Process. 2002, 9, 735–744.
- Zhang, X.; Feng, X.; Xiao, P.; He, G.; Zhu, L. Segmentation quality evaluation using region-based precision and recall measures for remote sensing images. ISPRS J. Photogramm. Remote Sens. 2015, 102, 73–84.
- Su, T.; Zhang, S. Local and global evaluation for remote sensing image segmentation. ISPRS J. Photogramm. Remote Sens. 2017, 130, 256–276.
- Clinton, N.; Holt, A.; Scarborough, J.; Yan, L.; Gong, P. Accuracy Assessment Measures for Object-based Image Segmentation Goodness. Photogramm. Eng. Remote Sens. 2010, 76, 289–299.
- Weidner, U. Contribution to the assessment of segmentation quality for remote sensing applications. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2008, 37, 479–484.
- Benedek, C.; Descombes, X.; Zerubia, J. Building Development Monitoring in Multitemporal Remotely Sensed Image Pairs with Stochastic Birth-Death Dynamics. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 33–50.
- Grinias, I.; Panagiotakis, C.; Tziritas, G. MRF-based segmentation and unsupervised classification for building and road detection in peri-urban areas of high-resolution satellite images. ISPRS J. Photogramm. Remote Sens. 2016, 122, 145–166.
- Chen, R.; Li, X.; Li, J. Object-Based Features for House Detection from RGB High-Resolution Images. Remote Sens. 2018, 10, 451.
- Shepherd, J.D.; Bunting, P.; Dymond, J.R. Operational Large-Scale Segmentation of Imagery Based on Iterative Elimination. Remote Sens. 2019, 11, 658.
- Basaeed, E.; Bhaskar, H.; Hill, P.; Al-Mualla, M.; Bull, D. A supervised hierarchical segmentation of remote-sensing images using a committee of multi-scale convolutional neural networks. Int. J. Remote Sens. 2016, 37, 1671–1691.
- Fu, Z.; Sun, Y.; Fan, L.; Han, Y. Multiscale and Multifeature Segmentation of High-Spatial Resolution Remote Sensing Images Using Superpixels with Mutual Optimal Strategy. Remote Sens. 2018, 10, 1289.
Figure 1. The flowchart of the proposed segment-merging method. It shows the visual representation of its iterative process.
Figure 2. Schematic representation of region merging. Small squares with thin black frames represent the pixels. Large squares with thick black frames represent the initial segments.
Figure 3. Farmland image used for experiments: (a) represents the false-color image of farmland. (b) shows the ground truth segments of the farmland.
Figure 4. Urban image used for experiments: (a) represents the false-color image of urban image. (b) shows the ground truth segments of the urban image.
Figure 5. Rural image used for experiments: (a) overview of the rural area, shown as a false-color image with the near infrared, red and green spectral components as R, G, and B (on the WGS84 UTM coordinate system). (b,d) show the false-color images of the two subarea images T1 and T2. (c,e) show the ground truth segments of each subarea image.
Figure 6. The distribution of the reference polygons: (a) the farmland area, (b) the urban area, (c) the rural area. The reference polygons were measured in pixels.
Figure 7. The quality rate (QR) values of the farmland image. It shows the QR values with respect to the preset spectral angle (SA) threshold for the three methods mentioned above.
Figure 8. Evaluation for the best corresponding segment of farmland image. From the first to the third row, the figures correspond to evaluation results produced by the global SA (GSA), local SA (LSA) and adaptive local SA threshold aided by inter-segment and boundary homogeneities (LSAH) methods. The first column (a,d,g) shows the scatter plots for QRs and matching indices (MIs). The second (b,e,h) and the third columns (c,f,i) represent the MIs and QRs for each candidate polygon, respectively.
Figure 9. Subsets of the farmland image. The reference polygons are shown in (a,e) and the best corresponding segments of the GSA, LSA, and LSAH methods are respectively shown in (b,f), (c,g), and (d,h).
Figure 10. The QR values of the urban image. It shows the QR values with respect to the preset SA threshold for the three methods mentioned above.
Figure 11. Evaluation for the best corresponding segment of urban image. From the first to the third row, the figures correspond to evaluation results produced by the GSA, LSA, and LSAH methods. The first column (a,d,g) shows the scatter plots for QRs and MIs. The second (b,e,h) and the third columns (c,f,i) represent the MIs and QRs for each candidate polygon, respectively.
Figure 12. Subsets of the urban image. The reference polygons are shown in (a,e,i) and the best corresponding segment of the GSA, LSA, and LSAH methods are respectively shown in (b,f,j), (c,g,k), and (d,h,l).
Figure 13. The QR values of the rural image. It shows the QR values with respect to the preset SA threshold for the three methods mentioned above.
Figure 14. Evaluation for the best corresponding segment of the rural image. The figures of the first three rows and the last three rows correspond to the evaluation results produced by the GSA, LSA, and LSAH methods of the subareas T1 and T2. The first column (a,d,g,j,m,p) shows the scatter plots for QRs and MIs. The second (b,e,h,k,n,q) and the third columns (c,f,i,l,o,r) represent the MIs and QRs for each candidate polygon, respectively.
Figure 15. Subsets of the rural images T1 and T2. The reference polygons shown in (a,e) are selected from T1, and the reference polygons shown in (i,m) are selected from T2. The best corresponding segments of the GSA, LSA, and LSAH methods are respectively shown in (b,f,j,n), (c,g,k,o), and (d,h,l,p).
Figure 16. Standard deviations of segment sizes by the GSA, LSA, and LSAH segmentations for the farmland, urban, and rural area, respectively.
Table 1. Algorithm for region merging using local spectral angle (SA) threshold aided by inter-segment and boundary homogeneities.
|Input data: Initial segmentation|
|Input parameter: Spectral angle|
|Output: Final segmentation result|
Table 2. The basic information of the farmland, the urban, and the rural images.
|Image||Number of Reference Polygons||Average Area (Pixel)||Min Area (Pixel)||Max Area (Pixel)||Main Land Cover Types of the Reference Polygons|
|farmland|| || || || ||Cropland, Open field|
|urban||50||369.39||12||2006||Building, Road, Woods, Open field, Asphalt|
|rural||50||175.34||14||825||Building, Open field, Woods, Asphalt, Grass, Water, Cropland|
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).