A Region Merging Segmentation with Local Scale Parameters: Applications to Spectral and Elevation Data

Region merging is the most effective method for the segmentation of remote sensing data. The quality and size of the resulting image objects are controlled by a global heterogeneity threshold, termed the scale parameter. However, the multidimensional nature of the visible features in a scene defies the use of even an optimal single global scale parameter. In this study, a novel region merging segmentation method is proposed, where a local scale parameter is defined for each image object by its internal and external heterogeneity measures (i.e., local variance and Moran's I). This method allows image objects with low internal and external heterogeneity to be further merged with higher scale parameter values, since they are more likely to be part of an adjacent object than objects with high internal and external heterogeneity. The proposed method was applied to spectral and elevation data, and its results were evaluated visually and with supervised and unsupervised evaluation methods. The comparison with multi-resolution segmentation (MRS) showed that the proposed region merging method can produce improved segmentation results in terms of maximizing intra-object homogeneity and inter-object heterogeneity, as well as in the delimitation of specific target objects present in spectral and elevation data. The unsupervised evaluation results for the (1) Côte d'Azur, (2) Manchester, and (3) Szada images from the SZTAKI-INRIA building detection dataset showed that the proposed method (overall goodness, OG_f (1): 0.7375, (2): 0.7923, (3): 0.7967) performs better than MRS (OG_f (1): 0.7224, (2): 0.7648, (3): 0.7823). The higher values of OG_f indicate the ability of the proposed method to produce segmentation results with reduced over-segmentation effects and without the need for presegmented input data, in contrast to the objective heterogeneity and relative homogeneity (OHRH) hybrid segmentation method (OG_f (1): 0.5864, (2): 0.5151, (3): 0.6983).


Introduction
Object-based image analysis (OBIA) has been gaining prominence as an alternative solution to traditional pixel-based methods. Blaschke et al. [1] confirmed that OBIA is a new paradigm in the analysis of high spatial resolution remote sensing images. They discussed the limitations of pixel-based methods and defined the core concepts and advantages of OBIA. The basic processing units of OBIA are spatially continuous, disjoint, and homogeneous (in one or more dimensions of a feature space) regions, called segments or image objects. The use of image objects as the basic analysis unit can improve the classification of remote sensing data by incorporating semantics (i.e., integration of expert knowledge) and hierarchical networks. Furthermore, the ability of OBIA to reduce the spectral variance within image objects can moderate the 'salt and pepper' noise compared with pixel-based methods and produce more visually consistent results [1].
Image segmentation of remote sensing data is an important step in the OBIA workflow. Hay and Castilla [2] pointed out that its objective is not the extraction of semantic objects, but the extraction of primitive image objects, which are internally homogeneous and externally heterogeneous with respect to their neighboring objects. Within OBIA, the transformation of image objects to representations of semantic objects is a complex procedure, which includes iterative loops of classification and processing using both their internal features and their mutual relationships [2]. Therefore, the quality of image objects is essential for their correct processing and classification.
A variety of image segmentation methods has been developed. According to Schiewe [3], these methods can be grouped into three categories: edge-based, region-based, and hybrid segmentation methods. The edge-based segmentation methods are designed to detect discontinuities between the objects and thus determine the boundaries between adjacent objects. In contrast, region-based methods are designed to detect neighboring pixels that have similar digital values and thus belong to the same object. In hybrid methods, the basic idea is to merge pre-existing objects determined with the help of any segmentation technique. This study focuses on a specific type of region-based method termed region merging.
Meinel and Neubert [4] compared several segmentation methods and concluded that region merging is the most effective image segmentation method for the analysis of high-resolution remote sensing images. A characteristic region merging method, and probably the most widely used one within the OBIA domain, is multi-resolution segmentation (MRS) [5], which is available in the eCognition® software package. The basic procedure in MRS and most region merging segmentation methods is the merging of smaller objects into larger ones during a local optimization procedure that minimizes the average heterogeneity of image objects. The average size and the number of objects are controlled by a global heterogeneity threshold, termed the scale parameter (SP) [5]. Remote sensing data can be segmented at many different segmentation scales by changing the SP. Given the multidimensional nature of the visible features in a scene, region merging methods can produce multiscale segmentations by setting different SP values. Smaller values of SP will produce smaller homogeneous objects, while larger values will produce larger and more heterogeneous objects.
Kim et al. [6] investigated the effect of scale on segmentation and classification quality. They concluded that the average size of image objects critically affects image classification accuracy and that the selection of an optimum SP value is a crucial decision in segmenting remote sensing data. Usually, a trial-and-error strategy is used to select the optimum SP value, which can create image objects with sizes similar to the average size of the target objects. In the last decade, several supervised and unsupervised methods for the automated identification of the optimum SP value have been developed. In supervised methods, the optimum SP value is the one that can create image objects similar to reference target objects. In unsupervised methods, internal and external heterogeneity measures are computed for multiple segmentation results. The best segmentation result, created by the relevant optimum SP, is composed of image objects that satisfy the condition of maximizing internal homogeneity and external heterogeneity. Espindola et al. [7] measured the intra-object homogeneity using the weighted variance of the objects and the inter-object heterogeneity using the global Moran's I. The two values were normalized and combined to define the optimal SP for the segmentation. However, scenes usually consist of different types of visible features that vary in size, and the use of even an optimum single SP will possibly split some features into multiple objects (i.e., over-segmentation) and/or merge them together with other neighboring features in a single object (i.e., under-segmentation). In this context, Liu et al. [8] reviewed the state-of-the-art methods for the identification of SP and concluded that multiscale segmentation strategies are preferable to single-scale segmentations. However, Liu et al. [8] pointed out that the objective and automated selection and combination of multiple optimum SPs for the segmentation of remote sensing data is a key and urgent scientific problem that needs to be resolved first.
Aside from multiscale segmentation strategies, several local post-segmentation optimization methods [9,10] have been proposed, which try to further improve the quality of segmentation and reduce over- and under-segmentation problems. Johnson and Xie [9] identified the optimal SP for MRS segmentation with global statistical heterogeneity measures. Once the segmentation process was performed, the over- and under-segmented objects were identified with local intra-object and inter-object heterogeneity statistics and were further refined by appropriate splitting and merging. This post-segmentation refinement strategy was effective in terms of improving segmentation quality because it moderated under- and over-segmentation problems. However, the additional steps for the identification of inappropriate objects and the subsequent splitting and merging steps require tuning by the user and make this method less objective [11].
An alternative solution to cope with scene complexity could be a segmentation approach that incorporates the local intra-object and inter-object heterogeneity measures within the merging process. Such solutions have been adopted in two hybrid segmentation methods [11,12]. Yang et al. [11] developed a hybrid segmentation method, where the initial objects derived by an edge-based segmentation method were merged with local spectral angle thresholds. These thresholds were adjusted according to the internal heterogeneity of each object. On the other hand, Wang et al. [12] proposed the consideration of both inter- and intra-object heterogeneity in the computation of the merging costs in the region merging step of a hybrid segmentation method. Even though these studies showed promising results, Wang et al. [12] noted that the quality of the final segmentation result depends on the quality of the initial objects obtained from the previous edge-based segmentation step. Thus, hybrid methods require the critical parameterization of two segmentation steps, in contrast to region merging methods, which require only one. Therefore, the main research question that this study tries to answer is how effective the incorporation of local heterogeneity measures is within the merging process of a pure region merging segmentation method (i.e., without the requirement of presegmented input data).
This study describes a novel region merging method that employs local SP values defined by both the internal and external heterogeneity of each object. This method allows objects that are homogeneous and similar to their neighbors to have a higher SP value than objects that are heterogeneous and distinguishable from their neighbors, since the former are more likely to be part of their adjacent objects. The purpose of this segmentation method is the creation of image objects, which are the basic units for further classification and refinement procedures in the OBIA workflow. Therefore, the proposed method must have some important properties in order to be efficient: (1) it can produce objects that satisfy the condition of maximizing internal homogeneity and external heterogeneity; (2) it can create segmentations at various scales, in order to delimit multiscale objects in a flexible and adaptive way; and (3) it can handle various types of remote sensing data (e.g., spectral, elevation data). The quality assessment of the proposed method was performed through comparison with other well-known and state-of-the-art image segmentation methods, namely MRS [5] and the hybrid method of Wang et al. [12], which also incorporates the intra-object and inter-object heterogeneity statistical measures within the merging process. The segmentation methods were applied to two remote sensing applications with spectral and elevation datasets. Additionally, the SZTAKI-INRIA benchmark dataset [13] was employed for this assessment.

Overview
The proposed region merging method attempts to maximize the intra-object homogeneity and the inter-object heterogeneity of image objects with the inclusion of dynamically defined local SPs (SP_local). The process of the proposed region merging segmentation method is illustrated in Figure 1 and can be summarized as follows. The process starts with each pixel forming one image object (i.e., each pixel is labelled as a separate object) with four edges (i.e., four-neighbor connectivity). Then, the neighboring objects are identified and the merging costs of adjacent object pairs are computed (Section 2.2.1). The merging cost (MC) is based on local homogeneity criteria, describing the spectral similarity of neighboring image objects. Furthermore, the inter- and intra-object heterogeneity measures are computed for each object, in order to define its SP_local (Section 2.2.2). At each iteration, all existing objects are evaluated, in order to determine whether they have to be merged or not. Two adjacent objects, A and B, are merged if they fulfil the local mutual best fitting criterion (Section 2.2.3). The procedure stops when the MC between all objects exceeds their SP_local and there are no more possible merges. The segmentation results were compared with the results of a widely used region merging segmentation method (Section 2.3) and were evaluated with supervised (Section 2.4) and unsupervised segmentation evaluation methods (Section 2.5). All results were generated in raster format (BIL) with a unique value for each object. Then, a raster-to-vector conversion was carried out using ESRI ArcMap. The main components of the segmentation algorithm are described in detail in the following sections.

Merging Cost (MC)
The identification of spatially adjacent objects is required in order to compute the MC for each potential merge of a pair of adjacent objects. In this method, four-neighbor connectivity was used to define spatial adjacency, since it was found to produce better results than eight-neighbor connectivity [14]. Then, the MC for each pair of adjacent objects was computed according to the spectral-based part of the MC proposed by Baatz and Schäpe [5]. The MC of two adjacent image objects (objects 1 and 2) is defined by the change of spectral heterogeneity, ∆h, in a virtual merge (Equation (1)):

$$\Delta h = \sum_{c} w_c \left[ (n_{obj\_1} + n_{obj\_2})\, h_{c,merge} - \left( n_{obj\_1}\, h_{c,obj\_1} + n_{obj\_2}\, h_{c,obj\_2} \right) \right] \tag{1}$$

where c is the band and w_c is the weight of each band; n_obj_1 and n_obj_2 are the areas in pixels of objects 1 and 2; h_c,obj_1 and h_c,obj_2 are the standard deviations of the pixel values of objects 1 and 2 in band c; and h_c,merge is the standard deviation of the resulting object after the merging of objects 1 and 2. The objects are merged only when the homogeneity criterion is fulfilled according to Equation (2):

$$MC = \Delta h < SP_{local} \tag{2}$$

Local Scale Parameter (SP_local) Definition
Local inter- and intra-object heterogeneity measures can assess local over- and under-segmentation effects [9]. Low values of local heterogeneity measures indicate that the object under study and its neighbor(s) are likely to be parts of the same target object. Therefore, these objects should be merged again. On the contrary, objects with high values of intra- and inter-object heterogeneity contain more than one type of feature because they are internally heterogeneous, and they should not be merged again since they are very different from their neighbors. In order to obtain improved segmentation results with maximized internal homogeneity and external heterogeneity and reduced over- and under-segmentation bias, it is important to establish local SPs defined by the internal and external heterogeneity of each object.
The user only specifies a single global scale parameter (SP_global), and a local heterogeneity-dependent scale parameter (SP_local) is automatically computed for each object as follows. The first step in every iteration of the merging process is the computation of the internal and external heterogeneity statistics of each object. Local variance (local Var) was used as the internal heterogeneity measure and local Moran's I as the external one. Local Moran's I [15] is the decomposed form of global Moran's I and measures the heterogeneity between object i and its neighboring objects, j, according to Equation (3):

$$I_i = (y_i - \bar{y}) \sum_{j} w_{ij}\, (y_j - \bar{y}) \tag{3}$$

where y_i and y_j are the mean values of objects i and j, and ȳ is the mean value of the scene. Only neighboring objects were considered for calculations, so w_ij is computed according to Equation (4):

$$w_{ij} = \frac{L_{ij}}{L_{i,total}} \tag{4}$$

where L_i,total is the border length of object i, and L_ij is the shared border length of object i and one of its neighboring objects, j. The w_ij values range between 0 and 1, where 1 indicates that object i is entirely enclosed by a single object, j, and 0 indicates that object j is not spatially adjacent to object i.
Local I values close to 0 and negative values indicate objects with high external heterogeneity (i.e., different from their neighbors), and positive values indicate objects with low external heterogeneity (i.e., similar to their neighbors). Second, the local Var and I values were computed for all bands and were averaged across all bands using Equation (5):

$$X_{average} = \frac{\sum_{c} w_c X_c}{\sum_{c} w_c} \tag{5}$$

where X_average is the average value of local Var or I, w_c is the weight of each band, and X_c is the value of local Var or I for band c. In order to allow the intra-object and inter-object heterogeneity measures to be considered equally, both were normalized to the same range (0-1) according to Equation (6):

$$X_{norm} = \frac{X_{average} - X_{min}}{X_{max} - X_{min}} \tag{6}$$

where X_min and X_max are the minimum and maximum values of local Var_average or I_average. In each iteration of the region merging process, X_min and X_max were updated when a smaller minimum and/or larger maximum value was computed. After normalization, the normalized values of objects with low local Var and I values were relatively close to 0, while the normalized values of objects with high values of these measures were close to 1.
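The per-object statistics of Equations (3)-(6) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function and class names are hypothetical, the local Moran's I is written without a variance divisor (a scene-wide constant divisor would cancel in the min-max normalization anyway), and the band average is assumed to be a weighted mean.

```python
from math import inf


def border_weights(shared_border, total_border):
    """Equation (4): w_ij = L_ij / L_i,total for every neighbor j of object i."""
    return {j: length / total_border for j, length in shared_border.items()}


def local_morans_i(y_i, neighbor_means, weights, scene_mean):
    """Equation (3), assumed form without a variance divisor."""
    spatial_lag = sum(
        weights[j] * (neighbor_means[j] - scene_mean) for j in neighbor_means
    )
    return (y_i - scene_mean) * spatial_lag


def band_average(values, band_weights):
    """Equation (5), assumed to be a weighted mean of a per-band statistic."""
    return sum(w * x for w, x in zip(band_weights, values)) / sum(band_weights)


class RunningMinMax:
    """Equation (6) with bounds that only widen as new values are observed."""

    def __init__(self):
        self.lo, self.hi = inf, -inf

    def update(self, x):
        self.lo, self.hi = min(self.lo, x), max(self.hi, x)

    def normalize(self, x):
        if self.hi <= self.lo:  # fewer than two distinct values seen so far
            return 0.0
        return (x - self.lo) / (self.hi - self.lo)


# Toy object i with mean 12 and two neighbors (ids 2 and 3); scene mean is 10.
w = border_weights({2: 3, 3: 1}, total_border=4)
i_local = local_morans_i(12, {2: 14, 3: 6}, w, scene_mean=10)
```

In this toy case the neighbor that shares three quarters of the border dominates the spatial lag, so the object is scored as similar to its surroundings (positive local I) even though its other neighbor is quite different.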
Until this point, the approach is similar to the part of the methodology of Johnson and Xie [9] that identifies over- and under-segmented objects. However, the approach of Johnson and Xie [9] is a post-refinement procedure for the enhancement of the quality of existing segmentation results, while the presented approach is a region merging segmentation. Therefore, the last step of this approach was modified to correspond with the objective of this study. Johnson and Xie [9] employed the heterogeneity index, H, which combines the normalized local Var and I. The final objects derived from the segmentation were classified according to the H value and were further refined by splitting and merging. On the other hand, the strategy of this study is to define, during the region merging process, the SP_local of each growing object by adjusting the SP_global with a local factor, LF, defined by the internal and external homogeneity of the object. According to the homogeneity criterion in Equation (2), the use of H as the LF to adjust the SP_global will treat under-segmented and over-segmented objects the same way, given that H ranges from −1 to 1 and the SP_local and MC are positive values. A further rescaling step would be required to adjust H values to a more appropriate range. In order to avoid further processing steps, the normalized local Var and I values were combined for the definition of the LF for each object according to Equation (7):

$$LF = (1 - Var_{norm}) + I_{norm} \tag{7}$$

where Var_norm is the normalized local Var and I_norm is the normalized local Moran's I. LF values range from 0 to 2.
Values close to 2 indicate objects that are quite internally homogeneous and very similar to their neighbors. This high degree of internal and external homogeneity indicates that the object and its neighbor(s) are likely to be parts of the same ground feature. Thus, these objects must be further merged. Values close to 0 indicate objects that are less homogeneous internally and more different from their neighbors. These properties agree with the widely accepted definition of what constitutes a good segmentation, and further merging of these objects is not required. The LF of each object was multiplied by the user-defined SP_global in order to define the SP_local according to Equation (8):

$$SP_{local} = LF \times SP_{global} \tag{8}$$

According to the above, the objects with low internal (i.e., low local Var and Var_norm values) and external heterogeneity (i.e., high local I and I_norm values) will have a larger SP_local value than objects with high internal (i.e., high local Var and Var_norm values) and external heterogeneity (i.e., low local I and I_norm values). Note that when an object is smaller than the real feature in the scene, its internal and external heterogeneity are low and its SP_local is high. As the iterative region merging process continues and the object approximates the size of the real feature, its internal and external heterogeneity increase [16]. Therefore, its SP_local will reach its lowest value and the object will not be further merged. In cases of multi-scale segmentation strategies, the increase in size of an image object, which is considered to be a component of a larger one (e.g., a tree as part of a forest), will decrease its internal and external heterogeneity [17]. The decrease will stop when the image object matches its correspondent at a higher level of organization (i.e., forest), above which the heterogeneity measures will start to decrease again [16]. Thus, the process is inherently convergent, meaning that it automatically converges towards a final segmentation, since the SP_local value decreases when the object reaches the size of the target real feature.
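As a worked illustration of how the merging cost and the local scale parameter interact, the following Python sketch computes the spectral merging cost of Section 2.2.1, a local factor, and the resulting merge decision. It is a sketch under stated assumptions rather than the authors' code: all function names are hypothetical, the population standard deviation is assumed for h, LF is assumed to be (1 − Var_norm) + I_norm so that it spans 0 to 2 as described above, and the merge test is assumed to be a strict comparison against a single SP_local value.

```python
from statistics import pstdev


def merging_cost(bands_1, bands_2, band_weights):
    """Change of spectral heterogeneity for a virtual merge of two objects.

    bands_1 and bands_2 hold one list of pixel values per band.
    """
    cost = 0.0
    for w, p1, p2 in zip(band_weights, bands_1, bands_2):
        n1, n2 = len(p1), len(p2)
        h1, h2 = pstdev(p1), pstdev(p2)   # heterogeneity before the merge
        h_merge = pstdev(p1 + p2)         # heterogeneity after the merge
        cost += w * ((n1 + n2) * h_merge - (n1 * h1 + n2 * h2))
    return cost


def local_factor(var_norm, i_norm):
    """Assumed form of LF: 2 for homogeneous, neighbor-like objects; 0 for the opposite."""
    return (1.0 - var_norm) + i_norm


def local_scale_parameter(sp_global, var_norm, i_norm):
    """SP_local as the user-defined SP_global scaled by the local factor."""
    return local_factor(var_norm, i_norm) * sp_global


def may_merge(cost, sp_local):
    """Assumed homogeneity criterion: merge only while the cost stays below SP_local."""
    return cost < sp_local


# Two single-band, two-pixel objects with values 10 and 20.
cost = merging_cost([[10, 10]], [[20, 20]], [1.0])
sp_similar = local_scale_parameter(50, var_norm=0.0, i_norm=1.0)
sp_distinct = local_scale_parameter(50, var_norm=1.0, i_norm=0.0)
```

With SP_global = 50, the homogeneous, neighbor-like object receives twice the global threshold and would accept this merge, whereas the heterogeneous, distinct object receives a threshold of zero and would reject every merge.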

Merging Order
An arbitrary object, A, can be merged with several neighboring objects, B, which fulfil the homogeneity criterion (Section 2.2.1). Baatz and Schäpe [5] and Lassalle et al. [18] described four different heuristics to choose the adjacent object to be merged with object A by considering further constraints concerning the neighborhood and similarity. In the simplest case, the fitting heuristic, object A is merged randomly with one of its adjacent objects if the homogeneity criterion is fulfilled. In the case of best fitting, object A is merged with the neighboring object that fulfils the homogeneity criterion and has the smallest MC compared to all possible merges of A with any other adjacent object (i.e., the homogeneity criterion is fulfilled best). In local mutual best fitting, object A is merged with its neighboring object, B, only if the homogeneity criterion is fulfilled best for both objects A and B. In the global mutual best fitting heuristic, only the pair of adjacent objects in the whole image that fulfils the homogeneity criterion best is merged at each iteration.
The fitting and best fitting heuristics require a visiting order strategy, which affects the segmentation quality [5]. The visiting order in global mutual best fitting is implicitly given by the heuristic, while in local mutual best fitting, the objects will always be merged in the same order regardless of the visiting order [18]. Both global and local mutual best fitting heuristics produce segmentations of the same quality [5], but local mutual best fitting has a smaller computational cost [19]. Thus, the local mutual best fitting heuristic was used in the proposed region merging segmentation.
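The local mutual best fitting heuristic can be sketched in a few lines of Python. The sketch below is illustrative only: the function names, the adjacency dictionary, and the toy cost function (absolute difference of object means) are hypothetical, and each object is assumed to test candidate merges against its own SP_local.

```python
def best_neighbor(obj, neighbors, cost, sp_local):
    """Neighbor with the smallest merging cost that satisfies obj's criterion, or None."""
    candidates = []
    for nb in neighbors[obj]:
        mc = cost(obj, nb)
        if mc < sp_local[obj]:
            candidates.append((mc, nb))
    return min(candidates)[1] if candidates else None


def mutual_best_pairs(neighbors, cost, sp_local):
    """Pairs (A, B) in which A is the best fit of B and B is the best fit of A."""
    best = {obj: best_neighbor(obj, neighbors, cost, sp_local) for obj in neighbors}
    pairs = {
        tuple(sorted((a, b)))
        for a, b in best.items()
        if b is not None and best[b] == a
    }
    return sorted(pairs)


# Toy chain of four objects, 1-2-3-4, described by their mean values.
values = {1: 10, 2: 11, 3: 20, 4: 21}
neighbors = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}


def toy_cost(a, b):
    """Stand-in for the merging cost: absolute difference of object means."""
    return abs(values[a] - values[b])


pairs = mutual_best_pairs(neighbors, toy_cost, {k: 5 for k in values})
```

Because each pair is selected from the per-object best fits alone, the result does not depend on the order in which the objects are visited, which is the property noted above for this heuristic.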

Segmentation Comparison
Due to the absence of existing region merging methods with local SPs, we compared our proposed method (Local) with two methods employing a global SP. First, a modified version of our region merging method (Global) was implemented by setting the LF to 1 for all objects, thereby fixing the SP_local values at a single global value. Furthermore, both methods were compared to MRS [5], which is one of the most efficient and popular region merging segmentation methods employed in object-based analysis of remote sensing data. All three methods were applied to spectral and elevation data at multiple scales by varying the SP_global from 10 to 100 in increments of 10. All bands were used and their relative weights were set to 1. In MRS, the shape and compactness parameters, which control the weights of the shape information in the computation of the merging cost, were set at their default values (shape = 0.1, compactness = 0.5). The results of each segmentation method were evaluated with supervised and unsupervised methods, which are described in the following sections.

Supervised Evaluation
The supervised evaluation methods identify the optimal segmentation results, which best approximate the reference objects. A variety of arithmetic or geometric dissimilarity metrics can indicate the spatial agreement between reference objects and the respective (overlapping) image objects [20].
The area proportions between image objects, S_i, and reference objects, R_j, with a mutual spatial overlap threshold of 50% were used to estimate five area-based measures: area fit index (AFI), quality rate (QR), over-segmentation, under-segmentation, and the combination of the last two measures (D) [20]. A numerically-based measure, the miss rate (MR), was also used in the assessment. MR relates the total number of reference objects, NR_j, to the number of undetected reference objects, NR_missed (i.e., objects without a corresponding object), and is denoted as:

$$MR = \frac{NR_{missed}}{NR_j} \tag{9}$$

A good quality segmentation is reached when the overall differences of all criteria between the segmentation results and the associated reference objects are as low as possible. Except for AFI, the value range for all measures is from 0 to 1. The closer the value is to 0, the better the spatial match between reference objects and their overlapping segments. Only AFI can reach negative values, when the reference objects are under-segmented [21].
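The miss rate can be illustrated with a short Python sketch. The names are hypothetical, and the 50% mutual overlap rule is assumed here to mean that the overlap must exceed half of the reference area and half of the segment area; the source does not spell out this detail.

```python
def is_match(overlap_area, reference_area, segment_area, threshold=0.5):
    """Assumed mutual overlap rule: overlap exceeds the threshold on both sides."""
    return (
        overlap_area / reference_area > threshold
        and overlap_area / segment_area > threshold
    )


def miss_rate(references, segments, overlaps, threshold=0.5):
    """Share of reference objects without a corresponding image object.

    references, segments: dict id -> area in pixels
    overlaps: dict (reference_id, segment_id) -> overlap area in pixels
    """
    detected = {
        r
        for (r, s), area in overlaps.items()
        if is_match(area, references[r], segments[s], threshold)
    }
    return (len(references) - len(detected)) / len(references)


# Four reference buildings; r2 overlaps a far larger segment and r4 has no overlap.
references = {"r1": 100, "r2": 80, "r3": 50, "r4": 40}
segments = {"s1": 110, "s2": 200, "s3": 45}
overlaps = {("r1", "s1"): 90, ("r2", "s2"): 70, ("r3", "s3"): 40}
mr = miss_rate(references, segments, overlaps)
```

In this toy case r2 is counted as missed even though most of it is covered, because the covering segment is under-segmented and fails the mutual overlap test.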

Unsupervised Evaluation
A segmentation result is considered optimal when the average global intra-object homogeneity and inter-object heterogeneity are maximized [7]. The global Moran's I (MI) value indicates the overall inter-object heterogeneity, whereas the mean weighted variance measures the average degree of intra-object homogeneity. The global Moran's I is computed according to Equation (10):

$$MI = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, (y_i - \bar{y})(y_j - \bar{y})}{\left( \sum_{i=1}^{n} (y_i - \bar{y})^2 \right) \left( \sum_{i \neq j} w_{ij} \right)} \tag{10}$$

where n is the total number of objects, y_i is the mean value of object i, ȳ is the mean value of the scene, and w_ij is the weight that defines the spatial adjacency of objects i and j. The value of w_ij was computed according to Equation (4). Global Moran's I values range from −1 to 1. Values close to 0 and negative values indicate high inter-object heterogeneity, which is desirable for a segmentation.
The mean intra-object homogeneity (wVar) was computed using Equation (11):

$$wVar = \frac{\sum_{i=1}^{n} a_i v_i}{\sum_{i=1}^{n} a_i} \tag{11}$$

where v_i is the variance of the pixel values within object i, and a_i is the area in pixels of object i. The lower the value, the more homogeneous the objects are in terms of the measured property. Espindola et al. [7] successfully utilized a combination of these measures to identify optimal region-growing segmentations in a multi-scale analysis. This method was also employed here to find the optimum segmentation scales and statistically assess the quality of segmentation results. Following Espindola et al. [7], the global internal and external heterogeneity measures were rescaled to a range (0-1) in order to be considered equally. The minimum wVar and MI values were rescaled to 1, while the maximum values of these measures were rescaled to 0. The normalized values were combined for each band using the formula in Equation (12):

$$S = wVar_{norm} + MI_{norm} \tag{12}$$

The S values for each band were averaged to identify the best single-scale segmentation. The optimal segmentation was identified as the one with the highest average S value, because this segmentation has the lowest combined wVar and MI value. An alternative combination method is the F-measure (F), which is often used for classification accuracy assessment. Zhang et al. [22] used F to combine precision- and recall-based metrics representing the degree of over-segmentation and under-segmentation, respectively, and it was found to be more sensitive to excessive under-segmentation or over-segmentation. Furthermore, F was used by Johnson et al. [23] to compute the overall goodness, OG_f, according to Equation (13):

$$OG_f = (1 + a^2)\, \frac{MI_{norm} \times wVar_{norm}}{a^2 \times MI_{norm} + wVar_{norm}} \tag{13}$$

where a is a weight that controls the relative weights of wVar_norm and MI_norm. OG_f values range from 0 to 1, and higher values indicate a higher segmentation quality. In this study, both measures were considered to have the same weights, i.e., a = 1.
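The unsupervised measures can be put together in a compact Python sketch. It is an illustration under assumptions, not the evaluation code used in the study: the function names are hypothetical, the adjacency weights are passed as a dictionary over ordered object pairs, and the F-measure form of OG_f is assumed to be the weighted harmonic combination of the two normalized measures.

```python
def weighted_variance(variances, areas):
    """Area-weighted mean of the within-object variances (wVar)."""
    return sum(a * v for a, v in zip(areas, variances)) / sum(areas)


def global_morans_i(means, weights, scene_mean):
    """Global Moran's I over object means.

    means: dict object_id -> mean value
    weights: dict (i, j) -> w_ij for ordered pairs of adjacent objects
    """
    n = len(means)
    cross = sum(
        w * (means[i] - scene_mean) * (means[j] - scene_mean)
        for (i, j), w in weights.items()
    )
    spread = sum((y - scene_mean) ** 2 for y in means.values())
    return n * cross / (spread * sum(weights.values()))


def rescale_inverted(value, lowest, highest):
    """Min-max rescaling in which the lowest observed value maps to 1 and the highest to 0."""
    return (highest - value) / (highest - lowest)


def combined_s(wvar_norm, mi_norm):
    """Additive combination of the two normalized measures (S)."""
    return wvar_norm + mi_norm


def overall_goodness(wvar_norm, mi_norm, a=1.0):
    """Assumed F-measure combination (OG_f); a controls the relative weights."""
    denominator = a * a * mi_norm + wvar_norm
    if denominator == 0:
        return 0.0
    return (1 + a * a) * mi_norm * wvar_norm / denominator


# Two adjacent objects whose means lie on opposite sides of the scene mean.
mi = global_morans_i({0: 1.0, 1: 3.0}, {(0, 1): 1.0, (1, 0): 1.0}, scene_mean=2.0)
wvar = weighted_variance([4.0, 1.0], [1, 3])
og_f = overall_goodness(0.5, 0.5)
```

Unlike the additive S, the harmonic form drops to 0 whenever either normalized measure is 0, which is one way to read the reported sensitivity of F to excessive over- or under-segmentation.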

Results
The effectiveness of the segmentation methods with local and global SP values is presented here by comparing their optimum segmentation results, as indicated by supervised and unsupervised segmentation evaluation methods. The segmentation evaluation results are available as supplementary online material.

Remote Sensing Applications
The proposed segmentation method was employed in two remote sensing applications: urban geo-object recognition (i.e., building delimitation) with spectral data (Section 3.1.1) and impact crater delimitation with elevation data (Section 3.1.2).

Building Delimitation from Spectral Data
Image segmentation of spectral remote sensing images is a mandatory processing step within OBIA workflows for building detection [24,25]. For this study, a 1 m color (RGB) digital aerial image of Boston, Massachusetts, USA, and its reference building dataset were obtained from a publicly available dataset for building detection [26]. The study area is a 500 × 500 m subset of a typical urban area, which contains many different types of land cover within a small area (buildings, roads, trees, grassland).

Impact Crater Delimitation from Elevation Data
OBIA has shown promising results in landform classification from digital elevation models (DEMs) [27,28]. Even though image segmentation methods were designed for use with remotely sensed spectral data, several researchers [29][30][31][32] have shown that they can also be used with DEMs. Van Niekerk [33] compared several approaches and found that the multiresolution segmentation (MRS) algorithm is the most sensitive to morphological discontinuities in DEMs and is able to create objects that have maximum internal homogeneity and are distinguishable from their neighbors.
The number of OBIA applications in the analysis of DEMs and their derivatives has increased in the last decade. Eisank et al. [34] showed that the efficient delimitation of landforms depends on the image segmentation algorithm, the parameterization of the segmentation, and the DEM derivatives that are the input to the segmentation process. In order to avoid the effects of DEM derivatives, a landform mapping application that requires only the segmentation of the DEM is preferable for the evaluation.
A characteristic and simple object-based landform mapping approach was proposed by Vamshi et al. [35]. They showed that OBIA is efficient for the detection of lunar impact craters by segmenting only the high resolution DEM available from the National Aeronautics and Space Administration (NASA) Lunar Reconnaissance Orbiter (LRO) mission [36]. Their method was based on an iterative process of segmenting the DEM into objects with SPs of decreasing value so that impact craters of different sizes can be detected. In each iteration, the resulting image objects were classified as impact craters according to shape and morphometric criteria.
In this study, a 2500 × 2500 m subset of the 5 m resolution DEM representing the topography of Mare Imbrium was employed for the evaluation of the segmentation.The same DEM was used also in the study of Vamshi et al. [35] and is available on the NASA website (http://ode.rsl.wustl.edu/moon/indexMapSearch.aspx).The performance of our segmentation method in impact crater detection was evaluated by comparing the detected impact craters with the reference impact craters, created manually by Vamshi et al. [35].

Building Delimitation from Spectral Data
Based on the unsupervised evaluation method, the best segmentations for MRS, Global, and Local were obtained by setting SPglobal to 60, 50, and 50, respectively. According to the diagrams in Figure 2, the three-band average wVar of the best segmentation from the Local (wVar = 555,247.79 and MI = −0.1661) was lower than those of the best segmentations from MRS (wVar = 1,038,643.19 and MI = −0.1900) and Global (wVar = 971,633.24 and MI = −0.1615). The diagrams in Figure 2 show that the Global produced results with higher average wVar values in comparison to results from Local and MRS. The Local method appears to produce objects with similar average wVar and apparently lower MI values than the MRS for most scale parameter values. MRS and Global appear to produce segmentation results with similar MI values. The best segmentation results of the three methods indicated by the unsupervised evaluation are shown in Figure 3B-D.
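The two unsupervised measures reported above can be sketched as follows. This is a minimal illustration that assumes a common formulation: wVar as the area-weighted mean of per-segment variance, and Moran's I computed on segment means with binary weights between 4-adjacent segments. The exact weighting used in this study is not restated in this section, and the function names are hypothetical. For a multi-band image, the values would be computed per band and then averaged.

```python
import numpy as np

def weighted_variance(labels, band):
    """Area-weighted mean of per-segment variance for one band (assumed form of wVar)."""
    ids, counts = np.unique(labels, return_counts=True)
    variances = np.array([band[labels == i].var() for i in ids])
    return float((counts * variances).sum() / counts.sum())

def morans_i(labels, band):
    """Global Moran's I of segment means; weight 1 for segments sharing an edge."""
    ids = np.unique(labels)
    index = {int(v): k for k, v in enumerate(ids)}
    means = np.array([band[labels == i].mean() for i in ids])
    n = len(ids)
    w = np.zeros((n, n))
    # Compare each pixel with its right and lower neighbour to find adjacent segments.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        mask = a != b
        for p, q in zip(a[mask].tolist(), b[mask].tolist()):
            w[index[p], index[q]] = w[index[q], index[p]] = 1.0
    dev = means - means.mean()
    # Undefined (division by zero) when all segment means are equal.
    return float(n * (w * np.outer(dev, dev)).sum() / (w.sum() * (dev ** 2).sum()))

# Toy 4 x 4 scene with four homogeneous segments arranged as a checkerboard.
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])
band = np.where((labels == 0) | (labels == 3), 1.0, 3.0)
print(weighted_variance(labels, band), morans_i(labels, band))
```

In this toy scene every segment is internally uniform and every neighbour differs, so wVar is 0 and Moran's I is −1, which is the ideal direction for both measures (low intra-object variance, low inter-object autocorrelation).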
The visual results in Figure 3 show no obvious differences among the methods in the delineation of building objects. This is supported by the relatively similar performance of the three methods on the five area-based measures (Figure 4). The advantage of the proposed Local method can nevertheless be identified by inspecting the results locally, particularly in areas covered with grassland (Figure 3). Both Local and MRS performed better in segmenting grassland areas as single objects than the Global method, which showed over-segmentation effects. Indicatively, the MRS was able to delineate the playground in the middle of Figure 3B. The Local method was able to capture the spectral discontinuities inside the playground area in Figure 3D, while the Global method over-segmented the playground area (Figure 3C).
The supervised evaluation results of the three segmentations produced by MRS, Global, and Local are presented in Figure 4.
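For readers unfamiliar with the area-based discrepancy measures plotted in Figure 4, the sketch below computes them for one reference object and one matched segment given as boolean masks. It assumes commonly used definitions (area fit index, quality rate, over-segmentation, under-segmentation, and their root-mean-square combination); the definitions applied in this study are given in the methods section and may differ in detail. The function name and the dictionary keys are hypothetical.

```python
import numpy as np

def area_measures(reference, segment):
    """Area-based discrepancy measures for one reference/segment pair (assumed definitions)."""
    ref = np.asarray(reference, dtype=bool)
    seg = np.asarray(segment, dtype=bool)
    a_ref, a_seg = ref.sum(), seg.sum()
    inter = np.logical_and(ref, seg).sum()
    union = np.logical_or(ref, seg).sum()
    over = 1 - inter / a_ref    # share of the reference not covered by the segment
    under = 1 - inter / a_seg   # share of the segment lying outside the reference
    return {
        "AFI": float((a_ref - a_seg) / a_ref),
        "QR": float(1 - inter / union),
        "OS": float(over),
        "US": float(under),
        "D": float(np.sqrt((over ** 2 + under ** 2) / 2)),
    }

# Toy example: an 8-pixel reference building and a segment covering half of it.
reference = np.zeros((4, 4), dtype=bool)
reference[0:2, :] = True
segment = np.zeros((4, 4), dtype=bool)
segment[0:2, 0:2] = True
print(area_measures(reference, segment))
```

All five values are 0 for a perfect match, so lower values indicate a better delineation under these assumed definitions.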

Impact Crater Delimitation from Elevation Data
According to the unsupervised evaluation method, the optimum scale parameter value for MRS and the Global was set to 50, while for the Local, it was set to 40. The diagrams (Figure 5) for wVar and Moran's I show that the Local outperforms the other two methods in producing multiscale segmentation results with lower Moran's I values. The average wVar and Moran's I values of the best segmentation results produced from the three methods were nearly similar (Local: wVar = 13,629.50 and MI = 0.4633, Global: wVar = 10,846.80 and MI = 0.4956, MRS: wVar = 10,968.11 and MI = 0.4463).
The area-based dissimilarity measures for the three segmentation approaches indicated different optimum SPglobal values (Figure 6). All area-based measures confirmed the greater efficiency of the Local compared with the other two methods. Indicatively, the AFI values of the Local results are approximately 14% lower than those of the MRS results and 31% lower than those of the Global results. The best segmentation result as indicated by the MR was obtained with the Local when SPglobal was set to 10 (MR = 0.5714). The Local has a lower value of MR than the MRS (MR = 0.6071) and the Global (MR = 0.6785).
Figure 7 illustrates the best segmentation results according to MR for the three segmentation methods. According to Figure 7, the proposed method and the MRS delineated 12 and 11 impact craters, respectively, while the Global managed to delineate only 9. This caused the high degree of under-estimated areas (blue in Figure 7) in the intersection maps of all segmentation methods. On the contrary, the proportion of over-estimated areas (red in Figure 7) was relatively small for MRS, Local, and Global, compared to their under-estimated areas. Nevertheless, all segmentation methods appear promising because the overlap area (green in Figure 7) shows that the delimited craters match the reference craters quite well.
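The reported MR values can be cross-checked against the crater counts with a short calculation. The sketch assumes that MR is the share of reference objects left without a matching delineated object, and that the reference set contains 28 craters; the latter is an inference from the reported numbers, not a figure stated in this section.

```python
def miss_rate(n_reference, n_detected):
    """Share of reference objects with no matching delineated object (assumed definition)."""
    return (n_reference - n_detected) / n_reference

N_REFERENCE = 28  # assumption inferred from the reported MR values

for method, detected in [("Local", 12), ("MRS", 11), ("Global", 9)]:
    print(method, round(miss_rate(N_REFERENCE, detected), 4))
# Local 0.5714
# MRS 0.6071
# Global 0.6786
```

Under this assumption the values agree with the reported ones; the Global result (0.678571...) rounds to 0.6786, while the text gives the truncated value 0.6785.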

Comparison with a Relative Hybrid Segmentation Method
At this point, it is necessary to compare the performance of the proposed Local method with the hybrid segmentation method (objective heterogeneity and relative homogeneity, OHRH) of Wang et al. [12], which also incorporates intra-object and inter-object heterogeneity statistical measures within the merging process. The Local, the Global, and the MRS methods were applied for the segmentation of the SZTAKI-INRIA building detection dataset [13], which is available from their website (http://web.eee.sztaki.hu/remotesensing/building_benchmark.html). This dataset contains nine aerial and satellite images of Budapest and Szada in Hungary, Manchester in the United Kingdom, Bodensee in Germany, and Normandy and Côte d'Azur in France. Following Wang et al. [12], only the images from the regions of Côte d'Azur, Manchester, and Szada were tested with the three segmentation methods. The segmentation settings for all methods remained the same as before (Section 2.3).
The best segmentation results according to OGf are shown in Figure 8 for visual comparison, where only the MRS and Local results are illustrated. The unsupervised evaluation results according to OGf for the three region merging segmentation methods are shown in Table 1. Furthermore, the OGf values obtained from the evaluation of the OHRH method of Wang et al. [12] and the full Lambda-schedule algorithm (FLSA) of Robinson et al. [37] are also provided in Table 1. The comparison with the relevant results from OHRH [12] and FLSA [37] showed that the MRS and the Local methods can produce segmentation results with reduced over-segmentation effects, which is also evident from their higher OGf values.
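As an illustration of how an overall goodness score of this kind combines the two unsupervised measures, the sketch below uses an F-measure over min-max normalized wVar and Moran's I values across a series of candidate segmentations. This form, the weight `alpha`, and the function names are assumptions made for illustration; the definition of OGf actually used is the one given in the evaluation methodology. The input numbers are invented and are not results from this study.

```python
def normalize(values):
    """Min-max normalization, inverted so that the lowest raw value scores 1."""
    lo, hi = min(values), max(values)
    return [(hi - v) / (hi - lo) for v in values]

def overall_goodness_f(wvar, morans_i, alpha=1.0):
    """F-measure of normalized wVar and Moran's I per candidate segmentation (assumed form)."""
    v_norm = normalize(wvar)
    mi_norm = normalize(morans_i)
    scores = []
    for v, mi in zip(v_norm, mi_norm):
        denom = alpha ** 2 * mi + v
        scores.append((1 + alpha ** 2) * mi * v / denom if denom > 0 else 0.0)
    return scores

# Invented series: wVar rises and Moran's I falls as the scale parameter grows.
scales = [10, 20, 30]
scores = overall_goodness_f(wvar=[10.0, 20.0, 30.0], morans_i=[0.5, 0.25, 0.0])
best_scale = scales[scores.index(max(scores))]
print(scores, best_scale)
```

The score rewards segmentations that balance both criteria: the finest and coarsest scales each optimize one measure at the expense of the other and score 0, while the intermediate scale scores highest.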
According to Figure 8, the segmentation results produced by MRS and Local do not show obvious differences. In order to validate the efficiency of the Local method, the same two subsets (Figure 9) from each image selected by Wang et al. [12] were examined visually in this study. The superiority of the Local method over the MRS method in segmenting tree clusters is evident in most cases. In subset B1 (Figures 8 and 9), the Local method produced a segment that captures the tree cluster as a whole. On the contrary, MRS over-segmented the same tree cluster in subset A1 (Figures 8 and 9). Both the MRS and the Local methods showed similar performance in the segmentation of buildings. However, MRS failed to distinguish several rooftops from their surrounding trees. Specifically, the shaded parts of the rooftops on the upper left side of subset A1 and the lower right side of subset B2 (Figures 8 and 9) were merged with their adjacent tree clusters.

Discussion
In this study, we proposed a novel region merging segmentation method with local SP values, defined by local intra- and inter-object heterogeneity measures. The proposed method has been shown to be able to: (1) produce objects that are internally homogeneous and distinguishable from their neighboring objects; (2) create segmentations at various scales and delineate more reference objects than other region merging segmentation methods with global SP values; and (3) handle both spectral and elevation data.
There are studies assessing the efficiency of various region-merging and region-growing segmentation methods with spectral data [4,38]. Almost all existing region-merging and region-growing methods have been compared in these studies, which have shown that the MRS is the most efficient one. For this reason, it was decided to compare the results of the proposed segmentation method only with the results produced by the MRS. Additionally, MRS is also the most efficient method for the segmentation of elevation data, because it performs best in the detection of morphological discontinuities according to Van Niekerk [33].
The assessment of the segmentation results with the unsupervised segmentation evaluation method showed that the proposed method with local SPs outperformed the MRS and the Global in the production of objects with maximized average internal homogeneity and external heterogeneity. In the case of the segmentation of elevation data with the Local method, the average external heterogeneity is maximized at a higher rate than with the MRS segmentation. This indicates that the Local method, which incorporates local intra- and inter-segment heterogeneity measures within the definition of the local SP, is more sensitive to morphological discontinuities in DEMs than segmentation methods with a global SP, like MRS.
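The idea of a per-object scale parameter can be illustrated with a deliberately simplified rule. The sketch below is hypothetical and is not the formula of the proposed method: it only mirrors the stated principle that objects with low internal and low external heterogeneity are allowed to keep merging under a higher SP. The comparison against scene averages, the `boost` factor, and all names are assumptions.

```python
def local_scale_parameters(sp_global, internal, external, boost=1.5):
    """Hypothetical rule: raise the SP of objects whose internal and external
    heterogeneity scores are both below the scene average."""
    mean_internal = sum(internal.values()) / len(internal)
    mean_external = sum(external.values()) / len(external)
    return {
        obj: sp_global * boost
        if internal[obj] < mean_internal and external[obj] < mean_external
        else sp_global
        for obj in internal
    }

# Invented heterogeneity scores for two objects (lower means more homogeneous).
internal = {"grass_patch": 1.0, "roof_edge": 5.0}
external = {"grass_patch": 0.1, "roof_edge": 0.9}
print(local_scale_parameters(50, internal, external))
# {'grass_patch': 75.0, 'roof_edge': 50}
```

Under this toy rule the homogeneous object that resembles its surroundings may continue to merge, while the heterogeneous object stops at the global threshold.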
The visual difference among the results from the segmentation of the spectral data with the Local, Global, and MRS segmentation methods is not significant, and the differences are mainly reflected locally, as shown in Figures 3, 8, and 9. The proposed Local method is more efficient in segmenting tree clusters and grassland areas as one object in comparison to MRS, while the performance of both methods in building delimitation is visually similar. This is the reason why the supervised evaluation results for the three methods are nearly identical in the case of building delimitation with spectral data. In the case of impact crater delimitation, the area-based and the arithmetic dissimilarity measures for the results produced by the Local method were lower than the measures of the other two methods. Indicatively, the MR values of the results from the Local method for both applications were the lowest among the three methods. The ability of the Local method to capture more reference objects, in combination with multiscale optimization methods for selecting optimum segmentation scales, could help towards the development of more efficient classification methods with a reduced error of omission [39].
The unsupervised evaluation of the segmentation results for the SZTAKI-INRIA building detection dataset [13] showed that the proposed Local method outperformed the FLSA of Robinson et al. [37] and the hybrid OHRH of Wang et al. [12]. Both FLSA and OHRH methods showed increased over-segmentation in contrast to the Local and MRS methods. The superiority of Local and MRS over FLSA and OHRH indicates that the use of the Baatz and Schäpe criterion [5] as MC is more efficient than the FLSA criterion [37] and the OHRH criterion [12] for the segmentation of high resolution remote sensing images.
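To make the role of the merging criterion concrete, the sketch below computes the spectral part of a Baatz and Schäpe style criterion: the increase in size-weighted standard deviation caused by merging two objects, compared against the squared scale parameter. It is a simplified illustration under stated assumptions: the shape term (compactness and smoothness) is omitted, band weights default to 1, and the function names are hypothetical.

```python
import numpy as np

def color_heterogeneity_increase(pixels_a, pixels_b, band_weights=None):
    """Increase in size-weighted standard deviation if objects a and b were merged.

    Each input has shape (n_pixels, n_bands). Shape heterogeneity is not modeled here.
    """
    a = np.asarray(pixels_a, dtype=float)
    b = np.asarray(pixels_b, dtype=float)
    merged = np.vstack([a, b])
    if band_weights is None:
        band_weights = np.ones(a.shape[1])
    increase = len(merged) * merged.std(axis=0) - (
        len(a) * a.std(axis=0) + len(b) * b.std(axis=0)
    )
    return float(np.dot(band_weights, increase))

def may_merge(pixels_a, pixels_b, scale_parameter):
    """Allow the merge only while the increase stays below the squared scale parameter."""
    return color_heterogeneity_increase(pixels_a, pixels_b) < scale_parameter ** 2

# Two uniform single-band objects with values 10 and 20.
obj_a = [[10.0], [10.0]]
obj_b = [[20.0], [20.0]]
print(color_heterogeneity_increase(obj_a, obj_b))            # 20.0
print(may_merge(obj_a, obj_b, 5), may_merge(obj_a, obj_b, 4))  # True False
```

A larger scale parameter tolerates a larger heterogeneity increase, which is why raising it locally lets selected objects continue to grow.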
Another desirable property of the proposed method is its ability to segment remote sensing data efficiently without the initial image objects from a presegmentation step that hybrid methods require. Wang et al. [12] have shown that the quality of the initial segments is critical for the performance of hybrid segmentation methods. The fact that the proposed Local method outperformed the hybrid OHRH method of Wang et al. [12] answers the question of how efficient the incorporation of local heterogeneity measures is within the merging process of a pure region merging segmentation method that does not need initial objects.
The proposed Local region-merging method followed a methodology similar to the one employed by Johnson and Xie [9] for the post-segmentation refinement of under- and over-segmented objects derived from segmentations with optimal settings. However, the Local method incorporated within the merging process the local heterogeneity measures used by Johnson and Xie [9] to characterize the under-segmented and over-segmented objects. This enabled the automated and objective characterization and refinement of the objects during the merging process. In the Johnson and Xie [9] approach, the identification and refinement of under-segmented and over-segmented objects require manual tuning by the user, which makes their method less objective.
Image segmentation is the first step of the OBIA workflow and is concerned with the generation of image objects from the remote sensing data. Image objects are the analysis unit for the subsequent classification steps, and their quality affects the quality of the final classification results. Therefore, the development of segmentation methods that produce improved segmentation results can facilitate object-based classification approaches. The proposed Local method with local SPs was able to provide better segmentation results in terms of statistical measures and reference object detection. Thus, our proposed method has the potential to be applied to object-based approaches that employ multi-band spectral or elevation remote sensing data.

Figure 1. Region merging with local scale parameters workflow.

Figure 2. The performance of the three segmentation methods according to the three-band average (A) weighted variance (wVar) and (B) Moran's I values.

Figure 3. (A) The spectral data overlaid with the reference building dataset. The segmentation results for building delimitation from spectral data produced by three methods: (B) MRS (SPglobal = 60), (C) region merging with global SP (SPglobal = 50), and (D) region merging with local SPs (SPglobal = 50), using the optimum SPglobal indicated by the unsupervised evaluation.

Figure 4. The supervised evaluation results for the segmentations of spectral data produced by the three segmentation methods (MRS, Global, and Local) according to the discrepancy measures: (A) area fit index; (B) quality rate; (C) over-segmentation; (D) under-segmentation; (E) combined over- and under-segmentation; and (F) miss rate.

Remote Sens. 2018, 10, 2024

Figure 5. The performance of the three segmentation methods according to the three-band average (A) weighted variance (wVar) and (B) Moran's I values.

Figure 6. The supervised evaluation results for the segmentations of elevation data produced by the three segmentation methods (MRS, Global, and Local) according to the discrepancy measures: (A) area fit index; (B) quality rate; (C) over-segmentation; (D) under-segmentation; (E) combined over- and under-segmentation; and (F) miss rate.

Figure 8. The segmentation results produced by MRS and Local using the SZTAKI-INRIA building detection dataset.

Figure 9. Subsets of segmentations in Figure 8, where the first and third columns illustrate the MRS results and the other columns the Local results.

Table 1. The unsupervised evaluation results for the best segmentations of the SZTAKI-INRIA building detection dataset produced by MRS, Global, Local, OHRH, and FLSA methods.
