Exploration of Semantic Geo-Object Recognition Based on the Scale Parameter Optimization Method for Remote Sensing Images

Abstract: Image segmentation is significant because it provides the objects that serve as the minimum analysis units for geographic object-based image analysis (GEOBIA). Most segmentation methods rely on parameters to identify geo-objects, and different parameter settings lead to different segmentation results; thus, parameter optimization is critical to obtaining satisfactory segmentation results. Many parameter optimization methods have been developed and successfully applied to the identification of single geo-objects. However, few studies have focused on the recognition of the union of different types of geo-objects (semantic geo-objects), such as a park. The recognition of semantic geo-objects is likely more crucial than that of single geo-objects because it is more closely correlated with human perception. This paper proposes an approach to recognize semantic geo-objects. The key concept is that a single geo-object is the smallest component unit of a semantic geo-object, and semantic geo-objects are recognized by iteratively merging single geo-objects. Thus, the optimal scale of the semantic geo-objects is determined by iteratively recognizing the optimal scales of single geo-objects and using them as the starting point of the reset scale parameter optimization interval. In this paper, we adopted the multiresolution segmentation (MRS) method to segment Gaofen-1 images and tested three scale parameter optimization methods to validate the proposed approach. The results show that the proposed approach can determine scale parameters that produce semantic geo-objects.


Introduction
Advances in satellite sensor technologies have enabled the acquisition of images with different spatial resolutions. For remote sensing images with moderate and high spatial resolutions, the traditional pixel-based approach cannot satisfy the requirements of several remote sensing applications because the same geo-object with different spectra and different geo-objects with identical spectra are present in remote sensing images. With the development of geographic object-based image analysis (GEOBIA) techniques, image classification has been enhanced due to the reduction in spectral variability within geo-objects [1][2][3].
Image segmentation is the first critical step in the GEOBIA framework, and the quality of image segmentation determines the accuracy of subsequent image classification [4][5][6][7][8]. It is challenging to perform image segmentation on remote sensing images involving complex land covers, and many segmentation methods have been developed.
In this paper, we selected one scene acquired by Gaofen-1 (GF-1) on 7 August 2015 over Shenzhen, China. The GF-1 satellite is equipped with six cameras: panchromatic and multispectral cameras with spatial resolutions of 2 and 8 m, respectively, and four multispectral wide cameras with a spatial resolution of 16 m [35]. Other technical specifications of the GF-1 satellite are presented in Table 1. The NNDiffuse pansharpening function of ENVI 5.3 was used to fuse the multispectral and panchromatic images with spatial resolutions of 8 m and 2 m, respectively, into a 4-band pansharpened multispectral image with a spatial resolution of 2 m.
Shenzhen is a coastal city in southern China adjacent to Hong Kong. The city is located south of the Tropic of Cancer, between 113°43′ and 114°38′ east longitude and between 22°24′ and 22°52′ north latitude. Shenzhen lies in the south of Guangdong Province on the eastern coast of the Pearl River Estuary; it is bordered by Daya Bay and Dapeng Bay in the east, the Pearl River Estuary and Lingdingyang Bay in the west, the Shenzhen River (which connects it with Hong Kong) in the south, and Dongguan and Huizhou in the north. The total land area of Shenzhen is 1996.85 km². Shenzhen has a mild climate with abundant rainfall. The main landforms of Shenzhen include low mountains, flat platforms, and terraced hills. Plains account for 22.1% of the land area, and the forest coverage rate is 44.6%.
Four experimental areas were selected for this study and included traditional urban and suburban areas containing various land cover types. Roads, trees, water bodies, vegetation, various buildings, and other objects are present in the experimental areas. Small geo-objects are relatively clear because the experimental images have a high spatial resolution of 2 m. The four test images are shown in Figure 1. The image shown in Figure 1a contains factories, residential buildings, vegetation, and roads. The image shown in Figure 1b contains forests, rivers, factories, houses, and roads. The image shown in Figure 1c contains houses, small water bodies, vegetation, roads, and unconstructed land. The image shown in Figure 1d contains a small section of river, ponds, vegetation, and roads. These different combinations of geo-objects can provide several references for follow-up research. Each of the four images covers an area of 1.6 × 1.6 km².

Overview
A semantic geo-object represents the union of different single geo-objects; consequently, the optimization of the scale parameters of semantic geo-objects is based on the optimization of single geo-objects. Thus, the proposed approach to recognize semantic geo-objects can be divided into three steps. The first step involves segmentation; we obtain a series of segmentation results by using the MRS method in the experiment. The second step involves the scale parameter optimization of single geo-objects; we use the three parameter optimization (PO) methods reported by Johnson et al. [22], Wang et al. [34], and Zhang et al. [18], respectively, to obtain the optimal scales of single objects. The third step involves the scale parameter optimization of semantic geo-objects and the implementation of an iterative process to determine the scale parameter(s) of semantic geo-objects. The approach to recognize the segmentation parameter(s) that can produce semantic geo-objects is illustrated in Figure 2.
ISPRS Int. J. Geo-Inf. 2021, 10, x FOR PEER REVIEW

Semantic Geo-Object PO
(i) Segmentation
The multiresolution segmentation (MRS) method is used to segment the test images. The MRS method, which is embedded in eCognition Developer 9.0 software, is a bottom-up approach based on a region-merging technique; the approach starts from each pixel and considers the shape, size, and attributes of the pixels within the object [37]. The method stops merging when the heterogeneity threshold is reached. The MRS method involves three parameters: scale, shape, and compactness. The scale parameter determines the maximum allowable heterogeneity, the shape parameter controls the relative weight of shape versus color (spectral) homogeneity, and the compactness parameter controls the relative weight of compactness versus smoothness of the object boundaries. The shape and compactness parameters are both set as 0.1 through visual analysis. The focus of this study is to determine a suitable scale parameter.
(ii) PO of single geo-objects
Based on a series of segmentation results, we use the following three PO methods to search for the appropriate segmentation scale parameter of single geo-objects.
The first PO method is JSM, proposed by Johnson et al. (2015) [22]. This method uses the area-weighted variance (WV) and Moran's I (MI) to measure the intrasegment homogeneity and intersegment heterogeneity, respectively [22,28,38]. The WV can clarify the differences in a region. A low WV value indicates a high homogeneity. The WV can be calculated as follows:

$$\mathrm{WV} = \frac{\sum_{i=1}^{n} a_i v_i}{\sum_{i=1}^{n} a_i}$$

where $a_i$ and $v_i$ denote the area and variance of region $i$, respectively, and $n$ is the number of segments. MI is an autocorrelation index that reflects the degree of spatial correlation [38]. A low MI value indicates a high heterogeneity. The MI can be determined as follows:

$$\mathrm{MI} = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\left( \sum_{i=1}^{n} (y_i - \bar{y})^2 \right) \left( \sum_{i \neq j} \sum w_{ij} \right)}$$

where $n$ is the total number of segments; $y_i$ and $y_j$ are the mean gray values of segments $i$ and $j$; $\bar{y}$ is the mean gray value of the entire image; and $w_{ij}$ is a measure of the spatial adjacency of segments $i$ and $j$ [23,28]. If regions $i$ and $j$ are adjacent, $w_{ij} = 1$; otherwise, $w_{ij} = 0$ [28].
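As a minimal illustration of the two JSM indicators, the following Python sketch computes WV and MI from per-segment statistics. The function names, the list-based inputs, and the representation of adjacency as unordered index pairs are assumptions made for this example; they are not taken from the cited works or from any particular software.

```python
def weighted_variance(areas, variances):
    """Area-weighted variance: sum(a_i * v_i) / sum(a_i)."""
    total_area = sum(areas)
    return sum(a * v for a, v in zip(areas, variances)) / total_area


def morans_i(means, image_mean, adjacent_pairs):
    """Global Moran's I with binary adjacency weights.

    means          -- mean gray value y_i of each segment
    image_mean     -- mean gray value of the entire image
    adjacent_pairs -- unordered (i, j) index pairs of adjacent segments
                      (hypothetical input format); w_ij = w_ji = 1 for
                      each pair and 0 otherwise
    """
    n = len(means)
    dev = [y - image_mean for y in means]
    # Each unordered pair contributes twice: once as (i, j), once as (j, i).
    cross = 2.0 * sum(dev[i] * dev[j] for i, j in adjacent_pairs)
    weight_sum = 2.0 * len(adjacent_pairs)
    return n * cross / (sum(d * d for d in dev) * weight_sum)


# Toy data: four segments in a row, neighbours share similar gray values.
wv = weighted_variance([10, 30], [4.0, 2.0])
mi = morans_i([1.0, 1.0, 3.0, 3.0], 2.0, [(0, 1), (1, 2), (2, 3)])
print(wv, mi)
```

In this toy case the positive MI reflects that adjacent segments tend to have similar means, which corresponds to low intersegment heterogeneity in the sense used above.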
The MI and WV values should be normalized to 0-1 before implementing the F-measure. The normalization can be defined as follows:

$$X_{norm} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_{norm}$ denotes $WV_{norm}$ or $MI_{norm}$, the normalized WV and MI values, respectively; $X$ is the WV or MI value; and $X_{\max}$ and $X_{\min}$ represent the maximum and minimum WV or MI values of all generated segmentations, respectively [22]. A high $WV_{norm}$ value represents low intrasegment homogeneity, and a low $MI_{norm}$ value represents high intersegment heterogeneity. Furthermore, $WV_{norm}$ and $MI_{norm}$ are calculated for each spectral band and subsequently averaged [22,39]. Finally, the F-measure is used to combine the WV and MI values to measure the "overall goodness" (OG), as follows:

$$\mathrm{OG} = (1 + b^2) \, \frac{WV_{norm} \cdot MI_{norm}}{b^2 \cdot MI_{norm} + WV_{norm}}$$

where $b$ is the relative contribution of $WV_{norm}$ and $MI_{norm}$. In this paper, we consider $WV_{norm}$ and $MI_{norm}$ to have identical weights, i.e., $b = 1$.
The second PO method is WM, proposed by Wang et al. (2018) [34]. The homogeneity indicator and the combination strategy of this approach are similar to those in JSM, although a different heterogeneity indicator is adopted [34]. The Jeffries-Matusita (JM) distance, which is used as the heterogeneity indicator, has been demonstrated to be effective in evaluating the segmentation quality. The JM distance is typically used to measure the spectral separability between two class density functions [40,41]. Thus, the spectral heterogeneity of two adjacent segments can be measured using the JM distance. For more information regarding the JM distance, please refer to the work of Wang et al. [34].
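The normalization and F-measure combination used by JSM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the min-max direction (X - X_min) / (X_max - X_min) is inferred from the wording of the text, the function names are hypothetical, and the zero-denominator guards are additions for robustness rather than part of the published method.

```python
def min_max_normalize(values):
    """Rescale a series of WV or MI values to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Guard (assumption): a constant series carries no ranking information.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def overall_goodness(wv_norm, mi_norm, b=1.0):
    """F-measure combination of the two normalized indicators.

    b is the relative weight; b = 1 gives both indicators identical weight.
    """
    denom = b * b * mi_norm + wv_norm
    if denom == 0.0:
        return 0.0  # guard (assumption): both indicators are zero
    return (1.0 + b * b) * wv_norm * mi_norm / denom


# One OG value per candidate scale, computed from raw indicator series.
wv_series = [2.0, 4.0, 6.0]     # hypothetical WV values at three scales
mi_series = [0.9, 0.5, 0.1]     # hypothetical MI values at the same scales
og = [overall_goodness(w, m)
      for w, m in zip(min_max_normalize(wv_series),
                      min_max_normalize(mi_series))]
print(og)
```

Because the F-measure is a product-based combination, any scale at which either normalized indicator equals zero receives an OG of zero, so the two ends of the evaluated scale range cannot be selected in this toy example.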
The third PO method is ZM, proposed by Zhang et al. (2012) [18]. This method uses two metrics (T and D) to measure the intrasegment homogeneity and intersegment heterogeneity, respectively. T is computed for the segmented image I from the image size S, the number of regions R, the mean error E_i of the feature vector in region i, and the area A_i of region i [18]. D, which is used to measure the intersegment heterogeneity, represents a normalized variance that considers the mean feature vector [18]. D is computed from m_ij, the mean spectral value of band i in region j; mm_i, the mean of the spectral mean values of band i over all regions; and c, the number of spectral bands. The variance increases with the number of regions in the segmentation result; therefore, D can be scaled by √R [18]. The T and D values are normalized to 0-1 before implementing the OG strategy. OG_z is the weighted sum of T and D:

$$OG_z = T + \lambda D$$

The values of T and D are large when oversegmentation and undersegmentation occur, respectively. Because another normalization operation is implemented in the original method, the optimal segmentation pertains to the result with the maximum OG_z. The weight λ is determined from the change rate of T with respect to D [26].
(iii) PO of semantic geo-objects
Based on the scale parameter optimization results of single geo-objects, we perform the PO of semantic geo-objects.
The PO of semantic geo-objects is an iterative process that proceeds from small single geo-objects to large single geo-objects to semantic geo-objects. In the optimization process, the results of the first optimization are usually associated with small and medium single geo-objects. The second optimization produces results pertaining to medium single geo-objects and semantic geo-objects. The third and subsequent optimizations gradually produce the semantic segmentation results. Usually, semantic geo-objects are formed by a combination of single geo-objects, and the choice of semantic geo-objects is derived from single geo-objects. For example, geo-objects such as residential buildings, green belts, and small pools are present in one semantic geo-object (a community). To more accurately recognize semantic geo-objects, we develop an approach to identify the semantic segmentation scale parameters. The details of the proposed approach are presented in Table 2.

Table 2. Approach to determine the optimal semantic segmentation scale.
Input: a series of segmentation results with the MRS from the initial scale (6)
Output: target scale parameters and corresponding segmentation results

Experimental Process
The main experimental process to recognize semantic geo-objects is as follows. First, a series of segmentation results are produced using the MRS method. The analysis of the segmentation results indicates that the test images are considerably oversegmented and undersegmented when the scale parameter is set as 6 and 70, respectively; therefore, we vary the scale parameter from 6 to 70 in increments of 2. Both the compactness and shape parameters are set as 0.1. Second, the three PO methods, JSM, WM, and ZM, are adopted to verify the proposed approach [18,22,34]. Traditionally, the JSM, WM, and ZM assume that the scale with the maximum value of the objective function (OF) corresponds to the optimal segmentation result. Third, we determine whether the first PO results conform to semantic geo-objects. If they do, the scale is considered the most suitable for segmenting semantic geo-objects. If the result is oversegmented relative to the semantic geo-objects, we take the scale as the new initial scale of the optimization interval; otherwise, we take it as the new final scale. The scale is examined iteratively until the semantic segmentation requirements are satisfied.
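The iterative interval reset described above can be sketched as a short Python loop. This is a hedged illustration, not the authors' implementation: `evaluate` and `judge` are hypothetical callbacks (the paper computes OF values with JSM, WM, or ZM and judges the result visually), the OF values are assumed to be recomputed on each reduced interval, and the toy functions below exist only to make the example runnable.

```python
def search_semantic_scale(scales, evaluate, judge, max_stages=10):
    """Iteratively shrink the scale interval until judge() accepts a scale.

    scales   -- sorted list of candidate scale parameters
    evaluate -- hypothetical callback: list of scales -> list of OF values
    judge    -- hypothetical callback: scale -> "over", "under", or "ok"
    Returns (accepted scale or None, list of optimal scales per stage).
    """
    lo, hi = 0, len(scales) - 1        # index bounds of the current interval
    history = []
    for _ in range(max_stages):
        window = scales[lo:hi + 1]
        if len(window) < 2:
            break
        of_values = evaluate(window)
        best = window[max(range(len(window)), key=of_values.__getitem__)]
        history.append(best)
        verdict = judge(best)
        if verdict == "ok":
            return best, history
        idx = scales.index(best)
        if verdict == "over":          # oversegmented: reset the initial scale
            if idx == lo:
                break                  # interval cannot shrink any further
            lo = idx
        else:                          # undersegmented: reset the final scale
            if idx == hi:
                break
            hi = idx
    return None, history


def toy_evaluate(window):
    """Toy OF curve peaking a quarter of the way into the interval."""
    first, last = window[0], window[-1]
    return [((s - first) / (last - first)) * (1 - (s - first) / (last - first)) ** 3
            for s in window]


def toy_judge(scale):
    """Toy stand-in for the visual check: scales 40-50 count as semantic."""
    if scale < 40:
        return "over"
    if scale > 50:
        return "under"
    return "ok"


scales = list(range(6, 72, 2))         # 6, 8, ..., 70
result, stages = search_semantic_scale(scales, toy_evaluate, toy_judge)
print(len(scales), result, stages)
```

With these toy callbacks the search needs three stages, and each stage starts from the optimal scale found in the previous one. In practice the visual judgment and the real OF curves determine how many stages are required.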

Results of Scale Parameter Optimization for Single Geo-Objects
To obtain the segmentation scale parameters of single geo-objects, we perform a series of calculations. We obtain 33 segmentation results by using the MRS method embedded in the eCognition Developer 9.0 software, varying the scale parameter from 6 to 70 in steps of 2 (6, 8, 10, 12, etc.). For these scale parameters, we obtain 33 OF values for each of the three PO methods by calculating the WV, MI, JM distance, T, and D. The OF values based on the JSM, WM, and ZM for each scale parameter are shown in Figure 3.
As shown in Figure 3, the curves exhibit a similar trend. Specifically, the OF values first increase and subsequently decrease with increasing scale, with fluctuations in certain cases. ZM yields OF values with several large fluctuations for P3, which reflects the instability of the method for this image; however, ZM is stable on the other test images. The scale with the highest OF value is considered the optimal scale.
The optimal scales and corresponding OF values obtained using the three methods are listed in Table 3. The maximum OF values obtained using the JSM, WM, and ZM are 0.6324, 0.5543, and 1.8595 for P1; 0.5138, 0.5846, and 1.8071 for P2; 0.5727, 0.5978, and 1.9607 for P3; and 0.6432, 0.5730, and 1.9612 for P4, respectively. The scale parameters corresponding to the maximum OF values are considered optimal. Therefore, the first optimal scales obtained using the JSM, WM, and ZM are 16, 20, and 16 for P1; 26, 24, and 16 for P2; 18, 20, and 68 for P3; and 20, 20, and 18 for P4, respectively. Because the internal indicators of the considered PO methods are different and ZM adopts a different combination method, we obtain different results for the first scale optimization.
Subsets of the segmentation results are shown in Figure 4 to enable a visual comparison of the results of the first scale optimization performed using the JSM, WM, and ZM. For clear observation of the effect, partial regions of the four test images are shown. Figure 4 indicates that the JSM, WM, and ZM can effectively segment small and medium single geo-objects. For example, in Figure 4a-c, the small rooftops are well segmented. In Figure 4g-i, different rooftop shapes are well segmented. Figure 4m,n exhibit a satisfactory segmentation of small and medium rooftops and a grass path. In the middle parts of Figure 4s-u, three medium natural geo-objects are well segmented. However, oversegmentation occurs for several large single geo-objects and semantic geo-objects. For example, in Figure 4j-l, the rooftops of large factories are oversegmented. Figure 4d-f show the occurrence of oversegmentation for forests. In Figure 4p,q, the semantic areas of unused land are segmented into small fragments. In addition, grass on the side of the houses is oversegmented in Figure 4s-u. Figure 4v-x show that a piece of unused land containing vegetation is split into fragmented segments.
With the exception of Figure 4o,r, the subsets of the segmentation results in Figure 4 clearly demonstrate that the semantic geo-objects are not well recognized. In several remote sensing applications, because a larger area must be considered, certain semantic geo-objects are more meaningful than a single geo-object. For example, a study may need to obtain the extent of a residential area on the remote sensing image, and such an area contains different types of buildings along with small green belts and other geo-objects. In this scenario, directly segmenting the extent of the residential area is a more effective strategy. Specifically, we must develop an approach to recognize semantic geo-objects. Because the first scale optimization does not yield satisfactory semantic segmentation results, the proposed approach searches for suitable scale parameters to obtain the semantic segmentation results, as described in the following sections.

Results of Scale Parameter Optimization for Semantic Geo-Objects
The iterative process of searching for the scale parameter that can optimally segment semantic geo-objects is shown in Table 4. For P1, the three PO methods involve four stages of scale parameter optimization, and the semantic segmentation results are generated in the fourth stage. With the ZM, however, the scale selected in the fourth stage is the maximum scale parameter and yields undersegmented results; therefore, it is inferred that the target scale parameter of the ZM corresponds to the third stage. Thus, the scale parameters of the JSM, WM, and ZM are set as 52, 58, and 48 for P1, respectively. For P2 and P4, the three PO methods involve three stages of scale parameter optimization, and the semantic segmentation results are generated in the third stage. The optimized scale parameters of the JSM, WM, and ZM are set as 60, 64, and 66 for P2 and 48, 54, and 48 for P4, respectively. For P3, both the JSM and WM involve four stages of scale parameter optimization, whereas the ZM reaches the optimized scale parameter after two stages. The optimized scale parameters of the JSM, WM, and ZM are set as 44, 52, and 68 for P3, respectively. In addition, across the different remote sensing images, the scale parameters produced by the four-stage optimization are, in certain cases, not larger than those produced by the three-stage optimization.

To further validate the proposed approach, Figure 5 shows the segmentation results with the scales produced using the proposed approach. To observe the overall effect, we show the entire area of the four test images.

As shown in Figure 5, most semantic geo-objects are well recognized. In Figure 5a-c, several residential areas containing different types of single geo-objects are effectively identified. Figure 5d-f exhibit satisfactory semantic segmentation for large homogeneous rooftops, large yards, and vegetation belts. In the upper-middle part of Figure 5h,i, a residential zone, which is a semantic geo-object with different spectral features that contains houses, a green belt, and pools, is well segmented. In the upper-right area, Figure 5h,i show the effective delineation of the unused land containing different types of geo-objects. Figure 5g displays a slightly inferior result. Figure 5j-l show the successful identification of a long channel in the image range; this type of artificial channel is a typical semantic geo-object in reality.
However, in Figure 5a-c, small rooftops with notable spectral features are separated from the surrounding geo-objects, and in Figure 5d-f, a large rooftop is not fully recognized because of the large spectral contrast. These cases are unavoidable in the current experiments because effective segmentation is difficult when the spectral contrast is extremely high on the surface of a large single geo-object or between two adjacent geo-objects [42]. Although minor imperfections are noted for the JSM, WM, and ZM, satisfactory semantic segmentation can be realized using the optimized target scale parameters in the four test images.

Discussion
Image segmentation is a crucial task because it can provide objects for GEOBIA. Effective segmentation must be ensured to enable subsequent image interpretation. Intuitively, it is meaningful to obtain target objects, such as roads, houses, and transportation [43][44][45][46]. Hence, scale parameter optimization is a key step in achieving the desirable segments. In general, the segmentation result obtained using the PO method can be quantitatively evaluated using discrepancy measurement methods. Many discrepancy measurement approaches have been successfully applied to quantitatively evaluate single geo-objects produced by PO methods, as described in Section 2.2.2 [29,47-56]. However, only a few studies have quantitatively evaluated semantic segmentation results because the understanding of semantic geo-objects is subjective, and experts often differ in their opinions regarding the definition of semantic geo-objects. Currently, it is difficult to quantitatively evaluate the results of semantic segmentation based on a single criterion. Therefore, we used a visual evaluation method in this paper. Future research can be aimed at establishing a quantitative evaluation method for semantic segmentation.
In certain cases, to recognize semantic geo-objects, a weakened spectral difference is required within a semantic geo-object. From this perspective, low- and medium-spatial-resolution images may be more suitable for segmenting semantic geo-objects. However, the geo-objects in cities are often relatively small, and high-spatial-resolution images can provide abundant and detailed geo-object information. In such cases, low- and medium-spatial-resolution images are suboptimal for segmenting single geo-objects. For example, images with a high spatial resolution can help identify small urban geo-objects, although this identification cannot be realized through images with a low spatial resolution. In practical applications, the combination of single and semantic geo-objects is most useful. For example, if the objective is to visit a building in a park, the location of the park is first identified. When we arrive at the park, our focus shifts to the building. Future work will be focused on the recognition and union of single and semantic geo-objects.
Our results (Table 4) show that most of the optimal scales were obtained in the third or fourth stage of the PO. Thus, semantic geo-objects can generally be segmented effectively after three iterative stages of PO. In addition, most of the scales identified in the earlier stages oversegmented the semantic geo-objects, with slight undersegmentation in a few cases. Thus, the scale obtained in the first optimization was used as the initial scale in most cases. However, when the ZM was used for optimization, the scale recognized in the first stage produced segments larger than the semantic criterion for one image. In such cases, it is necessary to reset the final scale of the scale range and optimize the parameter again. Finally, this paper assessed only three PO methods. In future research, more PO methods can be considered to validate the proposed approach and enhance the method of recognizing semantic geo-objects.
A widespread phenomenon in remote sensing images is that a semantic geo-object of the kind considered in this paper, such as a park, may contain different types of geo-objects, yet its features are general and manifest as a certain homogeneity among the types of geo-objects it contains. This can be observed in real scenes, in which many features in parks exhibit similarities. This observation provides a realistic basis for exploring the feasibility of the proposed technique. A semantic geo-object is often formed by a series of single geo-objects, e.g., a semantic area contains several geo-objects with a common shape, texture, and color. These common features are a manifestation of the homogeneity of a semantic geo-object. Future work will be aimed at techniques to investigate the internal homogeneity of a semantic geo-object and to enhance the heterogeneity of a semantic geo-object relative to its surroundings to more accurately recognize semantic geo-objects.
The key objective of this study was to establish a technique to search for the suitable scale parameter of semantic geo-objects in high spatial resolution images. In this paper, three scale parameter optimization methods were used to evaluate the feasibility of the proposed approach. After the first PO, the optimized scale parameters of single geo-objects were obtained. Based on these optimized scales, the optimization was iteratively performed. Finally, satisfactory semantic segmentation results were obtained for four test images. However, at present, no comparable techniques are available for the proposed exploratory approach. We believe that this study can provide a concept for further research and encourage other researchers to provide a similar strategy for comparison. The proposed approach was developed by considering actual research and applications to obtain the spatial scope of a region in a real scene and is thus believed to be of significance for application. We hope that this paper can provide references for future semantic segmentation research.

Conclusions
An approach to search for the optimal scale parameters of semantic segmentation was developed. The scales were searched through iterative PO by continuously reducing the range of optimization. This paper used the MRS algorithm as the segmentation method and considered three PO methods (JSM, WM, and ZM) to validate the proposed approach. GF-1 images were used as the test images, and the visual experiments demonstrated the efficiency of the proposed approach in determining the scale that can generate satisfactory semantic geo-objects. Future work will be aimed at enhancing semantic segmentation scale parameter determination and evaluation methods.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.