Object-Oriented Building Contour Optimization Methodology for Image Classification Results via Generalized Gradient Vector Flow Snake Model

Abstract: Building boundary optimization is an essential post-processing step for building extraction by image classification. However, current boundary optimization methods based on smoothing or line fitting principles are unable to optimize complex buildings. In response to this limitation, this paper proposes an object-oriented building contour optimization method that uses an improved generalized gradient vector flow (GGVF) snake model and starts from the initial building contours obtained by a classification method. First, to reduce interference from adjacent non-building objects, each building object is clipped via its extended minimum bounding rectangle (MBR). Second, adaptive threshold Canny edge detection is applied to each building image to detect the edges, and the progressive probabilistic Hough transform (PPHT) is applied to the edge result to extract the line segments. For cases with missing or wrong line segments in some edges, a hierarchical line segment reconstruction method is designed to obtain complete contour constraint segments. Third, accurate contour constraint segments for the GGVF snake model are designed to quickly find the target contour. With the help of the initial contour and the constraint edge map for GGVF, a GGVF force field computation is executed, and the related optimization principle can be applied to complex buildings. Experimental results validate the robustness and effectiveness of the proposed method, whose contour optimization has higher accuracy and comprehensive value compared with that of the reference methods. This method can be used for effective post-processing to strengthen the accuracy of building extraction results.


Introduction
Building extraction from high-resolution remote sensing images plays a key role in mapping, city planning and management, and disaster damage analysis and response. Classification is an important building extraction approach that employs either traditional machine learning [1,2] or deep learning [3,4]. However, given the presence of shadows and vegetation occlusion, the interference of similar spectra, and the complexity of building structures, the building extraction results are often irregular [5,6] and fail to meet real application requirements. Meanwhile, many researchers have focused on improving building extraction accuracy rather than on optimizing building contours. Therefore, designing an optimization method that can enhance the similarity between building detection results and real building shapes has become imperative.
The optimization methods adopted by researchers mainly include the dominant direction method, the bounding rectangle method, and the active contour model method.
The first category of regularized building boundary extraction methods is based on the prevailing building direction. Lee et al. [7] proposed an optimized method by making up a building with regular grids. First, the straight-line segments identified via Hough transformation were obtained based on the initial building results derived from the classification. Second, a grid was constructed by acquiring the dominant line and the line that is parallel and perpendicular to the dominant line. Third, after calculating the building proportion in each grid cell and judging which cell belongs to the building, the building is reconstructed by using those grid cells that meet the specified conditions. Albert et al. [8] used the alpha shape algorithm to obtain the initial building contour. The building dominant direction and line segments were extracted via Hough transformation and were combined with the corner points to obtain a regular contour via energy minimization. Similarly, based on the hypothesis that the adjacent building edges are perpendicular to one another, the hierarchical least-squares solution was applied in building contour optimization [9]. Initially, relatively long line segments were extracted from the building boundary, and their least-squares solution was determined under the assumption that these long line segments lie in two mutually perpendicular directions. In the next step, all line segments were included to determine the least-squares solution by using the slopes of the long line segments as weighted approximations. Partovi et al. [10] adopted the random sampling consistent method to fit those building edges with more than vertical correlation. Ding et al. [11] grouped the corner points and applied the least-squares method to fit each group of corner points into a straight line. 
After rotating the straight-line segments to be perpendicular or parallel to the dominant building direction, the building was regularized by connecting the intersection points of adjacent line segments. Although these methods regularize the building boundary to a certain extent, they have high requirements for initial building results. When the initial result is not good enough, these regularization methods lack the ability to further judge the real physical boundary of the building and can only reduce the serration segments on the contour boundary.
The second category of regularized building boundary extraction methods is based on MBR. By viewing a building as a group of rectangles, Kwak et al. [12] decomposed the initial building results into different rectangular parts, obtained their MBRs, and regularized the building by reconstructing these MBRs. However, when vegetation or other non-building objects surrounding a building are incorrectly classified as a building, the MBR edges would not be fitted with the real building edge, thereby leading to an inaccurate optimization. To solve this problem, Feng et al. [13] used topological rules to check whether the building edge is occluded by vegetation. The edge points in a line were then used to optimize the occluded edge. This method efficiently handles the occlusion problem and improves the integrity of building extraction. Chang et al. [14] used the Hausdorff distance algorithm to evaluate the similarity between the building and MBR boundaries. The coordinate of the relative MBR boundary was then used to replace the partial boundary that was similar to MBR. For those boundary parts not optimized by these methods, the corner points were detected, filtered, and reconnected for regularization. While these methods can further optimize building boundaries, they have more stringent requirements for the shape of buildings. These methods can only be used to optimize rectangular or rectangular combination buildings with right angles. Moreover, they cannot be well regularized for buildings with non-right angle or arc features.
The third category of regularized building boundary extraction methods is based on an active contour model [15]. Specifically, these methods combine the low-level image information with high-level prior knowledge to extract and optimize buildings. Peng et al. [16] designed an improved snake model that combined radiation features with context information to optimize building boundaries. They obtained satisfactory results in their experiments in dense urban areas. Ahmadi et al. [17] explored an active contour model that extracts the building boundary based on a contour set formula. They also used the HSV components to modify the traditional active contour segmentation model for building extraction. To extract buildings from imageries and LiDAR point cloud data, the active contour model is often used to improve the completeness and correctness of building results [18,19]. Those methods based on the active contour model are not limited by the shape of buildings, thereby supporting the reliability and practicability of the active contour model in the optimization of building extraction. However, some problems need to be addressed when using the active contour model. For example, this model requires a manual setting of the initial contour, and the concave boundary does not converge to the actual edge.
In sum, the current methods rely heavily on the initial results and are limited either by the regular building shapes they assume or by the active contour model itself. To optimize the building contour so that it is closer to the actual physical shape, this paper develops an object-oriented building contour optimization method for image classification results based on the GGVF snake model [20]. First, based on the initial building outlines extracted by classification, the extended MBR clips the original image to obtain a single building sub-image, thereby reducing the subsequent computational load and global impact. Afterward, based on the GGVF snake model, an improved strategy is developed to enhance adjustability. On the one hand, the initial building boundary is eroded and used as the GGVF initial contour. On the other hand, accurate contour constraint segments for the GGVF snake model are designed to rapidly identify the target contour. Adaptive threshold Canny edge detection and PPHT [21] are used to obtain an accurate and adjustable contour edge map. Finally, after computing the GGVF force field, the improved edge map is used to quickly minimize the energy function down to the building constraint contour to obtain the final optimized contour.
The main highlights of this work are summarized as follows: (1) This study adopts an object-oriented optimization strategy to reduce the interference from adjacent objects. The single building image after clipping can make the optimization focused on the building, reduce the impact of other adjacent objects, and greatly decrease the amount of calculations. (2) An improved GGVF snake model is designed by automatically obtaining the initial contour and constraint edge map. The initial contour is modified from the classification results, and the constraint edge map is extracted effectively via the improved Canny detector and Hough transformation. The proposed method is not limited by the building shape and size and has strong robustness.
The rest of this paper is arranged as follows: In Section 2, the optimization method is introduced in detail. The experimental conditions, results, and analysis are given in Section 3. In Section 4, the proposed method is discussed in detail. Finally, Section 5 concludes the whole paper.

Methodology
The GGVF snake model can improve active contour convergence into long, thin indentations (LTIs) while maintaining the other desirable properties of gradient vector flow (GVF) [22], such as its extended capture range. Therefore, this model is useful in optimizing the building active contour. However, the initial seed points and the edge map are vital factors in model convergence. This paper designs a strategy for automatically providing effective initial seed information and an effective edge map. Based on the GGVF model and the proposed strategy, the building contour can be optimized accurately and automatically.
First, the initial building results should be obtained via classification or other methods. To reduce the interference from adjacent object information, each building object image can be clipped from the original image according to the boundary rectangle of its initial building result. Afterward, adaptive threshold Canny edge detection is applied to each clipped building image to detect the edges, and PPHT is used to detect the line segments in these edges. These line segments are reconnected as the outer constrained edge, and the initial building contour is used as the inner seed information, both of which are inputted into the GGVF snake model. With this combination, the energy can be rapidly minimized to bring the initial contour close to the real outline. The building contour can be optimized by making full use of the line segment information. The flowchart is shown in Figure 1.

GGVF Snake Model
By introducing two spatially varying weighting functions into the GVF formulation, Xu and Prince proposed an external force called GGVF [20]. As a generalization of GVF, GGVF was reported to improve contour convergence to LTIs and robustness to noise. GGVF is defined as the equilibrium solution to the following partial differential equation:
$$\mathbf{v}_t(x, y, t) = g(|\nabla f|)\,\nabla^2 \mathbf{v}(x, y, t) - h(|\nabla f|)\,[\mathbf{v}(x, y, t) - \nabla f] \tag{1}$$
where
$$g(|\nabla f|) = e^{-|\nabla f| / k} \tag{2}$$
$$h(|\nabla f|) = 1 - g(|\nabla f|) \tag{3}$$
Here, t is the iteration time, v(x, y, t) is iteratively calculated from the initial vector field v(x, y, 0), v_t(x, y, t) denotes the partial derivative of the vector field v(x, y, t) with respect to t, and ∇² is the Laplacian operator. |∇f| is the gradient magnitude of the edge map f. The weighting functions g(·) and h(·) apply to the first term (smoothing term) and the second term (data term) on the right-hand side of Equation (1), respectively. The parameter k regulates to some extent the tradeoff between the smoothing and data terms and should be set according to the amount of noise in the image. According to Equations (2) and (3), as k increases, g(|∇f|) increases and h(|∇f|) decreases. That means, as k becomes larger, the GGVF external forces at the indentation boundary are more easily changed. In addition, the larger the tradeoff parameter k, the greater the impact of noise. Therefore, it is necessary to estimate the appropriate maximum k for a given image [23]. The range of k is usually (0.01, 0.2).
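The effect of k on the two weights can be checked numerically. The following is a minimal sketch, assuming the exponential weighting of Xu and Prince, g = exp(−|∇f|/k) and h = 1 − g; the function name `ggvf_weights` and the sample gradient magnitude of 0.1 are illustrative choices, not taken from the paper.

```python
import math

def ggvf_weights(grad_mag, k):
    """Return (g, h) for one gradient magnitude and tradeoff parameter k.

    Assumes g = exp(-|grad f| / k) and h = 1 - g.
    """
    g = math.exp(-grad_mag / k)  # weight of the smoothing term
    h = 1.0 - g                  # weight of the data term
    return g, h

# For a fixed gradient magnitude, a larger k gives a larger g and a smaller h.
for k in (0.01, 0.05, 0.2):
    g, h = ggvf_weights(0.1, k)
    print(f"k={k:<5} g={g:.4f} h={h:.4f}")
```

With a small k the data term dominates wherever the edge map has any gradient, whereas a large k lets smoothing act even near weak edges, which matches the sensitivity to noise described above.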

Object-Oriented GGVF Initial Contour Acquisition
The information surrounding a building in high-resolution remote sensing imagery, such as adjacent buildings, vegetation, roads, and other objects, is sometimes wrongly recognized, leading to reductions in detection accuracy and efficiency. Therefore, the analysis range of each building is shrunk by clipping each building as a single object according to its boundary rectangle. Specifically, the initial building results can be acquired by using classification or other methods. In this paper, the shadow-shifted classification method proposed in [1] is applied, and the boundary rectangle of each building is obtained and dilated several times to derive its clipping range. The building image object is eventually obtained. Figure 2 shows the schematic diagram of how the building object images are acquired. The initial building contour obtained in this step not only provides the initial building seed information to the GGVF snake model but also can be used to obtain a clipped building image in which line segments are detected as constraint boundary information for optimizing the contour in the GGVF snake model.
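The clipping step can be sketched as follows. This is only an illustration under stated assumptions: the rectangle is taken to be axis-aligned, the building is given as a binary mask, and the `margin` parameter (in pixels) stands in for the repeated dilation of the rectangle, whose actual amount is not specified in the text. The function names are hypothetical.

```python
import numpy as np

def extended_bbox(mask, margin):
    """Axis-aligned bounding rectangle of the nonzero pixels of `mask`,
    grown by `margin` pixels on every side and clamped to the image.

    Returns (r0, r1, c0, c1) with exclusive upper bounds.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask contains no building pixels")
    height, width = mask.shape
    r0 = max(int(rows[0]) - margin, 0)
    r1 = min(int(rows[-1]) + 1 + margin, height)
    c0 = max(int(cols[0]) - margin, 0)
    c1 = min(int(cols[-1]) + 1 + margin, width)
    return r0, r1, c0, c1

def clip_building(image, mask, margin=10):
    """Clip the image and the building mask to the extended rectangle."""
    r0, r1, c0, c1 = extended_bbox(mask, margin)
    return image[r0:r1, c0:c1], mask[r0:r1, c0:c1]
```

Clipping the mask together with the image keeps the initial contour and the sub-image in the same pixel coordinates for the later edge detection steps.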

GGVF Constraint Edge Map Extraction
Accurate contour constraint segments are important for the GGVF snake model to find the target contour quickly in the optimization process. In this paper, an adaptive threshold Canny edge detection is applied to each building image to detect the edge. Afterward, PPHT is applied to the edge result to extract the line segments. For those cases with missing or wrong line segments in some edges, a hierarchical line segments reconstruction method is proposed to obtain complete contour constraint segments.

Adaptive Canny Edge Detection
The reliable building edges extracted by the Canny detector are vital for line segment detection using PPHT. First, each clipped image is smoothed via Gaussian convolution. The 2D first derivatives are then used to calculate the gradient magnitude and direction. The first-order derivative of an image f(i, j) at location (i, j) is defined as the following 2D vector [24]:
$$\nabla f(i, j) = \begin{bmatrix} G_i \\ G_j \end{bmatrix} = \begin{bmatrix} \partial f / \partial i \\ \partial f / \partial j \end{bmatrix} \tag{4}$$
The absolute gradient magnitude (edge strength) is computed as
$$|\nabla f(i, j)| = \sqrt{G_i^2 + G_j^2} \tag{5}$$
The gradient direction (edge orientation) is defined as
$$\theta(i, j) = \arctan\left(G_j / G_i\right) \tag{6}$$
Second, a non-maximum suppression process is applied to the gradient magnitude image to remove the pixels that are not local maxima. Only those pixels with an edge strength higher than that of their two adjacent pixels in the gradient direction are identified as edge candidates. To remove false edge segments caused by noise and fine texture, a hysteresis tracking process is further applied with two thresholds in which all candidate edge pixels below the low threshold are labeled as non-edges, whereas all pixels above the low threshold that can be connected to any pixel above the high threshold through a chain of edge pixels are labeled as edge pixels.
Remote Sens. 2021, 13, 2406
These two thresholds should be suitable for each building to obtain accurate edge detection results. To extract the edge accurately, an adaptive threshold strategy is employed to determine the optimal thresholds for each building. First, the gradient magnitude is normalized to (0, 1). Based on the histogram of the gradient magnitude, the cumulative sum at a given grey level is calculated by consecutively adding the histogram entries of all lower grey levels. P_NE denotes the amount of non-edge points in the whole image:
$$P_{NE} = K \times I_m \times I_n \tag{7}$$
where K is the assumed proportion of non-edge points among all pixels, and I_m and I_n are the image height and width, respectively. Assume that L_F_hist is the first grey level whose cumulative sum is greater than P_NE, and L_T_hist is the total number of grey levels of the histogram, where L_T_hist = 64. The high threshold T_high is the ratio between L_F_hist and L_T_hist and can be determined automatically:
$$T_{high} = \frac{L_{F\_hist}}{L_{T\_hist}} \tag{8}$$
The low threshold T_low is then defined as a fraction of T_high:
$$T_{low} = R \times T_{high} \tag{9}$$
where R ∈ (0, 1).
As long as R is suitable, T low can be determined automatically and adaptively. A series of tests show that R= 0.4 and K = 0.7 are suitable experimental values for most buildings to obtain a reasonable edge. Therefore, the improved automatic threshold strategy is useful for improving the edge accuracy and providing a solid basis for line detection based on PPHT.
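The threshold selection described above can be sketched with a 64-bin histogram. This is a minimal illustration, assuming that P_NE is the pixel count K × I_m × I_n and that the cumulative histogram is compared against it; the function name `adaptive_canny_thresholds` is hypothetical, while the defaults K = 0.7, R = 0.4, and 64 levels follow the values reported in the text.

```python
import numpy as np

def adaptive_canny_thresholds(grad_mag, k=0.7, r=0.4, n_bins=64):
    """Return (t_low, t_high) in [0, 1] from a gradient magnitude image.

    k: assumed proportion of non-edge pixels, must lie in (0, 1).
    r: ratio of the low threshold to the high threshold.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie in (0, 1)")
    mag = np.asarray(grad_mag, dtype=float)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak                      # normalise to [0, 1]
    counts, _ = np.histogram(mag, bins=n_bins, range=(0.0, 1.0))
    cumulative = np.cumsum(counts)
    p_ne = k * mag.size                       # K * I_m * I_n
    # First grey level (1-based) whose cumulative count exceeds p_ne.
    level = int(np.argmax(cumulative > p_ne)) + 1
    t_high = level / n_bins
    t_low = r * t_high
    return t_low, t_high
```

Because the thresholds are derived from each clipped image's own gradient histogram, a low-contrast roof and a high-contrast roof receive different absolute thresholds while keeping the same proportion of candidate edge pixels.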

Line Segments Detection and Optimization Based on PPHT
The building line segments are extracted to construct the building constraint edge map. PPHT can further extract line segments from those edges obtained by the Canny detector. However, given the complex background of buildings, some line segments extracted by PPHT may be inaccurate or insufficient to construct the constraint edge map. Therefore, based on the line segments extracted by PPHT, a strategy for customizing an effective constraint edge map for different conditions is defined. First, line fitting is applied to merge adjacent parallel line segments and thereby reduce interference. Second, line segments near the image edge, which are wrongly detected because of the image clipping, are deleted. The topology relationship between each line segment and the initial building contour is also taken into account to delete ineffective segments. Third, a complementary strategy is applied to complete the constraint edge map by filling in the line segments that are missing in some directions.

Line Segments Extraction by PPHT
The basis of the Hough transformation is given in Equation (10):
$$H(\theta, \rho) = \iint F(x, y)\,\delta\big(\rho - (x\cos\theta - y\sin\theta)\big)\,dx\,dy \tag{10}$$
where (θ, ρ) and (x, y) represent the Hough transformation and image domains, respectively, whereas δ refers to the Dirac delta function. Each point (x, y) in the original image F(x, y) is transformed into a sinusoid ρ = x cos(θ) − y sin(θ), and H(θ, ρ) represents the total number of sinusoids that intersect at the point (θ, ρ). Therefore, Equation (10) returns the total number of points comprising the line in the original image. By choosing a line-cut threshold T for H(θ, ρ) and by using the inverse Hough transformation, the original image is filtered in order to keep only those lines that contain at least T points [25].
The line segments are extracted by using PPHT, which is an improvement on the Hough transformation. The main process is to set up an accumulator for each cell in the parameter space, then randomly select a foreground (edge) point in the image and map this point to the parameter space as a curve. When the number of curves intersecting at a point in the parameter space reaches the minimum threshold, the line corresponding to this point is identified. The points lying on this line are then searched for in the image, and the qualified points are connected into line segments. If the length of a line segment reaches the minimum length, the starting and ending points of the line segment are recorded, and the above steps are repeated.

Line Segments Fitting
After extracting the line segments, those that are parallel and close enough to one another are merged to reduce similar segments. First, all detected line segments are grouped into the set $L_{set\_1} = \{l^1_1, \ldots, l^1_n\}$ and copied as another set $L_{set\_2} = \{l^2_1, \ldots, l^2_n\}$. Each line segment in $L_{set\_1}$ is compared with all the segments in $L_{set\_2}$ to measure the perpendicular distance and the intersection angle between them. Assume that $l^1_i$ and $l^2_j$ are line segments in $L_{set\_1}$ and $L_{set\_2}$, respectively, whereas $P^S_i$ and $P^E_i$ are the start and end points of $l^1_i$, and $P^S_j$ and $P^E_j$ are the start and end points of $l^2_j$. Suppose that the perpendicular distance and intersection angle between $l^1_i$ and the r line segments $\{l^2_{j1}, \ldots, l^2_{jr}\}$ are small enough to meet the merging requirement. In this case, $\{l^2_{j1}, \ldots, l^2_{jr}\}$ are labeled as the matched segments of $l^1_i$. Beginning with $l^2_{j1}$, a new average line segment $l^N_{j1}$ is generated by connecting the midpoint of the two start points with the midpoint of the two end points. Afterward, $l^2_{j1}$ and the corresponding $l^1_{j1}$ are removed from $L_{set\_2}$ and $L_{set\_1}$. $l^N_{j1}$ is then compared with the remaining unmatched line segments in $\{l^2_{j1+1}, \ldots, l^2_{jr}\}$, and the average segment is updated by following the above steps until all matched segments are processed. The final average line segment of $l^1_i$ is then added to $L_{remain}$, and the procedure continues until all segments in $L_{set\_1}$ are processed. The remaining segment group is then used as the simplified set to replace the original segment group.
6: L_temp = Fitting(L_temp, l^2_j)
7: Delete l^1_j from L_set_1
8: Delete l^2_j from L_set_2
9: End
10: Add L_temp to L_remain
11: End
12: Output L_remain
The approximate schematic process of line fitting is shown in Figure 3.

Removing Building Roof Line Segments
Some falsely detected line segments inside the roof region are removed according to their topology relationship with the initial building contour. First, the initial building contours are shrunk via morphological erosion to ensure they are inside the building. Second, the topology relationship between each line segment and the eroded building contour is calculated. If one or more endpoints of a line segment are inside the eroded contour, then the line segment is deleted. Therefore, the remaining line segments around the building contour can be used to accurately construct the building constraint edge map.

Complementing the Incomplete Building Constraint Edge Map
PPHT may not provide complete line segments to construct an effective constraint edge map for GGVF snake model convergence. A reasonable building constraint edge map should contain all edge directions of a building. Therefore, a strategy is developed to complement the missing edge parts. First, each PPHT segment is matched to the initial contour by comparing the distance between them, and the contour parts without matched PPHT segments are labeled as the missing edge parts. Second, for these regions, the Shi-Tomasi corner point detection algorithm is applied to the original image to obtain useful corner points. If corner points are present in such a region, then they are connected to the endpoints in order. Otherwise, the endpoints of the missing parts are connected directly to form a complete constraint edge map. This closed map allows the converged GGVF snake to extract contours accurately, whereas an unclosed or inaccurate edge map would affect the final contour.
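The merging procedure from the line segment fitting step can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the angle and distance tolerances (`max_angle`, `max_dist`) are hypothetical values, the perpendicular distance is measured from the midpoint of the candidate segment, and segments are assumed to have non-zero length. Each segment is a pair of (x, y) points.

```python
import math

def _angle(seg):
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi  # undirected orientation

def _angle_diff(a, b):
    d = abs(_angle(a) - _angle(b))
    return min(d, math.pi - d)

def _midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def _perp_dist(seg, point):
    """Distance from `point` to the infinite line through `seg`."""
    (x1, y1), (x2, y2) = seg
    px, py = point
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)) / length

def _fit(a, b):
    """Average segment joining the midpoint of the start points
    with the midpoint of the end points."""
    (a_s, a_e), (b_s, b_e) = a, b
    # Flip b if it runs opposite to a, so start is averaged with start.
    if (a_e[0] - a_s[0]) * (b_e[0] - b_s[0]) + (a_e[1] - a_s[1]) * (b_e[1] - b_s[1]) < 0:
        b_s, b_e = b_e, b_s
    start = ((a_s[0] + b_s[0]) / 2, (a_s[1] + b_s[1]) / 2)
    end = ((a_e[0] + b_e[0]) / 2, (a_e[1] + b_e[1]) / 2)
    return (start, end)

def merge_segments(segments, max_angle=math.radians(5), max_dist=3.0):
    """Greedily merge near-parallel, nearby segments into average segments."""
    pending = list(segments)
    remain = []
    while pending:
        temp = pending.pop(0)
        unmatched = []
        for other in pending:
            if (_angle_diff(temp, other) <= max_angle
                    and _perp_dist(temp, _midpoint(other)) <= max_dist):
                temp = _fit(temp, other)   # update the running average
            else:
                unmatched.append(other)
        pending = unmatched
        remain.append(temp)
    return remain
```

For example, two horizontal segments one pixel apart collapse into a single segment halfway between them, while a perpendicular segment is left untouched.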

GGVF Force Field Computation
Applying the initial contour and the constraint edge map to the GGVF snake model will help this model rapidly reach convergence to obtain the optimized contour [26]. The initial GGVF vectors v(x, y) = [u(x, y), v(x, y)] are generally normalized with respect to their magnitudes via vector-based normalization. To set up the iterative solution, let the spatial sample intervals be ∆x and ∆y, and let the time step for each iteration be ∆t.
As in [20], Equation (1) specifying GGVF can be implemented by using an explicit finite difference scheme, which is stable if the time step ∆t and the spatial sample intervals ∆x and ∆y satisfy
$$\Delta t \le \frac{\Delta x\,\Delta y}{4\,g_{max}} \tag{11}$$
where g_max is the maximum value of g(·) over the range of gradients encountered in the edge map image. An implicit scheme for the numerical implementation of Equation (1) would be unconditionally stable, but it is not required when this condition holds, which allows the faster explicit scheme to be used. An example with intermediate results is shown in Figure 4. The main process includes initial building contour acquisition (Figure 4b), building object clipping (Figure 4c), adaptive threshold Canny edge detection (Figure 4d), PPHT line segment detection (Figure 4e), and GGVF snake iterative calculation to obtain the final result (Figure 4f).
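The explicit scheme can be sketched with NumPy as below. This is an illustrative sketch under several assumptions that are not stated in the text: the weights are g = exp(−|∇f|/k) and h = 1 − g, the field is initialised with ∇f, image borders are replicated when computing the Laplacian, the time step is set to 0.9 of the stability bound, and the defaults `k=0.05` and `n_iter=200` are arbitrary. The function names are hypothetical.

```python
import numpy as np

def _laplacian(a, dx, dy):
    """Five-point Laplacian with replicated borders."""
    p = np.pad(a, 1, mode="edge")
    d2x = (p[1:-1, 2:] - 2.0 * a + p[1:-1, :-2]) / dx**2
    d2y = (p[2:, 1:-1] - 2.0 * a + p[:-2, 1:-1]) / dy**2
    return d2x + d2y

def ggvf_field(edge_map, k=0.05, n_iter=200, dx=1.0, dy=1.0):
    """Iterate the GGVF equation explicitly and return the field (u, v)."""
    f = np.asarray(edge_map, dtype=float)
    fy, fx = np.gradient(f, dy, dx)        # derivatives along rows (y) and columns (x)
    mag = np.hypot(fx, fy)
    g = np.exp(-mag / k)                   # smoothing weight
    h = 1.0 - g                            # data weight
    dt = 0.9 * dx * dy / (4.0 * g.max())   # just inside the stability bound
    u, v = fx.copy(), fy.copy()            # initial field: gradient of the edge map
    for _ in range(n_iter):
        u = u + dt * (g * _laplacian(u, dx, dy) - h * (u - fx))
        v = v + dt * (g * _laplacian(v, dx, dy) - h * (v - fy))
    return u, v
```

Away from the edges, where g is close to 1, the update behaves like plain diffusion and spreads the edge gradients into homogeneous regions; near the edges, where h is close to 1, the field stays pinned to ∇f. A snake driven by (u, v) is therefore pulled toward the constraint edge map from a distance.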

Dataset Description
The main experimental areas are in Illinois, Virginia, and Kansas, as shown in Figure 5. The image spatial resolution is 0.4 m. These experimental areas include buildings that differ in spectra, shape, size, texture, and density. Therefore, these images are adequate to verify the reliability and robustness of the proposed method.


Accuracy Indexes
To objectively evaluate the effectiveness of the proposed method, truth images are delineated by professional staff, and four commonly used accuracy indexes, namely completeness (CM), correctness (CR), comprehensive value (F1), and overall accuracy (OA), are used as measures in comparisons with the truth data [27]:
$$CM = \frac{|V_{TP}|}{|V_{TP}| + |V_{FN}|}, \qquad CR = \frac{|V_{TP}|}{|V_{TP}| + |V_{FP}|},$$
$$F1 = \frac{2 \cdot CM \cdot CR}{CM + CR}, \qquad OA = \frac{|V_{TP}|}{|V_{TP}| + |V_{FP}| + |V_{FN}|},$$
where $|V_{TP}|$ is the total number of building pixels that are classified as buildings, $|V_{FP}|$ is the total number of non-building pixels that are recognized as buildings, and $|V_{FN}|$ is the total number of building pixels that are labelled as non-buildings.
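As a worked illustration of the four indexes, the sketch below computes them from pixel counts. It assumes the standard definitions CM = TP/(TP + FN), CR = TP/(TP + FP), and F1 = 2·CM·CR/(CM + CR); for OA it assumes TP/(TP + FP + FN), because only these three counts are defined in this section. The function name and the example counts are hypothetical.

```python
def building_accuracy(tp, fp, fn):
    """Return CM, CR, F1, and OA from true-positive, false-positive,
    and false-negative pixel counts (assumed definitions, see lead-in)."""
    if tp <= 0:
        raise ValueError("at least one true-positive pixel is required")
    cm = tp / (tp + fn)            # completeness: share of true building pixels found
    cr = tp / (tp + fp)            # correctness: share of detected pixels that are buildings
    f1 = 2 * cm * cr / (cm + cr)   # harmonic mean of CM and CR
    oa = tp / (tp + fp + fn)       # assumed OA: no true-negative term is used
    return {"CM": cm, "CR": cr, "F1": f1, "OA": oa}


# Hypothetical counts for one building object.
scores = building_accuracy(tp=80, fp=10, fn=20)
```

Under these definitions OA is never larger than F1, so a given boundary correction tends to move OA by more percentage points than F1.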

Effectiveness Evaluation for Different Initial Results
To verify whether the proposed method can be applied to the results of different initial methods, the initial building results extracted by the shift shadow algorithm (SSDA) and the traditional GGVF snake method (TD-GGVF) are optimized by using the proposed method. The SSDA method performs building extraction by classifying an image into building, bare land, shadow, and vegetation classes. The building samples are obtained by shifting the shadow region, and the results are verified by using the shadow index. The TD-GGVF method obtains the building contour with the traditional GGVF snake model: the initial seed points are set manually, and the edge map is computed from the original image. These methods are based on different principles, yet both provide irregular initial building results. Therefore, the proposed method is applied to these two initial results to test its effectiveness. The initial building and optimization results of SSDA and TD-GGVF are shown in Figure 6, and the accuracy comparison results are shown in Table 1.
The initial results show that each building extraction method faces the common problem of irregular building boundaries, and the proposed optimization method can further improve the accuracy of these boundaries. The initial results of SSDA are shown in the first column of Figure 6, which shows that the SSDA method can effectively distinguish buildings from confusing bare land by automatically extracting and verifying samples. However, given the spectral similarity between buildings and some non-building objects and the occlusion by adjacent vegetation, some false and missing detections are still observed. As reflected in its results, the TD-GGVF method can efficiently extract a building when its boundary in the original image is clear. Nevertheless, the spectral characteristics of the building and its surroundings are complex, and the extraction result becomes inaccurate when the building boundary is not prominent, when the spectrum of the building roof is not uniform, or when a local abnormal spectrum is present on the roof. The proposed method aims to address the irregularities caused by such wrong recognition. As shown in the optimization results, the more accurate edge extraction and the approximation by the GGVF snake model refine the building boundary and restore the missing and wrongly segmented parts. The building contour constraint line segments in the proposed method also avoid the spectral interference that affects the TD-GGVF method. As shown in Table 1, the comprehensive value and overall accuracy after optimization are higher than those of the initial SSDA and TD-GGVF results: the comprehensive value is improved by 1.41% (SSDA) and 0.98% (TD-GGVF) on average, whereas the overall accuracy is improved by 2.49% (SSDA) and 1.77% (TD-GGVF). The comparison results show that the proposed optimization method can be effectively and automatically applied to different building results acquired by classification or by the snake model.
The optimization is not limited by the building shape and size and can be applied as an effective post-processing tool for improving building regularization and accuracy.

Comparison with Other Contour Optimization Methods
Different contour optimization methods, including the Douglas-Peucker straight-line approximation method (DPSLA) [28], the multi-stars constraint segmentation and regularization method (MCSR), and the proposed method, are applied to the same initial building results (first column of Figure 7) for comparison. DPSLA aims to find a fitting polygon of the building results by removing contour points that lie close to the line segment joining their preceding and following points. MCSR optimizes the building edge by grouping, line fitting, and line rotation after corner extraction. The results of the representative patches in each test image are visualized in Figure 7, and the precision is shown in Table 2.
An analysis of the comparison results in Figure 7 reveals that the proposed method is more powerful and feasible than the other methods in improving buildings of different shapes. DPSLA can approximate some small broken line segments by a straight line, so the building outline is regularized to a certain extent. However, DPSLA is greatly affected by the initial building results and is unable to correct the wrong and missing parts in them. MCSR [11] is useful in optimizing rectangular building contours and helps regularize the initial results. Nevertheless, for complex buildings with arcs or non-right angles, MCSR deals with these edges improperly, thereby leading to misclassification in these areas. By contrast, by combining the spectral and texture characteristics of the building profile, the proposed method makes full use of the active contour model to accurately approximate the real contours. Therefore, this method is not affected by the initial building results and building shapes. Despite some wrong detections or missing parts in the initial buildings, the proposed method correctly self-adapts through the optimization process. The data analysis results in Table 2 show that, compared with DPSLA and MCSR, the proposed method has a comprehensive value that is 1.63% (DPSLA) and 2.16% (MCSR) higher on average and an overall accuracy that is 2.86% (DPSLA) and 3.84% (MCSR) higher on average. Therefore, the proposed method is more powerful and self-adjustable than the other methods in approaching the real building shape.
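For reference, the classical Douglas-Peucker simplification on which DPSLA is based can be sketched as follows. This is the textbook recursive form, not necessarily the exact variant of [28]; the function names, the tolerance value, and the sample polyline are hypothetical. The sketch also illustrates the limitation noted above: the output vertices are always a subset of the input vertices, so wrong or missing parts of the initial contour cannot be corrected.

```python
import math


def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify an open polyline, keeping points farther than `tolerance`
    from the chord between the current end points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Split at the farthest point and simplify both halves.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical jagged, L-shaped contour fragment.
jagged = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (3.1, 1), (2.9, 2), (3, 3)]
simplified = douglas_peucker(jagged, tolerance=0.5)
```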

The Positive Effect of Improved Automatic Canny Detector
The automatic threshold of the Canny detector is vital in determining effective edge segments for the subsequent PPHT step. To illustrate the advantages of the adaptive threshold method in Canny detection, a comparison test using different threshold strategies is performed, as shown in Figure 8. The results in Figure 8b,e are obtained from the whole image by using the global unified threshold method, whereas those in Figure 8c,f are obtained by using the improved automatic threshold strategy, in which each building is detected with its own customized threshold. Given the differences in the characteristics of each building, a single global threshold cannot suit every building, thereby creating deviations in the edge detection results. In comparison, the designed strategy in the Canny detector obtains a different threshold for each building without manual help. Therefore, the adaptive threshold facilitates the accurate detection of edges.
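The difference between a global and a per-building threshold can be illustrated with a small sketch. The threshold rule of the proposed method is not restated in this section, so the example substitutes a common heuristic: the high Canny threshold is taken from Otsu's method applied to the gradient magnitudes, and the low threshold is a fixed fraction of it. The function names, the bin count, the `low_ratio` value, and the sample magnitudes are all assumptions.

```python
def otsu_threshold(values, bins=64):
    """Otsu's threshold on a list of non-negative gradient magnitudes."""
    top = max(values)
    if top == 0:
        return 0.0
    hist = [0] * bins
    for v in values:
        hist[min(int(v / top * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for i in range(bins):
        w0 += hist[i]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += i * hist[i]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalised)
        if var > best_var:
            best_var, best_bin = var, i
    # Upper edge of the best bin, mapped back to magnitude units.
    return (best_bin + 1) / bins * top


def canny_thresholds(gradient_magnitudes, low_ratio=0.5):
    """Return (low, high) hysteresis thresholds for one image or sub-image."""
    high = otsu_threshold(gradient_magnitudes)
    return low_ratio * high, high


# Hypothetical gradient magnitudes of a low-contrast and a high-contrast building.
weak = [1] * 90 + [10] * 10
strong = [5] * 90 + [100] * 10
_, high_weak = canny_thresholds(weak)
_, high_strong = canny_thresholds(strong)
_, high_global = canny_thresholds(weak + strong)
```

In this toy case the global high threshold lies above the edge magnitude of the low-contrast building, so its edges would be lost, whereas each per-building threshold separates edge pixels from background pixels.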
Remote Sens. 2021, 13, x FOR PEER REVIEW 14 of 18

The Positive Effect of Constructing Constraint Line Segments Based on PPHT
Effective line segments serve as vital bases for constructing a complete constraint contour for the GGVF snake model. To verify the effectiveness of these constraint line segments on buildings with different shapes, the results are compared with the original PPHT results, as shown in Figure 9. The straight-line segments detected by PPHT are shown in Figure 9c. Some line segments are overlooked or redundant in the detection (denoted by the red closed line), thereby resulting in missing or inaccurate constraint line segments on one side of the detected building contour. The results of the improved constraint line segments are shown in Figure 9d. Compared with the PPHT line segments, the missing contour line segments are supplemented more accurately, and some interior line segments are removed. Therefore, constraint line segments that conform to the building contour can be obtained and used to construct an edge map.
Eight individual building images are randomly selected, and their results are compared with those obtained without the improved PPHT constraint line segments. In Figure 10, the first row shows the true building contour, the second row presents the results based on the original PPHT line segments, and the third row presents the results based on the improved PPHT line segments. The results based on the improved PPHT line segments are smooth on the boundary and close to the true shape of the building. Given that the optimization of PPHT provides more accurate constraint line segments for constructing an edge map, this approach can be used to compute a final contour that is near the original boundary of the building. The comparison results verify the effectiveness and reliability of the PPHT optimization.
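The step from constraint line segments to an edge map can be sketched as a simple rasterization. The example below draws each segment into a binary grid with Bresenham's line algorithm. It is a minimal illustration assuming integer pixel end points, and it leaves out the hierarchical reconstruction of missing segments. The function name and the sample rectangle are hypothetical.

```python
def rasterize_segments(segments, width, height):
    """Draw integer-coordinate line segments into a binary edge map
    (rows indexed by y, columns by x) using Bresenham's algorithm."""
    edge_map = [[0.0] * width for _ in range(height)]
    for (x0, y0), (x1, y1) in segments:
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        x, y = x0, y0
        while True:
            if 0 <= x < width and 0 <= y < height:
                edge_map[y][x] = 1.0   # mark a constraint-edge pixel
            if x == x1 and y == y1:
                break
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x += sx
            if e2 <= dx:
                err += dx
                y += sy
    return edge_map


# Hypothetical closed rectangular contour given as four constraint segments.
rectangle = [((1, 1), (4, 1)), ((4, 1), (4, 3)), ((4, 3), (1, 3)), ((1, 3), (1, 1))]
edge_map = rasterize_segments(rectangle, width=6, height=5)
```

A map of this kind, with nonzero values only on the constraint segments, can then serve as the edge map from which the GGVF force field is computed.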


Overall Comparison Analysis
To analyze the contribution of each improvement to the whole optimization process and to compare the proposed method quantitatively with other contour optimization methods, over 200 buildings are randomly selected from the WHU building dataset [29], and their contours are optimized by using different strategies in Canny detection and constraint line segments as well as other optimization methods. The accuracy results are shown in Figure 11. Figure 11a shows the accuracy difference between using the adaptive and manual thresholds in the Canny detection step. The improvement in Canny detection increases the F1 and overall accuracy by 2.35% and 3.83%, respectively. Figure 11b shows the accuracy difference between using the optimized and original PPHT constraint line segments. The improvement in constraint line segments enhances the F1 and overall accuracy by 3.92% and 6.47%, respectively. Figure 11c compares the proposed method, which uses both the adaptive threshold Canny detection and the PPHT segment optimization, with the original method that lacks these two improvements. Combining the self-adaptive Canny detector with the optimized PPHT segments improves the F1 and overall accuracy by 4.94% and 8.13%, respectively. Therefore, the proposed method is proven effective. Figure 11d compares the proposed method with two other methods for optimizing the building contour, namely, DPSLA and MCSR, and shows that the proposed method obtains the highest F1 and overall accuracy. In sum, the proposed method optimizes the GGVF snake model by introducing the Canny detector and PPHT line segments, which are useful in constructing a suitable constraint edge map for each building and improving building accuracy.

Comparisons of Time Complexity
In this part, we discuss the main time complexity [30] of the proposed and compared methods in the experiments. In the proposed method, the first step is to obtain the building contours [31] and sub-images. The second step is to detect edges with the adaptive threshold Canny detector [32]. The third step is to obtain the building constraint edge map. The last step is to obtain the final building contour through GGVF [33]. The main time complexity of each step is shown in Table 3. The main time cost is in the fourth step, and the total time complexity of the proposed method is $O(W_l H_l + n_k^2 + W_s H_s + n_c n_h + n_{it}^3)$. Since $W_l \ge W_s$ and $H_l \ge H_s$, the time cost can be simplified as $O(W_l H_l + n_k^2 + W_s H_s + n_c n_h + n_{it}^3) \approx O(W_l H_l + n_k^2 + n_c n_h + n_{it}^3)$. The complexity of DPSLA [28] depends mainly on the number of building contour points. Thus, in general, the complexity of DPSLA is $O(W_l H_l + n_c^2)$. The main time complexity of MCSR [11] depends on the number of building contour points and the number of corner points detected, and is $O(W_l H_l + N_c n_c)$. The proposed method has a higher time complexity than the compared methods because it involves more steps and treats each building more comprehensively.
Table 3. Main time complexity of each step 1.
Proposed method: total $O(W_l H_l + n_k^2 + n_c n_h + n_{it}^3)$
DPSLA [28]: step 1 $O(W_l H_l)$; step 2 $O(n_c^2)$; step 3 -; step 4 -; total $O(W_l H_l + n_c^2)$
MCSR [11]: step 1 $O(W_l H_l)$; step 2 $O(N_c n_c)$; step 3 -; step 4 -; total $O(W_l H_l + N_c n_c)$
1 In the table, $W_l$ and $H_l$ are the width and height of the VHR image, $W_s$ and $H_s$ are the width and height of the clipped building object image, $n_k$ is the convolution kernel size of the Gaussian filter, $n_c$ is the number of building contour points, $n_h$ is the number of lines detected, $n_{it}$ is the number of iterations of GGVF, and $N_c$ is the number of corners detected by Harris.
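The simplification of the total cost rests on a one-line bound, written out here for a single clipped building image under the stated assumption that the sub-image is no larger than the full image:

```latex
\begin{align}
W_s \le W_l,\; H_s \le H_l
  &\;\Longrightarrow\; W_s H_s \le W_l H_l \\
  &\;\Longrightarrow\; W_l H_l + W_s H_s \le 2\,W_l H_l = O(W_l H_l), \\
O\!\left(W_l H_l + n_k^2 + W_s H_s + n_c n_h + n_{it}^3\right)
  &= O\!\left(W_l H_l + n_k^2 + n_c n_h + n_{it}^3\right).
\end{align}
```

The sub-image term is thus absorbed into the full-image term and changes the total cost by at most a constant factor.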

Conclusions
Building extraction results often show irregular building contours because of inaccurate recognition, and most existing optimization methods are limited to rectangular, non-complex buildings. To devise a contour optimization method for complex buildings, an object-oriented strategy combined with the GGVF snake model is proposed in this paper. First, based on the initial building extraction results, each building object is clipped from the image to reduce the impact of adjacent non-building objects. Second, a building boundary shrunk from the initial result is automatically obtained and used as the initial contour of the GGVF snake model. Third, Canny edge detection with an adaptive threshold and optimized PPHT line segments are designed to obtain a highly accurate constraint edge map. Fourth, after inputting the automatically obtained initial contour and edge map into the GGVF snake model, the building contour is calculated feasibly and self-adaptively. The proposed method is not limited by the complexity of buildings. Experimental verifications confirm that the proposed method not only optimizes the building outline but also improves the final accuracy of building extraction to a certain extent. The method also brings the optimized contour close to the real building shape. The proposed method may be applied as a post-processing method for building extraction to improve the regularity and accuracy of buildings.