Accurate Extraction of Ground Objects from Remote Sensing Image Based on Mark Clustering Point Process

The geometric features of ground objects reflect their shape, contour, length, width, and pixel distribution, and they play an important role in object detection and recognition. However, ground objects usually have irregular geometric shapes. To fit such irregular geometry accurately, this paper proposes the mark clustering point process. First, the random points of the parent process determine the locations of the ground objects, and the irregular polygons constructed from the cluster points of the sub-process serve as marks that fit the geometry of the ground objects. Second, assuming that the spectral measurements of ground objects are independent and identically distributed according to a multivariate Gaussian distribution, the spectral measurement model of the remote sensing image data is constructed. Then, the geometric extraction model of the ground objects is built within the Bayesian framework, and the reversible jump Markov chain Monte Carlo (RJMCMC) algorithm is used to simulate the posterior distribution and estimate the parameters. Finally, the optimal object extraction result is obtained according to the maximum a posteriori (MAP) criterion. Experiments on color remote sensing images show that the proposed method not only determines the position of each object but also fits its geometric features accurately.


Introduction
The main information in remote sensing images is generally concentrated in a few key areas (such as roads, rivers, reservoirs, and islands). These areas of interest are called ground objects. Their geometric features take the form of strips or area polygons, which reflect the shape, contour, length, width, and pixel distribution of the ground objects [1]. These geometric features have important applications in object detection and recognition. Therefore, the accurate extraction of the geometric features of ground objects from remote sensing images has become a key research topic [2,3].
Common methods for extracting the geometric features of ground objects from remote sensing images include feature discrimination-based methods [4], classification learning-based methods [5], Hough voting-based methods [6,7], and object model-based methods [8]. With the continuous improvement of remote sensing image resolution in recent years, object model-based extraction methods have become more widely used. Among them, methods based on the marked point process are among the most effective [9,10,11]. In such a model, the position of an object is represented by a random point process defined on the image domain, and the geometry of the object is represented by a geometric mark attached to each random point [12]. Existing marked point process algorithms show excellent characteristics and broad development prospects. However, their marks are defined by regular geometry: for example, rectangles to extract buildings [13], ellipses to extract tree crowns [14], and line segments to extract roads [15]. Such methods can only extract single, regular ground objects, which severely limits their application. To overcome this shortcoming, Ortner et al. (2008) [16] used two marked point processes with different geometries (rectangles and line segments) as marks and sampled them jointly with the Markov chain Monte Carlo (MCMC) algorithm. However, it is difficult for this method to account for the relationship between the geometric features represented by the two kinds of marks, and the resulting energy function is too complex for parameter estimation by simulated annealing. Building on this work, Lafarge et al. (2010) [17] proposed a geometric feature extraction method based on a multi-marked point process, which models and extracts different geometric features by establishing a library containing a variety of geometric shapes. However, it is still difficult to accurately extract the geometric features of arbitrarily shaped ground objects. Zhao et al. (2018) [11] proposed an irregular marked point process to extract the geometric features of oil spill dark spots. However, this method assumes that the image data obey a Gamma distribution and is only effective for SAR image data.
To detect objects with irregular geometry in color remote sensing images, we propose the mark clustering point process. First, the distribution and geometric characteristics of the objects are modeled by the mark clustering point process: the random points of the parent process locate the ground objects, and the irregular polygons constructed from the cluster points of the sub-process are used as marks to fit their geometric shapes. Second, assuming that the spectral measurements of ground objects are independent and identically distributed according to a multivariate Gaussian distribution, the spectral measurement model of color remote sensing image data is constructed. Then, the geometric extraction model of the ground objects is built within the Bayesian framework, and the reversible jump Markov chain Monte Carlo (RJMCMC) algorithm is used to simulate the posterior distribution and estimate the parameters. Finally, the optimal object extraction result is obtained according to the maximum a posteriori (MAP) criterion. The remainder of the paper is organized as follows. After the overview of related work, we introduce the proposed object detection method. Then we present the experimental results, followed by the conclusion.
The innovations of this paper are:
1. The mark clustering point process is proposed to fit the geometric features of ground objects.
2. The data model of color remote sensing images is established using the multivariate Gaussian distribution.
3. Six move operations of RJMCMC are designed to fit the geometric features of ground objects.

Proposed Algorithm
The remote sensing data are regarded as a set of discrete points sampled in a limited area of the plane, $y = \{(x_i, z_i), i = 1, \ldots, n\}$, where $i$ is the pixel index and $n$ is the total number of data points. $x_i = (x_{1i}, x_{2i}) \in S \subset \mathbb{R}^2$ is the position of the pixel grid cell corresponding to data point $i$, and $S$ is the area covered by the remote sensing data (called the data domain). $z_i = (z_{id}, d = 1, \ldots, D)^T$ is the spectral measurement vector of pixel $i$, where $D$ is the dimension of the sampled data, $d$ is the dimension index, and $T$ denotes transposition.

Geometric Model of Ground Object
The geometry of the ground objects in a color remote sensing image is modeled by the mark clustering point process. The mark clustering point process on the data domain $S$ is defined as $W = (U, V)$, with configuration $w = (u, v)$. The configuration $u = \{u_j, j = 1, \ldots, m\}$ of the parent process $U$ defined on $S$ describes the locations and spatial distribution of the ground objects, where $j$ is the object index, $m$ is the number of objects, and $u_j = (u_{1j}, u_{2j}) \in S$ is the position of object $j$. The configuration of the cluster sub-process $V$ is $v = \{v_j, j = 1, \ldots, m\}$, where $v_j = \{v_{jg}, g = 1, \ldots, k_j\}$ is the set of cluster points associated with the random point $u_j$ of the parent process, $v_{jg} = (v_{1jg}, v_{2jg}) \in S$ is the coordinate of cluster point $g$, $g$ is the cluster point index, and $k_j$ is the number of cluster points associated with $u_j$. The cluster points in $v_j$ are connected in sequence to form the $j$-th polygon, also denoted $v_j$, which fits the geometry of the ground object and serves as the mark of the $j$-th ground object. Figure 1 shows an example of modeling the geometry of ground objects with the mark clustering point process. Figure 1a shows the clustering point process. The parent process contains two random points, $u = \{u_1, u_2\}$, represented by "+". Taking each random point as the center of a circle, two radii $r_1$ and $r_2$ are drawn from a Gaussian distribution. The radii are chosen so that the two circular areas do not intersect, which ensures that the two ground objects do not overlap geographically. The cluster points of each sub-process are generated within the corresponding circular area and are represented by "•". Figure 1b shows the mark clustering point process based on Figure 1a. The cluster points of each sub-process in Figure 1a are taken as nodes and connected to form a simple polygon, which is the mark of the corresponding object. In this example, the sub-processes associated with the two parent points have six and five cluster points, respectively, and generate a hexagon and a pentagon.
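
As an illustration of the construction in Figure 1, the following minimal Python sketch draws cluster points inside a disc around each parent point and connects them in angular order. The function names, the uniform distribution inside the disc, and the angular ordering are assumptions made for illustration only; the text does not specify how the cluster points are ordered, and angular ordering does not guarantee a simple polygon in every case.

```python
import math
import random

def sample_cluster_polygon(center, radius, k, rng):
    """Draw k cluster points in the disc around `center`, ordered by angle."""
    cx, cy = center
    points = []
    for _ in range(k):
        # Taking the square root keeps the point density uniform over the disc.
        r = radius * math.sqrt(rng.random())
        a = 2.0 * math.pi * rng.random()
        points.append((cx + r * math.cos(a), cy + r * math.sin(a)))
    # Assumed ordering: connect the points by increasing angle around the parent.
    points.sort(key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    return points

def sample_configuration(parents, radii, counts, seed=0):
    """Build v = {v_j} for given parent points u_j, radii r_j, and counts k_j."""
    rng = random.Random(seed)
    return [sample_cluster_polygon(u, r, k, rng)
            for u, r, k in zip(parents, radii, counts)]

# Two parent points with non-intersecting discs, six and five cluster points.
config = sample_configuration([(60.0, 60.0), (180.0, 150.0)], [40.0, 35.0], [6, 5])
```

Each entry of `config` is the node list of one polygon, so the two entries correspond to the hexagon and pentagon of the example.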

Data Model of Image
The data domain $S$ is divided into two classes, $S = \{S_o, S_b\}$, where $S_o$ is the object class and $S_b$ is the background class. The object class is composed of $m$ polygons, that is, $S_o = \{S_j, j = 1, \ldots, m\}$, where $S_j$ is the area covered by the pixels contained in polygon $j$.
The set of pixel spectral measurements in the region $S_j$ is $z_j = \{z_i(x_i), x_i \in S_j\}$, and the set over all object regions is $z_o = \{z_j, j = 1, \ldots, m\}$. The set of pixel spectral measurements in the background region is $z_b = \{z_i(x_i), x_i \in S_b\}$. Assuming that the spectral measurements of the pixels in the color remote sensing image are independent and identically distributed according to a multivariate Gaussian distribution, the data model of the color remote sensing image $z$ given the configuration $w$ of the mark clustering point process is the likelihood $p(z \mid w, \theta)$, where $\theta$ denotes the distribution parameters, including the means $\mu_b$, $\mu_o$ and the covariances $\Sigma_b$, $\Sigma_o$ of the background and object classes. Assuming that the components of $\theta$ are independent of each other, the means $\mu_b$ and $\mu_o$ follow a multivariate Gaussian distribution with mean and covariance $(\mu_\mu, \Sigma_\mu)$, and the covariances $\Sigma_b$ and $\Sigma_o$ follow a multivariate Gaussian distribution with mean and covariance $(\mu_\Sigma, \Sigma_\Sigma)$. The prior probability $p(\theta)$ is then the product of these component priors. The distribution parameters $\mu_\mu$, $\Sigma_\mu$, $\mu_\Sigma$, and $\Sigma_\Sigma$ can be specified according to prior knowledge of $\theta$.
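
To make the data model concrete, the following minimal Python sketch evaluates the log-likelihood of a few pixels under object and background Gaussians. It is not the paper's implementation: the diagonal covariance, the function names, and the toy parameter values are assumptions chosen to keep the example free of external libraries.

```python
import math

def gauss_logpdf_diag(z, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed for simplicity)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(z, mean, var))

def image_log_likelihood(pixels, in_object, theta):
    """Sum independent per-pixel log densities with class-specific parameters."""
    total = 0.0
    for z, is_obj in zip(pixels, in_object):
        mean, var = theta["object"] if is_obj else theta["background"]
        total += gauss_logpdf_diag(z, mean, var)
    return total

# Hypothetical (mean, variance) pairs for a three-band color image.
theta = {"object": ((200.0, 180.0, 150.0), (25.0, 25.0, 25.0)),
         "background": ((40.0, 90.0, 40.0), (100.0, 100.0, 100.0))}
pixels = [(198.0, 182.0, 149.0), (45.0, 85.0, 42.0)]
good = image_log_likelihood(pixels, [True, False], theta)   # plausible labeling
bad = image_log_likelihood(pixels, [False, True], theta)    # swapped labeling
```

In this toy case `good` is larger than `bad`, which is the behavior the extraction model relies on: a configuration whose polygons cover the object pixels receives a higher likelihood.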

Ground Object Extraction Model
In this paper, the ground object extraction model is built with Bayes' theorem. The image data model $p(z \mid w, \theta)$ is taken as the likelihood function and combined with the prior probability $p(\theta)$ of the distribution parameters and the prior model $p(w)$ of the geometric configuration of the ground objects to obtain the posterior probability model.
The statistical laws of the location, distribution, and geometric shape of ground objects are described by the joint prior probability distribution of the configuration $(u, v, k, m)$ of the mark clustering point process with irregular marks, that is, $p(u, v, k, m)$, where $k = \{k_j, j = 1, \ldots, m\}$. Assuming that the priors describing the location and the geometry of the ground objects are independent of each other, $p(u, v, k, m)$ factorizes into a location term and a geometry term (Equation (3)). Further assuming that the random points describing the objects are mutually independent, and that the geometric forms of the objects are also mutually independent, Equation (3) can be rewritten as a product over the individual objects. This paper uses cluster point process theory to model the location and geometric shape of ground objects. The clustering point process is built on the Poisson point process, and its parent process is generally taken to be a Poisson process. Therefore, two conditions need to be met. First, the number of ground objects distributed over the data domain $S$ is considered to occur randomly and independently at a fixed average rate and therefore to obey the Poisson distribution. The probability distribution $p(m)$ of the number of ground objects $m$ can then be expressed as

$$p(m) = \mathrm{Poi}(m; \lambda_m) = \frac{\lambda_m^{m}}{m!} e^{-\lambda_m},$$

where Poi represents the Poisson distribution function and $\lambda_m$ is the Poisson distribution parameter.
Second, the locations of ground objects in the data domain $S$ are generally considered to obey the uniform distribution. That is, the probability distribution $p(u_j)$ of the location of the $j$-th ground object can be expressed as

$$p(u_j) = \mathrm{Uni}(S) = \frac{1}{|S|},$$

where Uni represents the uniform distribution and $|S|$ is the area of the data domain $S$. Assuming that the objects are independent of each other, the probability distribution $p(u)$ of the locations of all objects in $S$ is

$$p(u) = \prod_{j=1}^{m} p(u_j) = |S|^{-m}.$$

Since ground objects cannot overlap geographically, the locations and distribution of the ground objects in the point process $u$ must be restricted to prevent different ground objects from occupying the same area. Therefore, an interaction probability function of the object positions is defined through pairwise terms $p(u_j, u_{j'})$, where $p(u_j, u_{j'})$ is the probability distribution function of the interaction between ground objects $j$ and $j'$. To completely eliminate the overlap of ground objects $j$ and $j'$, $p(u_j, u_{j'})$ can be defined as

$$p(u_j, u_{j'}) = \begin{cases} 1, & S_j \cap S_{j'} = \varnothing \\ 0, & \text{otherwise,} \end{cases}$$

where $S_j$ and $S_{j'}$ are the areas occupied by ground objects $j$ and $j'$, and $\varnothing$ denotes the empty set. After modeling the location and number of ground objects (parent process), we model the geometry of ground objects (sub-process).
Assuming that the geometric configurations of the objects are independent of each other,

$$p(v) = \prod_{j=1}^{m} p(v_j).$$

For the cluster point set $v_j = \{v_{jg}, g = 1, \ldots, k_j\}$ used to fit the geometric shape of the $j$-th ground object, assuming that the cluster points are distributed independently of each other,

$$p(v_j) = \prod_{g=1}^{k_j} p(v_{jg}).$$

To express the clustering of the cluster points of the sub-process around the corresponding random point of the parent process, $\|u_j - v_{jg}\|^2$ is assumed to follow a Gaussian distribution. The number of cluster points $k_j$ of object $j$ follows the Poisson distribution with parameter $\lambda_k$,

$$p(k_j) = \mathrm{Poi}(k_j; \lambda_k) = \frac{\lambda_k^{k_j}}{k_j!} e^{-\lambda_k}.$$

Since the geometric distributions of the objects are assumed independent, the numbers of cluster points of the objects are also independent of each other. For the node counts over the whole remote sensing image, $k = \{k_j, j = 1, \ldots, m\}$, the probability distribution function is

$$p(k) = \prod_{j=1}^{m} p(k_j).$$

The final ground object extraction model (posterior probability model) $p(w, \theta \mid z)$ can then be expressed as

$$p(w, \theta \mid z) \propto p(z \mid w, \theta)\, p(\theta)\, p(w).$$
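
The prior terms above can be combined in log space. The following Python sketch is one hedged reading of them: the Poisson and uniform terms follow the text, whereas the isotropic two-dimensional Gaussian on the offset $v_{jg} - u_j$ with standard deviation `sigma` is an assumption standing in for the Gaussian on $\|u_j - v_{jg}\|^2$, whose parameters are not given. The non-overlap term is omitted, and all names are hypothetical.

```python
import math

def log_poisson(n, lam):
    """Log of the Poisson probability Poi(n; lam)."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def log_prior(u, v, area, lam_m, lam_k, sigma):
    """Log prior of a configuration (u, v); the overlap constraint is not checked."""
    m = len(u)
    total = log_poisson(m, lam_m)            # number of objects m
    total -= m * math.log(area)              # uniform locations, p(u) = |S|^(-m)
    for u_j, v_j in zip(u, v):
        total += log_poisson(len(v_j), lam_k)    # number of cluster points k_j
        for v_jg in v_j:
            d2 = (u_j[0] - v_jg[0]) ** 2 + (u_j[1] - v_jg[1]) ** 2
            # Assumed isotropic 2-D Gaussian on the offset v_jg - u_j.
            total += -math.log(2.0 * math.pi * sigma ** 2) - d2 / (2.0 * sigma ** 2)
    return total

# Same object on a 256 x 256 domain, once with tight and once with loose cluster points.
tight = log_prior([(0.0, 0.0)], [[(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]],
                  65536.0, 4.0, 5.0, 5.0)
loose = log_prior([(0.0, 0.0)], [[(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0)]],
                  65536.0, 4.0, 5.0, 5.0)
```

Under these assumptions the tighter cluster receives the larger prior, which is how the model expresses clustering around the parent point.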

Simulation and Optimization of Object Extraction Model
After the object extraction model is established, it must be simulated to obtain the optimal solution and thus realize object extraction from remote sensing images. In this paper, the RJMCMC algorithm is used to simulate the ground object extraction (posterior probability) model, and the corresponding move operations are designed and combined with the MAP criterion to determine the location, number, and geometry of the ground objects.

Simulation
An RJMCMC algorithm is designed for the ground object extraction (posterior probability) model based on Bayes' theorem. In addition to the parameters of the geometric configuration $w$ of the ground objects, the model includes the parameters $\theta$ of the distribution obeyed by the observed remote sensing image data. Therefore, all model parameters are collected as $\Theta = (u, v, k, \theta, m)$, and the posterior probability model for extracting objects can be rewritten as $p(\Theta \mid z)$. When the number of objects $m$ or the node counts $k$ change, the dimension of the model parameters $\Theta$ changes. Therefore, the RJMCMC algorithm is used to simulate the object extraction model with variable dimension. The basic idea is as follows. Let the current iteration index be $t$ and the corresponding parameter set be $\Theta^{(t)}$. In iteration $t + 1$, candidate parameters $\Theta^*$ are proposed from $\Theta^{(t)}$, that is, $\Theta^* = \Theta^*(\Theta^{(t)}, \zeta)$. The vector $\zeta$ is a continuous random vector defined to balance the dimensions of $\Theta^*$ and $\Theta^{(t)}$, and it satisfies $|\Theta^*| = |\Theta^{(t)}| \pm |\zeta|$, where $|\cdot|$ is the vector dimension. To fit the geometric shape of the ground objects completely, six move operations are designed: updating the distribution parameters $\theta$, adding polygons, deleting polygons, adding polygon nodes, deleting polygon nodes, and merging polygons.
Let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$.

Updating distribution parameters θ
The components of the distribution parameters $\theta^{(t)}$ are updated in the same way. Taking $\mu_b^{(t)}$ as an example, a new candidate parameter $\mu_b^*$ is proposed from a Gaussian distribution with mean $\mu_b^{(t)}$ and variance $\varepsilon_b$. The acceptance rate of updating $\mu_b^{(t)}$ is denoted $A(\theta^*, \theta^{(t)})$. We draw a random number $\gamma \sim \mathrm{Uni}([0, 1])$; if $A(\theta^*, \theta^{(t)}) \geq \gamma$, the update is accepted and $\theta^{(t+1)} = \theta^*$. Otherwise, the update is rejected and $\theta^{(t+1)} = \theta^{(t)}$.
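
The update rule can be sketched as a random-walk Metropolis step. The Python example below is illustrative only: it treats the parameter as a scalar, assumes that the acceptance rate is the usual minimum of 1 and the posterior ratio (valid because the Gaussian proposal is symmetric), and uses a hypothetical `toy_log_post` in place of $\log p(z \mid w, \theta) + \log p(\theta)$.

```python
import math
import random

def metropolis_update(mu, log_post, step_std, rng):
    """One random-walk Metropolis update of a scalar parameter."""
    candidate = rng.gauss(mu, step_std)        # proposal centered on the current value
    # Symmetric proposal: acceptance rate is min(1, posterior ratio).
    rate = math.exp(min(0.0, log_post(candidate) - log_post(mu)))
    gamma = rng.random()                       # gamma ~ Uni([0, 1))
    return candidate if rate >= gamma else mu

def toy_log_post(mu):
    # Hypothetical log posterior with its mode at 100 and variance 4.
    return -0.5 * (mu - 100.0) ** 2 / 4.0

rng = random.Random(0)
mu, chain = 0.0, []
for _ in range(3000):
    mu = metropolis_update(mu, toy_log_post, 2.0, rng)
    chain.append(mu)
tail_mean = sum(chain[1000:]) / len(chain[1000:])   # average after burn-in
```

Note that `step_std` is a standard deviation, so it corresponds to the square root of the proposal variance $\varepsilon_b$.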

Add polygons
For the add-polygon operation, let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$. In iteration $t + 1$, a polygon is added; its index is $m^{(t)} + 1$ and its configuration is $w^*_{m^{(t)}+1}$. After the operation, the candidate configuration of the ground object geometry is $w^* = \{w_1^{(t)}, \ldots, w_{m^{(t)}}^{(t)}, w^*_{m^{(t)}+1}\}$. The distribution parameter corresponding to the newly added polygon is $\theta^*$, which is drawn from its predefined prior distribution. When a polygon is added, the distribution parameters $\theta^{(t)}$ of the ground object extraction model do not change, so their probability distribution does not change. However, the parameters $(u^{(t)}, v^{(t)}, k^{(t)}, m^{(t)})$ of the geometric configuration $w^{(t)}$ change with the added polygon, so the corresponding probability distribution also changes, and so does the remote sensing image data model $p(z \mid \theta^{(t)}, w^{(t)})$. The acceptance rate of the add-polygon operation is defined accordingly, where $\zeta$ is the vector that keeps the dimensions balanced and $p(\zeta)$ is the probability function of the proposed vector. $p_A$ and $p_D$ are the probabilities of selecting the add-polygon and delete-polygon operations, respectively. If neither operation is preferred in the design of the simulation algorithm, we can set $p_D = p_A$, that is, polygons are added or deleted with equal probability.

Delete polygons
For the delete-polygon operation, let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$. A polygon $v_j^{(t)}$ is selected with equal probability $1/m^{(t)}$ from the current configuration $w^{(t)}$.
The geometric configuration of the selected polygon is $w_j^{(t)}$. We delete this polygon and reorder the remaining polygons, so that the candidate configuration is $w^* = \{w_j^*, j = 1, \ldots, m^{(t)} - 1\}$. At the same time, the distribution parameters $\theta_j^{(t)}$ corresponding to this polygon are deleted. When this operation is performed, the probability distribution of the distribution parameters $\theta^{(t)}$ remains unchanged. The parameters $(u^{(t)}, v^{(t)}, k^{(t)}, m^{(t)})$ of the geometric configuration $w^{(t)}$ and the remote sensing image data model $p(z \mid \theta^{(t)}, w^{(t)})$ change. This operation also changes the dimension of the parameter set. For simplicity, the vector chosen to maintain dimensional balance is $\zeta = (w_j^{(t)}, \theta_j^{(t)})$. The acceptance rate of the delete-polygon operation is denoted $A_D(w^*)$. We draw a random number $\gamma \sim \mathrm{Uni}([0, 1])$; if $A_D(w^*) \geq \gamma$, the delete-polygon operation is accepted and $w^{(t+1)} = w^*$. Otherwise, the operation is rejected and $w^{(t+1)} = w^{(t)}$.
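
The bookkeeping of the add-polygon and delete-polygon proposals can be sketched as follows. This Python example only builds the candidate configurations; it does not evaluate the acceptance rate, and the function names and the list-of-node-lists representation of a configuration are assumptions for illustration.

```python
import random

def propose_add(config, new_polygon):
    """Append a new polygon; it receives the last index, m + 1."""
    return config + [new_polygon]

def propose_delete(config, rng):
    """Remove one polygon chosen with equal probability 1/m."""
    m = len(config)
    j = rng.randrange(m)
    candidate = config[:j] + config[j + 1:]    # remaining polygons are re-indexed
    return candidate, config[j], 1.0 / m

current = [[(0, 0), (4, 0), (2, 3)], [(10, 10), (14, 10), (12, 13)]]
added = propose_add(current, [(20, 0), (24, 0), (22, 3)])
deleted, removed, select_prob = propose_delete(added, random.Random(1))
```

Both functions return new lists and leave the current configuration untouched, so a rejected proposal simply keeps `current`.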

Add polygon nodes
For any given polygon, the neighborhood nodes of a node are the nodes directly connected to it, and two directly connected nodes form a neighborhood node pair. Let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$. A polygon $v_j^{(t)}$ is selected with equal probability $1/m^{(t)}$ from the current configuration $w^{(t)}$. A neighborhood node pair $(v_{jg}^{(t)}, v_{j,g+1}^{(t)})$ is then selected with equal probability $1/k_j^{(t)}$ from the node set $v_j^{(t)}$ of the selected polygon. A candidate node is drawn at random from the intersection of the visible area of the edge $v_{jg}^{(t)} v_{j,g+1}^{(t)}$ and the circle with center $(v_{jg}^{(t)} + v_{j,g+1}^{(t)})/2$ and radius $|v_{jg}^{(t)} - v_{j,g+1}^{(t)}|/2$, that is, half the length of the edge. The candidate node is inserted between the nodes of the pair $(v_{jg}^{(t)}, v_{j,g+1}^{(t)})$ to construct the candidate polygon $v_j^*$. Finally, the nodes of the candidate polygon are reordered as $v_j^* = \{v_{jg}^*, g = 1, \ldots, k_j^{(t)} + 1\}$. Without loss of generality, the new node is recorded as $v_{jg}^*$. When this operation is performed, the probability distributions of the distribution parameters $\theta^{(t)}$ and of the parameters $(u^{(t)}, m^{(t)})$ of the geometric configuration $w^{(t)}$ remain unchanged. The probability distribution of the parameters $(v^{(t)}, k^{(t)})$ and the remote sensing image data model $p(z \mid \theta^{(t)}, w^{(t)})$ change. This operation also changes the dimension of the parameter set. For simplicity, the vector chosen to keep the dimensions balanced is $\zeta = \{v_{jg}^*\}$. The acceptance rate of the add-node operation is defined in terms of $z_j^* = S_j^* \setminus S_j^{(t)}$ and $z_{bj}^* = S_j^{(t)} \setminus S_j^*$, where $S_j^*$ and $S_j^{(t)}$ are the areas occupied by the candidate polygon $v_j^*$ and the current polygon $v_j^{(t)}$, respectively. If $z_j^*$ or $z_{bj}^*$ is an empty set, its conditional probability density function is set to 1. $p_{AV}$ and $p_{DV}$ are the probabilities of selecting the add-node and delete-node operations, respectively.
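
The node proposal can be illustrated with the following Python sketch. It samples the candidate node uniformly in the circle whose diameter is the selected edge; the restriction to the visible area of the edge is omitted here, and the function name and the uniform sampling inside the circle are assumptions for illustration.

```python
import math
import random

def propose_add_node(polygon, rng):
    """Insert a candidate node between a randomly chosen neighborhood node pair."""
    k = len(polygon)
    g = rng.randrange(k)                        # edge (v_g, v_{g+1}) with probability 1/k
    a, b = polygon[g], polygon[(g + 1) % k]
    cx, cy = (a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0   # edge midpoint
    radius = math.hypot(a[0] - b[0], a[1] - b[1]) / 2.0  # half the edge length
    r = radius * math.sqrt(rng.random())        # uniform over the disc
    ang = 2.0 * math.pi * rng.random()
    node = (cx + r * math.cos(ang), cy + r * math.sin(ang))
    # Visibility of the candidate from the edge is not checked in this sketch.
    candidate = polygon[:g + 1] + [node] + polygon[g + 1:]
    return candidate, node, g

square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
cand, node, g = propose_add_node(square, random.Random(3))
```

The returned polygon has one more node than the input, with the new node placed directly after node `g`, so the remaining nodes keep their cyclic order.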

Delete polygon nodes
Let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$. A polygon $v_j^{(t)}$ is selected with equal probability $1/m^{(t)}$ from the current configuration $w^{(t)}$. Then, a node is selected from the node set $v_j^{(t)}$ with equal probability $1/k_j^{(t)}$ and recorded as $v_{jg}^{(t)}$. If the two neighborhood nodes of $v_{jg}^{(t)}$ are visible to each other, the node is deleted and the remaining nodes are reordered as $v_j^* = \{v_{jg}^*, g = 1, \ldots, k_j^{(t)} - 1\}$ to build the candidate polygon $v_j^*$. When this operation is performed, the probability distributions of the distribution parameters $\theta^{(t)}$ and of the parameters $(u^{(t)}, m^{(t)})$ of the geometric configuration $w^{(t)}$ remain unchanged. The probability distribution of the parameters $(v^{(t)}, k^{(t)})$ and the remote sensing image data model $p(z \mid \theta^{(t)}, w^{(t)})$ change. This operation also changes the dimension of the parameter set. For simplicity, the vector chosen to keep the dimensions balanced is $\zeta = \{v_{jg}^*\}$. The acceptance rate of the delete-node operation is denoted $A_{DV}(v_j^*, v_j^{(t)})$. We draw a random number $\gamma \sim \mathrm{Uni}([0, 1])$; if $A_{DV}(v_j^*, v_j^{(t)}) \geq \gamma$, the delete-node operation is accepted and $w^{(t+1)} = w^*$. Otherwise, the operation is rejected and $w^{(t+1)} = w^{(t)}$. Figure 3 gives an example of adding and deleting a polygon node, where Figure 3a shows the geometric configuration $w^{(t)}$ of the ground object at time $t$, and Figure 3b,c show examples of adding and deleting a polygon node at time $t + 1$, respectively.
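
The visibility condition can be approximated with a segment intersection test. In the Python sketch below, two neighbors are treated as mutually visible when the segment joining them does not properly cross any remaining polygon edge. This is a simplified reading of visibility, and the helper names and the minimum of three nodes are assumptions for illustration.

```python
def _orient(p, q, r):
    """Twice the signed area of triangle (p, q, r)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def _properly_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 cross at an interior point of both."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0

def delete_node(polygon, g):
    """Return the polygon without node g, or None if the move is not allowed."""
    k = len(polygon)
    if k <= 3:
        return None                              # keep at least a triangle
    a, b = polygon[g - 1], polygon[(g + 1) % k]  # the two neighborhood nodes
    candidate = polygon[:g] + polygon[g + 1:]
    n = len(candidate)
    for i in range(n):
        p, q = candidate[i], candidate[(i + 1) % n]
        if {p, q} & {a, b}:
            continue                             # edges touching a or b cannot cross a-b
        if _properly_cross(a, b, p, q):
            return None                          # neighbors do not see each other
    return candidate

roof = [(0, 0), (4, 0), (4, 4), (2, 5), (0, 4)]
notch = [(0, 0), (2, 3), (4, 0), (4, 4), (0, 4)]
allowed = delete_node(roof, 3)     # removing the roof apex leaves a square
blocked = delete_node(notch, 3)    # new edge would cut through the notch
```

In the second case the segment from (4, 0) to (0, 4) crosses the edge from (0, 0) to (2, 3), so the proposal is discarded before any acceptance rate is computed.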

Merging polygons
Let the geometric configuration of the ground objects at iteration $t$ be $w^{(t)} = \{w_j^{(t)}, j = 1, \ldots, m^{(t)}\}$. Two polygons $v_j^{(t)}$ and $v_{j'}^{(t)}$ are selected at random from the current configuration $w^{(t)}$, and the distances between their node pairs are calculated. If the distances of the shortest and second-shortest node pairs are less than a given threshold, the two polygons are merged. The merge is realized by merging the node pairs: the midpoints of the shortest node pair and of the second-shortest node pair are taken as newly added nodes and combined with the other nodes to construct the candidate polygon $v_j^*$. When this operation is performed, the distribution parameters $\theta^{(t)}$ and their probability distribution remain unchanged. The parameters $(u^{(t)}, v^{(t)}, k^{(t)}, m^{(t)})$ of the geometric configuration $w^{(t)}$ change with the merging of polygons, and their probability distribution changes accordingly. The remote sensing image data model $p(z \mid \theta^{(t)}, w^{(t)})$ also changes. The acceptance rate of the merge operation is denoted $A_M(w^*)$. We draw a random number $\gamma \sim \mathrm{Uni}([0, 1])$; if $A_M(w^*) \geq \gamma$, the merge operation is accepted and $w^{(t+1)} = w^*$. Otherwise, the operation is rejected and $w^{(t+1)} = w^{(t)}$. Figure 4 gives an example of merging polygons, where Figure 4a shows the geometric configuration $w^{(t)}$ of the ground objects at time $t$ and Figure 4b shows the result of merging polygons at time $t + 1$.
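
The first step of the merge, finding the two closest node pairs and their midpoints, can be sketched in Python as follows. The example reads the threshold condition as requiring both distances to be below the threshold, which is one possible interpretation of the text; the function names are hypothetical, and the construction of the merged polygon from the remaining nodes is not shown.

```python
import math

def closest_node_pairs(poly_a, poly_b):
    """Return the shortest and second-shortest node pairs as (distance, i, j)."""
    pairs = sorted((math.dist(p, q), i, j)
                   for i, p in enumerate(poly_a)
                   for j, q in enumerate(poly_b))
    return pairs[0], pairs[1]

def merge_nodes(poly_a, poly_b, threshold):
    """Midpoints of the two closest node pairs, or None if they are too far apart."""
    (d1, i1, j1), (d2, i2, j2) = closest_node_pairs(poly_a, poly_b)
    if d2 >= threshold:          # d1 <= d2, so this checks both distances
        return None
    def midpoint(p, q):
        return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return midpoint(poly_a[i1], poly_b[j1]), midpoint(poly_a[i2], poly_b[j2])

left = [(0, 0), (2, 0), (2, 2), (0, 2)]
right = [(3, 0), (5, 0), (5, 2), (3, 2)]
new_nodes = merge_nodes(left, right, 1.5)    # squares one unit apart: merge
too_far = merge_nodes(left, right, 0.5)      # stricter threshold: no merge
```

`math.dist` requires Python 3.8 or later.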

Optimization
The MAP criterion is the simplest optimization scheme. The solution at which the posterior probability of the Bayesian extraction model is largest is taken as the optimal object extraction result. The optimal parameter estimate under the MAP criterion is

$$\hat{\Theta} = \arg\max_{\Theta}\, p(\Theta \mid z),$$

where $\hat{\Theta} = (\hat{w}, \hat{\theta}) = (\hat{u}, \hat{v}, \hat{k}, \hat{m}, \hat{\theta})$ is the maximum a posteriori estimate of the parameter set $(w, \theta) = (u, v, k, m, \theta)$. Among these parameters, $u = \{u_j = (u_{1j}, u_{2j}) \in S, j = 1, \ldots, m\}$, where $u_j$ is the position of object $j$ and $m$ is the number of objects, and $v = \{v_j, j = 1, \ldots, m\}$, where $v_j = \{v_{jg}, g = 1, \ldots, k_j\}$ is the set of node coordinates of object $j$. Therefore, the configuration $w = (u, v)$ of the mark clustering point process completely describes the extracted ground objects.

Experiment Results
In order to verify the accuracy of the proposed algorithm quantitatively and qualitatively, synthetic and real color remote sensing images are used for experiments.

Synthetic Image Experiment
According to the given template image (Figure 5a), two ground object types, forest land and bare land, are combined to generate a synthetic image of 256 × 256 pixels (Figure 5b). The blue line in Figure 5c is the outline of the objects. The proposed algorithm is used to extract ground objects from the synthetic image in Figure 5b. If the number of iterations is too large, computation time is wasted; if it is too small, the algorithm does not converge and the geometric features of the objects are not fully fitted. After repeated experiments, the number of iterations is set to 4000. The experiments were run on an Intel(R) Core(TM) i7-10510U CPU @ 1.80 GHz (2.30 GHz), and the running time was about 303.4 s.
During the experiment, each iteration performs the six designed operations, of which the last five affect the geometry of the polygons used to fit the objects. Table 1 lists the number of times each move operation is accepted. The operations of adding, deleting, and merging polygons determine the number of objects and are accepted fewer times. The operations of adding and deleting polygon nodes adjust the shape of the polygons and are accepted more often. Figure 6 shows the acceptance rate of the last five operations. The add-polygon operation was accepted four times, in iterations 26, 49, 62, and 78, as shown in Figure 6a. The delete-polygon operation was accepted three times, in iterations 2, 4, and 185, as shown in Figure 6b. The merge operation was accepted once, in iteration 581, as shown in Figure 6e. These three operations determine the number of objects. The add-node and delete-node operations, which adjust the geometry of the polygons, were accepted 683 and 616 times, respectively, as shown in Figure 6c,d.
The experiment runs for 4000 iterations, and the object extraction results are displayed every 20 iterations, as shown in Figure 7. Figure 7a shows the initialization result. During initialization, the number of parent points is set to 4, represented by the green crosses in the figure, and the given clustering range is represented by the blue circles. The cluster points of the sub-process associated with each parent point are generated within the corresponding blue circle. The number of cluster points in each sub-process is set to 4, represented by solid red dots. The cluster points of each parent point are connected in sequence to form an initial polygon, shown by the red lines. Figure 7b shows the result of the 20th iteration. Compared with the initialization result (Figure 7a), in the first 20 iterations the add-polygon operation is not accepted, and the delete-polygon operation is accepted twice, in the second and fourth iterations. Figure 7c shows the result of the 40th iteration. Compared with Figure 7b, one polygon is added in these 20 iterations (in the 26th iteration), and no polygon deletion is accepted. Figure 7d shows the result of the 60th iteration. Compared with Figure 7c, one polygon is added in these 20 iterations (in the 49th iteration), and no polygon deletion is accepted. Figure 7e shows the result of the 80th iteration. Compared with Figure 7d, the add-polygon operation is accepted twice in these 20 iterations (in the 62nd and 78th iterations), and no polygon deletion is accepted. Figure 7f shows the result of the 200th iteration. Compared with Figure 7e, one polygon deletion is accepted, in the 185th iteration.
Figure 7g shows the result of the 580th iteration. Compared with Figure 7f, no add-polygon or delete-polygon operation is accepted, but the geometry of the polygons is adjusted by adding and deleting polygon nodes. Figure 7h shows the result of the 600th iteration. Compared with Figure 7g, one merge operation is accepted, in the 581st iteration. The polygon geometry is then adjusted by continually adding and deleting nodes to form the final object extraction result, shown in Figure 7i. After adding, deleting, and merging polygons, the final number of polygons is consistent with the number of objects in the image. Figure 8a converts the object extraction result into a segmentation result so that the experimental results can be evaluated quantitatively with the confusion matrix. Figure 8b superimposes the extracted object contours on the experimental image for qualitative evaluation. Figure 8c shows the extracted contour lines of the ground objects. The solid green dots represent the nodes, and the solid red lines connect the nodes in order to form polygons. In the final experimental results, the central, quadrilateral, pentagonal, and elliptical objects are fitted by polygons with 23, 13, 23, and 21 nodes, respectively.
The proposed method belongs to the statistical branch of traditional methods. Statistical methods usually take either the pixel or the object as the processing unit. Therefore, to verify the effectiveness of the proposed method, it is first compared with a statistical method that takes the pixel as the processing unit [18]. Second, it is compared with a mark point process method, which takes the object as the processing unit but uses only a regular rectangle as the mark [19]. Finally, it is compared with a method based on the active contour model, which is commonly used among traditional methods [20]. The experimental results of the three comparison algorithms are shown in Figure 9. For quantitative evaluation with the confusion matrix, the segmentation results of the three algorithms are obtained, as shown in Figure 9a1-c1. For qualitative visual evaluation, the object extraction results of the three algorithms are superimposed on the experimental image, as shown in Figure 9a2-c2. The experimental results show that the pixel-based and active contour model algorithms segment bare land with uniform spectral measurements well, but produce a lot of noise in forest areas with uneven spectral measurements. The method based on the rule mark point process locates bare ground objects well but fits their geometry poorly.
The template image (Figure 5a) is taken as the standard image, and the segmentation results of the proposed algorithm and the three comparison algorithms (Figures 8a and 9a1-c1) are taken as the experimental images to establish the confusion matrices. The accuracy indices are then calculated from the confusion matrices, and the results are shown in Table 2.
It can be seen from the table that the user accuracy of the background area and the product accuracy of the object area obtained by the pixel-based algorithm and the active contour model algorithm are higher than those of the proposed algorithm. This is because the spectral measure distribution of the object area is relatively uniform, so the two algorithms segment the object area well. However, for the background region with its complex spectral measure distribution, the two algorithms produce a lot of noise, which reduces the overall accuracy of their results. Although the rule mark point process algorithm does not produce much noise, its overall accuracy is also affected by its poor fit to the boundaries of the ground objects. In general, the proposed algorithm is superior to the three comparison algorithms in terms of total accuracy, kappa coefficient, and F1 measure.
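For reference, the accuracy indices named above can be computed from a binary confusion matrix as in the following sketch. It assumes the standard definitions (the "product accuracy" of the text is usually called producer's accuracy in the remote sensing literature) and takes the F1 measure on the object class; the paper's exact formulas are not given in this section, and the label vectors below are made-up toy data, not the experimental results.

```python
def accuracy_indices(reference, predicted):
    """Accuracy indices from a binary confusion matrix.

    Labels: 1 = object, 0 = background. Assumes both classes occur in both
    the reference and the prediction, so that no denominator is zero.
    """
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if pred == 1 and ref == 1:
            tp += 1
        elif pred == 1 and ref == 0:
            fp += 1
        elif pred == 0 and ref == 1:
            fn += 1
        else:
            tn += 1
    n = tp + fp + fn + tn
    total = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    user_obj = tp / (tp + fp)   # correct fraction of pixels labelled object
    prod_obj = tp / (tp + fn)   # recovered fraction of reference object pixels
    return {
        "user_object": user_obj,
        "product_object": prod_obj,
        "user_background": tn / (tn + fn),
        "product_background": tn / (tn + fp),
        "total": total,
        "kappa": (total - expected) / (1 - expected),
        "f1": 2 * user_obj * prod_obj / (user_obj + prod_obj),
    }

# Toy flattened label vectors (hypothetical data).
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
indices = accuracy_indices(reference, predicted)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Because the kappa coefficient discounts chance agreement, it penalizes scattered noise in a large background region more visibly than the total accuracy alone.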

Real Image Experiment
To avoid the over-idealization that comes from analyzing only synthetic images quantitatively, two real remote sensing images [21], with an island and a lake as objects, are selected for the ground object extraction experiment. The boundary lines are determined by visual interpretation. Because object boundary lines determined by human vision contain some errors, only remote sensing images with a single object are selected in this experiment. In Figure 10, Figure 10b1,b2 are the remote sensing images of the island and the lake, respectively. Figure 10c1,c2 show the manually determined object boundaries (solid blue lines). Figure 10a1,a2 are the standard image templates derived from the manually determined object boundaries, used to evaluate the experimental results quantitatively.
The proposed algorithm is applied to the images in Figure 10b1,b2, and the experimental results are shown in Figure 11. Figure 11c1,c2 are the object extraction results. The solid green dots in the figure represent the nodes used to form the polygons, and the numbers of nodes are 35 and 44, respectively. Each group of nodes is connected in sequence with solid red lines to form a polygon, and the two polygons fit the island and lake objects, respectively. The polygons are superimposed on the experimental images (Figure 10b1,b2) for visual evaluation, and the superposition results are shown in Figure 11b1,b2. It can be seen from the figure that the extracted polygons not only locate the island and the lake but also fit their geometry accurately. For quantitative evaluation, the extracted object results are converted into segmentation results, as shown in Figure 11a1,a2.
The three comparison algorithms are also applied to Figure 10b1,b2, and the experimental results are shown in Figure 12. Figure 12a1-c1 shows the segmentation results of the three comparison algorithms on the island image, and Figure 12a2-c2 shows their object extraction results on the island image. In the island image, the spectral measure distribution in the background area is more uniform than that in the object area. Therefore, in the results of the pixel-based and active contour algorithms, the object area contains more noise, while the background area contains noise only where the spectral measure distribution is uneven. Figure 12a3-c3 shows the segmentation results of the three comparison algorithms on the lake image, and Figure 12a4-c4 shows their object extraction results on the lake image.
In the lake image, the distribution of pixel spectral measures in the lake area is more uniform than that in the background area. Therefore, in the results of the pixel-based and active contour algorithms, more noise is generated in the background area. The rule mark point process algorithm does not produce noise, but it fits the boundary of the object area poorly, both for the island image with its complex object distribution and for the lake image with its complex background distribution.
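The link between spectral uniformity and pixel-level noise can be illustrated with a toy simulation. The sketch below is not one of the comparison algorithms [18,20]: it stands in a simple threshold rule for a per-pixel classifier, and the means, standard deviations, and threshold are hypothetical values chosen only to contrast a uniform region with a complex one.

```python
import random

def misclassified_fraction(mean, std, threshold, true_side, n=20000, seed=0):
    """Fraction of Gaussian pixel values that fall on the wrong side of a threshold."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n):
        value = rng.gauss(mean, std)
        side = "low" if value < threshold else "high"
        if side != true_side:
            wrong += 1
    return wrong / n

# Uniform lake region: dark pixels with a small spread (hypothetical values).
lake_noise = misclassified_fraction(60.0, 5.0, threshold=100.0, true_side="low")
# Complex background region: brighter pixels with a large spread.
background_noise = misclassified_fraction(140.0, 40.0, threshold=100.0, true_side="high")
print(f"lake: {lake_noise:.4f}, background: {background_noise:.4f}")
```

Under these assumptions almost no lake pixel is mislabelled, while a noticeable share of background pixels is, which matches the pattern described for the pixel-based results. An object-level model avoids this because it labels whole polygons rather than individual pixels.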
We compare the standard images of the island image and lake image (as shown in Figure 10a1,a2) with the segmentation results of the proposed algorithm and the three comparison algorithms to establish the confusion matrices, and the resulting accuracy indices are shown in Table 3. In the results of the pixel algorithm and active contour algorithm, only the user accuracy of the background area and the product accuracy of the object area in the lake image are high. This is because the pixel spectral measure distribution in the object area of the lake image is relatively uniform. In the other areas, the distribution of spectral measures is more complex than in the object area of the lake image, which lowers the accuracy there and reduces the overall accuracy of the two algorithms. Most accuracy indices of the rule mark point process algorithm are better than those of these two algorithms, so its total accuracy is also better. For the proposed algorithm, although the user accuracy of the background area and the product accuracy of the object area in the lake image may be lower than those of the three comparison algorithms, the other accuracy indices are higher, which improves the overall accuracy of the algorithm. It can also be seen from the total accuracy, kappa coefficient, and F1 measure that the proposed algorithm is significantly better than the three comparison algorithms.
Figure 12. (a1) Pixel-based island segmentation; (b1) Rule mark island segmentation; (c1) Active contour island segmentation; (a2) Pixel-based island extraction; (b2) Rule mark island extraction; (c2) Active contour island extraction; (a3) Pixel-based lake segmentation; (b3) Rule mark lake segmentation; (c3) Active contour lake segmentation; (a4) Pixel-based lake extraction; (b4) Rule mark lake extraction; (c4) Active contour lake extraction.
Figure 13 shows six images selected from the aerospace remote sensing object detection data set [21] annotated by Northwestern Polytechnical University, all of which are 256 × 256 pixels. Figure 13a1-c1 shows three island images, and Figure 13a2-c2 shows three lake images.
It can be seen from the images that the spectral measures of the pixels in the background areas of the three island images are evenly distributed, while the spectral measures of the object pixels vary considerably. The spectral measure distribution of the pixels in the background areas of the three lake images is more complex, while that of the object pixels is more uniform. The geometric shapes of the objects in these six images are irregular. The proposed algorithm is applied to the remote sensing images in Figure 13, and the experimental results are shown in Figure 14. In the figure, the solid green dots represent the nodes constituting the ground object geometry, and the solid red lines represent the extracted object geometry. Figure 14a1-c1 shows the results of extracting island objects. The numbers of extracted objects are four, three, and two, respectively, which agree with the visual interpretation of the experimental images. Figure 14a2-c2 shows the extracted lake results, and the numbers of extracted objects are two, six, and four, respectively. The extraction results for the first two images are consistent with the visual interpretation, while the number of ground objects visually interpreted in Figure 14c2 should be three. In the experimental results, two polygons are used to fit one lake object.
The object extraction results are superimposed on the experimental images for qualitative evaluation, as shown in Figure 15. For both the island images and the lake images, the proposed algorithm not only locates the objects accurately but also fits their geometry accurately. When a single polygon cannot fit an object, the algorithm automatically uses multiple polygons to fit that ground object.
The pixel-based image processing method is applied to the six remote sensing images in Figure 13, and the experimental results are shown in Figure 16. Figure 16a1-c1 shows the results for the three island images. The pixel spectral measurement values in the object area differ greatly, while those in the background area differ relatively little, which produces more noise in the object area and less noise in the background area, especially in Figure 16c1. Figure 16a2-c2 shows the results for the three lake images. The pixel spectral measurement values in the background area differ greatly, while those in the object area differ little relative to the background area. Therefore, in the experimental results, there is relatively little noise in the object area and relatively more noise in the background area.
The mark point process algorithm with a rectangle as the mark is applied to the six images, and the experimental results are shown in Figure 17. Figure 17a1-c1 shows the results for the three island images, and the numbers of extracted ground objects are three, four, and two, respectively. Figure 17a2-c2 shows the results for the three lake images, and the numbers of extracted ground objects are two, six, and three, respectively. The objects extracted by this algorithm are rectangular because its pre-defined mark is a rectangle. It is difficult for a rectangle to fit ground objects with irregular geometric characteristics, which reduces the overall accuracy of the experimental results.
The algorithm based on the active contour model is applied to the six images, and the experimental results are shown in Figure 18. The results contain more noise for the island images, where pixel spectral measurements differ greatly within the object area, and for the lake images, where they differ greatly within the background area. Although there is less noise than with the pixel-based algorithm, the positions of the noise are basically the same. Therefore, whether an algorithm is based on pixels or on the active contour model, it is difficult for it to extract objects accurately in regions with large differences in pixel spectral measures.

Conclusions
In view of the irregular geometric shapes of ground objects in remote sensing images, and based on the theory of random geometric point processes, this paper proposes a new ground object extraction method based on the mark clustering point process, which expresses the position, distribution, and geometric shape of the ground objects. The method relies on traditional statistical sampling, which requires a large amount of sampling work, so the experiments run slowly. Therefore, in the experiments, only objects with obviously irregular geometry in small images are fitted. In future work, we will consider how to quickly realize object detection for large images or images with more complex scenes.