Semi-Automatic Extraction of Rural Roads under the Constraint of Combined Geometric and Texture Features

Abstract: The extraction of road information from high-resolution remotely-sensed images has important application value in many fields. Rural roads have the characteristics of relatively narrow widths and diversified pavement materials; these characteristics can easily lead to problems involving the similarity of the road texture with the texture of surrounding objects and make it difficult to improve the automation of traditional high-precision road extraction methods. Based on this background, a semi-automatic rural road extraction method constrained by a combination of geometric and texture features is proposed in this paper. First, an adaptive road width extraction model is proposed to improve the accuracy of the initial road centre point. Then, aiming at the continuous change of curvature of rural roads, a tracking direction prediction model is proposed. Finally, a matching model under geometric texture constraints is proposed, which solves the problem of similarity between road and neighbourhood texture to a certain extent. Experiments on different types of scenes and remotely sensed image data show that, compared with other methods, the proposed method not only guarantees the road extraction accuracy but also improves the degree of automation to a certain extent.


Introduction
Road data play an important role in many fields, such as urban planning, traffic management, map updating, disaster management, road monitoring, public health, unmanned aerial vehicle (UAV)-based visual navigation, driving assistance systems, and agricultural development [1][2][3]. With the rapid development of remote sensing satellites and sensors, higher-resolution remote sensing image acquisition is becoming increasingly easy [4]. As remote sensing images represent basic ground feature recognition data, an increasing number of scholars have been attracted to road extraction research. Rural roads play a key role in rural planning and are an important form of infrastructure for ensuring social and economic development in rural areas. In China, for example, the rural road network is extensive. According to the statistics of the white paper "Sustainable Development of China's Transportation", China's rural roads account for 83.8% of the country's total highway mileage. In this context, the extraction of rural roads is particularly important.
Domestic and foreign researchers have proposed a large number of road extraction methods. According to the need to provide a priori samples, this paper divides these road extraction methods into model-driven methods and data-driven methods.
Model-driven methods process data without requiring samples. Their advantages lie in applying models directly to the data, with low manual participation and high automation. The main representatives of this type of method are knowledge-driven methods and object-oriented methods.
The knowledge-driven method constructs the knowledge model related to the road, and then establishes hypothesis test models between the knowledge model and image-processing results, so as to achieve knowledge-based road extraction [5]. Among these methods, Hedman et al. [6] extracted roads according to their corresponding linear geometric feature knowledge. Baltsavias [7] summarized the feasibility of using existing knowledge and geographic data to improve automation efficiency. Grote et al. [8] used a digital surface model (DSM) and high-resolution colour-infrared images to extract suburban roads. Although knowledge-driven methods can use existing information to improve the road extraction efficiency, determining how to adapt the existing knowledge to various road scenes is the most difficult core problem for the knowledge-driven method.
Other typical examples of model-driven methods include object-oriented methods. In an object-oriented method, a segmentation algorithm is first used to separate the road area from other areas, and then the appropriate classification and post-processing method are selected to extract the roads. For example, Lei et al. [9] used the gray consistency of the road surface and the mutability of the road edges in the gray images to segment the studied region, then used shape features to select the road region. After multi-resolution segmentation, Kumar et al. [10] used a fuzzy membership function and image object attribute value to define different classes in order to extract roads. Bakhtiari et al. [11] first used the Canny operator to detect the contours of roads, then used a support vector machine (SVM) to classify the images after segmentation. Object-oriented methods take the object region as the processing unit, lessen the deficiencies of pixel-level analyses, and improve the spatial smoothness of the road extraction effect [12]. However, the image segmentation results obtained with these methods are greatly affected by the image quality. In cases involving texture similarity, shadows, or occlusion, it is easy to obtain segmentation results that are inconsistent with the actual features, and the road extraction accuracy following scene transformations is difficult to ensure.
Some differences exist in road image data among different sources. Rural road data, in particular, contain many road materials and curvature continuity changes. Model-driven methods are based on road features, and it is difficult to analyse these differences on the basis of limited parameter analyses. These problems make it difficult to ensure the accuracy of road extractions in model-driven methods; thus, data-driven methods have been developed.
Data-driven methods are based on the characteristics of different data. By manually selecting a priori feature samples, models are established to fit and learn the parameters of the selected features to facilitate the discrimination of roads. Deep learning and template matching are typical data-driven methods.
In the deep learning road extraction method, the model learns a prior set of road data and then identifies the roads in the test set using the discriminant function. Heermann et al. [13] proposed the back-propagation (BP) algorithm, allowing road extraction methods based on neural networks to develop rapidly. In recent years, with more applications of deep learning in the field of road extraction, globally aware road detection networks with multi-scale residual learning (GAMSNet) [14], boundary and topologically aware neural networks for road extraction (BT-RoadNet) [15], multitask road-related extraction networks (MRENet) [16], deep structured self-driving networks (DSDNet) [17], and other new networks continue to emerge. Deep learning methods provide new opportunities for the semantic expansion of remote sensing image interpretations [18]. However, the effectiveness of road extraction techniques with deep learning methods is greatly affected by the quality of the sample set, and noise and occlusion problems cause fractures in most road extraction results. It is difficult to obtain extraction accuracy and recall values over 90% simultaneously when using current deep learning methods [19,20]. In addition, no vector topology relationship exists in the extraction results, and the results thus require a large amount of postprocessing to yield product-level data.
As another data-driven method, the template matching method establishes a template by manually selecting local road samples, analyses the similarity of the parameter information contained in the pre-selected data and template data, and selects the area with the greatest similarity to complete road extraction. Wang et al. [21] extracted initial roads through extensive contour analysis and then used the snake model to optimize the road locations. Leninisha et al. [22] improved the geometrically active deformation model and proposed an extended geometrically active deformation model with improved accuracy and efficiency. Additionally, classic template matching methods include circular template [23], T-shaped template [24], sector template [25], and rectangular template [26] methods. The circular template method automatically generates the initial template by morphological gradient; it then searches for other road points between the starting point and the end point by iterative interpolation to complete the road tracking and matching. The T-shaped template method uses the angle texture feature to obtain the initial road points, then calculates the road width and road direction, and finally uses gray least squares matching to locate the optimal road points. Based on the principle that the edges of ground objects near a road run in the same direction as the road, the sector template method proposes the multiscale line segment orientation histogram (MLSOH) descriptor [25]: the line segments extracted near the road are counted, and the direction with the maximum probability is taken as the dynamic tracking direction of the road; then, a sector descriptor is established to extract the centre point of the road by using the texture feature of the road. However, the applicability of these traditional template matching algorithms in complex areas, such as in regions containing obstacles or shadows, needs to be improved. Dai et al. [27] proposed a semi-automatic extraction method for rural roads in high-resolution remote sensing images that combines multiple features. The MLSOH descriptor was used to calculate the road tracking direction and reduce the influence of local curvature changes on tracking; then, a multi-circle template was established; finally, the proposed panchromatic and hue, saturation and value (HSV) spatial interactive matching model was used to match and track the roads. This method solves some problems caused by the diversification of rural road materials.
On the one hand, template matching, as a data-driven method, avoids the difficulty that model-driven methods have in analysing the differences among rural roads. On the other hand, compared with deep learning, which is also data-driven, template matching starts from a local part of the image and allows strong human intervention, which gives it strong error correction ability and high extraction accuracy (the accuracy and recall can reach more than 95%), thus meeting the requirements for practical applications. However, rural roads have the characteristics of continuous curvature changes, relatively narrow widths and diversified pavement materials, leading to a low degree of automation in template matching road extraction methods. Based on these challenges and on template matching methods, this paper proposes a semi-automatic rural road extraction method constrained by the combination of geometric features and texture features; this method not only ensures the road extraction accuracy but also improves the degree of automation. The specific contributions of this work are as follows:
(1) An adaptive road width extraction model is proposed. In existing methods, road width detection is slow and inaccurate because a threshold must be set [25]. Because the line segment sequences extracted for rural roads fit the road edges well [28], the road width is extracted here by calculating projection distances. This improves the efficiency and accuracy of road width extraction and the quality of the initial road centre point.
(2) The existing direction prediction model is improved. Based on the principle by which the MLSOH descriptor determines the direction, Dai et al. [28] first replace the discrete segments with line segment sequences, which fuse the segments better. Then, the line segment sequences near the road are divided into manually specified angle ranges. Finally, the range with the maximum cumulative length of line segment sequences is selected as the road tracking direction. However, rural roads are narrow, and the road direction needs an accurate angle value. Therefore, this paper replaces the cumulative length of line segment sequences used in [28] with the length of a single line segment sequence and obtains a more accurate and stable road direction.
(3) The proposed method solves the matching problem when the road is similar to the background. Compared with urban roads, rural roads, as low-grade roads, are narrower and have more diverse road materials. These characteristics easily lead to similarity between the road and background texture in the image. As a result, traditional road extraction methods have a low degree of automation when high accuracy must be ensured. To solve this problem, we abandon the idea that the road matching model relies only on texture spectral features and add geometric weights to the matching model to form a dynamic matching model incorporating geometric information. By analysing both the geometric and the texture information of the road, the model can solve the matching problem when the road is similar to the background.

Experimental Data
This paper mainly uses high-resolution orthophotographic panchromatic images. The image data are selected from different remote sensing satellites, namely, Gaofen (GF)-2 and GF-7 data. Among the data, the GF-2 data represent Enshi city, Hubei Province, China and Dandong city, Liaoning Province, China, and the GF-7 data represent Zhangjiakou city, Hebei Province, China. The detailed image parameters are listed in Table 1.

Methodology
The experimental method of this paper is divided into four parts. Section 2.2.1 introduces the pre-processing work, including line segment sequence extraction to obtain road structure features and L0 filtering to improve the internal homogeneity of the road. Manual input points are selected after pre-processing. Section 2.2.2 introduces the adaptive road width extraction model, which determines an accurate initial road centre point. Section 2.2.3 introduces the tracking direction prediction model, which provides an accurate tracking direction for the subsequent matching. Section 2.2.4 introduces the rural road matching model under geometric texture constraints. If the matching is successful, the road direction prediction is carried out again to track the next road point. If it fails, the results are output. The flowchart of this process is shown in Figure 1.
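The control flow described above (predict a direction, match the next point, repeat until matching fails) can be sketched as a small loop. This is only an illustrative skeleton: `track_road`, `predict_direction`, `match_next`, and `max_points` are hypothetical names introduced here, and the two callbacks stand in for the direction prediction and matching models rather than implementing them.

```python
def track_road(seed, predict_direction, match_next, max_points=10_000):
    """Track road points from a seed until prediction or matching fails.

    predict_direction(point) -> direction or None   (hypothetical callback)
    match_next(point, direction) -> next point or None   (hypothetical callback)
    """
    points = [seed]
    current = seed
    while len(points) < max_points:
        direction = predict_direction(current)
        if direction is None:
            break  # no usable direction: stop and output what was tracked
        nxt = match_next(current, direction)
        if nxt is None:
            break  # matching failed: stop and output what was tracked
        points.append(nxt)
        current = nxt
    return points
```

The `max_points` cap is a safety guard added for the sketch so that the loop always terminates; it is not part of the described method.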

Pre-Processing
Because roads have distinctive linear characteristics, we extract line segment sequences [28] to obtain prior information about the road. As shown in Figure 2b, the line segment sequence encodes and groups the road edge segments to obtain continuous road edge information.
In addition, to address image noise, we use an L0 filter [29]. By removing small non-zero gradients, unimportant details are smoothed while the significant edges of the image are enhanced (the large gradients are retained), as shown in Figure 3. This not only improves the internal homogeneity of the road but also distinguishes the road from the surrounding features.
After pre-processing, a manual point $P_1$ is selected at a clear position on the road boundary to start road tracking and matching.

Adaptive Road Width Extraction Model
In this paper, an adaptive road width extraction model is designed to obtain a relatively accurate initial road centre point. Figure 4 shows the overall process of the model. The specific steps are as follows:
(1) Based on the manually input initial point $P_1$, points 5 pixels ahead of and behind $P_1$ are selected along the predicted road direction to obtain a total of three points, $P_1$, $P_2$, and $P_3$, on the road.
(2) Starting with $P_1$, the projection of $P_1$ on the edge line segment on one side of the road is calculated to obtain the projection point $A_1$ and the projection distance $X_1$. Then, the projection of $P_1$ on the edge segment on the other side of the road is calculated to obtain the projection point $A_2$ and the projection distance $X_2$. The sum of $X_1$ and $X_2$ is the road width $W_1$. Following this method, the road widths $W_2$ and $W_3$ corresponding to $P_2$ and $P_3$, respectively, are calculated, and the average road width $W$ among the three points is obtained.
(3) The translation direction is selected as the projection direction on the side of the road farthest away from $P_1$, and the translation distance $X$ is determined, as shown in Equation (1). The initial point $P_1$ is translated along the translation direction by the distance $X$ to the road centre point $P$.
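The steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: each road edge is assumed to be a single straight segment given by two endpoints, the function names are hypothetical, and because Equation (1) is not reproduced in this text, the translation distance is assumed to be $X = \max(X_1, X_2) - W/2$.

```python
import numpy as np

def project_onto_segment(p, a, b):
    """Return the closest point to p on segment ab and its distance from p."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    ab = b - a
    denom = float(ab @ ab)
    # Clamp the projection parameter so the foot stays on the segment.
    t = 0.0 if denom == 0.0 else float(np.clip((p - a) @ ab / denom, 0.0, 1.0))
    foot = a + t * ab
    return foot, float(np.linalg.norm(p - foot))

def initial_centre_point(p1, direction, left_edge, right_edge, offset=5.0):
    """Estimate the mean road width W and translate P1 to the road centre P."""
    p1 = np.asarray(p1, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    # P1 plus the points `offset` pixels ahead of and behind it (P2, P3).
    samples = [p1, p1 + offset * d, p1 - offset * d]
    widths = []
    for p in samples:
        _, x_left = project_onto_segment(p, *left_edge)
        _, x_right = project_onto_segment(p, *right_edge)
        widths.append(x_left + x_right)  # W_k = X_1 + X_2 at this sample
    w = float(np.mean(widths))
    foot_l, x1 = project_onto_segment(p1, *left_edge)
    foot_r, x2 = project_onto_segment(p1, *right_edge)
    # Translate towards the edge that is farther from P1.
    far_foot, far_dist = (foot_l, x1) if x1 >= x2 else (foot_r, x2)
    shift = far_dist - w / 2.0  # ASSUMED form of Equation (1)
    unit = (far_foot - p1) / far_dist if far_dist > 0 else np.zeros(2)
    return p1 + shift * unit, w
```

For a straight road with edges at y = 0 and y = 10 and an input point at (50, 2), this sketch yields a width of 10 and a centre point at (50, 5).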

Tracking Direction Prediction Model
Because line segment sequences fit road edges well, this paper replaces the cumulative length of the line segment sequences counted in [28] with the length of a single line segment sequence. The specific process is as follows:
(1) A rectangular search box is established with the current road point as the centre and 2 times the road width as the side length.
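A rough sketch of this idea is given below, under several assumptions that go beyond the text: each line segment sequence is represented as a polyline of (x, y) vertices, a sequence counts as "near the road" when at least one of its vertices lies inside the search box, and the direction of a sequence is taken from its first to its last vertex. The function name and these rules are hypothetical.

```python
import numpy as np

def predict_direction(point, road_width, sequences):
    """Return the angle in degrees, in [0, 180), of the longest sequence near point.

    Returns None when no sequence has a vertex inside the search box.
    """
    cx, cy = point
    half = road_width  # the box side length is 2 * road_width
    best_len, best_angle = 0.0, None
    for seq in sequences:
        pts = np.asarray(seq, dtype=float)
        inside = (np.abs(pts[:, 0] - cx) <= half) & (np.abs(pts[:, 1] - cy) <= half)
        if not inside.any():
            continue
        # Length of this single sequence (not a cumulative length per angle bin).
        length = float(np.linalg.norm(np.diff(pts, axis=0), axis=1).sum())
        if length > best_len:
            dx, dy = pts[-1] - pts[0]
            best_len = length
            best_angle = float(np.degrees(np.arctan2(dy, dx)) % 180.0)
    return best_angle
```

Selecting the single longest sequence gives an exact angle rather than an angle range, which matches the stated motivation that narrow rural roads need an accurate direction value.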

Geometric Texture Combination Matching Model
In this paper, a geometric weight is added to the matching model, and a matching model combining geometry and texture is proposed. First, the line segment sequences detected at the reference point are assigned to the corresponding matching points according to their direction. On this basis, the result is fused with the texture measure value of each matching point to calculate the combined geometric texture measure value, and the matching point with the maximum combined measure value is taken as the best matching point. As shown in Figure 6, with the method in this paper, the best matching template changes from the yellow matching template, which deviates from the road centre, to the green matching template at the road centre. The matching model flow of this paper is as follows:

1. Template creation
Based on the characteristics of rural roads, a multi-circle template is built along the predicted road direction, and one reference template and seven matching templates are obtained [27].

2. Geometric similarity measure
Using the road structure information provided by the line segment sequences, the line segment sequences detected at the reference point are divided according to direction, and the geometric similarity measure is calculated.
(2) The confidence value $K_i$ of each reference line segment sequence is calculated using Equation (2), where $Length_i$ is the length of the $i$-th reference line segment sequence. The length of each line segment sequence is compared with the lengths of all detected surrounding line segments to obtain the confidence value $K_i$ of the reference line segment sequence. The greater this confidence value is, the greater the probability that this line segment sequence is a road edge and the more accurate the indicated direction.
(3) The confidence values of the reference line segment sequences are accumulated in the direction indicated by each matching point to obtain the geometric measurement values $\{G_1, G_2, G_3, G_4, G_5, G_6, G_7\}$ corresponding to the matching points. The larger the measurement value is, the closer the direction indicated by the matching point is to the road direction.
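The accumulation in steps (2) and (3) might look like the sketch below. Equation (2) is not reproduced in this text, so the confidence is assumed here to be the length ratio $K_i = Length_i / \sum_j Length_j$, which keeps every $G$ value in [0, 1]. Assigning each sequence to the matching point with the nearest direction, and the function and parameter names, are also assumptions made for illustration.

```python
import numpy as np

def geometric_measures(lengths, angles, candidate_angles):
    """Accumulate sequence confidences into one geometric measure per matching point.

    lengths, angles: length and direction (degrees) of each reference sequence.
    candidate_angles: direction (degrees) indicated by each matching point.
    """
    lengths = np.asarray(lengths, dtype=float)
    angles = np.asarray(angles, dtype=float)
    cands = np.asarray(candidate_angles, dtype=float)
    g = np.zeros(len(cands))
    total = lengths.sum()
    if total == 0:
        return g
    k = lengths / total  # ASSUMED form of Equation (2)
    for k_i, a in zip(k, angles):
        # Undirected angular difference, so 0 and 180 degrees coincide.
        diff = np.abs((cands - a + 90.0) % 180.0 - 90.0)
        g[int(np.argmin(diff))] += k_i  # add K_i to the closest direction
    return g
```

With seven candidate directions spaced 15 degrees apart, a long sequence aligned with the road dominates the measure for the central matching point, which is the behaviour the text describes.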

3. Texture similarity measure
In this paper, the gray variance and the normalized cross-correlation coefficient (NCC) are fused to obtain the texture similarity measure.
(1) Calculate the gray variance in the matching template.
Gray variance measures the texture homogeneity of the image in a given region:
$$v_i = \frac{1}{N}\sum_{j=1}^{N}\left(I(x_j, y_j) - \mathrm{Graymean1}\right)^2 \tag{3}$$
In Equation (3), $N$ is the number of pixels in the matching template; $I(x_j, y_j)$ is the gray value of the $j$-th pixel; Graymean1 is the gray mean of the reference template; and $v_i$ is the variance in the $i$-th matching template.
Since the texture measurement value is determined jointly by the gray variance and the NCC, the two values must be normalized so that they have equal influence on the texture measurement value. This paper uses the linear normalization function provided by OpenCV:
$$dst_i = \frac{\left(src_i - \min(src_x)\right)\left(\mathit{max} - \mathit{min}\right)}{\max(src_x) - \min(src_x)} + \mathit{min} \tag{4}$$
In Equation (4), $src_i$ is the $i$-th initial value; $src_x$ denotes all initial values; $\mathit{max}$ is the maximum value of the normalized range, which is 1 in this paper; $\mathit{min}$ is the minimum value of the normalized range, which is 0 in this paper; and $dst_i$ is the $i$-th normalized value.
(2) Calculate the NCC between the matching template and the reference template.
In the model proposed in this paper, the reference template is used as the reference graph, the matching template is used as the real-time graph, and the correlation coefficient between the two is calculated to determine the matching effect:
$$p_i = \frac{\sum_{j=1}^{N}\left(S_j - \bar{S}\right)\left(g_j - \bar{g}\right)}{\sqrt{\sum_{j=1}^{N}\left(S_j - \bar{S}\right)^2 \sum_{j=1}^{N}\left(g_j - \bar{g}\right)^2}} \tag{5}$$
In Equation (5), $N$ is the number of pixels in the reference graph and the real-time graph; $S_j$ is the gray value of the $j$-th pixel in the real-time graph; $\bar{S}$ is the average gray value of the pixels in the real-time graph; $g_j$ is the gray value of the $j$-th pixel in the reference graph; $\bar{g}$ is the average gray value of the pixels in the reference graph; and $p_i$ is the correlation coefficient between the $i$-th real-time graph and the reference graph. The pixels of the real-time graph and the reference graph must correspond one-to-one to the same positions in their respective regions.
The correlation coefficient $p_i$ satisfies
$$-1 \le p_i \le 1 \tag{6}$$
so the similarity between the two images is measured in the range of [−1, 1]. The closer to 1 the value is, the stronger the similarity.
The obtained $p_i$ values are normalized using Equation (4), and the normalized correlation coefficients $\{P_1, P_2, P_3, P_4, P_5, P_6, P_7\}$ are obtained.
(3) The gray variance and NCC are fused to obtain the texture measurement.
In Equation (7), $V_i$ is the normalized variance value of the $i$-th matching template and $P_i$ is the normalized correlation coefficient of the $i$-th matching template. Since a larger variance corresponds to a lower matching degree, $(1 - V_i)$ is used to represent the matching degree. The term $t_i$ represents the texture measurement value corresponding to the $i$-th matching point, and this term is normalized using Equation (4) to obtain the normalized matching point texture measurement values $\{T_1, T_2, T_3, T_4, T_5, T_6, T_7\}$.
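The three texture steps can be sketched in NumPy as follows. The min-max normalization mirrors the linear normalization of Equation (4) without calling OpenCV, and the variance is taken about the reference template mean as the definitions state. Equation (7) is not reproduced in this text, so the fusion is assumed here to be additive, $t_i = (1 - V_i) + P_i$; that form, the handling of constant inputs, and all function names are assumptions.

```python
import numpy as np

def minmax_normalize(values, lo=0.0, hi=1.0):
    """Linearly rescale values to [lo, hi]; constant input maps to lo."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    if span == 0:
        return np.full_like(v, lo)
    return (v - v.min()) * (hi - lo) / span + lo

def gray_variance(template, reference_mean):
    """Mean squared deviation of a template from the reference gray mean."""
    t = np.asarray(template, dtype=float)
    return float(np.mean((t - reference_mean) ** 2))

def ncc(real_time, reference):
    """Normalized cross-correlation of two equally sized gray patches."""
    s = np.asarray(real_time, dtype=float).ravel()
    g = np.asarray(reference, dtype=float).ravel()
    if s.shape != g.shape:
        raise ValueError("patches must have the same number of pixels")
    ds, dg = s - s.mean(), g - g.mean()
    denom = np.sqrt((ds @ ds) * (dg @ dg))
    return 0.0 if denom == 0 else float(ds @ dg / denom)

def texture_measures(reference, candidates):
    """Return the normalized texture measure T_i of each matching template."""
    ref = np.asarray(reference, dtype=float)
    v = minmax_normalize([gray_variance(c, ref.mean()) for c in candidates])
    p = minmax_normalize([ncc(c, ref) for c in candidates])
    t = (1.0 - v) + p  # ASSUMED additive form of Equation (7)
    return minmax_normalize(t)
```

Because both inputs are rescaled to [0, 1] before fusion, neither the variance nor the NCC dominates the result purely through its numeric range.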

4. Matching model
(1) Calculation of the combined measurement values.
$$C_i = G_i + T_i \tag{8}$$
In Equation (8), $G_i$ is the geometric measurement value corresponding to the $i$-th matching point and $T_i$ is the texture measurement value corresponding to the $i$-th matching point. Since the value ranges of both values are [0, 1] and both are directly proportional to the matching effect, they are added to obtain the final combined measurement value $C_i$, with a value range of [0, 2]. The maximum value among $\{C_1, C_2, C_3, C_4, C_5, C_6, C_7\}$ is selected, and the corresponding matching point is the best matching point.
(2) Template comparison.
In Equations (9) and (10), Graymean2 is the gray mean of the best matching template and Gray2 is the gray value of the corresponding matching point. Graymean(A) is the average gray value of reference template set A, and Gray(A) is the average gray value of the corresponding reference points. Set A is composed of the 5 most recently obtained reference templates. If fewer than 5 reference templates have been obtained, set A is composed of all currently obtained reference templates. In this way, the template comparison is flexible and avoids the randomness of a single comparison. In addition, the grayscale is divided into 16 equal levels, each of size $g$ [27].
If Equations (9) and (10) are satisfied at the same time, the texture similarity between the best matching template and the previous tracking points is confirmed to satisfy the constraints. In this case, the centre point of the best matching template, that is, the seed point, is fine-tuned to be equidistant from the road edges to obtain an optimized seed point. The best-matching template is then retained as a seed point, and the tracking of the next point continues from it. If these conditions are not met, the step size is gradually increased up to 5 times the road width [25] to span part of the occluded area. If the constraint is still not satisfied, the final result is output.
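A small sketch of the selection and comparison logic follows. The sum-and-argmax step follows the description of Equation (8) directly. The acceptance test is only a guess: Equations (9) and (10) are not reproduced in this text, so both gray differences are assumed to be required to stay below one gray level $g$, and $g = 16$ assumes 8-bit imagery. `ReferenceSet`, `best_match`, and `GRAY_LEVEL` are hypothetical names.

```python
from collections import deque

import numpy as np

GRAY_LEVEL = 256 // 16  # g for 16 equal levels, ASSUMING 8-bit gray values

def best_match(g, t):
    """Return the index and value of the largest combined measure C_i = G_i + T_i."""
    c = np.asarray(g, dtype=float) + np.asarray(t, dtype=float)
    i = int(np.argmax(c))
    return i, float(c[i])

class ReferenceSet:
    """Set A: the most recently obtained reference templates (at most 5)."""

    def __init__(self, size=5):
        self._items = deque(maxlen=size)  # older entries drop out automatically

    def __len__(self):
        return len(self._items)

    def add(self, template_mean, point_gray):
        self._items.append((float(template_mean), float(point_gray)))

    def accepts(self, template_mean, point_gray, tol=GRAY_LEVEL):
        """ASSUMED stand-in for Equations (9) and (10)."""
        if not self._items:
            return True  # nothing to compare against yet
        means, grays = zip(*self._items)
        return bool(abs(template_mean - np.mean(means)) < tol
                    and abs(point_gray - np.mean(grays)) < tol)
```

Averaging over the last few templates, rather than comparing with only the previous one, is what lets a single noisy template pass without derailing the tracking.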

Comparison Method
In this paper, manual input points are used to semi-automatically extract roads, so four template-matching methods are selected for comparison with the proposed method: the circular template proposed by Lian et al. [23], the T-shaped template proposed by Lin et al. [24], the sector descriptor method proposed by Dai et al. [25] and the semi-automatic method of extracting rural roads from high-resolution remote sensing images based on a multi-feature combination proposed by Dai et al. [27]. Through these comparisons, the feasibility and automation of the proposed method are verified.

Evaluating Indicators
Performance indicators such as precision, recall (integrity), and quality are important parameters for evaluating road extraction methods [30]. Precision is the percentage of the extracted roads that are correct. Recall is the ratio of the correctly matched reference road data to the total length of the reference road map. Intersection over union (IoU) and F1 each combine precision and recall into a single metric, which is used as the final road quality index [31]. These metrics are calculated as follows:
$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}, \qquad IoU = \frac{TP}{TP + FP + FN}$$
where TP is the length of correctly extracted roads, FP is the length of non-road pixels extracted as roads, and FN is the length of roads that was not extracted by the algorithm. The evaluation used in this paper also considers the number of manually input points and the road extraction time to determine the road extraction efficiency. In addition, to calculate the evaluation indices, ArcMap 10.2 software was used to hand-draw the ground-truth data of the experimental data for comparison with the experimental results.
Figure 7a shows a GF-2 orthophoto with a spatial resolution of 1 m and a size of 4000 × 4000. The image displays a rural mountainous area in Enshi city, Hubei Province, China (see Table 1 for specific information).
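The evaluation indicators defined in this section (precision, recall, F1, and IoU) can be computed from the three lengths TP, FP, and FN with a few lines of Python. The function name and the choice to return 0.0 when a denominator is zero are assumptions of this sketch.

```python
def road_metrics(tp, fp, fn):
    """Compute precision, recall, F1 and IoU from road lengths TP, FP and FN."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}
```

For example, with 90 units of correctly extracted road, 10 units of false extraction, and 10 units of missed road, precision, recall, and F1 are all 0.9, while IoU is 90/110.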

Experiment 1
In the experimental area shown in Figure 7b, it is obvious that the curvatures of roads in rural areas change continuously. Using the method proposed in this paper, first, the local road tracking direction is obtained from the tracking direction prediction model, and the outputs fit the road edges well. Second, the tracking step is shortened in curved road areas to avoid situations in which the tracking process extends beyond the road edge. The combination of these two steps enables continuous seed point tracking in most curved-road sections, and only one seed point must be input. The T-shaped template method [24] and the circular template method [23] do not consider the basic road direction data provided by the structural road information, so a large number of manually input points are needed for road tracking at curvature-change areas in these methods. The MLSOH descriptor is added in the sector template method [25] and the multi-feature combination method [27], so large numbers of artificial points are not required in areas with slow curvature changes; however, the direction predictions fail in road areas with large bending degrees, and small numbers of manually input points are required for processing.
In the experimental area displayed in Figure 7c, the road is mainly constructed of earth and has a certain degree of bending, and sand and stones are located on both sides of the road, resulting in the road having a texture similar to that of the surrounding features. The circular template method and T-shaped template method depend on texture matching in this area. Due to the lack of direction predictions, tracking beyond the road boundary easily occurs. Although the sector template method can eliminate the influence of some curves, the resulting direction predictions are not sufficiently restrictive in the convergent texture area, and a small number of artificial points are also needed for tracking in this area. The multi-feature combination method basically does not need to supplement points to adapt to the changes in curvature in this area. However, in the convergent texture area, the road converges in one direction and does not meet the HSV spatial interactive matching conditions. Therefore, panchromatic matching is still used, and this process invalidates the effect of the multi-feature combination method in excluding the convergent texture area. The method proposed in this paper further constrains the geometric information of roads to enlarge the geometric measurement value of the actual road direction. On this basis, the texture measurement value is added to extract the roads in the convergent texture region, and there is no need to manually supplement points in the region shown in Figure 7c.
Figure 8a shows a GF-7 orthophoto with a spatial resolution of 0.65 m and a size of 4000 × 4000. The image displays a mountainous area in Zhangjiakou city, Hebei Province, China (see Table 1 for specific information).

Experiment 2
In the experimental area shown in Figure 8b, the lower half of the road is low-grade and is therefore narrow, so the area of the template is small during the road tracking process. However, in the road extraction method based on template matching, the texture characteristics in the template are the decisive factors applied to obtain the matching effect. If the area covered by the template is small, then the selection of the best-matching template also changes when the road texture changes slightly, leading to poor tracking stability in the narrow road section. After adding the geometric measurement information, the geometric measurement value in the main direction is significantly higher than those in the other surrounding directions. Therefore, seed point tracking can be constrained in the direction of the road to overcome the interference of uneven road textures in the mountainous area. Our method requires the fewest manually input points on narrow roads, ensures accuracy, and is confirmed to perform well when applied to narrow roads.
The road shown in Figure 8c is a dirt road at the edge of a mountainous area. Due to the large traffic flow, the texture contrast between the road and the surrounding ground is poor. The T-shaped template method [24] and circular template method [23] do not consider any road edge information, so large numbers of seed points need to be manually input in the convergent texture area to continuously track the road when using these methods. The sector template method [25] predicts the road direction according to the MLSOH descriptor; this method can ensure that the road tracking is always oriented in the road direction, but it also requires four points to be input to eliminate the influence of texture similarity. The multi-feature combination method [27] uses the HSV spatial matching model to solve areas in which the texture of the road is similar to that of the surrounding ground objects. However, in areas where the contrast in the HSV colour space is not sufficiently strong, the required conditions of the HSV spatial matching model are not met [27]. Therefore, two seed points also need to be input in the region shown in Figure 8c. Our method calculates the sum of the geometric measurement value and the texture measurement value of the matching template, and this method does not consider the texture measurement value as the only factor when determining the best-matching template; thus, the road tracking method proposed in this study ensures that the correct tracking direction is followed under texture constraints. The road section shown in Figure 8c can be tracked well without requiring the manual addition of points.
Figure 9a shows a GF-2 orthophoto with a spatial resolution of 1 m and a size of 4000 × 4000. The image displays a rural area in Dandong city, Liaoning Province, China (see Table 1 for specific information).

Experiment 3
The road shown in the experimental area in Figure 9b is not an urban trunk road; rather, it is a low-grade earthen road whose material is similar to that of the surrounding ground. In this case, if the initial point is not selected accurately, the initial information is not sufficient to provide an optimal reference template. The T-shaped template method [24] requires the initial road point to be determined visually by the operator, so manual deviation arises when using this method. In the circular template method [23], sector template method [25], and multi-feature combination method [27], the road width is calculated using an adaptive correction model [25] to obtain the centre point of the road. However, because setting the gradient threshold is highly restrictive, this type of method adapts poorly to road areas where the road texture is not clearly distinguished from that of the surrounding features. The adaptive road width extraction model proposed in this paper is based on obtaining a strong fit of the road edge contained in the line segment sequence, and it depends on neither manual visual judgement nor a gradient threshold; thus, the proposed method can accurately calculate the initial road centre point. As shown in Figure 9b, the method proposed in this paper requires only one seed point to be input to achieve road extraction. Although the multi-feature combination method also requires only one seed point, its extracted road deviates from the road centre overall because the adaptive template calculates the road width with poor accuracy.

The experimental area shown in Figure 9c comprises an area of aggregated housing, so some roads in this region are covered by shadows of houses and vegetation. The circular template method and T-shaped template method mainly rely on texture information for road matching, so large numbers of manually input points are required in these methods to solve the problem of occlusion. The sector template method and multi-feature combination method can rely on the direction provided by the MLSOH descriptor to avoid directional interference to some extent. However, in areas where the roads are partially occluded by shadows, although the matching template located in the centre of the road meets the texture constraint conditions, the texture measurement value of other matching templates lacking shadow interference may be larger. In this case, the road would be tracked outside the actual road area, so a small number of points need to be manually added to avoid tracking errors. In the method proposed in this paper, the sum of the geometric measurement value and texture measurement value is calculated. After a large geometric measurement value is added to the matching template in the road centre, that template can become the best-matching template, and the interference of some shadows can be eliminated.
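The effect of summing the two measurement values can be sketched with a small example. The numerical values below are purely illustrative (they are not experimental results), and the function name `best_template` and the dictionary layout are assumptions made for this sketch; only the rule of selecting the template with the largest sum of geometric and texture measurement values comes from the text.

```python
def best_template(candidates, use_geometry=True):
    """Return the name of the candidate with the highest score.

    With use_geometry=False the score is the texture measurement value
    alone; otherwise it is the sum of the texture and geometric values.
    """
    def score(c):
        return c["texture"] + (c["geometry"] if use_geometry else 0.0)
    return max(candidates, key=score)["name"]

# Illustrative values only: the shadowed centre template has a weaker
# texture match but lies along the road edges (large geometric value).
candidates = [
    {"name": "centre (shadowed)", "texture": 0.55, "geometry": 0.80},
    {"name": "off-road (unshadowed)", "texture": 0.70, "geometry": 0.10},
]

texture_only = best_template(candidates, use_geometry=False)
combined = best_template(candidates)
```

With texture alone, the unshadowed off-road template wins and tracking would leave the road; with the summed score, the centre template is selected, which is the behaviour described for the shadowed sections in Figure 9c.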

Discussion
Table 2 reports the accuracy, recall rate, quality, number of input points and time required for the five road extraction methods. Overall, the five road extraction methods compared are all template matching methods, so the values of accuracy, recall rate and F1 score are all above 95%, and most of them are above 98%. The five methods compare as follows. The T-shaped template method and the circular template method lack road direction information, so a large number of manual points must be added to improve their accuracy. Summing the input points and times of the three experiments in Table 2 shows that the T-shaped template method and the circular template method required 1670 and 1245 input points and 1592 s and 795 s, respectively, to process the three images; their point counts far exceed those of the other three methods. The efficiency of the T-shaped template method and circular template method is therefore very low, and they offer little comparative value. The accuracies of the sector template method and the multi-feature combination method are slightly lower. However, because the MLSOH descriptor is used for direction prediction, fewer manual points are required, and the efficiency is greatly improved compared with the first two methods. Processing the three images required 314 and 228 points and 850 s and 923 s, respectively. In this paper, road extraction was carried out with combined geometric and texture constraints. The six evaluation indexes of the experiment were all higher than those of the sector template and multi-feature combination methods, and the input points and time were significantly reduced: only 103 points and 510 s of processing time were needed to process the three images. These results show that the proposed method, which extracts rural roads using combined geometric and texture measurement values, can ensure road extraction accuracy while improving the efficiency of road extraction.
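The totals quoted above can be turned into relative savings with a few lines of arithmetic. The sketch below uses only the summed input points and processing times stated in the text; the dictionary keys and the helper `reduction` are names introduced here for illustration.

```python
# Totals over the three images as quoted in the text: (input points, seconds).
totals = {
    "T-shaped template": (1670, 1592),
    "circular template": (1245, 795),
    "sector template": (314, 850),
    "multi-feature combination": (228, 923),
    "proposed": (103, 510),
}

def reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline

prop_points, prop_time = totals["proposed"]

# Savings of the proposed method relative to each other method,
# as (points saved in %, time saved in %), rounded to one decimal place.
savings = {
    name: (round(reduction(p, prop_points), 1), round(reduction(t, prop_time), 1))
    for name, (p, t) in totals.items()
    if name != "proposed"
}

for name, (pts, secs) in savings.items():
    print(f"{name}: {pts}% fewer points, {secs}% less time")
```

On these totals the proposed method needs roughly 55% fewer input points and 45% less time than the multi-feature combination method, and roughly 67% fewer points and 40% less time than the sector template method.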
Three rural road images from different regions were selected for this paper. They show scenes in which the curvature of rural roads changes continuously and in which roads resemble the surrounding features. These problems are important factors affecting the degree of automation of the template matching method. The T-shaped template method and circular template method rely on the two most basic template matching principles, namely homogeneity inside the road and heterogeneity between the inside and outside of the road, and can only match simple roads. Where the curvature changes continuously and the road texture is similar to that of surrounding ground objects, a large number of manual points need to be added. The sector template method adds the MLSOH descriptor to template matching. This gives road tracking a directional basis and handles some of the problems caused by continuously changing curvature on rural roads, but it struggles with texture similarity. The multi-feature combination method, on the one hand, improves the MLSOH descriptor and makes the road direction information more reliable; on the other hand, it uses the interactive matching model of panchromatic and HSV space to solve some problems of similar texture. In this paper, the adaptive road width extraction model and the tracking direction prediction model are proposed to improve the accuracy of the initial road point and to adapt to the continuous change of rural road curvature. In addition, the matching model under the constraint of combined geometric and texture features is proposed to further address the texture similarity between rural roads and their surroundings. Experiments in three different areas demonstrate the general applicability of the proposed method.

Conclusions
In this paper, we propose a semi-automatic rural road extraction method that combines road geometric and texture constraints. The adaptive road width extraction model is used to improve the quality of initial road points. The road tracking direction prediction model adapts to the characteristics of continuous changes in rural road curvature. The geometric texture matching model is used to solve the matching problems that arise when the road and background characteristics are similar. Finally, different types of data are used to verify the effectiveness of the proposed method. The three sets of experiments conducted in this paper show that the proposed method can ensure road extraction accuracy, with a recall rate of more than 95%, while improving the degree of automation to a certain extent.
However, the proposed method still has some shortcomings: (1) its applicability to long-distance shadow occlusion or shadow occlusion at road curves is poor, and (2) further research is needed to determine how to extract road regions containing neither geometric information nor obvious texture information.