Author Contributions
Y.D.: conceptualization, methodology, software, data collection, data processing and analysis, visualization, writing (original draft), writing (review and editing). R.Z.: conceptualization, funding, writing (original draft and editing). X.X.: conceptualization, supervision, writing (review and editing). J.Z.: data collection, writing (review and editing). All authors have read and agreed to the published version of the manuscript.
Figure 1.
Spatial distribution of the study areas and datasets (left), and examples of test images with corresponding ground truth labels (right). The sub-figures are in HLJ (A–C), SD (D–F) and ZJ (G–I) sites respectively.
Figure 2.
Overall workflow of the proposed method. Given an RGB image as input, the CNN model extracts the boundary-region features, as illustrated in Frame I. From the CNN activation result, the TRC module builds a graph composed of key points and links (II.a). Subsequently, the DLD (II.b) and DLE (II.c) modules can optionally be applied to further refine the FP boundary, and the graph is finally converted to the polygonal FP result (II.d).
Figure 3.
Examples of prepared training images and their corresponding labels, in which the red, blue and green areas represent the FP region, the buffered FP edge and the non-field area, respectively.
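The three-class labelling described in this caption (FP region, buffered FP edge, non-field area) can be sketched as follows. This is only an illustration under stated assumptions: the class codes, the function name `make_label`, and the square-window buffering rule are hypothetical and are not taken from the paper, which may buffer the vector parcel outlines instead.

```python
# Hypothetical class codes; the paper's colour-to-class mapping is not given here.
NON_FIELD, REGION, EDGE = 0, 1, 2


def make_label(field_mask, buffer_px=1):
    """Turn a binary field mask (list of rows of 0/1) into a three-class label map.

    A pixel is marked EDGE when the square window of radius `buffer_px`
    around it contains both field and non-field pixels. This is a simple
    stand-in for buffering the parcel outline on both sides.
    """
    h, w = len(field_mask), len(field_mask[0])
    label = [[NON_FIELD] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                field_mask[rr][cc]
                for rr in range(max(0, r - buffer_px), min(h, r + buffer_px + 1))
                for cc in range(max(0, c - buffer_px), min(w, c + buffer_px + 1))
            ]
            if any(window) and not all(window):
                label[r][c] = EDGE          # window straddles the boundary
            elif field_mask[r][c]:
                label[r][c] = REGION        # interior of a field parcel
    return label


# A 9 x 9 toy mask with one 5 x 5 field parcel in the middle.
toy_mask = [[1 if 2 <= r <= 6 and 2 <= c <= 6 else 0 for c in range(9)]
            for r in range(9)]
toy_label = make_label(toy_mask, buffer_px=1)
```

A larger `buffer_px` widens the edge class, which is one way to compensate for the strong class imbalance between thin boundaries and large field interiors.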
Figure 4.
Illustration of the key point identification method. Given the binary boundary result from the CNN model (left), the initial skeleton map is derived (middle). Based on the skeleton map, key points are detected under three conditions (right), each with its own determination method: the endpoint of the skeleton (upper row), the normal point on the skeleton (middle row) and the cross point (bottom row). The red lines represent the individual intervals separated by skeleton pixels on the clockwise halo of the 3 × 3 neighborhood.
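One way to read the interval counting in this caption is to walk the eight neighbors of a skeleton pixel clockwise and count how many separate runs of skeleton pixels appear on that ring. The sketch below follows that reading. The mapping of one run to an endpoint, two runs to a normal point, and three or more runs to a cross point is an assumption made for illustration, and the names `count_branches` and `classify_pixel` are hypothetical.

```python
# Clockwise ring of the 8 neighbors, starting at the top-left pixel.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def count_branches(skel, r, c):
    """Count separate runs of skeleton pixels on the clockwise 3 x 3 ring."""
    h, w = len(skel), len(skel[0])
    ring = [
        1 if 0 <= r + dr < h and 0 <= c + dc < w and skel[r + dr][c + dc] else 0
        for dr, dc in RING
    ]
    # Each 0 -> 1 transition on the circular ring starts a new run.
    return sum(1 for i in range(8) if ring[i - 1] == 0 and ring[i] == 1)


def classify_pixel(skel, r, c):
    """Label a pixel as background, isolated, endpoint, normal or cross."""
    if not skel[r][c]:
        return "background"
    branches = count_branches(skel, r, c)
    if branches == 0:
        return "isolated"   # no transition on the ring, e.g. no skeleton neighbors
    if branches == 1:
        return "endpoint"
    if branches == 2:
        return "normal"
    return "cross"


# A plus-shaped toy skeleton: row 2 and column 2 of a 5 x 5 grid.
plus = [[1 if r == 2 or c == 2 else 0 for c in range(5)] for r in range(5)]
kinds = {(r, c): classify_pixel(plus, r, c) for r in range(5) for c in range(5)}
```

Counting runs rather than raw neighbor counts keeps diagonally adjacent skeleton pixels in the same branch, so a pixel next to a junction is still treated as a normal point.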
Figure 5.
Samples of FP region and edge extraction by different backbones at the HLJ site.
Figure 6.
Samples of FP region and edge extraction by different backbones at the SD site.
Figure 7.
Samples of FP region and edge extraction by different backbones at the ZJ site.
Figure 8.
Qualitative comparison results of different methods on typical test area samples.
Figure 9.
Comparison of vectorization results across the test areas: HLJ (a,b), SD (c,d) and ZJ (e,f). Adjacent parcels are rendered using a four-color filling method. Red rectangles highlight regions of interest discussed in the text.
Figure 10.
Statistical analysis of the difference between the width of the deep learning boundary and the actual boundary. Group statistics were conducted based on two dimensions: the study area (row direction) and the actual boundary range (column direction).
Figure 11.
Demonstration of deep learning boundary features on the road category. The image patch (a) shows the sampling point (red arrow) and the cross-sectional sampling range (yellow horizontal line); (b) presents section diagrams of the binary prediction (horizontal bars) and activation curves from various models. This organization style is applied in Figures 11–14.
Figure 12.
Demonstration of deep learning boundary features on the road category with a wide shoulder. The red arrow indicates the target sampling point, and the yellow horizontal line marks the cross-sectional sampling range.
Figure 13.
Demonstration of deep learning boundary features on canal categories. The red arrow indicates the target sampling point, and the yellow horizontal line marks the cross-sectional sampling range.
Figure 14.
Demonstration of deep learning boundary features on other categories. The red arrow indicates the target sampling point, and the yellow horizontal line marks the cross-sectional sampling range.
Figure 15.
Results of the ablation study of post-processing based on different deep learning models. Columns 1 to 5 show the image patch, the binary deep learning result, the polygonal FP result with TRC only, the polygonal FP result with DLD, and the polygonal FP result with DLD and DLE. Rows I–IV represent four representative test cases from different study areas. Red lines indicate the final vectorized FP boundaries.
Figure 16.
Comparison of polygonal FP results derived from OWT-UCM and from our proposed post-processing, based on the same deep learning result: (a) the input image to the deep learning model; (b) the boundary activation from DLinkNet; (c) the ultra-boundary map; (d) small crops of our post-processed result (red lines) and boundary maps from the OWT-UCM result at different threshold levels; (e) our post-processed result.
Figure 17.
Sensitivity analysis of the core parameters in the DLD (left) and DLE (right) modules across DLinkNet and PSPNet models.
Table 1.
Quantitative evaluations of different backbones at different sites.
| Model | Loc. | Boundary Rec. | Boundary Prec. | Boundary IoU | Boundary F1 | FP Region Rec. | FP Region Prec. | FP Region IoU | FP Region F1 |
|---|---|---|---|---|---|---|---|---|---|
| ResUNet | HLJ | 0.789 | 0.096 | 0.094 | 0.172 | 0.704 | 0.949 | 0.679 | 0.809 |
| UNet | HLJ | 0.785 | 0.110 | 0.107 | 0.194 | 0.766 | 0.960 | 0.743 | 0.852 |
| DLinkNet | HLJ | 0.739 | 0.138 | 0.132 | 0.233 | 0.863 | 0.962 | 0.835 | 0.910 |
| PSPNet | HLJ | 0.694 | 0.138 | 0.130 | 0.231 | 0.868 | 0.958 | 0.836 | 0.911 |
| InceptionV3+ | HLJ | 0.546 | 0.086 | 0.080 | 0.148 | 0.784 | 0.833 | 0.677 | 0.808 |
| ResUNet | SD | 0.803 | 0.143 | 0.138 | 0.243 | 0.709 | 0.937 | 0.676 | 0.807 |
| UNet | SD | 0.845 | 0.143 | 0.139 | 0.245 | 0.705 | 0.945 | 0.677 | 0.808 |
| DLinkNet | SD | 0.860 | 0.155 | 0.151 | 0.263 | 0.755 | 0.949 | 0.725 | 0.841 |
| PSPNet | SD | 0.822 | 0.160 | 0.154 | 0.268 | 0.781 | 0.943 | 0.746 | 0.855 |
| InceptionV3+ | SD | 0.620 | 0.133 | 0.123 | 0.219 | 0.809 | 0.776 | 0.656 | 0.792 |
| ResUNet | ZJ | 0.838 | 0.118 | 0.115 | 0.206 | 0.558 | 0.719 | 0.458 | 0.628 |
| UNet | ZJ | 0.859 | 0.144 | 0.140 | 0.246 | 0.571 | 0.817 | 0.506 | 0.672 |
| DLinkNet | ZJ | 0.861 | 0.164 | 0.160 | 0.276 | 0.549 | 0.896 | 0.516 | 0.681 |
| PSPNet | ZJ | 0.809 | 0.161 | 0.155 | 0.268 | 0.560 | 0.896 | 0.526 | 0.690 |
| InceptionV3+ | ZJ | 0.632 | 0.083 | 0.079 | 0.147 | 0.700 | 0.534 | 0.435 | 0.606 |
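For readers who want to reproduce the column definitions, the sketch below computes recall, precision, IoU and F1 from a pair of binary masks. It assumes plain pixel-wise counting; the function name `pixel_metrics` is hypothetical, and the exact evaluation protocol (for example, any tolerance applied to boundary pixels) is not stated in this section. The identity IoU = F1 / (2 − F1) holds for these definitions and agrees with the table: the ResUNet boundary F1 of 0.172 at HLJ gives 0.172 / 1.828 ≈ 0.094, the reported IoU.

```python
def pixel_metrics(pred, truth):
    """Pixel-wise recall, precision, IoU and F1 for two binary masks.

    `pred` and `truth` are equally sized lists of rows holding 0/1 values.
    Ratios with an empty denominator are reported as 0.0.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted and present
            elif p:
                fp += 1      # predicted but absent
            elif t:
                fn += 1      # present but missed
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "iou": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Toy masks: two true positives, one false positive, one false negative.
toy_pred = [[1, 1, 0],
            [0, 1, 0]]
toy_truth = [[1, 0, 0],
             [0, 1, 1]]
scores = pixel_metrics(toy_pred, toy_truth)
```

The pattern of high boundary recall with low boundary precision in the table is what one would expect when predicted boundaries are much wider than the reference lines, since the extra width counts as false positives.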
Table 2.
Quantitative comparison of model outputs across state-of-the-art methods.
| Model | Loc. | Boundary Rec. | Boundary Prec. | Boundary IoU | Boundary F1 | FP Region Rec. | FP Region Prec. | FP Region IoU | FP Region F1 |
|---|---|---|---|---|---|---|---|---|---|
| BFINet | HLJ | 0.663 | 0.109 | 0.103 | 0.187 | 0.733 | 0.903 | 0.679 | 0.809 |
| BSiNet | HLJ | 0.410 | 0.124 | 0.105 | 0.191 | 0.482 | 0.910 | 0.460 | 0.630 |
| HGBNet | HLJ | 0.389 | 0.111 | 0.095 | 0.173 | 0.633 | 0.939 | 0.608 | 0.756 |
| ReaUNet | HLJ | - | - | - | - | 0.654 | 0.913 | 0.615 | 0.762 |
| SEANet | HLJ | 0.481 | 0.081 | 0.075 | 0.139 | 0.429 | 0.954 | 0.420 | 0.591 |
| DLinkNet | HLJ | 0.739 | 0.138 | 0.132 | 0.233 | 0.863 | 0.962 | 0.835 | 0.910 |
| PSPNet | HLJ | 0.694 | 0.138 | 0.130 | 0.231 | 0.868 | 0.958 | 0.836 | 0.911 |
| BFINet | SD | 0.746 | 0.107 | 0.104 | 0.188 | 0.805 | 0.674 | 0.580 | 0.734 |
| BSiNet | SD | 0.468 | 0.121 | 0.107 | 0.193 | 0.604 | 0.660 | 0.461 | 0.631 |
| HGBNet | SD | 0.410 | 0.116 | 0.100 | 0.181 | 0.638 | 0.686 | 0.494 | 0.661 |
| ReaUNet | SD | - | - | - | - | 0.800 | 0.675 | 0.578 | 0.732 |
| SEANet | SD | 0.679 | 0.070 | 0.068 | 0.128 | 0.703 | 0.769 | 0.580 | 0.734 |
| DLinkNet | SD | 0.860 | 0.155 | 0.151 | 0.263 | 0.755 | 0.949 | 0.725 | 0.841 |
| PSPNet | SD | 0.822 | 0.160 | 0.154 | 0.268 | 0.781 | 0.943 | 0.746 | 0.855 |
| BFINet | ZJ | 0.795 | 0.171 | 0.163 | 0.281 | 0.781 | 0.690 | 0.578 | 0.733 |
| BSiNet | ZJ | 0.456 | 0.230 | 0.181 | 0.306 | 0.608 | 0.664 | 0.465 | 0.635 |
| HGBNet | ZJ | 0.514 | 0.199 | 0.167 | 0.286 | 0.684 | 0.803 | 0.585 | 0.738 |
| ReaUNet | ZJ | - | - | - | - | 0.709 | 0.643 | 0.509 | 0.674 |
| SEANet | ZJ | 0.703 | 0.098 | 0.094 | 0.172 | 0.568 | 0.738 | 0.472 | 0.642 |
| DLinkNet | ZJ | 0.861 | 0.164 | 0.160 | 0.276 | 0.549 | 0.896 | 0.516 | 0.681 |
| PSPNet | ZJ | 0.809 | 0.161 | 0.155 | 0.268 | 0.560 | 0.896 | 0.526 | 0.690 |
Table 3.
Quantitative evaluations of vectorized results. ↑ higher is better, ↓ lower is better.
| Configuration | Model | Area | Prec. ↑ | Rec. ↑ | F1 ↑ | IoU ↑ | GOC ↓ | GUC ↓ | GTC ↓ | PoLiS ↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| w/Both () | DLinkNet | HLJ | 0.9154 | 0.9079 | 0.9113 | 0.8393 | 0.1330 | 0.1345 | 0.1316 | 114.66 |
| w/Both () | DLinkNet | SD | 0.9112 | 0.9328 | 0.9218 | 0.8554 | 0.1106 | 0.1016 | 0.1222 | 83.39 |
| w/Both () | DLinkNet | ZJ | 0.8244 | 0.8415 | 0.8325 | 0.7130 | 0.1464 | 0.1450 | 0.1492 | 29.06 |
| w/Both () | PSPNet | HLJ | 0.9161 | 0.8965 | 0.9057 | 0.8299 | 0.1234 | 0.1298 | 0.1179 | 123.93 |
| w/Both () | PSPNet | SD | 0.9065 | 0.9318 | 0.9188 | 0.8502 | 0.1026 | 0.0973 | 0.1094 | 75.35 |
| w/Both () | PSPNet | ZJ | 0.8371 | 0.8049 | 0.8198 | 0.6947 | 0.1231 | 0.1258 | 0.1210 | 31.28 |
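The PoLiS column can be illustrated with a minimal sketch, assuming the commonly used definition of the metric: the mean distance from each vertex of one polygon to the boundary of the other, taken in both directions and each weighted by one half. The function names are hypothetical, polygons are plain vertex lists without a repeated closing vertex, and how the paper matches extracted and reference parcels, aggregates over parcels, or chooses distance units is not stated in this section.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:                       # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_boundary_distance(p, polygon):
    """Distance from p to the closest point on the closed polygon outline."""
    n = len(polygon)
    return min(
        point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
        for i in range(n)
    )


def polis(poly_a, poly_b):
    """Symmetric PoLiS distance between two polygons given as vertex lists."""
    a_to_b = sum(point_boundary_distance(p, poly_b) for p in poly_a)
    b_to_a = sum(point_boundary_distance(p, poly_a) for p in poly_b)
    return a_to_b / (2 * len(poly_a)) + b_to_a / (2 * len(poly_b))


# Two 2 x 2 squares, the second shifted by one unit along x.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
shifted = [(1, 0), (3, 0), (3, 2), (1, 2)]
distance = polis(square, shifted)
```

Because the distance is measured from vertices to outlines, the metric responds to both positional offsets and differences in vertex placement, which makes it sensitive to over-simplified or jagged vectorized boundaries.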
Table 4.
Ablation study of the DLD and DLE post-processing modules. All metrics are area-level averages across two test images. ↑ higher is better, ↓ lower is better.
| Configuration | Model | Area | Prec. ↑ | Rec. ↑ | F1 ↑ | IoU ↑ | GOC ↓ | GUC ↓ | GTC ↓ | PoLiS ↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| Base (w/o DLD, w/o DLE) | DLinkNet | HLJ | 0.9391 | 0.8621 | 0.8977 | 0.8177 | 0.1106 | 0.1001 | 0.1247 | 118.57 |
| Base (w/o DLD, w/o DLE) | DLinkNet | SD | 0.9291 | 0.8655 | 0.8962 | 0.8125 | 0.0808 | 0.0573 | 0.1378 | 88.91 |
| Base (w/o DLD, w/o DLE) | DLinkNet | ZJ | 0.8715 | 0.7242 | 0.7908 | 0.6540 | 0.1056 | 0.0890 | 0.1309 | 30.52 |
| Base (w/o DLD, w/o DLE) | PSPNet | HLJ | 0.9371 | 0.8575 | 0.8942 | 0.8119 | 0.1066 | 0.1018 | 0.1132 | 126.21 |
| Base (w/o DLD, w/o DLE) | PSPNet | SD | 0.9232 | 0.8745 | 0.8982 | 0.8156 | 0.0802 | 0.0601 | 0.1209 | 76.53 |
| Base (w/o DLD, w/o DLE) | PSPNet | ZJ | 0.8763 | 0.7002 | 0.7775 | 0.6361 | 0.0879 | 0.0803 | 0.0975 | 31.77 |
| w/DLD () | DLinkNet | HLJ | 0.9283 | 0.8889 | 0.9074 | 0.8332 | 0.1303 | 0.1245 | 0.1373 | 116.82 |
| w/DLD () | DLinkNet | SD | 0.9187 | 0.9137 | 0.9162 | 0.8458 | 0.1064 | 0.0887 | 0.1333 | 85.80 |
| w/DLD () | DLinkNet | ZJ | 0.8467 | 0.8024 | 0.8236 | 0.7001 | 0.1483 | 0.1372 | 0.1628 | 29.55 |
| w/DLD () | PSPNet | HLJ | 0.9278 | 0.8802 | 0.9024 | 0.8249 | 0.1213 | 0.1213 | 0.1218 | 124.45 |
| w/DLD () | PSPNet | SD | 0.9149 | 0.9121 | 0.9134 | 0.8411 | 0.0986 | 0.0844 | 0.1189 | 76.38 |
| w/DLD () | PSPNet | ZJ | 0.8588 | 0.7623 | 0.8068 | 0.6762 | 0.1178 | 0.1132 | 0.1232 | 31.15 |
| w/DLE () | DLinkNet | HLJ | 0.8639 | 0.9445 | 0.9023 | 0.8241 | 0.0979 | 0.1445 | 0.0760 | 109.97 |
| w/DLE () | DLinkNet | SD | 0.8890 | 0.9580 | 0.9222 | 0.8558 | 0.0872 | 0.0983 | 0.0804 | 78.64 |
| w/DLE () | DLinkNet | ZJ | 0.7674 | 0.8753 | 0.8171 | 0.6910 | 0.1017 | 0.1168 | 0.0907 | 29.48 |
| w/DLE () | PSPNet | HLJ | 0.8623 | 0.9385 | 0.8988 | 0.8181 | 0.0910 | 0.1399 | 0.0689 | 123.64 |
| w/DLE () | PSPNet | SD | 0.8791 | 0.9607 | 0.9179 | 0.8486 | 0.0787 | 0.0965 | 0.0679 | 72.96 |
| w/DLE () | PSPNet | ZJ | 0.7608 | 0.8684 | 0.8100 | 0.6809 | 0.0879 | 0.1082 | 0.0745 | 33.66 |
| w/Both () | DLinkNet | HLJ | 0.9154 | 0.9079 | 0.9113 | 0.8393 | 0.1330 | 0.1345 | 0.1316 | 114.66 |
| w/Both () | DLinkNet | SD | 0.9112 | 0.9328 | 0.9218 | 0.8554 | 0.1106 | 0.1016 | 0.1222 | 83.39 |
| w/Both () | DLinkNet | ZJ | 0.8244 | 0.8415 | 0.8325 | 0.7130 | 0.1464 | 0.1450 | 0.1492 | 29.06 |
| w/Both () | PSPNet | HLJ | 0.9161 | 0.8965 | 0.9057 | 0.8299 | 0.1234 | 0.1298 | 0.1179 | 123.93 |
| w/Both () | PSPNet | SD | 0.9065 | 0.9318 | 0.9188 | 0.8502 | 0.1026 | 0.0973 | 0.1094 | 75.35 |
| w/Both () | PSPNet | ZJ | 0.8371 | 0.8049 | 0.8198 | 0.6947 | 0.1231 | 0.1258 | 0.1210 | 31.28 |