MAC-GAN: A Community Road Generation Model Combining Building Footprints and Pedestrian Trajectories
Abstract
1. Introduction
- (i) For remote sensing data, the internal environment of residential communities is complex: buildings, trees, lawns, and other elements obscure the narrow branch roads in remote sensing images of the communities, which makes it challenging to extract fine-grained branch roads in community space.
- (ii) For trajectory data, the trajectories in community space consist mainly of pedestrian trajectories mixed with a small number of vehicle trajectories, because outside vehicles are restricted from entering residential areas and pedestrian and vehicle traffic are commonly separated by design. Pedestrian trajectories recorded by mobile phone apps cover only a limited range of time and space and are sampled at low frequency. Firstly, such low-frequency trajectories lack the details of the real paths, so they represent roads ambiguously and make road generation more difficult. Secondly, such trajectories record mixed traffic flows of different agents (e.g., pedestrians and vehicles), and these hybrid features in community-like spaces greatly degrade the performance of current methods. Thirdly, unlike vehicles driving on general roads, pedestrians walk freely in the community, for example across lawns and squares. This may produce trajectories with a non-uniform density distribution, and random sampling may further exacerbate this problem. Existing trajectory-based road extraction methods therefore have limitations in dealing with such community trajectory data.
- (iii) For community space, community roads are tighter and denser (i.e., adjacent roads are closer in space) than urban-scale roads. The features adopted by existing methods are insufficient for extracting the interleaved roads of different levels from low-frequency trajectories. In addition, GPS drift caused by dense tall buildings and residents introduces spatial uncertainty and makes adjacent roads harder to distinguish.
- (1) We combine trajectory information, which carries rich road geometry and topological features, with building footprints, which carry the contextual spatial information of roads, to enrich the research dimensions of road extraction methods.
- (2) We propose a generative adversarial model named MAC-GAN for community road extraction. Its generator, MACU-Net, uses convolution blocks with cross-shaped receptive fields to strengthen attention to and perception of the spatial neighborhood of roads, and it builds skip connections and adaptive attention mechanisms to fuse multi-scale features. MACU-Net captures the ternary “road–trajectory–building footprint” features to generate roads from sparse and uneven trajectories.
- (3) We explore a new community-scale application of GANs to geodata translation, transforming a coarser and more accessible geospatial dataset (trajectories and building footprints) into another geospatial dataset (roads), and we verify the feasibility and effectiveness of this application for generating road data.
2. Related Work
2.1. Road Generation Methods Based on Trajectory
2.2. Generative Adversarial Networks and Geospatial Data Translation
3. Methodology
3.1. Framework
3.2. Pre-Processing
3.3. Model Architecture
3.3.1. Generator
- 1. Asymmetric convolutional block (ACB). Existing neural network models typically extract features within a square window using a square convolution kernel, which works for most block-shaped objects and spatial regions. However, the community roads in our study are narrow strips that extend mainly vertically and horizontally, and a square convolution kernel can hardly focus on their linear features. In addition, the community has many road intersections, most of which are crossings and T-junctions. Because square convolution kernels lack direction sensitivity, they cannot concentrate on the road information in different directions at intersections or on the geometric shape of the intersections. The nonlinear features of roads generally appear as complex geometric shapes such as curves, loops, and irregular edges, and square convolution kernels struggle to adequately extract nonlinear features of different scales and shapes. Moreover, the features captured by a square convolution kernel are not equally important: the central cross-shaped positions contribute more information to feature extraction and the corners contribute less [36], so the information extracted by a square convolution is partly redundant and unrepresentative, which further weakens the model’s ability to extract nonlinear features. To overcome these limitations, we choose the three-branch asymmetric convolutional block (ACB) with cross-shaped receptive fields shown in Figure 3 to extract the spatial features of the community roads. As shown in Figure 4, in ACB, the 3 × 3 convolution kernel captures the contextual information of the road, and the 1 × 3 and 3 × 1 convolution kernels capture the road’s linear characteristics, the intersection’s geometry, and the representative linear and nonlinear features at the skeleton. Thus, ACB reduces the capture of redundant information, ensures the extraction of essential road and intersection features, enhances the extraction of representative nonlinear features, and maintains sensitivity to contextual spatial features. The ACB is expressed as follows:
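The three-branch idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single-channel input, zero padding, and a plain sum of the three branch responses followed by ReLU, and it omits the normalization layers a trained block would normally include. The function names (`conv2d_same`, `acb`, `fused_kernel`) are hypothetical. The sketch also shows why the block emphasizes the central cross: adding the 1 × 3 and 3 × 1 kernels onto the 3 × 3 kernel strengthens exactly the middle row and middle column.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation of a 2D list with an odd-sized kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def acb(image, k_square, k_horizontal, k_vertical):
    """Sum the 3x3, 1x3 and 3x1 branch responses, then apply ReLU (assumed fusion rule)."""
    branches = [conv2d_same(image, k) for k in (k_square, k_horizontal, k_vertical)]
    h, w = len(image), len(image[0])
    return [[max(0.0, sum(b[r][c] for b in branches)) for c in range(w)]
            for r in range(h)]


def fused_kernel(k_square, k_horizontal, k_vertical):
    """Single 3x3 kernel equivalent to the summed branches (before ReLU)."""
    fused = [row[:] for row in k_square]
    for j in range(3):
        fused[1][j] += k_horizontal[0][j]   # 1x3 kernel adds onto the middle row
    for i in range(3):
        fused[i][1] += k_vertical[i][0]     # 3x1 kernel adds onto the middle column
    return fused


# Toy input: a cross-shaped "intersection" on a 5x5 grid.
cross = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
ones_3x3 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
ones_1x3 = [[1, 1, 1]]
ones_3x1 = [[1], [1], [1]]

response = acb(cross, ones_3x3, ones_1x3, ones_3x1)
print(response[2][2])  # 11.0 at the intersection (5 + 3 + 3)
print(fused_kernel(ones_3x3, ones_1x3, ones_3x1))
```

With all-ones kernels the square branch alone would respond with 5 at the intersection; the two strip branches add 3 each, so positions lying on horizontal or vertical road segments are weighted more heavily than the corners of the window.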
- 2. Multiscale features skip connection and fusion. Because insufficient extraction and use of the information flow limits the potential of the original U-Net architecture, we incorporate multi-scale skip connections into the U-Net to facilitate interaction between encoders and decoders and to fully capture fine-grained road location, geometric, and topological features together with coarse-grained semantic features. Figure 4 shows how a decoder layer generates its feature map. The first step is the multi-scale features skip connection. Firstly, the feature maps of the same-level encoder layer are concatenated. Subsequently, the transposed convolution and ACB transmit the fine-grained road geometry and topology features of the two lower-level decoder layers. Finally, the max pooling layer and ACB deliver the coarse-grained road semantic information of the two higher-level encoder layers. This process can be expressed as follows:
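The resampling-and-concatenation step can be sketched in plain Python under simplifying assumptions: each feature map is a single-channel square 2D list whose side is a power of two, nearest-neighbor upsampling stands in for the transposed convolution, and the ACB that follows each resampling step is omitted. The function names are hypothetical and the sketch only shows how maps of different resolutions are brought to one size and stacked as channels.

```python
def max_pool(fmap, factor):
    """Downsample a 2D map by taking the maximum over factor x factor windows."""
    h, w = len(fmap) // factor, len(fmap[0]) // factor
    return [[max(fmap[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor))
             for c in range(w)]
            for r in range(h)]


def upsample_nearest(fmap, factor):
    """Upsample a 2D map by repeating each cell factor x factor times.

    Stand-in for the learned transposed convolution described in the text.
    """
    return [[fmap[r // factor][c // factor]
             for c in range(len(fmap[0]) * factor)]
            for r in range(len(fmap) * factor)]


def fuse_multiscale(same_level, higher_encoders, lower_decoders):
    """Bring every source to the target resolution and stack them as channels."""
    target = len(same_level)
    channels = [same_level]
    for fmap in higher_encoders:      # higher-resolution maps: pool down
        channels.append(max_pool(fmap, len(fmap) // target))
    for fmap in lower_decoders:       # lower-resolution maps: scale up
        channels.append(upsample_nearest(fmap, target // len(fmap)))
    return channels


def constant_map(size, value):
    """Helper for the demo: a size x size map filled with one value."""
    return [[value] * size for _ in range(size)]


# Demo: a 4x4 target layer fed by 16x16 and 8x8 encoder maps
# and by 2x2 and 1x1 decoder maps.
stacked = fuse_multiscale(
    same_level=constant_map(4, 0.5),
    higher_encoders=[constant_map(16, 1.0), constant_map(8, 2.0)],
    lower_decoders=[constant_map(2, 3.0), constant_map(1, 4.0)],
)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # 5 4 4
```

In a full model the stacked channels would then pass through further convolution and the attention mechanism before forming the decoder layer's output.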
3.3.2. Discriminator
3.3.3. Loss Function
4. Study Dataset and Evaluation
4.1. Study Dataset
4.2. Baselines and Settings
- Pix2pix configures a U-Net generator whose skip connections improve the semantic extraction capability of the encoder–decoder framework. U-Net has become a standard scheme for capturing the nonlinear and hierarchical features of input images in order to reconstruct images, so we take it as one of the baseline models.
- GANmapper’s generator is configured as an encoder–decoder that includes nine residual blocks (He et al., 2016). The residual blocks reduce, to some extent, the loss of spatial information during down-sampling that up-sampling cannot restore.
- DLink-GAN configures D-LinkNet [33] as the generator. D-LinkNet is a common encoder–decoder for road segmentation and extraction. It replaces the encoder of U-Net with ResNet34 [37], which reduces the loss of spatial information from down-sampling. In addition, the central part between its encoder and decoder uses several skip connections, and the dilated convolutional layers at the center obtain a larger receptive field, which can extract and retain the detailed spatial features of the “trajectory–building footprint–road” triple.
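To make the "larger receptive field" of stacked dilated convolutions concrete, the following sketch computes the receptive field of a stack of stride-1 convolutions. The dilation rates used in the example (1, 2, 4, 8) are an assumption for illustration and are not stated in this section; `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, per axis) of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Four plain 3x3 layers versus four dilated 3x3 layers (assumed rates).
print(receptive_field(3, [1, 1, 1, 1]))  # 9
print(receptive_field(3, [1, 2, 4, 8]))  # 31
```

With the same number of layers and parameters, the dilated stack covers a much wider neighborhood, which is what lets the center of the network relate a road pixel to more distant trajectories and building footprints.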
4.3. Evaluation Metrics
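As a rough illustration of the pixel-level scores reported in the result tables, the sketch below computes precision, recall, F1 score, and IoU from a predicted and a reference binary road mask using their standard definitions. Whether these match the exact definitions used in this study (in particular what the tables call "Accuracy") is an assumption, and FID is not covered because it requires features from a pretrained Inception network. The function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives over two binary masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def road_metrics(pred, truth):
    """Standard pixel-level precision, recall, F1 and IoU for the road class."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 2x3 masks: 1 marks a road pixel.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
print(road_metrics(predicted, reference))
```

In this toy case two road pixels are correctly predicted, one is a false alarm, and one is missed, so precision, recall, and F1 are each 2/3 and IoU is 0.5.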
5. Experimental Results and Analysis
5.1. Test for the Optimized Configurations of Input–Target Pairs
5.2. Comparison with Baselines
5.3. Effect of ACB Block
5.4. Impact of Trajectories of Different Sparseness and Missing Degrees
5.5. Loss of Model Training
5.6. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Bettencourt, L.M.J. The origins of scaling in cities. Science 2013, 340, 1438–1441.
- Karagiorgou, S.; Pfoser, D. On vehicle tracking data-based road network generation. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, New York, NY, USA, 6–9 November 2012; pp. 89–98.
- Liu, X.; Zhu, Y.; Wang, Y.; Forman, G.; Ni, L.M.; Fang, Y.; Li, M.J.H.L. Road Recognition Using Coarse-Grained Vehicular Traces; Hewlett-Packard Development Company: Palo Alto, CA, USA, 2012.
- Xie, X.; Wong, K.B.-Y.; Aghajan, H.; Veelaert, P.; Philips, W. Inferring directed road networks from GPS traces by track alignment. ISPRS Int. J. Geo-Inf. 2015, 4, 2446–2471.
- Zhang, L.; Thiemann, F.; Sester, M. Integration of GPS traces with road map. In Proceedings of the Third International Workshop on Computational Transportation Science, New York, NY, USA, 2 November 2010; pp. 17–22.
- Bruntrup, R.; Edelkamp, S.; Jabbar, S.; Scholz, B. Incremental map generation with GPS traces. In Proceedings of the 2005 IEEE Intelligent Transportation Systems, Vienna, Austria, 16 September 2005; pp. 574–579.
- Cao, L.; Krumm, J. From GPS traces to a routable road map. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, New York, NY, USA, 4–6 November 2009; pp. 3–12.
- Quddus, M.A.; Ochieng, W.Y.; Noland, R.B. Current map-matching algorithms for transport applications: State-of-the art and future research directions. Transp. Res. Part C Emerg. Technol. 2007, 15, 312–328.
- Ahmed, M.; Wenk, C. Constructing street networks from GPS trajectories. In Proceedings of the Algorithms—ESA 2012: 20th Annual European Symposium, Ljubljana, Slovenia, 10–12 September 2012; pp. 60–71.
- Edelkamp, S.; Schrödl, S. Route planning and map inference with global positioning traces. In Computer Science in Perspective; Springer: Berlin/Heidelberg, Germany, 2003; pp. 128–151.
- Worrall, S.; Nebot, E. Automated process for generating digitised maps through GPS data compression. In Proceedings of the Australasian Conference on Robotics and Automation, Brisbane, Australia, 10–12 December 2007.
- Wang, J.; Rui, X.; Song, X.; Tan, X.; Wang, C.; Raghavan, V. A novel approach for generating routable road maps from vehicle GPS traces. Int. J. Geogr. Inf. Sci. 2015, 29, 69–91.
- Guo, Y.; Bardera, A.; Fort, M.; Silveira, R.I. A scalable method to construct compact road networks from GPS trajectories. Int. J. Geogr. Inf. Sci. 2021, 35, 1309–1345.
- Davies, J.J.; Beresford, A.R.; Hopper, A. Scalable, distributed, real-time map generation. IEEE Pervasive Comput. 2006, 5, 47–54.
- Biagioni, J.; Eriksson, J. Inferring road maps from global positioning system traces: Survey and comparative evaluation. Transp. Res. Rec. 2012, 2291, 61–71.
- Yang, X.; Tang, L.; Ren, C.; Chen, Y.; Xie, Z.; Li, Q. Pedestrian network generation based on crowdsourced tracking data. Int. J. Geogr. Inf. Sci. 2020, 34, 1051–1074.
- Shi, W.; Shen, S.; Liu, Y. Automatic generation of road network map from massive GPS, vehicle trajectories. In Proceedings of the 2009 12th International IEEE Conference on Intelligent Transportation Systems, St. Louis, MO, USA, 4–7 October 2009; pp. 1–6.
- Biagioni, J.; Eriksson, J. Map inference in the face of noise and disparity. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, New York, NY, USA, 6–9 November 2012; pp. 79–88.
- Li, Y.; Xiang, L.; Zhang, C.; Wu, H. Fusing taxi trajectories and RS images to build road map via DCNN. IEEE Access 2019, 7, 161487–161498.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
- Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784.
- Song, J.; Li, J.; Chen, H.; Wu, J. MapGen-GAN: A fast translator for remote sensing image to map via unsupervised adversarial learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2341–2357.
- Dong, G.; Huang, W.; Smith, W.A.; Ren, P. A shadow constrained conditional generative adversarial net for SRTM data restoration. Remote Sens. Environ. 2020, 237, 111602.
- Zhu, D.; Cheng, X.; Zhang, F.; Yao, X.; Gao, Y.; Liu, Y. Spatial interpolation using conditional generative adversarial neural networks. Int. J. Geogr. Inf. Sci. 2020, 34, 735–758.
- Milojevic-Dupont, N.; Hans, N.; Kaack, L.H.; Zumwald, M.; Andrieux, F.; de Barros Soares, D.; Lohrey, S.; Pichler, P.-P.; Creutzig, F. Learning from urban form to predict building heights. PLoS ONE 2020, 15, e0242010.
- Mocnik, F.-B. Benford’s law and geographical information—The example of OpenStreetMap. Int. J. Geogr. Inf. Sci. 2021, 35, 1746–1772.
- Majic, I.; Naghizade, E.; Winter, S.; Tomko, M. There is no way! Ternary qualitative spatial reasoning for error detection in map data. Trans. GIS 2021, 25, 2048–2073.
- Wu, A.N.; Biljecki, F.J. GANmapper: Geographical data translation. Int. J. Geogr. Inf. Sci. 2022, 36, 1–29.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241.
- Li, R.; Zheng, S.; Duan, C.; Su, J.; Zhang, C. Multistage attention ResU-Net for semantic segmentation of fine-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Zhou, L.; Zhang, C.; Wu, M. D-LinkNet: LinkNet with pretrained encoder and dilated convolution for high resolution satellite imagery road extraction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 182–186.
- Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.-W.; Wu, J. Unet 3+: A full-scale connected unet for medical image segmentation. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 1055–1059.
- Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
- Li, R.; Duan, C.; Zheng, S. MACU-Net Semantic Segmentation from High-Resolution Remote Sensing Images. arXiv 2020, arXiv:2007.13083.
- Chaurasia, A.; Culurciello, E. Linknet: Exploiting encoder representations for efficient semantic segmentation. In Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. Gans trained by a two time-scale update rule converge to a local nash equilibrium. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
Day | Time | Id | Longitude | Latitude
---|---|---|---|---
8 March 2019 | 23:31:00 | 7 | 114.310290 | 30.523696

Configuration | Model | Accuracy | Recall | F1 Score | IOU | FID
---|---|---|---|---|---|---
Cb | | 0.356 | 0.339 | 0.341 | 0.212 | 114.29
Ct | | 0.706 | 0.681 | 0.687 | 0.542 | 161.84
Cbt | | 0.702 | 0.752 | 0.718 | 0.574 | 92.20

Metric | MAC-GAN | Pix2pix | GANmapper | DLink-GAN
---|---|---|---|---
Accuracy | 0.702 | 0.668 | 0.778 | 0.809
Recall | 0.752 | 0.499 | 0.552 | 0.613
F1 score | 0.718 | 0.650 | 0.701 | 0.694
IOU | 0.574 | 0.499 | 0.552 | 0.542
FID | 92.20 | 86.61 | 77.92 | 100.17

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, L.; Wei, J.; Zuo, Z.; Zhou, S. MAC-GAN: A Community Road Generation Model Combining Building Footprints and Pedestrian Trajectories. ISPRS Int. J. Geo-Inf. 2023, 12, 181. https://doi.org/10.3390/ijgi12050181