Abstract
Road marking lines can be extracted from aerial images using semantic segmentation (SS) models; in this work, however, a conditional generative adversarial network, RoadMark-cGAN, is proposed to extract them directly via image-to-image translation. The generator incorporates residual and attention blocks in a functional bottleneck, while the discriminator is a modified PatchGAN with an optimized encoder and an added attention block. The proposed model is refined through three successive versions (v2 to v4), in which dynamic dropout techniques and a novel “Morphological Boundary-Sensitive Class-Balanced” (MBSCB) loss are progressively introduced to better handle the severe class imbalance in the data. All models were trained on a novel “RoadMarking-binary” dataset (29,405 RGB orthoimage tiles of 256 × 256 pixels with corresponding ground-truth masks) to learn the distribution of road marking lines on pavement. Quantitative evaluation on a test set of 2045 unseen images showed that the best proposed model improved the Intersection-over-Union (IoU) score for the positive, underrepresented class by averages of 45.2% and 1.7% over the best Pix2Pix and SS models, respectively, trained for the same task. Finally, a qualitative visual comparison was conducted to assess the quality of the road marking predictions of the best models and their mapping performance.