Road Extraction from High Resolution Image with Deep Convolution Network — A Case Study of GF-2 Image †

With recent advances in remote sensing and computing, automatic and accurate road extraction is becoming feasible for practical use. Accurate extraction of road information from satellite data has become one of the most popular topics in both the remote sensing and transportation fields, and it is useful for fast map updating, construction supervision, and similar tasks. However, because remote sensing data usually carries a huge volume of information, an efficient method for refining it is important in such applications. We apply a deep convolutional network to perform image segmentation as a solution for extracting road networks from high-resolution images. To take advantage of deep learning, we study methods for generating representative training and testing datasets, and develop semi-supervised learning techniques to enlarge the dataset. We also study extraction from satellite images affected by color distortion, to make the method robust across more application fields. GF-2 satellite data is used for the experiments, as its images may show optical distortion in small patches. The experiments show that the proposed solution successfully identifies road networks in complex scenes, with a total accuracy of more than 80% in discriminable areas.


Introduction
In remote sensing applications, road network extraction plays an important role in traffic management, navigation, map updating, and city planning. Remote sensing images have the unique advantage of providing large-scale information, which makes them well suited to analyzing road networks efficiently [1,2]. Accurate road information from high spatial-resolution images has become urgently required in recent years. Over the past few decades, great efforts have been made to extract and update road networks [3][4][5][6][7][8].
Road networks have a standard geometrical morphology, but in general it is not easy to extract them precisely from remote sensing images. The reason is that roads in real settings are usually covered by many kinds of ground objects, such as vehicles, trees, and shadows, so their shape and color vary from area to area. Different methods have been developed in recent years. The state-of-the-art methods can be divided into three typical categories: (1) outline or border extraction; (2) object detection/classification; and (3) image segmentation [9][10][11]. This paper focuses on the third category, in which the image scene is divided into road area and background area. If desired, the road areas can be further segmented, for example according to the paving material. This approach aims to label each pixel with a membership probability for each class, which is challenging work. Recently, with the growth of computing power (GPUs) and the rise of the big data concept, deep convolutional neural networks (DCNNs) have been widely used in image studies [12,13]. DCNN methods are widely considered to have revolutionized computer vision with remarkable achievements. On the other hand, although in the current era of big data various images can easily be obtained from the network for free, which supports the use of DCNNs, these images contain huge amounts of information, including both valuable and useless content for our specific research. Consequently, the first problem to solve is how to refine and enhance the datasets effectively so that they are more suitable for our study.
GaoFen-2 (GF-2) is one of China's civilian optical satellites, with a resolution better than 1 m. Owing to their excellent spatial resolution, GF-2 images can be used in various fields, including road updating. However, we chose GF-2 not for its high spatial resolution, but because it has an obvious problem in radiance calibration, which has been found in previous studies. We aim to develop road extraction methods that remain feasible in cases of color distortion, which may help to expand the application areas of GF-2 images.
Inspired by big data theory and deep learning techniques, this paper proposes a solution for extracting road networks from GF-2 images. First, a semi-supervised method is applied to generate labeled data. The benchmark roads are produced automatically and then revised manually according to the road design and construction specifications of the transportation industry. Data with color distortion is also regarded as one of the road types. After that, a DCNN model with deep layers is trained to learn the various road characteristics, so that it can distinguish multiple road types in complex situations.

The Problem and Task Description
As mentioned before, we use GF-2 data in this paper. The GF-2 satellite, launched on 19 August 2014, is the first optical remote sensing satellite in China with a spatial resolution better than 1 m for panchromatic and 4 m for multispectral imagery. Since it began imaging and transmitting data, GF-2 has supported various studies, such as civil land observation, land and resource monitoring, map updating, and others. Table 1 shows the detailed parameters of the GF-2 satellite [14]. A disadvantage of GF-2 is that its images are sometimes affected by optical distortion. This is in fact a common problem for many high-resolution optical satellites. In GF-2, however, radiometric distortion is a very noticeable problem when identifying road networks. Figure 1 shows a typical example, in which the road surface gives an abnormal spectral reflectivity. The roads are paved with asphalt and concrete, which normally appear in different colors. In Figure 1, however, we cannot differentiate them because the color of the same asphalt road turns from dark grey into bright white. The problem is more likely to occur when nearby objects have certain reflectances that affect the satellite sensors. This paper aims to handle the problem mainly in the following ways. First, inspired by the basic idea of big data, we use data of large scale and varied content in order to cover as many different conditions as possible. Second, we use DCNN methods with very deep layers to learn abstract features from these conditions. Moreover, data enhancement and perturbations are applied to generate enough data and avoid overfitting.

Producing Dataset by Semi-Supervised Method
For deep learning algorithms, a pixel-level labelled dataset is crucial for model training, but for a remote sensing dataset, manually drawing regions with clear boundaries is a very time-consuming process. With the development of web-map services, rich information about roads can now easily be obtained from the network for free. For example, OpenStreetMap (OSM), one of the most popular web maps, has been widely used to provide road information, including the road level, road name, and road type. Combining the OSM road vectors with the satellite images therefore clearly reduces the time needed to produce the road labels. More importantly, it also helps to improve the consistency of labels produced by different people. The main workload is to register the road network and the satellite images to a common coordinate system, which can be done with typical geographic information system (GIS) software. After that, the label accuracy needs to be improved further, because in some areas the map may be outdated. One can use open social platforms or a free map server API to search for the place and road names within the assigned coordinate ranges, which yields suspected locations. Finally, we inspect and manually draw the final road regions. The paving material of the road surface can be set according to the road color. For a specific case like Figure 1, the road surface is labeled as an "other case" in order to distinguish it from normal situations.
After the images are labeled, preprocessing and data cleaning steps are required to produce the training datasets for building the model, because the original image scene usually covers hundreds of square kilometers and the computer cannot process that much data at once. To generate the train/test datasets, each original satellite image is first clipped into small tiles, sized according to the specific computer settings. The subsequent data cleaning step removes the tiles that contain no or few road segments. This step is crucial because images containing roads usually make up only a small proportion of the tiles; if most training images contain no roads, the DCNN will tend to classify most pixels as background.
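As a rough illustration of how registered road vectors can be turned into an initial pixel-level label before manual revision, the sketch below burns a centerline into a binary mask. It assumes the OSM vertices have already been converted to image (row, column) coordinates by GIS software, and it assumes a fixed road width in pixels; the function name `rasterize_polyline` and both assumptions are hypothetical and not taken from the paper.

```python
import numpy as np

def rasterize_polyline(points, width, shape):
    """Burn a road centerline into a binary mask.

    points: list of (row, col) vertices, assumed already registered to the image grid.
    width:  assumed road width in pixels (hypothetical parameter).
    shape:  (rows, cols) of the output mask.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    mask = np.zeros(shape, dtype=bool)
    for (r0, c0), (r1, c1) in zip(points[:-1], points[1:]):
        dr, dc = r1 - r0, c1 - c0
        length_sq = dr * dr + dc * dc
        if length_sq == 0:
            continue  # skip repeated vertices
        # Projection of every pixel onto the segment, clamped to its end points.
        t = np.clip(((rows - r0) * dr + (cols - c0) * dc) / length_sq, 0.0, 1.0)
        dist = np.hypot(rows - (r0 + t * dr), cols - (c0 + t * dc))
        # A pixel belongs to the road if it lies within half the width of the centerline.
        mask |= dist <= width / 2.0
    return mask

# A horizontal road, 3 pixels wide, across a 10 x 10 tile.
label = rasterize_polyline([(5, 0), (5, 9)], width=3, shape=(10, 10))
```

A mask produced this way is only a starting point; the manual inspection described above is still needed where the map is outdated or the road width differs from the assumed value.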
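The clipping and cleaning steps can be sketched as follows. This is a minimal example, assuming the scene and its label raster are NumPy arrays with 0 as the background label; the threshold `min_road_fraction` is a hypothetical parameter, since the paper does not state how few road pixels count as "few".

```python
import numpy as np

def tile_and_filter(image, label, tile=256, min_road_fraction=0.01):
    """Clip a scene into tile x tile patches and drop patches with too few road pixels.

    image: array of shape (rows, cols, bands).
    label: array of shape (rows, cols); 0 is assumed to mean background.
    min_road_fraction: assumed cut-off, not a value from the paper.
    """
    patches = []
    rows, cols = label.shape
    for r in range(0, rows - tile + 1, tile):
        for c in range(0, cols - tile + 1, tile):
            lab = label[r:r + tile, c:c + tile]
            # Keep the patch only if enough of its pixels carry a road label.
            if np.count_nonzero(lab) / lab.size >= min_road_fraction:
                patches.append((image[r:r + tile, c:c + tile], lab))
    return patches
```

Filtering on the label rather than on the image keeps the class balance under direct control, which is the concern raised above about the network drifting toward the background class.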

Road Segmentation by DCNN
Different DCNN methods can be used, such as U-Net, DeepLab, SegNet, and so on [15][16][17]. In this paper we chose DeepLab, since it can be applied with very deep layers. To match such a large number of DCNN layers, data enhancement is a necessary process. Usually, image rotation and mirroring are used to extend the training datasets several times over. We can also set the clipping origin of Section 2.2 at random; from the computer's point of view, this produces different object shapes.
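The enhancement described here can be illustrated with a short sketch. It assumes image tiles are NumPy arrays of shape (rows, cols, bands) with a matching two-dimensional label; the function names are hypothetical, and the choice of the eight rotation and mirror variants is one common way to realize "rotation and mirror", not necessarily the exact set used in the experiments.

```python
import numpy as np

def augment(image, label):
    """Return the 8 rotation/mirror variants of an (image, label) pair."""
    variants = []
    for k in range(4):
        # Rotate image and label together by k * 90 degrees in the spatial plane.
        img_r = np.rot90(image, k, axes=(0, 1))
        lab_r = np.rot90(label, k, axes=(0, 1))
        variants.append((img_r, lab_r))
        # Mirror each rotated pair left to right.
        variants.append((np.flip(img_r, axis=1), np.flip(lab_r, axis=1)))
    return variants

def random_crop(image, label, tile, rng):
    """Clip one tile whose origin is drawn at random inside the scene."""
    r = rng.integers(0, label.shape[0] - tile + 1)
    c = rng.integers(0, label.shape[1] - tile + 1)
    return image[r:r + tile, c:c + tile], label[r:r + tile, c:c + tile]

# Example: one random 4 x 4 tile from an 8 x 8 scene, expanded to 8 variants.
rng = np.random.default_rng(0)
scene = np.arange(8 * 8 * 3).reshape(8, 8, 3)
truth = np.arange(8 * 8).reshape(8, 8)
tile_img, tile_lab = random_crop(scene, truth, tile=4, rng=rng)
samples = augment(tile_img, tile_lab)
```

The image and its label must always receive the same transform; applying them separately would silently misalign the training pairs.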

Post-Processing and Refining
Many researchers aim to find an end-to-end approach to image segmentation, but for practical applications of road verification from complex satellite data, it is desirable to apply a modified approach, for example using morphological algorithms to join broken network structures and to filter out spots and burrs. To make the extracted road region more realistic, the extracted road segments are processed through several morphological algorithms to fill holes, smooth edges, and connect road segments, and finally to obtain the coarse center lines of the road segments.
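The morphological clean-up can be sketched with basic binary operations. The example below is a minimal illustration, assuming a 3 x 3 square structuring element and one opening followed by one closing; the paper does not specify the operators or their sizes, so these are assumptions, and the thinning step that yields the center lines is not included.

```python
import numpy as np

def dilate(mask):
    """Binary dilation with a 3 x 3 square structuring element."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out

def erode(mask):
    """Binary erosion as the dual of dilation (image borders do not erode)."""
    return ~dilate(~mask)

def clean_road_mask(mask):
    """Remove small spots and burrs (opening), then bridge small gaps and holes (closing)."""
    m = mask.astype(bool)
    m = dilate(erode(m))  # opening: deletes structures thinner than the element
    m = erode(dilate(m))  # closing: fills gaps narrower than the element
    return m

# A 3-pixel-wide road with a 1-pixel break and one isolated false detection.
raw = np.zeros((9, 9), dtype=bool)
raw[3:6, :] = True
raw[3:6, 4] = False
raw[0, 8] = True
cleaned = clean_road_mask(raw)
```

With this element size, only roads at least three pixels wide survive the opening, so the structuring element would need to be chosen with the narrowest road of interest in mind.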
(1) Order the initial center lines from a given start pixel, and divide the lines into several parts at the branch points; (2) For each segment of the lines, a group of straight-line approximations is obtained for a given interval [18], which is 50 in this paper. Each approximation is a line function of the form $y_i = a_{i,M}\,x_i + b_{i,M}$, where $x_i$ and $y_i$ are the coordinates of the given line segment; $M$ is the number of subsegments produced by the given interval; and $a_{i,M}$ and $b_{i,M}$ are the coefficients of the approximated line function for the $M$-th subsegment.
(3) A grouping strategy, inspired by [19], is finally adopted to improve the results iteratively. Put simply, for every three neighboring line segments, if the segments share the same direction up to a tolerance τ, they are regarded as the same line, and a new line approximation is made for all the points in these segments.
(4) Iteratively check the current lines using step (3) until the modification is finished.
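Steps (2) to (4) can be sketched as follows. This is a minimal illustration under several assumptions: the centerline is an ordered list of (x, y) points without branches, each subsegment is fitted by least squares with `numpy.polyfit`, directions are compared as slope angles, and the tolerance τ defaults to 5 degrees. Only the interval of 50 comes from the paper; the function names and the value of τ are hypothetical.

```python
import math
import numpy as np

def fit_line(points):
    """Least-squares fit of y = a * x + b to a list of (x, y) points."""
    x, y = np.asarray(points, dtype=float).T
    a, b = np.polyfit(x, y, 1)
    return float(a), float(b)

def split_by_interval(points, interval=50):
    """Step (2): divide an ordered centerline into subsegments of `interval` points."""
    groups = [points[i:i + interval] for i in range(0, len(points), interval)]
    if len(groups) > 1 and len(groups[-1]) < 2:
        # A single leftover point cannot define a line; attach it to the previous group.
        tail = groups.pop()
        groups[-1] = groups[-1] + tail
    return groups

def same_direction(groups, tau):
    """True if the fitted slope angles of all groups agree within tau (radians)."""
    angles = [math.atan(fit_line(g)[0]) for g in groups]
    return max(angles) - min(angles) <= tau

def merge_segments(points, interval=50, tau=math.radians(5)):
    """Steps (3) and (4): merge three neighboring subsegments while they share a direction."""
    groups = split_by_interval(points, interval)
    changed = True
    while changed:
        changed = False
        for i in range(len(groups) - 2):
            if same_direction(groups[i:i + 3], tau):
                # Treat the three subsegments as one line and refit on all their points.
                groups[i:i + 3] = [groups[i] + groups[i + 1] + groups[i + 2]]
                changed = True
                break
    return [fit_line(g) for g in groups]

# A centerline that runs flat for 150 points and then climbs at 45 degrees.
line = [(x, 0.0) for x in range(150)] + [(x, float(x - 150)) for x in range(150, 300)]
fitted = merge_segments(line)
```

Because the fit uses y as a function of x, near-vertical segments are poorly represented; a production version would need a direction estimate that is independent of the axis choice.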

Results and Discussion
In our experiments, GF-2 data from different locations and seasons are used for testing. The multispectral bands are fused with the panchromatic band to produce images with a spatial resolution of 1 m. The original image scene is divided into images of 256 × 256 pixels, producing thousands of images for training and testing. The label consists of three parts: centerline, width, and material. The material classes include asphalt, concrete, and others. The label details have been checked manually in order to guarantee the authenticity of the data.
Figure 2 shows the results for a rural area. There is optical distortion at this location: the roof in the center of the scene is excessively bright, while the vegetation in the top right corner is converted to a grey color. Nevertheless, as shown in Figure 2b, the proposed method extracts the road segments successfully.
Figure 3 gives the classification results when different materials appear in the same image. There are three road segments with two different pavements, asphalt and concrete. Asphalt is labeled in red and concrete in green. There is a parking lot in the center of this image, paved with concrete. The parking lot is not extracted, implying that the proposed method judges an object not only by its color but also by its shape. In this image, the asphalt road is not exactly dark in color, owing to the optical distortion, but the road segment is still extracted correctly.

Discussion
However, substantial further work is still needed: (1) the method should be tested continually on various areas before wide usage; (2) accurate and smooth approximations of curved lines should be studied further.

Conclusions
This paper applies DCNN to road detection in remote sensing. A semi-supervised labelling method is studied to make the approach applicable to a wide range of different areas. The DCNN is used as a tool for feature extraction, and precise road boundaries are obtained by post-processing, so that the final results are more consistent with the real situation. That is, a much straighter and smoother road region can be provided, which is a further step towards practical application. According to the experiments, a total correctness of approximately 80% can be obtained with the proposed method.

Figure 1. The optical distortion on road networks of a GF-2 image: (a) a village area with abnormally whitened road networks; (b) magnified detail of (a).