Vegetation Greening for Winter Oblique Photography Using Cycle-Consistence Adversarial Networks

A 3D city model is critical for the construction of a digital city. One of the methods of building a 3D city model is tilt photogrammetry. In this method, oblique photography is crucial for generating the model because the visual quality of photography directly impacts the model’s visual effect. Yet, sometimes, oblique photography does not have good visual quality due to a bad season or defective photographic equipment. For example, for oblique photography taken in winter, vegetation is brown. If this photography is employed to generate the 3D model, the result would be bad visually. Yet, common methods for vegetation greening in oblique photography rely on the assistance of the infrared band, which is not available sometimes. Thus, a method for vegetation greening in winter oblique photography without the infrared band is required, which is proposed in this paper. The method was inspired by the work on CycleGAN (Cycle-consistence Adversarial Networks). In brief, the problem of turning vegetation green in winter oblique photography is considered as a style transfer problem. Summer oblique photography generally has green vegetation. By applying CycleGAN, winter oblique photography can be transferred to summer oblique photography, and the vegetation can turn green. Yet, due to the existence of “checkerboard artifacts”, the original result cannot be applied for real production. To reduce artifacts, the generator of CycleGAN is modified. As the final results suggest, the proposed method unlocks the bottleneck of vegetation greening when the infrared band is not available and artifacts are reduced.


Introduction
A 3D city model is critical for the construction of a digital city. It is broadly used to provide information for urban planning, construction, management and emergency response. One of the methods to build a 3D city model is tilt photogrammetry [1]. This method generates the model from oblique photography, so the visual quality of the oblique photography directly impacts the visual effect of the final result. Accordingly, it is important to take oblique photography with good visual quality.
Yet, due to defective photographic equipment, bad weather or an unfavorable season, oblique photography may sometimes have poor visual quality. One such situation is oblique photography taken in winter, in which the vegetation is usually brown and the image color is not bright. If this photography were used to generate a 3D model [2], the result would be poor, as shown in Section 4. To generate a model with good visual quality, the visual quality of the oblique photography must be improved. Yet, common methods like [3] need the infrared band of the oblique photography to help turn the vegetation green, and this band is not always available. A method capable of turning vegetation green without the infrared band is therefore desirable.
Symmetry 2018, 10, 294
In recent years, convolutional neural networks have been applied in a large number of domains with great success. In the image-to-image translation area, Cycle-consistence Adversarial Networks (CycleGAN) [4] have attracted much attention by virtue of their excellent performance. CycleGAN is capable of capturing the features of one image collection and finding out how these features could be translated into another image collection. One impressive example is its transformation between summer style and winter style. Inspired by this, vegetation greening in winter oblique photography can be treated as a style transfer problem, because oblique photography taken in summer usually has green vegetation. To verify the feasibility of this assumption, winter oblique photography is converted into summer oblique photography using CycleGAN. As the result suggests, vegetation becomes green after CycleGAN is applied to winter oblique photography. Yet, "checkerboard artifacts" are also found in the transferred winter oblique photography. To reduce these artifacts, CycleGAN's generator is modified with respect to its kernels. As the final result suggests, vegetation becomes green after the transformation, and artifacts are successfully reduced, as shown in Section 4.
To sum up, the contributions of our work are listed as follows.
(1) Vegetation greening in winter oblique photography is achieved. In comparison with common methods, the infrared band is no longer required.
(2) Checkerboard artifacts are reduced after CycleGAN is modified. The transferred photography can be applied in production.
(3) The model can be trained with unpaired images, which is practical.
The rest of this paper is organized as follows. In the next section, we review the relevant work on unpaired image-to-image translation, GAN and cycle consistency. The proposed method is illustrated in Section 3. The comparison between the proposed method and other methods is drawn in Section 4. The last section draws the conclusions and discusses future work.

Related Works
Unpaired image-to-image translation: The concept of image-to-image translation was first proposed in [5]. Since then, numerous methods for image-to-image translation have been proposed. On the whole, these methods fall into two groups. The first is based on paired images for training, and the second is based on unpaired images.
For the methods in the first group [5][6][7][8][9][10], a lack of paired images poses a big challenge [4]. In practice, it is hard and expensive to prepare paired images for training. One example is artistic stylization: for every input image to be stylized, it is hard to prepare the corresponding output because the desired outputs are highly sophisticated. To overcome the limitation of insufficient paired images, methods that do not require paired images for training have been proposed.
In [11], image-to-image translation based on unpaired images was achieved using a Bayesian network. In [12], a Bayesian network was combined with a neural network to perform efficient inference, so that a direct probabilistic model can be learned. In [13], an unsupervised image-to-image translation network based on a variational autoencoder and GAN was proposed. It helped realize learning without paired images under GAN. In [14], GAN was also employed to build a Coupled Generative Adversarial Network (CoGAN), in which the network can learn joint distributions of different styles of images. In recent work, CycleGAN [4] attracted great attention by virtue of its state-of-the-art performance. One example was its translation between summer and winter.
Generative adversarial network: The Generative Adversarial Network (GAN) has a short history. GAN was first proposed in 2014 in [15] and achieved great success. Then, different types of GAN were proposed. In [16], a Laplacian pyramid was applied to an adversarial network, so that coarse images could be made fine. In [17], a deep convolutional generative adversarial network was proposed. It narrowed the gap between supervised learning and unsupervised learning. The work in [18] proposed a recurrent adversarial network. It could generate image samples for training. The work in [19] came up with interpretable representation learning using the Information maximizing Generative Adversarial Network (InfoGAN). It was capable of learning disentangled representations in a completely unsupervised condition. The work in [20] listed several methods for better training of GAN, and [21] explained the principles of GAN in terms of energy.
Cycle consistency: The use of cycle consistency as a way to regularize data has a long history. Cycle consistency consists of forward consistency and backward consistency. It has served as a trick for decades [22]. In [23][24][25], higher-order cycle consistency was used in different tasks, like human translation, 3D shape matching and depth estimation. In particular, in the work of [25,26], cycle consistency loss served as a method to train the neural network. This forms a strategy in CycleGAN.

The Proposed Method
Our method aims to turn vegetation green in winter oblique photography when the infrared band is not available. By using CycleGAN, this purpose can be achieved when winter oblique photography is transferred to summer oblique photography. Besides, CycleGAN's generator is modified by adjusting the kernel size to reduce "checkerboard artifacts" in the transferred photography.

CycleGAN
Image-to-image translation might be difficult because paired images are hard to prepare for the model's training. For instance, in our case, it is hard to find summer oblique photography that corresponds to a given winter oblique photograph. To handle this problem, CycleGAN is introduced to achieve image-to-image translation. After training without paired images, the well-trained CycleGAN model can be employed to realize a mapping from winter oblique photography to summer oblique photography. Figure 1 shows an example of winter oblique photography and summer oblique photography.
First and foremost, we define X as the style domain of winter oblique photography and Y as the style domain of summer oblique photography. The distributions of these domains are denoted as $x \sim p_{data}(x)$ and $y \sim p_{data}(y)$, respectively. The winter oblique photographs are denoted as $\{x_i\}_{i=1}^{N}$, where $x_i \in X$. The summer oblique photographs are denoted as $\{y_j\}_{j=1}^{M}$, where $y_j \in Y$. Besides, there are two mappings G: X→Y and F: Y→X. The first one translates winter oblique photography to summer oblique photography, and the second one translates summer oblique photography to winter oblique photography. The transferred winter oblique photography is denoted as {G(x)}. Likewise, the transferred summer oblique photography is denoted as {F(y)}.
In addition, there are two discriminators $D_X$ and $D_Y$ in CycleGAN. $D_X$ is responsible for distinguishing real winter photography $\{x_i\}_{i=1}^{N}$ from fake winter photography (transferred summer photography) {F(y)}. $D_Y$ is responsible for distinguishing real summer photography $\{y_j\}_{j=1}^{M}$ from fake summer photography (transferred winter photography) {G(x)}.
The structure of CycleGAN is shown in Figure 2.
Figure 2. CycleGAN consists of two mappings G: X→Y and F: Y→X and two discriminators $D_X$ and $D_Y$. $D_Y$ helps G: X→Y better translate winter photography to summer photography, and the same goes for F: Y→X and $D_X$.

Adversarial Loss
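Adversarial loss [15] is used for both mappings G: X→Y and F: Y→X. For the mapping G: X→Y and its discriminator $D_Y$, the loss is defined as:
$$\mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))] \quad (1)$$
In CycleGAN, G tries to generate photography {G(x)} that is close to the distribution of domain Y, while $D_Y$ seeks to distinguish fake {G(x)} from real $\{y_j\}_{j=1}^{M}$. G attempts to minimize Equation (1), while $D_Y$ tries to maximize it. Thus, the target of mapping G: X→Y is written as:
$$\min_{G} \max_{D_Y} \mathcal{L}_{GAN}(G, D_Y, X, Y) \quad (2)$$
Likewise, the target of mapping F: Y→X can be written as:
$$\min_{F} \max_{D_X} \mathcal{L}_{GAN}(F, D_X, Y, X) \quad (3)$$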
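The opposing goals of G and the discriminator can be illustrated numerically. The following is a minimal sketch, assuming the standard log-likelihood form of the adversarial loss from [15]; the function name and the discriminator scores are hypothetical and serve only as an illustration.

```python
import math

def adversarial_loss(d_real, d_fake):
    """Empirical adversarial loss for G: X->Y and its discriminator D_Y.

    d_real: scores D_Y(y) on real summer samples, each in (0, 1).
    d_fake: scores D_Y(G(x)) on transferred winter samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical scores: a discriminator that separates real from fake well ...
confident_d = adversarial_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# ... and the same discriminator when G produces more convincing images.
fooled_d = adversarial_loss(d_real=[0.9, 0.8], d_fake=[0.7, 0.6])

print(confident_d, fooled_d)
```

The loss is lower when the transferred images receive high scores, which is the direction G pushes toward, while the discriminator pushes the value up by scoring fakes low.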

Cycle Consistency Loss
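The addition of cycle consistency loss [4] aims to ensure that the learned mapping can map $\{x_i\}_{i=1}^{N}$ to the desired output $\{y_j\}_{j=1}^{M}$. For x from domain X, the image cycle translation should be able to bring it back to its origin, which is written as x→G(x)→F(G(x)) ≈ x. Equally, there is y→F(y)→G(F(y)) ≈ y. Then, cycle consistency loss is defined as:
$$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x \rVert_2] + \mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y \rVert_2] \quad (4)$$
where $\mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x \rVert_2]$ and $\mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y \rVert_2]$ are forward cycle loss and backward cycle loss, respectively.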
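The two cycle terms can be sketched on toy data. The sketch below assumes an L2 norm (the original CycleGAN paper [4] uses an L1 norm) and represents images as short lists of pixel values; the toy mappings `G`, `F` and `F_bad` are hypothetical stand-ins for the trained generators.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equally sized pixel lists."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def cycle_consistency_loss(G, F, xs, ys):
    """Return (total, forward, backward) cycle loss over two image sets."""
    forward = sum(l2_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l2_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward, forward, backward

# Toy mappings: G brightens a "winter" image, F darkens a "summer" image.
def G(x):
    return [v + 0.5 for v in x]

def F(y):
    return [v - 0.5 for v in y]      # exact inverse of G

def F_bad(y):
    return [v - 0.25 for v in y]     # does not undo G completely

xs = [[0.0, 1.0, 2.0]]               # hypothetical winter samples
ys = [[1.0, 1.5, 2.0]]               # hypothetical summer samples

total_exact, _, _ = cycle_consistency_loss(G, F, xs, ys)
total_bad, fwd_bad, bwd_bad = cycle_consistency_loss(G, F_bad, xs, ys)
print(total_exact, total_bad)
```

When F inverts G exactly, both cycles return the input and the loss vanishes; any residual shift after the round trip is penalized in both the forward and the backward term.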

Total Loss
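The final objective is:
$$\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{GAN}(G, D_Y, X, Y) + \mathcal{L}_{GAN}(F, D_X, Y, X) + \lambda \mathcal{L}_{cyc}(G, F) \quad (5)$$
where λ determines the importance of cycle consistency loss. Based on experiments, λ is set to 11 here.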
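As a small worked example of how λ weights the terms, the sketch below combines two adversarial losses and one cycle consistency loss. The three input values are hypothetical and chosen only to make the arithmetic easy to follow; only λ = 11 comes from the text.

```python
def total_objective(gan_g, gan_f, cycle, lam=11.0):
    """Sum of both adversarial losses plus the weighted cycle loss."""
    return gan_g + gan_f + lam * cycle

# Hypothetical loss values for one training step.
objective = total_objective(gan_g=-0.5, gan_f=-0.75, cycle=0.25)
print(objective)
```

With these values the cycle term contributes 11 × 0.25 = 2.75, so it dominates the two adversarial terms.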

Elimination of Checkerboard Artifacts
A close look at an image generated by the neural network reveals small artifacts [27] called "checkerboard artifacts". These artifacts make the generated image look poor in detail.
Figure 3a is the input photography, and Figure 3b is the output photography. Figure 3c,d shows the parts of Figure 3b where "checkerboard artifacts" are obvious.
In [28], the cause of artifacts is clarified. In brief, a neural network often uses the deconvolution operation to build images from low resolution to high resolution. During this process, uneven overlap is created, which leads to the appearance of artifacts, especially in the dark regions of an image.
To reduce artifacts in generated images, [29] came up with several solutions.
(1) Abandon the deconvolution operation. Instead, first use an up-sampling method to build the image at the desired size; then use the convolution operation to process the image. The candidate up-sampling methods are the nearest neighbor method and the bilinear method. The author of [29] recommended the nearest neighbor method.
(2) Adjust the kernel size in the model's generator so that it is divisible by the stride. In CycleGAN's generator, some layers have a kernel size of 3 with a stride of 2. Following the instruction of [29], the size of these kernels is changed to 4, so that it is divisible by 2.
After experiments, Solution (2) is adopted to reduce artifacts instead of Solution (1), which does not reduce artifacts obviously. More details can be seen in Section 4.
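The effect of the kernel size on the overlap can be illustrated in one dimension. The sketch below counts how many kernel positions contribute to each output pixel of a deconvolution without padding; the helper name and the input length of 4 are illustrative assumptions and not part of the original implementation.

```python
def overlap_counts(input_len, kernel, stride):
    """Count the kernel taps landing on each output position of a 1D
    deconvolution (transposed convolution) without padding."""
    out_len = (input_len - 1) * stride + kernel
    counts = [0] * out_len
    for i in range(input_len):
        start = i * stride              # each input pixel paints `kernel` outputs
        for k in range(kernel):
            counts[start + k] += 1
    return counts

k3 = overlap_counts(input_len=4, kernel=3, stride=2)
k4 = overlap_counts(input_len=4, kernel=4, stride=2)
print(k3)  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
print(k4)  # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

With a kernel size of 3 and a stride of 2, the interior counts alternate between 1 and 2, which is the uneven overlap behind the checkerboard pattern. With a kernel size of 4, every interior output position receives exactly two contributions.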
The whole process can be illustrated as follows.
[Algorithm listing: the steps are repeated until convergence.]

Dataset
Training data of winter oblique photography were acquired from Changsha, Hunan Province. Summer oblique photography was captured from Jingjiang, Jiangsu Province. It is noteworthy that all the photography should be at the same resolution; otherwise, the training would be hard.

Implementation Details
The generator used in CycleGAN was from [30]. As an improvement, the size of some kernels was changed, so that "checkerboard artifacts" could be reduced. The structure of the modified generator is defined in Table 1. This generator has 4 layers with a kernel size of 4 and 9 residual blocks [31] with instance normalization [32].
The structure of the residual block is illustrated in Figure 4.
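To see why the kernel size can be changed from 3 to 4 without altering the feature map sizes, the standard output-size formulas for convolution and deconvolution (with dilation 1) can be evaluated. The padding values and the 256-pixel input in this sketch are assumptions for illustration; the actual layer settings are those of Table 1.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_output_size(size, kernel, stride, padding, output_padding=0):
    """Spatial output size of a deconvolution layer (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# Down-sampling a hypothetical 256-pixel feature map with stride 2.
down_k3 = conv_output_size(256, kernel=3, stride=2, padding=1)
down_k4 = conv_output_size(256, kernel=4, stride=2, padding=1)

# Up-sampling back; kernel 3 needs output_padding=1 to double the size exactly.
up_k3 = deconv_output_size(128, kernel=3, stride=2, padding=1, output_padding=1)
up_k4 = deconv_output_size(128, kernel=4, stride=2, padding=1)

print(down_k3, down_k4, up_k3, up_k4)
```

Under these assumed paddings both kernel sizes halve and double the resolution identically, so the change only affects how the kernel footprints overlap.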
Our experiments were performed on an NVIDIA Titan XP GPU. The operating system was Windows 7, and PyTorch served as the deep learning framework. It took us 3 days to finish the training. Because of the paper's typesetting, the images of some figures may be compressed. For the original images, see https://github.com/carlblocking/results-of-my-first-sci-paper.

Results and Comparison
First and foremost, a comparison of 3D models with good and bad visual quality is shown, as mentioned in Section 1. Figure 5a is the result of the original photography, and Figure 5b is from transferred photography. These images were captured from the 3D model's look-down angle. They were generated using the software smart3D. It is obvious that Figure 5b has better visual quality than Figure 5a: the vegetation is greener, and the image is brighter.

Then, the solutions mentioned in Section 3 with the aim to reduce artifacts were tested. The results suggest that Solution (2) produces a satisfactory outcome, as shown in Figure 6. Figure 6a is the input photography. Figure 6b is Solution (1)'s output. [...] effect like an oil painting. Furthermore, artifacts are reduced in Figure 6c. Accordingly, Solution (2) served as an improvement to CycleGAN's generator here.
In CycleGAN's code [4], there are two different generators. One is from paper [30], which was applied in the realization of CycleGAN [4]. The other is U-net [33]. To compare the performance of different generators, experiments based on our modified generator and these two generators were performed. First, these generators were tested on winter oblique photography taken in Hengyang, Hunan Province. The result is shown in Figure 7.
In Figure 7, the inputs are the oblique photography of a building and a garden. In the result of the generator of [30], artifacts can be found at the edge of the buildings and in the garden's shadow. Artifacts were not obvious in the results of the U-net generator and our modified generator. The difference may not be obvious due to image compression. To better show the results, SSIM (Structural Similarity index) is introduced here [34].
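The SSIM statistic can be sketched as follows, assuming a single global window over grayscale pixel values in [0, 255] and the commonly used constants 0.01 and 0.03. The standard SSIM of [34] averages this statistic over local windows, so the function below is a simplification for illustration and not the evaluation code used in these experiments; the pixel lists are hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized lists of pixel values."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2     # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2     # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

reference = [10.0, 20.0, 30.0, 40.0]   # hypothetical input pixels
same = global_ssim(reference, reference)
reversed_structure = global_ssim(reference, reference[::-1])
print(same, reversed_structure)
```

An image compared with itself scores 1, and the score drops as the structure of the output departs from the input. A strong but intended color change therefore also lowers SSIM, not only artifacts.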

[Figure 7: input, generator from [30], U-net generator, our modified generator.]
In Figure 7, photography from different generators is compared with the input photography in terms of SSIM. The results are listed in Table 2. The results of the generator of [30] achieved the lowest SSIM, suggesting the existence of artifacts. The U-net generator and our modified generator achieved higher SSIM, suggesting fewer artifacts. Yet, the results also show that our modified generator achieved lower SSIM than the U-net generator. This is because the results from our modified generator were greener than those of the U-net generator, which made the generated photography differ more from the original input photography. To verify this, these generators were tested on another group of oblique photography. In Figure 8, the inputs are the oblique photography of the countryside in Qiqihar, Heilongjiang Province. It shows another type of oblique photography with poor visual quality: it was taken in bad weather, so the brightness was low. In general, these generators successfully improved the visual quality of these inputs. Yet, in the results of the generator from [30], artifacts remained. In the results of the U-net generator and our modified generator, artifacts were reduced. Yet, the U-net generator's result was less green in vegetation in comparison with the results from our modified generator. To better show this difference, we tested the U-net generator and our modified generator on another group of photographs.
[Figure 8: input, the generator of [30], the U-net generator and our modified generator.]
In Figure 9, the input oblique photography shows the mountains of Changsha, Hunan Province. The photography was taken in winter, and the vegetation was not green. After transformation, the vegetation turned green in the results of both generators. Yet, it is observed from Figure 9 that our modified model achieved better performance than U-net, as the vegetation was greener.
[Figure 9: input, U-net generator, our modified generator.]
To evaluate the performances of different generators, they were compared in terms of forward cycle loss and backward cycle loss.
Forward cycle loss was generally lower than backward cycle loss, as listed in Table 3. Our model achieved the lowest loss in both forward cycle loss and backward cycle loss. As defined in Section 3, forward cycle loss evaluates the performance of mapping F: Y→X, which translates summer oblique photography to winter oblique photography. Backward cycle loss evaluates the performance of mapping G: X→Y, which translates winter oblique photography to summer oblique photography.
One assumption can explain why forward cycle loss was lower than backward cycle loss: that it might be easier for mapping F: Y→X to degrade the visual quality of the photography. Conversely, it is difficult for mapping G: X→Y to recover photography from low visual quality to high visual quality. Thus, mapping G: X→Y is subject to higher loss than that of mapping F: Y→X. Besides, in backward cycle loss, the generator from [30] obtains the highest value. This could be attributed to the existence of artifacts. Furthermore, our modified generator's loss value was lower than that of the U-net generator. This was probably because our modified generator produced better results than U-net, as the vegetation was greener. Yet, none of these assumptions have theoretical proof. Hence, deeper research is required in the future.
Finally, an experiment of using transferred oblique photography to build the 3D model was performed.
First, the original input photography and its corresponding transferred result are shown in Figure 10. Figure 10a is the oblique photography taken in bad weather, and Figure 10b is the transferred result of this photograph. The original input photography was not bright, and its vegetation was not green. Common methods cannot turn the vegetation green when the infrared band is not available. Using our well-trained model, the photography can be improved significantly.
| Generator | Forward Cycle Loss | Backward Cycle Loss |
|---|---|---|
| Generator from [30] | 0.05 | 0.49 |
| U-net generator | 0.11 | 0.55 |
| Our modified generator | 0.015 | 0.32 |
Then, the transferred photography was employed to generate a 3D model. It is clear from Figure 11 that the generated 3D model had a good visual effect. Yet, in very rare cases, our model made the building slightly green in darkness, as shown in Figure 12. We are now investigating the causes of this problem and possible solutions to it.
In Figure 12, the wall becomes slightly green in darkness.

Conclusions and Discussion

Figure 1. A comparison of vegetation in winter oblique photography and summer oblique photography. (a) is winter oblique photography, and (b) is summer oblique photography.
... photography) {F(y)}. D_Y is responsible for distinguishing real summer photography ... forward cycle loss and backward cycle loss, respectively.

Process 1. CycleGAN training process. Preparation: training images of winter X and training images of summer Y; mapping G with parameters θ_G and mapping F with parameters θ_F; discriminator D_X ... to minimize L_GAN(G, D_Y, X, Y) and ... to minimize L_GAN(F, D_X, Y, X) and ...
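The adversarial terms L_GAN(G, D_Y, X, Y) and L_GAN(F, D_X, Y, X) named in the training process can be sketched as follows. This is a minimal illustration that assumes the log-form adversarial objective from the original GAN and CycleGAN formulations, E[log D(y)] + E[log(1 − D(G(x)))], which the generator tries to minimize and the discriminator tries to maximize. The function name `gan_loss` and its inputs are hypothetical; the loss actually used in this work is defined in Section 3 and may differ (for example, a least-squares variant).

```python
import numpy as np

def gan_loss(d_real, d_fake, eps=1e-12):
    """Log-form adversarial objective E[log D(y)] + E[log(1 - D(G(x)))].

    d_real: discriminator outputs in (0, 1) on real target-domain images.
    d_fake: discriminator outputs in (0, 1) on generated images.
    eps:    small constant that keeps the logarithms finite.
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# An undecided discriminator outputs 0.5 everywhere.
undecided = gan_loss(np.full(8, 0.5), np.full(8, 0.5))

# A confident discriminator scores real images near 1 and generated ones near 0,
# which raises the objective it is trying to maximize.
confident = gan_loss(np.full(8, 0.99), np.full(8, 0.01))

print(undecided, confident)
```

The same function covers both directions: for L_GAN(G, D_Y, X, Y), `d_real` holds D_Y's scores on real summer images and `d_fake` its scores on G(x); for L_GAN(F, D_X, Y, X), the roles of the two domains are swapped.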

Figure 3. An example of "checkerboard artifacts" in the generated photography: (a) input photography; (b) parts of artifacts in generated photography; (c,d) are enlarged parts of (b) where artifacts are obvious.

Figure 5. Comparison of the 3D model of bad and good visual quality. (a) is the 3D model generated by the original photography and (b) is generated by the transferred photography.

Figure 6. Results from different artifact reduction solutions in Section 3. (a) is the input photography; (b) is the result of Solution (1); (c) is the result of Solution (2).

Figure 6a is the input photography. Figure 6b is Solution (1)'s output. Figure 6c is Solution (2)'s output. It is obvious that Figure 6c has better visual quality than Figure 6b because Figure 6b has an effect like an oil painting. Furthermore, artifacts are reduced in Figure 6c. Accordingly, Solution (2) served as an improvement to CycleGAN's generator here.

Figure 7. Results of different generators. From left to right: input, results of the generator of [30], the U-net generator and our modified generator.

Figure 8. Results of different generators on another dataset. From left to right: input, results of the generator of [30], the U-net generator and our modified generator.

Figure 9. Results of different generators tested on another photograph set. From left to right: input, results of the U-net generator and our modified generator.

Figure 10. (a) is the input, and (b) is the output.

Figure 11. 3D model generated by transferred photography from Figure 10b.

Table 1. Structure of our generator.

Table 2. Comparison of SSIM from different generators.

Table 3. Comparison of forward cycle loss and backward cycle loss with different generators.