Sensors
  • Article
  • Open Access

20 October 2022

Colorful Image Colorization with Classification and Asymmetric Feature Fusion

1 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Image Processing and Pattern Recognition Based on Deep Learning

Abstract

An automatic colorization algorithm can convert a grayscale image to a colorful image using regression or classification loss functions. However, regression loss functions lead to brownish results, while classification loss functions lead to color overflow, and computing the color categories and balance weights of the ground truth required for the weighted classification loss is prohibitively expensive. In this paper, we propose a new method to compute the color categories and balance weights of color images. Furthermore, we propose a U-Net-based colorization network. First, we propose a category conversion module and a category balance module to obtain the color categories and balance weights, which dramatically reduces the training time. Second, we construct a classification subnetwork to constrain the colorization network with a category loss, which improves the colorization accuracy and saturation. Finally, we introduce an asymmetric feature fusion (AFF) module to fuse the multiscale features, which effectively prevents color overflow and improves the colorization effect. Experiments show that our colorization network achieves a peak signal-to-noise ratio (PSNR) of 25.8803 and a structural similarity index measure (SSIM) of 0.9368 on the ImageNet dataset. Compared with existing algorithms, our algorithm produces colorful images with vivid colors, no significant color overflow, and higher saturation.

1. Introduction

Colorization has played an important role in processing grayscale pictures such as medical pictures, night vision pictures, electron microscopic pictures, satellite remote sensing pictures, and old photos. However, colorization is a complex and diverse problem, since the same piece of clothing can be red, blue, brown, or other colors. Therefore, it currently remains a challenging subject.
Traditional colorization methods are mainly divided into two types: color expansion through adjacent pixels [1,2,3,4] and color transfer through reference images [5,6,7,8]. However, both methods require a large amount of manual interaction and rely heavily on the accuracy of the color marking or the selection of the reference images. In recent years, with the rapid development of deep learning, a large number of automatic colorization algorithms based on convolutional neural networks (CNNs) have been proposed. However, most colorization algorithms use regression loss functions (such as L1 and L2) [9,10,11,12,13,14,15,16,17,18,19,20,21]. These algorithms extract features from grayscale images and predict additional color channels to achieve colorization. The generated colorful images are relatively satisfactory, but the problem of brownish, unsaturated results has persisted, as shown in Figure 1. To generate vibrant and saturated colorful images, Zhang et al. [22] used a classification loss function for colorization. However, this algorithm triggers very serious color overflow, as shown in Figure 1. Moreover, the long training time of their network makes it difficult to train.
Figure 1. Problems with the current colorization networks. Using regression loss functions (such as Iizuka et al. [11]) results in a brownish, unsaturated result. Using classification loss functions (such as Zhang et al. [22]) results in color overflow.
In order to reduce the brownish, unsaturated appearance of generated images, suppress color overflow, and shorten the training time of classification-loss-based networks, we propose a new method to compute the color categories and balance weights of color images. Furthermore, we propose a colorization network based on U-Net [23]. First, we propose a category conversion module and a category balance module to obtain the color categories and balance weights. These two modules replace the original point-by-point calculation with matrix indexing, which significantly reduces the training time. Second, in order to obtain richer global features for the colorization network, we construct a classification subnetwork which classifies grayscale images according to the 1000 image categories of the ImageNet dataset. The classification subnetwork constrains the colorization network with a category loss to improve the colorization accuracy and saturation. Finally, inspired by Cho et al. [24], we introduce an AFF module to fuse the multiscale features. Multiscale feature fusion enables the colorization network to grasp both global and local features, which effectively prevents color overflow and improves the colorization effect. As a result, our colorization algorithm produces vibrant images with no visible color overflow. The contributions of this work are as follows:
  • A category conversion module and a category balance module are proposed to significantly reduce the training time.
  • A classification subnetwork is proposed to improve colorization accuracy and saturation.
  • An AFF module is introduced to prevent color overflow and to improve the colorization effect.

3. Method

3.1. Overview

Given a grayscale image $x_l \in \mathbb{R}^{1 \times h \times w}$ as input, the purpose of colorization is to predict the remaining a and b channels $x_{ab} \in \mathbb{R}^{2 \times h \times w}$ of the Lab color space and turn the single-channel $x_l$ into a three-channel color image $x_{lab} \in \mathbb{R}^{3 \times h \times w}$; l represents the brightness of the Lab color space, while a and b range from red to green and from yellow to blue, respectively. In this work, we design an end-to-end colorization network based on U-Net. As shown in Figure 2, our colorization network consists of three parts: an encoder, a classification subnetwork, and a decoder. The network outputs a picture category probability distribution and a color category probability distribution. The color category probability distribution is converted into $x_{ab}$ by the color recovery (CRC) module, and $x_{ab}$ is concatenated with $x_l$ to obtain the colorful image $x_{lab}$.
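For readers unfamiliar with this decomposition, the following minimal sketch (our own illustration, not part of the authors' pipeline) shows how an RGB image is split into the $x_l$ and $x_{ab}$ components with scikit-image; the sample image from skimage.data is only a placeholder.

```python
import numpy as np
from skimage import color, data

# Load an example RGB image and convert it to the Lab color space.
rgb = data.astronaut() / 255.0          # H x W x 3, values in [0, 1]
lab = color.rgb2lab(rgb)                # L in [0, 100], a/b roughly in [-128, 127]

x_l = lab[..., :1]                      # lightness channel: the network input
x_ab = lab[..., 1:]                     # a and b channels: the prediction target

# A colorization model predicts x_ab from x_l; recombining the channels and
# converting back to RGB yields the final colorful image.
lab_pred = np.concatenate([x_l, x_ab], axis=-1)   # here we reuse the true ab
rgb_pred = color.lab2rgb(lab_pred)
print(x_l.shape, x_ab.shape, rgb_pred.shape)
```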
Figure 2. Network structure. Our colorization network consists of an encoder (left), a classification subnetwork (bottom right), a decoder (right), three AFF modules and a CRC module.
As shown in Figure 2, the encoder consists of six layers of convolutional blocks. When the input $M_{in} \in \mathbb{R}^{n \times c \times h \times w}$ passes through a convolution block, the obtained detailed features $M_{out} \in \mathbb{R}^{n \times 2c \times h/2 \times w/2}$ are saved and passed to the next convolution block. After feature extraction by the six convolutional blocks, the encoder generates global features $x_g \in \mathbb{R}^{2048 \times h/32 \times w/32}$ of the input grayscale image $x_l \in \mathbb{R}^{1 \times h \times w}$. The classification subnetwork consists of a convolution module and an average pooling layer; it resolves the global features $x_g \in \mathbb{R}^{2048 \times h/32 \times w/32}$ generated by the encoder into the picture category probability distribution $\hat{Y} \in \mathbb{R}^{n \times 1000 \times 1 \times 1}$. The decoder consists of three layers of convolutional blocks. Before the input $M_{in} \in \mathbb{R}^{n \times c \times h \times w}$ passes through a convolutional block, it is concatenated with the same-size features output by the AFF module. The decoder resolves the global features $x_g \in \mathbb{R}^{2048 \times h/32 \times w/32}$ generated by the encoder into the color category probability distribution $\hat{Z} \in \mathbb{R}^{n \times 313 \times h/4 \times w/4}$ of the grayscale image $x_l$.

3.2. Calculating Color Categories and Balance Weights

In order to reduce the computation of color categories and balance weights, we propose a category conversion module and a category balance module. These two modules obtain the color categories and balance the weights of real colorful images for training.

3.2.1. Category Conversion Module

As shown in Figure 3, given a pixel (blue dot) with a and b values $(3, -3)$, Zhang et al. [22] calculated the Euclidean distances $d$ between the blue dot and the 32 color categories (red and yellow dots) nearest to it. Next, they obtained the probability distribution over the color categories by Gaussian weighting using Equation (1). Finally, they selected the color category with the highest probability, 120, using Equation (2). Equation (1) decreases monotonically with $d$, so the color category of the pixel $(a, b)$ is the color category $q$ corresponding to the center point $(a_0, b_0)$ of the small square in which the pixel is located.
Figure 3. Part of the color category distribution. The values in the lower half of the small squares are their corresponding color categories $q$. For the pixel (blue dot) $(3, -3)$, the same color category 120 is obtained by both the method of Zhang et al. and our method.
Therefore, in order to obtain the color category of a pixel $(a, b)$, we calculated the value $(a_0, b_0)$ of the center point of the 10 × 10 square in which the pixel $(a, b)$ was located. Next, we converted $(a_0, b_0)$ to the corresponding color category $q$. As shown in Figure 3, given the pixel $(3, -3)$, we calculated the value $(0, 0)$ of the center point (red dot) of the small square in which this pixel was located and determined the color category of $(3, -3)$ to be 120, i.e., the color category corresponding to $(0, 0)$.
To calculate the color categories $Z \in \mathbb{R}^{n \times h \times w}$ corresponding to the ground truth a and b channels $x_{ab} \in \mathbb{R}^{n \times 2 \times h \times w}$, we used the above method to construct a color category matrix $M$ and index the color categories through $Z = M[x_{ab}]$, where $n$ is the batch size for one training step and $h$ and $w$ are the pixel locations. The color category matrix $M \in \mathbb{R}^{420}$ is formulated as follows:

$M[22 \times (a_0/10 + 9) + (b_0/10 + 11)] = q_{(a_0, b_0)}$

$M[k] = 1, \quad \text{where } k \text{ is an element without an assigned index value}$

where $[\cdot]$ denotes the rounding-down (integer) operation and $q_{(a_0, b_0)}$ is the color category $q$ corresponding to $(a_0, b_0)$.

The category conversion module calculates the $a_0$ and $b_0$ values of the a and b channels $x_{ab} \in \mathbb{R}^{n \times 2 \times h \times w}$ of real colorful pictures and indexes the corresponding color categories $Z \in \mathbb{R}^{n \times h \times w}$ through the color category matrix. The color categories $Z \in \mathbb{R}^{n \times h \times w}$ are formulated as follows:

$x_{ab} = [x_{ab}/10 + 0.5]$

$Z = M[(x_{ab}[:, 0, :, :] + 9) \times 22 + (x_{ab}[:, 1, :, :] + 11)]$
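As an illustration of how this table lookup replaces the point-by-point distance computation, here is a minimal PyTorch sketch (our own, not the authors' code; `M` is assumed to be the precomputed color category matrix stored as a 1-D tensor, and the function name is hypothetical):

```python
import torch

def ab_to_category(x_ab: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    """Map ground-truth ab channels to color categories by matrix indexing.

    x_ab: (n, 2, h, w) tensor of a/b values.
    M:    1-D lookup table; M[22*(a0/10+9) + (b0/10+11)] holds the category q.
    """
    # Quantize each ab value to the center of its 10 x 10 square.
    x_q = torch.floor(x_ab / 10.0 + 0.5).long()           # (n, 2, h, w)
    # Build the flat index used by the color category matrix M.
    idx = (x_q[:, 0] + 9) * 22 + (x_q[:, 1] + 11)          # (n, h, w)
    # Index the lookup table instead of computing per-pixel distances.
    return M[idx]                                           # (n, h, w) categories
```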

3.2.2. Category Balance Module

In real colorful pictures, backgrounds such as sky, grass, ocean, and walls occupy a large number of pixels, so most pixels belong to color categories with small values of a and b. To encourage diversity in colorization, we construct the balance weight matrix $\omega$, which is formulated as follows:

$w = \left[(1 - \lambda)\,\tilde{p} + \lambda / Q\right]^{-1}$

$\omega = w \cdot \left[\textstyle\sum_q \tilde{p}_q w_q\right]^{-1}$

where $Q$ is the number of color categories used (313 here), and $\lambda$ is the weight that mixes the uniform distribution over the color categories with the empirical color category distribution $\tilde{p}$ of the 1.28 million images in the ImageNet training set; $\lambda$ was set to 0.5. The category balance module obtains the corresponding balance weight $\omega(Z_{h,w})$ from the color category $Z_{h,w}$. Finally, the category conversion module and the category balance module are jointly formulated as follows:

$Z, \omega(Z_{h,w}) = H(x_{ab})$
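The balance weights can be sketched in the same spirit (our illustration; `p_tilde` is assumed to be the empirical 313-bin color category distribution estimated from the training set):

```python
import torch

def balance_weights(p_tilde: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """Compute per-category balance weights from the empirical distribution.

    p_tilde: (Q,) empirical probability of each color category (sums to 1).
    """
    Q = p_tilde.numel()                                # Q = 313 categories
    w = 1.0 / ((1.0 - lam) * p_tilde + lam / Q)        # mix with uniform, invert
    w = w / (p_tilde * w).sum()                        # normalize so E_p[w] = 1
    return w

# Per-pixel weights are then obtained by indexing with the category map Z:
# omega = w[Z]   # Z: (n, h, w) integer categories -> omega: (n, h, w)
```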

3.3. Residual Block

In order to solve the training difficulties brought about by the deeper layers of the colorization network, we construct a residual block based on the idea of ResNet [25]. As shown in Figure 4, our residual block consists of one 1 × 1 convolution kernel on the upper path and two 3 × 3 convolution kernels on the lower path. The upper convolution only matches the number of input channels to the number of output channels, while the lower convolutions transform the number of channels and extract features. Summing the upper and lower features optimizes the forward path of the colorization network and makes the network easier to train. Therefore, our residual block can effectively alleviate the network degradation brought about by deeper layers.
Figure 4. The structure of the residual block in the green part of Figure 2. Our residual block consists of one 1 × 1 convolution kernel and two 3 × 3 convolution kernels.
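A minimal PyTorch sketch of a residual block with this shape (our interpretation of Figure 4; the use of batch normalization and ReLU is an assumption, since the figure only specifies the convolution kernels):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """1x1 shortcut on the upper path, two 3x3 convolutions on the lower path."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Upper path: only adjusts the channel count.
        self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        # Lower path: adjusts channels and extracts features.
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Summing the two paths eases the optimization of deep networks.
        return self.relu(self.body(x) + self.shortcut(x))
```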

3.4. Asymmetric Feature Fusion Module

In most U-Net-based algorithms, the decoder only concatenates encoder features of the same scale. However, because of the top-down downsampling structure of the encoder, information flows only from the higher-scale features to the lower-scale features, so the higher-scale features concatenated by the decoder are never influenced by the lower-scale features, which degrades the colorization effect.
Inspired by multi-input multioutput U-Net (MIMO-UNet) [24] and dense connections between intra-scale features [26], we introduce the AFF module, as shown in Figure 5.
Figure 5. Asymmetric feature fusion module structure. The AFF module consists of resized modules, a 1 × 1 convolution kernel, and a 3 × 3 convolution kernel.
The AFF module concatenates the features of all scales of the encoder $En_1$–$En_5$, outputs multiscale fused features through its convolution kernels, and finally concatenates the features of the corresponding scale with the decoder. The three AFF modules ($AFF_1$, $AFF_2$, $AFF_3$) are formulated as follows:

$AFF_1^{out} = AFF_1\left(Subs_4(En_1), Subs_2(En_2), En_3, Ups_2(En_4), Ups_4(En_5)\right)$

$AFF_2^{out} = AFF_2\left(Subs_8(En_1), Subs_4(En_2), Subs_2(En_3), En_4, Ups_2(En_5)\right)$

$AFF_3^{out} = AFF_3\left(Subs_{16}(En_1), Subs_8(En_2), Subs_4(En_3), Subs_2(En_4), En_5\right)$

where $AFF_n^{out}$ denotes the output of the $n$th AFF module, $En_n$ denotes the output of the $n$th convolutional block of the encoder, $Subs_k$ denotes downsampling by a factor of $k$, and $Ups_k$ denotes upsampling by a factor of $k$.
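One possible PyTorch sketch of an AFF module following Figure 5 and the formulas above (our illustration; the bilinear resizing and the channel counts in the usage comment are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AFF(nn.Module):
    """Fuse encoder features of all scales at one target resolution."""

    def __init__(self, in_channels: list[int], out_ch: int):
        super().__init__()
        fused = sum(in_channels)
        self.conv = nn.Sequential(
            nn.Conv2d(fused, out_ch, kernel_size=1),              # mix channels
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),  # refine
        )

    def forward(self, feats: list[torch.Tensor], target_hw: tuple[int, int]) -> torch.Tensor:
        # Resize every encoder feature map (Subs_k / Ups_k) to the target scale.
        resized = [F.interpolate(f, size=target_hw, mode="bilinear", align_corners=False)
                   for f in feats]
        return self.conv(torch.cat(resized, dim=1))

# Example (channel counts are illustrative): AFF_1 fuses En_1..En_5 at En_3's scale.
# aff1_out = AFF([64, 128, 256, 512, 2048], 256)(encoder_feats, en3.shape[-2:])
```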

3.5. Color Recovery Module

We construct an inverse color category matrix $M^{-1}$ that indexes the values of a and b through $x_0 = M^{-1}[q]$, where $q$ is the color category of a pixel and $M^{-1}$ is the inverse of the color category matrix $M$. The index of $M^{-1}$ is the color category $q$, and its value is the $(a_0, b_0)$ corresponding to $q$.

The color recovery module divides the color category distribution $\hat{Z} \in \mathbb{R}^{313 \times h/4 \times w/4}$ by the annealing parameter and selects the color category with the highest probability. Next, we use $M^{-1}$ to index the $(a, b)$ values $x_0 \in \mathbb{R}^{2 \times h/4 \times w/4}$. Finally, we upsample $x_0$ by a factor of 4 to obtain $x_{ab} \in \mathbb{R}^{2 \times h \times w}$. The color recovery module is formulated as follows:

$q^* = \arg\max_q \left(\hat{Z}_{h,w,q} / T\right)$

$x_0 = M^{-1}[q^*]$

$x_{ab} = Ups_4(x_0)$

where $T$ is the annealing parameter, taken as 0.38 here, and $Ups_k$ denotes upsampling by a factor of $k$.
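A sketch of the color recovery step in PyTorch (our illustration; `M_inv` is assumed to be stored as a (313, 2) tensor mapping each category to its $(a_0, b_0)$ center):

```python
import torch
import torch.nn.functional as F

def color_recovery(Z_hat: torch.Tensor, M_inv: torch.Tensor, T: float = 0.38) -> torch.Tensor:
    """Turn the predicted color-category distribution into ab channels.

    Z_hat: (n, 313, h/4, w/4) color-category probability distribution.
    M_inv: (313, 2) table mapping each category q to its (a0, b0) center.
    """
    # Divide by the annealing parameter T and pick the most probable category.
    q_star = torch.argmax(Z_hat / T, dim=1)            # (n, h/4, w/4)
    # Index the inverse matrix to recover ab values, then move channels first.
    x0 = M_inv[q_star].permute(0, 3, 1, 2)             # (n, 2, h/4, w/4)
    # Upsample by a factor of 4 to match the input resolution.
    return F.interpolate(x0, scale_factor=4, mode="bilinear", align_corners=False)
```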

3.6. Colorization with Classification

Although the classification loss function can generate vibrant colors, colorization inaccuracy caused by not capturing the global context of the input grayscale image is always present. To solve this problem, we construct a classification subnetwork and facilitate optimization by jointly training with a picture category loss and a color category loss. The classification subnetwork resolves the global features $x_g \in \mathbb{R}^{2048 \times h/32 \times w/32}$ acquired by the encoder into the picture category probability distribution $\hat{Y} \in \mathbb{R}^{n \times 1000 \times 1 \times 1}$ of the grayscale image. We use the 1000 category labels $m \in [0, 999]$ delineated by the ImageNet dataset, which cover a wide range of natural and man-made objects. The classification subnetwork makes the global features output by the encoder more comprehensive through the picture category loss function, thus enabling the decoder to resolve more accurate color categories. The classification subnetwork uses the cross-entropy loss function, which is formulated as follows:

$L_{cls}(Y, \hat{Y}) = -\sum_{h,w} \sum_m Y_{h,w,m} \log \hat{Y}_{h,w,m}$

where $Y_{h,w,m} \in \mathbb{R}^{n \times 1 \times 1}$ is the category label of the real image. The decoder outputs the color category probability distribution $\hat{Z} \in \mathbb{R}^{n \times 313 \times h/4 \times w/4}$ of the grayscale image. The colorization network uses the weighted cross-entropy loss function, which is formulated as follows:

$L_{col}(Z, \hat{Z}) = -\sum_{h,w} \omega(Z_{h,w}) \sum_q Z_{h,w,q} \log \hat{Z}_{h,w,q}$

where $Z$ and $\omega(Z_{h,w})$ are the color categories and balance weights of the real image, which are obtained by the category conversion module and the category balance module. The total loss function is formulated as follows:

$L = \lambda_{col} L_{col} + \lambda_{cls} L_{cls}$

where $\lambda_{col}$ and $\lambda_{cls}$ are hyperparameters weighting the color category loss and the picture category loss, respectively.
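A compact sketch of this joint objective in PyTorch (our illustration; it uses the built-in cross-entropy on logits with per-pixel weights, which may differ in detail from the authors' implementation):

```python
import torch
import torch.nn.functional as F

def total_loss(Y_hat: torch.Tensor, y: torch.Tensor,
               Z_hat: torch.Tensor, Z: torch.Tensor, omega: torch.Tensor,
               lam_col: float = 1.0, lam_cls: float = 0.003) -> torch.Tensor:
    """Joint picture-category and color-category loss.

    Y_hat: (n, 1000) picture-category logits;  y: (n,) ImageNet labels.
    Z_hat: (n, 313, h/4, w/4) color-category logits; Z: (n, h/4, w/4) categories.
    omega: (n, h/4, w/4) per-pixel balance weights omega(Z_{h,w}).
    """
    # Picture category loss from the classification subnetwork.
    L_cls = F.cross_entropy(Y_hat, y)
    # Per-pixel color category loss, rebalanced by omega and averaged.
    L_col = (F.cross_entropy(Z_hat, Z, reduction="none") * omega).mean()
    return lam_col * L_col + lam_cls * L_cls
```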

4. Experiments

4.1. Experimental Details

To verify the effectiveness of our proposed colorization algorithm, we built the colorization network in the PyTorch framework and trained it on two NVIDIA GeForce RTX 3090 graphics cards. In this experiment, approximately 1.28 million images covering 1000 image categories from the ImageNet training set were used to train the colorization network, and the 50,000 images of the ImageNet validation set were used to test the colorization effect.
We initialized our colorization network with the Xavier normal function and trained the colorization network with the SGD optimizer. The initial learning rate, momentum parameter, and weight decay were set to $10^{-3}$, 0.9, and $10^{-4}$, respectively. The learning rate decays gradually with training, and $\lambda_{col}$ and $\lambda_{cls}$ are set to 1 and 0.003, respectively. The batch size is set to 64 and the input image size is fixed to 224 × 224. Our colorization network is trained for 10 epochs and the training time for each epoch is approximately 16 h. The learning rate change is formulated as follows:
$lr_{Iter} = lr \cdot \alpha_2 + lr \cdot \alpha_1 / 100$

$\alpha_1 = EpochIter / (EpochNum \times EpochLength)$

$\alpha_2 = (1 - \alpha_1)^{lrPow}$

where $EpochNum$ is the total number of training epochs; $EpochLength$ is the number of iterations in one epoch; $EpochIter$ is the current iteration; $lrPow$ is the exponential parameter, set to 0.9 here; $lr_{Iter}$ is the current learning rate; and $lr$ is the initial learning rate.
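A sketch of this schedule in plain Python (our reading of the formulas above, under the assumption that $\alpha_1$ is the fraction of training completed; not the authors' code):

```python
def poly_lr(iteration: int, total_iters: int, base_lr: float = 1e-3,
            lr_pow: float = 0.9) -> float:
    """Polynomial decay from base_lr toward base_lr / 100 over training."""
    alpha1 = iteration / total_iters          # fraction of training completed
    alpha2 = (1.0 - alpha1) ** lr_pow         # polynomial decay factor
    return base_lr * alpha2 + base_lr * alpha1 / 100.0

# Usage: at each step, set the optimizer's learning rate to
# poly_lr(current_iteration, total_iterations).
```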

4.2. Calculating Time Experiments

To verify the accuracy of the method for calculating the color categories and balance weights of color images proposed in this paper, we randomly selected 200 images from each of the 1000 image categories of the ImageNet training set (1,281,167 images in total) and calculated the color categories and corresponding balance weights of the 200,000 images using both Zhang et al.'s method [22] and our method. For the approximately 43.9 billion pixels in the 200,000 images, the color categories and corresponding balance weights calculated by the two methods are exactly the same. However, as shown in Table 1, the method of Zhang et al. takes approximately 3 days of computation on our computer, while our method takes less than 2 h.
Table 1. Calculation time for color categories and balance weights. The values show that our method is faster to compute.
The batch size of our colorization network is 64, and therefore, training a batch requires computing the color categories and corresponding balance weights for 64 images with a resolution of 224 × 224. As shown in Table 1, computing the color categories and balance weights for approximately 3.2 million pixels on our computer takes about 18.86 s for Zhang et al.’s method, while our method takes only approximately 0.4 s.

4.3. Quantitative Analysis

In order to quantitatively evaluate the colorization effect of our colorization network, we use the SSIM and the PSNR as the evaluation indexes for quantitative analysis.
The SSIM evaluates the similarity between a color picture generated by the colorization network and a real picture in terms of brightness, contrast, and structure. The SSIM can sensitively perceive the local structural differences between the two pictures. The SSIM takes values from 0 to 1, and a larger SSIM value means that the two images are more similar. SSIM is formulated as follows:
$l(x, y) = \dfrac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$

$c(x, y) = \dfrac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$

$s(x, y) = \dfrac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$

$SSIM(x, y) = l(x, y)^{\alpha} \cdot c(x, y)^{\beta} \cdot s(x, y)^{\gamma}$

where $\mu_x$ and $\mu_y$ denote the means of images x and y, respectively; $\sigma_x$ and $\sigma_y$ denote the standard deviations of images x and y, respectively; $\sigma_{xy}$ denotes the covariance of images x and y; $C_1$, $C_2$, $C_3$ are constants; and $\alpha$, $\beta$, $\gamma$ weight the importance of the three terms.
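The single-window form of these formulas translates into NumPy as follows (our illustration; the constant values follow the common SSIM convention $C_1 = (0.01L)^2$, $C_2 = (0.03L)^2$, $C_3 = C_2/2$, which the paper does not specify, and library implementations such as skimage.metrics.structural_similarity use local windows and may give different values):

```python
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray, data_range: float = 255.0,
                alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Global SSIM between two grayscale images, following the formulas above."""
    C1 = (0.01 * data_range) ** 2
    C2 = (0.03 * data_range) ** 2
    C3 = C2 / 2.0
    mu_x, mu_y = x.mean(), y.mean()
    sig_x, sig_y = x.std(), y.std()
    sig_xy = ((x - mu_x) * (y - mu_y)).mean()
    l = (2 * mu_x * mu_y + C1) / (mu_x**2 + mu_y**2 + C1)   # luminance term
    c = (2 * sig_x * sig_y + C2) / (sig_x**2 + sig_y**2 + C2)  # contrast term
    s = (sig_xy + C3) / (sig_x * sig_y + C3)                 # structure term
    return (l**alpha) * (c**beta) * (s**gamma)
```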
The PSNR is an objective measure of image quality evaluation before and after image compression. The larger the value of PSNR, the less distorted the image. The PSNR of a real image x with resolution m × n and a generated image y is calculated as follows:
$MSE = \dfrac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[x(i, j) - y(i, j)\right]^2$

$PSNR = 10 \cdot \log_{10}\left(MAX_x^2 / MSE\right)$

where $MAX_x$ indicates the maximum possible pixel value of the image.
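These two formulas translate directly into NumPy (our illustration):

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a reference image x and an image y."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")     # identical images
    return 10.0 * np.log10(max_val**2 / mse)
```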
We tested our algorithm on 50,000 images from the ImageNet validation set against the algorithms of Larsson et al. [10], Iizuka et al. [11], Zhang et al. [22], Deoldify [18], and Su et al. [19]. Table 2 shows the comparison of our experimental results with the SSIM and the PSNR of the above algorithms. It can be clearly seen that our colorization network has higher SSIM and PSNR values, which means the colorization effect of our network is better.
Table 2. Quantitative analysis of colorization effect. As compared with the PSNR and SSIM values of other colorization algorithms, the colorization effect of our network is better.

4.4. Qualitative Analysis

In order to verify the effectiveness of our colorization algorithm, in this paper, we compare our colorization algorithm with those of Larsson et al. [10], Iizuka et al. [11], Zhang et al. [22], Deoldify [18], and Su et al. [19]. We use 50,000 images from the ImageNet validation set for testing and adjust the resolution of the generated images to 256 × 256. The experimental results are shown in Figure 6, where our algorithm generates more vivid and more saturated colorful images.
Figure 6. Visualization comparison of our colorization algorithm and other colorization algorithms. Our colorization network generates more vivid and saturated colorful images.
As shown in Figure 6, our algorithm generates more vivid and saturated color images compared with Larsson et al., Iizuka et al., Deoldify, and Su et al. Regarding the color of the small tomatoes in the first column of images, compared with our bright red, the other algorithms generate less saturated colors, showing a dark red or an unnatural pink. In contrast to our vivid, saturated purple flower, the other algorithms generate dull colors, rendering gray and mauve. In addition, compared with Zhang et al., our algorithm effectively prevents color overflow and oversaturation. Regarding the hand in the fourth column, the fingertips produced by Zhang et al.'s algorithm show a very obvious green color overflow and the mushroom is oversaturated with red, while our algorithm generates more natural and vivid colors for the hand and mushroom. Furthermore, our generated images successfully maintain the integrity and coherence of the color of the same object. Regarding the color of the leaves in the third column, our algorithm effectively guarantees a bright green, while the algorithms of Zhang et al. and Su et al. show an unnatural red.

4.5. Ablation Experiments

We designed ablation experiments to demonstrate that adding the classification subnetwork and the AFF module to the colorization network can effectively improve the colorization effect. We used the U-Net with the classification subnetwork and AFF module removed as the baseline network and tested it on the 50,000 images of the ImageNet validation set. From Table 3, we can see that the PSNR and SSIM values are higher after adding the classification subnetwork and the AFF module, which indicates that the classification subnetwork and the AFF module significantly improve the colorization effect of the network.
Table 3. Ablation experiments. The PSNR and SSIM values show that the classification subnetwork and the AFF module play a positive role in the colorization effect of the network.
In total, we performed three sets of ablation experiments: U-Net plus the classification subnetwork, U-Net plus the AFF module, and our full colorization network. As can be seen in Table 3 as well as Figure 7, the classification subnetwork and the AFF module play a positive role in colorization.
Figure 7. Ablation experiments. The classification subnetwork can help the colorization network to color more accurately. The AFF module can improve the color overflow phenomenon and can enhance the colorization effect.
As shown in Figure 7, the colorful images generated by the U-Net have the problems of color overflow and low saturation. As for the cabbage in the first row, the leaves in the U-Net-generated picture are gray-green, which is not bright enough, and the color distribution is not uniform. After adding the classification subnetwork, the leaves are a more vivid tender green, which indicates that the classification subnetwork helps the colorization network to color more accurately, but an obvious color overflow appears in the lower middle. After adding the AFF module, there is no obvious color overflow and the leaves are a bright tender green, indicating that the AFF module mitigates color overflow and enhances the colorization effect. The U-Net plus the AFF module also mitigates color overflow, but the color of the vegetable leaves is light. In the second row of images, the hand and mushroom generated by the U-Net are light in color and the tip of the thumb shows color overflow. After adding the classification subnetwork, the colors of the hand and mushroom are more vivid, but the tip of the thumb still has a green color overflow. After adding the AFF module, there is no obvious color overflow, and the hand and mushroom are a natural skin color and bright red, respectively. It can be seen that the classification subnetwork and the AFF module significantly improve the colorization effect.

4.6. User Study

To better evaluate the colorization effect of our algorithm, we conducted a user study comparing the results of the base U-Net, the results of our colorization network, and the ground truth validation images. The study was completed by 20 participants with normal or corrected-to-normal vision and no color blindness. We randomly selected 100 images of different categories from the test set; with the three versions of each image, this gave a total of 300 images. All images were displayed at a resolution of 256 × 256 pixels. Each participant was shown the 300 pictures and asked to answer the question "Does this picture look natural?" for each picture within 1 s. Figure 8 and Table 4 show the results of the experiment. The U-Net performed poorly, with only 72.9% of its images judged to be natural. For our colorization network, 92.9% of the images were judged natural, which is very close to the ground truth's 95.8%. This is a good indication that our algorithm can generate more natural and saturated colors.
Figure 8. Boxplots of the naturalness of the images evaluated by different users. The 92.9% of our colorization network is closer to the 95.8% of ground truth than the 72.9% of the base U-Net. This indicates that our algorithm generates more natural color pictures.
Table 4. Naturalness of user study. The values show that our network generates more vivid color pictures as compared with the base U-Net.

4.7. Limitation

Although our algorithm achieves better colorization results, it does not determine the color category of every pixel of the input image. As shown in Figure 2, our network outputs color categories at a resolution of 56 × 56 instead of the 224 × 224 resolution of the input image, after which we obtain a color image of the input resolution by upsampling by a factor of 4. In order to obtain more accurate color categories and colorization effects, we adjusted the resolution of the output color categories to the 224 × 224 resolution of the input image and trained using the same dataset and training method.
The generated color images are shown in Figure 9. The pixel-level network generates color images in which a single color (blue or green) fills the whole image and uneven blocks of color appear. There are probably two reasons for this. First, our classification of the color categories is not accurate enough. Second, when the network becomes a pixel-level network, it does not effectively capture the local features of the input image. In the future, we may solve this problem by dividing finer color categories or by using generative adversarial networks.
Figure 9. Colorization effect of the pixel-level network. A single color (blue or green) fills the whole picture, and uneven blocks of color appear in the last picture.

5. Conclusions

In this paper, we propose a new method to compute the color categories and balance weights of color images, as well as a U-Net-based colorization network incorporating a classification subnetwork and an AFF module. The category conversion module and the category balance module significantly reduce the training time. The classification subnetwork significantly improves the colorization accuracy and saturation. The AFF module effectively prevents color overflow and improves the colorization effect. Quantitative experiments show that our colorization network achieves higher PSNR and SSIM values of 25.8803 and 0.9368, respectively. Qualitative experiments show that the colorization effect of our network is better than that of existing algorithms. In addition, our improved method of calculating color categories and balance weights may also encourage more researchers to use color categories for colorization.

Author Contributions

Conceptualization, Z.W., Y.Y., D.L., Y.W. and M.L.; methodology, Z.W., D.L. and Y.W.; software, Z.W. and M.L.; validation, Z.W., Y.Y. and Y.W.; formal analysis, Z.W., Y.Y. and D.L.; investigation, Z.W., Y.Y. and Y.W.; resources, Z.W., Y.Y. and D.L.; data curation, Z.W., Y.Y. and D.L.; writing—original draft preparation, Z.W., Y.Y., D.L. and Y.W.; writing—review and editing, Z.W., Y.Y. and D.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Levin, A.; Lischinski, D.; Weiss, Y. Colorization using optimization. ACM Trans. Graph. 2004, 23, 689–694. [Google Scholar] [CrossRef]
  2. Yatziv, L.; Sapiro, G. Fast image and video colorization using chrominance blending. IEEE Trans. Image Process. 2006, 15, 1120–1129. [Google Scholar] [CrossRef]
  3. Qu, Y.; Wong, T.-T.; Heng, P.-A. Manga Colorization. ACM Trans. Graph. 2006, 25, 1214–1220. [Google Scholar] [CrossRef]
  4. Luan, Q.; Wen, F.; Cohen-Or, D.; Liang, L.; Xu, Y.-Q.; Shum, H.-Y. Natural Image Colorization. In Proceedings of the 18th Eurographics Conference on Rendering Techniques, Grenoble, France, 25 June 2007; pp. 309–320. [Google Scholar]
  5. Welsh, T.; Ashikhmin, M.; Mueller, K. Transferring color to greyscale images. ACM Trans. Graph. 2002, 21, 277–280. [Google Scholar] [CrossRef]
  6. Irony, R.; Cohen-Or, D.; Lischinski, D. Colorization by Example. In Proceedings of the Eurographics Symposium on Rendering (2005), Konstanz, Germany, 29 June–1 July 2005; Bala, K., Dutre, P., Eds.; The Eurographics Association: Vienna, Austria, 2005. [Google Scholar]
  7. Liu, X.; Wan, L.; Qu, Y.; Wong, T.-T.; Lin, S.; Leung, C.-S.; Heng, P.-A. Intrinsic Colorization. In Proceedings of the ACM SIGGRAPH Asia 2008, New York, NY, USA, 1 December 2008. [Google Scholar]
  8. Wang, X.; Jia, J.; Liao, H.; Cai, L. Affective Image Colorization. J. Comput. Sci. Technol. 2012, 27, 1119–1128. [Google Scholar] [CrossRef]
  9. Cheng, Z.; Yang, Q.; Sheng, B. Deep Colorization. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 415–423. [Google Scholar]
  10. Larsson, G.; Maire, M.; Shakhnarovich, G. Learning Representations for Automatic Colorization. In Lecture Notes in Computer Science; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Cham, Switzerland, 2016; Volume 9908. [Google Scholar] [CrossRef]
  11. Iizuka, S.; Simo-Serra, E.; Ishikawa, H. Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Trans. Graph. 2016, 35, 1–11. [Google Scholar] [CrossRef]
  12. Nazeri, K.; Ng, E.; Ebrahimi, M. Image Colorization Using Generative Adversarial Networks. In International Conference on Articulated Motion and Deformable Objects; Springer: Berlin/Heidelberg, Germany, 2018; Volume 10945, pp. 85–94. [Google Scholar] [CrossRef]
  13. Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar]
  14. Cao, Y.; Zhou, Z.; Zhang, W.; Yu, Y. Unsupervised Diverse Colorization via Generative Adversarial Networks. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2017; pp. 151–166. [Google Scholar] [CrossRef]
  15. Vitoria, P.; Raad, L.; Ballester, C. ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution. In Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; IEEE: New York, NY, USA, 2020; pp. 2434–2443. [Google Scholar]
  16. Zhang, R.; Zhu, J.-Y.; Isola, P.; Geng, X.; Lin, A.S.; Yu, T.; Efros, A.A. Real-time user-guided image colorization with learned deep priors. ACM Trans. Graph. 2017, 36, 1–11. [Google Scholar] [CrossRef]
  17. Zhao, J.; Han, J.; Shao, L.; Snoek, C.G.M. Pixelated Semantic Colorization. Int. J. Comput. Vis. 2019, 128, 818–834. [Google Scholar] [CrossRef]
  18. Antic, J. Jantic/Deoldify: A Deep Learning Based Project for Colorizing and Restoring Old Images (and Video!). Available online: https://github.com/jantic (accessed on 16 October 2019).
  19. Su, J.-W.; Chu, H.-K.; Huang, J.-B. Instance-Aware Image Colorization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 7965–7974. [Google Scholar] [CrossRef]
  20. Wu, Y.; Wang, X.; Li, Y.; Zhang, H.; Zhao, X.; Shan, Y. Towards Vivid and Diverse Image Colorization with Generative Color Prior. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 14377–14386. [Google Scholar] [CrossRef]
  21. Jin, X.; Li, Z.; Liu, K.; Zou, D.; Li, X.; Zhu, X.; Zhou, Z.; Sun, Q.; Liu, Q. Focusing on Persons: Colorizing Old Images Learning from Modern Historical Movies. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual Event, 20–24 October 2021; pp. 1176–1184. [Google Scholar]
  22. Zhang, R.; Isola, P.; Efros, A.A. Colorful Image Colorization. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 649–666. [Google Scholar]
  23. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar] [CrossRef]
  24. Cho, S.-J.; Ji, S.-W.; Hong, J.-P.; Jung, S.-W.; Ko, S.-J. Rethinking Coarse-to-Fine Approach in Single Image Deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4621–4630. [Google Scholar] [CrossRef]
  25. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  26. Kim, S.-W.; Kook, H.-K.; Sun, J.-Y.; Kang, M.-C.; Ko, S.-J. Parallel Feature Pyramid Network for Object Detection. In Computer Vision–ECCV 2018; Lecture Notes in Computer Science; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer International Publishing: New York, NY, USA, 2018; Volume 11209, pp. 239–256. ISBN 978-3-030-01227-4. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
