DCTE-LLIE: A Dual Color-and-Texture-Enhancement-Based Method for Low-Light Image Enhancement

Abstract: Enhancing images captured under low-light conditions is an important image processing task and can significantly affect the performance of downstream operations. In recent years, deep learning has been applied to low-light image enhancement, and deep-learning-based methods have become the mainstream approach. However, because existing methods cannot effectively maintain the color distribution of the input image or handle feature descriptions at different scales, the enhanced image often exhibits color distortion and local blurring. This paper therefore proposes a novel dual color-and-texture-enhancement-based low-light image enhancement method. First, a novel color enhancement block helps maintain the color distribution during the enhancement process, which reduces color distortion. Next, an attention-based multiscale texture enhancement block helps the network focus on multiscale local regions and extract more reliable texture representations, and a fusion strategy fuses the multiscale feature representations to generate the enhanced reflection component. Experimental results on public datasets and real-world low-light images demonstrate the effectiveness of the proposed method on low-light image enhancement tasks.


Introduction
Images contain rich real-world information, and image processing techniques allow this information to be extracted automatically and used in different applications. However, under certain extreme conditions, it may not be possible to acquire images with sufficient light. Insufficient light can have many causes, such as low-light environments, limited performance of the photography equipment, and improper use of the equipment. Low-light conditions lead to multiple degradations, such as poor visibility, low contrast, and unexpected noise, so extracting information from low-light images is more challenging than from images captured under sufficient light. For these reasons, low-light image enhancement (LLIE) has become an important branch of image processing.
In recent years, researchers have paid great attention to the LLIE task, and multiple methods have been proposed to improve the subjective and objective quality of low-light images [1]. Previous methods [2][3][4][5] mainly focus on spatial-domain enhancement: they operate directly on the pixels of the input image, which makes the enhancement process simple, efficient, and fast. However, although such methods can improve the brightness of input images to a certain extent and enhance details, they ignore the overall characteristics of the image, so the final enhanced image has a poor sense of hierarchy. Histogram-equalization-based methods are simple and perform well on most low-light images, but they do not conform to object imaging models, and the enhanced image may still suffer from low brightness, detail loss, and color distortion, so the visual results are not satisfactory.
Another category is based on the Retinex theory. The Retinex theory was first proposed by Land in 1977 [6] as an image enhancement approach based on the characteristics of human vision. It assumes that the color of an object depends on the object's ability to reflect light rather than on the absolute intensity of the reflected light, and that the color is not affected by nonuniform lighting. Accordingly, the Retinex theory decomposes the observed image into a reflection component and an illumination component, and only the reflection component is used to enhance the observed image. Retinex-theory-based methods can achieve uniform brightness, detail enhancement, and color fidelity; they have been favored by scholars around the world and are widely used in the field of image enhancement.
In recent years, deep learning [7] has proved to be a very powerful tool for feature extraction, and it has been widely used in multiple research areas, such as image classification [8][9][10][11], object detection [12][13][14][15][16], image segmentation [17][18][19][20], etc. By optimizing the feature extractor and classifier in an end-to-end fashion, deep learning models can automatically extract high-level feature representations, with all components optimized jointly during training. Deep learning has also been applied to image enhancement, especially to low-light image enhancement tasks. Researchers proposed RetinexNet [21], which combines the Retinex theory with deep learning, and multiple low-light image enhancement methods have since been built on it. The general process of Retinex-theory-based methods is to first extract the illumination component and reflection component through a decomposition model, then apply enhancement to the reflection component, and finally recompose the enhanced reflection component and the illumination component to generate the final enhanced image.
However, although deep-learning-based image enhancement methods [22][23][24][25][26][27][28] have achieved outstanding performance, low-light image enhancement remains a major challenge, and most existing methods still suffer from color distortion and local blurring. This is because most existing enhancement methods have difficulty retaining the color distribution and also fail to extract multiscale region texture representations during the enhancement process. To address these problems, this paper proposes a novel low-light image enhancement method called the dual color-and-texture-enhancement-based LLIE (DCTE-LLIE) method. The contributions of the proposed DCTE-LLIE method can be summarized as follows:

1. A novel method called the DCTE-LLIE method is proposed. It extracts more realistic color and texture feature representations of low-light images during the enhancement process, which helps reduce the color distortion and local blurring in the final enhanced image more effectively.

2. A novel color enhancement block (CEB) is proposed to extract more realistic color representations by maintaining the color distribution of the low-light image during the enhancement process, which reduces the color distortion of the final enhanced image.

3. A multiscale attention-based texture enhancement block (ATEB) is proposed to help the network focus on local regions and extract more effective and reliable texture feature representations during training. In addition, a multiscale feature fusion strategy is proposed to fuse multiscale features automatically and generate more reliable texture representations.
The rest of the paper is organized as follows: related work on low-light image enhancement is introduced in Section 2, and the proposed DCTE-LLIE method is described in detail in Section 3. The experimental results on public datasets and real-world images are shown in Section 4. The paper is concluded in Section 5.

Related Work
The low-light image enhancement methods can be divided into three categories: (a) generic enhancement methods; (b) Retinex-theory-based enhancement methods; and (c) deep-learning-based enhancement methods.
The generic enhancement methods: Histogram equalization (HE) [29] and its variations were the first to be applied to the enhancement process; they ensure that the histogram of the output image meets given constraints. HE-based methods can be categorized into global histogram-equalization-based methods [29] and local histogram-equalization-based methods [30][31][32]. The global HE-based method adjusts the overall gray level of the low-light image, but its performance degrades as the input image becomes darker, because the object information of the image cannot be highlighted and details cannot be well preserved. The local HE-based method splits the input image into multiple subblocks, equalizes the histogram of each subblock separately, and finally superimposes the subblocks to obtain the enhanced image. As each subblock is a neighborhood of a certain pixel, high-frequency gray levels are boosted and low-frequency gray levels are suppressed. The dehazing-based method [33] takes advantage of the inverse connection between low-light images and images captured in hazy environments. However, the above methods may amplify intensive noise during the enhancement process.
Another category is the Retinex-theory-based method. The Retinex theory decomposes the input image into an illumination component and a reflection component, and enhancement can be applied to each component separately. Finally, the enhanced illumination component and reflection component are recomposed to obtain the final enhanced image. Multiple methods have been proposed for the decomposition process [34][35][36]. The single-scale Retinex method [34] uses a Gaussian filter to smooth the illumination component and then enhances the low-light image. The multiscale Retinex method [35] extends this approach with Gaussian filters at multiple scales. The LIME method [37] takes advantage of structure priors to estimate the illumination component and uses the reflection component as the enhanced result. In addition, multiple methods [38,39] enhance low-light images by performing image enhancement and noise removal simultaneously.
Although the above methods have shown good performance on the low-light image enhancement task and can produce pleasing results, their performance may be limited by model capacity. In recent years, with the rapid development of deep neural networks, convolutional neural networks (CNNs) have been widely used in computer vision. As CNNs are powerful in low-level computer vision [22][23][24][25][26][27][28], CNN-based LLIE methods have drawn great attention. Wang et al. [22] proposed novel lightening back-projection blocks to learn the residual for normal-light estimations. Wang et al. [24] proposed a novel stream regularization model, LLFlow, which achieves outstanding performance on multiple tasks. Guo et al. [27] transferred the input image into a luminance-chrominance color space and eliminated noise in the brightened luminance. Many existing methods also take advantage of the Retinex theory to improve the performance of LLIE methods. Zhao et al. [23] proposed a novel generative strategy for Retinex decomposition, estimated the latent components, and performed low-light image enhancement. Hai et al. [25] took advantage of frequency information to preserve image details and achieved more robust results. Yang et al. [26] proposed an enhancement method that uses ViT to acquire high-level global fine-tuned features and achieved further improvement. Cai et al. [28] proposed an illumination-guided transformer that uses illumination representations to direct the modeling of nonlocal interactions between regions with different light conditions.
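As a concrete illustration of the global histogram equalization described above, the following minimal NumPy sketch remaps the gray levels of an 8-bit image through its cumulative histogram. The function name `equalize_histogram` and the toy input are illustrative assumptions and are not taken from [29].

```python
import numpy as np

def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image (uint8)."""
    hist = np.bincount(gray.ravel(), minlength=256)   # pixels per gray level
    cdf = hist.cumsum()                               # cumulative histogram
    cdf_min = cdf[cdf > 0][0]                         # count at the darkest level present
    total = gray.size
    if total == cdf_min:                              # constant image: nothing to stretch
        return gray.copy()
    # Map each gray level so that the output levels spread over [0, 255].
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Toy dark image that only uses gray levels 0..63.
dark = np.arange(64, dtype=np.uint8).repeat(4).reshape(16, 16)
equalized = equalize_histogram(dark)
```

Because the mapping depends only on the global histogram, every pixel with the same gray level receives the same output value, which is why local detail in very dark inputs is not well preserved.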

DCTE-LLIE Method
Although researchers have proposed multiple low-light image enhancement methods with good enhancement performance, existing methods still suffer from color distortion and local blurring. To address these problems, this paper proposes a novel method called the DCTE-LLIE method for low-light image enhancement tasks. The proposed method contains three main blocks: (a) the decomposition subnetwork; (b) the color enhancement block (CEB); and (c) the attention-based texture enhancement block (ATEB). The whole structure of the proposed DCTE-LLIE method is shown in Figure 1. As the DCTE-LLIE method is designed based on the Retinex theory, the input image is first decomposed into the reflection component and illumination component, and enhancement is then applied to each component separately. The illumination component is enhanced by a denoising operation that is the same as in RetinexNet, while the proposed CEB and ATEB are applied to the reflection component. Finally, the enhanced illumination component and reflection component are recomposed to obtain the final enhanced image.

Decomposition Subnetwork
As the DCTE-LLIE method is designed based on the Retinex theory, the input image is first decomposed into a reflection component and an illumination component. In this paper, a decomposition subnetwork is used to perform this decomposition. Based on the Retinex theory, the image I can be decomposed as

$$I(x, y) = R(x, y) \times L(x, y) \quad (1)$$

where R(x, y) is the reflection component, L(x, y) is the illumination component, and × represents the multiplication operation. According to the Retinex theory, the reflection component is a constant part determined by the nature of the object itself, while the illumination component is affected by the external light. The enhancement can therefore be performed on the reflection component by removing the influence of lighting, while the illumination component is corrected.
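The Retinex model above can be illustrated with a small NumPy sketch: an image is composed from a reflection component and a dim illumination component, the reflection is recovered by division, and the image is re-rendered with a corrected illumination. The gamma correction used to brighten the illumination is a simple stand-in chosen for illustration only; the paper uses learned networks for this step, and the names `relight`, `gamma`, and `eps` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.uniform(0.2, 1.0, size=(4, 4, 3))   # reflection component (object property)
L_low = np.full((4, 4, 1), 0.1)             # dim, uniform illumination component
I_low = R * L_low                           # observed low-light image: I = R x L

def relight(image, illumination, gamma=0.4, eps=1e-6):
    """Recover the reflection by division, then recompose with brighter illumination.

    Gamma correction is an illustrative assumption, not the method of the paper.
    """
    R_est = image / (illumination + eps)     # estimated reflection component
    L_new = illumination ** gamma            # corrected (brighter) illumination
    return np.clip(R_est * L_new, 0.0, 1.0), R_est

enhanced, R_est = relight(I_low, L_low)
```

The recovered `R_est` matches `R` up to the small `eps` term, which reflects the assumption that the reflection component does not depend on the lighting.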
The decomposition network consists of five convolution layers, each followed by a ReLU, and generates a two-channel output corresponding to the reflection and illumination components. The output passes through a sigmoid function to produce the final output, which has the same size as the input. For each input image pair (a low-light image and the corresponding normal-light image), the low-light image is decomposed into the reflection component $R_{low}$ and illumination component $L_{low}$, while the normal-light image is decomposed into the reflection component $R_{normal}$ and illumination component $L_{normal}$. A constrained loss function is used to optimize the parameters of the decomposition subnetwork. The decomposition loss $L_D$ is composed of three terms: the reconstruction loss $L_{recon}$, the reflection component consistency loss $L_{ir}$, and the illumination smoothing loss $L_{is}$.

$L_{recon}$ is designed so that the reflection and illumination components reconstruct the original input as closely as possible; in its definition, * represents the inner product operation. The reflection component consistency loss $L_{ir}$ is defined as

$$L_{ir} = \| R_{low} - R_{normal} \|_1 \quad (4)$$

where $\| \cdot \|_1$ denotes the L1 loss. $L_{ir}$ encourages the reflection component of the low-light image, $R_{low}$, and that of the normal-light image, $R_{normal}$, to be as similar as possible. The illumination smoothing loss $L_{is}$ expresses that an ideal illumination component should be as smooth as possible. It is built from $\nabla I_i$, the gradient of the illumination component (written $I_i$ in this loss), and $\nabla R_i$, the gradient of the reflection component $R_i$. The loss weights the gradient map of the illumination component by the gradient of the reflection component, so that areas that are smooth in the reflection component are also as smooth as possible in the illumination component.
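The three decomposition loss terms described above can be sketched in NumPy as follows. This is a hedged reading rather than the authors' exact formulation: the weighted-sum combination, the weights `lam_ir`, `lam_is`, and `lam_g`, the exponential edge weighting (borrowed from the RetinexNet-style smoothness term), and the use of a mean instead of a sum in the L1 terms are all assumptions.

```python
import numpy as np

def gradient_magnitude(x):
    """Sum of absolute forward differences along height and width (zero at the border)."""
    gy = np.abs(np.diff(x, axis=0, append=x[-1:, ...]))
    gx = np.abs(np.diff(x, axis=1, append=x[:, -1:, ...]))
    return gx + gy

def decomposition_loss(S_low, S_normal, R_low, I_low, R_normal, I_normal,
                       lam_ir=0.01, lam_is=0.1, lam_g=10.0):
    """Assumed form: L_D = L_recon + lam_ir * L_ir + lam_is * L_is (weights are placeholders)."""
    # Reconstruction: each image should equal its reflection times its illumination.
    l_recon = (np.mean(np.abs(R_low * I_low - S_low))
               + np.mean(np.abs(R_normal * I_normal - S_normal)))
    # Reflection consistency: L1 distance between the two reflection components.
    l_ir = np.mean(np.abs(R_low - R_normal))

    def smooth(I, R):
        # Illumination gradients are penalized less where the reflection has strong edges.
        R_gray = R.mean(axis=-1, keepdims=True)
        return np.mean(gradient_magnitude(I) * np.exp(-lam_g * gradient_magnitude(R_gray)))

    l_is = smooth(I_low, R_low) + smooth(I_normal, R_normal)
    return l_recon + lam_ir * l_ir + lam_is * l_is

# Toy pair: shared reflection, constant illumination at two brightness levels.
rng = np.random.default_rng(0)
R_true = rng.uniform(0.1, 1.0, (8, 8, 3))
I_dim, I_bright = np.full((8, 8, 1), 0.1), np.full((8, 8, 1), 0.9)
perfect = decomposition_loss(R_true * I_dim, R_true * I_bright,
                             R_true, I_dim, R_true, I_bright)
mismatch = decomposition_loss(R_true * I_dim, R_true * I_bright,
                              R_true, I_dim, 0.5 * R_true, I_bright)
```

For a perfect decomposition with shared reflection and smooth illumination, all three terms vanish, whereas a mismatched reflection component raises both the reconstruction and the consistency terms.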

Color Enhancement Block
Retinex-theory-based LLIE methods decompose an image into a reflection component and an illumination component and then apply enhancement to the reflection component. However, the final enhanced image may still suffer from color distortion, because existing Retinex-theory-based methods cannot maintain the color distribution during the enhancement process and therefore cannot extract realistic color representations. To address this, a novel color enhancement block (CEB) is proposed in this paper to help the network extract more realistic color representations by maintaining the color distribution during the enhancement process. The structure of the proposed CEB is shown in Figure 2. As shown in Figure 2, the reflection component $R_{low}$ is first transformed into the LAB color space. The LAB space is designed based on the human perception of color, so compared with other color spaces such as RGB and CMYK, LAB is more in line with human vision and easier to adjust. The L channel can be used to adjust image brightness, and the AB channels can be used to adjust the image color representation.
Accordingly, the AB channels are used as the input of the color enhancement block (CEB). The AB channels first pass through the attention map generator, which produces a two-channel attention map $\hat{M}$ corresponding to the AB channels. The attention map $\hat{M}$ is concatenated with the AB channels, and the concatenated feature passes through the attention point generator to produce an attention point map $\hat{P}$. Here a hat denotes a map generated by the network. Both $\hat{M}$ and $\hat{P}$ help the network maintain the color distribution of the input image and extract more realistic color representations. They are supervised by the color distribution map $M$ and the color point map $P$, respectively, which are generated from the reflection component $R_{normal}$ of the normal-light image, because images captured under normal-light conditions contain more realistic and precise color information. For a given image, the color distribution map $M$ is calculated by counting how often each pixel's color appears. The calculation involves the Hadamard product ⊙, the AB channels of $R_{normal}$, written $R_{normal}[AB]$ and indexed by $i \in \{1, 2\}$ because $R_{normal}[AB]$ has only two channels, and the color frequency map $F$. Each entry $F(x, y)$ equals the number of occurrences in the input image of the color at pixel $(x, y)$, so $F$ represents the color distribution of the input image. Next, to suppress the effect of noise and background, thresholding removes both the dominant colors and the rare noise colors so that the map focuses on the foreground colors; the background colors of an image tend to be very similar to one another and therefore appear as very high values in $F$. After $M$ is calculated, multiple foreground points are randomly selected from $M$ to generate the color point map $P$. In this part, $M$ is used to supervise the attention map $\hat{M}$, and $P$ is used to supervise the attention point map $\hat{P}$.
Supervision by the color distribution map M helps the network extract local color representations, while supervision by the color point map P guides the network in extracting pixel-level color representations. Because the map M contains many more complicated and duplicated colors than P, the supervision by P can cover most of the color distribution while using the fewest constraints.
It is worth noting that the generated color attention map and attention point map are supervised by color distribution maps generated from the reflection component $R_{normal}$ of the normal-light image. This is because the reflection component of the normal-light image contains more realistic color information, so the proposed CEB can help the network extract more realistic color representations.
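One way to read the construction of the supervision targets described above is sketched below in NumPy: colors in the AB plane are quantized, their occurrences are counted to form a frequency map, dominant and rare colors are thresholded away, and a few foreground points are sampled. The bin count, the two thresholds, the number of points, and the use of relative rather than absolute frequencies are all illustrative assumptions, since the exact formula for the color distribution map is not reproduced here.

```python
import numpy as np

def color_distribution_targets(ab, bins=16, low=0.02, high=0.5, n_points=8, seed=0):
    """Build a frequency map F, a distribution map M, and a point map P.

    ab: (H, W, 2) array holding the A and B channels scaled to [0, 1).
    All parameter values are hypothetical placeholders.
    """
    # Quantize each pixel's (a, b) pair into one of bins * bins color cells.
    q = np.clip((ab * bins).astype(int), 0, bins - 1)
    cell = q[..., 0] * bins + q[..., 1]
    counts = np.bincount(cell.ravel(), minlength=bins * bins)
    F = counts[cell] / cell.size                 # relative frequency of each pixel's color
    # Drop rare colors (noise) and dominant colors (background).
    foreground = (F >= low) & (F <= high)
    M = ab * foreground[..., None]               # Hadamard product with the mask
    # Randomly keep a few foreground pixels as the point map.
    P = np.zeros_like(M)
    ys, xs = np.nonzero(foreground)
    if len(ys) > 0:
        rng = np.random.default_rng(seed)
        pick = rng.choice(len(ys), size=min(n_points, len(ys)), replace=False)
        P[ys[pick], xs[pick]] = M[ys[pick], xs[pick]]
    return F, M, P

# Toy AB image: 70 background pixels, 29 foreground pixels, 1 noise pixel.
ab = np.full((10, 10, 2), 0.1)
ab[:3, :, :] = 0.8
ab[0, 0] = (0.5, 0.9)
F, M, P = color_distribution_targets(ab)
```

In this toy case the background color is removed because it is too frequent and the single outlier because it is too rare, so only the foreground color remains in `M`, and `P` keeps a sparse subset of it.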

Attention-Based Texture Enhancement Block
To further reduce the local blurring in the enhanced image, a novel attention-based texture enhancement block (ATEB) is also proposed to help the network focus on local regions and extract more reliable local region representations. Because existing methods cannot extract multiscale region representations, which leads to local blurring, it is essential to extract more effective multiscale local representations. Attention mechanisms [40] have shown an excellent ability to help networks focus on the region of interest and extract more reliable feature representations. Therefore, an attention mechanism is applied to the texture enhancement block, and an attention-based multiscale texture enhancement block (ATEB) is proposed.
The structure of the proposed ATEB is shown in Figure 3. The input image is first decomposed into the illumination component and reflection component by the decomposition subnetwork, and the enhancement process is then applied to the reflection component. The proposed ATEB contains three parts: (a) the multiscale attention block, which helps the network focus on the multiscale regions of interest of the input image by extracting multiscale attention maps; (b) a U-shaped enhancement block, in which a U-shaped network structure is constructed to extract multiscale feature representations; and (c) a multiscale feature fusion strategy, in which the multiscale attention maps and multiscale feature representations are fused to generate feature representations with sufficient local region texture information. The reflection components $R_{low}$ and $R_{normal}$ contain rich but different texture information, so they are first mixed to form a mixed reflection representation $R_{mix} = (R_{low} + R_{normal})/2$. After that, $R_{mix}$ is sent into the multiscale attention block to generate multiscale spatial attention maps. The multiscale attention block is constructed from multiple 3 × 3 convolution layers and one 1 × 1 convolution layer. First, two 3 × 3 convolution layers are applied to the mixed reflection feature to extract multiscale regions of interest and remove background information. After that, a 1 × 1 convolution layer is used to reduce the channel number and computation cost. Finally, two 3 × 3 convolution layers are used to extract the final multiscale attention maps. The generated multiscale attention maps correspond to the multiscale feature representations generated by the U-shaped network.
As shown in Figure 3, the reflection component $R_{low}$ and illumination component $I_{low}$ of the low-light image are concatenated to form the input of the U-shaped feature enhancement block. In this paper, a U-shaped enhancement network is used to extract multiscale local region feature representations and enhance the reflection component. The left part of the U-shaped network captures feature representations from low level to high level, while the right part recovers the details of each layer and continuously passes high-level semantic information to the bottom. The proposed U-shaped enhancement network can therefore effectively extract multiscale feature representations.
After the multiscale feature is extracted from the U-shape enhancement network, the multiscale attention maps are applied to the multiscale feature representations to generate the final feature representation which contains sufficient local region texture information.
To fuse the multiscale feature representations effectively, a novel fusion strategy is also proposed, in which multiple deconvolution layers transform feature maps of different sizes into the same size and then fuse the multiscale feature maps into an enhanced reflection component. The final reflection component is then combined with the enhanced illumination component to generate the final enhanced image.
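The attention-weighted multiscale fusion described above can be approximated by the following NumPy sketch. Nearest-neighbor upsampling is used here in place of the learned deconvolution layers, and a plain average replaces the learned fusion; both substitutions, as well as the function names, are assumptions made to keep the example self-contained.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbor upsampling of an (H, W, C) map; stands in for a deconvolution."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def fuse_multiscale(features, attentions):
    """Fuse attention-weighted feature maps from several scales.

    features:   list of (H_k, W_k, C) maps, finest scale first.
    attentions: list of (H_k, W_k, 1) spatial attention maps, same order.
    """
    target_h = features[0].shape[0]
    fused = np.zeros_like(features[0], dtype=float)
    for feat, att in zip(features, attentions):
        factor = target_h // feat.shape[0]                # scale gap to the finest map
        fused += upsample_nearest(feat * att, factor)     # apply attention, then resize
    return fused / len(features)                          # simple average (assumption)

# Three scales of constant features; attention switches off the coarsest scale.
features = [np.ones((8, 8, 3)), np.ones((4, 4, 3)), np.ones((2, 2, 3))]
attentions = [np.ones((8, 8, 1)), np.ones((4, 4, 1)), np.zeros((2, 2, 1))]
fused = fuse_multiscale(features, attentions)
```

Each attention map acts as a per-pixel gate on its own scale before the maps are brought to a common resolution, so a scale that attention suppresses contributes nothing to the fused representation.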

Training Strategy
The proposed model cannot be trained end to end from scratch, particularly the decomposition subnetwork and the color enhancement block. This is because the quality of the decomposition seriously affects the performance of the subsequent operations, and because the color enhancement block relies on the two color distribution maps M and P to supervise the extraction of realistic color representations. A three-step training strategy is therefore used to optimize the parameters of the DCTE-LLIE method: (a) decomposition subnetwork optimization, which optimizes the Retinex decomposition subnetwork by minimizing the loss function $L_D$; (b) color enhancement block optimization, in which the color distribution map and color point map are generated to guide the network in maintaining the color distribution and extracting more realistic color representations, achieved by minimizing the color-maintaining loss function $L_C$; and (c) whole-network optimization, in which the whole network is optimized in an end-to-end fashion while the parameters of the decomposition subnetwork and the CEB are fine-tuned.
The second step optimizes the color attention generators. The color attention map $\hat{M}$ is supervised by the color distribution map $M$ through the loss

$$L_M = \| \hat{M} - M \|_1 \quad (7)$$

and the color point attention map $\hat{P}$ is supervised by the color point map $P$ through the loss

$$L_P = \| \hat{P} - P \|_1 \quad (8)$$

where a hat denotes a map generated by the network. The final color-maintaining loss function $L_C$ for color enhancement block optimization combines these two terms, with α as the balance parameter between the color attention map loss $L_M$ and the color point attention map loss $L_P$. After that, the whole DCTE-LLIE network is optimized in an end-to-end fashion, and the modules trained in the first two steps are fine-tuned during this process. The final loss function $L_F$ is defined in terms of $\hat{R}_i$, the enhanced reflection component, and $\hat{I}_j$, the enhanced illumination component, combined through the Hadamard product. $L_F$ aims to minimize the difference between the ground-truth image and the enhanced image composed of the enhanced reflection and illumination components.
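The losses used in the second and third training steps can be sketched in NumPy as follows. The L1 terms follow the description above, but the way α enters the color-maintaining loss (here as a weight on the point term), the default value of α, and the use of an L1 distance for the final loss are assumptions, since the exact expressions are not reproduced here.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference, used here as the L1 loss."""
    return np.mean(np.abs(a - b))

def color_maintaining_loss(M_hat, M, P_hat, P, alpha=0.5):
    """Assumed form: L_C = L_M + alpha * L_P (alpha is a placeholder value)."""
    loss_m = l1(M_hat, M)        # attention map vs. color distribution map
    loss_p = l1(P_hat, P)        # attention point map vs. color point map
    return loss_m + alpha * loss_p

def final_loss(R_hat, I_hat, S_normal):
    """Assumed form: L1 distance between the recomposed image and the ground truth."""
    return l1(R_hat * I_hat, S_normal)   # Hadamard product recomposes the image

# Toy check with constant maps.
M_hat, M = np.zeros((4, 4, 2)), np.ones((4, 4, 2))
P_hat, P = np.zeros((4, 4, 2)), np.full((4, 4, 2), 2.0)
loss_c = color_maintaining_loss(M_hat, M, P_hat, P)           # 1 + 0.5 * 2
loss_f = final_loss(np.full((4, 4, 3), 0.5), np.full((4, 4, 1), 0.8),
                    np.full((4, 4, 3), 0.4))                  # 0.5 * 0.8 equals 0.4
```

In a three-step schedule of this kind, the first loss would be minimized while only the color enhancement block is trained, and the second would then drive the end-to-end fine-tuning of all modules.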

Experiment Details
To evaluate the effectiveness of the proposed method, it is applied to three public datasets, namely the LOL, SID, and MIT-Adobe FiveK datasets, as well as to low-light images captured in real-world scenarios.
The LOL dataset is the first dataset of low-light images and corresponding normal-light images captured in real-world environments. It consists of two categories: 500 real-world image pairs and 1000 composite image pairs. The real-world images are captured in real-world environments, while the composite images are generated from normal-light images in the RAISE dataset. Only the real-world images are used in this experiment, with 485 image pairs for training and 15 image pairs for testing.
The SID dataset includes 5094 raw images captured under low-light conditions, and each low-light image has a corresponding high-quality normal-light image. The SID dataset contains indoor and outdoor images, and the outdoor images are usually captured under moonlight or street lighting. Images in the SID dataset are captured by two different cameras, the Sony α7S and the Fujifilm X-T2, so the dataset contains two subsets: SID Sony and SID Fuji.
The MIT-Adobe FiveK dataset was proposed in 2011 and includes 5000 raw images captured by DSLR cameras. Each image was retouched for color tone by 5 experienced photographers using Adobe Lightroom. Because the dataset pairs each original image with 5 retouched versions, and contains multiple photos retouched by the same photographer, it can also be used for style learning.

Evaluation Index
To evaluate the effectiveness of the proposed method, three evaluation indexes are used: PSNR, SSIM [41], and LPIPS [42].
Peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of a signal and the power of the noise that corrupts it. It is calculated as

$$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} \quad (11)$$

where MSE is the mean squared error, calculated as

$$\mathrm{MSE} = \frac{1}{HW} \sum_{x=1}^{H} \sum_{y=1}^{W} \left( I(x, y) - R(x, y) \right)^2 \quad (12)$$

where I is the image captured under normal-light conditions, R is the enhanced image, H and W are the image height and width, and L is the largest gray level, whose value is $2^n - 1$, with n the number of bits per pixel. A larger PSNR indicates a better enhancement result.
Structural similarity (SSIM) evaluates the similarity between two images. Its value ranges from −1 to 1, and a larger value indicates that the enhanced image is more structurally similar to the image captured under normal-light conditions. It is measured from three aspects: brightness, contrast, and structure. The SSIM between image I and image R can be calculated as

$$\mathrm{SSIM}(I, R) = \frac{(2 \mu_I \mu_R + c_1)(2 \sigma_{IR} + c_2)}{(\mu_I^2 + \mu_R^2 + c_1)(\sigma_I^2 + \sigma_R^2 + c_2)} \quad (13)$$

where $\mu_I$ and $\mu_R$ are the luminances of I and R, $\sigma_I^2$ and $\sigma_R^2$ are their variances, $\sigma_{IR}$ is their covariance, and $c_1$ and $c_2$ are small constants that stabilize the division. The luminance $\mu_I$ is calculated as the mean of all pixel values:

$$\mu_I = \frac{1}{N} \sum_{i=1}^{N} I_i \quad (14)$$

where N is the number of pixels.
Learned perceptual image patch similarity (LPIPS) is also used as an evaluation index. It uses deep features to measure the perceptual similarity of images, and the metric itself is learned with deep learning methods; a lower value indicates higher perceptual similarity. Compared with PSNR and SSIM, LPIPS reflects the human perception of image quality more faithfully.
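The PSNR computation described above can be written as a short NumPy function. The function name `psnr` and the `bits` argument are illustrative; an 8-bit image is assumed by default.

```python
import numpy as np

def psnr(reference, enhanced, bits=8):
    """PSNR in dB between a normal-light reference and an enhanced image."""
    ref = np.asarray(reference, dtype=np.float64)
    enh = np.asarray(enhanced, dtype=np.float64)
    mse = np.mean((ref - enh) ** 2)          # mean squared error over all pixels
    if mse == 0:
        return float("inf")                  # identical images
    peak = 2 ** bits - 1                     # largest gray level
    return 10.0 * np.log10(peak ** 2 / mse)

# Example: every pixel of the enhanced image is off by one gray level.
reference = np.full((4, 4), 100, dtype=np.uint8)
off_by_one = reference + 1
score = psnr(reference, off_by_one)          # equals 20 * log10(255)
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.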
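A simplified SSIM computation is sketched below in NumPy. It evaluates the luminance, contrast, and structure statistics once over the whole image, whereas the standard SSIM index [41] averages the same expression over local sliding windows; this global variant is therefore an illustrative simplification. The constants follow the commonly used choice $c_1 = (0.01 L)^2$ and $c_2 = (0.03 L)^2$, which is assumed here.

```python
import numpy as np

def ssim_global(I, R, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (simplified, no sliding window)."""
    I = np.asarray(I, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    peak = 2 ** bits - 1
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2     # stabilizing constants
    mu_i, mu_r = I.mean(), R.mean()                 # luminance
    var_i, var_r = I.var(), R.var()                 # contrast
    cov = ((I - mu_i) * (R - mu_r)).mean()          # structure
    return ((2 * mu_i * mu_r + c1) * (2 * cov + c2)) / (
        (mu_i ** 2 + mu_r ** 2 + c1) * (var_i + var_r + c2))

ramp = np.arange(256, dtype=np.float64).reshape(16, 16)
same = ssim_global(ramp, ramp)            # identical images
inverted = ssim_global(ramp, 255 - ramp)  # negatively correlated images
```

Identical images give a value of 1, while an inverted image gives a negative value because its covariance with the reference is negative.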

Comparison with State-of-the-Art Methods
The public LOL dataset is used to evaluate the effectiveness of the proposed DCTE-LLIE method against multiple deep-learning-based methods. The comparison results are shown in Table 1. Deep-learning-based methods generally outperform generic methods, which reflects the feature extraction ability of deep learning in the low-light image enhancement task, so only deep-learning-based enhancement methods are compared with the proposed DCTE-LLIE method. As shown in Table 1, the proposed method achieves outstanding performance on the low-light image enhancement task. This is mainly because the DCTE-LLIE method extracts more realistic color representations and more reliable region representations during the enhancement process: the color enhancement block helps the network maintain the color distribution, while the multiscale attention-based texture enhancement block helps extract more reliable multiscale local texture representations. As a result, the proposed DCTE-LLIE method performs better on the three evaluation indexes than most of the deep-learning-based methods.

The Effectiveness of Color Enhancement Block
As shown in Table 2, the DCTE-LLIE method without the color enhancement block performs worse than the method with it. This comparison demonstrates the contribution of the proposed color enhancement block to the low-light image enhancement task. The block uses a steady color maintenance strategy to extract a more reliable color representation: by calculating the color distribution map and color point map, the color distribution of the input image is well preserved and the extracted color representation is more reliable. The proposed CEB thus ensures that the color distribution is well preserved in the final enhanced image, and the performance comparison shows that it effectively improves the enhancement result.
Table 2 also demonstrates the effectiveness of the proposed texture block. Comparing the full DCTE-LLIE method with DCTE-LLIE without the ATEB shows that the ATEB significantly improves the final enhancement performance. This is because the attention module helps the network focus on multiscale local region texture details and maintain local details during the enhancement process. Likewise, comparing the DCTE-LLIE method with the variant without the multiscale feature fusion strategy demonstrates the effectiveness of the fusion strategy. The proposed ATEB makes the network focus on the region of interest from different perspectives, and the U-shaped feature extractor extracts multiscale feature representations, which makes the final fused feature representation more reliable and significantly improves the final enhancement result.

Further Experimental Results on MIT-Adobe FiveK Dataset
To further evaluate the effectiveness and generalization ability of the proposed method, the DCTE-LLIE method is also applied to the MIT-Adobe FiveK dataset, and its performance is compared with multiple existing enhancement methods. The experimental results are shown in Table 3. The proposed DCTE-LLIE method still achieves satisfactory enhancement performance: its PSNR and SSIM are only slightly worse than those of LLFlow [24], and better than those of the other deep-learning-based enhancement methods. This indicates that the proposed DCTE-LLIE method maintains strong performance across different enhancement tasks and supports its generalization ability.
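For readers unfamiliar with the PSNR index reported in Table 3, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not the evaluation code used in this paper; the function name `psnr`, the flat-list image representation, the 8-bit peak value of 255, and the toy pixel values are all assumptions made for illustration.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images.

    Both images are passed as flat sequences of pixel intensities
    (an assumption made to keep the sketch dependency-free).
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale image, flattened row by row (hypothetical values).
ground_truth = [100, 120, 140, 160]
enhanced = [110, 110, 150, 150]
score = psnr(ground_truth, enhanced)
print(f"PSNR: {score:.2f} dB")
```

A higher PSNR means the enhanced image is closer to the ground truth in the mean-squared-error sense; SSIM complements it by comparing local structure rather than per-pixel error.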

Further Experimental Results on the SID Dataset
The proposed DCTE-LLIE method is also applied to the SID dataset. The SID dataset has two subsets, SID Sony and SID Fuji, corresponding to images captured with a Sony camera and a Fuji camera, respectively. The experimental results on both subsets are shown in Table 4. The proposed DCTE-LLIE method still achieves outstanding performance on the SID dataset for both the Fuji and the Sony camera, and it achieves the best performance among the compared low-light enhancement methods. This is because the proposed method can extract realistic color representations and reliable local texture representations, which improve the final enhancement performance. These results also indicate that the proposed method can be applied to images from different equipment, which further supports its robustness.

Visualization Performance on Real-World Scenarios
Visual comparisons between the proposed DCTE-LLIE method and other enhancement methods are also provided. Some real-world low-light images captured from real-world scenarios at noon are used to evaluate the performance of the proposed method. As shown in Figures 4-7, most of the existing methods struggle to extract realistic color and texture representations. For example, LIME overenhances moderately bright areas, which makes the final result look unnatural and introduces varying degrees of color distortion. RetinexNet can effectively improve the brightness of the input image, but color distortion and local blurring remain. EnlightenGAN, which is based on the GAN model, enhances the illumination better while producing less noise; however, compared with the ground truth image, weakly lit areas remain in local regions and the contrast is not high. Among all the enhanced images, the result of the proposed DCTE-LLIE method is the closest to the ground truth image.

Conclusions
In this paper, a novel method based on deep learning and Retinex theory is proposed for low-light image enhancement. Because existing enhancement methods can introduce color distortion and blurring during the enhancement process, the proposed method makes the following improvements. Firstly, a color enhancement block is proposed to help the network extract more realistic color representations. Secondly, a multiscale attention mechanism is applied to the texture enhancement block to help the network focus on local regions and extract multiscale feature representations, which are then fused through a multiscale feature fusion strategy to produce the final enhanced reflection component. The experimental results on public datasets establish the effectiveness of the proposed method on low-light image enhancement tasks.

Figure 1. The structure of the proposed reflection enhancement branch of the DCTE-LLIE method.

Figure 2. The structure of the proposed color enhancement block.

Figure 3. The structure of the proposed attention-based texture enhancement block.

Table 1. The performance comparison between multiple low-light image enhancement methods and the proposed method on the LOL dataset.

Table 2. The results of the ablation study on the DCTE-LLIE method.

Table 3. The performance comparison between multiple low-light image enhancement methods and the proposed method on the MIT-Adobe FiveK dataset.

Table 4. The performance comparison between multiple low-light image enhancement methods and the proposed method on the SID dataset.