Article

A Novel Low-Illumination Image Enhancement Method Based on Convolutional Neural Network with Retinex Theory

Haixia Mao, Wei Peng, Yan Tian and Xiaochun Zhu
1 School of Automotive and Transportation Engineering, Shenzhen Polytechnic University, 7098 Liuxian Road, Nanshan District, Shenzhen 518055, China
2 School of Electronic Information and Communications, Huazhong University of Science and Technology, 1037 Luoyu Road, Hongshan District, Wuhan 430074, China
3 College of Civil and Transportation Engineering, Shenzhen University, 3688 Nanhai Avenue, Nanshan District, Shenzhen 518060, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(22), 12324; https://doi.org/10.3390/app152212324
Submission received: 12 September 2025 / Revised: 5 November 2025 / Accepted: 18 November 2025 / Published: 20 November 2025

Abstract

Low-illumination images can seriously affect or even limit the performance of the human eye or a computer vision system, making image enhancement necessary. Traditional image enhancement methods, such as those based on image fusion, frequency thresholding, or spatial domain processing, lack robustness. Existing state-of-the-art methods, including GLADNet and MSR-Net, also suffer from color bias or blocking (square) artifacts in the recovered images. To address these issues, we propose a novel low-illumination image enhancement method based on a convolutional neural network (CNN) that incorporates Retinex theory, namely Retinex-CNN. A decomposition subnetwork transforms the original image into a reflectance map and a light map, which are then further optimized by a reflectance map refinement subnetwork and a light map enhancement subnetwork, respectively. Finally, according to Retinex theory, the refined reflectance map and the enhanced light map are recombined to obtain a fused result with better visual quality. We used synthetic and real low-illumination image datasets for training, testing, and comparison with other methods. In the synthetic scene, Retinex-CNN demonstrates superior performance, with higher PSNR and SSIM and lower MSE. In the real scene, Retinex-CNN achieves the best VIF score on all six public datasets (over 0.75) and the best NIQE score on four of them. The experiments demonstrate that Retinex-CNN not only effectively improves the brightness of the image but also enhances the clarity of details and mitigates serious color distortion and halo phenomena. Additionally, the image enhancement process is less time-consuming.

1. Introduction

Under low-illumination conditions, such as cloudy days, nights, and object occlusions, the images obtained often suffer from a low dynamic range, a serious loss of detailed information, and significant noise. Therefore, low-illumination images can seriously affect or even limit the performance of the human eye or computer vision systems [1,2,3]. Enhancing low-illumination images can improve clarity, highlight the textured details of the scene, and prevent performance degradation in subsequent visual tasks, which has important theoretical significance and practical application value.
Traditional image enhancement algorithms include image fusion-based, frequency domain-based, and spatial domain-based methods. Image fusion-based methods sometimes cannot effectively enhance images with darker local regions and may lose detailed information, blurring the recovered image [4,5]. Frequency domain-based methods, represented by the wavelet transform, have matured to the point that they can both remove noise effectively and enhance image details. However, these algorithms cause color and brightness distortion in the enhanced image that does not match human visual perception. Spatial domain algorithms based on histogram equalization (HE) are simple and have low time complexity, and they have been widely used in image preprocessing. However, their underlying principle does not conform to the imaging model of the scene, so the recovered image suffers from brightness distortion, color distortion, loss of detail, and other issues, and the visual effect is poor [4,6,7].
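As a concrete illustration of the HE family discussed above (not part of the proposed method), here is a minimal OpenCV sketch that equalizes only the luminance channel, a common variant that reduces, though does not eliminate, the color distortion just described; the file names are hypothetical:

```python
import cv2

# Histogram equalization applied to the luminance (Y) channel only, which
# limits the color shift that full per-channel equalization would cause.
img = cv2.imread("low_light.jpg")                    # hypothetical input path
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
ycrcb[..., 0] = cv2.equalizeHist(ycrcb[..., 0])      # equalize luminance
result = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
cv2.imwrite("he_result.jpg", result)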
The integration of Retinex theory with convolutional neural networks (CNNs) has emerged as a promising approach for low-illumination image enhancement in recent research. Various network structures have been developed; the typical pipeline extracts the incident light with a trained network model, adjusts it, and then computes the reflectance according to the Retinex model. Park et al. [8] proposed a bipartite self-coding network that only processes the luminance component and effectively avoids color distortion. Wang et al. [9] designed a trainable neural network, GLADNet, which incorporates a nearest-neighbor interpolation operation to avoid artificial ghosting and improve the quality of the enhanced image. Xu et al. [10] introduced a low-illumination residual CNN (LRCNN) that jointly addresses noise removal and contrast enhancement; they added several residual connections to preserve image details while allowing the network to avoid gradient vanishing and overfitting. Shen et al. [11] proposed MSR-Net, which employs a CNN to mimic the processing of the traditional multi-scale Retinex algorithm, with the network structure mirroring the form of MSR.
However, these methods have several shortcomings: MSR-Net does not retain details well, lacks noise immunity, and tends to over-enhance. Both GLADNet and LRCNN estimate the reflectance directly without considering the incident light, so the enhanced images exhibit visual artifacts inconsistent with human perception, and their robustness is poor across different scenes. Lei et al. [12] developed a deep learning network inspired by Retinex theory that integrates three core components (image decomposition, illumination enhancement, and color restoration) but lacks reflectance map refinement. Therefore, it is still necessary to design an algorithm that retains image details well, is robust to noise, and produces images aligned with human visual perception.
Therefore, we propose a novel CNN that incorporates Retinex theory for low-illumination image enhancement, called Retinex-CNN. It utilizes convolutional networks to decompose the low-illumination image into a reflectance map and a light map for optimization. Our contributions are as follows:
(1) A new CNN-based network architecture is designed for low-illumination image enhancement, which consists of three subnetworks, namely the decomposition subnetwork, the reflectance map refinement subnetwork, and the light map enhancement subnetwork;
(2) Based on the Retinex theory and the proposed network structure, three corresponding specific loss functions are designed for model training;
(3) Experiments conducted on synthetic and real datasets demonstrate that Retinex-CNN can not only improve the brightness of the image but also improve the detail of the image and avoid serious color distortion and the halo phenomenon to a certain extent.

2. Materials and Methods

2.1. Retinex-CNN

The Retinex-CNN is designed based on CNNs and Retinex theory for low-illumination image enhancement. The network takes a complete low-illumination image as input and directly outputs a normal-illumination image. As shown in Figure 1, the network comprises three subnetworks: a decomposition subnetwork, a reflectance map refinement subnetwork, and a light map enhancement subnetwork.

2.1.1. Decomposition Subnetwork

According to the Retinex theory, an image captured from the real world consists of two parts, i.e., a reflectance map and a light map. Therefore, as illustrated in Figure 2, we design a decomposition subnetwork based on a CNN to perform the image decomposition.
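Written out, this is the standard Retinex image formation model, together with the fusion step applied at the output of the pipeline:

$S(x, y) = R(x, y) \cdot B(x, y), \qquad \hat{S}(x, y) = \hat{R}(x, y) \cdot \hat{B}(x, y)$

where $S$ is the observed image, $R$ the reflectance map, $B$ the light map, $\cdot$ denotes pixel-wise multiplication, and the hatted quantities are the refined reflectance map and the enhanced light map produced by the two later subnetworks.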
The decomposition subnetwork uses three ConvReLU blocks (a convolutional layer followed by a ReLU layer) to extract shallow features. The kernel sizes of the convolutional layers are 7 × 7, 5 × 5, and 3 × 3, and the numbers of channels are 32, 20, and 12, respectively. The information from these features is then combined by a concatenation layer (Depthconcat), and two ConvBNPReLU blocks (a convolutional layer followed by a batch normalization layer and then a PReLU layer) refine the extracted features. Finally, the decomposed reflectance map and light map are generated by a convolutional layer with a 3 × 3 kernel and four output channels (three for the reflectance map and one for the light map), followed by a Sigmoid function.
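The description above fixes the kernel sizes and channel counts but not every wiring detail; the following PyTorch sketch is one reading of Figure 2, assuming the three ConvReLU blocks run in sequence with all three outputs concatenated (32 + 20 + 12 = 64 channels) and 64 channels in the ConvBNPReLU blocks:

```python
import torch
import torch.nn as nn

class DecompositionNet(nn.Module):
    """One reading of the decomposition subnetwork (Figure 2)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(3, 32, 7, padding=3), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(32, 20, 5, padding=2), nn.ReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(20, 12, 3, padding=1), nn.ReLU())
        self.refine = nn.Sequential(  # two ConvBNPReLU blocks (64 channels assumed)
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.PReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.PReLU(),
        )
        self.out = nn.Conv2d(64, 4, 3, padding=1)  # 3 reflectance + 1 light channels

    def forward(self, s):
        f1 = self.conv1(s)
        f2 = self.conv2(f1)
        f3 = self.conv3(f2)
        f = torch.cat([f1, f2, f3], dim=1)           # Depthconcat
        x = torch.sigmoid(self.out(self.refine(f)))  # normalize to (0, 1)
        return x[:, :3], x[:, 3:]                    # reflectance R, light map B
```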

2.1.2. Reflectance Map Refinement Subnetwork

As shown in Figure 3, the reflectance map refinement subnetwork consists of an encoder and a decoder. The encoder is built from ResizeConvPReLU blocks (3 × 3 kernels, 64 channels, with nearest-neighbor interpolation for downsampling) and coarsely extracts the features of the reflectance map. The decoder reconstructs the reflectance map using ResizeConvPReLU blocks with the same kernel size and channel count, with nearest-neighbor interpolation for upsampling. Finally, a convolutional layer with a 3 × 3 kernel recovers the details of the reconstructed image from the decoder, and the output is normalized to (0,1) using the Sigmoid function.
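A hedged sketch of this encoder-decoder under stated assumptions (three blocks per stage, a 3-to-64-channel input stem, and input sides divisible by 8; none of these is fixed by the text):

```python
import torch
import torch.nn as nn

class ResizeConvPReLU(nn.Module):
    """Nearest-neighbor resize, then a 3x3 convolution and PReLU."""
    def __init__(self, ch, scale):
        super().__init__()
        self.resize = nn.Upsample(scale_factor=scale, mode="nearest")
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.act = nn.PReLU()

    def forward(self, x):
        return self.act(self.conv(self.resize(x)))

class ReflectanceRefineNet(nn.Module):
    """One reading of the refinement subnetwork (Figure 3)."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 64, 3, padding=1)  # lift R to 64 channels (assumed)
        self.encoder = nn.Sequential(*[ResizeConvPReLU(64, 0.5) for _ in range(3)])
        self.decoder = nn.Sequential(*[ResizeConvPReLU(64, 2.0) for _ in range(3)])
        self.out = nn.Conv2d(64, 3, 3, padding=1)   # recover details, then (0, 1)

    def forward(self, r):
        return torch.sigmoid(self.out(self.decoder(self.encoder(self.stem(r)))))
```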

2.1.3. Light Map Enhancement Subnetwork

As shown in Figure 4, the light map enhancement subnetwork first uses a downsampling encoder to extract global features, then obtains enhanced features through an upsampling decoder, and finally uses a concatenation layer and a convolutional layer to recover image details.
The downsampling encoder consists of three ConvBNReLU blocks whose convolutional layers have 3 × 3 kernels and 64 channels. In this part, nearest-neighbor interpolation realizes the downsampling operation.
The upsampling decoder mirrors the encoder: it consists of three ConvBNReLU blocks with the same parameters, but nearest-neighbor interpolation is used for the upsampling operation.
In the final part, a concatenation layer uses skip connections to combine multi-layer features along the channel dimension, providing rich feature information for the final convolutional layer, which has a 3 × 3 kernel followed by the Sigmoid activation function.
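A sketch of this subnetwork; which feature maps feed the final skip concatenation is not fully specified, so here the three decoder outputs are resized to full resolution and concatenated (an assumption), and input sides are taken as divisible by 8:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightEnhanceNet(nn.Module):
    """One reading of the light map enhancement subnetwork (Figure 4)."""
    def __init__(self, ch=64):
        super().__init__()
        def block(cin):  # ConvBNReLU with a 3x3 kernel and 64 channels
            return nn.Sequential(nn.Conv2d(cin, ch, 3, padding=1),
                                 nn.BatchNorm2d(ch), nn.ReLU())
        self.enc = nn.ModuleList([block(1), block(ch), block(ch)])
        self.dec = nn.ModuleList([block(ch), block(ch), block(ch)])
        self.out = nn.Conv2d(3 * ch, 1, 3, padding=1)

    def forward(self, b):
        x = b
        for enc in self.enc:   # encoder: conv, then nearest-neighbor downsample
            x = F.interpolate(enc(x), scale_factor=0.5, mode="nearest")
        feats = []
        for dec in self.dec:   # decoder: nearest-neighbor upsample, then conv
            x = dec(F.interpolate(x, scale_factor=2.0, mode="nearest"))
            feats.append(x)
        h, w = b.shape[-2:]    # skip concatenation at full resolution
        feats = [F.interpolate(f, size=(h, w), mode="nearest") for f in feats]
        return torch.sigmoid(self.out(torch.cat(feats, dim=1)))
```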

2.2. Loss Function

As presented in Figure 1, Retinex-CNN consists of three subnetworks. Each subnetwork is self-contained and can be trained independently, step by step, while the overall network remains end-to-end.

2.2.1. Decomposition Subnetwork Loss Function

Whether it is a normal-illumination image or a low-illumination image, an image consists of two parts: the reflectance map and the light map. Taking the normal-illumination image $S_{normal}$ as the input of the decomposition subnetwork yields its reflectance map $R_{normal}$ and light map $B_{normal}$; taking the low-illumination image $S_{low}$ as the input yields its reflectance map $R_{low}$ and light map $B_{low}$.
Since the reflectance map is an intrinsic property of the image and is not affected by lighting conditions, a reflectance map invariance loss function $L_r$ can be designed as follows:
$L_r = \lVert R_{low} - R_{normal} \rVert_2$ (1)
A good light map should be smooth in textural details while still preserving the overall structural boundaries. However, the plain total variation minimization loss is structure-blind: it produces a blurry light map and leaves strong black edges on the reflectance map. To capture the overall structure of the image, the total variation term is weighted by the gradient of the reflectance map. The light map loss function $L_l$ can be formulated as follows:
$L_l = \lVert \nabla B_{normal} \cdot e^{-\lambda \nabla R_{normal}} \rVert_2 + \lVert \nabla B_{low} \cdot e^{-\lambda \nabla R_{low}} \rVert_2$ (2)
Since decomposition and fusion are inverse processes, the product of the reflectance map and the light map should recover the corresponding real image, so a reconstruction loss function $L_s$ can be designed as follows:
$L_s = \lVert R_{normal} \cdot B_{normal} - S_{normal} \rVert_2 + \lVert R_{low} \cdot B_{low} - S_{low} \rVert_2$ (3)
Thus, the loss function of the decomposition subnetwork, $L_{DNet}$, consists of the reflectance map invariance loss $L_r$, the light map loss $L_l$, and the reconstruction loss $L_s$. That is,
$L_{DNet} = L_s + \lambda_r L_r + \lambda_l L_l$ (4)
where $\lambda_r$ and $\lambda_l$ are coefficients balancing the constancy of the reflectance map and the local smoothness of the light map, respectively; setting $\lambda_r = 0.01$ and $\lambda_l = 0.1$ achieves good performance. In Equation (2), $\lambda$ is the structure-aware intensity balancing coefficient, set to $\lambda = 10$ based on prior experience. The factor $e^{-\lambda \nabla R}$ relaxes the smoothness constraint where the gradient of the reflectance map is large, i.e., at structural boundaries and illumination discontinuities. $\nabla$ denotes the horizontal and vertical gradients.
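A hedged PyTorch rendering of Equations (1)-(4), with two stated approximations of ours: the $\lVert \cdot \rVert_2$ terms become per-pixel mean squared errors, and the reflectance gradient is taken on the channel mean:

```python
import torch
import torch.nn.functional as F

def grad(x):
    """Horizontal and vertical forward differences, zero-padded to input size."""
    dx = F.pad(x[..., :, 1:] - x[..., :, :-1], (0, 1))
    dy = F.pad(x[..., 1:, :] - x[..., :-1, :], (0, 0, 0, 1))
    return dx, dy

def smoothness(b, r, lam=10.0):
    """Structure-aware smoothness, Eq. (2): penalize light-map gradients
    except where the reflectance map has strong edges."""
    bx, by = grad(b)
    rx, ry = grad(r.mean(dim=1, keepdim=True))  # channel-mean reflectance (assumed)
    return (bx.abs() * torch.exp(-lam * rx.abs()) +
            by.abs() * torch.exp(-lam * ry.abs())).mean()

def decomposition_loss(r_low, b_low, s_low, r_norm, b_norm, s_norm,
                       lam_r=0.01, lam_l=0.1):
    l_s = (F.mse_loss(r_norm * b_norm, s_norm) +
           F.mse_loss(r_low * b_low, s_low))     # reconstruction, Eq. (3)
    l_r = F.mse_loss(r_low, r_norm)              # reflectance invariance, Eq. (1)
    l_l = smoothness(b_norm, r_norm) + smoothness(b_low, r_low)  # Eq. (2)
    return l_s + lam_r * l_r + lam_l * l_l       # Eq. (4)
```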

2.2.2. Light Map Enhancement Subnetwork Loss Function

The light map $B_{low}$ obtained from the decomposition subnetwork is dark, which is the root cause of the reduced visibility of the image. If only the reflectance map $R_{low}$ is retained and the light map $B_{low}$ is ignored, the recovered image does not conform to the visual characteristics of the human eye and looks unnatural. Therefore, the light component of the low-illumination image needs to be enhanced and recombined into the recovered image so that it looks more comfortable and natural to the human eye.
As mentioned before, the reflectance map $R$ is not affected by lighting conditions. Therefore, the enhanced light map $\hat{B}_{low}$ can be combined with the reflectance map $R_{normal}$ of the normal-illumination image and the reflectance map $R_{low}$ of the low-illumination image, respectively, and in both cases the normal-illumination image $S_{normal}$ should be recovered according to Equation (3). Meanwhile, the enhanced light map $\hat{B}_{low}$ should retain the basic properties of local consistency and structure awareness. The loss function of the light map enhancement subnetwork, $L_{BNet}$, is therefore calculated as follows:
$L_{BNet} = L_1 + \lambda_1 L_2 + \lambda_2 L_3$ (5)
$L_1 = \sum_{i = low, normal} \lVert R_i \cdot \hat{B}_{low} - S_{normal} \rVert_2$ (6)
$L_2 = \sum_{i = low, normal} \lVert \nabla \hat{B}_{low} \cdot e^{-\lambda \nabla R_i} \rVert_2$ (7)
$L_3 = \lVert \hat{B}_{low} - B_{normal} \rVert_2$ (8)
where $L_{BNet}$ denotes the total loss of the light map enhancement subnetwork, $L_1$ is the image reconstruction loss, $L_2$ is designed according to the local consistency and structure-aware properties of the light map, and $L_3$ makes the enhanced light map consistent with the light map of the normal-illumination image. Setting $\lambda_1 = 0.1$ and $\lambda_2 = 0.01$ achieves good performance.
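Continuing the earlier sketch (and reusing its smoothness helper), a hedged PyTorch rendering of $L_{BNet}$, with the same mean-squared-error stand-in for the $\lVert \cdot \rVert_2$ terms:

```python
import torch.nn.functional as F

def light_enhance_loss(b_hat, r_low, r_norm, s_norm, b_norm,
                       lam1=0.1, lam2=0.01):
    # b_hat: enhanced light map; smoothness() comes from the previous sketch.
    l1 = sum(F.mse_loss(r * b_hat, s_norm) for r in (r_low, r_norm))  # Eq. (6)
    l2 = sum(smoothness(b_hat, r) for r in (r_low, r_norm))           # Eq. (7)
    l3 = F.mse_loss(b_hat, b_norm)                                    # Eq. (8)
    return l1 + lam1 * l2 + lam2 * l3                                 # Eq. (5)
```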

2.2.3. Reflectance Map Refinement Subnetwork Loss Function

Several constraints are artificially imposed on the decomposition of the low-illumination image, one of which is the structure-aware smoothness of the light map. As a result, the estimated light map $B_{low}$ is smooth, and both detail and noise end up in the reflectance map $R_{low}$. Moreover, images taken in darker scenes tend to be affected by the surrounding light, which biases the overall color of the acquired image towards the ambient light and causes color distortion. To address these two problems, this paper uses a residual learning strategy when refining the reflectance map $R_{low}$, with the loss function designed as follows:
$P = R_{normal} - R_{low}$ (9)
$L_{RNet} = \lVert \hat{P} - P \rVert_2$ (10)
where $\hat{P}$ denotes the difference between $R_{normal}$ and $R_{low}$ learned by the subnetwork, and $P$ denotes their actual difference.
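Under the same assumptions as the earlier loss sketches, the residual refinement objective is compact; the recomposition of the refined reflectance map as $R_{low} + \hat{P}$ is our reading of the residual strategy rather than something the text states explicitly:

```python
import torch.nn.functional as F

def refinement_loss(p_hat, r_low, r_norm):
    # The subnetwork predicts the residual p_hat; the target is the true
    # difference P = R_normal - R_low from Equation (9).
    return F.mse_loss(p_hat, r_norm - r_low)

# Assumed recomposition of the refined reflectance map:
# r_refined = (r_low + p_hat).clamp(0, 1)
```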

2.3. Experiment Setup

To validate the proposed Retinex-CNN’s ability for low-illumination image enhancement, we conduct separate experiments on synthetic and actual low-illumination images, encompassing various scenes such as cities, traffic, forests, and farmlands. We collected normal-illumination images of various scenes from ImageNet as real data and then synthesized low-illumination images using the method proposed by Ying [13]. We also collected a large number of real low-illumination images of different scenes from surveillance devices and cell phones. Finally, six publicly available datasets are selected; they are DICM [14], Fusion [15], MEF [16], LIME [17], NPE [18], and VV [19].
During the training phase, the input is a low-illumination image, and the output is the corresponding normal-illumination image. Images are fed into the network in batches, with the input and output having the same width and height; the batch size is 3 × 3, and the number of batches is 64. The training and test sets for the synthetic scene contain 1485 and 100 image pairs, respectively, and those for the real scene contain 700 and 80 image pairs, respectively. The optimizer is gradient descent with momentum (momentum factor 0.9), with an initial learning rate of 0.001 that decays to 10% of its value every 20 epochs.
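The paper does not name a training framework; the following PyTorch sketch renders the stated optimizer settings (the model placeholder, epoch count, and loop body are assumptions of ours):

```python
import torch

model = DecompositionNet()  # from the earlier sketch; stands in for the full pipeline
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
# decay the learning rate to 10% of its current value every 20 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

for epoch in range(100):  # total epoch count is not stated in the paper
    # ... one training epoch over the paired low/normal images goes here ...
    scheduler.step()
```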
Many low-illumination image enhancement algorithms exist, but only some of their implementations are open source, making the rest difficult to reproduce. Therefore, in this paper, we validate, analyze, and compare against widely recognized algorithms: HE, the multiscale Retinex algorithm with color restoration (MSRCR), SRIE [20], the de-fogging algorithm based on dark channel prior theory (Dong) [21], LIME [17], and the learning-based algorithm GLADNet, which has emerged in recent years.

2.4. Image Quality Evaluation Methods

2.4.1. Evaluation Metrics for Composite Image Quality

When assessing the performance of the proposed method, both subjective human evaluation and objective metrics are needed. Since the synthetic dataset contains paired normal-illumination and low-illumination images, it falls into the category of full-reference image quality evaluation: the difference between each algorithm's enhanced image and the real normal-illumination image can be measured to objectively compare the algorithms.
We use the Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), SSIM [22], and LOE [23] to evaluate the performance of the different algorithms. PSNR indicates the degree of distortion of the enhanced image relative to the real normal-light image; the larger its value, the higher the quality of the enhanced image, i.e., the less distortion. Its unit is the decibel (dB). SSIM reflects the completeness of an image's structural information; the larger its value, the more similar the enhanced image's structure is to that of the real normal-light image. MSE measures the pixel-wise distance between two images; the smaller the value, the smaller the difference between the enhanced image and the real normal-light image. LOE reflects the naturalness preservation of the image; the smaller the value, the better the lightness order is preserved and the more natural the image looks.
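For the three full-reference metrics, a minimal scikit-image sketch (LOE has no standard library implementation and is omitted here; the function and variable names are ours):

```python
import numpy as np
from skimage.metrics import (mean_squared_error, peak_signal_noise_ratio,
                             structural_similarity)

def full_reference_scores(enhanced: np.ndarray, reference: np.ndarray):
    """Both inputs are uint8 RGB arrays of identical shape."""
    mse = mean_squared_error(reference, enhanced)
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
    ssim = structural_similarity(reference, enhanced, channel_axis=-1,
                                 data_range=255)
    return mse, psnr, ssim
```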

2.4.2. Actual Image Quality Evaluation Indicators

Unlike synthetic images, real low-illumination images have no corresponding normal-illumination images to serve as references, so SSIM, PSNR, and MSE cannot be used; this falls into the category of no-reference evaluation. Therefore, we use blind image quality evaluation metrics, including the Natural Image Quality Evaluator (NIQE) [24] and Visual Information Fidelity (VIF) [25]. NIQE is a trained image quality evaluation model that scores an image by computing the distance between the feature model parameters of the result image and those of the trained model; the lower the score, the higher the image quality and the better it aligns with the human eye's subjective evaluation criteria. VIF is an image quality index built on a natural image statistical model, an image distortion model, and a human visual system model; the higher the score, the better the image quality.
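No-reference scores of this kind can be computed with third-party IQA toolkits; the sketch below assumes the IQA-PyTorch package (pyiqa) and its 'niqe' metric, and we have not verified that its NIQE variant matches the one used in this paper:

```python
import torch
import pyiqa  # IQA-PyTorch; using this package is our assumption, not the paper's tooling

niqe = pyiqa.create_metric('niqe')  # no-reference; lower is better
img = torch.rand(1, 3, 256, 256)    # stand-in for an enhanced image in [0, 1]
print(float(niqe(img)))
```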

3. Experiment Results

3.1. Enhancement of Synthesized Low-Illumination Images

Figure 5 shows representative enhancement results on synthesized low-illumination images for five existing classical algorithms and the proposed method. The columns show, in order, the real normal-illumination image, the low-illumination image, and the enhancement results of HE, MSRCR, Dong, LIME, GLADNet, and Retinex-CNN.
As far as subjective evaluation is concerned, HE, MSRCR, Dong, LIME, GLADNet, and Retinex-CNN are all able to enhance the synthesized data, improving image quality and visibility.
Observing the images of these four scenes, the HE results suffer from serious color bias; for example, in the sky region of Figure 5a, the original image is blue while the enhanced image is grayish. As can also be seen in Figure 5a, the brightness of the HE result is over-saturated, particularly in regions that were already bright in the original image. The image obtained by MSRCR looks very unnatural and loses the global luminance hierarchy, with serious luminance and color distortion. The image obtained by Dong shows obvious black outlines at edges, such as around the buildings. The image processed by LIME is bright overall and shows varying degrees of color distortion. The image recovered by GLADNet has serious color distortion, such as in the sky, and some details, such as the white clouds, become invisible.
Retinex-CNN can avoid over-enhancement, color distortion, and detail loss to some extent when enhancing low-illumination images. As shown in Figure 5b, Retinex-CNN better enhances the brightness of the dark area under the bridge: a sanitation worker wearing an orange coat can be clearly seen, and the color of the vehicle is essentially consistent with the original image. In Figure 5c, the black contour effect that appears with the Dong and GLADNet algorithms is avoided. In addition, Retinex-CNN performs better on images with uniform illumination and slightly worse on non-uniformly illuminated images. This is because the training data are artificially synthesized with uniform overall illumination, which imposes some limitations on Retinex-CNN.

3.2. Enhancement of Real Low-Illumination Images

Experiments on real low-illumination images can verify the effectiveness and robustness of Retinex-CNN. The results of the analysis are presented in Figure 6.
As presented in Figure 6, the different methods enhance the brightness of real low-illumination images to varying degrees, but each has its own issues. The HE algorithm is prone to over-enhancement, color distortion, and loss of details. For example, the ground in Figure 6a, originally a brighter brown area, becomes visually dazzling after enhancement; due to oversaturation, its color is severely distorted and appears white. Similar phenomena are observed in the other four sets of images. The MSRCR algorithm suffers from color distortion, brightness distortion, loss of details, and artifacts. As shown in Figure 6a, the clouds in the sky are difficult to see, and in Figure 6d, the Earth appears white although it is brightly colored in the original image, with noticeable artifacts along the astronaut's body contour.
The enhancement effect of the SRIE algorithm is limited, and the brightness boost in darker areas is insufficient: details in the image cannot be seen in Figure 6a,c,d. The images restored by Dong's algorithm have strong edge effects that do not match visual perception, such as the thick black lines at the junction of the lotus leaves and the water surface in Figure 6a and the contours of the tourists' reflections in Figure 6b. LIME is prone to brightness distortion and often over-enhances areas that were already bright, severely losing details there, such as the face and arms of the male tourist wearing glasses in Figure 6b. The GLADNet algorithm shows color deviation and overexposure; in Figure 6a, the lantern and leaf areas appear light red.
Compared with the other algorithms, Retinex-CNN delivers a better enhancement effect. For example, in Figure 6a, the leaves, lanterns, blue sky, and white clouds keep the same colors as the original image; in Figure 6b, the skin color of the human body is preserved and the face of the traveler in the dark area is clearly visible; in Figure 6c, the folding bed in the corner and the objects on the bookshelf can be clearly seen, and the desktop, which was overexposed in the original image, looks more natural. Overall, Retinex-CNN does not brighten dark regions as strongly as LIME, but the enhanced image remains clearly visible and over-enhancement is avoided, while the original colors are maintained more effectively than with HE, MSRCR, and GLADNet.

3.3. Objective Evaluation of Image Quality

Table 1 lists the average values of the evaluation metrics for each algorithm on the synthetic dataset. Retinex-CNN's LOE score is second only to Dong's, and it ranks first by a large margin in the three objective metrics PSNR, MSE, and SSIM. Its comprehensive score is the best, indicating that the images recovered by Retinex-CNN are closer to the real normal-illumination images than those of the other algorithms, with less brightness distortion and better visual effect, which also verifies the performance of Retinex-CNN.
Table 2 and Table 3 show the NIQE and VIF evaluation scores of the different methods on six publicly available datasets. As demonstrated in Table 2, our method achieves the best NIQE score on four of the six datasets, ranking third on the LIME dataset and fourth on the NPE dataset. We extracted some representative images from the LIME and NPE datasets for analysis, as shown in Figure 7.
As presented in Figure 7, the overall brightness of the images enhanced by Retinex-CNN is more uniform and does not create strong contrast, which may explain its weaker NIQE ranking on these two datasets. Even so, Retinex-CNN has the lowest average NIQE score across the six publicly available datasets, and its enhanced images have higher quality than those of the other algorithms. As presented in Table 3, Retinex-CNN has the highest VIF score on DICM (0.86), Fusion (1.09), MEF (0.79), LIME (0.75), NPE (1.01), and VV (0.98), indicating that, compared with other methods, our enhanced images retain more information and better align with human visual perception.

4. Analysis and Discussion

4.1. Analysis of Time Consumption

In addition to subjectively and objectively evaluating the quality of the images recovered by the algorithms, it is also important to focus on the time overhead of these processes. Therefore, we analyzed the processing time of each algorithm for different resolution images, as shown in Figure 8.
The SRIE algorithm recovers images well, but its time overhead is large, which hinders real-time processing. The HE algorithm has the lowest time consumption, but its results are poor, often accompanied by color distortion, loss of details, and overexposure. Dong, GLADNet, and our proposed method have time consumption roughly proportional to the image size, although Dong is about twice as slow as the others. The LIME algorithm achieves a good balance of time and enhancement effect but over-enhances regions that were already normally illuminated. Compared with GLADNet, Retinex-CNN retains the original colors of the image well while meeting real-time requirements.

4.2. Strengths and Limitations

The proposed Retinex-CNN demonstrated the ability to avoid over-enhancement, color distortion, and detail loss in low-illumination images, delivering results closer to real normal-illumination scenes with minimal brightness distortion. In particular, it excels on uniformly illuminated images, achieving better visual quality and higher comprehensive scores than the other algorithms. The enhanced images show more uniform brightness, avoid excessive contrast, and retain richer information, aligning better with human visual aesthetics.
While effective for uniform illumination, Retinex-CNN struggles slightly with non-uniformly illuminated images, particularly in dark region enhancement. However, it still ensures clear visibility without over-enhancement, striking a practical balance between enhancement quality and computational efficiency.

5. Conclusions

Low-illumination images can seriously affect or even limit the performance of the human eye or computer vision systems. To design an algorithm that retains image details well, is robust to noise, and produces enhanced images that conform to human visual characteristics, we propose Retinex-CNN. On the synthetic dataset, Retinex-CNN leads by a large margin in the PSNR, MSE, and SSIM metrics. On the six public datasets, Retinex-CNN achieves the best VIF score in all cases and the best NIQE score in four of them. The experimental results indicate that Retinex-CNN effectively enhances image brightness and detail clarity while avoiding severe color distortion and halo effects, with a relatively low processing time.
The image enhancement method proposed in this study has been applied by traffic law enforcement departments, significantly improving their ability to determine responsibility for nighttime traffic accidents. After deployment in the Nanshan District Traffic Brigade of Shenzhen, Guangdong Province, China, the technology can accurately reconstruct the visual evidence chain of an accident scene and generate objective evidence with legal effect, helping the accident determination department shorten processing time by 51.2% (from 6.5 h to 3.2 h) and reducing the liability dispute rate to 5.1% (13.6 percentage points lower than with the traditional process).
In the future, more analyses of different scenarios and related parameters should be conducted to optimize the training stage and improve the performance of the model.

Author Contributions

Conceptualization, H.M.; methodology, H.M.; validation, H.M.; formal analysis, H.M. and Y.T.; investigation, H.M.; writing—original draft preparation, H.M.; writing—review and editing, H.M. and X.Z.; visualization, W.P.; funding acquisition, H.M. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the key program of Shenzhen Polytechnic University, "Complex Weather Imaging Perception Enhancement Technology for Intelligent Transportation" (No. 6021310003K).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study are publicly available and can be downloaded independently. The code is available from the corresponding author upon request; it is not publicly available due to privacy restrictions.

Acknowledgments

The authors thank Min Zhang from the School of Geosciences and Info-Physics, Central South University, for help in revising and polishing the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, C.; Qu, X.; Gnanasambandam, A.; Elgendy, O.A.; Ma, J.; Chan, S.H. Photon-limited object detection using non-local feature matching and knowledge distillation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 3976–3987. [Google Scholar]
  2. Liu, J.; Xu, D.; Yang, W.; Fan, M.; Huang, H. Benchmarking Low-Light Image Enhancement and Beyond. Int. J. Comput. Vis. 2021, 129, 1153–1184. [Google Scholar] [CrossRef]
  3. Xu, X.; Wang, S.; Wang, Z.; Zhang, X.; Hu, R. Exploring Image Enhancement for Salient Object Detection in Low Light Images. ACM Trans. Multimed. Comput. Commun. Appl. 2021, 17, 1–19. [Google Scholar] [CrossRef]
  4. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
  5. Ancuti, C.; Ancuti, C.O.; Haber, T.; Bekaert, P. Enhancing underwater images and videos by fusion. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 81–88. [Google Scholar]
  6. Hussain, K.; Rahman, S.; Khaled, S.M.; Abdullah-Al-Wadud, M.; Shoyaib, M. Dark image enhancement by locally transformed histogram. In Proceedings of the 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2014), Dhaka, Bangladesh, 18–20 December 2014; pp. 1–7. [Google Scholar]
  7. Zhai, Y.-S.; Liu, X.-M. An improved fog-degraded image enhancement algorithm. In Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2–4 November 2007; pp. 522–526. [Google Scholar]
  8. Park, S.; Yu, S.; Kim, M.; Park, K.; Paik, J. Dual Autoencoder Network for Retinex-Based Low-Light Image Enhancement. IEEE Access 2018, 6, 22084–22093. [Google Scholar] [CrossRef]
  9. Wang, W.; Wei, C.; Yang, W.; Liu, J. Gladnet: Low-light enhancement network with global awareness. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; pp. 751–755. [Google Scholar]
  10. Xu, W.; Lee, M.; Zhang, Y.; You, J.; Suk, S.; Choi, J.-Y. Deep residual convolutional network for natural image denoising and brightness enhancement. In Proceedings of the 2018 International Conference on Platform Technology and Service (PlatCon), Jeju, Republic of Korea, 29–31 January 2018; pp. 1–6. [Google Scholar]
  11. Shen, L.; Yue, Z.; Feng, F.; Chen, Q.; Liu, S.; Ma, J. Msr-net: Low-light image enhancement using deep convolutional network. arXiv 2017, arXiv:1711.02488. [Google Scholar]
  12. Lei, C.; Tian, Q. Low-light image enhancement algorithm based on deep learning and Retinex theory. Appl. Sci. 2023, 13, 10336. [Google Scholar] [CrossRef]
  13. Ying, Z.; Li, G.; Ren, Y.; Wang, R.; Wang, W. A new low-light image enhancement algorithm using camera response model. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy, 22–29 October 2017; pp. 3015–3022. [Google Scholar]
  14. Lee, C.; Lee, C.; Kim, C. Contrast enhancement based on layered difference representation. IEEE Trans. Image Process. 2013, 22, 5372–5384. [Google Scholar] [CrossRef] [PubMed]
  15. Fu, X.; Zeng, D.; Huang, Y.; Liao, Y.; Ding, X.; Paisley, J. A fusion-based enhancing method for weakly illuminated images. Signal Process. 2016, 129, 82–96. [Google Scholar] [CrossRef]
  16. Ma, K.; Kai, Z.; Zhou, W. Perceptual Quality Assessment for Multi-Exposure Image Fusion. IEEE Trans. Image Process. 2015, 24, 3345–3356. [Google Scholar] [CrossRef] [PubMed]
  17. Guo, X.; Li, Y.; Ling, H. LIME: Low-Light Image Enhancement via Illumination Map Estimation. IEEE Trans. Image Process. 2017, 26, 982–993. [Google Scholar] [CrossRef] [PubMed]
  18. Wang, S.; Zheng, J.; Hu, H.M.; Li, B. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Trans. Image Process. 2013, 22, 3538–3548. [Google Scholar] [CrossRef] [PubMed]
  19. Vonikakis, V.; Kouskouridas, R.; Gasteratos, A. On the evaluation of illumination compensation algorithms. Multimed. Tools Appl. 2017, 77, 9211–9231. [Google Scholar] [CrossRef]
  20. Fu, X.; Zeng, D.; Huang, Y.; Zhang, X.-P.; Ding, X. A weighted variational model for simultaneous reflectance and illumination estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2782–2790. [Google Scholar]
  21. Dong, X.; Pang, Y.; Wen, J. Fast efficient algorithm for enhancement of low lighting video. In ACM SIGGRAPH 2010 Posters; ACM: New York, NY, USA, 2010; p. 1. [Google Scholar]
  22. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
  23. Ying, Z.; Li, G.; Gao, W. A bio-inspired multi-exposure fusion framework for low-light image enhancement. arXiv 2017, arXiv:1711.00591. [Google Scholar]
  24. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a "Completely Blind" Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
  25. Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 2006, 15, 3440–3451. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The structure of the proposed Retinex-CNN.
Figure 2. The structure of the decomposition subnetwork. (×② means two times).
Figure 3. The structure of the reflectance map refinement subnetwork.
Figure 4. The structure of the light map enhancement subnetwork.
Figure 5. Enhancement of synthetic low-illumination images using different algorithms. (a) City building, (b) traffic scene, (c) forest, (d) cropland, and row (e) shows the zoomed-in details of the regions in red rectangles in row (d).
Figure 6. Enhancement of real low-illumination images using different algorithms. (a) Manor, (b) character, (c) indoor, (d) astronaut scene, and (e) traffic scene.
Figure 7. Representative results of comparison methods in (a,b) LIME and (c,d) NPE datasets.
Figure 8. Processing time of each algorithm for different resolution images.
Table 1. Objective evaluation indicators for synthetic data.

| Metric | HE | MSRCR | Dong | LIME | GLADNet | Retinex-CNN |
|---|---|---|---|---|---|---|
| MSE | 1048.1 | 1559.1 | 1500.3 | 1505.5 | 634.2 | **498.5** |
| PSNR | 19.5223 | 16.7178 | 17.1631 | 16.5124 | 20.9026 | **22.7643** |
| SSIM | 0.7824 | 0.7732 | 0.6202 | 0.6396 | 0.7637 | **0.8237** |
| LOE | 306.946 | 1316.013 | **205.019** | 1757.513 | 388.024 | 289.019 |

Larger PSNR and SSIM values are better; smaller MSE and LOE values are better. Bold marks the best result.
Table 2. NIQE evaluation scores of different algorithms on six public datasets.

| Dataset | Origin | HE | MSRCR | SRIE | Dong | LIME | GLADNet | Retinex-CNN |
|---|---|---|---|---|---|---|---|---|
| DICM | 3.8608 | 3.6795 | 3.3491 | 3.2894 | 4.0314 | 3.4989 | 3.1146 | **2.8015** |
| Fusion | 4.1425 | 3.6456 | 3.4853 | 3.5362 | 3.5967 | 3.7663 | 3.4281 | **3.3008** |
| MEF | 5.1884 | 4.3768 | 4.0614 | 4.1803 | 4.6952 | 4.4466 | 3.6899 | **3.4246** |
| LIME | 4.4134 | 4.5304 | 3.8752 | **3.8707** | 4.2005 | 4.3064 | 4.2634 | 4.0082 |
| NPE | 5.4441 | 4.5693 | **4.5644** | 4.8364 | 5.0251 | 4.6432 | 4.7601 | 4.7006 |
| VV | 3.5583 | 3.0058 | 2.7718 | 2.8324 | 2.8047 | 2.8556 | 2.7282 | **2.5117** |

Smaller NIQE values are better. Bold marks the best result.
Table 3. VIF evaluation scores of different algorithms on public datasets.

| Dataset | HE | MSRCR | SRIE | Dong | LIME | GLADNet | Retinex-CNN |
|---|---|---|---|---|---|---|---|
| DICM | 0.5728 | 0.3670 | 0.6028 | 0.4103 | 0.2620 | 0.6501 | **0.8604** |
| Fusion | 0.8121 | 0.4901 | 0.7165 | 0.5111 | 0.3715 | 0.8527 | **1.0906** |
| MEF | 0.3809 | 0.3069 | 0.5383 | 0.3727 | 0.2482 | 0.5662 | **0.7972** |
| LIME | 0.2822 | 0.2490 | 0.5255 | 0.3289 | 0.2095 | 0.4628 | **0.7574** |
| NPE | 0.4948 | 0.3860 | 0.6013 | 0.4016 | 0.2978 | 0.8103 | **1.0177** |
| VV | 0.8655 | 0.5490 | 0.6991 | 0.5581 | 0.4110 | 0.9328 | **0.9845** |

Larger VIF values are better. Bold marks the best result.
