License Plate Recognition Under the Dual Challenges of Sand and Light: Dataset Construction and Model Optimization

Wang, Zihao; Yang, Yining; Yang, Panxiong; Zhang, Xiaoge; Li, Jiaming; Sun, Yanling; Ma, Li; Cui, Dong

doi:10.3390/app15126444

by Zihao Wang ¹,†, Yining Yang ¹,†, Panxiong Yang ², Xiaoge Zhang ¹, Jiaming Li ¹, Yanling Sun ¹, Li Ma ¹,* and Dong Cui ²,*

¹ School of Optoelectronic Engineering, Xidian University, Xi’an 710071, China

² Xi’an Institute Electromechanical Information Technology, Xi’an 710065, China

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Appl. Sci. 2025, 15(12), 6444; https://doi.org/10.3390/app15126444

Submission received: 24 October 2024 / Revised: 14 January 2025 / Accepted: 23 January 2025 / Published: 7 June 2025

Abstract

License plate recognition in sandstorm conditions faces challenges such as image blurriness, reduced contrast, and partial information loss, which result in significant limitations in the feature extraction and recognition accuracy of existing methods. To address these challenges, this study proposes a license plate recognition method based on an improved AlexNetBN network. By introducing Batch Normalization (BN) layers, the model achieves greater training stability and generalization in complex environments. A dedicated dataset tailored for license plate recognition in sandstorm conditions was constructed, and data augmentation techniques were used to simulate real-world scenarios for model training and testing. Experimental results demonstrate that, compared to the traditional AlexNet model, AlexNetBN achieves higher recognition accuracy and robustness in environments with frequent sandstorms and significant variations in lighting intensity. This study not only effectively enhances license plate recognition performance under sandstorm conditions but also offers new insights and references for applying CNN-based methods in low-visibility scenarios.

Keywords:

license plate recognition; AlexNetBN network; robustness enhancement; feature extraction

1. Introduction

In recent years, target recognition under adverse weather conditions has gained widespread attention due to its practical applications in autonomous driving [1], intelligent surveillance, and traffic management [2]. Dust storms are a common and challenging example of such weather and are classified into several categories [3]. Except during strong and very strong dust storms, people continue their normal activities, including driving, so license plate recognition and traffic management must still operate in dusty weather and are strongly affected by it. Airborne dust particles cause severe image degradation, resulting in blurring, reduced contrast, and partial occlusion of targets. These effects make it difficult for traditional recognition methods to perform accurately. Developing a robust recognition system that can function effectively under such conditions is critical for ensuring public safety and improving traffic efficiency.

Currently, various target recognition algorithms have been developed for object detection and classification in standard environments. Traditional methods rely heavily on manual feature extraction techniques such as SIFT [4] and HOG [5]. While these methods perform adequately under normal conditions, their ability to describe features degrades significantly when faced with blurred or low-contrast images. Moreover, sliding window-based object localization is computationally expensive and struggles to meet real-time requirements.

With the advent of deep learning, convolutional neural networks (CNNs), which are characterized by local receptive fields [6] and weight sharing [7], have become a core technology for modern target recognition. CNN-based methods, especially the R-CNN family [8,9] (R-CNN, Fast R-CNN, Faster R-CNN, and Libra R-CNN), have made significant advances in detection efficiency and accuracy through the use of Region Proposal Networks (RPNs) [10]. However, these methods still face challenges in dust storm conditions. The R-CNN family has limited feature extraction capability for low-quality images (e.g., blurred and low-contrast images), which makes it difficult to handle objects with unclear contours. First, sandstorms cause image blurring, uneven lighting, and contrast reduction, which interfere with the features extracted by the convolutional layers and lead to an unstable training process with slow or even failed convergence. Second, in such complex environments, the traditional pooling operation of convolutional neural networks may not adequately preserve key features, which lowers recognition accuracy.

The YOLO (You Only Look Once) family of algorithms has made significant progress in real-time object detection by combining object localization and classification into a single-stage detection network, resulting in extremely high detection speeds. However, earlier versions of YOLO (e.g., YOLOv1 [11], YOLOv2 [12], YOLOv3 [13], YOLOv4 [14], and YOLOv7 [15]) still face challenges similar to those of the R-CNN family in sandstorm environments: image blurring and low contrast together degrade detection accuracy, especially when objects are partially occluded or some of their information is missing. Although these detectors are generally robust, they struggle to capture detailed features under such conditions. The more recent YOLOv8 [16] addresses some of these issues, but its multi-scale detection requires feature extraction and inference at multiple scales, which increases the computational cost and post-processing complexity and may slow inference, especially on high-resolution images.

DSSD [17] (Deconvolutional Single Shot Detector) improves the detection of multi-scale objects by introducing deconvolution layers, which enhance detection performance on low-resolution targets. However, DSSD still faces performance instability in adverse weather conditions, especially in sandstorm environments, where image degradation is significant. Its ability to describe blurred features remains limited, leading to suboptimal detection results.

Many reliable results already exist for license plate detection in harsh environments. For example, the techniques studied by Rio-Alvarez, A. et al. [18] were tested with a range of texture descriptors and classifiers and adapt well to machine learning-based algorithms, and Azam, S. et al. [19] proposed a new ALPD method that efficiently detects license plate regions in complex environments. This study therefore focuses on recognizing and classifying license plate characters after plate detection and segmentation are complete; the optimization of license plate detection under sandstorm conditions will be studied in detail in future work. Taking into account the advantages and disadvantages of the networks described above, we adopt the AlexNetBN network to address the performance bottlenecks of existing methods in sandstorm weather. AlexNetBN, an enhanced version of the classic AlexNet [20], introduces Batch Normalization (BN) layers to improve model training stability and generalization [21], particularly when handling degraded images. Compared to other methods, AlexNetBN excels at feature extraction in challenging conditions, such as blurred and low-contrast images. By optimizing the network structure to focus on key feature regions that are less affected by blur and contrast reduction, our approach improves target recognition accuracy and robustness in sandstorm environments.

The main contributions of this paper are as follows: (1) We propose a robust target recognition algorithm tailored for sandstorm conditions, optimizing detection accuracy and resilience through the AlexNetBN network. (2) We develop and release a custom dataset of license plates affected by sandstorms to validate the effectiveness of our method, while also comparing its performance with existing advanced algorithms. In the following sections, we will first provide a detailed discussion of the AlexNetBN architecture, explaining why it is suitable for target recognition in sandstorm weather. Next, we will describe the construction of our custom dataset and the pre-processing techniques used to simulate the real-world conditions of sandstorms. Finally, we will present our experimental results, comparing the performance of our method with alternative approaches in terms of accuracy, efficiency, and robustness.

This work not only offers a new solution for target recognition under adverse weather conditions but also provides valuable insights into the application of CNN-based algorithms in scenarios where visibility is significantly compromised.

2. Materials and Methods

2.1. Introduction to the Model Structure

To address the problems that the sandstorm environment disturbs feature extraction in the convolutional layers and that the traditional pooling operation fails to retain key features, we improved the AlexNet model by introducing a BN layer after each convolutional layer. The BN layer reduces perturbations of the input features, so that the features passed on to the next convolutional layer are already normalized [21]. This mitigates the impact of uneven illumination distribution, accelerates model convergence, and improves training stability. Additionally, we utilized max pooling operations to further enhance the model’s resistance to interference, as max pooling retains the most prominent features when noise levels are high [22], thereby ensuring the effective extraction of key information from the license plates.

Furthermore, considering the sensitivity of fully connected layers to noise, lighting, and rotation in complex environments, we retained only one fully connected layer in the model to reduce the number of parameters and avoid overfitting [23]. The license plate images are processed through five convolutional layers, each followed by a BN layer, and then further refined through pooling layers. The resulting features are integrated by the fully connected layer, and character recognition is finally achieved via the SoftMax layer [24]. The model structure is shown in Figure 1.

This improved AlexNetBN architecture effectively handles license plate recognition tasks under extreme conditions such as sandstorms, enhancing the model’s robustness and training efficiency.
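The layer ordering described above can be traced with a short shape calculation. This is only a minimal sketch: the text does not list kernel sizes, strides, channel counts, pooling positions, or the input resolution, so every number below is an assumption borrowed from the classic AlexNet configuration, and `Layer`, `ALEXNET_BN`, `trace_shapes`, and `fc_input_features` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    kind: str              # "conv_bn_relu" (convolution + BN + ReLU) or "maxpool"
    kernel: int
    stride: int
    padding: int = 0
    out_channels: int = 0  # only meaningful for convolutional layers


def out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: hyperparameters taken from classic AlexNet, not from the paper.
# Five convolutional blocks, each followed by BN, with max pooling in between.
ALEXNET_BN = [
    Layer("conv_bn_relu", kernel=11, stride=4, padding=0, out_channels=96),
    Layer("maxpool", kernel=3, stride=2),
    Layer("conv_bn_relu", kernel=5, stride=1, padding=2, out_channels=256),
    Layer("maxpool", kernel=3, stride=2),
    Layer("conv_bn_relu", kernel=3, stride=1, padding=1, out_channels=384),
    Layer("conv_bn_relu", kernel=3, stride=1, padding=1, out_channels=384),
    Layer("conv_bn_relu", kernel=3, stride=1, padding=1, out_channels=256),
    Layer("maxpool", kernel=3, stride=2),
]


def trace_shapes(layers, size=227, channels=3):
    """Return the (channels, height, width) shape after every layer."""
    shapes = [(channels, size, size)]
    for layer in layers:
        size = out_size(size, layer.kernel, layer.stride, layer.padding)
        if layer.kind == "conv_bn_relu":
            channels = layer.out_channels
        shapes.append((channels, size, size))
    return shapes


def fc_input_features(layers, size=227, channels=3):
    """Number of features that the single fully connected layer receives."""
    c, h, w = trace_shapes(layers, size, channels)[-1]
    return c * h * w


if __name__ == "__main__":
    for shape in trace_shapes(ALEXNET_BN):
        print(shape)
    print("features into the fully connected layer:", fc_input_features(ALEXNET_BN))
```

Under these assumed settings, the single fully connected layer would receive 256 × 6 × 6 = 9216 features. A different input size or pooling layout would change that number.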

2.2. Analysis of Optimization Principles

In the process of license plate recognition, the input image first undergoes convolutional layer processing [25], where convolutional kernels extract feature information from the image. The feature information processed by the ReLU activation function is then fed into the next convolutional layer for similar operations. The feature maps generated by the convolutional operations are input into the pooling layer for further feature extraction. This study uses Max Pooling to enhance the ability of convolutional kernels to extract image information [26]. Finally, the features are integrated through a fully connected layer [27], and the final classification result is obtained through the classifier.

The heat maps generated during the convolutional operations (Figure 2b) show that, in complex environments, the convolutional layers have difficulty extracting the key features of license plates. To introduce nonlinearity into the convolutional neural network, this study employs the ReLU activation function [28], which is defined by the following formula:

$$\mathrm{ReLU}(w) = \begin{cases} w, & w \geq 0 \\ 0, & w < 0 \end{cases} \tag{1}$$

where $w$ represents the input value of the activation function, which in this study is the output of the preceding convolutional layer.

However, relying solely on the ReLU activation function is insufficient to effectively address the challenges faced by the convolutional layers when processing license plates in complex environments. To overcome these issues, BN is introduced into the traditional AlexNet model. The BN layer normalizes each feature channel and introduces two learnable parameters to enhance the model’s expressive capability.

During normalization, the BN layer computes the mean and variance of the input features within each mini-batch [21]. Let $\mu_{\mathrm{batch}}$ represent the mean, which is calculated as follows:

$$\mu_{\mathrm{batch}} = \frac{1}{m} \sum_{i=1}^{m} X_{i} \tag{2}$$

where $m$ is the number of samples in the mini-batch, and $X_{i}$ represents the input feature of the $i$-th sample. Let $\sigma^{2}_{\mathrm{batch}}$ represent the variance, which is calculated as follows:

$$\sigma^{2}_{\mathrm{batch}} = \frac{1}{m} \sum_{i=1}^{m} \left(X_{i} - \mu_{\mathrm{batch}}\right)^{2} \tag{3}$$

Each input feature is normalized according to the following formula [21]:

$$\hat{X}_{i} = \frac{X_{i} - \mu_{\mathrm{batch}}}{\sqrt{\sigma^{2}_{\mathrm{batch}} + \varepsilon}} \tag{4}$$

where $\hat{X}_{i}$ is the normalized input, and $\varepsilon$ is a small constant added to prevent division by zero.

The BN layer normalizes the input of each layer to reduce the impact of internal covariate shift. This mechanism keeps the mean and variance relatively stable during training, effectively reducing disturbances in the input features [29]. This stability allows the network to better handle conditions like sandstorm blurring and strong light reflections, resulting in more uniform illumination of license plate images, effectively resisting noise interference, and facilitating subsequent feature extraction.
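One way to see why normalization helps with uneven illumination is the following illustrative derivation, which is not taken from the paper. It rests on a simplifying assumption: a lighting change acts on every feature in the mini-batch as the same affine map $X'_{i} = a X_{i} + c$ with gain $a > 0$ and offset $c$. Real sandstorm degradation is not purely affine, so this only indicates the direction of the effect.

```latex
\mu'_{\mathrm{batch}} = \frac{1}{m}\sum_{i=1}^{m}\left(a X_{i} + c\right) = a\,\mu_{\mathrm{batch}} + c,
\qquad
\sigma'^{2}_{\mathrm{batch}} = \frac{1}{m}\sum_{i=1}^{m}\left(a X_{i} + c - a\,\mu_{\mathrm{batch}} - c\right)^{2} = a^{2}\,\sigma^{2}_{\mathrm{batch}}

\hat{X}'_{i}
= \frac{a X_{i} + c - \left(a\,\mu_{\mathrm{batch}} + c\right)}{\sqrt{a^{2}\sigma^{2}_{\mathrm{batch}} + \varepsilon}}
= \frac{a\left(X_{i} - \mu_{\mathrm{batch}}\right)}{\sqrt{a^{2}\sigma^{2}_{\mathrm{batch}} + \varepsilon}}
\approx \frac{X_{i} - \mu_{\mathrm{batch}}}{\sqrt{\sigma^{2}_{\mathrm{batch}}}}
\approx \hat{X}_{i}
```

The approximations hold when $\varepsilon$ is negligible compared with both $\sigma^{2}_{\mathrm{batch}}$ and $a^{2}\sigma^{2}_{\mathrm{batch}}$. Under that assumption, the offset $c$ cancels exactly and the gain $a$ cancels up to the effect of $\varepsilon$, so a uniformly brighter or dimmer batch yields nearly the same normalized features.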

In addition, the BN layer introduces two learnable parameters, $\gamma$ and $\beta$, which are applied as follows:

$$y_{i} = \gamma \hat{X}_{i} + \beta \tag{5}$$

where the scaling parameter $\gamma$ adjusts the scale of the features, allowing the normalized features to be adapted to the range suitable for the current task. The shift parameter $\beta$ translates the normalized features, further adjusting their distribution. The introduction of these parameters enables the network to flexibly recover the original feature distribution during training, reducing sensitivity to the initial values of the network weights [21].
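Equations (2) to (5) can be followed step by step in a few lines of code. The sketch below is a simplified illustration, not the training implementation used in the study: it treats one feature channel as a flat list of values from a single mini-batch, and the function name `batch_norm` and the default values of `gamma`, `beta`, and `eps` are assumptions.

```python
import math


def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch-normalize one feature channel over a mini-batch.

    xs    : list of feature values X_i, one per sample
    gamma : scaling parameter (learned in practice, fixed here)
    beta  : shift parameter (learned in practice, fixed here)
    eps   : small constant that prevents division by zero
    """
    m = len(xs)
    mu = sum(xs) / m                                        # Eq. (2): mini-batch mean
    var = sum((x - mu) ** 2 for x in xs) / m                # Eq. (3): mini-batch variance
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]   # Eq. (4): normalization
    return [gamma * v + beta for v in x_hat]                # Eq. (5): scale and shift


if __name__ == "__main__":
    features = [10.0, 20.0, 30.0, 40.0]  # illustrative values, not data from the paper
    print(batch_norm(features))
    print(batch_norm(features, gamma=2.0, beta=0.5))
```

With `gamma=1.0` and `beta=0.0`, the output has a mean of approximately zero and a variance of approximately one. Other values of `gamma` and `beta` rescale and shift that normalized distribution, which is how the network can recover a different feature range when training favors it.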

Figure 2 shows the attention levels of different networks to various regions of the input image [30,31]. Figure 2a represents the input image, while Figure 2b illustrates that the network without the BN layer excessively focuses on the image’s edge information. Figure 2c visually reveals that the network with the BN layer focuses more on the area containing the license plate number. A comparison of Figure 2a–c shows that the addition of the BN layer enables the network to concentrate more on extracting the core information of the license plate while paying less attention to distracting edge information and background noise. In complex environments, it can effectively filter out some unnecessary information in the image, such as lighting variations and cluttered backgrounds, ensuring that the attention remains focused on the license plate number. Thus, adding a BN layer helps to address the limitations of the convolutional layers, significantly improving their feature extraction efficiency, which is crucial for license plate recognition in complex environments. We attempted to integrate the BN layer into the convolutional layer to further enhance the model’s convergence speed and recognition performance, addressing the challenges of license plate recognition in environments with frequent sandstorms and dramatic changes in light intensity.

3. Results

3.1. Construction of the Dataset

Choosing a suitable dataset is especially critical when training a license plate recognition model for special environments such as semi-arid regions with frequent sand and dust and intense light [32]. Datasets that target specific environmental conditions can significantly improve model performance in those scenarios, especially in regions with drastic changes in light and frequent sand and dust weather. However, existing mainstream datasets, both domestic and international, do not cover these conditions well enough to support this study. Table 1 presents a comparative analysis of several public datasets; most of them do not contain a large number of license plate images from sandy and dusty environments with strong lighting. Constructing a more targeted dataset that fills this gap is therefore a top priority for this study.

The following data augmentation strategies were used in this study:

Noise addition: By introducing additional ‘granular’ interference into the image, we simulate the image quality problems that may occur in the real environment. This strategy effectively trains the robustness of the model in dealing with multiple noise interferences, so that it can maintain stable recognition performance in the face of complex real-world scenarios, providing the model with ‘real-world experience’ in dealing with various uncertain environments.
Light intensity variations: By simulating different lighting conditions, such as strong sunlight and deep shadows, the model is able to learn how to accurately recognize license plates in diverse lighting environments, thus effectively reducing recognition errors due to lighting variations.
Rotation and tilting: By rotating and tilting the image to simulate the presentation of the license plate under different viewing angles and perspectives, the model is enhanced to adapt to changes in the position of the license plate, thus improving its recognition accuracy in real application scenarios. This enhancement allows the model to better cope with the challenges posed by changes in angles and optimizes its performance in complex scenes.
Masking: By adding a large amount of noise or increasing the local brightness in the image, some or all characters of the license plate are masked to simulate the state of the license plate in complex environments such as sand and dust coverage or direct sunlight, which further improves the robustness and adaptive ability of the model.

With these data augmentation strategies, we constructed a more targeted dataset, CSCL (license plates in China under heavy sand and dust and strong variations in light). Some of the license plate images are shown in Figure 3. This dataset more accurately reflects the characteristics of license plates in areas with frequent sand and dust and strong light, and significantly improves the performance of the license plate recognition model in semi-arid areas. The comparison of this dataset with existing datasets is shown in Table 1.

By adding Gaussian noise, salt-and-pepper noise, random noise, and adjusting the image colors, the images are made to appear as if they have more ‘dust,’ with a more yellowish hue, thus more realistically simulating a dust-heavy environment. The result is shown in Figure 3a. Randomly adjusting image brightness simulates conditions of strong lighting, as shown in Figure 3b. Rotation techniques are employed for data augmentation to enhance the robustness of model training and prevent issues such as overfitting. The results are shown in Figure 3c. By covering the license plate surface with a significant amount of ‘dust’ and greatly increasing local brightness, real-world scenarios are simulated where license plates are covered by dust or directly exposed to sunlight, as shown in Figure 3d.
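The noise, tint, brightness, and masking steps described above can be sketched without external dependencies. This is an illustrative approximation, not the pipeline used to build CSCL: images are plain nested lists of RGB tuples, the sand color, noise levels, and blend strengths are made-up parameters, all function names are hypothetical, and rotation is omitted to keep the example dependency-free.

```python
import random


def clamp(value):
    """Round to an integer and keep it inside the valid 8-bit range."""
    return max(0, min(255, int(round(value))))


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every channel of every pixel."""
    return [[tuple(clamp(c + rng.gauss(0, sigma)) for c in px) for px in row]
            for row in img]


def add_salt_pepper(img, amount, rng):
    """Turn roughly `amount` of the pixels black or white, half each."""
    out = []
    for row in img:
        new_row = []
        for px in row:
            r = rng.random()
            if r < amount / 2:
                new_row.append((0, 0, 0))        # pepper
            elif r < amount:
                new_row.append((255, 255, 255))  # salt
            else:
                new_row.append(px)
        out.append(new_row)
    return out


def yellow_tint(img, strength, sand=(210, 180, 110)):
    """Blend each pixel toward an assumed sandy color (0 = none, 1 = full)."""
    return [[tuple(clamp((1 - strength) * c + strength * s)
                   for c, s in zip(px, sand)) for px in row] for row in img]


def adjust_brightness(img, factor):
    """Scale pixel intensities; factor > 1 simulates strong light."""
    return [[tuple(clamp(c * factor) for c in px) for px in row] for row in img]


def mask_region(img, top, left, height, width, color=(255, 255, 255)):
    """Cover a rectangle to imitate dust patches or glare over characters."""
    out = [list(row) for row in img]
    for y in range(top, min(top + height, len(out))):
        for x in range(left, min(left + width, len(out[y]))):
            out[y][x] = color
    return out


def simulate_dust(img, rng):
    """Chain noise and tint, loosely following the 'dusty' look of Figure 3a."""
    img = add_gaussian_noise(img, sigma=12, rng=rng)
    img = add_salt_pepper(img, amount=0.05, rng=rng)
    return yellow_tint(img, strength=0.35)


if __name__ == "__main__":
    rng = random.Random(0)                             # fixed seed for repeatability
    plate = [[(0, 0, 200)] * 6 for _ in range(4)]      # tiny blue placeholder image
    dusty = simulate_dust(plate, rng)
    glare = mask_region(adjust_brightness(plate, 1.8), 0, 0, 2, 3)
    print(dusty[0][0], glare[0][0])
```

Passing an explicit `random.Random` instance keeps the augmentation repeatable, which makes it easier to regenerate the same augmented samples when comparing models.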

3.2. Experimental Results and Analyses

The evaluation metrics for the license plate recognition network model in this study were accuracy (acc), PFC (Probability of Fully Identifying the Correct License Plate Number), and Loss (the value of the loss function). Acc is the ratio of the number of characters correctly recognized by the model to the total number of characters. Comparing accuracy across models or datasets identifies the best model, the trend of accuracy during training indicates whether the model has converged or overfitted, and accuracy on the test set measures the model’s generalization ability. PFC is the proportion of license plates on which the model recognizes every character correctly. It is more stringent than accuracy, and usually lower, because a single wrong character makes the whole plate count as an error. Loss measures the difference between the model’s prediction and the true value; its trend during training shows whether the model is learning effectively, and comparing its values on the training and validation sets helps detect overfitting. Here, A represents the model trained using the augmented dataset, N represents the model with an added BN layer, O represents the model trained using the unaugmented dataset, val indicates the model’s performance on the test set, and train indicates the model’s performance on the training set.
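The difference between acc and PFC is easiest to see on a small example. The sketch below assumes predictions and ground truth are available as plain strings, one per plate; the function names `char_accuracy` and `pfc` and the sample plate strings are hypothetical and not taken from the study.

```python
def char_accuracy(preds, truths):
    """acc: correctly recognized characters divided by all characters.

    Characters are compared position by position; characters missing from
    a prediction that is too short count as errors.
    """
    correct = 0
    total = 0
    for pred, truth in zip(preds, truths):
        total += len(truth)
        correct += sum(p == t for p, t in zip(pred, truth))
    return correct / total


def pfc(preds, truths):
    """PFC: fraction of plates on which every character is correct."""
    return sum(pred == truth for pred, truth in zip(preds, truths)) / len(truths)


if __name__ == "__main__":
    # Made-up plate strings for illustration only.
    truths = ["A12345", "B67890", "C24680"]
    preds = ["A12345", "B67890", "C24688"]   # one wrong character in the last plate
    print(f"acc = {char_accuracy(preds, truths):.3f}")   # 17 of 18 characters
    print(f"PFC = {pfc(preds, truths):.3f}")             # 2 of 3 plates
```

A single wrong character lowers acc only slightly but removes a whole plate from PFC. As a rough guide, and under the simplifying assumption that character errors are independent, a plate with 7 characters and a per-character accuracy of p would be fully correct with probability p^7; for p = 0.92 this gives roughly 0.56, which illustrates why PFC sits well below acc.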

In this study, we trained both the AlexNetBN model and the AlexNet model using the original dataset and the augmented dataset, respectively. After completing the training, we conducted an in-depth analysis of the performance metrics of the different models on these two datasets, aiming to validate the effectiveness of the optimization measures proposed in this study and to explore their impact on model performance.

Figure 4 shows the performance of different models on the test set. From Figure 4, it can be observed that during the fifth and sixth epochs, the recognition accuracy of each model reached its peak. Subsequently, as the number of training epochs increased, the improvement in recognition accuracy gradually diminished and tended to converge. Furthermore, when the number of training epochs exceeded six, some models began to exhibit overfitting. The second part of this section will focus on analyzing this phenomenon. Therefore, this study will focus on the recognition performance of each model around the sixth epoch, as well as the overfitting that occurs when the number of training epochs increases, to gain a deeper understanding of the performance characteristics of the models and their potential influencing factors.

3.2.1. Comparative Experiments on Datasets

In this section, we analyze the performance metrics of the AlexNetBN model after training with different datasets to validate the effectiveness and importance of data augmentation techniques in model training.

Figure 5a shows that augmenting the dataset greatly improved model performance. The accuracy of the model trained on the augmented dataset changed little after the fifth epoch and converged above 90%, whereas the model trained on the original dataset began to converge only after the ninth epoch and reached only about 80%. Figure 5b shows that, with the BN layer in place, the PFC of the model trained on the augmented dataset was significantly higher than that of the model trained on the original dataset. Because PFC is very sensitive to recognition errors, this suggests that the model trained on the augmented dataset misrecognized license plate numbers less often. Taken together, Figure 5a,b show that, under the same model architecture, the model trained with the augmented dataset outperformed the model trained with the original dataset in terms of recognition accuracy, convergence speed, and PFC. This indicates that the diversity and relevance of the dataset are crucial to the model’s performance. Data augmentation enhances the effectiveness of training data by introducing diversified sample transformations, allowing the model to better learn feature diversity and robustness. This not only improves the generalization ability of the model but also achieves better performance within a shorter training period [38]. These results demonstrate the effectiveness of data augmentation techniques.

3.2.2. Ablation Experiment

In this section, we analyze the performance metrics of different models after training with different datasets to illustrate the effectiveness and importance of the BN layer.

The accuracy curves in Figure 6a show that the model with the BN layer converged faster, at around the fourth epoch, and that its recognition accuracy was significantly higher than that of the model without the BN layer. Figure 6b shows that at the sixth epoch, the model with the BN layer had a character recognition accuracy of more than 90% and a PFC of about 56%, far above the roughly 80% accuracy and 20% PFC of the model without the BN layer. These results show that the BN layer makes deep network training more efficient by reducing gradient vanishing and gradient explosion [39], which improves the overall performance of the model and greatly improves its recognition accuracy in areas with heavy sand and dust and large illumination variations.

Observing Figure 7, we can see that while the difference between the predicted and true values of the different models in the training set gradually decreased with the increase in training epochs, their performance on the test set varied significantly. Without the BN layer, the error on the test set initially decreased with the increase in training epochs but then began to rise. This indicates that, in the later training sessions, the model’s predictions of the results on the test set became more different from the true results, and the model overlearned the image features in the training set. Although the model’s recognition accuracy remained almost unchanged, the increase in the loss function value revealed the presence of overfitting. In contrast, the model with the BN layer did not exhibit overfitting. This phenomenon indicates that the BN layer, by normalizing the input of each layer, reduces the impact of internal covariate shift and introduces a certain level of noise [40], thereby effectively serving a regularization function [41]. This regularization helps to mitigate the risk of overfitting on the training set and allows the model to maintain better recognition performance when encountering unseen data.
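The overfitting pattern described above, where training loss keeps falling while test loss turns upward, can be detected automatically from the two loss curves. The following is a minimal sketch of that check; the function names, the `patience` parameter, and the loss values are illustrative assumptions and are not read from Figure 7.

```python
def best_epoch(val_losses):
    """1-based epoch with the lowest validation (test) loss."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1


def overfitting_onset(train_losses, val_losses, patience=2):
    """Return the 1-based epoch after which overfitting begins, or None.

    Overfitting is flagged when validation loss rises for `patience`
    consecutive epochs while training loss keeps decreasing.
    """
    rises = 0
    for e in range(1, len(val_losses)):
        val_up = val_losses[e] > val_losses[e - 1]
        train_down = train_losses[e] < train_losses[e - 1]
        rises = rises + 1 if (val_up and train_down) else 0
        if rises >= patience:
            # e is the 0-based index of the latest rise; the last epoch
            # before the rises started is e - patience (0-based).
            return e - patience + 1
    return None


if __name__ == "__main__":
    # Made-up curves: training loss falls steadily, validation loss turns up.
    train = [2.0, 1.2, 0.8, 0.5, 0.35, 0.25, 0.18, 0.12]
    val = [2.1, 1.4, 1.0, 0.8, 0.70, 0.65, 0.72, 0.80]
    print("best epoch:", best_epoch(val))
    print("overfitting starts after epoch:", overfitting_onset(train, val))
```

Requiring several consecutive rises avoids flagging a single noisy epoch. A check like this could also serve as an early-stopping criterion, although the study does not state that early stopping was used.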

3.2.3. Comparison Experiment Before and After Optimization

In complex environments with frequent sandstorms and dramatic fluctuations in lighting conditions, the model is highly susceptible to interference during critical processing steps such as feature extraction. The trend of the two accuracy curves in Figure 8 clearly demonstrates that the addition of the BN layer and the application of data augmentation techniques effectively improve the stability of license plate recognition performance under such challenging conditions. Specifically, the overall optimized model after convergence had a recognition accuracy of more than 91%, which is much higher than the original model’s 80% recognition accuracy. This result highlights the importance of combining BN and data augmentation techniques during model training, further confirming the significant impact of these two strategies in enhancing model performance and generalization ability.

4. Discussion

4.1. Conclusions

In this paper, we addressed the blurring, reduced contrast, and partial information loss that sandstorms cause in license plate images by creating a dedicated dataset, CSCL, and using this augmented dataset to train the AlexNetBN network. The effectiveness of the method is supported both by the underlying mathematical principles and by model training and testing. We compared the optimized AlexNetBN model with the unoptimized AlexNet model: the former achieved a recognition accuracy of more than 91% on the test set, while the latter reached only about 80%. This shows that our method handles license plate recognition in areas with frequent sandstorms and large variations in light intensity more effectively.

4.2. Future Work

However, some aspects of this work still need to be addressed in the future. Firstly, the proposed license plate recognition algorithm is mainly designed for Chinese license plates with a blue background and white text. Given the diverse styles of license plates worldwide, the scope of this study is relatively limited. Therefore, future research should broaden its scope to include license plate recognition across different countries and regions. Secondly, our optimization model was not compared with other current state-of-the-art models (e.g., modern YOLO algorithms) for performance in harsh environments, so this aspect will be addressed in our future work. Finally, our experiments did not include the study and optimization of license plate detection techniques under sandstorm conditions, so we will study license plate detection techniques under sandstorm conditions in detail in our future work.

Author Contributions

Conceptualization, Z.W. and Y.Y.; formal analysis, Y.Y.; funding acquisition, Z.W.; investigation, Z.W., P.Y. and Y.S.; software, Z.W., J.L. and L.M.; validation, X.Z., D.C. and P.Y.; writing—original draft, Z.W. and Y.Y.; writing—review and editing, D.C., L.M. and Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This project was supported by the National Natural Science Foundation of China (Grant No. 62405229) and by the Fundamental Research Funds for the Central Universities (No. XJSJ25007).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

We extend our sincere gratitude to the School of Innovation and Entrepreneurship of Xidian University for its support and assistance with this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Lu, J.; Chen, Z. Research on multi-sensor fusion target detection algorithm in autonomous driving scenarios. Netw. Secur. Technol. Appl. 2023, 2023, 38–41.
2. Fu, G. Research and Application of Dynamic Prediction Model for Urban Intelligent Transportation. Ph.D. Thesis, South China University of Technology, Guangzhou, China, 2014.
3. Wang, W.; Fang, Z. A review of sandstorm weather and its research progress. J. Appl. Meteorol. 2004, 15, 366–381.
4. Zheng, L.; Yang, Y.; Tian, Q. SIFT meets CNN: A decade survey of instance retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 1224–1244.
5. Wang, X.; Han, T.X.; Yan, S. An HOG-LBP Human Detector with Partial Occlusion Handling. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 32–39.
6. Zhang, Y.; Wang, H.; Yang, G.; Zhang, J.; Gong, C.; Wang, Y. CSNet: A ConvNeXt-based Siamese network for RGB-D salient object detection. Vis. Comput. 2024, 40, 1805–1823.
7. Liu, R.; Meng, G.; Yang, B.; Sun, C.; Chen, X. Dislocated time series convolutional neural architecture: An intelligent fault diagnosis approach for electric machine. IEEE Trans. Ind. Inform. 2016, 13, 1310–1320.
8. Hmidani, O.; Alaoui, E.M.I. A comprehensive survey of the R-CNN family for object detection. In Proceedings of the 2022 5th International Conference on Advanced Communication Technologies and Networking (CommNet), Marrakech, Morocco, 12–14 December 2022; pp. 1–6.
9. Pang, J.; Chen, K.; Shi, J.; Feng, H.; Ouyang, W.; Lin, D. Libra R-CNN: Towards Balanced Learning for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 821–830.
10. Rafique, M.A.; Pedrycz, W.; Jeon, M. Vehicle license plate detection using region-based convolutional neural networks. Soft Comput. 2018, 22, 6429–6440.
11. Redmon, J. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
12. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
13. Redmon, J. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
14. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
15. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
16. Wang, G.; Chen, Y.; An, P.; Hong, H.; Hu, J.; Huang, T. UAV-YOLOv8: A small-object-detection model based on improved YOLOv8 for UAV aerial photography scenarios. Sensors 2023, 23, 7190.
17. Fu, C.Y.; Liu, W.; Ranga, A.; Tyagi, A.; Berg, A.C. DSSD: Deconvolutional single shot detector. arXiv 2017, arXiv:1701.06659.
18. Rio-Alvarez, A.; de Andres-Suarez, J.; González-Rodríguez, M.; Fernandez-Lanvin, D.; López Pérez, B. Effects of Challenging Weather and Illumination on Learning-Based License Plate Detection in Noncontrolled Environments. Sci. Program. 2019, 2019, 6897345.
19. Azam, S.; Islam, M.M. Automatic license plate detection in hazardous condition. J. Vis. Commun. Image Represent. 2016, 36, 172–186.
20. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
21. Ioffe, S. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167.
22. Zafar, A.; Aamir, M.; Mohd Nawi, N.; Arshad, A.; Riaz, S.; Alruban, A.; Dutta, A.K.; Almotairi, S. A comparison of pooling methods for convolutional neural networks. Appl. Sci. 2022, 12, 8643.
23. Basha, S.S.; Dubey, S.R.; Pulabaigari, V.; Mukherjee, S. Impact of fully connected layers on performance of convolutional neural networks for image classification. Neurocomputing 2020, 378, 112–119.
24. Liu, W.; Wen, Y.; Yu, Z.; Yang, M. Large-margin softmax loss for convolutional neural networks. arXiv 2016, arXiv:1612.02295.
25. Günther, J.; Pilarski, P.M.; Helfrich, G.; Shen, H.; Diepold, K. First steps towards an intelligent laser welding architecture using deep neural networks and reinforcement learning. Procedia Technol. 2014, 15, 474–483.
26. Scherer, D.; Müller, A.; Behnke, S. Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition. In International Conference on Artificial Neural Networks; Springer: Berlin/Heidelberg, Germany, 2010; pp. 92–101.
27. Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A survey of the recent architectures of deep convolutional neural networks. Artif. Intell. Rev. 2020, 53, 5455–5516.
28. Nwankpa, C.; Ijomah, W.; Gachagan, A.; Marshall, S. Activation functions: Comparison of trends in practice and research for deep learning. arXiv 2018, arXiv:1811.03378.
29. Garbin, C.; Zhu, X.; Marques, O. Dropout vs. batch normalization: An empirical study of their impact to deep learning. Multimed. Tools Appl. 2020, 79, 12777–12815.
30. Yoshioka, T.; Ito, N.; Delcroix, M.; Ogawa, A.; Kinoshita, K.; Fujimoto, M.; Yu, C.; Fabian, W.J.; Espi, M.; Higuchi, T.; et al. The NTT CHiME-3 System: Advances in Speech Enhancement and Recognition for Mobile Multi-Microphone Devices. In Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, 13–17 December 2015; pp. 436–443.
31. Si, N.; Zhang, W.; Qu, D.; Luo, X.; Chang, H.; Niu, T. Representation visualization of convolutional neural networks: A survey. Acta Autom. Sin. 2022, 48, 1890–1920.
32. Jabeen, F.; Khusro, S.; Anjum, N. Research in Collaborative Tagging Applications: Choosing the Right Dataset. VAWKUM Trans. Comput. Sci. 2023, 11, 1–25.
33. Li, H.; Wang, P.; Shen, C. Toward end-to-end car license plate detection and recognition with deep neural networks. IEEE Trans. Intell. Transp. Syst. 2018, 20, 1126–1136.
34. Laroca, R.; Severo, E.; Zanlorensi, L.A.; Oliveira, L.S.; Gonçalves, G.R.; Schwartz, W.R.; Menotti, D. A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–10.
35. Gonçalves, G.R.; da Silva, S.P.G.; Menotti, D.; Schwartz, W.R. Benchmark for license plate character segmentation. J. Electron. Imaging 2016, 25, 053034.
36. Hsu, G.S.; Chen, J.C.; Chung, Y.Z. Application-oriented license plate recognition. IEEE Trans. Veh. Technol. 2012, 62, 552–561.
37. Xie, L.; Ahmad, T.; Jin, L.; Liu, Y.; Zhang, S. A new CNN-based method for multi-directional car license plate detection. IEEE Trans. Intell. Transp. Syst. 2018, 19, 507–517.
38. Liu, X. Research on Face Image Recognition Technology Based on Deep Learning. Ph.D. Thesis, University of Chinese Academy of Sciences (Changchun Institute of Optical Precision Machinery and Physics, Chinese Academy of Sciences), Changchun, China, 2019.
39. Yang, G.; Pennington, J.; Rao, V.; Sohl-Dickstein, J.; Schoenholz, S.S. A mean field theory of batch normalization. arXiv 2019, arXiv:1902.08129.
40. Yong, H.; Huang, J.; Meng, D.; Hua, X.; Zhang, L. Momentum Batch Normalization for Deep Learning with Small Batch Size. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XII 16. Springer International Publishing: New York, NY, USA, 2020; pp. 224–240.
41. Luo, P.; Wang, X.; Shao, W.; Peng, Z. Towards understanding regularization in batch normalization. arXiv 2018, arXiv:1809.00846.

Figure 1. AlexNetBN model structure diagram.

Figure 2. Image processing effect diagram. (a) Input image; (b) heat map after convolutional layer processing; (c) heat map after convolutional and BN layer processing.

Figure 3. Samples from the self-built dataset. (a) Simulated license plate covered by sand and dust; (b) simulated license plate under drastic changes in light intensity; (c) example of license plate rotation; (d) simulated license plate obscured by sand or light.

Figure 4. Accuracy of different models in the training cycle. The content within the black dashed box is the main focus of this paper.

Figure 5. Performance of models with different datasets. (a) Accuracy on the test set during model training using different datasets; (b) comparison of PFC values on the test set at the beginning of training for models trained using different datasets.

Figure 6. Performance of models with different structures. (a) Changes in accuracy on the test set during training of different models using the same dataset; (b) comparison of accuracy and PFC for different models near convergence.

Figure 7. Change in the value of the loss function.

Figure 8. Model recognition accuracy before and after optimization.

Table 1. Comparison of license plate datasets.

Dataset	Year	Country	Amount	Resolution	Description
CCPD (2019) [33]	2019	China	250 k	720 × 1280	Includes unevenly lit license plates, tilted license plates, etc., but too few of them
UFPR-ALPR [34]	2018	Brazil	4500	1920 × 1080	License plates from different national regions; lacks dusty and light-variable license plates
GAP-LP [34]	2019	Tunisia	9175	Includes multiple resolutions	Lacks license plates with heavy dust or strong light variations; labels may be inaccurate and inconsistent
OpenALPR-EU [35]	2016	Europe	108	Includes multiple resolutions	License plates from various EU countries; poor generalization when used
UCSD-Stills [36]	2005	USA	291	640 × 480	Multiple license plate styles and image capture conditions; few license plates in complex environments
CD-HARD [37]	2016	Involving multiple countries	102	Includes multiple resolutions	Many difficult-to-identify samples, but lacks photographs of sandy license plates or plates under strong light variations
CSCL	2024	China	25 k	240 × 80	Includes a large number of license plate photos from areas with frequent dust storms and drastic changes in lighting

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
