1. Introduction
Autonomous vehicles use sensors, artificial intelligence (AI), and automatic assistance systems to control their driving operations. With the development of AI technology, autonomous driving systems have rapidly advanced to the stage of practical application. The Taiwanese government is actively promoting self-driving technology, has set up dedicated test areas for field testing, and has enacted laws to regulate on-road testing of autonomous vehicles to ensure that these vehicles operate safely.
Referring to the research of Pavlitska et al. on attacking and interfering with traffic sign recognition models [1], we simulated attacks in two categories: digital simulated attacks and physical simulated attacks. In our previous research, we targeted digitally simulated attacks [2]. In this study, we incorporated physically simulated attack methods to investigate strategies for resisting these attacks and mitigating interference.
We simulated attacks with LED lights of different colors based on the You Only Look Once version 5 (YOLOv5) model and the German Traffic Sign Recognition Benchmark (GTSRB) dataset. We interfered with the traffic sign recognition model using the LED lights, tested the model's recognition performance under the colored-light interference, and calculated the recognition accuracy under that interference. We also tested whether the interference results with colored light in this environment were the same as those of the previous digitally simulated attacks.
In the experiment, we used physical traffic signs in the laboratory. The LED light sources included no light, normal light, red light, blue light, green light, and yellow light. We also considered environmental factors, such as turning the room lights on and off, to simulate interference from other potential light sources in the actual environment. Illumination by lights of different colors interfered with the machine learning model and affected the ability of self-driving vehicles to recognize traffic signs, causing the self-driving system to fail to detect the existence of traffic signs or to commit recognition errors.
To resist this attack, we integrated the traffic sign images captured during the attack into the training set and iteratively retrained the machine learning model. This made the new machine learning model resistant to related attacks and prevented it from being interfered with by such attacks.
2. Methodology
2.1. Dataset
We used GTSRB [1] as the experimental dataset. This dataset contains 43 classes of traffic signs used in Germany.
Figure 1 lists the 15 examples that we used in the experiment. The original GTSRB training set contains 39,544 images, and the test set contains 10,320 images. In this study, the original training set was split into training and validation sets in a ratio of 8:2. The validation set was used to adjust the model's hyperparameters during the training phase to prevent overfitting [3,4]. Therefore, the number of images in the training set was 31,735, and that in the validation set was 7842.
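The split described above can be reproduced with a simple random partition of the image files. The following sketch illustrates the idea, assuming the GTSRB images and YOLO-format label files have been exported under a gtsrb/images and gtsrb/labels directory layout; the paths, file extension, and fixed random seed are illustrative rather than the exact procedure used in this study.

```python
# Minimal sketch of an 8:2 training/validation split of the GTSRB training set.
# Paths, file extensions, and the random seed are assumptions for illustration.
import random
import shutil
from pathlib import Path

random.seed(0)  # fixed seed so the split is reproducible

images = sorted(Path("gtsrb/images").glob("*.png"))
random.shuffle(images)

cutoff = int(0.8 * len(images))  # 80% training, 20% validation
subsets = {"train": images[:cutoff], "val": images[cutoff:]}

for subset, files in subsets.items():
    for img in files:
        label = Path("gtsrb/labels") / (img.stem + ".txt")  # YOLO label file
        for src in (img, label):
            dst = Path("gtsrb") / subset / src.parent.name / src.name
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(src, dst)
```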
2.2. Model
We used the YOLOv5 machine learning model [5], which builds on the one-stage You Only Look Once (YOLO) object detection approach proposed by Joseph Redmon et al. in 2015. YOLOv5 has four main versions: small (s), medium (m), large (l), and extra-large (x), providing higher accuracy from small to large. Built on PyTorch, it produces high-speed, high-precision results that are adaptable to various scenarios. In the experiment, we calculated precision and recall; both were higher than 0.9, indicating good model performance [1].
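As a concrete illustration of how such a model can be used, the sketch below loads YOLOv5 weights through the official PyTorch Hub entry point and runs detection on a single sign image. The checkpoint name gtsrb_best.pt and the image path are hypothetical placeholders for weights fine-tuned on GTSRB.

```python
# Minimal sketch of loading a YOLOv5 model and detecting a traffic sign in one
# image. "gtsrb_best.pt" and "stop_sign.jpg" are illustrative placeholders.
import torch

# Official YOLOv5 hub entry point for custom weights.
model = torch.hub.load("ultralytics/yolov5", "custom", path="gtsrb_best.pt")
model.conf = 0.25  # discard detections below this confidence score

results = model("stop_sign.jpg")
results.print()                        # class names, confidence scores, boxes
detections = results.pandas().xyxy[0]  # detections as a pandas DataFrame
print(detections[["name", "confidence"]])
```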
2.3. Metrics
We adopted the confidence score as the evaluation metric. The confidence score shows the probability that the algorithm has correctly detected the object. It is the product of the probability that an object is present, Pr(Object), and the intersection over union (IoU) of the predicted bounding box and the ground-truth annotation box [6]:

Confidence = Pr(Object) × IoU

Pr(Object) is the probability that an object is contained in the bounding box. It is an indicator for making predictions based on the features in the learning data; in YOLO, the object probability is mapped to a value between 0 and 1 by the sigmoid function to represent the prediction result. For example, when the camera detects an object in front of it, the model draws the expected bounding box and calculates the probability that the object in this box is consistent with the traffic sign determined by the model.

IoU measures the degree of overlap between the predicted bounding box and the ground-truth box. It is calculated by dividing the intersection area of the two boxes by their union area; a value of 1 means complete overlap, and 0 means no overlap. The IoU formula used in this experiment is as follows:

IoU = Area of Intersection / Area of Union

For example, if the object probability is 0.9 (i.e., the probability that there is an object in the predicted bounding box) and the IoU is 0.8, then the confidence score is 0.9 × 0.8 = 0.72.
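To make the two metrics concrete, the short sketch below computes the IoU of a predicted box and a ground-truth box and multiplies it by an object probability to obtain the confidence score. The box coordinates and probability value are illustrative only.

```python
# Worked example of the metrics above: IoU of two axis-aligned boxes given as
# (x1, y1, x2, y2), and the confidence score Pr(Object) * IoU.
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

predicted = (10, 10, 110, 110)     # predicted bounding box (illustrative)
ground_truth = (20, 20, 120, 120)  # ground-truth annotation box (illustrative)

overlap = iou(predicted, ground_truth)     # about 0.68 for these boxes
object_probability = 0.9                   # Pr(Object), as in the text example
confidence = object_probability * overlap  # confidence score
print(f"IoU = {overlap:.2f}, confidence = {confidence:.2f}")
```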
3. Results
To conduct the attack experiments, we illuminated traffic signs with colored LED lights. In the physical simulation environment, we taped traffic signs to a white wall and fed a camera's real-time video stream into the YOLOv5 model for scenarios combining the traffic signs with various light sources, in order to determine how interference of different colors affected the model's ability to recognize traffic signs.
The LED light conditions were classified into no light, normal light, red light, blue light, green light, and yellow light. External environmental factors were defined as room lights on and off to simulate the light source conditions in the actual environment during the day or at night. Video footage at 1080p and 30 fps was used to recognize traffic signs. The movement of vehicles was simulated in a micro-dynamic manner by moving the camera toward and away from the traffic sign at distances of 5 to 30 cm.
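A possible way to wire the camera into the model, consistent with the setup described above, is sketched below using OpenCV to capture 1080p frames and the YOLOv5 PyTorch Hub interface for detection. The camera index, window handling, and checkpoint name are assumptions for illustration rather than the exact rig used in the experiments.

```python
# Minimal sketch of feeding a live camera stream to a YOLOv5 model, as in the
# colored-light attack experiments. Device index 0 and "gtsrb_best.pt" are
# illustrative assumptions.
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="gtsrb_best.pt")

cap = cv2.VideoCapture(0)                # camera pointed at the sign
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)  # request 1080p capture
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)       # OpenCV frames are BGR
    results = model(rgb)                               # run detection
    annotated = cv2.cvtColor(results.render()[0], cv2.COLOR_RGB2BGR)
    cv2.imshow("traffic-sign recognition", annotated)  # boxes + confidence
    if cv2.waitKey(1) & 0xFF == ord("q"):              # press q to stop
        break

cap.release()
cv2.destroyAllWindows()
```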
Experiment Records
The traffic signs used in the test experiment included a red stop sign, a red-frame speed limit sign, and a blue keep-right sign. In the experiment, we selected the recognition result with the highest confidence score as the final result.
Table 1 shows the results of the experiments with the three traffic signs in the lights-on state and the corresponding confidence scores. Blue light interfered with the recognition of most red traffic signs, excluding the speed limit sign, and red, yellow, and green lights interfered with the recognition of the blue sign.
Table 2 shows the results of the experiments with the three types of traffic signs in the lights-off state; the numerical data represent the corresponding confidence scores. All four colors of light interfered with the recognition of the red signs, the speed limit sign was also affected, and the blue sign was misidentified in most cases under the interference.
The model's success rate in identifying blue traffic signs was unstable and differed from the results of previous experiments. Therefore, it was necessary to experiment with situations in which the blue sign was illuminated by LED lights.
We used traffic signs in three categories: red-frame signs (speed limit, construction), red signs (stop), and blue signs. Red-frame signs were not interfered with and were identified correctly as long as the distance was 5–10 cm. None of the red signs were affected by attacks with contrasting colors (such as blue or green) at 5 cm. The model misjudged the blue sign regardless of whether the room lights were on or off; for example, the keep-right sign was misrecognized.
To understand whether the distance between the camera and the sign affected the model's judgment, we moved the camera toward and away from the traffic sign at distances from 5 to 30 cm. When the distance was 5–10 cm, most signs were not affected by the attack. When the distance was 10–20 cm, the recognition results were identical to those of the digital simulation experiment. When the distance was 20–30 cm, the model was unable to identify the object as a traffic sign, except for simple signs such as the speed limit sign, which were still likely to be identified as traffic signs.
When the room lights were turned on, most of the signs were correctly recognized, except for red signs under blue light and the blue sign under red, yellow, or green light. When the room lights were turned off, recognition of red-frame signs was less interfered with, whereas red, green, blue, and other lights interfered with the recognition of red signs; the yellow light also affected the traffic signs, and the blue signs were the most easily interfered with and misrecognized. In the lights-off condition, the model was more affected by the interference, and the traffic signs were misrecognized because there was no other light source to weaken the attack light source.
4. Discussion
In addition to the digitally simulated attack [1], we used a physically simulated attack in the experiment. We recorded and retained post-attack sign images, imported them into the original training set, and then retrained the YOLO model. The retrained YOLO model identified attacks effectively. After adding the images interfered with by the attack to the original training dataset, the new training dataset included 41,280 images. The traffic signs used in the test experiments included red signs, red-frame signs, and blue signs. We took the result with the highest confidence score as the final result.
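In practice, this retraining step amounts to appending the labeled post-attack frames to the original YOLO-format training directories and rerunning the standard YOLOv5 training script. The sketch below illustrates the merge; the directory names are hypothetical, and each attack image is assumed to have a matching YOLO label file.

```python
# Minimal sketch of folding recorded post-attack sign images into the original
# training set before retraining. All paths are illustrative.
import shutil
from pathlib import Path

attack_images = Path("attack_captures/images")
attack_labels = Path("attack_captures/labels")
train_images = Path("gtsrb/train/images")
train_labels = Path("gtsrb/train/labels")

for img in sorted(attack_images.glob("*.jpg")):
    label = attack_labels / (img.stem + ".txt")   # matching YOLO label file
    shutil.copy(img, train_images / img.name)
    shutil.copy(label, train_labels / label.name)

print(f"Training images after merge: {len(list(train_images.glob('*')))}")

# The model is then retrained on the enlarged set with the standard YOLOv5
# training script, e.g.:
#   python train.py --data gtsrb.yaml --weights yolov5s.pt --epochs 50
```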
Table 3 shows the results of the experiments with the retrained model in the lights-on state; the numerical data represent the corresponding confidence scores. When the room lights were turned on, red signs were recognized correctly even under the various colors of light; the speed limit sign was recognized correctly only under white light and red light and was not recognized under the other lights; and blue signs were recognized incorrectly under the various colors of light.
Table 4 shows the results of the experiments with the retrained model in the lights-off state. All red signs and speed limit signs were correctly identified. In addition, blue signs were correctly identified under red or blue light. Most results showed low confidence scores, which means that when the room lights were turned off, the colored lights interfered with the recognition of the signs.
5. Conclusions
For all-red signs, misrecognition occurred under attacks with contrasting colored light (such as blue or green light), with a greater impact in a low-light environment. Red-frame signs, however, were less affected by interference during an attack; as long as the distance was close (5–10 cm), misjudgment did not occur. Compared with all-red and blue signs, red-frame signs were less susceptible to colored-light attacks, and their confidence scores decreased only slightly. Regarding blue signs, directional indicators such as 'Turn Left Ahead', 'Turn Right Ahead', 'Go Straight or Right', and 'Go Straight or Left' were correctly recognized whether in a well-lit or low-light environment and whether exposed to the attack light or merely to slight indoor brightness. Correct recognition required an LED light source positioned at a fixed point at a close distance (5–10 cm). In cases of incorrect predictions, signs were misclassified as less common signs, such as 'Priority Road' or 'Double Curve'. These results indicate that the new model remained unstable even after retraining, necessitating additional parameters and training for reliable prediction results.
Author Contributions
Conceptualization, C.-H.L.; methodology, C.-H.L.; software, C.-T.Y. and Y.-L.C.; validation, C.-H.L.; formal analysis, C.-H.L.; investigation, C.-H.L.; resources, C.-T.Y.; data curation, Y.-L.C.; writing—original draft preparation, C.-T.Y. and Y.-L.C.; writing—review and editing, C.-H.L.; visualization, C.-T.Y. and Y.-L.C.; supervision, C.-H.L.; project administration, C.-H.L.; funding acquisition, C.-H.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Science and Technology Council, Taiwan, grant number NSTC 113-2221-E-029-031.
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Pavlitska, S.; Lambing, N.; Zöllner, J.M. Adversarial attacks on traffic sign recognition: A survey. In Proceedings of the 3rd IEEE International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Tenerife, Spain, 19–21 July 2023; pp. 1–6.
- Lin, C.-H.; Yu, C.-T.; Chen, Y.-L.; Lin, Y.-Y.; Chiao, H.-T. Simulated adversarial attacks on traffic sign recognition of autonomous vehicles. Eng. Proc. 2025, 92, 15.
- Gholamy, A.; Kreinovich, V.; Kosheleva, O. Why 70/30 or 80/20 relation between training and testing sets: A pedagogical explanation. Int. J. Intell. Technol. Appl. Stat. 2018, 11, 105–111.
- Xu, Y.; Goodacre, R. On splitting training and validation set: A comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J. Anal. Test. 2018, 2, 249–262.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26–30 June 2016; pp. 779–788.
- Zhang, W.; Watanabe, Y. Real-time identification system using YOLO and isomorphic object identification for visually impaired people. In Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 10–13 October 2023; pp. 757–758.