Vehicle Detection Using Deep Learning Technique in Tunnel Road Environments

Abstract: This paper proposes a method for detecting, in real time, a car driving ahead on a tunnel road. Unlike the general road environment, the tunnel environment has irregular and significantly lower illumination, which comes from tunnel lighting and light reflected from driving vehicles. Pollution from vehicle exhaust gas imposes further environmental restrictions. The proposed method detects vehicles in tunnel images using a detector trained in advance with deep learning techniques. To detect the vehicle region in the tunnel environment, brightness smoothing and noise removal are carried out first. The vehicle region is learned from training images labeled using the ground-truth method. The YOLO v2 model, which performed best among the deep learning algorithms compared, is applied, and its training parameters are refined through experiments. When applied to various tunnel road environments, the proposed method achieves a vehicle detection rate of approximately 87% and a detection accuracy of approximately 94%.


Introduction
Recently, various technologies for autonomous vehicles have emerged. A support system for the safe driving of vehicles has been achieved by combining various sensors included in the vehicle, such as lane maintenance, omnidirectional vehicle distance estimation, side vehicle detection, and vehicle distance maintenance sensors [1][2][3]. This paves the way for the realization of fully autonomous driving. Among the sensors installed in vehicles to support autonomous driving, the charge-coupled device (CCD) vision sensor is the most important [4][5][6][7][8][9]. Most driving tasks are the driver's visual tasks. The road environment information is analyzed through visual information, the situation is recognized, and the vehicle steering task is finally determined through the driving task. Thus, image recognition through vision sensors is important for the safety support of autonomous vehicles. Among the sensors required for safe driving support, the most commonly mounted or installed sensor on vehicles is a vehicle-type black-box device, which is a front and rear video recording device [10,11].
However, the current vehicle black-box system is used simply as a video recording device for accident identification. If the black-box product were equipped with functions for safe driving support, it could also assist the driver. Some products include image processing functions such as lane keeping, determination of whether the vehicle in front has started moving, and traffic sign recognition. However, existing black-box systems capable of recognizing road conditions can be applied only in environments where lighting or road conditions do not change significantly. For example, they cannot correctly recognize the road situation in poorly lit environments that differ from general roads, such as tunnels or bridges. Traffic accidents have been continuously occurring in tunnels in Korea, and the number of deaths is also increasing [12]. Figure 1 shows the scene of a traffic accident in a tunnel [13]. As shown in Figure 1, most traffic accidents in tunnels are caused by collisions with vehicles in front.
According to the traffic accident analysis system of the Korea Road Traffic Authority [12], the traffic accident status in tunnels in the period of 2010 to 2019 is shown in Figure 2. The number of injured persons has increased owing to the increase in the number of tunnel traffic accidents over the past four years. In addition, according to the analysis of traffic accident types in tunnels in the period of 2015 to 2019 in Figure 3, the ratio of vehicle-to-vehicle traffic accidents was above 88%, owing to the nature of the tunnel. Figures 4 and 5 show the breakdown of tunnel traffic accidents by the road traffic regulation violated and by vehicle type. Approximately 60% of the traffic accidents in tunnels in the last five years were due to negligence, while 25% of the accidents were caused by not maintaining a safe distance between vehicles. More than 76% of the vehicles involved were passenger vehicles.
Figure 1 sources [13]: see https://www.socialfocus.co.kr/news/articleView.html?idxno=7398, http://www.sisa-news.com/news/article.html?no=121142, https://www.seoul.co.kr/news/newsView.php?id=20200506800014, https://news.zum.com/articles/59902225.
Analysis of the large-scale traffic accident data [12] indicates that drivers in tunnels need guidance to maintain a safe distance from the vehicle driving ahead and to stay attentive to the scene in front. Therefore, a support system that reports the presence or absence of a vehicle ahead in a tunnel can reduce the number of traffic accidents in tunnels.
In the last five years, 3218 tunnel traffic accidents have occurred in Korea, in which 7472 people have been killed or injured. Thus, approximately 2.32 people were affected per tunnel traffic accident. The risk is very high compared to 1.52 people per year in traffic accidents. Therefore, a safe driving support system that tells the driver of a vehicle running in a tunnel, in real time, whether a vehicle is driving ahead can largely reduce the number of traffic accidents in tunnels. Various methods for the detection and recognition of vehicles on roads have been proposed [14][15][16]. These methods involve various sensors. However, the deep learning models [17][18][19] that can be applied to image-based vehicle recognition have mainly been trained on vehicle images acquired in daytime road environments, and their vehicle recognition rate is very low in tunnel-like environments.
Therefore, in this paper, we propose an omnidirectional vehicle detection method in a tunnel environment. The tunnel environment has various brightness levels and colors depending on the characteristics of the lighting installed in the tunnel. In this study, to minimize the effect of the tunnel illumination, the brightness of the image is smoothed and the effect of noise is minimized. Images of cars driving in tunnels are then learned using a deep learning model, and the trained model is used to detect vehicles running in a tunnel.

Proposed Method
In this paper, we propose a real-time detection method for a vehicle in a tunnel environment. Figure 6 shows images of vehicles in a tunnel road environment. Compared with general road images, tunnel images have low illumination, frequently contain diffuse reflections caused by the tunnel lighting, and contain noise caused by vehicle exhaust. In addition, it is challenging to detect the vehicle area visually at the entrance and exit of the tunnel owing to the sudden change in illumination. Therefore, in this study, a deep learning technique is applied to learn vehicles running in the tunnel. Brightness balancing and noise removal steps are implemented to minimize the effects of the various tunnel illumination sources and of noise on the tunnel image. The input tunnel images are acquired from a black box installed in the vehicle.

Overview
We propose a method for the real-time detection of vehicles in vehicle black-box images acquired on tunnel roads. In contrast to general roads, image quality on tunnel roads is reduced by irregular lighting, diffuse reflection of the tunnel lighting, light reflected from the surfaces of driving vehicles, and vehicle exhaust gas. Images acquired on tunnel roads therefore include haze, light leakage, and blurring, and vehicle detection methods based on color and shape produce detection errors when applied to them. For this reason, the proposed method equalizes the image brightness and removes noise before vehicle detection is performed. Figure 7 shows a flowchart of the proposed vehicle detection process in tunnels. The black-box image is a 1920 × 1080 pixel, full-color high-definition (HD) image, which takes a long time to process. In this study, the image was reduced to 1/2 of its original size by bilinear interpolation. In addition, to correct the brightness of the image, illuminance smoothing was performed, and noise was removed by applying an average-value filter to the pixel values. In the image post-processing step, the execution time was minimized by using only the middle area of the image, where the vehicle driving ahead appears, rather than the entire input image. In the training stage, the YOLO v2 model was trained on images that had previously been labeled with the ground-truth method. In the final vehicle detection step, the resulting vehicle detector was used to detect the position of the vehicle in the tunnel image.
Symmetry 2020, 12, 2012
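The size reduction, average-value filtering, and middle-area selection described above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation. It works on a single-channel image stored as nested Python lists, and the function names, the 3 × 3 filter window, the `keep=0.5` crop fraction, and the reading of "1/2" as half the width and half the height are all assumptions made for illustration, since the text does not specify them.

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a 2D grayscale image (list of rows) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        # Map the centre of the output pixel back into source coordinates.
        y = min(max((i + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = min(max((j + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            # Blend the four neighbouring source pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def mean_filter(img, radius=1):
    """Average-value filter; the window size (2 * radius + 1) is an assumption."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Near the border, average only the pixels that exist.
            window = [img[y][x]
                      for y in range(max(0, i - radius), min(h, i + radius + 1))
                      for x in range(max(0, j - radius), min(w, j + radius + 1))]
            out[i][j] = sum(window) / len(window)
    return out


def crop_middle(img, keep=0.5):
    """Keep the central part of the image; the fraction `keep` is hypothetical."""
    h, w = len(img), len(img[0])
    ch, cw = max(1, int(h * keep)), max(1, int(w * keep))
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in img[top:top + ch]]


# Usage on a tiny 4 x 4 test image (a real frame would be 1080 x 1920).
frame = [[0, 2, 4, 6], [2, 4, 6, 8], [4, 6, 8, 10], [6, 8, 10, 12]]
half = bilinear_resize(frame, len(frame) // 2, len(frame[0]) // 2)
smoothed = mean_filter(half)
roi = crop_middle(smoothed)
```

For an RGB frame, the same functions would be applied to each color channel separately. When the size is exactly halved, this bilinear mapping reduces to averaging each 2 × 2 block of source pixels.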

Pre-Processing
In this step, the processing load is reduced and the image quality is improved so that vehicles can be detected effectively in a tunnel image. The image size is reduced to lower the amount of computation, and the brightness is corrected to improve the image quality. In addition, the noise generated in the tunnel environment by exhaust gas and by diffuse reflection of the tunnel lighting is removed. A black box, a video recording device for vehicles, is used to record the driving situation of the vehicle. Consequently, vehicle black-box devices require a wide angle of view and high image quality to store road images. To this end, most black-box devices are equipped with a CCD sensor that provides HD-level quality. The image acquired from the black-box device is 1920 × 1080 pixels (24-bit red-green-blue (RGB) color image). Processing the image at this high resolution yields a high vehicle detection rate and accurate positions, but it increases the amount of computation. Moreover, training on high-resolution tunnel road images increases the learning time. In the proposed method, the size of the input HD-level black-box image is therefore reduced through bilinear interpolation, which outputs smoother images than nearest-neighbor interpolation. The input tunnel image is reduced to 1/2 of its original size. As stated above, most tunnel environments have low illumination compared to the general road environment, so the first pre-processing step improves the brightness of the input image. In the tunnel environment, the image quality is also largely reduced by haze attributed to light scattering from vehicle exhaust gas, road dust, and tunnel lighting. This haze is recorded by the vehicle black box as noise.
Therefore, in the pre-processing step, the brightness of the image is improved and the included haze is minimized. The tunnel image I obtained from the black box can be expressed by
$$I(x) = J(x)\,t(x) + L\,\bigl(1 - t(x)\bigr) \qquad (1)$$
where x is the two-dimensional (2D) image pixel coordinate, I(x) is the observed image, J(x) is the original image, L is the atmospheric light, and t(x) is the transmission map describing the portion of light that reaches the camera without being scattered. The original image J(x) to be estimated by Equation (1) can be expressed using the atmospheric light and transmission map. Therefore, the original image is estimated using the Retinex theory [20][21][22] to remove the noise. To estimate the noise-free original image J(x), the atmospheric light L is first estimated using a dark channel prior. The transmission map t(x) is then estimated using the atmospheric light, and J(x) is estimated from the estimated L and t(x). Figure 8 shows the result of estimating an image with improved brightness and reduced noise from a tunnel image. Figure 8b shows the inverted image obtained by taking the complement of the image in Figure 8a, which was acquired from the black box. Figure 8c shows the dark channel image, representing the lowest brightness across the RGB channels of the image in Figure 8b. Using the image in Figure 8b, the atmospheric light L is estimated using the method reported by Dubok et al. [23]. The transmission map t(x) (Figure 8d) is estimated using the atmospheric light L. Using the atmospheric light L and the transmission map t(x), an image with smoothed brightness and removed noise is generated, as shown in Figure 8e. Finally, the complement of this image is taken to obtain the result (Figure 8f).
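The sequence shown in Figure 8 (complement, dark channel, atmospheric light, transmission map, restoration, complement) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: images are nested lists of RGB tuples in the range [0, 1], the haze model is the standard one, I(x) = J(x)t(x) + L(1 − t(x)), the atmospheric light is taken from the pixel with the largest dark-channel value as a simple stand-in for the method of [23] (which is not described in the text), and the parameters `radius`, `omega`, and `t_min` are hypothetical values.

```python
def complement(img):
    """Invert an RGB image whose values lie in [0, 1]."""
    return [[tuple(1.0 - c for c in px) for px in row] for row in img]


def dark_channel(img, radius=1):
    """Minimum over the RGB channels, then minimum over a local window."""
    h, w = len(img), len(img[0])
    per_pixel_min = [[min(px) for px in row] for row in img]
    return [[min(per_pixel_min[y][x]
                 for y in range(max(0, i - radius), min(h, i + radius + 1))
                 for x in range(max(0, j - radius), min(w, j + radius + 1)))
             for j in range(w)] for i in range(h)]


def estimate_atmospheric_light(img, dark):
    """Assumption: use the colour of the pixel with the largest dark-channel value."""
    h, w = len(img), len(img[0])
    i, j = max(((y, x) for y in range(h) for x in range(w)),
               key=lambda p: dark[p[0]][p[1]])
    return img[i][j]


def estimate_transmission(img, light, omega=0.95, radius=1):
    """t(x) = 1 - omega * dark_channel(I / L); omega is an assumed constant."""
    normalised = [[tuple(c / max(a, 1e-6) for c, a in zip(px, light))
                   for px in row] for row in img]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalised, radius)]


def recover(img, light, trans, t_min=0.1):
    """Invert the haze model: J = (I - L) / max(t, t_min) + L, clipped to [0, 1]."""
    return [[tuple(min(1.0, max(0.0, (c - a) / max(t, t_min) + a))
                   for c, a in zip(px, light))
             for px, t in zip(row, trow)]
            for row, trow in zip(img, trans)]


def enhance_tunnel_image(img):
    """Complement, dehaze the inverted image, then complement again."""
    inverted = complement(img)
    dark = dark_channel(inverted)
    light = estimate_atmospheric_light(inverted, dark)
    trans = estimate_transmission(inverted, light)
    return complement(recover(inverted, light, trans))


# Usage on a tiny dark 2 x 2 image.
tunnel = [[(0.1, 0.1, 0.2), (0.3, 0.2, 0.1)],
          [(0.05, 0.1, 0.1), (0.2, 0.2, 0.2)]]
enhanced = enhance_tunnel_image(tunnel)
```

The lower bound `t_min` keeps the division stable where the estimated transmission is close to zero. Published dehazing methods usually estimate the atmospheric light from a set of the brightest dark-channel pixels rather than a single one, so this step in particular should be read as a simplification.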

Vehicle Detection
In this step, the vehicle region is detected in the tunnel image using the YOLO v2 model, with the vehicle detector created in advance in the vehicle learning step. A pre-trained model is used to learn the vehicle region: rather than constructing a new learning model, the YOLO v2 vehicle detector was created by modifying a pre-trained model. ResNet-50 was used as the pre-trained model [24]. ResNet-50 is a 50-layer convolutional neural network trained on over 1 million images from the ImageNet database. It can classify images into approximately 1000 categories, and its image input size is 224 × 224. The YOLO v2 neural network consists of two sub-networks, a feature extraction network and a detection network. The feature extraction network used in this study is the pre-trained ResNet-50 CNN model. The detection network consists of several convolutional layers and a YOLO v2 dedicated layer. The YOLO v2 neural network is parameterized by the network input size, the anchor boxes, and the feature extraction network. The network input size was set to [224 224 3], and the number of anchor boxes was set to 11. The feature extraction neural network used 40 rectified linear unit (ReLU) activation layers. Figure 9 shows the neural network structure of the YOLO v2 model used for vehicle region detection in the proposed method.
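To make the role of the input size and the number of anchor boxes concrete, the sketch below computes the shape of a YOLO v2 detection output. In YOLO v2, each anchor box at each grid cell predicts four box offsets, one objectness score, and one score per class. The function name is hypothetical, and the single "vehicle" class and the feature-map stride of 16 used in the usage line are assumptions; the text states only the 224 × 224 input and the 11 anchor boxes.

```python
def yolo_v2_head_shape(input_size, stride, num_anchors, num_classes):
    """Return (grid_height, grid_width, channels) of a YOLO v2 output tensor."""
    grid_h = input_size[0] // stride
    grid_w = input_size[1] // stride
    # Per anchor: 4 box offsets + 1 objectness score + one score per class.
    channels = num_anchors * (5 + num_classes)
    return grid_h, grid_w, channels


# Assumed values: stride 16 and one class; 224 x 224 and 11 anchors are from the text.
shape = yolo_v2_head_shape((224, 224), stride=16, num_anchors=11, num_classes=1)
```

Under these assumptions the detection network would produce 66 values per grid cell, which shows how the anchor count directly scales the size of the final convolutional layer.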

Experimental Results
To evaluate the proposed method, an experiment was carried out on 1920 × 1080, 24-bit color images taken from driving videos acquired with a car black box in various tunnels. The experiment was carried out using MATLAB. For the training data, we estimated the anchor box sizes and the number of anchor boxes for the YOLO v2 model that most effectively represent the vehicle region, based on the sizes and the horizontal/vertical pixel ratios of the vehicle regions designated by the ground-truth method. When features are extracted from the training images during model learning, it is important to use a mask that matches the size distribution of the vehicle regions. It is therefore necessary to choose the anchor box sizes and the number of candidates that best suit the vehicle area sizes, through cross-comparison of the vehicle areas in the experimental data with the vehicle areas detected during the experiment. Figure 10 shows the size and width/length ratio of the vehicle regions in the experimental data. Figure 11 shows the intersection accuracy of the detected vehicle regions as a function of the number of YOLO anchor boxes.
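The text does not say how the anchor boxes were estimated. A common approach for YOLO v2 is k-means clustering of the labeled box sizes with an IoU-based distance, followed by a comparison of the mean IoU for different numbers of anchors. The sketch below illustrates that approach under these assumptions; the function names, the deterministic initialization, and the sample box sizes are hypothetical and are not taken from the paper.

```python
def wh_iou(a, b):
    """IoU of two boxes given as (width, height) and aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def estimate_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchors using IoU as similarity."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic start: k boxes spread evenly over the area-sorted list.
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        updated = []
        for anchor, members in zip(anchors, clusters):
            if members:
                updated.append((sum(b[0] for b in members) / len(members),
                                sum(b[1] for b in members) / len(members)))
            else:
                updated.append(anchor)  # keep an anchor that attracted no boxes
        if updated == anchors:
            break
        anchors = updated
    return anchors


def mean_iou(boxes, anchors):
    """Average, over all boxes, of the IoU with the best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


# Hypothetical labeled vehicle sizes in pixels: near (large) and far (small) cars.
sizes = [(10, 10), (12, 11), (11, 12), (50, 30), (52, 31), (48, 29)]
for k in (1, 2):
    print(k, round(mean_iou(sizes, estimate_anchors(sizes, k)), 3))
```

Plotting the mean IoU against the number of anchors gives a curve of the kind described for Figure 11, from which a count such as the 11 anchors reported here can be chosen as a trade-off between overlap and model size.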
According to the experiment, the number of anchor boxes that most effectively detects the vehicle area in the training data was set to 11, and the YOLO learner was used. For the network training, stochastic gradient-descent optimization was used, the initial learning rate was 0.0001, the mini-batch size for each training iteration was set to 64, and the maximum number of iterations was set to 30. Of the data, 70% were used for training, 15% for verification, and the remaining 15% for testing. Figure 12 shows a graph of the precision and recall obtained using the YOLO v2 vehicle detector generated after the training. The average accuracy of the resulting vehicle detector was approximately 95%. Figure 13 shows the results of applying the Aggregated Channel Features (ACF) [24,25], Fast R-CNN [26,27], and Single Shot Detector (SSD) [28,29] feature information-based vehicle detectors and the proposed method to detect vehicles in various tunnel environments. The proposed method provides good results for the detection of vehicles in a tunnel environment. However, a vehicle cannot be correctly detected in road sections where a sudden change in illuminance occurs, such as a tunnel entrance or exit.
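The data split and training settings above can be written down as a small sketch. It assumes that the 70% portion is the training set, and the dictionary keys, the function name, and the fixed random seed are hypothetical; the experiments themselves were run in MATLAB, so this is an illustration only.

```python
import random

# Values reported in the text; the key names are hypothetical.
TRAINING_OPTIONS = {
    "optimizer": "sgd",
    "initial_learning_rate": 1e-4,
    "mini_batch_size": 64,
    "max_iterations": 30,
}


def split_dataset(items, train=0.70, val=0.15, seed=0):
    """Shuffle with a fixed seed, then split into train / validation / test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Usage with 100 hypothetical image indices.
train_set, val_set, test_set = split_dataset(range(100))
```

Fixing the seed keeps the three subsets identical across runs, which matters when detectors trained with different settings are compared on the same test images.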
The ACF-based vehicle detector decomposes the training vehicle images into 10 feature channels and reduces them in multiple steps to calculate the features of the vehicle region. In addition, to classify the features of the vehicle region with the AdaBoost algorithm, only the regions where vehicle features are located are classified in stages using several weak classifiers. The maximum number of weak classifiers used for learning was set to 2048, and the number of iterations of the learning process was set to 10. The Fast R-CNN-based vehicle detector uses a deep convolutional neural network based on regions of interest. For learning, VGG-16 was used as the pre-trained model, the mini-batch size was set to 16, the initial learning rate was set to 0.0001, and the maximum number of epochs was set to 30. The SSD-based vehicle detector uses the pre-trained model ResNet-50 for feature extraction and stochastic gradient descent with momentum for learning. The initial learning rate was set to 0.0001, the mini-batch size was set to 16, and the maximum number of epochs was set to 30. According to the experiment, the proposed method provided good results for vehicle detection in tunnels.
The ACF-based vehicle detection method [27] could not detect vehicles at a distance or at the entrance and exit of a tunnel. The Fast R-CNN-based vehicle detection method [28] had the lowest vehicle detection rate, because a vehicle model learned on general roads was applied in a tunnel. The SSD-based vehicle detection method [29] could not detect vehicles located at a distance. The proposed method has a relatively effective vehicle detection rate regardless of the distance.
The comparison of the vehicle detection rates in various tunnel environments shows an accuracy improvement of approximately 10.7% with the introduction of the pre-processing. Table 1 compares the vehicle detection rates with and without the pre-processing in the vehicle detection step. A vehicle could not be detected in the tunnel when two or more vehicles overlapped owing to lane changes while driving. In addition, a vehicle traveling behind a large bus or truck could not be detected. According to the experiment, the average vehicle detection rate of the proposed method was approximately 86.8%, and the vehicle detection accuracy was approximately 94.1%. A detection was judged successful if it overlapped the vehicle area labeled in advance by the ground-truth method by approximately 50% or more.
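The success criterion can be made concrete with a short sketch. It assumes that "overlap" means intersection over union (IoU) between the detected box and the ground-truth box, which the text does not state explicitly; the (x, y, width, height) box format and the function names are also assumptions.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x, y, width, height)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlapping rectangle (zero if disjoint).
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def is_successful(detected, ground_truth, threshold=0.5):
    """A detection counts as successful when the overlap reaches the threshold."""
    return box_iou(detected, ground_truth) >= threshold


# Usage: a detection shifted by 2 pixels against a 10 x 10 ground-truth box.
ok = is_successful((2, 0, 10, 10), (0, 0, 10, 10))
```

If the authors measured overlap relative to the ground-truth area alone rather than the union, the denominator in `box_iou` would change accordingly.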

Conclusions and Future Work
In this paper, we proposed a method to detect a vehicle driving ahead in a tunnel environment. In the proposed scheme, a vehicle detector was created using a YOLO v2 learner, which was trained on road images acquired in various tunnel environments. To increase the accuracy of vehicle detection in a tunnel environment, noise reduction and illuminance smoothing steps were applied to the tunnel image in advance. In addition, a comparison of several deep learning learners showed that the YOLO v2 network was effective for vehicle detection in a tunnel environment. However, it was challenging to detect vehicles at the entrance and exit of the tunnel owing to the sudden change in brightness. We intend to continue with studies on vehicle detection using Kalman filters, estimation of the distance between vehicles in the tunnel, and discrimination of brake application through the detection of brake lights.