An Atmospheric Visibility Grading Method Based on Ensemble Learning and Stochastic Weight Average

In order to adequately characterize the visual characteristics of atmospheric visibility and overcome the disadvantages of traditional atmospheric visibility measurement methods, namely heavy dependence on preset reference objects, high cost, and complicated procedures, this paper proposed an ensemble learning method for atmospheric visibility grading based on a deep neural network and stochastic weight averaging. An experiment was conducted in an expressway scene, with three visibility levels set, i.e., Level 1, Level 2, and Level 3. First, a pre-trained EfficientNet was transferred to extract the abstract features of the images. Then, training and grading were performed on the feature sets through the SoftMax regression model. Subsequently, the trained networks were ensembled by stochastic weight averaging to obtain the atmospheric visibility grading model. The obtained datasets were input into the grading model and tested. The grading model classified the results into three categories, with per-level accuracies of 95.00%, 89.45%, and 90.91% and an average accuracy of 91.79%. The results obtained by the proposed method were compared with those of existing methods, and the proposed method showed better performance. This method can be used to grade atmospheric visibility on traffic roads and help reduce the incidence of traffic accidents caused by low atmospheric visibility.


Introduction
Atmospheric visibility is a critical element of road meteorological observation, with an essential impact on traffic safety and human health. In poor weather or haze conditions, visibility is significantly reduced, which impairs the driver's judgment and can even cause traffic accidents that threaten people's lives. On the other hand, atmospheric visibility also reflects air quality, which matters for humans because poor air quality damages health. Therefore, the grading of atmospheric visibility is of considerable significance to traffic safety and human health.
In meteorology, atmospheric visibility is an indicator that reflects the transparency of the atmosphere [1]. It is generally defined as the maximum horizontal ground distance at which a person with normal vision can clearly distinguish the outline of a target under the prevailing weather conditions. Traditional visibility measurement methods include the visual inspection method, the instrumental measurement method, and the image-based grading method. The visual inspection method estimates the atmospheric visibility of a scene through an observer's judgment of preset reference targets, which makes it subjective, costly, and procedurally complicated. For the traffic application considered here, the exact visibility value matters less than a qualitative assessment, so this paper defines three levels of visibility instead of calculating specific values. This paper proposed a method based on EfficientNet [11] and ensemble learning to detect atmospheric visibility levels. Three atmospheric visibility levels were set for specific traffic road scenes. EfficientNet was trained first, and then the trained networks were ensembled by stochastic weight averaging (SWA) [12] to obtain a visibility grading model.

Materials
In order to analyze the level of atmospheric visibility, an image acquisition device was built around a Hikvision high-definition camera with a resolution of 5 megapixels and a focal length of 12 mm. The image acquisition device, power supply, and computer constituted the image acquisition system. The experimental device is shown in Figure 1, and the experimental site is on the top floor of the Boyuan Building on the Pukou Campus of Nanjing Agricultural University, China. The device model of the image acquisition device is shown in Table 1. Images were collected from 5:00 to 18:00 every day from November to December 2019 and used to establish an atmospheric visibility image dataset. To ensure the accuracy of visibility grading, the visibility level of each image was determined according to the air pollution index (API) value provided by the Ministry of Ecology and Environment of China (http://106.37.208.233:20035/, accessed on 6 March 2020). Some samples of the acquired images are shown in Figure 2. The dataset contains a total of 2500 training images and 500 validation images.


Visibility Level
At present, there is no uniform standard for visibility classification. The classification in this paper is based mainly on the air pollution index provided by the China Meteorological Administration and is divided into three levels. This work targets specific traffic road scenes: by detecting the visibility level, safe driving behavior can be recommended to the driver under that visibility, with the goal of reducing traffic accidents. In such specific scenarios, the exact visibility value is not needed; only a qualitative perception is required [13].
The visibility labeling process in this paper is as follows: images are collected in real time and then matched, by capture time and place, to the air pollution index (API) released by the China Meteorological Administration (data source: http://106.37.208.233:20035/, accessed on 6 March 2020), so that each image can be classified and labeled with a visibility level. The flow chart of atmospheric visibility classification is shown in Figure 3. As shown in Table 2, in the field scene, Level 1 corresponds to a visibility range exceeding 1000 meters, which is safe for driving; Levels 2 and 3 correspond to environmental pollution and heavy pollution, which reduce visibility and make traffic accidents due to low visibility more likely.
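The labeling step described above can be sketched as a simple mapping from the reported API value to a visibility level. Note that the function name and the API thresholds below are illustrative assumptions; the actual level boundaries come from Table 2, which is not reproduced here.

```python
# Hypothetical sketch of the labeling step: each image is assigned a
# visibility level from the air pollution index (API) reported for its
# capture time and place. The thresholds are assumed for illustration.
def api_to_level(api: int) -> int:
    """Map an air pollution index value to a visibility level (1-3)."""
    if api <= 100:        # assumed boundary: good air, visibility > 1000 m
        return 1
    elif api <= 200:      # assumed boundary: polluted, reduced visibility
        return 2
    else:                 # heavy pollution, low visibility
        return 3

labels = [api_to_level(a) for a in (60, 150, 260)]
```

With these assumed thresholds, the three sample API values would be labeled Level 1, Level 2, and Level 3, respectively.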



The Method of Atmospheric Visibility Grading
An atmospheric visibility grading model is established by constructing historical image datasets and applying a deep learning method [14]. The model is defined as follows: for a training image set X = {x1, x2, ..., xn} with corresponding visibility labels Y = {y1, y2, ..., yn}, a function f(x) is learned whose range corresponds to the three visibility levels, so that each sample image x is mapped to a visibility level [6].
In this paper, a deep learning model was proposed to solve this problem. The training data were images, and the model detects the visibility level from cameras along the expressway to give early warning to drivers, so that they can plan their route in advance and avoid accidents. The flow of the proposed method is shown in Figure 4. First, EfficientNet extracts the abstract features of the image. Then, the feature set is trained and classified by the SoftMax regression model [15], and the trained networks are ensembled by the SWA method to obtain the visibility grading model.
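The pipeline just described, features extracted by a pre-trained network, then a softmax classification head, can be sketched end to end. The feature extractor below is a random-projection stand-in for EfficientNet, and the tiny 8 × 8 dummy images (the paper uses 224 × 224 frames) keep the sketch fast; all shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(images, dim=16):
    """Stand-in for the EfficientNet feature extractor: flatten each
    image and project it to a fixed-length abstract feature vector."""
    flat = images.reshape(len(images), -1)
    proj = rng.standard_normal((flat.shape[1], dim)) * 0.1
    return flat @ proj

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

images = rng.random((4, 8, 8, 3))          # four dummy RGB frames
feats = extract_features(images)
W = rng.standard_normal((feats.shape[1], 3)) * 0.1  # softmax-layer weights
probs = softmax(feats @ W)                 # probabilities over 3 levels
levels = probs.argmax(axis=1) + 1          # predicted visibility level (1-3)
```

In the actual system the random projection is replaced by the pre-trained EfficientNet-B0, and the softmax weights are learned from the labeled training set.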


Application of EfficientNet
The impact of changes in atmospheric visibility on images is mainly reflected in the degradation of visual features such as brightness, contrast, color, and scene depth. However, a few hand-crafted visual features can hardly capture the full influence of the atmosphere on image formation [16]. Therefore, this paper applied a pre-trained deep neural network to visibility grading: features extracted by the network are used to train the classifier, which then produces the grading results. EfficientNet uses compound coefficients to uniformly scale all dimensions of the model to balance accuracy and efficiency. The pre-trained EfficientNet was applied to the visibility grading model to effectively extract abstract features and to mitigate the small number of samples and the uneven distribution of the visibility dataset.
In a traditional neural network training process, the classification error of each category is propagated back indiscriminately, regardless of whether the recognition is correct. The resulting parameter corrections are small and cannot be effectively amplified, which slows the convergence of network training.
EfficientNet is defined as follows [11]: a baseline network is designed using neural architecture search, and depth, width, and resolution are scaled uniformly. In the experiments reported for EfficientNet, this design achieved better accuracy and efficiency than previous convolutional networks on both large and small datasets; compared with traditional neural networks of a comparable number of hyperparameters or layers, EfficientNet performed best.
EfficientNet is divided into eight structures, B0-B7, according to model depth. Because the data and grading task in this research were relatively simple, EfficientNet-B0 was used as the baseline model. The architecture of EfficientNet is shown in Figure 5. Its main building block is the mobile inverted bottleneck MBConv, to which the original authors also added squeeze-and-excitation optimization. To adapt EfficientNet to our task, a softmax layer serving as a three-way classification head was added at the end of the base model.

To be more specific, we use an image obtained in the field to describe the procedure of our EfficientNet-B0 model. The image is first reshaped to 224 × 224 with RGB channels, then transformed through the stages of the EfficientNet model, as shown in Figure 6, and the resulting feature matrix is passed to the softmax layer. Figure 6. Feature extraction using a pre-trained EfficientNet-B0 baseline network model.
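The reshaping step can be sketched with a simple nearest-neighbour resize; this is a stand-in for the library resize routine, and the full-HD frame size below is an illustrative assumption about the camera output.

```python
import numpy as np

# Sketch of the preprocessing step: reduce a camera frame to the
# 224 x 224 RGB input expected by EfficientNet-B0.
def resize_nearest(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Nearest-neighbour resize of an H x W x 3 image to size x size x 3."""
    h, w, _ = img.shape
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return img[rows][:, cols]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # dummy full-HD frame
x = resize_nearest(frame)
assert x.shape == (224, 224, 3)
```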


Training Based on Softmax Regression Model
In order to establish the relationship between image features and visibility level, this paper used Softmax regression [15,17]. The Softmax regression model generalizes logistic regression to multi-class problems. In this paper, the class label y_i takes one of the three visibility levels {Level 1, Level 2, Level 3}.
As shown in Figure 5, the three-classification model consists of a pre-trained EfficientNet-B0 model and a softmax layer whose output is three probability values between 0 and 1. The probabilities of the three classes are obtained by matrix multiplication in the softmax layer, and they always sum to 1. For example, if the output probability is [0, 0.14, 0.86], the input belongs to Level 3.
In the Softmax regression model, the training set consists of m labeled samples {(x^(1), y^(1)), ..., (x^(m), y^(m))}. For a given test input x, a hypothesis function estimates the probability p(y = j | x) for each category j, that is, the probability of each possible classification result of x. The hypothesis function outputs a k-dimensional vector of the k estimated probabilities. In this paper, k = 3.
The Softmax classifier maps the input vector from N-dimensional space to a category, with the result given as a probability:

p(y = j | x; θ) = exp(θ_j^T x) / Σ_{l=1}^{k} exp(θ_l^T x),

where θ_j = [θ_j1 θ_j2 θ_j3]^T is the weight vector, i.e., the classification parameter of category j, and the total model parameter is θ = [θ_1, θ_2, ..., θ_k]. The parameter θ is obtained by training the Softmax classifier; with it, the probability of every possible category of an item to be classified can be calculated, and its category determined. Given a dataset of m training samples (x^(1), y^(1)), ..., (x^(m), y^(m)), where x is the input vector and y the category label of each x, the Softmax classifier estimates, for a given test sample x^(i), the probability that it belongs to each category. The hypothesis h_θ(x^(i)) is the vector whose j-th element p(y^(i) = j | x^(i); θ) is the probability of x^(i) belonging to category j; the elements of the vector sum to 1. For x^(i), the category j corresponding to the maximum probability is selected as the grading result of the current image. The value of θ is obtained by minimizing the cost function

J(θ) = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{k} 1{y^(i) = j} log( exp(θ_j^T x^(i)) / Σ_{l=1}^{k} exp(θ_l^T x^(i)) ),

where 1{y^(i) = j} is an indicator function, equal to 1 when the statement is true and 0 otherwise. The trained Softmax classifier is then used to process each query image and return its category.
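The hypothesis and cost function above can be checked with a small worked example. The feature vectors, labels, and dimensions below are random stand-ins, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 5, 8, 3                          # samples, feature dim, classes
X = rng.standard_normal((m, n))            # stand-in EfficientNet features
y = rng.integers(0, k, size=m)             # stand-in labels (0-indexed)
theta = rng.standard_normal((n, k)) * 0.1  # classification parameters

def h(theta, X):
    """p(y = j | x; theta) for every sample and every class j."""
    z = X @ theta
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cost(theta, X, y):
    """J(theta): mean negative log-probability of the true classes."""
    p = h(theta, X)
    return -np.log(p[np.arange(len(y)), y]).mean()

probs = h(theta, X)                        # each row sums to 1
pred = probs.argmax(axis=1)                # max-probability class per sample
```

Minimizing `cost` over `theta` (e.g., by gradient descent) yields the trained classifier used for grading.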

Ensemble Learning
The traditional ensemble method trains several different models, feeds them the same input, and combines their predictions by some common rule to produce the final grading result [18,19]. In this paper, ensemble learning and deep learning are combined, and the final grading result is produced by combining the gradings of multiple neural networks. Integrating networks with different parameters yields an effective ensembled model: because each model may make mistakes on different training samples, ensembling can maximize overall performance. The pre-trained EfficientNet-B0 is substituted into Bagging for training, and 20 iterations are performed. The results are shown in Table 3. The loss follows the expected convergence curve, indicating that the model fits well. After training EfficientNet-B0, the accuracy of the model is shown in Figure 7; the final training accuracy is close to 1, indicating good performance on this grading task. By comparing the accuracy of each network, the last three networks are selected and ensembled by the SWA method as the training model for visibility grading.


Ensemble Method Based on Weight Space
SWA [12] is very close to fast geometric ensembling (FGE), but its computational cost is lower. The SWA method is used in this paper to optimize the model obtained by ensemble learning.
The method ensembles the model by combining the weights of the same network at different training stages and then uses this combined-weight model to make gradings. Wilson et al. [20] noted that the advantage of this approach is that combining weights yields a single model after training, which speeds up subsequent grading. Experimental results show that this combined-weight ensemble outperforms some popular snapshot ensembles (SE).
The working principle of SWA is as follows: the first model stores the running average of the model weights (W_SWA); this is the final model obtained after training and is used for grading.
The second model traverses the weight space (W in the equation below), exploring with a cyclical learning rate. At the end of each learning-rate cycle, the current weights of the second model update the running average by averaging the old average weights with the new set of weights:

W_SWA ← (W_SWA · n_models + W) / (n_models + 1)

We applied ensemble learning with the SWA method to three EfficientNet models with three different groups of hyperparameters. Each epoch of our training process yields a group of parameters, which can be regarded as an EfficientNet model. Accordingly, the last three networks in Table 3 are selected, i.e., Epoch18, Epoch19, and Epoch20. First, the Epoch18 weights are stored as W_SWA (Model 1), while the Epoch19 and Epoch20 networks serve in turn as Model 2 during training. The average weights of Epoch18 and the new weights of the Epoch19 model are averaged to form the new Model 1, and the step is repeated with the Epoch20 model. Finally, the atmospheric visibility grading model is obtained from the three models through stochastic weight averaging.
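The running-average update above can be verified on toy weight vectors (random stand-ins for the Epoch18/19/20 parameters): folding the epochs in one by one reproduces the plain mean of the three weight sets.

```python
import numpy as np

# Sketch of the SWA update W_SWA <- (W_SWA * n_models + W) / (n_models + 1)
# applied to three per-epoch weight sets (random stand-ins here).
rng = np.random.default_rng(2)
epoch_weights = [rng.standard_normal(4) for _ in range(3)]  # Epoch18, 19, 20

w_swa = epoch_weights[0]        # Model 1: start from the Epoch18 weights
n_models = 1
for w in epoch_weights[1:]:     # fold in Epoch19, then Epoch20 (Model 2)
    w_swa = (w_swa * n_models + w) / (n_models + 1)
    n_models += 1

# The running average equals the plain mean of the three weight sets.
assert np.allclose(w_swa, np.mean(epoch_weights, axis=0))
```

In the full method, this update is applied per layer to the network parameters rather than to a single flat vector.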

Experiment of Ensemble Learning Grading Based on SWA
In order to verify the effectiveness of the proposed method based on EfficientNet and the SWA algorithm, cross-validation was used to divide the training and validation sets after the image dataset was obtained by the acquisition device. A total of 2500 images of different visibility levels were used as the training set, and 500 images were used as the validation set. In addition, 650 images with the three visibility levels in an approximate 1:1:1 ratio were used as the test set; the distribution of sample quantities is shown in Table 4. The Jupyter Notebook software in Anaconda was used to build the SWA model based on EfficientNet and ensemble learning for training and testing on TensorFlow and the Keras deep learning framework. The dataset was also fed to the SE and FGE models, and the results were compared. Our code will be made public on GitHub: https://github.com/ChristopherCao/An-Atmospheric-Visibility-Grading-Method-Based-on-Ensemble-Learning-and-Stochastic-Weight-Average (accessed on 20 June 2021).
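The split described above can be sketched by shuffling image indices and partitioning them into the 2500/500 training/validation sets, with the 650-image test set held out separately; the index ranges are illustrative.

```python
import random

# Sketch of the data split: 2500 training + 500 validation images drawn
# from the collected pool, with a disjoint 650-image test set (Table 4).
random.seed(0)
indices = list(range(3000))                 # the 2500 + 500 collected images
random.shuffle(indices)
train_idx, val_idx = indices[:2500], indices[2500:]
test_idx = list(range(3000, 3650))          # separately held-out test images
```

Repeating the shuffle with different seeds gives the cross-validation folds used when dividing the training and validation sets.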


Evaluation Indicator
In this paper, the following indicators were set to verify the effect of deep neural network EfficientNet.

1. Convergence coefficient: when the model training process ceases and the training curve becomes stable as the number of training iterations increases, the model is considered to have converged; that is, convergence is declared once the change in training accuracy acc_i between successive iterations falls below a custom threshold E.

2. Accuracy: among the many indicators of model performance, recognition accuracy is often used as the evaluation standard:

Accuracy = (TP + TN) / (TP + TN + FP + FN),

where TP is a positive sample predicted by the model to be positive, TN a negative sample predicted to be negative, FP a negative sample predicted to be positive, and FN a positive sample predicted to be negative.
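The two indicators can be computed directly from these definitions. The confusion counts and accuracy history below are illustrative numbers, not the paper's results, and the threshold E is set arbitrarily.

```python
# Worked example of the two evaluation indicators defined above.
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def converged(acc_history, E=1e-3) -> bool:
    """Convergence check: the change in training accuracy between the
    last two iterations has fallen below the custom threshold E."""
    return abs(acc_history[-1] - acc_history[-2]) < E

acc = accuracy(90, 85, 10, 15)             # (90 + 85) / 200 = 0.875
stable = converged([0.90, 0.9185, 0.9190]) # change 0.0005 < E, so True
```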
The traditional neural network feeds the classification error back to the hidden layers indiscriminately at each training step, so it cannot selectively strengthen the influence of the principal component features on network training. The EfficientNet algorithm adjusts the model weights according to the sample classification error and backpropagates to strengthen or weaken the corresponding hidden-layer parameters and biases, thereby training the corresponding features in a targeted way. This enhances the influence of principal features on network training, reduces the interference of invalid features, continuously improves the accuracy of the classification results, optimizes the convergence behavior, and improves the sample recognition rate.
This section mainly analyzed the convergence and recognition effect of the EfficientNet algorithm. The analysis shows that the weight-adjustment and integration model can improve the algorithm's convergence, focus training on features with large errors, and improve classification accuracy. In order to further verify the application effect of the algorithm proposed in this article, the next section compares the EfficientNet algorithm with network algorithms of the same structure.

Experiment Analysis
In order to verify the effectiveness of the model on the visibility classification dataset, three factors were evaluated: the convergence coefficient, model complexity, and accuracy. To maintain the fairness of the experiment, the specific settings were as follows.
(1) EfficientNet and a traditional neural network with the same structure were each tested twice on the visibility dataset, and the numbers of iteration periods the two algorithms required to reach the convergence condition were compared and analyzed. (2) The method was compared with VGG19 [21], ResNet [22,23], and SEnet [24] on the visibility dataset. Repeated experiments were carried out to analyze the confusion matrix results and the overall recognition accuracy of EfficientNet and the other algorithms.

Convergence Analysis
Based on the self-built dataset, the EfficientNet algorithm and a traditional neural network with the same structure were each trained twice, with the initial iteration period set to 200. The numbers of training iterations at which the two algorithms stopped training were compared. The curves of the convergence indicator versus training period obtained from the experiments are shown in Figure 9.
Atmosphere 2021, 12, x FOR PEER REVIEW


Figure 9 shows the training convergence curves of the two runs of EfficientNet and of the traditional network with the same structure. Figure 9a presents the first training run. As the figure shows, EfficientNet reaches the convergence condition at about 80 training cycles and then stops, whereas the traditional network with the same structure needs to be trained for about 140 cycles before reaching the convergence condition. Figure 9b presents the second run: EfficientNet again reaches the convergence condition at about 80 cycles when training stops, while the traditional neural network with the same structure requires about 130 iterations. Therefore, under the same conditions, the EfficientNet algorithm reaches the convergence condition 50-60 iterations earlier than the traditional neural network, indicating that it can effectively accelerate convergence.
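The stop-at-convergence protocol above can be sketched as an early-stopping loop capped at the 200-epoch initial iteration period. Here `train_one_epoch`, the threshold, and the stability window are illustrative stand-ins, not the paper's exact settings.

```python
def train_until_converged(train_one_epoch, max_epochs=200, E=0.005, window=5):
    """Train until the accuracy curve stabilizes (every change over the last
    `window` epochs below E) or the 200-epoch cap is reached; returns the
    stopping epoch and the accuracy history."""
    history = []
    for epoch in range(1, max_epochs + 1):
        history.append(train_one_epoch())
        recent = history[-(window + 1):]
        if len(recent) == window + 1 and all(
            abs(recent[i + 1] - recent[i]) < E for i in range(window)
        ):
            return epoch, history
    return max_epochs, history

# Stand-in training step: accuracy climbs by 0.05 per epoch, then plateaus at 0.9
accs = iter(min(0.9, 0.3 + 0.05 * i) for i in range(200))
stop_epoch, history = train_until_converged(lambda: next(accs))
```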

Overall Accuracy Analysis
In order to prove that this method is more suitable for visibility grading, it was compared with VGG19, ResNet, and SEnet. Firstly, the whole sample training set was used as the model training data to construct the visibility recognition model. Then, the methods were verified by experiments based on the trained visibility recognition models. Finally, the results of five repeated experiments with EfficientNet, VGG19, ResNet, and SEnet on the overall sample were acquired through experimental comparison. The experimental results are shown in Figure 10.

The recognition accuracy of VGG19 is about 73%, that of SEnet is about 80%, and that of ResNet is about 76%. According to the experimental results, the algorithms show good stability and low volatility for mass fog visibility recognition. EfficientNet achieves the highest accuracy, which shows that the algorithm in this paper can overcome the slow training speed on large-scale data while maintaining a high recognition effect.

Analysis of Recognition Accuracy
In the experiment, the overall sample set was applied to train the model, and the performance of the model was analyzed according to the recognition results on the test set of each visibility level. Firstly, the training set was input to construct the visibility recognition model. Then, the test set of the corresponding mass fog visibility level was used for verification based on the trained visibility detection model. Finally, the recognition results of EfficientNet, VGG19, ResNet, and SEnet at each level were obtained by comparison.
It can be seen from Figure 11 that the recognition accuracy of EfficientNet at the three visibility levels reaches 95.00%, 89.45%, and 90.91%, respectively, an average accuracy above 90%. Its recognition accuracy at each visibility level is higher than that of VGG19, ResNet, and SEnet, indicating that EfficientNet has high recognition accuracy.

Figure 10. The recognition accuracy of each neural network, including EfficientNet, VGG19, ResNet, and SEnet.
Figure 10 also demonstrates the results of the five repeated experiments of EfficientNet, VGG19, ResNet, and SEnet: the recognition accuracy of EfficientNet is about 90% in each of the five runs.

In addition, it can be seen from the confusion matrices in Figure 11 that EfficientNet is better than VGG19, ResNet, and SEnet in recognition accuracy at each level. This shows that EfficientNet can recognize the mass fog visibility level well while accelerating the operation speed, which meets the requirements of real-time monitoring of an expressway.

Figure 11. The confusion matrix diagram of the recognition results of each network.
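The per-level rates are the diagonal of a confusion matrix divided by its row sums. The matrix below is hypothetical, constructed only so that the per-level rates reproduce the reported 95.00%, 89.45%, and 90.91% (average 91.79%); it is not the paper's actual count data.

```python
import numpy as np

# Hypothetical confusion matrix (rows = true level, columns = predicted level)
cm = np.array([
    [190,   6,   4],   # Level 1: 190 / 200 correct
    [ 12, 178,   9],   # Level 2: 178 / 199 correct
    [  5,   4,  90],   # Level 3:  90 /  99 correct
])

per_level = cm.diagonal() / cm.sum(axis=1)     # per-class recognition rate
print([round(r * 100, 2) for r in per_level])  # [95.0, 89.45, 90.91]
print(round(per_level.mean() * 100, 2))        # 91.79
```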

Conclusions
(1) At present, deep learning is widely used in the field of image processing; among the many available methods, it is easy to operate and achieves considerably high accuracy, and it has therefore received extensive attention from researchers. In particular, EfficientNet is a deep neural network model that can be applied to small datasets while maintaining high accuracy, and it performed well on the image dataset collected in this experiment.
(2) The SWA method allows the model to converge faster during training. It also achieves a self-ensemble of the deep learning model: deep neural networks with the same architecture but different weights are ensembled, which significantly improves detection accuracy.
(3) An atmospheric visibility detection model based on EfficientNet and ensemble learning was built in this paper. Three atmospheric visibility levels can be detected and classified by training a deep learning model and ensembling the obtained networks with the same architecture but different weights. According to the experimental results, the detection accuracies for Level 1, Level 2, and Level 3 are 95.00%, 89.45%, and 90.91%, respectively, with an average detection rate of 91.79%, so the proposed model grades atmospheric visibility more accurately. However, the proposed method maintains high accuracy only in an experimental environment with good lighting conditions, and the experimental dataset needs to be further improved.
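The self-ensemble in (2) can be sketched as plain element-wise weight averaging over checkpoints saved along the training trajectory. The dictionary layout and the toy one-layer "architecture" below are illustrative assumptions; in practice, frameworks such as PyTorch provide this behavior via `torch.optim.swa_utils.AveragedModel`.

```python
import numpy as np

def swa_average(checkpoints):
    """Stochastic Weight Averaging: the ensembled model's weights are the
    element-wise mean of checkpoints sharing one architecture but holding
    different weights (each checkpoint: dict of layer name -> array)."""
    avg = {}
    for name in checkpoints[0]:
        avg[name] = sum(ckpt[name] for ckpt in checkpoints) / len(checkpoints)
    return avg

# Three checkpoints of the same toy one-layer model, with different weights
ckpts = [{"fc": np.full((2, 2), v)} for v in (1.0, 2.0, 3.0)]
swa = swa_average(ckpts)
print(swa["fc"][0, 0])  # 2.0 — the mean of 1.0, 2.0, and 3.0
```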
