1. Introduction
Meteorological visibility is an important parameter for assessing atmospheric quality, and it has a significant impact on transport safety [1]. However, the measurement and evaluation of visibility is a complicated and challenging task, which is subject to errors caused by external factors such as suspended particles in the air [2]. Traditional visibility estimation methods mainly include the manual evaluation method and the visibility meter method [3]. The manual evaluation method refers to the visual observation of the largest visible distance by a well-trained meteorological observer, while the visibility meter method estimates the visible distance by measuring the atmospheric transmittance or extinction coefficient [4]. In the manual evaluation method, the meteorological observer generally uses targets at different distances as references, ignores errors caused by other environmental factors, and determines the meteorological optical range (MOR) [5]. However, this method has great limitations: its result depends on the number of available targets at different distances in the environment to be measured and on the personal subjective judgement of the weather observer [6]. Furthermore, this method is inefficient and irreproducible, and the observation of visibility is also limited by the time between observations and by environmental changes. The visibility meter approach includes the forward scattering method and the back-scattering method [7]. In general, based on cost and performance considerations, most visibility meters use the forward scattering method. However, accurate forward scattering equipment is very expensive and requires specialized installation and calibration skills. Furthermore, it can only measure visibility accurately within a relatively short visible range.
In recent years, due to the continuous advancement of computers and digital cameras, the digital images obtained by web cameras can be used for computer vision and for obtaining accurate scene information. Much past research has been done on visibility estimation and visibility restoration under low visibility conditions by studying the blurred and degraded images obtained by digital cameras.
In 2016, Huang S C et al. [8] presented a new approach with three modules (depth estimation, color analysis, and visibility restoration) to solve the problem of visibility restoration of outdoor digital images in the presence of haze, fog, and sandstorms. This method could simplify the complex image restoration problem. Compared with other methods, it can be applied to images taken in different weather conditions, and it restores fog-degraded images quickly and efficiently. Farhan Hussain [9] proposed a novel deep neural network approach for visibility enhancement under low-visibility foggy conditions. They proposed a generalized model, with an approximate model of the fog in the scene generated by a deep neural network, to restore the image quality of the scene. This method could restore the scene of the image in real time without other prerequisite information. Zhigang Ling [10] proposed a deep network that exploits local patch and three-color-channel information to enhance image quality through a dehazing process.
In 2017, Mingye Ju [11] proposed a method for visibility restoration based on a fast single-image defogging technique and a more robust atmospheric scattering model (ASM), which can overcome the problems of illumination nonuniformity and multiple scattering. Lei Zhu [12] proposed a multi-factor regression prediction model for visibility forecasting at Urumqi International Airport. Its predictions were stable when the visibility was higher than 1500 m, while the prediction performance was poorer when the visibility was below 1000 m. Shengyan Li [13] proposed an intelligent digital method to estimate visibility using webcam weather images and a generalized regression neural network (GRNN). The method uses a convolutional neural network (CNN) to estimate the visibility value of the webcam image through the pre-trained AlexNet. In the proposed model, the CNN is used to extract image features, and the designed GRNN is used to approximate the visibility function with the image features as input. However, the model's visibility evaluation range is relatively limited (0–35 km), the training accuracy is 77.9%, and the test accuracy is only about 61.8%. Bohao Chen [14] proposed a novel radial basis function (RBF) neural network method for haze elimination. One advantage of this method is that it can retain the edges of the visible structure and the brightness of the image while the haze is eliminated. The method can distinguish the haze component from real-world hazy images, and it can learn edge features according to the scene structure in the hidden layer of the RBF network. It can restore blurred images efficiently.
In 2018, Hazar Chaabani [15] proposed a novel deep learning method involving feature extraction and a support vector machine (SVM) to achieve safer driving conditions in foggy weather. The method could be integrated into next-generation variable information signs and advanced driver assistance systems (ADAS) to alert the driver of the visibility range and recommend an appropriate speed, thereby helping to achieve safer driving in foggy conditions. Palvanov Akmaljon Alijon [16] proposed a novel deep hybrid convolutional neural network (DHCNN) method for visibility estimation under heavy fog. The method used a Laplacian of Gaussian filter to estimate the visibility of images under low-visibility (foggy) conditions, and it could replace high-cost visibility measuring instruments. It can estimate visibility from images collected through closed-circuit television (CCTV) in real time. Yang You [17] proposed a deep learning method for estimating relative atmospheric visibility from digital images. The method uses a shortcut connection to bridge a CNN module, which captures the global view of an image, with an RNN coarse-to-fine module, which captures the farthest discerned local region. Although the evaluation range of this CNN-RNN model was only 300–800 m, its accuracy could reach 90.3%. Youngjin Choi [18] proposed a novel CCTV-based method to estimate visibility from digital images with sea fog. Due to the lack of effective information in the CCTV images over long distances, the accuracy of this method is about 70%. In addition, the optical sensor is 4.5 km away from the installation point of the CCTV, which causes some noise errors.
In 2019, Wenqi Ren [19] proposed a multi-scale convolutional neural network method for single-image dehazing. Zhenyu Lu [20] proposed a hierarchical sparse representation method to estimate image visibility. The method used the fuzzy C-means (FCM) algorithm to build a historical database of 5000 samples and a hierarchical sparse representation to predict the visibility of new inputs. This method is easy to extend, which could improve accuracy, reduce absolute errors, and provide convenience for other meteorological analyses. Qian Li [21] proposed a novel deep convolutional neural network (DCNN) method for visibility estimation under the condition of insufficient visibility-labeled data. The method divided each image into several subregions and used a no-reference neural network to extract features from the image. The extracted features were then imported into support vector regression for training, and a visibility estimate was obtained for each subregion. The final visibility estimate was obtained according to the fusion weights of the regression models, and the results showed that the accuracy of visibility estimation could exceed 90%. Fatma Outay [22] proposed a novel method based on learned features to estimate visibility in foggy weather, in which the AlexNet DCNN was used for feature extraction and an SVM classifier was used for visibility estimation. Chuang Zhang [23] presented a visibility prediction method based on multimodal fusion. The proposed method established a numerical prediction model with XGBoost, LightGBM, and emission detection algorithms. Akmaljon Palvanov [24] gave a detailed overview of the latest research results on visibility estimation under various weather conditions and proposed a novel deeply integrated convolutional neural network (VisNet) method to estimate image visibility from webcam weather images, in which three deeply integrated convolutional neural network streams are connected in parallel. Compared with other methods, the VisNet network had advantages in versatility; however, it involves quite heavy computation and extensive data processing.
In 2020, Lo [25] proposed a novel multiple support vector regression (MSVR) model for visibility estimation. The method extracted different subregions from the weather images according to prescribed landmark information and used the VGG16 network to extract image features. According to different visibility ranges, the images were divided into different classes, and their features were imported into a support vector machine (SVM) for regression analysis and visibility estimation. However, the method only used a single subregion for visibility estimation, and its overall accuracy was only about 87%. The comparisons of some of the proposed methods are summarized in Table 1.
At present, deep neural networks have been widely used in the visibility estimation and restoration of weather images. Past research used various forms of neural networks to extract features from digital images, and used the extracted features as input data for classification and evaluation. Some methods focused on network optimization, network performance, and shortening the computation time [12,13], while other methods focused on improving the accuracy of the visibility estimation. Some of these past works achieved better accuracy over a smaller estimation range, but at the cost of an increased computation load [17]. Past work has also improved the efficiency of the algorithm by using fusion methods to increase the adaptability of the extracted features [21]. However, as some of the extracted features were redundant, these features would affect the training efficiency and the accuracy of the estimation results. From the perspective of feature extraction, this paper looks for effective feature extraction and the reduction of redundant image information for meteorological visibility estimation.
Instead of selecting the image subregions using prerequisite landmark object information and human judgement as proposed in [25], this paper proposes a novel method for meteorological visibility estimation based on image feature fusion, which can find the effective image subregions through image pre-processing and gray-level averaging. The proposed method used a deep learning neural network to extract features and established a visibility evaluation model for each subregion through a support vector machine (SVM). According to the results of the fusion analysis, the visibility estimates of the subregions were fused together to obtain the final image visibility. Since the coordinates of the effective subregions were already obtained by the preprocessing method, this method only performed feature extraction on the selected subregions, which reduced the calculation time and increased the efficiency of the visibility evaluation.
A visibility estimation method with intelligent subregion selection, feature extraction, and feature fusion is proposed in this paper. The step-by-step procedure of the proposed method is briefly described as follows. Firstly, the proposed algorithm performed gray-weighted averaging and image pre-processing on all the images in the database, and the coordinates of the effective subregions were determined. After extracting the effective subregions, feature extraction was performed on them: deep learning neural networks (the VGG-16, VGG-19, DenseNet, and ResNet-50 networks) were used to extract the subregions' features. A regression model for each subregion was then established through the support vector machine (SVM), from which the visibility estimate of each subregion was obtained. According to the results of the fusion weight analysis, visibility fusion was performed on all the subregions to obtain the final estimate of the visibility.
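The procedure above can be sketched in skeleton form as follows. This is a minimal illustration with hypothetical helper names: `gray_weighted_average`, `crop`, and each `model` are placeholders for the CNN feature extraction and trained SVR regressors described in the paper, not the authors' implementation.

```python
# Sketch of the proposed pipeline (hypothetical helper names; the real method
# uses CNN feature extraction and SVM/SVR regression as described in the text).

def gray_weighted_average(images):
    """Pixel-wise average of grayscale images, used to locate stable regions."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]

def crop(image, box):
    """Extract a subregion given (top, left, bottom, right) coordinates."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def estimate_visibility(image, boxes, models, weights):
    """Fuse per-subregion visibility estimates with precomputed fusion weights."""
    estimates = [model(crop(image, box)) for box, model in zip(boxes, models)]
    return sum(w * v for w, v in zip(weights, estimates))
```

Here each `model` stands for a feature-extraction network followed by a trained SVR regressor, and `weights` come from the error analysis of the corresponding subregion.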
3. Experiment Results and Analysis
3.1. Experiment Platform
In order to evaluate the method proposed in this paper, we conducted experiments on the platform shown in Table 2. Here, the image resolution of the Hong Kong Observatory (HKO) image database is 1920 × 1080 pixels. The ground-truth visibility for each training image comes from the visibility meter data provided by the HKO. The dataset contains a total of 4841 images selected from the database: 3630 images were randomly selected as the training set, and the remaining 1211 images were used as the test set.
According to the needs of the experiment, appropriate subregions were extracted from each image through gray-level averaging. The visibility distribution of the image database is shown in Table 3.
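The random train/test split described above can be sketched as follows (a minimal illustration; the helper name and fixed seed are assumptions made for reproducibility, not part of the original experiment):

```python
import random

def split_dataset(indices, train_size, seed=0):
    """Randomly split image indices into disjoint training and test sets."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]

# 4841 images in total: 3630 for training, the remaining 1211 for testing
train_idx, test_idx = split_dataset(list(range(4841)), 3630)
```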
3.2. Result and Analysis
In the experiment, the visibility of the test set images was evaluated according to the neural network-regression model. Through gray averaging and region segmentation, we obtained the effective subregions. First, we performed feature extraction on each extracted effective subregion, and then imported the extracted features into the support vector machine model for training to obtain the predicted visibility value. By comparing the actual and predicted visibility values in each effective subregion, we obtained the visibility evaluation result for that subregion. By analyzing the error of each effective subregion, its fusion weight was obtained. The fusion weights were then used to fuse the visibility evaluation models of all the effective subregions, from which the final fused visibility was obtained.
The organization of the results and analysis is summarized as follows. The model analysis of the visibility evaluation for the different effective subregions is presented in part (1). In order to verify the effectiveness of the proposed method more extensively, we used image features extracted from different networks (the VGG-16, VGG-19, DenseNet, and ResNet-50 networks); the extracted features were imported into support vector machines for training to obtain the visibility regression results. In part (2), we evaluate the performance of these four networks and present the detailed experimental results; to verify the performance of the method in different visibility ranges, results for the different ranges are also evaluated in part (2). In part (3), the experimental results under different fusion strategies are analyzed and discussed. In part (4), we compare the fusion results of this paper with those of other papers.
- (1)
Analysis of different effective areas
In order to assess the validity of the effective subregions, the visibility estimates and the subregion weights of each effective subregion are shown in Table 4. The visibility of subregions with detailed objects, such as effective subregions No.3, No.4, and No.5 in Figure 10, was closer to the actual visibility. These subregions with rich details were correspondingly weighted higher than the blurred subregions. In order to assess the effect of the number of effective subregions on the estimation accuracy, the accuracies obtained with different numbers of effective subregions are shown in Table 5. The accuracies with the fusion of one effective subregion (No.1) and of three effective subregions (No.1, No.2, and No.3) were much lower than that obtained by the fusion of all five subregions (No.1–No.5). The main reason was that the larger the number of subregions, the higher the authenticity of the details in each effective subregion, and the easier it was to approach the true value. In addition, if a single segmented effective subregion was too large, it contained too many different levels of structure and detail, which reduced the sensitivity of the extracted features and thus the accuracy of the final estimation. Likewise, if the extent of a single subregion was too small, it contained too little hierarchical structure and detail, so its validity was limited and it did not differentiate visibility as well. All in all, according to the experimental results, dividing the whole image into five subregions was the most appropriate choice for the experimental case in this paper.
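One common way to derive such weights is to make each subregion's weight inversely proportional to its estimation error, then normalize. This is a sketch of that assumed weighting scheme for illustration; the paper's exact error-analysis formula is not reproduced here.

```python
def fusion_weights(errors):
    """Normalized inverse-error weights: subregions with smaller estimation
    error receive larger fusion weights (assumed scheme, for illustration)."""
    inv = [1.0 / e for e in errors]
    total = sum(inv)
    return [v / total for v in inv]

# Example: five subregions with hypothetical mean absolute errors (in km)
weights = fusion_weights([2.0, 4.0, 1.0, 1.0, 2.0])
```

With these numbers, the two low-error subregions (matching the detailed subregions No.3 and No.4 above) receive the largest weights.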
- (2)
Performance of different feature extraction networks
In order to verify the effectiveness of the proposed method, we used different image feature extraction networks, namely the VGG-16, VGG-19, DenseNet, and ResNet-50 networks. 512-dimensional feature vectors were extracted from the VGG-16 and VGG-19 networks, 1920-dimensional feature vectors were extracted from the DenseNet network, and 2048-dimensional feature vectors were extracted from the ResNet-50 network as the coded features.
Table 6 shows the visibility accuracy of these four networks in each visibility range. Although the overall accuracies of the VGG-16 and VGG-19 networks were 88%, they gave lower accuracies in the low visibility range compared with the DenseNet and ResNet-50 networks.
As the ResNet-50 and DenseNet networks were more sensitive to image attenuation and could provide valid image features at different visibility levels, they achieved a higher extraction rate of valid features and were thus better suited to providing valid features for the visibility estimation regression. According to the experimental results, the ResNet-50 network is recommended for image feature extraction, especially in the low visibility range; it also showed higher stability and robustness in the other ranges.
- (3)
Different fusion methods
In order to assess the visibility estimation results for different fusion strategies, we evaluated the following strategies: random fusion, average fusion, and the proposed weight fusion. In the random fusion method, one effective subregion after image segmentation was selected at random for feature extraction and regression model analysis, and its visibility result was regarded as the final fusion result. In the average fusion method, the average value of all the estimates from the different subregions was used as the final fusion result. In the weight fusion method, the fusion weight of each effective subregion was derived from the results of the error analysis. The accuracies of the different fusion methods are shown in Table 7.
According to the results in Table 7, the random selection strategy gave the poorest results, as the randomly selected subregion did not include sufficient features for visibility estimation. In the random fusion method, only the features of the selected effective subregion are used, which reduces the information available for the visibility evaluation and thereby affects the final accuracy; factors such as uneven illumination in the image may cause excessive errors in the final estimated visibility value. Compared with the random fusion method, the average fusion method gave better performance in the range of 21–50 km, but its performance in the low visibility range (0–20 km) was still not satisfactory. On the other hand, the weight fusion method effectively fused the local estimates of the subregion images, taking into account the fitted variances of the predicted distribution; it therefore gives better robustness and stability. In summary, the weighted fusion method gave the best visibility estimation results over the whole visibility range.
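The three strategies can be contrasted on toy data as follows. This is purely illustrative: the estimates and weights are made-up numbers, and `random.choice` stands in for the random selection strategy.

```python
import random

def random_fusion(estimates, rng=random):
    """Random strategy: take one subregion's estimate as the final result."""
    return rng.choice(estimates)

def average_fusion(estimates):
    """Average strategy: unweighted mean of all subregion estimates."""
    return sum(estimates) / len(estimates)

def weight_fusion(estimates, weights):
    """Weight strategy: error-derived weighted sum of subregion estimates."""
    return sum(w * v for w, v in zip(weights, estimates))

# Five hypothetical subregion estimates (km) for a scene whose true
# visibility is 10 km; one blurred subregion (14.0) is a clear outlier.
estimates = [9.0, 14.0, 10.0, 10.5, 8.5]
weights = [0.25, 0.05, 0.30, 0.30, 0.10]  # higher weight for lower-error regions
```

On this toy example the weighted result (9.95 km) is closer to the true 10 km than the plain average (10.4 km), because the outlier subregion is down-weighted.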
- (4)
Comparison with methods in other papers
As compared with the method in [25], where the accuracy of visibility estimation was only about 80%, the effective subregions in [25] were selected based on prerequisite landmark object information and human judgement. The selection process was not based on an objective regression-model analysis, so some important image information could be ignored after the subregion selection. Furthermore, as only one single subregion was used for the visibility estimation by the support vector machine (SVM), the accuracy was only about 82%. On the other hand, the accuracy of the method proposed in this paper can reach about 90%, which shows that the effective subregion selection method proposed in this paper is more reasonable and that the fusion method gives more accurate results.
As compared with the method in [21], the accuracy of visibility estimation in [21] can also reach 90%. However, the subregions were extracted based on an equal division of the image, which may have the following disadvantages. Useful image objects and redundant image objects are mixed and distributed among the different subregions. As the algorithm in [21] adjusted each subregion image to a size of 224 × 224 × 3 after the division process, this caused deformation of the subregion image and affected the estimation results. Moreover, as the useful image information in the subregions was not extracted efficiently, the computation time increased.
The method proposed in this paper focuses on a more effective selection of subregions with useful static objects, which reduces the area of redundant information in the subregions. It addresses the problems of low data processing efficiency and low estimation accuracy, and it avoids the unnecessary computation load of processing redundant image information.
The proposed method can extract the effective subregions efficiently, and it can also provide reasonable accuracy over a wide estimation range by using multiple SVR models and the fusion method. The mapping surface between the visibility values and the feature vectors is complex and high-dimensional; by incorporating a number of piecewise SVR models, the multiple-SVR approach can approximate this high-dimensional, complex mapping surface of the visibility function.
5. Conclusions
At present, deep neural networks have been widely used in the visibility estimation of weather images, but most methods do not focus on the features and content of the effective subregions. From the perspective of effective feature extraction, this paper looked for an efficient selection of useful subregions and feature extraction for visibility estimation. The paper proposed a novel deep-learning-based method for visibility estimation using feature fusion, which locates the most effective image subregions by gray-level averaging. The proposed method used deep learning neural networks to extract features and established a visibility model for each subregion using a support vector machine (SVM). The visibilities of the subregions were fused together according to the results of the weight fusion analysis.
In the proposed method, all the images in the database were first gray-weighted to remove interference areas and obtain the effective subregions; in this paper, five subregions were extracted for subsequent feature extraction. Four feature extraction networks (DenseNet, ResNet-50, VGG-16, and VGG-19) were used to extract features from the subregions. The feature vectors obtained by the neural networks were then imported into the proposed SVR regression models, in which the visibility functions with the image features as input were curve-fitted by the SVR models. According to the results of the error analysis, weight fusion was performed to derive the final visibility estimate.
The proposed method extracts valid and effective subregions to improve the training efficiency and estimation accuracy, and it avoids the long computation time caused by processing the whole image or equally divided subregions of the whole image. Since the effective subregions were derived by the gray averaging method during pre-processing, this process is performed only in the initialization stage; image processing is then performed only on the selected effective subregions, so processing of the invalid and redundant areas is avoided. Compared with other methods, this paper not only extracted the image features efficiently, but also shortened the data processing time of the whole process, improved the efficiency of model training, and avoided the interference of invalid features.
The major idea of the proposed subregion selection method is summarized as follows. As the distribution, number, and location of static landmark objects in a digital image depend on the actual physical environment, we cannot select the effective subregions arbitrarily. Suppose the image dataset is sorted in ascending order of visibility distance. We can identify the locations of the nearest to the farthest objects by performing the gray-level averaging over the dataset from the smallest to the largest visibility distance. The major aim of the proposed selection method is to group a set of static landmark objects into a particular subregion so that the variation of its image characteristics is sensitive to a particular visibility range. Hence, we can train an SVR model to curve-fit the visibility function with the input image features. By combining these SVR models through the fusion method, we can estimate the final visibility using the combined multiple-SVR model.
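The grouping idea can be illustrated as follows. This is a sketch under stated assumptions: each landmark is represented only by the visibility distance at which it appears in the gray-level averages, and landmarks falling in the same visibility-range bin are grouped into one subregion. The landmark names, distances, and bin edges are hypothetical, and the rule is a simplification of the authors' gray-level averaging procedure.

```python
def group_objects_by_range(object_distances, range_edges):
    """Group static landmark objects into subregions so that each subregion
    is sensitive to one visibility range (illustrative grouping rule)."""
    groups = [[] for _ in range(len(range_edges) - 1)]
    # Process landmarks from nearest to farthest, mirroring the ascending
    # visibility-distance ordering of the dataset described in the text.
    for name, dist in sorted(object_distances.items(), key=lambda kv: kv[1]):
        for i in range(len(range_edges) - 1):
            if range_edges[i] <= dist < range_edges[i + 1]:
                groups[i].append(name)
                break
    return groups

# Hypothetical landmark distances (km) and visibility-range bin edges (km)
objects = {"pier": 1.2, "bridge": 6.0, "island": 18.0, "ridge": 35.0}
subregions = group_objects_by_range(objects, [0, 5, 20, 50])
```

Each resulting group would then correspond to one subregion with its own SVR model, sensitive to the visibility range in which its landmarks fade out.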
The proposed method gives good performance and accuracy over a range of 0–50 km, which is suitable for practical applications. Experimental results show that the visibility estimation accuracy of the proposed method is more than 90%. It can be used to estimate the visibility value of the whole image with high robustness and effectiveness. The method does not require a large-scale visual annotation set, and it also eliminates the processing of the invalid and redundant information in the digital images, as compared to other existing methods. By fine-tuning the neural network and extracting the effective subregions, it greatly reduces the model complexity and the computation time compared to other methods.
Although our model can evaluate the visibility of images accurately, it still has some limitations. In terms of calculation time, the preprocessing and gray-level averaging of the whole dataset to obtain the coordinates of the effective subregions is quite time-consuming; however, this preprocessing stage is only necessary during the initialization and pre-tuning stage, and the SVR models are updated as new images are taken by the CCTV and added to the dataset. Increased application noise could also affect the choice of the effective areas, although the fusion of the subregion visibility evaluation models proposed in this paper can reduce this effect to some extent.
Another limitation of the proposed method is that it is applicable to daytime images only; further modification of the algorithm is needed for night-time images. These limitations will be investigated in the future. In addition, our future research will also focus on optimizing the selection of the effective subregions, so as to minimize the number of subregions in a particular image dataset while maintaining the visibility estimation accuracy at a reasonable level.