Hyperparameter Tuned Deep Autoencoder Model for Road Classification in Intelligent Transportation Systems

Abstract: Unmanned aerial vehicles (UAVs) have significant capabilities for the automatic detection and mapping of urban surface materials owing to their high-resolution imagery, although a massive quantity of data is required to characterize ground material properties. In recent years, computer vision-based approaches for intelligent transportation systems (ITS) have gained considerable interest among research communities and practitioners. Road classification using remote sensing images plays a vital role in urban planning, yet it remains challenging because of scene complexity, varying road structures, and poor illumination conditions. The design of intelligent models and other machine learning (ML) approaches for road classification has yet to be fully explored. In this context, this paper presents a metaheuristics optimization with deep autoencoder enabled road classification model (MODAE-RCM). The presented MODAE-RCM technique focuses on the classification of roads into five types, namely wet, ice, rough, dry, and curvy roads. To accomplish this, the technique exploits modified fruit fly optimization (MFFO) with the neural architecture search network (NASNet) for feature extraction, while an interactive search algorithm (ISA) with a DAE model is used to classify roads. The exploitation of metaheuristic hyperparameter optimizers helps to improve the classification results. The MODAE-RCM technique was experimentally validated on a dataset comprising five road types, and the simulation analysis highlighted its superior outcomes compared with existing techniques.


Introduction
In recent times, intelligent transportation systems (ITS) using unmanned aerial vehicles (UAVs) have become essential. Comprehensive data regarding road networks is one of the transportation features that helps in planning and assessing transportation [1]. Retrieval of road data, such as road surface materials and pavement condition, is an important problem in urban areas. It can be performed by remote sensing (RS) or conventional surveying [2]; the traditional method, however, requires considerable labour and time. Hyperspectral imaging, also called imaging spectrometry, can be defined as acquiring data in several narrow, contiguous spectral bands, and it offers more detailed data than other RS methods [3]. Various materials, such as gravel and asphalt, can be discriminated at a detailed level in hyperspectral images through their physical properties (reflectivity, absorption, albedo, and so on). This feature is useful for discriminating and extracting urban objects, particularly those having similar spectral properties [4]. Detecting road surface materials through hyperspectral images incurs less cost than field surveying. Many existing approaches to map roads are either semi-automatic or manual [5]. However, such techniques are expensive and time-consuming; in particular, they may involve extensive field work and interpretation of aerial images, through which only limited data is obtained. Hyperspectral data holds substantial potential for the automatic detection of road surface materials [6]. However, no standard technique to map road surfaces or determine the state of road surface resources has existed until now. Many existing techniques were first formulated for mineral detection; it is therefore difficult to leverage them for detecting road surface materials, because such resources differ in the case of roads over smaller regions [7].
General road categories include ice, dry, rough, and curvy road surfaces, and classifying these road types is a challenging problem. In this context, several image processing methods are used to assist smart vehicles in classifying various road categories [8]. However, road image quality is affected by dynamic weather, illumination variation, and blurring. Progressive approaches such as deep learning (DL) are helpful for observing the challenging environment through a computer vision (CV) technique, and may help smart vehicles adapt their driving behaviour to the road type [9]. Driving behaviour mainly affects people near road regions, especially on roads with a moderate amount of traffic. It is unrealistic to assess the condition of every road which a smart vehicle could approach [10]; there are numerous variables to consider when handling such vehicles, including differences in heavy traffic, varying meteorological conditions, and road conditions. This paper presents a metaheuristics optimization with a deep autoencoder enabled road classification model (MODAE-RCM). The presented MODAE-RCM technique mainly focuses on the classification of roads into five types, namely wet, ice, rough, dry, and curvy roads. To accomplish this, the technique exploits modified fruit fly optimization (MFFO) with a neural architecture search network (NASNet) for feature extraction. To classify roads, an interactive search algorithm (ISA) with a deep autoencoder (DAE) model is used. The exploitation of metaheuristic hyperparameter optimizers helps to considerably improve the classification results. The experimental validation of the MODAE-RCM technique was performed using a dataset comprising five road types.

Literature Review
Alshehhi et al. [11] presented a single patch-based convolutional neural network (CNN) structure for the extraction of buildings and roads from high-resolution remote sensing (RS) datasets. Low-level features of buildings and roads (i.e., compactness and asymmetry) in nearby areas were combined with CNN features during the postprocessing phase to enhance performance. Lourenço et al. [12] assessed the potential of the object-based image analysis (OBIA) method for mapping numerous invasive plant species along roads, utilizing high spatial resolution images; the earlier classification and segmentation stages were then repeated on the fifteen masked images of vegetated regions.
Zhang et al. [13] modeled a new stagewise domain adaptation method, termed RoadDA, for addressing the domain shift (DS) problem in this field. In the initial phase, RoadDA adapts the target domain features to align with the source ones through generative adversarial network (GAN)-based interdomain adaptation. In particular, feature pyramid fusion modules are formulated to avoid information loss on thin and long roads and to learn discriminative and robust features. In addition, to solve the intradomain discrepancy in the target domain, the authors modeled an adversarial self-training technique in the next phase. Chen et al. [14] devised an Adaboost-like end-to-end multiple lightweight U-Nets method (AEML U-Nets) for extracting roads. The authors in [15] devised a combined technique merging classification and segmentation with connected component analysis to extract the road class from orthophoto imagery. This approach is threefold. Firstly, a multiresolution segmentation technique is implemented for image segmentation. Then, classification techniques such as support vector machines (SVM), decision tree (DT), and k-nearest neighbor (KNN) are applied on the basis of textural, spectral, and geometric data. The acquired outcomes are classified into two classes, namely non-road and road.
Ding et al. [16] modeled non-local feature search networks (NFSNets) that enhance the segmentation precision of RS imagery of roads and buildings, enabling precise urban planning. NFSNet can efficiently reduce large-area misclassifications and road and building discontinuities in the segmentation procedure. Global feature refinement (GFR) components were presented to integrate the features derived from the SAFT module and the backbone network; this enriches the semantic information of the feature map and yields a more detailed segmentation outcome. Dewangan et al. [17] modeled a CNN-based road classification network (RCNet) for the precise categorization of road surfaces, covering five categories of road surfaces, namely rough, curvy, ice, dry, and wet roads. The simulation outcomes show the performance of the presented RCNet under different optimizers. Standard performance assessment measures were employed to test and validate the technique on the Oxford RobotCar dataset.

Materials and Methods
In this study, a new MODAE-RCM method was formulated for accurate and automated road classification. The presented MODAE-RCM technique mainly focuses on the classification of roads into five types, namely wet, ice, rough, dry, and curvy roads. It encompasses a series of processes, namely MFFO with NASNet feature extraction, DAE classification, and ISA-based hyperparameter optimization. Figure 1 illustrates the workflow of the MODAE-RCM system.

Feature Extraction
The presented MODAE-RCM technique exploits the MFFO with the NASNet model for feature extraction. NASNet-Mobile is a recently developed DL model with 5,326,716 parameters that exhibits high reliability. The NASNet structure has a building block, and a group of blocks is integrated to form a cell. The search space involved in NASNet is the factorization of a network into cells, which are in turn divided into blocks. The kind of blocks and the cell count are not predetermined [18]; rather, they must be optimized for the chosen dataset. The possible functions of a block include max pooling, separable convolution, average pooling, identity mapping, convolution, and many more. Each block maps two inputs to an output feature map. Network growth is governed by three attributes, namely the number of stacked cells (N), the number of filters in the primary layer (F), and the cell architecture. To regulate the hyperparameters of the NASNet method, the MFFO algorithm was applied to it. Generally, FFO is easy to design; however, it suffers from local optima problems [19]. The flight direction and distance of osphresis foraging are not regular, and blind flight can degrade the search behaviour of the FFO method. Therefore, the MFFO technique incorporates Lévy flight into osphresis foraging to adjust the direction and distance of the FFO algorithm. A Lévy flight is a type of random walk alternating between short-distance searches and, sporadically, long-distance jumps. It therefore increases population diversity and extends the search area, which helps the FFO algorithm escape from local optima and reduces the possibility of premature convergence. In addition, a conditional probability P_a is defined for vision foraging toward the challenging optimum location, in order to enhance the search accuracy of the FFO algorithm.
The algorithmic process of the MFFO algorithm is defined as follows:
Step 1: Initialize parameters. Set the maximum number of iterations G_max and the fruit fly swarm size N, and initialize each fly's location (X_{i,0}, Y_{i,0}) and the optimum position (X_{b,0}, Y_{b,0}) arbitrarily in the interval [0, 1].
Step 2: Update the location of each fruit fly by the Lévy flight.
Step 3: Execute the FFO process.
Step 4: Generate an arbitrary number P_t.
Step 5: Compute the difference between the optimum solution (X_{bestindx,G}, Y_{bestindx,G}) and the worst solution (X_{worstindx,G}, Y_{worstindx,G}) of the population.
Step 6: Perform optimization and repeat Steps 2-5, checking whether the smell concentration is superior to the previous one; the process terminates upon reaching maximum accuracy or the fixed number of iterations G_max.
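Since the Lévy flight and vision-foraging equations are only summarized above, the steps can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the 0.01 step scale, the interpretation of P_a as a probability of steering away from the worst agent, and the helper names (`levy_step`, `mffo`) are assumptions made for illustration.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(beta=1.5, size=2, rng=None):
    """One Levy-flight step drawn via Mantegna's algorithm."""
    rng = rng if rng is not None else np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def mffo(fitness, n_flies=15, g_max=150, p_a=0.25, seed=1):
    """Minimization loop following Steps 1-6; lower fitness = stronger smell."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_flies, 2))                   # Step 1: swarm in [0, 1]^2
    scores = np.array([fitness(p) for p in pos])
    best = pos[scores.argmin()].copy()
    for _ in range(g_max):
        for i in range(n_flies):
            cand = best + 0.01 * levy_step(rng=rng)  # Step 2: Levy osphresis move
            if rng.random() < p_a:                   # Steps 4-5: vision foraging,
                worst = pos[scores.argmax()]         # assumed: steer away from worst
                cand = cand + rng.random() * (best - worst)
            c = fitness(cand)
            if c < scores[i]:                        # Steps 3/6: greedy smell test
                pos[i], scores[i] = cand, c
        best = pos[scores.argmin()].copy()           # keep the global best
    return best, float(scores.min())
```

In the sketch, the sphere function `lambda p: float((p ** 2).sum())` can stand in for the actual classifier-error fitness during a quick check of the search dynamics.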
The MFFO algorithm derives a fitness function (FF) for obtaining enhanced classifier outcomes; it assigns a positive value to denote the quality of candidate solutions. In this work, the classifier error rate to be minimized is taken as the FF, as given in Equation (3).
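Equation (3) is not reproduced here, but the stated choice of FF, the classifier error rate to be minimized, can be sketched as follows; the function name and signature are illustrative, not from the paper.

```python
import numpy as np

def fitness(y_true, y_pred):
    """Fitness = classification error rate (in the spirit of Equation (3)):
    minimizing this value maximizes classification accuracy."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))
```

For example, `fitness([0, 1, 1, 0], [0, 1, 0, 0])` yields 0.25, since one of four labels is misclassified.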

Road Classification Using DAE
At this stage, the DAE classifier is applied for the classification process. A DAE is an autoencoder (AE) with more than one hidden layer (HL) [20]. The addition of HLs in a DAE permits the AE to learn more complex patterns in the data. For an AE with a single HL, mapping the input layer to the HL is the encoder stage, and mapping the HL to the output layer is the decoder stage. In a DAE with several HLs, encoding and decoding pairs are added. The DAE infrastructure here comprises five HLs (composed of three encoding and decoding pairs). The DAE forward pass starts with the first encoder (E1) encoding the input X, the second encoder (E2) encoding the outcome of E1, and the third encoder (E3) encoding the outcome of E2. The encoder stage is expressed at the middle layer as Z = E3(E2(E1(X))).
For a single HL, the AE encoded vector h is written as h = f(W·X + b), where W stands for the weight matrix, b denotes the bias vector, and X indicates the input vector. The encoder function in forward propagation to HL 1 is developed in Equation (4).
For an AE with a single HL, the decoder is X̂ = f(W^T·h + b′); for the DAE, the decoder correspondingly develops as X̂ = D1(D2(D3(Z))), where f refers to the node activation function (AF) used on all the layers. The AF f applied to the neurons is a mathematical function acting on the resulting signal to enable or disable neurons; it maps output values into a chosen range, between −1 and 1 or 0 and 1 (depending on the AF used). Figure 2 depicts the infrastructure of the AE. The cost function of the DAE is a distance function between the input X and the reconstruction X̂; the cost, also called loss, is computed with the mean square error (MSE). The input data were normalized to between zero and one; afterward, the reconstruction on the output layer was conducted with non-linear sigmoid functions. For binary inputs, or inputs in the range between zero and one, binary cross-entropy was used as the loss function. Over the entire training set of m samples, the loss is J(w, b) = (1/m) Σ_{i=1}^{m} J(w, b, x_i, x̂_i), and the minimum loss value was computed accordingly. Back-propagation (BP) updates the biases and weights of all the nodes in all the layers to decrease the cost; the optimum cost, with the minimum loss value, is nearly 0. Afterward, in the AE training procedure, the data at the bottleneck (Z) layer is a low-dimensional representation of the encoded data. The encoder structure (Z) is fed as input to a DNN technique via transfer learning (TL); in TL, the Z-encoder structure is transferred to the AE, and its biases and weights are used for classification.
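The encoder/decoder composition Z = E3(E2(E1(X))) and the tied-weight decoder X̂ = f(W^T·h + b′) described above can be sketched in numpy. The layer widths, the sigmoid activation throughout, and the class name `TinyDAE` are illustrative assumptions; the paper's actual layer sizes are not given here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyDAE:
    """Stacked AE with three encoding layers E1-E3 and tied-weight decoders,
    mirroring Z = E3(E2(E1(X))) and X_hat = D1(D2(D3(Z)))."""
    def __init__(self, sizes=(8, 6, 4, 2), seed=0):
        rng = np.random.default_rng(seed)
        # one (W, b_enc, b_dec) triple per encoding/decoding pair
        self.layers = [(rng.normal(0, 0.1, (sizes[k], sizes[k + 1])),
                        np.zeros(sizes[k + 1]), np.zeros(sizes[k]))
                       for k in range(len(sizes) - 1)]

    def encode(self, x):
        for W, b_enc, _ in self.layers:            # h = f(W.x + b)
            x = sigmoid(x @ W + b_enc)
        return x                                   # bottleneck representation Z

    def decode(self, z):
        for W, _, b_dec in reversed(self.layers):  # x_hat = f(W^T.h + b')
            z = sigmoid(z @ W.T + b_dec)
        return z

    def mse(self, x):
        """MSE reconstruction loss between X and X_hat."""
        return float(np.mean((x - self.decode(self.encode(x))) ** 2))
```

Training by back-propagation is omitted; the sketch only shows the forward pass and the MSE cost that BP would minimize.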

Hyperparameter Tuning
Finally, the ISA technique was exploited as the hyperparameter optimizer. The ISA is a gradient-free, population-based search technique established by Mortazavi et al. [21]. Each agent of the ISA, depending on its tendency factor (τ_i), uses either the tracking or the interacting phase to update its position. During the tracking phase, the agent searches the vicinity of positions spotted by particular agents: the optimum agent X_G, the weighting agent X_W, and the optimum position of an arbitrary agent kept in the previous-optimum matrix X_P. During the interacting phase, the agent updates its position depending on pairwise information shared with another arbitrary agent. The tracking phase applies if τ_i ≥ 0.3. In the update equations, the upper-left superscripts "t + 1" and "t" denote the updated and current states of the variables, respectively; τ_i denotes the tendency factor arbitrarily selected in the range between zero and one; φ_0 signifies a coefficient that is always set to 0.4; φ_1, φ_2, and φ_3 denote acceleration coefficients picked arbitrarily in the range between zero and one; and ᵗX_P, ᵗX, and ᵗX_G represent an agent arbitrarily selected from those keeping the previous optimum positions, the current agent, and the optimum agent, respectively [22]. In addition, X_W refers to the weighting agent, determined as the weighted average of the whole population, as given below.
In this design, PS defines the population size, and f returns the objective function value of the selected agent. Moreover, µ refers to a small positive number for avoiding possible division-by-zero conditions, and is set to 1 × 10⁻⁵.
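The ISA update equations are only partially reproduced above, so the following numpy sketch is an interpretation under stated assumptions: the tracking phase (τ_i ≥ 0.3) moves an agent toward X_G, X_W, and X_P scaled by φ_0 = 0.4 and random φ_1..φ_3; the interacting phase shares pairwise information with a random partner; and X_W weights agents by the inverse of their objective value with µ = 1e-5. The names `isa_step` and `weighting_agent` are illustrative.

```python
import numpy as np

def weighting_agent(pop, f_vals, mu=1e-5):
    """Weighted average of the population; weights favour better (lower-cost)
    agents, with mu guarding against division by zero."""
    w = 1.0 / (np.abs(f_vals) + mu)
    return (w[:, None] * pop).sum(0) / w.sum()

def isa_step(pop, prev_best, x_g, f, phi0=0.4, seed=0):
    """One assumed ISA iteration: the tendency factor tau selects between the
    tracking phase (move toward X_G, X_W, X_P) and the interacting phase
    (pairwise move toward or away from a random partner)."""
    rng = np.random.default_rng(seed)
    f_vals = np.array([f(x) for x in pop])
    x_w = weighting_agent(pop, f_vals)
    new = pop.copy()
    for i, x in enumerate(pop):
        tau = rng.random()
        if tau >= 0.3:  # tracking phase
            p1, p2, p3 = rng.random(3)
            x_p = prev_best[rng.integers(len(prev_best))]
            new[i] = x + phi0 * (p1 * (x_g - x) + p2 * (x_w - x) + p3 * (x_p - x))
        else:           # interacting phase: pairwise information sharing
            j = rng.integers(len(pop))
            direction = 1.0 if f(pop[j]) < f_vals[i] else -1.0
            new[i] = x + direction * rng.random() * (pop[j] - x)
    return new
```

Iterating `isa_step` while tracking the best agent seen so far drives the population toward low-cost regions, which in the MODAE-RCM pipeline corresponds to low classifier error over the DAE hyperparameter space.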

Performance Evaluation
In this section, the road classification performance of the MODAE-RCM method is investigated utilizing a dataset comprising 12,500 samples of five types of roads, as depicted in Table 1. Figure 3 demonstrates some sample images.
The confusion matrices produced by the MODAE-RCM method on the road classification process are portrayed in Figure 4. The figure highlights that the MODAE-RCM method has categorized all the types of roads accurately.
Table 2 and Figure 5 represent the overall road classification outcomes of the MODAE-RCM approach on the entire dataset. The outcomes report that the MODAE-RCM model has proficiently recognized all five distinct kinds of roads in the applied input images. It can be noticed that the MODAE-RCM model offered an average accu_y of 99.29%, sens_y of 98.23%, spec_y of 99.56%, F_score of 98.23%, and AUC_score of 98.89%.
Table 3 and Figure 6 show the overall road classification outcomes of the MODAE-RCM method on 70% of the training (TR) database. These results indicate that the MODAE-RCM approach has proficiently recognized all five distinct kinds of roads in the applied input images. It is noted that the MODAE-RCM approach presented an average accu_y of 99.29%, sens_y of 98.23%, spec_y of 99.56%, F_score of 98.23%, and AUC_score of 98.89%.
Figure 7 portrays the overall road classification outcomes of the MODAE-RCM approach on 30% of the testing (TS) database. These outcomes indicate that the MODAE-RCM method has proficiently recognized all five distinct kinds of roads in the applied input images. It is noted that the MODAE-RCM approach rendered an average accu_y of 99.30%, sens_y of 98.25%, spec_y of 99.56%, F_score of 98.24%, and AUC_score of 98.90%.
The training accuracy (TRA) and validation accuracy (VLA) attained by the MODAE-RCM technique on the test database are exemplified in Figure 8. The experimental result denotes that the MODAE-RCM method reached maximal values of TRA and VLA; notably, the VLA is greater than the TRA. The training loss (TRL) and validation loss (VLL) attained by the MODAE-RCM method on the test database are shown in Figure 9. The simulation outcome denotes that the MODAE-RCM system achieved the lowest values of TRL and VLL; specifically, the VLL is lower than the TRL.
A clear precision-recall inspection of the MODAE-RCM technique on the test database is represented in Figure 10. The figure denotes that the MODAE-RCM approach resulted in enhanced precision-recall values for every class label. Table 5 provides the overall road classification performance of the MODAE-RCM model compared with other existing models [17].
Figure 12 reports a brief sens_y investigation of the MODAE-RCM with compared approaches. The figure represents that the SGD, Adagrad, and Adadelta models reported lower classifier results, while the RMSProp and Adamax methods exhibited reasonable sens_y values. Although Adam attained a considerable sens_y of 98.09%, the MODAE-RCM method accomplished a maximum sens_y of 98.25%.
Figure 13 portrays a brief spec_y inspection of the MODAE-RCM system with compared approaches. The figure signifies that the SGD, Adagrad, and Adadelta models reported lower classifier results, while the Adam and Adamax approaches revealed reasonable spec_y values. Although RMSProp attained a considerable spec_y of 99.15%, the MODAE-RCM method established a maximum spec_y of 99.56%.
Figure 14 reports a brief F_score inspection of the MODAE-RCM system with compared approaches. The figure denotes that the SGD, Adagrad, and Adadelta techniques reported lower classifier results, while the RMSProp and Adamax models displayed reasonable F_score values. Although Adam obtained a considerable F_score of 97.89%, the MODAE-RCM method established a maximum F_score of 98.24%.
After examining the aforementioned results, it is evident that the MODAE-RCM approach reaches maximum road classification performance.

Conclusions
In this study, a novel MODAE-RCM algorithm was formulated for accurate and automated road classification. The presented MODAE-RCM technique mainly focuses on the classification of roads into five types, namely wet, ice, rough, dry, and curvy roads. To accomplish this, the presented MODAE-RCM technique exploited the MFFO with the NASNet model for feature extraction. Finally, the ISA-DAE classifier was applied for the classification process. The exploitation of metaheuristic hyperparameter optimizers helps to improve the classification results. The experimental validation of the MODAE-RCM technique was performed utilizing a dataset comprising five road types. The simulation analysis pointed out the superior outcomes of the MODAE-RCM technique compared with other existing techniques. In the future, the performance of the MODAE-RCM approach can be further boosted via hybrid DL models.

Data Availability Statement: Data sharing is not applicable to this article as no datasets were generated during the current study.