Recognition of Traffic Sign Based on Bag-of-Words and Artificial Neural Network

The traffic sign recognition system is a support system that gives notifications and warnings to drivers. It may be effective for traffic conditions on the current road traffic system. A robust artificial intelligence based traffic sign recognition system can support the driver and significantly reduce driving risk and injury. It works by recognizing and interpreting various traffic signs using vision-based information. This study aims to recognize well-maintained, un-maintained, standard, and non-standard traffic signs using the Bag-of-Words and Artificial Neural Network techniques. This research work employs a Bag-of-Words model on the Speeded Up Robust Features descriptors of the road traffic signs. A robust Artificial Neural Network classifier has been employed to assign each traffic sign to its respective class. The proposed system has been trained and tested to determine a suitable neural network architecture. The experimental results showed high classification accuracy for traffic signs, including images with complex backgrounds. The proposed traffic sign detection and recognition system obtained 99.00% classification accuracy with a 1.00% false positive rate. For real-time implementation and deployment, this marginal false positive rate may increase the reliability and stability of the proposed system.


Introduction
Road traffic signs must be installed and placed appropriately to ensure that they are clearly visible to road users. Though traffic signs have a standard physical structure and appearance, various natural issues and human errors cause variation in color, shape, or both. Moreover, a valuable sign is often ignored by the driver because of lack of attention or unclear visibility, which puts the driver in a potentially dangerous situation [1]. Additionally, in many places, road traffic signs may be obscured; consequently, drivers may unintentionally ignore them.
Though road users can recognize and classify distinct road signs without any mistake or delay, robust and fully automated detection and recognition of road signs remains a challenge for autonomous vehicles. Currently, many researchers and vehicle manufacturing companies are designing autonomous vehicles [2]. These vehicles can operate without the intervention of a human driver. Hence, such autonomous vehicles need a robust traffic sign detection and recognition system in order to operate on the road safely.
Recognition of traffic signs has been receiving more attention in recent years due to advanced integrated systems for smart vehicles. With rising living standards, communication and transportation systems have improved, and the use of vehicles has also increased. With the increasing number of vehicles, the rate of traffic accidents has also increased. At present, road traffic accidents have become one of the most frequent causes of death worldwide [3]. By 2030, road accidents will be the fifth most common cause of death around the world [4], including in Malaysia [5]. Numerous research organizations are currently working to reduce road-related incidents by integrating transportation systems with artificial intelligence in the form of an advanced driver assistance system. An advanced driver assistance system automatically detects traffic signs using a camera mounted on the dashboard of a vehicle; this functions as a road sign recognition system. This system helps the driver to be aware of road and traffic signs and the rules of driving along the road, and it notifies the driver of these signs, which ultimately helps reduce the possibility of an accident.
An intelligent traffic sign classification system is a vital function of an intelligent transport system [6,7]. Automatic traffic sign identification involves two main stages, namely detection and recognition. The detection stage identifies the region of interest (ROI), generally by using color segmentation followed by some form of shape identification. Detected traffic sign candidates are then either recognized or rejected in the recognition stage. The recognition stage is performed with machine learning techniques such as the artificial neural network (ANN) [8–14], the support vector machine (SVM) [1,15–22], or template matching [23].
Recently, many researchers have proposed various traffic sign detection and recognition systems and achieved satisfactory results [24]. Though existing systems can achieve fast processing speed and high classification accuracy, they can only detect and recognize well-maintained and standard traffic signs and may not work properly for un-maintained and non-standard traffic signs. In many developing countries, maintaining traffic signs may not be the top priority of the respective authorities, and these signs may be affected by environmental factors such as rain, illumination, and pollution, or by damage due to accidents or human error. For instance, consider Figure 1a, where the traffic sign has faded due to continuous sunlight. Furthermore, many countries use different colors and shapes for traffic signs because of their climate and other environmental factors. For instance, Figure 1b shows the standard 'Stop' traffic sign used by various developed countries such as the USA and Canada. On the other hand, Figure 1c shows the alternative 'Stop' traffic sign that is used in Vanuatu. As can be seen in Figure 1b,c, the shape of the 'Stop' sign differs between countries.
Therefore, in this research, an intelligent and robust traffic sign detection and recognition system is developed that can identify well-maintained, un-maintained, standard, and non-standard road traffic signs. The proposed system is developed using machine learning approaches. In the proposed system, the features were extracted using visual Bag-of-Words (BoW), and the discriminative features were chosen using the k-means clustering approach. Finally, for the classification of the traffic sign, three different classifiers were employed, namely the ANN, SVM, and Ensemble subspace kNN (k-nearest neighbors) classifiers. Our experimental results showed 99.00% accuracy for traffic sign recognition using the ANN classifier.
The remainder of the paper is organized as follows. Section 2 presents the related work. The research methodology is presented in Section 3. Section 4 shows the experimental results. Section 5 presents the discussion based on the obtained results. In Section 6, the significance of our proposed method is compared with one state-of-the-art baseline. Moreover, in addition to the collected dataset, the performance of the proposed method is also evaluated on one publicly available dataset of traffic sign images [25]. Finally, this research work is concluded in Section 7.

Related Work
Various traffic sign detection and recognition methods and algorithms have been developed. In recent years, several studies have proposed intelligent traffic sign classification systems to classify ideogram-based traffic signs in real time [16,26–30]. Ohgushi et al. [27] developed a traffic sign classification system that utilized color information and Bags of Features (BoF) with an SVM classifier to classify traffic signs. Their proposed system failed to recognize traffic signs in two situations: when the traffic sign is intensively illuminated and when the traffic sign is partially occluded by an object of the same color. Some researchers investigated only the detection of traffic signs without classification [31,32], whereas others focused on both detection and recognition of traffic signs [33]. Round road sign detection on Chinese highways was proposed by Wu et al. [31]. The limitation of [31] is that it applies only to round road signs and cannot identify traffic signs of any other shape.
Wali et al. [15] proposed a method with three main phases: the first phase was image pre-processing, the second phase was detection, and the last phase was recognition. In the detection phase, they used color segmentation with shape matching. Finally, an SVM classifier was used to perform recognition, and it achieved 95.71% system accuracy. Lai et al. [34] proposed a sign recognition method for intelligent vehicles with smart mobile phones. Color detection was performed by segmentation in the hue, saturation, and value (HSV) color space. Shape recognition based on template matching was done using a similarity calculation. Optical character recognition (OCR) was applied to the pixels inside the shape border to decide whether the candidate matched an authentic sign. However, their proposed system was limited to red traffic signs only. Moreover, very few types of signs were used for classification.
Virupakshappa et al. [35] proposed a bag-of-visual-words technique with Speeded Up Robust Features (SURF) descriptors and an SVM classifier to identify traffic signs. Their experiments obtained an accuracy of 95.20%. Shams et al. [36] introduced a multi-class traffic sign recognition system based on the BoW feature model and extended it further by using a spatial histogram that incorporates the rough layout of images and greatly improves the classification. Recognition in their proposed system is performed by an SVM. A two-stage fuzzy inference model was introduced by Lin et al. [37] to detect road traffic signs in a video frame, and an SVM was employed for recognition of the road traffic signs. However, their proposed method was limited to prohibitory and warning traffic signs only.
Yin et al. [24] introduced a different technique for detection and recognition of traffic signs in real time. They used the Hough transform to identify the ROI containing the traffic sign. The rotation invariant binary pattern (RIBP) was used to extract features, and an ANN was used for traffic sign classification. Lin et al. [38] proposed a low-complexity speed limit detection and recognition process which not only supported distinctive sorts of speed limit signs from different nations but also maintained a high identification rate under severe weather. However, their proposed system was limited to speed limit traffic signs only.
Besides being a subject of active academic research, traffic sign recognition is also a technology that is being researched and deployed in industry. Many vehicle manufacturers (such as Tesla, Inc., Palo Alto, CA, USA; Continental AG, Hanover, Germany) develop technology to detect and recognize road traffic signs as a part of the smart vehicle. In 2010, the BMW-5 series (BMW, Munich, Germany) initiated the production of a traffic sign recognition system [39]. Moreover, BMW and several other vehicle manufacturers also revealed other models with similar technology [40]. Volkswagen has also introduced it on the Audi A8 [41] (Audi, Ingolstadt, Germany). Furthermore, Mercedes-Benz has developed traffic sign recognition systems on their E and S class vehicles. Additionally, Google (Santa Clara, CA, USA) has developed an automotive technology which lets a vehicle drive itself. It combines information stored in its map database with information collected from its real-time environment. Google's autonomous vehicle is able to operate safely in complex urban situations [42]. However, there was one incident involving Google's autonomous vehicle, in which a driver ran a red light and collided with the passenger-side door of the vehicle [43]. Recently, the Tesla team announced that all of their newly manufactured vehicles will possess full self-driving hardware [44].
The existing academic literature shows that there are a variety of traffic sign detection and recognition systems. However, these existing systems are trained and tested on good quality images. Moreover, these systems are specifically designed for developed countries such as America, Canada, Germany, and Finland. In these countries, traffic signs are regularly and strictly maintained by the relevant authorities. Hence, the existing systems can be expected to work better in developed countries where these signs are maintained regularly. However, in many developing countries, these signs are not maintained on a top priority basis. Moreover, different countries use different symbols for the same purpose (as shown in Figure 1b,c). The aforementioned studies did not consider this issue of non-standardization of traffic signs among various countries or even within one country (such as Malaysia, where different symbols are used for the pedestrian sign, as shown in Figure 2). Therefore, in this study, an intelligent, efficient, and effective traffic sign recognition system is developed that can detect and recognize un-maintained and non-standard traffic signs.

Research Methodology
The research methodology of the proposed system is depicted in Figure 3. The figure shows the learning and classification phases of the traffic sign recognition classifier. In the learning phase, various traffic sign images were used to train the classifier. Features were extracted from these images to create a master feature vector. This master feature vector is then fed to the classifier as input to construct the classification model. In the recognition or classification phase, a traffic sign is detected in the image, and features are then extracted from the detected region to convert the detected sign into a feature vector. Finally, the constructed classifier recognizes the detected traffic sign. This process is discussed in more detail in subsequent sections.

Acquisition of Image
The sample images were collected with a low-cost onboard digital camera (Canon Power Shot SX530 HS (Canon Inc., Tokyo, Japan)) placed in a moving vehicle. Sample images were captured under several environmental conditions at different route locations in Malaysia. As vehicles in Malaysia drive on the left-hand side of the road, the onboard digital camera was mounted on the left-hand side of the dashboard. This camera position is suitable for capturing images of traffic signs on the left side. The main objective of this segment is to build a traffic image database. It should be noted that our collected image dataset includes standard, non-standard, maintained, and un-maintained traffic sign images. The training and testing images were captured in all kinds of weather, such as rainy, sunny, and cloudy. The images were captured at various times of the day and evening, ranging from 7:00 AM to 7:00 PM. The frame rate of the captured real-time images was 29 frames per second at an average vehicle speed of 60 km/h.
Our image dataset contains 12 different classes of traffic signs, namely Caution! Hump, Give Way, Towing Zone, Traffic Lights Ahead, No Entry, Stop, Speed Limit, Pedestrian Crossing, Keep Left Curve Chevron, Keep Right Curve Chevron, No Stopping, and No Parking. These 12 classes of traffic signs were captured because these signs have different colors, shapes, and pictograms throughout Malaysia. Moreover, these signs are frequently installed in many places. For each class, 100 samples were used for training, giving 1200 training images. For testing, 1200 real-time test images were used to detect and recognize the class of each traffic sign. Hence, overall, our dataset comprises 2400 training and testing images.

Traffic Sign Detection Phase
To detect a traffic sign in a captured real-time image, image pre-processing was first performed to eliminate the noise of unwanted background, normalize the intensity of the different elements of the image, eliminate reflections, and mask image portions. Afterward, the histogram equalization technique was employed to improve the image quality in terms of brightness and contrast, as shown in Figure 4.
Finally, the traffic sign was detected in the captured image using the detection phase shown in Figure 5. As shown there, the quality of the raw input image was enhanced using histogram equalization, and the image was read in color and grayscale modes. A median filter was then used to remove noise from the grayscale image, and the grayscale image was converted into a binary image using a threshold level of 0.18. This threshold level was used because it gave the best performance for sign detection. Afterward, small objects were removed from the binary image. Additionally, the properties of the target traffic sign, such as center, width, and height, were calculated using shape measurement to determine the ROI. Finally, based on the calculated ROI, the detected traffic sign was extracted from the original red, green, blue (RGB) image in color mode.
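The thresholding, small-object removal, and ROI measurement steps can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the `min_area` parameter, the use of 4-connectivity, the choice of the largest remaining region as the ROI, and the assumption that sign pixels lie above the 0.18 threshold are all illustrative assumptions. Histogram equalization and median filtering are omitted for brevity.

```python
from collections import deque

def binarize(gray, level=0.18):
    """Threshold a grayscale image (rows of floats in [0, 1]) at `level`.
    Assumption: foreground pixels are those brighter than the level."""
    return [[1 if px > level else 0 for px in row] for row in gray]

def connected_components(binary):
    """Return 4-connected foreground regions as lists of (row, col) pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                queue, pixels = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(pixels)
    return regions

def detect_roi(gray, level=0.18, min_area=4):
    """Drop regions smaller than `min_area` (hypothetical parameter) and
    return center, width, height, and bounding box of the largest one."""
    regions = [p for p in connected_components(binarize(gray, level))
               if len(p) >= min_area]
    if not regions:
        return None
    pixels = max(regions, key=len)
    rows = [y for y, _ in pixels]
    cols = [x for _, x in pixels]
    top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)
    return {
        "center": ((top + bottom) / 2, (left + right) / 2),
        "width": right - left + 1,
        "height": bottom - top + 1,
        "box": (top, left, bottom, right),
    }

# Toy frame: a bright 3x3 "sign" plus one isolated noisy pixel.
frame = [
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    [0.05, 0.90, 0.90, 0.90, 0.05, 0.05],
    [0.05, 0.90, 0.90, 0.90, 0.05, 0.30],
    [0.05, 0.90, 0.90, 0.90, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
]
roi = detect_roi(frame)
```

In this toy frame the isolated pixel is discarded as a small object, and the returned bounding box could then be used to crop the sign from the original color image.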
An example of the detection system is shown in Figure 6. Here, Figure 6a shows an image frame with a complex background, and Figure 6b shows an image frame with an ordinary background. As shown, the detection phase of our proposed system can extract the ROI from both image frames as well as the target traffic sign candidates.

Traffic Sign Recognition Phase
Once the detection phase detects a traffic sign candidate in the captured image, that candidate is passed to the recognition phase to be classified into a specific class. Two main steps are involved in recognizing or classifying the detected traffic sign, namely feature extraction and classification. These steps are discussed in detail in subsequent sections.

Feature Extraction
The process of feature extraction and selection obtains the useful parts of an image and represents them in a compact feature vector [45]. To extract useful features from a detected traffic sign, BoW was employed. BoW uses SURF and k-means clustering to obtain the most discriminative features of an image.
In computer vision, the SURF [46] approach is used to extract local feature descriptors. This approach can be combined with a machine learning classifier to identify an object in an image. The technique was developed as a variant of the traditional Scale Invariant Feature Transform (SIFT) descriptor and is much faster than traditional SIFT. Feature extraction using the SURF descriptor is the most important step of the proposed traffic sign recognition system. SURF delivers the sign of the Laplacian, position, orientation, and scale, as shown in Figure 7.
For descriptor and interest point extraction, SURF involves multiple steps of image feature calculation. To find the interest points, SURF uses a Hessian-based blob detector. The determinant of the Hessian matrix expresses the extent of the response, which is a measure of the local change around the area. In the following stages, the raw images are processed and the results are used to improve the system performance. The Hessian matrix is defined in Equation (1) [47].

\[
H(\mathbf{x}, \sigma) =
\begin{bmatrix}
L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\
L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma)
\end{bmatrix}
\tag{1}
\]

where, in Equation (1), $L_{xx}(\mathbf{x}, \sigma)$ represents the convolution of the image with the second derivative of the Gaussian. An essential step of SURF is non-maximal suppression of the determinants of the Hessian matrices. The convolutions are very expensive to compute; hence, they are approximated and accelerated using approximate kernels and integral images. An integral image $I_{\Sigma}(\mathbf{x})$ is an image in which each point $\mathbf{x} = (x, y)^T$ stores the sum of all pixels of the input image $I$ in the rectangular area between the origin and $\mathbf{x}$, as shown in Equation (2).

\[
I_{\Sigma}(\mathbf{x}) = \sum_{i=0}^{i \le x} \sum_{j=0}^{j \le y} I(i, j)
\tag{2}
\]
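To illustrate why integral images make the box-filter approximations cheap, the following minimal sketch builds an integral image and then obtains the sum over any rectangle with at most four lookups. The function names and the nested-list image representation are assumptions chosen for illustration; this is not the SURF implementation used in the study.

```python
def integral_image(img):
    """Return S where S[y][x] is the sum of img over rows 0..y and columns 0..x."""
    h, w = len(img), len(img[0])
    S = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Running sum of the current row plus the integral value directly above.
            S[y][x] = row_sum + (S[y - 1][x] if y > 0 else 0)
    return S

def box_sum(S, top, left, bottom, right):
    """Sum of the original pixels in the inclusive rectangle, using S only."""
    total = S[bottom][right]
    if top > 0:
        total -= S[top - 1][right]
    if left > 0:
        total -= S[bottom][left - 1]
    if top > 0 and left > 0:
        total += S[top - 1][left - 1]
    return total

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
S = integral_image(img)
lower_right = box_sum(S, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
```

Because the cost of `box_sum` does not depend on the rectangle size, box filters of any scale can be evaluated in constant time per pixel.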
In an image representation using the BoW model, an image may be treated as a document. Similarly, a "word" in an image also needs to be defined. To achieve this, three phases are usually required: feature detection, feature description, and clustering [48]. A simple definition of the BoW for an image may be the "histogram representation of an image based on independent features" [49]. BoW is used to convert the 64-dimensional descriptors into histograms of codes, which associates the SURF descriptors with their respective images. If the raw SURF descriptors were used as the input to the ANN, the ANN might be unable to learn the decision boundaries. The BoW output is a set of codes, one for every descriptor, and a sparse histogram of those codes represents every traffic sign distinctively.
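The encoding step can be sketched as follows, assuming a codebook of cluster centroids has already been learned. The function names, the toy two-dimensional descriptors, and the normalization of the histogram to sum to one are illustrative assumptions; in the proposed system the descriptors are 64-dimensional SURF vectors and the codebook has 200 entries.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_code(descriptor, codebook):
    """Index of the codebook centroid closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Count how often each code occurs and normalize the counts (assumed)."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_code(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy example: three codes, four descriptors from one image.
codebook = [[0, 0], [10, 10], [0, 10]]
descriptors = [[1, 1], [0, 2], [9, 9], [11, 10]]
hist = bow_histogram(descriptors, codebook)
```

The resulting fixed-length histogram, rather than the variable number of raw descriptors, is what would be passed to the classifier.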
K-means clustering is commonly used in the BoW process; finding the centroids of the clusters is the most critical step. K-means clustering [50] is a vector quantization technique. It originated in signal processing, but it is now also extensively used for image classification. The main objective of k-means clustering is to partition n observations into k clusters, whereby every observation belongs to the cluster with the nearest mean, as shown in Figure 8. Clustering may be implemented in several ways, most of which are iterative computational procedures. In this procedure, preliminary centroid mean values are provided for use in an assignment stage and an update stage. The assignment stage assigns every observation to the closest mean, which minimizes the Within-Cluster-Sum-of-Squares (WCSS); the update stage then computes a new mean value for each cluster from the observations assigned to it. These two stages are repeated until all observations have been considered or the number of iterations reaches the user-specified value. The k-means++ technique in MATLAB (MathWorks, Natick, MA, USA) is used to determine the preliminary mean values. A code is assigned once each centroid is found. For our proposed system, k was set to 200, which means that every sign is encoded with 200 features to identify the class of the detected sign. The value k = 200 was used because it showed the best-performing results.
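The assignment and update stages described above can be sketched as follows. This is an illustrative example under stated assumptions, not the MATLAB routine used in the study: the initial centroids are simply passed in by the caller (the k-means++ seeding is not reproduced), the function names are hypothetical, and the four toy points are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iteration: assign to nearest mean, then recompute means."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment stage: each observation goes to the closest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no assignment changed, so the procedure has converged
        labels = new_labels
        # Update stage: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data: two well-separated groups in 2-D (made-up values).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[2]])
```

In the setting of the paper, `points` would hold the 64-dimensional SURF descriptors and 200 centroids would be requested; the toy example uses two clusters only to keep the behavior easy to follow.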
Afterward, each new descriptor is associated with the closest centroid by applying the Euclidean distance function [51]. The centroid at the minimum Euclidean distance is the nearest centroid to the new data. This technique of finding the nearest centroid is also called the minimum Euclidean distance measure [52]. The data is then assigned the code that is linked with that centroid, as depicted in Figure 9.
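The encoding step can be sketched as a nearest-centroid lookup followed by a histogram of the resulting codes. The sketch below is only illustrative: the helper names are hypothetical, the three centroids and four descriptors are made-up 2-D values rather than 64-dimensional SURF descriptors, and normalizing the histogram by the number of descriptors is an assumption, not a detail given in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_code(descriptor, centroids):
    """Index (code) of the centroid at minimum Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda j: euclidean(descriptor, centroids[j]))


def bow_histogram(descriptors, centroids):
    """Count how often each visual word occurs, then normalize (assumed)."""
    hist = [0.0] * len(centroids)
    for d in descriptors:
        hist[nearest_code(d, centroids)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy vocabulary of three visual words and four descriptors (made-up values).
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 0.5), (1.0, 0.0)]
hist = bow_histogram(descriptors, centroids)
```

With a 200-word vocabulary, the same procedure yields the 200-element feature vector that is passed to the classifiers.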
Figure 10 shows the BoW feature vector visualization of one traffic sign, where the visual word index runs to 200 because every sign is encoded with 200 features.

Classification of Traffic Sign
The outcome of the abovementioned feature extraction process is a numeric feature vector. This vector is fed as input to a machine learning classifier to classify the traffic sign into its respective class. In our proposed system, three different machine learning classifiers were employed, namely SVM, ANN, and Ensemble subspace kNN, to evaluate which one is more accurate and robust on our dataset. These three classifiers were chosen because Delgado et al. [53] tested 179 classifiers and concluded that SVM, ANN, RF, and kNN produce better results in image classification.

SVM
The SVM decision model is based on statistical learning theory. It obtains reasonable accuracy in image processing and other related applications. SVM learns from training data, where each instance consists of n feature values followed by its class label. In SVM, classes are separated by an optimal hyperplane that maximizes the margin between the classes. The training instances closest to the hyperplane are known as support vectors, and the instances of different classes lie on opposite sides of the hyperplane [54].

Ensemble Subspace kNN
In Ensemble subspace kNN classifiers, each member of the ensemble has access to a random subset of the features. This random feature subset is assigned to each member by the random subspace algorithm. The outputs of these multiple nearest-neighbor classifiers are combined into the final decision by majority voting [55].
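A minimal sketch of the random subspace idea is given below. It is not the MATLAB classification learner implementation: each member is simplified to a 1-nearest-neighbor classifier, the ensemble size, subspace size, and seed are arbitrary assumptions, and the four-feature training samples are made up for illustration.

```python
import random
from collections import Counter


def nearest_label(x, train_x, train_y, features):
    """1-NN prediction using only the selected feature indices."""
    def dist(sample):
        return sum((sample[f] - x[f]) ** 2 for f in features)
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i]))
    return train_y[best]


def subspace_knn_predict(x, train_x, train_y,
                         n_members=5, subspace_size=2, seed=0):
    """Each member sees a random feature subset; majority vote decides."""
    rng = random.Random(seed)
    n_features = len(train_x[0])
    votes = []
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        votes.append(nearest_label(x, train_x, train_y, features))
    return Counter(votes).most_common(1)[0][0]


# Toy training set with four features per sample (made-up values).
train_x = [(0, 0, 0, 0), (1, 0, 1, 0), (9, 9, 9, 9), (8, 9, 8, 9)]
train_y = ["Stop", "Stop", "No Entry", "No Entry"]
pred_a = subspace_knn_predict((1, 1, 1, 1), train_x, train_y)
pred_b = subspace_knn_predict((8, 8, 8, 8), train_x, train_y)
```

Because every member sees only part of the feature vector, the members make partly different errors, which is what the majority vote is meant to exploit.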

ANN
The ANN is implemented using the neural network pattern recognition tool in MATLAB. The implemented ANN is a two-layer feedforward neural network. A sigmoid transfer function is applied in the hidden layer, and a softmax transfer function is used in the output layer. The multilayer ANN is trained with the backpropagation learning algorithm. The number of hidden neurons is set to 100 because this configuration produced the best results in terms of processing time and error percentage, as shown in Table 1. As the input layer contains 200 neurons and the output layer consists of 12 neurons, the number of hidden-layer neurons was kept between these limits [56]. The number of output neurons is set to 12 because our dataset comprises 12 different traffic signs. The main objective of training is to adjust the weights so that each input produces the desired output. The proposed ANN architecture is shown in Figure 11.
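The forward pass of such a 200-100-12 network can be sketched as follows. This is an illustration only, not the trained MATLAB network: the weights are random placeholders, a logistic sigmoid is assumed for the hidden layer, training by backpropagation is omitted, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid (assumed form of the hidden-layer transfer function)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Numerically stable softmax: outputs are positive and sum to 1."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases):
    """Weighted sum of the inputs plus a bias for each neuron in a layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def forward(x, w1, b1, w2, b2):
    """Input -> sigmoid hidden layer -> softmax output layer."""
    hidden = [sigmoid(z) for z in dense(x, w1, b1)]
    return softmax(dense(hidden, w2, b2))


N_IN, N_HIDDEN, N_OUT = 200, 100, 12  # layer sizes stated in the text
rng = random.Random(0)
# Random placeholder weights; a real network would learn these by training.
w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [[rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b2 = [0.0] * N_OUT

x = [rng.random() for _ in range(N_IN)]  # stand-in for a BoW feature vector
probs = forward(x, w1, b1, w2, b2)
predicted_class = max(range(N_OUT), key=probs.__getitem__)
```

The softmax output can be read as one score per traffic sign class, and the index of the largest score is taken as the predicted class.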

Experimental Setup
All experiments were run in MATLAB; the image processing toolbox, computer vision toolbox, classification learner app, and neural network toolbox were used to implement the system. A computer with an Intel Core-i5 (Intel, Santa Clara, CA, USA) 2.50 GHz CPU and 4 GB of RAM was used to run the program. A Bag-of-Features was created from the 12 image sets: feature point locations were selected using the SURF detector, and SURF features were then extracted from the selected locations. The number of features was balanced across all image sets to improve the quality of clustering. Subsequently, the algorithm used the most discriminative 3141 features from each image set, giving 37,692 extracted features overall. K-means clustering was then used to create a 200-word visual vocabulary (k = 200). Clustering converged after 29 of a maximum of 100 iterations (~0.60 s/iteration). Finally, the 12 image sets were encoded using the Bag-of-Features. The resulting feature vector was then fed as input to the ANN, SVM, and Ensemble subspace kNN to recognize the class of the detected road sign. Accuracy was used as the performance metric for the classification task.

Experimental Results
Our experimental results showed that ANN outperformed SVM and Ensemble subspace kNN. Moreover, there is only a marginal difference between the results obtained by SVM and Ensemble subspace kNN.
The SVM classifier achieved 92.30% accuracy when it was tested on real-time images. The detailed confusion matrix and misclassification error rates of the SVM classifier are shown in Figure 12a,b, respectively. The SVM classifier obtained its highest misclassification error rates of 16%, 14%, and 13% for the classes 'No Waiting', 'Keep Left Curve Chevron', and 'No Entry', respectively. Its lowest misclassification error rate of 1% was obtained for the classes 'Stop' and 'Towing Zone'.
The ANN classifier achieved 99.00% accuracy when it was tested on real-time images. The detailed confusion matrix and misclassification error rates of the ANN classifier are shown in Figure 14a,b, respectively. Figure 14a shows the correct classification results in green squares and the incorrect classification results in red squares. In addition, the lower-right blue square shows the overall accuracy of the ANN classifier. As shown in Figure 14b, the ANN classifier obtained its highest misclassification error rate of 7% for the class 'No Waiting'. Moreover, it obtained a 2% misclassification error rate for the classes 'Keep Left Curve Chevron', 'No Entry', and 'Speed Limit'. For all remaining classes, the ANN showed a 0% misclassification error rate.
As shown in the experimental results, the ANN classifier outperformed the SVM and Ensemble subspace kNN classifiers. Hence, the results of the ANN classifier were used to compare our proposed system with eight baselines [1,15,17,24,35,37,57,58]. The evaluation showed that our proposed system outperformed all eight baselines. The detailed results are shown in Table 2.
Here, it can be seen that all of the compared studies used traffic sign datasets to classify traffic signs with machine learning approaches. However, our proposed method of traffic sign detection and recognition differs from the existing studies. In our proposed system, we employed the state-of-the-art BoW model for feature extraction. This model uses SURF to compute the feature descriptors. Moreover, to determine the best interest points of a traffic sign, our proposed method uses the k-means clustering approach. Finally, we used the ANN, which is a robust classifier compared to other related classifiers. Figure 15 shows some real-time experimental results of our developed system using the ANN classifier.
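The overall accuracy and the per-class misclassification error rates reported for each classifier can be derived from a confusion matrix. The sketch below shows one way to do this, assuming that rows hold the true classes and columns the predicted classes; the 3-class matrix is made up for illustration and does not reproduce the values in the figures.

```python
def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def misclassification_rates(cm):
    """Per-class share of samples assigned to a wrong class (row-wise)."""
    return [(sum(row) - row[i]) / sum(row) for i, row in enumerate(cm)]


# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = [[9, 1, 0],
      [0, 10, 0],
      [2, 0, 8]]
accuracy = overall_accuracy(cm)
rates = misclassification_rates(cm)
```

For this toy matrix, 27 of 30 samples are classified correctly, and the first and third classes have error rates of 10% and 20%, respectively.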
Table 2. Comparison of existing methods and the proposed method.

Reference          Overall Accuracy (%)   Processing Time (s)
[1]                97.60                  -
[15]               95.71                  0.43
[17]               93.60                  -
[24]               98.62                  0.36
[35]               95.20                  -
[37]               92.47                  -
[57]               90.27                  0.35
[58]               86.70                  -
Proposed method    99.00                  0.28

Discussion
According to the 'no free lunch' theorem [59], no single machine learning algorithm performs best in all application areas. Hence, a variety of decision models should be tested. Therefore, we evaluated the performance of three different classifiers, namely SVM, Ensemble subspace kNN, and ANN. Our experimental results showed that the SVM performance was lower than that of the Ensemble subspace kNN and ANN classifiers. Although the quadratic SVM and the Ensemble subspace kNN produce good classification results, on our dataset the ANN showed the best performance for the proposed system, as shown in Figure 16. Therefore, a traffic sign recognition system based on BoW and ANN (instead of BoW and SVM, or BoW and Ensemble subspace kNN) is recommended for real-time implementation.
The performance of SVM depends on the choice of kernel [60]. Selecting a proper SVM kernel and kernel function parameters, such as the width or sigma parameter, may further increase the SVM performance [61-63]. Moreover, the optimal design of a multi-class SVM classifier is still a challenging task for many researchers. There may be various reasons behind the poor performance of the Ensemble subspace kNN classifier. One is that many Ensemble methods for weak learners such as kNN do not improve the classification performance [64]. Moreover, the Ensemble subspace kNN is very sensitive to the input features of images, and it assumes that all input features are independent of each other [55]. Another reason behind the poor performance of kNN is that it does not consider the issue of a soft boundary, where an image appears on either side of a class boundary. There may also be several reasons why the ANN classifier performed best. For instance, ANN is a non-parametric model that does not need much statistical computation. Moreover, such decision models are very useful for complex or abstract problems such as image classification. Another reason may be that the ANN learns the best feature weights iteratively with the help of input neurons, hidden neurons, and output neurons. This kind of underlying architecture makes ANN a robust classifier compared to other statistical classifiers [65,66].
Moreover, in our classification results, we found that the traffic signs belonging to the classes 'No Waiting' and 'No Stopping' showed the highest misclassification error rates with the SVM and Ensemble kNN classifiers. This is because the shape and color of these two signs closely resemble each other. Hence, the SVM and Ensemble kNN classifiers could not accurately classify these two signs. In contrast, the ANN classifier accurately classified the 'No Stopping' class, and its misclassification of the 'No Waiting' class was lower than that of SVM and Ensemble kNN. In addition, the traffic signs belonging to the class 'Pedestrian Crossing' have multiple variations in color and shape in our dataset; therefore, the SVM and Ensemble subspace kNN classifiers obtained high misclassification error rates. Conversely, the ANN classifier achieved reasonable classification accuracy with the lowest misclassification error rate for the same 'Pedestrian Crossing' class. A related case is the class 'No Entry', where the SVM and Ensemble subspace kNN classifiers showed high misclassification error rates, whereas the ANN obtained the lowest misclassification error rate.

Significance of Dataset and Proposed Approach
The BoF feature extraction approach with an ANN classifier is proposed in this study to detect and recognize non-standard traffic signs. To examine the significance of our proposed method, one baseline was created from the dataset collected for this research, namely the histogram of oriented gradients (HOG) technique proposed in [67,68]. In [67], a hybrid of HOG and SURF feature descriptors was used with an ANN classifier to select the most important and discriminative features. In [68], a HOG-based feature extraction technique with SVM and Random Forest classifiers was used to recognize traffic sign images. Additionally, the performance of our proposed approach was evaluated using a publicly available dataset of traffic sign images (BTSD). For the experiments with the BTSD dataset, 12 classes of Belgian traffic sign image samples were taken into consideration because they are also commonly used in Malaysia. For comparison, we conducted nine additional experiments to show the significance of the dataset and the proposed approach. The findings of these experiments are shown in Figure 17 with confusion matrices. These experiments were conducted to measure the overall accuracy of all three classifiers using the baseline approach, the collected dataset, and the BTSD dataset. The baseline accuracy was compared with the accuracy of the proposed method; the results are shown in Table 3. As shown there, the proposed method outperformed the baseline method on both datasets. Moreover, among the classifiers, the ANN outperformed the others.

Conclusions
Although signs constitute a part of the visual language, the recognition of traffic signs is a part of intelligent transportation systems. A traffic sign recognition system can be used to warn or notify road users, or both, of restrictions that may be in effect under the present traffic conditions. In this research, an intelligent traffic sign detection and recognition system was developed to detect well-maintained, un-maintained, standard, and non-standard traffic signs using state-of-the-art image classification techniques. Existing traffic sign detection and recognition systems are mostly trained and tested on well-maintained, good-quality images of traffic signs. However, in real-time conditions, traffic signs may not be well-maintained (especially in developing countries). Therefore, in this research, we developed a system that addresses this issue. In our system, traffic sign detection was performed using histogram equalization with several thresholding and shape measurement processes. BoW with ANN was employed to recognize the traffic signs. In addition, the quadratic SVM and Ensemble subspace kNN classifiers were tested and evaluated. Ultimately, the ANN showed the best accuracy of 99.00% in recognizing the traffic signs. Furthermore, the proposed system outperformed eight existing baselines. As the recognition of a traffic sign is vision-based, traffic signs that are obscured by vehicles, trees, or even another sign might not be recognized.

Figure 1. Traffic sign (Stop): (a) faded stop sign used in Malaysia; (b) stop sign used in USA, Canada, Australia; (c) stop sign used in Vanuatu.

Figure 3. Proposed system for traffic sign detection and recognition.

Figure 5. Block diagram of traffic sign detection phase.

Figure 12. Support vector machine (SVM) classifier result: (a) confusion matrix; (b) misclassification error rate.
The Ensemble subspace kNN classifier achieved 92.70% accuracy when it was tested on real-time images. The detailed confusion matrix and misclassification error rates of the Ensemble subspace kNN classifier are shown in Figure 13a,b, respectively. The Ensemble subspace kNN classifier obtained its highest misclassification error rates of 25%, 16%, and 15% for the classes 'No Entry',
Symmetry 2017, 9, 138

Figure 17. Confusion matrices of nine additional experiments.

Table 1. Neural network performance with number of hidden neurons in hidden layer.

Table 3. Comparison of accuracy results of baseline approach and proposed approach.