An Adaptive Approach for Multi-National Vehicle License Plate Recognition Using Multi-Level Deep Features and Foreground Polarity Detection Model

Featured Application: We present an adaptive framework for the recognition of multinational vehicle license plates. To keep it general, the approach does not require any prior knowledge of license plate layout and, furthermore, does not use training data from all of the targeted countries. These properties make the approach well suited to obtaining the registration identity of multinational vehicles.

Abstract: License plate recognition (LPR) systems play a vital role in intelligent transport systems for building smart environments. Numerous country-specific methods have been proposed successfully for LPR, but there is a need for a generalized solution that is independent of license plate layout. The proposed architecture comprises two important LPR stages: (i) license plate character segmentation (LPCS) and (ii) license plate character recognition (LPCR). A foreground polarity detection model based on a Red-Green-Blue (RGB) channel color map is proposed in order to segment and recognize the LP characters effectively at the LPCS and LPCR stages, respectively. Further, a multi-channel CNN framework with a layer aggregation module is proposed to extract deep features, and a support vector machine is used to produce target labels. Multi-channel processing with merged features from different-level convolutional layers makes the output feature map more expressive. Experimental results show that the proposed method achieves a high recognition rate for multinational vehicle license plates under various illumination conditions.


Introduction
Automatic license plate recognition (ALPR) is a significant topic in the field of intelligent transport systems (ITS) and remains a challenging research problem in image processing and computer vision. ALPR is widely used to extract the license plate registration number from digital images for vehicle identification. It does not require any additional hardware, such as a transmitter or responder, to detect vehicles, because every vehicle is identified by its license plate. Numerous potential applications have urged researchers to pay more attention to developing an efficient and sophisticated ALPR system. Law enforcement agencies widely use such systems for traffic monitoring, congestion control, security of borders and restricted areas, and detection of suspicious or stolen vehicles. LPR systems are also used in smart parking areas and smart toll stations for flexible vehicle entry and exit, smooth traffic flow, fee or fine deduction, and enhanced security. Moreover, they can be installed in intelligent vehicles to recognize the identity of neighboring vehicles for communication purposes.
License plate detection (LPD) and license plate recognition (LPR) are the two major modules of an ALPR system. The LPR module can be further split into license plate character segmentation (LPCS) and license plate character recognition (LPCR). In recent years, focused work has appeared on multinational ALPR systems, but only for license plate detection and verification [1-5]. Every module has its own importance, but LPR is comparatively more important because most of the required information is extracted at this stage, and it is a difficult task, specifically in the case of multi-style LPs for multinational vehicles. The whole system is useless if the complete registration number cannot be obtained at the LPR stage. To complete the system, the proposed work addresses the license plate recognition part, which is further split into LPCS and LPCR stages.
Most existing work targets only specific types of LPs, and its effectiveness is limited because it cannot be applied to another region. Moreover, a change in license plate regulations would lead to the failure of such specific methods. Various countries do not have standardized LPs and also allow customized license plates, which adds further complications. In addition, many countries have considerable diversity in their license plates. Therefore, there is a need to develop a generalized framework for the recognition of multinational vehicle license plates to address the above-mentioned issues. Generally, the challenges involved in effective character segmentation and recognition of license plates are illumination variance, rotation, shadow, and border-touched characters due to screws or noise, but more challenges arise when dealing with multinational vehicle license plates. These have different background and foreground colors, complex backgrounds containing different shapes and writings, and different shapes and sizes of LPs and characters, as shown in Figure 1.
A unique part-based approach is developed that gives very promising results for character segmentation of multinational vehicle license plates despite the above-mentioned challenges. An RGB color-based foreground polarity detection model and a character height estimation filter are proposed to accomplish the license plate character segmentation task effectively. The salient features of the proposed segmentation method are its handling of multi-style backgrounds, thresholding under various illumination conditions, skew correction, and border-touched characters for multinational VLPs. For the recognition of segmented characters, a multi-channel CNN framework with a layer aggregation module is proposed to extract deep features, and a support vector machine is used as the classifier to produce target labels.
Our major contributions in this article are: (i) the proposed generalized method does not require any prior knowledge of LP layout, and no training data is required at the LPCS stage; furthermore, we do not use training data from all targeted countries to extract deep features at the LPCR stage; (ii) a color-based foreground polarity detection framework is established that contributes at both the LPCS and LPCR stages, and this algorithm classifies the foreground from the background based on their colors even under various illumination conditions; (iii) multi-channel processing with a concatenated output feature vector is introduced in order to obtain enhanced original image information; and (iv) an improved CNN is proposed that merges different-level convolutional features to obtain a more expressive output feature map.
The remainder of the paper is arranged as follows. Section 2 presents the related work in detail. Section 3 explains the proposed method, and Section 4 provides the experimental results and a comparison with the latest methods. Section 5 presents the concluding remarks.

Related Work
Generally, connected component analysis (CCA) with binary thresholding, projection-based analysis, maximally stable extremal regions (MSER), sliding windows and many CNN-based methods have been extensively used for character extraction from license plates.
The research conducted in [6,7] claims to address multinational vehicle LPR but was only tested on Israeli and Bulgarian LPs. A two-stage character segmentation approach [8] is proposed for Chinese LPs using CCA and a bank of harrow-shaped filters (HSF). CCA and a Kohonen neural network [9] are used to segment characters from Indian LPs. CCA and contour detection are used for character segmentation in Bangladeshi and Iraqi license plates [10-12]. The CIE-LAB color space, Otsu segmentation and CCA are used to extract characters from Pakistani vehicle license plates [13]; this method is suitable only for highly contrasting images and is very sensitive to shadow. Projection-based methods [14,15] have also been used for LPCS in China and Iran, respectively. Usually, this technique is used with one of the binarization methods (Otsu, adaptive, MET, etc.): first the image is converted to binary, then horizontal and vertical projections are used to separate the characters. Maximally stable extremal regions (MSER) is also a well-known method for LPCS, as used in [16,17] for Chinese license plates. The sliding window technique [18,19] has also been used for America and Iran, but this method requires prior knowledge of the window size and is computationally complex. In recent years, many CNN-based approaches have been introduced to accomplish segmentation-free LPR, as discussed in [20,21]. A recurrent neural network with connectionist temporal classification is used in [20], while a YOLO detector is used in [21]; the target countries are China and Brazil, respectively. Robust license plate recognition is proposed in [22] by introducing synthetic images to train a convolutional neural network. A multi-task convolutional neural network [23] is proposed for license plate recognition with better accuracy and lower computational cost.
Once segmentation is done successfully, the next step is to recognize the characters separately. Generally, four approaches are used to accomplish the recognition task. The simplest and most straightforward method is template matching [24,25], which was widely used in early work, but this method is not flexible enough to handle different fonts, noise, rotation and thickness changes. The artificial neural network [12,26-28] is another very popular approach that has been widely used to recognize license plate characters as well as for many other classification tasks. The SVM classifier [13,16,29,30] with various feature extraction methods is also used to perform recognition tasks, as SVM is a strong and fast classifier for real-time applications. Extreme learning machines (ELMs) [31] and the hidden Markov model (HMM) [32] are also used as license plate recognition classifiers. Conventional low-level features are now considered outdated owing to the revolution in computer vision brought by deep learning methods. Deep convolutional neural network (CNN) architectures automatically learn a hierarchy of discriminative features that richly describe image content. In recent years, deep learning frameworks [20,21,33-35] have also been used in ALPR systems because of their powerful recognition capability.
None of the methods mentioned above (CCA, projection, MSER, sliding window) can handle multinational LPs directly. Some CNN-based methods are capable of this task, but they require a lot of training data, which is not feasible in a multinational scenario. Our proposed recognition framework is able to recognize license plate characters of different shapes and styles, with different background/foreground colors, under various illumination conditions.

Proposed Methodology
The proposed framework comprises license plate character segmentation (LPCS) and license plate character recognition (LPCR) modules. The proposed architectures of these modules are discussed in detail in this section.

License Plate Character Segmentation
Figure 2 shows the complete process of license plate character segmentation. As discussed above, we use a part-based approach to accomplish this task. The first objective is to segment the region of interest (ROI) in order to reduce the image processing area and to discard a lot of redundant information. In the second part, the proposed character height estimation filter, along with connected component analysis (CCA), is applied to extract the required objects. These blocks are discussed below in detail.

The LP image detected by [1] is fed to part I for foreground and background classification. Distinguishing the background from the foreground is one of the most difficult and important steps in object segmentation. The proposed method strongly depends on the identification of the background/foreground color and polarity, as the subsequent segmentation processes are based on this step. The term polarity refers to bright or dark, based on the intensity of the foreground/background colors. We observed that most countries use a single color for the background and a single color for the foreground. The largest group of color candidates is therefore likely to belong to the background region, while the foreground is likely to have the second largest group, which is the key point that we use for background and foreground classification. We propose a model based on the RGB color space, which is well suited to identifying any color even under various illumination conditions.
The RGB color space can be visualized as a cube whose corners are black, the three primaries (red, green, blue), the three secondaries (cyan, magenta, yellow), and white. The eight pure color corners and the distribution cube are shown in Figure 3 [36]. Considering these colors and the color distribution, a model is proposed that maps all shades of color onto the 8 pure colors and can determine any background and foreground color, which is strongly required in the case of multinational VLPs. The total candidates of red, green, blue, yellow, cyan and magenta are computed by (1) according to the color definitions in Table 1, and the threshold value is determined using the color distribution cube and rigorous experiments on 3718 images in the test dataset.
By using (1) we obtain the color candidate counts ($C_C$) of red, green, blue, cyan, magenta and yellow, but we cannot set a hard limit to separate the group of black and white pixels, as the illumination level is unknown. The values of the white and black pixels ($V_{BW}$) are therefore extracted together as one group to obtain the candidate vector ($C_{BW}$), using the condition $\{d_1 < 0.2 \;\&\; d_2 < 0.2\}$.
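The counting step described above can be illustrated with a minimal sketch. Equation (1) and the thresholds of Table 1 are not reproduced here, so the sketch makes two assumptions that are not taken from the paper: pixel values are normalised to [0, 1], and a chromatic pixel is assigned to the nearest chromatic corner of the RGB cube. The function names are hypothetical.

```python
from collections import Counter

# Six chromatic corners of the RGB cube; black and white are handled as one group.
PURE_COLORS = {
    "red": (1, 0, 0), "green": (0, 1, 0), "blue": (0, 0, 1),
    "yellow": (1, 1, 0), "cyan": (0, 1, 1), "magenta": (1, 0, 1),
}


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def classify_pixel(r, g, b, t=0.2):
    """Map a normalised RGB pixel to a colour group name."""
    d1, d2 = abs(r - g), abs(g - b)  # red-green and green-blue distances
    if d1 < t and d2 < t:
        return "bw"  # achromatic group, to be split into black/white later
    # Assumption: nearest cube corner stands in for the Table 1 definitions.
    return min(PURE_COLORS, key=lambda name: squared_distance((r, g, b), PURE_COLORS[name]))


def color_candidates(pixels):
    """Count how many pixels fall into each colour group."""
    return Counter(classify_pixel(*p) for p in pixels)


def background_foreground(pixels):
    """Largest group is taken as background, second largest as foreground."""
    ranked = color_candidates(pixels).most_common(2)
    if len(ranked) < 2:
        raise ValueError("need at least two colour groups")
    return ranked[0][0], ranked[1][0]
```

For a plate with mostly yellow pixels and a smaller share of near-black pixels, `background_foreground` would report yellow as the background and the achromatic group as the foreground; the achromatic group would still need to be split into black and white.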
Then Otsu's method, which minimizes the intra-class variance of two classes, is applied to separate this group into two classes containing black and white pixels, respectively, as shown in Figure 4. The larger and the smaller of these two classes are then determined.

Table 1. Color definition based on three main groups (columns: color count, color threshold T). The distances between the primary colors red-green, green-blue and blue-red are represented by $d_1$, $d_2$ and $d_3$, respectively.

At this point the background and foreground colors are known, because the largest candidate group belongs to the background while the second largest candidate group belongs to the foreground, as shown in Figure 5a. This information is further used to determine the foreground polarity for post-processing. The adaptive thresholding technique strongly requires prior knowledge of the foreground polarity in order to separate the background from the foreground precisely, and the same information is needed for most morphological operations to obtain the required results. Algorithm 1 is used to accomplish this task, where max1 and max2 belong to $C_C$ and represent the background and foreground, respectively. The background and foreground polarities depend on the sequence of colors shown in Figure 5b; for example, Ind-1 has a brighter polarity than Ind-2. The quantity CG2 (Ind) is also determined at this step.

Algorithm 1 Foreground polarity detection process
Input 1: max1 % max1 represents the color pixel count belonging to the LP background
Input 2: max2 % max2 represents the color pixel count belonging to the LP foreground

The next step is to convert the RGB image into a binary image $BI(x, y)$. For this purpose, the image is first converted into a gray image ($GI$) using the standard conversion in (8), and then real-time adaptive thresholding based on the local mean intensity (first-order statistics) is applied for binary conversion, as given in (9), with neighborhood size ($N_s$). This thresholding technique is more suitable than Otsu's method, MET and many others based on two-class intra-class variance for separating background and foreground, particularly in the presence of shadow and illumination variance. Adaptive thresholding requires prior knowledge of the foreground polarity, which has already been determined in the steps above.
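The role of foreground polarity in local-mean thresholding can be sketched as follows. This is a simplified illustration, not the thresholding rule of (9): it assumes that a pixel is foreground when it is darker (or brighter) than the mean of its neighborhood, that the window size of (10) is applied per image dimension, and that the gray image is a list of rows. All function names are hypothetical.

```python
def neighborhood_size(rows, cols):
    """Odd window size per dimension, following Ns = 2*floor(size/16) + 1."""
    return 2 * (rows // 16) + 1, 2 * (cols // 16) + 1


def local_mean(gray, y, x, ns_r, ns_c):
    """Mean intensity of the window centred at (y, x), clipped at the borders."""
    rows, cols = len(gray), len(gray[0])
    r0, r1 = max(0, y - ns_r // 2), min(rows, y + ns_r // 2 + 1)
    c0, c1 = max(0, x - ns_c // 2), min(cols, x + ns_c // 2 + 1)
    window = [gray[i][j] for i in range(r0, r1) for j in range(c0, c1)]
    return sum(window) / len(window)


def adaptive_threshold(gray, foreground="dark"):
    """Return a binary image with 1 for foreground pixels, given the polarity."""
    rows, cols = len(gray), len(gray[0])
    ns_r, ns_c = neighborhood_size(rows, cols)
    binary = []
    for y in range(rows):
        row = []
        for x in range(cols):
            mean = local_mean(gray, y, x, ns_r, ns_c)
            if foreground == "dark":
                row.append(1 if gray[y][x] < mean else 0)
            else:
                row.append(1 if gray[y][x] > mean else 0)
        binary.append(row)
    return binary
```

With a dark stroke on a bright plate, the `"dark"` setting marks only the stroke, whereas the `"bright"` setting marks the pixels around it, which is why the polarity has to be known before thresholding.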
The neighborhood size is computed as

$$N_s = 2 \cdot \mathrm{floor}(\mathrm{size}(I)/16) + 1 \quad (10)$$

Figure 6 shows the effect of prior knowledge of the foreground polarity. In Figure 6a, the LP with a dark foreground is well thresholded, while in Figure 6b, the LP with a bright foreground is well thresholded. Therefore, for multinational vehicle LPs, prior knowledge of the foreground polarity is necessary in order to separate the foreground from the background, as multinational VLPs have different background and foreground polarities. After obtaining the binary image, we change its background polarity to bright, if needed, using (11). The next step is to extract the ROI, which contains only the required characters, by discarding the redundant area of the license plate. Let $I$ be a given set of objects of 2 types; the required region ($R_R$) is then extracted from it, where type 1 and type 2 represent the white and black objects, respectively.
Character height estimation is also one of the most important and difficult tasks in character segmentation for multinational LPs, as the LPs of different countries have different character sizes. The estimated height is further used for skew correction and to eliminate the remaining redundant objects. By this step, much of the redundant area of the LP has already been discarded by extracting the required background region. We propose an adaptive way to estimate the character height. $H_{O_{hi}}$ is a set that keeps the objects of a specific height, and $h_{max}$ is the maximum height among all detected bounding boxes in the extracted region ($R_R$) of the license plate. $O_{hi}$ is the targeted object, selected based on the height criterion $h_i$. The next bounding box object is compared with $h_{max}$; if its height is within ±20% of $h_{max}$, it is considered a required object. If it does not fulfil the criterion, the previous bounding box with maximum height is discarded and the most recent one is taken as the maximum-height object for further processing. This process continues until all required objects are extracted. In this way, all larger and smaller objects are eliminated automatically and only the required objects, i.e., the license plate characters, are left.
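One possible reading of this height filter is sketched below. The stopping rule is not spelled out in the text, so the sketch assumes that a reference height is accepted once at least `min_count` boxes fall within the tolerance; `min_count` and the function name are hypothetical and not taken from the paper.

```python
def filter_by_height(heights, tol=0.20, min_count=3):
    """Return indices of boxes whose heights cluster around a reference height.

    Boxes are visited from tallest to shortest. If too few boxes lie within
    `tol` of the current reference, the reference is discarded and the next
    tallest box becomes the new reference.
    """
    order = sorted(range(len(heights)), key=lambda i: heights[i], reverse=True)
    for pos, ref in enumerate(order):
        h_max = heights[ref]
        # Remaining boxes are never taller than h_max, so only the lower
        # bound of the +/- tolerance band needs to be checked.
        group = [i for i in order[pos:] if heights[i] >= (1 - tol) * h_max]
        if len(group) >= min_count:
            return sorted(group)
    return []
```

For heights `[60, 30, 31, 29, 32, 8, 30]`, the box of height 60 (for example a plate border) is discarded as an isolated maximum and the box of height 8 (for example a screw) falls below the band, leaving the five character-sized boxes.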
After finding the heights of the required objects, we use this information for skew correction using (14). Skew detection and correction is addressed in an efficient and accurate way, as it helps to enhance the system performance at both the segmentation and recognition stages. In (14), $O_{hi} \in H_{O_{hi}}$; $y_{RM}(O_{hi})$ and $y_{LM}(O_{hi})$ are the y-axis values of the rightmost and leftmost detected objects, while $x_{RM}(O_{hi})$ and $x_{LM}(O_{hi})$ are the x-axis values of the rightmost and leftmost detected objects of height $h_i$.
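Equation (14) is not reproduced here. As a hedged illustration of how the four quantities above could yield a skew angle, the sketch below assumes the angle is the arctangent of the slope between the leftmost and rightmost accepted characters; this is a common formulation and may differ from the paper's exact expression. The function name and the use of one reference point per character are assumptions.

```python
import math


def skew_angle_degrees(points):
    """Estimate skew from (x, y) reference points of the accepted characters.

    Uses only the leftmost and rightmost points, mirroring the x_LM, y_LM,
    x_RM and y_RM quantities described in the text.
    """
    x_lm, y_lm = min(points, key=lambda p: p[0])
    x_rm, y_rm = max(points, key=lambda p: p[0])
    dx = x_rm - x_lm
    if dx == 0:
        return 0.0  # a single column of points gives no slope information
    return math.degrees(math.atan2(y_rm - y_lm, dx))
```

Rotating the plate image by the negative of this angle would then level the character baseline before recognition.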
The border-touched characters are separated using the upper and lower boundaries of the objects in the set $H_{O_{hi}}$, as shown in Figure 8c. The final step is to remove the remaining redundant information and obtain bounding boxes around the required objects, i.e., the LP characters. This task is done using (15), where $FI$ represents the final image that contains only the required LP characters.
The output of the LPCS processes discussed above can be seen in Figures 7 and 8. To demonstrate the effectiveness of the proposed method, Figure 9 presents some sample images of LPs with blur, noise and shadow, as well as LPs affected by various illumination conditions. The proposed method is not capable of handling LPs with multi-color backgrounds/foregrounds, as shown in Figure 10. As discussed above, the method assumes that the largest color group belongs to the background region and the second largest to the foreground, and the remaining processes are based on this information. That is why the method fails to segment the license plate characters in such cases.

License Plate Character Recognition
We propose a deep learning (DL) framework for the license plate character classification task, as DL underlies many recent approaches in the area of artificial intelligence. Deep learning is powered by neural networks. Convolutional neural networks (ConvNets or CNNs) are a category of neural networks that have proven very effective in areas such as image recognition and classification. They are made up of neurons with learnable weights and biases. Each neuron receives some inputs, performs a dot product and optionally follows it with a non-linearity. The whole network expresses a single differentiable score function, from the raw image pixels at one end to the class scores at the other, and has a loss function on the last (fully connected) layer.
Figure 11 shows the proposed structure for character recognition of multinational vehicle LPs. First, the segmented image is decomposed into its red, green and blue channels, which are passed through a polarity matching module so that the data are processed uniformly and the impact of foreground polarity, which varies across multinational VLPs, is eliminated. To learn features hierarchically, the separated channel images are fed to CNNs, and the output vectors are concatenated to obtain richer feature information about the image. Before this output feature vector is sent to the classifier, a data normalization module is applied, which improves the output scores. Finally, the normalized feature vector is fed to the classifier to obtain the target labels.
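The data flow of this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: `toy_features` is a hypothetical stand-in for the per-channel CNN, z-score scaling is assumed for the normalization module (the text does not specify its form), and all function names are invented for the example.

```python
import math

def split_channels(image):
    """image: rows of (R, G, B) tuples -> three 2-D lists, one per channel."""
    return [[[px[c] for px in row] for row in image] for c in range(3)]

def match_polarity(channel, polarity):
    """Invert a channel when the foreground is bright, so that every input
    reaches the feature extractor with dark characters."""
    if polarity == "Bright":
        return [[255 - v for v in row] for row in channel]
    return channel

def toy_features(channel):
    """Hypothetical stand-in for a per-channel CNN: mean intensity per row."""
    return [sum(row) / len(row) for row in channel]

def zscore(vector):
    """Assumed normalization: zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    std = math.sqrt(sum((v - mean) ** 2 for v in vector) / len(vector))
    return [(v - mean) / std if std else 0.0 for v in vector]

def extract(image, polarity):
    """Channel split -> polarity matching -> features -> concat -> normalize."""
    feats = []
    for channel in split_channels(image):
        feats.extend(toy_features(match_polarity(channel, polarity)))
    return zscore(feats)

# 2x2 toy character with a bright stroke on a dark plate.
image = [[(255, 255, 255), (0, 0, 0)],
         [(0, 0, 0), (0, 0, 0)]]
vector = extract(image, "Bright")
print(len(vector))  # 3 channels x 2 rows = 6 features
```

In the proposed system the normalized vector would then be passed to the classifier; that step is omitted here.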

A pre-trained network is used as the starting point to learn the new task. Fine-tuning a network with transfer learning is usually much faster than training a network from scratch with randomly initialized weights, and the learned features can be transferred quickly to a new task using a smaller number of training images. Several well-known pre-trained networks (AlexNet [37], VGG-16 [38], GoogleNet [39], ResNet-18 [40], Inception v3 [41]) have been trained on more than a million images, can classify images into 1000 object categories and have learned rich feature representations for a wide range of images. Speed is one of the most important parameters because we target real-time applications. With this constraint in mind, we choose AlexNet as the base network for transfer learning, as it is more time-efficient than the other four well-known CNNs, as shown in Figure 12. The test time is measured over the whole test database to assess the speed of the networks for our particular application. The test database for the recognition part contains 21717 license plate character images that were extracted at the segmentation stage from 3718 license plate images of eight different countries.
In deep neural networks, higher-level convolutional layers provide rich feature representations but lack spatial information. In contrast, shallow layers preserve spatial information at the cost of less expressive features. Generally, only the features from the last convolutional layer are passed to the FC layers, which are then used for classification. Max-pooling is a down-sampling strategy in convolutional neural networks, so spatial information may be lost during down-sampling, and drawing on earlier layers offers a chance to retain it. Vanishing gradients are another problem that may occur in deep networks. For recognition and detection tasks, performance can be enhanced and information loss reduced by using the collective feature information of different convolutional layers, as suggested in [42][43][44][45]. In addition, deep layer aggregation has been noted to improve the convergence time. Therefore, we propose an improved CNN along the lines of AlexNet, as shown in Figure 13; its parameters are detailed in Table 2. In the improved CNN, the features of convolution layers 4 and 5 are merged, and one extra max-pooling layer is introduced to match the feature dimensions of both convolution layers.
The next important step is to choose the classifier that accepts the feature vector from the CNN feature learning module and generates the output labels. The authors of [46][47][48] claimed that the SVM is a strong and fast classifier for real-time classification applications, and great attention has been paid to the fusion of neural networks and SVMs [49,50]. For these reasons, an SVM is used in our proposed system. For the support vector machine algorithm, the kernel and hinge loss functions are used as described in (16) and (17) [51],
where G(x_j, x_k) is element (j, k) of the Gram matrix, x_j and x_k are p-dimensional vectors representing observations j and k in X, f(x) = xβ + b, β is a vector of p coefficients, x is an observation from p predictor variables and b is the scalar bias.
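As an illustration of the quantities named in this definition, the sketch below computes a Gram matrix and a hinge loss for the score function f(x) = xβ + b. It assumes a linear kernel, G(x_j, x_k) = x_j · x_k, labels y_j in {-1, +1} and a loss averaged over the observations. These are common textbook choices and are not confirmed by the text as the exact forms of (16) and (17).

```python
def gram_matrix(X):
    """Linear-kernel Gram matrix: G[j][k] = <x_j, x_k> (assumed kernel)."""
    return [[sum(a * b for a, b in zip(xj, xk)) for xk in X] for xj in X]

def score(x, beta, b):
    """f(x) = x . beta + b, matching the definition in the text."""
    return sum(xi * bi for xi, bi in zip(x, beta)) + b

def hinge_loss(X, y, beta, b):
    """Mean of max(0, 1 - y_j * f(x_j)) with labels y_j in {-1, +1}."""
    losses = [max(0.0, 1.0 - yj * score(xj, beta, b)) for xj, yj in zip(X, y)]
    return sum(losses) / len(losses)

# Three toy observations with p = 2 predictors.
X = [[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y = [1, 1, -1]
beta, b = [1.0, 1.0], -0.5

print(gram_matrix(X))
print(hinge_loss(X, y, beta, b))
```

Only the second observation falls inside the margin (y f(x) = 0.5), so it is the only one that contributes to the loss.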

Experimental Results and Discussion
We used MATLAB 2019b on a 4.20 GHz Intel® Core™ i7 CPU with a GTX 1080Ti GPU for feature extraction and evaluation. Experiments were performed on 3718 high-quality, low-quality and blurry images. Most of the vehicle license plates are from different states of America and from Europe, along with plates from Pakistan, UAE, Canada, Australia, Mexico, etc.
License plates with different foreground and background colors and a variety of formats are included in the test dataset, as shown in Figure 14. Most of the images were gathered from the Media Lab [52] and Olav's [53] databases. Low-resolution images and images affected by diverse weather and illumination conditions are also included in the test database, as are license plate images with shadow, partial shadow and blur. Some cases in which the license plate was under direct sunlight are included as well to verify the adaptability of the proposed method.

LP Characters Segmentation Results Analysis
Regardless of the variety of license plates, the proposed framework correctly segments all the characters of almost 97% of the license plates with 98% precision, as shown in Table 3, which is a strong result for multinational VLPs. The main reason for this high accuracy is that the proposed approach first eliminates the redundant part of the license plate by extracting the region of interest (ROI), which consists of the largest uniformly colored area, the salient feature of the LPs of most countries. Then, a character height estimation filter is used, which adaptively estimates the character height in any type of license plate. The accuracy and precision are defined as Accuracy = NMC LPs

LP Characters Recognition Results Analysis
In this part, each character extracted from the license plate is recognized separately. In total, 21717 characters of 36 classes (0-9, A-Z), with a different number of samples per class, were extracted in the previous part. The test dataset consists of characters of different styles and with different background and foreground polarities, extracted from various VLP formats, to evaluate the performance of the proposed feature model under different conditions. As the SVM used for classification is a supervised learning model, training images are required to learn a decision boundary. Therefore, a database of 84058 characters of 37 classes (0-9, A-Z, EC), with almost 2250 characters per class, is used for training. The extra class (EC) is introduced to reject falsely segmented characters (FS_C). To support the adaptability constraint for the recognition of multinational VLPs and to prove the effectiveness of the proposed framework, the license plate characters used for training are gathered from one country only, while the test data may have any background and foreground color and polarity. All training data consist of grayscale images to eliminate the effect of color, and all training images have dark foreground polarity for uniformity. This approach makes the system more generic and better suited to a multinational framework, as it is not feasible to gather training data from all targeted countries.
The proposed model contributes at three different levels and collectively achieves strong performance on such diverse data. The individual and collective performance of these levels is discussed in the next sections.

Layers Aggregation Module
The CNN module is the backbone of the proposed method, as it extracts the features from license plate images, which greatly affects the recognition ability of the proposed framework. For this reason, we propose and evaluate four modified CNN structures that use layer aggregation to fuse low- and high-level features. These four models are referred to as CNN_C15, CNN_C25, CNN_C35 and CNN_C45, and they concatenate the feature maps of convolution layers 1, 2, 3 and 4, respectively, with that of convolution layer 5. CNN_B denotes the base structure, which outputs only the feature map of convolution layer 5, as in the traditional structure. The proposed layer aggregation models can be seen in Figure 15, where C represents a convolution layer and P a pooling layer.
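The idea of fusing two convolutional feature maps can be sketched in plain Python. The example is only illustrative: it assumes 3x3 max-pooling with stride 2 on both maps and concatenation along the channel axis, and it uses tiny 5x5 toy maps instead of real activations (in the standard AlexNet, conv4 and conv5 produce 13x13x384 and 13x13x256 maps). The actual layer parameters are those of Table 2, not of this sketch.

```python
def max_pool(fmap, size=3, stride=2):
    """Max-pool a feature map stored as fmap[row][col][channel]."""
    rows, cols, chans = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [fmap[r + i][c + j] for i in range(size) for j in range(size)]
            out_row.append([max(px[k] for px in window) for k in range(chans)])
        out.append(out_row)
    return out

def aggregate(conv4, conv5):
    """Pool both maps to the same grid, then concatenate along channels."""
    p4, p5 = max_pool(conv4), max_pool(conv5)
    return [[a + b for a, b in zip(row4, row5)] for row4, row5 in zip(p4, p5)]

def flatten(fmap):
    """Flatten the merged map into the vector handed to the next stage."""
    return [v for row in fmap for px in row for v in px]

# Toy 5x5 maps with 3 and 2 channels standing in for conv4 and conv5.
conv4 = [[[r + c, r * c, 1.0] for c in range(5)] for r in range(5)]
conv5 = [[[r - c, 2.0] for c in range(5)] for r in range(5)]
merged = aggregate(conv4, conv5)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 2 2 5
```

The extra pooling step is what makes the two maps spatially compatible before concatenation, which corresponds to the role of the additional max-pooling layer in the improved CNN.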
Table 4 presents a comparative study of the four layer aggregation models and the traditional model for 8 different countries. The network with the CNN_C45 model gives better character recognition than CNN_B, CNN_C15, CNN_C25 and CNN_C35. A probable reason is that CNN_C15, CNN_C25 and CNN_C35 still contain more background noise than CNN_C45, which reduces the performance of the system. Table 4 shows that CNN_C45 achieves 90.30% recognition accuracy, the highest among all structures, outperforming even the base network (CNN_B).
Multiple image fields are essential for efficient object classification, as image features have a direct impact on the performance of an object recognition framework. Here we propose a multi-channel, multi-CNN architecture with a concatenated output feature vector, which captures more comprehensive and appropriate features and thereby enriches the feature information of the original image. First, the red, green and blue channel images are obtained by decomposing the original RGB image. Each channel image is then fed to a separately trained neural network, and finally the fully connected layer is obtained by concatenating the one-dimensional feature vectors of the three single-channel image domains, as shown in Figure 11.
Experimental results show that the proposed framework with the multi-channel CNN module (M_MC-CNN) learns image features more effectively than the traditional convolutional neural network (CNN-LA) and achieves 92.13% overall recognition accuracy, about 2% more than the traditional CNN structure, as shown in Table 5.
We explored the proposition that data uniformity in terms of foreground polarity (FP) enhances the recognition performance of the network. Our work targets multinational VLPs, and the LPs of different countries have different foreground polarities; even the LPs of a single country may differ in FP. For this purpose, we integrate a polarity matching module before the multi-channel convolutional neural network. It checks the foreground polarity of an image and inverts it if needed, using the FP information already extracted by the foreground/background classification framework at the segmentation stage: an image with dark FP is passed through unchanged, whereas an image with bright FP is inverted. The FP matching module sits between the multi-channel input and the multi-channel CNN module to rectify foreground polarity mismatches, and it gives more accurate recognition results than the previous stage. A data normalization module is also used to bring the output feature vector into a standardized form. An overall recognition accuracy of 96% is achieved after the integration of these modules in the proposed framework.
For better understanding, we graphically present the performance of every module for each targeted country separately in Figures 16 and 17. Figure 16 shows the cumulative accuracy after each integrated module, while Figure 17 shows only the percentage increase in accuracy that each module contributes to the proposed system. The cumulative recognition performance rises from 86.25% to 96% for the proposed network with the combination of different modules at different levels, as shown in Figure 18. Base represents the result achieved by the base network, while Level 1, Level 2 and Level 3 represent the output of the layer aggregation, multi-channel CNN and FG polarity matching modules, respectively.
A comparison of the proposed method with existing works is also provided. Five existing state-of-the-art methods were implemented in MATLAB and their performance was recorded on the test dataset. These methods are compared with the proposed method in terms of recognition accuracy and computational time, and the results are presented in Table 6. The proposed method outperforms all the existing methods by achieving 96.04% recognition accuracy. Furthermore, the proposed method is faster, in terms of computational time, than all tested methods except the method in [21]. Although the method in [21] has a slightly better computational time than the proposed method, its accuracy is almost 12.5% lower.
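The accuracy figures quoted in this section can be turned into per-level gains with a few lines of arithmetic. The sketch below uses only the overall accuracies stated in the text (86.25% for the base network, 90.30% with layer aggregation, 92.13% with the multi-channel CNN and 96.04% after polarity matching and normalization). The variable and function names are illustrative.

```python
levels = [
    ("Base", 86.25),
    ("Level 1: layer aggregation", 90.30),
    ("Level 2: multi-channel CNN", 92.13),
    ("Level 3: FG polarity matching", 96.04),
]

def gains(levels):
    """Accuracy increase contributed by each level, in percentage points."""
    return [(name, round(acc - prev, 2))
            for (_, prev), (name, acc) in zip(levels, levels[1:])]

for name, gain in gains(levels):
    print(f"{name}: +{gain:.2f}")

total = round(levels[-1][1] - levels[0][1], 2)
print(f"Total: +{total:.2f}")
```

The three increments sum to the overall rise of about 9.8 percentage points between the base network and the complete framework.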

Table 6. Recognition accuracy (RA) and computational time of the existing methods.

Method       Year    RA, %    Time, ms
Ref. [31]    2018    82.2     5.6
Ref. [13]    2018    93.1     4.0
Ref. [34]    2018    92.5     7.0
Ref. [21]    2018    83.5     1.9
Ref. [30]    2018    88.3     6.8

Conclusions
In this work, we propose an adaptive framework that is well suited for the recognition of multinational vehicle license plates. To keep it general, the method does not require any prior knowledge of the LP layout, and training data are not needed from all targeted countries; this is the prominent feature of the proposed methodology. A foreground polarity detection model is developed to classify background and foreground based on their colors; it is not only the backbone of the LPCS stage but also contributes at the LPCR stage, and it works well in the presence of shadow and under various illumination conditions. Multi-channel processing with an improved CNN structure is proposed to extract deep features, and a real-time SVM classifier is used to obtain the output labels. Features from convolution layers 4 and 5 are merged in a layer aggregation module to obtain a more expressive output feature map. The recognition performance is enhanced at three different levels, and 96.04% recognition accuracy is achieved, which is a significant improvement for multi-style license plates.

Figure 2 .
Figure 2. Proposed architecture for license plate character segmentation.

Figure 2 .
Figure 2. Proposed architecture for license plate character segmentation.

Figure 3 .
Figure 3. Visualization of RGB color space: (a) RGB-cube with 8 pure colors, (b) RGB-cube with colors distribution, and (c) gray shades distribution bar.

Figure 3 .
Figure 3. Visualization of RGB color space: (a) RGB-cube with 8 pure colors, (b) RGB-cube with colors distribution, and (c) gray shades distribution bar.

Figure 6 .
Figure 6.Effectiveness of prior knowledge of FG polarity: (a) License plate (LP) image with dark foreground polarity, (b) LP image with bright foreground polarity.

Figure 6 .
Figure 6.Effectiveness of prior knowledge of FG polarity: (a) License plate (LP) image with dark foreground polarity, (b) LP image with bright foreground polarity.

Figure 8 .
Figure 8. Required objects detection part based on character height estimation filter: (a) Angle detection, (b) Angle correction, (c) Border touched character separation, (d) Bounding boxes on required objects after removing a small object.

Figure 9 .
Figure 9. LP images with noise, blur, shadow and various illumination conditions.

Figure 10 .
Figure 10.Failed segmentation samples of license plate images.

Figure 11. Proposed architecture for multinational vehicles license plate character recognition.

Figure 13. The improved CNN network structure.

Figure 14. Sample images of LPs from the test dataset.

Figure 15. Different layer aggregation structures of the proposed CNN model.

$$\begin{cases} \ldots(x, y), & \text{if } FP = \text{Dark} \\ I_{FP}(x, y), & \text{if } FP = \text{Bright} \end{cases}$$

Figure 16. Graphical representation of integrated modules overall performance with base structure.

Figure 17. Graphical representation of individual modules contribution for every targeted country.

Figure 18. Graphical behavior of accumulative rise in recognition accuracy for proposed modules.

where $\zeta = I(x_n, y_m)$

Table 2. The parametric detail of improved CNN.

$NMC_{LPs} + MC_{LPs}$, where $NMC_{LPs}$ (no-missing-character license plates) are license plates in which all characters are detected, and $MC_{LPs}$ (missing-character license plates) are LPs in which only some of the characters are detected. $CS_C$ denotes the correctly segmented characters and $FS_C$ the falsely segmented characters of the no-missing-character license plates ($NMC_{LPs}$). Only these LP characters are passed on to the recognition stage, since LPs with missing characters ($MC_{LPs}$) cannot be used in further processing.
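The bookkeeping behind these four counts can be sketched in a few lines of Python. The two ratios below are an assumption, since the full equations are not reproduced here: the plate-level rate is taken as $NMC_{LPs} / (NMC_{LPs} + MC_{LPs})$ and the character-level precision as $CS_C / (CS_C + FS_C)$. The function name `segmentation_metrics` and the counts in the usage lines are hypothetical and are not results from this work.

```python
def segmentation_metrics(nmc_lps, mc_lps, cs_c, fs_c):
    """Return (plate_rate, char_precision) under ASSUMED formulas.

    nmc_lps: number of plates with all characters detected (NMC_LPs)
    mc_lps:  number of plates with some characters missing (MC_LPs)
    cs_c:    correctly segmented characters on the NMC plates (CS_C)
    fs_c:    falsely segmented characters on the NMC plates (FS_C)
    """
    total_lps = nmc_lps + mc_lps
    total_chars = cs_c + fs_c
    if total_lps <= 0 or total_chars <= 0:
        raise ValueError("plate and character totals must be positive")

    # Assumed: share of plates on which no character was missed.
    plate_rate = nmc_lps / total_lps
    # Assumed: share of segmented characters that were segmented correctly.
    char_precision = cs_c / total_chars
    return plate_rate, char_precision


# Illustrative counts only (not taken from the paper's tables).
plate_rate, char_precision = segmentation_metrics(90, 10, 450, 50)
print(f"plate rate: {plate_rate:.2%}, character precision: {char_precision:.2%}")
```

Keeping the plate-level and character-level ratios separate mirrors the text: plates with missing characters are dropped first, and character quality is then measured only on the plates that remain.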

Table 3. Accuracy and precision of segmented characters.

Table 4. Performance comparison of base CNN with modified CNN structures.

Table 4 presents a comparative study of the four layer aggregation models and the traditional model for 8 different countries. It is observed that the network with the $CNN_{C45}$ model gives better character recognition than $CNN_B$, $CNN_{C15}$, $CNN_{C25}$ and $CNN_{C35}$. Probably, $CNN_{C15}$, $CNN_{C25}$

Table 5. Performance comparison of different modules of proposed recognition network.

Table 6. Comparison with existing methods.
