Maargha: a Prototype System for Road Condition and Surface Type Estimation by Fusing Multi-sensor Data

Road infrastructure in countries like India is expanding at a rapid pace, and it is becoming increasingly difficult for authorities to identify and fix bad roads in time. Current Geographical Information Systems (GIS) lack information about on-road features like road surface type and speed breakers, and about dynamic attribute data like road quality. Hence there is a need to build road monitoring systems capable of collecting such information periodically. The limitations of satellite imagery with respect to resolution and availability make road monitoring primarily an on-field activity. Monitoring is currently performed using special vehicles that are fitted with expensive laser scanners and need skilled operators, while providing only very low coverage. Hence such systems are not suitable for continuous road monitoring. Cheaper alternative systems using sensors like accelerometers and GPS (Global Positioning System) exist, but they are not equipped to achieve higher information levels. This paper presents a prototype system, MAARGHA (in Sanskrit, MAARGHA means an eternal path to solution), which demonstrates that the disadvantages of the existing systems can be overcome by fusing multi-sensor data such as camera images, accelerometer data and GPS trajectories at an information level, while also providing additional road information like road surface type. MAARGHA has been tested across different road conditions and sensor data characteristics to assess its potential applications in real world scenarios. The developed system achieves higher information levels than state of the art road condition estimation systems like Roadroid. The system's performance in road surface type classification depends on the local environmental conditions at the time of imaging. In our study, the road surface type classification accuracy reached 100% for datasets with near ideal environmental conditions and dropped to 60% for datasets with shadows and obstacles.


Introduction
Connecting cities and hinterlands, road infrastructure is the artery of a country's economy. It enables cheap and quick inland transportation of people, services and goods. In developing countries, there has of late been increased focus on, and budget allotted for, the construction and maintenance of highways. But local urban and rural roads have been neglected due to insufficient funds, poor planning and lack of coordination between municipal authorities and contractors. Under-maintained roads deteriorate over time until they become unusable and beyond repair. In India, the better funded national highways account for only 1.9% of the entire road network, although the country has the world's second longest road network. These roads are laid with asphalt, bitumen tar, concrete or even compacted mud depending upon the requirements of the location, and incur different maintenance costs. Hence, statistics on road condition (or quality) and road surface type are essential to help convince decision makers to allocate budget to new roads and maintenance projects. They can also help prioritize maintenance of certain areas depending on the extent of damage that the road has incurred.
Road maintenance is done either periodically or as an event to fix major road condition issues like potholes and surface cracks. To identify and locate the latter, continuous monitoring of roads for these dynamic changes is needed as part of a well-defined, operational Geographical Information Systems (GIS) based road information system that keeps track of road condition and surface type. Apart from providing the geometry and location of road features, current GIS systems usually provide attribute data like speed limit, area type and traffic flow direction, and occasionally some of the road furniture like traffic signals and signboards. However, these systems do not have an efficient data collection infrastructure to update their dynamic attributes. Current practice is typically to record the road characteristics manually for every 100 m segment, a costly and time-consuming approach to capturing attributes that were either missed previously or are changing over time.
Instead of observing the ground truth through manual inspections, the use of technology for accurate road quality monitoring has been preferred. Expensive vehicle mounted systems that use high power laser or radar sensors have been proposed, but they are a challenge to scale up across cities and regions due to their low coverage and the unavailability of skilled operators. Alternatively, smart phones have attracted much interest as data collection platforms in various studies, including road monitoring applications, as they carry an array of useful sensors like GPS and camera. Smart phones are also pervasive and inexpensive platforms, but the use of consumer grade sensors for data collection may produce noisy and unreliable data. Hence, in the context of road monitoring, cheap and smart systems need to be built that use heterogeneous data to achieve an information level of the road condition [1,2] comparable to that of the expensive laser based systems. This paper proposes and builds a prototype system that recognizes and classifies road condition and road surface type by fusing multi-sensor data, derived from accelerometer, GPS and camera, at an information level. It classifies the road surface type into asphalt/bitumen tar, concrete or mud roads, and the road condition into four distinct levels: good, satisfactory, unsatisfactory and poor.

Existing Road Monitoring Systems
While a range of road monitoring systems are in practice, from fully manual, paper based ones to fully automated, laser guided providers of engineering grade information, it is important to understand and evaluate these in terms of information diversity, accuracy and the related parameters of affordability and scalability. Road monitoring agencies have witnessed a paradigm shift from camera-based systems towards laser scanners, starting with the introduction of the "LASER Road Imaging System" (LRIS) in 2005 [3] and similar laser systems like IBEO Laser [4] and PPS [5]. Typically, one or two illuminating laser beams reflected by the road surface are imaged by line cameras to provide a 3D profile of the road. These systems provide high fidelity (1 mm resolution) and low per-sortie operational time (>100 km/h). However, the high equipment cost, complexity of data processing and skilled resource requirement of these systems slow down the overall operation and make them impractical as a tool for continuous road monitoring. VOTERS PAVEMON [6] is a web based GIS system that integrates multi-sensor data from tire pressure sensors, accelerometers, laser, radar and imagery to arrive at road distress parameters. Despite its capabilities, the need for a specialized vehicle setup prevents the system from being used as a mass data collection platform. While the above two systems provide accurate, engineering grade information on the road condition, additional information like road surface type has to be collected manually. In addition, these systems score low on affordability and scalability.
Following the recent trend in data collection methodologies, smart phones have been utilized as a road monitoring platform. Smart phones are a frequent sight in today's world and carry an arsenal of valuable sensors that can be used for data collection. Sophisticated systems may be beneficial, but phone based systems offer the valuable advantage that they scale up easily. The smart phone based prototypes Nericell [7] and Wolverine [8] detect road bumps based on changes in accelerometer readings along the direction of gravity (Z direction). Additionally, they estimate traffic conditions based on braking events, which result in a sustained surge in the accelerometer reading along the direction of vehicle motion (X direction). While Nericell employs the fixed thresholds Z-Peak [9] and Z-Sus [7] at two speed levels for road condition classification, Wolverine learns the thresholds by training an SVM on six features: the means and standard deviations along the three coordinate axes (µX, µY, µZ, σX, σY and σZ) over a window of 1 s. Work has also been done on distinguishing potholes from other road anomalies like rail crossings, center lane lights and speed breakers (Pothole Patrol (P²) [9]) by analyzing the patterns in accelerometer readings using the X-Z ratio and the Speed-Z ratio. The assumption that potholes impact only one side of the car, together with spatial clustering of the collected data, improved the detection accuracy by reducing false positives.
Roadroid [1] is a commercially available smart phone based system for road monitoring. It classifies the road into good, satisfactory, unsatisfactory and poor based on calculated IRI (International Roughness Index) values and claims 80% to 90% accuracy for the results. Vehicle type and mobile platform sensitivity are taken into account in the calculations. This system can also be used as an inspection tool by manually adding information about bumps and by taking snapshots of the road and uploading them to the web database. The road surface imagery that is collected is used only for manually validating the road condition reports, and no useful information is extracted from it automatically. A smart phone running the Roadroid app is shown in Figure 1. Though these smart phone based systems are affordable and scalable, their main drawback is that they produce data at an information quality level of 3 or 4, compared to precision laser systems whose quality level is 1 [2]. Research needs to focus on using additional data, either heterogeneous or redundant, in order to push up the accuracy of these systems. Earlier system designs were constrained by system performance [7,8], but with ever-increasing processing capabilities, phones are now equipped to handle computationally expensive tasks like on-board image processing. This opens up many more possibilities in data gathering. Moreover, road monitoring does offer the luxury of offline processing.

Process Block Diagram
The proposed system uses camera images, GPS trajectory and accelerometer data to perform spatially explicit road surface type and road condition mapping. The system is primarily aimed at monitoring and updating a Road GIS database with appropriate attribute information. At present, this system is not a road evaluation tool and generating engineering grade data is out of its scope. Figure 2 shows the functional blocks of the proposed system, MAARGHA.
Road surface type estimation is performed using the periodic snapshots from the camera. Independently, the captured images are used for detecting potholes. This provides additional information to the road condition estimator thread, which runs in parallel using accelerometer and GPS speed data. Information obtained from all the processing threads is finally fused at an information level using the present position given by the GPS map matching algorithm. The result is a set of road mass points with generated attribute data. The attribute data can then be used to update the road characteristics in a GIS database. This paper focuses on the part of the system shown by the process blocks bound within the red box in Figure 2.

Prototype Design
The prototype was developed using an Android smart phone for data collection and a laptop for running the offline data processing algorithms. The smart phone was mounted on the car's windshield or dashboard (see Figure 3) and the data was collected using a sensor data-logging app, at the frequencies listed in Table 1 for each sensor. The smart phone app synchronizes the raw data collected from the different sensors using timestamps. A screenshot of the software application running on the laptop is shown in Figure 4, and a demo of the same is accessible at [10]. Table 2 shows the software and hardware requirements for the development of the proposed system.
Accelerometer readings are an indirect way of finding the undulations of the road surface by measuring the degree of vibration inside the vehicle. The vibrations are measured along three measurement axes: the direction of gravity (Z-axis), the direction of vehicle motion (Y-axis) and the direction parallel to the dashboard (X-axis). For simplicity, the smart phone is carefully mounted in a well-oriented position, with the X-, Y- and Z-axes of the phone aligned with the measurement axes.
In practice, the accelerometer readings have to be reoriented as in Nericell [7], depending on the mount position. Except for intentional speed breakers, vibrations that get through the car's suspension system are mostly chaotic and are reflected on all three accelerometer axes of measurement. The standard deviations (SD) of the accelerometer readings along the X-, Y- and Z-axes (Ax, Ay, Az) and the GPS speed are considered as features (σx, σy, σz, Vg). The vehicle speed considered in the feature vector is the instantaneous speed. Unlike P² [9], the speed is considered as a separate feature instead of a ratio Az/Vg, since it is found that the ratio of maximum vibration amplitude to speed is not constant when traversing similar sized speed breakers at different speeds (Figure 5). Vehicle braking and turning show sustained changes in the Ax and Ay readings. Hence such slow changes in readings are removed using a high pass filter, so that only high frequency data is used for the SD calculation. Accelerometer readings are obtained at a frequency of 15 Hz and the features are extracted over a window of 2 s. A non-parametric approach is adopted and the system is trained manually for a supervised classification. The roads are labeled as poor, unsatisfactory, satisfactory or good for the training, and sufficient samples are collected. Later, a K-Nearest Neighbor (K-NN) algorithm is used to classify fresh data into the above-mentioned classes.
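The feature extraction and K-NN step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the first-order high pass filter, its constant `alpha`, the unscaled Euclidean distance and all function names are assumptions made for the example. A 2 s window at 15 Hz corresponds to 30 samples per axis.

```python
import math
from collections import Counter


def high_pass(samples, alpha=0.8):
    """First-order high pass filter; alpha is an assumed constant, not from the paper."""
    out = []
    prev_x = prev_y = None
    for x in samples:
        # The first output is zero; later outputs keep only fast changes.
        y = 0.0 if prev_x is None else alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def std_dev(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def window_features(ax, ay, az, speed):
    """Return (sigma_x, sigma_y, sigma_z, Vg) for one window of readings."""
    return (std_dev(high_pass(ax)), std_dev(high_pass(ay)),
            std_dev(high_pass(az)), speed)


def knn_classify(feature, training, k=5):
    """Majority vote among the k nearest labeled samples.

    training is a list of (feature_tuple, label) pairs. Plain Euclidean
    distance is used here; any feature scaling is left open.
    """
    nearest = sorted(training, key=lambda item: math.dist(feature, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A constant reading (for example gravity on the Z-axis or sustained braking on the Y-axis) is removed by the filter and contributes zero standard deviation, which is the behavior the high pass step is meant to provide.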

Camera Based Pothole Detection
The primary objective of this work is to improve the information quality level of the current state of the art in road monitoring systems using heterogeneous data. Bad road condition across the width of the road is detected using the accelerometer, but scattered potholes are mostly missed, as drivers generally tend to avoid them. In the current systems, Nericell [7] and Roadroid [1], the images captured by the camera are not utilized and serve only as occasional visual validation proof. Camera snapshots are typically captured at a frequency of 0.3 Hz to 0.5 Hz, and as a result it becomes an overhead to manually inspect every image after data capture, especially when monitoring is done across a vast area. However, robust image segmentation based pothole detection is a difficult problem considering the complexity of the scenes captured by the dashboard camera. Emir Buza et al. [11] proposed a method that uses image processing and spectral clustering for rough estimation of potholes. Otsu [12] based thresholding is applied to select the darker pixels of the road for pothole detection. However, potholes cannot be generalized as patches whose interior pixels are darker than the surroundings. The color of the interior pixels depends on the road or mud layer that is exposed as the top layer wears out. This can result in pothole interior pixels having the same or even higher average intensity than the surroundings (Figure 6).
The proposed system, MAARGHA, detects potholes based on the edge features detected in the images (Figure 7). The results of image based detection supplement the accelerometer readings by detecting the potholes that were missed. The smart phone is mounted with the camera facing the road, and the captured snapshots are cropped to restrict the region of interest to 1 or 2 m ahead of the car. The sub-image is then smoothed using a 5 × 5 bilateral filter, which preserves the sharp edges of potholes and shadows. Because shadows are regions of low intensity (Figure 8a) and are connected to the boundary of the image, they are identified using intensity threshold based connected component analysis (Figure 8b). If the shadowed regions constitute more than 50% of the image, the scene is excluded from further processing to avoid false positives. Subsequently, a Canny filter with an empirical lower gradient threshold of 30 is used to obtain the edges of the sub-image, and the edge contours are extracted. The commonly used high:low gradient threshold ratio of 3:1 gives the higher threshold value of 90. A second level of filtering is done by selecting the images that have a limited number of contours n (empirical value). The selected images are assigned an unsatisfactory road condition score. The pseudo algorithm is given below in Algorithm 1.

Algorithm 1. Pothole Detection Algorithm
Step 1: Apply 5 × 5 bilateral filter on sub-image I
Step 2: Reject I if pixels with intensity values <100 constitute more than 50% of the sub-image
Step 3: Detect n contours {C1, C2, ..., Cn} based on Canny edges with 90-high and 30-low threshold value
Step 4: Pothole detected if 1 < n < 5

Ghost images do get formed due to reflections on the windshield of the car, and these need to be filtered out of the input set to the algorithm. The position of the sun with respect to the car can be used to detect such images and discard them. Linear shapes like lane markings can be removed by exploiting the small length to breadth ratio of a bounding box. However, such enhancements are operational details and have not been implemented within the scope of this work. The road condition estimator takes both the accelerometer based classification and the pothole detection result to provide the four-class output, with precedence given to the pothole detection result when it indicates a worse condition than the accelerometer based classification.
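Steps 2 and 4 of Algorithm 1, together with the precedence rule of the road condition estimator, can be sketched as below. This is an illustrative sketch only: the bilateral filtering and Canny contour extraction (Steps 1 and 3) are assumed to be done by an image processing library and are represented here by a precomputed contour count `n_contours`. The function names and the list-of-rows image representation are hypothetical.

```python
# Road condition classes ordered from best to worst.
CONDITION_ORDER = ["good", "satisfactory", "unsatisfactory", "poor"]


def shadow_fraction(gray, threshold=100):
    """Fraction of pixels darker than threshold in a grayscale sub-image.

    gray is a list of rows, each a list of 0-255 intensity values.
    """
    total = sum(len(row) for row in gray)
    dark = sum(1 for row in gray for pixel in row if pixel < threshold)
    return dark / total


def pothole_detected(gray, n_contours, max_shadow=0.5):
    """Apply Step 2 (shadow rejection) and Step 4 (contour count rule)."""
    if shadow_fraction(gray) > max_shadow:
        return False  # scene rejected: mostly shadow
    return 1 < n_contours < 5


def fuse_condition(accel_label, pothole):
    """Keep the worse of the accelerometer label and the image based label.

    A detected pothole is scored as 'unsatisfactory', as in the text.
    """
    if not pothole:
        return accel_label
    return max(accel_label, "unsatisfactory", key=CONDITION_ORDER.index)
```

With this rule a pothole detected on a stretch rated "good" by the accelerometer lowers the score to "unsatisfactory", while a stretch already rated "poor" stays "poor".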

Histogram Based Road Surface Type Recognition
Roads appear in different colors depending on the composition of the construction materials used. Much work has been done in the domain of autonomous navigation systems (survey by Vipul et al. [13,14]), but mostly to distinguish between road and non-road regions. Color content, color features and additional features like road boundary [6] and depth sensor based road region modeling have been used in road detection. But the problem at hand is to classify the detected road into different surface material types like bitumen, concrete and mud. This is a problem of outdoor color classification. The methods discussed in [15] adopt parametric and non-parametric approaches for outdoor color classification and can be applied in the context of road surface type identification. Parametric methods [16][17][18][19][20][21][22] predict the color based on illumination models, while non-parametric methods [23][24][25][26] follow a sample based training and classification regime. The latter has been successfully applied for road surface type estimation in the proposed system due to its smaller computational footprint. The classification problem is simplified if it is carried out in two steps. The first step differentiates tar roads from mud/concrete roads based on the intensity distribution of the scene. In the second step, mud roads are differentiated from concrete roads depending on the colorfulness of the image.
The image is converted to the HSI color space and histograms are calculated for the Saturation and Intensity components of the sub-image. The intensity histograms are then used as features in the first stage classification. Under open sky conditions, the intensity values of mud and concrete roads are higher than those of tar/bitumen roads. HSI is the color space closest to the way humans perceive colors, and hence hue is commonly utilized for color classification [27]. However, because HSI is a cylindrical color space, the hue component, which describes the color of the object, becomes unstable as saturation and intensity decrease. The saturation, in contrast, is found to be stable and describes the colorfulness of the scene. Thus, the histogram of the saturation channel is used as a feature vector to train the system to differentiate between mud and concrete roads in the second stage classification. The marked difference between the saturation values of mud and concrete roads is seen in the saturation channel histograms shown in Figures 9 and 10.
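As an illustration of the two feature vectors, the sketch below computes normalized saturation and intensity histograms from RGB pixels. It assumes the common HSI definitions I = (R + G + B)/3 and S = 1 - 3·min(R, G, B)/(R + G + B); the bin count of 16 and the function names are choices made for the example and are not taken from the paper.

```python
def rgb_to_si(r, g, b):
    """Return (saturation, intensity), both in [0, 1], for one 8-bit RGB pixel."""
    total = r + g + b
    intensity = total / (3 * 255)
    # Saturation is undefined for pure black; treat it as zero.
    saturation = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    return saturation, intensity


def normalized_histogram(values, bins=16):
    """Histogram of values in [0, 1], normalized so that the bins sum to 1."""
    counts = [0] * bins
    for v in values:
        index = min(int(v * bins), bins - 1)  # v == 1.0 goes to the last bin
        counts[index] += 1
    return [c / len(values) for c in counts]


def si_histograms(pixels, bins=16):
    """Return (saturation_histogram, intensity_histogram) for a list of RGB tuples."""
    sats, ints = zip(*(rgb_to_si(*p) for p in pixels))
    return normalized_histogram(sats, bins), normalized_histogram(ints, bins)
```

A gray pixel such as (128, 128, 128) has zero saturation, whereas a brownish pixel such as (150, 100, 50) has a saturation of 0.5, which reflects the kind of separation between concrete and mud surfaces that the second stage relies on.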

Histogram Distance Metric
Performing color classification using histograms requires a measure of histogram similarity. A number of distance metrics are used for histogram comparison. Reference [28] discusses the failure of the Euclidean distance metric at higher dimensions. Aherne et al. [29] propose the Bhattacharyya distance (Equation (1)) for histogram comparison in preference to the widely used Chi-squared distance. A Bhattacharyya distance of zero means a perfect match, whereas a distance of 1 means a perfect mismatch. This distance metric is used for performing a K-NN classification using the training dataset.
\[
d(H_1, H_2) = \sqrt{1 - \sum_{i} \sqrt{H_1(i)\, H_2(i)}} \qquad (1)
\]
where $H_k(i)$ refers to the value of the $i$-th bin of histogram $H_k$, with each histogram normalized to unit sum.
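A small sketch of histogram based K-NN classification with this metric follows. It assumes the bounded form of the Bhattacharyya distance, sqrt(1 - Σ sqrt(p_i q_i)) on unit-sum histograms, which is consistent with the stated range of 0 (perfect match) to 1 (perfect mismatch). The function names and the toy histograms in the usage note are hypothetical.

```python
import math
from collections import Counter


def bhattacharyya_distance(h1, h2):
    """Bounded Bhattacharyya distance between two histograms of equal length."""
    s1, s2 = sum(h1), sum(h2)
    # Normalize each histogram to unit sum before comparing.
    coefficient = sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))
    # Guard against tiny negative values caused by floating point rounding.
    return math.sqrt(max(0.0, 1.0 - coefficient))


def classify_histogram(query, training, k=5):
    """Majority vote among the k training histograms closest to query.

    training is a list of (histogram, label) pairs.
    """
    nearest = sorted(
        training, key=lambda item: bhattacharyya_distance(query, item[0])
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

For example, with two "concrete" histograms concentrated in low saturation bins and two "mud" histograms concentrated in high saturation bins, a query histogram such as `[9, 1, 0, 0]` is assigned to "concrete" for k = 3.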

Map Matching of GPS Ticks
GPS data is used to associate the road profile and surface type data accurately with the correct stretch of road. But the commercial grade GPS receivers commonly used in mobile phones are less accurate and may also experience multipath reflection while driving through streets. As a result, the GPS ticks obtained by the devices are distributed randomly on either side of the road segment. A simple online map matching algorithm, discussed as "Algorithm 2" in [30], is used. Raw GPS ticks are snapped to the nearest road segment that is parallel to the direction of vehicle motion, as derived from the heading data (Figure 11). Road data of sufficient resolution is selected such that separate road center lines are available along the vehicle driving direction (two center lines for a two way road). After map matching, every unique GPS point is associated with the latest results from the road condition and road surface type classification, based on the timestamp of the GPS data point. The result is a set of GPS points, called the road mass points, for which road condition and road surface type attributes are available. This marks the information level fusion of the results obtained from the different modules, and the output can be used to update a Road Information System.
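The snapping and timestamp association steps can be illustrated with the simplified sketch below. It is not "Algorithm 2" of [30]: it assumes planar coordinates (x east, y north, in meters), directed center line segments, and a hypothetical heading tolerance of 45 degrees. All names are illustrative.

```python
import bisect
import math


def project_to_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment end points
    return (a[0] + t * dx, a[1] + t * dy)


def segment_heading(a, b):
    """Heading of the directed segment a -> b, in degrees clockwise from north."""
    return math.degrees(math.atan2(b[0] - a[0], b[1] - a[1])) % 360.0


def heading_difference(h1, h2):
    """Smallest absolute angle between two headings, in degrees."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def snap_to_road(point, heading, segments, max_heading_diff=45.0):
    """Snap a GPS tick to the nearest segment roughly parallel to the heading."""
    best = None
    for a, b in segments:
        if heading_difference(heading, segment_heading(a, b)) > max_heading_diff:
            continue  # segment runs in a different direction
        snapped = project_to_segment(point, a, b)
        dist = math.dist(point, snapped)
        if best is None or dist < best[0]:
            best = (dist, snapped)
    return None if best is None else best[1]


def latest_result(timestamps, labels, gps_time):
    """Most recent classification label at or before gps_time (timestamps sorted)."""
    i = bisect.bisect_right(timestamps, gps_time)
    return labels[i - 1] if i > 0 else None
```

On a two way road with one center line per direction, a northbound tick that falls closer to the southbound center line is still snapped to the northbound one, because the southbound segment fails the heading check.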

Data Sets
The training set was generated in a nearby locality in Hyderabad, India by driving through different types of roads and manually classifying the road surface type and road condition. Adequate samples were collected for the asphalt/tar, concrete and mud surface types and also for the various road conditions. Both sets of training data were saved in separate text files and synchronized using timestamps.
The collected data was grouped into two sets based on the image information available. The first group of datasets was collected on sections of road having few shadowed regions and under clear sky conditions (ideal data). The second group of datasets has many imperfections, such as shadows, disturbances in the path and unclear road surface type (see Figure 12). The system was trained using one part of the ideal data and classification was performed for all conditions (Table 3).

Results
The classification mode of the application takes in the sensor data and performs a K-NN classification for both the road condition and the road surface type data. The value of K was chosen as 5 based on the accuracy of the results after experimentation with different K values. To avoid incorrect classification when the vehicle is stationary, the classification is performed only when the speed of the vehicle exceeds 3 km/h. The legend for the road condition and surface type mapping results is shown in Figure 13. The results presented in Figures 14-20 were obtained by driving the vehicle mounted with the setup for a total stretch of around 4 km in a nearby locality (the demo is available at [10]). The locality was chosen based on the availability of different types of road surface and road condition classes. Classification of road surface type and road condition is performed every 2 s.

Road Surface Type Classification Results
The road surface type classification accuracy was measured against a manually generated ground truth by calculating the percentage of correct classifications as per Equation (2). True positives are the number of instances in which the algorithm matched the manual classification. False positives are the number of instances in which the surface type was misclassified. True and false negatives were considered to be zero, as they cannot be measured in this scenario. The results of road classification for the two groups of datasets are presented below.
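Under the stated convention that true and false negatives are taken as zero, the percentage of correct classifications can be written as follows. This is a sketch of the form that Equation (2) presumably takes, derived from the general accuracy formula, and is not a quotation of it:

```latex
\[
\text{Accuracy}\,(\%)
  = \frac{TP + TN}{TP + TN + FP + FN} \times 100
  = \frac{TP}{TP + FP} \times 100
  \qquad \text{for } TN = FN = 0 .
\]
```

Here $TP$ counts the instances where the classified surface type matches the manual label and $FP$ counts the misclassified instances, so the denominator is simply the total number of classified instances.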
Figure 14 and Table 4 show a 100% accurate surface type classification result under clear sky conditions and with no shadowed regions. Figure 15 shows results from dataset 2, where a larger part of the road stretch is covered by the shadows of buildings, which resulted in incorrect classification. Table 5 summarizes the results, showing an initial accuracy of 51%. Given that the ideal dataset was used for training, the shadowed road surfaces tend to be classified as the tar/bitumen class, as seen in Figure 15. The true positives occur at sections of the road without shadows. As road surface type does not change frequently, sparsely distributed incorrect classification results can be corrected either by using the neighborhood information or by manually picking out the odd results. In this case, around seven such instances can be identified and removed, bringing the accuracy up to 60%. However, time of day based filtering of the results from multiple sorties can significantly reduce the effect of shadows.

Road Condition Classification Results
The results presented here are from both the proposed MAARGHA system and the Roadroid system [1], obtained by gathering data at the same time over the same road segments. The setup needed two smart phones to be mounted on the vehicle windshield, one running the Roadroid app and the other running the data collection app used by MAARGHA. While the figures provide a visual comparison of the results obtained, the corresponding tables present the performance of the two systems. A rough estimate of the ground truth was known from the microphone recordings of the vehicle driver made while conducting the experiments. The road condition results in Figure 16 and Table 6 reflect a high correlation between the MAARGHA and Roadroid systems for predominantly smooth roads. On roads with intermittent rough patches (Figure 17 and Table 7), both systems identified similar bad patches (poor), but they differ slightly in the road condition classification. A few of the road mass points that were classified as "good" and "satisfactory" by Roadroid were classified as "unsatisfactory" by MAARGHA. Inspecting the accelerometer readings at such instances, it is observed that the vehicle produced more body roll than vertical motion when moving slowly over continuous bad patches of road. The yellow line in the sub-image of Figure 17 is the lateral acceleration along the X-axis corresponding to the body roll of the vehicle. Roadroid appears to be less sensitive to this lateral motion. Other than such differences, the results of the two systems match each other for the poor and the smoother sections of the road. According to the ground truth recorded by the microphone, the third comparison result is for a road stretch with significantly rough road condition. This road stretch consisted of rough mud road at the beginning and smoother tar road towards the end. As per the results (Figure 18 and Table 8), Roadroid follows a similar trend of assigning more "good" and "satisfactory" scores than MAARGHA, while the latter produced more "unsatisfactory" results. Overall, the two systems are comparable, with a similar response to smooth roads and a slightly divergent response to rough roads, as seen visually in Figures 16-18. Both systems successfully identify major road anomalies and thus provide the important information expected from a road survey.

Pothole Detection Results
The proposed system uses images obtained from the smart phone camera to identify the potholes that are usually missed by an accelerometer only approach. In sections of road where the accelerometer based results from MAARGHA and the results from Roadroid (Figure 19) showed 100% "good" roads, image based identification helped to identify the missed potholes. Since the frequency of the snapshots is 0.5 Hz, multiple runs generally produce 100% detection (Figure 20 shows MAARGHA results from a single run). With the additional sensor information, the end result of the classification is more realistic and moves closer to the ground truth. The combined results of MAARGHA and Roadroid are summarized in Table 9.

Figure 3. Smartphone Mounted on Windshield with Data Collection App Running.

Figure 4. Screenshot of the PC application.

Figure 5. Vibration Response: Plot of ratio between Az Max and Average GPS Speed (Vg) while driving through similar speed breakers.

Figure 13. Legend for Road Condition and Surface type Mapping.

Figure 14. Road Surface type Map for dataset 1 (few snapshots from the sortie are shown to the left and right of the map), largely consisting of Asphalt and Concrete road.

Figure 15. Road Surface type Map for dataset 2 (red circle indicates one set of erroneous classification while green circle shows some of the correct classification results), primarily a mud road with one section being Asphalt road.

Figure 18. Results comparison for significantly rough Road. (a) is the Road Condition Results Map from MAARGHA, while (b) is from Roadroid (Screenshot from its cloud server).

Figure 19. Roadroid results showing 100% smooth road condition for a section of the road that does have minor potholes.

Figure 20. MAARGHA accelerometer results supplemented by pothole detection showing some unsatisfactory road condition points (red circles are some of the detected potholes shown with their snapshots).

Table 1. Sensor Data Collection Frequency.

Table 2. Software and Hardware System Requirements.

Table 3. Number of Training Samples for Road Condition and Surface type Classes.

Table 4. Road Surface type Classification for dataset 1.

Table 5. Road Surface type Classification for dataset 2.

Table 9. Comparison Report showing significance of image based pothole detection.