Census Bureau’s Topologically Integrated Geographic Encoding Referencing (TIGER)

As open source volunteered geographic information continues to gain popularity, the user community and data contributions are expected to grow, e.g., CloudMade, Apple, and Ushahidi now provide OpenStreetMap© (OSM) as a base layer for some of their mapping applications. This, coupled with the lack of cartographic standards and the expectation to one day be able to use this vector data for more geopositionally sensitive applications, like GPS navigation, leaves potential users and researchers to question the accuracy of the database. This research takes a photogrammetric approach to determining the positional accuracy of OSM road features using stereo imagery and a vector adjustment model. The method applies rigorous analytical measurement principles to compute accurate real world geolocations of OSM road vectors. The proposed approach was tested on several urban gridded city streets from the OSM database with the results showing that the post adjusted shape points improved positionally by 86%. Furthermore, the vector adjustment was able to recover 95% of the actual positional displacement present in the database. To demonstrate a practical application, a head-to-head positional accuracy assessment between OSM, the USGS National Map (TNM), and United States Census Bureau’s Topologically Integrated Geographic Encoding Referencing (TIGER) 2007 roads was conducted.


Motivation and Problem Statement
OpenStreetMap © (OSM) is an open source geographical mapping project that provides the public with a free digital map of the world. It is unique because its users are the main contributors and are responsible for building and maintaining the map. OSM contributors can edit (update) the map in several ways, including using Global Positioning System (GPS) waypoints and tracks to identify features in the field or measuring satellite imagery to identify roads and other geographical features of interest [1].
This type of open contribution by volunteers is known as crowdsourcing: many volunteers provide information to the database, and each contributes a small portion that pertains to their local knowledge. Goodchild [2] coined the term "Volunteered Geographic Information" (VGI) to describe this type of crowdsourcing activity.
The VGI phenomenon raises some important questions regarding the expertise of its contributors, especially when it is applied to mapping features on the earth's surface. For example, in the United States a Land Surveyor is required to hold a bachelor's degree in Surveying Engineering and gain several years of practical experience to be eligible to sit for state licensing exams. The training covers complex measurement theory such as GPS positioning, photogrammetry, remote sensing, least squares statistical analysis, error propagation, and state boundary law [3]. Moreover, once a surveyor becomes licensed, most projects they work on require adherence to strict mapping standards and positioning guidelines [4][5][6].
However, OSM contributors are not required to have any formal education or training in the mapping profession, nor are they required to follow any positioning standards to derive the vector data contained in the database. As OSM continues to gain popularity, the user community and data contributions are expected to grow, e.g., CloudMade, Apple, and Ushahidi now provide OSM as a base layer for some of their mapping applications [7][8][9]. This, coupled with the lack of cartographic and data quality standards and the expectation of one day using this vector data for more geopositionally sensitive applications, like GPS navigation, leaves potential users and researchers to question the accuracy of the database.

Proposed Solution
To address this issue, a rigorous photogrammetric approach is taken to determine the positional accuracy of OSM roads by using aerial stereo imagery, ground control points (GCP's), and a vector adjustment model to determine real world geolocations for OSM shape points. The vector adjustment is based on a traditional photogrammetric bundle adjustment model and results in: (1) adjusted OSM shape point locations, (2) shape point residuals (which describe the positional accuracy), and (3) confidence regions about the adjusted shape point locations, expressed as Circular Error and Linear Error at the 90th percentile (CE90/LE90).
The benefit of this approach is that it is based on recreating the sensor geometry present when the image was taken and using proven photogrammetric principles to estimate a ground position. The proposed approach takes into consideration distortions present in the imaging system and applies corrections (interior orientation) to more accurately represent the ideal image position, free of systematic error. GCP's are used for absolute orientation of the imagery and to form an accurate image-to-ground relationship via an analytical stereo model. The vector adjustment adjusts all of the vector shape points simultaneously, thereby offering more redundancy to the least squares model and improving the overall confidence in the results. The approach is also independent of location and vector dataset, so different geographic databases in different areas of the world can be analyzed with it.
To demonstrate the proof of concept, the proposed vector adjustment was used to compute the positional accuracy of several urban gridded city streets in the OSM database. Additionally, to validate the performance, the Root Mean Square Error (RMSE) of the adjustment residuals is compared to the RMSE determined from the differences between the OSM database and the corresponding locations determined by high-accuracy GPS surveying techniques (ground truth). This type of comparison provides a quantitative way to describe the performance of the adjustment and demonstrates that the computed positional accuracy is consistent with what is actually present in the database. Once this positional accuracy information is known, it can be carried along as an attribute of the vector shape points and polylines at the feature level, thereby providing valuable metadata and improving the overall usefulness of the database.
To demonstrate a practical application of the vector adjustment, a head-to-head accuracy assessment between OSM [10], the United States Geological Survey (USGS) National Map (TNM) [11], and the United States Census Bureau's Topologically Integrated Geographic Encoding Referencing (TIGER) 2007 [12] road vectors was conducted to determine which database is the most positionally accurate over the test area. Having an approach to understand the spatial accuracy of geographic data can be useful for many applications including route planning, GPS navigation, and "smart conflation", where reference/target vectors are chosen based on their positional accuracy.

Background and Previous Work
A review of the existing literature related to determining the positional accuracy of vector data has provided insight into key research performed on this topic. Goodchild and Hunter [13] suggest that the positional accuracy of spatial objects can be defined through measures of the differences between the apparent location of the feature recorded in the database and the feature's true location. Different metrics describing positional accuracy are presented depending on the geometry of the feature being tested, e.g., the accuracy of points is usually expressed as RMSE estimates, while linear features could be compared using the buffer method. Kiiveri [14] suggests the need for objective methods for assessing, representing, and transmitting uncertainty in vector data through calculations so decision makers have some idea of the reliability of the information. Positional accuracy can be separated into two classes, absolute and relative [15], where relative positional accuracy describes the consistency of any position on a map with respect to any other and absolute accuracy is a measure of the deviation of an estimate from the true value [16].
Spatial data quality research begins with the acknowledgement that the data stored in a geographic database are rarely, if ever, truly free of error and that the database contains an approximation of the real world [17]. Doucette et al. [18] suggest that uncertainty handling is relevant to any scientific activity that involves making measurements of real world phenomena. Studies to describe the positional accuracy of geographical vector databases have been conducted where a test dataset is compared with a truth dataset of higher positional accuracy [19][20][21][22][23]. Other studies compared a test vector dataset with GPS locations [24,25] and with positions determined from georeferenced orthoimagery [26].
Haklay [21,27] has presented landmark papers on the OSM project and on determining the positional accuracy of a portion of the database in London, England. Road centerlines from OSM are compared to their corresponding locations in the UK's Ordnance Survey (OS) data using the buffering method. The results show positional accuracies ranging from 3.17 m to 8.33 m, with an average difference of 5.8 m.
One of the most common ways to build vector datasets is by digitizing, tracing, or vectorizing a raster aerial image (from both satellite and airborne platforms). Since the resulting vector features will take on the coordinate reference frame of the aerial image, it is important to ensure the imagery is georeferenced properly. While image-processing methods can routinely provide georeferenced products with an accuracy of 2 m (CE90) for satellite based products [28], and 0.30 m (horizontal RMSE) for aerial based products [29], it is important to keep in mind that different processing techniques yield products with varying degrees of accuracy. This is apparent in Figure 1, which depicts a location in a 2007 Google Earth © image [30] as compared to the same location in older archived imagery. The results show positional differences ranging from 5 to 24 m.
Geostatistics can be used to make predictions of sampled attributes at unsampled locations from sparse, often expensive data [31]. Burrough [31] defines two issues that geostatistical research can help address: (1) interpolation errors and (2) error propagation in spatial models. Ever since spatial database accuracy was identified as a research initiative at the US National Center for Geographic Information and Analysis [32], an effort has been made to investigate various aspects of accuracy in spatial databases [33][34][35][36][37]. Shi et al. [38] contribute to the topic by presenting an S-Band model for describing the characteristics of positional uncertainty of geometric features, such as line segments, line features, boundary lines and area features. However, a key assumption is that the measurement errors are independent and uncorrelated. To address this, Shi and Liu [39] develop a G-Band model that handles correlated measurements. Love et al. [40] extend the G-Band concept with a Bayesian model that incorporates expert and historical knowledge to reduce the number of observations needed to make an accurate error analysis of vector data.
Even though work has been done on the subject of determining spatial accuracy of vector datasets, the current literature falls short when it comes to analyzing the positional accuracy using analytical photogrammetric techniques, such as the bundle adjustment. Photogrammetry has traditionally been used to: (1) determine three dimensional real world object space coordinates from stereo imagery; (2) propagate the error present in the sensor imaging system to a ground location; (3) derive three dimensional terrain surfaces, such as digital elevation models; or (4) extend highly accurate GCP's to adjacent image strips or blocks to facilitate the accurate mapping of the terrain and infrastructure from aerial imagery. Therefore, it is conceivable that photogrammetric techniques could be used to address the problem of determining the positional accuracy of geographical vector data by developing an analytical stereo model based on accurate GCP's and image coordinates to estimate the true location of the vector shape points by enforcing the collinearity equations.

Vector Adjustment Concept
The vector adjustment model used to determine the positional accuracy of road vectors is built upon the photogrammetric bundle adjustment model and extended to include vector shape points as the object points. Since the object points are expressed in three dimensions, the elevations for the shape points are estimated using a 1/3 arc second raster Digital Elevation Model (DEM) obtained over the area of interest from the USGS National Elevation Dataset (NED) [41]. For this research, road features from the OSM database and surveyed GCP's were used as the object points in the adjustment to determine the positional accuracy of the vector shape points. The general idea is to use heavily weighted (well-known) GCP's and image coordinates to establish the sensor's exposure station and enforce the collinearity concept to solve for the true location of the OSM shape points. The adjustment residuals then describe the difference between the adjusted ("true") location and the OSM database position. In addition, since this is a rigorous application, where the uncertainties in GCP's, image coordinates, and object points are known, error propagation is used to compute confidence regions about the adjusted shape point locations (CE90 and LE90). The overall concept is depicted in Figure 2.
In Figure 2, SP represents the OSM shape points from the geographical database that make up a particular road vector being tested. Image coordinates of the OSM shape points are measured on each stereo image and used as input to the adjustment model. The image coordinates are considered to be well known (heavily weighted) and will ultimately control the adjusted shape point location in ground space. The assumption here is that the imagery is considered to be "truth", either more up to date or of higher importance than the OSM database location. In addition, initial approximations for the exterior orientation (EO) parameters are provided to facilitate the adjustment process; note that the EO parameters are solved for in the adjustment, so an initial estimate is all that is needed.
Well-known (heavily weighted) GCP's are considered the absolute control for the adjustment. GCP's are measured in the imagery and used with their ground space coordinates in the adjustment. Since the GCP's are used to formulate the analytical stereo model and to create the image-to-ground relationship, it is important that they be known to a high degree of accuracy. Furthermore, it is essential for the GCP's and OSM shape points to be referenced to the same horizontal/vertical datum and map projection so that avoidable misalignment between the two is not introduced into the adjustment.
The outputs from the vector adjustment are: (1) adjusted OSM shape point locations, (2) shape point adjustment residuals (which describe the positional accuracy), and (3) confidence regions about the adjusted shape point locations (CE90/LE90).
The relationship between the stereo images and an OSM shape point is an extension of the space resection problem [42,43]; the geometric model can be visualized in Figure 3, where $L_1$ and $L_2$ are the exposure stations for stereo images one and two, and $j_1$ and $j_2$ are the measured image coordinates of the OSM shape point, $J$, on images one and two. For this work, the measuring of image coordinates was done manually, i.e., a user measures each shape point in the stereo imagery. However, it is anticipated that this process could be automated by incorporating an image-to-vector registration process. Once the image and object space coordinates are estimated, they can be used as initial approximations in the bundle adjustment model. The collinearity condition, which states that the exposure station $(X_L, Y_L, Z_L)$, an object point's image-derived location $(x_j, y_j)$, and the object point in ground space $(X_J, Y_J, Z_J)$ all lie on a single line, is used to form the observation equations in the bundle adjustment.

Photogrammetric Bundle Adjustment
The basic geometric unit in photogrammetry is the image ray; an image can be thought of as a bundle of rays converging at the perspective center with an unknown position and orientation in space [42]. The bundles from all photos are adjusted simultaneously so that corresponding light rays from the measured image coordinates intersect at the positions of the object points on the ground. The unknown quantities to be obtained from the bundle adjustment consist of the adjusted object coordinates of the vector shape points and GCP's $(X_J, Y_J, Z_J)$ and the EO parameters $(\omega, \phi, \kappa, X_L, Y_L, Z_L)$ of all the images. The EO parameters describe the exposure station coordinates $(X_L, Y_L, Z_L)$ and the image orientation parameters $(\omega, \phi, \kappa)$ that are used in the collinearity equations to formulate the image-to-ground relationship.
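According to Wolf and Dewitt [43], the bundle block adjustment can be formulated using the collinearity equations, which are the foundation of the observation equations and are used to form the mathematical model. The collinearity equations are documented in Equations (1) and (2) as:

$$x_j = x_o - f\,\frac{m_{11}(X_J - X_L) + m_{12}(Y_J - Y_L) + m_{13}(Z_J - Z_L)}{m_{31}(X_J - X_L) + m_{32}(Y_J - Y_L) + m_{33}(Z_J - Z_L)} \qquad (1)$$

$$y_j = y_o - f\,\frac{m_{21}(X_J - X_L) + m_{22}(Y_J - Y_L) + m_{23}(Z_J - Z_L)}{m_{31}(X_J - X_L) + m_{32}(Y_J - Y_L) + m_{33}(Z_J - Z_L)} \qquad (2)$$

where $x_j$ and $y_j$ are the measured image coordinates of $J$; $X_J$, $Y_J$ and $Z_J$ are the coordinates of object point $J$; $X_L$, $Y_L$ and $Z_L$ are the coordinates of the image exposure station; $x_o$ and $y_o$ are the coordinates of the principal point known from the camera calibration report; $f$ is the focal length of the camera (also known from the camera calibration report); and $m_{11}, m_{12}, \ldots, m_{33}$ are the rotation matrix terms formulated in Equation (4).
The rotation matrix terms result from individual rotations of $\omega$, $\phi$, and $\kappa$ applied about the x, y, and z-axes, respectively. The individual rotation matrices are structured as follows:

$$M_\omega = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{bmatrix},\quad M_\phi = \begin{bmatrix} \cos\phi & 0 & -\sin\phi \\ 0 & 1 & 0 \\ \sin\phi & 0 & \cos\phi \end{bmatrix},\quad M_\kappa = \begin{bmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (3)$$

The total rotation matrix results from combining the individual rotations as follows:

$$M = M_\kappa M_\phi M_\omega = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix} \qquad (4)$$

Since the collinearity equations are nonlinear, they are linearized by applying the first-order terms of the Taylor series at a set of initial approximations; a detailed derivation is provided in [43].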
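To make Equations (1), (2), and (4) concrete, the following is a minimal sketch that builds the rotation matrix and projects a ground point into image space. The function names (rotation_matrix, project) and the sample coordinates are hypothetical and are not taken from the software used in this work; the rotation convention assumed is the sequential omega-phi-kappa form of [43], and lens distortion is ignored.

```python
import numpy as np

def rotation_matrix(omega, phi, kappa):
    """Total rotation M = M_kappa @ M_phi @ M_omega (angles in radians)."""
    so, co = np.sin(omega), np.cos(omega)
    sp, cp = np.sin(phi), np.cos(phi)
    sk, ck = np.sin(kappa), np.cos(kappa)
    m_omega = np.array([[1, 0, 0], [0, co, so], [0, -so, co]])
    m_phi = np.array([[cp, 0, -sp], [0, 1, 0], [sp, 0, cp]])
    m_kappa = np.array([[ck, sk, 0], [-sk, ck, 0], [0, 0, 1]])
    return m_kappa @ m_phi @ m_omega

def project(obj_point, station, angles, f, principal_point=(0.0, 0.0)):
    """Collinearity equations: ground point -> image coordinates (units of f)."""
    m = rotation_matrix(*angles)
    # r, s, q are the rotated numerator and denominator terms of Eqs. (1)-(2).
    r, s, q = m @ (np.asarray(obj_point, float) - np.asarray(station, float))
    x = principal_point[0] - f * r / q
    y = principal_point[1] - f * s / q
    return x, y

# Hypothetical vertical photo: station 610 m above a point at ground level,
# focal length 152.4 mm, so image coordinates come out in millimetres.
x_mm, y_mm = project((100.0, 50.0, 0.0), (0.0, 0.0, 610.0), (0.0, 0.0, 0.0), 152.4)
print(round(x_mm, 3), round(y_mm, 3))
```

For a truly vertical photo the projection reduces to a simple scaling by f/H, which is a convenient sanity check before introducing non-zero rotations.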
The equations used to set up and solve the bundle adjustment have been well documented in several textbooks [42,43]. However, it is worth reviewing the basic mathematical model and matrices used to solve the system of linear equations. The general form is based on the unified least squares model and expressed as:

$$\dot{B}_{ij}\,\dot{\Delta}_i + \ddot{B}_{ij}\,\ddot{\Delta}_j = \varepsilon_{ij} + V_{ij} \qquad (5)$$

where $\dot{B}_{ij}$ is a matrix of partial derivatives of the observation equations with respect to the exterior orientation parameters $(\omega, \phi, \kappa, X_L, Y_L, Z_L)$; $\dot{\Delta}_i$ is a vector of corrections to the exterior orientation parameters; $\ddot{B}_{ij}$ is a matrix of partial derivatives of the observation equations with respect to the object point coordinates $(X_J, Y_J, Z_J)$; $\ddot{\Delta}_j$ is a vector of corrections to the object point coordinates (OSM shape points and GCP's); $\varepsilon_{ij}$ is the misclosure vector, which is used to minimize the sum of the squared residuals; and $V_{ij}$ is the vector of residuals for the measured image coordinates. Adjustment residuals are important because they describe the difference between the measured and the adjusted values, which is a good indicator of how much the adjustment process moved the input measurements.
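The matrices used in the least squares model are structured as follows; for simplification, the matrices are shown for a single image i and object point j. The $\dot{B}_{ij}$ and $\ddot{B}_{ij}$ matrices are made up of the $b$ terms that result from linearizing the collinearity equations in [43].
The $\dot{B}_{ij}$ matrix as:

$$\dot{B}_{ij} = \begin{bmatrix} b_{11} & b_{12} & b_{13} & -b_{14} & -b_{15} & -b_{16} \\ b_{21} & b_{22} & b_{23} & -b_{24} & -b_{25} & -b_{26} \end{bmatrix} \qquad (6)$$

The $\ddot{B}_{ij}$ matrix as:

$$\ddot{B}_{ij} = \begin{bmatrix} b_{14} & b_{15} & b_{16} \\ b_{24} & b_{25} & b_{26} \end{bmatrix} \qquad (7)$$

The $\dot{\Delta}_i$ vector as:

$$\dot{\Delta}_i = \begin{bmatrix} d\omega_i & d\phi_i & d\kappa_i & dX_{L_i} & dY_{L_i} & dZ_{L_i} \end{bmatrix}^T \qquad (8)$$

The $\ddot{\Delta}_j$ vector as:

$$\ddot{\Delta}_j = \begin{bmatrix} dX_j & dY_j & dZ_j \end{bmatrix}^T \qquad (9)$$

The $\varepsilon_{ij}$ vector is formulated with the Taylor series expansion terms in [43] as:

$$\varepsilon_{ij} = \begin{bmatrix} J_{ij} & K_{ij} \end{bmatrix}^T \qquad (10)$$

And the $V_{ij}$ vector as:

$$V_{ij} = \begin{bmatrix} v_{x_{ij}} & v_{y_{ij}} \end{bmatrix}^T \qquad (11)$$

The adjusted object points are expressed as a function of the initial measurement and the adjustment residual through the following relationship:

$$X_j = X_j^{00} + v_{X_j}, \qquad Y_j = Y_j^{00} + v_{Y_j}, \qquad Z_j = Z_j^{00} + v_{Z_j} \qquad (12)$$

where $X_j$, $Y_j$ and $Z_j$ are the unknown coordinates of point j; $X_j^{00}$, $Y_j^{00}$ and $Z_j^{00}$ are the measured coordinate values for point j; and $v_{X_j}$, $v_{Y_j}$ and $v_{Z_j}$ are the coordinate residuals for point j. To be consistent with the collinearity equations, the object point observation equations need to be evaluated at initial approximations as follows:

$$X_j = X_j^{0} + dX_j, \qquad Y_j = Y_j^{0} + dY_j, \qquad Z_j = Z_j^{0} + dZ_j \qquad (13)$$

with $X_j^{0}$, $Y_j^{0}$ and $Z_j^{0}$ being the initial approximations for the coordinates of point j, and $dX_j$, $dY_j$, $dZ_j$ being the corrections to the approximations for point j, as solved for in the $\ddot{\Delta}_j$ matrix. Since the collinearity equations are nonlinear, an iterative approach is used in which corrections to the parameters are solved for and applied on each pass through the loop until the corrections are very small or essentially unchanged. This condition is referred to as convergence. Equating (12) and (13), simplifying, and expressing in matrix form yields:

$$\ddot{\Delta}_j = \ddot{C}_j + \ddot{V}_j \qquad (14)$$

with $\ddot{C}_j$ being the difference between the measured coordinates and the current approximations:

$$\ddot{C}_j = \begin{bmatrix} X_j^{00} - X_j^{0} & Y_j^{00} - Y_j^{0} & Z_j^{00} - Z_j^{0} \end{bmatrix}^T \qquad (15)$$

and $\ddot{V}_j$ being the residual for object point j expressed in each component:

$$\ddot{V}_j = \begin{bmatrix} v_{X_j} & v_{Y_j} & v_{Z_j} \end{bmatrix}^T \qquad (16)$$

Weights for the object point coordinates are also used in the adjustment and are based on the accuracy of the GCP's and vector shape points. The weights for the $X_j$, $Y_j$, and $Z_j$ coordinates of object point j are expressed in matrix form as:

$$\ddot{W}_j = \sigma_0^2 \begin{bmatrix} \sigma_{X_j}^2 & \sigma_{X_j Y_j} & \sigma_{X_j Z_j} \\ \sigma_{Y_j X_j} & \sigma_{Y_j}^2 & \sigma_{Y_j Z_j} \\ \sigma_{Z_j X_j} & \sigma_{Z_j Y_j} & \sigma_{Z_j}^2 \end{bmatrix}^{-1} \qquad (17)$$

where $\sigma_0^2$ is the a priori reference variance; $\sigma_{X_j}^2$, $\sigma_{Y_j}^2$, and $\sigma_{Z_j}^2$ are the variances in $X_j^{00}$, $Y_j^{00}$, and $Z_j^{00}$, respectively; and the off-diagonal terms are the covariances between the coordinate components. The final types of observations used in the bundle adjustment are the EO parameters. Their observation equations take on a form similar to the object points and are given as:

$$\dot{\Delta}_i = \dot{C}_i + \dot{V}_i \qquad (18)$$

The weight matrix for the EO parameters for a single image i is structured in the same way as Equation (17):

$$\dot{W}_i = \sigma_0^2\, \Sigma_{\mathrm{EO}_i}^{-1} \qquad (19)$$

where $\Sigma_{\mathrm{EO}_i}$ denotes the 6 × 6 covariance matrix of the observed EO parameters $(\omega_i, \phi_i, \kappa_i, X_{L_i}, Y_{L_i}, Z_{L_i})$. The weights for the x and y image coordinates are expressed in matrix form for an object point j on photo i as:

$$W_{ij} = \sigma_0^2 \begin{bmatrix} \sigma_{x_{ij}}^2 & \sigma_{x_{ij} y_{ij}} \\ \sigma_{y_{ij} x_{ij}} & \sigma_{y_{ij}}^2 \end{bmatrix}^{-1} \qquad (20)$$

where $\sigma_0^2$ is the a priori reference variance; $\sigma_{x_{ij}}^2$ and $\sigma_{y_{ij}}^2$ are the variances in $x_{ij}$ and $y_{ij}$, respectively; and the covariances $\sigma_{x_{ij} y_{ij}} = \sigma_{y_{ij} x_{ij}}$ are set to zero because the x and y measurements are assumed to be uncorrelated. After the individual matrices for the observation equations are formed, the full set of normal equations can be structured in matrix form as:

$$N \Delta = K \qquad (21)$$

where the matrices are formatted as:

$$N = \begin{bmatrix} \dot{N} + \dot{W} & \bar{N} \\ \bar{N}^T & \ddot{N} + \ddot{W} \end{bmatrix} \qquad (22)$$

Here $\dot{N} + \dot{W}$ is block diagonal with blocks $\dot{N}_i + \dot{W}_i$ (one per image), $\ddot{N} + \ddot{W}$ is block diagonal with blocks $\ddot{N}_j + \ddot{W}_j$ (one per object point), and $\bar{N}$ is made up of the submatrices $\bar{N}_{ij}$. The $\Delta$ matrix is a combination of the corrections to the exterior orientation parameters and the corrections to the object point coordinates, with its size determined by how many images and object points are in the adjustment:

$$\Delta = \begin{bmatrix} \dot{\Delta}_1 & \cdots & \dot{\Delta}_m & \ddot{\Delta}_1 & \cdots & \ddot{\Delta}_n \end{bmatrix}^T \qquad (23)$$

Similarly, the K matrix is structured as:

$$K = \begin{bmatrix} \dot{K}_1 + \dot{W}_1 \dot{C}_1 & \cdots & \dot{K}_m + \dot{W}_m \dot{C}_m & \ddot{K}_1 + \ddot{W}_1 \ddot{C}_1 & \cdots & \ddot{K}_n + \ddot{W}_n \ddot{C}_n \end{bmatrix}^T \qquad (24)$$

The submatrices used above are defined as:

$$\dot{N}_i = \sum_{j=1}^{n} \dot{B}_{ij}^T W_{ij} \dot{B}_{ij} \qquad (25)$$

$$\bar{N}_{ij} = \dot{B}_{ij}^T W_{ij} \ddot{B}_{ij} \qquad (26)$$

$$\ddot{N}_j = \sum_{i=1}^{m} \ddot{B}_{ij}^T W_{ij} \ddot{B}_{ij} \qquad (27)$$

$$\dot{K}_i = \sum_{j=1}^{n} \dot{B}_{ij}^T W_{ij}\, \varepsilon_{ij} \qquad (28)$$

$$\ddot{K}_j = \sum_{i=1}^{m} \ddot{B}_{ij}^T W_{ij}\, \varepsilon_{ij} \qquad (29)$$

with m being the number of images, n the number of object points, i the image subscript, and j the object point subscript. If a point j does not appear on image i, then a zero submatrix is used. When the normal equations are being formed, it is recommended to compute the a posteriori reference variance, which is a unitless scalar quantity that describes the uncertainty found in the observations post adjustment. It is a function of the various weight matrices (image coordinate accuracies, EO parameter accuracies, and the object point accuracies) propagated into the misclosure of the collinearity and object point observation equations. The a posteriori standard error of unit weight can be computed as:

$$S_0 = \sqrt{\frac{V^T W V}{n.o. - n.u.}} \qquad (30)$$

where $V$ and $W$ collect the residuals and weights of all observations, n.o. is the total number of observations, and n.u. is the total number of unknowns in the adjustment model. Once the solution converges, the inverse of the normal equations matrix is scaled by the a posteriori reference variance to compute the final variance-covariance matrix for the adjustable parameters as:

$$\Sigma_{\Delta\Delta} = S_0^2\, N^{-1} \qquad (31)$$

where the diagonal of the resulting matrix contains the variances of the exterior orientation parameters and the object point coordinates, ordered consistently with how the N matrix was formed. The standard deviations of the adjustment parameters are then computed by taking the square root of the diagonal variances.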
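The role of Equations (21), (30), and (31) can be illustrated with a small sketch. It assumes a generic linear model (a straight-line fit) in place of the linearized collinearity equations, so a single pass replaces the iteration to convergence; the function name weighted_least_squares and the sample data are hypothetical.

```python
import numpy as np

def weighted_least_squares(B, W, eps):
    """Solve N*delta = K, then scale N^-1 by the a posteriori reference variance."""
    N = B.T @ W @ B                      # normal equations matrix
    K = B.T @ W @ eps                    # right-hand side
    delta = np.linalg.solve(N, K)        # estimated parameters
    V = B @ delta - eps                  # residuals: adjusted minus observed
    dof = B.shape[0] - B.shape[1]        # n.o. minus n.u.
    s0_sq = float(V @ W @ V) / dof       # a posteriori reference variance
    cov = s0_sq * np.linalg.inv(N)       # variance-covariance of the parameters
    return delta, V, s0_sq, cov

# Hypothetical data: four observations of a line y = a + b*t, each with a
# 0.1-unit a priori standard deviation and an a priori reference variance of 1.
t = np.array([0.0, 1.0, 2.0, 3.0])
B = np.column_stack([np.ones_like(t), t])
eps = np.array([1.1, 2.9, 5.2, 6.8])
W = np.eye(4) / 0.1**2

delta, V, s0_sq, cov = weighted_least_squares(B, W, eps)
sigmas = np.sqrt(np.diag(cov))           # standard deviations of a and b
print(delta, s0_sq, sigmas)
```

A reference variance near 1 suggests the a priori weights were realistic; values far above 1 suggest the standard deviations assigned to the observations were too optimistic.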

Adjustment Weighting
The concept of weighting in the vector adjustment is very important and directly impacts the adjusted values, as well as the estimated uncertainty of the adjusted points. Adjustment weighting consists of assigning a numerical value to the adjustment observations and parameters based on how well the specific quantity is known. Weights are determined for both: (1) the image points, based on an a priori estimate of how well the points can be mensurated in the image, and (2) the object point coordinates, which consist of the GCP's and vector shape point accuracies.
The general idea is that the higher the confidence a user has in a measurement, the smaller the standard deviation assigned to that measurement. For example, the GCP's used in this study were established by GPS survey; therefore, the standard deviations assigned to these coordinates are very small, on the order of 10 cm or less in all three dimensions. One can reason that a small standard deviation corresponds to a high user confidence in the numerical value of the measurement. On the other hand, the vector shape point coordinates were determined with less stringent methods, such as measuring a single satellite image or using a handheld GPS receiver. In this case the coordinates are much less well known, i.e., a user has less confidence in the accuracy of the point, so a standard deviation on the order of a few meters or more could be realistic.
In most cases the accuracy of the OSM shape points will not be known a priori; therefore, the following approach can be implemented to estimate it. First, assign an arbitrarily large standard deviation to the OSM shape points. Second, run an adjustment to determine the shape point residuals and compute the RMSE of the shape point residuals to estimate the positional displacement. Third, use the RMSE value and residuals to estimate a priori standard deviations for the OSM shape points. Lastly, rerun the adjustment to compute the final adjusted information.
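The four steps above can be sketched as follows. This is only an illustration of the workflow: the function toy_adjust is a hypothetical stand-in that replaces the full bundle adjustment with a per-coordinate weighted mean of the database position and an image-derived position, and all coordinate values are invented.

```python
import numpy as np

def toy_adjust(db, obs, sigma_db, sigma_obs):
    """Stand-in for the adjustment: weighted mean of database and image-derived values."""
    w_db, w_obs = 1.0 / sigma_db**2, 1.0 / sigma_obs**2
    adjusted = (w_db * db + w_obs * obs) / (w_db + w_obs)
    residuals = adjusted - db            # how far each database coordinate moved
    return adjusted, residuals

# Hypothetical eastings (m) of four shape points: database vs. image-derived.
db = np.array([100.0, 200.0, 300.0, 400.0])
obs = np.array([102.0, 197.0, 302.5, 401.5])
sigma_obs = 0.10                         # well-known (heavily weighted) observations

# Steps 1 and 2: arbitrarily large sigma for the shape points, then RMSE of residuals.
_, v_first = toy_adjust(db, obs, sigma_db=1000.0, sigma_obs=sigma_obs)
rmse = float(np.sqrt(np.mean(v_first**2)))

# Steps 3 and 4: use the RMSE as the a priori sigma and rerun.
adjusted, v_final = toy_adjust(db, obs, sigma_db=rmse, sigma_obs=sigma_obs)
print(rmse, adjusted)
```

In this toy case the second pass changes the adjusted values very little; the main benefit of the realistic a priori standard deviation is that the propagated uncertainties are no longer inflated by the arbitrary first guess.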

Absolute Accuracy
Accuracy is a term that refers to the closeness between measurements and their true values [44]. The closer a measurement is to the true value, the better its accuracy. In reality, the true value is not known and can only be estimated by making measurements and analyzing those measurements. Therefore, the true value is often referred to as the expected value and can be estimated by computing the average or mean of a subset of measurements. Additionally, the measurements themselves are not perfect quantities and are subject to errors resulting from the people making the measurements, the equipment used to make the measurements, and random noise in the measurements themselves.
Product accuracy statistics summarize the dispersion of the individual errors of a large set of checkpoints. The military community requires an accuracy statistic that combines biases and random errors and then estimates them at the 90% probability level. Vertical accuracy is reported as a linear error (LE) because it is an uncertainty along the single vertical dimension. Horizontal accuracy is a function of the two horizontal dimensions in the x and y directions. It can be considered a circular error (CE), which refers to the radius of the circle, centered about the derived location, within which the true or expected location of points lies [45]. Ager derives the CE90 and LE90 in detail in [45], and the reader is referred there for more information.
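As a rough illustration of how 90% error bounds relate to standard deviations, the sketch below uses the common zero-bias Gaussian conversions rather than the full derivation in [45]. The function names and the averaging of the two horizontal standard deviations are assumptions made for this example; the averaging is only reasonable when the two values are similar.

```python
import math
from statistics import NormalDist

# 90% two-sided bound of a one-dimensional normal distribution (about 1.6449).
LE90_FACTOR = NormalDist().inv_cdf(0.95)
# 90% radius of a circular normal distribution, sqrt(2 ln 10) (about 2.1460).
CE90_FACTOR = math.sqrt(2.0 * math.log(10.0))

def le90(sigma_z):
    """Linear error at 90% probability from a vertical standard deviation."""
    return LE90_FACTOR * sigma_z

def ce90(sigma_x, sigma_y):
    """Circular error at 90% probability, assuming sigma_x is close to sigma_y."""
    sigma_c = 0.5 * (sigma_x + sigma_y)  # circular standard error approximation
    return CE90_FACTOR * sigma_c

# Hypothetical standard deviations in metres.
print(round(le90(0.12), 3), round(ce90(0.25, 0.30), 3))
```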

Root Mean Square Error (RMSE)
The Root Mean Square Error (RMSE) is defined as the square root of the average squared discrepancies in coordinate values [6]. The RMSE is made up of the mean square error (MSE) in the various x, y, and z components. The MSE is the average of the squared differences between a measured value and its truth-value and is defined in [44] as:

$$MSE = \frac{1}{n}\sum_{i=1}^{n} (x_i - t_i)^2 \qquad (32)$$

where $x_i$ is the measured value, $t_i$ is the truth-value, and $n$ is the number of measurements. In reality, the truth-value is rarely known, so this work replaces truth with the best available estimate of it. For example, the GPS established ground control points used in the bundle adjustment are considered truth, although their values were actually determined by measurement with inherent uncertainty. However, the positions are the best estimate of truth and will be used as such. The root mean square positional error can then be defined by taking the square root of the sum of the MSE in the x and y directions:

$$RMSE = \sqrt{MSE_x + MSE_y} \qquad (33)$$

The RMSE statistic can then be used to describe the uncertainty present in a set of like measurements.
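A short numerical sketch of the MSE and horizontal RMSE described above is given below. The function names and the sample coordinates are hypothetical; the horizontal RMSE is taken as the square root of the sum of the x and y mean square errors.

```python
import numpy as np

def mse(measured, truth):
    """Mean square error of one coordinate component."""
    d = np.asarray(measured, float) - np.asarray(truth, float)
    return float(np.mean(d**2))

def rmse_horizontal(x, y, x_true, y_true):
    """Horizontal RMSE from the x and y mean square errors."""
    return float(np.sqrt(mse(x, x_true) + mse(y, y_true)))

# Hypothetical shape points: database coordinates vs. surveyed truth (m).
x_db, y_db = [503.0, 597.0], [204.0, 296.0]
x_gt, y_gt = [500.0, 600.0], [200.0, 300.0]
print(rmse_horizontal(x_db, y_db, x_gt, y_gt))  # each point is off by 3 m and 4 m -> 5.0
```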

Experimental Results and Analysis
The experimental results presented here are meant to test the accuracy of the proposed vector adjustment model and how well it recovers the actual positional displacement present in the input vectors. Shape points from the OSM database will be used, along with aerial stereo imagery, a USGS 1/3 arc second DEM, and GCP's as input to the vector adjustment model. The positional accuracy will be measured by comparing shape point coordinate differences of the OSM database and an adjusted shape point location to the real world ground surveyed location (ground truth). For example, the difference between the OSM database and the surveyed location represents the actual positional displacement present in the shape point; similarly, measuring the difference between the adjusted shape point and the surveyed location describes the accuracy of the adjustment itself. In addition, comparing the adjustment residuals to those determined from ground truth will provide a measure of how well the vector adjustment model recovers the true positional displacement. The bundle adjustment routine in the SOCET GXP © (version 3.2) photogrammetry and geospatial exploitation software by BAE Systems Corporation was used to facilitate the adjustment testing, while the RMSE metric was used to quantify the results.

Project Datasets
The aerial imagery used for this project is aerial frame stereo imagery taken with a Leica (Wild) RC10 aerial photography camera (sensor) over the Purdue University campus in West Lafayette, Indiana on 5 October 1999. The imagery has been scanned into digital format and has a scale of 1:4,000 or 1" = 100 m with a 12 cm ground sample distance (GSD). The altitude of the aircraft was 610 m above the terrain or 798 m above mean sea level, and the imagery was collected with 80% forward overlap and 60% side overlap. The sensor was calibrated by the USGS and includes a camera calibration report, which was used for interior orientation and contains information such as the calibrated focal length (152.4 mm), lens distortion parameters, and calibrated fiducial marks. No Inertial Navigation System (INS) data was collected during the imagery acquisition.
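The stated photo scale can be checked against the focal length and flying height with the usual relationship scale = f / H. The short sketch below does this arithmetic; the scan pixel size it derives from the 12 cm GSD is an inferred quantity, not a value reported for this dataset.

```python
focal_length_m = 0.1524            # calibrated focal length, 152.4 mm
flying_height_m = 610.0            # height above terrain
gsd_m = 0.12                       # ground sample distance of the scanned imagery

# Photo scale denominator: H / f, which should be close to the stated 1:4,000.
scale_denominator = flying_height_m / focal_length_m

# Ground distance covered by one inch on the photo (the text rounds this to 100 m).
ground_per_inch_m = 0.0254 * scale_denominator

# Implied scanner pixel size in micrometres (inferred, not from the source).
scan_pixel_um = gsd_m / scale_denominator * 1e6

print(round(scale_denominator), round(ground_per_inch_m, 1), round(scan_pixel_um, 1))
```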
The DEM is used to estimate initial approximations for the shape point elevations, and consists of a 1/3 arc second (about 10 m) raster DEM in GeoTIFF format [41]. The NED is the primary elevation data product of the USGS and is a seamless dataset with the best available raster elevation data of the conterminous United States, Alaska, Hawaii, and territorial islands. The NED is derived from diverse source data that are processed to a common coordinate system and unit of vertical measure, being geographic coordinates in units of decimal degrees, and in conformance with the North American Datum of 1983 (NAD 83). All elevation values are in meters and, over the conterminous United States, are referenced to the North American Vertical Datum of 1988 (NAVD 88). NED data is available nationally (except for Alaska) at resolutions of 1 arc-second (about 30 m) and 1/3 arc-second (about 10 m), and in limited areas at 1/9 arc-second (about 3 m) [41]. Care was taken to ensure the appropriate conversion between the NAD 83 and WGS 84 datums was applied when working with the dataset.
The truth dataset was determined by locating the subject test vectors (road centerlines) with high order GPS surveying techniques. The field survey campaign was conducted in July of 2012 on the campus of Purdue University. The purpose of the survey was to locate the actual real world positions of the road centerlines being tested, as well as to establish GCP's to facilitate the testing. The GPS equipment used was a Leica Viva RTK system and consisted of a GPS receiver (rover) connected to the Indiana Department of Transportation (INDOT) INWL Continuously Operating Reference Station (CORS base station) via a digital data modem. This type of setup allows the user to connect directly to a very high accuracy "zero order" control point and receive adjusted survey grade coordinates at the rover in real time.

OSM Urban City Streets Scenario
The city streets scenario is meant to test urban roads in the OSM database for positional accuracy by measuring the intersections of gridded street vectors. This scenario is located near the center of the campus and covers two square blocks (1 block is approximately 100 m) along 3rd, 4th, 5th, Russell, Waldron, and University streets. The adjusted vectors were derived with the proposed approach and can be visualized in Figure 4, where the original OSM database vectors are in green and the adjusted vectors in blue.
The base imagery has been georeferenced with surveyed ground control points so that an accurate spatial relationship can be drawn between the vector data and the real world surroundings depicted in the imagery. A close-up view of two areas is provided in Figures 5 and 6.
Visual inspection shows that the adjusted vectors match the centerline of the road in the imagery more closely than the original OSM database vectors do. The adjusted shape point coordinates and statistical confidences are shown in Table 1. Three GCPs were used to adjust the vector shape points, with a priori standard deviations of 2 cm in the x direction, 2 cm in the y direction, and 1 cm in the z direction. The reader is referred to [46] for additional information on the minimum number and configuration of GCPs needed to produce reliable adjustment results. The a priori standard deviations for the OSM shape points were significantly higher at 2, 3, and 0.20 m in the x, y, and z directions, respectively. The a posteriori reference variance for this adjustment was 0.89, signifying that the a priori weights are consistent with the uncertainty found in the observations post adjustment.
The adjustment residuals are summarized in Table 2. The residuals indicate which way each OSM shape point had to move to become more positionally accurate, with a positive value denoting a move to the East (Vx) or North (Vy). Interestingly, the vertical residuals are essentially zero, signifying no change between the initial approximation (DEM) and the adjusted shape point elevations. This is plausible because the DEM was generated by photogrammetric methods and tied to the NAVD88 datum. Similarly, the GCPs used as the reference frame for this project were also tied to the NAVD88 datum. In addition, the intersections are fairly flat, wide paved areas, which increases the accuracy of the DEM there. However, keep in mind that the LE90 from Table 1 is 0.20 m for these points.
To investigate further, the results are compared to surveyed ground truth in Table 3. The unadjusted deltas refer to the OSM shape point minus the truth coordinate value, while the adjusted deltas refer to the adjusted shape point coordinate minus the ground truth. This type of comparison shows the road's actual positional displacement (unadjusted) and what is left over after adjustment. The results show an 86% improvement in positional accuracy (RMSE) post adjustment, computed as 100 × (1 − 0.69/4.81). Furthermore, the vector adjustment was able to recover 95%, computed as 100 × (4.81/5.04), of the actual displacement present in the road shape points. The results also show that two of the points (OSM 137 and 28) actually moved slightly farther away from truth as a result of the adjustment. However, the truth values were determined by estimating the center of a wide, nondescript road intersection with traffic, which made it difficult to measure the exact center of the roadways. It is therefore possible that the adjusted positions are closer to truth than the deltas initially suggest, especially considering that the CE90 values of the two shape points are 13 and 16 cm, respectively.
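The two percentages quoted above follow from three values given in the text (0.69, 4.81, and 5.04 m). The sketch below reproduces the arithmetic; the variable names reflect one reading of how the values enter the two ratios and are not labels taken from the tables.

```python
# Values in meters, as quoted in the text. Names are illustrative assumptions.
REMAINING_M = 0.69   # numerator of the improvement ratio
RECOVERED_M = 4.81   # denominator of the improvement ratio, numerator of recovery
ACTUAL_M = 5.04      # denominator of the recovery ratio


def percent_improvement(remaining: float, reference: float) -> float:
    """Percentage reduction of `remaining` relative to `reference`."""
    return 100.0 * (1.0 - remaining / reference)


def percent_recovered(recovered: float, actual: float) -> float:
    """Percentage of `actual` accounted for by `recovered`."""
    return 100.0 * recovered / actual


improvement = percent_improvement(REMAINING_M, RECOVERED_M)
recovery = percent_recovered(RECOVERED_M, ACTUAL_M)
print(f"improvement: {improvement:.1f}%, recovered: {recovery:.1f}%")
```

Rounded to whole percentages, these give the 86% and 95% figures reported in the text.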
The surveyed ground truth was determined in July of 2012, while the imagery was flown earlier, in 1999. The challenge here was accounting for the construction that took place in the 12+ years in between. For example, the newer-looking pavement made it evident that the roads associated with OSM shape points 557-573 have been rebuilt, and it is possible that a grade change associated with the reconstruction project accounts for the larger elevation differences seen at these points.
An additional check was implemented to ensure that the adjustment produces reliable results. Several GCPs were included in the adjustment as "check points" only. Check points do not influence the outcome of the adjustment; rather, they are carried along and positioned using the computed model-to-ground relationship. The objective is to have known points that can be carried through the adjustment process to verify the performance of the vector adjustment. Table 4 contains the differences between the check points' adjusted coordinates and the truth coordinates measured with GPS. The results in Table 4 show the RMSE of the check points to be 15 cm, which is consistent with the average CE90 of the vector shape points at 14 cm. This is another metric that provides confidence in the adjusted shape point positions in Table 1.
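As an illustration of the check point comparison, the sketch below computes a horizontal RMSE from adjusted-minus-truth coordinate differences. It assumes the RMSE is taken over the combined x and y differences, which may differ from the exact definition used for Table 4, and the input values are hypothetical rather than the values in that table.

```python
import math


def horizontal_rmse(deltas: list[tuple[float, float]]) -> float:
    """RMSE of horizontal differences; each delta is (dx, dy) in meters."""
    if not deltas:
        raise ValueError("at least one check point is required")
    squared = [dx * dx + dy * dy for dx, dy in deltas]
    return math.sqrt(sum(squared) / len(squared))


# HYPOTHETICAL adjusted-minus-GPS differences for three check points (meters).
check_point_deltas = [(0.10, -0.05), (-0.12, 0.08), (0.06, 0.11)]
rmse_m = horizontal_rmse(check_point_deltas)
print(f"check point RMSE: {rmse_m:.3f} m")
```

A value of this kind can then be compared against the CE90 estimates of the adjusted shape points, as is done in the text.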

Feature Attributes
Implementing the proposed approach provides an adjusted road vector that is close to truth, as indicated by the scenario analysis above. The truth vector represents a location that is spatially accurate, in an absolute sense, in relation to the actual features on the ground because it was derived with high-order geodetic surveying techniques (RTK GPS from a CORS) by a licensed professional surveyor. Therefore, since the adjusted position of the road vector is close to the truth vector, it can be considered to be "positionally accurate". Having computed this valuable information, it makes sense to store it at the feature level going forward so that users can easily access and exploit it to make more informed decisions about the data they are working with.
A feature level attribute refers to information saved about an individual geometric feature in a geographic database, e.g., the shape point adjustment residuals or a polyline's RMSE value, both of which describe positional accuracy. Example attributes for a shape point are shown in Table 5, with the adjusted information in italics.
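One way such feature level attributes might be represented in software is sketched below. The field names and the example values are hypothetical and are not taken from Table 5; the sketch only shows residuals and accuracy estimates travelling with an individual shape point.

```python
import math
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ShapePointAccuracy:
    """Positional accuracy attributes stored with one shape point (meters)."""

    point_id: str
    vx_m: float    # adjustment residual, positive toward East
    vy_m: float    # adjustment residual, positive toward North
    vz_m: float    # adjustment residual in elevation
    ce90_m: float  # horizontal accuracy estimate of the adjusted point
    le90_m: float  # vertical accuracy estimate of the adjusted point

    @property
    def horizontal_shift_m(self) -> float:
        """Magnitude of the horizontal move applied by the adjustment."""
        return math.hypot(self.vx_m, self.vy_m)


# HYPOTHETICAL values for illustration only.
point = ShapePointAccuracy("example-1", 3.0, -4.0, 0.0, 0.14, 0.20)
record = asdict(point)  # plain dict, ready to write as feature attributes
print(point.horizontal_shift_m, record)
```

Keeping the residuals and the CE90/LE90 estimates together on the feature lets a user filter or rank features by accuracy without rerunning the adjustment.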

Head-to-Head Accuracy Assessment of OSM, TNM, and TIGER 07 Roads
The head-to-head accuracy assessment of OSM, the USGS National Map (TNM), and TIGER 07 roads is meant to demonstrate a practical application of the proposed vector adjustment model. Many applications could benefit from knowing the positional accuracy of shape points and vectors in a geographical database, e.g., GPS navigation, vector-to-vector conflation, operational planning, and improved decision making, to name a few. In this example, vector road centerlines and shape points were extracted from each database for the same location used in the urban city streets scenario above. Care was taken to ensure that the datasets were referenced to the same horizontal and vertical datums, WGS84 and NAVD88, respectively. Each dataset was adjusted independently using the same GCPs with the same a priori standard deviations (5 cm in x/y and 8 cm in elevation) and the same image coordinates for the shape points. The a priori standard deviations for the ground location of the shape points in each dataset were varied according to an estimate of the positional accuracy. Figure 7 depicts the datasets referenced to an accurately georeferenced base image.
In Figure 7 the OSM vectors are in green, the TNM vectors in yellow, and the TIGER 07 vectors in blue. From visual inspection, the TIGER 07 lines appear to be shifted more than the OSM or TNM lines. However, keep in mind that a general user may not have base imagery, or may not be comfortable enough with the accuracy of the georeferencing, to make this judgment without the vector adjustment. The horizontal adjustment residuals for each dataset are summarized in Table 6. A close-up view of the center intersection showing the spatial relationship of the road centerlines is provided in Figure 8. The numbers suggest that the TNM roads have the best positional accuracy, as indicated by the smallest RMSE estimate of 2.89 m, compared to OSM at 4.35 m and TIGER 07 at 19.17 m. This is also confirmed by comparing the shape point adjustment residuals, which show that the TNM points moved less than the OSM and TIGER 07 points. Note that the RMSE value for OSM differs slightly from the value in the analysis above because two points were removed so that an accurate comparison could be made between the three datasets. The TIGER 07 roads have the largest displacement, as suggested by visual inspection, with two of the shape points moving over 20 m to the west. Overall, the TIGER 07 roads appear to have a bias in the East and South directions, as indicated by the negative x-residuals and positive y-residuals.
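The comparison above can be sketched in a few lines: rank the datasets by the reported RMSE values and infer the direction of a systematic bias from the sign of the mean residuals. The sign convention follows the text (a positive residual means the point moved East or North, so the stored point sat to the West or South). The residual lists are hypothetical, and the function names are illustrative.

```python
from statistics import mean

# RMSE of the horizontal adjustment residuals as reported in the text (meters).
RMSE_M = {"OSM": 4.35, "TNM": 2.89, "TIGER 07": 19.17}


def rank_by_rmse(rmse_by_dataset: dict[str, float]) -> list[str]:
    """Return dataset names ordered from smallest to largest RMSE."""
    return sorted(rmse_by_dataset, key=rmse_by_dataset.get)


def _side(mean_residual: float, if_negative: str, if_positive: str) -> str:
    if mean_residual < 0:
        return if_negative
    if mean_residual > 0:
        return if_positive
    return "none"


def bias_direction(vx: list[float], vy: list[float]) -> tuple[str, str]:
    """Side on which the stored points sit relative to their adjusted positions."""
    # A point that had to move West (negative Vx) was stored too far East.
    return _side(mean(vx), "East", "West"), _side(mean(vy), "North", "South")


# HYPOTHETICAL residuals (meters) with the sign pattern described for TIGER 07.
example_vx = [-21.0, -20.5, -15.0]
example_vy = [4.0, 6.0, 5.0]

print(rank_by_rmse(RMSE_M))
print(bias_direction(example_vx, example_vy))
```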

Conclusions
The proposed approach and vector adjustment model were developed to assess the positional accuracy of geographical vector data, such as road centerlines. The OSM database provides a unique, dynamic environment to use as the test subject for this research because its underlying purpose of providing open source mapping by the people, to the people, underscores the importance of knowing how good the data are. For example, OSM contributors are mostly volunteers and non-professionals who have an interest in mapping a local area they are familiar with. In addition, there are no cartographic or data quality standards in place to ensure that contributors "map" in a similar fashion or adhere to any specific equipment requirements (GPS), field collection procedures, image mensuration standards, or map accuracy standards.
Most of the current methods of determining positional accuracy are based on comparing test vectors to a reference/truth dataset that is known to be of higher quality. However, these methods are not rigorous in nature, i.e., they do not thoroughly model and propagate error in the system. To address this issue, the vector adjustment model presented here applies rigorous photogrammetric positioning principles to vector data to determine the positional accuracy of shape points. Aerial imagery was used to build an analytical stereo model, with GCPs and vector shape points, to enforce the collinearity equations with a bundle adjustment. Post adjustment, the vector shape points are transformed closer to their "true" real world ground locations, and the adjustment residuals describe the positional accuracy of each shape point. Furthermore, accuracy estimates (CE90 and LE90) are computed for the adjusted shape points, which give a user confidence in the information coming out of the adjustment.
The proposed approach was tested on several urban gridded city streets from the OSM database. The results show that the post adjusted shape points improved positionally by 86%. Furthermore, the vector adjustment was able to recover 95% of the actual positional displacement present in the shape points. Once this valuable information is computed for the vector data, it can be recorded as an attribute at the feature level, thereby improving the overall usefulness of the database and allowing a user to make more informed decisions based on the data. To demonstrate a practical application of the vector adjustment, it was used to characterize the positional accuracy of OSM, TNM, and TIGER 07 road vectors by comparing the RMSE values of the adjustment residuals.

Future Work and Recommendations
An application of this research would be to use the adjusted shape point residuals output from the vector adjustment model as input for a "smart conflation" procedure. For example, the scenario might include two road networks with varying attributes and geometries that could benefit from being integrated into one product. Traditionally, the user specifies one of the datasets to be held as the reference that controls the geometry (for matched features) that prevails in the conflated product. This decision is usually based on the user's overall knowledge of and experience with the data and is made at the database or layer level. However, implementing the proposed vector adjustment model could provide positional accuracy information at the feature level, i.e., for individual road centerlines. This information could then be used in a conflation solution where the matched features are compared and the one with the smallest positional displacement prevails as the reference and is transferred to the conflated product.
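The selection rule described above might look as follows in code. This is a sketch only: it assumes feature matching has already been done, it uses a per-feature RMSE as the measure of positional displacement, and the class name, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One matched road feature with its feature level accuracy attribute."""

    source: str       # dataset the feature comes from
    feature_id: str   # identifier within that dataset
    rmse_m: float     # positional displacement from the vector adjustment


def pick_references(matches: list[list[Candidate]]) -> list[Candidate]:
    """For each group of matched features, keep the smallest displacement."""
    return [min(group, key=lambda c: c.rmse_m) for group in matches if group]


# HYPOTHETICAL matched features from two road networks.
matches = [
    [Candidate("OSM", "a", 4.4), Candidate("TNM", "b", 2.9)],
    [Candidate("OSM", "c", 1.2), Candidate("TNM", "d", 3.5)],
]
references = pick_references(matches)
print([(c.source, c.feature_id) for c in references])
```

Because the choice is made per matched group, the conflated product can draw geometry from different datasets in different places instead of holding one whole layer as the reference.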
This research project utilized aerial frame imagery that covered less than 1 km² on the ground. Although it was useful for demonstrating this proof of concept, it would be beneficial to extend this work to satellite imaging technology. For example, commercial satellites such as Quickbird are known to capture imagery having 10 km², or more, of ground coverage. This would allow a much larger piece of a vector database to be tested. The positioning model for the imagery would be somewhat different because satellites use scanning systems to acquire lines of imagery over time, which requires the use of sensor modeling to determine accurate image and ground coordinates. Nevertheless, the sensor model could be obtained from the satellite image provider or approximated using a replacement sensor model [47].
The proposed approach is somewhat manual at this point, requiring an individual to mensurate the imagery to identify image coordinates for the shape point locations. However, this process could be automated by performing a vector-to-image registration between the vector road network and the corresponding roads in the imagery. Road intersections are ideal candidates for establishing correspondence between the two datasets for two reasons: (1) the linear road vectors usually intersect at a node whose coordinate represents the center of the intersection, and (2) intersections are usually handled well by Automated Feature Extraction (AFE) algorithms, which are needed to identify the intersection in the imagery. The vector adjustment could then be automated based on input files for the image coordinates, object point coordinates (vector database), GCPs, initial approximations for the EO parameters, their associated positional uncertainties, etc.
Finally, open source geographic data offers a unique opportunity to exploit the geolocation attributes associated with it. Future research should investigate how OSM roads could be used (as pseudo-GCPs) to determine or improve the location of a sensor (position and attitude) that collects imagery. This is essentially the reverse of what is being done here and could prove useful for geopositioning applications of non-traditional sensing systems, such as Unmanned Aerial Vehicle (UAV) platforms.

Figure 2. Schematic diagram of the vector adjustment concept.

Figure 3. Geometric diagram of the vector adjustment concept.

Table 1. Adjusted shape point information.

Table 4. Check point comparison.