Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods

Du, Hang; Wang, Shuaizhou; Zhang, Linlin; Amo-Boateng, Mark; Adu-Gyamfi, Yaw

doi:10.3390/infrastructures10080191

Open AccessArticle

Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods

by

Hang Du

^1,*

,

Shuaizhou Wang

²,

Linlin Zhang

¹

,

Mark Amo-Boateng

¹

and

Yaw Adu-Gyamfi

¹

Department of Civil and Environmental Engineering, University of Missouri-Columbia, Columbia, MO 65201, USA

²

Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65201, USA

^*

Author to whom correspondence should be addressed.

Infrastructures 2025, 10(8), 191; https://doi.org/10.3390/infrastructures10080191

Submission received: 9 June 2025 / Revised: 13 July 2025 / Accepted: 20 July 2025 / Published: 22 July 2025

Download

Browse Figures

Versions Notes

Abstract

An accurate assessment of sidewalk conditions is critical for ensuring compliance with the Americans with Disabilities Act (ADA), particularly to safeguard mobility for wheelchair users. This paper presents a novel 3D reconstruction framework based on neural radiance field (NeRF), which utilize a monocular video input from consumer-grade cameras to generate high-fidelity 3D models of sidewalk environments. The framework enables automatic extraction of ADA-relevant geometric features, including the running slope, the cross slope, and vertical displacements, facilitating an efficient and scalable compliance assessment process. A comparative study is conducted across three surveying methods—manual measurements, LiDAR scanning, and the proposed NeRF-based approach—evaluated on four sidewalks and one curb ramp. Each method was assessed based on accuracy, cost, time, level of automation, and scalability. The NeRF-based approach achieved high agreement with LiDAR-derived ground truth, delivering an F1 score of 96.52%, a precision of 96.74%, and a recall of 96.34% for ADA compliance classification. These results underscore the potential of NeRF to serve as a cost-effective, automated alternative to traditional and LiDAR-based methods, with sufficient precision for widespread deployment in municipal sidewalk audits.

Keywords:

sidewalk assessment; ADA compliance; neural radiance field; 3D reconstruction; survey method comparison

1. Introduction

Ensuring safe and accessible pedestrian infrastructure is a critical mandate under the Americans with Disabilities Act (ADA), which defines geometric requirements for sidewalk elements, such as running slopes, cross slopes, and vertical discontinuities, between pavement segments [1]. Accurate and timely assessments of these features are essential for identifying noncompliant conditions that pose safety risks—particularly for individuals who use wheelchairs or other mobility devices. This paper presents a novel 3D reconstruction framework based on neural radiance field (NeRF) that enables automated extraction of ADA compliance metrics—including the running slope, the cross slope, and vertical displacements—from a monocular video. The goal was to evaluate NeRF’s accuracy, scalability, and practicality compared to traditional manual surveys and LiDAR-based methods.

Municipalities have traditionally relied on manual field surveys to assess sidewalk geometry, employing tools such as digital levels, dual-axis tilt sensors (DASs), and measuring tapes [2,3,4]. Digital levels are typically used for evaluating slopes near curb ramps; however, irregular spatial sampling often leads to missed subtle variations [5]. Although DAS sensors are inexpensive and easy to operate, their high sensitivity to minor movements and vibrations undermines measurement precision [6]. Measuring tapes, commonly used for identifying trip hazards, frequently yield inconsistent results due to variability in user technique [7]. More advanced instruments, like GIS-integrated rolling-level systems, improve workflow efficiency but are often cost-prohibitive, with unit prices reaching up to USD 7500 [8]. Consequently, manual methods remain labor-intensive, time-consuming, and error-prone—making them unsuitable for large-scale or repeated sidewalk evaluations [9]. For instance, the City of Novi, Michigan, reportedly spent USD 68,000 to manually inspect 220 miles of sidewalks, at a cost of approximately USD 309 per mile [10].

In response to these limitations, researchers and practitioners have investigated vision-based alternatives, such as photogrammetry and LiDAR. LiDAR systems can produce high-resolution point clouds suitable for detailed geometric assessments, but their high cost and operational complexity limit their broader adoption [11]. Photogrammetry presents a more cost-effective solution but often lacks the precision to detect fine-scale elevation differences, particularly on low-texture surfaces, such as concrete [12]. Both approaches also demand controlled data acquisition conditions and extensive post-processing, which further restricts their scalability for routine ADA compliance evaluations.

Recent advances in 3D computer vision offer a new path forward. NeRF is an emerging technique that reconstructs 3D scenes from 2D images by learning a continuous volumetric representation of the environment [13]. Unlike traditional photogrammetry, which triangulates discrete points, a NeRF model the color and density of a scene as a function of spatial coordinates and viewing direction. This enables the generation of dense, photo-realistic 3D models from a monocular video, even in casual or uncontrolled capture conditions [14]. While NeRF has shown promise in domains such as robotics and augmented reality, its application in civil infrastructure—particularly for sidewalk condition assessments—remains largely unexplored.

This study introduces a NeRF-based 3D reconstruction framework that leverages consumer-grade monocular video input to extract ADA-relevant geometric metrics, including the running slope, the cross slope, and vertical displacements. The primary objective was to evaluate NeRF’s feasibility as a low-cost, scalable alternative to manual and LiDAR-based methods. A comparative analysis was conducted across three data collection techniques—manual, LiDAR, and NeRF—benchmarked on real-world sidewalk and curb ramp scenarios. Each method was assessed based on measurement accuracy, cost, level of automation, and scalability, with the goal of informing future practices in sidewalk asset management and accessibility compliance monitoring.

The remainder of this paper is organized as follows. Section 2 provides a review of related work in sidewalk assessment technologies, including manual, LiDAR, and vision-based methods. Section 3 details the study area, data collection procedures, and the proposed NeRF-based 3D reconstruction and metric extraction framework. Section 4 presents the evaluation metrics and experimental results, including a comparative analysis of accuracy, cost, and scalability across all methods. Finally, Section 5 discusses the implications of the findings and offers conclusions and recommendations for future research and practical deployment.

2. Review of Sidewalk Assessment Techniques

A variety of tools have been developed to assess sidewalk conditions for ADA compliance, ranging from handheld instruments to advanced 3D modeling systems. This review summarizes key approaches across sensor-based platforms, 2D vision models, and 3D reconstruction technologies, with an emphasis on their ability to extract ADA-relevant geometric metrics, such as the running slope, the cross slope, and vertical displacements. The limitations of these methods—particularly in terms of cost, automation, and precision—motivate the need for a scalable alternative, such as NeRF.

2.1. Sensor- and Vision-Based Methods

Low-cost sensor-based systems aim to improve the efficiency of sidewalk inspections compared to manual surveys. Ultra-Light Inertial Profilers (ULIPs), which integrate laser sensors and accelerometers, can detect surface slopes and irregularities more quickly than traditional tools [15]. Similarly, GPS-enabled devices, including “Sidewalk Surface Testers” and tablets mounted on wheelchairs, provide on-the-go measurements of slope and elevation [16]. While these systems increase operational efficiency, their resolution is insufficient to capture localized slope transitions and minor vertical displacements critical for ADA compliance. Parallel developments in 2D computer vision have produced deep learning models for defect detection using RGB imagery. For example, YOLO [17] and Sidewalk Defect Detection Models (SDDMs) [18] offer fast, automated crack identification from UAV or mobile device footage. These tools are affordable and scalable but lack depth data, which limits their capacity to measure slope or elevation changes—rendering them unsuitable for full ADA geometric assessments.

2.2. Three-Dimensional Reconstruction: Photogrammetry and LiDAR

Three-dimensional (3D) modeling techniques provide enhanced geometric details for infrastructure assessments. Among these, photogrammetry, based on Structure-from-Motion (SfM) and Multi-View Stereo (MVS), reconstructs 3D surfaces from overlapping image sets and is widely adopted due to its affordability and accessibility [19]. However, its performance is highly dependent on surface texture, lighting conditions, and camera quality. In texture-poor environments, such as concrete sidewalks, photogrammetry often fails to detect the millimeter-scale elevation differences required for ADA compliance assessments [20].

LiDAR technologies address many of these limitations by generating lighting-independent, high-resolution 3D point clouds. High-end terrestrial LiDAR systems can achieve sub-centimeter accuracy and are frequently employed for slope and trip hazard analysis [21,22]. Nevertheless, these systems are expensive, demand multiple scan positions, and require manual point cloud registration, making them challenging to deploy in typical sidewalk environments [23]. Mobile LiDAR offers partial automation and supports semantic segmentation; however, it remains constrained by occlusions [24], uneven terrain [25], and limited fields of view [26]. While occlusion-handling algorithms have been developed [27], they are primarily effective in regular or planar scenes and often require manual parameter tuning. Low-cost LiDAR devices, including those embedded in smartphones, offer portability but suffer from reduced depth resolution and a limited sensing range, making them more suitable for supplementary validation [28]. Other methods, such as CAD-based 3D reconstructions, rely on idealized input and substantial manual effort, restricting their scalability across varied infrastructure types [29]. As emphasized by Sestras et al. [12], both photogrammetry and LiDAR continue to face significant challenges in achieving scalable, millimeter-level elevation mapping.

2.3. Neural Radiance Field (NeRF) for 3D Sidewalk Modeling

Recent advances in neural rendering introduce NeRF as a promising alternative for 3D modeling. A NeRF learns a continuous volumetric function that maps 3D coordinates and viewing directions to color and density, enabling photorealistic 3D reconstructions from a sparse, monocular video [30]. Unlike SfM, a NeRF does not rely on feature triangulation or dense input and is robust to surface textures and lighting variations—making it well-suited for reconstructing real-world sidewalk scenes [14]. Recent studies have demonstrated NeRF’s utility in civil engineering and urban applications, including damage detection [31], 3D documentation of building interiors [32], and reconstruction of complex urban environments [33]. Qin et al. [34] applied crowd-sourced imagery and NeRF to generate street-level 3D scenes for autonomous vehicle navigation. Despite its demonstrated potential, NeRF has not yet been investigated for pedestrian-scale applications, like ADA sidewalk audits, particularly for measuring slopes and vertical displacements.

2.4. Slope and Elevation Estimation from 3D Data

Even with accurate 3D reconstructions, extracting slope and displacement measurements remains challenging. End-to-End (E2E) methods estimate slope based on elevation changes between segment endpoints but assume planar surfaces—an oversimplification that leads to errors in curb ramps and irregular terrains [35]. Linear regression techniques, while more robust, perform best on uniform pavement and may underperform on curved or discontinuous segments [36]. To address this limitation, some researchers have adopted Principal Component Analysis (PCA) to estimate cross slopes more accurately in such complex geometries as curved ramps [37]. Alternative approaches using Google Street View (GSV) depth maps convert panoramic imagery into digital terrain models (DTMs), enabling slope estimations without field data. However, due to noise and resolution limits in consumer-grade cameras, GSV-derived slopes often fail to capture fine elevation changes required for ADA compliance [38]. Other researchers have explored vertical displacement detection using point cloud projections. Jiang [39] proposed a method for detecting vertical displacement and mapping sidewalk deficiencies using mobile LiDAR-derived elevation data and orthophotos. Yu [40] extracted crack lines from elevation maps using clustering and skeletonization algorithms. While effective for displacement and defect analysis, these methods are not easily adaptable for slope extraction and often require significant preprocessing.

In summary, while current technologies offer partial solutions for sidewalk geometry assessments, no existing method simultaneously achieves high precision, low cost, automation, and scalability for ADA compliance evaluations. Manual methods lack coverage. LiDAR is accurate but expensive. Photogrammetry and 2D vision approaches are limited in vertical fidelity. NeRF presents a compelling alternative, but its effectiveness for sidewalk-scale applications has not been validated. This study addresses this gap by evaluating NeRF’s capability to extract ADA-relevant slope and displacement metrics and by comparing its performance to established LiDAR and manual survey methods.

3. Methodology

This study evaluates the effectiveness of a NeRF for sidewalk geometry assessment and compares its performance against traditional manual measurements and LiDAR-based techniques. The methodology included three main phases: data collection, 3D model generation, and performance evaluation. Figure 1 illustrates the overall workflow. In the data collection phase, sidewalk and curb ramp segments were surveyed using three categories of tools: (1) a consumer-grade monocular video (for NeRF); (2) high-end terrestrial and smartphone-based LiDAR sensors; and (3) traditional manual instruments, including digital levels, dual-axis tilt sensors (DASs), and measuring tapes. These tools were used to capture geometric information relevant to ADA compliance, including the running slope, the cross slope, and vertical displacements.

For the NeRF-based workflow, a GoPro camera was used to capture a monocular video around each sidewalk segment. These videos were processed through a NeRF pipeline to generate high-fidelity 3D point clouds. In parallel, LiDAR data were collected both using a stationary high-end terrestrial scanner and a mobile device (iPhone), while manual measurements were conducted on-site using field instruments. The second phase involved extracting ADA-relevant parameters from the 3D point clouds and manual data sources. For each method, the cross slope, the running slope, and vertical displacements were computed using tailored extraction techniques. These measurements were aligned with ADA thresholds to determine compliance levels. In the final phase, results from the three survey methods were evaluated across several criteria: measurement accuracy (with LiDAR serving as a reference standard), operational cost, time requirements, degree of automation, and scalability. This comparative framework allows for a comprehensive analysis of each method’s strengths and limitations in the context of large-scale sidewalk assessments.

3.1. Study Area and Data Collection

To evaluate the proposed NeRF-based method alongside traditional and LiDAR-based techniques, five sidewalk scenarios were selected from the University of Missouri campus. The study area includes four sidewalk segments and one curb ramp (Sites 1–4 and Site R-curb ramp), chosen to represent a diverse set of conditions, including sloped terrains, cracks, vertical displacements, root intrusions, land subsidence, and irregular curb ramp profiles. These sites are illustrated in Figure 2.

Figure 3 presents an overview of the data collection methods used in this study. Each sidewalk segment was surveyed using the following: (a) traditional manual tools (digital levels, DAS sensors, and tape measures), (b) LiDAR systems (high-end terrestrial and low-end smartphone-integrated systems), and (c) a monocular vision setup for NeRF reconstruction. All three approaches aimed to extract ADA-relevant geometric metrics—the running slope, the cross slope, and vertical displacements—for later comparison.

3.2. Manual Survey Procedures

Manual data collection followed ADA field guidelines using a digital level, DAS sensor, and measuring tape. The digital level was aligned perpendicular or parallel to the direction of pedestrian travel to measure the cross and running slope, respectively. DAS sensors were placed on the sidewalk surface for approximately one minute. Fluctuating sensor readings were treated statistically: if the approximated distribution was normal, the mode was recorded as the representative slope value. In addition, a smartphone-based inclinometer app (iPhone 15 “Bubble Level”) was used to estimate slope angles. The app leverages onboard accelerometers and gyroscopes to provide real-time inclination readings. While it lacks the precision of dedicated sensors, its ease of use made it a useful supplementary tool for validating measurements in the field and ensuring consistency across multiple runs. Vertical displacements between adjacent sidewalk slabs were measured at height discontinuities using a tape measure, consistent with municipal field practices. These manual measurements served both as baseline observations and for validating automated techniques.

3.3. LiDAR Survey Procedures

For high-end data acquisition, a Livox HAP TX sensor was deployed on a fixed tripod 1 m above the sidewalk. The sensor, integrated with a Robot Operating System (ROS), captured point cloud and IMU data over a 20 s interval. This configuration provided a spatial resolution of 0.18° (horizontal) and 0.23° (vertical), with a data acquisition rate of 452,000 points per second—comparable to 144-line LiDAR systems. All LiDAR frames were later merged to reduce sparsity in the point cloud. The resulting datasets, shown in Figure 4b, were processed using the slope and displacement extraction routines described in Section 3.4.4 and Section 3.4.5.

To detect vertical displacements (trip hazards), both manual and automated methods were employed. Reflective tape was applied to the top and bottom of known elevation discontinuities to facilitate accurate identification in the point cloud. Manual measurements involved selecting top and bottom points and calculating the Euclidean distance between their average coordinates. The automated pipeline applied the same logic to LiDAR-derived data without manual intervention. The procedures for automated displacement extraction are detailed in Section 3.4.5.

3.4. Monocular Vision-Based Approach Using NeRF

The third data collection method employed a monocular vision-based workflow using a NeRF to reconstruct high-fidelity 3D models of sidewalk environments from consumer-grade video footage. A GoPro camera was mounted on a bicycle at a height of approximately 1 m and used to record approximately one minute of continuous video while circling each site at a steady speed of 1.4–1.5 m/s (see Figure 5). This setup allowed for efficient and flexible data collection under natural lighting conditions without the need for specialized 3D sensors.

3.4.1. Preprocessing and Camera Pose Estimation

The recorded video was processed using nerfstudio, an open-source framework for NeRF training and rendering. Still frames were extracted from the video at regular intervals to ensure even spatial coverage of the scene. To enable accurate 3D reconstruction, camera poses for each frame were estimated using the Structure-from-Motion (SfM) pipeline implemented in COLMAP. This process included detecting image features, matching key points across frames, and performing bundle adjustments to refine camera intrinsics and extrinsics. Each extracted frame was also resized and normalized for brightness and contrast to maintain consistent visual quality across the training dataset. These preprocessing steps were essential for ensuring reliable NeRF model performance and scene fidelity.

3.4.2. NeRF Model Training and 3D Reconstruction

The preprocessed frames and camera poses were used to train a volumetric scene representation using the Nerfacto architecture within nerfstudio. Nerfacto maps 3D spatial coordinates and camera viewing directions to RGB color and volumetric density (following the volumetric rendering formulation outlined in Appendix A), enabling a photorealistic rendering of the scene from arbitrary viewpoints [41]. As illustrated in Figure 6, the training pipeline consists of ray sampling, pose refinement, and volumetric field optimization. The model was trained with a learning rate of 0.0005, using 4096 rays per batch, over 1,000,000 iterations. Training was performed on a high-performance workstation equipped with an Intel Core i9-13900 CPU, an NVIDIA GeForce RTX 4080 GPU, and 32 GB of RAM. Each training session required approximately 20 min per site. Upon completion, a dense 3D point cloud was extracted and exported in standard formats (e.g., .ply, .las) for further analysis.

3.4.3. Metric Extraction and Evaluation

From the NeRF-generated point clouds, three ADA-relevant geometric metrics were automatically extracted: the running slope, the cross slope, and vertical displacements. The same post-processing techniques developed for LiDAR-based data (detailed in Section 3.4.4 and Section 3.4.5) were applied to the NeRF outputs to ensure methodological consistency across all data sources. To assess the geometric accuracy of NeRF reconstructions, the extracted metrics were compared against those derived from high-end terrestrial LiDAR, treated as ground truth. Evaluation criteria included precision, recall, and F1 scores for 3D alignment. In addition, the visual fidelity of NeRF-generated 2D renderings was assessed using the peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS). The robustness of the NeRF pipeline under varying lighting conditions was also tested to evaluate performance consistency in real-world scenarios. Quantitative results for these evaluations are provided in the Results Section 4.4 and Appendix B and Appendix C. A validation study using high-end LiDAR ground truth assessed NeRF’s accuracy in capturing geometric features across diverse environments and spatial scales; the results are provided in Appendix D.

3.4.4. Slope Detection from NeRF Point Clouds

After point cloud reconstruction, the cross slope and running slope were extracted using an enhanced End-to-End (E2E) method designed to better capture localized geometric variations in complex surfaces, such as curb ramps and flares.

To compute the slope, multiple transect lines were defined across each ramp surface. Each line was subdivided into a series of uniform bins based on the aspect ratio of the surface to ensure even spatial coverage. Within each line, adjacent bins were used to calculate segment slopes, which were then averaged to represent the overall slope along the line. The final slope estimate for the ramp was obtained by averaging the slopes across all transects. This enhancement allowed for the detection of small-scale slope changes that may be missed by single-point tools, like digital levels or dual-axis tilt sensors. By aggregating across multiple local segments, this method also reduced the influence of noise or irregular pavement. The slope between two bins was calculated using the following:

S l o p e = \frac{z_{2} - z_{1}}{\sqrt{{(x_{2} - x_{1})}^{2} + {(y_{2} - y_{1})}^{2}}}

(1)

where (

x_{1}

,

y_{1},

z_{1}

) and (

x_{2}

,

y_{2}

,

z_{2}

) represent the average coordinate values of neighboring bins.

Figure 7 shows an example of a selected line across a flare surface, divided into six bins, with slope vectors computed between each pair.

3.4.5. Vertical Displacements and Trip Hazard Detection

To enable vertical displacement detection in NeRF-generated point clouds, a 12-inch reference ruler was placed at each scene. This ruler was used to compute a voxel-to-real-world scaling factor:

c o e f f i c i e n t_{c o n v e r s i o n} = \frac{V D_{r e a l}}{V D_{0}}

(2)

A vertical displacement was then computed as follows:

T r i p H a z a r d_{r e a l} = T r i p H a z a r d_{v o x e l} \cdot c o e f f i c i e n t_{c o n v e r s i o n}

(3)

Manual measurements were also extracted for validation, and a comparison between NeRF and LiDAR measurements at a specific point is provided in Table 1. The small absolute differences observed suggest that NeRF can serve as a reliable alternative to LiDAR for detecting trip hazards.

Automatic hazard detection using elevation gradients: To automate trip hazard identification, the z-axis elevation values were reshaped into a 1D array and binned into 255 intervals to analyze vertical change distributions. Vertical displacement gradients were computed using the Sobel operator in the y-direction. A threshold-based filter was applied to flag areas exceeding ADA limits:

Large trip hazard: >4 inches (10.16 cm).
Minor trip hazard: >0.5 inches (1.27 cm).

The detection method estimated the slope using the gradient:

s l o p e = \frac{d y}{d x} = \frac{d y}{σ \times p}

(4)

d y > T = s l o p e \times σ \times p > T

(5)

where p is the resolution, defined as 1 cm per pixel; σ = 3 pixels (based on the crack width); and T is the detection threshold.

The 3 × 3 Sobel kernel used for computing gradients in the y-direction is as follows:

G_{y} = [\begin{matrix} - 1 & 0 & 1 \\ - 2 & 0 & 2 \\ - 1 & 0 & 1 \end{matrix}]

(6)

The y-gradient of the z-channel was computed as follows:

S_{y} (x, y) = I (x, y) * G_{y}

(7)

Significant changes were detected by applying a threshold:

|S_{y} (x, y)| > T

(8)

This approach enabled efficient pre-screening of large sidewalk areas for vertical discontinuities without requiring manual input.

4. Evaluation

This section presents two sets of evaluations. First, we assess the geometric and visual fidelity of the NeRF reconstructions compared to high-resolution LiDAR point clouds. Second, we evaluate sidewalk segments for ADA compliance using extracted slope and elevation metrics from NeRF, LiDAR, and manual methods. Five performance dimensions—accuracy, cost, time, automation, and scalability—were used to compare methods.

4.1. Visualization of Curb Ramp and Sidewalk

To support the interpretation of geometric surface features, the average elevation (z-value) within each spatial bin was computed and visualized using a color gradient, where lighter shades indicate higher elevations. This technique enabled clear visual comparisons of cross and running slopes across different sidewalk segments. For the curb ramp, elevation profiles generated from both NeRF and LiDAR point clouds showed strong agreement, as seen in Figure 8b,c. This close alignment suggests that a NeRF is capable of providing LiDAR-equivalent geometric measurements in complex sidewalk geometries. For Sidewalk 1, cross slope analysis was performed by first rotating the point cloud to remove the influence of the running slope. This top-down segmentation, shown in Figure 8e,f, revealed clearly defined slab boundaries. The first slab section showed noticeable subsidence (indicated by darker colors), while the left edge of the third section exhibited an elevated terrain (lighter colors), consistent with field observations.

When both the running and cross slopes were visualized together (Figure 8g), Sidewalk 4 exhibited a smooth color gradient from dark to light, representing a continuous elevation change. This gradient reflected the sidewalk’s sloped profile and confirmed agreement between the 3D reconstruction and the known site conditions. These visualizations demonstrate that a NeRF is not only capable of capturing slope and elevation variations with high fidelity but can also outperform LiDAR in producing denser, more continuous surface models. The higher point density in NeRF-generated reconstructions enhances the visibility of subtle features—such as slab transitions, elevation drops, and surface deformations—that are critical for evaluating ADA compliance.

4.2. NeRF Performance Metrics

4.2.1. Three-Dimensional Reconstruction Accuracy

To evaluate geometric accuracy, the NeRF point clouds were manually aligned with ground truth LiDAR scans in CloudCompare using twelve reflective control points. The alignment was refined with a rigid transformation matrix, allowing for scale, rotation, and translation normalization. Figure 9 illustrates the aligned point clouds from the NeRF and LiDAR. We assessed geometric similarity using the following:

Precision: the percentage of NeRF points within a threshold distance of the nearest LiDAR point.
Recall: the percentage of LiDAR points that were captured within a threshold by NeRF.
F1 score: the harmonic mean of precision and recall.

Thresholds of 1 cm and 5 cm were used to assess alignment sensitivity across resolutions. Precision and recall both exceeded 90% at 5 cm, demonstrating that NeRF reconstructions capture sidewalk geometry at a resolution suitable for engineering use.

4.2.2. Visual Rendering Accuracy

In addition to geometric accuracy, we evaluated NeRF’s ability to generate photorealistic 2D renderings using the following metrics:

PSNR (peak signal-to-noise ratio): measures pixel-level fidelity.
SSIM (Structural Similarity Index): evaluates perceptual similarity across luminance and structure.
LPIPS (Learned Perceptual Image Patch Similarity): measures high-level perceptual differences using deep feature maps.

A higher PSNR and SSIM and a lower LPIPS indicate higher visual fidelity. These metrics are summarized in Appendix B.

4.3. ADA Compliance Benchmarking

Table 2 summarizes the ADA slope and displacement thresholds used for compliance evaluation. For each segment and curb ramp, the slope and vertical displacements were computed from NeRF reconstructions and validated against LiDAR and manual measurements.

Each method was evaluated based on the following:

Accuracy: alignment with LiDAR ground truth.
Cost: equipment, software, and labor.
Time: collection and processing durations.
Automation: extent of manual intervention required.
Scalability: suitability for citywide sidewalk audits.

Table 2 provides slope ranges, compliance thresholds, and repair recommendations in alignment with MoDOT ADA standards [8].

4.4. Results

4.4.1. NeRF Performance

To comprehensively evaluate the NeRF reconstruction framework, both geometric accuracy and visual quality were assessed across all test sites. Figure 10 presents the sample reconstruction outputs, while Table 3 reports the average precision, recall, and F1 scores across multiple spatial thresholds. Table 4 summarizes the 2D image quality metrics—PSNR, SSIM, and LPIPS—which reflect fidelity to the input video frames. Higher PSNR and SSIM values denote superior visual quality, while lower LPIPS scores indicate closer perceptual similarity to ground truth images.

Across all sidewalk segments and curb ramps, the NeRF consistently demonstrated strong reconstruction performance. As expected, geometric precision and recall improved as the distance threshold for evaluation widened. Among the five surveyed sites, the curb ramp yielded the highest geometric precision, followed closely by Sidewalk 1, which also achieved high recall and F1 scores. These results suggest strong agreement between NeRF- and LiDAR-derived ground truth in these areas.

Sidewalk 3, however, exhibited the lowest precision and recall. While this might initially suggest poorer performance, the 2D image quality metrics tell a different story. Sidewalk 3 achieved the highest SSIM score (0.5712) and one of the lowest LPIPS scores (0.3337), indicating that the NeRF preserved fine structural and textural details more effectively than LiDAR in that location. This apparent discrepancy likely reflects limitations in the LiDAR ground truth, which may have under-sampled sharp discontinuities, like large trip hazards, which were captured more accurately in the NeRF reconstruction. Consequently, the reduced geometric overlap should not be interpreted as NeRF failure but rather as a challenge in benchmarking vision-based methods against sparse or occlusion-limited LiDAR data.

Among all evaluated locations, Sidewalk 1 emerged as the strongest candidate for benchmarking. It demonstrated excellent alignment with LiDAR in geometric terms and achieved top-tier scores across all image-based metrics. These findings suggest that a NeRF can deliver LiDAR-comparable reconstructions with high visual fidelity, offering a scalable and low-cost solution for sidewalk condition monitoring and ADA compliance assessments.

4.4.2. Slope and Vertical Displacement Measurement Results

Slope measurements of the curb ramp

To evaluate slope estimation accuracy on irregular surfaces, slope measurements for individual segments of the curb ramp were computed using the enhanced E2E method applied to both NeRF and LiDAR point clouds. These automated results were then compared with manual measurements collected using a digital level, a dual-axis tilt (DAS) sensor, and a smartphone (iPhone 15), as presented in Table 5. The NeRF, LiDAR, and digital-level measurements demonstrated strong agreement across all ramp components, consistently yielding the same ADA compliance classifications. This consistency reinforces the validity of NeRF as a reliable and accurate method for geometric slope assessments, comparable to high-end LiDAR systems.

In contrast, measurements obtained from the DAS and smartphone deviated notably from the NeRF and LiDAR baselines. The DAS exhibited substantial variability, likely due to its high sensitivity to minor positional changes and vibrations. Likewise, the smartphone’s built-in inclinometer produced less reliable results, underestimating slope values even in segments with clearly defined inclines. These inconsistencies highlight the limitations of consumer-grade devices and low-cost sensors in environments requiring high geometric precision. To further assess robustness, NeRF and LiDAR slope measurements were repeated under two lighting conditions—full sunlight and partial shade—on the same curb ramp. As detailed in Appendix C, both conditions yielded compliant results, with slightly reduced errors under shaded conditions. This outcome confirms the stability and reliability of NeRF- and LiDAR-derived slope estimates under variable illumination.

Slope measurements for sidewalk segments

Table 6 presents the slope measurements for Sidewalk 1, which was divided into four segments to facilitate localized comparisons. Measurements were obtained using five methods: NeRF, high-end LiDAR (used as the reference standard), a digital level, a dual-axis tilt (DAS) sensor, and a smartphone (iPhone 15). The segmentation of the sidewalk was necessary to accommodate the limited spatial coverage of the LiDAR system and to isolate localized geometric variations that could affect compliance.

The NeRF and LiDAR produced closely aligned results across all segments, with a mean absolute difference of 0.49° for cross slopes and 0.95° for running slopes. This high degree of agreement demonstrates a NeRF’s capacity to capture slope geometry with near-LiDAR precision. Importantly, both the NeRF and LiDAR consistently detected noncompliant slope conditions across all segments, supporting their reliability for ADA compliance evaluations.

By comparison, the digital level showed greater variability, particularly in Segments 1 and 4, where it failed to identify noncompliant running slopes that were clearly detected by the NeRF and LiDAR. These discrepancies are likely due to the digital level’s reliance on discrete point measurements, which can miss subtle or localized slope variations—especially when the pavement is irregular or warped. Measurements from the DAS sensor and smartphone were the least accurate. Both methods frequently underestimated slope magnitudes and failed to detect several noncompliant conditions, particularly for running slopes. These errors likely stem from the DAS’s susceptibility to motion-induced noise and the smartphone’s limited angular resolution. Overall, the NeRF provided highly accurate and reliable slope assessments, aligning closely with LiDAR and significantly outperforming lower-cost alternatives. These findings highlight NeRFs’ promise as a scalable and cost-effective solution for detailed sidewalk compliance assessments, especially in settings where traditional survey instruments may be impractical or inaccessible.

Vertical displacement detection

Sidewalk segments 2, 3, and 4 presented challenging conditions for detecting abrupt or localized elevation changes—commonly associated with vertical trip hazards. In the LiDAR datasets (see Figure 8), sparse point density and the reliance on precise sensor positioning limited the resolution of fine surface features. Even with reflective tape markers, significant anomalies, such as the large trip hazard on Sidewalk 3 and a surface crack on Sidewalk 4, were poorly defined or incompletely captured.

In contrast, the NeRF-based reconstructions delivered higher spatial coverage and resolution without requiring constrained viewing angles. This flexibility enabled the model to detect fine-scale vertical features with greater fidelity, as evidenced by the detailed reconstructions shown in Figure 8. Ground-truthing was performed using a tape measure to identify trip hazards exceeding ADA thresholds—specifically, displacements greater than 4 inches (10.16 cm) on Sidewalks 2 and 3 and smaller displacements above 0.5 inch (1.27 cm) on Sidewalk 4. To visualize the distribution of vertical changes, a histogram of elevation gradients was generated (Figure 11a), with larger displacements appearing toward the right tail of the distribution.

To automate hazard detection, vertical gradient magnitudes were computed along the y-axis, and points exceeding thresholds of 4 inches and 0.5 inch were flagged. These points were color-coded based on height (Figure 11b) and overlaid on the original point cloud for spatial interpretation. A side-view projection further validated the accurate capture of elevation discontinuities. To corroborate the method’s accuracy, an idealized reference model was constructed, with details provided in Appendix E. The color comparison between the model’s known elevation profile and the NeRF-detected hazard locations confirmed that the NeRF reliably captured both the location and magnitude of trip hazards. These findings demonstrate NeRF’s effectiveness as an automated tool for identifying vertical displacement features critical to ADA compliance.

5. Comparative Analysis of Survey Methods

Table 7 presents a side-by-side comparison of manual, LiDAR-based, and NeRF-based sidewalk survey methods based on five key criteria: accuracy, cost, time efficiency, data volume and handling requirements, level of automation, and scalability. Each method demonstrates unique strengths and limitations that influence its suitability for ADA compliance monitoring at different scales.

Scalability is defined as the overall effort required to perform data collection, including equipment cost, data volume, and time efficiency. This consideration is particularly relevant for large-scale sidewalk assessments, where repeated data acquisition across multiple locations is necessary. Table 7 summarizes the average data acquisition time per site for each method, accounting for both setup and measurements. Site R required measurements at three locations, including two flares and one ramp; Site 1 contained four slabs; Sites 2 and 3 each contained two slabs; and Site 4 contained three slabs. Each slab measured 1.5 × 1.2 m. Manual surveying typically involves 1 min for setup and 1 min per slab for measurement, resulting in an average of 5.6 min per site. High-resolution LiDAR scanning, such as with the Livox HAP sensor, requires an average of 5 min for equipment setup and 20 s of scanning per slab. For longer sidewalks, multiple scans are necessary—as in Site 1, where four separate scans were conducted—resulting in an average of 14.9 min per site. In contrast, smartphone-based LiDAR systems require only 1 min per slab for measurement with no setup time, resulting in an average of 2.8 min per site. NeRF-based acquisition involves recording a continuous video, requiring approximately 0.5 min per slab without setup, with an average acquisition time of 1.4 min per site.

Regarding data volume and handling effort, manual surveying relies on handwritten or typed entry of measurements. DAS methods generate approximately 4 GB of data per slab, totaling 11.2 GB per site. The Livox HAP outputs approximately 452,000 points per second, producing around 5 GB of point cloud data per scan and typically requiring a synchronized camera for contextual visual information—about 14 GB per site. Smartphone-based methods display results directly via mobile applications, though some manual input may still be required. NeRF-based acquisition produces a 0.5 min video averaging 0.25 GB per slab, totaling 0.7 GB per site, which can be directly used for 3D reconstruction with minimal handling effort.

Manual techniques—such as digital levels and measuring tapes—remain highly accurate for localized assessments but are labor-intensive and time-consuming. Their dependence on human operation and point-by-point measurements significantly limits their scalability, making them ill-suited for large-area or repeat assessments. High-end terrestrial LiDAR offers exceptional geometric precision and has long served as the benchmark for 3D spatial data. However, it entails considerable equipment costs, setup times, and technical expertise. These constraints limit its practicality for widespread or frequent field deployment, particularly for smaller municipalities. In contrast, smartphone-integrated LiDAR provides a portable and low-cost alternative, but its limited spatial resolution and insensitivity to movement reduce its reliability—especially for detecting subtle vertical discontinuities or assessing slopes with ADA-required precision.

NeRF-based monocular vision methods strike a favorable balance across the evaluation dimensions. When paired with automated 3D reconstruction and metric extraction pipelines, a NeRF achieves competitive accuracy in both slope and elevation measurements, while requiring only low-cost video input. In addition to reducing equipment costs, a NeRF also significantly shortens the data collection time: entire sidewalk segments can be captured in minutes using a simple walk-through. Its high degree of automation, rapid capture process, and flexibility under varied lighting conditions position it as a promising solution for scalable, cost-effective sidewalk evaluations. These features make them well-suited for large or complex sidewalk networks. Overall, this comparative analysis highlights the trade-offs between traditional accuracy, operational burden, and technological scalability. NeRF-based methods, in particular, offer a compelling pathway toward automated and repeatable infrastructure assessments—especially in contexts where time, cost, data volume, and accessibility are critical considerations.

6. Conclusions and Discussion

This study demonstrates that NeRF, when trained on monocular video data, can accurately reconstruct 3D sidewalk geometry and extract ADA-relevant compliance metrics—including the running slope, the cross slope, and vertical displacements. The proposed framework bridges a key gap in pedestrian infrastructure monitoring by offering a low-cost, scalable alternative to both manual surveys and high-end LiDAR systems.

Quantitative comparisons revealed that a NeRF achieves precision and recall levels approaching those of LiDAR, with slope measurement deviations averaging less than 1° in most cases. Furthermore, the NeRF outperformed other low-cost methods (e.g., smartphone and DAS sensors) in identifying noncompliant features and detecting trip hazards, thanks to its dense, high-resolution reconstructions. Visual quality assessments confirmed that a NeRF’s 2D renderings are perceptually faithful, supporting dual use in visualization and measurements.

The NeRF pipeline proved particularly useful in situations where traditional sensors underperform—such as detecting localized cracks, surface heaving, and slab deformations. Its ability to work with consumer-grade cameras, along with the reduced need for field time, significantly lowers deployment barriers for resource-limited municipalities. These strengths make it well-suited for repeatable, city-scale ADA audits or integration into autonomous inspection systems.

However, several challenges and limitations persist. The reconstruction quality is sensitive to the image quality, lighting conditions, and environmental clutter. Computational demands for NeRF training remain high, posing obstacles for real-time use or deployment on embedded platforms. Additionally, generalizations across diverse urban scenes—especially those with occlusions or low-texture surfaces (e.g., tiles or asphalt)—require further validation. This study also lacks a quantitative comparison with other deep learning-based reconstruction methods. Larger-scale case studies and more automated pose estimations will be necessary to extend the method’s applicability to longer sidewalk corridors and more complex environments.

In summary, the proposed NeRF-based framework offers a promising path forward for cost-effective, automated sidewalk assessments in support of ADA compliance. By enabling detailed 3D reconstructions from a monocular video, it democratizes access to high-resolution infrastructure monitoring and could serve as a valuable tool for cities seeking to improve pedestrian accessibility. Future work will focus on enhancing automation, improving robustness to real-world variability, and integrating this method into mobile survey platforms to support continuous and scalable deployment.

Author Contributions

H.D.: Writing—review and editing, writing—original draft, visualization, validation, software, methodology, investigation, formal analysis, data curation, conceptualization. S.W.: writing—review and editing and investigation. L.Z.: writing—review and editing and investigation. M.A.-B.: writing—review and editing. Y.A.-G.: writing—review and editing and conceptualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data, models, and code that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

NeRF	Neural Radiance Field
DASs	Dual-axis tilt sensors
LiDAR	Light Detection and Ranging
MoDOT	Missouri Department of Transportation
ADA	Americans with Disabilities Act

Appendix A. NeRF Rendering Function and Training Procedure

Neural radiance field (NeRF) represent a 3D scene by learning a continuous function that maps each 3D position and viewing direction to an emitted color and volume density. This function is implemented as a fully connected neural network trained using volumetric rendering principles.

Appendix A.1. NeRF Scene Representation

Given a 3D position X = (x, y, z) and a 2D viewing direction d = (θ, φ), the NeRF model

F_{θ}

predicts the following:

F_{θ} (X, d) = (c, σ)

(A1)

where c = (r, g, b) is the RGB color emitted from position X in direction d, and σ is the volume density at X.

Appendix A.2. Volumentric Rendering

To render the color C(r) of a ray r(t) = o + t·d (where o is the ray origin, and t is a scalar distance along the ray from t_n to t_f), a NeRF integrates color and density along the ray using the following volume-rendering equation:

C_{r} = \int_{t_{n}}^{t_{f}} T (t) σ (r (t)) c (r (t), d) d t

(A2)

where T(t) is the accumulated transmittance, defined as follows:

T (t) = e x p (- \int_{t_{n}}^{t} σ (r (s)) d s)

(A3)

This process computes the probability that a photon travels along the ray and reaches point r(t) without being absorbed.

In practice, this integral is discretized by sampling N points along each ray and using alpha compositing to accumulate RGB values.

Appendix A.3. Training Procedure

The NeRF model is trained by minimizing the error between the rendered color and the ground-truth pixel color over all training images. This is achieved using stochastic gradient descent with batches of rays sampled from image pixels.

Appendix A.4. NeRF Training Pseudocode

Procedure TrainNeRFModel(dataset, total_epochs):

Initialize NeRF model parameters θ

For epoch in total_epochs:

For each batch in dataset:

Sample a set of rays from input images

For each ray:

Sample N points along the ray

Predict color and density at each point

Compute the rendered color using volume rendering

End

Compute loss between rendered and true pixel colors

Update model parameters θ using backpropagation

End

Return trained NeRF model

In this study, training was implemented using the nerfacto model within the nerfstudio framework. Once trained, the NeRF model generated dense point clouds that were exported in standard formats for sidewalk geometry extraction, as described in Section 3.4.4 and Section 3.4.5.

Appendix B. Evaluation Matrix

To evaluate the similarity between the ground truth (derived from LiDAR) and NeRF-reconstructed 3D point cloud, the following metrics were utilized.

Precision [42]: For a reconstructed point set

S

and a ground truth set

G

, the precision metric

P (d)

evaluates the proximity of points in

S

to those in

G

within a specified distance threshold

d

. It is mathematically expressed as follows:

P (d) = (\frac{100}{| S |}) \sum_{s \in S} I ({}_{g \in G}^{m i n}{| | s - g | | < d})

(A4)

where I (⋅) is an indicator function. Precision ranges from 0% to 100%, where higher scores reflect a closer match between the reconstructed points and the ground truth.

Recall [42]: Conversely, recall

S (d)

measures the extent to which the reconstruction

S

captures the points in the ground truth

G

within a specified distance threshold

d

. It is represented as follows:

S (d) = (\frac{100}{| G |}) \sum_{g \in G} I ({}_{s \in S}^{m i n}{| | g - s | | < d})

(A5)

Higher recall values indicate better completeness of the reconstruction.

F1 score [42]: The F1 score combined precision and recall, providing a balanced assessment:

F (d) = \frac{2 \times P (d) \times S (d)}{P (d) + S (d)}

(A6)

This harmonic mean penalizes large disparities between precision and recall. F1 scores range from 0% to 100%. Further discussion on the choice of d is provided in the analysis of the precision–recall curves.

To assess the quality of the 2D image rendered by the NeRF in comparison to the validation image, the following metrics were employed.

LPIPS (Learned Perceptual Image Patch Similarity) [43]: quantifies perceptual differences between two image patches,

x

and

x_{0}

, by extracting and normalizing feature activations from the

L

network layer. The perceptual distance was computed as follows:

d (x, x_{0}) = \sum_{l} \frac{1}{H W} \sum_{l} {‖w_{l} \cdot (ŷ_{x}^{l} - ŷ_{x_{0}}^{l})‖}_{2}^{2}

(A7)

where

w_{l}

is a channel-wise weighting vector, and the

ŷ^{l}

denotes normalized features from layer

l

. Lower LPIPS scores indicate higher perceptual similarity.

Peak signal-to-noise ratio (PSNR) [43]: PSNR evaluates the image quality by comparing a reconstructed image to a reference. It was defined as follows:

P S N R = 10 \times {l o g}_{10} (\frac{{M A X}_{I}^{2}}{M S E})

(A8)

where

M S E = \frac{1}{m n} \sum_{i = 1}^{m} \sum_{j = 1}^{n} {(I (i, j) - K (i, j))}^{2}

(A9)

where

I (i, j)

and

K (i, j)

are the reference and reconstructed images, respectively. Higher PSNR values denote better reconstruction accuracy.

Structural Similarity Index (SSIM) [43]:

SSIM evaluates the image quality based on luminance, contrast, and structure:

S S I M (x, y) = \frac{(2 µ_{x} σ_{y} + C_{1}) (2 σ_{x y} + C_{2})}{(µ_{x}^{2} + µ_{y}^{2} + C_{1}) (σ_{x}^{2} + σ_{y}^{2} + C_{2})}

(A10)

where

µ_{x}

and

µ_{y}

represent the mean values of images x and y, while

σ_{x}^{2}

and

σ_{y}^{2}

denote their variances. The term

σ_{x y}

captures the covariance between the two images, reflecting structural similarity. The constants

C_{1}

and

C_{2}

are included to prevent instability when the denominator is small. SSIM values range from 0 to 1, with higher values indicating greater perceptual similarity.

By systematically comparing these approaches, this study aimed to identify their respective strengths, limitations, and suitability for scalable and accurate sidewalk condition evaluations across diverse urban environments.

Appendix C. Performance Comparison of Illuminations

Figure A1. NeRF point cloud of curb ramp under different shadow conditions: (a) input scenery; (b) point cloud reconstructed by NeRF.

Table A1. Slope estimation under varying shadow conditions: NeRF reconstruction results and absolute differences compared to LiDAR.

			Slope (°)		Absolute Difference (°)
Curb Ramp			Cross	Running	Cross	Running
	Sunshine	Left	5.76 *	6.89 *	0.8	0.89
		Ramp	0.38	5.37 *	0.06	0.09
		Right	4.86	5.69	0.34	0.31
	Shadow	Left	6.17 *	5.96 *	0.39	0.04
		Ramp	0.27	5.09 *	0.05	0.37
		Right	4.8	5.26	0.4	0.12

* Asterisk denotes instances of noncompliance with ADA.

Appendix D. Data Validation

Indoor Environment: This evaluation was conducted in a controlled setting with a consistent background and lighting to assess NeRF’s accuracy in reconstructing individual structures. Object sizes ranged from 17.85 cm to 166.37 cm (Figure 1). LiDAR scans were performed from a fixed height at a 1 m distance. As a result, RMSE values remained below 0.02 m. Notably, the highest tolerance was observed for the object iPad, as LiDAR, limited in capturing smaller objects, generated fewer point clouds compared to NeRF, which excelled in capturing detailed structures while remaining within acceptable tolerance levels.

Outdoor Urban Area: This real-world setting introduced challenges such as variable lighting, wind, and pedestrian movement to assess NeRF’s robustness. Object sizes ranged from 91.44 cm to 822.96 cm, with LiDAR scans taken from a fixed height at distances of 3–5 m. Notably, the highest tolerance was observed on the object column. Because LiDAR was too far from the smaller objects, the result was a sparse point cloud with limited data.

Figure A2. Evaluation of registration accuracy between NeRF and LiDAR point clouds in indoor and outdoor environments. “24.76 × 17.85” means “24.76 by 17.85”, usually referring to the dimensions of something—such as width by height. The rest remains the same.

Appendix E. Idealized Model

To validate detected vertical trip hazards, an idealized model was constructed using a 12 × 12 × 1/8-inch board simulating ADA-relevant heights: ½ inch (1.27 cm), 1 inch (2.54 cm), and 1½ inches (3.81 cm). Figure 3 visualizes these heights with distinct colors. By comparing these model colors with those assigned to detect trip hazards, each hazard’s height was accurately validated.

Figure A3. Color-coded validation of trip hazard heights using an idealized model: (a) three 12 × 12 × 1/8-inch boards simulating ADA-relevant elevations (½, 1, and 1½ inches); (b) color comparison for validating detected hazard heights.

References

City of Columbia. Americans with Disabilities Act (ADA) Sidewalk Transition Plan. Available online: https://www.como.gov/wp-content/uploads/2020/10/ADA-Sidewalk-Transition-Plan.REV_.2022.pdf (accessed on 25 May 2025).
United States Access Board. Accessible Guidelines and Standards for Public Rights-of-Way. Available online: https://www.access-board.gov/prowag/proposed/chapter-r3-technical-requirements/ (accessed on 5 July 2025).
United States Department of Justice. 2010 ADA Standards for Accessible Design. Available online: https://www.ada.gov/law-and-regs/design-standards/2010-stds/ (accessed on 5 July 2025).
U.S. Department of Transportation Federal Highway Administration. A Guide for Maintaining Pedestrian Facilities for Enhanced Safety. Available online: https://safety.fhwa.dot.gov/ped_bike/tools_solve/fhwasa13037/chap3.cfm (accessed on 5 July 2025).
Guensler, R.; Grossman, A.; Frackelton, A.; Elango, V.; Xu, Y.; Toth, C.; Akanser, A.; Castrillon, F.; Palinginis, E.; Sadana, R. Automated Sidewalk Quality and Safety Assessment System; FHWA-GA-15-1216; Georgia Department of Transportation: Atlanta, GA, USA, 2015. [Google Scholar]
Zhou, J.; Wu, X.; Zheng, J.; Zhou, S.; Yang, X. Research on high-precision calibration device for dual-axis tilt sensor. SPIE Future Sens. Technol. 2024, 13083, 182–186. [Google Scholar]
Che, E.; Olsen, M.J.; Trejo, D. Evaluation of Curb Ramp Compliance: Review of Tools, Methods, and Time to Develop Error Tolerances. 2024. Available online: https://rosap.ntl.bts.gov/view/dot/73575/dot_73575_DS1.pdf (accessed on 5 July 2025).
Missouri Department of Transportation (MoDOT). Sidewalk Mapper. Available online: https://www.modot.org/sites/default/files/documents/SidewalkMapper.pdf (accessed on 25 May 2025).
Frackelton, A.; Grossman, A.; Palinginis, E.; Castrillon, F.; Elango, V.; Guensler, R. Measuring walkability: Development of an automated sidewalk quality assessment tool. Suburb. Sustain. 2013, 1, 4. [Google Scholar] [CrossRef]
City of Novi. Approval to Award a Contract to ICC-IMS for a Sidewalk Condition Survey through the TXShare Cooperative Purchasing Program. Available online: https://cityofnovi.org/media/gtdcdceq/250224consentf.pdf (accessed on 25 May 2025).
Faisal, A.; Gargoum, S. Cost-effective LiDAR for pothole detection and quantification using a low-point-density approach. Autom. Constr. 2025, 172, 106006. [Google Scholar] [CrossRef]
Seștras, P.; Badea, G.; Badea, A.C.; Sălăgean, T.; Roșca, S.; Kader, S.; Remondino, F. Land surveying with UAV photogrammetry and LiDAR for optimal building planning. Autom. Constr. 2025, 173, 106092. [Google Scholar] [CrossRef]
Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. NeRF: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 2021, 65, 99–106. [Google Scholar] [CrossRef]
Remondino, F.; Karami, A.; Yan, Z.; Mazzacca, G.; Rigon, S.; Qin, R. A critical analysis of NeRF-based 3D reconstruction. Remote Sens. 2023, 15, 3585. [Google Scholar] [CrossRef]
Fares, A.; Zayed, T. Industry- and academic-based trends in pavement roughness inspection technologies over the past five decades: A critical review. Remote Sens. 2023, 15, 2941. [Google Scholar] [CrossRef]
Infrastructure Management Services. Sidewalk Survey Summary Report—City of Miami Beach. 2017. Available online: https://www.miamibeachfl.gov/wp-content/uploads/2018/12/MiamiBeachSidewalkReport_.pdf (accessed on 6 July 2025).
Qiu, Q.; Lau, D. Real-time detection of cracks in tiled sidewalks using YOLO-based method applied to unmanned aerial vehicle (UAV) images. Autom. Constr. 2023, 147, 104745. [Google Scholar] [CrossRef]
Yussif, A.-M.; Zayed, T.; Taiwo, R.; Fares, A. Promoting sustainable urban mobility via automated sidewalk defect detection. Sustain. Dev. 2024, 32, 5861–5881. [Google Scholar] [CrossRef]
Kim, J.W.; Jung, J.; Kim, T. Automated region extraction and displacement detection for paving blocks adjacent to deep excavation using photogrammetry. Autom. Constr. 2025, 174, 106126. [Google Scholar] [CrossRef]
Federal Highway Administration. Determination of Improved Pavement Smoothness When Using 3D Modeling and Automatic Machine Guidance; FHWA-HIF-21-048; U.S. Department of Transportation: Washington, DC, USA, 2021. Available online: https://www.fhwa.dot.gov/publications/research/infrastructure/pavements/21059/21059.pdf (accessed on 6 July 2025).
Zhou, Y.; Che, E.; Turkan, Y.; Olsen, M.J. Virtual ADA compliance assessment: Mimicking digital inclinometers to measure slopes within point clouds. J. Surv. Eng. 2024, 150, 04024009. [Google Scholar] [CrossRef]
Zakaria, M.H.; Fawzy, H.; El-Beshbeshy, M.; Farhan, M. A comparative study of terrestrial laser scanning and photogrammetry: Accuracy and applications. Civ. Eng. J. 2025, 11, 1196–1216. [Google Scholar] [CrossRef]
Aryan, A.; Bosché, F.; Tang, P. Planning for terrestrial laser scanning in construction: A review. Autom. Constr. 2021, 125, 103551. [Google Scholar] [CrossRef]
Wang, Y.; Chen, Q.; Zhu, Q.; Liu, L.; Li, C.; Zheng, D. A survey of mobile laser scanning applications and key techniques over urban areas. Remote Sens. 2019, 11, 1540. [Google Scholar] [CrossRef]
Famili, A.; Sarasua, W.A.; Shams, A.; Davis, W.J.; Ogle, J.H. Application of mobile terrestrial LiDAR scanning systems for identification of potential pavement rutting locations. Transp. Res. Rec. 2021, 2675, 1063–1075. [Google Scholar] [CrossRef]
Shatnawi, N.; Obaidat, M.T.; Al-Mistarehi, B. Road pavement rut detection using mobile and static terrestrial laser scanning. Appl. Geomat. 2021, 13, 901–911. [Google Scholar] [CrossRef]
Liu, Z.; van Oosterom, P.; Balado, J.; Swart, A.; Beers, B. Detection and reconstruction of static vehicle-related ground occlusions in point clouds from mobile laser scanning. Autom. Constr. 2022, 141, 104461. [Google Scholar] [CrossRef]
Huai, J.; Shao, Y.; Zhang, Y.; Yilmaz, A. A low-cost portable LiDAR-based mobile mapping system on an Android smartphone. arXiv 2025, arXiv:2506.15983. [Google Scholar] [CrossRef]
Queruel, M.; Bornhofen, S.; Histace, A.; Ducoulombier, L. Scan-to-BIM: Unlocking current limitations through artificial intelligence. In Proceedings of the ISARC—International Symposium on Automation and Robotics in Construction, Lille, France, 3–5 June 2024; IAARC Publications: Lyon, France; Volume 41, pp. 1040–1047. [Google Scholar]
Arshad, M.A.; Jubery, T.; Afful, J.; Jignasu, A.; Balu, A.; Ganapathysubramanian, B. Evaluating neural radiance fields for 3D plant geometry reconstruction in field conditions. Plant Phenomics 2024, 6, 0235. [Google Scholar] [CrossRef] [PubMed]
Kim, G.; Cha, Y. 3D pixelwise damage mapping using a deep attention based modified Nerfacto. Autom. Constr. 2024, 168, 105878. [Google Scholar] [CrossRef]
Cui, D.; Wang, W.; Hu, W.; Peng, J.; Zhao, Y.; Zhang, Y.; Wang, J. 3D reconstruction of building structures incorporating neural radiation fields and geometric constraints. Autom. Constr. 2024, 165, 105517. [Google Scholar] [CrossRef]
Fan, W.; Liu, X.; Zhang, Y.; Wei, D.; Guo, H.; Yue, D. 3D wireframe model reconstruction of buildings from multi-view images using neural implicit fields. Autom. Constr. 2025, 174, 106145. [Google Scholar] [CrossRef]
Qin, T.; Li, C.; Ye, H.; Wan, S.; Li, M.; Liu, H. Crowd-sourced NeRF: Collecting data from production vehicles for 3D street view reconstruction. IEEE Trans. Intell. Transp. Syst. 2024, 25, 16145–16156. [Google Scholar] [CrossRef]
Shams, A.; Sarasua, W.A.; Russell, B.T.; Davis, W.J.; Post, C.; Rastiveis, H. Extracting highway cross slopes from airborne and mobile LiDAR point clouds. Transp. Res. Rec. 2023, 2677, 372–384. [Google Scholar] [CrossRef]
Tsai, Y.; Ai, C.; Wang, Z.; Pitts, E. Mobile cross-slope measurement method using LiDAR technology. Transp. Res. Rec. 2013, 2367, 53–59. [Google Scholar] [CrossRef]
Wang, Y.; Liu, Y.; Li, Z.; Gu, T.; Pauwels, P.; Yu, B. Automatic cross section extraction and cross slope measurement for curved ramps using light detection and ranging point clouds. Measurement 2024, 228, 114369. [Google Scholar] [CrossRef]
Ning, H.; Ye, X.; Chen, Z.; Liu, T.; Cao, T. Sidewalk extraction using aerial and street view images. Environ. Plan. B Urban Anal. City Sci. 2022, 49, 7–22. [Google Scholar] [CrossRef]
Jiang, Y.; Han, S.; Li, D.; Bai, Y.; Wang, M. Automatic concrete sidewalk deficiency detection and mapping with deep learning. Expert Syst. Appl. 2022, 207, 117980. [Google Scholar] [CrossRef]
Yu, Y.; Li, J.; Guan, H.; Wang, C. 3D crack skeleton extraction from mobile LiDAR point clouds. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Quebec City, QC, Canada, 13–18 July 2014. [Google Scholar] [CrossRef]
Nerfstudio. Nerfstudio Documentation. Available online: https://docs.nerf.studio/extensions/sdfstudio.html (accessed on 25 May 2025).
Li, D.; Shi, G.; Li, J.; Chen, Y.; Zhang, S.; Xiang, S. PlantNet: A dual-function point cloud segmentation network for multiple plant species. ISPRS J. Photogramm. Remote Sens. 2022, 184, 243–263. [Google Scholar] [CrossRef]
Park, S.H.; Moon, Y.S.; Cho, N.I. Perception-oriented single image super-resolution using optimal objective estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–23 June 2023. [Google Scholar]

Figure 1. Workflow of this study.

Figure 2. Study sites on the University of Missouri campus for evaluating NeRF, LiDAR, and traditional sidewalk survey methods. The red tapes were placed on the sidewalk and ramps to enable georeferencing of datasets.

Figure 3. Overview of data collection methods: traditional surveying, LiDAR scanning, and monocular vision-based NeRF for extracting ADA-relevant sidewalk metrics. The red box in the second column highlights the LiDAR scanner, while the red tapes in the third and fourth columns are used to ensure accurate spatial referencing during data collection.

Figure 4. High-end LiDAR data acquisition using Livox HAP TX: (a) real image of curb ramp with slope measurement direction and (b) ROS-integrated setup and merged point cloud with high spatial resolution.

Figure 5. Data collection setup for NeRF-based 3D reconstruction. A GoPro camera mounted on a bicycle was used to capture a monocular video by circling each sidewalk segment. This footage served as input for the NeRF training pipeline described in Section 2.4. This approach enabled efficient field data collection without the use of specialized 3D sensors.

Figure 6. NeRF training and rendering pipeline implemented using nerfstudio [41]. The pipeline includes ray sampling, pose refinement using SfM-generated camera trajectories, volumetric field optimization, and final rendering. The trained NeRF model outputs high-fidelity 3D point clouds and 2D image renderings, which were used to extract ADA compliance metrics.

Figure 7. Example of slope extraction on a flare surface using the enhanced E2E method. Blue points (324) represent the full point cloud, while red points indicate the first selected transect line divided into 325 six bins (black boxes). Slope vectors (arrows) were computed between adjacent bins.

Figure 8. Visualization of image and point cloud data for slope analysis: (a) real images of the curb ramp, (b) LiDAR-based elevation profile, (c) NeRF-based elevation profile, (d) real images of Sidewalk 4 with four concrete slabs, (e) NeRF point cloud with top-down segmentation, (f) cross slope visualization using a color gradient, and (g) combined cross and running slope visualization.

Figure 9. NeRF–LiDAR alignment: LiDAR point cloud, NeRF reconstruction, and the overlaid version. The red tapes are reflective tapes used to facilitate accurate alignment between the two point clouds.

Figure 10. Precision, recall, and F1 scores for the curb ramp and Sidewalks 1–4 at different thresholds.

Figure 11. Vertical displacement detection for Sidewalk 2–4: (a) histogram of vertical changes, (b) color-coded trip hazards by height, (c) trip hazards shown as blue points overlaid on original point cloud, the blue points represent the point cloud of vertical displacements extracted using the Sobel filter, and (d) top-down view.

Table 1. Comparison of vertical distance measurements from LiDAR and NeRF against manual measurement.

Site	Manual Measurement (cm)	LiDAR (cm)	LiDAR Absolute Difference from Manual Measurement (cm)	NeRF (cm)	NeRF Absolute Difference from Manual Measurement (cm)
Sidewalk 2	9.80	9.48	0.32	9.95	0.15
Sidewalk 3	10.40	9.06	1.34	10.77	0.37

Table 2. ADA sidewalk compliance standards and repair recommendations [8].

Types		Standards		Range	Compliance Level
Curb ramp	Ramp	Running slope	2.86° ≤ slope ≤ 4.76°	1.15–2.29°	Substantially compliant
		Cross slope	≤1.15°	2.35–5.71°	Correction recommended
	Flare	Running slope	≤5.71°	≥5.71°	Correction recommended
		Cross slope	≤5.71°	≥5.71°	Correction recommended
Sidewalk		Running slope	≤2.86°	2.86–4.52°	Substantially compliant
				4.57–5.71°	Correction recommended
				≥5.71°	Correction recommended
		Cross slope	≤1.15°	1.15–2.29°	Substantially compliant
				2.35–5.71°	Correction recommended
				≥5.71°	Correction recommended
Gap				0.5–1 inch	Substantially compliant
				1.1–2 inch	Correction recommended
				≥2 inches	Correction recommended
Crack				0.25–0.5 inch	Substantially compliant
				0.6–1 inch	Substantially compliant
				≥1 inch	Correction recommended
Vertical displacement	<0.25 inch

Table 3. Average precision, recall, and F1 scores for the curb ramp and Sidewalks 1–4.

Threshold (m)	Precision (%)	Recall (%)	F1 Score (%)
0.05	96.74	96.34	96.52
0.01	60.52	68.29	63.16
0.005	19.26	43.45	25.46

Table 4. Performance metrics used to evaluate the quality of 2D images rendered by NeRF.

Sidewalk	PSNR	SSIM	LPIPS
1	21.2156	0.5564	0.2668
2	19.0188	0.4369	0.2453
3	21.36871	0.5712	0.3337
4	18.0586	0.2123	0.5192

Table 5. Slope estimation for the curb ramp using NeRF, LiDAR, a digital level, DAS, and a smartphone.

Part	NeRF (°)		LiDAR (°)		Digital Level (°)		DAS (°)		iPhone 15 (°)
	Cross	Running	Cross	Running	Cross	Running	Cross	Running	Cross	Running
Flare L	5.76	6.89	6.56	6.00	5.8	7.3	4.56 *	7.33	4.50 *	5.70 *
Ramp	0.38 *	5.37	0.32 *	5.46	0.9 *	4.8	2.04	4.98	1.52	2.93 *
Flare R	4.86 *	5.69 *	5.2 *	5.38 *	4.3 *	2.9 *	5.93	2.36 *	5.33	0.83 *

* Asterisk denotes instances of compliance with ADA.

Table 6. Slope estimation for Sidewalk 1 using NeRF, LiDAR, a digital level, DAS, and a smartphone.

Part	NeRF (°)		LiDAR (°)		Digital Level (°)		DAS (°)		iPhone 15 (°)
	Cross	Running	Cross	Running	Cross	Running	Cross	Running	Cross	Running
Part 1	5.43	2.73 *	5.63	1.67 *	3.35	2.95	4.12	3.79	4.7	2.4 *
Part 2	4.57	2.92	4.77	3.36	2.1	2.9	1.31	2.46 *	1.15	2.1 *
Part 3	5.43	4.14	4.51	2.88	3.05	3.1	2.38	2.51 *	2.7	2.75 *
Part 4	4.00	3.90	4.65	2.86	3.25	1.45 *	2.45	1.42 *	2.55	1.05 *

* Asterisk denotes instances of compliance with ADA.

Table 7. Comparative evaluation of sidewalk surveying methods.

	Manual			LiDAR		NeRF
	Digital Level	DAS	Tape	Livox	iPhone	GoPro
Accuracy	High (Digital level) Moderate (DAS and tape)			Very high	Moderate to low	High to very high
Cost	Moderate (equipment + labor)			Very high	Low	Low to moderate
Time (minute)	5.6			14.9	2.8	1.4
Dataset	Manual document (DAS is 11.2 GB of data)			14 GB	Manual document	0.7 GB
Automation	None			Partial (needs operator)	Partial (sensor + app)	High (automated extraction)
Slope	Yes			Yes	Yes	Yes
Vertical displacement	Yes			Yes	None	Yes
Best scenario to use	Small scale			Medium to large scale	Small scale	Large or complex

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Du, H.; Wang, S.; Zhang, L.; Amo-Boateng, M.; Adu-Gyamfi, Y. Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods. Infrastructures 2025, 10, 191. https://doi.org/10.3390/infrastructures10080191

AMA Style

Du H, Wang S, Zhang L, Amo-Boateng M, Adu-Gyamfi Y. Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods. Infrastructures. 2025; 10(8):191. https://doi.org/10.3390/infrastructures10080191

Chicago/Turabian Style

Du, Hang, Shuaizhou Wang, Linlin Zhang, Mark Amo-Boateng, and Yaw Adu-Gyamfi. 2025. "Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods" Infrastructures 10, no. 8: 191. https://doi.org/10.3390/infrastructures10080191

APA Style

Du, H., Wang, S., Zhang, L., Amo-Boateng, M., & Adu-Gyamfi, Y. (2025). Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods. Infrastructures, 10(8), 191. https://doi.org/10.3390/infrastructures10080191

Article Menu

Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods

Abstract

1. Introduction

2. Review of Sidewalk Assessment Techniques

2.1. Sensor- and Vision-Based Methods

2.2. Three-Dimensional Reconstruction: Photogrammetry and LiDAR

2.3. Neural Radiance Field (NeRF) for 3D Sidewalk Modeling

2.4. Slope and Elevation Estimation from 3D Data

3. Methodology

3.1. Study Area and Data Collection

3.2. Manual Survey Procedures

3.3. LiDAR Survey Procedures

3.4. Monocular Vision-Based Approach Using NeRF

3.4.1. Preprocessing and Camera Pose Estimation

3.4.2. NeRF Model Training and 3D Reconstruction

3.4.3. Metric Extraction and Evaluation

3.4.4. Slope Detection from NeRF Point Clouds

3.4.5. Vertical Displacements and Trip Hazard Detection

4. Evaluation

4.1. Visualization of Curb Ramp and Sidewalk

4.2. NeRF Performance Metrics

4.2.1. Three-Dimensional Reconstruction Accuracy

4.2.2. Visual Rendering Accuracy

4.3. ADA Compliance Benchmarking

4.4. Results

4.4.1. NeRF Performance

4.4.2. Slope and Vertical Displacement Measurement Results

5. Comparative Analysis of Survey Methods

6. Conclusions and Discussion

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Abbreviations

Appendix A. NeRF Rendering Function and Training Procedure

Appendix A.1. NeRF Scene Representation

Appendix A.2. Volumentric Rendering

Appendix A.3. Training Procedure

Appendix A.4. NeRF Training Pseudocode

Appendix B. Evaluation Matrix

Appendix C. Performance Comparison of Illuminations

Appendix D. Data Validation

Appendix E. Idealized Model

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI