Quantitative Assessment of Concrete Pavement Subsurface Quality Using Ultrasonic Tomography: Development and Initial Validation of a Multi-Metric Scoring System

Olavarría, Jorge E.; Darnell, Megan M.; Smetana, Mason; Vandenbossche, Julie M.; Khazanovich, Lev

doi:10.3390/app16052233


by Jorge E. Olavarría, Megan M. Darnell, Mason Smetana, Julie M. Vandenbossche and Lev Khazanovich *

Department of Civil and Environmental Engineering, University of Pittsburgh, Pittsburgh, PA 15260, USA

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(5), 2233; https://doi.org/10.3390/app16052233

Submission received: 2 February 2026 / Revised: 16 February 2026 / Accepted: 20 February 2026 / Published: 26 February 2026

(This article belongs to the Special Issue Application of Ultrasonic Non-Destructive Testing—Second Edition)


Abstract

Linear array ultrasonic devices such as the MIRA A1040 are highly effective at detecting subsurface defects in concrete; however, interpretation of their data is time-consuming, subjective, and requires specialized expertise. This paper proposes a quantitative signal-processing framework that computes objective subsurface-quality Multi-Metric Scores derived from ultrasonic tomography B-scans. The framework integrates the Signal-to-Background Ratio, Energy Concentration Ratio, and Spatial Dispersion into a composite 0–100 scale. Laboratory testing demonstrated clear discrimination between control samples (scores 79–100) and specimens with intentionally placed voids (8–38) or honeycombing defects (6–35). Field validation confirmed similar separation using an acceptance threshold of 70. The proposed scoring methodology offers a practical, automated approach for real-time quality assessment of concrete pavements under realistic field construction conditions.

Keywords: ultrasonic tomography; concrete pavement quality assessment; nondestructive testing

1. Introduction

Subsurface defects in concrete pavements due to poor construction practices escape detection until they manifest as surface distress or functional failure years after placement [1,2]. While existing research demonstrates ultrasonic detection of specific defects [3,4,5,6,7,8], no prior work has developed an automated, multi-metric scoring system capable of continuous quality assessment suitable for construction acceptance decisions.

Current construction quality assurance relies on visual inspection and core sampling, neither of which adequately addresses subsurface conditions. Visual methods detect only surface defects within 1–3 cm of the pavement surface [9], and sounding techniques such as chain dragging have difficulty identifying flaws at greater depths [10]. Core sampling provides direct material characterization, but with a coverage too sparse to capture localized defects [11]. The destructive and time-consuming nature of coring further limits its practical utility during active construction [12].

Non-destructive methods such as ultrasonic, electrical, magnetic and thermographic techniques provide a faster and more comprehensive way to evaluate concrete pavements after placement [13,14,15,16,17,18,19].

Ultrasonic tomography has been shown to identify hidden conditions nondestructively, yet the technology remains largely absent from routine pavement quality control despite two decades of demonstrated capability [3,4,20,21,22]. The MIRA A1040 employs dry-point-contact shear wave transducers with Synthetic Aperture Focusing Technique (SAFT) reconstruction to image internal conditions at depths up to 800 mm [23,24]. The device is easy to operate, and the “touch-and-go” process makes measurements highly efficient and productive.

Hoegh et al. demonstrated that ultrasonic tomography was the only method in blind testing to accurately determine the horizontal extent of subsurface joint deterioration, outperforming chain dragging, rod sounding, and ground-penetrating radar [10]. Dinh et al. confirmed capabilities for detecting delaminations, voids, and reinforcement location, with concrete cover measurement accuracy significantly exceeding magnetic-based methods [23]. However, while these studies demonstrate the technology’s diagnostic power, construction quality assurance presents a different challenge: it demands rapid, objective assessments that can be applied consistently across production volumes.

Recent advances in deep learning have shown promise for automating ultrasonic image interpretation. Convolutional neural networks have successfully detected and segmented internal defects in reinforced concrete using ultrasonic B-scans [25], while similar approaches have been applied to classify flaws in plain concrete specimens [26].

The barrier is not the sensing hardware but the interpretation: B-scan images require specialized expertise that most pavement practitioners lack, and manual analysis is too slow and subjective for production-scale assessment [25,27,28]. This study aims to: (1) develop an automated Multi-Metric Score combining three complementary ultrasonic signal characteristics, (2) validate the system on controlled laboratory samples, and (3) demonstrate field applicability on a full-scale pavement test section.

2. Materials and Methods

2.1. Experimental Overview

The experimental procedure consisted of two phases: laboratory testing using samples with controlled defects and field verification on a full-scale pavement test section. The laboratory phase established reproducible techniques for fabricating two categories of subsurface defects: internal voids and honeycombing. These samples were used to test the capability of ultrasonic tomography to distinguish each condition from control samples, representing sound concrete.

2.2. Ultrasonic Testing Equipment

2.2.1. MIRA A1040

Ultrasonic measurements were performed using the MIRA A1040 Classic (ACS-Solutions GmbH, Saarbrücken, Germany) (Figure 1a). The device operates on the principle of ultrasonic shear wave propagation, employing an array of 48 dry-point contact (DPC) transducers arranged in 12 blocks containing 4 elements each. Each block sequentially transmits shear wave pulses while the remaining channels receive reflections from internal interfaces, producing 66 independent transmitting-receiving pair measurements per scan. The transducers are independently spring-loaded to accommodate rough concrete surfaces without requiring coupling agents or surface preparation.

The device operates at a nominal frequency of 50 kHz. The MIRA A1040 generates B-Scan images through Synthetic Aperture Focusing Technique (SAFT) reconstruction, displaying two-dimensional cross-sectional representations of subsurface conditions to a user-specified depth. For this study, B-scans were generated to a depth of 350 mm, exceeding the 300 mm specimen thickness to ensure complete depth coverage.

2.2.2. Data Acquisition Protocol

Measurements were taken at 24 and 48 h after concrete placement for both laboratory samples and field areas by placing the MIRA device on the finished top surface (Figure 1). For lab samples, four scans were collected on each sample at the middle region of the top surface. For field areas, measurements were taken at predefined grid locations within each defect area and control section. The on-screen velocity displayed by the MIRA A1040 device was recorded for each measurement and subsequently verified by our own independent velocity calculations from the raw waveform data.

2.3. Concrete Materials

A concrete mixture was designed to meet Federal Aviation Administration (FAA) specifications for airfield pavement construction. Two variants were prepared: a low-slump mixture for simulated slipform paving (target slump 38 ± 13 mm) and a higher-slump mixture for fixed-form construction (target slump 140 ± 13 mm). Both mixtures maintained a water-cement ratio of 0.43 as required by FAA specifications. The higher workability for fixed-form construction was achieved through the use of additional superplasticizer rather than increased water content.

The test specimens and field section represent jointed plain concrete pavement (JPCP) construction, the most common concrete pavement type for highways and airfields. JPCP contains no distributed reinforcement in the pavement body, with load transfer provided by dowel bars at transverse joints. This distinguishes it from continuously reinforced concrete pavement (CRCP), which incorporates longitudinal steel throughout the pavement length [29].

2.4. Defect Construction Techniques

2.4.1. Internal Voids

Preliminary testing evaluated four methods for creating internal voids: wax beads, foam balls, chopped foam pool noodle material, and dry ice. Samples containing each material were cast in small beams, consolidated using a shaft, and sectioned after hardening to examine the internal distribution. Dry ice was selected for the primary study because it sublimates completely, leaving true voids without residual material at the interface. Cores extracted from laboratory samples confirmed the formation of clean air-filled voids as shown in Figure 2a. For laboratory samples, dry ice chunks were placed after an initial lift of concrete was placed. For field areas, dry ice chunks approximately 40 × 75 mm were placed in two layers on a 65 mm concrete bed approximately 75 mm apart from each other, as seen in Figure 3b.

The defects produced by dry ice sublimation are representative of those generated during construction through poor consolidation, inadequate vibration, air entrapment, and honeycombing. These construction defects all produce air–concrete interfaces, which generate strong ultrasonic reflections, enabling detection. Although field defects may exhibit irregular geometries, the ultrasonic response depends on impedance discontinuities at interfaces rather than specific void morphology.

2.4.2. Honeycombing

Honeycomb defects were created using two fabrication methods. In the first method (samples C2S2 and C2S3), unmixed No. 4 quartz aggregate was placed in the center of the form and vibrated with the surrounding concrete. In the second method (samples C3S1 and C3S2), No. 4 aggregate was mixed with a small batch of concrete to produce a dry, rocky consistency with insufficient paste to fill aggregate interstices, which was then placed in the center of the form without further consolidation. For the laboratory samples, the honeycomb material was placed as a 75 to 100 mm layer between layers of standard concrete, resulting in the sample shown in Figure 2b. The field honeycomb section followed a procedure similar to the second method described, with a batch ratio of 3 shovels of concrete, 7 scoops of coarse aggregate (AASHTO #4 gradation), and 2 scoops of fine aggregate.

2.5. Field Test Section

Figure 3a shows the full-scale pavement test section that was constructed at the U.S. Army Engineer Research and Development Center (ERDC) in Vicksburg, Mississippi. The test section consisted of a 200-mm thick concrete pavement cast on a prepared clay base, using a concrete designed to achieve a 28-day flexural strength of 2.4 MPa. The test section included four areas: Control North, Internal voids, Honeycombing, and Control South.

2.6. Signal Processing and Quality Score Development

2.6.1. Raw Signal Acquisition and Processing

The MIRA A1040 stores raw ultrasonic waveform data in binary files with the .lbv extension. Each measurement produces 66 independent time-domain signals corresponding to the transmitter-receiver pair combinations from the 12-channel transducer array.
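The figure of 66 signals follows from pairing each of the 12 channels once with every other channel, that is 12 × 11 / 2. The short sketch below enumerates those pairs. The uniform 30 mm channel pitch is an assumption inferred from the 30 mm to 330 mm spacing range given in this section; it is not a device specification quoted here, and the variable names are illustrative.

```python
from itertools import combinations

N_CHANNELS = 12   # transducer blocks acting as channels (from the text)
PITCH_MM = 30.0   # ASSUMPTION: uniform spacing between adjacent channels

# Each unordered (transmitter, receiver) pair is measured once
pairs = list(combinations(range(N_CHANNELS), 2))
spacings_mm = [(rx - tx) * PITCH_MM for tx, rx in pairs]

print(len(pairs))                           # waveforms per scan
print(min(spacings_mm), max(spacings_mm))   # shortest and longest spacing
```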

The raw signals contain both direct wave arrivals (traveling along the surface between transducers) and reflected waves from internal interfaces or the bottom surface. Direct wave arrival times are proportional to sensor spacing, as illustrated by the progressive delay from 30 mm to 150 mm in Figure 4. Reflected waves, in contrast, arrive at times determined by reflector depth and two-way travel path rather than transducer separation.

Since the inter-sensor distances are known from the array geometry (ranging from 30 mm to 330 mm), the direct-wave arrival times are plotted against their corresponding travel distances. A linear regression of this distance–time relationship gives the velocity as the slope of distance against time. The calculated velocity is then compared to the MIRA on-screen display value for verification.

2.6.2. SAFT Reconstruction

The Synthetic Aperture Focusing Technique (SAFT) transforms the raw time-domain signals into a two-dimensional B-scan image that represents the medium below the measurement based on the body shear wave reflections that arrive at the surface, with high intensity areas indicating strong reflections due to changes in acoustic impedance [30,31].

Using the independently calculated velocity, a Python-based algorithm reconstructs the B-scans shown in Figure 5. For sound concrete, the B-scan exhibits a single dominant reflection at a depth corresponding to the specimen thickness, where the interface between the concrete and the underlying base material generates a strong reflection that propagates back to the surface. In specimens containing defects, multiple reflections are observed, indicating local variations in acoustic impedance. These discontinuities manifest as distinct colored regions in the B-scan.

The complete signal processing pipeline, including filtering, windowing, and SAFT implementation details, follows the procedures described in Hoegh et al. [4,32].
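To make the reconstruction idea concrete, the following is a minimal delay-and-sum sketch of SAFT for a linear array, not the implementation used in this study. For each image pixel it adds the sample of every transmitter-receiver waveform at the two-way travel time to that pixel. It omits the filtering and windowing steps referenced above. The 30 mm pitch, the velocity, the sampling interval, the Gaussian test pulse, and all names are assumptions chosen for the synthetic example.

```python
import math
from itertools import combinations

V = 2.4            # shear wave velocity in mm/us (ASSUMED)
DT = 1.0           # sampling interval in us (ASSUMED)
N_SAMPLES = 400
SENSORS_X = [30.0 * i for i in range(12)]   # ASSUMED 30 mm channel pitch
PAIRS = list(combinations(range(12), 2))    # 66 transmitter-receiver pairs

def travel_time(x_tx, x_rx, x, z):
    """Two-way time: transmitter -> point (x, z) -> receiver."""
    return (math.hypot(x - x_tx, z) + math.hypot(x - x_rx, z)) / V

def synth_signals(x_ref, z_ref):
    """Synthetic waveforms for one point reflector (one Gaussian pulse per pair)."""
    signals = []
    for tx, rx in PAIRS:
        t0 = travel_time(SENSORS_X[tx], SENSORS_X[rx], x_ref, z_ref)
        signals.append([math.exp(-((n * DT - t0) / 2.0) ** 2)
                        for n in range(N_SAMPLES)])
    return signals

def saft_image(signals, xs, zs):
    """Delay-and-sum: add each waveform's sample at the pixel's travel time."""
    image = []
    for z in zs:
        row = []
        for x in xs:
            total = 0.0
            for (tx, rx), sig in zip(PAIRS, signals):
                n = round(travel_time(SENSORS_X[tx], SENSORS_X[rx], x, z) / DT)
                if n < N_SAMPLES:          # ignore times beyond the record
                    total += sig[n]
            row.append(total)
        image.append(row)
    return image

xs = [15.0 * i for i in range(23)]          # lateral positions, 0..330 mm
zs = [30.0 + 15.0 * k for k in range(19)]   # depths, 30..300 mm
image = saft_image(synth_signals(165.0, 150.0), xs, zs)

# Locate the brightest pixel; it should coincide with the reflector
peak_value, iz, ix = max((v, iz, ix)
                         for iz, row in enumerate(image)
                         for ix, v in enumerate(row))
print(xs[ix], zs[iz])
```

Only at the true reflector position do all 66 delayed samples line up, which is why the summed amplitude peaks there.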

2.6.3. Envelope and Energy Map

The envelope is obtained through the Hilbert transform, which provides the instantaneous amplitude of the signal independent of phase. For a real-valued signal x(t), the analytic signal is formed by combining the original signal with its Hilbert transform:

z(t) = x(t) + j H\{x(t)\}    (1)

where H\{x(t)\} denotes the Hilbert transform of x(t) [33]. The instantaneous amplitude envelope is then obtained as the modulus of the analytic signal:

A(t) = |z(t)| = \sqrt{x(t)^{2} + H\{x(t)\}^{2}}    (2)

The envelope, shown in Figure 6a, is always positive and varies smoothly with depth, making it suitable for statistical analysis. Squaring the envelope yields what is conventionally termed instantaneous energy in signal processing: the squared magnitude of the signal. This is not acoustic energy in the physical sense, as the raw amplitude lacks calibrated units, but it serves as a proportional measure suitable for comparing relative signal strength. The instantaneous energy profile enhances contrast by emphasizing strong reflections over diffuse scattering (Figure 6b).
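The envelope and instantaneous energy can be illustrated with a small standard-library sketch. It builds the analytic signal by zeroing the negative-frequency half of a direct discrete Fourier transform, which is the same construction that `scipy.signal.hilbert` performs with a fast transform. The direct transform is slow and intended only for short illustrative signals. The tone-burst test signal and the function names are assumptions.

```python
import cmath
import math

def analytic_signal(x):
    """Return z = x + j*H{x}, built by zeroing negative DFT frequencies."""
    n = len(x)
    spectrum = [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
                for m in range(n)]
    # Keep DC (and Nyquist for even n); double the positive frequencies
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for m in range(1, (n + 1) // 2):
        gain[m] = 2.0
    return [sum(spectrum[m] * gain[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)) / n
            for k in range(n)]

def envelope_and_energy(x):
    """Envelope A = |z| and instantaneous energy A**2."""
    envelope = [abs(z) for z in analytic_signal(x)]
    energy = [a * a for a in envelope]
    return envelope, energy

# Synthetic tone burst centred on sample 32 (ASSUMED test signal)
n = 64
signal = [math.exp(-((k - 32) / 6.0) ** 2) * math.cos(2 * math.pi * (k - 32) / 8.0)
          for k in range(n)]
envelope, energy = envelope_and_energy(signal)
peak_index = max(range(n), key=lambda k: envelope[k])
print(peak_index)
```

The envelope follows the burst amplitude rather than the oscillating carrier, and squaring it sharpens the contrast between the burst and the low-amplitude tails.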

2.6.4. Subsurface Quality Metrics

Quantitative metrics for the automatic analysis of the B-scans were developed by statistically evaluating their signal characteristics. To this end, the instantaneous energy values of each B-scan were averaged laterally to produce a one-dimensional energy-depth profile, such as the one shown in Figure 7. The standard deviation associated with this lateral average was also plotted against depth for further analysis (Figure 8). From these two profiles, three individual metrics were developed:

Signal-to-Background Ratio (S/B): On the Mean Energy plot (Figure 7), a backwall reflection (characterized as a surge in instantaneous energy) is expected at a depth equal to the sample thickness. By defining a window of ±10% around the known thickness, the Signal component was the maximum energy within this window, while the Background component was the mean value from the surface to the window start. The S/B ratio is a dimensionless quantity, with higher values indicating stronger backwall reflections relative to background energy.
Energy Concentration Ratio (ECR): Similar to the S/B ratio, except that the mean energy within the analysis window (±10% of the nominal thickness) was compared to the mean energy in the pre-window zone. Because ECR uses the mean within the analysis window rather than the peak value, it is less sensitive to localized high-amplitude reflections. ECR is dimensionless, with higher values indicating better-defined backwall reflections.
Spatial Dispersion Percentage: This corresponds to the proportion (expressed as a percentage) of depths at which the standard deviation of the mean lateral energy exceeds 25% of the maximum value observed within the analysis window (±10% of the nominal thickness), as shown in Figure 8. This 25% threshold was selected to distinguish meaningful lateral variation from background noise. The Spatial Dispersion Percentage is higher on B-Scans that show more reflections along the depth of the sample, which would indicate discontinuities, voids or other problems.
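The three definitions above can be sketched as follows. This is an illustrative reading, not the study's code. It assumes the energy map is stored as one row per depth, uses the population standard deviation across lateral positions, and counts the dispersion proportion over all depths in the profile (the text does not state which depths enter the proportion). The function name and the toy data are hypothetical.

```python
from statistics import mean, pstdev

def subsurface_metrics(energy, depths_mm, thickness_mm,
                       window_frac=0.10, std_frac=0.25):
    """energy[i][j]: instantaneous energy at depth index i, lateral index j."""
    mean_profile = [mean(row) for row in energy]    # lateral mean per depth
    std_profile = [pstdev(row) for row in energy]   # lateral spread per depth

    lo = thickness_mm * (1 - window_frac)
    hi = thickness_mm * (1 + window_frac)
    window = [i for i, d in enumerate(depths_mm) if lo <= d <= hi]
    pre = [i for i, d in enumerate(depths_mm) if d < lo]

    background = mean(mean_profile[i] for i in pre)
    sb_ratio = max(mean_profile[i] for i in window) / background
    ecr = mean(mean_profile[i] for i in window) / background

    # ASSUMPTION: proportion is taken over every depth in the profile
    std_ref = max(std_profile[i] for i in window)
    exceed = sum(1 for s in std_profile if s > std_frac * std_ref)
    dispersion_pct = 100.0 * exceed / len(std_profile)
    return sb_ratio, ecr, dispersion_pct

# Toy energy map: flat background with a backwall near 300 mm
depths = [5.0 + 10.0 * i for i in range(35)]          # 5..345 mm
energy_map = [[1.0, 1.0, 1.0, 1.0] for _ in depths]
for i, d in enumerate(depths):
    if d in (295.0, 305.0):
        energy_map[i] = [40.0, 60.0, 40.0, 60.0]

sb, ecr, disp = subsurface_metrics(energy_map, depths, 300.0)
print(sb, round(ecr, 2), round(disp, 2))
```

In this toy case the peak-based S/B ratio is larger than the ECR because the window mean is diluted by the low-energy depths on either side of the backwall.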

2.6.5. Composite Quality Score

The three subsurface metrics were combined into a Composite Quality Score (0–100). Metrics for which higher values indicate better quality (S/B ratio and Energy Concentration Ratio) were normalized by dividing by a reference value and capping the result at 1:

s_{i} = \min\left(\frac{M_{i}}{M_{i,ref}}, 1\right)    (3)

where M_{i} is the measured value (S/B ratio or ECR) and M_{i,ref} is a reference value representing ideal performance.

For Spatial Dispersion, since lower values indicate better quality (i.e., fewer discontinuities), a different function was used:

s_{dispersion} = \mathrm{clip}\left(1 - \frac{D - D_{min}}{D_{range}}, 0, 1\right)    (4)

where D is the measured Spatial Dispersion percentage, D_{min} represents the dispersion value at or below which concrete is considered ideal, and D_{range} is the normalization range corresponding to the interval between ideal and poor dispersion (currently recommended as 30%). The clip function constrains the result to the interval [0, 1].

The normalized scores were combined into a composite quality score using:

Q = 100 \times \sum_{i} w_{i} \cdot s_{i}    (5)

where s_{i} is the normalized score for each metric, w_{i} is the corresponding weight, and Q is the resulting quality score on a 0–100 scale. The weights were selected based on the interpretation of the laboratory testing phase. The resulting score was classified as: Good (≥70), Fair (50–69), Poor (30–49), or Very Poor (<30).
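Equations (3) to (5) and the classification bands can be sketched in a few lines. The default reference values and weights are those reported in Sections 3.4.1 and 3.4.2. The function names are hypothetical. Treating the band edges as cut-offs at 70, 50, and 30 for non-integer scores is an assumption, since the bands are stated for whole numbers.

```python
def clip(value, lo, hi):
    """Constrain value to the interval [lo, hi]."""
    return max(lo, min(hi, value))

def composite_score(sb, ecr, dispersion_pct,
                    sb_ref=15.0, ecr_ref=10.0,
                    d_min=20.0, d_range=30.0,
                    weights=(0.3, 0.3, 0.4)):
    """Combine the three metrics into a 0-100 quality score."""
    s_sb = min(sb / sb_ref, 1.0)                                        # Eq. (3)
    s_ecr = min(ecr / ecr_ref, 1.0)                                     # Eq. (3)
    s_disp = clip(1.0 - (dispersion_pct - d_min) / d_range, 0.0, 1.0)   # Eq. (4)
    w_sb, w_ecr, w_disp = weights
    return 100.0 * (w_sb * s_sb + w_ecr * s_ecr + w_disp * s_disp)      # Eq. (5)

def classify(score):
    """Map a score to its class (ASSUMED cut-offs for fractional scores)."""
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    if score >= 30:
        return "Poor"
    return "Very Poor"

# Illustrative inputs only, not measurements from this study
sound = composite_score(sb=20.0, ecr=12.0, dispersion_pct=15.0)
defective = composite_score(sb=3.0, ecr=1.5, dispersion_pct=80.0)
print(classify(sound), classify(defective))
```

Capping each normalized score at 1 means that a very strong backwall reflection cannot compensate for a poor result on another metric.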

2.6.6. Validation Approach

Validation followed a two-phase approach. Laboratory validation consisted of testing controlled samples with defects at 24 and 48 h, and then comparing calculated quality scores between control samples and those containing each defect type. Cores were extracted from selected samples and sectioned to visually confirm defect presence. Field validation included a full-scale test section constructed under realistic conditions, which was used to assess whether laboratory-developed scoring thresholds remained applicable in field conditions.

2.7. Software Implementation

The Multi-Metric Score is implemented in Python 3.14.3 and processes raw MIRA .lbv files in batch mode. Users specify a folder containing measurement files; the software generates quality scores for each measurement along with summary statistics and exports results to CSV format for further analysis. The B-scan image and energy-depth profile for each measurement are displayed alongside scores, enabling immediate visual verification of flagged measurements.
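The batch workflow described above might be organized along the following lines. This is a sketch under stated assumptions, not the released software. The binary layout of .lbv files is not described here, so the per-file scoring step is passed in as a caller-supplied function (`score_file`, hypothetical). The CSV column names are likewise illustrative.

```python
import csv
from pathlib import Path
from statistics import mean

def batch_score(folder, score_file, out_csv):
    """Score every .lbv file in `folder` and write one CSV row per file.

    `score_file` is a caller-supplied function (hypothetical) that takes a
    file path and returns a quality score; .lbv parsing is not shown.
    """
    rows = [(path.name, score_file(path))
            for path in sorted(Path(folder).glob("*.lbv"))]

    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "score"])   # ASSUMED column names
        writer.writerows(rows)

    scores = [score for _, score in rows]
    # Summary statistics for the folder; mean is None when no files matched
    return {"count": len(scores), "mean": mean(scores) if scores else None}
```

Keeping the file reader separate from the batch loop lets the same driver be reused if the file format or the scoring parameters change.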

3. Results

The automated quality assessment Multi-Metric Score was evaluated on nine laboratory samples and four field test areas at both 24 and 48 h after concrete placement. The individual metrics were extracted from B-scan reconstructions and composite quality scores were obtained by combining all metrics into a single score.

3.1. Surface Velocity

Shear wave velocities were calculated from raw waveform data and compared to MIRA on-screen values. Laboratory samples exhibited velocities ranging from 2194 to 2521 m/s at 24 h, increasing by 1–5% at 48 h as hydration progressed (Table 1). The sections constructed in the field showed lower initial velocities, ranging from 1556 to 1658 m/s at 24 h, with larger increases of 8–13% by 48 h, reflecting differences in curing conditions between laboratory and field environments (Table 2). In addition, the field section was intentionally constructed with much poorer-quality concrete surrounding the defects.

3.2. B-Scan Observations

Visual examination of the B-scan images revealed distinct patterns between the control samples and those containing subsurface defects. The control samples exhibited a clear, continuous backwall reflection appearing as a horizontal band of elevated amplitude at the expected thickness depth, with minimal disturbances in the concrete body above the backwall, as seen in Figure 5a. Samples containing voids and honeycombing displayed several reflections throughout the concrete volume, with the backwall reflection appearing weak, discontinuous, or sometimes absent (Figure 9).

3.3. Individual Metric Performance

3.3.1. Signal-to-Background Ratio

The Signal-to-Background ratio demonstrated strong discrimination between the control samples and the defective samples in laboratory testing. The control samples achieved average S/B ratios of 19.2–127.4, while samples containing voids ranged from 2.3 to 12.2 and honeycombing samples from 1.7 to 12.5 (Table 3). The field results (Table 4) showed narrower S/B ranges compared to laboratory testing. The control sections achieved average S/B ratios of 17.6–23.3, while defect areas ranged from 7.5 to 13.1.

3.3.2. Energy Concentration Ratio

The energy concentration ratio (ECR) was defined as a complementary metric to the S/B ratio. Laboratory control samples achieved average ECR values from 7.1 to 54.7, while void samples ranged from 1.3 to 4.5 and honeycomb samples from 0.8 to 3.3. The average field ECR values were lower overall, but still showed a difference between control (5.0–8.4) and defect areas (3.5–3.9).

3.3.3. Spatial Dispersion

The control laboratory samples exhibited a low Spatial Dispersion (11.4–29.2%), indicating energy confined primarily to the backwall region. Samples with voids showed a higher average Spatial Dispersion (63.3–74.9%) and honeycombing samples exhibited the highest dispersion values (72.6–91.4%), reflecting widespread scattering from distributed internal defects. The field areas showed similar patterns, with the control sections (20.3–46.1%) exhibiting lower dispersion than the areas with voids (63.9–72.8%) or with honeycombing (86.0–86.1%).

3.4. Multi-Metric Score

3.4.1. Reference Value Selection

Reference values for score normalization were established using the laboratory control samples. For each metric, the reference value was selected to represent the expected performance in sound concrete, such that the control samples would achieve normalized scores approaching 1.0 while defective samples would score substantially lower.

For the Signal-to-Background Ratio, a reference value of 15 showed good separation between the control and the defective samples in laboratory testing. For the Energy Concentration Ratio, the reference value selected was 10. This value was selected by examining the results at both 24 and 48 h, shown in Table 3.

For spatial dispersion, where lower values indicate better quality, the normalization was inverted using a reference range of 20% (ideal) to 50% (defective), such that dispersion ≤ 20% yields a normalized score of 1.0 and dispersion ≥ 50% yields 0.0.

3.4.2. Weight Assignment

Weights were determined based on how effectively each metric could differentiate between sample groups during the laboratory testing. Spatial Dispersion received the highest weight (0.4) as it demonstrated the clearest separation between control and defective samples. Signal-to-Background Ratio and Energy Concentration Ratio, as complementary metrics, received a weight of 0.3 each. Because the weights sum to 1 and each normalized score lies between 0 and 1, the resulting composite score ranges from 0 to 100.
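As a worked check of the arithmetic, consider a hypothetical measurement (not taken from the tables) with an S/B ratio of 12, an ECR of 5, and a Spatial Dispersion of 35%. Using the reference values of 15 and 10, the 20% to 50% dispersion range, and the weights 0.3, 0.3, and 0.4, with s_{S/B}, s_{ECR}, and s_{D} denoting the three normalized scores:

```latex
\begin{aligned}
s_{S/B} &= \min\left(\tfrac{12}{15},\,1\right) = 0.80,\\
s_{ECR} &= \min\left(\tfrac{5}{10},\,1\right) = 0.50,\\
s_{D}   &= \mathrm{clip}\left(1 - \tfrac{35 - 20}{30},\,0,\,1\right) = 0.50,\\
Q &= 100 \times (0.3 \cdot 0.80 + 0.3 \cdot 0.50 + 0.4 \cdot 0.50) = 59
\end{aligned}
```

A score of 59 falls in the Fair band (50–69), below the acceptance threshold of 70.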

3.4.3. Lab Testing Scores

Using these weights, the quality scores were calculated for each sample, showing good discrimination between the control and defect samples. The Multi-Metric Score for the control samples was between 79 and 100, consistently above the acceptance threshold of 70, while honeycombed samples scored 6–35 and samples with voids scored 8–38, as seen in Figure 10 and Table 3.

3.4.4. Field Validation

The reference values, weights, and thresholds established from the laboratory data were applied without modification to field areas to assess generalization to realistic construction conditions.

Field Multi-Metric Scores showed greater variability than laboratory results but maintained meaningful discrimination between control and defect areas, as seen in Figure 11. Control sections averaged 56–76 at 24 h, increasing to 84–90 at 48 h as the concrete matured. Honeycomb areas averaged 26–37 and areas with voids averaged 33–36, with all measurements in the defect areas scoring below the acceptance threshold of 70.

4. Discussion

The Multi-Metric Score demonstrated strong agreement with the known sample conditions across both laboratory and field testing environments, though the small sample size (n = 9 laboratory specimens, 4 field areas) limits statistical generalization. In laboratory testing, where defect locations were precisely placed during fabrication, the scoring system achieved separation between the control and defective samples. All control samples scored above 70 (classified as “Good”), while all samples containing voids or honeycombing scored under 70, with the majority falling below 50 (“Poor” or “Very Poor”).

Field validation offered a more realistic evaluation setting. Although cores were not extracted from the field test section, the systematic placement of defects during construction provided a reliable ground truth for classification purposes. The scoring system correctly identified defect regions at 48 h, with control sections scoring above 70 and defect areas scoring substantially lower.

The agreement between automated scores and known defect locations suggests that the Multi-Metric Score transfers from controlled laboratory conditions to field construction environments, though validation on larger datasets is needed to refine the calibration parameters.

4.1. Laboratory Results

4.1.1. Individual Metric Performance

The three subsurface quality metrics captured complementary aspects of ultrasonic wave interaction with internal concrete conditions.

Spatial Dispersion proved most effective at discriminating between the control and samples with defects. The physical interpretation is straightforward: in sound concrete, acoustic energy concentrates at the backwall interface, while on samples with internal defects, these defects scatter energy throughout the concrete volume, producing elevated lateral variability at multiple depths.

S/B Ratio and ECR quantified the contrast between backwall reflection strength and the acoustic background preceding it. These metrics directly correspond to what an experienced operator would observe qualitatively: a clear, bright backwall reflection against a dark (low-amplitude) concrete body indicating sound material, while reflections in other areas or an absent backwall reflection would indicate internal discontinuities.

Among the three metrics, Spatial Dispersion provided the clearest separation between control and defective samples. This metric is inherently sensitive to honeycombing and voids because the changes in interface and acoustic impedance result in multiple reflections along the depth of the sample. The consistently high Spatial Dispersion values for samples with honeycombing (exceeding 70% in nearly all cases) reflect this physical mechanism.

4.1.2. Multi-Metric Score Performance

The weighted combination of metrics into a single 0–100 score achieved complete separation between control samples and samples with defects in laboratory testing at both 24 and 48 h using the 70/100 threshold, with control samples scoring 79–100 and samples with defects scoring 6–38 across both time points.

4.2. Field Validation

Field testing revealed narrower metric ranges compared to laboratory results while maintaining meaningful discrimination. Control sections scored a mean value of 66 at 24 h, substantially lower and more variable than laboratory control samples. By 48 h, control scores improved to a mean value of 87, with the majority exceeding the 70-point threshold.

The reduced scores for the control sections at 24 h merit particular attention. Four of six control measurements fell below the 70-point “Good” threshold despite representing sound concrete. This likely reflects field construction conditions, such as hand-casting, ambient curing, and a mix design representative of lower-quality field placements. However, their scores were still higher than those from areas with defects.

This time-dependent behavior suggests that field measurements at very early ages may require age-adjusted interpretation thresholds, or that 48-h measurements provide more reliable classification than 24-h measurements for acceptance decisions.

Defect Detection in Field Conditions

The Multi-Metric Score maintained discrimination under field conditions in this pilot study: all measurements on areas with defects fell below the acceptance threshold of 70/100 at both 24 and 48 h, while control sections showed progressive improvement (Figure 11). Validation across a broader range of field sites and conditions is warranted.

The scoring system performed more consistently for honeycombing than for voids. The honeycombed areas did not achieve scores approaching the 70-point threshold, whereas areas with voids exhibited wider variability: 28–43 at 24 h and 24–62 at 48 h. This difference could be explained by the spatial distribution of each defect type. Honeycombing extended continuously across substantial volumes, ensuring that measurements within the designated area intersected the defect regardless of the exact placement. The void defects created by dry ice produced discrete cavities with sound concrete between them. Individual measurements within the “void region” could fall either directly above a void (producing low scores) or between voids (producing scores approaching control values).

4.3. Relationship to Qualitative B-Scan Interpretation

The quantitative metrics underlying the Multi-Metric Score correspond directly to features that experienced operators assess visually in B-scan images. An expert examining a B-scan image would evaluate: (1) whether a clear backwall reflection appears at the expected thickness depth, (2) whether the concrete volume above the backwall appears “clean” or contains other reflections, and (3) whether the backwall reflection is continuous and well-defined or fragmented and diffuse (Figure 5).

The Multi-Metric Score automates a decision process that experts perform intuitively, translating visual pattern recognition into numerical thresholds. When the system assigns a low score, examination of the corresponding B-scan image consistently reveals features that would prompt an expert to flag the measurement for further investigation.

The single 0–100 score simplifies the rich information content of a full B-scan image. This simplification is intentional: the score serves as a screening tool to identify areas that require detailed examination rather than a replacement for expert interpretation. When scores fall below the threshold, users should examine the individual metric values and corresponding B-scan image to understand which characteristics triggered the low assessment and to characterize the apparent anomaly.

4.4. Factors Affecting the Multi-Metric Score Performance

Multi-Metric Scores exhibited time-dependent changes reflecting concrete hydration and strength development. In practical terms, this means that measurements taken at very early ages may underrate quality. Nevertheless, in cases where this occurred, the Multi-Metric score still assigned the highest values to the control samples at both the 24-h and 48-h time points.

Notably, from 24 to 48 h, the control regions showed an increase in their scores in both lab and field measurements, whereas the honeycombed regions and the regions with voids showed an overall decrease in their scores over the same interval under field conditions.
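The field averages reported in Table 4 illustrate this age effect. The short Python sketch below applies the 70-point acceptance threshold to those averages at both ages. The dictionary layout and the helper names `passes` and `summarize` are illustrative choices made here, not part of the published software.

```python
# Average field scores from Table 4, stored as (24 h, 48 h) pairs.
FIELD_SCORES = {
    "Control North": (56, 90),
    "Control South": (76, 84),
    "Honeycomb": (37, 26),
    "Voids": (36, 33),
}

ACCEPTANCE_THRESHOLD = 70  # acceptance threshold used in the study


def passes(score, threshold=ACCEPTANCE_THRESHOLD):
    """Return True when a score meets the acceptance threshold."""
    return score >= threshold


def summarize(scores):
    """Return {area: (pass_24h, pass_48h, change)} for each area."""
    return {
        area: (passes(s24), passes(s48), s48 - s24)
        for area, (s24, s48) in scores.items()
    }


if __name__ == "__main__":
    for area, (ok24, ok48, change) in summarize(FIELD_SCORES).items():
        print(f"{area:14s} 24 h pass={ok24!s:5s} 48 h pass={ok48!s:5s} change={change:+d}")
```

With these values, Control North falls below the threshold at 24 h (56) but exceeds it at 48 h (90), while both defect areas remain below 70 at both ages and decline over the interval.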

The discrete nature of defects also highlighted an important consideration for field implementation: score interpretation depends on the number of measurements relative to defect size. An isolated low score among high-scoring measurements warrants further investigation of the area, but it may reflect a localized anomaly rather than a widespread quality problem.

For construction acceptance applications, systematic grid-based measurement patterns ensure adequate spatial coverage. When initial measurements identify regions with mixed or borderline scores, additional measurements at finer spacing can delineate defect extent and distinguish between isolated anomalies and continuous problem areas.
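One way to separate isolated anomalies from continuous problem areas on a measurement grid is to group neighboring below-threshold cells. The following is a minimal sketch under stated assumptions: the 4-neighbor connectivity rule, the `min_contiguous` parameter, and the example grid are all hypothetical and are not taken from the study.

```python
def low_score_clusters(grid, threshold=70):
    """Group below-threshold cells into 4-connected clusters.

    grid is a list of rows of scores; the result is a list of clusters,
    each a sorted list of (row, col) positions.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] >= threshold or (r, c) in seen:
                continue
            # Flood fill from this cell to collect its connected neighbors.
            stack, cluster = [(r, c)], []
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                cluster.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and grid[nr][nc] < threshold):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            clusters.append(sorted(cluster))
    return clusters


def label_cluster(cluster, min_contiguous=2):
    """Label a cluster 'isolated' or 'contiguous' by its cell count."""
    return "contiguous" if len(cluster) >= min_contiguous else "isolated"


if __name__ == "__main__":
    # Hypothetical 4 x 5 grid of scores, for illustration only.
    scores = [
        [88, 91, 85, 90, 87],
        [86, 42, 89, 92, 90],
        [90, 88, 84, 35, 31],
        [87, 90, 86, 38, 89],
    ]
    for cluster in low_score_clusters(scores):
        print(label_cluster(cluster), cluster)
```

In this hypothetical grid, the single low cell would be a candidate for a repeat measurement, whereas the three adjacent low cells would be a candidate for finer-spaced measurements to delineate the extent of the affected area.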

4.5. Limitations

The study examined only voids and honeycombing defects. Other defect types relevant to pavement construction, such as delaminations or vertical cracks, were not evaluated. The metrics may require modification or supplementation to address these conditions. Also, all samples were between 200 and 300 mm thick, representative of typical pavement construction. Performance at greater depths, where signal attenuation becomes more significant, was not evaluated.

A further limitation is that the Multi-Metric Score cannot distinguish between voids and reinforcing steel (rebar, dowel bars, wire mesh). Both create strong reflections above the backwall depth that would elevate the spatial dispersion and reduce the S/B ratio. In reinforced concrete, the Multi-Metric Score would therefore likely flag areas above the reinforcement as potentially defective. Future research directions to address this limitation include (1) integration with as-built documentation to identify and exclude regions with known reinforcement, enabling automated masking of expected steel locations, and (2) development of analysis techniques that exploit the distinct acoustic impedance characteristics of steel versus air voids.
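The masking idea in direction (1) could, in its simplest form, amount to excluding measurement stations that overlap documented steel locations before scores are interpreted. The sketch below is purely illustrative: the function name, the one-dimensional station layout, and the `margin_mm` clearance are assumptions introduced here, and no such feature is described for the current software.

```python
def split_by_reinforcement(stations_mm, steel_zones_mm, margin_mm=0.0):
    """Separate measurement stations into scorable and masked lists.

    stations_mm: positions of measurement centers along the slab (mm).
    steel_zones_mm: (start, end) intervals where as-built records show steel.
    margin_mm: extra clearance added on each side of every interval
               (hypothetical allowance for the footprint of the device).
    """
    scorable, masked = [], []
    for x in stations_mm:
        in_steel = any(start - margin_mm <= x <= end + margin_mm
                       for start, end in steel_zones_mm)
        (masked if in_steel else scorable).append(x)
    return scorable, masked


if __name__ == "__main__":
    # Hypothetical stations every 150 mm and one dowel-bar zone at a joint.
    stations = [0, 150, 300, 450, 600]
    dowel_zones = [(280, 320)]
    print(split_by_reinforcement(stations, dowel_zones, margin_mm=50))
```

Masked stations would still need expert review of the B-scan image, since a defect located near the steel would otherwise go unscreened.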

Layered systems, such as bonded overlays, also produce interface reflections that can degrade quality scores despite representing sound construction. Users must interpret scores in the context of the known pavement structure. However, for the primary application of jointed plain concrete pavement quality control, assessment can be performed in the extensive mid-slab regions where no reinforcement is present.

4.6. Interpretation Guidelines

Quality scores should be interpreted as screening indicators rather than definitive assessments:

Score ≥ 70: Acceptable subsurface quality; no further investigation required.
Score 50–69: Borderline quality that requires further evaluation. Examine the individual metric values to identify which component triggered the reduced score, visually inspect the B-scan image for interpretable features, and consider collecting additional measurements in the surrounding area to determine whether the anomaly is localized or extends over a broader region.
Score < 50: Likely a defect requiring investigation. Examine the B-scan image to characterize the nature and apparent depth of the anomaly. Take additional measurements on a finer grid spacing around the flagged location to define the extent of the boundary of the defect. If multiple adjacent measurements consistently score below 50, the area should be documented for engineering evaluation and potential corrective action.
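The three bands above can be expressed as a small lookup. This is a minimal Python sketch: the band labels, the function names, and the reading of “multiple adjacent measurements” as at least two are assumptions made here for illustration, and non-integer scores between 69 and 70 are treated as borderline.

```python
def interpret_score(score):
    """Map a 0-100 Multi-Metric Score to the screening bands of Section 4.6."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score >= 70:
        return "acceptable"
    if score >= 50:
        return "borderline"
    return "likely defect"


def needs_engineering_evaluation(adjacent_scores, min_count=2):
    """Return True when at least min_count adjacent scores are all below 50.

    min_count=2 is an assumed reading of "multiple adjacent measurements".
    """
    return (len(adjacent_scores) >= min_count
            and all(s < 50 for s in adjacent_scores))


if __name__ == "__main__":
    # 48 h field averages from Table 4.
    for area, score in [("Control North", 90), ("Honeycomb", 26), ("Voids", 33)]:
        print(area, score, interpret_score(score))
```

As the guidelines state, the returned band is a screening indicator only; borderline and low results still call for inspection of the individual metrics and the B-scan image.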

5. Conclusions

This proof-of-concept study developed and preliminarily validated an automated Multi-Metric Score for assessing concrete pavement subsurface quality using ultrasonic tomography. The Multi-Metric Score transforms raw MIRA A1040 measurements into an objective quality score (0–100) through three signal-derived metrics: Signal-to-Background Ratio, Energy Concentration Ratio, and Spatial Dispersion. The principal findings are:

Velocity alone did not reliably discriminate between defect types, as the measured values reflect near-surface material properties rather than subsurface conditions. Control samples and those with internal defects (voids, honeycombing) exhibited similar velocity ranges, indicating that surface wave velocity is not a sensitive indicator of subsurface quality for the defect types investigated.
The Multi-Metric Score successfully discriminated between control samples and those containing internal voids or honeycombing in this limited dataset. Laboratory testing achieved separation between groups using a threshold of 70 points. Field validation, using weights calibrated solely on laboratory data, confirmed discrimination capability under realistic construction conditions.
The three metrics used in the Multi-Metric Score capture complementary aspects of ultrasonic wave propagation: backwall reflection strength and energy distribution. Each corresponds to observable B-scan image characteristics.
The Multi-Metric Score requires no training data beyond control sample measurements for reference value calibration. Standard signal processing operations enable implementation in common programming environments.
Application of laboratory-derived parameters to field areas maintained classification accuracy despite showing narrower score ranges. After 48 h, all control sections scored ≥70 (“Good”), and all defect areas scored <70 (“Fair” or worse), confirming transferability to realistic conditions.

The scoring parameters established in this pilot study were derived empirically from a limited sample size representing only two defect types. This work should be considered proof-of-concept pending large-scale validation. The scoring procedure has been implemented into user-friendly software, making it attractive for implementation. However, extensive field testing across diverse construction projects is critically needed to refine the thresholds and weighting factors, and to validate their applicability to different concrete mixtures, pavement thicknesses, and curing conditions.

Author Contributions

Conceptualization, J.E.O., L.K. and J.M.V.; methodology, J.E.O., L.K. and J.M.V.; software, L.K. and J.E.O.; validation, J.E.O., M.S. and M.M.D.; formal analysis, J.E.O. and L.K.; investigation, J.E.O., M.S. and M.M.D.; resources, M.S. and M.M.D.; data curation, J.E.O. and M.S.; writing—original draft preparation, J.E.O.; writing—review and editing, L.K. and J.M.V.; visualization, J.E.O.; supervision, L.K. and J.M.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by U.S. Army Engineer Research and Development Center grant number W912HZ-24-C-0038.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data will be made available by the authors on request.

Acknowledgments

The authors would like to thank Alessandro Fascetti, who served as a Principal Investigator on the umbrella project, for his support of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. (a) MIRA A1040 position on a lab sample. (b) Side view of a sample with a honeycombing defect, with MIRA A1040 position atop.

Figure 2. (a) Core from dry ice lab sample, showing introduced voids. (b) Side view of honeycombed lab sample.

Figure 3. (a) Finished field test section. (b) Placing of dry ice on the initial lift to introduce voids in the slab.

Figure 4. Five signals at the reported pair distance extracted from the .lbv file showing the time-dependency of the direct wave due to distance.

Figure 5. B-scans showing (a) a control sample with a clean backwall reflection and (b) a honeycombed sample with several zones of reflection inside the concrete section.

Figure 6. (a) Energy envelope from a Hilbert transform of the raw amplitude signal. (b) Instantaneous energy from the squared envelope.

Figure 7. Mean energy profile, with key analysis points and window noted for use in determining S/B ratio and ECR.

Figure 8. Lateral standard deviation profile, with key points, analysis window, and active regions noted for use in determining SD.

Figure 9. B-scans showing (a) a lab sample with honeycombing, (b) a lab sample with voids and a backwall reflection, (c) a field test with honeycombing, and (d) a field test with voids, without any backwall reflection.

Figure 10. Distribution of Composite Quality Score for laboratory samples at (a) 24 h and (b) 48 h.

Figure 11. Distribution of Composite Quality Score for field tests at (a) 24 h and (b) 48 h.

Table 1. Velocities at 24 and 48 h for Lab Testing Samples.

Sample	Description	Average Velocity at 24 h (m/s)	Average Velocity at 48 h (m/s)	Increase
C1S1	Control Sample	2351	2434	4%
C1S2	Voids	2388	2488	4%
C1S3	Voids	2313	2437	5%
C2S1	Control Sample	2194	2266	3%
C2S2	Honeycombing	2304	2384	3%
C2S3	Honeycombing	2369	2428	2%
C3S1	Honeycombing	2521	2539	1%
C3S2	Honeycombing	2556	2585	1%
C4S2	Control Sample	2481	2513	1%

Table 2. Velocities at 24 and 48 h for Field Testing Areas.

Area	Average Velocity at 24 h (m/s)	Average Velocity at 48 h (m/s)	Increase
Control North	1556	1726	11%
Honeycomb	1593	1767	11%
Control South	1607	1749	9%
Voids	1658	1880	13%

Table 3. Average Individual Metrics and Composite Quality Score for Laboratory Samples at 24 and 48 h.

ID	Description	S/B Ratio, 24 h [-]	S/B Ratio, 48 h [-]	ECR, 24 h [-]	ECR, 48 h [-]	Spatial Disp., 24 h (%)	Spatial Disp., 48 h (%)	Score, 24 h [-]	Score, 48 h [-]
C1S1	Control	85.0	72.6	30.4	26.1	11.4	16.1	100	100
C1S2	Voids	2.3	4.7	1.3	1.6	71.4	74.9	8	14
C1S3	Voids	9.4	12.2	3.4	4.5	72.1	63.3	29	38
C2S1	Control	46.9	19.2	19.8	7.1	27.7	29.2	90	79
C2S2	Honeycombing	1.7	5.2	0.8	1.7	81.0	81.2	6	16
C2S3	Honeycombing	3.4	12.5	1.3	3.3	91.4	90.3	11	35
C3S1	Honeycombing	8.6	7.8	2.6	2.3	72.0	73.3	25	26
C3S2	Honeycombing	6.2	6.9	3.0	2.8	75.0	73.4	21	26
C4S1	Control	127.4	62.3	54.7	35.6	16.4	18.7	100	99

Table 4. Average Individual Metrics and Composite Quality Score for Field Testing Areas at 24 and 48 h.

Description	S/B Ratio, 24 h [-]	S/B Ratio, 48 h [-]	ECR, 24 h [-]	ECR, 48 h [-]	Spatial Disp., 24 h (%)	Spatial Disp., 48 h (%)	Score, 24 h [-]	Score, 48 h [-]
Control North	17.6	18.0	5.0	8.4	46.1	20.3	56	90
Control South	19.2	23.3	8.1	7.4	34.2	20.4	76	84
Honeycomb	13.1	7.5	3.6	3.8	86.1	86.0	37	26
Voids	12.3	10.4	3.9	3.5	72.8	63.9	36	33
