For this evaluation, we focus on four vehicle classes (bus, car, keke, truck). The remaining classes (bicycle and motorcycle) were rarely observed in the test videos and, more importantly, contribute comparatively little to pavement loading relative to heavier vehicles; excluding them reduces noise in the workload-oriented analysis while preserving the dominant traffic demand signals. Counts produced by the higher-performing pipeline variants were compared against a manual baseline created by human annotation within the same ROI and time window.
Figure 8 compares the manual baseline counts with the counts produced by the selected pipeline variants (P1–P4).
4.3.1. Vehicle Counting Analysis
For the Yola Road downward panel, which represents the easier daytime and high-visibility scene, all four selected pipeline variants recover the dominant car and keke flows reasonably well, but the YOLO11l-based ROI variants remain the closest to the baseline. In particular, YOLO11l + BoT-SORT + ROI predicts 129 cars and 186 keke against manual counts of 108 and 181, respectively, while YOLO11l + ByteTrack + ROI predicts 127 cars and 187 keke. The YOLO26l ROI variants are still competitive but show slightly larger deviations in this panel, especially through complete bus misses and larger keke under-counting for the ByteTrack version. This pattern is also reflected in the aggregate downward-flow error summaries, where the Yola downward overall WAPE is lower for the YOLO11l ROI variants than for the corresponding YOLO26l ROI variants.
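For reference, WAPE can be read as the total absolute count error divided by the total manual count. The sketch below assumes that standard definition (the function name `wape` is illustrative and not taken from the authors' code) and applies it to the full P1 count vector for the Yola Road downward flow given in Section 4.3.2.

```python
def wape(manual, predicted):
    """Weighted absolute percentage error: sum|p - m| / sum(m) (assumed definition)."""
    total_manual = sum(manual.values())
    if total_manual == 0:
        raise ValueError("WAPE is undefined when the manual total is zero")
    total_abs_error = sum(abs(predicted[c] - manual[c]) for c in manual)
    return total_abs_error / total_manual


# Yola Road downward flow: manual baseline and P1 (YOLO11l + BoT-SORT + ROI) counts.
manual_yola_down = {"bus": 6, "car": 108, "keke": 181, "truck": 23}
p1_yola_down = {"bus": 1, "car": 129, "keke": 186, "truck": 2}

wape_p1 = wape(manual_yola_down, p1_yola_down)
print(f"WAPE (P1, Yola Road downward) = {wape_p1:.3f}")
```

Because the denominator is the total manual count, a class with a zero baseline still contributes its error to the numerator, which is why WAPE stays defined where per-class APE does not.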
The Yola Road upward panel reveals the main directional weakness of the counting system. Here, all four variants depart substantially from the manual distribution, primarily through car over-counting and keke under-counting. YOLO11l + BoT-SORT + ROI predicts 241 cars and only 79 keke relative to manual counts of 106 cars and 206 keke, while YOLO11l + ByteTrack + ROI moderates this imbalance to 219 cars and 93 keke. The YOLO26l variants do not remove the asymmetry: YOLO26l + BoT-SORT + ROI produces 45 buses, 209 cars, and 38 keke, and YOLO26l + ByteTrack + ROI produces 50 buses, 194 cars, and 41 keke, again relative to a baseline of zero buses, 106 cars, and 206 keke. Consistent with the grouped bars, the Yola Road upward overall WAPE remains high for all selected ROI variants, ranging from about 0.74 for YOLO11l + ByteTrack + ROI to about 0.97 for YOLO26l + BoT-SORT + ROI.
In the Mubi Road panels, which correspond to evening operation under cloudier, lower-visibility, and more congested conditions, the grouped bars show substantially larger departures from the manual baseline. The YOLO11l variants exhibit severe distortion of the class composition, especially through large car over-counting and strong keke under-counting. For example, YOLO11l + ByteTrack + ROI predicts 125 cars and only 20 keke in downward flow, compared with manual counts of 35 cars and 156 keke, and in upward flow it predicts 114 cars and 23 keke, compared with manual counts of 17 cars and 120 keke. The YOLO26l variants remain imperfect, but they visibly reduce this distortion: YOLO26l + ByteTrack + ROI predicts 65 cars and 63 keke in downward flow, and 40 cars and 44 keke in upward flow, bringing the class distribution closer to the baseline than the YOLO11l-based alternatives. This improvement is also reflected in the summary metrics, where the Mubi upward overall WAPE decreases from 1.45 for YOLO11l + BoT-SORT + ROI and 1.39 for YOLO11l + ByteTrack + ROI to 0.85 for YOLO26l + BoT-SORT + ROI and 0.81 for YOLO26l + ByteTrack + ROI.
Overall, the selected grouped-bar figure supports three conclusions. First, ROI-based tuned counting is viable in the easier Yola Road downward setting, where all four selected variants broadly reproduce the dominant class composition. Second, the strongest empirical limitation remains direction-dependent instability, particularly for upward flow, where rear-view similarity and tracking fragmentation lead to persistent confusion between keke and car. Third, in the more difficult Mubi Road scene, the YOLO26l-based variants provide a more favorable trade-off than the YOLO11l-based variants, especially when coupled with ByteTrack, but they still fall short of reliable class-balanced counting under adverse field conditions. These observations show that the proposed framework has practical value as a calibrated, keke-aware traffic composition measurement tool for roadside field studies, especially in settings where local vehicle mix is more informative than aggregate flow alone. Accordingly, the contribution of this work is not a claim of universally robust counting, but the introduction and validation of a reproducible composition-aware counting framework that can support transport analysis, reveal failure modes, and serve as a foundation for subsequent calibration and deployment studies.
4.3.2. Error Analysis
Using the class-wise totals from the downward and upward traffic-flow distribution plots (Figure 8), we quantify counting accuracy relative to the manual baseline for the four dominant classes (bus, car, keke, truck). For the downward traffic flow of the Yola Road test video (baseline: bus = 6, car = 108, keke = 181, truck = 23), the tuned P1 (YOLO11l + BoT-SORT + ROI) pipeline variant predicts (1, 129, 186, 2). Let $m_c$ and $p_c$ denote the manual and predicted counts for class $c$, and let $N$ be the number of classes. We compute the absolute count error per class as
$$e_c = \left| p_c - m_c \right|,$$
mean absolute error as
$$\mathrm{MAE} = \frac{1}{N} \sum_{c} e_c,$$
root mean square error as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{c} e_c^{2}},$$
and mean absolute percentage error as
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{c} \frac{e_c}{m_c}.$$
The same procedure is applied to the remaining selected tuned ROI-based variants for both traffic directions in the Yola Road and Mubi Road test videos. Table 3 and Table 4 report the resulting predicted counts, absolute errors, and APE values for P1 (11L-BoT-ROI), P2 (11L-Byte-ROI), P3 (26L-BoT-ROI), and P4 (26L-Byte-ROI).
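The four error measures named above can be checked numerically for the P1 example. The following is a minimal sketch that assumes the standard definitions of absolute error, MAE, RMSE, and MAPE; the variable names are illustrative and not taken from the authors' code.

```python
import math

CLASSES = ("bus", "car", "keke", "truck")

# Yola Road downward flow: manual baseline and P1 (YOLO11l + BoT-SORT + ROI) counts.
manual = {"bus": 6, "car": 108, "keke": 181, "truck": 23}
predicted = {"bus": 1, "car": 129, "keke": 186, "truck": 2}

# Per-class absolute count error and absolute percentage error (in percent).
abs_err = {c: abs(predicted[c] - manual[c]) for c in CLASSES}
ape = {c: 100.0 * abs_err[c] / manual[c] for c in CLASSES}  # all baselines are nonzero here

# Aggregate measures over the four classes.
mae = sum(abs_err.values()) / len(CLASSES)
rmse = math.sqrt(sum(e ** 2 for e in abs_err.values()) / len(CLASSES))
mape = sum(ape.values()) / len(CLASSES)

print(abs_err)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}, MAPE = {mape:.2f}%")
```

Under these definitions the per-class errors are (5, 21, 5, 21), giving MAE = 13.00, RMSE ≈ 15.26, and MAPE ≈ 49.21%. The keke APE of about 2.76% matches the value quoted for 11L-BoT-ROI.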
For the Yola Road test video, the smallest relative error is achieved on keke in the downward flow (Table 3) by 11L-BoT-ROI (APE = 2.76%). Across the four variants in the downward flow, keke APE remains low (2.76–9.39%) and car APE remains moderate (17.59–23.15%), whereas heavy vehicles and low-count categories show much larger percentage deviations (truck APE = 86.96–91.30% and bus APE = 83.33–100.00%). In the upward flow (Table 3), car APE ranges from 83.02% to 127.36%, keke APE from 54.85% to 81.55%, and truck APE from 25.00% to 85.00%. Among these, 26L-Byte-ROI gives the lowest upward car APE (83.02%), 11L-Byte-ROI gives the lowest upward keke APE (54.85%), and 26L-BoT-ROI gives the lowest upward truck APE (25.00%). For the bus class in the upward flow, the manual baseline is zero, so APE is undefined for all four pipeline variants.
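One way to handle the undefined bus APE is to mark it as missing and average only the defined classes. The sketch below illustrates that convention for 26L-BoT-ROI on the Yola Road upward flow. The helper names are hypothetical, and excluding undefined classes from the mean is an assumption rather than a procedure stated in the text. The bus, car, and keke counts come from Section 4.3.1; the truck APE is taken directly from the quoted value because the raw truck counts are not restated here.

```python
def ape_percent(manual, predicted):
    """Absolute percentage error in percent, or None when the manual count is zero."""
    if manual == 0:
        return None
    return 100.0 * abs(predicted - manual) / manual


def mape_defined(apes):
    """Mean of the defined APE values; None if no class has a defined APE."""
    defined = [a for a in apes if a is not None]
    return sum(defined) / len(defined) if defined else None


# Yola Road upward flow, 26L-BoT-ROI.
apes = {
    "bus": ape_percent(0, 45),      # zero manual baseline, so APE is undefined
    "car": ape_percent(106, 209),
    "keke": ape_percent(206, 38),
    "truck": 25.00,                 # APE quoted in the text (raw counts not restated)
}
mape = mape_defined(apes.values())
print(apes, mape)
```

Under this convention the mean over the three defined classes is about 67.91%, which agrees with the upward MAPE reported for this variant in Table 5. That agreement suggests, although the text does not state it, that undefined bus terms were left out of the average.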
Table 4 shows substantially larger deviations for the Mubi Road test video. In the downward flow, keke APE ranges from 59.62% to 87.18% and car APE from 85.71% to 288.57%; the lowest values in both cases are obtained by 26L-Byte-ROI. In the upward flow, keke APE remains high (63.33–83.33%), and car APE remains very high (135.29–588.24%), again with 26L-Byte-ROI giving the lowest car and keke APEs. Truck is the only class for which some Mubi upward variants attain comparatively lower error, with APE = 20.00% for both 11L-Byte-ROI and 26L-BoT-ROI; however, downward truck errors remain extreme for all variants (1400.00–2000.00%). Also, the upward bus APE is undefined because the manual baseline is zero.
Overall, the class-wise tables show that the selected tuned ROI-based pipelines are much more reliable on the Yola Road downward stream than on the Yola Road upward stream or either Mubi Road stream. The most stable result across the four road-direction scenarios is downward keke counting on Yola Road, whereas the most severe failure modes occur in the Mubi video and in low-count categories such as bus and truck, where small absolute differences lead to very large percentage errors. These results suggest that the scene- and direction-dependent failure modes call for further keke-aware tuning, supported by additional data collection and refinement of the counting pipeline.
Table 5 shows that aggregate counting accuracy is strongest for the Yola Road downward flow, where all four selected tuned ROI-based pipeline variants remain within a narrow error band (MAE = 12.75–17.00, RMSE = 14.69–18.37, MAPE = 48.89–54.87%), with YOLO11l + ByteTrack + ROI giving the lowest overall error. For the Yola Road upward flow, errors increase substantially, although YOLO11l + ByteTrack + ROI again yields the lowest MAE and RMSE (61.75 and 80.25), while YOLO26l + BoT-SORT + ROI gives the lowest MAPE (67.91%). The Mubi Road video is markedly more challenging: downward-flow errors are high for all variants, with YOLO26l + ByteTrack + ROI achieving the lowest MAE and RMSE (46.00 and 53.44), but percentage errors remain very large because low-count classes strongly inflate MAPE. In the upward Mubi flow, the YOLO26l-based variants clearly outperform the YOLO11l-based variants, with YOLO26l + ByteTrack + ROI giving the best overall aggregate result (MAE = 28.75, RMSE = 40.33, MAPE = 79.54%). Overall, the table confirms strong scene and direction dependence, with Yola Road downward being the most reliable setting and Mubi, especially downward, remaining the most difficult.
4.3.3. Flow-Rate and ESAL Analysis
For the road usage analysis, we computed the flow rate and ESAL rate to evaluate the implications for traffic measurement and pavement loading. The flow rate is computed class-wise using Equation (8) and then aggregated using Equation (9) to obtain the total flow. The estimated flow rates for the Yola Road and Mubi Road test videos are presented in Table 6 and Table 7, respectively. The YOLO11l-based variants report total flow rates similar to the baseline, indicating that vehicles are being detected effectively and that the main problem lies in misclassification of the detected vehicles. The YOLO26l variants, however, deviate more substantially, especially in the Mubi Road upward traffic case.
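The point that total flow can match the baseline while class composition is wrong can be illustrated with the car and keke counts quoted in Section 4.3.1 for the Yola Road upward flow. The sketch below assumes that the class-wise flow rate is the count scaled to vehicles per hour and that the total is the sum over classes. The exact forms of Equations (8) and (9) are not reproduced here, and the observation length `OBSERVATION_SECONDS` is a hypothetical placeholder, not the actual video duration.

```python
OBSERVATION_SECONDS = 600.0  # ASSUMPTION: hypothetical window length, not from the paper


def flow_rate_per_hour(count, seconds=OBSERVATION_SECONDS):
    """Scale a count observed over `seconds` to vehicles per hour."""
    return 3600.0 * count / seconds


# Yola Road upward flow, two dominant classes only (bus and truck omitted).
manual = {"car": 106, "keke": 206}
p2 = {"car": 219, "keke": 93}  # YOLO11l + ByteTrack + ROI

manual_rates = {c: flow_rate_per_hour(n) for c, n in manual.items()}
p2_rates = {c: flow_rate_per_hour(n) for c, n in p2.items()}

manual_total = sum(manual_rates.values())
p2_total = sum(p2_rates.values())
print(manual_rates, manual_total)
print(p2_rates, p2_total)
```

For these two classes the predicted and manual subtotals coincide (219 + 93 = 106 + 206 = 312 vehicles), so the subtotal flow rate is identical for any window length even though the car rate is more than doubled and the keke rate is more than halved.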
The ESAL rate was computed using Equation (11), with the load equivalency factor (LEF) computed using Equation (10). We assumed the following axle loads for the vehicle classes: 0.4 tons for keke, 1.4 tons for car, 11 tons for truck, and 3.5 tons for bus (the common bus types are minibuses). The standard reference load used was 8.16 tons. For the Yola Road downward-flow traffic, for example, the baseline keke LEF was computed from Equation (10), and the baseline keke ESAL rate was then computed from Equation (11).
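The LEF and ESAL-rate calculation can be sketched as follows. Two assumptions are made that the text does not confirm: the LEF is taken to follow the common fourth-power law, (axle load / standard load)^4, which may differ from Equation (10), and the ESAL rate is taken to be the hourly flow rate multiplied by the LEF and summed over classes, as an assumed form of Equation (11). The window length is again a hypothetical placeholder. The axle loads, the reference load, and the counts are those given in the text.

```python
STANDARD_AXLE_LOAD_T = 8.16   # standard reference load from the text (tons)
AXLE_LOAD_T = {"keke": 0.4, "car": 1.4, "truck": 11.0, "bus": 3.5}  # from the text
LEF_EXPONENT = 4.0            # ASSUMPTION: fourth-power law; Equation (10) may differ
OBSERVATION_SECONDS = 600.0   # ASSUMPTION: hypothetical window length


def lef(axle_load_t):
    """Load equivalency factor under the assumed power law."""
    return (axle_load_t / STANDARD_AXLE_LOAD_T) ** LEF_EXPONENT


def esal_rate_per_hour(counts, seconds=OBSERVATION_SECONDS):
    """Per-class ESAL/h: hourly flow rate times LEF (assumed form of Equation (11))."""
    return {c: 3600.0 * n / seconds * lef(AXLE_LOAD_T[c]) for c, n in counts.items()}


# Yola Road downward flow: manual baseline and P1 (YOLO11l + BoT-SORT + ROI) counts.
baseline = {"bus": 6, "car": 108, "keke": 181, "truck": 23}
p1 = {"bus": 1, "car": 129, "keke": 186, "truck": 2}

baseline_esal = esal_rate_per_hour(baseline)
p1_esal = esal_rate_per_hour(p1)
baseline_total = sum(baseline_esal.values())
p1_total = sum(p1_esal.values())
truck_share = baseline_esal["truck"] / baseline_total

print(f"keke LEF = {lef(AXLE_LOAD_T['keke']):.2e}")
print(f"truck share of baseline ESAL rate = {truck_share:.3f}")
print(f"P1 / baseline ESAL rate = {p1_total / baseline_total:.3f}")
```

Under these assumptions the truck term dominates the baseline total, so under-counting trucks lowers the estimated ESAL rate sharply even when the total flow is nearly unchanged. The ratio of the two totals does not depend on the assumed window length, because the same scaling applies to both.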
Similarly, the individual ESAL rate was computed for all vehicle classes and aggregated for the baseline and the four selected tuned ROI-based pipeline variants, with the results for each direction reported in Table 6 and Table 7. For the Yola Road downward flow (Table 6), all selected variants produce total flow rates that remain close to the baseline, but their ESAL-rate values are much smaller than the baseline ESAL rate because trucks are strongly under-counted; this holds for 11L-BoT-ROI, 11L-Byte-ROI, and both 26L variants. For the Yola Road upward flow, total flow is again relatively close to the baseline, but class composition differs markedly across variants. The 11L variants under-count keke and over-count car, yielding lower ESAL-rate values than the baseline for both 11L-BoT-ROI and 11L-Byte-ROI, whereas the 26L variants overestimate heavier classes, especially bus and truck, producing inflated ESAL-rate values.
A similar pattern is observed in the Mubi Road video (Table 7), but with substantially larger distortions. In the downward flow, the total flow rate of all selected variants remains close to the baseline, yet all four variants (11L-BoT-ROI, 11L-Byte-ROI, 26L-BoT-ROI, and 26L-Byte-ROI) massively inflate the ESAL rate relative to the baseline because of severe over-counting of heavy vehicles, especially trucks. In the upward flow, all four variants still overestimate pavement loading relative to the baseline ESAL rate, although the inflation is smaller than in the downward case. Overall, these results highlight that total flow rate can remain numerically close to the manual baseline even when class composition is substantially wrong, while ESAL rate is highly sensitive to errors in heavy-vehicle recognition. Consequently, although keke-aware tuning improves mixed-traffic representation, reliable pavement-loading inference additionally requires robust heavy-vehicle detection and careful calibration of class-specific load equivalency factors.