1. Introduction
The increasing complexity of modern transportation networks has driven the development of Intelligent Transportation Systems (ITS), technological ecosystems that integrate sensing devices, data processing algorithms, and real-time communication to enhance the efficiency, safety, and sustainability of mobility [1,2]. ITS have proven to be key instruments for optimizing the operation of critical infrastructure such as toll stations, where continuous and accurate monitoring of vehicle flow is essential for tariff management, road planning, and strategic decision-making [3]. The ability to automatically classify and count vehicles without human intervention reduces waiting times, minimizes operational errors, and enables differentiated pricing schemes and the collection of structured data for traffic analysis [4,5,6,7]. Such digital solutions are especially relevant where mobility demands are growing faster than the capacity to expand physical infrastructure, and a progressive transition toward more innovative, interoperable, and sustainable solutions is needed.
Traditionally, toll stations have used intrusive technologies for vehicle classification, including inductive loops, piezoelectric sensors, and axle-counting systems based on physical barriers or pressure plates embedded in the road [8,9]. Intrusive sensors are technologies that must be installed directly into the road infrastructure, typically in saw-cut grooves, drilled holes, or subsurface channels in the pavement [10]. These methods offer a certain degree of accuracy but present several operational limitations. Their installation requires structural interventions on the road surface, often involving lane closures, and they are subject to physical wear over time due to repeated traffic loads and adverse weather conditions. Although generally less sensitive to environmental visibility, their performance degrades as components deteriorate, leading to reduced reliability and increased maintenance demands. In response to these constraints, non-intrusive solutions based on computer vision have been explored, employing video cameras combined with pattern recognition algorithms and computational intelligence to infer the type of vehicles in motion [11,12,13]. However, this approach is vulnerable to variations in natural lighting, partial occlusions, and weather conditions such as rain or fog [14,15]. As a result, there is growing interest in active sensors such as Light Detection and Ranging (LiDAR), which employ laser scanning to generate three-dimensional point clouds of the environment, are independent of lighting conditions, and are more robust to partial occlusions [16,17,18,19]. This technology, widely used in perception systems for autonomous vehicles and 3D mapping [20,21,22,23], offers a promising technical framework for vehicle classification in tolling environments. However, its adoption in real-world operational contexts still faces challenges in terms of cost, real-time processing, and compatibility with vehicle categorization schemes regulated by national standards.
The processing of point clouds from LiDAR sensors requires efficient computational architectures capable of extracting meaningful features from high-dimensional unstructured data. Unlike images, which have a regular matrix structure, 3D point clouds are irregular and pose unique challenges in representation, segmentation, and classification [24,25,26]. In this context, various approaches combine filtering, normalization, and ground plane removal techniques with the extraction of descriptors such as Fast Point Feature Histograms (FPFH), which capture local information about surface curvature and orientation [26,27] and can be encoded using Bag-of-Words (BoW) models to facilitate their use in statistical classifiers [28,29,30]. These feature vectors have been successfully integrated with supervised learning algorithms such as Support Vector Machines (SVM) [31,32], neural networks [25,33,34], and Bayesian networks [35], among others. The combination of these methods has proven effective for 3D object classification, including vehicle recognition, in controlled or simulated environments; however, their implementation in dynamic settings such as real-world toll plazas requires additional adaptations to account for variations in vehicle geometry, movement, multiple lanes, and real-time inference requirements. These limitations underscore the need for processing pipelines capable of operating autonomously in complex operational environments while complying with each jurisdiction’s regulatory and technical standards.
Despite the growing global interest in LiDAR technologies applied to vehicle monitoring, their adoption in developing countries remains limited due to budget constraints, deficiencies in technological infrastructure, and the absence of public policies promoting the digital modernization of road systems. In the case of Colombia, the national highway network includes more than 140 toll stations operated by public agencies and private concessions, which apply tariff schemes defined by regulations issued by the Instituto Nacional de Vías (INVIAS) and the Agencia Nacional de Infraestructura (ANI) [36,37]. These regulations classify vehicles by type, axle count, gross vehicle weight, and specific usage, for example distinguishing between buses and trucks with identical axle configurations. This multifactor classification criterion adds notable complexity to the automation process, as it requires the detection of lifted axles, the evaluation of physical dimensions, and the contextual interpretation of vehicle type.
Most Colombian tolls still rely on mechanical counting mechanisms or manual supervision, limiting scalability, traceability, and integration with modern traffic management systems. This situation highlights a persistent gap between technological advances and their implementation in real-world infrastructure, underscoring the need for systems that respond to local technical, regulatory, and operational conditions. This article contributes a replicable and field-tested architecture for automated vehicle counting and classification in toll plazas, tailored to the axle-based tolling model used in Colombia. The proposed system integrates 2D LiDAR sensors, video cameras, and Doppler radar with point cloud processing and supervised learning algorithms.
Unlike most academic proposals, the system was deployed and validated under long-term operational conditions, demonstrating its practical feasibility. It complies with current Colombian toll classification regulations and supports the development of robust, scalable, and non-intrusive ITS technologies capable of real-time operation. Moreover, the system produces structured mobility data that can inform oversight processes and serve as input to transport system models, laying the groundwork for data-driven digital transformation in toll collection and infrastructure planning.
The remainder of this paper is organized as follows: Section 2 reviews the state of the art in vehicle classification and LiDAR technologies; Section 3 describes the system architecture and the data processing methodology; Section 4 presents the experimental results obtained in real-world tests; and Section 5 outlines the conclusions and future lines of work.
2. Related Work
Automatic vehicle classification is essential in ITS, supporting toll collection, infrastructure management, and logistics planning. In recent years, LiDAR has emerged as a reliable and non-intrusive alternative to traditional sensors such as inductive loops, piezoelectric strips, and vision systems affected by lighting. Its ability to generate detailed 3D point clouds enables precise geometric profiling of moving vehicles, paving the way for advanced classification systems that combine 3D data processing with machine learning to perform robustly under real traffic conditions.
Several technologies have been developed for automated vehicle detection and classification, each with distinct strengths, limitations, and deployment requirements.
Table 1 presents a comparative analysis of the most relevant approaches currently used in toll monitoring and ITS, including inductive loops, piezoelectric sensors, video-based methods, and LiDAR-based solutions. These technologies vary in terms of intrusiveness, resilience to environmental conditions, suitability for axle-based classification, and real-time performance.
Intrusive systems such as inductive loops and piezoelectric sensors are widely used in permanent installations due to their robustness and accuracy in detecting axle counts. However, they require interventions on the road surface, which leads to higher installation and maintenance costs, as well as limited scalability [10]. In contrast, camera-based systems combined with deep learning architectures like YOLO offer non-intrusive installation and good classification by vehicle type, but they suffer from reduced performance in low-light or adverse weather conditions and are not designed for axle-based classification [42].
LiDAR-based approaches strike a balance between non-intrusiveness and structural accuracy. The geometry-based method proposed in this work enables axle-level classification using LiDAR hardware and lightweight algorithms such as FPFH feature extraction and BoW-SVM classification. While deep learning architectures like PointNet or VoxelNet may achieve higher classification power on complex 3D data, they require large training datasets and substantial computational resources [44]. Our system offers a practical, real-time solution tested in operational toll environments, with modularity and scalability as key benefits. This comparison reinforces the suitability of LiDAR-based solutions for toll supervision scenarios, particularly in countries with axle-based tolling policies, including Colombia.
Furthermore, Table 2 summarizes the main methodological and operational characteristics of several studies on vehicle classification using LiDAR sensors and complementary technologies. It reveals an evolution from geometric descriptors to machine learning, alongside increasing LiDAR adoption. However, it also highlights ongoing challenges, including limited validation under real-world conditions, low class diversity, the absence of essential functionalities such as axle counting or license plate recognition, and weak alignment with regulatory standards. The heterogeneity in sensor configurations further underscores the need for robust and modular solutions tailored to complex environments such as toll stations.
The “Sensors/Data source” and “Sensor position” columns reflect the diversity of data acquisition technologies and sensor configurations across the reviewed studies. While all approaches employ LiDAR sensors as the primary data source, their deployment varies widely, from single-channel overhead scanning setups (e.g., [16,17,32,46]) to lateral configurations (e.g., [47,48]), as well as hybrid architectures incorporating multiple viewpoints (e.g., [49,50]). Some studies, such as [47], also integrate camera-based visual data, leveraging sensor fusion techniques to enhance the feature space and compensate for modality-specific limitations. This variation in sensor configuration highlights the crucial role of geometric system design in vehicle classification. The performance of key operations, such as contour extraction, axle counting, and structural profiling, depends on factors such as sensor placement, scan angle, and field of view. Lateral setups often yield richer detail for axle detection, while overhead configurations support lane-based segmentation but may suffer from occlusions. These architectural choices directly affect data quality, algorithm accuracy, and system scalability in real-world deployments like toll stations.
Axle counting is another critical functionality, particularly in regulatory contexts where toll categories depend on the number of axles rather than just size or gross weight. Despite its importance, the “Axle counting” column shows that only a minority of studies explicitly implement this feature, notably [48,49]. These works deploy targeted geometric strategies and assignment algorithms to detect wheel positions and infer axle configurations. However, most studies focus solely on general vehicle classification, omitting axle-related attributes. This omission significantly reduces their applicability in operational tolling systems, especially in countries like Colombia, where axle count determines the tariff class. Liftable or non-standard axle arrangements, for instance, require detailed structural analysis that basic morphological classification methods cannot resolve. These limitations highlight the need for classification architectures incorporating flexible, high-precision axle-counting modules.
The “Vehicle image/Plate detection” column reveals that only three studies, including the system proposed in this paper, report using vehicle images. However, only the present work integrates license plate detection as a system feature. For example, ref. [32] captures broad road segments for point cloud generation, and ref. [47] uses video for speed estimation, but neither employs image data for vehicle identification. Including license plate detection is a significant advantage, enabling hybrid validation mechanisms that enhance system traceability and regulatory compliance. Additionally, by linking classification results to license plates, the system supports user segmentation and enables differentiated commercial campaigns, such as targeted discounts and loyalty programs.
The “Classification” and “Categories” columns reveal substantial heterogeneity in the taxonomic schemes across the reviewed studies. While some works rely on simplified class structures with just 2 to 6 categories (e.g., [16,32,46,50]), others, such as [17,47] and the present study, implement more granular taxonomies with up to eight or nine classes. However, only the proposed system adheres explicitly to national regulatory standards. This alignment enables direct integration into toll enforcement frameworks, where tariff assignment depends on detailed vehicle categorization. The lack of standardization in many prior studies is a limitation, especially for deployment in regulatory environments. Generic taxonomies such as “car” or “truck” may be adequate in experimental contexts but are insufficient where vehicle type directly influences pricing, enforcement, or compliance. The eight-class model aligned with Colombian regulations used in the current study ensures legal compatibility, supports traceability, and enables interoperability with national ITS infrastructure.
The “Classification accuracy” and “Classification algorithms” columns show that reported accuracy levels in the literature range from to . Studies using more advanced methods, such as deep neural networks or SVMs, generally achieve higher performance (e.g., [23,32,50]). The system proposed in this article reaches an overall classification accuracy of , with powerful results in the classification of heavy vehicles, achieving accuracy in categories involving vehicles with three or more axles. However, differences in classification schemes and dataset sizes across studies limit the possibility of direct comparisons.
The classification algorithms employed also reflect a methodological evolution. Early works relied on heuristic rules or basic linear classifiers, whereas recent studies incorporate statistical and machine learning techniques. This study’s use of SVM reflects this shift, effectively balancing classification accuracy, generalization capacity, and computational efficiency.
The “Feature extraction” column shows a similar transition. Early works relied on global geometric descriptors such as length, height, or width [46,50]. In contrast, recent contributions incorporate more expressive 3D descriptors like VFH (Viewpoint Feature Histogram), SHOT (Signature of Histograms of Orientations), and FPFH [16,32,47]. These local descriptors capture fine-grained surface information and are more robust to occlusions and noise. The present study distinguishes itself by combining global geometric features (e.g., vehicle length, height) with local shape descriptors, enhancing the system’s ability to differentiate between visually similar classes.
The “Samples” column reveals significant variation in dataset sizes, ranging from as few as 65 labeled samples in [49] to over 44,000 objects in the present work. This disparity has profound implications for model training, especially regarding class imbalance and generalization capacity. Smaller datasets often fail to capture the diversity needed to train robust classifiers. In contrast, the large, empirically curated dataset used in the current study supports both statistical validity and operational reliability, making the resulting model more transferable to complex deployment environments such as toll plazas.
Despite notable advances, the state of the art in LiDAR-based vehicle classification still faces critical limitations that hinder its direct application in real-world operational settings. Many systems are validated under controlled conditions, limiting their reliability in high-demand environments like toll plazas, where real-time accuracy under variable conditions is essential. Widely used datasets, such as KITTI [51], nuScenes [20], and ZPVehicles [19], are not tailored to classification schemes based on axle count, usage, or gross weight, requirements that are standard in countries like Colombia, forcing reliance on narrow proprietary datasets. Deep learning models, though accurate, demand high computational resources, posing challenges for edge deployment in regions with limited infrastructure. Additionally, most systems rely on generic labels like “light” or “heavy” vehicles, which are insufficient for automated tolling or tariff enforcement tasks that require alignment with local classification standards.
In the Latin American context, the development of the SSICAV system in Colombia, documented in the thesis underlying this article, offers a comprehensive and robust solution tailored to the operational realities of the country’s road infrastructure. The multisensor architecture integrates 2D LiDAR, Doppler radars, and video cameras in a processing pipeline that includes distance and statistical filtering, RANSAC-based segmentation, angular correction, and the extraction of geometric attributes and FPFH descriptors. The resulting feature vectors feed into an SVM trained on real-world field data. The system enables automatic classification into eight vehicle categories aligned with official standards from Colombia’s INVIAS and ANI, establishing itself as a pioneering model in the region and a potential reference for large-scale national deployment.
In summary, the academic literature shows steady progress in using LiDAR sensors for vehicle classification, with a clear shift from rule-based models and geometric descriptors toward 3D neural architectures. However, researchers have yet to bridge the gap between experimental developments and practical deployment in real-world tolling environments. This review underscores the need for systems such as the one proposed in this study: solutions that combine geometric precision, computational efficiency, compliance with national standards, and empirical validation under real operational conditions.
4. Results
This work’s main contribution is the development of the Vehicle Counting and Classification System, SSICAV. Designed as a comprehensive technological solution, SSICAV aims to optimize vehicle management at toll stations by integrating advanced sensing technologies and data processing capabilities.
4.1. Axle Detection and Counting Results
The evaluation treated the axle detection and counting module as a multiclass classification task, where each class corresponds to a vehicle with a specific number of axles, ranging from 2 to 6. Although the classifier could predict classes from 0 to 8, labels outside the valid range (i.e., 0, 1, 7, and 8) represent misclassifications. The analysis excluded vehicles with more than six axles due to their low representativeness in the dataset.
The system evaluation used a test set comprising 37,108 vehicle samples collected under real operating conditions. The overall accuracy reached , suggesting the system reliably estimates axle counts in most cases. However, performance varied across classes, partly due to the inherent class imbalance, as shown in Table 7. Two-axle vehicles accounted for more than of the dataset.
The confusion matrix in Figure 26 highlights the classifier’s ability to detect 2-axle vehicles correctly and reliably. However, the results reveal a systematic tendency to underestimate axle counts, particularly for higher axle configurations (3 to 6 axles), which the model frequently misclassified.
The metrics demonstrate that the axle classification model performs highly accurately under operational conditions. A global accuracy of and a weighted F1-score of reflect strong performance, especially for dominant classes like 2-axle vehicles. However, macro-averaged metrics (precision , recall , F1-score ) reveal significant class variability. This result is mainly due to reduced precision in underrepresented categories; for example, the 3-axle class achieved high recall () but low precision (), indicating a substantial number of false positives.
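The gap between the macro-averaged and weighted metrics discussed above follows directly from how each average treats class support. The following self-contained sketch (with a made-up toy confusion matrix, not the SSICAV results) illustrates why a rare class with high recall but low precision, like the 3-axle case, depresses the macro average while barely affecting the weighted one:

```python
# Illustrative sketch: per-class precision/recall/F1 and macro vs. weighted
# averages from a confusion matrix (rows = true class, cols = predicted).
# The matrix below is an invented toy example, NOT the paper's data.

def prf_from_confusion(cm):
    """Return (per_class, macro, weighted) metric dicts for a square matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    macro = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    weighted = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true != k
        fn = sum(cm[k]) - tp                         # true k, predicted != k
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        support = sum(cm[k])
        per_class.append({"precision": p, "recall": r, "f1": f1})
        for name, v in (("precision", p), ("recall", r), ("f1", f1)):
            macro[name] += v / n                    # every class weighs equally
            weighted[name] += v * support / total   # dominant classes dominate
    return per_class, macro, weighted

# Toy imbalance: class 0 (a dominant "2-axle"-like class) vs. a rare class 1.
cm = [[900, 50],
      [  5, 45]]
per_class, macro, weighted = prf_from_confusion(cm)
```

In this toy case the rare class shows the same signature reported above: high recall (0.9) with low precision (about 0.47), so the weighted averages stay high while the macro averages drop.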
The row-normalized confusion matrix is shown in Figure 27 to facilitate class-wise performance analysis. The model achieved the highest recall for 2-axle vehicles ( ), which dominate the dataset. Interestingly, despite being a minority class, 3-axle vehicles exhibited a recall of , indicating that the model correctly recognized most true instances of this class. However, their precision was substantially lower ( ), revealing frequent false positives, primarily due to confusion with 2-axle vehicles. This imbalance suggests that while the model is sensitive to identifying 3-axle configurations, it struggles to discriminate them clearly from more common classes.
Figure 28 presents the complete set of per-class metrics. The 2-axle category achieved an F1-score of , reflecting consistently high precision and recall. In contrast, the 3-axle class obtained an F1-score of only , reflecting its low precision. The model exhibited more balanced behavior for classes with 4 to 6 axles, achieving F1-scores between and , although recall progressively declined as axle count increased. These patterns highlight the increased classification complexity in higher axle configurations, often compounded by partial occlusions and low data representativeness.
One of the main limitations in axle counting was the sensor’s inability to fully capture the wheels of some vehicles, primarily because traffic traveled close to the platform where the laser scanner was installed. This issue was more evident for long or multi-axle vehicles, and it worsened in reversible lanes due to irregular alignment and frequent stops near the sensor. These conditions led to systematic underestimation in complex axle configurations.
In summary, despite sensor visibility and class imbalance limitations, the model demonstrated high accuracy in detecting 2-axle vehicles and acceptable performance for configurations involving 3 to 6 axles. These results confirm the system’s applicability in toll plaza environments and point to potential improvements. The proposed approach addresses current limitations by combining infrastructure adjustments and sensing enhancements. It includes modifying the toll plaza divider to enable unobstructed laser beam passage and integrating complementary sensors, such as infrared or optical devices, to improve detection under low visibility and complex vehicle configurations.
4.2. License Plate Recognition Results
The annotation process excluded 10,938 images from the evaluation because insufficient resolution prevented visual identification of license plates in those cases. These images came from a dataset of 37,108 labeled samples corresponding to Class 1 through Class 5 vehicles. In the remaining 26,170 images, where the license plate region was sufficiently clear for manual labeling, the recognition performance was assessed based on the number of correctly identified characters per plate. This metric reflects the system’s effectiveness in accurately extracting alphanumeric sequences under real-world operating conditions.
On this subset, the system achieved a character-level accuracy of , measured as the proportion of individual characters correctly recognized across all plate positions. Full-plate recognition, defined as correctly identifying all six alphanumeric characters, was achieved in of the cases. The average normalized Levenshtein distance between the predicted and ground-truth strings was , indicating that fewer than two character-level edits were required to correct a prediction.
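The three plate-level metrics described above can be made concrete with a short sketch. The plate strings below are invented examples (Colombian plates use six alphanumeric characters, as stated in the text); the Levenshtein routine is a standard dynamic-programming implementation, not the system's actual code:

```python
# Sketch of the plate-evaluation metrics: per-position character accuracy,
# full-plate match, and normalized Levenshtein distance. Example strings
# are invented, not data from the deployment.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def plate_metrics(pred: str, truth: str) -> dict:
    dist = levenshtein(pred, truth)
    return {
        "correct_chars": sum(p == t for p, t in zip(pred, truth)),
        "full_match": pred == truth,
        # Normalize by the longer string so the distance lies in [0, 1].
        "norm_levenshtein": dist / max(len(pred), len(truth), 1),
    }

m = plate_metrics("ABC123", "ABC128")   # one substituted character
```

Aggregating `correct_chars` over all plates yields the character-level accuracy, and the share of `full_match` cases yields the full-plate recognition rate reported above.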
Table 8 presents the distribution of prediction accuracy by the number of correct characters. In total, of all correctly segmented plates contained at least five valid characters. Conversely, complete failures occurred in only of the evaluated cases, suggesting a low probability of total misrecognition when the plate is successfully detected.
The analysis used the hour of image recording to evaluate the impact of operating conditions on system performance. As illustrated in Figure 29, recognition accuracy exhibits a marked dependence on the time of day. Between 07:00 and 18:00, full-plate accuracy consistently exceeds , peaking at at 14:00. Outside these hours, accuracy drops significantly, reaching values below during late-night hours.
Figure 30 presents a complementary analysis based on Levenshtein distance. This figure shows that the average distance remains below during daylight hours while increasing sharply during nighttime periods. The highest average distance occurs between 18:00 and 05:00, when ambient lighting is lowest, indicating a strong correlation between visibility and recognition fidelity.
The license plate recognition module demonstrated robust performance under favorable imaging conditions. Recognition quality was notably affected by environmental factors such as lighting and vehicle distance, particularly during low-light periods. These findings suggest that performance could be improved through sensor-level enhancements (e.g., infrared or auxiliary lighting) and adaptive preprocessing techniques to mitigate nighttime degradation.
4.3. Classifier Results
The training process used SVM with three kernel types: linear, RBF, and polynomial. These kernels were selected for their capacity to handle the high-dimensional and nonlinear data typically found in processed point clouds. In addition, the geometric features were scaled using several normalization techniques (min-max, max-abs, robust, and standard scaling) to maximize model performance. Standardizing the variable ranges reduced the model’s sensitivity to outliers and improved training stability.
The evaluation tested the model using different BoW configurations by varying the number of generated visual words (5, 25, and 125) and the centroid initialization method, which was either random selection or K-means++. The implementation relied on the Mini-Batch K-means clustering algorithm. Multiple configurations were evaluated to assess their impact on the classifier’s generalization ability.
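The BoW encoding step itself reduces to nearest-centroid assignment once a codebook of visual words has been learned. The sketch below assumes the codebook was already produced (the paper uses Mini-Batch K-means for this; here the centroids and the 2-D "descriptors" are toy values standing in for 33-dimensional FPFH vectors):

```python
import numpy as np

# Sketch of BoW encoding: each per-keypoint descriptor is assigned to its
# nearest visual word, and the vehicle is represented by the normalized
# histogram of word counts. Codebook and descriptors are toy values.

def bow_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """descriptors: (n_keypoints, d); codebook: (n_words, d) -> (n_words,)."""
    # Squared Euclidean distance from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                  # hard assignment per keypoint
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                   # normalize to unit sum

# Toy 2-D "descriptors" clustered around two of three visual words.
rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
desc = np.vstack([rng.normal([0.0, 0.0], 0.3, (30, 2)),
                  rng.normal([5.0, 5.0], 0.3, (10, 2))])
h = bow_histogram(desc, codebook)
```

In the full pipeline this histogram would be concatenated with the scaled geometric features (length, height, axle count) before being passed to the SVM.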
Standard metrics, including the macro F1-score and confusion matrices, were used to evaluate the classifier’s performance. Table 9 shows that the configuration with the best results uses the RBF kernel in combination with the standard scaler. This configuration yielded overall macro F1-scores exceeding .
Despite variations in the number of visual words and the centroid initialization methods, these parameters did not significantly impact the classifier’s overall performance. Moreover, comparative analysis revealed a substantial improvement in accuracy when geometric features and BoW representations were combined. As shown in the GFO (Geometric Features Only) and BWO (BoW Only) columns, the integration of both feature types enhanced the classifier’s precision and robustness, reinforcing its effectiveness for real-world deployment.
Figure 31 displays the confusion matrices generated during the validation phase of the highest-performing trained classifiers. These matrices provide a detailed overview of the system’s performance, highlighting the configurations that achieved the best results. Classifiers using the RBF kernel in combination with standard scaler demonstrated outstanding performance, with consistently high performance across all major vehicle classes. Among them, the classifier configured with an RBF kernel, standard scaler, random centroid initialization, and 25 visual words achieved the highest overall accuracy.
The classifier configuration with the best overall and per-class performance reached an overall accuracy of , a macro F1-score of , and a Matthews Correlation Coefficient (MCC) of , indicating strong class discrimination capability.
Figure 32 shows the breakdown of precision, recall, and F1-score by class. Class 5 achieved the highest performance with an F1-score of , followed by classes 2 and 3, with F1-scores of and , respectively. Classes 0 and 1 exhibited greater variability between precision and recall: class 0 reached high precision ( ) but lower recall ( ), while class 1 showed the opposite pattern (precision of and recall of ), suggesting a tendency of the model to confuse these categories with adjacent ones. Overall, the graph highlights the classifier’s balanced behavior, validating its robustness and applicability in real-world vehicle classification scenarios.
Class-wise accuracy results were as follows: class 0—, class 1—, class 2—, class 3—, class 4—, and class 5—. The high performance of class 5, which includes vehicles with three or more axles, stands out, demonstrating the model’s ability to correctly identify complex vehicle configurations.
However, the confusion matrices revealed misclassification patterns, particularly between classes 3 and 4, and among categories 0, 1, and 2. These errors mainly stem from two factors. First, the visual similarity between certain vehicle types poses a considerable challenge. For example, minibuses with single rear wheels share geometric features with those with dual rear wheels, making them difficult to distinguish based solely on data captured by the laser sensor. Second, limitations in the laser sensor’s coverage also contributed to reduced accuracy in some categories. When vehicles travel too close to the lane divider, laser beams fail to fully capture the structure of the inner wheels, resulting in incomplete or distorted point clouds. This issue particularly affected classes where correct classification relies on the accurate detection of all vehicle axles.
The results confirm that using the RBF kernel in combination with the standard scaler optimizes classifier performance, even under challenging operational conditions. However, the confusion matrices reveal persistent misclassifications that highlight the need for further enhancements. These include integrating additional sensors to improve data capture in critical areas, such as rear dual-wheel detection, and refining the algorithms to better handle geometric similarities across classes. Implementing these improvements can mitigate the identified limitations and enhance the classifier’s ability to maintain high accuracy across a broader range of real-world operating scenarios.
4.4. Pilot Installation at a Toll Station
The SSICAV pilot installation was deployed at the Circasia toll station, operated by the Autopistas del Café concession, located in Filandia, Quindío. This toll plaza comprises five lanes with a standard width of meters, separated by 2-meter-wide barriers. The site permitted the evaluation of the system’s performance under real operational conditions, with a high volume of vehicular flow and infrastructure representative of typical toll stations. The installation placed the system components on the central divider between lanes 2 and 3, counting from the left in the direction toward Pereira.
The system configuration included calibrating the ROI and the filtering distance. Lane 2 was designated as the right lane, and Lane 3 as the left lane. For the right lane, the angular range was set between
and
, and for the left lane, between
and
, both with distance ranges of 1 to 3 meters from the LiDAR sensor. The valid range for lateral filtering (Y-axis) was set from 1 to
meters, while the height range (Z-axis) spans from
to 3 meters.
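The ROI calibration described above (an azimuth range, a radial distance range, and lateral/height limits) amounts to simple boolean masking over the point cloud. A minimal NumPy sketch; the threshold values below are illustrative placeholders, since the actual calibrated angles are site-specific:

```python
import numpy as np

def roi_filter(points, ang_range_deg, r_range=(1.0, 3.0),
               y_range=(1.0, 2.0), z_range=(0.0, 3.0)):
    """Keep points inside the calibrated region of interest.

    points: (N, 3) array of x, y, z coordinates in the LiDAR frame.
    ang_range_deg: (min, max) azimuth in degrees, measured in the XY plane.
    All threshold values here are illustrative, not the deployed calibration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.hypot(x, y)                      # radial distance from the sensor
    ang = np.degrees(np.arctan2(y, x))      # azimuth angle
    mask = ((ang >= ang_range_deg[0]) & (ang <= ang_range_deg[1]) &
            (r >= r_range[0]) & (r <= r_range[1]) &
            (y >= y_range[0]) & (y <= y_range[1]) &
            (z >= z_range[0]) & (z <= z_range[1]))
    return points[mask]

pts = np.array([[1.0, 1.5, 1.0],    # inside the example ROI
                [2.5, 1.5, 1.0],    # radial distance ~2.9 m, still inside
                [0.5, 3.0, 1.0],    # lateral (Y) coordinate out of range
                [1.0, 1.5, 4.0]])   # above the height limit
kept = roi_filter(pts, ang_range_deg=(20.0, 70.0))
print(kept)
```

Applying one filter per lane, each with its own angular range, yields the per-lane clouds used by the downstream stages.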
Figure 33 shows the final positioning of SSICAV’s sensors and devices.
5. Conclusions and Future Work
This work presented the design and implementation of an automatic vehicle counting and classification system (SSICAV) built on a multisensor architecture that integrates LiDAR technology, visible-spectrum cameras, and Doppler speed radars. The system was conceived with a modular and parameterizable design, enabling its adaptation to diverse operational contexts and facilitating scalability to more complex traffic scenarios.
Point cloud preprocessing represented a critical stage in the system pipeline. It involved distance-based filtering, RANSAC-based ground plane segmentation, angular correction, and statistical noise reduction. These procedures ensured a clean and accurate representation of the vehicle structure, enhancing the performance of downstream processing stages.
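Of the preprocessing steps listed above, the RANSAC-based ground segmentation is the pivotal one, since all vehicle descriptors are computed on the non-ground points. A simplified NumPy-only sketch of that stage (the iteration count and distance threshold are illustrative defaults, not the system's tuned values):

```python
import numpy as np

rng = np.random.default_rng(0)

def ransac_ground(points, n_iter=200, dist_thresh=0.05):
    """Fit the dominant plane via RANSAC; return an inlier (ground) mask.

    A simplified stand-in for the system's ground-segmentation stage.
    """
    best_mask, best_count = None, -1
    for _ in range(n_iter):
        # Sample 3 distinct points and derive the candidate plane's normal.
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:          # degenerate (collinear) sample, skip it
            continue
        n /= norm
        d = -n @ sample[0]
        dist = np.abs(points @ n + d)   # point-to-plane distances
        mask = dist < dist_thresh
        if mask.sum() > best_count:
            best_mask, best_count = mask, mask.sum()
    return best_mask

# Synthetic scene: flat ground (z ~ 0) plus a box-like vehicle above it.
ground = np.c_[rng.uniform(-5, 5, (400, 2)), rng.normal(0, 0.01, 400)]
vehicle = np.c_[rng.uniform(-1, 1, (100, 2)), rng.uniform(0.5, 2.0, 100)]
cloud = np.vstack([ground, vehicle])

mask = ransac_ground(cloud)
non_ground = cloud[~mask]       # points kept for vehicle characterization
print(f"ground points found: {mask.sum()} / {len(cloud)}")
```

In practice a point-cloud library's plane-fitting routine would replace this hand-rolled loop, but the inlier-mask logic is the same.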
Vehicle characterization extracted geometric and structural descriptors, including surface normals and FPFH descriptors encoded with a BoW model. A uniform sampling algorithm was applied for keypoint selection to reduce computational overhead while maintaining descriptive power. Additional geometric features, such as vehicle length, height, and axle count, were extracted and combined with the visual words to train SVM classifiers. The resulting model achieved an overall accuracy of , with up to precision in complex vehicle classes (e.g., those with three or more axles), demonstrating its robustness under real-world conditions.
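The BoW encoding over FPFH descriptors follows the usual two-step pattern: cluster training descriptors into a visual vocabulary, then represent each vehicle as a normalized histogram of its descriptors' nearest words. A minimal sketch with random stand-in descriptors; the 33-dimensional descriptor size reflects common FPFH implementations, and the vocabulary size of 16 is an illustrative assumption, not the paper's setting:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in FPFH descriptors (33-D) pooled from many training vehicles.
train_desc = rng.random((500, 33))

# Build a visual vocabulary of K "words" by clustering the descriptors.
K = 16
vocab = KMeans(n_clusters=K, n_init=10, random_state=0).fit(train_desc)

def bow_histogram(descriptors, vocab, K):
    """Encode one vehicle's descriptor set as a normalized word histogram."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=K).astype(float)
    return hist / hist.sum()

vehicle_desc = rng.random((60, 33))   # descriptors from one vehicle's cloud
feature = bow_histogram(vehicle_desc, vocab, K)
print(feature.shape, round(feature.sum(), 6))
```

The fixed-length histogram is what makes variable-sized point clouds comparable: it can be concatenated with scalar features such as length, height, and axle count before being fed to the SVM.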
The axle-counting module achieved a accuracy for two-axle vehicles. However, performance declined in configurations involving more axles, due to occlusions caused by vehicles passing close to the median barrier. Structural modifications, such as creating recessed channels, or the incorporation of complementary sensors are recommended to expand the laser's effective field of view and address this limitation.
The license plate recognition system achieved accuracy under optimal conditions but exhibited performance degradation during nighttime due to insufficient lighting and limited camera resolution. Enhancements, including infrared illumination, high-sensitivity cameras, and advanced character recognition algorithms, are proposed to ensure consistent performance across varying lighting conditions.
The proposed system is conceived as a modular architecture, designed to operate continuously in real traffic environments while meeting the functional requirements established by national road authorities. Its validation was carried out through field testing in an operational toll plaza, evaluating its classification accuracy, robustness under changing environmental conditions, and ability to integrate with existing traffic management systems. These results demonstrate the technical feasibility of implementing LiDAR and machine learning-based solutions for critical road infrastructure tasks in resource-constrained countries. They also lay the groundwork for future applications such as demand forecasting, maintenance planning, and the interoperability of ITS systems at the national level.
In future developments, we plan to explore the integration of our LiDAR-based system with heterogeneous data sources to enhance the analysis and understanding of traffic and mobility phenomena. The structured data produced by our system, including timestamps, vehicle categories, axle counts, and license plate information, could be combined with complementary data such as Floating Car Data (FCD), ALPR-based tracking, GPS probe data, or even Bluetooth/Wi-Fi sensor data. Such integration would support advanced applications such as traffic modeling, demand estimation, and mobility monitoring in real time.
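The timestamp-keyed fusion sketched above reduces, in its simplest form, to a nearest-in-time join between the system's vehicle records and an external data stream. A minimal sketch with pandas; all field names and values are hypothetical illustrations, not the system's actual schema:

```python
import pandas as pd

# Hypothetical structured records from the LiDAR system (schema illustrative).
lidar = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-05-01 08:00:02",
                                 "2024-05-01 08:00:31",
                                 "2024-05-01 08:01:10"]),
    "category": [1, 5, 2],
    "axles": [2, 4, 2],
})

# Hypothetical FCD speed samples aggregated near the toll plaza.
fcd = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-05-01 08:00:00",
                                 "2024-05-01 08:00:30",
                                 "2024-05-01 08:01:00"]),
    "mean_speed_kmh": [42.0, 38.5, 40.2],
})

# Associate each classified vehicle with the nearest FCD sample in time,
# rejecting matches farther apart than 30 seconds.
merged = pd.merge_asof(lidar.sort_values("timestamp"),
                       fcd.sort_values("timestamp"),
                       on="timestamp", direction="nearest",
                       tolerance=pd.Timedelta("30s"))
print(merged[["timestamp", "category", "mean_speed_kmh"]])
```

Real deployments would add clock synchronization and spatial matching, but the principle of enriching per-vehicle records with co-temporal external measurements is the same.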
This direction aligns with recent trends in transportation research that emphasize the fusion of heterogeneous data sources and the application of emerging ICTs to support intelligent transport system models. In particular, the integration of infrastructure-based observations with vehicle-generated data such as FCD has proven effective in improving traffic modeling and demand estimation, especially in situations with limited data availability. For instance, Ref. [109] explores the use of FCD and fixed sensors to estimate fundamental diagrams in the city of Santander, Spain, while Ref. [110] demonstrates how FCD alone can be used to estimate travel demand models despite the absence of traditional data sources. In this context, our LiDAR-based system could provide a reliable stream of structured vehicle-level data that complements these approaches, contributing to the calibration, validation, and enrichment of data-driven models in real-world applications.
In conclusion, the SSICAV has proven to be a robust, reliable, and high-performance system designed to transform vehicle management at toll plazas through advanced technologies. The identified challenges represent opportunities to optimize the system further and expand its applicability to more diverse operating scenarios. This work lays a solid foundation for the evolution of ITS, promoting their large-scale adoption in the future.