Connected Perception Between Lightweight Robot and External Camera for Blind-Spot Awareness

Tantrairatn, Suradet; Phinphimai, Poommin; Phuangmalee, Nattapong; Karaked, Pawarut; Petcharat, Nutchanan; Pichitkul, Auraluck; Ariyarit, Atthaphon

doi:10.3390/technologies14060338

by Suradet Tantrairatn ¹, Poommin Phinphimai ², Nattapong Phuangmalee ², Pawarut Karaked ³, Nutchanan Petcharat ³, Auraluck Pichitkul ¹ and Atthaphon Ariyarit ¹,*

¹ School of Mechanical Engineering, Institute of Engineering, Suranaree University of Technology, 111 University Avenue, Nakhon Ratchasima 30000, Thailand

² School of Mechatronics Engineering, Institute of Engineering, Suranaree University of Technology, 111 University Avenue, Nakhon Ratchasima 30000, Thailand

³ Institute of Research and Development, Suranaree University of Technology, 111 University Avenue, Nakhon Ratchasima 30000, Thailand

* Author to whom correspondence should be addressed.

Technologies 2026, 14(6), 338; https://doi.org/10.3390/technologies14060338

Submission received: 22 April 2026 / Revised: 28 May 2026 / Accepted: 28 May 2026 / Published: 3 June 2026

Abstract

This paper presents a connected perception framework for blind-spot awareness by connecting an external camera system with a lightweight autonomous robot. The proposed system combines real-time object detection, localization, position prediction, and collision avoidance to enhance environmental perception beyond the limitations of onboard sensing. A YOLOv11-based detection model is employed for obstacle detection, achieving high accuracy with a mean average precision (mAP@0.5) of 0.991. For obstacle localization, the external camera system achieves centimeter-level accuracy, which is further improved using Multiple Linear Regression (MLR)-based correction, reducing the localization error by approximately 75.77%. In addition, position prediction models for both camera-based and autonomous vehicle systems demonstrate strong performance, with coefficients of determination ($R^{2}$) exceeding 0.98. The system also achieves effective collision avoidance, successfully stopping in all tested scenarios with response times ranging from 0.2 to 0.45 s. The integration of external and onboard perception enables effective blind-spot mitigation and improves situational awareness within simulated blind-spot corner scenarios representing real-world occlusion challenges. The results validate the system-level integration of these modules as a practical framework for addressing sensing limitations in autonomous robotic applications.

Keywords:

object detection; connected perception; infrastructure-assisted robotics; blind-spot awareness; vehicle-to-infrastructure (V2I); low-compute autonomous system; situational awareness sharing

1. Introduction

Autonomous vehicle technology, which enables vehicles to perceive their environment and navigate without human intervention [1,2], is a pivotal innovation poised to transform the transportation and service industries, with significant potential to enhance efficiency, safety, and comfort [1,3]. However, achieving full autonomy remains a formidable challenge, particularly in the domain of complex perception and environmental understanding, which lies at the core of all autonomous vehicle and robotic systems [2].

A fundamental obstacle for autonomous vehicles is the inherent limitation of their onboard sensors. Although equipped with high-performance sensors like LiDAR and cameras, their perception range is confined to the ego-vehicle’s perspective. This limitation leads to critical issues such as blind spots and occlusion, preventing the vehicle from detecting objects which are obscured by other elements in the environment [2,3,4].

To overcome these challenges and expand situational awareness, this research leverages sensor fusion technology and a vehicle-to-infrastructure (V2I) communication architecture [3,4,5,6]. In this framework, an autonomous vehicle, equipped with 3D LiDAR [4,5,6], communicates and exchanges data with smart infrastructure, where external cameras are deployed as fixed sensor nodes [3,4]. Fusing the elevated perspective from the infrastructure with the vehicle’s local sensor data effectively extends the perception range and mitigates the problem of occlusion [3,4,5,6]. While sophisticated frameworks have been developed for full-scale automated vehicles utilizing high-density sensor arrays and industrial-grade hardware [5], the challenge of integrating these connected perception strategies into lightweight, versatile autonomous vehicles remains an essential area of study.

In the proposed system, the autonomous vehicle utilizes a high-performance, compact NVIDIA Jetson AGX Xavier as its main processing unit. It employs the PointPillars model [7] for efficient, real-time 3D object detection from LiDAR data, thereby reducing computational load. Concurrently, the infrastructure component performs 2D object detection from video streams using an Artificial Neural Network (ANN) model. This setup is integrated into a lightweight autonomous vehicle featuring a comprehensive navigation suite, including road-following, obstacle avoidance, and emergency braking capabilities. However, despite these capabilities, the vehicle faces physical limitations at blind-spot corners where local sensors cannot detect cross-traffic or hidden obstacles.

To address this critical safety gap, this research proposes a system-level solution that models real-world autonomous cornering scenarios. Our contributions are summarized as follows:

Designing a targeted perception framework specifically for scenarios where onboard LiDAR is physically obstructed, demonstrating how infrastructure-based vision acts as a crucial auxiliary system to resolve blind-spot challenges in autonomous navigation.
Proposing an integrated multi-module system that orchestrates YOLOv11 for high-accuracy object detection, Multiple Linear Regression (MLR) for coordinate alignment, and ANN-based position prediction to provide the vehicle with reliable environmental feedback.
Implementing a direct communication pipeline that transmits external perception coordinates straight to the vehicle’s local control architecture. This includes establishing an onboard-only baseline experiment to empirically demonstrate the absolute failure of localized sensors in occluded environments, thereby proving that the integrated infrastructure system successfully enables safety-critical maneuvers, such as pre-emptive halting, to prevent otherwise unavoidable collisions.

2. Methodology

2.1. Preliminary Testing and Baseline Validation

Before evaluating the integrated vehicle-to-infrastructure (V2I) architecture, a series of preliminary experiments were conducted to establish the physical parameters of the target obstacle and empirically validate the sensing limitations of the onboard LiDAR system within the designated experimental grid.

2.1.1. Obstacle Velocity Characterization

To ensure precise trajectory tracking and time-of-flight evaluation, the operational velocity of the remote-controlled (RC) target vehicle was calibrated. Due to inherent motor limitations, movement of the RC vehicle was initiated prior to entering the active grid area to allow it to reach a steady-state velocity. As the vehicle crossed the experimental grid, its instantaneous spatial coordinates and corresponding system timestamps were extracted directly from the perception pipeline, utilizing the YOLOv11 framework for real-time object detection and the DeepSORT algorithm for continuous object tracking, as described in Section 2.7 and Section 2.8, respectively. Based on these tracked state trajectories across the fixed coordinate grid markers, the vehicle’s velocity was calculated according to the fundamental kinematic relationship

v = \frac{s}{t}

(1)

where s represents the coordinate displacement determined via the integrated YOLOv11-DeepSORT localization pipeline across the designated grid intervals, and t is the precise elapsed time recorded between those localized tracking frames. This calibration process was repeated across 10 independent trials, yielding a consistent average operational velocity that serves as the deterministic input for subsequent tracking evaluation.
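The calculation in Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, assuming each tracked run is available as a time-ordered list of `(t, x, y)` samples in seconds and meters; the function names and the sample values are hypothetical and are not taken from the study.

```python
from math import hypot

def average_speed(track):
    """Return v = s / t for one tracked run.

    `track` is a list of (t_seconds, x_m, y_m) samples ordered by time.
    """
    if len(track) < 2:
        raise ValueError("need at least two samples")
    # s: summed displacement between consecutive tracked positions.
    s = sum(hypot(x1 - x0, y1 - y0)
            for (_, x0, y0), (_, x1, y1) in zip(track, track[1:]))
    # t: elapsed time between the first and last sample.
    t = track[-1][0] - track[0][0]
    if t <= 0:
        raise ValueError("timestamps must be increasing")
    return s / t

def mean_over_trials(trials):
    """Average the per-trial speeds, as in a repeated calibration."""
    speeds = [average_speed(trial) for trial in trials]
    return sum(speeds) / len(speeds)

# Hypothetical sample values for illustration only.
demo_trials = [
    [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 1.5), (2.0, 0.0, 3.0)],
]
demo_speed = mean_over_trials(demo_trials)
```

Summing the segment lengths rather than taking only the end-to-end distance keeps the estimate meaningful if the path is not perfectly straight; for a straight run the two coincide.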

2.1.2. Onboard Sensing Evaluation in Line-of-Sight Scenarios

An initial baseline test was executed using only the onboard 3D LiDAR system without infrastructure camera support in a clear line-of-sight (unblinded) layout. The objective was to verify the local detection threshold of the autonomous rover. Initial trials indicated that while the LiDAR could capture spatial reflections from the RC vehicle at a distance, the obstacle merely registered as a faint, curved line segment of point clouds. Such minimal line fragments do not provide sufficient geometric features to be considered a well-detected or classifiable object by the onboard processing pipeline. Furthermore, as the target vehicle moved into closer proximity to the rover, the relative height difference between the low-profile chassis of the RC car and the elevated mounting position of the 3D LiDAR caused the point cloud reflections to disappear entirely, as the obstacle fell beneath the sensor’s lower vertical beam divergence boundary as shown in Figure 1.

To rectify this hardware geometry constraint and ensure that subsequent localized detection failures were strictly caused by environmental occlusions rather than sensor height mismatches, a structural modification was implemented. A lightweight foam block was affixed to the upper chassis of the RC vehicle, effectively increasing its vertical profile to intersect the LiDAR’s active scanning channels. Following this physical adjustment, the onboard sensor successfully extracted sufficient point cloud features, demonstrating consistent object classification and tracking under unblinded line-of-sight conditions as shown in Figure 2.
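The height mismatch described above follows from simple beam geometry, which the sketch below illustrates. It assumes a flat floor and a lowest LiDAR beam that leaves the sensor at a fixed angle below horizontal; the sensor height, target heights, and beam angle used here are hypothetical placeholders, not the specifications of the rover or RC car.

```python
from math import radians, tan

def blind_range(sensor_height_m, target_height_m, lowest_beam_deg):
    """Horizontal distance inside which the target top is below the lowest beam.

    At horizontal distance d the lowest beam is at height
    sensor_height_m - d * tan(lowest_beam_deg), so the target is missed
    whenever its top is lower than that value.
    `lowest_beam_deg` is measured below horizontal, in (0, 90).
    """
    if target_height_m >= sensor_height_m:
        return 0.0  # target reaches the sensor plane and is never under the fan
    return (sensor_height_m - target_height_m) / tan(radians(lowest_beam_deg))

# Hypothetical values: raising the target profile shrinks the blind range.
without_block = blind_range(0.60, 0.15, 15.0)
with_block = blind_range(0.60, 0.35, 15.0)
```

Under these assumptions, increasing the target height (for example with a foam block) shortens the near-field range in which the obstacle drops out of the point cloud.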

2.1.3. Onboard Baseline Testing in the Blind-Spot Geometry

With the obstacle’s height profile calibrated, a final baseline experiment was conducted to evaluate the onboard-only sensing performance directly within the target blind-spot corner. This validation phase was explicitly configured to replicate the exact environmental scenario and spatial layout defined in Section 2.2. Under this setup, the autonomous rover approached the intersection under an absolute physical blind spot, operating exclusively via its onboard LiDAR suite without any external assistance or feedback from the infrastructure camera system.

To systematically evaluate the collision risks, multiple experimental trials were executed by varying the arrival synchronization of the two vehicles at the intersection. Two primary operational hypotheses were tested, based on which vehicle reached the blind corner first.

Target Vehicle (RC Car) Priority: Under the hypothesis that the RC obstacle reaches the corner first, the target vehicle passes through the intersection before a collision occurs due to its higher operational velocity. However, in a real-world deployment context, this scenario still introduces an unacceptable safety risk due to the critically low spatial separation margins.
Autonomous Rover Priority: Under the hypothesis that the autonomous rover arrives at the corner first, a catastrophic side-impact collision occurs. Because the physical architecture completely occludes the upcoming obstacle from the onboard LiDAR’s line of sight, the local perception stack fails to register the threat entirely.

Crucially, across all baseline trials under exclusive onboard control, the autonomous rover consistently failed to detect the cross-traffic obstacle. Instead, each experiment terminated with the rover executing an emergency brake only upon detecting the static wall on the opposite side of the intersection, which was the only obstacle within its unoccluded field of view. These systematic failures empirically demonstrate that local onboard perception is insufficient for blind-corner safety, confirming the operational need for the proposed connected V2I architecture.

2.2. Blind-Spot Scenario and Experimental Setup

The experimental test area was designed to emulate a blind-spot scenario representative of an occluded urban intersection, which is a well-known challenge in connected and autonomous driving. Intersections are regarded as safety-critical environments because of occlusion, limited sensor field of view, and complex interactions among road users [8]. Recent studies have shown that cooperative perception and vehicle-to-infrastructure (V2I) collaboration can enhance situational awareness in such scenarios by extending perception beyond the sensing range of onboard sensors [5,9].

As shown in Figure 3, a small-scale vehicle (blue car) is positioned as a physical obstacle to partially occlude the field of view of the autonomous rover, creating blind-spot regions. The rover follows a predefined trajectory toward the occluded area, while an approaching object moves from outside the rover’s direct sensing range, with motion directions for both indicated by the blue and red arrows, respectively, reproducing a realistic blind-spot interaction.

To provide a clearer conceptual representation, Figure 4 illustrates the scenario using a bird’s-eye view. In this configuration, the approaching object is initially hidden by surrounding structures, which may delay detection when relying solely on onboard perception. The blue and red arrows indicate the motion directions of the rover and the approaching object, respectively. The yellow line represents the communication link between the external camera and the rover, a connected perception mechanism that enables perception sharing in a V2I-style framework, similar to infrastructure-assisted perception approaches proposed for occluded intersection scenarios [5].

The experiments were conducted in a controlled indoor environment to reduce the influence of external disturbances such as weather and lighting variation. A predefined grid-based test area was constructed to support localization and position evaluation in real-world coordinates. Such structured layouts are commonly used in camera calibration and robotic localization studies to obtain metric spatial information and assess sensor accuracy [10,11,12]. The camera-based measurements were referenced to a planar ground pattern, following standard calibration principles for extracting metric information from 2D imagery [11].

This experimental setup enables systematic evaluation of perception limitations under occlusion conditions and provides a simplified but realistic testbed for investigating how external sensing and connected perception can improve safety and situational awareness in autonomous rover operation [8,9].

2.3. Lightweight Robot

The autonomous platform used in this study is a lightweight delivery robot designed for real-time operation in indoor environments. As shown in Figure 5, the robot is equipped with a three-dimensional Light Detection and Ranging (LiDAR) sensor mounted on top for distance measurement and obstacle perception, and an Inertial Measurement Unit (IMU) installed inside the chassis for estimating motion dynamics and orientation.

The overall system architecture is illustrated in Figure 6. The platform integrates sensing, perception, planning, and control modules within an onboard computing unit. The sensing subsystem provides raw data from the LiDAR and IMU, which are processed for localization and environment perception.

For robot localization, a LiDAR-based approach is adopted by combining a high-definition (HD) map representation with Normal Distributions Transform (NDT)-based pose estimation. The NDT method is widely used for scan matching and pose correction because it enables robust registration of LiDAR point clouds and supports real-time localization in mobile robotic systems [13,14]. In addition, HD maps provide detailed geometric information that improves localization accuracy by serving as a high-precision spatial reference for autonomous systems [15,16].

For environment perception, the robot employs a lightweight 3D object detection module derived from the PointPillars framework. To improve detection capability across different object distances, the range characteristics of LiDAR measurements were considered following the observations of Peri et al. [17], who showed that near-range and far-range objects exhibit different spatial densities and benefit from range-aware feature encoding. Based on this insight, the range learning strategy of the detection model was adjusted to better capture object features under varying sensing distances.

To further reduce computational complexity while preserving essential multi-scale features, the default feature pyramid network (FPN) of PointPillars was replaced with an Attention Pyramid Network (APN). The APN design emphasizes informative feature channels and improves feature fusion with a lightweight structure and low additional runtime [18].

The planning and control modules generate motion trajectories and execute velocity and steering commands based on perception outputs and the robot’s current state. This integrated system enables reliable autonomous navigation, obstacle avoidance, and real-time operation of the lightweight robot platform.

2.4. External Camera-Based Perception System

To address the perception limitations caused by blind-spot regions that cannot be reliably observed by the autonomous rover’s onboard sensors, an external camera-based perception system is integrated into the experimental framework. The external camera is strategically installed at an elevated position to provide a wide-area view of the test environment, enabling observation of regions that fall outside the rover’s direct sensing range.

As illustrated in Figure 7, the external camera monitors a predefined region of interest indicated by the red dashed boundary, which represents the effective field of view used for object detection and tracking. This region is specifically selected to cover blind-spot areas where approaching vehicles or obstacles may not be detected by the rover’s onboard LiDAR because of occlusion or sensor field of view limitations. By continuously observing this area, the external camera serves as an infrastructure-based perception source that complements the rover’s local sensing capabilities.

2.5. Camera Calibration Procedure

To establish a precise geometric mapping between the image plane and the physical testing environment, the camera calibration process is executed through the following sequential procedures.

Physical Grid Setup: A camera configured at a resolution of $640 \times 480$ pixels is positioned to monitor a predefined $3 \times 3$ m experimental area. This physical space is marked with a localized $7 \times 7$ physical coordinate grid, as illustrated in Figure 8.
Virtual Grid Mapping: Four discrete reference points defining a bounding rectangle are selected within the camera’s initial pixel coordinate space. This designated region of interest is subsequently segmented into a corresponding $7 \times 7$ virtual grid overlay, as depicted in Figure 9.
Perspective Transformation and Coordinate Alignment: The bounded pixel region undergoes a perspective transformation to generate a bird’s-eye-view projection (the specific mathematical formulations for this transformation are detailed in Section 2.6). Through this transformation, image pixel coordinates are mapped directly to physical grid coordinates; for example, the image pixel coordinate $[230, 181]$ is mapped onto the localized grid origin $[0, 0]$ at the bottom-left corner, as shown in Figure 10.
Target Tracking Integration: Following the spatial transformation layer, the target tracking framework utilizes a YOLOv11 pipeline for continuous object detection (detailed in Section 2.7) paired with the DeepSORT algorithm for multi-object tracking and ID retention (detailed in Section 2.8) to output real-time state vectors relative to the calibrated grid.

The visual information captured by the external camera is processed to detect and track vehicles or objects within the designated field of view. The resulting perception data are subsequently transmitted to the autonomous rover through a communication link, enabling the rover to gain awareness of objects located in blind-spot regions. This external camera-based perception forms a key component of the connected perception framework and provides the foundation for subsequent processing steps, including coordinate transformation, object localization, and decision-making, which are described in the following subsections.

2.6. Perspective Transformation

To enable spatial reasoning from external camera observations, a perspective transformation is applied to convert the captured images into a bird’s-eye view (BEV) representation. This transformation aligns the camera view with the ground plane, providing a top-down representation of the environment for consistent spatial analysis [19,20].

In this work, the transformation is implemented using a homography-based mapping between the image plane and the ground plane. The resulting BEV representation projects image pixels onto a planar surface, forming a grid-based spatial layout that facilitates object localization and motion analysis relative to the autonomous rover. Homography-based projection and geometric inverse perspective mapping are widely used for constructing BEV representations from camera images in the fields of autonomous driving and robotic perception [21,22].

To further ensure spatial consistency, inverse perspective mapping (IPM) is applied to convert detected object positions from the BEV domain back to real-world coordinates. This enables accurate estimation of object locations and trajectories in the physical environment, even when objects are initially observed from occluded viewpoints [19].

By integrating BEV transformation with coordinate mapping, the system provides a unified spatial representation that supports downstream tasks such as object tracking, trajectory prediction, and connected perception decision-making.

Homography Transformation

To establish a geometric relationship between the image plane and the bird’s-eye view (BEV) representation, a homography transformation is employed. Homography establishes a projective mapping between two planar surfaces, enabling consistent spatial alignment between the camera image and the ground plane [19,23].

The transformation is defined in Equation (2)

\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},

(2)

where $(x, y)$ and $(x', y')$ denote the coordinates in the image plane and the transformed BEV plane, respectively, written in homogeneous form, and $H \in \mathbb{R}^{3 \times 3}$ is the homography matrix.

In this work, the homography matrix is computed through camera calibration by establishing correspondences between known ground reference points and their image projections. This allows image pixels to be projected onto a planar ground surface, forming a spatially consistent BEV representation.
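As a minimal sketch of how such a matrix can be estimated from four point correspondences, the following standard-library Python code fixes the last entry of $H$ to 1 and solves the resulting eight linear equations. It is an illustration under stated assumptions rather than the calibration code used in this work: the function names are hypothetical, and of the four pixel corners only $[230, 181]$ (the grid origin mentioned in Section 2.5) comes from the text, while the other three corners and the metric corner layout are invented for the example.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination."""
    n = len(b)
    # Augmented matrix [a | b] as floats.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def find_homography(src, dst):
    """Estimate H (3x3, last entry fixed to 1) from four (x, y) -> (u, v) pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, from u = num_u / w, v = num_v / w.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, x, y):
    """Map an image point to the ground plane, dividing by the scale term."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v

# Pixel corners of the region of interest (only the first is from the text).
pixel_corners = [(230, 181), (410, 181), (470, 400), (170, 400)]
# Assumed metric corners of the 3 x 3 m area, in the same cyclic order.
ground_corners = [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)]
H_demo = find_homography(pixel_corners, ground_corners)
```

The division by the third homogeneous component in `apply_homography` is what makes the mapping projective rather than affine. With more than four correspondences, a least-squares fit would normally replace the exact solve.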

The homography-based transformation plays a critical role in enabling accurate object localization from camera observations. By aligning the image coordinates with the ground plane, the system can estimate object positions in real-world coordinates, facilitating integration with the robot’s localization and perception modules.

This geometric mapping also provides the foundation for inverse perspective mapping (IPM) and the subsequent coordinate transformation processes described in the following sections.

2.7. YOLOv11-Based Object Detection

The YOLOv11 model is adopted as the object detection module in the proposed perception pipeline because its efficient single-stage architecture facilitates rapid 2D object identification through a unified neural network [24]. Modern iterations of this model have further enhanced the precision of vehicle detection [25] and maintained reliable performance across varied and complex environments [26].

In this work, the detection process operates on bird’s-eye view (BEV) representations generated from external camera images, as described in Section 2.6. This BEV-based input provides a spatially consistent top-down view of the environment, which improves the robustness of object localization and simplifies geometric interpretation compared with perspective images.

The use of an external camera allows the detection module to capture objects located in blind-spot regions that are not observable by the rover’s onboard sensors. The detection results, including bounding boxes and confidence scores, are computed for each frame and projected into the shared coordinate system for downstream processing.

An example of the detection output is shown in Figure 11. The detected objects are subsequently passed to a tracking module, where temporal association and trajectory estimation are performed. This integration enables continuous monitoring of dynamic objects and supports decision-making in the connected perception framework.

2.8. DeepSORT-Based Object Tracking

To enable consistent tracking of detected objects across consecutive frames, DeepSORT is employed as the multi-object tracking framework in the proposed perception pipeline [27].

The tracking module takes the detection results from the YOLOv11-based detector, including bounding boxes and confidence scores, and associates objects over time by assigning unique identities to each detected object. This allows the system to maintain temporal consistency and estimate object trajectories in dynamic environments.

DeepSORT integrates motion prediction and appearance-based association to improve tracking robustness under challenging conditions such as occlusions, missed detections, and object interactions. This is particularly important in the proposed system, where objects may temporarily disappear from the field of view because of blind spots or limited sensor coverage.

The resulting tracking outputs consist of temporally consistent object identities and trajectories, which are further utilized for downstream tasks such as trajectory analysis, motion prediction, and connected decision-making. This integration enables reliable monitoring of dynamic objects and enhances the overall perception capability of the system.

After object detection, the spatial positions of detected objects are estimated to support navigation and decision-making. In the proposed system, object locations are derived from the geometric properties of the detected bounding boxes in the transformed image representation.

For each detected object, the center point of the bounding box is selected as a representative location, as illustrated in Figure 12. This provides a compact and computationally efficient approximation of the object position in the image space while maintaining sufficient localization accuracy.

By leveraging the perspective transformation and the homography mapping described in Section 2.6, the extracted center points are projected into the bird’s-eye view (BEV) coordinate system. This transformation enables object positions to be expressed in a unified spatial reference frame aligned with the robot’s environment.
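The center-point extraction and its conversion to grid coordinates can be sketched as follows. The sketch assumes that the bounding box is already expressed in BEV pixels, that the BEV image is a square of `bev_size_px` pixels covering the $3 \times 3$ m area, and that the metric origin is the bottom-left corner; the image size of 700 pixels and the function names are hypothetical.

```python
def bbox_center(x_min, y_min, x_max, y_max):
    """Center point of an axis-aligned bounding box."""
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0

def bev_pixel_to_metric(px, py, bev_size_px=700, area_size_m=3.0):
    """Convert a BEV pixel to meters with the origin at the bottom-left corner.

    Image rows grow downward, so the y axis is flipped.
    """
    scale = area_size_m / bev_size_px
    return px * scale, (bev_size_px - py) * scale

# Hypothetical detection box in BEV pixels.
cx, cy = bbox_center(330, 320, 370, 380)
x_m, y_m = bev_pixel_to_metric(cx, cy)
```

Using the box center keeps the representation compact, at the cost of ignoring object extent; a footprint point such as the bottom edge midpoint is a common alternative when the view is not top-down.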

An example of the integrated detection, tracking, and localization results in the BEV domain is shown in Figure 13. The proposed approach enables consistent object localization by combining detection, tracking, and geometric transformation within a shared coordinate system.

The resulting object positions provide reliable spatial information for downstream tasks such as trajectory analysis, motion prediction, and autonomous decision-making, allowing external camera observations to be effectively incorporated into the connected perception framework.

2.9. Connected Perception-Based Decision Making

2.9.1. Position Prediction Models

To support proactive collision avoidance, the proposed system incorporates position prediction models that estimate the future motion of both externally detected objects and the autonomous rover. Trajectory prediction is widely recognized as a key component of autonomous driving decision-making, because it enables the system to anticipate future interactions and reduce the risk of collision [28,29].

Two complementary prediction models are developed: (1) a camera-based object position prediction model and (2) an autonomous rover position prediction model.

The camera-based prediction model utilizes object locations obtained from the external camera perception system. The input to this model consists of sequential object positions represented by bounding box centers in the bird’s-eye view (BEV) coordinate system. By learning temporal patterns from historical trajectories, the model estimates the future motion of detected objects within the shared spatial reference frame. This design is consistent with prior trajectory prediction studies that employ sequence-based neural models, such as LSTM-based predictors, for estimating future vehicle motion [30,31].

The autonomous rover prediction model estimates the future motion of the robot using onboard localization information derived from LiDAR and IMU sensors. The model learns the rover’s motion dynamics from historical position and motion data, which is consistent with recent research on ego-state estimation and vehicle motion prediction using recurrent and sensor-fusion-based approaches [32,33].

Both prediction models are implemented using artificial neural networks (ANNs) to learn nonlinear motion patterns from sequential spatial data. The general structure of the ANN-based prediction network is illustrated in Figure 14. The dataset was divided into training, validation, and test sets with a ratio of 70:15:15, respectively. The predicted trajectories of both external objects and the rover are then integrated into the connected perception framework to evaluate potential collision risks and support decision-making, following trajectory-prediction-based collision warning approaches in connected vehicles [34,35].
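The data preparation implied by this setup can be sketched as below. The code builds input and target pairs from a position sequence and applies a 70:15:15 split; the window length, the one-step prediction horizon, and the chronological (unshuffled) split are assumptions made for illustration, since the text specifies only the split ratio.

```python
def make_windows(positions, history=5, horizon=1):
    """Pair each run of `history` past positions with a future position.

    `positions` is a time-ordered list of (x, y) tuples; the target is the
    position `horizon` steps after the end of the history window.
    """
    samples = []
    for i in range(len(positions) - history - horizon + 1):
        past = positions[i:i + history]
        target = positions[i + history + horizon - 1]
        samples.append((past, target))
    return samples

def split_70_15_15(samples):
    """Split samples into training, validation, and test sets in order."""
    n = len(samples)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_val = n * 15 // 100
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test

# Hypothetical straight-line trajectory sampled at 105 time steps.
demo_positions = [(0.1 * k, 0.0) for k in range(105)]
demo_samples = make_windows(demo_positions)
train_set, val_set, test_set = split_70_15_15(demo_samples)
```

Each `past` window would be flattened into the input vector of the network and each `target` would serve as the regression label. Any remainder from the integer division falls into the test set.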

By incorporating prediction into the perception–decision loop, the system enables anticipatory responses rather than purely reactive behavior, thereby improving safety in dynamic environments.

2.9.2. Collision Risk Analysis

Following the position prediction stage described in Section 2.9.1, the system evaluates potential collision risk by analyzing the predicted future positions of both the detected objects and the autonomous rover. Collision risk assessment is a key component of autonomous driving safety, as it enables early identification of hazardous interactions and supports proactive avoidance strategies [36,37].

The risk assessment is performed by computing the spatial distance between the predicted object position and the predicted rover position within the shared Cartesian coordinate system. In this work, the Euclidean distance is used as a simple and computationally efficient proximity measure, as defined in Equation (3)

d = \sqrt{{(x_{2} - x_{1})}^{2} + {(y_{2} - y_{1})}^{2}},

(3)

where $(x_{1}, y_{1})$ denotes the predicted position of the detected object obtained from the camera-based prediction model, and $(x_{2}, y_{2})$ denotes the predicted position of the autonomous rover obtained from the rover position prediction model.

If the computed distance falls below a predefined safety threshold, the situation is classified as a potential collision risk. More broadly, surrogate safety measures such as time-to-collision (TTC), post-encroachment time (PET), and other proximity-based indicators are commonly used to quantify traffic conflicts and near-collision situations in intelligent transportation systems [38,39,40].
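A minimal sketch of this distance check is given below. The function names are hypothetical and the threshold value is a placeholder, since the numerical safety threshold is not stated in this section.

```python
from math import hypot

def euclidean_distance(object_xy, rover_xy):
    """Distance between the predicted object and rover positions."""
    (x1, y1), (x2, y2) = object_xy, rover_xy
    return hypot(x2 - x1, y2 - y1)

def is_collision_risk(object_xy, rover_xy, threshold_m):
    """Flag a potential collision when the distance is below the threshold."""
    return euclidean_distance(object_xy, rover_xy) < threshold_m

# Hypothetical predicted positions in meters and a placeholder threshold.
risk_near = is_collision_risk((1.0, 1.0), (1.2, 1.1), threshold_m=0.5)
risk_far = is_collision_risk((0.0, 0.0), (3.0, 4.0), threshold_m=0.5)
```

Because both positions are predictions for the same future instant, the check flags a conflict before the two entities are physically close, which is what distinguishes it from a purely reactive proximity test.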

By continuously evaluating the predicted spatial relationships between dynamic entities, the system enables early risk detection and supports proactive decision-making. This trajectory-based risk evaluation strategy is consistent with recent connected vehicle and intelligent driving approaches that combine motion prediction with safety assessment and collision warning [34,41,42]. The identified risk conditions are then transmitted to the decision module, which determines appropriate control actions such as speed reduction or emergency stopping.

2.9.3. Collision-Aware Motion Control

Building upon the perception and localization modules described in Section 2.4, the proposed system incorporates a connected-data decision-making mechanism to ensure safe autonomous operation.

To enhance driving safety, perception results from both the lightweight autonomous rover and the external camera infrastructure are integrated within a connected data framework. The overall architecture of the connected perception and decision-making process is illustrated in Figure 15.

In this framework, the object detection and tracking results obtained from the external camera are transmitted through a communication network using the MQTT protocol. The transmitted data include object identities, positions, and motion information represented in a shared coordinate system. Simultaneously, the rover maintains its own localization state based on LiDAR and inertial measurements.
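One possible shape for the transmitted message is sketched below, assuming a JSON payload. The topic name, the field names, and the helper functions are hypothetical, since the message schema is not specified in the text; publishing the encoded bytes through an MQTT client is left outside the sketch.

```python
import json

# Hypothetical topic name; the actual topic layout is not given in the paper.
TOPIC = "perception/external_camera/objects"

def encode_detection(track_id, x_m, y_m, vx_mps, vy_mps, stamp_s):
    """Serialize one tracked object (identity, position, motion) as UTF-8 JSON bytes."""
    payload = {
        "id": int(track_id),    # tracker-assigned object identity
        "x": float(x_m),        # position in the shared coordinate system (m)
        "y": float(y_m),
        "vx": float(vx_mps),    # motion information (m/s)
        "vy": float(vy_mps),
        "stamp": float(stamp_s),  # capture time (s)
    }
    return json.dumps(payload).encode("utf-8")

def decode_detection(raw):
    """Parse a received payload back into a dictionary on the rover side."""
    return json.loads(raw.decode("utf-8"))
```

Including a timestamp lets the receiver discard stale detections, which matters when the network delay is comparable to the prediction horizon.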

By combining onboard localization with externally detected object information, the system continuously evaluates potential collision risks in real time. This cooperative perception strategy extends the rover’s situational awareness beyond the limitations of onboard sensing, particularly in blind-spot scenarios [5,10]. The integration of shared perception data within a connected communication framework is also consistent with recent V2X-based trajectory prediction approaches [41].

When a potential collision risk is detected, the decision module generates appropriate control actions based on the estimated risk level. These actions include speed reduction or emergency stopping, which are executed by the rover’s motion control subsystem.

Such risk-aware control strategies are widely adopted in autonomous driving systems to enable proactive collision avoidance and improve operational safety [34,37].
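The mapping from risk level to control action can be sketched as a simple two-threshold rule. The radii below and the function name `select_action` are assumptions chosen for illustration; the text names the actions (speed reduction, emergency stopping) but not the decision thresholds.

```python
def select_action(predicted_distance_m, slow_radius_m=2.0, stop_radius_m=1.0):
    """Choose a control action from the predicted rover-object separation.

    Both radii are hypothetical placeholders, not values from the paper.
    """
    if stop_radius_m > slow_radius_m:
        raise ValueError("stop radius must not exceed slow radius")
    if predicted_distance_m < stop_radius_m:
        return "emergency_stop"
    if predicted_distance_m < slow_radius_m:
        return "reduce_speed"
    return "continue"
```

Ordering the checks from the most to the least severe condition ensures that an emergency stop always takes precedence over a speed reduction.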

By integrating perception, prediction, and control within a unified connected-data framework, the proposed system improves responsiveness to dynamic obstacles and enhances overall safety in real-world environments.

3. Experiments and Results

3.1. Results of Preliminary Testing and Baseline Validation

This subsection evaluates the empirical data from the preliminary testing phase. The analysis focuses strictly on confirming the constant speed profile of the RC obstacle and documenting the performance logs of the independent onboard configuration within the physical blind spot. Line-of-sight tracking data are omitted here, as unoccluded scenarios fall outside the scope of this connected framework evaluation.

3.1.1. Obstacle Velocity Characterization

To evaluate the dynamic consistency of the target obstacle, the operational velocity of the remote-controlled (RC) vehicle was systematically measured across 10 independent experimental trials. In each trial, the RC vehicle was manually controlled from a distant staging area, allowing it to accelerate and achieve a steady-state velocity before entering the active monitoring zone. Spatial tracking began at the initial incoming grid point and continued uninterrupted until the vehicle exited the final boundary point along the same horizontal grid row. Data points were recorded continuously at the operational frequency of the integrated YOLOv11 and DeepSORT tracking pipeline.

To verify velocity uniformity, individual segment speeds were calculated between sequential grid intersections across all 10 trials. As summarized in Table 1, the target obstacle maintained a highly consistent profile, yielding an overall mean speed of 2.9125 m/s (±0.6774 m/s) across all cycles, with the steady-state global baseline velocity stabilizing at 2.975 m/s (±0.681 m/s). This empirical consistency confirms the obstacle’s suitability as a stable, reproducible experimental control variable for subsequent testing.
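The segment-speed calculation and the averaging across trials can be reproduced with a short sketch. The helper `segment_speeds` and its one-dimensional position input are assumptions about how the tracking log is organized; the per-trial values are those listed in Table 1.

```python
import statistics

def segment_speeds(times_s, positions_m):
    """Speed over each segment between consecutive grid crossings (m/s)."""
    return [
        (positions_m[i + 1] - positions_m[i]) / (times_s[i + 1] - times_s[i])
        for i in range(len(times_s) - 1)
    ]

# Per-trial average speeds and standard deviations as listed in Table 1 (m/s).
TRIAL_MEANS = [3.045, 2.292, 2.978, 2.922, 3.012, 2.988, 2.972, 3.018, 2.961, 2.937]
TRIAL_SDS = [0.648, 0.637, 0.622, 0.658, 0.657, 0.691, 0.667, 0.749, 0.698, 0.747]

mean_speed = statistics.fmean(TRIAL_MEANS)  # average of the ten trial means
mean_sd = statistics.fmean(TRIAL_SDS)       # average of the ten trial SDs
```

Averaging the ten rows in this way gives 2.9125 m/s and 0.6774 m/s, matching the "Mean Speed (All Trials)" row of Table 1.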

3.1.2. Onboard Baseline Testing in the Blind-Spot Geometry

To evaluate the physical safety boundaries of the isolated system, empirical tracking data were captured across the three configured initial distance profiles illustrated in Figure 16. In the LiDAR scanning data panel, the red rings represent the LiDAR scan layers on the ground, the colored point clusters indicate detected obstacles, and the RGB axes represent the pose of the rover. The kinematic tracking logs and operational outcomes across all 30 experimental trials are summarized in Table 2. Under Condition 1 (where both the RC car and the robot are positioned far from the blind-spot corner) and Condition 2 (where the RC car is positioned closer to the blind-spot corner), the velocity differential allowed the RC target vehicle to clear the intersection vertex immediately ahead of the rover’s trajectory. Although these trials resulted in a physical pass-by without a direct collision, they represent a critical real-world operational risk because of the near-zero spatial separation margins.

In contrast, under Condition 3 (where the autonomous robot is positioned closer to the blind-spot corner), the configuration resulted in a deterministic failure case, where the cross-traffic RC vehicle crashed directly into the autonomous rover in 10 out of 10 independent trials. This catastrophic side-impact collision highlights the vulnerability of relying solely on localized vehicle perception within structural blind spots.

Crucially, all 30 trials across the three initial conditions were classified as failures with respect to reducing vehicle speed early enough to mitigate operational risk, because the onboard system did not become aware of the oncoming threat in time to guarantee safety at runtime. The velocity commands were structured such that values exceeding 0.8 indicate the nominal driving speed, values between 0 and 0.79 indicate a reduced-speed profile, and 0 denotes a complete stop. In 6 cases (4 trials in Condition 1 and 2 trials in Condition 2), the robot did reduce its speed upon obstacle detection; however, the recorded reaction times (measured from the moment the obstacle was found) and the subsequent speed profiles confirm that this mitigation was not sufficient to bring the rover to a stop in time. Consequently, because severe environmental occlusion prevented the onboard sensor suite from registering the dynamic threat early enough, each onboard baseline experiment ended with the rover executing an emergency brake and stopping safely only upon encountering the static wall on the opposite side of the intersection corridor, as shown in the RViz data and the photograph in Figure 17.
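The velocity-command convention described above can be expressed as a small classifier. This is an illustrative sketch: the function name is hypothetical, and because the text leaves values between 0.79 and 0.8 unspecified, the sketch assumes that anything from 0.8 upward counts as nominal and anything strictly between 0 and 0.8 counts as reduced.

```python
def classify_speed_command(cmd):
    """Label a velocity command as 'stop', 'reduced', or 'nominal'.

    Assumption: commands are non-negative, and 0.8 is treated as the boundary
    between the reduced and nominal states.
    """
    if cmd < 0:
        raise ValueError("speed command is assumed to be non-negative")
    if cmd == 0:
        return "stop"
    if cmd < 0.8:
        return "reduced"
    return "nominal"
```

Applied to Table 2, a command of 1.284 is labeled nominal, while the reaction commands of 0.7746 and 0.5478 are labeled reduced.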

3.2. Main Experimental Setup

The proposed system was evaluated in a controlled outdoor environment at Suranaree University of Technology. The test field represents an urban intersection scenario with a road width of 6 m. The simulated environment of a typical vehicle is shown in Figure 18, and the environment of the autonomous robot is shown in Figure 19. As shown in Table 3, the experimental setup clearly defines the computing and sensing roles. The autonomous robot utilized an NVIDIA Jetson AGX Xavier module (NVIDIA Corporation, Santa Clara, CA, USA) as its main onboard controller for local operations. For vehicle perception and tracking, a Velodyne VLP-16 LiDAR (Velodyne Lidar, San Jose, CA, USA) was employed as the primary 3D object scanning sensor, while a Pixhawk 3 Pro IMU (Drotek, Toulouse, France) was used to evaluate the autonomous rover’s speed; notably, both the LiDAR and IMU data were fused for localization within this system. In contrast, the main system processing unit was powered by an external workstation equipped with an Intel Core i7-11700K processor (Intel Corporation, Santa Clara, CA, USA). This high-performance central computer runs the overall connected perception framework, executing the core obstacle pre-detection and collision avoidance algorithms. It actively sends a protective stop command to halt the autonomous robot in critical safety scenarios where the onboard system lacks the computational capacity to process the complete environmental data. Additionally, a fixed RGB camera with a resolution of 1920 × 1080 pixels at 30 frames/s was installed on a nearby infrastructure pole at a height of 2.5 m, providing an overhead view of the intersection, with both platforms connected via a local Wi-Fi network to enable real-time communication through the ROS framework.

3.3. Detection Results

The performance of the proposed YOLOv11-based detection model was evaluated using standard metrics, including precision, recall, F1-score, and mAP at 0.5, as summarized in Table 4. To evaluate model performance, the dataset was constructed with a split configuration of 80% for training, 10% for validation, and 10% for testing. The target object detected by the model was specifically the remote-controlled (RC) car utilized as the dynamic obstacle in the main experiments.

The model achieved a precision of 1.00 and a recall of 0.99, resulting in an F1-score of 0.98. The mAP at 0.5 reached 0.991, indicating highly accurate detection performance for the target RC vehicle obstacle.

As shown in Figure 20, the Precision–Recall curve remains close to the upper-right corner, demonstrating a strong balance between precision and recall across different confidence thresholds.

Furthermore, the normalized confusion matrix in Figure 21 shows that 99% of objects were correctly classified, with only 1% misclassification. This indicates that the proposed model produces minimal false positives and false negatives, confirming its robustness under the tested environmental conditions. The results also suggest that the model maintains stable detection performance even at high recall levels.

3.4. Localization Results

3.4.1. External Camera Localization

The localization performance of the external camera system was evaluated by comparing the measured object positions with the corresponding ground-truth coordinates.

As shown in Figure 22, the detected object positions are projected onto a predefined ground grid, enabling spatial localization in real-world coordinates.

The measured positions exhibited consistent and systematic deviations from the ground truth along both the X- and Y-axes, as illustrated in Figure 23. Table 5 summarizes the localization performance.

The localization error ranged from 0.01 to 0.04 m in the X-direction and up to 0.08 m in the Y-direction. The RMSE value of 0.04 m was computed based on the Euclidean distance between the measured positions and the ground-truth coordinates across all test samples, indicating centimeter-level localization accuracy.
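The RMSE computation described above can be sketched as follows. The function name and the array layout (one row per test sample, columns x and y) are assumptions made for illustration.

```python
import numpy as np

def localization_rmse(measured_xy, truth_xy):
    """RMSE of the Euclidean distance between measured and ground-truth points (m)."""
    diff = np.asarray(measured_xy, dtype=float) - np.asarray(truth_xy, dtype=float)
    distances = np.linalg.norm(diff, axis=1)  # per-sample Euclidean error
    return float(np.sqrt(np.mean(distances ** 2)))
```

For instance, two samples with planar errors of 0.05 m and 0 m give an RMSE of about 0.035 m.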

These deviations indicate the presence of systematic localization error, which motivates the use of regression-based correction.

To further reduce localization error, a multiple linear regression (MLR) model was applied to compensate for systematic deviations observed in the measured positions.

As shown in Figure 24, the corrected positions exhibited significantly improved alignment with the ground truth compared with the original measurements.

Quantitatively, the maximum localization error was reduced from approximately 0.072 m in Figure 23 to 0.0127 m in Figure 24, while the average error decreased from approximately 0.03736 m to 0.00905 m, corresponding to an error reduction of approximately 75.77%.

These results confirm that the MLR-based correction effectively compensates for systematic localization errors, enhancing the precision and reliability of the external camera localization system for real-time applications.
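A regression-based correction of this kind can be sketched with ordinary least squares. The regressors used by the authors are not listed here, so the sketch assumes an affine model in the measured x and y coordinates; the function names and the synthetic bias in the usage lines are hypothetical. The last lines recompute the relative error reduction from the average errors reported above.

```python
import numpy as np

def fit_mlr(measured_xy, truth_xy):
    """Fit truth ~ [1, x_meas, y_meas] @ coef by least squares; coef has shape (3, 2)."""
    measured_xy = np.asarray(measured_xy, dtype=float)
    design = np.column_stack([np.ones(len(measured_xy)), measured_xy])
    coef, *_ = np.linalg.lstsq(design, np.asarray(truth_xy, dtype=float), rcond=None)
    return coef

def apply_mlr(coef, measured_xy):
    """Apply the fitted correction to new measured positions."""
    measured_xy = np.asarray(measured_xy, dtype=float)
    design = np.column_stack([np.ones(len(measured_xy)), measured_xy])
    return design @ coef

# Synthetic example: a systematic scale and offset error (assumed, not measured).
truth = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
measured = truth * 1.02 + np.array([0.03, -0.05])
corrected = apply_mlr(fit_mlr(measured, truth), measured)

# Relative reduction implied by the reported average errors (m).
reduction_pct = (0.03736 - 0.00905) / 0.03736 * 100.0
```

Because the synthetic bias is purely affine, the fitted model removes it entirely; real measurements would leave a residual, as the reported 0.00905 m average error indicates.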

3.4.2. Lightweight Robot Localization

The localization performance of the autonomous robot was evaluated using a LiDAR- and IMU-based positioning system. These sensors provide spatial information and motion estimation, enabling accurate localization in dynamic environments.

As shown in Figure 25, the robot position was estimated within the experimental area using sensor fusion from LiDAR and IMU data. The system is capable of continuously tracking the robot’s position during motion.

Figure 26 compares the ground-truth and measured positions. The results show that the estimated positions closely match the true positions across the test grid.

Quantitatively, the localization error ranged from approximately 0.01 to 0.03 m along the X-axis and from 0.01 to 0.025 m along the Y-axis. These results indicate that the system maintains consistent localization accuracy across both axes.

The relatively low error demonstrates that the LiDAR- and IMU-based localization system provides stable and reliable positioning performance, even under continuous motion. This makes the system suitable for real-time autonomous navigation and robotic perception applications.

3.5. Position Prediction Results

The performance of the position prediction models was evaluated using standard regression metrics, including mean squared error (MSE), mean absolute error (MAE), and the coefficient of determination (R²), as summarized in Table 6 and Table 7.
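For reference, the three metrics can be computed as in the sketch below. The function name is hypothetical, and the sketch assumes a single coordinate passed as a one-dimensional array; how the authors aggregate the x and y outputs is not stated.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for one predicted coordinate."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    ss_res = float(np.sum(err ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mse, mae, r2
```

As a small check, true values 1, 2, 3, 4 with predictions 1.1, 1.9, 3.2, 3.8 give an MSE of 0.025, an MAE of 0.15, and an R² of 0.98.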

For the camera-based position prediction model, the results show an MSE of 0.00387, an MAE of 0.03969, and an R² value of 0.99588. These results indicate that the model achieves high prediction accuracy, with minimal deviation between predicted and ground-truth positions.

For the autonomous rover position prediction model, the performance slightly decreases, with an MSE of 0.0058, an MAE of 0.0493, and an R² value of 0.9813. The increased prediction error is likely due to the dynamic motion of the rover and sensor noise introduced by LiDAR and IMU measurements.

Comparatively, the camera-based model demonstrated superior accuracy, as reflected by its lower error values and higher R² score. This suggests that camera-based predictions are more stable under controlled conditions, while the rover-based model is affected by motion dynamics and environmental uncertainty.

Overall, both models achieved high prediction performance, with R² values above 0.98, indicating strong agreement between predicted and actual positions. These results confirm the effectiveness of the proposed position prediction models for real-time robotic perception and autonomous navigation applications.

3.6. Collision Avoidance Results

The performance of the proposed system in deceleration and emergency stopping was evaluated across 10 repeated trials per scenario under various test configurations, including static and moving objects, low-light conditions, and different environmental factors, as summarized in Table 8.

The results show that the system successfully detected obstacles and executed stopping maneuvers in all test cases, achieving a 100% success rate without collision. The system response time ranged from 0.2 to 0.45 s, demonstrating fast and reliable reaction to potential collision risks.

The braking distance varied between 0.8 and 1.3 m, depending on the initial distance to the object and the vehicle speed. As expected, larger initial distances and lower speeds allow for safer stopping with shorter braking distances.

The system maintained stable performance under different environmental conditions. For instance, in low-light scenarios and environments with wet surfaces, the system was still able to detect obstacles and perform safe stopping without failure.

Furthermore, the results indicate that response time is a critical factor influencing stopping performance. Faster response times lead to shorter braking distances and improved safety margins, particularly in dynamic scenarios involving moving objects.
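The link between response time and stopping distance can be illustrated with the standard constant-deceleration model, in which the rover travels at its initial speed during the response time and then brakes uniformly. This is textbook kinematics rather than the authors' evaluation method, and the deceleration value in the usage lines is a hypothetical placeholder.

```python
def stopping_distance(speed_mps, response_time_s, decel_mps2):
    """Reaction distance plus braking distance under constant deceleration (m)."""
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    reaction = speed_mps * response_time_s          # distance covered before braking starts
    braking = speed_mps ** 2 / (2.0 * decel_mps2)   # distance covered while braking
    return reaction + braking

# Assumed values: 1.2 m/s initial speed and 2.0 m/s^2 deceleration.
slow_response = stopping_distance(1.2, 0.45, 2.0)
fast_response = stopping_distance(1.2, 0.20, 2.0)
```

Under these assumed values, shortening the response time from 0.45 s to 0.2 s removes 0.3 m of travel before braking begins.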

Overall, the experimental results demonstrate that the proposed system is capable of reliable and robust collision avoidance. The integration of camera, LiDAR, and IMU sensors enables accurate perception and timely decision-making, ensuring safe operation in real-time autonomous navigation scenarios.

4. Conclusions

This paper presented a connected perception framework for blind-spot awareness integrating an external camera system with a lightweight autonomous robot. The proposed system combines object detection, localization, position prediction, and collision avoidance to enhance environmental perception in scenarios where onboard sensing alone is insufficient.

The experimental results demonstrated that the proposed approach achieved high detection accuracy, with a mean average precision (mAP at 0.5) of 0.991 using the YOLOv11-based detection model. For localization, the external camera system achieved centimeter-level accuracy, with an RMSE of 0.04 m, which was further improved through multiple linear regression (MLR)-based correction, resulting in a localization error reduction of approximately 75.77%. Furthermore, baseline onboard-only testing across 30 experimental trials showed that localized sensing failed to detect the cross-traffic threat early enough: the recorded velocity commands confirm that the onboard system either did not react to the oncoming target vehicle or reduced speed too late to guarantee safety.

In addition, the position prediction models showed strong performance, with coefficients of determination (R²) exceeding 0.98 for both the camera-based and autonomous rover models, indicating reliable prediction of object positions. The system also demonstrated robust collision avoidance capability, achieving a 100% stopping success rate across all tested scenarios utilizing the connected framework, with response times ranging from 0.2 to 0.45 s.

The integration of external camera perception with onboard sensing enabled the system to effectively address blind-spot limitations, providing enhanced situational awareness and improved safety in real-world environments. The results confirm that the proposed connected perception framework is suitable for real-time autonomous robotic applications.

Future work will focus on extending the system to more complex and dynamic environments, as well as incorporating advanced sensor fusion techniques and learning-based models to further improve perception accuracy and robustness.

Author Contributions

Conceptualization, S.T., P.P., and A.A.; methodology, S.T., P.P., and N.P. (Nutchanan Petcharat); software, N.P. (Nattapong Phuangmalee); validation, A.P., P.K., and A.A.; formal analysis, N.P. (Nutchanan Petcharat); resources, S.T.; data curation, P.K.; writing—original draft preparation, N.P. (Nattapong Phuangmalee); writing—review and editing, N.P. (Nutchanan Petcharat) and N.P. (Nattapong Phuangmalee); supervision, S.T., A.P., and A.A.; project administration, A.A.; funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Suranaree University of Technology (SUT): Full-time66/39/2568.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the results of this study are available in the article. Requests for further materials should be directed to the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used the Google Gemini 3.5 flash and ChatGPT 5.5 tools for the purposes of language polishing, text generation in select paragraphs, and the formatting and management of LaTeX tables. The authors have fully reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LiDAR	Light Detection and Ranging
IMU	Inertial Measurement Unit
V2I	Vehicle-to-Infrastructure
BEV	Bird’s Eye View
MLR	Multiple Linear Regression

References

1. Malik, S.; Khan, M.A.; El-Sayed, H. Collaborative autonomous driving—A survey of solution approaches and future challenges. Sensors 2021, 21, 3783.
2. Zhang, Y.; Carballo, A.; Yang, H.; Takeda, K. Perception and sensing for autonomous vehicles under adverse weather conditions: A survey. ISPRS J. Photogramm. Remote Sens. 2023, 196, 146–177.
3. George, R.; Clancy, J.; Brophy, T.; Sistu, G.; O’Grady, W.; Chandra, S.; Collins, F.; Mullins, D.; Jones, E.; Deegan, B.; et al. Infrastructure Assisted Autonomous Driving: Research, Challenges, and Opportunities. IEEE Open J. Veh. Technol. 2025, 6, 662–716.
4. Zhou, X.; Wang, C.; Xie, Q.; Qiu, T. V2I-Coop: Accurate object detection for connected automated vehicles at accident black spots with V2I cross-modality cooperation. IEEE Trans. Mob. Comput. 2025, 24, 2043–2055.
5. Mo, Y.; Vijay, R.; Rufus, R.; de Boer, N.; Kim, J.; Yu, M. Enhanced Perception for Autonomous Vehicles at Obstructed Intersections: An Implementation of Vehicle to Infrastructure (V2I) Collaboration. Sensors 2024, 24, 936.
6. Xiang, C.; Zhang, L.; Xie, X.; Zhao, L.; Ke, X.; Niu, Z.; Wang, F. Multi-Sensor Fusion Algorithm in Cooperative Vehicle-Infrastructure System for Blind Spot Warning. Int. J. Distrib. Sens. Netw. 2022, 18, 15501329221100412.
7. Lang, A.H.; Vora, S.; Caesar, H.; Zhou, L.; Yang, J.; Beijbom, O. PointPillars: Fast Encoders for Object Detection from Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; IEEE: New York, NY, USA, 2019; pp. 12689–12697.
8. Li, S.; Shu, K.; Chen, C.; Cao, D. Planning and Decision-Making for Connected Autonomous Vehicles at Road Intersections: A Review. Chin. J. Mech. Eng. 2021, 34, 133.
9. Cui, G.; Zhang, W.; Xiao, Y.; Yao, L.; Fang, Z. Cooperative Perception Technology of Autonomous Driving in the Internet of Vehicles Environment: A Review. Sensors 2022, 22, 5535.
10. Müller, J.; Strohbeck, J.; Herrmann, M.; Buchholz, M. Motion Planning for Connected Automated Vehicles at Occluded Intersections With Infrastructure Sensors. arXiv 2021, arXiv:2110.11246.
11. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334.
12. Thrun, S.; Burgard, W.; Fox, D. Probabilistic Robotics; MIT Press: Cambridge, MA, USA, 2005.
13. Biber, P.; Straßer, W. The normal distributions transform: A new approach to laser scan matching. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 27–31 October 2003; IEEE: New York, NY, USA, 2003.
14. Lim, H.; Hwang, S.; Shin, S.; Myung, H. Normal Distributions Transform Is Enough: Real-Time 3D Scan Matching for Pose Correction of Mobile Robot under Large Odometry Uncertainties. In Proceedings of the International Conference on Control, Automation and Systems (ICCAS), Busan, Republic of Korea, 13–16 October 2020; IEEE: New York, NY, USA, 2020.
15. Bao, Z.; Hossain, S.; Lang, H.; Lin, X. A Review of High-Definition Map Creation Methods for Autonomous Driving. Eng. Appl. Artif. Intell. 2023, 122, 106125.
16. Shin, D.; Park, K.-m.; Park, M. High Definition Map-Based Localization Using ADAS Environment Sensors for Application to Automated Driving Vehicles. Appl. Sci. 2020, 10, 4924.
17. Peri, N.; Li, M.; Wilson, B.; Wang, Y.-X.; Hays, J.; Ramanan, D. An Empirical Analysis of Range for 3D Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCV Workshops), Paris, France, 1–6 October 2023; IEEE: New York, NY, USA, 2023.
18. Zhang, J.; Yan, Y.; Cheng, Z.; Wang, W. Lightweight Attention Pyramid Network for Object Detection and Instance Segmentation. Appl. Sci. 2020, 10, 883.
19. Mallot, H.A.; Bülthoff, H.H.; Little, J.J.; Bohrer, S. Inverse Perspective Mapping Simplifies Optical Flow Computation and Obstacle Detection. Biol. Cybern. 1991, 64, 177–185.
20. Khosravi, K.; Ghaemifar, M.; Ebadollahi, S. Enhancing Spatial Awareness: A Survey of Camera-Based Frontal View to Bird’s-Eye-View Conversion. SSRN Electron. J. 2024.
21. Gong, S.; Ye, X.; Tan, X.; Wang, J.; Ding, E.; Zhou, Y.; Bai, X. GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022.
22. Justs, D.J.; Novickis, R.; Ozols, K.; Greitans, M. Bird’s-Eye-View Image Acquisition from Simulated Scenes Using Geometric Inverse Perspective Mapping. In Proceedings of the 2020 IEEE 17th Biennial Baltic Electronics Conference (BEC), Tallinn, Estonia, 6–8 October 2020; IEEE: New York, NY, USA, 2020.
23. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003.
24. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 779–788.
25. Al Rabbani, M.A. YOLOv11 for vehicle detection: Advancements, performance, and applications in intelligent transportation systems. arXiv 2024, arXiv:2410.22898.
26. He, L.-H.; Zhou, Y.-Z.; Liu, L.; Cao, W.; Ma, J.-H. Research on Object Detection and Recognition in Remote Sensing Images Based on YOLOv11. Sci. Rep. 2025, 15, 14032.
27. Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; IEEE: New York, NY, USA, 2017; pp. 3645–3649.
28. Bharilya, V.; Kumar, N. Machine learning for autonomous vehicle’s trajectory prediction: A comprehensive survey, challenges, and future research directions. Array 2024, 46, 100733.
29. Xu, M.; Liu, Z.; Wang, B.; Li, S. A Survey of Autonomous Driving Trajectory Prediction: Methodologies, Challenges, and Future Prospects. Machines 2025, 13, 818.
30. Altché, F.; de La Fortelle, A. An LSTM Network for Highway Trajectory Prediction. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; IEEE: New York, NY, USA, 2017; pp. 353–359.
31. Deo, N.; Trivedi, M.M. Convolutional Social Pooling for Vehicle Trajectory Prediction. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; IEEE: New York, NY, USA, 2018.
32. Dahal, P.; Mentasti, S.; Paparusso, L.; Arrigoni, S.; Braghin, F. RobustStateNet: Robust ego vehicle state estimation for autonomous driving. Robot. Auton. Syst. 2024, 172, 104585.
33. Deo, N.; Trivedi, M.M. Multi-modal trajectory prediction of surrounding vehicles with maneuver based LSTMs. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Changshu, China, 26–30 June 2018; IEEE: New York, NY, USA, 2018; pp. 1179–1184.
34. Baek, M.; Jeong, D.; Choi, D.; Lee, S. Vehicle Trajectory Prediction and Collision Warning via Fusion of Multisensors and Wireless Vehicular Communications. Sensors 2020, 20, 288.
35. Zhang, R.; Cao, L.; Bao, S.; Tan, J. A Method for Connected Vehicle Trajectory Prediction and Collision Warning Algorithm Based on V2V Communication. Int. J. Crashworthiness 2017, 22, 15–25.
36. Goudarzi, P.; Hassanzadeh, B. Collision Risk in Autonomous Vehicles: Classification, Challenges, and Open Research Areas. Vehicles 2024, 6, 157–190.
37. Hamidaoui, M.; Talhaoui, M.Z.; Li, M.; Midoun, M.A.; Haouassi, S.; Mekkaoui, D.E.; Smaili, A.; Cherraf, A.; Benyoub, F.Z. Survey of Autonomous Vehicles’ Collision Avoidance Algorithms. Sensors 2025, 25, 395.
38. Arun, A.; Haque, M.M.; Washington, S.; Sayed, T.; Mannering, F. A Systematic Review of Traffic Conflict-Based Safety Measures with a Focus on Application Context. Anal. Methods Accid. Res. 2021, 32, 100185.
39. Mahmud, S.M.S.; Ferreira, L.; Hoque, M.S.; Tavassoli, A. Application of Proximal Surrogate Indicators for Safety Evaluation: A Review of Recent Developments and Research Needs. IATSS Res. 2017, 41, 153–163.
40. Nikolaou, D.; Ziakopoulos, A.; Yannis, G. A Review of Surrogate Safety Measures Uses in Historical Crash Investigations. Sustainability 2023, 15, 7580.
41. Wang, P.; Yu, H.; Liu, C.; Wang, Y.; Ye, R. Real-Time Trajectory Prediction Method for Intelligent Connected Vehicles in Urban Intersection Scenarios. Sensors 2023, 23, 2950.
42. Shi, J.; Sun, D.; Guo, B. Vehicle Trajectory Prediction Based on Graph Convolutional Networks in Connected Vehicle Environment. Appl. Sci. 2023, 13, 13192.

Figure 1. Onboard LiDAR detection and 3D point cloud visualization of the target RC vehicle.

Figure 2. Onboard LiDAR detection and 3D point cloud visualization of the target RC vehicle equipped with an elevated foam block obstruction.

Figure 3. Blind-spot experimental setup.

Figure 4. Bird’s-eye view of the blind-spot scenario and connected perception framework.

Figure 5. Hardware of the lightweight robot. (a) Top-mounted LiDAR; (b) IMU inside the chassis.

Figure 6. Overall system architecture of the lightweight robot. The blue arrows indicate the direction of data flow and control signals between the subsystems.

Figure 7. External camera setup for blind-spot monitoring.

Figure 8. Grid coordinate definitions in the camera pixel space.

Figure 9. Perspective transformation and mapping of pixel coordinates to grid indices.

Figure 10. Virtual grid on experimental area.

Figure 11. Example output of YOLOv11-based object detection in the BEV representation.

Figure 12. Center point extraction from a detected bounding box, used as a representative object location for localization.

Figure 13. Integrated object detection, tracking, and localization results in the bird’s-eye view (BEV) representation.

Figure 14. Architecture of the artificial neural network used for position prediction of objects and the autonomous rover.

Figure 15. Overall framework of connected perception between the lightweight robot and the external camera for blind-spot awareness.

Figure 16. Comparison of test scenarios and corresponding 3D point cloud visualization.

Figure 17. Autonomous robot executing a halt at the set distance in front of the static wall after the target vehicle has passed.

Figure 18. Simulated environment of a vehicle on a typical road.

Figure 19. Simulation environment of the autonomous robot.

Figure 20. Precision–Recall curve of the proposed YOLOv11-based detection model.

Figure 21. Normalized confusion matrix of the proposed YOLOv11 model.

Figure 22. Example visualization of object localization using the external camera. (a) Predefined ground grid coordinate system; (b) Real-world camera view with object detection and localization.

Figure 23. Comparison between ground-truth and measured object positions.

Figure 24. Comparison between ground-truth and corrected object positions obtained using multiple linear regression.

Figure 25. Experimental setup for lightweight robot localization using LiDAR and IMU. (a) Predefined ground grid coordinate system; (b) Real-world camera view.

Figure 26. Comparison between ground-truth and onboard LiDAR/IMU-fused localization coordinates.

Table 1. Empirical velocity performance and standard deviation of the target obstacle across independent trials.

Trial Number	Average Speed (m/s)	Standard Deviation (SD)
Trial 1	3.045	0.648
Trial 2	2.292	0.637
Trial 3	2.978	0.622
Trial 4	2.922	0.658
Trial 5	3.012	0.657
Trial 6	2.988	0.691
Trial 7	2.972	0.667
Trial 8	3.018	0.749
Trial 9	2.961	0.698
Trial 10	2.937	0.747
Mean Speed (All Trials)	2.9125	0.6774
Global Baseline Velocity	2.975	0.681
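
As a quick consistency check, the "Mean Speed (All Trials)" row can be reproduced by averaging the ten per-trial values listed in Table 1. The sketch below is illustrative only: the variable names are assumptions, and it does not attempt to reproduce the "Global Baseline Velocity" row, since the table does not state how that value was derived.

```python
from statistics import mean

# Per-trial average speeds (m/s) and standard deviations, copied from Table 1.
trial_speeds = [3.045, 2.292, 2.978, 2.922, 3.012,
                2.988, 2.972, 3.018, 2.961, 2.937]
trial_sds = [0.648, 0.637, 0.622, 0.658, 0.657,
             0.691, 0.667, 0.749, 0.698, 0.747]

# Unweighted average over the ten trials (each trial counts equally).
mean_speed = mean(trial_speeds)
mean_sd = mean(trial_sds)

print(f"Mean speed: {mean_speed:.4f} m/s")
print(f"Mean SD:    {mean_sd:.4f}")
```

Both averages agree with the tabulated 2.9125 m/s and 0.6774.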

Table 2. Kinematic metrics and operational outcomes of the onboard blind-spot baseline testing.

Initial Condition	Case Scenario Outcome	Obstacle Found: Time (s)	Obstacle Found: Speed Cmd (m/s)	Obstacle in Front: Time (s)	Obstacle in Front: Speed Cmd (m/s)	Reaction to Obstacle (After Found): Time (s)	Reaction to Obstacle (After Found): Speed Cmd (m/s)
Condition 1: Robot 1.0 m/RC 4.5 m (Both Vehicles Far)	Pre-Intersection Target Clearance (High-Risk)	1.79	1.284	2.11	1.284	1.79	0.7746
		2.20	1.095	2.62	1.284	-	-
		1.95	1.284	2.27	1.284	-	-
		2.08	1.284	2.50	1.095	2.27	0.5478
		2.20	1.095	3.10	1.095	2.21	0.5478
		2.33	1.200	3.10	1.095	-	-
		2.24	1.200	2.84	1.095	2.52	0.5477
		1.92	1.200	2.56	1.200	-	-
		2.22	1.200	2.79	1.095	-	-
		2.24	1.095	2.88	1.095	-	-
Condition 2: Robot 1.0 m/RC 3.0 m (RC Obstacle Near Corner)	Early Target Intersection Clearance	1.98	1.095	2.49	1.095	-	-
		2.05	1.095	2.56	1.095	-	-
		1.76	1.200	2.34	1.200	2.4	0.7746
		1.70	1.200	2.24	1.200	2.59	0.7746
		1.70	1.200	2.31	1.200	-	-
		2.27	1.200	2.82	1.200	-	-
		1.99	1.200	2.69	1.095	-	-
		1.41	1.200	1.95	1.095	-	-
		1.95	1.095	2.59	1.095	-	-
		1.95	1.200	2.49	1.200	-	-
Condition 3: Robot 0.5 m/RC 4.5 m (Autonomous Robot Near Corner)	Cross-Traffic Interception Collision	2.08	1.095	3.65	1.095	-	-
		2.02	1.095	3.49	0.949	-	-
		1.85	1.095	3.58	0.776	-	-
		2.01	1.095	3.58	0.949	-	-
		1.60	0.949	3.07	1.095	-	-
		1.88	0.949	3.16	0.949	-	-
		1.96	0.949	3.08	1.095	-	-
		1.79	0.949	2.98	0.949	-	-
		0.93	1.225	2.05	0.949	-	-
		1.85	1.225	2.81	1.095	-	-

Table 3. Hardware and software configuration used in the experiments.

Component	Specification
LiDAR sensor	Velodyne VLP-16 (16 channels, 100 m range)
IMU	Pixhawk 3 Pro
Camera	1080p RGB IP camera (2.5 m height, 90° FOV)
Main computer	Intel Core i7-11700K
Robot hardware	NVIDIA Jetson AGX Xavier, 32 GB RAM

Table 4. Detection performance of the proposed YOLOv11 model.

Metric	Value
Precision	1.00
Recall	0.99
F1-score	0.98
mAP@0.5	0.991

Table 5. Localization error of the external camera system.

Metric	Value (m)
Mean error (X-axis)	0.02
Mean error (Y-axis)	0.03
RMSE	0.04
Max error	0.08

Table 6. Performance of the camera-based position prediction model.

Metric	Value
Mean squared error (MSE)	0.00387
Mean absolute error (MAE)	0.03969
Coefficient of determination ( $R^{2}$ )	0.99588

Table 7. Performance of the autonomous rover position prediction model.

Metric	Value
Mean squared error (MSE)	0.0058
Mean absolute error (MAE)	0.0493
Coefficient of determination ( $R^{2}$ )	0.9813
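
For reference, the three metrics reported in Tables 6 and 7 follow their standard definitions. The sketch below shows one way to compute them in plain Python. The function name `regression_metrics` and the sample positions are hypothetical and are not the authors' data or code.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for two equal-length sequences of values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n    # mean squared error
    mae = sum(abs(e) for e in errors) / n   # mean absolute error

    # R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y_true.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mse, mae, r2


# Hypothetical ground-truth and predicted positions (m), for illustration only.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.1, 0.9, 2.1, 2.9]

mse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```

With these definitions, MSE and MAE carry the units of the predicted quantity (squared and unsquared, respectively), while the coefficient of determination is dimensionless.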

Table 8. Complete autonomous vehicle system performance and braking test results.

Test Case	Test Scenario	Distance from Object (m)	System Response Time (s)	Braking Distance (m)	Stopping Status	Autonomous Speed (m/s)
1	Static object (front)	1.3	0.3	1	Stopped successfully	2
2	Static object (side)	1.1	0.4	0.9	Stopped successfully	2.5
3	Slowly moving object (constant speed)	1.5	0.25	1.2	Stopped successfully	2.3
4	Slowly moving object (constant speed)	1	0.5	0.8	Stopped successfully	2
5	Static object in low-light conditions	1.7	0.35	1.3	Stopped successfully	2.1
6	Slowly moving object (constant speed)	1.4	0.2	1.1	Stopped successfully	2.2
7	Slowly moving object (constant speed)	1.2	0.45	0.95	Stopped successfully	2.3
8	Fast moving object (constant speed)	1	0.3	0.9	Stopped successfully	2.1
9	Environment with water on the ground	1.4	0.4	1	Stopped successfully	2.2
10	Slowly moving object (constant speed)	1.3	0.3	1.1	Stopped successfully	2.3
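
The ten rows of Table 8 can be condensed into a few aggregate figures. The sketch below copies the tabulated response times, braking distances, and stopping statuses and computes simple unweighted averages. The variable names are assumptions, and the averages are descriptive summaries of the table rather than results reported by the authors.

```python
from statistics import mean

# Columns copied from Table 8, in test-case order 1..10.
response_times_s = [0.3, 0.4, 0.25, 0.5, 0.35, 0.2, 0.45, 0.3, 0.4, 0.3]
braking_distances_m = [1.0, 0.9, 1.2, 0.8, 1.3, 1.1, 0.95, 0.9, 1.0, 1.1]
stopping_status = ["Stopped successfully"] * 10

mean_response = mean(response_times_s)
mean_braking = mean(braking_distances_m)
successes = sum(status == "Stopped successfully" for status in stopping_status)

print(f"Mean response time:    {mean_response:.3f} s")
print(f"Mean braking distance: {mean_braking:.3f} m")
print(f"Successful stops:      {successes}/{len(stopping_status)}")
```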
