Article

Singular Value Decomposition (SVD) Method for LiDAR and Camera Sensor Fusion and Pattern Matching Algorithm

Kaiqiao Tian, Meiqi Song, Ka C. Cheok, Micho Radovnikovich, Kazuyuki Kobayashi and Changqing Cai
1 Electrical and Computer Engineering, Oakland University, Rochester, MI 48309, USA
2 Department of Mathematics and Statistics, Oakland University, Rochester, MI 48309, USA
3 Department of Advanced Sciences, Hosei University, Tokyo 184-8584, Japan
4 College of Electrical and Information Engineering, Changchun Institute of Technology, 395 Kuan Ping Road, Changchun 130103, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(13), 3876; https://doi.org/10.3390/s25133876
Submission received: 5 May 2025 / Revised: 14 June 2025 / Accepted: 18 June 2025 / Published: 21 June 2025
(This article belongs to the Special Issue Recent Advances in LiDAR Sensor)

Abstract
LiDAR and camera sensors are widely utilized in autonomous vehicles (AVs) and robotics due to their complementary sensing capabilities—LiDAR provides precise depth information, while cameras capture rich visual context. However, effective multi-sensor fusion remains challenging due to discrepancies in resolution, data format, and viewpoint. In this paper, we propose a robust pattern matching algorithm that leverages singular value decomposition (SVD) and gradient descent (GD) to align geometric features—such as object contours and convex hulls—across LiDAR and camera modalities. Unlike traditional calibration methods that require manual targets, our approach is targetless, extracting matched patterns from projected LiDAR point clouds and 2D image segments. The algorithm computes the optimal transformation matrix between sensors, correcting misalignments in rotation, translation, and scale. Experimental results on a vehicle-mounted sensing platform demonstrate an alignment accuracy improvement of up to 85%, with the final projection error reduced to less than 1 pixel. This pattern-based SVD-GD framework offers a practical solution for maintaining reliable cross-sensor alignment in autonomous driving applications subject to long-term calibration drift, enabling real-time perception systems to operate robustly without repeated recalibration.

1. Introduction

Sensor fusion is a fundamental technique in robotics and autonomous systems that combines data from multiple heterogeneous sensors to achieve a more comprehensive and accurate perception of the surrounding environment. Among various sensor combinations, the fusion of Light Detection and Ranging (LiDAR) and camera data has gained significant attention due to their complementary characteristics—LiDAR provides precise geometric and depth information, while cameras capture rich semantic and visual context. This multimodal fusion enhances environmental understanding, leading to improved object detection, scene interpretation, and overall system reliability, which are crucial for Advanced Driver Assistance Systems (ADAS) and autonomous navigation [1,2,3,4].
Numerous fusion strategies have been developed to optimize the integration of LiDAR and camera data, including fully convolutional neural networks [2], deep fusion architectures [5], and segmentation-based approaches [6,7,8]. These algorithms aim to address challenges in data alignment, improve robustness to noise, and ensure real-time performance in dynamic environments. However, the effectiveness of such methods critically depends on the accuracy of sensor calibration, that is, the process of determining the spatial and temporal relationships between the sensors.
Traditional calibration techniques often rely on manual procedures using checkerboards or custom-designed targets [9,10,11,12,13]. While these methods can achieve high precision in controlled environments, they require repetitive labor, are sensitive to installation errors, and are impractical for large-scale or long-term deployments. Moreover, calibration parameters are prone to drift over time due to environmental factors such as temperature variation, vibration, and sensor wear, degrading fusion performance unless periodic recalibration is performed [14].
To address these limitations, recent research has explored targetless calibration methods that extract features from natural scenes—such as edges, planes, or point correspondences—to estimate the extrinsic transformation between sensors [14,15,16]. However, these methods can still struggle under severe noise, occlusion, or perspective distortion. Pattern matching-based approaches represent a promising alternative by aligning geometric structures such as contours, convex hulls, or projected keypoints between modalities. Some studies apply mutual information or optimization-based matching [15], while others incorporate learned correspondences through deep models [16].
Singular value decomposition (SVD)-based methods have emerged as a mathematically robust and interpretable tool for estimating rigid transformations from point sets. The Kabsch algorithm, based on SVD, has been widely adopted in registration and pose estimation tasks due to its closed-form optimality under least-squares error [17,18,19,20]. When combined with robust feature extraction, this technique offers a powerful framework for aligning heterogeneous sensor data.
In this paper, we introduce a novel pattern matching algorithm for LiDAR–camera fusion that leverages the strengths of SVD and iterative refinement through gradient descent. Rather than directly re-estimating intrinsic or extrinsic parameters, our method aligns projected 3D point cloud features with 2D image contours by matching spatial patterns. This strategy allows us to correct sensor misalignment without requiring calibration targets, enabling efficient, repeatable fusion correction in real-world autonomous systems. Experimental results demonstrate that our method significantly improves alignment under varying environmental and calibration drift conditions.

2. Methodology

2.1. Sensor Setup and Calibration

For development and testing, we employ a real-world sensing platform consisting of a Chevrolet (Detroit, MI, USA) Cruise vehicle equipped with an Ouster OS1-64 LiDAR (Ouster, Inc., San Francisco, CA, USA) mounted on the roof and a Logitech RGB camera (Logitech, Lausanne, Switzerland) positioned on the windshield (see Figure 1). To facilitate algorithm evaluation and simulation under controlled conditions, a simulation of the vehicle was constructed in the Robot Operating System (ROS) environment (ROS Noetic, Ubuntu 20.04). This virtual model preserves the real vehicle’s geometry and sensor placement, enabling reliable reproduction of sensor data streams and frame transformations.
As illustrated in Figure 2, the transform (TF) tree of the vehicle model defines the spatial relationships between the vehicle body and its mounted sensors using six degrees of freedom—roll, pitch, yaw, and x, y, z translations. The base_footprint frame represents the vehicle’s reference point on the ground, typically defined at the geometric center between the rear wheels. The vehicle_cam frame, corresponding to the camera’s physical location, is defined as a child frame relative to the LiDAR frame.
As illustrated in Figure 3, a checkerboard calibration target with two cut-out holes was used to evaluate the alignment between LiDAR and camera data. Within the ROS RViz environment, both the LiDAR point cloud and the vehicle’s TF tree are visualized, demonstrating the spatial consistency between the 3D LiDAR data and the 2D image plane. The resulting point cloud captures distinct environmental features such as the ground surface, vertical light poles, and the calibration board itself, verifying the accuracy of the fusion setup.
To project LiDAR point cloud data onto the 2D image plane, we apply a standard extrinsic and intrinsic calibration procedure. The camera intrinsics (focal length, principal point, and distortion coefficients) and the extrinsic transformation $[R\,|\,t]$ between the LiDAR and camera coordinate frames were determined using a checkerboard target and validated in ROS. This transformation allows 3D LiDAR points $(x, y, z)$ to be mapped onto 2D image coordinates $(u, v)$ using the following projection model:
$$P = K\,[R\,|\,t]$$
where $K$ is the camera intrinsic matrix. This projection mapping enables spatial comparison between LiDAR and camera features, which is essential for the proposed SVD-based pattern matching algorithm. We employed the YOLOv5s architecture for object detection, consisting of a CSPDarknet53 backbone, a PANet neck, and YOLO head layers. The model was pretrained on the COCO dataset (80 object categories, 118k images) and fine-tuned on a custom dataset of 3000 manually labeled pedestrian images captured under urban driving conditions. The input resolution was 640 × 480 pixels, and training used a stochastic gradient descent optimizer with a learning rate of 0.01 and a batch size of 16. The confidence threshold for detection was set to 0.4 during inference. As shown in Figure 4, the CNN detection result from the camera is overlaid with the projected LiDAR point cloud on the 2D image.
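As a concrete illustration of this projection model, the following Python sketch maps LiDAR points into pixel coordinates under a pinhole camera assumption; the intrinsic and extrinsic values shown are placeholders for illustration only, not the calibration values used on the test vehicle.

```python
# Illustrative sketch of the LiDAR-to-image projection P = K [R | t]
# (placeholder calibration values, not the vehicle's actual parameters).
import numpy as np

def project_lidar_to_image(points_xyz, K, R, t):
    """Map Nx3 LiDAR points (x, y, z) to Nx2 pixel coordinates (u, v)."""
    pts_cam = points_xyz @ R.T + t            # transform into the camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]      # keep points in front of the camera
    uv_hom = pts_cam @ K.T                    # pinhole model: [u, v, w]^T = K X_cam
    return uv_hom[:, :2] / uv_hom[:, 2:3]     # perspective divide

K = np.array([[600.0,   0.0, 320.0],          # assumed focal lengths and principal point
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.array([0.0, -0.2, 0.1])  # assumed LiDAR-to-camera extrinsics
uv = project_lidar_to_image(np.random.rand(100, 3) * 10.0, K, R, t)
```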

2.2. Pattern Matching Algorithm

The proposed algorithm projects LiDAR point clouds onto the camera image plane, followed by the extraction of object contours and convex hulls from both modalities to capture their geometric features. Singular value decomposition (SVD) is applied to estimate the initial transformation parameters, including rotation, translation, and scale, that align the two datasets. This estimate is further refined through gradient descent (GD) optimization, which minimizes a cost function based on the residual geometric discrepancy. The combined SVD-GD approach enables robust alignment correction under varying degrees of calibration drift and sensor misalignment. Figure 5 illustrates the data preprocessing pipeline, including CNN-based object detection, LiDAR projection onto the image plane, and contour and convex hull extraction from both modalities. The extracted features are then used for SVD-based alignment and subsequent GD refinement.
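A minimal sketch of this feature-extraction stage using OpenCV is given below; it assumes a binary object mask from the CNN detector and a set of projected LiDAR pixels, and the function names are illustrative rather than the authors' implementation.

```python
# Sketch of contour and convex hull extraction for both modalities
# (assumes a binary detection mask and projected LiDAR pixel coordinates).
import cv2
import numpy as np

def camera_contour_and_hull(mask):
    """Largest external contour and its convex hull from a binary detection mask."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)   # keep the dominant object
    hull = cv2.convexHull(contour)
    return contour.reshape(-1, 2), hull.reshape(-1, 2)

def lidar_hull(projected_uv):
    """Convex hull of projected LiDAR points (Nx2 pixel coordinates)."""
    pts = projected_uv.astype(np.float32).reshape(-1, 1, 2)
    return cv2.convexHull(pts).reshape(-1, 2)
```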
The proposed pattern matching algorithm identifies and corrects sensor misalignment by minimizing discrepancies between the extracted geometric features, leading to improved fusion accuracy. By applying this algorithm, the discrepancies between the two datasets can be minimized, leading to more accurate and precise sensor fusion results.
Let the respective LiDAR and camera data at frame i be denoted as matrices p i and q i , respectively, where each matrix contains the spatial features or point correspondences captured by each sensor at that frame:
$$p_i = \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix}_i = \begin{bmatrix} p_{x_i} \\ p_{y_i} \\ p_{z_i} \end{bmatrix}, \qquad q_i = \begin{bmatrix} q_x \\ q_y \\ q_z \end{bmatrix}_i = \begin{bmatrix} q_{x_i} \\ q_{y_i} \\ q_{z_i} \end{bmatrix}$$
With reference to the p-frame, the relative transformation from the p-frame to the q-frame can be represented by a translation vector d q p and a rotation matrix R q p , where
$$d_q^p = \begin{bmatrix} x \\ y \\ z \end{bmatrix}_q^p, \qquad
R_q^p =
\begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{bmatrix}
= \begin{bmatrix}
\cos\psi\cos\theta & \cos\psi\sin\theta\sin\phi - \sin\psi\cos\phi & \cos\psi\sin\theta\cos\phi + \sin\psi\sin\phi \\
\sin\psi\cos\theta & \sin\psi\sin\theta\sin\phi + \cos\psi\cos\phi & \sin\psi\sin\theta\cos\phi - \cos\psi\sin\phi \\
-\sin\theta & \cos\theta\sin\phi & \cos\theta\cos\phi
\end{bmatrix}$$
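For reference, the rotation above can be assembled numerically as the product $R_z(\psi) R_y(\theta) R_x(\phi)$; the short sketch below is a generic implementation of this convention, not code taken from the paper.

```python
# Generic yaw-pitch-roll rotation matrix R = Rz(psi) @ Ry(theta) @ Rx(phi),
# matching the convention in the equation above.
import numpy as np

def rotation_zyx(psi, theta, phi):
    c, s = np.cos, np.sin
    Rz = np.array([[c(psi), -s(psi), 0.0], [s(psi), c(psi), 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[c(theta), 0.0, s(theta)], [0.0, 1.0, 0.0], [-s(theta), 0.0, c(theta)]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, c(phi), -s(phi)], [0.0, s(phi), c(phi)]])
    return Rz @ Ry @ Rx
```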
The forward kinematics relationship between the position vector p and q is given by
$$p_i = d_q^p + R_q^p\, q_i$$
If a collection of m data vectors is obtained at each frame, the dataset can be represented as a matrix, where each column (or row) corresponds to a single data vector captured at a specific frame.
$$P = \left[\, p_1 \;\; p_2 \;\; \cdots \;\; p_m \,\right], \qquad Q = \left[\, q_1 \;\; q_2 \;\; \cdots \;\; q_m \,\right]$$
Then, the matrix relationship is as follows:
$$P = D_q^p + R_q^p\, Q, \qquad D_q^p = d_q^p\, \mathbf{1}_{1 \times m}$$
The relationship between the means of the data is similarly expressed as
$$\bar{p} = d_q^p + R_q^p\, \bar{q}, \qquad \text{where} \quad \bar{p} = \frac{1}{m} \sum_{i=1}^{m} p_i, \quad \bar{q} = \frac{1}{m} \sum_{i=1}^{m} q_i$$
Let the matrix H be defined as the product of the centered data matrices:
$$H = \left(P - \bar{P}\right)\left(Q - \bar{Q}\right)^{\top}$$
where $P - \bar{P}$ represents the deviation of the LiDAR data matrix $P$ from its mean $\bar{P} = \bar{p}\,\mathbf{1}_{1 \times m}$, and $\left(Q - \bar{Q}\right)^{\top}$ represents the transpose of the deviation of the camera data matrix $Q$ from its mean $\bar{Q} = \bar{q}\,\mathbf{1}_{1 \times m}$.
To extract the rotational relationship between the two sets of observations, we apply singular value decomposition (SVD) to H , yielding
$$H = U S V^{\top}$$
where U and V are unitary matrices in the minimal singular value decomposition, and S is a diagonal matrix of singular values of H . This decomposition allows for the estimation of the optimal rotation matrix that aligns the LiDAR and camera data in a least-squares context.
Using the forward kinematics relationship, $H$ can be expressed as
$$H = \left( D_q^p + R_q^p Q - \big( D_q^p + R_q^p \bar{Q} \big) \right)\left(Q - \bar{Q}\right)^{\top} = R_q^p \left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}$$
Applying a similarity transformation to the symmetric matrix $\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}$ yields the following relationship:
$$H = R_q^p\, T_{QQ}\, \Lambda_{QQ}\, T_{QQ}^{\top}, \qquad \Lambda_{QQ} = \mathrm{diag}\!\left(\lambda_j\!\left[\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}\right]\right), \quad j = 1, 2, 3, \quad \lambda_1 \ge \lambda_2 \ge \lambda_3$$
where $\lambda_j$ denotes an eigenvalue of $\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}$ and $T_{QQ}$ is its modal (eigenvector) matrix, satisfying $T_{QQ}^{-1} = T_{QQ}^{\top}$, i.e., $T_{QQ} T_{QQ}^{\top} = I$.
We decompose H using the SVD algorithm:
$$H = U S V^{\top}$$
where $S = \mathrm{diag}\!\left(\sqrt{\lambda_j\!\left(H^{\top} H\right)}\right)$, $j = 1, 2, 3$, $\lambda_1 \ge \lambda_2 \ge \lambda_3$, and $U^{\top} U = U U^{\top} = I$, $V^{\top} V = V V^{\top} = I$.
It can be observed that
$$H^{\top} H = \left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top} \left(R_q^p\right)^{\top} R_q^p \left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top} = \left[\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}\right]^2$$
As a result, we find that
$$S = \mathrm{diag}\!\left(\sqrt{\lambda_j\!\left(H^{\top} H\right)}\right) = \mathrm{diag}\!\left(\sqrt{\lambda_j\!\left(\left[\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}\right]^2\right)}\right) = \mathrm{diag}\!\left(\lambda_j\!\left[\left(Q - \bar{Q}\right)\left(Q - \bar{Q}\right)^{\top}\right]\right) = \Lambda_{QQ}, \quad j = 1, 2, 3$$
From the relationship $H = R_q^p\, T_{QQ}\, \Lambda_{QQ}\, T_{QQ}^{\top} = U S V^{\top}$, one finds that
$$U = R_q^p\, T_{QQ}, \qquad V = T_{QQ}$$
Accordingly, the rotation matrix that aligns the LiDAR coordinate frame with the camera coordinate frame is given by:
$$R_q^p = U V^{\top}$$
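The SVD alignment step can be summarized in a few lines of NumPy. The sketch below is a generic Kabsch-style estimate consistent with the derivation above, including the standard reflection guard; it is not the authors' exact implementation.

```python
# Kabsch-style rotation and translation estimate from corresponding point sets,
# following H = (P - P_bar)(Q - Q_bar)^T and R = U V^T as derived above.
import numpy as np

def estimate_rigid_transform_svd(P, Q):
    """P, Q: 3xm arrays of corresponding LiDAR and camera points."""
    P_bar = P.mean(axis=1, keepdims=True)
    Q_bar = Q.mean(axis=1, keepdims=True)
    H = (P - P_bar) @ (Q - Q_bar).T            # 3x3 cross-covariance matrix
    U, S, Vt = np.linalg.svd(H)                # H = U diag(S) V^T
    R = U @ Vt                                 # rotation aligning the q-frame to the p-frame
    if np.linalg.det(R) < 0:                   # guard against a reflection (det = -1)
        U[:, -1] *= -1
        R = U @ Vt
    d = P_bar - R @ Q_bar                      # translation from p_i = d + R q_i
    return R, d.ravel()
```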
Once the rotation matrix describing the orientation between the LiDAR and camera data is obtained, it can be used as the initial input to a gradient descent optimization function. This iterative method refines both the rotation matrix and the translation (scale) vector by minimizing a defined cost function, thereby improving the overall calibration accuracy. Upon convergence, the optimized parameters are used to update the transformation tree between the LiDAR and camera frames, resulting in a more precise estimate of their relative pose.
The designed gradient descent state is presented as
$$\theta = \left[\, p_{x_i},\; p_{y_i},\; s,\; \mathrm{radMaj}_b \,\right]$$
Here, $p_{x_i}$ and $p_{y_i}$ represent the LiDAR data points, which can be substituted with camera data as needed. The parameter $s$ denotes the scale factor between the LiDAR and camera data, which is initially set to 1. The major rotation angle, $\mathrm{radMaj}_b$, is computed from the singular value decomposition (SVD) result as follows:
$$\mathrm{radMaj}_b = \operatorname{atan2}\!\big(V(2,1),\; V(1,1)\big)$$
where V is the right singular matrix obtained from the decomposition of the correlation matrix between the LiDAR and camera point sets.
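Since numpy.linalg.svd returns $V^{\top}$ rather than $V$, the angle can be recovered as in the small illustrative helper below (an assumption about the implementation, not the authors' code).

```python
# Major rotation angle radMajb = atan2(V(2,1), V(1,1)) from the SVD of H.
import numpy as np

def major_rotation_angle(H):
    _, _, Vt = np.linalg.svd(H)   # NumPy returns V transposed
    V = Vt.T
    return np.arctan2(V[1, 0], V[0, 0])
```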
The gradient descent method is a first-order iterative optimization algorithm used to minimize a cost function J ( θ ) with respect to its parameters θ . At each iteration, the parameters are updated in the opposite direction of the gradient of the cost function, aiming to find the local minimum. The parameter update rule is given by
$$\theta_{\mathrm{new}} = \theta_{\mathrm{old}} - \gamma\, \nabla J(\theta)$$
where $\theta$ is the parameter vector to be optimized (e.g., rotation angle, scale factor, and translation), $\gamma > 0$ is the learning rate (step size), $\nabla J(\theta)$ is the gradient of the cost function with respect to $\theta$, and $J(\theta)$ is the cost function, typically defined as the mean squared error between the transformed LiDAR data and the corresponding camera data.
The gradient of the cost function $J(\theta)$, denoted as $\nabla J(\theta)$, is derived to guide the optimization process in the gradient descent algorithm. It is expressed as follows:
$$\nabla J(\theta) = \Big[\, 2 q_1 \big(p_{x_i} - \bar{p}_{x_i}\big),\;\; 2 q_2 \big(p_{y_i} - \bar{p}_{y_i}\big),\;\; 2 q_3 \big(\mathrm{radMaj}_a - s \cdot \mathrm{radMaj}_a\big) \cdot s,\;\; 2 q_4 \big(\mathrm{radMaj}_a - \mathrm{radMaj}_b\big) \,\Big]$$
where $p_{x_i}$ and $p_{y_i}$ are the LiDAR data coordinates of the $i$-th point, $\bar{p}_{x_i}$ and $\bar{p}_{y_i}$ are the corresponding mean values of the LiDAR data, $s$ is the scale factor between the LiDAR and camera data (initially $s = 1$), $\mathrm{radMaj}_a$ and $\mathrm{radMaj}_b$ are the major rotation angles extracted from the SVD-based pattern alignment, and $q = [q_1, q_2, q_3, q_4]$ is a tuning vector used to scale the gradient components individually.
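A simplified illustration of the refinement stage is shown below. It applies the update rule $\theta_{\mathrm{new}} = \theta_{\mathrm{old}} - \gamma \nabla J(\theta)$ with a numerically estimated gradient over a reduced state (2D translation, scale, and rotation angle); the plain mean-squared pixel error used here stands in for the weighted cost defined above, so the sketch should be read as an illustration under stated assumptions rather than the paper's implementation.

```python
# Gradient-descent refinement sketch: theta = [tx, ty, s, angle] is updated with
# a central-difference gradient of a mean-squared pixel-error cost between
# corresponding LiDAR and camera hull points (illustrative cost, not the paper's J).
import numpy as np

def transform(points, theta):
    tx, ty, s, a = theta
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return s * (points @ R.T) + np.array([tx, ty])

def cost(theta, lidar_pts, cam_pts):
    return np.mean(np.sum((transform(lidar_pts, theta) - cam_pts) ** 2, axis=1))

def refine_gd(theta0, lidar_pts, cam_pts, gamma=1e-3, iters=500, eps=1e-6):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for k in range(theta.size):            # numerical central-difference gradient
            dp = np.zeros_like(theta)
            dp[k] = eps
            grad[k] = (cost(theta + dp, lidar_pts, cam_pts)
                       - cost(theta - dp, lidar_pts, cam_pts)) / (2.0 * eps)
        theta -= gamma * grad                  # theta_new = theta_old - gamma * grad J
    return theta
```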

3. Experimental Results

3.1. Test Scenarios

To evaluate the robustness and performance of the proposed SVD-GD pattern matching algorithm, controlled misalignments were manually introduced into the system by modifying the extrinsic calibration parameters. These included rotational offsets, translation shifts, scale variations, and camera distortions, simulating practical calibration drift scenarios that may occur during long-term deployment in autonomous systems. The algorithm was tested across five representative cases. All experimental figures in this section were generated using MATLAB R2023a.

3.1.1. General Misalignment Correction (Figure 6)

In this general case, random misalignments were introduced to simulate minor calibration drift. The algorithm iteratively refined rotation, translation, and scale parameters to minimize geometric discrepancies between LiDAR and camera detections. The blue outlines represent the initial LiDAR projections, the red outlines show CNN-based camera object detections, and the pink contours illustrate intermediate stages during optimization, demonstrating convergence toward accurate alignment.
Figure 6. Iterative correction process of the SVD-GD algorithm. Blue: initial LiDAR projection; red: camera-based object detection; pink: intermediate results during optimization. The green and purple coordinate axes visualize the camera and LiDAR coordinate frames, respectively.

3.1.2. Checkerboard Misalignment (Figure 7)

A controlled checkerboard target with known rotational and translational offsets was used to simulate sensor misalignment. The algorithm successfully corrected these errors, aligning the LiDAR projection with the camera-detected features, despite initial calibration offsets.
Figure 7. Correction under checkerboard misalignment scenario. Blue: initial LiDAR projection; red: camera detection; pink: intermediate results during optimization.

3.1.3. Camera Distortion (Figure 8)

In this scenario, simulated camera lens distortion introduced additional projection errors. The algorithm effectively compensated for both geometric and projection misalignments by adjusting rotation, translation, and scale concurrently.
Figure 8. Correction under camera distortion scenario. Blue: initial LiDAR projection; red: camera detection; pink: intermediate results during optimization.

3.1.4. LiDAR Mounting Drift (Figure 9)

To simulate mechanical mounting shifts, positional offsets were applied to the LiDAR sensor. The algorithm successfully recovered the correct transformation despite significant positional drifts between the two modalities.
Figure 9. Correction under LiDAR mounting drift scenario. Blue: initial LiDAR projection; red: camera detection; pink: intermediate results during optimization.

3.1.5. Extreme Misalignment (Figure 10)

In this most challenging case, large rotational, translational, and scaling errors were introduced, including severe orientation shifts. The algorithm demonstrated strong resilience by converging to the correct transformation parameters even under extreme calibration errors.
Figure 10. Correction under extreme misalignment scenario. Blue: initial LiDAR projection; red: camera detection; pink: intermediate results during optimization.

3.2. Quantitative Evaluation

To quantitatively evaluate the effectiveness of the proposed SVD-GD pattern matching algorithm, we measured the alignment accuracy across several real-world scenarios, including calibration drift, camera distortion, and extreme sensor misalignment. The metric used is the average 2D pixel distance or angular deviation between the projected LiDAR data and corresponding camera features, both before and after optimization. Additionally, the runtime per frame was recorded to assess computational efficiency. Table 1 summarizes the improvement in alignment accuracy and runtime performance for each test case.
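For reference, the reported metric can be computed as the mean Euclidean pixel distance between corresponding projected LiDAR and camera features before and after optimization; the short sketch below uses hypothetical array names and is not the evaluation code behind Table 1.

```python
# Alignment-error metric sketch: average 2D pixel distance and percentage improvement.
import numpy as np

def mean_pixel_error(lidar_uv, cam_uv):
    """Average Euclidean distance (pixels) between corresponding Nx2 point sets."""
    return float(np.mean(np.linalg.norm(lidar_uv - cam_uv, axis=1)))

def improvement_percent(err_before, err_after):
    return 100.0 * (err_before - err_after) / err_before
```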

4. Discussion

While the proposed SVD-GD pattern matching algorithm demonstrates strong alignment accuracy across multiple misalignment scenarios, several factors influencing sensor misalignment remain. Even though LiDAR sensors provide accurate depth measurements, practical deployment introduces relative errors between LiDAR and camera sensors due to physical installation tolerances, mounting vibrations, mechanical deformations, and thermal expansion during long-term operation. These factors result in a gradual drift of the extrinsic parameters, which can degrade fusion performance over time.
The current method relies on the availability of clear geometric features such as object contours, making it most effective when distinct objects, such as pedestrians or vehicles, are visible in the scene. Performance may degrade under severe occlusions, in highly cluttered environments, or when object detections are ambiguous. In addition, while the algorithm operates efficiently for single-object alignment (~0.4 s per frame), further optimization will be required for multi-object scenarios or real-time processing at high frame rates. Despite these challenges, the proposed method offers a practical and adaptive solution for real-world autonomous systems, where traditional manual calibration is impractical and continuous recalibration is difficult to maintain.

5. Conclusions and Future Work

This paper presented a pattern matching algorithm that maintains alignment between LiDAR and camera sensors by extracting geometric features and applying SVD-based transformation estimation with GD refinement. The proposed targetless approach was validated across multiple real-world scenarios, including sensor drift, camera distortion, and extreme misalignments, achieving alignment improvements exceeding 85% with final alignment errors reduced to sub-pixel and sub-degree levels.
By enabling continuous correction of calibration drift, the algorithm reduces dependence on repeated manual recalibration, offering long-term stability for autonomous systems operating in changing environments. Future work will focus on extending the method to handle multiple objects, improving robustness under occlusions, and integrating real-time adaptive calibration frameworks for autonomous driving applications.

Author Contributions

Conceptualization, K.T. and K.C.C.; Methodology, K.T., K.C.C., M.R. and C.C.; Software, K.T. and M.S.; Validation, K.C.C., M.R. and K.K.; Writing—original draft, K.T.; Writing—review & editing, M.S. and C.C.; Visualization, M.S. and C.C.; Supervision, K.C.C. and K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Project of the Jilin Province Science and Technology Development Plan (20230202038NC) and the High-end Foreign Experts Introduction Project of the Foreign Experts Bureau of the Department of Science and Technology of Jilin Province (L202501, S202405).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors sincerely thank the reviewers and editors for their valuable suggestions, which helped improve the quality of this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LiDAR: Light Detection and Ranging
CNN: Convolutional Neural Network
SVD: Singular Value Decomposition
GD: Gradient Descent
ROS: Robot Operating System
PCD: Point Cloud Data
YOLO: You Only Look Once (object detection model)

References

  1. Zhang, F.; Clarke, D.; Knoll, A. Vehicle detection based on LiDAR and camera fusion. In Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), Qingdao, China, 8–11 October 2014; pp. 1620–1625. [Google Scholar]
  2. Caltagirone, L.; Bellone, M.; Svensson, L.; Wahde, M. LIDAR–camera fusion for road detection using fully convolutional neural networks. Robot. Auton. Syst. 2019, 111, 125–131. [Google Scholar] [CrossRef]
  3. Zhao, X.; Sun, P.; Xu, Z.; Min, H.; Yu, H. Fusion of 3D LIDAR and camera data for object detection in autonomous vehicle applications. IEEE Sens. J. 2020, 20, 4901–4913. [Google Scholar] [CrossRef]
  4. Zhong, H.; Wang, H.; Wu, Z.; Zhang, C.; Zheng, Y.; Tang, T. A survey of LiDAR and camera fusion enhancement. Procedia Comput. Sci. 2021, 183, 579–588. [Google Scholar] [CrossRef]
  5. Li, Y.; Yu, A.W.; Meng, T.; Caine, B.; Ngiam, J.; Peng, D.; Shen, J.; Lu, Y.; Zhou, D.; Le, Q.V.; et al. Deepfusion: Lidar-camera deep fusion for multi-modal 3d object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17182–17191. [Google Scholar]
  6. Zhao, L.; Zhou, H.; Zhu, X.; Song, X.; Li, H.; Tao, W. Lif-seg: Lidar and camera image fusion for 3d lidar semantic segmentation. IEEE Trans. Multimed. 2023, 26, 1158–1168. [Google Scholar] [CrossRef]
  7. Berrio, J.S.; Shan, M.; Worrall, S.; Nebot, E. Camera-LIDAR integration: Probabilistic sensor fusion for semantic mapping. IEEE Trans. Intell. Transp. Syst. 2021, 23, 7637–7652. [Google Scholar] [CrossRef]
  8. Li, J.; Zhang, X.; Li, J.; Liu, Y.; Wang, J. Building and optimization of 3D semantic map based on Lidar and camera fusion. Neurocomputing 2020, 409, 394–407. [Google Scholar] [CrossRef]
  9. Huang, J.K.; Grizzle, J.W. Improvements to target-based 3D LiDAR to camera calibration. IEEE Access 2020, 8, 134101–134110. [Google Scholar] [CrossRef]
  10. Pusztai, Z.; Hajder, L. Accurate calibration of LiDAR-camera systems using ordinary boxes. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy, 22–29 October 2017; pp. 394–402. [Google Scholar]
  11. Veľas, M.; Španěl, M.; Materna, Z.; Herout, A. Calibration of rgb camera with velodyne lidar. J. WSCG 2014, 2014, 135–144. [Google Scholar]
  12. Grammatikopoulos, L.; Papanagnou, A.; Venianakis, A.; Kalisperakis, I.; Stentoumis, C. An effective camera-to-LiDAR spatiotemporal calibration based on a simple calibration target. Sensors 2022, 22, 5576. [Google Scholar] [CrossRef] [PubMed]
  13. Tsai, D.; Worrall, S.; Shan, M.; Lohr, A.; Nebot, E. Optimising the selection of samples for robust lidar camera calibration. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 2631–2638. [Google Scholar]
  14. Li, X.; Xiao, Y.; Wang, B.; Ren, H.; Zhang, Y.; Ji, J. Automatic targetless LiDAR–camera calibration: A survey. Artif. Intell. Rev. 2023, 56, 9949–9987. [Google Scholar] [CrossRef]
  15. Li, X.; Duan, Y.; Wang, B.; Ren, H.; You, G.; Sheng, Y.; Ji, J.; Zhang, Y. Edgecalib: Multi-frame weighted edge features for automatic targetless lidar-camera calibration. IEEE Robot. Autom. Lett. 2024, 9, 10073–10080. [Google Scholar] [CrossRef]
  16. Tan, Z.; Zhang, X.; Teng, S.; Wang, L.; Gao, F. A Review of Deep Learning-Based LiDAR and Camera Extrinsic Calibration. Sensors 2024, 24, 3878. [Google Scholar] [CrossRef] [PubMed]
  17. Yamada, R.; Yaguchi, Y. Probability-Based LIDAR–Camera Calibration Considering Target Positions and Parameter Evaluation Using a Data Fusion Map. Sensors 2024, 24, 3981. [Google Scholar] [CrossRef] [PubMed]
  18. Bisgard, J. Analysis and Linear Algebra: The Singular Value Decomposition and Applications; American Mathematical Society: Providence, RI, USA, 2020; Volume 94. [Google Scholar]
  19. Jaradat, Y.; Masoud, M.; Jannoud, I.; Manasrah, A.; Alia, M. A tutorial on singular value decomposition with applications on image compression and dimensionality reduction. In Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman, Jordan, 14–15 July 2021; pp. 769–772. [Google Scholar]
  20. He, Y.L.; Tian, Y.; Xu, Y.; Zhu, Q.X. Novel soft sensor development using echo state network integrated with singular value decomposition: Application to complex chemical processes. Chemom. Intell. Lab. Syst. 2020, 200, 103981. [Google Scholar] [CrossRef]
Figure 1. Vehicle platform and ROS simulation.
Figure 2. The vehicle and sensor system transfer frame tree.
Figure 3. LiDAR and camera manual calibration. The left image shows the LiDAR point cloud in the 3D coordinate frame, while the right image displays the LiDAR points (green dots) projected onto the camera image.
Figure 4. Camera CNN detection result with LiDAR point cloud projected onto the 2D image.
Figure 5. Processing pipeline of the proposed pattern matching algorithm. Top-left: 2D camera image with classification results. Middle-left: projection of 3D LiDAR point cloud onto the image. Bottom-left: overlaid contours of the camera image and LiDAR projection. Top-right: object extraction from the camera image. Middle-right: contour and convex hull of 2D LiDAR projection. Bottom-right: convex hulls from both camera and LiDAR data.
Table 1. Alignment accuracy improvement and runtime performance across test scenarios.
Scenario                          | Initial Misalignment | Residual Error After SVD-GD | Improvement (%) | Runtime per Frame (s)
Checkerboard Scene (Figure 7)     | 5.3 px               | 0.8 px                      | 84.9            | 0.34
Camera Distortion (Figure 8)      | 7.2 px               | 1.1 px                      | 84.7            | 0.39
LiDAR Drift on Highway (Figure 9) | 10.0° rotation       | 1.2° rotation               | 88.0            | 0.42
Extreme Misalignment (Figure 10)  | 45° + scale shift    | 2.5° / 3.5 px               | 93.5 (avg)      | 0.47
