Article

Knowledge-Guided Symbolic Regression for Interpretable Camera Calibration

by
Rui Pimentel de Figueiredo
Department of Mechanical and Production Engineering, Aarhus University, 8200 Aarhus, Denmark
J. Imaging 2025, 11(11), 389; https://doi.org/10.3390/jimaging11110389 (registering DOI)
Submission received: 26 September 2025 / Revised: 25 October 2025 / Accepted: 31 October 2025 / Published: 2 November 2025
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)

Abstract

Calibrating cameras accurately requires the identification of projection and distortion models that effectively account for lens-specific deviations. Conventional formulations, like the pinhole model or radial–tangential corrections, often struggle to represent the asymmetric and nonlinear distortions encountered in complex environments such as autonomous navigation, robotics, and immersive imaging. Although neural methods offer greater adaptability, they demand extensive training data, are computationally intensive, and often lack transparency. This work introduces a symbolic model discovery framework guided by physical knowledge, where symbolic regression and genetic programming (GP) are used in tandem to identify calibration models tailored to specific optical behaviors. The approach incorporates a broad class of known distortion models, including Brown–Conrady, Mei–Rives, Kannala–Brandt, and double-sphere, as modular components, while remaining extensible to any predefined or domain-specific formulation. Embedding these models directly into the symbolic search process constrains the solution space, enabling efficient parameter fitting and robust model selection without overfitting. Through empirical evaluation across a variety of lens types, including fisheye, omnidirectional, catadioptric, and traditional cameras, we show that our method produces results on par with or surpassing those of established calibration techniques. The outcome is a flexible, interpretable, and resource-efficient alternative suitable for deployment scenarios where calibration data are scarce or computational resources are constrained.

1. Introduction

Camera calibration is a foundational task in computer vision, robotics, and photogrammetry, where accurately modeling a camera’s intrinsic parameters directly impacts tasks such as 3D reconstruction, sensor fusion, and visual localization. A longstanding challenge in this area is selecting the most appropriate distortion model to account for the nonlinear and often asymmetric behavior of lenses, particularly wide-angle, fisheye, or catadioptric optics.
Traditional calibration techniques rely on the pinhole projection model combined with parametric distortion formulations such as the Brown–Conrady or radial–tangential model [1,2]. While effective in many settings, these models are limited in expressivity and may underperform when dealing with unconventional optics or distorted imaging environments. More expressive approaches, such as high-order polynomials [3], inverse distortion models [4], and deep neural networks [5], offer improved accuracy but introduce significant computational overhead, rely on large labeled datasets, and typically lack interpretability.

1.1. Motivation and Contributions

In this work, we introduce a symbolic regression framework for automatic model discovery in camera calibration. Our approach uses genetic programming (GP) to evolve symbolic expressions that describe lens distortion, while constraining the search using a grammar informed by domain knowledge. Specifically, the symbolic search space is populated not with generic mathematical operators alone but with physically meaningful model components, such as Brown–Conrady, Mei–Rives, Kannala–Brandt, and double-sphere, which have well-understood behaviors in optical systems. This grammar-guided design offers several advantages, particularly in maintaining interpretability despite the complexity of the discovered models. Each model component corresponds to a well-known distortion model with clear physical meaning. The modularity of the approach allows for the composition of more complex distortions while ensuring the transparency and traceability of each component. In particular, this design
  • restricts the solution space to interpretable, physically plausible models, reducing the risk of overfitting;
  • retains flexibility to discover new hybrid models by composing known distortions;
  • enables closed-form, differentiable expressions that can be refined via traditional nonlinear optimization (e.g., Levenberg–Marquardt [6]).

1.2. Summary of Contributions

This paper contributes the following:
  • A symbolic regression framework for intrinsic calibration, combining GP-based model discovery with domain-specific symbolic grammars;
  • An extensible model library incorporating classical and modern distortion formulations;
  • An empirical evaluation across diverse simulated lenses, demonstrating competitive or superior reprojection accuracy versus standard models.
By integrating physical priors into symbolic model discovery, our method offers a transparent, flexible, and computationally viable alternative to both rigid parametric and opaque black-box approaches.
To clearly isolate the potential contribution of symbolic regression for camera model discovery, we restrict our experiments to intrinsic calibration using synthetic and noiseless data. In order to ensure model generalization beyond the training data, we employ Monte Carlo cross-validation with a strict train/test split. This approach ensures that models are evaluated on unseen poses, mitigating the risk of overfitting. Additionally, model selection favors not only low reprojection errors but also parsimony by incorporating a complexity-penalized fitness function. This helps to prevent overly complex models that may only fit the training data well, improving their generalization across different pose sets. We assume a small set of known projection and distortion models and do not consider extrinsic parameters during model search. This setup avoids the added complexity and computational cost of full calibration and serves as an intermediate step toward a broader framework for the automatic discovery of novel, real-world distortion models, with finer primitive sets, ultimately composed of basic mathematical operations.

2. Related Work

Camera calibration plays a critical role in various fields. In robotics, it enhances simultaneous localization and mapping (SLAM), object manipulation, and sensor data fusion [7,8,9]. For autonomous vehicles, calibration supports tasks such as lane detection, obstacle recognition, and multicamera system alignment [10,11]. In augmented and virtual reality, it ensures accurate spatial alignment between virtual objects and the physical environment [12]. Additionally, in 3D reconstruction and photogrammetry, calibration is essential in producing precise metric models from images [13].
Recent developments include nonparametric and data-driven approaches like Gaussian process calibration [14] and deep learning-based camera modeling [5]. However, there is limited focus on methods that discover interpretable, equation-based models. This work advances the field by leveraging symbolic regression for automated model selection in camera calibration, providing a flexible and transparent alternative to both traditional parametric and black-box techniques.
Traditional calibration relies on the pinhole camera model combined with parametric distortion models such as the Brown–Conrady radial–tangential model [1,2,15]. While effective in many cases, these models often fail to capture complex distortions from fisheye or low-cost wide-angle lenses. More flexible models, including high-order polynomials [3], omnidirectional models [16], and inverse distortion approaches [4], offer improved accuracy but at the cost of increased complexity and sensitivity to data quality.
Deep learning-based methods [5] have also been introduced for distortion correction and model estimation. However, they typically require large datasets and lack interpretability, making them less suitable in data-scarce or safety-critical environments.
Selecting the appropriate distortion model is a critical yet often manual step in calibration workflows. To address this, symbolic regression and genetic programming (GP) have emerged as promising alternatives that automatically discover mathematical models from data [17,18]. Unlike predefined parametric models, symbolic regression can generate novel, interpretable expressions tailored to specific lens characteristics. Although computationally expensive during training, the resulting models are compact, efficient, and generalizable—making them well suited for offline calibration. While symbolic regression offers a powerful mechanism for discovering new analytical models that may better characterize specific lenses, in this work, we focus on model selection, leaving model discovery for future work.

3. Background

This section reviews foundational concepts for an understanding of camera projection models, coordinate systems, and lens distortion. We cover Cartesian and homogeneous coordinates in 2D and 3D, projective geometry’s role in camera modeling, the pinhole camera model, and lens distortion models accounting for real-world imaging imperfections.

3.1. Coordinate Systems

Coordinate systems assign numerical values, or coordinates, to uniquely specify points in space. Cartesian coordinates represent points in 2D as $(x, y) \in \mathbb{R}^2$ and in 3D as $(X, Y, Z) \in \mathbb{R}^3$. Homogeneous coordinates extend these to a projective space, enabling the representation of points at infinity and simplifying transformations. A 2D point $(x, y)$ corresponds to homogeneous coordinates $(x, y, w)$ with $w \neq 0$, convertible by
$$(x, y) = \left( \frac{x}{w}, \frac{y}{w} \right).$$
Similarly, a 3D point $(X, Y, Z)$ becomes $(X, Y, Z, w)$ with conversion
$$(X, Y, Z) = \left( \frac{X}{w}, \frac{Y}{w}, \frac{Z}{w} \right).$$
This allows matrix multiplication to express affine and projective transformations.
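As a minimal sketch of the conversions above (the helper names `to_homogeneous` and `from_homogeneous` are illustrative, not taken from the article), the division by $w$ can be written as:

```python
def to_homogeneous(point, w=1.0):
    """Scale the Cartesian coordinates by w != 0 and append w."""
    if w == 0:
        raise ValueError("w must be nonzero for a finite point")
    return tuple(c * w for c in point) + (w,)


def from_homogeneous(hpoint):
    """Divide by the last coordinate to recover Cartesian coordinates."""
    *coords, w = hpoint
    if w == 0:
        raise ValueError("a point at infinity has no Cartesian representation")
    return tuple(c / w for c in coords)


# (2, 4, 2) and (1, 2, 1) describe the same 2D point.
p = from_homogeneous((2.0, 4.0, 2.0))
# A round trip with an arbitrary scale returns the original point.
q = from_homogeneous(to_homogeneous((1.0, 2.0), w=3.0))
```

Both `p` and `q` evaluate to the Cartesian point (1, 2), which illustrates that homogeneous coordinates are defined only up to a nonzero scale.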

3.2. Projective Geometry

Projective geometry studies properties that are invariant under projective transformations, which preserve straight lines but not distances or angles. It is essential in modeling perspective projection from 3D to 2D.

3.2.1. Pinhole Camera Model

A 3D point $(X, Y, Z)$ projects to a 2D image point $(u, v)$ through the camera center. First, the normalized image coordinates are
$$x = \frac{X}{Z}, \quad y = \frac{Y}{Z},$$
representing projection onto a plane at $Z = 1$. Pixel coordinates are then obtained via intrinsic parameters through a perspective transformation of the form
$$u = f_x x + c_x, \quad v = f_y y + c_y,$$
or, equivalently,
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \quad K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}.$$
Including extrinsic parameters (rotation $R$ and translation $\mathbf{t}$), the full projection is
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \, [R \mid \mathbf{t}] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}.$$
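The two-step projection (normalize by depth, then apply the intrinsics) might be sketched as follows. The function name and the focal lengths in pixels are assumptions chosen for illustration; only the principal point (320, 240) matches the virtual camera described later in the article.

```python
def project_pinhole(point_3d, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates (no distortion)."""
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    x, y = X / Z, Y / Z              # normalized coordinates on the plane Z = 1
    return fx * x + cx, fy * y + cy  # apply focal lengths and principal point


# Illustrative intrinsics: fx and fy (in pixels) are assumed values.
u, v = project_pinhole((0.1, -0.05, 0.5), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Here the normalized coordinates are (0.2, -0.1), so the point lands 100 pixels to the right of and 50 pixels above the principal point.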

3.2.2. Lens Camera Models

Real-world lenses introduce distortion that deviates from the ideal pinhole model. Lens distortion is modeled as a nonlinear function mapping the ideal (undistorted) image coordinates $\mathbf{x}_u = (x_u, y_u)$ to distorted coordinates $\mathbf{x}_d = (x_d, y_d)$:
$$\mathbf{x}_d = \phi(\mathbf{x}_u, \mathbf{d}),$$
where $\phi$ is the distortion function and $\mathbf{d}$ contains the distortion coefficients.
Brown–Conrady Model
The Brown–Conrady model [19] includes both radial and tangential distortion:
$$\begin{aligned} D &= 1 + k_1 r^2 + k_2 r^4 + k_3 r^6, \quad r = \sqrt{x_u^2 + y_u^2}, \\ x_d &= x_u D + 2 p_1 x_u y_u + p_2 (r^2 + 2 x_u^2), \\ y_d &= y_u D + 2 p_2 x_u y_u + p_1 (r^2 + 2 y_u^2), \end{aligned}$$
where $D$ represents the radial distortion factor, $r$ the radial distance from the center, and $k_i$, $p_i$ the radial and tangential distortion coefficients, respectively.
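A direct transcription of the Brown–Conrady mapping into code could look like the sketch below. The function name and default arguments are illustrative, and the coefficient values in the example are arbitrary rather than taken from the experiments.

```python
def brown_conrady(xu, yu, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Map undistorted normalized coordinates (xu, yu) to distorted ones."""
    r2 = xu * xu + yu * yu                              # squared radius r^2
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3    # radial factor D
    xd = xu * radial + 2.0 * p1 * xu * yu + p2 * (r2 + 2.0 * xu * xu)
    yd = yu * radial + 2.0 * p2 * xu * yu + p1 * (r2 + 2.0 * yu * yu)
    return xd, yd


# Mild barrel distortion (k1 < 0) pulls an off-center point toward the center.
xd, yd = brown_conrady(0.5, 0.0, k1=-0.1)
```

With all coefficients at zero the function reduces to the identity, which is a convenient sanity check when the model is used as a primitive inside a larger expression.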
Rational Distortion Model (Extended Brown–Conrady)
The rational distortion model extends the classical Brown–Conrady formulation by introducing a rational (i.e., fractional) polynomial form for the radial distortion. This model is particularly effective in modeling complex lens distortions in wide-angle and consumer-grade cameras.
It includes six radial distortion coefficients $(k_1, k_2, k_3, k_4, k_5, k_6)$ and two tangential coefficients $(p_1, p_2)$. The distorted coordinates $(x_d, y_d)$ are computed as
$$x_d = x_u \cdot \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2 p_1 x_u y_u + p_2 (r^2 + 2 x_u^2),$$
$$y_d = y_u \cdot \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y_u^2) + 2 p_2 x_u y_u.$$
This rational formulation allows for high-accuracy distortion modeling by capturing both complex radial behavior and tangential effects.
Kannala–Brandt Model
Designed for fisheye lenses, the Kannala–Brandt model [20] models the radial distortion as
$$r_d = r_u \left( 1 + k_1 r_u^2 + k_2 r_u^4 + k_3 r_u^6 + k_4 r_u^8 \right).$$
The distorted coordinates are
$$x_d = r_d \cdot \frac{x_u}{r_u}, \quad y_d = r_d \cdot \frac{y_u}{r_u},$$
where $r_u = \sqrt{x_u^2 + y_u^2}$ is the radial distance in the undistorted image.
Mei–Rives Model
The Mei–Rives model [16] combines spherical projection with a pinhole camera model. A 3D point ( X , Y , Z ) is first projected onto the unit sphere:
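$$\mathbf{p}_s = \frac{1}{\sqrt{X^2 + Y^2 + Z^2}} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}.$$
Then, it is projected to the image plane:
$$\mathbf{p}_d = \pi(\mathbf{p}_s) = K \cdot \phi(\mathbf{p}_s),$$
where $K$ is the intrinsic camera matrix and $\phi(\mathbf{p}_s)$ accounts for lens distortion.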
Equidistant Model
In the equidistant projection model, the image radius is proportional to the angle θ between the optical axis and the incoming ray:
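$$r = f \cdot \theta, \quad \theta = \arccos\left( \frac{Z}{\sqrt{X^2 + Y^2 + Z^2}} \right).$$
The corresponding image coordinates are
$$x = r \cdot \frac{X}{\sqrt{X^2 + Y^2}}, \quad y = r \cdot \frac{Y}{\sqrt{X^2 + Y^2}},$$
where $r$ is the radial distance from the center of the image.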
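The equidistant projection can be sketched in a few lines. The function name is illustrative, and the angle is computed with `atan2` instead of `arccos`; the two are equivalent for this geometry, and `atan2` is used here only because it avoids domain errors caused by rounding.

```python
import math


def project_equidistant(point_3d, f):
    """Equidistant projection: image radius r = f * theta."""
    X, Y, Z = point_3d
    rho = math.hypot(X, Y)          # distance of the point from the optical axis
    if rho == 0.0:
        return 0.0, 0.0             # a ray along the axis hits the image center
    # Angle between the ray and the optical axis; equals arccos(Z / ||(X, Y, Z)||).
    theta = math.atan2(rho, Z)
    r = f * theta
    return r * X / rho, r * Y / rho


# A ray 45 degrees off-axis lands at radius f * pi / 4.
x, y = project_equidistant((1.0, 0.0, 1.0), f=1.0)
```

Because the radius grows linearly with the angle rather than with its tangent, rays at 90 degrees or more from the axis still map to finite image points, which is why this family suits fisheye lenses.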
Double-Sphere Model
The double-sphere model [21] projects a 3D point $\mathbf{X} = (X, Y, Z)$ onto two virtual spheres and then onto the image plane. Define the projection onto the unit sphere as
$$\mathbf{p}_s = \frac{\mathbf{X}}{\sqrt{X^2 + Y^2 + Z^2}}, \quad \rho = \sqrt{p_x^2 + p_y^2}.$$
The distorted projection is then
$$\mathbf{p}_d = \frac{\mathbf{p}_s}{\alpha \rho + (1 - \alpha)(\xi + p_z)}.$$
Here, $\alpha \in [0, 1]$ represents a blending factor that controls the transition between a pinhole-like and a fisheye-like projection, while $\xi$ determines the distance between the two virtual spheres and thus the amount of distortion introduced by the model. Finally, the pixel coordinates are obtained as
$$\begin{bmatrix} u \\ v \end{bmatrix} = K \begin{bmatrix} \mathbf{p}_d \\ 1 \end{bmatrix}.$$
This model is particularly useful in robotics and SLAM systems for calibration with wide-angle lenses.

4. Methodologies

Camera intrinsic calibration involves determining the internal parameters of a camera that govern how 3D points in the world are projected onto the 2D image plane. These intrinsic parameters are fixed properties of the camera, independent of its position or orientation, and include the focal length, principal point, and distortion coefficients.
The calibration process aims to minimize the reprojection error, the difference between the observed image points and the corresponding points projected by the estimated camera model. Formally, given a set of 3D object points,
$$\mathbf{x}_i = [X_i, Y_i, Z_i]^T,$$
and their corresponding 2D image points,
$$\mathbf{p}_i = [u_i, v_i]^T,$$
the goal is to find camera parameters $P = [K, \phi(\mathbf{u}_u, \mathbf{d})]$, including intrinsic matrix $K$ and distortion coefficients $\mathbf{d}$, that minimize the sum of squared reprojection errors:
$$E = \sum_{i=1}^{N} \left\| \mathbf{p}_i - f(\mathbf{x}_i, P) \right\|^2,$$
where $f(\mathbf{x}_i, P)$ is the projection function mapping 3D points to 2D image points considering distortion.
Expanding this error into pixel coordinates,
$$E = \sum_{i=1}^{N} \left[ (u_i - u_d)^2 + (v_i - v_d)^2 \right],$$
where $(u_d, v_d)$ are the distorted projected points depending on $P$. The calibration problem is thus
$$P^* = \arg\min_{P} \sum_{i=1}^{N} \left[ (u_i - u_d(P))^2 + (v_i - v_d(P))^2 \right].$$
  • Optimization Methods
This nonlinear least-squares problem is commonly solved using the Levenberg–Marquardt algorithm [6], which iteratively updates parameters by
$$P_{\text{new}} = P_{\text{old}} - \left( J^T J + \lambda I \right)^{-1} J^T \mathbf{e},$$
where $J$ is the Jacobian matrix of residuals with respect to the parameters, $\lambda$ is a damping factor, $I$ is the identity matrix, and $\mathbf{e}$ is the vector of reprojection errors.
If the residuals are small, Gauss–Newton optimization can be used with updates
$$P_{\text{new}} = P_{\text{old}} - \left( J^T J \right)^{-1} J^T \mathbf{e}.$$
For multiview scenarios, bundle adjustment simultaneously optimizes the camera parameters and 3D points by minimizing the reprojection errors over all views, leading to more accurate calibration [22].
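To make the damped update concrete, the sketch below applies it to a toy one-dimensional camera with two parameters, a focal length `f` and a single radial coefficient `k`. The toy model, the function names, and the accept/reject schedule for the damping factor are assumptions made for illustration; they are not the calibration pipeline used in the article.

```python
def predict(x, f, k):
    """Toy 1D camera: a radial term k applied before a focal length f."""
    return f * x * (1.0 + k * x * x)


def sse(params, xs, us):
    """Sum of squared residuals for the current parameters."""
    return sum((u - predict(x, *params)) ** 2 for x, u in zip(xs, us))


def lm_step(params, xs, us, lam):
    """One update P_new = P_old - (J^T J + lam I)^-1 J^T e for two parameters."""
    f, k = params
    # Residuals e_i = observed - predicted, and the Jacobian rows d e_i / d(f, k).
    e = [u - predict(x, f, k) for x, u in zip(xs, us)]
    J = [(-x * (1.0 + k * x * x), -f * x**3) for x in xs]
    # Assemble A = J^T J + lam I and g = J^T e.
    a11 = sum(j0 * j0 for j0, _ in J) + lam
    a12 = sum(j0 * j1 for j0, j1 in J)
    a22 = sum(j1 * j1 for _, j1 in J) + lam
    g1 = sum(j0 * ei for (j0, _), ei in zip(J, e))
    g2 = sum(j1 * ei for (_, j1), ei in zip(J, e))
    # Solve the 2x2 system A d = g by Cramer's rule.
    det = a11 * a22 - a12 * a12
    d1 = (a22 * g1 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det
    return (f - d1, k - d2)


def levenberg_marquardt(params, xs, us, lam=1e-3, iters=50):
    """Accept steps that lower the error (shrinking lam); otherwise raise lam."""
    err = sse(params, xs, us)
    for _ in range(iters):
        trial = lm_step(params, xs, us, lam)
        trial_err = sse(trial, xs, us)
        if trial_err < err:
            params, err, lam = trial, trial_err, lam * 0.1
        else:
            lam *= 10.0
    return params


xs = [i / 10.0 for i in range(-5, 6)]
us = [predict(x, 500.0, -0.2) for x in xs]   # noiseless synthetic observations
f_hat, k_hat = levenberg_marquardt((400.0, 0.0), xs, us)
```

With a small damping factor the step approaches Gauss–Newton, while a large one shortens it toward a scaled gradient-descent step, which is the trade-off the two update rules above describe.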

Planar Pattern-Based Calibration

Calibration using planar patterns (e.g., checkerboards) exploits the fact that all points lie on a flat surface, typically the plane Z = 0 . This reduces the problem to a 2D-to-2D mapping, simplifying the estimation of camera parameters.
  • Projection and Homography
A 3D point x i in the calibration pattern frame projects to image point p i as
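$$s \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = K \cdot [R \mid \mathbf{t}]_v \begin{bmatrix} X_i \\ Y_i \\ Z_i \\ 1 \end{bmatrix},$$
where $K$ is the intrinsic matrix, $[R \mid \mathbf{t}]_v$ is the extrinsic transform for viewpoint $v$, and $s$ is a scale factor.
Assuming that the pattern is planar ($Z_i = 0$), this simplifies to
$$\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = K \begin{bmatrix} \mathbf{r}_1 & \mathbf{r}_2 & \mathbf{t} \end{bmatrix} \begin{bmatrix} X_i \\ Y_i \\ 1 \end{bmatrix},$$
where $\mathbf{r}_1$, $\mathbf{r}_2$ are the first two columns of $R$.
This defines a homography $H_v$ between the planar pattern and the image:
$$H_v = K \begin{bmatrix} \mathbf{r}_1 & \mathbf{r}_2 & \mathbf{t} \end{bmatrix}.$$
Each view provides one homography satisfying
$$\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = H_v \begin{bmatrix} X_i \\ Y_i \\ 1 \end{bmatrix}.$$
The image coordinates explicitly are
$$u_i = \frac{f_x \left( r_{11}^v X_i + r_{12}^v Y_i + t_x^v \right)}{r_{31}^v X_i + r_{32}^v Y_i + t_z^v} + c_x,$$
$$v_i = \frac{f_y \left( r_{21}^v X_i + r_{22}^v Y_i + t_y^v \right)}{r_{31}^v X_i + r_{32}^v Y_i + t_z^v} + c_y.$$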
A single homography cannot uniquely separate intrinsic from extrinsic parameters, as changes in homography may stem from either. Hence, multiple views of the calibration pattern are necessary to resolve both intrinsic and extrinsic parameters by jointly solving multiple homographies.
This multiview approach ensures that intrinsic parameters are well constrained and the camera pose is accurately estimated.
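As a small numerical check of the planar simplification, the sketch below builds the homography from an intrinsic matrix and one pose and compares it with the explicit per-coordinate expressions. The helper names, the focal lengths in pixels, and the pose values are illustrative assumptions.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def planar_homography(K, R, t):
    """H = K [r1 r2 t], with r1 and r2 the first two columns of R."""
    M = [[R[i][0], R[i][1], t[i]] for i in range(3)]
    return matmul3(K, M)


def apply_homography(H, X, Y):
    """Map a pattern point (X, Y, 0) to pixels, dividing out the scale factor."""
    h = [H[i][0] * X + H[i][1] * Y + H[i][2] for i in range(3)]
    return h[0] / h[2], h[1] / h[2]


# Illustrative values: focal lengths (pixels) and the pose are assumptions.
fx = fy = 500.0
cx, cy = 320.0, 240.0
K = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
a = math.radians(20.0)                      # rotation about the camera y axis
R = [[math.cos(a), 0.0, math.sin(a)],
     [0.0, 1.0, 0.0],
     [-math.sin(a), 0.0, math.cos(a)]]
t = [0.05, -0.02, 0.6]

X, Y = 0.09, 0.06                           # one corner on the pattern plane
u, v = apply_homography(planar_homography(K, R, t), X, Y)

# The same point through the explicit per-coordinate expressions.
den = R[2][0] * X + R[2][1] * Y + t[2]
u_ref = fx * (R[0][0] * X + R[0][1] * Y + t[0]) / den + cx
v_ref = fy * (R[1][0] * X + R[1][1] * Y + t[1]) / den + cy
```

The two routes agree to rounding error, and the division by the third homogeneous component in `apply_homography` plays the role of the scale factor $s$.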

5. Methodologies

Camera calibration aims to find a model minimizing the reprojection error:
$$E = \sum_{i=1}^{N} \left\| \mathbf{p}_i - f(\mathbf{x}_i, P) \right\|^2.$$
Traditional methods fix the model structure and estimate the parameters, while symbolic regression discovers both model forms and parameters [23,24]. By defining a set of operators and operands, symbolic regression flexibly searches for interpretable mathematical expressions that best fit the data.
Expressions are represented as hierarchical trees, with operators as internal nodes and variables/constants as leaves [25]. For example, the pinhole camera model (Equation (3)) can be represented as expression trees (Figure 1).
Symbolic regression solves
$$\hat{f} = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{N} \left( y_i - f(x_{i1}, \dots, x_{in}) \right)^2$$
using evolutionary techniques like genetic programming to evolve models with improved accuracy and interpretability.
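The tree representation can be illustrated with nested tuples, where each tuple is an operator node and each leaf is a variable name or a constant. This encoding and the helper names are hypothetical; the article's engine uses its own internal representation.

```python
import operator

# Binary operators allowed at internal nodes (an illustrative subset).
OPS = {"add": operator.add, "mul": operator.mul, "div": operator.truediv}


def evaluate(node, env):
    """Recursively evaluate a tree: tuples are operators, leaves are names or constants."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]          # variable or named parameter
    return float(node)            # numeric constant


def size(node):
    """Total number of nodes, a simple proxy for expression complexity."""
    if isinstance(node, tuple):
        return 1 + sum(size(child) for child in node[1:])
    return 1


# Horizontal pinhole coordinate u = f_x * (X / Z) + c_x as a tree.
u_tree = ("add", ("mul", "fx", ("div", "X", "Z")), "cx")
env = {"X": 0.1, "Z": 0.5, "fx": 500.0, "cx": 320.0}
u = evaluate(u_tree, env)
```

A node count such as `size(u_tree)` is one simple way to quantify parsimony when a complexity penalty is added to the fitness.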

5.1. Selecting Camera Models via Symbolic Regression

Given calibration data $\mathcal{D} = \{ (\mathbf{x}_i, \mathbf{p}_i) \}$, the objective is to select and optimize a symbolic function $f(\mathbf{x}_i; \theta)$ that maps 3D points to 2D projections, by minimizing
$$\sum_{i=1}^{N} \left\| \mathbf{p}_i - f(\mathbf{x}_i; \theta) \right\|^2,$$
where $\theta$ represents intrinsic and extrinsic parameters, as well as distortion coefficients.

Genetic Programming Setup

  • Representation: Candidate solutions encode parameterized, predefined camera models (e.g., Brown–Conrady, Mei, Kannala–Brandt) as expression trees, combining fixed model structures with variables and constants.
  • Initialization: The initial population $P^{(0)} = \{ I_1, \dots, I_N \}$ consists of variants of these predefined models with randomized parameters.
  • Fitness: Evaluated via the mean squared error (MSE) between the predicted projections and observed data.
  • Selection: Employ roulette wheel or tournament selection to probabilistically favor fitter individuals.
  • Genetic Operators:
    Crossover: Exchange parameters or subtrees between parent models, preserving structural validity.
    Mutation: Randomly perturb parameters or substitute subexpressions within models.
  • Evolution: New generations are formed through elitism combined with genetic operators until convergence or stopping criteria are met.
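The selection and evolution steps listed above can be sketched with a deliberately simplified loop. Everything here is a stand-in: individuals are `(model_name, k1)` pairs, the fitness is a made-up error that favors one model name and one coefficient value instead of a real reprojection MSE, and crossover is omitted for brevity.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Pick k individuals at random and return the one with the lowest error."""
    contenders = rng.sample(population, k)
    return min(contenders, key=fitness)


def next_generation(population, fitness, mutate, n_elite=1, rng=random):
    """Elitism plus mutated tournament winners (crossover omitted)."""
    ranked = sorted(population, key=fitness)
    children = ranked[:n_elite]               # elites survive unchanged
    while len(children) < len(population):
        parent = tournament_select(population, fitness, rng=rng)
        children.append(mutate(parent, rng))
    return children


def fitness(ind):
    """Hypothetical error: zero only for the right structure and coefficient."""
    name, k1 = ind
    structural_penalty = 0.0 if name == "brown_conrady" else 1.0
    return (k1 + 0.2) ** 2 + structural_penalty


def mutate(ind, rng):
    """Perturb the coefficient while keeping the model structure."""
    name, k1 = ind
    return (name, k1 + rng.gauss(0.0, 0.05))


rng = random.Random(0)
names = ["brown_conrady", "kannala_brandt", "double_sphere"]
population = [(rng.choice(names), rng.uniform(-1.0, 1.0)) for _ in range(30)]
initial_best = min(map(fitness, population))
for _ in range(40):
    population = next_generation(population, fitness, mutate, rng=rng)
best = min(population, key=fitness)
```

Because the best individual of each generation is copied unchanged into the next, the best error in the population can never increase from one generation to the next.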

5.2. Parameter Optimization via Levenberg–Marquardt

Symbolic regression identifies model structures; parameters θ (e.g., focal lengths, distortion coefficients) are refined using the Levenberg–Marquardt algorithm [6]:
$$\theta_{k+1} = \theta_k - \left( J^T J + \lambda I \right)^{-1} J^T \mathbf{r},$$
where $J$ is the Jacobian, $\mathbf{r}$ the residual, and $\lambda$ the damping factor. This iterative update improves the parameter estimates and calibration accuracy.

6. Experiments

We evaluated our symbolic regression framework for camera calibration, implemented via the AlpineGP engine [26]. The aim is to learn compact symbolic expressions modeling projection and distortion across camera types. AlpineGP employs a grammar-guided evolutionary search with domain-specific primitives, well suited for geometric modeling. Performance is assessed via the reprojection error and runtime over various camera models, distortion profiles, and poses. Learned models are compared against established projection and distortion formulations from prior work.

Numerical Dataset Generation for Camera Calibration

To evaluate calibration algorithms under controlled conditions, we generated a synthetic dataset of 3D points arranged in a planar chessboard pattern (7 rows, 9 columns), with 0.03 m corner spacing, spanning 0.24 m by 0.18 m. Each sample consists of 3D corner coordinates defined in the target’s local frame (Figure 2).
Pattern instances are randomly posed relative to a fixed camera frame, with translations uniformly sampled as $x \in [-0.3, 0.3]$ m, $y \in [-0.2, 0.2]$ m, $z \in [0.01, 0.35]$ m and rotations about each axis within ±25° (roll, pitch, yaw). A total of $N = 100$ unique poses are generated to cover typical camera-to-pattern configurations.
The virtual camera has a resolution of 640 × 480 pixels, focal length of 35 mm, zero skew, a principal point at (320, 240), and an aspect ratio of 0.75, as summarized in Table 1.
Table 2 lists the distortion models included, covering a broad range of lens behaviors. This image-free dataset provides a controlled testbed for intrinsic and extrinsic calibration, enabling the isolated benchmarking of geometric estimation without image processing confounds.
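A generator for such an image-free dataset might look like the sketch below. The pattern geometry and the number of poses follow the text; the symmetric translation ranges, the Z-Y-X Euler convention, the random seed, and all function names are assumptions, since the article does not specify them.

```python
import math
import random

ROWS, COLS, SPACING = 7, 9, 0.03   # pattern geometry from the text (meters)


def chessboard_corners():
    """Corner coordinates in the target frame; the pattern lies in Z = 0."""
    return [(c * SPACING, r * SPACING, 0.0)
            for r in range(ROWS) for c in range(COLS)]


def sample_pose(rng):
    """Draw Euler angles within +/-25 degrees and a translation in meters."""
    angles = tuple(math.radians(rng.uniform(-25.0, 25.0)) for _ in range(3))
    t = (rng.uniform(-0.3, 0.3), rng.uniform(-0.2, 0.2), rng.uniform(0.01, 0.35))
    return angles, t


def rotation_matrix(roll, pitch, yaw):
    """R = Rz(yaw) Ry(pitch) Rx(roll); this axis order is an assumption."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [[cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr]]


def transform(points, angles, t):
    """Express target-frame points in the camera frame: R p + t."""
    R = rotation_matrix(*angles)
    return [tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
            for p in points]


rng = random.Random(42)
corners = chessboard_corners()
poses = [sample_pose(rng) for _ in range(100)]
dataset = [transform(corners, angles, t) for angles, t in poses]
```

Each entry of `dataset` holds the 63 corners of one posed pattern in the camera frame, ready to be pushed through any of the projection models to obtain synthetic image points.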

7. Estimating the Search Space Size

Let $\mathcal{P}$ be the set of chosen primitives defined in Table 3. Each primitive $f \in \mathcal{P}$ has $\operatorname{arity}(f)$, i.e., the number of total arguments that it accepts. However, for distortion primitives (such as Brown–Conrady and Mei–Rives), only the first two or three arguments correspond to coordinate vectors and thus accept nested expressions, while the remaining arguments are fixed constants or parameters. Let $A_f$ represent the number of argument positions of each primitive where nested expressions are allowed, according to the following:
$$A_f = \begin{cases} 2 \text{ or } 3, & \text{if } f \text{ is a distortion primitive}; \\ \operatorname{arity}(f), & \text{otherwise}. \end{cases}$$
We limit the expression depth to $D_{\max}$ and the total function nodes to $F_{\max}$. The total number of possible trees is
$$N_{\text{trees}} = \sum_{k=1}^{F_{\max}} \; \sum_{\substack{\text{tree } T \\ |T| = k,\ \operatorname{depth}(T) \le D_{\max}}} \Bigl( \prod_{f \in T} A_f \Bigr) \times N_{\text{terminals}}^{L_T} \;\approx\; \sum_{k=1}^{F_{\max}} |\mathcal{P}|^k \cdot A_{\text{avg}}^k \cdot N_{\text{terminals}}^{k+1},$$
where $|T|$ is the total number of nodes in the tree, $|\mathcal{P}|$ is the size of the primitive set, $\operatorname{depth}(T)$ is the maximum depth of the tree, $N_{\text{terminals}}$ is the number of terminal symbols (nonconstant arguments plus constants), $L_T$ is the count of leaves, and $A_{\text{avg}}$ is the average of $A_f$ across all primitives.
For our set of $|\mathcal{P}| = 11$ primitives, we constrain the expression depth to $D_{\max} = 3$ and the total function nodes to $F_{\max} = 4$. Assuming conservatively that $A_{\text{avg}} \approx 2$ and $N_{\text{terminals}} \approx 8$ (3 coordinates plus 5 constant terminals), then
$$N_{\text{trees}} \approx \sum_{k=1}^{4} 11^k \cdot 2^k \cdot 8^{k+1}.$$
This yields a coarse lower bound of $N_{\text{trees}} \gtrsim 10^8$, i.e., hundreds of millions of possible expression trees.
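The closed-form estimate is easy to evaluate directly. The sketch below assumes the summand is read as $|P|^k \cdot A_{\text{avg}}^k \cdot N_{\text{terminals}}^{k+1}$ with the values stated in the text; the function name is illustrative.

```python
def tree_count_estimate(n_primitives, a_avg, n_terminals, f_max):
    """Sum over k = 1..f_max of |P|^k * A_avg^k * N_terminals^(k+1)."""
    return sum(n_primitives**k * a_avg**k * n_terminals**(k + 1)
               for k in range(1, f_max + 1))


estimate = tree_count_estimate(n_primitives=11, a_avg=2, n_terminals=8, f_max=4)
```

Under this reading the sum evaluates to 7,719,964,032, roughly $7.7 \times 10^9$, which sits above the $10^8$ figure quoted in the text and is consistent with that figure being a coarse lower bound. The largest term ($k = 4$) contributes more than 99% of the total, so the estimate is dominated by the deepest allowed expressions.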
Symbolic regression over even this modest-sized space is infeasible via exhaustive search, particularly because each candidate expression often requires costly parameter fitting (e.g., via Levenberg–Marquardt optimization). Symbolic regression is formally known to be NP-hard [27], and the combinatorial explosion outlined above underscores why genetic programming remains one of the most practical and scalable strategies for exploring such high-dimensional expression spaces. In contrast, the brute-force enumeration of all possible nested compositions is computationally prohibitive, especially when accounting for both the nested argument structure and the per-candidate parameter optimization.
Taken together, for our use case, these considerations justify the need to (1) limit primitives to meaningful camera model functions, (2) bound the expression depth and function count to ensure tractability, and (3) employ genetic algorithms to efficiently explore the resulting combinatorial space.
On our test workstation (Intel i9 CPU, 32 GB RAM), each symbolic regression run takes approximately 15–20 min per model, depending on the complexity of the search space and parameter optimization. In comparison, traditional calibration methods (e.g., OpenCV) are typically completed in seconds. However, our method is intended for offline calibration scenarios, where interpretability and model discovery are prioritized over runtime.

Evaluating Calibration Performance

The symbolic regression genetic programming optimizer was configured with the parameters described in Table 4, aiming to balance exploration and exploitation within the search space for the efficient identification of symbolic expressions that accurately represented the underlying data. The hyperparameters in Table 4 were initially selected based on common genetic programming practice and refined through a limited grid search on validation tasks. We acknowledge that a systematic hyperparameter tuning process or sensitivity analysis would provide deeper insight into the robustness of the framework.
For evaluation, we employed Monte Carlo cross-validation [28], a technique involving the random subsampling of the dataset multiple times to reduce variance in the performance estimates, leading to more reliable evaluations. Each iteration involved splitting the dataset into training and testing subsets, training the model on the training subset, and evaluating it on the testing subset. The evaluation metrics from each iteration were then averaged to provide a more robust assessment of the calibration algorithm’s generalization capabilities. In our study, we performed 10 trials, each involving a random 30/70 train/test split. The choice of a 30/70 train/test split with 10 Monte Carlo trials was intended to evaluate generalization while retaining enough test data to measure robustness. We note that alternative validation schemes (e.g., k-fold cross-validation or leave-one-out) could have enabled more comprehensive assessments, especially given the small dataset size. However, k-fold methods would require repeated model discovery for each fold, substantially increasing the computational cost. The results from these trials were averaged to obtain a more stable estimate of the calibration algorithm’s effectiveness.
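The validation protocol can be sketched as repeated random subsampling over pose indices. The 30/70 split, 10 trials, and 100 poses follow the text; the function names, the seed, and the `fit`/`evaluate` callables are hypothetical placeholders for model discovery and reprojection-error scoring.

```python
import random


def monte_carlo_splits(n_samples, train_fraction=0.3, n_trials=10, seed=0):
    """Yield (train_indices, test_indices) for repeated random subsampling."""
    rng = random.Random(seed)
    n_train = round(train_fraction * n_samples)
    for _ in range(n_trials):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]


def cross_validate(n_samples, fit, evaluate, **kwargs):
    """Average a held-out score over all trials."""
    scores = []
    for train_idx, test_idx in monte_carlo_splits(n_samples, **kwargs):
        model = fit(train_idx)                 # e.g., run the symbolic search
        scores.append(evaluate(model, test_idx))
    return sum(scores) / len(scores)


splits = list(monte_carlo_splits(100))

# Dummy callables, used only to show the calling convention.
mean_score = cross_validate(100, fit=len, evaluate=lambda m, idx: m / len(idx))
```

Splitting by pose index rather than by individual corner keeps every corner of a held-out pose out of training, which matches the stated goal of evaluating on unseen poses.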
  • Reprojection Error
To assess the accuracy of the symbolic regression models, we use the reprojection error, a standard metric in camera calibration that measures the Euclidean distance between the observed 2D image points and their reprojected counterparts from 3D world coordinates:
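$$E_{\text{reproj}} = \left\| \mathbf{x}_{\text{observed}} - \mathbf{x}_{\text{projected}} \right\|_2,$$
where $\mathbf{x}_{\text{observed}}$ and $\mathbf{x}_{\text{projected}}$ denote the true and predicted image coordinates, respectively [29].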
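For a set of correspondences, the mean of this per-point distance can be computed as in the sketch below; the function name and the sample points are illustrative.

```python
import math


def mean_reprojection_error(observed, projected):
    """Mean Euclidean distance (in pixels) between matched 2D points."""
    if len(observed) != len(projected) or not observed:
        raise ValueError("need two non-empty point lists of equal length")
    dists = [math.hypot(uo - up, vo - vp)
             for (uo, vo), (up, vp) in zip(observed, projected)]
    return sum(dists) / len(dists)


observed = [(100.0, 200.0), (320.0, 240.0)]
projected = [(103.0, 204.0), (320.0, 240.0)]
err = mean_reprojection_error(observed, projected)   # (5 + 0) / 2 pixels
```

Note that this averages distances, whereas the calibration objective sums squared distances; the two rank models similarly but are not numerically interchangeable.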
Table 5 reports the mean reprojection errors across various lens distortion profiles for our symbolic regression models alongside standard OpenCV calibration models (e.g., fisheye, pinhole, omnidir, rational). Figure 3 shows qualitative results on test images. The symbolic regression models consistently achieve low reprojection errors across the majority of camera types. In low-distortion settings (e.g., “no distortion” and “telephoto”), symbolic models reach subpixel accuracy, being comparable to or better than OpenCV’s best-fitting parametric models. For moderate distortions (e.g., “light fisheye,” “catadioptric light”), the symbolic models continue to outperform traditional models, with near-zero errors. Notably, as can be seen in Table 6, two discovered models (light fisheye and light wide-angle) are not the ground truth projection functions yet achieve a perfect data fit with a zero reprojection error. This underscores symbolic regression’s ability to uncover structurally different but functionally equivalent models in complex projection spaces.
However, for highly nonlinear or extreme distortion profiles—such as those found in 360° cameras or extreme hyperbolic projections—the symbolic model exhibits greater variability and increased errors compared to OpenCV’s omnidirectional models, which are specifically designed to handle such complex geometries. This performance gap is partially attributed to the sensitivity of symbolic regression to internal constant optimization, particularly in deeper or more highly curved projection functions. Although extended genetic optimization times could mitigate this limitation, symbolic models remain competitive overall. In many scenarios, they outperform traditional rational and fisheye models—even in the absence of manual tuning or domain-specific constraints.
Overall, these results highlight symbolic regression as a viable and often superior alternative to traditional calibration models. Unlike OpenCV’s fixed-function distortions, our models are discovered automatically and can yield compact, interpretable expressions tailored to each distortion scenario. This opens the door for calibration systems that are both model-agnostic and semantically meaningful, especially in applications requiring generalization or analytical guarantees.

8. Conclusions

In this work, we present a symbolic regression framework for the automatic discovery of camera calibration models, using genetic programming to search over a space of interpretable geometric primitives. Our approach recovers compact, closed-form expressions that rival or outperform those of traditional parametric models in reprojection accuracy across a wide range of distortion profiles—including pinhole, fisheye, catadioptric, and omnidirectional lenses.
By producing models that retain the semantic structure, our method offers greater interpretability compared to black-box or high-degree polynomial models. An analysis of the symbolic search space reveals a combinatorially large number of possible expressions (on the order of $10^8$), justifying the use of heuristic search over brute-force methods.
Empirically, the symbolic models achieve near-zero reprojection errors in undistorted and mildly distorted settings and remain competitive in more complex cases. The performance in highly nonlinear regimes can be further improved by tightening the optimizer tolerances (e.g., lowering xtol, ftol, gtol), enabling finer parameter tuning at the cost of computation.

Current Limitations and Future Work

While our initial experiments focused on synthetic datasets, this setup enables controlled comparisons, repeatability, and detailed analysis of the discovered models across diverse lens types. However, we acknowledge the importance of evaluating performance on real-world calibration tasks involving standard targets (e.g., checkerboards), various lens types, and challenges such as noise, blur, or illumination variability. These are key directions for future work.
Importantly, this study focused solely on intrinsic calibration, deliberately excluding the estimation of extrinsic parameters (i.e., rotation and translation between camera and world coordinates). While the joint optimization of intrinsics and extrinsics is standard in practical calibration pipelines, including extrinsics significantly increases the computational burden, especially when nested within a symbolic search loop. As such, we defer integrated intrinsic–extrinsic optimization to future work, where we will explore ways to manage this additional complexity efficiently.
The primitive set used in our experiments was chosen based on prior work in camera calibration and includes distortion models commonly used in the literature (e.g., Brown–Conrady, double-sphere, FOV). Our selection was guided by the goal of enabling expressive yet interpretable model discovery. We also experimented with alternative primitive sets, including expanded and reduced versions. Initial findings suggest that a larger primitive set improves expressiveness but increases the search time and the risk of overfitting, whereas a minimal set may limit model discovery. A comprehensive ablation study on primitive selection is planned as part of our future work.
We further aim to refine model discovery by increasing the granularity of the symbolic primitives, enabling the system to build models from lower-level mathematical operations (e.g., powers, trigonometric terms). This may allow the emergence of novel, hybrid formulations that generalize beyond established models. Finally, we plan to investigate integrated structure-and-parameter optimization to improve the robustness and scalability for real-world deployment.

Funding

Funded by the European Union (European Research Council (ERC), ALPS, 101039481).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the ERC Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Brown, D.C. Close-range camera calibration. Photogramm. Eng. 1971, 37, 855–866. [Google Scholar]
  2. Zhang, Z. A Flexible New Technique for Camera Calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [Google Scholar] [CrossRef]
  3. Scaramuzza, D.; Martinelli, A.; Siegwart, R. A Flexible Technique for Accurate Omnidirectional Camera Calibration and Structure from Motion. In Proceedings of the IEEE International Conference on Computer Vision Systems, New York, NY, USA, 4–7 January 2006. [Google Scholar]
  4. Scharstein, D.; Szeliski, R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001), Kauai, HI, USA, 9–10 December 2001; Volume 47, pp. 7–42. [Google Scholar]
  5. Li, X.; Zhang, B.; Sander, P.V.; Liao, J. Blind Geometric Distortion Correction on Images Through Deep Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 16536–16546. [Google Scholar]
  6. Levenberg, K. A Method for the Solution of Certain Non-Linear Problems in Least Squares. Q. Appl. Math. 1944, 2, 164–168. [Google Scholar] [CrossRef]
  7. Durrant-Whyte, H.; Bailey, T. Simultaneous Localization and Mapping: Part I. IEEE Robot. Autom. Mag. 2006, 13, 99–110. [Google Scholar] [CrossRef]
  8. Davison, A.J. Real-Time Simultaneous Localization and Mapping with a Single Camera. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Nice, France, 13–16 October 2003; pp. 1403–1410. [Google Scholar]
  9. Kormushev, P.; Calinon, S.; Caldwell, D.G. Robot Motor Skill Coordination with EM-based Reinforcement Learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, 18–22 October 2010; pp. 3232–3237. [Google Scholar]
  10. Levinson, J.; Thrun, S. Robust Vehicle Localization in Urban Environments Using Probabilistic Maps. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Anchorage, AK, USA, 3–7 May 2010; pp. 4372–4378. [Google Scholar]
  11. Li, J.; Pi, J.; Wei, P.; Luo, Z.; Yan, G. Automatic Multi-Camera Calibration and Refinement Method in Road Scene for Self-Driving Car. IEEE Trans. Intell. Vehicles 2024, 9, 2429–2438. [Google Scholar] [CrossRef]
  12. Azuma, R.T. A Survey of Augmented Reality. Presence Teleoperators Virtual Environ. 1997, 6, 355–385. [Google Scholar] [CrossRef]
  13. Debevec, P.E.; Malik, J. Modeling and Rendering Architecture from Photographs: A Hybrid Geometry- and Image-Based Approach. In Proceedings of the SIGGRAPH, New Orleans, LA, USA, 4–9 August 1996; pp. 11–20. [Google Scholar]
  14. Ranganathan, P.; Olson, E. Gaussian Process for Lens Distortion Modeling. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 3620–3625. [Google Scholar]
  15. Heikkilä, J.; Silvén, O. A Four-step Camera Calibration Procedure with Implicit Image Correction. In Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 17–19 June 1997; pp. 1106–1112. [Google Scholar]
  16. Mei, C.; Rives, P. Single View Point Omnidirectional Camera Calibration from Planar Grids. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, Rome, Italy, 10–14 April 2007; pp. 3945–3950. [Google Scholar] [CrossRef]
  17. Schmidt, M.; Lipson, H. Distilling Free-Form Natural Laws from Experimental Data. Science 2009, 324, 81–85. [Google Scholar] [CrossRef] [PubMed]
  18. Bongard, J.; Lipson, H. Automated Reverse Engineering of Nonlinear Dynamical Systems. Proc. Natl. Acad. Sci. USA 2007, 104, 9943–9948. [Google Scholar] [CrossRef] [PubMed]
  19. Brown, D.C. Decentering Distortion of Lenses. Photom. Eng. 1966, 32, 444–462. [Google Scholar]
  20. Kannala, J.; Brandt, S.S. A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 2006, 28, 1335–1340. [Google Scholar] [CrossRef] [PubMed]
  21. Usenko, V.; Engel, J.; Stückler, J.; Cremers, D. The Double Sphere Camera Model. In Proceedings of the International Conference on 3D Vision (3DV), Verona, Italy, 5–8 September 2018; pp. 552–560. [Google Scholar] [CrossRef]
  22. Triggs, B.; McLauchlan, P.F.; Hartley, R.I.; Fitzgibbon, A.W. Bundle Adjustment—A Modern Synthesis. In Vision Algorithms: Theory and Practice; Lecture Notes in Computer, Science; Kanade, T., Kryszczyk, A., Pajdla, T., Shafique, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1883, pp. 298–372. [Google Scholar] [CrossRef]
  23. O’Reilly, U.-M. Genetic Programming II: Automatic Discovery of Reusable Programs. Artif. Life 1994, 1, 439–441. [Google Scholar] [CrossRef]
  24. Makke, N.; Chawla, S. Interpretable scientific discovery with symbolic regression: A review. Artif. Intell. Rev. 2024, 57, 2. [Google Scholar] [CrossRef]
  25. Wiener, R. Expression Trees. In Generic Data Structures and Algorithms in Go: An Applied Approach Using Concurrency, Genericity and Heuristics; Apress: Berkeley, CA, USA, 2022; pp. 387–399. [Google Scholar] [CrossRef]
  26. Manti, S.; Lucantonio, A. Discovering interpretable physical models using symbolic regression and discrete exterior calculus. Mach. Learn. Sci. Technol. 2024, 5, 015005. [Google Scholar] [CrossRef]
  27. Bartlett, D.J.; Desmond, H.; Ferreira, P.G. Exhaustive Symbolic Regression. IEEE Trans. Evol. Comput. 2023, 28, 950–964. [Google Scholar] [CrossRef]
  28. Xu, Q.S.; Liang, Y.Z. Monte Carlo cross validation. Chemom. Intell. Lab. Syst. 2001, 56, 1–11. [Google Scholar] [CrossRef]
  29. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision, 2nd ed.; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
Figure 1. Expression trees representing the pinhole camera model, as defined in Equation (3).
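To make the expression-tree representation concrete, the following minimal sketch encodes the pinhole model as nested tuples and evaluates it recursively. The tuple encoding, the `PRIMITIVES` registry, and the `evaluate` and `size` helpers are illustrative assumptions rather than the data structures used in our implementation; the constants are those reported for the no-distortion case in Table 6.

```python
from typing import Callable, Dict, Union

# A tree node is a variable name, a numeric constant, or (primitive, child, ...).
Node = Union[str, float, tuple]

# Minimal registry holding only the two primitives the pinhole model needs.
PRIMITIVES: Dict[str, Callable[..., float]] = {
    "normalize": lambda a, z: a / z,
    "linear_affine": lambda v, scale, offset: scale * v + offset,
}


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Recursively evaluate an expression tree given variable bindings."""
    if isinstance(node, tuple):
        name, *children = node
        return PRIMITIVES[name](*(evaluate(c, env) for c in children))
    if isinstance(node, str):
        return env[node]
    return float(node)


def size(node: Node) -> int:
    """Number of nodes in the tree, a simple measure of model complexity."""
    if isinstance(node, tuple):
        return 1 + sum(size(c) for c in node[1:])
    return 1


# Pinhole trees for the u and v image coordinates (constants from Table 6).
tree_u = ("linear_affine", ("normalize", "x", "z"), 320.0, 320.0)
tree_v = ("linear_affine", ("normalize", "y", "z"), 359.19, 240.0)

point = {"x": 0.5, "y": -0.25, "z": 2.0}
u, v = evaluate(tree_u, point), evaluate(tree_v, point)
print(u, v, size(tree_u))
```

Keeping the tree as plain data makes crossover and mutation simple subtree replacements, while the numeric leaves remain available to a separate parameter-fitting step.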
Figure 2. Sample calibration data: 9 × 7 coplanar points (left) and their image plane projections (right).
Figure 3. Example reprojections for the obtained symbolic regression models under different distortion profiles.
Table 1. Virtual camera intrinsics used for dataset generation.

| Parameter | Value |
|---|---|
| Resolution | 640 × 480 pixels |
| Focal length | 35 mm |
| Skew | 0 |
| Principal point | (320, 240) |
| Aspect ratio | 0.75 |

Table 2. Camera distortion profiles used in the dataset.

| Label | Model | Coefficients |
|---|---|---|
| No distortion | Pinhole | [] |
| Telephoto (low distortion) | Brown–Conrady | [−0.01, 0.001, 0.0001, −0.0002, 0.0] |
| Light fisheye | Kannala–Brandt | [0.05, −0.01, 0.005, −0.001] |
| Catadioptric light | Mei–Rives | [0.5] |
| Moderate omnidirectional | Mei–Rives | [1.0] |
| 360 camera | Mei–Rives | [1.5] |
| Extreme hyperbolic | Mei–Rives | [2.0] |
| Light wide-angle | Equidistant | [0.01, −0.005, 0.0, 0.0] |

Table 3. Basic primitives used for camera modeling in symbolic regression.

| Primitive Name | Description | Purpose in Camera Modeling | Arguments |
|---|---|---|---|
| normalize | Computes normalized image plane coordinates: $X/Z$ or $Y/Z$ | Projects 3D points to 2D before applying distortion or intrinsics | X or Y, Z |
| linear_affine | Applies $\mathit{scale} \cdot x + \mathit{offset}$ | Models scaling and shifting (e.g., focal length, principal point) | x or y, scale, offset |
| brown_conrady | Classical Brown–Conrady radial–tangential model | Captures lens distortion using radial and tangential terms | x or y, y or x, k1, k2, p1, p2, k3 |
| kannala_brandt | Odd-order polynomial fisheye model | Models extreme wide-angle distortions | x or y, y or x, k0, k1, k2, k3 |
| mei_rives | Spherical projection with mirror parameter $\xi$ | For central catadioptric (mirror-based) systems | X or Y, Y or X, Z, $\xi$ |
| equidistant | Equidistant fisheye model | Ensures angle from optical axis maps linearly to radius | x or y, y or x, k1, k2, k3, k4 |
| double_sphere | Two-sphere projection model with $\xi$, $\alpha$ | Models ultra-wide FOV more accurately than pinhole | X or Y, Y or X, Z, $\xi$, $\alpha$ |
| rational | Rational model: $P(x, y)/Q(x, y)$ | Flexible model using polynomial numerator/denominator | x, y, k1–k6 |
| omnidirectional | Polynomial mapping for omnicameras | Approximates wide-angle views with polynomial terms | x or y, y or x, c0–c3 |

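A few of the primitives in Table 3 can be sketched as plain functions to show how the per-coordinate signatures compose. The `mei_rives` and `double_sphere` bodies below follow the standard published forms of the unified and double-sphere models; the exact implementations used in our experiments may differ, and the test point is an arbitrary assumption.

```python
import math


def normalize(a: float, z: float) -> float:
    """Perspective division: a / z for a in {X, Y}."""
    return a / z


def linear_affine(v: float, scale: float, offset: float) -> float:
    """scale * v + offset, e.g. focal length and principal point."""
    return scale * v + offset


def mei_rives(a: float, b: float, z: float, xi: float) -> float:
    """Unified central model (standard form): a / (z + xi * ||(a, b, z)||)."""
    return a / (z + xi * math.sqrt(a * a + b * b + z * z))


def double_sphere(a: float, b: float, z: float, xi: float, alpha: float) -> float:
    """Double-sphere model (standard form), returning one coordinate."""
    d1 = math.sqrt(a * a + b * b + z * z)
    d2 = math.sqrt(a * a + b * b + (xi * d1 + z) ** 2)
    return a / (alpha * d2 + (1.0 - alpha) * (xi * d1 + z))


X, Y, Z = 0.3, -0.2, 1.5  # arbitrary 3D point in the camera frame

# Passing (X, Y, ...) yields the horizontal coordinate; (Y, X, ...) the vertical one.
u = linear_affine(mei_rives(X, Y, Z, 0.5), 320.0, 320.0)
v = linear_affine(mei_rives(Y, X, Z, 0.5), 359.19, 240.0)
print(u, v)

# Sanity checks on the nesting of the models under these standard forms.
print(mei_rives(X, Y, Z, 0.0), normalize(X, Z))                     # xi = 0 gives pinhole
print(double_sphere(X, Y, Z, 0.5, 0.0), mei_rives(X, Y, Z, 0.5))    # alpha = 0 gives Mei-Rives
```

Under these forms, the pinhole model is the special case $\xi = 0$ of mei_rives, which in turn is the special case $\alpha = 0$ of double_sphere, so several primitives can express the same projection.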
Table 4. Experimental parameters for symbolic regression optimization.

| Parameter | Value |
|---|---|
| Population Size ($N_{\text{individuals}}$) | 100 |
| Generations ($N_{\text{gen}}$) | 100 |
| Multi-Island Model ($N_{\text{islands}}$) | 10 |
| Crossover Probability ($P_{\text{crossover}}$) | 0.7 |
| Mutation Probability ($P_{\text{mutation}}$) | 0.3 |

Table 5. Reprojection errors (in pixels) for various distortion profiles (datasets) and camera calibration models.

| Dataset | Pinhole | Rational | Symbolic Regression |
|---|---|---|---|
| No distortion | 0.000 ± 0.000 | 0.000 ± 0.000 | 0.000 ± 0.000 |
| Telephoto (low distortion) | 0.000 ± 0.000 | 0.000 ± 0.000 | 0.000 ± 0.000 |
| Light fisheye | 0.716 ± 0.505 | 0.764 ± 0.536 | 0.000 ± 0.000 |
| Catadioptric light | 0.147 ± 0.091 | 0.125 ± 0.089 | 0.000 ± 0.000 |
| Moderate omnidirectional | 0.406 ± 0.331 | 0.376 ± 0.322 | 0.286 ± 1.648 |
| 360 camera | 0.745 ± 0.487 | 0.725 ± 0.484 | 3.323 ± 4.915 |
| Extreme hyperbolic | 0.358 ± 0.217 | 0.644 ± 0.480 | 1.778 ± 2.611 |
| Light wide-angle | 0.291 ± 0.203 | 0.265 ± 0.175 | 0.000 ± 0.000 |

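For reference, a mean ± standard deviation summary of this kind can be computed from per-point Euclidean pixel errors as sketched below. The function names and sample points are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption; the table may have been produced with the sample standard deviation instead.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float]


def reprojection_errors(observed: Sequence[Pixel],
                        predicted: Sequence[Pixel]) -> List[float]:
    """Euclidean distance in pixels between each observed and predicted point."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [math.hypot(uo - up, vo - vp)
            for (uo, vo), (up, vp) in zip(observed, predicted)]


def summarize(errors: Sequence[float]) -> Tuple[float, float]:
    """Mean and population standard deviation of the per-point errors."""
    return mean(errors), pstdev(errors)


# Hypothetical observations and model predictions (pixels).
observed = [(100.0, 100.0), (200.0, 150.0), (320.0, 240.0)]
predicted = [(103.0, 104.0), (200.0, 150.0), (320.0, 241.0)]

errs = reprojection_errors(observed, predicted)
m, s = summarize(errs)
print(f"{m:.3f} ± {s:.3f}")
```

Because the errors are non-negative, a standard deviation well above the mean (as in the moderate omnidirectional row) indicates that a few points with large errors dominate the statistic.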
Table 6. Symbolic models with parameterized functions.

| Dataset | Symbolic Model |
|---|---|
| No distortion (Best U) | linear_affine(normalize(x, z), 320.0, 320.0) |
| No distortion (Best V) | linear_affine(normalize(y, z), 359.19, 240.0) |
| Telephoto (low distortion) (Best U) | linear_affine(brown_conrady(double_sphere(x, y, z, 0.2, 0.21), normalize(y, 0.82), 0.0, 0.0, 0.0, 0.0, 0.0), 384.23, 319.96) |
| Telephoto (low distortion) (Best V) | linear_affine(brown_conrady(normalize(y, z), normalize(x, z), 0.01, 0.0, 0.0, 0.0, 0.0), 359.19, 240.0) |
| Light fisheye (Best U) | linear_affine(kannala_brandt(double_sphere(x, y, linear_affine(z, 5.22, 0.0), 0.81, 0.7), y, 0.33, 0.14, 0.03, 0.05), 311.98, 320.0) |
| Light fisheye (Best V) | linear_affine(double_sphere(kannala_brandt(y, x, 0.33, 0.14, 0.04, 0.04), x, linear_affine(z, 5.21, 0.0), 0.81, 0.69), 350.03, 240.0) |
| Catadioptric light (Best U) | linear_affine(mei_rives(x, y, z, 0.5), 320.0, 320.0) |
| Catadioptric light (Best V) | linear_affine(mei_rives(y, x, z, 0.5), 359.19, 240.0) |
| Moderate omnidirectional (Best U) | linear_affine(mei_rives(x, y, z, 1.0), 320.0, 320.0) |
| Moderate omnidirectional (Best V) | linear_affine(mei_rives(y, x, z, 1.0), 359.19, 240.0) |
| 360 camera (Best U) | linear_affine(mei_rives(x, y, z, 1.5), 320.0, 320.0) |
| 360 camera (Best V) | linear_affine(mei_rives(y, x, z, 1.5), 359.19, 240.0) |
| Extreme hyperbolic (Best U) | linear_affine(mei_rives(x, y, z, 2.0), 320.0, 320.0) |
| Extreme hyperbolic (Best V) | linear_affine(mei_rives(y, x, z, 2.0), 359.19, 240.0) |
| Light wide-angle (Best U) | linear_affine(equidistant(double_sphere(x, y, linear_affine(z, 149.6, 0.0), 0.99, 0.65), y, 0.33, 0.15, 0.02, 0.06), 318.35, 320.0) |
| Light wide-angle (Best V) | linear_affine(double_sphere(y, x, linear_affine(z, 142.48, 0.0), 0.99, 0.65), 357.37, 240.0) |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Pimentel de Figueiredo, R. Knowledge-Guided Symbolic Regression for Interpretable Camera Calibration. J. Imaging 2025, 11, 389. https://doi.org/10.3390/jimaging11110389
