THE CONFORMAL CAMERA IN MODELING BINOCULAR VISION

Primate vision is an active process that constructs a stable internal representation of the 3D world based on 2D sensory inputs that are inherently unstable due to incessant eye movements. We present here a mathematical framework for processing visual information for a biologically-mediated active vision stereo system with asymmetric conformal cameras. This model utilizes the geometric analysis on the Riemann sphere developed in the group-theoretic framework of the conformal camera, thus far only applicable in modeling monocular vision. The asymmetric conformal camera model constructed here includes the fovea’s asymmetric displacement on the retina and the eye’s natural crystalline lens tilt and decentration, as observed in ophthalmological diagnostics. We extend the group-theoretic framework underlying the conformal camera to the stereo system with asymmetric conformal cameras. Our numerical simulation shows that the theoretical horopter curves in this stereo system are conics that well approximate the empirical longitudinal horopters of the primate vision system.


Introduction
Primates must explore the environment with saccades and smooth pursuit eye movements because acuity in primate foveate vision is limited to a visual angle of a mere two degrees. With about 4 saccades/sec, the high-acuity fovea can be successively fixated on the scene's salient and behaviorally relevant parts at a speed of up to 900 deg/sec. Smooth pursuit at up to 100 deg/sec keeps the fovea focused on slowly moving objects, while a combination of smooth pursuit and saccades tracks objects moving either unpredictably or faster than 30 deg/sec.
In the primate brain, most of the neurons processing visual information encode the position of objects in gaze-centered coordinates, that is, in the frame attached at the fovea in retino-cortical maps. Although this retinotopic information is constantly changing due to the eye's incessant movements, our perception appears stable. Thus, primate vision must be thought of as the outcome of an active process that constructs a clear and stable internal representation of the 3D world based on a combination of unstable sensory inputs and oculomotor signals.
Our computational methodology, based on the conformal camera and the underlying geometric analysis on the Riemann sphere developed in [27][28][29], addresses some of the challenges in modeling active vision. Most notably, by modeling the external scene projected on the retina of a rotating eye with the correspondingly updated retino-cortical maps, it can provide us with efficient algorithms capable of maintaining visual stability when imaging with an anthropomorphic camera head mounted on a moving platform replicating human eye movements [30][31][32].
In this paper, we first review the conformal camera's group-theoretic framework, which, until now, has only been formulated for modeling monocular vision. Then, we discuss the extension of this framework to a model of stereo vision that conforms to the physiological data of primate eyes.
This paper is organized in two parts. The first consists of Sections 2, 3 and 4. In Section 2, the image projective transformations in the conformal camera are given by the Möbius group PSL(2, C) acting by linear fractional mappings on the camera's image plane, which is identified with the Riemann sphere. The group PSL(2, C) is the group of holomorphic automorphisms of the complex structure on the Riemann sphere [12]. The invariants under these automorphisms furnish both Möbius geometry [8] and complex projective geometry [3].
Section 3 introduces both the continuous and discrete projective Fourier transforms. The group of image projective transformations in the conformal camera is the simplest semisimple group. Since representations of semisimple groups have a well-understood mathematical theory, we can provide the conformal camera with its own Fourier analysis, a direction in the representation theory of semisimple Lie groups [14]. The projective Fourier transform (PFT) is constructed by restricting Fourier analysis on the group SL(2, C), the double cover of PSL(2, C), to the image plane of the conformal camera. We stress that the complex projective geometry underlying the conformal camera contrasts with the real projective geometry usually used in computational vision, which does not possess meaningful Fourier analysis on its group of motions.
Next, in Section 4, we discuss the conformal camera's relevance to the computational aspects of anthropomorphic vision. We start here with a discussion of the conformal camera's relevance to early and intermediate-level vision. Then, we discuss the modeling of retinotopy with the conformal camera. We point out that the discrete PFT (DPFT) is computable by a fast Fourier transform algorithm (FFT) in the log-polar coordinates that approximate the retino-cortical maps of the visual and oculomotor pathways. These retinotopic maps are believed to be fundamental to the primate's cortical computations for processing visual and oculomotor information [13].
Still in this section, our modeling of retinotopic image processing with PFT is compared with the accepted model, one that is based on another complex logarithmic function and was originally proposed by Schwartz [23]. In particular, we stress the advantage of imaging with PFT over imaging with the exponential chirp transform developed by the Schwartz group [5]. We conclude this section by discussing the numerical implementation of DPFT in image processing.
The second part of this paper studies the extension of our modeling with the conformal camera to binocular vision. In Section 5, after we review the background of biological stereo vision, we explain how the conformal camera can model the stereo system with a simplified version of the schematic eye, one with a spherical eyeball and rotational symmetry about the optical axis. In contrast to this simplified eye model, the fovea center in the primate's eye is supratemporal on the retina and the visual axis that connects the fixation point with the fovea center is angled about 5.2 degrees nasally to the optical axis.
In Section 6, we develop the asymmetric conformal camera model of the eye that includes both the fovea's asymmetric displacement and the lens' tilt and decentration. From the group-theoretic framework of the asymmetric conformal camera, we conclude that tilting and translating the image plane is like putting 'conformal glasses' on the standard conformal camera. Finally, in Section 7, we demonstrate, by a numerical simulation in GeoGebra, that the resulting horizontal horopter curves are conics that well approximate the empirical horopters, as originally postulated in [19].

The conformal camera
The conformal camera with the underlying geometric and computational framework was proposed in [27].

Stereographic Projection
The conformal camera consists of a unit sphere $S^2$ that models the retina and a plane $\mathbb{C}$ through the sphere's center $O$ where the image processing takes place. The spatial points are centrally projected onto both the sphere and the image plane through the nodal point $N$, chosen on the sphere such that the line interval $ON$ is perpendicular to the image plane, see Figure 1.
Figure 1. The 'image' entity is given by stereographic projection $\sigma$ from the sphere to the plane $\mathbb{C}$. Because $\sigma$ is conformal and maps circles to circles (see the text), it preserves the 'retinal illuminance', that is, the pixels. This image representation is appropriate for efficient computational processing.
The camera's orientation in space is described by a positively-oriented orthonormal frame $(e_1, e_2, e_3)$ such that $e_3 = \overrightarrow{ON}$. The frame is attached to the camera's center $O$, giving spatial coordinates $(x_1, x_2, x_3)$. The image plane $x_3 = 0$ is parametrized with complex coordinates $x_1 + ix_2$. Then, the projection into the image plane $\mathbb{C}$ is given by
$$j(x_1, x_2, x_3) = \frac{x_1 + i x_2}{1 - x_3}. \tag{1}$$
The restriction of (1) to the sphere $S^2 \setminus \{N\}$ defines stereographic projection $\sigma = j|_{S^2}$. The mapping $\sigma$, when extended by $\sigma(N) = \infty$ with the point $\infty$ appended to the image plane $\mathbb{C}$, identifies the sphere $S^2$ with the extended image plane $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ known as the Riemann sphere. Stereographic projection $\sigma$ is conformal, that is, the mapping $\sigma$ preserves the angle of two intersecting curves. In addition, stereographic projection maps circles in the sphere that do not contain $N$ to circles in the plane, and maps a circle passing through $N$ to a line, which can be considered a circle through $\infty$ [18].
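The two properties of the projection stated above can be checked numerically. The following is a minimal Python sketch, assuming the central projection through N = (0, 0, 1) takes the form $(x_1 + ix_2)/(1 - x_3)$; the function names, the test circle, and the use of the cross-ratio as a concyclicity test are illustrative choices and not part of the original framework.

```python
import math

def project(x1, x2, x3):
    """Central projection through N = (0, 0, 1) onto the plane x3 = 0."""
    if math.isclose(x3, 1.0):
        raise ValueError("points at the height of N have no finite image")
    return complex(x1, x2) / (1.0 - x3)

def inverse_stereographic(z):
    """Point of the unit sphere whose stereographic image is z."""
    s = abs(z) ** 2
    return (2.0 * z.real / (s + 1.0), 2.0 * z.imag / (s + 1.0), (s - 1.0) / (s + 1.0))

def sphere_circle(tilt, offset, count):
    """Points of the circle cut from the unit sphere by the plane n.x = offset,
    where n = (sin tilt, 0, cos tilt) is a unit normal."""
    n = (math.sin(tilt), 0.0, math.cos(tilt))
    u = (math.cos(tilt), 0.0, -math.sin(tilt))  # unit vector orthogonal to n
    rho = math.sqrt(1.0 - offset ** 2)          # radius of the circle
    points = []
    for j in range(count):
        t = 2.0 * math.pi * j / count
        # v = (0, 1, 0) completes the orthonormal frame (n, u, v)
        points.append((offset * n[0] + rho * math.cos(t) * u[0],
                       rho * math.sin(t),
                       offset * n[2] + rho * math.cos(t) * u[2]))
    return points

def cross_ratio(z1, z2, z3, z4):
    """Real exactly when the four points lie on one circle or line."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))

circle = sphere_circle(tilt=0.7, offset=0.3, count=8)  # a circle that avoids N
images = [project(*p) for p in circle]
worst = max(abs(cross_ratio(images[0], images[1], images[2], w).imag)
            for w in images[3:])
print("largest imaginary part of the cross-ratio:", worst)
```

A cross-ratio that is real up to rounding error for every fourth point indicates that the projected points are concyclic, in agreement with the circle-preserving property.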

The Group of Image Projective Transformations
A stationary planar object, or a planar surface of a 3D object, shown in Figure 2 as a black rectangular region in the scene, is projected into the image plane in the initial gaze (Gaze 1) of the conformal camera. The gaze change from Gaze 1 to Gaze 2 is shown in Figure 2 as a horizontal rotation φ. Here the sequence of transformations $q \to z \to q_1 \to q_2 \to q_3 \to z' = g \cdot z$ explains the image projective transformation.
The image transformations resulting from the gaze change are compositions of the two basic transformations that are schematically shown in Figure 2 in the rotated image plane. Alternatively, these transformations can be formulated in the initial image plane [32].
The first basic image transformation, the h-transformation, is rendered by translating the object's projected image by $b = (b_1, b_2, b_3)$ and then projecting it centrally through the rotated nodal point $N_1$ back into the image plane (stereographic projection $\sigma$ in Figure 1). It is given by the following mapping where δ = (1 The last equality in (2) defines the linear-fractional action of the matrix on the point z in the image plane.
The second basic image transformation resulting from the gaze rotation with the angle φ is denoted as the k-transformation. This transformation is defined by projecting the output from the h-transformation into the sphere through the center of projection $N_1$, rotating it with the sphere by the angle −2φ, and then projecting it back to the (rotated) image plane. Here, the gaze rotation with φ results in the sphere rotation by −2φ by the Central Angle Theorem. In general, the image transformation corresponding to the gaze rotation by the Euler angles $(\psi, \varphi, \psi')$ is the following where $\alpha = e^{-i(\psi+\psi')/2}\cos\varphi$ and $\beta = -e^{-i(\psi-\psi')/2}\sin\varphi$.
In the k-transformation, rotation angles can be assumed to be known, for they are used by the biological vision system to program the eye movements that fixate targets. We have shown in [31] how to estimate the intended gaze rotation from the image of the target. The h-transformation is given in terms of the unknown vector The composition of the basic transformations in (2) and (3) can be done with the multiplication of matrices, as shown in the second line of (4). Because the mappings (4) are conformal, they introduce the conformal distortions shown in Figure 2 by the backprojected gray-shaded region outlined on the object in the scene. Although these distortions could be removed with minimal computational cost [27,29], we do not because they are useful in visual information processing [31].
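The statement that composing linear-fractional mappings amounts to multiplying their matrices can be illustrated with a short Python sketch. The SU(2)-type matrix below is built from the coefficients α and β quoted in the text, with the usual layout [[α, β], [−conj(β), conj(α)]] taken as an assumption; the upper-triangular matrix `h_matrix` is a hypothetical determinant-one stand-in and is not claimed to be the paper's h-transformation.

```python
import cmath
import math

def act(g, z):
    """Linear-fractional action of the 2x2 matrix g on the point z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def matmul(g, h):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

def det(g):
    (a, b), (c, d) = g
    return a * d - b * c

def k_matrix(psi, phi, psi_prime):
    """SU(2)-type matrix from the coefficients alpha and beta given in the text
    (matrix layout assumed)."""
    alpha = cmath.exp(-0.5j * (psi + psi_prime)) * math.cos(phi)
    beta = -cmath.exp(-0.5j * (psi - psi_prime)) * math.sin(phi)
    return ((alpha, beta), (-beta.conjugate(), alpha.conjugate()))

def h_matrix(delta, gamma):
    """Hypothetical upper-triangular matrix with determinant one."""
    return ((delta, delta * gamma), (0.0, 1.0 / delta))

def negate(g):
    return tuple(tuple(-entry for entry in row) for row in g)

k = k_matrix(0.2, 0.4, -0.1)
h = h_matrix(1.1, 0.3 + 0.2j)
z = 0.5 - 0.25j

stepwise = act(k, act(h, z))      # apply h first, then k
composed = act(matmul(k, h), z)   # apply the product matrix once
print(abs(stepwise - composed))   # agreement up to rounding error
```

Because a matrix and its negative act identically, as `act(negate(g), z)` shows, the action factors through the quotient by {±I}.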
In the language of groups, $h(b_1, b_2, b_3) \in AN$, where and $k(\psi, -2\varphi, \psi') \in SU(2)$, where is the double cover of the group of rotations SO(3) [1]. Thus, SU(2) is isomorphic to the group of unit quaternions.
The polar decomposition SL(2, C) = SU(2) A SU(2), where implies that the finite iterations of h- and k-transformations generate the action of SL(2, C) on the Riemann sphere $\hat{\mathbb{C}}$ by linear-fractional mappings such that An image intensity function f's projective transformations are given by the following action where we need the quotient group

Geometry of the Image Plane
The group PSL(2, C) with the action (6) is known as the Möbius group of holomorphic automorphisms on the Riemann sphere $\hat{\mathbb{C}}$ that gives the complex, or analytic, structure on $\hat{\mathbb{C}}$ [12]. In the Kleinian view of geometry, known as the Erlanger Program, Möbius geometry is the study of invariants under the group of holomorphic automorphisms [8]. Further, the group PSL(2, C) also defines the projective geometry of a one-dimensional complex space [3], giving us the isomorphism of the complex projective line and the Riemann sphere. This means that the conformal camera synthesizes geometric and analytic, or numerical, structures to provide a unique computational environment that is geometrically precise and numerically efficient.
The conformal camera's image plane does not admit a distance that is invariant under image projective transformations; therefore, the geometry of the camera does not possess a Riemannian metric. For instance, there are no geodesics or curvature. However, because linear-fractional transformations map circles to circles, circles may play the role of geodesics, with the inverse of the circle's radius playing the role of curvature. This makes the conformal camera relevant to the intermediate-level vision computational aspects of natural scene understanding, later discussed in Section 4.2.
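The following Python sketch illustrates this point: a linear-fractional transformation sends a circle to a circle, while the radius, and hence the quantity playing the role of curvature, changes. The matrix `g` is an arbitrary illustrative choice, and `circumcircle` is a helper introduced here only for the check.

```python
import cmath
import math

def act(g, z):
    """Linear-fractional action of the 2x2 matrix g on the point z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def circumcircle(z1, z2, z3):
    """Center and radius of the circle through three non-collinear points."""
    x, w = z2 - z1, z3 - z1
    c = (abs(x) ** 2 * w - abs(w) ** 2 * x) / (x.conjugate() * w - w.conjugate() * x)
    return z1 + c, abs(c)

# Arbitrary invertible matrix; its pole z = -5 lies off the unit circle.
g = ((1.0, 0.5j), (0.2, 1.0))

unit_circle = [cmath.exp(2j * math.pi * j / 12) for j in range(12)]
images = [act(g, z) for z in unit_circle]

center, radius = circumcircle(images[0], images[4], images[8])
deviation = max(abs(abs(w - center) - radius) for w in images)

print("deviation from the image circle:", deviation)
print("curvature before:", 1.0, "after:", 1.0 / radius)
```

All twelve image points lie on the circle fitted through three of them, yet the radius differs from 1, which is consistent with the absence of an invariant distance.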

Group representations and Fourier analysis
The main role of the theory of group representations, in relation to Fourier analysis, is to decompose the space of square-integrable functions defined on a set on which the group acts naturally in terms of the irreducible unitary representations, the simplest homomorphisms of the group into the set of unitary linear operators on a Hilbert space.
In this decomposition, the generalized Fourier transform plays the same role on any group as the classical Fourier transform does on the additive group of real numbers.In this classical case of Fourier transform, the irreducible unitary representations are homomorphisms of the additive group into the multiplicative group of complex numbers of modulus one, or the circle group.Here the homomorphisms are given by the complex exponential functions one finds in the definition of the standard Fourier integral [22].
Because group theory is rooted in large part in geometry through Klein's Erlanger Program of studying spaces through their groups of motions, this geometric Fourier analysis emphasizes the covariance of the decompositions with respect to the geometric transformations.
The group SL(2, C) is the simplest of semisimple groups that have a well understood representation theory initiated by Gelfand's school and completed by Harish-Chandra [9].
Therefore, the conformal camera possesses its own Fourier analysis well adapted to image projective transformations given by the group SL(2, C) acting on the Hilbert space of square-integrable functions on the image plane C [28,29].

Projective Fourier Transform
The projective Fourier analysis has been constructed by restricting geometric Fourier analysis on SL(2, C) to the image plane of the conformal camera (see Section 7 in [28]). The resulting projective Fourier transform (PFT) of a given image intensity function $f$ is the following
$$\hat{f}(s, k) = \frac{i}{2}\int f(z)\,|z|^{-is-1}\left(\frac{z}{|z|}\right)^{-k} dz\,d\bar{z}, \tag{8}$$
where $(s, k) \in \mathbb{R} \times \mathbb{Z}$, and, if $z = x_1 + ix_2$, then $(i/2)\,dz\,d\bar{z} = dx_1\,dx_2$. In log-polar coordinates $(u, \theta)$ given by $\ln re^{i\theta} = \ln r + i\theta = u + i\theta$, (8) takes on the form of the standard Fourier integral
$$\hat{f}(s, k) = \iint f(e^{u+i\theta})\,e^{u}\,e^{-i(us+\theta k)}\,du\,d\theta. \tag{9}$$
Inverting it, we obtain the representation of the image intensity function in the $(u, \theta)$-coordinates,
$$e^{u}\,\mathsf{f}(u, \theta) = \frac{1}{(2\pi)^2}\sum_{k=-\infty}^{\infty}\int \hat{f}(s, k)\,e^{i(us+\theta k)}\,ds,$$
where $\mathsf{f}(u, \theta) = f(e^{u+i\theta})$. We stress that although the functions $\mathsf{f}$ and $f$ have the same values at the corresponding points, they are defined on different spaces; the function $\mathsf{f}$ is defined on the log-polar space (the cortical visual area) while $f$ is defined on the image plane (the retina).
In spite of the logarithmic singularity of log-polar coordinates, an image f that is integrable on This observation is crucial in constructing the discrete PFT.

Noncompact and Compact Realizations of PFT
It should be noted that the one-dimensional PFT was constructed from the infinite-dimensional Fourier transform on SL(2, C) in the noncompact picture of irreducible unitary representations of the group SL(2, C), see [28]. Later in [29], the second, finite-dimensional projective Fourier transform was constructed in the compact picture of irreducible unitary representations of the group SL(2, C). Both pictures have been used to study group representations in semisimple representation theory; each picture simplifies representations by emphasizing different aspects without the loss of information, see [14]. =' means that equality holds up to a lower-dimensional subset of measure zero, that is, almost everywhere, and N in (5) represents Euclidean translations under the action (6). These facts justify use of the name 'projective Fourier transform' and allow us to develop numerically efficient implementations of this transform in image processing well adapted to projective transformations [28].

Discrete Projective Fourier Transform
It follows from (10) that we can remove a disk $|z| \le r_a$ to regularize $f$ such that the support of $\mathsf{f}(u, \theta)$ is contained within $(\ln r_a, \ln r_b) \times [0, 2\pi)$, and approximate the integral in (9) by a double Riemann sum with equally spaced partition points
$$(u_k, \theta_l) = (\ln r_a + k\delta,\; 2\pi l/N), \tag{11}$$
where $0 \le k \le M - 1$, $0 \le l \le N - 1$ and $\delta = T/M$ with $T = \ln(r_b/r_a)$. We can obtain (see [28] for details) the discrete projective Fourier transform (DPFT), and its inverse, where $f_{k,l} = (2\pi T/MN)\,f(e^{u_k} e^{i\theta_l})$ are image plane samples and $\mathsf{f}_{k,l} = (2\pi T/MN)\,\mathsf{f}(u_k, \theta_l)$ are log-polar samples. Both expressions (12) and (13) can be computed efficiently by FFT.
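To make the discretization concrete, the sketch below evaluates a Riemann-sum version of the log-polar integral (9) on a grid with M rings and N sectors, and then inverts it. The explicit formulas are an assumption: samples are weighted by $e^{u_k}$ and paired with the usual discrete Fourier kernel, which may differ from the expressions (12) and (13) of [28] by a phase factor and normalization. A plain double loop is used in place of an FFT so that the example depends only on the standard library; the function names and the test image are hypothetical.

```python
import cmath
import math

def log_polar_grid(r_a, r_b, M, N):
    """Equally spaced partition points u_k = ln r_a + k*delta, theta_l = 2*pi*l/N."""
    delta = math.log(r_b / r_a) / M
    u = [math.log(r_a) + k * delta for k in range(M)]
    theta = [2.0 * math.pi * l / N for l in range(N)]
    return u, theta

def dpft(samples, u):
    """Assumed discrete transform: sum of samples[k][l] * e^{u_k} * DFT kernel."""
    M, N = len(samples), len(samples[0])
    coeffs = [[0j] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0j
            for k in range(M):
                for l in range(N):
                    kernel = cmath.exp(-2j * math.pi * (m * k / M + n * l / N))
                    acc += samples[k][l] * math.exp(u[k]) * kernel
            coeffs[m][n] = acc
    return coeffs

def inverse_dpft(coeffs, u):
    """Inverse of dpft: inverse DFT followed by removal of the e^{u_k} weight."""
    M, N = len(coeffs), len(coeffs[0])
    samples = [[0j] * N for _ in range(M)]
    for k in range(M):
        for l in range(N):
            acc = 0j
            for m in range(M):
                for n in range(N):
                    acc += coeffs[m][n] * cmath.exp(2j * math.pi * (m * k / M + n * l / N))
            samples[k][l] = acc * math.exp(-u[k]) / (M * N)
    return samples

def image(z):
    """Hypothetical smooth test image on the image plane."""
    return math.exp(-abs(z - 1.0) ** 2)

u, theta = log_polar_grid(r_a=0.5, r_b=4.0, M=4, N=6)
samples = [[image(cmath.exp(complex(uk, tl))) for tl in theta] for uk in u]
recovered = inverse_dpft(dpft(samples, u), u)
error = max(abs(recovered[k][l] - samples[k][l]) for k in range(4) for l in range(6))
print("round-trip error:", error)
```

In practice the double loops would be replaced by a 2D FFT applied to the weighted samples $e^{u_k} f_{k,l}$, which is where the computational efficiency comes from.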

Imaging with the Conformal Camera
The action of the group of image projective transformations on the image function is given without precise relation to the object; recall the definition of vector b in the h-transformation.Even for horizontal rotations of the conformal camera, vector b needs to be defined.However, this vector does not need to be defined when imaging with the conformal camera is applied to processing visual information during saccades [30].On the other hand, for processing visual information during smooth pursuit with the conformal camera, the analytical expression for the vector b was derived in [32] by using the objects' relative motions.
Moreover, the imaging with the conformal camera must be considered only for 'planar' objects, that is, the planar surfaces of 3D objects. To justify this requirement, we note that only the most basic features are extracted from the visual information impinging on the retina before being sent to the areas of the brain used for processing. Thus, the initial image of the centrally projected scene is comprised of numerous brightness and color spots from many different locations in space and does not contain explicit information about the perceptual organization of the scene [25]. What is initially perceived is a small number of objects' surfaces segmented from the background and each other [21]. The object's 3D attributes and the scene's spatial organizations are acquired when 2D projections on the retina are processed by numerous cortical areas downstream along the visual pathway. This processing extracts the monocular information (texture gradients, relative size, linear and aerial perspectives, shadows and motion parallax) and, when two eyes see the scene, the binocular information (depth and shape).

Intermediate-Level Vision
Intermediate-level vision is made up of perceptual analyses carried out by the brain, and is responsible for our ability to identify objects when they are partially occluded and our ability to perceive an object to be the same even as size and perspective change.
Two basic intermediate-level vision descriptors that the brain employs in identifying global objects are the medial axis transformation [4] and the curvature extrema [10].The medial axis, which the visual system extracts as a skeletal description of objects [15], can be defined as the set of the centers of the maximal circles inscribed inside the contour.The curvatures at the corresponding points of the contour are given by the inverse radii of the circles.
Since circles are preserved under image projective transformations, the intermediate-level descriptors are preserved during the conformal camera's movements.We conclude that imaging with the conformal camera should be relevant to modeling primate visual perception.

DPFT in Modeling Retinotopy
Information from the visual field, sampled and processed by the retina, arrives at the midbrain's superior colliculus (SC) and, via the lateral geniculate nucleus (LGN), at the primary visual cortex (V1). Both SC and V1 contain retinotopic maps. These retinotopic maps can be characterized by the following principle (e.g., [7]): for the contralateral visual field, retinotopy transforms the retinal polar coordinates centered at the fovea to the cortical coordinates given by the perpendicular polar and eccentricity axes. Further, the amount of cortical tissue dedicated to the representation of a unit distance on the retina, the magnification factor, is inversely related to its eccentricity, implying that foveal regions are characterized by a large cortical magnification, with the extrafoveal region scaled logarithmically with eccentricity [7].
As the retinal image changes during gaze rotations, the retinotopic map in V1 undergoes the corresponding changes that form the input for subsequent topographically organized visual areas.
The mappings w = ln(z ± a) − ln a give an accepted approximation of the retinotopic structure in V1 and SC areas [23,26], where a > 0 removes logarithmic singularity and ±a indicates either the left or right brain hemisphere, depending on its sign.On the other hand, the DPFT that provides the data model for image representation can be efficiently computed by FFT in log-polar coordinates given by the complex logarithmic mapping w = ln z.Thus, this logarithmic mapping must be used in our model to approximate retinotopy.
However, both complex logarithmic mappings give similar approximations for the peripheral region. In fact, for $|z| \ll a$, $\ln(z \pm a) - \ln a$ is approximately linear while, for $|z| \gg a$, it is dominated by $\ln z$. Moreover, to construct discrete sampling for DPFT, the image is regularized by removing a disk $|z| \le r_a$, which represents the foveal region that contains the singularity of ln z at z = 0. Although it may seem at first that our model is compromised by the loss of a foveal region of about a 2 deg central angle, our next discussion demonstrates how the opposite may be true.
From the basic properties of ln z, $\ln(e^{i\varphi}z) = \ln z + i\varphi$ and $\ln(\rho z) = \ln z + \ln\rho$, it follows that the rotation and dilation transformations of an intensity function in exp-polar coordinates $f(e^{u}e^{i\theta})$ correspond to simple translations of the log-polar image $\mathsf{f}(u, \theta)$ via $f(e^{i\varphi}e^{u}e^{i\theta}) = f(e^{u}e^{i(\theta+\varphi)}) = \mathsf{f}(u, \theta + \varphi)$ and $f(\rho e^{u}e^{i\theta}) = f(e^{u+v}e^{i\theta}) = \mathsf{f}(u + v, \theta)$, where $v = \ln\rho$.
The functions $f$ and $\mathsf{f}$ were introduced in Section 3.2. These distinctive features of ln z are useful in the development of image identification and recognition algorithms. The mappings $\ln(z \pm a) - \ln a$ do not share them, so the Schwartz model of retinotopy destroys these properties, which are so critical to computational vision.
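A small numerical check of this comparison is sketched below in Python. It assumes the value a = 1 and two arbitrary test points; the function names are illustrative. Under a combined rotation and dilation, ln z moves every point by the same complex shift, whereas ln(z + a) − ln a moves different points by different amounts. The last lines check the two limiting regimes of the shifted mapping.

```python
import cmath
import math

def log_polar(z):
    """Complex logarithm w = ln z (principal branch)."""
    return cmath.log(z)

def shifted_log(z, a):
    """Shifted mapping w = ln(z + a) - ln a with a > 0."""
    return cmath.log(z + a) - math.log(a)

rho, phi, a = 1.5, 0.3, 1.0          # dilation, rotation angle, assumed offset
g = rho * cmath.exp(1j * phi)
points = [0.5 + 0.2j, 2.0 + 1.0j]    # arbitrary test points away from the branch cut

# Displacement of each point in the cortical coordinates under z -> g*z.
plain_shifts = [log_polar(g * z) - log_polar(z) for z in points]
shifted_shifts = [shifted_log(g * z, a) - shifted_log(z, a) for z in points]

print("ln z shifts:        ", plain_shifts)    # both equal ln(rho) + i*phi
print("ln(z+a)-ln a shifts:", shifted_shifts)  # depend on the point

# Limiting regimes of the shifted mapping.
near = shifted_log(0.01, a)      # |z| much smaller than a: close to z/a
far = shifted_log(1000.0, a)     # |z| much larger than a: close to ln z - ln a
print(near, far)
```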
Further, psychophysiological evidence suggests that the fovea and periphery have different functional roles in vision and very likely involve different image processing principles [24,35].An example of the separation in foveal and peripheral processing is explained in [24] in the context of curveballs in baseball.Often batters report that balls undergo a dramatic and nearly discontinuous shift in their position as they dive in a downward path near home plate.This shift in the ball's position occurs when the image of the ball passes the boundary on the retina between these two regions.[24] argues that this phenomenon is a result of the differences between foveal and peripheral processing.
We finally mention the computational advantages of representing images in terms of the PFT rather than in terms of the exponential chirp transform (ECT) developed by the Schwartz research group in [5]. The ECT is constructed by making the substitution
$$(x, y) = (e^{u}\cos\theta,\; e^{u}\sin\theta) \tag{14}$$
in the standard 2D Fourier integral. Because the Jacobian of the transformation (14) is translation-invariant, this substitution makes the ECT well adapted to translations in Cartesian coordinates. There is then a clear dissonance between the nonuniform retinal image sampling grid and this shift invariance of the ECT. On the other hand, the PFT is a genuine Fourier transform constructed from irreducible unitary representations of the group of image projective transformations. Further, the change of variables by transforms the PFT into the standard Fourier integral. Thus, the discrete PFT is computable by FFT in log-polar coordinates that approximate the retinotopy.
The difference between (14) and (15) implies that the PFT does not experience the problem of exponentially growing frequencies like the ECT does and, for a band-limited original image, there is no difficulty with the Nyquist sampling condition in log-polar space [28,29].

Numerical Implementation of DPFT
The DPFT approximation was obtained using the rectangular sampling grid $(u_k, \theta_l)$ in (11), corresponding, under the mapping $u_k + i\theta_l \mapsto z_{k,l} = e^{u_k + i\theta_l} = r_k e^{i\theta_l}$, to a nonuniform sampling grid with equal sectors and exponentially increasing radii where $\delta = u_{k+1} - u_k$ is the spacing $\delta = T/M$ and $r_0 = r_a$ is the radius of the disc that has been removed to regularize the logarithmic singularity of $u = \ln r$.
Let us assume that we have been given a picture of the size $A \times B$ displayed with $K$ dots per unit length. The physical dimensions of the pixel and the picture, in the chosen unit of length, are $1/K \times 1/K$ and $A/K \times B/K$, respectively. In addition, we assume that the retinal coordinates' origin is the picture's center. The central disc of radius $r_0$ represents the foveal region of a uniform sampling grid with the number of the pixels $N_f$, given by $\pi r_0^2 = N_f/K^2$. This means that the foveal image cannot increase the picture's resolution, which is dependent on its distance from the eye. The number of sectors is obtained from the condition $2\pi(r_0 + r_1)/2 \approx N(1/K)$, where $N = [2\pi r_0 K + \pi]$. Here $[a]$ is the closest integer to $a$. To obtain the number of rings $M$, we assume that $\rho_0 = r_0(e^{\delta} - 1) = 1/K$ and $r_b = r_M = r_0 e^{M\delta}$. We can take either
Example 1. We let $A \times B = 512 \times 512$ and $K = 4$ per mm, so that the physical dimensions in mm are $128 \times 128$ and $r_b = 128\sqrt{2}/2 = 90.5$. Furthermore, we let $N_f = 296$, so that $r_0 = 2.427$ and $N = 64$. Finally, $\delta = \ln(10.7084/9.7084) \approx 0.09804$ and $(1/0.09804)\ln(90.5/2.427) \approx M = 37$. The sampling grid consists of points in polar coordinates: $(r_k + \rho_{k+1}/2,\; \theta_l + \pi/64) = (2.552\,e^{0.09804k},\; (2l + 1)\pi/64)$, $k = 0, 1, \dots, 36$, $l = 0, 1, \dots, 63$.
In this example, the original image has 262,144 pixels, whereas the foveal and peripheral representations of the image together contain only 2,664 pixels. Thus, there are about 100 times more pixels in the original image than in the image sampled in log-polar coordinates. To compare, light carrying visual information about the external world is initially sampled by about 125 million photoreceptors. When processed by the retinal circuitry, this visual information converges on about 1.5 million ganglion cell axons that carry the output from the eye.
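The arithmetic of Example 1 can be reproduced with the short Python sketch below, which follows the conditions stated above: $\pi r_0^2 = N_f/K^2$ for the foveal radius, $N = [2\pi r_0 K + \pi]$ for the number of sectors, $r_0(e^{\delta} - 1) = 1/K$ for the ring spacing, and $r_b = r_0 e^{M\delta}$ for the number of rings. The function name and the returned dictionary are illustrative choices, and rounding M to the nearest integer is an assumption consistent with M = 37 in the example.

```python
import math

def log_polar_sampling(width_px, height_px, K, N_f):
    """Grid parameters for a picture of width_px x height_px pixels at K dots
    per unit length, with N_f pixels kept in the uniformly sampled foveal disc."""
    r_b = math.hypot(width_px / K, height_px / K) / 2.0  # half the picture diagonal
    r_0 = math.sqrt(N_f / math.pi) / K                   # from pi * r_0^2 = N_f / K^2
    N = round(2.0 * math.pi * r_0 * K + math.pi)         # number of sectors
    delta = math.log(1.0 + 1.0 / (K * r_0))              # from r_0 (e^delta - 1) = 1/K
    M = round(math.log(r_b / r_0) / delta)               # number of rings
    return {
        "r_b": r_b, "r_0": r_0, "N": N, "delta": delta, "M": M,
        "log_polar_pixels": N_f + M * N,
        "original_pixels": width_px * height_px,
    }

params = log_polar_sampling(512, 512, K=4, N_f=296)
for name, value in params.items():
    print(name, value)
print("compression ratio:", params["original_pixels"] / params["log_polar_pixels"])
```

Small differences in the last digits relative to the quoted values arise because the example rounds $r_0$ to 2.427 before computing δ.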

Binocular Vision and the Conformal Camera
Each of our two eyes receives a slightly different retinal projection of a scene due to their lateral separation from each other. Nevertheless, we experience our visual world as if it were seen from just one viewpoint. The two disparate 2D retinal images are fused into one image that gives us the impression of a 3D space. The locus of points in space that are seen singly is known as the horopter, and the perceived direction that represents the visual axes of the two eyes is often referred to as the Cyclopean axis.
Binocular disparity, or stereopsis, refers to the small differences in the perspective projections on the right and left eyes. When a point lies either in front of or behind the horopter curve containing the fixation point, the difference in the angles subtended on each retina between the image and the center of the fovea defines retinal disparity. This difference provides a cue for the object's depth from an observer's point of fixation. The difference in retinal disparities for a pair of points defines their relative disparity, and the relative disparity provides a cue for the perception of 3D structure, components of which include relative depth and shape. Relative disparity is usually assumed to not depend on the eyes' positions [16].
Conventional geometric theory of binocular projections is incorrect in identifying the geometric horopter with the Vieth-Müller circle. This two-century-old theory incorrectly assumes that the eye's optical node coincides with the eyeball's rotational center, yet it still influences theoretical developments in binocular vision. Anatomically correct binocular projection geometry was recently presented in [33].
The main results in [33] are the following: (1) the Vieth-Müller circle is the isovergence circle that is not the geometric horopter and (2) relative disparity depends on eye position when the nodal point is at the anatomically correct location.Moreover, calculations for typical viewing distances show that such changes in relative disparity are within binocular acuity limits [34].During fixation, the eyes continually jitter, drift, and make micro-saccades, and we hypothesize that the small changes in perceived size and shape due to these eye movements may be needed, not only for perceptual benefits such as 'breaking camouflage', but also for the aesthetic benefit of stereopsis [20].
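The isovergence property in result (1) can be illustrated by the inscribed angle theorem. The Python sketch below assumes, for illustration only, that the circle is drawn through the two eye rotation centers placed at (−a, 0) and (a, 0) with a = 3.25 cm, and that its center lies at height h = 20 cm on the midline; these values and the function names are not taken from [33]. Every point on the arc in front of the eyes then subtends the same vergence angle, while a point off the circle does not.

```python
import math

def vergence(point, a):
    """Angle subtended at `point` by the rotation centers (-a, 0) and (a, 0)."""
    x, y = point
    to_left = math.atan2(-y, -a - x)    # direction from the point to the left center
    to_right = math.atan2(-y, a - x)    # direction from the point to the right center
    return abs(to_left - to_right)

def circle_points(a, h, count):
    """Points with y > 0 on the circle centered at (0, h) through (-a, 0) and (a, 0)."""
    rho = math.hypot(a, h)
    ts = [0.3 + (math.pi - 0.6) * j / (count - 1) for j in range(count)]
    return [(rho * math.cos(t), h + rho * math.sin(t)) for t in ts]

a, h = 3.25, 20.0                        # assumed half-separation and center height, cm
angles = [vergence(p, a) for p in circle_points(a, h, 7)]
expected = math.atan2(a, h)              # inscribed angle = half the central angle

print("vergence on the circle:", [round(v, 6) for v in angles])
print("inscribed-angle value: ", round(expected, 6))
print("vergence off the circle:", round(vergence((0.0, 80.0), a), 6))
```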
The geometric horopter in [33] corresponds to a simplified version of the schematic eye, also called a reduced eye.In this model, the two nodal points coincide at the refractive surface's center of curvature.The light from the fixation point travels through the nodal point to the center of the fovea, a path referred to as the visual axis.The optical axis coincides with the visual axis when the fovea center is assumed to coincide with the posterior focus.This model of the reduced eye complies with the conformal camera's imaging framework discussed in Section 2, resulting in the conformal camera that is capable of modeling stereo vision.
Still, the reduced eye model remains an idealization. In contrast to the schematic eye model, the eye's fovea center is supratemporal on the retina and the visual axis is angled about 5.2 degrees nasally to the optical axis. This angle is called the α angle. Moreover, ophthalmological diagnostics have shown that even in normal eyes with good visual acuity, small lens misalignments relative to the optical axis do exist [6,17]. The importance of the eyes' asymmetry follows from two facts. The first is well-known: the empirical longitudinal horopter deviates from the circular curves that form the geometric horopters. This so-called Hering-Hillebrand horopter deviation, shown in Figure 3, can be accounted for by asymmetry in the effective spatial positions of the corresponding elements in the two eyes. Two retinal elements, each one in a different eye, are corresponding if they invoke a single percept when stimulated. The second fact is the claim recently made in [2] that the natural crystalline lens tilt and decentration in humans tends to compensate for various types of aberration.

The Asymmetric Conformal Camera
To model the eye with both the tilt and decentration of the natural crystalline lens, we present in this section the asymmetric conformal camera.Although the optical axis should best approximate the lost symmetry in the alignment of the eye's components, we assume that this axis is the line passing through the nodal point and the spherical eyeball's rotation center.
The modified conformal camera is obtained by rotating the image plane about the x 2 -axis by the angle β and then translating the origin by z 0 , as shown in Figure 4. We refer to Figure 4 for the notation used in the remaining part of the paper.The immediate requirement is that the visual axis passing through the nodal point and point z 0 forms the angle α = 5.2 • with the optical axis.The eye model with an asymmetrically displaced fovea and a tilted and decentred lens outlined with thin (1pt) curves.The asymmetric conformal camera is outlined with thick (2pt) curves.The stereo system is obtained if the left camera is reflected in the head axis X 3 .Each camera's image plane, shown here for the left eye, is perpendicular to the head axis such that the horopter curve is a straight line passing through the fixation point F at the abathic distance from origin O.
However, because the nodal point is identified with the 'north pole' of the stereographic projection, the point $N(0, 0, 1)$, we place the nodal point 1 cm from the eye rotation center to simplify the discussion. This discrepancy with the physiological distance of 0.6 cm can be easily corrected. The angles $\alpha = 5.2^\circ$ and $\beta$ give the fovea an asymmetric displacement of $f_p = 1.63$ mm and the lens a decentration of $y_\beta = \sin\beta$.
The points on the image plane have coordinates relative to the plane origin $O_L$. The projection $\xi_\beta = z_\beta - z_0$ of a space point on the tilted image plane with the 'foveal' center at $z_0$ can be expressed in terms of the projection $z$ on the original image plane, allowing us to find the transformation between the image planes. To this end, we note that $|y_\beta N_L| = \cos\beta$. Then, each of the right triangles $\Delta z_0 y_\beta N_L$ and $\Delta z_\beta y_\beta N_L$ gives one relation. Solving these two relations for $z_0 - y_\beta$ and $z_\beta - y_\beta$, and taking the difference, we obtain $\xi_\beta$ in Equation (16). Next, the right triangle $\Delta z C_L N_L$ gives the relation for $z$ in Equation (17). Introducing (17) into (16), we obtain $\xi_\beta$ in terms of $z$, which can be expressed by the linear fractional action in Equation (18).
We denote the matrix in (18) by $m_\beta$, so that $\xi_\beta = m_\beta \cdot z$. The conjugation $g \mapsto g_\beta = m_\beta g m_\beta^{-1}$, which satisfies (19), is an inner automorphism of the group SL(2, C). This inner automorphism maps the group of image projective transformations onto itself by using the image projective transformation $m_\beta$ that represents the camera's asymmetry.
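The chain of right-triangle steps above can be sketched as follows. This is a reconstruction under stated assumptions and need not agree term by term with Equations (16)-(18): the eye radius is 1, the symbol $\varphi$ (introduced here for illustration) is the angle at $N_L$ between the optical axis and the ray through $z$, and $\alpha$, $\beta$ and $\varphi$ are measured in the same sense, so that the rays through $z_0$ and $z_\beta$ make the angles $\alpha - \beta$ and $\varphi - \beta$ with the perpendicular $N_L y_\beta$.

```latex
% Assumed sign convention: alpha, beta, phi measured in the same sense.
\begin{align*}
z_0 - y_\beta &= \cos\beta\,\tan(\alpha - \beta), &
z_\beta - y_\beta &= \cos\beta\,\tan(\varphi - \beta), \\
\xi_\beta = z_\beta - z_0 &= \cos\beta\,\bigl[\tan(\varphi - \beta) - \tan(\alpha - \beta)\bigr], &
z &= \tan\varphi .
\end{align*}
% Substitute tan(phi - beta) = (z cos(beta) - sin(beta)) / (z sin(beta) + cos(beta))
% and simplify with the angle-addition formulas for cos(alpha) and sin(alpha):
\begin{equation*}
\xi_\beta
= \frac{\cos\beta}{\cos(\alpha - \beta)}\,
  \frac{z\cos\alpha - \sin\alpha}{z\sin\beta + \cos\beta}
= \begin{pmatrix}
    \dfrac{\cos\alpha\cos\beta}{\cos(\alpha - \beta)} &
    -\dfrac{\sin\alpha\cos\beta}{\cos(\alpha - \beta)} \\[2ex]
    \sin\beta & \cos\beta
  \end{pmatrix} \cdot z .
\end{equation*}
```

Under these assumptions the matrix has determinant $\cos\beta$, so dividing it by $\sqrt{\cos\beta}$ gives an element of SL(2, R) inside SL(2, C) with the same linear fractional action. Two sanity checks: $z = \tan\alpha$ is mapped to $\xi_\beta = 0$, and for $\beta = 0$ the action reduces to the translation $\xi_0 = z - \tan\alpha$.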
Since $g$ and $g_\beta$ have the same algebraic properties, they behave geometrically in the same way. For instance, the $kh$-transformation discussed in Section 1, which gives the image transformation from the conformal camera gaze change, is preserved under the conjugation because $(kh)_\beta = k_\beta h_\beta$. Thus, we can work with the asymmetric conformal camera as we did with the standard conformal camera.
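The algebraic statement above can be checked numerically. The following is a minimal sketch, not the paper's implementation: the matrix `m_beta` is a hypothetical stand-in obtained by assuming a unit eye radius and that $\alpha$ and $\beta$ are measured in the same sense, and `k`, `h` are random SL(2, C) elements rather than actual gaze-change transformations. The helper names `moebius`, `to_sl2` and `conjugate` are introduced here for illustration.

```python
import numpy as np

def moebius(g, z):
    """Linear fractional action g . z = (a z + b) / (c z + d)."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def to_sl2(g):
    """Rescale a 2x2 matrix to determinant 1; the Moebius action is unchanged."""
    g = np.asarray(g, dtype=complex)
    return g / np.sqrt(np.linalg.det(g))

def conjugate(m, g):
    """Inner automorphism g -> m g m^{-1}."""
    return m @ g @ np.linalg.inv(m)

# Hypothetical asymmetry matrix (an assumption, not taken from Equation (18)).
alpha, beta = np.radians(5.2), np.radians(3.3)
scale = np.cos(beta) / np.cos(alpha - beta)
m_beta = to_sl2([[scale * np.cos(alpha), -scale * np.sin(alpha)],
                 [np.sin(beta), np.cos(beta)]])

# Two arbitrary SL(2, C) elements standing in for k and h.
rng = np.random.default_rng(0)
k = to_sl2(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
h = to_sl2(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))

# (kh)_beta = k_beta h_beta: conjugation preserves products.
product_ok = np.allclose(conjugate(m_beta, k @ h),
                         conjugate(m_beta, k) @ conjugate(m_beta, h))

# g_beta acts on the new coordinate xi = m_beta . z exactly as g acts on z.
z = 0.3 + 0.1j
intertwine_ok = np.isclose(moebius(conjugate(m_beta, k), moebius(m_beta, z)),
                           moebius(m_beta, moebius(k, z)))

print(product_ok, intertwine_ok)
```

Any invertible `m_beta` would pass both checks, which is the point of the argument: the result depends only on conjugation being a group automorphism, not on the particular asymmetry parameters.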
To this end, given the image intensity function $f : D \to \mathbb{R}$ in the standard conformal camera, we define the image intensity function on the image plane of the asymmetric camera by treating the transformation $m_\beta$ as a coordinate transformation. Then, the conjugate $g_\beta = m_\beta g m_\beta^{-1}$ has the form corresponding to a change of basis in linear algebra. To see this, let a linear map be represented by the matrices $M$ and $N$ in two different bases. Then $N = PMP^{-1}$, where $P$ is the change-of-basis matrix.
Tilting and translating the image plane therefore does not affect the conformal camera's geometric and computational frameworks; this can be phrased as putting 'conformal glasses' on the camera.

Discussion: Modeling Empirical Horopters
Two points, each in one of the two eyes' retinas, are considered corresponding if they give the same perceived visual direction.The circular shape of the geometric horopter is the consequence of a simple geometric assumption on corresponding points: two retinal points, onto which a non-fixated point in space is projected through each of the two nodal points, are corresponding if the angles subtended at the two eyes with fixation lines are equal.
If one relaxes this assumption by assuming that corresponding points in the temporal direction from the center of the fovea towards the periphery are compressed as compared with corresponding points in the nasal direction, the geometric horizontal horopter curves are no longer circular, see Figure 2.16 in [11].
The asymmetric conformal camera is defined by rotating the image plane by the angle $\beta$ about the eye's vertical axis, which is perpendicular to the horizontal visual plane, and translating the plane origin by $z_0$ in the temporal direction (cf. Figure 4). In doing this, we demand that the angle between the fixation axis passing through the nodal point and the point $z_0$, and the optical axis passing through the nodal point and the spherical eyeball's center of rotation, is the angle $\alpha = 5.2^\circ$. Here $z_0$ is the stereographic projection of the fovea center $f$. Now a simple geometric fact follows: when equally spaced points on the rotated image plane are projected into the sphere with the nodal point $N_L$ as the center of projection, their images on the sphere are compressed in the temporal direction from $f$, as compared with the nasal direction.
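The compression asymmetry described above can be illustrated in the horizontal cross-section. The sketch below rests on assumptions that are not taken from Figure 4: a unit-radius retina, $\alpha$ and $\beta$ measured in the same sense with $\alpha > \beta$, and the positive plane direction pointing from the foot $y_\beta$ of the perpendicular toward $z_0$ and beyond. The function name and the sample offsets are hypothetical.

```python
import math

def retinal_arc_from_fovea(xi, alpha, beta):
    """Signed arc length on a unit-radius retina between the fovea and the
    image of the tilted-plane point at offset xi from z_0 (angles in radians)."""
    y_beta = math.sin(beta)                       # foot of the perpendicular from N
    z_0 = y_beta + math.cos(beta) * math.tan(alpha - beta)
    t = z_0 + xi                                  # plane coordinate of the point
    # Angle at the nodal point between the optical axis and the ray through t.
    theta = beta + math.atan((t - y_beta) / math.cos(beta))
    # Inscribed angle theorem: the ray at angle theta meets the circle at
    # central angle 2 * theta measured from the posterior pole.
    return 2.0 * (theta - alpha)

alpha, beta = math.radians(5.2), math.radians(3.3)
offsets = (0.1, 0.2, 0.4)
for xi in offsets:
    away = abs(retinal_arc_from_fovea(+xi, alpha, beta))    # away from the optical axis
    toward = abs(retinal_arc_from_fovea(-xi, alpha, beta))  # back across the optical axis
    print(f"xi = {xi:.1f}: arc away = {away:.4f}, arc toward = {toward:.4f}")
```

With these conventions, equal plane offsets on the side of the fovea that points away from the optical axis map to shorter retinal arcs than the same offsets on the opposite side. Identifying the shorter side with the temporal direction depends on the orientation conventions of Figure 4, which this sketch only assumes.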
In this section, we study the horizontal horopter curves of the stereo system with asymmetric conformal cameras by back projecting pairs of corresponding points, taken from uniformly distributed points on the cameras' image planes, to the object space. From [19], we expected that the horopters would be conics. To test this, we chose six points for each fixation: five points to determine a unique conic curve and a sixth point to verify that this conic is indeed the horopter. One of these six points is the fixation point, which is always on the horopter. Further, the two nodal points and the point that does not project to the image plane, projecting instead to $\infty$, are taken as points on the horopter, except when the horopter is a straight line. When the horopter is a straight line, we need only three points.
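The five-plus-one point test described above can be sketched numerically. The simulation itself was done in GeoGebra, so the code below is only an illustrative alternative: it fits the general conic $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$ through five points as the null vector of a $5 \times 6$ design matrix, checks a sixth point by its residual, and classifies the curve by the sign of $B^2 - 4AC$. The function names and the sample points are hypothetical, not the back-projected horopter points.

```python
import numpy as np

def conic_through(points):
    """Coefficients (A, B, C, D, E, F), unit norm, of the conic through 5 points."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    design = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    # The conic coefficients span the null space of the 5x6 design matrix;
    # the last right-singular vector is that null vector.
    _, _, vh = np.linalg.svd(design)
    coeffs = vh[-1]
    return coeffs / np.linalg.norm(coeffs)

def residual(coeffs, point):
    """Value of the conic polynomial at a point; near zero if the point is on it."""
    x, y = point
    return float(coeffs @ np.array([x * x, x * y, y * y, x, y, 1.0]))

def conic_type(coeffs, tol=1e-9):
    """Classify by the discriminant B^2 - 4AC (degenerate cases not separated)."""
    A, B, C = coeffs[:3]
    disc = B * B - 4.0 * A * C
    if disc < -tol:
        return "ellipse"
    if disc > tol:
        return "hyperbola"
    return "parabola or degenerate"

# Hypothetical sample data: five points on x^2/4 + y^2 = 1, plus a sixth to verify.
r = np.sqrt(2.0)
ellipse_five = [(2, 0), (-2, 0), (0, 1), (0, -1), (r, r / 2)]
ellipse_sixth = (-r, r / 2)
ellipse = conic_through(ellipse_five)
print(conic_type(ellipse), abs(residual(ellipse, ellipse_sixth)) < 1e-9)

# Five points on x y = 1 (both branches), plus a sixth to verify.
hyperbola_five = [(1, 1), (2, 0.5), (0.5, 2), (-1, -1), (-2, -0.5)]
hyperbola_sixth = (4, 0.25)
hyperbola = conic_through(hyperbola_five)
print(conic_type(hyperbola), abs(residual(hyperbola, hyperbola_sixth)) < 1e-9)
```

For coordinates of the order of 100 mm, as in the simulation, it may help to center and rescale the points before the fit, since the squared terms otherwise dominate the design matrix. The straight-line horopter is a degenerate case that this classification does not separate out.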
To do this, we fix the camera's parameters as follows. The abathic distance is the fixation point's distance when the horopter is a straight line, see Figure 3. This occurs when the fixation point is such that the image planes of the two eye models are perpendicular to the head axis. This orientation of the eyes, with the fixation point at the abathic distance, is shown in Figure 4 if this figure is completed by adding the right eye as the reflection of the left eye about the head axis $X_3$. Then, the abathic distance $d_\beta$ for the eye radius of 1 cm, with given $\alpha$, $\beta$ and the ocular separation $a$, can be easily expressed in closed form, Equation (20). We use in (20) the physiologically accurate values of $\alpha = 5.2^\circ$ and $a = 6.5$ cm. Then, assuming values of $\beta$ in the range $-0.1^\circ \le \beta \le 4.2^\circ$, we find, using (20), that the abathic distance lies in the range $35\ \text{cm} \le d_\beta \le 190\ \text{cm}$ observed in humans, with an average value of $d_{3.3^\circ} = 100$ cm. The assumed values of the angle $\beta$ are in the range of the crystalline lens tilt angle, as measured in healthy human eyes [6].
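The order of magnitude of these distances can be checked with a leading-order estimate. The sketch below is not Equation (20): it assumes that, with the image planes perpendicular to the head axis, each visual axis makes the angle $\alpha - \beta$ with the straight-ahead direction, and it neglects the 1 cm offset between the nodal point and the eye's rotation center, giving $d \approx a / (2\tan(\alpha - \beta))$. The function name is hypothetical.

```python
import math

def abathic_distance_approx(alpha_deg, beta_deg, a_cm):
    """Leading-order estimate d ~ a / (2 tan(alpha - beta)), in cm.

    Assumption: the visual axes converge symmetrically at the angle
    (alpha - beta) to the head axis; nodal-point offsets are ignored.
    """
    delta = math.radians(alpha_deg - beta_deg)
    if delta <= 0.0:
        raise ValueError("this estimate requires alpha > beta")
    return a_cm / (2.0 * math.tan(delta))

alpha_deg, a_cm = 5.2, 6.5
for beta_deg in (-0.1, 3.3, 4.2):
    d = abathic_distance_approx(alpha_deg, beta_deg, a_cm)
    print(f"beta = {beta_deg:+.1f} deg -> d ~ {d:.1f} cm")
```

This estimate gives roughly 35 cm, 98 cm and 186 cm for $\beta = -0.1^\circ$, $3.3^\circ$ and $4.2^\circ$, close to the quoted 35 cm, 100 cm and 190 cm; the remaining differences would be expected from the terms the approximation drops.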
Figure 5. The graphs of the three horizontal horopters: the ellipse for fixation $F_1(163.7, 79.5)$, the straight line for fixation $F(126.8, 161.4)$, and the hyperbola for fixation $F_2(53.4, 211.5)$, with coordinates given in millimeters. The eye radius is 7.9 mm and the interocular distance is 78.0 mm. For each of the fixations $F_i$, $i = 1, 2$, the six points were obtained by back projecting the corresponding points. These six points included the fixation point, the two nodal points, the point $Pinfty_i$ that projects to $\infty$, and two additional points, $P1_i$ and $P2_i$. Five of these points were used to obtain the conics and the sixth point was used for the verification. The straight-line horopter is for the fixation point at the abathic distance and is given by three points, $F$, $P_1$ and $P_2$.
In the simulation with GeoGebra, we use parameter values different from those of the human binocular system. In order to display horopters of three different shapes (ellipse, straight line and hyperbola) in one graphical window, we take an eye radius of 7.9 mm and an interocular distance of 78 mm. The graphs of the horopters obtained in GeoGebra are shown in Figure 5. The fixation points are given in the caption of this figure.
We note that the hyperbola has two branches, one passing through the fixation point, and the other passing through the nodal points. GeoGebra also computed the equations of the conics.

Conclusions
The first part of the paper reviewed the conformal camera's geometric analysis developed by the author in the group-theoretic framework. We identified the semisimple group SL(2, C) as the group of image transformations during the conformal camera's gaze rotations. This group is the double cover of PSL(2, C), the group that gives both the complex structure on the Riemann sphere and the one-dimensional complex geometry. This duality synthesizes the analytic and geometric structures.
The representation theory of semisimple groups, one of the greatest achievements of 20th-century mathematics, allows the conformal camera to possess its own Fourier analysis, a direction in the representation theory of semisimple Lie groups. The projective Fourier transform was constructed by restricting Fourier analysis on the group SL(2, C) to the image plane of the conformal camera. The image representation in terms of the discrete projective Fourier transform is computable by a fast Fourier transform (FFT) algorithm in the log-polar coordinates that approximate the retino-cortical maps of the visual pathways. This means that the projective Fourier transform is well adapted both to the image transformations produced by the conformal camera's gaze change and to the correspondingly updated log-polar maps. These maps permit efficient image processing with the FFT.
The first part of the paper concluded with a discussion of the conformal camera's relevance to the computational aspects of anthropomorphic vision. First, we discussed the relevance of imaging with the conformal camera to early and intermediate-level vision. Then, we compared the conformal camera model of retinotopy with the accepted Schwartz model and pointed out the conformal camera's advantages in biologically-mediated image processing. Finally, we discussed the numerical implementation of the discrete projective Fourier transform. In this implementation, the log-polar image contained 100 times fewer pixels than the original image, comparable to the ratio of the 125 million photoreceptors sampling incoming visual information to the 1.5 million ganglion cell axons carrying the output from the eye to the brain.
The model with the conformal camera was developed to process visual information in an anthropomorphic camera head mounted on a moving platform replicating human eye movements. It was demonstrated in the author's previous studies that this model is capable of supporting the stability of foveate vision when the environment is explored with about four saccadic eye movements per second and when the eye executes smooth pursuit eye movements. Previously, this model only considered aspects of monocular vision.
In the second part of the paper, binocular vision was reviewed and the stereo extension of the conformal camera's group-theoretic framework was presented. We did this for the eye model that includes the asymmetrically displaced fovea on the retina and the tilted and decentred natural crystalline lens. We concluded this part by showing, with a numerical simulation, that the resultant horopters are conics that approximate the empirical horopters well.
The geometry of the conic horopters in the stereo system with asymmetric conformal cameras requires further study. In the near future, the spatial orientation and shape of the conic curves need to be derived in terms of the perceived direction and the parameters of asymmetry. This will allow the development of disparity maps for the stereo system with asymmetric conformal cameras.

Figure 1. The conformal camera as the eye model. The points in the object space are centrally projected into the sphere. The sphere and the center of projection $N$ represent the eye's retina and nodal point. The 'image' entity is given by stereographic projection $\sigma$ from the sphere to the plane $\mathbb{C}$. Because $\sigma$ is conformal and maps circles to circles (see the text), it preserves the 'retinal illuminance', that is, the pixels. This image representation is appropriate for efficient computational processing.

Figure 2. (a) When the camera gaze is rotated by $\varphi$, the image projective transformation is given here in the rotated image plane by the $g$-transformation that results from the composition of two basic image transformations. The first involves the image that is translated by the vector $b$ and projected back into the image plane. The second transformation is given in terms of the image projected into the sphere, rotated by $-2\varphi$, and projected back into the image plane. The image transformation adds the conformal distortions, schematically shown by the transformed image's back projection into the plane containing the planar object. (b) The camera and scene are shown in the view seen when looking from above. Here the sequence of transformations $q \to z \to q_1 \to q_2 \to q_3 \to z' = g \cdot z$ explains the image projective transformation.
The functions $\Pi_{k,s}(z) = |z|^{is} \left( \frac{z}{|z|} \right)^{k}$ in the projective Fourier transform (8) play the role of exponentials in the standard Fourier transform. In the language of group representation theory, the one-dimensional representations $\Pi_{k,s}(z)$ are the only unitary representations of the Borel subgroup $B = MAN$ of SL(2, C), where $M = \left\{ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \right\}$ and $\overline{N}$ denotes the transpose of $N$. In contrast, all of the nontrivial irreducible unitary representations of SL(2, C) are infinite-dimensional. Now, the group $B$ 'exhausts' the projective group SL(2, C) by the Gauss decomposition $SL(2, \mathbb{C}) \doteq \overline{N}B$.

Figure 3. Empirical longitudinal horopters are shown schematically for symmetric convergence points.Abathic distance is defined here as the distance from the line connecting the eye centers to the fixation point F at which the horopter is a straight line.

Figure 4. The eye model with an asymmetrically displaced fovea and a tilted and decentred lens is outlined with thin (1pt) curves. The asymmetric conformal camera is outlined with thick (2pt) curves. The stereo system is obtained if the left camera is reflected in the head axis $X_3$. Each camera's image plane, shown here for the left eye, is perpendicular to the head axis such that the horopter curve is a straight line passing through the fixation point $F$ at the abathic distance from the origin $O$.