Canonical coordinates for retino-cortical magnification

Introduction
The visual system of humans (and other mammalian species) interfaces the optical world via a sensorium (or visual front-end, the internal embodiment of the visual field) that is characterized by receptive fields, i.e., light sensitive cells, of various sizes and aperture profiles. Scale space theory provides a foundation for a rigorous taxonomy and functional interpretation of visual receptive fields as (exact) non-infinitesimal differential operators. The conjecture that the brain can be seen as a "geometry engine" has been introduced by Koenderink in his seminal papers and has been amply exploited in the image analysis literature [1][2][3][4][5][6][7][8][9][10][11]. The keyword is resolution (or its inverse, scale).
Conventional linear scale space theory, however, typically presumes (scale-wise) uniform sampling, ignoring the foveal properties and real-time dynamics typical of biological visual systems. For the temporal aspects, the reader is again referred to Koenderink for an interesting proposition [12]. The present article concerns the static foveation aspect of the visual front-end, de-emphasizing specific characteristics of visual receptive fields other than the size and spatial arrangement of their effective domains of support.
The visual system of humans shows a roughly linear decrease of visual acuity with eccentricity, is more or less rotationally invariant relative to the foveal point and exhibits a large degree of invariance to object size. Ample empirical evidence supports these claims, including quantitative studies of retino-cortical magnification [13]. Among others, this phenomenon is responsible for the fact that, in the case of humans, about half of the striate cortex is devoted to a foveal region covering only one percent of the visual field. The log-polar mapping arises naturally in simplified models of foveal systems endowed with the aforementioned properties, cf. [14]. Image processing algorithms, as well as space-variant CMOS cameras [15,16], have been constructed to mimic this mapping. These dimensionality-reducing sampling devices have turned out to be useful in computer vision, as they help to optimize visual tasks, such as time-to-impact calculations.
However, no previously proposed theory seems to provide a principled account of the spatial organization of the entire retina, including the fovea centralis (the central, avascular zone of the retina with maximal acuity), where the log-polar model breaks down, due to its physically void singularity. This shortcoming is reflected in the design of log-polar mapping algorithms and space-variant cameras, in which one typically employs some heuristics to handle the transition between periphery and central retina.
In this article, based on previous workshop papers [14,17], a model is presented that overcomes the difficulties of the log-polar paradigm. It is based on the same invariance principles, viz. global rotation invariance (with respect to the foveal point) and scale invariance, but explicitly incorporates a physical resolution limitation. The model admits so-called canonical coordinates, inducing a natural discrete grid for operational representation in software, hardware or biological "wetware". Its biological merits will be discussed by comparison with known facts on human vision. To guide the reader not familiar with the biological terminology used in this article, a glossary of relevant biological terms has been appended.
For simplicity, stereopsis will be ignored. The retina will be modeled as a flat disk, representing the domain of definition of the full visual field. This field comprises two hemifields, each of which projects onto a (contralateral) hemiretina, in turn projecting retinotopically to the primary visual (or striate) cortex through the optic radiation via the subcortical lateral geniculate nucleus. Thus, each lateral part of the visual cortex reflects the contralateral visual hemifield [13], cf. the sketch in Figure 1.

Modeling the Sensorium
Consider the basic scale-invariant, non-exact one-forms (the barred symbol, đ, is used to stress the anholonomic nature of the one-forms):

$$đ\xi = \frac{dx}{\sqrt{x^2 + y^2 + a^2}} \quad \text{and} \quad đ\eta = \frac{dy}{\sqrt{x^2 + y^2 + a^2}}, \tag{1}$$

in which $(x, y) \in \mathbb{R}^2$ are Cartesian coordinates. From an operational point of view, one could regard these as locally adaptive measurement rods, with the help of which one may assign a numeric value to a vector depending on the base point to which it is attached. More precisely, if $v(x, y) = v^x(x, y)\, e_x + v^y(x, y)\, e_y$ is such a vector, "living" at base point $(x, y)$ and expressed relative to globally defined Cartesian unit vectors $e_x$ and $e_y$, then:

$$đ\xi(v) = \frac{v^x(x, y)}{\sqrt{x^2 + y^2 + a^2}} \quad \text{and} \quad đ\eta(v) = \frac{v^y(x, y)}{\sqrt{x^2 + y^2 + a^2}}. \tag{2}$$
These numeric values could be interpreted to represent the "visual significance" of the respective Cartesian vector components depending on the position in the visual field. Clearly, in order to be equally significant, the components of a vector at some peripheral location (large $\sqrt{x^2 + y^2 + a^2}$) will need to be larger than those of a more central one (small $\sqrt{x^2 + y^2 + a^2}$). The reason for this is to geometrically express the roughly linear increase of typical receptive field size (with a concomitant linear decrease of spatial resolving power) as a function of eccentricity $\sqrt{x^2 + y^2}$. The (small) physical size parameter $a > 0$ is needed to avoid a non-physical singularity at the center. Its visual significance will become apparent below.
We will confine the region of interest to a disk of radius $R$, which represents the radius of the geometric retina ($\sqrt{x^2 + y^2} \le R$). The parameter, $a$, represents a transient radius separating the geometric foveola ($\sqrt{x^2 + y^2} \le a$) from its periphery ($\sqrt{x^2 + y^2} > a$). The geometric foveola is a construct of our model, to be distinguished from the biological foveola in the mammalian retina; the terminology betrays a modest amount of foresight.
The one-forms of Equation (1) induce a scale-invariant area two-form, geometrically representing a spatially weighted area of support (regardless of shape) of a Euclidean sensory element, $dx \wedge dy$, at position $(x, y)$ relative to the foveal center:

$$đ\xi \wedge đ\eta = \sqrt{g}\; dx \wedge dy, \tag{3}$$

in which:

$$\sqrt{g} = \frac{1}{x^2 + y^2 + a^2} \tag{4}$$

is the square root of the metric determinant associated with the two-dimensional spatial metric:

$$G = đ\xi \otimes đ\xi + đ\eta \otimes đ\eta = \frac{dx \otimes dx + dy \otimes dy}{x^2 + y^2 + a^2}. \tag{5}$$

Again, the number in Equation (4) reflects the fact that a peripheral receptive field will need to have a larger area of support in order to be treated on par with a similar one closer to the foveal center.
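The measurement-rod interpretation can be made concrete with a few lines of code. The following is a minimal sketch, assuming that the one-forms of Equation (1) are the Cartesian differentials divided by $\sqrt{x^2 + y^2 + a^2}$ and that the area weight of Equation (4) is $1/(x^2 + y^2 + a^2)$; the function names are illustrative only.

```python
import math


def measure_vector(vx, vy, x, y, a):
    """Apply the two one-forms to the vector (vx, vy) attached at (x, y).

    Assumption: both one-forms divide the Cartesian component by
    sqrt(x^2 + y^2 + a^2), with a > 0 the regularizing length scale.
    """
    scale = math.sqrt(x * x + y * y + a * a)
    return vx / scale, vy / scale


def area_weight(x, y, a):
    """Weight of a Euclidean area element dx ^ dy at (x, y) (assumed form)."""
    return 1.0 / (x * x + y * y + a * a)


if __name__ == "__main__":
    # The same Cartesian vector counts for less in the periphery.
    print(measure_vector(1.0, 0.0, 0.0, 0.0, 0.2))
    print(measure_vector(1.0, 0.0, 10.0, 0.0, 0.2))
```

Rescaling the vector, its base point and the parameter $a$ by a common factor leaves the returned values unchanged, which is the scale invariance referred to in the text.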
As an aside, recall that every two-dimensional Riemannian manifold is conformally flat and that the Ricci curvature tensor, Ric, is always proportional to the metric tensor, viz.:

$$R_{\mu\nu} = \tfrac{1}{2} R\, g_{\mu\nu}, \tag{6}$$

in which $R_{\mu\nu}$ and $g_{\mu\nu}$ are the components of Ric and $G$, respectively. For the case at hand, Equation (5), the space-variant Ricci scalar (equal to twice the Gaussian curvature) equals:

$$R = \frac{4a^2}{x^2 + y^2 + a^2}. \tag{7}$$

This curvature scalar assumes appreciable values within the fovea centralis. The point of departure in previous work [14] was based on the assumption that $R = 0$ for all $(x, y) \neq (0, 0)$. Combined with the aforementioned symmetries and applied to a conformally flat metric, this requirement admits a family of metrics, of which Equation (5) is a particular member, only if $a = 0$, leaving a spurious singularity at the origin. It is clearly not compatible with our regularized metric with $a > 0$.
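The curvature of the regularized metric can be checked numerically. The sketch below assumes the conformally flat form $G = e^{2\omega}(dx \otimes dx + dy \otimes dy)$ with $\omega = -\tfrac{1}{2}\ln(x^2 + y^2 + a^2)$, uses the standard two-dimensional identity $R = -2 e^{-2\omega} \Delta\omega$ (sign convention in which the unit sphere has $R = 2$), and compares a finite-difference Laplacian with the closed form $4a^2/(x^2 + y^2 + a^2)$. The step size `h` and the function names are choices made here for illustration.

```python
import math


def omega(x, y, a):
    """Conformal exponent of the assumed metric exp(2*omega) * (dx^2 + dy^2)."""
    return -0.5 * math.log(x * x + y * y + a * a)


def ricci_scalar_numeric(x, y, a, h=1e-3):
    """R = -2 * exp(-2*omega) * Laplacian(omega), five-point stencil."""
    laplacian = (
        omega(x + h, y, a) + omega(x - h, y, a)
        + omega(x, y + h, a) + omega(x, y - h, a)
        - 4.0 * omega(x, y, a)
    ) / (h * h)
    # exp(-2*omega) equals x^2 + y^2 + a^2 for this metric.
    return -2.0 * (x * x + y * y + a * a) * laplacian


def ricci_scalar_closed_form(x, y, a):
    """Closed form obtained by differentiating omega analytically."""
    return 4.0 * a * a / (x * x + y * y + a * a)


if __name__ == "__main__":
    for point in [(0.0, 0.0), (0.5, 0.2), (3.0, 4.0)]:
        print(point,
              ricci_scalar_numeric(*point, 1.0),
              ricci_scalar_closed_form(*point, 1.0))
```

In this form the curvature falls off as $4a^2/\rho^2$ in the periphery and vanishes identically away from the origin only in the limit $a \to 0$, in line with the remark on the earlier log-polar model.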

Modeling Retino-Cortical Magnification
Consider the area (measured by the Riemannian metric of Equation (5)) of an infinitesimally narrow ring $\delta\Gamma : \rho < \sqrt{x^2 + y^2} < \rho + \delta\rho$ around the fovea:

$$\delta V = \int_{\delta\Gamma} đ\xi \wedge đ\eta = P(\rho)\, \delta\rho. \tag{8}$$

Geometrically, the quantity:

$$P(\rho) = \frac{2\pi\rho}{\rho^2 + a^2} \tag{9}$$

measures the (Riemannian) perimeter of the circle of radius $\rho$ around the fovea. It is easy to see that there exists a maximal circle, with perimeter $P_+ = P(a)$, demarcating the transition $\rho_+ = a$ between the geometric foveola and periphery (whether this transient circle has a distinguished functional role in mammalian vision is not clear, but anatomical evidence does show a fairly sharp demarcation of the biological foveola [13]).
If we normalize $V(\rho)$, such that $V(0) = 0$, and introduce the dimensionless quantities:

$$t = \frac{\rho}{a} \tag{10}$$

and:

$$v(t, T) = \frac{V(a t)}{V(a T)}, \tag{11}$$

with $0 \le t \le T$, and:

$$T = \frac{R}{a}, \tag{12}$$

then:

$$v(t, T) = \frac{\ln(1 + t^2)}{\ln(1 + T^2)}. \tag{13}$$

This integrated retino-cortical magnification measures the relative capacity dedicated to the central region inside a foveal disk of radius $\rho = a t$ relative to that of the full retina. Limiting cases are clearly $v(0, T) = 0$ and $v(T, T) = 1$. For the foveola, we have $v(t_+ \overset{\text{def}}{=} 1, T) = \ln 2 / \ln(1 + T^2)$; cf. Figure 2 for an illustration.
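As a quick numerical illustration, the following sketch evaluates the integrated magnification and its derivative with respect to $t$. It assumes the closed form $v(t, T) = \ln(1 + t^2)/\ln(1 + T^2)$, which is consistent with the limiting cases and the foveolar value $\ln 2/\ln(1 + T^2)$ quoted above; the function names are illustrative.

```python
import math


def integrated_magnification(t, T):
    """v(t, T): fraction of capacity devoted to the disk of radius a*t."""
    return math.log1p(t * t) / math.log1p(T * T)


def magnification_density(t, T):
    """Derivative of v(t, T) with respect to t; it peaks at t = 1."""
    return 2.0 * t / ((1.0 + t * t) * math.log1p(T * T))


def foveola_fraction(T):
    """Capacity fraction of the geometric foveola, v(1, T)."""
    return integrated_magnification(1.0, T)


if __name__ == "__main__":
    T = 95.0  # value used for Figure 2
    print("foveola fraction:", foveola_fraction(T))
    print("density at t = 0.5, 1, 2:",
          [magnification_density(t, T) for t in (0.5, 1.0, 2.0)])
```

For $T = 95$ the geometric foveola accounts for roughly 7.6% of the total capacity, although it covers only a fraction $1/T^2$ of the Euclidean retinal area.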
To verify the biological plausibility of our model, consider the case of a peripheral ring with the same processing capacity as the enclosed foveal disk, i.e., Equation (13) with equipartitioning radius $\rho_{1/2} = a\, t_{1/2}$ defined, such that:

$$v(t_{1/2}, T) = \tfrac{1}{2}. \tag{14}$$

A straightforward computation yields:

$$t_{1/2} = \sqrt{\sqrt{1 + T^2} - 1} \approx \sqrt{T}, \tag{15}$$

in which the approximation reflects the phenomenological case, $T \gg 1$. In other words, under this assumption, the theoretical equipartitioning radius approximately equals the geometric mean of the radii of the geometric foveola and geometric retina:

$$\rho_{1/2} \approx \sqrt{a R}. \tag{16}$$

Generalizing Equations (14)-(16), we may define the $\alpha$-partitioning radii $\rho_\alpha = a\, t_\alpha$, via:

$$v(t_\alpha, T) = \alpha, \tag{17}$$

yielding (the approximations hold for not too small $\alpha$, i.e., well outside the geometric foveola):

$$t_\alpha = \sqrt{(1 + T^2)^\alpha - 1} \approx T^\alpha, \tag{18}$$

or, in terms of physical length scales,

$$\rho_\alpha \approx a^{1 - \alpha} R^\alpha. \tag{19}$$

This prediction likewise admits experimental verification once consistent values of $a$ and $R$ are at hand.
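The partitioning radii follow from inverting the integrated magnification. A minimal sketch, assuming $v(t, T) = \ln(1 + t^2)/\ln(1 + T^2)$, so that $v(t_\alpha, T) = \alpha$ gives $t_\alpha = \sqrt{(1 + T^2)^\alpha - 1}$ with $T^\alpha$ as the large-$T$ approximation (function names are illustrative):

```python
import math


def integrated_magnification(t, T):
    """v(t, T) = ln(1 + t^2) / ln(1 + T^2) (assumed closed form)."""
    return math.log1p(t * t) / math.log1p(T * T)


def partition_parameter(alpha, T):
    """Exact dimensionless radius t_alpha solving v(t_alpha, T) = alpha."""
    return math.sqrt(math.expm1(alpha * math.log1p(T * T)))


def partition_parameter_approx(alpha, T):
    """Approximation t_alpha ~ T**alpha, for T >> 1 and alpha not too small."""
    return T ** alpha


def partition_radius_approx(alpha, a, R):
    """Approximate physical radius rho_alpha ~ a**(1 - alpha) * R**alpha."""
    return a ** (1.0 - alpha) * R ** alpha


if __name__ == "__main__":
    T = 95.0
    for alpha in (0.25, 0.5, 0.75):
        print(alpha,
              partition_parameter(alpha, T),
              partition_parameter_approx(alpha, T))
```

For $\alpha = 1/2$ the approximate radius reduces to the geometric mean $\sqrt{aR}$; for $T = 95$ the exact and approximate values of $t_{1/2}$ differ by well under one percent.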
Vice versa, we may use Equations (16)-(19) to define the constant, $a$, given empirical data for $R$ and, say, $\rho_{1/2}$. For instance, in the case of humans, it is known that about half of the striate cortex is devoted to the portion of the retina that lies within 7° to 8° of the fovea [13]. Assuming that the striate cortex has a homogeneous distribution of similar visual cells, the fractional size within it occupied by the retinotopically mapped central portion of the visual field will be tantamount to its fractional processing capacity. A typical retina measures [18] $R \approx 21$ mm; 1° of visual angle corresponds to approximately 288 µm. The monocular visual field covers approximately 160° × 175° (width × height [19]), or roughly (85 ± 5)° in eccentricity when approximated by an isotropic figure for our purposes (Hartridge [20] reports a functional limit on visual field eccentricity of 104°). With these figures, we have (for the biological counterparts of the quantities involved) $t_{1/2}/T = \rho_{1/2}/R \approx 0.1024$, from which one deduces with the help of Equations (12) and (15) that $a = (t_{1/2}/T)^2 R \approx 0.22$ mm. The value $T = (t_{1/2}/T)^{-2} \approx 95$ used in Figure 2 justifies our assumption, $T \gg 1$, and the predicted size of the geometric foveola happens to agree remarkably well with that of the human foveola (Rodieck ([13], Chapter 9) reports a value of 0.20 mm), which gives us a biological interpretation of the constant, $a$, and, at the same time, justifies the name "geometric foveola"; cf. Figure 3.
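The estimate of $a$ can be reproduced in a few lines. The sketch below assumes $T = R/a$ and $t_{1/2} = \sqrt{\sqrt{1 + T^2} - 1} \approx \sqrt{T}$. The approximate route uses $t_{1/2}/T \approx T^{-1/2}$; the exact route uses the identity $(t_{1/2}/T)^2 = 1/(\sqrt{1 + T^2} + 1)$, which follows from the same formula. The input numbers are those quoted in the text; the function names are illustrative.

```python
import math


def foveola_radius_approx(R, rho_half):
    """Large-T estimate: T ~ (R / rho_half)^2 and a = R / T."""
    T = (R / rho_half) ** 2
    return R / T, T


def foveola_radius_exact(R, rho_half):
    """Exact inversion of t_half / T = rho_half / R.

    With u = sqrt(1 + T^2), one has (t_half / T)^2 = 1 / (u + 1).
    Requires rho_half / R < 1 / sqrt(2).
    """
    ratio_sq = (rho_half / R) ** 2
    u = 1.0 / ratio_sq - 1.0
    T = math.sqrt(u * u - 1.0)
    return R / T, T


if __name__ == "__main__":
    R = 21.0                # retinal radius in mm, as quoted in the text
    rho_half = 0.1024 * R   # equipartitioning radius in mm
    print("approximate (a, T):", foveola_radius_approx(R, rho_half))
    print("exact       (a, T):", foveola_radius_exact(R, rho_half))
```

Both routes give $a \approx 0.22$ mm, with $T$ between 94 and 96, so the approximation $T \gg 1$ has little effect on the estimate.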
In the next section, we consider the modification of the log-polar map by taking into account the physical resolution limitation of the fovea centralis and show how the log-polar map emerges asymptotically.

Canonical Coordinates
Retino-cortical magnification can be conveniently described in terms of canonical coordinates (recall that the primary visual cortex on one side of the brain represents a hemifield, hence the bounds on $\phi$):

$$p = \operatorname{arcsinh} t, \qquad q = \phi \tanh p, \qquad 0 \le t \le T, \quad -\frac{\pi}{2} \le \phi \le \frac{\pi}{2}, \tag{20}$$

in which $(t, \phi)$ are dimensionless radial and azimuthal coordinates, $x = a\, t \cos\phi$ and $y = a\, t \sin\phi$. Using these canonical coordinates, it follows that the basic area two-form (recall Equation (3)) can be expressed as the wedge product of holonomic one-forms, warranting the attribute "canonical":

$$đ\xi \wedge đ\eta = dp \wedge dq. \tag{21}$$

Note that near the foveal point ($t \ll 1$), we have:

$$p \approx t, \qquad q \approx t \phi. \tag{22}$$

In the periphery ($t \gg 1$), we reobtain the familiar log-polar coordinates (up to an irrelevant offset):

$$p \approx \ln t + \ln 2, \qquad q \approx \phi. \tag{23}$$

The physical part of the $(p, q)$-domain is reminiscent of the actually observed shape of the cortical surface of V1; compare Figures 3 and 4.
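The canonical map and its inverse are straightforward to implement. The following sketch assumes the form $p = \operatorname{arcsinh} t$, $q = \phi \tanh p$ with $t = \sqrt{x^2 + y^2}/a$ and $\phi$ the azimuth, which is consistent with the domain boundaries $q = \pm\frac{\pi}{2} \tanh p$ and $p = \operatorname{arcsinh} T$ quoted in the caption of Figure 4. Function names are illustrative, and the input is restricted to one hemifield ($x \ge 0$).

```python
import math


def to_canonical(x, y, a):
    """Map a retinal point (x, y), x >= 0, to canonical coordinates (p, q)."""
    t = math.hypot(x, y) / a
    phi = math.atan2(y, x)      # in [-pi/2, pi/2] for the hemifield x >= 0
    p = math.asinh(t)
    q = phi * math.tanh(p)
    return p, q


def from_canonical(p, q, a):
    """Inverse map from (p, q) back to retinal coordinates (x, y)."""
    if p == 0.0:
        return 0.0, 0.0         # the foveal point; the azimuth is undefined
    t = math.sinh(p)
    phi = q / math.tanh(p)
    return a * t * math.cos(phi), a * t * math.sin(phi)


if __name__ == "__main__":
    a = 0.22  # mm, illustrative value of the geometric foveola radius
    p, q = to_canonical(0.3, -0.4, a)
    print((p, q), from_canonical(p, q, a))
    # Far in the periphery p approaches ln(2 t), i.e. log-polar up to an offset.
    print(math.asinh(1000.0), math.log(2000.0))
```

The round trip recovers the original point, and for large $t$ the radial coordinate $p$ differs from $\ln t$ only by the constant offset $\ln 2$.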
Expressed in canonical coordinates, the retino-cortical metric, Equation (5), looks rather more cumbersome:

$$G = \left( 1 + \frac{4 q^2}{\sinh^2 2p} \right) dp \otimes dp - \frac{2q}{\sinh 2p} \left( dp \otimes dq + dq \otimes dp \right) + dq \otimes dq. \tag{24}$$

It is evident that the $(p, q)$-coordinate lines do not intersect perpendicularly, unlike with log-polar coordinates; cf. Figure 4. For the angle of intersection, $\alpha$, at a fiducial point $(p, q)$, we have:

$$\cos\alpha = -\frac{2q}{\sqrt{\sinh^2 2p + 4 q^2}}. \tag{25}$$

For peripheral points, this is close to zero:

$$\cos\alpha \approx -4 q\, e^{-2p} \approx 0 \qquad (p \gg 1). \tag{26}$$

For central points, on the other hand, we have:

$$\cos\alpha \approx -\frac{\phi}{\sqrt{1 + \phi^2}} \qquad (p \ll 1), \tag{27}$$

independent of eccentricity, revealing the non-orthogonal intersection of canonical coordinate lines away from the horizon ($\phi = 0$). The deviation from orthogonality remains nonetheless fairly small almost everywhere, reaching its maximum at the foveal point, with an intersection angle still close to 60°. Typical measurements, such as Figure 3, lack resolving power to delineate foveal details in the most posterior part of the striate cortex and, thus, remain indecisive.

Let $\Gamma_T$ denote the full retinal domain. In $(p, q)$-space, this is the area in-between the graphs of:

$$q_{\pm\frac{\pi}{2}}(p) = \pm\frac{\pi}{2} \tanh p \tag{28}$$

and the lines $p = 0$ and $p = \operatorname{arcsinh} T$ (to see this, express $q$ as a function of $p$ and parameter $\phi$ using Equation (20), and consider the boundary values $\phi_\pm = \pm\pi/2$). By the same token, if $\Gamma_t$ denotes the fraction of the retinal domain on the left of the line $p = \operatorname{arcsinh} t$, then a straightforward computation yields:

$$\frac{\int_{\Gamma_t} dp \wedge dq}{\int_{\Gamma_T} dp \wedge dq} = \frac{\ln(1 + t^2)}{\ln(1 + T^2)}, \tag{29}$$

reproducing Equation (13) as expected. This confirms that the $(p, q)$-coordinates of Equation (20) are indeed more natural than the commonly used log-polar coordinates. The latter arise in the limit of vanishing $a$. As such, log-polar coordinates fail to describe both foveal, as well as transient behavior, and are suited only for the peripheral field. Although the periphery represents by far the largest part of the visual field, it is much less significant in the visual brain.

As a final remark, note that, by construction, $(p, q)$-space is most naturally discretized by a uniform sampling (with grid constants $\Delta p$ and $\Delta q$ of fixed aspect ratio). Biologically, one expects this to be reflected in a uniform spatial layout and the functional similarity of cortical receptive fields in the entire primary visual cortex.
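Uniform sampling of the canonical domain can be illustrated with a simple cell count. The sketch below assumes that the canonical domain is bounded by $|q| \le \frac{\pi}{2} \tanh p$ and $0 \le p \le \operatorname{arcsinh} T$, lays a square grid of cell centres over it, and checks that the fraction of cells to the left of $p = \operatorname{arcsinh} t$ approximates $\ln(1 + t^2)/\ln(1 + T^2)$. The grid step and the function names are arbitrary choices made for this example.

```python
import math


def canonical_grid(T, step):
    """Centres (p, q) of square grid cells lying inside the canonical domain."""
    p_max = math.asinh(T)
    n_p = int(p_max / step)
    n_q = int(math.ceil(0.5 * math.pi / step))
    cells = []
    for i in range(n_p):
        p = (i + 0.5) * step
        q_bound = 0.5 * math.pi * math.tanh(p)
        for j in range(-n_q, n_q):
            q = (j + 0.5) * step
            if abs(q) <= q_bound:
                cells.append((p, q))
    return cells


def capacity_fraction(cells, t):
    """Fraction of grid cells with p <= arcsinh(t)."""
    p_t = math.asinh(t)
    inside = sum(1 for p, _ in cells if p <= p_t)
    return inside / len(cells)


if __name__ == "__main__":
    T = 95.0
    cells = canonical_grid(T, step=0.02)
    t_half = math.sqrt(math.sqrt(1.0 + T * T) - 1.0)
    print(len(cells), capacity_fraction(cells, t_half))
```

With equal weight per cell, the cell count inside a given eccentricity plays the role of processing capacity, so roughly half of the cells lie within the equipartitioning radius although this region is a small part of the Euclidean retina.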

Conclusion and Discussion
We have established a biologically plausible geometric model for an isotropic, scale-invariant foveal system that incorporates physical resolution limitations. The model is naturally described in terms of a canonical coordinate map that generalizes the familiar log-polar map typically used in the context of foveal systems. Unlike the latter, however, the generalized map has a globally valid domain of definition and handles the transition from the peripheral field (the classical log-polar regime) to fovea centralis in a graceful manner.
The model is consistent with certain known facts of biological vision, notably retino-cortical magnification. Other quantitative predictions may be inferred from it to assess its biological merits and, perhaps, even to predict hitherto unexplored properties of biological foveal systems. Empirical verification, however, will require methods with sufficient resolving power to discriminate details within the foveola and its corresponding image in the striate cortex, so as to verify or refute the canonical paradigm stipulated in this article.
An important property that has not been accounted for in our model is the fact that at each spatial location in the retina or striate cortex, there exist ensembles of similar receptive fields differing only in size. The proposed model describes the inherent resolution limitation of the visual system by considering "typical" (e.g., minimal) size as a function of eccentricity. Geometric models that do take into account the local multiscale nature of the visual system have been proposed in the context of artificial and biological vision (cf. the Riemannian structure proposed by Eberly et al. [22] and, in a different application context, van Wijk and Nuij [23]; and the affine structure stipulated by Florack [24]). Koenderink has considered the surface geometry of an intensity cross-section induced by a visual stimulus [25]. Many intriguing questions remain, such as how these geometries (one of which does not necessarily exclude another!) contribute to our perception of the (geometry of the) visual world "outside"; cf. Koenderink et al. [26,27]. However, even rudimentary questions remain essentially unsolved, such as how the visual system, without the infinite regression inevitably arising from the tacit ministration of a homunculus, a little (wo)man inside the brain viewing an internal

Figure 1. Schematic representation of the optic pathways from each of the four quadrants of view for both eyes. Adapted from Wikimedia Commons, original illustration by Ratznium. LGN, lateral geniculate nucleus.

Figure 2. Retino-cortical magnification, $\partial v(t, T)/\partial t$ (left), and its integral, $v(t, T)$ (right), as a function of dimensionless eccentricity, $t$, illustrated for the case $T = 95$ (dashed vertical line); recall Equations (12)-(15). The peak on the left occurs at $t_+ = 1$ and marks the border $\rho_+ = a$ of the geometric foveola. The half maximum on the right is reached at $t_{1/2} \approx \sqrt{T}$,

Figure 4. The canonical $(p, q)$-domain is the region between the graphs of $q_{\pm\frac{\pi}{2}} = \pm\frac{\pi}{2} \tanh p$ and the lines $p = 0$ and $p = \operatorname{arcsinh} T$. On the left, $(t, \phi)$ are dimensionless radial and azimuthal coordinates. On the right, the canonical $(p, q)$-coordinates are plotted as Cartesian coordinates, with $p$ on the horizontal axis. Recall Equation (28), and compare with Figure 3.