Article

Design and Analysis of a Single-Camera Omnistereo Sensor for Quadrotor Micro Aerial Vehicles (MAVs) †

1 Department of Computer Science, The Graduate Center, The City University of New York (CUNY), 365 Fifth Avenue, New York, NY 10016, USA
2 Electrical Engineering Department, The City College, City University of New York (CUNY City College), Convent Ave & 140th Street, New York, NY 10031, USA
3 Automation Department, Nanjing University of Science and Technology (NUST), Nanjing 210094, China
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in the Proceedings of the IEEE Conference on Industrial Electronics and Applications (ICIEA), Melbourne, Australia, 19–21 June 2013: Jaramillo, C.; Guo, L.; Xiao, J. A Single-Camera Omni-Stereo Vision System for 3D Perception of Micro Aerial Vehicles (MAVs).
Sensors 2016, 16(2), 217; https://doi.org/10.3390/s16020217
Submission received: 24 November 2015 / Accepted: 29 January 2016 / Published: 6 February 2016
(This article belongs to the Special Issue Sensors for Robots)

Abstract

We describe the design and 3D sensing performance of an omnidirectional stereo (omnistereo) vision system applied to Micro Aerial Vehicles (MAVs). The proposed omnistereo sensor employs a monocular camera that is co-axially aligned with a pair of hyperboloidal mirrors (a vertically-folded catadioptric configuration). We show that this arrangement provides a compact solution for omnidirectional 3D perception while mounted on top of propeller-based MAVs, which are not capable of carrying large payloads. The theoretical single viewpoint (SVP) constraint helps us derive analytical solutions for the sensor's projective geometry and generate SVP-compliant panoramic images, from which 3D information is computed via stereo correspondences (in a truly synchronous fashion). We perform an extensive analysis of various system characteristics such as its size, catadioptric spatial resolution, and field-of-view. In addition, we pose a probabilistic model for the uncertainty estimation of 3D information obtained by triangulating back-projected rays. We validate the projection error of the design using both synthetic and real-life images against ground-truth data. Qualitatively, we show 3D point clouds (dense and sparse) resulting from a single image captured in a real-life experiment. We expect our sensor to be reproducible, as its model parameters can be optimized to satisfy other catadioptric-based omnistereo vision applications under different circumstances.


1. Introduction

Micro aerial vehicles (MAVs), such as quadrotor helicopters, are popular platforms for unmanned aerial vehicle (UAV) research due to their structural simplicity, small form factor, vertical take-off and landing (VTOL) capability, and high omnidirectional maneuverability. In general, UAVs have plenty of military and civilian applications, such as target localization and tracking, 3-dimensional (3D) mapping, terrain and infrastructure inspection, disaster monitoring, environmental and traffic surveillance, search and rescue, deployment of instrumentation, and cinematography, among other uses. However, MAVs have size, payload, and on-board computation limitations, which call for compact and lightweight sensors. The most commonly used perception sensors on MAVs are laser scanners and cameras in various configurations such as monocular, stereo, or omnidirectional. We present a vision-based omnidirectional stereo (omnistereo) sensor motivated by several aspects of MAV robotics.

1.1. Sensor Motivation

We justify the need for the proposed omnistereo sensor after observing two basic differences in the sensor requirements between MAVs and ground vehicles:
  • Size and payload—In MAV applications, the sensor’s physical dimensions and weight are always a great concern due to payload constraints. Generally, MAVs require fewer and lighter sensors that are compactly designed, while larger robots (including high-payload UAVs) have greater freedom of sensor choice.
  • Field-of-view (FOV)—Due to their omnidirectional motion model, MAVs require a simultaneous observation of the 3D surroundings. Conversely, most ground robots can safely rely upon narrow vision as their motion control on the plane is more stable.

1.2. Existing Range Sensors for MAVs

In addition to specifying our sensor requirements, it is important to note the most prevalent robot range sensors used today by MAVs and their limitations. For example, lightweight 2.5D laser scanners can accurately measure distances at fast rates; however, their instantaneous sensing is limited to plane sweeps, which in turn requires the quadrotor to move vertically in order to generate 3D maps or to foresee obstacles and free space during navigation. More recently, 3D laser rangefinders and LiDARs are being developed, such as the sensor presented in [1], but this one is not compact enough for MAVs. Another disadvantage of laser-based technologies is their active sensing nature, which requires more power to operate, and their measurements are more vulnerable to detection and to corruption (e.g., due to dark/reflective surfaces) than vision-based solutions. Time-of-flight (ToF) cameras as well as red, green, blue plus depth (RGB-D) sensors like the Microsoft Kinect® are also very popular for robot navigation. They have been adopted for low-sunlight conditions and mainly indoor navigation of MAVs [2] due to their structured infrared light projection and short range sensing (under 5 m) [3]. Hence, a lightweight imaging system capable of instantly providing a large field of view (FOV) with acceptable resolutions is essential for MAV applications in 3D space. These state-of-the-art sensors' pitfalls motivate the design and analysis of our omnistereo sensor.

1.3. Related Work

Approaches that rely on omnidirectional images and motion alone, like those taken in [4,5], have been proposed to map and localize a robot. Omnidirectional vision using a single mirror for the flight of large UAVs was first attempted in [6]. In [7], Hrabar proposed the use of traditional horizontal stereo-based obstacle avoidance and path planning for UAVs, but these techniques were only tested in a scaled-down air vehicle simulator (AVS). Omnidirectional catadioptric cameras can be aided by structured light, such as the prototypes presented in [8] and the more flexible configurations demonstrated in [9]. Alternatively, stereo cameras can provide passive, instantaneous 3D information for robot mapping and navigation (including UAVs [10]). Intuitively, omnidirectional stereo (omnistereo) can be achieved through circular arrangements of multiple perspective cameras with overlapping views. Higher resolution panoramas can be achieved by rotating a linear camera as presented in [11], but this approach suffers from motion blur in dynamic environments. We point the reader to [12] for a detailed study of multiple view geometry, and [13] for a compendium of geometric computer vision concepts. Instead, our solution to omnistereo vision consists of a "catadioptric" system that employs cameras and mirrors [14].
Over the years, various omnistereo catadioptric configurations have been applied to ground mobile robots [15,16,17,18,19,20]. Unfortunately, these systems are not compact since they use separate camera-mirror pairs, which are known to experience synchronization issues. In [21], Yi and Ahuja described a configuration using a mirror and a concave lens for omnistereo, but it rendered a very short baseline in comparison to two-mirror configurations. Originally, Nayar and Peri [22] studied 9 possible folded-catadioptric configurations for a single-camera omnistereo imaging system. Eventually, a catadioptric system using two hyperbolic mirrors in a vertical configuration was implemented by He et al. [23]. Their omnistereo sensor provides a lengthy baseline at the expense of a very tall system. In the past [24], we developed a novel omnistereo catadioptric rig consisting of a perspective camera coaxially aligned with two spherical mirrors of distinct radii (in a "folded" configuration). One caveat of spherical mirrors is their non-centrality; they do not satisfy the single effective viewpoint (SVP) constraint (discussed in Section 2.2) but rather produce a locus of viewpoints [25].

1.4. Proposed Sensor

We design an SVP-compliant omnistereo system based on the folded catadioptric configuration with hyperboloidal mirrors. Our approach resembles the work of Jang, Kim, and Kweon [26], who first implemented an omnistereo system using a pair of hyperbolic mirrors and a single camera. However, their sensor's characteristics were not analyzed in order to justify its design parameters and capabilities, which we do in our case.
It is true that an omnidirectional catadioptric system sacrifices spatial resolution on the imaging sensor (analyzed in Section 3.4). However, our sensor offers practical advantages such as reduced cost, acceptable weight, and truly instantaneous pixel-disparity correspondences, since the same single camera-lens operates for both views, so mis-synchronization issues do not exist. In fact, we believe we are the first to present a single-camera catadioptric omnistereo solution for MAVs. The initial geometry of our model was proposed in [27]. Now, we perform an extensive analysis of our model's parameters (Section 2) and of its geometric projection (Section 3); the parameters are obtained as the solution to a constrained numerical optimization devised for the sensor's real-life application to passive range sensing on MAVs (Section 4). We also show how the panoramic images are obtained, on which we find correspondences and triangulate 3D points, and for which an uncertainty model is introduced (Section 5). Finally, we present our experimental results and evaluation for 3D sensing with the proposed omnistereo sensor (Section 6), and we discuss the future direction of our work in Section 7.

2. Sensor Design

Figure 1 shows the single-camera catadioptric omnistereo vision system that we specifically design to be mounted on top of our micro quadrotors (manufactured by Ascending Technologies [28]). It consists of (1) one hyperboloid-planar mirror at the top; (2) one hyperboloidal mirror at the bottom; and (3) a high-resolution USB camera, also at the bottom (inside the bottom mirror and looking up). The components are housed and supported by (4) a transparent tube or plastic standoffs (for the real-life prototype shown in Figure 13). The choice of hyperboloidal reflectors owes to three reasons: it is one of the four non-degenerate conic shapes satisfying the SVP constraint [29]; it allows a wider vertical FOV than elliptical and planar mirrors; and it does not require a telescopic (orthographic) lens for imaging as paraboloidal mirrors do (so our system can be downsized). In addition, the planar part of mirror 1 works as a reflex mirror, which in part reduces distortion caused by dual conic reflections. Based on the SVP property, the system obtains two radial images of the omnidirectional views in the form of an inner and an outer ring, as illustrated in Figure 2a,b. Nevertheless, the unique set of parameters describing the entire system categorizes it as a "global camera model" as defined in [13], because changing the value of any parameter in the model affects the overall projection function of visible light rays in the scene as well as other computational imaging factors such as depth resolution and overlapping field of view, which we attempt to optimize in the following design subsections. Please refer to Appendix A for clarification on our symbolic notation.

2.1. Model Parameters

In the configuration of Figure 3, mirror 1's real or primary focus is $F_1$, which is separated by a distance $c_1$ from its virtual or secondary focus, $F_1'$, at the bottom. Without loss of generality, we make both the camera's pinhole and $F_1'$ coincide with the origin of the camera's coordinate system, $O_C$. This way, the position of the primary focus, $F_1$, can be referenced by vector $^{C}\mathbf{f}_1 = [0, 0, c_1]^T$ in Cartesian coordinates with respect to the camera frame, $C$. Similarly, the distance between the foci of mirror 2, $F_2$ and $F_2'$, is measured by $c_2$. Here, we use the planar (reflex) mirror of radius $r_{ref}$ and unit normal vector
$$^{C}\hat{\mathbf{n}}_{ref} = [0, 0, 1]^T$$
in order to project the real camera's pinhole located at $O_C$ as a virtual camera $O_{C'}$ coinciding with the virtual focal point $F_2'$ positioned at $^{C}\mathbf{f}_{2v} = [0, 0, d]^T$. We achieve this by setting $d/2$ as the symmetrical distance from the reflex mirror to $O_C$ and from the reflex mirror to $O_{C'}$. With respect to $C$, mirror 2's primary focus, $F_2$, results in position $^{C}\mathbf{f}_2 = [0, 0, d - c_2]^T$. This yields the following expression for the reflective plane:
$$^{C}\hat{\mathbf{n}}_{ref}^{\,T}\; {}^{C}\mathbf{x} = d/2$$
The profile of each hyperboloid is determined by independent parameters $k_1$ and $k_2$, respectively. Their reflective vertical fields of view (vFOV) are indicated by angles $\alpha_1$ and $\alpha_2$. They play an important role when designing the total vFOV of the system, $\alpha_{sys}$, formally defined by Equation (54) and illustrated in Figure 5. Also, while performing stereo vision, it is important to consider the angle $\alpha_{SROI}$, which measures the common (overlapping) vFOV of the omnistereo system. The camera's nominal field of view $\alpha_{cam}$ and its opening radius $r_{cam}$ also determine the physical areas of the mirrors that can be fully imaged. Theoretically, the mirrors' vertical axis of symmetry (coaxial configuration) produces two image points that are radially collinear. This property is advantageous for the correspondence search during stereo sensing (Section 5), with a baseline measured as
$$b = c_1 + c_2 - d$$
Among design parameters, we also include the total height of the system, h s y s , and weight m s y s , both being formulated in Section 2.3.
To summarize, the model has 6 primary design parameters given as a vector
$$\boldsymbol{\theta} = \left[c_1,\; c_2,\; k_1,\; k_2,\; d,\; r_{sys}\right]^T$$
in addition to by-product parameters such as
$$b,\; h_{sys},\; r_{ref},\; r_{cam},\; m_{sys},\; \alpha_1,\; \alpha_2,\; \alpha_{sys},\; \alpha_{SROI},\; \alpha_{cam}$$
In Section 4, we perform a numerical optimization of the parameters in θ with the goal to maximize the baseline, b, required for life-size navigational stereopsis. At the same time, we restrict the overall size of the rig (Section 2.3) without sacrificing sensing performance characteristics such as vertical field of view, spatial resolution, and depth resolution. In the upcoming subsections, we first derive the analytical solutions for the forward projection problem in our coaxial stereo configuration as a whole. In Section 3.2, we derive the back-projection equations for lifting 2D image points into 3D space.
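For illustration only, the primary design parameters and the by-product baseline of Equation (3) can be grouped in a small container, as sketched below in Python; the field names and numeric values (other than the profile parameters k1 = 5.7 and k2 = 9.7 quoted later in Section 4.2.2) are placeholders rather than values from Table 1.

```python
from dataclasses import dataclass

@dataclass
class OmnistereoParams:
    """Primary design parameters (vector theta), lengths in mm."""
    c1: float     # focal separation of mirror 1
    c2: float     # focal separation of mirror 2
    k1: float     # profile parameter of mirror 1 (k > 2)
    k2: float     # profile parameter of mirror 2 (k > 2)
    d: float      # distance between the real and virtual camera pinholes
    r_sys: float  # common outer radius of both mirrors

    @property
    def baseline(self) -> float:
        # b = c1 + c2 - d, the vertical separation between F1 and F2 (Equation (3))
        return self.c1 + self.c2 - self.d

# Example: big-rig profile parameters from Section 4.2.2; the remaining numbers are made up.
theta = OmnistereoParams(c1=100.0, c2=120.0, k1=5.7, k2=9.7, d=80.0, r_sys=37.0)
print(theta.baseline)  # 140.0 mm for these placeholder values
```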

2.2. Single Viewpoint (SVP) Configuration for OmniStereo

As a central catadioptric system, its projection geometry must obey the existence of the so-called single effective viewpoint (SVP). While the SVP guarantees that true perspective geometry can always be recovered from the original image, it limits the selection of mirror profiles to a set of conics. Generally, a circular hyperboloid of revolution (about its axis of symmetry) conforms to the SVP constraint, as demonstrated by Baker and Nayar in [30]. Since a hyperboloidal mirror has two foci, the effective viewpoint is the primary focus $F$ inside the physical mirror, and the secondary (outer) focus $F'$ is where the centre (pinhole) of the perspective camera should be placed for depicting a scene obeying the SVP configuration discussed in this section.
First of all, a hyperboloid i can be described by the following parametric equation:
$$\frac{\left(z_i - z_{0_i}\right)^2}{a_i^2} - \frac{r_i^2}{b_i^2} = 1, \quad\text{with}\quad a_i = \frac{c_i}{2}\sqrt{\frac{k_i - 2}{k_i}}, \qquad b_i = \frac{c_i}{2}\sqrt{\frac{2}{k_i}}$$
where $z_{0_i}$ is the offset (shift) position of the hyperboloid along the Z-axis from the origin $O_C$ (given in Equations (9) and (10)), and $r_i$ is the orthogonal distance from a point $P_i$ on its surface to the axis of revolution/symmetry (i.e., the Z-axis).
In fact, the position of a valid point P i is constrained within the mirror’s physical surface of reflection, which is radially limited by r i , m i n and r i , m a x , such that:
$$r_i = \sqrt{x_i^2 + y_i^2}, \quad\text{for}\quad r_{i,min} \le r_i \le r_{i,max}, \quad i \in \{1, 2\}$$
and $r_{1,min} = r_{ref}$, $r_{1,max} = r_{sys}$, $r_{2,min} = r_{cam}$, $r_{2,max} = r_{sys}$. Observe that the radius of the system is the upper bound for both mirrors (Figure 3). In addition, the hyperboloids profiled by Equation (5) must obey the following conical constraints:
$$c_i > 0 \;\wedge\; k_i > 2, \quad \forall\, i \in \{1, 2\}$$
$k$ is a constant, unit-less parameter that is inversely related to the eccentricity $\varepsilon_c$ of the conic and thus controls the mirror's curvature. In fact, $\varepsilon_c > 1$ for hyperbolas, and a plane is produced in the limit $\varepsilon_c \to \infty$, i.e., $k = 2$.
We devise $\mathcal{M}_i$ as the set of all the reflection points $P_i$ with coordinates $(x_i, y_i, z_i)$ lying on the surface of the respective mirror $i$ within bounds. Formally,
$$\mathcal{M}_i := \left\{ P_i \in \mathbb{R}^3 \;\middle|\; \frac{\left(z_i - z_{0_i}\right)^2}{a_i^2} - \frac{r_i^2}{b_i^2} = 1 \;\wedge\; \text{Equation (6)} \;\wedge\; \text{Equation (7)} \right\}$$
In our model, we describe both hyperboloidal mirrors, 1 and 2, with respect to the camera frame C , which acts as the common origin of the coordinate system. Therefore,
$$z_{0_1} = \frac{c_1}{2}$$
$$z_{0_2} = d - \frac{c_2}{2}$$
By expanding Equation (5) with their respective index terms, it becomes
$$\left(z_1 - \frac{c_1}{2}\right)^2 - r_1^2\left(\frac{k_1}{2} - 1\right) = \frac{c_1^2}{4}\,\frac{k_1 - 2}{k_1}$$
$$\left(z_2 - d + \frac{c_2}{2}\right)^2 - r_2^2\left(\frac{k_2}{2} - 1\right) = \frac{c_2^2}{4}\,\frac{k_2 - 2}{k_2}$$
Additionally, we define the function $f_{z_i}: r \mapsto z_i$ to find the corresponding $z_i$ component from a given $r$ value as
$$f_{z_i}(r) := \begin{cases} z_{0_i} + \gamma_i & \text{if } i = 1 \,\wedge\, \text{Equation (6)}\\ z_{0_i} - \gamma_i & \text{if } i = 2 \,\wedge\, \text{Equation (6)}\\ \text{None} & \text{otherwise}\end{cases}$$
where $\gamma_i = \dfrac{a_i}{b_i}\sqrt{b_i^2 + r^2}$.
The inverse relation $f_{r_i}: z \mapsto \left\{+r_i, -r_i\right\}$ can also be implemented as
$$f_{r_i}(z) := \begin{cases} \pm\, b_i\, \Gamma_i & \text{if } i \in \{1, 2\} \,\wedge\, \text{Equation (6)}\\ \text{None} & \text{otherwise}\end{cases}$$
where $\Gamma_i = \sqrt{\dfrac{\left(z - z_{0_i}\right)^2}{a_i^2} - 1}$, so a valid input $z$ can be associated with both positive and negative solutions $\pm r_i$.
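For illustration, the profile functions of Equations (13) and (14) translate directly into code. The following minimal Python sketch assumes the coefficient expressions $a_i$, $b_i$ of Equation (5) as written above; it is not the authors' implementation.

```python
import math

def mirror_coeffs(c: float, k: float):
    """Hyperboloid coefficients a, b of Equation (5) for one mirror."""
    a = (c / 2.0) * math.sqrt((k - 2.0) / k)
    b = (c / 2.0) * math.sqrt(2.0 / k)
    return a, b

def f_z(r: float, c: float, k: float, z0: float, mirror: int,
        r_min: float, r_max: float):
    """z-coordinate on mirror `mirror` at radial distance r (Equation (13))."""
    if not (r_min <= r <= r_max):
        return None                      # outside the reflective bounds of Eq. (6)
    a, b = mirror_coeffs(c, k)
    gamma = (a / b) * math.sqrt(b * b + r * r)
    return z0 + gamma if mirror == 1 else z0 - gamma

def f_r(z: float, c: float, k: float, z0: float):
    """Positive radial coordinate on the mirror at height z (Equation (14))."""
    a, b = mirror_coeffs(c, k)
    arg = (z - z0) ** 2 / (a * a) - 1.0
    if arg < 0.0:
        return None                      # z lies between the two sheets, no surface
    return b * math.sqrt(arg)            # only the positive solution is returned here
```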

2.3. Rig Size

To evaluate the overall system size, we consider the height and weight that result from the primary design parameters, θ.
First, the height of the system, $h_{sys}$, can be estimated from the functional relationships $f_{z_1}$ and $f_{z_2}$ defined in Equation (13), which provide the respective $z$ component values at the outermost point on each mirror's surface. More specifically, knowing $r_{sys}$, we get
$$h_{sys} = z_{max} - z_{min}$$
where $z_{max} = f_{z_1}\left(r_{sys}\right)$ and $z_{min} = f_{z_2}\left(r_{sys}\right)$.
The rig's weight can be indicated by the total resulting mass of the main "tangible" components:
$$m_{sys} = m_{cam} + m_{tub} + m_{mir}$$
where the mass of the camera-lens combination is $m_{cam}$; the mass of the support tube, $m_{tub}$, can be estimated from its cylindrical volume $V_{tub}$ and material density $\rho_{tub}$; and the mass due to the mirrors is
$$m_{mir} = V_{mir}\,\rho_{mir} = \left(V_1 + V_{ref} + V_2\right)\rho_{mir}$$
For computing the volume of the hyperboloidal shell, $V_i$ for mirror $i$, we apply a "ring method" of volume integration. By assuming all mirror material has the same wall thickness $\tau_m$, we acquire $V_i$ by integrating the horizontal cross-section areas along the $Z$-axis. Each ring area depends on its outer and inner circumferences, which vary according to the radius $r_z$ for a given height $z$. Equation (14) establishes the functional relation $r_i^{+} = f_{r_i}(z)$, from which we only need the positive answer. We let $A$ be the function that computes the ring area of constant thickness $\tau_m$ for a variable outer radius $r_i$:
$$A\left(r_i\right) = \pi r_i^2 - \pi\left(r_i - \tau_m\right)^2 = \pi \tau_m\left(2 r_i - \tau_m\right)$$
We consider the definite integral evaluated over the $z$ interval bounded by the height limits, which are correlated with the radial limits of Equation (6) and can be obtained via $f_{z_i}$ defined in Equation (13), such that
$$z_{i,min} = f_{z_i}\left(r_{i,min}\right) \quad\text{and}\quad z_{i,max} = f_{z_i}\left(r_{i,max}\right)$$
Then, we proceed to integrate Equation (18), so the shell volume for each hyperboloidal mirror is defined as
$$V_i = \int_{z_{i,min}}^{z_{i,max}} A\left(r_i\right)\, dz$$
Finally, since the reflex mirror piece is just a solid cylinder of thickness τ m , its volume is simply
$$V_{ref} = \tau_m\, \pi\, r_{ref}^2$$
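As an illustrative numerical check of Equations (16)–(21), the shell volume can be integrated with an off-the-shelf quadrature routine; the sketch below reuses the hypothetical f_r helper from the previous sketch and assumes SciPy is available.

```python
import math
from scipy.integrate import quad

def ring_area(r: float, tau_m: float) -> float:
    """Cross-section ring area A(r) of Equation (18)."""
    return math.pi * tau_m * (2.0 * r - tau_m)

def shell_volume(c: float, k: float, z0: float, tau_m: float,
                 z_min: float, z_max: float) -> float:
    """Shell volume V_i of Equation (20), integrating A(f_r(z)) over z."""
    integrand = lambda z: ring_area(f_r(z, c, k, z0), tau_m)
    volume, _ = quad(integrand, z_min, z_max)
    return volume

def system_mass(m_cam, v_tub, rho_tub, v_mirrors, rho_mir):
    """Total rig mass m_sys of Equations (16) and (17)."""
    return m_cam + v_tub * rho_tub + v_mirrors * rho_mir
```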

3. Projective Geometry

3.1. Analytical Solutions to Projection (Forward)

Assuming a central catadioptric configuration for the mirrors and camera system (Section 2.2), we derive the closed-form solution to the imaging process (forward projection) for an observable point P w , positioned in three-dimensional Euclidean space, R 3 , with respect to the reference frame, C , as vector C p w = [ x w , y w , z w ] T . In addition, we assume all reference frames such as F 1 and F 2 have the same orientation as C .
For mathematical stability, we must constrain all projecting world points to lie outside the mirror's volume:
$$f_{r_i}\left(z_w\right) < \rho_w, \quad\text{where}\quad \rho_w = \sqrt{x_w^2 + y_w^2}$$
where f r i is defined by Equation (14) and ρ w measures the horizontal range to P w .
$P_w$ is imaged at pixel position $^{I}\mathbf{m}_1$ after its reflection as point $P_1$ on the hyperboloidal surface of mirror 1 (Figure 4). On the other hand, the second image point's position, $^{I}\mathbf{m}_2$, due to reflection point $P_2$ on mirror 2, is obtained indirectly after an additional point $P_r$ is reflected at $^{C}\mathbf{p}_{ref}$ on the reflex mirror, as represented by Equation (32).
First, for $P_w$'s reflection point via mirror 1 at position vector $^{C}\mathbf{p}_1$, we use $\lambda_1$ as the parametrization term for the line passing through $F_1$ toward $P_w$ with direction $^{F_1}\mathbf{d}_1 = {}^{C}\mathbf{p}_w - {}^{C}\mathbf{f}_1$. The position of any point $P_1$ on this line is given by:
$$^{C}\mathbf{p}_1 = {}^{C}\mathbf{f}_1 + \lambda_1\, {}^{F_1}\mathbf{d}_1$$
Substituting Equation (23) into Equation (11), we obtain:
$$\left(\lambda_1\left(z_w - c_1\right) + \frac{c_1}{2}\right)^2 - \left(\lambda_1^2 x_w^2 + \lambda_1^2 y_w^2\right)\left(\frac{k_1}{2} - 1\right) - \frac{c_1^2}{4}\,\frac{k_1 - 2}{k_1} = 0$$
in order to solve for λ 1 , which turns out to be
$$\lambda_1 = \frac{c_1}{\left\lVert{}^{F_1}\mathbf{d}_1\right\rVert \sqrt{k_1\left(k_1 - 2\right)} - k_1\left(z_w - c_1\right)}$$
where $\left\lVert{}^{F_1}\mathbf{d}_1\right\rVert = \sqrt{x_w^2 + y_w^2 + \left(z_w - c_1\right)^2}$ is the Euclidean distance between $P_w$ and mirror 1's focus, $F_1$.
In practice, we represent the reflection point's position $^{C}\mathbf{p}_1$ as a matrix-vector multiplication between the $3\times4$ transformation matrix $K_1 = \left[\lambda_1 I_{(3)},\; \left(1 - \lambda_1\right){}^{C}\mathbf{f}_1\right]$ and the point's position vector $^{C}\mathbf{p}_{w,h} = [x_w, y_w, z_w, 1]^T$ in homogeneous coordinates:
$$^{C}\mathbf{p}_1 = K_1\, {}^{C}\mathbf{p}_{w,h}$$
Note that C p 1 ’s elevation angle, θ 1 , must be bounded as
$$\theta_{1,min} \le \theta_1 \le \theta_{1,max}$$
where θ 1 , m i n and θ 1 , m a x are the angular elevation limits for the real reflective area of the hyperboloid.
Finally, the reflection point $P_1$ with position $^{C}\mathbf{p}_1$ can now be perspectively projected as a pixel point located at $^{I}\mathbf{m}_1 = [u_1, v_1]^T$ on the image. In fact, the entire imaging process of $P_w$ via mirror 1 can be expressed in homogeneous coordinates as:
$$^{I}\mathbf{m}_{1,h} = \zeta_1\, K_c\, K_1\, {}^{C}\mathbf{p}_{w,h}$$
where the scalar $\zeta_1 = 1/z_1 = 1/\left(c_1 + \lambda_1\left(z_w - c_1\right)\right)$ is the perspective normalizer that maps the principal ray passing through $P_1$ onto a point $^{C}\mathbf{q}_1 = [x_{q_1}, y_{q_1}, 1]^T$ on the normalized projection plane $\hat{\pi}_{img_1}$. The traditional $3\times3$ intrinsic matrix of the camera's pinhole model is
$$K_c = \begin{bmatrix} f_u & s & u_c\\ 0 & f_v & v_c\\ 0 & 0 & 1 \end{bmatrix}$$
in which $f_u = f/h_x$ and $f_v = f/h_y$ are based on the focal length $f$ and the pixel dimensions $(h_x, h_y)$, $s$ is the skew parameter, and $^{I}\mathbf{m}_c = [u_c, v_c]^T$ is the optical center position on the image $I$. Figure 4 illustrates the projection point $^{C}\mathbf{q}_1$ on the respective image plane $\pi_{img_1}$.
Similarly, we provide the analytical solution for the forward projection of P w via mirror 2 by first considering the position of reflection point P 2 :
$$^{C}\mathbf{p}_2 = K_2\, {}^{C}\mathbf{p}_{w,h}$$
where $K_2 = \left[\lambda_2 I_{(3)},\; \left(1 - \lambda_2\right){}^{C}\mathbf{f}_2\right]$ is analogous to the transformation matrix $K_1$, but it now uses $^{C}\mathbf{f}_2$ and
$$\lambda_2 = \frac{c_2}{\left\lVert{}^{F_2}\mathbf{d}_2\right\rVert \sqrt{k_2\left(k_2 - 2\right)} + k_2\left(z_w - \left(d - c_2\right)\right)}$$
with direction vector's norm
$$\left\lVert{}^{F_2}\mathbf{d}_2\right\rVert = \left\lVert{}^{C}\mathbf{p}_w - {}^{C}\mathbf{f}_2\right\rVert = \sqrt{x_w^2 + y_w^2 + \left(z_w - \left(d - c_2\right)\right)^2}$$
For completeness, note that the physical projection via mirror 2 is incident to the reflex mirror at
$$^{C}\mathbf{p}_{ref} = {}^{C}\mathbf{f}_{2v} + \lambda_{ref}\left({}^{C}\mathbf{p}_2 - {}^{C}\mathbf{f}_{2v}\right)$$
where $\lambda_{ref} = \dfrac{d}{2\left(d - z_2\right)}$ according to Equation (2) in the theoretical model. Ultimately, ignoring any astigmatism and chromatic aberrations introduced by the reflex mirror, and because the same (and only) real camera with $K_c$ is used for imaging, we obtain the projected pixel position $^{I}\mathbf{m}_{2,h} = [u_2, v_2, 1]^T$:
$$^{I}\mathbf{m}_{2,h} = \zeta_2\, K_c\, K_{ref}\, K_2\, {}^{C}\mathbf{p}_{w,h}$$
where $\zeta_2 = 1/\left(d - z_2\right)$ is the perspective normalizer used to find $^{C'}\mathbf{q}_2$ on the normalized projection plane, $\hat{\pi}_{img_2}$.
Due to planar mirroring via the reflex mirror, $^{C'}_{C}K_{ref}$ is used to change the coordinates of $P_2$ from $C$ onto the virtual camera frame, $C'$, located at $^{C}\mathbf{f}_{2v}$. Hence,
$$^{C'}_{C}K_{ref} = \left[\, -I_{(3)} + 2\, D_{\hat{n}_{ref}}\;,\;\; {}^{C}\mathbf{f}_{2v} \,\right]$$
where the $3\times1$ unit normal vector of the reflex mirror plane, $^{C}\hat{\mathbf{n}}_{ref}$, given in Equation (1), is mapped into its corresponding $3\times3$ diagonal matrix $D_{\hat{n}_{ref}}$ via the relationship:
$$D_{\hat{n}_{ref}} \equiv I_{(3)} - \mathrm{diag}\left({}^{C}\hat{\mathbf{n}}_{ref}\right)$$
It is convenient to define the forward projection functions $f_{\varphi_1}({}^{C}\mathbf{p})$ and $f_{\varphi_2}({}^{C}\mathbf{p})$ for a 3D point $P$ whose position vector is known with respect to $C$ and which is situated within the vertical field of view $\alpha_i$ of mirror $i$ (for $i \in \{1, 2\}$) indicated in Figure 5. Function $f_{\varphi_i}({}^{C}\mathbf{p})$ maps $^{C}\mathbf{p}$ to image point $^{I}\mathbf{m}_i$ on frame $I$, such that $f_{\varphi_i}: \mathbb{R}^3 \to \mathbb{R}^2$,
$$f_{\varphi_i}\left({}^{C}\mathbf{p}\right) := \begin{cases} {}^{C}\mathbf{p} \overset{\text{Eq. (27)}}{\longmapsto} {}^{I}\mathbf{m}_1 & \text{if } i = 1 \,\wedge\, \text{Equations (37) and (22)}\\[1ex] {}^{C}\mathbf{p} \overset{\text{Eq. (33)}}{\longmapsto} {}^{I}\mathbf{m}_2 & \text{if } i = 2 \,\wedge\, \text{Equations (37) and (22)}\\[1ex] \text{None} & \text{otherwise}\end{cases}$$
In fact, I m i is considered valid if it is located within the imaged radial bounds, such that:
$$\left\lVert {}^{I_{C_i}}\mathbf{m}_{r_{i,min}} \right\rVert \;\le\; \left\lVert {}^{I_{C_i}}\mathbf{m}_i \right\rVert \;\le\; \left\lVert {}^{I_{C_i}}\mathbf{m}_{r_{i,max}} \right\rVert$$
where the frame of reference $I_{C_i}$ implies that its origin is the image center $^{I_i}\mathbf{m}_c = [u_{c_i}, v_{c_i}]^T$ of the masked image $I_i$ (Figure 7). Therefore, the magnitude (norm) of any position $^{I_{C_i}}\mathbf{m}$ in pixel space can be measured as
$$\left\lVert {}^{I_{C_i}}\mathbf{m} \right\rVert := \left\lVert {}^{I_i}\mathbf{m} - {}^{I_i}\mathbf{m}_c \right\rVert = \sqrt{\left(u - u_c\right)^2 + \left(v - v_c\right)^2}$$
In particular, $\left\lVert{}^{I_{C_i}}\mathbf{m}_{r_{i,lim}}\right\rVert$ is the image radius obtained from the projection $^{I}\mathbf{m}_{r_{i,lim}} \leftarrow f_{\varphi_i}\left({}^{C}\mathbf{p}_{i,lim}\right)$ corresponding to a particular point coincident with the line of sight of the radial limit $r_{i,lim}$, it being either $r_{sys}$, $r_{ref}$, or $r_{cam}$ as indicated by Equation (6).
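For illustration, the forward projection via mirror 1 (Equations (23)–(27)) amounts to a few vector operations; the Python sketch below assumes an ideal, distortion-free pinhole with made-up intrinsic values and omits the validity checks of Equations (22) and (37).

```python
import numpy as np

def project_via_mirror1(p_w: np.ndarray, c1: float, k1: float,
                        K_c: np.ndarray) -> np.ndarray:
    """Pixel of world point p_w (3,) seen through mirror 1 (Equation (27))."""
    f1 = np.array([0.0, 0.0, c1])             # primary focus F1
    d1 = p_w - f1                             # ray direction F1 -> P_w
    norm_d1 = np.linalg.norm(d1)
    # lambda_1 of Equation (25)
    lam1 = c1 / (norm_d1 * np.sqrt(k1 * (k1 - 2.0)) - k1 * (p_w[2] - c1))
    p1 = f1 + lam1 * d1                       # reflection point on mirror 1
    m1_h = K_c @ p1 / p1[2]                   # perspective projection, zeta_1 = 1/z_1
    return m1_h[:2]                           # pixel coordinates [u1, v1]

# Usage with an assumed intrinsic matrix (focal length and center are made up):
K_c = np.array([[800.0,   0.0, 1296.0],
                [  0.0, 800.0,  972.0],
                [  0.0,   0.0,    1.0]])
u1, v1 = project_via_mirror1(np.array([500.0, 200.0, 50.0]), c1=100.0, k1=5.7, K_c=K_c)
```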

3.2. Analytical Solutions to Back Projection

The back projection procedure establishes the relationship between the 2D position of a pixel point I m i = [ u , v ] T on the image I i and its corresponding 3D projective direction vector v i toward the observed point P w in the world.
Initially, the pixel point $^{I}\mathbf{m}_1$ (imaged via mirror 1) is mapped as $Q_1$ onto the normalized projection plane $\hat{\pi}_{img_1}$ with coordinates $^{C}\mathbf{q}_1 = [x_{q_1}, y_{q_1}, 1]^T$ by applying the inverse of the camera intrinsic matrix, Equation (28), as follows:
$$^{C}\mathbf{q}_1 = {}^{C}_{I}K_c^{-1}\, {}^{I}\mathbf{m}_{1,h} = \begin{bmatrix} \dfrac{1}{f_u} & -\dfrac{s}{f_u f_v} & \dfrac{s\, v_c - f_v u_c}{f_u f_v}\\[1.5ex] 0 & \dfrac{1}{f_v} & -\dfrac{v_c}{f_v}\\[1.5ex] 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u_1\\ v_1\\ 1 \end{bmatrix}$$
For simplicity, we assume no distortion parameters exist, so we can proceed with the lifting step along the principal ray that passes through three points: the camera’s pinhole O C , point Q 1 on the projection plane, and the reflection point P 1 (Figure 4). The vector form of this line equation can be written as:
$$^{C}\mathbf{p}_1 = {}^{C}\mathbf{o}_c + t_1\left({}^{C}\mathbf{q}_1 - {}^{C}\mathbf{o}_c\right) = t_1\, {}^{C}\mathbf{q}_1$$
By substituting Equation (40) into Equation (11), we solve for the parameter $t_1$ to get
$$t_1 = \frac{c_1}{k_1 - \left\lVert{}^{C}\mathbf{q}_1\right\rVert \sqrt{k_1\left(k_1 - 2\right)}}$$
where $\left\lVert{}^{C}\mathbf{q}_1\right\rVert = \sqrt{x_{q_1}^2 + y_{q_1}^2 + 1}$ is the distance between $Q_1$ and $O_C$.
Let $^{F_1}\mathbf{v}_1$ be the direction vector leaving focal point $F_1$ toward the world point $P_w$. Through the frame transformation $^{F_1}_{C}T_1\, {}^{C}\mathbf{p}_{1,h}$, we get
$$^{F_1}\mathbf{v}_1 = {}^{F_1}_{C}T_1\, {}^{C}\mathbf{p}_{1,h}, \quad\text{where}\quad {}^{F_1}_{C}T_{1\,(3\times4)} = \left[\, I_{(3)}\;,\; -{}^{C}\mathbf{f}_1 \,\right]$$
for $^{C}\mathbf{p}_{1,h}$ as the homogeneous form of Equation (40). In fact, $^{F_1}\mathbf{v}_1$ provides the back-projected angles (elevation $\theta_1$, azimuth $\psi_1$) from focus $F_1$ toward $P_w$:
$$^{F_1}\theta_1 = \arcsin\left(\frac{z_{v_1}}{\left\lVert{}^{F_1}\mathbf{v}_1\right\rVert}\right) = \arcsin\left(\frac{z_1 - c_1}{\left\lVert{}^{F_1}\mathbf{v}_1\right\rVert}\right)$$
$$^{F_1}\psi_1 = \arctan\left(\frac{y_{v_1}}{x_{v_1}}\right) = \arctan\left(\frac{y_1}{x_1}\right)$$
where $\left\lVert{}^{F_1}\mathbf{v}_1\right\rVert$ is the norm of the back-projection vector up to the mirror surface.
Using the same approach, we lift a pixel point $^{I}\mathbf{m}_2$ imaged via mirror 2. Because the virtual camera $O_{C'}$ located at $^{C}\mathbf{f}_{2v} = [0, 0, d]^T$ uses the same intrinsic matrix $K_c$, we can safely back-project pixel $^{I}\mathbf{m}_2$ to $Q_{2v}$ on the normalized projection plane $\hat{\pi}_{img_2}$ as follows:
$$^{C'}\mathbf{q}_{2v} = {}^{C'}\mathbf{q}_2 = K_c^{-1}\, {}^{I}\mathbf{m}_{2,h}$$
where the inverse of the camera intrinsic matrix, $K_c^{-1}$, is given by Equation (28). Since the reflection matrix $K_{ref}$ defined in Equation (34) is bidirectional due to the symmetric position of the reflex mirror about $C$ and $C'$, we can find the desired position $^{C}\mathbf{q}_{2v}$ with respect to $C$:
$$^{C}\mathbf{q}_{2v} = {}^{C'}_{C}K_{ref}\, {}^{C'}\mathbf{q}_{2v,h}$$
which is equivalent to $^{C}\mathbf{q}_{2v} = \left[x_{q_{2v}}, y_{q_{2v}}, d - 1\right]^T$.
In Figure 4, we can see the principal ray that passes through the virtual camera's pinhole $O_{C'}$ and the reflection point $P_2$, so this line equation can be written as:
$$^{C}\mathbf{p}_2 = {}^{C}\mathbf{f}_{2v} + t_2\left({}^{C}\mathbf{q}_{2v} - {}^{C}\mathbf{f}_{2v}\right)$$
Solving for $t_2$ with Equations (47) and (12), we get
$$t_2 = \frac{c_2}{k_2 - \left\lVert{}^{C'}\mathbf{q}_2\right\rVert \sqrt{k_2\left(k_2 - 2\right)}}$$
where $\left\lVert{}^{C'}\mathbf{q}_2\right\rVert = \sqrt{x_{q_2}^2 + y_{q_2}^2 + 1}$ is the distance between the normalized projection point $Q_2$ and the virtual camera $O_{C'}$ while considering Equation (46). Beware that the newly found location of $P_2$ is given with respect to the real camera frame, $C$.
Again, we obtain the back-projection ray
$$^{F_2}\mathbf{v}_2 = {}^{F_2}_{C}T_2\, {}^{C}\mathbf{p}_{2,h}, \quad\text{where}\quad {}^{F_2}_{C}T_{2\,(3\times4)} = \left[\, I_{(3)}\;,\; -{}^{C}\mathbf{f}_2 \,\right]$$
in order to indicate the direction leaving the primary focus $F_2$ toward $P_w$ through $P_2$. Here, the corresponding elevation and azimuth angles are respectively given by
$$^{F_2}\theta_2 = \arcsin\left(\frac{z_{v_2}}{\left\lVert{}^{F_2}\mathbf{v}_2\right\rVert}\right) = \arcsin\left(\frac{c_2 - t_2}{\left\lVert{}^{F_2}\mathbf{v}_2\right\rVert}\right)$$
$$^{F_2}\psi_2 = \arctan\left(\frac{y_{v_2}}{x_{v_2}}\right) = \arctan\left(\frac{y_2}{x_2}\right)$$
where $\left\lVert{}^{F_2}\mathbf{v}_2\right\rVert = \sqrt{x_2^2 + y_2^2 + \left(c_2 - t_2\right)^2}$ is the magnitude of the direction vector from its reflection point $P_2$.
As done for the (forward) projection, it is convenient to define the back-projection functions $f_{\beta_1}$ and $f_{\beta_2}$ for lifting a 2D pixel point $^{I}\mathbf{m}$ within the radial bounds validated by Equation (37) to its angular components $^{F_i}\left(\theta_i, \psi_i\right)$ with respect to the respective focus frame $F_i$ (oriented like $C$), as indicated by Equations (43), (44), (50) and (51), such that $f_{\beta_i}: \mathbb{R}^2 \to \mathbb{R}^2$,
$$f_{\beta_i}\left({}^{I}\mathbf{m}\right) := \begin{cases} \left({}^{I}\mathbf{m} \overset{\text{Eq. (43)}}{\longmapsto} {}^{F_1}\theta_1,\;\; {}^{I}\mathbf{m} \overset{\text{Eq. (44)}}{\longmapsto} {}^{F_1}\psi_1\right) & \text{if } i = 1\\[1.5ex] \left({}^{I}\mathbf{m} \overset{\text{Eq. (50)}}{\longmapsto} {}^{F_2}\theta_2,\;\; {}^{I}\mathbf{m} \overset{\text{Eq. (51)}}{\longmapsto} {}^{F_2}\psi_2\right) & \text{if } i = 2\\[1.5ex] \text{None} & \text{if } \neg\,\text{Equation (37)}\end{cases}$$
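As a counterpart, the lifting of a pixel imaged via mirror 1 (Equations (39)–(44)) can be sketched as follows; the intrinsic matrix K_c is again a hypothetical stand-in, and atan2 is used here for a full-range azimuth.

```python
import numpy as np

def back_project_mirror1(m1: np.ndarray, c1: float, k1: float,
                         K_c: np.ndarray):
    """Elevation/azimuth (theta_1, psi_1) of pixel m1 = [u, v] via mirror 1."""
    q1 = np.linalg.solve(K_c, np.array([m1[0], m1[1], 1.0]))   # Eq. (39)
    # t_1 of Equation (41)
    t1 = c1 / (k1 - np.linalg.norm(q1) * np.sqrt(k1 * (k1 - 2.0)))
    p1 = t1 * q1                                               # reflection point, Eq. (40)
    v1 = p1 - np.array([0.0, 0.0, c1])                         # ray from focus F1, Eq. (42)
    theta1 = np.arcsin(v1[2] / np.linalg.norm(v1))             # elevation, Eq. (43)
    psi1 = np.arctan2(v1[1], v1[0])                            # azimuth, Eq. (44)
    return theta1, psi1
```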

3.3. Field-of-View

The horizontal FOV is clearly 360° for both mirrors. In other words, azimuths $\psi$ can be measured in the interval $[0, 2\pi]$ rad. As discussed previously, there exists a positive correlation between the vertical field of view (vFOV) angle $\alpha_i$ of mirror $i$ and its profile parameter $k_i$, such that $\alpha_i \to 180°$ as $k_i \to \infty$ (see Figure 9). As demonstrated in Figure 5, $\alpha_i$ is physically bounded by its corresponding elevation angles, $\theta_{i,max}$ and $\theta_{i,min}$. Both vFOV angles, $\alpha_1$ and $\alpha_2$, are computed from their elevation limits as follows:
$$\alpha_1 = \theta_{1,max} - \theta_{1,min}$$
$$\alpha_2 = \theta_{2,max} - \theta_{2,min}$$
The overall vFOV of the system is also given from these elevation limits:
$$\alpha_{sys} = \max\left(\theta_{1,max},\, \theta_{2,max}\right) - \min\left(\theta_{1,min},\, \theta_{2,min}\right)$$
Figure 6 highlights the so-called common vFOV angle, $\alpha_{SROI}$, for the stereo region of interest (SROI), where the same point can be seen from both mirrors so that point correspondences can be found (Section 5). In our model, $\alpha_{SROI}$ can be determined from the three prevailing elevation angles ($\theta_{1,max}$, $\theta_{1,min}$, and $\theta_{2,min}$), such that:
$$\alpha_{SROI} = \theta_{SROI,max} - \theta_{SROI,min}$$
where generally,
$$\theta_{SROI,min} = \max\left(\theta_{1,min},\, \theta_{2,min}\right)$$
$$\theta_{SROI,max} = \min\left(\theta_{1,max},\, \theta_{2,max}\right)$$
The shaded area in Figure 6 illustrates the SROI, which is far-bounded by the set of triangulated points found at the maximum range due to the minimum disparity $\Delta m_{12} = 1\ \mathrm{px}$ in the discrete case (refer to Figure 17), such that
$$\mathbb{P}_{fs} = \left\{ P_w \leftarrow f_\Delta\left(\left(\theta_1, \psi_1\right), \left(\theta_2, \psi_2\right)\right) \;\middle|\; \left(\theta_1, \psi_1\right) \leftarrow f_{\beta_1}\left(\mathbf{m}_1\right) \,\wedge\, \left(\theta_2, \psi_2\right) \leftarrow f_{\beta_2}\left(\mathbf{m}_2\right) \,\wedge\, \Delta m_{12} = 1\ \mathrm{px} \right\}$$
where the functions $f_{\beta_i}$ and $f_\Delta$ are provided in Equations (52) and (89).
The SROI is near-bounded (toward the $Z$-axis of radial symmetry) by its vertices $P_{ns}^{high}$, $P_{ns}^{mid}$ and $P_{ns}^{low}$, which result from the following ray-intersection cases:
(a)
$$P_{ns}^{high} \leftarrow f_\Delta\left(\left(\theta_{1,max}, \psi_1\right), \left(\theta_{2,max}, \psi_2\right)\right)$$
(b)
$$P_{ns}^{mid} \leftarrow f_\Delta\left(\left(\theta_{1,min}, \psi_1\right), \left(\theta_{2,max}, \psi_2\right)\right)$$
(c)
$$P_{ns}^{low} \leftarrow f_\Delta\left(\left(\theta_{1,min}, \psi_1\right), \left(\theta_{2,min}, \psi_2\right)\right)$$
where the intersection function f Δ is implemented for direction rays (or angles) as defined in the Triangulation Section 5.2.
By assuming radial symmetry of the camera's field of view $\alpha_{cam}$, it should allow for a complete view of the mirror surface at its outermost diameter of $2 r_{sys}$ according to Equation (6). Additionally, as depicted in Figure 6, $\alpha_{cam}$ is upper-bounded by the camera hole radius $r_{cam}$ selected according to Equation (78). The following inequality constraint emerges:
$$2\arctan\left(\frac{r_{sys}}{f_{z_1}\left(r_{sys}\right)}\right) \;\le\; \alpha_{cam} \;\le\; 2\arctan\left(\frac{r_{cam}}{f_{z_2}\left(r_{cam}\right)}\right)$$
where the respective functions f z i are defined in Equation (13).
Our specific viewing requirements when mounting the omnidirectional sensor along the central axis of the quadrotor ensure that objects located 15 cm under the rig's base and 1 m away from the central axis can be viewed. Thus, angles $\theta_{1,min}$ and $\theta_{2,min}$ need only be large enough to avoid occlusions from the MAV's propellers (Figure 5) and to produce inner and outer ring images at a useful ratio (Figure 7).

3.4. Spatial Resolution

The resolution of the images acquired by our system is not space-invariant. In fact, an omnidirectional camera producing spatially resolution-invariant images can only be obtained through a non-analytical function of the mirror profile, as shown in [31]. In this section, we study the effect our design has on its spatial resolution, which depends on position parameters like $d$ and $c_i$ introduced in Section 2.1 as well as directly on the characteristics (e.g., focal length $f$) of the camera obtaining the image.
Let $\eta_{cam}$ be the spatial resolution of a conventional perspective camera as defined by Baker and Nayar in [25,29]. It measures the ratio between the infinitesimal element of image area $dA_{pix}$ and the infinitesimal solid angle $d\omega_i$ (usually measured in steradians) that it subtends, where $d\omega_i$ is directed toward a point $P_i$ at an angle $\theta_{i,pix}$ formed with the optical axis $Z_C$ (as shown in Figure 8). Accordingly, we have:
$$\eta_{cam} = \frac{dA_{pix}}{d\omega_i} = \frac{f^2}{\cos^3\theta_{i,pix}}$$
whose value tends to decrease as $\theta_{pix} \to 0$, so higher resolution areas on the sensor plane continuously increase the farther away they get from the optical center imaged at $^{I}\mathbf{m}_c$. For ease of visualization, we plot only the $u$ pixel coordinates corresponding to the 2D spatial resolution $\eta_{2D}$, which is obtained by projecting the solid angle $\Omega$ onto a planar angle $\theta_\Omega$ (the apex angle in 2D of the solid cone of view). This yields $\theta_\Omega = 2\arccos\left(1 - \Omega/(2\pi)\right)$, and we reduce the image area into its circular diameter with $2\sqrt{A/\pi}$. Generally, our conversion from the 3D spatial resolution $\eta$ in $\mathrm{m}^2/\mathrm{sr}$ units to 2D proceeds as follows:
$$\eta_{2D} = \left.\frac{2\sqrt{\eta/\pi}}{\theta_\Omega}\right|_{\Omega = 1\ \mathrm{sr}}$$
where $\theta_\Omega\big|_{\Omega = 1\ \mathrm{sr}} \approx 1.14390752211\ \mathrm{rad}$. More specifically, Equation (59) is manipulated to provide $\eta_{i,cam}$ as the indicator of spatial resolution toward any specific point on the mirror, $^{C}P_i \in \mathcal{M}_i$ according to Equation (8), as follows:
$$\eta_{i,cam} = \begin{cases} f^2\,\dfrac{\left(r_1^2 + z_1^2\right)^{3/2}}{z_1^{\,3}} & \text{if } i = 1\\[2.5ex] f^2\,\dfrac{\left(r_2^2 + \left(d - z_2\right)^2\right)^{3/2}}{\left(d - z_2\right)^{3}} & \text{if } i = 2\end{cases}$$
where $r_i$ is the radial length defined in Equation (6) with its associated $z_i$ coordinate, $f$ is the camera's focal length, and the design parameters $d$ and $c_i$ relate to the position of the mirror focal points $F_i$ with respect to the camera frame $C$.
Thus, for a conventional perspective camera, $\eta_{i,cam}$ grows as $\theta_{i,pix} \to \pi/2$ due to the foreshortening effect that stretches the image representation around the sensor plane's periphery, where spatial information gets collected onto a larger number of pixels. Therefore, image areas farther from the optical axis are considered to have higher spatial resolutions.
Baker and Nayar also defined the resolution, η i , of a catadioptric sensor in order to quantify the view of the world or d ν i , an infinitesimal element of the solid angle subtended by the mirror’s effective viewpoint F i , which is consequently imaged onto a pixel area d A pix . Again, here we provide the resolution according to our model:
$$\eta_1 = \frac{dA_{pix}}{d\nu_1} = \frac{r_1^2 + \left(c_1 - z_1\right)^2}{r_1^2 + z_1^2}\;\eta_{1,cam}$$
$$\eta_2 = \frac{dA_{pix}}{d\nu_2} = \frac{r_2^2 + \left(c_2 - d + z_2\right)^2}{r_2^2 + \left(d - z_2\right)^2}\;\eta_{2,cam}$$
for our mirror-perspective camera configuration, where O C is the origin of coordinates as shown in Figure 8 and η i , c a m is given in Equation (61).
As demonstrated by the plot of Figure 12 in Section 4.2.2, η i grows accordingly towards the periphery of each mirror (the equatorial region). This aspect of our sensor design is very important because it indicates that the common field of view, α S R O I , where stereo vision is employed (Section 5), is imaged at a relatively higher resolution than the unused polar regions closer to the optical axis (the Z C axis).
If we modify η i by substituting r i with its equivalent f r i ( z i ) function defined in Equation (14), using mirror 1 for example, we get:
$$\eta_1 = \frac{f_{r_1}^2\left(z_1\right) + \left(c_1 - z_1\right)^2}{f_{r_1}^2\left(z_1\right) + z_1^2}\;\eta_{1,cam} = f^2\,\frac{\sqrt{f_{r_1}^2\left(z_1\right) + z_1^2}\;\left(f_{r_1}^2\left(z_1\right) + \left(c_1 - z_1\right)^2\right)}{z_1^{\,3}}$$
which indicates how the resolution $\eta_i$ for a reflection point $P_i$ increases with $k_i$ (Figure 11). Conversely, the smaller the $k_i$ parameter gets (related to eccentricity as discussed in Section 2.2), the flatter the mirror becomes, so its resolution resembles more that of the perspective camera alone. Mathematically, $\lim_{k_i \to 2} \eta_i = \eta_{i,cam}$.
As shown in Figure 9, a smaller k i would require a wider radius r s y s in order to achieve the same omnidirectional vertical field of view, α s y s . Even worse, in order to image such a wider reflector, either the camera’s field of view, α c a m , would have to increase (by decreasing the focal length f and perhaps requiring a larger camera hole r c a m and sensor size), or the distance c i between the effective pinhole and the viewpoint would have to increase accordingly. Another consequence is the effect on the baseline b, which must change in order to maintain the same vertical field of view (Figure 10). As a result, the depth resolution of the stereo system would suffer as well.
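For a quick numerical look at Equations (61) and (62), the sketch below samples η1 along mirror 1's profile; it reuses the hypothetical f_z helper from the Section 2.2 sketch, and every numeric value here is a placeholder rather than a design value from Table 1.

```python
import numpy as np

def eta_cam_mirror1(r1: float, z1: float, f: float) -> float:
    """Perspective-camera resolution toward a point on mirror 1 (Equation (61))."""
    return f * f * (r1 * r1 + z1 * z1) ** 1.5 / z1 ** 3

def eta_mirror1(r1: float, z1: float, c1: float, f: float) -> float:
    """Catadioptric resolution eta_1 of Equation (62)."""
    ratio = (r1 * r1 + (c1 - z1) ** 2) / (r1 * r1 + z1 * z1)
    return ratio * eta_cam_mirror1(r1, z1, f)

# Sample the profile between two made-up radial bounds (z0_1 = c1/2 as in Equation (9)).
c1, k1, f = 100.0, 5.7, 800.0
z0_1 = c1 / 2.0
for r in np.linspace(20.0, 37.0, 5):
    z = f_z(r, c1, k1, z0_1, mirror=1, r_min=20.0, r_max=37.0)
    print(f"r = {r:5.1f} mm  ->  eta_1 = {eta_mirror1(r, z, c1, f):12.1f}")
```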

4. Parameter Optimization and Prototyping

The nonlinear nature of this system makes it very difficult to balance its desirable performance aspects. The optimal vector of design parameters, $\boldsymbol{\theta}^*$, can be found by posing a constrained maximization problem for the objective function
$$f_b\left(\boldsymbol{\theta}\right) = c_1 + c_2 - d$$
which measures the baseline according to Equation (3). Indeed, the optimization problem is subject to the set of constraints C, which we enumerate in Section 4.1. Formally,
$$\boldsymbol{\theta}^* = \arg\max_{\boldsymbol{\theta} \in \Theta} f_b\left(\boldsymbol{\theta}\right) \quad\text{subject to}\quad C$$
where $\Theta \subset \mathbb{R}^6$ is the 6-dimensional solution space for $\boldsymbol{\theta} \in \mathbb{R}^6$ given in Equation (4) as $\boldsymbol{\theta} = \left[c_1, c_2, k_1, k_2, d, r_{sys}\right]^T$.

4.1. Optimization Constraints

We discuss the constraints that the proposed omnistereo sensor is subject to. Overall, we mainly take the following into account:
(a)
geometrical constraints — including SVP and reflex constraints described by Equations (11), (12) and (2);
(b)
physical constraints — the rig’s dimensions, which include the mirrors radii as well as by-product parameters such as system height h s y s and mass m s y s ;
(c)
performance constraints — the spatial resolution and range from triangulation determined by parameters k 1 , k 2 , and c 1 ; the desired viewing angles for an optimal SROI field of view, α S R O I .
Following the design model described throughout Section 2, we now list the pertinent linear and nonlinear constraints that compose the set C. We split C into a subset of linear constraints, $C_L$, and a subset of non-linear constraints, $C_{NL}$, so $C = C_L \cup C_{NL}$. Within each subset, we generalize equality constraints as functions $h: \mathbb{R}^6 \to \mathbb{R}$ that obey
$$h\left(\boldsymbol{\theta}\right) = 0$$
whereas inequality functions $g: \mathbb{R}^6 \to \mathbb{R}$ satisfy
$$g\left(\boldsymbol{\theta}\right) \le 0$$

4.1.1. Linear Constraints

We have only set up linear inequalities for the constraints in $C_L$. Specifically, we require the following:
g1:
In order to set the position of F 2 below the origin O C of the pinhole camera frame C , the focal distance c 2 of mirror 2 must be larger than d (distance between O C and F 2 v ),
$$d \le c_2$$
g2:
Because the hyperboloidal mirror should reflect light towards its effective viewpoint F 1 without being occluded by the reflex mirror, mirror 1’s focal distance, c 1 , needs to exceed the placement of the reflex mirror,
$$d/2 \le c_1$$
g3:
The empirical constraint
$$\frac{5}{3} \le \frac{k_2}{k_1}$$
pertains to our rig dimensions in order to assign a greater curvature to mirror 2's profile (located at the bottom), so its view is directed toward the equatorial region rather than up. Complementarily, this constraint flattens mirror 1's profile, so it can possess a greater view of the ground. This curvature inequality allows the SROI to be bounded by a wider vertical field of view when the sensor must be mounted above the MAV's propellers, as depicted in Figure 5.

4.1.2. Non-Linear Constraints

For the non-linear design constraints, we establish the following inequalities:
g4:
The AscTec Pelican quadrotor has a maximum payload of 650 g (according to the manufacturer specifications [28]). Therefore, we must bound the system mass computed via Equation (16), such that
$$m_{sys} \le 650\ \mathrm{g}$$
g5:
Similarly, we limit the system’s height obtained with Equation (15) by a height limit h s y s , m a x ,
$$h_{sys} \le h_{sys,max}$$
For example, we set $h_{sys,max} = 150\ \mathrm{mm}$ for the 37 mm-radius rig.
g6:
The origin of coordinates for the camera frame is set at its viewpoint, O C . In order to fit the camera enclosure under mirror 2, it is realistic to position the focus F 2 on the vertical transverse axis at more than 5 mm away from O C :
$$5 \le z_{0_2} - a_2$$
where z 0 2 is defined in Equation (10), and a 2 pertains to Equation (5).
Next, we determine the bounds for the limiting angles that partake in the computation of the system's vertical field of view, $\alpha_{sys}$, which is based on Equation (54). Our application has specific viewing requirements that can be achieved with the following conditions:
g7:
Let $\Lambda_{1,max} = 14°$ be an acceptable upper bound for angle $\theta_{1,max}$, such that
$$\theta_{1,max} \le \Lambda_{1,max}$$
g8:
Because we desire a larger view towards the ground from mirror 1, we empirically set $\Lambda_{1,min} = -25°$ as a lower bound for the minimum elevation $\theta_{1,min}$:
$$\Lambda_{1,min} \le \theta_{1,min}$$
g9:
In order to avoid occlusions with the MAV's propellers while still being able to image objects located about 5 cm under the rig's base and 20 cm away (horizontally) from the central axis, we limit mirror 2's lowest angle by a lower bound $\Lambda_{2,min} = -14°$:
$$\Lambda_{2,min} \le \theta_{2,min}$$
Finally, we restrict the radius of the system, $r_{sys}$, to be identical for both hyperboloids by satisfying the following equality condition:
h1:
With functions f r 1 and f r 2 defined in Equation (14), we set
$$r_{sys} = r_{i,max} = f_{r_i}\left(z_{i,max}\right), \quad \forall\, i \in \{1, 2\}$$
where we imply that $z_{i,max} \leftarrow f_{z_i}\left(r_{sys}\right)$ using Equation (13). Thus, the entire function composition for this equality becomes
$$f_{r_1}\left(f_{z_1}\left(r_{sys}\right)\right) = f_{r_2}\left(f_{z_2}\left(r_{sys}\right)\right)$$
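For illustration, the constrained maximization of Equation (65) maps naturally onto a generic nonlinear solver. The sketch below uses SciPy's SLSQP with only the linear constraints g1–g3 spelled out and loose, made-up bounds; it is a schematic of the procedure, not the optimization setup actually used by the authors.

```python
import numpy as np
from scipy.optimize import minimize

def neg_baseline(theta):
    """Negative of the objective f_b(theta) = c1 + c2 - d (Equation (64))."""
    c1, c2, k1, k2, d, r_sys = theta
    return -(c1 + c2 - d)

constraints = [
    # g1: d <= c2        ->  c2 - d >= 0
    {"type": "ineq", "fun": lambda t: t[1] - t[4]},
    # g2: d/2 <= c1      ->  c1 - d/2 >= 0
    {"type": "ineq", "fun": lambda t: t[0] - t[4] / 2.0},
    # g3: 5/3 <= k2/k1   ->  3*k2 - 5*k1 >= 0
    {"type": "ineq", "fun": lambda t: 3.0 * t[3] - 5.0 * t[2]},
    # The nonlinear mass, height and field-of-view constraints (g4-g9, h1)
    # would be added here in the same dictionary form.
]

# theta = [c1, c2, k1, k2, d, r_sys]; bounds and the initial guess are arbitrary.
bounds = [(10, 200), (10, 200), (2.1, 20), (2.1, 20), (5, 150), (20, 40)]
theta0 = np.array([60.0, 80.0, 5.0, 9.0, 40.0, 37.0])
result = minimize(neg_baseline, theta0, method="SLSQP",
                  bounds=bounds, constraints=constraints)
print(result.x, -result.fun)  # optimized parameters and the achieved baseline
```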

4.2. Optimal Results

Applying the aforementioned constraints (Section 4.1) and using an iterative nonlinear optimization method, such as one of those surveyed in [32], a bounded solution vector $\boldsymbol{\theta}^*$ converges to the values shown in Table 1 for two rig sizes. Table 2 contains the by-product parameters corresponding to the dimensions listed in Table 1.
As Figure 3 illustrates, a realistic dimension for the radius of the camera hole, $r_{cam}$, must consider the maximum between the radius of a physical micro-lens ($r_{lens}$) and the radius $r_{\alpha_{cam}}$ required for an unoccluded camera field of view $\alpha_{cam}$ imaging the complete surface of mirror 1 out to $r_{sys}$. Practically,
$$r_{cam} = \max\left(r_{lens},\; r_{\alpha_{cam}}\right)$$
For both rigs, the expected vertical fields of view are $\alpha_{sys} = 75° - (-21°) = 96°$ according to Equation (54), and $\alpha_{SROI} = 14° - (-14°) = 28°$ using Equation (55). Note that $\theta_{2,max}$ may actually be limited by the camera hole radius, so in reality $\theta_{cam} \approx 59°$ and $\alpha_{sys} \approx 80°$. For the big rig, Table 3 shows the nearest vertices of the SROI that result from these angles (Figure 6).

4.2.1. Optimality of Parameters k 1 and k 2

Finally, we study the effect parameter $k_i$ has on the system radius $r_{sys}$ (Figure 9), the omnistereo baseline $b$ (Figure 10), and the spatial resolution (Figure 11 and Figure 12). Figure 9 addresses the relation between $k_i$ and radius $r_{sys}$ (recall the rig size specified in Section 2.3). In Figure 11, it can be seen that, for the same $r_{sys}$, realistic values for $k_1$ fall in the range $3 < k_1 < 13$, and that the vertical field of view $\alpha_1 \to 0$ as $k_1 \to 2$, which is expected according to the SVP property specified in Section 2.2. In fact, the left part of Figure 11 also demonstrates the necessary $r_{sys}$ to maintain $\alpha_{SROI} \approx 28°$ for various values of $k_i$.
Figure 10 shows the inverse relationship between values of $k_1$ and the baseline, $b$, as we attempt to fit the view of a wider/narrower mirror profile (due to $k_1$) within the constant camera field of view, $\alpha_{cam}$. In order to make a fair comparison, let
$$k_1' = k_1 + \varepsilon_k, \quad \forall\, k_1 > 2,\; \varepsilon_k > 0$$
for which we find its new focal distance $c_1'$ while solving for the new $r_{sys}$ and $z_{max}$. Provided with a function such that $c_1' \leftarrow f_{c_1}\left(k_1'\right)$, we perform the analysis for a given $\alpha_{SROI}$ and $\alpha_{cam}$ shown in Figure 10. Given the baseline function $f_b$ defined in Equation (64), the following implication holds true:
$$f_b\left(c_1 \leftarrow f_{c_1}\left(k_1\right)\right) > f_b\left(c_1' \leftarrow f_{c_1}\left(k_1 + \varepsilon_k\right)\right), \quad \forall\, k_1 > 2,\; \varepsilon_k > 0$$
Notice that $k_2$, $c_2$ and $d$ are kept constant throughout this last analysis, and we ignore possible occlusions from the reflex mirror fixed at $d/2$.

4.2.2. Spatial Resolution Optimality

In this section, we compare the sensor’s spatial resolution, η i , defined in Section 3.4 for the optimal parameters listed in Table 1 (for the big rig, only). In Figure 12, we verify how both resolutions η 1 and η 2 increase towards the equatorial region according to the spatial resolution theory presented in [29]. Indeed, the increase in spatial resolution within the SROI that covers the equatorial region (as indicated in Figure 6) justifies our model’s coaxial configuration intended for omnistereo applications.
In Figure 11, we compare the effect on $\eta_i$ of various mirror profiles, which depend directly on $k_i$. We illustrate the change in curvature due to parameters $k_1$ and $k_2$ and also show (in the legend) the respective $r_{sys}$ achieving a common vFOV of $\alpha_{SROI} \approx 28°$, as for the optimal parameters of the big rig. From this plot, we appreciate the compromise that the optimal parameters, $k_1^{(Opt.)} = 5.7$ and $k_2^{(Opt.)} = 9.7$, strike between a realistic system size due to $r_{sys}$ and a suitable range of spatial resolutions, $\eta_i$, within the SROI.

4.3. Prototypes

We validate our design with both synthetic and real-life models.

4.3.1. Synthetic Prototype (Simulation)

After converging to an optimal solution θ * , we employ these parameters (Table 1) to describe synthetic models using POV-Ray, an open-source ray-tracer. We render 3D scenes via the camera of the synthetic omnistereo sensor like the example shown in Figure 2b. The simulation stage plays two important roles in our investigation:
(1)
to acquire ground-truth 3D-scene information in order to evaluate the computed range by the omnistereo system (as explained in Section 5); and
(2)
to provide an almost accurate geometrical representation of the model by discounting some real-life computer vision artifacts such as assembly misalignments, glare from the support tube (motivating the use of standoffs on the real prototype), as well as the camera’s shallow depth-of-field. All of these artifacts can affect the quality of the real-life results shown in Section 6.

4.3.2. Real-Life Prototypes

We have also produced two physical prototypes that can be installed on the Pelican quadrotor (made by Ascending Technologies [28]). Figure 13a shows the rig constructed with hyperboloidal mirrors of $r_{sys} \approx 37\ \mathrm{mm}$ and a Logitech® HD Pro Webcam C910 camera capable of (2592 × 1944) pixel images at 15∼20 FPS. We decided to skip the use of the acrylic glass tube to separate the mirrors at the specified $h_{sys}$ distance, and instead we constructed a lighter 3-standoff mount in order to avoid glare and cross-reflections. This support was designed in 3D-CAD and printed for assembly. The three areas of occlusion due to the 3 mm-wide standoffs are non-invasive for the purpose of omnidirectional sensing and can be ignored with simple masks during image processing. In fact, we stamped fiducial markers on the vertical standoffs to aid with the panorama generation (Section 5.1) and future calibration methods. To image the entire surface of mirror 1, we require a camera with a (minimum) field of view of $\alpha_{cam} > 31°$, which is achieved by $r_{\alpha_{cam}} > 1.4\ \mathrm{mm}$. In practice, as noted by Equation (78), microlenses measure around $r_{lens} \approx 7\ \mathrm{mm}$. Therefore, we set $r_{cam} > 7\ \mathrm{mm}$ as a safe specification to fit a standard microlens through the opening of mirror 2, as shown in Figure 3.
Recall that $m_{sys}$ is limited by the maximum 650 g payload that the AscTec Pelican quadrotor is capable of flying with (according to the manufacturer specifications [28]). The camera with lens weighs approximately 25 g. A cylindrical tube made of acrylic has an average density $\rho_{tub} \approx 1.18\ \mathrm{g\cdot cm^{-3}}$, whereas the mirrors machined out of brass have a density $\rho_{mir} \approx 8.5\ \mathrm{g\cdot cm^{-3}}$. Empirically, we verify a close estimate of the entire system's mass, such that $m_{sys} \approx 550\ \mathrm{g}$ for the big rig, and $m_{sys} \approx 150\ \mathrm{g}$ for the small rig.

5. 3D Sensing from Omnistereo Images

Stereo vision from point correspondences on images at distinct locations is a popular method for obtaining 3D range information via triangulation. Techniques for image point matching are generally divided between dense (area-based scanning [32]) and sparse (feature description [33]) approaches. Due to parallax, the disparity in point positions for objects close to the vision system must be larger than for objects that are farther away. As illustrated in Figure 6, the nearsightedness of the sensor is determined mainly by the common observable space (a.k.a. SROI) acquired by the limiting elevation angles of the mirrors (Section 3.3). In addition, we will see next (Section 5.2) that the baseline b also plays a major role in range computation.
Due to our model's coaxial configuration, we could scan for pixel correspondences radially between a given pair of warped images $(I_1, I_2)$, as in the approach taken by similar works such as [34]. However, it seems more convenient to work in a rectified image space, such as with panoramic images, where the search for correspondences can be performed using any of the various existing methods for perspective stereo views. Hence, we first demonstrate how these rectified panoramic images are produced (Section 5.1) and used for establishing point correspondences. Then, we proceed to study our triangulation method for the range computation from a given set of point correspondences (Section 5.2). Last, we show preliminary 3D point clouds as the outcome of this procedure.

5.1. Panoramic Images

Figure 14 illustrates how we form the respective panoramic image Ξ 1 out of its warped omnidirectional image I 1 . As illustrated in Figure 7, I i is simply the region of interest out of the full image I where projection occurs via mirror i. However, we can safely refer to I because it will never be the case that projections via different mirrors overlap on the same pixel position I m . In a few words, we obtain a panorama Ξ i by reverse-mapping each discretized 3D point P c y l i S c y l i to its projected pixel coordinates I m on I according to Section 3.2.
More thoroughly, for $i \in \{1, 2\}$, $S_{cyl_i}$ is the set of all valid 3D points $P_{cyl_i}$ that lie on an imaginary unit cylinder centered along the Z-axis and positioned with respect to the mirror's primary focus $F_i$. Recall that the radius of a unit cylinder is $r_{cyl} = 1$, so its circumference becomes $w_{cyl} = 2\pi r_{cyl} = 2\pi$. Notice that the imaging ratio, $\chi_{I_{1:2}} = h_{I_1}/h_{I_2}$, illustrated in Figure 7, provides a way of inferring the scale between pairs of point correspondences. However, we achieve conforming scales among both panoramic representations by simply setting both cylinders to an equal height $h_{cyl}$, which is determined from the system's elevation limits, $(\theta_{sys,min}, \theta_{sys,max})$, since they partake in the measurement of the system's vertical field of view given by Equation (54). Hence, we obtain
$$h_{cyl} = z_{cyl,max} - z_{cyl,min}, \quad\text{where}\quad z_{cyl,max} = \tan\left(\theta_{sys,max}\right), \;\; z_{cyl,min} = \tan\left(\theta_{sys,min}\right)$$
Consequently, to achieve panoramic images $\Xi_i$ of the same dimensions while maintaining a true aspect ratio $w_\Xi : h_\Xi$, it suffices to indicate either the width (number of columns) $w_\Xi$ or the height (number of rows) $h_\Xi$ in pixels. Here, we propose a custom method for resolving the panoramic image dimensions by setting the equality for the length $l_{px}$ of an individual "square" pixel on the cylinder (which behaves like a panoramic camera sensor):
$$l_{px} = \frac{w_{cyl}}{w_\Xi} = \frac{h_{cyl}}{h_\Xi}$$
For instance, if the width $w_\Xi$ is given, then the height is simply $h_\Xi = w_\Xi\, h_{cyl} / w_{cyl}$.
To increase the processing speed for each panoramic image $\Xi_i$, we fill its corresponding look-up table $\mathrm{LUT}_{\Xi_i}$ of size $w_\Xi \times h_\Xi$, which encodes the mapping from each panoramic pixel coordinate $^{\Xi_i}\mathbf{m} = {}^{\Xi_i}[u, v]^T$ to its respective projection $^{I_i}\mathbf{m} = {}^{I_i}[u, v]^T$ on the distorted image $I_i$. Each pixel $^{\Xi_i}\mathbf{m}$ gets associated with its cylinder's 3D point positioned at $^{F_i}\mathbf{p}_{cyl_i}$, which can inherently be indicated by its elevation $^{F_i}\theta_i$ and azimuth $^{F_i}\psi_i$ (relative to the mirror's primary focus $F_i$), as illustrated in Figure 4. Thus, the ray $^{F_i}\mathbf{v}_i$ of a particular 3D point directed by $^{F_i}(\psi_i, \theta_i)$ must pass through $P_{cyl_i}$ in order to get imaged as pixel $^{I}\mathbf{m}_i$.
Since the circumference of the cylinder, $w_{cyl}$, is discretized with respect to the number of pixel columns (the width $w_\Xi$), we use the pixel length $l_{px}$ as the factor to obtain the arc length $l_{\psi_i}$ spanned by the azimuth $^{F_i}\psi_i$ from a given $^{\Xi_i}u$ coordinate on the panoramic image. Generally,
$$^{F_i}\psi_i = \frac{l_{\psi_i}}{r_{cyl}} = \frac{w_{cyl} - {}^{\Xi_i}u\; l_{px}}{r_{cyl}}$$
or simply $^{F_i}\psi_i = 2\pi - {}^{\Xi_i}u\; l_{px}$ for the unit cylinder case.
An order reversal in the columns of the panorama is performed by Equation (82) because we account for the relative position between S c y l i and the projection plane π i m g . For Ξ 1 , Figure 14 depicts the unrolling of the cylindrical panoramic image onto a planar panoramic image. However, note that π i m g is shown from above (or its back) in Figure 14, so the panorama visualization places the viewer inside the cylinder at F 1 .
Similarly, the elevation angle $^{F_i}\theta_i$ is inferred from the row coordinate, $^{\Xi_i}v$, which is scaled to its cylindrical representation by $l_{px}$. Recall that both cylinders have the same height, $h_{cyl}$, computed by Equation (80). By taking into account any row offset from the maximum height position, $^{F_i}z_{cyl,max}$, of the cylinder, we get
$$^{F_i}\theta_i = \arctan\left({}^{F_i}z_{cyl,max} - {}^{\Xi_i}v\; l_{px}\right)$$
Given these angles and assuming coaxial alignment, we evaluate the position vector $^{C}\mathbf{p}_{cyl_i}$ for a point on the panoramic cylinder with respect to the camera frame $C$:
$$^{C}\mathbf{p}_{cyl_i} = r_{cyl}\begin{bmatrix} \cos\left({}^{F_i}\psi_i\right)\\ \sin\left({}^{F_i}\psi_i\right)\\ \tan\left({}^{F_i}\theta_i\right) \end{bmatrix} + {}^{C}\mathbf{f}_i$$
where $r_{cyl} = 1$ for the unit cylinder. Equations (82) and (83), followed by Equation (84), define the process $^{\Xi_i}\mathbf{m} \mapsto {}^{F_i}\left(\psi_i, \theta_i\right) \mapsto {}^{C}\mathbf{p}_{cyl_i}$, whose result is eventually used as the input argument to Equation (36) in order to determine pixel $^{I_i}\mathbf{m}$ via the mapping function $h_{\Xi_i}: \mathbb{R}^2 \to \mathbb{R}^2$,
$$^{I_i}\mathbf{m} \leftarrow h_{\Xi_i}\left({}^{\Xi_i}\mathbf{m}\right) := f_{\varphi_i}\left({}^{C}\mathbf{p}_{cyl_i} \leftarrow {}^{\Xi_i}\mathbf{m}\right)$$
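For illustration, the look-up-table construction reduces to two nested loops over the panorama grid, following Equations (80)–(85); in the sketch below, forward_project_i is a stand-in for the projection function f_φi of Equation (36).

```python
import numpy as np

def build_panorama_lut(w_pano: int, theta_min: float, theta_max: float,
                       f_i: np.ndarray, forward_project_i):
    """LUT mapping panorama pixels to omnidirectional-image pixels (one mirror).

    forward_project_i(p) must return the image pixel [u, v] of a 3D point p
    expressed in the camera frame C (a stand-in for f_phi_i, Equation (36)).
    """
    z_max, z_min = np.tan(theta_max), np.tan(theta_min)       # Eq. (80), unit cylinder
    w_cyl = 2.0 * np.pi                                       # circumference, r_cyl = 1
    l_px = w_cyl / w_pano                                     # square-pixel length, Eq. (81)
    h_pano = int(round((z_max - z_min) / l_px))               # rows preserving aspect ratio
    lut = np.zeros((h_pano, w_pano, 2), dtype=np.float32)
    for v in range(h_pano):
        theta = np.arctan(z_max - v * l_px)                   # elevation, Eq. (83)
        for u in range(w_pano):
            psi = w_cyl - u * l_px                            # azimuth with column reversal, Eq. (82)
            p_cyl = np.array([np.cos(psi), np.sin(psi), np.tan(theta)]) + f_i   # Eq. (84)
            lut[v, u] = forward_project_i(p_cyl)              # Eq. (85)
    return lut
```

The panorama itself can then be rendered by sampling the warped image at the stored coordinates (e.g., with cv2.remap).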

Stereo Matching on Panoramas

We understand that the algorithm chosen for finding matches is crucial to attain correct pixel disparity results. We refer the reader to [35] for a detailed survey of stereo correspondence methods. After comparing various block matching algorithms, we were able to obtain acceptable disparity maps with the semi-global block matching (SGBM) method introduced by [36], which can find subpixel matches in real time. As a result of this stereo block matcher among the pair of panoramic images Ξ 1 , Ξ 2 , we get the dense disparity map Ξ Δ m 12 visualized as an image in Figure 15 and Figure 21a. Note that valid disparity values must be positive ( Δ m 12 Ξ i m 1 > 0 ) and they are given with respect to the reference image, in this case, Ξ 1 . In addition, recall that no stereo matching algorithm (as far as we are aware) is totally immune to mismatches due to several well-known reasons in the literature such as ambiguity of cyclic patterns.
An advantage of the block (window) search for correspondences is that it can be confined to epipolar lines. Unlike the traditional horizontal stereo configuration, our system captures panoramic images whose views differ vertically. As shown in [14], the unwrapped panoramas contain vertical, parallel epipolar lines that facilitate the pixel correlation search. Thus, given a pixel position Ξ 1 m 1 on the reference panorama Ξ 1 and its disparity value Δ m 12 ( Ξ 1 m 1 ) , we can resolve the corresponding pixel coordinate Ξ 2 m 2 on the target image Ξ 2 by simply offsetting the v-coordinate with the disparity value:
\[
{}^{\Xi_2}\mathbf{m}_2 =
\begin{bmatrix}
u_1 \\
v_1 + \Delta m_{12}\!\left({}^{\Xi_1}\mathbf{m}_1\right)
\end{bmatrix}
\tag{86}
\]
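As a sketch of how this search can be carried out in practice with an off-the-shelf matcher (not necessarily the configuration that produced Figure 15), note that OpenCV's SGBM implementation searches along image rows, whereas our epipolar lines run along columns; transposing the panoramas before matching is a simple workaround. The numeric parameters below are illustrative, and depending on which panorama acts as the reference view, the two inputs (or the sign of minDisparity) may need to be swapped so that valid disparities come out positive.

```python
import cv2
import numpy as np

def panorama_disparity(pan1, pan2, num_disp=64, block_size=7):
    """Dense disparity map between two 8-bit grayscale panoramas Xi_1 and Xi_2.

    StereoSGBM matches along rows, so both panoramas are transposed to turn the
    vertical epipolar lines into horizontal ones, and the result is transposed back.
    """
    sgbm = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=num_disp,        # must be a multiple of 16
        blockSize=block_size,
        P1=8 * block_size ** 2,         # smoothness penalties (single-channel input)
        P2=32 * block_size ** 2,
        uniquenessRatio=10,
        speckleWindowSize=100,
        speckleRange=2)
    disp = sgbm.compute(cv2.transpose(pan1), cv2.transpose(pan2))
    return cv2.transpose(disp).astype(np.float32) / 16.0   # fixed-point -> subpixel values

def correspondence(m1, disparity_map):
    """Equation (86): offset the v-coordinate of the reference pixel by its disparity."""
    u1, v1 = m1
    return (u1, v1 + disparity_map[v1, u1])
```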

5.2. Range from Triangulation

Recall the duality by which a point P w can be defined as the intersection of a pair of lines. Regardless of the correspondence search technique employed, such as block stereo matching between the panoramas Ξ i (Section 5.1.1) or feature detection directly on I , we can resolve the correspondence I ( m 1 , m 2 ) . From Equations (42) and (49), we obtain the respective pair of back-projected rays F 1 v 1 , F 2 v 2 emanating from their respective physical viewpoints, F 1 and F 2 , which are separated by the baseline b. We can compute the elevation angles θ 1 and θ 2 using Equations (43) and (50). Then, we triangulate the back-projected rays in order to calculate the horizontal range ρ w defined in Equation (22):
\[
\rho_w = \frac{b \, \cos\theta_1 \cos\theta_2}{\sin\left(\theta_1 - \theta_2\right)}
\tag{87}
\]
Finally, we obtain the 3D position of P w :
\[
{}^{C}\mathbf{p}_w =
\begin{bmatrix}
\rho_w \cos\psi_{12} \\
\rho_w \sin\psi_{12} \\
c_1 + \rho_w \tan\theta_1
\end{bmatrix}
\tag{88}
\]
where ψ 12 is the common azimuthal angle (on the XY-plane) of the coplanar rays, which can be determined from either Equation (44) or Equation (51). Functionally, we define the “naive” intersection function that implements Equations (87) and (88) such that
\[
{}^{C}\mathbf{p}_w \leftarrow f_{\Delta}\!\left(\left(\theta_1, \psi_1\right), \left(\theta_2, \psi_2\right), \boldsymbol{\theta}\right)
\tag{89}
\]
where θ is the vector of model parameters defined in Equation (4); it can be omitted when calling this function because the model parameters should not change (ideally).
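A direct transcription of Equations (87)–(89) reads as follows. It is a sketch under the ideal coplanar-ray assumption: the baseline b and the focus height c 1 are taken from the model parameters (passed explicitly here instead of through the parameter vector θ), and the common azimuth is simply averaged from the two measured azimuths.

```python
import numpy as np

def triangulate_naive(theta1, psi1, theta2, psi2, b, c1):
    """Naive omnistereo triangulation f_Delta of Equation (89).

    theta1, theta2 : elevations of the back-projected rays from F_1 and F_2 [rad]
    psi1, psi2     : azimuths of the two rays [rad] (identical for truly coplanar rays)
    b              : baseline between the mirror foci F_1 and F_2
    c1             : height of focus F_1 along the z-axis of the camera frame [C]
    """
    # Equation (87): horizontal range from the two elevation angles
    rho_w = b * np.cos(theta1) * np.cos(theta2) / np.sin(theta1 - theta2)

    # Common azimuth; averaging is a simple stand-in for Equation (44) or (51)
    psi_12 = 0.5 * (psi1 + psi2)

    # Equation (88): 3D position of P_w expressed in the camera frame [C]
    return np.array([rho_w * np.cos(psi_12),
                     rho_w * np.sin(psi_12),
                     c1 + rho_w * np.tan(theta1)])
```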

5.2.1. Common Perpendicular Midpoint Triangulation Method

Because the coplanarity of these rays cannot be guaranteed (the skew-rays case), a better triangulation approximation that accounts for coaxial misalignment is to find the midpoint of their common perpendicular line segment (as attempted in [23]). As illustrated in Figure 16, we define the common perpendicular line segment G 1 G 2 ¯ as the parametrized vector v 1 ⊥ 2 = λ 1 ⊥ 2 v ^ 1 ⊥ 2 , where v ^ 1 ⊥ 2 is the unit vector normal to the back-projected rays v 1 and v 2 :
\[
\hat{\mathbf{v}}_{1\perp 2} = \frac{\mathbf{v}_1 \times \mathbf{v}_2}{\left\lVert \mathbf{v}_1 \times \mathbf{v}_2 \right\rVert}
\tag{90}
\]
If the rays are not parallel ( || v 1 × v 2 || ≠ 0 ), we can compute the “exact” solution, λ = [ λ G 1 , λ G 2 , λ 1 ⊥ 2 ] T , of the well-determined linear matrix equation
\[
\mathbf{V} \boldsymbol{\lambda} = \mathbf{b}, \quad \text{where} \quad
\mathbf{V} = \left[\mathbf{v}_1, \; -\mathbf{v}_2, \; \hat{\mathbf{v}}_{1\perp 2}\right]
\quad \text{and} \quad
\mathbf{b} = {}^{C}\mathbf{f}_2 - {}^{C}\mathbf{f}_1
\tag{91}
\]
It follows that the location of the midpoint P w G on the common perpendicular v 1 ⊥ 2 with respect to the common frame C is
\[
{}^{C}\mathbf{p}_{wG} = {}^{C}\mathbf{f}_1 + \lambda_{G_1} \, {}^{F_1}\mathbf{v}_1 + \tfrac{1}{2} \lambda_{1\perp 2} \, \hat{\mathbf{v}}_{1\perp 2}
\tag{92}
\]
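A minimal sketch of the midpoint method of Equations (90)–(92), solving the 3 × 3 system of Equation (91) with NumPy; the foci positions and ray directions are assumed to be already expressed in the common frame C.

```python
import numpy as np

def triangulate_midpoint(f1, v1, f2, v2):
    """Common-perpendicular midpoint triangulation (Equations (90)-(92)).

    f1, f2 : 3-vectors, positions of the foci F_1 and F_2 in the common frame [C]
    v1, v2 : 3-vectors, back-projected ray directions emanating from F_1 and F_2
    Returns the midpoint P_wG of the common perpendicular segment G1-G2.
    """
    f1, v1, f2, v2 = (np.asarray(x, dtype=float) for x in (f1, v1, f2, v2))
    cross = np.cross(v1, v2)
    norm = np.linalg.norm(cross)
    if norm < 1e-12:
        raise ValueError("Back-projected rays are (nearly) parallel; no unique midpoint.")

    v_perp = cross / norm                      # Equation (90): unit common perpendicular
    V = np.column_stack((v1, -v2, v_perp))     # Equation (91): V * lambda = f2 - f1
    lam = np.linalg.solve(V, f2 - f1)          # lam = [lambda_G1, lambda_G2, lambda_perp]

    # Equation (92): walk to G1 along ray 1, then half-way along the common perpendicular
    return f1 + lam[0] * v1 + 0.5 * lam[2] * v_perp
```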

5.2.2. Range Variation

Before we introduce an uncertainty model for triangulation (Section 5.3), we briefly analyze how range varies over the possible combinations of pixel correspondences I ( m 1 , m 2 ) on the image I . Here, we demonstrate how a radial variation of the discretized pixel disparity Δ m 12 affects the 3D position of a point obtained from triangulation (Section 5.2). Figure 17 shows the nonlinear behavior of the variation in horizontal range, Δ ρ w , arising from the discrete relation between pixel positions I m i and their respective back-projected (direction) rays obtained from f β i and triangulated via the function f Δ defined in Equation (89). It can be observed that the horizontal range variation Δ ρ w grows quadratically as Δ m 12 → 1 px , the minimum discrete pixel disparity, which yields a maximum horizontal range of ρ w , m a x ≈ 18.28 m (computed analytically). The main plot of Figure 17 covers the small disparity values in the interval Δ m 12 = [ 1 , 20 ] px , whereas the subplot zooms in on the large-disparity cases in the interval Δ m 12 = [ 20 , 100 ] px .
This analysis indicates that triangulation error (e.g., due to false pixel correspondences) may have a severe effect on range accuracy, and this effect increases quadratically with distance, as evidenced by the ≈ 8 m range variation over the disparity interval Δ m 12 = [ 1 , 2 ] px . Also, observe the reconstructed point cloud in Figure 20, where this range-sensing characteristic is most noticeable for faraway points. In fact, the following uncertainty model provides a probabilistic framework for the triangulation error (uncertainty) that agrees with these numerical observations.
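To make the inverse, nonlinear disparity-to-range behavior tangible, the toy sweep below pushes integer disparities through Equation (87). The baseline mimics the big rig, but the fixed elevation θ 1 and the per-pixel elevation step are invented placeholders, so the printed ranges are purely illustrative and are not the values behind Figure 17.

```python
import numpy as np

def horizontal_range(theta1, theta2, b):
    """Equation (87): horizontal range from the two ray elevations."""
    return b * np.cos(theta1) * np.cos(theta2) / np.sin(theta1 - theta2)

# Hypothetical numbers, for illustration only:
b = 0.1316                        # baseline comparable to the big rig [m]
theta1 = np.radians(5.0)          # assumed elevation of the ray from F_1
dtheta_px = np.radians(0.2)       # assumed elevation change per pixel of disparity

disparities = np.arange(1, 11)    # Delta m_12 in pixels
theta2 = theta1 - disparities * dtheta_px
rho = horizontal_range(theta1, theta2, b)

# The range drop between consecutive disparities shrinks rapidly as disparity grows,
# which is the "nonlinear & inverse" relation discussed above.
for d, r, drop in zip(disparities[:-1], rho[:-1], -np.diff(rho)):
    print(f"disparity {d:2d} px -> rho_w = {r:6.2f} m, drop to next pixel = {drop:5.2f} m")
```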

5.3. Triangulation Uncertainty Model

Let f P w be the vector-valued function that computes the 3D coordinates of point P w G with respect to C as the common perpendicular midpoint defined in Equation (92). We express this triangulation function component-wise as follows:
\[
{}^{C}\mathbf{p}_{wG} \leftarrow f_{P_w}\!\left(\mathbf{m}_{12}\right) :=
\begin{bmatrix}
f_{x_w}\!\left(\mathbf{m}_{12}\right) \\
f_{y_w}\!\left(\mathbf{m}_{12}\right) \\
f_{z_w}\!\left(\mathbf{m}_{12}\right)
\end{bmatrix}
\tag{93}
\]
where m 12 = [ u 1 , v 1 , u 2 , v 2 ] T is composed of the pixel coordinates of the correspondence I ( m 1 , m 2 ) upon which the triangulation is based (Section 5.2).
Without loss of generality, we assume a multivariate Gaussian uncertainty for the triangulation, so that the position vector C p w G of any world point is centered at its mean C μ f P w with a 3 × 3 covariance matrix Σ f P w :
\[
{}^{C}\boldsymbol{\mu}_{f_{P_w}} =
\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix}, \qquad
\Sigma_{f_{P_w}} =
\begin{bmatrix}
\sigma_{f_{x_w}}^2 & \sigma_{f_{x_w}}\sigma_{f_{y_w}} & \sigma_{f_{x_w}}\sigma_{f_{z_w}} \\
\sigma_{f_{x_w}}\sigma_{f_{y_w}} & \sigma_{f_{y_w}}^2 & \sigma_{f_{y_w}}\sigma_{f_{z_w}} \\
\sigma_{f_{x_w}}\sigma_{f_{z_w}} & \sigma_{f_{y_w}}\sigma_{f_{z_w}} & \sigma_{f_{z_w}}^2
\end{bmatrix}
\tag{94}
\]
However, since f P w is a non-linear vector-valued function, we linearize it with a first-order Taylor expansion and use its Jacobian matrix to propagate the uncertainty (covariance) as in the linear case:
\[
\Sigma_{f_{P_w}} = \mathbf{J}_{f_{P_w}} \, \Omega_{\mathbf{m}_{12}} \, \mathbf{J}_{f_{P_w}}^{T}
\tag{95}
\]
where the 3 × 4 Jacobian matrix for the triangulation function is
\[
\mathbf{J}_{f_{P_w}} =
\begin{bmatrix}
\dfrac{\partial f_{x_w}}{\partial u_1} & \dfrac{\partial f_{x_w}}{\partial v_1} & \dfrac{\partial f_{x_w}}{\partial u_2} & \dfrac{\partial f_{x_w}}{\partial v_2} \\[2ex]
\dfrac{\partial f_{y_w}}{\partial u_1} & \dfrac{\partial f_{y_w}}{\partial v_1} & \dfrac{\partial f_{y_w}}{\partial u_2} & \dfrac{\partial f_{y_w}}{\partial v_2} \\[2ex]
\dfrac{\partial f_{z_w}}{\partial u_1} & \dfrac{\partial f_{z_w}}{\partial v_1} & \dfrac{\partial f_{z_w}}{\partial u_2} & \dfrac{\partial f_{z_w}}{\partial v_2}
\end{bmatrix}
\tag{96}
\]
and the 4 × 4 covariance matrix of the pixel arguments being
\[
\Omega_{\mathbf{m}_{12}} = \sigma_{px}^2 \, \mathbf{I}_4
\tag{97}
\]
where we assume σ p x = 1 px as the standard deviation of each pixel coordinate in the discretized pixel space. The complete symbolic solution of Σ f P w is too involved to reproduce in this manuscript. However, Figure 18 shows the top view of the covariance ellipsoid drawn at the three- σ f P w level for a point triangulated at approximately ρ w ≈ 100 mm . Figure 19 visualizes uncertainty ellipsoids drawn at the one- σ f P w level for several triangulation ranges. We refer the reader to the end of Section 6.3, where we validate through experimental results with subpixel precision that this 1 pixel deviation assumption is conservative.
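In practice, the Jacobian of Equation (96) does not need to be derived symbolically; a forward-difference approximation is usually adequate. The sketch below propagates the pixel covariance of Equation (97) through an arbitrary triangulation callable, assumed to take the 4-vector m 12 = [ u 1 , v 1 , u 2 , v 2 ] and return the 3D point, as in Equation (93).

```python
import numpy as np

def triangulation_covariance(f_pw, m12, sigma_px=1.0, eps=1e-3):
    """Propagate pixel uncertainty through a triangulation function.

    f_pw     : callable implementing Equation (93); maps [u1, v1, u2, v2] to a 3D point
    m12      : 4-vector with the pixel correspondence coordinates
    sigma_px : standard deviation of each pixel coordinate (Equation (97))
    Returns the 3x3 covariance matrix Sigma_fPw of Equation (95).
    """
    m12 = np.asarray(m12, dtype=float)
    p0 = np.asarray(f_pw(m12), dtype=float)

    # Numerical 3x4 Jacobian (Equation (96)) via forward differences
    J = np.zeros((3, 4))
    for k in range(4):
        m_pert = m12.copy()
        m_pert[k] += eps
        J[:, k] = (np.asarray(f_pw(m_pert), dtype=float) - p0) / eps

    omega = sigma_px ** 2 * np.eye(4)   # Equation (97): isotropic pixel covariance
    return J @ omega @ J.T              # Equation (95): first-order propagation
```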

6. Experimental Results

In this section, we demonstrate the capabilities of the omnistereo sensor to provide 3D information either as dense point clouds or as registrations of sparse 2D features with 3D points. We also evaluate the precision of both projection and triangulation for a set of detected corners on a chessboard whose various 3D poses are given as ground truth.

6.1. Dense 3D Point Clouds

By implementing the process described in Section 5, we begin by visualizing the dense point cloud obtained from the synthetic omnidirectional image given in Figure 2b, whose actual size is 1280 × 960 pixels. The associated panoramic images Ξ i were obtained using the function h Ξ i defined in Equation (85) and are shown in Figure 15. Pixel correspondences ( Ξ 1 m 1 , Ξ 2 m 2 ) on the panoramic representations are mapped via h Ξ i into their respective image positions I ( m 1 , m 2 ) . Then, these are triangulated with f P w given in Equation (93), resulting in the set (cloud) of colored 3D points P Δ visualized in Figure 20. Here, the synthetic scene (Figure 2a) is a room 5.0 m wide (along its X -axis), 8.0 m long (along its Y -axis), and 2.5 m high (along its Z -axis). With respect to the scene center of coordinates, S , the catadioptric omnistereo sensor, C , is positioned at C S t = [ 1.60 , 2.85 , 0.16 ] T in meters.
We also present results from a real experiment using the prototype described in Section 4.3.2 and shown in Figure 13a. The panoramic images and dense point cloud shown in Figure 21 are obtained by implementing the pertinent functions described throughout this manuscript and by holding the SVP assumption of an ideal configuration. We provide these qualitative results as preliminary proof of concept for the proposed sensor after employing a calibration procedure based on the generalized unified model proposed in [37].

6.2. Sparse 3D Points from Features

Using the SURF feature detector and descriptor [38], Figure 22 shows 44 correct matches that are triangulated with Equation (93). Sparse 3D points can be useful for visual odometry applications, where the sensor changes pose and the registered point features can be matched against new images. Please refer to [39] for a tutorial on visual odometry.
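For completeness, a generic detection-and-matching sketch in the spirit of this experiment is shown below; it is not the exact pipeline that produced the 44 matches of Figure 22. SURF lives in the opencv-contrib package (cv2.xfeatures2d) and is patent-encumbered, so a free detector such as ORB can be substituted with minimal changes; the matched pixel pairs would then be mapped back to I and triangulated as before.

```python
import cv2
import numpy as np

def sparse_matches(pan1, pan2, hessian_threshold=400, ratio=0.75):
    """Detect and match SURF features between the two panoramas (8-bit images).

    Returns two (N, 2) arrays of matched pixel coordinates in pan1 and pan2.
    """
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    kp1, des1 = surf.detectAndCompute(pan1, None)
    kp2, des2 = surf.detectAndCompute(pan2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des1, des2, k=2)

    # Lowe's ratio test keeps only distinctive matches
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
    return pts1, pts2
```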

6.3. Triangulation Evaluation

6.3.1. Evaluation of Synthetic Rig

Due to the unstructured nature of the dense point clouds previously discussed, we proceed to triangulate sets of sparse 3D points whose positions with respect to the omnistereo sensor camera frame, C , are known in advance. We synthesize a calibration chessboard pattern G containing m × n square cells for various predetermined poses C G T h . Since the sensor is assumed to be rotationally symmetric, it suffices to experiment with groups of L = 4 chessboard patterns situated at a given horizontal range, so a total of L m n 3D points are available for each range group. Each corner point's position is found with respect to C via the frame transformation C p j , g = C G g T h G g p j for all indices j ∈ { 1 , … , m n } , g ∈ { 1 , … , L } .
Figure 23 shows the set of detected corner points on the image for the group of patterns set at a range of C ρ G = 2 m . We adjust the pattern's cell sizes accordingly so that its points can be safely discerned by an automated corner detector [35]. We systematically establish correspondences of pattern points on the omnidirectional image and proceed to triangulate with Equation (93). For each range group of points, we compute the root-mean-square error (RMSE) of the 3D positions between the observed (triangulated) points C p ˜ j ← f P w ( m ˜ 1 , m ˜ 2 ) and the true (known) points C p j that were used to render the ray-traced image. Table 4 compiles the RMSE and standard deviation (SD) results for groups of patterns whose frames G g are located at specified horizontal ranges C ρ G ∈ [ 0.25 , 8.0 ] m away from C .
We notice that for all the 3D points in the synthetic patterns, we obtained an average error of 0.1 px with a standard deviation σ ˜ p x = 0.05 px for the subpixel detection of corners on the image versus their theoretical values obtained from f φ i defined in Equation (36). This last experiment helps us validate the pessimistic choice of σ p x = 1 px for the discrete pixel space in the triangulation uncertainty model proposed in Section 5.3.
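The RMSE and SD figures reported in Table 4 reduce to comparing each triangulated corner against its known position; a minimal sketch of that bookkeeping, assuming the triangulated and ground-truth points are already paired and expressed in frame C, is:

```python
import numpy as np

def position_error_stats(p_triangulated, p_true):
    """RMSE and standard deviation of the 3D position errors.

    p_triangulated, p_true : (N, 3) arrays of paired points expressed in frame [C].
    """
    errors = np.linalg.norm(p_triangulated - p_true, axis=1)   # Euclidean error per corner
    rmse = np.sqrt(np.mean(errors ** 2))
    return rmse, errors.std()
```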

6.3.2. Evaluation of Real-Life Rig

The following experiment uses L = 5 different poses of a real chessboard pattern with 5 × 8 corner points and a square cell size of 24 mm . As in Section 6.3.1, the evaluated error is the Euclidean norm between the triangulated points and the ground-truth positions of the chessboard poses captured via a motion capture system. The RMSE over all projected points in this set of chessboard patterns is 2.5 pixels with a standard deviation of 1.5 pixels. The RMSE over all triangulated points in this set is 3.5 mm with a standard deviation of 1.4 mm . Figure 24 visually confirms the proximity of the triangulated chessboard poses to the ground-truth pose information.

7. Discussion and Future Work

The portability of the proposed omnistereo sensor is one of its greatest advantages, as discussed in the introduction. The total weight of the big rig using the 37 mm -radius mirrors is about 550 g , so it can be carried by the AscTec Pelican quadrotor within its payload limit of 650 g . The mirror profiles maximize the stereo baseline while obeying the various design constraints such as size and field of view. Currently, the mirrors are custom-manufactured out of brass using CNC machining; however, the system's weight could be reduced dramatically by employing lighter materials.
In reality, it is almost impossible to assemble a perfect imaging system that fulfills the SVP assumption and avoids the triangulation uncertainty studied in Section 5.3, on top of the error already introduced by any feature matching technique. The coaxial misalignment of the folded mirrors-camera system, the defocus blur of the lens, and the undesirable glare from the support tube are all practical caveats we need to overcome for better 3D sensing. As described for the real-life rig, we have avoided the traditional support cylinder in order to work around the cross-reflection and glare issues. Possible vibrations caused by the robot dynamics are reduced by vibration pads placed at the sensor-body interface. Details of our tentative calibration method for vertically-folded omnistereo systems have not been included in the current study so that the reader's attention remains on the sensor characteristics defended by this analysis.
Our ongoing research also focuses on the development of efficient software algorithms for real-time 3D pose estimation from point clouds. Bear in mind that all the experimental results demonstrated in this manuscript rely upon a single camera snapshot. We understand that the narrow vertical field-of-view where stereo vision operates is a limiting factor for dense scene reconstruction from a single image, so we have also considered non-optimal geometries for the quadrotor's view. In fact, increasing the region of interest for stereo (SROI) while maintaining the wide baseline implies an enlargement of each mirror's radius. We believe that our omnidirectional system is more advantageous than forward-looking sensors because it can provide robust pose estimation by extracting 3D point features from all around the scene at once. As in our past work [24], fusing multiple modalities (e.g., stereo and optical flow) is a possibility for resolving the scale-factor problem inherent in performing structure from motion over the non-stereo regions of each mirror (near the poles).
In this work, we performed an extensive study of the proposed omnistereo sensor's properties, such as its spatial resolution and triangulation uncertainty. We validated the projection accuracy of the synthetic model (the ideal case), where 3D points in the world are known exactly. Validating the precision of the real sensor would require a perfectly constructed and assembled device so that point projections could be accepted as the ultimate truth, which is hard to achieve at a low-cost prototyping stage. Although we acquired ground-truth 3D points via a position capture system alone, we deem this insufficient to validate the imaging accuracy of the real sensor, because what is truly being measured is the precision of the calibration method. For reproducibility, source code is available for the implementation of the theoretical omnistereo model, the optimization, and the plots and figures presented in this analysis [40].

Acknowledgments

This work was supported in part by U.S. Army Research Office grant No. W911NF-09-1-0565, U.S. National Science Foundation grant No. IIS-0644127, and a Ford Foundation Pre-doctoral Fellowship awarded to Carlos Jaramillo.

Author Contributions

The work presented in this paper is a collaborative development by all of the authors. C. Jaramillo wrote this manuscript, carried out all the experiments and conceived the extensive analysis of the omnistereo sensor studied here. R.G. Valenti contributed with the analytical derivation of various equations and manuscript revisions. L. Guo established the geometrical model and rules for the constrained optimization of design parameters. J. Xiao funded and guided this entire study and helped with revisions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Symbolic Notation

P i : a point in R 3 , where the post-subscript i is a unique identifier.
A : a reference frame or image space with origin O A .
A p i : the position vector of P i with respect to reference frame A .
A p i , h : the same position vector in homogeneous coordinates.
I m i : a 2D point or pixel position on the image frame I .
p i : the magnitude (Euclidean norm) of p i .
q ^ : a unit vector, so || q ^ || = 1 .
M i : a 3 × 3 matrix, or M i , h in homogeneous coordinates.
f s : a scalar-valued function that outputs some scalar s.
f v : a vector-valued function for the computation of the vector v .
All coordinate systems obey the right-hand rule unless otherwise indicated.

References

  1. Marani, R.; Renò, V.; Nitti, M.; D’Orazio, T.; Stella, E. A Compact 3D Omnidirectional Range Sensor of High Resolution for Robust Reconstruction of Environments. Sensors 2015, 15, 2283–2308. [Google Scholar] [CrossRef]
  2. Valenti, R.G.; Dryanovski, I.; Jaramillo, C.; Strom, D.P.; Xiao, J. Autonomous quadrotor flight using onboard RGB-D visual odometry. In Proceedings of the International Conference on Robotics and Automation (ICRA 2014), Hong Kong, China, 31 May–7 June 2014; pp. 5233–5238.
  3. Khoshelham, K.; Elberink, S.O. Accuracy and resolution of Kinect depth data for indoor mapping applications. Sensors 2012, 12, 1437–1454. [Google Scholar] [CrossRef] [PubMed]
  4. Payá, L.; Fernández, L.; Gil, A.; Reinoso, O. Map building and monte carlo localization using global appearance of omnidirectional images. Sensors 2010, 10, 11468–11497. [Google Scholar] [CrossRef] [PubMed]
  5. Berenguer, Y.; Payá, L.; Ballesta, M.; Reinoso, O. Position Estimation and Local Mapping Using Omnidirectional Images and Global Appearance Descriptors. Sensors 2015, 15, 26368–26395. [Google Scholar] [CrossRef] [PubMed]
  6. Hrabar, S.; Sukhatme, G. Omnidirectional vision for an autonomous helicopter. In Proceedings of the International Conference on Robotics and Automation (ICRA), Taipei, Taiwan, 14–19 September 2003; pp. 3602–3609.
  7. Hrabar, S. 3D path planning and stereo-based obstacle avoidance for rotorcraft UAVs. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Nice, France, 22–26 September 2008; pp. 807–814.
  8. Orghidan, R.; Mouaddib, E.M.; Salvi, J. Omnidirectional depth computation from a single image. In Proceedings of the IEEE International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005; pp. 1222–1227.
  9. Paniagua, C.; Puig, L.; Guerrero, J.J. Omnidirectional structured light in a flexible configuration. Sensors 2013, 13, 13903–13916. [Google Scholar] [CrossRef] [PubMed]
  10. Byrne, J.; Cosgrove, M.; Mehra, R. Stereo based obstacle detection for an unmanned air vehicle. In Proceedings of the International Conference on Robotics and Automation, Orlando, FL, USA, 15–19 May 2006.
  11. Smadja, L.; Benosman, R.; Devars, J. Hybrid stereo configurations through a cylindrical sensor calibration. Mach. Vis. Appl. 2006, 17, 251–264. [Google Scholar] [CrossRef]
  12. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision, 2nd ed.; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  13. Sturm, P.; Ramalingam, S.; Tardif, J.P.; Gasparini, S.; Barreto, J.P. Camera models and fundamental concepts used in geometric computer vision. Found. Trends®Comput. Graph. Vis. 2010, 6, 1–183. [Google Scholar] [CrossRef]
  14. Gluckman, J.; Nayar, S.K.; Thoresz, K.J. Real-Time Omnidirectional and Panoramic Stereo. Comput. Vis. Image Underst. 1998. [Google Scholar]
  15. Koyasu, H.; Miura, J.; Shirai, Y. Realtime omnidirectional stereo for obstacle detection and tracking in dynamic environments. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Maui, HI, USA, 29 October–3 November 2001; Volume 1, pp. 31–36.
  16. Bajcsy, R.; Lin, S.S. High resolution catadioptric omni-directional stereo sensor for robot vision. In Proceedings of the 2003 IEEE International Conference on Robotics and Automation, Taipei, Taiwan, 14–19 September 2003; pp. 1694–1699.
  17. Cabral, E.E.; de Souza, J.C.J.; Hunold, M.C. Omnidirectional stereo vision with a hyperbolic double lobed mirror. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR), Cambridge, UK, 23–26 August 2004; pp. 0–3.
  18. Su, L.; Zhu, F. Design of a novel stereo vision navigation system for mobile robots. In Proceedings of the IEEE Robotics and Biomimetics (ROBIO), Hong Kong, China, 5–9 July 2005; pp. 611–614.
  19. Mouaddib, E.M.; Sagawa, R. Stereovision with a single camera and multiple mirrors. In Proceedings of the International Conference on Robotics and Automation, Barcelona, Spain, 18–22 April 2005; pp. 800–805.
  20. Schönbein, M.; Kitt, B.; Lauer, M. Environmental Perception for Intelligent Vehicles Using Catadioptric Stereo Vision Systems. In Proceedings of the European Conference on Mobile Robots (ECMR), Örebro, Sweden, 7–9 September 2011; pp. 1–6.
  21. Yi, S.; Ahuja, N. An Omnidirectional Stereo Vision System Using a Single Camera. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; pp. 861–865.
  22. Nayar, S.K.; Peri, V. Folded catadioptric cameras. In Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins, CO, USA, 23–25 June 1999; pp. 217–223.
  23. He, L.; Luo, C.; Zhu, F.; Hao, Y. Stereo Matching and 3D Reconstruction via an Omnidirectional Stereo Sensor. In Motion Planning; Number 60575024; In-Tech Education and Publishing: Vienna, Austria, 2008; pp. 123–142. [Google Scholar]
  24. Labutov, I.; Jaramillo, C.; Xiao, J. Generating near-spherical range panoramas by fusing optical flow and stereo from a single-camera folded catadioptric rig. Mach. Vis. Appl. 2011, 24, 1–12. [Google Scholar] [CrossRef]
  25. Swaminathan, R.; Grossberg, M.D.; Nayar, S.K. Caustics of catadioptric cameras. In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), Vancouver, BC, Canada, 7–14 July 2001; Volume 2, pp. 2–9.
  26. Jang, G.; Kim, S.; Kweon, I. Single camera catadioptric stereo system. In Proceedings of the Workshop on Omnidirectional Vision, Camera Networks and Nonclassical Cameras (OMNIVIS2005), Beijing, China, 21 October 2005.
  27. Jaramillo, C.; Guo, L.; Xiao, J. A Single-Camera Omni-Stereo Vision System for 3D Perception of Micro Aerial Vehicles (MAVs). In Proceedings of the IEEE Conference on Industrial Electronics and Applications (ICIEA), Melbourne, Australia, 19–21 June 2013.
  28. Ascending Technologies (AscTec). Available online: http://www.asctec.de/en/uav-uas-drones-rpas-roav/ (accessed on 23 May 2014).
  29. Baker, S.; Nayar, S.K. A theory of single-viewpoint catadioptric image formation. Int. J. Comput. Vis. 1999, 35, 175–196. [Google Scholar] [CrossRef]
  30. Nayar, S.K.; Baker, S. Catadioptric Image Formation. In Proceedings of the 1997 DARPA Image Understanding Workshop, New Orleans, LA, USA, May 1997; pp. 1431–1437.
  31. Gaspar, J.; Deccó, C.; Okamoto, J.J.; Santos-Victor, J. Constant resolution omnidirectional cameras. In Proceedings of the OMNIVIS'02 Workshop on Omni-directional Vision, Copenhagen, Denmark, 2 June 2002.
  32. Forsgren, A.; Gill, P.; Wright, M. Interior Methods for Nonlinear Optimization. Soc. Ind. Appl. Math. (SIAM Rev.) 2002, 44, 525–597. [Google Scholar] [CrossRef]
  33. Tuytelaars, T.T.; Mikolajczyk, K. Local Invariant Feature Detectors: A Survey. Found. Trends® Comput. Graph. Vis. 2008, 3, 177–280. [Google Scholar] [CrossRef]
  34. Spacek, L. Coaxial Omnidirectional Stereopsis. In Computer Vision - ECCV 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 354–365. [Google Scholar]
  35. Bradski, G.; Kaehler, A. Learning OpenCV: Computer Vision with the OpenCV Library; O'Reilly Media, Inc.: Sebastopol, CA, USA, 2008. [Google Scholar]
  36. Hirschmüller, H. Stereo processing by semiglobal matching and mutual information. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 328–341. [Google Scholar] [CrossRef] [PubMed]
  37. Xiang, Z.; Dai, X.; Gong, X. Noncentral catadioptric camera calibration using a generalized unified model. Opt. Lett. 2013, 38, 1367–1369. [Google Scholar] [CrossRef]
  38. Bay, H.; Ess, A.; Tuytelaars, T.; Vangool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359. [Google Scholar] [CrossRef]
  39. Scaramuzza, D.; Fraundorfer, F. Visual Odometry Part 1: The First 30 Years and Fundamentals. IEEE Robot. Autom. Mag. 2011, 18, 80–92. [Google Scholar] [CrossRef]
  40. Source Code Repository. Available online: https://github.com/ubuntuslave/omnistereo_sensor_design (accessed on 5 February 2016).
Figure 1. Synthetic and real prototypes for the catadioptric single-camera omnistereo system.
Figure 2. Photo-realistic synthetic scene: (a) Side-view of the quadrotor with the omnistereo rig in an office environment; (b) the image captured by the system’s camera using this pose.
Figure 3. Geometric model and observable design parameters.
Figure 4. Omnistereo projection of a 3D point P w to obtain image points I m 1 and I m 2 .
Figure 5. Vertical Field of View (vFOV) angles: α 1 and α 2 are the individual angles of the mirrors formed by their respective elevation limits θ 1 / 2 , m i n / m a x ; α s y s is the overall vFOV angle of the system; and α S R O I measures the overlapping region conceived between α 1 and α 2 .
Figure 6. A cross section of the SROI (shaded area) formed by the intersection of view rays for the limiting elevations θ 1 / 2 , m i n / m a x . The nearest stereo ( n s ) points are labeled P n s h i g h , P n s m i d and P n s l o w since they are the vertices of the hull that near-bounds the set of usable points for depth computation from triangulation (Section 5.2). See Table 3 for the proposed sensor’s values.
Figure 7. The omnidirectional image I shown in Figure 2b is now annotated with the separate regions of interest in I 1 and I 2 . In addition, we indicate the corresponding radial heights h I 1 and h I 2 of the SROI, so we can determine the imaging ratio χ I 1 : 2 = h I 1 / h I 2 . For the optimal parameter values listed in Table 1, we find that χ I 1 : 2 ≈ 2 .
Figure 8. The spatial resolution for a central catadioptric sensor is the ratio between an infinitesimal image area dA and its corresponding solid angle d ν 1 that views a point P w . (Note: infinitesimal elements are exaggerated in the figure for better visualization.)
Figure 9. The effect that parameter k i (showing mirror 1 only) has over the system radius r s y s for various values of the vertical field of view angle α 1 . In order to maintain a vertical field of view α i that is bounded by z m a x r s y s , the value of r s y s must change accordingly. Inherently, the system’s height, h s y s , and its mass, m s y s , are also affected by k i (see Section 2.3).
Figure 10. The effect that parameter k 1 has over the omnistereo system's baseline b for several common FOV angles ( α S R O I ) and a fixed camera with α c a m . An inverse relationship exists between k and b, as plotted here (using a logarithmic scale for the vertical axis). Intuitively, the flatter the mirror gets ( k → 2 ), the farther F 1 must be translated in order to fit within the camera's view, α S R O I , causing b to increase.
Figure 11. Comparison of k i values and their effect on the spatial resolution η i for i = { 1 , 2 } . For the big rig, the optimal focal dimensions c 1 and c 2 (from Table 1) were used, as well as the angular span of the common vertical FOV, α S R O I ≈ 28 ° . Although the resolution η i ( O p t . ) for the optimal values of k i could be improved by employing smaller k values (the lower-curvature profiles indicated in the left plot of the figure), this would in turn increase the system radius, r s y s , so as to maintain α i (Figure 9). As expected, the plot on the right helps us appreciate how the spatial resolutions, η i , increase towards the equatorial regions ( θ 1 → θ S R O I , m a x and θ 2 → θ S R O I , m i n ).
Figure 12. Using the formula given in Equation (60), we plot the 2D version of the spatial resolution of our proposed omnistereo catadioptric sensor (37 mm -radius rig). Both resolutions η 1 and η 2 increase towards the equatorial region where they are physically limited by r s y s . This verifies the spatial resolution theory given in [29], and it justifies our coaxial configuration useful for omnistereo sensing within the SROI indicated in Figure 6.
Figure 13. Real-life prototype of the omnistereo sensor.
Figure 14. An example of the formation of the panoramic image Ξ 1 out of the omnidirectional image I 1 (showing only the masked region of interest on the back of image plane π i m g 1 ). Any particular ray v 1 , indicated by its elevation and azimuth ( F 1 ψ 1 , θ 1 ) and directed towards the focus F 1 , must traverse the projection cylinder S c y l 1 at point P c y l 1 . More abstractly, the figure also shows how a pixel position Ξ 1 m α on the panoramic pixel space gets mapped from its corresponding pixel position I 1 m α via the function h Ξ 1 defined in Equation (85). Although not drawn to scale, it is crucial to notice the relative orientation between S c y l 1 and the back of the projection plane π i m g 1 where the omnidirectional image I 1 is found.
Figure 15. For the synthetic omnidirectional image I shown in Figure 2b, we generate its pair of panoramic images Ξ 1 , Ξ 2 using the procedure explained in Section 5.1. Note that we only work on the SROI (shown here) to perform a semi-global block match between the panoramas as indicated in Section 5.1.1. The resulting disparity map, Ξ Δ m 12 , is visualized at the bottom as a gray-scale panoramic image normalized about its 256 intensity levels, where brighter colors imply larger disparity values. To distinguish the relative vertical view of both panoramas, we have annotated the row position of the zero-elevation.
Figure 16. The more realistic case of skew back-projection rays ( v 1 , v 2 ) approximates the triangulated point P w by getting the midpoint P w G on the common perpendicular line segment G 1 G 2 ¯ : λ 1 2 v ^ 1 2 . Note that the visualized skew rays were formed from a pixel correspondence pair I ( m 1 , m 2 ) and by offsetting the coordinate u 2 by 15 pixels.
Figure 17. Variation of horizontal range, Δ ρ w , due to change in pixel disparity Δ m 12 on the omnidirectional image, I . There exists a “nonlinear & inverse” relation between the change in depth from triangulation ( Δ ρ w ) and the number of disparity pixels ( Δ m 12 ) available from the omnistereo image pair I 1 , I 2 , which are exclusive subspaces of I .
Figure 18. Top-view of the three-sigma level ellipsoid for the triangulation uncertainty of a pixel pair I ( m 1 , m 2 ) with an assumed standard deviation σ p x = 1 px .
Figure 19. Uncertainty ellipsoids for triangulated points at ranges ρ w ∈ { 0.3 , 0.5 , 1.0 } m .
Figure 20. A 3-D dense point cloud computed out of the synthetic model that rendered the omnidirectional image shown in Figure 2b. Pixel correspondences are established via the panoramic depth map visualized in Figure 15. The 3D point triangulation implements the common perpendicular midpoint method indicated in Section 5.2.1. The position of the omnistereo sensor mounted on the quadrotor is annotated as frame C with respect to the scene’s coordinates frame S . (a) 3D visualization of the point cloud (the quadrotor with the omnistereo rig has been added for visualization only); (b) Orthographic projection of the point cloud to the XY -plane of the visualization grid.
Figure 21. Real-life experiment using the 37 mm -radius prototype and a single 2592 × 1944 -pixel image, where the rig was positioned in the middle of the room observed in Figure 13a. Some landmarks of the scene are annotated as follows: Ⓐ appliances, Ⓑ monitors and shelf, Ⓒ back wall, Ⓓ chair, Ⓔ monitors and shelf, Ⓕ book, Ⓖ monitors, Ⓗ person, Ⓘ hallway, Ⓙ supplies. For the point cloud, the grid size is 0.50 m in all directions and points are thickened for clarity.
Figure 22. Sparse point correspondences for the real-life image from Figure 13b. Point correspondences are identifiable by random colors that persist in both the panoramic image and the respective triangulated 3D points (scaled-up for visualization).
Figure 23. Example of sparse point correspondences detected with subpixel precision from corners on the chessboard patterns around the omnistereo sensor. The size of the rendered images for this experiment is 1280 × 960 pixels. For this example’s patterns, the square cell size is 140 mm . The RMSE for this set of points at C ρ G = 2 m is approximately 15 mm (Table 4).
Figure 24. Visualization of estimated 3D poses for some chessboard patterns using the real-life omnistereo rig. Color annotations: ground-truth poses (green), estimated triangulated poses (red).
Table 1. Optimal System Design Parameters.
Parameter | Big Rig | Small Rig
b = max f b ( θ * ) | 131.61 | 108.92
r s y s [ mm ] | 37.0 | 28.0
c 1 [ mm ] | 123.49 | 104.59
c 2 [ mm ] | 241.80 | 204.34
d [ mm ] | 233.68 | 200.00
k 1 | 5.73 | 6.88
k 2 | 9.74 | 11.47
Table 2. By-product Length Parameters.
Parameter | Big Rig | Small Rig
r r e f [ mm ] | 17.23 | 11.74
r c a m [ mm ] | 7 | 7
h s y s [ mm ] | 150.00 | 120.00
Table 3. Near Vertices of the SROI for the Big Rig.
Vertex | C ρ w [ mm ] | C z w [ mm ]
P n s h i g h | 93.5 | 144.4
P n s m i d | 65.2 | 98.4
P n s l o w | 763.4 | −170.3
Table 4. Results of RMSE from Synthetic Triangulation Experiment.
C ρ G [ m ] | RMSE [ mm ] | SD [ mm ]
0.25 | 0.46 | 0.31
0.50 | 1.20 | 0.71
1.0 | 4.62 | 2.55
2.0 | 14.85 | 9.06
4.0 | 57.67 | 31.34
8.0 | 219.09 | 129.92
