Reproduction is permitted for noncommercial purposes.

How can the compound eye of insects capture prey so accurately and quickly? This question is explored from the perspective of computer vision rather than biology. The focus is on evaluating the noise immunity of motion recovery using a single-row superposition-type planar compound-like eye (SPCE). Inspired by the compound eye of insects, the SPCE has a symmetrical framework with a large number of ommatidia. The noise simulates the ambiguity of image patterns caused by environmental uncertainty or by the low resolution of CCD devices. Extensive simulations indicate that this visual configuration provides excellent motion estimation performance regardless of the magnitude of the noise. Even under severe noise, the SPCE dramatically reduces the errors of ego-translation recovery without any type of filter. In other words, the symmetrical, regular, and multiple vision sensing devices of the compound-like eye provide a statistical averaging advantage that suppresses noise. This finding lays an engineering foundation for explaining the secret of the compound eye of insects.

More than one hundred years ago, the configuration of the compound eye of insects began to attract researchers' attention. Recently, biologically inspired vision studies have flourished along with the boom in microlens technology, and image acquisition systems based on the compound-eye framework have progressed faster than ever. The fabrication of micro compound eyes has been reported in the literature, with a gradual orientation towards commercial applications. Well-known examples include the TOMBO compound eye proposed by Tanida et al. [

Why can the compound eye of insects capture prey so accurately and quickly? This question has not yet been fully answered. Biologists believe that it is because of the flicker effect [

Is there anything special about the image captured by the compound eye? How is the image pattern produced by the compound eye? The mosaic theory of insect vision initially proposed by Muller in 1826, and elaborated by Exner in 1891, is still generally accepted today. According to the mosaic theory, there are two basic types of compound eyes, apposition and superposition [

The two compound-eye structures mentioned above are defined from an ecological standpoint and help us understand how compound eyes produce images. From the standpoint of computer vision, however, which configuration should be adopted? The compound-like eye used in computer vision proposed by Aloimonos [

How should a planar compound-like eye be configured in computer vision? Because image formation differs considerably between the superposition and apposition types, the superposition-type compound-like eye is defined as follows:

Arrangement: Each ommatidium is assumed to be placed in a well-ordered parallel or vertical arrangement. Accordingly, a number of CCD cameras, each treated as an ommatidium, are arranged on the surface of a plane, with a constant horizontal distance between adjacent ommatidia.

Image acquisition: To distinguish it from the apposition type, image acquisition in the superposition-type compound-like eye is defined such that every ommatidium sees the whole image of an object.

Image patterns: To produce the ambiguous patterns viewed by the compound eye of an insect, the image points are assumed to deviate from their ideal positions because of interfering noise. Since the noise is random, the contaminated patterns differ markedly from the ideal ones, so the image viewed by each ommatidium is an ambiguous appearance of the object. In addition, the image formed by each ommatidium depends on its location in the arrangement. As a result, the images generated by the superposition-type planar compound-like eye (SPCE) resemble the blurred patterns viewed by the ommatidia of an insect.

Simulation: To construct the images viewed by the compound eye, the basic pinhole perspective projection is adopted.
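As a minimal sketch of this simulation setup, the snippet below projects a 3D point into a row of parallel pinhole cameras. The 90 mm gap and the test point are the values reported in the experiments; the focal length `focal`, the function name `project_row`, and the choice of centering the row on the middle camera are assumptions made purely for illustration.

```python
import numpy as np

def project_row(points, n_cameras=3, gap=90.0, focal=1.0):
    """Pinhole images of (N, 3) points seen by a row of parallel cameras.

    The cameras sit on the x-axis, centered on the middle camera, and
    share the same viewing direction. Returns an array of shape
    (n_cameras, N, 2) holding the (x, y) image coordinates.
    """
    points = np.asarray(points, dtype=float)
    centers = (np.arange(n_cameras) - (n_cameras - 1) / 2.0) * gap
    x = focal * (points[None, :, 0] - centers[:, None]) / points[None, :, 2]
    y = np.broadcast_to(focal * points[:, 1] / points[:, 2], x.shape)
    return np.stack([x, y], axis=-1)

images = project_row([[-365.0, 112.0, 960.0]])
disparity = images[0, 0, 0] - images[1, 0, 0]  # left minus middle camera
```

Under these assumptions the disparity between adjacent cameras equals `focal * gap / Z`, so each camera pair carries the depth of the point, while the vertical image coordinate is the same in every camera.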

Before the translational motion model for the single-row SPCE is established, the translational motion for the parallel trinocular is investigated first. The model is then extended to the single-row SPCE in a straightforward manner by expanding the parallel trinocular structure on both sides.

Given _{1} and _{2}, as illustrated in

For an arbitrary point _{1} and _{2}, the products of the image disparities in _{l}_{l}_{m}_{m}_{r}, y_{r}

Suppose _{x}_{y}_{z}^{T}^{i}_{x}^{i}_{y}

First select a pair of images that come from left and middle CCDs. The optical flow fields along the _{l}_{m}_{1} = _{2}

The above equations indicate that when the focal length

For the purpose of simplicity and clarity, the corresponding image regions: Ω_{l}, Ω_{m}, Ω_{r}

Obviously, solving for the translational motion parameters is an over-determined problem. Owing to the symmetric arrangement of the compound eye, three strategies are considered. Model A includes all optical flow equations. Model B deletes all dependent constraints. Model C keeps the symmetric framework but deletes the relationship between the left and right CCDs.

If all equations are considered, then

The translational motion can therefore be recovered by the standard least square estimation as follows:

Generally speaking, the above derivation can be applied to both point-to-point correspondence and patch-to-patch matching cases.
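For reference, the standard least-squares estimate mentioned above has the following generic form. The symbols $\mathbf{A}$ (stacked image matrix), $\mathbf{b}$ (stacked optical flow vector), and $\mathbf{T}$ (translation) are placeholder names chosen here and are not necessarily the notation of the original derivation.

```latex
\mathbf{A}\,\mathbf{T} = \mathbf{b}
\quad\Longrightarrow\quad
\hat{\mathbf{T}} = \left(\mathbf{A}^{T}\mathbf{A}\right)^{-1}\mathbf{A}^{T}\,\mathbf{b},
\qquad \text{provided } \det\!\left(\mathbf{A}^{T}\mathbf{A}\right) \neq 0 .
```

The condition on the determinant is the usual requirement for the normal equations to have a unique solution.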

Much simpler mathematical expressions can be established by getting rid of those equations that are dependent on others. Since

Identifying the dependent equations takes time. A compromise that avoids this difficulty is to ignore the third set of equations, formed by the left and right CCDs, which has the most dependency, by excluding

Although the parameters for the translational motion can be resolved by the standard least square approach, possible singularity problem, i.e., the determinant of ^{T}

The determinant of the matrix ^{T}_{1}_{2} − _{2}_{1}) ≠ 0 and (_{1}_{3} − _{3}_{1}) = 0 (refer to _{1} and _{1}, are also not zero. Therefore, the singularity problem will not happen in the presented approach. Correspondingly, both Models A and B can also lead to similar results. ^{T}^{T}

Owing to the rich relationships among the CCDs of the parallel trinocular, the solutions of the three models are identical under the ideal noise-free condition. In the real world, however, images are normally corrupted by noise, and the solutions for the translation then differ.

Installing more CCDs on both sides of the parallel trinocular gradually yields a single-row SPCE structure. Likewise, expanding the image optical flow equations for the translational motion of the parallel trinocular by increasing the number of CCDs leads to the translational motion model for the single-row SPCE. The solution approach is basically the same as for the parallel trinocular. The only difference is the order of the image matrix and of the optical flow vector, which must correspond to the number of CCDs.

Assume the amount of CCD at each side from the central camera is

Based on the SPCE images generated in computer vision, the translational motion can be recovered using a single-row SPCE by the following procedure:

To make the images captured by the cameras of the single-row SPCE as ambiguous as the picture viewed by an ommatidium, random noise is added to the ideal images. The noise in the individual images is assumed to be mutually independent.

When the single-row SPCE looks at an object, each ommatidium CCD perceives a different profile according to its location. In this manner, the compound-like eye observes a whole picture consisting of many small, similar, and ambiguous patterns.

When the object moves, the single-row SPCE detects the translational movement using two complete images, one before and one after the motion.

Using these two vague images, which contain the information about the translation, the corresponding optical flow for each camera can be determined.

Any two CCDs generate a pair of image optical flow equations. The more cameras there are, the more image optical flow equations are obtained.

This large number of image optical flow equations can be stacked in a matrix form, such as

Using the least square estimation approach, the ego-translational motion is obtained directly from the normal equations, i.e., by applying the pseudo-inverse of the stacked image matrix to the stacked optical flow vector.

As the number of ommatidium CCD cameras grows, the ego-translational motion parameters resolved by the single-row SPCE approach the ideal values.
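The procedure above can be sketched end to end as follows. This is an illustrative reconstruction under stated assumptions, not the original implementation: depth is eliminated through the disparity of each camera pair, all camera pairs are used (in the spirit of Model A), the point is taken to translate relative to the eye (for ego-motion of the eye the sign of the result flips), and the relative error is taken as the norm of the estimation error divided by the norm of the true translation. The focal length `f = 1000`, the random point cloud, and all function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def project(points, centers, f):
    """Pinhole images of (N, 3) points in parallel cameras at x = centers."""
    x = f * (points[None, :, 0] - centers[:, None]) / points[None, :, 2]
    y = np.broadcast_to(f * points[:, 1] / points[:, 2], x.shape)
    return np.stack([x, y], axis=-1)  # shape (C, N, 2)

def estimate_translation(img1, img2, centers, f):
    """Least-squares translation from images before (img1) and after (img2)."""
    flow = img2 - img1
    rows, rhs = [], []
    for i, j in combinations(range(len(centers)), 2):
        # Inverse depth from disparity: 1/Z = (x_i - x_j) / (f * (c_j - c_i)).
        k = (img1[i, :, 0] - img1[j, :, 0]) / (f * (centers[j] - centers[i]))
        zero = np.zeros_like(k)
        # Flow model in camera i: u = (f*Tx - x*Tz)/Z and v = (f*Ty - y*Tz)/Z.
        rows.append(np.stack([f * k, zero, -img1[i, :, 0] * k], axis=1))
        rows.append(np.stack([zero, f * k, -img1[i, :, 1] * k], axis=1))
        rhs.extend([flow[i, :, 0], flow[i, :, 1]])
    A = np.vstack(rows)       # stacked image matrix
    b = np.concatenate(rhs)   # stacked optical flow vector
    T, *_ = np.linalg.lstsq(A, b, rcond=None)
    return T

def mean_relative_error(n_cameras, variance, trials=20, f=1000.0, gap=90.0, seed=0):
    """Mean relative error (percent) over noisy trials for a 1 x n_cameras row."""
    rng = np.random.default_rng(seed)
    centers = (np.arange(n_cameras) - (n_cameras - 1) / 2.0) * gap
    # Hypothetical cloud of fifty 3D points in front of the eye (mm).
    points = rng.uniform([-400, -400, 800], [400, 400, 1000], size=(50, 3))
    T_true = np.array([60.0, 50.0, 5.0])
    sigma = np.sqrt(variance)
    errors = []
    for _ in range(trials):
        img1 = project(points, centers, f)
        img2 = project(points + T_true, centers, f)
        # Independent zero-mean Gaussian noise on every image coordinate.
        img1 = img1 + rng.normal(0.0, sigma, img1.shape)
        img2 = img2 + rng.normal(0.0, sigma, img2.shape)
        T_est = estimate_translation(img1, img2, centers, f)
        errors.append(np.linalg.norm(T_est - T_true) / np.linalg.norm(T_true))
    return 100.0 * float(np.mean(errors))
```

Calling `mean_relative_error(3, 100.0)` and `mean_relative_error(15, 100.0)` compares a short and a longer row at the same noise level. The absolute numbers depend on the assumed focal length and point cloud, so they are not expected to match the values reported in the tables.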

In order to verify the performance of noise immunity for the single row SPCE, a given synthesized cloud of fifty 3D points shown in

To simulate a realistic situation, noises have to be introduced into ideal data. Gupta and Kanal [

Assume the image components in the ideal motion field (_{x}_{1}, _{y}_{1}, _{x}_{2}, and _{y}_{2} are assumed to have the same statistical property and are given by

To simplify the subsequent validation and simulation, the translational movement was approximated by the translational velocity under the assumption of unit sampling time, so that the image velocity represents the image displacement. A small displacement of (0.1, 0.1, 0.1) mm and a large displacement of (60, 50, 5) mm were chosen. With the above algorithm, the estimated ego-translations in the ideal noise-free case were (0.1, 0.1, 0.1) mm and (59.6661, 49.7217, 4.9718) mm, respectively; the small residual for the large displacement is consistent with approximating a finite displacement by a velocity. Noise levels with variance from 1 to 100 were studied. Because the image motion in the small-displacement case is too small, only the large displacement (60, 50, 5) was used for validation. The gap between adjacent CCD cameras was 90 mm in all cases. The single-row SPCE arrangements were 1×3, 1×5, 1×7, 1×9, 1×11, 1×13, 1×15, and 1×25, and 300 trials were conducted for each situation. The image points with different noise variances for the 1×25 single-row SPCE at two contiguous time instants (black and red) are shown in

Considering those ambiguous image patterns contaminated by different levels of noise, model validation based on performance of motion recovery using those three presented motion models will be explored. The relative error is defined as [_{x}, V_{y}_{z}

To validate the three models, two additional cases are explored: an arbitrary single point, (-365, 112, 960), and an arbitrary set of 5 points. For different single row SPCE configurations under a noise level of variance 100, relative errors of motion recovery in percent of the ego-translation are demonstrated in

Based on the experimental results, the following summaries can be made:

The relative errors of Models A and C remain at an acceptable level as the number of CCD cameras increases, whereas Model B does not converge to a satisfactory level. Since Model A uses more equations in the translational motion estimation, it is not surprising that it yields smaller relative errors; Model B uses fewer equations, and its relative error is larger. In the single-point case with the 1×155 single-row arrangement, Model A clearly has the smallest relative error, followed by Model C, and Model B has the largest.

It has been shown that the errors decrease quickly when the number of points increases [

The relative error curves of Models A and C vary in an interesting way as the number of test points increases. The two curves intersect at about 1×29 for a single point and at about 1×70 for 5 points. When the number of test points reaches fifty, however, no intersection exists. This phenomenon indicates that the relative error of Model C is close to that of Model A in most situations.

Since the dimensions of the image matrix and the optical flow vector in the three models vary with the number of CCD cameras in the single-row SPCE, their execution times differ.

However, if we set a base with Model B,

To sum up, from the above comparison on the accuracy and computation efficiency of the single row SPCE with fifty points depicted in

According to previous examination, Model C appears to be the best motion model in noise-resistance capability. Hence Model C is selected for further investigation for motion recovery of the ego-translation movement.

For single-row SPCE configurations from 1×3 to 1×25, when the noise variance increases from 1 to 100, the relative error of the estimated ego-translation grows roughly tenfold on average. This trend is reasonable, because the variance is the square of the noise amplitude: a hundredfold increase in variance corresponds to a tenfold increase in standard deviation.
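The tenfold figure can be checked with simple arithmetic on the variance-1 and variance-100 rows of the relative-error table for the regular arrangement. The variable names below are made up for this illustration.

```python
import math

# Relative errors (%) for 1x3, 1x5, 1x7, 1x9, 1x11, 1x13, 1x15, 1x25.
err_var1 = [4.69, 1.50, 0.90, 0.60, 0.42, 0.36, 0.30, 0.16]
err_var100 = [47.85, 14.03, 8.79, 5.81, 4.18, 3.60, 2.89, 1.58]

# Error growth per configuration when the variance goes from 1 to 100.
ratios = [hi / lo for lo, hi in zip(err_var1, err_var100)]
mean_ratio = sum(ratios) / len(ratios)

# Growth of the noise standard deviation over the same range.
std_growth = math.sqrt(100 / 1)

print(f"mean error ratio: {mean_ratio:.2f}, std growth: {std_growth:.0f}")
```

Every per-configuration ratio lies between 9.3 and 10.3, which agrees with the tenfold growth of the standard deviation.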

Regardless of the noise level, noise interference severely damages the image patterns when the number of CCDs is small, and the noise resistance improves as the number of CCDs increases. When the variance reaches 100, the relative error of the ego-translation in the 1×3 configuration is almost 50%. With the single-row SPCE framework, the relative error falls below 2% simply by changing the CCD arrangement to 1×25. It can be concluded that, under noise interference, the relative error is greatly reduced as the number of CCDs increases. The multiple-camera scheme, inspired by the compound eyes of insects, provides an outstanding filtering effect on noisy image patterns.
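How quickly the error falls with the number of cameras can be gauged from the variance-100 row of the relative-error table. The sketch below fits a straight line on a log-log scale; reading the slope as a power-law exponent is an interpretation added here, not a result stated in the original analysis.

```python
import numpy as np

# Number of cameras per row and relative error (%) at noise variance 100.
cameras = np.array([3, 5, 7, 9, 11, 13, 15, 25])
error_pct = np.array([47.85, 14.03, 8.79, 5.81, 4.18, 3.60, 2.89, 1.58])

# Slope of log(error) against log(cameras): error ~ cameras ** slope.
slope, intercept = np.polyfit(np.log(cameras), np.log(error_pct), 1)
print(f"fitted exponent: {slope:.2f}")
```

The fitted exponent is steeper than -1, so on these data the error drops faster than the 1/sqrt(n) rate expected from averaging n independent measurements. One plausible reason, offered only as a conjecture, is that a longer row also adds camera pairs with wider spacing.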

The validations in sections 5.1 and 5.2 are based on a regular arrangement of the single-row SPCE. But what happens if the arrangement of the SPCE is irregular? A small deviation was applied to the leftmost CCD camera in each case of

From

With the deviated camera, the relative error grows as the number of CCD cameras increases. Because the deviated camera lies farther from the origin of the coordinate system in larger configurations, its effect on the relative error is larger. For example, in the 1×25 configuration the relative error is about 120% at every noise level.

Under all noise levels, the irregular arrangement of the single-row SPCE dominates the relative error of motion estimation, so the relative errors for a given configuration are nearly the same across noise levels. The exception is 1×3, where the number of CCDs is so small that the relative error is dominated by the noise variance.

Although the irregularity in the single-row SPCE is only a small deviation, it seriously degrades the accuracy of the translational motion estimate. Therefore, a regular arrangement is very important for motion estimation in the compound eye of an insect.

The above experimental results show that the relative error of the ego-translation decreases significantly as the number of CCDs in the single-row SPCE framework increases. If the number of CCDs goes to infinity, the relative error is expected to approach zero. The SPCE thus appears to have a powerful capability to overcome noise. In particular, even when the noise variance is large and the images are blurred, the SPCE still shows effective noise resistance in recovering the parameters of the translational movement.

The dragonfly has nearly 30000 ommatidia in each eye, which makes sense because it hunts in flight, whereas butterflies and moths, which do not hunt in flight, have only 12000 to 17000 ommatidia [

Apparently, some of the reasons why the compound eye helps the insect capture its prey so exactly and quickly are its symmetrical and regular arrangement and its sufficiently large number of ommatidia. The multiple-camera scheme contributes multiple visual measurements of the image patterns, which act as an inherent filter against noise. With sufficiently many image patterns, strong noise resistance in motion recovery can be achieved without any additional signal filter.

The compound eyes of flying insects are highly evolved organs. Although the images they receive are vague and unclear, these insects can still capture prey exactly and quickly. Inspired by these insects, this study used pinhole image formation geometry to investigate the visual mechanism of the SPCE for moving objects.

The principle of the parallel trinocular was introduced first. After its extension to the single-row SPCE, extensive experiments on the recovery of translational movement under noise were performed. In particular, a compromise translational motion model was proposed that combines high accuracy with fast computation. The experimental results also indicate that, whether the noise is large or small, the relative error of the ego-translation decreases as the number of CCDs in the single-row SPCE increases. From an engineering point of view, this outcome lays a basic theoretical foundation for explaining why the compound eye of an insect can seize prey with tremendous efficiency.

This work was supported in part by the National Science Council of the Republic of China under Grant NSC 93-2212-E-110-014.

For an arbitrary point _{li}_{mi}_{ri}_{lj}_{mj}_{rj}_{1} is equal to _{2} or not.

_{1}_{2} − _{2}_{1} ≠ 0

_{1}_{3} − _{3}_{1} = 0

_{1}_{3}_{3}_{1} = 0, _{1}_{3} − _{3}_{1} = 0 and _{1}_{2} − _{2}_{1} ≠ 0, _{1}_{2} − _{2}_{1} ≠ 0

In addition, _{1}b_{2}-a_{2}b_{1}^{T}

Model A:

Model B:

order of

Model C:

order of

Schematic for the parallel trinocular.

A synthesized cloud of fifty 3D points.

The image points with different noise variances for the 1×25 single-row SPCE before (black) and after (red) a movement.

The relative errors of translation motion in three models with a single, five, and fifty 3D points under a noise level of variance 100.

The elapsed time per run of the translation motion estimation for the three models with fifty 3D points under a noise level of variance 100.

Elapsed time in sec of the ego-translation for different single-row SPCE configurations with three models under a noise level of variance 100.

Model | 1×5 | 1×15 | 1×35 | 1×55 | 1×75 | 1×95 | 1×125 | 1×155 |
---|---|---|---|---|---|---|---|---|
A | 0.13 | 1.05 | 5.57 | 14.15 | 27.52 | 46.56 | 83.37 | 153.2 |
B | 0.07 | 0.22 | 0.52 | 0.82 | 1.13 | 1.42 | 1.86 | 2.63 |
C | 0.08 | 0.24 | 0.53 | 0.98 | 1.34 | 1.48 | 1.95 | 2.77 |
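Taking Model B as the baseline, the relative cost of the other two models can be computed directly from the elapsed times in the table above. The snippet is only an arithmetic illustration; the variable names are made up here.

```python
configs = ["1×5", "1×15", "1×35", "1×55", "1×75", "1×95", "1×125", "1×155"]
elapsed = {
    "A": [0.13, 1.05, 5.57, 14.15, 27.52, 46.56, 83.37, 153.2],
    "B": [0.07, 0.22, 0.52, 0.82, 1.13, 1.42, 1.86, 2.63],
    "C": [0.08, 0.24, 0.53, 0.98, 1.34, 1.48, 1.95, 2.77],
}

# Elapsed time of Models A and C expressed as multiples of Model B.
a_over_b = [a / b for a, b in zip(elapsed["A"], elapsed["B"])]
c_over_b = [c / b for c, b in zip(elapsed["C"], elapsed["B"])]

for name, ra, rc in zip(configs, a_over_b, c_over_b):
    print(f"{name}: A/B = {ra:.1f}, C/B = {rc:.2f}")
```

Model C stays within 20% of Model B for every configuration, whereas Model A is more than 50 times slower than Model B at 1×155.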

Relative errors in percent of the ego-translation for different single-row SPCE configurations with various noise variances.

Noise variance | 1×3 | 1×5 | 1×7 | 1×9 | 1×11 | 1×13 | 1×15 | 1×25 |
---|---|---|---|---|---|---|---|---|
1 | 4.69 | 1.50 | 0.90 | 0.60 | 0.42 | 0.36 | 0.30 | 0.16 |
4 | 10.01 | 2.82 | 1.62 | 1.19 | 0.80 | 0.68 | 0.62 | 0.33 |
16 | 19.51 | 6.43 | 3.74 | 2.25 | 1.72 | 1.35 | 1.14 | 0.68 |
36 | 28.18 | 9.59 | 4.91 | 3.68 | 2.60 | 2.00 | 1.74 | 0.95 |
64 | 40.13 | 12.43 | 6.15 | 4.59 | 3.74 | 2.81 | 2.26 | 1.34 |
100 | 47.85 | 14.03 | 8.79 | 5.81 | 4.18 | 3.60 | 2.89 | 1.58 |

Relative errors in percent of the ego-translation for different single-row SPCE configurations with various noise variances after a small deviation.

Noise variance | 1×3 | 1×5 | 1×7 | 1×9 | 1×11 | 1×13 | 1×15 | 1×25 |
---|---|---|---|---|---|---|---|---|
1 | 3.90 | 54.23 | 67.87 | 75.58 | 81.91 | 87.74 | 93.81 | 120.0 |
4 | 8.25 | 54.30 | 67.96 | 75.58 | 81.85 | 87.76 | 93.83 | 120.0 |
16 | 16.01 | 54.47 | 67.91 | 75.62 | 81.89 | 87.79 | 93.86 | 120.1 |
36 | 23.59 | 54.65 | 67.99 | 75.58 | 81.56 | 87.83 | 93.89 | 120.1 |
64 | 32.28 | 54.83 | 67.98 | 75.41 | 82.01 | 87.82 | 93.93 | 120.1 |
100 | 40.31 | 54.98 | 68.07 | 75.53 | 82.00 | 87.89 | 93.97 | 120.1 |