<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s110808164</article-id>
<article-id pub-id-type="publisher-id">sensors-11-08164</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>FPGA-Based Multimodal Embedded Sensor System Integrating Low- and Mid-Level Vision</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Botella</surname><given-names>Guillermo</given-names></name><xref ref-type="aff" rid="af1-sensors-11-08164"><sup>1</sup></xref><xref ref-type="corresp" rid="c1-sensors-11-08164"><sup>*</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Martín H.</surname><given-names>José Antonio</given-names></name><xref ref-type="aff" rid="af1-sensors-11-08164"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Santos</surname><given-names>Matilde</given-names></name><xref ref-type="aff" rid="af1-sensors-11-08164"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Meyer-Baese</surname><given-names>Uwe</given-names></name><xref ref-type="aff" rid="af2-sensors-11-08164"><sup>2</sup></xref></contrib></contrib-group>
<aff id="af1-sensors-11-08164">
<label>1</label> Department of Computer Architectures and Automatic Control, Complutense University of Madrid, 28040 Madrid, Spain; E-Mails: <email>jamartinh@fdi.ucm.es</email> (J.A.M.H.); <email>msantos@dacya.ucm.es</email> (M.S.)</aff>
<aff id="af2-sensors-11-08164">
<label>2</label> Department of Electrical and Computer Engineering, FAMU-FSU College of Engineering, Tallahassee, FL 32310, USA; E-Mail: <email>umb@eng.fsu.edu</email></aff>
<author-notes>
<corresp id="c1-sensors-11-08164">
<label>*</label>Author to whom correspondence should be addressed; E-Mail: <email>gbotella@fdi.ucm.es</email>; Tel.: +34-91-394-76-50; Fax: +34-91-394-75-27.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2011</year></pub-date>
<pub-date pub-type="epub">
<day>22</day>
<month>8</month>
<year>2011</year></pub-date>
<volume>11</volume>
<issue>8</issue>
<fpage>8164</fpage>
<lpage>8179</lpage>
<history>
<date date-type="received">
<day>16</day>
<month>2</month>
<year>2011</year></date>
<date date-type="rev-recd">
<day>6</day>
<month>7</month>
<year>2011</year></date>
<date date-type="accepted">
<day>15</day>
<month>8</month>
<year>2011</year></date></history>
<permissions>
<copyright-statement>© 2011 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2011</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>Motion estimation is a low-level vision task that is especially relevant due to its wide range of real-world applications. Many of the best motion estimation algorithms include features found in mammals; these demand huge computational resources, so such algorithms are not usually available in real time. In this paper we present a novel bioinspired sensor based on the synergy between optical flow and orthogonal variant moments. The bioinspired sensor has been designed for Very Large Scale Integration (VLSI) using properties of the mammalian cortical motion pathway. This sensor combines low-level primitives (optical flow and image moments) in order to produce a mid-level vision abstraction layer. The results are described through experiments showing the validity of the proposed system, together with an analysis of the computational resources and performance of the applied algorithms.</p></abstract>
<kwd-group>
<kwd>bio-inspired systems</kwd>
<kwd>machine vision</kwd>
<kwd>optical flow</kwd>
<kwd>orthogonal variant moments</kwd>
<kwd>VLSI</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>The goal of visual perception has been defined in several ways [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>,<xref ref-type="bibr" rid="b2-sensors-11-08164">2</xref>], generally as the interpretation of the information arriving at the retina, although general agreement about the different abstraction levels and the limits between them is lacking.</p>
<p>Low-level vision obtains useful measurements such as colour, spatial frequency, binocular disparity, motion processing, <italic>etc.</italic>, from several channels. Some of these channels, or spatio-temporal filters, can be identified with receptive fields that deliver information to the retina. Others, such as binocular disparity or motion processing, are combinations of the previously mentioned ones.</p>
<p>Mid-level vision integrates primitives processed at a previous level. Information delivered at this stage corresponds to real-world inferences such as egomotion and independent moving objects (IMOs). They are called causal actions or object candidates in connection with any multimodal characterization. Examples of these are the combination of luminance measurements to infer lightness, shape from shading, perceptual grouping, figure organization, <italic>etc.</italic></p>
<p>Finally, High-level vision interprets the scene through specific tasks such as relational reasoning, knowledge building, object recognition, <italic>etc.</italic> [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>].</p>
<p>Regarding Low-level vision, optical flow, considered as the pixel motion estimation (velocity measured in terms of modulus and phase) of an image sequence, is an ill-posed problem due to the inherent complexity of the signal processing tasks associated with it.</p>
<p>Motion processing has many important applications nowadays, including robot navigation [<xref ref-type="bibr" rid="b3-sensors-11-08164">3</xref>], biomedicine assistance [<xref ref-type="bibr" rid="b4-sensors-11-08164">4</xref>], and so on [<xref ref-type="bibr" rid="b5-sensors-11-08164">5</xref>]. Almost all complex computer vision systems include a core that specifically processes motion, whose output is then integrated with other early-level primitives, as mentioned above. These primitives are passed as input parameters to higher-level vision stages. The applications mentioned here need real-time capability when they are part of an embedded system, where processing resources are constrained. Some approaches [<xref ref-type="bibr" rid="b6-sensors-11-08164">6</xref>] are only accurate enough over a limited velocity range or in a noise-free environment. Others suffer from contrast dependence or are unable to estimate second-order motion [<xref ref-type="bibr" rid="b7-sensors-11-08164">7</xref>,<xref ref-type="bibr" rid="b8-sensors-11-08164">8</xref>].</p>
<p>On the other hand, moments in computer vision [<xref ref-type="bibr" rid="b9-sensors-11-08164">9</xref>] are statistical measures which capture important information about an image, for instance, to describe its shape. Variant moments [<xref ref-type="bibr" rid="b10-sensors-11-08164">10</xref>–<xref ref-type="bibr" rid="b13-sensors-11-08164">13</xref>] are an alternative to the classic moment invariants. Variant moments are considered Low-Level processing because they process at the pixel level.</p>
<p>In this work, we present a prototype based on an FPGA device, suitable for industrial applications that require reduced size, rapid prototyping, low cost and low power consumption. Our bioinspired sensor integrates two Low-level vision primitives: gradient-family optical flow estimation and orthogonal variant moments. The optical flow platform provides the modulus and phase of the velocity at each captured pixel. Orthogonal variant moments improve the robustness of the final system by characterizing the pixels. Both early-vision cues provide information for the Mid-level output, which in this contribution has been configured for segmentation and tracking tasks.</p>
<p>This paper is organized as follows. Section 2 provides a brief description of the different vision levels applied and the architecture of the whole integration. Section 3 describes the algorithms of the multimodal sensor. Section 4 presents the experimental results, the performance and the hardware resources needed. Section 5 summarizes the main innovative points, compares the system with other approaches and presents the conclusions of the work.</p></sec>
<sec>
<label>2.</label>
<title>Multimodal Platform</title>
<p>In this section the different Vision Levels applied are described. The final aim can be summarized in two challenges: the efficient integration of different primitives belonging to Low-level vision, and the Mid-level vision processing module, which gathers and processes data from the previously performed integration.</p>
<sec>
<label>2.1.</label>
<title>Pixel-Level Granularity: Low Level Vision</title>
<p>The starting point of the Low-level module of the platform is an improved FPGA-based implementation [<xref ref-type="bibr" rid="b14-sensors-11-08164">14</xref>,<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>], which is briefly explained in this subsection. The optical flow Multichannel Gradient Model (McGM), designed by Johnston [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>–<xref ref-type="bibr" rid="b20-sensors-11-08164">20</xref>], was chosen to implement the Low-level vision system in VLSI due to its robustness and bio-inspiration. This model deals efficiently with many challenges, such as illumination, static patterns, contrast invariance, robustness against failures, justification of some optical illusions [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>], detection of second-order motion and camouflage processing [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>,<xref ref-type="bibr" rid="b17-sensors-11-08164">17</xref>], <italic>etc.</italic> Its physical architecture and design principles are based on the biological nervous systems of mammals [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>,<xref ref-type="bibr" rid="b20-sensors-11-08164">20</xref>–<xref ref-type="bibr" rid="b22-sensors-11-08164">22</xref>]. At the same time, it avoids operations such as matrix inversion or iterative methods that are not biologically justified [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>–<xref ref-type="bibr" rid="b18-sensors-11-08164">18</xref>]. The original description of the McGM model [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>–<xref ref-type="bibr" rid="b20-sensors-11-08164">20</xref>] has been modified to improve the viability of the implementation in hardware.</p>
<p>Low-level vision processes the early visual information in a highly parallel and local way, as the retina and primary visual cortex do [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>,<xref ref-type="bibr" rid="b23-sensors-11-08164">23</xref>]. The goal of this part is to estimate optical flow using a quotient of massively parallel filter banks. These filters are obtained from a kernel function that depends on time and space. Together they form a filter bank that progressively increases the order of the spatial (<italic>r</italic>) and temporal (<italic>t</italic>) differential operators applied to the kernel, <xref ref-type="disp-formula" rid="FD1">Equation (1)</xref>:
<disp-formula id="FD1">
<label>(1)</label> 
<mml:math display="block">
<mml:mrow>
<mml:mi>K</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mn>4</mml:mn>
<mml:mi>π</mml:mi>
<mml:mi>σ</mml:mi></mml:mrow></mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>r</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:msup></mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
<mml:mi>σ</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msqrt>
<mml:mi>π</mml:mi></mml:msqrt>
<mml:mi>τ</mml:mi>
<mml:mi>α</mml:mi>
<mml:mi> </mml:mi>
<mml:mi> </mml:mi>
<mml:mi> </mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi></mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>τ</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:msup>
<mml:mo>/</mml:mo>
<mml:mn>4</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtext>ln</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>/</mml:mo>
<mml:mi>α</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mi>τ</mml:mi></mml:mfrac></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow>
<mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula>where the parameters have been tuned to the following values: <italic>σ</italic> = 1.5, <italic>α</italic> = 10 and <italic>τ</italic> = 0.2. This expression is obtained following psychophysical and biological evidence from the mammalian and human visual systems [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>]. It has been normalized and tuned assuming a human spatial frequency limit of 60 cycles/deg and a critical flicker fusion limit of 60 Hz [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>].</p>
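<p>As an illustration only, the kernel of Equation (1) can be evaluated numerically with the parameter values quoted above. The sketch below is not the FPGA implementation. It assumes that the temporal normalisation factor is exp(τ²/4), which is the value that makes the log-time Gaussian integrate to one, and that the spatial part is an isotropic two-dimensional Gaussian. The function names and the midpoint integrator are hypothetical helpers introduced for this example.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

SIGMA = 1.5   # spatial scale quoted in the text
ALPHA = 10.0  # temporal peak position quoted in the text
TAU = 0.2     # log-time width quoted in the text


def spatial_kernel(r, sigma=SIGMA):
    """Spatial Gaussian factor of Equation (1); r is the radial distance."""
    return math.exp(-r * r / (4.0 * sigma)) / (4.0 * math.pi * sigma)


def temporal_kernel(t, alpha=ALPHA, tau=TAU):
    """Log-time Gaussian factor of Equation (1); taken as zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    # Assumed normalisation: sqrt(pi) * tau * alpha * exp(tau^2 / 4).
    norm = math.sqrt(math.pi) * tau * alpha * math.exp(tau * tau / 4.0)
    return math.exp(-(math.log(t / alpha) / tau) ** 2) / norm


def kernel(r, t):
    """Separable space-time kernel K(r, t)."""
    return spatial_kernel(r) * temporal_kernel(t)


def integrate(f, a, b, n=20000):
    """Midpoint-rule quadrature, used only to check the normalisation."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))
```
]]></preformat>
<p>Under these assumptions both factors integrate to one (the spatial factor over the plane, the temporal factor over positive time), and the temporal factor peaks at <italic>t</italic> = <italic>α</italic>.</p>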
<p>After that, the intensity value of every pixel is replaced by a three-dimensional Taylor approximation that depends on the derivative operators previously calculated from the kernel function. This expansion takes derivatives in time, <italic>t</italic>, and two spatial directions, <italic>x</italic> and <italic>y</italic>. These derivatives fit well with the receptive fields in neural systems, and multiple neurophysiological and psychophysical facts support this processing scheme [<xref ref-type="bibr" rid="b1-sensors-11-08164">1</xref>,<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>]. This system is biologically plausible and can be implemented by an artificial neural system in the visual cortex involving addition, multiplication and division of the linear spatio-temporal oriented filters [<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>]. The implemented model is a sequence of stages, whose main concepts and associated tasks are summarized in the following paragraphs:</p>
<p>Stage I accomplishes the temporal differentiation through fully stable and causal FIR filtering, convolving derivative operators of the kernel function (log-time domain Gaussian). It is important to note that this implementation differs from the one presented in previous works (IIR filtering) [<xref ref-type="bibr" rid="b14-sensors-11-08164">14</xref>,<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>]: at the cost of a longer delay, this contribution gains in stability, modularity and scalability.</p>
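<p>A minimal software sketch of causal FIR temporal differentiation is given below for illustration. It is not the hardware design: the number of taps (40), the use of frames as the time unit, and the construction of the taps as central differences of the sampled log-time Gaussian are all assumptions made for this example, and the function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

ALPHA, TAU = 10.0, 0.2  # values quoted for Equation (1); time unit assumed to be frames


def temporal_kernel(t):
    """Log-time Gaussian, with the assumed normalisation exp(tau^2 / 4)."""
    if t <= 0.0:
        return 0.0
    norm = math.sqrt(math.pi) * TAU * ALPHA * math.exp(TAU * TAU / 4.0)
    return math.exp(-(math.log(t / ALPHA) / TAU) ** 2) / norm


def derivative_taps(n_taps=40):
    """Hypothetical first-derivative taps: central differences of the sampled kernel."""
    return [(temporal_kernel(k + 1) - temporal_kernel(k - 1)) / 2.0
            for k in range(n_taps)]


def fir_filter(samples, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * x[n - k], with zeros before n = 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out
```
]]></preformat>
<p>Once the filter window is full, a constant pixel intensity produces an output close to zero and a unit-slope intensity ramp produces an output close to one, which is the behaviour expected of a smoothed temporal derivative. Being a finite, feed-forward sum, the filter is stable by construction, at the price of a delay that grows with the number of taps.</p>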
<p>Stage II implements the spatial differentiation building functions of each temporal derivative previously implemented. This structure representation is computed via convolution with a set of neural “basis” filters modeled as derivatives of Gaussians.</p>
<p>Stage III steers each one of the space-time filters previously built at arbitrary orientations using a linear combination of other filters in a small “basis” set. Using the linear property of the convolution as main advantage, a filter <italic>F<sub>θ</sub></italic> with orientation <italic>θ</italic> from the previous basic filter bank is formed. Many gradient optical flow models [<xref ref-type="bibr" rid="b2-sensors-11-08164">2</xref>,<xref ref-type="bibr" rid="b7-sensors-11-08164">7</xref>,<xref ref-type="bibr" rid="b8-sensors-11-08164">8</xref>,<xref ref-type="bibr" rid="b24-sensors-11-08164">24</xref>] can be implemented by just combining the outputs reached at this point.</p>
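<p>The steering idea can be sketched for the simplest case, a first-order derivative of an isotropic Gaussian, where the oriented filter is the linear combination cos(θ)·G<sub>x</sub> + sin(θ)·G<sub>y</sub> of two basis filters. This is an illustrative example only. The Gaussian form and σ = 1.5 follow Equation (1), while the restriction to first order and all function names are assumptions of the sketch; higher-order filters need larger basis sets.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

SIGMA = 1.5  # spatial scale quoted for Equation (1)


def gaussian(x, y, sigma=SIGMA):
    """Isotropic 2D Gaussian with the same form as the spatial part of Equation (1)."""
    return math.exp(-(x * x + y * y) / (4.0 * sigma)) / (4.0 * math.pi * sigma)


def gx(x, y):
    """Basis filter: partial derivative of the Gaussian along x."""
    return -x / (2.0 * SIGMA) * gaussian(x, y)


def gy(x, y):
    """Basis filter: partial derivative of the Gaussian along y."""
    return -y / (2.0 * SIGMA) * gaussian(x, y)


def steered(x, y, theta):
    """Filter at orientation theta built as a linear combination of the basis."""
    return math.cos(theta) * gx(x, y) + math.sin(theta) * gy(x, y)


def rotated_gx(x, y, theta):
    """Reference: the x-derivative filter evaluated in a frame rotated by theta."""
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    return gx(xr, yr)
```
]]></preformat>
<p>Because convolution is linear, the response of the steered filter equals the same combination of the two basis responses, so only the basis filters have to be convolved with the image.</p>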
<p>Stage IV builds a Taylor expansion and its derivatives over <italic>x</italic>, <italic>y</italic> and <italic>t</italic> (denoted <italic>X</italic>,<italic>Y</italic>,<italic>T</italic> respectively) using the earlier calculated measures, delivering at the output a sextet which contains the products <italic>XX</italic>,<italic>XY</italic>,<italic>XT</italic>,<italic>YY</italic>,<italic>YT</italic>,<italic>TT</italic>. The Taylor approximation is truncated by removing terms above first order in time and in the orthogonal direction, consistent with the biological evidence for no more than three temporal filters and no greater spatial complexity in the filters [<xref ref-type="bibr" rid="b25-sensors-11-08164">25</xref>].</p>
<p>At this point, the whole information of the sequence of input frames is represented by a 3D structure where each pixel belonging to it can be reached in terms of a filter population tuned to different orientations and spatial frequencies.</p>
<p>Stage V forms four different functions called direct 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>, 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>, and inverse 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> speeds, where each pair of values is expressed using the parallel and orthogonal components. These functions depend on the full set of derivatives calculated before. The so-called <italic>aperture problem</italic> [<xref ref-type="bibr" rid="b24-sensors-11-08164">24</xref>] inherent to optical flow is addressed by conditioning the raw values through a least-squares method applied to the different projections <italic>θ</italic>. These four functions are the velocity estimation primitives, in keeping with the robustness and bioinspired nature of the model. The functions are combined so that both direct and inverse speed contribute to the accuracy of the value: because they are antagonistic and complementary, they strongly enhance the robustness of the sensor. Additionally, several works support the existence of neurons that perform inverse speed measures [<xref ref-type="bibr" rid="b26-sensors-11-08164">26</xref>,<xref ref-type="bibr" rid="b27-sensors-11-08164">27</xref>]; this also provides an explanation of the sensitivity to static noise of motion-blind patients [<xref ref-type="bibr" rid="b28-sensors-11-08164">28</xref>].</p>
<p>Stage VI finally calculates two outputs: direction output from a measurement of phase that is combined across all speed related measures and the modulus output as a quotient of determinants, as shown in the following expressions:
<disp-formula id="FD2">
<label>(2)</label> 
<mml:math display="block">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="italic">Modulus</mml:mi>
<mml:mn>2</mml:mn></mml:msup>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover>
<mml:mtext>cos</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover>
<mml:mtext>sin</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mtext>cos</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mtext>sin</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow>
<mml:mo stretchy="true">^</mml:mo></mml:mover>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mfrac></mml:mrow>
<mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD3">
<label>(3)</label> 
<mml:math display="block">
<mml:mrow>
<mml:mi mathvariant="italic">Phase</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mtext>tan</mml:mtext></mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mtext>sin</mml:mtext>
<mml:mi>θ</mml:mi>
<mml:mo>+</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mtext>cos</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mrow>
<mml:mo>||</mml:mo></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mtext>cos</mml:mtext>
<mml:mi>θ</mml:mi>
<mml:mo>−</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="true">^</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo>⌣</mml:mo></mml:mover></mml:mrow>
<mml:mo>⊥</mml:mo></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mtext>sin</mml:mtext>
<mml:mi>θ</mml:mi></mml:mrow></mml:mfrac>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></disp-formula></p>
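<p>For a single projection angle <italic>θ</italic>, the phase computation of Stage VI can be sketched in software as follows. The sketch rests on two assumptions that are not taken from the hardware design. First, the summed parallel and orthogonal speeds are treated as a vector rotated by <italic>θ</italic>, so the denominator subtracts the orthogonal term multiplied by sin <italic>θ</italic>. Second, a four-quadrant arctangent is used in place of the plain inverse tangent, so that the direction is recovered over the full circle. The function and argument names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def phase(s_hat_par, s_inv_par, s_hat_perp, s_inv_perp, theta):
    """Direction estimate for one projection angle theta, in radians.

    The direct (hat) and inverse speeds are summed per component and the
    resulting (parallel, orthogonal) pair is rotated by theta.
    """
    par = s_hat_par + s_inv_par      # summed parallel component
    perp = s_hat_perp + s_inv_perp   # summed orthogonal component
    num = par * math.sin(theta) + perp * math.cos(theta)
    den = par * math.cos(theta) - perp * math.sin(theta)
    # atan2 is an assumption of this sketch: it resolves the quadrant
    # that a plain inverse tangent of num / den would lose.
    return math.atan2(num, den)
```
]]></preformat>
<p>With a purely parallel speed the function returns <italic>θ</italic> itself, and with a purely orthogonal speed it returns <italic>θ</italic> + π/2, as expected for a rotation.</p>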
<p>The complete optical flow Low-level vision model can be easily and gradually degraded to match previous models [<xref ref-type="bibr" rid="b18-sensors-11-08164">18</xref>], and even reduced to an ordinary gradient optical flow model [<xref ref-type="bibr" rid="b7-sensors-11-08164">7</xref>,<xref ref-type="bibr" rid="b8-sensors-11-08164">8</xref>,<xref ref-type="bibr" rid="b29-sensors-11-08164">29</xref>], as pointed out in a previous work [<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>].</p></sec>
<sec>
<label>2.2.</label>
<title>Wave-Level Granularity: Low- and Mid-Level Vision</title>
<p>One of the most well established approaches in computer-vision and image analysis is the use of moment invariants. Moment invariants, surveyed extensively by Prokop and Reeves [<xref ref-type="bibr" rid="b9-sensors-11-08164">9</xref>] and more recently by Flusser [<xref ref-type="bibr" rid="b11-sensors-11-08164">11</xref>], were first introduced to the pattern recognition community by Hu [<xref ref-type="bibr" rid="b12-sensors-11-08164">12</xref>,<xref ref-type="bibr" rid="b13-sensors-11-08164">13</xref>], who employed the results of the theory of algebraic invariants and derived a set of seven moment invariants (the well-known Hu invariant set), which is now a classical reference in any work that makes use of moments. Since the introduction of the Hu invariant set, numerous works have been devoted to various improvements, generalizations and their application in different areas, e.g., various types of moments such as Zernike moments, pseudo-Zernike moments, rotational moments, and complex moments have been used to recognize image patterns in a number of applications [<xref ref-type="bibr" rid="b30-sensors-11-08164">30</xref>].</p>
<p>The influence of discretization and noise on the accuracy of moments as object descriptors has been addressed previously through several techniques that increase the accuracy and efficiency of moment descriptors, including the deduction of focus information from the second- or fourth-order central moments of a sequence of images [<xref ref-type="bibr" rid="b31-sensors-11-08164">31</xref>], as well as methods for the efficient computation of certain classes of moments (e.g., Zernike moments, discrete orthogonal moments) [<xref ref-type="bibr" rid="b32-sensors-11-08164">32</xref>–<xref ref-type="bibr" rid="b35-sensors-11-08164">35</xref>]. Moreover, other works [<xref ref-type="bibr" rid="b36-sensors-11-08164">36</xref>] address the same problem as Hu from different perspectives, e.g., achieving invariance to intensity, rotation, and scaling of color images based on the concept of principal component analysis and a competitive learning algorithm.</p>
<p>In short, moment invariants are measures of an image or signal that remain constant under some transformations, e.g., rotation, scaling, translation or illumination. Moments are applicable to different aspects of image processing, ranging from invariant pattern recognition and image encoding to pose estimation. Such moments can produce image descriptors invariant under rotation, scale, translation, orientation, <italic>etc.</italic> The general definition of moments of order <italic>p + q</italic> is as follows:
<disp-formula id="FD4">
<label>(4)</label> 
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mi>M</mml:mi>
<mml:mi mathvariant="italic">pq</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mi>p</mml:mi></mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mi>q</mml:mi></mml:msup>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi>x</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>y</mml:mi>
<mml:mspace width="2em"/>
<mml:mo>;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>∞</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>These moments produce a weighted description of <italic>f</italic>(<italic>x,y</italic>) over the entire image. The basis functions (<italic>x<sup>p</sup> y<sup>q</sup></italic>) may have a range of useful properties that may be passed onto the moments.</p>
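<p>On a sampled image the double integral of Equation (4) becomes a double sum over pixels. The following sketch illustrates this discrete form. It assumes that the image is stored as a list of rows, that <italic>x</italic> is the column index and <italic>y</italic> the row index, both starting at zero, and the function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def raw_moment(image, p, q):
    """Discrete moment M_pq: sum of x^p * y^q * f(x, y) over all pixels."""
    total = 0.0
    for y, row in enumerate(image):          # y: row index (assumed convention)
        for x, value in enumerate(row):      # x: column index (assumed convention)
            total += (x ** p) * (y ** q) * value
    return total


def centroid(image):
    """Intensity centroid (M10 / M00, M01 / M00) of the image."""
    m00 = raw_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00
```
]]></preformat>
<p>The zeroth-order moment is the total intensity, and the first-order moments divided by it give the intensity centroid, which is the usual starting point for building translation-invariant descriptors.</p>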
<p>The method of variant moments [<xref ref-type="bibr" rid="b37-sensors-11-08164">37</xref>] is a new technique for image analysis and computer vision that has many promising features for producing new kinds of very robust and simple computer vision algorithms. Variant moments possess a very simple definition; they are versatile and can be calculated very efficiently. They can also be used to characterize an image, object and scene at the low, mid and high levels respectively. One of their main areas of application is likely to be the exploitation of possible synergies with other state-of-the-art computer vision systems, e.g., optical flow-based techniques, as explained in this contribution.</p>
<p>Orthogonality means the decomposition of an object, e.g., a point or vector, into, say, two components (its rectangular components <italic>x</italic>, <italic>y</italic>) in such a way that these two components are, <italic>a priori</italic>, uncorrelated; that is, it is possible to analyze how the object varies in one of its components, say <italic>x</italic>, independently of the rest of the components, say <italic>y.</italic></p>
<p>An <italic>Orthogonal Variant Moment m</italic> = <italic>O(f)</italic> is a measurement of a function <italic>f</italic> such that <italic>m</italic> varies if and only if the specific characteristic that is measured with this particular moment changes, that is, it is a measurement of an exclusive feature of a signal, image or wave form. Thus, an orthogonal variant moment set <bold><italic>S</italic></bold> is such that every element is uncorrelated with any other element of the set; in such a way that the value of some particular moment in an image sequence can vary while the remaining moments remain constant.</p>
<p>Invariants are sensitive to any image change or perturbation to which they are not invariant, so any unexpected perturbation will affect the measurements; methods based on this approach can therefore suffer from a high degree of uncertainty. In contrast, a variant moment is designed to be sensitive to a specific perturbation, <italic>i.e.</italic>, to measure a transformation rather than to be invariant to it. If that specific perturbation occurs it will be measured, while any other unexpected disturbance will not affect the objective of the measurement; variant moments thus behave as specific detectors.</p>
<p>Restricting attention to two-dimensional images on the plane, some useful orthogonal variant moments are: the volume or area under the curve; the surface area <bold><italic>S<sub>a</sub></italic></bold>, computed from two orthogonal components, <bold><italic>L</italic></bold><sub>x</sub> for the <italic>x</italic>-axis and <bold><italic>L</italic></bold><sub>y</sub> for the <italic>y</italic>-axis; and an approximation of the phase of a wave, called the position or station, which is likewise defined by two orthogonal components, <bold><italic>P<sub>x</sub></italic></bold> and <bold><italic>P<sub>y</sub></italic></bold>.</p>
<p>The time derivatives of these orthogonal variant moments are also used to obtain relevant measures of dynamic image sequences. For instance, measures of velocity and acceleration, <bold><italic>V</italic></bold> and <bold><italic>A</italic></bold> respectively, are obtained from the time derivatives of the position, <italic>∂</italic><bold><italic>P</italic></bold><italic><sub>x</sub></italic> and <italic>∂</italic><bold><italic>P</italic></bold><italic><sub>y</sub></italic>. The time derivatives of the surface area (length), <italic>∂</italic><bold><italic>L</italic></bold><sub>x</sub>, <italic>∂</italic><bold><italic>L</italic></bold><sub>y</sub>, represent the speed with which the disturbance is attenuated or amplified by a factor <italic>k</italic>. As long as the ratio between <italic>∂</italic><italic>L</italic><sub>x</sub> and <italic>∂</italic><italic>L</italic><sub>y</sub> remains constant, the change can be interpreted as a zoom in/out seen by an observer perpendicular to the <italic>xy</italic>-plane.</p>
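<p>As a minimal sketch of how these derivatives might be approximated in software, the fragment below applies a backward finite difference to per-frame moment values and checks whether the ratio of the two length derivatives stays constant. The function names, the tolerance, the unit frame interval and the sample values are assumptions made for illustration only; they are not taken from [<xref ref-type="bibr" rid="b37-sensors-11-08164">37</xref>] or from the hardware implementation.</p>
<preformat><![CDATA[
```python
def time_derivative(series, dt=1.0):
    """Backward finite difference of a per-frame moment series."""
    return [(b - a) / dt for a, b in zip(series, series[1:])]


def is_zoom(d_lx, d_ly, rel_tol=1e-6):
    """Return True when the ratio dLx/dLy is (nearly) constant over time.

    Frames where dLy is zero give no usable ratio and are skipped
    (an assumption of this sketch).
    """
    ratios = [a / b for a, b in zip(d_lx, d_ly) if b != 0]
    if len(ratios) < 2:
        return False
    spread = max(ratios) - min(ratios)
    return spread <= rel_tol * max(abs(r) for r in ratios)


# Hypothetical per-frame values of P_x, L_x and L_y (illustrative only).
p_x = [10.0, 12.0, 14.5, 17.5]
l_x = [4.0, 5.0, 7.0, 8.0]
l_y = [2.0, 2.5, 3.5, 4.0]

v_x = time_derivative(p_x)  # velocity along x
a_x = time_derivative(v_x)  # acceleration along x
zoom = is_zoom(time_derivative(l_x), time_derivative(l_y))
```
]]></preformat>
<p>In this example the two length derivatives keep a ratio of 2 in every frame, so the sequence would be read as a zoom under the criterion stated above.</p>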
<p>The method introduced previously [<xref ref-type="bibr" rid="b37-sensors-11-08164">37</xref>] operates by extracting, for each frame <bold><italic>I</italic></bold> of an image sequence or stream, a set <bold><italic>M</italic></bold> of moments, as shown in <xref ref-type="disp-formula" rid="FD5">Equation (5)</xref>:
<disp-formula id="FD5">
<label>(5)</label> 
<mml:math display="block">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>A</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>Once the <bold><italic>M</italic></bold> vector has been obtained, these moments can be used directly in several computer vision algorithms, for instance, to perform image segmentation, movement detection, shape analysis, and object and pattern recognition.</p></sec>
<sec>
<label>2.3.</label>
<title>Integrated Multimodal Sensor Architecture</title>
<p>The high-level description tool Handel-C was chosen to implement this core within the DK environment [<xref ref-type="bibr" rid="b38-sensors-11-08164">38</xref>]. The board used is the well-known AlphaData RC1000 [<xref ref-type="bibr" rid="b39-sensors-11-08164">39</xref>], which includes a Virtex 2000E-BG560 chip and four SRAM banks of 2 Mbytes each. These external banks have been used for different implementations and are accessible from both the FPGA and the PCI bus, as shown in <xref ref-type="fig" rid="f1-sensors-11-08164">Figure 1</xref>. The low-level optical flow vision is designed and built as an asynchronous pipeline in which a message or token is passed to the next core each time a core finishes its processing task. The low-level moment vision platform, in contrast, is implemented in parallel, with each module independent of the rest.</p>
<p>Each orthogonal variant moment and the optical flow scheme contribute to the final Mid-Level Vision estimation. The multimodal sensor core integrates the information from different abstraction layers (six modules for optical flow, five modules for the orthogonal moments and one module for the Mid-Level vision tasks). In this work the Mid-Level vision core is arranged for segmentation and tracking estimation and also includes an efficient implementation of a clustering algorithm, although additional functionality can be added to this last module using this general architecture.</p></sec></sec>
<sec sec-type="methods">
<label>3.</label>
<title>Algorithms of the Mid-Level Multimodal Sensor: Tracking &amp; Segmentation Case Study</title>
<p>In this section the algorithms for performing tracking and segmentation are presented. <xref ref-type="table" rid="t6-sensors-11-08164">Algorithm 1</xref> (Segmentation function) shows a classical segmentation procedure that uses the well-known <italic>k</italic>-means clustering algorithm, although any other clustering algorithm could be used instead to group pixels into different classes. The <italic>k</italic>-means algorithm is implemented in hardware, modifying the structure proposed in [<xref ref-type="bibr" rid="b40-sensors-11-08164">40</xref>] in order to reduce the time spent on the computations between the class centres and the pixels.</p>
<p>Every pixel is classified using a set of features for itself and for a neighbourhood surrounding it: its <italic>x,y</italic>-coordinates, a set of orthogonal variant moments calculated for the subimage formed by the pixel’s neighbourhood <italic>W<sub>ij</sub></italic>, and two additional components provided by the optic-flow subsystem indicating the magnitude <italic>m<sub>ij</sub></italic> and the phase <italic>θ<sub>ij</sub></italic>. Thus every pixel is represented by a vector of features <italic>F<sub>ij</sub></italic> that will be classified into a cluster or class. The <italic>k</italic>-means algorithm has a critical parameter <italic>k</italic>, which determines the number of different clusters to generate. One simple method to overcome this apparent limitation (due to the unknown number of moving objects in the scene) is to use a large enough <italic>k</italic> and to discard any insignificant or low-quality clusters generated.</p>
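<p>The idea of choosing a generous <italic>k</italic> and then discarding insignificant clusters could be sketched as follows. The text does not define what makes a cluster insignificant, so this sketch assumes a simple pixel-count threshold; the function name, the threshold and the use of label 0 for discarded pixels are illustrative assumptions rather than the implemented criterion.</p>
<preformat><![CDATA[
```python
from collections import Counter


def drop_small_clusters(labels, min_pixels, discarded=0):
    """Relabel pixels of clusters smaller than min_pixels as `discarded`.

    `labels` is a 2-D list of class ids (1..k). The size threshold is an
    assumed stand-in for the "insignificant or low quality" test.
    """
    counts = Counter(label for row in labels for label in row)
    keep = {label for label, n in counts.items() if n >= min_pixels}
    return [[label if label in keep else discarded for label in row]
            for row in labels]


# Hypothetical 3 x 3 label map produced with k = 3.
labels = [[1, 1, 2],
          [1, 1, 3],
          [1, 3, 3]]
filtered = drop_small_clusters(labels, min_pixels=3)
```
]]></preformat>
<p>Here class 2 covers a single pixel and is discarded, while classes 1 and 3 survive unchanged.</p>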
<p>The full motion detection and tracking system is then achieved by the procedure described in <xref ref-type="table" rid="t7-sensors-11-08164">Algorithm 2</xref>. The method is as follows: given an image sequence <italic>S</italic>, the algorithm performs, for each temporal image frame, the segmentation procedure described above in order to group the pixels of the current frame into different clusters. Once each valid cluster has been generated, every pixel has a label indicating its class, e.g., 1, 2, 3, … <italic>k</italic>. With this starting information, the algorithm can proceed to superimpose a surrounding box over the image frame for each detected object. At this step, each cluster represents a moving object, and thus we can handle mid-level entities instead of low-level entities (pixels).</p>
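<p>A minimal software sketch of the surrounding-box step is given below: for every class label in a frame it records the smallest axis-aligned box containing all pixels of that class. The function name, the box layout (top, left, bottom, right) and the convention that label 0 marks unassigned pixels are assumptions of this sketch, not details of the hardware module.</p>
<preformat><![CDATA[
```python
def bounding_boxes(labels, background=0):
    """Map each class id to its box (top, left, bottom, right), inclusive."""
    boxes = {}
    for i, row in enumerate(labels):
        for j, label in enumerate(row):
            if label == background:
                continue
            if label not in boxes:
                boxes[label] = [i, j, i, j]
            else:
                box = boxes[label]
                box[0] = min(box[0], i)
                box[1] = min(box[1], j)
                box[2] = max(box[2], i)
                box[3] = max(box[3], j)
    return {label: tuple(box) for label, box in boxes.items()}


# Hypothetical label map for one frame with two detected objects.
frame_labels = [[0, 1, 1, 0],
                [0, 1, 1, 0],
                [2, 0, 0, 0]]
boxes = bounding_boxes(frame_labels)
```
]]></preformat>
<p>Calling this function once per frame on the output of the segmentation step corresponds to the inner loop of the tracking procedure.</p>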
<table-wrap id="t6-sensors-11-08164" position="anchor">
<label>Algorithm 1.</label>
<caption>
<p>The proposed integrated segmentation algorithm, incorporating the variant moments and the optic-flow measures, <italic>i.e.</italic>, the magnitude and phase of the flow at each pixel (<italic>m<sub>ij</sub></italic>, <italic>θ<sub>ij</sub></italic>).</p></caption>
<table frame="hsides" rules="none">
<tbody>
<tr>
<td align="left" valign="top">1:</td>
<td align="left" valign="top"><bold>Function Segmentation(I)</bold></td></tr>
<tr>
<td align="left" valign="top">2:</td>
<td align="left" valign="top">{An image <italic>I</italic> of N × M pixel intensities}</td></tr>
<tr>
<td align="left" valign="top">3:</td>
<td align="left" valign="top"><bold>for</bold> <italic>i</italic> = 1 to N <bold>do</bold></td></tr>
<tr>
<td align="left" valign="top">4:</td>
<td align="left" valign="top">  <bold>for</bold> <italic>j</italic> = 1 to M <bold>do</bold></td></tr>
<tr>
<td align="left" valign="top">5:</td>
<td align="left" valign="top">    Obtain a window:
<disp-formula>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mi>w</mml:mi>
<mml:mn>2</mml:mn></mml:mfrac>
<mml:mo>…</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mi>w</mml:mi>
<mml:mn>2</mml:mn></mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn></mml:mfrac>
<mml:mo>…</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn></mml:mfrac></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow>
<mml:mtext>of</mml:mtext>
<mml:mo> </mml:mo>
<mml:mi>w</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo> </mml:mo>
<mml:mtext>neighbours of</mml:mtext>
<mml:mo> </mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:math></disp-formula></td></tr>
<tr>
<td align="left" valign="top">6:</td>
<td align="left" valign="top">    Obtain pixel features:
<disp-formula>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mi>F</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo>←</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mover>
<mml:mover>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo></mml:mrow>
<mml:mo stretchy="true">︷</mml:mo></mml:mover>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">coordinates</mml:mi></mml:mrow></mml:mover></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow/></mml:mtd></mml:mtr></mml:mtable>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:munder>
<mml:munder>
<mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo></mml:mrow>
<mml:mrow/></mml:mrow>
<mml:mo stretchy="true">︸</mml:mo></mml:munder>
<mml:mrow>
<mml:mi mathvariant="italic">variant</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">moments</mml:mi></mml:mrow></mml:munder></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mover>
<mml:mover>
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi mathvariant="italic">ij</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>θ</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">ij</mml:mi></mml:mrow></mml:msub></mml:mrow>
<mml:mo stretchy="true">︷</mml:mo></mml:mover>
<mml:mrow>
<mml:mi mathvariant="italic">optic</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">flow</mml:mi></mml:mrow></mml:mover></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></td></tr>
<tr>
<td align="left" valign="middle">7:</td>
<td align="left" valign="middle">  <bold>end for</bold></td></tr>
<tr>
<td align="left" valign="middle">8:</td>
<td align="left" valign="middle"><bold>end for</bold></td></tr>
<tr>
<td align="left" valign="bottom">9:</td>
<td align="left" valign="bottom"><italic>class-id</italic> = <italic>k-means</italic>(<italic>F,k,w</italic>)</td></tr>
<tr>
<td align="left" valign="bottom">10:</td>
<td align="left" valign="bottom"><bold>return</bold> <italic>class-id</italic></td></tr></tbody></table></table-wrap>
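<p>For readers who prefer code to pseudocode, the segmentation function can be approximated in plain Python as shown below. This is only a sketch under several assumptions: windows are clipped at the image borders, the moment computation is passed in as a callable (<italic>placeholder_moments</italic> is a stand-in and does not reproduce the orthogonal variant moments), the <italic>k</italic>-means step uses a deterministic initialisation with a fixed number of iterations, and the third argument of the <italic>k-means</italic> call in Algorithm 1 is not modelled. It does not describe the hardware implementation.</p>
<preformat><![CDATA[
```python
def window(image, i, j, w, h):
    """Neighbourhood W_ij around pixel (i, j), clipped at the borders."""
    rows = range(max(i - w // 2, 0), min(i + w // 2, len(image) - 1) + 1)
    cols = range(max(j - h // 2, 0), min(j + h // 2, len(image[0]) - 1) + 1)
    return [[image[r][c] for c in cols] for r in rows]


def placeholder_moments(win):
    """Stand-in feature (window mean); NOT the orthogonal variant moments."""
    values = [v for row in win for v in row]
    return (sum(values) / len(values),)


def pixel_features(image, flow_mag, flow_phase, moments_fn, w, h):
    """Build F_ij = [coordinates, window moments, flow magnitude and phase]."""
    features = []
    for i, row in enumerate(image):
        for j, _ in enumerate(row):
            w_ij = window(image, i, j, w, h)
            features.append([float(i), float(j), *moments_fn(w_ij),
                             flow_mag[i][j], flow_phase[i][j]])
    return features


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(features, centres):
    """Index of the nearest centre for every feature vector."""
    return [min(range(len(centres)),
                key=lambda c: squared_distance(f, centres[c]))
            for f in features]


def kmeans(features, k, iterations=10):
    """Plain Lloyd iterations; centres start at evenly spaced samples."""
    step = (len(features) - 1) / max(k - 1, 1)
    centres = [list(features[round(n * step)]) for n in range(k)]
    for _ in range(iterations):
        labels = assign(features, centres)
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(features, centres)


def segmentation(image, flow_mag, flow_phase, moments_fn, k, w=3, h=3):
    """Return a label map with class ids 1..k, one per pixel."""
    features = pixel_features(image, flow_mag, flow_phase, moments_fn, w, h)
    labels = kmeans(features, k)
    width = len(image[0])
    label_map = []
    for r in range(len(image)):
        row = labels[r * width:(r + 1) * width]
        label_map.append([label + 1 for label in row])
    return label_map


# Hypothetical 6 x 6 frame: dark left half, bright right half, no motion.
image = [[0.0] * 3 + [100.0] * 3 for _ in range(6)]
zeros = [[0.0] * 6 for _ in range(6)]
label_map = segmentation(image, zeros, zeros, placeholder_moments, k=2, w=1, h=1)
```
]]></preformat>
<p>Because all features enter the distance unscaled, the feature with the largest numeric range dominates the clustering; a practical implementation would probably normalise or weight the components of <italic>F<sub>ij</sub></italic>.</p>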
<table-wrap id="t7-sensors-11-08164" position="anchor">
<label>Algorithm 2.</label>
<caption>
<p>The tracking algorithm used in the experiments.</p></caption>
<table frame="hsides" rules="none">
<tbody>
<tr>
<td colspan="2" align="left" valign="top"><bold>Require:</bold> An image sequence <italic>S.</italic></td></tr>
<tr>
<td align="left" valign="top">1:</td>
<td align="left" valign="top"><bold>for</bold> each time step <italic>t</italic> <bold>do</bold></td></tr>
<tr>
<td align="left" valign="top">2:</td>
<td align="left" valign="top">  <italic>I<sub>t</sub></italic> ← new image frame from <italic>S</italic></td></tr>
<tr>
<td align="left" valign="top">3:</td>
<td align="left" valign="top">  class-id = <bold>Segmentation</bold>(<italic>I<sub>t</sub></italic>)</td></tr>
<tr>
<td align="left" valign="top">4:</td>
<td align="left" valign="top">  <bold>for</bold> each object in class-id <bold>do</bold></td></tr>
<tr>
<td align="left" valign="top">5:</td>
<td align="left" valign="top">    Update the object’s surrounding box based on pixel positions of class-id</td></tr>
<tr>
<td align="left" valign="middle">6:</td>
<td align="left" valign="top">  <bold>end for</bold></td></tr>
<tr>
<td align="left" valign="bottom">7:</td>
<td align="left" valign="bottom"><bold>end for</bold></td></tr></tbody></table></table-wrap></sec>
<sec>
<label>4.</label>
<title>Application of the Multimodal Bioinspired Sensor to Mid-Level Vision Tasks</title>
<p>In this section, the whole system is characterized according to the computational resources needed and the throughput obtained. Also, for the sake of clarity some visual results and a comparison with similar approaches are presented.</p>
<sec>
<label>4.1.</label>
<title>Computational Resources</title>
<p>Regarding the hardware resources, the metrics used for the logic and the memory are the slice and Block RAM occupation indices. The software tool used to synthesize the final sensor on reconfigurable hardware (FPGA devices) is the ISE 12 suite [<xref ref-type="bibr" rid="b41-sensors-11-08164">41</xref>].</p>
<p>The slowest stage in the Low-level optical flow platform is Stage IV, while Stage II needs the maximum number of Block RAMs due to the computations performed, as shown in <xref ref-type="table" rid="t1-sensors-11-08164">Table 1</xref>. Stage V also needs a considerable number of slices due to the intensive use of multipliers. Some resources have been preserved in this implementation in order to be able to integrate the optical flow and all the orthogonal moments in a single system.</p>
<p>The number of cycles used (NC) determines the slowest stage, which limits the throughput of the final system. In this regard, the Xilinx timing analyzer tool [<xref ref-type="bibr" rid="b41-sensors-11-08164">41</xref>] reports frequencies around 25%–35% lower than the real frequencies tested in our experiments. <xref ref-type="table" rid="t1-sensors-11-08164">Table 1</xref> also shows the performance of the optical flow scheme based on chained stages; from the number of pixels processed per second, it is concluded that real-time estimation is possible at a resolution of 320 × 240 pixels.</p>
<p>The resources used by the Low-level orthogonal moments are presented in <xref ref-type="table" rid="t2-sensors-11-08164">Table 2</xref>. Although the moments <italic>L<sub>x</sub></italic> and <italic>L<sub>y</sub></italic> represent the slowest part of the Orthogonal Moment scheme and use more slices and Block RAMs than <italic>P<sub>x</sub></italic> and <italic>P<sub>y</sub></italic>, in general they do not impose a resource limitation on the whole system.</p>
<p>Once each separate stage corresponding to an early-vision primitive is properly implemented, the integration and processing of the complete system is needed. <xref ref-type="table" rid="t3-sensors-11-08164">Table 3</xref> shows how the limits of the bioinspired global sensor are imposed by the Low-level vision platform, with the Mid-level vision acting as a supplement in terms of resources needed. In fact, the implemented platform has adapted the resources in comparison with previous works [<xref ref-type="bibr" rid="b14-sensors-11-08164">14</xref>,<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>]; with this, the limit of the global system is imposed by the slowest stage, which waits for the information from the asynchronous pipeline to be processed. In this regard, it is important to note that, given how the architecture has been designed, the Mid-level task is one of the last stages of the pipeline. The hardware requirements in terms of slices, memory, number of cycles and performance for the implementation of the Multimodal Bioinspired Sensor can be seen in <xref ref-type="table" rid="t3-sensors-11-08164">Table 3</xref>.</p>
<p>Finally, <xref ref-type="table" rid="t4-sensors-11-08164">Table 4</xref> shows the throughput obtained by the global system for several input resolutions, expressed in Kpps (kilopixels per second) and fps (frames per second). The maximum performance of the global system reaches up to 2,000 Kpixels/second.</p></sec>
<sec sec-type="results">
<label>4.2.</label>
<title>Visual Results</title>
<p>Three different experiments on real input sequences captured with a static camera are presented. The Low-level vision output gives the optical flow estimation of each pixel in terms of modulus and phase. The modulus (how fast the processed pixel is moving) is represented with a gradient intensity code, where black means no motion and white represents high velocity values. The phase (the direction in which the processed pixel is moving) is represented using a colour coding, as shown in the colour boundary frame. According to this convention, downward motion is represented with blue tonalities, upward motion with yellow tonalities, and so on. Every pixel carries individual information about its modulus and phase, and every object carries information about its segmentation and its tracking surrounding area.</p>
<sec>
<label>4.2.1.</label>
<title>Experiment I</title>
<p>The first stimulus (<xref ref-type="fig" rid="f2-sensors-11-08164">Figure 2</xref>) represents two persons walking towards the left, with a little residual motion in the central part of the frame sequence (a), at a resolution of 128 × 128 pixels. The motion is marked with yellow lines to give a qualitative indication. Phase estimation indicates that most of the motion is towards the left (b). Modulus estimation gives a measure of the velocity of the pixels (c). Finally, the tracking task follows the three different segmented objects (e).</p></sec>
<sec>
<label>4.2.2.</label>
<title>Experiment II</title>
<p>The second stimulus is a traffic sequence transition (<xref ref-type="fig" rid="f3-sensors-11-08164">Figure 3</xref>). Different objects interact at different speeds (a), at a resolution of 128 × 128 pixels. Phase estimation indicates motion downwards, to the right and upwards (b). Modulus estimation again provides velocity values (c). The segmentation (d) and tracking (e) schemes process five shapes.</p></sec>
<sec>
<label>4.2.3.</label>
<title>Experiment III</title>
<p>The third stimulus represents a person spreading their arms and legs upwards and downwards (a), at a resolution of 256 × 164 pixels (<xref ref-type="fig" rid="f4-sensors-11-08164">Figure 4</xref>). Phase estimation provides blue, green and red colour values indicating motion towards the left, up and down (b). Modulus estimation shows the different velocity values (c). The segmentation (d) and tracking (e) schemes process six contours.</p></sec></sec>
<sec>
<label>4.3.</label>
<title>Comparison with Other Approaches</title>
<p>Comparisons with other embedded complex vision models are presented in <xref ref-type="table" rid="t5-sensors-11-08164">Table 5</xref>, which lists the motion computation family and the method used, together with the performance obtained and the computation density. Ideally every pixel value should be computed (100% density); nevertheless, some of the methods listed filter the inputs, reducing the processing space and thus the density.</p>
<p>There are many embedded engines for low-level vision [<xref ref-type="bibr" rid="b42-sensors-11-08164">42</xref>–<xref ref-type="bibr" rid="b46-sensors-11-08164">46</xref>]. This design reaches 2 Mpps and is able to deliver 26 frames/second at a resolution of 320 × 240 pixels with a complete computation density (100%), which is sufficient for automation applications such as a small robot. It is important to remark that this model links two different abstraction layers, providing a Mid-level vision output.</p>
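<p>The relation between the pixel throughput and the frame rate quoted above is simple arithmetic: the frame rate is the number of pixels processed per second divided by the number of pixels per frame. The short sketch below, whose function name is purely illustrative, reproduces the figure for 320 × 240 pixels under the assumption that 2 Mpps means 2,000,000 pixels per second.</p>
<preformat><![CDATA[
```python
def frames_per_second(pixels_per_second, width, height):
    """Frame rate sustained by a given pixel throughput at full density."""
    return pixels_per_second / (width * height)


# 2 Mpps at 320 x 240 pixels, with 2 Mpps taken as 2,000,000 pixels/s.
fps_320x240 = frames_per_second(2_000_000, 320, 240)
```
]]></preformat>
<p>The result is slightly above 26, which agrees with the 26 frames/second stated for this resolution.</p>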
<p>Other approaches are based on motion estimation models (low-level) that are not biologically plausible; by contrast, the optical flow part of the presented model has been shown [<xref ref-type="bibr" rid="b16-sensors-11-08164">16</xref>,<xref ref-type="bibr" rid="b17-sensors-11-08164">17</xref>] to recover motion patterns based on texture-defined contours (second-order motion) [<xref ref-type="bibr" rid="b47-sensors-11-08164">47</xref>,<xref ref-type="bibr" rid="b48-sensors-11-08164">48</xref>], which is very useful, e.g., for camouflage tasks and for predicting the behaviour of many optical illusions.</p></sec></sec>
<sec sec-type="discussion|conclusions">
<label>5.</label>
<title>Conclusions and Further Work</title>
<p>A complex bioinspired sensor, capable of computing multimodal low-level vision primitives to produce robust mid-level vision methods, is presented. The bioinspired sensor has been designed for Very Large Scale Integration (VLSI) using properties of the cortical motion pathway. This sensor combines low-level primitives (optical flow and image moments) in order to produce a mid-level vision abstraction layer. The whole system is scalable and modular, and it is also possible to select the visual primitives involved (number of moments) as well as the bit-width of the filters and the computation accuracy in the low-level vision (optical flow). This architecture can integrate different visual processing channels, so the proposed system makes possible the implementation of complex bioinspired algorithms on-chip.</p>
<p>In this respect, the integration of these low-level primitives through the proposed sensor has been applied to the design of a very efficient and robust visual tracking system. This specific system is robust in applications with high luminance variations and noisy environments. It is also useful in the research on the human perceptual system.</p>
<p>The integration of such different approaches represents a novel way of efficiently approaching complex computer vision systems. To the best of our knowledge, this is the first time that several low-level primitives are integrated with mid-level vision.</p>
<p>The integration of other low-level vision primitives such as phase, colour, motion, and binocular disparity is the next step in our research. It will also include mid-level inferences in the processing; hence, additional research will consider the combination of variant and invariant moments in the framework of low-level (pixel level) and mid-level (object level) vision, and its integration with the optical flow. This complex vision system is currently being built on modern FPGAs using VHDL.</p>
<p>Furthermore, the computation of the multi-scale optical flow based on different moment measurements, instead of using the gradient based approaches of pixel intensity changes, and its hardware implementation, is a direct extension that is suggested by the presented model.</p></sec></body>
<back>
<ack>
<p>This work has been partially supported by Spanish Project DPI2009-14552-C02-01. The authors wish to thank Alan Johnston and Jason L. Dale, from the Vision Group at University College London, for their great help and support with some of the previous works mentioned here.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-sensors-11-08164"><label>1.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Bruce</surname><given-names>V</given-names></name><name><surname>Green</surname><given-names>PR</given-names></name><name><surname>Georgeson</surname><given-names>MA</given-names></name></person-group><source>Visual Perception: Physiology, Psychology &amp; Ecology</source><edition>3rd ed</edition><publisher-name>Lawrence Erlbaum Associates</publisher-name><publisher-loc>Hove, UK</publisher-loc><year>1998</year></citation></ref>
<ref id="b2-sensors-11-08164"><label>2.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Szeliski</surname><given-names>R</given-names></name></person-group><source>Computer Vision: Algorithms and Applications</source><publisher-name>Springer</publisher-name><publisher-loc>Berlin, Germany</publisher-loc><year>2011</year></citation></ref>
<ref id="b3-sensors-11-08164"><label>3.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Guzel</surname><given-names>MS</given-names></name><name><surname>Bicker</surname><given-names>R</given-names></name></person-group><article-title>Optical Flow Based System Design for Mobile Robots</article-title><conf-name>Proceedings of the 2010 IEEE Conference on Robotics Automation and Mechatronics, Robotics Automation and Mechatronics (RAM)</conf-name><conf-loc>Singapore</conf-loc><conf-date>28–30 June 2010</conf-date><fpage>545</fpage><lpage>550</lpage></citation></ref>
<ref id="b4-sensors-11-08164"><label>4.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Sim</surname><given-names>KF</given-names></name><name><surname>Sundaraj</surname><given-names>K</given-names></name></person-group><article-title>Human Motion Tracking of Athlete Using Optical Flow and Artificial Markers</article-title><conf-name>Proceedings of the 2010 International Conference on Intelligent and Advanced Systems (ICIAS)</conf-name><conf-loc>Kuala Lumpur, Malaysia</conf-loc><conf-date>15–17 June 2010</conf-date><fpage>1</fpage><lpage>4</lpage></citation></ref>
<ref id="b5-sensors-11-08164"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papadopoulos</surname><given-names>GT</given-names></name><name><surname>Briassouli</surname><given-names>A</given-names></name><name><surname>Mezaris</surname><given-names>V</given-names></name><name><surname>Kompatsiaris</surname><given-names>I</given-names></name><name><surname>Strintzis</surname><given-names>MG</given-names></name></person-group><article-title>Statistical motion information extraction and representation for semantic video analysis</article-title><source>IEEE Trans. Circuits Syst. Video Technol</source><year>2009</year><volume>19</volume><fpage>1513</fpage><lpage>1528</lpage><pub-id pub-id-type="doi">10.1109/TCSVT.2009.2026932</pub-id></citation></ref>
<ref id="b6-sensors-11-08164"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname><given-names>C</given-names></name><name><surname>Chen</surname><given-names>Y</given-names></name></person-group><article-title>Motion estimation method using 3D steerable filter</article-title><source>Image Vis. Comput</source><year>1995</year><volume>13</volume><fpage>21</fpage><lpage>32</lpage><pub-id pub-id-type="doi">10.1016/0262-8856(95)91465-P</pub-id></citation></ref>
<ref id="b7-sensors-11-08164"><label>7.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lucas</surname><given-names>BD</given-names></name><name><surname>Kanade</surname><given-names>T</given-names></name></person-group><article-title>An Iterative Image Registration Technique with an Application to Stereo Vision</article-title><conf-name>Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI’81)</conf-name><conf-loc>Vancouver, BC, Canada</conf-loc><conf-date>24–28 August 1981</conf-date><fpage>674</fpage><lpage>679</lpage></citation></ref>
<ref id="b8-sensors-11-08164"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baker</surname><given-names>S</given-names></name><name><surname>Matthews</surname><given-names>I</given-names></name></person-group><article-title>Lucas-Kanade 20 years on: A unifying framework</article-title><source>Int. J. Comput. Vis</source><year>2004</year><volume>56</volume><fpage>221</fpage><lpage>255</lpage><pub-id pub-id-type="doi">10.1023/B:VISI.0000011205.11775.fd</pub-id></citation></ref>
<ref id="b9-sensors-11-08164"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Prokop</surname><given-names>RJ</given-names></name><name><surname>Reeves</surname><given-names>AP</given-names></name></person-group><article-title>A survey of moment-based techniques for unoccluded object representation and recognition</article-title><source>CVGIP: Graph. Models Image Process</source><year>1992</year><volume>54</volume><fpage>438</fpage><lpage>460</lpage><pub-id pub-id-type="doi">10.1016/1049-9652(92)90027-U</pub-id></citation></ref>
<ref id="b10-sensors-11-08164"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papakostas</surname><given-names>GA</given-names></name><name><surname>Koulouriotis</surname><given-names>DE</given-names></name><name><surname>Karakasis</surname><given-names>EG</given-names></name></person-group><article-title>A unified methodology for the efficient computation of discrete orthogonal image moments</article-title><source>Inf. Sci</source><year>2009</year><volume>176</volume><fpage>3619</fpage><lpage>3633</lpage></citation></ref>
<ref id="b11-sensors-11-08164"><label>11.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Flusser</surname><given-names>J</given-names></name></person-group><article-title>Moment Invariants in Image Analysis</article-title><conf-name>Proceedings of the World Academy of Science, Engineering and Technology</conf-name><conf-loc>Czech Republic</conf-loc><conf-date>February 2006</conf-date><volume>11</volume><fpage>196</fpage><lpage>201</lpage></citation></ref>
<ref id="b12-sensors-11-08164"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname><given-names>M-K</given-names></name></person-group><article-title>Pattern recognition by moment invariants</article-title><source>Proc. IRE</source><year>1961</year><volume>49</volume><fpage>1428</fpage></citation></ref>
<ref id="b13-sensors-11-08164"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname><given-names>M-K</given-names></name></person-group><article-title>Visual pattern recognition by moment invariants</article-title><source>IRE Trans. Inf. Theory</source><year>1962</year><volume>8</volume><fpage>179</fpage><lpage>187</lpage></citation></ref>
<ref id="b14-sensors-11-08164"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Botella</surname><given-names>G</given-names></name><name><surname>Meyer-Baese</surname><given-names>U</given-names></name><name><surname>García</surname><given-names>A</given-names></name></person-group><article-title>Bioinspired robust optical flow processor system for VLSI implementation</article-title><source>IEEE Electron. Lett</source><year>2009</year><volume>45</volume><fpage>1304</fpage><lpage>1306</lpage></citation></ref>
<ref id="b15-sensors-11-08164"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Botella</surname><given-names>G</given-names></name><name><surname>García</surname><given-names>A</given-names></name><name><surname>Rodríguez</surname><given-names>M</given-names></name><name><surname>Ros</surname><given-names>E</given-names></name><name><surname>Meyer-Baese</surname><given-names>U</given-names></name><name><surname>Molina</surname><given-names>MC</given-names></name></person-group><article-title>Robust bioinspired architecture for optical flow computation</article-title><source>IEEE Trans. VLSI Syst</source><year>2010</year><volume>18</volume><fpage>616</fpage><lpage>629</lpage><pub-id pub-id-type="doi">10.1109/TVLSI.2009.2013957</pub-id></citation></ref>
<ref id="b16-sensors-11-08164"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnston</surname><given-names>A</given-names></name><name><surname>Clifford</surname><given-names>CW</given-names></name></person-group><article-title>A unified account of three apparent motion illusions</article-title><source>Vis. Res</source><year>1995</year><volume>35</volume><fpage>1109</fpage><lpage>1123</lpage></citation></ref>
<ref id="b17-sensors-11-08164"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnston</surname><given-names>A</given-names></name><name><surname>Clifford</surname><given-names>CW</given-names></name></person-group><article-title>Perceived motion of contrast modulated gratings: Prediction of the McGM and the role of full-wave rectification</article-title><source>Vis. Res</source><year>1995</year><volume>35</volume><fpage>1771</fpage><lpage>1783</lpage><pub-id pub-id-type="doi">10.1016/0042-6989(94)00258-N</pub-id><pub-id pub-id-type="pmid">7660584</pub-id></citation></ref>
<ref id="b18-sensors-11-08164"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnston</surname><given-names>A</given-names></name><name><surname>McOwan</surname><given-names>PW</given-names></name><name><surname>Benton</surname><given-names>CP</given-names></name></person-group><article-title>Robust velocity computation from a biologically motivated model of motion perception</article-title><source>Proc. Biol. Sci</source><year>1999</year><volume>266</volume><fpage>509</fpage><lpage>518</lpage><pub-id pub-id-type="doi">10.1098/rspb.1999.0666</pub-id></citation></ref>
<ref id="b19-sensors-11-08164"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McOwan</surname><given-names>PW</given-names></name><name><surname>Benton</surname><given-names>C</given-names></name><name><surname>Dale</surname><given-names>J</given-names></name><name><surname>Johnston</surname><given-names>A</given-names></name></person-group><article-title>A multi-differential neuromorphic approach to motion detection</article-title><source>Int. J. Neural Syst</source><year>1999</year><volume>9</volume><fpage>429</fpage><lpage>434</lpage><pub-id pub-id-type="doi">10.1142/S0129065799000435</pub-id><pub-id pub-id-type="pmid">10630473</pub-id></citation></ref>
<ref id="b20-sensors-11-08164"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnston</surname><given-names>A</given-names></name><name><surname>McOwan</surname><given-names>PW</given-names></name><name><surname>Benton</surname><given-names>CP</given-names></name></person-group><article-title>Biological computation of image motion from flows over boundaries</article-title><source>J. Physiol. (Paris)</source><year>2003</year><volume>97</volume><fpage>325</fpage><lpage>334</lpage><pub-id pub-id-type="doi">10.1016/j.jphysparis.2003.09.016</pub-id></citation></ref>
<ref id="b21-sensors-11-08164"><label>21.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Lindeberg</surname><given-names>T</given-names></name><name><surname>Romeny</surname><given-names>B</given-names></name></person-group><article-title>Linear scale-space: I. Basic Theory, II. Early Visual Operations</article-title><source>Geometry-Driven Diffusion</source><publisher-name>Kluwer Academic Publishers</publisher-name><publisher-loc>Boston, MA, USA</publisher-loc><year>1994</year><fpage>1</fpage><lpage>77</lpage></citation></ref>
<ref id="b22-sensors-11-08164"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnston</surname><given-names>A</given-names></name><name><surname>McOwan</surname><given-names>PW</given-names></name><name><surname>Buxton</surname><given-names>HA</given-names></name></person-group><article-title>Computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells</article-title><source>Proc. R. Soc. London</source><year>1992</year><volume>250</volume><fpage>297</fpage><lpage>306</lpage><pub-id pub-id-type="doi">10.1098/rspb.1992.0162</pub-id></citation></ref>
<ref id="b23-sensors-11-08164"><label>23.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Nalwa</surname><given-names>VS</given-names></name></person-group><source>A Guided Tour of Computer Vision</source><publisher-name>Addison-Wesley</publisher-name><publisher-loc>Reading, MA, USA</publisher-loc><year>1993</year></citation></ref>
<ref id="b24-sensors-11-08164"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barron</surname><given-names>JL</given-names></name><name><surname>Fleet</surname><given-names>DJ</given-names></name><name><surname>Beauchemin</surname><given-names>SS</given-names></name></person-group><article-title>Performance of optical flow techniques</article-title><source>Int. J. Comput. Vis</source><year>1994</year><volume>12</volume><fpage>43</fpage><lpage>77</lpage><pub-id pub-id-type="doi">10.1007/BF01420984</pub-id></citation></ref>
<ref id="b25-sensors-11-08164"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hess</surname><given-names>RF</given-names></name><name><surname>Snowden</surname><given-names>RJ</given-names></name></person-group><article-title>Temporal frequency filters in the human peripheral visual field</article-title><source>Vis. Res</source><year>1992</year><volume>32</volume><fpage>61</fpage><lpage>72</lpage><pub-id pub-id-type="doi">10.1016/0042-6989(92)90113-W</pub-id><pub-id pub-id-type="pmid">1502812</pub-id></citation></ref>
<ref id="b26-sensors-11-08164"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lagae</surname><given-names>L</given-names></name><name><surname>Raiguel</surname><given-names>S</given-names></name><name><surname>Orban</surname><given-names>GA</given-names></name></person-group><article-title>Speed and direction selectivity of macaque middle temporal neurons</article-title><source>J. Neurophysiol</source><year>1993</year><volume>69</volume><fpage>19</fpage><lpage>39</lpage><pub-id pub-id-type="pmid">8433131</pub-id></citation></ref>
<ref id="b27-sensors-11-08164"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mikami</surname><given-names>A</given-names></name><name><surname>Newsome</surname><given-names>WT</given-names></name><name><surname>Wurtz</surname><given-names>RH</given-names></name></person-group><article-title>Motion selectivity in macaque visual cortex. I. Mechanisms of direction and speed selectivity in extrastriate area MT</article-title><source>J. Neurophysiol</source><year>1986</year><volume>55</volume><fpage>1308</fpage><lpage>1327</lpage><pub-id pub-id-type="pmid">3016210</pub-id></citation></ref>
<ref id="b28-sensors-11-08164"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McLeod</surname><given-names>P</given-names></name><name><surname>Dittrich</surname><given-names>W</given-names></name><name><surname>Driver</surname><given-names>J</given-names></name><name><surname>Perrett</surname><given-names>D</given-names></name><name><surname>Zihl</surname><given-names>J</given-names></name></person-group><article-title>Preserved and impaired detection of structure from motion by a motion-blind patient</article-title><source>Visual Cognit</source><year>1996</year><volume>3</volume><fpage>363</fpage><lpage>391</lpage><pub-id pub-id-type="doi">10.1080/135062896395634</pub-id></citation></ref>
<ref id="b29-sensors-11-08164"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Horn</surname><given-names>BKP</given-names></name><name><surname>Schunck</surname><given-names>BG</given-names></name></person-group><article-title>Determining optical flow</article-title><source>Artif. Intell</source><year>1981</year><volume>17</volume><fpage>185</fpage><lpage>203</lpage><pub-id pub-id-type="doi">10.1016/0004-3702(81)90024-2</pub-id></citation></ref>
<ref id="b30-sensors-11-08164"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Teh</surname><given-names>C-H</given-names></name><name><surname>Chin</surname><given-names>RT</given-names></name></person-group><article-title>On image analysis by the methods of moments</article-title><source>IEEE Trans. Pattern Anal. Mach. Intell</source><year>1988</year><volume>10</volume><fpage>496</fpage><lpage>513</lpage><pub-id pub-id-type="doi">10.1109/34.3913</pub-id></citation></ref>
<ref id="b31-sensors-11-08164"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>YN</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Wen</surname><given-names>CY</given-names></name></person-group><article-title>A new focus measure method using moments</article-title><source>Image Vis. Comput</source><year>2000</year><volume>18</volume><fpage>959</fpage><lpage>965</lpage><pub-id pub-id-type="doi">10.1016/S0262-8856(00)00038-X</pub-id></citation></ref>
<ref id="b32-sensors-11-08164"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papakostas</surname><given-names>GA</given-names></name><name><surname>Boutalis</surname><given-names>YS</given-names></name><name><surname>Karras</surname><given-names>DA</given-names></name><name><surname>Mertzios</surname><given-names>BG</given-names></name></person-group><article-title>A new class of Zernike moments for computer vision applications</article-title><source>Inf. Sci</source><year>2007</year><volume>177</volume><fpage>2802</fpage><lpage>2819</lpage><pub-id pub-id-type="doi">10.1016/j.ins.2007.01.010</pub-id></citation></ref>
<ref id="b33-sensors-11-08164"><label>33.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Papakostas</surname><given-names>GA</given-names></name><name><surname>Karakasis</surname><given-names>EG</given-names></name><name><surname>Koulouriotis</surname><given-names>DE</given-names></name></person-group><article-title>Exact and Speedy Computation of Legendre Moments on Binary Images</article-title><conf-name>Proceedings of the Eighth International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS ’07</conf-name><conf-loc>Santorini, Greece</conf-loc><conf-date>6–8 June 2007</conf-date></citation></ref>
<ref id="b34-sensors-11-08164"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papakostas</surname><given-names>GA</given-names></name><name><surname>Koulouriotis</surname><given-names>DE</given-names></name><name><surname>Karakasis</surname><given-names>EG</given-names></name></person-group><article-title>A unified methodology for the efficient computation of discrete orthogonal image moments</article-title><source>Inf. Sci</source><year>2009</year><volume>179</volume><fpage>3619</fpage><lpage>3633</lpage></citation></ref>
<ref id="b35-sensors-11-08164"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wee</surname><given-names>C-Y</given-names></name><name><surname>Paramesran</surname><given-names>R</given-names></name><name><surname>Takeda</surname><given-names>F</given-names></name></person-group><article-title>New computational methods for full and subset Zernike moments</article-title><source>Inf. Sci</source><year>2004</year><volume>159</volume><fpage>203</fpage><lpage>220</lpage><pub-id pub-id-type="doi">10.1016/j.ins.2003.08.006</pub-id></citation></ref>
<ref id="b36-sensors-11-08164"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sookhanaphibarn</surname><given-names>K</given-names></name><name><surname>Lursinsap</surname><given-names>C</given-names></name></person-group><article-title>A new feature extractor invariant to intensity, rotation, and scaling of color images</article-title><source>Inf. Sci</source><year>2006</year><volume>176</volume><fpage>2097</fpage><lpage>2119</lpage><pub-id pub-id-type="doi">10.1016/j.ins.2005.10.005</pub-id></citation></ref>
<ref id="b37-sensors-11-08164"><label>37.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin H</surname><given-names>JA</given-names></name><name><surname>Santos</surname><given-names>M</given-names></name><name><surname>de Lope</surname><given-names>J</given-names></name></person-group><article-title>Orthogonal variant moments features in image analysis</article-title><source>Inf. Sci</source><year>2010</year><volume>180</volume><fpage>846</fpage><lpage>860</lpage><pub-id pub-id-type="doi">10.1016/j.ins.2009.08.032</pub-id></citation></ref>
<ref id="b38-sensors-11-08164"><label>38.</label><citation citation-type="web"><source>Handel-C Language Reference Manual</source><publisher-name>Agility Design Solutions Inc</publisher-name><year>2008</year><comment>Available online: <ext-link xlink:href="http://www.mentor.com/products/fpga/handel-c/upload/handelc-reference.pdf" ext-link-type="uri">http://www.mentor.com/products/fpga/handel-c/upload/handelc-reference.pdf</ext-link> (accessed on 16 August 2011).</comment></citation></ref>
<ref id="b39-sensors-11-08164"><label>39.</label><citation citation-type="web"><person-group person-group-type="author"><collab>AlphaData RC1000 product</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.alpha-data.com" ext-link-type="uri">http://www.alpha-data.com</ext-link> (accessed on 16 August 2011).</comment></citation></ref>
<ref id="b40-sensors-11-08164"><label>40.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Frigo</surname><given-names>J</given-names></name></person-group><article-title>Evaluation of the StreamsC, CtoFPGA compiler: An applications perspective</article-title><conf-name>Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays</conf-name><conf-loc>Monterey, CA, USA</conf-loc><conf-date>11–13 February 2001</conf-date><fpage>134</fpage><lpage>140</lpage></citation></ref>
<ref id="b41-sensors-11-08164"><label>41.</label><citation citation-type="web"><person-group person-group-type="author"><collab>Software and Design Tools</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.xilinx.com/tools/designtools.htm" ext-link-type="uri">http://www.xilinx.com/tools/designtools.htm</ext-link> (accessed on 16 August 2011).</comment></citation></ref>
<ref id="b42-sensors-11-08164"><label>42.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wei</surname><given-names>Z</given-names></name><name><surname>Lee</surname><given-names>D-J</given-names></name><name><surname>Nelson</surname><given-names>BE</given-names></name><name><surname>Archibald</surname><given-names>JK</given-names></name><name><surname>Edwards</surname><given-names>BB</given-names></name></person-group><article-title>FPGA-based embedded motion estimation sensor</article-title><source>Int. J. Reconfig. Comput</source><year>2008</year><fpage>8</fpage><pub-id pub-id-type="doi">10.1155/2008/636145</pub-id></citation></ref>
<ref id="b43-sensors-11-08164"><label>43.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Díaz</surname><given-names>J</given-names></name><name><surname>Ros</surname><given-names>E</given-names></name><name><surname>Agís</surname><given-names>R</given-names></name><name><surname>Bernier</surname><given-names>JL</given-names></name></person-group><article-title>Superpipelined high-performance optical flow computation architecture</article-title><source>Comput. Vis. Image Underst</source><year>2008</year><volume>112</volume><fpage>262</fpage><lpage>273</lpage><pub-id pub-id-type="doi">10.1016/j.cviu.2008.05.006</pub-id></citation></ref>
<ref id="b44-sensors-11-08164"><label>44.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tomasi</surname><given-names>M</given-names></name><name><surname>Barranco</surname><given-names>F</given-names></name><name><surname>Vanegas</surname><given-names>M</given-names></name><name><surname>Díaz</surname><given-names>J</given-names></name><name><surname>Ros</surname><given-names>E</given-names></name></person-group><article-title>Fine grain pipeline architecture for high performance phase-based optical flow computation</article-title><source>J. Syst. Archit</source><year>2010</year><volume>56</volume><fpage>577</fpage><lpage>587</lpage><pub-id pub-id-type="doi">10.1016/j.sysarc.2010.07.012</pub-id></citation></ref>
<ref id="b45-sensors-11-08164"><label>45.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Sosa</surname><given-names>JC</given-names></name><name><surname>Gomez-Fabela</surname><given-names>R</given-names></name><name><surname>Boluda</surname><given-names>JA</given-names></name><name><surname>Pardo</surname><given-names>F</given-names></name></person-group><article-title>Change-Driven Image Architecture on FPGA with Adaptive Threshold for Optical-Flow Computation</article-title><conf-name>Proceedings of the IEEE International Conference on Reconfigurable Computing and FPGA’s, ReConFig 2006</conf-name><conf-loc>San Luis Potosí, México</conf-loc><conf-date>20–22 September 2006</conf-date><fpage>1</fpage><lpage>8</lpage></citation></ref>
<ref id="b46-sensors-11-08164"><label>46.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahalingam</surname><given-names>V</given-names></name><name><surname>Bhattacharya</surname><given-names>K</given-names></name><name><surname>Ranganathan</surname><given-names>N</given-names></name><name><surname>Chakravarthula</surname><given-names>H</given-names></name><name><surname>Murphy</surname><given-names>RR</given-names></name><name><surname>Pratt</surname><given-names>KS</given-names></name></person-group><article-title>A VLSI architecture and algorithm for Lucas-Kanade-based optical flow computation</article-title><source>IEEE Trans. VLSI Syst</source><year>2010</year><volume>18</volume><fpage>29</fpage><lpage>38</lpage><pub-id pub-id-type="doi">10.1109/TVLSI.2008.2006900</pub-id></citation></ref>
<ref id="b47-sensors-11-08164"><label>47.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chubb</surname><given-names>C</given-names></name><name><surname>Sperling</surname><given-names>G</given-names></name></person-group><article-title>Drift-balanced random stimuli: A general basis for studying non-Fourier motion perception</article-title><source>J. Opt. Soc. Am. A</source><year>1988</year><volume>5</volume><fpage>1986</fpage><lpage>2007</lpage><pub-id pub-id-type="doi">10.1364/JOSAA.5.001986</pub-id><pub-id pub-id-type="pmid">3210090</pub-id></citation></ref>
<ref id="b48-sensors-11-08164"><label>48.</label><citation citation-type="web"><person-group person-group-type="author"><collab>First-Order and Second-Order Motion Demos</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.snl.salk.edu/~maarten/demos/2nd.html" ext-link-type="uri">http://www.snl.salk.edu/~maarten/demos/2nd.html</ext-link> (accessed on 16 August 2011).</comment></citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures and Tables</title>
<fig id="f1-sensors-11-08164" position="float">
<label>Figure 1.</label>
<caption>
<p>Scheme of the VLSI architecture of the Multi-Modal Sensor implemented in the FPGA.</p></caption>
<graphic xlink:href="sensors-11-08164f1.gif"/></fig>
<fig id="f2-sensors-11-08164" position="float">
<label>Figure 2.</label>
<caption>
<p>Results from Experiment I.</p></caption>
<graphic xlink:href="sensors-11-08164f2.gif"/></fig>
<fig id="f3-sensors-11-08164" position="float">
<label>Figure 3.</label>
<caption>
<p>Results from Experiment II.</p></caption>
<graphic xlink:href="sensors-11-08164f3.gif"/></fig>
<fig id="f4-sensors-11-08164" position="float">
<label>Figure 4.</label>
<caption>
<p>Results from Experiment III.</p></caption>
<graphic xlink:href="sensors-11-08164f4.gif"/></fig>
<table-wrap id="t1-sensors-11-08164" position="float">
<label>Table 1.</label>
<caption>
<p>Slices, memory requirements, number of cycles, and performance for the low-level vision implementation (optical flow scheme).</p></caption>
<table frame="box" rules="all">
<thead>
<tr content-type="background-color:#A8A8A8">
<th align="left" valign="middle"><bold>Low-level Vision Stage (Optical flow)</bold></th>
<th align="center" valign="middle"><bold>FIR Temporal Filtering I</bold></th>
<th align="center" valign="middle"><bold>FIR Spatial Filtering II</bold></th>
<th align="center" valign="middle"><bold>Steering III</bold></th>
<th align="center" valign="middle"><bold>Product &amp; Taylor IV</bold></th>
<th align="center" valign="middle"><bold>Quotient V</bold></th>
<th align="center" valign="middle"><bold>Primitives VI</bold></th></tr></thead>
<tbody>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Slices (%)</td>
<td align="center" valign="top">190 (1%)</td>
<td align="center" valign="top">1307 (7%)</td>
<td align="center" valign="top">1206 (6%)</td>
<td align="center" valign="top">3139 (19%)</td>
<td align="center" valign="top">3646 (20%)</td>
<td align="center" valign="top">2354 (12%)</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Block RAM (%)</td>
<td align="center" valign="top">1%</td>
<td align="center" valign="top">31%</td>
<td align="center" valign="top">2%</td>
<td align="center" valign="top">13%</td>
<td align="center" valign="top">16%</td>
<td align="center" valign="top">19%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">MC</td>
<td align="center" valign="top">13</td>
<td align="center" valign="top">17</td>
<td align="center" valign="top">19</td>
<td align="center" valign="top">23</td>
<td align="center" valign="top">21</td>
<td align="center" valign="top">19</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Throughput (Kpixels/s)/Frequency limited by ISE tool (MHz)</td>
<td align="center" valign="top">4846/63</td>
<td align="center" valign="top">3235/55</td>
<td align="center" valign="top">2526/48</td>
<td align="center" valign="top">1782/41</td>
<td align="center" valign="top">1695/39</td>
<td align="center" valign="top">2000/38</td></tr></tbody></table></table-wrap>
<table-wrap id="t2-sensors-11-08164" position="float">
<label>Table 2.</label>
<caption>
<p>Slices, memory requirements, number of cycles, and performance for the low-level vision implementation (orthogonal variant moments scheme).</p></caption>
<table frame="box" rules="all">
<thead>
<tr content-type="background-color:#A8A8A8">
<th align="left" valign="middle"><bold>Low-level Vision Stage (Orthogonal Variant Moments)</bold></th>
<th align="center" valign="middle"><bold>Area (M<sub>I</sub>)</bold></th>
<th align="center" valign="middle"><bold><italic>L<sub>X</sub></italic> (M<sub>II</sub>)</bold></th>
<th align="center" valign="middle"><bold><italic>L<sub>Y</sub></italic> (M<sub>III</sub>)</bold></th>
<th align="center" valign="middle"><bold><italic>P<sub>X</sub></italic> (M<sub>IV</sub>)</bold></th>
<th align="center" valign="middle"><bold><italic>P<sub>Y</sub></italic> (M<sub>V</sub>)</bold></th></tr></thead>
<tbody>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Slices (%)</td>
<td align="center" valign="top">321 (2%)</td>
<td align="center" valign="top">1245 (7%)</td>
<td align="center" valign="top">1245 (7%)</td>
<td align="center" valign="top">658 (4%)</td>
<td align="center" valign="top">658 (4%)</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Block RAM (%)</td>
<td align="center" valign="top">1%</td>
<td align="center" valign="top">4%</td>
<td align="center" valign="top">4%</td>
<td align="center" valign="top">3%</td>
<td align="center" valign="top">3%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">MC</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">5</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Throughput (Kpixels/s)/Frequency limited by ISE tool (MHz)</td>
<td colspan="5" align="center" valign="middle">4546/49</td></tr></tbody></table></table-wrap>
<table-wrap id="t3-sensors-11-08164" position="float">
<label>Table 3.</label>
<caption>
<p>Slices, memory requirements, number of cycles, and performance for the low- and mid-level vision implementation (multimodal bioinspired sensor).</p></caption>
<table frame="box" rules="all">
<thead>
<tr content-type="background-color:#A8A8A8">
<th align="left" valign="top"><bold>COMPLETE Mid-level and Low-level Vision</bold></th>
<th align="center" valign="top"><bold>Orthogonal Variant Moments (Low-Level)</bold></th>
<th align="center" valign="top"><bold>Motion Estimation (Low-Level)</bold></th>
<th align="center" valign="top"><bold>Tracking &amp; Segmentation Unit (Mid-Level)</bold></th>
<th align="center" valign="top"><bold>Multimodal Bioinspired Sensor (Mid-level &amp; Low-Level)</bold></th></tr></thead>
<tbody>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Slices (%)</td>
<td align="center" valign="top">4127 (24%)</td>
<td align="center" valign="top">11842 (65%)</td>
<td align="center" valign="top">1304 (6%)</td>
<td align="center" valign="top">17710 (97%)</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Block RAM (%)</td>
<td align="center" valign="top">15%</td>
<td align="center" valign="top">80%</td>
<td align="center" valign="top">4%</td>
<td align="center" valign="top">99%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">MC (limiting)</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">29</td>
<td align="center" valign="top">18</td>
<td align="center" valign="top">29</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Throughput (Kpixels/s)/Frequency limited by ISE tool (MHz)</td>
<td align="center" valign="top">4546/49</td>
<td align="center" valign="top">2000/38</td>
<td align="center" valign="top">2000/38</td>
<td align="center" valign="top">2000/38</td></tr></tbody></table></table-wrap>
<table-wrap id="t4-sensors-11-08164" position="float">
<label>Table 4.</label>
<caption>
<p>Throughput in Kpixels/s and frames/s for the embedded sensor.</p></caption>
<table frame="box" rules="all">
<thead>
<tr content-type="background-color:#A8A8A8">
<th align="left" valign="middle"><bold>COMPLETE Mid-level and Low-level Vision</bold></th>
<th align="center" valign="middle"><bold>Orthogonal Variant Moments (Low-Level)</bold></th>
<th align="center" valign="middle"><bold>Motion Estimation (Low-Level)</bold></th>
<th align="center" valign="middle"><bold>Multimodal Bioinspired Sensor (Mid-level &amp; Low-Level)</bold></th></tr></thead>
<tbody>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">resolution 120 × 96</td>
<td align="center" valign="top">395 frames/s</td>
<td align="center" valign="top">174 frames/s</td>
<td align="center" valign="top">174 frames/s</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">resolution 320 × 240</td>
<td align="center" valign="top">59 frames/s</td>
<td align="center" valign="top">26 frames/s</td>
<td align="center" valign="top">26 frames/s</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">resolution 640 × 480</td>
<td align="center" valign="top">28 frames/s</td>
<td align="center" valign="top">14 frames/s</td>
<td align="center" valign="top">14 frames/s</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="top">Throughput</td>
<td align="center" valign="top">4546 Kpixels/s</td>
<td align="center" valign="top">2000 Kpixels/s</td>
<td align="center" valign="top">2000 Kpixels/s</td></tr></tbody></table></table-wrap>
<table-wrap id="t5-sensors-11-08164" position="float">
<label>Table 5.</label>
<caption>
<p>Comparison with other complex system vision approaches.</p></caption>
<table frame="box" rules="all">
<thead>
<tr content-type="background-color:#A8A8A8">
<th align="left" valign="middle"><bold>Models</bold></th>
<th align="left" valign="middle"><bold>Family</bold></th>
<th align="left" valign="middle"><bold>Method</bold></th>
<th align="left" valign="middle"><bold>Throughput (Mpixel/s)</bold></th>
<th align="left" valign="middle"><bold>Density</bold></th></tr></thead>
<tbody>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Present work</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">Enhanced McGM and Orthogonal variant moments</td>
<td align="center" valign="middle">2</td>
<td align="center" valign="middle">100%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Botella <italic>et al.</italic> [<xref ref-type="bibr" rid="b14-sensors-11-08164">14</xref>,<xref ref-type="bibr" rid="b15-sensors-11-08164">15</xref>] (2009, 2010)</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">McGM</td>
<td align="center" valign="middle">0.2</td>
<td align="center" valign="middle">100%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Wei <italic>et al.</italic> [<xref ref-type="bibr" rid="b42-sensors-11-08164">42</xref>] (2008)</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">Horn &amp; Schunck</td>
<td align="center" valign="middle">4</td>
<td align="center" valign="middle">100%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Diaz <italic>et al.</italic> [<xref ref-type="bibr" rid="b43-sensors-11-08164">43</xref>] (2007)</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">Lucas &amp; Kanade</td>
<td align="center" valign="middle">82</td>
<td align="center" valign="middle">57.2%</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Tomasi <italic>et al.</italic> [<xref ref-type="bibr" rid="b44-sensors-11-08164">44</xref>] (2010)</td>
<td align="center" valign="middle">Energy</td>
<td align="center" valign="middle">Phase Based</td>
<td align="center" valign="middle">49</td>
<td align="center" valign="middle">not provided</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Sosa <italic>et al.</italic> [<xref ref-type="bibr" rid="b45-sensors-11-08164">45</xref>] (2006)</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">Horn &amp; Schunck</td>
<td align="center" valign="middle">1.8</td>
<td align="center" valign="middle">not provided</td></tr>
<tr>
<td content-type="background-color:#A8A8A8" align="left" valign="middle">Mahalingam <italic>et al.</italic> [<xref ref-type="bibr" rid="b46-sensors-11-08164">46</xref>] (2010)</td>
<td align="center" valign="middle">Gradient</td>
<td align="center" valign="middle">Lucas &amp; Kanade</td>
<td align="center" valign="middle">9.9</td>
<td align="center" valign="middle">6.3%</td></tr></tbody></table></table-wrap></sec></back></article>
