<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s101009194</article-id>
<article-id pub-id-type="publisher-id">sensors-10-09194</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Design of Belief Propagation Based on FPGA for the Multistereo CAFADIS Camera</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Magdaleno</surname><given-names>Eduardo</given-names></name><xref ref-type="corresp" rid="c1-sensors-10-09194">*</xref></contrib>
<contrib contrib-type="author">
<name><surname>Lüke</surname><given-names>Jonás Philipp</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Rodríguez</surname><given-names>Manuel</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Rodríguez-Ramos</surname><given-names>José Manuel</given-names></name></contrib>
<aff id="af1-sensors-10-09194">Departamento de Física Fundamental y Experimental, Electrónica y Sistemas, University of La Laguna, Avd. Francisco Sanchez s/n, 38203 La Laguna, Spain; E-Mails: <email>jpluke@ull.es</email> (J.P.L.); <email>mrvalido@ull.es</email> (M.R.); <email>jmramos@ull.es</email> (J.M.-R.)</aff></contrib-group>
<author-notes>
<corresp id="c1-sensors-10-09194">
<label>*</label> Author to whom correspondence should be addressed; E-Mail: <email>emagcas@ull.es</email>; Tel.: +34-922-84-50-35; Fax: +34-922-318-228.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2010</year></pub-date>
<pub-date pub-type="epub">
<day>15</day>
<month>10</month>
<year>2010</year></pub-date>
<volume>10</volume>
<issue>10</issue>
<fpage>9194</fpage>
<lpage>9210</lpage>
<history>
<date date-type="received">
<day>18</day>
<month>8</month>
<year>2010</year></date>
<date date-type="rev-recd">
<day>20</day>
<month>9</month>
<year>2010</year></date>
<date date-type="accepted">
<day>29</day>
<month>9</month>
<year>2010</year></date></history>
<permissions>
<copyright-statement>© 2010 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2010</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>In this paper we describe a fast, specialized hardware implementation of the belief propagation algorithm for the CAFADIS camera, a new plenoptic sensor patented by the University of La Laguna. This camera captures the lightfield of the scene and can be used to find out at which depth each pixel is in focus. The algorithm has been designed for FPGA devices using VHDL. We propose a parallel, pipelined architecture that implements the algorithm without external memory. Although the use of the device's BRAM resources increases considerably, real-time constraints can be met by exploiting parallelism and by accessing several memories simultaneously. Results obtained with 16-bit precision are very close to those of the original algorithm programmed in Matlab.</p></abstract>
<kwd-group>
<kwd>plenoptic sensors</kwd>
<kwd>FPGA</kwd>
<kwd>real-time processing</kwd>
<kwd>depth estimation</kwd>
<kwd>multistereo</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>3D reconstruction has been a very active research field for many years. The problem can be approached with active techniques, in which the system interacts with the scene, or with passive techniques, in which the system instead captures images from several viewpoints in order to reconstruct the depth information of the scene.</p>
<p>With passive techniques, two views are enough to reconstruct 3D information from the scene by means of a stereo algorithm. These techniques can be generalized to more than two views, in which case they are called multistereo techniques. Both dual stereo and multistereo are generally based on finding a correspondence between the pixels of several images taken from different viewpoints. This is called the correspondence problem, and it generally requires an optimization process to find the best correspondence between pixels.</p>
<p>The correspondence problem can be solved within the Markov Random Field (MRF) framework [<xref ref-type="bibr" rid="b1-sensors-10-09194">1</xref>–<xref ref-type="bibr" rid="b3-sensors-10-09194">3</xref>]. However, this yields an optimization problem that is NP-hard. Satisfactory techniques have been developed to find approximate solutions, namely graph cuts and belief propagation. Although these techniques produce good results, they are computationally very demanding compared with other techniques and are therefore slow. This is an impediment when 3D reconstruction requires real-time performance, for example in a 3DTV video camera.</p>
<p>CAFADIS is a 3D video camera patented by the University of La Laguna that performs depth reconstruction in real time. The CAFADIS camera is an intermediate sensor between the Shack-Hartmann and the pyramid sensor [<xref ref-type="bibr" rid="b4-sensors-10-09194">4</xref>]. It uses a plenoptic camera configuration in order to capture multiview information (it samples an image plane using a microlens array) [<xref ref-type="bibr" rid="b4-sensors-10-09194">4</xref>]. This multiview information is composed of hundreds of images taken from slightly different points of view that are captured with a single lens, single body device. Plenoptic sensors capture the lightfield of the scene and can be used to synthesize a set of photographs focused at different depths in the scene [<xref ref-type="bibr" rid="b4-sensors-10-09194">4</xref>–<xref ref-type="bibr" rid="b9-sensors-10-09194">9</xref>]. The image captured by the CAFADIS sensor can be regarded as four-dimensional: two CCD coordinates associated with each microlens and a further two coordinates stemming from the microlens array. These can then be used to estimate a focus measure that serves as the cost function to be optimized in order to find the depth at which each pixel is in focus [<xref ref-type="bibr" rid="b10-sensors-10-09194">10</xref>]. As a consequence, a depth value can be assigned to each pixel. This 3D map, combined with the 2D scene image, can be used as input for a 3D display. This optimization process can also be carried out within an MRF framework by means of the belief propagation algorithm.</p>
<p>The optimization process is very slow, so specific hardware has to be used to achieve real-time performance. A first prototype of the CAFADIS camera for 3D reconstruction was built using a computer equipped with multiple Graphics Processing Units (GPUs), with satisfactory results [<xref ref-type="bibr" rid="b4-sensors-10-09194">4</xref>,<xref ref-type="bibr" rid="b10-sensors-10-09194">10</xref>]. However, this hardware has the disadvantage of not being portable. The goal now is to obtain full portability with a single lens, single body optical configuration and specific parallel hardware programmed on Field Programmable Gate Arrays (FPGAs).</p>
<p>FPGA technology makes sensor applications small-sized (portable), flexible, customizable, reconfigurable and reprogrammable, with the advantages of cost-effectiveness, integration, accessibility and expandability [<xref ref-type="bibr" rid="b11-sensors-10-09194">11</xref>]. Moreover, an FPGA can accelerate the sensor calculations thanks to its architecture: it offers extremely high-performance signal processing and conditioning capabilities through parallelism based on slices and arithmetic circuits, together with highly flexible interconnection possibilities [<xref ref-type="bibr" rid="b12-sensors-10-09194">12</xref>]. Furthermore, FPGA technology is an alternative to custom integrated circuits (ICs) for implementing logic. Custom integrated circuits (ASICs) are expensive to develop, and their long design time delays time to market. Thanks to computer-aided design tools, FPGA circuits can be implemented in a relatively short space of time [<xref ref-type="bibr" rid="b13-sensors-10-09194">13</xref>]. These features make FPGA technology an important consideration in current sensor applications. Recent examples of sensor developments using FPGAs are the works of Rodriguez-Donate <italic>et al</italic>. [<xref ref-type="bibr" rid="b14-sensors-10-09194">14</xref>], Moreno-Tapia <italic>et al</italic>. [<xref ref-type="bibr" rid="b15-sensors-10-09194">15</xref>], Trejo-Hernandez <italic>et al</italic>. [<xref ref-type="bibr" rid="b16-sensors-10-09194">16</xref>] and Zhang <italic>et al</italic>. [<xref ref-type="bibr" rid="b17-sensors-10-09194">17</xref>].</p>
<p>The main objective of this work is therefore to select an efficient belief propagation algorithm and to implement it on an FPGA platform, paving the way for meeting the real-time processing and size requirements of the CAFADIS camera. The fast, specialized hardware implementation of the belief propagation algorithm was carried out and compared favorably with other existing FPGA-based implementations of the same algorithm.</p>
<p>The rest of the paper is structured as follows: Section 2 describes the belief propagation algorithm, Section 3 describes the design of the architecture, Section 4 presents the results obtained and, finally, the conclusions and future work are presented.</p></sec>
<sec>
<label>2.</label>
<title>Belief Propagation Algorithm</title>
<p>The belief propagation algorithm [<xref ref-type="bibr" rid="b1-sensors-10-09194">1</xref>] is used to optimize an energy function in an MRF framework. The energy function <italic>E</italic> is composed of a data term <italic>E<sub>d</sub></italic> and a smoothness term <italic>E<sub>s</sub></italic>, <italic>E = E<sub>d</sub></italic> <italic>+ λE<sub>s</sub></italic>, where the parameter <italic>λ</italic> measures the relative importance of each term. The data term is the sum of the per-pixel data costs, <italic>E<sub>d</sub></italic> <italic>= Σ<sub>p</sub></italic> <italic>c<sub>p</sub>(d<sub>p</sub>)</italic>, where, in this case, <italic>c<sub>p</sub>(d<sub>p</sub>)</italic> is the focus measure at pixel <bold><italic>p</italic></bold> for depth level <italic>d<sub>p</sub></italic>, taken from the previously synthesized set of photographs focused at different depths. The smoothness term is based on the 4-connected neighbors of each pixel and can be written as <italic>E<sub>s</sub></italic> <italic>= Σ<sub>p,q</sub></italic> <italic>V<sub>pq</sub>(d<sub>p</sub>, d<sub>q</sub>)</italic> where <bold><italic>p</italic></bold> and <bold><italic>q</italic></bold> are two neighboring pixels. Although there are other ways to define <italic>V<sub>pq</sub>(d<sub>p</sub>, d<sub>q</sub>)</italic>, the following definition was used here:
<disp-formula id="FD1">
<label>(1)</label>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>V</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext mathvariant="bold">pq</mml:mtext></mml:mrow></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mrow>
<mml:mo>,</mml:mo>
<mml:mo> </mml:mo></mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mn>0</mml:mn></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mi>   </mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mn>1</mml:mn></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mtext>otherwise</mml:mtext></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
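<p>As a minimal illustration of the energy being minimized, the following Python sketch evaluates <italic>E = E<sub>d</sub> + λE<sub>s</sub></italic> for a small depth labelling under the smoothness term of Equation 1. The function names, the array layout (<italic>cost[y][x][d]</italic>), the choice of counting each neighboring pair once and the toy values are assumptions made for this example, not part of the original design.</p>
<preformat>
```python
def potts(dp, dq):
    """Smoothness term of Equation 1: 0 for equal labels, 1 otherwise."""
    return 0 if dp == dq else 1


def energy(cost, labels, lam):
    """E = E_d + lam * E_s on a 4-connected grid.

    cost[y][x][d] is the data cost c_p(d); labels[y][x] is the depth level d_p.
    Assumption: each neighbouring pair is counted once (right and down only).
    """
    ny, nx = len(labels), len(labels[0])
    e_data = sum(cost[y][x][labels[y][x]] for y in range(ny) for x in range(nx))
    e_smooth = 0
    for y in range(ny):
        for x in range(nx):
            if x + 1 != nx:
                e_smooth += potts(labels[y][x], labels[y][x + 1])
            if y + 1 != ny:
                e_smooth += potts(labels[y][x], labels[y + 1][x])
    return e_data + lam * e_smooth


# Toy 2 x 2 image with two depth levels (hypothetical values).
toy_cost = [[[0, 3], [0, 3]],
            [[0, 3], [2, 1]]]
print(energy(toy_cost, [[0, 0], [0, 0]], lam=2))  # 2
print(energy(toy_cost, [[0, 0], [0, 1]], lam=2))  # 5
```
</preformat>
<p>In this toy case the isolated label at the last pixel lowers the data term by one but adds two discontinuities, so with <italic>λ</italic> = 2 the uniform labelling has the lower energy.</p>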
<p>The energy function is optimized using an iterative message passing scheme that passes messages over the 4-connected neighbors of each pixel in the image grid. Each message consists of a vector of <italic>k</italic> entries, one for each depth level taken into account, and is computed using the following update rule:
<disp-formula id="FD2">
<label>(2)</label>
<mml:math display="block">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mi>i</mml:mi></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>N</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">s</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>λ</mml:mi>
<mml:mo>⋅</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>V</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext mathvariant="bold">pq</mml:mtext></mml:mrow></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mi>i</mml:mi></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> is the message passed from pixel <bold><italic>p</italic></bold> to pixel <bold><italic>q</italic></bold> for depth level <italic>d<sub>q</sub></italic> at iteration <italic>i</italic>, <italic>N</italic>(<bold><italic>p</italic></bold>) is the four-connected neighborhood of pixel <bold><italic>p</italic></bold> and <italic>μ ∈</italic> (<italic>0,1</italic>].</p>
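<p>The update rule of Equation 2 can be sketched directly in Python as follows. This is a plain reference version and not the hardware data path; the function name and argument layout are hypothetical, and the sketch reads the equation literally, with <italic>μ</italic> scaling only the sum of incoming messages.</p>
<preformat>
```python
def update_message_naive(cost_p, incoming, back, lam, mu=1.0):
    """Message from p to q following Equation 2, in O(k^2) time.

    cost_p   : list of k data costs c_p(d_p)
    incoming : messages sent to p by all of its neighbours at the previous
               iteration (each a list of k values, the one from q included)
    back     : the message sent from q to p, which is subtracted again
    """
    k = len(cost_p)
    h = [cost_p[d] + mu * sum(m[d] for m in incoming) - back[d] for d in range(k)]
    # For every target level d_q, search all source levels d_p.
    return [min(h[dp] + lam * (0 if dp == dq else 1) for dp in range(k))
            for dq in range(k)]


zeros = [0, 0, 0]
print(update_message_naive([1, 4, 2], [zeros] * 4, zeros, lam=2))  # [1.0, 3.0, 2.0]
```
</preformat>
<p>The nested minimum over <italic>d<sub>p</sub></italic> for every <italic>d<sub>q</sub></italic> is what makes this form quadratic in the number of depth levels.</p>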
<p>After a certain number of iterations <italic>I</italic>, the algorithm is expected to converge to the solution. The belief vector of every pixel is then computed to obtain the depth level at which the pixel is in focus and, from it, the depth of the object imaged at that pixel. The belief vector for pixel <bold><italic>q</italic></bold> is computed as follows:
<disp-formula id="FD3">
<label>(3)</label>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>N</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mi>I</mml:mi></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>The depth value for pixel <bold><italic>q</italic></bold> is the depth level <italic>d<sub>q</sub></italic> with minimum belief value. This general form of the message passing rule requires <italic>O(k<sup>2</sup></italic> <italic>n I)</italic> execution time, where <italic>k</italic> is the number of depth levels, <italic>n</italic> is the number of pixels in the image and <italic>I</italic> is the number of iterations. Notice that the messages for all pixels could be computed in parallel, taking <italic>O(k<sup>2</sup>)</italic> time for each iteration. Using the techniques described in [<xref ref-type="bibr" rid="b1-sensors-10-09194">1</xref>], the timing requirements and arithmetic resources can be reduced drastically. This is a benefit when implementing the algorithm on an FPGA, since fewer of its valuable resources are needed for each pixel.</p>
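<p>A minimal Python sketch of the belief computation and depth selection is given below. It assumes that the belief of pixel <bold><italic>q</italic></bold> adds the messages arriving at <bold><italic>q</italic></bold> from its four neighbors to the data cost; the function names and the toy values are hypothetical.</p>
<preformat>
```python
def belief(cost_q, incoming, mu=1.0):
    """Belief vector: data cost plus the weighted sum of incoming messages."""
    return [c + mu * sum(m[d] for m in incoming) for d, c in enumerate(cost_q)]


def depth_level(cost_q, incoming, mu=1.0):
    """Index of the depth level with minimum belief (first one on ties)."""
    b = belief(cost_q, incoming, mu)
    return min(range(len(b)), key=b.__getitem__)


# Hypothetical pixel with three depth levels and four incoming messages.
cost_q = [2, 1, 5]
incoming = [[0, 3, 0], [0, 2, 1], [0, 0, 0], [1, 0, 0]]
print(belief(cost_q, incoming))       # [3.0, 6.0, 6.0]
print(depth_level(cost_q, incoming))  # 0
```
</preformat>
<p>In this example the data cost alone would select level 1, while the messages from the neighborhood move the decision to level 0.</p>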
<p>Two of the approaches used in [<xref ref-type="bibr" rid="b1-sensors-10-09194">1</xref>] in order to save computation time and memory are to transform the quadratic update rule into a linear update rule taking into account the particular structure of <italic>V<sub>pq</sub>(d<sub>p</sub>, d<sub>q</sub>)</italic> and to use a bipartite graph approach in order to perform the computations in place and in half the time.</p>
<p>The transformation of the general message update rule gives the following update rule:
<disp-formula id="FD4">
<label>(4)</label>
<mml:math display="block">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mi>i</mml:mi></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mtext>min</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>λ</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>with:
<disp-formula id="FD5">
<label>(5)</label>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>N</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">s</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">p</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></disp-formula></p>
<p>This allows computation of the message update for each pixel in <italic>O(k)</italic> time and allows a saving in arithmetic resources.</p>
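<p>The saving can be illustrated with a short Python sketch that compares the linear rule of Equation 4 with the quadratic search over all pairs of depth levels. The function names are hypothetical, <italic>h</italic> is assumed to have been computed already with Equation 5, and the equivalence relies on <italic>λ</italic> ≥ 0.</p>
<preformat>
```python
def update_message_linear(h, lam):
    """Equation 4 in O(k): min(h(d_q), min over d_p of h(d_p) + lam)."""
    floor = min(h) + lam          # one pass finds the global minimum
    return [min(v, floor) for v in h]


def update_message_quadratic(h, lam):
    """Reference O(k^2) form using the 0/1 smoothness term of Equation 1."""
    k = len(h)
    return [min(h[dp] + lam * (0 if dp == dq else 1) for dp in range(k))
            for dq in range(k)]


h = [1, 4, 2]
print(update_message_linear(h, 2))     # [1, 3, 2]
print(update_message_quadratic(h, 2))  # [1, 3, 2]
```
</preformat>
<p>Only one minimum search and one comparison per depth level remain, which is the kind of reduction in arithmetic that the text refers to.</p>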
<p>On the other hand, the image grid can be split into two sets so that the outgoing messages of a pixel in set A depend only on the incoming messages from neighbors in set B, and <italic>vice-versa</italic>. This gives a checkerboard-like pattern, where the message update rule is computed at odd iterations for pixels in set A and at even iterations for pixels in set B. With this approach, all messages are initialized to zero and the update is then performed alternately on the two sets. Once the algorithm converges to the solution, the belief vector can be computed in the usual way, since no significant difference is expected between the message of one iteration and that of the previous one, so that 
<inline-formula>
<mml:math>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mi>i</mml:mi></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>≈</mml:mo>
<mml:mo> </mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>M</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">p</mml:mi>
<mml:mo>→</mml:mo>
<mml:mi mathvariant="bold">q</mml:mi></mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msubsup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi mathvariant="bold">q</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. With this technique the memory requirements are halved and the computing speed is doubled.</p></sec>
<sec>
<label>3.</label>
<title>Algorithm to Hardware</title>
<p>The global control system to be developed is shown in <xref ref-type="fig" rid="f1-sensors-10-09194">Figure 1</xref>. The functional architecture has four sub-modules. At the front of the system, the camera link module receives data serially from CAFADIS. The following stages perform the digital data processing using FPGA resources. The second and third stages are the cost estimation and the belief propagation algorithm, respectively. Cost estimation is the less demanding computation; most of the computing power is consumed by belief propagation. Finally, a simple VGA controller is necessary to display the depth data.</p>
<p>To improve processing time, the FPGA implementation is based on <xref ref-type="disp-formula" rid="FD4">Equations 4</xref> and <xref ref-type="disp-formula" rid="FD5">5</xref>. The pseudo-code for the belief propagation algorithm is as follows.</p>
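<p>A minimal software sketch of the complete loop, written in Python under stated assumptions, is given below. It is not the authors' listing: the data layout (<italic>msg[side][y][x]</italic>), the choice of which checkerboard set is updated first and all names are hypothetical, and no message normalization is applied.</p>
<preformat>
```python
OFFSETS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def belief_propagation(cost, lam, iterations, mu=1.0):
    """Checkerboard min-sum belief propagation on a 4-connected grid.

    cost[y][x] is the list of k data costs of pixel (y, x).
    msg[side][y][x] holds the message that pixel (y, x) received from the
    neighbour on that side; all messages start at zero.
    Returns the depth level with minimum belief for every pixel.
    """
    ny, nx, k = len(cost), len(cost[0]), len(cost[0][0])
    msg = {side: [[[0.0] * k for _ in range(nx)] for _ in range(ny)]
           for side in OFFSETS}
    for it in range(1, iterations + 1):
        for y in range(ny):
            for x in range(nx):
                if (x + y + it) % 2:      # only one checkerboard set per iteration
                    continue
                total = [cost[y][x][d] + mu * sum(msg[s][y][x][d] for s in OFFSETS)
                         for d in range(k)]
                for side, (dy, dx) in OFFSETS.items():
                    qy, qx = y + dy, x + dx
                    if qy not in range(ny) or qx not in range(nx):
                        continue          # no neighbour beyond the image border
                    h = [total[d] - msg[side][y][x][d] for d in range(k)]  # Eq. 5
                    floor = min(h) + lam
                    msg[OPPOSITE[side]][qy][qx] = [min(v, floor) for v in h]  # Eq. 4
    depth = []
    for y in range(ny):
        row = []
        for x in range(nx):
            b = [cost[y][x][d] + mu * sum(msg[s][y][x][d] for s in OFFSETS)
                 for d in range(k)]       # belief, Eq. 3
            row.append(min(range(k), key=b.__getitem__))
        depth.append(row)
    return depth


# 3 x 3 toy image: the border prefers level 0, the centre weakly prefers level 1.
toy = [[[0, 4] for _ in range(3)] for _ in range(3)]
toy[1][1] = [1, 0]
print(belief_propagation(toy, lam=2, iterations=2)[1][1])  # 0
print(belief_propagation(toy, lam=0, iterations=2)[1][1])  # 1
```
</preformat>
<p>Because a pixel only reads the messages stored at its own position and only writes to positions of the other set, the update can be done in place, which is the property the hardware design exploits.</p>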
<p>The algorithm can be accelerated using parallel processing power of FPGAs instead of other classical technology platforms. In our implementation the improvements are due to the fact that:
<list list-type="bullet">
<list-item>
<p>Arithmetic computations are pipelined and performed in parallel wherever possible.</p></list-item>
<list-item>
<p>The depth planes are processed in parallel in the implemented architecture.</p></list-item></list></p>
<p>Taking these considerations into account, the overall implemented architecture is depicted in <xref ref-type="fig" rid="f2-sensors-10-09194">Figure 2</xref>. The address generation unit acts as the global controller of the system. In order to carry out the smoothing, the new message values are recalculated at each iteration using the appropriate values read from the cost memory and the message-passing memories. This phase is computed in the <italic>common_core</italic> module for each plane.</p>
<p>Finally, the smoothing module compares the new values obtained from all levels and the new values are stored in the message passing memory after smoothing (<xref ref-type="table" rid="t1-sensors-10-09194">Table 1</xref>).</p>
<p>These steps are performed using gray pixels in odd iterations and the white pixels in even iterations (<xref ref-type="fig" rid="f2-sensors-10-09194">Figure 2</xref>). The number of iterations is configured in the <italic>address generator</italic> module.</p>
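<p>The property that makes this alternation possible can be checked with a few lines of Python. The sketch assumes that set A contains the pixels whose coordinates have an even sum, which is an arbitrary choice for illustration, as is the correspondence between sets and gray or white pixels.</p>
<preformat>
```python
def checkerboard_set(x, y):
    """Set label of pixel (x, y): 'A' when x + y is even, 'B' otherwise."""
    return "A" if (x + y) % 2 == 0 else "B"


def active_set(iteration):
    """Set updated at a given iteration: A on odd iterations, B on even ones."""
    return "A" if iteration % 2 == 1 else "B"


def neighbours(x, y, nx, ny):
    """4-connected neighbours of (x, y) that lie inside an nx-by-ny image."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(u, v) for u, v in candidates if u in range(nx) and v in range(ny)]


print([checkerboard_set(x, 0) for x in range(3)])              # ['A', 'B', 'A']
print({checkerboard_set(u, v) for u, v in neighbours(1, 1, 3, 4)})  # {'A'}
```
</preformat>
<p>Every neighbor of a pixel belongs to the other set, so the pixels updated in one iteration never read a message written in that same iteration.</p>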
<p>Simultaneously, the <italic>common core</italic> calculates the common part of the four outgoing messages for each plane. The incoming message received from each neighbor in the previous iteration is then subtracted from the common part. After that, the minimum value for each direction is computed and smoothed in the smoothing module. Finally, the belief function is computed in the <italic>minimum plane</italic> block in order to select the plane to which each pixel belongs. When the address generation module completes its iterations, it enables the output of this block, which provides the distance associated with each pixel as the output signal. These data are obtained alternately (even and odd pixels), following the inherent addressing of the algorithm, to minimize resources and execution time. The distance data can be flipped using a double dual-port memory system at the output of the <italic>minimum plane</italic> module while preserving the pipeline [<xref ref-type="bibr" rid="b18-sensors-10-09194">18</xref>,<xref ref-type="bibr" rid="b19-sensors-10-09194">19</xref>].</p>
<p>The implementations of each of the modules that make up the overall architecture are detailed below.</p>
<sec>
<label>3.1.</label>
<title>Memory planes</title>
<p>According to the algorithm, each memory plane consists of one cost memory and four message-passing memories.</p>
<p>Taking into account <xref ref-type="disp-formula" rid="FD1">Equation 1</xref>, <xref ref-type="fig" rid="f3-sensors-10-09194">Figure 3</xref> shows the addresses of message-passing memory which must be accessed in order to perform the arithmetic computation for an image of Nx = 3 and Ny = 4 pixels for even iterations. <xref ref-type="fig" rid="f4-sensors-10-09194">Figure 4</xref> depicts the same considerations for odd iterations.</p>
<p>To calculate the new messages associated with a given pixel, the up-memory must supply the value at the position to the right, the down-memory the value at the position to the left, and the left- and right-memories must access the top and bottom positions, respectively. This addressing causes conflicts at the ends of the arrays. In <xref ref-type="fig" rid="f3-sensors-10-09194">Figure 3</xref>, for example, the down-memory address needed for the calculations of pixel 1 is out of range. <xref ref-type="table" rid="t2-sensors-10-09194">Table 2</xref> shows the memory accesses to be performed for the example in <xref ref-type="fig" rid="f3-sensors-10-09194">Figures 3</xref> and <xref ref-type="fig" rid="f4-sensors-10-09194">4</xref>.</p>
<p>Software implementations solve these conflicts using zero padding. In a hardware implementation, this implies an extra memory of 8Nx + 8Ny + 16 positions for each plane, since a one-pixel border adds 2Nx + 2Ny + 4 positions to each of the four message memories. A second approach avoids this zero padding. As shown in <xref ref-type="fig" rid="f3-sensors-10-09194">Figures 3</xref> and <xref ref-type="fig" rid="f4-sensors-10-09194">4</xref>, each message memory only has conflicts on one side of the array. Taking this into account, the extra memory is reduced to 2Nx + 2Ny for each of the <bold>Nz</bold> planes.</p>
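<p>The two overhead figures can be reproduced with a short Python calculation. The sketch assumes four message memories per plane, a one-pixel zero border in the padded case and, in the one-sided case, two memories extended along a side of length Nx and two along a side of length Ny; the function names are hypothetical.</p>
<preformat>
```python
def padding_overhead(nx, ny, memories=4):
    """Extra positions per plane when every message memory gets a one-pixel border."""
    per_memory = (nx + 2) * (ny + 2) - nx * ny   # equals 2*nx + 2*ny + 4
    return memories * per_memory


def one_sided_overhead(nx, ny):
    """Extra positions per plane when each memory grows only on its conflicting side.

    Assumes two memories conflict along a side of length nx and the other
    two along a side of length ny.
    """
    return 2 * nx + 2 * ny


# Image size of the example in Figures 3 and 4 (Nx = 3, Ny = 4).
print(padding_overhead(3, 4))    # 72, i.e. 8*3 + 8*4 + 16
print(one_sided_overhead(3, 4))  # 14, i.e. 2*3 + 2*4
```
</preformat>
<p>Both figures must be multiplied by the number of planes to obtain the total overhead.</p>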
<p>However, the FPGA's internal memory is a critical resource when implementing this algorithm, so the final design optimizes memory usage by eliminating the above-mentioned overhead. Instead of increasing memory sizes, additional logic was added to the address generator in order to indicate when an address is valid. With this alternative design, the size of the memory is minimized. Furthermore, the size is the same for all the memories, making the VHDL implementation more modular and flexible.</p></sec>
<sec>
<label>3.2.</label>
<title>Address generator</title>
<p>The block diagram of the address generator and its control signals is depicted in <xref ref-type="fig" rid="f5-sensors-10-09194">Figure 5</xref>. This module consists of counters, comparators, shift registers, one multiplier, three adders, two subtracters and a state machine that acts as a control unit.</p>
<p>The operation of the module is as follows: the x-counter is enabled when the <italic>start</italic> signal goes high. The property of this counter is that it lacks the least significant bit index, whose value is calculated with a parity circuit to implement the checkerboard algorithm.</p>
<p>The effective address is generated using the <italic>count_Nx</italic> and <italic>count_Ny</italic> counters. It is calculated by multiplying the row length (<bold>Nx</bold>) by <italic>count_Ny</italic> and then adding the current position within the row (<italic>count_Nx</italic>). From the current cost address, the message addresses are easily obtained, as well as the address of the plane corresponding to the calculated value (delayed by nine clock cycles to synchronize with the arithmetic module). Furthermore, when <italic>count_Ny</italic> and <italic>count_Nx</italic> reach their maximum values, the next iteration is enabled through a third counter (<italic>count_Niter</italic>). The <italic>validation_generator</italic> block uses these three values to determine whether the message addresses are valid.</p>
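<p>The address calculation and validity check can be illustrated in software. The sketch below is a behavioural illustration only, not the VHDL of the design: the neighbour offsets (up = addr + Nx, down = addr − Nx, left = addr + 1, right = addr − 1) are read from <xref ref-type="table" rid="t2-sensors-10-09194">Table 2</xref>, the rule that all addresses are invalid while <italic>count_Niter</italic> is zero or one is the one stated in this section, and the function names, the use of <italic>None</italic> for an invalid address and the zero-based iteration count are assumptions.</p>
<preformat><![CDATA[
```python
def effective_address(count_nx, count_ny, nx):
    """Cost address: row length times row index plus position in the row."""
    return nx * count_ny + count_nx


def message_addresses(count_nx, count_ny, nx, ny, count_niter=2):
    """Return (addr, neighbours); an invalid neighbour address is None."""
    addr = effective_address(count_nx, count_ny, nx)
    # Assumed convention: the iteration counter starts at zero, and while it
    # is zero or one the message memories still hold data from the previous frame.
    stale = count_niter in (0, 1)
    # Each entry holds the candidate address and whether it stays inside the array.
    candidates = {
        "up": (addr + nx, count_ny < ny - 1),
        "down": (addr - nx, count_ny > 0),
        "left": (addr + 1, count_nx < nx - 1),
        "right": (addr - 1, count_nx > 0),
    }
    neighbours = {
        name: (value if in_range and not stale else None)
        for name, (value, in_range) in candidates.items()
    }
    return addr, neighbours
```
]]></preformat>
<p>With Nx = 3 and an assumed Ny = 3, this function reproduces the rows of <xref ref-type="table" rid="t2-sensors-10-09194">Table 2</xref> for cost addresses 0 to 4, with <italic>None</italic> standing for the entries marked "out".</p>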
<p>The control unit provides the <italic>strobe</italic>, <italic>unload</italic> and <italic>done</italic> signals, which are derived from the values of the three counters. These signals are passed through shift registers to preserve the overall system synchronization.</p>
<p>The validity of the message addresses can be calculated using only the <italic>count_Nx</italic> and <italic>count_Ny</italic> pixel counters (see <xref ref-type="fig" rid="f3-sensors-10-09194">Figures 3</xref> and <xref ref-type="fig" rid="f4-sensors-10-09194">4</xref>). However, including the iteration counter saves resources and execution time. When the algorithm completes the last iteration for an image, the message memories contain values that are not valid for the next image, and the message memory must be empty at the first iteration of a given frame. This would require either two sets of memories alternating between odd and even frames, or an erase phase that consumes extra time. Both options use more FPGA resources. Thus, the <italic>count_Niter</italic> counter is fed to the <italic>validation_generator</italic> block, and when the value of this counter is zero or one, the block treats all addresses as invalid.</p></sec>
<sec>
<label>3.3.</label>
<title>Arithmetic module</title>
<p>This module is responsible for performing the calculations of the message passing algorithm according to the equations. The implemented module is depicted in <xref ref-type="fig" rid="f6-sensors-10-09194">Figure 6</xref>. Several registers are included in the circuit to pipeline the computation. The <italic>Data_valid</italic> signals are connected to the synchronous reset of the first registers through which the data pass, so that invalid data are replaced by zeros before any operation is carried out.</p>
<p>The value of the <italic>common</italic> signal is calculated using the current messages and the value of the <italic>cost</italic> signal, as shown at the top of the figure. Simultaneously, the message data are delayed so that they can be subtracted from the value of the <italic>common</italic> signal at the bottom of the figure. This architecture preserves the pipeline feature, and the FPGA needs only nine clock cycles to compute the new message data in parallel. After this latency, the pipeline continuously provides new data.</p>
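<p>The data flow of this module can be sketched in software for a single pixel of a single plane. This is an illustrative model based on the update lines of <xref ref-type="table" rid="t1-sensors-10-09194">Table 1</xref>, not the pipelined hardware: the function name and the dictionary interface are hypothetical, and the fixed-point rounding and register stages are omitted.</p>
<preformat><![CDATA[
```python
def update_messages(cost, incoming, valid, mu=1):
    """Return (common, outgoing) for one pixel of one plane.

    incoming: dict direction -> message read from the neighbour in that direction
    valid:    dict direction -> False when the message address was out of range
    """
    # Invalid inputs are forced to zero, mirroring the synchronous reset
    # driven by the data_valid signals.
    gated = {d: (incoming[d] if valid[d] else 0) for d in incoming}
    common = cost + mu * sum(gated.values())
    # Each new message is the common term minus one of the incoming
    # messages, as in the update lines of Table 1.
    outgoing = {d: common - gated[d] for d in gated}
    return common, outgoing
```
]]></preformat>
<p>Forcing invalid inputs to zero means that a border pixel is handled by the same arithmetic as an interior pixel, which is consistent with the decision to avoid padded memories.</p>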
<p>Intermediate values of the arithmetic module are conveniently rounded. So, the input precision is the same as the output precision (generic <italic>data_width</italic>).</p>
<p>The module is synthesized <bold>Nz</bold> times (depending on the number of planes, see <xref ref-type="fig" rid="f2-sensors-10-09194">Figure 2</xref>).</p></sec>
<sec>
<label>3.4.</label>
<title>Smoothness unit</title>
<p>This module performs the smoothing step corresponding to the last line of the pseudo-code in <xref ref-type="table" rid="t1-sensors-10-09194">Table 1</xref>. First, it calculates the minimum of the messages over all planes with a comparator tree. Then, a constant factor (<bold>d</bold>) is added and the result is compared with the estimated value for each memory. Finally, the minimum of these two values is stored in the memory (<xref ref-type="fig" rid="f7-sensors-10-09194">Figure 7</xref>). Only 2 + ⌈log<sub>2</sub> <italic>Nz</italic>⌉ clock cycles are necessary for this operation.</p></sec>
<sec>
<label>3.5.</label>
<title>Depth estimator</title>
<p>This block selects the plane that contains the minimum of <italic>common</italic> values in each cycle. Note that the result is only enabled when all the iterations have been carried out. This feature is controlled through a <italic>ce_map</italic> signal. This signal is internally connected with the strobe signal of the arithmetic module. This module requires two clock cycles.</p></sec>
<sec sec-type="methods">
<label>3.6.</label>
<title>Size considerations of the design</title>
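<p>The update of the messages takes 11 + ⌈log<sub>2</sub> <italic>Nz</italic>⌉ clock cycles (9 from the arithmetic module and 2 + ⌈log<sub>2</sub> <italic>Nz</italic>⌉ from the smoothness module; for example, 13 cycles when <italic>Nz</italic> = 4). Thus, we must have: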
<disp-formula id="FD6">
<label>(6)</label>
<mml:math display="block">
<mml:mrow>
<mml:mrow>
<mml:mo>⌊</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mi>x</mml:mi>
<mml:mo>⋅</mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:mfrac></mml:mrow>
<mml:mo>⌋</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mrow>
<mml:mo>⌊</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:mfrac></mml:mrow>
<mml:mo>⌋</mml:mo></mml:mrow>
<mml:mo>&gt;</mml:mo>
<mml:mn>11</mml:mn>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo>⌈</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mtext>log</mml:mtext></mml:mrow></mml:mrow>
<mml:mn>2</mml:mn></mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>z</mml:mi></mml:mrow>
<mml:mo>⌉</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>since <italic>addr_down = addr – Nx</italic> is the worst case in the address generator module (<xref ref-type="fig" rid="f5-sensors-10-09194">Figure 5</xref>). The divisions by 2 are due to the checkerboard optimization. If this inequality is not satisfied, a value that has not yet been updated is read from memory. In practice, this means that the frame must be at least 8 × 8.</p></sec></sec>
<sec sec-type="results">
<label>4.</label>
<title>Results</title>
<p>The algorithm was first tested successfully as a Matlab script. The design was then described in the VHDL hardware description language, simulated using ModelSim and synthesized using XST. An overview of the module operation is shown in a functional simulation (<xref ref-type="fig" rid="f8-sensors-10-09194">Figure 8</xref>).</p>
<p><xref ref-type="fig" rid="f9-sensors-10-09194">Figure 9</xref> depicts the original plenoptic image used for simulations. In <xref ref-type="fig" rid="f10-sensors-10-09194">Figure 10</xref> the final depth map of a test image is displayed together with the associated single image. The belief propagation prototype was synthesized for a Xilinx XC5VSX50T Virtex-5. This FPGA is provided in an ML506 Xtreme DSP development platform. This development board has a 200 MHz clock.</p>
<p>The depth estimation using multistereo is less clear than using stereo because the cost function is more complex. Moreover, the results obtained with 16-bit precision are very close to those of the original Matlab implementation.</p>
<sec sec-type="methods">
<label>4.1.</label>
<title>Time analysis</title>
<p>The implemented architecture is pipelined and permits continuous data streaming. The use of internal memory allows simultaneous access to the messages for each direction and each plane. In addition, all arithmetic computations have been replicated for each plane, so that the number of cycles needed to produce the final depth map is independent of the number of planes. Taking this and the checkerboard algorithm into account, the number of cycles for the operation of the module is:
<disp-formula id="FD7">
<label>(7)</label>
<mml:math display="block">
<mml:mrow>
<mml:mi mathvariant="italic">cycles</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">Nx</mml:mi>
<mml:mo>⋅</mml:mo>
<mml:mi mathvariant="italic">Ny</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:mfrac>
<mml:mo>⋅</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">Niter</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo>⌈</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mtext>log</mml:mtext></mml:mrow></mml:mrow>
<mml:mn>2</mml:mn></mml:msub>
<mml:mi mathvariant="italic">Nx</mml:mi></mml:mrow>
<mml:mo>⌉</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mn>9</mml:mn>
<mml:mo>≈</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">Nx</mml:mi>
<mml:mo>⋅</mml:mo>
<mml:mi mathvariant="italic">Ny</mml:mi></mml:mrow>
<mml:mn>2</mml:mn></mml:mfrac>
<mml:mo>⋅</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">Niter</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
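<p>Equation (7) is straightforward to evaluate for a given frame size. The sketch below is an illustration with hypothetical function names: it follows the equation as printed, uses integer division for the factor Nx · Ny / 2 (an assumption that only matters for odd pixel counts) and takes the 200 MHz board clock as the default frequency.</p>
<preformat><![CDATA[
```python
import math


def bp_cycles(nx, ny, niter):
    """Clock cycles per frame following Equation (7) as printed."""
    latency = math.ceil(math.log2(nx)) + 9
    return (nx * ny // 2) * (niter + 1) + latency


def bp_time_seconds(nx, ny, niter, clock_hz=200e6):
    """Processing time per frame for the given clock frequency."""
    return bp_cycles(nx, ny, niter) / clock_hz
```
]]></preformat>
<p>For instance, an illustrative 64 × 64 frame with 10 iterations gives 2048 · 11 + 6 + 9 = 22,543 cycles, which is about 113 µs at 200 MHz. The latency term contributes only 15 of those cycles, which is why the approximation in Equation (7) drops it.</p>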
<p><xref ref-type="table" rid="t3-sensors-10-09194">Table 3</xref> shows the cycles and the total time broken down into the stages of the total system for several <bold>Nx</bold>, <bold>Ny</bold> and <bold>Niter</bold> values. A 200 MHz clock has been assumed.</p>
<p>These results can be compared with other works. FPGA-based implementations of the belief propagation algorithm have been proposed in [<xref ref-type="bibr" rid="b20-sensors-10-09194">20</xref>] and [<xref ref-type="bibr" rid="b21-sensors-10-09194">21</xref>] for stereo vision rather than multistereo vision. These implementations use external memory, and the system latency is mainly limited by the memory accesses. Their algorithms produce good results, but the computation is slow and real-time 3D reconstruction is not possible. For example, the authors of [<xref ref-type="bibr" rid="b20-sensors-10-09194">20</xref>] obtain a depth map in 0.32 s for 1,280 × 720 pixels using a Virtex-5. Our architecture based on internal memory reduces the time needed to calculate the depth estimation map by a factor of approximately 10 with respect to an external memory implementation.</p></sec>
<sec>
<label>4.2.</label>
<title>Area</title>
<p>Block RAMs are the critical resource for the implementation of the system in an FPGA device. <xref ref-type="table" rid="t4-sensors-10-09194">Table 4</xref> shows the memory resources used for several FPGAs with 16-bit precision. Other resources such as DSP48 blocks or slices are always below 10% for the FPGAs under consideration.</p></sec></sec>
<sec sec-type="conclusions">
<label>5.</label>
<title>Conclusions and Future Work</title>
<p>This work presents a first FPGA implementation of depth map estimation using the belief propagation algorithm for the CAFADIS plenoptic sensor. The main contribution of this work is the use of FPGA technology to process the huge amount of data from the plenoptic sensor. FPGA technology features are an important consideration in the CAFADIS camera. Real-time depth reconstruction is made possible by the high-performance signal processing and conditioning capabilities that result from parallelism based on FPGA slices and arithmetic circuits, together with highly flexible interconnection possibilities. Furthermore, a single FPGA can meet the size requirements of a portable video camera. The low cost of an FPGA implementation for data processing should allow the camera to be sold at a moderate price in the future.</p>
<p>However, the algorithm implementation requires an extremely large internal memory. This massive storage requirement is one of the most critical limitations for implementation on the Virtex-4, Virtex-5 and Virtex-6 FPGA families, and the development platform will have to be replaced by a later generation of FPGA. The results obtained with 16-bit precision are very close to those of the original Matlab implementation. Our results have been compared with other belief propagation implementations on FPGA, and our implementation is comparatively faster.</p>
<p>The belief propagation design was developed using functional VHDL and is technology-independent, so the system can be implemented on any sufficiently large FPGA. Xilinx has just announced the release of 28-nm Virtex-7 FPGAs. These devices provide the highest performance and capacity for FPGAs (up to 65 Mb) [<xref ref-type="bibr" rid="b22-sensors-10-09194">22</xref>,<xref ref-type="bibr" rid="b23-sensors-10-09194">23</xref>] and they will allow algorithm implementation for larger images.</p>
<p>In the future, we will implement this architecture in a Virtex-7 and integrate it in a real-time multistereo vision system. The goal is to obtain a fully portable system.</p></sec></body>
<back>
<ack>
<p>This work has been partially supported by “Programa Nacional de Diseño y Producción Industrial” (Project AYA 2009-13075) of the “Ministerio de Educación y Ciencia” of the Spanish government, and by “European Regional Development Fund” (ERDF).</p></ack>
<ref-list>
<title>References and Notes</title>
<ref id="b1-sensors-10-09194"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Felzenszwalb</surname><given-names>PF</given-names></name><name><surname>Huttenlocher</surname><given-names>DR</given-names></name></person-group><article-title>Efficient Belief Propagation for Early Vision</article-title><source>Comp Vision Pattern Recognit</source><year>2004</year><volume>1</volume><fpage>I-261</fpage><lpage>I-268</lpage></citation></ref>
<ref id="b2-sensors-10-09194"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kolmogorov</surname><given-names>V</given-names></name></person-group><article-title>Convergent Tree-Reweighted Message Passing for Energy Minimization</article-title><source>IEEE Trans. Pattern Anal. Mach. Intell</source><year>2006</year><volume>28</volume><fpage>1568</fpage><lpage>1582</lpage><pub-id pub-id-type="doi">10.1109/TPAMI.2006.200</pub-id><pub-id pub-id-type="pmid">16986540</pub-id></citation></ref>
<ref id="b3-sensors-10-09194"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szeliski</surname><given-names>R</given-names></name><name><surname>Zabih</surname><given-names>R</given-names></name><name><surname>Scharstein</surname><given-names>D</given-names></name><name><surname>Veksler</surname><given-names>O</given-names></name><name><surname>Kolmogorov</surname><given-names>V</given-names></name><name><surname>Agarwala</surname><given-names>A</given-names></name><name><surname>Tappen</surname><given-names>M</given-names></name><name><surname>Rother</surname><given-names>C</given-names></name></person-group><article-title>A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors</article-title><source>IEEE Trans. Pattern Anal. Mach. Intell</source><year>2008</year><volume>30</volume><fpage>1068</fpage><lpage>1080</lpage><pub-id pub-id-type="doi">10.1109/TPAMI.2007.70844</pub-id><pub-id pub-id-type="pmid">18421111</pub-id></citation></ref>
<ref id="b4-sensors-10-09194"><label>4.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lüke</surname><given-names>JP</given-names></name><name><surname>Marichal-Hernández</surname><given-names>JG</given-names></name><name><surname>Rosa</surname><given-names>F</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name></person-group><article-title>A Prototype of Real-Time a Single Lens 3D Camera</article-title><conf-name>Proceedings of International Conference on 3D Systems and Applications</conf-name><conf-loc>Tokyo, Japan</conf-loc><conf-date>May 2010</conf-date></citation></ref>
<ref id="b5-sensors-10-09194"><label>5.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Ng</surname><given-names>R</given-names></name></person-group><article-title>Fourier Slice Photography</article-title><conf-name>Proceedings of International Conference on Computer Graphics and Interactive Techniques</conf-name><conf-loc>Los Angeles, CA, USA</conf-loc><conf-date>July 2005</conf-date><fpage>735</fpage><lpage>744</lpage></citation></ref>
<ref id="b6-sensors-10-09194"><label>6.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Pérez</surname><given-names>F</given-names></name><name><surname>Marichal</surname><given-names>JG</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name></person-group><article-title>The Discrete Focal Stack Transform</article-title><conf-name>Proceedings of 16th European Signal Processing Conference (EUSIPCO 2008)</conf-name><conf-loc>Lausanne, Switzerland</conf-loc><conf-date>August 25–29, 2008</conf-date></citation></ref>
<ref id="b7-sensors-10-09194"><label>7.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Lumsdaine</surname><given-names>A</given-names></name><name><surname>Georgiev</surname><given-names>T</given-names></name></person-group><source>Full Resolution Lightfield Rendering</source><publisher-name>Adobe Tech Report, Adobe Systems, Inc</publisher-name><publisher-loc>Los Angeles, CA, USA</publisher-loc><month>January</month><year>2008</year></citation></ref>
<ref id="b8-sensors-10-09194"><label>8.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Marichal-Hernández</surname><given-names>JG</given-names></name><name><surname>Lüke</surname><given-names>JP</given-names></name><name><surname>Rosa</surname><given-names>F</given-names></name><name><surname>Pérez</surname><given-names>F</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name></person-group><article-title>Fast Approximate Focal Stack Transform</article-title><conf-name>Proceedings of 3DTV CON 2009</conf-name><conf-loc>Potsdam, Germany</conf-loc><conf-date>May 2009</conf-date></citation></ref>
<ref id="b9-sensors-10-09194"><label>9.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Pérez</surname><given-names>F</given-names></name><name><surname>Lüke</surname><given-names>JP</given-names></name></person-group><article-title>Simultaneous Estimation of Super-Resolved Depth and All-in-Focus Images from a Plenoptic Camera</article-title><conf-name>Proceedings of 3DTV CON 2009</conf-name><conf-loc>Potsdam, Germany</conf-loc><conf-date>May 2009</conf-date></citation></ref>
<ref id="b10-sensors-10-09194"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lüke</surname><given-names>JP</given-names></name><name><surname>Pérez Nava</surname><given-names>F</given-names></name><name><surname>Marichal-Hernández</surname><given-names>JG</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name><name><surname>Rosa</surname><given-names>F</given-names></name></person-group><article-title>Near Real-Time Estimation of Super-Resolved Depth and All-in-Focus Images from a Plenoptic Camera Using Graphics Processing Units</article-title><source>Int. J. Digit. Multimedia Broadcasting</source><year>2010</year><volume>2010</volume><fpage>12</fpage></citation></ref>
<ref id="b11-sensors-10-09194"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Magdaleno</surname><given-names>E</given-names></name><name><surname>Rodríguez</surname><given-names>M</given-names></name><name><surname>Ayala</surname><given-names>A</given-names></name></person-group><article-title>VHDL Implementation of a Communication Interface for Integrated MEMS</article-title><source>Microsyst. Technol</source><year>2008</year><volume>14</volume><fpage>453</fpage><lpage>462</lpage><pub-id pub-id-type="doi">10.1007/s00542-007-0474-2</pub-id></citation></ref>
<ref id="b12-sensors-10-09194"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Magdaleno</surname><given-names>E</given-names></name><name><surname>Rodríguez</surname><given-names>M</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name><name><surname>Ayala</surname><given-names>A</given-names></name></person-group><article-title>Modal Fourier Wavefront Reconstruction Using FPGA Technology</article-title><source>Micro. Nanosyst</source><year>2009</year><volume>1</volume><fpage>72</fpage><lpage>82</lpage><pub-id pub-id-type="doi">10.2174/1876402910901010072</pub-id></citation></ref>
<ref id="b13-sensors-10-09194"><label>13.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Deschamps</surname><given-names>J</given-names></name><name><surname>Bioul</surname><given-names>G</given-names></name><name><surname>Sutter</surname><given-names>G</given-names></name></person-group><source>Synthesis of Arithmetic Circuits FPGA, ASIC and Embedded Systems</source><publisher-name>Wiley-Interscience</publisher-name><publisher-loc>New York, NY, USA</publisher-loc><year>2006</year><fpage>1603</fpage><lpage>1617</lpage></citation></ref>
<ref id="b14-sensors-10-09194"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodriguez-Donate</surname><given-names>C</given-names></name><name><surname>Morales-Velazquez</surname><given-names>L</given-names></name><name><surname>Osornio-Rios</surname><given-names>RA</given-names></name><name><surname>Herrera-Ruiz</surname><given-names>G</given-names></name><name><surname>Romero-Troncoso</surname><given-names>RT</given-names></name></person-group><article-title>FPGA-Based Fused Smart Sensor for Dynamic and Vibration Parameter Extraction in Industrial Robots Links</article-title><source>Sensors</source><year>2010</year><volume>10</volume><fpage>4114</fpage><lpage>4129</lpage><pub-id pub-id-type="doi">10.3390/s100404114</pub-id><pub-id pub-id-type="pmid">22319345</pub-id></citation></ref>
<ref id="b15-sensors-10-09194"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moreno-Tapia</surname><given-names>SV</given-names></name><name><surname>Vera-Salas</surname><given-names>LA</given-names></name><name><surname>Osornio-Rios</surname><given-names>RA</given-names></name><name><surname>Dominguez-Gonzalez</surname><given-names>A</given-names></name><name><surname>Stiharu</surname><given-names>I</given-names></name><name><surname>Romero-Troncoso</surname><given-names>RJ</given-names></name></person-group><article-title>A Field Programmable Gate Array-Based Reconfigurable Smart-Sensor Network for Wireless Monitoring of New Generation Computer Numerically Controlled Machines</article-title><source>Sensors</source><year>2010</year><volume>10</volume><fpage>7263</fpage><lpage>7286</lpage><pub-id pub-id-type="doi">10.3390/s100807263</pub-id><pub-id pub-id-type="pmid">22163602</pub-id></citation></ref>
<ref id="b16-sensors-10-09194"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Trejo-Hernandez</surname><given-names>M</given-names></name><name><surname>Osornio-Rios</surname><given-names>RA</given-names></name><name><surname>Romero-Troncoso</surname><given-names>RJ</given-names></name><name><surname>Rodriguez-Donate</surname><given-names>C</given-names></name><name><surname>Dominguez-Gonzalez</surname><given-names>A</given-names></name><name><surname>Herrera-Ruiz</surname><given-names>G</given-names></name></person-group><article-title>FPGA-Based Fused Smart-Sensor for Tool-Wear Area Quantitative Estimation in CNC Machine Inserts</article-title><source>Sensors</source><year>2010</year><volume>10</volume><fpage>3373</fpage><lpage>3388</lpage><pub-id pub-id-type="doi">10.3390/s100403373</pub-id><pub-id pub-id-type="pmid">22319304</pub-id></citation></ref>
<ref id="b17-sensors-10-09194"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>W</given-names></name><name><surname>Chen</surname><given-names>W</given-names></name><name><surname>Tang</surname><given-names>J</given-names></name><name><surname>Xu</surname><given-names>P</given-names></name><name><surname>Li</surname><given-names>Y</given-names></name><name><surname>Li</surname><given-names>S</given-names></name></person-group><article-title>The Development of a Portable Hard Disk Encryption/Decryption System with a MEMS Coded Lock</article-title><source>Sensors</source><year>2009</year><volume>9</volume><fpage>9300</fpage><lpage>9331</lpage><pub-id pub-id-type="doi">10.3390/s91109300</pub-id><pub-id pub-id-type="pmid">22291566</pub-id></citation></ref>
<ref id="b18-sensors-10-09194"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Magdaleno</surname><given-names>E</given-names></name><name><surname>Rodríguez</surname><given-names>M</given-names></name><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name></person-group><article-title>An Efficient Pipeline Wavefront Phase Recovery for the CAFADIS Camera for Extremely Large Telescopes</article-title><source>Sensors</source><year>2010</year><volume>10</volume><fpage>1</fpage><lpage>15</lpage><pub-id pub-id-type="doi">10.1109/JSEN.2009.2039287</pub-id><pub-id pub-id-type="pmid">22399874</pub-id></citation></ref>
<ref id="b19-sensors-10-09194"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodríguez-Ramos</surname><given-names>JM</given-names></name><name><surname>Magdaleno</surname><given-names>E</given-names></name><name><surname>Domínguez</surname><given-names>D</given-names></name><name><surname>Rodríguez</surname><given-names>M</given-names></name><name><surname>Marichal</surname><given-names>JG</given-names></name></person-group><article-title>2D-FFT Implementation on FPGA for Wavefront Phase Recovery from the CAFADIS Camera</article-title><source>Proc. SPIE</source><year>2008</year><volume>7015</volume><fpage>701539</fpage></citation></ref>
<ref id="b20-sensors-10-09194"><label>20.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Pérez</surname><given-names>J</given-names></name><name><surname>Sánchez</surname><given-names>P</given-names></name><name><surname>Martínez</surname><given-names>M</given-names></name></person-group><article-title>High-Definition Belief-Propagation Based Stereo Matching FPGA architecture</article-title><conf-name>Proceedings of Conference on Design of Circuits and Integrated System</conf-name><conf-loc>Zaragoza, Spain</conf-loc><conf-date>November, 2009</conf-date></citation></ref>
<ref id="b21-sensors-10-09194"><label>21.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Tseng</surname><given-names>Y</given-names></name><name><surname>Chang</surname><given-names>Y</given-names></name><name><surname>Chang</surname><given-names>T</given-names></name></person-group><article-title>Block-Based Belief Propagation with in-place Message Updating for Stereo Vision</article-title><conf-name>Proceedings of IEEE Conference on Circuits and System</conf-name><conf-loc>Macao, China</conf-loc><conf-date>November 30–December 3, 2008</conf-date><fpage>918</fpage><lpage>921</lpage></citation></ref>
<ref id="b22-sensors-10-09194"><label>22.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Przybus</surname><given-names>B</given-names></name></person-group><source>Xilinx Redefines Power, Performance, and Design Productivity with Three New 28 nm FPGA Families: Virtex-7, Kintex-7, and Artix-7 Devices</source><publisher-name>Xilinx</publisher-name><publisher-loc>San Jose, CA, USA</publisher-loc><month>June</month><year>2010</year></citation></ref>
<ref id="b23-sensors-10-09194"><label>23.</label><citation citation-type="book"><person-group person-group-type="author"><collab>Xilinx</collab></person-group><source>7 Series Overview. Advance Product Specification</source><publisher-name>Xilinx</publisher-name><publisher-loc>San Jose, CA, USA</publisher-loc><month>June</month><year>2010</year></citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures and Tables</title>
<fig id="f1-sensors-10-09194" position="float">
<label>Figure 1.</label>
<caption>
<p>Overall system to be integrated in a portable video camera.</p></caption>
<graphic xlink:href="sensors-10-09194f1.gif"/></fig>
<fig id="f2-sensors-10-09194" position="float">
<label>Figure 2.</label>
<caption>
<p>Architecture of the designed belief propagation system.</p></caption>
<graphic xlink:href="sensors-10-09194f2.gif"/></fig>
<fig id="f3-sensors-10-09194" position="float">
<label>Figure 3.</label>
<caption>
<p>Memory addressing for even iterations.</p></caption>
<graphic xlink:href="sensors-10-09194f3.gif"/></fig>
<fig id="f4-sensors-10-09194" position="float">
<label>Figure 4.</label>
<caption>
<p>Memory addressing for odd iterations.</p></caption>
<graphic xlink:href="sensors-10-09194f4.gif"/></fig>
<fig id="f5-sensors-10-09194" position="float">
<label>Figure 5.</label>
<caption>
<p>Architectural block diagram of the address generator.</p></caption>
<graphic xlink:href="sensors-10-09194f5.gif"/></fig>
<fig id="f6-sensors-10-09194" position="float">
<label>Figure 6.</label>
<caption>
<p>Architectural block diagram of the arithmetic core.</p></caption>
<graphic xlink:href="sensors-10-09194f6.gif"/></fig>
<fig id="f7-sensors-10-09194" position="float">
<label>Figure 7.</label>
<caption>
<p>Diagram of the smoothing operation.</p></caption>
<graphic xlink:href="sensors-10-09194f7.gif"/></fig>
<fig id="f8-sensors-10-09194" position="float">
<label>Figure 8.</label>
<caption>
<p>Functional simulation of belief propagation for a 64 × 64 frame and 10.</p></caption>
<graphic xlink:href="sensors-10-09194f8.gif"/></fig>
<fig id="f9-sensors-10-09194" position="float">
<label>Figure 9.</label>
<caption>
<p>Lightfield captured with a plenoptic camera. Image taken from [<xref ref-type="bibr" rid="b5-sensors-10-09194">5</xref>].</p></caption>
<graphic xlink:href="sensors-10-09194f9.gif"/></fig>
<fig id="f10-sensors-10-09194" position="float">
<label>Figure 10.</label>
<caption>
<p><bold>(a)</bold> Single image. <bold>(b)</bold> Depth map.</p></caption>
<graphic xlink:href="sensors-10-09194f10.gif"/></fig>
<table-wrap id="t1-sensors-10-09194" position="float">
<label>Table 1.</label>
<caption>
<p>Pseudo-code for the algorithm.</p></caption>
<table frame="box" rules="none">
<tbody>
<tr>
<td align="left" valign="top"><bold>for</bold> z = 1:Nz<break/>  msg_min = inf;<break/>  <bold>for</bold> n=1:Niter<break/>    <bold>for</bold> y=2:Ny+1<break/>      offset = rem(y+n,2);<break/>      <bold>for</bold> x=2+offset:2:Nx+1 % Applying checkerboard<break/>        common = cost(z,x−1,y−1) + mu * (msg(z,x,y−1,↓) + msg(z,x,y+1,↑) +<break/>                  msg(z,x−1,y,→) + msg(z,x+1,y,←));<break/><break/>        % Update messages<break/>        msg(z,x,y,↑) = common − msg(z,x,y−1,↓);<break/>        msg(z,x,y,↓) = common − msg(z,x,y+1,↑);<break/>        msg(z,x,y,←) = common − msg(z,x−1,y,→);<break/>        msg(z,x,y,→) = common − msg(z,x+1,y,←);<break/><break/>        msg_min = min(msg_min,msg(z,x,y,:));<break/>      <bold>end</bold>;<break/>    <bold>end</bold>;<break/>  <bold>end</bold>;<break/><break/>  % Applying smoothing.<break/>  msg(z,x,y,:) = min(msg_min+d, msg(z,x,y,:));<break/><bold>end</bold>;</td></tr></tbody></table>
<table-wrap-foot><fn id="tfn1-sensors-10-09194">
<p>Nx and Ny determine the size of the image, and Nz is the number of planes.</p></fn></table-wrap-foot></table-wrap>
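<p>The pseudo-code of Table 1 can be transliterated into a short software model, given here only as an illustrative sketch and not as the FPGA implementation. It reads the first message term as msg(z,x,y−1,↓), as the update lines imply, stores messages on a grid padded by one node on every side, initialises them to zero, and assumes that the final smoothing step is applied to every node of the plane. The function names <monospace>propagate_plane</monospace> and <monospace>propagate</monospace>, the direction constants and the example values are hypothetical.</p>
<preformat>
```python
import math

UP, DOWN, LEFT, RIGHT = range(4)  # direction index of each message


def propagate_plane(cost, n_iter, mu, d):
    """Run checkerboard message passing on one depth plane.

    cost[y][x] is the data cost of the plane. The return value is
    msg[y][x][direction] on a grid padded by one node on every side.
    """
    ny, nx = len(cost), len(cost[0])
    # Assumption: all messages start at zero (not stated in Table 1).
    msg = [[[0.0] * 4 for _ in range(nx + 2)] for _ in range(ny + 2)]
    msg_min = math.inf
    for n in range(1, n_iter + 1):
        for y in range(1, ny + 1):
            # Same phase as rem(y+n,2) with the 1-based y of Table 1.
            offset = (y + 1 + n) % 2
            for x in range(1 + offset, nx + 1, 2):
                from_above = msg[y - 1][x][DOWN]
                from_below = msg[y + 1][x][UP]
                from_left = msg[y][x - 1][RIGHT]
                from_right = msg[y][x + 1][LEFT]
                common = cost[y - 1][x - 1] + mu * (
                    from_above + from_below + from_left + from_right)
                # Each outgoing message subtracts the message received
                # from the neighbour it is sent to, as in Table 1.
                msg[y][x][UP] = common - from_above
                msg[y][x][DOWN] = common - from_below
                msg[y][x][LEFT] = common - from_left
                msg[y][x][RIGHT] = common - from_right
                msg_min = min(msg_min, *msg[y][x])
    # Smoothing, assumed here to apply to every node of the plane.
    for y in range(1, ny + 1):
        for x in range(1, nx + 1):
            msg[y][x] = [min(msg_min + d, m) for m in msg[y][x]]
    return msg


def propagate(costs, n_iter, mu, d):
    """Process the Nz planes one after another (outer loop of Table 1)."""
    return [propagate_plane(plane, n_iter, mu, d) for plane in costs]


# Hypothetical example: one 2 x 2 plane with unit cost.
planes = propagate([[[1.0, 1.0], [1.0, 1.0]]], n_iter=2, mu=0.5, d=10.0)
print(planes[0][1][1])  # messages of the top-left image node
```
</preformat>
<p>Because only one colour of the checkerboard is updated in each iteration, every update reads neighbour messages that are not being written in the same pass, which is what allows the messages to be updated in place.</p>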
<table-wrap id="t2-sensors-10-09194" position="float">
<label>Table 2.</label>
<caption>
<p>Address generation for the example.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="top"><bold>Iteration</bold></th>
<th align="center" valign="top"><bold>Cost address</bold></th>
<th align="center" valign="top"><bold>Up</bold></th>
<th align="center" valign="top"><bold>Down</bold></th>
<th align="center" valign="top"><bold>Left</bold></th>
<th align="center" valign="top"><bold>Right</bold></th></tr></thead>
<tbody>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">out</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0</td></tr>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">1</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">0</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">out</td></tr>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">3</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">4</td></tr>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">out</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">6</td></tr>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">7</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">out</td></tr>
<tr>
<td align="center" valign="top">odd</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">9</td></tr>
<tr>
<td align="center" valign="top">even</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">out</td>
<td align="center" valign="top">10</td></tr></tbody></table></table-wrap>
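<p>The entries of Table 2 follow a regular pattern that can be reproduced with a few lines of code. The sketch below assumes a row-major layout of 3 columns by 4 rows, which is inferred from the table (Up is the cost address plus 3, Left is the cost address plus 1) and is not stated explicitly in it. The function name <monospace>neighbour_addresses</monospace> is hypothetical.</p>
<preformat>
```python
def neighbour_addresses(nx, ny):
    """Return one row per node: (iteration, cost, up, down, left, right).

    Addresses are row-major; "out" marks a neighbour outside the frame.
    """
    rows = []
    for addr in range(nx * ny):
        y, x = divmod(addr, nx)
        # Checkerboard colour: nodes with even x + y belong to odd iterations.
        iteration = "odd" if (x + y) % 2 == 0 else "even"
        up = addr + nx if y != ny - 1 else "out"
        down = addr - nx if y != 0 else "out"
        left = addr + 1 if x != nx - 1 else "out"
        right = addr - 1 if x != 0 else "out"
        rows.append((iteration, addr, up, down, left, right))
    return rows


# Assumed geometry of the example: 3 columns, 4 rows (12 cost addresses).
EXAMPLE = neighbour_addresses(3, 4)
for row in EXAMPLE:
    print(*row, sep="\t")
```
</preformat>
<p>With an odd number of columns, as in this example, the checkerboard colour simply alternates with the linear cost address, which matches the alternating odd/even column of the table.</p>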
<table-wrap id="t3-sensors-10-09194" position="float">
<label>Table 3.</label>
<caption>
<p>Execution time of the belief propagation algorithm on the FPGA.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="bottom"><bold>Nx</bold></th>
<th align="center" valign="bottom"><bold>Ny</bold></th>
<th align="center" valign="bottom"><bold>Niter</bold></th>
<th align="center" valign="bottom"><bold>Cycles</bold></th>
<th align="center" valign="bottom"><bold>Time [ms]</bold></th></tr></thead>
<tbody>
<tr>
<td align="center" valign="top">64</td>
<td align="center" valign="top">64</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">22,539</td>
<td align="center" valign="top">0.11</td></tr>
<tr>
<td align="center" valign="top">64</td>
<td align="center" valign="top">64</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">53,259</td>
<td align="center" valign="top">0.27</td></tr>
<tr>
<td align="center" valign="top">120</td>
<td align="center" valign="top">160</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">105,611</td>
<td align="center" valign="top">0.53</td></tr>
<tr>
<td align="center" valign="top">120</td>
<td align="center" valign="top">160</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">249,611</td>
<td align="center" valign="top">1.25</td></tr>
<tr>
<td align="center" valign="top">128</td>
<td align="center" valign="top">128</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">90,123</td>
<td align="center" valign="top">0.45</td></tr>
<tr>
<td align="center" valign="top">128</td>
<td align="center" valign="top">128</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">213,003</td>
<td align="center" valign="top">1.07</td></tr>
<tr>
<td align="center" valign="top">256</td>
<td align="center" valign="top">256</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">360,459</td>
<td align="center" valign="top">1.80</td></tr>
<tr>
<td align="center" valign="top">256</td>
<td align="center" valign="top">256</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">851,979</td>
<td align="center" valign="top">4.26</td></tr>
<tr>
<td align="center" valign="top">512</td>
<td align="center" valign="top">512</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">1,441,803</td>
<td align="center" valign="top">7.21</td></tr>
<tr>
<td align="center" valign="top">512</td>
<td align="center" valign="top">512</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">3,407,883</td>
<td align="center" valign="top">17.04</td></tr>
<tr>
<td align="center" valign="top">1,024</td>
<td align="center" valign="top">1,024</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">5,767,179</td>
<td align="center" valign="top">28.84</td></tr>
<tr>
<td align="center" valign="top">1,024</td>
<td align="center" valign="top">1,024</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">13,631,499</td>
<td align="center" valign="top">68.16</td></tr></tbody></table></table-wrap>
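<p>The cycle counts in Table 3 are reproduced exactly by the expression (Niter + 1) · Nx · Ny / 2 + 11, and the times correspond to a clock of about 200 MHz. Both the expression and the clock frequency are empirical fits to the tabulated values, offered as a sketch for estimating other configurations; they are not taken from the design itself. The function names and the <monospace>clock_hz</monospace> parameter are hypothetical.</p>
<preformat>
```python
def bp_cycles(nx, ny, n_iter):
    """Clock cycles fitted to Table 3 (empirical, not from the design)."""
    # Half of the nodes are visited per iteration (checkerboard), plus one
    # extra half-frame pass and a constant of 11 cycles that fits the data.
    return (n_iter + 1) * nx * ny // 2 + 11


def bp_time_ms(nx, ny, n_iter, clock_hz=200e6):
    """Execution time in milliseconds for an assumed 200 MHz clock."""
    return bp_cycles(nx, ny, n_iter) / clock_hz * 1e3


for nx, ny, n_iter in [(64, 64, 10), (120, 160, 25), (1024, 1024, 25)]:
    cycles = bp_cycles(nx, ny, n_iter)
    print(nx, ny, n_iter, cycles, round(bp_time_ms(nx, ny, n_iter), 2))
```
</preformat>
<p>Under this fit the cost per iteration is Nx · Ny / 2 cycles, so the execution time grows linearly with both the frame area and the number of iterations.</p>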
<table-wrap id="t4-sensors-10-09194" position="float">
<label>Table 4.</label>
<caption>
<p>FPGA internal memory resources.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="bottom"><bold>FPGA device</bold></th>
<th align="center" valign="bottom"><bold>Configuration of image</bold></th>
<th align="center" valign="bottom"><bold>Basic internal RAM module</bold></th>
<th align="center" valign="bottom"><bold>BRAM (used/available)</bold></th></tr></thead>
<tbody>
<tr>
<td align="center" valign="top">XC4SX35 Virtex-4</td>
<td align="center" valign="top">64 × 64 × 4</td>
<td align="center" valign="top">RAMB16 1K × 16</td>
<td align="center" valign="top">80/192 (41%)</td></tr>
<tr>
<td align="center" valign="top">XC5SX50 Virtex-5</td>
<td align="center" valign="top">64 × 64 × 4</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">40/132 (30%)</td></tr>
<tr>
<td align="center" valign="top">XC5SX50 Virtex-5</td>
<td align="center" valign="top">64 × 64 × 8</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">80/132 (60%)</td></tr>
<tr>
<td align="center" valign="top">XC6VLX240 Virtex-6</td>
<td align="center" valign="top">64 × 64 × 4</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">40/416 (9%)</td></tr>
<tr>
<td align="center" valign="top">XC6VLX240 Virtex-6</td>
<td align="center" valign="top">64 × 64 × 8</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">80/416 (19%)</td></tr>
<tr>
<td align="center" valign="top">XC6VLX240 Virtex-6</td>
<td align="center" valign="top">128 × 128 × 4</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">160/416 (38%)</td></tr>
<tr>
<td align="center" valign="top">XC6VLX240 Virtex-6</td>
<td align="center" valign="top">128 × 128 × 8</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">320/416 (77%)</td></tr>
<tr>
<td align="center" valign="top">XC6VLX240 Virtex-6</td>
<td align="center" valign="top">256 × 128 × 4</td>
<td align="center" valign="top">BRAM 2K × 16</td>
<td align="center" valign="top">320/416 (77%)</td></tr></tbody></table></table-wrap></sec></back></article>
