<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s90805933</article-id>
<article-id pub-id-type="publisher-id">sensors-09-05933</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>A 1,000 Frames/s Programmable Vision Chip with Variable Resolution and Row-Pixel-Mixed Parallel Image Processors</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Lin</surname><given-names>Qingyu</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Miao</surname><given-names>Wei</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname><given-names>Wancheng</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Fu</surname><given-names>Qiuyu</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Wu</surname><given-names>Nanjian</given-names></name><xref ref-type="corresp" rid="c1-sensors-09-05933"><sup>*</sup></xref></contrib>
<aff id="af1-sensors-09-05933">State Key Laboratory for Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China; E-Mail: <email>qylin@red.semi.ac.cn</email> (Q.-Y.L.)</aff></contrib-group>
<author-notes>
<corresp id="c1-sensors-09-05933">
<label>*</label>Author to whom correspondence should be addressed; E-Mail: <email>nanjian@red.semi.ac.cn</email></corresp></author-notes>
<pub-date pub-type="collection">
<year>2009</year></pub-date>
<pub-date pub-type="epub">
<day>27</day>
<month>7</month>
<year>2009</year></pub-date>
<volume>9</volume>
<issue>8</issue>
<fpage>5933</fpage>
<lpage>5951</lpage>
<history>
<date date-type="received">
<day>11</day>
<month>5</month>
<year>2009</year></date>
<date date-type="rev-recd">
<day>24</day>
<month>7</month>
<year>2009</year></date>
<date date-type="accepted">
<day>24</day>
<month>7</month>
<year>2009</year></date></history>
<permissions>
<copyright-statement>© 2009 by the authors; licensee MDPI, Basel, Switzerland</copyright-statement>
<copyright-year>2009</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>A programmable vision chip with variable resolution and row-pixel-mixed parallel image processors is presented. The chip consists of a CMOS sensor array, with row-parallel 6-bit Algorithmic ADCs, row-parallel gray-scale image processors, pixel-parallel SIMD Processing Element (PE) array, and instruction controller. The resolution of the image in the chip is variable: high resolution for a focused area and low resolution for general view. It implements gray-scale and binary mathematical morphology algorithms in series to carry out low-level and mid-level image processing and sends out features of the image for various applications. It can perform image processing at over 1,000 frames/s (fps). A prototype chip with 64 × 64 pixels resolution and 6-bit gray-scale image is fabricated in 0.18 μm Standard CMOS process. The area size of chip is 1.5 mm × 3.5 mm. Each pixel size is 9.5 μm × 9.5 μm and each processing element size is 23 μm × 29 μm. The experiment results demonstrate that the chip can perform low-level and mid-level image processing and it can be applied in the real-time vision applications, such as high speed target tracking.</p></abstract>
<kwd-group>
<kwd>vision chip</kwd>
<kwd>image processing</kwd>
<kwd>machine vision</kwd>
<kwd>mathematical morphology</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>A vision chip integrates a sensor array with parallel processors on one chip and performs real-time parallel low- and mid-level image processing without I/O bottlenecks. Its compact size, high speed, and low power consumption make it widely applicable in fields such as robotics, industrial automation, and target tracking systems [<xref ref-type="bibr" rid="b1-sensors-09-05933">1</xref>,<xref ref-type="bibr" rid="b2-sensors-09-05933">2</xref>]. One of the challenges in the development of vision chips is how to take advantage of their parallel-processing performance to realize complex real-time algorithms. This paper presents a programmable vision chip with variable resolution and row-pixel-mixed parallel image processors, which can perform such complex algorithms.</p>
<p>Image processing methods can be categorized into three levels: low-, mid-, and high-level processing [<xref ref-type="bibr" rid="b3-sensors-09-05933">3</xref>]. Low-level processing involves primary operations such as noise cancellation and image enhancement. Mid-level processing involves segmentation, description of regions, and classification of objects. High-level processing performs intelligent analysis and cognitive vision. Low- and mid-level image processing share two features: the operations are inherently parallel, and they act on large amounts of image data. Because the vision chip can perform row-parallel and pixel-parallel image processing, it is well suited to low- and mid-level image processing tasks.</p>
<p>General-purpose gray-scale image vision chips with analog processing elements (PE) have been reported in [<xref ref-type="bibr" rid="b4-sensors-09-05933">4</xref>–<xref ref-type="bibr" rid="b8-sensors-09-05933">8</xref>]. These general-purpose vision chips could only handle low-level image processing, and the large amount of output image data limits their application in real-time image processing tasks. To solve this problem, a general-purpose vision chip is required that can perform low-level and then mid-level image processing on the chip. The processing ability of the vision chip must therefore be improved, and the I/O bottleneck must be overcome by sending out a smaller amount of image feature data instead of whole images.</p>
<p>Some application-specific vision chips performing low- and mid-level image processing were developed [<xref ref-type="bibr" rid="b9-sensors-09-05933">9</xref>–<xref ref-type="bibr" rid="b12-sensors-09-05933">12</xref>], but these chips were designed for specified applications and with a specified architecture. Another vision chip performing exact dilations was presented in [<xref ref-type="bibr" rid="b13-sensors-09-05933">13</xref>], but this one focused on several specific algorithms, such as morphological dilation, multi-scale skeleton and Distance Transform [<xref ref-type="bibr" rid="b14-sensors-09-05933">14</xref>]. They were not programmable or feasible for general purpose applications.</p>
<p>We have also developed a vision chip that performs low- and mid-level image processing [<xref ref-type="bibr" rid="b15-sensors-09-05933">15</xref>,<xref ref-type="bibr" rid="b16-sensors-09-05933">16</xref>]. The chip was based on specified mathematical morphology algorithms for high-speed target tracking. It could only accomplish binary image processing.</p>
<p>In this paper, we present a 1,000 frames/s (fps) programmable vision chip with variable resolution and row-pixel-mixed parallel gray image processors. The chip overcomes the difficulty of the previously proposed general-purpose vision chips in the field of real-time machine vision. It consists of a <italic>2N</italic> × <italic>2N</italic> CMOS image sensor, <italic>N</italic> row-parallel 6-bit Algorithmic ADCs, <italic>N</italic> gray-scale image processors, and an <italic>N</italic> × <italic>N</italic> Single Instruction Multiple Data (SIMD) pixel-parallel PE array. The chip mainly performs low- and mid-level image processing based on the <italic>gray-scale</italic> and <italic>binary mathematical morphology</italic> methods and outputs image features for high-level image processing. The chip can move the focused area within an image and change the image resolution to perform image processing under different environments or applications. The chip can implement various complex algorithms for real-time machine-vision applications under software control. The chip features high speed, low power consumption, and a small pixel element.</p>
<p>The rest of the paper is organized as follows. In Section 2, we describe the architecture and operations of the chip. In Section 3, the implementation of the chip is presented. In Section 4, we give some image processing examples, including a target tracking algorithm using a prototype chip. In Section 5, the performance of the chip is discussed. Finally, we come to the conclusions in Section 6.</p></sec>
<sec>
<label>2.</label>
<title>Architecture and Operations of the Chip</title>
<sec>
<label>2.1.</label>
<title>Architecture</title>
<p>The architecture of the proposed programmable vision chip with variable resolution and row-pixel-mixed parallel gray image processors is shown in <xref ref-type="fig" rid="f1-sensors-09-05933">Figure 1</xref>. The vision chip consists of <italic>2N</italic> × <italic>2N</italic> image sensor, <italic>N</italic> ADCs, <italic>N</italic> gray-scale image processors (row-parallel processors), <italic>N</italic> × <italic>N</italic> PE array, X processor, Y processor, instruction controller, parameters register, and output module.</p>
<p>The image sensor module consists of a <italic>2N</italic> × <italic>2N</italic> 3-transistor photodiode-type active pixel sensor (APS) [<xref ref-type="bibr" rid="b17-sensors-09-05933">17</xref>] array, with row and column decoder circuits in the periphery of the sensor array. The row decoder is realized by a four-input multiplexer and is controlled by the instruction from the parameter register. <xref ref-type="fig" rid="f2-sensors-09-05933">Figure 2</xref> shows that the decoder can work in four different modes. In each mode the sensor array outputs <italic>N</italic> rows of the whole <italic>2N</italic> × <italic>2N</italic> image into the <italic>N</italic> ADCs module. The column decoder is a common decoder and is controlled by a Finite State Machine (FSM). Under FSM instruction, the column decoder can select <italic>N</italic> columns from the <italic>2N</italic>-column image, either as consecutive columns or as one in every two columns. Therefore an <italic>N</italic> × <italic>N</italic> area selected in the <italic>2N</italic> × <italic>2N</italic> image can be output into the <italic>N</italic> ADCs module. The feature of this image sensor module is that it can emulate the human eye function and focus on a specified area of the image.</p>
<p>The ADCs module consists of <italic>N</italic> row-parallel 6-bit ADCs. Even though a lot of work, such as that reported in [<xref ref-type="bibr" rid="b18-sensors-09-05933">18</xref>–<xref ref-type="bibr" rid="b21-sensors-09-05933">21</xref>], has been done on pixel-level ADCs, we chose the row-parallel (named column-parallel in some papers) structure [<xref ref-type="bibr" rid="b22-sensors-09-05933">22</xref>] for analog-to-digital conversion. The ADC we implemented in the chip is based on an algorithmic approach [<xref ref-type="bibr" rid="b23-sensors-09-05933">23</xref>]. It meets the vision specifications and has a smaller chip area and lower power consumption than a pixel-level ADC. It can convert the signals of the <italic>N</italic> pixels in one column simultaneously. The conversion period depends on the gray-scale resolution.</p>
<p>For a binary image, the ADC finishes the conversion in 2 clock cycles. For a 6-bit gray-scale image, the ADC converts the image in 7 clock cycles. Therefore we can trade image quality against conversion time to suit different vision applications. The gray-scale image processors receive the digital image from the <italic>N</italic> ADCs or from the PE array. Each processor consists of three 6-bit pixel-data registers and an 11-bit ALU, as shown in <xref ref-type="fig" rid="f3-sensors-09-05933">Figure 3</xref>.</p>
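<p>The window selection described above can be illustrated with a small software model. The following Python sketch is only an illustration under stated assumptions: the function name <italic>select_window</italic>, its parameters, and the two selection modes shown (a contiguous window and one-in-two subsampling) are hypothetical and do not reproduce the four decoder modes of Figure 2.</p>
<preformat>
```python
import numpy as np

def select_window(frame, row_offset=0, col_offset=0, subsample=False):
    """Return an N x N image taken from a 2N x 2N frame.

    subsample=False: contiguous N x N window (full resolution, focused area).
    subsample=True:  every second row and column (half resolution, general view).
    Both modes are illustrative assumptions, not the chip's decoder modes.
    """
    n = frame.shape[0] // 2
    if subsample:
        return frame[::2, ::2]
    return frame[row_offset:row_offset + n, col_offset:col_offset + n]

# Example with a 64 x 64 frame, so N = 32.
frame = np.arange(64 * 64).reshape(64, 64)
focus = select_window(frame, row_offset=16, col_offset=16)
overview = select_window(frame, subsample=True)
```
</preformat>
<p>Both calls return an <italic>N</italic> × <italic>N</italic> image, so the downstream ADCs and processors see the same data size regardless of the chosen resolution.</p>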
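<p>The cycle counts quoted above are consistent with a converter that spends one cycle on sampling and one cycle per output bit. The following Python sketch models a generic algorithmic (cyclic) conversion under that assumption; the function name, the normalized reference voltage, and the cycle accounting are illustrative assumptions and are not taken from the circuit of this chip.</p>
<preformat>
```python
def algorithmic_adc(v_in, bits=6, v_ref=1.0):
    """Convert v_in in [0, v_ref) to a `bits`-bit code, most significant bit first.

    Returns (code, cycles). The cycle count assumes one sampling cycle
    plus one cycle per resolved bit (an assumption for illustration).
    """
    residue = v_in
    code = 0
    for _ in range(bits):
        residue = 2.0 * residue              # double the residue
        bit = 1 if residue >= v_ref else 0   # compare with the reference
        residue = residue - bit * v_ref      # subtract the reference if the bit is set
        code = 2 * code + bit
    return code, bits + 1

gray_code, gray_cycles = algorithmic_adc(0.5, bits=6)
binary_code, binary_cycles = algorithmic_adc(0.75, bits=1)
```
</preformat>
<p>With these assumptions a 1-bit conversion takes 2 cycles and a 6-bit conversion takes 7 cycles, which matches the figures given in the text.</p>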
<p>The ALU in row-parallel processor <italic>i</italic> can access its own three pixel-data registers D<italic><sub>i</sub></italic>[j], D<italic><sub>i</sub></italic>[j − 1], and D<italic><sub>i</sub></italic>[j − 2], the three pixel-data registers D<italic><sub>i</sub></italic><sub>−1</sub>[j], D<italic><sub>i</sub></italic><sub>−1</sub>[j − 1], and D<italic><sub>i</sub></italic><sub>−1</sub>[j − 2] in row-parallel processor <italic>i</italic> − 1, and the three pixel-data registers D<italic><sub>i</sub></italic><sub>+1</sub>[j], D<italic><sub>i</sub></italic><sub>+1</sub>[j − 1], and D<italic><sub>i</sub></italic><sub>+1</sub>[j − 2] in row-parallel processor <italic>i</italic> + 1. We terminate the boundary of the sensor array with low voltage (logic ‘0’). This boundary condition is required by <italic>mathematical morphology</italic> image processing. The ALU can process the data of a 3 × 3 array in the image and perform 8 basic operations: <italic>‘add’</italic>, <italic>‘subtract’</italic>, <italic>‘minimum’</italic>, <italic>‘maximum’</italic>, <italic>‘comparison’</italic>, <italic>‘equal’</italic>, <italic>‘absolute value’</italic>, and <italic>‘shift’</italic>. The processors process one column of image data in each period. The <italic>gray-scale mathematical morphology</italic> algorithms can be executed by repeatedly and successively combining these operations.</p>
<p>The core module of the chip is an <italic>N</italic> × <italic>N</italic> PE array. The PE diagram is given in <xref ref-type="fig" rid="f4-sensors-09-05933">Figure 4</xref>. Each PE consists of nine D-latches, two multiplexers, an AND gate, an OR gate, an inverter, and eight switches. Each PE is connected directly to its four nearest-neighbor PEs. MUX1 selects the input signal of the PE, which comes either from a neighboring PE or from the output of the PE itself. When the switch RS[i] is turned on, the negative latch NL[i] and the positive latch PL constitute a D flip-flop. There are thus 8 D flip-flops, which can store 8 bits of data in one PE. The clocks (clkn[0-7]) of the eight NL[i] (i = 0, 1,…,7) are controlled independently by the instruction controller. The PE can perform logical operations between the data in one of NL[1–7] and the data in NL[0] when the corresponding switch among RS[1–7] is turned on. MUX2 selects one of the operations: AND, OR, Invert, and Keeping Data.</p>
<p>There are an X processor, a Y processor, a PE data I/O module, and 2<italic>N</italic> PMOS transistors in the periphery of the PE array. One PMOS transistor and <italic>N</italic> NMOS transistors in the PE array constitute an <italic>N</italic>-input pseudo-NMOS NOR gate. The <italic>N</italic> NMOS transistors are located in <italic>N</italic> different PEs, one per PE. The PE array thus contains <italic>N</italic> row-wise and <italic>N</italic> column-wise <italic>N</italic>-input pseudo-NMOS NOR gates. The outputs of these <italic>N</italic>-input pseudo-NMOS NOR gates are connected to the X processor and the Y processor.</p>
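<p>The behavior of the PE array can be approximated in software by treating each of the eight 1-bit registers as an <italic>N</italic> × <italic>N</italic> bit plane that is updated by a single instruction. The following Python sketch is a toy model only: the class name <italic>PEArray</italic>, the method names, the choice to write logic results back to register 0, and the interpretation of Invert as the complement of the selected register are all assumptions for illustration and are not derived from the circuit in Figure 4.</p>
<preformat>
```python
import numpy as np

class PEArray:
    """Toy SIMD model: each PE holds eight 1-bit registers (hypothetical layout)."""

    def __init__(self, n):
        # reg[k] is the bit plane formed by register k of every PE.
        self.reg = np.zeros((8, n, n), dtype=bool)

    def shift_in(self, dst, src, dy, dx):
        """Each PE loads register src of the neighbour at offset (dy, dx) into dst.

        Logic 0 enters at the array border.
        """
        n = self.reg.shape[1]
        padded = np.pad(self.reg[src], 1, constant_values=False)
        self.reg[dst] = padded[1 + dy:1 + dy + n, 1 + dx:1 + dx + n]

    def logic(self, op, i):
        """Combine register i with register 0; the result goes to register 0 (assumption)."""
        a, b = self.reg[0], self.reg[i]
        if op == "and":
            self.reg[0] = np.logical_and(a, b)
        elif op == "or":
            self.reg[0] = np.logical_or(a, b)
        elif op == "invert":
            self.reg[0] = np.logical_not(b)
        elif op == "keep":
            pass
        else:
            raise ValueError(op)

pe = PEArray(4)
pe.reg[0, 1, 1] = True
pe.shift_in(1, 0, 0, -1)  # every PE reads register 0 of its left neighbour
pe.logic("or", 1)         # register 0 now holds the original pixel and its shifted copy
```
</preformat>
<p>Because every PE executes the same instruction, one call in this model corresponds to one array-wide operation, independent of <italic>N</italic>.</p>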
<p>The X processor contains <italic>N</italic> X Processor Units (XPU) and <italic>N</italic> 5-bit ROM that stores the X coordinates. The X processor receives the projection of the binary image, which is stored in all NL[i] of each PE, on the X-axis. The positions of the left edge and the right edge of the projection are found by XPUs. The X coordinates of the positions are obtained by checking the ROM.</p>
<p>The Y processor contains <italic>N</italic> Y Processor Units (YPU) and <italic>N</italic> 5-bit ROM that stores the Y coordinates. Its inputs are either the outputs of the <italic>N</italic>-input pseudo-NMOS NOR gates in rows or the parallel shift-out data of the binary image in the PE array. The Y processor not only gives the Y coordinates of the top edge and the bottom edge of the projection of the image on the Y-axis, but also obtains, in sequence, the Y coordinates of the activated pixels in a column of the image. The X coordinates and the Y coordinates from the X processor and the Y processor are transferred to the coordinates output module. The instruction controller connects to the outside circuit. It receives all algorithm instructions and sends them to the other modules. The parameter registers store the initial parameters for each module. The output module receives the X coordinates and the Y coordinates from the X processor and the Y processor and outputs them in series.</p></sec>
<sec>
<label>2.2.</label>
<title>Operations</title>
<p>The proposed programmable vision chip can perform image acquisition, low- and mid-level image processing, and fast output of image features in a designed procedure. The low- and mid-level image processing is mainly carried out by the <italic>Mathematical Morphology</italic> method [<xref ref-type="bibr" rid="b3-sensors-09-05933">3</xref>]. The chip processes the gray-scale image and the binary image in series. First the gray-scale image processors use <italic>gray-scale mathematical morphology</italic> to process the gray-scale image in row-parallel fashion. After some low-/mid-level image processing is finished and the precise skeleton of some objects in the image is extracted, the chip binarizes the gray-scale image and then uses <italic>binary mathematical morphology</italic> to process the image in pixel-parallel fashion. Pixel-parallel processing is <italic>N</italic> times faster than row-parallel processing, and a binary image contains less data than a gray-scale image. Processing the gray-scale image first and the binary image afterwards avoids a trade-off between the processing time of each frame and the accuracy of the output data, so that the image processing can be carried out quickly while the useful feature information is still obtained from the image.</p>
<sec>
<label>2.2.1.</label>
<title>Binary Mathematical Morphology</title>
<p><italic>Mathematical morphology</italic> is an advanced image processing method that is based on basic logical operations. The method not only performs low-level image processing such as morphological filtering, but also carries out mid-level image processing, such as extracting image objects or features.</p>
<p>The operations in the method can be described by set theory. The two fundamental operations are <italic>erosion</italic> and <italic>dilation</italic>. If <italic>A</italic> and <italic>B</italic> are two images, the <italic>erosion</italic> of <italic>A</italic> by <italic>B</italic>, denoted <italic>A Θ B</italic>, is defined as:
<disp-formula id="FD1">
<label>(1)</label>
<mml:math display="block">
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mi> </mml:mi>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mi> </mml:mi>
<mml:mi>B</mml:mi>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∩</mml:mo>
<mml:mrow>
<mml:mi>b</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>B</mml:mi></mml:mrow></mml:munder>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>A</mml:mi>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>and the <italic>dilation</italic> of <italic>A</italic> by <italic>B</italic>, denoted <italic>A ⊕ B</italic>, is defined as:
<disp-formula id="FD2">
<label>(2)</label>
<mml:math display="block">
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mi> </mml:mi>
<mml:mo>⊕</mml:mo>
<mml:mi> </mml:mi>
<mml:mi>B</mml:mi>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∪</mml:mo>
<mml:mrow>
<mml:mi>b</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>B</mml:mi></mml:mrow></mml:munder>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>A</mml:mi>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow>
<mml:mi>b</mml:mi></mml:msub></mml:mrow></mml:math></disp-formula>where the notation <italic>(A)<sub>b</sub></italic> is the <italic>translation</italic> or <italic>shift</italic> of the image <italic>A</italic> by point <italic>b = (x, y)</italic>, and is defined as:
<disp-formula id="FD3">
<label>(3)</label>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>A</mml:mi>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow>
<mml:mi>b</mml:mi></mml:msub>
<mml:mi> </mml:mi>
<mml:mo>=</mml:mo>
<mml:mi> </mml:mi>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi> </mml:mi>
<mml:mi> </mml:mi>
<mml:mi>a</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>A</mml:mi></mml:mrow>
<mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>Usually the image <italic>B</italic> is regarded as a structuring element. Examples of <italic>erosion</italic> and <italic>dilation</italic> are given in <xref ref-type="fig" rid="f5-sensors-09-05933">Figure 5</xref>. We assume that the black box represents the activated pixel and its value is logic ‘1’ in a binary image. <xref ref-type="fig" rid="f5-sensors-09-05933">Figures 5(a)</xref> and <xref ref-type="fig" rid="f5-sensors-09-05933">5(b)</xref> show the original image and the structuring element, respectively. The results of the <italic>erosion</italic> and <italic>dilation</italic> operations are shown in <xref ref-type="fig" rid="f5-sensors-09-05933">Figures 5(c)</xref> and <xref ref-type="fig" rid="f5-sensors-09-05933">5(d)</xref>. Other operations, such as <italic>opening</italic>, <italic>closing</italic>, and <italic>hit-or-miss transform</italic>, are realized by combinations of <italic>erosion</italic>, <italic>dilation,</italic> and logical operations.</p>
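<p>The set definitions of <italic>erosion</italic>, <italic>dilation</italic>, and <italic>translation</italic> can be written directly as shift-and-combine loops. The following Python sketch is a software reference under stated assumptions: it uses the common convention in which <italic>erosion</italic> intersects translations by −<italic>b</italic> and <italic>dilation</italic> unites translations by <italic>b</italic>, it represents the structuring element as a list of (<italic>x</italic>, <italic>y</italic>) offsets, and the function names are hypothetical. It does not model the timing of the chip.</p>
<preformat>
```python
import numpy as np

def translate(image, dx, dy):
    """Shift a binary image by (dx, dy); pixels entering from outside are 0."""
    h, w = image.shape
    pad = max(abs(dx), abs(dy))
    padded = np.pad(image, pad, constant_values=False)
    return padded[pad - dy:pad - dy + h, pad - dx:pad - dx + w]

def erode(image, offsets):
    """Intersection (AND) of the image translated by -b for every b in B."""
    out = np.ones_like(image, dtype=bool)
    for dx, dy in offsets:
        out = np.logical_and(out, translate(image, -dx, -dy))
    return out

def dilate(image, offsets):
    """Union (OR) of the image translated by b for every b in B."""
    out = np.zeros_like(image, dtype=bool)
    for dx, dy in offsets:
        out = np.logical_or(out, translate(image, dx, dy))
    return out

# 3 x 3 square in a 5 x 5 image, cross-shaped structuring element.
image = np.zeros((5, 5), dtype=bool)
image[1:4, 1:4] = True
cross = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
eroded = erode(image, cross)
dilated = dilate(image, cross)
```
</preformat>
<p>Each offset in the structuring element costs one shift and one logic operation in this model, so the work grows linearly with the number of activated pixels in the structuring element.</p>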
<p><italic>Erosion</italic> and <italic>dilation</italic> consist of a series of logic operations including <italic>shift</italic>, <italic>OR</italic> and <italic>AND</italic>. This programmable vision chip efficiently realizes <italic>erosion</italic> and <italic>dilation</italic> by pipelining <italic>Shift</italic> and <italic>OR</italic> or <italic>AND</italic> operations between the registers in the PE. The procedure is regular regardless of the structuring element. The number of clock cycles used to perform <italic>erosion</italic> or <italic>dilation</italic> in the chip can be estimated as <italic>2M</italic>, where <italic>M</italic> is the number of activated pixels in the structuring element. All of these operations can be easily implemented by the chip. With a function that detects a <italic>void image</italic>, defined as an image without any activated pixel, advanced algorithms of <italic>mathematical morphology</italic> such as <italic>region growing</italic> and <italic>convex hull</italic> can be implemented in the chip.</p></sec>
<sec>
<label>2.2.2.</label>
<title>Gray-Scale Mathematical Morphology</title>
<p><italic>Gray-scale mathematical morphology</italic> is extended from <italic>mathematical morphology</italic>. We assume that <italic>f(x, y)</italic> is the input image and <italic>b(x, y)</italic> is a structuring element, itself a sub-image function.</p>
<p>The <italic>gray-scale erosion</italic>, denoted <italic>f Θ b</italic>, is defined as:
<disp-formula id="FD4">
<label>(4)</label>
<mml:math display="block">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi> </mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mtext>min</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>−</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>D</mml:mi>
<mml:mi>f</mml:mi></mml:msub>
<mml:mo>;</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>D</mml:mi>
<mml:mi>b</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>and the <italic>gray-scale dilation,</italic> denoted <italic>f ⊕ b</italic>, is defined as:
<disp-formula id="FD5">
<label>(5)</label>
<mml:math display="block">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>⊕</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi> </mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mtext>max</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>D</mml:mi>
<mml:mi>f</mml:mi></mml:msub>
<mml:mo>;</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>D</mml:mi>
<mml:mi>b</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where <italic>D<sub>f</sub></italic> and <italic>D<sub>b</sub></italic> are the domains of <italic>f</italic> and <italic>b</italic>, respectively. The expressions for the <italic>opening</italic> and <italic>closing</italic> operations on a gray-scale image have the same form as their binary counterparts. All of these operations are implemented by the row-parallel processors, that is, the gray-scale image processors. These processors process the gray-scale image in row-parallel fashion and output the data into the PE array column by column. Two important operations, <italic>opening</italic> and <italic>closing</italic>, can be realized by combining the basic operations <italic>gray-scale erosion</italic> and <italic>gray-scale dilation</italic>.</p>
<p>The opening of image <italic>f</italic> by sub-image (structuring element) <italic>b</italic>, denoted <italic>f ○ b</italic>, is:
<disp-formula id="FD6">
<label>(6)</label>
<mml:math display="block">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>∘</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi> </mml:mi>
<mml:mo>⊕</mml:mo>
<mml:mi>b</mml:mi></mml:mrow></mml:math></disp-formula></p>
<p>Similarly, the closing of <italic>f</italic> by <italic>b</italic>, denoted <italic>f • b</italic>, is:
<disp-formula id="FD7">
<label>(7)</label>
<mml:math display="block">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>•</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>⊕</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi> </mml:mi>
<mml:mi mathvariant="normal">Θ</mml:mi>
<mml:mi>b</mml:mi></mml:mrow></mml:math></disp-formula></p>
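<p>The gray-scale operations can be checked with a serial reference model. The following Python sketch assumes that <italic>gray-scale erosion</italic> takes the minimum and <italic>gray-scale dilation</italic> takes the maximum over the structuring element, restricted to the image domain, and that <italic>opening</italic> and <italic>closing</italic> are composed from them as in Equations (6) and (7). The function names and the dictionary representation of <italic>b</italic> are hypothetical, and the loops do not reflect the row-parallel hardware or its logic ‘0’ boundary termination.</p>
<preformat>
```python
import numpy as np

def gray_erode(f, b):
    """Minimum of f(s + x, t + y) - b(x, y) over offsets that stay inside the image."""
    rows, cols = f.shape
    out = np.empty_like(f)
    for s in range(rows):
        for t in range(cols):
            # b must contain (0, 0) so that at least one offset is always valid.
            vals = [f[s + x, t + y] - v for (x, y), v in b.items()
                    if rows > s + x >= 0 and cols > t + y >= 0]
            out[s, t] = min(vals)
    return out

def gray_dilate(f, b):
    """Maximum of f(s - x, t - y) + b(x, y) over offsets that stay inside the image."""
    rows, cols = f.shape
    out = np.empty_like(f)
    for s in range(rows):
        for t in range(cols):
            vals = [f[s - x, t - y] + v for (x, y), v in b.items()
                    if rows > s - x >= 0 and cols > t - y >= 0]
            out[s, t] = max(vals)
    return out

def opening(f, b):
    return gray_dilate(gray_erode(f, b), b)

def closing(f, b):
    return gray_erode(gray_dilate(f, b), b)

# Flat 3 x 3 structuring element and an image with one bright spike.
flat = {(x, y): 0 for x in (-1, 0, 1) for y in (-1, 0, 1)}
f = np.zeros((5, 5), dtype=np.int32)
f[2, 2] = 9
opened = opening(f, flat)
closed = closing(f, flat)
```
</preformat>
<p>In this example the <italic>opening</italic> removes the isolated bright spike, whereas the <italic>closing</italic> leaves the image unchanged, which is the usual noise-suppression behavior of these two operations.</p>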
<p>Most of the gray-scale morphology operations, such as morphological <italic>smoothing</italic>, <italic>gradient</italic>, and <italic>top-hat transformation</italic>, can be realized by combining <italic>gray-scale erosion</italic> and <italic>gray-scale dilation</italic>. The gray-scale morphological operations are a powerful set of tools for extracting features of objects of interest in an image and recognizing them. This programmable vision chip performs the gray-scale image processing first, after the image is captured. It then converts the gray-scale image to a binary image and uses <italic>binary mathematical morphology</italic> for subsequent processing; for example, <italic>Thinning</italic> or <italic>Skeletons</italic> can be used to obtain the main part of the target object. <xref ref-type="table" rid="t1-sensors-09-05933">Table 1</xref> shows the operation time performance of <italic>binary</italic> and <italic>gray-scale mathematical morphology</italic> on the structure of our chip.</p></sec>
<sec>
<label>2.2.3.</label>
<title>Detecting a Void Image</title>
<p>A very useful function of the chip is to detect whether an image is a <italic>void image</italic>. After <italic>subtraction</italic> is performed between two images, the two images are known to be equal if the resulting image is a <italic>void image</italic>. In many iterative algorithms the instruction controller terminates the iteration when a <italic>void image</italic> is detected. The function of <italic>detecting a void image</italic> is realized by the NOR gates and the Y processor. If the Y processor detects <italic>a void image</italic>, it outputs logic ‘1’ at the port ‘void’. An example is shown in <xref ref-type="fig" rid="f6-sensors-09-05933">Figure 6(a)</xref>.</p></sec>
<sec>
<label>2.2.4.</label>
<title>Extracting the Range and the Center of a Region</title>
<p>The architecture of our chip is very efficient at extracting the range and the center of a single region in an image. If there are several regions in the image, they must be separated first. <xref ref-type="fig" rid="f6-sensors-09-05933">Figure 6(b)</xref> shows the global operation on the chip. The image in one register array is projected onto the X-axis and the Y-axis by the <italic>2N</italic> pseudo-NMOS NOR gates. The coordinates of the left edge <italic>x<sub>min</sub></italic> and the right edge <italic>x<sub>max</sub></italic> of the projection of the region on the X-axis are extracted in the X processor. The coordinates of the top edge <italic>y<sub>min</sub></italic> and the bottom edge <italic>y<sub>max</sub></italic> of the projection of the region on the Y-axis are extracted in the Y processor. The four edges indicate the range of the region and are sent to the coordinates output control module, where the center <italic>P<sub>c</sub>(x<sub>c</sub>, y<sub>c</sub>)</italic> of the region is obtained as:
<disp-formula id="FD8">
<label>(8)</label>
<mml:math display="block">
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>c</mml:mi></mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mi>c</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:mi> </mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mi>c</mml:mi></mml:msub></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>c</mml:mi></mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext></mml:mrow></mml:msub></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="true">/</mml:mo>
<mml:mn>2</mml:mn></mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi> </mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow></mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext></mml:mrow></mml:msub></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="true">/</mml:mo>
<mml:mn>2</mml:mn></mml:mrow></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>Only eight clock cycles are needed to calculate the center and the range of a region. The clock frequency is determined by the NOR gates and the circuits in the X processor and the Y processor.</p>
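<p>The projection-based extraction described above can be illustrated in software. The following is only a behavioural sketch, not the chip's circuit: the function name <monospace>region_range_and_center</monospace>, the top-left origin, and the use of integer division for the center are assumptions made for the example.</p>
<preformat preformat-type="code"><![CDATA[
```python
def region_range_and_center(image):
    """Return ((x_min, x_max, y_min, y_max), (x_c, y_c)) for a binary image.

    `image` is a list of rows of 0/1 pixels holding a single region.
    Assumptions (not from the paper): origin at the top-left corner,
    integer division when halving, None returned for a void image.
    """
    n_cols = len(image[0])
    # Projection onto the X-axis: a column is active if any pixel in it is 1.
    col_proj = [int(any(row[x] for row in image)) for x in range(n_cols)]
    # Projection onto the Y-axis: a row is active if any pixel in it is 1.
    row_proj = [int(any(row)) for row in image]
    if not any(col_proj):
        return None  # void image: no region to measure

    # First and last active positions of each projection give the four edges.
    x_min = col_proj.index(1)
    x_max = n_cols - 1 - col_proj[::-1].index(1)
    y_min = row_proj.index(1)
    y_max = len(row_proj) - 1 - row_proj[::-1].index(1)

    # Center as in Equation (8): the midpoint of the range on each axis.
    center = ((x_min + x_max) // 2, (y_min + y_max) // 2)
    return (x_min, x_max, y_min, y_max), center
```
]]></preformat>
<p>In this sketch the two list scans play the role of the two opposite search directions along each projection; on the chip these searches are carried out by the X processor and the Y processor.</p>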
<p>In some applications, such as target tracking, a single point is usually needed to represent the position of an object. Compared with extracting the centroid by global summation, as used in other work [<xref ref-type="bibr" rid="b11-sensors-09-05933">11</xref>], our vision chip obtains the center of the target easily and quickly.</p></sec>
<sec>
<label>2.2.5.</label>
<title>Extracting the Coordinates of Activated Pixels</title>
<p>The results of mid-level image processing are image features that mostly appear as points, boundaries, edges or skeletons of the objects in the image. These features must be sent off chip quickly, in a suitable format, for further processing. The chip can extract the coordinates of activated pixels in the image. The features can therefore be output as the coordinates of activated pixels, so that information is transferred from the chip to other digital processors quickly and without an I/O bottleneck. Another reason for outputting features as coordinates is that coordinates are easy to process further. Many descriptors or representations, such as area, curvature, and chain code, can be obtained directly from coordinates.</p>
<p>The procedure by which the chip obtains the coordinates of activated pixels in the image in one register array, column by column, is given in <xref ref-type="fig" rid="f6-sensors-09-05933">Figure 6(c)</xref>. First, the data from the first column of the image are transferred into the Y processor. The Y processor then searches for the activated pixels in the column one by one, from the bottom to the top, and at the same time generates their Y coordinates. After the coordinates of all activated pixels in the column have been generated, the PE array sends the image data of the next column to the Y processor, which begins to generate the coordinates of the activated pixels in the new column. This process is repeated until the Y coordinates of the activated pixels in the last column of the image have been generated. The X coordinates are simply generated by a column counter in the coordinates output control module.</p>
<p>A vision chip reported in [<xref ref-type="bibr" rid="b10-sensors-09-05933">10</xref>] has a similar function of searching for activated pixels and extracting their coordinates. It uses a row-parallel search architecture with a 432 MHz clock frequency. Its parallel search and high clock frequency cost a large area and high power consumption. In addition, the processing speed of the whole system is seriously limited by the digital data readout. Buffers are therefore used to store the coordinates before they are sent out, which increases the chip area. In comparison, the architecture of our chip places the search circuits only in the Y processor, so that the speed of extracting the coordinates matches the speed of sending them. In this way, the clock frequency in the PE array can be much lower than that in the Y processor, no circuits for searching activated pixels exist in the PE array, and no output buffer is used. As a result, both area and power are saved.</p></sec></sec></sec>
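<p>As a rough software analogue of this column-by-column procedure, the sketch below lists the coordinates in the order in which they would be generated. It is a behavioural illustration only; the function name <monospace>extract_coordinates</monospace> and the convention that row index 0 is the top row (so that the bottom-to-top search visits decreasing row indices) are assumptions made for the example.</p>
<preformat preformat-type="code"><![CDATA[
```python
def extract_coordinates(image):
    """List (x, y) coordinates of activated pixels, column by column.

    `image` is a list of rows of 0/1 pixels. Assumption (not from the
    paper): row index 0 is the top row, so searching "from the bottom
    to the top" visits the largest row index first.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    coords = []
    for x in range(n_cols):  # the column counter supplies the X coordinate
        # Copy one column, as if it were transferred into the Y processor.
        column = [image[y][x] for y in range(n_rows)]
        while any(column):
            # Find the first activated pixel, starting from the bottom.
            y = max(i for i, bit in enumerate(column) if bit)
            coords.append((x, y))
            # Clear the bit just found so the search can continue.
            column[y] = 0
    return coords
```
]]></preformat>
<p>For a 3 × 2 image with rows <monospace>[0, 1]</monospace>, <monospace>[1, 0]</monospace> and <monospace>[1, 1]</monospace>, the sketch yields (0, 2), (0, 1), (1, 2), (1, 0): one coordinate per search step, with an empty column costing no output.</p>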
<sec sec-type="methods">
<label>3.</label>
<title>VLSI Circuits Design</title>
<p>This vision chip is implemented in 0.18 μm CMOS technology with 3.3 V and 1.8 V supply voltages. The design of the main circuit blocks is described below.</p>
<sec>
<label>3.1.</label>
<title>CMOS Image Sensor</title>
<p>A 3-transistor photodiode-type APS [<xref ref-type="bibr" rid="b17-sensors-09-05933">17</xref>] in a standard salicide CMOS process is used in the chip. The APS CMOS image sensor has been widely discussed in [<xref ref-type="bibr" rid="b24-sensors-09-05933">24</xref>]. To meet the requirement of 1,000 fps performance, the integration time of the photodiode has to be less than 1 ms for each frame. This requires a high-sensitivity photodiode in the sensor array, so an N-well/P-sub SAB diode without salicide is used as the photodiode. SAB is the salicide block mask layer, which is used only in standard salicide CMOS processes to block salicide formation. With this photodiode, we obtain a short integration time and a high relative spectral quantum efficiency. The test results and an explanation of the N-well/P-sub SAB diode without salicide have been presented in our previous work [<xref ref-type="bibr" rid="b16-sensors-09-05933">16</xref>].</p></sec>
<sec>
<label>3.2.</label>
<title>Row-Parallel 6-bit Algorithmic ADCs</title>
<p>The algorithmic ADC has been widely used in CMOS image sensors for many years [<xref ref-type="bibr" rid="b23-sensors-09-05933">23</xref>]. In the vision chip, we chose a traditional algorithmic ADC structure. The diagram of the ADC is shown in <xref ref-type="fig" rid="f7-sensors-09-05933">Figure 7</xref>. V<italic><sub>in</sub></italic> is the output signal of one pixel in the sensor array. V<italic><sub>bias</sub></italic> is the circuit bias voltage. V<italic><sub>ref</sub></italic> and V<italic><sub>offset</sub></italic> are two off-chip reference voltages for the analog-to-digital conversion. <italic>Φ</italic><sub>1</sub> and <italic>Φ</italic><sub>2</sub> are non-overlapping two-phase clocks. <italic>Φ</italic><sub>A</sub>, <italic>Φ</italic><sub>B</sub>, <italic>Φ</italic><sub>C</sub> and <italic>Φ</italic><sub>D</sub> are switch signals derived from <italic>Φ</italic><sub>1</sub> and <italic>Φ</italic><sub>2</sub>. The sampled signal is multiplied by 2 in the operational amplifier <italic>‘Op_1’</italic> and held in <italic>‘Op_2’</italic>. The output voltage of <italic>‘Op_2’</italic> is then compared with a reference voltage in the comparator <italic>‘Comp’</italic>. After the comparison, one bit of the digital result is output per clock cycle of <italic>Φ</italic><sub>1</sub> or <italic>Φ</italic><sub>2</sub>. At the next clock, the output voltage of <italic>‘Op_2’</italic> is fed back to the input of the ADC to produce the next bit of the digital output. Converting an analog signal into a 6-bit digital signal takes seven clock cycles: one for sampling and six for the 6-bit output.</p></sec>
<sec>
<label>3.3.</label>
<title>Row-Parallel Processors</title>
<p>The row-parallel processor is designed to calculate the sum, difference and comparison of two 6-bit data. The diagram is shown in <xref ref-type="fig" rid="f8-sensors-09-05933">Figure 8</xref>. The ‘Buf’ converts the serial input data to parallel data. <italic>‘D_Shift_Enable’</italic> controls the transfer of the data column by column. <italic>‘B_Sel’</italic> switches the input of the ALU. Because the sum of nine 6-bit data fits within 11 bits, the data width of the ALU is designed as 11 bits. The ALU is composed of eleven single-bit ALUs. The operating instruction, <italic>‘Operation’</italic>, comes from off-chip circuits.</p></sec>
<sec>
<label>3.4.</label>
<title>Search Chain in X Processors and Y Processors</title>
<p>The search chain finds the first logic ‘1’ in a series of bits along a given direction. The length <italic>L</italic> of the search chain is defined as the number of bits being searched in the chain. An example of a search chain with <italic>L = 8</italic> is given in <xref ref-type="fig" rid="f9-sensors-09-05933">Figure 9</xref>. The search chain consists of 8 search chain units (SCU). It finds the position of the first logic ‘1’, from left to right, in the 8-bit parallel inputs ‘SC_in’. At the position of the first logic ‘1’, the corresponding bit in the parallel outputs ‘SC_out’ becomes logic ‘0’, and all other bits are logic ‘1’. The input signal ‘SC_active’ controls the operation of the search chain. The search chain operates only when ‘SC_active’ is set high. Another output signal, ‘SC_end’, comes from the end of the search chain. If ‘SC_active’ is high, ‘SC_end’ is high only when all parallel-input bits of the search chain are logic ‘0’. ‘SC_end’ is quite useful in some operations. For example, the output port ‘void’ of the Y processor is directly connected to ‘SC_end’. Search circuits with a similar function, realized in dynamic logic, were reported in [<xref ref-type="bibr" rid="b9-sensors-09-05933">9</xref>] and [<xref ref-type="bibr" rid="b10-sensors-09-05933">10</xref>]. Compared with the dynamic logic circuits, the static logic circuit of our search chain has the advantages of easy implementation and tolerance to noise.</p>
<p>The search time is principally determined by the delay of the transmission gates and is proportional to the length <italic>L</italic> of the search chain. The longest delay is 12.12 ns if <italic>L</italic> is 128 and a buffer is inserted between every three transmission gates. The search time is much less than that of the search circuits in [<xref ref-type="bibr" rid="b9-sensors-09-05933">9</xref>] and [<xref ref-type="bibr" rid="b10-sensors-09-05933">10</xref>]. The longest search time is 71 ns for 128 pixels per row in [<xref ref-type="bibr" rid="b9-sensors-09-05933">9</xref>] and 30 ns in [<xref ref-type="bibr" rid="b10-sensors-09-05933">10</xref>].</p>
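<p>The logical behaviour of the search chain described above can be summarized by a short functional model. This is a sketch of the input/output relation only, not of the transmission-gate circuit or its timing; the function name <monospace>search_chain</monospace> and the outputs chosen for the inactive case (all ‘SC_out’ bits at logic ‘1’ and ‘SC_end’ low) are assumptions made for the example.</p>
<preformat preformat-type="code"><![CDATA[
```python
def search_chain(sc_in, sc_active=1):
    """Functional model of a search chain of length L = len(sc_in).

    Returns (sc_out, sc_end). sc_out is all ones except for a single 0
    at the position of the first logic 1 in sc_in, searching from left
    to right. sc_end is 1 only when the chain is active and every input
    bit is 0.

    Assumption (not from the paper): when sc_active is 0 the model
    returns all ones on sc_out and sc_end = 0.
    """
    sc_out = [1] * len(sc_in)
    if not sc_active:
        return sc_out, 0
    for position, bit in enumerate(sc_in):
        if bit:
            sc_out[position] = 0  # mark the first logic 1 with a 0
            return sc_out, 0      # search stops; the end is not reached
    return sc_out, 1              # no logic 1 found: the chain reports end
```
]]></preformat>
<p>With the input <monospace>[0, 0, 1, 0, 1, 0, 0, 0]</monospace> the model returns <monospace>[1, 1, 0, 1, 1, 1, 1, 1]</monospace> with ‘SC_end’ low, and with an all-zero input it returns all ones with ‘SC_end’ high, which corresponds to the ‘void’ indication of the Y processor.</p>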
<sec>
<label>3.4.1.</label>
<title>XPUs in the X Processor</title>
<p>The X processor contains <italic>N</italic> XPUs. The diagram of the <italic>i</italic>-th XPU, <italic>i = 1, 2, 3,…,N</italic> is shown in <xref ref-type="fig" rid="f10-sensors-09-05933">Figure 10(a)</xref>. It consists of two search chain units SCU1[<italic>i</italic>] and SCU2[<italic>i</italic>] which belong to two search chains with opposite search direction. The input NOR_C[<italic>i</italic>] of the two SCUs is the output of the NOR gate located in the <italic>i</italic>-th column of the PE array. A multiplexer selects the outputs of the two SCUs, and the outputs are sent to the ROM that stores X coordinates.</p></sec>
<sec>
<label>3.4.2.</label>
<title>YPUs in the Y Processor</title>
<p>The diagram of the <italic>j</italic>-th YPU, <italic>j = 1, 2, 3,…,N</italic> is given in <xref ref-type="fig" rid="f10-sensors-09-05933">Figure 10(b)</xref>. Like the <italic>i</italic>-th XPU, it consists of two search chain units SCU1[<italic>j</italic>] and SCU2[<italic>j</italic>], which belong to two search chains with opposite search directions. In addition to NOR_R[<italic>j</italic>], which is the output of the NOR gate in the <italic>j</italic>-th row, it has a further input PE_R[<italic>j</italic>] that carries the data from the <italic>j</italic>-th row of the PE array. If the bit in the register R1[<italic>j</italic>] of the <italic>j</italic>-th YPU is the first logic 1 found by search chain 2, formed by SCU2[<italic>j</italic>], the bit is set to 0 through the path containing the AND gate. Search chain 2 can then continue to find the next bit that has the value 1. When the chip performs the operation of <italic>extracting coordinates of activated pixels</italic>, search chain 2 is used. During the process, Sel1 equals the signal ‘SC_end’ of search chain 2. This makes the process continue automatically after the control signals in the Y processor have been initialized.</p></sec></sec></sec>
<sec>
<label>4.</label>
<title>Chip Implementation and Experiments</title>
<p>The vision chip was designed and fabricated in a 0.18 μm standard CMOS process. The microphotograph of the chip is given in <xref ref-type="fig" rid="f11-sensors-09-05933">Figure 11</xref>. <xref ref-type="table" rid="t2-sensors-09-05933">Table 2</xref> lists the chip's specifications. Experimental results and performance of the vision chip are given below.</p>
<sec>
<label>4.1.</label>
<title>Experiment in Gray-Scale Mathematical Morphology</title>
<p>In this experiment, we used <italic>gray-scale mathematical morphology</italic> to smooth the image and repair some disconnections in it. We used a hand-written English letter ‘A’ as the source image, as shown in <xref ref-type="fig" rid="f12-sensors-09-05933">Figure 12(a)</xref>. First, the CMOS image sensor obtained an original 64 × 64 pixel image, selected a 32 × 32 pixel zoom as the focused image area, and sent it to the ADCs, as shown in <xref ref-type="fig" rid="f12-sensors-09-05933">Figure 12(b)</xref>. The row-parallel processors performed the <italic>opening</italic> and <italic>closing</italic> operations on the image. The template image data were stored in the PE array. <xref ref-type="fig" rid="f12-sensors-09-05933">Figures 12(c), (d), (e) and (f)</xref> show the results after two <italic>opening</italic> and two <italic>closing</italic> operations, respectively. The illegible ‘A’ thus became clearer than in the original image. The gray-scale image processing took 80.2 μs (3,208 cycles of a 40 MHz clock).</p></sec>
<sec>
<label>4.2.</label>
<title>Experiment in Binary Mathematical Morphology</title>
<p>After the row-parallel image processing, the gray-scale image of the illegible letter ‘A’ was converted into a binary one in the row-parallel processors. <xref ref-type="fig" rid="f13-sensors-09-05933">Figure 13(a)</xref> shows the binary image of the converted letter ‘A’. The binary image processing was then performed by the PE array in pixel-parallel binary <italic>mathematical morphology</italic> fashion. <xref ref-type="fig" rid="f13-sensors-09-05933">Figure 13(b–d)</xref> shows the images after the <italic>thinning</italic> operations were performed, which gave a skeleton of the letter ‘A’. The whole binary image processing took 22.7 μs (908 cycles of a 40 MHz clock), about one fourth of the total processing time.</p></sec>
<sec>
<label>4.3.</label>
<title>Target Tracking</title>
<p>Here we give an application example of target tracking. After binary <italic>mathematical morphology</italic> image processing, we obtained a target in a selected zoom area. The X processor and the Y processor gave the X and Y coordinates of the edges of the projection of the target, respectively, from which the center of the target was calculated. The whole processing time, which includes gray-scale image processing, gray-scale to binary conversion, binary image processing and coordinate extraction, is less than 1 ms (40,000 cycles of a 40 MHz clock), so this chip can perform target tracking at 1,000 fps. It also indicates that within 1 ms we can perform more complex algorithms, such as <italic>Top-hat transformation</italic>, <italic>textural segmentation</italic> or <italic>granulometry</italic>, because enough cycles remain available. The frame rate is limited not by the algorithms or the circuits but by the sensitivity of the CMOS image sensor. In this experiment, we used the skeleton of the letter ‘A’ to calculate the coordinates. This is better than using the original image directly, because after region filling and an <italic>erosion</italic> operation the result is a small object that represents the main part of the target. <xref ref-type="fig" rid="f14-sensors-09-05933">Figure 14(d)</xref> shows the trace of a moving letter ‘A’. <xref ref-type="fig" rid="f14-sensors-09-05933">Figures 14(a), (b) and (c)</xref> are three sample images from the tracking process, taken at 1,100 ms, 1,548 ms and 2,008 ms, respectively. The chip can track the moving target and provide the coordinates of its center.</p></sec></sec>
<sec sec-type="discussion">
<label>5.</label>
<title>Discussion of Performance</title>
<p>The chip area is mostly determined by the <italic>N × N</italic> PE array, whose area increases with the square of <italic>N</italic>. The area of the PE is therefore a very important parameter. The area of each PE in the prototype chip is 23 × 29 μm<sup>2</sup>. There is considerable potential to reduce the area of the PE if some reduction in performance is acceptable for a specific application. In contrast with other works [<xref ref-type="bibr" rid="b13-sensors-09-05933">13</xref>,<xref ref-type="bibr" rid="b25-sensors-09-05933">25</xref>–<xref ref-type="bibr" rid="b21-sensors-09-05933">21</xref>], our vision chip achieves more efficient performance and better programmability. This benefits from the flexibility of gray-scale mathematical morphology. <xref ref-type="table" rid="t3-sensors-09-05933">Table 3</xref> shows the comparison.</p>
<p>In our vision chip, a 6-bit ADC has been implemented. If a higher-quality image is required, 8-bit or 10-bit ADCs can be designed into the chip, but they will use more chip area and consume more power. The choice of a 6-bit ADC is a trade-off between quality and cost. For most applications of <italic>gray-scale mathematical morphology</italic>, a 6-bit image is sufficient.</p>
<p>Consider a more practical vision chip with more pixels derived from this one, for example a vision chip with a 256 × 256 pixel PE array. It can still run at 1,000 fps on a 40 MHz clock. Assuming that the 256 × 256 pixel vision chip performs the same algorithm as presented in this paper, the duration of the row-parallel gray-scale image processing is proportional to the number of columns, <italic>O</italic>(<italic>N</italic>), and the duration of the pixel-parallel binary image processing is independent of the number of pixels, <italic>O</italic>(1). We can therefore estimate as follows: the total available time for one frame is 1 ms, or 40,000 cycles of a 40 MHz clock; the gray-scale image processing takes 8 × 80.2 μs = 641.6 μs (25,664 cycles), since 256 columns are eight times the 32 columns processed in the experiment; the binary image processing takes 22.7 μs (908 cycles); the total time is 664.3 μs (26,572 cycles), which is still less than 1 ms.</p></sec>
<sec sec-type="conclusions">
<label>6.</label>
<title>Conclusions</title>
<p>A programmable vision chip with variable resolution and row-pixel-mixed parallel image processors was presented. The chip consists of a CMOS sensor array with row-parallel 6-bit algorithmic ADCs, row-parallel gray-scale image processors, a pixel-parallel SIMD PE array, and an instruction controller. The chip can change its resolution, using high resolution for a focused area and low resolution for a general view, and can perform image processing at a rate of over 1,000 fps. The chip architecture supports both gray-scale and binary <italic>mathematical morphology</italic> operations in row- and pixel-parallel fashions. It can carry out low-level and mid-level image processing and send out features of the image for various applications. A 40 MHz prototype chip with 64 × 64 pixel resolution and 6-bit gray-scale images was fabricated in a 0.18 μm standard CMOS process. The chip area is 1.5 mm × 3.5 mm. Each pixel is 9.5 μm × 9.5 μm and each processing element is 23 μm × 29 μm. The experimental results demonstrated that the chip can perform low- and mid-level image processing and can be applied in real-time vision applications, such as high-speed target tracking. The chip is supplied with 3.3 V and 1.8 V. Its power consumption is 82.5 mW (@ 1,000 fps &amp; 40 MHz). It is anticipated that the chip will find wide application in real-time vision, such as medical inspection, automation, robotics, and industrial control systems.</p></sec></body>
<back>
<ack>
<p>This work was supported by the special funds for Major State Basic Research Project 2006CB921201 of China and the National Natural Science Foundation of China Grant 90607007.</p></ack>
<ref-list>
<title>References and Notes</title>
<ref id="b1-sensors-09-05933"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aizawa</surname><given-names>K.</given-names></name></person-group><article-title>Computational sensors—vision VLSI</article-title><source>IEICE Trans. Inf. Syst</source><year>1999</year><volume>E82-D</volume><fpage>580</fpage><lpage>588</lpage></citation></ref>
<ref id="b2-sensors-09-05933"><label>2.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Seitz</surname><given-names>P.</given-names></name></person-group><article-title>Solid-state image sensing</article-title><source>Handbook of Computer Vision and Applications</source><publisher-name>Academic Press</publisher-name><publisher-loc>New York, USA</publisher-loc><year>2000</year><volume>1</volume><fpage>165</fpage><lpage>222</lpage></citation></ref>
<ref id="b3-sensors-09-05933"><label>3.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Gonzalez</surname><given-names>R.C.</given-names></name><name><surname>Woods</surname><given-names>R.E.</given-names></name></person-group><source>Digital Image Processing</source><edition>2nd ed</edition><publisher-name>Pearson Education, Inc</publisher-name><publisher-loc>Upper Saddle River, NJ, USA</publisher-loc><year>2002</year><fpage>2</fpage></citation></ref>
<ref id="b4-sensors-09-05933"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yadid-Pecht</surname><given-names>O.</given-names></name><name><surname>Belenky</surname><given-names>A.</given-names></name></person-group><article-title>In-pixel Autoexposure CMOS APS</article-title><source>IEEE J. Solid-State Circ</source><year>2003</year><volume>38</volume><fpage>1425</fpage><lpage>1428</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2003.811984</pub-id></citation></ref>
<ref id="b5-sensors-09-05933"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Acosta-Serafini</surname><given-names>P.</given-names></name><name><surname>Ichiro</surname><given-names>M.</given-names></name><name><surname>Sodini</surname><given-names>C.</given-names></name></person-group><article-title>A 1/3 VGA linear wide dynamic range CMOS image sensor implementing a predictive multiple sampling algorithm with overlapping integration intervals</article-title><source>IEEE J. Solid-State Circ</source><year>2004</year><volume>39</volume><fpage>1487</fpage><lpage>1496</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2004.831611</pub-id></citation></ref>
<ref id="b6-sensors-09-05933"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dudek</surname><given-names>P.</given-names></name><name><surname>Hicks</surname><given-names>P.J.</given-names></name></person-group><article-title>A general-purpose processor-per-pixel analog SIMD vision chip</article-title><source>IEEE Trans. Circ. Syst. I</source><year>2005</year><volume>52</volume><fpage>13</fpage><lpage>20</lpage><pub-id pub-id-type="doi">10.1109/TCSI.2004.840093</pub-id></citation></ref>
<ref id="b7-sensors-09-05933"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kozlowski</surname><given-names>L.</given-names></name><name><surname>Rossi</surname><given-names>G.</given-names></name><name><surname>Blanquart</surname><given-names>L.</given-names></name><name><surname>Marchesini</surname><given-names>R.</given-names></name><name><surname>Huang</surname><given-names>Y.</given-names></name><name><surname>Chow</surname><given-names>G.</given-names></name><name><surname>Richardson</surname><given-names>J.</given-names></name><name><surname>Standley</surname><given-names>D.</given-names></name></person-group><article-title>Pixel noise suppression via soc management of target reset in a 1920 × 1080 CMOS image sensor</article-title><source>IEEE J. Solid-State Circ</source><year>2005</year><volume>40</volume><fpage>2766</fpage><lpage>2776</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2005.858480</pub-id></citation></ref>
<ref id="b8-sensors-09-05933"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Massari</surname><given-names>N.</given-names></name><name><surname>Gottardi</surname><given-names>M.</given-names></name></person-group><article-title>A 100 dB dynamic-range CMOS vision sensor with programmable image processing and global feature extraction</article-title><source>IEEE J. Solid-State Circ</source><year>2007</year><volume>42</volume><fpage>647</fpage><lpage>657</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2006.891454</pub-id></citation></ref>
<ref id="b9-sensors-09-05933"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oike</surname><given-names>Y.</given-names></name><name><surname>Ikeda</surname><given-names>M.</given-names></name><name><surname>Asada</surname><given-names>K.</given-names></name></person-group><article-title>A row-parallel position detector for high-speed 3-D camera based on light-section method</article-title><source>IEICE Trans. Electron</source><year>2003</year><volume>E86-C</volume><fpage>2320</fpage><lpage>2328</lpage></citation></ref>
<ref id="b10-sensors-09-05933"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oike</surname><given-names>Y.</given-names></name><name><surname>Ikeda</surname><given-names>M.</given-names></name><name><surname>Asada</surname><given-names>K.</given-names></name></person-group><article-title>A 375 × 365 1 k frames/s range-finding image sensor with 394.5 kHz access rate and 0.2 sub-pixel accuracy</article-title><source>ISSCC</source><year>2004</year><volume>1</volume><fpage>118</fpage><lpage>517</lpage></citation></ref>
<ref id="b11-sensors-09-05933"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Komuro</surname><given-names>T.</given-names></name><name><surname>Ishii</surname><given-names>I.</given-names></name><name><surname>Ishikawa</surname><given-names>M.</given-names></name><name><surname>Yoshida</surname><given-names>A.</given-names></name></person-group><article-title>A digital vision chip specialized for high-speed target tracking</article-title><source>IEEE Trans. Electron. Dev</source><year>2003</year><volume>50</volume><fpage>191</fpage><lpage>199</lpage><pub-id pub-id-type="doi">10.1109/TED.2002.807255</pub-id></citation></ref>
<ref id="b12-sensors-09-05933"><label>12.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Watanabe</surname><given-names>Y.</given-names></name><name><surname>Komuro</surname><given-names>T.</given-names></name><name><surname>Kagami</surname><given-names>S.</given-names></name><name><surname>Ishikawa</surname><given-names>M.</given-names></name></person-group><article-title>Vision chip architecture for simultaneous output of multi-target positions</article-title><conf-name>Proceedings of SICE Annual Conference</conf-name><conf-loc>Fukui, Japan</conf-loc><conf-date>August 4–6, 2003</conf-date><fpage>1572</fpage><lpage>1575</lpage></citation></ref>
<ref id="b13-sensors-09-05933"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luppe</surname><given-names>M.</given-names></name><name><surname>da Costa</surname><given-names>L.F.</given-names></name><name><surname>Roda</surname><given-names>V.O.</given-names></name></person-group><article-title>Parallel implementation of exact dilations and multi-scale skeletonization</article-title><source>J. Real-Time Imaging</source><year>2003</year><volume>9</volume><fpage>163</fpage><lpage>169</lpage><pub-id pub-id-type="doi">10.1016/S1077-2014(03)00016-0</pub-id></citation></ref>
<ref id="b14-sensors-09-05933"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>da Costa</surname><given-names>L.F.</given-names></name></person-group><article-title>Robust skeletonization through exact Euclidean distance transform and its applications to neuromorphometry</article-title><source>J. Real-Time Imaging</source><year>2003</year><volume>6</volume><fpage>415</fpage><lpage>431</lpage></citation></ref>
<ref id="b15-sensors-09-05933"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miao</surname><given-names>W.</given-names></name><name><surname>Lin</surname><given-names>Q.</given-names></name><name><surname>Wu</surname><given-names>N.</given-names></name></person-group><article-title>A novel vision chip for high-speed target tracking</article-title><source>Jpn. J. Appl. Phys</source><year>2007</year><volume>46</volume><fpage>2220</fpage><lpage>2225</lpage><pub-id pub-id-type="doi">10.1143/JJAP.46.2220</pub-id></citation></ref>
<ref id="b16-sensors-09-05933"><label>16.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lin</surname><given-names>Q.</given-names></name><name><surname>Miao</surname><given-names>W.</given-names></name><name><surname>Wu</surname><given-names>N.</given-names></name></person-group><article-title>A high-speed target tracking CMOS image sensor</article-title><conf-name>IEEE Asian Solid-State circuits Conference</conf-name><conf-loc>Hangzhou, China</conf-loc><conf-date>November, 2006</conf-date><fpage>139</fpage><lpage>142</lpage></citation></ref>
<ref id="b17-sensors-09-05933"><label>17.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Zimmermann</surname><given-names>H.</given-names></name></person-group><source>Silicon Optoelectronic Integrated Circuits</source><publisher-name>Springer</publisher-name><publisher-loc>Berlin, Germany</publisher-loc><year>2003</year><fpage>62</fpage><lpage>66</lpage></citation></ref>
<ref id="b18-sensors-09-05933"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname><given-names>D.</given-names></name><name><surname>Fowler</surname><given-names>B.</given-names></name><name><surname>Gamal</surname><given-names>A.E.</given-names></name></person-group><article-title>A Nyquist-rate pixel-level ADC for CMOS image sensors</article-title><source>IEEE J. Solid-State Circ</source><year>1999</year><volume>34</volume><fpage>348</fpage><lpage>356</lpage><pub-id pub-id-type="doi">10.1109/4.748186</pub-id></citation></ref>
<ref id="b19-sensors-09-05933"><label>19.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Loeliger</surname><given-names>T.</given-names></name><name><surname>Bachtold</surname><given-names>P.</given-names></name><name><surname>Binnig</surname><given-names>G.K.</given-names></name><name><surname>Cherubini</surname><given-names>G.</given-names></name><name><surname>Durig</surname><given-names>U.</given-names></name><name><surname>Eleftheriou</surname><given-names>E.</given-names></name><name><surname>Vettiger</surname><given-names>P.</given-names></name><name><surname>Uster</surname><given-names>M.</given-names></name><name><surname>Jackel</surname><given-names>H.</given-names></name></person-group><article-title>CMOS sensor array with cell-level analog-to-digital conversion for local probe data storage</article-title><conf-name>Proceedings of the 28th European Solid-State Circuit Conference (ESSCIRC 2002)</conf-name><conf-loc>Florence, Italy</conf-loc><conf-date>September 24–26, 2002</conf-date><fpage>623</fpage><lpage>626</lpage></citation></ref>
<ref id="b20-sensors-09-05933"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harton</surname><given-names>A.</given-names></name><name><surname>Ahmed</surname><given-names>M.</given-names></name><name><surname>Beuhler</surname><given-names>A.</given-names></name><name><surname>Castro</surname><given-names>F.</given-names></name><name><surname>Dawson</surname><given-names>L.</given-names></name><name><surname>Herold</surname><given-names>B.</given-names></name><name><surname>Kujawa</surname><given-names>G.</given-names></name><name><surname>Lee</surname><given-names>K.</given-names></name><name><surname>Mareachen</surname><given-names>R.</given-names></name><name><surname>Scaminaci</surname><given-names>T.</given-names></name></person-group><article-title>High dynamic range CMOS image sensor with pixel level ADC and <italic>in situ</italic> image enhancement</article-title><source>Sensors and Camera Systems for Scientific and Industrial Applications VI, Proc. SPIE</source><year>2005</year><volume>5677</volume><fpage>67</fpage><lpage>77</lpage></citation></ref>
<ref id="b21-sensors-09-05933"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chi</surname><given-names>Y.</given-names></name><name><surname>Mallik</surname><given-names>U.</given-names></name><name><surname>Choi</surname><given-names>E.</given-names></name><name><surname>Clapp</surname><given-names>M.</given-names></name><name><surname>Cauwenberghs</surname><given-names>G.</given-names></name><name><surname>Etienne-Cummings</surname><given-names>R.</given-names></name></person-group><article-title>CMOS pixel-level ADC with change detection</article-title><source>Proc. Int. Symp. Circ. Syst</source><year>2006</year><fpage>1647</fpage><lpage>1650</lpage></citation></ref>
<ref id="b22-sensors-09-05933"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Furuta</surname><given-names>M.</given-names></name><name><surname>Nishikawa</surname><given-names>Y.</given-names></name><name><surname>Inoue</surname><given-names>T.</given-names></name><name><surname>Kawahito</surname><given-names>S.</given-names></name></person-group><article-title>A high-speed, high-sensitivity digital CMOS image sensor with a global shutter and 12-bit column-parallel cyclic A/D converters</article-title><source>IEEE J. Solid-State Circ</source><year>2007</year><volume>42</volume><fpage>766</fpage><lpage>774</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2007.891655</pub-id></citation></ref>
<ref id="b23-sensors-09-05933"><label>23.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Sartori</surname><given-names>A.</given-names></name><name><surname>Gottardi</surname><given-names>M.</given-names></name><name><surname>Maloberti</surname><given-names>F.</given-names></name><name><surname>Simoni</surname><given-names>A.</given-names></name><name><surname>Torelli</surname><given-names>G.</given-names></name></person-group><article-title>Analog-to-digital converters for optical sensor arrays</article-title><conf-name>Proceedings of the Third International Conference on Electronics, Circuits, and Systems (ICECS ’96)</conf-name><conf-loc>Rodos, Greece</conf-loc><conf-date>October 13–16, 1996</conf-date><fpage>939</fpage><lpage>942</lpage></citation></ref>
<ref id="b24-sensors-09-05933"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>C.</given-names></name><name><surname>Shih</surname><given-names>Y.</given-names></name><name><surname>Lan</surname><given-names>J.</given-names></name><name><surname>Hsieh</surname><given-names>C.</given-names></name><name><surname>Huang</surname><given-names>C.</given-names></name><name><surname>Lu</surname><given-names>J.</given-names></name></person-group><article-title>Design, optimization, and performance analysis of new photodiode structures for CMOS Active-Pixel-Sensor (APS) imager applications</article-title><source>IEEE Sens. J</source><year>2004</year><volume>4</volume><fpage>135</fpage><lpage>144</lpage><pub-id pub-id-type="doi">10.1109/JSEN.2003.820361</pub-id></citation></ref>
<ref id="b25-sensors-09-05933"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cembrano</surname><given-names>G.</given-names></name><name><surname>Rodriguez-Vazquez</surname><given-names>A.</given-names></name><name><surname>Galan</surname><given-names>R.</given-names></name><name><surname>Jimenez-Garrido</surname><given-names>F.</given-names></name><name><surname>Espejo</surname><given-names>S.</given-names></name><name><surname>Dominguez-Castro</surname><given-names>R.</given-names></name></person-group><article-title>A 1000 fps at 128 × 128 vision processor with 8-bit digitized I/O</article-title><source>IEEE J. Solid-State Circ</source><year>2004</year><volume>39</volume><fpage>1044</fpage><lpage>1055</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2004.829931</pub-id></citation></ref>
<ref id="b26-sensors-09-05933"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brea</surname><given-names>V.M.</given-names></name><name><surname>Vilariño</surname><given-names>D.L.</given-names></name><name><surname>Paasio</surname><given-names>A.</given-names></name><name><surname>Cabello</surname><given-names>D.</given-names></name></person-group><article-title>Design of the processing core of a mixed-signal CMOS DTCNN chip for pixel-level snakes</article-title><source>IEEE Trans. Circ. Syst. I</source><year>2004</year><volume>51</volume><fpage>997</fpage><lpage>1013</lpage><pub-id pub-id-type="doi">10.1109/TCSI.2004.827625</pub-id></citation></ref>
<ref id="b27-sensors-09-05933"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sugiyama</surname><given-names>Y.</given-names></name><name><surname>Takumi</surname><given-names>M.</given-names></name><name><surname>Toyoda</surname><given-names>H.</given-names></name><name><surname>Mukozaka</surname><given-names>N.</given-names></name><name><surname>Ihori</surname><given-names>A.</given-names></name><name><surname>Kurashina</surname><given-names>T.</given-names></name><name><surname>Nakamura</surname><given-names>Y.</given-names></name><name><surname>Tonbe</surname><given-names>T.</given-names></name><name><surname>Mizuno</surname><given-names>S.</given-names></name></person-group><article-title>A high-speed CMOS image sensor with profile data acquiring function</article-title><source>IEEE J. Solid-State Circ</source><year>2005</year><volume>40</volume><fpage>2816</fpage><lpage>2823</lpage><pub-id pub-id-type="doi">10.1109/JSSC.2005.858475</pub-id></citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures and Tables</title>
<fig id="f1-sensors-09-05933" position="float">
<label>Figure 1.</label>
<caption>
<p>Architecture of the vision chip.</p></caption>
<graphic xlink:href="sensors-09-05933f1.gif"/></fig>
<fig id="f2-sensors-09-05933" position="float">
<label>Figure 2.</label>
<caption>
<p>Row select modes. (a) Mode 1: interleaved rows are selected. (b) Mode 2: the bottom 32 rows of the image are selected. (c) Mode 3: the middle 32 rows are selected. (d) Mode 4: the top 32 rows are selected. In modes 2, 3, and 4, any 32 contiguous columns can be selected.</p></caption>
<graphic xlink:href="sensors-09-05933f2.gif"/></fig>
<fig id="f3-sensors-09-05933" position="float">
<label>Figure 3.</label>
<caption>
<p>The interconnection between row-parallel processors.</p></caption>
<graphic xlink:href="sensors-09-05933f3.gif"/></fig>
<fig id="f4-sensors-09-05933" position="float">
<label>Figure 4.</label>
<caption>
<p>Diagram of the PE.</p></caption>
<graphic xlink:href="sensors-09-05933f4.gif"/></fig>
<fig id="f5-sensors-09-05933" position="float">
<label>Figure 5.</label>
<caption>
<p>Examples of <italic>erosion</italic> and <italic>dilation</italic> operations in <italic>mathematical morphology</italic>. (a) An image denoted <italic>A</italic>. (b) An image denoted <italic>B</italic>, in which the cross marks the origin. (c) The result of <italic>A ⊖ B</italic>. (d) The result of <italic>A ⊕ B</italic>.</p></caption>
<graphic xlink:href="sensors-09-05933f5.gif"/></fig>
<fig id="f6-sensors-09-05933" position="float">
<label>Figure 6.</label>
<caption>
<p>(a) Diagram for detecting a <italic>void image</italic> (Void = 1). (b) Extracting the range and the center of a region. (c) Diagram of extracting the coordinates (<italic>x, y</italic>) of activated pixels in an image.</p></caption>
<graphic xlink:href="sensors-09-05933f6.gif"/></fig>
<fig id="f7-sensors-09-05933" position="float">
<label>Figure 7.</label>
<caption>
<p>Diagram of the algorithmic ADC.</p></caption>
<graphic xlink:href="sensors-09-05933f7.gif"/></fig>
<fig id="f8-sensors-09-05933" position="float">
<label>Figure 8.</label>
<caption>
<p>Diagram of the row-parallel processor architecture.</p></caption>
<graphic xlink:href="sensors-09-05933f8.gif"/></fig>
<fig id="f9-sensors-09-05933" position="float">
<label>Figure 9.</label>
<caption>
<p>Schematic diagram of a search chain with 8 Search Chain Units (SCU).</p></caption>
<graphic xlink:href="sensors-09-05933f9.gif"/></fig>
<fig id="f10-sensors-09-05933" position="float">
<label>Figure 10.</label>
<caption>
<p>(a) Schematic diagram of XPU[i]. (b) Schematic diagram of YPU[j].</p></caption>
<graphic xlink:href="sensors-09-05933f10.gif"/></fig>
<fig id="f11-sensors-09-05933" position="float">
<label>Figure 11.</label>
<caption>
<p>Microphotograph of the prototype chip.</p></caption>
<graphic xlink:href="sensors-09-05933f11.gif"/></fig>
<fig id="f12-sensors-09-05933" position="float">
<label>Figure 12.</label>
<caption>
<p>An example of <italic>gray-scale mathematical morphology</italic> algorithms performed on the prototype chip. (a) A handwritten English letter ‘A’. (b) A selected 32 × 32 pixel region, zoomed in as the focus image. (c), (d) The gray-scale image after two <italic>opening</italic> operations. (e), (f) The gray-scale image after two <italic>opening</italic> and two <italic>closing</italic> operations.</p></caption>
<graphic xlink:href="sensors-09-05933f12.gif"/></fig>
<fig id="f13-sensors-09-05933" position="float">
<label>Figure 13.</label>
<caption>
<p>An example of <italic>binary mathematical morphology</italic> algorithms performed on the prototype chip. (a) Binary image converted from the illegible letter ‘A’. (b), (c) Results of the <italic>thinning</italic> operations. (d) The skeleton of the letter ‘A’.</p></caption>
<graphic xlink:href="sensors-09-05933f13.gif"/></fig>
<fig id="f14-sensors-09-05933" position="float">
<label>Figure 14.</label>
<caption>
<p>An example of the target tracking experiment. Three sample images captured during the tracking process at (a) t = 1,100 ms, (b) t = 1,548 ms, and (c) t = 2,008 ms. (d) The trace of a moving letter ‘A’.</p></caption>
<graphic xlink:href="sensors-09-05933f14.gif"/></fig>
<table-wrap id="t1-sensors-09-05933" position="float">
<label>Table 1.</label>
<caption>
<p>Time performance.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="bottom"><bold>Operations</bold></th>
<th align="center" valign="bottom"><bold>Clock cycles</bold></th></tr></thead>
<tbody>
<tr>
<td align="center" valign="top">Load from ADC</td>
<td align="center" valign="top">≈8 × N</td></tr>
<tr>
<td align="center" valign="top">Shift 1 pixel</td>
<td align="center" valign="top">1</td></tr>
<tr>
<td align="center" valign="top">Copy</td>
<td align="center" valign="top">1</td></tr>
<tr>
<td align="center" valign="top">AND/OR/NOT</td>
<td align="center" valign="top">1</td></tr>
<tr>
<td align="center" valign="top">Binary Erosion / Dilation</td>
<td align="center" valign="top">≈M<xref ref-type="table-fn" rid="tfn1-sensors-09-05933"><sup>(1)</sup></xref></td></tr>
<tr>
<td align="center" valign="top">Gray-scale Erosion / Dilation</td>
<td align="center" valign="top">≈M × N<xref ref-type="table-fn" rid="tfn2-sensors-09-05933"><sup>(2)</sup></xref></td></tr>
<tr>
<td align="center" valign="top">Detecting a void image</td>
<td align="center" valign="top">2</td></tr>
<tr>
<td align="center" valign="top">Extracting the Range and the center</td>
<td align="center" valign="top">8</td></tr>
<tr>
<td align="center" valign="top">Extracting coordinates of activated pixels</td>
<td align="center" valign="top">≈2K<xref ref-type="table-fn" rid="tfn3-sensors-09-05933"><sup>(3)</sup></xref></td></tr>
<tr>
<td align="center" valign="top">Bit-serial binary image input/output</td>
<td align="center" valign="top">≈N × N</td></tr></tbody></table>
<table-wrap-foot><fn id="tfn1-sensors-09-05933">
<p>(1) M is the number of active pixels in a structuring element.</p></fn><fn id="tfn2-sensors-09-05933">
<p>(2) N × N is the number of pixels in an image.</p></fn><fn id="tfn3-sensors-09-05933">
<p>(3) K is the number of active pixels in a binary image.</p></fn></table-wrap-foot></table-wrap>
<table-wrap id="t2-sensors-09-05933" position="float">
<label>Table 2.</label>
<caption>
<p>Chip specifications.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="middle"><bold>Parameter</bold></th>
<th align="center" valign="middle"><bold>Value</bold></th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top">Technology</td>
<td align="left" valign="top">0.18 μm 1P6M standard CMOS</td></tr>
<tr>
<td align="left" valign="top">Chip size (incl. pads)</td>
<td align="left" valign="top">3.5 mm × 1.5 mm</td></tr>
<tr>
<td align="left" valign="top">Array Size</td>
<td align="left" valign="top">64 × 64 pixels for Sensor Array</td></tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">32 × 32 pixels for PE Array</td></tr>
<tr>
<td align="left" valign="top">Pixel Size</td>
<td align="left" valign="top">9.5 μm × 9.5 μm for Sensor</td></tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">23 μm × 29 μm for PE</td></tr>
<tr>
<td align="left" valign="top">Transistors per pixel</td>
<td align="left" valign="top">3 transistors in sensor pixel</td></tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">85 transistors in PE pixel</td></tr>
<tr>
<td align="left" valign="top">Fill Factor</td>
<td align="left" valign="top">58%</td></tr>
<tr>
<td align="left" valign="top">Clock frequency</td>
<td align="left" valign="top">40 MHz</td></tr>
<tr>
<td align="left" valign="top">Power supply and consumption</td>
<td align="left" valign="top">1.8 V &amp; 3.3 V</td></tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">82.5 mW (@ 1,000 fps)</td></tr></tbody></table></table-wrap>
<table-wrap id="t3-sensors-09-05933" position="float">
<label>Table 3.</label>
<caption>
<p>Comparison with related work.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="bottom"><bold>Reference</bold></th>
<th align="center" valign="bottom"><bold>Our chip</bold></th>
<th align="center" valign="bottom"><bold>[<xref ref-type="bibr" rid="b13-sensors-09-05933">13</xref>]</bold></th>
<th align="center" valign="bottom"><bold>[<xref ref-type="bibr" rid="b25-sensors-09-05933">25</xref>]</bold></th>
<th align="center" valign="bottom"><bold>[<xref ref-type="bibr" rid="b26-sensors-09-05933">26</xref>]</bold></th>
<th align="center" valign="bottom"><bold>[<xref ref-type="bibr" rid="b27-sensors-09-05933">27</xref>]</bold></th></tr></thead>
<tbody>
<tr>
<td align="left" valign="top">Photosensors</td>
<td align="center" valign="top">Yes</td>
<td align="center" valign="top">No</td>
<td align="center" valign="top">Yes</td>
<td align="center" valign="top">Yes</td>
<td align="center" valign="top">Yes</td></tr>
<tr>
<td align="left" valign="top">Technology</td>
<td align="center" valign="top">0.18 μm 1P6M</td>
<td align="center" valign="top">FPGA</td>
<td align="center" valign="top">0.35 μm 1P5M</td>
<td align="center" valign="top">0.25 μm</td>
<td align="center" valign="top">0.6 μm 2P3M</td></tr>
<tr>
<td align="left" valign="top">PE area (μm<sup>2</sup>)</td>
<td align="center" valign="top">23 × 29</td>
<td align="center" valign="top">68 LE<sup><xref ref-type="table-fn" rid="tfn4-sensors-09-05933">*</xref></sup></td>
<td align="center" valign="top">75.5 × 73.3</td>
<td align="center" valign="top">83 × 45</td>
<td align="center" valign="top">20 × 20</td></tr>
<tr>
<td align="left" valign="top">Stored bits per PE</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">N/A</td>
<td align="center" valign="top">32</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">N/A</td></tr>
<tr>
<td align="left" valign="top">PE Array</td>
<td align="center" valign="top">32 × 32</td>
<td align="center" valign="top">12 × 12</td>
<td align="center" valign="top">128 × 128</td>
<td align="center" valign="top">9 × 9</td>
<td align="center" valign="top">512 × 512</td></tr>
<tr>
<td align="left" valign="top">Image processing</td>
<td align="center" valign="top">6-Bit Gray</td>
<td align="center" valign="top">9-Bit Gray</td>
<td align="center" valign="top">8-Bit Gray</td>
<td align="center" valign="top">4-Bit Gray</td>
<td align="center" valign="top">Analog</td></tr>
<tr>
<td align="left" valign="top">Control Style</td>
<td align="center" valign="top">SIMD</td>
<td align="center" valign="top">Regular</td>
<td align="center" valign="top">SIMD/CNN</td>
<td align="center" valign="top">Complicated</td>
<td align="center" valign="top">Regular</td></tr>
<tr>
<td align="left" valign="top">Global features</td>
<td align="center" valign="top">Specific periphery</td>
<td align="center" valign="top">N/A</td>
<td align="center" valign="top">Non-specific periphery</td>
<td align="center" valign="top">No exportation</td>
<td align="center" valign="top">Specific periphery</td></tr>
<tr>
<td align="left" valign="top">Programmability</td>
<td align="center" valign="top">High</td>
<td align="center" valign="top">Specific</td>
<td align="center" valign="top">High</td>
<td align="center" valign="top">Moderate</td>
<td align="center" valign="top">Low</td></tr></tbody></table>
<table-wrap-foot><fn id="tfn4-sensors-09-05933">
<label>*</label>
<p>The chip was implemented in an FPGA; therefore, the area is given as a count of logic elements (LE).</p></fn></table-wrap-foot></table-wrap></sec></back></article>
