<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s120403868</article-id>
<article-id pub-id-type="publisher-id">sensors-12-03868</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Improvement of Kinect<sub>TM</sub> Sensor Capabilities by Fusion with Laser Sensing Data Using Octree</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Chávez</surname><given-names>Alfredo</given-names></name><xref ref-type="corresp" rid="c1-sensors-12-03868"><sup>*</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Karstoft</surname><given-names>Henrik</given-names></name></contrib>
<aff id="af1-sensors-12-03868">Århus School of Engineering, Århus University, Finlandsgade 22, 8200 Århus N, Denmark</aff></contrib-group>
<author-notes>
<corresp id="c1-sensors-12-03868">
<label>*</label>Author to whom correspondence should be addressed; E-Mail: <email>acp@iha.dk</email>; Tel.: +45-2849-8465.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2012</year></pub-date>
<pub-date pub-type="epub">
<day>26</day>
<month>3</month>
<year>2012</year></pub-date>
<volume>12</volume>
<issue>4</issue>
<fpage>3868</fpage>
<lpage>3878</lpage>
<history>
<date date-type="received">
<day>21</day>
<month>1</month>
<year>2012</year></date>
<date date-type="rev-recd">
<day>20</day>
<month>2</month>
<year>2012</year></date>
<date date-type="accepted">
<day>21</day>
<month>2</month>
<year>2012</year></date></history>
<permissions>
<copyright-statement>© 2012 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2012</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>To enhance sensor capabilities, sensor data readings from different modalities must be fused. The main contribution of this paper is to present a sensor data fusion approach that can reduce Kinect<sub>TM</sub> sensor limitations. This approach involves combining laser with Kinect<sub>TM</sub> sensors. Sensor data is modelled in a 3D environment based on octrees using a probabilistic occupancy estimation. The Bayesian method, which takes into account the uncertainty inherent in the sensor measurements, is used to fuse the sensor information and update the 3D octree map. The sensor fusion yields a significant increase of the field of view of the Kinect<sub>TM</sub> sensor that can be used for robot tasks.</p></abstract>
<kwd-group>
<kwd>sensor fusion</kwd>
<kwd>laser</kwd>
<kwd>Kinect<sub>TM</sub></kwd>
<kwd>3D octree map</kwd>
<kwd>collaboration</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>Fusion of sensory information is essential in the field of mobile robots. It is necessary for a robot to achieve full autonomy, which in turn widens the range of its applicability. In this context, it is also necessary to develop more reliable systems that can operate in structured and unstructured environments. The result of fusing the sensory information can be used to reconstruct the environment of the robot, so that the robot can plan its own path, avoid obstacles and adapt to unexpected environments. In other words, building the map by fusing sensory information from different sources yields a more reliable map. Therefore, if the mobile robot suddenly faces unexpected situations in the environment, e.g., people moving around, it can update the map to take the new entities into account. Consequently, fusion of different sensor readings must be applied in the hierarchical architecture of the robot.</p>
<p>When dealing with sensor data fusion, one of the requirements is the choice of the internal representation. This representation must be common to all sensors, which means that sensor readings of different modalities must be converted to it before the fusion process is carried out. It must model occupied as well as empty areas of arbitrary environments without prior knowledge of them. It must also represent both the estimate and the certainty of the true parameters. The fusion process for different sensors must be feasible under this internal representation, and conversion of sensor data from the physical measurements to the internal representation should be easy to carry out. Finally, the map should be expandable as needed and must support multiple resolutions for different mobile robot tasks.</p>
<p>Over the years, several approaches for modelling 3D environments have been proposed. Wurm <italic>et al.</italic> [<xref ref-type="bibr" rid="b1-sensors-12-03868">1</xref>] review the previous techniques and also propose a 3D internal representation that fulfils the above requirements. This approach is OctoMap, a library that implements a 3D probabilistic occupancy grid mapping approach.</p>
<p>It is worth mentioning the importance of 3D models for mobile robot tasks. A 3D model has, for instance, manifold features and can therefore facilitate the disambiguation of different places. Another important case is rescue operations, where a 3D model of the environment has to be known before any action is taken [<xref ref-type="bibr" rid="b2-sensors-12-03868">2</xref>].</p>
<p>The Kinect<sub>TM</sub> sensor from Microsoft has recently become very popular in various mobile robot tasks. However, its narrow field of view and its limited close range are limitations. The depth image of the Kinect<sub>TM</sub> has a field of view of 57.8°. A wide field of view is important in mobile robots, because the wider the field of view, the more precise the map: the robot can capture more features of the environment in a single sensor reading. A mobile robot with a poor field of view, on the other hand, must constantly maneuver to fill in the missing parts of the map. One possible solution to this problem is to add a second Kinect<sub>TM</sub> to increase the field of view. This approach has the disadvantage, however, of increasing the amount of data and thus the computational burden. Another solution is to rotate the Kinect<sub>TM</sub> sensor by means of a servo. This again may limit the robot's ability to scan local maps successfully. The minimum range of the Kinect<sub>TM</sub> is about 0.6 m. This limited range might be a problem when navigating: the robot may collide with objects that lie closer to the Kinect<sub>TM</sub> sensor than its minimum range.</p>
<p>The main contribution of this paper is to address the problem of fusing range readings from a laser device with a Kinect<sub>TM</sub> depth image in order to increase the field of view and reduce the minimum range of the Kinect<sub>TM</sub> sensor. The Hokuyo URG-04LX-UG01 laser range finder [<xref ref-type="bibr" rid="b3-sensors-12-03868">3</xref>] was selected because of its size and price. It has a sensing range from 0.06 m to 4 m. Measurement accuracy is within ±3% of the current reading for most of the sensor's range. A scan takes 100 milliseconds and covers a 240° range. These specifications make the laser well suited to this research in indoor applications.</p>
<p>The current system setup, shown in <xref ref-type="fig" rid="f1-sensors-12-03868">Figure 1</xref>, serves as an experimental testbed. It provides data from a Hokuyo laser range finder and a Microsoft Kinect<sub>TM</sub>. Section 2 is concerned with the octree representation. Section 3 describes how the binary Bayes filter can be applied to the octree map in order to fuse and update sensor readings. Section 4 shows the results of the fusion process. Finally, Section 5 gives the conclusions and future research directions.</p></sec>
<sec>
<label>2.</label>
<title>3D Map Making Based on Octree</title>
<p>Octrees are the three-dimensional generalisation of quadtrees [<xref ref-type="bibr" rid="b4-sensors-12-03868">4</xref>]. In other words, an octree is a hierarchical data structure for spatial subdivision in 3D. Octrees have been successfully used to represent 3D maps [<xref ref-type="bibr" rid="b1-sensors-12-03868">1</xref>,<xref ref-type="bibr" rid="b5-sensors-12-03868">5</xref>–<xref ref-type="bibr" rid="b8-sensors-12-03868">8</xref>]. An octree is built by recursively subdividing a cube into eight octants. Each octant in every division represents a node. The process ends when a minimum voxel size is reached. <xref ref-type="fig" rid="f2-sensors-12-03868">Figure 2</xref> shows a single occupied voxel and its octree representation.</p>
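<p>The recursive subdivision described above can be illustrated with a minimal sketch. It is not the OctoMap implementation: the function names, the octant numbering and the 1.6 m root cube are assumptions made for illustration, while the 0.2 m minimum voxel size is the resolution used later in this paper.</p>
<preformat>
```python
def octant_index(point, centre):
    """Return the child octant (0-7) of the cube centred at `centre` that contains `point`."""
    x, y, z = point
    cx, cy, cz = centre
    # One bit per axis: bit 0 for X, bit 1 for Y, bit 2 for Z (assumed numbering).
    return int(x >= cx) + 2 * int(y >= cy) + 4 * int(z >= cz)


def descend(point, centre, size, min_voxel_size):
    """Walk from the root cube down to the leaf voxel that contains `point`.

    `size` is the edge length of the root cube. Returns the octant indices
    along the path, the centre of the leaf voxel and its edge length.
    """
    path = []
    centre = list(centre)
    while size > min_voxel_size:
        index = octant_index(point, centre)
        path.append(index)
        size /= 2.0           # edge length of the child cube
        quarter = size / 2.0  # offset from parent centre to child centre
        for axis in range(3):
            bit = (index // (2 ** axis)) % 2
            centre[axis] += quarter if bit else -quarter
    return path, tuple(centre), size


# Hypothetical root cube of 1.6 m centred at the origin, 0.2 m leaf voxels.
path, leaf_centre, leaf_size = descend((0.55, -0.35, 0.05), (0.0, 0.0, 0.0), 1.6, 0.2)
print(path, leaf_centre, leaf_size)
```
</preformat>
<p>With these values the cube is halved three times (1.6 m, 0.8 m, 0.4 m, 0.2 m), so the path to the leaf holds three octant indices, one per level of the tree.</p>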
<p>Sensors suffer from inaccuracies due to noise; hence, the uncertainties inherent in sensor data readings must be interpreted in a probabilistic fashion. The approach presented in [<xref ref-type="bibr" rid="b1-sensors-12-03868">1</xref>] offers a means of combining the compactness of octrees that use discrete labels with the adaptability and flexibility of probabilistic modelling. For this reason, this paper adopts that approach.</p></sec>
<sec>
<label>3.</label>
<title>Sensor Fusion</title>
<p>Range sensor readings are modelled by probabilistic sensor functions [<xref ref-type="bibr" rid="b9-sensors-12-03868">9</xref>], and a binary Bayes filter is used to update the occupancy grid [<xref ref-type="bibr" rid="b1-sensors-12-03868">1</xref>,<xref ref-type="bibr" rid="b7-sensors-12-03868">7</xref>,<xref ref-type="bibr" rid="b10-sensors-12-03868">10</xref>,<xref ref-type="bibr" rid="b11-sensors-12-03868">11</xref>]. This filter is mainly used when the state is both static and binary. <xref ref-type="disp-formula" rid="FD1">Equation (1)</xref> presents the odds form of the filter, whereas <xref ref-type="disp-formula" rid="FD2">Equation (2)</xref> presents the log-odds (<italic>L</italic>) form.</p>
<p>
<disp-formula id="FD1">
<label>(1)</label>
<mml:math id="mm1" display="block">
<mml:semantics id="sm1">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:semantics></mml:math></disp-formula>
<disp-formula id="FD2">
<label>(2)</label>
<mml:math id="mm2" display="block">
<mml:semantics id="sm2">
<mml:mrow>
<mml:msub>
<mml:mi>l</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>o</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:semantics></mml:math></disp-formula></p>
<p><italic>P</italic>(<italic>n|z<sub>1:t</sub></italic>) is the probability of a leaf node <italic>n</italic> being occupied given the sensor measurements <italic>z<sub>1:t</sub></italic>. <italic>P</italic>(<italic>n|z<sub>t</sub></italic>) is the inverse sensor model. The term 
<inline-formula>
<mml:math id="mm3" display="inline">
<mml:semantics id="sm3">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>o</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="italic">log</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:semantics></mml:math></inline-formula> is the prior probability of the node and it also defines the initial belief before processing any sensor measurement, e.g., <italic>P(n) =</italic> 0.5. It mainly represents how the distribution of the node is given by an observation. The probabilities <italic>P</italic>(<italic>n|z<sub>1:t</sub></italic>) can be recovered from the logOdds radio as stated in <xref ref-type="disp-formula" rid="FD3">Equation (3)</xref>.</p>
<p>
<disp-formula id="FD3">
<label>(3)</label>
<mml:math id="mm4" display="block">
<mml:semantics id="sm4">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mo>exp</mml:mo>
<mml:mo>{</mml:mo>
<mml:msub>
<mml:mi>l</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>}</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mrow>
<mml:mtext>with</mml:mtext>
<mml:mo>:</mml:mo></mml:mrow></mml:mrow></mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>l</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="italic">log</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi>t</mml:mi></mml:mrow></mml:msub>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mfrac></mml:mrow>
<mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:semantics></mml:math></disp-formula></p>
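<p>Equations (2) and (3) can be sketched in a few lines of code. This is an illustrative sketch rather than the OctoMap implementation; the function names and the inverse sensor model values (0.7 for a hit, 0.4 for a miss) are assumptions and are not taken from the experiments. Readings from either sensor can be passed to the same update step, which is how the fusion of the two data sources is expressed here.</p>
<preformat>
```python
import math


def logit(p):
    """Log-odds L = log(p / (1 - p)) of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def probability(log_odds):
    """Recover the occupancy probability from the log-odds, as in Equation (3)."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


def update(prev_log_odds, p_inverse_model, prior=0.5):
    """One step of Equation (2): l_t = L(n|z_t) + l_{t-1} - L_o(n)."""
    return logit(p_inverse_model) + prev_log_odds - logit(prior)


# Hypothetical inverse sensor model values, not taken from the paper.
P_HIT = 0.7
P_MISS = 0.4

log_odds = logit(0.5)  # initial belief P(n) = 0.5, i.e., log-odds of zero
for p_reading in (P_HIT, P_HIT, P_MISS):  # e.g., laser hit, Kinect hit, one miss
    log_odds = update(log_odds, p_reading)

print(round(probability(log_odds), 3))
```
</preformat>
<p>Because the update is a sum in log-odds space, the order in which the readings arrive does not change the resulting estimate.</p>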
<p>A new sensor reading introduces additional information about the state of the node <italic>n</italic>. This information is provided by the inverse sensor model <italic>P</italic>(<italic>n</italic>|<italic>z<sub>t</sub></italic>) and is combined with the most recent probability estimate stored in the node. The binary Bayes filter performs this combination over the readings <italic>z<sub>1:t</sub></italic> = (<italic>z</italic><sub>t</sub>,…, <italic>z</italic><sub>1</sub>) to give a new estimate <italic>P</italic>(<italic>n</italic>|<italic>z<sub>1:t</sub></italic>). It is worth noting that when initialising the map, an equal probability must be assigned to each node. In other words, the initial node prior probabilities are <italic>P</italic>(<italic>n</italic>) = 0.5.</p></sec>
<sec sec-type="results">
<label>4.</label>
<title>Experimental Results</title>
<p>The experiments presented in this work were done using real-world data. The experimental results address the problem stated in the introduction, that is, increasing the field of view and reducing the minimum range of the Kinect<sub>TM</sub> sensor. In other words, they demonstrate that fusing the Kinect<sub>TM</sub> data with laser sensor data improves both the field of view and the minimum close-range detection of the Kinect<sub>TM</sub>.</p>
<p>The system setup shown in <xref ref-type="fig" rid="f1-sensors-12-03868">Figure 1</xref> is used to run the simulation, whose results are shown in this section. During the simulation, two indoor data sets from the same environment were recorded using two different sensors. These two data sets were later fused to obtain a single representation of the 3D scenario. The environment together with the sensor system is shown in <xref ref-type="fig" rid="f3-sensors-12-03868">Figure 3</xref>.</p>
<p>The first data set was recorded using the Kinect<sub>TM</sub> sensor. In order to get the Kinect<sub>TM</sub>'s depth image from the sensor, the OpenNI<sub>TM</sub> [<xref ref-type="bibr" rid="b12-sensors-12-03868">12</xref>] framework libraries were installed on Windows 7. Moreover, the Kinect<sub>TM</sub> Matlab [<xref ref-type="bibr" rid="b13-sensors-12-03868">13</xref>] framework was used to get the 3D (<italic>X, Y, Z</italic>) coordinates from the depth image. <xref ref-type="fig" rid="f4-sensors-12-03868">Figure 4</xref> visualises the depth image, whose resolution is 640 × 480 pixels.</p>
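<p>For illustration, the conversion from a depth pixel to 3D (<italic>X, Y, Z</italic>) coordinates can be sketched with a simple pinhole camera model. This is not the routine used by the Kinect<sub>TM</sub> Matlab framework. The sketch assumes that the 57.8° field of view is horizontal, that the principal point lies at the image centre and that pixels are square; the function name and axis convention are also assumptions.</p>
<preformat>
```python
import math

WIDTH, HEIGHT = 640, 480  # depth image resolution, from the text
H_FOV_DEG = 57.8          # field of view from the text, assumed to be horizontal

# Assumed pinhole focal length in pixels, derived from the field of view.
FOCAL = (WIDTH / 2.0) / math.tan(math.radians(H_FOV_DEG) / 2.0)


def pixel_to_xyz(u, v, depth_m):
    """Back-project pixel (u, v) with depth in metres to camera coordinates.

    Assumed convention: X to the right, Y downwards, Z along the optical axis.
    """
    x = (u - WIDTH / 2.0) * depth_m / FOCAL
    y = (v - HEIGHT / 2.0) * depth_m / FOCAL
    return x, y, depth_m


# A pixel at the right image border, 2 m away from the sensor.
print(pixel_to_xyz(640, 240, 2.0))
```
</preformat>
<p>Under these assumptions a point at the image border lies at half the field of view, i.e., 28.9° from the optical axis, which makes the narrow lateral coverage of the depth image explicit.</p>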
<p>The second data set was recorded using a Hokuyo URG-04LX-UG01 laser range finder, which is placed on top of the Kinect<sub>TM</sub> sensor, as seen in <xref ref-type="fig" rid="f3-sensors-12-03868">Figure 3</xref>. Laser measurements are obtained by means of the laser driver [<xref ref-type="bibr" rid="b14-sensors-12-03868">14</xref>]. Each measurement consists of 682 laser readings taken over a range of 240°. Each reading represents the Euclidean distance (<italic>d</italic>) from the center of the laser to the detected object. 2D (<italic>X, Y</italic>) laser coordinates can be obtained using a mapping function <italic>f: d</italic> → (<italic>X,Y</italic>).</p>
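<p>A possible form of the mapping function <italic>f: d</italic> → (<italic>X,Y</italic>) is a polar-to-Cartesian conversion, sketched below. The beam layout is an assumption: the readings are taken to be evenly spaced and the scan is taken to be centred on the forward <italic>X</italic> axis, so that the first beam is at −120° and the last at +120°. The function name is hypothetical; the number of readings, the angular coverage and the sensing range are the values quoted in this paper.</p>
<preformat>
```python
import math

NUM_READINGS = 682      # readings per measurement, from the text
SCAN_ANGLE_DEG = 240.0  # angular coverage, from the text
MIN_RANGE_M = 0.06      # sensing range limits, from the text
MAX_RANGE_M = 4.0


def scan_to_xy(distances):
    """Map a list of range readings d (metres) to 2D (X, Y) points.

    Assumes at least two evenly spaced beams with the scan centred on
    the +X axis. Readings outside the sensing range are skipped.
    """
    step = math.radians(SCAN_ANGLE_DEG) / (len(distances) - 1)
    start = -math.radians(SCAN_ANGLE_DEG) / 2.0
    points = []
    for i, d in enumerate(distances):
        if d > MAX_RANGE_M or MIN_RANGE_M > d:
            continue  # not a valid measurement
        angle = start + i * step
        points.append((d * math.cos(angle), d * math.sin(angle)))
    return points


# Synthetic scan: every beam returns 1 m (not data from the experiments).
points = scan_to_xy([1.0] * NUM_READINGS)
print(len(points), points[0])
```
</preformat>
<p>The resulting points lie in the scanning plane of the laser; to insert them into a 3D map they would additionally need the height of that plane, which is not modelled in this sketch.</p>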
<p>Each of the recorded data sets is represented probabilistically in a 3D occupancy map by means of the OctoMap library [<xref ref-type="bibr" rid="b1-sensors-12-03868">1</xref>]. This library is also used to handle the fusion of the two 3D representations. The library is implemented in C++ and was installed on Debian GNU/Linux 6.0.3 (squeeze), released on 8 October 2011.</p>
<p>A 3D octree map representation of the environment from the first data set that corresponds to the Kinect<sub>TM</sub> sensor is shown in <xref ref-type="fig" rid="f5-sensors-12-03868">Figure 5(a)</xref>. For clarity, only the occupied volumes, whose resolution is 0.2 m, are shown in this figure. <xref ref-type="fig" rid="f5-sensors-12-03868">Figure 5(b)</xref> shows the empty volumes. The narrow field of view of the Kinect<sub>TM</sub>'s depth image can clearly be seen.</p>
<p>The second data set represents a 2D slice of the environment, which is represented as an occupied octree map, shown in <xref ref-type="fig" rid="f6-sensors-12-03868">Figure 6(a)</xref>, whereas <xref ref-type="fig" rid="f6-sensors-12-03868">Figure 6(b)</xref> shows the empty and occupied voxels. The main feature of this plot is the well-known wide field of view of the laser.</p>
<p>The occupied voxels that correspond to the fusion of the two data sets are shown in <xref ref-type="fig" rid="f7-sensors-12-03868">Figure 7(a)</xref>. This shows that fusion of sensory information from different sources can increase sensor reliability, in this case by enhancing the field of view of the Kinect<sub>TM</sub> sensor. The empty volumes are depicted in <xref ref-type="fig" rid="f7-sensors-12-03868">Figure 7(b)</xref>. This figure clearly shows that the robot can have more confidence in the space to its sides. As a result, the mobile robot does not need to maneuver constantly to fill in the missing parts of the map, and it can easily react if there is an obstacle in its vicinity that is detected by the laser but not by the Kinect<sub>TM</sub> sensor.</p>
<p>The laser octree data set representation is compared with the true map in <xref ref-type="fig" rid="f8-sensors-12-03868">Figure 8</xref>. The walls, the objects, the corridor and the door are very well detected. This result is consistent with the good accuracy of the laser readings over most of the sensor's range.</p>
<p>A 2D slice of the two fused octree data sets is also compared with the true map, as can be seen in <xref ref-type="fig" rid="f9-sensors-12-03868">Figure 9</xref>. This result shows the accuracy of the fused maps when compared with the actual environment's map. What is important to notice in this simulation is how the two sensors complement each other: the laser compensates for the poor field of view of the Kinect<sub>TM</sub> sensor.</p>
<p>In order to test the minimum close range, an object was placed 38 cm in front of the testbed. This distance lies beyond the minimum range of the laser but within the minimum range of the Kinect<sub>TM</sub>, which means that the object is between the two minimum detection ranges. The outcome of the Kinect<sub>TM</sub>'s simulation is depicted in <xref ref-type="fig" rid="f10-sensors-12-03868">Figure 10</xref>. It can clearly be seen that the object is not detected, owing to the minimum close-range limitation of the Kinect<sub>TM</sub> sensor. The laser, however, detects the object as expected, as shown in <xref ref-type="fig" rid="f11-sensors-12-03868">Figure 11</xref>.</p>
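<p>The geometry of this test can be summarised in a small sketch that checks which sensor is able to observe a target at a given distance and bearing. It uses only the ranges and fields of view quoted in this paper and assumes that both sensors face the same forward direction and that only the horizontal plane matters. The maximum range of the Kinect<sub>TM</sub> is not stated here and is therefore not modelled; the function name is hypothetical.</p>
<preformat>
```python
KINECT_MIN_RANGE_M = 0.6          # approximate minimum range, from the text
KINECT_HALF_FOV_DEG = 57.8 / 2.0  # half of the depth image field of view
LASER_MIN_RANGE_M = 0.06          # laser sensing range, from the text
LASER_MAX_RANGE_M = 4.0
LASER_HALF_FOV_DEG = 240.0 / 2.0  # half of the laser scanning range


def sensors_that_detect(distance_m, bearing_deg=0.0):
    """Return the sensors able to observe a target at this distance and bearing.

    The bearing is measured from the common forward axis (an assumption).
    """
    detected = []
    in_laser_range = LASER_MAX_RANGE_M >= distance_m >= LASER_MIN_RANGE_M
    if in_laser_range and LASER_HALF_FOV_DEG >= abs(bearing_deg):
        detected.append("laser")
    in_kinect_range = distance_m >= KINECT_MIN_RANGE_M
    if in_kinect_range and KINECT_HALF_FOV_DEG >= abs(bearing_deg):
        detected.append("kinect")
    return detected


print(sensors_that_detect(0.38))       # the object placed 38 cm ahead
print(sensors_that_detect(1.0, 60.0))  # a target well off to the side
```
</preformat>
<p>In both example calls only the laser qualifies, which corresponds to the two limitations of the Kinect<sub>TM</sub> that the fusion is meant to compensate for.</p>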
<p>The fusion of the two previous data sets is presented in <xref ref-type="fig" rid="f12-sensors-12-03868">Figure 12</xref>. The important fact to notice in this result is that the laser compensates for the minimum close-range detection limitation of the Kinect<sub>TM</sub> sensor. As a result, the robot can react to and avoid a nearby obstacle that is not detected by the Kinect<sub>TM</sub>, making obstacle avoidance and hence navigation safer and more reliable.</p></sec>
<sec sec-type="conclusions">
<label>5.</label>
<title>Conclusions and Future Research</title>
<p>It is very rare that a single sensor can provide sufficient information for the reasoning component. In this sense, the research in this paper has focused on fusing information from two different sources in order to increase the capabilities of a single sensor. To this end, the fusion of laser readings with features extracted from a depth image using the Kinect<sub>TM</sub> sensor has produced good results. It can be observed in <xref ref-type="fig" rid="f7-sensors-12-03868">Figures 7</xref> and <xref ref-type="fig" rid="f12-sensors-12-03868">12</xref> that the two limitations of the Kinect<sub>TM</sub> sensor, namely (a) the poor field of view and (b) the close range, are overcome by the fusion process. The field of view increases significantly and the minimum range is reduced; hence, closer objects can be detected.</p>
<p>It is believed that the approach of fusing data provided by a laser range finder and the depth image constitutes an appropriate starting point for a new framework for mobile robots whose tasks demand combining the Kinect<sub>TM</sub> with other sensors.</p>
<p>A starting point of this framework could be experiments on a dynamic fused 3D map of the environment, where sensor transformation frames are taken into account in order to build the map with respect to a world reference frame. The results obtained so far can be used for localization and navigation. It is also the intention of this research to investigate further the applicability of the framework to the combination of different sensors for mobile robot nonlinear control tasks.</p></sec></body>
<back>
<ack>
<p>The research has been supported by the Århus School of Engineering, Denmark.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-sensors-12-03868"><label>1.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Wurm</surname><given-names>K.M.</given-names></name><name><surname>Hornung</surname><given-names>A.</given-names></name><name><surname>Bennewitz</surname><given-names>M.</given-names></name><name><surname>Stachniss</surname><given-names>C.</given-names></name><name><surname>Burgard</surname><given-names>W.</given-names></name></person-group><article-title>OctoMap: A Probabilistic, Flexible, and Compact 3D Map Representation for Robotic Systems</article-title><conf-name>Proceedings of the ICRA 2010 Workshop on Best Practice in 3D Perception and Modeling for Mobile Manipulation</conf-name><conf-loc>Anchorage, AK, USA</conf-loc><conf-date>3 May 2010</conf-date></citation></ref>
<ref id="b2-sensors-12-03868"><label>2.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>de la Puente</surname><given-names>P.</given-names></name><name><surname>Rodriguez-Losada</surname><given-names>D.</given-names></name><name><surname>Valero</surname><given-names>A.</given-names></name><name><surname>Matia</surname><given-names>F.</given-names></name></person-group><article-title>3D Feature Based Mapping Towards Mobile Robots' Enhanced Performance in Rescue Missions</article-title><conf-name>Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '09)</conf-name><conf-loc>St. Louis, MO, USA</conf-loc><conf-date>11–15 October 2009</conf-date><fpage>1138</fpage><lpage>1143</lpage></citation></ref>
<ref id="b3-sensors-12-03868"><label>3.</label><citation citation-type="web"><person-group person-group-type="author"><collab>Hokuyo</collab></person-group><article-title>URG-04LX-UG01</article-title><comment>Available online: <ext-link xlink:href="http://www.hokuyo-aut.jp/02sensor/07scanner/urg04lxug01.html" ext-link-type="uri">http://www.hokuyo-aut.jp/02sensor/07scanner/urg04lxug01.html</ext-link></comment><access-date>accessed on 22 February 2012</access-date></citation></ref>
<ref id="b4-sensors-12-03868"><label>4.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Agoston</surname><given-names>M.K.</given-names></name></person-group><source>Computer Graphics and Geometric Modelling Implementation and Algorithms</source><publisher-name>Springer</publisher-name><publisher-loc>Berlin, Heidelberg, Germany</publisher-loc><year>2005</year></citation></ref>
<ref id="b5-sensors-12-03868"><label>5.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Payeur</surname><given-names>P.</given-names></name><name><surname>Hebert</surname><given-names>P.</given-names></name><name><surname>Laurendeau</surname><given-names>D.</given-names></name><name><surname>Gosselin</surname><given-names>C.M.</given-names></name></person-group><article-title>Probabilistic Octree Modeling of a 3D Dynamic Environment</article-title><conf-name>Proceedings of the IEEE International Conference on Robotics and Automation</conf-name><conf-loc>Albuquerque, NM, USA</conf-loc><conf-date>April 1997</conf-date><comment>Volume 2</comment><fpage>1289</fpage><lpage>1296</lpage></citation></ref>
<ref id="b6-sensors-12-03868"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wilhelms</surname><given-names>J.</given-names></name><name><surname>Gelder</surname><given-names>A.</given-names></name></person-group><article-title>Octrees for faster isosurface generation</article-title><source>IEEE Trans. Med. Imag.</source><year>2000</year><volume>19</volume><fpage>739</fpage><lpage>758</lpage><pub-id pub-id-type="doi">10.1109/42.875199</pub-id></citation></ref>
<ref id="b7-sensors-12-03868"><label>7.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Wurm</surname><given-names>K.</given-names></name><name><surname>Hennes</surname><given-names>D.</given-names></name><name><surname>Holz</surname><given-names>D.</given-names></name><name><surname>Rusu</surname><given-names>R.</given-names></name><name><surname>Stachniss</surname><given-names>C.</given-names></name><name><surname>Konolige</surname><given-names>K.</given-names></name><name><surname>Burgard</surname><given-names>W.</given-names></name></person-group><article-title>Hierarchies of Octrees for Efficient 3D Mapping</article-title><conf-name>Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '11)</conf-name><conf-loc>San Francisco, CA, USA</conf-loc><conf-date>25–30 September 2011</conf-date></citation></ref>
<ref id="b8-sensors-12-03868"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meagher</surname><given-names>D.</given-names></name></person-group><article-title>Geometric modeling using octree encoding</article-title><source>Comput. Graph. Image Process.</source><year>1982</year><volume>19</volume><fpage>129</fpage><lpage>147</lpage><pub-id pub-id-type="doi">10.1016/0146-664X(82)90104-6</pub-id></citation></ref>
<ref id="b9-sensors-12-03868"><label>9.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Moravec</surname><given-names>H.</given-names></name><name><surname>Elfes</surname><given-names>A.</given-names></name></person-group><article-title>High Resolution Maps from Wide Angle Sonar</article-title><conf-name>Proceedings of the 1985 IEEE International Conference on Robotics and Automation</conf-name><conf-loc>St. Louis, MO, USA</conf-loc><conf-date>25–28 March 1985</conf-date><comment>Volume 2</comment><fpage>116</fpage><lpage>121</lpage></citation></ref>
<ref id="b10-sensors-12-03868"><label>10.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Thrun</surname><given-names>S.</given-names></name><name><surname>Fox</surname><given-names>D.</given-names></name><name><surname>Burgard</surname><given-names>W.</given-names></name></person-group><source>Probabilistic Robotics</source><publisher-name>MIT Press</publisher-name><publisher-loc>Cambridge, MA, USA</publisher-loc><year>2005</year></citation></ref>
<ref id="b11-sensors-12-03868"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moravec</surname><given-names>H.P.</given-names></name></person-group><article-title>Sensor fusion in certainty grids for mobile robots</article-title><source>AI Mag.</source><year>1988</year><volume>9</volume><fpage>61</fpage><lpage>74</lpage></citation></ref>
<ref id="b12-sensors-12-03868"><label>12.</label><citation citation-type="web"><person-group person-group-type="author"><collab>OpenNI</collab></person-group><comment>Available online: <ext-link xlink:href="http://75.98.78.94/default.aspx" ext-link-type="uri">http://75.98.78.94/default.aspx</ext-link></comment><access-date>accessed on 22 February 2012</access-date></citation></ref>
<ref id="b13-sensors-12-03868"><label>13.</label><citation citation-type="web"><person-group person-group-type="author"><collab>Kinect Matlab</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.mathworks.com/matlabcentral/fileexchange/30242" ext-link-type="uri">http://www.mathworks.com/matlabcentral/fileexchange/30242</ext-link></comment><access-date>accessed on 22 February 2012</access-date></citation></ref>
<ref id="b14-sensors-12-03868"><label>14.</label><citation citation-type="web"><person-group person-group-type="author"><collab>URG programming guide</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.hokuyo-aut.jp/02sensor/07scanner/download/urgprogramsen/" ext-link-type="uri">http://www.hokuyo-aut.jp/02sensor/07scanner/download/urgprogramsen/</ext-link></comment><access-date>accessed on 22 February 2012</access-date></citation></ref></ref-list>
<sec sec-type="display-objects">
<title>Figures</title>
<fig id="f1-sensors-12-03868" position="float">
<label>Figure 1.</label>
<caption>
<p>System setup, which consists of the Microsoft Kinect<sub>TM</sub> sensor and the URG-04LX-UG01 laser range finder.</p></caption>
<graphic xlink:href="sensors-12-03868f1.gif"/></fig>
<fig id="f2-sensors-12-03868" position="float">
<label>Figure 2.</label>
<caption>
<p><bold>(a)</bold> The cube has been subdivided into tree depths, where the black cube represents an occupied voxel; <bold>(b)</bold> Octree representation.</p></caption>
<graphic xlink:href="sensors-12-03868f2.gif"/></fig>
<fig id="f3-sensors-12-03868" position="float">
<label>Figure 3.</label>
<caption>
<p>The environment seen by the Kinect<sub>TM</sub> and the laser range finder, which is placed on top of the Kinect<sub>TM</sub>.</p></caption>
<graphic xlink:href="sensors-12-03868f3.gif"/></fig>
<fig id="f4-sensors-12-03868" position="float">
<label>Figure 4.</label>
<caption>
<p>The depth image from the Kinect<sub>TM</sub> sensor. Depth values are given in mm.</p></caption>
<graphic xlink:href="sensors-12-03868f4.gif"/></fig>
<fig id="f5-sensors-12-03868" position="float">
<label>Figure 5.</label>
<caption>
<p><bold>(a)</bold> Occupied volumes of the first data set of the environment; <bold>(b)</bold> Empty volumes of the first data set of the environment.</p></caption>
<graphic xlink:href="sensors-12-03868f5.gif"/></fig>
<fig id="f6-sensors-12-03868" position="float">
<label>Figure 6.</label>
<caption>
<p>A 2D laser slice of the environment. <bold>(a)</bold> Occupied volumes; <bold>(b)</bold> Empty volumes.</p></caption>
<graphic xlink:href="sensors-12-03868f6.gif"/></fig>
<fig id="f7-sensors-12-03868" position="float">
<label>Figure 7.</label>
<caption>
<p>The increased field of view of the Kinect<sub>TM</sub> sensor. <bold>(a)</bold> Occupied volumes of the two fused data sets; <bold>(b)</bold> Empty volumes of the two fused data sets.</p></caption>
<graphic xlink:href="sensors-12-03868f7.gif"/></fig>
<fig id="f8-sensors-12-03868" position="float">
<label>Figure 8.</label>
<caption>
<p>The laser range readings are compared with the true map.</p></caption>
<graphic xlink:href="sensors-12-03868f8.gif"/></fig>
<fig id="f9-sensors-12-03868" position="float">
<label>Figure 9.</label>
<caption>
<p>The two fused data sets are compared with the true map.</p></caption>
<graphic xlink:href="sensors-12-03868f9.gif"/></fig>
<fig id="f10-sensors-12-03868" position="float">
<label>Figure 10.</label>
<caption>
<p>The obstacle is not detected because it has been placed closer than the minimum detection range of the Kinect<sub>TM</sub> sensor.</p></caption>
<graphic xlink:href="sensors-12-03868f10.gif"/></fig>
<fig id="f11-sensors-12-03868" position="float">
<label>Figure 11.</label>
<caption>
<p>The obstacle is detected because it has been placed beyond the minimum detection range of the laser sensor.</p></caption>
<graphic xlink:href="sensors-12-03868f11.gif"/></fig>
<fig id="f12-sensors-12-03868" position="float">
<label>Figure 12.</label>
<caption>
<p>Improvement of the minimum detection range of the Kinect<sub>TM</sub> sensor.</p></caption>
<graphic xlink:href="sensors-12-03868f12.gif"/></fig></sec></back></article>
