<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sensors</journal-id>
<journal-title>Sensors</journal-title>
<issn pub-type="epub">1424-8220</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/s121217186</article-id>
<article-id pub-id-type="publisher-id">sensors-12-17186</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Intuitive Terrain Reconstruction Using Height Observation-Based Ground Segmentation and 3D Object Boundary Estimation</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Song</surname><given-names>Wei</given-names></name><xref ref-type="aff" rid="af1-sensors-12-17186"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Cho</surname><given-names>Kyungeun</given-names></name><xref ref-type="aff" rid="af1-sensors-12-17186"><sup>1</sup></xref><xref ref-type="corresp" rid="c1-sensors-12-17186">*</xref></contrib>
<contrib contrib-type="author">
<name><surname>Um</surname><given-names>Kyhyun</given-names></name><xref ref-type="aff" rid="af1-sensors-12-17186"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Won</surname><given-names>Chee Sun</given-names></name><xref ref-type="aff" rid="af2-sensors-12-17186"><sup>2</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Sim</surname><given-names>Sungdae</given-names></name><xref ref-type="aff" rid="af3-sensors-12-17186"><sup>3</sup></xref></contrib></contrib-group>
<aff id="af1-sensors-12-17186">
<label>1</label> Department of Multimedia Engineering, Dongguk University-Seoul, 26 Pildong 3 Ga, Jung-gu, Seoul 100-715, Korea; E-Mails: <email>songwei@dongguk.edu</email> (W.S.); <email>khum@dongguk.edu</email> (K.U.)</aff>
<aff id="af2-sensors-12-17186">
<label>2</label> Division of Electronics and Electrical Engineering, Dongguk University-Seoul, 26 Pildong 3 Ga, Jung-gu, Seoul 100-715, Korea; E-Mail: <email>cswon@dongguk.edu</email></aff>
<aff id="af3-sensors-12-17186">
<label>3</label> Agency for Defense Development, Bugyuseong daero 488 beon gi, Yoseong, Daejeon 305-152, Korea; E-Mail: <email>sdsim@add.re.kr</email></aff>
<author-notes>
<corresp id="c1-sensors-12-17186">
<label>*</label> Author to whom correspondence should be addressed; E-Mail: <email>cke@dongguk.edu</email>; Tel.: +82-2-2260-3834; Fax: +82-2-2260-3766.</corresp></author-notes>
<pub-date pub-type="collection">
<month>12</month>
<year>2012</year></pub-date>
<pub-date pub-type="epub">
<day>12</day>
<month>12</month>
<year>2012</year></pub-date>
<volume>12</volume>
<issue>12</issue>
<fpage>17186</fpage>
<lpage>17207</lpage>
<history>
<date date-type="received">
<day>08</day>
<month>10</month>
<year>2012</year></date>
<date date-type="rev-recd">
<day>07</day>
<month>12</month>
<year>2012</year></date>
<date date-type="accepted">
<day>11</day>
<month>12</month>
<year>2012</year></date></history>
<permissions>
<copyright-statement>© 2012 by the authors; licensee MDPI, Basel, Switzerland</copyright-statement>
<copyright-year>2012</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>Mobile robot operators must make rapid decisions based on information about the robot’s surrounding environment. Terrain modeling and photorealistic visualization are therefore required for the remote operation of mobile robots. We have produced a voxel map and textured mesh from the 2D and 3D datasets collected by a robot’s array of sensors, but some upper parts of objects lie beyond the sensors’ measurement range and are missing from the terrain reconstruction result, which leaves the terrain model incomplete. To solve this problem, we present a new ground segmentation method to detect non-ground data in the reconstructed voxel map. Our method uses height histograms to estimate the ground height range, and a Gibbs-Markov random field model to refine the segmentation results. To reconstruct a complete terrain model of the 3D environment, we develop a 3D boundary estimation method for non-ground objects. We apply a boundary detection technique to the 2D image, before estimating and refining the actual height values of the non-ground vertices in the reconstructed textured mesh. Our proposed methods were tested in an outdoor environment in which trees and buildings were not completely sensed. Our results show that ground segmentation takes less time than data sensing, which is necessary for a real-time approach. In addition, those parts of objects that were not sensed are accurately recovered to retrieve their real-world appearances.</p></abstract>
<kwd-group>
<kwd>terrain reconstruction</kwd>
<kwd>3D ground segmentation</kwd>
<kwd>3D boundary estimation</kwd>
<kwd>height histogram</kwd>
<kwd>Gibbs-Markov Random Field</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>Remote operation of mobile robots is widely used in planetary exploration, search and rescue, surveillance, defense, and other robotic applications [<xref ref-type="bibr" rid="b1-sensors-12-17186">1</xref>]. An operator controls the mobile robot through a remote control system (RCS), which provides an immersive virtual environment to enable an understanding of terrain information [<xref ref-type="bibr" rid="b2-sensors-12-17186">2</xref>–<xref ref-type="bibr" rid="b4-sensors-12-17186">4</xref>]. The operator controls the mobile robot by navigating and interacting with real environments without collisions or encountering other dangers [<xref ref-type="bibr" rid="b5-sensors-12-17186">5</xref>]. In situations where the operator must quickly decide on the motion and path of the robot, rapid feedback of the real environment is vital for effective control, so real-time terrain modeling and photorealistic visualization systems have been developed [<xref ref-type="bibr" rid="b6-sensors-12-17186">6</xref>].</p>
<p>Conventional real-time visualization systems mostly apply a 2D image, a voxel map or a texture map to represent a terrain model. A 2D image is captured by the mobile robot’s camera. For example, the 2D image in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(a)</xref> is captured by the camera on the front of a robot. A voxel map, as shown at a quarter viewpoint in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(b)</xref>, is generated by integrating the sensed 3D point clouds into regular grids. From the voxel map, a terrain mesh, as shown at quarter viewpoint in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>, is generated by integrating the top points in the <italic>x–z</italic> cells into a regular triangular mesh. By mapping the texture in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(a)</xref> onto the mesh, a textured mesh is obtained [<xref ref-type="bibr" rid="b7-sensors-12-17186">7</xref>]. The yellow vertices in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref> denote the regions that are not projected from the 2D image. A terrain model consisting of geometrical shapes and realistic textures enables a photorealistic visualization approach for the terrain reconstruction and a remote operation of mobile robots.</p>
<p>In large-scale environments, a level-of-detail (LOD) method is used to render the near-field regions of the terrain model. In far-field regions, billboard rendering methods [<xref ref-type="bibr" rid="b8-sensors-12-17186">8</xref>], which represent a texture in front of the terrain model for real-time visualization, are applied. However, when processing a terrain model, the upper regions of objects are often outside the measurement range of the 3D sensor. As a result, the reconstructed terrain model contains large objects with “unsensed” parts. In <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>, we can see that the top parts of the buildings and trees are missing in the terrain reconstruction result. We need to recover the missing parts of tall objects. The objective of our study is to reconstruct a complete terrain model with object detection and 3D boundary estimation of non-ground objects.</p>
<p>In this paper, we aim to build a real-time, large-scale terrain modeling system for photorealistic visualization, including our new ground segmentation method and 3D boundary estimation algorithm. The framework of the proposed system is shown in <xref ref-type="fig" rid="f2-sensors-12-17186">Figure 2</xref>.</p>
<p>The system includes three principal steps. First, data from the integrated sensors are used to generate a voxel map and a textured mesh as terrain models. The multiple sensors mounted on mobile robots collect terrain information in the form of 3D point clouds, 2D images, GPS, and rotation states. Based on the rotation and position data, the received 3D point clouds are transformed to absolute positions, which are quantized into regular grids and registered into a voxel map. A textured mesh is then obtained by projecting the mesh vertices onto the 2D images.</p>
<p>Next, we develop a ground segmentation method to classify ground surface and non-ground objects in the voxel map. We apply a height histogram method, based on the spatial distribution of the ground and objects, to segment ground data in the voxel map. Because the voxels in the terrain model are highly affected by their neighbors, we apply a Gibbs-Markov random field (GMRF) [<xref ref-type="bibr" rid="b9-sensors-12-17186">9</xref>,<xref ref-type="bibr" rid="b10-sensors-12-17186">10</xref>] to refine the segmentation result.</p>
<p>Finally, our 3D boundary detection algorithm is applied to recover unsensed parts of non-ground objects. The missing portions of objects are reconstructed by detecting the object boundary in the 2D image, then estimating the true height from the incomplete boundary in the textured mesh.</p>
<p>This paper is organized as follows: in Section 2, we survey related work on terrain modeling, ground segmentation, and photorealistic modeling methods. In Section 3, we explain a ground segmentation method for voxel maps. In Section 4, we describe our non-ground object boundary estimation method for complete terrain reconstruction. The performance of the proposed ground segmentation and photorealistic visualization methods is analyzed and evaluated in Section 5, and finally, in Section 6, we draw our conclusions.</p></sec>
<sec>
<label>2.</label>
<title>Related Works</title>
<p>There are many approaches to terrain modeling based on large-scale voxel maps and textured meshes. For example, there are algorithms based on multiple-sensor integration, large-scale dataset registration, ground surface and non-ground object reconstruction, and 3D point interpolation. In this section, we review the ground surface and non-ground object reconstruction methods. In addition, we investigate research on non-ground object segmentation and 3D point interpolation, in order to recover the unsensed parts of large objects in the reconstructed terrain model.</p>
<p>When we represent a robot’s surrounding terrain in a virtual environment, it is necessary to reconstruct a terrain model using an integrated dataset obtained from multiple sensors [<xref ref-type="bibr" rid="b11-sensors-12-17186">11</xref>–<xref ref-type="bibr" rid="b15-sensors-12-17186">15</xref>]. Conventionally, the voxel map [<xref ref-type="bibr" rid="b16-sensors-12-17186">16</xref>] and textured mesh [<xref ref-type="bibr" rid="b2-sensors-12-17186">2</xref>] have been applied for this terrain modeling.</p>
<p>Huber <italic>et al.</italic>[<xref ref-type="bibr" rid="b8-sensors-12-17186">8</xref>] and Kelly <italic>et al.</italic>[<xref ref-type="bibr" rid="b12-sensors-12-17186">12</xref>] described real-world representation methods using video-ranging modules. 3D textured voxel grids were used to describe the surrounding terrain in the near field, whereas a billboard texture in front of the robot was used to show scenes in the far field. When the virtual camera changed its position and rotation, the billboard could not match the rendering result from the 3D modeling. For different virtual camera motion, therefore, the far-field scene should be represented as it appears in the real world.</p>
<p>Noguera <italic>et al.</italic>[<xref ref-type="bibr" rid="b17-sensors-12-17186">17</xref>] proposed a hybrid photorealistic visualization system with a 2D synthetic panorama generation method to provide on-line photorealistic visualization. The client system rendered the terrain close to the virtual camera using the LOD method. The far-field terrain was represented by a panorama, which was generated from a far-field terrain model rendered by a high-capability server system. However, it is difficult for these methods to estimate the extent of large objects when the 3D sensors cannot measure their heights. To solve this problem, we propose a non-ground object boundary estimation method to recover complete objects from the captured 2D image and the reconstructed terrain mesh.</p>
<p>A ground segmentation algorithm that classifies ground surface and non-ground objects in the reconstructed terrain models is necessary to recover unsensed parts of the non-ground objects. Conrad <italic>et al.</italic>[<xref ref-type="bibr" rid="b18-sensors-12-17186">18</xref>] applied the scale-invariant feature transform (SIFT) algorithm [<xref ref-type="bibr" rid="b19-sensors-12-17186">19</xref>] to establish a correspondence between pixels on stereo images. To cluster them into ground and non-ground classes, they used a modified Expectation Maximization algorithm. In their work, only the corresponding pixels were clustered. Ke <italic>et al.</italic>[<xref ref-type="bibr" rid="b20-sensors-12-17186">20</xref>] improved Conrad’s method by constructing the contours of the image and judging whether a contour belongs to the ground plane. Because of the limited range and resolution of a stereo camera, only a small quantity of ground pixels could be obtained. A 3D sensor with highly accurate data collection is required to determine which areas are safe for a mobile robot.</p>
<p>Oniga <italic>et al.</italic>[<xref ref-type="bibr" rid="b21-sensors-12-17186">21</xref>] utilized a random sample consensus (RANSAC) algorithm to detect a road surface and cluster obstacles based on the density of the sensed points, and Mufti <italic>et al.</italic>[<xref ref-type="bibr" rid="b22-sensors-12-17186">22</xref>] presented a spatio-temporal RANSAC framework to detect planar surfaces. Based on the planar features of the ground, the detected area was then segmented. To improve the accuracy of the RANSAC plane, Lam <italic>et al.</italic>[<xref ref-type="bibr" rid="b23-sensors-12-17186">23</xref>] proposed a least-squares fit plane with a Kalman filter to extract the road data from sequentially obtained 3D point clouds. Due to the computational cost of the RANSAC algorithm, it is difficult to apply this method in real-time ground segmentation approaches.</p>
<p>To segment ground data in the reconstructed terrain model, we need to calculate each voxel’s probability of being in the ground and non-ground configurations. An effective approach to object segmentation from 2D images and 3D point clouds is the Markov random field (MRF) algorithm [<xref ref-type="bibr" rid="b24-sensors-12-17186">24</xref>–<xref ref-type="bibr" rid="b30-sensors-12-17186">30</xref>].</p>
<p>Vernaza <italic>et al.</italic>[<xref ref-type="bibr" rid="b31-sensors-12-17186">31</xref>] presented a prediction-based structured terrain classification method for the DARPA Grand Challenge. They used an MRF model to classify the pixels in 2D images into obstacles or ground regions. However, it is difficult to specify the probability density functions (PDFs) in MRFs. The Hammersley-Clifford theorem addresses this problem by establishing an equivalence between the MRF and the Gibbs distribution [<xref ref-type="bibr" rid="b25-sensors-12-17186">25</xref>]. Because the computation of GMRFs is too complicated for large-scale datasets, we need to remove redundant elements from the GMRF in order to reduce the computational cost of ground segmentation.</p>
<p>Song <italic>et al.</italic>[<xref ref-type="bibr" rid="b32-sensors-12-17186">32</xref>] proposed a ground segmentation method for 2D images that combined the GMRF method with a flood-fill algorithm. After segmenting the ground pixels in the 2D image, the method detects the ground vertices in the textured mesh by projection from those pixels. Because of the computational requirements of image processing, this 2D ground segmentation cannot run in real time. In this paper, we propose a ground segmentation method for a 3D terrain mesh without image processing. The method applies a height histogram to estimate the ground height range and a GMRF model to classify ground surface and non-ground objects in the voxel map. Unlike the captured 2D images, the voxel map changes little between collection times. The processing time of the method is shorter than the sensing time of the 3D point cloud. In this way, the proposed method is able to realize real-time terrain reconstruction.</p>
<p>The recovery of unsensed regions plays a major role in obstacle avoidance. Some researchers have applied interpolation algorithms to fill empty holes and smooth terrain [<xref ref-type="bibr" rid="b33-sensors-12-17186">33</xref>–<xref ref-type="bibr" rid="b36-sensors-12-17186">36</xref>]. For example, to estimate such unobserved data, Douillard <italic>et al.</italic>[<xref ref-type="bibr" rid="b37-sensors-12-17186">37</xref>] interpolated grids in empty regions of elevation maps in order to propagate label estimates. This method represents a terrain map using a 3D textured voxel grid, and applies a point interpolation algorithm to fill any small holes. While successful in filling empty holes and smoothing terrain, this approach also encounters difficulties in estimating the height of large objects based on the 3D sensor measurements.</p>
<p>In hardware design research, Früh <italic>et al.</italic>[<xref ref-type="bibr" rid="b38-sensors-12-17186">38</xref>] utilized a vertical laser scanner to measure large buildings and represent streetscapes in urban environments. When an object is located between the sensors and a building, some regions of the building are blocked by the object in the scanning results. These missing regions are filled by a planar or horizontal interpolation algorithm.</p>
<p>Point interpolation algorithms are used to fill small holes in the 3D grid. However, it is difficult for these methods to estimate the height of large objects, meaning that the actual shape of tall objects is often misrepresented. Point interpolation algorithms are also ineffective in representing porous objects, such as vegetation. To solve these problems, Song <italic>et al.</italic>[<xref ref-type="bibr" rid="b32-sensors-12-17186">32</xref>] proposed a GMRF-based height estimation algorithm that estimates the object’s top pixel in the 2D image for each sensed object pixel. They reconstructed the complete terrain from the captured 2D image and the reconstructed terrain mesh. However, the complex computation of the GMRF makes this method slow. We propose a boundary estimation method that uses a kernel-based boundary detection algorithm in the 2D image. The top pixels of objects are easily detected by finding the boundary above the sensed pixels.</p>
<p>In this paper, we integrate a colored voxel map and a textured mesh to construct a photorealistic terrain model. For the ground segmentation in the 3D voxel map, we present a height histogram method with a GMRF model. Further, in contrast to interpolation methods, we explain a 3D boundary estimation method to recover unsensed regions in the textured mesh, especially for high or tall objects outside the sensors’ range of measurement.</p></sec>
<sec>
<label>3.</label>
<title>Ground Segmentation in the Voxel Map</title>
<p>Before recovering the unsensed parts of non-ground objects, we require a ground segmentation algorithm that classifies ground and non-ground data from the reconstructed voxel map. We aim to autonomously segment the ground surface in rough and sloped terrain and to segment non-ground objects with as few errors as possible. In this section, we apply a height histogram method and a GMRF model for this purpose. To initialize the variables in the GMRF model, such as height observation and configuration, in Section 3.1 we roughly segment the ground surface using a height histogram method based on the spatial distribution of ground surface and non-ground objects. Some errors will exist in this segmentation result. To remove these, we apply the GMRF model to refine the segmentation in Section 3.2. Then, from the non-ground voxel segmentation result, we will estimate the actual height value for the non-ground vertices. This procedure is described in Section 4.</p>
<sec>
<label>3.1.</label>
<title>Ground Height Range Estimation by Height Histogram</title>
<p>We usually segment the 3D points using the height of the robotic vehicle <italic>h</italic><sub>1</sub> as the standard. If the <italic>y</italic> coordinate of a 3D point is between −<italic>h</italic><sub>1</sub> − ∆ and −<italic>h</italic><sub>1</sub> + ∆, then we assume that this point is ground data. However, this method is not accurate in regions where the surface is sloped or rough, as the robot cannot move smoothly and the 3D sensor’s height value is unstable. In this section, we apply a height histogram method to estimate the ground height range in real time from the reconstructed voxel map.</p>
<p>The height histogram is a graph representing the distribution of height values, as shown in <xref ref-type="fig" rid="f3-sensors-12-17186">Figure 3</xref>. Discrete intervals on the <italic>x</italic>-axis represent height ranges, and the vertical extent of each interval represents the number of voxels with a height value within that range. We define a common histogram [<xref ref-type="bibr" rid="b39-sensors-12-17186">39</xref>] as follows:
<disp-formula id="FD1">
<label>(1)</label>
<mml:math id="mm1" display="block">
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>l</mml:mi>
<mml:mi>k</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></disp-formula>where <italic>l<sub>k</sub></italic> is the observation value and <italic>n<sub>k</sub></italic> is the total number of data with observation <italic>l<sub>k</sub></italic>. If the <italic>y</italic> coordinate of a voxel is equal to <italic>l<sub>k</sub></italic>, the variable <italic>n<sub>k</sub></italic> will be increased by 1.</p>
<p>Part of the ground is a smooth, horizontal surface. The 3D sensor cannot pass through the solid ground surface, and no data is scanned below the ground surface. Hence, the height distribution of this part of the ground is highly localized within (−<italic>h</italic><sub>1</sub> − ∆, −<italic>h</italic><sub>1</sub> + ∆), as illustrated in <xref ref-type="fig" rid="f3-sensors-12-17186">Figure 3(a)</xref>. A non-ground object has a vertical surface rising from the ground. The height values of a non-ground object are distributed evenly within (−<italic>h</italic><sub>1</sub> + ∆, <italic>h</italic><sub>2</sub>), as shown in <xref ref-type="fig" rid="f3-sensors-12-17186">Figure 3(b)</xref>, where <italic>h</italic><sub>2</sub> is the upper extent of the 3D sensor’s range.</p>
<p>We create a height histogram, as shown in <xref ref-type="fig" rid="f4-sensors-12-17186">Figure 4(b)</xref>, from the voxels in the voxel map shown in <xref ref-type="fig" rid="f4-sensors-12-17186">Figure 4(a)</xref>. We estimate the height value <italic>h</italic><sub>1</sub> from the histogram peak, i.e., the height interval with the largest voxel count. The voxels contributing to the interval (−<italic>h</italic><sub>1</sub> − ∆, −<italic>h</italic><sub>1</sub> + ∆) correspond to the ground surface.</p>
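<p>The histogram-based estimate described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in our system: the function names, the bin size used to quantize the <italic>y</italic> coordinates, and the representation of voxels as a plain list of heights are assumptions made for the example.</p>
<preformat>
```python
from collections import Counter


def height_histogram(voxel_heights, bin_size):
    """Count voxels per quantized height level: h(l_k) = n_k."""
    # bin_size is a hypothetical quantization step for the y coordinate.
    return Counter(round(y / bin_size) for y in voxel_heights)


def rough_ground_segmentation(voxel_heights, delta, bin_size):
    """Split voxel indices into ground and non-ground by the histogram peak."""
    histogram = height_histogram(voxel_heights, bin_size)
    # The most populated level is taken as the ground height (-h1 in the text).
    peak_level = max(histogram, key=histogram.get)
    ground_y = peak_level * bin_size
    ground, non_ground = [], []
    for index, y in enumerate(voxel_heights):
        # A voxel is ground if its height lies within delta of the peak height.
        if delta > abs(y - ground_y):
            ground.append(index)
        else:
            non_ground.append(index)
    return ground_y, ground, non_ground
```
</preformat>
<p>Because the peak is re-estimated from the current voxel map, the threshold follows the ground height even when the sensor height changes on rough terrain, which a fixed vehicle-height threshold cannot do.</p>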
<p>By applying this estimated height value as a threshold for ground segmentation, we obtain the result shown in <xref ref-type="fig" rid="f5-sensors-12-17186">Figure 5(a)</xref>, where the voxels in cyan and yellow represent the ground and non-ground data, respectively. We can see that some regions below the ground are recognized as ground data. This rough segmentation method does not identify all ground data, because each voxel’s configuration is determined using only its local height.</p></sec>
<sec>
<label>3.2.</label>
<title>Refining Process for Ground Segmentation</title>
<p>When we segment ground data using the threshold generated from the height histogram method, some errors exist in the segmentation result. To remove them, we describe a technique that determines the data configuration based on local and neighboring observations in the generated voxel map. The GMRF model definition is given in the Appendix at the end of this paper, which explains the theoretical background of our method.</p>
<p>When we apply the GMRF to ground segmentation in a 3D voxel map, we first determine a set of voxels whose configurations imply a high probability of being in the ground class. If the <italic>y</italic> coordinate of a 3D voxel is in the range −<italic>h</italic><sub>1</sub> − ∆ to −<italic>h</italic><sub>1</sub> + ∆, then the configuration of this voxel is toward the ground class. This step represents a rough ground segmentation process that produces dataset <italic>G</italic><sub>1</sub>. The voxels located outside this range are grouped into dataset <italic>G</italic><sub>2</sub>, whose configurations are toward the non-ground class. We use this method to estimate probabilities for each configuration of voxels in the voxel map.</p>
<p>As mentioned in Section 3.1, <italic>G</italic><sub>1</sub> contains some non-ground data and <italic>G</italic><sub>2</sub> does not contain all non-ground data, because the configuration is determined using only a local height value. Next, we apply the GMRF model defined in the <xref ref-type="app" rid="app1">Appendix</xref> to classify the configurations of voxels into ground or non-ground classes.</p>
<p>The voxels with a ground configuration are grouped into dataset <italic>G</italic><sub>1</sub>′, whose configurations are determined as the ground class. The voxels in <italic>G</italic><sub>1</sub>′ are represented as the green region in <xref ref-type="fig" rid="f5-sensors-12-17186">Figure 5(b)</xref>. Regions containing non-ground voxels are grouped into dataset <italic>G</italic><sub>2</sub>′. If a voxel <italic>s</italic> ∈ <italic>G</italic><sub>1</sub>′ maps onto a pixel in a 2D image, we determine this pixel to be a ground pixel. If not, this pixel is a non-ground pixel.</p></sec></sec>
<sec>
<label>4.</label>
<title>3D Boundary Estimation for Non-Ground Objects</title>
<p>When mobile robots collect information about the surrounding terrain, some parts of objects are outside the range of measurement of their 3D sensors. For example, in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>, we can see that the top of the building is missing in the terrain reconstruction result. However, objects such as buildings and vegetation can be seen completely in the 2D image of <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(a)</xref>, captured by the mobile robot’s camera. In this section, we explain a 3D boundary estimation method for non-ground objects. This solves the problem of recovering unsensed regions by estimating the top boundary of an object. Our proposed boundary estimation process consists of two steps. First, we find the boundary between the object and the background in a 2D image. Next, we find the boundary’s 3D coordinates by inverting the projection from 3D points to 2D pixels.</p>
<sec>
<label>4.1.</label>
<title>Boundary Detection of Foreground Objects in 2D Images</title>
<p>Mobile robots require real-time boundary detection. Hence, we apply a simple kernel-based boundary detection method to estimate image gradients and detect the foreground and background in a 2D image. To account for noise in the image, we use dilation and erosion methods to smooth the boundary detection result.</p>
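<p>As an illustration of the smoothing step, the following Python sketch implements binary dilation and erosion on a boundary mask stored as a list of rows of 0 and 1. The 3 × 3 neighborhood, the clipping of the window at the image border, and the function names are assumptions made for this example; the structuring element and the order in which the two operations are applied are not fixed by the description above.</p>
<preformat>
```python
def _morph(mask, combine):
    """Apply combine (max or min) over each pixel's 3 x 3 neighborhood."""
    height, width = len(mask), len(mask[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # The window is clipped at the image border (an assumption).
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            row.append(combine(window))
        result.append(row)
    return result


def dilate(mask):
    """A pixel becomes 1 if any pixel in its neighborhood is 1."""
    return _morph(mask, max)


def erode(mask):
    """A pixel stays 1 only if every pixel in its neighborhood is 1."""
    return _morph(mask, min)
```
</preformat>
<p>Under these assumptions, dilation followed by erosion fills one-pixel gaps in a detected boundary, whereas erosion followed by dilation removes isolated noise pixels.</p>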
<p>We define a horizontal kernel to detect the boundary in the horizontal direction, and a vertical kernel to detect the boundary in the vertical direction, as shown in <xref ref-type="fig" rid="f6-sensors-12-17186">Figure 6</xref>.</p>
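<p>The two kernel responses can be sketched in Python, written term by term from Equations (2) and (3). The default coefficients <italic>a</italic> = 1 and <italic>b</italic> = 2 (which give the familiar Sobel weights), the function name, and the storage of the image as a list of rows indexed as image[y][x] are assumptions for this example; the method only requires <italic>a</italic> and <italic>b</italic> to be non-zero constants.</p>
<preformat>
```python
def kernel_responses(image, x, y, a=1, b=2):
    """Return (L_x, L_y) at an interior pixel (x, y) of a grayscale image."""

    def p(i, j):
        # Pixel intensity at column i, row j.
        return image[j][i]

    # Horizontal change, Equation (2): right column minus left column.
    l_x = (a * (p(x + 1, y + 1) - p(x - 1, y + 1))
           + b * (p(x + 1, y) - p(x - 1, y))
           + a * (p(x + 1, y - 1) - p(x - 1, y - 1)))
    # Vertical change, Equation (3): row y - 1 minus row y + 1.
    l_y = (a * (p(x - 1, y - 1) - p(x - 1, y + 1))
           + b * (p(x, y - 1) - p(x, y + 1))
           + a * (p(x + 1, y - 1) - p(x + 1, y + 1)))
    return l_x, l_y
```
</preformat>
<p>The sketch evaluates interior pixels only; border pixels would need padding or would have to be skipped.</p>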
<p>By computing the convolutions <italic>L<sub>x</sub></italic>(<italic>x</italic>, <italic>y</italic>) and <italic>L<sub>y</sub></italic>(<italic>x</italic>, <italic>y</italic>) with the kernels in <xref ref-type="fig" rid="f6-sensors-12-17186">Figure 6</xref>, the horizontal and vertical changes at a pixel (<italic>x</italic>, <italic>y</italic>) are formulated as follows:
<disp-formula id="FD2">
<label>(2)</label>
<mml:math id="mm2" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">bp</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">bp</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD3">
<label>(3)</label>
<mml:math id="mm3" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo> </mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo> </mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">bp</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">bp</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">ap</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where <italic>a</italic> and <italic>b</italic> are non-zero constants. The gradient of the change in the pixel (<italic>x, y</italic>) is formulated as follows:
<disp-formula id="FD4">
<label>(4)</label>
<mml:math id="mm4" display="block">
<mml:mrow>
<mml:mo>∇</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mn>2</mml:mn></mml:msup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:msqrt></mml:mrow></mml:math></disp-formula></p>
<p>The coefficients <italic>k<sub>x</sub></italic> and <italic>k<sub>y</sub></italic> weight the horizontal and vertical changes, respectively. In this project, we detect the boundary between foreground objects and background data, such as ground pixels and sky pixels. Because foreground objects always lie above or below the background in a 2D image, vertical changes indicate the boundary more strongly than horizontal changes. Therefore, <italic>k<sub>y</sub></italic> is set larger than <italic>k<sub>x</sub></italic> in all scenarios of our project.</p>
<p>If the change at pixel (<italic>x</italic>, <italic>y</italic>) is large, the pixel is likely to lie on the boundary. Thus, if the magnitude ∇<italic>L</italic>(<italic>x</italic>, <italic>y</italic>) exceeds a given threshold, we provisionally mark the pixel (<italic>x</italic>, <italic>y</italic>) as a boundary pixel.</p>
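<p>As an illustration only, the weighted gradient and thresholding step can be sketched in Python as follows. The sketch assumes that the image is a list of rows of grey levels indexed as <italic>img</italic>[<italic>y</italic>][<italic>x</italic>]; the default values of <italic>a</italic>, <italic>b</italic>, <italic>k<sub>x</sub></italic>, <italic>k<sub>y</sub></italic> and the threshold passed by the caller are hypothetical example values, not the settings used in our experiments.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def weighted_gradient(img, x, y, a=1.0, b=2.0, kx=1.0, ky=2.0):
    """Weighted gradient magnitude at an interior pixel (x, y).

    img is a list of rows, so p(x, y) is read as img[y][x].
    a, b, kx, ky are example values only (assumption).
    """
    def p(u, v):
        return img[v][u]

    # Horizontal change: right-hand column minus left-hand column.
    lx = (-a * p(x - 1, y + 1) + a * p(x + 1, y + 1)
          - b * p(x - 1, y) + b * p(x + 1, y)
          - a * p(x - 1, y - 1) + a * p(x + 1, y - 1))
    # Vertical change: row y - 1 minus row y + 1.
    ly = (a * p(x - 1, y - 1) - a * p(x - 1, y + 1)
          + b * p(x, y - 1) - b * p(x, y + 1)
          + a * p(x + 1, y - 1) - a * p(x + 1, y + 1))
    return math.sqrt(kx * lx ** 2 + ky * ly ** 2)


def detect_boundary(img, threshold, **weights):
    """Return a binary image: 0 marks boundary pixels, 1 everything else.

    Pixels on the image border have no full 3x3 neighbourhood and are
    left as non-boundary (1) in this sketch.
    """
    height, width = len(img), len(img[0])
    out = [[1] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            if weighted_gradient(img, x, y, **weights) > threshold:
                out[y][x] = 0
    return out
```
]]></preformat>
<p>With these example weights, a horizontal edge scores √2 times higher than a vertical edge of the same contrast, which reflects the choice of <italic>k<sub>y</sub></italic> larger than <italic>k<sub>x</sub></italic>.</p>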
<p><xref ref-type="fig" rid="f7-sensors-12-17186">Figure 7(a)</xref> shows a binary image of the boundary detection result for <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(b)</xref>. We define <italic>p</italic>(<italic>x</italic>, <italic>y</italic>) = 0 for the black pixels to represent boundary data, and <italic>p</italic>(<italic>x</italic>, <italic>y</italic>) = 1 for the white pixels to represent non-boundary data.</p>
<p>The boundary detection result contains some noise. To remove it, we apply erosion and dilation filters. We first apply an erosion process, which extends the background region in “white” using an erosion mask, as shown in <xref ref-type="fig" rid="f8-sensors-12-17186">Figure 8(a)</xref>.</p>
<p>We shift the erosion mask across the image and generate a boundary detection result <italic>E</italic> by the convolution function on image <italic>A</italic> with the erosion mask <italic>B</italic><sub>1</sub>, formulated as follows:
<disp-formula id="FD5">
<label>(5)</label>
<mml:math id="mm5" display="block">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mn>1</mml:mn></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo> </mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>9</mml:mn></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mn>0</mml:mn></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo> </mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>≠</mml:mo>
<mml:mn>9</mml:mn></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>where the symbol ⊕ stands for the operation “exclusive or”.</p>
<p>The erosion process removes the noise but also filters out some boundary pixels. We therefore apply a dilation process to recover them. The dilation process extends the boundary region in “black” using a dilation mask, as shown in <xref ref-type="fig" rid="f8-sensors-12-17186">Figure 8(b)</xref>.</p>
<p>We shift the dilation mask across the image and generate a boundary detection result <italic>D</italic> by the convolution function on image <italic>E</italic> with the dilation mask <italic>B</italic><sub>2</sub>, formulated as follows:
<disp-formula id="FD6">
<label>(6)</label>
<mml:math id="mm6" display="block">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mn>1</mml:mn></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo> </mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo> </mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:munder>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>≠</mml:mo>
<mml:mn>0</mml:mn></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>The experimental result of removing noise from <xref ref-type="fig" rid="f7-sensors-12-17186">Figure 7(a)</xref> is shown in <xref ref-type="fig" rid="f7-sensors-12-17186">Figure 7(b)</xref>. The boundary between the foreground and background is extracted as the top black curve in <xref ref-type="fig" rid="f7-sensors-12-17186">Figure 7(b)</xref>.</p></sec>
<sec>
<label>4.2.</label>
<title>3D Boundary Estimation in 3D Textured Terrain Mesh</title>
<p>In this section, we propose a 3D boundary estimation method for the 3D terrain mesh that allows us to recover the complete shape of non-ground objects. We first detect the boundary between the foreground object and the background in a 2D image, and then find the boundary’s 3D coordinates by projecting from 2D pixels to 3D vertices.</p>
<p>From the ground data segmentation results, we consider a 3D non-ground voxel in the terrain mesh as part of the foreground object. This is because background data, such as sky, cannot be sensed.</p>
<p>When non-ground voxels in <italic>G</italic><sub>2</sub>′ are inserted into the terrain mesh, the updated vertices are categorized into a non-ground vertex dataset <italic>T</italic><sub>1</sub>. By projecting from <italic>t</italic><sub>1</sub>(<italic>x</italic><sub>1</sub>, <italic>y</italic><sub>1</sub>, <italic>z</italic><sub>1</sub>), <italic>t</italic><sub>1</sub>∈<italic>T</italic><sub>1</sub>, to the 2D image, we map <italic>t</italic><sub>1</sub> to the pixel <italic>t</italic><sub>2</sub>′ in the 2D image given by the boundary detection result. These <italic>t</italic><sub>2</sub>′ make up the dataset <italic>T</italic><sub>2</sub>, which is shown as the blue pixels in <xref ref-type="fig" rid="f9-sensors-12-17186">Figure 9(a)</xref>.</p>
<p>We search for a boundary pixel <italic>t</italic><sub>2</sub>″ above <italic>t</italic><sub>2</sub>′ as the object’s top pixel, as indicated in red in <xref ref-type="fig" rid="f9-sensors-12-17186">Figure 9(b)</xref>. From the true top location <italic>t</italic><sub>2</sub>″ in the 2D image, we estimate the height value for each object vertex.</p>
<p>We find the <italic>y</italic> coordinates of the boundary using an inverse projection from 2D pixels to 3D points, as shown in <xref ref-type="fig" rid="f9-sensors-12-17186">Figure 9(c)</xref>. We place the center of the camera at the origin. The projection ray from the origin to the non-ground object vertex gives an estimate of the height of that vertex.</p>
<p>The direction of the vector 
<inline-formula>
<mml:math id="mm7" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is the same as that of 
<inline-formula>
<mml:math id="mm8" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>, from the camera to the estimated 3D point <italic>t</italic><sub>1</sub>(<italic>x</italic><sub>1</sub>, <italic>y</italic><sub>1</sub>, <italic>z</italic><sub>1</sub>). After the camera is rotated by the rotation matrix <italic>R</italic>, the vector 
<inline-formula>
<mml:math id="mm9" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is derived as 
<inline-formula>
<mml:math id="mm10" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">Co</mml:mi></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover>
<mml:mo>+</mml:mo>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">ot</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. Therefore, we formulate that:
<disp-formula id="FD7">
<label>(7)</label>
<mml:math id="mm11" display="block">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">→</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:mi>λ</mml:mi>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">→</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:mi>λ</mml:mi>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">Co</mml:mi></mml:mrow>
<mml:mo stretchy="true">→</mml:mo></mml:mover>
<mml:mo>+</mml:mo>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">ot</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">→</mml:mo></mml:mover>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>In <xref ref-type="disp-formula" rid="FD7">Equation (7)</xref>, <italic>λ</italic> is a scalar number; the vector from the camera to the principal point is 
<inline-formula>
<mml:math id="mm12" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">Co</mml:mi></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>ɛ</mml:mi>
<mml:mi>x</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>ɛ</mml:mi>
<mml:mi>y</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>; the vector from the principal point to the estimated vertex of boundary is 
<inline-formula>
<mml:math id="mm13" display="inline">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">ot</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>. We define a vector [<italic>x</italic>″, <italic>y</italic>″, <italic>z</italic>″] as the result of 
<inline-formula>
<mml:math id="mm14" display="inline">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">Co</mml:mi></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">ot</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow>
<mml:mo>″</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">⇀</mml:mo></mml:mover></mml:mrow></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. Equating the vector components, <xref ref-type="disp-formula" rid="FD7">Equation (7)</xref> is rewritten as:
<disp-formula id="FD8">
<label>(8)</label>
<mml:math id="mm15" display="block">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">Ct</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="true">→</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi></mml:mrow>
<mml:mn>1</mml:mn></mml:msub>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi></mml:mrow>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>′</mml:mo></mml:msup>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mi>λ</mml:mi>
<mml:mi>x</mml:mi>
<mml:mo>″</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>λ</mml:mi>
<mml:mi>y</mml:mi>
<mml:mo>″</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>λ</mml:mi>
<mml:mi>z</mml:mi>
<mml:mo>″</mml:mo></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow>
<mml:mo>∼</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>″</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mo>/</mml:mo>
<mml:mi>z</mml:mi>
<mml:mo>″</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>″</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mo>/</mml:mo>
<mml:mi>z</mml:mi>
<mml:mo>″</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow>
<mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
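<p>The height computation of <xref ref-type="disp-formula" rid="FD8">Equation (8)</xref> can be sketched in Python as follows. The sketch assumes that the boundary pixel is given as an offset (<italic>u</italic>, <italic>v</italic>) from the principal point in the same units as the focal length <italic>f</italic>, so that the ray in the camera frame is [<italic>ɛ<sub>x</sub></italic> + <italic>u</italic>, <italic>ɛ<sub>y</sub></italic> + <italic>v</italic>, <italic>f</italic>]; this parameterization, the function name, and the numerical values in the usage note are assumptions made for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
def estimate_height(pixel_offset, principal, focal, rotation, z1):
    """Estimate the height y1' of an object vertex whose depth z1 is known.

    pixel_offset: (u, v) offset of the boundary pixel from the principal
                  point (assumed parameterization of the image-plane vector).
    principal:    (eps_x, eps_y) components of the camera-to-principal-point
                  vector.
    focal:        focal length f, in the same units as the pixel offset.
    rotation:     3x3 rotation matrix R given as three rows.
    z1:           depth coordinate of the vertex in the terrain mesh.
    """
    u, v = pixel_offset
    eps_x, eps_y = principal
    # Ray from the camera centre through the boundary pixel (camera frame).
    ray = (eps_x + u, eps_y + v, focal)
    # Rotate the ray: [x'', y'', z''] = R * ray.
    x2, y2, z2 = (sum(r * c for r, c in zip(row, ray)) for row in rotation)
    if abs(z2) < 1e-12:
        raise ValueError("ray is parallel to the image plane; depth scale undefined")
    # The scale factor is z1 / z'', hence y1' = y'' * z1 / z''.
    return y2 * z1 / z2
```
]]></preformat>
<p>For example, with an unrotated camera, <italic>f</italic> = 500, a principal point at (0, 0), a pixel offset of (0, 100) and <italic>z</italic><sub>1</sub> = 20, the sketch returns a height of 4. In the unrotated case the expression reduces to <italic>y</italic><sub>1</sub>′ = (<italic>ɛ<sub>y</sub></italic> + <italic>v</italic>)<italic>z</italic><sub>1</sub>/<italic>f</italic>, so a one-pixel error in <italic>v</italic> shifts the estimated height by <italic>z</italic><sub>1</sub>/<italic>f</italic>, which grows linearly with the distance of the vertex.</p>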
<p>We derive the estimated height value as <italic>y</italic><sub>1</sub>′ = <italic>y</italic>″<italic>z</italic><sub>1</sub>/<italic>z</italic>″ or, equivalently, <italic>y</italic><sub>1</sub>′ = <italic>y</italic>″<italic>x</italic><sub>1</sub>/<italic>x</italic>″. We then reset the foreground vertex <italic>t</italic><sub>1</sub> to (<italic>x</italic><sub>1</sub>, <italic>y</italic><sub>1</sub>′, <italic>z</italic><sub>1</sub>). Because the horizontal coordinates (<italic>x</italic><sub>1</sub>, <italic>z</italic><sub>1</sub>) of the 3D object vertex are fixed in the terrain mesh, we update only the elevation value <italic>y</italic><sub>1</sub>′ of each object vertex in the terrain mesh to obtain the results shown in <xref ref-type="fig" rid="f14-sensors-12-17186">Figure 14</xref>.</p></sec></sec>
<sec sec-type="methods">
<label>5.</label>
<title>Experiments and Analysis</title>
<p>In this section, we describe several experiments to analyze the performance of the proposed non-ground object detection and 3D boundary estimation methods. The experiments were performed in three steps. First, we reconstructed a voxel map and textured mesh in the virtual environment by integrating frames of 3D point clouds. Next, we segmented ground voxels in the voxel map using the height histogram method with a GMRF model. Finally, we estimated object boundaries in the 2D images using the object vertices in the terrain mesh and evaluated the height of each object cell.</p>
<p>Experiments were carried out using a mobile robot with integrated sensors, including a GPS, gyroscope, video camera, and 3D sensor. We used an HDL-64E Velodyne sensor, giving approximately 1.333 million laser shots per second, to scan 3D points in the unknown environment. The valid data range is approximately 70 m from the robot. Our algorithms were implemented on a laptop PC with a 2.82 GHz Intel<sup>®</sup> Core™2 Quad CPU, a GeForce GTX 275 graphics card, and 4 GB RAM. We drove the robot around an outdoor area of 100 m × 100 m, including buildings and trees. The upper parts of these objects are outside the range of the robot’s sensors, but are captured in the 2D images. We also used an HDL-32E Velodyne sensor in two other environments, as shown in <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15</xref>, to investigate the performance of the proposed algorithms.</p>
<sec>
<label>5.1.</label>
<title>Performance of the Ground Segmentation Method</title>
<p>In this section, we analyze the ground segmentation results, discuss the accuracy of the model for different densities of terrain map, and show that the proposed algorithm is fast enough to be used in a real-time approach. We apply the proposed height histogram method to estimate the height range of the ground surface, and then use the GMRF model to segment the ground data with the results of the height histogram.</p>
<p>A voxel map integrated from only a few frames has a low density and few points, so each voxel has few neighboring voxels. It is thus difficult to estimate the configurations of terrain voxels. In our project, we collected 235,940 laser points per frame, and performed the ground segmentation once per frame. <xref ref-type="fig" rid="f10-sensors-12-17186">Figure 10(a)</xref> shows the ground segmentation result for the voxel map generated from one frame, which registers 88,536 voxels in the terrain model buffer. The ground segmentation took 0.03 s.</p>
<p>We see from <xref ref-type="fig" rid="f10-sensors-12-17186">Figure 10(a)</xref> that the accuracy of the ground segmentation is not high. When we generate a cohesive terrain map integrated from many frames of 3D point clouds, the density is high and the quantity of points is large. <xref ref-type="fig" rid="f10-sensors-12-17186">Figure 10(b)</xref> shows a voxel map made of 1,817,035 voxels, generated from 100 frames. The computation for the ground segmentation result in <xref ref-type="fig" rid="f10-sensors-12-17186">Figure 10(b)</xref> took 0.496 s.</p>
<p><xref ref-type="fig" rid="f11-sensors-12-17186">Figure 11(a)</xref> shows the numbers of sensed points and of voxels processed for ground segmentation in frames 1∼60. <xref ref-type="fig" rid="f11-sensors-12-17186">Figure 11(b)</xref> shows the speed of the ground segmentation processing. At the beginning of the test, only 1.8 × 10<sup>5</sup> voxels are sensed in the first five frames. For this low-density voxel map, the ground segmentation runs at more than 15 fps. As more frames are collected for the voxel map, more neighboring voxels are included in the computation of each voxel’s configuration, which increases the duration of the ground segmentation processing. When the robot moves faster than 0.4 m/s, approximately 1.237 × 10<sup>5</sup>∼1.253 × 10<sup>5</sup> voxels are registered in the voxel map for each frame. The newly registered voxels increase the computation required by the GMRF model, and the ground segmentation slows to 6.25 fps on average. Because the number of newly registered voxels differs from frame to frame, the curve in <xref ref-type="fig" rid="f11-sensors-12-17186">Figure 11(b)</xref> fluctuates. The sensing duration of a frame was 0.177 s. To satisfy the real-time requirement, the ground segmentation duration needs to be less than 0.177 s. From the simulation result of <xref ref-type="fig" rid="f11-sensors-12-17186">Figure 11(b)</xref>, we can see that the ground segmentation takes less than 0.16 s, which satisfies the real-time requirement. By applying multi-thread programming, we collect the voxel map and perform the ground segmentation in parallel, in order to realize real-time terrain modeling.</p>
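<p>The timing figures above can be cross-checked with a few lines of Python. The sketch below only restates the arithmetic: the constants are the values quoted in the text, and the function names are hypothetical. At about 1.333 million shots per second, a frame of 0.177 s yields roughly 235,941 points, consistent with the 235,940 points collected per frame, and a rate of 6.25 fps corresponds to 0.16 s per segmentation, which is below the 0.177 s frame duration.</p>
<preformat preformat-type="code"><![CDATA[
```python
SHOTS_PER_SECOND = 1.333e6  # approximate HDL-64E laser rate quoted in the text
FRAME_DURATION_S = 0.177    # sensing duration of one frame, in seconds


def points_per_frame(shots_per_second=SHOTS_PER_SECOND, frame_s=FRAME_DURATION_S):
    """Number of laser shots collected during one frame."""
    return shots_per_second * frame_s


def frame_time(fps):
    """Processing time per frame, in seconds, for a given frame rate."""
    return 1.0 / fps


def meets_real_time(segmentation_s, frame_s=FRAME_DURATION_S):
    """True if one segmentation pass finishes within one sensing frame."""
    return segmentation_s < frame_s
```
]]></preformat>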
<p>We also implemented the GMRF-based ground segmentation method in 2D images. First, we segmented the ground vertices in the terrain mesh and mapped them onto 2D pixels. Next, we segmented all the ground pixels using the GMRF model with the flood-fill algorithm, as proposed by Song [<xref ref-type="bibr" rid="b32-sensors-12-17186">32</xref>]. The performance of this approach is shown in <xref ref-type="fig" rid="f11-sensors-12-17186">Figure 11(c)</xref>. The duration of ground segmentation in the 2D images with a resolution of 320 × 240 pixels is around 0.53 s. Thus, our proposed ground segmentation method in the voxel map is faster than that in 2D images.</p></sec>
<sec>
<label>5.2.</label>
<title>Performance of the 3D Boundary Estimation Method</title>
<p>Before terrain reconstruction, we calibrate the Velodyne sensor to the camera. To keep the terrain reconstruction real-time, we perform the calibration only once, before the reconstruction starts. In <xref ref-type="fig" rid="f12-sensors-12-17186">Figure 12(a)</xref>, the green pixels are projected from the sensed 3D points of 0.1 frame to the captured 2D image, without calibration. We see that the projected pixels do not match their actual positions in the 2D image. After the calibration of the projection matrix, the projection results are shown as green pixels in <xref ref-type="fig" rid="f12-sensors-12-17186">Figure 12(b)</xref>. We see that the boundary pixels between the building and ground surface match their positions in the 2D image.</p>
<p>In this section, we investigate the performance of the proposed boundary estimation method by comparing the obtained values with the actual object heights (2.90 m on average). <xref ref-type="fig" rid="f13-sensors-12-17186">Figure 13(a)</xref> shows the height map of the incomplete terrain mesh in <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>, where the horizontal coordinates of vertices correspond to the x-axis and y-axis. Previous interpolation algorithms fill the empty region by averaging the surrounding 3D points. In contrast, using our proposed 3D boundary estimation method, we recovered the unsensed parts of foreground objects, which are missing from the incomplete terrain mesh of <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>. <xref ref-type="fig" rid="f13-sensors-12-17186">Figure 13(b)</xref> shows a height map generated after 3D boundary estimation, where the estimated height values are close to the actual value. When the non-ground vertex is far from the camera, slight errors in the 2D boundary detection, even an offset of a single pixel, cause erroneous results in the 3D boundary estimation because of the inverse projection function expressed as <xref ref-type="disp-formula" rid="FD8">Equation (8)</xref>. In the simulation result of <xref ref-type="fig" rid="f13-sensors-12-17186">Figure 13(b)</xref>, the errors occur in the far-field regions, more than 100 m from the camera. The variance of the errors was less than 0.8 m. However, when the robot moves forward, the shape of the recovered parts of unsensed objects is refined, as shown in <xref ref-type="fig" rid="f14-sensors-12-17186">Figure 14(b)</xref>. The variance of the errors is reduced to 0.31 m.</p>
<p>The textured terrain mesh with complete objects provides the remote operator with an intuitive 3D scene with foreground objects on the ground. This gives a better representation of the surrounding terrain than that generated directly from the sensed datasets. Whereas Huber’s work [<xref ref-type="bibr" rid="b8-sensors-12-17186">8</xref>] rendered far-field regions 30 m away using a texture billboard, the proposed terrain reconstruction method provides a complete picture of the terrain model for up to 100 m. The rendering speed is more than 25 fps.</p>
<p>By mapping the texture from the 2D image to the terrain mesh, we reconstructed a complete terrain model captured from the quarter view, as shown in <xref ref-type="fig" rid="f14-sensors-12-17186">Figure 14(a)</xref>. The reconstructed terrain model contains 576,247 vertices and 477,315 triangles. Because we mount the camera on the front of the robot, the vertices in front of the camera are projected to the 2D image, so that the front parts of the terrain model are represented with the projected texture. To denote the vertices that are not projected to the 2D image, we render them as yellow vertices in the terrain visualization result. As we see from <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(c)</xref>, the upper parts of buildings and trees are not sensed. Using the 3D boundary estimation method, we recovered these missing parts from the image of <xref ref-type="fig" rid="f1-sensors-12-17186">Figure 1(a)</xref>. After the robot moved 40 m forward, the terrain reconstruction result was refined, as shown in <xref ref-type="fig" rid="f14-sensors-12-17186">Figure 14(b)</xref>.</p>
<p><xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15</xref> shows further simulation results of the proposed terrain reconstruction method. The images in <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(a,b)</xref> show the 2D images captured in front of the mobile robot. The textured meshes in <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(c,d)</xref> show the terrain reconstruction results, where the upper parts of buildings and trees cannot be sensed. The textured meshes in <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(e,f)</xref> show the complete scene recovery results from the terrain meshes in <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(c,d)</xref>, respectively, using the 3D boundary estimation method. Because there were no sensed vertices of the non-ground objects in the far field, the 3D boundary estimation algorithm was not applied to these objects. Because the camera does not capture the top of the figure object in the image of <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(a)</xref>, the 2D boundary pixels of the figure object are not detected. The reconstructed terrain mesh of <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(e)</xref> covered 73.4% of the boundary in the 2D image of <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(a)</xref>, and the mesh of <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(f)</xref> covered 91.3% of the image of <xref ref-type="fig" rid="f15-sensors-12-17186">Figure 15(b)</xref>. The resulting 3D terrain mesh thus presents realistic foreground objects on the ground as a complete 3D scene.</p></sec></sec>
<sec sec-type="conclusions">
<label>6.</label>
<title>Conclusions</title>
<p>In this paper, we described a ground segmentation technique and a non-ground boundary estimation method for automated surveying and mapping by mobile robots. The methods are shown to be effective in an outdoor environment for a mobile robot with a 3D sensor, video camera, GPS, and gyroscope. The datasets from multiple sensors are integrated in the form of a voxel map and a textured mesh in order to develop a terrain modeling system.</p>
<p>During remote operation, it is not convenient to classify non-ground objects using 3D point clouds. Ground segmentation is required for the classification of ground surface and non-ground objects, but traditional methods for 2D images must be applied to each captured image, leading to a huge computational cost. To overcome this problem, we developed a ground segmentation approach using a height histogram and a GMRF model in the reconstructed terrain voxel map. We showed that our method is faster than segmentation algorithms based on image processing.</p>
<p>To represent non-ground objects outside the measurement range of the robot’s 3D sensors, conventional interpolation algorithms are applied. However, it is difficult for these methods to recover the shape of large objects. To solve this problem, we described a 3D boundary estimation method that estimates the true height value of an object from its boundary in the 2D image. The actual height of objects is estimated using a projection from 2D pixels to 3D vertices. This method enables real-time terrain modeling and provides the remote robot operator with photorealistic visualization support.</p>
<p>We tested our approach using a mobile robot mounted with integrated sensors. The simulation results demonstrate the intuitive visualization performance of the proposed method in a large-scale environment. The speed of terrain modeling and photorealistic visualization satisfies the constraints of real-time operation. Our work is compatible with global information database collection, streetscape representation, augmented reality, and other multimedia applications.</p>
<p>However, in the ground segmentation results, if a voxel is far from the robot, it is difficult to evaluate accurate probabilities for the configurations in the GMRF model. Moreover, in rough terrain such as vegetation areas, the irregular distribution of object shapes causes errors in the ground segmentation results. We apply the 3D boundary estimation algorithm to the textured mesh, which cannot model the empty areas inside objects such as trees. We need to improve and optimize the algorithms to deal with these problems in future work.</p></sec></body>
<back>
<ack>
<p>This work was supported by the Agency for Defense Development, Korea.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-sensors-12-17186"><label>1.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Knauer</surname><given-names>U.</given-names></name><name><surname>Meffert</surname><given-names>B.</given-names></name></person-group><article-title>Fast Computation of Region Homogeneity with Application in a Surveillance Task</article-title><conf-name>Proceedings of ISPRS Commission V Mid-Term Symposium Close Range Image Measurement Techniques</conf-name><conf-loc>Newcastle, UK</conf-loc><conf-date>21–24 June 2010</conf-date><fpage>337</fpage><lpage>342</lpage></citation></ref>
<ref id="b2-sensors-12-17186"><label>2.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Sukumar</surname><given-names>S.R.</given-names></name><name><surname>Yu</surname><given-names>S.J.</given-names></name><name><surname>Page</surname><given-names>D.L.</given-names></name><name><surname>Koschan</surname><given-names>A.F.</given-names></name><name><surname>Abidi</surname><given-names>M.A.</given-names></name></person-group><article-title>Multi-Sensor Integration for Unmanned Terrain Modeling</article-title><conf-name>Proceedings of the SPIE Unmanned Systems Technology VIII</conf-name><conf-loc>Orlando, FL, USA</conf-loc><conf-date>17–20 April 2006</conf-date><fpage>65</fpage><lpage>74</lpage></citation></ref>
<ref id="b3-sensors-12-17186"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saxena</surname><given-names>A.</given-names></name><name><surname>Chung</surname><given-names>S.H.</given-names></name><name><surname>Ng</surname><given-names>A.Y.</given-names></name></person-group><article-title>3-D depth reconstruction from a single still image</article-title><source>Int. J. Comput. Vis.</source><year>2008</year><volume>76</volume><fpage>53</fpage><lpage>69</lpage></citation></ref>
<ref id="b4-sensors-12-17186"><label>4.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>G.H.</given-names></name><name><surname>Huber</surname><given-names>D.</given-names></name><name><surname>Hebert</surname><given-names>M.</given-names></name></person-group><article-title>Segmentation of Salient Regions in Outdoor Scenes Using Imagery and 3D Data</article-title><conf-name>Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV08)</conf-name><conf-loc>Copper Mountain Resort, CO, USA</conf-loc><conf-date>7–9 January 2008</conf-date><fpage>1</fpage><lpage>8</lpage></citation></ref>
<ref id="b5-sensors-12-17186"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saeed</surname><given-names>J.</given-names></name><name><surname>Abdolah</surname><given-names>C.</given-names></name><name><surname>Ehsan</surname><given-names>Z.</given-names></name></person-group><article-title>Determining hit time and location of the ball in humanoid robot league</article-title><source>IJAST</source><year>2011</year><volume>34</volume><fpage>17</fpage><lpage>26</lpage></citation></ref>
<ref id="b6-sensors-12-17186"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>S.</given-names></name><name><surname>Lee</surname><given-names>S.</given-names></name><name><surname>Kim</surname><given-names>S.</given-names></name><name><surname>Lee</surname><given-names>J.</given-names></name></person-group><article-title>Object tracking of mobile robot using moving color and shape information for the aged walking</article-title><source>IJAST</source><year>2009</year><volume>3</volume><fpage>59</fpage><lpage>68</lpage></citation></ref>
<ref id="b7-sensors-12-17186"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen-Or</surname><given-names>D.</given-names></name></person-group><article-title>Exact antialiasing of textured terrain models</article-title><source>Visual Computer</source><year>1997</year><volume>13</volume><fpage>184</fpage><lpage>198</lpage><pub-id pub-id-type="doi">10.1007/s003710050098</pub-id></citation></ref>
<ref id="b8-sensors-12-17186"><label>8.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Huber</surname><given-names>D.</given-names></name><name><surname>Herman</surname><given-names>H.</given-names></name><name><surname>Kelly</surname><given-names>A.</given-names></name><name><surname>Rander</surname><given-names>P.</given-names></name><name><surname>Ziglar</surname><given-names>J.</given-names></name></person-group><article-title>Real-Time Photo-Realistic Visualization of 3D Environments for Enhanced Tele-Operation of Vehicles</article-title><conf-name>Proceedings of the International Conference on 3D Digital Imaging and Modeling (3DIM)</conf-name><conf-loc>Kyoto, Japan</conf-loc><conf-date>3–4 October 2009</conf-date><fpage>1518</fpage><lpage>1525</lpage></citation></ref>
<ref id="b9-sensors-12-17186"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Besag</surname><given-names>J.</given-names></name></person-group><article-title>Spatial interaction and the statistical analysis of lattice systems</article-title><source>J. R. Statist. Soc.</source><year>1974</year><volume>36</volume><fpage>192</fpage><lpage>236</lpage></citation></ref>
<ref id="b10-sensors-12-17186"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Geman</surname><given-names>S.</given-names></name><name><surname>Geman</surname><given-names>D.</given-names></name></person-group><article-title>Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images</article-title><source>J. Appl. Statist.</source><year>1984</year><volume>6</volume><fpage>721</fpage><lpage>741</lpage></citation></ref>
<ref id="b11-sensors-12-17186"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nüchter</surname><given-names>A.</given-names></name><name><surname>Hertzberg</surname><given-names>J.</given-names></name></person-group><article-title>Towards semantic maps for mobile robots</article-title><source>Robot. Auton. Syst.</source><year>2008</year><volume>56</volume><fpage>915</fpage><lpage>926</lpage><pub-id pub-id-type="doi">10.1016/j.robot.2008.08.001</pub-id></citation></ref>
<ref id="b12-sensors-12-17186"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kelly</surname><given-names>A.</given-names></name><name><surname>Chan</surname><given-names>N.</given-names></name><name><surname>Herman</surname><given-names>H.</given-names></name><name><surname>Huber</surname><given-names>D.</given-names></name><name><surname>Meyers</surname><given-names>R.</given-names></name><name><surname>Rander</surname><given-names>P.</given-names></name><name><surname>Warner</surname><given-names>R.</given-names></name><name><surname>Ziglar</surname><given-names>J.</given-names></name><name><surname>Capstick</surname><given-names>E.</given-names></name></person-group><article-title>Real-time photorealistic virtualized reality interface for remote mobile robot control</article-title><source>Int. J. Robot. Res.</source><year>2011</year><volume>30</volume><fpage>384</fpage><lpage>404</lpage><pub-id pub-id-type="doi">10.1177/0278364910383724</pub-id></citation></ref>
<ref id="b13-sensors-12-17186"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname><given-names>S.J.</given-names></name><name><surname>Sukumar</surname><given-names>S.R.</given-names></name><name><surname>Koschan</surname><given-names>A.F.</given-names></name><name><surname>Page</surname><given-names>D.L.</given-names></name><name><surname>Abidi</surname><given-names>M.A.</given-names></name></person-group><article-title>3D reconstruction of road surfaces using an integrated multi-sensory approach</article-title><source>Opt. Lasers Eng.</source><year>2007</year><volume>45</volume><fpage>808</fpage><lpage>818</lpage><pub-id pub-id-type="doi">10.1016/j.optlaseng.2006.12.007</pub-id></citation></ref>
<ref id="b14-sensors-12-17186"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schiewe</surname><given-names>J.</given-names></name></person-group><article-title>Integration of multi-sensor data for landscape modeling using a region-based approach</article-title><source>ISPRS J. Photogramm. Remote Sens.</source><year>2003</year><volume>57</volume><fpage>371</fpage><lpage>379</lpage><pub-id pub-id-type="doi">10.1016/S0924-2716(02)00165-X</pub-id></citation></ref>
<ref id="b15-sensors-12-17186"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sequeira</surname><given-names>V.</given-names></name><name><surname>Ng</surname><given-names>K.</given-names></name><name><surname>Wolfart</surname><given-names>E.</given-names></name><name><surname>Gonçalves</surname><given-names>J.G.M.</given-names></name><name><surname>Hogg</surname><given-names>D.</given-names></name></person-group><article-title>Automated reconstruction of 3D models from real environments</article-title><source>J. Photogramm. Remote Sens.</source><year>1999</year><volume>54</volume><fpage>1</fpage><lpage>22</lpage><pub-id pub-id-type="doi">10.1016/S0924-2716(98)00026-4</pub-id></citation></ref>
<ref id="b16-sensors-12-17186"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rovira-Más</surname><given-names>F.</given-names></name><name><surname>Zhang</surname><given-names>Q.</given-names></name><name><surname>Reid</surname><given-names>J.F.</given-names></name></person-group><article-title>Stereo vision three-dimensional terrain maps for precision agriculture</article-title><source>Comput. Electron. Agric.</source><year>2008</year><volume>60</volume><fpage>133</fpage><lpage>143</lpage><pub-id pub-id-type="doi">10.1016/j.compag.2007.07.007</pub-id></citation></ref>
<ref id="b17-sensors-12-17186"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Noguera</surname><given-names>J.M.</given-names></name><name><surname>Segura</surname><given-names>R.J.</given-names></name><name><surname>Ogáyar</surname><given-names>C.J.</given-names></name><name><surname>Joan-Arinyo</surname><given-names>R.</given-names></name></person-group><article-title>Navigating large terrains using commodity mobile devices</article-title><source>Comput. Geosci.</source><year>2011</year><volume>37</volume><fpage>1218</fpage><lpage>1233</lpage><pub-id pub-id-type="doi">10.1016/j.cageo.2010.08.007</pub-id></citation></ref>
<ref id="b18-sensors-12-17186"><label>18.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Conrad</surname><given-names>D.</given-names></name><name><surname>DeSouza</surname><given-names>G.N.</given-names></name></person-group><article-title>Homography-Based Ground Plane Detection for Mobile Robot Navigation Using a Modified EM Algorithm</article-title><conf-name>Proceedings of IEEE International Conference on Robotics and Automation</conf-name><conf-loc>Anchorage, AK, USA</conf-loc><conf-date>3–8 May 2010</conf-date><fpage>910</fpage><lpage>915</lpage></citation></ref>
<ref id="b19-sensors-12-17186"><label>19.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lowe</surname><given-names>D.G.</given-names></name></person-group><article-title>Object Recognition from Local Scale-Invariant Features</article-title><conf-name>Proceedings of the International Conference on Computer Vision</conf-name><conf-loc>Corfu, Greece</conf-loc><conf-date>20–25 September 1999</conf-date><fpage>1150</fpage><lpage>1157</lpage></citation></ref>
<ref id="b20-sensors-12-17186"><label>20.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Ke</surname><given-names>P.</given-names></name><name><surname>Meng</surname><given-names>C.</given-names></name><name><surname>Li</surname><given-names>J.</given-names></name><name><surname>Liu</surname><given-names>Y.</given-names></name></person-group><article-title>Homography-Based Ground Area Detection for Indoor Mobile Robot Using Binocular Cameras</article-title><conf-name>IEEE Conference on Robotics, Automation and Mechatronics</conf-name><conf-loc>Qingdao, China</conf-loc><conf-date>17–19 September 2011</conf-date><fpage>30</fpage><lpage>34</lpage></citation></ref>
<ref id="b21-sensors-12-17186"><label>21.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Oniga</surname><given-names>F.</given-names></name><name><surname>Nedevschi</surname><given-names>S.</given-names></name><name><surname>Marc</surname><given-names>M.</given-names></name><name><surname>Thanh</surname><given-names>B.</given-names></name></person-group><article-title>Road Surface and Obstacle Detection Based on Elevation Maps from Dense Stereo</article-title><conf-name>Proceedings of IEEE Conference on Intelligent Transportation Systems (ITSC)</conf-name><conf-loc>Washington, DC, USA</conf-loc><conf-date>30 September–3 October 2007</conf-date><fpage>859</fpage><lpage>865</lpage></citation></ref>
<ref id="b22-sensors-12-17186"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mufti</surname><given-names>F.</given-names></name><name><surname>Mahony</surname><given-names>R.</given-names></name><name><surname>Heinzmann</surname><given-names>J.</given-names></name></person-group><article-title>Robust estimation of planar surfaces using spatio-temporal RANSAC for applications in autonomous vehicle navigation</article-title><source>Robot. Auton. Syst.</source><year>2012</year><volume>60</volume><fpage>16</fpage><lpage>28</lpage><pub-id pub-id-type="doi">10.1016/j.robot.2011.08.009</pub-id></citation></ref>
<ref id="b23-sensors-12-17186"><label>23.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lam</surname><given-names>J.</given-names></name><name><surname>Kusevic</surname><given-names>K.</given-names></name><name><surname>Mrstik</surname><given-names>R.</given-names></name><name><surname>Harrap</surname><given-names>P.</given-names></name><name><surname>Greenspan</surname><given-names>M.</given-names></name></person-group><article-title>Urban scene extraction from mobile ground based lidar data</article-title><conf-name>Proceedings of 3DPVT</conf-name><conf-loc>Paris, France</conf-loc><conf-date>17–20 May 2010</conf-date><fpage>1</fpage><lpage>8</lpage></citation></ref>
<ref id="b24-sensors-12-17186"><label>24.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>W.</given-names></name><name><surname>Gao</surname><given-names>W.</given-names></name></person-group><article-title>Semantic Object Segmentation by a Spatio-Temporal MRF Model</article-title><conf-name>Proceedings of the International Conference on Pattern Recognition</conf-name><conf-loc>Cambridge, UK</conf-loc><conf-date>23–26 August 2004</conf-date><fpage>775</fpage><lpage>778</lpage></citation></ref>
<ref id="b25-sensors-12-17186"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szirányi</surname><given-names>T.</given-names></name><name><surname>Zerubia</surname><given-names>J.</given-names></name><name><surname>Czúni</surname><given-names>L.</given-names></name><name><surname>Geldreich</surname><given-names>D.</given-names></name><name><surname>Kato</surname><given-names>Z.</given-names></name></person-group><article-title>Image segmentation using Markov random field model in fully parallel cellular network architectures</article-title><source>Real Time Imag.</source><year>2000</year><volume>6</volume><fpage>195</fpage><lpage>211</lpage><pub-id pub-id-type="doi">10.1006/rtim.1998.0159</pub-id></citation></ref>
<ref id="b26-sensors-12-17186"><label>26.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Anguelov</surname><given-names>D.</given-names></name><name><surname>Taskar</surname><given-names>B.</given-names></name><name><surname>Chatalbashev</surname><given-names>V.</given-names></name><name><surname>Koller</surname><given-names>D.</given-names></name><name><surname>Gupta</surname><given-names>D.</given-names></name><name><surname>Heitz</surname><given-names>G.</given-names></name><name><surname>Ng</surname><given-names>A.</given-names></name></person-group><article-title>Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data</article-title><conf-name>Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition</conf-name><conf-loc>San Diego, CA, USA</conf-loc><conf-date>20–25 June 2005</conf-date><fpage>169</fpage><lpage>176</lpage></citation></ref>
<ref id="b27-sensors-12-17186"><label>27.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Kindermann</surname><given-names>R.</given-names></name><name><surname>Snell</surname><given-names>J.L.</given-names></name></person-group><source>Markov Random Fields and Their Applications</source><publisher-name>American Mathematical Society (AMS)</publisher-name><publisher-loc>Providence, RI, USA</publisher-loc><year>1980</year></citation></ref>
<ref id="b28-sensors-12-17186"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perez</surname><given-names>P.</given-names></name></person-group><article-title>Markov random fields and images</article-title><source>CWI Q.</source><year>1998</year><volume>11</volume><fpage>413</fpage><lpage>437</lpage></citation></ref>
<ref id="b29-sensors-12-17186"><label>29.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Angelo</surname><given-names>A.</given-names></name><name><surname>Dugelay</surname><given-names>J.L.</given-names></name></person-group><article-title>A Markov Random Field Description of Fuzzy Color Segmentation</article-title><conf-name>Proceedings of the 2nd International Conference on Image Processing Theory, Tools and Applications</conf-name><conf-loc>Paris, France</conf-loc><conf-date>7–10 July 2010</conf-date><fpage>270</fpage><lpage>275</lpage></citation></ref>
<ref id="b30-sensors-12-17186"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Häselich</surname><given-names>M.</given-names></name><name><surname>Arends</surname><given-names>M.</given-names></name><name><surname>Wojke</surname><given-names>N.</given-names></name><name><surname>Neuhaus</surname><given-names>F.</given-names></name><name><surname>Paulus</surname><given-names>D.</given-names></name></person-group><article-title>Probabilistic terrain classification in unstructured environments</article-title><source>Robot. Auton. Syst.</source><year>2012</year><comment>accepted</comment></citation></ref>
<ref id="b31-sensors-12-17186"><label>31.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Vernaza</surname><given-names>P.</given-names></name><name><surname>Taskar</surname><given-names>B.</given-names></name><name><surname>Lee</surname><given-names>D.</given-names></name></person-group><article-title>Online, Self-supervised Terrain Classification via Discriminatively trained Submodular Markov Random Fields</article-title><conf-name>Proceedings of IEEE International Conference on Robotics and Automation</conf-name><conf-loc>Pasadena, CA, USA</conf-loc><conf-date>19–23 May 2008</conf-date><fpage>2750</fpage><lpage>2757</lpage></citation></ref>
<ref id="b32-sensors-12-17186"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>W.</given-names></name><name><surname>Cho</surname><given-names>K.</given-names></name><name><surname>Um</surname><given-names>K.</given-names></name><name><surname>Won</surname><given-names>C.S.</given-names></name><name><surname>Sim</surname><given-names>S.</given-names></name></person-group><article-title>Complete scene recovery and terrain classification in textured terrain meshes</article-title><source>Sensors</source><year>2012</year><volume>12</volume><fpage>11221</fpage><lpage>11237</lpage><pub-id pub-id-type="doi">10.3390/s120811221</pub-id><pub-id pub-id-type="pmid">23112653</pub-id></citation></ref>
<ref id="b33-sensors-12-17186"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kraus</surname><given-names>K.</given-names></name><name><surname>Pfeifer</surname><given-names>N.</given-names></name></person-group><article-title>Determination of terrain models in wooded areas with airborne laser scanner data</article-title><source>J. Photogramm. Remote Sens.</source><year>1998</year><volume>53</volume><fpage>193</fpage><lpage>203</lpage><pub-id pub-id-type="doi">10.1016/S0924-2716(98)00009-4</pub-id></citation></ref>
<ref id="b34-sensors-12-17186"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname><given-names>Y.M.</given-names></name><name><surname>Chen</surname><given-names>C.J.</given-names></name></person-group><article-title>3D fractal reconstruction of terrain profile data based on digital elevation model</article-title><source>Chaos Soliton. Fractal.</source><year>2009</year><volume>40</volume><fpage>1741</fpage><lpage>1749</lpage><pub-id pub-id-type="doi">10.1016/j.chaos.2007.09.091</pub-id></citation></ref>
<ref id="b35-sensors-12-17186"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kobler</surname><given-names>A.</given-names></name><name><surname>Pfeifer</surname><given-names>N.</given-names></name><name><surname>Ogrinc</surname><given-names>P.</given-names></name><name><surname>Todorovski</surname><given-names>L.</given-names></name><name><surname>Oštir</surname><given-names>K.</given-names></name><name><surname>Džeroski</surname><given-names>S.</given-names></name></person-group><article-title>Repetitive interpolation: A robust algorithm for DTM generation from aerial laser scanner data in forested terrain</article-title><source>Remote Sens. Environ.</source><year>2007</year><volume>108</volume><fpage>9</fpage><lpage>23</lpage><pub-id pub-id-type="doi">10.1016/j.rse.2006.10.013</pub-id></citation></ref>
<ref id="b36-sensors-12-17186"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hugentobler</surname><given-names>M.</given-names></name><name><surname>Schneider</surname><given-names>B.</given-names></name></person-group><article-title>Breaklines in Coons surfaces over triangles for the use in terrain modeling</article-title><source>Comput. Geosci.</source><year>2005</year><volume>31</volume><fpage>45</fpage><lpage>54</lpage><pub-id pub-id-type="doi">10.1016/j.cageo.2004.09.006</pub-id></citation></ref>
<ref id="b37-sensors-12-17186"><label>37.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Douillard</surname><given-names>B.</given-names></name><name><surname>Brooks</surname><given-names>A.</given-names></name><name><surname>Ramos</surname><given-names>F.</given-names></name></person-group><article-title>A 3D Laser and Vision Based Classifier</article-title><conf-name>Proceedings of the Fifth International Conference on Intelligent Sensors, Sensor Networks and Information Processing</conf-name><conf-loc>Melbourne, Australia</conf-loc><conf-date>7–10 December 2009</conf-date><fpage>295</fpage><lpage>300</lpage></citation></ref>
<ref id="b38-sensors-12-17186"><label>38.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Früh</surname><given-names>C.</given-names></name><name><surname>Zakhor</surname><given-names>A.</given-names></name></person-group><article-title>Data processing algorithms for generating textured 3D building facade meshes from laser scans and camera images</article-title><source>Int. J. Comput. Vis.</source><year>2005</year><volume>61</volume><fpage>159</fpage><lpage>184</lpage><pub-id pub-id-type="doi">10.1023/B:VISI.0000043756.03810.dd</pub-id></citation></ref>
<ref id="b39-sensors-12-17186"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Komal</surname><given-names>V.</given-names></name><name><surname>Singh</surname><given-names>Y.</given-names></name></person-group><article-title>Enhancement of Images Using Histogram Processing Techniques</article-title><source>Int. J. Comp. Tech. Appl.</source><year>2011</year><volume>2</volume><fpage>309</fpage><lpage>313</lpage></citation></ref></ref-list>
<app-group>
<app id="app1">
<label>Appendix</label>
<title>GMRF Model Definition</title>
<p>The configuration of a voxel also depends on its connected neighbors, which is consistent with the property of a GMRF. Hence, we apply a GMRF to segment the ground data from the segmentation result computed by the height histogram method.</p>
<p>We define <italic>S</italic> as a set of voxel sites. Any <italic>s</italic>∈<italic>S</italic> is a voxel location (<italic>x<sub>s</sub></italic>, <italic>y<sub>s</sub></italic>, <italic>z<sub>s</sub></italic>) in the voxel map. The random vector <italic>X</italic> = {<italic>X<sub>s</sub></italic>} on <italic>S</italic> takes a value <italic>O</italic>. In our application, <italic>O</italic> represents a vector consisting of a height observation variable and a configuration variable. The configuration variable takes either a ground value or a non-ground value.</p>
<p>A neighborhood system for <italic>s</italic> contains all sites within a distance <italic>r</italic> (<italic>r</italic> ≥ 0) from <italic>s</italic>, defined as <italic>N</italic> = {<italic>N<sub>s</sub></italic>| <italic>s</italic>∈<italic>S</italic>}, where <italic>N<sub>s</sub></italic> is the neighbor set for site <italic>s</italic> given by:
<disp-formula id="FD9">
<label>(9)</label>
<mml:math id="mm16" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">{</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup>
<mml:mo>∈</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub>
<mml:mo>|</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub>
<mml:mo>|</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub>
<mml:mo>|</mml:mo>
<mml:mo>≤</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>≠</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup>
<mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:math></disp-formula></p>
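<p>As a minimal sketch of the neighborhood system in Equation (9), the following Python snippet collects the neighbor set of one site on an integer voxel grid, using the sum of absolute coordinate differences as the distance. The function name neighbors, the representation of the site set as integer index triples, and the sample sites are assumptions made for illustration only; they are not taken from the described system.</p>
<preformat>
```python
from itertools import product


def neighbors(site, sites, r=1):
    """Return the neighbor set of `site`: every other voxel in `sites`
    whose summed absolute coordinate difference from `site` is at most r."""
    x, y, z = site
    result = set()
    for dx, dy, dz in product(range(-r, r + 1), repeat=3):
        d = abs(dx) + abs(dy) + abs(dz)
        if d == 0 or d > r:
            continue  # skip the site itself and offsets beyond distance r
        candidate = (x + dx, y + dy, z + dz)
        if candidate in sites:
            result.add(candidate)
    return result


# Hypothetical occupied voxels (integer grid indices), for illustration only.
S = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 2)}
print(sorted(neighbors((0, 0, 0), S, r=1)))
print(sorted(neighbors((0, 0, 0), S, r=2)))
```
</preformat>
<p>With r = 1 only the face-adjacent voxels are returned; a larger r also admits diagonal and more distant voxels.</p>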
<p>We define a clique as a set of sites in which each pair of distinct sites are neighbors. In our application, a clique contains the given voxel and its neighboring voxels within a distance of <italic>r</italic> = 1. The clique set <italic>C</italic> is defined as the collection of single-site cliques <italic>C</italic><sub>1</sub> and pair-site cliques <italic>C</italic><sub>2</sub>, as follows:
<disp-formula id="FD10">
<label>(10)</label>
<mml:math id="mm17" display="block">
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mo>∪</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD11">
<label>(11)</label>
<mml:math id="mm18" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">}</mml:mo></mml:mrow>
<mml:mo> </mml:mo>
<mml:mtext>for</mml:mtext>
<mml:mo> </mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>S</mml:mi></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD12">
<label>(12)</label>
<mml:math id="mm19" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow>
<mml:mo>|</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>i</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>S</mml:mi></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
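<p>The neighborhood rule and the clique sets of Equations (10)–(12) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation used in this work: voxel sites are assumed to be integer index triples, the neighborhood is assumed to use the city-block distance with <italic>r</italic> = 1, and the names <monospace>neighbors</monospace>, <monospace>build_cliques</monospace>, and <monospace>voxels</monospace> are hypothetical.</p>
<preformat><![CDATA[
```python
from itertools import product


def neighbors(site, sites, r=1):
    """Return the occupied sites within city-block distance r of `site`, excluding it."""
    x, y, z = site
    found = set()
    for dx, dy, dz in product(range(-r, r + 1), repeat=3):
        dist = abs(dx) + abs(dy) + abs(dz)
        if 0 < dist <= r:
            cand = (x + dx, y + dy, z + dz)
            if cand in sites:
                found.add(cand)
    return found


def build_cliques(sites, r=1):
    """Build the single-site cliques C1 and the unordered pair-site cliques C2."""
    c1 = {frozenset([s]) for s in sites}
    c2 = {frozenset([s, t]) for s in sites for t in neighbors(s, sites, r)}
    return c1, c2


# Hypothetical toy voxel map: an L-shaped group of four occupied voxels.
voxels = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0)}
c1, c2 = build_cliques(voxels)
print(len(c1), len(c2))  # prints "4 3"
```
]]></preformat>
<p>Storing each pair as an unordered set means that the pair {<italic>s</italic>, <italic>s′</italic>} is counted once, even though it is found from both of its sites.</p>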
<p>By the Markov property of the MRF, the configuration at site <italic>s</italic> depends only on the configurations of its neighboring sites. We find the best configuration <italic>f</italic>* for site <italic>s</italic> by maximizing the following conditional probability:
<disp-formula id="FD13">
<label>(13)</label>
<mml:math id="mm20" display="block">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>*</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>arg</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mtext>max</mml:mtext></mml:mrow>
<mml:mi>f</mml:mi></mml:munder>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>t</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>∀</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>S</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>To evaluate the PDF in <xref ref-type="disp-formula" rid="FD13">Equation (13)</xref>, we apply the Gibbs distribution [<xref ref-type="bibr" rid="b10-sensors-12-17186">10</xref>], following the Hammersley–Clifford theorem. The probability of a site’s configuration is calculated as:
<disp-formula id="FD14">
<label>(14)</label>
<mml:math id="mm21" display="block">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mi>Z</mml:mi>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:mrow></mml:msup>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>T</mml:mi></mml:mfrac>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD15">
<label>(15)</label>
<mml:math id="mm22" display="block">
<mml:mrow>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mi>C</mml:mi></mml:mrow></mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>c</mml:mi></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD16">
<label>(16)</label>
<mml:math id="mm23" display="block">
<mml:mrow>
<mml:mi>Z</mml:mi>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mi>f</mml:mi></mml:munder>
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>T</mml:mi></mml:mfrac>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>The potential function <italic>V<sub>c</sub></italic>(<italic>f</italic>) evaluates the effect of neighbor sites in clique <italic>c</italic>∈<italic>C</italic>, and the energy function <italic>U</italic>(<italic>f</italic>) in <xref ref-type="disp-formula" rid="FD15">Equation (15)</xref> is defined as the sum of the clique potentials over the clique set <italic>C</italic>. The probability should satisfy the condition 0 ≤ <italic>p</italic>(<italic>f</italic>) ≤ 1. To normalize <italic>p</italic>(<italic>f</italic>), we divide the exponential term by the partition function <italic>Z</italic>, which is defined in <xref ref-type="disp-formula" rid="FD16">Equation (16)</xref> as the sum of the exponential terms over all possible configurations. The constant <italic>T</italic> is referred to as the temperature factor in Gibbs’ theory, and it controls how sharply the distribution <italic>p</italic>(<italic>f</italic>) is peaked around low-energy configurations.</p>
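<p>As a small numerical illustration of Equations (14)–(16), the following Python sketch normalizes a set of configuration energies into a Gibbs distribution. The energies assigned to the two labels and the function name <monospace>gibbs_distribution</monospace> are hypothetical values chosen for illustration, not quantities taken from the experiments.</p>
<preformat><![CDATA[
```python
import math


def gibbs_distribution(energies, temperature=1.0):
    """Map energies U(f) to probabilities p(f) = exp(-U(f)/T) / Z."""
    weights = {f: math.exp(-u / temperature) for f, u in energies.items()}
    z = sum(weights.values())  # partition function Z of Equation (16)
    return {f: w / z for f, w in weights.items()}


# Hypothetical energies for the two labels of a single voxel.
energies = {"ground": -1.0, "nonground": 1.0}
for t in (0.5, 1.0, 5.0):
    p = gibbs_distribution(energies, t)
    print(t, round(p["ground"], 3), round(p["nonground"], 3))
```
]]></preformat>
<p>In this sketch, a lower temperature concentrates the probability on the lower-energy label, whereas a higher temperature moves the two probabilities toward equal values.</p>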
<p>According to Bayes’ rule, the solution of <xref ref-type="disp-formula" rid="FD13">Equation (13)</xref> is as follows:
<disp-formula id="FD17">
<label>(17)</label>
<mml:math id="mm24" display="block">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>*</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>arg</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mtext>max</mml:mtext></mml:mrow>
<mml:mi>f</mml:mi></mml:munder>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>arg</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mtext>max</mml:mtext></mml:mrow>
<mml:mi>f</mml:mi></mml:munder>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>arg</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow>
<mml:mi>f</mml:mi></mml:munder>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>arg</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext></mml:mrow>
<mml:mi>f</mml:mi></mml:munder>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p>
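<p>The step from the posterior probability to the energy minimization in Equation (17) can be written out as follows. This derivation assumes that the likelihood <italic>p</italic>(<italic>d</italic>|<italic>f</italic>) also takes the Gibbs form with energy <italic>U</italic>(<italic>d</italic>|<italic>f</italic>) and the same temperature <italic>T</italic> &gt; 0, and it uses the fact that <italic>p</italic>(<italic>d</italic>) and the partition functions do not depend on <italic>f</italic>.</p>
<preformat><![CDATA[
```latex
\begin{align*}
f^{*} &= \arg\max_{f}\, p(f \mid d)
       = \arg\max_{f}\, \frac{p(d \mid f)\, p(f)}{p(d)}
       = \arg\max_{f}\, p(d \mid f)\, p(f) \\
      &= \arg\max_{f}\, \exp\!\left(-\tfrac{1}{T}\bigl[\, U(d \mid f) + U(f) \,\bigr]\right)
       = \arg\min_{f}\, \bigl\{\, U(d \mid f) + U(f) \,\bigr\}
\end{align*}
```
]]></preformat>
<p>The last equality holds because the exponential of a negated, positively scaled energy is a strictly decreasing function of that energy.</p>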
<p>The energy function <italic>U</italic>(<italic>d</italic>|<italic>f</italic>) + <italic>U</italic>(<italic>f</italic>) evaluates the effect of neighbor sites in single- and pair-site potential cliques as follows:
<disp-formula id="FD18">
<label>(18)</label>
<mml:math id="mm25" display="block">
<mml:mrow>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>C</mml:mi></mml:mrow>
<mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>s</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi></mml:mrow>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo></mml:mrow>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>s</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:mrow>
<mml:mo stretchy="false">}</mml:mo></mml:mrow>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi></mml:mrow>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi></mml:mrow>
<mml:mi>s</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi></mml:mrow>
<mml:msup>
<mml:mi>s</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
<p>When we apply the GMRF to the voxel map, we define a voxel site <italic>v</italic>∈<italic>S</italic>. The observation <italic>h<sub>v</sub></italic> is the height value of voxel <italic>v</italic>. Evaluation of the clique potential functions <italic>V</italic><sub>1</sub>(<italic>f<sub>v</sub></italic>) and <italic>V</italic><sub>1</sub>(<italic>h<sub>v</sub></italic>|<italic>f<sub>v</sub></italic>) depends on the local configuration and observation of each single-site clique in <italic>C</italic><sub>1</sub>. The clique potential functions <italic>V</italic><sub>2</sub>(<italic>f<sub>v</sub></italic>, <italic>f<sub>v′</sub></italic>) and <italic>V</italic><sub>2</sub>(<italic>h<sub>v</sub></italic>, <italic>h<sub>v′</sub></italic>|<italic>f<sub>v</sub></italic>, <italic>f<sub>v′</sub></italic>) evaluate the pair-site consistency of each pair-site clique in <italic>C</italic><sub>2</sub>. The clique potential functions are formulated as follows:
<disp-formula id="FD19">
<label>(19)</label>
<mml:math id="mm26" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>α</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>G</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">and</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">ground</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">or</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>G</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">and</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">nonground</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>α</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>G</mml:mi>
<mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">and</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">ground</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">or</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mi>G</mml:mi>
<mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">and</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">nonground</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD20">
<label>(20)</label>
<mml:math id="mm27" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>α</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>∈</mml:mo>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>α</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>∉</mml:mo>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD21">
<label>(21)</label>
<mml:math id="mm28" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>β</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>β</mml:mi></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>≠</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="FD22">
<label>(22)</label>
<mml:math id="mm29" display="block">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mrow>
<mml:mo>‖</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo>‖</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>γ</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi></mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mrow>
<mml:mo>‖</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo>‖</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi mathvariant="italic">if</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>v</mml:mi></mml:msub>
<mml:mo>≠</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:msup>
<mml:mi>v</mml:mi>
<mml:mo>′</mml:mo></mml:msup></mml:msub></mml:mrow>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p>
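<p>The clique potentials in Equations (19)–(22) and the resulting per-voxel energy can be sketched in Python as follows. This is a minimal sketch under several assumptions and should not be read as the implementation used in this work: the constants are all set to 1, the height ranges returned by <italic>R</italic> are hypothetical intervals, the non-ground set <italic>G</italic><sub>2</sub> is taken to be the complement of the ground set <italic>G</italic><sub>1</sub>, and the energy is minimized with iterated conditional modes (ICM), a common greedy solver that the text does not name. All function and variable names are illustrative.</p>
<preformat><![CDATA[
```python
import math

GROUND, NONGROUND = "ground", "nonground"
ALPHA, BETA, GAMMA = 1.0, 1.0, 1.0  # assumed values; only positivity is required
OFFSETS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def v1_label(v, f, g1):
    # Equation (19): reward labels that agree with the rough segmentation (G1 = ground).
    agrees = (v in g1) == (f == GROUND)
    return -ALPHA if agrees else ALPHA


def v1_height(h, f, ranges):
    # Equation (20): reward heights that fall inside the range R(f) of label f.
    lo, hi = ranges[f]
    return -ALPHA if lo <= h <= hi else ALPHA


def v2_label(f, f2):
    # Equation (21): reward equal labels on neighboring voxels.
    return -BETA if f == f2 else BETA


def v2_height(h, h2, f, f2):
    # Equation (22): weight label agreement by the height similarity exp(-|h - h'|).
    w = GAMMA * math.exp(-abs(h - h2))
    return -w if f == f2 else w


def local_energy(v, f, labels, heights, g1, ranges):
    # Terms of Equation (18) that involve voxel v when it takes label f.
    x, y, z = v
    e = v1_label(v, f, g1) + v1_height(heights[v], f, ranges)
    for dx, dy, dz in OFFSETS:
        n = (x + dx, y + dy, z + dz)
        if n in labels:
            e += v2_label(f, labels[n])
            e += v2_height(heights[v], heights[n], f, labels[n])
    return e


def icm(heights, g1, ranges, sweeps=5):
    # ICM (assumed solver): start from the rough labels and greedily lower the energy.
    labels = {v: GROUND if v in g1 else NONGROUND for v in heights}
    for _ in range(sweeps):
        changed = False
        for v in sorted(heights):
            best = min((GROUND, NONGROUND),
                       key=lambda f: local_energy(v, f, labels, heights, g1, ranges))
            if best != labels[v]:
                labels[v], changed = best, True
        if not changed:
            break
    return labels


# Hypothetical toy data: a flat row of five voxels, one roughly labeled non-ground.
heights = {(i, 0, 0): 0.0 for i in range(5)}
g1 = {(0, 0, 0), (1, 0, 0), (3, 0, 0), (4, 0, 0)}
ranges = {GROUND: (-0.2, 0.2), NONGROUND: (0.2, 3.0)}
result = icm(heights, g1, ranges)
print(result[(2, 0, 0)])
```
]]></preformat>
<p>In this toy example, the voxel that the rough segmentation labeled as non-ground has the same height as its ground neighbors, so the height and pair-site terms outweigh the single-site label term and the voxel is relabeled as ground.</p>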
<p>The constants <italic>α</italic>, <italic>β</italic>, and <italic>γ</italic> are positive values. The configuration <italic>f<sub>v</sub></italic> depends on whether voxel <italic>v</italic> belongs to the ground dataset or the non-ground dataset. The function <italic>R</italic>(<italic>f<sub>v</sub></italic>) returns the height range of voxels with the configuration <italic>f<sub>v</sub></italic>, and the expression ||<italic>h<sub>v</sub></italic> − <italic>h<sub>v′</sub></italic>|| gives the height difference between observations <italic>h<sub>v</sub></italic> and <italic>h<sub>v′</sub></italic>. We solve <xref ref-type="disp-formula" rid="FD17">Equation (17)</xref> using the potential functions defined in <xref ref-type="disp-formula" rid="FD18">Equations (18)</xref>–<xref ref-type="disp-formula" rid="FD22">(22)</xref>, and label the configuration of each voxel.</p></app></app-group>
<sec sec-type="display-objects">
<title>Figures</title>
<fig id="f1-sensors-12-17186" position="float">
<label>Figure 1.</label>
<caption>
<p>Terrain models. (<bold>a</bold>) Captured 2D image. (<bold>b</bold>) Voxel map. (<bold>c</bold>) Textured mesh.</p></caption>
<graphic xlink:href="sensors-12-17186f1.gif"/></fig>
<fig id="f2-sensors-12-17186" position="float">
<label>Figure 2.</label>
<caption>
<p>Framework for terrain modeling and photorealistic visualization using ground segmentation and 3D boundary estimation.</p></caption>
<graphic xlink:href="sensors-12-17186f2.gif"/></fig>
<fig id="f3-sensors-12-17186" position="float">
<label>Figure 3.</label>
<caption>
<p>Histogram examples of height value distributions. (<bold>a</bold>) A height histogram for ground data. (<bold>b</bold>) A height histogram for non-ground data.</p></caption>
<graphic xlink:href="sensors-12-17186f3.gif"/></fig>
<fig id="f4-sensors-12-17186" position="float">
<label>Figure 4.</label>
<caption>
<p>Height histogram generated from voxels in the voxel map. (<bold>a</bold>) The voxel map. (<bold>b</bold>) The height histogram of the voxel map in (a).</p></caption>
<graphic xlink:href="sensors-12-17186f4.gif"/></fig>
<fig id="f5-sensors-12-17186" position="float">
<label>Figure 5.</label>
<caption>
<p>Ground segmentation in the voxel map. (<bold>a</bold>) Rough ground segmentation of the voxel map based on the height histogram. (<bold>b</bold>) Ground segmentation in the voxel map using the height histogram method with the proposed GMRF model.</p></caption>
<graphic xlink:href="sensors-12-17186f5.gif"/></fig>
<fig id="f6-sensors-12-17186" position="float">
<label>Figure 6.</label>
<caption>
<p>Kernel matrices for boundary detection. (<bold>a</bold>) Horizontal kernel. (<bold>b</bold>) Vertical kernel.</p></caption>
<graphic xlink:href="sensors-12-17186f6.gif"/></fig>
<fig id="f7-sensors-12-17186" position="float">
<label>Figure 7.</label>
<caption>
<p>Boundary detection for foreground objects. (<bold>a</bold>) Boundary detection result using the kernel-based method. (<bold>b</bold>) Boundary detection result after noise removal using the dilation and erosion methods.</p></caption>
<graphic xlink:href="sensors-12-17186f7.gif"/></fig>
<fig id="f8-sensors-12-17186" position="float">
<label>Figure 8.</label>
<caption>
<p>Erosion and dilation masks. (<bold>a</bold>) The erosion mask <italic>B</italic><sub>1</sub>. (<bold>b</bold>) The dilation mask <italic>B</italic><sub>2</sub>.</p></caption>
<graphic xlink:href="sensors-12-17186f8.gif"/></fig>
<fig id="f9-sensors-12-17186" position="float">
<label>Figure 9.</label>
<caption>
<p>Boundary detection for non-ground objects. (<bold>a</bold>) Projection results from vertices in dataset <italic>T</italic><sub>1</sub>. (<bold>b</bold>) Boundary detection results for non-ground objects in the 2D image. (<bold>c</bold>) 3D boundary detection process.</p></caption>
<graphic xlink:href="sensors-12-17186f9.gif"/></fig>
<fig id="f10-sensors-12-17186" position="float">
<label>Figure 10.</label>
<caption>
<p>Segmentation results. (<bold>a</bold>) Segmentation result from 88,536 voxels. (<bold>b</bold>) Segmentation result from 1,817,035 voxels.</p></caption>
<graphic xlink:href="sensors-12-17186f10.gif"/></fig>
<fig id="f11-sensors-12-17186" position="float">
<label>Figure 11.</label>
<caption>
<p>Ground segmentation performance over frames 1–60. (<bold>a</bold>) Number of sensed points and processed voxels. (<bold>b</bold>) Speed of ground segmentation for the voxel map. (<bold>c</bold>) Speed of ground segmentation for the captured 2D images.</p></caption>
<graphic xlink:href="sensors-12-17186f11.gif"/></fig>
<fig id="f12-sensors-12-17186" position="float">
<label>Figure 12.</label>
<caption>
<p>Projection from 3D points of 0.1 frame to a 2D image. (<bold>a</bold>) Projection without calibration processing. (<bold>b</bold>) Projection with calibration processing.</p></caption>
<graphic xlink:href="sensors-12-17186f12.gif"/></fig>
<fig id="f13-sensors-12-17186" position="float">
<label>Figure 13.</label>
<caption>
<p>3D boundary estimation results for non-ground objects. (<bold>a</bold>) The height map generated before 3D boundary estimation. (<bold>b</bold>) The height map generated after 3D boundary estimation.</p></caption>
<graphic xlink:href="sensors-12-17186f13.gif"/></fig>
<fig id="f14-sensors-12-17186" position="float">
<label>Figure 14.</label>
<caption>
<p>Reconstruction result after 3D boundary estimation for non-ground objects. (<bold>a</bold>) Top view of the terrain reconstruction using the 3D boundary estimation algorithm. (<bold>b</bold>) Terrain refining result when the robot moves forward.</p></caption>
<graphic xlink:href="sensors-12-17186f14.gif"/></fig>
<fig id="f15-sensors-12-17186" position="float">
<label>Figure 15.</label>
<caption>
<p>Other simulation results using the proposed terrain reconstruction methods. (<bold>a</bold>,<bold>b</bold>) Captured 2D images. (<bold>c</bold>,<bold>d</bold>) Textured terrain meshes generated directly from the sensed point clouds. (<bold>e</bold>,<bold>f</bold>) Terrain reconstruction using the proposed 3D boundary detection from (c) and (d), respectively.</p></caption>
<graphic xlink:href="sensors-12-17186f15.gif"/></fig></sec></back></article>
