Article

Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint

1
Temasek Laboratories, National University of Singapore, 117411 Singapore
2
College of Computer Science, Sichuan University, Chengdu 610065, China
3
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
4
College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
*
Author to whom correspondence should be addressed.
Sensors 2018, 18(5), 1449; https://doi.org/10.3390/s18051449
Submission received: 16 March 2018 / Revised: 28 April 2018 / Accepted: 5 May 2018 / Published: 6 May 2018
(This article belongs to the Section Remote Sensors)

Abstract

Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.

1. Introduction

In addition to their increasing proliferation and growing importance in numerous applications (including 3D reconstruction and landscape surveying), light detection and ranging (LiDAR) sensors have recently been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Consequently, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Despite the simple and direct manner in which such sensors obtain 3D information, range data often suffer from noise caused by the reflectance properties of object surfaces, or by electrical and mechanical disturbances. Efficient denoising of range data therefore remains a critical problem.
Some range data denoising methods directly originate from their counterparts for images by regarding the depth value of range data as intensity; readers can refer to [1] for detailed discussion. However, such methods typically cause vertex-drifting artifacts, which reduce geometric regularity [2,3]. Techniques designed specifically for range data (e.g., 3D point clouds) can be roughly classified into point-based [4,5] and mesh-based [6,7] methods. However, estimating normal vectors from noisy data for mesh denoising is a chicken-and-egg problem and remains an open topic [8,9]. In [10,11], range data and color images were fused for denoising, and promising results were reported. However, the prerequisites of these methods hold only when expensive sensors are used or extra effort is devoted to registering the range data and images.
Sparse coding (SC) has revolutionized signal processing and led to state-of-the-art performance in a variety of applications, including image or video denoising, inpainting [12,13], restoration [14], and synthesis [15], face recognition [16], and anomaly detection [17]. These successes are mainly due to the fact that signals (images or image patches) have naturally sparse representations with respect to appropriate bases [18]. In recent years, such SC techniques have been applied to range data [19,20] and have obtained competitive results. However, dictionary learning, which plays the central role in SC techniques, is computationally demanding, resulting in its limited applicability in real-time systems [21,22].
Building on the work of [20], in which reflectance information was synergized with depth data to enforce an adaptive sparsity constraint (the complementary depth and reflectance information jointly estimate the informative level of each patch, followed by adaptive dictionary learning that assigns more atoms to represent patches carrying more information), we propose SC algorithms without dictionary learning that realize range data denoising by leveraging the regularity of laser range measurements in man-made environments. Specifically, a second-order differential transformation is applied to extract ridges, which are then reconstructed and enhanced using a pre-learned ridge dictionary under a sparsity constraint. Instead of refining the coefficients and dictionary alternately and iteratively [12,14,19,20,22], our method achieves a one-step closed-form solution, resulting in much improved efficiency. Experiments on both synthesized and real data were conducted to verify the effectiveness of our method, together with performance comparisons against other state-of-the-art methods.

2. Our Method

2.1. Notation and Preliminaries

Matrices and vectors are shown in bold capital and bold lower-case fonts, respectively. Sets are denoted with calligraphic fonts (e.g., $\mathcal{S}$), and their cardinality is denoted as $|\mathcal{S}|$. For a vector $\mathbf{x} \in \mathbb{R}^m$, $x[i]$ is the $i$-th entry, and $\mathbf{x}_{\mathcal{S}}$ is the subvector of $\mathbf{x}$ corresponding to the entries with indices in $\mathcal{S}$. We define the $\ell_2$-norm, $\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^{m} x[i]^2}$; the $\ell_1$-norm, $\|\mathbf{x}\|_1 = \sum_{i=1}^{m} |x[i]|$; and the $\ell_0$-norm, $\|\mathbf{x}\|_0 = |\mathrm{supp}(\mathbf{x})|$ (i.e., the number of nonzero entries). $\mathrm{sign}(\mathbf{x})$ is a sign vector with entries $\mathrm{sign}(x[i]) = 1$ if $x[i] > 0$, $\mathrm{sign}(x[i]) = -1$ if $x[i] < 0$, and zero otherwise. Based on the sign operator, we define the shrinkage operator:
$$T_\alpha(\mathbf{x})[i] = (|x[i]| - \alpha)_+ \cdot \mathrm{sign}(x[i]) = \max(|x[i]| - \alpha, 0) \cdot \mathrm{sign}(x[i]).$$
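The shrinkage (soft-thresholding) operator acts entrywise and vectorizes directly; a minimal NumPy sketch (illustrative, not the paper's code):

```python
import numpy as np

def shrink(x, alpha):
    # T_alpha(x)[i] = max(|x[i]| - alpha, 0) * sign(x[i])
    return np.maximum(np.abs(x) - alpha, 0.0) * np.sign(x)
```

Entries with magnitude below alpha are zeroed, and the remaining entries are pulled toward zero by alpha.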
For a matrix $\mathbf{X} \in \mathbb{R}^{m \times n}$ and an index set $\mathcal{S}$, $\mathbf{X}_{\mathcal{S}}$ is the submatrix containing only the rows of $\mathbf{X}$ with indices in $\mathcal{S}$. $\mathrm{vec}(\mathbf{X})$ denotes the column-wise vectorization of matrix $\mathbf{X}$. Superscript letters $d$ and $r$ refer to components related to the depth and reflectance information, respectively. Subscript letters $v$ and $h$ refer to components related to the vertical and horizontal directions, respectively.
Ridge detection for range measurements. We first take a scan from a 2D laser scanner as an example to introduce our operator of ridge detection and then extend the approach to 3D depth profiles. In Figure 1, we plot the depth profile $\mathbf{z} \in \mathbb{R}^n$ of a typical indoor environment, which is obtained using a standard planar range finder measured at (known) discrete angles. Clearly, the profile is sufficiently regular and contains a few corners. Considering three consecutive points at coordinates $(x_{i-1}, z_{i-1})$, $(x_i, z_i)$, and $(x_{i+1}, z_{i+1})$, there is a corner at $i$ if $\frac{z_{i+1} - z_i}{x_{i+1} - x_i} \neq \frac{z_i - z_{i-1}}{x_i - x_{i-1}}$. In the following, we assume that $x_i - x_{i-1} = 1$ for all $i$ (this assumption comes without loss of generality, since we can define it at arbitrary resolution). Therefore, corner detection for the 2D profile amounts to finding those indices $i$ such that $z_{i-1} - 2z_i + z_{i+1} \neq 0$. To make the notation more compact, we introduce the second-order difference operator:
$$\mathbf{D}^{2nd} = \begin{bmatrix} 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -2 & 1 \end{bmatrix} \in \mathbb{R}^{(n-2) \times n}. \quad (1)$$
Thus, a simple operation $\mathbf{D}^{2nd} \cdot \mathbf{z}$ can detect corners. For a 3D depth profile $\mathbf{Z} \in \mathbb{R}^{r \times c}$, we estimate the ridges of the vertical and horizontal directions, $\mathbf{R}_v$ and $\mathbf{R}_h$, respectively:
$$\mathbf{R}_v = \mathbf{D}_v^{2nd} \cdot \mathbf{Z} \in \mathbb{R}^{(r-2) \times c}, \qquad \mathbf{R}_h = \mathbf{Z} \cdot (\mathbf{D}_h^{2nd})^T \in \mathbb{R}^{r \times (c-2)}, \quad (2)$$
where the matrices $\mathbf{D}_v^{2nd}$ and $\mathbf{D}_h^{2nd}$ are of the same form as $\mathbf{D}^{2nd}$ in Equation (1), but with suitable dimensions. To combine the two equations of (2), we reformulate them as below:
$$\begin{bmatrix} \mathrm{vec}(\mathbf{R}_v) \\ \mathrm{vec}(\mathbf{R}_h) \end{bmatrix} = \begin{bmatrix} \mathbf{I}_c \otimes \mathbf{D}_v^{2nd} \\ \mathbf{D}_h^{2nd} \otimes \mathbf{I}_r \end{bmatrix} \cdot \mathrm{vec}(\mathbf{Z}) = \mathbf{D}_{vh} \cdot \mathrm{vec}(\mathbf{Z}). \quad (3)$$
Here, $\otimes$ is the Kronecker product, and $\mathbf{D}_{vh} \in \mathbb{R}^{2(r \cdot c - r - c) \times (r \cdot c)}$.
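As a concrete check of Equations (1)-(3), the difference operator and the Kronecker-stacked system can be built in a few lines of NumPy (a sketch with illustrative dimensions; column-wise vectorization corresponds to Fortran order):

```python
import numpy as np

def second_order_diff(n):
    # (n-2) x n matrix with rows [1, -2, 1] sliding along the diagonal, Eq. (1)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

# 2D case: D_2nd @ z vanishes on straight segments and is nonzero at corners.
z = np.array([0.0, 1.0, 2.0, 3.0, 3.0, 3.0])   # slope changes at index 3
corner_signal = second_order_diff(len(z)) @ z

# 3D case: verify vec(R_v), vec(R_h) against the stacked operator D_vh, Eq. (3).
r, c = 5, 6
Z = np.arange(r * c, dtype=float).reshape(r, c) ** 1.5
Dv, Dh = second_order_diff(r), second_order_diff(c)
Dvh = np.vstack([np.kron(np.eye(c), Dv), np.kron(Dh, np.eye(r))])
stacked = Dvh @ Z.flatten(order="F")
direct = np.concatenate([(Dv @ Z).flatten(order="F"),
                         (Z @ Dh.T).flatten(order="F")])
```

The identity used here is the standard $\mathrm{vec}(\mathbf{AXB}) = (\mathbf{B}^T \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{X})$, applied once per direction.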

2.2. Adaptive SC and Its $\ell_0$, $\ell_1$ Solutions

For each patch $\mathbf{r}^i$ (in vectorized form) in the depth ridge map, we formulate the following adaptive SC model:
$$\min_{\mathbf{x}_v^i, \mathbf{x}_h^i} \|\mathbf{r}_v^i - \mathbf{D}_L \cdot \mathbf{x}_v^i\|^2 + \|\mathbf{r}_h^i - \mathbf{D}_L \cdot \mathbf{x}_h^i\|^2, \quad \text{s.t. } \forall\, 1 \le i \le N,\ \|\mathbf{x}_v^i\|_0 \le k_v^i,\ \|\mathbf{x}_h^i\|_0 \le k_h^i, \quad (4)$$
where $\mathbf{r}_v^i$ and $\mathbf{r}_h^i$ represent the patches from the vertical and horizontal ridge maps, respectively. $\mathbf{D}_L$ is the fixed pre-learned dictionary shown in Figure 2, which is obtained by applying the basic SC algorithm [19] to 40 pre-selected ridge maps of man-made scenarios (here, we display our pre-learned dictionary with the optimal parameter setting obtained in Section 3.1.1, namely 1024 dictionary atoms with each atom of size 8 × 8). In [23], different dictionaries (including off-the-shelf ones such as the discrete cosine transform (DCT) basis and wavelet bases, as well as dictionaries obtained via learning, either pre-learned or learned from the data itself) were tested for image denoising. Although the dictionary learned from the data itself achieved the best denoising accuracy, it was also the most computationally demanding. In addition to our pre-learned dictionary, Figure 2 also shows the DCT dictionary and the Gabor wavelets dictionary; the parameter tweaking of these dictionaries and their denoising performance are discussed in Section 3.1.1. $\mathbf{x}_v^i$ and $\mathbf{x}_h^i$ are the corresponding sparse coefficient vectors, and $k_v^i$ and $k_h^i$ are the given sparsity controlling parameters. Instead of setting these parameters as constants for all patches, we set them adaptively according to the informative level of each patch (see Section 3.1.1 for details), in a manner similar to that of [20]. Given $k_v^i$ and $k_h^i$, the popular orthogonal-matching-pursuit (OMP) algorithm is applied to solve this $\ell_0$ optimization problem; moreover, the batch-OMP (BOMP) technique is exploited to further improve efficiency. As problem (4) is non-convex and NP-hard, we also formulate its relaxation as below:
$$\min_{\mathbf{x}_v^i, \mathbf{x}_h^i} \|\mathbf{r}_v^i - \mathbf{D}_L \cdot \mathbf{x}_v^i\|^2 + \lambda_v^i \|\mathbf{x}_v^i\|_1 + \|\mathbf{r}_h^i - \mathbf{D}_L \cdot \mathbf{x}_h^i\|^2 + \lambda_h^i \|\mathbf{x}_h^i\|_1. \quad (5)$$
Here, $\lambda_v^i$ and $\lambda_h^i$, which balance the representation fidelity term and the sparsity penalty term, are again estimated adaptively (see Section 3.1.1 for details). By stacking all the patches, we obtain Equation (6):
$$\min_{\mathbf{X}_v, \mathbf{X}_h} \|[\mathbf{r}_v^1, \dots, \mathbf{r}_v^N] - \mathbf{D}_L \cdot \mathbf{X}_v\|^2 + \sum_{i=1}^{N} \lambda_v^i \|\mathbf{x}_v^i\|_1 + \|[\mathbf{r}_h^1, \dots, \mathbf{r}_h^N] - \mathbf{D}_L \cdot \mathbf{X}_h\|^2 + \sum_{i=1}^{N} \lambda_h^i \|\mathbf{x}_h^i\|_1, \quad (6)$$
in which the neighboring patches are extracted with half-overlapping. To solve such an $\ell_1$ optimization problem, we adopt the least-absolute-shrinkage-and-selection-operator (LASSO) approach, which provably finds the global optimizer of this convex problem.
Clearly, Equations (4)–(6) differ from previous methods [12,14,19,20,22] in one key aspect: $\mathbf{D}_L$ is fixed. Thus, $\mathbf{X}_v$ and $\mathbf{X}_h$ can be obtained with a one-step closed-form solution instead of refining the coefficients and dictionary alternately and iteratively, resulting in much improved efficiency. With $\mathbf{X}_v$ and $\mathbf{X}_h$ in hand, we reconstruct each patch $\hat{\mathbf{r}}_v^i$ and $\hat{\mathbf{r}}_h^i$. When all patches have been processed, the average value is used for locations covered by multiple patches, yielding the refined ridge maps $\hat{\mathbf{R}}_v$ and $\hat{\mathbf{R}}_h$. Next, we can recover $\hat{\mathbf{Z}}$ by performing the inverse operation of Equation (3). However, due to the linear dependency of the rows of $\mathbf{D}_{vh}$, the system is rank-deficient. Therefore, we need to incorporate boundary conditions to recover $\hat{\mathbf{Z}}$ from its refined second-order differences, as shown in Equation (7):
$$\begin{bmatrix} \mathbf{z}_N \\ \mathrm{vec}(\mathbf{R}_v) \\ \mathrm{vec}(\mathbf{R}_h) \end{bmatrix} = \begin{bmatrix} \mathbf{I}_N \ \ \mathbf{0}_{N \times (r \cdot c - N)} \\ \mathbf{D}_{vh} \end{bmatrix} \cdot \mathrm{vec}(\mathbf{Z}) = \begin{bmatrix} \mathbf{A} \\ \mathbf{D}_{vh} \end{bmatrix} \cdot \mathrm{vec}(\mathbf{Z}), \quad (7)$$
where the available $N$ boundary points are incorporated; six boundary points are used in our work. A more rigorous theoretical analysis of the minimum number of necessary boundary points is beyond the scope of this work. Performing the inverse operation on (7), we can recover $\hat{\mathbf{Z}}$ as in Equation (8):
$$\mathrm{vec}(\hat{\mathbf{Z}}) = \left( \begin{bmatrix} \mathbf{A} \\ \mathbf{D}_{vh} \end{bmatrix}^T \begin{bmatrix} \mathbf{A} \\ \mathbf{D}_{vh} \end{bmatrix} \right)^{-1} \begin{bmatrix} \mathbf{A} \\ \mathbf{D}_{vh} \end{bmatrix}^T \begin{bmatrix} \mathbf{z}_N \\ \mathrm{vec}(\mathbf{R}_v) \\ \mathrm{vec}(\mathbf{R}_h) \end{bmatrix}. \quad (8)$$
Here, the matrices $\mathbf{A}$ and $\mathbf{D}_{vh}$, their transposes, and the required inverse can all be pre-computed, significantly reducing the overall processing time. We now summarize all of the above operations as Algorithm 1. Our algorithm includes two versions of implementation: Our-BOMP and Our-LASSO.
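Equations (7) and (8) amount to a linear least-squares solve; the following NumPy sketch uses `lstsq` in place of the paper's pre-computed pseudo-inverse, with `z_boundary` holding the first N entries of vec(Z) (an illustrative choice of boundary points):

```python
import numpy as np

def second_order_diff(n):
    # (n-2) x n second-order difference operator of Eq. (1)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def recover_depth(Rv, Rh, z_boundary, r, c):
    # Stack the boundary constraints A with the difference operator D_vh, Eq. (7)
    Dv, Dh = second_order_diff(r), second_order_diff(c)
    Dvh = np.vstack([np.kron(np.eye(c), Dv), np.kron(Dh, np.eye(r))])
    N = len(z_boundary)
    A = np.hstack([np.eye(N), np.zeros((N, r * c - N))])
    M = np.vstack([A, Dvh])
    b = np.concatenate([z_boundary,
                        Rv.flatten(order="F"), Rh.flatten(order="F")])
    # Least-squares solve in place of the closed-form inverse of Eq. (8)
    z_hat, *_ = np.linalg.lstsq(M, b, rcond=None)
    return z_hat.reshape(r, c, order="F")
```

Because the refined ridge maps come from a consistent linear system, the recovered map reproduces both the boundary depths and the second-order differences exactly.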
In Algorithm 2, the convergence criterion is simply $\|\mathbf{x}\|_0 \le k$. In Algorithm 3, the convergence criterion is that the relative update difference of $\mathbf{x}_k$ is less than $10^{-3}$.
Algorithm 1 Our Sparse Coding for Depth Data Denoising
Input: Depth map $\mathbf{Z}$, dictionary $\mathbf{D}_L$, parameters $k_v^i$, $k_h^i$, $\lambda_v^i$, $\lambda_h^i$;
Output: Restored $\hat{\mathbf{Z}}$ of Equation (8);
1: Estimate the ridge maps $\mathbf{R}_v$ and $\mathbf{R}_h$ as in Equation (2);
2: Extract and vectorize ridge map patches $\mathbf{r}_v^i$, $\mathbf{r}_h^i$, $1 \le i \le N$;
3: // Lines 5 and 7 apply different algorithms to estimate $\mathbf{x}_v^i$, $\mathbf{x}_h^i$.
4: // Line 5 belongs to the Our-BOMP version.
5: Apply the Batch-OMP algorithm to solve Equation (4);
6: // Line 7 belongs to the Our-LASSO version.
7: Apply the LASSO algorithm to solve Equation (5);
8: Obtain the refined ridge maps $\hat{\mathbf{R}}_v$ and $\hat{\mathbf{R}}_h$ via reconstruction;
9: Recover $\hat{\mathbf{Z}}$ via Equation (8).
Algorithm 2 Batch-orthogonal-matching-pursuit [24] to solve: $\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{D} \cdot \mathbf{x}\|^2$ s.t. $\|\mathbf{x}\|_0 \le k$
Input: $\mathbf{y}$, $\mathbf{D}$, $k$;
Output: $\mathbf{x}$;
1: Initialization: $\mathbf{x} = \mathbf{0}$, $\mathbf{r} = \mathbf{y}$, $\mathcal{S} = \{\}$;
2: while stopping criterion not met do
3:   $k = \arg\max_k |\mathbf{d}_k^T \cdot \mathbf{r}|$;
4:   $\mathcal{L} = \mathcal{S}$;
5:   $\mathcal{S} = \{k\} \cup \mathcal{L}$;
6:   $\mathbf{G}_{\mathcal{S}} = \mathbf{D}_{\mathcal{S}}^T \mathbf{D}_{\mathcal{S}}$, $\mathbf{G}_{\mathcal{L}} = \mathbf{D}_{\mathcal{L}}^T \mathbf{D}_{\mathcal{L}}$;
7:   // Line 8 obtains $\mathbf{G}_{\mathcal{S}}^{-1}$ using a progressive Cholesky update.
8:   $\mathbf{x}_{\mathcal{S}} = \mathbf{G}_{\mathcal{S}}^{-1} \mathbf{D}_{\mathcal{S}}^T \mathbf{y}$; the key update used to estimate $\mathbf{G}_{\mathcal{S}}^{-1}$ is
$$\mathbf{G}_{\mathcal{S}}^{-1} = \begin{bmatrix} \mathbf{G}_{\mathcal{L}}^{-1} + \frac{1}{a}\,\mathbf{G}_{\mathcal{L}}^{-1}\mathbf{D}_{\mathcal{L}}^T\mathbf{d}_k\mathbf{d}_k^T\mathbf{D}_{\mathcal{L}}\mathbf{G}_{\mathcal{L}}^{-1} & -\frac{1}{a}\,\mathbf{G}_{\mathcal{L}}^{-1}\mathbf{D}_{\mathcal{L}}^T\mathbf{d}_k \\ -\frac{1}{a}\,\mathbf{d}_k^T\mathbf{D}_{\mathcal{L}}\mathbf{G}_{\mathcal{L}}^{-1} & \frac{1}{a} \end{bmatrix}, \qquad a = \mathbf{d}_k^T\mathbf{d}_k - \mathbf{d}_k^T\mathbf{D}_{\mathcal{L}}\mathbf{G}_{\mathcal{L}}^{-1}\mathbf{D}_{\mathcal{L}}^T\mathbf{d}_k;$$
9:   $\mathbf{r} = \mathbf{y} - \mathbf{D}_{\mathcal{S}} \mathbf{x}_{\mathcal{S}}$;
10: end while
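For readers who want a reference point, a plain (non-batch) OMP fits in a dozen lines; this sketch omits the progressive Cholesky update that gives Batch-OMP its speed and simply refits by least squares at each step:

```python
import numpy as np

def omp(y, D, k):
    # Greedy l0 pursuit: pick the atom most correlated with the residual,
    # then least-squares refit all coefficients on the current support.
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x
```

With an orthonormal dictionary, OMP recovers a k-sparse signal exactly, which makes a quick sanity check.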
Algorithm 3 Fast iterative shrinkage-thresholding algorithm (FISTA) [25] to solve the LASSO problem: $\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{D} \cdot \mathbf{x}\|^2 + \lambda \|\mathbf{x}\|_1$
Input: $\mathbf{y}$, $\mathbf{D}$, $\lambda$;
Output: $\mathbf{x}$;
1: Initialization: $\mathbf{x}_0 = \mathbf{0}$, $\mathbf{z}_1 = \mathbf{x}_0$, $L = \max(\mathrm{eig}(\mathbf{D}^T \mathbf{D}))$, $t_0 = 1$, $k = 0$;
2: while not converged do
3:   $k = k + 1$;
4:   $\mathrm{grad}(\mathbf{z}_k) = \mathbf{D}^T \mathbf{D}\, \mathbf{z}_k - \mathbf{D}^T \mathbf{y}$;
5:   $\mathbf{x}_k = T_{\lambda/L}\big(\mathbf{z}_k - \tfrac{1}{L}\, \mathrm{grad}(\mathbf{z}_k)\big)$;
6:   // See Section 2.1 for the definition of $T_\alpha(\mathbf{x})$.
7:   $t_k = \big(1 + \sqrt{1 + 4\, t_{k-1}^2}\big)/2$;
8:   $\mathbf{z}_{k+1} = \mathbf{x}_k + \frac{t_{k-1} - 1}{t_k}\, (\mathbf{x}_k - \mathbf{x}_{k-1})$;
9: end while
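Algorithm 3 translates almost line-for-line into NumPy. This sketch follows the gradient convention of Line 4 (i.e., the gradient of half the squared fidelity term), with the step size 1/L of Line 1:

```python
import numpy as np

def shrink(x, alpha):
    # Shrinkage operator T_alpha from Section 2.1
    return np.maximum(np.abs(x) - alpha, 0.0) * np.sign(x)

def fista(y, D, lam, iters=200):
    # Step size 1/L with L the largest eigenvalue of D^T D (Line 1)
    L = np.linalg.eigvalsh(D.T @ D).max()
    x = np.zeros(D.shape[1])
    z = x.copy()
    t = 1.0
    for _ in range(iters):
        grad = D.T @ (D @ z - y)                          # Line 4
        x_new = shrink(z - grad / L, lam / L)             # Line 5
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # Line 7
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)     # Line 8
        x, t = x_new, t_new
    return x
```

With D equal to the identity, the LASSO solution is simply the shrinkage of y itself, which provides an easy correctness check.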

3. Experiments and Analysis

We performed experiments on both synthesized data and real data recorded with a laser scanner. In the synthesized-data experiments, where the ground truth was available, the settings of important parameters were investigated first, followed by the denoising evaluation. In the real-data experiments, the results are also shown for visual assessment, in addition to the performance comparison. Similar to [14,20], the peak signal-to-noise ratio (PSNR) is employed to evaluate performance. All experiments were run on an ASUS Pro notebook with a 2.4 GHz Intel quad-core i7-4500U processor and 12 GB RAM, executing MATLAB code in parallel.
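PSNR is the evaluation metric throughout; one common form is sketched below (the peak convention here is our assumption, as the paper does not spell out its choice):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    # 10 * log10(peak^2 / MSE); higher is better
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

For range data, `peak` would be set to the dynamic range of the depth values rather than the 8-bit default used for images.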

3.1. Experiments on Synthesized Data

Similar to [20,26], we designed a scene of a square box having sides of one meter in a room, as shown in Figure 3, where all information is known as the ground truth.

3.1.1. Important Parameters

To achieve plausible performance in terms of both accuracy and efficiency, the following parameters play the central role: the sparsity controlling parameters $k_v^i$, $k_h^i$, $\lambda_v^i$, $\lambda_h^i$; the dictionary dimension $d$ (which indicates a patch size of $d \times d$ pixels); and its atom number $n_D$ ($r = n_D / d^2$ is the redundancy of the dictionary).
Applying the same parameter tweaking strategy as that in [20], we recommend the sparsity controlling parameters shown in Table 1. For each patch $\mathbf{P}_j$ ($j = v$ or $h$, from the vertical or horizontal ridge maps, respectively), the parameters $\alpha_v^i$, $\beta_v^i$ (or $\alpha_h^i$, $\beta_h^i$) are related to its informative-level measurement ($\mathrm{InM}$), which is estimated as below:
$$\mathrm{Den}(\mathbf{P}_v^d) = \sum_{i=1}^{d} \sum_{j=1}^{d} \mathbf{P}_v^d[i,j], \qquad \mathrm{Ren}(\mathbf{P}^r) = \sum_{i=1}^{d} \sum_{j=1}^{d} \mathrm{GradMag}(\mathbf{P}^r)[i,j],$$
$$\mathrm{GradMag}(\mathbf{P}^r)[i,j] = \sqrt{\left(\frac{\partial \mathbf{P}^r[i,j]}{\partial x}\right)^2 + \left(\frac{\partial \mathbf{P}^r[i,j]}{\partial y}\right)^2},$$
$$\mathrm{InM}(\mathbf{P}_v^d) = \mathrm{Den}(\mathbf{P}_v^d) \cdot \exp\big(\sigma(\mathrm{Ren}(\mathbf{P}^r)) - 0.5\big).$$
Here, $\mathrm{GradMag}(\mathbf{P}^r)$ is the gradient magnitude map of patch $\mathbf{P}^r$, and $\sigma(\cdot) = \frac{1}{1 + \exp(-\cdot)}$ is the sigmoid function. For inputs in $(-\infty, +\infty)$, the output of $\sigma(\cdot)$ lies in $(0, 1)$. In our work, as the value of $\mathrm{Ren}$ cannot be negative, $\sigma(\mathrm{Ren})$ lies in $[0.5, 1)$, and the output of $\exp(\sigma(\mathrm{Ren}(\mathbf{P}^r)) - 0.5)$ lies in $[1, e^{0.5})$. With $\mathrm{InM}(\mathbf{P}_v^d)$ and $\mathrm{InM}(\mathbf{P}_h^d)$ for all patches in hand, we determine the sparsity controlling parameters via computing $\alpha_v^i$, $\beta_v^i$, $\alpha_h^i$, and $\beta_h^i$ as below:
$$\mathrm{InM}_{min} = \min_{i=1,\dots,N}\{\mathrm{InM}(\mathbf{P}_v^{i,d}),\ \mathrm{InM}(\mathbf{P}_h^{i,d})\}, \qquad \mathrm{InM}_{max} = \max_{i=1,\dots,N}\{\mathrm{InM}(\mathbf{P}_v^{i,d}),\ \mathrm{InM}(\mathbf{P}_h^{i,d})\},$$
$$\alpha_j^i = \frac{\mathrm{InM}_{max} - \mathrm{InM}(\mathbf{P}_j^{i,d})}{\mathrm{InM}_{max} - \mathrm{InM}_{min}}, \qquad \beta_j^i = \frac{\mathrm{InM}(\mathbf{P}_j^{i,d}) - \mathrm{InM}_{min}}{\mathrm{InM}_{max} - \mathrm{InM}_{min}}, \qquad j = v \text{ or } h.$$
In Table 1, the positions of $\alpha_j^i$ and $\beta_j^i$ are exchanged between Our-BOMP and Our-LASSO because $k_j^i$ and $\lambda_j^i$ enforce the sparsity constraint in different manners when adaptively determining the number of dictionary atoms permitted for patches with different informative levels. For example, to assign more atoms to represent a patch with more information, $k_j^i$ should increase to raise the sparsity budget, whereas $\lambda_j^i$ should decrease so that the sparsity penalty term is weighted less heavily.
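The adaptive rule of Table 1 is thus a linear interpolation between the extreme values, driven by each patch's normalized informative level. A sketch with the recommended constants (the informative levels `inm` are illustrative placeholders, not values from the paper):

```python
import numpy as np

def adaptive_params(inm, k_min=1.0, k_max=6.0, lam_min=0.2, lam_max=2.2):
    # alpha -> 1 for the least informative patch, beta -> 1 for the most
    alpha = (inm.max() - inm) / (inm.max() - inm.min())
    beta = (inm - inm.min()) / (inm.max() - inm.min())
    k = alpha * k_min + beta * k_max        # more information => larger k
    lam = beta * lam_min + alpha * lam_max  # more information => smaller lambda
    return k, lam

inm = np.array([0.0, 5.0, 10.0])  # illustrative informative levels
k, lam = adaptive_params(inm)
```

Note how the most informative patch receives the largest sparsity budget `k_max` and the smallest penalty weight `lam_min`, matching the exchange of alpha and beta between the two columns of Table 1.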
In addition to $k_j^i$ and $\lambda_j^i$, we tested different combinations of patch size and redundancy to find the most reasonable settings of $d$ and $r$. Specifically, $d$ can be 4, 8, 16, or 24, and $r$ can be 4, 8, 16, or 24; thus, a total of 16 combinations were tested. As shown in Figure 4a, dictionaries with redundancy 16 consistently achieved better denoising results than dictionaries with redundancy 4, 8, and 24 at the same patch size. Figure 4b shows the computation time. Clearly, the time increased dramatically with increasing patch size and redundancy. Taking both the denoising performance and computation time into account, we conclude that the dictionary setting of $d = 8$ and $r = 16$ (namely 1024 atoms) yields decent outputs while allowing fast computation. These parameters are used throughout this paper, except where indicated. In Table 2, we further report the denoising results using different fixed dictionaries: our pre-learned dictionary, the DCT dictionary, and the Gabor wavelets dictionary. Clearly, the results using our pre-learned dictionary were better than those using off-the-shelf ones.

3.1.2. Denoising Results

The denoising results of our work on the synthesized data (see Figure 3d, in which 100% dense Gaussian noise with a standard deviation of 0.02 m for the range data and of 5 gray levels for the reflectance data was added) are reported in Table 3, together with a comparison against available competitive methods, including the Trilateral filter [26], Basic SC [19], and Adaptive SC [20]. Moreover, in addition to the Gaussian noise, sparse outliers (5% of the total positions, randomly distributed) were added to the scene, as shown in Figure 3e, and those denoising results are also reported in Table 3. In addition to the PSNR criterion, the root mean square (RMS, in millimeters) error between the recovered range data and the ground truth is included, since it gives readers a more direct indication of how well the contaminated range information was restored by these methods. The computational time of each method is also reported. As shown in Table 3, our methods outperformed the Trilateral filter and Basic SC, and obtained results comparable to Adaptive SC in terms of denoising accuracy (the best results were obtained by Adaptive SC with PSNR = 33.83 and RMS = 1.98), while achieving significantly higher efficiency than Adaptive SC. Moreover, Our-BOMP worked slightly more efficiently than Our-LASSO, achieving a processing speed of approximately 13 fps for range images with a resolution of 800 × 800.

3.2. Denoising Data from Laser Scanner

This set of experiments used the Brown range image database [27], captured with a Riegl LMS-Z210 laser scanner whose field of view (FoV) is 80° vertically and 259° horizontally. We focused only on the 15 indoor scenes (e.g., classrooms and theatres). To deal with (and benefit from) such a large horizontal FoV, similar to the multiple local projection method in [19], we partitioned the scene into several parts with an 80° FoV both vertically and horizontally. Note that in the Brown range image database, the depth of each measuring point is saved sequentially, left to right and top to bottom; thus, the image location only indicates the order of saving, and it was necessary to perform local projection to obtain reasonable images. The consecutive parts were half-overlapping along the horizontal direction, and each part was then projected to an image plane oriented toward the center. Finally, the average of all the restored parts was computed to restore the whole scene. In the following, the denoising results on such real data are reported and discussed. Similar to Section 3.1, Gaussian noise was added to the depth and reflectance data, and the results on three representative scenes are shown in Figure 5.
As can be observed in Figure 5, the results of the Trilateral filter were good in the planar regions, but contained a great deal of noise in the edge regions. Our method, Basic SC, and Adaptive SC are all SC-based and obtained consistent results in all regions. Moreover, the results of our method in the third column were slightly worse than those of columns 1 and 2, because the curved structure of the scene in the third column posed a greater challenge to our sparse-ridges assumption. In Table 4, the average denoising results on all 15 indoor scenes of the dataset are reported. Similar to the results on the synthesized data, our method outperformed the Trilateral filter and Basic SC consistently, and obtained results comparable to those of Adaptive SC in terms of denoising accuracy, while achieving significantly higher efficiency than Adaptive SC. Compared with the results in Table 3, the results of Table 4 deteriorated, as the structure of the real scenes is more complex than that of the synthetic scene. Similar to the previous section, in addition to the Gaussian noise, sparse outliers were added to the scene, and those denoising results are also reported in Table 4.

4. Conclusions

In this study, based on our previous work [20], we proposed SC-based algorithms in which a pre-learned ridge dictionary is applied to realize range data denoising by leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated SC methods with much higher efficiency, achieving approximately 13 fps (for a resolution of 800 × 800) on a lightweight laptop computer. Therefore, our method can be applied in real-time systems operating in man-made environments. Furthermore, we plan to implement our method with GPU acceleration to further increase its efficiency and to save precious payload for unmanned systems, especially UAVs.

Author Contributions

Zhi Gao designed the algorithm and wrote the paper; Mingjie Lao designed the codes and performed the experiments; Yongsheng Sang also helped in coding and experimental tuning; Fei Wen focused on the search of related work and experimental comparison; Bharath Ramesh provided significant comments and suggestions; Ruifang Zhai helped with revision.

Funding

This research was supported in part by the Science Foundation of Ministry of Education (MOE) of China and China Mobile Communications Corporation under Grant MCM20160307.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Buades, A.; Coll, B.; Morel, J.M. A review of image denoising algorithms, with a new one. Multiscale Model. Simul. 2005, 4, 490–530. [Google Scholar] [CrossRef]
  2. Lysaker, M.; Lundervold, A.; Tai, X.C. Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time. IEEE Trans. Image Proc. 2003, 12, 1579–1590. [Google Scholar] [CrossRef] [PubMed]
  3. Wu, Y.; Tracey, B.; Natarajan, P.; Noonan, J.P. Probabilistic non-local means. IEEE Signal Proc. Lett. 2013, 20, 763–766. [Google Scholar] [CrossRef]
  4. Smigiel, E.; Alby, E.; Grussenmeyer, P. TLS data denoising by range image processing. Photogramm. Rec. 2011, 26, 171–189. [Google Scholar] [CrossRef]
  5. Sun, B.; Fang, H.; Huang, D. Lidar signal denoising using least-squares support vector machine. IEEE Signal Proc. Lett. 2005, 12, 101–104. [Google Scholar]
  6. Fleishman, S.; Drori, I.; Cohen-Or, D. Bilateral mesh denoising. ACM Trans. Graph. 2003, 22, 950–953. [Google Scholar] [CrossRef]
  7. Marco, M.; Luca, P.; Augusto, S.; Stefano, T. Fast PDE approach to surface reconstruction from large cloud of points. Comput. Vis. Image Underst. 2008, 112, 274–285. [Google Scholar] [CrossRef]
  8. Belyaev, Y.; Seidel, H. Mesh smoothing by adaptive and anisotropic Gaussian filter applied to mesh normals. In Vision, Modeling, and Visualization; IOS Press: Amsterdam, The Netherlands, 2002; pp. 203–210. [Google Scholar]
  9. Tasdizen, T.; Whitaker, R.; Burchard, P.; Osher, S. Geometric surface smoothing via anisotropic diffusion of normals. In Proceedings of the Conference on Visualization’02, Boston, MA, USA, 27 October—1 November 2002; pp. 125–132. [Google Scholar]
  10. Crabb, R.; Tracey, C.; Puranik, A.; Davis, J. Real-time foreground segmentation via range and color imaging. In Proceedings of the Computer Vision and Pattern Recognition Workshops, Anchorage, AK, USA, 23–28 June 2008; pp. 1–5. [Google Scholar]
  11. Huhle, B.; Schairer, T.; Jenke, P.; Straber, W. Fusion of range and color images for denoising and resolution enhancement with a non–local filter. Comput. Vis. Image Underst. 2010, 114, 1336–1345. [Google Scholar] [CrossRef]
  12. Elad, M.; Aharon, M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Proc. 2006, 15, 3736–3745. [Google Scholar] [CrossRef]
  13. Ji, H.; Huang, S.; Shen, Z.; Xu, Y. Robust video restoration by joint sparse and low rank matrix approximation. SIAM J. Imaging Sci. 2011, 4, 1122–1142. [Google Scholar] [CrossRef]
  14. Mairal, J.; Elad, M.; Sapiro, G. Sparse representation for color image restoration. IEEE Trans. Image Proc. 2008, 17, 53–69. [Google Scholar] [CrossRef]
  15. Peyré, G. Sparse modeling of textures. J. Math. Imaging Vis. 2009, 34, 17–31. [Google Scholar] [CrossRef]
  16. Wright, J.; Yang, A.Y.; Ganesh, A.; Sastry, S.S.; Ma, Y. Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 210–227. [Google Scholar] [CrossRef] [PubMed]
  17. Adler, A.; Elad, M.; Hel-Or, Y.; Rivlin, E. Sparse coding with anomaly detection. J. Signal Proc. Syst. 2015, 79, 179–188. [Google Scholar] [CrossRef]
  18. Wright, J.; Ma, Y.; Mairal, J.; Sapiro, G.; Huang, T.S.; Yan, S. Sparse representation for computer vision and pattern recognition. Proc. IEEE 2010, 98, 1031–1044. [Google Scholar] [CrossRef]
  19. Mahmoudi, M.; Sapiro, G. Sparse representations for range data restoration. IEEE Trans. Image Proc. 2012, 21, 2909–2915. [Google Scholar] [CrossRef] [PubMed]
  20. Gao, Z.; Li, Q.; Zhai, R.; Lin, F. Laser Range Data Denoising via Adaptive and Robust Dictionary Learning. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1750–1754. [Google Scholar]
  21. Elad, M.; Figueiredo, M.A.; Ma, Y. On the role of sparse and redundant representations in image processing. Proc. IEEE 2010, 98, 972–982. [Google Scholar] [CrossRef]
  22. Gao, Z.; Li, Q.; Zhai, R.; Shan, M.; Lin, F. Adaptive and Robust Sparse Coding for Laser Range Data Denoising and Inpainting. IEEE Trans. Circuits Syst. Video Technol. 2016, 26, 2165–2175. [Google Scholar] [CrossRef]
  23. Elad, M. Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing; Springer: New York, NY, USA, 2010. [Google Scholar]
  24. Rubinstein, R.; Zibulevsky, M.; Elad, M. Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit. CS Technion 2008, 40, 1–15. [Google Scholar]
  25. Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef]
  26. Oishi, S.; Kurazume, R.; Iwashita, Y.; Hasegawa, T. Denoising of range images using a trilateral filter and belief propagation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, USA, 25–30 September 2011; pp. 2020–2027. [Google Scholar]
  27. Lee, A.B.; Huang, J. Brown Range Image Database. 2000. Available online: www.dam.brown.edu/ptg/brid/index.html (accessed on 26 April 2017).
Figure 1. Illustration of the depth profile of a 2D laser scan.
Figure 2. Our pre-learned dictionary of ridge maps (a), discrete cosine transform (DCT) dictionary (or say basis) (b), and Gabor wavelet dictionary (or say basis) (c).
Figure 3. Synthesized data (of size 800 × 800 pixels) designed for experiments. (a) Reflectance image. (b) Depth map. (c) Result of informative-level estimation. (d) Gaussian noise-corrupted depth data. (e) Gaussian noise and outliers-corrupted depth data.
Figure 4. (a) PSNR and (b) computation time with dictionaries of different combinations of patch size and redundancy.
Figure 4. (a) PSNR and (b) computation time with dictionaries of different combinations of patch size and redundancy.
Sensors 18 01449 g004
Figure 5. Results of denoising experiments on real data (see online version for details). The original and noisy corrupted range images are shown in the first and second rows, respectively. The third row depicts our intermediate results of informative level estimation. Rows 4 to 8 depict the results of Our-BOMP, Our-LASSO, Adaptive SC, Basic SC, and Trilateral filter, respectively.
Table 1. Setting of sparsity controlling parameters. BOMP: batch orthogonal-matching-pursuit; LASSO: least-absolute-shrinkage-and-selection-operator.
Parameters         Our-BOMP (k_v^i, k_h^i)                    Our-LASSO (lambda_v^i, lambda_h^i)
Max value          k_max = 6                                  lambda_max = 2.2
Min value          k_min = 1                                  lambda_min = 0.2
Adaptive value *   k_v^i = alpha_v^i k_min + beta_v^i k_max   lambda_v^i = beta_v^i lambda_min + alpha_v^i lambda_max
                   k_h^i = alpha_h^i k_min + beta_h^i k_max   lambda_h^i = beta_h^i lambda_min + alpha_h^i lambda_max
* alpha_v^i, beta_v^i, alpha_h^i, beta_h^i are related to the informative-level estimation of each patch.
Table 2. Denoising results (peak signal-to-noise ratio (PSNR) value) using different dictionaries (unit: dB).
Dictionary                    r = 10, d = 8   r = 16, d = 8   r = 16, d = 16
DCT dictionary                32.26           32.37           32.17
Gabor wavelet dictionary      32.46           32.58           32.92
Our pre-learned dictionary    32.86           33.75           33.81
Table 3. Denoising results on the synthesized data. RMS: root mean square; SC: sparse coding.
Method              Only Gaussian Noise                    Gaussian Noise and Outliers
                    PSNR (dB)   RMS (mm)   Time * (s)      PSNR (dB)   RMS (mm)   Time * (s)
Our-BOMP            33.72       2.06       0.08            33.28       2.14       0.09
Our-LASSO           33.78       2.04       0.12            33.35       2.12       0.15
Adaptive SC         33.83       1.98       227.38          33.71       2.04       239.62
Basic SC            33.17       2.17       153.62          33.02       2.56       172.38
Trilateral filter   32.37       2.49       0.04            32.12       2.67       0.06
* Time indicates the computation time for each method.
Table 4. Denoising results on real data.
Method              Only Gaussian Noise                    Gaussian Noise and Outliers
                    PSNR (dB)   RMS (mm)   Time * (s)      PSNR (dB)   RMS (mm)   Time * (s)
Our-BOMP            35.37       2.31       0.86            35.17       2.43       0.89
Our-LASSO           35.43       2.27       1.15            35.21       2.37       1.28
Adaptive SC         35.72       2.03       918.71          35.68       2.08       972.33
Basic SC            34.28       2.83       538.37          34.16       2.97       572.81
Trilateral filter   32.13       3.27       0.16            32.02       3.75       0.18
* Time indicates the computation time for each method.
