Low-Rank Matrix Recovery from Noisy via an MDL Framework-based Atomic Norm

The recovery of the underlying low-rank structure of clean data corrupted with sparse noise/outliers is attracting increasing interest. However, in many low-level vision problems, the exact target rank of the underlying structure and the particular locations and values of the sparse outliers are not known. Thus, conventional methods cannot separate the low-rank and sparse components completely, especially in the presence of gross outliers or deficient observations. Therefore, in this study, we employ the Minimum Description Length (MDL) principle and the atomic norm for low-rank matrix recovery to overcome these limitations. First, we employ the atomic norm to find all the candidate atoms of the low-rank and sparse terms, and then we minimize the description length of the model in order to select the appropriate atoms of the low-rank and sparse matrices, respectively. Our experimental analyses show that the proposed approach can obtain a higher success rate than the state-of-the-art methods even when the number of observations is limited or the corruption ratio is high. Experimental results on synthetic data and real sensing applications (high dynamic range imaging, background modeling, removing shadows and specularities) demonstrate the effectiveness, robustness and efficiency of the proposed method.

Value Decomposition (SVD) when the data are corrupted only by a small amount of noise. Due to the presence of gross outliers in modern applications, the robust variant of PCA, called robust PCA (RPCA), has also been used to reject the outliers [11], [12]:

$$\arg\min_{X,E} \ \mathrm{rank}(X) + \gamma \|E\|_0 \quad \text{s.t.} \quad Y = X + E, \tag{1}$$

where $Y \in \mathbb{R}^{m \times n}$ is the observed matrix, $\gamma > 0$ is a regularization parameter, $\mathrm{rank}(X)$ denotes the rank of the matrix $X \in \mathbb{R}^{m \times n}$ ($\mathrm{rank}(X) = r$), and $\|E\|_0$ is the number of non-zero entries in the sparse matrix $E$. Unfortunately, solving Eq. (1) is an NP-hard problem. Instead, Candès et al. [12] solved an approximated problem by convex optimization under rather weak assumptions:

$$\arg\min_{X,E} \ \|X\|_* + \gamma \|E\|_1 \quad \text{s.t.} \quad Y = X + E, \tag{2}$$

where $\|X\|_* = \sum_i \sigma_i(X)$ is the nuclear norm of $X$ ($\sigma_i(X)$ denotes the $i$-th singular value of $X$), and $\|E\|_1$ represents the $\ell_1$-norm of the sparse matrix $E$. Various approaches can be used to solve Eq. (2) effectively [13], [14]. Wright et al. [11] and Candès et al. [12] proved that the performance of Eq. (2) becomes stable only when more observations (larger n) are used. However, the number of observations (n) is typically limited in many image processing and computer vision problems due to physical constraints. Moreover, when n is very limited, we note that existing methods based on Eq. (2) do not reject some outliers well, such as moving objects in surveillance video [15]-[17], shadows in face images [12], and saturations in low dynamic range (LDR) images [2], [18], [19].
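To make the convex relaxation in Eq. (2) concrete, the following is a minimal sketch of one common way to solve it, an inexact augmented Lagrangian scheme that alternates singular value thresholding for X with entrywise soft-thresholding for E. It is not the MDLAN algorithm of this paper. The function names, the initial penalty `mu`, the growth factor `rho`, and the stopping rule are illustrative assumptions, and the default `gamma` is the commonly used value 1/sqrt(max(m, n)).

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise shrinkage: sign(x) * max(|x| - tau, 0)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca_ialm(Y, gamma=None, max_iter=500, tol=1e-7):
    """Sketch of an inexact ALM solver for min ||X||_* + gamma*||E||_1 s.t. Y = X + E."""
    m, n = Y.shape
    if gamma is None:
        gamma = 1.0 / np.sqrt(max(m, n))      # common default choice
    norm_Y = np.linalg.norm(Y, "fro")
    mu = 1.25 / np.linalg.norm(Y, 2)          # assumed initial penalty
    rho = 1.5                                 # assumed penalty growth factor
    X = np.zeros_like(Y)
    E = np.zeros_like(Y)
    U = np.zeros_like(Y)                      # Lagrange multiplier
    for _ in range(max_iter):
        # X-step: singular value thresholding of Y - E + U/mu
        P, s, Qt = np.linalg.svd(Y - E + U / mu, full_matrices=False)
        X = (P * np.maximum(s - 1.0 / mu, 0.0)) @ Qt
        # E-step: entrywise soft-thresholding of Y - X + U/mu
        E = soft_threshold(Y - X + U / mu, gamma / mu)
        # Dual ascent on the constraint residual Y - X - E
        R = Y - X - E
        U = U + mu * R
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R, "fro") <= tol * norm_Y:
            break
    return X, E

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
    S0 = np.zeros((60, 60))
    mask = rng.random((60, 60)) < 0.05
    S0[mask] = rng.uniform(-10, 10, size=mask.sum())
    X, E = rpca_ialm(L0 + S0)
    print("relative error:", np.linalg.norm(X - L0) / np.linalg.norm(L0))
```

Note that both the rank of X and the sparsity of E are controlled only indirectly here, through `gamma` and the thresholds, which is the tuning burden the text goes on to discuss.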
It is well known that both the rank (r) of X and the parameter γ influence the final results of the RPCA decomposition. Unfortunately, the target rank and the regularization parameter γ are unknown in Eq. (1) and Eq. (2), so conventional approaches need to tune the rank of X and γ to achieve the desired goal. Moreover, the value $\gamma = 1/\sqrt{\max\{m, n\}}$ that typical approaches use is not the best value [15]. Instead, Ramirez et al. [15], [20] used the Minimum Description Length (MDL) principle [21] to avoid estimating the parameter γ. The MDL principle selects the best low-rank approximation from a sequence of RPCA decompositions obtained with different values of γ. Liu et al. [17] employed structured sparse decomposition to address the regularization parameter issue in RPCA, replacing the static parameter γ with adaptive settings for image regions with distinct properties in each frame. However, an accurate rank is crucial for recovering the low-rank matrix and rejecting the outliers completely. An example of a scene is shown in Fig. 1. RPCA fails to recover the low-rank matrix and to capture the sudden illumination change.

arXiv:2009.08297v1 [cs.CV] 17 Sep 2020
Atoms are the fundamental building blocks of the representation of a signal, and the atomic set is the collection of these fundamental elements. The atomic norm induced by the convex hull of all unit-norm one-sparse vectors is the $\ell_1$-norm, and the nuclear norm is induced by the convex hull of the atomic set whose elements are all unit-norm rank-one matrices [22]-[24]. To address the issues discussed above, namely the limited number of observations, the unknown rank of X, and the regularization parameter γ, we propose a low-rank model based on the MDL principle within the devised atomic norm (MDLAN), which is also an expanded version of our published conference paper [25]. In our proposed method, we minimize the description length to select the optimum atomic sets for the low-rank matrix (X) and the structured sparse matrix (E), respectively. In contrast to [15], we use the MDL principle to determine the number of atoms in the low-rank matrix, thereby avoiding tuning the rank of X, and we also recover the sparse matrix E via the MDL principle. Experimental analyses show that our method can obtain a better approximation of the underlying structure of the given data when the number of observed samples is limited or when the samples have gross outliers. Thus, the proposed framework provides a nonparametric, robust low-rank matrix recovery algorithm.

The main contributions of this study are summarized as follows:
(1) We present an MDL principle based atomic norm method for low-rank matrix recovery. Unlike other model selection algorithms, the proposed MDLAN uses the description length as a cost function to select the two smallest sets of atoms that can span the low-rank matrix and the sparse matrix, respectively.
(2) We empirically test the MDL framework based atomic norm and find that it outperforms the state-of-the-art methods when the number of observations is limited or when the observations have gross outliers.
(3) It is difficult to address the original optimization problem for MDLAN due to the combination of the description length and the atomic norm. Thus, we devise a new ADMM based algorithm that considers an approximation of the original non-convex problem.

The remainder of this paper is organized as follows. Section II briefly reviews related work. In Section III, we present the unified framework for low-rank matrix recovery based on the atomic norm. In Section IV, we describe the proposed MDLAN method. Section V presents the experimental results based on synthetic and real datasets. Finally, we give our conclusions in Section VI.

II. RELATED WORK
In the following, we briefly review recent advances in RPCA and discuss its applications in image processing and computer vision.
To recover X exactly, some studies have replaced rank(·) with the nuclear norm and the number of nonzero entries with the $\ell_1$-norm, as in Eq. (2). Candès et al. [12] proved that the rank minimization problem in Eq. (1) can be solved in a tractable manner through its convex relaxation in Eq. (2). They also proved that the unique solution of Eq. (2) corresponds exactly to the solution of the original NP-hard problem in Eq. (1) under suitable conditions.
Recent improvements to RPCA generally fall into two categories. The first category focuses on the structured sparse component E in Eq. (2) [26], [27]. For example, Xin et al. [28] replaced the $\ell_1$-norm with an adaptive version of the generalized fused lasso (GFL) regularization [29], which takes into account the spatial neighborhood information of the foregrounds in a video sequence. The generalized fused lasso $\|E\|_{gfl}$ can be viewed as a combination of two common regularizers, i.e., the $\ell_1$-norm and the total variation (TV) penalty [30]:

$$\|E\|_{gfl} = \sum_{l} \Big( \|e^{(l)}\|_1 + \lambda_1 \sum_{(i,j) \in N} w_{ij} \, \big| e^{(l)}_i - e^{(l)}_j \big| \Big),$$

where $e^{(l)}$ is the $l$-th column of the sparse matrix E, $N$ is the spatial neighborhood set, $\lambda_1$ is a tuning parameter, and $w_{ij}$ is a weight on the pair $(i, j)$ that depends on σ (σ ≥ 0 is a tuning parameter set empirically). Ebadi et al. [31] dynamically estimated the support of the sparse matrix E via a superpixel generation step [32], so as to impose spatial coherence on the structured sparse outliers. Shah et al. [33] replaced the $\ell_1$-norm in Eq. (2) with a hybrid $\ell_1/\ell_2$-norm that can promote spatial smoothness in the support set of the structured sparse outliers.
The second category focuses on the low-rank component X in Eq. (2) [34]. For example, Cabral et al. [35] and Guo et al. [36] replaced X with UV and used the relationship

$$\|X\|_* = \min_{U,V:\, X = UV} \ \tfrac{1}{2}\left( \|U\|_F^2 + \|V\|_F^2 \right),$$

where $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}$, and $\|\cdot\|_F$ represents the Frobenius norm. In addition, Guo et al. [36] employed an entropy term to restrict the support of the outliers. Hu et al. [37] proposed an approximation of the target rank by the truncated nuclear norm, which only minimizes the smallest min(m, n) − r singular values. Oh et al. [2] proposed minimizing the partial sum of the singular values instead of the nuclear norm. Thus, the partial-sum formulation can be written as follows:

$$\arg\min_{X,E} \ \sum_{i=r+1}^{\min(m,n)} \sigma_i(X) + \gamma \|E\|_1 \quad \text{s.t.} \quad Y = X + E.$$

The rank minimization algorithms for RPCA have inspired many applications in image processing and computer vision, such as image alignment [38], background subtraction [12], [17], high dynamic range (HDR) imaging [2], [39], and image restoration [40], [41]. However, the clean data are often corrupted by gross noise/outliers, or the amount of available data is limited because of sensor limitations or human error [2], [11], [42]. The available methods based on RPCA have difficulty handling these situations. In the present study, we propose an algorithm based on MDL and the atomic norm to overcome these difficulties, i.e., the unknown target rank r, the unknown regularization parameter γ, and deficient observations or gross outliers.

Fig. 1. … and the proposed approach (d, e), respectively. The rank estimated by RPCA is 7, so ghosting appears in the background. By contrast, the rank estimated by our approach is 1.

A. Atomic norm
First, we provide a definition of an atomic norm and some assumptions regarding the set of atoms ($\mathcal{A}$). We also assume that the set $\mathcal{A}$ is origin-symmetric (i.e., $A \in \mathcal{A}$ if and only if $-A \in \mathcal{A}$). The atomic norm [22] is the gauge functional induced by $\mathcal{A}$:

$$\|X\|_{\mathcal{A}} := \inf\{\, t > 0 : X \in t \cdot \mathrm{conv}(\mathcal{A}) \,\},$$

where $\mathrm{conv}(\mathcal{A})$ denotes the convex hull of $\mathcal{A}$. In fact, the atomic norm reduces to many familiar norms when we specify the atomic set. The dual norm of $\|\cdot\|_{\mathcal{A}}$ is defined by

$$\|X\|^*_{\mathcal{A}} := \sup\{\, \langle X, A \rangle : A \in \mathcal{A} \,\},$$

where the inner product is defined as $\langle X, A \rangle = \mathrm{tr}(X^T A)$ for matrices, and $\mathrm{tr}(\cdot)$ denotes the trace of a matrix. The dual atomic norm is crucial for producing the atomic set in our case.

Sparsity inducing norm: The sparsity inducing atomic set can be expressed as $\mathcal{A}_S := \{\pm E_{ij} \in \mathbb{R}^{m \times n},\ i = 1, 2, \cdots, m,\ j = 1, 2, \cdots, n\}$, where $E_{ij}$ denotes a matrix whose (i, j)-th entry is 1 and whose other entries are zero. Any k-sparse matrix in $\mathbb{R}^{m \times n}$ is a linear combination of k elements from the atomic set defined above.
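As a small illustration of the sparsity inducing atomic set, the sketch below decomposes a k-sparse matrix into k signed atoms ±E_ij with nonnegative weights, and evaluates the resulting atomic norm (the sum of absolute entries) and the dual atomic norm (the largest absolute entry, since the supremum of tr(X^T A) over atoms ±E_ij is max |X_ij|). The function names are hypothetical and chosen only for this example.

```python
import numpy as np

def sparse_atoms_decomposition(E):
    """Return (coeffs, atoms) with E = sum_k coeffs[k] * atoms[k], atoms in A_S."""
    m, n = E.shape
    coeffs, atoms = [], []
    for i, j in zip(*np.nonzero(E)):
        atom = np.zeros((m, n))
        atom[i, j] = np.sign(E[i, j])    # the atom is +E_ij or -E_ij
        coeffs.append(abs(E[i, j]))      # nonnegative weight on that atom
        atoms.append(atom)
    return np.array(coeffs), atoms

def atomic_norm_sparse(E):
    """Atomic norm induced by A_S: the sum of the weights, i.e. the l1-norm."""
    coeffs, _ = sparse_atoms_decomposition(E)
    return float(coeffs.sum())

def dual_atomic_norm_sparse(X):
    """Dual norm for A_S: sup over atoms of <X, A> = max |X_ij|."""
    return float(np.abs(X).max())

if __name__ == "__main__":
    E = np.array([[0.0, -2.0, 0.0], [3.0, 0.0, 0.0]])
    coeffs, atoms = sparse_atoms_decomposition(E)
    print(len(atoms), atomic_norm_sparse(E), dual_atomic_norm_sparse(E))
```

The number of atoms returned equals the number of nonzero entries, which is the quantity the MDL criterion later tries to keep small.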
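Low-rankness inducing norm: The low-rankness inducing atomic set can be written as $\mathcal{A}_L := \{ Z \in \mathbb{R}^{m \times n} : \mathrm{rank}(Z) = 1,\ \|Z\|_F = 1 \}$, where $Z \in \mathbb{R}^{m \times n}$ represents a rank-1 matrix with unit Frobenius norm. For any matrix $X \in \mathbb{R}^{m \times n}$,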

IV. MDL PRINCIPLE BASED LOW-RANK RECOVERY
In this section, we first present the concept of the MDL principle and its background. We then propose a new low-rank matrix recovery method (MDLAN) based on the MDL principle and the atomic norm, together with its optimization algorithm.

A. Minimum description length principle
The MDL principle works as an objective function that balances a measure of the goodness of fit with the model complexity, and searches for a model M from the set of possible models, $\mathcal{M}$. In the MDL framework, a model $M \in \mathcal{M}$ that describes the given data Y completely with the fewest number of bits is considered the best. The MDL problem is formulated as follows:

$$\hat{M} = \arg\min_{M \in \mathcal{M}} L(Y, M),$$

where the codelength assignment function $L(Y, M)$ defines the theoretical codelength required to describe (Y, M) uniquely. A common implementation of the MDL framework uses the Ideal Shannon Codelength Assignment [43, Ch. 5], which defines the codelength through a probability assignment as $L(Y, M) = -\log P(Y, M) = -\log P(Y|M) - \log P(M)$. Thus, we obtain the MDL framework

$$\hat{M} = \arg\min_{M \in \mathcal{M}} \ -\log P(Y|M) - \log P(M),$$

where $-\log P(M)$ represents the model complexity and $-\log P(Y|M)$ represents a measure of the goodness of fit.
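The trade-off between model complexity and goodness of fit can be seen in a toy two-part code. The sketch below is not the encoding scheme of this paper: it assumes a vector observed under Gaussian noise with known standard deviation `sigma`, a quantization step `delta`, and a hypothetical model cost of log2(n) bits for the index plus `coeff_bits` bits for the value of each retained entry. It then picks the number k of retained entries that minimizes L(M) + L(Y|M).

```python
import numpy as np

def gaussian_codelength_bits(residual, sigma, delta):
    """Sum of -log2(p(r) * delta) for a zero-mean Gaussian density p (log domain)."""
    per_entry = (residual ** 2 / (2.0 * sigma ** 2 * np.log(2.0))
                 + np.log2(sigma * np.sqrt(2.0 * np.pi) / delta))
    return float(per_entry.sum())

def mdl_select_support(y, sigma=0.1, delta=0.01, coeff_bits=16):
    """Choose how many largest-magnitude entries to keep by minimum total codelength."""
    n = y.size
    order = np.argsort(-np.abs(y))
    best_k, best_bits = 0, np.inf
    for k in range(n + 1):
        approx = np.zeros_like(y)
        approx[order[:k]] = y[order[:k]]   # simplification: kept values are exact
        model_bits = k * (np.log2(n) + coeff_bits)                      # L(M)
        data_bits = gaussian_codelength_bits(y - approx, sigma, delta)  # L(Y|M)
        total = model_bits + data_bits
        if total < best_bits:
            best_k, best_bits = k, total
    return best_k, best_bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = 0.1 * rng.standard_normal(64)
    y[[3, 17, 40]] = [5.0, -4.0, 6.0]
    print(mdl_select_support(y))
```

Adding an entry to the model is only worthwhile when the bits it saves in the residual exceed its own description cost, which is the same reasoning the proposed method applies to atoms.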

B. The proposed method
Our family of models for the low-rank matrix recovery problem is parameterized by r, the true rank of the low-rank matrix X, and k, the true number of non-zero entries in the sparse matrix E. Using these definitions, we formulate our objective function within the MDL framework. Combining Eq. (8), Eq. (9) and Eq. (12) yields the MDL-based atomic norm model for low-rank matrix recovery (MDLAN). The basic idea of the proposed MDLAN is to find the two smallest sets Ψ and Φ while minimizing the codelength $L(Y - F_\Psi \alpha - F_\Phi \beta)$. The cost function in Eq. (13) is non-convex in (Ψ, Φ), so we relax it with an alternative objective function in order to handle the proposed problem effectively.

C. Encoding scheme
In our MDL framework, we need to encode the low-rank matrix ($\sum_i L(\alpha_i \psi_i)$) and the sparse matrix ($L(F_\Phi \beta)$), respectively. It is usual to extend the ideal codelength to continuous random variables x with a probability assignment P(x) as $L(x) = -\log P(x) \approx -\log(p(x)\delta)$, where p(x) is the probability density function of x and δ is the quantization step. To losslessly encode the finite-precision variables, we quantize them with step δ = 1 [20].

Fig. 2. The encoding procedure of the prediction scheme. The column of an atom is arranged as a 3 × 3 matrix and the elements outside of the range are assumed to be zero. The causal bilinear predictor is assumed to be a 2 × 2 template and the mapping matrix W is of size 9 × 9.
1) Encoding the sparse matrix: For the sake of simplicity, we set the elements in the atomic set Φ to have positive signs and allow the scalar coefficients β to have mixed signs. We assume that the scalar coefficients β comprise a sequence of Laplace random variables [20], where each atom $\phi_i \in \Phi$ has only one nonzero entry. The atom $\phi_i$ only describes the index of the nonzero position, and therefore c is a fixed constant (the description length of $\phi_i$ is log(mn)). Moreover,

2) Encoding the low-rank matrix: It is not surprising that each atom carries much of the eigen-information of the low-rank matrix. In other words, in our real-world applications, we can suppose that the columns of an atom are standard static images, which are piecewise smooth. We should therefore exploit the smoothness of the atoms efficiently by employing a prediction scheme. Concretely, to describe each column $a_j$ of an atom $\psi_i = [a_1, a_2, \ldots, a_n]$, we reshape $a_j$ as an image or frame of the same size as the original images or frames in the observed matrix Y. Then we employ a causal bilinear kernel with zero-padding to produce a predicted vector $\hat{a}_j$ (each element is given by north element + west element − northwest element), obtaining the residual $\bar{a}_j = a_j - \hat{a}_j$. In particular, the residual can be written as $\bar{a}_j = W a_j$, and the matrix residual can be formed as $\bar{\psi}_i = W \psi_i$, where $W \in \mathbb{R}^{m \times m}$ is lower triangular. The detailed procedure is depicted in Fig. 2. We refer to [15], [20] for details on these results.

We also assume the prediction residual $\bar{\psi}_i$ to be a sequence of LG distributed continuous random variables [15], [20]. Compared to the codelength of $\bar{\psi}_i$, the codelength of the scalar coefficient $\alpha_i$ is inconsequential for our model. So the codelength of the low-rank matrix X can be written as $\sum_i L(W \psi_i)$.
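The prediction step described above can be sketched as follows. The code computes the causal bilinear residual (pixel minus north plus west minus northwest, with zero padding) and builds the corresponding mapping matrix W column by column, so that W applied to a vectorized image equals the vectorized residual. Row-major vectorization and the function names are assumptions made for this example; the 3 × 3 case reproduces the 9 × 9 size mentioned for Fig. 2.

```python
import numpy as np

def causal_bilinear_residual(img):
    """Residual of the causal predictor north + west - northwest (zero padding)."""
    padded = np.pad(img, ((1, 0), (1, 0)))   # zero row on top, zero column on the left
    north = padded[:-1, 1:]
    west = padded[1:, :-1]
    northwest = padded[:-1, :-1]
    return img - (north + west - northwest)

def prediction_matrix(h, w):
    """Build W such that W @ img.ravel() == causal_bilinear_residual(img).ravel()."""
    size = h * w
    W = np.zeros((size, size))
    for k in range(size):
        basis = np.zeros(size)
        basis[k] = 1.0
        # Column k of W is the residual of the k-th unit image (linearity).
        W[:, k] = causal_bilinear_residual(basis.reshape(h, w)).ravel()
    return W

if __name__ == "__main__":
    W = prediction_matrix(3, 3)
    print(W.shape, np.allclose(W, np.tril(W)))
    # A constant image is predicted everywhere except at the top-left pixel.
    print(causal_bilinear_residual(np.ones((3, 3))))
```

For a piecewise smooth image most residual entries are close to zero, which is why encoding W ψ_i is cheaper than encoding ψ_i directly.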
Here we employ the alternating direction method of multipliers (ADMM) [13], [44] to solve this constrained optimization problem. In the augmented Lagrangian function of Eq. (15), µ is a positive scalar, $U \in \mathbb{R}^{m \times n}$ is the Lagrange multiplier, and $\langle \cdot, \cdot \rangle$ denotes the inner product operator. In the ADMM iterations, the two subproblems (Eq. (17) and Eq. (18)) are convex optimization problems when the other variables are fixed. Algorithm 1 summarizes the whole recovery procedure, which alternates between recovering the low-rank matrix and rejecting the sparse outliers. The logic that underlies the proposed MDLAN method is that the codelength cost of adding a new atom to the model is usually very high, so adding a new atom is only reasonable if its contribution is sufficiently high to produce the largest decrease in the other part, i.e., the residual of the constraint Y − X − E = 0.

1) Recovering the low-rank matrix: The subproblem in Eq. (17) can be reformulated as Eq. (20). Once we obtain the candidate atomic set of the low-rank matrix, we only need to select the suitable atoms. Since the low-rank matrix is a combination of r atoms, we first determine the candidate set Ψ by the dual atomic norm.
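For orientation, the standard augmented Lagrangian and ADMM iteration for a problem with the constraint Y − X − E = 0 take the form below. This is a sketch under the assumption that the objective of Eq. (15) is the sum of the two codelengths from the encoding scheme, with $X = F_\Psi \alpha$ and $E = F_\Phi \beta$; the exact expressions used in this paper may differ, and the penalty growth factor ρ is a hypothetical symbol introduced here.

```latex
\begin{align}
\mathcal{L}(X, E, U, \mu) &= \sum_i L(W\psi_i) + L(F_\Phi \beta)
   + \langle U,\; Y - X - E \rangle + \frac{\mu}{2}\,\| Y - X - E \|_F^2, \\
X^{t+1} &= \arg\min_X \; \mathcal{L}(X, E^{t}, U^{t}, \mu^{t}), \\
E^{t+1} &= \arg\min_E \; \mathcal{L}(X^{t+1}, E, U^{t}, \mu^{t}), \\
U^{t+1} &= U^{t} + \mu^{t}\,(Y - X^{t+1} - E^{t+1}), \\
\mu^{t+1} &= \rho\,\mu^{t}, \qquad \rho > 1 .
\end{align}
```

Completing the square in the X-update shows why a proximity term of the form $\|X - G^t\|_F^2$ appears in the low-rank subproblem: under the assumptions above, $G^t = Y - E^t + U^t/\mu^t$.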
This is equivalent to finding at most $\bar{r} = \min\{m, n\}$ atoms that maximize the corresponding objective. By the Eckart-Young theorem, the atoms Ψ are obtained from the SVD of $G^t$, as $\Psi = \{u_i v_i^T\}_{i=1}^{\bar{r}}$, where $u_i$ and $v_i$ are the i-th principal left and right singular vectors, respectively (the singular value $\alpha_i$ is the coefficient of the corresponding atom $\psi_i$, and $\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_{\bar{r}}$). This result ensures that the selected atoms achieve the supremum in Eq. (21) and that the optimal solution actually lies in the set Ψ. Minimizing Eq. (20) while estimating the rank of the true low-rank matrix means that the selected atoms must compromise between minimizing the codelength L and being near to $G^t$. We can add new atoms to the low-rank matrix X one at a time, in order, so as to move opposite to the worst possible direction of the optimization problem (20). To address this optimization problem efficiently, we propose a weighted formulation [45] of description length minimization designed to democratically penalize the codelength of the selected atoms.

In the resulting subproblem (23), $v_i \in \{0, 1\}$ denotes the i-th element of the vector v, and the vector $s = (L(W\psi_1), L(W\psi_2), \cdots, L(W\psi_{\bar{r}}))$. $v_i = 1$ indicates that the atom $\psi_i$ is selected to be added to the low-rank model and that its contribution is high enough to decrease the term $\|X - G^t\|_F^2$. $v_i = 0$ indicates that the atom $\psi_i$ is not selected. The subproblem (23) therefore has a closed-form solution given by a variant of the shrinkage operator, i.e., $X = F_\Psi \langle \alpha, \mathcal{I}_{1/\mu^t}(s) \rangle$, where $\mathcal{I}_\tau(x)$ is a variant of the shrinkage operator. It turns out that the number of selected atoms is the rank of the true low-rank matrix, r.
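The selection logic can be sketched as follows. The exact form of subproblem (23) and of the shrinkage variant is not reproduced here, so the code assumes the objective ⟨v, s⟩ + (µ/2)‖X − G‖_F² with X built from the selected SVD atoms of G. Under that assumption the problem separates per atom, because dropping atom i adds α_i² to the squared error, so atom i is kept exactly when (µ/2)α_i² exceeds its codelength s_i. The function name and the constant codelengths in the usage example are hypothetical.

```python
import numpy as np

def select_atoms(G, codelengths, mu):
    """Keep SVD atoms of G whose fit gain (mu/2)*alpha_i^2 exceeds their codelength."""
    P, alpha, Qt = np.linalg.svd(G, full_matrices=False)
    gain = 0.5 * mu * alpha ** 2                 # decrease of the proximity term
    v = (gain > codelengths).astype(int)         # binary selection vector
    X = (P * (alpha * v)) @ Qt                   # sum of the selected atoms
    return X, v, int(v.sum())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    P, _ = np.linalg.qr(rng.standard_normal((20, 2)))
    Q, _ = np.linalg.qr(rng.standard_normal((15, 2)))
    G = P @ np.diag([10.0, 8.0]) @ Q.T + 0.01 * rng.standard_normal((20, 15))
    s = np.full(15, 5.0)     # assumed codelength of each candidate atom, in bits
    X, v, rank = select_atoms(G, s, mu=1.0)
    print("estimated rank:", rank)
```

In this sketch the estimated rank is simply the number of selected atoms, which mirrors the statement that the number of selected atoms gives the rank r.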
2) Rejecting the sparse outliers: The subproblem in Eq. (18) can be reformulated as Eq. (25). To minimize the $\ell_1$-norm and the proximity term in Eq. (25) efficiently, the soft-thresholding (shrinkage) method is employed. The solution of the subproblem in Eq. (25) is obtained with the soft-thresholding operator $S_{\theta^t/\mu^t}$ [46], which acts entrywise as $S_\tau(x) = \mathrm{sign}(x)\max(|x| - \tau, 0)$.

E. Discussion
Why does the proposed MDLAN recover the best approximation of the low-rank and sparse matrices even when the number of observations is limited or the observations have gross outliers? In our MDL framework, recovering the low-rank matrix X by solving Eq. (17) or Eq. (20) aims to find the smallest set of atoms in $\mathcal{A}_L$ that can span X, so it is equivalent to Eq. (26), and recovering the sparse matrix by solving Eq. (18) or Eq. (25) is equivalent to Eq. (27), where we note that rank(X) = atoms(X) and $\|E\|_0$ = atoms(E). Thus, this theory (Eq. (26) and Eq. (27)) can ensure that the proposed algorithm recovers the low-rank matrix accurately and rejects the outliers.

Algorithm 1 (steps 12-14 and output):
12: $\theta^{t+1}$ ← the mean of $E^{t+1}$
13: t ← t + 1.
14: until converged.
Output: optimal $X^t$, $E^t$

As shown in Algorithm 1, the proposed MDLAN finds the candidate atoms for the true low-rank matrix and the sparse outliers, respectively, and then decides which atoms to add to the model according to the MDL principle. Estimating the rank of the true low-rank matrix X correctly is the key to recovering the low-rank matrix accurately, and it also contributes to rejecting all the outliers. Similarly, rejecting all the outliers contributes to the search for the best approximation of the true low-rank matrix X.

V. EXPERIMENTS
We evaluate the proposed method using both synthetic data sets and real sensing application examples to verify its effectiveness and robustness. In all the experiments, we use the default parameters for the methods compared.

A. Experiments with synthetic data
To compare the proposed method (MDLAN) with state-of-the-art methods on synthetic data, we synthesize a ground-truth low-rank matrix $X_0 \in \mathbb{R}^{m \times n}$ of rank r and a sparse matrix $E_0 \in \mathbb{R}^{m \times n}$ with k nonzero entries, which simulates data affected by sensor malfunction. The low-rank matrix is a linear combination of r arbitrary orthogonal basis vectors, and the weights of the combination are sampled randomly from the uniform distribution U(0, 5). The k entries from $X_0$ are corrupted by random noise from N(0, 1). We refer to $\|X_0 - X\|_F / \|X_0\|_F$ as the normalized root mean squared error (NRMSE).
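A minimal sketch of this synthetic setup is given below, assuming that the orthogonal basis is obtained from a QR factorization of a random Gaussian matrix and that the k corrupted positions are drawn uniformly without replacement; these choices and the function names are assumptions, since the text does not specify them.

```python
import numpy as np

def make_synthetic(m, n, r, k, seed=0):
    """Return (X0, E0, Y): rank-r ground truth, k-sparse corruption, observation."""
    rng = np.random.default_rng(seed)
    basis, _ = np.linalg.qr(rng.standard_normal((m, r)))   # r orthonormal columns
    weights = rng.uniform(0.0, 5.0, size=(r, n))            # weights from U(0, 5)
    X0 = basis @ weights
    E0 = np.zeros((m, n))
    idx = rng.choice(m * n, size=k, replace=False)          # k corrupted positions
    rows, cols = np.unravel_index(idx, (m, n))
    E0[rows, cols] = rng.standard_normal(k)                 # noise from N(0, 1)
    return X0, E0, X0 + E0

def nrmse(X0, X):
    """Normalized error ||X0 - X||_F / ||X0||_F."""
    return np.linalg.norm(X0 - X, "fro") / np.linalg.norm(X0, "fro")

if __name__ == "__main__":
    X0, E0, Y = make_synthetic(m=108, n=100, r=4, k=540)
    print(np.linalg.matrix_rank(X0), np.count_nonzero(E0), nrmse(X0, Y))
```

The last printed value is the NRMSE of the uncorrected observation, which gives a baseline that any recovery method should improve on.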
1) Comparison of the success ratio: We use the recoverability results to verify the robustness of RPCA and our method (MDLAN) with respect to the number of samples (n), the synthetic data dimension (m), and the corruption ratio (p). For each pair, (n, p) and (m, p), we run 50 trials and report the overall average NRMSE of the trials. If the recovered low-rank matrix X has an NRMSE value smaller than ε (ε = 0.01), we consider the recovery successful. The magnitudes of the colors in Fig. 3 and Fig. 4 indicate the success probability. Larger red areas indicate more robust performance of the algorithm. Fig. 3 shows the success ratio of RPCA and the proposed method with rank 2, 4, 6, and 8. We fix m = 4900 and vary n and p. When the number of observations is deficient or the corruption ratio is large, the proposed method obtains competitive results. Both methods exhibit similar behavior when more samples are available or the corruption ratio is small. We also perform experiments where we fix n = 15 and vary m and p. As shown in Fig. 4, the proposed method yields more robust results than RPCA for the rank 1 and 3 cases. Fig. 4 shows that the dimension (m) does not have a particularly significant effect on the results. However, the number of observations and the corruption ratio severely affect the final recovery results.
2) Comparisons with other low-rank matrix approximations: We also perform experimental comparisons with a rank minimization based method (RPCA) [12], an MDL principle based method (LR-MDL) [15], an atomic norm based method (CoGEnT) [23], generalized fused lasso foreground modeling (BSGFL) [28], and a partial sum of singular values based method (PSSV) [2]. We verify the robustness of RPCA, LR-MDL, CoGEnT, BSGFL, PSSV, and the proposed method (MDLAN) with respect to the corruption ratio. We fix m = 108, n = 100, r = 4 and vary the corruption ratio p ∈ [0.01, 0.8]. To show the results obtained by RPCA, LR-MDL, CoGEnT, PSSV and MDLAN in more detail in Fig. 5, the results of BSGFL are not shown (because it considers the spatial neighborhood information in the sparse matrix, BSGFL fails to recover the synthetic sparse matrix). Fig. 5(a) shows the NRMSE of the low-rank matrix for each method as a function of the corruption ratio based on the synthetic data averaged over 50 random runs. As shown in Fig. 5(a), when the outlier ratio is lower than 0.3, the proposed method obtains results similar to those of PSSV and RPCA, which are better than those produced by the other methods (LR-MDL and CoGEnT). When the outlier ratio is more than 0.3, MDLAN achieves much higher accuracy than RPCA, LR-MDL, CoGEnT, and PSSV. It is clear that gross outliers exist and thus the existing methods do not capture all the energy of the underlying structure. The results shown in Fig. 5(c) demonstrate that only the proposed method estimates the rank of the underlying structure correctly (rank 4). As stated in the previous section, the proposed method finds all the candidate atoms of the low-rank matrix via the atomic norm and then selects the most appropriate atoms via the MDL principle. Estimating the rank of the underlying structure correctly is crucial for recovering the low-rank matrix accurately and also benefits the estimation of the outliers.
The NRMSE of the sparse matrix obtained by our method in Fig. 5(b) is smaller than those produced by RPCA, LR-MDL, CoGEnT, and PSSV when the outlier ratio is more than 0.3. The proposed method can search for the best approximation of the sparse structure via the MDL principle, so it can obtain more accurate results. Moreover, compared with the other methods, the proposed approach estimates the number of nonzero entries in the sparse matrix more accurately, even when the corruption ratio is up to 0.55, as shown in Fig. 5(d) (the number of nonzero entries recovered by LR-MDL is always mn). When the corruption ratio is more than 0.55, the number of nonzero entries estimated by the proposed method is still close to the original number. To reject the outliers completely, it is necessary to recover the locations and the corresponding values of the nonzero entries accurately, which we achieve by solving Eq. (25) in our MDL framework.

Table I shows the recovery results averaged over 50 random runs, where we fix the corruption ratio to p = 0.05 or 0.5. When the data are corrupted with 50% outliers, the average NRMSE for the low-rank matrix using the proposed method is 0.01 and the average NRMSE for the sparse noise matrix is 0.019. In addition, MDLAN performs better than LR-MDL, CoGEnT and BSGFL when the corruption ratio is only 0.05. In summary, the experimental results on synthetic data suggest that MDLAN performs better at recovering the low-rank matrix and rejecting the outliers from the corrupted data compared with the other state-of-the-art methods.
B. Real-world sensing applications
1) High dynamic range (HDR) imaging: Low dynamic range (LDR) images of a scene are usually captured by a sensor with different bracketing exposures. We formulate the HDR image generation problem as a rank minimization problem, where the moving objects, noise, and other nonlinear artifacts are considered as sparse outliers, and our goal is to merge several LDR images into the final HDR images. We know that LDR images are linearly dependent due to the continuous camera response. Thus, we construct three observed intensity matrices $Y = [\mathrm{vec}(I_1), \cdots, \mathrm{vec}(I_n)] \in \mathbb{R}^{m \times n}$ by stacking the vectorized input images (processing each color channel separately), where m and n represent the number of pixels and images, respectively, and $I_i$ denotes the i-th input image. We apply the rank minimization methods to the three corrupted matrices to separate the outliers and the background scene (low-rank term).

We apply the proposed approach to the three observed matrices $Y \in \mathbb{R}^{699392 \times 4}$ using a set of LDR images comprising four pictures taken in a forest [18]. The images contain artifacts caused by a person walking in the scene. Moreover, the wind makes the branches move and thus there are shadows due to the wind blowing. The final HDR results are shown in Fig. 6. Compared with the results obtained by RPCA, LR-MDL, CoGEnT, BSGFL, and PSSV, the proposed method can recover the low-rank component (artifact-free in Fig. 6(g)) and reject more outliers, even with only four input images (n = 4). The detailed comparison in Fig. 7 shows that our method can reject the outliers, such as ghosting and shadows, which are caused by the person and the wind, respectively.
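The construction of the three observed matrices can be sketched as follows, assuming the input images are arrays of shape (H, W, 3) and that each image is vectorized in row-major order; the function name is hypothetical.

```python
import numpy as np

def build_observation_matrices(images):
    """Stack n images of shape (H, W, 3) into three matrices of shape (H*W, n)."""
    stack = np.stack([np.asarray(img, dtype=float) for img in images], axis=-1)
    h, w, c, n = stack.shape                     # (H, W, channels, images)
    # For each color channel, column i is the vectorized channel of image i.
    return [stack[:, :, ch, :].reshape(h * w, n) for ch in range(c)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = [rng.random((5, 6, 3)) for _ in range(4)]
    Ys = build_observation_matrices(images)
    print(len(Ys), Ys[0].shape)
```

Each of the three matrices is then decomposed independently, and the recovered low-rank columns are reshaped back to images.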
2) Background modeling based on video sensor: We adopt the F-measure as the quantitative metric for evaluating the performance of background modeling. The F-measure, which combines precision and recall, is calculated as follows:

$$F\text{-measure} = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}},$$

where $\mathrm{precision} = \frac{TP}{TP + FP}$ and $\mathrm{recall} = \frac{TP}{TP + FN}$, and TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively. The higher the F-measure, the more accurately the outliers (foreground objects) are detected [47].
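The metric can be computed from binary foreground masks as in the sketch below. The function name is hypothetical, and returning 0 when a denominator vanishes is a convention assumed here rather than one stated in the text.

```python
import numpy as np

def f_measure(pred_mask, true_mask):
    """F-measure of a predicted binary foreground mask against the ground truth."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.sum(pred & true)       # foreground correctly detected
    fp = np.sum(pred & ~true)      # background marked as foreground
    fn = np.sum(~pred & true)      # foreground missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))

if __name__ == "__main__":
    pred = np.array([[1, 1], [0, 0]])
    true = np.array([[1, 0], [1, 0]])
    print(f_measure(pred, true))
```

In the background modeling setting, the predicted mask would come from thresholding the recovered sparse component E of a frame.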
In background modeling, it is difficult to determine the correlations between video frames, as well as to model the background variations and the foreground activity. It is reasonable to assume that these background variations are low-rank, while the objects moving in the foreground are large in magnitude and sparse in the spatial domain. Background estimation is complex due to the presence of foreground activity such as moving people and variations in illumination. We first consider the example video introduced by Li et al. [48], which comprises a sequence of 1186 grayscale frames obtained from a busy shopping center. Multiple people move in the scene, so the shadows on the ground surface vary significantly in the image sequences. To verify the effectiveness of the proposed method when the number of observations is limited, we only utilize a small number of continuous frames (n = 100). Each frame has a resolution of 256 × 320 and we stack the frames as the columns in our observed matrix $Y \in \mathbb{R}^{245760 \times 100}$.
The results are displayed in Fig. 8, which shows that all the methods successfully detect the moving people. However, many shadows are present in the low-rank background recovered by RPCA, LR-MDL, CoGEnT, BSGFL and PSSV, as shown in Fig. 8(b)-(f). By contrast, our proposed method correctly models the background scene and gives a better foreground with fewer false detections.
We then consider two sequences from the SABS data set, including a "Basic" sequence and a "Clutter" sequence. The "Clutter" category of sequences contains a large number of moving foreground objects occluding a large portion of the background, which is very challenging, and we again only utilize 100 continuous frames. The results of all models on an example frame are shown in Fig. 9 and Fig. 10. As shown in Fig. 9, the proposed method obtains a cleaner background (no ghosting) and detects more outliers than the other models when the corruption ratio is high. Fig. 10 demonstrates that the proposed MDLAN can recover the low-rank background (no shadow) and segments the foreground almost correctly, compared with the other models.
The average F-measures and running times (on a 3 GHz Core(TM) i7 CPU) of all the models on the three sequences are shown in Table II. As illustrated in Figs. 8-10, the shadow is included in the sparse component, which makes the value of the F-measure relatively low. Table II indicates that the proposed method achieves the highest F-measure on the three sequences and is computationally efficient.
3) Removing shadows and specularities from faces: Basri et al. [49] stated that the face recognition problem in computer vision can be described by a low-dimensional linear model and showed that, under certain idealized circumstances, images captured by a sensor under variable illumination lie near an approximately nine-dimensional linear subspace known as the harmonic plane. However, due to the presence of shadows and specularities, real face images often violate the aforementioned low-rank model. It is reasonable to consider that outliers such as shadows, specularities, and saturations are sparse in the spatial domain. Thus, we aim to recover a low-rank model from the corrupted face images. The images have a resolution of 96 × 84 and we stack 20 face images as the columns in our observed matrix $Y \in \mathbb{R}^{8064 \times 20}$. Fig. 11(a) shows three images from the Extended Yale B database [50], and Fig. 11(b)-(g) and Fig. 12(a)-(f) show the recovered low-rank and corresponding sparse components, respectively. The sparse terms E obtained by CoGEnT and BSGFL are all zeros. Thus, the CoGEnT and BSGFL methods fail to reject the outliers E. Unlike the other methods, when the shaded area is small, MDLAN removes the shadows around the nose region (see the first and second rows in Fig. 11(g)). When the shaded area is large, the proposed method still removes more shadows than RPCA, LR-MDL, CoGEnT, BSGFL and PSSV (see the third row in Fig. 11(g)). Thus, our technique may be useful for pre-processing training images in face recognition systems by removing such outliers.

VI. CONCLUSION
In this study, we introduce the MDL principle and the atomic norm into the field of low-rank matrix recovery, and we propose a novel nonparametric low-rank matrix approximation method called MDLAN. Existing algorithms have difficulty tackling the proposed optimization problem, so we consider an approximation of the original problem. Our method selects the best atoms to search for the best approximation of the low-rank matrix, and it can also find the sparse noise simultaneously. We compare the proposed approach with state-of-the-art methods using synthetic data and three real sensing low-rank applications, i.e., HDR imaging, background modeling based on video sensor, and removing shadows and specularities from face images. The experimental results on the synthetic and real sensing datasets demonstrate the effectiveness and robustness of the proposed approach.