Hole Concealment Algorithm Using Camera Parameters in Stereo 360 Virtual Reality System



Introduction
Virtual reality (VR) has been one of the most important topics in the field of multimedia signals and systems for approximately 10 years. There are many applications for 360 VR systems, such as broadcasting, movies, social media communication, remote education, and virtual tourism [1]. 360 VR images and videos are generated using a 360 VR camera rig or a spinning VR camera, which is usually expensive or consists of heavy equipment.
As the technologies related to VR have been studied, methods that use a general-purpose camera or a cheap smartphone instead of the technical equipment to generate a 360 VR image have been developed [2][3][4][5]. In [2], the authors explained a method to stitch images using direct alignment or feature-based alignment. When the feature-based alignment approach is employed, feature points are searched in each picture to be stitched using one of the related algorithms [6][7][8][9][10][11]. Then, the feature points are matched by RANSAC [12], which provides homography matrixes to represent the relationship between neighboring pictures with overlapping common regions. The intrinsic and extrinsic matrixes of each picture can be derived from the homography matrix. The intrinsic and extrinsic matrixes of all pictures to be used for generating a VR image are jointly optimized through a bundle adjustment procedure [13]. In [3], a real-time algorithm was proposed to make a panoramic image from pictures taken by the camera of a mobile phone, where the resulting image is shown on a cylindrical surface. The pictures taken by a mobile phone differ in color and brightness from one another because these values depend on the direction and position of the camera at the moment of capture. This difference produces various artifacts in the VR image. In order to reduce the degradation, an algorithm to find the seams and blend the overlapping regions was proposed in [4]. Automatic Panoramic Image Stitching (APIS) [5] is one of the most efficient methods to make a panoramic image or VR pictures. APIS consists of a variety of modules explained in [2,7,8,11,12], which include feature-based alignment, the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Oriented FAST and Rotated BRIEF (ORB), Random Sample Consensus (RANSAC), bundle adjustment, straightening, and multiband blending.
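The feature-based pipeline described above (match feature points, then fit a homography with RANSAC) can be sketched in numpy. This is a minimal illustration under stated assumptions, not APIS's implementation: `dlt_homography`, `project`, and `ransac_homography` are hypothetical names, point correspondences are assumed to be already extracted by a feature matcher, and the coordinate normalization used by production stitchers is omitted.

```python
import numpy as np

def dlt_homography(src, dst):
    """Direct Linear Transform: fit H such that dst ~ H @ src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)          # null-space vector of the constraint matrix
    return H / H[2, 2]                # fix the projective scale

def project(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, iters=500, thresh=2.0, seed=0):
    """Keep the H with the most inliers, then refit on all of its inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)   # minimal 4-point sample
        H = dlt_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return dlt_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

With synthetic correspondences and a handful of outliers, the RANSAC loop rejects the mismatches and the refit recovers the planted homography.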
In order to increase the quality of the stitched VR image, we can use high-end specialized equipment, for example, GoPro VR cameras. When high-end specialized equipment is used to generate 360° VR images, it is considerably easier to align the pictures without notable seam artifacts than when using conventional cameras. Although VR images obtained from high-end specialized equipment have higher quality than those obtained from conventional cameras, conventional cameras are more affordable and simpler to use. In fact, specialized equipment is bulky and expensive, thereby hindering its use in daily life.
Appl. Sci. 2021, 11, 2033

In Section 4, the results of the proposed algorithm and the conventional methods are compared subjectively and objectively. In Section 5, we provide a brief conclusion for this paper.

Problem Formulation
Figure 1 shows the process used to make the stereo 360 VR images, which consist of the left and right equi-rectangular projections (ERPs). In this configuration, the left and right cameras are mounted on a rig and capture pictures in the same direction, where general-purpose cameras (e.g., built-in cameras in smartphones) are used. The left and right ERPs are generated by applying several techniques to the sets of pictures with the left and right views. For example, the left ERP is made by consecutively applying feature extraction, mapping between features, calculation of the homography matrix to represent the relationship, bundle adjustment to optimize the camera parameters, seam finding, and blending. The right ERP is made by the same series of techniques. In this paper, APIS [5] is used to make the left and right ERPs, because it is known as one of the most efficient algorithms. APIS is applied independently to the left and right sets of pictures that have been captured by two built-in smartphone cameras. In this scenario, no mechanical equipment is used other than a tripod; thus, the pictures are captured in arbitrary directions and positions. Note that APIS is a baseline stitching algorithm and does not include a function to conceal the holes in the stitched image. Figure 2 shows an example of the left and right ERPs that are generated using the procedure shown in Figure 1. As observed in Figure 2, we find some holes in these pictures. In the scenario illustrated in Figure 1, given that general-purpose cameras (instead of specialized equipment) are used, their narrow field of view (FOV) and the fact that pictures are often captured in arbitrary directions may produce non-overlapping regions and distortion (e.g., ghost effect or misalignment) in the stitched 360° VR images.
Non-overlapping regions also show up owing to the limitations of the stitching module, because the derivation of the relationship between spatially neighboring pictures is based on RANSAC and the feature points of each picture. The non-overlapping regions become holes in the stitched image, where the shape of a hole depends on the structure of the neighboring warped pictures. A hole may look like a square or a lozenge because it is surrounded by the neighboring warped pictures. Examples of holes with various shapes are shown in Section 4.
In Figure 2, B_c^L is a block overlapping a hole in the left ERP. B_l^L and B_r^L are the left and right neighboring blocks of B_c^L, respectively. B_c^R, B_l^R, and B_r^R in the right ERP are the blocks corresponding to B_c^L, B_l^L, and B_r^L, respectively. The corresponding blocks in the left and right ERPs are located at the same positions.

Hole Detection
This paper proposes an efficient algorithm to conceal the holes, where the pixels in a hole of one ERP are filled with pixels of the other ERP, and the location of each pixel used is derived from the camera parameters of the neighboring blocks. Figure 3 explains the process used to detect a hole in an ERP image. As observed in Figure 2, because a hole is a region that has not been covered by the warped pictures during stitching, the color of the hole is black, with a luminance of zero. Thus, in step 1 of Figure 3, the colored ERP image is converted to a grayscale picture to efficiently detect the holes. In step 2, all of the non-black pixels are replaced by white pixels, as shown in Figure 3c, where the black and white pixels have gray levels of 0 and 255, respectively, for an image represented with 8 bits/pixel. This results in the candidate regions for holes. Note that not all of the pixels in dark objects have a zero value, even though some of them may. In order to classify the black pixels into hole and non-hole regions, a contour-finding algorithm [25] is applied to the image of Figure 3c in step 3, which provides several contours of various sizes. In the image of Figure 3d, the perimeter of the contour of a hole is longer than that of a non-hole region, because a non-hole region is part of a dark object and has the shape of an isolated particle. Therefore, if the length of a contour is shorter than a threshold, the corresponding dark region is removed, as shown in Figure 3e. Finally, the block overlapping the remaining dark region is set to B_c^L, as in Figure 3e.
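Steps 1-3 of the detection procedure can be sketched as follows. This is a simplification under stated assumptions: it filters candidate regions by connected-component area rather than by the contour perimeter of the algorithm in [25], and `detect_holes` and `min_area` are illustrative names, not the paper's code.

```python
import numpy as np
from collections import deque

def detect_holes(rgb, min_area=20):
    """Return a boolean mask of hole pixels in an (H, W, 3) image.
    Black pixels are hole candidates; small isolated black regions
    (pixels inside dark objects) are discarded."""
    gray = rgb.mean(axis=2)                  # step 1: convert to gray
    candidates = gray == 0                   # step 2: zero-luminance pixels
    holes = np.zeros_like(candidates)
    seen = np.zeros_like(candidates)
    h, w = candidates.shape
    for sy in range(h):
        for sx in range(w):
            if candidates[sy, sx] and not seen[sy, sx]:
                comp, q = [], deque([(sy, sx)])
                seen[sy, sx] = True
                while q:                      # BFS over one black region
                    y, x = q.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and candidates[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_area:     # step 3: keep large regions only
                    for y, x in comp:
                        holes[y, x] = True
    return holes
```

On a white test image with one 10 × 10 black square and one 2 × 2 black dot, only the square survives the size filter, mirroring how the perimeter threshold removes isolated dark-object particles.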

Camera Parameters of Neighboring Blocks
In Figure 1, when an ERP image is made, the stitching algorithm is applied to the set of pictures that have been taken by a single camera (the left or right camera). As explained in Section 2, each picture in a set of pictures with the left or right views is warped and translated according to the intrinsic matrix K and extrinsic matrix R of each picture. The following equations are the intrinsic and extrinsic matrixes of picture I i .
where elements f_x(I_i) and f_y(I_i) are the horizontal and vertical focal lengths of the camera used, respectively; (c_x(I_i), c_y(I_i)) is the location of the intersection between the Z axis of the world coordinate system and the plane of the image sensor; and s is a skew coefficient. In (2), R is a rotation matrix used to represent the direction of the camera lens when picture I_i is taken. K and R are derived from a homography matrix calculated by RANSAC. The parameters in the K's and R's of all the pictures are optimized by bundle adjustment based on the Levenberg-Marquardt algorithm [26]. The intrinsic and extrinsic matrixes of the neighboring blocks {B_l^L, B_r^L, B_l^R, B_r^R} can be represented in the same forms as (1) and (2). As the equations have the same pattern, we represent those for B_l^L only.
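The matrixes of (1) and (2) can be written out concretely as a small numpy sketch, assuming the R_x R_y R_z decomposition order that (15) uses later. `intrinsic` and `rotation` are illustrative names, not the paper's code.

```python
import numpy as np

def intrinsic(fx, fy, cx, cy, s=0.0):
    """K of Equation (1): focal lengths on the diagonal, skew s,
    principal point (cx, cy)."""
    return np.array([[fx, s,  cx],
                     [0., fy, cy],
                     [0., 0., 1.]])

def rotation(phi, sigma, theta):
    """R of Equation (2), built as the product R_x R_y R_z of Euler
    rotations about the x, y, and z axes (the order assumed in (15))."""
    c, s_ = np.cos, np.sin
    Rx = np.array([[1, 0, 0], [0, c(phi), -s_(phi)], [0, s_(phi), c(phi)]])
    Ry = np.array([[c(sigma), 0, s_(sigma)], [0, 1, 0], [-s_(sigma), 0, c(sigma)]])
    Rz = np.array([[c(theta), -s_(theta), 0], [s_(theta), c(theta), 0], [0, 0, 1]])
    return Rx @ Ry @ Rz
```

Any R built this way is a proper rotation (orthonormal with determinant 1), which is what bundle adjustment enforces when it refines the extrinsic parameters.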

Concealment Based on Intrinsic and Extrinsic Matrixes
In this subsection, we derive the relationship between the stereo views, where the camera parameters of the warped pictures are utilized. In Figure 4, I L and I R are the warped images after they have been taken by the left and right cameras, respectively. Based on the geometric relation between the camera coordinate and world coordinate systems [27], the (X, Y, Z) of point P in the world coordinate system is mapped to the (x L , y L ) and (x R , y R ) of points p L and p R in I L and I R , respectively. The relation is represented as follows.
Combining (5) and (6) gives (7). Therefore, the relation between the pixels in I_L and I_R can be represented with a 3 × 3 matrix H(I_L → I_R), as in (8)-(10); in closed form, H(I_L → I_R) = K(I_R) R(I_R) R(I_L)^-1 K(I_L)^-1. The H(I_L → I_R) can be calculated when K(I_R), R(I_R), R(I_L), and K(I_L) are known. If I_L and I_R are replaced by B_c^L and B_c^R, respectively, then (10) becomes (11). In the proposed algorithm based on (11), each pixel in a hole is filled with a pixel in the right ERP, where H(B_c^L → B_c^R) provides the location of the pixel in the right ERP that replaces a particular pixel in the hole. This means that we can conceal the hole if R(B_c^R), K(B_c^R), R(B_c^L), and K(B_c^L) are known. On the other hand, when the hole is found in the right ERP, Equation (12), with the roles of the left and right views exchanged, is used instead of (11). Figure 5 shows an example of concealing each pixel in a hole when the hole is found in the left ERP.
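Assuming the rotation-only closed form H(I_L → I_R) = K(I_R) R(I_R) R(I_L)^-1 K(I_L)^-1 implied by the text (H is computable from exactly these four matrixes), the per-pixel concealment mapping can be sketched as below. `view_homography` and `map_pixel` are hypothetical names for illustration.

```python
import numpy as np

def view_homography(K_L, R_L, K_R, R_R):
    """H(I_L -> I_R) = K(I_R) R(I_R) R(I_L)^-1 K(I_L)^-1; replacing the
    images I with the blocks B_c gives the H of (11)."""
    return K_R @ R_R @ np.linalg.inv(R_L) @ np.linalg.inv(K_L)

def map_pixel(H, x, y):
    """Location in the right view used to fill hole pixel (x, y) of the left view."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w                 # dehomogenize
```

A useful sanity check: with identical camera parameters the mapping is the identity, and H(L → R) composed with H(R → L) cancels out, which is exactly the symmetry the paper exploits when it switches from (11) to (12) for holes in the right ERP.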

Prediction for Camera Parameters
This subsection explains the method used to predict the matrixes in (11). As shown in Figure 2, because B_c^L includes a hole, R(B_c^L) and K(B_c^L) are unknown. In addition, because B_c^R consists of fractions of multiple pictures in the right view, K(B_c^R) and R(B_c^R) should be recalculated instead of reusing those resulting from the process of making the right ERP. In order to predict them, the homography matrixes between the pairs of corresponding neighboring blocks are derived by applying feature extraction, RANSAC, and bundle adjustment to those pairs. Figure 6 shows the examples used to estimate H(B_l^L → B_l^R) and H(B_r^L → B_r^R), which have the relationships given in (13) and (14).
First, from the homography matrixes in (13) and (14), the intrinsic and extrinsic matrixes of the neighboring blocks are derived by solving the simultaneous equations related to the focal length.
Second, R(B_c^L) and R(B_c^R) are estimated from {R(B_l^L), R(B_r^L)} and {R(B_l^R), R(B_r^R)}, respectively, which were obtained in the previous step. In [28], a rotation matrix R can be represented as a combination of the rotational components R_x, R_y, and R_z about the x, y, and z axes, as follows.
where φ, σ, and θ are the Euler angles about the x, y, and z axes, respectively. Applying the decomposition in (15) to the rotation matrix of each block gives (16), where i = {l, c, r} and j = {L, R}. In (16), φ(B_i^j), σ(B_i^j), and θ(B_i^j) are the Euler angles about the x, y, and z axes in R(B_i^j), respectively. As observed in Figure 2, because B_c^L and B_c^R are located at the centers of {B_l^L, B_r^L} and {B_l^R, B_r^R}, respectively, we can assume that their Euler angles have the relationships given in (17), i.e., each Euler angle of a center block is the average of the corresponding angles of its left and right neighbors. Third, K(B_c^L) and K(B_c^R) are estimated from {K(B_l^L), K(B_r^L)} and {K(B_l^R), K(B_r^R)}, respectively, which were derived in the first step. As we can see in (1), because the K matrixes consist of the focal length, aspect ratio, and center point, K(B_c^L) and K(B_c^R) can be derived by averaging the K's of the neighboring blocks, as follows.
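The decomposition of (15)-(16) and the center-block prediction can be sketched as follows, assuming the R = R_x(φ) R_y(σ) R_z(θ) order and that (17) averages the neighbors' Euler angles; `euler_from_rotation` and `predict_center_rotation` are illustrative names.

```python
import numpy as np

def euler_from_rotation(R):
    """Recover (phi, sigma, theta) from R = R_x(phi) R_y(sigma) R_z(theta).
    For this order, R[0,2] = sin(sigma), R[1,2] = -sin(phi)cos(sigma),
    R[2,2] = cos(phi)cos(sigma), R[0,1] = -cos(sigma)sin(theta)."""
    sigma = np.arcsin(np.clip(R[0, 2], -1.0, 1.0))
    phi = np.arctan2(-R[1, 2], R[2, 2])
    theta = np.arctan2(-R[0, 1], R[0, 0])
    return phi, sigma, theta

def predict_center_rotation(R_left_nb, R_right_nb):
    """Predict R(B_c) from the two horizontal neighbors by averaging their
    Euler angles, per the assumption of (17) that B_c lies midway between
    B_l and B_r, then recomposing R_x R_y R_z."""
    phi, sigma, theta = [(a + b) / 2 for a, b in
                         zip(euler_from_rotation(R_left_nb),
                             euler_from_rotation(R_right_nb))]
    c, s = np.cos, np.sin
    Rx = np.array([[1, 0, 0], [0, c(phi), -s(phi)], [0, s(phi), c(phi)]])
    Ry = np.array([[c(sigma), 0, s(sigma)], [0, 1, 0], [-s(sigma), 0, c(sigma)]])
    Rz = np.array([[c(theta), -s(theta), 0], [s(theta), c(theta), 0], [0, 0, 1]])
    return Rx @ Ry @ Rz
```

Decomposing a rotation built from known angles returns those angles, and the predicted center rotation carries exactly the averaged angles, which is the property the prediction step relies on.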
Substituting the estimated matrixes K(B_c^L), K(B_c^R), R(B_c^L), and R(B_c^R) into (11) gives H(B_c^L → B_c^R), which can be used to conceal each pixel in a hole.

The main characteristics of the proposed algorithm are as follows. First, the proposed algorithm conceals the holes in stereoscopic VR images, whereas most conventional hole-filling algorithms [16][17][18][19][20][21][22][23][24][29] were proposed for non-VR images. In particular, certain concealment algorithms [30][31][32] were proposed to fill the holes in depth images, where the holes have arbitrary shapes and some of them are small particles. Second, our algorithm can derive the location of the most similar pixel for each pixel in a hole. In our scenario, we use the data of the camera stream of the other view to fill the holes, where the exact location of the most similar part should be calculated to increase the objective and subjective quality of the filled region. The position is estimated by analyzing the Euler angles of the extrinsic matrixes of the related neighboring blocks.
In the analysis of Euler angles, we decompose the angles along x, y, and z axes and then merge them to find the exact location of the reference data. The relevant explanation is provided in Section 3.4.

Simulation Results
In order to demonstrate the performance of the proposed algorithm, we compare it with various conventional methods, including those of Bertalmio [16], Telea [17], Criminisi [18], Liu [19], GiliSoft Stamp Remover [33], and Theinpaint [34] from the subjective and objective viewpoints. These methods were explained in Section 1.
We captured picture sets using the built-in camera of a Samsung Galaxy 9 smartphone. There were more than 100 pictures in each set. The pictures within a single set were stitched using APIS to create the stereoscopic VR images. Figure 8 shows the left or right views of the stereoscopic VR images having holes of various sizes and locations. The VR images including the holes were used as test sets to evaluate the performances of the proposed algorithm and the various conventional methods. In Figure 8a-f, the holes were generated by the stitching algorithm for the reason described in Section 2. Those pictures are used only for subjective evaluation, because there is no reference VR image without any hole. When we created Figure 8g,h, the stitched VR images did not have any holes; therefore, we made an artificial hole in each VR image. Consequently, Figure 8g,h can be used for objective evaluation, because the difference between the reference image (without any hole) and the concealed images (the results of the various algorithms) can be calculated numerically.

Subjective Performance
In order to check the performance of the proposed algorithm, we applied it to the test sets Brick Road, Alleyway, Tree, Bench, and Street. In Figure 9, the VR images including the holes are compared with the concealed images. As shown in Figure 9, the proposed algorithm conceals the various holes in the test images effectively.

To compare the performances of the various techniques subjectively, the pictures whose holes have been concealed by each method are shown in Figure 11, where all of the pictures are part of the left ERP picture and a hole was made artificially, as in Figure 10b, in order to compare the concealed pictures in Figure 11a-h. As observed in Figure 11, the proposed algorithm outperforms the other techniques subjectively.

In Figure 11a, the result from Bertalmio [16], the pixels in the hole are filled by copying the neighboring pixels horizontally or vertically; thus, the concealed region has artifact lines. When the method of Telea [17] is used to conceal the hole, each pixel in the hole is concealed progressively from the boundary to the center of the hole, where the value of the concealed pixel is set to the weighted sum of the values of the neighboring pixels. Thus, the concealed region of Figure 11b has smoothed and diffused distortion. In Figure 11c, when the method of Criminisi [18] is used, the pixels in the hole are filled block by block using template matching. In this method, the SSD of the template part is used as the cost function to estimate the most similar block, and the values of the pixels in the non-template part are not considered. Thus, some pixels in the hole are filled with wrong values. As shown in Figure 11d, the method of Liu [19] based on deep learning filled the hole with inappropriate blocks generated by the deep-learning engine. Figure 11e,f resulted from GiliSoft Stamp Remover [33] and Theinpaint [34], respectively; as shown there, these methods do not remove the holes completely. In Figure 11g, the hole in the left ERP is concealed simply by copying a block in the right ERP, where the block including the hole in the left ERP and the copied block in the right ERP have the same shape and coordinates. As observed in Figure 11g, the concealed region has significant mismatches around the hole, because this method does not consider the disparity between the left and right views. Whereas the conventional techniques produce serious artifacts in the concealed region, the image in Figure 11h produced by the proposed algorithm has a natural and continuous boundary in the concealed region.

Figures 12 and 13 show the simulation results for indoor pictures, where the size of the hole is bigger than in Figures 10 and 11. Figure 13 shows the concealed images resulting from the various algorithms for a large hole in the test set Hallway: (a) Bertalmio [16]; (b) Telea [17]; (c) Criminisi [18]; (d) Liu [19]; (e) GiliSoft Stamp Remover [33]; (f) Theinpaint [34]; (g) simple copy; (h) the proposed method. As can be seen by comparing Figures 12 and 13 with Figures 10 and 11, the performance tendencies of the various algorithms do not change even if the size of the hole is varied.

Figure 14 shows the simulation results for the naturally generated holes that occurred non-artificially during the stitching procedure. Note that no reference picture was used in this test. In this configuration, both the left and right ERP images had holes at different locations, and the concealment algorithms were applied to the left and right ERP pictures independently. The results of the conventional and proposed techniques are shown in Figure 14b-g and Figure 14h, respectively. When the proposed algorithm was applied, Equations (11) and (12) were used to conceal the holes in the left and right ERPs, respectively. As observed in Figure 14, the proposed method outperformed the other conventional techniques subjectively. In addition, the performance tendencies of the techniques were the same as those in Figures 10-13.


Objective Performance
The objective performances of the concealment algorithms were evaluated with common measures: the structural similarity index (SSIM) [35], the peak signal-to-noise ratio (PSNR) [36], and the consumed CPU time. All experiments were run on a PC with an AMD Ryzen CPU (3.40 GHz) and 32 GB of RAM. The SSIM and PSNR reported are the averages of the SSIM and PSNR values for the red, green, and blue components. SSIM is in the range of 0-1.
As the values of SSIM and PSNR increase, the quality of the resulting image increases. Note that SSIM and PSNR can be evaluated when the reference pictures are given, as in Figures 10 and 11 and Figures 12 and 13.
On the other hand, the CPU times consumed by the methods were checked using a personal computer. The CPU time was measured with the resolution of a second. As the CPU time depended on the complexity of the algorithm, the technique consuming the smallest CPU time was considered the simplest method.
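The PSNR measure averaged over the red, green, and blue components can be sketched as below. This is the standard PSNR definition, not the authors' exact evaluation code; SSIM is omitted for brevity, and `psnr` and `rgb_psnr` are illustrative names.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """PSNR of one channel: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def rgb_psnr(ref, test):
    """Average of the per-channel PSNRs, as reported for the objective tables."""
    return np.mean([psnr(ref[..., c], test[..., c]) for c in range(3)])
```

For example, a concealed image whose pixels all differ from the reference by 10 gray levels has a per-channel MSE of 100 and hence a PSNR of about 28.1 dB; higher values indicate a better concealment result.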
The outdoor pictures (Mozart Hall) shown in Figures 10 and 11 are evaluated in Tables 1 and 2, where the best and 2nd-best results are represented with red and blue numbers, respectively. As seen in Table 1, the proposed algorithm produced the best quality in terms of both SSIM and PSNR. Theinpaint [34] had the 2nd-best quality for both SSIM and PSNR. The "simple copy" required the smallest CPU time, because it fills the hole just by copying the same-size block of the other view; the proposed algorithm was also among the fastest. When the method of Liu [19] was used, the consumed CPU time could not be measured, because the training time was huge and depended on the quantity of the training data. Table 2 shows the performances of the proposed algorithm for a variety of sizes of the neighboring blocks {B_l^L, B_r^L, B_l^R, B_r^R}, where the numbers in the first column denote the ratio between the widths of neighboring block B_l^L and block B_c^L that includes the hole. As observed in this table, the best performance is found when the ratio is 1.5 (i.e., the width of B_l^L is 50% wider than that of B_c^L). When the size of the neighboring blocks is small (the ratio is under 1.5), the performance of the proposed algorithm increases with the size, because the accuracy of H(B_c^L → B_c^R) increases as the number of pixels utilized to derive it increases. However, when the size is large (the ratio is over 1.5), the neighboring blocks include fractions of multiple pictures, which degrades the accuracy of H(B_c^L → B_c^R). In this table, the complexity increases with the width of the neighboring blocks, because the number of pixels to be considered in computing H(B_l^L → B_l^R) and H(B_r^L → B_r^R) increases. Tables 3 and 4 show the simulation results for the indoor pictures (Hallway) shown in Figures 12 and 13. Note that the hole in Figures 12 and 13 is larger than that in Figures 10 and 11.
As observed in Tables 3 and 4, the performance tendencies of the methods are similar to those seen in Tables 1 and 2. As observed in Table 3, the proposed algorithm has the best performance. GiliSoft [33] and Theinpaint [34] had the 2nd-best quality for PSNR and SSIM, respectively. The 2nd-best performance occurs at different ratios in Tables 2 and 4, because the pictures in Figures 12 and 13 include more complex regions and larger holes than those in Figures 10 and 11. From the results in Tables 1-4, we can see that the proposed algorithm outperforms the conventional methods objectively.

Conclusions
We proposed an efficient algorithm to conceal the holes generated in stereo VR pictures, where the positions of the pixels used to fill a hole are derived from the camera parameters. The camera parameters are predicted from their relationship with the homography matrix. Whereas conventional inpainting algorithms are constrained in concealing such holes because they were designed for non-VR images, the proposed method efficiently fills all of the pixels in a hole using the relationship between the left and right ERPs.