Sensors
  • Article
  • Open Access

26 June 2024

Neural Colour Correction for Indoor 3D Reconstruction Using RGB-D Data

1. Institute of Electronics and Informatics Engineering of Aveiro (IEETA), Intelligent System Associate Laboratory (LASI), University of Aveiro, 3810-193 Aveiro, Portugal
2. Department of Electronics, Telecommunications and Informatics (DETI), University of Aveiro, 3810-193 Aveiro, Portugal
3. Department of Mechanical Engineering (DEM), University of Aveiro, 3810-193 Aveiro, Portugal
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue 3D Reconstruction with RGB-D Cameras and Multi-sensors

Abstract

With the rise in popularity of different human-centred applications using 3D reconstruction data, the problem of generating photo-realistic models has become an important task. In a multiview acquisition system, particularly for large indoor scenes, the acquisition conditions will differ across the environment, causing colour differences between captures and unappealing visual artefacts in the produced models. We propose a novel neural-based approach to colour correction for indoor 3D reconstruction. It is a lightweight and efficient approach that can be used to harmonize colour from sparse captures over complex indoor scenes. Our approach uses a fully connected deep neural network to learn an implicit representation of the colour in 3D space, while capturing camera-dependent effects. We then leverage this continuous function as reference data to estimate the required transformations to regenerate pixels in each capture. Experiments to evaluate the proposed method on several scenes of the MP3D dataset show that it outperforms other relevant state-of-the-art approaches.

1. Introduction

Three-dimensional (3D) reconstruction is the creation of 3D models from the captured shape and appearance of real objects. It is a field with roots in several areas of computer vision and graphics, and it has gained importance in other domains, such as architecture, robotics, autonomous driving, medicine, agriculture, and archaeology. With the rise in popularity of different human-centred applications requiring photo-realistic 3D reconstruction, such as virtual reality (VR) and augmented reality (AR) experiences, the problem of generating high-fidelity models has become an important task.
In a multiview acquisition system, particularly for large indoor scenes, the acquisition conditions, such as lighting, exposure, white balance, and other camera parameters, will differ across the environment, causing colour differences between captures and unappealing visual artefacts in the produced models. Different colour correction methodologies have been proposed for image stitching algorithms [1,2,3]. The global optimization strategy is widely used, and the colour correspondences are typically calculated based on the overlapping area between the images. When it comes to 3D reconstruction, especially through sparse RGB-D captures, this type of methodology cannot be directly applied, as the poses of the cameras vary significantly in 6DOF within the environment. In this case, dense matching can be employed [4], but it is a very resource-intensive process.
We propose a neural-based approach, using a multilayer perceptron (MLP) to learn an implicit representation of the colour in 3D space, while capturing camera-dependent effects. Then, to estimate the transformations required to regenerate pixels in each capture, one smaller MLP is trained using information provided by the larger MLP as reference colour. In the context of 3D reconstruction, even with a reduced number of pixel correspondences between images, a global optimization has scalability issues, as the number of image combinations grows exponentially and processing the captures required for an accurate reconstruction of an indoor scene can quickly become unfeasible. Conversely, our approach uses lightweight representations and is efficient in the integration process, using surface information derived from RGB-D data to create the continuous function of colour. Experiments to evaluate the proposed method on several scenes of the MP3D dataset show that it outperforms other relevant state-of-the-art approaches.
The remainder of this document is organized as follows: in Section 2, the related work is presented; in Section 3, the proposed methodology is described; in Section 4, the results are showcased and discussed; finally, Section 5 provides concluding remarks.

3. Proposed Approach

We propose a novel approach to tackle the problem of colour consistency correction, not by calculating mapping functions between pairs of images, nor by global optimization of camera models. Inspired by recent advances in neural implicit functions, such as DeepSDF [31], Fourier features [32], and NeRF [33], we represent a continuous scene as a 6D vector-valued function whose input is a 3D point location $(x_p, y_p, z_p)$ and a 3D camera position $(x_c, y_c, z_c)$, both under world coordinates, and whose output is a colour $(r, g, b)$. We approximate this continuous 6D scene representation with an MLP network,

$$E_\Theta : (x_p, y_p, z_p, x_c, y_c, z_c) \rightarrow (r, g, b)$$
and optimize its weights $\Theta$ to map from each input 6D coordinate to its corresponding colour; see Figure 1. The loss is a simple mean squared error (MSE) between each pixel in each image and the colour predicted by the network for the corresponding 3D point as seen by the specific camera,
$$\mathcal{L} = \sum_{c \in C} \sum_{p \in P_c} \left\| \hat{C}(p, c) - C(p, c) \right\|_2^2$$
where $C$ is the set of cameras of the dataset, $P_c$ is the subset of 3D points that are seen by camera $c$, $\hat{C}(p, c)$ is the estimated colour, and $C(p, c)$ is the reference colour for the pixel corresponding to the projection of point $p$ in camera $c$.
Figure 1. Overview of the training of our continuous scene representation using an MLP to approximate the colour in 3D space. Blue arrow represents a discrete sampling of the implicit function.
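For illustration, the following is a minimal PyTorch sketch of this training objective; it assumes a model that maps a (point, camera) pair to a colour, the function and tensor names are our own, and the positional encoding and exact architecture discussed below are omitted here.

```python
# Minimal sketch (our naming) of the scene-colour training objective.
# The paper sums the squared error over all cameras and their visible
# points; here the loss is averaged over a mini-batch for SGD training.
import torch
import torch.nn as nn

def scene_colour_loss(model: nn.Module,
                      points: torch.Tensor,      # (N, 3) 3D point locations
                      cameras: torch.Tensor,     # (N, 3) camera positions
                      target_rgb: torch.Tensor   # (N, 3) reference pixel colours
                      ) -> torch.Tensor:
    """model maps (3D point, camera position) -> (r, g, b)."""
    pred = model(points, cameras)
    return ((pred - target_rgb) ** 2).sum(dim=-1).mean()
```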
However, as discussed in [32], deep networks are biased towards learning lower-frequency functions, making it impossible to capture the high-frequency detail of the textures in 3D space. Therefore, we map our input to a higher dimensional space using the positional encoding:
$$\gamma(v) = \left( v, \sin(2^0 \pi v), \cos(2^0 \pi v), \ldots, \sin(2^{L-1} \pi v), \cos(2^{L-1} \pi v) \right)$$
where $v$ corresponds to each individual component of the input $(x_p, y_p, z_p, x_c, y_c, z_c)$. The maximum frequency, $L$, was set to 10 for the point location, $(x_p, y_p, z_p)$, and 4 for the camera location, $(x_c, y_c, z_c)$.
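A straightforward implementation of this encoding could look as follows; this is a sketch in which the function name and batching are our own, with $L = 10$ for point coordinates and $L = 4$ for camera coordinates as stated above.

```python
# Positional encoding gamma(v), applied independently to each input component.
import math
import torch

def positional_encoding(v: torch.Tensor, num_freqs: int) -> torch.Tensor:
    """Map v to (v, sin(2^0 pi v), cos(2^0 pi v), ..., sin(2^(L-1) pi v), cos(2^(L-1) pi v))."""
    out = [v]
    for k in range(num_freqs):
        out.append(torch.sin((2.0 ** k) * math.pi * v))
        out.append(torch.cos((2.0 ** k) * math.pi * v))
    return torch.cat(out, dim=-1)

# Example: encode a batch of point and camera coordinates.
xyz_p = torch.rand(1024, 3)             # 3D point locations
xyz_c = torch.rand(1024, 3)             # camera positions
enc_p = positional_encoding(xyz_p, 10)  # 3 * (1 + 2*10) = 63 dimensions
enc_c = positional_encoding(xyz_c, 4)   # 3 * (1 + 2*4)  = 27 dimensions
```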
We follow an architecture similar to DeepSDF [31] and NeRF [33], including a skip connection that concatenates the 3D point position to the fifth layer’s activation, as this seems to improve learning; see Figure 2. Additionally, by concatenating the camera position only towards the end of the network, we encourage the network to interpolate colour between cameras, capture colour shared across views, and more effectively complement information for regions of 3D space not seen from a given camera. Unlike radiance field models, which rely on consistent colour information across views to synthesize realistic colour, we use the camera information to capture differences between acquisition conditions, so that we may generate colour for any 3D point as if it were seen by a particular camera. In those approaches, volumetric rendering has become common, but since our goal is efficient colour harmonization for images used in texture mapping of 3D mesh models, we choose instead to use the geometric information provided by RGB-D captures, mapping colour directly to the surface coordinates. This way, we avoid ray sampling and all the computation required to find empty space, which, for indoor reconstruction, is most of the scene.
Figure 2. Visualization of our neural network architecture for implicit scene representation. Input vectors are shown in green. The number inside each block signifies the vector’s dimension. All layers are standard fully-connected layers, blue arrows indicate layers with ReLU activations, dashed black arrows indicate layers with sigmoid activation, and “+” denotes vector concatenation.
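A hedged sketch of such a network is shown below; the hidden widths, layer counts, and exact placement of the skip connection are assumptions for illustration, and the authoritative dimensions are those in Figure 2.

```python
# Sketch of the implicit scene network: a DeepSDF/NeRF-style skip connection
# re-injects the encoded point mid-network, and the encoded camera position
# is concatenated only near the output. Widths/depths are assumed values.
import torch
import torch.nn as nn

class ImplicitSceneNet(nn.Module):
    def __init__(self, point_dim: int = 63, cam_dim: int = 27, hidden: int = 256):
        super().__init__()
        self.early = nn.Sequential(                       # first block of layers
            nn.Linear(point_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.late = nn.Sequential(                        # skip: encoded point re-concatenated
            nn.Linear(hidden + point_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Sequential(                        # camera information enters only here
            nn.Linear(hidden + cam_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),      # (r, g, b)
        )

    def forward(self, enc_point: torch.Tensor, enc_cam: torch.Tensor) -> torch.Tensor:
        h = self.early(enc_point)
        h = self.late(torch.cat([h, enc_point], dim=-1))
        return self.head(torch.cat([h, enc_cam], dim=-1))
```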
For each capture, we estimate a colour mapping function that will transform the RGB from the image to a set of colours consistent with the implicit representation of the colour in 3D space. This is achieved by training one smaller MLP per image, which will approximate the transformation required to regenerate the pixels for a particular capture,
$$D_\Theta : (r, g, b, x, y) \rightarrow (\hat{r}, \hat{g}, \hat{b})$$
where $x, y$ correspond to the column and row of the pixel with colour $(r, g, b)$ in a given capture, and $\hat{r}, \hat{g}, \hat{b}$ represent the estimated values of the corrected colour; see Figure 3. Since these networks map from one texture to another, they do not require positional encoding to lift the input to a higher dimensionality; see Figure 4. This approach makes our method robust to decimation of the depth data used to train the implicit scene representation. When regenerating pixels, these smaller MLPs recover texture detail that may have been lost in the continuous scene representation, requiring only enough information to estimate a colour transfer from each capture to a corrected version that matches the colour learned by the MLP at the sampled points. The loss function for one MLP, learning the colour transformation for a capture, is
$$\mathcal{L} = \sum_{i \in I} \left\| \hat{C}(i) - C(p, c) \right\|_2^2$$
where $I$ is the set of pixels in a capture, $\hat{C}(i)$ is the estimated colour for pixel $i$, and $C(p, c)$ is the reference colour for the pixel, obtained from the continuous scene representation $E_\Theta$ by providing the corresponding 3D point $p$ and a reference camera $c$.
Figure 3. Overview of the training of our colour mapping function approximators, using the continuous scene representation as reference. The MLP previously trained to estimate colour in 3D space is used to provide ground truth for each of the smaller MLPs that will hold the approximation to regenerate pixels in the captures. Blue arrow represents the final output of the system.
Figure 4. Visualization of our neural network architecture for individual capture colour transformation. Input vectors are shown in green. The number inside each block signifies the vector’s dimension. All layers are standard fully-connected layers, blue arrows indicate layers with ReLU activations, dashed black arrows indicate layers with sigmoid activation.
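To make the second stage concrete, the following sketch trains one such correction MLP for a single capture against reference colours queried from the frozen scene network; the architecture, optimiser, learning rate, and epoch count are illustrative assumptions, and only pixels with a known 3D correspondence are assumed to be used.

```python
# Sketch of one per-capture correction MLP D_Theta and its training loop.
import torch
import torch.nn as nn

class CaptureCorrectionMLP(nn.Module):
    """Maps a pixel's (r, g, b, x, y) to a corrected (r, g, b)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, rgbxy: torch.Tensor) -> torch.Tensor:
        return self.net(rgbxy)

def train_capture_correction(scene_net: nn.Module,
                             enc_points: torch.Tensor,   # encoded 3D points of the capture's pixels
                             enc_ref_cam: torch.Tensor,  # encoded reference camera, repeated per pixel: (N, cam_dim)
                             rgbxy: torch.Tensor,        # (N, 5) per-pixel (r, g, b, x, y)
                             epochs: int = 200) -> CaptureCorrectionMLP:
    with torch.no_grad():                                # reference colours from the frozen E_Theta
        target = scene_net(enc_points, enc_ref_cam)
    model = CaptureCorrectionMLP()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(rgbxy) - target) ** 2).sum(dim=-1).mean()
        loss.backward()
        opt.step()
    return model
```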

4. Results

In order to quantitatively assess colour consistency within the image sets, we adopt the commonly used PSNR and CIEDE2000 metrics. We calculate the colour similarity between images using the pairs of corresponding pixels, which we compute robustly using geometric data. Since both PSNR and CIEDE2000 are pairwise image metrics, we carry out the evaluation over each image pair in the set. The displayed score values include the mean and standard deviation for all image pairs. Additionally, we present a weighted mean based on the number of pixel correspondences.
We adopted CIEDE2000 [34], the most recent colour-difference metric from the International Commission on Illumination (CIE). This metric measures colour dissimilarity, meaning that lower score values correspond to more similar colours. The CIEDE2000 metric can be formulated as follows:
$$\mathrm{CIEDE2000} = \sqrt{\Delta L^2 + \Delta C_{ab}^2 + \Delta H_{ab}^2 + R_T \cdot \Delta C_{ab} \cdot \Delta H_{ab}}$$
where $\Delta L$, $\Delta C_{ab}$, and $\Delta H_{ab}$ are the lightness, chroma, and hue differences, respectively, and $R_T$ is a hue rotation term.
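As an illustration of how CIEDE2000 can be computed over a set of corresponding pixel pairs, the sketch below uses scikit-image; this is an assumed implementation choice, not necessarily the one used to produce the reported scores.

```python
# Mean CIEDE2000 over corresponding pixel pairs (RGB assumed in [0, 1]).
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

def mean_ciede2000(rgb_a: np.ndarray, rgb_b: np.ndarray) -> float:
    """rgb_a, rgb_b: (N, 3) arrays of corresponding pixel colours."""
    lab_a = rgb2lab(rgb_a.reshape(1, -1, 3)).reshape(-1, 3)
    lab_b = rgb2lab(rgb_b.reshape(1, -1, 3)).reshape(-1, 3)
    return float(np.mean(deltaE_ciede2000(lab_a, lab_b)))
```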
PSNR [35] measures colour similarity, meaning higher score values equate to more similar colour measurements. The PSNR formula is given by
$$\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{L^2}{\mathrm{MSE}}\right)$$
where $L$ is the largest possible value in the dynamic range of an image, and $\mathrm{MSE}$ is the mean squared error between the two images.
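Similarly, a sketch of the PSNR computation for one image pair over its corresponding pixels is given below, together with the correspondence-weighted mean used to aggregate scores across pairs; scikit-image and colours normalised to [0, 1] are assumptions.

```python
# PSNR over corresponding pixel pairs, plus a correspondence-weighted mean.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio

def pair_psnr(rgb_a: np.ndarray, rgb_b: np.ndarray) -> float:
    """rgb_a, rgb_b: (N, 3) arrays of corresponding pixel colours in [0, 1]."""
    return peak_signal_noise_ratio(rgb_a, rgb_b, data_range=1.0)

def weighted_mean(scores, n_correspondences) -> float:
    """Weight each image pair's score by its number of pixel correspondences."""
    return float(np.average(np.asarray(scores), weights=np.asarray(n_correspondences)))
```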
We showcase the results of our approach in three distinct scenes from the publicly available Matterport3D (MP3D) [36] dataset. We perform comparisons with several well-known state-of-the-art approaches, providing them with accurate and robust mappings between pixels, calculated by the reprojection of geometric data to each image. For [19], no pixel mappings are required, as colour histogram analogy is employed. In the case of pairwise approaches [7,11,14], we compute a graph of the image set and weight the edges using the number of pixel mappings between captures. The reference image is selected as the node with the largest centrality. We then compute the propagation paths by applying the shortest path algorithm, as suggested in [22,23].
Because of the nature of 3D reconstruction datasets, particularly those employing stationary sensors, captures can be sparse, with some image pairs presenting insufficient pixel correspondences to accurately calculate colour transfer. This can deteriorate results and cause excessive errors, particularly when applied through propagation paths. Furthermore, for optimization approaches [1,27], using every possible pair of images, regardless of the usefulness of their information, can quickly become unfeasible due to the exponential nature of combinations. Experimentally, we found that setting a small threshold relative to the resolution of the geometric data to avoid using image pairs with negligible information was beneficial for both the efficacy and efficiency of the tested colour correction approaches. For the results presented in this section, we discarded image pairs connected by less than 1% of the available geometric data.
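For reference, a sketch of this pair-graph set-up for the pairwise baselines is given below; networkx and weighted degree as the centrality measure are our assumptions, while the 1% threshold follows the text.

```python
# Pair graph for the pairwise baselines: weak pairs discarded, reference
# image chosen by centrality, propagation paths computed as shortest paths.
import networkx as nx

def build_propagation_paths(pair_correspondences: dict, n_geometric_points: int):
    """pair_correspondences: {(img_i, img_j): number_of_pixel_mappings}."""
    threshold = 0.01 * n_geometric_points                 # discard pairs below 1%
    G = nx.Graph()
    for (i, j), n in pair_correspondences.items():
        if n >= threshold:
            # Inverse count as path length so shortest paths favour strong overlaps.
            G.add_edge(i, j, weight=n, length=1.0 / n)
    strength = dict(G.degree(weight="weight"))            # weighted degree centrality
    reference = max(strength, key=strength.get)
    paths = nx.shortest_path(G, source=reference, weight="length")
    return reference, paths
```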
Pairwise methods [7,11,14] have problems with accumulated error, despite the filtering process applied. Optimization-based approaches [1,27] seem to have trouble balancing a large number of images, particularly from such different perspectives and with large colour differences. The local colour discrepancies also make it impossible for the camera models to correctly compensate for the effects in the captures. Results for the colour balancing using histogram analogy [19] are somewhat underwhelming; this may be attributed to the fact that this approach is intended for more significant colour transfer and is suitable for problems involving style transfer between pairs of images rather than precise harmonization within a set. Our approach, leveraging an implicit representation that takes into account the 3D space in which the textures will be applied, achieved the best results in all experiments; see Table 1, Table 2 and Table 3. As shown in Figure 5, Figure 6 and Figure 7, the proposed approach effectively harmonizes textures from different captures, significantly mitigating the appearance of texture seams. It should be noted that, while pixel regeneration is highly effective in creating consistent textures across the environment, it can sometimes result in a loss of high-frequency detail in certain areas. Several factors influence the final output, including the quality of registration between captures, the precision of sampling during training, the maximum frequency of positional encoding, the number of epochs, and other hyperparameters. These factors can lead to variations in texture detail and colour accuracy. Depending on the conditions of the input data, there is an inherent trade-off between the harmonization of the captures and the level of high-frequency information for particular regions.
Table 1. Evaluation of colour consistency correction in scene 2azQ1b91cZZ_7 of the MP3D dataset. We observe that our method improves in all metrics, highlighted in bold.
Table 2. Evaluation of colour consistency correction in scene VLzqgDo317F_19 of the MP3D dataset. We observe that our method improves in all metrics, highlighted in bold.
Table 3. Evaluation of colour consistency correction in scene aayBHfsNo7d_1 of the MP3D dataset. We observe that our method improves in all metrics, highlighted in bold.
Figure 5. Qualitative comparison of our method with relevant state-of-the-art in scene 2azQ1b91cZZ_7 of the MP3D dataset. We observe that our method presents a more visually appealing mesh, with significantly reduced texture seams [1,7,11,14,19,27].
Figure 6. Qualitative comparison of our method with relevant state-of-the-art in scene VLzqgDo317F_19 of the MP3D dataset. We observe that our method presents a more visually appealing mesh, with significantly reduced texture seams [1,7,11,14,19,27].
Figure 7. Qualitative comparison of our method with relevant state-of-the-art in scene aayBHfsNo7d_1 of the MP3D dataset. We observe that our method presents a more visually appealing mesh, with significantly reduced texture seams [1,7,11,14,19,27].

5. Conclusions

In this paper, we propose a novel neural-based method for colour consistency correction. We leverage an MLP to learn an implicit representation of the environment colour in 3D space, while capturing camera-dependent effects. We then use this network to train smaller MLPs, approximating the required transformations to regenerate pixels in the original captures, producing more consistent colour. Our method is efficient and offers better scalability than global optimization approaches, which are currently the most popular. Experiments show that the proposed solution outperforms other relevant state-of-the-art methods for colour correction.
The algorithm was designed within the scope of a 3D reconstruction pipeline, as a way to increase the quality of the generated models, making use of the available registered scanning information. However, this can limit its use in other applications. Also, since the proposed approach relies on thorough depth information, poor-quality data may undermine the effectiveness of the algorithm. Nevertheless, as depth sensors improve and become more widespread, we can expect our method’s performance to improve.
As future work, it would be interesting to explore anomaly detection techniques to identify points in 3D space with high variability in view-dependent colour. This approach could help pinpoint surfaces likely to exhibit more reflective properties. This information could be used to render the materials more accurately through physically-based rendering (PBR) and further enhance the colour correction results.

Author Contributions

Conceptualization, T.M., P.D., and M.O.; funding acquisition, P.D. and M.O.; methodology, T.M., P.D., and M.O.; software, T.M.; writing—original draft, T.M.; writing—review and editing, P.D. and M.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been funded by National Funds through the FCT - Foundation for Science and Technology, in the context of the Ph.D. scholarship 2020.07345.BD, with DOI 10.54499/2020.07345.BD, and project UIDB/00127/2020, supported by IEETA.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We thank everyone involved for their time and expertise.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to resolve spelling and grammatical errors. This change does not affect the scientific content of the article.

References

  1. Brown, M.; Lowe, D. Automatic Panoramic Image Stitching using Invariant Features. Int. J. Comput. Vis. 2007, 74, 59–73. [Google Scholar] [CrossRef]
  2. Xia, M.; Yao, J.; Xie, R.; Zhang, M.; Xiao, J. Color Consistency Correction Based on Remapping Optimization for Image Stitching. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada, 11–17 October 2017; pp. 2977–2984. [Google Scholar] [CrossRef]
  3. Li, L.; Li, Y.; Xia, M.; Li, Y.; Yao, J.; Wang, B. Grid Model-Based Global Color Correction for Multiple Image Mosaicking. IEEE Geosci. Remote Sens. Lett. 2021, 18, 2006–2010. [Google Scholar] [CrossRef]
  4. HaCohen, Y.; Shechtman, E.; Goldman, D.B.; Lischinski, D. Optimizing color consistency in photo collections. ACM Trans. Graph. 2013, 32, 1–10. [Google Scholar] [CrossRef]
  5. Pitie, F.; Kokaram, A.; Dahyot, R. N-dimensional probability density function transfer and its application to color transfer. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05), Washington, DC, USA, 17–20 October 2005; Volume 2, pp. 1434–1439. [Google Scholar] [CrossRef]
  6. Su, Z.; Deng, D.; Yang, X.; Luo, X. Color transfer based on multiscale gradient-aware decomposition and color distribution mapping. In Proceedings of the 20th ACM International Conference on Multimedia, Nara, Japan, 29 October–2 November 2012; pp. 753–756. [Google Scholar] [CrossRef]
  7. De Marchi, S. Polynomials arising in factoring generalized Vandermonde determinants: An algorithm for computing their coefficients. Math. Comput. Model. 2001, 34, 271–281. [Google Scholar] [CrossRef]
  8. Hwang, Y.; Lee, J.Y.; Kweon, I.S.; Kim, S.J. Color Transfer Using Probabilistic Moving Least Squares. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 3342–3349. [Google Scholar] [CrossRef]
  9. Liu, X.; Zhu, L.; Xu, S.; Du, S. Palette-Based Recoloring of Natural Images Under Different Illumination. In Proceedings of the 2021 IEEE 6th International Conference on Computer and Communication Systems (ICCCS), Chengdu, China, 23–26 April 2021; pp. 347–351. [Google Scholar] [CrossRef]
  10. Wu, F.; Dong, W.; Kong, Y.; Mei, X.; Paul, J.C.; Zhang, X. Content-based colour transfer. Comput. Graph. Forum 2013, 32, 190–203. [Google Scholar] [CrossRef]
  11. Finlayson, G.D.; Mackiewicz, M.; Hurlbert, A. Color Correction Using Root-Polynomial Regression. IEEE Trans. Image Process. 2015, 24, 1460–1470. [Google Scholar] [CrossRef]
  12. Hwang, Y.; Lee, J.Y.; Kweon, I.S.; Kim, S.J. Probabilistic moving least squares with spatial constraints for nonlinear color transfer between images. Comput. Vis. Image Underst. 2019, 180, 1–12. [Google Scholar] [CrossRef]
  13. Niu, Y.; Zheng, X.; Zhao, T.; Chen, J. Visually Consistent Color Correction for Stereoscopic Images and Videos. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 697–710. [Google Scholar] [CrossRef]
  14. Reinhard, E.; Adhikhmin, M.; Gooch, B.; Shirley, P. Color transfer between images. IEEE Comput. Graph. Appl. 2001, 21, 34–41. [Google Scholar] [CrossRef]
  15. Xiao, X.; Ma, L. Gradient-Preserving Color Transfer. Comput. Graph. Forum 2009, 28, 1879–1886. [Google Scholar] [CrossRef]
  16. Nguyen, R.M.; Kim, S.J.; Brown, M.S. Illuminant aware gamut-based color transfer. Comput. Graph. Forum 2014, 33, 319–328. [Google Scholar] [CrossRef]
  17. He, M.; Liao, J.; Chen, D.; Yuan, L.; Sander, P.V. Progressive Color Transfer With Dense Semantic Correspondences. ACM Trans. Graph. 2019, 38, 1–18. [Google Scholar] [CrossRef]
  18. Wu, Z.; Xue, R. Color Transfer With Salient Features Mapping via Attention Maps Between Images. IEEE Access 2020, 8, 104884–104892. [Google Scholar] [CrossRef]
  19. Lee, J.; Son, H.; Lee, G.; Lee, J.; Cho, S.; Lee, S. Deep color transfer using histogram analogy. Vis. Comput. 2020, 36, 2129–2143. [Google Scholar] [CrossRef]
  20. Li, Y.; Li, Y.; Yao, J.; Gong, Y.; Li, L. Global Color Consistency Correction for Large-Scale Images in 3-D Reconstruction. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 3074–3088. [Google Scholar] [CrossRef]
  21. Li, Y.; Fang, C.; Yang, J.; Wang, Z.; Lu, X.; Yang, M.H. Universal style transfer via feature transforms. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 385–395. [Google Scholar]
  22. Chen, C.; Chen, Z.; Li, M.; Liu, Y.; Cheng, L.; Ren, Y. Parallel relative radiometric normalisation for remote sensing image mosaics. Comput. Geosci. 2014, 73, 28–36. [Google Scholar] [CrossRef]
  23. Dal’Col, L.; Coelho, D.; Madeira, T.; Dias, P.; Oliveira, M. A Sequential Color Correction Approach for Texture Mapping of 3D Meshes. Sensors 2023, 23, 607. [Google Scholar] [CrossRef]
  24. Xiong, Y.; Pulli, K. Color matching of image sequences with combined gamma and linear corrections. In Proceedings of the International Conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 261–270. [Google Scholar]
  25. Yu, L.; Zhang, Y.; Sun, M.; Zhou, X.; Liu, C. An auto-adapting global-to-local color balancing method for optical imagery mosaic. ISPRS J. Photogramm. Remote Sens. 2017, 132, 1–19. [Google Scholar] [CrossRef]
  26. Xie, R.; Xia, M.; Yao, J.; Li, L. Guided color consistency optimization for image mosaicking. ISPRS J. Photogramm. Remote Sens. 2018, 135, 43–59. [Google Scholar] [CrossRef]
  27. Moulon, P.; Duisit, B.; Monasse, P. Global multiple-view color consistency. In Proceedings of the Conference on Visual Media Production, London, UK, 30 November–1 December 2013. [Google Scholar]
  28. Shen, T.; Wang, J.; Fang, T.; Zhu, S.; Quan, L. Color Correction for Image-Based Modeling in the Large. In Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan, 20–24 November 2016. [Google Scholar]
  29. Park, J.; Tai, Y.W.; Sinha, S.N.; Kweon, I.S. Efficient and Robust Color Consistency for Community Photo Collections. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 430–438. [Google Scholar] [CrossRef]
  30. Yang, J.; Liu, L.; Xu, J.; Wang, Y.; Deng, F. Efficient global color correction for large-scale multiple-view images in three-dimensional reconstruction. ISPRS J. Photogramm. Remote Sens. 2021, 173, 209–220. [Google Scholar] [CrossRef]
  31. Park, J.J.; Florence, P.; Straub, J.; Newcombe, R.; Lovegrove, S. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 165–174. [Google Scholar]
  32. Tancik, M.; Srinivasan, P.P.; Mildenhall, B.; Fridovich-Keil, S.; Raghavan, N.; Singhal, U.; Ramamoorthi, R.; Barron, J.T.; Ng, R. Fourier features let networks learn high frequency functions in low dimensional domains. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Online, 6–12 December 2020. [Google Scholar]
  33. Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 2021, 65, 99–106. [Google Scholar] [CrossRef]
  34. Sharma, G.; Wu, W.; Dalal, E.N. The CIEDE2000 color-difference formula: Implementation notes, supplementary test data, and mathematical observations. Color Res. Appl. 2005, 30, 21–30. [Google Scholar] [CrossRef]
  35. Xu, W.; Mulligan, J. Performance evaluation of color correction approaches for automatic multi-view image and video stitching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 263–270. [Google Scholar] [CrossRef]
  36. Chang, A.; Dai, A.; Funkhouser, T.; Halber, M.; Niessner, M.; Savva, M.; Song, S.; Zeng, A.; Zhang, Y. Matterport3D: Learning from RGB-D Data in Indoor Environments. In Proceedings of the International Conference on 3D Vision (3DV), Qingdao, China, 10–12 October 2017. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
