MetaSCUT: Large-Scale Scene Simulation Based on 3D Gaussian Splatting and a Universal Physics Engine

School of Future Technology, South China University of Technology
1st Semester 2024-2025


Abstract


In the fields of digital twins and virtual reality, the accurate reconstruction of, and interaction with, large-scale scenes have become key research focuses. In this paper, we present a digital-twin reconstruction of a campus and implement interactive features within the reconstructed scene. For the reconstruction, we use a self-collected aerial dataset in which each campus building is divided into several zones, each containing hundreds of multi-view images. We experimented with two mesh reconstruction methods. The first extracts a mesh from Gaussian splatting: we introduce a regularization term that aligns the Gaussians with the scene surface, then apply Poisson reconstruction to efficiently extract an accurate mesh. The second is inspired by divide-and-conquer training of Gaussian primitives, combined with a level-of-detail strategy. In our experiments, the first method outperformed the second. For the interactive component, we primarily used Blender, implementing features such as a physics-simulated vehicle driving through the virtual campus. We also explored the Genesis physics simulation engine to control the movement of multiple robotic arms; however, because Genesis is still in early development and cannot yet load our reconstructed scene, we were unable to integrate it into the virtual campus. In conclusion, we have developed an explorable campus scene that combines large-scale reconstruction with interactive features, offering a novel solution for precise reconstruction and interaction within large-scale environments.
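
For reference, below is a minimal sketch of the Genesis experiment, following the engine's public quickstart API. The asset path, arm placement, and joint target are illustrative assumptions rather than our exact setup, and the simulation runs in an empty world because Genesis could not load the reconstructed campus.

```python
import numpy as np
import genesis as gs  # Genesis physics engine (early-stage; API may change)

gs.init(backend=gs.cpu)  # or gs.gpu on a supported machine

# An empty world stands in for the campus, since Genesis could not yet
# load our reconstructed scene.
scene = gs.Scene(show_viewer=False)
scene.add_entity(gs.morphs.Plane())  # ground plane

# Two Franka arms; the MJCF asset ships with Genesis, and the pos offsets
# (assumed supported by the morph) keep the arms from overlapping.
arms = [
    scene.add_entity(gs.morphs.MJCF(file="xml/franka_emika_panda/panda.xml",
                                    pos=(x, 0.0, 0.0)))
    for x in (0.0, 1.5)
]
scene.build()

# Drive all 9 DoFs (7 joints + 2 gripper fingers) of each arm toward a
# placeholder joint-space target with the built-in PD position controller.
target = np.array([0.0, -0.5, 0.0, -1.5, 0.0, 1.0, 0.8, 0.02, 0.02])
for _ in range(500):
    for arm in arms:
        arm.control_dofs_position(target)
    scene.step()
```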

Pipeline

Pipeline of our MetaSCUT, which integrates dataset analysis, SuGaR-based mesh reconstruction, and Blender-assisted scene rendering. The process begins with aerial data acquisition, including denoising, alignment, and parameter adjustment, to produce an initial point cloud and camera parameters. SuGaR mesh reconstruction involves five key steps: generating the initial Gaussian representation, optimizing the Gaussians with regularization terms, extracting the mesh from the density function via Poisson reconstruction, refining the mesh and binding Gaussians to it, and rendering the scene. Finally, the output mesh is enriched in Blender with Gaussian-based textures and material adjustments, and is exported as static images or videos.
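
To make the mesh-extraction step concrete, here is a minimal sketch using Open3D's screened Poisson reconstruction in place of SuGaR's full density-level-set sampling; the input file name is a hypothetical export of points (with normals) sampled from the regularized Gaussians.

```python
import numpy as np
import open3d as o3d

# Hypothetical export: points (with normals) sampled from the
# surface-aligned Gaussians after regularized optimization.
pcd = o3d.io.read_point_cloud("gaussian_surface_points.ply")
if not pcd.has_normals():
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Screened Poisson reconstruction; `depth` sets the octree resolution and
# thus the detail/memory trade-off for a large campus block.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=10)

# Trim low-density vertices, which tend to be hallucinated surface in
# regions the aerial views never observed.
dens = np.asarray(densities)
mesh.remove_vertices_by_mask(dens < np.quantile(dens, 0.05))

o3d.io.write_triangle_mesh("campus_block_mesh.ply", mesh)
```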


Quantitative Comparison


We evaluate CityGS, vanilla SuGaR, and our fine-tuned SuGaR on rendering quality using PSNR, SSIM, and LPIPS. With proper hyper-parameter settings, the fine-tuned SuGaR outperforms both the vanilla version and CityGS in rendering quality, while producing 3D reconstructions with fewer artifacts and holes, laying a solid foundation for the subsequent simulated interactions.
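
The metrics are computed per test view against held-out ground-truth photos. A minimal sketch of such an evaluation, assuming image pairs as float tensors in [0, 1] and using torchmetrics (not necessarily the exact implementations each codebase ships), might look like:

```python
import torch
from torchmetrics.functional import (
    peak_signal_noise_ratio,
    structural_similarity_index_measure,
)
from torchmetrics.image import LearnedPerceptualImagePatchSimilarity

# Placeholder batch: in practice, rendered test views and the matching
# held-out ground-truth photos, shape (N, 3, H, W), values in [0, 1].
renders = torch.rand(8, 3, 256, 256)
targets = torch.rand(8, 3, 256, 256)

# normalize=True lets LPIPS accept [0, 1] inputs directly.
lpips = LearnedPerceptualImagePatchSimilarity(net_type="vgg", normalize=True)

with torch.no_grad():
    psnr = peak_signal_noise_ratio(renders, targets, data_range=1.0)
    ssim = structural_similarity_index_measure(renders, targets, data_range=1.0)
    perceptual = lpips(renders, targets)

print(f"PSNR {psnr:.2f} dB | SSIM {ssim:.4f} | LPIPS {perceptual:.4f}")
```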

Reconstruction Demo

Interaction Demo

References

Guédon, A., & Lepetit, V. (2024). SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Liu, Y., Guan, H., Luo, C., Fan, L., Peng, J., & Zhang, Z. (2024). CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians. Proceedings of the European Conference on Computer Vision (ECCV).

Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian Splatting for Real-time Radiance Field Rendering. ACM Transactions on Graphics, 42(4), 1-14.