[Guided Research / Master's Thesis] - Semantic Scene Graphs for Gaussian Splatting
20.01.2026, Theses, Bachelor's and Master's Theses
Project Offer for Student Projects
Project Type: Master's Thesis / Guided Research
Project Title: Semantic Scene Graphs for Gaussian Splatting
Research Domain(s): Computer Graphics, Computer Vision, Novel View Synthesis, Reconstruction, Mapping, Semantic Segmentation
Project Director: Prof. Daniel Roth
Project Advisor: Hannah Schieber
Abstract:
3D Gaussian Splatting enables efficient reconstruction and rendering of complex scenes, yet current pipelines allocate representational capacity globally, leading to suboptimal use of Gaussians across semantically diverse regions. Important objects may be under-reconstructed, while large but irrelevant background areas are over-represented, increasing training cost and limiting real-time applicability.
Project Description:
Semantic scene graphs provide a high-level, object-centric representation of a 3D environment. When integrated with Gaussian Splatting-based mapping pipelines, they supply structure, meaning, and relational context on top of the raw, continuous radiance/feature field. This combination enables tasks such as dynamic-object management, interaction modeling, telepresence reasoning, and downstream robot/agent intelligence.
The core idea is an object-centric strategy for controlling the number of 3D Gaussians assigned to each object within a Gaussian Splatting mapping pipeline. The approach leverages a semantic scene graph to regulate reconstruction fidelity, model size, and performance based on the importance of each object. The outcome is an adaptive, semantically aligned mapping system that prioritizes mission-critical objects while reducing over-densification in background regions.
Current Gaussian Splatting models apply densification and pruning globally, resulting in a suboptimal distribution of representation capacity. Critical objects may be under-reconstructed, while large but semantically unimportant areas consume excessive Gaussians. This imbalance increases training time and memory cost, and it complicates real-time use cases such as XR teleconsultation, robotic perception, and dynamic scene mapping. A systematic method is therefore required to control Gaussian allocation per object.
This thesis shall introduce a semantic scene graph as a supervisory structure. For each detected object, a graph node stores geometric attributes (volume, surface area), visibility statistics, semantic class, and task-driven priority. A Gaussian Budget Function shall map these attributes to a per-object Gaussian limit. During training, densification and pruning operations reference this budget to ensure that each object’s Gaussian count remains within optimal bounds.
Your own creativity in solving the problem is welcome; the proposal above merely presents one possible idea.
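For illustration only, a minimal Python sketch of how such a per-object budget mechanism could look. All names, the heuristic form of the budget function, and the constants below are assumptions made up for this example, not a prescribed design:

from dataclasses import dataclass

@dataclass
class SceneGraphNode:
    """Per-object node in the semantic scene graph (hypothetical schema)."""
    object_id: int
    semantic_class: str
    volume: float          # estimated object volume (m^3)
    surface_area: float    # estimated surface area (m^2)
    visibility: float      # fraction of frames in which the object is seen, in [0, 1]
    priority: float        # task-driven importance weight, in [0, 1]

def gaussian_budget(node: SceneGraphNode,
                    base_density: float = 5e4,   # Gaussians per m^2, arbitrary choice
                    min_budget: int = 200,
                    max_budget: int = 200_000) -> int:
    """Map node attributes to a per-object Gaussian limit.

    One simple heuristic: scale a surface-area-proportional baseline by
    task priority and visibility, then clamp to fixed bounds.
    """
    raw = base_density * node.surface_area * node.priority * node.visibility
    return int(min(max(raw, min_budget), max_budget))

def may_densify(node: SceneGraphNode, current_count: int) -> bool:
    """Gate densification: only split/clone Gaussians of objects under budget."""
    return current_count < gaussian_budget(node)

def prune_target(node: SceneGraphNode, current_count: int) -> int:
    """Number of Gaussians to prune for objects over budget."""
    return max(0, current_count - gaussian_budget(node))

During training, the densification and pruning steps of the mapping pipeline would consult these checks per object instead of applying global thresholds.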
Key research areas include:
- Reviewing the state-of-the-art in novel view synthesis mapping techniques and semantic scene graph techniques
- Categorizing and experimenting with existing approaches
- Testing on your own recordings in addition to classic datasets
Recommended background (or motivation in learning):
- Basic knowledge of computer vision
- Experience with deep learning model training and application
- Some experience with C++ and Python
Please send your transcript of records, CV, and a letter of motivation to: Hannah Schieber
(hannah.schieber@tum.de) with CC to hex-thesis.ortho@mh.tum.de
You can find more information and other topics for theses on our website: https://hex-lab.io
Contact: hannah.schieber@tum.de, hex-thesis.ortho@mh.tum.de
Literature:
Gorlo, N., Schmid, L., & Carlone, L. (2025). Describe Anything Anywhere At Any Moment. arXiv preprint arXiv:2512.00565.
Wang, X., Yang, D., Gao, Y., Yue, Y., Yang, Y., & Fu, M. (2025). GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding. arXiv preprint arXiv:2503.04034.
Liu, R., Zhang, J., Peng, K., Zheng, J., Cao, K., Chen, Y., ... & Stiefelhagen, R. (2023). Open scene understanding: Grounded situation recognition meets segment anything for helping people with visual impairments. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1857-1867).
Wald, J., Dhamo, H., Navab, N., & Tombari, F. (2020). Learning 3d semantic scene graphs from 3d indoor reconstructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 3961-3970).
Gu, Q., Kuwajerwala, A., Morin, S., Jatavallabhula, K. M., Sen, B., Agarwal, A., ... & Paull, L. (2024, May). Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA) (pp. 5021-5028). IEEE.
Zhao, A. Y., Gunturu, A., Do, E. Y. L., & Suzuki, R. (2025, September). Guided Reality: Generating Visually-Enriched AR Task Guidance with LLMs and Vision Models. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (pp. 1-15).