[Master's Thesis / Guided Research / Bachelor Thesis] - Surgical Scene Understanding using Panorama Imaging

29.01.2026, Theses, Bachelor's and Master's Theses

Project Offer for Student Projects
Project Type: [MA/Guided Research]
Project Title: Surgical Scene Understanding using Panorama Imaging
Research Domain(s): Surgical Scene Understanding, Surgical Phases, Computer Vision, Semantic Segmentation
Project Director: Prof. Daniel Roth
Project Advisors: Hannah Schieber

Project Description:
Modern computer-assisted surgery relies increasingly on robust visual understanding of the operating room. While most existing approaches for surgical scene understanding and phase recognition are based on narrow-field-of-view endoscopic or monocular camera data, real-world operating rooms often provide significantly richer visual contexts. Panorama and wide-angle imaging offer the potential to capture spatial relationships between surgical staff, instruments, anatomical regions, and medical devices in a unified representation.

This thesis aims to investigate surgical scene understanding using panorama imaging, with a particular focus on semantic segmentation, surgical phase recognition, and higher-level scene representations such as surgical scene graphs. The goal is to analyze whether panoramic visual context can improve robustness, generalization, and interpretability compared to conventional camera setups.
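To make the notion of a surgical scene graph concrete, the following is a minimal sketch in plain Python of a frame-level graph over operating-room entities and their relations. The specific entities, categories, and predicates ("surgeon", "holding", etc.) are hypothetical illustrations, not drawn from any particular dataset or from the methods cited below.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    name: str       # e.g. "surgeon", "scalpel", "patient"
    category: str   # coarse type: "staff", "instrument", "anatomy", "device"

@dataclass
class SceneGraph:
    """Frame-level scene graph: entities plus directed, labeled relations."""
    entities: set = field(default_factory=set)
    relations: list = field(default_factory=list)  # (subject, predicate, object)

    def add_relation(self, subj: Entity, predicate: str, obj: Entity) -> None:
        self.entities.update({subj, obj})
        self.relations.append((subj, predicate, obj))

    def neighbors(self, entity: Entity) -> set:
        """All entities directly related to `entity`, in either direction."""
        out = {o for s, _, o in self.relations if s == entity}
        out |= {s for s, _, o in self.relations if o == entity}
        return out

# Hypothetical single frame from a panoramic OR recording
surgeon = Entity("surgeon", "staff")
scalpel = Entity("scalpel", "instrument")
patient = Entity("patient", "anatomy")

g = SceneGraph()
g.add_relation(surgeon, "holding", scalpel)
g.add_relation(scalpel, "cutting", patient)

print(g.neighbors(scalpel))  # the entities connected to the scalpel
```

In practice such graphs are predicted per frame by learned detectors and relation classifiers; the sketch only shows the target data structure that higher-level reasoning (e.g. phase recognition) would consume.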

The project will commence with a comprehensive review of the state-of-the-art in surgical activity recognition, surgical phase classification, and scene graph–based modeling. Based on this review, existing methods will be categorized and experimentally evaluated, with adaptations made where necessary to support panoramic or wide-field-of-view imagery. Particular emphasis will be placed on testing and validating approaches on recordings beyond classic benchmark datasets, including less constrained or previously unseen surgical recordings.

The student is encouraged to bring in their own ideas and creativity; the outlined approach serves as a guiding framework rather than a strict implementation plan.

Key research areas include:
- Reviewing the state of the art in surgical activity / phase classification techniques and surgical scene graph techniques
- Categorizing and experimenting with existing approaches
- Testing on recordings beyond classic benchmark datasets
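As a small taste of the phase-recognition area listed above: frame-level phase classifiers typically produce noisy per-frame labels, and a common post-processing step is temporal smoothing. Below is a minimal sliding-window majority-vote sketch; the phase names and the glitchy prediction sequence are invented for illustration and do not come from any specific pipeline.

```python
from collections import Counter

def smooth_phases(preds, window=5):
    """Majority-vote smoothing over a sliding window of per-frame phase labels.

    Suppresses short spurious phase flips that frame-level classifiers often
    produce, at the cost of slightly blurring true phase boundaries.
    """
    half = window // 2
    smoothed = []
    for i in range(len(preds)):
        lo, hi = max(0, i - half), min(len(preds), i + half + 1)
        # most_common(1) returns the label with the highest count in the window
        smoothed.append(Counter(preds[lo:hi]).most_common(1)[0][0])
    return smoothed

# Hypothetical per-frame predictions with a single-frame misclassification
frames = ["incision"] * 4 + ["suturing"] + ["incision"] * 4 + ["suturing"] * 5
print(smooth_phases(frames))  # the lone "suturing" glitch is voted away
```

Real systems use learned temporal models (e.g. temporal convolutions or transformers) rather than a hand-written vote, but the example shows why temporal context matters for phase recognition.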

Recommended background (or motivation in learning):
- Basic knowledge of computer vision
- Experience with deep learning model training and application
- Some experience with C++ and Python

Please send your transcript of records, CV and motivation to: Hannah Schieber
(hannah.schieber@tum.de) with CC to hex-thesis.ortho@mh.tum.de

You can find more information and other topics for theses on our website: https://hex-lab.io

Contact: hannah.schieber@tum.de, hex-thesis.ortho@mh.tum.de

Literature:
Carion, N., Gustafson, L., Hu, Y. T., Debnath, S., Hu, R., Suris, D., ... & Feichtenhofer, C. (2025). SAM 3: Segment anything with concepts. arXiv preprint arXiv:2511.16719.

Han, R., Yan, H., Li, J., Wang, S., Feng, W., & Wang, S. (2022, October). Panoramic human activity recognition. In European Conference on Computer Vision (pp. 244-261). Cham: Springer Nature Switzerland.

Cao, M., Yan, R., Shu, X., Zhang, J., Wang, J., & Xie, G. S. (2023, October). MUP: Multi-granularity unified perception for panoramic activity recognition. In Proceedings of the 31st ACM International Conference on Multimedia (pp. 7666-7675).

Özsoy, E., Pellegrini, C., Czempiel, T., Tristram, F., Yuan, K., Bani-Harouni, D., ... & Navab, N. (2025). MM-OR: A large multimodal operating room dataset for semantic understanding of high-intensity surgical environments. In Proceedings of the Computer Vision and Pattern Recognition Conference (pp. 19378-19389).

Holm, F., Ghazaei, G., Czempiel, T., Özsoy, E., Saur, S., & Navab, N. (2023). Dynamic scene graph representation for surgical video. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 81-87).

Demir, K. C., Schieber, H., Weise, T., Roth, D., May, M., Maier, A., & Yang, S. H. (2023). Deep learning in surgical workflow analysis: a review of phase and step recognition. IEEE Journal of Biomedical and Health Informatics, 27(11), 5405-5417.
