MSc Thesis + HiWi Opportunity: Multimodal AI for Evaluating Counseling Competence in Teacher-Parent Interactions

08.05.2026, Theses, Bachelor's and Master's Theses

This combined MSc thesis and HiWi position focuses on multimodal AI analysis of teacher–parent counseling sessions. Effective communication is central to the teaching profession, yet it remains one of the hardest skills to teach and to assess objectively. AI has advanced rapidly in commercial sentiment and gesture recognition, but its application to teacher education remains largely unexplored. This is especially true of the multimodal nature of counseling, where gaze, postural mirroring, and prosodic synchrony jointly contribute to trust-building.

Note: This position can be taken as (a) a standalone MSc thesis, (b) a standalone HiWi role, or (c) both combined, where the HiWi work directly contributes to the thesis pipeline.

MSc Thesis — Architecture & Pedagogy

  • Pipeline design: end-to-end multimodal processing integrating WhisperX (speech), OpenFace (vision), and Wav2Vec (audio prosody) into a unified temporal framework.
  • Multimodal fusion: millisecond-level alignment of smiles, supportive utterances, and pitch contours.
  • Indicator validation: correlating machine-detectable patterns, such as Linguistic Style Matching, Mutual Gaze Ratio, and acoustic accommodation, with the Tübingen Counseling Competence Scale (TBKS).
  • Feedback design: deciding how feedback is delivered to student teachers, such as a summary score, emotional-tension heatmap, or timeline of missed alignment opportunities.
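
For orientation, two of the ideas above (temporal alignment of events across modalities and a gaze-based indicator) can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's pipeline: the function names, the 200 ms tolerance, the per-frame boolean gaze flags, and the definition of Mutual Gaze Ratio as the share of frames with simultaneous gaze are all assumptions made for this example.

```python
from bisect import bisect_left


def co_occurring(events_a, events_b, tolerance_ms=200):
    """Pair each timestamp in events_a with its nearest neighbour in events_b.

    Timestamps are in milliseconds on a shared clock. A pair is kept only if
    the two events lie within tolerance_ms of each other (assumed threshold).
    """
    b = sorted(events_b)
    pairs = []
    for t in sorted(events_a):
        i = bisect_left(b, t)
        # Only the neighbours just before and just after t can be nearest.
        candidates = b[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda u: abs(u - t))
        if abs(nearest - t) <= tolerance_ms:
            pairs.append((t, nearest))
    return pairs


def mutual_gaze_ratio(teacher_looks_at_parent, parent_looks_at_teacher):
    """Share of frames in which both speakers look at each other.

    Both inputs are per-frame boolean flags of equal length (hypothetical
    output of a gaze estimator after thresholding).
    """
    if len(teacher_looks_at_parent) != len(parent_looks_at_teacher):
        raise ValueError("gaze tracks must cover the same frames")
    if not teacher_looks_at_parent:
        return 0.0
    mutual = sum(
        1 for a, b in zip(teacher_looks_at_parent, parent_looks_at_teacher)
        if a and b
    )
    return mutual / len(teacher_looks_at_parent)


# Hypothetical example data: smile onsets and supportive-utterance onsets (ms).
smiles = [1000, 5200, 9000]
utterances = [1150, 5600, 8900]
aligned = co_occurring(smiles, utterances)

ratio = mutual_gaze_ratio([True, True, False, True], [True, False, False, True])
```

In practice, the timestamps would come from the speech and vision models after mapping them to a common clock, and the resulting indicators would then be compared against the TBKS ratings.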

HiWi — Feature Extraction & Real-Time Prototyping

  • Large-scale processing: running a 20.43-hour annotated dataset through EmoNet, 6DRepNet, L2CS-Net, and related models.
  • Data mining: comparing trained and untrained groups for measurable changes in postural tension, Duchenne smile frequency, and related indicators.
  • Real-time prototype: low-latency inference and a live dashboard, including a valence–arousal graph and nod-detection indicator, for use in training simulators.

Requirements: background in Informatics, Data Engineering, EI, MMT, or a related field; solid Python and PyTorch skills. Experience with computer vision and/or speech processing is a plus.

Contact:
Anna Bodonhelyi (Primary Supervisor) — anna.bodonhelyi@tum.de
Süleyman Özdel — ozdelsuleyman@tum.de
