April 2, 2026

Analysis of Visual Foundation Models with Applications to Video Coding for Machines

Description

Visual foundation models (VFMs), such as DINOv3 and V-JEPA, are widely employed for diverse computer vision tasks.
Unlike traditional computer vision pipelines, VFMs learn rich, transferable representations from vast amounts of data, enabling strong generalization across tasks with little to no fine-tuning.
This research will explore the use of VFMs in machine-to-machine communication of visual data.

Possible Topics

Cross-architecture feature translation
Intra-frame feature prediction
Inter-frame feature prediction (forecasting)
VFM feature compression

Prerequisites

Experience in programming with Python, deep learning (PyTorch/TF), and image and video compression is required.

Supervisor

Marc Windsheimer
marc.windsheimer@fau.de
Room 06.036

Professor

Prof. Dr.-Ing. André Kaup
andre.kaup@fau.de
Room 06.031

Last updated: April 2, 2026, 12:52