Researcher, Alibaba Group
1 paper at NeurIPS 2025
We present OmniTalker, the first end-to-end framework for joint text-driven speech and talking head generation. It achieves 25 FPS synthesis while preserving speaker identity and synchronizing audiovisual outputs in one-shot settings.