This paper introduces an efficient temporal in‑context fine‑tuning framework that enables pretrained video diffusion models to ingest diverse conditioning signals, granting precise and versatile control without any architectural changes.