arXiv 2502.12632
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
By Sihyun Yu, Meera Hahn, et al.
Published 2025-02-18
Diffusion models are successful at synthesizing high-quality videos but are limited to generating short clips (e.g., 2-10 seconds). Synthesizing sustained footage (e.g., over minutes) remains an open research question. In this paper, we propose MALT Diffusion (using Memory-Augmented Latent Transformers), a new diffusion model specialized for long video generation. MALT Diffusion (or just MALT) handles long vid…