arXiv 2410.20502
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
By Zongyi Li, Shujie Hu, et al.
Published 2024-10-27
Text-to-video models have recently undergone rapid and substantial advancements. Nevertheless, due to limitations in data and computational resources, achieving efficient generation of long videos with rich motion dynamics remains a significant challenge. To generate high-quality, dynamic, and temporally consistent long videos, this paper presents ARLON, a novel framework that boosts diffusion Transformers with auto…