arXiv 2510.06308

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

By Yi Xin, Qi Qin, et al.

Published 2025-10-07

We introduce Lumina-DiMOO, an open-source foundational model for seamless multi-modal generation and understanding. Lumina-DiMOO sets itself apart from prior unified models by using fully discrete diffusion modeling to handle inputs and outputs across various modalities. This approach allows Lumina-DiMOO to achieve higher sampling efficiency than previous autoregressive (AR) or hybrid AR-Diff…
