arXiv 2310.04363

Amortizing intractable inference in large language models

By Edward J. Hu, Moksh Jain, et al.

Published 2023-10-06

Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address t…
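As a small illustration of why infilling is a posterior-sampling problem, the sketch below computes the exact infilling posterior p(z | x, y) ∝ p(x, z, y) by brute-force enumeration. It is not the paper's method. The three-token vocabulary, the bigram table `NEXT`, and the helper names `log_prob` and `infill_posterior` are all hypothetical stand-ins for a real LLM, chosen only so the example runs on its own.

```python
import itertools
import math

# Hypothetical toy "language model": NEXT[prev][tok] = p(tok | prev).
# A real LLM conditions on the whole prefix; a bigram table keeps the sketch small.
VOCAB = ["a", "b", "c"]
NEXT = {
    "<s>": {"a": 0.5, "b": 0.3, "c": 0.2},
    "a": {"a": 0.1, "b": 0.6, "c": 0.3},
    "b": {"a": 0.4, "b": 0.2, "c": 0.4},
    "c": {"a": 0.7, "b": 0.2, "c": 0.1},
}


def log_prob(tokens):
    """Left-to-right log-probability of a full sequence (the tractable query)."""
    prev = "<s>"
    total = 0.0
    for tok in tokens:
        total += math.log(NEXT[prev][tok])
        prev = tok
    return total


def infill_posterior(prefix, suffix, length):
    """Exact posterior over a middle span z of the given length.

    p(z | prefix, suffix) is proportional to p(prefix + z + suffix).
    The normalizer sums over len(VOCAB) ** length candidates, which is
    what becomes infeasible for realistic vocabularies and span lengths.
    """
    candidates = list(itertools.product(VOCAB, repeat=length))
    weights = [
        math.exp(log_prob(list(prefix) + list(z) + list(suffix)))
        for z in candidates
    ]
    normalizer = sum(weights)
    return {z: w / normalizer for z, w in zip(candidates, weights)}


# Fill a two-token gap between the prefix "a" and the suffix "c".
posterior = infill_posterior(["a"], ["c"], length=2)
for z, p in sorted(posterior.items(), key=lambda item: -item[1]):
    print("".join(z), round(p, 4))
```

With 3 tokens and a gap of length 2 the normalizer has only 9 terms. The number of terms grows as the vocabulary size raised to the gap length, so the same enumeration is out of reach for an actual LLM, even though scoring any single completed sequence left to right stays cheap.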
