arXiv 2312.07104

SGLang: Efficient Execution of Structured Language Model Programs

By Lianmin Zheng, Liangsheng Yin, et al.

Published 2023-12-12

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming and executing these applications. We introduce SGLang, a system for efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. T…