arXiv 2312.07104

SGLang: Efficient Execution of Structured Language Model Programs

By Lianmin Zheng, Liangsheng Yin, et al.

Published 2023-12-12

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming and executing these applications. We introduce SGLang, a system for efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. T…
