arXiv 2312.07104

SGLang: Efficient Execution of Structured Language Model Programs

By Lianmin Zheng, Liangsheng Yin, et al.

Published 2023-12-12

Wiki summary

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming and executing these applications. We introduce SGLang, a system for efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. T…
