36 points by latchkey 18 days ago

1 comment

Is SGLang an LLM engine, or does it use vLLM/llama.cpp under the hood? And while we're at it, has anyone done a comparison of LLM engines? I've also heard of Mistral.rs, MLC LLM, and obviously the HF transformers library and its ktransformers alternative.


Implement Flash Attention Back End in SGLang – Basics and KV Cache

1 comment
