LLMs = latency? That's how most of us perceive them. When you examine the timing breakdown of a request to Claude, you'll notice that the majority of the time is spent in Content Download, i.e., decoding output tokens.<p>In the blog post, I discuss how partial JSON validation can help in workflow-driven LLM products.<p>Would love feedback on how I can improve. Thanks!
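<p>For readers who want the gist before clicking through: the core idea is that you can start acting on a streamed JSON response before the model finishes decoding it, by repairing the truncated prefix into valid JSON. Below is a minimal, naive sketch of that repair step (my own illustration, not the post's actual implementation; the helper names are made up, and real libraries handle more truncation points, e.g. a dangling key or trailing comma):

```python
import json

def complete_partial_json(fragment: str) -> str:
    """Close any open string, object, or array in a truncated JSON
    fragment so that json.loads can attempt to parse the prefix."""
    stack = []          # open '{' / '[' delimiters, innermost last
    in_string = False
    escaped = False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append(ch)
        elif ch in "}]":
            stack.pop()
    suffix = '"' if in_string else ""
    for opener in reversed(stack):
        suffix += "}" if opener == "{" else "]"
    return fragment + suffix

def try_parse(fragment: str):
    """Return the parsed object if the completed fragment is valid
    JSON, else None (e.g. when truncation lands right after a key)."""
    try:
        return json.loads(complete_partial_json(fragment))
    except json.JSONDecodeError:
        return None
```

Calling this on each streamed chunk lets downstream workflow steps fire as soon as the fields they need appear, instead of waiting for the full response: `try_parse('{"steps": ["plan", "dra')` yields `{"steps": ["plan", "dra"]}`.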