Streaming
Braintrust can execute prompts, functions, and evaluations through the API and, within the UI, through the playground. Like popular LLM services, Braintrust streams results using Server-Sent Events (SSE).
The Braintrust SDK and UI automatically parse the SSE stream, and we have adapters for common libraries like the Vercel AI SDK, so you can easily integrate with the rich and growing ecosystem of LLM tools. However, the SSE format itself is also purposefully simple, so if you need to parse it yourself, you can!
To learn more about using streaming data, see the prompts documentation.
Why does this exist
Streaming is a powerful way to consume LLM outputs, but the predominant "chat" data structure produced by modern LLMs is more complex than most applications need. In fact, the most common use cases are simply to (a) convert the text of the first message into a string or (b) parse the arguments of the first tool call into a JSON object. The Braintrust SSE format is optimized to make these use cases easy to parse, while also supporting more advanced scenarios like parallel tool calls.
Formal spec
SSE events consist of three fields: id (optional), event (optional), and data. The Braintrust SSE format always sets event and data, and never sets id.
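If you do need to parse the stream yourself, the field rules above can be sketched in a few lines of Python. This is a minimal illustration, not the Braintrust SDK's parser: the names SSEEvent and parse_sse are hypothetical, the sample payloads are made up for the example, and the input is assumed to arrive as an iterable of already-decoded text lines.

```python
from typing import Iterable, Iterator, NamedTuple


class SSEEvent(NamedTuple):
    """One parsed event. Hypothetical helper type, not part of any SDK."""
    event: str
    data: str


def parse_sse(lines: Iterable[str]) -> Iterator[SSEEvent]:
    """Yield events from an iterable of SSE text lines.

    Follows the generic SSE framing rules: "field: value" lines
    accumulate into an event, and a blank line dispatches it.
    An event not terminated by a blank line is dropped.
    """
    event_name = ""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events without data.
            if data_parts:
                yield SSEEvent(event_name, "\n".join(data_parts))
            event_name, data_parts = "", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_name = value
        elif field == "data":
            data_parts.append(value)
        # Other fields (such as id) are ignored in this sketch.


# Illustrative input only; real payloads may look different.
sample = [
    "event: text_delta\n",
    'data: "Hello"\n',
    "\n",
    "event: json_delta\n",
    'data: {"a":\n',
    "\n",
]
events = list(parse_sse(sample))
```

Each yielded item carries the event name and the raw data string, which is all the later processing steps need.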
The SSE events in Braintrust follow this structure:
Text
A text_delta is a snippet of text, which is JSON-encoded. For example, you might receive:
As you process a text_delta, you can JSON-decode the string and display it directly.
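As a rough sketch of that decoding step, the Python below JSON-decodes each text_delta payload and joins the results. It assumes the stream has already been split into (event, data) pairs; the function name iter_text and the sample payloads are hypothetical and only serve the example.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_text(events: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield the decoded text of each text_delta event, in order."""
    for event, data in events:
        if event == "text_delta":
            # The payload is a JSON-encoded string, so escapes such as \n
            # are restored by json.loads.
            yield json.loads(data)


# Illustrative events only; other event types are skipped.
sample_events = [
    ("text_delta", '"Hello, "'),
    ("text_delta", '"wor\\nld"'),
    ("json_delta", '{"ignored":'),
]
text = "".join(iter_text(sample_events))
```

Because every chunk decodes on its own, a UI can render each yielded string as soon as it arrives instead of waiting for the end of the stream.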
JSON
A json_delta is a snippet of JSON-encoded data, which cannot necessarily be parsed on its own.
For example:
As you process json_delta events, concatenate the strings together and then parse them as JSON at the end of the stream.
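The concatenate-then-parse approach might look like the following Python sketch. It assumes the stream has already been split into (event, data) pairs and that each json_delta data field carries a raw fragment of one JSON document; collect_json and the sample fragments are hypothetical.

```python
import json
from typing import Any, Iterable, Optional, Tuple


def collect_json(events: Iterable[Tuple[str, str]]) -> Optional[Any]:
    """Concatenate all json_delta fragments and parse them once at the end.

    Returns None if the stream contained no json_delta events.
    """
    fragments = []
    for event, data in events:
        if event == "json_delta":
            # Fragments are not valid JSON on their own, so only buffer them.
            fragments.append(data)
    if not fragments:
        return None
    return json.loads("".join(fragments))


# Illustrative fragments only: neither one parses by itself.
sample_events = [
    ("json_delta", '{"city": "San'),
    ("json_delta", ' Francisco", "days": 3}'),
]
result = collect_json(sample_events)
```

Parsing only once, after the last fragment, avoids errors from incomplete input. If the stream is cut off early, json.loads raises json.JSONDecodeError, which callers may want to catch.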