Streaming output

This topic describes how to stream real-time responses from the Cortex Code Agent SDK.

By default, the SDK yields complete AssistantMessage objects after the model finishes generating each response. To receive incremental updates as text and thinking blocks are generated, enable partial message streaming by setting includePartialMessages (TypeScript) to true or include_partial_messages (Python) to True.

When partial messages are enabled, Cortex Code emits StreamEvent objects for partial text and thinking content. Complete tool calls still arrive as AssistantMessage objects, and tool results still arrive as UserMessage objects.

Enable streaming output

When enabled, the SDK yields StreamEvent messages containing partial streaming events, in addition to the usual AssistantMessage, UserMessage, and ResultMessage objects. Your code needs to:

  1. Check each message’s type to distinguish StreamEvent from other types.
  2. For StreamEvent, extract the event field and check its type.
  3. Look for content_block_delta events where delta.type is text_delta.

import { query } from "cortex-code-agent-sdk";

for await (const message of query({
  prompt: "List the files in my project",
  options: {
    cwd: process.cwd(),
    includePartialMessages: true,
    allowedTools: ["Bash", "Read"],
  },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta") {
      if (event.delta.type === "text_delta") {
        process.stdout.write(event.delta.text);
      }
    }
  }
}

StreamEvent reference

When partial messages are enabled, you receive raw streaming events wrapped in an object:

interface SDKPartialAssistantMessage {
  type: "stream_event";
  event: Record<string, unknown>;  // Raw streaming event
  parent_tool_use_id: string | null;
  uuid: string;
  session_id: string;
}

The event field contains the raw partial streaming event emitted by Cortex Code. Common event types:

| Event Type | Description |
|---|---|
| content_block_start | Start of a new text or thinking block |
| content_block_delta | Incremental text or thinking update |
| content_block_stop | End of the current text or thinking block |
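
Because the event field is typed as an untyped record, reading nested fields such as delta.text benefits from runtime checks. The following is a minimal sketch of such a check, not part of the SDK: the names RawEvent, DeltaChunk, and extractDelta are hypothetical, and the thinking field on thinking_delta events is an assumption that you should verify against the events you actually receive.

```typescript
// Raw streaming event shape as documented above: an untyped record.
type RawEvent = Record<string, unknown>;

// Hypothetical result type for this sketch; not part of the SDK.
type DeltaChunk = { kind: "text" | "thinking"; text: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns the incremental content carried by a content_block_delta event,
// or null for any other event or an unexpected shape.
function extractDelta(event: RawEvent): DeltaChunk | null {
  if (event.type !== "content_block_delta") {
    return null;
  }
  const delta = event.delta;
  if (!isRecord(delta)) {
    return null;
  }
  const text = delta.text;
  if (delta.type === "text_delta" && typeof text === "string") {
    return { kind: "text", text };
  }
  // Assumption: thinking deltas carry their content in a `thinking` field.
  const thinking = delta.thinking;
  if (delta.type === "thinking_delta" && typeof thinking === "string") {
    return { kind: "thinking", text: thinking };
  }
  return null;
}
```

Returning null for anything unrecognized lets callers skip unknown event types without throwing, which is useful if new event types appear later.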

Message flow

With partial messages enabled, you commonly receive messages in the following order:

SystemMessage -- session initialization
StreamEvent (content_block_start) -- text or thinking block
StreamEvent (content_block_delta) -- text_delta or thinking_delta chunks...
StreamEvent (content_block_stop)
AssistantMessage -- complete text/thinking block, or complete tool_use block
UserMessage -- complete tool_result block
... more assistant/user turns ...
ResultMessage -- final result

Without partial messages enabled, you still receive the same complete assistant, user, and result messages, but not StreamEvent. Depending on the session, the SDK can also emit system events such as initialization, status, and background-task notifications.
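
To illustrate the start, delta, stop sequence in the flow above, the following sketch groups text deltas into completed blocks. It is a hedged illustration rather than SDK code: BlockAssembler is a hypothetical helper, it assumes only one block is open at a time (as in the order listed above), and it ignores thinking deltas. The events are simulated so the example runs without the SDK.

```typescript
type RawEvent = Record<string, unknown>;

// Hypothetical helper for this sketch; not part of the SDK.
class BlockAssembler {
  private current: string | null = null; // text of the open block, if any
  readonly completed: string[] = [];     // text of each finished block

  handle(event: RawEvent): void {
    const type = event.type;
    if (type === "content_block_start") {
      this.current = "";
    } else if (type === "content_block_delta") {
      const delta = event.delta;
      if (this.current !== null && typeof delta === "object" && delta !== null) {
        const text = (delta as Record<string, unknown>).text;
        // Only text_delta chunks have a string `text`; others are skipped.
        if (typeof text === "string") {
          this.current += text;
        }
      }
    } else if (type === "content_block_stop") {
      if (this.current !== null) {
        this.completed.push(this.current);
      }
      this.current = null;
    }
  }
}

// Simulated event sequence in the documented order.
const simulatedEvents: RawEvent[] = [
  { type: "content_block_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "lo" } },
  { type: "content_block_stop" },
];

const assembler = new BlockAssembler();
for (const event of simulatedEvents) {
  assembler.handle(event);
}
console.log(assembler.completed);
```

In a real loop, you would pass message.event to handle() whenever message.type is "stream_event", and treat the complete AssistantMessage that follows as the authoritative content.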

Stream text responses

To display text as it’s generated, look for content_block_delta events where delta.type is text_delta:

import { query } from "cortex-code-agent-sdk";

for await (const message of query({
  prompt: "Explain how databases work",
  options: { cwd: process.cwd(), includePartialMessages: true },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}
console.log(); // Final newline

Build a streaming UI

The following example accumulates streamed text in a local buffer and re-renders the current response each time a new text_delta arrives. In a real application, replace the render function with your framework’s state update logic:

import { query } from "cortex-code-agent-sdk";

let currentText = "";

function render(text: string) {
  console.clear();
  console.log("Assistant:\n");
  process.stdout.write(text);
}

for await (const message of query({
  prompt: "Explain how databases work",
  options: {
    cwd: process.cwd(),
    includePartialMessages: true,
  },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      currentText += event.delta.text;
      render(currentText);
    }
  } else if (message.type === "result") {
    console.log("\n\n--- Complete ---");
  }
}

Known limitations

| Feature | Impact on streaming |
|---|---|
| Structured output | JSON result appears only in ResultMessage.structured_output, not as streaming deltas |
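
Because structured output is not streamed, code that needs it has to wait for the result message. The sketch below shows one way to pick it out of a sequence of messages. MessageLike and findStructuredOutput are hypothetical names for illustration, and the message shape is reduced to the two fields used here; the SDK's real types carry more fields.

```typescript
// Minimal message shape for this sketch; not the SDK's full type.
type MessageLike = { type: string; structured_output?: unknown };

// Returns the structured output from the first result message,
// or undefined if no result message is present.
function findStructuredOutput(messages: Iterable<MessageLike>): unknown {
  for (const message of messages) {
    if (message.type === "result") {
      return message.structured_output;
    }
  }
  return undefined;
}
```

With the SDK, the same check would sit inside the for await loop, in the branch where message.type is "result".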

Legal notices

Where your configuration of Cortex Code uses a model provided on the Model and Service Pass-Through Terms, your use of that model is further subject to the terms for that model on that page.

The data classifications of inputs and outputs are as set forth in the following table.

| Input data classification | Output data classification | Designation |
|---|---|---|
| Usage Data | Customer Data | Covered AI Features [1] |

For additional information, refer to Snowflake AI and ML.