Vercel AI SDK v5 Internals - Part 7 — Decoupling Your Backend: The ChatTransport Abstraction Explained

We've been journeying through the Vercel AI SDK v5 canary in this blog series, and if you've been following along from Posts 1-6, you know we've seen some massive architectural shifts. We've talked UIMessage and its parts, the new V2 model interfaces, and the underlying principles of the conceptual ChatStore. Today, we're diving into something that really unlocks a new level of flexibility: the ChatTransport.

This is where the SDK starts to feel really extensible, letting us break free from the default HTTP/SSE chains if our app demands it. Think WebSockets for truly interactive experiences, gRPC for high-performance typed backends, or even going completely offline with client-side storage. This is a big one for building specialized, production-grade AI apps.

A Note on Process & Curation: While I didn't personally write every word, this piece is a product of my dedicated curation. It's a new approach to content creation, where I've guided powerful AI tools (Gemini Pro 2.5 for synthesis, working from a git diff of main vs. canary v5 and informed by extensive research, including OpenAI's Deep Research, with 10M+ tokens spent) to explore and articulate complex ideas. This method, combined with my fact-checking and refinement, aims to deliver depth and accuracy efficiently. I encourage you to see this as a potent blend of human oversight and AI capability. I also use these tools for my own LLM chats on Thinkbuddy, and some of this material gets polished and published there as well.

1. Transport Interface Sketch: The ChatTransport Contract


TL;DR: The conceptual ChatTransport interface in Vercel AI SDK v5 defines a contract for how chat messages are sent and received, aiming to decouple UI logic from the specific communication mechanism and enabling diverse backend integrations beyond the default HTTP/SSE.

Why this matters? (Context & Pain-Point)

If you've been working with Vercel AI SDK v4, you'll know that useChat was pretty much hardwired to use fetch for communication. It expected a server endpoint (usually /api/chat) that spoke HTTP POST for submissions and Server-Sent Events (SSE) for streaming responses (or GET for resumption). This was, and still is, a solid pattern for many web apps, especially within the Next.js ecosystem where serverless functions make this straightforward.

However, the real world of application development is diverse. What if your backend already communicates over WebSockets for low-latency, bidirectional updates? What if you're integrating with a gRPC service that has its own strict contracts? Or, a common ask I've seen pop up, what if you want to build a chat demo that runs entirely in the browser without a backend, or a React Native app that needs to function offline? In V4, these scenarios often meant you had to step outside useChat's built-in networking and essentially roll your own state management and communication layer, losing some of the benefits of the SDK.

This is where the vision for AI SDK v5's architecture really shines. A core goal, as we've discussed, is to decouple the chat logic (managed by useChat and the principles of the conceptual ChatStore – the central client-side state manager for chat) from the how of message delivery. This decoupling is achieved through the ChatTransport abstraction. It’s about saying, "Here's the conversation, here's the new message; I don't care how you send it and get a response, as long as you follow the contract."

How it’s solved in v5? (Step-by-step, Code, Diagrams)

Now, a crucial point for those of us exploring the v5 Canary releases: as of my latest dive, the useChat hook itself doesn't yet have a directly exposed public prop like transport: myCustomTransportInstance. Internally, useChat (in @ai-sdk/react) uses a utility, often referred to as callChatApi (found in packages/ai/src/ui/call-chat-api.ts), to handle its default HTTP/SSE communication. So, you can't simply drop a custom transport instance into useChat options today.

However, understanding the conceptual ChatTransport interface is absolutely key for three reasons:

  1. It illuminates the architectural direction of v5 and its commitment to flexibility.
  2. For developers needing to build custom solutions now (e.g., for React Native or specialized backends), this conceptual interface serves as a blueprint for how to structure your own communication layer to be compatible with v5's message formats and streaming protocols, even if you're building a custom hook that uses it.
  3. It lays the groundwork for what we might expect in future SDK enhancements, potentially a fully pluggable API on useChat.

So, what would this conceptual ChatTransport contract look like? Based on the SDK's needs and common transport patterns, we can sketch out a TypeScript interface. Bear in mind this is a conceptual sketch to illustrate the responsibilities:


// Conceptual sketch - actual SDK API may differ if/when exposed
interface ChatRequestOptions { // Simplified, actual options are more complex
  chatId?: string; // ID of the chat session
  headers?: Record<string, string>; // Custom headers for the request
  body?: Record<string, any>; // Additional data beyond messages, passed from useChat().handleSubmit(e, { body: ... })
  abortSignal?: AbortSignal; // For aborting the request, e.g., from useChat().stop()
  // ... other potential options like credentials, custom fetch parameters
}

interface ChatTransport {
  /**
   * Submits messages to the backend and initiates an AI response stream.
   * This method is responsible for all aspects of communicating with the
   * specific backend, including formatting the request, authentication,
   * and making the actual network call or invoking local logic.
   *
   * Crucially, it MUST return a Promise that resolves to a standard `Response`
   * object. The `response.body` of this `Response` MUST be a `ReadableStream`
   * that produces data compliant with the Vercel AI SDK v5 UI Message Streaming Protocol
   * (i.e., Server-Sent Events of `UIMessageStreamPart` objects).
   */
  submit(
    messages: UIMessage<any>[], // Current conversation history as an array of UIMessage objects
    options: ChatRequestOptions // Options for this specific submission
  ): Promise<Response>;

  /**
   * Attempts to resume an interrupted stream for a given chatId.
   * This is often triggered by `useChat().experimental_resume()`.
   * Like `submit`, it must return a Promise<Response> where `response.body`
   * is a v5 UI Message Stream.
   * This method is optional if the transport or backend doesn't support resumption.
   */
  resume?(
    chatId: string,
    options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'> // Subset of options relevant for resumption
  ): Promise<Response>;

  /**
   * Fetches an entire saved chat conversation for a given chatId.
   * This could be used to populate `initialMessages` in `useChat` or
   * to initialize the chat state in a custom setup.
   * This method is optional.
   */
  getChat?(
    chatId: string,
    options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'>
  ): Promise<UIMessage<any>[] | null>;

  // Potentially, lifecycle methods like init() or destroy() could be useful
  // for more complex transports that manage persistent connections (e.g., WebSockets).
  // init?(config: any): Promise<void>;
  // destroy?(): Promise<void>;
}

[FIGURE 1: Conceptual diagram showing useChat delegating to ChatTransport interface, with concrete implementations like HttpTransport, WebSocketTransport, LocalStorageTransport below it.]

Let's break down these conceptual methods:

1.1 submit(messages: UIMessage<any>[], options: ChatRequestOptions): Promise<Response>


  • Input:
    • messages: UIMessage<any>[]: This is the current history of the conversation, an array of UIMessage objects. As we discussed in Post 2 and Post 6, UIMessage is v5's rich, client-side message structure with id, role, typed metadata, and the crucial parts array (containing TextUIPart, ToolInvocationUIPart, etc.).
    • options: ChatRequestOptions: This object would carry any additional parameters for the request.
      • chatId: The unique identifier for the current chat session.
      • headers: Any custom headers that useChat might be configured to send, or that you want to add per-request.
      • body: This is important. useChat().handleSubmit(event, { body: { someCustomData: 'value' } }) allows you to pass extra data with a submission. This body object would be available here for the transport to include in its payload to the backend.
      • abortSignal: An AbortSignal which would be triggered if useChat().stop() is called. The transport implementation must respect this signal to cancel the ongoing operation (e.g., abort a fetch call, send a cancel message over a WebSocket).

  • Responsibility:
    This method is the workhorse. It needs to encapsulate all the logic required to communicate with its specific backend or data source. This includes:
    • Formatting the messages and options.body data into the payload expected by the backend (e.g., JSON, protocol buffers for gRPC, a specific WebSocket message format).
    • Handling authentication (e.g., attaching API keys or auth tokens to headers or the payload).
    • Making the actual network call (e.g., fetch, websocket.send(), gRPC client call) or performing the local operation (e.g., writing to localStorage).
    • Managing any connection-specific logic.

  • Crucial Output: Promise<Response> where response.body is a v5 UI Message Stream.
    This is non-negotiable for compatibility with the SDK's client-side stream processing, particularly the processUIMessageStream utility (which useChat uses internally). The ReadableStream obtained from response.body must yield data formatted as Server-Sent Events (SSE), where each event's data field is a JSON string representing a UIMessageStreamPart (as detailed in Post 2 and Post 5, covering types like 'start', 'text', 'tool-call', 'file', 'metadata', 'error', 'finish').
    The response headers should also reflect this, primarily:
    • Content-Type: text/event-stream
    • x-vercel-ai-ui-message-stream: v1 (The v5 identifier)

    This means that even if your transport uses WebSockets or gRPC, which have their own streaming mechanisms, the submit method must adapt the data received from those protocols into this specific SSE-based ReadableStream format. This adaptation layer is where much of the complexity of a custom transport lies.
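To make the contract concrete, here is a minimal sketch of what the default-style HTTP/SSE case looks like when expressed as a transport submit(). It assumes a /api/chat route that already emits the v5 UI Message Stream, and an options shape mirroring the conceptual ChatRequestOptions above; it is illustrative, not the SDK's actual callChatApi implementation.

```typescript
import { UIMessage } from 'ai';

// Sketch only: when the server already speaks the v5 UI Message Streaming
// Protocol, the transport can return the fetch Response as-is.
async function httpSubmit(
  messages: UIMessage<any>[],
  options: {
    chatId?: string;
    headers?: Record<string, string>;
    body?: Record<string, any>;
    abortSignal?: AbortSignal;
  }
): Promise<Response> {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', ...options.headers },
    body: JSON.stringify({ chatId: options.chatId, messages, ...options.body }),
    signal: options.abortSignal, // wired to useChat().stop()
  });

  if (!response.ok || !response.body) {
    throw new Error(`Chat request failed with status ${response.status}`);
  }

  // No adaptation layer needed here: response.body is already an SSE stream
  // of UIMessageStreamParts with the expected headers.
  return response;
}
```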

1.2 resume?(chatId: string, options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'>): Promise<Response>


  • Input:
    • chatId: string: The ID of the chat session whose stream needs to be resumed.
    • options: Optional AbortSignal (to cancel the resumption attempt) and headers.

  • Responsibility:
    This method attempts to continue an AI response stream that might have been interrupted (e.g., due to network issues or page navigation).
    • For the default HTTP transport used by useChat().experimental_resume(), this translates to making a GET request to the API endpoint (e.g., /api/chat?chatId=your-chat-id). The server then needs logic to find and restart the stream for that chatId.
    • A WebSocket transport might send a specific "resume" message type to the server.
    • A localStorage transport might re-stream the last persisted assistant message if it was incompletely "streamed" or if resumption logic is defined for it.

  • Output:
    Just like submit, this method must also return a Promise<Response> where response.body is a v5 UI Message Stream. This ensures that resumed streams are processed by the client in the same way as new streams.
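As a small illustration, an HTTP-based resume() could look like the sketch below. It assumes the same /api/chat route also accepts GET requests with a chatId query parameter, as described above; the route name and behavior are assumptions, not a guaranteed SDK contract.

```typescript
// Sketch: resume an interrupted stream via GET; the Response body must again
// be a v5 UI Message Stream produced by the server.
async function httpResume(
  chatId: string,
  options?: { abortSignal?: AbortSignal; headers?: Record<string, string> }
): Promise<Response> {
  return fetch(`/api/chat?chatId=${encodeURIComponent(chatId)}`, {
    method: 'GET',
    headers: options?.headers,
    signal: options?.abortSignal,
  });
}
```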

1.3 getChat?(chatId: string, options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'>): Promise<UIMessage<any>[] | null>


  • Input:
    • chatId: string: The ID of the chat session to fetch.
    • options: Optional AbortSignal and headers.

  • Responsibility:
    This method is for fetching an entire saved chat conversation history. It might be used by useChat to populate its initialMessages if you provide a chatId but no explicit initialMessages.
    • An HTTP transport might make a GET request to a specific endpoint like /api/chat/history?chatId=....
    • A localStorage transport would read from localStorage using a key such as `chat_${chatId}`.

  • Output:
    A Promise that resolves to an array of UIMessage<any>[] objects if the chat history is found, or null if not found or an error occurs.
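For illustration, an HTTP-backed getChat() might look like the sketch below. The /api/chat/history route and its JSON response shape are assumptions for this example.

```typescript
import { UIMessage } from 'ai';

// Sketch: fetch a persisted conversation and revive Date fields that were
// serialized as ISO strings.
async function httpGetChat(
  chatId: string,
  options?: { abortSignal?: AbortSignal; headers?: Record<string, string> }
): Promise<UIMessage<any>[] | null> {
  const res = await fetch(`/api/chat/history?chatId=${encodeURIComponent(chatId)}`, {
    headers: options?.headers,
    signal: options?.abortSignal,
  });
  if (!res.ok) return null;

  const messages: UIMessage<any>[] = await res.json();
  return messages.map(msg => ({
    ...msg,
    createdAt: msg.createdAt ? new Date(msg.createdAt) : new Date(),
  }));
}
```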

Interaction with Conceptual ChatStore / useChat:

The useChat hook (embodying the principles of the conceptual ChatStore) would be the primary consumer of this ChatTransport.

  1. When useChat().handleSubmit() is called, it would (conceptually) invoke transport.submit() with the current messages and options.
  2. It would then take the Response object from the transport, get its ReadableStream body, and pipe it through processUIMessageStream.
  3. processUIMessageStream uses an onUpdate callback to inform useChat about newly constructed or updated UIMessage objects from the stream, which useChat then uses to update its reactive messages state.
  4. Similarly for useChat().experimental_resume(), which would call transport.resume().

This clear separation of concerns is powerful. useChat focuses on managing client-side state and UI updates, while the ChatTransport handles the messy details of communication.

Take-aways / Migration Checklist Bullets

  • ChatTransport is v5's conceptual abstraction for message delivery, aiming to decouple UI logic from network protocols.
  • While not directly pluggable into useChat options in current canary, understanding its interface is key for custom solutions and future SDK evolution.
  • Core methods are submit(), optional resume(), and optional getChat().
  • Crucially, submit() and resume() must return a Promise<Response> whose body is a v5 UI Message Stream (SSE of UIMessageStreamParts). This is the contract for compatibility.
  • The default useChat behavior (via callChatApi) acts as the built-in HTTP/SSE transport.
  • Custom transports need an adaptation layer to produce this v5 SSE stream if their native protocol is different (e.g., WebSockets, gRPC).

2. Building a WebSocket Transport


TL;DR: Implementing a WebSocket-based ChatTransport involves managing a persistent WebSocket connection, sending serialized messages, and critically, adapting the asynchronous, message-based responses from the WebSocket server into the Vercel AI SDK v5's required Server-Sent Event (SSE) ReadableStream format for the submit method.

Why this matters? (Context & Pain-Point)

While the default HTTP/SSE mechanism in the Vercel AI SDK is great for many applications (it's stateless, scales well with serverless functions, and is simple to implement), there are scenarios where WebSockets offer distinct advantages.

  • Low Latency: For highly interactive applications where every millisecond counts, WebSockets can offer lower latency than repeated HTTP requests or even SSE, as the connection is already established.
  • True Bidirectional Communication: SSE is primarily server-to-client. WebSockets allow for seamless, full-duplex communication. While chat is often client-sends-request, server-streams-response, WebSockets open up possibilities for more complex interactions like the server proactively pushing updates, collaborative features alongside the chat, or more stateful connections.
  • Stateful Connections: If your backend needs to maintain significant state per user session (beyond what's passed in each request), a persistent WebSocket connection can be a natural fit.
  • Existing Infrastructure: You might already have a backend built around WebSockets.

If you need these capabilities, you can't just use the default useChat networking. This is where a custom ChatTransport becomes essential. Building one for WebSockets, however, has a specific challenge: bridging WebSocket's message-oriented nature with the SDK's expectation of an SSE stream.

How it’s solved in v5? (Step-by-step, Code, Diagrams)

Let's walk through the conceptual design of a client-side WebSocketChatTransport.

Client-Side WebSocketChatTransport Implementation Sketch:


  1. Constructor and Connection Management:
    • The transport's constructor would initialize the WebSocket connection to your server (e.g., this.ws = new WebSocket('wss://yourserver.com/ai-chat');).
    • It needs to handle WebSocket lifecycle events:
      • onopen: Connection established. Maybe queue messages if submit was called before onopen.
      • onerror: Handle connection errors. This should probably reject any pending submit promises or signal an error to the stream.
      • onclose: Connection closed. Handle reconnection logic (see 2.2 Heart-beat & Reconnect).
      • onmessage: This is where data arrives from the server. This handler is central to the adaptation layer.

[FIGURE 2: Diagram showing WebSocketChatTransport maintaining a WebSocket connection and interacting with a WebSocket server.]

  2. submit(messages: UIMessage<any>[], options: ChatRequestOptions): Promise<Response> Method:
    This is where the magic and the main challenge lie.
    • Serialization and Sending:
      1. Take the messages array (and any relevant data from options.body or other options).
      2. Serialize this into a JSON payload (or your chosen WebSocket message format) suitable for your WebSocket server. This payload might include the messages, chatId, and perhaps a unique requestId for correlation.
      3. Send this payload over the WebSocket: this.ws.send(JSON.stringify(payload));.

    • The Challenge: Returning a v5 UI Message Stream (Promise<Response> with an SSE ReadableStream)
      This is the trickiest part. The submit method must return a Promise<Response> whose body is a ReadableStream that speaks the Vercel AI SDK v5 UI Message Streaming Protocol (SSE of UIMessageStreamParts). WebSockets don't natively produce this.

    • Solution Approach – The Adaptation Layer:
      1. When submit is called, it needs to immediately create a new ReadableStream. You can do this using new ReadableStream({ start(controller) { ... } }) or by adapting a TransformStream. Let's call this the sseStream.
      2. The WebSocket's global onmessage handler will receive data chunks (WebSocket messages) from the server.
      3. Correlation: Since a single WebSocket connection can handle multiple concurrent requests (though less common for a single chat UI's submit), or at least sequential requests, you need a way to correlate server responses with the specific submit call that initiated them. A common approach is to:
        • Generate a unique correlationId (e.g., using an SDK utility like generateId() or crypto.randomUUID()) when submit is called.
        • Include this correlationId in the message sent to the WebSocket server.
        • Have the WebSocket server include this correlationId in every message it sends back related to that request.
      4. Transformation and Enqueuing:
        • Inside the sseStream's start(controller) method (or within the WebSocket.onmessage handler, carefully managing context), when a WebSocket message arrives from the server:
        • Parse the server's WebSocket message.
        • Check its correlationId. If it matches the correlationId of the current submit call's sseStream, process it.
        • The server's WebSocket message payload must be something your client transport can convert into one or more v5 UIMessageStreamParts (e.g., { type: 'text', messageId: '...', value: 'delta' }).
        • Format this UIMessageStreamPart as an SSE event string: data: ${JSON.stringify(uiMessageStreamPart)}\n\n.
        • Push this string into the sseStream's controller: controller.enqueue(encoder.encode(sseEventString)); (assuming encoder is a TextEncoder).
      5. Stream Lifecycle:
        • When the server indicates the end of the response for that correlationId (e.g., by sending a special "finish" WebSocket message or a specific UIMessageStreamPart that signifies the end), you must controller.close() the sseStream.
        • If the server sends an error for that correlationId, call controller.error(new Error(errorMessage)); erroring the stream terminates it, so no additional close() is needed.
      6. Managing Multiple submit Calls: If your transport needs to handle overlapping submit calls (though unlikely for a single chat input), you'd need a map of correlationId to ReadableStreamDefaultController instances to route incoming WebSocket messages to the correct stream. For typical chat, one active "submit" stream at a time is more common.
      7. Return the Response: Finally, submit returns new Response(sseStream, { headers: { 'Content-Type': 'text/event-stream', 'x-vercel-ai-ui-message-stream': 'v1' } });.

    • AbortSignal Handling:
      The options.abortSignal must be respected. Add an event listener to it:

```typescript
options.abortSignal?.addEventListener('abort', () => {
// 1. Send a "cancel" message over WebSocket to the server for this correlationId
// this.ws.send(JSON.stringify({ type: 'cancel', correlationId }));
// 2. Close/abort the client-side sseStream:
// controller.error(new Error('Stream aborted')); // Or just controller.close()
// controller.close();
// 3. Clean up any correlators/listeners for this stream.
});
```
  3. resume?(chatId: string, options?) / getChat?(chatId: string, options?):
    • These methods would also involve sending specific message types over the WebSocket to request these actions from the server (e.g., { type: 'resumeStream', chatId } or { type: 'fetchHistory', chatId }).
    • The server's WebSocket response would again need to be adapted by the transport into a v5 UI Message Stream (for resume) or an array of UIMessage objects (for getChat). The adaptation for resume would be very similar to submit.

2.1 Server Push Format (WebSocket Server-Side Considerations):

Your WebSocket server needs to be designed to work with this client transport.

  • Understanding Client Requests: It must parse incoming WebSocket messages from the client (e.g., the serialized UIMessage history, chatId, correlationId).

  • Streaming AI Responses: When interacting with an LLM and streaming back the AI's response:
    • The server should break down the LLM's output into chunks or parts.
    • For each chunk/part, it should send a WebSocket message back to the client.
    • Crucially, these server-pushed WebSocket messages should contain payloads that the client-side WebSocketChatTransport can easily convert into v5 UIMessageStreamParts.

    • For example, the server could send JSON objects that directly map to the structure of UIMessageStreamParts:

      // Example WebSocket message from server to client
      {
      "correlationId": "unique-req-id-123",
      "payload": {
      "type": "text", // This is a UIMessageStreamPart type
      "messageId": "assistant-msg-456",
      "value": "This is a text delta. "
      }
      }


      Or:

      {
      "correlationId": "unique-req-id-123",
      "payload": {
      "type": "tool-call", // UIMessageStreamPart type
      "messageId": "assistant-msg-456",
      "toolCallId": "tool-xyz",
      "toolName": "calculator",
      "args": "{\"expr\":\"2+2\"}"
      }
      }

    • The server must also send a message indicating the end of the stream for a given correlationId, or an error if one occurs, so the client transport can close its ReadableStream correctly. This could be a special WebSocket message type or a specific UIMessageStreamPart like 'finish' or 'error'.
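To ground this, here is a rough server-side sketch using Node and the ws package. The inbound payload shape matches the client sketch later in this section ({ type: 'submitConversation', correlationId, messages, ... }); the streamAssistantReply() helper is a hypothetical stand-in for your actual LLM call.

```typescript
import { WebSocketServer } from 'ws';

// Hypothetical helper that yields text deltas from your LLM provider.
declare function streamAssistantReply(messages: unknown[]): AsyncIterable<string>;

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (socket) => {
  socket.on('message', async (raw) => {
    const { correlationId, messages } = JSON.parse(raw.toString());
    const messageId = `assistant-${Date.now()}`;

    // Push each delta as a UIMessageStreamPart-shaped payload.
    for await (const delta of streamAssistantReply(messages)) {
      socket.send(JSON.stringify({
        correlationId,
        payload: { type: 'text', messageId, value: delta },
      }));
    }

    // Signal the end of the stream so the client transport can close its ReadableStream.
    socket.send(JSON.stringify({
      correlationId,
      payload: { type: 'finish', messageId, finishReason: 'stop' },
    }));
  });
});
```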

2.2 Heart-beat & Reconnect:

Robust WebSocket implementations require more than just basic send/receive.

  • Heartbeats (Pings/Pongs): Many proxies, load balancers, or even browsers might close inactive WebSocket connections. To prevent this, the client and/or server should periodically send small "ping" messages, and the other side should respond with "pong" messages. This keeps the connection alive and helps detect dead connections faster. This logic would be part of the transport's connection management.
  • Client-Side Reconnection Logic: If the WebSocket connection drops unexpectedly (e.g., onclose event with an error code, or onerror), the WebSocketChatTransport should implement a reconnection strategy. This typically involves:
    • Waiting for a short period (e.g., exponential backoff: 1s, 2s, 4s, 8s...).
    • Attempting to re-establish the WebSocket connection.
    • Once reconnected, it might need to re-authenticate or signal to the server that it's resuming a session. If a submit was in progress, how to handle its resumption is complex and depends heavily on server support for resuming WebSocket-based streams.
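A minimal sketch of what that connection-management logic could look like on the client is shown below; the 30-second ping interval, the { type: 'ping' } message shape, and the backoff cap are arbitrary choices for illustration.

```typescript
// Conceptual heartbeat + exponential-backoff reconnect wrapper a
// WebSocketChatTransport might embed.
class ReconnectingSocket {
  private ws!: WebSocket;
  private pingTimer?: ReturnType<typeof setInterval>;
  private retries = 0;

  constructor(private url: string) {
    this.connect();
  }

  private connect() {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      this.retries = 0;
      // Heartbeat: keep proxies and load balancers from closing an idle connection.
      this.pingTimer = setInterval(() => {
        if (this.ws.readyState === WebSocket.OPEN) {
          this.ws.send(JSON.stringify({ type: 'ping' }));
        }
      }, 30_000);
    };

    this.ws.onclose = () => {
      clearInterval(this.pingTimer);
      // Exponential backoff: 1s, 2s, 4s, ... capped at 30s.
      const delay = Math.min(1000 * 2 ** this.retries, 30_000);
      this.retries += 1;
      setTimeout(() => this.connect(), delay);
    };
  }
}
```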

Code Sketch (Conceptual Client-Side WebSocketChatTransport):

This is a simplified sketch focusing on the submit method's adaptation layer. A production version would be more robust.


import { UIMessage, generateId } from 'ai'; // Assuming generateId is an SDK util

// interface ChatRequestOptions { /* ... as defined before ... */ }
// interface ChatTransport { /* ... as defined before ... */ }

// Hypothetical class structure
class WebSocketChatTransport /* implements ChatTransport */ {
private ws: WebSocket;
private activeStreamControllers: Map<string, ReadableStreamDefaultController<Uint8Array>> = new Map();
private textEncoder = new TextEncoder();

constructor(url: string) {
this.ws = new WebSocket(url);

this.ws.onopen = () => {
console.log('WebSocket connection established.');
// Potentially send queued messages or an init message
};

this.ws.onmessage = (event) => {
try {
const serverMessage = JSON.parse(event.data as string);
const { correlationId, payload: uiMessageStreamPartFromServer } = serverMessage;

if (correlationId && this.activeStreamControllers.has(correlationId)) {
const controller = this.activeStreamControllers.get(correlationId)!;

// Expecting uiMessageStreamPartFromServer to be a valid UIMessageStreamPart
// or directly convertible. For this example, let's assume it IS a UIMessageStreamPart.
const sseEventString = `data: ${JSON.stringify(uiMessageStreamPartFromServer)}\n\n`;
controller.enqueue(this.textEncoder.encode(sseEventString));

// Check if this part signifies the end of the stream for this correlationId
if (uiMessageStreamPartFromServer.type === 'finish' || uiMessageStreamPartFromServer.type === 'error') {
controller.close();
this.activeStreamControllers.delete(correlationId);
}
} else {
console.warn('Received WebSocket message with unknown or inactive correlationId:', correlationId);
}
} catch (e) {
console.error('Failed to parse WebSocket message or route to controller:', e, event.data);
// Potentially find related controllers and error them out if possible.
}
};

this.ws.onerror = (error) => {
console.error('WebSocket error:', error);
// Error out all active stream controllers
this.activeStreamControllers.forEach(controller => {
try { controller.error(new Error('WebSocket connection error.')); } catch (e) {/* already closed */}
});
this.activeStreamControllers.clear();
};

this.ws.onclose = (event) => {
console.log('WebSocket connection closed:', event.code, event.reason);
// Error out all active stream controllers
this.activeStreamControllers.forEach(controller => {
try { controller.error(new Error('WebSocket connection closed.')); } catch (e) {/* already closed */}
});
this.activeStreamControllers.clear();
// Implement reconnection logic here if desired
};
}

async submit(messages: UIMessage<any>[], options: ChatRequestOptions): Promise<Response> {
if (this.ws.readyState !== WebSocket.OPEN) {
return Promise.reject(new Error('WebSocket is not open.'));
}

const correlationId = options.chatId || generateId(); // Use chatId or generate a new one for correlation

// Send message to WebSocket server
const payloadToServer = {
type: 'submitConversation', // Your custom type for WebSocket server
correlationId,
messages,
additionalData: options.body,
};
this.ws.send(JSON.stringify(payloadToServer));

// Create the ReadableStream that will produce v5 UI Message Stream parts
const self = this; // For referencing 'this' inside stream methods
const stream = new ReadableStream<Uint8Array>({
start(controller) {
self.activeStreamControllers.set(correlationId, controller);

options.abortSignal?.addEventListener('abort', () => {
if (self.ws.readyState === WebSocket.OPEN) {
self.ws.send(JSON.stringify({ type: 'cancelStream', correlationId }));
}
try {
controller.error(new Error('Stream aborted by client.'));
} catch (e) { /* Stream might be already closed or errored */ }
// Note: no controller.close() needed here; erroring the stream already terminates it.
self.activeStreamControllers.delete(correlationId);
});
},
cancel(reason) {
console.log('Stream cancelled:', reason);
if (self.ws.readyState === WebSocket.OPEN) {
self.ws.send(JSON.stringify({ type: 'cancelStream', correlationId }));
}
self.activeStreamControllers.delete(correlationId);
},
});

return Promise.resolve(new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
'x-vercel-ai-ui-message-stream': 'v1'
}
}));
}

// resume() and getChat() would be implemented similarly,
// sending specific WebSocket messages and adapting responses.
// For resume(), the adaptation to an SSE stream would be identical to submit().
// For getChat(), it would parse the WebSocket response into UIMessage[] directly.
}

This sketch emphasizes the complexity of the adaptation layer. The client WebSocketChatTransport is effectively acting as a mini-server, translating between the WebSocket protocol and the SSE protocol expected by the Vercel AI SDK's stream processing logic.

Take-aways / Migration Checklist Bullets

  • A WebSocketChatTransport is feasible but requires careful implementation, especially the adaptation layer.
  • The transport's submit() method must adapt WebSocket messages into a ReadableStream of v5 UI Message Stream parts (SSE format).
  • Use correlation IDs to map asynchronous WebSocket server responses to specific client requests.
  • Your WebSocket server must send messages that are easily convertible into UIMessageStreamParts by the client transport.
  • Robust implementations need heartbeat mechanisms and client-side reconnection logic.
  • Handle AbortSignal to allow cancellation of WebSocket requests/streams.
  • This pattern gives you low-latency, bidirectional capabilities while still leveraging useChat for UI state.

3. Offline/LocalStorage Transport Demo


TL;DR: A LocalStorageChatTransport demonstrates how to create a client-only chat experience by simulating AI responses, persisting conversation history in the browser's localStorage, and still producing a Vercel AI SDK v5 compliant UI Message Stream for useChat compatibility.

Why this matters? (Context & Pain-Point)

Sometimes you don't need a full backend, or you can't have one. Consider these use cases:

  • Demos and Prototypes: Quickly building a functional chat UI to demonstrate features without setting up a server.
  • Offline Applications: Enabling chat functionality in Progressive Web Apps (PWAs) or mobile apps (like those built with React Native) that need to work even when the network is unavailable.
  • Client-Side LLMs: If you're experimenting with Web LLMs that run entirely in the browser, you need a way for useChat to interact with them.
  • Testing: Creating a mock transport for testing your UI components in isolation.

For these scenarios, the default HTTP transport won't work. A custom ChatTransport that operates entirely on the client side is the perfect solution. localStorage provides a simple way to persist chat history for such demos, although for more robust offline needs, IndexedDB is usually preferred.

How it’s solved in v5? (Step-by-step, Code, Diagrams)

Let's build a conceptual LocalStorageChatTransport. This transport will:

  1. Store chat histories in localStorage, keyed by chatId.
  2. When submit is called, it will simulate an AI response (e.g., echo the user's message).
  3. It will then "stream" this simulated response back as v5 UIMessageStreamParts, just like a real backend would, ensuring compatibility with useChat.

LocalStorageChatTransport Implementation Sketch:


import { UIMessage, generateId, createUIMessageStream, UIMessageStreamWriter } from 'ai'; // Core v5 utilities

// Assume ChatRequestOptions and ChatTransport interface from section 1
// interface ChatRequestOptions { /* ... */ }
// interface ChatTransport { /* ... */ }

const LOCAL_STORAGE_PREFIX = 'chat_';

class LocalStorageChatTransport /* implements ChatTransport */ {

private getStorageKey(chatId: string): string {
return `${LOCAL_STORAGE_PREFIX}${chatId}`;
}

// `getChat` implementation to load history from localStorage
async getChat(
chatId: string,
_options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'>
): Promise<UIMessage<any>[] | null> {
try {
const storedChat = localStorage.getItem(this.getStorageKey(chatId));
if (storedChat) {
const messages: UIMessage<any>[] = JSON.parse(storedChat);
// Dates will be strings after JSON.parse, convert them back to Date objects
return messages.map(msg => ({
...msg,
createdAt: msg.createdAt ? new Date(msg.createdAt) : new Date()
}));
}
return null;
} catch (error) {
console.error('Error reading chat from localStorage:', error);
return null;
}
}

// `submit` implementation for simulating responses and using v5 stream utilities
async submit(
messages: UIMessage<any>[],
options: ChatRequestOptions
): Promise<Response> {
const chatId = options.chatId || generateId(); // Get or generate chatId

// 1. Simulate LLM Echo/Simple Logic
// For this demo, the "AI" will just echo the user's last text message.
const lastUserMessage = messages[messages.length - 1];
let assistantResponseText = "I'm a simple echo bot. I couldn't find your last message.";

if (lastUserMessage && lastUserMessage.role === 'user') {
// Extract text from TextUIParts of the last user message
const userTextParts = lastUserMessage.parts
.filter(part => part.type === 'text')
// Type assertion here for simplicity; a real app might need safer access
.map(part => (part as { type: 'text'; text: string }).text);

if (userTextParts.length > 0) {
assistantResponseText = `You said: "${userTextParts.join(' ')}"`;
} else {
assistantResponseText = "You sent a message without any text content I could echo.";
}
}

const assistantMessageId = generateId();
const assistantMessageCreatedAt = new Date();

// Construct the assistant's UIMessage
// We'll create a simple message with one TextUIPart.
const assistantMessage: UIMessage<any> = {
id: assistantMessageId,
role: 'assistant',
parts: [{ type: 'text', text: assistantResponseText }],
createdAt: assistantMessageCreatedAt,
// metadata: { simulated: true } // Optional metadata
};

// 2. Persist the new state (user message + new assistant message)
// The 'messages' array already contains the latest user message.
const newConversationHistory = [...messages, assistantMessage];
try {
localStorage.setItem(this.getStorageKey(chatId), JSON.stringify(newConversationHistory));
} catch (error) {
console.error('Error saving chat to localStorage:', error);
// Decide how to handle storage errors - maybe stream an error part?
}

// 3. Return a v5 UI Message Stream for the assistant's message
// We use `createUIMessageStream` and `UIMessageStreamWriter` to correctly format the stream.
// This is crucial for compatibility with `useChat` or `processUIMessageStream`.
const { stream, writer } = createUIMessageStream();

// Start streaming the assistant message
// We need to adapt the writer slightly as it's designed for server-side use with Response objects.
// For client-side, we'll call its methods directly and then construct our own Response.
(async () => {
try {
// Start event for the message
writer.writeStart({
messageId: assistantMessage.id,
createdAt: assistantMessage.createdAt.toISOString() // Ensure ISO string for createdAt
});

// Stream TextUIPart(s)
// Since it's not really "streaming" from an LLM, we send the full text in one delta.
// If assistantMessage had multiple text parts, you'd loop and call writeTextDelta for each.
for (const part of assistantMessage.parts) {
if (part.type === 'text') {
// writer.writeTextDelta(assistantMessage.id, (part as {text: string}).text);
// The writer API expects deltas, but for a full part, you can just send the whole string.
// A more accurate simulation of token-by-token streaming:
const textToStream = (part as { type: 'text'; text: string }).text;
for (let i = 0; i < textToStream.length; i++) {
writer.writeTextDelta(assistantMessage.id, textToStream[i]); // send one character per delta
await new Promise(r => setTimeout(r, 20)); // Simulate token delay
}
}
// Add logic for other part types if your simulated AI can generate them (e.g., FileUIPart)
// else if (part.type === 'file') {
// writer.writeFile(assistantMessage.id, (part as FileUIPart));
// }
}

// Finish event for the message
writer.writeFinish({
messageId: assistantMessage.id,
finishReason: 'stop', // Or 'length', 'tool-calls' as appropriate
// usage: { promptTokens: 0, completionTokens: 0 }, // Optional usage stats
// providerMetadata: { source: 'localStorageTransport' } // Optional provider metadata
});
} catch (e) {
console.error("Error writing to UIMessageStream:", e);
writer.writeError(e instanceof Error ? e.message : "Unknown streaming error");
} finally {
writer.close(); // Always close the writer to signal end of stream
}
})();

return Promise.resolve(new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
'x-vercel-ai-ui-message-stream': 'v1'
}
}));
}

// `resume` for localStorage might be a no-op or could re-stream the last assistant message
// if the UI needs to simulate a resumption.
async resume?(
chatId: string,
_options?: Pick<ChatRequestOptions, 'abortSignal' | 'headers'>
): Promise<Response> {
console.log(`LocalStorageTransport: resume called for ${chatId}. For this demo, it's a no-op or could re-stream last message.`);
// For a more complete simulation, you could retrieve the last assistant message
// from localStorage and stream it again using createUIMessageStream as in submit().
// For simplicity, returning an empty, immediately closed stream:
const { stream, writer } = createUIMessageStream();
writer.close();
return Promise.resolve(new Response(stream, { headers: { 'Content-Type': 'text/event-stream', 'x-vercel-ai-ui-message-stream': 'v1' } }));
}
}

[FIGURE 3: Screenshot of a chat UI operating with messages being echoed, potentially with a note "Using LocalStorage Mode".]

How to Use It (Conceptual):
If useChat had a direct transport prop:


// In your React component
// import { LocalStorageChatTransport } from './localStorageChatTransport'; // Assuming you created this file
// const myLocalStorageTransport = new LocalStorageChatTransport();
//
// const { messages, input, handleSubmit } = useChat({
// id: 'myOfflineChat',
// transport: myLocalStorageTransport, // This is the conceptual part
// // initialMessages could be populated by myLocalStorageTransport.getChat('myOfflineChat')
// });

Since this direct prop isn't there yet, you'd typically build a custom hook that internally uses LocalStorageChatTransport and manages state similarly to useChat, or use it for testing purposes by directly invoking its methods.
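Here is a bare-bones sketch of such a custom hook. It drives the transport directly and parses the SSE stream by hand instead of going through useChat; the part shapes it handles ({ type: 'text', messageId, value }) follow the simplified examples used throughout this post, and real UIMessageStreamParts are richer.

```typescript
import { useState } from 'react';
import { UIMessage, generateId } from 'ai';
import { LocalStorageChatTransport } from './localStorageChatTransport';

export function useLocalChat(transport: LocalStorageChatTransport, chatId: string) {
  const [messages, setMessages] = useState<UIMessage<any>[]>([]);

  async function send(text: string) {
    const userMessage: UIMessage<any> = {
      id: generateId(),
      role: 'user',
      parts: [{ type: 'text', text }],
      createdAt: new Date(),
    };
    const history = [...messages, userMessage];
    setMessages(history);

    const response = await transport.submit(history, { chatId });
    const reader = response.body!.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    let assistantText = '';

    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      // Each complete SSE event is a "data: {...}\n\n" block.
      const events = buffer.split('\n\n');
      buffer = events.pop() ?? ''; // keep any trailing partial event
      for (const event of events) {
        if (!event.trim()) continue;
        const part = JSON.parse(event.replace(/^data: /, ''));
        if (part.type === 'text') {
          assistantText += part.value;
          setMessages([
            ...history,
            {
              id: part.messageId ?? 'assistant',
              role: 'assistant',
              parts: [{ type: 'text', text: assistantText }],
              createdAt: new Date(),
            },
          ]);
        }
      }
    }
  }

  return { messages, send };
}
```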

IndexedDB for More Robust Offline Storage:
localStorage is fine for simple demos, but it has limitations:

  • Storage Size: Typically limited to 5-10MB.
  • Synchronous API: Can block the main thread if you're storing/retrieving large amounts of data (though for chat messages, this is usually less of an issue unless histories are immense).
  • String-Only: Stores only strings, so JSON stringification/parsing is always needed.

For more robust offline applications or larger chat histories, IndexedDB is a much better choice. It offers:

  • Significantly larger storage quotas.
  • Asynchronous API, which doesn't block the main thread.
  • Ability to store complex JavaScript objects directly.
  • Support for indexing and querying, which can be useful for searching chat histories.

The principles of an IndexedDBChatTransport would be similar to the LocalStorageChatTransport:

  • getChat(chatId): Read UIMessage[] from an IndexedDB object store.
  • submit(messages, options): Simulate the AI response, save the updated UIMessage[] to IndexedDB (using transactions for atomicity), and then use createUIMessageStream to stream the assistant's message back.
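For illustration, the minimal IndexedDB plumbing such a transport could sit on might look like this. The database and object store names ('chat-db', 'chats') are arbitrary; note that UIMessage objects (including Date fields) can be stored directly, with no JSON stringification required.

```typescript
import { UIMessage } from 'ai';

// Open (and lazily create) the database with a "chats" store keyed by chatId.
function openChatDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open('chat-db', 1);
    request.onupgradeneeded = () => request.result.createObjectStore('chats');
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// getChat(): read the persisted UIMessage[] for a chatId, or null if absent.
async function loadChat(chatId: string): Promise<UIMessage<any>[] | null> {
  const db = await openChatDb();
  return new Promise((resolve, reject) => {
    const req = db.transaction('chats', 'readonly').objectStore('chats').get(chatId);
    req.onsuccess = () => resolve(req.result ?? null);
    req.onerror = () => reject(req.error);
  });
}

// Called from submit() after appending the simulated assistant message.
async function saveChat(chatId: string, messages: UIMessage<any>[]): Promise<void> {
  const db = await openChatDb();
  return new Promise((resolve, reject) => {
    const tx = db.transaction('chats', 'readwrite');
    tx.objectStore('chats').put(messages, chatId);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```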

Highlight: v5 Streaming Utilities for Non-Networked Operations
A key takeaway here is that even though this transport doesn't involve a network call to an AI backend, it still uses the Vercel AI SDK v5 streaming utilities (createUIMessageStream and UIMessageStreamWriter). This is crucial because it ensures that the data format produced by the transport (the ReadableStream of UIMessageStreamParts) is exactly what consuming components like useChat (or rather, its internal processUIMessageStream logic) expect. This maintains compatibility and allows you to swap out transports without changing your core UI logic.

This local transport perfectly demonstrates the power of the ChatTransport abstraction. It allows the same useChat hook (or similar custom hooks built on v5 principles) to function seamlessly whether it's talking to a powerful cloud LLM or a simple echo bot running in the browser. This is a big win for flexibility and testability.

Take-aways / Migration Checklist Bullets

  • A client-only ChatTransport (e.g., using localStorage or IndexedDB) is excellent for demos, offline apps (like React Native), and testing.
  • The submit method must simulate an AI response, persist the new conversation state locally, and then use v5 utilities like createUIMessageStream and UIMessageStreamWriter to return a compliant v5 UI Message Stream.
  • getChat would read history from local storage. resume might be a no-op or could re-stream the last local "AI" message.
  • For robust offline needs, prefer IndexedDB over localStorage due to size limits and synchronous API of the latter.
  • Using v5 streaming utilities even for local/synchronous-like operations ensures compatibility with the SDK's stream processing logic.
  • This pattern truly decouples the chat UI from the backend, enabling useChat-like experiences without a server.

4. gRPC Example (Go backend) (Conceptual Sketch)


TL;DR: A gRPC-based ChatTransport would involve using a gRPC-Web client to communicate with a gRPC backend (e.g., written in Go), requiring an adaptation layer on the client to convert the gRPC stream of responses into the Vercel AI SDK v5's expected SSE-formatted UI Message Stream.

Why this matters? (Context & Pain-Point)

For some applications, particularly those with demanding performance requirements, complex microservice architectures, or a need for strictly typed API contracts across different languages, gRPC is a popular choice. It offers:

  • Performance: gRPC often provides lower latency and higher throughput than typical JSON/REST APIs due to its use of Protocol Buffers (Protobufs) for serialization and HTTP/2 for transport.
  • Typed Contracts: Defining services and messages with .proto files generates code in multiple languages, ensuring type safety and reducing integration errors between client and server.
  • Bidirectional Streaming: gRPC natively supports various streaming modes, including server-streaming, client-streaming, and bidirectional streaming, which can be very useful for real-time applications.

If your backend infrastructure is already built on gRPC, or if you're building a new system where gRPC's benefits are compelling, you'll need a way for your Vercel AI SDK-powered frontend to talk to it. This is another prime use case for a custom ChatTransport.

How it’s solved in v5? (Step-by-step, Code, Diagrams)

Let's sketch out how a client-side gRPC ChatTransport might look, assuming a Go backend for the gRPC server.

Client-Side gRPC ChatTransport (Conceptual):


  1. Setup and Code Generation:

    • You'd start by defining your gRPC service and message types in a .proto file. For a chat service, this might look something like:

      // chat.proto
      syntax = "proto3";
      package chat.v1;

      // Option for Go package
      option go_package = "example.com/your-project/chatv1";

      // Represents a UIMessagePart (simplified for gRPC)
      message GRPCRawUIMessagePart {
      string type = 1; // e.g., "text", "tool-call"
      string json_payload = 2; // The JSON string of the UIMessagePart's content
      }

      // Represents a UIMessage (simplified for gRPC)
      message GRPCRawUIMessage {
      string id = 1;
      string role = 2;
      repeated GRPCRawUIMessagePart parts = 3;
      // string metadata_json = 4; // If sending metadata
      }

      message SubmitChatRequest {
      string chat_id = 1;
      repeated GRPCRawUIMessage messages = 2;
      // map<string, string> additional_body = 3; // For options.body
      }

      // This is what the server will stream back for each part
      message SubmitChatResponseStreamPart {
      string correlation_id = 1; // To correlate with the request
      GRPCRawUIMessagePart ui_message_stream_part = 2; // Server sends pre-formatted UIMessageStreamPart
      }

      service ChatService {
      // Server-streaming RPC for chat responses
      rpc SubmitChat(SubmitChatRequest) returns (stream SubmitChatResponseStreamPart);
      // Potentially other RPCs for resume, getChat
      }

    • From this .proto file, you'd use protoc (the Protobuf compiler) along with plugins for JavaScript/TypeScript (for the client) and Go (for the server) to generate client stub code and server interface code.
    • For web clients, you'd typically use gRPC-Web, which allows browsers to communicate with gRPC services (often through a proxy like Envoy, or directly if the gRPC server supports gRPC-Web). The generated client code would be gRPC-Web compatible.

  2. submit(messages: UIMessage<any>[], options: ChatRequestOptions): Promise<Response> Method:
    • Conversion to gRPC Request:
      1. Take the messages: UIMessage<any>[] array.
      2. Convert each UIMessage and its parts into the corresponding gRPC request message format defined in your .proto (e.g., SubmitChatRequest containing GRPCRawUIMessages). This involves serializing UIMessagePart objects into GRPCRawUIMessagePart (perhaps by JSON stringifying the part's content).
      3. Populate other fields in SubmitChatRequest from options (like chat_id, additional_body).

    • Making the gRPC Call:

      1. Use the generated gRPC-Web client stub to call the SubmitChat RPC method on your server. This will likely return a client-side stream object that emits SubmitChatResponseStreamPart messages from the server.

        // Conceptual client-side gRPC call
        // const grpcClient = new ChatServiceClient('https://your-grpc-web-proxy.com');
        // const grpcRequest = new SubmitChatRequest();
        // // ... populate grpcRequest from UIMessages and options ...
        //
        // const serverStream = grpcClient.submitChat(grpcRequest, metadata); // metadata for headers/auth

    • Adaptation Layer (gRPC Stream to v5 UI Message Stream): This is the critical part, similar to the WebSocket transport.
      1. When submit is called, create a new ReadableStream (let's call it sseStream).
      2. Listen to events from the serverStream (the gRPC client stream). gRPC client libraries usually follow an on('data', (responsePart) => { ... }), on('error', ...), and on('end', ...) pattern.
      3. For each SubmitChatResponseStreamPart (let's call it grpcResponsePart) received from the gRPC server stream:
        • The grpcResponsePart.ui_message_stream_part should ideally already be structured like a Vercel AI SDK UIMessageStreamPart (e.g., { type: "text", messageId: "...", value: "..." }). The server would construct this. If not, you'd do another conversion here.
        • Convert this UIMessageStreamPart object into an SSE event string: data: ${JSON.stringify(grpcResponsePart.ui_message_stream_part)}\n\n.
        • Push this string into the sseStream's controller: controller.enqueue(encoder.encode(sseEventString));.
      4. When the gRPC server stream ends (on('end')), controller.close() the sseStream.
      5. If an error occurs on the gRPC stream (on('error')), call controller.error(grpcError); erroring the stream terminates it.
    • AbortSignal Handling: The gRPC client call should ideally accept an AbortSignal or have a cancel() method. Wire options.abortSignal to this to allow useChat().stop() to cancel the gRPC request.
    • Return the Response: submit returns new Response(sseStream, { headers: { 'Content-Type': 'text/event-stream', 'x-vercel-ai-ui-message-stream': 'v1' } });.
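A conceptual sketch of that adaptation layer is shown below. It assumes a grpc-web style generated client whose server stream exposes on('data' | 'end' | 'error') and cancel(); the getUiMessageStreamPart()/getJsonPayload() getters are guesses at what protoc-generated accessors for the .proto above might look like, so adjust to your generated code.

```typescript
// Convert a gRPC-Web server stream into a v5 UI Message Stream Response.
function grpcStreamToUIMessageStreamResponse(
  serverStream: {
    on(event: 'data' | 'end' | 'error', cb: (arg: any) => void): void;
    cancel(): void;
  },
  abortSignal?: AbortSignal
): Response {
  const encoder = new TextEncoder();

  const sseStream = new ReadableStream<Uint8Array>({
    start(controller) {
      serverStream.on('data', (grpcResponsePart) => {
        // The server pre-formats each part; json_payload carries the
        // UIMessageStreamPart as a JSON string (see the .proto sketch).
        const rawPart = grpcResponsePart.getUiMessageStreamPart();
        const uiMessageStreamPart = JSON.parse(rawPart.getJsonPayload());
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify(uiMessageStreamPart)}\n\n`)
        );
      });
      serverStream.on('end', () => controller.close());
      serverStream.on('error', (err) => controller.error(err));

      // Wire useChat().stop() through to the gRPC call.
      abortSignal?.addEventListener('abort', () => serverStream.cancel());
    },
    cancel() {
      serverStream.cancel();
    },
  });

  return new Response(sseStream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'x-vercel-ai-ui-message-stream': 'v1',
    },
  });
}
```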

Go Backend (gRPC Server - Conceptual Sketch):

Your Go server would implement the ChatService interface generated from the .proto file.


// Conceptual Go server-side implementation
package main

import (
// ... necessary gRPC, Protobuf, context, AI SDK (if using Go SDK) imports ...
// chatv1 "example.com/your-project/chatv1" // Generated proto code
)

type chatServer struct {
// chatv1.UnimplementedChatServiceServer // Embed for forward compatibility
// Dependencies like an LLM client
}

// SubmitChat implements the gRPC service method
func (s *chatServer) SubmitChat(req *chatv1.SubmitChatRequest, stream chatv1.ChatService_SubmitChatServer) error {
ctx := stream.Context() // Get context for cancellation, deadlines

// 1. Convert req.GetMessages() (GRPCRawUIMessage[]) if needed for LLM
// This might involve parsing the json_payload of GRPCRawUIMessagePart.

// 2. Interact with an LLM (e.g., using OpenAI Go client, or Vercel AI SDK for Go if available)
// This interaction should produce a stream of data.

// 3. For each part/delta from the LLM stream:
// a. Construct a Vercel AI SDK `UIMessageStreamPart` equivalent object.
// For example, if LLM sends text "Hello", create a UIMessageStreamPart like:
// { type: "text", messageId: "some-message-id", value: "Hello" }
// b. Convert this object to your `chatv1.GRPCRawUIMessagePart` (e.g., by JSON stringifying it into json_payload).
// c. Create a `chatv1.SubmitChatResponseStreamPart` containing this and the correlation_id (e.g., req.GetChatId()).
// d. Send it on the gRPC stream:
// if err := stream.Send(&chatv1.SubmitChatResponseStreamPart{...}); err != nil {
// return err // Handle stream errors
// }
// e. Check ctx.Done() periodically to handle client cancellation.

// 4. After LLM stream finishes, send a 'finish' UIMessageStreamPart.
// Example:
// finishPartPayload := `{"type":"finish", "messageId":"some-message-id", "finishReason":"stop"}`
// finishProtoPart := &chatv1.GRPCRawUIMessagePart{Type: "finish", JsonPayload: finishPartPayload}
// if err := stream.Send(&chatv1.SubmitChatResponseStreamPart{UiMessageStreamPart: finishProtoPart, ...}); err != nil {
// return err
// }

return nil // Indicates successful completion of the stream from server's perspective
}

// ... main function to start gRPC server ...

[FIGURE 4: Diagram showing a gRPC client transport -> gRPC-Web Proxy (optional) -> Go gRPC Server -> LLM. Arrows indicate data flow and transformation points.]

The Go server's responsibility is to take the gRPC request, interact with the LLM, and then stream back messages that are structured such that the client-side gRPC transport can easily convert them into the Vercel AI SDK's UIMessageStreamParts. The server is essentially pre-formatting these parts.

Key Takeaway:
The main architectural pattern here is similar to the WebSocket transport: the client-side gRPC ChatTransport is responsible for the translation between the gRPC streaming protocol (and its Protobuf messages) and the Vercel AI SDK's expected SSE-based UI Message Stream. This translation or adaptation layer is where the bulk of the custom transport logic resides. The server, in turn, should make this translation easier by sending gRPC messages that closely map to the UIMessageStreamPart structure.

Using gRPC can be very powerful for performance and cross-language interoperability, but it does introduce more setup complexity (Protobuf compilation, gRPC-Web proxy if needed) compared to plain HTTP/SSE.

Take-aways / Migration Checklist Bullets

  • A gRPC ChatTransport allows Vercel AI SDK frontends to communicate with gRPC backends.
  • Requires defining services and messages in .proto files and generating client/server code.
  • gRPC-Web is typically used for browser clients, possibly with a proxy like Envoy.
  • The client transport's submit() method converts UIMessage[] to gRPC request messages, makes the gRPC call, and then adapts the gRPC server stream into a v5 UI Message Stream (SSE of UIMessageStreamParts).
  • The gRPC server should stream back messages structured to be easily convertible into UIMessageStreamParts by the client transport.
  • This approach offers performance and typed contracts but adds setup complexity (Protobuf, gRPC-Web proxy).

5. Testing & Fallback Strategy Matrix (Brief Mention)


TL;DR: Effectively testing custom ChatTransport implementations involves unit tests with mocks and integration tests with useChat (or a similar consumer), while a robust production setup might consider fallback strategies if a primary custom transport fails.

Why this matters? (Context & Pain-Point)

Whenever you introduce a custom component into a critical path, like message delivery, thorough testing becomes paramount. A buggy ChatTransport can break your entire chat functionality. Furthermore, for transports that rely on specific network conditions or server availability (like WebSockets or a custom gRPC backend), having a fallback strategy can significantly improve the resilience of your application.

How it’s solved in v5? (Step-by-step, Code, Diagrams)

While v5 provides the architectural capability for custom transports, testing and fallback are largely developer responsibilities.

Testing Custom Transports:


  1. Unit Testing the Transport:
    • Isolate and Mock: Test each method of your custom ChatTransport class in isolation.
      • For a WebSocketChatTransport, you'd mock the WebSocket object itself. You can use libraries like jest-websocket-mock or mock-socket to simulate a WebSocket server, control its responses, and assert that your transport sends the correct messages and handles server messages appropriately.
      • For a LocalStorageChatTransport, you'd mock localStorage (e.g., using jest.spyOn or a simple object mock).
      • For a gRPCChatTransport, mock the generated gRPC client stub to simulate server responses and errors.

    • Verify submit() / resume() Output: This is crucial. Your unit tests must assert that the ReadableStream returned by submit() and resume() correctly produces valid v5 UI Message Stream parts (SSE strings like data: {"type":"text", ...}\n\n). You can consume this stream in your test and parse the SSE events to check their content and order.

      // Conceptual Jest test for a transport's submit() method.
      // Import paths and the transport's constructor are assumptions here;
      // adjust them to your own project layout.
      import type { UIMessage } from 'ai';
      import { MyCustomTransport } from './my-custom-transport'; // hypothetical transport under test

      it('should produce a valid v5 UI Message Stream on submit', async () => {
        const transport = new MyCustomTransport(/* mock dependencies */);
        const mockMessages: UIMessage[] = [
          { id: '1', role: 'user', parts: [{ type: 'text', text: 'Hello' }] },
        ];
        const response = await transport.submit(mockMessages, {
          abortSignal: new AbortController().signal,
        });

        expect(response.headers.get('Content-Type')).toBe('text/event-stream');
        expect(response.headers.get('x-vercel-ai-ui-message-stream')).toBe('v1');

        // Drain the SSE body into a single string.
        const reader = response.body!.getReader();
        const decoder = new TextDecoder();
        let streamContent = '';
        let result;
        while (!(result = await reader.read()).done) {
          streamContent += decoder.decode(result.value, { stream: true });
        }
        streamContent += decoder.decode(); // flush any remaining bytes

        // Basic checks -- more sophisticated parsing of the SSE events would be better.
        expect(streamContent).toContain('data: {"type":"start"');
        expect(streamContent).toContain('data: {"type":"text"');
        expect(streamContent).toContain('data: {"type":"finish"');
      });

    • Test AbortSignal Handling: Ensure that if the AbortSignal is triggered, your transport correctly cancels its underlying operations (e.g., aborts fetch, closes the WebSocket with a cancel message, stops the gRPC call) and cleans up resources (see the sketch after this list).
    • Test getChat(): Verify it returns the correct UIMessage[] or null.
  2. Integration Testing:
    • Once your transport is unit tested, you need to test it in conjunction with a chat UI.
    • If/when useChat supports a direct transport prop, you'd pass your custom transport instance to it in a test environment and verify that the UI behaves as expected (messages stream in, state updates correctly, etc.).
    • If you're building a custom hook that uses your transport, test that hook thoroughly.
    • This might involve setting up a minimal test server for your transport (e.g., a simple WebSocket echo server, a mock gRPC server).
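
For the AbortSignal check mentioned above, a conceptual unit test could look like the following. It assumes a hypothetical WebSocketChatTransport that accepts an injectable socket factory for testing; neither that class nor the socketFactory option is a shipped SDK export.

    // Conceptual Jest test: the transport should tear down its connection when aborted.
    it('closes the underlying socket when the AbortSignal fires', async () => {
      const controller = new AbortController();
      const mockSocket = { send: jest.fn(), close: jest.fn(), readyState: 1 };
      // socketFactory is a hypothetical test seam, not a real SDK option.
      const transport = new WebSocketChatTransport({
        socketFactory: () => mockSocket as unknown as WebSocket,
      });

      const pending = transport.submit(
        [{ id: '1', role: 'user', parts: [{ type: 'text', text: 'Hi' }] }],
        { abortSignal: controller.signal },
      );

      controller.abort();
      await pending.catch(() => {}); // submit may reject once aborted

      expect(mockSocket.close).toHaveBeenCalled();
    });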

Fallback Strategy:

For transports that might fail due to network conditions or backend unavailability (e.g., WebSocket server is down, gRPC service is unreachable), consider implementing a fallback mechanism.

  • Application-Level Logic: This isn't something the ChatTransport itself would typically handle, but rather the application code that selects or configures the transport.
  • Example Scenario:
    1. Your application tries to initialize and use a WebSocketChatTransport as the primary transport.
    2. If the WebSocket connection fails to establish after a few retries (handled by the transport's internal reconnection logic), or if it encounters persistent errors.
    3. The application logic could then decide to switch to a fallback transport, such as the default HTTP/SSE transport (assuming your backend also exposes a compatible HTTP/SSE endpoint for this purpose).
    4. This would involve re-initializing useChat (if that's how transports are configured) or your custom chat logic with the fallback transport; a minimal sketch of this selection logic follows this list. [FIGURE 5: Matrix/decision tree showing: Try WebSocket -> Fails? -> Try HTTP/SSE Fallback. Columns: Transport, Condition, Action, Notes.]
  • User Notification: It's good practice to notify the user if a fallback occurs, as the experience might be slightly different (e.g., higher latency with HTTP/SSE compared to WebSockets).
  • Complexity: Implementing robust transport fallback adds complexity but can greatly enhance application resilience. It requires careful state management to ensure a smooth transition if possible.
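
Here is a conceptual sketch of that application-level selection logic. WebSocketChatTransport, DefaultHttpChatTransport, their constructor options, and connect() are hypothetical names carried over from the earlier examples in this post, not shipped SDK exports.

    // Try the WebSocket transport first; fall back to HTTP/SSE if it can't connect.
    async function selectTransport(): Promise<ChatTransport> {
      const wsTransport = new WebSocketChatTransport({ url: 'wss://yourserver.com/ai-chat' });
      try {
        // connect() is assumed to resolve once the socket is open and to apply
        // its own bounded retry logic internally before giving up.
        await wsTransport.connect();
        return wsTransport;
      } catch (err) {
        console.warn('WebSocket transport unavailable, falling back to HTTP/SSE', err);
        // Good practice: surface this to the user, since latency may differ.
        return new DefaultHttpChatTransport({ api: '/api/chat' });
      }
    }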

Take-aways / Migration Checklist Bullets

  • Thoroughly unit test your custom ChatTransport, especially ensuring submit()/resume() produce valid v5 UI Message Streams.
  • Mock dependencies (WebSocket servers, gRPC clients, localStorage) for isolated unit tests.
  • Perform integration tests with useChat (or a custom consuming component) to verify end-to-end behavior.
  • For production, consider fallback strategies: if a primary custom transport (e.g., WebSocket) fails, gracefully switch to a more resilient one (e.g., default HTTP/SSE), if a compatible backend endpoint exists.
  • Fallback logic is typically managed at the application level, not within the transport itself.
6. Security Concerns per Transport


TL;DR: Each custom ChatTransport type introduces unique security considerations, from API key exposure in direct client-to-provider models to authentication and encryption requirements for WebSockets and gRPC, and XSS risks with client-side storage.

Why this matters? (Context & Pain-Point)

Switching from the default, relatively well-understood security model of server-side API calls to custom transports means you're taking on more responsibility for securing that communication channel. Different transports have different attack surfaces and risks. Ignoring these can lead to compromised API keys, data breaches, or other security incidents. As a Senior Frontend Engineer, thinking about these implications is part of the job, even if the backend team handles the server-side.

How it’s solved in v5? (Step-by-step, Code, Diagrams)

Let's break down security considerations for some of the transport types we've discussed. The Vercel AI SDK itself provides the abstraction; securing its use is up to us.


  1. Direct Client-to-Provider Transport (e.g., the conceptual OpenAITransport from Section 2.C earlier in this post):
    • The Big Risk: API Key Exposure. This is the most significant concern. If your application embeds an LLM provider API key (e.g., OpenAI API key) directly in the client-side JavaScript code for this transport to use, that key will be exposed to anyone who inspects your site's code. This is a major security vulnerability. Once exposed, the key can be abused, leading to unexpected costs or unauthorized use of your AI provider account.
    • When is it Potentially Viable? (Use with Extreme Caution!)
      • User-Provided Keys: The only reasonably safe scenario for this pattern is if the application requires users to provide their own API key, which is then temporarily used for their session and never stored by your application (or only stored client-side with explicit user consent and strong warnings). This is common in developer tools or AI playgrounds.
      • Strictly Local Development/Demos: For your own local development where the key never leaves your machine.
    • Mitigation/Best Practice:
      • Server-Side Key Management is the GOLD STANDARD. For almost all production applications, your LLM API keys should live only on your secure server (as environment variables). The client talks to your server, and your server talks to the LLM provider. This is the pattern the default AI SDK transport encourages.
      • If using user-provided keys, ensure they are transmitted securely (HTTPS), handled carefully in memory, and cleared when the session ends or the user revokes permission.
      • Educate users about the risks if they are providing their own keys.

  2. WebSockets (WebSocketChatTransport):
    • Authentication and Authorization:
      • WebSocket connections are not inherently authenticated. You must implement a mechanism to authenticate WebSocket handshakes. Common methods include:
        • Sending an authentication token (e.g., JWT) as a query parameter in the WebSocket URL (e.g., wss://yourserver.com/ai-chat?token=YOUR_JWT). The server validates this token before upgrading the connection (a minimal handshake-validation sketch appears after this list).
        • Using cookies (if your main app authentication is cookie-based and the WebSocket server is on the same domain/subdomain, though this can have CSRF implications if not handled carefully).
        • A custom subprotocol that involves an auth challenge/response after connection.
      • Once authenticated, the server must authorize that the user can perform the requested actions (e.g., access a specific chatId).
    • WSS (Secure WebSockets): Always use wss:// (WebSocket Secure) instead of ws:// in production. This encrypts the WebSocket traffic (using TLS), just like HTTPS does for HTTP.
    • Denial of Service (DoS): Persistent WebSocket connections consume server resources. Protect your WebSocket server from DoS attacks:
      • Rate limit connection attempts.
      • Limit the number of concurrent connections per user or IP.
      • Implement message size limits.
      • Use robust server infrastructure that can handle load.
    • Cross-Site WebSocket Hijacking (CSWH): If authentication relies on ambient credentials like cookies, be aware of CSWH. Ensure your server checks the Origin header during the WebSocket handshake to prevent connections from untrusted domains.

  3. LocalStorage/IndexedDB Transport (Client-Only):
    • Cross-Site Scripting (XSS) Risk: Data stored in localStorage or IndexedDB is accessible to any JavaScript code running on the same origin (domain). If your site has an XSS vulnerability (e.g., from rendering unsanitized user input or AI output elsewhere on the page), malicious JavaScript injected via that XSS flaw could read, modify, or exfiltrate the chat history stored locally.
      • Mitigation:
        • Strong XSS prevention practices across your entire application are paramount.
        • If storing highly sensitive chat history locally, consider encrypting it before writing to localStorage/IndexedDB, using a key derived from user credentials or managed securely. This adds complexity but protects against direct data theft via XSS if the encryption key itself isn't compromised (a Web Crypto sketch follows at the end of this section).
    • Data Integrity: Data stored on the client (like in localStorage) can be tampered with by a technically savvy user directly through browser developer tools. Don't rely on client-side storage for critical data that must remain untampered if that data is then sent back to a server and trusted implicitly. Always re-validate client-provided data on the server if it impacts security or critical operations.
    • Storage Limits & Cleanup: While not strictly a security issue, unmanaged local storage can grow. Implement cleanup or quota management.

  4. gRPC Transport (gRPCChatTransport):
    • Standard gRPC Security Practices:
      • TLS: Always use TLS to encrypt gRPC communication in production. gRPC client and server libraries have built-in support for TLS.
      • Authentication: gRPC supports various authentication mechanisms:
        • Token-based authentication (e.g., sending JWTs or OAuth tokens in request metadata).
        • Mutual TLS (mTLS) for strong client/server authentication.
        • Integrate with your existing identity provider.
    • gRPC-Web Proxy: If using gRPC-Web, ensure the proxy (e.g., Envoy) is also configured securely (HTTPS, proper routing, access controls).

  5. General Security Principles (Applicable to All Transports):
    • Input Validation and Sanitization:
      • Always validate and sanitize data received from any external source, whether it's from a user, an LLM, or another service via your transport.
      • If your transport involves sending data to a backend, the backend must re-validate everything it receives from the client transport. Never trust client-side data.
      • When rendering AI-generated content, especially if it can contain Markdown or HTML-like structures, sanitize it to prevent XSS (as discussed in Post 4, Section 10.3 regarding UIMessagePart rendering).
    • Least Privilege: Ensure that any tokens or credentials used by the transport (or the backend it communicates with) have the minimum necessary permissions.
    • Dependency Security: Keep your SDK versions, libraries for transports (WebSocket clients, gRPC clients), and server-side dependencies up to date to patch known vulnerabilities.
    • Logging and Monitoring: Implement security logging and monitoring to detect suspicious activity related to your chat transport or backend.
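
To make the WebSocket handshake points from item 2 concrete, here is a minimal server-side sketch using Node and the ws package: it checks the Origin header (guarding against CSWH) and validates a JWT passed as a query parameter before upgrading the connection. verifyJwt and the allowed origin are placeholders for your own auth logic.

    import { createServer } from 'node:http';
    import { WebSocketServer } from 'ws';
    import { verifyJwt } from './auth'; // hypothetical token-validation helper

    const server = createServer();
    const wss = new WebSocketServer({ noServer: true });

    server.on('upgrade', (req, socket, head) => {
      const url = new URL(req.url ?? '/', `https://${req.headers.host}`);
      const token = url.searchParams.get('token');
      const originOk = req.headers.origin === 'https://yourapp.com'; // reject untrusted origins

      if (!originOk || !token || !verifyJwt(token)) {
        socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
        socket.destroy();
        return;
      }
      // Only authenticated, same-origin clients get upgraded to a WebSocket.
      wss.handleUpgrade(req, socket, head, (ws) => wss.emit('connection', ws, req));
    });

    server.listen(8080);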

Choosing a custom transport means you're expanding the surface area of your application. Each choice comes with its own set of security trade-offs and responsibilities. The default HTTP/SSE transport, when used with a well-secured server-side API route, often benefits from mature web security practices and infrastructure.
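
Circling back to the client-side storage point in item 3: here is a minimal sketch of encrypting chat history with the Web Crypto API before writing it to localStorage. The PBKDF2 passphrase flow is an assumption for illustration; real key management (and the matching decryption path) needs considerably more care.

    // Derive an AES-GCM key from a user passphrase (PBKDF2) and encrypt before storing.
    async function deriveKey(passphrase: string, salt: Uint8Array): Promise<CryptoKey> {
      const baseKey = await crypto.subtle.importKey(
        'raw', new TextEncoder().encode(passphrase), 'PBKDF2', false, ['deriveKey'],
      );
      return crypto.subtle.deriveKey(
        { name: 'PBKDF2', salt, iterations: 100_000, hash: 'SHA-256' },
        baseKey,
        { name: 'AES-GCM', length: 256 },
        false,
        ['encrypt', 'decrypt'],
      );
    }

    async function saveEncryptedChat(chatId: string, messagesJson: string, key: CryptoKey) {
      const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh IV for every write
      const ciphertext = await crypto.subtle.encrypt(
        { name: 'AES-GCM', iv },
        key,
        new TextEncoder().encode(messagesJson),
      );
      localStorage.setItem(
        `chat:${chatId}`,
        JSON.stringify({ iv: Array.from(iv), data: Array.from(new Uint8Array(ciphertext)) }),
      );
    }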

Take-aways / Migration Checklist Bullets

  • Direct Client-to-Provider transport: High risk of API key exposure. Only for user-provided keys or local dev. Server-side key management is best.
  • WebSockets: Must implement AuthN/AuthZ for connections, use WSS (TLS), and protect against DoS. Be aware of CSWH if using cookies.
  • LocalStorage/IndexedDB: Susceptible to XSS if your site has other vulnerabilities. Consider encryption for sensitive local data. Client data can be tampered with.
  • gRPC: Use TLS and standard gRPC authentication mechanisms (e.g., JWTs in metadata). Secure any gRPC-Web proxy.
  • Always validate and sanitize data from any external source, on both client and server.
  • The default SDK pattern (client -> your secure server -> LLM provider) is generally the most secure for managing API keys.
7. When to Stick with Default SSE/REST


TL;DR: For most standard web chat applications with a Next.js/Node.js backend, the Vercel AI SDK's default HTTP/SSE transport is robust, easy to set up, scales well with serverless functions, and is often the most pragmatic choice unless specific requirements like true bidirectional communication, offline mode, or integration with non-HTTP backends necessitate a custom transport.

Why this matters? (Context & Pain-Point)

It's easy to get excited about new possibilities like WebSockets or gRPC with custom ChatTransports. They offer powerful capabilities. However, "with great power comes great responsibility" – and often, great complexity. Before you jump into building a custom transport, it's crucial to evaluate if you actually need it. The default HTTP/SSE mechanism provided by the Vercel AI SDK (internally via callChatApi in useChat) is well-engineered, battle-tested, and optimized for many common scenarios, especially within the Vercel ecosystem.

Sometimes, the simplest solution is the best, and sticking with the default can save you significant development time, reduce maintenance overhead, and minimize potential security pitfalls associated with custom implementations.

How it’s solved in v5? (Or rather, why the default is often sufficient)

Let's consider the strengths of the default SSE/REST transport and when it's likely your best bet:


  1. Simplicity and Standard Use Cases:
    • For the majority of web applications building chat UIs where the flow is: client sends a message -> server processes it (calls an LLM) -> server streams a response back, the default mechanism is perfectly suited.
    • Easy Setup: If you're using Next.js, creating an API route (e.g., app/api/chat/route.ts) to handle chat requests is straightforward. The Vercel AI SDK provides server-side helpers (streamText, toUIMessageStreamResponse) that make implementing these routes very clean (see the sketch after this list).
    • Well-Tested and Robust: This communication pattern is core to the SDK and has been refined over time. You benefit from the collective testing and improvements made by the Vercel team and the community.
    • Developer Experience: The out-of-the-box experience with useChat and a standard API route is smooth and requires minimal configuration for networking.

  2. Scalability with Serverless Functions:
    • The HTTP request-response model, even with streaming SSE, aligns perfectly with serverless architectures (like Vercel Functions, AWS Lambda). Each chat submission can be handled by a stateless serverless function that spins up, processes the request, streams the response, and then scales down.
    • This provides excellent scalability and cost-efficiency, as you only pay for compute when requests are being processed.
    • Persistent connections like WebSockets can be more complex to manage and scale in a purely serverless environment (though solutions exist, like Vercel's support for WebSocket proxying to regional functions or stateful services).

  3. Leveraging Standard HTTP Infrastructure:
    • Caching: While POST requests for chat submissions aren't typically cached, GET requests (like those used for experimental_resume or potentially a getChat history endpoint) can benefit from standard HTTP caching mechanisms if designed appropriately (e.g., using ETags, Cache-Control headers).
    • Load Balancing: Standard HTTP load balancers work seamlessly with this pattern.
    • Monitoring and Tooling: A vast ecosystem of tools exists for monitoring, logging, and debugging HTTP traffic.
    • Firewalls and Proxies: HTTP/SSE traffic is generally well-understood and less likely to be blocked by corporate firewalls or proxies compared to less common protocols or ports.

  4. No Requirement for True Bidirectional Real-Time (Beyond AI Response Streaming):
    • Server-Sent Events (SSE) are excellent for server-to-client streaming, which is the primary pattern for AI chat responses. The client sends a request, and the server streams back the AI's answer.
    • If your application doesn't require the server to proactively send messages to the client outside of a direct response to a client request (e.g., notifications unrelated to the current chat turn, real-time updates initiated by other users), then the full duplex nature of WebSockets might be overkill.

  5. Preference for Stateless Backend Operations:
    • The default HTTP/SSE transport fits very well with stateless backend functions. Each request from the client (containing the relevant chat history) is self-contained.
    • While WebSockets can be used with stateless backends, they often imply or are used for more stateful server-side connections. If your backend logic is inherently stateless, HTTP/SSE is a natural fit.
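
To ground the "easy setup" point from item 1, here is a minimal sketch of that default route in a Next.js App Router project, using the server-side helpers named above. convertToModelMessages and the exact helper signatures are assumptions based on the v5 canary and may shift before a stable release.

    // app/api/chat/route.ts
    import { streamText, convertToModelMessages, type UIMessage } from 'ai';
    import { openai } from '@ai-sdk/openai';

    export async function POST(req: Request) {
      const { messages }: { messages: UIMessage[] } = await req.json();

      const result = streamText({
        model: openai('gpt-4o'),
        messages: convertToModelMessages(messages),
      });

      // Streams a v5 UI Message Stream (SSE of UIMessageStreamParts) back to useChat.
      return result.toUIMessageStreamResponse();
    }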

So, When Should You Consider a Custom ChatTransport?

You should only seriously consider implementing a custom ChatTransport if you have clear, compelling reasons that the default mechanism cannot satisfy:

  • You Require True Bidirectional, Low-Latency Communication Beyond Chat Responses:
    • If your application involves features like collaborative text editing alongside the chat, real-time presence indicators, or server-initiated updates that need to be pushed to the client instantly without waiting for a client request, WebSockets (via a WebSocketChatTransport) would be a strong candidate.
  • You Need a Client-Only or Offline Mode:
    • For demos that run entirely in the browser, PWAs that need to function offline, or React Native apps interacting with local data/models, a LocalStorageChatTransport, IndexedDBChatTransport, or a transport that calls a Web LLM directly is necessary. The default transport requires a network-accessible server endpoint.
  • You Are Integrating with an Existing Backend That Only Supports WebSockets, gRPC, or Another Protocol:
    • If you're building a frontend for an established backend system that already exposes its services via WebSockets or gRPC and cannot easily add an HTTP/SSE endpoint compliant with the Vercel AI SDK's v5 UI Message Stream protocol, then a custom transport is your bridge.
  • Specific Performance Demands Met by gRPC:
    • If your application has extreme performance requirements where gRPC's efficiencies (Protobufs, HTTP/2) offer a measurable advantage, and you have the infrastructure to support it, a gRPCChatTransport might be considered.
  • Unique Authentication or Network Handling Requirements:
    • If your application has very specific, non-standard authentication flows or network tunneling requirements that are difficult to implement with fetch but easier with a different client library, a custom transport could encapsulate that logic.

The Default is Good. Don't Overcomplicate Unnecessarily.

The Vercel team has put a lot of effort into making the default HTTP/SSE transport robust and efficient. For a vast number of chat applications, especially those built with Next.js and deployed on Vercel, it's the path of least resistance and offers excellent performance and scalability. The allure of custom transports is strong, but always weigh the added complexity and maintenance burden against the actual benefits for your specific use case.

Take-aways / Migration Checklist Bullets

  • The default HTTP/SSE transport (used by useChat via callChatApi) is well-suited for most web chat applications with Next.js/Node.js backends.
  • It's simple to set up, scales well with serverless functions, and benefits from standard HTTP infrastructure.
  • If your chat is primarily client-sends-request, server-streams-response, SSE is highly efficient.
  • Stick with the default if you don't have a hard requirement for:
    • True bidirectional communication beyond AI responses.
    • Client-only/offline mode.
    • Integration with an existing backend that only supports other protocols (WebSockets, gRPC) and cannot easily add a v5-compliant SSE endpoint.
  • Avoid premature optimization or unnecessary complexity by defaulting to the standard transport unless a custom one is clearly justified by your application's unique needs.
8. Future Roadmap & Contribution Hints (Conceptual)


TL;DR: While v5 Canary lays crucial groundwork for transport flexibility, future enhancements could include a direct, documented API for plugging custom transports into useChat, official SDK-provided transports for common scenarios, and community contributions for diverse backend protocols, further solidifying the Vercel AI SDK's adaptability.

Why this matters? (Context & Pain-Point)

The conceptual ChatTransport interface and the architectural decoupling in Vercel AI SDK v5 are exciting because they point towards a future of even greater flexibility. As developers, knowing where a library might be heading helps us make informed decisions today and potentially contribute to its evolution. The current Canary version is a significant step, but there's always room for refinement and expansion.

How it’s solved in v5? (And what the future might hold)

v5 Canary has laid a strong foundation with its V2 model interfaces and the standardized v5 UI Message Streaming Protocol. These are the essential building blocks that make a pluggable transport system viable and effective. Here’s a look at what the future might hold and how the community can play a role:


  1. Official Pluggable API for useChat?
    • Current State: As we've discussed, while the concept of ChatTransport is clear, the v5 Canary useChat hook doesn't (as of the canary builds reviewed for this series) feature a direct, publicly documented option like useChat({ transport: myCustomTransportInstance }).
    • Future Potential: This seems like a natural and highly valuable future enhancement. Exposing a clean, well-documented API on useChat (and its equivalents in Vue/Svelte) to accept a ChatTransport instance would greatly simplify the process of using custom transports. Developers could then focus solely on implementing the transport logic, knowing that useChat would handle all the state management, stream processing, and UI updates.
    • This would truly democratize backend flexibility for the SDK.

  2. SDK-Provided Transports?
    • Current State: The SDK effectively provides one "built-in" transport: the default HTTP/SSE mechanism used by callChatApi.
    • Future Potential: Could the Vercel AI SDK eventually ship with official, maintained transport implementations for common alternative scenarios?
      • Official WebSocket Transport: A well-tested, configurable WebSocketChatTransport provided by the SDK could be a huge boon for developers wanting to use WebSockets without implementing the complex adaptation layer themselves.
      • Direct LangServe Transport: For those heavily invested in LangChain, a more direct and optimized transport for communicating with LangServe backends could simplify integration.
      • Transports for Other Backend Services: Perhaps specialized transports for popular BaaS (Backend-as-a-Service) platforms or specific AI-focused backend frameworks.
    • Providing these "batteries-included" transports would lower the barrier to entry for using these protocols with the AI SDK.

  3. Community Contributions:
    • The Power of Open Source: If a clear ChatTransport interface is formally defined and adopted by the SDK, it opens the door wide for community contributions.
    • Ecosystem Growth: Developers could build and share transports for a multitude of backends, protocols, and use cases:
      • Transports for different database backends (e.g., a direct Firebase Realtime Database transport).
      • Transports for message queues (e.g., Kafka, RabbitMQ) if that fits a particular streaming architecture.
      • Transports for specific enterprise systems.
    • A repository of community-maintained ChatTransport implementations could significantly expand the SDK's reach and applicability.

  4. Standardizing the Adaptation Layer:
    • As we saw with the WebSocket and gRPC examples, the trickiest part of a custom transport is often adapting its native streaming/messaging format to the v5 UI Message Stream (SSE of UIMessageStreamParts).
    • Future SDK utilities or clearer guidelines on implementing this adaptation layer could make custom transport development more accessible.

The Vercel AI SDK team has shown a strong commitment to listening to the community and evolving the SDK based on real-world needs. The architectural choices in v5 Canary are a testament to this. While the direct pluggability of transports into useChat might still be on the horizon, the groundwork is undeniably there.

Teasing Post 8: Elevating AI Tool Calls with v5

Having explored how v5 allows flexible communication with backends (or even no backend!) through the ChatTransport concept, our journey through the Vercel AI SDK v5 continues. We've seen how messages are now structured with UIMessageParts, and one of the most powerful part types is ToolInvocationUIPart.

In our next installment, Post 8: First-Class Citizens - Mastering AI Tool Calls with v5's Structured UIMessagePart System, we'll dive deep into how AI SDK v5 elevates AI Tool Calls (formerly known as function calling in some contexts) into a more integrated and robust part of the chat pipeline. We'll see how the ToolInvocationUIPart and the V2 model interfaces provide a clearer, more powerful way to define, execute, and represent tool interactions, moving beyond simple string-based arguments and results. This is crucial for building sophisticated AI agents and applications that can interact with the outside world.

