From 714c11289a56e9df3d9a169a8ac879f917e4ddba Mon Sep 17 00:00:00 2001
From: Dhruva Reddy
Date: Mon, 16 Mar 2026 16:22:02 -0700
Subject: [PATCH] docs: add transcriptType and turnId fields to WebSocket message docs

Document two new optional fields from VapiAI/vapi#10548:

- transcriptType (partial|final) on transcriber-response messages
- turnId on model-output and user-interrupted messages

Made-with: Cursor
---
 fern/customization/custom-transcriber.mdx | 20 ++++++++++++++++----
 fern/server-url/events.mdx                | 10 +++++++---
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/fern/customization/custom-transcriber.mdx b/fern/customization/custom-transcriber.mdx
index 771b4dfc7..bd6aeb24d 100644
--- a/fern/customization/custom-transcriber.mdx
+++ b/fern/customization/custom-transcriber.mdx
@@ -43,14 +43,22 @@ You'll learn how to:
   Your server forwards the audio to Deepgram (or your chosen transcriber) using its SDK. Deepgram processes the audio and returns transcript events that include a `channel_index` (e.g. `[0, ...]` for customer, `[1, ...]` for assistant). The service buffers the incoming data, processes the transcript events (with debouncing and channel detection), and emits a final transcript.
 
-  The final transcript is sent back to Vapi as a JSON message:
+  The transcript is sent back to Vapi as a JSON message:
   ```json
   {
     "type": "transcriber-response",
     "transcription": "The transcribed text",
-    "channel": "customer" // or "assistant"
+    "channel": "customer",
+    "transcriptType": "final"
   }
   ```
+
+  The optional `transcriptType` field controls how Vapi handles the transcript:
+
+  - **`"final"`** (default) — the transcription is definitive.
+  - **`"partial"`** — the transcription is provisional and may be superseded by a later message. Each partial replaces the previous one until a `"final"` arrives.
+
+  If omitted, `transcriptType` defaults to `"final"` for backward compatibility.
@@ -362,6 +370,7 @@ You'll learn how to:
         type: "transcriber-response",
         transcription: text,
         channel,
+        transcriptType: "final",
       };
       ws.send(JSON.stringify(response));
       logger.logDetailed("INFO", "Sent transcription to client", "Server", {
@@ -423,12 +432,13 @@ You'll learn how to:
   - The `"start"` message initializes the Deepgram session.
   - PCM audio data is forwarded to Deepgram.
   - Deepgram returns transcript events, which are processed with channel detection and debouncing.
-  - The final transcript is sent back as a JSON message:
+  - The transcript is sent back as a JSON message:
     ```json
     {
       "type": "transcriber-response",
      "transcription": "The transcribed text",
-      "channel": "customer" // or "assistant"
+      "channel": "customer",
+      "transcriptType": "final"
     }
     ```
@@ -444,6 +454,8 @@ You'll learn how to:
   The solution buffers PCM audio and performs simple validation (e.g. ensuring stereo PCM data length is a multiple of 4). If the audio data is malformed, it is trimmed to a valid length.
 - **Channel detection:**
   Transcript events from Deepgram include a `channel_index` array. The service uses the first element to determine whether the transcript is from the customer (`0`) or the assistant (`1`). Ensure Deepgram's response format remains consistent with this logic.
+- **Partial transcripts:**
+  Set `transcriptType` to `"partial"` to send progressive transcription updates. Each partial supersedes the previous one until a `"final"` message arrives. This is useful for STT providers that emit fast, low-latency partials that are refined over time (e.g. ElevenLabs Scribe). If `transcriptType` is omitted, Vapi treats the message as `"final"`.
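The partial/final semantics documented in the hunks above can be sketched as a small state reducer. This is an illustrative sketch, not part of the Vapi API: the `ChannelState` shape and `applyResponse` helper are hypothetical, assuming a consumer that tracks one provisional transcript per channel, overwritten by each partial and committed by a final.

```typescript
type TranscriberResponse = {
  type: "transcriber-response";
  transcription: string;
  channel: "customer" | "assistant";
  transcriptType?: "partial" | "final"; // omitted => treated as "final"
};

// Hypothetical per-channel state: finals accumulate, at most one partial is pending.
type ChannelState = { committed: string[]; provisional: string | null };

// Hypothetical helper: apply one transcriber-response message to channel state.
function applyResponse(state: ChannelState, msg: TranscriberResponse): ChannelState {
  const kind = msg.transcriptType ?? "final"; // default preserves backward compatibility
  if (kind === "partial") {
    // Each partial replaces the previous provisional transcript.
    return { ...state, provisional: msg.transcription };
  }
  // A final commits the text and clears any outstanding partial.
  return { committed: [...state.committed, msg.transcription], provisional: null };
}
```

Under this reading, a message without `transcriptType` behaves exactly like a pre-existing final-only integration.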
diff --git a/fern/server-url/events.mdx b/fern/server-url/events.mdx
index 0a33f737b..f6a9178b3 100644
--- a/fern/server-url/events.mdx
+++ b/fern/server-url/events.mdx
@@ -289,13 +289,14 @@ For final-only events, you may receive `type: "transcript[transcriptType=\"final
 
 ### Model Output
 
-Tokens or tool-call outputs as the model generates them.
+Tokens or tool-call outputs as the model generates them. The optional `turnId` groups all tokens from the same LLM response, so you can correlate output with a specific turn.
 
 ```json
 {
   "message": {
     "type": "model-output",
-    "output": { /* token or tool call */ }
+    "output": { /* token or tool call */ },
+    "turnId": "abc-123"
   }
 }
 ```
@@ -339,10 +340,13 @@ Fires whenever a transfer occurs.
 
 ### User Interrupted
 
+Sent when the user interrupts the assistant. The optional `turnId` identifies the LLM turn that was interrupted, matching the `turnId` on `model-output` messages so you can discard that turn's tokens.
+
 ```json
 {
   "message": {
-    "type": "user-interrupted"
+    "type": "user-interrupted",
+    "turnId": "abc-123"
   }
 }
 ```
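A consumer of these events can use `turnId` to correlate streamed tokens with interruptions, as the doc text above describes. The `TurnBuffer` class below is a hypothetical sketch, not part of any Vapi SDK: it assumes a webhook handler that buffers `model-output` tokens per turn and discards the buffer when a matching `user-interrupted` arrives.

```typescript
// Minimal shapes for the two message types documented above.
type ServerMessage =
  | { type: "model-output"; output: unknown; turnId?: string }
  | { type: "user-interrupted"; turnId?: string };

// Hypothetical helper: group tokens by turnId and drop interrupted turns.
class TurnBuffer {
  private turns = new Map<string, unknown[]>();

  handle(msg: ServerMessage): void {
    if (msg.turnId === undefined) return; // turnId is optional; nothing to correlate
    if (msg.type === "model-output") {
      // Append this token (or tool call) to its turn's buffer.
      const buf = this.turns.get(msg.turnId) ?? [];
      buf.push(msg.output);
      this.turns.set(msg.turnId, buf);
    } else {
      // user-interrupted: discard the interrupted turn's tokens.
      this.turns.delete(msg.turnId);
    }
  }

  tokens(turnId: string): unknown[] {
    return this.turns.get(turnId) ?? [];
  }
}
```

Messages without a `turnId` are simply ignored here, which mirrors the field being optional for backward compatibility.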