Utterances

Utterances are verbatim speech directly from Wysp to the User, streamed from Wysp via a WebSocket connection, using the utterance message: Users API -> utterance

{
  "id": "c10fed62-9074-41e8-ad3b-a2da249714a2",
  "transcript": "Turn left at the lights.", // Plain text content of the utterance
  "language": "en-US",
  "timestamp": "2026-01-28T09:00:00.000Z", // Time at which utterance was (or should be) heard.
  "isHeard": true, // Whether utterance has already been heard by the user
  "interruptedAt": 0.749, // Playout progress (0.0-1.0) at which playout was interrupted
}

Heard

Utterances with isHeard: true have already been heard by the user in the form of pre-rendered audio triggered by GPS or time movements. They should be inserted into the User/Agent chat history as having been spoken by the Agent, to maintain consistency in case the user asks follow up questions that reference what they’ve heard.

Unheard

Unheard utterances will only occur if the client doesn’t implement Statements, and thus the Agent must “speak” them directly to the user.

In this case, they should be entered into the chat history, and your agent should speak the transcript directly to the user.