Bodhi Docs

Error and Response

A reference guide outlining how the system communicates issues during API interactions, helping developers understand, handle, and resolve errors gracefully.

💬 Error Responses

This section describes the possible error responses you may encounter when interacting with the Bodhi API, applicable to both streaming and non-streaming modes. Errors typically arise from issues such as invalid configurations, authentication failures, account-related problems, or server-side issues. In the event of an error after a WebSocket connection is established, the server will send an error message before closing the connection.

Error Codes

| Status Code | Error Type | Description | Example Server Response |
|---|---|---|---|
| 400 | Bad Request | Invalid configuration sent via WebSocket, such as malformed JSON, invalid transaction_id, or unavailable model. | {"error": "model 'zz-banking-v2-8khz' is not available", "code": 400} |
| 401 | Unauthorized | Missing or incorrect x-customer-id or x-api-key header. | (none) |
| 402 | Insufficient Balance | The account does not have enough credits to process the ASR request. | (none) |
| 403 | Inactive Customer | The customer account is deactivated and cannot process requests. | (none) |
| 500 | Internal Server Error | Unexpected server error or panic during request handling. | (none) |
| 503 | Service Unavailable | The server is currently unavailable or temporarily overloaded. | (none) |
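
As a minimal sketch of client-side handling, the Python snippet below parses an error payload and decides whether a retry is worthwhile. It assumes every error message has the same {"error": ..., "code": ...} shape as the 400 example, which is the only example documented here. Treating 500 and 503 as retryable is an illustrative policy, not something Bodhi specifies, and the names parse_error, is_retryable and RETRYABLE_CODES are hypothetical.

```python
import json
from typing import Optional, Tuple

# Assumption: only transient server-side failures are worth retrying.
# 400/401/402/403 need a fix on the client or account side first.
RETRYABLE_CODES = {500, 503}


def parse_error(raw: str) -> Optional[Tuple[int, str]]:
    """Return (code, message) if raw looks like an error payload, else None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Assumed shape: a JSON object with "error" (string) and "code" (int).
    if not isinstance(payload, dict) or "error" not in payload:
        return None
    code = payload.get("code")
    if not isinstance(code, int):
        return None
    return code, str(payload["error"])


def is_retryable(code: int) -> bool:
    """True for status codes this sketch treats as transient."""
    return code in RETRYABLE_CODES
```

A caller could run parse_error on each incoming message before treating it as a transcript, and reconnect with backoff only when is_retryable returns True.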

Response Format Table

The schema below shows the structure of a response from the Bodhi API; the field descriptions that follow explain what each field means.

{
  "call_id": "<uuid>",
  "segment_id": <int>,
  "eos": <boolean>,
  "type": "<string>",
  "text": "<string>",
  "segment_meta": {
    "tokens": [],
    "timestamps": [],
    "start_time": <float>,
    "confidence": <float>,
    "words": [
      {
        "word": "<string>",
        "confidence": <float>
      }
    ]
  }
}
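
To make the shape above concrete, here is a small Python sketch that parses one response message into typed objects. The class and function names (Word, Segment, parse_segment) are hypothetical, and the sketch assumes that segment_meta and its sub-fields may be absent or empty on some messages, so it falls back to empty lists and None rather than failing.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Word:
    word: str
    confidence: Optional[float] = None


@dataclass
class Segment:
    call_id: str
    segment_id: int
    eos: bool
    type: str
    text: str
    tokens: List[str] = field(default_factory=list)
    timestamps: List[float] = field(default_factory=list)
    start_time: Optional[float] = None
    confidence: Optional[float] = None
    words: List[Word] = field(default_factory=list)


def parse_segment(raw: str) -> Segment:
    """Parse one JSON response message into a Segment."""
    data = json.loads(raw)
    # Assumption: segment_meta may be missing or null on some messages.
    meta = data.get("segment_meta") or {}
    words = [Word(w["word"], w.get("confidence")) for w in meta.get("words") or []]
    return Segment(
        call_id=data["call_id"],
        segment_id=int(data["segment_id"]),
        eos=bool(data["eos"]),
        type=data["type"],
        text=data.get("text", ""),
        tokens=list(meta.get("tokens") or []),
        timestamps=[float(t) for t in meta.get("timestamps") or []],
        start_time=meta.get("start_time"),
        confidence=meta.get("confidence"),
        words=words,
    )
```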

Field Descriptions

Key
Description

call_id (string)

Unique identifier (UUID) associated with each streaming connection.

segment_id (int)

Integer identifying each speech segment within the active socket connection.

eos (bool)

Marks the end of the streaming connection when "eos" is true.

type (string)

Possible values: "partial" | "complete"

partial

  • Partial transcript corresponding to every streaming audio chunk

complete

  • Complete/final transcript generated for each speech segment

    • Generated once per segment_id i.e., when the speech segment end is reached

text (string)

The transcript that has been processed thus far.

segment_meta (object)

  • tokens: Array of strings representing individual text pieces (or "tokens") recognized from the audio. Tokens may include words or parts of words.

  • timestamps: Array of numerical values indicating when each token was detected in the segment/sentence (in seconds). Each timestamp aligns with the tokens array, so the i-th timestamp represents the time at which the i-th token was spoken. Useful for measuring latency.

  • start_time: Starting point (in seconds) of the current segment in the overall audio timeline.

  • confidence: Segment-level confidence. Float between 0 and 1. Currently supported for all languages except Telugu, Odia and English.

  • words: Array of word-level objects (only populated when type is "complete").

    • word: The recognized word.

    • confidence: Float between 0.0 and 1.0 representing the model’s confidence in the recognized word. Currently supported for all languages except Telugu, Odia and English.
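
The field semantics above can be sketched as a small transcript assembler in Python. This is an illustration under stated assumptions, not Bodhi's client implementation: it assumes each "partial" text is the running transcript of the current segment (so a newer partial replaces the older one), and that timestamps are offsets from the segment's start_time, so adding the two gives a position in the overall audio timeline. TranscriptAssembler and token_times are hypothetical names.

```python
from typing import Dict, List, Tuple


class TranscriptAssembler:
    """Keeps final text per segment_id plus the latest partial text."""

    def __init__(self) -> None:
        self.final: Dict[int, str] = {}
        self.partial: str = ""
        self.done: bool = False

    def feed(self, msg: dict) -> None:
        if msg.get("type") == "complete":
            # A complete message arrives once per segment_id and finalizes it.
            self.final[msg["segment_id"]] = msg.get("text", "")
            self.partial = ""
        elif msg.get("type") == "partial":
            # Assumption: each partial supersedes the previous partial.
            self.partial = msg.get("text", "")
        if msg.get("eos"):
            self.done = True

    def transcript(self) -> str:
        parts = [self.final[k] for k in sorted(self.final)]
        if self.partial:
            parts.append(self.partial)
        return " ".join(p for p in parts if p)


def token_times(msg: dict) -> List[Tuple[str, float]]:
    """Pair each token with its time in the overall audio, in seconds."""
    meta = msg.get("segment_meta") or {}
    # Assumption: timestamps are relative to the segment's start_time.
    start = meta.get("start_time") or 0.0
    tokens = meta.get("tokens") or []
    stamps = meta.get("timestamps") or []
    return [(tok, start + ts) for tok, ts in zip(tokens, stamps)]
```

Feeding every parsed message to feed() and stopping once done is True yields the full transcript from transcript(). If timestamps turn out to be absolute rather than segment-relative, drop the start offset in token_times.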

