Speech-to-Text - Hamsa API

Connect to the WebSocket and send STT requests to transcribe audio into text.

Quick Start

1. Enter your API key in the authentication field.
2. Click Connect to establish the WebSocket connection.
3. Provide base64-encoded audio data.
4. Click Send to receive the transcription.

Request Message

After connecting, send a JSON message with the following structure:

type (string, required): Must be "stt".

payload (object, required): Object with the following properties.

payload.audioBase64 (string, required): Base64-encoded audio data.

payload.language (string, default: "ar"): Language code for transcription. Defaults to "ar" (Arabic).

payload.isEosEnabled (boolean, default: true): Enables end-of-speech detection.

payload.eosThreshold (number, default: 0.3): Threshold for end-of-speech detection, from 0.0 to 1.0.

STT Request

{
  "type": "stt",
  "payload": {
    "audioBase64": "UklGRiQAAABXQVZFZm10IBAAAAABAAEAQB8AAIA+AAACABAAZGF0YQAAAAA=",
    "language": "ar",
    "isEosEnabled": true,
    "eosThreshold": 0.3
  }
}
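
As a rough illustration, the request message above could be assembled in Python as follows. The function name build_stt_request and its keyword arguments are hypothetical, and the client-side range check on the threshold is an assumption based on the documented 0.0 to 1.0 range, not confirmed server behavior.

```python
import base64
import json


def build_stt_request(audio: bytes, language: str = "ar",
                      is_eos_enabled: bool = True,
                      eos_threshold: float = 0.3) -> str:
    """Return the JSON text of an STT request for the given raw audio bytes."""
    # Assumption: reject out-of-range thresholds locally before sending.
    if not 0.0 <= eos_threshold <= 1.0:
        raise ValueError("eos_threshold must be between 0.0 and 1.0")
    message = {
        "type": "stt",
        "payload": {
            # The API expects the audio as base64 text, not raw bytes.
            "audioBase64": base64.b64encode(audio).decode("ascii"),
            "language": language,
            "isEosEnabled": is_eos_enabled,
            "eosThreshold": eos_threshold,
        },
    }
    return json.dumps(message)
```

The returned string is what would be sent over the open WebSocket connection. The defaults mirror the documented ones, so build_stt_request(audio_bytes) alone yields a request for Arabic transcription with end-of-speech detection enabled.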

Response Format

Transcription Result

مرحبا بك في خدمة همسة

Error Response

{
  "type": "error",
  "payload": {
    "message": "Error generating transcription: Audio format not supported"
  }
}

A successful transcription is returned as a plain text string, not wrapped in JSON. Only errors are returned as JSON objects.
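
Because success and error responses use different shapes, a client has to tell them apart itself. One possible approach, sketched below, is to try parsing each incoming message as JSON and treat it as an error only when it is an object with "type" set to "error". The function name parse_stt_response is hypothetical, and the fallback rule (anything else is a transcript) is an assumption rather than documented behavior.

```python
import json
from typing import Tuple


def parse_stt_response(raw: str) -> Tuple[bool, str]:
    """Return (ok, text): the transcript when ok is True, else the error message."""
    try:
        decoded = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all, so treat it as the plain-string transcript.
        return True, raw
    if isinstance(decoded, dict) and decoded.get("type") == "error":
        payload = decoded.get("payload")
        message = payload.get("message", "") if isinstance(payload, dict) else ""
        return False, message
    # Valid JSON but not an error object (for example a transcript such as "2024").
    return True, raw
```

Checking the parsed shape, rather than only whether parsing succeeds, keeps transcripts that happen to be valid JSON (such as a bare number) from being misread as errors.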

Supported Audio Formats

Audio may be in any format the backend supports (for example WAV or MP3) and must be base64-encoded before it is sent.
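
For a quick smoke test without a recording at hand, a short WAV clip can be generated and base64-encoded using only the Python standard library. This is a minimal sketch: the helper name silent_wav_base64 is hypothetical, and the choice of 8000 Hz mono 16-bit PCM is an assumption, not a documented requirement.

```python
import base64
import io
import wave


def silent_wav_base64(seconds: float = 1.0, sample_rate: int = 8000) -> str:
    """Return a base64-encoded mono 16-bit PCM WAV clip containing only silence."""
    frame_count = int(seconds * sample_rate)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frame_count)  # zero-valued samples
    return base64.b64encode(buffer.getvalue()).decode("ascii")
```

The resulting string can be used directly as the audioBase64 value. To send a real recording instead, read the file's bytes and pass them through base64.b64encode in the same way.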

