Documentation

Everything you need to integrate Sayd into your application.

Talk API

Push-to-talk voice input with AI-powered transcript cleaning. WebSocket streaming + LLM processing.

Listen API

Real-time speech-to-text via WebSocket. Raw transcription without AI cleaning — you control the pipeline.

Transcribe API

Upload audio files for async transcription. Supports WAV, MP3, and more. Results via polling.
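
The polling flow mentioned above can be sketched as a small client-side loop. This is a minimal illustration, not Sayd's actual client: the `fetch_status` callable, the `status` field, and the values `completed` and `failed` are assumptions made for the example, so check the Transcribe API documentation for the real endpoint and response shape.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal states; the real API may name them differently.
TERMINAL_STATES = ("completed", "failed")


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until the job reaches a terminal state.

    fetch_status stands in for whatever request returns the job's current
    state (for example an HTTP GET on a job URL).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription job did not finish in time")
        sleep(interval_s)


# Simulated job: reports "processing" twice, then "completed".
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "text": "hello world"},
])
result = poll_until_done(lambda: next(_responses), interval_s=0.0)
print(result["text"])
```

Passing the status fetcher in as a callable keeps the loop independent of any HTTP library, and the timeout prevents a stuck job from blocking the caller forever.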

VAD API

Voice Activity Detection — detect speech segments or check if audio contains speech.
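
To make the two modes concrete (listing speech segments versus a yes/no check), here is a toy energy-threshold detector. It is not how Sayd's VAD works, and it flags any sufficiently loud audio rather than speech specifically. The function names, frame size, and threshold are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def detect_speech_segments(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,
    threshold: float = 0.02,
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose per-frame RMS energy meets the threshold.

    samples are assumed to be mono floats in the range [-1.0, 1.0].
    """
    frame_len = sample_rate * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("frame_ms is too small for this sample rate")

    segments: List[Tuple[float, float]] = []
    start = None
    n_frames = len(samples) // frame_len  # a trailing partial frame is ignored
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = (sum(x * x for x in frame) / frame_len) ** 0.5
        t = i * frame_len / sample_rate  # start time of this frame in seconds
        if rms >= threshold and start is None:
            start = t                     # a loud region begins
        elif rms < threshold and start is not None:
            segments.append((start, t))   # the loud region ends
            start = None
    if start is not None:
        # Audio ended while still inside a loud region.
        segments.append((start, n_frames * frame_len / sample_rate))
    return segments


def contains_speech(samples: Sequence[float], sample_rate: int, **kwargs) -> bool:
    """Yes/no variant: True if at least one segment is detected."""
    return bool(detect_speech_segments(samples, sample_rate, **kwargs))
```

The segment function answers "where is the speech?", while the boolean wrapper answers "is there any speech at all?", which mirrors the two uses described above.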

Getting Started

The fastest way to get started:

1. Create a free account and get your API key
2. Follow the Quick Start guide
3. Explore the Talk API documentation

Supported Languages

Sayd supports real-time transcription and translation in 60+ languages. All languages work with both the Real-time Streaming API and the Batch Transcription API. Mixed-language speech (code-switching) is automatically detected.

Language	ISO Code	Language	ISO Code
Afrikaans	`af`	Albanian	`sq`
Arabic	`ar`	Azerbaijani	`az`
Basque	`eu`	Belarusian	`be`
Bengali	`bn`	Bosnian	`bs`
Bulgarian	`bg`	Catalan	`ca`
Chinese	`zh`	Croatian	`hr`
Czech	`cs`	Danish	`da`
Dutch	`nl`	English	`en`
Estonian	`et`	Finnish	`fi`
French	`fr`	Galician	`gl`
German	`de`	Greek	`el`
Gujarati	`gu`	Hebrew	`he`
Hindi	`hi`	Hungarian	`hu`
Indonesian	`id`	Italian	`it`
Japanese	`ja`	Kannada	`kn`
Kazakh	`kk`	Korean	`ko`
Latvian	`lv`	Lithuanian	`lt`
Macedonian	`mk`	Malay	`ms`
Malayalam	`ml`	Marathi	`mr`
Norwegian	`no`	Persian	`fa`
Polish	`pl`	Portuguese	`pt`
Punjabi	`pa`	Romanian	`ro`
Russian	`ru`	Serbian	`sr`
Slovak	`sk`	Slovenian	`sl`
Spanish	`es`	Swahili	`sw`
Swedish	`sv`	Tagalog	`tl`
Tamil	`ta`	Telugu	`te`
Thai	`th`	Turkish	`tr`
Ukrainian	`uk`	Urdu	`ur`
Vietnamese	`vi`	Welsh	`cy`

💡 Auto Language Detection: You don't need to specify the language. Sayd automatically detects the spoken language and handles mid-sentence language switching. Translation works between any pair of the 60+ supported languages (3,600+ combinations).
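
Even with auto-detection, an application may want to check a language code (for example a user-selected translation target) against the table above before sending a request. The helper below is a client-side sketch only: `SUPPORTED_LANGUAGES` is copied from the table, while `normalize_language` and its handling of region suffixes such as `en-US` are assumptions, not part of the Sayd API.

```python
# Two-letter codes copied from the Supported Languages table.
SUPPORTED_LANGUAGES = {
    "af": "Afrikaans", "sq": "Albanian", "ar": "Arabic", "az": "Azerbaijani",
    "eu": "Basque", "be": "Belarusian", "bn": "Bengali", "bs": "Bosnian",
    "bg": "Bulgarian", "ca": "Catalan", "zh": "Chinese", "hr": "Croatian",
    "cs": "Czech", "da": "Danish", "nl": "Dutch", "en": "English",
    "et": "Estonian", "fi": "Finnish", "fr": "French", "gl": "Galician",
    "de": "German", "el": "Greek", "gu": "Gujarati", "he": "Hebrew",
    "hi": "Hindi", "hu": "Hungarian", "id": "Indonesian", "it": "Italian",
    "ja": "Japanese", "kn": "Kannada", "kk": "Kazakh", "ko": "Korean",
    "lv": "Latvian", "lt": "Lithuanian", "mk": "Macedonian", "ms": "Malay",
    "ml": "Malayalam", "mr": "Marathi", "no": "Norwegian", "fa": "Persian",
    "pl": "Polish", "pt": "Portuguese", "pa": "Punjabi", "ro": "Romanian",
    "ru": "Russian", "sr": "Serbian", "sk": "Slovak", "sl": "Slovenian",
    "es": "Spanish", "sw": "Swahili", "sv": "Swedish", "tl": "Tagalog",
    "ta": "Tamil", "te": "Telugu", "th": "Thai", "tr": "Turkish",
    "uk": "Ukrainian", "ur": "Urdu", "vi": "Vietnamese", "cy": "Welsh",
}


def normalize_language(code: str) -> str:
    """Return the lowercase two-letter code if it is in the table.

    Region suffixes ("en-US", "pt_BR") are dropped before the lookup.
    Raises ValueError for codes that are not listed.
    """
    base = code.strip().lower().replace("_", "-").split("-")[0]
    if base not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return base


print(normalize_language("en-US"))                       # en
print(SUPPORTED_LANGUAGES[normalize_language("pt_BR")])  # Portuguese
```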

Why Sayd?

Feature	Sayd	Traditional APIs
Latency	< 200 ms to first byte	500 ms+
Pricing	Token-based, from $0	Per-minute, minimums
Agent Integration	Native support	Manual wiring