Fair, flexible pricing
Pay only for what you use. Token-based pricing built to scale with you.
$5 free credit for new users
Every new account gets $5 in free credit. No credit card is required, and the credit never expires. Start building immediately.
Real-time Audio-to-Text API
All API costs are calculated based on tokens.
Token-based
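Equivalent to about $0.126 per hour of audio for real-time (streaming) transcription, counting input audio tokens and output text tokens only; any input text tokens you send are billed on top of that.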
| Token type | Real-time (streaming) |
| --- | --- |
| Input audio tokens (duration of audio or streaming session) | $2.10 per 1M tokens |
| Input text tokens (custom instructions or context you provide) | $4.20 per 1M tokens |
| Output text tokens (transcription and other text returned by the model) | $4.20 per 1M tokens |
Usage reference: 1 hour of audio is ~30,000 input audio tokens. 1 hour of speech is ~15,000 output text tokens. 1 character of output is ~0.3 tokens.
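To see how the hourly figure follows from the per-token prices, here is a minimal sketch in Python. The prices and usage ratios are copied from this page; the constant and function names are illustrative only and are not part of any official SDK. Real output token counts depend on how much speech the audio contains, so treat the result as an estimate.

```python
# Prices from this page, in USD per 1M tokens.
AUDIO_INPUT_PRICE_PER_M = 2.10
TEXT_INPUT_PRICE_PER_M = 4.20
TEXT_OUTPUT_PRICE_PER_M = 4.20

# Approximate usage ratios from the usage reference on this page.
AUDIO_TOKENS_PER_HOUR = 30_000
OUTPUT_TOKENS_PER_HOUR = 15_000


def estimate_cost_usd(hours: float, input_text_tokens: int = 0) -> float:
    """Estimate streaming transcription cost for `hours` of audio.

    `input_text_tokens` covers any custom instructions or context
    sent along with the audio (hypothetical parameter name).
    """
    audio_tokens = hours * AUDIO_TOKENS_PER_HOUR
    output_tokens = hours * OUTPUT_TOKENS_PER_HOUR
    total = (
        audio_tokens * AUDIO_INPUT_PRICE_PER_M
        + input_text_tokens * TEXT_INPUT_PRICE_PER_M
        + output_tokens * TEXT_OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


hourly = estimate_cost_usd(1)
print(f"Cost per hour: ${hourly:.3f}")
print(f"Hours covered by $5 credit: {5 / hourly:.1f}")
```

With no input text tokens, one hour works out to $0.063 for audio plus $0.063 for output text, which matches the $0.126 per hour quoted above. At that rate, the $5 free credit covers roughly 39 to 40 hours of audio.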