OpenAI TTS

Steerable speech.

The gpt-4o-mini-tts model generates natural-sounding speech with fine-grained control over tone, pace, and style, and it supports streaming output.

Model

gpt-4o-mini-tts

Voices

6 voices

Languages

50+ languages

Latency

< 500 ms

Capabilities

What OpenAI TTS does best

Human-like prosody and rhythm with minimal effort.

Describe the style you want in natural language: 'whisper urgently' or 'speak like a newscaster'.

Real-time audio output for interactive applications.

Handles emphasis, pauses, and emotional range naturally.
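As a rough illustration of the natural-language style control described above, the sketch below builds a request payload that pairs the text with a free-form style direction. It assumes the field names of OpenAI's public speech endpoint (model, input, voice, instructions); this workspace may expose a different interface. The helper build_speech_request is hypothetical, and the sketch only constructs the payload without sending it.

```python
import json

# Assumption: the six voices and the per-request limit listed in the FAQ on this page.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
MAX_CHARS = 4096


def build_speech_request(text, voice="alloy", instructions=None):
    """Return a JSON-serializable payload for a text-to-speech request.

    Hypothetical helper: field names follow OpenAI's public speech endpoint.
    """
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if len(text) > MAX_CHARS:
        raise ValueError(f"input is {len(text)} characters; limit is {MAX_CHARS}")
    payload = {"model": "gpt-4o-mini-tts", "input": text, "voice": voice}
    if instructions:
        # Free-form style direction, e.g. "whisper urgently".
        payload["instructions"] = instructions
    return payload


if __name__ == "__main__":
    request = build_speech_request(
        "The market closed higher today.",
        voice="nova",
        instructions="speak like a newscaster",
    )
    print(json.dumps(request, indent=2))
```

Keeping the style direction in its own field, separate from the text to be spoken, means the same script can be re-rendered in a different tone without editing the words.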

FAQ

OpenAI TTS offers 6 built-in voices: alloy, echo, fable, onyx, nova, and shimmer.

Each request accepts up to 4,096 characters. Longer texts are split automatically.
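The page does not say how longer texts are split. One plausible approach, sketched here as an assumption and not as the service's actual method, is to pack whole sentences into chunks that stay within the 4,096-character limit and to hard-cut only when a single sentence exceeds it. The function name split_text is hypothetical.

```python
import re

MAX_CHARS = 4096  # per-request limit stated in the FAQ


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at hard boundaries.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(split_text("One. Two. Three.", limit=10)):
        print(i, repr(chunk))
```

Splitting at sentence ends keeps each chunk's prosody self-contained, which tends to matter more for speech than for plain text processing.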

Get started

Generate, clone, narrate and broadcast from a single workspace. No credit card required to start.