gpt-realtime-1.5

Enable low-latency, multimodal AI interactions for voice agents and real-time audio transcription.

About gpt-realtime-1.5

The OpenAI Realtime API, used with the gpt-realtime-1.5 model, provides low-latency communication with AI models that support speech-to-speech interaction. It accepts multimodal input (audio, images, text) and produces audio and text output. It is aimed primarily at developers building highly responsive voice agents that hold natural, real-time conversations inside applications. The API suits three kinds of integration: browser-based voice agents built with the Agents SDK for TypeScript, server-side applications that need consistently low latency over WebSocket, and VoIP telephony connections over SIP. Beyond voice agents, it also offers real-time audio transcription. Developers can manage conversation lifecycles, control sessions server-side with webhooks, and optimize costs, so the API can be integrated into a range of platforms.
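
To make the server-side WebSocket option more concrete, here is a minimal sketch of what a client might prepare before opening a connection. It only builds the connection URL, the authorization header, and two JSON events; it does not open a socket. The endpoint URL, the `session.update` and `response.create` event names, and the `instructions` field are assumptions based on OpenAI's public Realtime documentation, not details stated on this page, and the helper function names are hypothetical. Check the current API reference before relying on any of them.

```python
import json
from urllib.parse import urlencode

# Assumed Realtime WebSocket endpoint; verify against the official API reference.
REALTIME_BASE_URL = "wss://api.openai.com/v1/realtime"


def build_realtime_url(model: str) -> str:
    """Return the WebSocket URL with the model passed as a query parameter."""
    return f"{REALTIME_BASE_URL}?{urlencode({'model': model})}"


def build_auth_headers(api_key: str) -> dict:
    """Return the bearer-token header a server-side client would send."""
    return {"Authorization": f"Bearer {api_key}"}


def session_update_event(instructions: str) -> str:
    """Serialize a client event that sets session instructions (assumed schema)."""
    return json.dumps(
        {"type": "session.update", "session": {"instructions": instructions}}
    )


def response_create_event() -> str:
    """Serialize a client event that asks the model to respond (assumed schema)."""
    return json.dumps({"type": "response.create"})


if __name__ == "__main__":
    # The model name is taken from this page; the API key here is a placeholder.
    print(build_realtime_url("gpt-realtime-1.5"))
    print(build_auth_headers("YOUR_API_KEY"))
    print(session_update_event("You are a concise voice assistant."))
    print(response_create_event())
```

A real client would pass the URL and headers to a WebSocket library of its choice and then send the serialized events as text frames. Keeping the payload construction separate from the transport, as above, makes it straightforward to unit-test the messages without a network connection.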

Products by gpt-realtime-1.5

gpt-realtime-1.5 by OpenAI

Pricing: paid