OpenAI has released new ‘voice models’ via API
Including one that translates speech in real time.
OpenAI has released three new real-time speech AI models for developers. Each handles a distinct task: reasoning, translation, or speech-to-text transcription. The new GPT-Realtime models are available via an API.
- GPT-Realtime-2 is the first voice model with GPT-5-level reasoning, capable of handling more complex queries and more natural conversation. Its context window was expanded from 32,000 to 128,000 tokens.
- GPT-Realtime-Translate is a new real-time translation model that translates speech from over 70 input languages into 13 output languages while keeping up with the speaker’s pace.
- GPT-Realtime-Whisper is a new streaming speech-to-text model that transcribes speech in real time as the user speaks.

“Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.
Who will benefit from these updates? The obvious target is businesses looking to expand their customer service capabilities. But OpenAI also notes that the new features will help in a variety of areas, including education, media, events, and developer platforms.
As useful as these tools seem from an enterprise perspective, it also seems plausible that they could be misused. The company said it has built guardrails to stop its new features from being abused to create spam, fraud, or other forms of online abuse.
Certain triggers have been embedded in the system so that “conversations can be halted if they are detected as violating our harmful content guidelines,” OpenAI said.
All of the new voice models are included in OpenAI’s Realtime API. Translate and Whisper are billed by the minute, while GPT-Realtime-2 is billed by token consumption.
API prices are as follows:
- GPT-Realtime-2 – $32 and $64 for input and output tokens, respectively.
- GPT-Realtime-Translate – $0.034 per minute.
- GPT-Realtime-Whisper – $0.017 per minute.
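To get a feel for what these rates mean in practice, here is a rough cost sketch in Python. It uses only the figures listed above. Two things are assumptions, not statements from OpenAI: the two GPT-Realtime-2 figures are read as input-token and output-token prices, and the billing unit is taken to be 1 million tokens, which the list does not state. The function names are made up for illustration.

```python
from decimal import Decimal

# Per-minute rates quoted in the article (USD).
PER_MINUTE_USD = {
    "GPT-Realtime-Translate": Decimal("0.034"),
    "GPT-Realtime-Whisper": Decimal("0.017"),
}

# GPT-Realtime-2 token rates quoted in the article (USD).
# ASSUMPTION: the prices apply per 1,000,000 tokens; the article gives no unit.
ASSUMED_TOKEN_UNIT = 1_000_000
REALTIME_2_INPUT_USD = Decimal("32")
REALTIME_2_OUTPUT_USD = Decimal("64")


def per_minute_cost(model: str, minutes: int) -> Decimal:
    """Cost in USD for a per-minute model (hypothetical helper)."""
    return PER_MINUTE_USD[model] * Decimal(minutes)


def realtime_2_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD for GPT-Realtime-2 under the assumed token unit."""
    total = (REALTIME_2_INPUT_USD * input_tokens
             + REALTIME_2_OUTPUT_USD * output_tokens)
    return total / ASSUMED_TOKEN_UNIT


if __name__ == "__main__":
    # One hour of live translation and one hour of transcription.
    print("Translate, 60 min:", per_minute_cost("GPT-Realtime-Translate", 60))
    print("Whisper, 60 min:", per_minute_cost("GPT-Realtime-Whisper", 60))
    # A session with 10,000 input tokens and 5,000 output tokens.
    print("Realtime-2 session:", realtime_2_cost(10_000, 5_000))
```

On these numbers, an hour of translation costs about $2.04 and an hour of transcription about $1.02. The GPT-Realtime-2 figure depends entirely on the assumed unit, so check it against OpenAI's pricing page before budgeting.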


