ChatGPT’s voice revolution: be patient, it’s coming… slowly

Did you think you’d soon be able to converse with ChatGPT like a human? You’ll have to wait a little longer. OpenAI has just announced a delay in its new voice features.

In its spring update, OpenAI unveiled new ChatGPT features powered by its GPT-4o model: new voice capabilities and an improved voice assistant, initially reserved for subscribers.

OpenAI announced the schedule change on Discord. The new voice features are currently in an alpha phase; they were supposed to arrive at the end of June, but we will now have to wait until the end of July.

Why the postponement? OpenAI puts forward two main reasons. First, developers are still working to ensure the model recognizes content it should not respond to. In other words, the aim is to avoid slip-ups and inappropriate answers, a crucial issue for conversational AI.

Then there is the infrastructure issue. Handling real-time voice instead of text for millions of users is no small feat, and OpenAI needs time to prepare its servers for this deluge of voice requests.

OpenAI is not changing its overall strategy: a gradual rollout. First a small group of ChatGPT Plus users at the end of June (delayed to the end of July), then a gradual expansion. The goal is for all Plus subscribers to be able to use the voice feature by the fall.

A more human assistant

Beyond these voice improvements, OpenAI is also working on screen- and video-sharing features: the assistant will be able to capture and analyze the content of your screen or your camera feed.

In practice, Microsoft, which partners with OpenAI, showed in 2024 how its Copilot assistant, based on GPT-4o, could advise a Minecraft player on their build. With these improvements, ChatGPT becomes an increasingly human-like interlocutor: its response time approaches human reaction time, around 320 milliseconds, where it previously took a few seconds on average to compute a reply. At that speed, the chatbot could also combine a set of reactions to appear surprised or sarcastic during requests.

For the moment, these announcements and presentations, impressive as they are, have only been shown in demonstrations. It remains to be seen how these features will hold up in daily use, and whether they will stay responsive under thousands of simultaneous requests.

