ChatGPT-4o is here. The “o” stands for “omni”, hinting at how the latest version of ChatGPT handles text, audio, images and video in combination.
OpenAI CEO Sam Altman posted on X that GPT-4o is “natively multimodal”, combining voice, text and vision. At the launch livestream of GPT-4o earlier this week, OpenAI CTO Mira Murati explained how GPT-4o addresses the latency problems of GPT’s previous voice mode implementation. GPT-4o will be freely available, and users will be able to do things like convert text into image outputs.
During the livestream, the folks at OpenAI demoed the improved voice capability of GPT-4o, showing that you no longer have to wait for the voice assistant to finish; you can butt in and it won’t disrupt the conversation. The model offers a real-time, responsive voice mode, removing the two-to-three-second wait before getting a response. Whilst you can still tell that the assistant’s voice is computer generated, I was impressed by how natural the conversation sounded. With GPT-4o, OpenAI has improved the model’s ability to both perceive emotion and express it in its own voice.
And finally, I really liked GPT-4o’s ability to interact visually: with a phone camera pointed at a math problem, GPT-4o talked the user through solving it. It is another good example of how ChatGPT-4o combines different inputs and outputs.
Main learning point: I know that some people were hoping for the release of ChatGPT 5, but I was already very impressed with the public demo of ChatGPT-4o, and I look forward to having a play myself and experiencing ChatGPT-4o’s ‘omni’ approach first hand.
Related Links:
- https://openai.com/index/spring-update/
- https://openai.com/index/hello-gpt-4o/
- https://blog.samaltman.com/gpt-4o
- https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/
- https://twitter.com/gdb/status/1790071008499544518
- https://www.lifewire.com/openai-reveals-gpt-4o-8647637
- https://www.theverge.com/2024/5/13/24155493/openai-gpt-4o-launching-free-for-all-chatgpt-users
- https://arstechnica.com/information-technology/2024/05/chatgpt-4o-lets-you-have-real-time-audio-video-conversations-with-emotional-chatbot/