ChatGPT can now hear, see and speak, opening up a whole new world of possibilities for how we interact with AI chatbots. The new capabilities unlock the ability to have a voice conversation with ChatGPT, or physically show the bot what you’re talking about.
According to the official OpenAI blog post, you’ll soon be able to show the bot pictures of a landmark while on holiday and have a conversation about the history behind the structure. You could also send the bot a photo of your fridge contents and have it whip up a potential recipe.
The new features will be rolling out to ChatGPT Plus and Enterprise users first over the next few weeks. Voice is coming to the iOS and Android apps, and images will be available across platforms. As with most ChatGPT features, users who aren’t subscribed to Plus will likely see them a little later.
ChatGPT talks back
The blog post notes that you’ll now be able to have back-and-forth conversations with your AI assistant on the go via the phone app. From what we can tell, the experience will be similar to speaking with Siri or Amazon Alexa.
The video example in the blog post shows off a stylish user interface, with a user asking ChatGPT by voice to tell a bedtime story and interrupting every so often to ask questions.
Regardless of how you might feel about the technology, it’s still very impressive. We’ll have to wait to see if real conversations match up with the seamless example in the video, but if they do, Siri and Amazon Alexa have a lot to be worried about. If I can access a talkative, intelligent chatbot like ChatGPT, which looks at pictures and can go into depth about topics without pause, why would I ever use any other virtual assistant?
If you’re a Plus subscriber, head over to Settings, click ‘New Features’ on the mobile app and opt into voice conversations. You’ll be able to choose your favorite voice out of five different options: Sky, Cove, Ember, Breeze and Juniper, and you can listen to each one over on the official site.
Sight for sore eyes
ChatGPT can now look at images too, and it can handle more than one at a time. You can show it graphs that need analyzing, get help with homework, or share a rough draft of work you’d like feedback on but can’t be bothered to type out.
If you want it to focus on something specific in the photo, you can use the new drawing tool within the ChatGPT app and circle exactly what you want the bot to concentrate on.
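For developers curious what this kind of multimodal input looks like outside the app, OpenAI’s public Chat Completions API can accept an image alongside a text prompt. The snippet below is a minimal sketch, not something from the announcement: the model name and image URL are placeholders, and the drawing/circling tool described above exists only in the ChatGPT app itself.

```python
# Minimal sketch: sending an image plus a question to a vision-capable
# OpenAI model via the Chat Completions API. The model name and image URL
# are placeholders, not details from the announcement.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: substitute whichever vision-capable model your account exposes
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What could I cook with the ingredients in this fridge?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/fridge.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```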
While this is scarily impressive for a generative AI chatbot, there are concerns that immediately spring to mind upon hearing about the new features.
OpenAI does acknowledge these concerns at the bottom of the announcement, stating that with new features come new challenges, including hallucinations – basically an incorrect response given by an AI bot but delivered with confidence – and the possibility of the voice capabilities being used to impersonate public figures or commit fraud.
In order to combat this, OpenAI states that the voice feature was created with real voice actors, and that the image input feature was tested with red teamers in domains such as extremism and scientific proficiency, to “align key features for responsible usage”.
We’re incredibly buzzed to try out the new features, especially the ability to chat directly with ChatGPT and probe its mind. We’re also keen to see how this will ripple down to other products like Bing AI, Google Bard and even Meta’s budding AI project. As ChatGPT is an AI trailblazer, introducing features like this means everyone else will have to catch up.