Google has announced the new Gemini artificial intelligence (AI) model, an AI system that will power a host of the company’s products, from the Google Bard chatbot to its Pixel phones. The company calls Gemini “the most capable and general model we’ve ever built,” claiming it would make AI “more helpful for everyone.” Judging by the incredible video demo further down this page, it may well be right.
Gemini will come in three ‘sizes’: Ultra, Pro and Nano, with each one designed for different uses. All of them will be multimodal, meaning they’ll be able to handle a wide range of inputs, with Google saying that Gemini can take text, code, audio, images and video as prompts.
While Gemini Ultra is designed for extremely demanding use cases such as in data centers, Gemini Nano will fit in your smartphone, raising the prospect of the best Android smartphones gaining a significant AI advantage.
With all of this new power, Google insists that it conducted “rigorous testing” to identify and prevent harmful results arising from people’s use of Gemini. That was challenging, the company said, because the multimodal nature of Gemini means two seemingly innocuous inputs (such as text and an image) can be combined to create something offensive or dangerous.
Coming to all your services and devices
Google has been under pressure to catch up with OpenAI’s ChatGPT and its advanced AI capabilities. Just a few days ago, in fact, news was circulating that Google had delayed its Gemini announcement until next year due to its apparent poor performance in a variety of languages.
Now, it turns out that news was either wrong or Google is pressing ahead despite Gemini’s rumored imperfections. On this point, it’s notable that Gemini will only work in English at first.
What does Gemini mean for you? Well, if you use a Pixel 8 Pro phone, Google says it can now run Gemini Nano, bringing all of its AI capabilities to your pocket. According to a Google blog post, Gemini is found in two new Pixel 8 Pro features: Smart Reply in Gboard, which suggests message replies to you, and Summarize in Recorder, which can sum up your recorded conversations and presentations.
The Google Bard chatbot has also been updated to run Gemini, which the company says is “the biggest upgrade to Bard since it launched.” As well as that, Google says that “Gemini will be available in more of our products and services like Search, Ads, Chrome and Duet AI” in the coming months, Google says.
As part of the announcement, Google revealed a slate of Gemini demonstrations. These show the AI guessing what a user was drawing, playing music to match a drawing, and more.
Gemini vs ChatGPT
It’s no secret that OpenAI’s ChatGPT has been the most dominant AI tool for months now, and Google wants to end that with Gemini. The company has made some pretty bold claims about its abilities, too.
For instance, Google says that Gemini Ultra’s performance exceeds current state-of-the-art results in “30 of the 32 widely-used academic benchmarks” used in large language model (LLM) research and development. In other words, Google thinks it eclipses GPT-4 in nearly every way.
Compared to the GPT-4 LLM that powers ChatGPT, Gemini came out on top in seven out of eight text-based benchmarks, Google claims. As for multimodal tests, Gemini won in all 10 benchmarks, as per Google’s comparison.
Does this mean there’s a new AI champion? That remains to be seen, and we’ll have to wait for more real-world testing from independent users. Still, what is clear is that Google is taking the AI fight very seriously. The ball is very much in OpenAI’s (and Microsoft‘s) court now.