xAI’s Grok-3 looks impressive, but its true test is going mainstream

February 18, 2025

Elon Musk-led xAI has announced its latest AI model, Grok-3, via a livestream. From the get-go, it was evident that the company wants to quickly fill the practical gaps and make its chatbot more approachable to the average user, rather than just selling rhetoric about wokeness and understanding the universe.

The company will be releasing two versions of its latest AI model: Grok-3 and Grok-3 mini. The latter is trained for low-compute scenarios, while the former will offer the full set of Grok-3 perks such as DeepSearch, Think, and Big Brain.

What’s all the fuss about

Homepage of Grok 3 chat. — Nadeem Sarwar / Digital Trends

As Musk talked about all the new features coming with Grok-3 alongside xAI experts, it was obvious that this release is not merely about setting new performance benchmarks, but also catching up on all the hot trends that will define the AI landscape in 2025.

According to the benchmarks shared by the company, Grok-3 and even Grok-3 mini performed better than OpenAI’s GPT-4o, Gemini, Claude, and DeepSeek models at tasks such as coding, mathematics, and scientific problem solving.

On the Chatbot Arena (LMSYS) rankings, an early version of Grok-3 reached a leaderboard high of 1,400 points, ahead of Gemini 2.0 Flash Thinking, DeepSeek, and more. xAI developed Grok-3 at an impressive pace, and achieving those performance figures is quite a feat for a relative upstart competing against Google and OpenAI.

Pushing it into the mainstream, however, is going to be the biggest challenge, especially from an access viewpoint. Grok-3 will initially be available to X Premium+ subscribers as part of an early access program. Currently the highest tier of X subscription, Premium+ is priced at $22 per month, and $229 for the annual plan.

Eligible users will get access to Grok-3 features such as reasoning, DeepSearch, higher usage limits, and early access to new tools. The company is also launching a separate subscription service called SuperGrok that offers priority access to Grok-3 and higher image generation limits.

Introduction of SuperGrok by xAI team. — xAI

This subscription will be limited to the Grok mobile app and the freshly-launched Grok.com website. Musk says the latest and most advanced capabilities, however, will be served via the website.

“This is kind of a beta, so you should expect some imperfections at first, but we will improve rapidly,” Musk said on the livestream, adding that users can expect improvements every day. It will be interesting to see how xAI holds the interest of the average chatbot enthusiast using a phone while simultaneously sending a juicy pitch deck to high-paying enterprise customers.

Cashing in on the trends

xAI appears to be doing a lot with Grok-3, not just in terms of enhanced capabilities, but also feature parity. One of the standout elements of Grok-3 is its enhanced reasoning, which seems to be the hot new trend in the world of language models.

Take, for example, Grok-3’s Think mode, which is a direct rival to OpenAI’s o-series models. Such AI models are designed to spend more time thinking and breaking down user queries before they provide an answer.

Users can see the chain of thought in real time, and the benefits, according to adopters, are improved performance in science, maths, and coding-related queries. xAI is covering that gulf with not just Think mode, but also a separate Big Brain tool for Grok-3 that will supercharge its compute capabilities for more advanced and complex scenarios.

Google is not too far behind with its Gemini line-up. The company recently launched the Gemini 2.0 series of AI models, which includes Gemini 2.0 Flash Thinking Experimental and a separate app-first iteration that prioritises information pulled from YouTube, Maps, and Google Search.

DeepSeek, the open-source AI chatbot from China that recently disrupted Wall Street, also offers a thinking and reasoning product called DeepThink. Even though the responses are censored, the performance is quite impressive.

xAI is also chasing the AI agent formula with Grok-3, even though it has a lot of ground to cover, especially when compared to the likes of OpenAI and Google. To that end, the company is launching its first agentic product built atop Grok-3 that it calls DeepSearch.

It works more or less in the same fashion as Deep Research in Google Gemini, and rival products of the same name by Perplexity and OpenAI. It performs a web search, compiles a full report, and also serves all the sources it pulled information from as citations.

xAI is late to the race, and price could be a further hindrance when it comes to mass appeal. Perplexity will offer a limited number of Deep Research queries for free, while Google offers a more generous package, with Gemini Deep Research included in the $20 Gemini Advanced subscription.

Deep Research (or DeepSearch for Grok-3) is an extremely compute-intensive process, so it makes sense for it to be a premium perk. But giving customers a taste of it, even with a limited number of queries, comes with a higher chance of earning new subscribers, a strategy that both Perplexity and OpenAI are following.

A demonstration of Gemini Live on a Google Pixel 9. — Gemini Live by Google. Joe Maring / Digital Trends

Musk also mentioned that a voice interaction mode is coming to Grok, and that it will launch in roughly a week. The focus is on providing an alternate method of conversing with Grok, one that feels more natural.

OpenAI’s ChatGPT has offered something called Voice Mode for a while now, and a similar feature called Gemini Live has been available to Google Gemini users as well.

xAI didn’t provide many details about Grok-3’s voice mode, but confirmed that it will feature conversational memory so that it can remember details of previous interactions. “It’s one of the best experiences of Grok,” Musk said during the livestream.

Finding mass appeal is the challenge

Deep Research is not the only agentic implementation of AI chatbots, and that’s where xAI lags far behind. OpenAI recently introduced Operator, an AI agent that can perform complex web-based tasks on behalf of users by essentially taking over web-browsing chores.

It can perform tasks like shopping, making restaurant reservations, and travel-related work, thanks to the underlying Computer-Using Agent (CUA) framework. Most importantly, OpenAI already has deals in place with companies such as DoorDash, Instacart, Uber, and eBay to push Operator as an impressive showcase of practical agentic capabilities.

Then there is the system of ChatGPT plug-ins, which makes the chatbot far more functional by integrating with platforms such as Zapier, Expedia, Klarna, Slack, and Shopify among others. They make ChatGPT a far more appealing product to enterprises than Grok-3.

Google, on the other hand, is leveraging its extensive portfolio of products and apps that people use on a daily basis. Deep system-level integration with apps (via extensions) on Android and availability of multi-modal Gemini capabilities across Workspace products such as Gmail and Docs give it a dramatic functional edge.

DeepSeek, meanwhile, has already been adopted by brands such as Honor. Apple has also pushed a ChatGPT-driven Apple Intelligence stack on millions of iPhones and Macs, and has inked a deal with Alibaba to offer those features in China.

xAI hasn’t found any such takers for Grok yet. That’s the biggest challenge for xAI right now, and it will be interesting to see what brands it can onboard to push Grok-3, with all its bells and whistles, into the mainstream.
