OpenAI’s Deep Research smashes records for the world’s hardest AI exam, with ChatGPT o3-mini and DeepSeek left in its wake




  • The accuracy achieved by the top-scoring AI on the world’s hardest benchmark has improved by 183% in just two weeks
  • ChatGPT o3-mini now scores up to 13% accuracy depending on capacity
  • OpenAI Deep Research obliterates competition with 26.6% accuracy result

The world’s hardest AI exam, Humanity’s Last Exam, was launched less than two weeks ago, and we’ve already seen a huge jump in accuracy, with ChatGPT o3-mini and now OpenAI’s Deep Research topping the leaderboard.

The AI benchmark, created by experts from around the world, contains some of the hardest reasoning problems and questions known to man. It’s so hard that when I previously wrote about Humanity’s Last Exam, I couldn’t even understand one of the questions, let alone answer it.




