Ever since ChatGPT was released in late 2022, there has been a proliferation of chatbots, each with a name funnier than the last: Llama, Gemini, Claude, and (who could forget?) Grok. That’s without even mentioning Copilot, DeepSeek, or Falcon. But what’s the best one for you to use? For the purpose of this review, I will be focusing on two of the most capable bots out there, ChatGPT and Claude, as well as the viral upstart DeepSeek.
There are a ton of different factors that can come into play, but, let’s be honest, it comes down to one thing: AI smarts. In this multibillion-dollar baking contest, features such as app integration, image generation, and personalization are just the icing on the cake; everybody knows what the star of the show is. After all, nobody wants to use artificial intelligence that isn’t actually intelligent.
From a user’s perspective, it’s essential that the bot avoids hallucinations, understands context, and has access to enough up-to-date information. Measuring the intelligence of an AI is notoriously difficult, but if we look at a few different benchmarks we can get a pretty good idea of which bot is the smartest. Chatbot Arena is a crowdsourced rating website that allows users to compare the outputs of two bots head-to-head. On its leaderboard, ChatGPT and DeepSeek are tied for 3rd while Claude lags behind in 19th place.
Massive Multitask Language Understanding (MMLU) is a common benchmark made up of thousands of multiple-choice questions from 57 different fields. Claude comes out on top of this leaderboard, with DeepSeek a close second and ChatGPT placing 7th.
The dramatically named Humanity’s Last Exam was developed to address the biggest problem with MMLU: it’s too easy. Today’s top chatbots all score in the vicinity of 90% on MMLU, making it difficult to differentiate between them or to compare them with human experts. On Humanity’s Last Exam, the current highest score is a meager 13%, achieved by ChatGPT. DeepSeek scored 9.4% and Claude 4.3%.
Based on these results, it seems like using DeepSeek is a pretty good bet if you want the best AI. However, there is one more thing to keep in mind: censorship. DeepSeek was developed by a Chinese company and adheres closely to Chinese censorship rules. This means that, when asked about certain subjects like Taiwan, Tiananmen Square, or Tibet, it will either refuse to answer or parrot government propaganda. In my opinion, this is unacceptable; I want reliable information, not the party line.
At the end of the day, which chatbot you use is up to you and what you value. Do you care about the user interface and quality of life? Or do you want it to be open-source? Is censorship a problem or are you looking for pure performance?
No matter what your answer is, it’s worth looking out for the smartest bots. Tomorrow’s chatbots are sure to make today’s look downright