These 7 AI models just overtook ChatGPT in a new study — and the list may surprise you


OpenAI's ChatGPT kicked off the generative AI chatbot craze when it debuted to the public in late 2022. Since then, the chatbot has managed to retain a large share of the market, despite facing many powerful competitors such as Gemini, Grok, Claude, Qwen, DeepSeek, Mistral and others.

However, a study by British company Prolific has placed ChatGPT in 8th place in its ranking of the best AI models, behind two models each from Gemini, Grok and DeepSeek, and even a model by French company Mistral. The company created its own benchmark called "Humaine", which it says is “built to understand AI performance through the lens of natural human interaction”.

“Current evaluation is heavily skewed towards metrics that are meaningful to researchers but opaque to everyday users, such as accuracy on specialised datasets, performance on esoteric reasoning tasks, etc. This has created a disconnect between what gets optimised for and what people actually value,” the company says in its blog post.

The company also noted that even human preference leaderboards can fall short if they are not designed with scientific rigour. It noted that platforms which require everyone to vote for their favourite model can be susceptible to sample bias and likely overrepresent tech-savvy users.

The new leaderboard aims to address this issue with automated quality monitoring, which is meant to ensure that participants engage thoughtfully with the task.

ChatGPT ranks below these AI models:

As per the Humaine study, these were the top 10 AI models:

1. Gemini 2.5 Pro (Google)

2. DeepSeek v3 (DeepSeek)

3. Magistral Medium (Mistral)

4. Grok 4 (xAI)

5. Grok 3 (xAI)

6. Gemini 2.5 Flash (Google)

7. DeepSeek R1 (DeepSeek)

8. ChatGPT-4.1 (OpenAI)

9. Gemma (Google)

10. Gemini 2.0 Flash (Google)

Notably, the study was published in September, when Google had not yet released its Gemini 3 Pro model and xAI had not rolled out its Grok 4.1 and Grok 4.1 Thinking models.

Gemini 2.5 Pro topping a benchmark isn't exactly a big deal at this point, given that the model has continuously topped various leaderboards since its launch. However, the fact that no OpenAI model ranks in the top 5, and that the company trails the likes of DeepSeek, Grok and Mistral, is a shocking development if the results are to be believed.

The researchers do not explain why ChatGPT was listed so low in the rankings, but they do note that Google's Gemini-2.5-Pro consistently ranked as the top model for the "Overall Winner" metric.
