OpenAI Behind Competitors Despite GPT-5.1 Release

This morning, I tested OpenAI’s new GPT-5.1. It still falls behind the best models from xAI, Google, and Kimi.

When I tested GPT-5 in August, it performed poorly, earning a C-. It gave me outdated data and struggled to cite sources.

OpenAI claims that GPT-5.1 is better at reasoning and easier to customize. I ran a three-round test to find out…

Round #1: Finding Fertility Startups

Low fertility is a huge problem across much of the world. Italy just registered a fertility rate of 1.1, an all-time low.

Even folks who want to have a child sometimes can’t. But what if we were able to cure infertility and help people have more babies?

I asked GPT-5.1 to find me startups working on infertility.

ChatGPT did a great job, surfacing several leading startups in the space. It cited sources for the technologies they’re developing and the amount of capital they’ve raised.

I’m giving this round an A.

Round #2: Defending Against a Robot War

We’ve seen some incredible new robotics demos recently, like 1X’s Neo. Androids like these will help us clean up around the house and build things in factories.

But what if androids are used against us in a war? What would be the best way to stop them?

ChatGPT comes up with some interesting ideas, like hacking the software that drives the robots. This could allow us to neutralize a large number of robots at once more easily than if we used munitions.

But ChatGPT does not cite any sources in its response. So, we don’t know how accurate it is. ChatGPT should’ve cited reports by scholars of war and experts on robotics.

I’m giving this round a C.

Round #3: The Origins of Fall Colors

Let’s move on to happier topics!

Looking out my window right now, I can see the beautiful red-yellow foliage. What exactly causes these gorgeous colors?

GPT-5.1 gives us a great response, including some beautiful leaf pictures.

It explains that a reduction in chlorophyll makes the leaves change color. The citations are good, linking to a Smithsonian article.

I’m giving this round an A.

Wrap-Up

Overall, I’m giving GPT-5.1 a B+. That’s up from a C- for GPT-5, but still well below the leading models.
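For anyone curious how three round grades can roll up into one overall grade, here is a minimal sketch. It assumes a standard 4.0 grade-point scale and a simple unweighted average, and the names `GRADE_POINTS` and `overall_grade` are made up for illustration. It is one plausible way to do the math, not necessarily how the grade above was reached.

```python
# Grade points on a standard 4.0 scale (an assumption; the post does not
# specify a scale).
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}


def overall_grade(round_grades):
    """Average the rounds and return (nearest letter grade, mean points)."""
    mean = sum(GRADE_POINTS[g] for g in round_grades) / len(round_grades)
    # Pick the letter whose point value is closest to the mean.
    letter = min(GRADE_POINTS, key=lambda g: abs(GRADE_POINTS[g] - mean))
    return letter, mean


letter, mean = overall_grade(["A", "C", "A"])
print(f"{letter} ({mean:.2f})")
```

With the three rounds above (A, C, A), the mean is about 3.33 points, and the nearest letter on this scale is a B+.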

The quality of GPT-5.1’s responses is inconsistent. Some have great sourcing and follow instructions well, while others do not.

Grok 4, Kimi K2 Thinking, and Gemini 2.5 Pro are all better than GPT-5.1. Their outputs are more consistent and their information is more reliable.

When GPT-3.5 came out in late 2022, OpenAI was way ahead. That’s no longer the case.

For all the billions in investment, OpenAI is falling behind its competitors.

But Sam Altman still has some incredible researchers on his team. I wouldn’t count out OpenAI yet.

Have a great weekend, everybody!

