Tremendous

An angel investor's take on life and business

Gemini ruled the roost until Elon dropped Grok 4.20 Beta. This morning, I tested them head-to-head. Grok clobbered Gemini. 

Here’s why Grok now reigns supreme…

Pushing Grok and Gemini to Their Limits

I gave them both the same prompt, asking them to analyze the likelihood of war with Iran. I made the prompt complex, instructing the models to analyze numerous factors. 

Here’s the prompt I used: 

“How likely do you think a war between the United States and Iran is? Consider these factors:

  1. The likelihood of US strikes
  2. The likelihood of a ground invasion
  3. The likely role of Israel
  4. The effects on the Iranian regime
  5. The effects on the world economy
  6. The effects on the world geopolitical and security situation
  7. Polymarket and Kalshi assessments of what will happen in Iran re. strikes, war and possible regime change.”

Let’s see what these guys can do! 

Gemini’s Response: Solid, But Lacking in Specifics

Gemini 3.0 Pro predicts that the United States will likely strike Iran from the air. But it views the odds of a ground invasion as minimal. 

Gemini did a good job of explaining how conflict with Iran has already pushed up oil prices. It also noted that prediction markets expect strikes and the possible exit of Khamenei.

Gemini doesn’t quantify its assessment of the risks of strikes or full-scale war. It simply labels each risk high, low, or very low. I would have liked to see actual probabilities.

But there’s a bigger problem with Gemini’s response: it didn’t cite a single source. I’m sure it crawled many websites, but I have no idea which ones.

How can I rely on Gemini’s response? Is it accurate, or is it just making stuff up? 

I used the “double check this response” feature, which made Gemini reveal its sources. It provided about a dozen websites that seemed to be high quality.

But I shouldn’t have to use some special feature in order to get citations.

Overall, Gemini’s response was respectable. But it should’ve been more specific and cited its sources. 

On to Grok…

Grok’s Response: Deeply Researched, Highly Nuanced

Grok 4.20 Beta gave a thorough, nuanced response. Grok agrees with Gemini that the United States will probably launch airstrikes on Iran soon, but will avoid a major war. 

Grok even highlighted certain sites in Iran that we are most likely to bomb. It’s like having my own defense analyst on call! 

Grok predicts that Israel will join the strikes. It gives even odds that Khamenei will be gone by the end of the year. 

Grok answered every aspect of my prompt beautifully. It even cited 288 sources, versus the dozen or so that Gemini revealed only after I asked.

And it took only 27 seconds! 

A human would take weeks to read and synthesize 288 sources. By then, the whole situation would have changed! 

These days, AI models aren’t just automating what humans do. They’re doing new things we couldn’t possibly do. 

The only thing I didn’t like: it’s hard to tell which of Grok’s statements corresponds to which source. I’d like to see numbered footnotes instead of just a list of sources. 

Still, Grok did an incredible job overall! 

Wrap-Up

Grok won this head-to-head test handily. I’m crowning Grok 4.20 Beta the best AI model in the world. 

Before the Grok 4.20 Beta upgrade, my testing ranked Gemini 3.0 Pro as the world’s best. But the tables have turned.

I’ve never seen any other model search as widely, deeply, and quickly as Grok 4.20 Beta. At $30/month, a SuperGrok subscription that includes access is a steal.

I’m excited to see how Gemini responds! 

If you’re in the NYC area, be careful out there! We’re at 7 inches of snow and counting here at Chez Francis. Can’t wait for spring!

What’s your favorite AI model?
