Kimi just dropped Kimi K2 Thinking. I tested it this morning. It’s as good as Grok 4, the best model I’ve ever used.

In July, I ran the prior Kimi through testing. It scored a B+, solid but below Grok 4.
This time, Kimi performed far better. Let me show you what it can do…
Round #1: Researching FP&A Startups
I recently saw an interesting new FP&A startup. It can forecast your runway based on when you pay bills.
But FP&A is a crowded category. So I asked Kimi what the leading startups in FP&A are and what they can do.

Kimi provides a table with all the leading FP&A startups. It clearly explains the features of each.
Looks like this category is even more crowded than I thought!
Kimi did include citations, but they’re all in a list on the side. I’d prefer to see them integrated into the response, so it’s clear which source supports which claim.
Still, Kimi’s response is very useful. I’m going to give this round an A.
Round #2: Looking for Missile Defense Startups
SpaceX just won a huge contract for missile defense radar. It’s part of President Trump’s Golden Dome project.
This made me wonder: what startups could work together with SpaceX to improve our missile defenses?

Kimi found some fascinating companies. Apex in particular caught my attention. They’re making space-based interceptors to shoot down missiles.
Kimi cites many sources, but it’s unclear which source supports which piece of information. This makes it harder to rely on Kimi’s response.
Overall I’m giving this round an A-.
Round #3: Make New York Build Again
The biggest reason Zohran Mamdani won the mayoral election this week is the high cost of housing.
If we build more housing, we’ll lower rents. What’s the best way to do that?

Kimi provides some interesting ideas, from office conversions to upzoning near transit. It also does a good job of citing sources in the body of the response this time.
But Kimi doesn’t say much about who has the power to make these changes. In red-tape-loving New York, understanding the regulatory environment is critical.
I’m giving Kimi an A- for this round.
Wrap-Up
Kimi turned in a strong performance in my testing. It answered my questions thoroughly and accurately.
Overall I’m giving Kimi an A-.
This is the same grade I gave Grok 4, the best model I’ve used so far. If Kimi could more clearly tie citations to responses, I’d give it an A.
China has caught up to the very best models from the United States. To regain our lead, the top American labs must release tools more powerful than anything we’ve ever seen.
Have you tried Kimi?
Have a great weekend, everybody!