Tremendous

An angel investor's take on life and business

Anthropic just dropped its most powerful model yet, Claude 3.7 Sonnet. This morning, I asked it 3 questions I actually needed the answer to. Let’s see how it does…

Teach Me About Biology

Lately, I’ve been fascinated by new developments in biotech. So I had Claude give me a primer on some recent discoveries.

Here’s the prompt I used:

“Please give me a primer on discoveries in biology and biotech in the 21st century. Gear your response to the layman, not a specialist. Focus in particular on developments like CRISPR, single cell gene sequencing, and synthetic biology. But also include anything else that seems relevant. Give me an output using bullet points so it’s easier to read.”

Yes, I say please. You never know, these things could take over some day.

Claude gave a great response, providing understandable info on CRISPR, single-cell genomics, and more.

It also formatted the answer beautifully, with bold headings and bullet points. That makes the response a lot easier to read.

The response reads well, but it’s hard to trust because it doesn’t cite any sources.

Overall, I’d give this a B. It’s great information, but if we don’t know where the info is coming from, it’s not that useful.

On to question two…

Learning About the AfD

The AfD party has been all over the news. It took second place in Sunday’s German election.

But despite all the headlines, I don’t actually know much about what the AfD stands for. So I asked Claude:

“Tell me about the political positions and past actions of the German AfD party. Please be specific and cite your sources.”

Claude gave a good and thorough response. But even though I specifically directed it to cite its sources, it did not do so, saying “I don’t have web browsing capabilities.”

Just like in my first test, the info looks great. But I have no way to verify it.

So, the response is not that useful. I’d give this a B-.

Out of curiosity, I ran this prompt through Grok 3.

It cited 27 sources in a matter of seconds. It provided both links and in-line citations.

Now that’s a good response!

Government Debt Across Major Countries

America’s debt has reached a crushing level: $36 trillion. But I’m curious… how badly off are we compared to other major countries?

So, I asked Claude…

“Please give me a table with data for all major economies worldwide. In the table, put the country’s name, its level of government debt in dollars, and its debt/GDP ratio (expressed as a percentage). Cite sources whenever possible.”

Claude produced a pretty table with just the info I wanted.

But for some reason, it popped the table up alongside its prior response on the AfD. This was rather jarring and doesn’t make for a great user experience.

And again, it doesn’t link to sources. But at least this time, it gave me an idea of where I could go to verify the information.

I’ll give this one a B- as well.

Wrap-Up

How a model does at math competitions isn’t very relevant to me or most people. So rather than look at benchmarks, I test models by putting them through real-world tasks.

Claude did a great job of answering the questions. I also liked the formatting: the responses were clear and readable, which makes the tool easier to use.

But the big problem with 3.7 Sonnet is the lack of citations. How can I trust the information Claude is giving me?

Until that gets fixed, 3.7 Sonnet is not that useful. Overall, I’m giving it a B-.

Grok 3 is still my favorite model. If they want to dethrone Grok, Anthropic needs to step it up.
