Tag: LLM
-
I tested the top models from Grok, Gemini, and ChatGPT head to head this morning. Gemini won, showing incredible power at research and sourcing. The last month has seen some amazing releases from the top AI labs. xAI released Grok 4.1 Thinking, Google released Gemini 3.0 Pro, and OpenAI dropped…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, Entrepreneur, gemini, Google, Grok, LLM, OpenAI, Silicon Valley, Startups, Technology, xAI
-
I ran OpenAI’s new GPT-5.2 through a three-round test. It still scores below Grok and Gemini. Some of GPT-5.2’s responses are excellent. But the quality of its answers is inconsistent. Let me show you where this model excels and where it falls short… Round #1: Learning About Needle…
AI, Artificial Intelligence, ChatGPT, gemini, Google, Grok, LLM, OpenAI, Silicon Valley, Startups, Tech, Technology, Venture Capital, xAI
-
Gemini 3.0 Pro dominates AI benchmarks. But in my real-world testing, Grok 4.1 Thinking comes out on top. Google’s new model is ranked #1 on LMArena. It scores off the charts in a variety of AI benchmarks, like Humanity’s Last Exam. But the best way to test a model…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, gemini, GOOG, Google, Grok, Large Language Model, LLM, Technology, xAI
-
Elon just dropped Grok 4.1. I tested it this morning. This is the best model I’ve ever used. xAI claims 4.1 has fewer hallucinations than prior models. It ranks number one on LMArena, ahead of Gemini, Claude and ChatGPT. Let’s see what this thing can do! Round #1: Are Consumers…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, Grok, LLM, Startups, Tech, Technology, Venture Capital, xAI
-
This morning, I tested OpenAI’s new GPT-5.1. It still falls behind the best models from xAI, Google, and Kimi. When I tested GPT-5 in August, it performed poorly, notching a C-. It gave me outdated data and struggled to cite sources. OpenAI claims that GPT-5.1 is better at reasoning and…
-
Kimi just dropped Kimi K2 Thinking. I tested it this morning. It’s as good as Grok 4, the best model I’ve ever used. In July, I ran the prior Kimi through testing. It scored a B+, solid but below Grok 4. This time, Kimi performed far better. Let me show…
AI, Artificial Intelligence, ChatGPT, China, Kimi, LLM, Open source, Startups, Technology, Venture Capital, Writing
-
For the first time ever, I’m using voice dictation regularly. Wispr Flow has nailed it where every other app has failed. Let me show you what this thing can do… I first tried voice dictation in the late 1990s. For its time, it was incredible. But it made so many…
AI, Apps, Artificial In, Artificial Intelligence, ChatGPT, Dictation, LLM, Product, productivity, Software, Startups, Tech, Technology, Venture Capital, Voice, Writing
-
Grokipedia is already better than Wikipedia. Last night, Elon dropped Grokipedia. This morning, I tested it against Wikipedia. Grokipedia won 2 out of 3 rounds. For each round, I looked up an article on a topic I’m familiar with. Let’s see how these two encyclopedias compare… Round 1: Home Sweet…
-
Can DeepSeek beat the best American models? When I tested DeepSeek in June, the outputs were garbage. This morning, I gave it a re-test… Like in June, I asked DeepSeek three real questions I need the answer to. Last time, it scored a B-. Let’s see if DeepSeek’s recent updates…
AI, Artificial Intelligence, ChatGPT, DeepSeek, Finance, Grok, Investing, Large Language Model, LLM, Money, Startups, Tech, Technology, Venture Capital, xAI