Tag: LLM
-
Chinese startup MiniMax just dropped M2.5, its most powerful model yet. But M2.5 flopped in my testing, scoring a C+. That’s only slightly better than MiniMax’s prior model, which I gave a C last month. Let me show you where this model works and where it fails… Round #1: Automating…
AI, Artificial Intelligence, ChatGPT, China, LLM, MiniMax, Open source, Technology, Venture Capital, Writing
-
Moonshot AI recently dropped Kimi K2.5, its most powerful model ever. But in my testing, K2.5 failed miserably. When I reviewed Kimi K2 in November, it was a real threat to ChatGPT and Grok. But K2.5 feels like a major downgrade. Across a range of queries, K2.5 delivered useless results.…
AI, Artificial Intelligence, ChatGPT, China, DeepSeek, Investing, Kimi, LLM, Open source, Startups, Technology, Venture Capital
-
Elon just dropped Grok 1.0 Imagine, xAI’s best video model ever. It’s faster, clearer, and better at making your ideas come alive. This morning, I put it through a three-round test. It did a great job overall, though it still struggled to follow my prompts at times. Let’s see what…
AI, AI Video, Artificial Intelligence, ChatGPT, Elon, Elon Musk, Grok, LLM, Silicon Valley, Startups, Technology, xAI
-
Chinese AI startup MiniMax is going public this week at a $7 billion valuation. But its model flopped in my testing. I ran MiniMax through three tests with real-world questions. These are harder to game than benchmarks. Let me show you where MiniMax does well and where it struggles…
AI, Artificial Intelligence, ChatGPT, China, Finance, Investing, Large Language Model, LLM, Startups, Stock Market, Tech, Technology, Venture Capital
-
China is about to launch its first AI model IPO, Zhipu AI. Investors may be excited, but Zhipu’s product is weak. China’s first IPO of an LLM startup, Zhipu AI, will start trading Thursday. The IPO is expected to value Zhipu at nearly USD $7 billion. This morning, I ran…
AI, Artificial Intelligence, ChatGPT, China, Finance, Investing, IPO, LLM, Tech, Technology, Venture Capital
-
The Allen Institute for AI recently released its most powerful model ever: Olmo 3. Olmo is cheap to train, making it perfect for anyone training their own model. Olmo 3 is 2.5 times more efficient to train than Llama 3.1 based on GPU-hours per token. Olmo is also much more…
AI, Artificial Intelligence, ChatGPT, LLM, Model Training, Open source, OSS, Tech, Technology, Venture Capital
-
I tested the top models from Grok, Gemini, and ChatGPT head to head this morning. Gemini won, showing incredible power at research and sourcing. The last month has seen some amazing releases from the top AI labs. xAI released Grok 4.1 Thinking, Google released Gemini 3.0 Pro, and OpenAI dropped…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, Entrepreneur, gemini, Google, Grok, LLM, OpenAI, Silicon Valley, Startups, Technology, xAI
-
I ran OpenAI’s new GPT-5.2 through a three-round test. It still scores below Grok and Gemini. Some of GPT-5.2’s responses are excellent. But the quality of its answers is inconsistent. Let me show you where this model excels and where it falls short… Round #1: Learning About Needle…
AI, Artificial Intelligence, ChatGPT, gemini, Google, Grok, LLM, OpenAI, Silicon Valley, Startups, Tech, Technology, Venture Capital, xAI
-
Gemini 3.0 Pro dominates AI benchmarks. But in my real-world testing, Grok 4.1 Thinking comes out on top. Google’s new model is ranked #1 in LMArena. It scores off the charts in a variety of AI benchmarks, like Humanity’s Last Exam. But the best way to test a model…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, gemini, GOOG, Google, Grok, Large Language Model, LLM, Technology, xAI
-
Elon just dropped Grok 4.1. I tested it this morning. This is the best model I’ve ever used. xAI claims 4.1 has fewer hallucinations than prior models. It ranks number one on LMArena, ahead of Gemini, Claude and ChatGPT. Let’s see what this thing can do! Round #1: Are Consumers…
AI, Artificial Intelligence, ChatGPT, Elon, Elon Musk, Grok, LLM, Startups, Tech, Technology, Venture Capital, xAI