Weekly Shots of Insight and Market #5
Google TPU Takeaways, Why NVIDIA is Safe, and The Rise of Agentic AI
Before we dive into this week’s shots, I want to share some highlights from our Thanksgiving trip to Rocky Mountain National Park and the Great Sand Dunes.
As we wrap up the year, I have three things I am deeply grateful for:
Health & Clarity: For having a healthy body and a clear mind that allow me to keep creating and sharing content.
New Beginnings: For the courage to make the big decision to move to the US this year, restarting both my career and my life.
Support System: For my family and my wife, who have supported my decisions and stood by my side every step of the way.
Market
1. Who actually gets hurt by Google selling TPUs?
TL;DR: It’s not NVIDIA. It’s the underperforming internal ASICs.
The narrative that “Google renting out TPUs kills NVIDIA” is lazy analysis. Here is the reality of the Hyperscaler/CSP landscape: everyone wants to break NVIDIA’s stranglehold. The goal is to shrink NVIDIA down to a target share of their compute spend (e.g., 60% or 50%). To do this, they build custom ASICs.
The Problem: Building a competitive ASIC is incredibly hard. It’s not just about the chip; it’s about vertical and horizontal scaling, network clustering, and the full software stack.
The investment is massive.
If a Hyperscaler forces an inferior internal ASIC onto their teams just to save money, they risk degrading their core product performance (search, recommendations, AI models). This is “tripping over dollars to pick up pennies.”
The Solution: Google offering TPUs to other companies provides a high-quality “Plan B.” If a Hyperscaler’s internal ASIC roadmap is unstable or underperforming, they can now rent Google TPUs to diversify away from NVIDIA. In this scenario, Google TPU acts as a “Second Source” competitor to AMD and internal ASICs, not necessarily a replacement for NVIDIA’s flagship performance.
2. The Real Story: NVIDIA has been playing catch-up to TPU
The market has the timeline backward. People think Google TPU is finally catching up to NVIDIA. Reality: NVIDIA has spent the last two years frantically trying to catch up to Google’s TPU Pod/Cluster architecture.
The TPU Lead: Google has trained Gemini on its own TPUs from day one. Earlier generations like TPU v4 (with a 4,096-chip scale-up domain) were already industry-leading.
Note: Even when Google had the best clusters, OpenAI (using “weaker” NVIDIA GPUs) beat them initially. Gemini 3’s recent success isn’t just because the TPU got better; it’s because Google’s AI science, algorithms, and data engineering improved. Hardware is the enabler, but the model architecture is the differentiator.
The NVIDIA Panic (2023-2024): If Google had opened up TPU sales in 2023, NVIDIA would have been in trouble. Back then, NVIDIA’s scale-up domain was limited to 8 GPUs (NVL8).
Remember the chaos? After the GH200 failed to gain traction, NVIDIA aggressively pivoted to the GB200 NVL72 architecture.
They changed specs constantly. The Taiwanese supply chain (ODMs and component makers) had thousands of R&D engineers doing “volunteer work,” redesigning boards repeatedly because NVIDIA was rushing.
Why the rush? Jensen Huang wasn’t worried about AMD; he was terrified of the gap between his clusters and Google’s TPU Pods.
The Current Landscape (Late 2025): Now, the gap has closed.
NVIDIA: The GB200/GB300 NVL72 is stabilizing. While the scale-up domain (72 GPUs) is smaller than a TPU pod’s (4,096+), the NVL72 rack offers massive advantages: NVLink-C2C coherency, shared memory, and the ability to pull each Grace CPU’s 480GB of LPDDR5X into one giant memory pool the GPUs can address (see the back-of-the-envelope sketch after this comparison).
Google: Relies on OCS (Optical Circuit Switches) for interconnects.
Verdict: It is now a balanced field. Each has its strengths.
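To put that shared memory pool in perspective, here is a rough back-of-the-envelope sketch in Python. It uses NVIDIA’s published GB200 NVL72 configuration (36 Grace CPUs, 72 Blackwell GPUs); the per-device figures are approximate and the variable names are mine, for illustration only.

```python
# Back-of-the-envelope: how the GB200 NVL72 rack's coherent memory pool adds up.
# Figures are approximate, taken from NVIDIA's public GB200 NVL72 materials.

GPUS_PER_RACK = 72        # Blackwell GPUs in one NVL72 scale-up domain
CPUS_PER_RACK = 36        # Grace CPUs (each GB200 superchip = 1 Grace + 2 Blackwell)

HBM_PER_GPU_GB = 186      # HBM3e per Blackwell GPU (~13.5 TB per rack in NVIDIA's spec)
LPDDR_PER_CPU_GB = 480    # LPDDR5X attached to each Grace CPU, coherent over NVLink-C2C

hbm_total_tb = GPUS_PER_RACK * HBM_PER_GPU_GB / 1024
lpddr_total_tb = CPUS_PER_RACK * LPDDR_PER_CPU_GB / 1024

print(f"HBM3e pool:     {hbm_total_tb:.1f} TB")
print(f"LPDDR5X pool:   {lpddr_total_tb:.1f} TB")
print(f"Coherent total: {hbm_total_tb + lpddr_total_tb:.1f} TB")
# ~13 TB of HBM plus ~17 TB of LPDDR5X: roughly 30 TB addressable across the rack.
```

The point of the coherency argument is that last line: the rack behaves like one machine with roughly 30 TB of addressable memory, not 72 isolated GPUs.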
By the time Google is opening TPUs to other Hyperscalers in late 2025, NVIDIA has already solidified its position with Blackwell (GB300), FP4 precision, and the Rubin roadmap.
3. Agentic AI: Why the “Genius” Model is Overrated
New Research: NVIDIA’s “ToolOrchestra” & The Butler Philosophy
NVIDIA recently published a paper that shatters the “Bigger is Better” myth. They trained a small 8B parameter model (ToolOrchestra) that scored 37.1% on the “Humanity’s Last Exam” (HLE) benchmark—beating OpenAI’s GPT-5 (35.1%).
The Insight: The “Self-Enhancement Bias”
Big models like GPT-5 suffer from “Affluenza.” When given a task, GPT-5 tries to do everything itself (98% of the time) or delegates to a mini-version of itself.
Analogy: This is like hiring a $2,000/hour top-tier lawyer. You ask them to buy you a coffee, and they insist on doing it personally, charging you their hourly rate, instead of using an app or an intern. It’s a commercial disaster.
The “General Contractor” Approach
NVIDIA’s 8B model acts like a smart General Contractor (or Jarvis from Iron Man). It isn’t a genius; it is a Steward. It keeps a “Rolodex” of tools and routes each request to the cheapest one that can handle it (see the sketch after this list):
Need math? It calls a Python interpreter.
Need facts? It runs a Google Search.
Need deep reasoning? Only then does it pay to call the expensive GPT-5 API.
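To make the pattern concrete, here is a minimal Python sketch of that routing loop. The keyword router, tool names, and cost figures are all hypothetical stand-ins; the actual ToolOrchestra paper trains an 8B model to make this decision, while the sketch below only shows the shape of a compound system.

```python
# Minimal sketch of the "general contractor" pattern: a cheap router picks the
# tool for each request and only escalates to the expensive frontier model when
# nothing cheaper will do. Tool names, costs, and keywords are illustrative.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    cost_per_call: float              # illustrative $ cost per invocation
    run: Callable[[str], str]

TOOLS = {
    "python":   Tool("python",   0.0001, lambda q: f"[run Python for: {q}]"),
    "search":   Tool("search",   0.001,  lambda q: f"[web search for: {q}]"),
    "frontier": Tool("frontier", 0.05,   lambda q: f"[call expensive LLM for: {q}]"),
}

def route(query: str) -> Tool:
    """Stand-in for the small orchestrator model: pick the cheapest capable tool."""
    q = query.lower()
    if any(k in q for k in ("calculate", "sum", "percent", "integral")):
        return TOOLS["python"]        # math -> interpreter
    if any(k in q for k in ("latest", "price", "who is", "when did")):
        return TOOLS["search"]        # facts -> search
    return TOOLS["frontier"]          # deep reasoning -> pay for the big model

if __name__ == "__main__":
    for query in ("Calculate 17% of 2,350",
                  "Who is the CEO of TSMC?",
                  "Draft a strategy memo on TPU vs. GPU economics"):
        tool = route(query)
        print(f"{tool.name:>8} (${tool.cost_per_call}): {tool.run(query)}")
```

The business case below follows from this shape: most queries never touch the expensive endpoint, so the blended cost drops while the hardest queries still get frontier-level reasoning.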
The Business Case:
Cost: 30-40% of a monolithic model.
Efficiency: 2.5x better at solving complex tasks.
Strategic Implications: This defines the second half of the AI game. The competition isn’t just about who has the biggest brain (Model); it’s about who builds the best Compound System. NVIDIA is signaling to developers: We don’t need more arrogant geniuses. We need budget-conscious, highly effective managers. This validates NVIDIA’s “Scale Across” strategy—controlling the flow of compute is more important than just owning the raw model weights.
https://arxiv.org/pdf/2511.21689
What I Read