Weekly Shots of Insight and Market #8 & #9
Merry Christmas! Micron (MU) Outlook; Groq: A Strategic Move for Talent and Software Supremacy
Hi everyone, Merry Christmas! I was traveling with my family for the holiday over the past two weeks, so I've combined two weeks' worth of content into this single issue in order to dedicate my time fully to them. Christmas is my favorite holiday of the year; beyond the beautiful decorations, gift exchanges, and games, what I love most is the atmosphere of everyone coming together.
The tallest Christmas tree I've seen this year.
Market
MU
I. Micron (MU) Business Segment Outlook
Data Centers: Management has revised the 2025 global server shipment growth forecast upward from 10% to 17–19% YoY. This is driven by cloud service providers continuing to build data centers, which sustains strong demand for HBM, Server LPDDR5, and Gen 9 process NAND Flash. For 2026, agreements on HBM supply and pricing have already been reached with customers, and HBM4 has passed validation, with production ramp-up scheduled for Q2 2026.
PC/Notebooks: Due to memory supply shortages, PC and laptop brands are facing production bottlenecks. Management predicts that 2026 shipments may experience a slight decline, which would offset the benefits of increased average memory capacity per unit.
Mobile & Intelligent Edge: Small-scale shipments of 1-gamma process 16Gb LPDDR6 have commenced. Additionally, samples of 1-gamma process 24Gb LPDDR5 are being provided to customers to continue improving the product mix.
Auto & Industrial: Because these products have longer lifecycles, demand for DDR4 and LPDDR4 remains robust. To support the long-term needs of these customers, the company announced an expansion plan for its Virginia plant in June 2025.
II. HBM Market and Supply Chain Dynamics
Market Growth: The HBM market is projected to reach approximately $35 billion in 2025 and grow to $100 billion by 2028, a compound annual growth rate (CAGR) of roughly 40% (a quick sanity check of this arithmetic follows at the end of this section).
Kinik and Technical Requirements: It is estimated that Micron uses Kinik’s carrier wafers as temporary carriers for the HBM stacking process. Kinik plans to expand capacity by 70,000 pieces per month in Q3 2026 to meet the needs of logic chips (advanced processes) and memory customers.
CMP and Hybrid Bonding: As HBM stacking layers increase, every layer requires CMP (Chemical Mechanical Polishing). The future introduction of Hybrid Bonding will significantly raise requirements for surface flatness, driving demand for higher-specification diamond discs and boosting Kinik’s operational performance.
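As a quick sanity check on the growth figure above, here is a minimal sketch of the implied CAGR; the $35 billion (2025) and $100 billion (2028) endpoints are the forecast figures cited above, and the rest is just the standard compound-growth formula:

```python
# Back-of-the-envelope check of the HBM market CAGR implied by the forecast above.
# Endpoints are the cited forecast figures; 2025 -> 2028 is three compounding years.
start_usd, end_usd, years = 35e9, 100e9, 3

cagr = (end_usd / start_usd) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~41.9%, roughly consistent with the ~40% cited
```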
III. DRAM & NAND Supply and Demand Forecast
Supply and Demand Balance: For both DRAM and NAND, supply growth is projected at approximately 17–20% (driven primarily by the 1-gamma process), while demand growth is expected to be around 20%. With demand growing only slightly faster than supply, the supply gap should widen, but only to a limited extent.
Short-term Risks: Cloud service providers currently hold high Server DRAM inventory (70–80 days), which is nearing the “safety level” of 90–120 days; this may lead to a slowdown in procurement. Furthermore, rising DRAM prices could potentially suppress demand for consumer electronics like PCs and smartphones.
Long-term Equilibrium (2027+): The market is expected to return to balance starting in 2027 as new production capacities from major manufacturers—including Samsung (P4), SK Hynix (Yongin), Micron (ID1), Nanya Technology, and Winbond—are completed and come online.
NAND and HDD Competition: As HDD manufacturers (Seagate, WD) expand capacity for HAMR and Ultra SMR technologies (ranging from 36TB to 44TB), the trend of using QLC NAND to replace HDDs for “cold storage” will slow down, easing NAND supply pressure.
Potential Supply Shock: A major variable is Samsung’s Line 12 in Hwaseong. If Samsung converts its 100,000 monthly wafers of MLC NAND capacity entirely to TLC/QLC, it could add an extra 10–15% to global supply, potentially disrupting the market balance.
Nvidia’s Potential Acquisition of Groq: A Strategic Move for Talent and Software Supremacy
While social media hype often paints Groq as a revolutionary “Nvidia killer” due to its unique hardware architecture, a deeper look at the technical and business realities suggests that Nvidia’s interest in the startup is likely driven by compiler technology and top-tier talent rather than just its hardware.
1. The SRAM Debate: Innovation or Limitation?
Groq’s primary claim to fame is its use of SRAM (Static Random-Access Memory) instead of the industry-standard HBM (High Bandwidth Memory). However, this approach comes with significant trade-offs:
• Cost and Scalability: SRAM is integrated directly onto the same die as the logic chip (the GPU, TPU, or LPU) and is traditionally used for cache memory (L1/L2/L3). Because Groq does not use the most advanced manufacturing processes, and because SRAM area does not shrink effectively as logic process nodes advance, the cost structure is unfavorable and capacity is extremely difficult to scale (see the sketch after this list).
• Not a “Magic Bullet”: Several other startups, such as Graphcore and Cerebras, have pursued similar SRAM-heavy paths with poor market results. Graphcore, for instance, was eventually sold to SoftBank at a significantly depressed valuation.
• The HBM Success Story: Proponents of Groq often also praise Google’s TPU, yet the TPU relies on large-capacity HBM, the same technology used by Nvidia and AMD, which shows that dropping HBM is hardly a prerequisite for high performance.
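To make the capacity point above concrete, here is a minimal sketch of how many chips are needed just to hold the weights of a large model. The per-chip figures (roughly 230 MB of on-die SRAM for a Groq LPU, 80 GB of HBM for an Nvidia H100) are commonly cited public specs used here as rough assumptions, and the 70B-parameter model is purely illustrative:

```python
import math

# Rough illustration of the capacity gap between on-die SRAM and HBM.
# Per-chip figures are commonly cited public specs, used here as assumptions.
model_params = 70e9                 # illustrative 70B-parameter model
bytes_per_param = 2                 # FP16 weights
weights_gb = model_params * bytes_per_param / 1e9   # ~140 GB of weights

sram_per_lpu_gb = 0.230             # ~230 MB on-die SRAM per Groq LPU (assumed)
hbm_per_gpu_gb = 80                 # 80 GB HBM per Nvidia H100 (assumed)

print("Chips needed just to hold the weights:")
print(f"  SRAM-only LPUs: {math.ceil(weights_gb / sram_per_lpu_gb)}")  # hundreds of chips
print(f"  HBM GPUs:       {math.ceil(weights_gb / hbm_per_gpu_gb)}")   # a handful
```

Under these assumptions, an SRAM-only deployment needs hundreds of chips before a single token is generated, which is exactly the cost and scalability problem described above.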
2. Business Realities vs. Social Media Hype
Despite the buzz on platforms like X (formerly Twitter), Groq’s commercial performance has been relatively stagnant:
• Stagnant Adoption: The LPU (Language Processing Unit) has been on the market for two to three years, yet it still lacks adoption by major mainstream customers.
• Market Reception: Actual revenue and business traction have remained lukewarm compared with the enthusiastic reception in tech circles.
3. Why Nvidia Wants Groq: Compilers and Determinism
The true value of Groq likely lies in its software layer.
• Deterministic Execution: A historical weakness of Nvidia’s CUDA platform is its dynamic scheduling, which makes “Zero Jitter” difficult to achieve. In contrast, Groq’s compiler achieves “Deterministic Execution,” providing predictable, ultra-low latency (a toy illustration of the jitter difference follows this list).
• Batch-1 Efficiency: While Nvidia dominates heavy model training, Groq’s architecture is exceptionally efficient for Batch-1 scenarios—where a single user requires an immediate, real-time response.
• The Hybrid Strategy: Nvidia could potentially develop “hybrid architecture” products, using HBM for massive training tasks while integrating Groq’s compiler and SRAM tech for edge computing or specialized high-speed inference.
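A toy simulation of why this matters for Batch-1 serving, where tail latency is what a single user actually feels. Every number below is invented purely for illustration; this models scheduling jitter in the abstract, not any real GPU or LPU:

```python
import random
import statistics

# Toy model: dynamic scheduling adds run-to-run variation (jitter) on top of the
# base compute time, while a deterministic, compiler-fixed schedule does not.
random.seed(0)

def dynamic_latency_ms() -> float:
    # Base compute time plus jitter from queueing, kernel launches, and contention.
    return 10.0 + random.expovariate(1 / 3.0)

def deterministic_latency_ms() -> float:
    # The compiler fixes the schedule ahead of time, so every run takes the same time.
    return 12.0

runs = 10_000
for name, fn in [("dynamic", dynamic_latency_ms), ("deterministic", deterministic_latency_ms)]:
    xs = sorted(fn() for _ in range(runs))
    p99 = xs[int(0.99 * runs)]
    print(f"{name:>13}: median = {statistics.median(xs):5.1f} ms, p99 = {p99:5.1f} ms")
```

The takeaway from the sketch: the two paths have comparable median latency, but the dynamically scheduled one carries a much fatter p99 tail, and that tail is precisely the jitter a deterministic schedule eliminates.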
4. The “Acqui-hire”: Securing the Architects of the AI Era
Perhaps the most compelling reason for an acquisition is talent. Groq was founded by Jonathan Ross, a core designer of the original Google TPU.
• TPU Legacy: Ross was instrumental in creating the hardware that powered AlphaGo’s historic victory in 2016.
• Software-Defined Hardware: Ross founded Groq to evolve the TPU concept from a rigid, internal-use tool into a more flexible, software-defined system. By moving control from the hardware circuits to the compiler (software), his LPU aims to solve the efficiency gaps found in traditional GPUs.
5. Strategic Defense
Finally, this would be a defensive acquisition. Tech giants like Google, Amazon, and Microsoft are all developing in-house chips to reduce their dependence on Nvidia. By acquiring Groq, Nvidia prevents its rivals from leveraging a unique architecture that has already proven it can outperform Nvidia in specific, niche low-latency applications. At a potential price tag of $20 billion, this is a relatively “inexpensive bet” for Nvidia, representing only a few quarters of its typical R&D budget.
What I Read
Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads https://ieeexplore.ieee.org/document/9138986
A Software-Defined Scale-Out Computer Architecture https://ieeexplore.ieee.org/document/9808428
MU Transcript



