Dylan Patel — Deep Dive on the 3 Big Bottlenecks to Scaling AI Compute

March 13, 2026 · 2 hr 31 min

🤖 AI Summary

Overview

This episode dives into the three major bottlenecks to scaling AI compute: logic, memory, and power. Dylan Patel, founder of SemiAnalysis, provides a comprehensive analysis of the semiconductor supply chain, the economics of AI labs and hyperscalers, and the challenges posed by geopolitical and technological constraints.

Notable Quotes

- "H100s are worth more today than three years ago because the value of the models they run has skyrocketed." (Dylan Patel, on the increasing utility of GPUs)

- "ASML's EUV tools are the most complicated machines humans make, and they will be the ultimate bottleneck for AI compute by 2030." (Dylan Patel, on the critical role of lithography tools)

- "Fast timelines favor the U.S., but long timelines favor China in the semiconductor race." (Dylan Patel, on the geopolitical implications of AI scaling)

🧠 Logic Bottlenecks: Semiconductor Supply Chain

- The semiconductor supply chain is constrained by the production of advanced chips, with TSMC dominating the market.

- NVIDIA secured early allocations of TSMC's 3nm wafers, while companies like Google and Amazon moved later and are now squeezed for capacity.

- ASML's EUV tools are the ultimate bottleneck: production will still be capped at around 100 tools per year even by 2030, and scaling output further is hindered by complex supply chains and multi-year lead times (a back-of-envelope capacity sketch follows this list).

- Older fabs (e.g., 7nm) could in principle be repurposed, but chips built on those nodes lack the power efficiency and performance modern AI workloads demand.
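
To make the scanner constraint concrete, here is a rough back-of-envelope. Every figure in it (throughput, uptime, EUV layer count) is an illustrative assumption, not a number from the episode:

```python
# Back-of-envelope EUV capacity math; every figure below is an illustrative
# assumption, not a number from the episode.
WPH        = 160     # assumed wafer exposures per hour for one EUV scanner
UPTIME     = 0.75    # assumed fraction of the year the scanner spends exposing
EUV_LAYERS = 20      # assumed EUV-patterned layers on a leading-edge logic wafer
HOURS_YEAR = 24 * 365

exposures_per_tool = WPH * UPTIME * HOURS_YEAR          # wafer-layer exposures/year
wafer_starts_per_tool = exposures_per_tool / EUV_LAYERS  # finished wafers/year

print(f"one scanner  : ~{wafer_starts_per_tool / 1e3:.0f}k wafer starts/year")
print(f"100 scanners : ~{100 * wafer_starts_per_tool / 1e6:.1f}M wafer starts/year")
```

Every wafer needs each of its EUV layers exposed, so adding layers divides a scanner's effective capacity; this is why scanner output, not fab floor space, ends up as the binding constraint.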

💾 Memory Crunch: The Rising Demand for DRAM and HBM

- AI models require far more memory as context lengths grow: the KV cache scales linearly with context length for each user, and concurrent users multiply it (see the sketch after this list).

- HBM (High Bandwidth Memory) is critical but consumes roughly 3-4x more wafer area per bit than standard DRAM, exacerbating supply constraints.

- Memory prices have tripled, and smartphone and PC manufacturers are cutting production as rising memory costs squeeze their margins.

- The transition to 3D DRAM could alleviate some pressure, but it requires significant retooling of fabs and is years away.
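
A minimal sketch of why context length drives memory demand, using hypothetical model dimensions (a 70B-class dense model with grouped-query attention; none of these numbers are from the episode):

```python
# Back-of-envelope KV-cache sizing; model dimensions are hypothetical
# (a 70B-class dense model with grouped-query attention), not from the episode.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, dtype_bytes=2):
    """KV cache for one sequence: a K and a V tensor per layer, FP16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

# Assumed dimensions: 80 layers, 8 KV heads of dimension 128.
for ctx in (8_192, 128_000, 1_000_000):
    gib = kv_cache_bytes(80, 8, 128, ctx) / 2**30
    print(f"{ctx:>9,} tokens -> {gib:6.1f} GiB per sequence")
```

The growth per sequence is linear (~2.5 GiB at 8k tokens, ~305 GiB at 1M under these assumptions), but serving thousands of concurrent long-context users multiplies it, which is what pushes HBM and DRAM demand up so fast.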

⚡ Power Scaling: Addressing Energy Needs

- Power is not expected to be a long-term bottleneck for AI compute. The U.S. has ample capacity to scale energy production through gas turbines, reciprocating engines, and renewable sources.

- Behind-the-meter power generation (e.g., on-site gas turbines) is becoming a popular way for data centers to bypass grid interconnection constraints (a rough sizing sketch follows this list).

- Modularization of data centers and innovative cooling solutions (e.g., immersion cooling) will further optimize power usage.
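
For a sense of scale, a rough sizing sketch with assumed per-GPU and per-turbine figures (none taken from the episode):

```python
# Back-of-envelope campus power sizing; all figures are assumed, not from the episode.
GPU_TDP_W     = 700   # assumed accelerator board power at full load (H100-class)
ALL_IN_FACTOR = 2.0   # assumed: host, networking, storage, cooling roughly double IT draw
TURBINE_MW    = 500   # assumed output of one large gas-turbine block

def campus_megawatts(n_gpus: int) -> float:
    """Total facility power for a campus of n_gpus accelerators, in megawatts."""
    return n_gpus * GPU_TDP_W * ALL_IN_FACTOR / 1e6

for n in (100_000, 1_000_000):
    mw = campus_megawatts(n)
    print(f"{n:>9,} GPUs -> {mw:6.0f} MW (~{mw / TURBINE_MW:.1f} turbine blocks)")
```

Under these assumptions even a million-GPU buildout lands in single-digit turbine blocks, which is why the episode treats U.S. power as a solvable engineering and permitting problem rather than a hard physical limit.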

🚀 Space GPUs: A Decade Away

- Space-based data centers are unlikely to be viable this decade due to logistical challenges, such as the need to deploy GPUs rapidly and service them once in orbit.

- Networking between satellites would face significant bandwidth and reliability issues compared to terrestrial data centers.

- While space GPUs could become relevant in the long term, current bottlenecks in semiconductor production and the relative ease of scaling terrestrial power make them impractical for now.

🌏 Geopolitical Dynamics: Taiwan and China

- Taiwan's TSMC is central to the global semiconductor supply chain, and any disruption would drastically slow AI progress.

- China is aggressively building its semiconductor capabilities, but it is unlikely to fully indigenize EUV production before 2030.

- Fast AI timelines favor the U.S., as it leads in compute scaling and infrastructure investment. However, slower timelines could allow China to catch up through sheer scale and vertical integration.

AI-generated content may not be accurate or complete and should not be relied upon as a sole source of truth.

📋 Episode Description

Dylan Patel, founder of SemiAnalysis, provides a deep dive into the 3 big bottlenecks to scaling AI compute: logic, memory, and power.

He also walks through the economics of labs, hyperscalers, foundries, and fab equipment manufacturers.

Learned a ton about every single level of the stack. Enjoy!

Watch on YouTube; read the transcript.

Sponsors

* Mercury has already saved me a bunch of time this tax season. Last year, I used Mercury to request W-9s from all the contractors I worked with. Then, when it came time to issue 1099s this year, I literally just clicked a button and Mercury sent them out. Learn more at mercury.com.

* Labelbox noticed that even when voice models appear to take interruptions in stride, their performance degrades. To figure out why, they built a new evaluation pipeline called EchoChain. EchoChain diagnoses voice models’ specific failure modes, letting you understand what your model needs to truly handle interruptions. Check it out at labelbox.com/dwarkesh.

* Jane Street is basically a research lab with a trading desk attached – and their infrastructure backs this up. They’ve got tens of thousands of GPUs, hundreds of thousands of CPU cores, and exabytes of storage. This is what it takes to find subtle signals hidden deep within noisy market data. If this sounds interesting, you can explore open positions at janestreet.com/dwarkesh.

Timestamps

(00:00:00) – Why an H100 is worth more today than 3 years ago

(00:24:52) – Nvidia secured TSMC allocation early; Google is getting squeezed

(00:34:34) – ASML will be the #1 constraint for AI compute scaling by 2030

(00:56:06) – Can’t we just use TSMC’s older fabs?

(01:05:56) – When will China outscale the West in semis?

(01:16:20) – The enormous incoming memory crunch

(01:42:53) – Scaling power in the US will not be a problem

(01:55:03) – Space GPUs aren’t happening this decade

(02:14:26) – Why aren’t more hedge funds making the AGI trade?

(02:18:49) – Will TSMC kick Apple out from N2?

(02:24:35) – Robots and Taiwan risk



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe