Voices & Thought Leaders · Thursday, 26 March 2026

Why Arm Started Making Chips: The CPU Demands of Agentic AI

Arm designs chips. They don't build or sell them themselves. That's been the business model for decades - license the architecture, let others build the silicon. Until now.

In an interview with Ben Thompson, Arm's CEO Rene Haas explained why the company is now making and selling its own chips. The answer isn't about competing with partners. It's about agentic AI and token distribution.

This is one of those shifts that looks small until you realise what it means for infrastructure costs across the industry.

Meta Was the Catalyst

The conversation started with Meta. They approached Arm with a problem: agentic AI systems - the ones that act autonomously, not just respond to prompts - require massive CPU scale.

Here's why. When an AI agent operates, it's not just generating text. It's distributing tokens across multiple processes: reasoning, planning, tool use, memory retrieval, all happening in parallel. That coordination work is branchy and latency-sensitive, and it doesn't fit neatly into GPU architecture. It needs CPUs. Lots of them.

Meta's scale made the problem visible first. But the pattern applies to anyone building agentic systems. As AI moves from single-task inference to multi-step autonomous operation, the infrastructure requirements shift. GPUs handle the heavy compute. CPUs handle the coordination.

In simpler terms... think of it like a construction site. GPUs are the heavy machinery - cranes, excavators, doing the big work. CPUs are the site managers - coordinating, routing, making sure everything happens in the right order. Agentic AI needs both, and right now there's a CPU bottleneck.
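
To make that split concrete, here's a minimal sketch of a single agent step. Everything in it is hypothetical - the function names, the latencies, the "search" tool are illustrative stand-ins, not Arm's or Meta's actual stack - but it shows where the CPU-side coordination work accumulates around each model call.

```python
import asyncio

# A minimal sketch of one agent step, splitting GPU-bound inference from
# CPU-bound coordination. All names and latencies are hypothetical stand-ins.

async def generate_tokens(prompt: str) -> str:
    # GPU-bound: the model forward passes. In a real system this would be
    # a network call to an inference server, not local work.
    await asyncio.sleep(0.05)  # stand-in for inference latency
    return f"plan for: {prompt}"

async def retrieve_memory(query: str) -> list[str]:
    # CPU-bound coordination: embedding lookups, vector search, filtering.
    await asyncio.sleep(0.01)  # stand-in for retrieval latency
    return [f"note about {query}"]

async def call_tool(name: str, args: dict) -> str:
    # CPU-bound coordination: schema validation, HTTP calls, result parsing.
    await asyncio.sleep(0.01)  # stand-in for tool latency
    return f"{name}({args}) -> ok"

async def agent_step(task: str) -> str:
    # The orchestration layer fans out CPU-side work around each model call.
    # One user request can trigger many such steps before the agent finishes.
    memory, tool_result = await asyncio.gather(
        retrieve_memory(task),
        call_tool("search", {"q": task}),
    )
    return await generate_tokens(f"{task} | {memory} | {tool_result}")

if __name__ == "__main__":
    print(asyncio.run(agent_step("book a meeting room")))
```

Multiply that pattern across thousands of concurrent agents and the coordination layer - not the model - becomes the thing you're provisioning hardware for.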

Why Licensing Wasn't Enough

Arm's traditional model - license the design, let partners manufacture - works brilliantly for most markets. But when Meta said "we need this specific thing at this scale", the licensing model hit a wall.

Custom silicon takes time. Partnerships add complexity. When you're trying to build infrastructure for workloads that didn't exist two years ago, speed matters. Arm decided the fastest path was to build and sell the chips themselves.

Haas was clear: this isn't about replacing partners. It's about serving a market need that the existing model couldn't address quickly enough. The CPU demands of agentic AI created a gap. Arm filled it.

What This Means for Builders

Right, but here's the practical bit. If you're building on AI infrastructure, the cost structure is about to shift.

GPU costs have dominated conversations for the past two years. Inference pricing, training budgets, optimisation strategies - all focused on GPU efficiency. That's still important. But as workloads move toward agentic behaviour, CPU costs start climbing.

For developers, this changes architecture decisions. Multi-agent systems, reasoning loops, tool-use patterns - all these features increase CPU load. The models themselves might be efficient, but the orchestration layer needs resources.

Arm entering chip manufacturing signals where the market is heading. More CPU-intensive AI workloads. More demand for coordination infrastructure. More focus on balanced compute rather than GPU-only optimisation.

If you're running production AI systems, this is worth watching. The infrastructure you're building today might need more CPU headroom than you've planned for. Especially if your roadmap includes agentic features.
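
As a rough way to frame the headroom question, here's a back-of-envelope sketch. Every number in it is an assumption chosen for illustration - real orchestration costs depend entirely on your stack - but the shape of the arithmetic is the point: per-request CPU work scales with the length of the agent loop.

```python
# Back-of-envelope CPU sizing for agentic workloads.
# Every number below is an illustrative assumption, not a measurement.

CHAT_STEPS_PER_REQUEST = 1    # single completion, no agent loop
AGENT_STEPS_PER_REQUEST = 8   # assumed reasoning / tool-use loop length
CPU_MS_PER_STEP = 15          # assumed orchestration cost per step:
                              # routing, retrieval, parsing, state updates

def orchestration_cores(requests_per_sec: float, steps_per_request: int) -> float:
    """Cores needed for orchestration, given one core supplies 1000 CPU-ms/s."""
    cpu_ms_per_sec = requests_per_sec * steps_per_request * CPU_MS_PER_STEP
    return cpu_ms_per_sec / 1000

RPS = 200
print(f"chat workload:  {orchestration_cores(RPS, CHAT_STEPS_PER_REQUEST):.0f} cores")   # 3
print(f"agent workload: {orchestration_cores(RPS, AGENT_STEPS_PER_REQUEST):.0f} cores")  # 24
```

Same request rate, same per-step cost - the loop length alone multiplies the CPU bill by eight. That's the question worth asking before agentic features ship.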

The Broader Pattern

There's something else here. When hardware companies change their business model, it's usually because the market moved faster than their structure could adapt.

Arm's licensing model worked perfectly for mobile, IoT and embedded systems - markets where design cycles are long and partners want customisation. But AI infrastructure moves differently. Fast iteration. Rapid deployment. Workloads that emerge and scale in months, not years.

The move into selling chips isn't just about Meta's needs. It's about positioning for a market where infrastructure requirements shift quickly. When you can't wait 18 months for a partner to spin up custom silicon, you need a faster path to market.

For anyone watching the AI infrastructure space, this is a signal. The bottleneck isn't just training compute anymore. It's deployment infrastructure - the CPUs, the coordination layers, the systems that make agentic AI actually work at scale.

We've been so focused on GPU availability that the CPU requirements of agentic systems snuck up quietly. Arm making their own chips is a response to that gap. And if they're responding, others will follow.

The maths is changing. Plan accordingly.


Today's Sources

DEV.to AI
LLMs in DevOps: Why They Work Best as a "Very Fast Junior Engineer"
Hacker News Best
Running Tesla Model 3's computer on my desk using parts from crashed cars
Towards Data Science
What the Bits-over-Random Metric Changed in How I Think About RAG and Agents
ML Mastery
5 Practical Techniques to Detect and Mitigate LLM Hallucinations Beyond Prompt Engineering
The Robot Report
3 robotics trends from NVIDIA GTC 2026
The Robot Report
Unitree IPO shows a real hardware business, but the humanoid case is still early
The Robot Report
Basler and Orbbec partner for 3D vision systems for mobile robots
Robohub
A history of RoboCup with Manuela Veloso
ROS Discourse
Remote Control of Robotic Arms - Using a Standard Gamepad
Ben Thompson Stratechery
An Interview with Arm CEO Rene Haas About Selling Chips
Latent Space
[AINews] The Biggest Claude Launch of All Time
Gary Marcus
War and AI, the death of Sora, and 3 ways you can catch me live today

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.
