Voices & Thought Leaders Friday, 13 March 2026

When agents flood your database - the infrastructure shift nobody saw coming

Simon Hørup Eskildsen from Turbopuffer dropped something important in a recent Latent Space conversation. It's one of those technical insights that sounds narrow but actually explains why so many AI applications feel sluggish right now.

The shift from RAG (Retrieval-Augmented Generation) to agentic systems isn't just a new pattern for developers. It fundamentally changes what databases need to handle.

The Old Model vs The New Reality

Traditional RAG systems worked like this: a user asks a question, the system makes one thoughtful database query, retrieves relevant context, generates a response. One user, one query, sequential processing. Database infrastructure was built for exactly this pattern.

Agentic systems behave completely differently. As Simon explains, agents spawn multiple concurrent sub-tasks. One user request might trigger dozens of parallel database queries as the agent simultaneously investigates different angles, retrieves different context, explores different solution paths.

That's not a 10x increase in database load. It's more like 100x. And it's happening right now as developers ship more sophisticated AI applications.
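To make the contrast concrete, here is a minimal sketch of the two request shapes. Nothing in it comes from the conversation itself: `fake_query` is a hypothetical stand-in for a real database client, and the fan-out of 24 sub-tasks is an assumed number chosen only to illustrate "dozens".

```python
import asyncio


async def fake_query(text: str) -> str:
    """Hypothetical stand-in for a database call; a real client would do network I/O."""
    await asyncio.sleep(0.01)
    return f"results for {text!r}"


async def rag_request(question: str) -> list[str]:
    # Classic RAG shape: one retrieval query per user request.
    return [await fake_query(question)]


async def agent_request(question: str, sub_tasks: int = 24) -> list[str]:
    # Agentic shape: fan out many sub-queries at once and wait for all of them.
    # sub_tasks=24 is an assumed figure, not a measurement.
    queries = [f"{question} / angle {i}" for i in range(sub_tasks)]
    return await asyncio.gather(*(fake_query(q) for q in queries))


rag_results = asyncio.run(rag_request("why is checkout slow?"))
agent_results = asyncio.run(agent_request("why is checkout slow?"))
print(len(rag_results), len(agent_results))  # prints: 1 24
```

The point is not the code but the shape of the load: the database sees all of the agent's queries arrive in the same instant rather than one at a time.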

Why Existing Infrastructure Struggles

The databases we've been using weren't designed for this workload pattern. They optimised for different trade-offs - consistency over speed, single large queries over massive parallelism, predictable access patterns over chaotic concurrent requests.

Simon's insight is that search infrastructure needs rethinking from first principles when agents are the primary users. Not humans making occasional searches. Not even traditional applications making predictable queries. But AI agents making hundreds of simultaneous requests with unpredictable patterns.

Turbopuffer is his answer to that problem - a database built specifically for the workloads agents create. The architecture decisions differ fundamentally from both traditional databases and even modern vector databases designed for RAG.

The Pricing Problem

Here's where it gets interesting for anyone building AI applications. When your database load increases 100x because you added agentic capabilities, your infrastructure costs could explode. That's not theoretical - developers are hitting this right now.

Simon discusses how Turbopuffer approaches pricing for this new reality. It's not just about per-query costs. It's about predictable economics when query patterns are inherently unpredictable. That's a hard problem - probably harder than the technical architecture challenges.

For builders, this matters because infrastructure costs directly determine what you can afford to build. If agentic applications cost 100x more to run than RAG systems, many use cases simply won't be viable. The economics need to work, or the applications don't get built.
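A quick back-of-the-envelope calculation shows why this bites under simple per-query pricing. Every number below is hypothetical (the request volume, the price per million queries and the 100x fan-out); none of them is Turbopuffer's actual pricing.

```python
def monthly_query_cost(
    requests_per_month: int,
    queries_per_request: int,
    price_per_million_queries: float,
) -> float:
    """Cost under a purely per-query pricing model (an assumed model for illustration)."""
    total_queries = requests_per_month * queries_per_request
    return total_queries / 1_000_000 * price_per_million_queries


# Hypothetical inputs: 1M user requests a month at 2.00 per million queries.
rag_cost = monthly_query_cost(1_000_000, 1, 2.0)
agent_cost = monthly_query_cost(1_000_000, 100, 2.0)

print(rag_cost, agent_cost, agent_cost / rag_cost)  # prints: 2.0 200.0 100.0
```

Same users, same request count, and the bill scales with the fan-out. That is the gap any pricing model for agentic workloads has to close.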

Architecture Decisions That Actually Matter

The conversation goes deep on technical choices - hybrid search strategies, how to handle massive concurrency, where to make trade-offs between latency and cost. This isn't abstract theory. These are the decisions that determine whether your AI application responds in 200 milliseconds or 2 seconds.
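For readers new to hybrid search, the basic idea is to merge a keyword ranking and a vector ranking into one result list. The sketch below uses reciprocal rank fusion, one widely used way to do that. It is my choice for illustration and not necessarily the approach discussed in the conversation; the document IDs are made up, and `k=60` is a commonly used default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more score the higher it ranks in each list.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # prints: ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that both indexes agree on float to the top, which is the behaviour you want when neither keyword nor vector search is reliable on its own.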

What I noticed is how specific the requirements are. Building for agentic workloads isn't just "make it faster" or "handle more queries". It requires fundamentally different assumptions about access patterns, consistency requirements, and failure modes.

Simon's point about database design following application patterns is crucial. We're not just adding agents to existing architectures. We're discovering that agents require new infrastructure primitives.

What This Means for Developers

If you're building AI applications, particularly anything with agents, this conversation is worth your time. Not because Turbopuffer is the only answer - it's one approach among several emerging solutions. But because the problems Simon articulates are problems every developer building with agents will encounter.

The shift from RAG to agents isn't just about prompt engineering or agent frameworks. It's about infrastructure that can handle what agents actually do when they run. That's the unglamorous foundation work that determines whether ambitious AI applications actually ship or just stay as impressive demos.

We're still early in figuring this out. But conversations like this one - deep technical discussions about real bottlenecks and actual solutions - are how we get from "agents are cool" to "agents are useful in production".

The infrastructure is adapting. Slowly, practically, with real trade-offs. That's how things actually get built.

Video Sources

Ania Kubów
How to Land Freelance Clients with Small Business Whisperer Luke Ciciliano
Fireship
7 new open source AI tools you need right now

Today's Sources

DEV.to AI
Mastra: A Modern TypeScript Framework for AI Applications
Hacker News Best
Shall I implement it? No
Hacker News Best
This is not the computer for you
The Robot Report
Rhoda AI exits stealth with $450M to train robots from video
The Robot Report
MassRobotics, NVIDIA, and AWS announce second Physical AI Fellowship cohort
Hackaday Robotics
Building a Robot Partner to Play Air Hockey With
ROS Discourse
Hephaes: Open-source ROS1/2 Logs to Parquet/TFRecord converter
Robohub
Coding for underwater robotics
Latent Space
Retrieval After RAG: Hybrid Search, Agents, and Database Design - Simon Hørup Eskildsen of Turbopuffer
Ethan Mollick
The Shape of the Thing - Ethan Mollick
Gary Marcus
Is the US military actually afraid of Claude? Gary Marcus on Anthropic and supply chain risk
Latent Space
[AINews] The high-return activity of raising your aspirations for LLMs - Latent Space
Azeem Azhar
The case for radical solar optimism - Azeem Azhar

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.

© 2026 MEM Digital Ltd t/a Marbl Codes