Replit launched Agent 4 this week, and the name undersells what's happening. This isn't just another iteration of a coding assistant - it's the first major coding agent to explicitly position itself as a knowledge work platform.
The shift matters. Every coding agent so far has been about writing code faster. Agent 4 is about doing research, synthesising information, and generating documentation alongside the code. It's not replacing developers - it's replacing the parts of knowledge work that don't require human judgement.
This is the pattern Latent Space is tracking across the industry - coding agents were the wedge, but the real market is everything that looks like structured thinking.
What Agent 4 Actually Does
Agent 4 can write code, debug, refactor, and deploy - capabilities comparable to Claude's Computer Use or GitHub Copilot Workspace. But it also writes documentation, conducts research, synthesises information from multiple sources, and generates structured reports.
In practical terms, this means you can ask it to "research competitors in the project management space and generate a comparison table" or "document this codebase and create onboarding guides for new developers." It handles both the code and the context around the code.
The interface reflects this. It's not a code editor with AI bolted on - it's a workspace where code is one output among many. You're building a product, not just writing functions.
Why Coding Agents Expand to Knowledge Work
The pattern is predictable once you see it. Coding is structured problem-solving with clear inputs and verifiable outputs. It's the easiest knowledge work to automate because you can test whether the code runs.
But the skills that make a coding agent useful - breaking down complex problems, following multi-step instructions, generating structured output, iterating based on feedback - apply to most knowledge work. Research, analysis, documentation, planning, reporting.
Every company that builds a successful coding agent is now looking at the much larger market of "things that look like coding but aren't." Replit got there first by making it explicit - Agent 4 is a knowledge work agent that happens to be very good at code.
NVIDIA's Open Model Efficiency Push
Alongside Replit's launch, Latent Space covered NVIDIA's increasing focus on open model efficiency - not open source as a philosophical position, but open models as the practical path to inference at scale.
The economics are simple. If you're running inference for millions of users, paying per-token API costs to OpenAI or Anthropic isn't sustainable. You need to run models yourself. And if you're running models yourself, you need them to be efficient enough to make the unit economics work.
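The break-even arithmetic can be sketched in a few lines. Every number below is an illustrative assumption - the token volume, per-million-token price, and GPU rental rate are hypothetical, not real vendor pricing:

```python
# Back-of-envelope comparison of per-token API pricing vs. self-hosted
# inference. All figures are illustrative assumptions, not vendor prices.

def monthly_api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Cost of serving a month's traffic through a per-token API."""
    return tokens_per_month / 1_000_000 * price_per_million

def monthly_selfhost_cost(gpu_hourly_rate: float, gpus: int) -> float:
    """Fixed cost of renting GPUs around the clock for a 30-day month."""
    return gpu_hourly_rate * gpus * 24 * 30

# Assumed workload: 5 billion tokens/month at a hypothetical $2 per
# million tokens, vs. four GPUs at a hypothetical $2.50/hour each.
api = monthly_api_cost(5_000_000_000, 2.00)    # -> 10000.0
selfhost = monthly_selfhost_cost(2.50, 4)      # -> 7200.0

print(f"API: ${api:,.0f}/mo, self-hosted: ${selfhost:,.0f}/mo")
```

Under these made-up numbers self-hosting already wins, and the gap widens with volume: API cost scales linearly with tokens while the GPU bill is fixed until you need more capacity.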
NVIDIA's positioning here is clever. They sell the GPUs that run the models, so they benefit when companies move from API calls to self-hosted inference. They're not competing with OpenAI - they're enabling the tier of companies that can't afford to use OpenAI at scale.
For builders, this matters. If you're building a product with AI at its core, your long-term cost structure depends on running your own models. The open model ecosystem - Llama, Mistral, Qwen - is now mature enough to be a viable alternative for most use cases.
Emerging Infrastructure Patterns for 2026
Latent Space identifies three infrastructure trends converging right now - agent frameworks moving from prototype to production, edge inference becoming practical for real-time applications, and multimodal models making vision and audio as reliable as text.
The agent framework pattern is particularly clear. Tools like LangChain and LlamaIndex started as experiment scaffolding. They're now production infrastructure. Companies are building agents that run continuously, handle state management, and fail gracefully when things go wrong.
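The shift from prototype to production mostly means the loop around the model gets serious: explicit state, bounded retries, and failures that are recorded rather than fatal. A minimal sketch, with the task names and step function as hypothetical placeholders rather than any framework's actual API:

```python
# Minimal production-style agent loop: explicit state, bounded retries,
# and graceful failure instead of a crash. Illustrative only - the task
# names and step function are placeholders, not a real framework API.

from dataclasses import dataclass, field

@dataclass
class AgentState:
    completed: list = field(default_factory=list)
    failed: list = field(default_factory=list)

def run_agent(tasks, step, state: AgentState, max_retries: int = 2) -> AgentState:
    """Run each task, retrying transient errors and recording
    unrecoverable ones rather than aborting the whole run."""
    for task in tasks:
        for attempt in range(max_retries + 1):
            try:
                step(task)
                state.completed.append(task)
                break
            except Exception:
                if attempt == max_retries:
                    state.failed.append(task)  # fail gracefully, keep going
    return state

# Usage: a step that always fails on one task.
def step(task):
    if task == "deploy":
        raise RuntimeError("transient error")

state = run_agent(["research", "draft", "deploy"], step, AgentState())
print(state.completed, state.failed)  # ['research', 'draft'] ['deploy']
```

The point is the shape, not the fifteen lines: continuously running agents need their state and failure handling outside the model call, which is exactly what the frameworks now provide.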
Edge inference is the sleeper trend. Models that can run on-device, without internet connectivity, enable a whole category of applications that weren't possible when everything required a server round-trip. Think real-time translation, on-device image analysis, or assistants that work offline.
Multimodal is the most overhyped but also the most real. GPT-4 Vision and Gemini can analyse images reliably enough to build products around. Audio models like Whisper and voice synthesis like ElevenLabs are production-ready. The demos are becoming infrastructure.
What This Means for Builders
If you're building with AI in 2026, the stack is stabilising. You can build on open models, run them yourself, and expect them to work reliably. The infrastructure exists - model hosting, vector databases, agent frameworks, fine-tuning pipelines.
The Replit Agent 4 launch is a signal about where this goes. The companies winning in AI aren't building the most impressive models - they're building the best products around capable-enough models. Replit didn't train Agent 4 from scratch. They built a great product experience on top of existing model infrastructure.
For business owners, the knowledge work agent category is worth watching. If your team does research, writes documentation, generates reports, or synthesises information, these tools are becoming viable alternatives. Not replacements for human judgement, but assistants that handle the structured parts of thinking work.
The coding agent wedge opened the door. Knowledge work agents are what walks through it. And the infrastructure to build them is already here.