Voices & Thought Leaders Saturday, 21 February 2026

Azeem Azhar's AI Agent: 179 Failures and Counting

Azeem Azhar has done something most people writing about AI agents haven't: he's actually built one and let it run. For months. With 179 documented failures encoded into its learning patterns.

This isn't a concept piece about what AI agents might do someday. It's a detailed account of what happens when you give an AI system access to your CRM, notes, and task management - and watch it slowly learn to be useful.

The Setup: Mac Mini and Patience

Azhar's personal AI agent system runs on a Mac Mini. It handles CRM management, note organisation, and task automation. The infrastructure isn't exotic. What's unusual is the honest documentation of how it actually performs.

Most AI agent demonstrations show the success cases. Azhar shows the 179 failures. Each one represents a moment where the system misunderstood context, made an incorrect assumption, or simply broke. Each failure got encoded into the system's learning patterns.
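One way to picture "encoding" a failure is a log of past mistakes whose lessons are replayed into the agent's instructions the next time a similar task comes up. The sketch below is a minimal illustration of that idea only. The article does not describe Azhar's actual mechanism, and every name here (`Failure`, `FailureLog`, `build_prompt`, the example lesson) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Failure:
    task: str     # kind of task the agent was attempting (hypothetical label)
    mistake: str  # what went wrong
    lesson: str   # the correction to carry forward


@dataclass
class FailureLog:
    entries: list = field(default_factory=list)

    def record(self, task: str, mistake: str, lesson: str) -> None:
        """Store one documented failure."""
        self.entries.append(Failure(task, mistake, lesson))

    def lessons_for(self, task: str) -> list:
        """Return every lesson recorded for this kind of task."""
        return [f.lesson for f in self.entries if f.task == task]

    def build_prompt(self, task: str, instruction: str) -> str:
        """Append past lessons for this task to the base instruction."""
        lessons = self.lessons_for(task)
        if not lessons:
            return instruction
        rules = "\n".join(f"- {lesson}" for lesson in lessons)
        return f"{instruction}\n\nLessons from past failures:\n{rules}"


log = FailureLog()
log.record(
    task="crm",
    mistake="Auto-categorised a sensitive contact",
    lesson="Never auto-categorise contacts marked as sensitive",
)
prompt = log.build_prompt("crm", "Update the CRM from today's emails.")
```

In this sketch each new failure adds one rule to future prompts for the same task type, which is one plausible reading of how a count like 179 could accumulate into more reliable behaviour.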

This is what real agent deployment looks like. Not the smooth demonstration where everything works perfectly. The messy reality where you spend weeks teaching a system the difference between "urgent" and "important", or why certain contacts should never be automatically categorised.

What Actually Works

After months of iteration, Azhar's agent now handles tasks that would otherwise consume hours each week. CRM updates that previously required manual review now happen automatically. Notes get organised and tagged without human intervention. Tasks get prioritised based on learned patterns.

But here's the interesting bit: the agent doesn't replace human judgement. It takes over the tedium. Tasks that were too boring to do consistently but too important to skip entirely now happen reliably.

Azhar describes this as moving the boundary of tedium. Not eliminating work, but pushing the threshold at which human attention becomes necessary. The agent handles the repetitive context-switching. Azhar focuses on decisions that actually require thinking.
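The "threshold at which human attention becomes necessary" can be sketched as a simple routing rule: the agent acts on its own only when an action is both easy to undo and above a confidence cut-off, and hands everything else to a person. This is an illustrative assumption, not a description of Azhar's system. The `Proposal` fields, the `route` function, and the 0.8 default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    action: str        # what the agent wants to do
    confidence: float  # agent's self-reported confidence, 0.0 to 1.0 (assumed)
    reversible: bool   # whether the action can be undone cheaply


def route(proposal: Proposal, threshold: float = 0.8) -> str:
    """Return "auto" if the agent may act alone, otherwise "human"."""
    if proposal.reversible and proposal.confidence >= threshold:
        return "auto"
    return "human"


tag_note = Proposal("tag meeting note as 'hiring'", confidence=0.93, reversible=True)
send_mail = Proposal("email a follow-up to a new lead", confidence=0.95, reversible=False)

decisions = [route(tag_note), route(send_mail)]
```

Under these assumptions, "moving the boundary of tedium" corresponds to lowering the threshold, or widening the set of actions treated as reversible, as the agent earns trust.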

The Learning Curve Is Real

One hundred and seventy-nine failures. Let that sink in. Each one required diagnosis, adjustment, and re-deployment. This isn't plug-and-play automation. It's gradual system training where you teach the agent your specific workflows, preferences, and edge cases.

For business owners considering AI agents, this is the reality check. You're not buying a solution - you're starting a training process. The system will make mistakes. You'll need patience and the technical capability to adjust parameters, refine prompts, and handle edge cases.

But the payoff compounds. Each correction makes the system more reliable. After months of training, Azhar's agent now handles tasks autonomously that initially required constant supervision. The 179 failures represent the learning curve - expensive upfront, valuable long-term.

Why This Matters

Most writing about AI agents exists in the future tense. Someday agents will do this. Eventually they'll handle that. Azhar's documentation exists in the present tense. His agent is running right now. Making mistakes. Learning from them. Gradually becoming useful.

This is what adoption looks like for complex technology. Not dramatic transformation, but gradual capability building. Not perfect automation, but incremental reduction of tedious work.

For anyone building or considering agent systems, Azhar's experience offers a realistic benchmark. Expect failures. Budget time for training. Focus on narrow, well-defined tasks before attempting broad automation. And document everything - the failures teach you as much as the successes.

The boundary of tedium has moved. Not because AI suddenly became perfect, but because someone put in the patient work of training a system to handle specific tasks reliably. That's the real story of AI agents in 2025.

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.

© 2026 MEM Digital Ltd t/a Marbl Codes