Agentic Healing in Production 🩺

Jack McNicol • Founding Engineer @ SuperIT

A small team building our Sparky agent on FastAPI and React.

The dream

Autonomously fixing issues at scale.

Flow state

Shape the codebase so the agent can move through it smoothly.

What is the new state of the world?

Agentic Health Check

Read the sessions

Where did it go off track — then find its way back?

Show me recommendations to help the agent reach its goal faster. No → human review is better.

Back pressure

Belt and suspenders

PostTool checks, type checks, property tests, e2e tests, failure-state tests. Not coverage.
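
A minimal sketch of what property and failure-state checks can look like, using only the standard library. The `clamp_retry_seconds` helper and its bounds are hypothetical, invented for illustration; in a real suite a property-testing library such as Hypothesis would usually generate the inputs.

```python
import random


def clamp_retry_seconds(raw: str, maximum: int = 300) -> int:
    """Hypothetical helper: parse a retry delay and keep it within 0..maximum."""
    try:
        value = int(raw)
    except ValueError:
        # Failure state: unparseable input falls back to the safe maximum.
        return maximum
    return max(0, min(value, maximum))


def check_property(trials: int = 1000, seed: int = 0) -> None:
    """Property: for any integer input, the result stays within bounds."""
    rng = random.Random(seed)
    for _ in range(trials):
        raw = str(rng.randint(-10**6, 10**6))
        assert 0 <= clamp_retry_seconds(raw) <= 300


def check_failure_states() -> None:
    """Failure states: malformed input must never raise or escape the bounds."""
    for bad in ["", "abc", "1.5", "NaN"]:
        assert clamp_retry_seconds(bad) == 300


if __name__ == "__main__":
    check_property()
    check_failure_states()
    print("back-pressure checks passed")
```

The point of the sketch is that both checks assert behaviour, including the unhappy path, rather than counting which lines were executed.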

Warnings = errors

A noisy build teaches the agent that warnings don’t matter.

Make every warning fail the build — a clean signal for the agent, and for you.
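
One way to get this behaviour in a Python codebase is to promote warnings to exceptions. The sketch below assumes Python and uses the standard `warnings` module; `legacy_helper` and `run_strict` are hypothetical names for illustration.

```python
import warnings


def legacy_helper() -> int:
    # Hypothetical function that still works but emits a deprecation warning.
    warnings.warn("legacy_helper is deprecated", DeprecationWarning)
    return 42


def run_strict(func):
    """Call func with every warning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func()


if __name__ == "__main__":
    try:
        run_strict(legacy_helper)
    except DeprecationWarning as exc:
        print(f"build failed: {exc}")
```

The same effect is available without code changes by running `python -W error` or passing `-W error` to pytest.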

How we use it

Schedule → PR: code debt, wiki debt, architecture audit, test coverage

Backwards?

Events, Schedules, Humans → Agents

Managing change.

Agents In the Field
Event · Recurring · Human → Review pipeline → Merge queue

Events

Sentry, Logfire, Linear bug, Slack message

Workflows manage context, investigate, and build the root cause analysis (RCA), then take the RCA to a solution and to proof.

Event workflow

The agent’s hypothesis is only as good as the telemetry it has.

Triage → Discovery → RCA → Solution → Proof
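
The stages above can be sketched as an ordered pipeline that stops as soon as a stage has nothing to go on, which is what happens when telemetry is missing. This is an illustrative sketch, not the Sparky implementation: `Incident`, `run_workflow`, and the handler signature are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

STAGES = ["triage", "discovery", "rca", "solution", "proof"]


@dataclass
class Incident:
    source: str   # e.g. an error tracker or a chat message
    summary: str
    notes: dict = field(default_factory=dict)  # stage name -> stage output


# A stage handler returns its output, or None when it lacks evidence to proceed.
Stage = Callable[[Incident], Optional[str]]


def run_workflow(incident: Incident, handlers: dict) -> str:
    """Run the stages in order; return the last completed stage, or 'none'."""
    completed = "none"
    for name in STAGES:
        result = handlers[name](incident)
        if result is None:
            break  # no evidence: stop rather than guess
        incident.notes[name] = result
        completed = name
    return completed
```

A run that stalls at `discovery` is itself a useful signal: the fix is usually more telemetry, not a smarter prompt.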

Recurring

Cron-actions. Lowest urgency, widest blast radius.

Dependabot updates, security scans, picking up bug tickets. Schedules make token costs plannable.
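
Because scheduled jobs run a known number of times, their token spend can be estimated up front. A minimal sketch of that arithmetic follows; the job names, run counts, and per-run token estimates are made-up placeholders, not measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledJob:
    name: str
    runs_per_week: int
    est_tokens_per_run: int  # assumption: a rough estimate from past runs


def weekly_plan(jobs) -> dict:
    """Estimated weekly token spend per job."""
    return {job.name: job.runs_per_week * job.est_tokens_per_run for job in jobs}


def fits_budget(jobs, weekly_budget: int) -> bool:
    """True when the estimated weekly total stays within the budget."""
    return sum(weekly_plan(jobs).values()) <= weekly_budget


# Placeholder schedule for illustration only.
JOBS = [
    ScheduledJob("dependency-updates", runs_per_week=7, est_tokens_per_run=40_000),
    ScheduledJob("security-scan", runs_per_week=1, est_tokens_per_run=150_000),
    ScheduledJob("bug-ticket-pickup", runs_per_week=5, est_tokens_per_run=200_000),
]

if __name__ == "__main__":
    for name, tokens in weekly_plan(JOBS).items():
        print(f"{name}: {tokens:,} tokens/week")
```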

The developer

Product growth happens here, with less bug fixing.

Review with skills

A skill is a reusable procedure: input → checklist → output.

Cheap deterministic rules first: ast-grep or custom checks. Once a skill is trusted, add it to the scheduler.
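
A skill in the input → checklist → output sense can be sketched as a list of deterministic rules run over source code. This sketch swaps in Python's standard `ast` module in place of ast-grep so it stays self-contained; the `Skill` class and both rules are hypothetical examples.

```python
import ast
from dataclasses import dataclass


def no_bare_except(tree: ast.AST) -> list:
    """Flag `except:` handlers that swallow every exception."""
    return [
        f"line {node.lineno}: bare except"
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


def no_print_calls(tree: ast.AST) -> list:
    """Flag direct calls to print()."""
    return [
        f"line {node.lineno}: print call"
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    ]


@dataclass
class Skill:
    name: str
    checklist: list  # deterministic rules: each maps a parsed tree to findings

    def run(self, source: str) -> list:
        """Input: source text. Output: findings from every rule, in checklist order."""
        tree = ast.parse(source)
        findings = []
        for rule in self.checklist:
            findings.extend(rule(tree))
        return findings


SAMPLE = "try:\n    risky()\nexcept:\n    print('oops')\n"

if __name__ == "__main__":
    review = Skill("python-hygiene", [no_bare_except, no_print_calls])
    for finding in review.run(SAMPLE):
        print(finding)
```

Rules like these cost no tokens and give the same answer every time, which is what makes a skill safe to hand to a scheduler later.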

The PR is the guardrail

Let agents run wild behind it. Human review? Merge Queues.

Greptile · Cursor BugBot · Claude Code Review. Stack reviewers, don’t replace them. Tag a human when review is required.
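
Stacking reviewers amounts to requiring every verdict, plus a human approval where one is tagged as required. A minimal sketch of that gate, under the assumption that each reviewer reports a simple approve or reject; the `Review` type and reviewer labels are illustrative, and no real reviewer API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer: str          # a label only, e.g. "greptile" or "alice"
    approved: bool
    is_human: bool = False


def ready_for_merge_queue(reviews, human_required: bool) -> bool:
    """Every stacked reviewer must approve; a human must too when required."""
    if not reviews or not all(r.approved for r in reviews):
        return False
    if human_required:
        return any(r.is_human for r in reviews)
    return True
```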

Heal on trigger

Production event fires → agent investigates → posts findings in 1–2 minutes.

Logfire / PostHog webhook → parallel searcher sub-agents → report in Slack.
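
The fan-out step can be sketched with `asyncio.gather`, which runs the searchers concurrently and collects their results into one report. The three searcher functions below are stand-ins for sub-agents and return canned strings; the webhook receiver and the Slack post are left out, and all names are assumptions.

```python
import asyncio


async def search_logs(event: dict) -> str:
    await asyncio.sleep(0)  # stand-in for a real sub-agent call
    return f"log lines matching {event['error']}"


async def search_deploys(event: dict) -> str:
    await asyncio.sleep(0)
    return "most recent deploy reviewed"


async def search_tickets(event: dict) -> str:
    await asyncio.sleep(0)
    return "no related tickets found"


async def investigate(event: dict, searchers) -> str:
    """Run all searchers concurrently and fold the results into one report."""
    results = await asyncio.gather(
        *(searcher(event) for searcher in searchers),
        return_exceptions=True,  # one failed searcher must not sink the report
    )
    lines = [f"Findings for {event['error']}:"]
    for searcher, result in zip(searchers, results):
        if isinstance(result, Exception):
            lines.append(f"- {searcher.__name__}: failed ({result})")
        else:
            lines.append(f"- {searcher.__name__}: {result}")
    return "\n".join(lines)


if __name__ == "__main__":
    event = {"error": "TimeoutError"}
    print(asyncio.run(investigate(event, [search_logs, search_deploys, search_tickets])))
```

Running the searchers in parallel is what keeps the turnaround short: total latency is roughly that of the slowest searcher rather than the sum of all of them.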

How we use it

Events → Triage → Discovery → RCA → Solution → Proof
Pull Request → Agent Review (3rd party) → Security Review → Test suite → Merge Agent (1st party) → Merge queue

It's Fixed!

Said Claude...

Reality: it changed some stuff.

Agentic Test

Make it prove its work

Validating the agent’s work is the bottleneck. Compress it.

“Have the agent prove its work in the format that’s fastest for you to validate.” - A love letter to Pi | Lucas Meijer

Are you ready for the mythos wave?

Time to exploit is now measured in hours.

Agent Learnings

The trap

Agents multiply what you already have — foundations or debt.

Skip the foundation and you get automation that confidently makes things worse.

What’s the point?

Stop doing the work AI is now better at — so you can do the work it can’t.

It frees us to move on to the important parts. Where should we, as engineers, focus?

Thanks 🩺

@jackmcpickle

agentic-healing.mcpickle.com.au