The agent ran a workflow step you didn't authorize it to run. Or it made a decision based on context you didn't realize it had. Or it did exactly what you asked it to do, but what you asked was wrong, and the agent didn't catch that.
You open the incident post-mortem template. You fill it out the way you always do. And at the end, you have a timeline with a hole in the middle where the failure actually lives.
Your post-mortem process was built for a different kind of problem.
Standard incident review asks a clean set of questions. What changed in the code? What was the exact input? What was the exact output?
These questions produce clear answers when a system executes instructions. You changed line 47. The input was 42. The output should have been 84 but came back as 43. Now you know what to fix.
An agent incident doesn't work that way.
An agent doesn't execute. It reasons. It looks at context, makes a decision about what to do, and then acts. The decision is where the failure lives.
But your template has no place for a decision. It has input and output. It has code changes. It has no slot for "what did the system choose to do, and why did it choose that instead of something else?"
This is not a minor template gap. This is the difference between filing an incident and learning from it.
Three things a standard post-mortem will not surface.
First: what decision the agent made, and what it was optimizing for.
An agent always picks between options. It picks based on what you told it to optimize for. If you said "move the workflow forward," it may move it forward in ways you didn't intend.
If you said "minimize errors," it may refuse to act at all. You need to know what goal was baked in when the agent decided to do what it did. Your template doesn't ask this.
Second: what context was in the agent's window when it made that decision.
An agent is only as good as what it can see. If it had access to a field it shouldn't have seen, or if it missed a field it should have used, the decision makes sense from the agent's perspective but fails from yours.
Many teams don't log what context an agent had at decision time. If yours doesn't, you can't answer this question when you review the incident. You end up guessing about what the agent knew.
Third: what authority it was operating under when it acted.
An agent works within permission boundaries. It has access to certain databases, certain approval chains, certain user data. If it did something you didn't expect, the first question is whether it was authorized to do it at all.
If it was, then you have a governance design problem. If it wasn't, you have an access control problem. These are different failures with different fixes. Your post-mortem template doesn't distinguish between them.
Without those three things on the record, the incident review produces a timeline that looks complete but isn't. You know when the failure happened. You don't know why.
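One way to put those three things on the record is to write them down at the moment the agent decides. The sketch below is a minimal illustration, not a standard schema: the `DecisionRecord` class, its field names, and the `to_log_line` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class DecisionRecord:
    """One agent decision, captured when it was made (hypothetical schema)."""
    action: str                      # what the agent chose to do
    alternatives: tuple[str, ...]    # options it considered and rejected
    objective: str                   # the goal it was optimizing for
    context_keys: tuple[str, ...]    # names of the fields in its window
    granted_scopes: tuple[str, ...]  # permissions it held when it acted
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: DecisionRecord) -> str:
    """Serialize the record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


# Example values are invented for illustration.
record = DecisionRecord(
    action="close_ticket",
    alternatives=("escalate_to_human", "request_more_info"),
    objective="move the workflow forward",
    context_keys=("ticket_id", "customer_tier", "last_message"),
    granted_scopes=("tickets:write",),
)
print(to_log_line(record))
```

With a record like this in the log, the hole in the middle of the timeline has something in it: the decision, the goal behind it, the context, and the authority.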
Stop using the standard template. Ask these four questions instead.
What decision did the agent make, and what was it trying to optimize for?
Not "what did it output?" but "what was it deciding between, and why did it pick this option?" Push until you have an answer that makes sense from the agent's reasoning, even if it makes no sense from your business's point of view.
What context was in its window at that moment?
What did it know about the workflow state, the user, the data, the history? What didn't it know that would have changed the decision? Log the context at the exact moment of decision. Don't infer it later.
This is the difference between "the agent made a reasonable call with bad data" and "the agent ignored data we gave it."
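If the context was logged at decision time, telling those two cases apart can be a set comparison. The sketch below assumes you can list the field names the agent saw, the fields it should have had, and the fields it should never have had. The `audit_context` function and the example field names are hypothetical.

```python
def audit_context(
    visible: set[str], required: set[str], forbidden: set[str]
) -> dict[str, list[str]]:
    """Compare what the agent saw against what it should and shouldn't have seen."""
    return {
        # Fields the decision needed but the agent never had.
        "missing": sorted(required - visible),
        # Fields the agent could see but shouldn't have.
        "leaked": sorted(forbidden & visible),
    }


# Example field names are invented for illustration.
result = audit_context(
    visible={"order_id", "customer_ssn"},
    required={"order_id", "refund_policy"},
    forbidden={"customer_ssn"},
)
print(result)  # {'missing': ['refund_policy'], 'leaked': ['customer_ssn']}
```

If both lists come back empty, the context was what you intended, and the review moves on to how the agent used it.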
Was it authorized to take that action?
Was it permitted to reach the system it accessed, read the user data it used, skip the approval chain it bypassed? If yes, the agent did exactly what you set it up to do. If no, you have a permissions leak. These are not the same problem.
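The yes-or-no split can be written down as a single check in the review. The sketch below assumes permissions are modeled as simple scope strings and that you know which scopes your written policy granted the agent. The `Finding` enum, the `classify_incident` function, and the scope names are hypothetical.

```python
from enum import Enum


class Finding(Enum):
    GOVERNANCE_DESIGN = "governance design problem"
    ACCESS_CONTROL = "access control problem"


def classify_incident(required_scope: str, granted_scopes: set[str]) -> Finding:
    """Classify an unexpected action the agent actually completed.

    granted_scopes is what your written policy gave the agent,
    not what the system happened to let through.
    """
    if required_scope in granted_scopes:
        # The agent was allowed to do this. The permission design is the issue.
        return Finding.GOVERNANCE_DESIGN
    # The action went through without a grant. Something leaked.
    return Finding.ACCESS_CONTROL


# Example scopes are invented for illustration.
print(classify_incident("refunds:issue", {"refunds:issue", "tickets:write"}).value)
print(classify_incident("refunds:issue", {"tickets:write"}).value)
```

The first call prints the governance finding and the second prints the access control finding. Same action, different fix.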
Could the operation have caught this class of error before it landed on live work?
Not "could we have prevented this exact incident" but "could we have built a wall that stops this whole category of mistake?" An agent that optimizes in the wrong direction will always find the wrong answer.
An agent with too much context will always see things it shouldn't. An agent with unclear authority will always drift into gray areas. The fix is not to review this one failure. It is to design the operation so the next failure hits a wall.
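A wall, in practice, is a check that runs before the action lands, not after. The sketch below is one possible shape under simple assumptions: permissions are scope strings, and actions above a size threshold need a human. `ProposedAction`, `BlockedAction`, `guard`, and the threshold are all hypothetical names for this example.

```python
from dataclasses import dataclass


class BlockedAction(Exception):
    """Raised when a proposed action is stopped before it touches live work."""


@dataclass(frozen=True)
class ProposedAction:
    name: str
    scope: str           # permission the action requires
    amount: float = 0.0  # size of the action, e.g. a refund value


def guard(
    action: ProposedAction, granted_scopes: set[str], approval_threshold: float
) -> ProposedAction:
    """Return the action if it may proceed, otherwise raise BlockedAction."""
    if action.scope not in granted_scopes:
        raise BlockedAction(f"{action.name}: scope {action.scope!r} not granted")
    if action.amount > approval_threshold:
        raise BlockedAction(f"{action.name}: amount needs human approval")
    return action


# Example values are invented for illustration.
granted = {"refunds:issue"}
guard(ProposedAction("refund", "refunds:issue", 20.0), granted, 100.0)  # passes
try:
    guard(ProposedAction("refund", "refunds:issue", 500.0), granted, 100.0)
except BlockedAction as err:
    print(f"blocked: {err}")
```

The point of the design is that the rule covers a category. Every oversized refund hits the same wall, not just the one that caused the incident.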
These four questions are not abstract. They are the review. They are also the only way you actually learn whether the agent will do the same thing tomorrow.
Agents are moving from pilot projects into production. The first wave of real incidents is landing.
Teams are discovering that the incident review process, which works fine for software that executes instructions, produces useless output when applied to a system that made a decision. You can't just look at the code and the input and the output and know what went wrong. The decision is invisible unless you specifically look for it.
A post-mortem on an agent incident is not a code review. It is a decision review.
Until you treat it that way, you are not learning from the failure. You are filing it.
The next time an agent does something unexpected on live work, don't reach for the standard template. Open a fresh document. Ask what decision it made. Ask what context it had. Ask what authority it was using. Ask what wall you need to build. Then you will have a post-mortem that actually means something.
The work of running agents in production includes the work of reviewing failures differently. That is operational work, not theoretical work. Acrein Group designs and runs the incident review process inside its own portfolio companies so that agent decisions surface instead of disappearing into the timeline. If you are building an operation where agents touch live work and you need that process to be survivable, that is the foundation we build it on.