No serious site lets people work at height without safety gear. There's a harness that catches a fall before it happens, a breaker that trips when a circuit pulls too much, and — when something's built wrong — the ability to rip it out and redo rather than live with it.
An autonomous agent (lesson 2.6) needs the same three things, or "autonomous" just means "unsupervised damage." Safe execution is the gear: catch bad actions before they run (error handling), try again sensibly when something's flaky (retries), and undo cleanly when a change is wrong (rollbacks) — all inside limits that stop a runaway.
The idea, in plain English
Three official sub-skills:
- Error handling — detect and stop/redirect bad actions and failures.
- Retries — re-attempt transient failures sensibly.
- Rollbacks — undo changes that shouldn't stand.
Plus the execution limits that bound a run so it can't loop forever. The main mechanism for catching things is hooks — scripts that fire at key moments and can block a tool before it runs.
Error handling — catch it before it runs
Hooks are the agent's tripwires: scripts that run automatically at key moments in a session (config = JSON with "version": 1 in .github/hooks/*.json or ~/.copilot/hooks/). The key ones for safety:
- preToolUse fires before each tool call and returns a permissionDecision: allow / deny / ask (in the cloud agent, 'ask' = 'deny'). Denying needs a permissionDecisionReason; you can even rewrite the call with modifiedArgs. Any deny blocks the tool.
- permissionRequest can allow/deny before the permission dialog; with interrupt: true a deny stops the whole agent.
- Exit codes (command hooks): 0 = success, 2 = deny (for permissionRequest), other non-zero = logged failure.
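The allow/deny decision a preToolUse hook makes can be sketched as a small Python function. This is a minimal sketch, not the documented hook interface: the input shape (toolName, toolArgs), the BLOCKED_PATTERNS list, and both function names are assumptions made for illustration. Only permissionDecision and permissionDecisionReason come from the lesson. The second helper imitates an agent that logs a crashing hook and carries on, which is why a hook is a guardrail rather than a lock.

```python
import json
import sys

# Hypothetical deny-list; a real policy would be project-specific.
BLOCKED_PATTERNS = ("rm -rf", "git push --force")


def decide(tool_call: dict) -> dict:
    """Return a permission decision for one tool call.

    The input shape ({"toolName": ..., "toolArgs": ...}) is an assumption
    made for this sketch, not the documented hook payload.
    """
    # Serialise the arguments so nested values are searched too.
    args_text = json.dumps(tool_call.get("toolArgs", ""))
    for pattern in BLOCKED_PATTERNS:
        if pattern in args_text:
            return {
                "permissionDecision": "deny",
                "permissionDecisionReason": f"blocked pattern: {pattern}",
            }
    return {"permissionDecision": "allow"}


def run_fail_open(hook, tool_call: dict) -> dict:
    """Simulate fail-open behaviour: a crashing hook is logged, not obeyed."""
    try:
        return hook(tool_call)
    except Exception as exc:  # any hook error counts as "hook failed"
        print(f"hook failed (logged, not blocking): {exc}", file=sys.stderr)
        return {"permissionDecision": "allow"}


if __name__ == "__main__":
    sample = {"toolName": "shell", "toolArgs": {"command": "rm -rf build"}}
    print(json.dumps(run_fail_open(decide, sample)))
```

The point of run_fail_open is the except branch: a bug in the guard silently turns a would-be deny into an allow, so the hook should be kept simple and backed by least-privilege and review.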
"Hook failures are logged but never stop the agent." A hook is a guardrail, not a hard security boundary — if it errors, the agent keeps going. (Rhymes with 2.4's "allowlist isn't tamper-proof": agent safety controls are guardrails, not locks.) In Actions, error handling also uses status functions like if: failure() to run a recovery step only when something failed.
Retries + rollbacks
Retries — try again, sensibly:
- continueOnAutoMode: if the agent is rate-limited, it auto-switches to auto mode and retries (default off; doesn't cover global rate limits or BYOK).
- In Actions, retry is a workflow pattern: re-run a flaky step / route on if: failure().
- Principle (from 1.4/1.5): retry transient failures; escalate after repeated failure rather than looping (escalation = lesson 2.8).
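The "retry transient failures, then escalate" principle can be sketched in Python as a bounded retry with exponential backoff. Everything here is hypothetical and for illustration only (TransientError, EscalationNeeded, retry, and the default values); it is not how continueOnAutoMode or Actions implement retries.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a rate limit)."""


class EscalationNeeded(Exception):
    """Raised when retries are exhausted and a human should take over."""


def retry(action, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Run action(), retrying only TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TransientError as exc:
            if attempt == max_attempts:
                # Stop looping and hand the problem upward.
                raise EscalationNeeded(
                    f"gave up after {attempt} attempts"
                ) from exc
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Any exception other than TransientError propagates immediately, which keeps the retry loop from masking real bugs.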
Rollbacks — undo cleanly:
- /rewind (undo last turn): undoes the last agent turn, including any file edits. The local "ctrl-Z" for an agent step.
- Because the agent works on a branch → PR (2.5/2.6), the bigger rollback is simply not merging the PR (or reverting it if it was merged); until a merge, the base branch is never touched.
- The cloud agent's filesystem is temporary (deleted after the job), limiting how far a bad change persists — use an http hook to send out anything you want to keep.
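The "undo after the fact" idea can be illustrated with a small Python sketch that snapshots a file and restores it when a change turns out to be wrong. This is only an analogy for what /rewind does to a turn's file edits; rollback_on_error is a hypothetical helper, not part of any Copilot tooling.

```python
import contextlib
import pathlib
import tempfile


@contextlib.contextmanager
def rollback_on_error(path: pathlib.Path):
    """Snapshot a file and restore it if the wrapped block raises."""
    existed = path.exists()
    snapshot = path.read_bytes() if existed else None
    try:
        yield
    except Exception:
        if existed:
            path.write_bytes(snapshot)      # put the old content back
        else:
            path.unlink(missing_ok=True)    # the file is new: remove it
        raise


if __name__ == "__main__":
    target = pathlib.Path(tempfile.mkdtemp()) / "config.txt"
    target.write_text("good")
    try:
        with rollback_on_error(target):
            target.write_text("bad")
            raise ValueError("validation failed")
    except ValueError:
        pass
    print(target.read_text())
```

The exception is re-raised after the restore, so the caller still learns that the change failed and can decide whether to retry or escalate.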
Execution limits — bound the run
| Control | What it bounds |
|---|---|
| --max-autopilot-continues | how many times the agent auto-continues (default varies by CLI version; recent versions cap it, so set it explicitly to be sure) |
| subagent nesting depth limit | how deep agent-spawns-agent can go |
| hook timeoutSec (default 30s) | how long a hook may run |
| temporary cloud filesystem | how long a change persists |
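How a continue cap and a time limit bound a run together can be sketched in Python. This is a toy model under stated assumptions: run_bounded, its parameters, and its status strings are hypothetical and do not describe how the CLI enforces --max-autopilot-continues or timeoutSec.

```python
import time


def run_bounded(step, max_continues=5, timeout_sec=30.0, clock=time.monotonic):
    """Call step() until it reports done, the continue cap is hit, or time runs out.

    The first call is the initial run; each further call is one auto-continue.
    Returns (status, continues_used).
    """
    deadline = clock() + timeout_sec
    for used in range(max_continues + 1):
        if clock() >= deadline:
            return "timeout", used
        if step():  # step() returns True when the work is finished
            return "done", used
    return "max_continues", max_continues
```

With both limits set explicitly, a step that never finishes ends the run with "max_continues" or "timeout" instead of looping forever.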
// .github/hooks/guard.json
{ "version": 1,
"preToolUse": [
{ "matcher": "shell",
"command": "grep -q 'rm -rf' <<< \"$TOOL_ARGS\" && echo '{\"permissionDecision\":\"deny\",\"permissionDecisionReason\":\"blocked rm -rf\"}' || true",
"timeoutSec": 10 } ] }
error handling = stop it before it happens (preToolUse deny) · retries = try transient failures again (continueOnAutoMode) · rollbacks = undo after the fact (/rewind, revert PR) · limits = stop runaways (max-continues, timeout). But the deny only holds if the hook itself runs — fail-open means a broken hook won't protect you.
The cert-language version
Safe execution combines error handling (hooks like preToolUse that allow/deny/modify a tool call before it runs, but fail-open), retries (e.g. continueOnAutoMode, Actions if: failure()), rollbacks (/rewind the last turn, or revert/don't-merge the PR), and execution limits (--max-autopilot-continues, subagent depth, hook timeout). These are guardrails layered on top of least-privilege and human review, not hard locks.
Our summary · grounded in GitHub Docs (Copilot hooks reference, CLI command reference, Actions expressions) + MS Learn — Agent tooling… · fetched 2026-05-30
Common confusions (read these or lose points)
- "A preToolUse deny hook is a hard security boundary." No: hooks are fail-open; if the hook errors, the agent continues. Guardrail, not lock.
- "In the cloud agent, ask pauses for approval." No: in the cloud agent 'ask' = 'deny' (no human to ask).
- "Don't worry about the autopilot continuation default." Don't rely on it: the --max-autopilot-continues default has changed across CLI versions; set it explicitly to bound a run.
- "Rollback means restoring production." Here it's usually /rewind (undo last turn) or not merging / reverting the PR; the base branch was never directly changed.
- "Cloud hooks work like local hooks." No: cloud loads only .github/hooks/*.json, bash only, on a temporary filesystem.