Lesson 2.7 · Safe execution: error handling, retries, rollbacks

Story

No serious site lets people work at height without safety gear. There's a harness that arrests a fall before anyone gets hurt, a breaker that trips when a circuit draws too much current, and, when something is built wrong, the ability to rip it out and redo it rather than live with it.

An autonomous agent (lesson 2.6) needs the same three things, or "autonomous" just means "unsupervised damage." Safe execution is the gear: catch bad actions before they run (error handling), try again sensibly when something's flaky (retries), and undo cleanly when a change is wrong (rollbacks) — all inside limits that stop a runaway.

The idea, in plain English

Three official sub-skills:

Error handling — detect and stop/redirect bad actions and failures.
Retries — re-attempt transient failures sensibly.
Rollbacks — undo changes that shouldn't stand.

Plus the execution limits that bound a run so it can't loop forever. The main mechanism for catching things is hooks — scripts that fire at key moments and can block a tool before it runs.

Error handling — catch it before it runs

Hooks are the agent's tripwires: scripts that run automatically at defined points in a session. They are configured in JSON files with "version": 1, placed in .github/hooks/*.json or ~/.copilot/hooks/. The key ones for safety:

preToolUse fires before each tool call and returns a permissionDecision: allow / deny / ask (in the cloud agent, 'ask' = 'deny'). Denying needs a permissionDecisionReason; you can even rewrite the call with modifiedArgs. Any deny blocks the tool.
permissionRequest can allow/deny before the permission dialog; with interrupt: true a deny stops the whole agent.
Exit codes (command hooks): 0 = success, 2 = deny (for permissionRequest), other non-zero = logged failure.
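
The decision rules in the list above can be modeled in a few lines. This is a sketch of the semantics as this lesson describes them, not Copilot's implementation: the function name `resolve` and the `cloud` flag are hypothetical, and two details are assumptions (an `ask` outranks an `allow`, and no decisions at all means `allow`).

```python
from typing import Iterable

VALID = {"allow", "deny", "ask"}


def resolve(decisions: Iterable[str], cloud: bool = False) -> str:
    """Combine permissionDecision values returned by several preToolUse hooks.

    Rules taken from the lesson text: any deny blocks the tool, and in the
    cloud agent 'ask' is treated as 'deny'. Everything else is an assumption.
    """
    result = "allow"  # assumption: with no objections the call proceeds
    for decision in decisions:
        if decision not in VALID:
            raise ValueError(f"unknown permissionDecision: {decision!r}")
        if decision == "ask" and cloud:
            decision = "deny"  # no human is available to answer in the cloud agent
        if decision == "deny":
            return "deny"  # a single deny is enough to block the call
        if decision == "ask":
            result = "ask"  # assumption: a request for approval outranks allow
    return result
```

For example, `resolve(["allow", "ask"])` yields `"ask"` locally, while the same input with `cloud=True` yields `"deny"`.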

The crucial nuance: hooks are FAIL-OPEN

"Hook failures are logged but never stop the agent." A hook is a guardrail, not a hard security boundary: if it errors, the agent keeps going. (Rhymes with 2.4's "allowlist isn't tamper-proof": agent safety controls are guardrails, not locks.) In Actions, error handling also uses status check functions such as failure(): a step with if: failure() runs only when an earlier step in the job has failed, which makes it a natural place for recovery.

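To make fail-open concrete, here is a toy model of a hook runner. It is an illustration of the behavior quoted above, not the agent's real code: `tool_may_run`, `deny_rm`, and `broken_deny_rm` are hypothetical names, and real hooks are external scripts rather than Python callables.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger("hooks")

# A hook takes the shell command and returns "allow" or "deny".
Hook = Callable[[str], str]


def tool_may_run(hooks: Iterable[Hook], command: str) -> bool:
    """Return True if the tool call may proceed under fail-open semantics."""
    for hook in hooks:
        try:
            decision = hook(command)
        except Exception as exc:
            # Fail-open: the failure is logged and then ignored.
            log.warning("hook failed, continuing: %s", exc)
            continue
        if decision == "deny":
            return False
    return True


def deny_rm(command: str) -> str:
    """A working guard that denies anything containing 'rm -rf'."""
    return "deny" if "rm -rf" in command else "allow"


def broken_deny_rm(command: str) -> str:
    """The same guard, but crashing before it can answer."""
    raise RuntimeError("hook script crashed")
```

With the working guard, `tool_may_run([deny_rm], "rm -rf build")` is `False`. With only the broken guard it is `True`: the crash is logged and the dangerous command goes ahead, which is why a hook should not be the only layer of protection.
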
Retries + rollbacks

Retries — try again, sensibly:

continueOnAutoMode — if the agent is rate-limited, it auto-switches to auto mode and retries (default off; doesn't cover global rate limits or BYOK).
In Actions, retry is a workflow pattern — re-run a flaky step / route on if: failure().
Principle (from 1.4/1.5): retry transient failures; escalate after repeated failure rather than looping (escalation = lesson 2.8).
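
The principle of retrying transient failures and then escalating can be sketched as a small helper. This is a generic pattern, not a Copilot or Actions API: `TransientError`, `EscalationNeeded`, and `retry` are hypothetical names, and the exponential backoff schedule is one reasonable choice among many.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


class EscalationNeeded(Exception):
    """Raised when retries are exhausted and a human should take over."""


def retry(op: Callable[[], T], max_attempts: int = 3, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run op(), retrying only TransientError, with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except TransientError as exc:
            if attempt == max_attempts:
                # Stop looping and hand the problem to someone else.
                raise EscalationNeeded(f"gave up after {attempt} attempts") from exc
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Any exception other than `TransientError` propagates immediately, so a genuine bug is never retried. Passing `sleep` as a parameter keeps the helper easy to test without real delays.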

Rollbacks — undo cleanly:

/rewind undoes the last agent turn, including any file edits. It is the local "ctrl-Z" for an agent step.
Because the agent works on a branch → PR (2.5/2.6), the bigger rollback is simply not merging the PR, which leaves the base branch untouched. If the PR has already been merged, revert it.
The cloud agent's filesystem is temporary (deleted after the job), which limits how long a bad change persists. Use an http hook to send out anything you want to keep.
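
As a mental model for turn-level undo, the sketch below snapshots an in-memory set of files at the start of each turn and restores the snapshot on rewind. It is purely conceptual and says nothing about how /rewind is actually implemented; the `Workspace` class and its methods are hypothetical.

```python
from typing import Dict, List


class Workspace:
    """In-memory stand-in for a set of files, with turn-level undo."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}
        self._snapshots: List[Dict[str, str]] = []

    def begin_turn(self) -> None:
        """Record the state of every file before the agent starts a turn."""
        self._snapshots.append(dict(self.files))

    def write(self, path: str, content: str) -> None:
        """Create or overwrite a file during the current turn."""
        self.files[path] = content

    def rewind(self) -> bool:
        """Restore the state from before the last turn; False if nothing to undo."""
        if not self._snapshots:
            return False
        self.files = self._snapshots.pop()
        return True
```

A rewind removes files the turn created as well as reverting files it changed, because the whole snapshot is restored rather than individual edits.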

Execution limits — bound the run

Control	What it bounds
`--max-autopilot-continues`	how many times the agent auto-continues (default varies by CLI version — recent versions cap it; set it explicitly to be sure)
subagent nesting depth limit	how deep agent-spawns-agent can go
hook `timeoutSec` (default 30s)	how long a hook may run
temporary cloud filesystem	how long a change persists
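
The idea behind a continue cap can be shown with a bounded loop. This sketch only illustrates the concept of a budget on automatic continuation; `run_bounded` and `step` are hypothetical, and the real flag's exact counting rules are not described in this lesson.

```python
from typing import Callable, Tuple


def run_bounded(step: Callable[[], bool], max_continues: int) -> Tuple[bool, int]:
    """Call step() until it reports completion or the continue budget is spent.

    step() returns True when the task is done. Returns (finished, continues_used).
    """
    if max_continues < 0:
        raise ValueError("max_continues must not be negative")
    if step():  # assumption: the initial turn does not count as a continue
        return True, 0
    for used in range(1, max_continues + 1):
        if step():
            return True, used
    return False, max_continues  # budget spent: stop instead of looping forever
```

A step that never finishes returns `(False, max_continues)` after a fixed number of calls, which is the whole point of setting the limit explicitly.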

Worked example — block a dangerous command before it runs

// .github/hooks/guard.json
{
  "version": 1,
  "preToolUse": [
    {
      "matcher": "shell",
      "command": "grep -q 'rm -rf' <<< \"$TOOL_ARGS\" && echo '{\"permissionDecision\":\"deny\",\"permissionDecisionReason\":\"blocked rm -rf\"}' || true",
      "timeoutSec": 10
    }
  ]
}

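A plain substring match on 'rm -rf' misses equivalent spellings such as rm -fr or rm -r -f. The sketch below shows one way a stricter check might look, using Python's standard shlex module to tokenize the command. The function names are hypothetical, how the command text reaches the script is left out as an assumption, and the check only looks at a single simple command (no sudo prefix, pipelines, or && chains).

```python
import json
import shlex


def is_recursive_force_rm(command: str) -> bool:
    """Return True if a simple command line calls rm with both -r/-R and -f."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input (e.g. unbalanced quotes): treat as suspicious
    if not tokens or tokens[0] != "rm":
        return False
    flags = set()
    for token in tokens[1:]:
        if token == "--":
            break  # everything after "--" is a file name, not a flag
        if token == "--recursive":
            flags.add("r")
        elif token == "--force":
            flags.add("f")
        elif token.startswith("-") and not token.startswith("--"):
            for ch in token[1:]:  # bundled short flags such as -fr
                if ch in "rR":
                    flags.add("r")
                elif ch == "f":
                    flags.add("f")
    return {"r", "f"} <= flags


def hook_output(command: str) -> str:
    """Build the deny payload from the worked example, or nothing to stay silent."""
    if is_recursive_force_rm(command):
        return json.dumps({
            "permissionDecision": "deny",
            "permissionDecisionReason": "blocked recursive forced rm",
        })
    return ""
```

Even a stricter matcher is still a guardrail: it can be bypassed by wrapping the command, and it offers no protection at all if the hook itself fails.
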
One-look contrast — the safety knobs

error handling = stop it before it happens (preToolUse deny) · retries = try transient failures again (continueOnAutoMode) · rollbacks = undo after the fact (/rewind, revert PR) · limits = stop runaways (max-continues, timeout). But the deny only holds if the hook itself runs — fail-open means a broken hook won't protect you.

The cert-language version

Safe execution combines error handling (hooks like preToolUse that allow/deny/modify a tool call before it runs — but fail-open), retries (e.g. continueOnAutoMode, Actions if: failure()), rollbacks (/rewind the last turn, or revert/don't-merge the PR), and execution limits (--max-autopilot-continues, subagent depth, hook timeout). These are guardrails layered on top of least-privilege and human review — not hard locks.

Our summary · grounded in GitHub Docs (Copilot hooks reference, CLI command reference, Actions expressions) + MS Learn — Agent tooling… · fetched 2026-05-30

Common confusions (read these or lose points)

"A preToolUse deny hook is a hard security boundary." No. Hooks are fail-open; if the hook errors, the agent continues. Guardrail, not lock.
"In the cloud agent, ask pauses for approval." No. In the cloud agent 'ask' = 'deny' (no human to ask).
"Don't worry about the autopilot continuation default." Don't rely on it. The --max-autopilot-continues default has changed across CLI versions; set it explicitly to bound a run.
"Rollback means restoring production." Here it's usually /rewind (undo the last turn) or not merging the PR, which leaves the base branch untouched. Reverting the PR is only needed if it was already merged.
"Cloud hooks work like local hooks." No. Cloud loads only .github/hooks/*.json, bash only, on a temporary filesystem.


Unofficial study material. Not affiliated with, endorsed by, or sponsored by GitHub or Microsoft. “GH-600” and “GitHub” are trademarks of their respective owners, used for identification only.
