
Intent Engineering Checklist for Skills and Agents

Use this checklist when creating or auditing any skill or agent prompt. Intent engineering minimizes the gap between the instructions you give and the behavior the agent actually exhibits. Prompts are contracts, not suggestions.


1. Boundary Statements

Does the prompt explicitly define what the agent MUST NOT do?

| Criterion | Present | Partial | Missing |
| --- | --- | --- | --- |
| Negative constraints listed ("MUST NOT modify X") | Agent has 3+ specific prohibitions relevant to its scope | Agent has vague or incomplete prohibitions | No negative constraints at all |
| Scope boundary defined ("operates ONLY on X") | Explicit file/directory/system scope stated | Scope implied but not stated | Agent could reasonably touch anything |
| Blast radius bounded | Agent cannot modify its own instructions, unrelated code, or shared state | Some boundaries exist but gaps remain | No blast radius consideration |

Good pattern:

**You MUST NOT:**
- Modify files outside the reviewed branch diff scope
- Skip the test gate or spec-compliance gate
- Create new public API signatures during auto-fix

Bad pattern:

Be careful not to change things you shouldn't.
Try to stay within scope.
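These presence checks can be approximated mechanically. A minimal sketch, assuming the prompt is plain text; the heuristics (a MUST NOT header with 3+ bullets, an ONLY scope statement) are illustrative assumptions, not an authoritative rule set:

```python
import re

def audit_boundaries(prompt_text: str) -> dict:
    """Rough presence check for the boundary criteria (illustrative heuristic)."""
    # Bullet lines are a proxy for enumerated prohibitions under a MUST NOT header.
    bullets = re.findall(r"^\s*[-*] (.+)$", prompt_text, flags=re.MULTILINE)
    has_must_not = "MUST NOT" in prompt_text
    # An explicit ONLY statement is a proxy for a stated scope boundary.
    has_scope = bool(re.search(r"\bONLY\b", prompt_text))
    return {
        "negative_constraints": has_must_not and len(bullets) >= 3,
        "scope_boundary": has_scope,
    }
```

A real audit still needs human review; this only flags prompts that are obviously missing the sections.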


2. Task Completion Definitions

Does the prompt define what "done" means in observable, verifiable terms?

| Criterion | Present | Partial | Missing |
| --- | --- | --- | --- |
| Positive completion criteria ("complete WHEN X") | 2+ observable conditions that can be verified | Vague completion ("when you're done") | No completion definition |
| Negative completion boundary ("NOT responsible for X") | Explicit list of adjacent work this agent must not start | Implicit boundary only | Agent could iterate indefinitely |
| Stop condition ("STOP after X") | Explicit instruction to halt and report | Implied stopping point | No stop signal; agent may gold-plate |

Good pattern:

**Task is COMPLETE when:**
- All quality gates report status (pass or fail)
- Findings are classified by severity
- Report is delivered to the user

**This agent is NOT responsible for:**
- Fixing the findings (that is the developer's job)
- Merging the branch
- Deploying to production

Bad pattern:

Review the code thoroughly and make sure everything is good.
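Observable completion criteria like those in the good pattern can be checked in code rather than judged by the agent. A minimal sketch, where the report field names (`gates`, `findings`, `delivered_to_user`) are hypothetical assumptions about the report structure:

```python
def is_complete(report: dict) -> bool:
    """Check the three completion conditions from the good pattern above."""
    # Every quality gate reported a definite status (pass or fail).
    gates_reported = all(g.get("status") in ("pass", "fail") for g in report.get("gates", []))
    # Every finding carries a severity classification.
    findings_classified = all("severity" in f for f in report.get("findings", []))
    # The report actually reached the user.
    delivered = report.get("delivered_to_user", False)
    return bool(report.get("gates")) and gates_reported and findings_classified and delivered
```

Because each condition is observable, "done" stops being a matter of the agent's self-assessment.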


3. Constraint Enumeration (Fork-Context Skills)

For skills running with `context: fork` or with elevated autonomy:

| Criterion | Present | Partial | Missing |
| --- | --- | --- | --- |
| Obligation language is unambiguous | Uses MUST/MUST NOT/SHALL exclusively | Mix of "should"/"try to"/"consider" | Entirely advisory language |
| Conditional logic is explicit | "If X, then Y. Otherwise Z." | Some conditions stated, others implied | Agent must infer when rules apply |
| Priority order stated for competing concerns | Explicit ranking (e.g., "security over speed") | Implied priority | No guidance when concerns conflict |

Audit for these ambiguous phrases and replace them with MUST/MUST NOT:

- "Try to..." → MUST or remove
- "Should probably..." → MUST or MAY
- "Generally..." → state the exception explicitly
- "Consider..." → MUST evaluate, then decide based on [criteria]
- "Be careful..." → MUST NOT [specific prohibited action]
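This audit is easy to run mechanically over a prompt file. A minimal sketch using the phrase list above; extend the list with your own hedge phrases:

```python
# Hedge-phrase audit: flags advisory language that should become MUST/MUST NOT.
HEDGES = ["try to", "should probably", "generally", "consider", "be careful"]

def find_hedges(prompt_text: str) -> list[tuple[int, str]]:
    """Return (line_number, hedge_phrase) pairs for every advisory phrase found."""
    hits = []
    for lineno, line in enumerate(prompt_text.splitlines(), start=1):
        lowered = line.lower()
        for hedge in HEDGES:
            if hedge in lowered:
                hits.append((lineno, hedge))
    return hits
```

An empty result does not prove the prompt is unambiguous, but any hit is a concrete line to rewrite.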


4. Failure Mode Flagging

Does the prompt instruct the agent to surface problems rather than silently work around them?

| Criterion | Present | Partial | Missing |
| --- | --- | --- | --- |
| "Surface, Don't Solve" instruction | Explicit instruction with 4-step protocol (stop, describe, explain, ask) | General "report issues" language | No failure handling at all |
| Specific failure scenarios listed | 3+ scenarios where agent must stop and ask | Generic "if something goes wrong" | Agent will attempt workarounds |
| Confidence gate for destructive actions | Agent must state confidence before delete/overwrite/deploy | Confirmation required for some actions | No gates on destructive operations |

Good pattern:

**You MUST STOP and surface to the user when:**
- A reviewer agent fails to return results within the expected format
- Auto-fix creates new findings instead of resolving existing ones
- Test gate passes but no spec exists to check compliance against

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.

Bad pattern:

If something goes wrong, try to handle it gracefully.
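The same protocol can be enforced in tool code, not just in prose, by raising a structured report instead of silently working around a malformed input. A minimal sketch, where `SurfaceToUser` and the `findings` field are hypothetical names:

```python
class SurfaceToUser(Exception):
    """Stop work and report an unexpected obstacle (hypothetical exception type)."""
    def __init__(self, encountered: str, why_unexpected: str, question: str):
        self.encountered = encountered        # (2) DESCRIBE what was encountered
        self.why_unexpected = why_unexpected  # (3) EXPLAIN why it is unexpected
        self.question = question              # (4) ASK how to proceed
        super().__init__(encountered)

def run_gate(results: dict) -> str:
    # Raising here is the (1) STOP step: no silent workaround, no guessed default.
    if "findings" not in results:
        raise SurfaceToUser(
            encountered="Reviewer returned no 'findings' field",
            why_unexpected="The reviewer contract requires a findings list",
            question="Re-run the reviewer, or proceed without its results?",
        )
    return "gate complete"
```

Structuring the exception forces every stop to carry the description, explanation, and question the user needs.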


Scoring Guide

Score each dimension as Present (2), Partial (1), or Missing (0).

| Score | Rating | Action |
| --- | --- | --- |
| 7-8 | Excellent | No changes needed |
| 5-6 | Good | Minor improvements recommended |
| 3-4 | Fair | Add missing sections before deployment |
| 0-2 | Poor | Full intent engineering audit required |
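The scoring table maps directly to a small helper. A sketch assuming the four dimensions above, each scored 0-2:

```python
def rate_prompt(scores: list[int]) -> tuple[int, str, str]:
    """Sum four dimension scores (0-2 each) and map the total to the rating table."""
    assert len(scores) == 4 and all(s in (0, 1, 2) for s in scores)
    total = sum(scores)
    if total >= 7:
        return total, "Excellent", "No changes needed"
    if total >= 5:
        return total, "Good", "Minor improvements recommended"
    if total >= 3:
        return total, "Fair", "Add missing sections before deployment"
    return total, "Poor", "Full intent engineering audit required"
```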

Quick Reference: Intent Boundaries Section Template

Add this section to any orchestrator or high-autonomy agent, placed immediately after the role description and before workflow steps:

## Intent Boundaries

**You MUST NOT:**
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]

**You MUST STOP and surface to the user when:**
- [Failure scenario specific to this agent's domain]
- [Failure scenario specific to this agent's domain]

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.

**Task is COMPLETE when:**
- [Observable, verifiable condition]
- [Observable, verifiable condition]

**This agent is NOT responsible for:**
- [Adjacent work that belongs to another agent/phase]

Sources

  • Anthropic: Effective Context Engineering for AI Agents (2025)
  • Microsoft: Taxonomy of Failure Modes in AI Agents (2025)
  • Vadim Blog: Skill Evolver — Research to Practice (2025)
  • AWS: Agentic AI Security Scoping Matrix (2025)
  • arXiv 2512.12791: Beyond Task Completion Assessment Framework (2025)
  • Galileo: 7 AI Agent Failure Modes Guide (2025)