# Intent Engineering Checklist for Skills and Agents
Use this checklist when creating or auditing any skill or agent prompt. Intent engineering minimizes the gap between the instructions an agent is given and how it actually behaves. Prompts are contracts, not suggestions.
## 1. Boundary Statements
Does the prompt explicitly define what the agent MUST NOT do?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Negative constraints listed ("MUST NOT modify X") | Agent has 3+ specific prohibitions relevant to its scope | Agent has vague or incomplete prohibitions | No negative constraints at all |
| Scope boundary defined ("operates ONLY on X") | Explicit file/directory/system scope stated | Scope implied but not stated | Agent could reasonably touch anything |
| Blast radius bounded | Agent cannot modify its own instructions, unrelated code, or shared state | Some boundaries exist but gaps remain | No blast radius consideration |
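The "negative constraints" criterion can be checked mechanically. Below is a minimal Python sketch (the function names and the 3+ threshold come from the rubric above; everything else is illustrative) that counts bullet prohibitions listed under a MUST NOT header:

```python
def count_prohibitions(prompt: str) -> int:
    """Count bullet items listed under a 'MUST NOT' header."""
    count, in_block = 0, False
    for line in prompt.splitlines():
        if "MUST NOT" in line:
            in_block = True
        elif in_block and line.strip().startswith("-"):
            count += 1
        elif in_block and line.strip():
            in_block = False  # the prohibition block ends at the next non-bullet line
    return count

def score_negative_constraints(prompt: str) -> int:
    """Rubric above: 2 = Present (3+ prohibitions), 1 = Partial, 0 = Missing."""
    n = count_prohibitions(prompt)
    return 2 if n >= 3 else (1 if n >= 1 else 0)

prompt = """**You MUST NOT:**
- Modify files outside the reviewed branch diff scope
- Skip the test gate or spec-compliance gate
- Create new public API signatures during auto-fix"""
print(score_negative_constraints(prompt))  # 2 (Present)
```

A counter of this kind is only a first pass: it verifies that prohibitions exist, not that they are specific to the agent's scope, which still requires human review.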
Good pattern:

```markdown
**You MUST NOT:**
- Modify files outside the reviewed branch diff scope
- Skip the test gate or spec-compliance gate
- Create new public API signatures during auto-fix
```

Bad pattern:

```markdown
Be careful not to change anything important.
```
## 2. Task Completion Definitions
Does the prompt define what "done" means in observable, verifiable terms?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Positive completion criteria ("complete WHEN X") | 2+ observable conditions that can be verified | Vague completion ("when you're done") | No completion definition |
| Negative completion boundary ("NOT responsible for X") | Explicit list of adjacent work this agent must not start | Implicit boundary only | Agent could iterate indefinitely |
| Stop condition ("STOP after X") | Explicit instruction to halt and report | Implied stopping point | No stop signal — agent may gold-plate |
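These criteria can also be linted for mechanically. A minimal sketch, assuming the prompt uses the marker phrases from the patterns below (the marker strings and names are assumptions, not a standard):

```python
# Marker phrases assumed to signal each completion criterion (illustrative).
REQUIRED_MARKERS = {
    "positive completion criteria": "Task is COMPLETE when:",
    "negative completion boundary": "NOT responsible for:",
    "stop condition": "STOP",
}

def missing_completion_sections(prompt: str) -> list[str]:
    """Return the names of completion criteria the prompt never states."""
    return [name for name, marker in REQUIRED_MARKERS.items()
            if marker not in prompt]

# All three markers are absent from this vague prompt.
print(missing_completion_sections("Do the review, then you're done."))
```

An empty return value means the prompt at least names all three criteria; whether the stated conditions are actually observable still needs a human check.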
Good pattern:

```markdown
**Task is COMPLETE when:**
- All quality gates report status (pass or fail)
- Findings are classified by severity
- Report is delivered to the user

**This agent is NOT responsible for:**
- Fixing the findings (that is the developer's job)
- Merging the branch
- Deploying to production
```

Bad pattern:

```markdown
Keep working until everything looks good.
```
## 3. Constraint Enumeration (Fork-Context Skills)
For skills running in `context: fork` or with elevated autonomy:
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Obligation language is unambiguous | Uses MUST/MUST NOT/SHALL exclusively | Mix of "should"/"try to"/"consider" | Entirely advisory language |
| Conditional logic is explicit | "If X, then Y. Otherwise Z." | Some conditions stated, others implied | Agent must infer when rules apply |
| Priority order stated for competing concerns | Explicit ranking (e.g., "security over speed") | Implied priority | No guidance when concerns conflict |
Audit for these ambiguous phrases and replace them with MUST/MUST NOT:

- "Try to..." → MUST or remove
- "Should probably..." → MUST or MAY
- "Generally..." → state the exception explicitly
- "Consider..." → MUST evaluate, then decide based on [criteria]
- "Be careful..." → MUST NOT [specific prohibited action]
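The substitutions above can be scripted as a first-pass audit. A minimal sketch (the phrase-to-fix table mirrors the list above; the function name is illustrative):

```python
import re

# Ambiguous phrase -> suggested replacement, taken from the audit list above.
AMBIGUOUS_PHRASES = {
    r"\btry to\b":          "MUST or remove",
    r"\bshould probably\b": "MUST or MAY",
    r"\bgenerally\b":       "state the exception explicitly",
    r"\bconsider\b":        "MUST evaluate, then decide based on [criteria]",
    r"\bbe careful\b":      "MUST NOT [specific prohibited action]",
}

def audit_ambiguity(prompt: str) -> list[tuple[str, str]]:
    """Return (offending phrase, suggested fix) pairs found in the prompt."""
    findings = []
    for pattern, fix in AMBIGUOUS_PHRASES.items():
        for m in re.finditer(pattern, prompt, re.IGNORECASE):
            findings.append((m.group(0), fix))
    return findings

for phrase, fix in audit_ambiguity("Try to keep changes small and be careful with deletes."):
    print(f"{phrase!r} -> replace with: {fix}")
```

Regex matching cannot judge intent, so treat every hit as a flag for manual rewrite rather than an automatic substitution.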
## 4. Failure Mode Flagging
Does the prompt instruct the agent to surface problems rather than silently work around them?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| "Surface, Don't Solve" instruction | Explicit instruction with 3-step protocol (stop, describe, ask) | General "report issues" language | No failure handling at all |
| Specific failure scenarios listed | 3+ scenarios where agent must stop and ask | Generic "if something goes wrong" | Agent will attempt workarounds |
| Confidence gate for destructive actions | Agent must state confidence before delete/overwrite/deploy | Confirmation required for some actions | No gates on destructive operations |
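One way to make the four-step surfacing protocol concrete is to have the agent (or a harness around it) emit a structured report instead of free text. A minimal sketch; the class and field names are illustrative, not from any framework:

```python
from dataclasses import dataclass

@dataclass
class SurfacedObstacle:
    """What an agent reports instead of silently working around a problem."""
    stopped_step: str    # (1) STOP: which step was halted
    encountered: str     # (2) DESCRIBE: what was observed
    why_unexpected: str  # (3) EXPLAIN: which expectation it violates
    question: str        # (4) ASK: the decision the user must make

def format_report(o: SurfacedObstacle) -> str:
    return (f"STOPPED: {o.stopped_step}\n"
            f"ENCOUNTERED: {o.encountered}\n"
            f"UNEXPECTED BECAUSE: {o.why_unexpected}\n"
            f"QUESTION: {o.question}")

report = SurfacedObstacle(
    stopped_step="auto-fix",
    encountered="the fix introduced two new lint findings",
    why_unexpected="auto-fix is expected to reduce findings, not add them",
    question="Revert the fix, or continue and surface the new findings?",
)
print(format_report(report))
```

Forcing a structure like this makes the "ask" step unavoidable: a report with an empty `question` field is detectably incomplete.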
Good pattern:

```markdown
**You MUST STOP and surface to the user when:**
- A reviewer agent fails to return results within the expected format
- Auto-fix creates new findings instead of resolving existing ones
- Test gate passes but no spec exists to check compliance against

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.
```

Bad pattern:

```markdown
If something goes wrong, try to handle it gracefully.
```
## Scoring Guide
Score each of the four dimensions above as Present (2), Partial (1), or Missing (0); the maximum total is 8.
| Score | Rating | Action |
|---|---|---|
| 7-8 | Excellent | No changes needed |
| 5-6 | Good | Minor improvements recommended |
| 3-4 | Fair | Add missing sections before deployment |
| 0-2 | Poor | Full intent engineering audit required |
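The scoring table translates directly into code. A minimal sketch, assuming four dimensions scored 0-2 each for a maximum of 8:

```python
def rate(total_score: int) -> tuple[str, str]:
    """Map a total score (0-8) to the rating and action in the table above."""
    if not 0 <= total_score <= 8:
        raise ValueError("total score must be between 0 and 8")
    if total_score >= 7:
        return "Excellent", "No changes needed"
    if total_score >= 5:
        return "Good", "Minor improvements recommended"
    if total_score >= 3:
        return "Fair", "Add missing sections before deployment"
    return "Poor", "Full intent engineering audit required"

# Example: one prompt scored 2, 2, 1, 1 across the four dimensions.
print(rate(sum([2, 2, 1, 1])))  # ('Good', 'Minor improvements recommended')
```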
## Quick Reference: Intent Boundaries Section Template
Add this section to any orchestrator or high-autonomy agent, placed immediately after the role description and before workflow steps:
```markdown
## Intent Boundaries

**You MUST NOT:**
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]

**You MUST STOP and surface to the user when:**
- [Failure scenario specific to this agent's domain]
- [Failure scenario specific to this agent's domain]

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.

**Task is COMPLETE when:**
- [Observable, verifiable condition]
- [Observable, verifiable condition]

**This agent is NOT responsible for:**
- [Adjacent work that belongs to another agent/phase]
```