# Intent Engineering Checklist for Skills and Agents
Use this checklist when creating or auditing any skill or agent prompt. Intent engineering minimizes the gap between the instructions an agent is given and how it actually behaves. Prompts are contracts, not suggestions.
## 1. Boundary Statements
Does the prompt explicitly define what the agent MUST NOT do?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Negative constraints listed ("MUST NOT modify X") | Agent has 3+ specific prohibitions relevant to its scope | Agent has vague or incomplete prohibitions | No negative constraints at all |
| Scope boundary defined ("operates ONLY on X") | Explicit file/directory/system scope stated | Scope implied but not stated | Agent could reasonably touch anything |
| Blast radius bounded | Agent cannot modify its own instructions, unrelated code, or shared state | Some boundaries exist but gaps remain | No blast radius consideration |
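The "negative constraints" criterion can be checked mechanically. Below is a minimal Python sketch (the function names and the 3+ threshold come from the rubric above; everything else is illustrative) that counts bullet prohibitions listed under a MUST NOT header:

```python
def count_prohibitions(prompt: str) -> int:
    """Count bullet items listed under a 'MUST NOT' header."""
    count, in_block = 0, False
    for line in prompt.splitlines():
        if "MUST NOT" in line:
            in_block = True
        elif in_block and line.strip().startswith("-"):
            count += 1
        elif in_block and line.strip():
            in_block = False  # the prohibition block ends at the next non-bullet line
    return count

def score_negative_constraints(prompt: str) -> int:
    """Rubric above: 2 = Present (3+ prohibitions), 1 = Partial, 0 = Missing."""
    n = count_prohibitions(prompt)
    return 2 if n >= 3 else (1 if n >= 1 else 0)

prompt = """**You MUST NOT:**
- Modify files outside the reviewed branch diff scope
- Skip the test gate or spec-compliance gate
- Create new public API signatures during auto-fix"""
print(score_negative_constraints(prompt))  # 2 (Present)
```

A counter of this kind is only a first pass: it verifies that prohibitions exist, not that they are specific to the agent's scope, which still requires human review.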
Good pattern:

```markdown
**You MUST NOT:**
- Modify files outside the reviewed branch diff scope
- Skip the test gate or spec-compliance gate
- Create new public API signatures during auto-fix
```

Bad pattern:

```markdown
Be careful not to change anything important.
```
## 2. Task Completion Definitions
Does the prompt define what "done" means in observable, verifiable terms?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Positive completion criteria ("complete WHEN X") | 2+ observable conditions that can be verified | Vague completion ("when you're done") | No completion definition |
| Negative completion boundary ("NOT responsible for X") | Explicit list of adjacent work this agent must not start | Implicit boundary only | Agent could iterate indefinitely |
| Stop condition ("STOP after X") | Explicit instruction to halt and report | Implied stopping point | No stop signal — agent may gold-plate |
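These criteria can also be linted for mechanically. A minimal sketch, assuming the prompt uses the marker phrases from the patterns below (the marker strings and names are assumptions, not a standard):

```python
# Marker phrases assumed to signal each completion criterion (illustrative).
REQUIRED_MARKERS = {
    "positive completion criteria": "Task is COMPLETE when:",
    "negative completion boundary": "NOT responsible for:",
    "stop condition": "STOP",
}

def missing_completion_sections(prompt: str) -> list[str]:
    """Return the names of completion criteria the prompt never states."""
    return [name for name, marker in REQUIRED_MARKERS.items()
            if marker not in prompt]

# All three markers are absent from this vague prompt.
print(missing_completion_sections("Do the review, then you're done."))
```

An empty return value means the prompt at least names all three criteria; whether the stated conditions are actually observable still needs a human check.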
Good pattern:

```markdown
**Task is COMPLETE when:**
- All quality gates report status (pass or fail)
- Findings are classified by severity
- Report is delivered to the user

**This agent is NOT responsible for:**
- Fixing the findings (that is the developer's job)
- Merging the branch
- Deploying to production
```

Bad pattern:

```markdown
Keep working until everything looks good.
```
## 3. Constraint Enumeration (Fork-Context Skills)
For skills running in `context: fork` or with elevated autonomy:
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| Obligation language is unambiguous | Uses MUST/MUST NOT/SHALL exclusively | Mix of "should"/"try to"/"consider" | Entirely advisory language |
| Conditional logic is explicit | "If X, then Y. Otherwise Z." | Some conditions stated, others implied | Agent must infer when rules apply |
| Priority order stated for competing concerns | Explicit ranking (e.g., "security over speed") | Implied priority | No guidance when concerns conflict |
Audit for these ambiguous phrases and replace them with MUST/MUST NOT:

- "Try to..." → MUST or remove
- "Should probably..." → MUST or MAY
- "Generally..." → state the exception explicitly
- "Consider..." → MUST evaluate, then decide based on [criteria]
- "Be careful..." → MUST NOT [specific prohibited action]
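The substitutions above can be scripted as a first-pass audit. A minimal sketch (the phrase-to-fix table mirrors the list above; the function name is illustrative):

```python
import re

# Ambiguous phrase -> suggested replacement, taken from the audit list above.
AMBIGUOUS_PHRASES = {
    r"\btry to\b":          "MUST or remove",
    r"\bshould probably\b": "MUST or MAY",
    r"\bgenerally\b":       "state the exception explicitly",
    r"\bconsider\b":        "MUST evaluate, then decide based on [criteria]",
    r"\bbe careful\b":      "MUST NOT [specific prohibited action]",
}

def audit_ambiguity(prompt: str) -> list[tuple[str, str]]:
    """Return (offending phrase, suggested fix) pairs found in the prompt."""
    findings = []
    for pattern, fix in AMBIGUOUS_PHRASES.items():
        for m in re.finditer(pattern, prompt, re.IGNORECASE):
            findings.append((m.group(0), fix))
    return findings

for phrase, fix in audit_ambiguity("Try to keep changes small and be careful with deletes."):
    print(f"{phrase!r} -> replace with: {fix}")
```

Regex matching cannot judge intent, so treat every hit as a flag for manual rewrite rather than an automatic substitution.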
## 4. Failure Mode Flagging
Does the prompt instruct the agent to surface problems rather than silently work around them?
| Criterion | Present | Partial | Missing |
|---|---|---|---|
| "Surface, Don't Solve" instruction | Explicit instruction with 3-step protocol (stop, describe, ask) | General "report issues" language | No failure handling at all |
| Specific failure scenarios listed | 3+ scenarios where agent must stop and ask | Generic "if something goes wrong" | Agent will attempt workarounds |
| Confidence gate for destructive actions | Agent must state confidence before delete/overwrite/deploy | Confirmation required for some actions | No gates on destructive operations |
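One way to make the four-step surfacing protocol concrete is to have the agent (or a harness around it) emit a structured report instead of free text. A minimal sketch; the class and field names are illustrative, not from any framework:

```python
from dataclasses import dataclass

@dataclass
class SurfacedObstacle:
    """What an agent reports instead of silently working around a problem."""
    stopped_step: str    # (1) STOP: which step was halted
    encountered: str     # (2) DESCRIBE: what was observed
    why_unexpected: str  # (3) EXPLAIN: which expectation it violates
    question: str        # (4) ASK: the decision the user must make

def format_report(o: SurfacedObstacle) -> str:
    return (f"STOPPED: {o.stopped_step}\n"
            f"ENCOUNTERED: {o.encountered}\n"
            f"UNEXPECTED BECAUSE: {o.why_unexpected}\n"
            f"QUESTION: {o.question}")

report = SurfacedObstacle(
    stopped_step="auto-fix",
    encountered="the fix introduced two new lint findings",
    why_unexpected="auto-fix is expected to reduce findings, not add them",
    question="Revert the fix, or continue and surface the new findings?",
)
print(format_report(report))
```

Forcing a structure like this makes the "ask" step unavoidable: a report with an empty `question` field is detectably incomplete.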
Good pattern:

```markdown
**You MUST STOP and surface to the user when:**
- A reviewer agent fails to return results within the expected format
- Auto-fix creates new findings instead of resolving existing ones
- Test gate passes but no spec exists to check compliance against

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.
```

Bad pattern:

```markdown
If something goes wrong, try to handle it gracefully.
```
## Scoring Guide
Score each of the four dimensions above as Present (2), Partial (1), or Missing (0); the maximum total is 8.
| Score | Rating | Action |
|---|---|---|
| 7-8 | Excellent | No changes needed |
| 5-6 | Good | Minor improvements recommended |
| 3-4 | Fair | Add missing sections before deployment |
| 0-2 | Poor | Full intent engineering audit required |
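The scoring table translates directly into code. A minimal sketch, assuming four dimensions scored 0-2 each for a maximum of 8:

```python
def rate(total_score: int) -> tuple[str, str]:
    """Map a total score (0-8) to the rating and action in the table above."""
    if not 0 <= total_score <= 8:
        raise ValueError("total score must be between 0 and 8")
    if total_score >= 7:
        return "Excellent", "No changes needed"
    if total_score >= 5:
        return "Good", "Minor improvements recommended"
    if total_score >= 3:
        return "Fair", "Add missing sections before deployment"
    return "Poor", "Full intent engineering audit required"

# Example: one prompt scored 2, 2, 1, 1 across the four dimensions.
print(rate(sum([2, 2, 1, 1])))  # ('Good', 'Minor improvements recommended')
```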
## Quick Reference: Intent Boundaries Section Template
Add this section to any orchestrator or high-autonomy agent, placed immediately after the role description and before workflow steps:
```markdown
## Intent Boundaries

**You MUST NOT:**
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]
- [Negative constraint specific to this agent's scope]

**You MUST STOP and surface to the user when:**
- [Failure scenario specific to this agent's domain]
- [Failure scenario specific to this agent's domain]

**Surface, Don't Solve:**
When you encounter an unexpected obstacle, DO NOT work around it silently.
Instead: (1) STOP the current step, (2) DESCRIBE what you encountered,
(3) EXPLAIN why it is unexpected, (4) ASK the user how to proceed.

**Task is COMPLETE when:**
- [Observable, verifiable condition]
- [Observable, verifiable condition]

**This agent is NOT responsible for:**
- [Adjacent work that belongs to another agent/phase]
```