Teaching AI Agents Real TDD: Why Strict Rules Matter
I’ve been working on a standardization SOP that restructures repository AGENTS.md files1 to enforce strict Test-Driven Development rules for autonomous coding agents. The goal is simple: make AI agents follow real TDD, not the fake version they often default to.
The Problem
AI agents claim to follow TDD but frequently perform invalid Red steps. Here’s a typical failure:
1FAILED test_parse_issue_success
2AttributeError: method does not exist
The agent treats this as a valid Red step and proceeds to implement the method. This is incorrect. The test never executed the intended behavior—it failed because the structure was missing, not because the behavior was wrong.
Real TDD requires tests to fail because behavior is incorrect, not because a symbol doesn’t exist.2
Why This Happens
Autonomous agents need much more explicit TDD definitions than human developers. Without strict guardrails, they frequently:
- treat missing methods as valid Red failures
- write tests without meaningful assertions
- modify tests to make implementations pass
- apply TDD to incorrect parts of the codebase (config files, scripts, test utilities)
The core issue is that agents interpret “Red-Green-Refactor” too loosely. They need normative specifications, not informal guidance.3
The Approach
I created an SOP that standardizes AGENTS.md files using a reference structure with required sections. The TDD rules are defined using RFC 2119 language3 (MUST, MUST NOT, SHOULD, MAY) to eliminate ambiguity.
The SOP introduces placeholders that agents must resolve by analyzing the repository:
1<production_src_paths>
2<test_paths>
This forces agents to discover where production code lives versus where tests live. It prevents them from incorrectly applying TDD to test files, scripts, or configuration.
Key TDD Rules
1. Define Production Code Boundaries
Agents must know exactly where production code lives. The rules require automatic discovery from repository structure:
- Production code →
src/,app/,lib/, etc. - Tests →
tests/,test/,spec/
2. Enforce Behavioral Red Failures
A valid Red test must:
- import modules successfully
- instantiate required objects
- execute the target function
- fail only on behavioral assertions
Invalid Red failures include:
SyntaxErrorImportErrorModuleNotFoundErrorAttributeErrorcaused by missing methodsNameError- fixture or test setup failures
If these occur, the test must be rewritten.
3. Require Placeholder Implementations
When testing a new function that doesn’t exist yet, the agent must first create a placeholder implementation:
1def parse_issue(data):
2 raise NotImplementedError
This ensures the test fails due to missing behavior rather than missing structure.
4. Require Meaningful Assertions
Tests without assertions about observable behavior are invalid:
1# Invalid
2def test_foo():
3 foo()
4
5# Valid
6def test_foo():
7 result = foo()
8 assert result == expected_value
5. Prevent Test Rewriting to Pass Code
A common AI failure pattern:
- Write a test
- Implement code
- Test fails
- Modify the test instead of fixing the code
The rules explicitly forbid weakening or rewriting tests to make implementations pass unless the user confirms the test is incorrect.
Structural Safeguards
AGENTS.md Integrity Boundary
The file includes a controlled modification boundary.4 Only the section titled “Detailed Documentation” may be modified automatically. All rule sections above this boundary must remain unchanged to prevent accidental alteration of agent behavior.
Locking Normative Safety Rules
These sections must be copied verbatim and not simplified or reinterpreted:
TDD (Red-Green-Refactor)Step GatingAGENTS.md Integrity
Agents sometimes weaken rules when rewriting documentation. This prevents that.5
Step Gating Model
Agents must follow a strict step workflow:6
- Stop after completing a step
- Provide a command the user can run to verify results
- Wait for confirmation
- Update documentation or implementation plans
- Commit the changes
If verification fails, the agent must diagnose and re-gate.
What Worked Well
Using RFC 2119 language (MUST, MUST NOT) significantly improved agent compliance. The normative style removes interpretation ambiguity.
Requiring placeholder implementations solved the “missing method” Red step problem. Agents now create structure first, then test behavior.
The integrity boundary prevented agents from accidentally weakening safety rules during documentation updates.
What Didn’t Work
Initial versions didn’t define production code boundaries explicitly. Agents applied TDD to test utilities and configuration files, creating circular dependencies.
Early rules didn’t forbid test rewriting. Agents would modify assertions to match incorrect implementations rather than fixing the code.
Lessons Learned
AI agents require explicit definitions of concepts human developers understand implicitly. “Write a failing test” isn’t enough—you must define what constitutes a valid failure.
Normative language matters. Informal guidance gets reinterpreted. RFC 2119 keywords force agents to treat rules as specifications.
Agents need structural guardrails to prevent self-modification of safety rules. The integrity boundary is essential for autonomous workflows.
Next Steps
The SOP currently handles TDD and step gating. Future versions should address:
- lint and type check gates
- commit message standards
- documentation update requirements
- knowledge capture workflows
The goal is a complete specification for safe autonomous development that produces production-quality code without human intervention at each step.
References
- RFC 2119 - Key words for use in RFCs to Indicate Requirement Levels - Normative language specification
AGENTS.md is a Markdown file placed at the root of a code repository that provides instructions, conventions, and context for AI coding agents. It acts as a machine-readable playbook that standardizes how autonomous agents interact with a codebase. See AGENTS.md Standard for more details. ↩︎
In proper Test-Driven Development, the Red phase requires that tests fail due to incorrect behavior, not missing structure. A test that fails with
ImportError,AttributeError, orNameErrorhasn’t actually tested the intended behavior—it failed before reaching the assertion. This distinction is critical for maintaining the integrity of the Red-Green-Refactor cycle. See Test-Driven Development (TDD) for verification requirements. ↩︎RFC 2119 defines keywords like MUST, MUST NOT, SHOULD, and MAY for use in technical specifications. These keywords create unambiguous requirement levels: MUST indicates absolute requirements, SHOULD indicates strong recommendations, and MAY indicates optional features. Using this language in agent instructions eliminates interpretation ambiguity. The specification is public domain and available at RFC 2119. ↩︎ ↩︎
Document integrity boundaries are controlled modification zones that protect critical sections from unauthorized or accidental changes. In the context of AGENTS.md, this boundary ensures that normative rules governing agent behavior remain stable while allowing project-specific documentation to evolve. This pattern is common in regulated industries where document control maintains system integrity. See Document Integrity for principles. ↩︎
AI agents can inadvertently weaken safety rules when asked to “improve” or “simplify” documentation. By marking certain sections as normative and requiring verbatim copying, the SOP prevents agents from reinterpreting critical constraints. This is similar to how change control procedures in quality management systems require explicit approval for modifications to controlled documents. ↩︎
Step gating (also called quality gates or verification gates) introduces structured control points in a workflow where each phase must meet defined criteria before progression. In software development, gates typically verify that code compiles, tests pass, and quality standards are met before allowing the next step. For AI agents, step gating ensures human verification at critical points, preventing cascading errors. See Process Gating and Software Quality Gates for implementation patterns. ↩︎