ace-test 0.6.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.ace-defaults/nav/protocols/agent-sources/ace-test.yml +19 -0
- data/.ace-defaults/nav/protocols/guide-sources/ace-test.yml +19 -0
- data/.ace-defaults/nav/protocols/tmpl-sources/ace-test.yml +11 -0
- data/.ace-defaults/nav/protocols/wfi-sources/ace-test.yml +19 -0
- data/CHANGELOG.md +169 -0
- data/LICENSE +21 -0
- data/README.md +40 -0
- data/Rakefile +12 -0
- data/handbook/agents/mock.ag.md +164 -0
- data/handbook/agents/profile-tests.ag.md +132 -0
- data/handbook/agents/test.ag.md +99 -0
- data/handbook/guides/SUMMARY.md +95 -0
- data/handbook/guides/embedded-testing-guide.g.md +261 -0
- data/handbook/guides/mocking-patterns.g.md +464 -0
- data/handbook/guides/quick-reference.g.md +46 -0
- data/handbook/guides/test-driven-development-cycle/meta-documentation.md +26 -0
- data/handbook/guides/test-driven-development-cycle/ruby-application.md +18 -0
- data/handbook/guides/test-driven-development-cycle/ruby-gem.md +19 -0
- data/handbook/guides/test-driven-development-cycle/rust-cli.md +18 -0
- data/handbook/guides/test-driven-development-cycle/rust-wasm-zed.md +19 -0
- data/handbook/guides/test-driven-development-cycle/typescript-nuxt.md +18 -0
- data/handbook/guides/test-driven-development-cycle/typescript-vue.md +19 -0
- data/handbook/guides/test-layer-decision.g.md +261 -0
- data/handbook/guides/test-mocking-patterns.g.md +414 -0
- data/handbook/guides/test-organization.g.md +140 -0
- data/handbook/guides/test-performance.g.md +353 -0
- data/handbook/guides/test-responsibility-map.g.md +220 -0
- data/handbook/guides/test-review-checklist.g.md +231 -0
- data/handbook/guides/test-suite-health.g.md +337 -0
- data/handbook/guides/testable-code-patterns.g.md +315 -0
- data/handbook/guides/testing/ruby-rspec-config-examples.md +120 -0
- data/handbook/guides/testing/ruby-rspec.md +87 -0
- data/handbook/guides/testing/rust.md +52 -0
- data/handbook/guides/testing/test-maintenance.md +364 -0
- data/handbook/guides/testing/typescript-bun.md +47 -0
- data/handbook/guides/testing/vue-firebase-auth.md +546 -0
- data/handbook/guides/testing/vue-vitest.md +236 -0
- data/handbook/guides/testing-philosophy.g.md +82 -0
- data/handbook/guides/testing-strategy.g.md +151 -0
- data/handbook/guides/testing-tdd-cycle.g.md +146 -0
- data/handbook/guides/testing.g.md +170 -0
- data/handbook/skills/as-test-create-cases/SKILL.md +24 -0
- data/handbook/skills/as-test-fix/SKILL.md +26 -0
- data/handbook/skills/as-test-improve-coverage/SKILL.md +22 -0
- data/handbook/skills/as-test-optimize/SKILL.md +34 -0
- data/handbook/skills/as-test-performance-audit/SKILL.md +34 -0
- data/handbook/skills/as-test-plan/SKILL.md +34 -0
- data/handbook/skills/as-test-review/SKILL.md +34 -0
- data/handbook/skills/as-test-verify-suite/SKILL.md +45 -0
- data/handbook/templates/e2e-sandbox-checklist.template.md +289 -0
- data/handbook/templates/test-case.template.md +56 -0
- data/handbook/templates/test-performance-audit.template.md +132 -0
- data/handbook/templates/test-responsibility-map.template.md +92 -0
- data/handbook/templates/test-review-checklist.template.md +163 -0
- data/handbook/workflow-instructions/test/analyze-failures.wf.md +120 -0
- data/handbook/workflow-instructions/test/create-cases.wf.md +675 -0
- data/handbook/workflow-instructions/test/fix.wf.md +120 -0
- data/handbook/workflow-instructions/test/improve-coverage.wf.md +370 -0
- data/handbook/workflow-instructions/test/optimize.wf.md +368 -0
- data/handbook/workflow-instructions/test/performance-audit.wf.md +17 -0
- data/handbook/workflow-instructions/test/plan.wf.md +323 -0
- data/handbook/workflow-instructions/test/review.wf.md +16 -0
- data/handbook/workflow-instructions/test/verify-suite.wf.md +343 -0
- data/lib/ace/test/version.rb +7 -0
- data/lib/ace/test.rb +10 -0
- metadata +152 -0
@@ -0,0 +1,99 @@
---
name: test
description: Run tests with smart defaults and helpful diagnostics
expected_params:
  required: []
  optional:
    - target: 'Test target - file path, directory (atoms, molecules), or group name'
    - profile: 'Profile N slowest tests to identify performance issues'
    - verbose: 'Show detailed test output'
last_modified: '2026-01-22'
type: agent
source: ace-test
---

You are a testing specialist using the **ace-test** command-line tool.

## Core Responsibilities

Your primary role is to run tests efficiently and help diagnose issues:

- Run tests with smart defaults based on context
- Profile slow tests to identify performance bottlenecks
- Help interpret test failures and suggest fixes
- Guide users toward proper test organization

## Primary Tool: ace-test

Use the **ace-test** command for all test execution.

## Commands

### Run All Tests

```bash
# Run all tests in current package
ace-test

# Run all tests across monorepo
ace-test-suite
```

### Run Specific Tests

```bash
# Run single test file
ace-test test/atoms/pattern_analyzer_test.rb

# Run test directory/group
ace-test atoms
ace-test molecules
ace-test organisms
ace-test integration
```

### Profile Tests

```bash
# Profile 10 slowest tests
ace-test --profile 10

# Profile slowest tests in a group
ace-test atoms --profile 10
```

## Test Performance Targets

When profiling, use these thresholds to identify issues:

| Test Layer | Target Time | Hard Limit |
|------------|-------------|------------|
| Unit (atoms) | <10ms | 50ms |
| Unit (molecules) | <50ms | 100ms |
| Unit (organisms) | <100ms | 200ms |
| Integration | <500ms | 1s |
| E2E | <2s | 5s |

## Interpreting Results

When tests fail:

1. Check the error message and stack trace
2. Look for patterns (same file, similar names)
3. Consider recent changes (`git log --oneline -10`)
4. Check for zombie mocks if tests are slow but passing

When tests are slow:

1. Profile to identify bottlenecks
2. Check for real I/O in unit tests
3. Look for unstubbed subprocess calls
4. Verify mocks are hitting actual code paths

## Related Guides

- [Quick Reference](guide://quick-reference) - TL;DR testing patterns
- [Test Performance](guide://test-performance) - Optimization strategies
- [Mocking Patterns](guide://mocking-patterns) - How to stub properly

## Response Format

When providing results:

1. Show the test command and summary
2. Highlight failures with file:line references
3. Suggest fixes based on error patterns
4. Recommend profiling if tests seem slow
@@ -0,0 +1,95 @@
# ACE Testing Guide Index

Navigation index for all testing guides in the ace-test package.

## Core Testing Guides

| Guide | Protocol | Description |
|-------|----------|-------------|
| Quick Reference | `guide://quick-reference` | TL;DR of testing patterns - flat structure, naming, IO isolation |
| Testing Philosophy | `guide://testing-philosophy` | Testing pyramid, IO isolation principle, when real IO is allowed |
| Test Organization | `guide://test-organization` | Flat directory structure, naming conventions, layer boundaries |
| Mocking Patterns | `guide://mocking-patterns` | MockGitRepo, WebMock, subprocess stubbing, ENV testing patterns |
| Test Performance | `guide://test-performance` | Performance targets by layer, composite helpers, zombie mock detection |
| Testable Code Patterns | `guide://testable-code-patterns` | Avoiding exit calls, returning status codes, exception patterns |
| Testing Guide | `guide://testing` | General testing guidelines and best practices |
| TDD Cycle | `guide://testing-tdd-cycle` | Test-driven development implementation cycle |
| Embedded Testing | `guide://embedded-testing-guide` | Embedded testing in workflows |

## Test Strategy & Planning

Decision frameworks for test design and layer assignment.

| Guide | Protocol | Description |
|-------|----------|-------------|
| Testing Strategy | `guide://testing-strategy` | Fast/Slow loop strategy for high-performance test suites |
| Test Layer Decision | `guide://test-layer-decision` | Decision matrix for unit vs integration vs E2E |
| Test Responsibility Map | `guide://test-responsibility-map` | Map behaviors to test layers to avoid redundant coverage |

## Test Quality & Health

Patterns for maintaining test suite quality and performance.

| Guide | Protocol | Description |
|-------|----------|-------------|
| Test Mocking Patterns | `guide://test-mocking-patterns` | Behavior testing, zombie mock detection, contract testing |
| Test Suite Health | `guide://test-suite-health` | Metrics, CI integration, periodic audits |
| Test Review Checklist | `guide://test-review-checklist` | Quick checklist for reviewing test PRs |

## Technology-Specific Guides

### Testing by Technology

| Guide | File | Description |
|-------|------|-------------|
| RSpec Patterns | `testing/ruby-rspec.md` | Ruby RSpec-specific testing patterns |
| RSpec Config Examples | `testing/ruby-rspec-config-examples.md` | RSpec configuration examples |
| Rust Testing | `testing/rust.md` | Rust testing patterns |
| TypeScript/Bun | `testing/typescript-bun.md` | Bun test patterns for TypeScript |
| Vue/Vitest | `testing/vue-vitest.md` | Vue + Vitest testing patterns |
| Vue/Firebase Auth | `testing/vue-firebase-auth.md` | Vue with Firebase authentication testing |
| Test Maintenance | `testing/test-maintenance.md` | Test maintenance and refactoring guidelines |

## TDD Cycle Guides

### Test-Driven Development by Platform

| Guide | File | Description |
|-------|------|-------------|
| Ruby Gem TDD | `test-driven-development-cycle/ruby-gem.md` | TDD workflow for Ruby gems |
| Ruby Application TDD | `test-driven-development-cycle/ruby-application.md` | TDD workflow for Ruby applications |
| Rust CLI TDD | `test-driven-development-cycle/rust-cli.md` | TDD for Rust CLI tools |
| Rust WASM/Zed TDD | `test-driven-development-cycle/rust-wasm-zed.md` | TDD for Rust WASM/Zed extensions |
| TypeScript Vue TDD | `test-driven-development-cycle/typescript-vue.md` | TDD for Vue applications |
| TypeScript Nuxt TDD | `test-driven-development-cycle/typescript-nuxt.md` | TDD for Nuxt applications |
| Meta Documentation TDD | `test-driven-development-cycle/meta-documentation.md` | TDD for documentation projects |

## Access Guides via ace-nav

```bash
# Quick reference
ace-nav guide://quick-reference

# Testing philosophy
ace-nav guide://testing-philosophy

# Mocking patterns
ace-nav guide://mocking-patterns

# TDD cycle
ace-nav guide://testing-tdd-cycle

# New test strategy guides
ace-nav guide://testing-strategy
ace-nav guide://test-layer-decision
ace-nav guide://test-mocking-patterns
ace-nav guide://test-suite-health
ace-nav guide://test-responsibility-map
ace-nav guide://test-review-checklist
```

## See Also

- [Workflows](../workflow-instructions/) - Testing-related workflows
- [Agents](../agents/) - Testing automation agents
- [Templates](../templates/) - Test case templates
@@ -0,0 +1,261 @@
---
doc-type: guide
title: Embedding Tests in AI Agent Workflows
purpose: Embedded testing in workflows
ace-docs:
  last-updated: 2026-01-23
  last-checked: 2026-03-21
---

# Embedding Tests in AI Agent Workflows

This guide details the standard for incorporating tests directly within AI agent workflow instruction files.
Integrating tests makes workflows more robust, provides faster feedback, and improves the reliability of
automated tasks.

## Purpose

Embedding tests directly into workflow instructions allows an AI agent to:

- Verify pre-conditions before starting complex operations
- Validate the outcome of individual actions or tool uses
- Confirm that a series of steps achieved the desired overall result
- Request user verification for subjective or critical outputs
- Ensure adherence to safety guardrails and compliance requirements
- Support self-contained workflow execution without external test files

This immediate feedback loop helps catch errors early, reduces the need for extensive manual checking,
and improves automated processes. In the context of self-contained workflows, embedded tests are
essential for validation without external dependencies.

## Test Categories

The following categories of tests can be embedded in workflows:

1. **Pre-condition Checks:**
   - **Description:** Verify that the environment and inputs are ready before an action or task begins.
   - **Examples:** Ensure input files exist, required tools are available, API keys are set.

2. **Action Validation (Tool-Specific Tests):**
   - **Description:** Validate the immediate output or effect of a specific tool usage or agent action.
   - **Examples:** Check if a file was correctly modified, a command ran successfully, an API call returned expected data.

3. **Post-condition Checks (Task-Level Outcome Validation):**
   - **Description:** Verify that a sequence of actions has achieved its broader goal.
   - **Examples:** Confirm a generated report contains all necessary sections, a
     refactoring task didn't break existing tests.

4. **Output Validation (Against External Systems or Ground Truth):**
   - **Description:** Compare the agent's final output with an external reference or known correct state.
   - **Examples:** Check if a deployed service responds correctly, if data written
     to a database matches expectations.

5. **Guardrail Tests (Safety and Compliance):**
   - **Description:** Ensure the agent's operations stay within safe boundaries and meet compliance rules.
   - **Examples:** Prevent accidental deletion of critical files, check for hardcoded secrets,
     ensure generated code meets linting standards.

6. **User Feedback/Verification Prompts:**
   - **Description:** Solicit explicit confirmation from a human user for steps that are subjective, critical, or
     difficult to verify automatically.
   - **Examples:** Ask the user to review a generated summary, confirm a proposed destructive action.

## Syntax for Embedding Tests

Tests are embedded in workflow markdown files using a specific blockquote structure. There are two main keywords:
`TEST` for automated checks and `VERIFY` for user feedback prompts.

```markdown
> TEST: <Test Name (brief, human-readable)>
> Type: <Pre-condition | Action Validation | Post-condition | Output Validation | Guardrail>
> Assert: <Human-readable description of what's being checked>
> [Command: <executable command or script call, e.g., `bin/test --check-file ...`>]
> [File: <path_to_file_to_check_or_use_in_command>]
> [Pattern: <regex_or_string_pattern_for_grep_like_checks>]
> [Expected: <expected_value_or_outcome_description>]

> VERIFY: <Verification Point Name>
> Type: User Feedback
> Prompt: <Text to display to the user for verification>
> [Options: <e.g., (yes/no), (proceed/abort/edit)>]
```

**Fields:**

- `TEST:` / `VERIFY:`: Keyword to initiate the test or verification block. Followed by a human-readable name.
- `Type:`: One of the defined test categories.
- `Assert:` (for `TEST`): A clear, human-readable statement of the condition being checked.
- `Command:` (optional, for `TEST`): The shell command the agent should execute to perform the test. A non-zero
  exit code indicates failure. This is the **preferred method for automated checks**.
- `File:` (optional, for `TEST`): Path to a file relevant to the test. Can be used by the `Command` or by the agent
  for simple checks if no `Command` is provided.
- `Pattern:` (optional, for `TEST`): A regex or string pattern to search for, typically within the `File`.
- `Expected:` (optional, for `TEST`): A description of the expected outcome or value, useful if the `Command` doesn't
  directly assert this.
- `Prompt:` (for `VERIFY`): The question or instruction presented to the user.
- `Options:` (optional, for `VERIFY`): Suggested responses for the user.
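
The field layout above lends itself to simple line-by-line parsing. As a rough sketch in Ruby (the language this package ships in); `parse_test_block` and the sample block are illustrative assumptions, not part of ace-test:

```ruby
# Illustrative sketch only: collect "Key: value" pairs from a "> TEST:" block.
# parse_test_block and TEST_BLOCK are assumptions, not ace-test API.

TEST_BLOCK = <<~MD
  > TEST: Config File Created
  > Type: Post-condition Check
  > Assert: The config.json file exists in the output directory.
  > Command: bin/test --check-file-exists output/config.json
MD

def parse_test_block(markdown)
  fields = {}
  markdown.each_line do |line|
    next unless line.start_with?("> ")          # only blockquote lines belong to the block
    key, value = line.delete_prefix("> ").split(":", 2)
    fields[key.strip] = value.strip if value    # "TEST" => name, "Command" => shell command, ...
  end
  fields
end

block = parse_test_block(TEST_BLOCK)
puts block["TEST"]     # => Config File Created
puts block["Command"]  # => bin/test --check-file-exists output/config.json
```

Splitting on the first `:` only keeps colons inside field values (such as command arguments) intact.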
## Agent Interpretation and Execution

The AI agent should:

1. **Parse Blocks:** Identify `> TEST:` and `> VERIFY:` blocks in the workflow markdown.
2. **Execute `TEST` Blocks:**
   - If a `Command:` is provided, execute it (e.g., using a `terminal` tool). The `bin/test` utility is designed to
     be the common target for these commands. A non-zero exit status from the command signifies a test failure.
   - If no `Command:` is provided, the agent may attempt simple, direct checks based on `File:`, `Pattern:`, and
     `Assert:`, such as file existence or basic content checks. However, complex logic should always be
     encapsulated in a `Command:`.
3. **Handle `VERIFY` Blocks:**
   - Display the `Prompt:` text to the user.
   - Wait for user input. The workflow may pause or proceed based on the response.
4. **Handle Test Outcomes:**
   - **Success:** Proceed with the workflow.
   - **Failure (Automated Test or Negative User Feedback):** Report the failure (including test name, assertion,
     and command output if any). Halt the current task or follow specific error-handling instructions in the
     workflow, then await user guidance.
5. **Log:** Record all test executions and their outcomes.
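
The command-execution and outcome-handling steps above can be sketched as follows; `run_embedded_test` is a hypothetical helper, not something the ace-test tool provides:

```ruby
# Sketch: run a TEST block's Command and map the exit status to an outcome,
# per the rule that a non-zero exit status signifies a test failure.
# run_embedded_test is an illustrative assumption, not ace-test API.

def run_embedded_test(name, command)
  output = `#{command} 2>&1`        # capture stdout+stderr for failure reports
  if $?.success?                    # exit status 0 => test passed
    { name: name, status: :pass }
  else                              # non-zero => record name and output for the report
    { name: name, status: :fail, output: output }
  end
end

passing = run_embedded_test("Shell Available", "true")
failing = run_embedded_test("Always Fails", "false")
puts passing[:status]  # => pass
puts failing[:status]  # => fail
```

On failure, the collected `output` is what the agent would include when reporting the test name, assertion, and command output before halting or following the workflow's error-handling instructions.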
## Integration with Self-Contained Workflows

Embedded tests are crucial for maintaining workflow independence:

1. **No External Test Scripts**: Tests should be defined inline rather than referencing external test files
2. **Self-Validating Steps**: Each major step can include its own validation
3. **Context-Aware Testing**: Tests can reference files loaded in the Project Context Loading section
4. **Technology-Agnostic Commands**: Use placeholder commands that can be adapted to different stacks

### Example: Self-Contained Test in Planning Step

```markdown
### Planning Steps

* [ ] Review dependency analysis findings

> TEST: Analysis Review Complete
> Type: Pre-condition Check
> Assert: Dependency analysis results are incorporated into refactoring plan
> Command: grep -l "dependency" workflow-independence-plan.md
```

## Examples

### Simple, Fast Feedback Tests

### Example 1: Check if a file was created

```markdown
## Step 2: Generate Configuration File

The agent will generate `config.json` based on the inputs.

> TEST: Config File Created
> Type: Post-condition Check
> Assert: The `config.json` file exists in the output directory.
> File: output/config.json
> Command: bin/test --check-file-exists output/config.json
```

### Example 2: Check if a file contains specific text

```markdown
## Step 3: Update README

The agent will add a "## Usage" section to `README.md`.

> TEST: README Usage Section Added
> Type: Action Validation
> Assert: The `README.md` file now contains the "## Usage" heading.
> File: README.md
> Pattern: "## Usage"
> Command: bin/test --check-file-contains-pattern "## Usage" README.md
```

### Higher-Level Verification Tests

### Example 3: Confirm a command achieves an outcome (e.g., code formatting)

```markdown
## Step 4: Format Source Code

The agent will run the project's code formatter on all `.py` files.

> TEST: Code Formatting Applied
> Type: Post-condition Check
> Assert: The code formatter reports no changes are needed, indicating formatting was successful.
> Command: black --check .  # (assumes 'black' exits non-zero if changes are needed)
```

*Note: For commands like linters or formatters that exit 0 if successful (or no changes are needed) and non-zero
if issues are found/changes would be made, the agent might need to interpret the exit code accordingly.
A wrapper script via `bin/test` could invert this if needed (e.g. `bin/test --expect-exit-code 0 "black --check ."`).*

### Example 4: User verification of generated content

```markdown
## Step 5: Generate Project Summary

The agent will write a summary of the project to `docs/summary.md`.

> VERIFY: Summary Accuracy
> Type: User Feedback
> Prompt: Please review `docs/summary.md`. Does it accurately reflect the project's current state and goals?
> Options: (Yes, Accurate / No, Needs Revision)
```

### Example 5: Pre-condition check for API key

```markdown
## Step 1: Initialize API Client

The agent will prepare to make calls to an external service.

> TEST: API Key Available
> Type: Pre-condition Check
> Assert: The `EXTERNAL_SERVICE_API_KEY` environment variable is set.
> Command: bin/test --check-env-var-set EXTERNAL_SERVICE_API_KEY
```

## The `bin/test` Utility

A helper script, `bin/test`, is envisioned to simplify common test operations invoked via the `Command:` field.
This script would provide a consistent interface for checks like:

- File existence (`--check-file-exists <path>`)
- File non-existence (`--check-file-not-exists <path>`)
- File is not empty (`--check-file-not-empty <path>`)
- File contains a string/pattern (`--check-file-contains-pattern "<pattern>" <path>`)
- Environment variable is set (`--check-env-var-set <VAR_NAME>`)
- Arbitrary command execution and exit code checking (`--exec "your command here" --expect-exit-code <N>`)
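
Since the package is a Ruby gem, such a helper could be a short Ruby script. The sketch below is one hypothetical shape for it; only the flag names come from the list above, and everything else (the `bin_test_check` helper, the fail-closed default) is an assumption:

```ruby
# Hypothetical bin/test sketch: each check returns true/false. A real script
# would end with `exit(bin_test_check(*ARGV) ? 0 : 1)` so agents can branch
# on the exit status, as the Command: field contract requires.
require "tmpdir"

def bin_test_check(flag, *args)
  case flag
  when "--check-file-exists"     then File.exist?(args[0])
  when "--check-file-not-exists" then !File.exist?(args[0])
  when "--check-file-not-empty"  then File.exist?(args[0]) && !File.zero?(args[0])
  when "--check-file-contains-pattern"
    pattern, path = args
    File.exist?(path) && File.read(path).match?(Regexp.new(pattern))
  when "--check-env-var-set"     then !ENV[args[0]].to_s.empty?
  else false                     # unknown flag: fail closed
  end
end

# Demonstration against a throwaway directory:
Dir.mktmpdir do |dir|
  readme = File.join(dir, "README.md")
  File.write(readme, "# Demo\n\n## Usage\n")
  puts bin_test_check("--check-file-exists", readme)                        # => true
  puts bin_test_check("--check-file-contains-pattern", "## Usage", readme)  # => true
  puts bin_test_check("--check-file-exists", File.join(dir, "missing"))     # => false
end
```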
**Note**: In self-contained workflows, test commands should be generic placeholders when `bin/test`
isn't available, with comments showing language-specific alternatives:

```markdown
> TEST: Version File Updated
> Type: Action Validation
> Assert: Version number updated in project files
> Command: grep "version.*X.Y.Z" package.json || grep "version.*X.Y.Z" Cargo.toml || grep "VERSION.*X.Y.Z" setup.py
```

## Best Practices for Embedded Tests

### In Self-Contained Workflows

1. **Embed Test Logic**: Include test commands directly rather than referencing external test suites
2. **Use Generic Patterns**: Write tests that can work across different technology stacks
3. **Provide Alternatives**: Show multiple command options for different environments
4. **Keep Tests Simple**: Focus on existence checks and pattern matching
5. **Document Expected Output**: Be clear about what constitutes success

### Example: Technology-Agnostic Test

```markdown
> TEST: Project Tests Pass
> Type: Post-condition Check
> Assert: All project tests pass successfully
> Command: bin/test || npm test || bundle exec rspec || pytest || cargo test
```

By adopting this standard, AI agent workflows can become significantly more reliable and easier to debug, leading
to more efficient and trustworthy automation, while maintaining complete independence from external resources.
|