ace-test 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (67)
  1. checksums.yaml +7 -0
  2. data/.ace-defaults/nav/protocols/agent-sources/ace-test.yml +19 -0
  3. data/.ace-defaults/nav/protocols/guide-sources/ace-test.yml +19 -0
  4. data/.ace-defaults/nav/protocols/tmpl-sources/ace-test.yml +11 -0
  5. data/.ace-defaults/nav/protocols/wfi-sources/ace-test.yml +19 -0
  6. data/CHANGELOG.md +169 -0
  7. data/LICENSE +21 -0
  8. data/README.md +40 -0
  9. data/Rakefile +12 -0
  10. data/handbook/agents/mock.ag.md +164 -0
  11. data/handbook/agents/profile-tests.ag.md +132 -0
  12. data/handbook/agents/test.ag.md +99 -0
  13. data/handbook/guides/SUMMARY.md +95 -0
  14. data/handbook/guides/embedded-testing-guide.g.md +261 -0
  15. data/handbook/guides/mocking-patterns.g.md +464 -0
  16. data/handbook/guides/quick-reference.g.md +46 -0
  17. data/handbook/guides/test-driven-development-cycle/meta-documentation.md +26 -0
  18. data/handbook/guides/test-driven-development-cycle/ruby-application.md +18 -0
  19. data/handbook/guides/test-driven-development-cycle/ruby-gem.md +19 -0
  20. data/handbook/guides/test-driven-development-cycle/rust-cli.md +18 -0
  21. data/handbook/guides/test-driven-development-cycle/rust-wasm-zed.md +19 -0
  22. data/handbook/guides/test-driven-development-cycle/typescript-nuxt.md +18 -0
  23. data/handbook/guides/test-driven-development-cycle/typescript-vue.md +19 -0
  24. data/handbook/guides/test-layer-decision.g.md +261 -0
  25. data/handbook/guides/test-mocking-patterns.g.md +414 -0
  26. data/handbook/guides/test-organization.g.md +140 -0
  27. data/handbook/guides/test-performance.g.md +353 -0
  28. data/handbook/guides/test-responsibility-map.g.md +220 -0
  29. data/handbook/guides/test-review-checklist.g.md +231 -0
  30. data/handbook/guides/test-suite-health.g.md +337 -0
  31. data/handbook/guides/testable-code-patterns.g.md +315 -0
  32. data/handbook/guides/testing/ruby-rspec-config-examples.md +120 -0
  33. data/handbook/guides/testing/ruby-rspec.md +87 -0
  34. data/handbook/guides/testing/rust.md +52 -0
  35. data/handbook/guides/testing/test-maintenance.md +364 -0
  36. data/handbook/guides/testing/typescript-bun.md +47 -0
  37. data/handbook/guides/testing/vue-firebase-auth.md +546 -0
  38. data/handbook/guides/testing/vue-vitest.md +236 -0
  39. data/handbook/guides/testing-philosophy.g.md +82 -0
  40. data/handbook/guides/testing-strategy.g.md +151 -0
  41. data/handbook/guides/testing-tdd-cycle.g.md +146 -0
  42. data/handbook/guides/testing.g.md +170 -0
  43. data/handbook/skills/as-test-create-cases/SKILL.md +24 -0
  44. data/handbook/skills/as-test-fix/SKILL.md +26 -0
  45. data/handbook/skills/as-test-improve-coverage/SKILL.md +22 -0
  46. data/handbook/skills/as-test-optimize/SKILL.md +34 -0
  47. data/handbook/skills/as-test-performance-audit/SKILL.md +34 -0
  48. data/handbook/skills/as-test-plan/SKILL.md +34 -0
  49. data/handbook/skills/as-test-review/SKILL.md +34 -0
  50. data/handbook/skills/as-test-verify-suite/SKILL.md +45 -0
  51. data/handbook/templates/e2e-sandbox-checklist.template.md +289 -0
  52. data/handbook/templates/test-case.template.md +56 -0
  53. data/handbook/templates/test-performance-audit.template.md +132 -0
  54. data/handbook/templates/test-responsibility-map.template.md +92 -0
  55. data/handbook/templates/test-review-checklist.template.md +163 -0
  56. data/handbook/workflow-instructions/test/analyze-failures.wf.md +120 -0
  57. data/handbook/workflow-instructions/test/create-cases.wf.md +675 -0
  58. data/handbook/workflow-instructions/test/fix.wf.md +120 -0
  59. data/handbook/workflow-instructions/test/improve-coverage.wf.md +370 -0
  60. data/handbook/workflow-instructions/test/optimize.wf.md +368 -0
  61. data/handbook/workflow-instructions/test/performance-audit.wf.md +17 -0
  62. data/handbook/workflow-instructions/test/plan.wf.md +323 -0
  63. data/handbook/workflow-instructions/test/review.wf.md +16 -0
  64. data/handbook/workflow-instructions/test/verify-suite.wf.md +343 -0
  65. data/lib/ace/test/version.rb +7 -0
  66. data/lib/ace/test.rb +10 -0
  67. metadata +152 -0
+++ data/handbook/agents/test.ag.md
@@ -0,0 +1,99 @@
+ ---
+ name: test
+ description: Run tests with smart defaults and helpful diagnostics
+ expected_params:
+   required: []
+   optional:
+     - target: 'Test target - file path, directory (atoms, molecules), or group name'
+     - profile: 'Profile N slowest tests to identify performance issues'
+     - verbose: 'Show detailed test output'
+ last_modified: '2026-01-22'
+ type: agent
+ source: ace-test
+ ---
+
+ You are a testing specialist using the **ace-test** command-line tool.
+
+ ## Core Responsibilities
+
+ Your primary role is to run tests efficiently and help diagnose issues:
+ - Run tests with smart defaults based on context
+ - Profile slow tests to identify performance bottlenecks
+ - Help interpret test failures and suggest fixes
+ - Guide users toward proper test organization
+
+ ## Primary Tool: ace-test
+
+ Use the **ace-test** command for all test execution.
+
+ ## Commands
+
+ ### Run All Tests
+ ```bash
+ # Run all tests in current package
+ ace-test
+
+ # Run all tests across monorepo
+ ace-test-suite
+ ```
+
+ ### Run Specific Tests
+ ```bash
+ # Run single test file
+ ace-test test/atoms/pattern_analyzer_test.rb
+
+ # Run test directory/group
+ ace-test atoms
+ ace-test molecules
+ ace-test organisms
+ ace-test integration
+ ```
+
+ ### Profile Tests
+ ```bash
+ # Profile 10 slowest tests
+ ace-test --profile 10
+
+ # Profile slowest tests in a group
+ ace-test atoms --profile 10
+ ```
+
+ ## Test Performance Targets
+
+ When profiling, use these thresholds to identify issues:
+
+ | Test Layer | Target Time | Hard Limit |
+ |------------|-------------|------------|
+ | Unit (atoms) | <10ms | 50ms |
+ | Unit (molecules) | <50ms | 100ms |
+ | Unit (organisms) | <100ms | 200ms |
+ | Integration | <500ms | 1s |
+ | E2E | <2s | 5s |
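These thresholds are easy to apply mechanically when reading profile output. A minimal Ruby sketch that encodes the table above (the `THRESHOLDS` constant and `check_duration` helper are illustrative names, not part of ace-test):

```ruby
# Performance thresholds per test layer, in seconds (values from the table above).
THRESHOLDS = {
  atoms:       { target: 0.010, limit: 0.050 },
  molecules:   { target: 0.050, limit: 0.100 },
  organisms:   { target: 0.100, limit: 0.200 },
  integration: { target: 0.500, limit: 1.000 },
  e2e:         { target: 2.000, limit: 5.000 }
}.freeze

# Classify a measured duration: :ok, :warn (over target), or :fail (over hard limit).
def check_duration(layer, seconds)
  t = THRESHOLDS.fetch(layer)
  return :fail if seconds > t[:limit]
  return :warn if seconds > t[:target]
  :ok
end
```

For example, a 30ms atoms test is over its 10ms target but under the 50ms hard limit, so it classifies as a warning rather than a failure.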
+
+ ## Interpreting Results
+
+ When tests fail:
+ 1. Check the error message and stack trace
+ 2. Look for patterns (same file, similar names)
+ 3. Consider recent changes (`git log --oneline -10`)
+ 4. Check for zombie mocks if tests are slow but passing
+
+ When tests are slow:
+ 1. Profile to identify bottlenecks
+ 2. Check for real I/O in unit tests
+ 3. Look for unstubbed subprocess calls
+ 4. Verify mocks are hitting actual code paths
+
+ ## Related Guides
+
+ - [Quick Reference](guide://quick-reference) - TL;DR testing patterns
+ - [Test Performance](guide://test-performance) - Optimization strategies
+ - [Mocking Patterns](guide://mocking-patterns) - How to stub properly
+
+ ## Response Format
+
+ When providing results:
+ 1. Show test command and summary
+ 2. Highlight failures with file:line references
+ 3. Suggest fixes based on error patterns
+ 4. Recommend profiling if tests seem slow
+++ data/handbook/guides/SUMMARY.md
@@ -0,0 +1,95 @@
+ # ACE Testing Guide Index
+
+ Navigation index for all testing guides in the ace-test package.
+
+ ## Core Testing Guides
+
+ | Guide | Protocol | Description |
+ |-------|----------|-------------|
+ | Quick Reference | `guide://quick-reference` | TL;DR of testing patterns - flat structure, naming, IO isolation |
+ | Testing Philosophy | `guide://testing-philosophy` | Testing pyramid, IO isolation principle, when real IO is allowed |
+ | Test Organization | `guide://test-organization` | Flat directory structure, naming conventions, layer boundaries |
+ | Mocking Patterns | `guide://mocking-patterns` | MockGitRepo, WebMock, subprocess stubbing, ENV testing patterns |
+ | Test Performance | `guide://test-performance` | Performance targets by layer, composite helpers, zombie mock detection |
+ | Testable Code Patterns | `guide://testable-code-patterns` | Avoiding exit calls, returning status codes, exception patterns |
+ | Testing Guide | `guide://testing` | General testing guidelines and best practices |
+ | TDD Cycle | `guide://testing-tdd-cycle` | Test-driven development implementation cycle |
+ | Embedded Testing | `guide://embedded-testing-guide` | Embedded testing in workflows |
+
+ ## Test Strategy & Planning
+
+ Decision frameworks for test design and layer assignment.
+
+ | Guide | Protocol | Description |
+ |-------|----------|-------------|
+ | Testing Strategy | `guide://testing-strategy` | Fast/Slow loop strategy for high-performance test suites |
+ | Test Layer Decision | `guide://test-layer-decision` | Decision matrix for unit vs integration vs E2E |
+ | Test Responsibility Map | `guide://test-responsibility-map` | Map behaviors to test layers to avoid redundant coverage |
+
+ ## Test Quality & Health
+
+ Patterns for maintaining test suite quality and performance.
+
+ | Guide | Protocol | Description |
+ |-------|----------|-------------|
+ | Test Mocking Patterns | `guide://test-mocking-patterns` | Behavior testing, zombie mock detection, contract testing |
+ | Test Suite Health | `guide://test-suite-health` | Metrics, CI integration, periodic audits |
+ | Test Review Checklist | `guide://test-review-checklist` | Quick checklist for reviewing test PRs |
+
+ ## Technology-Specific Guides
+
+ ### Testing by Technology
+
+ | Guide | File | Description |
+ |-------|------|-------------|
+ | RSpec Patterns | `testing/ruby-rspec.md` | Ruby RSpec-specific testing patterns |
+ | RSpec Config Examples | `testing/ruby-rspec-config-examples.md` | RSpec configuration examples |
+ | Rust Testing | `testing/rust.md` | Rust testing patterns |
+ | TypeScript/Bun | `testing/typescript-bun.md` | Bun test patterns for TypeScript |
+ | Vue/Vitest | `testing/vue-vitest.md` | Vue + Vitest testing patterns |
+ | Vue/Firebase Auth | `testing/vue-firebase-auth.md` | Vue with Firebase authentication testing |
+ | Test Maintenance | `testing/test-maintenance.md` | Test maintenance and refactoring guidelines |
+
+ ## TDD Cycle Guides
+
+ ### Test-Driven Development by Platform
+
+ | Guide | File | Description |
+ |-------|------|-------------|
+ | Ruby Gem TDD | `test-driven-development-cycle/ruby-gem.md` | TDD workflow for Ruby gems |
+ | Ruby Application TDD | `test-driven-development-cycle/ruby-application.md` | TDD workflow for Ruby applications |
+ | Rust CLI TDD | `test-driven-development-cycle/rust-cli.md` | TDD for Rust CLI tools |
+ | Rust WASM/Zed TDD | `test-driven-development-cycle/rust-wasm-zed.md` | TDD for Rust WASM/Zed extensions |
+ | TypeScript Vue TDD | `test-driven-development-cycle/typescript-vue.md` | TDD for Vue applications |
+ | TypeScript Nuxt TDD | `test-driven-development-cycle/typescript-nuxt.md` | TDD for Nuxt applications |
+ | Meta Documentation TDD | `test-driven-development-cycle/meta-documentation.md` | TDD for documentation projects |
+
+ ## Access Guides via ace-nav
+
+ ```bash
+ # Quick reference
+ ace-nav guide://quick-reference
+
+ # Testing philosophy
+ ace-nav guide://testing-philosophy
+
+ # Mocking patterns
+ ace-nav guide://mocking-patterns
+
+ # TDD cycle
+ ace-nav guide://testing-tdd-cycle
+
+ # New test strategy guides
+ ace-nav guide://testing-strategy
+ ace-nav guide://test-layer-decision
+ ace-nav guide://test-mocking-patterns
+ ace-nav guide://test-suite-health
+ ace-nav guide://test-responsibility-map
+ ace-nav guide://test-review-checklist
+ ```
+
+ ## See Also
+
+ - [Workflows](../workflow-instructions/) - Testing-related workflows
+ - [Agents](../agents/) - Testing automation agents
+ - [Templates](../templates/) - Test case templates
+++ data/handbook/guides/embedded-testing-guide.g.md
@@ -0,0 +1,261 @@
+ ---
+ doc-type: guide
+ title: Embedding Tests in AI Agent Workflows
+ purpose: Embedded testing in workflows
+ ace-docs:
+   last-updated: 2026-01-23
+   last-checked: 2026-03-21
+ ---
+
+ # Embedding Tests in AI Agent Workflows
+
+ This guide details the standard for incorporating tests directly within AI agent workflow instruction files.
+ Integrating tests makes workflows more robust, provides faster feedback, and improves the reliability of
+ automated tasks.
+
+ ## Purpose
+
+ Embedding tests directly into workflow instructions allows an AI agent to:
+
+ - Verify pre-conditions before starting complex operations
+ - Validate the outcome of individual actions or tool uses
+ - Confirm that a series of steps achieved the desired overall result
+ - Request user verification for subjective or critical outputs
+ - Ensure adherence to safety guardrails and compliance requirements
+ - Support self-contained workflow execution without external test files
+
+ This immediate feedback loop helps catch errors early, reduces the need for extensive manual checking,
+ and improves automated processes. In the context of self-contained workflows, embedded tests are
+ essential for validation without external dependencies.
+
+ ## Test Categories
+
+ The following categories of tests can be embedded in workflows:
+
+ 1. **Pre-condition Checks:**
+    - **Description:** Verify that the environment and inputs are ready before an action or task begins.
+    - **Examples:** Ensure input files exist, required tools are available, API keys are set.
+
+ 2. **Action Validation (Tool-Specific Tests):**
+    - **Description:** Validate the immediate output or effect of a specific tool usage or agent action.
+    - **Examples:** Check if a file was correctly modified, a command ran successfully, an API call returned expected data.
+
+ 3. **Post-condition Checks (Task-Level Outcome Validation):**
+    - **Description:** Verify that a sequence of actions has achieved its broader goal.
+    - **Examples:** Confirm a generated report contains all necessary sections, a
+      refactoring task didn't break existing tests.
+
+ 4. **Output Validation (Against External Systems or Ground Truth):**
+    - **Description:** Compare the agent's final output with an external reference or known correct state.
+    - **Examples:** Check if a deployed service responds correctly, if data written
+      to a database matches expectations.
+
+ 5. **Guardrail Tests (Safety and Compliance):**
+    - **Description:** Ensure the agent's operations stay within safe boundaries and meet compliance rules.
+    - **Examples:** Prevent accidental deletion of critical files,
+      check for hardcoded secrets,
+      ensure generated code meets linting standards.
+
+ 6. **User Feedback/Verification Prompts:**
+    - **Description:** Solicit explicit confirmation from a human user for steps that are subjective, critical, or
+      difficult to automate verification for.
+    - **Examples:** Ask user to review a generated summary, confirm a proposed destructive action.
+
+ ## Syntax for Embedding Tests
+
+ Tests are embedded in workflow markdown files using a specific blockquote structure. There are two main keywords:
+ `TEST` for automated checks and `VERIFY` for user feedback prompts.
+
+ ```markdown
+ > TEST: <Test Name (brief, human-readable)>
+ > Type: <Pre-condition | Action Validation | Post-condition | Output Validation | Guardrail>
+ > Assert: <Human-readable description of what's being checked>
+ > [Command: <executable command or script call, e.g., `bin/test --check-file ...`>]
+ > [File: <path_to_file_to_check_or_use_in_command>]
+ > [Pattern: <regex_or_string_pattern_for_grep_like_checks>]
+ > [Expected: <expected_value_or_outcome_description>]
+
+ > VERIFY: <Verification Point Name>
+ > Type: User Feedback
+ > Prompt: <Text to display to the user for verification>
+ > [Options: <e.g., (yes/no), (proceed/abort/edit)>]
+ ```
+
+ **Fields:**
+
+ - `TEST:` / `VERIFY:`: Keyword to initiate the test or verification block. Followed by a human-readable name.
+ - `Type:`: One of the defined test categories.
+ - `Assert:` (for `TEST`): A clear, human-readable statement of the condition being checked.
+ - `Command:` (optional, for `TEST`): The shell command the agent should execute to perform the test. A non-zero
+   exit code indicates failure. This is the **preferred method for automated checks**.
+ - `File:` (optional, for `TEST`): Path to a file relevant to the test. Can be used by the `Command` or by the agent
+   for simple checks if no `Command` is provided.
+ - `Pattern:` (optional, for `TEST`): A regex or string pattern to search for, typically within the `File`.
+ - `Expected:` (optional, for `TEST`): A description of the expected outcome or value, useful if the `Command` doesn't
+   directly assert this.
+ - `Prompt:` (for `VERIFY`): The question or instruction presented to the user.
+ - `Options:` (optional, for `VERIFY`): Suggested responses for the user.
+
+ ## Agent Interpretation and Execution
+
+ The AI agent should:
+
+ 1. **Parse Blocks:** Identify `> TEST:` and `> VERIFY:` blocks in the workflow markdown.
+ 2. **Execute `TEST` Blocks:**
+    - If a `Command:` is provided, execute it (e.g., using a `terminal` tool). The `bin/test` utility is designed to
+      be the common target for these commands. A non-zero exit status from the command signifies a test failure.
+    - If no `Command:` is provided, the agent may attempt simple, direct checks based on `File:`, `Pattern:`, and
+      `Assert:`, such as file existence or basic content checks. However, complex logic should always be
+      encapsulated in a `Command:`.
+ 3. **Handle `VERIFY` Blocks:**
+    - Display the `Prompt:` text to the user.
+    - Wait for user input. The workflow may pause or proceed based on the response.
+ 4. **Handle Test Outcomes:**
+    - **Success:** Proceed with the workflow.
+    - **Failure (Automated Test or Negative User Feedback):** Report the failure (including test name, assertion,
+      and command output if any). Halt the current task or follow specific error-handling instructions in the
+      workflow, then await user guidance.
+ 5. **Log:** Record all test executions and their outcomes.
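The parse step in item 1 can be sketched in a few lines of Ruby. This fragment is illustrative only (the `parse_embedded_tests` helper is hypothetical, not part of ace-test); it collects each blockquote block's keyword, name, and fields:

```ruby
# Parse "> TEST:" / "> VERIFY:" blockquote blocks out of workflow markdown.
# Returns an array of hashes: { keyword:, name:, fields: { "Type" => ..., ... } }.
def parse_embedded_tests(markdown)
  blocks = []
  current = nil
  markdown.each_line do |line|
    case line
    when /\A>\s*(TEST|VERIFY):\s*(.+)/
      blocks << current if current          # a new keyword starts a new block
      current = { keyword: $1, name: $2.strip, fields: {} }
    when /\A>\s*\[?(\w+):\s*([^\]]+)\]?/    # "> Type: ..." or "> [Command: ...]"
      current[:fields][$1] = $2.strip if current
    else
      blocks << current if current          # blank/non-blockquote line ends the block
      current = nil
    end
  end
  blocks << current if current
  blocks
end
```

A real implementation would also need to handle wrapped continuation lines inside a field; this sketch treats each `> Key: value` line as self-contained.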
+
+ ## Integration with Self-Contained Workflows
+
+ Embedded tests are crucial for maintaining workflow independence:
+
+ 1. **No External Test Scripts**: Tests should be defined inline rather than referencing external test files
+ 2. **Self-Validating Steps**: Each major step can include its own validation
+ 3. **Context-Aware Testing**: Tests can reference files loaded in the Project Context Loading section
+ 4. **Technology-Agnostic Commands**: Use placeholder commands that can be adapted to different stacks
+
+ ### Example: Self-Contained Test in Planning Step
+
+ ```markdown
+ ### Planning Steps
+ * [ ] Review dependency analysis findings
+ > TEST: Analysis Review Complete
+ > Type: Pre-condition Check
+ > Assert: Dependency analysis results are incorporated into refactoring plan
+ > Command: grep -l "dependency" workflow-independence-plan.md
+ ```
+
+ ## Examples
+
+ ### Simple, Fast Feedback Tests
+
+ #### Example 1: Check if a file was created
+
+ ```markdown
+ ## Step 2: Generate Configuration File
+
+ The agent will generate `config.json` based on the inputs.
+
+ > TEST: Config File Created
+ > Type: Post-condition Check
+ > Assert: The `config.json` file exists in the output directory.
+ > File: output/config.json
+ > Command: bin/test --check-file-exists output/config.json
+ ```
+
+ #### Example 2: Check if a file contains specific text
+
+ ```markdown
+ ## Step 3: Update README
+
+ The agent will add a "## Usage" section to `README.md`.
+
+ > TEST: README Usage Section Added
+ > Type: Action Validation
+ > Assert: The `README.md` file now contains the "## Usage" heading.
+ > File: README.md
+ > Pattern: "## Usage"
+ > Command: bin/test --check-file-contains-pattern "## Usage" README.md
+ ```
+
+ ### Higher-Level Verification Tests
+
+ #### Example 3: Confirm a command achieves an outcome (e.g., code formatting)
+
+ ```markdown
+ ## Step 4: Format Source Code
+
+ The agent will run the project's code formatter on all `.py` files.
+
+ > TEST: Code Formatting Applied
+ > Type: Post-condition Check
+ > Assert: The code formatter reports no changes are needed, indicating formatting was successful.
+ > Command: black --check .  # (Assumes 'black' exits non-zero if changes are needed)
+ ```
+
+ *Note: For commands like linters or formatters that exit 0 if successful (or no changes needed) and non-zero
+ if issues are found/changes would be made, the agent might need to interpret the exit code accordingly.
+ A wrapper script via `bin/test` could invert this if needed (e.g. `bin/test --expect-exit-code 0 "black --check ."`).*
+
+ #### Example 4: User verification of generated content
+
+ ```markdown
+ ## Step 5: Generate Project Summary
+
+ The agent will write a summary of the project to `docs/summary.md`.
+
+ > VERIFY: Summary Accuracy
+ > Type: User Feedback
+ > Prompt: Please review `docs/summary.md`. Does it accurately reflect the project's current state and goals?
+ > Options: (Yes, Accurate / No, Needs Revision)
+ ```
+
+ #### Example 5: Pre-condition check for API key
+
+ ```markdown
+ ## Step 1: Initialize API Client
+
+ The agent will prepare to make calls to an external service.
+
+ > TEST: API Key Available
+ > Type: Pre-condition Check
+ > Assert: The `EXTERNAL_SERVICE_API_KEY` environment variable is set.
+ > Command: bin/test --check-env-var-set EXTERNAL_SERVICE_API_KEY
+ ```
+
+ ## The `bin/test` Utility
+
+ A helper script, `bin/test`, is envisioned to simplify common test operations invoked via the `Command:` field.
+ This script would provide a consistent interface for checks like:
+
+ - File existence (`--check-file-exists <path>`)
+ - File non-existence (`--check-file-not-exists <path>`)
+ - File is not empty (`--check-file-not-empty <path>`)
+ - File contains a string/pattern (`--check-file-contains-pattern "<pattern>" <path>`)
+ - Environment variable is set (`--check-env-var-set <VAR_NAME>`)
+ - Arbitrary command execution and exit code checking (`--exec "your command here" --expect-exit-code <N>`)
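Since `bin/test` is only envisioned, any implementation is speculative. A minimal Ruby sketch of the file and environment checks listed above (the `run_check` helper and its fail-closed default are assumptions of this sketch, not shipped code):

```ruby
#!/usr/bin/env ruby
# Sketch of a bin/test-style helper. Each check returns true/false; the CLI
# entry point maps that onto exit status 0/1, which is what TEST blocks inspect.

def run_check(argv)
  args = argv.dup
  case args.shift
  when "--check-file-exists"
    File.exist?(args.shift)
  when "--check-file-not-exists"
    !File.exist?(args.shift)
  when "--check-file-not-empty"
    path = args.shift
    File.exist?(path) && !File.zero?(path)
  when "--check-file-contains-pattern"
    pattern, path = args.shift(2)
    File.exist?(path) && File.read(path).match?(Regexp.new(pattern))
  when "--check-env-var-set"
    !ENV[args.shift].to_s.empty?
  else
    false # unknown flag: fail closed
  end
end

# exit(run_check(ARGV) ? 0 : 1)   # CLI wiring when installed as bin/test
```

The `--exec`/`--expect-exit-code` form is omitted here for brevity; it would wrap `system` and compare `$?.exitstatus` against the expected value.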
+
+ **Note**: In self-contained workflows, test commands should be generic placeholders when `bin/test`
+ isn't available, with comments showing language-specific alternatives:
+
+ ```markdown
+ > TEST: Version File Updated
+ > Type: Action Validation
+ > Assert: Version number updated in project files
+ > Command: grep "version.*X.Y.Z" package.json || grep "version.*X.Y.Z" Cargo.toml || grep "VERSION.*X.Y.Z" setup.py
+ ```
+
+ ## Best Practices for Embedded Tests
+
+ ### In Self-Contained Workflows
+
+ 1. **Embed Test Logic**: Include test commands directly rather than referencing external test suites
+ 2. **Use Generic Patterns**: Write tests that can work across different technology stacks
+ 3. **Provide Alternatives**: Show multiple command options for different environments
+ 4. **Keep Tests Simple**: Focus on existence checks and pattern matching
+ 5. **Document Expected Output**: Be clear about what constitutes success
+
+ ### Example: Technology-Agnostic Test
+
+ ```markdown
+ > TEST: Project Tests Pass
+ > Type: Post-condition Check
+ > Assert: All project tests pass successfully
+ > Command: bin/test || npm test || bundle exec rspec || pytest || cargo test
+ ```
+
+ By adopting this standard, AI agent workflows can become significantly more reliable and easier to debug, leading
+ to more efficient and trustworthy automation, while maintaining complete independence from external resources.