ace-test 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (67)
  1. checksums.yaml +7 -0
  2. data/.ace-defaults/nav/protocols/agent-sources/ace-test.yml +19 -0
  3. data/.ace-defaults/nav/protocols/guide-sources/ace-test.yml +19 -0
  4. data/.ace-defaults/nav/protocols/tmpl-sources/ace-test.yml +11 -0
  5. data/.ace-defaults/nav/protocols/wfi-sources/ace-test.yml +19 -0
  6. data/CHANGELOG.md +169 -0
  7. data/LICENSE +21 -0
  8. data/README.md +40 -0
  9. data/Rakefile +12 -0
  10. data/handbook/agents/mock.ag.md +164 -0
  11. data/handbook/agents/profile-tests.ag.md +132 -0
  12. data/handbook/agents/test.ag.md +99 -0
  13. data/handbook/guides/SUMMARY.md +95 -0
  14. data/handbook/guides/embedded-testing-guide.g.md +261 -0
  15. data/handbook/guides/mocking-patterns.g.md +464 -0
  16. data/handbook/guides/quick-reference.g.md +46 -0
  17. data/handbook/guides/test-driven-development-cycle/meta-documentation.md +26 -0
  18. data/handbook/guides/test-driven-development-cycle/ruby-application.md +18 -0
  19. data/handbook/guides/test-driven-development-cycle/ruby-gem.md +19 -0
  20. data/handbook/guides/test-driven-development-cycle/rust-cli.md +18 -0
  21. data/handbook/guides/test-driven-development-cycle/rust-wasm-zed.md +19 -0
  22. data/handbook/guides/test-driven-development-cycle/typescript-nuxt.md +18 -0
  23. data/handbook/guides/test-driven-development-cycle/typescript-vue.md +19 -0
  24. data/handbook/guides/test-layer-decision.g.md +261 -0
  25. data/handbook/guides/test-mocking-patterns.g.md +414 -0
  26. data/handbook/guides/test-organization.g.md +140 -0
  27. data/handbook/guides/test-performance.g.md +353 -0
  28. data/handbook/guides/test-responsibility-map.g.md +220 -0
  29. data/handbook/guides/test-review-checklist.g.md +231 -0
  30. data/handbook/guides/test-suite-health.g.md +337 -0
  31. data/handbook/guides/testable-code-patterns.g.md +315 -0
  32. data/handbook/guides/testing/ruby-rspec-config-examples.md +120 -0
  33. data/handbook/guides/testing/ruby-rspec.md +87 -0
  34. data/handbook/guides/testing/rust.md +52 -0
  35. data/handbook/guides/testing/test-maintenance.md +364 -0
  36. data/handbook/guides/testing/typescript-bun.md +47 -0
  37. data/handbook/guides/testing/vue-firebase-auth.md +546 -0
  38. data/handbook/guides/testing/vue-vitest.md +236 -0
  39. data/handbook/guides/testing-philosophy.g.md +82 -0
  40. data/handbook/guides/testing-strategy.g.md +151 -0
  41. data/handbook/guides/testing-tdd-cycle.g.md +146 -0
  42. data/handbook/guides/testing.g.md +170 -0
  43. data/handbook/skills/as-test-create-cases/SKILL.md +24 -0
  44. data/handbook/skills/as-test-fix/SKILL.md +26 -0
  45. data/handbook/skills/as-test-improve-coverage/SKILL.md +22 -0
  46. data/handbook/skills/as-test-optimize/SKILL.md +34 -0
  47. data/handbook/skills/as-test-performance-audit/SKILL.md +34 -0
  48. data/handbook/skills/as-test-plan/SKILL.md +34 -0
  49. data/handbook/skills/as-test-review/SKILL.md +34 -0
  50. data/handbook/skills/as-test-verify-suite/SKILL.md +45 -0
  51. data/handbook/templates/e2e-sandbox-checklist.template.md +289 -0
  52. data/handbook/templates/test-case.template.md +56 -0
  53. data/handbook/templates/test-performance-audit.template.md +132 -0
  54. data/handbook/templates/test-responsibility-map.template.md +92 -0
  55. data/handbook/templates/test-review-checklist.template.md +163 -0
  56. data/handbook/workflow-instructions/test/analyze-failures.wf.md +120 -0
  57. data/handbook/workflow-instructions/test/create-cases.wf.md +675 -0
  58. data/handbook/workflow-instructions/test/fix.wf.md +120 -0
  59. data/handbook/workflow-instructions/test/improve-coverage.wf.md +370 -0
  60. data/handbook/workflow-instructions/test/optimize.wf.md +368 -0
  61. data/handbook/workflow-instructions/test/performance-audit.wf.md +17 -0
  62. data/handbook/workflow-instructions/test/plan.wf.md +323 -0
  63. data/handbook/workflow-instructions/test/review.wf.md +16 -0
  64. data/handbook/workflow-instructions/test/verify-suite.wf.md +343 -0
  65. data/lib/ace/test/version.rb +7 -0
  66. data/lib/ace/test.rb +10 -0
  67. metadata +152 -0
data/handbook/templates/test-review-checklist.template.md
@@ -0,0 +1,163 @@
+ ---
+ doc-type: template
+ title: Test Review Checklist
+ purpose: Test PR review checklist
+ ace-docs:
+   last-updated: 2026-02-19
+   last-checked: 2026-03-21
+ ---
+
+ # Test Review Checklist
+
+ **PR**: #{{number}}
+ **Package**: {{package}}
+ **Reviewer**: {{name}}
+ **Date**: {{date}}
+
+ ## Quick Summary
+
+ - [ ] Tests added/modified: {{count}}
+ - [ ] Test type: Unit / Integration / E2E
+ - [ ] Performance verified: Yes / No / N/A
+
+ ---
+
+ ## 1. Layer Appropriateness
+
+ Is each test at the correct layer?
+
+ | Test | Current Layer | Correct? | Notes |
+ |------|---------------|----------|-------|
+ | {{test}} | Unit/Integration/E2E | Yes/No | {{notes}} |
+
+ **Checklist**:
+ - [ ] Unit tests have NO real I/O (subprocess, network, filesystem)
+ - [ ] Integration tests stub external dependencies
+ - [ ] E2E tests follow the `test/e2e/TS-*/` layout (scenario.yml + TC-*.tc.md)
+ - [ ] No flag-permutation tests in E2E (those belong at the unit layer)
+ - [ ] At most ONE CLI parity test per integration file
+
+ ## 2. Stubbing Quality
+
+ Are mocks/stubs correctly implemented?
+
+ **Checklist**:
+ - [ ] Boundary methods stubbed (not just inner methods)
+ - [ ] `available?` checks stubbed if `run` is stubbed
+ - [ ] No zombie mocks (stub targets match actual code)
+ - [ ] Mock data is realistic (from snapshots or schemas)
+ - [ ] Composite helpers used where appropriate
+
+ **Red Flags**:
+ - [ ] Deep nesting (>3 levels) without a composite helper
+ - [ ] Stubbing private methods
+ - [ ] Mock expectations without behavior assertions
+
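The "composite helpers" item above can be sketched as a single helper that flattens nested boundary stubs. This is a minimal illustration using Minitest's `stub`; `Runner` and `with_stubbed_runner` are hypothetical names, not part of ace-test:

```ruby
require "minitest/mock"

# Hypothetical boundary object standing in for a real subprocess runner.
class Runner
  def self.available?
    raise NotImplementedError, "real check inspects the PATH"
  end

  def self.run(*)
    raise NotImplementedError, "real call spawns a subprocess"
  end
end

# Composite helper: one call replaces two (or more) levels of nested stubs,
# keeping test bodies flat and away from the >3-levels red flag.
def with_stubbed_runner(result)
  Runner.stub(:available?, true) do
    Runner.stub(:run, result) do
      yield
    end
  end
end

# Usage: the stubbed boundary returns canned data; no process is spawned.
output = with_stubbed_runner("ok") { Runner.run("cmd") if Runner.available? }
```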
+ ## 3. Behavior vs Implementation
+
+ Do tests verify behavior, not implementation details?
+
+ **Checklist**:
+ - [ ] Tests assert on OUTPUT, not method calls
+ - [ ] Tests survive internal refactoring
+ - [ ] Mock expectations only for side-effect methods
+ - [ ] No testing of private method order
+
+ **Example Check**:
+ ```ruby
+ # BAD: Tests implementation
+ mock.verify # "Was X called?"
+
+ # GOOD: Tests behavior
+ assert_equal expected, result.output
+ ```
+
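Expanded into a runnable sketch (with a hypothetical `Report` class), the contrast looks like this — the test pins only observable output, so the private pipeline can be renamed or inlined freely:

```ruby
# Hypothetical class whose private pipeline we are free to refactor.
class Report
  def initialize(rows)
    @rows = rows
  end

  # Public behavior: the only thing tests should pin down.
  def output
    formatted.join(",")
  end

  private

  # Implementation detail; renaming or inlining this must not break tests.
  def formatted
    @rows.map(&:upcase)
  end
end

report = Report.new(%w[a b])

# GOOD: assert on observable output; this survives internal refactoring.
result = report.output

# BAD (anti-pattern, for contrast): expecting that #formatted was invoked,
# or in what order, couples the test to internals and breaks on refactor.
```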
+ ## 4. Performance
+
+ Will tests run fast enough?
+
+ **Checklist**:
+ - [ ] Profiled with `ace-test --profile 5`
+ - [ ] Unit tests <100ms each
+ - [ ] No `sleep` calls without stubbing
+ - [ ] No subprocess calls without stubbing
+ - [ ] Cache pre-warming if needed
+
+ **Performance Check**:
+ ```bash
+ ace-test {{package}} --profile 10
+ # Verify no new tests >100ms
+ ```
+
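The "no `sleep` calls without stubbing" item can be illustrated with a small sketch (hypothetical `Retrier` class): give the wait a named boundary method, then stub that method so a retry test exercises the loop logic without running the clock.

```ruby
require "minitest/mock"

# Hypothetical retry loop; the real wait would cost one second per attempt.
class Retrier
  def wait
    sleep(1) # the real pause: exactly what makes a unit test slow
  end

  def call(attempts)
    attempts.times { |i| wait unless i == attempts - 1 }
    :done
  end
end

retrier = Retrier.new
waits = 0

# Stub the wait boundary: the loop logic still runs, the clock does not.
result = retrier.stub(:wait, -> { waits += 1 }) { retrier.call(3) }
```

With three attempts the loop waits twice and returns immediately, keeping the test well under the unit-layer threshold.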
+ ## 5. Coverage Quality
+
+ Do tests actually catch bugs?
+
+ **Checklist**:
+ - [ ] Happy path tested
+ - [ ] Error cases tested
+ - [ ] Edge cases tested (nil, empty, boundaries)
+ - [ ] Test fails when code is broken (try breaking it)
+
+ **Negative Test Check**:
+ - [ ] At least one error scenario tested
+ - [ ] Invalid input handling verified
+ - [ ] Exception/error messages checked
+
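A minimal negative-path sketch (hypothetical `Parser` class) showing what the checklist asks for: the error scenario pins both the exception class and its message, not just "something raised".

```ruby
# Hypothetical parser used to show one error-scenario check.
class Parser
  def parse(input)
    raise ArgumentError, "input must not be empty" if input.nil? || input.empty?
    input.split(",")
  end
end

parser = Parser.new

# Happy path: valid input yields parsed fields.
parsed = parser.parse("a,b")

# Negative path: capture the error and check class and message.
error = begin
  parser.parse("")
  nil
rescue ArgumentError => e
  e
end
```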
+ ## 6. E2E Specific (if applicable)
+
+ For E2E tests in TS-format (`TC-*.tc.md`):
+
+ **Checklist**:
+ - [ ] PASS/FAIL assertions are explicit
+ - [ ] File paths discovered at runtime, not hardcoded
+ - [ ] Error test cases included (not just happy path)
+ - [ ] Exit codes verified for error scenarios
+ - [ ] Cleanup documented
+ - [ ] Prerequisites listed
+
+ ## 7. Test Organization
+
+ Is the test well-structured?
+
+ **Checklist**:
+ - [ ] Test file in correct directory (atoms/molecules/organisms/e2e)
+ - [ ] Test name describes behavior (`test_returns_error_for_invalid_input`)
+ - [ ] Arrange-Act-Assert pattern followed
+ - [ ] No test interdependencies
+ - [ ] Fixtures in `test/fixtures/` if shared
+
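The behavior-describing name and Arrange-Act-Assert items above might look like this in practice — a plain Ruby sketch with a hypothetical `Cart` class:

```ruby
# Hypothetical class under test.
class Cart
  def initialize
    @items = []
  end

  def add(price)
    @items << price
  end

  def total
    @items.sum
  end
end

# test_returns_sum_of_item_prices
# Arrange: build the object under test with known inputs.
cart = Cart.new
cart.add(300)
cart.add(450)

# Act: perform exactly one observable action.
total = cart.total

# Assert: one behavioral expectation on the output.
# (In Minitest this would be: assert_equal 750, cart.total)
```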
+ ---
+
+ ## Verdict
+
+ - [ ] **Approve**: Tests are well-designed and performant
+ - [ ] **Request Changes**: Issues identified above
+ - [ ] **Needs Discussion**: Architectural concerns
+
+ **Comments**:
+
+ {{reviewer_comments}}
+
+ ---
+
+ ## Quick Reference
+
+ ### Performance Thresholds
+
+ | Layer | Target | Warning | Critical |
+ |-------|--------|---------|----------|
+ | Unit (atoms) | <10ms | >50ms | >100ms |
+ | Unit (molecules) | <50ms | >100ms | >200ms |
+ | Integration | <500ms | >1s | >2s |
+
+ ### Stub the Boundary Pattern
+
+ ```ruby
+ # Always stub availability check if stubbing execution
+ Runner.stub(:available?, true) do
+   Runner.stub(:run, result) do
+     subject.process
+   end
+ end
+ ```
data/handbook/workflow-instructions/test/analyze-failures.wf.md
@@ -0,0 +1,120 @@
+ ---
+ doc-type: workflow
+ title: Analyze Test Failures Workflow
+ purpose: analyze-test-failures workflow instruction
+ ace-docs:
+   last-updated: 2026-02-24
+   last-checked: 2026-03-21
+ ---
+
+ # Analyze Test Failures Workflow
+
+ ## Goal
+
+ Analyze failing automated tests and classify each failure before any fix is applied.
+
+ This workflow produces a decision report that answers:
+ - Is this failure caused by implementation code?
+ - Is this failure caused by test code/spec?
+ - Is this failure caused by test infrastructure/environment?
+
+ ## Hard Rules
+
+ - Do not edit application code or test files in this workflow.
+ - Do not run formatting/autofix commands in this workflow.
+ - This workflow ends with an analysis report only.
+ - Do not ask the user where/how to fix during this workflow; decide from evidence.
+
+ ## Prerequisites
+
+ - Failing tests have already been executed
+ - Failure output is available (logs, stack traces, failing test list)
+ - Project context can be loaded
+
+ ## Project Context Loading
+
+ - Read and follow: `ace-bundle wfi://bundle`
+ - Check recent changes: `git log --oneline -10`
+
+ ## Classification Categories
+
+ Use exactly one category per failing test:
+
+ 1. `implementation-bug`
+    - Product/runtime behavior is wrong
+    - Test expectation is valid
+
+ 2. `test-defect`
+    - Test assertion/setup/fixture is stale or incorrect
+    - Product behavior appears correct for current requirements
+
+ 3. `test-infrastructure`
+    - Environment/tooling/framework/configuration/isolation issue
+    - Failure is not specific to business behavior
+
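The one-category, one-confidence contract can be encoded as a small validating record — a sketch only, where `FailureRecord` and its fields are hypothetical names, not part of ace-test:

```ruby
# Allowed values from the workflow contract.
CATEGORIES  = %w[implementation-bug test-defect test-infrastructure].freeze
CONFIDENCES = %w[high medium low].freeze

# Hypothetical record: one category and one confidence per failing test.
FailureRecord = Struct.new(:test_id, :category, :confidence, keyword_init: true) do
  def validate!
    raise ArgumentError, "unknown category: #{category}" unless CATEGORIES.include?(category)
    raise ArgumentError, "unknown confidence: #{confidence}" unless CONFIDENCES.include?(confidence)
    self
  end

  # Medium/low confidence triggers an extra diagnostic read/search
  # before the classification becomes final.
  def needs_extra_diagnostic?
    confidence != "high"
  end
end

record = FailureRecord.new(
  test_id:    "test/unit/foo_test.rb:test_parses_input",
  category:   "implementation-bug",
  confidence: "medium"
).validate!
```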
+ ## Analysis Procedure
+
+ 1. Collect failing tests
+    - Identify failing file/test IDs from latest run output
+    - Capture exact error signatures
+
+ 2. Gather evidence per failure
+    - Primary stacktrace line
+    - Related test file and assertion context
+    - Related implementation file/entrypoint context
+    - Environment/tooling context (timeouts, missing deps, DB state, network/mocks)
+
+ 3. Classify each failure
+    - Assign one category (`implementation-bug`, `test-defect`, `test-infrastructure`)
+    - Add confidence: `high`, `medium`, or `low`
+    - Record one disconfirming check (what could prove this classification wrong)
+    - If confidence is `medium` or `low`, run at least one additional diagnostic read/search before the final decision
+
+ 4. Determine fix target
+    - `implementation code`
+    - `test code`
+    - `test infrastructure`
+
+ 5. Choose autonomous fix decision
+    - Select a single primary fix action per failure
+    - Provide concrete file targets in priority order
+    - Define explicit no-touch boundaries
+    - Do not emit option lists that require user selection
+
+ ## Required Output Contract
+
+ Produce this section before exiting:
+
+ ```markdown
+ ## Failure Analysis Report
+
+ | Failure | Category | Evidence | Fix Target | Fix Target Layer | Primary Candidate Files | Fallback Candidate Files | Do-Not-Touch Boundaries | Confidence | Disconfirming Check |
+ |---|---|---|---|---|---|---|---|---|---|
+ | path/to/test_file.rb:TestName | implementation-bug | stacktrace + behavior mismatch summary | implementation code | implementation | app/service.rb, app/model.rb | test/integration/foo_test.rb | test/e2e/** | high | run related tests after patch |
+ ```
+
+ Then include:
+
+ ```markdown
+ ## Fix Decisions
+ - First item to fix: ...
+ - Chosen fix decision: ...
+ - Why this target first: ...
+
+ ### Execution Plan Input
+ - Primary failure to fix first: ...
+ - Why first: ...
+ - Required verification commands: ...
+ - Expected pass criteria per command: ...
+ ```
+
+ ## Success Criteria
+
+ - Every failing test is classified
+ - Evidence is concrete and traceable
+ - Fix target is explicit per failure
+ - Fix target files are explicit per failure (primary + fallback)
+ - No-touch boundaries are explicit per failure
+ - A single autonomous chosen fix decision is present per failure
+ - A prioritized first failure is selected
+ - No code/test edits were made in this workflow