RubyGems - ace-test - Versions diffs - 0.6.0 - Mend

ace-test 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (67)

checksums.yaml +7 -0
data/.ace-defaults/nav/protocols/agent-sources/ace-test.yml +19 -0
data/.ace-defaults/nav/protocols/guide-sources/ace-test.yml +19 -0
data/.ace-defaults/nav/protocols/tmpl-sources/ace-test.yml +11 -0
data/.ace-defaults/nav/protocols/wfi-sources/ace-test.yml +19 -0
data/CHANGELOG.md +169 -0
data/LICENSE +21 -0
data/README.md +40 -0
data/Rakefile +12 -0
data/handbook/agents/mock.ag.md +164 -0
data/handbook/agents/profile-tests.ag.md +132 -0
data/handbook/agents/test.ag.md +99 -0
data/handbook/guides/SUMMARY.md +95 -0
data/handbook/guides/embedded-testing-guide.g.md +261 -0
data/handbook/guides/mocking-patterns.g.md +464 -0
data/handbook/guides/quick-reference.g.md +46 -0
data/handbook/guides/test-driven-development-cycle/meta-documentation.md +26 -0
data/handbook/guides/test-driven-development-cycle/ruby-application.md +18 -0
data/handbook/guides/test-driven-development-cycle/ruby-gem.md +19 -0
data/handbook/guides/test-driven-development-cycle/rust-cli.md +18 -0
data/handbook/guides/test-driven-development-cycle/rust-wasm-zed.md +19 -0
data/handbook/guides/test-driven-development-cycle/typescript-nuxt.md +18 -0
data/handbook/guides/test-driven-development-cycle/typescript-vue.md +19 -0
data/handbook/guides/test-layer-decision.g.md +261 -0
data/handbook/guides/test-mocking-patterns.g.md +414 -0
data/handbook/guides/test-organization.g.md +140 -0
data/handbook/guides/test-performance.g.md +353 -0
data/handbook/guides/test-responsibility-map.g.md +220 -0
data/handbook/guides/test-review-checklist.g.md +231 -0
data/handbook/guides/test-suite-health.g.md +337 -0
data/handbook/guides/testable-code-patterns.g.md +315 -0
data/handbook/guides/testing/ruby-rspec-config-examples.md +120 -0
data/handbook/guides/testing/ruby-rspec.md +87 -0
data/handbook/guides/testing/rust.md +52 -0
data/handbook/guides/testing/test-maintenance.md +364 -0
data/handbook/guides/testing/typescript-bun.md +47 -0
data/handbook/guides/testing/vue-firebase-auth.md +546 -0
data/handbook/guides/testing/vue-vitest.md +236 -0
data/handbook/guides/testing-philosophy.g.md +82 -0
data/handbook/guides/testing-strategy.g.md +151 -0
data/handbook/guides/testing-tdd-cycle.g.md +146 -0
data/handbook/guides/testing.g.md +170 -0
data/handbook/skills/as-test-create-cases/SKILL.md +24 -0
data/handbook/skills/as-test-fix/SKILL.md +26 -0
data/handbook/skills/as-test-improve-coverage/SKILL.md +22 -0
data/handbook/skills/as-test-optimize/SKILL.md +34 -0
data/handbook/skills/as-test-performance-audit/SKILL.md +34 -0
data/handbook/skills/as-test-plan/SKILL.md +34 -0
data/handbook/skills/as-test-review/SKILL.md +34 -0
data/handbook/skills/as-test-verify-suite/SKILL.md +45 -0
data/handbook/templates/e2e-sandbox-checklist.template.md +289 -0
data/handbook/templates/test-case.template.md +56 -0
data/handbook/templates/test-performance-audit.template.md +132 -0
data/handbook/templates/test-responsibility-map.template.md +92 -0
data/handbook/templates/test-review-checklist.template.md +163 -0
data/handbook/workflow-instructions/test/analyze-failures.wf.md +120 -0
data/handbook/workflow-instructions/test/create-cases.wf.md +675 -0
data/handbook/workflow-instructions/test/fix.wf.md +120 -0
data/handbook/workflow-instructions/test/improve-coverage.wf.md +370 -0
data/handbook/workflow-instructions/test/optimize.wf.md +368 -0
data/handbook/workflow-instructions/test/performance-audit.wf.md +17 -0
data/handbook/workflow-instructions/test/plan.wf.md +323 -0
data/handbook/workflow-instructions/test/review.wf.md +16 -0
data/handbook/workflow-instructions/test/verify-suite.wf.md +343 -0
data/lib/ace/test/version.rb +7 -0
data/lib/ace/test.rb +10 -0
metadata +152 -0

data/handbook/guides/test-review-checklist.g.md ADDED

@@ -0,0 +1,231 @@
+---
+doc-type: guide
+title: Test Review Checklist Guide
+purpose: Test PR review checklist
+ace-docs:
+  last-updated: 2026-02-19
+  last-checked: 2026-03-21
+---
+# Test Review Checklist Guide
+## Goal
+Quick checklist for reviewing PRs that add or modify tests. Ensures tests are:
+- At the correct layer
+- Properly stubbed
+- Testing behavior (not implementation)
+- Fast enough
+- Actually catching bugs
+## The Quick Check (30 seconds)
+Before deep review, check:
+1. **Layer**: Is test in correct directory? (atoms/molecules/organisms/e2e)
+2. **Speed**: Run `ace-test <package> --profile 5` - any >100ms?
+3. **I/O**: Search for `Open3`, `system(`, `File.` in new test code
+If any check fails → detailed review needed.
+## Detailed Checklist
+### 1. Layer Appropriateness
+| Check | Pass | Fail |
+|-------|------|------|
+| Unit tests have NO real I/O | ✓ | Subprocess, network, or filesystem calls |
+| Integration tests stub external deps | ✓ | Real API or subprocess calls |
+| E2E tests are in `test/e2e/TS-*/` | ✓ | E2E behavior in unit test file |
+| No flag permutations in E2E | ✓ | Multiple E2E tests for CLI flags |
+| Max ONE CLI parity test per integration file | ✓ | Multiple subprocess tests |
+**Red Flag**: Test name says "unit" but takes >100ms
+### 2. Stubbing Quality
+| Check | Pass | Fail |
+|-------|------|------|
+| Boundary methods stubbed | ✓ `available?` stubbed | Only `run` stubbed |
+| No zombie mocks | ✓ Stub targets exist | Stub target renamed/removed |
+| Mock data is realistic | ✓ From snapshot/schema | Invented data |
+| Composite helpers used | ✓ Single helper | >3 levels of nesting |
+**Red Flag**: Deep nesting without composite helper
+```ruby
+# BAD: 5 levels of nesting
+mock_a do
+  mock_b do
+    mock_c do
+      mock_d do
+        test_code
+      end
+    end
+  end
+end
+# GOOD: Composite helper
+with_mock_context(a: x, b: y) do
+  test_code
+end
+```
+### 3. Behavior vs Implementation
+| Check | Pass | Fail |
+|-------|------|------|
+| Tests assert on OUTPUT | ✓ `assert_equal expected, result` | Only `mock.verify` |
+| Tests survive refactoring | ✓ Tests behavior | Tests method names |
+| Mock expectations only for side-effects | ✓ `Git.commit` | `Parser.parse` |
+**Red Flag**: Test only has `mock.verify` without output assertions
+```ruby
+# BAD: Only verifies mock was called
+mock.expect(:process, true, [data])
+subject.call(data)
+mock.verify  # What was the result?
+# GOOD: Verifies actual behavior
+result = subject.call(data)
+assert_equal expected_output, result.output
+assert result.success?
+```
+### 4. Performance
+| Check | Pass | Fail |
+|-------|------|------|
+| Unit tests <100ms | ✓ All fast | Any >100ms |
+| No unstubbed `sleep` | ✓ `Kernel.stub(:sleep, nil)` | Real sleep in retry tests |
+| No real subprocess | ✓ Stubbed | `Open3.capture3` without stub |
+| Cache pre-warming if needed | ✓ In test_helper | Cache miss on every test |
+**Quick Check**:
+```bash
+ace-test <package> --profile 10
+# All unit tests should be <100ms
+```
+### 5. Coverage Quality
+| Check | Pass | Fail |
+|-------|------|------|
+| Happy path tested | ✓ | Missing |
+| Error cases tested | ✓ At least one | None |
+| Edge cases tested | ✓ nil, empty, boundaries | Only happy path |
+| Test actually fails when broken | ✓ Try breaking it | Always passes |
+**Verification**: Temporarily break the code; the test should fail.
+### 6. Test Base Class Check
+- [ ] All tests in `test/molecules/` inherit from `<Package>Test` base class
+- [ ] NOT directly from `Minitest::Test`
+**Red Flag**: Test using `Minitest::Test` without access to package helpers
+```ruby
+# BAD: Missing package helpers (stub_prompt_path, shared temp dir, etc.)
+class FeedbackExtractorTest < Minitest::Test
+  # No access to stub_prompt_path, must manually stub
+end
+# GOOD: Has access to all package test helpers
+class FeedbackExtractorTest < AceReviewTest
+  # Can use stub_prompt_path(@extractor), shared temp dir, etc.
+end
+```
+### 7. E2E Specific
+For tests in `test/e2e/TS-*/`:
+| Check | Pass | Fail |
+|-------|------|------|
+| Explicit PASS/FAIL assertions | ✓ `&& echo PASS \|\| echo FAIL` | Implicit success |
+| Paths discovered at runtime | ✓ `find`, `ls` | Hardcoded paths |
+| Error test cases included | ✓ Wrong args, missing files | Only happy path |
+| Exit codes verified | ✓ `[ $? -eq 1 ]` | Exit code ignored |
+| Cleanup documented | ✓ Cleanup section | No cleanup |
+## Common Review Comments
+### Performance Issues
+> "This test takes 150ms. Please stub the availability check:
+> ```ruby
+> Runner.stub(:available?, true) do
+>   # existing test code
+> end
+> ```"
+### Wrong Layer
+> "This test uses real subprocess calls but is in `test/atoms/`. Either:
+> - Stub the subprocess and keep in atoms
+> - Move to `test/e2e/` as an E2E test"
+### Implementation Testing
+> "This test only verifies the mock was called. Please add assertion on the actual result:
+> ```ruby
+> result = subject.call(input)
+> assert_equal expected_output, result.value
+> ```"
+### Missing Error Cases
+> "Please add at least one error case test. For example:
+> ```ruby
+> def test_raises_on_invalid_input
+>   assert_raises(ValidationError) { subject.call(nil) }
+> end
+> ```"
+## Quick Reference Card
+### Performance Thresholds
+| Layer | Target | Warning | Critical |
+|-------|--------|---------|----------|
+| Unit (atoms) | <10ms | >50ms | >100ms |
+| Unit (molecules) | <50ms | >100ms | >200ms |
+| Integration | <500ms | >1s | >2s |
+### Stub the Boundary
+```ruby
+# Always stub availability if stubbing execution
+Runner.stub(:available?, true) do
+  Runner.stub(:run, result) do
+    subject.process
+  end
+end
+```
+### Behavior Assertion Pattern
+```ruby
+# Arrange
+input = build_test_input
+# Act
+result = subject.call(input)
+# Assert (behavior, not implementation)
+assert result.success?
+assert_equal expected_output, result.value
+assert_nil result.error
+```
+## Template
+Use `templates/test-review-checklist.template.md` for formal PR reviews.
+## See Also
+- [Test Layer Decision](guide://test-layer-decision)
+- [Test Mocking Patterns](guide://test-mocking-patterns)
+- [Test Performance](guide://test-performance)
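
The I/O step of the Quick Check in the guide above (searching new test code for `Open3`, `system(`, and `File.`) could be automated with a few lines of Ruby. This is only a minimal sketch, not part of ace-test: the `io_red_flags` helper and the `IO_PATTERNS` constant are hypothetical names, and the sample source is invented for illustration.

```ruby
# Hypothetical helper (not part of ace-test): flags lines of test source
# that mention real I/O. The patterns mirror the Quick Check list:
# Open3, system(, File.
IO_PATTERNS = [/\bOpen3\b/, /\bsystem\(/, /\bFile\./].freeze

# Returns an array of [line_number, stripped_line] pairs for suspicious lines.
def io_red_flags(source)
  source.each_line.with_index(1).filter_map do |line, number|
    [number, line.strip] if IO_PATTERNS.any? { |pattern| pattern.match?(line) }
  end
end

# Invented sample test code for illustration only.
sample = <<~RUBY
  def test_reads_config
    content = File.read("config.yml")
    assert_equal "ok", parse(content)
  end
RUBY

io_red_flags(sample).each { |number, line| puts "#{number}: #{line}" }
```

A match is only a prompt for closer review: a flagged line may sit inside a stub block, so the script narrows down where to look rather than deciding pass or fail.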

data/handbook/guides/test-suite-health.g.md ADDED

@@ -0,0 +1,337 @@
+---
+doc-type: guide
+title: Test Suite Health Guide
+purpose: Maintain healthy test suites through measurement and continuous improvement
+ace-docs:
+  last-updated: 2026-02-22
+  last-checked: 2026-03-21
+---
+# Test Suite Health Guide
+## Goal
+Maintain test suites that are:
+- **Fast**: Quick feedback loop for developers
+- **Reliable**: No flaky tests, deterministic results
+- **Effective**: Actually catch bugs before production
+- **Maintainable**: Easy to update as code evolves
+## Health Metrics
+### Performance Metrics
+| Metric | Target | Warning | Critical |
+|--------|--------|---------|----------|
+| Unit test (atoms) | <10ms | >50ms | >100ms |
+| Unit test (molecules) | <50ms | >100ms | >200ms |
+| Integration test | <500ms | >1s | >2s |
+| Full package suite | <30s | >60s | >120s |
+| Full monorepo suite | <5min | >10min | >20min |
+**Measurement**: `ace-test --profile 20`
+### Reliability Metrics
+| Metric | Target | Warning | Critical |
+|--------|--------|---------|----------|
+| Flake rate | <1% | >2% | >5% |
+| Test determinism | 100% | <99% | <95% |
+| CI pass rate | >98% | <95% | <90% |
+**Measurement**: Track CI runs over time, re-run suspicious tests
+### Effectiveness Metrics
+| Metric | Target | Interpretation |
+|--------|--------|----------------|
+| Defect Removal Efficiency | >85% | `bugs_caught / (bugs_caught + escaped_bugs)` |
+| Code coverage (critical paths) | >80% | Focus on business logic, not getters |
+| Escaped defects | Trending down | Bugs found in production |
+| Mean Time to Detect | <1 day | Time from bug introduction to test failure |
+**Measurement**: Track bugs in issue tracker, label with "escaped-defect"
+## Periodic Audit Schedule
+### Weekly: Changed Package Review
+**When**: After significant changes to a package
+**Actions**:
+1. Run `ace-test <package> --profile 10`
+2. Check for new tests exceeding thresholds
+3. Verify no zombie mocks introduced
+**Trigger**: PR changes >100 lines in test files
+### Monthly: Full Suite Audit
+**When**: First week of each month
+**Actions**:
+1. Run `ace-test-suite` with profiling
+2. Generate health report
+3. Create tasks for issues found
+4. Compare metrics to previous month
+**Checklist**:
+- [ ] All packages under 30s
+- [ ] No unit test >100ms
+- [ ] No flaky tests (run 3x)
+- [ ] Coverage not decreased
+### Quarterly: Deep Review
+**When**: End of each quarter
+**Actions**:
+1. Review all E2E tests for relevance
+2. Check mock data for API drift
+3. Analyze escaped defects pattern
+4. Update testing guides if needed
+**Checklist**:
+- [ ] E2E tests still match user workflows
+- [ ] Mock snapshots updated from real APIs
+- [ ] Escaped defects analyzed, regression tests added
+- [ ] Guides reflect current practices
+## CI Integration
+### Performance Gates
+Add to `.github/workflows/test.yml`:
+```yaml
+- name: Run tests with profiling
+  run: |
+    ace-test --profile 20 2>&1 | tee test-profile.txt
+- name: Check performance thresholds
+  run: |
+    # Extract slowest tests
+    slow_tests=$(grep -E "^\s+[0-9]+\.\s+" test-profile.txt | \
+      awk '$NF ~ /0\.[1-9][0-9][0-9]s|[1-9][0-9]*\.[0-9]+s/ {print}')
+    if [ -n "$slow_tests" ]; then
+      echo "::warning::Tests exceeding 100ms threshold:"
+      echo "$slow_tests"
+      # Count critical violations (>200ms)
+      critical=$(echo "$slow_tests" | awk '$NF ~ /0\.[2-9][0-9][0-9]s|[1-9][0-9]*\.[0-9]+s/ {count++} END {print count+0}')
+      if [ "$critical" -gt 0 ]; then
+        echo "::error::$critical tests exceed 200ms critical threshold"
+        exit 1
+      fi
+    fi
+```
+### Flakiness Detection
+```yaml
+- name: Run tests multiple times for flakiness
+  run: |
+    for i in 1 2 3; do
+      echo "=== Run $i ==="
+      ace-test-suite --quiet || echo "FAILED_RUN_$i"
+    done | tee runs.txt
+    failures=$(grep -c "FAILED_RUN" runs.txt || true)
+    if [ "$failures" -gt 0 ] && [ "$failures" -lt 3 ]; then
+      echo "::error::Flaky tests detected ($failures/3 runs failed)"
+      exit 1
+    fi
+```
+### Coverage Tracking
+```yaml
+- name: Generate coverage report
+  run: |
+    COVERAGE=true ace-test-suite
+- name: Check coverage threshold
+  run: |
+    coverage=$(jq '.metrics.covered_percent' coverage/coverage.json)
+    if (( $(echo "$coverage < 80" | bc -l) )); then
+      echo "::error::Coverage $coverage% below 80% threshold"
+      exit 1
+    fi
+```
+## Pre-commit Hook
+Add to `.git/hooks/pre-commit`:
+```bash
+#!/bin/bash
+# Get changed packages
+changed_packages=$(git diff --cached --name-only | \
+  grep "^ace-" | cut -d/ -f1 | sort -u)
+if [ -z "$changed_packages" ]; then
+  exit 0
+fi
+echo "Running tests for changed packages..."
+for pkg in $changed_packages; do
+  echo "Testing $pkg..."
+  # Run with profile, fail on slow tests
+  output=$(ace-test "$pkg" --profile 5 2>&1)
+  status=$?
+  if [ $status -ne 0 ]; then
+    echo "Tests failed in $pkg"
+    echo "$output"
+    exit 1
+  fi
+  # Check for slow tests
+  slow=$(echo "$output" | grep -E "0\.[1-9][0-9][0-9]s|[1-9][0-9]*\.[0-9]+s" | head -3)
+  if [ -n "$slow" ]; then
+    echo "Warning: Slow tests in $pkg:"
+    echo "$slow"
+  fi
+done
+echo "All tests passed!"
+```
+## Troubleshooting Common Issues
+### Issue: Random Test Failures
+**Symptoms**: Different tests fail on different runs
+**Diagnosis**:
+1. Run tests 5x, note which fail
+2. Check for shared mutable state
+3. Look for test order dependencies
+**Common Causes**:
+- Cache invalidation between tests
+- Global state not reset
+- Time-dependent assertions
+**Fix**:
+- Pre-warm caches at test startup
+- Reset state in setup, not teardown
+- Use time stubs
+### Issue: Sudden Slowdown
+**Symptoms**: Test suite time increased significantly
+**Diagnosis**:
+1. `ace-test --profile 20` before and after
+2. Compare slow test lists
+3. Check for removed stubs
+**Common Causes**:
+- Zombie mocks (stub no longer matches code)
+- New tests without proper stubbing
+- Dependency upgrade with slower behavior
+**Fix**:
+- Update stubs to match new code paths
+- Add missing stubs
+- Profile dependency to identify bottleneck
+### Issue: Tests Pass Locally, Fail in CI
+**Symptoms**: Green locally, red in CI
+**Diagnosis**:
+1. Check CI environment differences
+2. Look for timing-sensitive tests
+3. Check for missing test dependencies
+**Common Causes**:
+- Different tool versions
+- Network/filesystem differences
+- Race conditions more visible on CI
+**Fix**:
+- Pin tool versions in CI
+- Add explicit waits or retries
+- Use mocks for environment-dependent behavior
+### Issue: High Escaped Defect Rate
+**Symptoms**: Bugs reaching production despite tests
+**Diagnosis**:
+1. Analyze escaped defects by category
+2. Check if tests exist for those paths
+3. Review test assertions
+**Common Causes**:
+- Missing edge case coverage
+- Tests check implementation, not behavior
+- Integration gaps (unit tests pass, system fails)
+**Fix**:
+- Add regression tests for each escaped defect
+- Review test assertions for completeness
+- Add integration/E2E tests for critical paths
+## Health Report Template
+Generate monthly with `/ace-test-verify-suite`:
+```markdown
+# Test Suite Health Report
+**Date**: YYYY-MM-DD
+**Packages**: N packages analyzed
+**Total Tests**: N tests
+## Performance Summary
+| Package | Tests | Time | Slowest Test |
+|---------|-------|------|--------------|
+| ace-lint | 45 | 1.2s | test_complex_validation (89ms) |
+| ace-git | 78 | 2.1s | test_diff_generation (156ms) |
+### Threshold Violations
+- [ ] ace-git: 2 tests >100ms (warning)
+- [x] ace-lint: All tests <100ms
+## Reliability Summary
+| Metric | Value | Status |
+|--------|-------|--------|
+| Flake rate | 0.5% | OK |
+| CI pass rate | 99.2% | OK |
+### Flaky Tests Identified
+- None this month
+## Effectiveness Summary
+| Metric | Value | Trend |
+|--------|-------|-------|
+| Escaped defects | 2 | Down from 4 |
+| DRE | 87% | Up from 83% |
+### Escaped Defects This Month
+1. #456 - Config parsing edge case
+2. #461 - CLI exit code mismatch
+## Action Items
+1. [ ] Add regression test for #456
+2. [ ] Investigate slow test in ace-git
+3. [ ] Update mock data for GitHub API
+```
+## See Also
+- [Test Layer Decision](guide://test-layer-decision) - Where to test
+- [Test Mocking Patterns](guide://test-mocking-patterns) - How to mock
+- [Test Performance](guide://test-performance) - Optimization techniques
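
The per-test thresholds from the Performance Metrics table and the Defect Removal Efficiency formula from the Effectiveness Metrics table in the guide above can be expressed as a short Ruby sketch. The helper names (`classify_duration`, `defect_removal_efficiency`) and the `THRESHOLDS_MS` constant are hypothetical and not part of ace-test. The sketch also assumes that any duration at or below the warning threshold counts as `:ok`, since the table leaves the band between target and warning unlabelled.

```ruby
# Thresholds in milliseconds, copied from the Performance Metrics table.
THRESHOLDS_MS = {
  atoms:       { warning: 50,   critical: 100 },
  molecules:   { warning: 100,  critical: 200 },
  integration: { warning: 1000, critical: 2000 }
}.freeze

# Hypothetical helper: classifies one test duration for a given layer.
# Assumption: anything at or below the warning threshold is reported as :ok.
def classify_duration(layer, duration_ms)
  limits = THRESHOLDS_MS.fetch(layer)
  return :critical if duration_ms > limits[:critical]
  return :warning if duration_ms > limits[:warning]

  :ok
end

# Defect Removal Efficiency as a percentage:
# bugs_caught / (bugs_caught + escaped_bugs)
# Returns nil when no bugs were recorded, to avoid dividing by zero.
def defect_removal_efficiency(bugs_caught, escaped_bugs)
  total = bugs_caught + escaped_bugs
  return nil if total.zero?

  (100.0 * bugs_caught / total).round(1)
end

# Illustrative numbers only.
puts classify_duration(:molecules, 156)   # prints "warning"
puts defect_removal_efficiency(17, 3)     # prints "85.0"
```

Keeping the thresholds in one hash means a change to the table only needs to be mirrored in one place.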