RubyGems - ace-test - Versions diffs - 0.6.0 - Mend

ace-test 0.6.0

Files changed (67)

checksums.yaml +7 -0
data/.ace-defaults/nav/protocols/agent-sources/ace-test.yml +19 -0
data/.ace-defaults/nav/protocols/guide-sources/ace-test.yml +19 -0
data/.ace-defaults/nav/protocols/tmpl-sources/ace-test.yml +11 -0
data/.ace-defaults/nav/protocols/wfi-sources/ace-test.yml +19 -0
data/CHANGELOG.md +169 -0
data/LICENSE +21 -0
data/README.md +40 -0
data/Rakefile +12 -0
data/handbook/agents/mock.ag.md +164 -0
data/handbook/agents/profile-tests.ag.md +132 -0
data/handbook/agents/test.ag.md +99 -0
data/handbook/guides/SUMMARY.md +95 -0
data/handbook/guides/embedded-testing-guide.g.md +261 -0
data/handbook/guides/mocking-patterns.g.md +464 -0
data/handbook/guides/quick-reference.g.md +46 -0
data/handbook/guides/test-driven-development-cycle/meta-documentation.md +26 -0
data/handbook/guides/test-driven-development-cycle/ruby-application.md +18 -0
data/handbook/guides/test-driven-development-cycle/ruby-gem.md +19 -0
data/handbook/guides/test-driven-development-cycle/rust-cli.md +18 -0
data/handbook/guides/test-driven-development-cycle/rust-wasm-zed.md +19 -0
data/handbook/guides/test-driven-development-cycle/typescript-nuxt.md +18 -0
data/handbook/guides/test-driven-development-cycle/typescript-vue.md +19 -0
data/handbook/guides/test-layer-decision.g.md +261 -0
data/handbook/guides/test-mocking-patterns.g.md +414 -0
data/handbook/guides/test-organization.g.md +140 -0
data/handbook/guides/test-performance.g.md +353 -0
data/handbook/guides/test-responsibility-map.g.md +220 -0
data/handbook/guides/test-review-checklist.g.md +231 -0
data/handbook/guides/test-suite-health.g.md +337 -0
data/handbook/guides/testable-code-patterns.g.md +315 -0
data/handbook/guides/testing/ruby-rspec-config-examples.md +120 -0
data/handbook/guides/testing/ruby-rspec.md +87 -0
data/handbook/guides/testing/rust.md +52 -0
data/handbook/guides/testing/test-maintenance.md +364 -0
data/handbook/guides/testing/typescript-bun.md +47 -0
data/handbook/guides/testing/vue-firebase-auth.md +546 -0
data/handbook/guides/testing/vue-vitest.md +236 -0
data/handbook/guides/testing-philosophy.g.md +82 -0
data/handbook/guides/testing-strategy.g.md +151 -0
data/handbook/guides/testing-tdd-cycle.g.md +146 -0
data/handbook/guides/testing.g.md +170 -0
data/handbook/skills/as-test-create-cases/SKILL.md +24 -0
data/handbook/skills/as-test-fix/SKILL.md +26 -0
data/handbook/skills/as-test-improve-coverage/SKILL.md +22 -0
data/handbook/skills/as-test-optimize/SKILL.md +34 -0
data/handbook/skills/as-test-performance-audit/SKILL.md +34 -0
data/handbook/skills/as-test-plan/SKILL.md +34 -0
data/handbook/skills/as-test-review/SKILL.md +34 -0
data/handbook/skills/as-test-verify-suite/SKILL.md +45 -0
data/handbook/templates/e2e-sandbox-checklist.template.md +289 -0
data/handbook/templates/test-case.template.md +56 -0
data/handbook/templates/test-performance-audit.template.md +132 -0
data/handbook/templates/test-responsibility-map.template.md +92 -0
data/handbook/templates/test-review-checklist.template.md +163 -0
data/handbook/workflow-instructions/test/analyze-failures.wf.md +120 -0
data/handbook/workflow-instructions/test/create-cases.wf.md +675 -0
data/handbook/workflow-instructions/test/fix.wf.md +120 -0
data/handbook/workflow-instructions/test/improve-coverage.wf.md +370 -0
data/handbook/workflow-instructions/test/optimize.wf.md +368 -0
data/handbook/workflow-instructions/test/performance-audit.wf.md +17 -0
data/handbook/workflow-instructions/test/plan.wf.md +323 -0
data/handbook/workflow-instructions/test/review.wf.md +16 -0
data/handbook/workflow-instructions/test/verify-suite.wf.md +343 -0
data/lib/ace/test/version.rb +7 -0
data/lib/ace/test.rb +10 -0
metadata +152 -0

data/handbook/guides/test-layer-decision.g.md ADDED

@@ -0,0 +1,261 @@
+---
+doc-type: guide
+title: Test Layer Decision Guide
+purpose: Help developers and agents decide where to test each behavior
+ace-docs:
+  last-updated: 2026-02-23
+  last-checked: 2026-03-21
+---
+# Test Layer Decision Guide
+## Goal
+This guide helps you decide **where** to test each behavior. Placing tests at the wrong layer leads to slow feedback loops, brittle tests, or gaps in coverage.
+## The Testing Pyramid
+```
+        /\
+       /E2E\        10% - Critical user journeys
+      /------\
+     /  Integ \     20% - Component interactions
+    /----------\
+   /    Unit    \   70% - Pure logic, edge cases
+  /--------------\
+```
+| Layer | Target Time | What It Tests |
+|-------|-------------|---------------|
+| Unit (atoms) | <10ms | Pure functions, single responsibility |
+| Unit (molecules) | <50ms | Composed operations, controlled I/O |
+| Integration (organisms) | <500ms | Business logic orchestration |
+| E2E | Seconds | Critical user workflows, real dependencies |
+## Decision Matrix
+Use this matrix to decide where a test belongs:
+| Question | Unit | Integration | E2E |
+|----------|:----:|:-----------:|:---:|
+| Tests pure logic with no side effects? | **Yes** | - | - |
+| Tests data transformation? | **Yes** | - | - |
+| Tests component orchestration? | - | **Yes** | - |
+| Needs real filesystem operations? | No | Sometimes | **Yes** |
+| Needs real git repository? | No | Rarely | **Yes** |
+| Needs real subprocess execution? | **Never** | Stub | **Yes** |
+| Calls external APIs (GitHub, LLM)? | Mock | Mock | **Yes** |
+| Tests CLI argument parsing? | API | API | **Yes** |
+| Tests CLI output format? | - | 1 per file | **Yes** |
+| Tests error messages and exit codes? | API | API | **Yes** |
+| Tests tool installation/availability? | - | - | **Yes** |
+| Tests multi-step user workflow? | - | - | **Yes** |
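The matrix above can be approximated as a small lookup. This is a minimal Ruby sketch, assuming a hypothetical `suggest_layer` helper and hypothetical need names (neither is part of ace-test). It encodes only the rows where E2E is the sole bolded "Yes" and deliberately leaves out real filesystem access, which the matrix allows "Sometimes" at the integration layer.

```ruby
# Hypothetical helper, not part of ace-test: maps what a test needs to a layer.
# Needs that the matrix reserves for E2E (real filesystem is excluded on purpose,
# because integration tests may use temp directories).
E2E_ONLY_NEEDS = %i[real_git real_subprocess external_api
                    tool_installation multi_step_workflow].freeze

def suggest_layer(needs)
  return :e2e if E2E_ONLY_NEEDS.any? { |need| needs[need] }
  return :integration if needs[:orchestration]

  # Pure logic, data transformation, and anything else default to unit tests.
  :unit
end

unit_case        = suggest_layer(pure_logic: true)
integration_case = suggest_layer(orchestration: true)
e2e_case         = suggest_layer(orchestration: true, real_git: true)
```

A real-dependency need wins over orchestration, which mirrors the guide's advice to move real calls to E2E rather than keep them in integration tests.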
+## Layer Responsibilities
+### Unit Tests (atoms/molecules)
+**Purpose**: Verify individual functions work correctly in isolation.
+**Test these behaviors**:
+- Pure function logic (input → output)
+- Edge cases (empty input, nil, boundaries)
+- Error handling (invalid input, exceptions)
+- Data transformations
+- Configuration parsing
+- String/path manipulation
+**Stub everything external**:
+- Filesystem → use temp files or mocks
+- Subprocess → stub `Open3.capture3`, `system()`
+- Network → stub with WebMock
+- Git → use `MockGitRepo`
+- Time → stub `Time.now` if needed
+**Example**:
+```ruby
+# atoms/path_expander_test.rb
+def test_expands_home_directory
+  result = PathExpander.expand("~/config.yml")
+  assert_equal "/Users/test/config.yml", result
+end
+def test_returns_absolute_path_unchanged
+  result = PathExpander.expand("/absolute/path.yml")
+  assert_equal "/absolute/path.yml", result
+end
+```
+### Integration Tests (molecules/organisms)
+**Purpose**: Verify components work together correctly.
+**Test these behaviors**:
+- Data flow between modules
+- Error propagation across components
+- Configuration cascade resolution
+- Orchestration logic
+- ONE CLI parity test per file (verify CLI matches API)
+**Stub external dependencies**:
+- Real subprocess calls
+- External APIs
+- Slow operations (git init, network)
+**Allow controlled I/O**:
+- Temp directories for file tests
+- In-memory data structures
+**Example**:
+```ruby
+# organisms/config_resolver_test.rb
+def test_merges_project_over_user_config
+  with_temp_config_files(
+    user: { model: "gpt-4" },
+    project: { model: "claude" }
+  ) do
+    result = ConfigResolver.resolve("llm")
+    assert_equal "claude", result[:model]
+  end
+end
+# ONE CLI parity test
+def test_cli_matches_api_output
+  api_result = Ace::MyTool.process("input.txt")
+  cli_output, _stderr, status = Open3.capture3("ace-mytool", "input.txt")
+  assert status.success?
+  assert_equal api_result.output, cli_output.strip
+end
+```
+### E2E Tests (manual tests)
+**Purpose**: Verify complete user workflows work in real environments.
+**Test these behaviors**:
+- Critical user journeys end-to-end
+- Tool installation and availability
+- Real API interactions (sandboxed)
+- Complex multi-step workflows
+- Environment-specific behavior
+- CLI behavior with real tools
+**Use real dependencies**:
+- Real filesystem
+- Real git operations
+- Real subprocess calls
+- Real external tools (StandardRB, gitleaks, etc.)
+**Example** (TS-format `TC-*.tc.md`):
+```markdown
+### TC-001: Full Lint Workflow
+**Steps:**
+1. Create test file with lint issues
+2. Run `ace-lint test.rb`
+3. Verify issues detected
+4. Run `ace-lint test.rb --fix`
+5. Verify issues fixed
+**Expected:**
+- Step 2: Exit code 1, issues listed
+- Step 4: Exit code 0, file modified
+```
+## Quick Reference: Where Does This Test Go?
+### Put in Unit Tests
+- "Does `parse_config` handle empty YAML?"
+- "Does `format_output` escape special characters?"
+- "Does `validate_input` reject nil?"
+- "Does `calculate_score` handle edge cases?"
+### Put in Integration Tests
+- "Does the config cascade merge correctly?"
+- "Does an error in component A propagate to B?"
+- "Does the CLI produce same output as API?" (ONE test)
+- "Does the workflow orchestrator coordinate correctly?"
+### Put in E2E Tests
+- "Does the full lint workflow work with real StandardRB?"
+- "Can users run `ace-git-commit` from any directory?"
+- "Does the tool detect when gitleaks is not installed?"
+- "Does the complete review workflow produce valid reports?"
+## Common Mistakes
+### Mistake 1: E2E Tests for Flag Permutations
+**Wrong**: 10 E2E tests for each CLI flag combination
+```ruby
+# Each takes 500ms+ due to subprocess
+def test_verbose_flag; Open3.capture3(BIN, "--verbose"); end
+def test_quiet_flag; Open3.capture3(BIN, "--quiet"); end
+def test_debug_flag; Open3.capture3(BIN, "--debug"); end
+```
+**Right**: 1 E2E test + unit tests for flags
+```ruby
+# E2E: verify CLI works
+def test_cli_executes_successfully
+  _stdout, _stderr, status = Open3.capture3(BIN, "input.txt")
+  assert status.success?
+end
+# Unit: test flag handling via API
+def test_verbose_flag_enables_debug_output
+  result = MyTool.process("input.txt", verbose: true)
+  assert result.debug_output_enabled?
+end
+```
+### Mistake 2: Real Git in Unit Tests
+**Wrong**: Each test creates real git repo (~150ms)
+```ruby
+def setup
+  @repo = create_real_git_repo  # SLOW
+end
+```
+**Right**: Use MockGitRepo for unit tests
+```ruby
+def setup
+  @repo = MockGitRepo.new  # FAST
+  @repo.add_commit("abc123", message: "test commit")
+end
+```
+### Mistake 3: Missing Availability Stubs
+**Wrong**: Stub `run` but not `available?`
+```ruby
+Runner.stub(:run, result) do
+  subject.lint(file)  # Calls available?() → subprocess!
+end
+```
+**Right**: Stub entire call chain
+```ruby
+Runner.stub(:available?, true) do
+  Runner.stub(:run, result) do
+    subject.lint(file)  # Fast
+  end
+end
+```
+## Performance Targets
+| Test Type | Target | Hard Limit | Action if Exceeded |
+|-----------|--------|------------|-------------------|
+| Unit (atoms) | <10ms | 50ms | Check for subprocess leaks |
+| Unit (molecules) | <50ms | 100ms | Check for unstubbed deps |
+| Integration | <500ms | 1s | Move real calls to E2E |
+| E2E | <5s | 30s | Split into smaller scenarios |
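The targets above can be checked mechanically once test durations are known. This is a minimal Ruby sketch, assuming a hypothetical `assess_duration` helper and hypothetical type keys (not part of ace-test); the thresholds and actions are copied from the table, converted to seconds.

```ruby
# Thresholds in seconds, copied from the Performance Targets table.
PERFORMANCE_TARGETS = {
  unit_atom:     { target: 0.010, hard_limit: 0.050, action: "Check for subprocess leaks" },
  unit_molecule: { target: 0.050, hard_limit: 0.100, action: "Check for unstubbed deps" },
  integration:   { target: 0.500, hard_limit: 1.0,   action: "Move real calls to E2E" },
  e2e:           { target: 5.0,   hard_limit: 30.0,  action: "Split into smaller scenarios" }
}.freeze

# Hypothetical helper: classifies one measured duration against its layer's limits.
def assess_duration(test_type, seconds)
  limits = PERFORMANCE_TARGETS.fetch(test_type)
  return { status: :ok, action: nil } if seconds < limits[:target]

  status = seconds > limits[:hard_limit] ? :over_hard_limit : :over_target
  { status: status, action: limits[:action] }
end

fast_atom = assess_duration(:unit_atom, 0.004)
slow_atom = assess_duration(:unit_atom, 0.150)
slow_integration = assess_duration(:integration, 0.7)
```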
+## See Also
+- [Test Performance Guide](guide://test-performance) - Optimization techniques
+- [Test Mocking Patterns](guide://test-mocking-patterns) - How to stub correctly
+- [E2E Testing Guide](guide://e2e-testing) - E2E test conventions

data/handbook/guides/test-mocking-patterns.g.md ADDED

@@ -0,0 +1,414 @@
+---
+doc-type: guide
+title: Test Mocking Patterns Guide
+purpose: Ensure mocks actually test real behavior and stay in sync with production
+ace-docs:
+  last-updated: 2026-02-01
+  last-checked: 2026-03-21
+---
+# Test Mocking Patterns Guide
+## Goal
+This guide ensures your mocks:
+1. Test **behavior**, not implementation details
+2. Stay in sync with real APIs (no drift)
+3. Don't become "zombies" that test nothing
+## Core Principle: Test Behavior, Not Implementation
+### The Difference
+**Implementation testing** (fragile):
+- "Was method X called?"
+- "Was it called with these exact arguments?"
+- "Was it called exactly 3 times?"
+**Behavior testing** (robust):
+- "Given this input, is the output correct?"
+- "Given this error condition, is the error message helpful?"
+- "Does the system reach the correct final state?"
+### Example
+```ruby
+# BAD: Tests implementation (breaks when refactored)
+def test_processes_data
+  processor = Minitest::Mock.new
+  processor.expect :transform, "result", ["input"]
+  processor.expect :validate, true, ["result"]
+  subject = DataHandler.new(processor)
+  subject.handle("input")
+  processor.verify  # "Were these methods called?"
+end
+# GOOD: Tests behavior (survives refactoring)
+def test_processes_data
+  result = DataHandler.new.handle("input")
+  assert result.success?
+  assert_equal "expected_output", result.value
+end
+```
+## Stub Hierarchy: Use the Simplest Double
+From simplest to most complex:
+| Type | Purpose | When to Use |
+|------|---------|-------------|
+| **Dummy** | Placeholder, never used | Parameter that won't be called |
+| **Stub** | Returns canned values | Control return values |
+| **Spy** | Records calls for inspection | Verify interactions happened |
+| **Mock** | Verifies expectations | Protocol/contract testing |
+| **Fake** | Working implementation | Complex behavior needed |
+**Rule**: Use the simplest type that meets your needs.
+```ruby
+# Dummy - just needs to exist
+def test_with_unused_dependency
+  dummy_logger = Object.new
+  subject = Worker.new(logger: dummy_logger)
+  # logger never called in this test path
+end
+# Stub - control return value
+def test_with_stubbed_api
+  api_result = { status: "ok", data: [1, 2, 3] }
+  ApiClient.stub :fetch, api_result do
+    result = subject.process
+    assert_equal [1, 2, 3], result.items
+  end
+end
+# Fake - real behavior, simplified
+class FakeFileSystem
+  def initialize
+    @files = {}
+  end
+  def write(path, content)
+    @files[path] = content
+  end
+  def read(path)
+    @files[path] or raise "File not found: #{path}"
+  end
+end
+```
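The table also lists a Spy, which the examples above do not show. This is a minimal hand-rolled sketch, assuming a hypothetical `SpyLogger` and a hypothetical `Worker#perform` (neither comes from ace-test). The spy records calls so the test can inspect them afterwards, while the main assertion still targets the returned value.

```ruby
# Spy: records calls for later inspection, with no pre-set expectations.
class SpyLogger
  attr_reader :messages

  def initialize
    @messages = []
  end

  def info(message)
    @messages << [:info, message]
    nil
  end
end

# Hypothetical collaborator used only for this illustration.
class Worker
  def initialize(logger:)
    @logger = logger
  end

  def perform(job)
    @logger.info("processing #{job}")
    job.upcase
  end
end

spy = SpyLogger.new
result = Worker.new(logger: spy).perform("report")
```

After the call, `result` holds the behavior under test and `spy.messages` holds the recorded interaction, so the interaction check stays optional.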
+## Zombie Mocks: Detection and Prevention
+### What Are Zombie Mocks?
+Mocks that stub methods **no longer called** by the implementation. Tests pass but don't test anything real.
+### How They Happen
+1. Code is refactored, method renamed or removed
+2. Tests still stub old method name
+3. Real code path executes (slowly or incorrectly)
+4. Test passes anyway
+### Case Study
+```ruby
+# Original implementation
+class ChangeDetector
+  def get_diff(files)
+    execute_git_command("git diff #{files.join(' ')}")
+  end
+end
+# Test stubbed this:
+ChangeDetector.stub :execute_git_command, "" do
+  result = detector.get_diff(files)  # Fast, stubbed
+end
+# Later, implementation changed:
+class ChangeDetector
+  def get_diff(files)
+    Ace::Git::DiffOrchestrator.generate(files: files)  # New method!
+  end
+end
+# Test still stubs old method - ZOMBIE!
+ChangeDetector.stub :execute_git_command, "" do
+  result = detector.get_diff(files)  # Slow! Real DiffOrchestrator runs
+end
+```
+### Detection
+1. **Profile tests**: `ace-test --profile 10`
+2. **Look for slow unit tests**: >100ms may indicate a zombie stub
+3. **Try breaking the stub**: Change stub return value, test should fail
+```ruby
+# Zombie detection test
+def test_stub_is_actually_used
+  # If this stub is a zombie, changing return value won't affect test
+  Runner.stub(:run, "UNEXPECTED_VALUE_12345") do
+    result = subject.lint(file)
+    # If test passes without checking for UNEXPECTED_VALUE_12345,
+    # the stub might be a zombie
+  end
+end
+```
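The "try breaking the stub" check can also be automated with a stub that fails when it is never invoked. This is a minimal sketch in plain Ruby, assuming a hypothetical `with_verified_stub` helper and hypothetical `Runner`/`Linter` classes (this is not Minitest's `stub` and not part of ace-test). It covers the case where the old method still exists but the implementation no longer calls it.

```ruby
# Hypothetical helper: temporarily replaces target.name, restores it afterwards,
# and raises if the replacement was never called during the block.
def with_verified_stub(target, name, value)
  meta = target.singleton_class
  backup = :"__unstubbed_#{name}"
  calls = 0
  meta.send(:alias_method, backup, name) # NameError here: the target no longer exists
  meta.send(:define_method, name) do |*_args, **_opts|
    calls += 1
    value
  end
  begin
    result = yield
  ensure
    meta.send(:remove_method, name)
    meta.send(:alias_method, name, backup)
    meta.send(:remove_method, backup)
  end
  raise "zombie stub: #{name} was never called" if calls.zero?

  result
end

class Runner
  def run(file)
    "real output for #{file}"
  end

  def legacy_run(file)
    "legacy output for #{file}"
  end
end

class Linter
  def initialize(runner)
    @runner = runner
  end

  def lint(file)
    @runner.run(file) # the implementation no longer calls legacy_run
  end
end

runner = Runner.new
linter = Linter.new(runner)

live_result = with_verified_stub(runner, :run, "stubbed") { linter.lint("a.rb") }

zombie_error = begin
  with_verified_stub(runner, :legacy_run, "stubbed") { linter.lint("a.rb") }
  nil
rescue RuntimeError => e
  e.message
end
```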
+### Prevention
+1. **Update stubs when refactoring**: Part of the refactoring checklist
+2. **Use composite helpers**: Centralized, easier to maintain
+3. **Profile regularly**: Weekly `ace-test --profile 20`
+4. **Document stub targets**: Comments explaining what's stubbed and why
+## Composite Helpers: Reducing Stub Complexity
+### The Problem
+Deep nesting makes tests hard to read and maintain:
+```ruby
+def test_complex_workflow
+  mock_config do
+    mock_git_status do
+      mock_diff_generator do
+        mock_api_client do
+          result = subject.execute
+        end
+      end
+    end
+  end
+end
+```
+### The Solution
+Composite helpers that combine related stubs:
+```ruby
+def test_complex_workflow
+  with_mock_repo_context(branch: "feature", clean: true) do
+    result = subject.execute
+    assert result.success?
+  end
+end
+# In test_helper.rb
+def with_mock_repo_context(branch: "main", clean: true, diff: nil)
+  mock_branch = build_branch_info(name: branch)
+  mock_status = clean ? :clean : :dirty
+  mock_diff = diff || Ace::Git::Models::DiffResult.empty
+  Ace::Git::Molecules::BranchInfo.stub :fetch, mock_branch do
+    Ace::Git::Atoms::StatusChecker.stub :clean?, clean do
+      Ace::Git::Organisms::DiffOrchestrator.stub :generate, mock_diff do
+        yield
+      end
+    end
+  end
+end
+```
+### Design Principles
+1. **Sensible defaults**: Most tests use standard values
+2. **Keyword arguments**: Override only what matters
+3. **Clear naming**: `with_mock_<context>` pattern
+4. **Single responsibility**: One helper per "context"
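These principles can be sketched without any nesting in the helper itself by folding a list of stubs into one block. The code below is an illustration in plain Ruby, assuming hypothetical `stub_method` and `with_stubs` helpers and stand-in `BranchInfo`/`StatusChecker` modules (the real ace-git classes and Minitest's `stub` are not used here).

```ruby
# Hypothetical single-stub helper: replaces target.name for the block, then restores it.
def stub_method(target, name, value)
  meta = target.singleton_class
  backup = :"__unstubbed_#{name}"
  meta.send(:alias_method, backup, name)
  meta.send(:define_method, name) { |*_args, **_opts| value }
  begin
    yield
  ensure
    meta.send(:remove_method, name)
    meta.send(:alias_method, name, backup)
    meta.send(:remove_method, backup)
  end
end

# Folds [[target, method, value], ...] into nested stub_method calls around the block.
def with_stubs(stubs, &block)
  nested = stubs.reverse.reduce(block) do |inner, (target, name, value)|
    -> { stub_method(target, name, value, &inner) }
  end
  nested.call
end

# Stand-ins for expensive collaborators (assumed names, for illustration only).
module BranchInfo
  def self.fetch
    raise "would shell out to git"
  end
end

module StatusChecker
  def self.clean?
    raise "would shell out to git"
  end
end

# Composite helper: sensible defaults, keyword overrides, with_mock_<context> naming.
def with_mock_repo_context(branch: "main", clean: true, &block)
  with_stubs([[BranchInfo, :fetch, branch], [StatusChecker, :clean?, clean]], &block)
end

summary = with_mock_repo_context(branch: "feature") do
  "#{BranchInfo.fetch} clean=#{StatusChecker.clean?}"
end
```

Adding another collaborator then means adding one entry to the list rather than another level of indentation.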
+### Existing Composite Helpers
+| Package | Helper | Stubs |
+|---------|--------|-------|
+| ace-git | `with_mock_repo_load` | BranchInfo + StatusChecker + DiffOrchestrator |
+| ace-git-secrets | `with_rewrite_test_mocks` | gitleaks + rewriter + working directory |
+| ace-lint | `with_stubbed_validators` | ValidatorRegistry + Runner availability |
+| ace-review | `stub_synthesizer_prompt_path` | ace-nav subprocess |
+## Contract Testing: Keeping Mocks in Sync
+### The Problem
+Mock data can drift from real API responses:
+```ruby
+# Mock returns this:
+{ "status" => "ok", "items" => [] }
+# Real API returns this:
+{ "status" => "success", "data" => { "items" => [] } }
+# Test passes, production fails!
+```
+### Solution 1: Snapshot-Based Mocks
+Capture real API responses and use them as mocks:
+```ruby
+# 1. Record real response (one-time, manual)
+# curl https://api.github.com/repos/owner/repo/pulls/123 > fixtures/pr_response.json
+# 2. Use in tests
+def mock_pr_response
+  JSON.parse(File.read("fixtures/pr_response.json"))
+end
+def test_fetches_pr_details
+  WebMock.stub_request(:get, /pulls\/123/)
+    .to_return(body: mock_pr_response.to_json)
+  result = PrFetcher.fetch(123)
+  assert_equal "open", result.state
+end
+```
+### Solution 2: Schema Validation
+Validate mock data against OpenAPI/JSON Schema:
+```ruby
+# fixtures/schemas/github_pr.json defines the schema
+def test_mock_matches_schema
+  schema = JSON.parse(File.read("fixtures/schemas/github_pr.json"))
+  mock = mock_pr_response
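+  # json-schema gem: fully_validate returns an array of error messages (validate returns only true/false)
+  errors = JSON::Validator.fully_validate(schema, mock)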
+  assert_empty errors, "Mock doesn't match schema: #{errors}"
+end
+```
+### Solution 3: Periodic Drift Check
+Scheduled job that compares mocks to real responses:
+```ruby
+# Run monthly or after API version updates
+def test_mock_matches_live_api
+  skip "Run manually to check for API drift"
+  live_response = real_api_client.fetch_pr(TEST_PR_ID)
+  mock_response = mock_pr_response
+  # Compare structure (not exact values)
+  assert_same_keys live_response, mock_response
+  assert_same_types live_response, mock_response
+end
+```
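`assert_same_keys` and `assert_same_types` are not standard Minitest assertions, so a project would need to supply them. One possible basis is a structural comparison like the sketch below, which assumes hypothetical `shape_of` and `same_shape?` helpers and reuses the drift example from "The Problem" above. It compares keys and value classes, not exact values.

```ruby
# Hypothetical helper: reduces a parsed JSON value to its structure.
def shape_of(value)
  case value
  when Hash        then value.to_h { |key, inner| [key, shape_of(inner)] }
  when Array       then value.empty? ? [] : [shape_of(value.first)] # first element only
  when true, false then :boolean
  else value.class
  end
end

def same_shape?(left, right)
  shape_of(left) == shape_of(right)
end

mock_response = { "status" => "ok", "items" => [] }
live_response = { "status" => "success", "data" => { "items" => [] } }
other_mock    = { "status" => "error", "items" => [] }

drifted = !same_shape?(mock_response, live_response)
in_sync = same_shape?(mock_response, other_mock)
```

Only the first array element is sampled, so heterogeneous arrays and empty-versus-populated arrays need extra handling in a real check.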
+## Stubbing Patterns by Dependency Type
+### Subprocess Calls
+```ruby
+# Stub Open3.capture3
+Open3.stub :capture3, ["output", "", mock_status] do
+  result = Runner.execute("command")
+end
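+# Stub system(). Note: this intercepts only explicit Kernel.system(...) calls;
+# a bare system(...) dispatches to the calling object, so stub it on that object instead.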
+Kernel.stub :system, true do
+  Runner.check_availability
+end
+# Don't forget availability checks!
+Runner.stub(:available?, true) do
+  # Now stub the actual execution
+end
+```
+### HTTP Requests
+```ruby
+# Using WebMock
+WebMock.stub_request(:get, "https://api.example.com/data")
+  .to_return(body: { items: [] }.to_json, status: 200)
+# Using VCR for recorded responses
+VCR.use_cassette("api_response") do
+  result = ApiClient.fetch_data
+end
+```
+### Filesystem
+```ruby
+# Temp directory helper
+def with_temp_dir
+  Dir.mktmpdir do |dir|
+    yield dir
+  end
+end
+# Fake filesystem for complex tests
+def test_with_fake_fs
+  fake_fs = FakeFileSystem.new
+  fake_fs.write("config.yml", "key: value")
+  subject = ConfigLoader.new(filesystem: fake_fs)
+  result = subject.load("config.yml")
+end
+```
+### Time
+```ruby
+# Stub Time.now
+Time.stub :now, Time.new(2026, 1, 31, 12, 0, 0) do
+  result = subject.generate_timestamp
+  assert_equal "2026-01-31T12:00:00", result
+end
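+# Stub sleep for retry tests. Note: this intercepts only explicit Kernel.sleep(...) calls;
+# a bare sleep(...) dispatches to the calling object, so stub it on that object instead.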
+Kernel.stub :sleep, nil do
+  result = subject.retry_with_backoff(max_retries: 3)
+end
+```
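An alternative to stubbing `sleep` globally is to inject the waiting behavior. This is a minimal sketch, assuming a hypothetical `Retrier` class with a `sleeper:` parameter and an exponential backoff schedule; the guide does not show how `retry_with_backoff` is implemented, so every detail here is illustrative.

```ruby
# Hypothetical class: the sleeper is injected so tests never actually wait.
class Retrier
  def initialize(sleeper: ->(seconds) { sleep(seconds) })
    @sleeper = sleeper
  end

  # Retries the block up to max_retries times, doubling the delay after each failure.
  def retry_with_backoff(max_retries:, base_delay: 0.5)
    attempts = 0
    begin
      attempts += 1
      yield attempts
    rescue StandardError
      raise if attempts > max_retries

      @sleeper.call(base_delay * (2**(attempts - 1)))
      retry
    end
  end
end

delays = []
retrier = Retrier.new(sleeper: ->(seconds) { delays << seconds })

outcome = retrier.retry_with_backoff(max_retries: 3) do |attempt|
  raise "flaky" if attempt < 3

  "ok on attempt #{attempt}"
end
```

The recorded `delays` let a test assert on the backoff schedule itself, which a plain `sleep` stub returning `nil` cannot do.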
+### Git Operations
+```ruby
+# Use MockGitRepo (fast, no subprocess)
+def test_with_mock_repo
+  repo = MockGitRepo.new
+  repo.add_commit("abc123", message: "Initial commit", files: ["README.md"])
+  repo.add_commit("def456", message: "Add feature", files: ["feature.rb"])
+  subject = CommitAnalyzer.new(repo: repo)
+  result = subject.analyze("def456")
+  assert_equal ["feature.rb"], result.changed_files
+end
+# Real git only in integration/E2E
+def test_with_real_repo
+  with_temp_git_repo do |repo_path|
+    # Creates real .git directory
+    File.write("#{repo_path}/test.txt", "content")
+    system("git", "-C", repo_path, "add", ".")
+    system("git", "-C", repo_path, "commit", "-m", "test")
+  end
+end
+```
+## Checklist: Is My Mock Testing Real Behavior?
+- [ ] **Behavior focus**: Test checks output/state, not method calls
+- [ ] **Stub is used**: Changing stub return value causes test to fail
+- [ ] **Data is realistic**: Mock data from real API snapshot or validated schema
+- [ ] **Complete chain**: All entry points to expensive operations stubbed
+- [ ] **No zombie**: Stub target matches current implementation
+- [ ] **Documented**: Comment explains what's stubbed and why
+## See Also
+- [Test Layer Decision](guide://test-layer-decision) - Where to test each behavior
+- [Test Performance](guide://test-performance) - Performance optimization
+- [E2E Testing](guide://e2e-testing) - When mocks aren't enough