@grimoire-cc/cli 0.13.3 → 0.14.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/commands/update.d.ts.map +1 -1
- package/dist/commands/update.js +14 -0
- package/dist/commands/update.js.map +1 -1
- package/dist/enforce.d.ts +3 -1
- package/dist/enforce.d.ts.map +1 -1
- package/dist/enforce.js +18 -6
- package/dist/enforce.js.map +1 -1
- package/dist/setup.d.ts.map +1 -1
- package/dist/setup.js +47 -0
- package/dist/setup.js.map +1 -1
- package/dist/summary.d.ts.map +1 -1
- package/dist/summary.js +9 -0
- package/dist/summary.js.map +1 -1
- package/package.json +1 -1
- package/packs/dev-pack/agents/grimoire.tdd-specialist.md +194 -27
- package/packs/dev-pack/grimoire.json +0 -38
- package/packs/dev-pack/skills/grimoire.conventional-commit/SKILL.md +69 -65
- package/packs/dotnet-pack/agents/grimoire.csharp-coder.md +110 -113
- package/packs/dotnet-pack/grimoire.json +23 -5
- package/packs/dotnet-pack/skills/grimoire.unit-testing-dotnet/SKILL.md +252 -0
- package/packs/{dev-pack/skills/grimoire.tdd-specialist → dotnet-pack/skills/grimoire.unit-testing-dotnet}/reference/anti-patterns.md +78 -0
- package/packs/dotnet-pack/skills/grimoire.unit-testing-dotnet/reference/tdd-workflow-patterns.md +259 -0
- package/packs/go-pack/grimoire.json +19 -0
- package/packs/go-pack/skills/grimoire.unit-testing-go/SKILL.md +256 -0
- package/packs/go-pack/skills/grimoire.unit-testing-go/reference/anti-patterns.md +244 -0
- package/packs/go-pack/skills/grimoire.unit-testing-go/reference/tdd-workflow-patterns.md +259 -0
- package/packs/python-pack/grimoire.json +19 -0
- package/packs/python-pack/skills/grimoire.unit-testing-python/SKILL.md +239 -0
- package/packs/python-pack/skills/grimoire.unit-testing-python/reference/anti-patterns.md +244 -0
- package/packs/python-pack/skills/grimoire.unit-testing-python/reference/tdd-workflow-patterns.md +259 -0
- package/packs/rust-pack/grimoire.json +29 -0
- package/packs/rust-pack/skills/grimoire.unit-testing-rust/SKILL.md +243 -0
- package/packs/rust-pack/skills/grimoire.unit-testing-rust/reference/anti-patterns.md +244 -0
- package/packs/rust-pack/skills/grimoire.unit-testing-rust/reference/tdd-workflow-patterns.md +259 -0
- package/packs/ts-pack/agents/grimoire.typescript-coder.md +36 -1
- package/packs/ts-pack/grimoire.json +27 -1
- package/packs/ts-pack/skills/grimoire.unit-testing-typescript/SKILL.md +255 -0
- package/packs/ts-pack/skills/grimoire.unit-testing-typescript/reference/anti-patterns.md +244 -0
- package/packs/ts-pack/skills/grimoire.unit-testing-typescript/reference/tdd-workflow-patterns.md +259 -0
- package/packs/dev-pack/skills/grimoire.tdd-specialist/SKILL.md +0 -248
- package/packs/dev-pack/skills/grimoire.tdd-specialist/reference/language-frameworks.md +0 -388
- package/packs/dev-pack/skills/grimoire.tdd-specialist/reference/tdd-workflow-patterns.md +0 -135
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/SKILL.md +0 -293
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/reference/anti-patterns.md +0 -329
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/reference/framework-guidelines.md +0 -361
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/reference/parameterized-testing.md +0 -378
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/reference/test-organization.md +0 -476
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/reference/test-performance.md +0 -576
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/templates/tunit-template.md +0 -438
- package/packs/dotnet-pack/skills/grimoire.dotnet-unit-testing/templates/xunit-template.md +0 -303
@@ -0,0 +1,244 @@

# Testing Anti-Patterns

Common testing mistakes that reduce test value and increase maintenance cost. These are language-agnostic — they apply to any test framework.

## Table of Contents

- [The Liar](#the-liar)
- [The Giant](#the-giant)
- [Excessive Setup](#excessive-setup)
- [The Slow Poke](#the-slow-poke)
- [The Peeping Tom](#the-peeping-tom)
- [The Mockery](#the-mockery)
- [The Inspector](#the-inspector)
- [The Flaky Test](#the-flaky-test)
- [The Cargo Culter](#the-cargo-culter)
- [The Hard Test](#the-hard-test)

## The Liar

**What it is:** A test that passes but doesn't actually verify the behavior it claims to test. It gives false confidence.

**How to spot it:**
- Test name says "validates input" but assertions only check the return type
- Assertions are too loose (`assert result is not None` instead of checking the actual value)
- Test catches exceptions broadly and passes regardless

**Fix:** Ensure assertions directly verify the specific behavior described in the test name. Every assertion should fail if the behavior breaks.

```python
# Bad — passes even if discount logic is completely wrong
def test_apply_discount():
    result = apply_discount(100, 10)
    assert result is not None

# Good — fails if the calculation is wrong
def test_apply_discount_with_10_percent_returns_90():
    result = apply_discount(100, 10)
    assert result == 90.0
```

## The Giant

**What it is:** A single test that verifies too many things. When it fails, you can't tell which behavior broke.

**How to spot it:**
- Test has more than 8–10 assertions
- Test name uses "and" (e.g., "creates user and sends email and updates cache")
- Multiple Act phases in one test

**Fix:** Split into focused tests, each verifying one logical concept. Multiple assertions are fine if they verify aspects of the same behavior.

```typescript
// Bad — four unrelated behaviors in one test
test('user registration works', () => {
  const user = register({ name: 'Alice', email: 'alice@test.com' });
  expect(user.id).toBeDefined();
  expect(emailService.send).toHaveBeenCalled();
  expect(cache.set).toHaveBeenCalledWith(`user:${user.id}`, user);
  expect(auditLog.entries).toHaveLength(1);
});

// Good — separate tests for each behavior
test('register with valid data creates user with id', () => { ... });
test('register with valid data sends welcome email', () => { ... });
test('register with valid data caches the user', () => { ... });
test('register with valid data writes audit log entry', () => { ... });
```

## Excessive Setup

**What it is:** Tests that require dozens of lines of setup before the actual test logic. Often signals that the code under test has too many dependencies.

**How to spot it:**
- Arrange section is 20+ lines
- Multiple mocks configured with complex behaviors
- Shared setup methods that configure things most tests don't need

**Fix:** Use factory methods/builders for test data. Consider whether the code under test needs refactoring to reduce dependencies. Only set up what the specific test needs.

```go
// Bad — every test sets up the entire world
func TestProcessOrder(t *testing.T) {
    db := setupDatabase()
    cache := setupCache()
    logger := setupLogger()
    emailClient := setupEmailClient()
    validator := NewValidator(db)
    processor := NewProcessor(cache)
    service := NewOrderService(db, cache, logger, emailClient, validator, processor)
    // ... 10 more lines of setup
    _, err := service.ProcessOrder(ctx, order)
    assert.NoError(t, err)
}

// Good — factory method hides irrelevant details
func TestProcessOrder_WithValidOrder_Succeeds(t *testing.T) {
    service := newTestOrderService(t)
    result, err := service.ProcessOrder(ctx, validOrder())
    assert.NoError(t, err)
    assert.Equal(t, "processed", result.Status)
}
```

## The Slow Poke

**What it is:** Tests that are slow because they use real I/O, network calls, or sleeps. Slow tests get run less frequently and slow down the feedback loop.

**How to spot it:**
- `time.Sleep()`, `Thread.sleep()`, `setTimeout` in tests
- Real HTTP calls, database connections, file system operations
- Test suite takes more than a few seconds for unit tests

**Fix:** Mock external dependencies. Use fake implementations for I/O. Replace time-based waits with event-based synchronization.
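As a sketch of the event-based synchronization the fix recommends — the `Worker` class and its callback API here are hypothetical, invented only for illustration:

```python
import threading

# Hypothetical worker that signals completion via a callback.
class Worker:
    def start(self, on_done):
        threading.Thread(target=on_done).start()

# Good — wait on an event instead of sleeping; the test returns as soon
# as the work completes, and fails fast if completion is never signaled.
def test_worker_signals_completion():
    done = threading.Event()
    Worker().start(done.set)
    assert done.wait(timeout=1.0)

test_worker_signals_completion()
```

A sleep-based version (`time.sleep(2)` followed by an assertion) would take two seconds on every run and still be flaky under load; the event-based wait is both faster and deterministic.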
## The Peeping Tom

**What it is:** Tests that access private/internal state to verify behavior instead of testing through the public interface.

**How to spot it:**
- Reflection to access private fields
- Testing internal method calls instead of observable results
- Assertions on implementation details (internal data structures, private counters)

**Fix:** Test through the public API. If you can't verify behavior through the public interface, the class may need a design change (e.g., expose a query method or extract a collaborator).

## The Mockery

**What it is:** Tests that mock so heavily that they're testing mock configurations rather than real behavior. Every dependency is mocked, including simple value objects.

**How to spot it:**
- More mock setup lines than actual test logic
- Mocking concrete classes, value objects, or data structures
- Test passes but the real system fails because mocks don't match reality

**Fix:** Only mock at system boundaries (external services, databases, clocks). Use real implementations for in-process collaborators when practical.
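A minimal sketch of the boundary rule, using Python's `unittest.mock` — `Price`, `CurrencyApi`, and `convert` are hypothetical names, not from this package:

```python
from dataclasses import dataclass
from unittest.mock import Mock

@dataclass
class Price:          # real value object — never mock these
    amount: float

class CurrencyApi:    # external service — the boundary worth mocking
    def rate(self, currency: str) -> float:
        raise NotImplementedError("calls a remote service in production")

def convert(price: Price, currency: str, api: CurrencyApi) -> Price:
    return Price(price.amount * api.rate(currency))

def test_convert_uses_live_rate():
    api = Mock(spec=CurrencyApi)   # mock only the boundary
    api.rate.return_value = 2.0
    # the value object is used for real, so equality behaves as in production
    assert convert(Price(10.0), "EUR", api) == Price(20.0)

test_convert_uses_live_rate()
```

Using `spec=CurrencyApi` keeps the mock honest: calling a method the real API doesn't have raises immediately instead of silently passing.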
## The Inspector

**What it is:** Tests that verify exact method calls and their order rather than outcomes. They break whenever the implementation changes, even if behavior is preserved.

**How to spot it:**
- `verify(mock, times(1)).method()` for every mock interaction
- Assertions on call order
- Test breaks when you refactor without changing behavior

**Fix:** Verify state (the result) rather than interactions (how it got there). Only verify interactions for side effects that ARE the behavior (e.g., "email was sent").

```java
// Bad — breaks if the implementation changes its sort algorithm
verify(sorter, times(1)).quickSort(any());
verify(sorter, never()).mergeSort(any());

// Good — verifies the outcome
assertThat(result).isSortedAccordingTo(naturalOrder());
```

## The Flaky Test

**What it is:** Tests that pass and fail intermittently without code changes. They erode trust in the test suite.

**Common causes:**
- Time-dependent logic (`new Date()`, `time.Now()`)
- Random data without fixed seeds
- Shared mutable state between tests
- Race conditions in async tests
- Dependency on test execution order

**Fix:** Inject time as a dependency. Use fixed seeds for randomness. Ensure test isolation. Use proper async synchronization.
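A sketch of injecting time as a dependency — the `Token` class is hypothetical, but the pattern is the standard one: the code under test receives "now" as a parameter instead of reading the system clock:

```python
from datetime import datetime, timedelta

# Expiry depends on "now", so "now" is passed in rather than read
# from datetime.now() inside the logic.
class Token:
    def __init__(self, issued_at: datetime, ttl: timedelta):
        self.expires_at = issued_at + ttl

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires_at

def test_token_expires_after_ttl():
    issued = datetime(2024, 1, 1, 12, 0, 0)   # fixed instant, not datetime.now()
    token = Token(issued, ttl=timedelta(hours=1))
    assert not token.is_expired(issued + timedelta(minutes=59))
    assert token.is_expired(issued + timedelta(hours=1))

test_token_expires_after_ttl()
```

Because every timestamp is fixed, the test produces the same result on every run, on every machine.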
## The Cargo Culter

**What it is:** Writing tests to hit a coverage percentage target rather than to verify behavior. The tests exist to satisfy a metric, not to provide confidence.

**How to spot it:**
- Tests that assert trivially obvious things (e.g., `assert user.name == user.name`)
- Every private method has a corresponding test accessed via reflection
- 100% coverage but bugs still escape to production
- Test suite takes minutes to pass but developers don't trust it

**Fix:** Coverage is a diagnostic tool, not a goal. Use it to find untested gaps, not as a number to optimize. Coverage in the high 80s to 90s emerges naturally from disciplined TDD. A test that only exists to push coverage up is worse than no test — it adds maintenance cost without adding confidence.

```python
# Bad — written for coverage, not for confidence
def test_user_has_name():
    user = User(name="Alice")
    assert user.name is not None  # This verifies nothing meaningful

# Good — written to verify a business rule
def test_user_with_empty_name_raises_validation_error():
    with pytest.raises(ValidationError, match="name cannot be empty"):
        User(name="")
```

> See: https://martinfowler.com/bliki/TestCoverage.html

## The Hard Test

**What it is:** Not an anti-pattern in the test itself, but a signal from the test about the production code. When a test is painful, complex, or requires elaborate setup, the production code has a design problem.

**How to spot it:**
- Need to mock 5+ dependencies to test one class
- Need to access private internals to verify behavior
- Test requires a complex sequence of operations just to get to the state under test
- You find yourself thinking "testing this would be too hard"

**What it signals:**
- Too many responsibilities in one class (SRP violation)
- Hidden dependencies or tight coupling
- Poor separation of concerns
- Untestable architecture (e.g., side effects embedded in business logic)

**Fix:** Resist the urge to skip the test or work around it with clever mocking. Instead, fix the production code design. Extract classes, inject dependencies, separate concerns. A hard test is a free design review — take the feedback.

```python
# Hard to test — service does too much
class OrderService:
    def process(self, order):
        db = Database()        # hidden dependency
        email = EmailClient()  # hidden dependency
        self._validate(order)
        db.save(order)
        email.send_confirmation(order)
        self._update_inventory(order)  # another responsibility

# Easy to test — dependencies explicit, concerns separated
class OrderService:
    def __init__(self, repo: OrderRepository, notifier: Notifier):
        self._repo = repo
        self._notifier = notifier

    def process(self, order: Order) -> OrderResult:
        self._validate(order)
        saved = self._repo.save(order)
        self._notifier.notify(saved)
        return saved
```

---

## Further Reading

- xUnit Patterns (Meszaros): http://xunitpatterns.com
- Codepipes testing anti-patterns: https://blog.codepipes.com/testing/software-testing-antipatterns.html
- Google SWE Book — Test Doubles: https://abseil.io/resources/swe-book/html/ch13.html
package/packs/python-pack/skills/grimoire.unit-testing-python/reference/tdd-workflow-patterns.md
ADDED

@@ -0,0 +1,259 @@

# TDD Workflow Patterns

Guidance on the test-driven development process, when to apply it, and advanced techniques.

## Table of Contents

- [Canon TDD — Start with a Test List](#canon-tdd--start-with-a-test-list)
- [Red-Green-Refactor](#red-green-refactor)
- [Transformation Priority Premise](#transformation-priority-premise)
- [F.I.R.S.T. Principles](#first-principles)
- [London School vs Detroit School](#london-school-vs-detroit-school)
- [When to Use TDD](#when-to-use-tdd)
- [When TDD Is Less Effective](#when-tdd-is-less-effective)
- [BDD and ATDD Extensions](#bdd-and-atdd-extensions)
- [Advanced Techniques](#advanced-techniques)

## Canon TDD — Start with a Test List

> Source: https://tidyfirst.substack.com/p/canon-tdd

Kent Beck's recommended starting point is not a single test but a **test list** — a written enumeration of all behaviors you intend to verify. This separates the creative work (what to test) from the mechanical work (write, make pass, refactor).

**Process:**
1. Write down all behaviors the code needs — a flat list, not tests
2. Pick the simplest item on the list
3. Write one failing test for it
4. Make it pass with the minimum code
5. Refactor
6. Cross off the item; repeat

**Why test order matters:** Starting with simpler behaviors forces simpler transformations (see TPP below) and lets the design emerge naturally. Jumping to complex cases early leads to over-engineered solutions. The test list keeps you focused and prevents scope creep.

## Red-Green-Refactor

> Source: https://martinfowler.com/bliki/TestDrivenDevelopment.html

The core TDD cycle, repeated in small increments:

### 1. Red — Write a Failing Test

Write the smallest test that describes the next piece of behavior. The test MUST fail before you write any production code. A test that passes immediately provides no confidence.

**Rules:**
- Write only ONE test at a time
- The test should compile/parse but fail at the assertion
- If the test passes immediately, it's either trivial or testing existing behavior

### 2. Green — Make It Pass

Write the MINIMUM code to make the failing test pass. Do not add extra logic, handle cases not yet tested, or optimize.

**Rules:**
- Write the simplest code that makes the test pass
- It's OK to hardcode values initially — the next test will force generalization
- Do not add code for future tests
- All existing tests must still pass

### 3. Refactor — Clean Up

With all tests green, improve the code structure without changing behavior. Tests give you the safety net.

**Rules:**
- No new functionality during refactoring
- All tests must remain green after each refactoring step
- Remove duplication, improve naming, extract methods
- Refactor both production code AND test code

### Cycle Length

Each Red-Green-Refactor cycle should take 1–10 minutes. If you're spending more than 10 minutes in the Red or Green phase, the step is too large — break it down.
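Two cycles of the loop above can be sketched in a few lines — the `slugify` helper is hypothetical, chosen only to keep the example small:

```python
# Cycle 1, Red — first failing test for a hypothetical slugify() helper:
def test_slugify_lowercases():
    assert slugify("Hello") == "hello"

# Cycle 1, Green — the minimum code to pass (hardcoding "hello" would
# also have been acceptable; this test forces nothing more than lower()):
def slugify(text):
    return text.lower()

# Cycle 2, Red — the next test from the list forces generalization:
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# Cycle 2, Green — generalized just enough to pass both tests:
def slugify(text):
    return text.lower().replace(" ", "-")

test_slugify_lowercases()
test_slugify_replaces_spaces_with_hyphens()
```

Each cycle here is well under a minute; the refactor step is empty only because there is not yet any duplication to remove.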
## Transformation Priority Premise

> Source: http://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html

When going from Red to Green, prefer simpler transformations over complex ones. Listed from simplest to most complex:

1. **Constant** — return a hardcoded value
2. **Scalar** — replace the constant with a variable
3. **Conditional** — replace unconditional code with a conditional (if/else)
4. **Collection** — operate on a collection instead of a scalar
5. **Iteration** — add a loop
6. **Recursion** — add a recursive call
7. **Assignment** — replace a computed value with mutation

**Example — building FizzBuzz with TDD:**

```
Test 1: input 1 → "1"             Transformation: Constant
Test 2: input 2 → "2"             Transformation: Scalar (use the input)
Test 3: input 3 → "Fizz"          Transformation: Conditional (add if)
Test 4: input 5 → "Buzz"          Transformation: Conditional (add another if)
Test 5: input 15 → "FizzBuzz"     Transformation: Conditional (add combined if)
Test 6: input 1-15 → full list    Transformation: Iteration (generalize)
```

By following this priority, you avoid over-engineering early and let the design emerge naturally from the tests.
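The code that the six tests above would leave behind looks roughly like this — a plausible end state, not the only one TDD could produce:

```python
def fizzbuzz(n: int) -> str:
    # Tests 3-5 each added one conditional; tests 1-2 forced str(n).
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# Test 6 added the Iteration transformation: generalize over a range.
def fizzbuzz_list(upto: int) -> list[str]:
    return [fizzbuzz(i) for i in range(1, upto + 1)]

assert fizzbuzz(1) == "1"
assert fizzbuzz(3) == "Fizz"
assert fizzbuzz(5) == "Buzz"
assert fizzbuzz(15) == "FizzBuzz"
assert fizzbuzz_list(5) == ["1", "2", "Fizz", "4", "Buzz"]
```

Note that nothing more complex than a conditional appears until a test demands it — that is the premise in action.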
## F.I.R.S.T. Principles

Every unit test must satisfy these five properties:

| Principle | Definition | Violation Signal |
|-----------|------------|------------------|
| **Fast** | Runs in milliseconds | Real I/O, network calls, `sleep()` |
| **Independent** | No dependency on other tests | Shared mutable state, ordered execution |
| **Repeatable** | Same result every run | System clock, random data without a seed, race conditions |
| **Self-Validating** | Passes or fails without manual interpretation | Tests that print output for a human to read |
| **Timely** | Written before or alongside production code | Tests added weeks after a feature shipped |

F.I.R.S.T. is a diagnostic checklist: if a test violates any property, it will erode team trust and reduce the value of the suite.

## London School vs Detroit School

> Source: https://martinfowler.com/articles/mocksArentStubs.html

Two schools of TDD with different philosophies on test doubles. Most teams use a hybrid.

### Detroit School (Classicist, Inside-Out)

- **Unit definition:** A module of any size — can span multiple classes
- **Approach:** Bottom-up; start from domain logic, build outward
- **Test doubles:** Avoid mocks; use real objects when feasible
- **Verification:** State verification — examine the result after execution
- **Testing style:** Black-box; test through the public API
- **Refactoring:** Safe — tests aren't coupled to implementation details
- **Best for:** Building confidence in real interactions; reducing brittleness

### London School (Mockist, Outside-In)

- **Unit definition:** A single class in isolation
- **Approach:** Top-down; start from the API, work inward
- **Test doubles:** Mock all collaborators
- **Verification:** Behavior verification — confirm the correct method calls occurred
- **Testing style:** White-box; tests know about internals
- **Refactoring:** Can be brittle — tests break when the implementation changes
- **Best for:** Designing interactions upfront; driving architecture decisions

### Recommended: Hybrid Approach

Apply Detroit discipline as the default — use real objects, verify state. Apply London mocking only at architectural boundaries (external APIs, databases, clocks). Never mock value objects, pure functions, or in-process helpers.

The most important rule: if you're mocking to make a test easy to write, that's often a design smell (see The Hard Test in anti-patterns). If you're mocking because the dependency is genuinely external or slow, that's the right use.

## When to Use TDD

TDD is most valuable for:

- **Business logic** — Complex rules, calculations, state machines. TDD forces you to think through all cases before implementing.
- **Algorithm development** — Sorting, parsing, validation, transformation logic. Tests serve as a specification.
- **Bug fixes** — Write a test that reproduces the bug first (Red), then fix it (Green). This prevents regressions.
- **API/interface design** — Writing tests first helps you design interfaces from the consumer's perspective.
- **Refactoring** — Ensure tests exist before refactoring. If they don't, write characterization tests first, then refactor.

## When TDD Is Less Effective

TDD is not universally optimal. Use judgment:

- **UI/visual components** — Layout, styling, and animations are hard to express as unit tests. Use visual regression testing or snapshot tests instead.
- **Exploratory/prototype code** — When you don't know what to build yet, writing tests first slows exploration. Spike first, then write tests.
- **Thin integration layers** — Simple pass-through code (e.g., a controller that calls a service) may not benefit from a test-first approach. Integration tests are more valuable here.
- **Infrastructure/glue code** — Database migrations, config files, build scripts. Test these with integration or end-to-end tests.
- **External API wrappers** — Thin clients wrapping external APIs are better tested with integration tests against the real (or sandboxed) API.

For these cases, write tests AFTER the implementation (test-last), but still write them.

## BDD and ATDD Extensions

### Behavior-Driven Development (BDD)

> Source: https://martinfowler.com/bliki/GivenWhenThen.html

BDD extends TDD by using natural language to describe behavior. Useful when tests need to be readable by non-developers.

**Given-When-Then** structure:

```gherkin
Given a cart with items totaling $100
When a 10% discount is applied
Then the total should be $90
```

Maps to test code:

```python
def test_cart_with_10_percent_discount_totals_90():
    # Given
    cart = Cart(items=[Item(price=100)])

    # When
    cart.apply_discount(PercentageDiscount(10))

    # Then
    assert cart.total == 90.0
```

### Acceptance TDD (ATDD)

Write high-level acceptance tests before implementing a feature. These tests describe the feature from the user's perspective and drive the overall design. Unit tests (via TDD) then drive the implementation of each component.

**Flow:**
1. Write an acceptance test (fails — Red)
2. Use TDD to implement the components needed to pass it
3. Acceptance test passes (Green)
4. Refactor

ATDD is most valuable for features with clear acceptance criteria and when working with product owners or stakeholders.

## Advanced Techniques

### Property-Based Testing

Instead of writing individual input/output pairs, define **properties** that should always hold true and let a framework generate hundreds of test cases automatically.

**Best for:** Pure functions, algorithms, data transformations, serialization round-trips.

**Tools:**
- Python: [Hypothesis](https://hypothesis.readthedocs.io)
- JavaScript/TypeScript: [fast-check](https://fast-check.dev)
- Go: `testing/quick` (stdlib), [gopter](https://github.com/leanovate/gopter)
- Rust: [proptest](https://github.com/proptest-rs/proptest)
- Java: [jqwik](https://jqwik.net)
- Elixir: [StreamData](https://hexdocs.pm/stream_data)

**Example property** (Python/Hypothesis):

```python
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort_is_idempotent(lst):
    assert sorted(sorted(lst)) == sorted(lst)
```

### Mutation Testing

Mutation testing introduces small code changes (mutations) and checks whether your tests catch them. A test suite that lets mutations survive has gaps in its coverage.

**Metric:** Mutation score = % of mutations killed. Target 80%+.

**Tools:**
- JavaScript/TypeScript/C#: [Stryker](https://stryker-mutator.io)
- Java: [PITest](https://pitest.org)
- Python: [mutmut](https://mutmut.readthedocs.io)
- Go: [go-mutesting](https://github.com/zimmski/go-mutesting)

Run mutation testing periodically (not on every commit) to identify weak spots in the test suite.
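A hand-rolled illustration of what the tools above automate — one mutant, one weak test that lets it survive, one boundary test that kills it (`is_adult` is a made-up example function):

```python
def is_adult(age: int) -> bool:
    return age >= 18

# A typical mutation flips the comparison operator: >= becomes >.
mutant = lambda age: age > 18

# Weak test — original and mutant agree, so the mutation SURVIVES,
# revealing that the suite never exercises the boundary:
assert is_adult(30) == mutant(30)   # both True; mutation undetected

# Boundary test — original and mutant disagree, so the mutation is KILLED:
assert is_adult(18) is True
assert mutant(18) is False          # a real tool would report this mutant dead
```

A mutation tool generates hundreds of such mutants automatically; every survivor points at an assertion you forgot to write.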
### Contract Testing

In microservice or distributed architectures, contract tests verify that services communicate correctly without running full integration tests.

**How it works:**
1. The consumer defines a contract (expected interactions)
2. The provider verifies it can fulfill the contract
3. Both test independently — no need to spin up the full system

**Tool:** [Pact](https://pact.io) — supports most major languages.

Contract tests replace the expensive integration test layer for inter-service communication while still catching breaking API changes early.

package/packs/rust-pack/grimoire.json
ADDED

@@ -0,0 +1,29 @@

{
  "name": "rust-pack",
  "version": "1.0.0",
  "agents": [],
  "skills": [
    {
      "name": "grimoire.unit-testing-rust",
      "path": "skills/grimoire.unit-testing-rust",
      "description": "Rust unit testing specialist. Patterns and best practices for the built-in test framework, mockall, and proptest. Use when writing tests for .rs files, or asking about Rust testing patterns, test modules, mocking traits, property-based testing, integration tests.",
      "version": "1.0.0",
      "triggers": {
        "keywords": ["cargo-test", "mockall", "proptest", "rstest"],
        "file_extensions": [".rs"],
        "patterns": [
          "write.*test",
          "add.*test",
          "create.*test",
          "test.*coverage",
          "rust.*test",
          "cargo.*test"
        ],
        "file_paths": [
          "**/tests/**/*.rs",
          "**/*_test.rs"
        ]
      }
    }
  ]
}