PyPI - claude-mpm - Versions diffs - 4.15.2__py3-none-any.whl → 4.20.3__py3-none-any.whl - Mend

claude-mpm 4.15.2py3-none-any.whl → 4.20.3py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (203) hide show

claude_mpm/skills/bundled/test-driven-development.md ADDED Viewed

@@ -0,0 +1,378 @@
+---
+skill_id: test-driven-development
+skill_version: 0.1.0
+description: Comprehensive TDD patterns and practices for all programming languages, eliminating redundant testing guidance per agent.
+updated_at: 2025-10-30T17:00:00Z
+tags: [testing, tdd, best-practices, quality-assurance]
+---
+# Test-Driven Development (TDD)
+Comprehensive TDD patterns and practices for all programming languages. This skill eliminates ~500-800 lines of redundant testing guidance per agent.
+## When to Use
+Apply TDD for:
+- New feature implementation
+- Bug fixes (test the bug first)
+- Code refactoring (tests ensure behavior preservation)
+- API development (test contracts)
+- Complex business logic
+## TDD Workflow (Red-Green-Refactor)
+### 1. Red Phase: Write Failing Test
+```
+Write a test that:
+- Describes the desired behavior
+- Fails for the right reason (not due to syntax errors)
+- Is focused on a single behavior
+```
+### 2. Green Phase: Make It Pass
+```
+Write the minimum code to:
+- Pass the test
+- Not introduce regressions
+- Follow existing patterns
+```
+### 3. Refactor Phase: Improve Code
+```
+While keeping tests green:
+- Remove duplication
+- Improve naming
+- Simplify logic
+- Extract functions/classes
+```
+## Test Structure Patterns
+### Arrange-Act-Assert (AAA)
+```
+// Arrange: Set up test data and conditions
+const user = createTestUser({ role: 'admin' });
+// Act: Perform the action being tested
+const result = await authenticateUser(user);
+// Assert: Verify the outcome
+expect(result.isAuthenticated).toBe(true);
+expect(result.permissions).toContain('admin');
+```
+### Given-When-Then (BDD Style)
+```
+Given: A user with admin privileges
+When: They attempt to access protected resource
+Then: Access is granted with appropriate permissions
+```
+## Test Naming Conventions
+### Pattern: `test_should_<expected_behavior>_when_<condition>`
+**Examples:**
+- `test_should_return_user_when_id_exists()`
+- `test_should_raise_error_when_user_not_found()`
+- `test_should_validate_email_format_when_creating_account()`
+### Language-Specific Conventions
+**Python (pytest):**
+```python
+def test_should_calculate_total_when_items_added():
+    # Arrange
+    cart = ShoppingCart()
+    cart.add_item(Item("Book", 10.00))
+    cart.add_item(Item("Pen", 1.50))
+    # Act
+    total = cart.calculate_total()
+    # Assert
+    assert total == 11.50
+```
+**JavaScript (Jest):**
+```javascript
+describe('ShoppingCart', () => {
+  test('should calculate total when items added', () => {
+    const cart = new ShoppingCart();
+    cart.addItem({ name: 'Book', price: 10.00 });
+    cart.addItem({ name: 'Pen', price: 1.50 });
+    const total = cart.calculateTotal();
+    expect(total).toBe(11.50);
+  });
+});
+```
+**Go:**
+```go
+func TestShouldCalculateTotalWhenItemsAdded(t *testing.T) {
+    // Arrange
+    cart := NewShoppingCart()
+    cart.AddItem(Item{Name: "Book", Price: 10.00})
+    cart.AddItem(Item{Name: "Pen", Price: 1.50})
+    // Act
+    total := cart.CalculateTotal()
+    // Assert
+    if total != 11.50 {
+        t.Errorf("Expected 11.50, got %f", total)
+    }
+}
+```
+## Test Types and Scope
+### Unit Tests
+- **Scope:** Single function/method
+- **Dependencies:** Mocked
+- **Speed:** Fast (< 10ms per test)
+- **Coverage:** 80%+ of code paths
+### Integration Tests
+- **Scope:** Multiple components
+- **Dependencies:** Real or test doubles
+- **Speed:** Moderate (< 1s per test)
+- **Coverage:** Critical paths and interfaces
+### End-to-End Tests
+- **Scope:** Full user workflows
+- **Dependencies:** Real (in test environment)
+- **Speed:** Slow (seconds to minutes)
+- **Coverage:** Core user journeys
+## Mocking and Test Doubles
+### When to Mock
+- External APIs and services
+- Database operations (for unit tests)
+- File system operations
+- Time-dependent operations
+- Random number generation
+### Mock Types
+**Stub:** Returns predefined data
+```python
+def get_user_stub(user_id):
+    return User(id=user_id, name="Test User")
+```
+**Mock:** Verifies interactions
+```python
+mock_service = Mock()
+service.process_payment(payment_data)
+mock_service.process_payment.assert_called_once_with(payment_data)
+```
+**Fake:** Working implementation (simplified)
+```python
+class FakeDatabase:
+    def __init__(self):
+        self.data = {}
+    def save(self, key, value):
+        self.data[key] = value
+    def get(self, key):
+        return self.data.get(key)
+```
+## Test Coverage Guidelines
+### Target Coverage Levels
+- **Critical paths:** 100%
+- **Business logic:** 95%+
+- **Overall project:** 80%+
+- **UI components:** 70%+
+### What to Test
+- ✅ Business logic and algorithms
+- ✅ Edge cases and boundary conditions
+- ✅ Error handling and validation
+- ✅ State transitions
+- ✅ Public APIs and interfaces
+### What NOT to Test
+- ❌ Framework internals
+- ❌ Third-party libraries
+- ❌ Trivial getters/setters
+- ❌ Generated code
+- ❌ Configuration files
+## Testing Best Practices
+### 1. One Assertion Per Test (When Possible)
+```python
+# Good: Focused test
+def test_should_validate_email_format():
+    assert is_valid_email("user@example.com") is True
+# Avoid: Multiple unrelated assertions
+def test_validation():
+    assert is_valid_email("user@example.com") is True
+    assert is_valid_phone("123-456-7890") is True  # Different concept
+```
+### 2. Test Independence
+```python
+# Good: Each test is self-contained
+def test_user_creation():
+    user = create_user("test@example.com")
+    assert user.email == "test@example.com"
+# Avoid: Tests depending on execution order
+shared_user = None
+def test_create_user():
+    global shared_user
+    shared_user = create_user("test@example.com")
+def test_update_user():  # Depends on previous test
+    shared_user.name = "Updated"
+```
+### 3. Descriptive Test Failures
+```python
+# Good: Clear failure message
+assert result.status == 200, f"Expected 200, got {result.status}: {result.body}"
+# Avoid: Unclear failure
+assert result.status == 200
+```
+### 4. Test Data Builders
+```python
+# Good: Reusable test data creation
+def create_test_user(**overrides):
+    defaults = {
+        'email': 'test@example.com',
+        'name': 'Test User',
+        'role': 'user'
+    }
+    return User(**{**defaults, **overrides})
+# Usage
+admin = create_test_user(role='admin')
+guest = create_test_user(email='guest@example.com')
+```
+## Testing Anti-Patterns to Avoid
+### ❌ Testing Implementation Details
+```python
+# Bad: Tests internal structure
+def test_user_storage():
+    user = User("test@example.com")
+    assert user._internal_cache is not None  # Implementation detail
+```
+### ❌ Fragile Tests
+```python
+# Bad: Breaks with harmless changes
+assert user.to_json() == '{"name":"John","email":"john@example.com"}'
+# Good: Tests behavior, not format
+data = json.loads(user.to_json())
+assert data['name'] == "John"
+assert data['email'] == "john@example.com"
+```
+### ❌ Slow Tests in Unit Test Suite
+```python
+# Bad: Real HTTP calls in unit tests
+def test_api_integration():
+    response = requests.get("https://api.example.com/users")  # Slow!
+    assert response.status_code == 200
+```
+### ❌ Testing Everything Through UI
+```python
+# Bad: Testing business logic through UI
+def test_calculation():
+    browser.click("#input1")
+    browser.type("5")
+    browser.click("#input2")
+    browser.type("3")
+    browser.click("#calculate")
+    assert browser.find("#result").text == "8"
+# Good: Test logic directly
+def test_calculation():
+    assert calculate(5, 3) == 8
+```
+## Quick Reference by Language
+### Python (pytest)
+```python
+# Setup/Teardown
+@pytest.fixture
+def database():
+    db = create_test_database()
+    yield db
+    db.cleanup()
+# Parametrized tests
+@pytest.mark.parametrize("input,expected", [
+    ("user@example.com", True),
+    ("invalid-email", False),
+])
+def test_email_validation(input, expected):
+    assert is_valid_email(input) == expected
+```
+### JavaScript (Jest)
+```javascript
+// Setup/Teardown
+beforeEach(() => {
+  database = createTestDatabase();
+});
+afterEach(() => {
+  database.cleanup();
+});
+// Async tests
+test('should fetch user data', async () => {
+  const user = await fetchUser(1);
+  expect(user.name).toBe('John');
+});
+```
+### Go
+```go
+// Table-driven tests
+func TestEmailValidation(t *testing.T) {
+    tests := []struct {
+        input    string
+        expected bool
+    }{
+        {"user@example.com", true},
+        {"invalid-email", false},
+    }
+    for _, tt := range tests {
+        result := IsValidEmail(tt.input)
+        if result != tt.expected {
+            t.Errorf("IsValidEmail(%s) = %v, want %v",
+                tt.input, result, tt.expected)
+        }
+    }
+}
+```
+## TDD Benefits Realized
+- **Design Improvement:** Tests drive better API design
+- **Documentation:** Tests serve as executable documentation
+- **Confidence:** Refactoring becomes safe
+- **Debugging:** Tests isolate issues quickly
+- **Coverage:** Ensures comprehensive test coverage
+- **Regression Prevention:** Catches bugs before deployment

claude_mpm/skills/bundled/testing/condition-based-waiting/SKILL.md ADDED Viewed

@@ -0,0 +1,123 @@
+---
+name: Condition-Based Waiting
+description: Replace arbitrary timeouts with condition polling for reliable async tests
+when_to_use: when tests have race conditions, timing dependencies, or inconsistent pass/fail behavior
+version: 1.1.0
+languages: all
+---
+# Condition-Based Waiting
+## Overview
+Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.
+**Core principle:** Wait for the actual condition you care about, not a guess about how long it takes.
+## When to Use
+```dot
+digraph when_to_use {
+    "Test uses setTimeout/sleep?" [shape=diamond];
+    "Testing timing behavior?" [shape=diamond];
+    "Document WHY timeout needed" [shape=box];
+    "Use condition-based waiting" [shape=box];
+    "Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"];
+    "Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"];
+    "Testing timing behavior?" -> "Use condition-based waiting" [label="no"];
+}
+```
+**Use when:**
+- Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
+- Tests are flaky (pass sometimes, fail under load)
+- Tests timeout when run in parallel
+- Waiting for async operations to complete
+**Don't use when:**
+- Testing actual timing behavior (debounce, throttle intervals)
+- Always document WHY if using arbitrary timeout
+## Core Pattern
+```typescript
+// ❌ BEFORE: Guessing at timing
+await new Promise(r => setTimeout(r, 50));
+const result = getResult();
+expect(result).toBeDefined();
+// ✅ AFTER: Waiting for condition
+await waitFor(() => getResult() !== undefined);
+const result = getResult();
+expect(result).toBeDefined();
+```
+## Quick Patterns
+| Scenario | Pattern |
+|----------|---------|
+| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
+| Wait for state | `waitFor(() => machine.state === 'ready')` |
+| Wait for count | `waitFor(() => items.length >= 5)` |
+| Wait for file | `waitFor(() => fs.existsSync(path))` |
+| Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |
+## Implementation
+Generic polling function:
+```typescript
+async function waitFor<T>(
+  condition: () => T | undefined | null | false,
+  description: string,
+  timeoutMs = 5000
+): Promise<T> {
+  const startTime = Date.now();
+  while (true) {
+    const result = condition();
+    if (result) return result;
+    if (Date.now() - startTime > timeoutMs) {
+      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
+    }
+    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
+  }
+}
+```
+See @example.ts for complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from actual debugging session.
+## Common Mistakes
+**❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
+**✅ Fix:** Poll every 10ms
+**❌ No timeout:** Loop forever if condition never met
+**✅ Fix:** Always include timeout with clear error
+**❌ Stale data:** Cache state before loop
+**✅ Fix:** Call getter inside loop for fresh data
+## When Arbitrary Timeout IS Correct
+```typescript
+// Tool ticks every 100ms - need 2 ticks to verify partial output
+await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
+await new Promise(r => setTimeout(r, 200));   // Then: wait for timed behavior
+// 200ms = 2 ticks at 100ms intervals - documented and justified
+```
+**Requirements:**
+1. First wait for triggering condition
+2. Based on known timing (not guessing)
+3. Comment explaining WHY
+## Real-World Impact
+From debugging session (2025-10-03):
+- Fixed 15 flaky tests across 3 files
+- Pass rate: 60% → 100%
+- Execution time: 40% faster
+- No more race conditions

claude_mpm/skills/bundled/testing/test-driven-development/SKILL.md ADDED Viewed

@@ -0,0 +1,145 @@
+---
+name: test-driven-development
+description: Write the test first, watch it fail, write minimal code to pass
+version: 3.2.0
+category: testing
+author: Jesse Vincent
+license: MIT
+source: https://github.com/obra/superpowers-skills/tree/main/skills/testing/test-driven-development
+progressive_disclosure:
+  entry_point:
+    summary: "Enforce test-first development with strict RED/GREEN/REFACTOR cycle. Never write implementation before failing test."
+    when_to_use: "When implementing any feature or bugfix, before writing implementation code. Always for new features, bug fixes, refactoring, and behavior changes."
+    quick_start: "1. Write failing test 2. Watch it fail (verify) 3. Write minimal code to pass 4. Watch it pass (verify) 5. Refactor if needed 6. Repeat"
+  references:
+    - workflow.md
+    - examples.md
+    - philosophy.md
+    - anti-patterns.md
+    - integration.md
+context_limit: 800
+tags:
+  - tdd
+  - testing
+  - red-green-refactor
+  - test-first
+---
+# Test-Driven Development (TDD)
+## Overview
+Write the test first. Watch it fail. Write minimal code to pass.
+**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
+This skill enforces strict test-first development following the RED/GREEN/REFACTOR cycle. Violating the letter of the rules is violating the spirit of the rules.
+## When to Use This Skill
+**Always:**
+- New features
+- Bug fixes
+- Refactoring
+- Behavior changes
+**Exceptions (ask human partner):**
+- Throwaway prototypes
+- Generated code
+- Configuration files
+Thinking "skip TDD just this once"? Stop. That's rationalization.
+## The Iron Law
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+Write code before the test? Delete it. Start over.
+**No exceptions:**
+- Don't keep it as "reference"
+- Don't "adapt" it while writing tests
+- Delete means delete
+## Core Principles
+1. **RED**: Write a failing test first
+2. **GREEN**: Write minimal code to make test pass
+3. **REFACTOR**: Improve code while keeping tests green
+4. **NEVER**: Write implementation before tests
+## Quick Start
+### The RED/GREEN/REFACTOR Cycle
+```
+RED → Verify RED → GREEN → Verify GREEN → REFACTOR → Repeat
+```
+1. **RED**: Write one minimal test showing desired behavior
+2. **Verify RED**: Run test, confirm it fails for right reason
+3. **GREEN**: Write simplest code to pass test
+4. **Verify GREEN**: Run test, confirm it passes
+5. **REFACTOR**: Clean up while keeping tests green
+6. **Repeat**: Next test for next feature
+## Cycle Details
+**RED**: Write one minimal test (one behavior, clear name, real code)
+**Verify RED**: MANDATORY - watch it fail for right reason
+**GREEN**: Write simplest code to pass (no extras)
+**Verify GREEN**: MANDATORY - watch it pass, all tests pass
+**REFACTOR**: Clean up while keeping tests green (optional)
+## Navigation
+For detailed information:
+- **[Workflow](references/workflow.md)**: Complete RED/GREEN/REFACTOR workflow with detailed examples
+- **[Examples](references/examples.md)**: Real-world TDD scenarios with step-by-step walkthroughs
+- **[Philosophy](references/philosophy.md)**: Why order matters and why tests-after don't work
+- **[Anti-patterns](references/anti-patterns.md)**: Common mistakes, rationalizations, and red flags
+- **[Integration](references/integration.md)**: Using TDD with debugging and other skills
+## Key Reminders
+- ALWAYS write the test BEFORE implementation
+- Make each test fail FIRST to verify it's testing something
+- Keep implementation minimal - just enough to pass tests
+- Refactor only when tests are green
+- One cycle at a time - small steps
+- If test passes immediately, it's not testing new behavior
+## Red Flags - STOP and Start Over
+If you catch yourself:
+- Writing code before test
+- Test passes immediately
+- Can't explain why test failed
+- "I'll test after"
+- "Keep as reference"
+- "Already spent X hours, deleting is wasteful"
+- "Tests after achieve the same purpose"
+**ALL of these mean: Delete code. Start over with TDD.**
+## Why Order Matters
+Tests-after pass immediately (proves nothing), test-first fail then pass (proves it works). See [Philosophy](references/philosophy.md) for detailed explanation.
+## Integration with Other Skills
+- **systematic-debugging**: Create failing test in Phase 4 (bug reproduction)
+- **verification-before-completion**: Verify tests exist and watched them fail
+- **defense-in-depth**: Add validation tests after implementing feature
+## Real-World Impact
+From TDD practice:
+- Test-first: 95%+ first-time correctness
+- Test-after: 40% first-time correctness
+- TDD time: 25-45 minutes per feature (including tests)
+- Non-TDD time: 15 minutes coding + 60-120 minutes debugging
+**TDD is pragmatic** - finds bugs before commit, prevents regressions, documents behavior, enables refactoring.

claude-mpm 4.15.2__py3-none-any.whl → 4.20.3__py3-none-any.whl

claude-mpm 4.15.2py3-none-any.whl → 4.20.3py3-none-any.whl