npm - orchestr8 - Versions diffs - 2.6.1 → 2.7.1 - Mend

orchestr8 2.6.1 → 2.7.1

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (46)

package/.blueprint/prompts/nigel-runtime.md ADDED

@@ -0,0 +1,46 @@
+You are Nigel, the Tester Agent.
+## Task
+Create tests from user stories and acceptance criteria. Tests must expose ambiguities and edge cases early, providing a stable contract for the Developer to code against.
+## Inputs (read these files)
+- Stories: {FEAT_DIR}/story-*.md
+- Feature Spec: {FEAT_DIR}/FEATURE_SPEC.md
+## Outputs (write these files IN ORDER)
+Step 1: Write {TEST_DIR}/test-spec.md containing:
+- Brief understanding (5-10 lines)
+- AC to Test ID mapping table (compact)
+- Key assumptions (bullet list)
+Step 2: Write {TEST_FILE} containing:
+- Executable tests (Jest or Node test runner)
+- One describe block per story
+- One test per acceptance criterion
+## Rules
+- Write test-spec.md FIRST, then write test file
+- Keep test-spec.md under 100 lines using table format
+- Tests should be self-documenting with minimal comments
+- Reference story files by path in test descriptions
+- Make failure states meaningful with expected error messages
+- Do not over-prescribe implementation details
+- Focus on externally observable behaviour
+## Test Design Principles
+- Clarity over cleverness
+- Deterministic tests (avoid flaky patterns)
+- Cover boundaries: min/max, empty/null, invalid formats
+## Completion
+Brief summary: test count, AC coverage %, assumptions (5 bullets max).
+## Reference
+For detailed guidance, see: .blueprint/agents/AGENT_TESTER_NIGEL.md
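
Note on the placeholders: `{FEAT_DIR}`, `{TEST_DIR}` and `{TEST_FILE}` in this prompt have to be resolved before the prompt reaches the agent. The diff does not show how orchestr8 does that, so the following is only a minimal sketch of one way it could work. `renderPrompt` and the example paths are hypothetical and are not taken from the package.

```javascript
// Hypothetical helper: orchestr8's real substitution logic is not shown in this diff.
// Replaces {NAME} placeholders with supplied values and fails loudly on unknown ones.
function renderPrompt(template, vars) {
  return template.replace(/\{([A-Z_]+)\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`No value supplied for placeholder ${match}`);
    }
    return vars[name];
  });
}

// A few lines copied from nigel-runtime.md.
const template = [
  '- Stories: {FEAT_DIR}/story-*.md',
  '- Feature Spec: {FEAT_DIR}/FEATURE_SPEC.md',
  'Step 1: Write {TEST_DIR}/test-spec.md',
  'Step 2: Write {TEST_FILE}',
].join('\n');

// Example values are assumptions chosen for illustration only.
const rendered = renderPrompt(template, {
  FEAT_DIR: '.blueprint/features/user-auth',
  TEST_DIR: 'test/artifacts/user-auth',
  TEST_FILE: 'test/user-auth.test.js',
});

console.log(rendered);
```

Throwing on an unknown placeholder, rather than leaving it in place, means a misspelt variable is caught before the agent is asked to read a path such as `{FEAT_DIRR}/story-*.md`.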

package/.blueprint/templates/STORY_TEMPLATE.md ADDED

@@ -0,0 +1,96 @@
+# User Story Template
+Use this template when writing user stories and acceptance criteria.
+---
+## Screen [N] — [Title]
+### User story
+As a [role], I want [capability] so that [benefit].
+---
+### Context / scope
+- Professional user (Solicitor)
+- England standard possession claim
+- Screen is reached when: [entry condition]
+- Route:
+  - `GET /claims/[route-name]`
+  - `POST /claims/[route-name]`
+- This screen captures: [what data]
+---
+### Acceptance criteria
+Write ACs in **Given/When/Then** format (precondition, action, result):
+**AC-1 — [Short description]**
+- Given [precondition],
+- When [action],
+- Then [expected result].
+**AC-2 — [Short description]**
+- Given [precondition],
+- When [action],
+- Then [expected result].
+<!-- Continue with AC-3, AC-4, etc. -->
+**AC-N — Previous navigation**
+- Given I click Previous,
+- Then I am returned to [previous route]
+- And any entered data is preserved in session.
+**AC-N+1 — Continue navigation**
+- Given I click Continue and validation passes,
+- Then I am redirected to [next route].
+**AC-N+2 — Cancel behaviour**
+- Given I click Cancel,
+- Then I am returned to /case-list
+- And the claim draft remains stored in session.
+**AC-N+3 — Accessibility compliance**
+- Given validation errors occur,
+- Then:
+  - a GOV.UK error summary is displayed at the top of the page,
+  - errors link to the relevant field,
+  - focus moves to the error summary,
+  - and all inputs are properly labelled and keyboard accessible.
+---
+### Session persistence
+```js
+session.claim.fieldName = {
+  property: 'value' | null
+}
+```
+---
+### Out of scope
+- [Item 1]
+- [Item 2]
+---
+## Guidelines for writing user stories
+### Every AC must be:
+- Deterministic
+- Observable via the UI or session
+- Unambiguous
+### Routing must be explicit for:
+- Previous link
+- Continue button
+- Cancel link
+- Any conditional paths
+### Keep stories focused:
+- Maximum 5-7 ACs per story
+- If more needed, split into multiple story files
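
Note on the guidelines: the limit of 5-7 ACs per story and the Given/Then shape of each AC are mechanical enough to check automatically. The sketch below is one possible lint, not something the package ships. `checkStory` is a hypothetical name, the AC heading pattern is inferred from the template, and the sample story is invented for illustration.

```javascript
// Hypothetical lint for a story file written from STORY_TEMPLATE.md.
// Not part of orchestr8; the heading pattern "**AC-<id> ...**" is inferred from the template.
const MAX_ACS = 7; // upper end of the template's "5-7 ACs per story" guideline

function checkStory(markdown) {
  const acs = [];
  let current = null;

  for (const rawLine of markdown.split('\n')) {
    const line = rawLine.trim();
    const heading = line.match(/^\*\*(AC-[\w+]+)\s.*\*\*$/);
    if (heading) {
      current = { id: heading[1], hasGiven: false, hasThen: false };
      acs.push(current);
    } else if (line.startsWith('#') || line === '---') {
      current = null; // a new section ends the current AC
    } else if (current) {
      if (/^- Given\b/.test(line)) current.hasGiven = true;
      if (/^- Then\b/.test(line)) current.hasThen = true;
    }
  }

  const problems = [];
  if (acs.length > MAX_ACS) {
    problems.push(`${acs.length} ACs found; split the story (max ${MAX_ACS})`);
  }
  for (const ac of acs) {
    // "When" is not required: the template's navigation ACs use Given/Then only.
    if (!ac.hasGiven) problems.push(`${ac.id} has no Given step`);
    if (!ac.hasThen) problems.push(`${ac.id} has no Then step`);
  }
  return { acCount: acs.length, problems };
}

// Invented sample story; the separator after the AC id is not checked.
const story = [
  '### Acceptance criteria',
  '**AC-1 - Valid postcode**',
  '- Given I enter a valid postcode,',
  '- When I click Continue,',
  '- Then I am redirected to the next screen.',
  '**AC-2 - Cancel behaviour**',
  '- Given I click Cancel,',
  '---',
].join('\n');

console.log(checkStory(story));
```

For the sample above, AC-2 is reported because it has a Given step but no Then step.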

package/.blueprint/templates/TEST_TEMPLATE.md ADDED

@@ -0,0 +1,76 @@
+# Test Template
+Use this template when writing test specifications and executable tests.
+---
+## Outputs you must produce
+### 1. test-spec.md (write FIRST, keep under 100 lines)
+- Brief understanding (5-10 lines max)
+- AC to Test ID mapping table (compact format)
+- Key assumptions (bullet list)
+### 2. Executable test file (write SECOND)
+- One `describe` block per user story
+- One `it` block per acceptance criterion
+- Self-documenting test names with minimal comments
+---
+## AC to Test ID Mapping Table Format
+| AC | Test ID | Scenario |
+|----|---------|----------|
+| AC-1 | T-1.1 | Valid credentials leads to success |
+| AC-1 | T-1.2 | Invalid password leads to error |
+| AC-2 | T-2.1 | Missing field shows validation |
+---
+## Traceability Table Format
+| Acceptance Criterion | Test IDs | Notes |
+|---------------------|----------|-------|
+| AC-1 | T-1.1, T-1.2 | Happy path covered |
+| AC-2 | T-2.1 | Edge case pending |
+---
+## Test Design Principles
+- **Clarity over cleverness**: Prioritise readability with explicit steps
+- **Determinism**: Avoid flaky patterns and random inputs
+- **Coverage with intent**: Focus on behavioural coverage, not test count
+- **Boundaries and edge cases**: Consider min/max, empty/null, invalid formats
+---
+## Test Structure Example
+```javascript
+describe('Feature: [Feature Name]', () => {
+  describe('[User Story Reference]', () => {
+    it('T-1.1: [behaviour description]', async () => {
+      // Given [precondition]
+      // When [action]
+      // Then [expected result]
+    });
+    it('T-1.2: [another behaviour]', () => {
+      // Test implementation
+    });
+  });
+});
+```
+---
+## Guidelines
+- Make failure states meaningful with expected error messages
+- Avoid over-prescribing implementation details
+- Focus on externally observable behaviour
+- Keep tests small and isolated with one main assertion per test
+- Clean up async tasks and resources at test end
+- Use `it.skip` or `test.todo` for pending/blocked tests

package/README.md CHANGED

@@ -77,11 +77,34 @@ Run the pipeline with the `/implement-feature` skill in Claude Code:
 /implement-feature "user-auth" --no-history  # Skip history recording
 /implement-feature "user-auth" --no-commit   # Skip auto-commit
 /implement-feature "user-auth" --pause-after=alex|cass|nigel|codey-plan
+/implement-feature "user-auth" --with-stories  # Force include Cass stage
+/implement-feature "user-auth" --skip-stories  # Force skip Cass stage
+```
+## Smart Story Routing (v2.7)
+The pipeline automatically classifies features as **technical** or **user-facing** and routes accordingly:
+| Feature Type | Cass Stage | Example Features |
+|--------------|------------|------------------|
+| **Technical** | Skipped | refactoring, optimization, infrastructure, caching |
+| **User-facing** | Included | login flows, dashboards, forms, notifications |
+This saves ~25-40k tokens per technical feature while preserving story quality for user-facing features.
+```bash
+# Auto-detection (default)
+/implement-feature "token-optimization"  # Detected as technical → skips Cass
+/implement-feature "user-dashboard"      # Detected as user-facing → includes Cass
+# Manual override
+/implement-feature "edge-case" --with-stories   # Force include Cass
+/implement-feature "edge-case" --skip-stories   # Force skip Cass
 ```
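
Note on the routing decision: the README describes the outcome of classification but the diff shown here does not include the classifier itself. The sketch below is one plausible shape for it, assuming a simple keyword match on the feature slug and assuming that the manual flags take precedence over auto-detection. `classifyFeature`, `shouldRunCass`, the keyword lists and the fallback to user-facing are all hypothetical.

```javascript
// Hypothetical keyword classifier; orchestr8's actual classifier module is not shown in this diff.
// Keywords are taken from the "Example Features" column of the README table.
const TECHNICAL_WORDS = new Set(['refactoring', 'optimization', 'infrastructure', 'caching']);
const USER_FACING_WORDS = new Set(['login', 'dashboard', 'form', 'notification']);

function classifyFeature(slug) {
  // Match whole words so that "platform" does not count as "form".
  const words = slug.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  if (words.some((word) => USER_FACING_WORDS.has(word))) return 'user-facing';
  if (words.some((word) => TECHNICAL_WORDS.has(word))) return 'technical';
  return 'user-facing'; // assumption: when unsure, keep the Cass stage
}

function shouldRunCass(slug, flags = {}) {
  if (flags.withStories && flags.skipStories) {
    throw new Error('--with-stories and --skip-stories cannot be combined');
  }
  if (flags.withStories) return true;  // --with-stories forces the stage on
  if (flags.skipStories) return false; // --skip-stories forces the stage off
  return classifyFeature(slug) === 'user-facing';
}

console.log(shouldRunCass('token-optimization'));                 // false
console.log(shouldRunCass('user-dashboard'));                     // true
console.log(shouldRunCass('edge-case', { skipStories: true }));   // false
console.log(shouldRunCass('edge-case', { withStories: true }));   // true
```

Defaulting ambiguous slugs to user-facing trades some tokens for safety: a wrongly skipped Cass stage loses the stories, while a wrongly included one only costs the extra tokens.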
 ## Pipeline Flow
-The pipeline now includes validation, feedback loops, and history tracking:
+The pipeline includes validation, smart routing, feedback loops, and history tracking:
 ```
 ┌─────────────────────────────────────────────────────────────────┐
@@ -97,19 +120,30 @@ The pipeline now includes validation, feedback loops, and history tracking:
                               │
                               ▼
 ┌─────────────────────────────────────────────────────────────────┐
-│  Alex (Feature Spec)                                            │
-│         │                                                       │
-│         ▼                                                       │
-│  Cass rates Alex → Quality Gate (pause if rating < 3)           │
-│         │                                                       │
-│         ▼                                                       │
-│  Cass (User Stories)                                            │
-│         │                                                       │
-│         ▼                                                       │
-│  Nigel rates Cass → Quality Gate                                │
-│         │                                                       │
-│         ▼                                                       │
-│  Nigel (Tests)                                                  │
+│  Alex (Feature Spec) + Handoff Summary                          │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│  Smart Routing (v2.7)                                           │
+│  • Classify feature as technical or user-facing                 │
+│  • Technical → skip Cass (saves ~25-40k tokens)                 │
+│  • User-facing → include Cass                                   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+              ┌───────────────┴───────────────┐
+              │                               │
+              ▼                               ▼
+┌──────────────────────┐        ┌──────────────────────┐
+│  Technical Features  │        │  User-Facing Features│
+│  Skip to Nigel       │        │  Cass (User Stories) │
+└──────────────────────┘        └──────────────────────┘
+              │                               │
+              └───────────────┬───────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│  Nigel (Tests) + Handoff Summary                                │
 │         │                                                       │
 │         ▼                                                       │
 │  Codey rates Nigel → Quality Gate                               │
@@ -145,6 +179,10 @@ orchestr8 includes these built-in modules for observability and self-improvement
 | **insights** | Analyzes patterns, detects bottlenecks, recommends improvements |
 | **retry** | Smart retry strategies based on failure history |
 | **feedback** | Agent-to-agent quality assessment with correlation tracking |
+| **classifier** | Smart routing — classifies features as technical or user-facing |
+| **handoff** | Structured summaries between agents for token efficiency |
+| **business-context** | Lazy loading of business context based on feature needs |
+| **tools** | Tool schemas and validation for Claude native features |
 ### How They Work Together
@@ -175,14 +213,24 @@ analyzes:               recommends:            calibrates:
 ```
 your-project/
 ├── .blueprint/
-│   ├── agents/                    # Agent specifications (with guardrails)
+│   ├── agents/                    # Agent specifications
 │   │   ├── AGENT_SPECIFICATION_ALEX.md
 │   │   ├── AGENT_BA_CASS.md
 │   │   ├── AGENT_TESTER_NIGEL.md
-│   │   └── AGENT_DEVELOPER_CODEY.md
-│   ├── templates/                 # Spec templates
+│   │   ├── AGENT_DEVELOPER_CODEY.md
+│   │   └── GUARDRAILS.md          # Shared guardrails (v2.7)
+│   ├── prompts/                   # Slim runtime prompts (v2.7)
+│   │   ├── TEMPLATE.md
+│   │   ├── alex-runtime.md
+│   │   ├── cass-runtime.md
+│   │   ├── nigel-runtime.md
+│   │   ├── codey-plan-runtime.md
+│   │   └── codey-implement-runtime.md
+│   ├── templates/                 # Spec and output templates
 │   │   ├── SYSTEM_SPEC.md
-│   │   └── FEATURE_SPEC.md
+│   │   ├── FEATURE_SPEC.md
+│   │   ├── STORY_TEMPLATE.md      # (v2.7)
+│   │   └── TEST_TEMPLATE.md       # (v2.7)
 │   ├── ways_of_working/           # Development rituals
 │   ├── features/                  # Feature specs (populated per feature)
 │   └── system_specification/      # System spec (populated on first run)
@@ -244,6 +292,22 @@ $ npx orchestr8 insights
 - Avg duration: 14 min → 11 min (improving)
 ```
+## Token Efficiency (v2.7)
+Version 2.7 introduces several optimizations to reduce token usage:
+| Optimization | Savings | Description |
+|--------------|---------|-------------|
+| **Shared Guardrails** | ~1,200 tokens | Single GUARDRAILS.md instead of duplicated in each agent spec |
+| **Slim Runtime Prompts** | ~5,200 tokens | 30-50 line prompts instead of 200-400 line full specs |
+| **Upstream Summaries** | ~2,000-4,000 tokens | Handoff summaries between agents instead of full artifacts |
+| **Template Extraction** | ~800 tokens | Templates moved to separate files, loaded on demand |
+| **Lazy Business Context** | Variable | Only loaded when feature spec references it |
+| **Compressed Feedback** | ~400 tokens | 3-line feedback prompts instead of 7-line |
+| **Smart Story Routing** | ~25,000-40,000 tokens | Skip Cass for technical features |
+**Total estimated savings: 10,000+ tokens per pipeline run** (more for technical features)
 ## License
 MIT
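
Note on the Token Efficiency totals: the per-row figures in the table added to the README can be summed to see how the "10,000+ tokens per pipeline run" estimate relates to them. The arithmetic below uses only the numbers from that table. The "Lazy Business Context" row is left out because its saving is listed as variable, and treating Smart Story Routing as applying only to technical features follows the README's own description.

```javascript
// Figures copied from the Token Efficiency table; each entry is [low, high] in tokens.
const perRunSavings = {
  'Shared Guardrails': [1200, 1200],
  'Slim Runtime Prompts': [5200, 5200],
  'Upstream Summaries': [2000, 4000],
  'Template Extraction': [800, 800],
  'Compressed Feedback': [400, 400],
};

// Applies only when the Cass stage is skipped (technical features).
const storyRouting = [25000, 40000];

// Adds up a list of [low, high] ranges component by component.
function sumRanges(ranges) {
  return ranges.reduce(([lo, hi], [l, h]) => [lo + l, hi + h], [0, 0]);
}

const everyRun = sumRanges(Object.values(perRunSavings));
const technicalRun = sumRanges([everyRun, storyRouting]);

console.log(everyRun);     // [ 9600, 11600 ]
console.log(technicalRun); // [ 34600, 51600 ]
```

On these figures the fixed optimizations sum to roughly 9,600-11,600 tokens per run, so the stated 10,000+ sits inside that range, and a technical feature that also skips Cass would save roughly 34,600-51,600 tokens.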