npm - orchestr8 - Versions diffs - 2.6.0 → 2.7.0 - Mend

Files changed (46)

package/.blueprint/features/feature_lazy-business-context/IMPLEMENTATION_PLAN.md ADDED

@@ -0,0 +1,54 @@
+# Implementation Plan - Lazy Business Context Loading
+## Summary
+Implement lazy loading of business context by adding a detection function that scans feature specs for `.business_context` or `business_context/` references. Update the orchestrator to store the detection result in the queue and modify SKILL.md agent prompts to conditionally include the business context directive based on agent name (Alex always gets it) and detection/override flag state.
+## Files to Create/Modify
+| Path | Action | Purpose |
+|------|--------|---------|
+| `src/business-context.js` | Create | Detection logic and context inclusion functions |
+| `src/orchestrator.js` | Modify | Integrate detection during queue initialization |
+| `SKILL.md` | Modify | Update agent prompts with conditional business context |
+| `bin/cli.js` | Modify | Add `--include-business-context` flag parsing |
+## Implementation Steps
+1. **Create `src/business-context.js` module**
+   - Export `needsBusinessContext(featureSpecContent)` - returns boolean based on string matching
+   - Export `parseIncludeBusinessContextFlag(args)` - returns boolean for flag presence
+   - Export `shouldIncludeBusinessContext(agentName, detected, overrideFlag)` - determines if agent gets context
+   - Export `generateBusinessContextDirective(includeContext)` - returns directive string or empty
+   - Tests covered: T-DL-1 through T-DL-5, T-OF-1, T-AE-1 through T-AE-5, T-CI-1, T-CI-2
+2. **Update `src/orchestrator.js` queue structure**
+   - Modify `setCurrent()` to accept optional `needsBusinessContext` parameter
+   - Add field to `current` object: `needsBusinessContext: boolean`
+   - Tests covered: T-CI-3, T-CI-4
+3. **Update `bin/cli.js` argument parsing**
+   - Add `--include-business-context` to recognized flags
+   - Pass flag value to orchestrator when initializing queue
+   - Tests covered: T-OF-1, T-OF-4
+4. **Integrate detection in pipeline setup (Step 5)**
+   - Read feature spec content when initializing queue
+   - Call `needsBusinessContext()` on content
+   - Apply override flag if present
+   - Store result in queue `current.needsBusinessContext`
+   - Tests covered: T-INT-1, T-INT-2, T-INT-3
+5. **Update SKILL.md agent prompts conditionally**
+   - Add conditional note in Step 6 (Alex): "Business Context: .business_context/" always included
+   - Add conditional note in Steps 7-10: "Business Context: .business_context/" only if `needsBusinessContext: true`
+   - Document new `--include-business-context` flag in Invocation section
+   - Tests covered: T-CI-1, T-CI-2, T-AE-1 through T-AE-3
+6. **Add exports to `src/index.js`**
+   - Export business-context module functions for external use
+## Risks/Questions
+- **Risk**: Feature specs that need business context but don't explicitly cite it will miss that context. Mitigation: The `--include-business-context` override flag provides an escape hatch.
+- **Question**: Should detection log when it skips business context for debugging? Recommend: Add optional verbose logging but default to silent.
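
Reviewer sketch for the plan above: a minimal, hypothetical version of `src/business-context.js`. The four function names and the `--include-business-context` flag come from the plan; the function bodies, the case-insensitive match on the agent name, and the exact directive string (taken from the SKILL.md note in step 5) are assumptions, not the package's actual implementation.

```javascript
// Hypothetical sketch of src/business-context.js.
// Function names follow the plan; bodies are assumptions.

// The plan states that Alex always receives the business context.
const ALWAYS_INCLUDED_AGENT = 'alex';

// Detection: plain string matching on the feature spec content.
function needsBusinessContext(featureSpecContent) {
  if (typeof featureSpecContent !== 'string') return false;
  return (
    featureSpecContent.includes('.business_context') ||
    featureSpecContent.includes('business_context/')
  );
}

// Override: true when the flag is present anywhere in the argument list.
function parseIncludeBusinessContextFlag(args) {
  return Array.isArray(args) && args.includes('--include-business-context');
}

// Decision: Alex always gets context; other agents need detection or the override.
function shouldIncludeBusinessContext(agentName, detected, overrideFlag) {
  if (String(agentName).toLowerCase() === ALWAYS_INCLUDED_AGENT) return true;
  return Boolean(detected || overrideFlag);
}

// Directive text injected into the agent prompt, or an empty string.
function generateBusinessContextDirective(includeContext) {
  return includeContext ? 'Business Context: .business_context/' : '';
}

// Usage: a spec that never cites business context, run without the override flag.
const specDetected = needsBusinessContext('# Feature: user-auth\nNo context cited.');
const override = parseIncludeBusinessContextFlag(['user-auth']);
console.log(shouldIncludeBusinessContext('Cass', specDetected, override)); // false
console.log(shouldIncludeBusinessContext('Alex', specDetected, override)); // true
```

Keeping the decision in a pure function such as `shouldIncludeBusinessContext` makes the Alex exception and the override easy to cover with the listed T-AE and T-OF tests without touching the queue.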

package/.blueprint/features/feature_model-native-features/FEATURE_SPEC.md ADDED

@@ -0,0 +1,174 @@
+# Feature Specification — Model Native Features
+## 1. Feature Intent
+**Why this feature exists.**
+- Claude and other LLMs have native features that are more token-efficient than text equivalents
+- System prompts are processed differently than user messages (potentially lower cost)
+- Tool use provides structured outputs without parsing overhead
+- Leveraging these features can significantly reduce token usage and improve reliability
+---
+## 2. Scope
+### In Scope
+- Use system prompts for static agent context (specs, guardrails)
+- Use tool definitions for structured outputs (feedback, handoff summaries)
+- Investigate prompt caching for repeated context
+- Document model-specific optimizations
+### Out of Scope
+- Changing pipeline logic
+- Supporting multiple LLM providers (Claude-focused initially)
+- Real-time model switching
+---
+## 3. Actors Involved
+| Actor | Native Feature Usage |
+|-------|---------------------|
+| All Agents | System prompt for static context |
+| All Agents | Tool use for structured outputs |
+| Pipeline | Prompt caching for repeated context |
+---
+## 4. Behaviour Overview
+### System Prompts for Static Context
+**Current approach:**
+```
+User: You are Alex, the System Specification Agent.
+Read your full specification from: .blueprint/agents/AGENT_SPECIFICATION_ALEX.md
+...
+```
+**Native approach:**
+```
+System: [Agent spec content loaded here - potentially cached/cheaper]
+User: Create a feature specification for "user-auth".
+Inputs: ...
+Outputs: ...
+```
+### Tool Use for Structured Outputs
+**Current approach:**
+```
+Output your feedback as:
+FEEDBACK: { "rating": N, "issues": [...], "recommendation": "..." }
+```
+**Native approach:**
+```javascript
+tools: [{
+  name: "submit_feedback",
+  description: "Submit quality rating for prior stage",
+  input_schema: {
+    type: "object",
+    properties: {
+      rating: { type: "number", minimum: 1, maximum: 5 },
+      issues: { type: "array", items: { type: "string" } },
+      recommendation: { enum: ["proceed", "pause", "revise"] }
+    },
+    required: ["rating", "issues", "recommendation"]
+  }
+}]
+```
+### Prompt Caching
+- Cache static content (agent specs, guardrails, templates)
+- Reduce repeated token transmission
+- Leverage Claude's prompt caching feature when available
+**Key outcomes:**
+- Reduced token costs for static content
+- More reliable structured outputs (no parsing errors)
+- Faster responses with cached prompts
+---
+## 5. State & Lifecycle Interactions
+- No pipeline state changes
+- Tool responses integrated into existing feedback flow
+- Caching is transparent to pipeline logic
+---
+## 6. Rules & Decision Logic
+| Rule | Description |
+|------|-------------|
+| System prompt for static | Agent specs, guardrails go in system prompt |
+| User prompt for dynamic | Task-specific instructions in user message |
+| Tools for structure | Any JSON output should use tool definitions |
+| Cache when possible | Repeated context should leverage caching |
+---
+## 7. Dependencies
+- Claude API access with system prompt support
+- Tool use capability in Task tool
+- Prompt caching feature (if available)
+- SKILL.md restructured for system/user prompt split
+---
+## 8. Non-Functional Considerations
+- **Performance:** Significant token reduction (system prompts may be cached/cheaper)
+- **Reliability:** Tool use eliminates JSON parsing errors
+- **Complexity:** Requires understanding model-specific features
+- **Portability:** Optimizations may be Claude-specific
+---
+## 9. Assumptions & Open Questions
+**Assumptions:**
+- Task tool can be configured to use system prompts
+- Tool definitions can be passed to sub-agents
+- Prompt caching provides meaningful savings
+**Open Questions:**
+- How does Task tool handle system prompts currently?
+- What's the actual cost difference for system vs user tokens?
+- Is prompt caching available and how is it triggered?
+- Should we abstract these features for multi-model support later?
+---
+## 10. Impact on System Specification
+- Adds model-specific optimization layer
+- May need to document Claude-specific features in system spec
+- Consider portability implications if multi-model support is planned
+---
+## 11. Handover to BA (Cass)
+**Story themes:**
+- Investigate system prompt support in Task tool
+- Restructure SKILL.md for system/user prompt split
+- Define tool schemas for structured outputs (feedback, handoff)
+- Implement tool-based feedback collection
+- Investigate and document prompt caching
+**Expected story boundaries:**
+- One story for system prompt investigation and implementation
+- One story for tool schema definitions
+- One story for feedback tool integration
+- One story for prompt caching investigation
+---
+## 12. Change Log (Feature-Level)
+| Date | Change | Reason | Raised By |
+|-----|------|--------|-----------|
+| 2026-02-25 | Initial spec | Token efficiency improvement | Claude |
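
Reviewer sketch of why the spec treats the text-based `FEEDBACK: { ... }` convention as less reliable than tool use. The parser below is hypothetical (`parseFeedbackFromText` is an illustrative name, not a function from the package) and assumes the JSON payload sits on the same line as the marker.

```javascript
// Hypothetical parser for the text convention shown under "Current approach".
// Assumption: the JSON object follows the marker on a single line.
function parseFeedbackFromText(output) {
  const marker = 'FEEDBACK:';
  const start = output.indexOf(marker);
  if (start === -1) {
    return { ok: false, error: 'no FEEDBACK marker found' };
  }
  // Take the rest of the marker line and try to parse it as JSON.
  const payload = output.slice(start + marker.length).split('\n')[0].trim();
  try {
    return { ok: true, value: JSON.parse(payload) };
  } catch (err) {
    // Trailing commas, unquoted keys, or wrapped lines all end up here.
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
}

// A well-formed reply parses; a reply with a trailing comma does not.
console.log(parseFeedbackFromText('FEEDBACK: { "rating": 4, "issues": [], "recommendation": "proceed" }').ok);
console.log(parseFeedbackFromText('FEEDBACK: { "rating": 4, }').ok);
```

With a tool definition such as `submit_feedback`, the model supplies the fields as structured input, so the failure modes caught by the `catch` branch here move out of the pipeline's own parsing code.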

package/.blueprint/features/feature_model-native-features/IMPLEMENTATION_PLAN.md ADDED

@@ -0,0 +1,45 @@
+# Implementation Plan - Model Native Features
+## Summary
+This feature extracts model-native constructs (tool schemas, prompt structure helpers) into reusable modules to improve token efficiency and output reliability. The implementation focuses on creating exportable tool schema definitions and prompt builder utilities that can be integrated into the pipeline's existing feedback and handoff systems.
+## Files to Create/Modify
+| Path | Action | Purpose |
+|------|--------|---------|
+| `src/tools/schemas.js` | Create | Define tool schemas for feedback and handoff |
+| `src/tools/validation.js` | Create | Tool input validation utilities |
+| `src/tools/prompts.js` | Create | System/user prompt structure builders |
+| `src/tools/index.js` | Create | Module exports |
+| `src/feedback.js` | Modify | Use schema validation from tools module |
+| `src/handoff.js` | Modify | Add handoff tool schema support |
+| `test/feature_model-native-features.test.js` | Modify | Import from implementation modules |
+## Implementation Steps
+1. **Create `src/tools/schemas.js`** - Define `FEEDBACK_TOOL_SCHEMA` and `HANDOFF_TOOL_SCHEMA` as exportable constants matching the test definitions. Include name, description, and input_schema with all property constraints.
+2. **Create `src/tools/validation.js`** - Implement `validateToolInput(schema, input)` function that validates inputs against schema constraints (type checking, bounds, enums, required fields). Return `{ valid, errors }` format.
+3. **Create `src/tools/prompts.js`** - Implement `buildPromptMessages(staticContent, dynamicContent)` returning array with system and user messages. Include `cache_control: { type: 'ephemeral' }` on system prompt. Add `identifyCacheableContent(content)` helper.
+4. **Create `src/tools/index.js`** - Export all schemas, validation, and prompt utilities from single entry point.
+5. **Update `src/feedback.js`** - Replace inline validation logic with imported `validateToolInput` using `FEEDBACK_TOOL_SCHEMA`. Maintain backward compatibility with existing `parseFeedbackFromOutput` function.
+6. **Update `src/handoff.js`** - Add `validateHandoffToolInput` function using `HANDOFF_TOOL_SCHEMA`. Keep existing markdown-based validation for backward compatibility.
+7. **Update tests** - Modify test file to import schemas and validation from `src/tools/` instead of inline definitions. Ensure all 12 test cases pass.
+8. **Add module to main exports** - Update `src/index.js` to export tools module for external use.
+9. **Run full test suite** - Verify all tests pass with `node --test`.
+10. **Document usage** - Add brief note to SKILL.md about tool schema availability (optional, if time permits).
+## Risks/Questions
+- **Task tool system prompt support**: The feature spec notes uncertainty about how Task tool handles system prompts. Implementation focuses on schemas/utilities first; actual Task tool integration may require separate investigation.
+- **Prompt caching availability**: Claude's prompt caching feature availability is undetermined. The `cache_control` structure is included but actual caching benefits depend on API support.
+- **Backward compatibility**: Both feedback.js and handoff.js must maintain existing text-based parsing alongside new tool validation to avoid breaking current pipeline.
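
Reviewer sketch of steps 1 to 3 above. `FEEDBACK_TOOL_SCHEMA` copies the `submit_feedback` definition from the feature spec; `validateToolInput` and `buildPromptMessages` use the names and return shapes given in the plan, but their bodies are assumptions. In particular, the message array with `cache_control` on the system entry follows the plan's wording only; where a given API expects the system prompt and cache markers should be checked against that API's documentation.

```javascript
// Schema copied from the submit_feedback tool in FEATURE_SPEC.md.
const FEEDBACK_TOOL_SCHEMA = {
  name: 'submit_feedback',
  description: 'Submit quality rating for prior stage',
  input_schema: {
    type: 'object',
    properties: {
      rating: { type: 'number', minimum: 1, maximum: 5 },
      issues: { type: 'array', items: { type: 'string' } },
      recommendation: { enum: ['proceed', 'pause', 'revise'] },
    },
    required: ['rating', 'issues', 'recommendation'],
  },
};

// Assumed behaviour: checks required fields, number bounds, array item types, and enums.
// This is a small hand-rolled subset, not a full JSON Schema validator.
function validateToolInput(schema, input) {
  if (input === null || typeof input !== 'object' || Array.isArray(input)) {
    return { valid: false, errors: ['input must be an object'] };
  }
  const { properties = {}, required = [] } = schema.input_schema;
  const errors = [];
  for (const key of required) {
    if (!(key in input)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, rule] of Object.entries(properties)) {
    if (!(key in input)) continue;
    const value = input[key];
    if (rule.type === 'number') {
      if (typeof value !== 'number' || Number.isNaN(value)) {
        errors.push(`${key} must be a number`);
      } else {
        if (rule.minimum !== undefined && value < rule.minimum) errors.push(`${key} must be >= ${rule.minimum}`);
        if (rule.maximum !== undefined && value > rule.maximum) errors.push(`${key} must be <= ${rule.maximum}`);
      }
    } else if (rule.type === 'array') {
      if (!Array.isArray(value)) {
        errors.push(`${key} must be an array`);
      } else if (rule.items && rule.items.type === 'string' && !value.every((v) => typeof v === 'string')) {
        errors.push(`${key} items must be strings`);
      }
    }
    if (rule.enum && !rule.enum.includes(value)) {
      errors.push(`${key} must be one of: ${rule.enum.join(', ')}`);
    }
  }
  return { valid: errors.length === 0, errors };
}

// Static content goes in the system entry, task-specific content in the user entry.
function buildPromptMessages(staticContent, dynamicContent) {
  return [
    { role: 'system', content: staticContent, cache_control: { type: 'ephemeral' } },
    { role: 'user', content: dynamicContent },
  ];
}

console.log(validateToolInput(FEEDBACK_TOOL_SCHEMA, { rating: 9, issues: [], recommendation: 'proceed' }));
```

Returning `{ valid, errors }` with every violation collected, rather than throwing on the first one, lets `src/feedback.js` report all problems in a single pass while the existing text-based parsing stays in place.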

package/.blueprint/features/feature_shared-guardrails/FEATURE_SPEC.md ADDED

@@ -0,0 +1,119 @@
+# Feature Specification — Shared Guardrails
+## 1. Feature Intent
+**Why this feature exists.**
+- The same guardrails section (~45 lines, ~400 tokens) is duplicated verbatim in all 4 agent specification files
+- This wastes ~1,200 tokens per pipeline run (4 agents reading identical content)
+- Extracting to a shared file reduces token usage and ensures consistency when guardrails are updated
+---
+## 2. Scope
+### In Scope
+- Extract guardrails section to `.blueprint/agents/GUARDRAILS.md`
+- Update all agent specs to reference the shared file
+- Ensure agents still read and apply guardrails
+### Out of Scope
+- Changing the content of guardrails
+- Agent-specific guardrail variations (all agents use same guardrails currently)
+---
+## 3. Actors Involved
+| Actor | Can Do | Cannot Do |
+|-------|--------|-----------|
+| Alex | Read shared guardrails, apply to outputs | Modify guardrails |
+| Cass | Read shared guardrails, apply to outputs | Modify guardrails |
+| Nigel | Read shared guardrails, apply to outputs | Modify guardrails |
+| Codey | Read shared guardrails, apply to outputs | Modify guardrails |
+---
+## 4. Behaviour Overview
+**Happy path:**
+1. Pipeline invokes agent (e.g., Alex)
+2. Agent reads slim spec file which references `GUARDRAILS.md`
+3. Agent reads `GUARDRAILS.md` once
+4. Agent applies guardrails to all outputs
+**Key outcomes:**
+- Identical guardrail enforcement as before
+- ~1,200 fewer tokens per pipeline run
+- Single source of truth for guardrail updates
+---
+## 5. State & Lifecycle Interactions
+- No state changes — this is a structural refactor
+- Agent behaviour is unchanged
+- Pipeline flow is unchanged
+---
+## 6. Rules & Decision Logic
+| Rule | Description |
+|------|-------------|
+| Single source | All agents reference the same `GUARDRAILS.md` file |
+| Read once | Each agent reads guardrails once per invocation |
+| No override | Agent specs cannot override shared guardrails |
+---
+## 7. Dependencies
+- All 4 agent specification files must be updated
+- SKILL.md agent prompts may need adjustment if they reference guardrails directly
+- `src/init.js` and `src/update.js` must handle the new file during init/update
+---
+## 8. Non-Functional Considerations
+- **Performance:** Reduces token usage by ~1,200 per run
+- **Maintainability:** Single file to update when guardrails change
+- **Consistency:** Eliminates risk of guardrails drifting between agents
+---
+## 9. Assumptions & Open Questions
+**Assumptions:**
+- All agents will continue to use identical guardrails
+- Agents can follow file references (read file X when spec says "see file X")
+**Open Questions:**
+- Should agent prompts explicitly include guardrails content, or trust agents to read the reference?
+---
+## 10. Impact on System Specification
+- Reinforces existing system assumptions (guardrails apply to all agents)
+- No contradiction with system spec
+- No system spec change required
+---
+## 11. Handover to BA (Cass)
+**Story themes:**
+- Extract guardrails to shared file
+- Update agent specs to reference shared file
+- Update init/update commands to handle new file
+**Expected story boundaries:**
+- One story for file extraction and agent spec updates
+- One story for init/update command changes
+---
+## 12. Change Log (Feature-Level)
+| Date | Change | Reason | Raised By |
+|-----|------|--------|-----------|
+| 2026-02-25 | Initial spec | Token efficiency improvement | Claude |

package/.blueprint/features/feature_shared-guardrails/IMPLEMENTATION_PLAN.md ADDED

@@ -0,0 +1,34 @@
+# Implementation Plan — Shared Guardrails
+## Summary
+Extract the duplicated ~45-line guardrails section from all 4 agent specifications into a new `.blueprint/agents/GUARDRAILS.md` file, then update each agent spec to reference the shared file instead of containing inline guardrails. No code changes required in init.js or update.js since they already copy the entire `agents/` directory.
+## Files to Create/Modify
+| Path | Action | Purpose |
+|------|--------|---------|
+| `.blueprint/agents/GUARDRAILS.md` | Create | New shared guardrails file containing extracted content |
+| `.blueprint/agents/AGENT_SPECIFICATION_ALEX.md` | Modify | Remove inline guardrails, add reference to GUARDRAILS.md |
+| `.blueprint/agents/AGENT_BA_CASS.md` | Modify | Remove inline guardrails, add reference to GUARDRAILS.md |
+| `.blueprint/agents/AGENT_TESTER_NIGEL.md` | Modify | Remove inline guardrails, add reference to GUARDRAILS.md |
+| `.blueprint/agents/AGENT_DEVELOPER_CODEY.md` | Modify | Remove inline guardrails, add reference to GUARDRAILS.md |
+## Implementation Steps
+1. **Create GUARDRAILS.md** - Extract the guardrails section (lines 169-210 from AGENT_SPECIFICATION_ALEX.md) into new file at `.blueprint/agents/GUARDRAILS.md`. Content includes: Allowed Sources, Prohibited Sources, Citation Requirements, Assumptions vs Facts, Confidentiality, Escalation Protocol.
+2. **Update AGENT_SPECIFICATION_ALEX.md** - Remove the inline `## Guardrails` section (lines 169-210). Add reference: `` ## Guardrails\n\nRead and apply the shared guardrails from: `.blueprint/agents/GUARDRAILS.md` ``
+3. **Update AGENT_BA_CASS.md** - Remove inline guardrails section (lines 383-425). Add same reference format.
+4. **Update AGENT_TESTER_NIGEL.md** - Remove inline guardrails section (lines 174-216). Add same reference format.
+5. **Update AGENT_DEVELOPER_CODEY.md** - Remove inline guardrails section (lines 426-468). Add same reference format.
+6. **Run tests** - Execute `node --test test/feature_shared-guardrails.test.js` to verify all ACs pass.
+## Risks/Questions
+- ASSUMPTION: Claude agents can follow file references and will read GUARDRAILS.md when instructed. Per story-extract-guardrails.md:Notes, this is a documented assumption.
+- No code changes needed in src/init.js or src/update.js since `agents/` is already in the UPDATABLE array (per story-update-init-commands.md:Notes).
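
Reviewer sketch of steps 1 to 5 above as a string transformation. The plan identifies the section by line ranges; this sketch swaps in heading matching instead (from `## Guardrails` up to the next level-2 heading), which is an assumption about how the agent specs are laid out. `extractGuardrails` is a hypothetical helper name.

```javascript
// Reference block that replaces the inline section, as worded in step 2 of the plan.
const REFERENCE_LINES = [
  '## Guardrails',
  '',
  'Read and apply the shared guardrails from: `.blueprint/agents/GUARDRAILS.md`',
  '',
];

// Hypothetical helper: splits an agent spec into the extracted guardrails
// section and a slim spec that only references the shared file.
function extractGuardrails(markdown) {
  const lines = markdown.split('\n');
  const start = lines.findIndex((line) => line.trim() === '## Guardrails');
  if (start === -1) {
    return { guardrails: null, slimSpec: markdown };
  }
  // The section ends at the next level-2 heading, or at the end of the file.
  let end = lines.findIndex((line, i) => i > start && /^## /.test(line));
  if (end === -1) end = lines.length;

  const guardrails = lines.slice(start, end).join('\n');
  const slimSpec = [
    ...lines.slice(0, start),
    ...REFERENCE_LINES,
    ...lines.slice(end),
  ].join('\n');
  return { guardrails, slimSpec };
}

const sample = [
  '# Agent: Alex',
  '',
  '## Guardrails',
  '### Allowed Sources',
  '- Repository files',
  '',
  '## Outputs',
  '- FEATURE_SPEC.md',
].join('\n');
console.log(extractGuardrails(sample).slimSpec);
```

Matching on the heading rather than on fixed line numbers keeps the same helper usable across all four agent specs, whose guardrails sections start at different lines.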

package/.blueprint/features/feature_shared-guardrails/story-extract-guardrails.md ADDED

@@ -0,0 +1,60 @@
+# Story — Extract Guardrails to Shared File
+## User story
+As a framework maintainer, I want to extract the duplicated guardrails section from all agent specification files into a single shared GUARDRAILS.md file so that guardrails are maintained in one place and token usage is reduced.
+---
+## Context / scope
+- Affects all 4 agent specifications (Alex, Cass, Nigel, Codey)
+- Guardrails section is currently ~45 lines / ~400 tokens per agent
+- New file location: `.blueprint/agents/GUARDRAILS.md`
+- This is a structural refactor; agent behaviour remains unchanged
+Per FEATURE_SPEC.md:Section 1: "The same guardrails section (~45 lines, ~400 tokens) is duplicated verbatim in all 4 agent specification files"
+---
+## Acceptance criteria
+**AC-1 — Shared guardrails file exists**
+- Given the `.blueprint/agents/` directory,
+- When the shared guardrails feature is implemented,
+- Then a new file `GUARDRAILS.md` exists at `.blueprint/agents/GUARDRAILS.md`
+- And it contains the complete guardrails content (Allowed Sources, Prohibited Sources, Citation Requirements, Assumptions vs Facts, Confidentiality, Escalation Protocol sections).
+**AC-2 — Agent specs reference shared file**
+- Given each agent specification file (AGENT_SPECIFICATION_ALEX.md, AGENT_BA_CASS.md, AGENT_TESTER_NIGEL.md, AGENT_DEVELOPER_CODEY.md),
+- When the shared guardrails feature is implemented,
+- Then the inline guardrails section is removed from each file
+- And each file contains a reference to read `.blueprint/agents/GUARDRAILS.md`.
+**AC-3 — Guardrails content is identical**
+- Given the new `GUARDRAILS.md` file,
+- When comparing its content to the original inline guardrails,
+- Then the content is identical (no additions, removals, or modifications to guardrail rules).
+**AC-4 — Agent specs remain functional**
+- Given an agent (e.g., Alex) reads its specification file,
+- When the specification references `GUARDRAILS.md`,
+- Then the agent can locate and read the shared guardrails file
+- And the agent applies all guardrails to its outputs.
+**AC-5 — No duplicate guardrails remain**
+- Given all 4 agent specification files,
+- When searching for guardrails sections,
+- Then no file contains inline guardrails content (only the reference to the shared file).
+---
+## Out of scope
+- Modifying the content of guardrails (per FEATURE_SPEC.md:Section 2)
+- Agent-specific guardrail variations
+- Changes to init/update commands (covered in separate story)
+- Modifying agent prompts in SKILL.md
+---
+## Notes
+- Per FEATURE_SPEC.md:Section 6, agents cannot override shared guardrails
+- Per FEATURE_SPEC.md:Section 9, ASSUMPTION: Agents can follow file references (read file X when spec says "see file X")
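
Reviewer sketch of how AC-1, AC-2, and AC-5 could be checked on file contents. The section names and the shared path come from the acceptance criteria; `missingGuardrailSections` and `checkAgentSpec` are hypothetical helpers, and detecting inline guardrails by searching for the section names is an assumed heuristic.

```javascript
// Section names listed in AC-1.
const GUARDRAIL_SECTIONS = [
  'Allowed Sources',
  'Prohibited Sources',
  'Citation Requirements',
  'Assumptions vs Facts',
  'Confidentiality',
  'Escalation Protocol',
];
const SHARED_GUARDRAILS_PATH = '.blueprint/agents/GUARDRAILS.md';

// AC-1: the shared file should mention every guardrails section.
function missingGuardrailSections(guardrailsContent) {
  return GUARDRAIL_SECTIONS.filter((name) => !guardrailsContent.includes(name));
}

// AC-2 and AC-5: an agent spec should reference the shared file
// and should no longer carry any of the sections inline.
function checkAgentSpec(specContent) {
  return {
    referencesSharedFile: specContent.includes(SHARED_GUARDRAILS_PATH),
    inlineSections: GUARDRAIL_SECTIONS.filter((name) => specContent.includes(name)),
  };
}

const slimSpec = '## Guardrails\n\nRead and apply the shared guardrails from: `.blueprint/agents/GUARDRAILS.md`\n';
console.log(checkAgentSpec(slimSpec));
```

AC-3 (identical content) and AC-4 (the agent actually applies the guardrails) are not covered here; the first needs the original text for comparison and the second depends on agent behaviour at run time.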

package/.blueprint/features/feature_shared-guardrails/story-update-init-commands.md ADDED

@@ -0,0 +1,63 @@
+# Story — Update Init/Update Commands for Shared Guardrails
+## User story
+As a user installing or updating orchestr8, I want the init and update commands to correctly handle the new GUARDRAILS.md file so that the shared guardrails are properly distributed to target projects.
+---
+## Context / scope
+- Affects `src/init.js` (initialization command)
+- Affects `src/update.js` (update command)
+- New file `.blueprint/agents/GUARDRAILS.md` must be included in both operations
+- Existing behaviour for other files remains unchanged
+Per FEATURE_SPEC.md:Section 7: "src/init.js and src/update.js must handle the new file during init/update"
+---
+## Acceptance criteria
+**AC-1 — Init copies GUARDRAILS.md**
+- Given a user runs `orchestr8 init` in a new project,
+- When the `.blueprint/agents/` directory is copied,
+- Then the `GUARDRAILS.md` file is included in the copied content
+- And the file is placed at `.blueprint/agents/GUARDRAILS.md` in the target project.
+**AC-2 — Update replaces GUARDRAILS.md**
+- Given a user runs `orchestr8 update` in an existing project,
+- When the `.blueprint/agents/` directory is updated,
+- Then the `GUARDRAILS.md` file is replaced with the latest version from the package
+- And the file is placed at `.blueprint/agents/GUARDRAILS.md` in the target project.
+**AC-3 — Update preserves user content directories**
+- Given a user runs `orchestr8 update`,
+- When the update process completes,
+- Then `features/` and `system_specification/` directories remain untouched
+- And only `agents/`, `templates/`, and `ways_of_working/` are replaced.
+**AC-4 — Agent specs with references are copied**
+- Given the agent specification files now reference `GUARDRAILS.md`,
+- When init or update copies the agent specs,
+- Then all agent specs (AGENT_*.md) are copied with their guardrails references intact
+- And no orphaned guardrails references exist (GUARDRAILS.md is always present when agent specs reference it).
+**AC-5 — Backward compatibility**
+- Given an existing project with old agent specs (containing inline guardrails),
+- When a user runs `orchestr8 update`,
+- Then the old agent specs are replaced with new specs referencing GUARDRAILS.md
+- And the new GUARDRAILS.md file is added to the project.
+---
+## Out of scope
+- Changes to the guardrails content itself
+- Changes to SKILL.md agent prompts
+- Changes to queue management or pipeline flow
+- User content migration (users do not have custom guardrails)
+---
+## Notes
+- Per SYSTEM_SPEC.md:Section 8, the update command replaces framework directories while preserving user content
+- The `agents/` directory is listed in UPDATABLE array in `src/update.js`, so GUARDRAILS.md will be handled automatically once it exists in the source
+- No code changes are expected in init.js or update.js since they copy entire directories; the change is purely in the source assets
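
Reviewer sketch of why whole-directory replacement picks up `GUARDRAILS.md` without code changes. The `UPDATABLE` name and its three entries come from the story (Notes and AC-3); `updateBlueprint` and its parameters are hypothetical, and the real `src/update.js` may be structured differently. The sketch assumes Node.js 16.7 or later for `fs.cpSync`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Framework directories replaced on update (per AC-3).
// User content such as features/ and system_specification/ is deliberately absent.
const UPDATABLE = ['agents', 'templates', 'ways_of_working'];

// Hypothetical helper: replaces each updatable directory in the target
// .blueprint folder with the copy shipped in the package.
function updateBlueprint(packageBlueprintDir, targetBlueprintDir) {
  for (const dir of UPDATABLE) {
    const src = path.join(packageBlueprintDir, dir);
    const dest = path.join(targetBlueprintDir, dir);
    if (!fs.existsSync(src)) continue; // nothing shipped for this directory
    // Remove the old copy first so stale files do not survive the update.
    fs.rmSync(dest, { recursive: true, force: true });
    // Copy the whole directory; any new file in agents/ comes along automatically.
    fs.cpSync(src, dest, { recursive: true });
  }
}
```

Because the copy works at directory level, adding `GUARDRAILS.md` to the package's `agents/` folder is enough for AC-1, AC-2, and AC-5, while leaving `features/` out of `UPDATABLE` covers AC-3.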