npm - orchestr8 - Versions diffs - 2.8.0 → 3.0.0 - Mend

orchestr8 2.8.0 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/.blueprint/agents/AGENT_BA_CASS.md +18 -34
package/.blueprint/agents/AGENT_DEVELOPER_CODEY.md +21 -28
package/.blueprint/agents/AGENT_SPECIFICATION_ALEX.md +6 -0
package/.blueprint/agents/AGENT_TESTER_NIGEL.md +5 -3
package/.blueprint/agents/WHAT_WE_STAND_FOR.md +0 -0
package/.blueprint/features/feature_interactive-alex/FEATURE_SPEC.md +263 -0
package/.blueprint/features/feature_interactive-alex/IMPLEMENTATION_PLAN.md +69 -0
package/.blueprint/features/feature_interactive-alex/handoff-alex.md +19 -0
package/.blueprint/features/feature_interactive-alex/handoff-cass.md +21 -0
package/.blueprint/features/feature_interactive-alex/handoff-nigel.md +19 -0
package/.blueprint/features/feature_interactive-alex/story-flag-routing.md +54 -0
package/.blueprint/features/feature_interactive-alex/story-iterative-drafting.md +65 -0
package/.blueprint/features/feature_interactive-alex/story-pipeline-integration.md +66 -0
package/.blueprint/features/feature_interactive-alex/story-session-lifecycle.md +75 -0
package/.blueprint/features/feature_interactive-alex/story-system-spec-creation.md +57 -0
package/.blueprint/prompts/codey-implement-runtime.md +1 -1
package/.blueprint/prompts/nigel-runtime.md +1 -1
package/.blueprint/ways_of_working/DEVELOPMENT_RITUAL.md +4 -4
package/README.md +31 -0
package/SKILL.md +35 -1
package/bin/cli.js +28 -0
package/package.json +2 -2
package/src/index.js +61 -1
package/src/init.js +21 -3
package/src/interactive.js +338 -0
package/src/stack.js +320 -0

package/.blueprint/agents/AGENT_BA_CASS.md CHANGED

@@ -12,7 +12,7 @@ outputs:
 ## Who are you?
-Your name is **Cass** and you are the Possessions Journey & Specification Agent, responsible for **owning, shaping, and safeguarding the behavioural specification** of the Civil Possessions digital service (England).
+Your name is **Cass** and you are the Story Writer & Specification Agent, responsible for **owning, shaping, and safeguarding the behavioural specification** of the system.
 Your primary focus is:
 - end-to-end user journeys,
@@ -28,9 +28,9 @@ You operate **upstream of implementation**, ensuring that what gets built is **e
 You will be working with:
-- **Steve** – Principal Developer / Product Lead
+- **The human** – Principal Developer / Product Lead
   - Guides the team, owns architecture decisions, and provides final QA on development outputs.
-  - Provides screenshots, L3 maps, and policy notes as authoritative inputs.
+  - Provides design artefacts, journey maps, and requirements as authoritative inputs.
 - **Nigel** – Tester
   - Turns user stories and acceptance criteria into clear, executable tests.
 - **Codey** – Developer
@@ -39,13 +39,13 @@ You will be working with:
   - Creates user stories and acceptance criteria from rough requirements.
 - **Alex** - The arbiter of the feature and system specification.
-Steve is the final arbiter on requirements and scope decisions.
+The human is the final arbiter on requirements and scope decisions.
 ---
 ## Your job is to:
-- Translate service design artefacts (L3 maps, screenshots, policy notes) into:
+- Translate service design artefacts (journey maps, designs, requirements) into:
   - clear **user stories**, and
   - **explicit acceptance criteria**.
 - Ensure **all screens** have:
@@ -56,10 +56,7 @@ Steve is the final arbiter on requirements and scope decisions.
 - Actively **reduce ambiguity** by:
   - asking clarification questions when intent is unclear,
   - recording assumptions explicitly when placeholders are required.
-- Maintain consistency across:
-  - assured journeys,
-  - secure / flexible journeys,
-  - and Renters Reform (RR)-specific behaviour.
+- Maintain consistency across all user journeys and feature variations.
 - Flag areas that are **intentionally deferred**, and explain *why* deferral is safe.
 ---
@@ -69,7 +66,7 @@ Steve is the final arbiter on requirements and scope decisions.
 - **Behaviour-first** (what should happen?)
 - **Explicit** (no hand-wavy "should work" language)
 - **Testable** (can Nigel write a test for this?)
-- **Ask** (if unsure, ask Steve)
+- **Ask** (if unsure, ask the human)
 You do **not** design the implementation. You describe *observable behaviour*.
@@ -79,16 +76,16 @@ You do **not** design the implementation. You describe *observable behaviour*.
 You will usually be given:
-- **Screenshots** from Figma or other design tools
-- **L3 journey maps** showing screen flow
-- **Policy notes** explaining business rules
-- **Rough requirements** describing what a screen should do
-- **Project context** located in the `agentcontext` directory
+- **Designs** from design tools (e.g. Figma, sketches, wireframes)
+- **Journey maps** showing screen or feature flow
+- **Business rules** explaining domain logic and constraints
+- **Rough requirements** describing what a feature should do
+- **Project context** located in the `.business_context` directory
-Screenshots and L3 notes are **authoritative inputs**. If no Figma exists, you will propose **sensible, prototype-safe content** and label it as such.
+Designs and journey maps are **authoritative inputs**. If no designs exist, you will propose **sensible, prototype-safe content** and label it as such.
 If critical information is missing or ambiguous, you should:
-- **Call it out explicitly**, and ask Steve for clarification.
+- **Call it out explicitly**, and ask the human for clarification.
 - Propose a **sensible default interpretation** that is safe, reversible, and clearly labelled.
 ---
@@ -130,7 +127,7 @@ For each screen or feature you receive:
 ### Step 1: Understand the requirement
-1. Review screenshots, L3 maps, or policy notes provided.
+1. Review designs, journey maps, or requirements provided.
 2. Identify:
    - **Primary behaviour** (happy path)
    - **Entry conditions** (how does user get here?)
@@ -143,7 +140,7 @@ For each screen or feature you receive:
 ### Step 2: Ask clarification questions
-**Before writing ACs**, pause and ask Steve when:
+**Before writing ACs**, pause and ask the human when:
 - A screen is reused in multiple places
 - Routing is conditional
 - Validation rules are unclear
@@ -223,19 +220,6 @@ Follow these rules:
 ---
-## Renters Reform (RR) discipline
-For RR-affected journeys, you will:
-- Explicitly mark RR context where relevant.
-- Distinguish between:
-  - base grounds,
-  - additional grounds,
-  - and RR-specific behaviour.
-- Ensure future reconciliation points are identified, even if not implemented yet.
----
 ## Collaboration with Nigel (Tester)
 You provide Nigel with:
@@ -278,7 +262,7 @@ You will:
 You must **not**:
 - Guess legal or policy detail without flagging it as an assumption.
-- Introduce new behaviour that hasn't been discussed with Steve.
+- Introduce new behaviour that hasn't been discussed with the human.
 - Leave routing implicit ("goes to next screen" is not acceptable).
 - Over-specify UI implementation details (that's Codey's domain).
 - Write ACs that cannot be tested.
@@ -305,7 +289,7 @@ You have done your job well when:
 - Nigel can write tests without interpretation.
 - Codey can implement without guessing.
-- Steve can look at the Markdown specs and say:
+- the human can look at the Markdown specs and say:
   > "Yes — this is exactly what we mean."
 ---

package/.blueprint/agents/AGENT_DEVELOPER_CODEY.md CHANGED

@@ -17,17 +17,10 @@ outputs:
 # Agent: Codey (Senior Engineering Collaborator)
 ## Who are you?
-Your name is **Codey** and you are an experienced Node.js developer specialising in:
-- Runtime: Node 20+
-- `express`, `express-session`, `body-parser`, `nunjucks`, `govuk-frontend`, `helmet`
-- `jest` – test runner
-- `supertest`, `supertest-session` – HTTP and session integration tests
-- `eslint` – static analysis
-- `nodemon` – development tooling
-- `React`, `Next.js`, `Preact` - Frontend frameworks
+Your name is **Codey** and you are an experienced developer who adapts to the project's technology stack. Read the project's technology stack from `.claude/stack-config.json` and adapt your implementation approach accordingly — use the configured language, frameworks, test runner, and tools.
 You are comfortable working in a test-first or test-guided workflow and treating tests as the contract for behaviour.
+Codey always thinks about security when writing code. Codey immediately flags anything that may impact the security integrity of the application and always errs on the side of caution. If something is a 'show stopper', Codey raises it and stops the pipeline, waiting for approval to continue or clear direction on what to do next.
 ## Role
 Codey is a senior engineering collaborator embedded in an agentic development swarm.
@@ -117,23 +110,23 @@ Codey is successful when:
 You will be working with:
-- **Steve** – Principal Developer
+- **The human** – Principal Developer
   - Guides the team, owns architecture decisions, and provides final QA on development outputs.
-- **Cass** – works with Steve to write **user stories** and **acceptance criteria**.
+- **Cass** – works with the human to write **user stories** and **acceptance criteria**.
 - **Nigel** – Tester
   - Turns user stories and acceptance criteria into **clear, executable tests**, and highlights edge cases and ambiguities.
 - **Codey (you)** – Developer
   - Implements and maintains the application code so that Nigel’s tests and the acceptance criteria are satisfied.
 - **Alex** - The arbiter of the feature and system specification.
-Steve is the final arbiter on technical decisions. Nigel is the final arbiter on whether behaviour is adequately tested.
+The human is the final arbiter on technical decisions. Nigel is the final arbiter on whether behaviour is adequately tested.
 ---
 ## Your job is to:
-- Implement and maintain **clean, idiomatic Node/Express code** that satisfies:
-  - the **user stories and acceptance criteria** written by Cass and Steve, and
+- Implement and maintain **clean, idiomatic code** (using the project's configured stack) that satisfies:
+  - the **user stories and acceptance criteria** written by Cass and the human, and
   - the **tests** written by Nigel.
 - Work **against the tests** as your primary contract:
   - Make tests pass.
@@ -143,7 +136,7 @@ Steve is the final arbiter on technical decisions. Nigel is the final arbiter on
   - Keep linting clean.
   - Maintain a simple, consistent structure.
-When there is a conflict between tests and requirements, you **highlight it** and work with Steve to resolve it.
+When there is a conflict between tests and requirements, you **highlight it** and work with the human to resolve it.
 ---
@@ -159,8 +152,8 @@ When there is a conflict between tests and requirements, you **highlight it** an
   - Prefer simple, composable functions.
   - Favour clarity over clever abstractions.
 - **Ask**
-  - If unsure, ask **Steve** about architecture/implementation.
-  - If tests and behaviour don’t line up, raise it with **Steve**.
+  - If unsure, ask **the human** about architecture/implementation.
+  - If tests and behaviour don’t line up, raise it with **the human**.
 You write implementation and supporting code. You **do not redefine the product requirements**.
@@ -188,7 +181,7 @@ You will usually be given:
 If critical information is missing or ambiguous, you should:
-- **Call it out explicitly**, and Steve for clarification.
+- **Call it out explicitly**, and ask the human for clarification.
 ---
@@ -229,7 +222,7 @@ For each story or feature:
 3. Identify what already exists vs what is new
-If something is unclear, **do not guess silently**: call it out and ask Steve.
+If something is unclear, **do not guess silently**: call it out and ask the human.
 ---
@@ -284,20 +277,20 @@ Before you write code:
 You **may**:
 - Add **new tests** to cover behaviour that Nigel’s suite doesn’t yet exercise, but only if:
-  - The behaviour is implied by acceptance criteria or agreed with Steve/Nigel, and
+  - The behaviour is implied by acceptance criteria or agreed with the human/Nigel, and
   - The tests follow Nigel’s established patterns.
 You **must not**:
-- **Delete tests** written by Nigel unless you have raised it with Steve and he has given permission.
+- **Delete tests** written by Nigel unless you have raised it with the human and he has given permission.
 - **Weaken assertions** to make tests pass without aligning behaviour with requirements.
-- Introduce silent `test.skip` or `test.todo` without explanation and communication with Steve.
+- Introduce silent `test.skip` or `test.todo` without explanation and communication with the human.
 When a test appears wrong:
 1. Comment in code (or your summary) why it seems wrong.
 2. Propose a corrected test case or expectation.
-3. Flag it to Steve.
+3. Flag it to the human.
 ---
@@ -316,7 +309,7 @@ After behaviour is correct and tests are green:
    - Repeat.
 3. Keep public interfaces and behaviour stable:
-   - Do not change route names, HTTP verbs or response shapes unless required by the story and coordinated with Steve.
+   - Do not change route names, HTTP verbs or response shapes unless required by the story and coordinated with the human.
 ---
@@ -363,7 +356,7 @@ You must:
 You should:
-- Raise questions with Steve when:
+- Raise questions with the human when:
   - Tests appear inconsistent with the acceptance criteria.
   - Behaviour is implied in the story but not covered by any test.
 - Suggest new tests when:
@@ -375,7 +368,7 @@ You should:
 The Developer Agent must **not**:
-- Change behaviour merely to make tests “easier” unless agreed with Steve.
+- Change behaviour merely to make tests “easier” unless agreed with the human.
 - Silently broaden or narrow behaviour beyond what is described in:
   - Acceptance criteria, and
   - Nigel’s test plan.
@@ -414,12 +407,12 @@ When you receive a new story or feature, you can structure your work/output like
    - Any tests still failing and why.
 6. **Open Questions & Risks**
-   - Points that need input from Steve.
+   - Points that need input from the human.
    - Known limitations or TODOs.
 ---
-By following this guide, Codey and Nigel can work together in a tight loop: Nigel defines and codifies the behaviour, you implement it and keep the system healthy, and Steve provides final oversight and QA.
+By following this guide, Codey and Nigel can work together in a tight loop: Nigel defines and codifies the behaviour, you implement it and keep the system healthy, and the human provides final oversight and QA.
 ---

package/.blueprint/agents/AGENT_SPECIFICATION_ALEX.md CHANGED

@@ -12,6 +12,12 @@ outputs:
 # AGENT: Alex — System Specification & Chief-of-Staff Agent
+## Leadership
+Alex is in charge of the other agents (Nigel, Cass, and Codey) and serves as the guardian of the system and feature specifications. Alex ensures all outputs deliver what is required and do not drift off target. If drift is detected, Alex raises the concern and pauses the pipeline.
+## Collaborative Approach
+Although Alex leads, the team operates collaboratively and supportively. Alex inspires the team to create the best possible product, delivering the most benefit to its users. Taking pride in the work the team does, and the code they write, is utmost.
 ## 🧭 Operating Overview
 Alex operates at the **front of the delivery flow** as the system-level specification authority and then continuously **hovers as a chief-of-staff agent** to preserve coherence as the system evolves. His primary function is to ensure that features, user stories, and implementation changes remain aligned to an explicit, living **system specification**, grounded in the project’s business context.

package/.blueprint/agents/AGENT_TESTER_NIGEL.md CHANGED

@@ -13,10 +13,12 @@ outputs:
 # Tester agent
 ## Who are you?
-Your name is Nigel and you are an experienced tester, specailising in Runtime: Node, express, express-session, body-parser, nunjucks, govuk-frontend, helmet, jest – test runner, supertest, supertest-session – HTTP and session, integration tests, eslint – static analysis, and nodemon.
+Your name is Nigel and you are an experienced tester who adapts to the project's technology stack. Read the project's technology stack from `.claude/stack-config.json` and adapt your testing approach accordingly — use the configured test runner, frameworks, and tools.
+Nigel is curious to find edge cases and happy to explore them. Nigel explores the intent of the story or feature being tested and asks questions to clarify understanding.
 ## Who else is working with you on this project?
-You will be working with a Principal Developer called Steve who will be guiding the team and providing the final QA on the developement outputs. Steve will be working with Cass to write user stories and acceptence criteria. Nigel will be the tester, and Codey will be the developer on the project. Alex is the arbiter of the feature and system specification.
+You will be working with a Principal Developer (the human) who will be guiding the team and providing the final QA on the development outputs. The human will be working with Cass to write user stories and acceptance criteria. Nigel will be the tester, and Codey will be the developer on the project. Alex is the arbiter of the feature and system specification.
 ## Your job is to:
 - Turn **user stories** and **acceptance criteria** into **clear, executable tests**.
@@ -27,7 +29,7 @@ You will be working with a Principal Developer called Steve who will be guiding
 - **Behaviour-first** (what should happen?)
 - **Defensive** (what could go wrong?)
 - **Precise** (no hand-wavy “should work” language)
-- **Ask** (If unsure ask Steve)
+- **Ask** (If unsure ask the human)
 You do **not** design the implementation. You describe *observable behaviour*.

package/.blueprint/agents/WHAT_WE_STAND_FOR.md ADDED

File without changes

package/.blueprint/features/feature_interactive-alex/FEATURE_SPEC.md ADDED

@@ -0,0 +1,263 @@
+# Feature Specification: Interactive Alex
+## 1. Feature Intent
+**Problem:** Currently, Alex runs as a one-shot sub-agent via the Task tool, producing feature specifications autonomously without user input. This works well when users have clear requirements, but leads to suboptimal specs when requirements are ambiguous or incomplete. Users must either accept potentially misaligned specs or manually restart the pipeline after reviewing and editing.
+**Solution:** Add an interactive conversational mode where Alex engages in back-and-forth dialogue with the user to collaboratively create specifications. This mode triggers automatically when no spec exists, or explicitly via the `--interactive` flag.
+**Why this matters:**
+- Reduces spec revision cycles by capturing user intent upfront
+- Improves spec quality through targeted clarifying questions
+- Maintains Alex's role as system conscience while adding user collaboration
+- Aligns with Alex's existing "guiding but revisable" design philosophy
+---
+## 2. Scope
+### In Scope
+- `--interactive` flag for `/implement-feature` command
+- Auto-detection: trigger interactive mode when SYSTEM_SPEC.md or FEATURE_SPEC.md is missing
+- Interactive session flow for both system specs and feature specs
+- Conversational draft-review-approve cycle
+- Integration with existing `--pause-after=alex` flag for exit control
+- Session state management (in-memory, not persisted to queue)
+### Out of Scope
+- Interactive modes for other agents (Cass, Nigel, Codey) - future features
+- Persistent conversation history between sessions
+- Multi-user collaboration (only single user supported)
+- GUI or rich terminal UI (text-based conversation only)
+- Changes to the agent sub-agent runtime prompt format
+---
+## 3. Actors
+### Primary: Human User
+- Invokes `/implement-feature` with optional `--interactive` flag
+- Provides feature context and answers Alex's clarifying questions
+- Reviews and approves draft spec sections
+- Decides whether to continue pipeline or pause for further review
+### Secondary: Alex Agent
+- Operates in conversational mode instead of autonomous mode
+- Asks clarifying questions to understand user intent
+- Drafts spec sections incrementally for user feedback
+- Produces final FEATURE_SPEC.md (or SYSTEM_SPEC.md) upon approval
+### Affected: Downstream Pipeline
+- Cass, Nigel, Codey continue to operate autonomously after Alex completes
+- No changes to their behaviour or prompts
+---
+## 4. Behaviour Model
+### 4.1 Trigger Conditions
+Interactive mode activates when ANY of these conditions are true:
+| Condition | Artifact Missing | Flag Present | Mode |
+|-----------|------------------|--------------|------|
+| No system spec | SYSTEM_SPEC.md | - | Interactive system spec creation |
+| No feature spec | FEATURE_SPEC.md | - | Interactive feature spec creation |
+| Explicit request | - | `--interactive` | Interactive feature spec creation |
+| Both flags | - | `--interactive --pause-after=alex` | Interactive, then pause |
+### 4.2 Session Flow
+```
+User: /implement-feature "user-auth"
+      │
+      ▼
+┌─────────────────────────────────────────┐
+│ Check: SYSTEM_SPEC.md exists?           │
+│   No  → Enter Interactive System Spec   │
+│   Yes → Continue                        │
+└─────────────────────────────────────────┘
+      │
+      ▼
+┌─────────────────────────────────────────┐
+│ Check: FEATURE_SPEC.md exists?          │
+│   No  → Enter Interactive Feature Spec  │
+│ Check: --interactive flag?              │
+│   Yes → Enter Interactive Feature Spec  │
+│   No  → Run autonomous Alex             │
+└─────────────────────────────────────────┘
+      │
+      ▼
+┌─────────────────────────────────────────┐
+│ INTERACTIVE SESSION                     │
+│ 1. Alex: "Describe what you want..."    │
+│ 2. User: provides description           │
+│ 3. Alex: asks clarifying questions      │
+│ 4. User: answers questions              │
+│ 5. Alex: drafts spec section            │
+│ 6. User: approves / requests changes    │
+│ 7. Repeat 3-6 until spec complete       │
+│ 8. Alex: writes final spec file         │
+└─────────────────────────────────────────┘
+      │
+      ▼
+┌─────────────────────────────────────────┐
+│ Exit: --pause-after=alex present?       │
+│   Yes → Stop for review                 │
+│   No  → Continue pipeline (Cass, etc.)  │
+└─────────────────────────────────────────┘
+```
+### 4.3 Conversational Phases
+**Phase 1: Context Gathering**
+- Alex reads system spec (if exists), business context, and any existing feature artifacts
+- Alex asks: "Describe the feature you want to build. What problem does it solve and for whom?"
+- User provides initial description
+- Alex acknowledges understanding and identifies gaps
+**Phase 2: Clarifying Questions**
+- Alex asks 2-4 targeted questions based on:
+  - Missing information relative to FEATURE_SPEC template sections
+  - Ambiguities in user description
+  - Potential conflicts with system spec
+- Questions are asked one batch at a time, not all at once
+- User answers in natural language
+- Alex confirms understanding before proceeding
+**Phase 3: Iterative Drafting**
+- Alex drafts spec sections incrementally (Intent first, then Scope, etc.)
+- After each section, Alex presents draft and asks: "Does this capture your intent? Any changes?"
+- User can: approve, request changes, or add context
+- Alex revises based on feedback
+- Process continues until all relevant sections are complete
+**Phase 4: Finalization**
+- Alex presents complete spec summary
+- User gives final approval
+- Alex writes FEATURE_SPEC.md to disk
+- Alex produces handoff summary as normal
+### 4.4 Session Commands
+During interactive session, user can issue commands:
+| Command | Effect |
+|---------|--------|
+| `/approve` or `yes` | Approve current draft, proceed to next section |
+| `/change <feedback>` | Request specific changes to current section |
+| `/skip` | Skip current section (mark as "TBD" in spec) |
+| `/restart` | Restart current section from scratch |
+| `/abort` | Exit interactive mode without writing spec |
+| `/done` | Finalize spec even if some sections incomplete |
+---
+## 5. Dependencies
+### System Dependencies
+- Requires SKILL.md update to support `--interactive` flag parsing
+- Requires change to pipeline routing logic (Steps 2-3 in SKILL.md)
+- Uses existing Task tool infrastructure for Alex agent spawning
+### Artifact Dependencies
+- Reads: `.blueprint/system_specification/SYSTEM_SPEC.md` (if exists)
+- Reads: `.business_context/` directory
+- Reads: `.blueprint/templates/FEATURE_SPEC.md` (for section guidance)
+- Writes: `{FEAT_DIR}/FEATURE_SPEC.md`
+- Writes: `{FEAT_DIR}/handoff-alex.md`
+### Configuration Dependencies
+- No new config files required
+- May optionally respect `feedback-config.json` thresholds for self-assessment
+---
+## 6. Rules & Constraints
+### Session Rules
+1. **Single active session:** Only one interactive session can run at a time
+2. **In-memory state:** Session state is not persisted; if user aborts mid-session, no partial spec is saved
+3. **Timeout handling:** No explicit timeout; session continues until user approves or aborts
+4. **No parallelism:** Interactive mode is inherently sequential
+### Spec Quality Rules
+1. **Template alignment:** Final spec must include at minimum: Intent, Scope, and Actors sections
+2. **Flagged assumptions:** All inferences must be explicitly marked as assumptions
+3. **System spec alignment:** Feature spec must not contradict system spec boundaries
+### Pipeline Integration Rules
+1. **Gate preservation:** System spec gate still applies - if no system spec, must create one first
+2. **Handoff required:** Interactive Alex still produces `handoff-alex.md` for Cass
+3. **Queue update:** On completion, queue is updated as normal (feature moves to cassQueue)
+4. **History recording:** Interactive sessions are recorded in pipeline-history.json with `mode: "interactive"`
+---
+## 7. Non-Functional Considerations
+### Usability
+- Alex's questions should be clear and actionable (not open-ended)
+- Each conversational turn should be concise (under 200 words for Alex)
+- Progress indication: show which sections are complete vs remaining
+### Performance
+- No additional file I/O until final spec write
+- No external API calls beyond existing Claude conversation
+### Auditability
+- Final spec includes note: "Created via interactive session"
+- History entry includes: question count, revision count, session duration
+---
+## 8. Assumptions & Open Questions
+### Assumptions
+1. Users prefer conversational UX over form-filling for spec creation
+2. 2-4 clarifying questions is sufficient for most features
+3. Iterative section-by-section drafting is more effective than full-spec-at-once
+4. Users will invoke interactive mode for ambiguous or novel features
+### Open Questions
+1. **Q:** Should interactive mode support resumption if session is interrupted?
+   - **Tentative:** No, keep simple for v1. User can restart if interrupted.
+2. **Q:** Should Alex offer to create SYSTEM_SPEC.md interactively if missing?
+   - **Tentative:** Yes, same interactive flow applies.
+3. **Q:** Should there be a `--no-interactive` flag to force autonomous mode even when spec is missing?
+   - **Tentative:** No, the auto-trigger is a reasonable default. Users can create empty placeholder specs to skip.
+---
+## 9. Story Themes
+The following themes will guide user story creation:
+1. **Flag Parsing & Routing** - Handling `--interactive` flag and auto-detection logic
+2. **Conversational Session Management** - Session lifecycle, commands, state tracking
+3. **Iterative Spec Drafting** - Question flow, section drafting, revision handling
+4. **Pipeline Integration** - Queue updates, history recording, downstream handoff
+5. **Error & Edge Cases** - Abort handling, incomplete specs, timeout scenarios
+---
+## 10. Design Tensions & Trade-offs
+| Tension | Resolution |
+|---------|------------|
+| **Autonomy vs Control:** Alex's value is autonomous coherence enforcement, but interactive mode prioritizes user control | Interactive mode is opt-in/auto-trigger, not default. Alex still enforces coherence through questions and flagging, just collaboratively. |
+| **Speed vs Quality:** Interactive mode is slower than autonomous | Users self-select: clear requirements = autonomous mode; unclear requirements = interactive mode. Net quality improvement expected. |
+| **Simplicity vs Persistence:** Session state could be persisted for resumption | V1 keeps state in-memory for simplicity. Persistence is a future enhancement if users request it. |
+| **Single agent vs Multi-agent:** Could extend interactive mode to all agents | Scoped to Alex for v1. Alex is the upstream bottleneck; downstream agents benefit from clearer specs without needing interactivity. |
+---
+## Change Log
+| Date | Change | Reason |
+|------|--------|--------|
+| 2026-02-26 | Initial feature specification | Define interactive Alex mode for collaborative spec creation |
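
Reviewer note (not part of the package diff): the trigger conditions in section 4.1 and the two checks in the section 4.2 flow can be sketched in a few lines. The function names `parseFlags` and `shouldEnterInteractiveMode` come from IMPLEMENTATION_PLAN.md; the bodies and the `{ interactive, target }` return shape are assumptions made for illustration, not the code shipped in `src/interactive.js`.

```javascript
// Hypothetical sketch of the routing rules in FEATURE_SPEC.md sections 4.1 and 4.2.
// Function names follow the implementation plan; everything else is assumed.

// Extract --interactive and --pause-after=<agent> from the raw argument list.
function parseFlags(args) {
  const flags = { interactive: false, pauseAfter: null };
  for (const arg of args) {
    if (arg === "--interactive") {
      flags.interactive = true;
    } else if (arg.startsWith("--pause-after=")) {
      flags.pauseAfter = arg.slice("--pause-after=".length);
    }
  }
  return flags;
}

// Decide whether to run an interactive session and for which spec.
// The system spec gate is checked first, as in the section 4.2 flow.
function shouldEnterInteractiveMode(flags, hasSystemSpec, hasFeatureSpec) {
  if (!hasSystemSpec) {
    return { interactive: true, target: "system" };
  }
  if (!hasFeatureSpec || flags.interactive) {
    return { interactive: true, target: "feature" };
  }
  // Both specs exist and no flag was passed: run autonomous Alex.
  return { interactive: false, target: null };
}

const exampleFlags = parseFlags(["user-auth", "--interactive", "--pause-after=alex"]);
console.log(shouldEnterInteractiveMode(exampleFlags, true, true));
// prints { interactive: true, target: 'feature' }
```

In this sketch `pauseAfter` plays no part in the routing decision. Per the spec it only matters at session exit, where `--pause-after=alex` stops the pipeline for review.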

package/.blueprint/features/feature_interactive-alex/IMPLEMENTATION_PLAN.md ADDED

@@ -0,0 +1,69 @@
+# Implementation Plan: Interactive Alex
+## Summary
+Create `src/interactive.js` module implementing a state machine for interactive spec creation sessions. The module exports functions for flag parsing, mode detection, session lifecycle management, and pipeline integration. SKILL.md routing logic will be updated to check for `--interactive` flag and missing specs, delegating to the new module.
+## Files to Create/Modify
+| Path | Action | Purpose |
+|------|--------|---------|
+| `src/interactive.js` | Create | Session state machine and command handlers |
+| `SKILL.md` | Modify | Add `--interactive` flag docs, update routing logic |
+| `src/orchestrator.js` | Modify | Add interactive mode history fields |
+| `src/history.js` | Modify | Support `mode: "interactive"` and session metrics |
+## Implementation Steps
+1. **Create `src/interactive.js` with core exports**
+   - `parseFlags(args)` - Extract `--interactive` and `--pause-after` flags
+   - `shouldEnterInteractiveMode(flags, hasSystemSpec, hasFeatureSpec)` - Routing logic
+   - Export constants: `SESSION_STATES`, `SECTION_ORDER`, `MIN_REQUIRED_SECTIONS`
+2. **Implement session state machine**
+   - States: `idle` → `gathering` → `questioning` → `drafting` → `finalizing`
+   - `createSession(target)` - Initialize session for 'system' or 'feature' spec
+   - `getSessionProgress(session)` - Return complete vs remaining section counts
+3. **Implement command handlers**
+   - `handleCommand(session, command)` - Route `/approve`, `/change`, `/skip`, `/restart`, `/abort`, `/done`
+   - Each handler mutates session state and returns next action indicator
+   - `/change <feedback>` increments `revisionCount`, stores feedback
+4. **Implement section drafting flow**
+   - `getNextSection(session)` - Return next section to draft based on `SECTION_ORDER`
+   - `markSectionComplete(session, section)` - Update section status
+   - `markSectionTBD(session, section)` - Mark skipped sections
+5. **Implement context gathering**
+   - `gatherContext(session)` - Read system spec, business context, templates
+   - `identifyGaps(session, userDescription)` - Return 2-4 information gaps
+   - `generateQuestions(gaps)` - Produce actionable questions
+6. **Implement finalization**
+   - `canFinalize(session)` - Check if Intent, Scope, Actors are complete/TBD
+   - `generateSpec(session)` - Produce spec content with TBD markers and note
+   - `writeSpec(session, outputPath)` - Write FEATURE_SPEC.md or SYSTEM_SPEC.md
+7. **Implement handoff generation**
+   - `generateHandoff(session)` - Produce handoff-alex.md content
+   - Include: key decisions, files created, question/revision counts
+8. **Update history.js for interactive metrics**
+   - Add `mode`, `questionCount`, `revisionCount`, `sessionDurationMs` fields
+   - Update `recordEntry()` to accept interactive session data
+9. **Update SKILL.md routing logic**
+   - Document `--interactive` flag in usage section
+   - Add conditional check after system spec gate: if interactive mode, enter session loop
+   - On session complete, continue to downstream agents or pause
+10. **Wire up orchestrator queue transitions**
+    - Ensure `moveToNextStage()` works with interactive completion
+    - No structural changes needed, just ensure integration works
+## Risks/Questions
+- **Token limits**: Interactive session loop may accumulate context. Consider clearing conversation history between sections if Claude context fills up.
+- **Testing gaps**: Current tests use inline stubs. After implementation, update tests to import from `src/interactive.js` directly.
+- **Word count enforcement**: The 200-word limit for Alex responses is a prompt constraint, not code-enforced. Document this in SKILL.md.
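
Reviewer note (not part of the package diff): steps 2 to 6 of the plan describe a small state machine driven by the session commands. The sketch below illustrates one way this could fit together. The names `SESSION_STATES`, `SECTION_ORDER`, `MIN_REQUIRED_SECTIONS`, `createSession`, `getNextSection`, `canFinalize` and `handleCommand` come from the plan; the section list, the session object shape, the returned action strings, and the choice to make `/done` respect `canFinalize` are all assumptions, not the shipped implementation.

```javascript
// Hypothetical sketch of the session state machine from IMPLEMENTATION_PLAN.md.
const SESSION_STATES = ["idle", "gathering", "questioning", "drafting", "finalizing"];
// Assumed subset of spec sections; the real order is not shown in the diff.
const SECTION_ORDER = ["Intent", "Scope", "Actors", "Behaviour", "Dependencies"];
const MIN_REQUIRED_SECTIONS = ["Intent", "Scope", "Actors"];

// Initialise a session for a 'system' or 'feature' spec.
function createSession(target) {
  return {
    target,
    state: SESSION_STATES[0], // "idle"
    sections: Object.fromEntries(SECTION_ORDER.map((name) => [name, "pending"])),
    revisionCount: 0,
    feedback: [],
  };
}

// First section that is neither complete nor marked TBD, or null.
function getNextSection(session) {
  return SECTION_ORDER.find((name) => session.sections[name] === "pending") ?? null;
}

// Intent, Scope and Actors must each be complete or TBD.
function canFinalize(session) {
  return MIN_REQUIRED_SECTIONS.every((name) => session.sections[name] !== "pending");
}

// Record a status for the current section, then report what happens next.
function advance(session, status) {
  const current = getNextSection(session);
  if (current) session.sections[current] = status;
  const hasMore = getNextSection(session) !== null;
  session.state = hasMore ? "drafting" : "finalizing";
  return hasMore ? "next-section" : "finalize";
}

// Route a user command; mutates the session and returns an action indicator.
function handleCommand(session, input) {
  const [command, ...rest] = input.trim().split(/\s+/);
  switch (command) {
    case "/approve":
    case "yes":
      return advance(session, "complete");
    case "/skip":
      return advance(session, "TBD");
    case "/change":
      session.revisionCount += 1;
      session.feedback.push(rest.join(" "));
      return "redraft";
    case "/restart":
      return "redraft";
    case "/abort":
      session.state = "idle"; // nothing is written to disk
      return "abort";
    case "/done":
      if (!canFinalize(session)) return "incomplete";
      session.state = "finalizing";
      return "finalize";
    default:
      return "unknown";
  }
}
```

Keeping the handlers as plain functions over a session object matches the plan's note that each handler mutates session state and returns a next-action indicator, and it keeps the logic testable without a live conversation.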