npm - @vpxa/aikit - Versions diffs - 0.1.73 → 0.1.74 - Mend

@vpxa/aikit 0.1.73 → 0.1.74

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (126)

package/scaffold/compiled/flows-data.mjs ADDED

@@ -0,0 +1,1571 @@
+// Auto-generated by scaffold/compile.mjs — DO NOT EDIT
+export const FLOWS = {
+  "_epilogue": [
+    { file: "steps/docs-sync/README.md", content: `# Epilogue: Documentation Sync
+> **This is a mandatory epilogue step.** It runs automatically after every flow completes to keep project documentation synchronized with code changes.
+## Objective
+Review the changes made during this flow and update the \`docs/\` folder using AI Kit analysis tools — never write docs from scratch when a tool can generate the foundation.
+## Prerequisites
+Load the \`docs\` skill before proceeding — it contains the full documentation convention, templates, architecture blueprint workflow, and change-to-doc mapping rules.
+## Instructions
+### 0. Gather Flow Artifacts
+Read all artifacts produced during this flow — they contain design decisions, requirements, and verification results that are the most valuable documentation input.
+\`\`\`
+flow_status()                                           # Get artifactsPath
+find({ pattern: "*.md", path: "{{artifacts_path}}" })   # Discover all flow artifacts
+digest({ sources: [                                     # Compress artifacts for context
+  { path: "<found-artifact-1>" },
+  { path: "<found-artifact-2>" },
+  ...
+]})
+\`\`\`
+Map each discovered artifact to documentation actions using the artifact-to-doc mapping from the \`docs\` skill. Different flows produce different artifacts — read everything \`find()\` returns and focus on content that contains decisions, requirements, and verification results.
+If no artifacts exist, proceed to Step 1 in source-only mode.
+### 1. Assess Changes (tool-driven)
+\`\`\`
+git_context({})                                         # What changed in this flow
+blast_radius({ changed_files: ["<changed-files>"] })    # Impact analysis — which modules affected
+\`\`\`
+Use the output to classify changes:
+| Change Signal | Documentation Action |
+|---------------|---------------------|
+| New files in \`src/\` | Potential new component doc |
+| Modified API surface | Update reference docs |
+| New package or module boundary | Update architecture overview |
+| Architecture decision made | Delegate to \`adr-skill\` |
+| Test-only or config-only changes | Likely skip |
+### 2. Apply the Change-to-Doc Mapping
+Follow the decision tree from the \`docs\` skill to determine which documentation actions are needed.
+### 3. Bootstrap \`docs/\` If Needed (full tool-driven workflow)
+If \`docs/\` doesn't exist, run the **Architecture Blueprint Workflow** from the \`docs\` skill:
+\`\`\`
+# Step 1: Generate content with AI Kit tools
+produce_knowledge({ path: "." })                        # → Foundation for docs/README.md
+analyze_structure({ path: "." })                        # → docs/architecture/overview.md structure
+analyze_diagram({ path: "." })                          # → docs/architecture/ Mermaid diagrams
+analyze_dependencies({ path: "." })                     # → docs/architecture/overview.md deps section
+analyze_entry_points({ path: "." })                     # → docs/reference/api.md foundation
+analyze_patterns({ path: "." })                         # → docs/architecture/overview.md patterns
+# Step 2: Create the docs/ tree from tool outputs
+docs/
+├── README.md              ← From produce_knowledge + project context
+├── architecture/
+│   ├── overview.md        ← From analyze_structure + analyze_dependencies + analyze_diagram
+│   └── components/        ← From analyze_symbols per major component
+├── decisions/
+│   └── index.md           ← ADR log (delegate to adr-skill)
+├── guides/
+│   └── testing.md         ← From analyze_patterns test info
+└── reference/
+    └── api.md             ← From analyze_entry_points
+\`\`\`
+Use the Architecture Blueprint sections from the \`docs\` skill as the template for each document.
+### 4. Update Existing Docs (tool-assisted)
+When \`docs/\` already exists:
+\`\`\`
+compact({ path: "docs/architecture/overview.md", query: "section to update" })  # Read target section
+blast_radius({ changed_files: ["<files>"] })                                     # What's affected
+\`\`\`
+- **Don't rewrite** — update the relevant sections of existing docs
+- **Don't duplicate** — if the information is in code comments or READMEs, reference it
+- Use \`compact\` to read existing doc sections before editing
+- Use \`blast_radius\` output to determine which sections need updating
+### 5. Delegate When Appropriate
+- Architecture decisions → \`adr-skill\` → \`docs/decisions/\`
+- Architecture diagrams → \`c4-architecture\` skill → \`docs/architecture/\`
+- Full architecture refresh → Run the Architecture Blueprint Workflow from \`docs\` skill
+### 6. Update Index
+If documents were added, removed, or renamed, update \`docs/README.md\` to reflect the current structure.
+### 7. Skip If Nothing Changed
+If the flow's changes don't warrant doc updates (e.g., pure bug fix with no revelations), report:
+- "No documentation updates needed"
+- Reason: (brief explanation)
+## Completion Criteria
+- [ ] \`git_context\` + \`blast_radius\` used to assess changes
+- [ ] Change-to-doc mapping applied from \`docs\` skill
+- [ ] \`docs/\` bootstrapped with tool outputs if it didn't exist
+- [ ] Relevant docs created or updated (or skipped with reason)
+- [ ] \`docs/README.md\` index is current
+- [ ] No placeholder/empty docs created — all content tool-generated or hand-written with purpose` },
+  ],
+  "aikit-advanced": [
+    { file: "README.md", content: `# aikit:advanced — Full Development Flow
+Full development flow for **new features, API design, and architecture changes**.
+## Steps
+| # | Step | Skill | Produces | Requires | Agents |
+|---|------|-------|----------|----------|--------|
+| 1 | **Design Gate** | \`steps/design/README.md\` | \`design-decisions.md\` | — | Researcher-Alpha/Beta/Gamma/Delta |
+| 2 | **Specification** | \`steps/spec/README.md\` | \`spec.md\` | \`design-decisions.md\` | Researcher-Alpha |
+| 3 | **Planning** | \`steps/plan/README.md\` | \`plan.md\` | \`spec.md\` | Planner, Explorer |
+| 4 | **Task Breakdown** | \`steps/task/README.md\` | \`tasks.md\` | \`plan.md\` | Planner, Architect-Reviewer-Alpha |
+| 5 | **Execution** | \`steps/execute/README.md\` | \`progress.md\` | \`tasks.md\` | Orchestrator, Implementer, Frontend, Refactor |
+| 6 | **Verification** | \`steps/verify/README.md\` | \`verify-report.md\` | \`progress.md\` | Code-Reviewer-Alpha/Beta, Architect-Reviewer-Alpha/Beta, Security |
+## How It Works
+Each step has a **README.md** file that contains the detailed instructions for the agent(s) executing that step. The Orchestrator reads the README.md via \`flow_read_instruction\` and delegates work accordingly.
+### Step 1: Design Gate
+- Full brainstorming session for new features and architectural changes
+- FORGE classification (\`forge_classify\`) + grounding (\`forge_ground\`) for complex tasks
+- Parallel 4-researcher decision protocol for non-trivial technical decisions
+- ADR generation for critical-tier tasks
+- **Mandatory user stop** before proceeding — design decisions must be approved
+- Read \`steps/design/README.md\` for the full protocol
+### Step 2: Specification
+- Elicit requirements from the user, clarify scope
+- Define acceptance criteria and constraints
+- Build on design decisions from the previous step
+### Step 3: Planning
+- Deep codebase analysis using \`search\`, \`scope_map\`, \`trace\`, \`analyze_*\`
+- Design architecture based on spec and design decisions
+- Create comprehensive implementation plan with file-level changes
+### Step 4: Task Breakdown
+- Break the plan into ordered, atomic implementation tasks
+- Define dependencies between tasks
+- Identify parallel batches for multi-agent execution
+- Architecture review of the task structure
+### Step 5: Execution
+- Orchestrator dispatches agents in parallel batches per the task breakdown
+- Each agent gets a scoped task (1-3 files) with clear acceptance criteria
+- TDD: write tests first, then implement
+- Per-batch review cycle: Code Review (dual) → Arch Review → Security → Evidence Gate
+### Step 6: Verification
+- Dual code review (Code-Reviewer-Alpha + Beta)
+- Architecture review (Architect-Reviewer-Alpha + Beta)
+- Security review
+- Run \`check({})\` + \`test_run({})\` + \`blast_radius({})\`
+- \`evidence_map({ action: "gate" })\` for final quality gate
+## Using Skills Inside Steps
+When the Orchestrator activates a step:
+1. **Read the instruction first** — \`flow_read_instruction\` returns the README.md for the current step
+2. **Follow step instructions** — the README.md is the primary guide for what to do
+3. **Delegate to listed agents** — each step lists which agents are appropriate
+4. **Produce the required artifact** — the step's \`produces\` field specifies what file to create in the artifacts directory
+5. **Check dependencies** — the step's \`requires\` field lists artifacts from previous steps that must exist
+6. **Report status** — agents report \`DONE\` | \`DONE_WITH_CONCERNS\` | \`NEEDS_CONTEXT\` | \`BLOCKED\` to the Orchestrator
+## Artifacts
+All artifacts are stored in the run directory under \`.flows/{topic}/\`. The template variable \`{{artifacts_path}}\` resolves to the actual path at runtime.
+` },
+    { file: "steps/design/README.md", content: `# Design Gate — Advanced Flow
+Full design gate for new features, API design, and architecture changes. Runs brainstorming, decision protocol, and FORGE classification before specification begins.
+## When This Step Runs
+This is the **first step** of the \`aikit:advanced\` flow. It runs before specification.
+## Instructions
+### 1. Task Classification
+Classify the task:
+| Category | Indicators | Action |
+|----------|-----------|--------|
+| **Bug fix** | Error, regression, "fix" — wrong flow, should use \`aikit:basic\` | → Note mismatch, still run Quick Design |
+| **New feature** | New behavior, new API, new component | → Run **Full Design** below |
+| **Architecture change** | Restructure, migration, new pattern, cross-cutting | → Run **Full Design** with architecture focus |
+### 2. FORGE Classification
+Run \`forge_classify({ task: "<task description>", files: [<relevant files>] })\` to determine the complexity tier.
+| Tier | Meaning | Design Depth |
+|------|---------|-------------|
+| **Floor** | Low risk, well-understood | Quick brainstorm, 1-2 decisions |
+| **Standard** | Moderate complexity | Full brainstorm, parallel research, decision protocol |
+| **Critical** | High risk, contract/security implications | Deep brainstorm, 4-researcher parallel review, ADR required |
+### 3. Brainstorming Session
+Load the \`brainstorming\` skill and conduct a structured brainstorming session:
+1. **Intent Discovery** — What is the user trying to achieve? What problem does this solve?
+2. **Constraint Mapping** — Technical constraints, time constraints, compatibility requirements
+3. **Approach Exploration** — Generate 2-4 possible approaches
+4. **Trade-off Analysis** — Compare approaches on: complexity, maintainability, performance, risk
+For **Critical** tier tasks, also explore:
+- Security implications
+- Backward compatibility
+- Migration path
+- Rollback strategy
+### 4. Decision Protocol (Standard & Critical tiers)
+When technical decisions need resolution:
+1. **Identify decisions** — List each decision point with 2+ viable options
+2. **Parallel research** — Delegate to Researcher agents (2 for Standard, 4 for Critical):
+   - Researcher-Alpha: Deep analysis of primary approach
+   - Researcher-Beta: Trade-offs and edge cases of alternatives
+   - Researcher-Gamma: Cross-domain patterns and precedents
+   - Researcher-Delta: Feasibility and performance implications
+3. **Synthesize** — Combine researcher findings into a recommendation per decision
+4. **ADR** (Critical tier) — Load \`adr-skill\` and create an Architecture Decision Record
+### 5. FORGE Ground (Standard & Critical tiers)
+Run \`forge_ground({ task, root_path: "." })\` to:
+- Scope the affected files and modules
+- Identify unknowns and risks
+- Load existing constraints and conventions
+**Auto-upgrade check**: If \`forge_ground\` reveals contract-type unknowns or security concerns not caught by initial \`forge_classify\`, recommend tier upgrade.
+### 6. Write \`{{artifacts_path}}/design-decisions.md\` to disk
+**You MUST create this file on disk** using \`create_file\` or equivalent — do not just present content in chat.
+\`\`\`markdown
+## Design Decisions
+### FORGE Assessment
+- **Tier**: {Floor | Standard | Critical}
+- **Rationale**: {why this tier}
+- **Auto-upgrade**: {yes/no — if yes, explain}
+### Task Summary
+- **Goal**: {what we're building}
+- **Problem**: {what problem this solves}
+- **Users affected**: {who is impacted}
+### Approach
+- **Chosen approach**: {description}
+- **Alternatives considered**: {list with reasons for rejection}
+### Key Decisions
+| # | Decision | Choice | Rationale |
+|---|----------|--------|-----------|
+| 1 | {decision} | {choice} | {why} |
+### Constraints
+- {constraint 1}
+- {constraint 2}
+### Risks
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|------------|
+| {risk} | {L/M/H} | {L/M/H} | {mitigation} |
+### Open Questions
+- {question 1}
+- {question 2}
+\`\`\`
+### 7. Present to User
+Use \`present({ format: "html" })\` (or \`format: "browser"\` in CLI mode) to show:
+- Design decisions summary
+- FORGE tier and rationale
+- Key trade-offs
+- Open questions requiring user input
+**🛑 MANDATORY STOP** — Wait for user approval of design decisions before proceeding.
+### 8. Report to Orchestrator
+After user approves:
+- \`DONE\` — design decisions approved, ready for specification
+- \`DONE_WITH_CONCERNS\` — approved with caveats (list them)
+- \`NEEDS_CONTEXT\` — user raised questions that need more research
+**Do NOT call \`flow_step\`** — let the Orchestrator advance the flow.
+## Outputs
+Write \`{{artifacts_path}}/design-decisions.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat. This file is a prerequisite for the next step.
+## Produces
+- \`{{artifacts_path}}/design-decisions.md\` — FORGE tier, approach, key decisions, constraints, risks
+## Agents
+- \`Researcher-Alpha\` — Deep analysis of primary approach
+- \`Researcher-Beta\` — Trade-offs and edge cases
+- \`Researcher-Gamma\` — Cross-domain patterns
+- \`Researcher-Delta\` — Feasibility and performance
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`c4-architecture\` | C4 model diagrams showing system structure changes | When visualizing proposed architecture |
+| \`adr-skill\` | Architecture Decision Records for non-trivial decisions | Critical tier — document architecture decisions |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] \`{{artifacts_path}}/design-decisions.md\` written to disk (NOT just presented in chat)
+- [ ] FORGE tier determined and documented
+- [ ] Brainstorming session completed (for Standard+ tier)
+- [ ] Key design decisions documented with rationale
+- [ ] User approval received (🛑 MANDATORY STOP)
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Design decisions**: Chosen approach and alternatives considered with trade-offs
+- **Architecture patterns**: New patterns introduced or existing patterns that must be followed
+- **Constraints discovered**: Technical limitations, compatibility requirements, or performance boundaries
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/execute/README.md", content: `---
+name: execute
+description: Implement all tasks from the task breakdown, dispatching agents in parallel where possible.
+---
+# Execution
+## Prerequisites Check
+Before starting this step, verify:
+- [ ] Task breakdown approved with valid task graph
+- [ ] \`{{artifacts_path}}/tasks.md\` exists with defined batches and dependencies
+If prerequisites are NOT met -> **backtrack to task step** (\`flow_step({ action: 'redo' })\` on previous step)
+## Purpose
+Execute all tasks from the breakdown, dispatching implementation agents in batches for maximum parallelism while maintaining correctness.
+## Inputs
+- \`{{artifacts_path}}/tasks.md\` — the atomic task list with dependencies and agent assignments
+## Process
+1. **Load tasks** — Read task graph, dependencies, and parallelism map
+2. **Execute by batch** — For each batch in the parallelism map:
+   a. Dispatch assigned agents in parallel (different files = safe parallelism)
+   b. Each agent receives: task scope, affected files, acceptance criteria, relevant code context via \`compact()\`/\`digest()\`
+   c. Wait for all agents in batch to complete
+   d. Run \`check({})\` + \`test_run({})\` after each batch
+   e. Fix any failures before proceeding to next batch
+3. **Track progress** — Update task checkboxes as each completes
+4. **Handle failures** — If an agent reports \`BLOCKED\` or \`NEEDS_CONTEXT\`:
+   - Max 2 retries per task with refined context
+   - If still blocked, escalate to user
+5. **Final validation** — After all batches: \`check({})\` + \`test_run({})\` must pass
+## Outputs
+Write \`{{artifacts_path}}/progress.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Execution Progress: <feature title>
+## Task Status
+- [x] T1.1: <description> — DONE
+- [x] T1.2: <description> — DONE
+- [x] T2.1: <description> — DONE
+- [ ] T2.2: <description> — IN PROGRESS
+...
+## Changes Made
+| File | Change | Task |
+|------|--------|------|
+| <file> | <description> | T1.1 |
+| ... | ... | ... |
+## Tests Added/Modified
+<list of test files and what they cover>
+## Validation
+- check: PASS/FAIL
+- test_run: PASS/FAIL (<N> passed, <M> failed)
+## Deviations
+<any changes from the task plan and why>
+## Blocked Items
+<items that needed user intervention, if any>
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Orchestrator** | Coordinates batch execution, handles failures and retries |
+| **Implementer** | Primary code writer for backend, logic, and infrastructure tasks |
+| **Frontend** | UI/UX implementation for React components, styling, responsive design |
+| **Refactor** | Code cleanup, restructuring, and pattern alignment tasks |
+**Parallelism rules**:
+- Read-only agents: unlimited parallelism
+- File-modifying agents: parallel ONLY on completely different files
+- Max 4 concurrent file-modifying agents
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`session-handoff\` | Context preservation for long-running execution phases | When execution spans multiple sessions or context is filling |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+### Orchestrator Dispatch Protocol
+Follow the \`multi-agents-development\` skill patterns for dispatch:
+1. **Independence Check** before parallelizing:
+   - Same files? → sequential
+   - Shared mutable state? → sequential
+   - Execution-order dependent? → sequential
+   - Need shared new types? → define contract first, then parallel
+   - All clear? → **parallel dispatch**
+2. **Subagent Context Template** (each dispatch includes):
+   - **Scope**: exact files + boundary (do NOT touch)
+   - **Goal**: acceptance criteria, testable
+   - **Arch Context**: actual code snippets via \`compact()\`/\`digest()\`
+   - **Constraints**: patterns, conventions, anti-patterns
+   - **Self-Review**: checklist before declaring DONE
+3. **Status Protocol**: \`DONE\` | \`DONE_WITH_CONCERNS\` | \`NEEDS_CONTEXT\` | \`BLOCKED\`
+4. **Max 2 retries** per task — then escalate to user
+## Completion Criteria
+- [ ] All tasks marked completed
+- [ ] \`check({})\` passes
+- [ ] \`test_run({})\` passes
+- [ ] No blocked items remaining
+- [ ] \`{{artifacts_path}}/progress.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Implementation decisions**: Why specific approaches were chosen over alternatives
+- **Patterns established**: New conventions or patterns that future code should follow
+- **Gotchas encountered**: Edge cases, workarounds, or non-obvious behaviors discovered during execution
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/plan/README.md", content: `---
+name: plan
+description: Analyze the codebase, design the architecture, and create a detailed implementation plan.
+---
+# Planning
+## Prerequisites Check
+Before starting this step, verify:
+- [ ] Specification approved with clarity score ≥90
+- [ ] \`{{artifacts_path}}/spec.md\` exists and is complete
+If prerequisites are NOT met → **backtrack to spec step** (\`flow_step({ action: 'redo' })\` on previous step)
+## Purpose
+Translate the specification into a concrete, phased implementation plan with architecture decisions, file-level scope, and dependency ordering.
+## Inputs
+- \`{{artifacts_path}}/spec.md\` — the validated specification
+## Process
+1. **Load spec** — Read and internalize all requirements and acceptance criteria
+2. **Codebase analysis** — \`scope_map({ task: "<feature>" })\` to identify affected subsystems
+3. **Deep dive** — \`file_summary()\` + \`compact()\` on each affected module
+4. **Architecture design** — Decide on:
+   - Where new code lives (new files vs extensions)
+   - API surface changes
+   - Data model changes
+   - Integration patterns
+5. **ADR for non-trivial decisions** — Use \`adr-skill\` for decisions that affect future development
+6. **Phase decomposition** — Break work into 3–10 ordered phases, each independently testable
+7. **Dependency graph** — Map which phases depend on others and which can parallelize
+8. **Risk assessment** — Identify implementation risks per phase
+## Outputs
+Write \`{{artifacts_path}}/plan.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Implementation Plan: <feature title>
+## Architecture Overview
+<high-level design with rationale>
+## Affected Modules
+| Module | Changes | Risk |
+|--------|---------|------|
+| <module> | <what changes> | low/medium/high |
+## Phases
+### Phase 1: <name>
+- **Files**: <list>
+- **Changes**: <description>
+- **Tests**: <what to test>
+- **Depends on**: none
+- **Parallelizable with**: Phase 2
+### Phase 2: <name>
+...
+## Architecture Decisions
+- ADR-<N>: <title> — <chosen option and rationale>
+## Risks
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|------------|
+| <risk> | low/medium/high | <impact> | <mitigation> |
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Planner** | Creates comprehensive TDD implementation plans with phase decomposition |
+| **Explorer** | Rapid codebase exploration for discovery of affected files and dependencies |
+Use Explorer for initial breadth (file discovery, dependency tracing), then Planner for depth (phase design, ordering).
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`adr-skill\` | Architecture Decision Records for non-trivial technical decisions | When plan involves architecture choices that need documentation |
+| \`c4-architecture\` | C4 model diagrams showing system structure changes | When plan modifies system architecture |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] All spec requirements have corresponding plan phases
+- [ ] Each phase has explicit file scope and test strategy
+- [ ] Architecture decisions documented with rationale
+- [ ] Dependency graph has no circular dependencies
+- [ ] Risks identified with mitigations
+- [ ] \`{{artifacts_path}}/plan.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Task dependencies**: Critical ordering constraints and parallel opportunities
+- **Risk assessment**: Identified risks and mitigation strategies
+- **Resource decisions**: File ownership, module boundaries, and integration points
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/spec/README.md", content: `---
+name: spec
+description: Elicit requirements, clarify scope, and define acceptance criteria through structured dialogue.
+---
+# Specification
+## Prerequisites Check
+Before starting this step, verify:
+- [ ] Design decisions approved (design-decisions.md exists with user approval)
+- [ ] FORGE tier assigned and documented
+If prerequisites are NOT met -> **backtrack to design step** (\`flow_step({ action: 'redo' })\` on previous step)
+## Purpose
+Transform a vague or broad feature request into a precise, testable specification through requirements elicitation and stakeholder dialogue.
+## Inputs
+- User's feature request, issue, or idea
+- Codebase context (accessed via aikit MCP tools)
+## Process
+1. **Understand intent** — Parse what the user wants and why
+2. **Search for context** — \`search()\` for related prior decisions, existing patterns, and similar features
+3. **Elicit requirements** — Ask structured questions to clarify:
+   - **Functional**: What must the system do?
+   - **Non-functional**: Performance, security, accessibility constraints
+   - **Scope boundaries**: What is explicitly out of scope?
+   - **Acceptance criteria**: How do we know it's done?
+4. **Score clarity** — Use the \`requirements-clarity\` skill to score 0–100. Iterate questions until ≥ 90.
+5. **Draft specification** — Write formal spec with all requirements resolved
+## Outputs
+Write \`{{artifacts_path}}/spec.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Specification: <feature title>
+## Summary
+<1-2 sentence description>
+## Motivation
+<why this feature is needed>
+## Functional Requirements
+1. <requirement with acceptance criterion>
+2. ...
+## Non-Functional Requirements
+- Performance: <constraints>
+- Security: <constraints>
+- Accessibility: <constraints>
+## Scope
+### In Scope
+- <item>
+### Out of Scope
+- <item>
+## Acceptance Criteria
+- [ ] <testable criterion>
+- [ ] ...
+## Open Questions
+<none — all resolved during elicitation>
+## Clarity Score: <N>/100
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Researcher-Alpha** | Deep analysis of existing codebase patterns, prior decisions, and technical constraints |
+Use the \`brainstorming\` skill for creative/design exploration before formalizing requirements. Use \`requirements-clarity\` skill to score and iterate until the spec is unambiguous.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`requirements-clarity\` | Score requirements 0-100, iterate until ≥90 before proceeding | Before writing spec — ensures requirements are clear enough |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] All functional requirements have acceptance criteria
+- [ ] Scope boundaries are explicit
+- [ ] Requirements clarity score ≥ 90
+- [ ] No open questions remain
+- [ ] \`{{artifacts_path}}/spec.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Requirements clarified**: Ambiguities resolved and assumptions validated
+- **Scope boundaries**: What the spec covers and explicit exclusions
+- **Acceptance criteria**: Key testable conditions that define "done"
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/task/README.md", content: `---
+name: task
+description: Break the implementation plan into ordered, atomic tasks with dependencies and agent assignments.
+---
+# Task Breakdown
+## Prerequisites Check
+Before starting this step, verify:
+- [ ] Implementation plan approved
+- [ ] \`{{artifacts_path}}/plan.md\` exists with defined phases
+If prerequisites are NOT met → **backtrack to the plan step** (\`flow_step({ action: 'redo' })\` on the previous step)
+## Purpose
+Decompose the implementation plan into small, atomic tasks that agents can execute independently, with clear dependency ordering and acceptance criteria per task.
+## Inputs
+- \`{{artifacts_path}}/plan.md\` — the phased implementation plan
+## Process
+1. **Load plan** — Read phases, file scope, and dependency graph
+2. **Decompose phases into tasks** — Each task should:
+   - Touch 1–3 files maximum
+   - Have a single, testable outcome
+   - Take one agent one focused session
+3. **Define dependencies** — Map task-to-task dependencies (not just phase-to-phase)
+4. **Assign agents** — Match each task to the best-fit agent based on scope
+5. **Identify parallelism** — Mark which tasks can run simultaneously
+6. **Architecture review** — Have Architect-Reviewer validate that the task ordering won't create integration issues
+## Outputs
+Write \`{{artifacts_path}}/tasks.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Task Breakdown: <feature title>
+## Task Graph
+### Phase 1: <name>
+- [ ] **T1.1**: <description>
+  - Files: <list>
+  - Agent: <agent name>
+  - Depends on: none
+  - Acceptance: <testable criterion>
+- [ ] **T1.2**: <description>
+  - Files: <list>
+  - Agent: <agent name>
+  - Depends on: T1.1
+  - Acceptance: <testable criterion>
+### Phase 2: <name>
+- [ ] **T2.1**: <description> (can run in parallel with T1.2)
+  ...
+## Parallelism Map
+| Batch | Tasks | Agents |
+|-------|-------|--------|
+| 1 | T1.1 | Implementer |
+| 2 | T1.2, T2.1 | Implementer, Frontend |
+| 3 | T2.2 | Implementer |
+| ... | ... | ... |
+## Estimated Batches: <N>
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Planner** | Decomposes plan phases into atomic tasks with dependency ordering |
+| **Architect-Reviewer-Alpha** | Validates that the task decomposition won't cause integration issues |
+Planner does the decomposition, then Architect-Reviewer-Alpha validates the task graph.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] Every plan phase maps to ≥1 task
+- [ ] Each task touches ≤3 files
+- [ ] Dependencies form a DAG (no cycles)
+- [ ] Parallelism opportunities identified
+- [ ] Architect review confirms no integration risks
+- [ ] \`{{artifacts_path}}/tasks.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Task decomposition rationale**: Why tasks were split this way and what each accomplishes
+- **Interface contracts**: APIs, types, or data shapes that tasks depend on
+- **Coordination points**: Where tasks interact and handoff requirements
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/verify/README.md", content: `---
+name: verify
+description: Dual code review, architecture review, security review, and comprehensive test validation.
+---
+# Verification (Advanced)
+## Prerequisites Check
+Before starting this step, verify:
+- [ ] All tasks marked complete in progress tracker
+- [ ] \`check({})\` and \`test_run({})\` pass
+- [ ] \`{{artifacts_path}}/progress.md\` exists with execution results
+If prerequisites are NOT met → **backtrack to the execute step** (\`flow_step({ action: 'redo' })\` on the previous step)
+## Purpose
+Perform thorough multi-perspective validation of all changes through parallel dual code review, architecture review, and security analysis.
+## Inputs
+- \`{{artifacts_path}}/spec.md\` — original requirements and acceptance criteria
+- \`{{artifacts_path}}/plan.md\` — architecture decisions and phase design
+- \`{{artifacts_path}}/tasks.md\` — task breakdown with per-task acceptance criteria
+- \`{{artifacts_path}}/progress.md\` — implementation status and changes made
+## Process
+1. **Load all artifacts** — Read spec, plan, tasks, and progress
+2. **Dual code review** (parallel):
+   - Code-Reviewer-Alpha: focus on correctness, conventions, quality
+   - Code-Reviewer-Beta: focus on edge cases, error handling, maintainability
+3. **Architecture review** (parallel with code review):
+   - Architect-Reviewer-Alpha: validate changes align with plan and ADRs
+   - Architect-Reviewer-Beta: assess long-term maintainability and evolution
+4. **Security review**:
+   - Security agent: OWASP Top 10, auth/authz, input validation, secrets
+5. **Quality gates** — \`check({})\` + \`test_run({})\` must pass
+6. **Blast radius** — \`blast_radius({ changed_files: [...] })\` on all modified files
+7. **Acceptance criteria** — Verify each spec acceptance criterion is met
+8. **FORGE gate** — \`evidence_map({ action: "gate" })\` for final quality assessment
+9. **Synthesize report** — Merge all reviewer findings into unified verdict
+## Outputs
+Write \`{{artifacts_path}}/verify-report.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Verification Report: <feature title>
+## Verdict: PASS | FAIL
+## Acceptance Criteria
+- [x] <criterion> — verified by <method>
+- [ ] <criterion> — FAILED: <reason>
+## Quality Gates
+- check: PASS/FAIL
+- test_run: PASS/FAIL (<N> passed, <M> failed)
+- blast_radius: <impact summary>
+- FORGE gate: YIELD/HOLD/HARD_BLOCK
+## Code Review (Alpha)
+<findings with severity: critical/major/minor/suggestion>
+## Code Review (Beta)
+<findings with severity>
+## Architecture Review
+<alignment assessment, any concerns>
+## Security Review
+<vulnerabilities found, OWASP compliance>
+## Recommendation
+APPROVE | REQUEST CHANGES
+### Required Changes (if any)
+1. <change needed>
+2. ...
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Code-Reviewer-Alpha** | Primary code review — correctness, quality, conventions |
+| **Code-Reviewer-Beta** | Secondary code review — edge cases, error handling, maintainability |
+| **Architect-Reviewer-Alpha** | Primary architecture review — alignment with plan and ADRs |
+| **Architect-Reviewer-Beta** | Secondary architecture review — long-term evolution |
+| **Security** | Security analysis — OWASP, auth, input validation |
+**Parallelism**: All 5 reviewers can run in parallel — they are read-only.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`lesson-learned\` | Extract engineering principles from completed work | After verification — capture lessons from the implementation |
+| \`session-handoff\` | Context preservation for session continuity | When verification spans sessions or for final handoff documentation |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+### FORGE Quality Gate
+After all reviews complete:
+1. \`evidence_map({ action: "gate", task_id: "<slug>" })\` → returns YIELD/HOLD/HARD_BLOCK
+2. YIELD → approved, proceed to commit
+3. HOLD → minor issues, fix then re-gate (max 3 rounds)
+4. HARD_BLOCK → critical issues, escalate to user
+## Completion Criteria
+- [ ] Dual code review complete (2 reviewers)
+- [ ] Architecture review complete (2 reviewers)
+- [ ] Security review complete
+- [ ] \`check({})\` passes
+- [ ] \`test_run({})\` passes
+- [ ] All spec acceptance criteria verified
+- [ ] FORGE gate passed (YIELD)
+- [ ] \`{{artifacts_path}}/verify-report.md\` written with clear verdict
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Test coverage gaps**: Areas that couldn't be fully tested and why
+- **Quality findings**: Issues found during verification and their resolutions
+- **Session checkpoint**: Summarize what was accomplished, decisions made, and any remaining work
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+  ],
+  "aikit-basic": [
+    { file: "README.md", content: `# aikit:basic — Quick Development Flow
+Quick development flow for **bug fixes, small features, and refactoring**.
+## Steps
+| # | Step | Skill | Produces | Requires | Agents |
+|---|------|-------|----------|----------|--------|
+| 1 | **Design Gate** | \`steps/design/README.md\` | \`design-decisions.md\` | — | Researcher-Alpha/Beta/Gamma/Delta |
+| 2 | **Assessment** | \`steps/assess/README.md\` | \`assessment.md\` | \`design-decisions.md\` | Explorer, Researcher-Alpha |
+| 3 | **Implementation** | \`steps/implement/README.md\` | \`progress.md\` | \`assessment.md\` | Implementer, Frontend |
+| 4 | **Verification** | \`steps/verify/README.md\` | \`verify-report.md\` | \`progress.md\` | Code-Reviewer-Alpha, Security |
+## How It Works
+Each step has a **README.md** file that contains the detailed instructions for the agent(s) executing that step. The Orchestrator reads the README.md via \`flow_read_instruction\` and delegates work accordingly.
+### Step 1: Design Gate
+- **Auto-skips** for bug fixes and refactors (produces a minimal \`design-decisions.md\` noting it was skipped)
+- For small features: runs quick brainstorming, FORGE classification, and optional decision protocol
+- Read \`steps/design/README.md\` for the full decision tree
+### Step 2: Assessment
+- Explore the codebase to understand scope and impact
+- Use \`search\`, \`scope_map\`, \`file_summary\`, \`compact\` to gather context
+- Identify the approach and produce \`assessment.md\`
+### Step 3: Implementation
+- Write code following the assessment plan
+- The Orchestrator dispatches Implementer/Frontend agents with specific file scopes
+- Follow TDD practices where applicable
+### Step 4: Verification
+- Code review, test execution, security check
+- Run \`check({})\` + \`test_run({})\` + \`blast_radius({})\`
+- Produce \`verify-report.md\` with findings
+## Using Skills Inside Steps
+When the Orchestrator activates a step:
+1. **Read the instruction first** — \`flow_read_instruction\` returns the README.md for the current step
+2. **Follow step instructions** — the README.md is the primary guide for what to do
+3. **Delegate to listed agents** — each step lists which agents are appropriate
+4. **Produce the required artifact** — the step's \`produces\` field specifies what file to create in the artifacts directory
+5. **Check dependencies** — the step's \`requires\` field lists artifacts from previous steps that must exist
+6. **Report status** — agents report \`DONE\` | \`DONE_WITH_CONCERNS\` | \`NEEDS_CONTEXT\` | \`BLOCKED\` to the Orchestrator
+## Artifacts
+All artifacts are stored in the run directory under \`.flows/{topic}/\`. The template variable \`{{artifacts_path}}\` resolves to the actual path at runtime.
+` },
+    { file: "steps/assess/README.md", content: `---
+name: assess
+description: Understand scope, analyze the codebase, and identify the implementation approach.
+---
+# Assessment
+## Purpose
+Analyze the task requirements and codebase to produce a clear, actionable assessment before any code changes begin.
+## Inputs
+- User's task description or issue reference
+- Codebase (accessed via aikit MCP tools)
+## Prerequisites Check
+Before executing this step, verify:
+- [ ] Design decisions documented (from the design step)
+- [ ] FORGE classification determined (tier assigned)
+- [ ] If brainstorming was done, session outcomes are recorded
+If any prerequisites are missing or incomplete:
+1. Inform the Orchestrator with specifics about what's missing
+2. Recommend \`flow_step({ action: 'redo' })\` on the **design** step
+3. Do NOT proceed with partial inputs — quality degrades downstream
+## Process
+1. **Parse the goal** — Extract what needs to change, success criteria, and constraints
+2. **Search for prior work** — \`search({ query: "<task keywords>" })\` to check for existing decisions or related code
+3. **Map affected scope** — \`scope_map({ task: "<description>" })\` to identify files and modules involved
+4. **Analyze structure** — \`file_summary()\` on each affected file; \`compact()\` for deeper sections
+5. **Identify risks** — Note dependencies, breaking change potential, test coverage gaps
+6. **Draft approach** — Outline the implementation strategy in 3–7 steps
+## Outputs
+Write \`{{artifacts_path}}/assessment.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Assessment: <task title>
+## Goal
+<what needs to happen>
+## Affected Files
+<list of files with brief reason>
+## Approach
+<numbered implementation steps>
+## Risks
+<potential issues and mitigations>
+## Open Questions
+<anything that needs clarification before proceeding>
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Explorer** | Rapid file discovery, dependency tracing, structural context |
+| **Researcher-Alpha** | Deep analysis of complex logic, prior decisions, architectural implications |
+Use Explorer first for breadth, then Researcher-Alpha for depth on complex areas.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`c4-architecture\` | C4 model architecture diagrams — system context, container, component, deployment views | When visualizing system structure during assessment |
+| \`adr-skill\` | Architecture Decision Records — create, review, maintain ADRs | When assessment reveals non-trivial design decisions |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] All affected files identified
+- [ ] Implementation approach is concrete (not vague)
+- [ ] Risks documented with mitigations
+- [ ] No unresolved open questions that block implementation
+- [ ] \`{{artifacts_path}}/assessment.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Codebase discoveries**: File locations, architecture patterns, or dependency relationships found during assessment
+- **Problem diagnosis**: Root cause analysis, contributing factors, and affected components
+- **Scope decisions**: What's in scope, what's explicitly excluded, and why
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/design/README.md", content: `# Design Gate — Basic Flow
+Lightweight design gate for bug fixes, small features, and refactoring. Evaluates the task type and determines whether design work is needed before proceeding.
+## When This Step Runs
+This is the **first step** of the \`aikit:basic\` flow. It runs before assessment.
+## Instructions
+### 1. Task Classification
+Classify the task into one of these categories:
+| Category | Indicators | Action |
+|----------|-----------|--------|
+| **Bug fix** | Error reports, stack traces, regression, "fix", "broken" | → **Auto-skip** to next step |
+| **Refactor** | Code cleanup, rename, restructure, no behavior change | → **Auto-skip** to next step |
+| **Small feature** | New behavior, new endpoint, new component, UI change | → Run **Quick Design** below |
+**If the task is a bug fix or refactor**, write \`{{artifacts_path}}/design-decisions.md\` to disk:
+\`\`\`markdown
+## Design Decisions
+- **Task type**: Bug fix / Refactor
+- **Design gate**: Auto-skipped — no design work needed
+- **Proceed to**: Assessment
+\`\`\`
+**You MUST create this file on disk** at the exact \`{{artifacts_path}}/design-decisions.md\` path — do not just present the content in chat.
+Then report \`DONE\` to the Orchestrator so the flow advances.
+### 2. Quick Design (Small Features Only)
+For small features that need minimal design:
+1. **FORGE Classify** — Run \`forge_classify({ task: "<task description>", files: [<relevant files>] })\` to determine complexity tier
+2. **Brainstorming** (if tier ≥ Standard) — Load the \`brainstorming\` skill and run a focused brainstorming session:
+   - What is the user trying to achieve?
+   - What are the constraints?
+   - What is the simplest approach?
+3. **Decision Protocol** (if technical decisions exist) — Delegate to 2-4 Researcher agents in parallel:
+   - Each researcher evaluates a different approach
+   - Synthesize findings into a recommendation
+4. **Write \`{{artifacts_path}}/design-decisions.md\`** to disk:
+\`\`\`markdown
+## Design Decisions
+### FORGE Assessment
+- **Tier**: {Floor | Standard | Critical}
+- **Rationale**: {why this tier}
+### Task Summary
+- **Goal**: {what we're building}
+- **Approach**: {chosen approach}
+- **Key decisions**: {list}
+### Constraints
+- {constraint 1}
+- {constraint 2}
+\`\`\`
+### 3. Report to Orchestrator
+When complete, report status:
+- \`DONE\` — design decisions captured, ready for assessment
+- \`DONE_WITH_CONCERNS\` — design captured but open questions remain (list them)
+**Do NOT call \`flow_step\`** — let the Orchestrator advance the flow.
+## Outputs
+Write \`{{artifacts_path}}/design-decisions.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat. This file is a prerequisite for the next step.
+## Produces
+- \`{{artifacts_path}}/design-decisions.md\` — Task classification, FORGE tier, key design decisions
+## Agents
+- \`Researcher-Alpha\`, \`Researcher-Beta\`, \`Researcher-Gamma\`, \`Researcher-Delta\` — for parallel research during decision protocol
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`c4-architecture\` | C4 model architecture diagrams — system context, container, component, deployment views | When visualizing system structure during design |
+| \`adr-skill\` | Architecture Decision Records — create, review, maintain ADRs | When making non-trivial design or technology decisions |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+## Completion Criteria
+- [ ] \`{{artifacts_path}}/design-decisions.md\` written to disk (NOT just presented in chat)
+- [ ] FORGE tier determined (for small features)
+- [ ] Key design decisions documented
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Design decisions**: Chosen approach and alternatives considered with trade-offs
+- **Architecture patterns**: New patterns introduced or existing patterns that must be followed
+- **Constraints discovered**: Technical limitations, compatibility requirements, or performance boundaries
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/implement/README.md", content: `---
+name: implement
+description: Write code following the assessment plan, using TDD practices where applicable.
+---
+# Implementation
+## Purpose
+Execute the implementation plan from the assessment, writing production code and tests.
+## Inputs
+- \`{{artifacts_path}}/assessment.md\` — the approach, affected files, and risks
+## Prerequisites Check
+Before executing this step, verify:
+- [ ] Assessment complete and scope approved (from the assess step)
+- [ ] Files-to-modify list is clear and bounded
+- [ ] \`check({})\` baseline captured (know what currently passes)
+If any prerequisites are missing or incomplete:
+1. Inform the Orchestrator with specifics about what's missing
+2. Recommend \`flow_step({ action: 'redo' })\` on the **assess** step
+3. Do NOT proceed with partial inputs — quality degrades downstream
+## Process
+1. **Read assessment** — Load \`{{artifacts_path}}/assessment.md\` and internalize the approach
+2. **Set up tests first** — Where applicable, write failing tests that define success
+3. **Implement changes** — Follow the approach steps sequentially
+   - One logical change per commit-worthy chunk
+   - Stay within the assessed file scope — do not expand without re-assessment
+4. **Run validation** — \`check({})\` for type/lint errors, \`test_run({})\` for test results
+5. **Fix issues** — Iterate until \`check\` and \`test_run\` pass (max 3 rounds)
+6. **Document progress** — Write progress artifact with what was done and any deviations
+## Outputs
+Write \`{{artifacts_path}}/progress.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Implementation Progress: <task title>
+## Changes Made
+<list of files changed with brief description>
+## Tests
+<tests added or modified>
+## Deviations from Assessment
+<any changes to the original plan and why>
+## Validation
+- check: PASS/FAIL
+- test_run: PASS/FAIL (<N> passed, <M> failed)
+## Notes
+<anything the reviewer should know>
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Implementer** | Primary code writer — follows TDD, writes production code and tests |
+| **Frontend** | UI/UX specialist — use when changes involve React components, styling, or responsive design |
+Dispatch Implementer for backend/logic changes, Frontend for UI changes. Both can run in parallel if working on different files.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`session-handoff\` | Context preservation for session transfers | When implementation is long-running and context may fill up |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always present
+- Artifact content and summaries → present with structured layout
+- Only use plain text for brief confirmations and simple questions
+### Orchestrator Dispatch Protocol
+Follow the \`multi-agents-development\` skill patterns for dispatch:
+1. **Independence Check** before parallelizing:
+   - Same files? → sequential
+   - Shared mutable state? → sequential
+   - Execution-order dependent? → sequential
+   - Need shared new types? → define contract first, then parallel
+   - All clear? → **parallel dispatch**
+2. **Subagent Context Template** (each dispatch includes):
+   - **Scope**: exact files + boundary (do NOT touch)
+   - **Goal**: acceptance criteria, testable
+   - **Arch Context**: actual code snippets via \`compact()\`/\`digest()\`
+   - **Constraints**: patterns, conventions, anti-patterns
+   - **Self-Review**: checklist before declaring DONE
+3. **Status Protocol**: \`DONE\` | \`DONE_WITH_CONCERNS\` | \`NEEDS_CONTEXT\` | \`BLOCKED\`
+4. **Max 2 retries** per task — then escalate to user
+## Completion Criteria
+- [ ] All assessment steps implemented
+- [ ] \`check({})\` passes (no type/lint errors)
+- [ ] \`test_run({})\` passes (no test failures)
+- [ ] No files modified outside assessed scope
+- [ ] \`{{artifacts_path}}/progress.md\` written
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Implementation decisions**: Why specific approaches were chosen over alternatives
+- **Patterns established**: New conventions or patterns that future code should follow
+- **Gotchas encountered**: Edge cases, workarounds, or non-obvious behaviors discovered during implementation
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+    { file: "steps/verify/README.md", content: `---
+name: verify
+description: Review code changes, run tests, validate correctness and quality.
+---
+# Verification
+## Purpose
+Validate that the implementation meets the original requirements, passes all quality gates, and introduces no regressions.
+## Inputs
+- \`{{artifacts_path}}/assessment.md\` — original requirements and approach
+- \`{{artifacts_path}}/progress.md\` — what was implemented and any deviations
+## Prerequisites Check
+Before executing this step, verify:
+- [ ] Implementation complete (from the implement step)
+- [ ] \`check({})\` + \`test_run({})\` pass at baseline
+- [ ] Changed files list is available for blast radius analysis
+If any prerequisites are missing or incomplete:
+1. Inform the Orchestrator with specifics about what's missing
+2. Recommend \`flow_step({ action: 'redo' })\` on the **implement** step
+3. Do NOT proceed with partial inputs — quality degrades downstream
+## Process
+1. **Load context** — Read assessment and progress artifacts
+2. **Code review** — Review all changed files for:
+   - Correctness against requirements
+   - Code quality and adherence to project conventions
+   - Error handling and edge cases
+   - No unnecessary changes (scope creep)
+3. **Run quality gates** — \`check({})\` + \`test_run({})\` must pass
+4. **Blast radius** — \`blast_radius({ changed_files: [...] })\` to assess impact
+5. **Security scan** — Check for OWASP Top 10 issues in changed code
+6. **Write report** — Document findings with PASS/FAIL verdict
+## Outputs
+Write \`{{artifacts_path}}/verify-report.md\` to disk. **You MUST create this file** using \`create_file\` or equivalent — do not just present content in chat.
+Template:
+\`\`\`markdown
+# Verification Report: <task title>
+## Verdict: PASS | FAIL
+## Quality Gates
+- check: PASS/FAIL
+- test_run: PASS/FAIL (<N> passed, <M> failed)
+- blast_radius: <summary of impact>
+## Code Review Findings
+<issues found, if any, with severity>
+## Security
+<any security concerns>
+## Recommendation
+<APPROVE for commit / REQUEST CHANGES with specific items>
+\`\`\`
+## Agents
+| Agent | Role |
+|-------|------|
+| **Code-Reviewer-Alpha** | Primary code reviewer — correctness, quality, conventions |
+| **Security** | Security specialist — vulnerability analysis, OWASP compliance |
+Run both in parallel — they review different aspects of the same changes.
+## Foundation Integration
+Load these skills BEFORE executing this step:
+| Skill | Purpose | When |
+|-------|---------|------|
+| \`aikit\` | Core MCP tools — search, analyze, remember, validate | Always (auto-loaded) |
+| \`present\` | Rich rendering for any structured output — assessments, reports, comparisons, reviews, status boards, tables, charts, and all artifact content | Use for ANY output that benefits from rich rendering, not just dashboards |
+| \`multi-agents-development\` | Dispatch templates, task decomposition, review pipeline patterns | Before dispatching any subagent |
+| \`brainstorming\` | Structured ideation for design/creative decisions | Before any design choice or new feature exploration |
+| \`lesson-learned\` | Extract engineering lessons from completed work via git history | After verification completes — capture principles from what was built |
+| \`session-handoff\` | Context preservation for session transfers | When verification is the final step and session context should be saved |
+### Presentation Rules
+- Use \`present\` for **any output** that benefits from rich rendering — not limited to dashboards
+- Assessments, reports, comparisons, reviews, status boards → \`present({ format: "html" })\`
+- Tables, charts, progress tracking, code review findings → always use \`present\`
+- Artifact content and summaries → use \`present\` with a structured layout
+- Use plain text only for brief confirmations and simple questions
+### FORGE Quality Gate
+After all reviews complete:
+1. \`evidence_map({ action: "gate", task_id: "<slug>" })\` → returns YIELD/HOLD/HARD_BLOCK
+2. YIELD → approved, proceed to commit
+3. HOLD → minor issues, fix then re-gate (max 3 rounds)
+4. HARD_BLOCK → critical issues, escalate to user
+## Completion Criteria
+- [ ] \`check({})\` passes
+- [ ] \`test_run({})\` passes
+- [ ] Code review complete with no blocking issues
+- [ ] Security review complete
+- [ ] Blast radius assessed
+- [ ] \`{{artifacts_path}}/verify-report.md\` written with clear PASS/FAIL verdict
+## Knowledge Capture
+Before completing this step, persist important findings using \`remember()\`:
+- **Test coverage gaps**: Areas that couldn't be fully tested and why
+- **Quality findings**: Issues found during verification and their resolutions
+- **Session checkpoint**: Summarize what was accomplished, decisions made, and any remaining work
+**Every step produces knowledge worth preserving.** If you discovered something that would help a future session, call \`remember()\` now.
+` },
+  ],
+};
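
The FORGE Quality Gate in the verify step above describes a small control loop: gate, fix on HOLD, re-gate, and stop after three rounds. The sketch below illustrates that loop under stated assumptions. `forgeGate`, `runGate`, and `applyFixes` are hypothetical names introduced here, not AI Kit APIs; `runGate` stands in for a call such as `evidence_map({ action: "gate", task_id })` and is assumed to return the verdict string directly. The source does not say what happens when HOLD persists after three rounds, so escalating to the user in that case is an assumption.

```javascript
// Minimal sketch of the gate loop. All names are illustrative; the callbacks
// are injected so the control flow can be exercised without the real tools.
const MAX_FIX_ROUNDS = 3; // "max 3 rounds" from the step description

function forgeGate(taskId, runGate, applyFixes) {
  // Initial gate after all reviews complete.
  let verdict = runGate({ action: "gate", task_id: taskId });
  let fixRounds = 0;

  // HOLD means minor issues: fix, then re-gate, up to the round limit.
  while (verdict === "HOLD" && fixRounds < MAX_FIX_ROUNDS) {
    fixRounds += 1;
    applyFixes(fixRounds);
    verdict = runGate({ action: "gate", task_id: taskId });
  }

  if (verdict === "YIELD") {
    return { outcome: "commit", verdict, fixRounds };
  }
  if (verdict === "HARD_BLOCK" || verdict === "HOLD") {
    // HARD_BLOCK escalates per the step text; escalating a HOLD that
    // survives every fix round is an assumption of this sketch.
    return { outcome: "escalate", verdict, fixRounds };
  }
  throw new Error(`Unexpected gate verdict: ${verdict}`);
}
```

Passing the gate and fix actions in as callbacks keeps the sketch independent of how the tools are actually invoked; a real agent would presumably replace them with the corresponding tool calls.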