npm - @musashishao/agent-kit - Versions diffs - 1.6.0 → 1.6.1 - Mend

@musashishao/agent-kit 1.6.0 → 1.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/.agent/agents/project-planner.md +22 -2
package/.agent/rules/CODE_RULES.md +88 -0
package/.agent/rules/GEMINI.md +35 -177
package/.agent/rules/MEMORY_STATE.md +62 -0
package/.agent/rules/REFERENCE.md +71 -0
package/.agent/skills/app-builder/SKILL.md +10 -7
package/.agent/skills/spec-writing/SKILL.md +189 -0
package/.agent/skills/tdd-workflow/SKILL.md +30 -0
package/.agent/workflows/create.md +16 -6
package/.agent/workflows/plan.md +8 -8
package/.agent/workflows/spec.md +189 -0
package/.agent/workflows/test.md +41 -0
package/package.json +2 -2

package/.agent/agents/project-planner.md CHANGED

@@ -3,13 +3,32 @@ name: project-planner
 description: Smart project planning agent. Breaks down user requests into tasks, plans file structure, determines which agent does what, creates dependency graph. Use when starting new projects or planning major features.
 tools: Read, Grep, Glob, Bash
 model: inherit
-skills: clean-code, app-builder, plan-writing, brainstorming
+skills: clean-code, app-builder, plan-writing, brainstorming, spec-writing
 ---
 # Project Planner - Smart Project Planning
 You are a project planning expert. You analyze user requests, break them into tasks, and create an executable plan.
+## 🔴 PHASE -2: SPEC CHECK (BEFORE PLANNING)
+**Before creating a plan, check for existing Specification:**
+1. **Search** for `SPEC-*.md` files in project root
+2. **If found:** Read spec and use as input for planning
+3. **If not found + complex request (3+ files):** Redirect to `/spec` first
+4. **If simple request:** Proceed to Phase -1
+> 🔴 **Rule:** Complex features (3+ files) MUST have a Spec before Plan.
+| Condition | Action |
+|-----------|--------|
+| `SPEC-*.md` exists | Read spec → Use as planning input |
+| Complex request, no spec | Say: "This is a complex feature. Run `/spec` first." |
+| Simple request (1-2 files) | Proceed directly to Phase -1 |
+---
 ## 🛑 PHASE 0: CONTEXT CHECK (QUICK)
 **Check for existing context before starting:**
@@ -38,6 +57,7 @@ You are a project planning expert. You analyze user requests, break them into ta
 | If You See | Then |
 |------------|------|
+| `SPEC-*.md` exists | Use spec as planning foundation |
 | "User Request: X" in prompt | Use X as the task, ignore folder name |
 | "Decisions: Y" in prompt | Apply Y without re-asking |
 | Existing plan in workspace | Read and CONTINUE it, don't restart |
@@ -395,7 +415,7 @@ python .agent/skills/webapp-testing/scripts/playwright_runner.py http://localhos
 | 5 | **Rollback** | Every task has recovery path | Tasks fail, prepare for it |
 | 6 | **Context** | Explain WHY not just WHAT | Better agent decisions |
 | 7 | **Risks** | Identify before they happen | Prepared responses |
-| 8 | **DYNAMIC NAMING** | `docs/PLAN-{task-slug}.md` | Easy to find, multiple plans OK |
+| 8 | **DYNAMIC NAMING** | `./{task-slug}.md` | Easy to find, multiple plans OK |
 | 9 | **Milestones** | Each phase ends with working state | Continuous value |
 | 10 | **Phase X** | Verification is ALWAYS final | Definition of done |
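
The Phase -2 check added in this file can be sketched as a small decision function. This is a minimal illustration, not the agent's actual logic: the function name `spec_check`, its return strings, and the idea of passing a list of root-level file names are all assumptions; only the `SPEC-*.md` pattern and the "3+ files" threshold come from the diff.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# "Complex" means 3 or more files, per the rule in the diff.
COMPLEX_FILE_THRESHOLD = 3


def spec_check(root_file_names: Iterable[str], files_touched: int) -> str:
    """Return the Phase -2 action for a request (hypothetical labels)."""
    specs = sorted(n for n in root_file_names if fnmatchcase(n, "SPEC-*.md"))
    if specs:
        # A spec exists: read it and use it as planning input.
        return "use-spec:" + specs[0]
    if files_touched >= COMPLEX_FILE_THRESHOLD:
        # Complex request without a spec: send the user to /spec first.
        return "redirect:/spec"
    # Simple request (1-2 files): continue with the next phase.
    return "proceed"
```

In practice the first argument could be `os.listdir(project_root)`. An existing spec is used as planning input regardless of request size, which matches the first row of the condition table.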

package/.agent/rules/CODE_RULES.md ADDED

@@ -0,0 +1,88 @@
+---
+trigger: always_on
+---
+# CODE_RULES.md - TIER 1: Code Rules
+> This module defines coding rules, Socratic Gate, and verification protocols.
+---
+## 📱 Project Type Routing
+| Project Type | Primary Agent | Skills |
+|--------------|---------------|--------|
+| **MOBILE** (iOS, Android, RN, Flutter) | `mobile-developer` | mobile-design |
+| **WEB** (Next.js, React web) | `frontend-specialist` | frontend-design |
+| **BACKEND** (API, server, DB) | `backend-specialist` | api-patterns, database-design |
+> 🔴 **Mobile + frontend-specialist = WRONG.** Mobile = mobile-developer ONLY.
+---
+## 🛑 GLOBAL SOCRATIC GATE
+**MANDATORY: Every user request must pass through the Socratic Gate before ANY tool use or implementation.**
+| Request Type | Strategy | Required Action |
+|--------------|----------|-----------------|
+| **New Feature / Build** | Deep Discovery | ASK minimum 3 strategic questions |
+| **Code Edit / Bug Fix** | Context Check | Confirm understanding + ask impact questions |
+| **Vague / Simple** | Clarification | Ask Purpose, Users, and Scope |
+| **Full Orchestration** | Gatekeeper | **STOP** subagents until user confirms plan details |
+| **Direct "Proceed"** | Validation | **STOP** → Even if answers are given, ask 2 "Edge Case" questions |
+**Protocol:**
+1. **Never Assume:** If even 1% is unclear, ASK.
+2. **Handle Spec-heavy Requests:** When user gives a list (Answers 1, 2, 3...), do NOT skip the gate. Instead, ask about **Trade-offs** or **Edge Cases** before starting.
+3. **Wait:** Do NOT invoke subagents or write code until the user clears the Gate.
+4. **Reference:** Full protocol in `@[skills/brainstorming]`.
+---
+## 🏁 Final Checklist Protocol
+**Trigger:** When the user says "final checks", "run all tests", or similar phrases.
+| Task Stage | Command | Purpose |
+|------------|---------|---------|
+| **Manual Audit** | `python scripts/checklist.py .` | Priority-based project audit |
+| **Pre-Deploy** | `python scripts/checklist.py . --url <URL>` | Full Suite + Performance + E2E |
+**Priority Execution Order:**
+1. **Security** → 2. **Lint** → 3. **Schema** → 4. **Tests** → 5. **UX** → 6. **Seo** → 7. **Lighthouse/E2E**
+**Rules:**
+- **Completion:** A task is NOT finished until `checklist.py` returns success.
+- **Reporting:** If it fails, fix the **Critical** blockers first (Security/Lint).
+---
+## 🎭 Gemini Mode Mapping
+| Mode | Agent | Behavior |
+|------|-------|----------|
+| **plan** | `project-planner` | 4-phase methodology. NO CODE before Phase 4. |
+| **ask** | - | Focus on understanding. Ask questions. |
+| **edit** | `orchestrator` | Execute. Check `{task-slug}.md` first. |
+**Plan Mode (4-Phase):**
+1. ANALYSIS → Research, questions
+2. PLANNING → `{task-slug}.md`, task breakdown
+3. SOLUTIONING → Architecture, design (NO CODE!)
+4. IMPLEMENTATION → Code + tests
+> 🔴 **Edit mode:** If multi-file or structural change → Offer to create `{task-slug}.md`. For single-file fixes → Proceed directly.
+---
+## TIER 2: DESIGN RULES (Reference)
+> **Design rules are in the specialist agents, NOT here.**
+| Task | Read |
+|------|------|
+| Web UI/UX | `.agent/frontend-specialist.md` |
+| Mobile UI/UX | `.agent/mobile-developer.md` |
+> 🔴 **For design work:** Open and READ the agent file. Rules are there.
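
The priority execution order in the Final Checklist Protocol amounts to a fixed ranking, with Security and Lint treated as critical. The sketch below only illustrates that ordering; it says nothing about how `checklist.py` is implemented, and the names `PRIORITY_ORDER`, `order_failures`, and `critical_blockers` are hypothetical.

```python
from typing import Iterable, List

# Order copied from the "Priority Execution Order" line in the diff.
PRIORITY_ORDER = ["Security", "Lint", "Schema", "Tests", "UX", "SEO", "Lighthouse/E2E"]
# The diff names Security and Lint as the critical blockers.
CRITICAL = {"Security", "Lint"}


def order_failures(failed: Iterable[str]) -> List[str]:
    """Sort failed check names by priority; unknown names go last."""
    rank = {name: index for index, name in enumerate(PRIORITY_ORDER)}
    return sorted(failed, key=lambda name: rank.get(name, len(PRIORITY_ORDER)))


def critical_blockers(failed: Iterable[str]) -> List[str]:
    """Return only the failures that must be fixed first."""
    return [name for name in order_failures(failed) if name in CRITICAL]
```

For example, a run that fails `Tests`, `Lint`, and `Security` would be addressed in the order Security, Lint, Tests.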

package/.agent/rules/GEMINI.md CHANGED

@@ -36,7 +36,7 @@ Agent activated → Check frontmatter "skills:" field
 ---
-## �📥 REQUEST CLASSIFIER (STEP 2)
+## 📥 REQUEST CLASSIFIER (STEP 2)
 **Before ANY action, classify the request:**
@@ -45,9 +45,34 @@ Agent activated → Check frontmatter "skills:" field
 | **QUESTION** | "what is", "how does", "explain" | TIER 0 only | Text Response |
 | **SURVEY/INTEL**| "analyze", "list files", "overview" | TIER 0 + Explorer | Session Intel (No File) |
 | **SIMPLE CODE** | "fix", "add", "change" (single file) | TIER 0 + TIER 1 (lite) | Inline Edit |
-| **COMPLEX CODE**| "build", "create", "implement", "refactor" | TIER 0 + TIER 1 (full) + Agent | **{task-slug}.md Required** |
+| **SPEC REQUIRED** | "build", "create", "implement" (3+ files) | TIER 0 + SDD Gate | **SPEC-{slug}.md → PLAN-{slug}.md** |
+| **COMPLEX CODE**| "refactor", multi-file changes | TIER 0 + TIER 1 (full) + Agent | **{task-slug}.md Required** |
 | **DESIGN/UI** | "design", "UI", "page", "dashboard" | TIER 0 + TIER 1 + Agent | **{task-slug}.md Required** |
-| **SLASH CMD** | /create, /orchestrate, /debug | Command-specific flow | Variable |
+| **SLASH CMD** | /create, /orchestrate, /debug, /spec | Command-specific flow | Variable |
+---
+## 🔒 SDD GATE (Spec Driven Development)
+> **For complex features, SPEC before PLAN before CODE.**
+### When to Trigger SDD Gate
+| Condition | Action |
+|-----------|--------|
+| New feature with 3+ files | ✅ Run `/spec` first |
+| Architecture changes | ✅ Run `/spec` first |
+| User says "build", "create", "implement" | ✅ Suggest `/spec` |
+| Simple bug fix (1 file) | ❌ Skip, use `/debug` |
+| Single file change | ❌ Skip, direct edit |
+### SDD Flow
+```
+/spec → SPEC-{slug}.md → User Approval → /plan → PLAN-{slug}.md → User Approval → /create
+```
+> 🔴 **Rule:** Skipping `/spec` for complex features = suboptimal AI output.
 ---
@@ -68,10 +93,6 @@ When user's prompt is NOT in English:
 - No verbose explanations
 - No over-commenting
 - No over-engineering
-- **Self-Documentation:** Every agent is responsible for documenting their own changes in relevant `.md` files.
-- **Global Testing Mandate:** Every agent is responsible for writing and running tests for their changes. Follow the "Testing Pyramid" (Unit > Integration > E2E) and the "AAA Pattern" (Arrange, Act, Assert).
-- **Global Performance Mandate:** "Measure first, optimize second." Every agent must ensure their changes adhere to 2025 performance standards (Core Web Vitals for Web, query optimization for DB, bundle limits for FS).
-- **Infrastructure & Safety Mandate:** Every agent is responsible for the deployability and operational safety of their changes. Follow the "5-Phase Deployment Process" (Prepare, Backup, Deploy, Verify, Confirm/Rollback). Always verify environment variables and secrets security.
 ### 📁 File Dependency Awareness
@@ -84,12 +105,6 @@ When user's prompt is NOT in English:
 > 🔴 **MANDATORY:** Read `ARCHITECTURE.md` at session start to understand Agents, Skills, and Scripts.
-**Path Awareness:**
-- Agents: `.agent/` (Project)
-- Skills: `.agent/skills/` (Project)
-- Runtime Scripts: `.agent/skills/<skill>/scripts/`
 ### 🧠 Read → Understand → Apply
 ```
@@ -97,171 +112,14 @@ When user's prompt is NOT in English:
 ✅ CORRECT: Read → Understand WHY → Apply PRINCIPLES → Code
 ```
-**Before coding, answer:**
-1. What is the GOAL of this agent/skill?
-2. What PRINCIPLES must I apply?
-3. How does this DIFFER from generic output?
----
-## TIER 0.5: MEMORY & STATE (NEW)
-### 🧠 Persistence Protocol
-**MANDATORY: You MUST synchronize your state with the JSON Memory system.**
-1. **Read Task Context:** At the start of any complex task, check `.agent/memory/session.json` and `brain.json`.
-2. **Update Progress:** After completing a significant sub-task, update the session using:
-   `python3 .agent/scripts/ak_cli.py memory update-task --task "TaskName" --status "completed"`
-3. **Adaptive UI/UX:**
-   - **MANDATORY:** Check `.agent/memory/user.json` for `skill_level`.
-   - Adjust your language level (newbie, basic, technical) immediately.
-   - Use analogies for newbies, concise code for technical users.
 ---
-## TIER 1: CODE RULES (When Writing Code)
-### 📱 Project Type Routing
-| Project Type | Primary Agent | Skills |
-|--------------|---------------|--------|
-| **MOBILE** (iOS, Android, RN, Flutter) | `mobile-developer` | mobile-design |
-| **WEB** (Next.js, React web) | `frontend-specialist` | frontend-design |
-| **BACKEND** (API, server, DB) | `backend-specialist` | api-patterns, database-design |
-> 🔴 **Mobile + frontend-specialist = WRONG.** Mobile = mobile-developer ONLY.
-### 🛑 Socratic Gate
-**For complex requests, STOP and ASK first:**
+## 📚 MODULAR RULES (Load as needed)
-### 🛑 GLOBAL SOCRATIC GATE (TIER 0)
-**MANDATORY: Every user request must pass through the Socratic Gate before ANY tool use or implementation.**
-| Request Type | Strategy | Required Action |
-|--------------|----------|-----------------|
-| **New Feature / Build** | Deep Discovery | ASK minimum 3 strategic questions |
-| **Code Edit / Bug Fix** | Context Check | Confirm understanding + ask impact questions |
-| **Vague / Simple** | Clarification | Ask Purpose, Users, and Scope |
-| **Full Orchestration** | Gatekeeper | **STOP** subagents until user confirms plan details |
-| **Direct "Proceed"** | Validation | **STOP** → Even if answers are given, ask 2 "Edge Case" questions |
-**Protocol:**
-1. **Never Assume:** If even 1% is unclear, ASK.
-2. **Handle Spec-heavy Requests:** When user gives a list (Answers 1, 2, 3...), do NOT skip the gate. Instead, ask about **Trade-offs** or **Edge Cases** (e.g., "LocalStorage confirmed, but should we handle data clearing or versioning?") before starting.
-3. **Wait:** Do NOT invoke subagents or write code until the user clears the Gate.
-4. **Reference:** Full protocol in `@[skills/brainstorming]`.
-### 🏁 Final Checklist Protocol
-**Trigger:** When the user says "son kontrolleri yap", "final checks", "çalıştır tüm testleri", or similar phrases.
-| Task Stage | Command | Purpose |
-|------------|---------|---------|
-| **Manual Audit** | `python scripts/checklist.py .` | Priority-based project audit |
-| **Pre-Deploy** | `python scripts/checklist.py . --url <URL>` | Full Suite + Performance + E2E |
-**Priority Execution Order:**
-1. **Security** → 2. **Lint** → 3. **Schema** → 4. **Tests** → 5. **UX** → 6. **Seo** → 7. **Lighthouse/E2E**
-**Rules:**
-- **Completion:** A task is NOT finished until `checklist.py` returns success.
-- **Reporting:** If it fails, fix the **Critical** blockers first (Security/Lint).
-**Available Scripts (12 total):**
-| Script | Skill | When to Use |
-|--------|-------|-------------|
-| `security_scan.py` | vulnerability-scanner | Always on deploy |
-| `dependency_analyzer.py` | vulnerability-scanner | Weekly / Deploy |
-| `lint_runner.py` | lint-and-validate | Every code change |
-| `test_runner.py` | testing-patterns | After logic change |
-| `schema_validator.py` | database-design | After DB change |
-| `ux_audit.py` | frontend-design | After UI change |
-| `accessibility_checker.py` | frontend-design | After UI change |
-| `seo_checker.py` | seo-fundamentals | After page change |
-| `bundle_analyzer.py` | performance-profiling | Before deploy |
-| `mobile_audit.py` | mobile-design | After mobile change |
-| `lighthouse_audit.py` | performance-profiling | Before deploy |
-| `playwright_runner.py` | webapp-testing | Before deploy |
-> 🔴 **Agents & Skills can invoke ANY script** via `python .agent/skills/<skill>/scripts/<script>.py`
-### 🎭 Gemini Mode Mapping
-| Mode | Agent | Behavior |
-|------|-------|----------|
-| **plan** | `project-planner` | 4-phase methodology. NO CODE before Phase 4. |
-| **ask** | - | Focus on understanding. Ask questions. |
-| **edit** | `orchestrator` | Execute. Check `{task-slug}.md` first. |
-**Plan Mode (4-Phase):**
-1. ANALYSIS → Research, questions
-2. PLANNING → `{task-slug}.md`, task breakdown
-3. SOLUTIONING → Architecture, design (NO CODE!)
-4. IMPLEMENTATION → Code + tests
-> 🔴 **Edit mode:** If multi-file or structural change → Offer to create `{task-slug}.md`. For single-file fixes → Proceed directly.
----
-## TIER 2: DESIGN RULES (Reference)
-> **Design rules are in the specialist agents, NOT here.**
-| Task | Read |
-|------|------|
-| Web UI/UX | `.agent/frontend-specialist.md` |
-| Mobile UI/UX | `.agent/mobile-developer.md` |
-**These agents contain:**
-- Purple Ban (no violet/purple colors)
-- Template Ban (no standard layouts)
-- Anti-cliché rules
-- Deep Design Thinking protocol
-> 🔴 **For design work:** Open and READ the agent file. Rules are there.
----
+| Module | Path | When to Load |
+|--------|------|--------------|
+| **Memory & State** | `.agent/rules/MEMORY_STATE.md` | Session start, task completion |
+| **Code Rules** | `.agent/rules/CODE_RULES.md` | Writing code, using agents |
+| **Reference** | `.agent/rules/REFERENCE.md` | Looking up agents, skills, scripts |
-## 📁 QUICK REFERENCE
-### Available Master Agents (8)
-| Agent | Domain & Focus |
-|-------|----------------|
-| `orchestrator` | Multi-agent coordination and synthesis |
-| `project-planner` | Discovery, Architecture, and Task Planning |
-| `security-auditor` | Master Cybersecurity (Audit + Pentest + Infra Hardening) |
-| `backend-specialist` | Backend Architect (API + Database + Server/Docker Deploy) |
-| `frontend-specialist` | Frontend & Growth (UI/UX + SEO + Edge/Static Deploy) |
-| `mobile-developer` | Mobile Specialist (Cross-platform + Mobile Performance)|
-| `debugger` | Systematic Root Cause Analysis & Bug Fixing |
-| `game-developer` | Specialized Game Logic & Assets & Performance |
-### Key Skills
-| Skill | Purpose |
-|-------|---------|
-| `clean-code` | Coding standards (GLOBAL) |
-| `brainstorming` | Socratic questioning |
-| `app-builder` | Full-stack orchestration |
-| `frontend-design` | Web UI patterns |
-| `mobile-design` | Mobile UI patterns |
-| `plan-writing` | {task-slug}.md format |
-| `behavioral-modes` | Mode switching |
-### Script Locations
-| Script | Path |
-|--------|------|
-| Full verify | `scripts/verify_all.py` |
-| Security scan | `.agent/skills/vulnerability-scanner/scripts/security_scan.py` |
-| UX audit | `.agent/skills/frontend-design/scripts/ux_audit.py` |
-| Mobile audit | `.agent/skills/mobile-design/scripts/mobile_audit.py` |
-| Lighthouse | `.agent/skills/performance-profiling/scripts/lighthouse_audit.py` |
-| Playwright | `.agent/skills/webapp-testing/scripts/playwright_runner.py` |
----
+> 🔴 **Load modules selectively** based on current task. Do not load all at once.

package/.agent/rules/MEMORY_STATE.md ADDED

@@ -0,0 +1,62 @@
+---
+trigger: always_on
+---
+# MEMORY_STATE.md - TIER 0.5: Memory & State
+> This module defines memory persistence, session summaries, and ADR protocols.
+---
+## 🧠 Persistence Protocol
+**MANDATORY: You MUST synchronize your state with the JSON Memory system.**
+1. **Read Task Context:** At the start of any complex task, check `.agent/memory/session.json` and `brain.json`.
+2. **Update Progress:** After completing a significant sub-task, update the session using:
+   `python3 .agent/scripts/ak_cli.py memory update-task --task "TaskName" --status "completed"`
+3. **Adaptive UI/UX:**
+   - **MANDATORY:** Check `.agent/memory/user.json` for `skill_level`.
+   - Adjust your language level (newbie, basic, technical) immediately.
+   - Use analogies for newbies, concise code for technical users.
+---
+## 📝 Session Summary Protocol
+**Purpose:** Reduce token usage by maintaining concise session checkpoints.
+1. **At Session Start:** Read `.agent/memory/SESSION_SUMMARY.md` to understand recent context.
+2. **After Task Completion:** Append a short summary (~50-100 words) to the file:
+   ```markdown
+   ## Latest Session
+   **Date:** [YYYY-MM-DD]
+   **Task:** [Task Name]
+   **Summary:** [1-2 sentences of what was accomplished]
+   **Files Modified:** [list of key files]
+   ```
+3. **Benefits:**
+   - New sessions start with minimal but sufficient context.
+   - Reduces "lost in the middle" phenomenon.
+   - Saves tokens on long projects.
+---
+## 📜 ADR Protocol (Architecture Decision Records)
+**Purpose:** Preserve decision history for long-term project maintainability.
+**When to Create an ADR:**
+| Trigger | Example |
+|---------|---------|
+| Tech stack selection | "Using Prisma instead of Drizzle" |
+| Architecture pattern choice | "Choosing microservices over monolith" |
+| Major library decision | "Selecting Tailwind over styled-components" |
+**How to Create:**
+1. Copy `.agent/adr/ADR-TEMPLATE.md`
+2. Rename to `ADR-XXX-short-slug.md` (auto-increment number)
+3. Fill in: Context, Options, Decision, Consequences
+4. Set Status to `Accepted`
+**Directory:** `.agent/adr/`
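
The "auto-increment number" step of the ADR protocol could be handled along these lines. This is a sketch under assumptions: the three-digit zero padding is inferred from the `ADR-XXX` placeholder, and `next_adr_name` is a hypothetical helper that is not part of the package.

```python
import re
from typing import Iterable

# Matches numbered records such as "ADR-007-use-prisma.md".
# "ADR-TEMPLATE.md" does not match because it has no number.
ADR_PATTERN = re.compile(r"^ADR-(\d+)-.+\.md$")


def next_adr_name(existing_names: Iterable[str], slug: str) -> str:
    """Build the next ADR file name from the names already in the ADR directory."""
    numbers = []
    for name in existing_names:
        match = ADR_PATTERN.match(name)
        if match:
            numbers.append(int(match.group(1)))
    next_number = max(numbers, default=0) + 1
    return f"ADR-{next_number:03d}-{slug}.md"
```

The first argument could be `os.listdir(".agent/adr")`. Taking the maximum existing number rather than the file count keeps numbering stable if a record has been deleted.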

package/.agent/rules/REFERENCE.md ADDED

@@ -0,0 +1,71 @@
+---
+trigger: always_on
+---
+# REFERENCE.md - Quick Reference
+> This module contains lookup tables for agents, skills, and scripts.
+---
+## Available Master Agents (8)
+| Agent | Domain & Focus |
+|-------|----------------|
+| `orchestrator` | Multi-agent coordination and synthesis |
+| `project-planner` | Discovery, Architecture, and Task Planning |
+| `security-auditor` | Master Cybersecurity (Audit + Pentest + Infra Hardening) |
+| `backend-specialist` | Backend Architect (API + Database + Server/Docker Deploy) |
+| `frontend-specialist` | Frontend & Growth (UI/UX + SEO + Edge/Static Deploy) |
+| `mobile-developer` | Mobile Specialist (Cross-platform + Mobile Performance)|
+| `debugger` | Systematic Root Cause Analysis & Bug Fixing |
+| `game-developer` | Specialized Game Logic & Assets & Performance |
+---
+## Key Skills
+| Skill | Purpose |
+|-------|---------|
+| `clean-code` | Coding standards (GLOBAL) |
+| `brainstorming` | Socratic questioning |
+| `spec-writing` | SPEC-{slug}.md format (SDD) |
+| `plan-writing` | PLAN-{slug}.md format |
+| `app-builder` | Full-stack orchestration |
+| `frontend-design` | Web UI patterns |
+| `mobile-design` | Mobile UI patterns |
+| `behavioral-modes` | Mode switching |
+---
+## Available Scripts (12 total)
+| Script | Skill | When to Use |
+|--------|-------|-------------|
+| `security_scan.py` | vulnerability-scanner | Always on deploy |
+| `dependency_analyzer.py` | vulnerability-scanner | Weekly / Deploy |
+| `lint_runner.py` | lint-and-validate | Every code change |
+| `test_runner.py` | testing-patterns | After logic change |
+| `schema_validator.py` | database-design | After DB change |
+| `ux_audit.py` | frontend-design | After UI change |
+| `accessibility_checker.py` | frontend-design | After UI change |
+| `seo_checker.py` | seo-fundamentals | After page change |
+| `bundle_analyzer.py` | performance-profiling | Before deploy |
+| `mobile_audit.py` | mobile-design | After mobile change |
+| `lighthouse_audit.py` | performance-profiling | Before deploy |
+| `playwright_runner.py` | webapp-testing | Before deploy |
+> 🔴 **Agents & Skills can invoke ANY script** via `python .agent/skills/<skill>/scripts/<script>.py`
+---
+## Script Locations
+| Script | Path |
+|--------|------|
+| Full verify | `scripts/verify_all.py` |
+| Security scan | `.agent/skills/vulnerability-scanner/scripts/security_scan.py` |
+| UX audit | `.agent/skills/frontend-design/scripts/ux_audit.py` |
+| Mobile audit | `.agent/skills/mobile-design/scripts/mobile_audit.py` |
+| Lighthouse | `.agent/skills/performance-profiling/scripts/lighthouse_audit.py` |
+| Playwright | `.agent/skills/webapp-testing/scripts/playwright_runner.py` |

package/.agent/skills/app-builder/SKILL.md CHANGED

@@ -95,20 +95,23 @@ python .agent/skills/app-builder/scripts/generate_ai_infra.py --project-root . -
 User: "Make an Instagram clone with photo sharing and likes"
 App Builder Process:
-1. Project type: Social Media App
-2. Tech stack: Next.js + Prisma + Cloudinary + Clerk
-3. Create plan:
+1. **Quality Gate** ⭐ NEW
+   ├─ Check Approved SPEC-*.md
+   └─ Check PLAN-*.md and Test Plan
+2. Project type: Social Media App
+3. Tech stack: Next.js + Prisma + Cloudinary + Clerk
+4. Create plan:
    ├─ Database schema (users, posts, likes, follows)
    ├─ API routes (12 endpoints)
    ├─ Pages (feed, profile, upload)
    └─ Components (PostCard, Feed, LikeButton)
-4. Coordinate agents
-5. Report progress
-6. **Generate AI Infrastructure** ⭐
+5. Coordinate agents
+6. Report progress
+7. **Generate AI Infrastructure** ⭐
    ├─ AGENTS.md (auto-detected stack)
    ├─ .agent/graph.json (dependencies)
    └─ .agent/rag/ (ready for indexing)
-7. Start preview
+8. Start preview
 ```
 ---

package/.agent/skills/spec-writing/SKILL.md ADDED

@@ -0,0 +1,189 @@
+---
+name: spec-writing
+description: Specification writing principles for SDD workflow. How to write clear, testable specs that guide AI implementation. Use when creating feature specifications before planning.
+allowed-tools: Read, Glob, Grep
+---
+# Spec Writing Skill
+> Write specifications that eliminate ambiguity and enable precise AI implementation.
+---
+## Core Principles
+| Principle | Meaning |
+|-----------|---------|
+| **Problem-First** | Understand the problem before describing the solution |
+| **User-Centric** | Every feature connects to a user need |
+| **Testable** | Every requirement can be verified |
+| **Bounded** | Explicit about what's in/out of scope |
+| **Minimal** | Only include what's necessary to implement |
+---
+## The 5 Mandatory Sections
+### 1. Problem Statement
+**Purpose:** Why does this feature exist?
+| ✅ Good | ❌ Bad |
+|---------|--------|
+| "Users cannot reset their password without contacting support, causing 50+ support tickets/week" | "We need password reset" |
+| "Mobile users abandon checkout 40% more than desktop due to form complexity" | "Improve mobile checkout" |
+**Format:** 1-2 paragraphs with measurable problem (if available).
+---
+### 2. User Stories
+**Purpose:** Who needs this and why?
+**Format:**
+```
+As a [user type]
+I want to [action]
+So that [benefit/outcome]
+```
+**Rules:**
+- Maximum 5 user stories per spec
+- Each story must have a clear benefit
+- Prioritize: Must-have > Should-have > Nice-to-have
+---
+### 3. Acceptance Criteria
+**Purpose:** How do we know it's done?
+| ✅ Testable | ❌ Vague |
+|-------------|----------|
+| "User receives password reset email within 60 seconds" | "Email should be fast" |
+| "Form validates email format before submission" | "Form should validate properly" |
+| "Button is disabled until all required fields are filled" | "Good UX" |
+**Format:** Table with ID, criterion, and testability confirmation.
+---
+### 4. Out of Scope
+**Purpose:** Prevent scope creep and wasted effort.
+**Include:**
+- Features explicitly NOT being built
+- Edge cases NOT being handled
+- Platforms NOT being supported
+- Future enhancements deferred
+**Example:**
+```markdown
+## Out of Scope
+- Social login (OAuth) - will be separate feature
+- Password strength meter - nice-to-have for v2
+- SMS-based reset - only email for MVP
+```
+---
+### 5. Technical Constraints
+**Purpose:** Guardrails for implementation.
+**Include if relevant:**
+- Must integrate with existing system X
+- Must support browser Y
+- Must not exceed budget Z
+- Must use technology A (if mandated)
+---
+## Spec Writing Process
+```
+1. Receive request
+       │
+       ▼
+2. Ask Socratic questions (minimum 3)
+       │
+       ▼
+3. Draft Problem Statement
+       │
+       ▼
+4. Write User Stories
+       │
+       ▼
+5. Define Acceptance Criteria (MUST be testable)
+       │
+       ▼
+6. List Out of Scope items
+       │
+       ▼
+7. Note Technical Constraints
+       │
+       ▼
+8. Create SPEC-{slug}.md file
+```
+---
+## Socratic Questions for Spec Writing
+Use `@[skills/brainstorming]` for dynamic question generation. Key areas:
+| Area | Example Questions |
+|------|-------------------|
+| **Users** | Who are the primary users? What's their technical level? |
+| **Scale** | How many users? How often will they use this? |
+| **Integration** | What existing systems must this work with? |
+| **Edge Cases** | What happens if X fails? What about offline users? |
+| **Success** | How will we measure if this feature is successful? |
+---
+## Anti-Patterns (Avoid)
+| Anti-Pattern | Problem | Fix |
+|--------------|---------|-----|
+| **Solutioning in Spec** | Spec describes HOW, not WHAT | Remove implementation details |
+| **Vague Criteria** | "Should be fast" is not testable | Add numbers: "< 200ms response" |
+| **Missing Scope** | No out-of-scope section | Always define boundaries |
+| **Feature Creep** | Spec grows during implementation | Lock spec before planning |
+| **No User Stories** | Features without user context | Connect every feature to a user need |
+---
+## Spec vs Plan vs Implementation
+| Document | Answers | Contains |
+|----------|---------|----------|
+| **SPEC** | WHAT and WHY | Requirements, criteria, constraints |
+| **PLAN** | HOW and WHO | Tasks, agents, file structure |
+| **CODE** | THE THING | Actual implementation |
+> 🔴 **Rule:** Spec contains NO code, NO file structure, NO implementation details.
+---
+## Quality Checklist
+Before finalizing spec, verify:
+- [ ] Problem is clearly stated with measurable impact
+- [ ] All user stories have clear benefits
+- [ ] Every acceptance criterion is testable
+- [ ] Out of scope section exists and is explicit
+- [ ] No implementation details leaked into spec
+- [ ] Spec is approved by user before planning
+---
+## Integration with Agent Kit
+**Workflow:**
+```
+/spec → SPEC-{slug}.md → User Review → /plan → PLAN-{slug}.md → /create
+```
+**Related Skills:**
+- `brainstorming` - Question generation
+- `plan-writing` - Creating task breakdowns
+- `clean-code` - Implementation standards

package/.agent/skills/tdd-workflow/SKILL.md CHANGED

@@ -144,6 +144,36 @@ Every test follows:
 | Agent B | Implement to pass (GREEN) |
 | Agent C | Optimize (REFACTOR) |
+### Red-Phase Automation ⭐ NEW
+When triggered via `/test red [feature]`:
+1. **Read** `SPEC-[feature].md` to extract acceptance criteria.
+2. **Generate** a test file targeting the *behavior* not implementation.
+3. **Run** the test and ensure it **FAILS**.
+#### Failing Test Template
+```typescript
+// tests/[feature].test.ts
+import { describe, it, expect, vi } from 'vitest'; // or jest
+describe('[Feature Name]', () => {
+  it('should [acceptance criteria 1] - RED PHASE', () => {
+    // Arrange
+    const input = { /* mock data from SPEC */ };
+    // Act
+    const result = featureFunction(input); // Function doesn't exist yet!
+    // Assert
+    expect(result).toBeDefined(); // Will fail: function not implemented
+  });
+});
+```
+> 🔴 **Remember:** If this test passes on first run, something is wrong.
 ---
 > **Remember:** The test is the specification. If you can't write a test, you don't understand the requirement.

package/.agent/workflows/create.md CHANGED

@@ -14,24 +14,34 @@ This command starts a new application creation process with **AI-Ready Infrastru
 ### Steps:
-1. **Request Analysis**
+1. **Phase 0: Quality Gate** ⭐
+   - **SDD Check**: Ensure `SPEC-*.md` exists and is marked as `Approved` for complex features (3+ files).
+   - **SDD Check**: Ensure `PLAN-*.md` exists and aligns with the Spec.
+   - **TDD Red-Phase Check** ⭐ NEW:
+     1. Generate at least 1 failing test based on the approved Spec (e.g., `tests/[feature].test.ts`).
+     2. Run the test: `npm test` or equivalent.
+     3. Verify the test **FAILS** (exit code != 0).
+     4. **If test passes or doesn't exist**: **HALT** and notify user. The test must fail first (RED phase) to ensure the Spec is correctly understood.
+   - **Action**: If missing, REDIRECT to `/spec`, `/plan`, or `/test red [feature]` before proceeding.
+2. **Request Analysis**
    - Understand what the user wants
    - If information is missing, use `brainstorming` skill to ask clarifying questions
-2. **Project Planning**
+3. **Project Planning**
    - Use `project-planner` agent for task breakdown
    - Determine tech stack
    - Plan file structure
    - Create plan file and proceed to building
-3. **Application Building (After Approval)**
+4. **Application Building (After Approval)**
    - Orchestrate with `app-builder` skill
    - Coordinate expert agents:
      - `database-architect` → Schema
      - `backend-specialist` → API
      - `frontend-specialist` → UI
-4. **AI Infrastructure Setup** ⭐ NEW
+5. **AI Infrastructure Setup** ⭐
    - Generate `AGENTS.md` using `documentation-templates/agents-template.md`
    - Create `.agent/` directory structure:
      ```
@@ -44,7 +54,7 @@ This command starts a new application creation process with **AI-Ready Infrastru
    - Run `graph-mapper` script to generate initial dependency graph
    - Configure `.cursor/mcp.json` or equivalent for MCP servers
-5. **Preview & Self-Healing Verification** ⭐ NEW
+6. **Preview & Self-Healing Verification** ⭐
    - Run initial tests or build command.
    - If fail: Activate `self-healing` protocol to fix errors autonomously.
    - Start with `auto_preview.py` once verified.
@@ -56,7 +66,7 @@ This command starts a new application creation process with **AI-Ready Infrastru
 After building the application, automatically generate:
-### Step 4.1: Generate AGENTS.md
+### Step 5.1: Generate AGENTS.md
 Create `AGENTS.md` in project root with:
 - Project name and status
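
The Phase 0 Quality Gate added to `create.md` can be read as a small decision function. The sketch below is illustrative only: `GateInput`, `GateDecision` and `qualityGate` are hypothetical names, not part of agent-kit. It also assumes that the Spec and Plan checks apply only to complex features (3+ files) and that the red-phase check applies to every feature; the workflow text does not state either point explicitly.

```typescript
// Hypothetical types for illustration; agent-kit ships no such API.
type SpecStatus = "missing" | "draft" | "approved";

type GateDecision =
  | { proceed: true }
  | { proceed: false; redirect: "/spec" | "/plan" | "/test red"; reason: string };

interface GateInput {
  filesAffected: number;          // estimated number of files the feature touches
  specStatus: SpecStatus;         // state of SPEC-*.md
  planExists: boolean;            // whether a plan file was found
  redTestExitCode: number | null; // null means no test has been generated yet
}

function qualityGate(input: GateInput): GateDecision {
  // "Complex" follows the workflow's own threshold of 3+ files.
  const isComplex = input.filesAffected >= 3;

  if (isComplex && input.specStatus !== "approved") {
    return { proceed: false, redirect: "/spec", reason: "Spec missing or not approved" };
  }
  if (isComplex && !input.planExists) {
    return { proceed: false, redirect: "/plan", reason: "Plan missing" };
  }
  if (input.redTestExitCode === null) {
    return { proceed: false, redirect: "/test red", reason: "No failing test exists yet" };
  }
  if (input.redTestExitCode === 0) {
    // A passing test means the RED phase has not been demonstrated.
    return { proceed: false, redirect: "/test red", reason: "Test passed; it must fail first" };
  }
  return { proceed: true };
}
```

The checks run in the same order as the bullet list, so the first missing artifact decides which command the user is redirected to.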

package/.agent/workflows/plan.md CHANGED

@@ -25,7 +25,7 @@ Use the `project-planner` agent with this context:
 CONTEXT:
 - User Request: $ARGUMENTS
 - Mode: PLANNING ONLY (no code)
-- Output: docs/PLAN-{task-slug}.md (dynamic naming)
+- Output: ./{task-slug}.md (dynamic naming)
 NAMING RULES:
 1. Extract 2-3 key words from request
@@ -47,7 +47,7 @@ RULES:
 | Deliverable | Location |
 |-------------|----------|
-| Project Plan | `docs/PLAN-{task-slug}.md` |
+| Project Plan | `./{task-slug}.md` |
 | Task Breakdown | Inside plan file |
 | Agent Assignments | Inside plan file |
 | Verification Checklist | Phase X in plan file |
@@ -58,7 +58,7 @@ RULES:
 Tell user:
 ```
-[OK] Plan created: docs/PLAN-{slug}.md
+[OK] Plan created: ./{slug}.md
 Next steps:
 - Review the plan
@@ -72,11 +72,11 @@ Next steps:
 | Request | Plan File |
 |---------|-----------|
-| `/plan e-commerce site with cart` | `docs/PLAN-ecommerce-cart.md` |
-| `/plan mobile app for fitness` | `docs/PLAN-fitness-app.md` |
-| `/plan add dark mode feature` | `docs/PLAN-dark-mode.md` |
-| `/plan fix authentication bug` | `docs/PLAN-auth-fix.md` |
-| `/plan SaaS dashboard` | `docs/PLAN-saas-dashboard.md` |
+| `/plan e-commerce site with cart` | `./ecommerce-cart.md` |
+| `/plan mobile app for fitness` | `./fitness-app.md` |
+| `/plan add dark mode feature` | `./dark-mode.md` |
+| `/plan fix authentication bug` | `./login-fix.md` |
+| `/plan SaaS dashboard` | `./saas-dashboard.md` |
 ---

package/.agent/workflows/spec.md ADDED

@@ -0,0 +1,189 @@
+---
+description: Generate specification document before planning. Use for complex features requiring clear requirements before implementation.
+---
+# /spec - Specification Writing Mode
+$ARGUMENTS
+---
+## 🔴 CRITICAL RULES
+1. **NO CODE WRITING** - This command creates specification file only
+2. **Socratic Gate** - Ask clarifying questions before writing spec
+3. **Dynamic Naming** - Spec file named based on feature
+4. **User Approval Required** - Spec must be approved before `/plan`
+---
+## Purpose
+Create a clear, testable specification document that serves as the "source of truth" for AI agents during implementation. This prevents "vibe-coding" and ensures alignment between user intent and code output.
+---
+## When to Use
+| Scenario | Use `/spec`? |
+|----------|--------------|
+| New feature with 3+ files | ✅ YES |
+| Architecture changes | ✅ YES |
+| Complex business logic | ✅ YES |
+| Simple bug fix | ❌ NO (use `/debug`) |
+| Single file change | ❌ NO (direct edit) |
+---
+## Task
+Use the `spec-writing` skill with this context:
+```
+CONTEXT:
+- User Request: $ARGUMENTS
+- Mode: SPECIFICATION ONLY (no code, no plan)
+- Output: SPEC-{feature-slug}.md (project root)
+NAMING RULES:
+1. Extract 2-3 key words from request
+2. Lowercase, hyphen-separated
+3. Max 30 characters
+4. Example: "OAuth authentication" → SPEC-oauth-auth.md
+MANDATORY SECTIONS:
+1. Problem Statement (What problem are we solving?)
+2. User Stories / Use Cases (Who uses this and how?)
+3. Acceptance Criteria (Measurable, testable requirements)
+4. Out of Scope (What we are NOT doing)
+5. Technical Constraints (If any)
+RULES:
+1. Use @[skills/brainstorming] for question generation
+2. Ask minimum 3 clarifying questions before writing
+3. Each acceptance criterion must be testable
+4. DO NOT write implementation details
+5. REPORT the exact file name created
+```
+---
+## Output Format
+```markdown
+# SPEC: [Feature Name]
+> Generated: [Date]
+> Status: Draft | Approved
+---
+## Problem Statement
+[1-2 paragraphs describing the problem this feature solves]
+---
+## User Stories
+### US-1: [Story Title]
+**As a** [user type]
+**I want to** [action]
+**So that** [benefit]
+### US-2: [Story Title]
+...
+---
+## Acceptance Criteria
+| ID | Criterion | Testable? |
+|----|-----------|-----------|
+| AC-1 | [Specific, measurable requirement] | ✅ |
+| AC-2 | [Specific, measurable requirement] | ✅ |
+| AC-3 | [Specific, measurable requirement] | ✅ |
+---
+## Out of Scope
+- [Feature/capability explicitly NOT included]
+- [Another exclusion]
+---
+## Technical Constraints
+| Constraint | Reason |
+|------------|--------|
+| [e.g., Must use existing auth system] | [Why] |
+---
+## Open Questions
+- [ ] [Any unresolved questions for user]
+```
+---
+## Expected Output
+| Deliverable | Location |
+|-------------|----------|
+| Specification | `./SPEC-{feature-slug}.md` |
+| User Stories | Inside spec file |
+| Acceptance Criteria | Inside spec file |
+---
+## After Specification
+Tell user:
+```
+✅ Spec created: SPEC-{slug}.md
+Next steps:
+1. Review the specification
+2. Mark status as "Approved" if correct
+3. Run `/plan` to create implementation plan
+```
+---
+## Naming Examples
+| Request | Spec File |
+|---------|-----------|
+| `/spec OAuth authentication with Google` | `SPEC-oauth-google.md` |
+| `/spec shopping cart with checkout` | `SPEC-cart-checkout.md` |
+| `/spec real-time notifications` | `SPEC-realtime-notif.md` |
+| `/spec admin dashboard` | `SPEC-admin-dashboard.md` |
+---
+## SDD Flow Overview
+```
+┌─────────────┐     ┌─────────────┐     ┌─────────────┐
+│   /spec     │ ──► │   /plan     │ ──► │  /create    │
+│ (WHAT)      │     │ (HOW)       │     │ (DO)        │
+└─────────────┘     └─────────────┘     └─────────────┘
+      │                   │                   │
+      ▼                   ▼                   ▼
+  SPEC-*.md           PLAN-*.md          Code Files
+```
+> 🔴 **Rule:** Complex features MUST follow this flow. Skipping `/spec` for complex features = suboptimal results.
+---
+## Usage
+```
+/spec OAuth authentication with Google
+/spec shopping cart with Stripe checkout
+/spec real-time chat with WebSocket
+/spec admin dashboard with analytics
+```
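
The naming rules in `spec.md` (2-3 key words, lowercase, hyphen-separated, max 30 characters) can be approximated mechanically. The sketch below is illustrative only: `slugify`, `specFileName` and the stop-word list are assumptions, not part of agent-kit. It treats the 30-character limit as applying to the slug rather than the whole file name. It does not abbreviate or pick words the way the examples in the workflow do, so it yields `SPEC-oauth-authentication-google.md` where the workflow shows `SPEC-oauth-google.md`.

```typescript
// Hypothetical stop-word list; the workflow does not define what a "key word" is.
const STOP_WORDS = new Set(["a", "an", "the", "with", "for", "and", "of", "to", "in", "on"]);

// Build a lowercase, hyphen-separated slug from the first few key words.
function slugify(request: string, maxWords = 3, maxLength = 30): string {
  const words = request
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, " ") // drop punctuation
    .split(/[\s-]+/)               // split on whitespace and hyphens
    .filter((word) => word.length > 0 && !STOP_WORDS.has(word))
    .slice(0, maxWords);

  // Enforce the length cap and avoid ending on a dangling hyphen.
  return words.join("-").slice(0, maxLength).replace(/-+$/, "");
}

function specFileName(request: string): string {
  return `SPEC-${slugify(request)}.md`;
}
```

For instance, `specFileName("admin dashboard")` gives `SPEC-admin-dashboard.md`, which matches the corresponding row in the naming examples.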

package/.agent/workflows/test.md CHANGED

@@ -21,6 +21,7 @@ This command generates tests, runs existing tests, or checks test coverage.
 /test [file/feature] - Generate tests for specific target
 /test coverage       - Show test coverage report
 /test watch          - Run tests in watch mode
+/test red [feature]  - 🔴 RED Phase: Generate & run a failing test (TDD Enforcer)
 ```
 ---
@@ -100,6 +101,7 @@ Total: 15 tests (14 passed, 1 failed)
 /test user registration flow
 /test coverage
 /test fix failed tests
+/test red checkout-flow          # 🔴 Generate failing test for checkout
 ```
 ---
@@ -142,3 +144,42 @@ describe('AuthService', () => {
 - **Descriptive test names**
 - **Arrange-Act-Assert pattern**
 - **Mock external dependencies**
+---
+## 🔴 Red-Phase TDD Enforcer
+When `/test red [feature]` is triggered:
+1. **Analyze the approved SPEC-*.md** for the feature
+2. **Generate a failing test stub** based on acceptance criteria
+3. **Run the test** and capture output
+4. **Verify failure**: Exit code MUST be != 0
+### Expected Output
+```
+🔴 RED PHASE: checkout-flow
+Generating failing test for: checkout-flow
+  Based on: SPEC-checkout.md
+📝 Created: tests/checkout.test.ts
+Running test...
+  ❌ FAIL checkout › should calculate total with discount
+    Error: Function not implemented
+✅ RED PHASE COMPLETE
+   Test failed as expected. Ready for GREEN phase.
+```
+### If Test Passes (Error!)
+```
+⚠️ RED PHASE FAILED
+The test passed, which means either:
+  1. The feature already exists (no work needed)
+  2. The test is incorrect (doesn't reflect Spec)
+Action: Review the test and Spec before proceeding.
+```
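
Step 2 of the Red-Phase TDD Enforcer ("Generate a failing test stub based on acceptance criteria") could look roughly like the sketch below. It is illustrative only: `parseAcceptanceCriteria` and `renderFailingStub` are hypothetical helpers, not part of agent-kit. It assumes the spec uses the `| AC-n | Criterion | Testable? |` table from the `/spec` output format and that the project's test runner provides Jest-style `describe` and `test` globals.

```typescript
// Hypothetical helper names; not part of agent-kit.
interface Criterion {
  id: string;
  text: string;
}

// Pull rows such as "| AC-1 | Total reflects discount | ✅ |" out of a spec file.
function parseAcceptanceCriteria(specMarkdown: string): Criterion[] {
  const criteria: Criterion[] = [];
  for (const line of specMarkdown.split("\n")) {
    const match = /^\|\s*(AC-\d+)\s*\|\s*(.+?)\s*\|/.exec(line.trim());
    if (match) {
      criteria.push({ id: match[1], text: match[2] });
    }
  }
  return criteria;
}

// Emit one deliberately failing test per criterion (Jest-style globals assumed).
function renderFailingStub(feature: string, criteria: Criterion[]): string {
  const cases = criteria.map(
    (criterion) =>
      `  test(${JSON.stringify(`${criterion.id}: ${criterion.text}`)}, () => {\n` +
      `    throw new Error("Not implemented");\n` +
      `  });`
  );
  return `describe(${JSON.stringify(feature)}, () => {\n${cases.join("\n")}\n});\n`;
}
```

Each generated test throws unconditionally, so the runner should exit non-zero until the stubs are replaced with real assertions and the feature is implemented.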

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@musashishao/agent-kit",
-  "version": "1.6.0",
+  "version": "1.6.1",
   "description": "AI Agent templates - Skills, Agents, Workflows, and AI-Ready Data Infrastructure Gateway",
   "main": "index.js",
   "bin": {
@@ -52,4 +52,4 @@
   "engines": {
     "node": ">=16.0.0"
   }
-}
+}