npm - @uzysjung/agent-harness - Versions diffs - 26.83.0 - Mend

@uzysjung/agent-harness 26.83.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (212)

package/templates/codex/skills/uzys-test/SKILL.md ADDED

@@ -0,0 +1,24 @@
+---
+name: uzys-test
+description: "Verify phase: runs the TDD workflow and the per-Track coverage gates. Codex port (original: .claude/commands/uzys/test.md)"
+---
+# /uzys-test — test Phase (Codex)
+> **Generated from**: `.claude/commands/uzys/test.md` via `scripts/claude-to-codex.sh` (Phase C)
+> **Slash**: `/uzys-test`
+## Pre-flight
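+Check before invoking this skill:
+- Previous phase completion: check the previous Phase checkbox in `docs/todo.md`
+- The `pre_tool_use` hook enforces ordering via the Skill matcher (Codex hook system, ADR-002 v2)
+- If the previous phase is incomplete, the call is blocked with exit 2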
+## Goal
+{SKILL_BODY_PLACEHOLDER}
+---
+*The body of this SKILL.md is ported from `.claude/commands/uzys/test.md` in Phase C.*

package/templates/commands/ecc/checkpoint.md ADDED

@@ -0,0 +1,32 @@
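+Creates a snapshot of the current progress. Use it as a savepoint between gates.
+## Process
+Check the items below and record a checkpoint:
+1. **Test status**: total/passed/failed counts, coverage
+2. **Build status**: success or error
+3. **Code changes**: `git diff --stat` summary
+4. **Completed work**: checklist
+5. **Blocking issues**: describe any
+6. **Next steps**: remaining work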
+## Output Format
+```markdown
+### Checkpoint: [YYYY-MM-DD HH:MM]
+**Tests**: Total X, Pass Y, Fail Z, Coverage XX%
+**Build**: PASS / FAIL
+**Changes since last checkpoint**: `git diff --stat`
+**Completed**: [x] Task 1, [x] Task 2, [ ] Task 3 (in progress)
+**Blockers**: [describe any]
+**Next**: 1. Step 1, 2. Step 2
+```
+## Usage
+- As a savepoint before major changes
+- To record progress at Phase transitions
+- As a reference point when deciding on a rollback
+- Use together with `git commit -m "chore: checkpoint [description]"`

package/templates/commands/ecc/e2e.md ADDED

@@ -0,0 +1,105 @@
+---
+description: Generate and run E2E tests with Playwright
+agent: everything-claude-code:e2e-runner
+subtask: true
+---
+# E2E Command
+Generate and run end-to-end tests using Playwright: $ARGUMENTS
+## Your Task
+1. **Analyze user flow** to test
+2. **Create test journey** with Playwright
+3. **Run tests** and capture artifacts
+4. **Report results** with screenshots/videos
+## Test Structure
+```typescript
+import { test, expect } from '@playwright/test'
+test.describe('Feature: [Name]', () => {
+  test.beforeEach(async ({ page }) => {
+    // Setup: Navigate, authenticate, prepare state
+  })
+  test('should [expected behavior]', async ({ page }) => {
+    // Arrange: Set up test data
+    // Act: Perform user actions
+    await page.click('[data-testid="button"]')
+    await page.fill('[data-testid="input"]', 'value')
+    // Assert: Verify results
+    await expect(page.locator('[data-testid="result"]')).toBeVisible()
+  })
+  test.afterEach(async ({ page }, testInfo) => {
+    // Capture screenshot on failure
+    if (testInfo.status !== 'passed') {
+      await page.screenshot({ path: `test-results/${testInfo.title}.png` })
+    }
+  })
+})
+```
+## Best Practices
+### Selectors
+- Prefer `data-testid` attributes
+- Avoid CSS classes (they change)
+- Use semantic selectors (roles, labels)
+### Waits
+- Use Playwright's auto-waiting
+- Avoid `page.waitForTimeout()`
+- Use `expect().toBeVisible()` for assertions
+### Test Isolation
+- Each test should be independent
+- Clean up test data after
+- Don't rely on test order
+## Artifacts to Capture
+- Screenshots on failure
+- Videos for debugging
+- Trace files for detailed analysis
+- Network logs if relevant
+## Test Categories
+1. **Critical User Flows**
+   - Authentication (login, logout, signup)
+   - Core feature happy paths
+   - Payment/checkout flows
+2. **Edge Cases**
+   - Network failures
+   - Invalid inputs
+   - Session expiry
+3. **Cross-Browser**
+   - Chrome, Firefox, Safari
+   - Mobile viewports
+## Report Format
+```
+E2E Test Results
+================
+PASS: Passed: X
+FAIL: Failed: Y
+SKIPPED: Skipped: Z
+Failed Tests:
+- test-name: Error message
+  Screenshot: path/to/screenshot.png
+  Video: path/to/video.webm
+```
+---
+**TIP**: Run with `--headed` flag for debugging: `npx playwright test --headed`

package/templates/commands/ecc/eval.md ADDED

@@ -0,0 +1,88 @@
+---
+description: Run evaluation against acceptance criteria
+agent: everything-claude-code:build
+---
+# Eval Command
+Evaluate implementation against acceptance criteria: $ARGUMENTS
+## Your Task
+Run structured evaluation to verify the implementation meets requirements.
+## Evaluation Framework
+### Grader Types
+1. **Binary Grader** - Pass/Fail
+   - Does it work? Yes/No
+   - Good for: feature completion, bug fixes
+2. **Scalar Grader** - Score 0-100
+   - How well does it work?
+   - Good for: performance, quality metrics
+3. **Rubric Grader** - Category scores
+   - Multiple dimensions evaluated
+   - Good for: comprehensive review
+## Evaluation Process
+### Step 1: Define Criteria
+```
+Acceptance Criteria:
+1. [Criterion 1] - [weight]
+2. [Criterion 2] - [weight]
+3. [Criterion 3] - [weight]
+```
+### Step 2: Run Tests
+For each criterion:
+- Execute relevant test
+- Collect evidence
+- Score result
+### Step 3: Calculate Score
+```
+Final Score = Σ (criterion_score × weight) / total_weight
+```
+### Step 4: Report
+## Evaluation Report
+### Overall: [PASS/FAIL] (Score: X/100)
+### Criterion Breakdown
+| Criterion | Score | Weight | Weighted |
+|-----------|-------|--------|----------|
+| [Criterion 1] | X/10 | 30% | X |
+| [Criterion 2] | X/10 | 40% | X |
+| [Criterion 3] | X/10 | 30% | X |
+### Evidence
+**Criterion 1: [Name]**
+- Test: [what was tested]
+- Result: [outcome]
+- Evidence: [screenshot, log, output]
+### Recommendations
+[If not passing, what needs to change]
+## Pass@K Metrics
+For non-deterministic evaluations:
+- Run K times
+- Calculate pass rate
+- Report: "Pass@K = X/K"
+---
+**TIP**: Use eval for acceptance testing before marking features complete.
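
The scoring formula and the Pass@K report in eval.md can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `weighted_score` and `pass_at_k_report` are hypothetical and do not appear in the package, and the criterion scores and weights in the usage lines are made-up sample values.

```python
from typing import Sequence


def weighted_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Final Score = sum(criterion_score * weight) / total_weight."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def pass_at_k_report(results: Sequence[bool]) -> str:
    """Format the pass rate over K runs as 'Pass@K = X/K'."""
    passed = sum(1 for ok in results if ok)
    return f"Pass@K = {passed}/{len(results)}"


# Sample values only: three criteria scored out of 10, weighted 30/40/30.
final = weighted_score([8, 6, 10], [30, 40, 30])
report = pass_at_k_report([True, False, True])
```

Dividing by the total weight keeps the result on the same scale as the individual criterion scores even when the weights do not sum to 100.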

package/templates/commands/ecc/evolve.md ADDED

@@ -0,0 +1,7 @@
+Clusters related instincts and evolves them into higher-level skills/commands.
+```bash
+python3 .claude/skills/continuous-learning-v2/scripts/instinct-cli.py evolve
+```
+Automatically groups instincts with similar trigger/action patterns and proposes new composite patterns or Rule candidates. Promotion proposals are applied only after human approval.

package/templates/commands/ecc/harness-audit.md ADDED

@@ -0,0 +1,73 @@
+# Harness Audit Command
+Run a deterministic repository harness audit and return a prioritized scorecard.
+## Usage
+`/harness-audit [scope] [--format text|json] [--root path]`
+- `scope` (optional): `repo` (default), `hooks`, `skills`, `commands`, `agents`
+- `--format`: output style (`text` default, `json` for automation)
+- `--root`: audit a specific path instead of the current working directory
+## Deterministic Engine
+Always run:
+```bash
+node scripts/harness-audit.js <scope> --format <text|json> [--root <path>]
+```
+This script is the source of truth for scoring and checks. Do not invent additional dimensions or ad-hoc points.
+Rubric version: `2026-03-30`.
+The script computes 7 fixed categories (`0-10` normalized each):
+1. Tool Coverage
+2. Context Efficiency
+3. Quality Gates
+4. Memory Persistence
+5. Eval Coverage
+6. Security Guardrails
+7. Cost Efficiency
+Scores are derived from explicit file/rule checks and are reproducible for the same commit.
+The script audits the current working directory by default and auto-detects whether the target is the ECC repo itself or a consumer project using ECC.
+## Output Contract
+Return:
+1. `overall_score` out of `max_score` (70 for `repo`; smaller for scoped audits)
+2. Category scores and concrete findings
+3. Failed checks with exact file paths
+4. Top 3 actions from the deterministic output (`top_actions`)
+5. Suggested ECC skills to apply next
+## Checklist
+- Use script output directly; do not rescore manually.
+- If `--format json` is requested, return the script JSON unchanged.
+- If text is requested, summarize failing checks and top actions.
+- Include exact file paths from `checks[]` and `top_actions[]`.
+## Example Result
+```text
+Harness Audit (repo): 66/70
+- Tool Coverage: 10/10 (10/10 pts)
+- Context Efficiency: 9/10 (9/10 pts)
+- Quality Gates: 10/10 (10/10 pts)
+Top 3 Actions:
+1) [Security Guardrails] Add prompt/tool preflight security guards in hooks/hooks.json. (hooks/hooks.json)
+2) [Tool Coverage] Sync commands/harness-audit.md and .opencode/commands/harness-audit.md. (.opencode/commands/harness-audit.md)
+3) [Eval Coverage] Increase automated test coverage across scripts/hooks/lib. (tests/)
+```
+## Arguments
+$ARGUMENTS:
+- `repo|hooks|skills|commands|agents` (optional scope)
+- `--format text|json` (optional output format)

package/templates/commands/ecc/instinct-status.md ADDED

@@ -0,0 +1,8 @@
+Displays every instinct learned by CL-v2 (project scope + global).
+```bash
+python3 .claude/skills/continuous-learning-v2/scripts/instinct-cli.py status
+```
+Each instinct is shown with its confidence score (0.3-0.9), grouped by domain.
+High-confidence instincts (≥0.8) are flagged as Rule promotion candidates.

package/templates/commands/ecc/promote.md ADDED

@@ -0,0 +1,10 @@
+Promotes a project-scope instinct to global scope.
+```bash
+python3 .claude/skills/continuous-learning-v2/scripts/instinct-cli.py promote [id]
+```
+When run without an id, lists the promotion candidates (confidence ≥ 0.8, observed in 2+ projects).
+When an id is given, promotes that instinct to global scope.
+Global instincts are active in every project.
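
The candidate rule stated in promote.md (confidence ≥ 0.8 and seen in at least two projects) could be expressed roughly as below. This is a sketch, not the logic of `instinct-cli.py`: the `Instinct` record, its field names, and `promotion_candidates` are hypothetical names introduced for illustration, since the diff does not show how instincts are stored.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Instinct:
    """Hypothetical record; the real storage format is not shown in the diff."""
    id: str
    confidence: float
    project_count: int  # number of projects in which the instinct was observed


def promotion_candidates(
    instincts: Iterable[Instinct],
    min_confidence: float = 0.8,
    min_projects: int = 2,
) -> List[Instinct]:
    """Keep instincts that meet both thresholds described in promote.md."""
    return [
        inst
        for inst in instincts
        if inst.confidence >= min_confidence and inst.project_count >= min_projects
    ]


# Sample data only.
sample = [
    Instinct("commit-first", 0.85, 3),
    Instinct("result-types", 0.90, 1),  # high confidence but only one project
    Instinct("small-prs", 0.60, 4),     # widely seen but low confidence
]
candidates = promotion_candidates(sample)
```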

package/templates/commands/ecc/security-scan.md ADDED

@@ -0,0 +1,10 @@
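+Runs an AgentShield security scan on the Claude Code configuration (the .claude/ directory).
+Runs `npx ecc-agentshield scan` to check CLAUDE.md, settings.json, MCP servers, hooks, and agent definitions for security vulnerabilities, misconfigurations, and injection risks.
+Options:
+- `--fix`: fix auto-fixable items (e.g. hardcoded secrets → environment variable references)
+- `--min-severity medium`: minimum severity filter
+- `--format json|markdown|html`: output format
+Results are reported as an A-F grade. CRITICAL/HIGH findings require immediate action.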

package/templates/commands/uzys/auto.md ADDED

@@ -0,0 +1,190 @@
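+After the SPEC is finalized, runs the remaining 5 phases (Plan → Build → Test → Review → Ship) automatically in sequence and **re-verifies in a Ralph loop until SPEC compliance is met**.
+## Prerequisites
+1. Check that `docs/SPEC.md` exists. If not, print "Run /uzys:spec first" and **abort**.
+2. Check that `define.completed` = true in `.claude/gate-status.json`. If false, **abort**.
+## Execution Order
+Run each phase **sequentially**. When a phase completes, update gate-status.json automatically and move on to the next phase.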
+### 1. Plan
+- Generate `docs/plan.md` + `docs/todo.md`
+- Goal-backward verification with the plan-checker agent (Revision Gate, up to 3 attempts)
+- On completion: `jq '.plan.completed = true | .plan.timestamp = (now | strftime("%Y-%m-%dT%H:%M:%SZ"))' .claude/gate-status.json > /tmp/gate-tmp.json && mv /tmp/gate-tmp.json .claude/gate-status.json`
+### 2. Build
+- Select incomplete tasks from `docs/todo.md` in order
+- Apply the TDD cycle to each task (RED → GREEN → REFACTOR)
+- **Commit immediately** when each task completes (per commit-policy)
+- Update the todo.md checkboxes
+- When all tasks are complete: gate-status build.completed = true
+### 3. Test
+- Run the full test suite
+- Check the test-policy.md coverage thresholds (UI 60%, API 80%, logic 90%)
+- If coverage falls short, try writing additional tests (up to 3 attempts)
+- On pass: gate-status verify.completed = true
+### 4. Review
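+- Invoke the reviewer subagent (context: fork)
+- 5-axis review: correctness, readability, architecture, security, performance
+- If a CRITICAL issue is found, attempt an immediate fix (up to 3 attempts)
+- With 0 CRITICAL issues: gate-status review.completed = true
+### 5. SPEC Compliance Check (Ralph Loop) ← core step
+**Must run before Ship**. Checks that every requirement defined in SPEC.md is actually implemented; if anything falls short, returns to Build to fix it.
+#### 5.1 SPEC Parsing
+Automatically extract the following from `docs/SPEC.md`:
+- The core goals in the **Objective** section
+- Each feature item in the **Features** section (checklist or table)
+- **Acceptance Criteria** (if present)
+- The **Boundaries > DO NOT CHANGE** areas
+- The **Non-Goals** list
+#### 5.2 Implementation Verification (automatic)
+For each extracted item:
+1. **File existence**: does a file implementing the feature exist in the project? (Glob/Grep)
+2. **Code match**: do the feature keywords appear in the source code? (Grep)
+3. **Test existence**: is there a test file for the feature?
+4. **Build passes**: does the project build/type check PASS?
+5. **DO NOT CHANGE respected**: were the protected areas left unmodified?
+6. **Non-Goals respected**: was no implementation added that falls under Non-Goals?
+#### 5.3 Result Classification
+Classify each item into one of three categories:
+- **PASS**: implementation confirmed
+- **PARTIAL**: the file exists but is incomplete (e.g. stub, TODO, empty function)
+- **MISSING**: no implementation
+#### 5.4 Ralph Loop (iterative fixing)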
+```
+iteration = 0
+MAX_ITERATIONS = 5
+while any MISSING or PARTIAL items remain:
+    iteration += 1
+    if iteration > MAX_ITERATIONS:
+        → Escalation Gate: report to the user "N items still unmet after 5 attempts"
+        → wait for the user's decision (continue / stop / fix manually)
+        break
+    1. Print the list of MISSING/PARTIAL items
+    2. For each item:
+       - Write the implementation code (reuse the Build phase logic)
+       - Write tests for the feature
+       - Commit immediately
+    3. Re-run the full build/tests
+    4. Re-verify SPEC compliance (repeat 5.2)
+    5. Print the result: "Iteration {iteration}: PASS {n}, PARTIAL {m}, MISSING {k}"
+if all items PASS:
+    → print "SPEC compliance 100% achieved"
+    → gate-status verify.completed = true (if not already set)
+```
+#### 5.5 Verification Report
+Report after each iteration:
+```
+## SPEC Compliance Report — Iteration {N}/{MAX}
+| # | Feature | Status | Evidence |
+|---|---------|--------|----------|
+| 1 | Note CRUD | PASS | src/services/note.ts:15 + tests/note.test.ts |
+| 2 | Category tree | PARTIAL | src/components/Sidebar.tsx exists but nesting is not implemented |
+| 3 | FTS search | MISSING | no search-related files |
+PASS: X / PARTIAL: Y / MISSING: Z
+Next: fix {Y+Z} items, then re-verify
+```
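+### 6. Ship (after SPEC reaches 100%)
+- Run agentshield-gate automatically (blocks on CRITICAL)
+- Run spec-drift-check in ship mode
+- When everything passes: gate-status ship.completed = true
+- Final commit + tag suggestion
+### 7. Post-Ship: CLAUDE.md Review (automatic)
+After Ship completes, CLAUDE.md is reviewed automatically:
+1. **Instinct check**: run `/ecc:instinct-status` → propose instincts with confidence ≥ 0.8 as candidates for CLAUDE.md or `.claude/rules/`
+2. **Pattern check**: if any correction/mistake pattern recurred in this session, propose adding it to CLAUDE.md
+3. **Contradiction detection**: report where existing CLAUDE.md instructions conflicted with behavior in this session
+4. **rules-distill**: check whether the ECC `rules-distill` skill can extract cross-cutting principles from the current skills
+**Constraint**: proposals only. No direct edits (human approval required). Print the proposed changes to the user as a list.
+**Example output**: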
+```
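+## CLAUDE.md Improvement Proposals (Post-Ship Review)
+1. [instinct] "Rust errors must return Result<T, E>" (confidence 0.85) → candidate for strengthening the error-handling rule
+2. [repeated correction] The user corrected "commit first" 3 times → commit-policy needs more emphasis
+3. [no contradictions]
+4. [rules-distill] not applicable
+Tell me the item numbers to apply on approval (or "skip"):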
+```
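+## Automatic Retry (Revision Gate pattern)
+When a phase fails:
+1. Analyze the cause + attempt an immediate fix
+2. Retry up to **3 times** (per phase)
+3. The SPEC Compliance Loop runs **up to 5 times** (whole loop)
+4. Beyond that, **escalate to the user** (Escalation Gate)
+5. Wait for the user's response
+## Abort Conditions (Abort Gate)
+- SPEC.md exceeds 300 lines → propose spec-scaling + abort
+- The same issue unresolved 3 times in a row → escalation
+- MISSING items remain after 5 SPEC Compliance iterations → escalation
+- User presses Ctrl+C → preserve the current state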
+## Arguments
+```
+/uzys:auto                  # start from Plan (default)
+/uzys:auto from=build       # from Build (when Plan is already complete)
+/uzys:auto from=test        # from Test
+/uzys:auto from=review      # from Review
+/uzys:auto from=ship        # from Ship (includes SPEC compliance)
+/uzys:auto from=verify      # run only the SPEC compliance check
+```
+## Ralph Loop Summary
+```
+/uzys:spec (user)
+    ↓
+/uzys:auto (starts automatically)
+    ↓
+  Plan → Build → Test → Review
+    ↓
+  SPEC Compliance Check ← enters the Ralph Loop
+    ↓         ↑
+  MISSING?  → return to Build and fix → Test → re-verify
+    ↓ (PASS)
+  Ship
+```
+**Key point**: Ship can be entered **only when every Feature in SPEC.md is PASS**. If even one item is MISSING, return to Build and implement it. This loop "keeps running until it works".
+## References
+- gate-check.sh **excludes** `/uzys:auto` from gate checks. The auto command manages gate-status directly.
+- The detailed behavior of each phase follows the individual command definitions such as `/uzys:plan` and `/uzys:build`.
+- Applies the 4 gate types from gates-taxonomy.md (Pre-flight/Revision/Escalation/Abort).
+- verification-loop skill (ECC): a 6-step verification cycle, Build→Type→Lint→Test→Security→Diff. Used within the SPEC compliance check.
+- P9 Circuit Breakers: 5-iteration cap. Prevents infinite loops.
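
The control flow of section 5.4 in auto.md (re-verify, fix, stop after five iterations and escalate) can be modeled as a bounded loop. The sketch below is an assumption-laden illustration: `ralph_loop`, `verify`, and `fix` are hypothetical names, and the real command delegates verification and fixing to the agent rather than to Python callbacks.

```python
from collections import Counter
from typing import Callable, Dict

MAX_ITERATIONS = 5  # the circuit-breaker cap named in auto.md


def ralph_loop(
    verify: Callable[[], Dict[str, str]],
    fix: Callable[[str], None],
    max_iterations: int = MAX_ITERATIONS,
) -> bool:
    """Return True once every item is PASS, False when escalation is needed.

    verify() maps each SPEC item to "PASS", "PARTIAL", or "MISSING";
    fix(item) attempts to implement one unmet item. Both are hypothetical hooks.
    """
    statuses = verify()
    iteration = 0
    while any(status != "PASS" for status in statuses.values()):
        if iteration >= max_iterations:
            return False  # Escalation Gate: hand the decision back to the user
        iteration += 1
        for item, status in statuses.items():
            if status != "PASS":
                fix(item)
        statuses = verify()  # re-run the compliance check (step 5.2)
        counts = Counter(statuses.values())
        print(
            f"Iteration {iteration}: PASS {counts['PASS']}, "
            f"PARTIAL {counts['PARTIAL']}, MISSING {counts['MISSING']}"
        )
    return True


# Simulated project state in which every fix succeeds on the first try.
state = {"Note CRUD": "PASS", "Category tree": "PARTIAL", "FTS search": "MISSING"}
converged = ralph_loop(lambda: dict(state), lambda item: state.update({item: "PASS"}))
```

Returning a boolean instead of raising keeps the escalation decision with the caller, which mirrors the "wait for the user's decision" step in the template.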

package/templates/commands/uzys/build.md ADDED

@@ -0,0 +1,42 @@
+Build phase: implement incrementally with TDD.
+## Gate Check
+Check that `docs/todo.md` exists. If not, warn: "Complete the Plan phase (`/uzys:plan`) first".
+## Process
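+1. Select the next incomplete task from `docs/todo.md`.
+2. Follow the agent-skills incremental-implementation + test-driven-development skills:
+   - RED: write a failing test first
+   - GREEN: write the minimal implementation that passes the test
+   - REFACTOR: improve the code (keeping the tests passing)
+3. Check off the completed task in todo.md.
+4. **Apply git-policy.md: commit → push immediately**. When issue_tracking is enabled, include the issue number in the commit message:
+   - In progress: `<type>: description (refs #N)`
+   - Task fully complete: `Closes #N` in the PR is recommended (avoid closing per commit)
+5. New bug/feature discovered during Build → register it in the backlog with the `gh-issue-workflow` skill (5-section ISSUE template). Record it asynchronously without breaking the current workflow.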
+## Context-Aware Skill Selection
+Automatically activate additional skills based on the file currently being edited:
+| File type | Additionally activated |
+|-----------|------------|
+| `.tsx`, `.jsx`, `.html`, `.css` | frontend-ui-engineering skill + refer to DESIGN.md/.impeccable.md |
+| API routes, endpoints | api-and-interface-design skill |
+| External library usage | source-driven-development skill (check the official docs) |
+## Auto-Actions
+- If SPEC.md exceeds 300 lines, propose splitting it with the spec-scaling skill.
+- Warn if moving on to the next task without a commit.
+- Automatically update todo.md when each task completes.
+## Gate Status Update
+When this phase completes successfully, update `build.completed` to `true` and `build.timestamp` to the current time in `.claude/gate-status.json`.
+```bash
+jq '.build.completed = true | .build.timestamp = (now | strftime("%Y-%m-%dT%H:%M:%SZ"))' .claude/gate-status.json > /tmp/gate-tmp.json && mv /tmp/gate-tmp.json .claude/gate-status.json
+```
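
For readers who prefer not to depend on `jq`, the gate status update could be approximated in Python as follows. This is a sketch under assumptions: `mark_phase_complete` is a hypothetical helper, not part of the package, and it writes the temporary file next to the target (rather than in `/tmp`) so that the final rename stays on one filesystem.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def mark_phase_complete(path: Path, phase: str) -> dict:
    """Set <phase>.completed = true and <phase>.timestamp = now (UTC, ISO 8601)."""
    status = json.loads(path.read_text(encoding="utf-8"))
    entry = status.setdefault(phase, {})  # create the phase object if it is absent
    entry["completed"] = True
    entry["timestamp"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Write to a sibling temp file, then swap it in with a single rename.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(status, tmp, indent=2)
    os.replace(tmp_name, path)
    return status


# Usage against a throwaway directory; the real file is .claude/gate-status.json.
with tempfile.TemporaryDirectory() as workdir:
    gate_file = Path(workdir) / "gate-status.json"
    gate_file.write_text(json.dumps({"plan": {"completed": True}}), encoding="utf-8")
    updated = mark_phase_complete(gate_file, "build")
    reread = json.loads(gate_file.read_text(encoding="utf-8"))
```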

package/templates/commands/uzys/plan.md ADDED

@@ -0,0 +1,55 @@
+Plan phase: break the work down into small, verifiable units.
+## Gate Check
+Check that `docs/SPEC.md` exists. If not, warn: "Complete the Define phase (`/uzys:spec`) first".
+## Plan Depth: match the complexity of the change
+Do not decompose every SPEC the same way. Adjust the plan depth by complexity:
+| Complexity | Signals | Plan form |
+|--------|------|----------|
+| **Trivial** | diff can be described in 1 sentence / single file / clear fix | **Plan may be skipped**. Only a 1-task entry in todo.md (or skip the plan phase entirely, with user agreement) |
+| **Standard** | multi-file / several modules / some unfamiliar code | **Milestone plan** (3-5 outcomes). Per-task AC at milestone level |
+| **Complex** | new feature / cross-cutting / ambiguous requirements | **Detailed plan** (10+ vertical-slice tasks, per-task AC, dependency graph) |
+Criterion: Anthropic best practices, *"if you can describe the diff in one sentence, skip the plan"*. Opus-class models can decompose work autonomously, so forcing micro-tasks turns into ceremony.
+## Process
+1. Read SPEC.md and determine the overall scope + **complexity** (table above).
+1.5. **Fetch GitHub Issues first (when issue_tracking: enabled)**:
+   - Call `gh issue list --state open --json number,title,body,labels`
+   - grep each issue body for the `방향성 (YYYY-MM-DD 확정)` pattern ("direction, confirmed YYYY-MM-DD"); only confirmed issues are candidates
+   - Exclude issues whose preconditions (Given) are unmet
+   - Sort by priority (label P0 > P1 > P2 > unlabeled)
+   - Top 1-3 issues → candidates for todo.md
+2. **If Trivial**: generate only todo.md and go straight to Build. plan.md is 1-2 lines.
+3. **If Standard/Complex**: decompose following the agent-skills planning-and-task-breakdown skill:
+   - Vertical slicing: vertical feature units, not horizontal layers
+   - Define Acceptance Criteria for each task (milestone level for Standard, task level for Complex)
+   - Order by dependency
+4. **North Star 4-gate check (Complex complexity + `docs/NORTH_STAR.md` exists)**:
+   - Does the new feature/task pass all 4 gates (Trend/Persona/Capability/Lean) of NORTH_STAR.md §5 Decision Heuristics?
+   - If 1 or more gates fail, report to the user and wait for a decision (no automatic progression)
+   - Skip if NORTH_STAR.md is absent
+5. Sprint Contract: scope (included/excluded) + completion criteria + constraints.
+6. Generate `docs/plan.md` + `docs/todo.md`.
+## Output
+- `docs/plan.md`: overall plan, Phase breakdown, dependencies
+- `docs/todo.md`: to-do list in checklist form
+## Gate
+Complete when plan.md + todo.md exist and at least 1 task is defined.
+## Gate Status Update
+When this phase completes successfully, update `plan.completed` to `true` and `plan.timestamp` to the current time in `.claude/gate-status.json`.
+```bash
+jq '.plan.completed = true | .plan.timestamp = (now | strftime("%Y-%m-%dT%H:%M:%SZ"))' .claude/gate-status.json > /tmp/gate-tmp.json && mv /tmp/gate-tmp.json .claude/gate-status.json
+```
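
The issue pre-selection in step 1.5 of plan.md (keep only issues with a confirmed direction, sort by priority label, take the top few) might look roughly like this in Python. Treat it as a sketch: `todo_candidates` and `priority_rank` are hypothetical names, the input shape assumes that each label in the `gh issue list --json` output is an object with a `name` field, and the precondition (Given) check is left out because the diff does not describe how it is evaluated.

```python
import re
from typing import Any, Dict, List

# Literal marker from plan.md, with the date placeholder turned into digits.
CONFIRMED = re.compile(r"방향성 \(\d{4}-\d{2}-\d{2} 확정\)")
PRIORITY = {"P0": 0, "P1": 1, "P2": 2}  # unlabeled issues sort after P2


def priority_rank(issue: Dict[str, Any]) -> int:
    """Best (lowest) rank among the issue's priority labels."""
    names = [label["name"] for label in issue.get("labels", [])]
    return min((PRIORITY[n] for n in names if n in PRIORITY), default=len(PRIORITY))


def todo_candidates(issues: List[Dict[str, Any]], limit: int = 3) -> List[Dict[str, Any]]:
    """Confirmed issues only, highest priority first, capped at `limit`."""
    confirmed = [i for i in issues if CONFIRMED.search(i.get("body") or "")]
    return sorted(confirmed, key=priority_rank)[:limit]


# Sample data shaped like parsed JSON from the gh command (values are made up).
issues = [
    {"number": 11, "body": "방향성 (2026-01-05 확정)", "labels": [{"name": "P2"}]},
    {"number": 12, "body": "still under discussion", "labels": [{"name": "P0"}]},
    {"number": 13, "body": "방향성 (2026-01-07 확정)", "labels": [{"name": "P0"}]},
    {"number": 14, "body": "방향성 (2026-01-08 확정)", "labels": []},
]
selected = [issue["number"] for issue in todo_candidates(issues)]
```

Because `sorted` is stable, issues with the same priority keep the order in which the CLI returned them.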