npm - buildcrew - Versions diffs - 1.8.4 → 1.8.5 - Mend

buildcrew 1.8.4 → 1.8.5


Files changed (3)

package/README.ko.md CHANGED

@@ -18,6 +18,7 @@ AI 코딩 에이전트가 아무리 똑똑해도, 구조 없이 쓰면 결과가
 - **프로세스** — 품질 게이트가 있는 순차 파이프라인. 통과 못하면 자동으로 재시도
 - **하네스** — 코드베이스를 분석해서 프로젝트 맥락을 자동으로 파악
 - **오케스트레이터** — `@buildcrew`에게 말하면 알아서 적절한 에이전트를 투입
+- **세컨드 오피니언** — 모든 작업 후 독립적인 리뷰 (Codex 또는 Claude subagent)
 ```
 나:     @buildcrew 유저 인증 추가해줘
@@ -30,87 +31,24 @@ AI 코딩 에이전트가 아무리 똑똑해도, 구조 없이 쓰면 결과가
 ## 시작하기
-```bash
-# 1. 에이전트 설치
-npx buildcrew
-# 2. 프로젝트 하네스 자동 생성 (질문 없이 코드베이스 분석)
-npx buildcrew init
-# 3. 필요한 부분만 커스터마이징
-code .claude/harness/
-# 4. 바로 사용
-@buildcrew 유저 대시보드 추가해줘
-```
----
-## 하네스 엔지니어링
-`npx buildcrew init` 하나로 코드베이스를 스캔해서 프로젝트 하네스를 자동 생성합니다. 질문하지 않습니다.
-### 자동 감지 항목
-| 카테고리 | 감지 대상 |
-|---------|----------|
-| 프레임워크 | Next.js, Nuxt, React, Vue, SvelteKit, Express |
-| 언어/CSS | TypeScript, TailwindCSS, Framer Motion |
-| 데이터베이스 | Supabase, Prisma, Drizzle, MongoDB |
-| 인증 | NextAuth, Supabase Auth, Firebase Auth |
-| 결제 | Stripe, Paddle, Toss Payments |
-| AI | OpenAI, Anthropic, Google AI |
-| 배포 | Vercel, Netlify, Fly.io, Docker |
-| 컴포넌트 | `src/components/` 자동 스캔 |
-| API 라우트 | `src/app/api/` 자동 스캔 |
-| 다국어 | i18n 디렉토리 스캔 |
-### 생성되는 파일
-감지 결과에 따라 필요한 하네스 파일이 자동으로 생성됩니다:
-```
-.claude/harness/
-├── project.md        ← 항상 (프로젝트 컨텍스트, 스택, 컴포넌트, API)
-├── rules.md          ← 항상 (프레임워크에 맞는 코딩 규칙)
-├── erd.md            ← DB 감지 시
-├── api-spec.md       ← API 라우트 발견 시
-├── design-system.md  ← TailwindCSS 감지 시
-├── architecture.md   ← 항상
-└── user-flow.md      ← i18n 또는 5개 이상 컴포넌트 시
-```
-### 커스터마이징
-생성된 파일에서 `<!-- HTML 주석 -->`으로 된 부분만 채우면 됩니다. 나머지는 코드베이스에서 이미 채워져 있습니다.
+한 명령어로 전부 끝납니다:
 ```bash
-npx buildcrew harness     # 어떤 파일을 편집해야 하는지 확인
+npx buildcrew
 ```
-### 열린 구조
+인터랙티브 셋업이 순서대로 진행합니다:
+1. 15개 에이전트 + 오케스트레이터 설치
+2. Playwright MCP 설치 여부 (브라우저 테스트에 필요)
+3. 프로젝트 하네스 생성 여부 (스택 자동 감지)
+4. 추가 하네스 템플릿 선택
-`.claude/harness/`에 아무 `.md` 파일이나 추가하면 에이전트가 읽습니다:
+그 다음 바로 사용:
 ```bash
-npx buildcrew add glossary    # 용어 사전 추가
-npx buildcrew add env-vars    # 환경 변수 가이드 추가
-echo "# 내 메모" > .claude/harness/notes.md  # 직접 파일 생성도 가능
+@buildcrew 유저 대시보드 추가해줘
 ```
-### 에이전트 라우팅
-각 에이전트는 자기 역할에 맞는 하네스 파일만 읽습니다:
-| 파일 | 읽는 에이전트 |
-|------|-------------|
-| `project.md`, `rules.md` | 모든 에이전트 |
-| `erd.md`, `architecture.md`, `api-spec.md` | developer, reviewer, security-auditor, investigator |
-| `design-system.md` | designer |
-| `glossary.md`, `user-flow.md` | planner, designer, browser-qa |
-| `env-vars.md` | developer, security-auditor |
-| 커스텀 `.md` 파일 | reviewer, security-auditor (전부 읽음) |
 ---
 ## 에이전트
@@ -119,32 +57,41 @@ echo "# 내 메모" > .claude/harness/notes.md  # 직접 파일 생성도 가능
 | 에이전트 | 모델 | 역할 |
 |---------|------|------|
-| **planner** | opus | 6가지 강제 질문으로 요구사항 분석 → 4관점 자체 리뷰 (CEO, 엔지니어링, 디자인, QA). 관점별 1-10점 채점 후 기준 미달 시 자동 보강. |
-| **designer** | opus | UI/UX 레퍼런스 리서치 + 모션 엔지니어링. Playwright로 실제 사이트 스크린샷 수집, Figma MCP 연동, 애니메이션과 인터랙션이 포함된 프로덕션 컴포넌트 생성. AI 슬롭 블랙리스트 적용. |
-| **developer** | sonnet | 6가지 구현 질문으로 코드베이스를 먼저 파악한 뒤 구현. 3관점 자체 리뷰 (아키텍처, 코드 품질, 안전성)로 자기 코드 검증. 에러 핸들링 프로토콜 내장. 기능 구현, 버그 수정, 반복 수정 3가지 모드 지원. |
+| **planner** | opus | 6가지 강제 질문으로 요구사항 분석. 4관점 자체 리뷰 (CEO, 엔지니어링, 디자인, QA). 관점별 1-10점 채점. |
+| **designer** | opus | UI/UX 레퍼런스 리서치 + 모션 엔지니어링. Playwright 스크린샷, Figma MCP, 프로덕션 컴포넌트 생성. |
+| **developer** | opus | 6가지 구현 질문으로 코드베이스 파악 후 구현. 3관점 자체 리뷰 (아키텍처, 품질, 안전성). 에러 핸들링 프로토콜 내장. |
 ### 품질 팀
 | 에이전트 | 모델 | 역할 |
 |---------|------|------|
-| **qa-tester** | sonnet | 5가지 테스트 전략 질문으로 체계적 검증. 테스트 맵을 먼저 만들고 수용 기준별 검증 수행. 엣지 케이스 자동 생성, 신뢰도 점수 기반 버그 분류. |
-| **browser-qa** | sonnet | Playwright MCP로 실제 브라우저 테스트. 유저 플로우, 반응형, 콘솔 에러 확인. 건강 점수 (0-100) 산출. |
-| **reviewer** | opus | 4명의 전문가 관점으로 심층 리뷰 (보안, 성능, 테스트, 유지보수). 신뢰도 점수 + 스코프 드리프트 감지 + 적대적 리뷰. 기계적 이슈는 즉시 자동 수정. |
-| **health-checker** | sonnet | 코드 품질 대시보드. 7개 카테고리별 가중 점수 (0-10)와 트렌드 추적. |
+| **qa-tester** | sonnet | 5가지 테스트 전략 질문 + 테스트 맵. 엣지 케이스 자동 생성, 신뢰도 점수 기반 분류. |
+| **browser-qa** | sonnet | 4단계 브라우저 테스트 (파악→탐색→스트레스→판단). Playwright MCP. 건강 점수 0-100. |
+| **reviewer** | opus | 4전문가 심층 리뷰 (보안, 성능, 테스트, 유지보수) + 적대적 리뷰 + 자동 수정. 코드 작성 후 실행. |
+| **health-checker** | sonnet | 3단계 코드 품질 (감지→측정→처방). 가중 점수 0-10 + 트렌드 + 조치 항목 5개. |
 ### 보안 & 운영 팀
 | 에이전트 | 모델 | 역할 |
 |---------|------|------|
-| **security-auditor** | opus | OWASP Top 10 + STRIDE 위협 모델 기반 10단계 보안 감사. 신뢰도 게이트 적용. |
-| **canary-monitor** | sonnet | 배포 후 프로덕션 상태 모니터링. 페이지 로드, API 응답, 콘솔 에러, 성능 비교. |
-| **shipper** | sonnet | 8가지 사전 점검 후 배포. Semver 결정 프레임워크로 버전 자동 판단. 체인지로그 자동 생성, PR 템플릿, 배포 후 검증까지. |
+| **security-auditor** | sonnet | OWASP Top 10 + STRIDE 위협 모델. 10단계 보안 감사. |
+| **canary-monitor** | sonnet | 3단계 배포 후 모니터링 (파악→검증→판단). 베이스라인 비교, 신뢰도 점수. |
+| **shipper** | sonnet | 8가지 사전 점검 → 버전 → 체인지로그 → PR → 배포 후 검증. |
+### 생각 & 리뷰 팀
+| 에이전트 | 모델 | 역할 |
+|---------|------|------|
+| **thinker** | opus | "이거 만들 가치가 있나?" — 6가지 핵심 질문, 전제 의문, 대안 3개, 외부 관점 수집, 설계 문서 생성. |
+| **architect** | opus | 코드 짜기 전 아키텍처 리뷰 — 스코프 챌린지, 컴포넌트 다이어그램, 데이터 흐름, 실패 모드, 테스트 커버리지 맵. |
+| **design-reviewer** | sonnet | UI/UX 품질 — 8차원 0-10점 평가, Playwright 스크린샷 기반, 구체적 수정안 + 노력도, WCAG 준수. |
 ### 전문가
 | 에이전트 | 모델 | 역할 |
 |---------|------|------|
-| **investigator** | sonnet | 5가지 증거 수집 → 가설 수립 (확률 기반) → 가설 검증 → 수정. 12가지 대표 버그 패턴 내장. 무관한 코드 수정 자동 차단. |
+| **investigator** | sonnet | 4단계 근본 원인 디버깅. 12가지 대표 버그 패턴. 무관한 코드 수정 자동 차단. |
+| **qa-auditor** | opus | 3개 병렬 subagent (보안, 버그, 설계 준수)가 git diff를 설계 문서와 비교 검사. API 키 불필요. |
 ---
@@ -162,19 +109,71 @@ echo "# 내 메모" > .claude/harness/notes.md  # 직접 파일 생성도 가능
 | **Health** | "헬스체크 돌려줘" | 품질 대시보드 |
 | **Canary** | "배포 확인해줘" | 프로덕션 모니터링 |
 | **Review** | "코드 리뷰해줘" | 4전문가 병렬 리뷰 + 자동 수정 |
-| **Ship** | "배포해줘" | 사전 점검 → 테스트 → 버전 → 체인지로그 → PR |
+| **Ship** | "배포해줘" | 사전 점검 → 버전 → 체인지로그 → PR |
+| **QA Audit** | "코드 검사해줘" | 3 subagent 병렬 검사 |
+| **Think** | "이거 만들 가치가 있을까?" | 6가지 질문 + 대안 + 설계 문서 |
+| **Arch Review** | "아키텍처 리뷰해줘" | 스코프 + 다이어그램 + 실패 모드 |
+| **Design Review** | "디자인 리뷰해줘" | 8차원 점수 + 구체적 수정안 |
+### 모드 우선순위
+메시지가 여러 모드에 해당하면 우선순위 테이블로 해결합니다. Debug가 항상 최우선. Think은 Feature보다 우선. 애매하면 사용자에게 물어봅니다.
+### 세컨드 오피니언
+모든 모드 완료 후 독립적인 리뷰를 제안합니다:
+- **Codex CLI 있으면**: 다른 AI 모델이 독립 리뷰
+- **없으면**: 세션 기억 없는 새 Claude subagent가 리뷰
+사용자가 결과를 보고 어떤 걸 반영할지 결정합니다.
 ### 반복 실행
-매 반복마다 전체 파이프라인을 처음부터 다시 돌립니다. 기획자가 이전 결과를 읽고 계획을 수정하고, 개발자가 다시 구현하고, QA가 다시 검증합니다:
+매 반복마다 전체 파이프라인을 처음부터 다시 돌립니다:
 ```
 @buildcrew 유저 대시보드 추가해줘, 5 iterations
 ```
-### 모드 체이닝
+---
+## 하네스 엔지니어링
+`npx buildcrew` 실행 시 코드베이스를 스캔해서 프로젝트 하네스를 자동 생성합니다.
+### 자동 감지 항목
+| 카테고리 | 감지 대상 |
+|---------|----------|
+| 프레임워크 | Next.js, Nuxt, React, Vue, SvelteKit, Express |
+| 언어/CSS | TypeScript, TailwindCSS, Framer Motion |
+| 데이터베이스 | Supabase, Prisma, Drizzle, MongoDB |
+| 인증 | NextAuth, Supabase Auth, Firebase Auth |
+| 결제 | Stripe, Paddle, Toss Payments |
+| AI | OpenAI, Anthropic, Google AI |
+| 배포 | Vercel, Netlify, Fly.io, Docker |
+### 생성되는 파일
+```
+.claude/harness/
+├── project.md        ← 항상 (프로젝트 컨텍스트, 스택, 컴포넌트, API)
+├── rules.md          ← 항상 (프레임워크에 맞는 코딩 규칙)
+├── erd.md            ← DB 감지 시
+├── api-spec.md       ← API 라우트 발견 시
+├── design-system.md  ← TailwindCSS 감지 시
+├── architecture.md   ← 항상
+└── user-flow.md      ← i18n 또는 5개 이상 컴포넌트 시
+```
+### 열린 구조
+`.claude/harness/`에 아무 `.md` 파일이나 추가하면 에이전트가 읽습니다.
-Feature 완료 → Ship 제안 → Canary 제안. Canary에서 문제 발견 → Debug 자동 전환.
+```bash
+npx buildcrew harness     # 어떤 파일을 편집해야 하는지 확인
+npx buildcrew add         # 사용 가능한 템플릿 목록
+```
 ---
@@ -185,12 +184,11 @@ Feature 완료 → Ship 제안 → Canary 제안. Canary에서 문제 발견 →
 ```
 .claude/pipeline/{기능명}/
 ├── 01-plan.md           요구사항 + 4관점 리뷰 점수
-├── 02-references.md     UI/UX 레퍼런스
 ├── 02-design.md         디자인 결정 + 컴포넌트 스펙
-├── 03-dev-notes.md      구현 노트 + 6질문 분석 + 자체 리뷰 점수
-├── 04-qa-report.md      테스트 맵 + 수용 기준 검증 + 버그 리포트
-├── 05-browser-qa.md     건강 점수 + 스크린샷 + 유저 플로우
-├── 06-review.md         4전문가 발견사항 + 자동 수정 내역
+├── 03-dev-notes.md      구현 + 6질문 분석 + 자체 리뷰
+├── 04-qa-report.md      테스트 맵 + 수용 기준 검증
+├── 05-browser-qa.md     건강 점수 + 스크린샷
+├── 06-review.md         4전문가 발견 + 자동 수정
 └── 07-ship.md           PR URL + 릴리즈 노트
 ```
@@ -200,11 +198,11 @@ Feature 완료 → Ship 제안 → Canary 제안. Canary에서 문제 발견 →
 | 명령어 | 설명 |
 |--------|------|
-| `npx buildcrew` | 에이전트 설치 (11개 + 오케스트레이터) |
-| `npx buildcrew init` | 하네스 자동 생성 (질문 없이 코드베이스 분석) |
-| `npx buildcrew init --force` | 하네스 재생성 |
+| `npx buildcrew` | 전체 인터랙티브 셋업 (에이전트 + MCP + 하네스) |
+| `npx buildcrew init` | 하네스만 생성 |
+| `npx buildcrew init --force` | 하네스 재생성 (기존 백업) |
 | `npx buildcrew add` | 사용 가능한 템플릿 목록 |
-| `npx buildcrew add <name>` | 템플릿 추가 (erd, architecture 등) |
+| `npx buildcrew add <name>` | 템플릿 추가 |
 | `npx buildcrew harness` | 하네스 파일 상태 확인 |
 | `npx buildcrew --force` | 에이전트 덮어쓰기 |
 | `npx buildcrew --list` | 에이전트 목록 + 모델 정보 |
@@ -214,60 +212,33 @@ Feature 완료 → Ship 제안 → Canary 제안. Canary에서 문제 발견 →
 ## 요구사항
 - **필수**: [Claude Code](https://claude.ai/code) CLI
-- **선택**: [Playwright MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/playwright) — browser-qa, canary-monitor, designer가 사용
+- **필수**: [Playwright MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/playwright) — 셋업 중 자동 설치
 - **선택**: [Figma MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/figma) — designer가 사용
-```bash
-# 실제 브라우저 테스트 활성화
-claude mcp add playwright -- npx @anthropic-ai/mcp-server-playwright
-```
-## 커스터마이징
-```
-.claude/agents/      에이전트 정의 — 역할, 도구, 모델 수정 가능
-.claude/harness/     프로젝트 컨텍스트 — 언제든 수정, .md 파일 자유 추가
-.claude/pipeline/    결과물 — 기능별 자동 생성
-```
-## 실시간 진행 상태
-모든 에이전트가 이모지 태그된 진행 로그를 출력합니다:
-```
-📋 PLANNER — "유저 대시보드" 요구사항 분석 시작
-🧠 6가지 강제 질문 분석...
-🔎 4관점 자체 리뷰...
-   🏢 CEO: 8/10  ⚙️ 엔지니어링: 9/10  🎨 디자인: 9/10  🧪 QA: 8/10
-✅ PLANNER — 완료 (평균: 8.5/10)
-💻 DEVELOPER — 구현 시작
-🔍 6가지 구현 질문 분석...
-🏗️ 구현 중...
-🔎 3관점 자체 리뷰 — 아키텍처: 8/10, 품질: 9/10, 안전성: 7/10
-✅ DEVELOPER — 완료 (12개 파일 변경, 평균: 8.0/10)
-🧪 QA TESTER — 테스트 맵 구축 → 검증 중
-   ✅ AC-1: 통과  ❌ AC-2: 실패 (신뢰도: 9/10)
-🔬 REVIEWER — 4전문가 + 적대적 리뷰 — 승인 (2건 자동 수정)
-```
+- **선택**: [Codex CLI](https://github.com/openai/codex) — 크로스 모델 세컨드 오피니언
 ## 아키텍처
 ```
-@buildcrew (오케스트레이터, opus)
+@buildcrew (오케스트레이터, opus, 199줄)
     │
     ├─ .claude/harness/*.md 읽기
-    ├─ 유저 메시지에서 모드 자동 감지
+    ├─ 유저 메시지에서 모드 자동 감지 (13개 모드, 우선순위 테이블)
     ├─ 하네스 컨텍스트와 함께 에이전트 디스패치
-    └─ 품질 게이트 적용 + 전체 파이프라인 반복 관리
+    ├─ 품질 게이트 + 반복 관리
+    └─ 완료 후 세컨드 오피니언 제안
          │
-         ├── 빌드:    planner → designer → developer
-         ├── 품질:    qa-tester → browser-qa → reviewer
+         ├── 생각:     thinker → architect
+         ├── 빌드:     planner → designer → developer
+         ├── 품질:     qa-tester → browser-qa → reviewer
          ├── 보안/운영: security-auditor, canary-monitor, shipper
+         ├── 리뷰:     architect, design-reviewer, qa-auditor
          └── 디버그:   investigator
 ```
+### 버전 자동 업데이트
+에이전트에 버전 헤더가 포함되어 있습니다. 기존 프로젝트에서 `npx buildcrew`를 다시 실행하면 구버전 에이전트가 자동으로 업데이트됩니다.
 ## 라이선스
 MIT
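
Note on the `.claude/harness/` tree in this README: the condition listed beside each file can be read as a small decision function. The sketch below is illustrative only. The `Detection` shape and the `harnessFiles` name are hypothetical and do not come from the buildcrew source, and the package's real detection logic is not part of this diff.

```typescript
// Hypothetical summary of what a codebase scan found.
// Field names are assumptions made for this sketch.
interface Detection {
  hasDatabase: boolean;
  hasApiRoutes: boolean;
  hasTailwind: boolean;
  hasI18n: boolean;
  componentCount: number;
}

// Returns the harness file names in the order the README tree lists them.
function harnessFiles(d: Detection): string[] {
  const files = ["project.md", "rules.md"]; // always generated
  if (d.hasDatabase) files.push("erd.md");
  if (d.hasApiRoutes) files.push("api-spec.md");
  if (d.hasTailwind) files.push("design-system.md");
  files.push("architecture.md"); // always generated
  // README: generated "if i18n or 5+ components"
  if (d.hasI18n || d.componentCount >= 5) files.push("user-flow.md");
  return files;
}
```

A project with no database, API routes, TailwindCSS, or i18n and fewer than five components would, under this reading, get only the three "always" files.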

package/README.md CHANGED

@@ -18,6 +18,7 @@ AI coding agents are powerful, but without structure they produce inconsistent r
 - **A process** — sequential pipeline with quality gates and iteration
 - **A harness** — your project context auto-detected from your codebase
 - **An orchestrator** — just talk to `@buildcrew`, it routes automatically
+- **A second opinion** — independent review after every mode (Codex or Claude subagent)
 ```
 You:   @buildcrew Add user authentication
@@ -30,85 +31,24 @@ No external dependencies. No runtime. No binaries. Just Markdown.
 ## Getting Started
-```bash
-# 1. Install agents
-npx buildcrew
-# 2. Auto-generate project harness (zero questions asked)
-npx buildcrew init
-# 3. Customize (replace <!-- comments --> in generated files)
-code .claude/harness/
-# 4. Start working
-@buildcrew Add user dashboard
-```
----
-## Harness Engineering
-`npx buildcrew init` scans your codebase and generates a project harness — **zero questions asked**.
-### What it auto-detects
-| Category | Detected from |
-|----------|--------------|
-| Framework | package.json (Next.js, Nuxt, React, Vue, SvelteKit, Express) |
-| Language | TypeScript, TailwindCSS, Framer Motion |
-| Database | Supabase, Prisma, Drizzle, MongoDB |
-| Auth | NextAuth, Supabase Auth, Firebase Auth |
-| Payments | Stripe, Paddle, Toss Payments |
-| AI | OpenAI, Anthropic, Google AI |
-| Deploy | Vercel, Netlify, Fly.io, Docker |
-| Components | Scans `src/components/` |
-| API Routes | Scans `src/app/api/` |
-| Locales | Scans i18n directories |
-### What it generates
-Based on detection, relevant harness files are auto-created:
-```
-.claude/harness/
-├── project.md        ← always (project context, stack, components, API routes)
-├── rules.md          ← always (smart defaults for your framework)
-├── erd.md            ← if database detected
-├── api-spec.md       ← if API routes found
-├── design-system.md  ← if TailwindCSS detected
-├── architecture.md   ← always
-└── user-flow.md      ← if i18n or 5+ components
-```
-### Customize
-Generated files use `<!-- HTML comments -->` for parts you need to fill in. Everything else is pre-filled from your codebase.
+One command does everything:
 ```bash
-npx buildcrew harness     # Check which files need editing
+npx buildcrew
 ```
-### The harness is open
+The interactive setup will:
+1. Install 15 agents + orchestrator
+2. Ask to install Playwright MCP (required for browser testing)
+3. Ask to generate project harness (auto-detects your stack)
+4. Let you pick additional harness templates
-Add any `.md` file to `.claude/harness/` — agents read them all:
+Then start working:
 ```bash
-npx buildcrew add glossary    # Add from template
-npx buildcrew add env-vars    # Add from template
-echo "# Notes" > .claude/harness/my-notes.md  # Or create your own
+@buildcrew Add user dashboard
 ```
-### Agent routing
-| File | Routed to |
-|------|-----------|
-| `project.md`, `rules.md` | All agents |
-| `erd.md`, `architecture.md`, `api-spec.md` | developer, reviewer, security-auditor, investigator |
-| `design-system.md` | designer |
-| `glossary.md`, `user-flow.md` | planner, designer, browser-qa |
-| `env-vars.md` | developer, security-auditor |
-| Custom `.md` files | reviewer, security-auditor (read ALL) |
 ---
 ## Agents
@@ -118,35 +58,44 @@ echo "# Notes" > .claude/harness/my-notes.md  # Or create your own
 | Agent | Model | Role |
 |-------|-------|------|
 | **planner** | opus | 6 Forcing Questions + 4-Lens Self-Review (CEO, Engineering, Design, QA). Plans scored 1-10 per lens. |
-| **designer** | opus | UI/UX research + motion engineering → Playwright screenshots → Figma MCP → production components with animations, scroll effects, gestures. AI slop blacklist. |
-| **developer** | sonnet | 6 Implementation Questions + 3-Lens Self-Review (Architecture, Code Quality, Safety). Error Handling Protocol. 3 modes: feature, bugfix, iteration. |
+| **designer** | opus | UI/UX research + motion engineering. Playwright screenshots, Figma MCP, production components with animations. AI slop blacklist. |
+| **developer** | opus | 6 Implementation Questions + 3-Lens Self-Review (Architecture, Code Quality, Safety). Error Handling Protocol. 3 modes: feature, bugfix, iteration. |
 ### Quality Team
 | Agent | Model | Role |
 |-------|-------|------|
-| **qa-tester** | sonnet | 5 Test Strategy Questions + Test Map methodology. Systematic edge case generation. Confidence-scored findings with severity classification. |
-| **browser-qa** | sonnet | Real browser testing via Playwright MCP — flows, responsive, console, health score (0-100). |
-| **reviewer** | opus | 4-specialist deep analysis (security, perf, testing, maintainability) with confidence scoring + scope drift detection + adversarial pass + fix-first approach. |
-| **health-checker** | sonnet | Code quality dashboard — 7-category weighted 0-10 score + trends. |
+| **qa-tester** | sonnet | 5 Test Strategy Questions + Test Map methodology. Edge case generation, confidence-scored findings. |
+| **browser-qa** | sonnet | 4-phase browser testing (orient, explore, stress, judge) via Playwright MCP. Health score 0-100, self-review. |
+| **reviewer** | opus | 4-specialist analysis (security, perf, testing, maintainability) + confidence scoring + adversarial pass + auto-fix. Runs AFTER code. |
+| **health-checker** | sonnet | 3-phase code quality (detect, measure, prescribe). Weighted 0-10 score + trends + top 5 actionable items. |
 ### Security & Ops
 | Agent | Model | Role |
 |-------|-------|------|
-| **security-auditor** | opus | OWASP Top 10 + STRIDE threat model. 10-phase audit with confidence gate. |
-| **canary-monitor** | sonnet | Post-deploy health — pages, APIs, console, performance vs baseline. |
-| **shipper** | sonnet | 8-point pre-flight + semver decision framework + changelog methodology + PR template + post-ship verification. |
+| **security-auditor** | sonnet | OWASP Top 10 + STRIDE threat model. 10-phase audit with confidence gate. |
+| **canary-monitor** | sonnet | 3-phase post-deploy health (orient, verify, judge). Baseline comparison, confidence-scored findings. |
+| **shipper** | sonnet | 8-point pre-flight + semver + changelog + PR + post-ship verification. |
+### Thinking & Review Team
+| Agent | Model | Role |
+|-------|-------|------|
+| **thinker** | opus | "Should we build this?" — 6 forcing questions, premise challenge, 3 alternatives, cross-model outside perspective, design doc output. |
+| **architect** | opus | Architecture review BEFORE code — scope challenge, component diagrams, data flow, failure modes, test coverage map. |
+| **design-reviewer** | sonnet | UI/UX quality — 8 dimensions scored 0-10, screenshot evidence via Playwright, specific fixes with effort estimates, WCAG compliance. |
 ### Specialist
 | Agent | Model | Role |
 |-------|-------|------|
-| **investigator** | sonnet | 5 Evidence Sources + hypothesis scoring + 12 common bug patterns + regression prevention. Edit freeze on unrelated code. |
+| **investigator** | sonnet | 4-phase root cause debugging. 12 common bug patterns. Edit freeze on unrelated code. |
+| **qa-auditor** | opus | 3 parallel subagents (security, bugs, spec compliance) audit git diffs against design docs. No API key needed. |
 ---
-## 9 Operating Modes
+## 13 Operating Modes
 Talk to `@buildcrew` naturally. It auto-detects the mode.
@@ -154,27 +103,80 @@ Talk to `@buildcrew` naturally. It auto-detects the mode.
 |------|---------|----------|
 | **Feature** | "Add user dashboard" | Plan → Design → Dev → QA → Browser QA → Review |
 | **Project Audit** | "full project audit" | Scan → Prioritize → Fix → Verify (loop) |
-| **Browser QA** | "browser qa localhost:3000" | Playwright tests + health score |
+| **Browser QA** | "browser qa localhost:3000" | Playwright testing + health score |
 | **Security** | "security audit" | OWASP + STRIDE + secrets + deps |
 | **Debug** | "debug: login broken" | 4-phase root cause investigation |
-| **Health** | "health check" | Quality dashboard (types, lint, deps, i18n) |
+| **Health** | "health check" | Quality dashboard (types, lint, deps, bundle) |
 | **Canary** | "canary https://myapp.com" | Post-deploy production monitoring |
 | **Review** | "code review" | Multi-specialist analysis + auto-fix |
 | **Ship** | "ship" | Test → version → changelog → PR |
+| **QA Audit** | "qa" | 3 parallel subagent audit on git diff |
+| **Think** | "is this worth building?" | 6 forcing questions + alternatives + design doc |
+| **Arch Review** | "architecture review" | Scope challenge + diagrams + failure modes |
+| **Design Review** | "design review" | 8-dimension scoring + specific fixes |
+### Mode Priority
+When a message matches multiple modes, a priority table resolves conflicts. Debug always wins. Think beats Feature. "architecture review" goes to Architect, not Reviewer. If truly ambiguous, asks the user.
+### Second Opinion
+After any mode completes, buildcrew offers an independent second opinion:
+- **Codex CLI available**: genuinely different AI model reviews the work
+- **No Codex**: fresh Claude subagent with no session memory
+The user decides what to act on.
 ### Iterations
-Each iteration runs the **full end-to-end pipeline** — planner re-evaluates, designer refines, developer fixes, QA re-verifies:
+Each iteration runs the **full end-to-end pipeline**:
 ```
 @buildcrew Add user dashboard, 5 iterations
 ```
-### Mode chaining
+---
+## Harness Engineering
+`npx buildcrew` auto-detects your stack and generates a project harness.
+### What it auto-detects
+| Category | Detected from |
+|----------|--------------|
+| Framework | package.json (Next.js, Nuxt, React, Vue, SvelteKit, Express) |
+| Language | TypeScript, TailwindCSS, Framer Motion |
+| Database | Supabase, Prisma, Drizzle, MongoDB |
+| Auth | NextAuth, Supabase Auth, Firebase Auth |
+| Payments | Stripe, Paddle, Toss Payments |
+| AI | OpenAI, Anthropic, Google AI |
+| Deploy | Vercel, Netlify, Fly.io, Docker |
+| Components | Scans `src/components/` |
+| API Routes | Scans `src/app/api/` |
+| Locales | Scans i18n directories |
+### Generated files
+```
+.claude/harness/
+├── project.md        ← always (project context, stack, components, API routes)
+├── rules.md          ← always (smart defaults for your framework)
+├── erd.md            ← if database detected
+├── api-spec.md       ← if API routes found
+├── design-system.md  ← if TailwindCSS detected
+├── architecture.md   ← always
+└── user-flow.md      ← if i18n or 5+ components
+```
+### The harness is open
-Auto-suggests the next mode:
-- Feature complete → Ship → Canary
-- Canary CRITICAL → Debug
+Add any `.md` file to `.claude/harness/` — agents read them all.
+```bash
+npx buildcrew harness     # Check which files need editing
+npx buildcrew add         # List available templates
+```
 ---
@@ -185,12 +187,11 @@ Each feature generates a full document chain:
 ```
 .claude/pipeline/{feature}/
 ├── 01-plan.md           Requirements + 4-lens review scores
-├── 02-references.md     Curated UI/UX references from real sites
 ├── 02-design.md         Design decisions + component specs
-├── 03-dev-notes.md      Implementation notes + files changed
-├── 04-qa-report.md      Acceptance criteria verification
+├── 03-dev-notes.md      Implementation + 6-question analysis + self-review
+├── 04-qa-report.md      Test map + acceptance criteria verification
 ├── 05-browser-qa.md     Health score + screenshots + flows
-├── 06-review.md         Review findings + auto-fixes applied
+├── 06-review.md         4-specialist findings + auto-fixes
 └── 07-ship.md           PR URL + release notes
 ```
@@ -200,11 +201,11 @@ Each feature generates a full document chain:
 | Command | Description |
 |---------|-------------|
-| `npx buildcrew` | Install agents (11 + orchestrator) |
-| `npx buildcrew init` | Auto-generate harness (zero questions) |
-| `npx buildcrew init --force` | Regenerate harness |
+| `npx buildcrew` | Full interactive setup (agents + MCP + harness) |
+| `npx buildcrew init` | Generate harness only |
+| `npx buildcrew init --force` | Regenerate harness (backs up existing) |
 | `npx buildcrew add` | List harness templates |
-| `npx buildcrew add <name>` | Add a template (erd, architecture, etc.) |
+| `npx buildcrew add <name>` | Add a template |
 | `npx buildcrew harness` | Show harness file status |
 | `npx buildcrew --force` | Overwrite existing agents |
 | `npx buildcrew --list` | List agents with models |
@@ -214,60 +215,33 @@ Each feature generates a full document chain:
 ## Requirements
 - **Required**: [Claude Code](https://claude.ai/code) CLI
-- **Optional**: [Playwright MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/playwright) — for browser-qa, canary-monitor, designer
+- **Required**: [Playwright MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/playwright) — installed automatically during setup
 - **Optional**: [Figma MCP](https://github.com/anthropics/anthropic-quickstarts/tree/main/mcp-servers/figma) — for designer
-```bash
-# Enable real browser testing
-claude mcp add playwright -- npx @anthropic-ai/mcp-server-playwright
-```
-## Customization
-```
-.claude/agents/      Agent definitions — edit roles, tools, model
-.claude/harness/     Project context — edit anytime, add any .md
-.claude/pipeline/    Output documents — auto-generated per feature
-```
-## Real-time Status
-Every agent outputs emoji-tagged progress logs so you can track what's happening:
-```
-📋 PLANNER — Starting requirements analysis for "user dashboard"
-🧠 Phase 1: Asking 6 Forcing Questions...
-🔎 Phase 3: 4-Lens Self-Review...
-   🏢 CEO: 8/10  ⚙️ Engineering: 9/10  🎨 Design: 9/10  🧪 QA: 8/10
-✅ PLANNER — Complete (avg score: 8.5/10)
-🎨 DESIGNER — Starting UI/UX design...
-💻 DEVELOPER — Phase 1: 6 Implementation Questions...
-   🔍 Phase 2: Implementation...
-   🔎 Phase 3: 3-Lens Self-Review — Architecture: 8/10, Quality: 9/10, Safety: 7/10
-✅ DEVELOPER — Complete (12 files changed, avg: 8.0/10)
-🧪 QA TESTER — 5 Test Strategy Questions → Test Map built
-   ✅ AC-1: PASS  ❌ AC-2: FAIL (confidence: 9/10)
-🔬 REVIEWER — 4 specialists + adversarial — APPROVE (2 auto-fixed)
-```
+- **Optional**: [Codex CLI](https://github.com/openai/codex) — for cross-model second opinion
 ## Architecture
 ```
-@buildcrew (orchestrator, opus)
+@buildcrew (orchestrator, opus, 199 lines)
     │
     ├─ reads .claude/harness/*.md
-    ├─ detects mode from user message
+    ├─ detects mode from user message (13 modes, priority table)
     ├─ dispatches agents with harness context
-    └─ enforces quality gates + full end-to-end iteration
+    ├─ enforces quality gates + iteration
+    └─ offers second opinion after completion
          │
-         ├── Build:    planner → designer → developer
-         ├── Quality:  qa-tester → browser-qa → reviewer
-         ├── Sec/Ops:  security-auditor, canary-monitor, shipper
-         └── Debug:    investigator
+         ├── Think:     thinker → architect
+         ├── Build:     planner → designer → developer
+         ├── Quality:   qa-tester → browser-qa → reviewer
+         ├── Sec/Ops:   security-auditor, canary-monitor, shipper
+         ├── Review:    architect, design-reviewer, qa-auditor
+         └── Debug:     investigator
 ```
+### Version Auto-Update
+Agents include version headers. When you run `npx buildcrew` on an existing project, outdated agents are automatically updated — no `--force` needed.
 ## License
 MIT
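
Note on the "Mode Priority" section added in this README: the stated rules (Debug always wins, Think beats Feature, "architecture review" goes to Architect rather than Reviewer) amount to picking the first match from an ordered list. The sketch below is a rough illustration under assumptions. The orchestrator is described as Markdown, not code, so nothing here is its implementation; the keyword lists, the mode identifiers, and the relative position of `arch-review` in the ordering are all invented for the example.

```typescript
// Hypothetical mode identifiers; only a subset of the 13 modes is modelled.
type Mode = "debug" | "think" | "arch-review" | "review" | "feature";

// Earlier entries win when several modes match the same message.
const PRIORITY: Mode[] = ["debug", "think", "arch-review", "review", "feature"];

// Invented trigger phrases, for illustration only.
const KEYWORDS: Record<Mode, string[]> = {
  debug: ["debug", "broken"],
  think: ["worth building"],
  "arch-review": ["architecture review"],
  review: ["code review", "review"],
  feature: ["add", "build"],
};

// All modes whose keywords appear in the message, in priority order.
function matchModes(message: string): Mode[] {
  const text = message.toLowerCase();
  return PRIORITY.filter((mode) => KEYWORDS[mode].some((k) => text.includes(k)));
}

// Highest-priority match, or null when nothing matches. The README says the
// orchestrator asks the user in ambiguous cases; null stands in for that here.
function resolveMode(message: string): Mode | null {
  const matches = matchModes(message);
  return matches.length > 0 ? matches[0] : null;
}
```

With these assumed keywords, a message such as "architecture review" matches both `arch-review` and `review`, and the ordering sends it to the former.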

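Note on "Version Auto-Update" in the README above and the version bump in the file below: the README says agents carry version headers and that outdated agents are updated on the next `npx buildcrew` run. The header format is not shown in this diff, so the sketch below only illustrates the comparison step, assuming plain dotted numeric versions such as `1.8.4`. `parseVersion` and `isOutdated` are hypothetical names.

```typescript
// Splits a dotted version string like "1.8.4" into numeric parts.
// Assumes purely numeric segments (no pre-release tags).
function parseVersion(v: string): number[] {
  return v.split(".").map((part) => parseInt(part, 10));
}

// True when the installed version is strictly older than the packaged one.
function isOutdated(installed: string, packaged: string): boolean {
  const a = parseVersion(installed);
  const b = parseVersion(packaged);
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = i < a.length ? a[i] : 0; // missing segments count as 0
    const y = i < b.length ? b[i] : 0;
    if (x !== y) return x < y;
  }
  return false; // identical versions are not outdated
}
```

Comparing segment by segment as numbers, rather than comparing the strings directly, keeps `1.10.0` from being treated as older than `1.9.9`.
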
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "buildcrew",
-  "version": "1.8.4",
+  "version": "1.8.5",
   "description": "15 AI agents for Claude Code — full development lifecycle from product thinking to production monitoring",
   "homepage": "https://buildcrew-landing.vercel.app",
   "author": "z1nun",