npm - cokit-cli - Versions diffs - 1.2.3 → 1.2.6 - Mend

cokit-cli 1.2.3 → 1.2.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (79)

package/README.md +6 -7
package/agents/brainstormer.agent.md +9 -2
package/agents/code-reviewer.agent.md +59 -84
package/agents/code-simplifier.agent.md +9 -6
package/agents/debugger.agent.md +17 -8
package/agents/docs-manager.agent.md +104 -8
package/agents/fullstack-developer.agent.md +57 -13
package/agents/git-manager.agent.md +2 -382
package/agents/planner.agent.md +36 -8
package/agents/researcher.agent.md +18 -3
package/agents/tester.agent.md +13 -14
package/agents/ui-ux-designer.agent.md +209 -33
package/docs/README.md +4 -3
package/docs/claudekit-porting-rules.md +182 -0
package/docs/codebase-summary.md +11 -10
package/docs/cokit-comprehensive-mapping-guide.md +4 -4
package/docs/cokit-slides.md +1 -1
package/docs/cokit-sync-and-maintenance-guide.md +8 -3
package/docs/cokit-team-presentation.md +5 -5
package/docs/guide-next-steps-speckit-cokit-implementation.md +1 -1
package/docs/project-overview-pdr.md +1 -1
package/docs/project-roadmap.md +6 -7
package/package.json +1 -1
package/prompts/ck-ask.prompt.md +1 -1
package/prompts/ck-bootstrap.prompt.md +1 -1
package/prompts/ck-cook.prompt.md +12 -12
package/prompts/ck-plan-fast.prompt.md +1 -0
package/prompts/ck-plan-hard.prompt.md +2 -1
package/prompts/ck-plan-red-team.prompt.md +227 -0
package/prompts/ck-plan.prompt.md +1 -0
package/prompts/ck-simplify.prompt.md +1 -1
package/skills/code-review/SKILL.md +78 -28
package/skills/cook/SKILL.md +45 -11
package/skills/debug/SKILL.md +112 -17
package/skills/fix/SKILL.md +20 -8
package/skills/frontend-design/SKILL.md +6 -3
package/skills/planning/SKILL.md +47 -15
package/skills/research/SKILL.md +1 -1
package/skills/scout/SKILL.md +24 -11
package/skills/web-testing/SKILL.md +60 -6
package/skills/web-testing/references/report-format.md +57 -0
package/skills/web-testing/references/test-execution-workflow.md +118 -0
package/skills/web-testing/references/ui-testing-workflow.md +97 -0
package/templates/repo/.github/agents/brainstormer.agent.md +9 -2
package/templates/repo/.github/agents/code-reviewer.agent.md +59 -84
package/templates/repo/.github/agents/code-simplifier.agent.md +9 -6
package/templates/repo/.github/agents/debugger.agent.md +17 -8
package/templates/repo/.github/agents/docs-manager.agent.md +104 -8
package/templates/repo/.github/agents/fullstack-developer.agent.md +57 -13
package/templates/repo/.github/agents/git-manager.agent.md +2 -382
package/templates/repo/.github/agents/planner.agent.md +36 -8
package/templates/repo/.github/agents/researcher.agent.md +18 -3
package/templates/repo/.github/agents/tester.agent.md +13 -14
package/templates/repo/.github/agents/ui-ux-designer.agent.md +209 -33
package/templates/repo/.github/prompts/ck-ask.prompt.md +1 -1
package/templates/repo/.github/prompts/ck-bootstrap.prompt.md +1 -1
package/templates/repo/.github/prompts/ck-cook.prompt.md +12 -12
package/templates/repo/.github/prompts/ck-plan-fast.prompt.md +1 -0
package/templates/repo/.github/prompts/ck-plan-hard.prompt.md +2 -1
package/templates/repo/.github/prompts/ck-plan-red-team.prompt.md +227 -0
package/templates/repo/.github/prompts/ck-plan.prompt.md +1 -0
package/templates/repo/.github/prompts/ck-simplify.prompt.md +1 -1
package/templates/repo/.github/prompts/ck-spec-specify.prompt.md +1 -0
package/templates/repo/.github/skills/code-review/SKILL.md +78 -28
package/templates/repo/.github/skills/cook/SKILL.md +45 -11
package/templates/repo/.github/skills/debug/SKILL.md +112 -17
package/templates/repo/.github/skills/fix/SKILL.md +20 -8
package/templates/repo/.github/skills/frontend-design/SKILL.md +6 -3
package/templates/repo/.github/skills/planning/SKILL.md +47 -15
package/templates/repo/.github/skills/research/SKILL.md +1 -1
package/templates/repo/.github/skills/scout/SKILL.md +24 -11
package/templates/repo/.github/skills/web-testing/SKILL.md +60 -6
package/templates/repo/.github/skills/web-testing/references/report-format.md +57 -0
package/templates/repo/.github/skills/web-testing/references/test-execution-workflow.md +118 -0
package/templates/repo/.github/skills/web-testing/references/ui-testing-workflow.md +97 -0
package/prompts/ck-journal.prompt.md +0 -19
package/prompts/ck-preview.prompt.md +0 -77
package/templates/repo/.github/prompts/ck-journal.prompt.md +0 -19
package/templates/repo/.github/prompts/ck-preview.prompt.md +0 -77

package/skills/debug/SKILL.md CHANGED

@@ -1,12 +1,11 @@
 ---
 name: debug
-description: Debug systematically with root cause analysis before fixes. Use for bugs, test failures, unexpected behavior, performance issues, call stack tracing, multi-layer validation.
-languages: all
+description: Debug systematically with root cause analysis before fixes. Covers bugs, test failures, log analysis, CI/CD failures, database diagnostics, system investigation, performance issues, call stack tracing, multi-layer validation.
 ---
-# Debugging
+# Debugging & System Investigation
-Comprehensive debugging framework combining systematic investigation, root cause tracing, defense-in-depth validation, and verification protocols.
+Comprehensive debugging framework combining systematic investigation, root cause tracing, defense-in-depth validation, verification protocols, and system-level diagnostics.
 ## Core Principle
@@ -16,29 +15,29 @@ Random fixes waste time and create new bugs. Find the root cause, fix at source,
 ## When to Use
-**Always use for:** Test failures, bugs, unexpected behavior, performance issues, build failures, integration problems, before claiming work complete
+**Code-level:** Test failures, bugs, unexpected behavior, build failures, integration problems, before claiming work complete
+**System-level:** CI/CD pipeline failures, log analysis, database diagnostics, performance bottlenecks, infrastructure issues
 **Especially when:** Under time pressure, "quick fix" seems obvious, tried multiple fixes, don't fully understand issue, about to claim success
-## The Four Techniques
+## Techniques
 ### 1. Systematic Debugging (`references/systematic-debugging.md`)
-Four-phase framework ensuring proper investigation:
+Four-phase framework:
 - Phase 1: Root Cause Investigation (read errors, reproduce, check changes, gather evidence)
 - Phase 2: Pattern Analysis (find working examples, compare, identify differences)
 - Phase 3: Hypothesis and Testing (form theory, test minimally, verify)
 - Phase 4: Implementation (create test, fix once, verify)
-**Key rule:** Complete each phase before proceeding. No fixes without Phase 1.
+Complete each phase before proceeding. No fixes without Phase 1.
 **Load when:** Any bug/issue requiring investigation and fix
 ### 2. Root Cause Tracing (`references/root-cause-tracing.md`)
-Trace bugs backward through call stack to find original trigger.
-**Technique:** When error appears deep in execution, trace backward level-by-level until finding source where invalid data originated. Fix at source, not at symptom.
+Trace bugs backward through call stack to find original trigger. Fix at source, not at symptom.
 **Includes:** `scripts/find-polluter.sh` for bisecting test pollution
@@ -46,9 +45,7 @@ Trace bugs backward through call stack to find original trigger.
 ### 3. Defense-in-Depth (`references/defense-in-depth.md`)
-Validate at every layer data passes through. Make bugs impossible.
-**Four layers:** Entry validation → Business logic → Environment guards → Debug instrumentation
+Validate at every layer data passes through. Four layers: Entry validation → Business logic → Environment guards → Debug instrumentation
 **Load when:** After finding root cause, need to add comprehensive validation
@@ -58,19 +55,117 @@ Run verification commands and confirm output before claiming success.
 **Iron law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
-Run the command. Read the output. Then claim the result.
 **Load when:** About to claim work complete, fixed, or passing
+### 5. Investigation Methodology
+For system-level issues (CI/CD, infrastructure, data pipeline):
+1. **Scope** - Define what is broken and what is working
+2. **Gather** - Collect logs, metrics, error outputs before touching anything
+3. **Isolate** - Narrow to smallest reproducible case
+4. **Hypothesize** - Form one theory, test it, reject or confirm
+5. **Fix & Validate** - Fix at root, verify at every affected layer
+**Load when:** Issue is not code-local — spans services, environments, or pipelines
+### 6. Log & CI/CD Analysis
+Use `gh` CLI and structured queries to diagnose pipeline failures:
+```bash
+# View failed CI run logs
+gh run view <run-id> --log-failed
+# List recent runs for a workflow
+gh run list --workflow=<name> --limit 10
+# Watch a running workflow
+gh run watch <run-id>
+```
+For structured logs: filter by severity, timestamp range, and correlation ID before reading raw output.
+**Load when:** CI/CD failure, deployment issue, or log-driven investigation
+### 7. Performance Diagnostics
+Identify bottlenecks before optimizing:
+- Profile first — measure before guessing
+- Check slow queries with `EXPLAIN ANALYZE` (PostgreSQL) or equivalent
+- Identify N+1 query patterns in ORM usage
+- Check memory allocation patterns for leaks
+- Use `psql` for live database diagnostics
+**Load when:** Slowness reported, timeout errors, resource exhaustion
+### 8. Reporting Standards
+For multi-component investigations, write a structured diagnostic report:
+```
+## Diagnostic Report
+- **Issue:** [one-line description]
+- **Root Cause:** [where and why it fails]
+- **Evidence:** [logs, output, reproduction steps]
+- **Fix Applied:** [what was changed]
+- **Verification:** [command run + result]
+- **Remaining Risk:** [any open questions]
+```
+Save to `plans/reports/debugger-{date}-{slug}.md`.
+**Load when:** Investigation spans multiple components or will be shared with others
+### 9. Task Management
+For multi-component investigations, track progress with a checklist rather than holding state mentally:
+```
+- [ ] Reproduce the issue
+- [ ] Identify root cause
+- [ ] Fix applied
+- [ ] Tests passing
+- [ ] Verification complete
+```
+Add this checklist to the active plan or investigation report. Check items off as each step completes.
+**Load when:** Investigation touches 3+ components or files
+### 10. Frontend Verification
+For visual bugs or UI regressions, use browser developer tools (or the `agent-browser` skill) to inspect rendering, network, and console errors directly in the browser.
+Use `/ck-scout ext` to search for frontend-specific patterns before diving into devtools.
+**Load when:** Visual regression, layout bug, client-side network error, or UI behavior that differs from expected
 ## Quick Reference
 ```
-Bug → systematic-debugging.md (Phase 1-4)
+Code bug → systematic-debugging.md (Phase 1-4)
   Error deep in stack? → root-cause-tracing.md (trace backward)
   Found root cause? → defense-in-depth.md (add layers)
   About to claim success? → verification.md (verify first)
+System issue → Investigation Methodology (5 steps)
+  CI/CD failure? → Log & CI/CD Analysis (gh CLI)
+  Slow/timeout? → Performance Diagnostics
+  Multi-component? → Task Management checklist + Reporting Standards
+  Visual/UI bug? → Frontend Verification (agent-browser / browser devtools)
 ```
+## Tools Integration
+| Tool | Use Case |
+|------|----------|
+| `execute` | Run test commands, build scripts, verification steps |
+| `gh` CLI | CI/CD log analysis, PR checks, workflow runs |
+| `psql` | Live database diagnostics and slow query analysis |
+| `agent-browser` skill | Frontend visual verification and network inspection |
+| `/ck-scout` | Search codebase for related patterns before investigating |
 ## Red Flags
 Stop and follow process if thinking:
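
The "Log & CI/CD Analysis" section added above recommends filtering structured logs by severity, timestamp range, and correlation ID before reading raw output. A minimal sketch of that filter, assuming JSON-lines logs; the field names `timestamp`, `level`, and `correlation_id` and the function `filter_logs` are hypothetical and not part of the package:

```python
import json
from datetime import datetime, timezone

# Assumed severity ranking; adjust to the logger actually in use.
SEVERITY = {"debug": 10, "info": 20, "warning": 30, "error": 40, "critical": 50}


def filter_logs(lines, min_level="error", start=None, end=None, correlation_id=None):
    """Return parsed records that pass the severity, time-window and ID filters."""
    floor = SEVERITY[min_level]
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
            # Timestamps are assumed to be ISO 8601 with an explicit UTC offset.
            stamp = datetime.fromisoformat(record["timestamp"])
        except (ValueError, KeyError, TypeError):
            continue  # skip lines that are not well-formed log records
        if SEVERITY.get(str(record.get("level", "")).lower(), 0) < floor:
            continue
        if start is not None and stamp < start:
            continue
        if end is not None and stamp > end:
            continue
        if correlation_id is not None and record.get("correlation_id") != correlation_id:
            continue
        matches.append(record)
    return matches


if __name__ == "__main__":
    window_start = datetime(2024, 5, 1, 9, tzinfo=timezone.utc)
    demo = ['{"timestamp": "2024-05-01T10:00:00+00:00", "level": "ERROR", "message": "db timeout"}']
    for hit in filter_logs(demo, start=window_start):
        print(hit["message"])
```

Pass timezone-aware `start` and `end` values when the log timestamps carry an offset; Python refuses to compare aware and naive datetimes.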

package/skills/fix/SKILL.md CHANGED

@@ -12,6 +12,7 @@ Unified skill for fixing issues of any complexity with intelligent routing.
 - `--auto` - Activate autonomous mode (**default**)
 - `--review` - Activate human-in-the-loop review mode
 - `--quick` - Activate quick mode
+- `--parallel` - Fix 2+ independent issues concurrently using parallel agents
 ## Workflow
@@ -31,10 +32,10 @@ See `references/mode-selection.md` for question format.
 - Activate `debug` skill.
 - Guess all possible root causes.
-- Search in parallel to verify each hypothesis.
+- Spawn multiple parallel search agents to verify each hypothesis.
 - Create report with all findings for the next step.
-### Step 3: Complexity Assessment & Fix Implementation
+### Step 3: Complexity Assessment & Task Orchestration
 Classify before routing. See `references/complexity-assessment.md`.
@@ -45,17 +46,27 @@ Classify before routing. See `references/complexity-assessment.md`.
 | **Complex** | System-wide, architecture impact | `references/workflow-deep.md` |
 | **Parallel** | 2+ independent issues | Parallel `fullstack-developer` agents |
-### Step 4: Fix Verification & Prevent Future Issues
+**Task orchestration notes:**
+- **Quick workflow:** Skip task creation — proceed directly to fix.
+- **Moderate+ workflows:** After classifying, create a todo checklist for all phases upfront with dependencies before starting any implementation. Track each phase with checkboxes and note blockers inline.
+See `references/task-orchestration.md` for checklist structure and dependency tracking patterns.
+### Step 4: Fix Implementation & Verification
 - Read and analyze all the implemented changes.
-- Search in parallel to find possible related code for verification.
+- Spawn multiple parallel search agents to verify no regressions in related code.
 - Make sure these fixes don't break other parts of the codebase.
 - Prevent future issues by adding comprehensive validation.
 ### Step 5: Finalize
-- Report summary to user with confidence level/score, all the changes and related files.
-- Ask to commit via `git-manager` agent and update docs if needed via `docs-manager` agent (in parallel).
+**MANDATORY — always execute all steps:**
+1. Report summary to user with confidence score (0–10), all changes, and related files.
+2. Update `./docs` via ``docs-manager`` agent (NON-OPTIONAL — always run even for small fixes).
+3. Mark all checklist tasks complete.
+4. Ask user to commit via ``git-manager`` agent.
 ---
@@ -65,8 +76,8 @@ See `references/skill-activation-matrix.md` for complete matrix.
 **Always activate:** `debug` (all workflows)
 **Conditional:** `problem-solving`, `sequential-thinking`, `brainstorming`, `context-engineering`
-**Agents:** `debugger`, `researcher`, `planner`, `code-reviewer`, `tester`
-**Parallel:** Multiple parallel searches for scouting, terminal commands for verification
+**Agents:** ``debugger``, ``researcher``, ``planner``, ``code-reviewer``, ``tester``
+**Parallel patterns:** Multiple parallel search agents for scouting; parallel terminal commands for verification; parallel ``fullstack-developer`` agents for independent issues (`--parallel` flag)
 ## Output Format
@@ -85,6 +96,7 @@ Unified step markers:
 Load as needed:
 - `references/mode-selection.md` - Mode selection question format
 - `references/complexity-assessment.md` - Classification criteria
+- `references/task-orchestration.md` - Todo checklist structure and dependency tracking
 - `references/workflow-quick.md` - Quick: debug → fix → review
 - `references/workflow-standard.md` - Standard: full pipeline
 - `references/workflow-deep.md` - Deep: research + brainstorm + plan
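
Step 3 above asks for a todo checklist covering all phases, with dependencies, before implementation starts. The actual checklist structure lives in `references/task-orchestration.md`, which this diff does not show. The sketch below only illustrates one possible shape: `render_checklist` and the "blocked by" annotation are assumptions, and the ordering uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def render_checklist(phases):
    """Render a Markdown checklist with every phase listed after its dependencies.

    `phases` maps a phase name to the list of phase names it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    lines = []
    for name in TopologicalSorter(phases).static_order():
        blockers = phases.get(name, [])
        suffix = f" (blocked by: {', '.join(blockers)})" if blockers else ""
        lines.append(f"- [ ] {name}{suffix}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_checklist({
        "Verify": ["Fix"],
        "Fix": ["Investigate"],
        "Investigate": [],
    }))
```

Sorting topologically means a circular dependency fails loudly when the checklist is created, not partway through the fix.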

package/skills/frontend-design/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: frontend-design
-description: Create polished frontend interfaces from designs/screenshots. Use for web components, replicating UI designs, quick prototypes, avoiding AI slop.
+description: Create polished frontend interfaces from designs/screenshots/videos. Use for web components, replicating UI designs, quick prototypes, 3D experiences, immersive interfaces, avoiding AI slop.
 ---
 Create distinctive, production-grade frontend interfaces. Implement real working code with exceptional aesthetic attention.
@@ -12,15 +12,18 @@ Choose workflow based on input type:
 | Input | Workflow | Reference |
 |-------|----------|-----------|
 | Screenshot | Replicate exactly | `./references/workflow-screenshot.md` |
+| Video | Replicate with animations | `./references/workflow-video.md` |
 | Quick task | Rapid implementation | `./references/workflow-quick.md` |
 | Describe only | Document for devs | `./references/workflow-describe.md` |
+| 3D/WebGL request | Three.js immersive | `./references/workflow-3d.md` |
+| Complex/award-quality | Full immersive | `./references/workflow-immersive.md` |
 | From scratch | Design Thinking below | - |
 **All workflows**: Activate `ui-styling` skill FIRST for design patterns and component library.
 ## Screenshot/Video Replication (Quick Reference)
-1. **Analyze** visually - extract colors, fonts, spacing, effects
+1. **Analyze** visually (use `ai-multimodal` skill if available) - extract colors, fonts, spacing, effects
 2. **Plan** with `ui-ux-designer` agent - create phased implementation
 3. **Implement** - match source precisely
 4. **Verify** - compare to original
@@ -45,7 +48,7 @@ Before coding, commit to a BOLD aesthetic direction:
 - **Motion**: CSS-first, anime.js for complex (`./references/animejs.md`). Orchestrated page loads > scattered micro-interactions.
 - **Spatial**: Unexpected layouts. Asymmetry. Overlap. Negative space OR controlled density.
 - **Backgrounds**: Atmosphere over solid colors. Gradients, noise, patterns, shadows, grain.
-- **Assets**: Process with ImageMagick, FFmpeg, RMBG CLI tools
+- **Assets**: Process with ImageMagick, FFmpeg, RMBG CLI tools (or `ai-multimodal`/`media-processing` skills if available)
 ## Asset & Analysis References

package/skills/planning/SKILL.md CHANGED

@@ -7,6 +7,22 @@ description: Plan implementations, design architectures, create technical roadma
 Create detailed technical implementation plans through research, codebase analysis, solution design, and comprehensive documentation.
+## Workflow Modes
+Default: `--auto` — analyze the task and auto-pick the most appropriate mode.
+| Flag | Research | Red Team | Validation | Cook Flag |
+|------|----------|----------|------------|-----------|
+| `--auto` | Auto | Auto | Auto | auto |
+| `--fast` | Skip | Skip | Skip | fast |
+| `--hard` | Full | Yes | Yes | hard |
+| `--parallel` | Parallel | Yes | Yes | parallel |
+| `--two` | Full | Yes | Yes | two |
+Add `--no-tasks` to any mode to skip todo checklist hydration after the plan is written.
+See `references/workflow-modes.md` for detailed mode behavior.
 ## When to Use
 Use this skill when:
@@ -41,12 +57,15 @@ Load: `references/output-standards.md`
 ## Workflow Process
-1. **Initial Analysis** → Read codebase docs, understand context
-2. **Research Phase** → Spawn researchers, investigate approaches
-3. **Synthesis** → Analyze reports, identify optimal solution
-4. **Design Phase** → Create architecture, implementation design
-5. **Plan Documentation** → Write comprehensive plan
-6. **Review & Refine** → Ensure completeness, clarity, actionability
+1. **Pre-Creation Check** → Check `## Plan Context` from hook injection; follow Active Plan State rules below.
+2. **Mode Detection** → Use explicit flag if provided; otherwise auto-detect based on task complexity.
+3. **Research Phase** → Spawn parallel researcher agents to investigate approaches (skip in `--fast` mode).
+4. **Codebase Analysis** → Read docs in `./docs`; activate `/ck-scout` if file relationships are unclear.
+5. **Plan Documentation** → Write comprehensive plan via `planner` agent using the directory structure below.
+6. **Red Team Review** → Spawn adversarial reviewers to challenge assumptions (`--hard`, `--parallel`, `--two` modes only). See `references/workflow-modes.md`.
+7. **Post-Plan Validation** → Use `/ck-plan-validate` to verify completeness and coherence (`--hard`, `--parallel`, `--two` modes only).
+8. **Hydrate Tasks** → Create a todo checklist from plan phases with dependency annotations (default on; skip with `--no-tasks` or fewer than 3 phases).
+9. **Context Reminder** → Output the cook command with the absolute plan path (MANDATORY): `Use plan at: {absolute-plan-dir-path}`
 ## Output Requirements
@@ -57,13 +76,15 @@ Load: `references/output-standards.md`
 - Provide multiple options with trade-offs when appropriate
 - Fully respect the `./docs/development-rules.md` file.
-## Task Integration (Optional)
+## Task Management
-When session has `TASK_LIST_ID` set (active plan):
-- Create tasks for each phase with clear subjects
-- Set dependencies: Phase N+1 `blockedBy` Phase N
-- Agents coordinate via shared task list automatically
-- Update tasks to mark progress (in_progress → completed)
+Plan files are persistent on disk. Todo checklists are session-scoped. Hydration bridges the gap by converting plan phases into trackable checklist items at plan-creation time.
+- **Default:** Auto-hydrate after plan is written (create checklist with one item per phase).
+- **Skip with:** `--no-tasks` flag or when plan has fewer than 3 phases (3-Task Rule).
+- **Checklist format:** Include phase name, dependencies, and owning agent hint per item.
+See `references/task-management.md` for checklist schema and dependency notation.
 ### Important
 DO NOT create plans or reports in USER directory.
@@ -95,8 +116,8 @@ Prevents version proliferation by tracking current working plan via session stat
 ### Active vs Suggested Plans
 Check the `## Plan Context` section injected by hooks:
-- **"Plan: {path}"** = Active plan, explicitly set via `set-active-plan.cjs` - use for reports
-- **"Suggested: {path}"** = Branch-matched, hint only - do NOT auto-use
+- **"Plan: {path}"** = Active plan, explicitly set via `set-active-plan.cjs` — use this path for all reports
+- **"Suggested: {path}"** = Branch-matched hint only — do NOT auto-use
 - **"Plan: none"** = No active plan
 ### Rules
@@ -117,7 +138,7 @@ All agents writing reports MUST:
 DO NOT create plans or reports in USER directory.
 ALWAYS create plans or reports in CURRENT WORKING PROJECT DIRECTORY.
-**Important:** Suggested plans do NOT get plan-specific reports - this prevents pollution of old plan folders.
+**Important:** Suggested plans do NOT get plan-specific reports — this prevents pollution of old plan folders.
 ## Quality Standards
@@ -129,3 +150,14 @@ ALWAYS create plans or reports in CURRENT WORKING PROJECT DIRECTORY.
 - Validate against existing codebase patterns
 **Remember:** Plan quality determines implementation success. Be comprehensive and consider all solution aspects.
+## References
+Load as needed:
+- `references/workflow-modes.md` - Mode behavior details and flag descriptions
+- `references/task-management.md` - Checklist schema, dependency notation, hydration rules
+- `references/research-phase.md` - Research phase execution
+- `references/codebase-understanding.md` - Codebase analysis steps
+- `references/solution-design.md` - Solution design process
+- `references/plan-organization.md` - Plan file structure and organization
+- `references/output-standards.md` - Task breakdown and output format standards
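
The Workflow Modes table and the hydration rule added above (skip with `--no-tasks` or when the plan has fewer than 3 phases) can be read as a small lookup plus one predicate. The sketch below is illustrative only: the `Mode` class and both function names are hypothetical. `--auto` is left out of the table because the skill resolves it at run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    research: str
    red_team: str
    validation: str
    cook_flag: str


# Transcribed from the Workflow Modes table in the diff above.
MODES = {
    "--fast": Mode("Skip", "Skip", "Skip", "fast"),
    "--hard": Mode("Full", "Yes", "Yes", "hard"),
    "--parallel": Mode("Parallel", "Yes", "Yes", "parallel"),
    "--two": Mode("Full", "Yes", "Yes", "two"),
}


def should_hydrate(phase_count, flags):
    """Apply the 3-Task Rule: hydrate unless --no-tasks is set or the plan has under 3 phases."""
    return "--no-tasks" not in flags and phase_count >= 3


def context_reminder(absolute_plan_dir):
    """Format the reminder line that step 9 of the workflow marks as mandatory."""
    return f"Use plan at: {absolute_plan_dir}"


if __name__ == "__main__":
    print(MODES["--hard"], should_hydrate(4, ["--hard"]))
    print(context_reminder("/work/project/plans/example-plan"))
```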

package/skills/research/SKILL.md CHANGED

@@ -24,7 +24,7 @@ You will employ a multi-source research strategy:
 1. **Search Strategy**:
    - **Gemini Toggle**: Check `$HOME/.copilot/.ck.json` (or `~/.copilot/.ck.json`) for `skills.research.useGemini` (default: `true`). If `false`, skip Gemini and use WebSearch.
-   - **Gemini Model**: Read from `$HOME/.copilot/.ck.json`: `gemini.model` (default: `gemini-3.0-flash`)
+   - **Gemini Model**: Read from `$HOME/.copilot/.ck.json`: `gemini.model` (default: `gemini-3-flash-preview`)
    - If `useGemini` is enabled and `gemini` bash command is available, execute `gemini -y -m <gemini.model> "...your search prompt..."` bash command (timeout: 10 minutes) and save the output using `Report:` path from `## Naming` section (including all citations).
    - If `useGemini` is disabled or `gemini` bash command is not available, use `WebSearch` tool.
    - Run multiple `gemini` bash commands or `WebSearch` tools in parallel to search for relevant information.
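
The search strategy above reads two settings from `$HOME/.copilot/.ck.json`, each with a default. A minimal sketch of that lookup, assuming the file is plain JSON with the nested keys named in the skill text; `settings_from`, `load_settings`, and `gemini_available` are hypothetical helpers, not part of the package:

```python
import json
import shutil
from pathlib import Path

DEFAULT_MODEL = "gemini-3-flash-preview"  # default named in the 1.2.6 skill text


def _dig(data, *keys):
    """Walk nested dicts, returning None as soon as a level is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def settings_from(config):
    """Return (use_gemini, model) from a parsed config, applying the documented defaults."""
    use_gemini = _dig(config, "skills", "research", "useGemini")
    model = _dig(config, "gemini", "model")
    return (True if use_gemini is None else bool(use_gemini), model or DEFAULT_MODEL)


def load_settings(path=None):
    """Read the config file; a missing or unreadable file falls back to the defaults."""
    path = Path(path) if path else Path.home() / ".copilot" / ".ck.json"
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        config = {}
    return settings_from(config)


def gemini_available():
    """True when a `gemini` executable is on PATH."""
    return shutil.which("gemini") is not None


if __name__ == "__main__":
    use_gemini, model = load_settings()
    print("Gemini" if use_gemini and gemini_available() else "WebSearch", model)
```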

package/skills/scout/SKILL.md CHANGED

@@ -29,7 +29,7 @@ Fast, token-efficient codebase scouting using parallel agents to find files need
 ## Configuration
 Read from `$HOME/.copilot/.ck.json`:
-- `gemini.model` - Gemini model (default: `gemini-3.0-flash`)
+- `gemini.model` - Gemini model (default: `gemini-3-flash-preview`)
 ## Workflow
@@ -43,21 +43,33 @@ Read from `$HOME/.copilot/.ck.json`:
 - Assign each agent specific directories or patterns
 - Ensure no overlap, maximize coverage
-### 3. Spawn Parallel Agents
+### 3. Register Scout Tasks
+**Skip this step if agent count <= 2.**
+- Check for existing scout tasks in the current session to avoid duplicates.
+- Create a markdown checklist — one entry per agent — including scope metadata (directories, patterns assigned).
+- Example checklist entry: `- [ ] Agent 1: src/api/*, src/models/* — searching for auth-related files`
+- Reference: `references/task-management-scouting.md` for checklist format and metadata fields.
+### 4. Spawn Parallel Agents
 Load appropriate reference based on decision tree:
 - **Internal (Default):** `references/internal-scouting.md` (search agents)
 - **External:** `references/external-scouting.md` (Gemini/OpenCode)
 **Notes:**
-- Prompt detailed instructions for each agent with exact directories or files it should read
-- Remember that each agent has less than 200K tokens of context window
-- Amount of agents to-be-spawned depends on the current system resources available and amount of files to be scanned
-- Each agent must return a detailed summary report to a main agent
-### 4. Collect Results
-- Timeout: 3 minutes per agent (skip non-responders)
-- Aggregate findings into single report
-- List unresolved questions at end
+- Update each task to in_progress before spawning its agent.
+- Prompt detailed instructions for each agent with exact directories or files it should read.
+- Remember that each agent has less than 200K tokens of context window.
+- Amount of agents to-be-spawned depends on the current system resources available and amount of files to be scanned.
+- Each agent must return a detailed summary report to a main agent.
+### 5. Collect Results
+- Timeout: 3 minutes per agent (skip non-responders).
+- Update completed tasks to done; log timed-out agents in the report under "Unresolved Questions".
+- Aggregate findings into single report.
+- List unresolved questions at end.
 ## Report Format
@@ -76,3 +88,4 @@ Load appropriate reference based on decision tree:
 - `references/internal-scouting.md` - Using search agents
 - `references/external-scouting.md` - Using Gemini/OpenCode CLI
+- `references/task-management-scouting.md` - Checklist format and scope metadata for scout tasks
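
The workflow above assigns each agent its own directories, then registers one checklist entry per agent unless the agent count is 2 or fewer. A rough sketch of both steps follows. The round-robin split and the function names are assumptions, and the entry format only loosely follows the example in the diff.

```python
def assign_scopes(directories, agent_count):
    """Split directories round-robin so each one goes to exactly one agent."""
    scopes = [[] for _ in range(agent_count)]
    for index, directory in enumerate(sorted(set(directories))):
        scopes[index % agent_count].append(directory)
    return [scope for scope in scopes if scope]  # drop agents left with nothing


def scout_checklist(scopes, goal):
    """Return one checklist entry per agent, or nothing when 2 or fewer agents run."""
    if len(scopes) <= 2:
        return []
    return [
        f"- [ ] Agent {number}: {', '.join(scope)} - searching for {goal}"
        for number, scope in enumerate(scopes, start=1)
    ]


if __name__ == "__main__":
    scopes = assign_scopes(["src/api", "src/models", "src/ui", "tests", "docs"], 3)
    print("\n".join(scout_checklist(scopes, "auth-related files")))
```

This split only guarantees that no directory string is handed out twice. Nested paths such as `src` and `src/api` would still overlap, so they need to be normalized first.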

package/skills/web-testing/SKILL.md CHANGED

@@ -1,23 +1,52 @@
 ---
 name: web-testing
-description: Web testing with Playwright, Vitest, k6. E2E/unit/integration/load/security/visual/a11y testing. Use for test automation, flakiness, Core Web Vitals, mobile gestures, cross-browser.
+description: Web testing with Playwright, Vitest, k6. E2E/unit/integration/load/security/visual/a11y testing. Multi-language support (JS/TS, Python, Go, Rust, Flutter). Use for test automation, flakiness, Core Web Vitals, mobile gestures, cross-browser.
 ---
 # Web Testing Skill
-Comprehensive web testing: unit, integration, E2E, load, security, visual regression, accessibility.
+Comprehensive testing: unit, integration, E2E, load, security, visual regression, accessibility. Multi-language workflow orchestration with structured QA reporting.
+## Core Principle
+**NEVER IGNORE FAILING TESTS.** Fix root causes, not symptoms. No mocks/cheats/tricks to pass builds.
 ## Quick Start
 ```bash
+# JavaScript/TypeScript
 npx vitest run                    # Unit tests
 npx playwright test               # E2E tests
-npx playwright test --ui          # E2E with UI
+npm run test:coverage             # Coverage
+# Python
+pytest --cov=src                  # Unit + coverage
+# Go / Rust / Flutter
+go test ./... -cover              # Go
+cargo test                        # Rust
+flutter test --coverage           # Flutter
+# Web Quality
 k6 run load-test.js               # Load tests
-npx @axe-core/cli https://example.com  # Accessibility
-npx lighthouse https://example.com     # Performance
+npx @axe-core/cli https://...    # Accessibility
+npx lighthouse https://...       # Performance
 ```
+## Workflows
+Three orchestrated workflows — load the relevant reference when needed:
+| Workflow | Reference | Use When |
+|----------|-----------|----------|
+| Code Testing | `./references/test-execution-workflow.md` | Running unit/integration/e2e, checking coverage, validating builds |
+| UI Testing | `./references/ui-testing-workflow.md` | Visual regression, responsive checks, accessibility audits, form automation |
+| Report Format | `./references/report-format.md` | Generating structured QA summary reports |
+**Code Testing Process:** Identify scope → Pre-flight checks → Execute tests → Analyze results → Coverage analysis → Build verification → Report
+**UI Testing Process:** Discovery → Visual capture → Console errors → Network validation → Responsive testing → Form testing → Performance metrics → Report
 ## Testing Strategy (Choose Your Model)
 | Model | Structure | Best For |
@@ -26,10 +55,15 @@ npx lighthouse https://example.com     # Performance
 | Trophy | Integration-heavy | Modern SPAs |
 | Honeycomb | Contract-centric | Microservices |
-→ `./references/testing-pyramid-strategy.md`
+> `./references/testing-pyramid-strategy.md`
 ## Reference Documentation
+### Workflows & Reports
+- `./references/test-execution-workflow.md` - Orchestrated code testing (multi-language)
+- `./references/ui-testing-workflow.md` - Browser-based visual testing via `agent-browser`
+- `./references/report-format.md` - Structured QA report template
 ### Core Testing
 - `./references/unit-integration-testing.md` - Vitest, browser mode, AAA
 - `./references/e2e-testing-playwright.md` - Fixtures, sharding, selectors
@@ -64,6 +98,17 @@ npx lighthouse https://example.com     # Performance
 - `./references/pre-release-checklist.md` - Complete release checklist
 - `./references/functional-testing-checklist.md` - Feature testing
+## Tools Integration
+- **Test runners**: Vitest, Jest, Mocha, pytest, go test, cargo test, flutter test
+- **Coverage**: Istanbul/c8/nyc, pytest-cov, go cover
+- **E2E**: Playwright (multi-browser, sharding)
+- **Load**: k6
+- **Browser**: `agent-browser` skill for UI testing
+- **Analysis**: `ai-multimodal` skill (if available) for screenshot analysis
+- **Debugging**: `debug` skill when tests reveal bugs
+- **Thinking**: `sequential-thinking` skill for complex test failure analysis
 ## Scripts
 ### Initialize Playwright Project
@@ -92,3 +137,12 @@ jobs:
       - run: npm run test:a11y      # Accessibility
       - run: npx lhci autorun       # Performance
 ```
+## Quality Standards
+- All critical paths must have test coverage
+- Validate happy path AND error scenarios
+- Ensure test isolation — no interdependencies
+- Tests must be deterministic and reproducible
+- Clean up test data after execution
+- Coverage: 80%+ lines, 70%+ branches minimum
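
The Quick Start above lists one test command per language. One way to choose among them is to look for each ecosystem's manifest file, as sketched below. The commands are copied from the skill text. Detecting by marker file is an assumption, as is the helper name `commands_for`; the workflow in `test-execution-workflow.md`, not shown in this diff, may work differently.

```python
from pathlib import Path

# (marker file, command) pairs; the commands come from the Quick Start block.
MARKER_COMMANDS = [
    ("package.json", "npx vitest run"),
    ("pyproject.toml", "pytest --cov=src"),
    ("go.mod", "go test ./... -cover"),
    ("Cargo.toml", "cargo test"),
    ("pubspec.yaml", "flutter test --coverage"),
]


def commands_for(filenames):
    """Return the test commands whose marker file appears among the given names."""
    present = set(filenames)
    return [command for marker, command in MARKER_COMMANDS if marker in present]


def detect_commands(project_dir="."):
    """Inspect the top level of a project directory for known marker files."""
    return commands_for(p.name for p in Path(project_dir).iterdir() if p.is_file())


if __name__ == "__main__":
    for command in detect_commands():
        print(command)
```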

package/skills/web-testing/references/report-format.md ADDED

@@ -0,0 +1,57 @@
+# Test Report Format
+Structured QA report template. Sacrifice grammar for concision.
+## Template
+```markdown
+# Test Report — {date} — {scope}
+## Test Results Overview
+- **Total**: X tests
+- **Passed**: X | **Failed**: X | **Skipped**: X
+- **Duration**: Xs
+## Coverage Metrics
+| Metric   | Value | Threshold | Status |
+|----------|-------|-----------|--------|
+| Lines    | X%    | 80%       | PASS/FAIL |
+| Branches | X%    | 70%       | PASS/FAIL |
+| Functions| X%    | 80%       | PASS/FAIL |
+## Failed Tests
+### `test/path/file.test.ts` — TestName
+- **Error**: Error message
+- **Stack**: Relevant stack trace (truncated)
+- **Cause**: Brief root cause analysis
+- **Fix**: Suggested resolution
+## UI Test Results (if applicable)
+- **Pages tested**: X
+- **Screenshots**: ./screenshots/
+- **Console errors**: none | [list]
+- **Responsive**: checked at [viewports] | skipped
+- **Performance**: LCP Xs, FID Xms, CLS X
+## Build Status
+- **Build**: PASS/FAIL
+- **Warnings**: none | [list]
+- **Dependencies**: all resolved | [issues]
+## Critical Issues
+1. [Blocking issue description + impact]
+## Recommendations
+1. [Actionable improvement with priority]
+## Unresolved Questions
+- [Any open questions, if any]
+```
+## Guidelines
+- Include ALL failed tests with error messages — don't summarize away details
+- Coverage: highlight specific uncovered files/functions, not just percentages
+- Screenshots: embed paths directly in report for easy access
+- Recommendations: prioritize by impact (critical > high > medium > low)
+- Keep report under 200 lines — split into sections if larger scope needed
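
The Coverage Metrics table in the template above can be generated from measured values. In this minimal sketch the thresholds are taken from the template. The function name `coverage_table` is hypothetical, and treating a value exactly at the threshold as a PASS is an assumption.

```python
# Thresholds from the report template: lines 80%, branches 70%, functions 80%.
THRESHOLDS = {"Lines": 80.0, "Branches": 70.0, "Functions": 80.0}


def coverage_table(values):
    """Render the Coverage Metrics table; a value at or above its threshold passes."""
    rows = [
        "| Metric | Value | Threshold | Status |",
        "|--------|-------|-----------|--------|",
    ]
    for metric, threshold in THRESHOLDS.items():
        value = values[metric]
        status = "PASS" if value >= threshold else "FAIL"
        rows.append(f"| {metric} | {value:.1f}% | {threshold:.0f}% | {status} |")
    return "\n".join(rows)


if __name__ == "__main__":
    print(coverage_table({"Lines": 85.0, "Branches": 65.5, "Functions": 80.0}))
```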