@cubis/foundry 0.3.76 → 0.3.77

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (165)
  1. package/dist/cli/core.js +57 -0
  2. package/dist/cli/core.js.map +1 -1
  3. package/mcp/src/tools/skillTools.test.ts +34 -1
  4. package/package.json +1 -1
  5. package/src/cli/core.ts +66 -0
  6. package/workflows/skills/_schema/skill-platform-attributes.json +7 -0
  7. package/workflows/skills/deep-research/SKILL.md +81 -0
  8. package/workflows/skills/deep-research/evals/assertions.md +17 -0
  9. package/workflows/skills/deep-research/evals/evals.json +56 -0
  10. package/workflows/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  11. package/workflows/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  12. package/workflows/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  13. package/workflows/skills/deep-research/references/comparison-checklist.md +57 -0
  14. package/workflows/skills/deep-research/references/research-output.md +69 -0
  15. package/workflows/skills/deep-research/references/source-ladder.md +81 -0
  16. package/workflows/skills/generated/skill-audit.json +11 -2
  17. package/workflows/skills/generated/skill-catalog.json +36 -4
  18. package/workflows/skills/skills_index.json +32 -0
  19. package/workflows/workflows/agent-environment-setup/generated/route-manifest.json +7 -7
  20. package/workflows/workflows/agent-environment-setup/manifest.json +27 -1
  21. package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/orchestrator.md +6 -5
  22. package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/project-planner.md +4 -3
  23. package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/researcher.md +8 -4
  24. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/accessibility.toml +2 -0
  25. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/backend.toml +2 -0
  26. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/create.toml +2 -0
  27. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/database.toml +2 -0
  28. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/debug.toml +2 -0
  29. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/devops.toml +2 -0
  30. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/implement-track.toml +2 -0
  31. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/migrate.toml +2 -0
  32. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/mobile.toml +2 -0
  33. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/onboard.toml +2 -0
  34. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/orchestrate.toml +2 -0
  35. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/plan.toml +2 -0
  36. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/refactor.toml +2 -0
  37. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/release.toml +2 -0
  38. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/review.toml +2 -0
  39. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/security.toml +2 -0
  40. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/test.toml +2 -0
  41. package/workflows/workflows/agent-environment-setup/platforms/antigravity/commands/vercel.toml +2 -0
  42. package/workflows/workflows/agent-environment-setup/platforms/antigravity/rules/GEMINI.md +13 -8
  43. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/SKILL.md +89 -0
  44. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/evals/assertions.md +17 -0
  45. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/evals/evals.json +56 -0
  46. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  47. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  48. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  49. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/references/comparison-checklist.md +57 -0
  50. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/references/research-output.md +69 -0
  51. package/workflows/workflows/agent-environment-setup/platforms/antigravity/skills/deep-research/references/source-ladder.md +81 -0
  52. package/workflows/workflows/agent-environment-setup/platforms/antigravity/workflows/onboard.md +3 -3
  53. package/workflows/workflows/agent-environment-setup/platforms/antigravity/workflows/orchestrate.md +2 -2
  54. package/workflows/workflows/agent-environment-setup/platforms/antigravity/workflows/plan.md +4 -4
  55. package/workflows/workflows/agent-environment-setup/platforms/claude/agents/orchestrator.md +6 -5
  56. package/workflows/workflows/agent-environment-setup/platforms/claude/agents/project-planner.md +4 -3
  57. package/workflows/workflows/agent-environment-setup/platforms/claude/agents/researcher.md +8 -4
  58. package/workflows/workflows/agent-environment-setup/platforms/claude/hooks/README.md +15 -0
  59. package/workflows/workflows/agent-environment-setup/platforms/claude/hooks/route-research-guard.mjs +39 -0
  60. package/workflows/workflows/agent-environment-setup/platforms/claude/hooks/settings.snippet.json +15 -0
  61. package/workflows/workflows/agent-environment-setup/platforms/claude/rules/CLAUDE.md +15 -8
  62. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/SKILL.md +95 -0
  63. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/evals/assertions.md +17 -0
  64. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/evals/evals.json +56 -0
  65. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  66. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  67. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  68. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/references/comparison-checklist.md +57 -0
  69. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/references/research-output.md +69 -0
  70. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/references/source-ladder.md +81 -0
  71. package/workflows/workflows/agent-environment-setup/platforms/claude/skills/skills_index.json +32 -0
  72. package/workflows/workflows/agent-environment-setup/platforms/claude/workflows/onboard.md +3 -3
  73. package/workflows/workflows/agent-environment-setup/platforms/claude/workflows/orchestrate.md +2 -2
  74. package/workflows/workflows/agent-environment-setup/platforms/claude/workflows/plan.md +4 -4
  75. package/workflows/workflows/agent-environment-setup/platforms/codex/agents/orchestrator.md +6 -5
  76. package/workflows/workflows/agent-environment-setup/platforms/codex/agents/project-planner.md +4 -3
  77. package/workflows/workflows/agent-environment-setup/platforms/codex/agents/researcher.md +8 -4
  78. package/workflows/workflows/agent-environment-setup/platforms/codex/rules/AGENTS.md +13 -8
  79. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/SKILL.md +89 -0
  80. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/evals/assertions.md +17 -0
  81. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/evals/evals.json +56 -0
  82. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  83. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  84. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  85. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/references/comparison-checklist.md +57 -0
  86. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/references/research-output.md +69 -0
  87. package/workflows/workflows/agent-environment-setup/platforms/codex/skills/deep-research/references/source-ladder.md +81 -0
  88. package/workflows/workflows/agent-environment-setup/platforms/codex/workflows/onboard.md +3 -3
  89. package/workflows/workflows/agent-environment-setup/platforms/codex/workflows/orchestrate.md +2 -2
  90. package/workflows/workflows/agent-environment-setup/platforms/codex/workflows/plan.md +4 -4
  91. package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/orchestrator.md +6 -5
  92. package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/project-planner.md +4 -3
  93. package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/researcher.md +8 -4
  94. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-accessibility.prompt.md +2 -1
  95. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-backend.prompt.md +2 -1
  96. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-create.prompt.md +2 -1
  97. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-database.prompt.md +2 -1
  98. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-debug.prompt.md +2 -1
  99. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-devops.prompt.md +2 -1
  100. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-implement-track.prompt.md +2 -1
  101. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-migrate.prompt.md +2 -1
  102. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-mobile.prompt.md +2 -1
  103. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-onboard.prompt.md +2 -1
  104. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-orchestrate.prompt.md +2 -1
  105. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-plan.prompt.md +2 -1
  106. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-refactor.prompt.md +2 -1
  107. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-release.prompt.md +2 -1
  108. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-review.prompt.md +2 -1
  109. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-security.prompt.md +2 -1
  110. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-test.prompt.md +2 -1
  111. package/workflows/workflows/agent-environment-setup/platforms/copilot/prompts/workflow-vercel.prompt.md +2 -1
  112. package/workflows/workflows/agent-environment-setup/platforms/copilot/rules/copilot-instructions.md +13 -8
  113. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/SKILL.md +94 -0
  114. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/evals/assertions.md +17 -0
  115. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/evals/evals.json +56 -0
  116. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  117. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  118. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  119. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/references/comparison-checklist.md +57 -0
  120. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/references/research-output.md +69 -0
  121. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/references/source-ladder.md +81 -0
  122. package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/skills_index.json +32 -0
  123. package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/onboard.md +3 -3
  124. package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/orchestrate.md +2 -2
  125. package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/plan.md +4 -4
  126. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/accessibility.toml +2 -0
  127. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/backend.toml +2 -0
  128. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/create.toml +2 -0
  129. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/database.toml +2 -0
  130. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/debug.toml +2 -0
  131. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/devops.toml +2 -0
  132. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/implement-track.toml +2 -0
  133. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/migrate.toml +2 -0
  134. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/mobile.toml +2 -0
  135. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/onboard.toml +2 -0
  136. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/orchestrate.toml +2 -0
  137. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/plan.toml +2 -0
  138. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/refactor.toml +2 -0
  139. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/release.toml +2 -0
  140. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/review.toml +2 -0
  141. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/security.toml +2 -0
  142. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/test.toml +2 -0
  143. package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/vercel.toml +2 -0
  144. package/workflows/workflows/agent-environment-setup/platforms/gemini/rules/GEMINI.md +13 -8
  145. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/SKILL.md +89 -0
  146. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/evals/assertions.md +17 -0
  147. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/evals/evals.json +56 -0
  148. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/01-latest-docs-check.md +12 -0
  149. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/02-ecosystem-comparison.md +12 -0
  150. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/03-research-to-implementation-handoff.md +12 -0
  151. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/references/comparison-checklist.md +57 -0
  152. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/references/research-output.md +69 -0
  153. package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/references/source-ladder.md +81 -0
  154. package/workflows/workflows/agent-environment-setup/platforms/gemini/workflows/onboard.md +3 -3
  155. package/workflows/workflows/agent-environment-setup/platforms/gemini/workflows/orchestrate.md +2 -2
  156. package/workflows/workflows/agent-environment-setup/platforms/gemini/workflows/plan.md +4 -4
  157. package/workflows/workflows/agent-environment-setup/shared/agents/orchestrator.md +2 -1
  158. package/workflows/workflows/agent-environment-setup/shared/agents/project-planner.md +2 -1
  159. package/workflows/workflows/agent-environment-setup/shared/agents/researcher.md +5 -1
  160. package/workflows/workflows/agent-environment-setup/shared/rules/STEERING.md +44 -13
  161. package/workflows/workflows/agent-environment-setup/shared/rules/overrides/claude.md +2 -0
  162. package/workflows/workflows/agent-environment-setup/shared/rules/overrides/gemini.md +20 -0
  163. package/workflows/workflows/agent-environment-setup/shared/workflows/onboard.md +1 -1
  164. package/workflows/workflows/agent-environment-setup/shared/workflows/orchestrate.md +1 -1
  165. package/workflows/workflows/agent-environment-setup/shared/workflows/plan.md +2 -2
@@ -5,7 +5,7 @@ tools: Read, Grep, Glob, Bash, Edit, Write
  model: inherit
  maxTurns: 30
  memory: project
- skills: system-design, api-design, database-design, architecture-doc, mcp-server-builder, tech-doc, prompt-engineering, skill-creator, typescript-best-practices, javascript-best-practices, python-best-practices
+ skills: system-design, api-design, database-design, deep-research, mcp-server-builder, tech-doc, prompt-engineering, skill-creator, typescript-best-practices, javascript-best-practices, python-best-practices
  handoffs:
  - agent: "orchestrator"
    title: "Start Implementation"
@@ -24,7 +24,7 @@ Decompose complex requests into implementable plans with clear ownership, depend
  - Load `system-design` for system design tradeoffs in the plan.
  - Load `api-design` when the plan involves API contract decisions.
  - Load `database-design` when the plan involves data modeling or migration.
- - Load `architecture-doc` when planning requires external information or comparison.
+ - Load `deep-research` when planning requires fresh external information, public comparison, or evidence beyond the repo.
  - Use `skill_validate` before `skill_get`, and use `skill_get_reference` only for the specific sidecar file needed.

  ## Skill References
@@ -34,7 +34,7 @@ Decompose complex requests into implementable plans with clear ownership, depend
  | `system-design` | Plan involves system design tradeoffs or component boundaries. |
  | `api-design` | Plan involves API contract decisions or integration points. |
  | `database-design` | Plan involves data modeling, schema design, or migration strategy. |
- | `architecture-doc` | Planning requires external research or approach comparison. |
+ | `deep-research` | Planning requires external research or approach comparison. |
  | `mcp-server-builder` | Plan involves MCP server or tool implementation. |
  | `skill-creator` | Plan involves skill package creation or modification. |

@@ -45,6 +45,7 @@ Decompose complex requests into implementable plans with clear ownership, depend
  - Every task needs an owner (agent), acceptance criteria, and verification approach.
  - Plan for rollback — every change should be reversible.
  - Front-load risk — tackle the hardest technical uncertainty first.
+ - When outside evidence is needed, send research through `deep-research` first instead of mixing web browsing into every implementation stream.

  ## Planning Methodology

@@ -18,7 +18,7 @@ tools: Read, Grep, Glob, Bash
  model: inherit
  maxTurns: 30
  memory: project
- skills: architecture-doc, system-design, database-design, tech-doc, prompt-engineering
+ skills: deep-research, system-design, database-design, tech-doc, prompt-engineering
  handoffs:
  - agent: "project-planner"
    title: "Plan Implementation"
@@ -30,8 +30,8 @@ Investigate thoroughly, synthesize findings, and deliver structured knowledge be

  ## Skill Loading Contract

- - Do not call `skill_search` for `architecture-doc`, `system-design`, `database-design`, `tech-doc`, or `prompt-engineering` when the task is clearly research work.
- - Load `architecture-doc` first for all research tasks — it defines the research methodology.
+ - Do not call `skill_search` for `deep-research`, `system-design`, `database-design`, `tech-doc`, or `prompt-engineering` when the task is clearly research work.
+ - Load `deep-research` first for all research tasks — it defines the source ladder, evidence labeling, and research output contract.
  - Add `system-design` when research involves system design patterns or tradeoffs.
  - Add `database-design` when research involves data storage options or migration approaches.
  - Add `tech-doc` when research involves OpenAI API or model behavior verification.
@@ -42,7 +42,7 @@ Investigate thoroughly, synthesize findings, and deliver structured knowledge be

  | File | Load when |
  | ----------------------- | --------------------------------------------------------------------- |
- | `architecture-doc` | All research tasks — defines the core research methodology. |
+ | `deep-research` | All research tasks — defines the core research methodology. |
  | `system-design` | Research involves system design patterns or architectural tradeoffs. |
  | `database-design` | Research involves data storage, database comparison, or migration. |
  | `tech-doc` | Research involves OpenAI API, model behavior, or version differences. |
@@ -51,6 +51,9 @@ Investigate thoroughly, synthesize findings, and deliver structured knowledge be
  ## Operating Stance

  - Breadth first, then depth — survey the landscape before drilling into specifics.
+ - Repo first, then web — inspect local code, configs, and docs before using external sources.
+ - Official docs first — use vendor or maintainer documentation as primary evidence.
+ - Community evidence is secondary — Reddit, blog posts, and forum threads can inform implementation, but label them as lower-trust support.
  - Cite sources — every finding should be traceable to evidence.
  - Distinguish fact from inference — clearly label assumptions.
  - Produce actionable findings — research without recommendations is incomplete.
@@ -72,5 +75,6 @@ Investigate thoroughly, synthesize findings, and deliver structured knowledge be
  - Clear distinction between verified facts and educated guesses.
  - Actionable recommendations with tradeoff analysis.
  - Remaining knowledge gaps identified.
+ - Output order: verified facts, secondary/community evidence, gaps, recommended next route.

  > **Codex note:** Specialists are internal reasoning postures, not spawned processes. Switch postures by adopting the specialist's guidelines inline.
@@ -43,7 +43,7 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
  ├─ [TRIVIAL] Single-step, obvious, reversible?
  │ → Execute directly. No routing. Stop.

- ├─ [EXPLICIT] User named a workflow or @specialist?
+ ├─ [EXPLICIT] User named a workflow, @specialist, or exact skill?
  │ → Honor that route exactly. Stop.

  ├─ [SINGLE-DOMAIN] Multi-step but contained in one specialty?
@@ -63,6 +63,7 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
  **Hard rules:**

  - Never pre-load skills before route resolution.
+ - If the user names an exact skill ID, run `skill_validate` on that ID before `route_resolve`.
  - Never invoke a specialist posture when direct execution suffices.
  - Never chain more than one `skill_search` per request.
  - Codex compatibility aliases (`$workflow-*`, `$agent-*`) are accepted as hints only — not primary route surfaces.
@@ -367,6 +368,7 @@ Use this matrix to match incoming tasks to the correct skill and primary special
  | docker-compose-dev | DevOps | Docker Compose local dev environments | @devops-engineer |
  | kubernetes-deploy | DevOps | K8s manifests, Helm charts, deployment | @devops-engineer |
  | observability | DevOps | Logging, metrics, tracing, alerting | @devops-engineer |
+ | deep-research | Research | Latest docs, public comparisons, external verification | @researcher |
  | llm-eval | AI/ML | LLM evaluation, benchmarking, evals | @researcher |
  | rag-patterns | AI/ML | RAG architecture, embeddings, retrieval | @researcher |
  | prompt-engineering | AI/ML | Prompt design, few-shot, chain-of-thought | @researcher |
@@ -414,12 +416,15 @@ Selection policy:
  Keep MCP context lazy and exact. Skills are supporting context, not the route layer.

  1. Never begin with `skill_search`. Inspect the repo/task locally first.
- 2. Resolve workflows, agents, or free-text route intent with `route_resolve` before loading any skills.
- 3. If the route is still unresolved and local grounding leaves the domain unclear, use one narrow `skill_search`.
- 4. Always run `skill_validate` on the exact selected ID before `skill_get`.
- 5. Call `skill_get` with `includeReferences:false` by default.
- 6. Load at most one sidecar markdown file at a time with `skill_get_reference`.
- 7. Do not auto-prime every specialist with a skill. Load only what the task clearly needs.
- 8. Use upstream MCP servers such as `postman`, `stitch`, or `playwright` for real cloud/browser actions when available.
+ 2. If the user already named `/workflow`, `@agent`, or an exact skill ID, honor it directly. For exact skills, run `skill_validate` first and skip `route_resolve` when valid.
+ 3. Resolve only free-text workflow/agent intent with `route_resolve` before loading non-explicit skills.
+ 4. If the route is still unresolved and local grounding leaves the domain unclear, use one narrow `skill_search`.
+ 5. Always run `skill_validate` on the exact selected ID before `skill_get`.
+ 6. Call `skill_get` with `includeReferences:false` by default.
+ 7. Load at most one sidecar markdown file at a time with `skill_get_reference`.
+ 8. Do not auto-prime every specialist with a skill. Load only what the task clearly needs.
+ 9. For research: repo/local evidence first, official docs next, Reddit/community only as labeled secondary evidence.
+ 10. Escalate to research only when freshness matters, public comparison matters, or the user explicitly asks to research/verify.
+ 11. Use upstream MCP servers such as `postman`, `stitch`, or `playwright` for real cloud/browser actions when available.

  <!-- cbx:mcp:auto:end -->
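The numbered selection policy above can be sketched as a small ordering helper. This is a hedged illustration, not the package's actual implementation: the tool names (`route_resolve`, `skill_validate`, `skill_get`, `skill_search`) appear in the diff, while the `RouteRequest` shape and the `planToolCalls` function are hypothetical.

```typescript
// Illustrative sketch of the lazy MCP call ordering described in the policy.
// RouteRequest and planToolCalls are assumptions for this example only.
type RouteRequest = {
  explicitSkillId?: string;    // user named an exact skill ID
  domainClearLocally: boolean; // repo inspection already identified the domain
};

function planToolCalls(req: RouteRequest): string[] {
  const calls: string[] = [];
  if (req.explicitSkillId) {
    // Exact skill named: validate it directly and skip route_resolve when valid.
    calls.push("skill_validate");
  } else {
    // Free-text intent: resolve the route before loading any skills.
    calls.push("route_resolve");
    if (!req.domainClearLocally) {
      calls.push("skill_search"); // at most one narrow search per request
    }
    calls.push("skill_validate"); // always validate the selected ID first
  }
  // Fetch the skill body; includeReferences stays false by default.
  calls.push("skill_get");
  return calls;
}
```

Under these assumptions, an explicitly named skill short-circuits to validate-then-get, while free-text intent pays the route-resolution cost before any skill loads.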
@@ -0,0 +1,89 @@
1
+ ---
2
+ name: deep-research
3
+ description: Use when investigating latest vendor behavior, comparing tools or platforms, verifying claims beyond the repo, or gathering external evidence before implementation.
4
+ ---
5
+ # Deep Research
6
+
7
+ ## Purpose
8
+
9
+ Run a disciplined research pass before implementation when the repo alone is not enough. This skill keeps research evidence-driven: inspect the local codebase first, escalate to official docs when freshness or public comparison matters, then use labeled community evidence only when it adds practical context.
10
+
11
+ ## When to Use
12
+
13
+ - Verifying latest SDK, CLI, API, or platform behavior
14
+ - Comparing tools, frameworks, hosted services, or implementation approaches
15
+ - Checking whether public docs and the local repo disagree
16
+ - Gathering external evidence before planning a migration or new capability
17
+ - Producing a structured research brief that hands off cleanly into implementation
18
+
19
+ ## Instructions
20
+
21
+ 1. **Define the research question before collecting sources** because vague research sprawls quickly. Restate the target topic, freshness requirement, comparison axis, and what decision the findings need to support.
22
+
23
+ 2. **Inspect the repo first** because many questions are already answerable from local code, configs, tests, docs, or generated assets. Do not browse externally until the local evidence is exhausted or clearly insufficient.
24
+
25
+ 3. **Decide whether external research is actually required** because not every task needs web evidence. Escalate only when freshness matters, public comparison matters, or the user explicitly asks to research or verify.
26
+
27
+ 4. **Follow the source ladder strictly** because evidence quality matters. Use official docs, upstream repositories, standards, and maintainer material as primary sources before looking at blogs, issue threads, or Reddit.
+
+ 5. **Capture concrete source details** because research without provenance is hard to trust. Record exact links, relevant dates, versions, and any repo files that support or contradict the external evidence.
+
+ 6. **Cross-check important claims across more than one source when possible** because public docs, repos, and community advice can drift. If sources disagree, say so explicitly instead of smoothing over the conflict.
+
+ 7. **Use Reddit and other community sources only as labeled secondary evidence** because they can surface practical gotchas but are not authoritative. Treat them as implementation color, not final truth.
+
+ 8. **Separate verified facts from inference** because downstream planning depends on confidence. Mark what is directly supported by repo evidence or official sources versus what you infer from patterns or secondary signals.
+
+ 9. **Keep the output decision-oriented** because the goal is not to dump links. Tie each finding back to the implementation, workflow, agent, or skill decision it affects.
+
+ 10. **Recommend the next route explicitly** because research is usually a handoff, not the end of the task. Name the next workflow, agent, or exact skill that should continue the work.
+
+ 11. **State the remaining gaps and risks** because incomplete research is still useful when the uncertainty is visible. Call out what you could not verify, what may have changed recently, and what assumptions remain.
+
+ 12. **Avoid over-quoting and over-collecting** because research quality comes from synthesis, not volume. Prefer concise summaries with high-signal citations over long pasted excerpts.
+
+ 13. **When the task turns into implementation, stop researching and hand off** because mixing discovery and execution usually creates drift. Deliver the research brief first, then route into the correct workflow or specialist.
+
+ ## Output Format
+
+ Deliver:
+
+ 1. **Research question** — topic, freshness requirement, and decision to support
+ 2. **Verified facts** — repo evidence and primary-source findings
+ 3. **Secondary/community evidence** — labeled lower-trust supporting signals
+ 4. **Gaps / unknowns** — unresolved questions or contradictory evidence
+ 5. **Recommended next route** — direct execution, workflow, agent, or exact skill to use next
+
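+ As a concrete sketch, a delivered brief following this format might look like the following. Every detail here is a hypothetical placeholder, not real research output:
+
+ ```markdown
+ 1. **Research question**: can platform hooks enforce route honoring? Supports a go/no-go on new workflow rules.
+ 2. **Verified facts**: [repo] a hook config already ships in the bundle; [official docs] hook events are documented (version and date recorded).
+ 3. **Secondary/community evidence**: [secondary] one forum thread reports flaky hook ordering; treated as implementation color only.
+ 4. **Gaps / unknowns**: behavior on the newest platform release could not be confirmed.
+ 5. **Recommended next route**: `/create` to apply the validated hook template.
+ ```
+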
+ ## References
+
+ Load only the file needed for the current question.
+
+ | File | Load when |
+ | --- | --- |
+ | `references/source-ladder.md` | Need the repo-first and source-priority policy for official docs versus community evidence. |
+ | `references/research-output.md` | Need the structured output format, evidence labeling rules, or handoff pattern. |
+ | `references/comparison-checklist.md` | Comparing vendors, frameworks, or tools and need a concrete evaluation frame. |
+
+ ## Examples
+
+ Use these when the task shape already matches.
+
+ | File | Use when |
+ | --- | --- |
+ | `examples/01-latest-docs-check.md` | Verifying a latest capability or doc claim before implementation. |
+ | `examples/02-ecosystem-comparison.md` | Comparing multiple tools or platforms with official-first sourcing. |
+ | `examples/03-research-to-implementation-handoff.md` | Turning research findings into a concrete next workflow or specialist handoff. |
+
+ ## Codex Research Flow
+
+ - Start in the repo. Gather code, config, tests, and docs before using any external source.
+ - When external evidence is required, prefer official docs first and keep community evidence clearly labeled as secondary because Codex environments may be network-restricted or stale.
+ - End with a concrete next route: direct execution, a workflow, an agent posture, or one exact follow-up skill.
+
+ ## Codex Platform Notes
+
+ - Specialists are internal reasoning postures, not spawned subagent processes.
+ - Reference the repo-root AGENTS instructions for posture definitions and switching contracts.
+ - Codex operates under network restrictions — skills should not assume outbound HTTP access.
+ - Use `$ARGUMENTS` to access user-provided arguments when the skill is invoked.
+ - All skill guidance executes within the sandbox; file I/O is confined to the workspace.
@@ -0,0 +1,17 @@
+ # Deep Research Eval Assertions
+
+ ## Eval 1: Latest Capability Verification
+
+ 1. **repo-first** — Starts by checking repo or local evidence before jumping to web claims.
+ 2. **official-first** — Uses official docs or upstream sources as the primary evidence for the capability.
+ 3. **secondary-labeled** — If community sources are mentioned, labels them as secondary evidence instead of presenting them as authoritative.
+ 4. **gaps-called-out** — Identifies unresolved uncertainty or missing confirmation.
+ 5. **next-route** — Ends with a concrete recommended workflow, agent, or skill to use next.
+
+ ## Eval 2: Tool Comparison
+
+ 1. **comparison-frame** — Defines the comparison axes instead of producing vague preferences.
+ 2. **repo-impact** — Connects the comparison back to the current repo or implementation constraints.
+ 3. **fact-vs-inference** — Separates verified facts from inference or interpretation.
+ 4. **decision-oriented** — Produces a recommendation or explicit defer condition.
+ 5. **no-research-sprawl** — Keeps the output concise and structured rather than dumping raw links.
@@ -0,0 +1,56 @@
+ [
+ {
+ "name": "latest-capability-verification",
+ "description": "Validate that the skill performs repo-first research, prioritizes official documentation, labels secondary evidence, and recommends a next route.",
+ "prompt": "Research whether the latest official Claude Code hook surface supports reinforcing route honoring before implementation. Start with the current repo state, then use official docs if needed. If community sources add useful practical context, include them but label them appropriately. End with the next workflow, agent, or skill we should use.",
+ "assertions": [
+ {
+ "id": "repo-first",
+ "description": "Starts with repo or local evidence before using external sources."
+ },
+ {
+ "id": "official-first",
+ "description": "Treats official docs or upstream sources as the primary evidence."
+ },
+ {
+ "id": "secondary-labeled",
+ "description": "Labels any Reddit or community evidence as secondary rather than authoritative."
+ },
+ {
+ "id": "gaps-called-out",
+ "description": "States any unresolved gaps, conflicts, or unknowns."
+ },
+ {
+ "id": "next-route",
+ "description": "Ends with a concrete recommended next route."
+ }
+ ]
+ },
+ {
+ "name": "tool-comparison",
+ "description": "Validate that the skill compares options with a clear frame, ties findings back to repo impact, and produces a decision-ready output.",
+ "prompt": "Compare whether our CLI should keep enforcement in Gemini command wrappers only or also add Claude hook templates. Use the repo state first, then official docs for current platform capabilities, and include community evidence only if it adds implementation nuance. Finish with a recommendation and the next route to take.",
+ "assertions": [
+ {
+ "id": "comparison-frame",
+ "description": "Defines concrete comparison axes such as repo impact, enforcement surface, and maintenance cost."
+ },
+ {
+ "id": "repo-impact",
+ "description": "Connects each option back to the current repo or bundle behavior."
+ },
+ {
+ "id": "fact-vs-inference",
+ "description": "Separates verified facts from inference or interpretation."
+ },
+ {
+ "id": "decision-oriented",
+ "description": "Produces a recommendation or explicit defer condition."
+ },
+ {
+ "id": "no-research-sprawl",
+ "description": "Keeps the answer structured instead of turning into an unfiltered dump of sources."
+ }
+ ]
+ }
+ ]
@@ -0,0 +1,12 @@
+ # Example: Latest Docs Check
+
+ ## User Request
+
+ > Research whether Claude Code hooks can reinforce route honoring before we add new workflow rules.
+
+ ## Expected Shape
+
+ 1. Inspect the repo's current Claude rule and hook support first.
+ 2. Verify the current official Claude docs for hooks, event names, and config format.
+ 3. Separate those verified facts from any community commentary about hook effectiveness.
+ 4. End with the recommended next route, for example `@researcher` continuing the research or `/create` applying the validated hook template changes.
@@ -0,0 +1,12 @@
+ # Example: Ecosystem Comparison
+
+ ## User Request
+
+ > Compare whether our CLI should keep Gemini command enforcement only, or add another platform-native hook layer for Claude as well.
+
+ ## Expected Shape
+
+ 1. Start with repo constraints and current platform bundle behavior.
+ 2. Compare the official platform capabilities using primary docs.
+ 3. Add any useful community evidence as clearly labeled secondary input.
+ 4. Produce a recommendation tied to the repo: which platform gets which enforcement surface, and why.
@@ -0,0 +1,12 @@
+ # Example: Research To Implementation Handoff
+
+ ## User Request
+
+ > Research the latest Codex, Claude, and Gemini MCP behavior, then tell me the next route to update our workflow rules safely.
+
+ ## Expected Shape
+
+ 1. Gather repo evidence first.
+ 2. Verify current official docs for each platform.
+ 3. Summarize verified facts, secondary evidence, and gaps.
+ 4. End with a precise next route such as `/plan` for a policy change or `skill-creator` for skill/rule packaging work.
@@ -0,0 +1,57 @@
+ # Comparison Checklist
+
+ Use this when evaluating tools, frameworks, APIs, or platforms.
+
+ ## 1. Scope the Comparison
+
+ Define:
+
+ - what is being compared
+ - whether the comparison is about implementation fit, operational cost, or product capability
+ - what time horizon matters: immediate migration, medium-term maintenance, or long-term platform fit
+
+ ## 2. Compare on Stable Axes
+
+ Use a short set of dimensions:
+
+ - integration fit with the current repo
+ - maturity and maintenance signal
+ - official documentation quality
+ - configuration complexity
+ - ecosystem and tooling support
+ - operational constraints
+ - migration cost
+
+ Do not compare on vague criteria like "better DX" without concrete evidence.
+
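+ Scored on axes like these, a comparison can be summarized in a small table. The options and ratings below are hypothetical placeholders:
+
+ ```markdown
+ | Axis | Option A | Option B |
+ | --- | --- | --- |
+ | Integration fit with current repo | High: reuses existing config | Medium: needs a new adapter |
+ | Maturity and maintenance signal | Active releases | Last release over a year ago |
+ | Migration cost | Low | High: entry points rewritten |
+ ```
+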
+ ## 3. Capture Repo Impact
+
+ Tie each option back to the current codebase:
+
+ - what code would change
+ - which workflows or agents would own the work
+ - what risks are specific to this repo
+ - whether new MCP tools or skills would be required
+
+ ## 4. Separate Product Claims from Team Constraints
+
+ An option can be technically stronger and still be a worse fit for the repo.
+
+ Keep these separate:
+
+ - product capability
+ - ecosystem quality
+ - team familiarity
+ - migration blast radius
+ - existing architecture constraints
+
+ ## 5. Decision Frame
+
+ Finish with one of:
+
+ - recommend option A
+ - recommend option B
+ - defer decision pending one missing verification
+ - keep current approach because switching cost outweighs gain
+
+ If the evidence is mixed, say what would change the recommendation.
@@ -0,0 +1,69 @@
+ # Research Output Contract
+
+ ## Required Sections
+
+ ### 1. Research question
+
+ State:
+
+ - the exact topic
+ - why research was necessary
+ - whether freshness or public comparison mattered
+ - the decision this research is meant to support
+
+ ### 2. Verified facts
+
+ List the strongest findings first.
+
+ For each fact:
+
+ - state the claim in one sentence
+ - cite the source class: repo, official docs, upstream repo, standard
+ - include the relevant link or file path
+ - include date/version when it matters
+
+ ### 3. Secondary / community evidence
+
+ Only include this when it adds signal the primary sources did not provide.
+
+ For each item:
+
+ - label it as secondary evidence
+ - state what practical signal it adds
+ - avoid presenting it as settled fact
+
+ ### 4. Gaps / unknowns
+
+ Document:
+
+ - unresolved conflicts
+ - missing official confirmation
+ - assumptions that still need validation
+ - risks if the team proceeds anyway
+
+ ### 5. Recommended next route
+
+ Research should end with one clear recommendation:
+
+ - direct execution
+ - a specific workflow like `/plan` or `/create`
+ - a specialist like `@researcher` or `@frontend-specialist`
+ - an exact skill like `stitch` or `deep-research`
+
+ Keep this recommendation concrete enough that the next step does not need another routing pass.
+
+ ## Compression Rules
+
+ - Prefer 5 strong findings over 20 weak ones.
+ - Do not paste long quotes from docs when a citation plus summary will do.
+ - If multiple sources say the same thing, summarize once and cite the strongest source.
+ - If research found nothing reliable, say that directly.
+
+ ## Handoff Pattern
+
+ When handing off to implementation or planning, include:
+
+ - the decision summary
+ - the highest-confidence constraints
+ - the unresolved risks
+ - the next route to take
@@ -0,0 +1,81 @@
+ # Source Ladder
+
+ ## Goal
+
+ Use the smallest amount of external research that still produces a decision-ready answer. Keep the evidence traceable and ordered by trust.
+
+ ## 1. Repo / Local Evidence First
+
+ Start by inspecting:
+
+ - application code and tests
+ - README files and internal docs
+ - generated workflow or skill assets
+ - lockfiles, config files, and package manifests
+ - existing integration code and migration history
+
+ If the repo already answers the question, stop there. Do not browse externally just because web research feels safer.
+
+ ## 2. Primary External Sources
+
+ Use these next:
+
+ - official vendor docs
+ - upstream repositories and release notes
+ - standards bodies and reference specs
+ - maintainer-authored examples
+
+ Prefer sources that expose:
+
+ - exact feature names
+ - current version constraints
+ - config formats
+ - dates or changelog context
+
+ When the topic is time-sensitive, capture the date you verified the source and the version or doc page involved.
+
+ ## 3. Secondary / Community Sources
+
+ Use these only after primary evidence:
+
+ - Reddit threads
+ - issue comments
+ - independent blog posts
+ - forum discussions
+ - third-party comparison articles
+
+ Community evidence is useful for:
+
+ - practical gotchas
+ - migration pain points
+ - missing-doc workarounds
+ - real-world adoption patterns
+
+ Community evidence is not enough on its own for authoritative claims about product behavior, supported configuration, or security guarantees.
+
+ ## 4. Conflict Handling
+
+ When sources disagree:
+
+ 1. Prefer repo evidence for the current codebase state.
+ 2. Prefer official docs over community claims for product behavior.
+ 3. Prefer newer dated material when the sources cover the same feature.
+ 4. If the conflict remains unresolved, report it as a gap instead of guessing.
+
+ ## 5. Evidence Labels
+
+ Use these labels in research output:
+
+ - **Verified fact** — backed by repo evidence or a primary source
+ - **Secondary evidence** — backed only by community or indirect sources
+ - **Inference** — reasoned conclusion not directly stated by a source
+ - **Gap** — could not be verified confidently
+
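+ Applied to a single finding set, the labels might read like this sketch. The claims are hypothetical placeholders:
+
+ ```markdown
+ - **Verified fact**: the repo pins version 2.x in its package manifest (repo evidence).
+ - **Secondary evidence**: a forum thread reports upgrade friction on 3.x.
+ - **Inference**: the upgrade likely requires config changes, based on the release-notes pattern.
+ - **Gap**: the official 3.x config format could not be confirmed.
+ ```
+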
+ ## 6. Stop Conditions
+
+ Stop researching when:
+
+ - the decision is already clear
+ - new sources only repeat the same point
+ - the remaining uncertainty is small and clearly documented
+ - the task should move into implementation or planning
@@ -30,9 +30,9 @@ Use this when joining a new project, exploring an unfamiliar codebase, or prepar
 
  ## Skill Routing
 
- - Primary skills: `architecture-doc`, `system-design`
+ - Primary skills: `deep-research`, `system-design`
  - Supporting skills (optional): `system-design`, `database-design`, `typescript-best-practices`, `javascript-best-practices`, `python-best-practices`
- - Start with `architecture-doc` for systematic exploration and `system-design` for architecture mapping. Add `system-design` for undocumented systems.
+ - Start with `deep-research` for systematic exploration and `system-design` for architecture mapping, leaning on `system-design` especially for undocumented systems. Prefer repo evidence first; use external sources only when setup or dependency behavior cannot be confirmed locally.
 
  ## Workflow steps
 
@@ -55,7 +55,7 @@ Use this when joining a new project, exploring an unfamiliar codebase, or prepar
  ONBOARD_WORKFLOW_RESULT:
  primary_agent: researcher
  supporting_agents: [code-archaeologist?, backend-specialist?, frontend-specialist?]
- primary_skills: [architecture-doc, system-design]
+ primary_skills: [deep-research, system-design]
  supporting_skills: [system-design?, database-design?]
  project_overview:
  purpose: <string>
@@ -24,8 +24,8 @@ Use this when a task spans multiple domains (backend + frontend, security + infr
  ## Skill Routing
 
  - Primary skills: `system-design`, `api-design`
- - Supporting skills (optional): `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
- - Start with `system-design` for system design coordination and `api-design` for integration contracts. Add supporting skills based on the coordination challenge.
+ - Supporting skills (optional): `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
+ - Start with `system-design` for system design coordination and `api-design` for integration contracts. Add `deep-research` before implementation when the coordination challenge depends on fresh external facts or public comparison.
 
  ## Workflow steps
 
@@ -36,13 +36,13 @@ Use this when starting a new feature, project, or significant change that needs
  ## Skill Routing
 
  - Primary skills: `system-design`, `api-design`
- - Supporting skills (optional): `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
- - Start with `system-design` for system design and `api-design` for API contracts. Add `database-design` when data modeling is central, `architecture-doc` when external knowledge is needed.
+ - Supporting skills (optional): `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
+ - Start with `system-design` for system design and `api-design` for API contracts. Add `database-design` when data modeling is central, `deep-research` when fresh external knowledge or public comparison is needed.
 
  ## Workflow steps
 
  1. Clarify scope, success criteria, and constraints.
- 2. Research existing patterns and dependencies.
+ 2. Research existing patterns and dependencies, starting in-repo and escalating to `deep-research` only when outside evidence is required.
  3. Decompose into tasks with ownership and dependencies.
  4. Define interfaces, contracts, and failure modes.
  5. Produce acceptance criteria for each milestone.
@@ -62,7 +62,7 @@ PLAN_WORKFLOW_RESULT:
  primary_agent: project-planner
  supporting_agents: [orchestrator?, backend-specialist?, frontend-specialist?, database-architect?]
  primary_skills: [system-design, api-design]
- supporting_skills: [database-design?, architecture-doc?, mcp-server-builder?]
+ supporting_skills: [database-design?, deep-research?, mcp-server-builder?]
  plan:
  scope_summary: <string>
  tasks:
@@ -28,8 +28,8 @@ Your only permitted actions:
 
  ## Skill Loading Contract
 
- - Do not call `skill_search` for `system-design`, `api-design`, `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, or `skill-creator` when the task is clearly multi-stream coordination, planning, architecture design, contract design, research, or skill package work.
- - Use `system-design` when the coordination problem is really a design tradeoff problem, `api-design` when integration contracts are the coordination bottleneck, `database-design` when the shared dependency is a data-model or migration concern, `architecture-doc` when the coordination risk is stale or conflicting external information, `mcp-server-builder` for MCP-specific streams, `tech-doc` for OpenAI-doc verification streams, `prompt-engineering` for instruction-quality streams, and `skill-creator` when the coordinated changes are in skills, mirrors, routing, or packaging.
+ - Do not call `skill_search` for `system-design`, `api-design`, `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, or `skill-creator` when the task is clearly multi-stream coordination, planning, architecture design, contract design, research, or skill package work.
+ - Use `system-design` when the coordination problem is really a design tradeoff problem, `api-design` when integration contracts are the coordination bottleneck, `database-design` when the shared dependency is a data-model or migration concern, `deep-research` when the coordination risk is stale or conflicting external information, `mcp-server-builder` for MCP-specific streams, `tech-doc` for OpenAI-doc verification streams, `prompt-engineering` for instruction-quality streams, and `skill-creator` when the coordinated changes are in skills, mirrors, routing, or packaging.
  - Prefer platform-native delegation features when available, but keep the orchestration contract stable even when execution stays in a single track.
  - Use `skill_validate` before `skill_get`, and use `skill_get_reference` only for the specific sidecar file needed by the current coordination step.
 
@@ -42,7 +42,7 @@ Load on demand. Do not preload all references.
  | `system-design` | Coordination depends on resolving system design or interface tradeoffs first. |
  | `api-design` | The critical shared dependency is an API contract or integration boundary. |
  | `database-design` | The coordination risk centers on schema, migration, data ownership, or engine choice. |
- | `architecture-doc` | External sources, latest information, or public-repo comparisons are blocking confident execution. |
+ | `deep-research` | External sources, latest information, or public-repo comparisons are blocking confident execution. |
  | `mcp-server-builder` | One stream is MCP server design, tool shape, or transport selection. |
  | `tech-doc` | One stream needs current OpenAI docs or version-specific behavior verification. |
  | `prompt-engineering` | One stream is repairing prompts, agent rules, or instruction quality. |
@@ -155,7 +155,8 @@ ANTI-LAZINESS:
  5. **Iterate, don't accept mediocrity** — if output is incomplete or wrong, re-delegate with feedback.
  6. **Track progress visibly** — maintain a task list showing status of each work item.
  7. **Fail fast on blockers** — if a dependency is missing or a task is stuck after 3 iterations, escalate.
- 8. **Synthesize at the end** — combine outputs with concrete actions, risks, and verification evidence.
+ 8. **Route research explicitly** — when freshness or public comparison matters, delegate to `@researcher` or load `deep-research` before implementation.
+ 9. **Synthesize at the end** — combine outputs with concrete actions, risks, and verification evidence.
 
  ## Anti-Patterns to Prevent
 
@@ -185,6 +186,6 @@ ORCHESTRATION_RESULT:
  ```
 
  ## Skill routing
- Prefer these skills when task intent matches: `system-design`, `api-design`, `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`, `typescript-best-practices`, `javascript-best-practices`, `python-best-practices`.
+ Prefer these skills when task intent matches: `system-design`, `api-design`, `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`, `typescript-best-practices`, `javascript-best-practices`, `python-best-practices`.
 
  If none apply directly, use the closest specialist guidance and state the fallback.