multi-forge 0.2.0__py3-none-any.whl

Files changed (311)
  1. forge/__init__.py +3 -0
  2. forge/_extensions/agents/.gitkeep +0 -0
  3. forge/_extensions/commands/.gitkeep +0 -0
  4. forge/_extensions/skills/analyze/SKILL.md +87 -0
  5. forge/_extensions/skills/challenge/SKILL.md +91 -0
  6. forge/_extensions/skills/consensus/SKILL.md +120 -0
  7. forge/_extensions/skills/consensus/resources/code_consensus_evaluation.md +94 -0
  8. forge/_extensions/skills/consensus/resources/consensus_evaluation.md +70 -0
  9. forge/_extensions/skills/consensus/resources/synthesis.md +101 -0
  10. forge/_extensions/skills/debate/SKILL.md +116 -0
  11. forge/_extensions/skills/debate/resources/code_debate_evaluation.md +101 -0
  12. forge/_extensions/skills/debate/resources/debate_evaluation.md +90 -0
  13. forge/_extensions/skills/panel/SKILL.md +141 -0
  14. forge/_extensions/skills/panel/resources/synthesis.md +103 -0
  15. forge/_extensions/skills/qa/SKILL.md +704 -0
  16. forge/_extensions/skills/qa/resources/checklist/0-enable.md +78 -0
  17. forge/_extensions/skills/qa/resources/checklist/1-preflight.md +24 -0
  18. forge/_extensions/skills/qa/resources/checklist/10-resume.md +143 -0
  19. forge/_extensions/skills/qa/resources/checklist/11-config.md +150 -0
  20. forge/_extensions/skills/qa/resources/checklist/12-search.md +58 -0
  21. forge/_extensions/skills/qa/resources/checklist/13-guard.md +237 -0
  22. forge/_extensions/skills/qa/resources/checklist/14-workflow.md +305 -0
  23. forge/_extensions/skills/qa/resources/checklist/15-skills.md +155 -0
  24. forge/_extensions/skills/qa/resources/checklist/16-handoff.md +224 -0
  25. forge/_extensions/skills/qa/resources/checklist/17-info.md +50 -0
  26. forge/_extensions/skills/qa/resources/checklist/18-disable.md +84 -0
  27. forge/_extensions/skills/qa/resources/checklist/19-uninstall.md +146 -0
  28. forge/_extensions/skills/qa/resources/checklist/2-extensions.md +188 -0
  29. forge/_extensions/skills/qa/resources/checklist/20-cleanup.md +36 -0
  30. forge/_extensions/skills/qa/resources/checklist/3-auth.md +234 -0
  31. forge/_extensions/skills/qa/resources/checklist/4-proxy.md +481 -0
  32. forge/_extensions/skills/qa/resources/checklist/5-session.md +541 -0
  33. forge/_extensions/skills/qa/resources/checklist/6-hooks.md +275 -0
  34. forge/_extensions/skills/qa/resources/checklist/7-costs.md +309 -0
  35. forge/_extensions/skills/qa/resources/checklist/8-status-line.md +174 -0
  36. forge/_extensions/skills/qa/resources/checklist/9-direct-commands.md +146 -0
  37. forge/_extensions/skills/qa/resources/checklist.md +103 -0
  38. forge/_extensions/skills/qa/resources/report-template.md +62 -0
  39. forge/_extensions/skills/qa/scripts/start-container.sh +529 -0
  40. forge/_extensions/skills/qa/scripts/walkthrough-state.py +1137 -0
  41. forge/_extensions/skills/review/SKILL.md +125 -0
  42. forge/_extensions/skills/review/references/claude-4.6.md +474 -0
  43. forge/_extensions/skills/review/references/claude-4.7.md +710 -0
  44. forge/_extensions/skills/review/references/gemini-3.1.md +546 -0
  45. forge/_extensions/skills/review/references/gpt-5.5.md +490 -0
  46. forge/_extensions/skills/review/references/skills-writing-guide.md +1588 -0
  47. forge/_extensions/skills/review/resources/code-anthropic.md +160 -0
  48. forge/_extensions/skills/review/resources/code-gemini.md +184 -0
  49. forge/_extensions/skills/review/resources/code-openai.md +203 -0
  50. forge/_extensions/skills/review/resources/code.md +160 -0
  51. forge/_extensions/skills/review-docs/SKILL.md +121 -0
  52. forge/_extensions/skills/review-docs/resources/docs-anthropic.md +170 -0
  53. forge/_extensions/skills/review-docs/resources/docs-gemini.md +204 -0
  54. forge/_extensions/skills/review-docs/resources/docs-openai.md +231 -0
  55. forge/_extensions/skills/review-docs/resources/docs.md +170 -0
  56. forge/_extensions/skills/smoke-test/SKILL.md +27 -0
  57. forge/_extensions/skills/smoke-test/scripts/smoke-test.sh +118 -0
  58. forge/_extensions/skills/understand/SKILL.md +148 -0
  59. forge/_extensions/skills/understand/resources/code-anthropic.md +163 -0
  60. forge/_extensions/skills/understand/resources/code-gemini.md +194 -0
  61. forge/_extensions/skills/understand/resources/code-openai.md +181 -0
  62. forge/_extensions/skills/understand/resources/code.md +163 -0
  63. forge/_extensions/skills/understand/resources/docs-anthropic.md +177 -0
  64. forge/_extensions/skills/understand/resources/docs-gemini.md +202 -0
  65. forge/_extensions/skills/understand/resources/docs-openai.md +191 -0
  66. forge/_extensions/skills/understand/resources/docs.md +177 -0
  67. forge/_extensions/skills/walkthrough/SKILL.md +599 -0
  68. forge/_extensions/skills/walkthrough/resources/checklist.md +765 -0
  69. forge/_extensions/skills/walkthrough/scripts/run-in-repo.sh +118 -0
  70. forge/_extensions/skills/walkthrough/scripts/setup-test-repo.sh +198 -0
  71. forge/_extensions/skills/walkthrough/scripts/walkthrough-state.py +1137 -0
  72. forge/backend/__init__.py +174 -0
  73. forge/backend/adapters/__init__.py +38 -0
  74. forge/backend/adapters/litellm.py +158 -0
  75. forge/backend/creation.py +89 -0
  76. forge/backend/registry.py +178 -0
  77. forge/cli/__init__.py +16 -0
  78. forge/cli/auth.py +483 -0
  79. forge/cli/backend.py +298 -0
  80. forge/cli/claude.py +411 -0
  81. forge/cli/config_cmd.py +303 -0
  82. forge/cli/extensions.py +1001 -0
  83. forge/cli/gc.py +165 -0
  84. forge/cli/guard.py +1018 -0
  85. forge/cli/guards.py +106 -0
  86. forge/cli/handoff.py +110 -0
  87. forge/cli/hooks/__init__.py +36 -0
  88. forge/cli/hooks/_group.py +20 -0
  89. forge/cli/hooks/_helpers.py +149 -0
  90. forge/cli/hooks/commands.py +1677 -0
  91. forge/cli/hooks/direct_commands.py +1304 -0
  92. forge/cli/hooks/install.py +232 -0
  93. forge/cli/hooks/policy.py +151 -0
  94. forge/cli/hooks/read_hygiene.py +74 -0
  95. forge/cli/hooks/verification.py +370 -0
  96. forge/cli/logs.py +406 -0
  97. forge/cli/main.py +292 -0
  98. forge/cli/proxy.py +1821 -0
  99. forge/cli/proxy_costs.py +313 -0
  100. forge/cli/search.py +416 -0
  101. forge/cli/session.py +892 -0
  102. forge/cli/session_addendum.py +81 -0
  103. forge/cli/session_fork.py +750 -0
  104. forge/cli/session_handoff.py +141 -0
  105. forge/cli/session_lifecycle.py +2053 -0
  106. forge/cli/session_manage.py +1336 -0
  107. forge/cli/session_memory.py +201 -0
  108. forge/cli/status_line.py +1398 -0
  109. forge/cli/workflow.py +1964 -0
  110. forge/config/__init__.py +110 -0
  111. forge/config/dataclass_utils.py +88 -0
  112. forge/config/defaults/__init__.py +0 -0
  113. forge/config/defaults/backends/__init__.py +0 -0
  114. forge/config/defaults/backends/litellm.yaml +196 -0
  115. forge/config/defaults/templates/__init__.py +0 -0
  116. forge/config/defaults/templates/litellm-anthropic-local.yaml +33 -0
  117. forge/config/defaults/templates/litellm-anthropic.yaml +24 -0
  118. forge/config/defaults/templates/litellm-gemini-flash-local.yaml +37 -0
  119. forge/config/defaults/templates/litellm-gemini-local.yaml +32 -0
  120. forge/config/defaults/templates/litellm-gemini-test.yaml +34 -0
  121. forge/config/defaults/templates/litellm-gemini.yaml +21 -0
  122. forge/config/defaults/templates/litellm-openai-codex-local.yaml +36 -0
  123. forge/config/defaults/templates/litellm-openai-local.yaml +38 -0
  124. forge/config/defaults/templates/litellm-openai.yaml +28 -0
  125. forge/config/defaults/templates/openrouter-anthropic.yaml +23 -0
  126. forge/config/defaults/templates/openrouter-deepseek.yaml +26 -0
  127. forge/config/defaults/templates/openrouter-gemini-flash.yaml +26 -0
  128. forge/config/defaults/templates/openrouter-gemini.yaml +23 -0
  129. forge/config/defaults/templates/openrouter-glm.yaml +23 -0
  130. forge/config/defaults/templates/openrouter-kimi.yaml +30 -0
  131. forge/config/defaults/templates/openrouter-minimax.yaml +26 -0
  132. forge/config/defaults/templates/openrouter-openai-codex.yaml +23 -0
  133. forge/config/defaults/templates/openrouter-openai.yaml +28 -0
  134. forge/config/defaults/templates/openrouter-qwen.yaml +25 -0
  135. forge/config/loader.py +675 -0
  136. forge/config/schema.py +448 -0
  137. forge/core/__init__.py +5 -0
  138. forge/core/auth/__init__.py +67 -0
  139. forge/core/auth/capabilities.py +219 -0
  140. forge/core/auth/credentials_file.py +244 -0
  141. forge/core/auth/protocols.py +18 -0
  142. forge/core/auth/secrets.py +243 -0
  143. forge/core/auth/template_secrets.py +112 -0
  144. forge/core/data/__init__.py +5 -0
  145. forge/core/data/model_catalog.yaml +1522 -0
  146. forge/core/data/pricing.yaml +140 -0
  147. forge/core/data/system_prompt_addendums/__init__.py +0 -0
  148. forge/core/data/system_prompt_addendums/gemini.md +330 -0
  149. forge/core/data/system_prompt_addendums/openai.md +328 -0
  150. forge/core/llm/__init__.py +231 -0
  151. forge/core/llm/clients/__init__.py +14 -0
  152. forge/core/llm/clients/base.py +115 -0
  153. forge/core/llm/clients/litellm.py +619 -0
  154. forge/core/llm/clients/openai_compat.py +244 -0
  155. forge/core/llm/clients/openrouter.py +234 -0
  156. forge/core/llm/credentials.py +439 -0
  157. forge/core/llm/detection.py +86 -0
  158. forge/core/llm/errors.py +44 -0
  159. forge/core/llm/protocols.py +80 -0
  160. forge/core/llm/types.py +176 -0
  161. forge/core/logging.py +146 -0
  162. forge/core/models/__init__.py +91 -0
  163. forge/core/models/catalog.py +467 -0
  164. forge/core/models/pricing.py +165 -0
  165. forge/core/models/types.py +167 -0
  166. forge/core/naming.py +212 -0
  167. forge/core/ops/__init__.py +73 -0
  168. forge/core/ops/context.py +141 -0
  169. forge/core/ops/gc.py +802 -0
  170. forge/core/ops/proxy.py +146 -0
  171. forge/core/ops/resolution.py +135 -0
  172. forge/core/ops/session.py +344 -0
  173. forge/core/ops/session_context.py +548 -0
  174. forge/core/paths.py +38 -0
  175. forge/core/process.py +54 -0
  176. forge/core/reactive/__init__.py +38 -0
  177. forge/core/reactive/cost_tracking.py +300 -0
  178. forge/core/reactive/env.py +180 -0
  179. forge/core/reactive/proxy.py +78 -0
  180. forge/core/reactive/routing.py +622 -0
  181. forge/core/reactive/session_runner.py +185 -0
  182. forge/core/reactive/structured_output.py +62 -0
  183. forge/core/reactive/tagger.py +94 -0
  184. forge/core/reactive/throttle.py +132 -0
  185. forge/core/state/__init__.py +59 -0
  186. forge/core/state/exceptions.py +59 -0
  187. forge/core/state/io.py +140 -0
  188. forge/core/state/lock.py +99 -0
  189. forge/core/state/timestamps.py +60 -0
  190. forge/core/transcript.py +78 -0
  191. forge/core/typing_helpers.py +24 -0
  192. forge/core/workqueue/__init__.py +67 -0
  193. forge/core/workqueue/queue.py +552 -0
  194. forge/core/workqueue/types.py +63 -0
  195. forge/guard/__init__.py +26 -0
  196. forge/guard/deterministic/__init__.py +26 -0
  197. forge/guard/deterministic/base.py +158 -0
  198. forge/guard/deterministic/coding_standards.py +256 -0
  199. forge/guard/deterministic/registry.py +148 -0
  200. forge/guard/deterministic/tdd.py +171 -0
  201. forge/guard/engine.py +216 -0
  202. forge/guard/protocols.py +91 -0
  203. forge/guard/queries.py +96 -0
  204. forge/guard/semantic/__init__.py +34 -0
  205. forge/guard/semantic/promotion.py +18 -0
  206. forge/guard/semantic/supervisor.py +813 -0
  207. forge/guard/semantic/verdict.py +183 -0
  208. forge/guard/store.py +124 -0
  209. forge/guard/team/__init__.py +6 -0
  210. forge/guard/team/config.py +24 -0
  211. forge/guard/team/handlers.py +209 -0
  212. forge/guard/team/prompts.py +41 -0
  213. forge/guard/types.py +125 -0
  214. forge/guard/workflow/__init__.py +17 -0
  215. forge/guard/workflow/branches.py +67 -0
  216. forge/guard/workflow/config.py +63 -0
  217. forge/guard/workflow/divergence.py +113 -0
  218. forge/guard/workflow/policy.py +87 -0
  219. forge/guard/workflow/stages.py +205 -0
  220. forge/install/__init__.py +55 -0
  221. forge/install/cli.py +281 -0
  222. forge/install/exceptions.py +163 -0
  223. forge/install/hooks.py +109 -0
  224. forge/install/installer.py +1037 -0
  225. forge/install/models.py +321 -0
  226. forge/install/preset.py +272 -0
  227. forge/install/settings_merge.py +831 -0
  228. forge/install/tracking.py +238 -0
  229. forge/install/version.py +141 -0
  230. forge/proxy/__init__.py +0 -0
  231. forge/proxy/base_client.py +181 -0
  232. forge/proxy/client_adapter.py +476 -0
  233. forge/proxy/client_factory.py +531 -0
  234. forge/proxy/converters.py +1206 -0
  235. forge/proxy/cost_logger.py +132 -0
  236. forge/proxy/cost_tracker.py +242 -0
  237. forge/proxy/data_models.py +338 -0
  238. forge/proxy/error_hints.py +92 -0
  239. forge/proxy/metrics.py +222 -0
  240. forge/proxy/model_spec.py +158 -0
  241. forge/proxy/proxies.py +333 -0
  242. forge/proxy/proxy_identity.py +134 -0
  243. forge/proxy/proxy_orchestrator.py +1018 -0
  244. forge/proxy/proxy_startup.py +54 -0
  245. forge/proxy/server.py +1561 -0
  246. forge/proxy/utils.py +537 -0
  247. forge/review/__init__.py +6 -0
  248. forge/review/adversarial.py +111 -0
  249. forge/review/consensus.py +236 -0
  250. forge/review/engine.py +356 -0
  251. forge/review/models.py +437 -0
  252. forge/review/resources/__init__.py +5 -0
  253. forge/review/resources/codereview-performance.md +85 -0
  254. forge/review/resources/codereview-quick.md +75 -0
  255. forge/review/resources/codereview-security.md +92 -0
  256. forge/review/resources/codereview.md +85 -0
  257. forge/review/resources/docreview-quick.md +75 -0
  258. forge/review/resources/docreview.md +86 -0
  259. forge/review/resources/thinkdeep.md +89 -0
  260. forge/review/routing.py +368 -0
  261. forge/review/synthesis.py +73 -0
  262. forge/runtime_config.py +438 -0
  263. forge/search/__init__.py +55 -0
  264. forge/search/bm25_store.py +264 -0
  265. forge/search/content_store.py +197 -0
  266. forge/search/engine.py +352 -0
  267. forge/search/exceptions.py +51 -0
  268. forge/search/extractor.py +234 -0
  269. forge/search/index_state.py +295 -0
  270. forge/search/store.py +215 -0
  271. forge/search/tokenizer.py +24 -0
  272. forge/session/__init__.py +130 -0
  273. forge/session/active.py +339 -0
  274. forge/session/artifacts.py +202 -0
  275. forge/session/claude/__init__.py +50 -0
  276. forge/session/claude/cleanup.py +105 -0
  277. forge/session/claude/invoke.py +236 -0
  278. forge/session/claude/paths.py +200 -0
  279. forge/session/cleanup.py +216 -0
  280. forge/session/config.py +34 -0
  281. forge/session/direct_model.py +107 -0
  282. forge/session/effective.py +169 -0
  283. forge/session/exceptions.py +255 -0
  284. forge/session/handoff.py +881 -0
  285. forge/session/handoff_agent.py +544 -0
  286. forge/session/hooks/__init__.py +35 -0
  287. forge/session/hooks/models.py +73 -0
  288. forge/session/hooks/session_start.py +507 -0
  289. forge/session/identity.py +84 -0
  290. forge/session/index.py +553 -0
  291. forge/session/manager.py +1506 -0
  292. forge/session/models.py +572 -0
  293. forge/session/overrides.py +344 -0
  294. forge/session/plan_resolution.py +286 -0
  295. forge/session/prev_sessions.py +128 -0
  296. forge/session/store.py +431 -0
  297. forge/session/validation.py +47 -0
  298. forge/session/worktree/__init__.py +65 -0
  299. forge/session/worktree/cleanup.py +262 -0
  300. forge/session/worktree/config_copy.py +203 -0
  301. forge/session/worktree/create.py +332 -0
  302. forge/sidecar/__init__.py +29 -0
  303. forge/sidecar/container.py +161 -0
  304. forge/sidecar/docker.py +86 -0
  305. forge/sidecar/secrets.py +19 -0
  306. multi_forge-0.2.0.dist-info/METADATA +242 -0
  307. multi_forge-0.2.0.dist-info/RECORD +311 -0
  308. multi_forge-0.2.0.dist-info/WHEEL +4 -0
  309. multi_forge-0.2.0.dist-info/entry_points.txt +2 -0
  310. multi_forge-0.2.0.dist-info/licenses/LICENSE +203 -0
  311. multi_forge-0.2.0.dist-info/licenses/NOTICE +14 -0
@@ -0,0 +1,101 @@
1
+ # Adversarial Code Evaluation
2
+
3
+ ```xml
4
+ <role>
5
+ You are a senior code evaluator performing a structured adversarial assessment.
6
+ {stance_prompt}
7
+ You identify bugs, design issues, security concerns, and performance problems.
8
+ You provide actionable feedback with specific code references.
9
+ </role>
10
+
11
+ <behavior>
12
+ - Read all code in scope before forming opinions
13
+ - Cite specific file:line references for every finding
14
+ - Evaluate strictly on technical merits
15
+ - Support every claim with evidence or reasoning
16
+ - Cover ALL files in ONE pass -- do not present partial results
17
+ - Be specific: "potential null dereference at auth.py:45" not "might have issues"
18
+ - Provide a clear verdict with confidence level
19
+ </behavior>
20
+
21
+ <scope_constraints>
22
+ - Review only what's in scope
23
+ - Do not expand to adjacent code unless directly affected
24
+ - If tests exist for reviewed code, check them for coverage gaps
25
+ </scope_constraints>
26
+ ```
27
+
28
+ ---
29
+
30
+ ## Code Under Evaluation
31
+
32
+ {target}
33
+
34
+ ---
35
+
36
+ ## Evaluation Framework
37
+
38
+ ### 1. Quality
39
+
40
+ - Logic errors and edge cases
41
+ - Error handling: are errors caught, propagated, and surfaced correctly?
42
+ - Type safety: do type annotations match runtime behavior?
43
+ - Test coverage: are critical paths tested?
44
+
45
+ ### 2. Security
46
+
47
+ - Input validation at trust boundaries
48
+ - Injection vectors (command, SQL, path traversal)
49
+ - Secrets in code or logs
50
+ - Authentication and authorization gaps
51
+
52
+ ### 3. Performance
53
+
54
+ - Unnecessary allocations or copies in hot paths
55
+ - N+1 query patterns
56
+ - Missing caching where data is reused
57
+ - Blocking calls in async contexts
58
+
59
+ ### 4. Architecture
60
+
61
+ - Component boundaries: is coupling appropriate?
62
+ - Dependency direction: do imports flow the right way?
63
+ - Abstraction level: is complexity in the right place?
64
+ - Interface contracts: are public APIs stable and well-defined?
65
+
66
+ ### 5. Risks
67
+
68
+ - What could go wrong in production?
69
+ - What is the blast radius of failure?
70
+ - Missing error recovery or graceful degradation?
71
+ - Deployment or migration risks?
72
+
73
+ ### 6. Recommendation
74
+
75
+ - Overall verdict: ACCEPT, ACCEPT_WITH_CONDITIONS, or REJECT
76
+ - Confidence level: LOW, MEDIUM, HIGH
77
+ - Key conditions (if ACCEPT_WITH_CONDITIONS)
78
+
79
+ ---
80
+
81
+ ## Output Format
82
+
83
+ ````xml
84
+ <output_format>
85
+ Respond with a structured evaluation in JSON:
86
+
87
+ {
88
+ "verdict": "ACCEPT" | "ACCEPT_WITH_CONDITIONS" | "REJECT",
89
+ "confidence": "LOW" | "MEDIUM" | "HIGH",
90
+ "key_findings": [
91
+ {"category": "quality|security|performance|architecture|risks",
92
+ "finding": "specific finding with file:line reference",
93
+ "severity": "critical|high|medium|low"}
94
+ ],
95
+ "recommendation": "1-2 sentence summary of your recommendation",
96
+ "conditions": ["condition 1", "condition 2"]
97
+ }
98
+
99
+ Wrap the JSON in a ```json code fence.
100
+ </output_format>
101
+ ````
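The output format above asks for a JSON object wrapped in a json code fence. The sketch below shows one way a caller might extract and sanity-check such a response. The function name `parse_evaluation` and the validation rules are illustrative assumptions; the diff does not show how Forge itself parses these responses.

```python
import json
import re

# Allowed values, copied from the output format in the prompt above.
VERDICTS = {"ACCEPT", "ACCEPT_WITH_CONDITIONS", "REJECT"}
CONFIDENCE_LEVELS = {"LOW", "MEDIUM", "HIGH"}
CATEGORIES = {"quality", "security", "performance", "architecture", "risks"}
SEVERITIES = {"critical", "high", "medium", "low"}

# Build the fence marker from single backticks so this example can itself
# live inside a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def parse_evaluation(response: str) -> dict:
    """Extract the fenced JSON evaluation and check its enumerated fields.

    Hypothetical helper: raises ValueError when the fence is missing or a
    field holds a value outside the sets listed above.
    """
    match = FENCE_RE.search(response)
    if match is None:
        raise ValueError("no json code fence found in response")
    data = json.loads(match.group(1))

    if data.get("verdict") not in VERDICTS:
        raise ValueError(f"unexpected verdict: {data.get('verdict')!r}")
    if data.get("confidence") not in CONFIDENCE_LEVELS:
        raise ValueError(f"unexpected confidence: {data.get('confidence')!r}")
    for finding in data.get("key_findings", []):
        if finding.get("category") not in CATEGORIES:
            raise ValueError(f"unexpected category: {finding.get('category')!r}")
        if finding.get("severity") not in SEVERITIES:
            raise ValueError(f"unexpected severity: {finding.get('severity')!r}")
    return data
```

Validating the enumerated fields up front lets a caller reject a malformed evaluation before it reaches any later aggregation step.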
@@ -0,0 +1,90 @@
1
+ # Structured Evaluation
2
+
3
+ ```xml
4
+ <role>
5
+ You are a technical evaluator performing a structured assessment.
6
+ {stance_prompt}
7
+ </role>
8
+
9
+ <behavior>
10
+ - Evaluate strictly on technical merits
11
+ - Support every claim with evidence or reasoning
12
+ - Be specific: cite exact trade-offs, not vague concerns
13
+ - Provide a clear verdict with confidence level
14
+ </behavior>
15
+ ```
16
+
17
+ ---
18
+
19
+ ## Proposal Under Evaluation
20
+
21
+ {proposal}
22
+
23
+ ---
24
+
25
+ ## Evaluation Framework
26
+
27
+ ### 1. Feasibility
28
+
29
+ - Can this be implemented with the available technology and resources?
30
+ - What are the key technical dependencies?
31
+ - Are there proven precedents or is this novel?
32
+
33
+ ### 2. Correctness
34
+
35
+ - Does the proposal solve the stated problem?
36
+ - Are there logical gaps or incorrect assumptions?
37
+ - Does it handle edge cases and failure modes?
38
+
39
+ ### 3. Trade-offs
40
+
41
+ - What does this approach gain vs alternatives?
42
+ - What does it cost (complexity, performance, maintenance)?
43
+ - Are the trade-offs appropriate for the context?
44
+
45
+ ### 4. Risks
46
+
47
+ - What could go wrong in implementation?
48
+ - What could go wrong in production?
49
+ - What is the blast radius of failure?
50
+
51
+ ### 5. Completeness
52
+
53
+ - Are all requirements addressed?
54
+ - Are there missing considerations?
55
+ - What would need to be added before this is production-ready?
56
+
57
+ ### 6. Alternatives
58
+
59
+ - What other approaches could solve this problem?
60
+ - Why might they be better or worse?
61
+
62
+ ### 7. Recommendation
63
+
64
+ - Overall verdict: ACCEPT, ACCEPT_WITH_CONDITIONS, or REJECT
65
+ - Confidence level: LOW, MEDIUM, HIGH
66
+ - Key conditions (if ACCEPT_WITH_CONDITIONS)
67
+
68
+ ---
69
+
70
+ ## Output Format
71
+
72
+ ````xml
73
+ <output_format>
74
+ Respond with a structured evaluation in JSON:
75
+
76
+ {
77
+ "verdict": "ACCEPT" | "ACCEPT_WITH_CONDITIONS" | "REJECT",
78
+ "confidence": "LOW" | "MEDIUM" | "HIGH",
79
+ "key_findings": [
80
+ {"category": "feasibility|correctness|trade-offs|risks|completeness",
81
+ "finding": "specific finding",
82
+ "severity": "critical|high|medium|low"}
83
+ ],
84
+ "recommendation": "1-2 sentence summary of your recommendation",
85
+ "conditions": ["condition 1", "condition 2"]
86
+ }
87
+
88
+ Wrap the JSON in a ```json code fence.
89
+ </output_format>
90
+ ````
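The template above mixes named placeholders (`{stance_prompt}`, `{proposal}`) with literal JSON braces in its output format, so plain `str.format` would fail on the JSON example. The sketch below substitutes only the known placeholders. How Forge actually renders these templates is not shown in this diff; `render_template` is a hypothetical helper.

```python
import re

# Matches {name} where name is a lowercase identifier, e.g. {proposal}.
# JSON braces in the template are followed by a newline or a quote, so they
# do not match this pattern.
PLACEHOLDER_RE = re.compile(r"\{([a-z_]+)\}")


def render_template(template: str, values: dict[str, str]) -> str:
    """Fill known {name} placeholders and leave every other brace untouched."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are returned unchanged rather than raising.
        return values.get(name, match.group(0))

    return PLACEHOLDER_RE.sub(substitute, template)
```

Leaving unknown placeholders in place is a design choice for this sketch; a stricter variant could raise on any name missing from `values`.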
@@ -0,0 +1,141 @@
1
+ ---
2
+ name: forge:panel
3
+ description: Multi-model panel review. Multiple models review independently, then findings are synthesized.
4
+ disable-model-invocation: true
5
+ argument-hint: '[target: path or instruction] [--output path] [--code] [--models m1,m2] [--roles r1,r2] [--review-type type] [--severity level]'
6
+ context: fork
7
+ allowed-tools: Bash, Read
8
+ ---
9
+
10
+ # Panel Review
11
+
12
+ Run a panel review: fan out the same review task to multiple models in parallel, then synthesize the findings.
13
+
14
+ ## Usage
15
+
16
+ ```
17
+ /forge:panel [target] [--code] [--models model1,model2]
18
+ ```
19
+
20
+ ## Arguments
21
+
22
+ | Argument | Required | Description |
23
+ | --------------- | -------- | ---------------------------------------------------------------------------- |
24
+ | `target` | Optional | File, directory, or instruction on what to review (defaults to cwd) |
25
+ | `--code` | Optional | Switch: use code review framework (default: document review) |
26
+ | `--models` | Optional | Comma-separated model list (default: Forge workflow defaults) |
27
+ | `--roles` | Optional | Comma-separated reviewer roles (security, performance, architecture, ...) |
28
+ | `--review-type` | Optional | Review focus: full, security, performance, quick (security/perf need --code) |
29
+ | `--severity` | Optional | Minimum severity to report: high or critical |
30
+ | `--output` | Optional | Write result to file instead of conversation (e.g., `review.md`) |
31
+
32
+ **Available models:** !`forge workflow list-models`
33
+
34
+ Only use models with status **ready** in the table above. If the default set includes unavailable models, pass
35
+ `--models <ready models>` explicitly. If the user explicitly requested an unavailable model, stop and tell them what
36
+ proxy or credential is missing rather than silently substituting. If no models are ready, tell the user what's missing
37
+ and stop.
38
+
39
+ ## Models Used
40
+
41
+ | Model | Strength | Via |
42
+ | ------------------------ | ----------------------------------- | ----------------------- |
43
+ | `gpt-5.5` | Logical problems, systematic review | openrouter-openai proxy |
44
+ | `gemini-3.1-pro-preview` | Balanced analysis, large context | openrouter-gemini proxy |
45
+ | `claude-opus` | Stable Claude Opus 4.6 reasoning | Direct Anthropic |
46
+
47
+ Selectable direct Claude workers include `claude-opus-4.6`, `claude-opus-4.6-1m`, and `claude-opus-4.7`. Use
48
+ `claude-opus-4.7` as a bounded review/quorum worker when the prompt has a concrete target and should require file:line
49
+ evidence. You can include both 4.6 and 4.7 in one panel, for example:
50
+
51
+ ```bash
52
+ forge workflow panel src/ --code --models claude-opus-4.6,claude-opus-4.7 --json --cwd "$(pwd)"
53
+ ```
54
+
55
+ ---
56
+
57
+ ## Execution
58
+
59
+ ### Step 1: Resolve Target and Flags
60
+
61
+ Parse `$ARGUMENTS` into a positional target and optional flags. The target is the first non-flag value (file path,
62
+ directory, or free-form instruction). Strip any leading `@` prefix on the target (Claude Code file reference syntax). If
63
+ no target is found, default to the current working directory.
64
+
65
+ Recognized flags (extract from `$ARGUMENTS` if present):
66
+
67
+ - `--code` — switch
68
+ - `--models <value>` — comma-separated model list
69
+ - `--roles <value>` — comma-separated role list
70
+ - `--review-type <value>` — one of: full, security, performance, quick
71
+ - `--severity <value>` — one of: high, critical
72
+ - `--output <path>` — write result to file instead of conversation
73
+
74
+ Never ask the user to clarify. If `$ARGUMENTS` contains anything, proceed immediately.
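The parsing rules in Step 1 can be sketched in Python as follows. This is an illustration only: in the skill the model itself interprets `$ARGUMENTS`, and the function name, the return shape, and the choice to join multiple non-flag words into one free-form target are assumptions.

```python
import shlex

# Flags listed in Step 1; value flags consume the following token.
VALUE_FLAGS = {"--models", "--roles", "--review-type", "--severity", "--output"}
SWITCH_FLAGS = {"--code"}


def parse_panel_arguments(arguments: str, cwd: str = ".") -> dict:
    """Split an argument string into a target and recognized flags."""
    try:
        tokens = shlex.split(arguments)
    except ValueError:
        # Unbalanced quotes in a free-form instruction: fall back to whitespace.
        tokens = arguments.split()

    flags: dict = {}
    positional: list[str] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in SWITCH_FLAGS:
            flags[token] = True
        elif token in VALUE_FLAGS and i + 1 < len(tokens):
            flags[token] = tokens[i + 1]
            i += 1  # skip the value just consumed
        else:
            positional.append(token)
        i += 1

    # Assumption: a multi-word instruction is treated as a single target.
    target = " ".join(positional) if positional else cwd
    # Strip the leading @ used by Claude Code file references.
    target = target.removeprefix("@")
    return {"target": target, "flags": flags}
```

For example, `parse_panel_arguments("@src/ --code --severity high")` yields the target `src/` with `--code` set and `--severity` equal to `high`.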
75
+
76
+ ### Step 2: Run Multi-Model Review
77
+
78
+ Execute the panel workflow, forwarding all parsed flags:
79
+
80
+ ```bash
81
+ forge workflow panel <target> [--code] [--models <models>] [--roles <roles>] [--review-type <type>] [--severity <sev>] --json --cwd "$(pwd)"
82
+ ```
83
+
84
+ Omit any flag the user didn't specify.
85
+
86
+ Parse the JSON output. The structure is:
87
+
88
+ ```json
89
+ {
90
+ "prompt": "...",
91
+ "results": {
92
+ "gpt-5.5": {"response": "...", "error": null, "success": true, "duration_seconds": 45.2},
93
+ "gemini-3.1-pro-preview": {"response": "...", "error": null, "success": true, "duration_seconds": 38.1},
94
+ "claude-opus": {"response": "...", "error": null, "success": true, "duration_seconds": 52.7}
95
+ },
96
+ "resolved_models": {
97
+ "gpt-5.5": {
98
+ "requested_model": "gpt-5.5",
99
+ "resolved_model": "openai/gpt-5.5",
100
+ "provider": "openrouter",
101
+ "proxy": "openrouter-openai",
102
+ "template": "openrouter-openai",
103
+ "source": "preferred_proxy"
104
+ }
105
+ },
106
+ "successful": 3,
107
+ "failed": 0
108
+ }
109
+ ```
110
+
111
+ ### Step 3: Synthesize Results
112
+
113
+ Read `${CLAUDE_SKILL_DIR}/resources/synthesis.md` for synthesis instructions. If the file is missing, report the actual
114
+ missing-path problem and stop. Then respond with:
115
+
116
+ 0. Resolved models used: one line per worker from `resolved_models`, including requested model, resolved model ref,
117
+ provider, proxy, and template
118
+ 1. Consensus issues (found by 2+ models)
119
+ 2. Unique findings from each model
120
+ 3. Conflict resolution
121
+ 4. Unified priority list
122
+ 5. Suggested fix order based on dependencies
123
+
124
+ **Output routing:** If `--output` was specified, write the complete synthesis to that path using the Write tool (create
125
+ parent directories if needed). Print a one-line confirmation: `Wrote synthesis to {path}`. Do not also print the full
126
+ result in the conversation. If `--output` was not specified, print the result in the conversation as usual.
127
+
128
+ ---
129
+
130
+ ## Error Handling
131
+
132
+ - If 1 model fails: Include its error, synthesize from successful models
133
+ - If 2+ models fail: Report failure, do not attempt synthesis
134
+ - If proxy not available: `forge workflow panel` skips that model and reports the error in JSON
135
+
136
+ ## Requirements
137
+
138
+ - **Forge CLI**: `forge` must be on PATH
139
+ - **Claude CLI**: workflow workers run through local `claude -p`; `claude` must be on PATH in this Bash environment
140
+ - **Proxies**: GPT-5.5 and Gemini require active proxies (e.g., `forge proxy create openrouter-openai`)
141
+ - **List available models**: `forge workflow list-models`
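The error-handling rules above, applied to the JSON structure from Step 2, can be sketched in Python as below. The function name and return shape are hypothetical, and the extra guard requiring at least one successful model is an assumption added for panels that run a single model.

```python
import json


def triage_panel_output(raw: str) -> dict:
    """Decide whether to synthesize, given `forge workflow panel --json` output.

    Follows the rules in Error Handling: one failure is tolerated and
    reported, two or more failures abort synthesis.
    """
    payload = json.loads(raw)
    results = payload["results"]

    # Split per-model results on the "success" field shown in Step 2.
    succeeded = {model: r["response"] for model, r in results.items() if r["success"]}
    failed = {model: r["error"] for model, r in results.items() if not r["success"]}

    return {
        # Assumption: also require at least one successful response.
        "synthesize": len(failed) < 2 and bool(succeeded),
        "responses": succeeded,
        "errors": failed,
    }
```

Keeping the failed models' errors in the result lets the synthesis step report them alongside the findings from the models that succeeded.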
@@ -0,0 +1,103 @@
1
+ # Multi-Model Synthesis Instructions
2
+
3
+ You have received responses from multiple AI models reviewing the same target. Your task is to synthesize these into a
4
+ unified, actionable report.
5
+
6
+ ## Synthesis Framework
7
+
8
+ ### 1. Identify Consensus Issues
9
+
10
+ Issues found by **2 or more models** have higher confidence. List these first:
11
+
12
+ ```markdown
13
+ ## Consensus Findings (High Confidence)
14
+
15
+ ### Critical
16
+ - **[Issue]** (found by: gpt-5.5, gemini-3.1-pro-preview)
17
+ - Location: `file.py:123`
18
+ - Impact: [description]
19
+ - Fix: [suggestion]
20
+ ```
21
+
22
+ ### 2. Catalog Unique Findings
23
+
24
+ Each model has different strengths. Unique findings may be valid insights others missed:
25
+
26
+ | Model | Strength | Unique Finding Type |
27
+ | ---------------------- | ------------ | ------------------------------------------ |
28
+ | gpt-5.5 | Logic errors | Edge cases, off-by-one, null handling |
29
+ | gemini-3.1-pro-preview | Pragmatic | Missing tests, documentation gaps |
30
+ | claude-opus | Architecture | Coupling, abstraction leaks, design issues |
31
+
32
+ ### 3. Resolve Conflicts
33
+
34
+ When models disagree:
35
+
36
+ 1. **Examine the target** directly to verify claims
37
+ 2. **Consider context** - is one model misunderstanding the target's conventions?
38
+ 3. **Note uncertainty** if unresolvable
39
+
40
+ ```markdown
41
+ ## Disputed Findings
42
+
43
+ - **[Issue]**: gpt-5.5 says X, gemini-3.1-pro-preview says Y
44
+ - My assessment: [your determination after examining the target]
45
+ ```
46
+
47
+ ### 3.5. Extract Cross-Review Insights
48
+
49
+ What does the *combination* of reviews reveal that no single review shows?
50
+
51
+ 1. **Convergence patterns**: Do independent reviewers flag the same subsystem or concern, even with different framing?
52
+ Convergence on the same area raises its importance.
53
+ 2. **Blind spots from disagreement**: When one model flags a risk that others ignore, note whether the others lacked
54
+ evidence or lacked the analytical frame to see it.
55
+ 3. **Severity calibration**: Note where reviewers disagree on severity -- the spread itself is informative.
56
+ 4. **Mechanical/parsing findings**: Findings based on literal parsing (syntax errors, invalid markup, broken links,
57
+ wrong field names) are uniquely valuable from multi-model review. Elevate these regardless of which single model
58
+ found them.
59
+
60
+ ### 4. Create Unified Priority List
61
+
62
+ Rank all validated findings by:
63
+
64
+ 1. **Severity**: Critical > High > Medium > Low
65
+ 2. **Confidence**: Consensus > Unique (verified) > Unique (unverified)
66
+ 3. **Scope**: Widespread > Isolated
67
+
68
+ ### 5. Suggest Fix Order
69
+
70
+ Consider dependencies when ordering fixes:
71
+
72
+ ```markdown
73
+ ## Recommended Fix Order
74
+
75
+ 1. [Critical issue] - blocks other fixes
76
+ 2. [High issue] - foundation for others
77
+ 3. [Medium issues] - can be parallelized
78
+ 4. [Low issues] - nice to have
79
+ ```
80
+
81
+ ## Output Format
82
+
83
+ ```markdown
84
+ # Multi-Model Review: [Target Name]
85
+
86
+ ## Summary
87
+ - Models consulted: 3 (gpt-5.5, gemini-3.1-pro-preview, claude-opus)
88
+ - Consensus issues: N
89
+ - Unique findings: N
90
+ - Conflicts resolved: N
91
+
92
+ ## Consensus Findings (High Confidence)
93
+ [...]
94
+
95
+ ## Unique Findings Worth Noting
96
+ [...]
97
+
98
+ ## Disputed or Uncertain
99
+ [...]
100
+
101
+ ## Recommended Fix Order
102
+ [...]
103
+ ```
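The ranking in section 4 (severity, then confidence, then scope) can be read as a lexicographic sort. The sketch below assumes that reading; the `Finding` fields and the rule that two or more reporting models count as consensus mirror the text above, but the data model itself is hypothetical.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str            # critical | high | medium | low
    models: tuple[str, ...]  # models that reported the finding
    verified: bool = False   # checked directly against the target
    widespread: bool = False # affects more than an isolated spot


def confidence_rank(finding: Finding) -> int:
    """Consensus (2+ models) > unique verified > unique unverified."""
    if len(finding.models) >= 2:
        return 0
    return 1 if finding.verified else 2


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Sort by severity, then confidence, then scope (widespread first)."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_RANK[f.severity],
            confidence_rank(f),
            0 if f.widespread else 1,
        ),
    )
```

Because `sorted` is stable, findings that tie on all three keys keep the order in which they were collected.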