multi-forge 0.2.0__py3-none-any.whl

Files changed (311)
  1. forge/__init__.py +3 -0
  2. forge/_extensions/agents/.gitkeep +0 -0
  3. forge/_extensions/commands/.gitkeep +0 -0
  4. forge/_extensions/skills/analyze/SKILL.md +87 -0
  5. forge/_extensions/skills/challenge/SKILL.md +91 -0
  6. forge/_extensions/skills/consensus/SKILL.md +120 -0
  7. forge/_extensions/skills/consensus/resources/code_consensus_evaluation.md +94 -0
  8. forge/_extensions/skills/consensus/resources/consensus_evaluation.md +70 -0
  9. forge/_extensions/skills/consensus/resources/synthesis.md +101 -0
  10. forge/_extensions/skills/debate/SKILL.md +116 -0
  11. forge/_extensions/skills/debate/resources/code_debate_evaluation.md +101 -0
  12. forge/_extensions/skills/debate/resources/debate_evaluation.md +90 -0
  13. forge/_extensions/skills/panel/SKILL.md +141 -0
  14. forge/_extensions/skills/panel/resources/synthesis.md +103 -0
  15. forge/_extensions/skills/qa/SKILL.md +704 -0
  16. forge/_extensions/skills/qa/resources/checklist/0-enable.md +78 -0
  17. forge/_extensions/skills/qa/resources/checklist/1-preflight.md +24 -0
  18. forge/_extensions/skills/qa/resources/checklist/10-resume.md +143 -0
  19. forge/_extensions/skills/qa/resources/checklist/11-config.md +150 -0
  20. forge/_extensions/skills/qa/resources/checklist/12-search.md +58 -0
  21. forge/_extensions/skills/qa/resources/checklist/13-guard.md +237 -0
  22. forge/_extensions/skills/qa/resources/checklist/14-workflow.md +305 -0
  23. forge/_extensions/skills/qa/resources/checklist/15-skills.md +155 -0
  24. forge/_extensions/skills/qa/resources/checklist/16-handoff.md +224 -0
  25. forge/_extensions/skills/qa/resources/checklist/17-info.md +50 -0
  26. forge/_extensions/skills/qa/resources/checklist/18-disable.md +84 -0
  27. forge/_extensions/skills/qa/resources/checklist/19-uninstall.md +146 -0
  28. forge/_extensions/skills/qa/resources/checklist/2-extensions.md +188 -0
  29. forge/_extensions/skills/qa/resources/checklist/20-cleanup.md +36 -0
  30. forge/_extensions/skills/qa/resources/checklist/3-auth.md +234 -0
  31. forge/_extensions/skills/qa/resources/checklist/4-proxy.md +481 -0
  32. forge/_extensions/skills/qa/resources/checklist/5-session.md +541 -0
  33. forge/_extensions/skills/qa/resources/checklist/6-hooks.md +275 -0
  34. forge/_extensions/skills/qa/resources/checklist/7-costs.md +309 -0
  35. forge/_extensions/skills/qa/resources/checklist/8-status-line.md +174 -0
  36. forge/_extensions/skills/qa/resources/checklist/9-direct-commands.md +146 -0
  37. forge/_extensions/skills/qa/resources/checklist.md +103 -0
  38. forge/_extensions/skills/qa/resources/report-template.md +62 -0
  39. forge/_extensions/skills/qa/scripts/start-container.sh +529 -0
  40. forge/_extensions/skills/qa/scripts/walkthrough-state.py +1137 -0
  41. forge/_extensions/skills/review/SKILL.md +125 -0
  42. forge/_extensions/skills/review/references/claude-4.6.md +474 -0
  43. forge/_extensions/skills/review/references/claude-4.7.md +710 -0
  44. forge/_extensions/skills/review/references/gemini-3.1.md +546 -0
  45. forge/_extensions/skills/review/references/gpt-5.5.md +490 -0
  46. forge/_extensions/skills/review/references/skills-writing-guide.md +1588 -0
  47. forge/_extensions/skills/review/resources/code-anthropic.md +160 -0
  48. forge/_extensions/skills/review/resources/code-gemini.md +184 -0
  49. forge/_extensions/skills/review/resources/code-openai.md +203 -0
  50. forge/_extensions/skills/review/resources/code.md +160 -0
  51. forge/_extensions/skills/review-docs/SKILL.md +121 -0
  52. forge/_extensions/skills/review-docs/resources/docs-anthropic.md +170 -0
  53. forge/_extensions/skills/review-docs/resources/docs-gemini.md +204 -0
  54. forge/_extensions/skills/review-docs/resources/docs-openai.md +231 -0
  55. forge/_extensions/skills/review-docs/resources/docs.md +170 -0
  56. forge/_extensions/skills/smoke-test/SKILL.md +27 -0
  57. forge/_extensions/skills/smoke-test/scripts/smoke-test.sh +118 -0
  58. forge/_extensions/skills/understand/SKILL.md +148 -0
  59. forge/_extensions/skills/understand/resources/code-anthropic.md +163 -0
  60. forge/_extensions/skills/understand/resources/code-gemini.md +194 -0
  61. forge/_extensions/skills/understand/resources/code-openai.md +181 -0
  62. forge/_extensions/skills/understand/resources/code.md +163 -0
  63. forge/_extensions/skills/understand/resources/docs-anthropic.md +177 -0
  64. forge/_extensions/skills/understand/resources/docs-gemini.md +202 -0
  65. forge/_extensions/skills/understand/resources/docs-openai.md +191 -0
  66. forge/_extensions/skills/understand/resources/docs.md +177 -0
  67. forge/_extensions/skills/walkthrough/SKILL.md +599 -0
  68. forge/_extensions/skills/walkthrough/resources/checklist.md +765 -0
  69. forge/_extensions/skills/walkthrough/scripts/run-in-repo.sh +118 -0
  70. forge/_extensions/skills/walkthrough/scripts/setup-test-repo.sh +198 -0
  71. forge/_extensions/skills/walkthrough/scripts/walkthrough-state.py +1137 -0
  72. forge/backend/__init__.py +174 -0
  73. forge/backend/adapters/__init__.py +38 -0
  74. forge/backend/adapters/litellm.py +158 -0
  75. forge/backend/creation.py +89 -0
  76. forge/backend/registry.py +178 -0
  77. forge/cli/__init__.py +16 -0
  78. forge/cli/auth.py +483 -0
  79. forge/cli/backend.py +298 -0
  80. forge/cli/claude.py +411 -0
  81. forge/cli/config_cmd.py +303 -0
  82. forge/cli/extensions.py +1001 -0
  83. forge/cli/gc.py +165 -0
  84. forge/cli/guard.py +1018 -0
  85. forge/cli/guards.py +106 -0
  86. forge/cli/handoff.py +110 -0
  87. forge/cli/hooks/__init__.py +36 -0
  88. forge/cli/hooks/_group.py +20 -0
  89. forge/cli/hooks/_helpers.py +149 -0
  90. forge/cli/hooks/commands.py +1677 -0
  91. forge/cli/hooks/direct_commands.py +1304 -0
  92. forge/cli/hooks/install.py +232 -0
  93. forge/cli/hooks/policy.py +151 -0
  94. forge/cli/hooks/read_hygiene.py +74 -0
  95. forge/cli/hooks/verification.py +370 -0
  96. forge/cli/logs.py +406 -0
  97. forge/cli/main.py +292 -0
  98. forge/cli/proxy.py +1821 -0
  99. forge/cli/proxy_costs.py +313 -0
  100. forge/cli/search.py +416 -0
  101. forge/cli/session.py +892 -0
  102. forge/cli/session_addendum.py +81 -0
  103. forge/cli/session_fork.py +750 -0
  104. forge/cli/session_handoff.py +141 -0
  105. forge/cli/session_lifecycle.py +2053 -0
  106. forge/cli/session_manage.py +1336 -0
  107. forge/cli/session_memory.py +201 -0
  108. forge/cli/status_line.py +1398 -0
  109. forge/cli/workflow.py +1964 -0
  110. forge/config/__init__.py +110 -0
  111. forge/config/dataclass_utils.py +88 -0
  112. forge/config/defaults/__init__.py +0 -0
  113. forge/config/defaults/backends/__init__.py +0 -0
  114. forge/config/defaults/backends/litellm.yaml +196 -0
  115. forge/config/defaults/templates/__init__.py +0 -0
  116. forge/config/defaults/templates/litellm-anthropic-local.yaml +33 -0
  117. forge/config/defaults/templates/litellm-anthropic.yaml +24 -0
  118. forge/config/defaults/templates/litellm-gemini-flash-local.yaml +37 -0
  119. forge/config/defaults/templates/litellm-gemini-local.yaml +32 -0
  120. forge/config/defaults/templates/litellm-gemini-test.yaml +34 -0
  121. forge/config/defaults/templates/litellm-gemini.yaml +21 -0
  122. forge/config/defaults/templates/litellm-openai-codex-local.yaml +36 -0
  123. forge/config/defaults/templates/litellm-openai-local.yaml +38 -0
  124. forge/config/defaults/templates/litellm-openai.yaml +28 -0
  125. forge/config/defaults/templates/openrouter-anthropic.yaml +23 -0
  126. forge/config/defaults/templates/openrouter-deepseek.yaml +26 -0
  127. forge/config/defaults/templates/openrouter-gemini-flash.yaml +26 -0
  128. forge/config/defaults/templates/openrouter-gemini.yaml +23 -0
  129. forge/config/defaults/templates/openrouter-glm.yaml +23 -0
  130. forge/config/defaults/templates/openrouter-kimi.yaml +30 -0
  131. forge/config/defaults/templates/openrouter-minimax.yaml +26 -0
  132. forge/config/defaults/templates/openrouter-openai-codex.yaml +23 -0
  133. forge/config/defaults/templates/openrouter-openai.yaml +28 -0
  134. forge/config/defaults/templates/openrouter-qwen.yaml +25 -0
  135. forge/config/loader.py +675 -0
  136. forge/config/schema.py +448 -0
  137. forge/core/__init__.py +5 -0
  138. forge/core/auth/__init__.py +67 -0
  139. forge/core/auth/capabilities.py +219 -0
  140. forge/core/auth/credentials_file.py +244 -0
  141. forge/core/auth/protocols.py +18 -0
  142. forge/core/auth/secrets.py +243 -0
  143. forge/core/auth/template_secrets.py +112 -0
  144. forge/core/data/__init__.py +5 -0
  145. forge/core/data/model_catalog.yaml +1522 -0
  146. forge/core/data/pricing.yaml +140 -0
  147. forge/core/data/system_prompt_addendums/__init__.py +0 -0
  148. forge/core/data/system_prompt_addendums/gemini.md +330 -0
  149. forge/core/data/system_prompt_addendums/openai.md +328 -0
  150. forge/core/llm/__init__.py +231 -0
  151. forge/core/llm/clients/__init__.py +14 -0
  152. forge/core/llm/clients/base.py +115 -0
  153. forge/core/llm/clients/litellm.py +619 -0
  154. forge/core/llm/clients/openai_compat.py +244 -0
  155. forge/core/llm/clients/openrouter.py +234 -0
  156. forge/core/llm/credentials.py +439 -0
  157. forge/core/llm/detection.py +86 -0
  158. forge/core/llm/errors.py +44 -0
  159. forge/core/llm/protocols.py +80 -0
  160. forge/core/llm/types.py +176 -0
  161. forge/core/logging.py +146 -0
  162. forge/core/models/__init__.py +91 -0
  163. forge/core/models/catalog.py +467 -0
  164. forge/core/models/pricing.py +165 -0
  165. forge/core/models/types.py +167 -0
  166. forge/core/naming.py +212 -0
  167. forge/core/ops/__init__.py +73 -0
  168. forge/core/ops/context.py +141 -0
  169. forge/core/ops/gc.py +802 -0
  170. forge/core/ops/proxy.py +146 -0
  171. forge/core/ops/resolution.py +135 -0
  172. forge/core/ops/session.py +344 -0
  173. forge/core/ops/session_context.py +548 -0
  174. forge/core/paths.py +38 -0
  175. forge/core/process.py +54 -0
  176. forge/core/reactive/__init__.py +38 -0
  177. forge/core/reactive/cost_tracking.py +300 -0
  178. forge/core/reactive/env.py +180 -0
  179. forge/core/reactive/proxy.py +78 -0
  180. forge/core/reactive/routing.py +622 -0
  181. forge/core/reactive/session_runner.py +185 -0
  182. forge/core/reactive/structured_output.py +62 -0
  183. forge/core/reactive/tagger.py +94 -0
  184. forge/core/reactive/throttle.py +132 -0
  185. forge/core/state/__init__.py +59 -0
  186. forge/core/state/exceptions.py +59 -0
  187. forge/core/state/io.py +140 -0
  188. forge/core/state/lock.py +99 -0
  189. forge/core/state/timestamps.py +60 -0
  190. forge/core/transcript.py +78 -0
  191. forge/core/typing_helpers.py +24 -0
  192. forge/core/workqueue/__init__.py +67 -0
  193. forge/core/workqueue/queue.py +552 -0
  194. forge/core/workqueue/types.py +63 -0
  195. forge/guard/__init__.py +26 -0
  196. forge/guard/deterministic/__init__.py +26 -0
  197. forge/guard/deterministic/base.py +158 -0
  198. forge/guard/deterministic/coding_standards.py +256 -0
  199. forge/guard/deterministic/registry.py +148 -0
  200. forge/guard/deterministic/tdd.py +171 -0
  201. forge/guard/engine.py +216 -0
  202. forge/guard/protocols.py +91 -0
  203. forge/guard/queries.py +96 -0
  204. forge/guard/semantic/__init__.py +34 -0
  205. forge/guard/semantic/promotion.py +18 -0
  206. forge/guard/semantic/supervisor.py +813 -0
  207. forge/guard/semantic/verdict.py +183 -0
  208. forge/guard/store.py +124 -0
  209. forge/guard/team/__init__.py +6 -0
  210. forge/guard/team/config.py +24 -0
  211. forge/guard/team/handlers.py +209 -0
  212. forge/guard/team/prompts.py +41 -0
  213. forge/guard/types.py +125 -0
  214. forge/guard/workflow/__init__.py +17 -0
  215. forge/guard/workflow/branches.py +67 -0
  216. forge/guard/workflow/config.py +63 -0
  217. forge/guard/workflow/divergence.py +113 -0
  218. forge/guard/workflow/policy.py +87 -0
  219. forge/guard/workflow/stages.py +205 -0
  220. forge/install/__init__.py +55 -0
  221. forge/install/cli.py +281 -0
  222. forge/install/exceptions.py +163 -0
  223. forge/install/hooks.py +109 -0
  224. forge/install/installer.py +1037 -0
  225. forge/install/models.py +321 -0
  226. forge/install/preset.py +272 -0
  227. forge/install/settings_merge.py +831 -0
  228. forge/install/tracking.py +238 -0
  229. forge/install/version.py +141 -0
  230. forge/proxy/__init__.py +0 -0
  231. forge/proxy/base_client.py +181 -0
  232. forge/proxy/client_adapter.py +476 -0
  233. forge/proxy/client_factory.py +531 -0
  234. forge/proxy/converters.py +1206 -0
  235. forge/proxy/cost_logger.py +132 -0
  236. forge/proxy/cost_tracker.py +242 -0
  237. forge/proxy/data_models.py +338 -0
  238. forge/proxy/error_hints.py +92 -0
  239. forge/proxy/metrics.py +222 -0
  240. forge/proxy/model_spec.py +158 -0
  241. forge/proxy/proxies.py +333 -0
  242. forge/proxy/proxy_identity.py +134 -0
  243. forge/proxy/proxy_orchestrator.py +1018 -0
  244. forge/proxy/proxy_startup.py +54 -0
  245. forge/proxy/server.py +1561 -0
  246. forge/proxy/utils.py +537 -0
  247. forge/review/__init__.py +6 -0
  248. forge/review/adversarial.py +111 -0
  249. forge/review/consensus.py +236 -0
  250. forge/review/engine.py +356 -0
  251. forge/review/models.py +437 -0
  252. forge/review/resources/__init__.py +5 -0
  253. forge/review/resources/codereview-performance.md +85 -0
  254. forge/review/resources/codereview-quick.md +75 -0
  255. forge/review/resources/codereview-security.md +92 -0
  256. forge/review/resources/codereview.md +85 -0
  257. forge/review/resources/docreview-quick.md +75 -0
  258. forge/review/resources/docreview.md +86 -0
  259. forge/review/resources/thinkdeep.md +89 -0
  260. forge/review/routing.py +368 -0
  261. forge/review/synthesis.py +73 -0
  262. forge/runtime_config.py +438 -0
  263. forge/search/__init__.py +55 -0
  264. forge/search/bm25_store.py +264 -0
  265. forge/search/content_store.py +197 -0
  266. forge/search/engine.py +352 -0
  267. forge/search/exceptions.py +51 -0
  268. forge/search/extractor.py +234 -0
  269. forge/search/index_state.py +295 -0
  270. forge/search/store.py +215 -0
  271. forge/search/tokenizer.py +24 -0
  272. forge/session/__init__.py +130 -0
  273. forge/session/active.py +339 -0
  274. forge/session/artifacts.py +202 -0
  275. forge/session/claude/__init__.py +50 -0
  276. forge/session/claude/cleanup.py +105 -0
  277. forge/session/claude/invoke.py +236 -0
  278. forge/session/claude/paths.py +200 -0
  279. forge/session/cleanup.py +216 -0
  280. forge/session/config.py +34 -0
  281. forge/session/direct_model.py +107 -0
  282. forge/session/effective.py +169 -0
  283. forge/session/exceptions.py +255 -0
  284. forge/session/handoff.py +881 -0
  285. forge/session/handoff_agent.py +544 -0
  286. forge/session/hooks/__init__.py +35 -0
  287. forge/session/hooks/models.py +73 -0
  288. forge/session/hooks/session_start.py +507 -0
  289. forge/session/identity.py +84 -0
  290. forge/session/index.py +553 -0
  291. forge/session/manager.py +1506 -0
  292. forge/session/models.py +572 -0
  293. forge/session/overrides.py +344 -0
  294. forge/session/plan_resolution.py +286 -0
  295. forge/session/prev_sessions.py +128 -0
  296. forge/session/store.py +431 -0
  297. forge/session/validation.py +47 -0
  298. forge/session/worktree/__init__.py +65 -0
  299. forge/session/worktree/cleanup.py +262 -0
  300. forge/session/worktree/config_copy.py +203 -0
  301. forge/session/worktree/create.py +332 -0
  302. forge/sidecar/__init__.py +29 -0
  303. forge/sidecar/container.py +161 -0
  304. forge/sidecar/docker.py +86 -0
  305. forge/sidecar/secrets.py +19 -0
  306. multi_forge-0.2.0.dist-info/METADATA +242 -0
  307. multi_forge-0.2.0.dist-info/RECORD +311 -0
  308. multi_forge-0.2.0.dist-info/WHEEL +4 -0
  309. multi_forge-0.2.0.dist-info/entry_points.txt +2 -0
  310. multi_forge-0.2.0.dist-info/licenses/LICENSE +203 -0
  311. multi_forge-0.2.0.dist-info/licenses/NOTICE +14 -0
@@ -0,0 +1,490 @@
1
+ # GPT-5.5 Prompting Guide
2
+
3
+ > Synthesized from [OpenAI Prompt Guidance](https://developers.openai.com/api/docs/guides/prompt-guidance),
4
+ > [OpenAI Platform docs](https://developers.openai.com/api/docs/guides/latest-model), and
5
+ > [OpenAI Cookbook](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_prompting_guide). May 2026.
6
+
7
+ ## Overview
8
+
9
+ GPT-5.5 is OpenAI's frontier model for **complex professional work**, announced April 23, 2026 and made available in the
10
+ API on April 24, 2026. It is tuned for long-context, tool-heavy, professional workflows. Prompt-relevant changes:
11
+
12
+ - **1,050,000 token API context window**
13
+ - **128,000 max output tokens**
14
+ - **`medium` default reasoning effort**, with `none`, `low`, `high`, and `xhigh` available
15
+ - **More outcome-first behavior** - shorter prompts with clear success criteria usually work better than process-heavy
16
+ legacy scaffolding
17
+
18
+ **Key characteristic:** GPT-5.5 is designed for production-grade assistants and agents. It performs best when prompts
19
+ clearly specify the **output contract**, **tool-use expectations**, and **completion criteria**. The highest-leverage
20
+ prompt changes are choosing reasoning effort by task shape, defining exact output and citation formats, and making
21
+ completion criteria explicit.
22
+
23
+ ---
24
+
25
+ ## Core API Parameters
26
+
27
+ ### `reasoning.effort`
28
+
29
+ | Level | Use Case |
30
+ | -------- | ------------------------------------------------------------------------------------------------ |
31
+ | `none` | Execution-heavy workloads: workflow steps, extraction, triage, structured transforms. |
32
+ | `low` | Tasks needing nuanced interpretation: implicit requirements, ambiguity, cancelled-tool recovery. |
33
+ | `medium` | **Default.** Research-heavy: long-context synthesis, multi-document review, conflict resolution. |
34
+ | `high` | Complex multi-step problems, strategy writing. |
35
+ | `xhigh` | Maximum reasoning depth. 3-5x cost of `none`. |
36
+
37
+ **Defaults across the GPT-5 family:**
38
+
39
+ - GPT-5: `medium`
40
+ - GPT-5.1, GPT-5.2: `none`
41
+ - GPT-5.5: `medium`
42
+
43
+ **Best practice:** Make prompt updates before increasing reasoning effort. Increase `reasoning.effort` one notch only
44
+ after prompt fixes. When lowering to `none` for execution-heavy workloads, encourage the model to "think" or outline
45
+ steps before answering.
46
+
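As a minimal sketch of the "one notch at a time" practice above, the helper below steps through the effort levels from the table and builds request arguments. The names `EFFORT_LEVELS`, `bump_effort`, and `build_request` are hypothetical and not part of this package or of any SDK; only the `reasoning.effort` values come from the guide.

```python
from typing import Any

# Effort levels as listed in the table above, ordered from shallowest to deepest.
EFFORT_LEVELS = ["none", "low", "medium", "high", "xhigh"]


def bump_effort(current: str) -> str:
    """Return the next effort level up, staying at `xhigh` once the top is reached."""
    idx = EFFORT_LEVELS.index(current)  # raises ValueError for unknown levels
    return EFFORT_LEVELS[min(idx + 1, len(EFFORT_LEVELS) - 1)]


def build_request(prompt: str, effort: str = "medium") -> dict[str, Any]:
    """Build keyword arguments for a Responses API call (hypothetical helper)."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return {
        "model": "gpt-5.5",
        "input": prompt,
        "reasoning": {"effort": effort},
    }
```

Assuming the official `openai` Python client, these arguments could be passed as `client.responses.create(**build_request(prompt, "low"))`. Keeping the level in one place makes it easy to raise it only after prompt fixes have been tried.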
47
+ ### `verbosity`
48
+
49
+ A **dedicated API parameter** (not just prompt engineering) that controls response length.
50
+
51
+ | Level | Behavior | Use when |
52
+ | -------- | -------------------------------------------- | --------------------------------------------- |
53
+ | `low` | Terse, to-the-point, just the facts | Latency and scannability matter most |
54
+ | `medium` | **Default.** Balanced detail for most tasks. | General assistants and professional workflows |
55
+ | `high` | Detailed, explanatory, comprehensive. | The user asked for depth or auditability |
56
+
57
+ ```python
58
+ response = client.responses.create(
59
+     model="gpt-5.5",
60
+     input="Your prompt here",
61
+     text={"verbosity": "low"},
62
+ )
63
+ ```
64
+
65
+ **Interaction with prompts:** If explicit instructions conflict with the `verbosity` parameter, explicit instructions
66
+ take precedence. For code generation, Cursor found that setting `verbosity: low` for text output while prompting for
67
+ verbose code in tool calls produced the best results.
68
+
69
+ ### Context Window
70
+
71
+ - **1,050,000 tokens** input / **128,000 tokens** max output
72
+ - Prompts above 272K input tokens have higher API pricing; use context budgets deliberately.
73
+
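One way to use context budgets deliberately is to classify a prompt by its token count before sending it. The sketch below assumes the caller already has a token count from a tokenizer of their choice. The function name and the returned labels are hypothetical; only the three limits come from the figures above.

```python
# Limits quoted in this guide; everything else here is illustrative.
MAX_INPUT_TOKENS = 1_050_000
MAX_OUTPUT_TOKENS = 128_000
HIGHER_PRICING_THRESHOLD = 272_000


def classify_prompt(input_tokens: int) -> str:
    """Label a prompt by where its token count falls relative to the limits."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if input_tokens > MAX_INPUT_TOKENS:
        return "exceeds_context_window"
    if input_tokens > HIGHER_PRICING_THRESHOLD:
        return "higher_pricing"
    return "standard"
```

A caller might log a warning or trim retrieved context whenever the label is `higher_pricing`.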
74
+ ### Knowledge Cutoff
75
+
76
+ **December 1, 2025.**
77
+
78
+ ---
79
+
80
+ ## Key Behavioral Differences from GPT-5.2
81
+
82
+ | Aspect | GPT-5.5 Behavior |
83
+ | --------------------- | --------------------------------------------------------------------------------- |
84
+ | Reasoning default | `medium`; use `low` before `none` when planning, search, or tool use still matter |
85
+ | Prompt shape | Outcome-first prompts usually work better than step-by-step process scaffolding |
86
+ | Tool calling | Stronger tool selection; define tool triggers, evidence rules, and stop rules |
87
+ | User-visible preamble | Useful for time-to-first-token in long or tool-heavy turns |
88
+ | Verbosity | Concise and direct by default; controllable via the `verbosity` API parameter |
89
+ | Instruction following | More literal and thorough; define success criteria and stopping conditions |
90
+
91
+ ---
92
+
93
+ ## Prompting Patterns
94
+
95
+ ### Output Contracts and Completion Criteria
96
+
97
+ OpenAI's primary recommendation for GPT-5.5. Explicitly define **what "done" looks like**:
98
+
99
+ ```xml
100
+ <output_contract>
101
+ - Return a JSON object with keys: summary, findings[], recommendations[], confidence_score.
102
+ - Each finding must include: file_path, line_range, severity (critical|warning|info), description.
103
+ - confidence_score is 0.0-1.0 reflecting how thoroughly the codebase was analyzed.
104
+ - Task is complete when all files in scope have been reviewed and findings are deduplicated.
105
+ </output_contract>
106
+ ```
107
+
108
+ Start with the smallest prompt that passes your evals. Add blocks only when they fix a measured failure mode.
109
+
110
+ ### Controlling Verbosity and Output Shape
111
+
112
+ Use the `verbosity` API parameter as the **primary lever**, and prompt-level constraints as secondary:
113
+
114
+ ```xml
115
+ <output_verbosity_spec>
116
+ - Default: 3-6 sentences or <=5 bullets for typical answers.
117
+ - For simple "yes/no + short explanation" questions: <=2 sentences.
118
+ - For complex multi-step or multi-file tasks:
119
+   - 1 short overview paragraph
120
+   - then <=5 bullets tagged: What changed, Where, Risks, Next steps, Open questions.
121
+ - Do not rephrase the user's request unless it changes semantics.
122
+ </output_verbosity_spec>
123
+ ```
124
+
125
+ ### Initiative Nudges
126
+
127
+ If the model feels too literal or stops at the first plausible answer, add an **initiative nudge** before raising
128
+ `reasoning.effort`:
129
+
130
+ ```xml
131
+ <initiative>
132
+ - Do not stop at the first plausible answer.
133
+ - Look for second-order issues, edge cases, and missing constraints.
134
+ - If the task is safety or accuracy critical, perform at least one verification step.
135
+ </initiative>
136
+ ```
137
+
138
+ This is cheaper and often more effective than bumping `reasoning.effort` up a notch.
139
+
140
+ ### Preventing Scope Drift
141
+
142
+ GPT-5.5 is more controllable than GPT-5.2 but still prone to scope drift on coding tasks:
143
+
144
+ ```xml
145
+ <design_and_scope_constraints>
146
+ - Implement EXACTLY and ONLY what the user requests.
147
+ - No extra features, no added components, no UX embellishments.
148
+ - Style aligned to the design system at hand.
149
+ - Do NOT invent colors, shadows, tokens, animations, or new UI elements unless requested.
150
+ - If any instruction is ambiguous, choose the simplest valid interpretation.
151
+ </design_and_scope_constraints>
152
+ ```
153
+
154
+ ### Long-Context and Recall
155
+
156
+ With 1.05M tokens available, long-context inputs are more common. For inputs over ~10K tokens, use **forced re-grounding**:
157
+
158
+ ```xml
159
+ <long_context_handling>
160
+ - For inputs longer than ~10k tokens (multi-chapter docs, long threads, multiple PDFs):
161
+ - First, produce a short internal outline of key sections relevant to the user's request.
162
+ - Re-state the user's constraints explicitly before answering.
163
+ - Anchor claims to sections ("In the 'Data Retention' section...") rather than speaking generically.
164
+ - If the answer depends on fine details (dates, thresholds, clauses), quote or paraphrase them.
165
+ </long_context_handling>
166
+ ```
167
+
168
+ ### Preambles (Tool-Use Transparency)
169
+
170
+ GPT-5.5 can generate brief, user-visible explanations before invoking tools — outlining its intent before the actual
171
+ tool call. This boosts tool-calling accuracy without bloating reasoning overhead.
172
+
173
+ Enable with a system instruction:
174
+
175
+ ```
176
+ Before you call a tool, explain in one sentence why you are calling it.
177
+ ```
178
+
179
+ ### Handling Ambiguity & Hallucination Risk
180
+
181
+ ```xml
182
+ <uncertainty_and_ambiguity>
183
+ - If the question is ambiguous or underspecified, explicitly call this out and:
184
+   - Ask up to 1-3 precise clarifying questions, OR
185
+   - Present 2-3 plausible interpretations with clearly labeled assumptions.
186
+ - Never fabricate exact figures, line numbers, or external references when uncertain.
187
+ - When unsure, prefer language like "Based on the provided context..." instead of absolute claims.
188
+ </uncertainty_and_ambiguity>
189
+ ```
190
+
191
+ **High-risk self-check for sensitive contexts:**
192
+
193
+ ```xml
194
+ <high_risk_self_check>
195
+ Before finalizing an answer in legal, financial, compliance, or safety-sensitive contexts:
196
+ - Briefly re-scan your own answer for:
197
+   - Unstated assumptions,
198
+   - Specific numbers or claims not grounded in context,
199
+   - Overly strong language ("always," "guaranteed," etc.).
200
+ - If you find any, soften or qualify them and explicitly state assumptions.
201
+ </high_risk_self_check>
202
+ ```
203
+
204
+ ---
205
+
206
+ ## Agentic Steerability & User Updates
207
+
208
+ GPT-5.5 works well in long-running workflows when the prompt defines how to report progress, when to stop, and when
209
+ to ask for help.
210
+
211
+ ### Verbosity + Code Quality (Cursor's Pattern)
212
+
213
+ Cursor found the best results by separating text and code verbosity:
214
+
215
+ - Set `verbosity: low` at the API level to keep text outputs brief
216
+ - In the prompt, strongly encourage verbose, well-commented output in coding tools only
217
+
218
+ This prevents status updates and post-task summaries from disrupting flow while keeping code readable.
219
+
220
+ ### User Update Discipline
221
+
222
+ ```xml
223
+ <user_updates_spec>
224
+ - Send brief updates (1-2 sentences) only when:
225
+   - You start a new major phase of work, or
226
+   - You discover something that changes the plan.
227
+ - Avoid narrating routine tool calls ("reading file...", "running tests...").
228
+ - Each update must include at least one concrete outcome ("Found X", "Confirmed Y", "Updated Z").
229
+ - Do not expand the task beyond what the user asked; if you notice new work, call it out as optional.
230
+ </user_updates_spec>
231
+ ```
232
+
233
+ ### Delegation Rules
234
+
235
+ ```xml
236
+ <delegation_rules>
237
+ - Delegate only when subtasks are independent or can proceed in parallel.
238
+ - For each delegated task, define ownership, expected output, dependencies, and "done".
239
+ - Keep blocking decisions in the main workflow unless delegation is explicitly useful.
240
+ - Integrate delegated results before finalizing.
241
+ </delegation_rules>
242
+ ```
243
+
244
+ ---
245
+
246
+ ## Tool Calling and Parallelism
247
+
248
+ ### Tool Use Rules
249
+
250
+ ```xml
251
+ <tool_usage_rules>
252
+ - Prefer tools over internal knowledge whenever:
253
+   - You need fresh or user-specific data (tickets, orders, configs, logs).
254
+   - You reference specific IDs, URLs, or document titles.
255
+ - Parallelize independent reads (read_file, fetch_record, search_docs) when possible.
256
+ - After any write/update tool call, briefly restate:
257
+   - What changed,
258
+   - Where (ID or path),
259
+   - Any follow-up validation performed.
260
+ </tool_usage_rules>
261
+ ```
262
+
263
+ ### Parallel Tool Calls
264
+
265
+ - GPT-5.5 supports parallel function calls — invoking multiple tools in a single model pass
266
+ - Do not rely on `none` for multi-step tool workflows; use `low` or higher when planning, search, or chained tools
267
+ matter
268
+ - OpenAI measures parallelization efficiency via **tool yields**: if 3 tools are called in parallel, followed by 3 more
269
+ in parallel, the number of yields is 2 (a better latency proxy than raw tool call count)
270
+
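The tool-yield metric described above can be illustrated with a small counter. This is a sketch under the assumption that a trace is recorded as a list of batches, each batch holding the tool names called in parallel in one model pass. The trace shape and the function names are hypothetical.

```python
def count_tool_yields(batches: list[list[str]]) -> int:
    """Count yields: each non-empty batch of parallel calls is one yield."""
    return sum(1 for batch in batches if batch)


def count_tool_calls(batches: list[list[str]]) -> int:
    """Count individual tool calls across all batches."""
    return sum(len(batch) for batch in batches)


# Three parallel reads followed by three more parallel calls.
trace = [
    ["read_file", "fetch_record", "search_docs"],
    ["read_file", "read_file", "search_docs"],
]
```

For this trace the raw call count is 6 while the yield count is 2, which is why yields track latency more closely than the number of calls.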
271
+ ---
272
+
273
+ ## Structured Extraction
274
+
275
+ For extraction, prompts should define the schema, missing-field behavior, and completeness check.
276
+
277
+ 1. Always provide a schema or JSON shape
278
+ 2. Use structured outputs for strict schema adherence
279
+ 3. Distinguish required vs optional fields
280
+ 4. Ask for "extraction completeness"
281
+ 5. Handle missing fields explicitly
282
+
283
+ ```xml
284
+ <extraction_spec>
285
+ You will extract structured data from tables/PDFs/emails into JSON.
286
+ - Always follow this schema exactly (no extra fields):
287
+ {
288
+   "party_name": string,
289
+   "jurisdiction": string | null,
290
+   "effective_date": string | null,
291
+   "termination_clause_summary": string | null
292
+ }
293
+ - If a field is not present in the source, set it to null rather than guessing.
294
+ - Before returning, quickly re-scan the source for any missed fields and correct omissions.
295
+ </extraction_spec>
296
+ ```
297
+
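The schema and missing-field rules in the extraction spec above can also be checked on the client side. The validator below is a minimal sketch, not part of the guide or of this package. It assumes the model's reply arrives as a JSON string, and the names `REQUIRED_FIELDS`, `NULLABLE_FIELDS`, and `validate_extraction` are hypothetical.

```python
import json

# Field sets taken from the example schema: party_name is a plain string,
# the other three fields may be null when absent from the source.
REQUIRED_FIELDS = {"party_name"}
NULLABLE_FIELDS = {"jurisdiction", "effective_date", "termination_clause_summary"}


def validate_extraction(raw: str) -> list[str]:
    """Return a list of schema violations; an empty list means the reply conforms."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    allowed = REQUIRED_FIELDS | NULLABLE_FIELDS
    errors = [f"unexpected field: {key}" for key in sorted(set(data) - allowed)]
    errors += [f"missing field: {key}" for key in sorted(allowed - set(data))]
    for key in sorted(allowed & set(data)):
        value = data[key]
        if value is None:
            if key in REQUIRED_FIELDS:
                errors.append(f"{key} must not be null")
        elif not isinstance(value, str):
            errors.append(f"{key} must be a string or null")
    return errors
```

A non-empty result could be fed back to the model as a correction request, or used as an eval signal for the completeness check.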
298
+ **New in GPT-5.5:** You can define tools with `type: custom` to enable models to send plaintext inputs directly to
299
+ tools, rather than being limited to structured JSON.
300
+
301
+ ---
302
+
303
+ ## Web Search and Research
304
+
305
+ GPT-5.5 is more steerable at synthesizing across many sources. Knowledge cutoff: **December 1, 2025**.
306
+
307
+ ### Research Agent Prompt
308
+
309
+ ```xml
310
+ <web_search_rules>
311
+ - Act as an expert research assistant; default to comprehensive, well-structured answers.
312
+ - Prefer web research over assumptions whenever facts may be uncertain or incomplete.
313
+ - Include citations for all web-derived information.
314
+ - Research all parts of the query, resolve contradictions, and follow important second-order
315
+ implications until further research is unlikely to change the answer.
316
+ - Do not ask clarifying questions; instead cover all plausible user intents with both breadth and depth.
317
+ - Write clearly and directly using Markdown (headers, bullets, tables when helpful).
318
+ - Define acronyms, use concrete examples, and keep a natural, conversational tone.
319
+ </web_search_rules>
320
+ ```
321
+
322
+ ### Search Modes
323
+
324
+ | Mode | Use Case |
325
+ | -------------- | ------------------------------------------- |
326
+ | Non-reasoning | Quick lookups, completes in seconds |
327
+ | Agentic search | Iterative reasoning with follow-up searches |
328
+ | Deep research | Exhaustive investigations, takes minutes |
329
+
330
+ **Tip:** Using hints like "go deep" triggers more thorough research.
331
+
332
+ ---
333
+
334
+ ## Responses API
335
+
336
+ GPT-5.5 is designed around the **Responses API** for reasoning, tool-calling, and multi-turn use cases.
337
+
338
+ | Feature | Chat Completions | Responses API |
339
+ | --------------------------- | ---------------- | ------------- |
340
+ | Basic text generation | Yes | Yes |
341
+ | Reasoning item preservation | No | Yes |
342
+ | `previous_response_id` | No | Yes |
343
+
344
+ **Why Responses API matters:** It preserves reasoning items across turns, which improves multi-step tool workflows and
345
+ can reduce redundant reasoning. If you manually replay assistant output items, preserve returned reasoning and `phase`
346
+ items unchanged.
347
+
348
+ ---
349
+
350
+ ## Migration Guide to GPT-5.5
351
+
352
+ ### Migration Mapping
353
+
354
+ | Current Model | Target | Reasoning Effort | Notes |
355
+ | ------------- | ------- | ------------------ | ----------------------------------- |
356
+ | GPT-5.2 | GPT-5.5 | Default (drop-in) | Just change the model name |
357
+ | GPT-5.3-Codex | GPT-5.5 | Default | GPT-5.5 subsumes Codex capabilities |
358
+ | o3 | GPT-5.5 | `medium` or `high` | For reasoning-heavy workloads |
359
+ | GPT-4.1 | GPT-5.5 | `none` | Treat as fast/low-deliberation |
360
+ | GPT-4o | GPT-5.5 | `none` | Same as GPT-4.1 |
361
+
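The mapping table above can be expressed as a small lookup. This is a sketch only: the lowercase model identifiers and the `migrate_request` helper are assumptions made for illustration, and `o3` is pinned to `medium` here even though the table allows `medium` or `high`.

```python
from typing import Any, Optional

# Reasoning effort per source model, taken from the table above.
# None means "leave reasoning.effort at the model default".
# The lowercase model identifiers are assumed, not confirmed names.
MIGRATION_EFFORT: dict[str, Optional[str]] = {
    "gpt-5.2": None,
    "gpt-5.3-codex": None,
    "o3": "medium",  # the table allows "medium" or "high"
    "gpt-4.1": "none",
    "gpt-4o": "none",
}


def migrate_request(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the payload retargeted to GPT-5.5 (hypothetical helper)."""
    current = payload["model"].lower()
    if current not in MIGRATION_EFFORT:
        raise KeyError(f"no migration mapping for model {current!r}")
    migrated = {**payload, "model": "gpt-5.5"}
    effort = MIGRATION_EFFORT[current]
    if effort is not None:
        migrated["reasoning"] = {"effort": effort}
    return migrated
```

Keeping the prompt untouched in this step matches the first migration step: change the model and effort first, then evaluate.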
362
+ ### Migration Steps
363
+
364
+ 1. **Switch models, don't change prompts yet** — Test model change in isolation
365
+ 2. **Pin `reasoning.effort`** — Match prior model's latency/depth profile
366
+ 3. **Run evals for baseline** — If results look good, ready to ship
367
+ 4. **If regressions, try an initiative nudge first** — Before raising reasoning effort
368
+ 5. **If still regressing, tune the prompt** — Use Prompt Optimizer + targeted constraints
369
+ 6. **Re-run evals after each small change** — Iterate incrementally
370
+
371
+ ### Prompt Optimizer
372
+
373
+ OpenAI's [Prompt Optimizer](https://platform.openai.com/chat/edit?optimize=true) in Playground helps:
374
+
375
+ - Quickly improve existing prompts for GPT-5.5
376
+ - Migrate across GPT-5 models
377
+ - Remove common failure modes
378
+
379
+ ---
380
+
381
+ ## Complete Example: Enterprise Agent System Prompt
382
+
383
+ ```xml
384
+ <role>
385
+ You are a GPT-5.5 enterprise assistant for [DOMAIN].
386
+ You are precise, analytical, persistent, and disciplined.
387
+ </role>
388
+
389
+ <output_contract>
390
+ - Define the exact output shape for each task type.
391
+ - Task is complete when [explicit completion criteria].
392
+ - If completion criteria cannot be met, explain what is missing and what would unblock it.
393
+ </output_contract>
394
+
395
+ <output_verbosity_spec>
396
+ - Default: 3-6 sentences or <=5 bullets for typical answers.
397
+ - For simple questions: <=2 sentences.
398
+ - For complex tasks: 1 overview paragraph + <=5 tagged bullets
399
+ (What changed, Where, Risks, Next steps, Open questions).
400
+ - Do not rephrase the user's request unless it changes semantics.
401
+ </output_verbosity_spec>
402
+
403
+ <design_and_scope_constraints>
404
+ - Implement EXACTLY and ONLY what the user requests.
405
+ - No extra features, no added components, no embellishments.
406
+ - If an instruction is ambiguous, choose the simplest valid interpretation.
407
+ </design_and_scope_constraints>
408
+
409
+ <initiative>
410
+ - Do not stop at the first plausible answer.
411
+ - Look for second-order issues, edge cases, and missing constraints.
412
+ - If the task is safety- or accuracy-critical, perform at least one verification step.
413
+ </initiative>
414
+
415
+ <uncertainty_and_ambiguity>
416
+ - If ambiguous: ask 1-3 clarifying questions OR present 2-3 interpretations with labeled assumptions.
417
+ - Never fabricate exact figures or references when uncertain.
418
+ - Prefer "Based on the provided context..." over absolute claims.
419
+ </uncertainty_and_ambiguity>
420
+
421
+ <tool_usage_rules>
422
+ - Prefer tools over internal knowledge for fresh/user-specific data.
423
+ - Parallelize independent reads when possible.
424
+ - Before calling a tool, explain in one sentence why you are calling it.
425
+ - After write/update: restate what changed, where, and validation performed.
426
+ </tool_usage_rules>
427
+
428
+ <user_updates_spec>
429
+ - Send brief updates (1-2 sentences) only when starting a new phase or when the plan changes.
430
+ - Avoid narrating routine tool calls.
431
+ - Each update must include a concrete outcome.
432
+ - Do not expand the task beyond what the user asked.
433
+ </user_updates_spec>
434
+
435
+ <high_risk_self_check>
436
+ Before finalizing in legal/financial/compliance/safety contexts:
437
+ - Re-scan for unstated assumptions, ungrounded claims, overly strong language.
438
+ - Soften or qualify as needed.
439
+ </high_risk_self_check>
440
+ ```
441
+
442
+ ---
443
+
444
+ ## Key Differences: GPT-5.5 vs GPT-5.2 vs Gemini 3.1 Pro
445
+
446
+ | Aspect | GPT-5.5 | GPT-5.2 | Gemini 3.1 Pro |
447
+ | --------------------- | ---------------------------------------- | ------------------------- | -------------------------- |
448
+ | Default reasoning | `medium` | `none` | `high` (dynamic) |
449
+ | Default verbosity | Direct, controllable via API | Low, prompt-controlled | Concise |
450
+ | Context window | 1.05M tokens | 400K tokens | 1M tokens |
451
+ | Temperature | Flexible | Flexible | Keep at 1.0 |
452
+ | Structured extraction | Strong (+ custom tool types) | Strong | Good |
453
+ | Tool prompting | Define triggers, evidence, stop rules | May need more scaffolding | Use direct tool rules |
454
+ | Multi-turn state | `previous_response_id` / reasoning items | Responses state | Thought signatures |
455
+ | Knowledge cutoff | Dec 1, 2025 | August 2025 | January 2025 |
456
+ | Best for | Agentic, coding, professional work | Enterprise, document | Reasoning, multimodal work |
457
+
458
+ ---
459
+
460
+ ## Pro Tips
461
+
462
+ 1. **Define output contracts first** — Explicit completion criteria are the highest-leverage prompt change for GPT-5.5
463
+
464
+ 2. **Use initiative nudges before raising reasoning effort** — Cheaper and often more effective
465
+
466
+ 3. **Set `verbosity` at the API level** — Separate text brevity from code verbosity (Cursor pattern)
467
+
468
+ 4. **Enable preambles for tool-use transparency** — "Before calling a tool, explain why" boosts accuracy
469
+
470
+ 5. **Tune `reasoning.effort` by task shape** — Lower to `none` for execution-heavy workloads; raise to `high` for
471
+ complex multi-step problems
472
+
473
+ 6. **Anchor long-context answers** — Reference specific sections even with 1.05M context available
474
+
475
+ 7. **Migration is incremental** — One change at a time; model first, then reasoning effort, then prompt
476
+
477
+ 8. **Parallelize tool calls** — Use `low` or higher for multi-step tool planning; measure tool yields, not raw calls
478
+
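Tips 3, 5, and 8 above can be combined into one per-task settings lookup. This is a rough sketch under assumptions: the task-shape names, the `request_for` helper, and the specific `verbosity` values are illustrative choices, and the `reasoning.effort` and `text.verbosity` field names are assumed to match the Responses API request body.

```python
from typing import Any

# Effort values follow the tips above; verbosity values are illustrative only.
TASK_PROFILES: dict[str, dict[str, str]] = {
    "execution": {"effort": "none", "verbosity": "low"},     # execution-heavy work
    "tool_planning": {"effort": "low", "verbosity": "low"},  # multi-step tool planning
    "complex": {"effort": "high", "verbosity": "medium"},    # complex multi-step problems
}


def request_for(task_shape: str, prompt: str) -> dict[str, Any]:
    """Build a request body with API-level effort and verbosity (hypothetical helper)."""
    profile = TASK_PROFILES[task_shape]
    return {
        "model": "gpt-5.5",
        "input": prompt,
        "reasoning": {"effort": profile["effort"]},
        "text": {"verbosity": profile["verbosity"]},
    }


payload = request_for("complex", "Plan the schema migration.")
```

Setting these values in the request rather than in the prompt keeps the prompt focused on the task and makes each setting easy to vary during evals.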
479
+ ---
480
+
481
+ ## Sources
482
+
483
+ - [OpenAI: Prompt Guidance for GPT-5.5](https://developers.openai.com/api/docs/guides/prompt-guidance)
484
+ - [OpenAI: Using GPT-5.5](https://developers.openai.com/api/docs/guides/latest-model)
485
+ - [OpenAI: GPT-5.5 Model](https://developers.openai.com/api/docs/models/gpt-5.5)
486
+ - [OpenAI: Introducing GPT-5.5](https://openai.com/index/introducing-gpt-5-5/)
487
+ - [OpenAI: Reasoning Models](https://developers.openai.com/api/docs/guides/reasoning)
488
+ - [OpenAI: Reasoning Best Practices](https://developers.openai.com/api/docs/guides/reasoning-best-practices)
489
+ - [OpenAI Cookbook: GPT-5 Prompting Guide](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_prompting_guide)
490
+ - [OpenAI Cookbook: GPT-5 New Params and Tools](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools)