@dynokostya/just-works 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/agents/csharp-code-writer.md +32 -0
- package/.claude/agents/diagrammer.md +49 -0
- package/.claude/agents/frontend-code-writer.md +36 -0
- package/.claude/agents/prompt-writer.md +38 -0
- package/.claude/agents/python-code-writer.md +32 -0
- package/.claude/agents/swift-code-writer.md +32 -0
- package/.claude/agents/typescript-code-writer.md +32 -0
- package/.claude/commands/git-sync.md +96 -0
- package/.claude/commands/project-docs.md +287 -0
- package/.claude/settings.json +112 -0
- package/.claude/settings.json.default +15 -0
- package/.claude/skills/csharp-coding/SKILL.md +368 -0
- package/.claude/skills/ddd-architecture-python/SKILL.md +288 -0
- package/.claude/skills/feature-driven-architecture-python/SKILL.md +302 -0
- package/.claude/skills/gemini-3-prompting/SKILL.md +483 -0
- package/.claude/skills/gpt-5-2-prompting/SKILL.md +295 -0
- package/.claude/skills/opus-4-6-prompting/SKILL.md +315 -0
- package/.claude/skills/plantuml-diagramming/SKILL.md +758 -0
- package/.claude/skills/python-coding/SKILL.md +293 -0
- package/.claude/skills/react-coding/SKILL.md +264 -0
- package/.claude/skills/rest-api/SKILL.md +421 -0
- package/.claude/skills/shadcn-ui-coding/SKILL.md +454 -0
- package/.claude/skills/swift-coding/SKILL.md +401 -0
- package/.claude/skills/tailwind-css-coding/SKILL.md +268 -0
- package/.claude/skills/typescript-coding/SKILL.md +464 -0
- package/.claude/statusline-command.sh +34 -0
- package/.codex/prompts/plan-reviewer.md +162 -0
- package/.codex/prompts/project-docs.md +287 -0
- package/.codex/skills/ddd-architecture-python/SKILL.md +288 -0
- package/.codex/skills/feature-driven-architecture-python/SKILL.md +302 -0
- package/.codex/skills/gemini-3-prompting/SKILL.md +483 -0
- package/.codex/skills/gpt-5-2-prompting/SKILL.md +295 -0
- package/.codex/skills/opus-4-6-prompting/SKILL.md +315 -0
- package/.codex/skills/plantuml-diagramming/SKILL.md +758 -0
- package/.codex/skills/python-coding/SKILL.md +293 -0
- package/.codex/skills/react-coding/SKILL.md +264 -0
- package/.codex/skills/rest-api/SKILL.md +421 -0
- package/.codex/skills/shadcn-ui-coding/SKILL.md +454 -0
- package/.codex/skills/tailwind-css-coding/SKILL.md +268 -0
- package/.codex/skills/typescript-coding/SKILL.md +464 -0
- package/AGENTS.md +57 -0
- package/CLAUDE.md +98 -0
- package/LICENSE +201 -0
- package/README.md +114 -0
- package/bin/cli.mjs +291 -0
- package/package.json +39 -0
|
@@ -0,0 +1,295 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: gpt-5-2-prompting
|
|
3
|
+
description: Apply when creating or editing prompts targeting GPT-5.2. Covers verbosity control, scope discipline, reasoning_effort awareness, long-context handling, ambiguity management, structured extraction, compaction awareness, tool-calling prompt patterns, web research defaults, and migration from older GPT models.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# GPT-5.2 Prompt Writing Guidelines
|
|
7
|
+
|
|
8
|
+
## When to Use
|
|
9
|
+
|
|
10
|
+
- Creating or editing system prompts targeting GPT-5.2
|
|
11
|
+
- Writing prompts for long-context inputs (10k+ tokens)
|
|
12
|
+
- Structuring extraction prompts for documents, PDFs, and emails
|
|
13
|
+
- Managing verbosity and scope in GPT-5.2 outputs
|
|
14
|
+
- Writing tool descriptions and tool-calling instructions for GPT-5.2
|
|
15
|
+
- Writing web research agent prompts
|
|
16
|
+
- Migrating prompts from GPT-4o, GPT-4.1, GPT-5, or GPT-5.1
|
|
17
|
+
|
|
18
|
+
## Overview
|
|
19
|
+
|
|
20
|
+
GPT-5.2 is OpenAI's latest frontier model. It delivers stronger instruction adherence, lower default verbosity, and more deliberate reasoning scaffolding compared to previous GPT models. The model is prompt-sensitive -- it responds well to structured constraints and explicit output specifications.
|
|
21
|
+
|
|
22
|
+
<context>
|
|
23
|
+
Key behavioral characteristics to design prompts around:
|
|
24
|
+
|
|
25
|
+
- **More Deliberate Scaffolding**: Produces clearer intermediate plans with more structured reasoning; benefits from explicit scope and verbosity constraints
|
|
26
|
+
- **Lower Verbosity**: More concise and task-focused by default, though still responsive to prompt preferences
|
|
27
|
+
- **Stronger Instruction Adherence**: Less drift from user intent; improved formatting and rationale presentation
|
|
28
|
+
- **Conservative Grounding Bias**: Favors correctness and explicit reasoning; ambiguity handling improves with clarification prompts
|
|
29
|
+
- **Tool Efficiency Trade-offs**: Takes additional tool actions in interactive flows; optimizable via prompting
|
|
30
|
+
</context>
|
|
31
|
+
|
|
32
|
+
## Verbosity and Output Control
|
|
33
|
+
|
|
34
|
+
GPT-5.2 defaults to concise output. Use the following template to set explicit verbosity expectations in system prompts:
|
|
35
|
+
|
|
36
|
+
```
|
|
37
|
+
<output_verbosity_spec>
|
|
38
|
+
- Default: 3-6 sentences or <=5 bullets for typical answers.
|
|
39
|
+
- For simple "yes/no + short explanation" questions: <=2 sentences.
|
|
40
|
+
- For complex multi-step or multi-file tasks:
|
|
41
|
+
- 1 short overview paragraph
|
|
42
|
+
- then <=5 bullets tagged: What changed, Where, Risks, Next steps, Open questions.
|
|
43
|
+
- Provide clear and structured responses that balance informativeness with conciseness.
|
|
44
|
+
- Avoid long narrative paragraphs; prefer compact bullets and short sections.
|
|
45
|
+
- Do not rephrase the user's request unless it changes semantics.
|
|
46
|
+
</output_verbosity_spec>
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
Adjust the default ranges to match your use case. The model respects quantitative output constraints well -- specifying "3-6 sentences" is more effective than "be concise."
|
|
50
|
+
|
|
51
|
+
## Scope and Design Constraints
|
|
52
|
+
|
|
53
|
+
GPT-5.2 is stronger at structured code but may produce more code than minimal specs require. Use scope constraints adapted to the task domain to prevent over-engineering and scope creep.
|
|
54
|
+
|
|
55
|
+
For code generation and UI tasks:
|
|
56
|
+
|
|
57
|
+
```
|
|
58
|
+
<design_and_scope_constraints>
|
|
59
|
+
- Explore existing design systems deeply.
|
|
60
|
+
- Implement EXACTLY and ONLY what the user requests.
|
|
61
|
+
- No extra features, no added components, no UX embellishments.
|
|
62
|
+
- Style aligned to the design system at hand.
|
|
63
|
+
- Do NOT invent colors, shadows, tokens, animations, or new UI elements, unless requested.
|
|
64
|
+
- If any instruction is ambiguous, choose the simplest valid interpretation.
|
|
65
|
+
</design_and_scope_constraints>
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
For extraction and data tasks, scope constraints take the form of schema adherence:
|
|
69
|
+
|
|
70
|
+
```
|
|
71
|
+
- Follow this schema exactly (no extra fields).
|
|
72
|
+
- If a field is not present in the source, set it to null rather than guessing.
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
For research and synthesis tasks, scope is less about restriction and more about query coverage -- instruct the model to cover plausible user intents rather than expanding into tangential topics.
|
|
76
|
+
|
|
77
|
+
## Long-Context Handling
|
|
78
|
+
|
|
79
|
+
For inputs exceeding approximately 10k tokens (large documents, codebases, conversation histories), add explicit grounding instructions:
|
|
80
|
+
|
|
81
|
+
```
|
|
82
|
+
<long_context_handling>
|
|
83
|
+
- For inputs longer than ~10k tokens (multi-chapter docs, long threads, multiple PDFs):
|
|
84
|
+
- First, produce a short internal outline of key sections relevant to the request.
|
|
85
|
+
- Re-state the user's constraints explicitly before answering.
|
|
86
|
+
- In your answer, anchor claims to sections rather than speaking generically.
|
|
87
|
+
- If the answer depends on fine details (dates, thresholds, clauses), quote or paraphrase them.
|
|
88
|
+
</long_context_handling>
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
Without these instructions, long-context responses may drift from specifics. The grounding template keeps the model anchored to source material.
|
|
92
|
+
|
|
93
|
+
## Ambiguity and Hallucination Mitigation
|
|
94
|
+
|
|
95
|
+
GPT-5.2 has a conservative grounding bias but still benefits from explicit uncertainty handling instructions:
|
|
96
|
+
|
|
97
|
+
```
|
|
98
|
+
<uncertainty_and_ambiguity>
|
|
99
|
+
- If the question is ambiguous or underspecified, explicitly call this out and:
|
|
100
|
+
- Ask up to 1-3 precise clarifying questions, OR
|
|
101
|
+
- Present 2-3 plausible interpretations with clearly labeled assumptions.
|
|
102
|
+
- When external facts may have changed recently and no tools are available:
|
|
103
|
+
- Answer in general terms and state that details may have changed.
|
|
104
|
+
- Never fabricate exact figures, line numbers, or external references when uncertain.
|
|
105
|
+
- When unsure, prefer language like "Based on the provided context..." instead of absolute claims.
|
|
106
|
+
</uncertainty_and_ambiguity>
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
For high-stakes domains, add a self-check step:
|
|
110
|
+
|
|
111
|
+
```
|
|
112
|
+
<high_risk_self_check>
|
|
113
|
+
Before finalizing an answer in legal, financial, compliance, or safety-sensitive contexts:
|
|
114
|
+
- Briefly re-scan your own answer for:
|
|
115
|
+
- Unstated assumptions,
|
|
116
|
+
- Specific numbers or claims not grounded in context,
|
|
117
|
+
- Overly strong language ("always," "guaranteed," etc.).
|
|
118
|
+
- If you find any, soften or qualify them and explicitly state assumptions.
|
|
119
|
+
</high_risk_self_check>
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## Reasoning Effort Awareness
|
|
123
|
+
|
|
124
|
+
GPT-5.2 supports a `reasoning_effort` parameter (`none` | `minimal` | `low` | `medium` | `high` | `xhigh`) that trades off speed/cost versus deeper reasoning. GPT-5.2 defaults to `none`, behaving as a fast, low-deliberation model out of the box.
|
|
125
|
+
|
|
126
|
+
When writing prompts, be aware that reasoning effort affects how the model processes your instructions:
|
|
127
|
+
|
|
128
|
+
- At `none`/`low`: keep prompts direct and unambiguous; the model won't reason deeply through complex instructions
|
|
129
|
+
- At `medium`/`high`: the model can handle more nuanced multi-step instructions
|
|
130
|
+
- At `xhigh`: maximum reasoning depth; suitable for the hardest analytical tasks
|
|
131
|
+
|
|
132
|
+
### Migration Mapping
|
|
133
|
+
|
|
134
|
+
When migrating prompts across models, use the appropriate reasoning_effort to preserve behavior:
|
|
135
|
+
|
|
136
|
+
| Current Model | Target | reasoning_effort | Notes |
|
|
137
|
+
|---|---|---|---|
|
|
138
|
+
| GPT-4o | GPT-5.2 | `none` | Treat as "fast/low-deliberation" by default |
|
|
139
|
+
| GPT-4.1 | GPT-5.2 | `none` | Same as GPT-4o for snappy behavior |
|
|
140
|
+
| GPT-5 | GPT-5.2 | same (`minimal` -> `none`) | Preserve none/low/medium/high |
|
|
141
|
+
| GPT-5.1 | GPT-5.2 | same | Adjust only after running evals |
|
|
142
|
+
|
|
143
|
+
## Agentic Steerability
|
|
144
|
+
|
|
145
|
+
GPT-5.2 excels at agentic scaffolding and multi-step execution. Use update instructions to control how the model communicates progress. Reuse GPT-5.1 patterns while adding two key tweaks: clamp verbosity of updates and make scope discipline explicit.
|
|
146
|
+
|
|
147
|
+
```
|
|
148
|
+
<user_updates_spec>
|
|
149
|
+
- Send brief updates (1-2 sentences) only when:
|
|
150
|
+
- You start a new major phase of work, or
|
|
151
|
+
- You discover something that changes the plan.
|
|
152
|
+
- Avoid narrating routine tool calls ("reading file...", "running tests...").
|
|
153
|
+
- Each update must include at least one concrete outcome ("Found X", "Confirmed Y", "Updated Z").
|
|
154
|
+
- Do not expand the task beyond what the user asked; if you notice new work, call it out as optional.
|
|
155
|
+
</user_updates_spec>
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
This prevents the model from over-narrating its process while still keeping users informed at meaningful checkpoints.
|
|
159
|
+
|
|
160
|
+
## Tool-Calling Prompt Patterns
|
|
161
|
+
|
|
162
|
+
### Usage Rules
|
|
163
|
+
|
|
164
|
+
Include this in system prompts to guide tool selection and reporting:
|
|
165
|
+
|
|
166
|
+
```
|
|
167
|
+
<tool_usage_rules>
|
|
168
|
+
- Prefer tools over internal knowledge whenever:
|
|
169
|
+
- You need fresh or user-specific data (tickets, orders, configs, logs).
|
|
170
|
+
- You reference specific IDs, URLs, or document titles.
|
|
171
|
+
- Parallelize independent reads (read_file, fetch_record, search_docs) when possible to reduce latency.
|
|
172
|
+
- After any write/update tool call, briefly restate:
|
|
173
|
+
- What changed,
|
|
174
|
+
- Where (ID or path),
|
|
175
|
+
- Any follow-up validation performed.
|
|
176
|
+
</tool_usage_rules>
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Tool Description Guidance
|
|
180
|
+
|
|
181
|
+
When writing tool descriptions for GPT-5.2 prompts:
|
|
182
|
+
|
|
183
|
+
- **Describe tools crisply** in 1-2 sentences. The model parses tool descriptions carefully -- verbose descriptions waste context without improving selection accuracy.
|
|
184
|
+
- **Encourage parallelism** for codebases, vector stores, and multi-entity operations. GPT-5.2 handles parallel tool calls well when tools are marked as independent.
|
|
185
|
+
- **Require verification** for high-impact operations (orders, billing, infrastructure changes). Add a verification step in the tool description or system prompt.
|
|
186
|
+
|
|
187
|
+
## Structured Extraction
|
|
188
|
+
|
|
189
|
+
For extracting structured data from tables, PDFs, emails, and documents:
|
|
190
|
+
|
|
191
|
+
```
|
|
192
|
+
<extraction_spec>
|
|
193
|
+
You will extract structured data from tables/PDFs/emails into JSON.
|
|
194
|
+
|
|
195
|
+
- Always follow this schema exactly (no extra fields):
|
|
196
|
+
{
|
|
197
|
+
"party_name": string,
|
|
198
|
+
"jurisdiction": string | null,
|
|
199
|
+
"effective_date": string | null,
|
|
200
|
+
"termination_clause_summary": string | null
|
|
201
|
+
}
|
|
202
|
+
- If a field is not present in the source, set it to null rather than guessing.
|
|
203
|
+
- Before returning, quickly re-scan the source for any missed fields and correct omissions.
|
|
204
|
+
</extraction_spec>
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
For multi-table/multi-file extraction, serialize per-document results separately and include a stable ID (filename, contract title, page range).
|
|
208
|
+
|
|
209
|
+
Provide the exact JSON schema in the prompt. GPT-5.2 adheres to schema constraints well -- specifying `null` for missing fields prevents hallucinated values.
|
|
210
|
+
|
|
211
|
+
## Compaction Awareness
|
|
212
|
+
|
|
213
|
+
Compaction extends effective context windows for long-running, tool-heavy workflows via the `/responses/compact` endpoint. When writing prompts for systems that use compaction:
|
|
214
|
+
|
|
215
|
+
- **Keep prompts functionally identical when resuming** -- do not change system prompts between compacted sessions, as this causes behavior drift
|
|
216
|
+
- **Design prompts to be self-contained** -- compacted items are opaque; the system prompt is the only stable anchor across compaction boundaries
|
|
217
|
+
- **Compact after major milestones** (not every turn) -- frequent compaction loses nuance
|
|
218
|
+
|
|
219
|
+
## Web Research
|
|
220
|
+
|
|
221
|
+
GPT-5.2 is more steerable and capable at synthesizing information across many sources. Use this template to configure web research behavior:
|
|
222
|
+
|
|
223
|
+
```
|
|
224
|
+
<web_search_rules>
|
|
225
|
+
- Act as an expert research assistant; default to comprehensive, well-structured answers.
|
|
226
|
+
- Prefer web research over assumptions whenever facts may be uncertain or incomplete; include citations for all web-derived information.
|
|
227
|
+
- Research all parts of the query, resolve contradictions, and follow important second-order implications until further research is unlikely to change the answer.
|
|
228
|
+
- Do not ask clarifying questions; instead cover all plausible user intents with both breadth and depth.
|
|
229
|
+
- Write clearly and directly using Markdown (headers, bullets, tables when helpful); define acronyms, use concrete examples, and keep a natural, conversational tone.
|
|
230
|
+
</web_search_rules>
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
### Web Research Agent Reference Prompt
|
|
234
|
+
|
|
235
|
+
The official guide provides this structure for a comprehensive web research agent:
|
|
236
|
+
|
|
237
|
+
- **CORE MISSION**: Answer fully and helpfully with enough evidence for skeptical readers. Never invent facts. Go one step further by adding high-value adjacent material.
|
|
238
|
+
- **PERSONA**: Be the world's greatest research assistant. Engage warmly while avoiding ungrounded flattery. Default to natural, conversational tone.
|
|
239
|
+
- **FACTUALITY AND ACCURACY**: Browse the web and include citations for all non-creative queries. Always browse for latest/current topics, time-sensitive info, recommendations, navigational queries, or ambiguous terms.
|
|
240
|
+
- **CITATIONS**: Include citations after paragraphs containing non-obvious web-derived claims. Use multiple sources for key claims, prioritizing primary sources.
|
|
241
|
+
- **HOW YOU RESEARCH**: Conduct deep research. Use parallel searches when helpful. Research until additional searching is unlikely to materially change the answer.
|
|
242
|
+
- **WRITING GUIDELINES**: Be direct and comprehensive. Use simple language. Use readable Markdown formatting. Do not add potential follow-up questions unless explicitly asked.
|
|
243
|
+
- **REQUIRED VALUE-ADD**: Provide concrete examples, specific numbers/dates, and "how it works" detail. Include relevant, well-researched material that makes answers more useful.
|
|
244
|
+
- **HANDLING AMBIGUITY**: Never ask clarifying questions unless explicitly requested. State your best-guess interpretation and comprehensively cover the most likely intent(s).
|
|
245
|
+
- **IF YOU CANNOT FULLY COMPLY**: Don't lead with blunt refusal. First deliver what you can, then clearly state limitations.
|
|
246
|
+
|
|
247
|
+
## Prompt Migration Guide
|
|
248
|
+
|
|
249
|
+
When migrating prompts to GPT-5.2, follow these steps to isolate changes and preserve behavior:
|
|
250
|
+
|
|
251
|
+
1. **Switch models without prompt changes** -- test the model alone to isolate model-vs-prompt effects. Make one change at a time.
|
|
252
|
+
2. **Pin reasoning_effort** -- explicitly set to match the prior model's latency/depth profile (see migration mapping above).
|
|
253
|
+
3. **Run evals for baseline** -- measure post-switch performance before touching prompts.
|
|
254
|
+
4. **Tune if regressions** -- use targeted constraints (verbosity/format/schema, scope discipline) to restore parity or improve.
|
|
255
|
+
5. **Re-run evals after each change** -- iterate by bumping reasoning_effort one notch or making incremental prompt tweaks, then re-measure.
|
|
256
|
+
|
|
257
|
+
### Prompt-Specific Migration Notes
|
|
258
|
+
|
|
259
|
+
**From GPT-4o / GPT-4.1:**
|
|
260
|
+
- Remove defensive prompting (GPT-5.2 handles edge cases better by default)
|
|
261
|
+
- Add verbosity control template if outputs are too long or too short
|
|
262
|
+
- Add scope constraints template for code generation tasks
|
|
263
|
+
|
|
264
|
+
**From GPT-5 / GPT-5.1:**
|
|
265
|
+
- Test without prompt changes first -- isolate model-vs-prompt effects
|
|
266
|
+
- Add long-context handling template if working with large inputs
|
|
267
|
+
- Add uncertainty template for high-stakes domains
|
|
268
|
+
|
|
269
|
+
## Anti-Patterns
|
|
270
|
+
|
|
271
|
+
- **Asking clarifying questions when you can cover plausible intents** -- instruct the model to present interpretations instead
|
|
272
|
+
- **Expanding task scope beyond user request** -- implement only what was asked
|
|
273
|
+
- **Inventing exact figures or external references when uncertain** -- instruct to hedge or use tools to verify
|
|
274
|
+
- **Rephrasing user requests unless semantics change** -- preserve the user's language
|
|
275
|
+
- **Narrating routine tool calls in agent updates** -- instruct to report only meaningful milestones
|
|
276
|
+
- **Creating extra UI/styling beyond design system specs** -- enforce scope constraints
|
|
277
|
+
- **Changing prompts during model migrations** -- test model alone first, then tune prompts
|
|
278
|
+
- **Over-prompting for default behavior** -- GPT-5.2's instruction adherence means less scaffolding is needed for basics
|
|
279
|
+
- **Verbose tool descriptions** -- keep to 1-2 sentences; more wastes context without improving selection
|
|
280
|
+
|
|
281
|
+
## Quality Checklist
|
|
282
|
+
|
|
283
|
+
- [ ] Verbosity expectations are set explicitly (sentence counts, bullet limits)
|
|
284
|
+
- [ ] Scope constraints are defined for code and design tasks
|
|
285
|
+
- [ ] Long-context grounding is added for inputs over 10k tokens
|
|
286
|
+
- [ ] Uncertainty handling is specified for the domain's risk level
|
|
287
|
+
- [ ] `reasoning_effort` is chosen appropriately for the task complexity
|
|
288
|
+
- [ ] Tool descriptions are crisp (1-2 sentences each)
|
|
289
|
+
- [ ] Extraction tasks include exact JSON schema with null handling
|
|
290
|
+
- [ ] Web search instructions specify research depth and citation behavior
|
|
291
|
+
- [ ] Prompt has been tested without changes after model migration
|
|
292
|
+
|
|
293
|
+
## Reference
|
|
294
|
+
|
|
295
|
+
- Official GPT-5.2 Prompting Guide: https://developers.openai.com/cookbook/examples/gpt-5/gpt-5-2_prompting_guide/
|
|
@@ -0,0 +1,315 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: opus-4-6-prompting
|
|
3
|
+
description: Apply when creating or editing prompts targeting Opus 4.6. Covers adaptive thinking, XML tag structure, language softening, behavioral tuning, over-engineering prevention, tool overtriggering mitigation, and prompt migration.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Opus 4.6 Prompting
|
|
7
|
+
|
|
8
|
+
## When to Use
|
|
9
|
+
|
|
10
|
+
- Creating or editing system prompts targeting Opus 4.6
|
|
11
|
+
- Tuning tool usage, autonomy, and safety guardrails in prompts
|
|
12
|
+
- Adjusting prompt language to match Opus 4.6's sensitivity
|
|
13
|
+
- Migrating prompt text from older Claude models
|
|
14
|
+
|
|
15
|
+
## Overview
|
|
16
|
+
|
|
17
|
+
Opus 4.6 is Anthropic's most capable frontier model. It is more responsive to system prompts, more autonomous in agentic workflows, and more capable at long-horizon reasoning than previous Claude models. Techniques that reduced undertriggering in earlier models can now cause overtriggering, and prefilling is no longer supported.
|
|
18
|
+
|
|
19
|
+
<context>
|
|
20
|
+
Key characteristics to design around:
|
|
21
|
+
|
|
22
|
+
- **Adaptive Thinking**: Dynamically decides when and how deeply to reason, replacing manual `budget_tokens`
|
|
23
|
+
- **Prompt Sensitivity**: More responsive to system prompt instructions than any previous Claude model -- better compliance but greater risk of overcorrection
|
|
24
|
+
- **Enhanced Autonomy**: Proactively takes action, discovers state from filesystem, orchestrates subagents
|
|
25
|
+
- **Long-Horizon Reasoning**: Exceptional state tracking across extended interactions
|
|
26
|
+
- **Direct Communication**: Less verbose by default, skips summaries after tool use, uses fewer filler phrases
|
|
27
|
+
</context>
|
|
28
|
+
|
|
29
|
+
## General Principles
|
|
30
|
+
|
|
31
|
+
### Be Explicit with Instructions
|
|
32
|
+
|
|
33
|
+
Opus 4.6 follows instructions with high fidelity. If you want behavior beyond the literal request, state it explicitly -- the model does not infer unstated requirements.
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
Good: "Read the Python files in the src/ directory, identify performance bottlenecks,
|
|
37
|
+
and implement optimizations. Explain your reasoning before each change."
|
|
38
|
+
|
|
39
|
+
Avoid: "Make the code faster."
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
### Add Context and Motivation
|
|
43
|
+
|
|
44
|
+
Explain WHY an instruction exists. Opus 4.6 uses context to prioritize and calibrate behavior -- a rule with a reason is followed more consistently than a bare directive.
|
|
45
|
+
|
|
46
|
+
```xml
|
|
47
|
+
<task>
|
|
48
|
+
Format all responses as plain text without markdown.
|
|
49
|
+
|
|
50
|
+
<context>
|
|
51
|
+
Your response will be read aloud by a text-to-speech system.
|
|
52
|
+
Users are visually impaired and rely entirely on audio output.
|
|
53
|
+
Markdown formatting characters would be spoken literally and disrupt comprehension.
|
|
54
|
+
</context>
|
|
55
|
+
</task>
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
### Be Vigilant with Examples
|
|
59
|
+
|
|
60
|
+
Opus 4.6 pays close attention to every detail in your prompt, including examples. If an example contains an anti-pattern, the model may reproduce it. Every example should reflect exactly the behavior you want.
|
|
61
|
+
|
|
62
|
+
### Long-Horizon Reasoning
|
|
63
|
+
|
|
64
|
+
Opus 4.6 excels at tasks spanning many steps, files, or reasoning chains. Structure long tasks as sequences of verifiable milestones rather than monolithic instructions.
|
|
65
|
+
|
|
66
|
+
## Thinking
|
|
67
|
+
|
|
68
|
+
Opus 4.6 uses **adaptive thinking** -- the model decides how deeply to reason based on task complexity. The `effort` parameter (`max`/`high`/`medium`/`low`) is the primary control lever for tuning the cost/quality trade-off. Before adding prompt constraints to reduce overthinking, try lowering `effort` first.
|
|
69
|
+
|
|
70
|
+
### Controlling Thinking via Prompts
|
|
71
|
+
|
|
72
|
+
You can shape thinking behavior through prompt instructions:
|
|
73
|
+
|
|
74
|
+
```
|
|
75
|
+
When you're deciding how to approach a problem, choose an approach and commit to it.
|
|
76
|
+
Avoid revisiting decisions unless you encounter new information that directly
|
|
77
|
+
contradicts your reasoning.
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
### Thinking Sensitivity
|
|
81
|
+
|
|
82
|
+
When adaptive thinking is disabled, replace "think" with alternatives to avoid inadvertently triggering internal reasoning:
|
|
83
|
+
|
|
84
|
+
| Avoid | Use Instead |
|
|
85
|
+
|-------|-------------|
|
|
86
|
+
| "think about" | "consider" |
|
|
87
|
+
| "think through" | "evaluate" |
|
|
88
|
+
| "think carefully" | "analyze carefully" |
|
|
89
|
+
| "I think" | "I believe" |
|
|
90
|
+
| "think step by step" | "work through step by step" |
|
|
91
|
+
| "let me think" | "let me consider" |
|
|
92
|
+
|
|
93
|
+
Note: Official Anthropic docs confirm this sensitivity for Opus 4.5. It likely applies to 4.6 as well but has not been explicitly confirmed for this model version.
|
|
94
|
+
|
|
95
|
+
## Prompt Structure
|
|
96
|
+
|
|
97
|
+
### XML Tags
|
|
98
|
+
|
|
99
|
+
XML tags help Opus 4.6 distinguish between context, instructions, and output format. Use them where they add clarity; default to markdown headers and tables when those are sufficient.
|
|
100
|
+
|
|
101
|
+
**Recommended tags:**
|
|
102
|
+
- `<document>`, `<context>`: For input content
|
|
103
|
+
- `<instructions>`, `<task>`: For directives
|
|
104
|
+
- `<example>`, `<examples>`: For demonstrations
|
|
105
|
+
- `<input>`, `<output>`: For input/output pairs
|
|
106
|
+
- `<constraint>`, `<requirements>`: For limitations
|
|
107
|
+
- `<format>`, `<output_format>`: For response structure
|
|
108
|
+
|
|
109
|
+
### Prefilling Not Supported
|
|
110
|
+
|
|
111
|
+
Prefilling assistant responses is not supported in Opus 4.6 (returns a 400 error). To control output format, use Structured Outputs or prompt instructions. For example, to get JSON output without preamble, instruct: "Respond with a JSON object. No preamble or explanation." To continue an interrupted response, move the context to the user turn: "Your previous response was interrupted after: [last content]. Continue from there."
|
|
112
|
+
|
|
113
|
+
## Behavioral Tuning
|
|
114
|
+
|
|
115
|
+
### Tool Overtriggering
|
|
116
|
+
|
|
117
|
+
Prompts designed to reduce undertriggering in earlier models cause overtriggering in Opus 4.6. The model is more compliant by default, so forceful language overcorrects. Dial back to calm, conditional instructions.
|
|
118
|
+
|
|
119
|
+
| Before (causes overtriggering) | After (calibrated for Opus 4.6) |
|
|
120
|
+
|-------------------------------|--------------------------------|
|
|
121
|
+
| `CRITICAL: You MUST use this tool when...` | `Use this tool when...` |
|
|
122
|
+
| `You MUST ALWAYS search before answering` | `Search before answering when the question involves specific facts` |
|
|
123
|
+
| `NEVER respond without checking...` | `Check [source] when the user asks about [topic]` |
|
|
124
|
+
| `REQUIRED: Execute this tool for every query` | `Execute this tool when the query involves [condition]` |
|
|
125
|
+
|
|
126
|
+
**Language replacements:**
|
|
127
|
+
|
|
128
|
+
| Aggressive Term | Opus 4.6 Equivalent |
|
|
129
|
+
|----------------|---------------------|
|
|
130
|
+
| `CRITICAL` | Remove entirely |
|
|
131
|
+
| `You MUST` | State the instruction directly |
|
|
132
|
+
| `ALWAYS` | State the instruction directly |
|
|
133
|
+
| `NEVER` | `Don't` or state the positive alternative |
|
|
134
|
+
| `REQUIRED` | Remove entirely |
|
|
135
|
+
| `MANDATORY` | Remove or use `should` |
|
|
136
|
+
| `IMPORTANT:` | Remove the prefix, keep the instruction |
|
|
137
|
+
|
|
138
|
+
### Overthinking and Excessive Thoroughness
|
|
139
|
+
|
|
140
|
+
1. **Replace blanket defaults with targeted instructions:**
|
|
141
|
+
|
|
142
|
+
```
|
|
143
|
+
Before: "ALWAYS read ALL related files before making ANY changes."
|
|
144
|
+
After: "Read files directly relevant to the change. For single-file edits,
|
|
145
|
+
reading the target file is sufficient."
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
2. **Add decisiveness constraints:**
|
|
149
|
+
|
|
150
|
+
```
|
|
151
|
+
When you're deciding how to approach a problem, choose an approach and commit
|
|
152
|
+
to it. Avoid revisiting decisions unless you encounter new information that
|
|
153
|
+
directly contradicts your reasoning.
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
3. **Use `effort` as the primary control lever.** Set `effort: "medium"` or `effort: "low"` for straightforward tasks before adding prompt constraints.
|
|
157
|
+
|
|
158
|
+
4. **Remove anti-laziness prompts.** Instructions like "be thorough", "do not be lazy", "think carefully" amplify Opus 4.6's already-proactive behavior into runaway loops.
|
|
159
|
+
|
|
160
|
+
5. **Remove explicit think tool instructions.** Prompts like "use the think tool to plan your approach" cause over-planning in Opus 4.6. The model plans effectively without being told to.
|
|
161
|
+
|
|
162
|
+
6. **Remove over-prompting** added to compensate for undertriggering in earlier models.
|
|
163
|
+
|
|
164
|
+
### Over-Engineering Prevention
|
|
165
|
+
|
|
166
|
+
Opus 4.6 is capable enough to elaborate beyond what was asked. Scope boundaries prevent the model from adding unrequested features, defensive code, or premature abstractions.
|
|
167
|
+
|
|
168
|
+
```xml
|
|
169
|
+
<scope_constraints>
|
|
170
|
+
Only make changes that are directly requested or clearly necessary. Keep solutions
|
|
171
|
+
simple and focused:
|
|
172
|
+
- Don't add features, refactor code, or make "improvements" beyond what was asked.
|
|
173
|
+
- Don't add docstrings, comments, or type annotations to code you didn't change.
|
|
174
|
+
- Don't add error handling, fallbacks, or validation for scenarios that can't happen.
|
|
175
|
+
- Don't create helpers, utilities, or abstractions for one-time operations.
|
|
176
|
+
</scope_constraints>
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Balancing Autonomy and Safety
|
|
180
|
+
|
|
181
|
+
Opus 4.6's enhanced autonomy makes it important to distinguish reversible from irreversible actions explicitly.
|
|
182
|
+
|
|
183
|
+
```xml
|
|
184
|
+
<action_safety>
|
|
185
|
+
Before taking any action, evaluate its reversibility and impact:
|
|
186
|
+
|
|
187
|
+
Actions that need user confirmation:
|
|
188
|
+
- Destructive operations (deleting files, dropping tables, overwriting data)
|
|
189
|
+
- Hard-to-reverse operations (force push, database migrations, deployment)
|
|
190
|
+
- Operations visible to others (posting messages, sending emails, creating PRs)
|
|
191
|
+
|
|
192
|
+
Actions you can take without confirmation:
|
|
193
|
+
- Reading files and gathering information
|
|
194
|
+
- Creating new files (non-destructive)
|
|
195
|
+
- Running tests
|
|
196
|
+
- Local git commits
|
|
197
|
+
- Writing to scratch/temporary files
|
|
198
|
+
</action_safety>
|
|
199
|
+
```
|
|
200
|
+
|
|
201
|
+
### Communication Style
|
|
202
|
+
|
|
203
|
+
Opus 4.6 is more direct and less verbose than previous models. It skips summaries after tool use by default. If you want summaries, add an explicit instruction:
|
|
204
|
+
|
|
205
|
+
```
|
|
206
|
+
After completing a task that involves tool use, provide a quick summary of the work you've done.
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
Only add this if you want summaries. The default behavior (skipping them) may be preferable.
|
|
210
|
+
|
|
211
|
+
### Action vs Suggestion Steering
|
|
212
|
+
|
|
213
|
+
Opus 4.6 may suggest instead of act, or act instead of suggest. Steer with explicit instructions.
|
|
214
|
+
|
|
215
|
+
To default to implementation:
|
|
216
|
+
|
|
217
|
+
```xml
|
|
218
|
+
<default_to_action>
|
|
219
|
+
By default, implement changes rather than only suggesting them. If the user's intent
|
|
220
|
+
is unclear, infer the most useful likely action and proceed, using tools to discover
|
|
221
|
+
any missing details instead of guessing.
|
|
222
|
+
</default_to_action>
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
To default to suggestions:
|
|
226
|
+
|
|
227
|
+
```xml
|
|
228
|
+
<do_not_act_before_instructions>
|
|
229
|
+
Do not jump into implementation or change files unless clearly instructed to make
|
|
230
|
+
changes. Default to providing information and recommendations rather than taking
|
|
231
|
+
action. Only proceed with edits when the user explicitly requests them.
|
|
232
|
+
</do_not_act_before_instructions>
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
### Hallucination Minimization
|
|
236
|
+
|
|
237
|
+
Opus 4.6 is less prone to hallucinations but can still speculate about unread code:
|
|
238
|
+
|
|
239
|
+
```xml
|
|
240
|
+
<investigate_before_answering>
|
|
241
|
+
Never speculate about code you have not opened. If the user references a specific
|
|
242
|
+
file, read the file before answering. Investigate and read relevant files before
|
|
243
|
+
answering questions about the codebase.
|
|
244
|
+
</investigate_before_answering>
|
|
245
|
+
```
|
|
246
|
+
|
|
247
|
+
### Temporary File Cleanup
|
|
248
|
+
|
|
249
|
+
Opus 4.6 may create scratch files during iteration. Add cleanup instructions if this is undesirable:
|
|
250
|
+
|
|
251
|
+
```
|
|
252
|
+
If you create any temporary new files, scripts, or helper files for iteration,
|
|
253
|
+
clean up these files by removing them at the end of the task.
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
### Test Hard-Coding Prevention
|
|
257
|
+
|
|
258
|
+
The model can focus too heavily on making tests pass rather than solving the underlying problem:
|
|
259
|
+
|
|
260
|
+
```
|
|
261
|
+
Write a general-purpose solution using standard tools. Do not hard-code values or
|
|
262
|
+
create solutions that only work for specific test inputs. Implement the actual logic
|
|
263
|
+
that solves the problem generally. If tests are incorrect, inform me rather than
|
|
264
|
+
working around them.
|
|
265
|
+
```
|
|
266
|
+
|
|
267
|
+
### Subagent Overuse
|
|
268
|
+
|
|
269
|
+
```xml
|
|
270
|
+
<subagent_guidance>
|
|
271
|
+
Use subagents when tasks can run in parallel, require isolated context, or involve
|
|
272
|
+
independent workstreams. For simple tasks, sequential operations, or single-file
|
|
273
|
+
edits, work directly rather than delegating.
|
|
274
|
+
</subagent_guidance>
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
### LaTeX Output Defaults
|
|
278
|
+
|
|
279
|
+
Opus 4.6 defaults to LaTeX for mathematical expressions. Opt out if your rendering target does not support it:
|
|
280
|
+
|
|
281
|
+
```
|
|
282
|
+
When presenting mathematical expressions, use plain text notation rather than
|
|
283
|
+
LaTeX. For example, write "x^2 + 3x + 1" instead of "$x^2 + 3x + 1$".
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
## Prompt Migration Checklist
|
|
287
|
+
|
|
288
|
+
- [ ] Replace CRITICAL, MUST, ALWAYS, NEVER, REQUIRED with calm, direct equivalents
|
|
289
|
+
- [ ] Remove prefilled assistant responses; use Structured Outputs or prompt instructions
|
|
290
|
+
- [ ] Remove anti-laziness prompts ("be thorough", "think carefully", "do not be lazy")
|
|
291
|
+
- [ ] Remove explicit think tool instructions ("use the think tool to plan")
|
|
292
|
+
- [ ] Remove compensatory over-prompting added for older models
|
|
293
|
+
- [ ] Replace "think" with "consider", "evaluate", "analyze" if adaptive thinking is disabled
|
|
294
|
+
- [ ] Add safety guardrails for destructive/irreversible actions
|
|
295
|
+
- [ ] Add scope constraints to prevent over-engineering
|
|
296
|
+
- [ ] Add subagent usage guidance for tool-heavy workflows
|
|
297
|
+
- [ ] Add LaTeX opt-out if rendering target does not support LaTeX
|
|
298
|
+
- [ ] Test with effort "medium" first, then adjust up or down
|
|
299
|
+
|
|
300
|
+
## Anti-Patterns
|
|
301
|
+
|
|
302
|
+
- **Aggressive emphasis**: `CRITICAL: You MUST ALWAYS...` overcorrects in Opus 4.6. Use direct, calm instructions.
|
|
303
|
+
- **Anti-laziness prompts**: "Be thorough", "think carefully", "do not be lazy" amplify proactive behavior into runaway loops.
|
|
304
|
+
- **Explicit think tool instructions**: "Use the think tool to plan your approach" causes over-planning. The model plans effectively without being told to.
|
|
305
|
+
- **Suggesting instead of acting**: If you want implementation, say "change" or "implement", not "suggest changes". The model takes verbs literally.
|
|
306
|
+
- **Conflicting instructions**: "Be concise but also very detailed" -- pick one or separate them by context.
|
|
307
|
+
- **Ambiguous examples**: Every example the model sees is a pattern it may reproduce. Be precise.
|
|
308
|
+
- **Overloaded prompts**: Break large requests into phases.
|
|
309
|
+
- **Missing output format**: Specify expected response structure.
|
|
310
|
+
- **Over-prompting for default behavior**: Remove instructions for things Opus 4.6 does by default.
|
|
311
|
+
|
|
312
|
+
## Reference
|
|
313
|
+
|
|
314
|
+
- Official Opus 4.6 Prompting Guide: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices/
|
|
315
|
+
- Official Claude Opus 4.5 Migration Guide: https://github.com/anthropics/claude-code/tree/main/plugins/claude-opus-4-5-migration
|