pdd-cli 0.0.45__py3-none-any.whl → 0.0.90__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- pdd/__init__.py +4 -4
- pdd/agentic_common.py +863 -0
- pdd/agentic_crash.py +534 -0
- pdd/agentic_fix.py +1179 -0
- pdd/agentic_langtest.py +162 -0
- pdd/agentic_update.py +370 -0
- pdd/agentic_verify.py +183 -0
- pdd/auto_deps_main.py +15 -5
- pdd/auto_include.py +63 -5
- pdd/bug_main.py +3 -2
- pdd/bug_to_unit_test.py +2 -0
- pdd/change_main.py +11 -4
- pdd/cli.py +22 -1181
- pdd/cmd_test_main.py +73 -21
- pdd/code_generator.py +58 -18
- pdd/code_generator_main.py +672 -25
- pdd/commands/__init__.py +42 -0
- pdd/commands/analysis.py +248 -0
- pdd/commands/fix.py +140 -0
- pdd/commands/generate.py +257 -0
- pdd/commands/maintenance.py +174 -0
- pdd/commands/misc.py +79 -0
- pdd/commands/modify.py +230 -0
- pdd/commands/report.py +144 -0
- pdd/commands/templates.py +215 -0
- pdd/commands/utility.py +110 -0
- pdd/config_resolution.py +58 -0
- pdd/conflicts_main.py +8 -3
- pdd/construct_paths.py +258 -82
- pdd/context_generator.py +10 -2
- pdd/context_generator_main.py +113 -11
- pdd/continue_generation.py +47 -7
- pdd/core/__init__.py +0 -0
- pdd/core/cli.py +503 -0
- pdd/core/dump.py +554 -0
- pdd/core/errors.py +63 -0
- pdd/core/utils.py +90 -0
- pdd/crash_main.py +44 -11
- pdd/data/language_format.csv +71 -63
- pdd/data/llm_model.csv +20 -18
- pdd/detect_change_main.py +5 -4
- pdd/fix_code_loop.py +330 -76
- pdd/fix_error_loop.py +207 -61
- pdd/fix_errors_from_unit_tests.py +4 -3
- pdd/fix_main.py +75 -18
- pdd/fix_verification_errors.py +12 -100
- pdd/fix_verification_errors_loop.py +306 -272
- pdd/fix_verification_main.py +28 -9
- pdd/generate_output_paths.py +93 -10
- pdd/generate_test.py +16 -5
- pdd/get_jwt_token.py +9 -2
- pdd/get_run_command.py +73 -0
- pdd/get_test_command.py +68 -0
- pdd/git_update.py +70 -19
- pdd/incremental_code_generator.py +2 -2
- pdd/insert_includes.py +11 -3
- pdd/llm_invoke.py +1269 -103
- pdd/load_prompt_template.py +36 -10
- pdd/pdd_completion.fish +25 -2
- pdd/pdd_completion.sh +30 -4
- pdd/pdd_completion.zsh +79 -4
- pdd/postprocess.py +10 -3
- pdd/preprocess.py +228 -15
- pdd/preprocess_main.py +8 -5
- pdd/prompts/agentic_crash_explore_LLM.prompt +49 -0
- pdd/prompts/agentic_fix_explore_LLM.prompt +45 -0
- pdd/prompts/agentic_fix_harvest_only_LLM.prompt +48 -0
- pdd/prompts/agentic_fix_primary_LLM.prompt +85 -0
- pdd/prompts/agentic_update_LLM.prompt +1071 -0
- pdd/prompts/agentic_verify_explore_LLM.prompt +45 -0
- pdd/prompts/auto_include_LLM.prompt +100 -905
- pdd/prompts/detect_change_LLM.prompt +122 -20
- pdd/prompts/example_generator_LLM.prompt +22 -1
- pdd/prompts/extract_code_LLM.prompt +5 -1
- pdd/prompts/extract_program_code_fix_LLM.prompt +7 -1
- pdd/prompts/extract_prompt_update_LLM.prompt +7 -8
- pdd/prompts/extract_promptline_LLM.prompt +17 -11
- pdd/prompts/find_verification_errors_LLM.prompt +6 -0
- pdd/prompts/fix_code_module_errors_LLM.prompt +4 -2
- pdd/prompts/fix_errors_from_unit_tests_LLM.prompt +8 -0
- pdd/prompts/fix_verification_errors_LLM.prompt +22 -0
- pdd/prompts/generate_test_LLM.prompt +21 -6
- pdd/prompts/increase_tests_LLM.prompt +1 -5
- pdd/prompts/insert_includes_LLM.prompt +228 -108
- pdd/prompts/trace_LLM.prompt +25 -22
- pdd/prompts/unfinished_prompt_LLM.prompt +85 -1
- pdd/prompts/update_prompt_LLM.prompt +22 -1
- pdd/pytest_output.py +127 -12
- pdd/render_mermaid.py +236 -0
- pdd/setup_tool.py +648 -0
- pdd/simple_math.py +2 -0
- pdd/split_main.py +3 -2
- pdd/summarize_directory.py +49 -6
- pdd/sync_determine_operation.py +543 -98
- pdd/sync_main.py +81 -31
- pdd/sync_orchestration.py +1334 -751
- pdd/sync_tui.py +848 -0
- pdd/template_registry.py +264 -0
- pdd/templates/architecture/architecture_json.prompt +242 -0
- pdd/templates/generic/generate_prompt.prompt +174 -0
- pdd/trace.py +168 -12
- pdd/trace_main.py +4 -3
- pdd/track_cost.py +151 -61
- pdd/unfinished_prompt.py +49 -3
- pdd/update_main.py +549 -67
- pdd/update_model_costs.py +2 -2
- pdd/update_prompt.py +19 -4
- {pdd_cli-0.0.45.dist-info → pdd_cli-0.0.90.dist-info}/METADATA +19 -6
- pdd_cli-0.0.90.dist-info/RECORD +153 -0
- {pdd_cli-0.0.45.dist-info → pdd_cli-0.0.90.dist-info}/licenses/LICENSE +1 -1
- pdd_cli-0.0.45.dist-info/RECORD +0 -116
- {pdd_cli-0.0.45.dist-info → pdd_cli-0.0.90.dist-info}/WHEEL +0 -0
- {pdd_cli-0.0.45.dist-info → pdd_cli-0.0.90.dist-info}/entry_points.txt +0 -0
- {pdd_cli-0.0.45.dist-info → pdd_cli-0.0.90.dist-info}/top_level.txt +0 -0
pdd/prompts/agentic_update_LLM.prompt (new file):

@@ -0,0 +1,1071 @@
% You are an expert PDD (Prompt-Driven Development) engineer. Your task is to update a prompt file to reflect code changes while making it as compact as possible.

% Prompting Guide (follow these best practices):
# Prompt‑Driven Development Prompting Guide

This guide shows how to write effective prompts for Prompt‑Driven Development (PDD). It distills best practices from the PDD whitepaper, the PDD doctrine, and working patterns in this repo. It also contrasts PDD prompts with interactive agentic coding tools (e.g., Claude Code, Cursor), where prompts act as ad‑hoc patches instead of the source of truth.

References: pdd/docs/whitepaper.md, pdd/docs/prompt-driven-development-doctrine.md, README.md (repo structure, conventions), [Effective Context Engineering](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents), [Anthropic Prompt Engineering Overview](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview).

---

## Quickstart: PDD in 5 Minutes

If you are new to Prompt-Driven Development (PDD), follow this recipe (a minimal sketch of the resulting prompt appears below):

1. **Think "One Prompt = One Module":** Don't try to generate the whole app at once. Focus on one file (e.g., `user_service.py`).
2. **Use a Template:** Start with a clear structure: Role, Requirements, Dependencies, Instructions.
3. **Explicitly Include Context:** Use `<include>path/to/file</include>` to give the model *only* what it needs (e.g., a shared preamble or a dependency interface). This is a **PDD directive**, not just XML.
4. **Regenerate, Don't Patch:** If the code is wrong, fix it using `pdd fix`. This updates the system's memory so the next `pdd generate` is grounded in the correct solution.
5. **Verify:** Run the generated code/tests.

*Tip: Treat your prompt like source code. It is the single source of truth.*
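
A minimal sketch of that structure (the module name, paths, and requirements here are hypothetical):

```text
% You are an expert Python engineer. Generate `user_service.py`, a module that manages user records.

% Requirements
1. Function: create_user(name, email) → User; reject invalid emails.
2. ...

% Dependencies
<include>context/project_preamble.prompt</include>
<include>context/db_client_example.py</include>

% Instructions
- Implement in `src/user_service.py`.
```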

---

## Glossary

- **Context Engineering:** The art of curating exactly what information (code, docs, examples) fits into the LLM's limited "working memory" (context window) to get the best result.
- **Shared Preamble:** A standard text file (e.g., `project_preamble.prompt`) included in every prompt to enforce common rules like coding style, forbidden libraries, and formatting.
- **PDD Directive:** Special tags like `<include>` or `<shell>` that the PDD tool processes *before* sending the text to the AI. The AI sees the *result* (e.g., the included file content), not the tag.
- **Source of Truth:** The definitive record. In PDD, the **Prompt** is the source of truth; the code is just a temporary artifact generated from it.
- **Grounding (Few-Shot History):** The process where the PDD system automatically uses successful past pairs of (Prompt, Code) as "few-shot" examples during generation. This ensures that regenerated code adheres to the established style and logic of the previous version, preventing the model from hallucinating a completely different implementation.
- **Drift:** When the generated code slowly diverges from the prompt's intent over time, or when manual edits to code make it inconsistent with the prompt.

---

## Why PDD Prompts (Not Patches)

- Prompts are the source of truth; code is a generated artifact. Update the prompt and regenerate instead of patching code piecemeal.
- Regeneration preserves conceptual integrity and reduces long‑term maintenance cost (see pdd/docs/whitepaper.md).
- Prompts consolidate intent, constraints, dependencies, and examples into one place so the model can enforce them.
- Tests accumulate across regenerations and act as a regression net; prompts and tests stay in sync.

Contrast with interactive patching (Claude Code, Cursor): prompts are ephemeral instructions for local diffs. They are great for short, local fixes, but tend to drift from the original intent because context is implicit and often lost. In PDD, prompts are versioned, explicit, and designed for batch, reproducible generation.

---

## The PDD Mental Model

- One prompt typically maps to one code file or narrowly scoped module.
- You explicitly curate the context to place in the model’s window (don’t “dump the repo”).
- Change behavior by editing the prompt; re‑generate the file; run crash/verify/test/fix; then update the prompt with learnings.
- Keep the “dev unit” synchronized: prompt + generated code + minimal runnable example + tests.

Key principles: everything is explicit, prompts are the programming language, and you regenerate rather than patch.

Dev Unit (Prompt with Code, Example, Test):

```mermaid
flowchart TB
  P[Prompt]
  C[Code]
  E[Example]
  T[Tests]

  P --> C
  P --> E
  P --> T
```

Notes:
- The prompt defines intent. Code, example, and tests are generated artifacts.
- Regenerate rather than patch; keep tests accumulating over time.

---

## Automated Grounding (PDD Cloud)

Unlike standard LLM interactions where every request is a blank slate, PDD Cloud uses **Automated Grounding** to prevent "implementation drift."

### How It Works

When you run `pdd generate`, the system:
1. Embeds your prompt into a vector
2. Searches for similar prompts in the cloud database (cosine similarity)
3. Auto-injects the closest (prompt, code) pair as a few-shot example

**This is automatic.** You don't configure it. As you edit your prompt:
- The embedding changes
- Different examples may be retrieved
- Generation naturally adapts to your prompt's content

On first generation: Similar existing modules in your project provide grounding.
On re-generation: Your prior successful generation is typically the closest match.
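
Conceptually, the retrieval step resembles the following sketch. This is not PDD's actual implementation; `embed` and `db` are hypothetical stand-ins for the embedding model and the cloud database:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction (very similar prompts), ~0.0 = unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_grounding(prompt_text: str, db: list[dict], embed) -> dict:
    """Return the stored (prompt, code) pair most similar to the new prompt.

    Each db row is assumed to look like {"embedding": vector, "prompt": str, "code": str}.
    """
    query = embed(prompt_text)
    return max(db, key=lambda row: cosine_similarity(query, row["embedding"]))

# The winning pair is injected ahead of your prompt as a few-shot example,
# so regeneration stays anchored to the prior successful implementation.
```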

### Why This Matters for Prompt Writing

- **Your prompt wording affects grounding.** Similar prompts retrieve similar examples.
- **Implementation patterns are handled automatically.** Grounding provides structural consistency from similar modules (class vs functional, helper patterns, etc.).
- **Prompts can be minimal.** Focus on requirements; grounding handles implementation patterns.

*Note: This is distinct from "Examples as Interfaces" (which teach how to **use** a dependency). Grounding teaches the model how to **write** the current module.*

> **Local users (no cloud):** Without grounding, prompts must be more detailed—include structural guidance and explicit examples via `<include>`. Use a shared preamble for coding style. The minimal prompt guidance in this document assumes cloud access.

---

## Grounding Overrides: Pin & Exclude (PDD Cloud)

For users with PDD Cloud access, you can override automatic grounding using XML tags:

**`<pin>module_name</pin>`** — Force a specific example to always be included
- Use case: Ensure a critical module always follows a "golden" pattern
- Use case: Bootstrap a new module with a specific style

**`<exclude>module_name</exclude>`** — Block a specific example from being retrieved
- Use case: Escape an old pattern that's pulling generation in the wrong direction
- Use case: Intentionally break from established patterns for a redesign

These tags are processed by the preprocessor (like `<include>`) and removed before the LLM sees the prompt.

**Most prompts don't need these.** Automatic grounding works well for:
- Standard modules with similar existing examples
- Re-generations of established modules
- Modules following common project patterns
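
When you do need an override, it can sit alongside the other directives in the prompt (module names here are hypothetical, and the placement is illustrative):

```text
<pin>payment_service</pin>
<exclude>legacy_payment_service</exclude>

% You are an expert Python engineer. Create `refund_service`, which issues refunds...
```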

---

## Anatomy of a Good PDD Prompt

A well-designed prompt contains **only what can't be handled elsewhere**. With cloud grounding and accumulated tests, prompts can be minimal.

### Required Sections

1. **Role and scope** (1-2 sentences): What this module does
2. **Requirements** (5-10 items): Functional and non-functional specs
3. **Dependencies** (via `<include>`): Only external or critical interfaces

### Optional Sections

4. **Instructions**: Only if default behavior needs overriding
5. **Deliverables**: Only if non-obvious

### What NOT to Include

- **Coding style** (naming, formatting, imports) → Handled by shared preamble
- **Implementation patterns** (class structure, helpers) → Handled by grounding
- **Every edge case** → Handled by accumulated tests
- **Implementation steps** → Let the LLM decide (unless critical)

### Target Size: Prompt-to-Code Ratio

Aim for **10-30%** of your expected code size:

| Ratio | Meaning |
|-------|---------|
| **< 10%** | Too vague—missing contracts, error handling, or key constraints |
| **10-30%** | Just right—requirements and contracts without implementation details |
| **> 50%** | Too detailed—prompt is doing preamble's or grounding's job |

If your prompt exceeds 30% (e.g., more than ~60 prompt lines for a ~200-line module), ask: Am I specifying things that preamble, grounding, or tests should handle?

**Note:** Tests are generated from the module prompt (via Requirements), so explicit Testing sections are unnecessary—well-written Requirements are inherently testable. Use `context/test.prompt` for project-wide test guidance.

See pdd/pdd/templates/generic/generate_prompt.prompt for a concrete scaffold.

---

## Prompt Syntax Essentials

These patterns are used across prompts in this repo:

- Preamble and role: start with a concise, authoritative description of the task and audience (e.g., “You are an expert Python engineer…”).
- Includes for context: bring only what the model needs.
- Single include: `<include>path/to/file</include>`. **Note:** This is a PDD directive, not standard XML. The PDD tool replaces this tag with the actual file content *before* the LLM sees it. (Handles both text and images.)
- Multiple: `<include-many>` with the paths to pull in (e.g., path1, path2, …).
- Grouping: wrap includes in a semantic tag to name the dependency or file they represent, for example:

```xml
<render_js>
<include>src/render.js</include>
</render_js>
```

- When including larger files inline for clarity, wrap with opening/closing tags named after the file, e.g. `<render.js>…</render.js>`.
- Inputs/outputs: state them explicitly (names, types, shapes). Prompts should define Inputs/Outputs and steps clearly.
- Steps & Chain of Thought: Outline a short, deterministic plan. For complex logical tasks, explicitly instruct the model to "Analyze the requirements and think step-by-step before writing code." This improves accuracy on difficult reasoning problems.
- Constraints: specify style, performance targets, security, and error handling.
- Environment: reference required env vars (e.g., `PDD_PATH`) when reading data files.

Tip: Prefer small, named sections using XML‑style tags to make context scannable and deterministic.

### Special XML Tags: pdd, shell, web

The PDD preprocessor supports additional XML‑style tags to keep prompts clean, reproducible, and self‑contained. Processing order (per spec) is: `pdd` → `include`/`include-many` → `shell` → `web`. When `recursive=True`, `<shell>` and `<web>` tags are deferred rather than executed.

The `<web>` tag:
- Purpose: fetch the page (via Firecrawl) and inline the markdown content.
- Behavior: executes during non‑recursive preprocessing; on failure, inserts a bracketed error note.
- Example: `<web>https://docs.litellm.ai/docs/completion/json_mode</web>`

> ⚠️ **Warning: Non-Deterministic Tags**
>
> `<shell>` and `<web>` introduce **non-determinism**:
> - `<shell>` output varies by environment (different machines, different results)
> - `<web>` content changes over time (same URL, different content)
>
> **Impact:** Same prompt file → different generations on different machines/times
>
> **Prefer instead:** Capture output to a static file, then `<include>` that file. This ensures reproducible regeneration.

Use these tags sparingly. When you must use them, prefer stable commands with bounded output (e.g., `head -n 20` in `<shell>`).
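
One way to follow the "capture to a static file" advice (the command and path are illustrative, not prescribed):

```text
# Once, at authoring time, snapshot the volatile output:
#   python --version > context/python_version.txt
# Then include the stable snapshot instead of using a <shell> tag:
<include>context/python_version.txt</include>
```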

---

## Advanced Tips

### Shared Preamble for Consistency

Use a shared include (e.g., `<include>context/project_preamble.prompt</include>`) at the top of every prompt. You should create this file in your project's `context/` directory to define your "Constitution": consistent coding style (e.g., indentation, naming conventions), preferred linting rules, and forbidden libraries. This ensures all generated code speaks the same language without cluttering individual prompts.
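
A sketch of what such a preamble might contain (the contents are project-specific choices, not PDD requirements):

```text
% Project conventions (applies to every module)
- Python 3.11; 4-space indentation; snake_case functions, PascalCase classes.
- Type hints and docstrings on all public functions.
- Use the standard `logging` module; no print() in library code.
- Forbidden: eval/exec, global mutable state, unpinned dependencies.
```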

### Automatic Update Propagation via Includes

A key benefit of `<include>` directives is **automatic propagation**: when the included file changes, all prompts that reference it automatically reflect those changes on the next generation—without editing the prompts themselves.

Use this pattern when:
- **Authoritative documentation exists elsewhere** (e.g., a README that defines environment variables, API contracts, or configuration options). Include it rather than duplicating the content.
- **Shared constraints evolve** (e.g., coding standards, security policies). A single edit to the preamble file updates all prompts.
- **Interface definitions change** (e.g., a dependency's example file). Prompts consuming that example stay current.

*Tradeoff:* Large includes consume context tokens. If only a small portion of a file is relevant, consider extracting that portion into a dedicated include file (e.g., `docs/output_conventions.md` rather than the full `README.md`).

### Positive over Negative Constraints

Models often struggle with negative constraints ("Do not use X"). Instead, phrase requirements positively: instead of "Do not use unassigned variables," prefer "Initialize all variables with default values." This greatly improves reliability.

### Positioning Critical Instructions (Hierarchy of Attention)

LLMs exhibit "middle-loss": they pay more attention to the **beginning** (role, preamble) and the **end** (steps, deliverables) of the prompt context. If a critical constraint (e.g., security, output format) is ignored, ensure it's placed in your shared preamble, explicitly reiterated in the final "Instructions" or "Steps" section, or even pre-filled in the expected output format if applicable.

### Command-Specific Context Files

Some PDD commands (e.g., `pdd test`, `pdd example`) can automatically include project-specific context files like `context/test.prompt` or `context/example.prompt` during their internal preprocessing. Use these to provide instructions tailored to your project, such as preferred testing frameworks or specific import statements, without modifying the main prompt.

**`context/test.prompt`** is particularly important:
- Defines testing conventions, frameworks, and patterns for your project
- Included automatically when running `pdd test` (alongside the module prompt and generated code)
- Tests accumulate over time via `--merge` as bugs are found
- Tests persist when the module prompt changes—only code is regenerated, not tests
- This ensures tests remain stable "permanent assets" while code can be freely regenerated

---

## Why PDD Scales to Large Codebases

- Explicit, curated context: use minimal examples and targeted includes instead of dumping source, reducing tokens and confusion.
- Modular dev units: one prompt per file/module constrains scope, enabling independent regeneration and parallel work.
- Batch, reproducible flow: eliminate long chat histories; regeneration avoids patch accumulation and incoherent diffs.
- Accumulating tests: protect behavior across wide regenerations and refactors; failures localize issues quickly.
- Single source of truth: prompts unify intent and dependencies, improving cross‑team coordination and reducing drift.
- Automated Grounding: By feeding successful past generations back into the context, the system stabilizes the code over time, making "regeneration" safe even for complex modules.

### Tests as Generation Context

A key PDD feature: existing tests are automatically included as context when generating code. This means:

- The LLM sees the test file and knows what behaviors must be preserved
- Generated code is constrained to pass existing tests
- New tests accumulate over time, progressively constraining future generations
- This creates a "ratchet effect": each bug fix adds a test, preventing regression

This is distinct from test *generation*. Tests are generated via `pdd test PROMPT_FILE CODE_FILE`, which uses the module prompt, generated code, and `context/test.prompt` for project-wide guidance. Tests accumulate over time via `--merge` as bugs are found. Requirements in the module prompt implicitly define what to test—each requirement should correspond to at least one test case.

```mermaid
flowchart LR
  subgraph Assets
    P[Module Prompt] --> G[pdd generate]
    T[Existing Tests] --> G
    G --> C[Generated Code]
  end

  subgraph Accumulation
    BUG[Bug Found] --> NT[New Test Written]
    NT --> T
  end
```
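
Concretely, an accumulated regression test might look like this (the module and bug are hypothetical); once appended, every future regeneration sees it and must keep it passing:

```python
# Appended to the accumulated test file after a bug report; never overwritten.
from user_id_parser import parse_user_id  # hypothetical module under test

def test_none_input_returns_none_instead_of_raising():
    # Regression guard: a None input once raised instead of returning None.
    assert parse_user_id(None) is None
```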

---

Patch vs PDD at Scale (diagram):

```mermaid
flowchart LR
  subgraph Patching
    C0[Codebase] --> P0[Chat prompt]
    P0 --> D0[Local diff]
    D0 --> C0
  end

  subgraph PDD
    PG[Prompts graph] --> GZ[Batch regenerate]
    GZ --> CM[Code modules]
    CM --> XT[Examples and Tests]
    XT --> UP[Update prompts]
    UP --> PG
  end
```

---

## The Three Pillars of PDD Generation

Understanding how prompts, grounding, and tests work together is key to writing minimal, effective prompts.

| Pillar | What It Provides | Maintained By |
|--------|-----------------|---------------|
| **Prompt** | Requirements and constraints (WHAT) | Developer (explicit) |
| **Grounding** | Implementation patterns (HOW) | System (automatic, Cloud) |
| **Tests** | Behavioral correctness | Accumulated over time |

### How They Interact

- **Prompt** defines WHAT → "validate user input, return errors"
- **Grounding** defines HOW → class structure, helper patterns (from similar modules)
- **Tests** define CORRECTNESS → edge cases discovered through bugs

### Conflict Resolution

- **Tests override grounding**: If a test requires new behavior, generation must satisfy it
- **Explicit requirements override grounding**: If the prompt says "use functional style", that overrides OOP examples in grounding
- **Grounding fills gaps**: Everything not specified in the prompt or constrained by tests

### Why Prompts Can Be Minimal

You don't need to specify:
- **Coding style** → preamble provides it
- **Implementation patterns** → grounding provides them
- **Edge cases** → tests encode them

You only specify:
- What the module does
- What contracts it must satisfy
- What constraints apply

---

## Example (Minimal, Python)

This simplified example illustrates a minimal functional prompt:

```text
% You are an expert Python engineer. Your goal is to write a function `get_extension` that returns the file extension for a given language.

<include>context/python_preamble.prompt</include>

% Inputs/Outputs
Input: language (str), like "Python" or "Makefile".
Output: str file extension (e.g., ".py"), or "" if unknown.

% Data
The CSV at $PDD_PATH/data/language_format.csv contains: language,comment,extension

% Steps
1) Load env var PDD_PATH and read the CSV
2) Normalize language case
3) Look up the extension
4) Return "" if not found or invalid
```

This style:
- Declares role and outcome
- Specifies IO, data sources, and steps
- Uses an `<include>` to pull in a shared preamble
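
One plausible generation from this prompt (illustrative only; actual output depends on grounding and the model):

```python
import csv
import os

def get_extension(language: str) -> str:
    """Return the file extension for `language` (e.g., ".py"), or "" if unknown."""
    pdd_path = os.environ.get("PDD_PATH")
    if not pdd_path or not isinstance(language, str):
        return ""
    csv_path = os.path.join(pdd_path, "data", "language_format.csv")
    try:
        with open(csv_path, newline="") as f:
            # Columns per the prompt's Data section: language,comment,extension
            for row in csv.DictReader(f):
                if row["language"].strip().lower() == language.strip().lower():
                    return row["extension"]
    except OSError:
        return ""
    return ""
```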

---

## Scoping & Modularity

- One prompt → one file/module. If a prompt gets too large or brittle, split it into smaller prompts that compose via explicit interfaces.
- Treat examples as interfaces: create a minimal runnable example demonstrating how the module is meant to be used.
- Avoid “mega‑prompts” that try to implement an entire subsystem. Use the PDD graph of prompts instead. For how prompts compose via token‑efficient examples, see the “Dependencies” section below.

---

## Writing Effective Requirements

Requirements are the core of your prompt. Everything else is handled automatically by grounding and tests.

### Structure (aim for 5-10 items)

1. **Primary function**: What does this module do? (one sentence)
2. **Input contract**: Types, validation rules, what's accepted
3. **Output contract**: Types, error conditions, return values
4. **Key invariants**: What must always be true
5. **Performance constraints**: If any (latency, memory, complexity)
6. **Security constraints**: If any (input sanitization, auth requirements)

### Each Requirement Should Be

- **Testable**: If you can't write a test for it, it's too vague
- **Behavioral**: Describe WHAT, not HOW
- **Unique**: Don't duplicate what preamble or grounding provides

### Example: Before/After

**Too detailed:**
```
1. Create a UserValidator class with validate() method
2. Use snake_case for all methods ← belongs in preamble
3. Import typing at the top ← belongs in preamble
4. Add docstrings to all public methods ← belongs in preamble
5. Handle null by returning ValidationError
6. Handle empty string by returning ValidationError
7. Handle whitespace-only by returning ValidationError
```

**Just right** (requirements only):
```
1. Function: validate_user(input) → ValidationResult
2. Input: Any type (untrusted user input)
3. Output: ValidationResult with is_valid bool and errors list
4. Invalid inputs: null, empty, whitespace-only, malformed
5. Performance: O(n) in input length
6. Security: No eval/exec, treat input as untrusted
```

Style conventions (2-4) belong in a shared preamble. Edge cases (5-7) can be collapsed into a single requirement.

**Requirements as Test Specifications:** Each requirement implies at least one test case; an example mapping follows below. If you can't test a requirement, it's too vague.
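
For instance, requirement 4 above ("Invalid inputs: null, empty, whitespace-only, malformed") implies tests like these (the module and names are hypothetical):

```python
import pytest
from validators import validate_user  # hypothetical module from requirement 1

@pytest.mark.parametrize("bad_input", [None, "", "   ", {"unexpected": "shape"}])
def test_invalid_inputs_are_rejected(bad_input):
    result = validate_user(bad_input)
    assert result.is_valid is False
    assert result.errors  # at least one error explaining the rejection
```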

---

## Prompt Abstraction Level

<include>docs/images/prompt_abstraction_levels.png</include>

Write prompts at the level of *architecture, contract, and intent*, not line-by-line *implementation details*.

### Heuristics: Are You at the Right Level?

| Indicator | Too Detailed (> 30%) | Too Vague (< 10%) |
|-----------|----------------------|-------------------|
| **Content** | Specifying variable names, loop structures | Missing error handling strategy |
| **Style** | Dictating indentation, imports | No input/output types |
| **Result** | Prompt harder to maintain than code | Every generation is wildly different |

### If Your Prompt Is Too Long

Ask yourself:
- **Am I specifying coding style?** → Remove it (preamble handles this)
- **Am I specifying implementation patterns?** → Remove them (grounding handles this)
- **Am I listing every edge case?** → Remove them (tests handle this)
- **Is the module too big?** → Split into multiple prompts

### Examples

- **Too Vague:** "Create a user page." (Model guesses everything; unrepeatable)
- **Too Detailed:** "Create a class User with a private field _id. In the constructor, set _id. Write a getter..." (Prompt is harder to maintain than the code)
- **Just Right:** "Implement a UserProfile component that displays user details and handles the 'update' action via the API. It must handle loading/error states and match the existing design system."

**Rule of Thumb:** Focus on **Interfaces**, **Invariants**, and **Outcomes**. Let the preamble handle coding style; let grounding handle implementation patterns.

---

## Dependencies

### When to Use `<include>`

Include dependencies explicitly when:
- **External libraries** not in your grounding history
- **Critical interfaces** that must be exact
- **New modules** with no similar examples in grounding

```xml
<billing_service>
<include>context/billing_service_example.py</include>
</billing_service>
```
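
The included file is itself small: a usage example, not the module's source (a sketch; the names and API shown are hypothetical):

```python
# context/billing_service_example.py: minimal usage example for consumers.
from billing_service import BillingService  # hypothetical dependency

billing = BillingService(api_key="...")
invoice = billing.create_invoice(user_id="u-123", amount_cents=4999)
print(invoice.status)  # e.g., "pending"
```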

### When to Rely on Grounding

If you've successfully generated code that uses a dependency before, grounding often suffices—the usage pattern is already in the cloud database.

**Prefer explicit `<include>` for:** External APIs, critical contracts, cross-team interfaces
**Rely on grounding for:** Internal modules with established patterns

### Token Efficiency

Real source code is heavy. A 500-line module might have a 50-line usage example. By including only the example, you save ~90% of the tokens. Use `pdd auto-deps` to automatically populate relevant examples.

```mermaid
flowchart LR
  subgraph Module_B
    PB[Prompt B] --> GB[Generate] --> CB[Code B]
    CB --> EB[Example B]
  end

  subgraph Module_A
    PA[Prompt A] --> GA[Generate] --> CA[Code A]
    PA --> EB
  end

  EB --> CA
```

---

## Regenerate, Verify, Test, Update

**Crucial Prerequisite:** Before regenerating a module, ensure you have **high test coverage** for its current functionality. Because PDD overwrites the code file entirely, your test suite is the critical safety net that prevents regression of existing features while you iterate on new ones.

The PDD workflow (see pdd/docs/whitepaper.md):

1) **Generate:** Fully regenerate (overwrite) the code module and its example.
2) **Crash → Verify:** Run the example. Fix immediate runtime errors.
3) **Test (Accumulate):** Run existing tests. If fixing a bug, **write a new failing test case first** and append it to the test file. *Never overwrite the test file; tests must accumulate to prevent regressions.*
4) **Fix via Command:** When you use `pdd fix` (or manual fixes verified by tests), the system **automatically submits** the successful Prompt+Code pair to PDD Cloud (or local history).
5) **Fix via Prompt:** If the logic is fundamentally flawed, update the prompt text to clarify the requirement or constraint that was missed, then **go to step 1**.
6) **Drift Check (Optional):** Occasionally regenerate the module *without* changing the prompt (e.g., after upgrading LLM versions or before major releases). If the output differs significantly or fails tests, your prompt has "drifted" (it relied on lucky seeds or implicit context). Tighten the prompt until the output is stable.
7) **Update:** Once tests pass, back-propagate any final learnings into the prompt.

Key practice: Code and examples are ephemeral (regenerated); Tests and Prompts are permanent assets (accumulated and versioned).
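
In command form, one pass through the loop looks roughly like this. Only `pdd test PROMPT_FILE CODE_FILE` and `--merge` appear elsewhere in this guide; the other argument shapes are assumptions, not documented CLI signatures:

```text
pdd generate <prompt_file>                  # 1) regenerate code (args assumed)
python <example_file>                       # 2) crash/verify the runnable example
pdd test <prompt_file> <code_file>          # 3) generate and run tests
pdd test <prompt_file> <code_file> --merge  #    accumulate new tests
pdd fix <args>                              # 4) repair; success is auto-submitted
pdd update <args>                           # 7) back-propagate learnings
```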

**Important:** Tests ARE generated from the module prompt (plus code and `context/test.prompt`). The key distinction is their lifecycle:
- Code is regenerated on prompt changes; tests accumulate and persist
- Requirements implicitly define test coverage—each requirement implies at least one test
- Use `context/test.prompt` for project-wide test guidance (frameworks, patterns)
- Existing tests are included as context during code generation
- This creates a "ratchet effect" where each new test permanently constrains future generations

### Workflow Cheatsheet: Features vs. Bugs

| Task Type | Where to Start | The Workflow |
| :--- | :--- | :--- |
| **New Feature** | **The Prompt** | 1. Add/Update Requirements in Prompt.<br>2. Regenerate Code (LLM sees existing tests).<br>3. Write new Tests to verify. |
| **Bug Fix** | **The Test File** | 1. Use `pdd bug` to create a failing test case (repro) in the Test file.<br>2. Clarify the Prompt to address the edge case if needed.<br>3. Run `pdd fix` (LLM sees the new test and must pass it). |

**Key insight:** When you run `pdd generate` after adding a test, the LLM sees that test as context. This means the generated code is constrained to pass it: the test acts as a specification, not just a verification.

**Why?** Features represent *new intent* (Prompt). Bugs represent *missed intent*, which must first be captured as a constraint (Test) before refining the definition (Prompt).

### When to Update the Prompt (and When Not To)

After a successful fix, ask: "Where should this knowledge live?"

| Knowledge Type | Where It Lives | Update Prompt? |
|---------------|----------------|----------------|
| New edge case behavior | Test file | **No** |
| Implementation pattern fix | Grounding (auto-captured) | **No** |
| Missing requirement | Prompt | **Yes** |
| Wrong constraint | Prompt | **Yes** |
| Security/compliance rule | Prompt or preamble | **Yes** |

**Rule of thumb:** Update the prompt only for **intent changes**:
- "The module should also handle X" → Add a requirement
- "The constraint was wrong" → Fix the requirement
- "This security rule applies" → Add a requirement

**Don't update for implementation fixes:**
- "There was a bug with null handling" → Add a test; grounding captures the fix
- "The code style was inconsistent" → Update the preamble (not the prompt)
- "I prefer different variable names" → Update the preamble (not the prompt)

---

## PDD vs Interactive Agentic Coders (Claude Code, Cursor)

- Source of truth:
  - PDD: the prompt is primary and versioned; code is regenerated output
  - Interactive: the code is primary; prompts are ephemeral patch instructions
- Workflow:
  - PDD: batch‑oriented, reproducible runs; explicit context via includes
  - Interactive: live chat loops; implicit context; local diffs
- Scope:
  - PDD: scoped to modules/files with clear interfaces; compose via examples
  - Interactive: excels at small, local edits; struggles as scope and history grow
- Synchronization:
  - PDD: update prompts after fixes; tests accumulate and protect behavior
  - Interactive: prompt history rarely persists; documentation often drifts

When to use which: Use PDD for substantive new modules, refactors, and anything requiring long‑term maintainability and repeatability. Use interactive patching for trivial hotfixes; follow up with a prompt `update` so the source of truth remains synchronized.

---

## Patch vs PDD: Concrete Examples

Patch‑style prompt (interactive agent):

```text
Fix this bug in src/utils/user.ts. In function parseUserId, passing null should return null instead of throwing.

Constraints:
- Change the minimum number of lines
- Do not alter the function signature or add new functions
- Keep existing imports and formatting
- Output a unified diff only

Snippet:
export function parseUserId(input: string) {
  return input.trim().split(":")[1];
}
```

PDD‑style prompt (source of truth):

```text
% You are an expert TypeScript engineer. Create a module `user_id_parser` with a function `parseUserId` that safely extracts a user id.

% Role & Scope
A utility module responsible for parsing user identifiers from various inputs.

% Requirements
1) Function: `parseUserId(input: unknown): string | null`
2) Accepts strings like "user:abc123" and returns "abc123"
3) For null/undefined/non‑string, return null without throwing
4) Trim whitespace; reject blank ids as null
5) Log at debug level on parse failures; no exceptions for expected cases
6) Performance: O(n) in input length; no regex backtracking pitfalls
7) Security: no eval/exec; treat input as untrusted

% Dependencies
<logger>
<include>context/logger_example.ts</include>
</logger>

% Instructions
- Implement in `src/utils/user_id_parser.ts`
- Export `parseUserId`
- Add narrow helpers if needed; keep module cohesive
- Include basic JSDoc and simple debug logging hooks
```

Key differences:
- The patch prompt constrains a local edit and often asks for a diff. It assumes code is the source of truth.
- The PDD prompt defines the module's contract, dependencies, and deliverables. It is the source of truth; code is regenerated to match it, while tests accumulate over time.

---

## Checklist: Before You Run `pdd generate`

### Must Have
- [ ] Module purpose is clear (1-2 sentences)
- [ ] Requirements are testable and behavioral (5-10 items)
- [ ] Dependencies included (if external or critical)

### For Established Modules
- [ ] Tests exist for known edge cases
- [ ] Previous generation was successful (grounding will use it)

### For New Modules
- [ ] Similar modules exist in the codebase (grounding will find them)
- [ ] Or: Consider `<pin>` to reference a template module (Cloud)

### You Don't Need to Specify
- Coding style (preamble handles this)
- Implementation patterns (grounding handles this)
- Every edge case (tests handle this)

---

## Common Pitfalls (And Fixes)

- Too much context: prune includes; prefer targeted examples over entire files.
- Vague requirements: convert them to explicit contracts, budgets, and behaviors.
- Mega‑prompts: split into smaller prompts (one per file/module) and compose.
- Prompt outweighs the code: if the prompt is larger than the generated file, it’s usually over‑specifying control flow. Aim for prompts to be a fraction of the target code size; keep them at the interface/behavior level and let the model fill in routine implementation.
- Patching code directly: make the change in the prompt and regenerate; then `update` with learnings.
- Throwing away tests: keep and expand them; they are your long‑term leverage.

---

## Naming & Conventions (This Repo)

- One prompt per module/file, named like `${BASENAME}_${LanguageOrFramework}.prompt` (see templates under `pdd/pdd/templates`).
- Follow codebase conventions from README.md for Python and TypeScript style.
- Use curated examples under `context/` to encode interfaces and behaviors.

---

## Final Notes

Think of prompts as your programming language. Keep them concise, explicit, and modular. Regenerate instead of patching, verify behavior with accumulating tests, and continuously back‑propagate implementation learnings into your prompts. That discipline is what converts maintenance from an endless patchwork into a compounding system of leverage.


% Files to Read

1. **Prompt file**: {prompt_path}
   - Read this file and all its XML include tags to understand the current specification

2. **Code file (modified)**: {code_path}
   - This contains the user's modifications that the prompt needs to capture

3. **Test files** (if any exist): {test_paths}
   - Read ALL test files listed above
   - Behaviors verified by tests DON'T need to be explicitly specified in the prompt
   - Tests accumulate across numbered files (test_module.py, test_module_1.py, etc.)

% Your 5-Step Process

**Step 1: Assess Differences**
- Read the prompt file (expand all XML include tags mentally)
- Compare against the modified code
- Identify what the code does that isn't in the prompt
- Identify what the prompt says that isn't in the code

**Step 2: Filter Using Guide + Tests**
- Consult the prompting guide above for what belongs in a prompt
- Check the test files: if tests verify a behavior, the prompt can be more abstract
- Decide which differences should go in the updated prompt

**Step 3: Remove Duplication**
- Check if the prompt body duplicates content from XML include files
- Remove redundant specifications
- Keep the prompt compact but sufficient

**Step 4: Check for Existing Shared Includes** (optional)
- If you identify content that could be shared, check whether a relevant include already exists in `context/`
- Only create new includes for patterns that are truly common (would be used in 3+ prompts)
- Do NOT scan all prompts; this step is opportunistic, based on your knowledge of the prompt being updated

**Step 5: Validate**
- Ensure the prompt can reliably regenerate the modified code
- Ensure the prompt is human-readable:
  - Clear role/purpose statement
  - Logical structure (requirements, dependencies)
  - Understandable by a developer unfamiliar with the code
- Target a 10-30% prompt-to-code ratio (per the guide)

% Action

Write the updated prompt directly to: {prompt_path}

If you create new shared include files, write them to the `context/` directory.

If the prompt is already optimal, you may leave it unchanged.