ag-cortex 0.1.0
- package/.agent/commands/test-browser.md +339 -0
- package/.agent/rules/00-constitution.md +46 -0
- package/.agent/rules/project-rules.md +49 -0
- package/.agent/skills/agent-browser/SKILL.md +223 -0
- package/.agent/skills/agent-native-architecture/SKILL.md +435 -0
- package/.agent/skills/agent-native-architecture/references/action-parity-discipline.md +409 -0
- package/.agent/skills/agent-native-architecture/references/agent-execution-patterns.md +467 -0
- package/.agent/skills/agent-native-architecture/references/agent-native-testing.md +582 -0
- package/.agent/skills/agent-native-architecture/references/architecture-patterns.md +478 -0
- package/.agent/skills/agent-native-architecture/references/dynamic-context-injection.md +338 -0
- package/.agent/skills/agent-native-architecture/references/files-universal-interface.md +301 -0
- package/.agent/skills/agent-native-architecture/references/from-primitives-to-domain-tools.md +359 -0
- package/.agent/skills/agent-native-architecture/references/mcp-tool-design.md +506 -0
- package/.agent/skills/agent-native-architecture/references/mobile-patterns.md +871 -0
- package/.agent/skills/agent-native-architecture/references/product-implications.md +443 -0
- package/.agent/skills/agent-native-architecture/references/refactoring-to-prompt-native.md +317 -0
- package/.agent/skills/agent-native-architecture/references/self-modification.md +269 -0
- package/.agent/skills/agent-native-architecture/references/shared-workspace-architecture.md +680 -0
- package/.agent/skills/agent-native-architecture/references/system-prompt-design.md +250 -0
- package/.agent/skills/agent-native-reviewer/SKILL.md +246 -0
- package/.agent/skills/andrew-kane-gem-writer/SKILL.md +184 -0
- package/.agent/skills/andrew-kane-gem-writer/references/database-adapters.md +231 -0
- package/.agent/skills/andrew-kane-gem-writer/references/module-organization.md +121 -0
- package/.agent/skills/andrew-kane-gem-writer/references/rails-integration.md +183 -0
- package/.agent/skills/andrew-kane-gem-writer/references/resources.md +119 -0
- package/.agent/skills/andrew-kane-gem-writer/references/testing-patterns.md +261 -0
- package/.agent/skills/ankane-readme-writer/SKILL.md +50 -0
- package/.agent/skills/architecture-strategist/SKILL.md +52 -0
- package/.agent/skills/best-practices-researcher/SKILL.md +100 -0
- package/.agent/skills/bug-reproduction-validator/SKILL.md +67 -0
- package/.agent/skills/code-simplicity-reviewer/SKILL.md +85 -0
- package/.agent/skills/coding-tutor/.claude-plugin/plugin.json +9 -0
- package/.agent/skills/coding-tutor/README.md +37 -0
- package/.agent/skills/coding-tutor/commands/quiz-me.md +1 -0
- package/.agent/skills/coding-tutor/commands/sync-tutorials.md +25 -0
- package/.agent/skills/coding-tutor/commands/teach-me.md +1 -0
- package/.agent/skills/coding-tutor/skills/coding-tutor/SKILL.md +214 -0
- package/.agent/skills/coding-tutor/skills/coding-tutor/scripts/create_tutorial.py +202 -0
- package/.agent/skills/coding-tutor/skills/coding-tutor/scripts/index_tutorials.py +203 -0
- package/.agent/skills/coding-tutor/skills/coding-tutor/scripts/quiz_priority.py +190 -0
- package/.agent/skills/coding-tutor/skills/coding-tutor/scripts/setup_tutorials.py +132 -0
- package/.agent/skills/compound-docs/SKILL.md +510 -0
- package/.agent/skills/compound-docs/assets/critical-pattern-template.md +34 -0
- package/.agent/skills/compound-docs/assets/resolution-template.md +93 -0
- package/.agent/skills/compound-docs/references/yaml-schema.md +65 -0
- package/.agent/skills/compound-docs/schema.yaml +176 -0
- package/.agent/skills/create-agent-skills/SKILL.md +299 -0
- package/.agent/skills/create-agent-skills/references/api-security.md +226 -0
- package/.agent/skills/create-agent-skills/references/be-clear-and-direct.md +531 -0
- package/.agent/skills/create-agent-skills/references/best-practices.md +404 -0
- package/.agent/skills/create-agent-skills/references/common-patterns.md +595 -0
- package/.agent/skills/create-agent-skills/references/core-principles.md +437 -0
- package/.agent/skills/create-agent-skills/references/executable-code.md +175 -0
- package/.agent/skills/create-agent-skills/references/iteration-and-testing.md +474 -0
- package/.agent/skills/create-agent-skills/references/official-spec.md +185 -0
- package/.agent/skills/create-agent-skills/references/recommended-structure.md +168 -0
- package/.agent/skills/create-agent-skills/references/skill-structure.md +372 -0
- package/.agent/skills/create-agent-skills/references/using-scripts.md +113 -0
- package/.agent/skills/create-agent-skills/references/using-templates.md +112 -0
- package/.agent/skills/create-agent-skills/references/workflows-and-validation.md +510 -0
- package/.agent/skills/create-agent-skills/templates/router-skill.md +73 -0
- package/.agent/skills/create-agent-skills/templates/simple-skill.md +33 -0
- package/.agent/skills/create-agent-skills/workflows/add-reference.md +96 -0
- package/.agent/skills/create-agent-skills/workflows/add-script.md +93 -0
- package/.agent/skills/create-agent-skills/workflows/add-template.md +74 -0
- package/.agent/skills/create-agent-skills/workflows/add-workflow.md +120 -0
- package/.agent/skills/create-agent-skills/workflows/audit-skill.md +138 -0
- package/.agent/skills/create-agent-skills/workflows/create-domain-expertise-skill.md +605 -0
- package/.agent/skills/create-agent-skills/workflows/create-new-skill.md +191 -0
- package/.agent/skills/create-agent-skills/workflows/get-guidance.md +121 -0
- package/.agent/skills/create-agent-skills/workflows/upgrade-to-router.md +161 -0
- package/.agent/skills/create-agent-skills/workflows/verify-skill.md +204 -0
- package/.agent/skills/data-integrity-guardian/SKILL.md +70 -0
- package/.agent/skills/data-migration-expert/SKILL.md +97 -0
- package/.agent/skills/deployment-verification-agent/SKILL.md +159 -0
- package/.agent/skills/design-implementation-reviewer/SKILL.md +85 -0
- package/.agent/skills/design-iterator/SKILL.md +197 -0
- package/.agent/skills/dhh-rails-reviewer/SKILL.md +45 -0
- package/.agent/skills/dhh-rails-style/SKILL.md +184 -0
- package/.agent/skills/dhh-rails-style/references/architecture.md +653 -0
- package/.agent/skills/dhh-rails-style/references/controllers.md +303 -0
- package/.agent/skills/dhh-rails-style/references/frontend.md +510 -0
- package/.agent/skills/dhh-rails-style/references/gems.md +266 -0
- package/.agent/skills/dhh-rails-style/references/models.md +359 -0
- package/.agent/skills/dhh-rails-style/references/testing.md +338 -0
- package/.agent/skills/dspy-ruby/SKILL.md +594 -0
- package/.agent/skills/dspy-ruby/assets/config-template.rb +359 -0
- package/.agent/skills/dspy-ruby/assets/module-template.rb +326 -0
- package/.agent/skills/dspy-ruby/assets/signature-template.rb +143 -0
- package/.agent/skills/dspy-ruby/references/core-concepts.md +265 -0
- package/.agent/skills/dspy-ruby/references/optimization.md +623 -0
- package/.agent/skills/dspy-ruby/references/providers.md +305 -0
- package/.agent/skills/every-style-editor/SKILL.md +134 -0
- package/.agent/skills/every-style-editor/references/EVERY_WRITE_STYLE.md +529 -0
- package/.agent/skills/figma-design-sync/SKILL.md +166 -0
- package/.agent/skills/file-todos/SKILL.md +251 -0
- package/.agent/skills/file-todos/assets/todo-template.md +155 -0
- package/.agent/skills/framework-docs-researcher/SKILL.md +83 -0
- package/.agent/skills/frontend-design/SKILL.md +42 -0
- package/.agent/skills/gemini-imagegen/SKILL.md +237 -0
- package/.agent/skills/gemini-imagegen/requirements.txt +2 -0
- package/.agent/skills/gemini-imagegen/scripts/compose_images.py +168 -0
- package/.agent/skills/gemini-imagegen/scripts/edit_image.py +157 -0
- package/.agent/skills/gemini-imagegen/scripts/gemini_images.py +265 -0
- package/.agent/skills/gemini-imagegen/scripts/generate_image.py +147 -0
- package/.agent/skills/gemini-imagegen/scripts/multi_turn_chat.py +215 -0
- package/.agent/skills/git-history-analyzer/SKILL.md +42 -0
- package/.agent/skills/git-worktree/SKILL.md +302 -0
- package/.agent/skills/git-worktree/scripts/worktree-manager.sh +345 -0
- package/.agent/skills/julik-frontend-races-reviewer/SKILL.md +222 -0
- package/.agent/skills/kieran-python-reviewer/SKILL.md +104 -0
- package/.agent/skills/kieran-rails-reviewer/SKILL.md +86 -0
- package/.agent/skills/kieran-typescript-reviewer/SKILL.md +95 -0
- package/.agent/skills/lint/SKILL.md +16 -0
- package/.agent/skills/pattern-recognition-specialist/SKILL.md +57 -0
- package/.agent/skills/performance-oracle/SKILL.md +110 -0
- package/.agent/skills/pr-comment-resolver/SKILL.md +69 -0
- package/.agent/skills/rclone/SKILL.md +150 -0
- package/.agent/skills/rclone/scripts/check_setup.sh +60 -0
- package/.agent/skills/repo-research-analyst/SKILL.md +113 -0
- package/.agent/skills/security-sentinel/SKILL.md +93 -0
- package/.agent/skills/skill-creator/SKILL.md +209 -0
- package/.agent/skills/skill-creator/scripts/init_skill.py +304 -0
- package/.agent/skills/skill-creator/scripts/package_skill.py +112 -0
- package/.agent/skills/skill-creator/scripts/quick_validate.py +72 -0
- package/.agent/skills/spec-flow-analyzer/SKILL.md +113 -0
- package/.agent/skills/test-agent/SKILL.md +4 -0
- package/.agent/workflows/agent-native-audit.md +277 -0
- package/.agent/workflows/ask-user-question.md +21 -0
- package/.agent/workflows/changelog.md +137 -0
- package/.agent/workflows/compound.md +202 -0
- package/.agent/workflows/create-agent-skill.md +8 -0
- package/.agent/workflows/deepen-plan-research.md +334 -0
- package/.agent/workflows/deepen-plan-synthesis.md +182 -0
- package/.agent/workflows/deepen-plan.md +79 -0
- package/.agent/workflows/feature-video.md +342 -0
- package/.agent/workflows/generate-command.md +162 -0
- package/.agent/workflows/heal-skill.md +142 -0
- package/.agent/workflows/lfg.md +20 -0
- package/.agent/workflows/plan-analysis.md +67 -0
- package/.agent/workflows/plan-next-steps.md +63 -0
- package/.agent/workflows/plan-review.md +33 -0
- package/.agent/workflows/plan-synthesis.md +106 -0
- package/.agent/workflows/plan.md +49 -0
- package/.agent/workflows/report-bug.md +150 -0
- package/.agent/workflows/reproduce-bug.md +99 -0
- package/.agent/workflows/resolve-parallel.md +34 -0
- package/.agent/workflows/resolve-pr-parallel.md +49 -0
- package/.agent/workflows/resolve-todo-parallel.md +35 -0
- package/.agent/workflows/review-analysis.md +145 -0
- package/.agent/workflows/review-synthesis.md +262 -0
- package/.agent/workflows/review.md +64 -0
- package/.agent/workflows/ship.md +90 -0
- package/.agent/workflows/test-command.md +3 -0
- package/.agent/workflows/triage.md +310 -0
- package/.agent/workflows/work.md +157 -0
- package/.agent/workflows/xcode-test.md +332 -0
- package/LICENSE +22 -0
- package/README.md +49 -0
- package/bin/ag-cortex.js +54 -0
- package/lib/core.js +165 -0
- package/package.json +31 -0
@@ -0,0 +1,359 @@
<overview>
Start with pure primitives: bash, file operations, basic storage. This proves the architecture works and reveals what the agent actually needs. As patterns emerge, add domain-specific tools deliberately. This document covers when and how to evolve from primitives to domain tools, and when to graduate to optimized code.
</overview>

<start_with_primitives>
## Start with Pure Primitives

Begin every agent-native system with the most atomic tools possible:

- `read_file` / `write_file` / `list_files`
- `bash` (for everything else)
- Basic storage (`store_item` / `get_item`)
- HTTP requests (`fetch_url`)

**Why start here:**

1. **Proves the architecture** - If it works with primitives, your prompts are doing their job
2. **Reveals actual needs** - You'll discover what domain concepts matter
3. **Maximum flexibility** - Agent can do anything, not just what you anticipated
4. **Forces good prompts** - You can't lean on tool logic as a crutch

### Example: Starting Primitive

```typescript
// Start with just these
const tools = [
  tool("read_file", { path: z.string() }, ...),
  tool("write_file", { path: z.string(), content: z.string() }, ...),
  tool("list_files", { path: z.string() }, ...),
  tool("bash", { command: z.string() }, ...),
];

// Prompt handles the domain logic
const prompt = `
When processing feedback:
1. Read existing feedback from data/feedback.json
2. Add the new feedback with your assessment of importance (1-5)
3. Write the updated file
4. If importance >= 4, create a notification file in data/alerts/
`;
```
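
As a rough sketch of what the elided handlers (`...`) might look like, the following wires the three file primitives to an in-memory object so it runs without a real filesystem. The `ToolResult` and `Tool` types, the `defineTool` helper, and the `files` object are assumptions made for illustration; they stand in for whatever `tool(...)` and the zod schemas provide in a real SDK. `bash` is left out because it needs a real process, and the handlers are synchronous only to keep the sketch short.

```typescript
// Hypothetical result shape, mirroring the { text, isError } objects used in this document.
type ToolResult = { text: string; isError?: boolean };

// Hypothetical tool record; a real SDK's tool() helper would also carry a schema.
interface Tool<A> {
  name: string;
  run: (args: A) => ToolResult;
}

function defineTool<A>(name: string, run: (args: A) => ToolResult): Tool<A> {
  return { name, run };
}

// In-memory stand-in for the filesystem (assumption: keeps the sketch runnable anywhere).
const files: Record<string, string> = {};

const readFile = defineTool<{ path: string }>("read_file", ({ path }) => {
  if (!Object.prototype.hasOwnProperty.call(files, path)) {
    return { text: `No such file: ${path}`, isError: true };
  }
  return { text: files[path] };
});

const writeFile = defineTool<{ path: string; content: string }>(
  "write_file",
  ({ path, content }) => {
    files[path] = content;
    return { text: `Wrote ${content.length} characters to ${path}` };
  },
);

const listFiles = defineTool<{ path: string }>("list_files", ({ path }) => {
  // Treat the argument as a directory prefix and return the matching paths, sorted.
  const prefix = path.charAt(path.length - 1) === "/" ? path : `${path}/`;
  const matches = Object.keys(files).filter((p) => p.indexOf(prefix) === 0).sort();
  return { text: JSON.stringify(matches) };
});

// The calls an agent might make when following the feedback prompt above.
writeFile.run({ path: "data/feedback.json", content: "[]" });
const existing: unknown[] = JSON.parse(readFile.run({ path: "data/feedback.json" }).text);
existing.push({ message: "Export is broken", importance: 4 });
writeFile.run({ path: "data/feedback.json", content: JSON.stringify(existing) });
writeFile.run({ path: "data/alerts/export-broken.txt", content: "High-importance feedback" });
```

Nothing in these handlers knows what "feedback" or "importance" means; all of that lives in the prompt, which is the point of starting here.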
</start_with_primitives>

<when_to_add_domain_tools>
## When to Add Domain Tools

As patterns emerge, you'll want to add domain-specific tools. This is good, but do it deliberately.

### Vocabulary Anchoring

**Add a domain tool when:** The agent needs to understand domain concepts.

A `create_note` tool teaches the agent what "note" means in your system better than "write a file to the notes directory with this format."

```typescript
// Without domain tool - agent must infer structure
await agent.chat("Create a note about the meeting");
// Agent: writes to... notes/? documents/? what format?

// With domain tool - vocabulary is anchored
tool("create_note", {
  title: z.string(),
  content: z.string(),
  tags: z.array(z.string()).optional(),
}, async ({ title, content, tags }) => {
  // Tool enforces structure, agent understands "note"
});
```
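
To make "the tool enforces structure" concrete, here is a minimal sketch of what the `create_note` handler body could do. The `notes/<slug>.md` path convention, the front-matter layout, and the `slugify` helper are all assumptions for illustration, not something the source specifies; the handler is synchronous and writes to an in-memory object to stay self-contained.

```typescript
type ToolResult = { text: string; isError?: boolean };

// Arguments matching the create_note schema above.
interface CreateNoteArgs {
  title: string;
  content: string;
  tags?: string[];
}

// In-memory stand-in for the notes directory (assumption).
const noteFiles: Record<string, string> = {};

// Hypothetical helper: "Meeting Notes!" -> "meeting-notes"
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// The tool, not the agent, fixes where notes live and how they are laid out.
function createNote({ title, content, tags = [] }: CreateNoteArgs): ToolResult {
  const slug = slugify(title);
  if (slug === "") {
    return { text: "Title must contain at least one letter or digit", isError: true };
  }
  const path = `notes/${slug}.md`;
  const frontMatter = ["---", `title: ${title}`, `tags: [${tags.join(", ")}]`, "---"].join("\n");
  noteFiles[path] = `${frontMatter}\n\n${content}\n`;
  return { text: `Created note at ${path}` };
}

const created = createNote({
  title: "Meeting Notes!",
  content: "Discussed roadmap.",
  tags: ["work"],
});
```

The agent still chooses the title, content, and tags; the tool only removes the guesswork about location and format.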

### Guardrails

**Add a domain tool when:** Some operations need validation or constraints that shouldn't be left to agent judgment.

```typescript
// publish_to_feed might enforce format requirements or content policies
tool("publish_to_feed", {
  bookId: z.string(),
  content: z.string(),
  headline: z.string().max(100),  // Enforce headline length
}, async ({ bookId, content, headline }) => {
  // Validate content meets guidelines
  if (containsProhibitedContent(content)) {
    return { text: "Content doesn't meet guidelines", isError: true };
  }
  // Enforce proper structure
  await feedService.publish({ bookId, content, headline, publishedAt: new Date() });
  // Report success so the agent knows the publish went through
  return { text: "Published to feed" };
});
```
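
The handler above leans on two names the source does not define, `containsProhibitedContent` and `feedService`. A self-contained sketch with stand-ins for both might look like the following. The banned-phrase list, the in-memory `published` array, and the manual length check (standing in for `z.string().max(100)`) are illustrative assumptions. The checks are purely mechanical: whether something is worth publishing remains the agent's call.

```typescript
type ToolResult = { text: string; isError?: boolean };

interface FeedItem {
  bookId: string;
  content: string;
  headline: string;
  publishedAt: Date;
}

// Stand-in for a real feed service (assumption): records items in memory.
const published: FeedItem[] = [];

// Stand-in content policy (assumption): a fixed list of banned phrases.
const BANNED_PHRASES = ["lorem ipsum", "click here"];

function containsProhibitedContent(content: string): boolean {
  const lower = content.toLowerCase();
  return BANNED_PHRASES.some((phrase) => lower.indexOf(phrase) !== -1);
}

function publishToFeed(args: { bookId: string; content: string; headline: string }): ToolResult {
  // Mechanical constraint, playing the role of z.string().max(100) in the schema above.
  if (args.headline.length > 100) {
    return { text: "Headline must be 100 characters or fewer", isError: true };
  }
  if (containsProhibitedContent(args.content)) {
    return { text: "Content doesn't meet guidelines", isError: true };
  }
  published.push({ ...args, publishedAt: new Date() });
  return { text: `Published "${args.headline}"` };
}

const accepted = publishToFeed({
  bookId: "b1",
  content: "A close reading of chapter 3.",
  headline: "Chapter 3 notes",
});
const rejected = publishToFeed({
  bookId: "b1",
  content: "Click here to win!",
  headline: "Spam",
});
```

Returning an error result instead of throwing lets the agent read the reason and retry with corrected input.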

### Efficiency

**Add a domain tool when:** Common operations would take many primitive calls.

```typescript
// Primitive approach: multiple calls
await agent.chat("Get book details");
// Agent: read library.json, parse, find book, read full_text.txt, read introduction.md...

// Domain tool: one call for common operation
tool("get_book_with_content", { bookId: z.string() }, async ({ bookId }) => {
  const book = await library.getBook(bookId);
  const fullText = await readFile(`Research/${bookId}/full_text.txt`);
  const intro = await readFile(`Research/${bookId}/introduction.md`);
  return { text: JSON.stringify({ book, fullText, intro }) };
});
```
</when_to_add_domain_tools>

<the_rule>
## The Rule for Domain Tools

**Domain tools should represent one conceptual action from the user's perspective.**

They can include mechanical validation, but **judgment about what to do or whether to do it belongs in the prompt**.

### Wrong: Bundles Judgment

```typescript
// WRONG - analyze_and_publish bundles judgment into the tool
tool("analyze_and_publish", { input: z.string() }, async ({ input }) => {
  const analysis = analyzeContent(input);  // Tool decides how to analyze
  const shouldPublish = analysis.score > 0.7;  // Tool decides whether to publish
  if (shouldPublish) {
    await publish(analysis.summary);  // Tool decides what to publish
  }
});
```

### Right: One Action, Agent Decides

```typescript
// RIGHT - separate tools, agent decides
tool("analyze_content", { content: z.string() }, ...);  // Returns analysis
tool("publish", { content: z.string() }, ...);  // Publishes what agent provides

// Prompt: "Analyze the content. If it's high quality, publish a summary."
// Agent decides what "high quality" means and what summary to write.
```
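
One way to picture the split: `analyze_content` only reports measurements, and `publish` only publishes what it is given. In the sketch below the "agent" is simulated by a plain function so the example can run; in a real system that decision would come from the model following the prompt. The word and sentence counts, the five-word cutoff, and the `simulatedAgent` function are assumptions for illustration.

```typescript
type ToolResult = { text: string; isError?: boolean };

// analyze_content: returns measurements only, with no verdict and no side effects.
function analyzeContent(args: { content: string }): ToolResult {
  const words = args.content.split(/\s+/).filter((w) => w.length > 0);
  const sentences = args.content.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  return { text: JSON.stringify({ wordCount: words.length, sentenceCount: sentences.length }) };
}

// publish: publishes exactly what the caller provides.
const feed: string[] = [];
function publish(args: { content: string }): ToolResult {
  feed.push(args.content);
  return { text: `Published ${args.content.length} characters` };
}

// Stand-in for the model (assumption). The cutoff lives on the agent's side of
// the boundary, so changing it means editing the prompt, not the tools.
function simulatedAgent(draft: string): string {
  const analysis = JSON.parse(analyzeContent({ content: draft }).text);
  if (analysis.wordCount < 5) {
    return "Skipped: too short to be worth publishing";
  }
  const firstSentence = draft.split(/[.!?]/)[0].trim();
  return publish({ content: firstSentence }).text;
}

const skipped = simulatedAgent("Too short.");
const shipped = simulatedAgent("The new export flow works well. Users finish in half the time.");
```

Applying "The Test" to this sketch: neither tool decides anything, so the answer to "who is making the decision?" is the caller.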

### The Test

Ask: "Who is making the decision here?"

- If the answer is "the tool code" → you've encoded judgment, refactor
- If the answer is "the agent based on the prompt" → good
</the_rule>

<keep_primitives_available>
## Keep Primitives Available

**Domain tools are shortcuts, not gates.**

Unless there's a specific reason to restrict access (security, data integrity), the agent should still be able to use underlying primitives for edge cases.

```typescript
// Domain tool for common case
tool("create_note", { title, content }, ...);

// But primitives still available for edge cases
tool("read_file", { path }, ...);
tool("write_file", { path, content }, ...);

// Agent can use create_note normally, but for weird edge case:
// "Create a note in a non-standard location with custom metadata"
// → Agent uses write_file directly
```

### When to Gate

Gating (making a domain tool the only way to perform an operation) is appropriate for:

- **Security:** User authentication, payment processing
- **Data integrity:** Operations that must maintain invariants
- **Audit requirements:** Actions that must be logged in specific ways

**The default is open.** When you do gate something, make it a conscious decision with a clear reason.
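
A gate can be kept explicit by recording each one next to its reason. The sketch below is one hypothetical way to do that: `write_file` stays open by default and refuses only paths under a listed prefix, pointing the agent at the domain tool instead. The `GATES` table, the path prefixes, and the tool names `record_payment` and `append_audit_entry` are invented for illustration.

```typescript
type ToolResult = { text: string; isError?: boolean };
type GateReason = "security" | "data integrity" | "audit";

interface Gate {
  prefix: string;      // paths under this prefix are gated
  reason: GateReason;  // every gate must name its reason
  useInstead: string;  // the domain tool that owns this area
}

// Hypothetical gates. Anything not listed here stays open.
const GATES: Gate[] = [
  { prefix: "data/payments/", reason: "security", useInstead: "record_payment" },
  { prefix: "data/audit/", reason: "audit", useInstead: "append_audit_entry" },
];

// A real implementation would normalize the path first (for example, resolve "..").
function findGate(path: string): Gate | undefined {
  return GATES.filter((gate) => path.indexOf(gate.prefix) === 0)[0];
}

// In-memory stand-in for the filesystem (assumption).
const store: Record<string, string> = {};

function writeFile(args: { path: string; content: string }): ToolResult {
  const gate = findGate(args.path);
  if (gate) {
    return {
      text: `Writes under ${gate.prefix} are gated (${gate.reason}); use ${gate.useInstead}`,
      isError: true,
    };
  }
  store[args.path] = args.content;
  return { text: `Wrote ${args.path}` };
}

const openWrite = writeFile({ path: "notes/odd-place/custom.md", content: "edge case" });
const gatedWrite = writeFile({ path: "data/payments/p1.json", content: "{}" });
```

Because the type requires a `reason` on every entry, a gate cannot be added without stating why it exists.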
</keep_primitives_available>

<graduating_to_code>
## Graduating to Code

Some operations will need to move from agent-orchestrated to optimized code for performance or reliability.

### The Progression

```
Stage 1: Agent uses primitives in a loop
  → Flexible, proves the concept
  → Slow, potentially expensive

Stage 2: Add domain tools for common operations
  → Faster, still agent-orchestrated
  → Agent still decides when/whether to use

Stage 3: For hot paths, implement in optimized code
  → Fast, deterministic
  → Agent can still trigger, but execution is code
```

### Example Progression

**Stage 1: Pure primitives**
```markdown
Prompt: "When user asks for a summary, read all notes in /notes,
         analyze them, and write a summary to /summaries/{date}.md"

Agent: Calls read_file 20 times, reasons about content, writes summary
Time: 30 seconds, 50k tokens
```

**Stage 2: Domain tool**
```typescript
tool("get_all_notes", {}, async () => {
  const notes = await readAllNotesFromDirectory();
  return { text: JSON.stringify(notes) };
});

// Agent still decides how to summarize, but retrieval is faster
// Time: 10 seconds, 30k tokens
```

**Stage 3: Optimized code**
```typescript
tool("generate_weekly_summary", {}, async () => {
  // Entire operation in code for hot path
  const notes = await getNotes({ since: oneWeekAgo });
  const summary = await generateSummary(notes);  // Could use cheaper model
  await writeSummary(summary);
  return { text: "Summary generated" };
});

// Agent just triggers it
// Time: 2 seconds, 5k tokens
```

### The Caveat

**Even when an operation graduates to code, the agent should be able to:**

1. Trigger the optimized operation itself
2. Fall back to primitives for edge cases the optimized path doesn't handle

Graduation is about efficiency. **Parity still holds.** The agent doesn't lose capability when you optimize.
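
The two points above can be sketched as a registry that exposes the graduated tool and a primitive side by side. Everything concrete here is assumed for illustration: the in-memory `noteRecords`, the fixed `now` timestamp, and a "summary" that merely lists recent notes in place of a real summarizer.

```typescript
type ToolResult = { text: string; isError?: boolean };

interface NoteRecord {
  path: string;
  createdAt: number; // epoch milliseconds
  body: string;
}

// In-memory notes and a fixed clock (assumptions), so the sketch is deterministic.
const DAY_MS = 24 * 60 * 60 * 1000;
const now = Date.UTC(2024, 0, 15);
const noteRecords: NoteRecord[] = [
  { path: "notes/a.md", createdAt: now - 2 * DAY_MS, body: "Shipped export fix" },
  { path: "notes/b.md", createdAt: now - 30 * DAY_MS, body: "Kickoff meeting" },
];

// Primitive: still registered after graduation.
function readFile(args: { path: string }): ToolResult {
  const match = noteRecords.filter((n) => n.path === args.path)[0];
  return match
    ? { text: match.body }
    : { text: `No such file: ${args.path}`, isError: true };
}

// Graduated hot path: deterministic and fast, but it covers only the weekly case.
function generateWeeklySummary(): ToolResult {
  const recent = noteRecords.filter((n) => n.createdAt >= now - 7 * DAY_MS);
  const lines = recent.map((n) => `- ${n.body}`);
  return { text: [`Notes this week: ${recent.length}`].concat(lines).join("\n") };
}

// Both the graduated tool and the primitive are exposed to the agent.
const registry = {
  generate_weekly_summary: generateWeeklySummary,
  read_file: readFile,
};

// 1. The agent triggers the optimized operation.
const weekly = registry.generate_weekly_summary();
// 2. For an edge case the hot path skips (a month-old note), it falls back to a primitive.
const older = registry.read_file({ path: "notes/b.md" });
```

If `read_file` were dropped from the registry when the summary graduated, the month-old note would become unreachable, which is the loss of parity the caveat warns about.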
</graduating_to_code>

<decision_framework>
## Decision Framework

### Should I Add a Domain Tool?

| Question | If Yes |
|----------|--------|
| Is the agent confused about what this concept means? | Add for vocabulary anchoring |
| Does this operation need validation the agent shouldn't decide? | Add with guardrails |
| Is this a common multi-step operation? | Add for efficiency |
| Would changing behavior require code changes? | Keep as prompt instead |

### Should I Graduate to Code?

| Question | If Yes |
|----------|--------|
| Is this operation called very frequently? | Consider graduating |
| Does latency matter significantly? | Consider graduating |
| Are token costs problematic? | Consider graduating |
| Do you need deterministic behavior? | Graduate to code |
| Does the operation need complex state management? | Graduate to code |

### Should I Gate Access?

| Question | If Yes |
|----------|--------|
| Is there a security requirement? | Gate appropriately |
| Must this operation maintain data integrity? | Gate appropriately |
| Is there an audit/compliance requirement? | Gate appropriately |
| Is it just "safer" with no specific risk? | Keep primitives available |
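
The three tables can be read as a small decision procedure. The sketch below encodes them as plain functions so the rows can be checked side by side. The `Answers` field names are hypothetical and simply mirror the questions. Two choices go beyond what the tables state and are assumptions: "Graduate to code" rows win over "Consider graduating" rows, and the all-no case for graduation returns a placeholder string. It is a reading aid, not a substitute for judgment about a specific system.

```typescript
// Hypothetical yes/no answers to the questions in the tables above.
interface Answers {
  agentConfusedByConcept?: boolean;
  needsValidationBeyondAgent?: boolean;
  commonMultiStepOperation?: boolean;
  behaviorChangeNeedsCodeChange?: boolean;
  calledVeryFrequently?: boolean;
  latencyMatters?: boolean;
  tokenCostsProblematic?: boolean;
  needsDeterminism?: boolean;
  needsComplexState?: boolean;
  securityRequirement?: boolean;
  mustMaintainIntegrity?: boolean;
  auditRequirement?: boolean;
}

// "Should I Add a Domain Tool?": one piece of advice per "yes", in table order.
function domainToolAdvice(a: Answers): string[] {
  const advice: string[] = [];
  if (a.agentConfusedByConcept) advice.push("Add for vocabulary anchoring");
  if (a.needsValidationBeyondAgent) advice.push("Add with guardrails");
  if (a.commonMultiStepOperation) advice.push("Add for efficiency");
  if (a.behaviorChangeNeedsCodeChange) advice.push("Keep as prompt instead");
  return advice;
}

// "Should I Graduate to Code?": the stronger rows take precedence (assumption).
function graduationAdvice(a: Answers): string {
  if (a.needsDeterminism || a.needsComplexState) return "Graduate to code";
  if (a.calledVeryFrequently || a.latencyMatters || a.tokenCostsProblematic) {
    return "Consider graduating";
  }
  return "No reason to graduate yet"; // assumption: the table has no all-no row
}

// "Should I Gate Access?": open unless a specific requirement applies.
function gatingAdvice(a: Answers): string {
  if (a.securityRequirement || a.mustMaintainIntegrity || a.auditRequirement) {
    return "Gate appropriately";
  }
  return "Keep primitives available";
}

const weeklySummaryAnswers: Answers = { commonMultiStepOperation: true, calledVeryFrequently: true };
const summaryToolAdvice = domainToolAdvice(weeklySummaryAnswers);
const summaryGraduation = graduationAdvice(weeklySummaryAnswers);
const summaryGating = gatingAdvice(weeklySummaryAnswers);
```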
</decision_framework>

<examples>
## Examples

### Feedback Processing Evolution

**Stage 1: Primitives only**
```typescript
tools: [read_file, write_file, bash]
prompt: "Store feedback in data/feedback.json, notify if important"
// Agent figures out JSON structure, importance criteria, notification method
```

**Stage 2: Domain tools for vocabulary**
```typescript
tools: [
  store_feedback,      // Anchors "feedback" concept with proper structure
  send_notification,   // Anchors "notify" with correct channels
  read_file,           // Still available for edge cases
  write_file,
]
prompt: "Store feedback using store_feedback. Notify if importance >= 4."
// Agent still decides importance, but vocabulary is anchored
```

**Stage 3: Graduated hot path**
```typescript
tools: [
  process_feedback_batch,  // Optimized for high-volume processing
  store_feedback,          // For individual items
  send_notification,
  read_file,
  write_file,
]
// Batch processing is code, but agent can still use store_feedback for special cases
```
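
A minimal sketch of how the Stage 3 tools might fit together, assuming in-memory storage and synchronous handlers. The `FeedbackItem` shape, the validation rule, and the `notifyAt` argument are hypothetical. Passing the threshold in as an argument is one possible design that keeps the "notify if importance >= 4" policy with the caller instead of hard-coding it in the batch tool.

```typescript
type ToolResult = { text: string; isError?: boolean };

interface FeedbackItem {
  message: string;
  importance: number; // 1-5, assessed by the agent before calling the tool
}

// In-memory stand-ins for storage and the notification channel (assumptions).
const feedbackStore: FeedbackItem[] = [];
const sentNotifications: string[] = [];

function isValid(item: FeedbackItem): boolean {
  const isInteger = Math.floor(item.importance) === item.importance;
  return item.message.trim() !== "" && isInteger && item.importance >= 1 && item.importance <= 5;
}

// store_feedback: one item at a time, still available for special cases.
function storeFeedback(item: FeedbackItem): ToolResult {
  if (!isValid(item)) {
    return { text: "Feedback needs a message and an integer importance from 1 to 5", isError: true };
  }
  feedbackStore.push(item);
  return { text: "Stored 1 feedback item" };
}

// send_notification: delivers a message through the (stand-in) channel.
function sendNotification(args: { message: string }): ToolResult {
  sentNotifications.push(args.message);
  return { text: "Notification sent" };
}

// process_feedback_batch: the graduated hot path. It reuses the same storage and
// notification code, so batch and single-item behavior cannot drift apart.
function processFeedbackBatch(args: { items: FeedbackItem[]; notifyAt: number }): ToolResult {
  let stored = 0;
  let notified = 0;
  let rejected = 0;
  for (const item of args.items) {
    if (storeFeedback(item).isError) {
      rejected += 1;
      continue;
    }
    stored += 1;
    if (item.importance >= args.notifyAt) {
      sendNotification({ message: item.message });
      notified += 1;
    }
  }
  return { text: JSON.stringify({ stored, notified, rejected }) };
}

const batchResult = processFeedbackBatch({
  items: [
    { message: "Export is broken", importance: 5 },
    { message: "Nice colors", importance: 2 },
    { message: "", importance: 3 },
  ],
  notifyAt: 4,
});
```

The batch result reports counts instead of failing on the first bad item, so the agent can follow up on rejected entries with `store_feedback` or the file primitives.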

### When NOT to Add Domain Tools

**Don't add a domain tool just to make things "cleaner":**
```typescript
// Unnecessary - agent can compose primitives
tool("organize_files_by_date", ...)  // Just use move_file + judgment

// Unnecessary - puts decision in wrong place
tool("decide_file_importance", ...)  // This is prompt territory
```

**Don't add a domain tool if behavior might change:**
```typescript
// Bad - locked into code
tool("generate_standard_report", ...)  // What if report format evolves?

// Better - keep in prompt
prompt: "Generate a report covering X, Y, Z. Format for readability."
// Can adjust format by editing prompt
```
</examples>

<checklist>
## Checklist: Primitives to Domain Tools

### Starting Out
- [ ] Begin with pure primitives (read, write, list, bash)
- [ ] Write behavior in prompts, not tool logic
- [ ] Let patterns emerge from actual usage

### Adding Domain Tools
- [ ] Clear reason: vocabulary anchoring, guardrails, or efficiency
- [ ] Tool represents one conceptual action
- [ ] Judgment stays in prompts, not tool code
- [ ] Primitives remain available alongside domain tools

### Graduating to Code
- [ ] Hot path identified (frequent, latency-sensitive, or expensive)
- [ ] Optimized version doesn't remove agent capability
- [ ] Fallback to primitives for edge cases still works

### Gating Decisions
- [ ] Specific reason for each gate (security, integrity, audit)
- [ ] Default is open access
- [ ] Gates are conscious decisions, not defaults
</checklist>