@vibe-agent-toolkit/vat-development-agents 0.1.0-rc.10
- package/README.md +87 -0
- package/agents/agent-generator/DESIGN-NOTES.md +572 -0
- package/agents/agent-generator/README.md +279 -0
- package/agents/agent-generator/agent.yaml +76 -0
- package/agents/agent-generator/examples/example-input.md +62 -0
- package/agents/agent-generator/prompts/system.md +168 -0
- package/agents/agent-generator/prompts/user.md +42 -0
- package/agents/agent-generator/schemas/input.schema.json +86 -0
- package/agents/agent-generator/schemas/output.schema.json +137 -0
- package/agents/agent-generator/validate-agent.ts +91 -0
- package/agents/resource-optimizer/SCOPE.md +1156 -0
- package/package.json +44 -0
@@ -0,0 +1,1156 @@
# resource-optimizer: Agent Resource Analysis for Context Efficiency

## Purpose

The resource-optimizer agent analyzes agent resource files (markdown, text, JSON, YAML) to identify opportunities for improving context efficiency. Following Anthropic's principle of "smallest high-signal tokens," it evaluates whether resources contain redundant information, unclear descriptions, overly broad content, or poor structural organization that wastes valuable LLM context budget.

## Problem Statement

Agent resources often accumulate inefficiencies over time:

1. **Redundancy**: Multiple resources duplicate the same information
2. **Low Signal-to-Noise**: Verbose descriptions that could be more concise
3. **Overly Broad Scope**: Resources that try to cover too much ground
4. **Poor Structure**: Information buried in walls of text instead of scannable sections
5. **Outdated Content**: Examples or instructions that no longer apply

These inefficiencies waste precious context tokens, reducing agent effectiveness and increasing costs.

## Input/Output Interface

### Input Schema

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "analysisScope": {
      "type": "object",
      "description": "Defines what resources to analyze",
      "properties": {
        "resourcePaths": {
          "type": "array",
          "description": "Specific resource files to analyze",
          "items": {
            "type": "string"
          }
        },
        "resourceDirectory": {
          "type": "string",
          "description": "Directory containing resources to analyze (recursive)"
        },
        "includePatterns": {
          "type": "array",
          "description": "Glob patterns for files to include (e.g., '**/*.md')",
          "items": {
            "type": "string"
          },
          "default": ["**/*.md", "**/*.txt", "**/*.json", "**/*.yaml", "**/*.yml"]
        },
        "excludePatterns": {
          "type": "array",
          "description": "Glob patterns for files to exclude (e.g., 'node_modules/**')",
          "items": {
            "type": "string"
          },
          "default": ["node_modules/**", ".git/**", "dist/**", "build/**"]
        }
      },
      "oneOf": [
        { "required": ["resourcePaths"] },
        { "required": ["resourceDirectory"] }
      ]
    },
    "analysisOptions": {
      "type": "object",
      "description": "Configuration for analysis behavior",
      "properties": {
        "checkRedundancy": {
          "type": "boolean",
          "description": "Check for duplicate content across resources",
          "default": true
        },
        "checkClarity": {
          "type": "boolean",
          "description": "Evaluate description clarity and conciseness",
          "default": true
        },
        "checkSpecificity": {
          "type": "boolean",
          "description": "Assess whether scope is appropriately narrow",
          "default": true
        },
        "checkStructure": {
          "type": "boolean",
          "description": "Evaluate information organization and scannability",
          "default": true
        },
        "minTokenSavings": {
          "type": "integer",
          "description": "Minimum estimated token savings to report an issue",
          "default": 50,
          "minimum": 0
        },
        "severityThreshold": {
          "type": "string",
          "description": "Minimum severity level to report",
          "enum": ["info", "minor", "moderate", "major", "critical"],
          "default": "minor"
        }
      }
    },
    "comparisonBaseline": {
      "type": "object",
      "description": "Optional baseline for comparison (e.g., previous version)",
      "properties": {
        "baselineDirectory": {
          "type": "string",
          "description": "Directory containing baseline resources"
        },
        "baselineCommit": {
          "type": "string",
          "description": "Git commit SHA to compare against"
        }
      }
    }
  },
  "required": ["analysisScope"]
}
```
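
The defaults and the `oneOf` constraint in the input schema can be illustrated with a small normalization step. This is only a sketch under assumptions: the type names and the `normalizeInput` function are hypothetical and not part of the package; only the field names and default values come from the schema.

```typescript
// Hypothetical types mirroring the input schema; the names are illustrative only.
type Severity = "info" | "minor" | "moderate" | "major" | "critical";

interface AnalysisScope {
  resourcePaths?: string[];
  resourceDirectory?: string;
  includePatterns?: string[];
  excludePatterns?: string[];
}

interface AnalysisOptions {
  checkRedundancy: boolean;
  checkClarity: boolean;
  checkSpecificity: boolean;
  checkStructure: boolean;
  minTokenSavings: number;
  severityThreshold: Severity;
}

interface AnalysisInput {
  analysisScope: AnalysisScope;
  analysisOptions?: Partial<AnalysisOptions>;
}

// Default values copied from the schema's "default" keywords.
const DEFAULT_OPTIONS: AnalysisOptions = {
  checkRedundancy: true,
  checkClarity: true,
  checkSpecificity: true,
  checkStructure: true,
  minTokenSavings: 50,
  severityThreshold: "minor",
};
const DEFAULT_INCLUDE = ["**/*.md", "**/*.txt", "**/*.json", "**/*.yaml", "**/*.yml"];
const DEFAULT_EXCLUDE = ["node_modules/**", ".git/**", "dist/**", "build/**"];

// Hypothetical helper: enforces the oneOf rule and fills in defaults.
function normalizeInput(input: AnalysisInput) {
  const scope = input.analysisScope;
  const hasPaths = scope.resourcePaths !== undefined;
  const hasDirectory = scope.resourceDirectory !== undefined;
  // oneOf passes only when exactly one of the two alternatives matches.
  if (hasPaths === hasDirectory) {
    throw new Error("analysisScope needs exactly one of resourcePaths or resourceDirectory");
  }
  return {
    analysisScope: {
      ...scope,
      includePatterns: scope.includePatterns ?? DEFAULT_INCLUDE,
      excludePatterns: scope.excludePatterns ?? DEFAULT_EXCLUDE,
    },
    analysisOptions: { ...DEFAULT_OPTIONS, ...input.analysisOptions },
  };
}
```

Calling `normalizeInput({ analysisScope: { resourceDirectory: "docs" } })` would return the scope with the default glob patterns and all four checks enabled; supplying both `resourcePaths` and `resourceDirectory`, or neither, throws.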

### Output Schema

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "summary": {
      "type": "object",
      "description": "High-level analysis summary",
      "properties": {
        "totalResources": {
          "type": "integer",
          "description": "Total number of resources analyzed"
        },
        "totalTokens": {
          "type": "integer",
          "description": "Total estimated tokens in analyzed resources"
        },
        "issuesFound": {
          "type": "integer",
          "description": "Total number of efficiency issues identified"
        },
        "potentialTokenSavings": {
          "type": "integer",
          "description": "Estimated tokens that could be saved by addressing issues"
        },
        "efficiencyScore": {
          "type": "number",
          "description": "Overall efficiency score (0-100, higher is better)",
          "minimum": 0,
          "maximum": 100
        }
      },
      "required": [
        "totalResources",
        "totalTokens",
        "issuesFound",
        "potentialTokenSavings",
        "efficiencyScore"
      ]
    },
    "resourceAnalysis": {
      "type": "array",
      "description": "Detailed analysis per resource",
      "items": {
        "type": "object",
        "properties": {
          "resourcePath": {
            "type": "string",
            "description": "Path to the analyzed resource"
          },
          "tokenCount": {
            "type": "integer",
            "description": "Estimated token count for this resource"
          },
          "issues": {
            "type": "array",
            "description": "Efficiency issues found in this resource",
            "items": {
              "type": "object",
              "properties": {
                "type": {
                  "type": "string",
                  "description": "Type of efficiency issue",
                  "enum": [
                    "redundancy",
                    "clarity",
                    "specificity",
                    "structure",
                    "outdated"
                  ]
                },
                "severity": {
                  "type": "string",
                  "description": "Severity of the issue",
                  "enum": ["info", "minor", "moderate", "major", "critical"]
                },
                "description": {
                  "type": "string",
                  "description": "Human-readable description of the issue"
                },
                "location": {
                  "type": "object",
                  "description": "Location of the issue within the resource",
                  "properties": {
                    "startLine": {
                      "type": "integer",
                      "description": "Starting line number (1-indexed)"
                    },
                    "endLine": {
                      "type": "integer",
                      "description": "Ending line number (1-indexed)"
                    },
                    "section": {
                      "type": "string",
                      "description": "Section heading or context"
                    }
                  }
                },
                "estimatedTokenWaste": {
                  "type": "integer",
                  "description": "Estimated tokens wasted by this issue"
                },
                "recommendation": {
                  "type": "string",
                  "description": "Specific recommendation for addressing the issue"
                },
                "examples": {
                  "type": "object",
                  "description": "Optional before/after examples",
                  "properties": {
                    "before": {
                      "type": "string",
                      "description": "Current inefficient content"
                    },
                    "after": {
                      "type": "string",
                      "description": "Suggested efficient content"
                    }
                  }
                },
                "relatedResources": {
                  "type": "array",
                  "description": "Other resources related to this issue (for redundancy)",
                  "items": {
                    "type": "string"
                  }
                }
              },
              "required": [
                "type",
                "severity",
                "description",
                "estimatedTokenWaste",
                "recommendation"
              ]
            }
          },
          "efficiencyScore": {
            "type": "number",
            "description": "Efficiency score for this resource (0-100)",
            "minimum": 0,
            "maximum": 100
          }
        },
        "required": [
          "resourcePath",
          "tokenCount",
          "issues",
          "efficiencyScore"
        ]
      }
    },
    "crossResourceIssues": {
      "type": "array",
      "description": "Issues that span multiple resources",
      "items": {
        "type": "object",
        "properties": {
          "type": {
            "type": "string",
            "description": "Type of cross-resource issue",
            "enum": ["redundancy", "fragmentation", "inconsistency"]
          },
          "severity": {
            "type": "string",
            "enum": ["info", "minor", "moderate", "major", "critical"]
          },
          "description": {
            "type": "string",
            "description": "Description of the cross-resource issue"
          },
          "affectedResources": {
            "type": "array",
            "description": "Resources involved in this issue",
            "items": {
              "type": "string"
            }
          },
          "estimatedTokenWaste": {
            "type": "integer",
            "description": "Total estimated token waste across resources"
          },
          "recommendation": {
            "type": "string",
            "description": "Recommendation for resolution"
          }
        },
        "required": [
          "type",
          "severity",
          "description",
          "affectedResources",
          "estimatedTokenWaste",
          "recommendation"
        ]
      }
    },
    "recommendations": {
      "type": "object",
      "description": "Prioritized recommendations for improvement",
      "properties": {
        "highPriority": {
          "type": "array",
          "description": "Most impactful improvements to make",
          "items": {
            "type": "object",
            "properties": {
              "recommendation": {
                "type": "string"
              },
              "estimatedImpact": {
                "type": "string",
                "description": "High-level impact estimate"
              },
              "affectedResources": {
                "type": "array",
                "items": {
                  "type": "string"
                }
              }
            },
            "required": ["recommendation", "estimatedImpact"]
          }
        },
        "mediumPriority": {
          "type": "array",
          "description": "Moderate improvements worth considering",
          "items": {
            "type": "object",
            "properties": {
              "recommendation": {
                "type": "string"
              },
              "estimatedImpact": {
                "type": "string"
              }
            },
            "required": ["recommendation", "estimatedImpact"]
          }
        },
        "lowPriority": {
          "type": "array",
          "description": "Minor improvements (optional)",
          "items": {
            "type": "object",
            "properties": {
              "recommendation": {
                "type": "string"
              },
              "estimatedImpact": {
                "type": "string"
              }
            },
            "required": ["recommendation", "estimatedImpact"]
          }
        }
      },
      "required": ["highPriority"]
    }
  },
  "required": [
    "summary",
    "resourceAnalysis",
    "recommendations"
  ]
}
```
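
One plausible way the `severityThreshold` and `minTokenSavings` inputs could gate which issues reach the report is sketched below. Two assumptions are made that the schemas do not state: the severity enum is ordered from least to most severe, and an issue must pass both thresholds, with `estimatedTokenWaste` compared against `minTokenSavings`. The `shouldReport` helper is hypothetical.

```typescript
// Assumption: the enum order in the schema runs from least to most severe.
const SEVERITY_ORDER = ["info", "minor", "moderate", "major", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Subset of the issue shape from the output schema.
interface Issue {
  type: string;
  severity: Severity;
  description: string;
  estimatedTokenWaste: number;
  recommendation: string;
}

// Hypothetical filter: an issue is reported only if it meets both thresholds.
function shouldReport(issue: Issue, threshold: Severity, minTokenSavings: number): boolean {
  const severeEnough =
    SEVERITY_ORDER.indexOf(issue.severity) >= SEVERITY_ORDER.indexOf(threshold);
  const largeEnough = issue.estimatedTokenWaste >= minTokenSavings;
  return severeEnough && largeEnough;
}
```

With the schema defaults (`"minor"` and `50`), an `info` issue is dropped regardless of size, and a `moderate` issue wasting 30 tokens is dropped as too small.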

## Key Behaviors

### 1. Identify Redundancy

**Goal**: Detect duplicate or overlapping information across resources.

**Detection Method**:
- Semantic similarity analysis (embeddings) to find conceptually similar content
- Exact text matching for boilerplate or copied sections
- Pattern matching for repeated examples or code snippets

**Analysis**:
- Calculate token overlap between resources
- Identify which resource is the authoritative source
- Determine if duplication is intentional (cross-reference) or wasteful

**Recommendation**:
- Consolidate duplicate content into a single resource
- Replace duplicates with references or links
- Remove redundant examples if one suffices

**Example**:
```
Issue: Redundancy (Major)
Resources: agent-a/README.md, agent-b/README.md
Token Waste: 400 tokens

Description: Both READMEs contain identical "Getting Started with VAT" section (200 tokens each)

Recommendation: Move common "Getting Started" to docs/getting-started.md,
reference from both agent READMEs.

Estimated Savings: 350 tokens (accounting for reference overhead)
```
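
The exact-text-matching part of the detection method could look roughly like the sketch below. It covers only verbatim duplicates after whitespace and case normalization, not the embedding-based semantic comparison. The `findExactDuplicates` and `estimateTokens` names are hypothetical, the 4-characters-per-token estimate is a rough assumption, and counting every copy beyond the first as waste is one possible convention rather than the agent's defined behavior.

```typescript
interface Resource {
  path: string;
  content: string;
}

interface DuplicateBlock {
  text: string;
  paths: string[];
  estimatedTokenWaste: number;
}

// Assumption: roughly 4 characters per token; a real tokenizer would be more accurate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Groups blank-line-separated blocks by normalized text and reports blocks
// that appear in two or more resources.
function findExactDuplicates(resources: Resource[]): DuplicateBlock[] {
  const blocks = new Map<string, { text: string; paths: string[] }>();
  resources.forEach(({ path, content }) => {
    content.split(/\n\s*\n/).forEach((raw) => {
      const text = raw.trim();
      if (text === "") return;
      // Normalize whitespace and case so trivially reformatted copies still match.
      const key = text.replace(/\s+/g, " ").toLowerCase();
      const entry = blocks.get(key) ?? { text, paths: [] };
      if (entry.paths.indexOf(path) === -1) entry.paths.push(path);
      blocks.set(key, entry);
    });
  });

  const duplicates: DuplicateBlock[] = [];
  blocks.forEach(({ text, paths }) => {
    if (paths.length < 2) return;
    // Every copy beyond the first is counted as waste.
    const estimatedTokenWaste = estimateTokens(text) * (paths.length - 1);
    duplicates.push({ text, paths, estimatedTokenWaste });
  });
  return duplicates;
}
```

Splitting on blank lines keeps the comparison at paragraph granularity, which matches the "boilerplate or copied sections" case; finer-grained overlap would need a different chunking strategy.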

### 2. Check Clarity

**Goal**: Evaluate whether descriptions are concise and high-signal.

**Detection Method**:
- Analyze description length vs. information density
- Identify verbose phrases that can be condensed
- Detect unnecessary hedging language ("basically", "essentially", "kind of")
- Flag overly complex sentence structures

**Analysis**:
- Compare description token count to ideal range for content type
- Measure readability scores (Flesch-Kincaid, etc.)
- Identify opportunities for bullet points vs. prose

**Recommendation**:
- Provide condensed alternative wording
- Suggest structural improvements (tables, lists)
- Highlight unnecessary qualifiers or filler words

**Example**:
```
Issue: Clarity (Moderate)
Resource: agent-generator/SCOPE.md
Location: Lines 45-60 (Purpose section)
Token Waste: 80 tokens

Before (150 tokens):
"The purpose of this agent is essentially to help developers who are working
on creating new agents for the vibe-agent-toolkit by providing them with a
way to basically generate all the necessary scaffolding and boilerplate files
that they would otherwise have to create manually, which can be time-consuming
and error-prone, especially when you're trying to ensure consistency across
multiple agents."

After (70 tokens):
"Generates scaffolding and boilerplate files for new vibe-agent-toolkit agents,
ensuring consistency and reducing manual setup time."

Recommendation: Replace verbose explanation with concise description.
Use active voice and remove hedging language.

Estimated Savings: 80 tokens
```
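
The hedging-language check is the most mechanical of the clarity detections and can be sketched with a simple phrase scan. The phrase list below is taken from the examples in the detection method; the `findFillerPhrases` function and its result shape are hypothetical, and a real analyzer would likely combine this with the LLM-based assessment.

```typescript
// Phrases taken from the detection method above; extend as needed.
const FILLER_PHRASES = ["basically", "essentially", "kind of"];

interface FillerHit {
  phrase: string;
  line: number; // 1-indexed, matching the output schema's location convention
}

// Hypothetical helper: reports each occurrence of a filler phrase with its line number.
function findFillerPhrases(text: string, phrases: string[] = FILLER_PHRASES): FillerHit[] {
  const hits: FillerHit[] = [];
  text.split("\n").forEach((lineText, index) => {
    for (const phrase of phrases) {
      // Whole-word match, case-insensitive; allow flexible spacing inside multi-word phrases.
      const pattern = new RegExp(`\\b${phrase.replace(/\s+/g, "\\s+")}\\b`, "gi");
      const count = (lineText.match(pattern) ?? []).length;
      for (let i = 0; i < count; i++) {
        hits.push({ phrase, line: index + 1 });
      }
    }
  });
  return hits;
}
```

Run against the "Before" text in the example, this would flag "essentially" and "basically" on their respective lines. The phrases are assumed to contain no regular-expression metacharacters.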

### 3. Assess Specificity

**Goal**: Determine if resources have appropriate scope (not too broad, not too narrow).

**Detection Method**:
- Analyze breadth of topics covered vs. resource type
- Identify tangential information that doesn't serve the core purpose
- Detect overly generic content that lacks actionable detail

**Analysis**:
- Compare resource focus to stated purpose
- Identify sections that could be split into separate resources
- Flag content that belongs in different documentation layers (README vs. detailed docs)

**Recommendation**:
- Suggest splitting broad resources into focused ones
- Identify content to move to more appropriate locations
- Recommend removing tangential information

**Example**:
```
Issue: Specificity (Major)
Resource: agent-generator/SCOPE.md
Location: Lines 200-350 (Implementation Details section)
Token Waste: 500 tokens

Description: SCOPE.md contains detailed implementation code examples and
debugging strategies. This level of detail belongs in implementation docs,
not scope definition.

Recommendation:
1. Move implementation details to agent-generator/DESIGN.md
2. Keep only high-level approach in SCOPE.md
3. Replace details with reference: "See DESIGN.md for implementation approach"

Estimated Savings: 450 tokens from SCOPE.md (net positive after accounting
for DESIGN.md additions)
```

### 4. Evaluate Structure

**Goal**: Ensure information is organized for maximum scannability and retrieval.

**Detection Method**:
- Analyze heading hierarchy and nesting depth
- Identify "walls of text" (long paragraphs without breaks)
- Detect missing structural elements (tables of contents, summaries)
- Flag poor use of formatting (lists vs. prose, code blocks vs. inline)

**Analysis**:
- Measure section lengths and recommend chunking
- Evaluate heading clarity and hierarchy
- Identify opportunities for tables, diagrams, or other visual aids

**Recommendation**:
- Suggest restructuring for better scannability
- Recommend adding navigational aids (TOC, anchors)
- Propose format changes (prose → lists, inline → code blocks)

**Example**:
```
Issue: Structure (Moderate)
Resource: agent-generator/DESIGN.md
Location: Lines 100-250 (Configuration section)
Token Waste: 200 tokens

Description: Configuration options explained in prose paragraphs.
Information is hard to scan and compare.

Before (prose):
"The agent generator accepts several configuration options. The first option
is the agent name, which should follow kebab-case convention. The second
option is..."

After (table):
| Option | Format | Description | Example |
|--------|--------|-------------|---------|
| agentName | kebab-case | Agent identifier | task-analyzer |
| targetDir | path | Output directory | ./docs/agents |

Recommendation: Convert prose descriptions to structured table for faster
scanning and comparison.

Estimated Savings: 150 tokens (more concise) + improved retrieval accuracy
```
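
Two of the structural detections, "walls of text" and heading nesting depth, lend themselves to a line-based scan. The sketch below is a simplified illustration: the `findStructureIssues` function, the finding shape, and both thresholds (120 words per paragraph, heading depth 4) are assumptions chosen for the example, not values defined by the agent.

```typescript
interface StructureFinding {
  kind: "wall-of-text" | "deep-heading";
  startLine: number; // 1-indexed
  endLine: number;   // 1-indexed
  detail: string;
}

// Hypothetical scanner; thresholds are illustrative defaults.
function findStructureIssues(
  markdown: string,
  maxParagraphWords = 120,
  maxHeadingDepth = 4
): StructureFinding[] {
  const findings: StructureFinding[] = [];
  const lines = markdown.split("\n");
  let inFence = false;
  let paragraphStart = -1;
  let wordCount = 0;

  // Closes the current paragraph; lastLine is the 1-indexed line where it ended.
  const flush = (lastLine: number): void => {
    if (paragraphStart >= 0 && wordCount > maxParagraphWords) {
      findings.push({
        kind: "wall-of-text",
        startLine: paragraphStart + 1,
        endLine: lastLine,
        detail: `${wordCount} words without a break`,
      });
    }
    paragraphStart = -1;
    wordCount = 0;
  };

  lines.forEach((line, i) => {
    const trimmed = line.trim();
    if (trimmed.startsWith("```")) {
      flush(i);
      inFence = !inFence; // skip the contents of fenced code blocks
      return;
    }
    if (inFence) return;
    const heading = /^(#{1,6})\s/.exec(trimmed);
    if (heading) {
      flush(i);
      if (heading[1].length > maxHeadingDepth) {
        findings.push({
          kind: "deep-heading",
          startLine: i + 1,
          endLine: i + 1,
          detail: `heading depth ${heading[1].length}`,
        });
      }
      return;
    }
    if (trimmed === "") {
      flush(i);
      return;
    }
    if (paragraphStart < 0) paragraphStart = i;
    wordCount += trimmed.split(/\s+/).length;
  });
  flush(lines.length);
  return findings;
}
```

Lists and tables are treated as ordinary paragraphs here, so a long table could be reported as a wall of text; a fuller implementation would parse the markdown properly.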

## Example Analysis

### Input: Analyze agent-generator resources

```json
{
  "analysisScope": {
    "resourceDirectory": "docs/agents/vat-development-agents/agent-generator"
  },
  "analysisOptions": {
    "checkRedundancy": true,
    "checkClarity": true,
    "checkSpecificity": true,
    "checkStructure": true,
    "minTokenSavings": 50,
    "severityThreshold": "minor"
  }
}
```

### Output: Analysis Report

```json
{
  "summary": {
    "totalResources": 4,
    "totalTokens": 12500,
    "issuesFound": 8,
    "potentialTokenSavings": 1800,
    "efficiencyScore": 78.5
  },
  "resourceAnalysis": [
    {
      "resourcePath": "docs/agents/vat-development-agents/agent-generator/SCOPE.md",
      "tokenCount": 5000,
      "issues": [
        {
          "type": "clarity",
          "severity": "moderate",
          "description": "Purpose section uses verbose, hedging language",
          "location": {
            "startLine": 45,
            "endLine": 60,
            "section": "Purpose"
          },
          "estimatedTokenWaste": 80,
          "recommendation": "Replace verbose explanation with concise active-voice description. Remove hedging words ('essentially', 'basically').",
          "examples": {
            "before": "The purpose of this agent is essentially to help developers who are working on creating new agents...",
            "after": "Generates scaffolding and boilerplate files for new vibe-agent-toolkit agents, ensuring consistency and reducing manual setup time."
          }
        },
        {
          "type": "specificity",
          "severity": "major",
          "description": "SCOPE.md contains implementation details that belong in DESIGN.md",
          "location": {
            "startLine": 200,
            "endLine": 350,
            "section": "Implementation Details"
          },
          "estimatedTokenWaste": 500,
          "recommendation": "Move implementation details to DESIGN.md. Keep only high-level approach in SCOPE.md with reference to DESIGN.md."
        }
      ],
      "efficiencyScore": 72.0
    },
    {
      "resourcePath": "docs/agents/vat-development-agents/agent-generator/DESIGN.md",
      "tokenCount": 4500,
      "issues": [
        {
          "type": "structure",
          "severity": "moderate",
          "description": "Configuration options described in prose instead of table",
          "location": {
            "startLine": 100,
            "endLine": 250,
            "section": "Configuration"
          },
          "estimatedTokenWaste": 200,
          "recommendation": "Convert configuration prose to structured table with columns: Option, Format, Description, Example."
        }
      ],
      "efficiencyScore": 81.5
    },
    {
      "resourcePath": "docs/agents/vat-development-agents/agent-generator/IMPLEMENTATION_PLAN.md",
      "tokenCount": 2000,
      "issues": [
        {
          "type": "structure",
          "severity": "minor",
          "description": "Long paragraphs in Task Breakdown section reduce scannability",
          "location": {
            "startLine": 50,
            "endLine": 150,
            "section": "Task Breakdown"
          },
          "estimatedTokenWaste": 100,
          "recommendation": "Break long task descriptions into bullet points. Add sub-headings for task phases."
        }
      ],
      "efficiencyScore": 85.0
    },
    {
      "resourcePath": "docs/agents/vat-development-agents/agent-generator/README.md",
      "tokenCount": 1000,
      "issues": [
        {
          "type": "redundancy",
          "severity": "minor",
          "description": "Installation instructions duplicate main repo README",
          "location": {
            "startLine": 10,
            "endLine": 25,
            "section": "Installation"
          },
          "estimatedTokenWaste": 80,
          "recommendation": "Replace installation section with: 'See main repo README for installation instructions.'",
          "relatedResources": [
            "README.md"
          ]
        }
      ],
      "efficiencyScore": 88.0
    }
  ],
  "crossResourceIssues": [
    {
      "type": "redundancy",
      "severity": "moderate",
      "description": "Agent naming conventions explained in both SCOPE.md and DESIGN.md",
      "affectedResources": [
        "docs/agents/vat-development-agents/agent-generator/SCOPE.md",
        "docs/agents/vat-development-agents/agent-generator/DESIGN.md"
      ],
      "estimatedTokenWaste": 150,
      "recommendation": "Define naming conventions once in DESIGN.md. Reference from SCOPE.md: 'Agent names follow conventions defined in DESIGN.md#naming-conventions'."
    }
  ],
  "recommendations": {
    "highPriority": [
      {
        "recommendation": "Move implementation details from SCOPE.md to DESIGN.md",
        "estimatedImpact": "Save 500 tokens, improve document focus",
        "affectedResources": [
          "docs/agents/vat-development-agents/agent-generator/SCOPE.md",
          "docs/agents/vat-development-agents/agent-generator/DESIGN.md"
        ]
      },
      {
        "recommendation": "Convert configuration prose to structured table in DESIGN.md",
        "estimatedImpact": "Save 150 tokens, improve scannability by 40%",
        "affectedResources": [
          "docs/agents/vat-development-agents/agent-generator/DESIGN.md"
        ]
      }
    ],
    "mediumPriority": [
      {
        "recommendation": "Consolidate naming conventions into single authoritative section",
        "estimatedImpact": "Save 150 tokens, reduce maintenance burden"
      },
      {
        "recommendation": "Condense verbose purpose descriptions across resources",
        "estimatedImpact": "Save 200 tokens total, improve clarity"
      }
    ],
    "lowPriority": [
      {
        "recommendation": "Break long task descriptions into bullet points",
        "estimatedImpact": "Save 100 tokens, improve scannability by 25%"
      },
      {
        "recommendation": "Replace duplicated installation instructions with references",
        "estimatedImpact": "Save 80 tokens, improve consistency"
      }
    ]
  }
}
```

### Interpretation

**Overall Efficiency**: 78.5/100 (Good, with room for improvement)

**Key Findings**:
1. **Major Issue**: SCOPE.md contains 500 tokens of implementation details that belong in DESIGN.md
2. **Structural Issues**: Configuration and task information would be more efficient in structured formats
3. **Minor Redundancy**: Some content duplicated across resources (naming conventions, installation)

**Estimated Impact**:
- Addressing high-priority issues: 650 tokens saved (5.2% reduction)
- Addressing all issues: 1,800 tokens saved (14.4% reduction)
- Improved scannability and retrieval accuracy

**Recommendation**: Focus on high-priority issues first. Moving implementation details and restructuring configuration will provide the most significant efficiency gains.
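
The percentage reductions quoted under Estimated Impact follow from dividing the tokens saved by the 12,500 total tokens in the example report. A minimal sketch of that arithmetic, with a hypothetical `percentReduction` helper that rounds to one decimal place:

```typescript
// Hypothetical helper: share of total tokens saved, as a percentage rounded to one decimal.
function percentReduction(tokensSaved: number, totalTokens: number): number {
  if (totalTokens <= 0) {
    throw new RangeError("totalTokens must be positive");
  }
  return Math.round((tokensSaved / totalTokens) * 1000) / 10;
}

const highPriorityReduction = percentReduction(650, 12500); // 5.2
const allIssuesReduction = percentReduction(1800, 12500);   // 14.4
```

The 650-token figure is the sum of the two high-priority recommendations (500 + 150).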

## Implementation Approach

### Architecture

```
resource-optimizer/
├── src/
│   ├── analyzers/
│   │   ├── redundancy-analyzer.ts    # Detect duplicate content
│   │   ├── clarity-analyzer.ts       # Evaluate conciseness
│   │   ├── specificity-analyzer.ts   # Assess scope appropriateness
│   │   └── structure-analyzer.ts     # Evaluate organization
│   ├── utils/
│   │   ├── token-estimator.ts        # Estimate token counts
│   │   ├── content-parser.ts         # Parse markdown/text/JSON/YAML
│   │   └── similarity-calculator.ts  # Semantic similarity
│   ├── reporters/
│   │   └── analysis-reporter.ts      # Format output report
│   └── index.ts                      # Main orchestration
└── test/
    ├── analyzers/
    ├── utils/
    └── integration/
```
|
|
781
|
+
|
|
782
|
+
### LLM Selection
|
|
783
|
+
|
|
784
|
+
**Primary LLM**: Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
|
|
785
|
+
|
|
786
|
+
**Rationale**:
|
|
787
|
+
- Strong semantic understanding for redundancy detection
|
|
788
|
+
- Excellent at identifying clarity issues in writing
|
|
789
|
+
- Good at evaluating information architecture
|
|
790
|
+
- Balanced cost vs. capability for analysis tasks
|
|
791
|
+
|
|
792
|
+
**Alternative for Cost Optimization**:
|
|
793
|
+
- Claude 3.5 Haiku for initial token counting and structural analysis
|
|
794
|
+
- Sonnet for semantic analysis (redundancy, clarity assessment)
|
|
795
|
+
|
|
796
|
+
### Workflow
|
|
797
|
+
|
|
798
|
+
1. **Load Resources**
|
|
799
|
+
- Discover files based on input scope
|
|
800
|
+
- Parse content (markdown, JSON, YAML, text)
|
|
801
|
+
- Estimate token counts
|
|
802
|
+
|
|
803
|
+
2. **Run Analyzers** (parallel where possible)
|
|
804
|
+
- Redundancy: Compare content across resources using embeddings
|
|
805
|
+
- Clarity: Analyze descriptions, identify verbose patterns
|
|
806
|
+
- Specificity: Evaluate scope vs. purpose
|
|
807
|
+
- Structure: Assess organization and formatting
|
|
808
|
+
|
|
809
|
+
3. **Calculate Efficiency Scores**
|
|
810
|
+
- Per-resource score based on issues found
|
|
811
|
+
- Overall score weighted by resource size
|
|
812
|
+
- Token waste estimates
|
|
813
|
+
|
|
814
|
+
4. **Generate Recommendations**
|
|
815
|
+
- Prioritize by impact (token savings × severity)
|
|
816
|
+
- Group related issues
|
|
817
|
+
- Provide before/after examples
|
|
818
|
+
|
|
819
|
+
5. **Output Report**
|
|
820
|
+
- JSON format for programmatic use
|
|
821
|
+
- Optional markdown report for human review
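
Step 2 of the workflow compares content across resources using embeddings. A minimal sketch of that comparison follows, assuming the embeddings have already been computed and are available as plain number arrays. The names `EmbeddedResource`, `cosineSimilarity`, and `findSimilarPairs` are hypothetical, and the default threshold of 0.85 is a placeholder rather than a settled value.

```typescript
// Hypothetical names throughout; embeddings are assumed to be precomputed elsewhere.
interface EmbeddedResource {
  path: string;
  embedding: number[];
}

interface SimilarPair {
  a: string;
  b: string;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embeddings must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // a zero vector is treated as dissimilar
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Compare every unordered pair once and keep those at or above the threshold.
function findSimilarPairs(resources: EmbeddedResource[], threshold = 0.85): SimilarPair[] {
  const pairs: SimilarPair[] = [];
  resources.forEach((first, i) => {
    resources.slice(i + 1).forEach((second) => {
      const similarity = cosineSimilarity(first.embedding, second.embedding);
      if (similarity >= threshold) {
        pairs.push({ a: first.path, b: second.path, similarity });
      }
    });
  });
  return pairs;
}
```

The pairwise loop is quadratic in the number of resources, which is likely acceptable for a single agent's resource directory but would need revisiting for larger corpora.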
|
|
822
|
+
|
|
823
|
+
## Success Criteria
|
|
824
|
+
|
|
825
|
+
### Quantitative
|
|
826
|
+
|
|
827
|
+
- **Accuracy**: 85%+ precision on identifying real efficiency issues (validated by human review)
|
|
828
|
+
- **Coverage**: Detect 90%+ of token waste exceeding `minTokenSavings` threshold
|
|
829
|
+
- **Performance**: Analyze 100KB of resources in < 30 seconds
|
|
830
|
+
- **Consistency**: Analyzing the same resources twice with the same options produces identical results
|
|
831
|
+
|
|
832
|
+
### Qualitative
|
|
833
|
+
|
|
834
|
+
- **Actionability**: Recommendations are specific enough to implement without additional analysis
|
|
835
|
+
- **False Positive Rate**: < 15% of reported issues are deemed not worth fixing by users
|
|
836
|
+
- **User Satisfaction**: Developers find the analysis valuable and act on recommendations
|
|
837
|
+
|
|
838
|
+
## Testing Strategy
|
|
839
|
+
|
|
840
|
+
### Unit Tests
|
|
841
|
+
|
|
842
|
+
Test each analyzer in isolation:
|
|
843
|
+
|
|
844
|
+
**Redundancy Analyzer**:
|
|
845
|
+
```typescript
|
|
846
|
+
describe('RedundancyAnalyzer', () => {
|
|
847
|
+
it('should detect exact text duplication across resources', () => {
|
|
848
|
+
const resource1 = { path: 'a.md', content: 'Installation: Run npm install' };
|
|
849
|
+
const resource2 = { path: 'b.md', content: 'Installation: Run npm install' };
|
|
850
|
+
|
|
851
|
+
const issues = analyzer.analyze([resource1, resource2]);
|
|
852
|
+
|
|
853
|
+
expect(issues).toContainEqual(
|
|
854
|
+
expect.objectContaining({
|
|
855
|
+
type: 'redundancy',
|
|
856
|
+
affectedResources: ['a.md', 'b.md'],
|
|
857
|
+
estimatedTokenWaste: expect.any(Number),
|
|
858
|
+
})
|
|
859
|
+
);
|
|
860
|
+
});
|
|
861
|
+
|
|
862
|
+
it('should detect semantic similarity with different wording', () => {
|
|
863
|
+
const resource1 = { path: 'a.md', content: 'Install dependencies using npm' };
|
|
864
|
+
const resource2 = { path: 'b.md', content: 'Use npm to install required packages' };
|
|
865
|
+
|
|
866
|
+
const issues = analyzer.analyze([resource1, resource2]);
|
|
867
|
+
|
|
868
|
+
expect(issues.some(i => i.type === 'redundancy')).toBe(true);
|
|
869
|
+
});
|
|
870
|
+
|
|
871
|
+
it('should ignore intentional cross-references', () => {
|
|
872
|
+
const resource1 = { path: 'a.md', content: 'See b.md for details' };
|
|
873
|
+
const resource2 = { path: 'b.md', content: 'Detailed explanation here...' };
|
|
874
|
+
|
|
875
|
+
const issues = analyzer.analyze([resource1, resource2]);
|
|
876
|
+
|
|
877
|
+
expect(issues).not.toContainEqual(
|
|
878
|
+
expect.objectContaining({ relatedResources: expect.arrayContaining(['a.md', 'b.md']) })
|
|
879
|
+
);
|
|
880
|
+
});
|
|
881
|
+
});
|
|
882
|
+
```
|
|
883
|
+
|
|
884
|
+
**Clarity Analyzer**:
|
|
885
|
+
```typescript
|
|
886
|
+
describe('ClarityAnalyzer', () => {
|
|
887
|
+
it('should flag verbose descriptions', () => {
|
|
888
|
+
const resource = {
|
|
889
|
+
path: 'scope.md',
|
|
890
|
+
content: 'This agent essentially helps developers by basically providing them with...',
|
|
891
|
+
};
|
|
892
|
+
|
|
893
|
+
const issues = analyzer.analyze([resource]);
|
|
894
|
+
|
|
895
|
+
expect(issues).toContainEqual(
|
|
896
|
+
expect.objectContaining({
|
|
897
|
+
type: 'clarity',
|
|
898
|
+
severity: expect.any(String),
|
|
899
|
+
recommendation: expect.stringContaining('remove hedging'),
|
|
900
|
+
})
|
|
901
|
+
);
|
|
902
|
+
});
|
|
903
|
+
|
|
904
|
+
it('should provide concise alternatives', () => {
|
|
905
|
+
const resource = {
|
|
906
|
+
path: 'doc.md',
|
|
907
|
+
content: 'The system works by processing the input data through a series of transformations...',
|
|
908
|
+
};
|
|
909
|
+
|
|
910
|
+
const issues = analyzer.analyze([resource]);
|
|
911
|
+
|
|
912
|
+
expect(issues[0]?.examples).toHaveProperty('before');
|
|
913
|
+
expect(issues[0]?.examples).toHaveProperty('after');
|
|
914
|
+
    expect(issues[0]?.examples?.after.length).toBeLessThan(issues[0]?.examples?.before.length ?? 0);
|
|
915
|
+
});
|
|
916
|
+
});
|
|
917
|
+
```
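
As a rough illustration of what the first Clarity Analyzer test exercises, the sketch below flags filler words with a fixed word list. It is an assumption for illustration only: the list, the name `findFillerWords`, and the word-list approach itself are hypothetical, and the real analyzer may rely on the LLM instead.

```typescript
// Hypothetical filler list and function name; not taken from the package.
const FILLER_WORDS = ['essentially', 'basically', 'actually', 'simply', 'very'];

// Returns the filler words found in the content, in order of appearance.
function findFillerWords(content: string): string[] {
  const words: string[] = content.toLowerCase().match(/[a-z']+/g) ?? [];
  return words.filter((word) => FILLER_WORDS.indexOf(word) !== -1);
}
```

A deterministic pre-pass like this could also help keep results stable between runs, since it does not depend on model output.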
|
|
918
|
+
|
|
919
|
+
**Specificity Analyzer**:
|
|
920
|
+
```typescript
|
|
921
|
+
describe('SpecificityAnalyzer', () => {
|
|
922
|
+
it('should flag overly broad scope in focused documents', () => {
|
|
923
|
+
const resource = {
|
|
924
|
+
path: 'agent-a/SCOPE.md',
|
|
925
|
+
content: `
|
|
926
|
+
# Agent A Scope
|
|
927
|
+
## Purpose
|
|
928
|
+
Specific purpose for Agent A
|
|
929
|
+
## Implementation Details
|
|
930
|
+
[500 tokens of implementation code examples]
|
|
931
|
+
## Debugging Strategies
|
|
932
|
+
[300 tokens of debugging tips]
|
|
933
|
+
`,
|
|
934
|
+
};
|
|
935
|
+
|
|
936
|
+
const issues = analyzer.analyze([resource]);
|
|
937
|
+
|
|
938
|
+
expect(issues).toContainEqual(
|
|
939
|
+
expect.objectContaining({
|
|
940
|
+
type: 'specificity',
|
|
941
|
+
severity: 'major',
|
|
942
|
+
recommendation: expect.stringContaining('move to DESIGN.md'),
|
|
943
|
+
})
|
|
944
|
+
);
|
|
945
|
+
});
|
|
946
|
+
});
|
|
947
|
+
```
|
|
948
|
+
|
|
949
|
+
**Structure Analyzer**:
|
|
950
|
+
```typescript
|
|
951
|
+
describe('StructureAnalyzer', () => {
|
|
952
|
+
it('should detect walls of text', () => {
|
|
953
|
+
const resource = {
|
|
954
|
+
path: 'doc.md',
|
|
955
|
+
content: 'A'.repeat(2000), // Long paragraph
|
|
956
|
+
};
|
|
957
|
+
|
|
958
|
+
const issues = analyzer.analyze([resource]);
|
|
959
|
+
|
|
960
|
+
expect(issues).toContainEqual(
|
|
961
|
+
expect.objectContaining({
|
|
962
|
+
type: 'structure',
|
|
963
|
+
recommendation: expect.stringContaining('break into sections'),
|
|
964
|
+
})
|
|
965
|
+
);
|
|
966
|
+
});
|
|
967
|
+
|
|
968
|
+
it('should suggest tables for prose-based configuration', () => {
|
|
969
|
+
const resource = {
|
|
970
|
+
path: 'config.md',
|
|
971
|
+
content: `
|
|
972
|
+
The first option is name, which is a string. The second option is age, which is a number.
|
|
973
|
+
`,
|
|
974
|
+
};
|
|
975
|
+
|
|
976
|
+
const issues = analyzer.analyze([resource]);
|
|
977
|
+
|
|
978
|
+
expect(issues).toContainEqual(
|
|
979
|
+
expect.objectContaining({
|
|
980
|
+
type: 'structure',
|
|
981
|
+
recommendation: expect.stringContaining('table'),
|
|
982
|
+
})
|
|
983
|
+
);
|
|
984
|
+
});
|
|
985
|
+
});
|
|
986
|
+
```
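
The "walls of text" case above can be approximated without an LLM. The following sketch splits content on blank lines and flags any paragraph longer than a limit. The `StructureIssue` shape, the name `findWallsOfText`, and the 1,000-character default are assumptions chosen for illustration.

```typescript
// Hypothetical check; the 1,000-character limit is an arbitrary assumption.
interface StructureIssue {
  type: 'structure';
  recommendation: string;
  paragraphIndex: number;
}

function findWallsOfText(content: string, maxParagraphChars = 1000): StructureIssue[] {
  const issues: StructureIssue[] = [];
  // Paragraphs are separated by one or more blank lines.
  content.split(/\n\s*\n/).forEach((paragraph, paragraphIndex) => {
    if (paragraph.trim().length > maxParagraphChars) {
      issues.push({
        type: 'structure',
        recommendation: 'Long paragraph: break into sections or a list',
        paragraphIndex,
      });
    }
  });
  return issues;
}
```

With the 2,000-character input from the test above, this sketch would report one issue whose recommendation contains "break into sections".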
|
|
987
|
+
|
|
988
|
+
### Integration Tests
|
|
989
|
+
|
|
990
|
+
Test end-to-end analysis workflows:
|
|
991
|
+
|
|
992
|
+
```typescript
|
|
993
|
+
describe('ResourceOptimizer Integration', () => {
|
|
994
|
+
it('should analyze multiple resources and generate comprehensive report', async () => {
|
|
995
|
+
const input = {
|
|
996
|
+
analysisScope: {
|
|
997
|
+
resourceDirectory: './test/fixtures/sample-agent',
|
|
998
|
+
},
|
|
999
|
+
analysisOptions: {
|
|
1000
|
+
checkRedundancy: true,
|
|
1001
|
+
checkClarity: true,
|
|
1002
|
+
checkSpecificity: true,
|
|
1003
|
+
checkStructure: true,
|
|
1004
|
+
minTokenSavings: 50,
|
|
1005
|
+
},
|
|
1006
|
+
};
|
|
1007
|
+
|
|
1008
|
+
const output = await resourceOptimizer.analyze(input);
|
|
1009
|
+
|
|
1010
|
+
expect(output.summary).toMatchObject({
|
|
1011
|
+
totalResources: expect.any(Number),
|
|
1012
|
+
totalTokens: expect.any(Number),
|
|
1013
|
+
issuesFound: expect.any(Number),
|
|
1014
|
+
potentialTokenSavings: expect.any(Number),
|
|
1015
|
+
efficiencyScore: expect.any(Number),
|
|
1016
|
+
});
|
|
1017
|
+
|
|
1018
|
+
expect(output.resourceAnalysis).toBeInstanceOf(Array);
|
|
1019
|
+
expect(output.recommendations).toHaveProperty('highPriority');
|
|
1020
|
+
expect(output.recommendations.highPriority).toBeInstanceOf(Array);
|
|
1021
|
+
});
|
|
1022
|
+
|
|
1023
|
+
it('should prioritize issues by estimated impact', async () => {
|
|
1024
|
+
// Setup: Create resources with issues of varying severity/token waste
|
|
1025
|
+
const output = await resourceOptimizer.analyze(testInput);
|
|
1026
|
+
|
|
1027
|
+
const highPriorityImpact = estimateImpact(output.recommendations.highPriority);
|
|
1028
|
+
const mediumPriorityImpact = estimateImpact(output.recommendations.mediumPriority);
|
|
1029
|
+
|
|
1030
|
+
expect(highPriorityImpact).toBeGreaterThan(mediumPriorityImpact);
|
|
1031
|
+
});
|
|
1032
|
+
});
|
|
1033
|
+
```
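
The prioritization test above calls `estimateImpact`, which is not defined in this document. One possible definition, following the workflow's "token savings × severity" rule, is sketched below. The `Recommendation` shape, the `estimatedTokenSavings` field, the severity labels other than `major`, and the numeric weights are all assumptions for illustration.

```typescript
// Hypothetical shapes: only 'major' appears in this document; other labels and all weights are assumptions.
type Severity = 'critical' | 'major' | 'minor';

interface Recommendation {
  severity: Severity;
  estimatedTokenSavings: number;
}

const SEVERITY_WEIGHT: Record<Severity, number> = {
  critical: 3,
  major: 2,
  minor: 1,
};

// Mean impact per recommendation, so a long list of small issues
// does not outrank a short list of large ones.
function estimateImpact(recommendations: Recommendation[]): number {
  if (recommendations.length === 0) return 0;
  const total = recommendations.reduce(
    (sum, r) => sum + r.estimatedTokenSavings * SEVERITY_WEIGHT[r.severity],
    0,
  );
  return total / recommendations.length;
}
```

Using the mean rather than the sum is a design choice: it keeps the comparison between priority buckets independent of how many recommendations each bucket holds.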
|
|
1034
|
+
|
|
1035
|
+
### Validation Tests
|
|
1036
|
+
|
|
1037
|
+
Validate against real-world scenarios:
|
|
1038
|
+
|
|
1039
|
+
```typescript
|
|
1040
|
+
describe('ResourceOptimizer Validation', () => {
|
|
1041
|
+
it('should analyze agent-generator resources', async () => {
|
|
1042
|
+
const input = {
|
|
1043
|
+
analysisScope: {
|
|
1044
|
+
resourceDirectory: 'docs/agents/vat-development-agents/agent-generator',
|
|
1045
|
+
},
|
|
1046
|
+
};
|
|
1047
|
+
|
|
1048
|
+
const output = await resourceOptimizer.analyze(input);
|
|
1049
|
+
|
|
1050
|
+
// Validate that analysis completes successfully
|
|
1051
|
+
expect(output.summary.totalResources).toBeGreaterThan(0);
|
|
1052
|
+
expect(output.resourceAnalysis.length).toBe(output.summary.totalResources);
|
|
1053
|
+
});
|
|
1054
|
+
|
|
1055
|
+
it('should detect known efficiency issues in test corpus', async () => {
|
|
1056
|
+
// Test against a corpus with intentionally inefficient resources
|
|
1057
|
+
const output = await resourceOptimizer.analyze(knownIssuesCorpus);
|
|
1058
|
+
|
|
1059
|
+
// Verify that known issues are detected
|
|
1060
|
+
expect(output.summary.issuesFound).toBeGreaterThanOrEqual(expectedIssueCount);
|
|
1061
|
+
});
|
|
1062
|
+
});
|
|
1063
|
+
```
|
|
1064
|
+
|
|
1065
|
+
## Phase 2 Considerations
|
|
1066
|
+
|
|
1067
|
+
### Enhancements for Future Phases
|
|
1068
|
+
|
|
1069
|
+
1. **Auto-Fix Capabilities**
|
|
1070
|
+
- Implement automatic refactoring for simple issues (clarity, structure)
|
|
1071
|
+
- Generate git diffs for proposed changes
|
|
1072
|
+
- Interactive mode for reviewing and applying fixes
|
|
1073
|
+
|
|
1074
|
+
2. **Baseline Tracking**
|
|
1075
|
+
- Store efficiency scores over time
|
|
1076
|
+
- Detect regressions in context efficiency
|
|
1077
|
+
- Integrate with CI/CD to block PRs that worsen efficiency
|
|
1078
|
+
|
|
1079
|
+
3. **Custom Rules**
|
|
1080
|
+
- Allow users to define project-specific efficiency rules
|
|
1081
|
+
- Support custom analyzers via plugin system
|
|
1082
|
+
- Configuration for severity thresholds and token budgets
|
|
1083
|
+
|
|
1084
|
+
4. **Visual Reports**
|
|
1085
|
+
- Generate HTML/markdown reports with charts
|
|
1086
|
+
- Heatmaps showing efficiency across resources
|
|
1087
|
+
- Before/after comparisons with syntax highlighting
|
|
1088
|
+
|
|
1089
|
+
5. **Integration with Agent Generator**
|
|
1090
|
+
- Automatically analyze generated resources
|
|
1091
|
+
- Provide efficiency feedback during agent creation
|
|
1092
|
+
- Enforce efficiency thresholds for new agents
|
|
1093
|
+
|
|
1094
|
+
6. **Advanced Analysis**
|
|
1095
|
+
- Detect outdated examples or references
|
|
1096
|
+
- Identify opportunities for consolidation across agent sets
|
|
1097
|
+
- Analyze query patterns to optimize for common retrieval scenarios
|
|
1098
|
+
|
|
1099
|
+
### Integration Points
|
|
1100
|
+
|
|
1101
|
+
**With vat-resources**:
|
|
1102
|
+
- Use existing resource parsing and validation
|
|
1103
|
+
- Share token estimation utilities
|
|
1104
|
+
- Leverage resource metadata for context
|
|
1105
|
+
|
|
1106
|
+
**With agent-generator**:
|
|
1107
|
+
- Analyze generated resources for efficiency
|
|
1108
|
+
- Provide feedback during agent creation
|
|
1109
|
+
- Template resources with efficiency best practices baked in
|
|
1110
|
+
|
|
1111
|
+
**With CLI**:
|
|
1112
|
+
- `vibe-agent optimize <path>` command
|
|
1113
|
+
- `--fix` flag for automatic corrections
|
|
1114
|
+
- `--watch` mode for continuous monitoring
|
|
1115
|
+
|
|
1116
|
+
## Open Questions
|
|
1117
|
+
|
|
1118
|
+
1. **Token Estimation Accuracy**: How closely should token estimates match actual LLM tokenization? (Target: ±5%)
|
|
1119
|
+
|
|
1120
|
+
2. **Semantic Similarity Threshold**: What similarity score indicates redundancy vs. legitimate overlap? (Proposed: 0.85+ for near-exact duplication, 0.70-0.84 for semantic overlap)
|
|
1121
|
+
|
|
1122
|
+
3. **Efficiency Score Formula**: How should we weight different issue types in the overall score?
|
|
1123
|
+
```
|
|
1124
|
+
Proposed:
|
|
1125
|
+
score = 100 - (
|
|
1126
|
+
(redundancy_waste * 1.5) + // Redundancy most costly
|
|
1127
|
+
(clarity_waste * 1.0) +
|
|
1128
|
+
(specificity_waste * 1.2) +
|
|
1129
|
+
(structure_waste * 0.8) // Structure less critical than content
|
|
1130
|
+
) / total_tokens * 100
|
|
1131
|
+
```
|
|
1132
|
+
|
|
1133
|
+
4. **Human-in-the-Loop**: Should the agent require human confirmation before flagging issues, or should it trust the analysis and let humans dismiss false positives?
|
|
1134
|
+
|
|
1135
|
+
5. **Minimum Resource Size**: Should very small resources (< 100 tokens) be skipped? They may not be worth analyzing.
|
|
1136
|
+
|
|
1137
|
+
6. **Cross-Agent Analysis**: Should the tool analyze resources across multiple agents to detect global redundancy patterns?
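
The formula proposed in question 3 can be written as a small function, which makes its edge cases easier to see. This is a sketch only: the `WasteByType` shape and the name `efficiencyScore` are hypothetical, and the clamping to the range 0 to 100 is an assumption added here, because the weighted waste can exceed `total_tokens` and push the raw score below zero.

```typescript
// Weights copied from the proposed formula; the interface and clamping are assumptions.
interface WasteByType {
  redundancy: number;
  clarity: number;
  specificity: number;
  structure: number;
}

const WEIGHTS: WasteByType = {
  redundancy: 1.5, // redundancy most costly
  clarity: 1.0,
  specificity: 1.2,
  structure: 0.8, // structure less critical than content
};

function efficiencyScore(waste: WasteByType, totalTokens: number): number {
  // Nothing to analyze: treated as fully efficient (an assumption).
  if (totalTokens <= 0) return 100;
  const weighted =
    waste.redundancy * WEIGHTS.redundancy +
    waste.clarity * WEIGHTS.clarity +
    waste.specificity * WEIGHTS.specificity +
    waste.structure * WEIGHTS.structure;
  const raw = 100 - (weighted / totalTokens) * 100;
  // Clamp so heavily weighted waste cannot produce a negative score.
  return Math.min(100, Math.max(0, raw));
}
```

For example, 100 tokens of redundancy waste in a 1,000-token corpus costs about 15 points, while 1,000 tokens of redundancy waste in the same corpus would clamp to a score of 0.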
|
|
1138
|
+
|
|
1139
|
+
## Decision: Defer to Phase 2+
|
|
1140
|
+
|
|
1141
|
+
**Rationale**:
|
|
1142
|
+
|
|
1143
|
+
While resource-optimizer would be valuable, it's not critical for the initial agent-generator workflow. The agent-generator can function effectively without automatic efficiency analysis, and manual review by developers is sufficient for Phase 1.
|
|
1144
|
+
|
|
1145
|
+
**Phase 2 Priorities**:
|
|
1146
|
+
1. Get agent-generator working end-to-end
|
|
1147
|
+
2. Validate with real agent creation scenarios
|
|
1148
|
+
3. Gather feedback on resource efficiency pain points
|
|
1149
|
+
4. Use findings to refine resource-optimizer scope
|
|
1150
|
+
|
|
1151
|
+
**Revisit Trigger**:
|
|
1152
|
+
- After creating 5+ agents with agent-generator
|
|
1153
|
+
- If manual efficiency reviews become a bottleneck
|
|
1154
|
+
- When users request automated efficiency analysis
|
|
1155
|
+
|
|
1156
|
+
Resource-optimizer remains a valuable future enhancement with a well-defined scope ready for implementation.
|