@vibe-agent-toolkit/vat-development-agents 0.1.31-rc.1 → 0.1.32-rc.1
- package/dist/.claude/plugins/marketplaces/vat-skills/CHANGELOG.md +38 -0
- package/dist/.claude/plugins/marketplaces/vat-skills/plugins/vibe-agent-toolkit/.claude-plugin/plugin.json +1 -1
- package/dist/.claude/plugins/marketplaces/vat-skills/plugins/vibe-agent-toolkit/skills/authoring/SKILL.md +2 -0
- package/dist/generated/resources/skills/vat-agent-authoring.js +3 -3
- package/dist/skills/authoring/SKILL.md +2 -0
- package/package.json +4 -4
@@ -7,11 +7,49 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+### Added
+- **Evidence substrate** (`@vibe-agent-toolkit/agent-skills/evidence`). Parsers produce neutral `EvidenceRecord`s with stable pattern IDs from `PATTERN_REGISTRY`; a derivation step rolls evidence into capability `Observation`s; a verdict engine compares observations against declared targets. Designed so pattern refinement never changes the observation contract.
+- **`vat audit --verbose`** renders the evidence chain beneath each `CAPABILITY_*` observation — pattern ID, file, line, match text — and includes an `evidence[]` array in YAML output. Use it to debug false positives or confirm what a detector actually saw.
+- **Runtime profile table** (`RUNTIME_PROFILES` in `@vibe-agent-toolkit/claude-marketplace`) is the single source of truth for what each Claude runtime provides and lacks (local shell, browser, network level, preinstalled binaries).
+- **Verdict engine** (`computeVerdicts`) combines capability observations with declared targets to produce `COMPAT_TARGET_*` issues. Four states: expected (silent), `COMPAT_TARGET_INCOMPATIBLE` (warning), `COMPAT_TARGET_NEEDS_REVIEW` (warning), `COMPAT_TARGET_UNDECLARED` (info).
+- **Config-level `targets` declaration** in `vibe-agent-toolkit.config.yaml` under `skills.defaults.targets` and `skills.config.<name>.targets`. Declaring targets suppresses non-applicable compat verdicts.
+- **Marketplace-level `defaults.targets`** in `.claude-plugin/marketplace.json`. Layer priority (highest to lowest): `plugin.json` → `marketplace.json` → `vibe-agent-toolkit.config.yaml`.
+- **Post-build validation**: `vat skills build` runs the full validation suite against built `dist/skills/*/SKILL.md` (skipping source-only codes like `LINK_OUTSIDE_PROJECT`). Build failures surface identically to source failures.
+- **`info` severity** in the validation framework. `CAPABILITY_*` and `COMPAT_TARGET_UNDECLARED` emit as info; they appear in output and respect `validation.severity` overrides but do not contribute to build failure status.
+- New validation codes: `CAPABILITY_LOCAL_SHELL`, `CAPABILITY_EXTERNAL_CLI`, `CAPABILITY_BROWSER_AUTH` (info); `COMPAT_TARGET_INCOMPATIBLE`, `COMPAT_TARGET_NEEDS_REVIEW` (warning); `COMPAT_TARGET_UNDECLARED` (info).
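The four verdict states listed above can be illustrated with a rough TypeScript sketch. Everything in it is an assumption made for illustration: the `Capability`, `Target`, `Support`, and `SketchVerdict` types, the placeholder `profiles` table, and the `sketchVerdicts` function are hypothetical and are not the toolkit's `computeVerdicts` or `RUNTIME_PROFILES`. In particular, the mapping of "runtime lacks the capability" to incompatible and "runtime support is unknown" to needs-review is a guess at the intent, not something the changelog states.

```typescript
// Hypothetical sketch only: none of these names come from the toolkit's real API.
type Capability = 'local-shell' | 'external-cli' | 'browser-auth';
type Target = 'claude-code' | 'claude-chat' | 'claude-cowork';
type Support = 'provided' | 'missing' | 'unknown';

interface SketchVerdict {
  code:
    | 'COMPAT_TARGET_INCOMPATIBLE'
    | 'COMPAT_TARGET_NEEDS_REVIEW'
    | 'COMPAT_TARGET_UNDECLARED';
  severity: 'warning' | 'info';
  capability: Capability;
  target: Target | undefined;
}

// Placeholder values, invented for the example; the real data lives in RUNTIME_PROFILES.
const profiles: Record<Target, Record<Capability, Support>> = {
  'claude-code': { 'local-shell': 'provided', 'external-cli': 'provided', 'browser-auth': 'provided' },
  'claude-chat': { 'local-shell': 'missing', 'external-cli': 'missing', 'browser-auth': 'unknown' },
  'claude-cowork': { 'local-shell': 'unknown', 'external-cli': 'unknown', 'browser-auth': 'unknown' },
};

function sketchVerdicts(observed: Capability[], declared: Target[]): SketchVerdict[] {
  const verdicts: SketchVerdict[] = [];
  for (const capability of observed) {
    if (declared.length === 0) {
      // No targets declared in any layer: informational only.
      verdicts.push({ code: 'COMPAT_TARGET_UNDECLARED', severity: 'info', capability, target: undefined });
      continue;
    }
    for (const target of declared) {
      const support = profiles[target][capability];
      if (support === 'missing') {
        verdicts.push({ code: 'COMPAT_TARGET_INCOMPATIBLE', severity: 'warning', capability, target });
      } else if (support === 'unknown') {
        verdicts.push({ code: 'COMPAT_TARGET_NEEDS_REVIEW', severity: 'warning', capability, target });
      }
      // 'provided' is the expected state and stays silent.
    }
  }
  return verdicts;
}
```

Under these placeholder profiles, a skill that needs a local shell and declares only `claude-code` yields no verdicts, while the same skill with no declared targets yields a single info-level `COMPAT_TARGET_UNDECLARED`.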
+
+### Changed
+- **BREAKING: Runtime target rename.** `claude-desktop` → `claude-chat`, `cowork` → `claude-cowork`. Update `plugin.json`, `marketplace.json`, and any config references. The `claude-desktop` name was architecturally wrong — Claude Desktop is a host application, not a runtime.
+- **BREAKING: `runCompatDetectors` returns `DetectorOutput { evidence, observations }`** instead of `ValidationIssue[]`. The skill-validator converts observations to issues via `observationToIssue`; external callers must do the same or consume observations directly.
+- **BREAKING: `CompatibilityResult` restructured.** Old shape: `{ declared, analyzed: Record<Target, Verdict>, evidence: CompatibilityEvidence[] }`. New: `{ declaredTargets, evidence: EvidenceRecord[], observations: Observation[], verdicts: Verdict[] }`.
+- **BREAKING: Scanner output shape.** Scanners in `@vibe-agent-toolkit/claude-marketplace` now return `EvidenceRecord[]` with registered pattern IDs; `ScannerOutput { evidence, observations }` replaces `CompatibilityEvidence`.
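For external callers that previously consumed `ValidationIssue[]`, the conversion step might look roughly like the sketch below. The field names on `SketchEvidence` mirror what `vat audit --verbose` is described as printing (pattern ID, file, line, match text), but the exact shapes of `EvidenceRecord`, `Observation`, and `DetectorOutput` are not shown in this diff, so every interface here is an assumed stand-in, and `toIssue` is a hypothetical substitute for the real `observationToIssue`.

```typescript
// Assumed shapes for illustration; the real types may differ.
interface SketchEvidence {
  patternId: string;
  file: string;
  line: number;
  match: string;
}

interface SketchObservation {
  code: string; // e.g. 'CAPABILITY_LOCAL_SHELL'
  evidence: SketchEvidence[];
}

interface SketchDetectorOutput {
  evidence: SketchEvidence[];
  observations: SketchObservation[];
}

interface SketchIssue {
  code: string;
  severity: 'info';
  message: string;
  location: string | undefined;
}

// Hypothetical stand-in for observationToIssue: one info-level issue per observation.
function toIssue(observation: SketchObservation): SketchIssue {
  const first = observation.evidence[0];
  return {
    code: observation.code,
    severity: 'info', // the changelog says CAPABILITY_* codes emit as info
    message: `${observation.code}: ${observation.evidence.length} evidence record(s)`,
    location: first ? `${first.file}:${first.line}` : undefined,
  };
}

function issuesFrom(output: SketchDetectorOutput): SketchIssue[] {
  return output.observations.map(toIssue);
}
```

Callers that only need the raw signal can skip this step and read `observations` directly, as the entry above allows.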
+
+### Removed
+- **BREAKING:** `COMPAT_REQUIRES_BROWSER_AUTH`, `COMPAT_REQUIRES_LOCAL_SHELL`, `COMPAT_REQUIRES_EXTERNAL_CLI` codes (replaced by `CAPABILITY_*` + `COMPAT_TARGET_*`).
+- **BREAKING:** `CompatibilityEvidence` type, legacy `Verdict` string union (`'compatible' | 'needs-review' | 'incompatible'`), `ImpactLevel` type, `ALL_TARGETS` export, `aggregateVerdicts`, `hasNonOkImpact` helpers.
+- **BREAKING:** Hardcoded `IMPACT_*` constants and `packages/claude-marketplace/src/scanners/impact-constants.ts` module. Impact logic now lives in the runtime profile table and verdict engine.
+- `yaml` runtime dependency from `@vibe-agent-toolkit/claude-marketplace` (YAML parsing now lives in agent-skills via frontmatter delegation).
+
+### Migration Notes
+Pre-1.0 breaking. Callers must:
+1. Update `plugin.json` `targets` arrays to use `claude-chat` / `claude-cowork` / `claude-code`.
+2. Replace `COMPAT_REQUIRES_*` entries in `validation.severity` / `validation.allow` with the matching `CAPABILITY_*` or `COMPAT_TARGET_*` code.
+3. If consuming `CompatibilityResult` programmatically, migrate from `analyzed`/`declared` fields to `verdicts`/`declaredTargets`.
+4. Declare runtime targets in at least one layer (plugin, marketplace defaults, or config) or accept `COMPAT_TARGET_UNDECLARED` info emissions.
+5. Run `vat audit --verbose` to inspect evidence and confirm the refactor's output matches intent.
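Steps 1 and 2 of the migration notes are mechanical renames, so they could be scripted along the lines of this sketch. The target renames come straight from the changelog; the one-to-one mapping from each `COMPAT_REQUIRES_*` code to the `CAPABILITY_*` code with the same suffix is an assumption (some projects may want a `COMPAT_TARGET_*` code instead), and the helper names `migrateTargets` and `migrateSeverity` are hypothetical.

```typescript
// Target renames as stated in the changelog.
const TARGET_RENAMES: Record<string, string> = {
  'claude-desktop': 'claude-chat',
  cowork: 'claude-cowork',
};

// Assumed suffix-for-suffix mapping; review each entry before relying on it.
const CODE_RENAMES: Record<string, string> = {
  COMPAT_REQUIRES_BROWSER_AUTH: 'CAPABILITY_BROWSER_AUTH',
  COMPAT_REQUIRES_LOCAL_SHELL: 'CAPABILITY_LOCAL_SHELL',
  COMPAT_REQUIRES_EXTERNAL_CLI: 'CAPABILITY_EXTERNAL_CLI',
};

// Rename legacy targets, keep unknown names untouched, and drop duplicates.
function migrateTargets(targets: string[]): string[] {
  const renamed = targets.map((target) => TARGET_RENAMES[target] ?? target);
  return Array.from(new Set(renamed));
}

// Rewrite the keys of a validation.severity map, leaving the levels as they were.
function migrateSeverity(severity: Record<string, string>): Record<string, string> {
  const migrated: Record<string, string> = {};
  for (const [code, level] of Object.entries(severity)) {
    migrated[CODE_RENAMES[code] ?? code] = level;
  }
  return migrated;
}
```

For example, `migrateTargets(['claude-desktop', 'cowork', 'claude-code'])` returns `['claude-chat', 'claude-cowork', 'claude-code']`. The remaining steps (moving to `verdicts`/`declaredTargets` and choosing where to declare targets) depend on each caller's code and are not covered here.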
+
+## [0.1.31] - 2026-04-17
+
 ### Added
 - **v1 compat smells.** Three new `COMPAT_*` codes — `COMPAT_REQUIRES_BROWSER_AUTH`, `COMPAT_REQUIRES_LOCAL_SHELL`, `COMPAT_REQUIRES_EXTERNAL_CLI` — detect per-skill runtime capabilities (browser auth, local shell, external CLI) via static analysis of SKILL.md and its transitively linked markdown. Default severity `warning`; configure per-skill via `validation.severity` / `validation.allow` like any other framework code. Full rationale and when-to-allow guidance in `docs/validation-codes.md`.
 - **`vat audit --user` now documents `CLAUDE_CONFIG_DIR`.** Help text and `packages/cli/docs/audit.md` name the env var, mark `~/.claude` as the default rather than unconditional, and document a shell-loop pattern for multi-directory workflows. No code change — `CLAUDE_CONFIG_DIR` has always been honored in `packages/claude-marketplace/src/paths/claude-paths.ts` — but the UX gap closes.
+- `vat audit`: gitignore-aware scanning. When scanning inside a git repository, paths matched by `.gitignore` are skipped by default — no hardcoded directory list needed. `--include-artifacts` opts back in. When the user explicitly targets a gitignored path (e.g., `vat audit dist/skills/`), filtering is disabled for that subtree.
+- `vat audit`: config-aware validation in VAT projects. When `vibe-agent-toolkit.config.yaml` is found at the scan root, audit uses the project's build settings (`linkFollowDepth`, `files`, `excludeReferencesFromBundle`) to validate skills — eliminating false `LINK_OUTSIDE_PROJECT` warnings for links the build pipeline resolves. Audit never applies `validation.allow` (always shows all issues).
+- `docs/skill-quality-and-compatibility.md`: new project stance doc articulating what VAT believes makes a skill good and compatible. Linked from the `authoring` skill and cross-referenced from `docs/validation-codes.md`.

 ### Changed
+- `vat audit` now skips gitignored paths by default. Before this change, running `vat audit` in a TypeScript project scanned every SKILL.md in `node_modules/`, `dist/`, and other artifact directories (often hundreds of duplicate files). The new behavior uses the project's `.gitignore` rules, which adapts to each project's layout automatically. Use `--include-artifacts` to opt back in for deliberate artifact audits.
+
 - **`SKILL_CONSOLE_INCOMPATIBLE` retired.** The Bash/Edit/Write/NotebookEdit tool-mention warning is replaced by the new `COMPAT_REQUIRES_LOCAL_SHELL`, giving adopters a single canonical detector with configurable severity and per-path allow entries.

 ### Removed
@@ -386,6 +386,8 @@ const output2 = await agent.execute({

 ## References

+- Skill Quality and Compatibility — VAT's Stance — what VAT believes makes a skill good and compatible, and how those beliefs turn into validation codes. Read this before overriding severity defaults or adding allow entries.
+- Validation Codes Reference — full list of codes VAT emits, their default severity, and override recipes.
 - [Skill Quality Checklist](resources/skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)
 - agent-authoring.md — Complete patterns guide
 - orchestration.md — Multi-agent workflows
@@ -7,7 +7,7 @@ export const meta = {
 description: "Use when authoring SKILL.md files, designing agent architectures, or configuring packaging options. Covers SKILL.md structure, agent archetypes, orchestration patterns, and validation override patterns."
 };

-
export const text = "\n# VAT Agent Authoring: SKILL.md, Archetypes & Patterns\n\n## SKILL.md Structure\n\nA SKILL.md file is the definition file for a VAT agent skill. It tells Claude what the skill\ndoes and how to use it. All SKILL.md files must have YAML frontmatter:\n\n\`\`\`markdown\n---\nname: my-skill\ndescription: One sentence: what this skill does and when to use it (max 200 chars)\n---\n\n# My Skill\n\nRest of the skill documentation...\n\`\`\`\n\nRequired frontmatter fields:\n- \`name\` — unique identifier, kebab-case, matches the skill\'s directory name\n- \`description\` — trigger description used for skill routing; be specific about activation conditions\n\nBest practices for \`description\`:\n- Start with \"Use when...\" to make activation conditions explicit\n- Include the key commands or concepts the skill covers\n- Keep under 200 characters\n\n## Agent Archetypes\n\nVAT supports four agent archetypes for different use cases.\n\n### Archetype 1: Pure Function Tool\n\n**When to use:** Stateless validation, transformation, computation — no LLM needed.\n\n**Characteristics:** Deterministic output, fast execution, easy to test.\n\n**Example use cases:** Input validation, data transformation, format conversion, rules-based logic.\n\n\`\`\`typescript\nexport async function validateInput(input: MyInput): Promise<ValidationResult> {\n if (input.text.length < 5) {\n return { status: \'error\', error: \'too-short\' };\n }\n return { status: \'success\', data: { valid: true } };\n}\n\`\`\`\n\n### Archetype 2: One-Shot LLM Analyzer\n\n**When to use:** Single LLM call for analysis, classification, or generation.\n\n**Characteristics:** One LLM call per execution, stateless, handles LLM errors.\n\n**Example use cases:** Sentiment analysis, text classification, entity extraction, creative generation.\n\n\`\`\`typescript\nexport async function analyzeSentiment(text: string, context: AgentContext) {\n const response = await context.callLLM([\n { role: \'user\', 
content: \`Analyze sentiment: \"${text}\"\` }\n ]);\n\n const parsed = JSON.parse(response);\n return { status: \'success\', data: parsed };\n}\n\`\`\`\n\n### Archetype 3: Conversational Assistant\n\n**When to use:** Multi-turn dialogue, progressive data collection across sessions.\n\n**Characteristics:** Multiple LLM calls, maintains session state, phases (gathering → ready → complete).\n\n**Example use cases:** Customer support chatbots, product advisors, interview agents, multi-step forms.\n\n\`\`\`typescript\nexport async function conversationalAgent(\n message: string,\n sessionState: SessionState\n) {\n if (sessionState.phase === \'gathering\') {\n return {\n reply: \"Can you tell me more about X?\",\n sessionState: { ...sessionState },\n result: { status: \'in-progress\' }\n };\n }\n\n return {\n reply: \"Here\'s your result!\",\n sessionState: { ...sessionState, phase: \'complete\' },\n result: { status: \'success\', data: finalResult }\n };\n}\n\`\`\`\n\n### Archetype 4: External Event Integrator\n\n**When to use:** Waiting for external events (approvals, webhooks, third-party APIs).\n\n**Characteristics:** Emits event, blocks waiting for response, timeout handling, mockable for testing.\n\n**Example use cases:** Human-in-the-loop approval, webhook integrations, external API polling.\n\n\`\`\`typescript\nexport async function humanApproval(\n request: ApprovalRequest,\n options = { mockable: true, timeout: 30000 }\n) {\n if (options.mockable) {\n return { status: \'success\', data: { approved: true } };\n }\n\n const response = await emitEvent(request, options.timeout);\n return { status: \'success\', data: response };\n}\n\`\`\`\n\n## Result Envelopes\n\nAlways return result envelopes — never throw exceptions for expected errors.\n\n\`\`\`typescript\n// AgentResult<TData, TError> — for single-execution agents\ntype AgentResult<TData, TError> =\n | { status: \'success\'; data: TData }\n | { status: \'error\'; error: TError };\n\n// StatefulAgentResult — 
for conversational agents\ntype StatefulAgentResult<TData, TError, TMetadata> =\n | { status: \'in-progress\'; metadata?: TMetadata }\n | { status: \'success\'; data: TData }\n | { status: \'error\'; error: TError };\n\`\`\`\n\nStandard LLM error literals: \`\'llm-refusal\'\`, \`\'llm-invalid-output\'\`, \`\'llm-timeout\'\`,\n\`\'llm-rate-limit\'\`, \`\'llm-token-limit\'\`, \`\'llm-unavailable\'\`.\n\nAlways check status before accessing data:\n\`\`\`typescript\nconst output = await myAgent.execute(input);\nif (output.result.status === \'success\') {\n console.log(output.result.data);\n} else if (output.result.status === \'error\') {\n console.error(\'Failed:\', output.result.error);\n}\n\`\`\`\n\n## Orchestration Patterns\n\n### Sequential Pipeline\n\n\`\`\`typescript\nconst analysisOutput = await analyzer.execute(input);\nconst processedOutput = await andThen(\n analysisOutput.result,\n async (data) => {\n const out = await processor.execute(data);\n return out.result;\n }\n);\n\`\`\`\n\n### Parallel Execution\n\n\`\`\`typescript\nconst [output1, output2, output3] = await Promise.all([\n agent1.execute(input),\n agent2.execute(input),\n agent3.execute(input),\n]);\n\`\`\`\n\n### Validation Loop (Generate + Validate with Retry)\n\n\`\`\`typescript\nasync function generateValidOutput(input: MyInput, maxAttempts = 5) {\n for (let attempt = 0; attempt < maxAttempts; attempt++) {\n const generatorOutput = await generator.execute(input);\n if (generatorOutput.result.status === \'error\') continue;\n\n const validatorOutput = await validator.execute(generatorOutput.result.data);\n if (validatorOutput.result.status === \'success\' &&\n validatorOutput.result.data.valid) {\n return generatorOutput.result.data;\n }\n }\n throw new Error(\'Max attempts exceeded\');\n}\n\`\`\`\n\n### Human-in-the-Loop\n\n\`\`\`typescript\nconst generatorOutput = await generator.execute(input);\nif (generatorOutput.result.status === \'success\') {\n const approvalOutput = await 
humanApproval.execute({\n content: generatorOutput.result.data,\n context: input,\n });\n if (approvalOutput.result.data.approved) {\n return generatorOutput.result.data;\n }\n}\n\`\`\`\n\n### Conversational Multi-Turn\n\n\`\`\`typescript\nlet session = { state: { phase: \'gathering\' }, history: [] };\n\nwhile (true) {\n const userMessage = await getUserInput();\n const output = await conversationalAgent.execute({\n message: userMessage,\n sessionState: session.state,\n });\n\n console.log(\'Agent:\', output.reply);\n session = {\n state: output.sessionState,\n history: [...session.history,\n { role: \'user\', content: userMessage },\n { role: \'assistant\', content: output.reply }\n ],\n };\n\n if (output.result.status === \'success\') break;\n if (output.result.status === \'error\') break;\n // status === \'in-progress\': continue\n}\n\`\`\`\n\n## packagingOptions Reference\n\nConfigure in your skill\'s \`vat.skills[]\` entry in \`package.json\`:\n\n\`\`\`json\n{\n \"vat\": {\n \"skills\": [{\n \"name\": \"my-skill\",\n \"source\": \"./SKILL.md\",\n \"path\": \"./dist/skills/my-skill\",\n \"packagingOptions\": {\n \"linkFollowDepth\": 1,\n \"resourceNaming\": \"resource-id\",\n \"stripPrefix\": \"knowledge-base\",\n \"excludeReferencesFromBundle\": {\n \"rules\": [\n { \"patterns\": [\"**/concepts/**\"], \"template\": \"Use search to find: {{link.text}}\" }\n ],\n \"defaultTemplate\": \"{{link.text}} (search knowledge base)\"\n }\n }\n }]\n }\n}\n\`\`\`\n\n**\`linkFollowDepth\`** — How deep to follow links from SKILL.md:\n\n| Value | Behavior |\n|-------|----------|\n| \`0\` | Skill file only (no links followed) |\n| \`1\` | Direct links only |\n| \`2\` | Direct + one transitive level **(default)** |\n| \`\"full\"\` | Complete transitive closure |\n\n**\`resourceNaming\`** — How bundled files are named:\n\n| Strategy | Example | Use When |\n|----------|---------|----------|\n| \`basename\` | \`overview.md\` | Few files, unique names **(default)** |\n| 
\`resource-id\` | \`topics-quickstart-overview.md\` | Many files, flat output |\n| \`preserve-path\` | \`topics/quickstart/overview.md\` | Preserve structure |\n\nUse \`stripPrefix\` to remove a common directory prefix (e.g., \`\"knowledge-base\"\`).\n\n**\`excludeReferencesFromBundle\`** — Rules for excluding files and rewriting their links:\n- \`rules[]\` — Ordered glob patterns (first match wins), each with optional Handlebars template\n- \`defaultTemplate\` — Applied to depth-exceeded links not matching any rule\n\n**Template variables:**\n\n| Variable | Description |\n|----------|-------------|\n| \`{{link.text}}\` | Link display text |\n| \`{{link.href}}\` | Original href (without fragment) |\n| \`{{link.fragment}}\` | Fragment including \`#\` prefix, or empty |\n| \`{{link.type}}\` | Link type (\`\"local_file\"\`, etc.) |\n| \`{{link.resource.id}}\` | Target resource ID (if resolved) |\n| \`{{link.resource.fileName}}\` | Target filename (if resolved) |\n| \`{{skill.name}}\` | Skill name from frontmatter |\n\n**\`validation\`** — Unified framework for overriding default severity and allowing specific issue instances:\n\n\`\`\`yaml\n# In vibe-agent-toolkit.config.yaml under skills.defaults or skills.config.<name>\nvalidation:\n severity:\n LINK_DROPPED_BY_DEPTH: error # upgrade: block on depth-dropped links\n LINK_TO_NAVIGATION_FILE: ignore # silence: this skill intentionally links to READMEs\n allow:\n PACKAGED_UNREFERENCED_FILE:\n - paths: [\"templates/runtime.json\"]\n reason: \"consumed programmatically at runtime\"\n expires: \"2026-09-30\"\n SKILL_LENGTH_EXCEEDS_RECOMMENDED:\n - reason: \"whole-skill concern; paths defaults to [\'**/*\']\"\n\`\`\`\n\nTwo sub-keys, each covering a different override granularity:\n\n- **\`severity\`** — Class-level. Raise any code to \`error\` (blocks build), lower to \`warning\` (emits, non-blocking), or \`ignore\` (fully suppressed). Applies to every instance of that code.\n- **\`allow\`** — Per-instance. 
Suppress specific \`(code, path)\` matches with a required \`reason\` and optional \`expires\` date. \`paths\` is optional (defaults to \`[\"**/*\"]\` — the whole skill). Use for legitimate exceptions that don\'t warrant code-wide silencing.\n\nThings adopters typically adjust:\n\n- Downgrade \`LINK_DROPPED_BY_DEPTH\` to \`ignore\` when intentionally linking out to external docs.\n- Allow specific files under \`PACKAGED_UNREFERENCED_FILE\` when they\'re consumed programmatically by CLI scripts at runtime.\n- Raise \`ALLOW_EXPIRED\` to \`error\` for zero-tolerance expiry policies.\n\nExpired \`allow\` entries still apply — VAT emits \`ALLOW_EXPIRED\` as a reminder rather than silently re-surfacing the underlying issue (no surprise build breaks when a date passes). Unused \`allow\` entries surface as \`ALLOW_UNUSED\` (analogous to ESLint\'s unused-disable).\n\nFull code reference at \`docs/validation-codes.md\`. \`vat audit\` is advisory: it applies \`severity\` for display grouping only, ignores \`allow\`, and always exits 0. 
Use \`vat skills validate\` or \`vat skills build\` for gated checks.\n\n## Testing Agents\n\n### Unit Testing Pure Functions\n\n\`\`\`typescript\nimport { describe, expect, it } from \'vitest\';\nimport { resultMatchers } from \'@vibe-agent-toolkit/agent-runtime\';\n\ndescribe(\'myValidator\', () => {\n it(\'should validate correct input\', async () => {\n const output = await myValidator.execute({ text: \'valid\' });\n resultMatchers.expectSuccess(output.result);\n expect(output.result.data.valid).toBe(true);\n });\n});\n\`\`\`\n\n### Integration Testing with Mock LLM\n\n\`\`\`typescript\nimport { createMockContext } from \'@vibe-agent-toolkit/agent-runtime\';\n\nconst mockContext = createMockContext(\n JSON.stringify({ sentiment: \'positive\', confidence: 0.9 })\n);\nconst output = await myAnalyzer.execute({ text: \'Great!\' }, mockContext);\nresultMatchers.expectSuccess(output.result);\n\`\`\`\n\n### Testing Conversational Flows\n\n\`\`\`typescript\n// Turn 1\nconst output1 = await agent.execute({ message: \'Hello\' });\nexpect(output1.reply).toContain(\'name?\');\nresultMatchers.expectInProgress(output1.result);\n\n// Turn 2 — pass session state forward\nconst output2 = await agent.execute({\n message: \'My name is Alice\',\n sessionState: output1.sessionState,\n});\n\`\`\`\n\n## Best Practices\n\n1. **Return result envelopes, never throw** for expected errors\n2. **Define error types as literal unions** (\`\'invalid-format\' | \'timeout\'\`) not \`string\`\n3. **Use Zod schemas** for all input/output validation\n4. **Test all paths** — success, each error type, edge cases\n5. **Use mock mode** for external dependencies to enable offline testing\n6. **Document with JSDoc** — purpose, params, return type, example, \`@throws Never throws\`\n7. 
**Keep SKILL.md focused** — if it exceeds ~300 lines, split into action skills\n\n## References\n\n- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)\n";
+
export const text = "\n# VAT Agent Authoring: SKILL.md, Archetypes & Patterns\n\n## SKILL.md Structure\n\nA SKILL.md file is the definition file for a VAT agent skill. It tells Claude what the skill\ndoes and how to use it. All SKILL.md files must have YAML frontmatter:\n\n\`\`\`markdown\n---\nname: my-skill\ndescription: One sentence: what this skill does and when to use it (max 200 chars)\n---\n\n# My Skill\n\nRest of the skill documentation...\n\`\`\`\n\nRequired frontmatter fields:\n- \`name\` — unique identifier, kebab-case, matches the skill\'s directory name\n- \`description\` — trigger description used for skill routing; be specific about activation conditions\n\nBest practices for \`description\`:\n- Start with \"Use when...\" to make activation conditions explicit\n- Include the key commands or concepts the skill covers\n- Keep under 200 characters\n\n## Agent Archetypes\n\nVAT supports four agent archetypes for different use cases.\n\n### Archetype 1: Pure Function Tool\n\n**When to use:** Stateless validation, transformation, computation — no LLM needed.\n\n**Characteristics:** Deterministic output, fast execution, easy to test.\n\n**Example use cases:** Input validation, data transformation, format conversion, rules-based logic.\n\n\`\`\`typescript\nexport async function validateInput(input: MyInput): Promise<ValidationResult> {\n if (input.text.length < 5) {\n return { status: \'error\', error: \'too-short\' };\n }\n return { status: \'success\', data: { valid: true } };\n}\n\`\`\`\n\n### Archetype 2: One-Shot LLM Analyzer\n\n**When to use:** Single LLM call for analysis, classification, or generation.\n\n**Characteristics:** One LLM call per execution, stateless, handles LLM errors.\n\n**Example use cases:** Sentiment analysis, text classification, entity extraction, creative generation.\n\n\`\`\`typescript\nexport async function analyzeSentiment(text: string, context: AgentContext) {\n const response = await context.callLLM([\n { role: \'user\', 
content: \`Analyze sentiment: \"${text}\"\` }\n ]);\n\n const parsed = JSON.parse(response);\n return { status: \'success\', data: parsed };\n}\n\`\`\`\n\n### Archetype 3: Conversational Assistant\n\n**When to use:** Multi-turn dialogue, progressive data collection across sessions.\n\n**Characteristics:** Multiple LLM calls, maintains session state, phases (gathering → ready → complete).\n\n**Example use cases:** Customer support chatbots, product advisors, interview agents, multi-step forms.\n\n\`\`\`typescript\nexport async function conversationalAgent(\n message: string,\n sessionState: SessionState\n) {\n if (sessionState.phase === \'gathering\') {\n return {\n reply: \"Can you tell me more about X?\",\n sessionState: { ...sessionState },\n result: { status: \'in-progress\' }\n };\n }\n\n return {\n reply: \"Here\'s your result!\",\n sessionState: { ...sessionState, phase: \'complete\' },\n result: { status: \'success\', data: finalResult }\n };\n}\n\`\`\`\n\n### Archetype 4: External Event Integrator\n\n**When to use:** Waiting for external events (approvals, webhooks, third-party APIs).\n\n**Characteristics:** Emits event, blocks waiting for response, timeout handling, mockable for testing.\n\n**Example use cases:** Human-in-the-loop approval, webhook integrations, external API polling.\n\n\`\`\`typescript\nexport async function humanApproval(\n request: ApprovalRequest,\n options = { mockable: true, timeout: 30000 }\n) {\n if (options.mockable) {\n return { status: \'success\', data: { approved: true } };\n }\n\n const response = await emitEvent(request, options.timeout);\n return { status: \'success\', data: response };\n}\n\`\`\`\n\n## Result Envelopes\n\nAlways return result envelopes — never throw exceptions for expected errors.\n\n\`\`\`typescript\n// AgentResult<TData, TError> — for single-execution agents\ntype AgentResult<TData, TError> =\n | { status: \'success\'; data: TData }\n | { status: \'error\'; error: TError };\n\n// StatefulAgentResult — 
for conversational agents\ntype StatefulAgentResult<TData, TError, TMetadata> =\n | { status: \'in-progress\'; metadata?: TMetadata }\n | { status: \'success\'; data: TData }\n | { status: \'error\'; error: TError };\n\`\`\`\n\nStandard LLM error literals: \`\'llm-refusal\'\`, \`\'llm-invalid-output\'\`, \`\'llm-timeout\'\`,\n\`\'llm-rate-limit\'\`, \`\'llm-token-limit\'\`, \`\'llm-unavailable\'\`.\n\nAlways check status before accessing data:\n\`\`\`typescript\nconst output = await myAgent.execute(input);\nif (output.result.status === \'success\') {\n console.log(output.result.data);\n} else if (output.result.status === \'error\') {\n console.error(\'Failed:\', output.result.error);\n}\n\`\`\`\n\n## Orchestration Patterns\n\n### Sequential Pipeline\n\n\`\`\`typescript\nconst analysisOutput = await analyzer.execute(input);\nconst processedOutput = await andThen(\n analysisOutput.result,\n async (data) => {\n const out = await processor.execute(data);\n return out.result;\n }\n);\n\`\`\`\n\n### Parallel Execution\n\n\`\`\`typescript\nconst [output1, output2, output3] = await Promise.all([\n agent1.execute(input),\n agent2.execute(input),\n agent3.execute(input),\n]);\n\`\`\`\n\n### Validation Loop (Generate + Validate with Retry)\n\n\`\`\`typescript\nasync function generateValidOutput(input: MyInput, maxAttempts = 5) {\n for (let attempt = 0; attempt < maxAttempts; attempt++) {\n const generatorOutput = await generator.execute(input);\n if (generatorOutput.result.status === \'error\') continue;\n\n const validatorOutput = await validator.execute(generatorOutput.result.data);\n if (validatorOutput.result.status === \'success\' &&\n validatorOutput.result.data.valid) {\n return generatorOutput.result.data;\n }\n }\n throw new Error(\'Max attempts exceeded\');\n}\n\`\`\`\n\n### Human-in-the-Loop\n\n\`\`\`typescript\nconst generatorOutput = await generator.execute(input);\nif (generatorOutput.result.status === \'success\') {\n const approvalOutput = await 
humanApproval.execute({\n content: generatorOutput.result.data,\n context: input,\n });\n if (approvalOutput.result.data.approved) {\n return generatorOutput.result.data;\n }\n}\n\`\`\`\n\n### Conversational Multi-Turn\n\n\`\`\`typescript\nlet session = { state: { phase: \'gathering\' }, history: [] };\n\nwhile (true) {\n const userMessage = await getUserInput();\n const output = await conversationalAgent.execute({\n message: userMessage,\n sessionState: session.state,\n });\n\n console.log(\'Agent:\', output.reply);\n session = {\n state: output.sessionState,\n history: [...session.history,\n { role: \'user\', content: userMessage },\n { role: \'assistant\', content: output.reply }\n ],\n };\n\n if (output.result.status === \'success\') break;\n if (output.result.status === \'error\') break;\n // status === \'in-progress\': continue\n}\n\`\`\`\n\n## packagingOptions Reference\n\nConfigure in your skill\'s \`vat.skills[]\` entry in \`package.json\`:\n\n\`\`\`json\n{\n \"vat\": {\n \"skills\": [{\n \"name\": \"my-skill\",\n \"source\": \"./SKILL.md\",\n \"path\": \"./dist/skills/my-skill\",\n \"packagingOptions\": {\n \"linkFollowDepth\": 1,\n \"resourceNaming\": \"resource-id\",\n \"stripPrefix\": \"knowledge-base\",\n \"excludeReferencesFromBundle\": {\n \"rules\": [\n { \"patterns\": [\"**/concepts/**\"], \"template\": \"Use search to find: {{link.text}}\" }\n ],\n \"defaultTemplate\": \"{{link.text}} (search knowledge base)\"\n }\n }\n }]\n }\n}\n\`\`\`\n\n**\`linkFollowDepth\`** — How deep to follow links from SKILL.md:\n\n| Value | Behavior |\n|-------|----------|\n| \`0\` | Skill file only (no links followed) |\n| \`1\` | Direct links only |\n| \`2\` | Direct + one transitive level **(default)** |\n| \`\"full\"\` | Complete transitive closure |\n\n**\`resourceNaming\`** — How bundled files are named:\n\n| Strategy | Example | Use When |\n|----------|---------|----------|\n| \`basename\` | \`overview.md\` | Few files, unique names **(default)** |\n| 
\`resource-id\` | \`topics-quickstart-overview.md\` | Many files, flat output |\n| \`preserve-path\` | \`topics/quickstart/overview.md\` | Preserve structure |\n\nUse \`stripPrefix\` to remove a common directory prefix (e.g., \`\"knowledge-base\"\`).\n\n**\`excludeReferencesFromBundle\`** — Rules for excluding files and rewriting their links:\n- \`rules[]\` — Ordered glob patterns (first match wins), each with optional Handlebars template\n- \`defaultTemplate\` — Applied to depth-exceeded links not matching any rule\n\n**Template variables:**\n\n| Variable | Description |\n|----------|-------------|\n| \`{{link.text}}\` | Link display text |\n| \`{{link.href}}\` | Original href (without fragment) |\n| \`{{link.fragment}}\` | Fragment including \`#\` prefix, or empty |\n| \`{{link.type}}\` | Link type (\`\"local_file\"\`, etc.) |\n| \`{{link.resource.id}}\` | Target resource ID (if resolved) |\n| \`{{link.resource.fileName}}\` | Target filename (if resolved) |\n| \`{{skill.name}}\` | Skill name from frontmatter |\n\n**\`validation\`** — Unified framework for overriding default severity and allowing specific issue instances:\n\n\`\`\`yaml\n# In vibe-agent-toolkit.config.yaml under skills.defaults or skills.config.<name>\nvalidation:\n severity:\n LINK_DROPPED_BY_DEPTH: error # upgrade: block on depth-dropped links\n LINK_TO_NAVIGATION_FILE: ignore # silence: this skill intentionally links to READMEs\n allow:\n PACKAGED_UNREFERENCED_FILE:\n - paths: [\"templates/runtime.json\"]\n reason: \"consumed programmatically at runtime\"\n expires: \"2026-09-30\"\n SKILL_LENGTH_EXCEEDS_RECOMMENDED:\n - reason: \"whole-skill concern; paths defaults to [\'**/*\']\"\n\`\`\`\n\nTwo sub-keys, each covering a different override granularity:\n\n- **\`severity\`** — Class-level. Raise any code to \`error\` (blocks build), lower to \`warning\` (emits, non-blocking), or \`ignore\` (fully suppressed). Applies to every instance of that code.\n- **\`allow\`** — Per-instance. 
Suppress specific \`(code, path)\` matches with a required \`reason\` and optional \`expires\` date. \`paths\` is optional (defaults to \`[\"**/*\"]\` — the whole skill). Use for legitimate exceptions that don\'t warrant code-wide silencing.\n\nThings adopters typically adjust:\n\n- Downgrade \`LINK_DROPPED_BY_DEPTH\` to \`ignore\` when intentionally linking out to external docs.\n- Allow specific files under \`PACKAGED_UNREFERENCED_FILE\` when they\'re consumed programmatically by CLI scripts at runtime.\n- Raise \`ALLOW_EXPIRED\` to \`error\` for zero-tolerance expiry policies.\n\nExpired \`allow\` entries still apply — VAT emits \`ALLOW_EXPIRED\` as a reminder rather than silently re-surfacing the underlying issue (no surprise build breaks when a date passes). Unused \`allow\` entries surface as \`ALLOW_UNUSED\` (analogous to ESLint\'s unused-disable).\n\nFull code reference at \`docs/validation-codes.md\`. \`vat audit\` is advisory: it applies \`severity\` for display grouping only, ignores \`allow\`, and always exits 0. 
Use \`vat skills validate\` or \`vat skills build\` for gated checks.\n\n## Testing Agents\n\n### Unit Testing Pure Functions\n\n\`\`\`typescript\nimport { describe, expect, it } from \'vitest\';\nimport { resultMatchers } from \'@vibe-agent-toolkit/agent-runtime\';\n\ndescribe(\'myValidator\', () => {\n it(\'should validate correct input\', async () => {\n const output = await myValidator.execute({ text: \'valid\' });\n resultMatchers.expectSuccess(output.result);\n expect(output.result.data.valid).toBe(true);\n });\n});\n\`\`\`\n\n### Integration Testing with Mock LLM\n\n\`\`\`typescript\nimport { createMockContext } from \'@vibe-agent-toolkit/agent-runtime\';\n\nconst mockContext = createMockContext(\n JSON.stringify({ sentiment: \'positive\', confidence: 0.9 })\n);\nconst output = await myAnalyzer.execute({ text: \'Great!\' }, mockContext);\nresultMatchers.expectSuccess(output.result);\n\`\`\`\n\n### Testing Conversational Flows\n\n\`\`\`typescript\n// Turn 1\nconst output1 = await agent.execute({ message: \'Hello\' });\nexpect(output1.reply).toContain(\'name?\');\nresultMatchers.expectInProgress(output1.result);\n\n// Turn 2 — pass session state forward\nconst output2 = await agent.execute({\n message: \'My name is Alice\',\n sessionState: output1.sessionState,\n});\n\`\`\`\n\n## Best Practices\n\n1. **Return result envelopes, never throw** for expected errors\n2. **Define error types as literal unions** (\`\'invalid-format\' | \'timeout\'\`) not \`string\`\n3. **Use Zod schemas** for all input/output validation\n4. **Test all paths** — success, each error type, edge cases\n5. **Use mock mode** for external dependencies to enable offline testing\n6. **Document with JSDoc** — purpose, params, return type, example, \`@throws Never throws\`\n7. 
**Keep SKILL.md focused** — if it exceeds ~300 lines, split into action skills\n\n## References\n\n- [Skill Quality and Compatibility — VAT\'s Stance](../../../../docs/skill-quality-and-compatibility.md) — what VAT believes makes a skill good and compatible, and how those beliefs turn into validation codes. Read this before overriding severity defaults or adding allow entries.\n- [Validation Codes Reference](../../../../docs/validation-codes.md) — full list of codes VAT emits, their default severity, and override recipes.\n- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)\n";
  
  export const fragments = {
  skillmdStructure: {
@@ -47,7 +47,7 @@ export const fragments = {
  },
  references: {
  header: "## References",
- body: "- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)",
- text: "## References\n\n- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)"
+ body: "- [Skill Quality and Compatibility — VAT\'s Stance](../../../../docs/skill-quality-and-compatibility.md) — what VAT believes makes a skill good and compatible, and how those beliefs turn into validation codes. Read this before overriding severity defaults or adding allow entries.\n- [Validation Codes Reference](../../../../docs/validation-codes.md) — full list of codes VAT emits, their default severity, and override recipes.\n- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)",
+ text: "## References\n\n- [Skill Quality and Compatibility — VAT\'s Stance](../../../../docs/skill-quality-and-compatibility.md) — what VAT believes makes a skill good and compatible, and how those beliefs turn into validation codes. Read this before overriding severity defaults or adding allow entries.\n- [Validation Codes Reference](../../../../docs/validation-codes.md) — full list of codes VAT emits, their default severity, and override recipes.\n- [Skill Quality Checklist](skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)\n- [agent-authoring.md](../../../../docs/agent-authoring.md) — Complete patterns guide\n- [orchestration.md](../../../../docs/orchestration.md) — Multi-agent workflows\n- [Building Effective Agents - Anthropic](https://www.anthropic.com/research/building-effective-agents)"
  }
  };
@@ -386,6 +386,8 @@ const output2 = await agent.execute({
  
  ## References
  
+ - Skill Quality and Compatibility — VAT's Stance — what VAT believes makes a skill good and compatible, and how those beliefs turn into validation codes. Read this before overriding severity defaults or adding allow entries.
+ - Validation Codes Reference — full list of codes VAT emits, their default severity, and override recipes.
  - [Skill Quality Checklist](resources/skill-quality-checklist.md) — Pre-publication checklist for all skills (general + CLI-backed)
  - agent-authoring.md — Complete patterns guide
  - orchestration.md — Multi-agent workflows
package/package.json
CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@vibe-agent-toolkit/vat-development-agents",
- "version": "0.1.
+ "version": "0.1.32-rc.1",
  "description": "VAT development agents - dogfooding the vibe-agent-toolkit",
  "type": "module",
  "keywords": [
@@ -65,13 +65,13 @@
  "postinstall": "vat claude plugin install --npm-postinstall || exit 0"
  },
  "dependencies": {
- "@vibe-agent-toolkit/agent-schema": "0.1.
- "@vibe-agent-toolkit/cli": "0.1.
+ "@vibe-agent-toolkit/agent-schema": "0.1.32-rc.1",
+ "@vibe-agent-toolkit/cli": "0.1.32-rc.1",
  "yaml": "^2.8.2"
  },
  "devDependencies": {
  "@types/node": "^25.0.3",
- "@vibe-agent-toolkit/resource-compiler": "0.1.
+ "@vibe-agent-toolkit/resource-compiler": "0.1.32-rc.1",
  "ts-patch": "^3.2.1",
  "typescript": "^5.7.3"
  },