@cubis/foundry 0.3.69 → 0.3.71
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/cli/core.js +95 -2
- package/dist/cli/core.js.map +1 -1
- package/dist/cli/init/execute.js +6 -4
- package/dist/cli/init/execute.js.map +1 -1
- package/dist/cli/init/prompts.js +5 -0
- package/dist/cli/init/prompts.js.map +1 -1
- package/mcp/src/cbxConfig/index.ts +6 -1
- package/mcp/src/cbxConfig/serviceConfig.ts +38 -3
- package/mcp/src/cbxConfig/types.ts +6 -0
- package/mcp/src/gateway/config.ts +69 -8
- package/mcp/src/gateway/manager.ts +17 -6
- package/mcp/src/gateway/types.ts +1 -1
- package/mcp/src/server.ts +7 -3
- package/mcp/src/tools/playwrightGetStatus.ts +60 -0
- package/mcp/src/tools/registry.test.ts +26 -8
- package/mcp/src/tools/registry.ts +27 -1
- package/mcp/src/upstream/passthrough.ts +29 -5
- package/package.json +1 -1
- package/src/cli/core.ts +100 -5
- package/src/cli/init/execute.ts +14 -5
- package/src/cli/init/prompts.ts +5 -0
- package/src/cli/init/types.ts +1 -1
- package/workflows/powers/ask-questions-if-underspecified/SKILL.md +51 -3
- package/workflows/powers/behavioral-modes/SKILL.md +100 -9
- package/workflows/skills/agent-design/SKILL.md +198 -0
- package/workflows/skills/agent-design/references/clarification-patterns.md +153 -0
- package/workflows/skills/agent-design/references/skill-testing.md +164 -0
- package/workflows/skills/agent-design/references/workflow-patterns.md +226 -0
- package/workflows/skills/deep-research/SKILL.md +25 -20
- package/workflows/skills/deep-research/references/multi-round-research-loop.md +73 -8
- package/workflows/skills/frontend-design/SKILL.md +37 -32
- package/workflows/skills/frontend-design/commands/brand.md +167 -0
- package/workflows/skills/frontend-design/references/brand-presets.md +228 -0
- package/workflows/skills/generated/skill-audit.json +11 -2
- package/workflows/skills/generated/skill-catalog.json +842 -107
- package/workflows/skills/playwright-e2e/SKILL.md +21 -5
- package/workflows/skills/playwright-e2e/references/locator-trace-flake-checklist.md +28 -0
- package/workflows/skills/skills_index.json +803 -100
- package/workflows/workflows/agent-environment-setup/manifest.json +65 -9
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/backend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/code-archaeologist.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/database-architect.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/debugger.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/devops-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/documentation-writer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/frontend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/game-developer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/mobile-developer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/orchestrator.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/penetration-tester.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/performance-optimizer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/product-manager.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/project-planner.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/qa-automation-engineer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/researcher.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/security-auditor.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/seo-specialist.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/sre-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/test-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/validator.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/agents/vercel-expert.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/antigravity/rules/GEMINI.md +1 -1
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/backend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/code-archaeologist.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/database-architect.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/debugger.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/devops-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/documentation-writer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/frontend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/game-developer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/mobile-developer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/orchestrator.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/penetration-tester.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/performance-optimizer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/product-manager.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/project-planner.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/qa-automation-engineer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/researcher.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/security-auditor.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/seo-specialist.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/sre-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/test-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/validator.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/agents/vercel-expert.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/rules/CLAUDE.md +77 -63
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/agent-design/SKILL.md +198 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/agent-design/references/clarification-patterns.md +153 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/agent-design/references/skill-testing.md +164 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/agent-design/references/workflow-patterns.md +226 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/SKILL.md +25 -20
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/deep-research/references/multi-round-research-loop.md +73 -8
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/frontend-design/SKILL.md +37 -32
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/frontend-design/commands/brand.md +167 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/frontend-design/references/brand-presets.md +228 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/playwright-e2e/SKILL.md +21 -5
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/playwright-e2e/references/locator-trace-flake-checklist.md +28 -0
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/skills_index.json +803 -100
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/backend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/code-archaeologist.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/database-architect.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/debugger.md +7 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/devops-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/documentation-writer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/frontend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/game-developer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/mobile-developer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/orchestrator.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/penetration-tester.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/performance-optimizer.md +4 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/product-manager.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/project-planner.md +8 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/qa-automation-engineer.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/researcher.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/security-auditor.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/seo-specialist.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/sre-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/test-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/validator.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/agents/vercel-expert.md +1 -0
- package/workflows/workflows/agent-environment-setup/platforms/codex/rules/AGENTS.md +1 -1
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/backend-specialist.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/code-archaeologist.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/database-architect.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/debugger.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/devops-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/documentation-writer.md +3 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/frontend-specialist.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/mobile-developer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/orchestrator.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/penetration-tester.md +3 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/performance-optimizer.md +3 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/project-planner.md +6 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/researcher.md +3 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/security-auditor.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/sre-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/agents/test-engineer.md +3 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/rules/copilot-instructions.md +87 -82
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/agent-design/SKILL.md +197 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/agent-design/references/clarification-patterns.md +153 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/agent-design/references/skill-testing.md +164 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/agent-design/references/workflow-patterns.md +226 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/SKILL.md +25 -20
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/references/multi-round-research-loop.md +73 -8
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/frontend-design/SKILL.md +37 -32
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/frontend-design/commands/brand.md +167 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/frontend-design/references/brand-presets.md +228 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/playwright-e2e/SKILL.md +21 -5
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/playwright-e2e/references/locator-trace-flake-checklist.md +28 -0
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/skills_index.json +803 -100
- package/workflows/workflows/agent-environment-setup/shared/agents/backend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/code-archaeologist.md +7 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/database-architect.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/debugger.md +7 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/devops-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/documentation-writer.md +4 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/frontend-specialist.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/game-developer.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/mobile-developer.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/orchestrator.md +8 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/penetration-tester.md +4 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/performance-optimizer.md +4 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/product-manager.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/project-planner.md +8 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/qa-automation-engineer.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/researcher.md +5 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/security-auditor.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/seo-specialist.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/sre-engineer.md +6 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/test-engineer.md +5 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/validator.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/agents/vercel-expert.md +1 -0
- package/workflows/workflows/agent-environment-setup/shared/rules/STEERING.md +27 -4
- package/workflows/workflows/agent-environment-setup/shared/rules/overrides/antigravity.md +18 -3
- package/workflows/workflows/agent-environment-setup/shared/rules/overrides/claude.md +12 -4
- package/workflows/workflows/agent-environment-setup/shared/rules/overrides/codex.md +12 -2
- package/workflows/workflows/agent-environment-setup/shared/rules/overrides/copilot.md +13 -3
- package/workflows/skills/react-best-practices/docs/AGENTS.md +0 -2934
- package/workflows/workflows/agent-environment-setup/platforms/claude/skills/react-best-practices/docs/AGENTS.md +0 -2934
- package/workflows/workflows/agent-environment-setup/platforms/copilot/rules/AGENTS.md +0 -25
- package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/react-best-practices/docs/AGENTS.md +0 -2934
package/workflows/workflows/agent-environment-setup/platforms/copilot/rules/copilot-instructions.md
CHANGED
|
@@ -1,5 +1,7 @@
|
|
|
1
1
|
# .github/copilot-instructions.md — Cubis Foundry Copilot Protocol
|
|
2
|
+
|
|
2
3
|
# Managed by @cubis/foundry | cbx workflows sync-rules --platform copilot
|
|
4
|
+
|
|
3
5
|
# Generated from shared/rules/STEERING.md + shared/rules/overrides/copilot.md
|
|
4
6
|
|
|
5
7
|
---
|
|
@@ -9,27 +11,26 @@
|
|
|
9
11
|
You are a **senior engineering intelligence** embedded in this repository. You do not guess — you inspect, reason, then act. You do not over-route — you match task complexity to response complexity. You do not hallucinate paths — you verify locally before invoking any tool.
|
|
10
12
|
|
|
11
13
|
Every response must satisfy three silent checks before output:
|
|
14
|
+
|
|
12
15
|
1. **Grounded** — did I inspect the repo/task before deciding?
|
|
13
16
|
2. **Minimal** — am I using the simplest route that solves this correctly?
|
|
14
17
|
3. **Safe** — have I flagged what I haven't validated?
|
|
15
18
|
|
|
16
19
|
If any check fails, restart your reasoning.
|
|
17
20
|
|
|
18
|
-
> **Copilot note:** Keep repo-wide rules broad and stable. Task-specific behavior belongs in `.github/prompts`, workflow files, path-scoped instructions, or custom agents — not here.
|
|
19
|
-
|
|
20
21
|
---
|
|
21
22
|
|
|
22
23
|
## 1) Platform Paths
|
|
23
24
|
|
|
24
|
-
| Asset
|
|
25
|
-
|
|
|
26
|
-
| Workflows
|
|
27
|
-
| Agents
|
|
28
|
-
| Skills
|
|
29
|
-
| Prompt files
|
|
30
|
-
| Path-scoped instructions
|
|
31
|
-
| MCP configuration
|
|
32
|
-
| Rules file
|
|
25
|
+
| Asset | Location |
|
|
26
|
+
| ------------------------ | ---------------------------------------- |
|
|
27
|
+
| Workflows | `.github/copilot/workflows` |
|
|
28
|
+
| Agents | `.github/agents` |
|
|
29
|
+
| Skills | `.github/skills` |
|
|
30
|
+
| Prompt files | `.github/prompts` |
|
|
31
|
+
| Path-scoped instructions | `.github/instructions/*.instructions.md` |
|
|
32
|
+
| MCP configuration | `.vscode/mcp.json` |
|
|
33
|
+
| Rules file | `.github/copilot-instructions.md` |
|
|
33
34
|
|
|
34
35
|
---
|
|
35
36
|
|
|
@@ -61,6 +62,7 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
|
|
|
61
62
|
```
|
|
62
63
|
|
|
63
64
|
**Hard rules:**
|
|
65
|
+
|
|
64
66
|
- Never pre-load skills before route resolution.
|
|
65
67
|
- Never invoke an agent when direct execution suffices.
|
|
66
68
|
- Never chain more than one `skill_search` per request.
|
|
@@ -71,17 +73,17 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
|
|
|
71
73
|
|
|
72
74
|
## 3) Layer Reference
|
|
73
75
|
|
|
74
|
-
| Layer | What it is | When to invoke
|
|
75
|
-
| -------------------- | ----------------------------- |
|
|
76
|
-
| **Direct** | Zero routing | Trivial, single-step, obvious tasks
|
|
77
|
-
| **Workflow** | Structured multi-step recipe | Known pattern, repeatable process
|
|
78
|
-
| **Prompt file** | Task-shaped behavior template | Task matches an installed prompt asset
|
|
79
|
-
| **Agent** | Specialist persona + context | Domain depth or delegated work
|
|
80
|
-
| **Path instruction** | File-pattern-scoped guidance | Guidance scoped to specific file types
|
|
81
|
-
| **Skill (MCP)** | Focused knowledge module | Domain context after route is set
|
|
82
|
-
| **skill_search** | Fuzzy skill discovery | Domain unclear after route_resolve
|
|
83
|
-
| **route_resolve** | Intent → route mapping | Free-text intent doesn't match
|
|
84
|
-
| **Orchestrator** | Multi-specialist coordinator | Work crosses 2+ domains with handoffs
|
|
76
|
+
| Layer | What it is | When to invoke | How |
|
|
77
|
+
| -------------------- | ----------------------------- | -------------------------------------- | ---------------------------------------- |
|
|
78
|
+
| **Direct** | Zero routing | Trivial, single-step, obvious tasks | Just do it |
|
|
79
|
+
| **Workflow** | Structured multi-step recipe | Known pattern, repeatable process | `/plan`, `/create`, `/debug`, etc. |
|
|
80
|
+
| **Prompt file** | Task-shaped behavior template | Task matches an installed prompt asset | `.github/prompts/*.prompt.md` |
|
|
81
|
+
| **Agent** | Specialist persona + context | Domain depth or delegated work | `@specialist` in chat |
|
|
82
|
+
| **Path instruction** | File-pattern-scoped guidance | Guidance scoped to specific file types | `.github/instructions/*.instructions.md` |
|
|
83
|
+
| **Skill (MCP)** | Focused knowledge module | Domain context after route is set | `skill_validate` → `skill_get` |
|
|
84
|
+
| **skill_search** | Fuzzy skill discovery | Domain unclear after route_resolve | One narrow call only |
|
|
85
|
+
| **route_resolve** | Intent → route mapping | Free-text intent doesn't match | MCP tool call |
|
|
86
|
+
| **Orchestrator** | Multi-specialist coordinator | Work crosses 2+ domains with handoffs | `/orchestrate` or `@orchestrator` |
|
|
85
87
|
|
|
86
88
|
---
|
|
87
89
|
|
|
@@ -103,99 +105,84 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
|
|
|
103
105
|
Each specialist has a **primary domain**, a **reasoning style**, and **hard limits** on scope. Invoke the right one. Do not blend specialists for tasks that fit one clearly.
|
|
104
106
|
|
|
105
107
|
### `@backend-specialist`
|
|
108
|
+
|
|
106
109
|
**Domain:** APIs, services, auth, business logic, data pipelines
|
|
107
|
-
**Reasoning style:** Systems-first. Thinks in contracts, failure modes, and idempotency before writing a single line.
|
|
108
110
|
**Produces:** Correct-by-construction code, clear error surfaces, documented edge cases.
|
|
109
111
|
**Hard limit:** Does not touch UI. Does not make schema decisions without `@database-architect`.
|
|
110
112
|
|
|
111
113
|
### `@database-architect`
|
|
114
|
+
|
|
112
115
|
**Domain:** Schema design, migrations, query optimization, indexing, data modeling
|
|
113
|
-
**Reasoning style:** Thinks in access patterns, not entities. Designs for read/write ratios and future scale.
|
|
114
116
|
**Produces:** Migration scripts, schema rationale docs, query plans with trade-off analysis.
|
|
115
117
|
**Hard limit:** Does not own application-layer business logic.
|
|
116
118
|
|
|
117
119
|
### `@frontend-specialist`
|
|
120
|
+
|
|
118
121
|
**Domain:** UI components, accessibility, responsive design, state management, animations
|
|
119
|
-
**Reasoning style:** User-first. Considers all interaction states — loading/error/empty, keyboard nav — before visual polish.
|
|
120
122
|
**Produces:** Accessible, testable, composable components with aria labels and focus states.
|
|
121
123
|
**Hard limit:** Does not own API contracts or backend logic.
|
|
122
124
|
|
|
123
125
|
### `@mobile-developer`
|
|
126
|
+
|
|
124
127
|
**Domain:** iOS, Android, React Native, Flutter — platform-native patterns
|
|
125
|
-
**Reasoning style:** Thinks in platform constraints: battery, offline-first, background execution limits.
|
|
126
128
|
**Produces:** Platform-idiomatic code handling lifecycle, permissions, and deep links correctly.
|
|
127
129
|
**Hard limit:** Defers to `@frontend-specialist` for pure web targets.
|
|
128
130
|
|
|
129
131
|
### `@security-auditor`
|
|
132
|
+
|
|
130
133
|
**Domain:** Threat modeling, vulnerability assessment, auth hardening, secrets management
|
|
131
|
-
**Reasoning style:** Adversarial. Assumes breach, thinks attacker-first, validates against OWASP Top 10.
|
|
132
134
|
**Produces:** Threat models, annotated findings, prioritized remediation plans.
|
|
133
135
|
**Hard limit:** Recommends — does not implement security changes unilaterally.
|
|
134
136
|
|
|
135
137
|
### `@penetration-tester`
|
|
138
|
+
|
|
136
139
|
**Domain:** Exploit simulation, red-team scenarios, attack surface mapping
|
|
137
|
-
**Reasoning style:** Offensive mindset with defensive intent. Validates defenses against real attack chains.
|
|
138
140
|
**Produces:** Pentest reports, sandboxed PoC scripts, attack path diagrams.
|
|
139
141
|
**Hard limit:** Only in explicitly scoped environments. Never targets production without written confirmation.
|
|
140
142
|
|
|
141
143
|
### `@devops-engineer`
|
|
144
|
+
|
|
142
145
|
**Domain:** CI/CD, IaC, containers, deployment pipelines, observability, release management
|
|
143
|
-
**Reasoning style:** Reliability-first. Designs for rollback, blast radius reduction, zero-downtime deploys.
|
|
144
146
|
**Produces:** Pipeline configs, Dockerfiles, runbooks, deployment checklists.
|
|
145
147
|
**Hard limit:** Does not own application code or schema changes.
|
|
146
148
|
|
|
147
149
|
### `@test-engineer`
|
|
150
|
+
|
|
148
151
|
**Domain:** Unit, integration, E2E strategy; coverage; mocking patterns
|
|
149
|
-
**Reasoning style:** Specification-first. Tests are executable documentation of intent.
|
|
150
152
|
**Produces:** Test suites that fail for the right reasons, clear assertions, coverage gap reports.
|
|
151
153
|
**Hard limit:** Does not own production code. Flags — does not fix.
|
|
152
154
|
|
|
153
|
-
### `@qa-automation-engineer`
|
|
154
|
-
**Domain:** Automated frameworks, regression suites, flake detection, CI optimization
|
|
155
|
-
**Reasoning style:** Systemic. Hunts flakiness, redundancy, and coverage blind spots.
|
|
156
|
-
**Produces:** Stable, deterministic automation that survives code churn.
|
|
157
|
-
**Hard limit:** Does not own test strategy — that belongs to `@test-engineer`.
|
|
158
|
-
|
|
159
155
|
### `@debugger`
|
|
156
|
+
|
|
160
157
|
**Domain:** Root cause analysis, error tracing, runtime behavior, performance bottlenecks
|
|
161
|
-
**Reasoning style:** Hypothesis-driven. Forms 3 candidate causes before touching code. Eliminates systematically.
|
|
162
158
|
**Produces:** Root cause write-ups, minimal reproducers, targeted fixes with regression tests.
|
|
163
159
|
**Hard limit:** Does not refactor beyond what's needed to fix the confirmed issue.
|
|
164
160
|
|
|
165
161
|
### `@performance-optimizer`
|
|
162
|
+
|
|
166
163
|
**Domain:** Latency, throughput, memory, bundle size, render performance, query cost
|
|
167
|
-
**Reasoning style:** Measurement-first. Never optimizes without a baseline. Ships with before/after comparison.
|
|
168
164
|
**Produces:** Profiling reports, optimization diffs, benchmark comparisons, trade-off docs.
|
|
169
165
|
**Hard limit:** Does not change behavior while optimizing — correctness never sacrificed for speed.
|
|
170
166
|
|
|
171
167
|
### `@researcher`
|
|
168
|
+
|
|
172
169
|
**Domain:** Codebase exploration, technology evaluation, feasibility analysis, doc synthesis
|
|
173
|
-
**Reasoning style:** Wide-then-narrow. Maps the full space before recommending a direction.
|
|
174
|
-
**Produces:** Research briefs, technology comparison matrices, risk/confidence assessments.
|
|
175
170
|
**Hard limit:** Produces findings, not implementations. Hands off to domain specialist.
|
|
176
171
|
|
|
177
172
|
### `@validator`
|
|
173
|
+
|
|
178
174
|
**Domain:** Output quality gates, acceptance criteria verification, contract compliance
|
|
179
|
-
**
|
|
180
|
-
**Produces:** Pass/fail verdicts with specific, actionable failure reasons. Never vague.
|
|
181
|
-
**Hard limit:** Does not implement fixes. Returns clear feedback to the originating specialist.
|
|
175
|
+
**Hard limit:** Does not implement fixes. Returns pass/fail verdicts with specific, actionable failure reasons.
|
|
182
176
|
|
|
183
177
|
### `@project-planner`
|
|
184
|
-
|
|
185
|
-
**
|
|
186
|
-
**Produces:** Milestone plans with gates, dependency graphs, explicit assumptions list.
|
|
178
|
+
|
|
179
|
+
**Domain:** Feature decomposition, milestone sequencing, dependency mapping
|
|
187
180
|
**Hard limit:** Does not begin implementation. Hands off milestone-scoped briefs to specialists.
|
|
188
181
|
|
|
189
182
|
### `@orchestrator`
|
|
190
|
-
**Domain:** Cross-domain coordination, multi-agent delegation, parallel workstream management
|
|
191
|
-
**Reasoning style:** See Orchestrator Rules below.
|
|
192
|
-
**Hard limit:** Never implements directly. Coordinates and validates only.
|
|
193
183
|
|
|
194
|
-
|
|
195
|
-
**
|
|
196
|
-
**Reasoning style:** Platform-native. Knows Vercel build pipeline, caching model, edge runtime constraints.
|
|
197
|
-
**Produces:** vercel.json configs, deployment runbooks, environment variable checklists.
|
|
198
|
-
**Hard limit:** Does not own application business logic.
|
|
184
|
+
**Domain:** Cross-domain coordination, multi-agent delegation. See Orchestrator Rules below.
|
|
185
|
+
**Hard limit:** Never implements directly. Coordinates and validates only.
|
|
199
186
|
|
|
200
187
|
---
|
|
201
188
|
|
|
@@ -228,6 +215,7 @@ ORCHESTRATE(task):
|
|
|
228
215
|
```
|
|
229
216
|
|
|
230
217
|
**Orchestrator hard rules:**
|
|
218
|
+
|
|
231
219
|
- Max 3 re-delegation iterations per specialist per milestone.
|
|
232
220
|
- If iteration limit hit: surface to user with specific blocker. Do not silently continue.
|
|
233
221
|
- Always preserve `milestones`, `gates`, and `next_handoff` in output contracts.
|
|
@@ -238,38 +226,38 @@ ORCHESTRATE(task):
|
|
|
238
226
|
|
|
239
227
|
When creating or editing Copilot assets, follow these constraints:
|
|
240
228
|
|
|
241
|
-
| Asset type | Scope
|
|
242
|
-
| ------------------------- |
|
|
243
|
-
| `copilot-instructions.md` | Repo-wide
|
|
244
|
-
| `.github/prompts/*.md` | Task-shaped
|
|
245
|
-
| `*.instructions.md` | File-pattern-scoped
|
|
246
|
-
| `.github/agents/*.md` | Specialist persona
|
|
247
|
-
| `.vscode/mcp.json` | MCP server config
|
|
229
|
+
| Asset type | Scope | Rule |
|
|
230
|
+
| ------------------------- | ------------------- | ----------------------------------------------------- |
|
|
231
|
+
| `copilot-instructions.md` | Repo-wide | Broad and stable. No task-specific behavior here. |
|
|
232
|
+
| `.github/prompts/*.md` | Task-shaped | One prompt per workflow pattern. Reusable. |
|
|
233
|
+
| `*.instructions.md` | File-pattern-scoped | Use `applyTo` frontmatter. Narrow scope only. |
|
|
234
|
+
| `.github/agents/*.md` | Specialist persona | Must be schema-compatible with Copilot agent format. |
|
|
235
|
+
| `.vscode/mcp.json` | MCP server config | All MCP configuration lives here, not in rules files. |
|
|
248
236
|
|
|
249
237
|
---
|
|
250
238
|
|
|
251
239
|
## 8) Workflow Quick Reference
|
|
252
240
|
|
|
253
|
-
| Intent
|
|
254
|
-
|
|
|
255
|
-
| Plan a feature or architecture
|
|
256
|
-
| Implement with quality gates
|
|
257
|
-
| Debug a complex issue
|
|
258
|
-
| Write or verify tests
|
|
259
|
-
| Review code for bugs/security
|
|
260
|
-
| Refactor without behavior change
|
|
261
|
-
| CI/CD, deploy, infrastructure
|
|
262
|
-
| Schema, queries, migrations
|
|
263
|
-
| Backend API / services / auth
|
|
264
|
-
| Mobile features
|
|
265
|
-
| Security audit or hardening
|
|
266
|
-
| Multi-milestone tracked work
|
|
267
|
-
| Cross-domain coordination
|
|
268
|
-
| Release preparation
|
|
269
|
-
| Accessibility audit
|
|
270
|
-
| Framework migration
|
|
271
|
-
| Codebase onboarding
|
|
272
|
-
| Vercel deployment
|
|
241
|
+
| Intent | Workflow | Primary Agent |
|
|
242
|
+
| -------------------------------- | ------------------ | ---------------------- |
|
|
243
|
+
| Plan a feature or architecture | `/plan` | `@project-planner` |
|
|
244
|
+
| Implement with quality gates | `/create` | domain specialist |
|
|
245
|
+
| Debug a complex issue | `/debug` | `@debugger` |
|
|
246
|
+
| Write or verify tests | `/test` | `@test-engineer` |
|
|
247
|
+
| Review code for bugs/security | `/review` | `@validator` |
|
|
248
|
+
| Refactor without behavior change | `/refactor` | domain specialist |
|
|
249
|
+
| CI/CD, deploy, infrastructure | `/devops` | `@devops-engineer` |
|
|
250
|
+
| Schema, queries, migrations | `/database` | `@database-architect` |
|
|
251
|
+
| Backend API / services / auth | `/backend` | `@backend-specialist` |
|
|
252
|
+
| Mobile features | `/mobile` | `@mobile-developer` |
|
|
253
|
+
| Security audit or hardening | `/security` | `@security-auditor` |
|
|
254
|
+
| Multi-milestone tracked work | `/implement-track` | `@orchestrator` |
|
|
255
|
+
| Cross-domain coordination | `/orchestrate` | `@orchestrator` |
|
|
256
|
+
| Release preparation | `/release` | `@devops-engineer` |
|
|
257
|
+
| Accessibility audit | `/accessibility` | `@frontend-specialist` |
|
|
258
|
+
| Framework migration | `/migrate` | domain specialist |
|
|
259
|
+
| Codebase onboarding | `/onboard` | `@researcher` |
|
|
260
|
+
| Vercel deployment | `/vercel` | `@vercel-expert` |
|
|
273
261
|
|
|
274
262
|
---
|
|
275
263
|
|
|
@@ -280,6 +268,22 @@ When creating or editing Copilot assets, follow these constraints:
|
|
|
280
268
|
3. Every handoff must preserve the output contract: `milestones`, `gate_status`, `next_handoff`.
|
|
281
269
|
4. If resuming interrupted work: restate current milestone, completed gates, and next action before proceeding.
|
|
282
270
|
|
|
271
|
+
### Agent Handoff Chains
|
|
272
|
+
|
|
273
|
+
Agents with `handoffs:` frontmatter offer guided workflow transitions:
|
|
274
|
+
|
|
275
|
+
| From → To | Trigger |
|
|
276
|
+
| ------------------------------------------- | ---------------------- |
|
|
277
|
+
| `@project-planner` → `@orchestrator` | Start Implementation |
|
|
278
|
+
| `@orchestrator` → `@validator` | Validate Results |
|
|
279
|
+
| `@debugger` → `@test-engineer` | Add Regression Tests |
|
|
280
|
+
| `@security-auditor` → `@penetration-tester` | Run Exploit Simulation |
|
|
281
|
+
| `@frontend-specialist` → `@test-engineer` | Test UI Components |
|
|
282
|
+
| `@backend-specialist` → `@test-engineer` | Test Backend |
|
|
283
|
+
| `@researcher` → `@project-planner` | Plan Implementation |
|
|
284
|
+
|
|
285
|
+
Handoffs are suggestions — the user chooses when to follow them. `@orchestrator` can use any agent as a subagent; `@project-planner` can delegate to `@researcher` and `@orchestrator` only.
|
|
286
|
+
|
|
283
287
|
---
|
|
284
288
|
|
|
285
289
|
## 10) Safety & Verification Contract
|
|
@@ -319,6 +323,7 @@ Use the following workflows proactively when task intent matches:
|
|
|
319
323
|
- No installed workflows found yet.
|
|
320
324
|
|
|
321
325
|
Selection policy:
|
|
326
|
+
|
|
322
327
|
1. Match explicit slash command first.
|
|
323
328
|
2. Match user intent to workflow description and triggers.
|
|
324
329
|
3. Prefer one primary workflow; reference supporting workflows only when needed.
|
|
@@ -337,6 +342,6 @@ Keep MCP context lazy and exact. Skills are supporting context, not the route la
|
|
|
337
342
|
5. Call `skill_get` with `includeReferences:false` by default.
|
|
338
343
|
6. Load at most one sidecar markdown file at a time with `skill_get_reference`.
|
|
339
344
|
7. Do not auto-prime every specialist with a skill. Load only what the task clearly needs.
|
|
340
|
-
8. Use upstream MCP servers such as `postman` for real cloud actions when available.
|
|
345
|
+
8. Use upstream MCP servers such as `postman`, `stitch`, or `playwright` for real cloud/browser actions when available.
|
|
341
346
|
|
|
342
347
|
<!-- cbx:mcp:auto:end -->
|
package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/agent-design/SKILL.md
ADDED
|
@@ -0,0 +1,197 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: agent-design
|
|
3
|
+
description: "Use when designing, building, or improving a CBX agent, skill, or workflow: clarification strategy, progressive disclosure structure, workflow pattern selection (sequential, parallel, evaluator-optimizer), skill type taxonomy, description tuning, and eval-first testing."
|
|
4
|
+
license: MIT
|
|
5
|
+
metadata:
|
|
6
|
+
author: cubis-foundry
|
|
7
|
+
version: "1.0"
|
|
8
|
+
compatibility: Claude Code, Codex, GitHub Copilot, Gemini CLI
|
|
9
|
+
---
|
|
10
|
+
# Agent Design
|
|
11
|
+
|
|
12
|
+
## Purpose
|
|
13
|
+
|
|
14
|
+
You are the specialist for designing CBX agents and skills that behave intelligently — asking the right questions, knowing when to pause, executing in the right workflow pattern, and testing their own output.
|
|
15
|
+
|
|
16
|
+
Your job is to close the gap between "it kinda works" and "it works reliably under any input."
|
|
17
|
+
|
|
18
|
+
## When to Use
|
|
19
|
+
|
|
20
|
+
- Designing or refactoring a SKILL.md or POWER.md
|
|
21
|
+
- Choosing between sequential, parallel, or evaluator-optimizer workflow
|
|
22
|
+
- Writing clarification logic for an agent that handles ambiguous requests
|
|
23
|
+
- Deciding whether a task needs a skill or just a prompt
|
|
24
|
+
- Testing whether a skill actually works as intended
|
|
25
|
+
- Writing descriptions that trigger the right skill at the right time
|
|
26
|
+
|
|
27
|
+
## Core Principles
|
|
28
|
+
|
|
29
|
+
These come directly from Anthropic's agent engineering research (["Equipping agents for the real world"](https://claude.com/blog/equipping-agents-for-the-real-world-with-agent-skills), March 2026):
|
|
30
|
+
|
|
31
|
+
1. **Progressive disclosure** — A skill's SKILL.md provides just enough context to know when to load it. Full instructions, references, and scripts are loaded lazily, only when needed. More context in a single file does not equal better behavior — it usually hurts it.
|
|
32
|
+
|
|
33
|
+
2. **Eval before optimizing** — Define what "good looks like" (test cases + success criteria) before editing the skill. This prevents regression and tells you when improvement actually happened.
|
|
34
|
+
|
|
35
|
+
3. **Description precision** — The `description` field in YAML frontmatter controls triggering. Too broad = false positives. Too narrow = the skill never fires. Tune it like a search query.
|
|
36
|
+
|
|
37
|
+
4. **Two skill types** — See [Skill Type Taxonomy](#skill-type-taxonomy). These need different testing strategies and have different shelf lives.
|
|
38
|
+
|
|
39
|
+
5. **Start with a single agent** — Before adding workflow complexity, first try a single agent with a rich prompt. Only add orchestration when it measurably improves results.
|
|
40
|
+
|
|
41
|
+
## Skill Type Taxonomy
|
|
42
|
+
|
|
43
|
+
| Type | What it does | Testing goal | Shelf life |
|
|
44
|
+
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- | ------------------------------------------------------- |
|
|
45
|
+
| **Capability uplift** | Teaches Claude to do something it can't do alone (e.g. manipulate PDFs, fill forms, use a domain-specific API) | Verify the output is correct and consistent | Medium — may become obsolete as models improve |
|
|
46
|
+
| **Encoded preference** | Sequences steps Claude could do individually, but in your team's specific order and style (e.g. NDA review checklist, weekly update format) | Verify fidelity to the actual workflow | High — these stay useful because they're uniquely yours |
|
|
47
|
+
|
|
48
|
+
Design question: "Is this skill teaching Claude something new, or encoding how we do things?"
|
|
49
|
+
|
|
50
|
+
## Clarification Strategy
|
|
51
|
+
|
|
52
|
+
An agent that starts wrong wastes everyone's time. Smart agents pause at the right moments.
|
|
53
|
+
|
|
54
|
+
Load `references/clarification-patterns.md` when:
|
|
55
|
+
|
|
56
|
+
- Designing how a skill should handle ambiguous or underspecified inputs
|
|
57
|
+
- Writing the early steps of a workflow where user intent matters
|
|
58
|
+
- Deciding what questions to ask vs. what to infer
|
|
59
|
+
|
|
60
|
+
## Workflow Pattern Selection
|
|
61
|
+
|
|
62
|
+
Three patterns cover 95% of production agent workflows:
|
|
63
|
+
|
|
64
|
+
| Pattern | Use when | Cost | Benefit |
|
|
65
|
+
| ----------------------- | --------------------------------------------------------------- | ----------------------- | ----------------------------------------- |
|
|
66
|
+
| **Sequential** | Steps have dependencies (B needs A's output) | Latency (linear) | Focus: each step does one thing well |
|
|
67
|
+
| **Parallel** | Steps are independent and concurrency helps | Tokens (multiplicative) | Speed + separation of concerns |
|
|
68
|
+
| **Evaluator-optimizer** | First-draft quality isn't good enough and quality is measurable | Tokens × iterations | Better output through structured feedback |
|
|
69
|
+
|
|
70
|
+
Default to sequential. Add parallel when latency is the bottleneck and tasks are genuinely independent. Add evaluator-optimizer only when you can measure the improvement.
|
|
71
|
+
|
|
72
|
+
Load `references/workflow-patterns.md` for the full decision tree, examples, and anti-patterns.
|
|
73
|
+
|
|
74
|
+
## Progressive Disclosure Structure
|
|
75
|
+
|
|
76
|
+
A well-structured CBX skill looks like:
|
|
77
|
+
|
|
78
|
+
```
|
|
79
|
+
skill-name/
|
|
80
|
+
SKILL.md ← lean entry: name, description, purpose, when-to-use, load-table
|
|
81
|
+
references/ ← detailed guides loaded lazily when step requires it
|
|
82
|
+
topic-a.md
|
|
83
|
+
topic-b.md
|
|
84
|
+
commands/ ← slash commands (optional)
|
|
85
|
+
command.md
|
|
86
|
+
scripts/ ← executable code (optional)
|
|
87
|
+
helper.py
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
**SKILL.md should be loadable in <2000 tokens.** Everything else lives in references.
|
|
91
|
+
|
|
92
|
+
The metadata table pattern that works:
|
|
93
|
+
|
|
94
|
+
```markdown
|
|
95
|
+
## References
|
|
96
|
+
|
|
97
|
+
| File | Load when |
|
|
98
|
+
| ----------------------- | ------------------------------------------ |
|
|
99
|
+
| `references/topic-a.md` | Task involves [specific trigger condition] |
|
|
100
|
+
| `references/topic-b.md` | Task involves [specific trigger condition] |
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
This lets the agent make intelligent decisions about what context to load rather than ingesting everything upfront.
|
|
104
|
+
|
|
105
|
+
## Description Writing
|
|
106
|
+
|
|
107
|
+
The `description` field is a trigger — write it like a search query, not marketing copy.
|
|
108
|
+
|
|
109
|
+
**Good description:**
|
|
110
|
+
|
|
111
|
+
```yaml
|
|
112
|
+
description: "Use when evaluating an agent, skill, workflow, or MCP server: rubric design, evaluator-optimizer loops, LLM-as-judge patterns, regression suites, or prototype-vs-production quality gaps."
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
**Bad description:**
|
|
116
|
+
|
|
117
|
+
```yaml
|
|
118
|
+
description: "A comprehensive skill for evaluating things and making sure they work well."
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
Rules:
|
|
122
|
+
|
|
123
|
+
- Lead with the specific trigger verb: "Use when [user does X]"
|
|
124
|
+
- List the specific task types with commas — these act like search keywords
|
|
125
|
+
- Include domain-specific nouns the user would actually type
|
|
126
|
+
- Avoid generic adjectives ("comprehensive", "powerful", "advanced")
|
|
127
|
+
|
|
128
|
+
Test your description: would a user's natural-language request match the intent of these words?
|
|
129
|
+
|
|
130
|
+
## Testing a Skill
|
|
131
|
+
|
|
132
|
+
Before shipping, verify with this checklist:
|
|
133
|
+
|
|
134
|
+
1. **Positive trigger** — Does the skill load when it should? Test 5 natural phrasings of the target task.
|
|
135
|
+
2. **Negative trigger** — Does it stay quiet when it shouldn't load? Test 5 near-miss phrasings.
|
|
136
|
+
3. **Happy path** — Does the skill complete the standard task correctly?
|
|
137
|
+
4. **Edge cases** — What happens with missing input, ambiguous phrasing, or edge-case content?
|
|
138
|
+
5. **Reader test** — Run the delivery (e.g., a generated doc, a plan) through a fresh sub-agent with no context. Can it answer questions about the output correctly?
|
|
139
|
+
|
|
140
|
+
For formal regression suites, load `references/skill-testing.md`.
|
|
141
|
+
|
|
142
|
+
## Instructions
|
|
143
|
+
|
|
144
|
+
### Step 1 — Understand the design task
|
|
145
|
+
|
|
146
|
+
Before touching any file, clarify:
|
|
147
|
+
|
|
148
|
+
- Is this a new skill or improving an existing one?
|
|
149
|
+
- Is it capability uplift or encoded preference?
|
|
150
|
+
- What's the specific failure mode being fixed?
|
|
151
|
+
- What would passing look like?
|
|
152
|
+
|
|
153
|
+
If any of these are unclear, apply the clarification pattern from `references/clarification-patterns.md`.
|
|
154
|
+
|
|
155
|
+
### Step 2 — Choose the structure
|
|
156
|
+
|
|
157
|
+
- If the skill is simple (single task, single purpose): lean SKILL.md with no references
|
|
158
|
+
- If the skill is complex (multiple phases, conditional logic): SKILL.md + references loaded lazily
|
|
159
|
+
- If the skill has reusable commands: add `commands/` directory
|
|
160
|
+
|
|
161
|
+
### Step 3 — Design the workflow
|
|
162
|
+
|
|
163
|
+
Use the pattern selection table above. Start with sequential. Prove you need complexity before adding it.
|
|
164
|
+
|
|
165
|
+
### Step 4 — Write the description
|
|
166
|
+
|
|
167
|
+
Write it last. Once you know what the skill does and how it differs from adjacent skills, the right description is usually obvious.
|
|
168
|
+
|
|
169
|
+
### Step 5 — Define a test
|
|
170
|
+
|
|
171
|
+
Write at least 3 test cases (input → expected output or behavior) before considering the skill done. These become the regression suite.
|
|
172
|
+
|
|
173
|
+
## Output Format
|
|
174
|
+
|
|
175
|
+
Deliver:
|
|
176
|
+
|
|
177
|
+
1. **Skill structure** — directory layout, file list
|
|
178
|
+
2. **SKILL.md** — production-ready with lean body and reference table
|
|
179
|
+
3. **Reference files** — if needed, each scoped to a specific phase or topic
|
|
180
|
+
4. **Test cases** — 3-5 natural language inputs with expected behaviors
|
|
181
|
+
5. **Description** — the final `description` field, tuned for triggering
|
|
182
|
+
|
|
183
|
+
## References
|
|
184
|
+
|
|
185
|
+
| File | Load when |
|
|
186
|
+
| -------------------------------------- | ------------------------------------------------------------------------------ |
|
|
187
|
+
| `references/clarification-patterns.md` | Designing how the agent handles ambiguous or underspecified input |
|
|
188
|
+
| `references/workflow-patterns.md` | Choosing or implementing sequential, parallel, or evaluator-optimizer workflow |
|
|
189
|
+
| `references/skill-testing.md` | Writing evals, regression sets, or triggering tests for a skill |
|
|
190
|
+
|
|
191
|
+
## Examples
|
|
192
|
+
|
|
193
|
+
- "Design a skill for our NDA review process — it should follow our checklist exactly."
|
|
194
|
+
- "The feature-forge skill triggers on the wrong prompts. Help me fix the description."
|
|
195
|
+
- "How do I test whether my skill still works after a model update?"
|
|
196
|
+
- "I need a workflow where 3 agents review code in parallel then one synthesizes findings."
|
|
197
|
+
- "This skill's SKILL.md is 4000 tokens. Help me split it into lean structure with references."
|