@moon791017/neo-skills 1.1.4 → 1.1.6
- package/README.md +8 -2
- package/package.json +1 -1
- package/skills/neo-clarification/SKILL.md +1 -1
- package/skills/neo-code-review/SKILL.md +5 -4
- package/skills/neo-code-review/references/review-checklist.md +110 -50
- package/skills/neo-dotnet-tag-helper/SKILL.md +1 -1
- package/skills/neo-rust/SKILL.md +1 -1
- package/skills/neo-sub-agent/SKILL.md +103 -0
- package/skills/neo-sub-agent/assets/templates/antigravity-skill.md +25 -0
- package/skills/neo-sub-agent/assets/templates/claude-agent.md +5 -0
- package/skills/neo-sub-agent/assets/templates/codex-agent.toml +1 -0
- package/skills/neo-sub-agent/assets/templates/copilot-agent.md +5 -0
- package/skills/neo-sub-agent/evals/eval_queries.json +50 -0
- package/skills/neo-sub-agent/evals/evals.json +49 -0
- package/skills/neo-sub-agent/evals/files/sample-sub-agent-spec.json +19 -0
- package/skills/neo-sub-agent/references/client-adapters.md +133 -0
- package/skills/neo-sub-agent/references/sub-agent-design.md +96 -0
- package/skills/neo-sub-agent/scripts/render-sub-agent.py +401 -0
package/README.md
CHANGED
|
@@ -82,7 +82,11 @@
|
|
|
82
82
|
|
|
83
83
|
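* **AI Assistant Development Governance (`skills/neo-agent-harness`)**: Checks project rules, tests, CI, review processes, and security safeguards so that AI assistants can take part in development more reliably.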
|
|
84
84
|
|
|
85
|
-
### 12.
|
|
85
|
+
### 12. Sub-Agent Builder
|
|
86
|
+
|
|
87
|
+
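* **Cross-CLI Sub-Agent Builder (`skills/neo-sub-agent`)**: Designs and generates sub-agent/custom agent configurations for Antigravity CLI, Codex, Claude Code, and Copilot CLI, covering role separation, tool permissions, file formats, and the validation workflow.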
|
|
88
|
+
|
|
89
|
+
### 13. AI Tells / Slop Removal Expert
|
|
86
90
|
|
|
87
91
|
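* **De-AI Text Tone (`skills/neo-stop-slop`)**: Removes AI-sounding phrasing, filler words, and formulaic, long-winded sentence patterns in Chinese and English, restoring clean, vivid, concise natural language, with specialized handling for engineering comments, Git commits, and PR descriptions.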
|
|
88
92
|
|
|
@@ -162,7 +166,8 @@ npx skills add Benknightdark/neo-skills --all
|
|
|
162
166
|
| **18. Code Review Expert** | `npx skills add Benknightdark/neo-skills --skill neo-code-review` |
|
|
163
167
|
| **19. Requirement Analysis and Clarification Assistant** | `npx skills add Benknightdark/neo-skills --skill neo-clarification` |
|
|
164
168
|
| **20. AI Development Workflow Governance Expert** | `npx skills add Benknightdark/neo-skills --skill neo-agent-harness` |
|
|
165
|
-
| **21.
|
|
169
|
+
| **21. Sub-Agent Builder** | `npx skills add Benknightdark/neo-skills --skill neo-sub-agent` |
|
|
170
|
+
| **22. AI Tells/Slop Removal Expert** | `npx skills add Benknightdark/neo-skills --skill neo-stop-slop` |
|
|
166
171
|
|
|
167
172
|
---
|
|
168
173
|
|
|
@@ -233,6 +238,7 @@ npx -p @moon791017/neo-skills install-system-instructions --ai-agent claude --in
|
|
|
233
238
|
| **TS type design and CJS/ESM interop** | `neo-typescript` | `npx skills add Benknightdark/neo-skills --skill neo-typescript` | `Fix tsconfig and ESM/CJS interop issues` |
|
|
234
239
|
| **Clarifying and specifying complex or vague requirements** | `neo-clarification` | `npx skills add Benknightdark/neo-skills --skill neo-clarification` | `Plan an e-commerce website` |
|
|
235
240
|
| **AI assistant development governance and workflow health check** | `neo-agent-harness` | `npx skills add Benknightdark/neo-skills --skill neo-agent-harness` | `Evaluate the AI development governance workflow` |
|
|
241
|
+
| **Create cross-CLI sub-agents / custom agents** | `neo-sub-agent` | `npx skills add Benknightdark/neo-skills --skill neo-sub-agent` | `Add a Codex code-reviewer sub agent` |
|
|
236
242
|
| **Remove AI tone from copy, comments, and commits** | `neo-stop-slop` | `npx skills add Benknightdark/neo-skills --skill neo-stop-slop` | `Remove the AI-sounding filler from this passage` |
|
|
237
243
|
|
|
238
244
|
## 🛠 Development and Testing Guide
|
package/package.json
CHANGED
|
package/skills/neo-clarification/SKILL.md
CHANGED
|
@@ -6,7 +6,7 @@ description: >
|
|
|
6
6
|
|
|
7
7
|
# Requirement Clarification Specifications
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Apply the Inversion & Generator Pattern. Follow this protocol strictly to translate raw, chaotic user complaints and screenshots into clean, structured system specifications.
|
|
10
10
|
|
|
11
11
|
---
|
|
12
12
|
|
|
package/skills/neo-code-review/SKILL.md
CHANGED
|
@@ -6,7 +6,7 @@ description: >
|
|
|
6
6
|
|
|
7
7
|
# Code Review Specifications
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Apply the Reviewer Pattern. Follow this protocol strictly to systematically, objectively, and constructively review the user's code changes.
|
|
10
10
|
|
|
11
11
|
---
|
|
12
12
|
|
|
@@ -37,9 +37,10 @@ You are a senior code review expert (Reviewer Pattern). Follow this protocol str
|
|
|
37
37
|
[review-checklist.md](file:///Users/ben/Projects/neo-skills/skills/neo-code-review/references/review-checklist.md)
|
|
38
38
|
|
|
39
39
|
2. **Systematically Evaluate Code**:
|
|
40
|
-
- Analyze the code logic deeply and compare it against the
|
|
41
|
-
-
|
|
42
|
-
-
|
|
40
|
+
- Analyze the code logic deeply and compare it against the checklist dimensions: Correctness, Regression Risk, Security, Performance, Data & Concurrency, Test Gaps, SOLID/Design Principles, Logging/Observability, Maintainability, and Style.
|
|
41
|
+
- Prioritize evidence-backed findings that create concrete behavioral, security, operational, compatibility, or maintainability risk.
|
|
42
|
+
- Flag **🔴 Critical Issues (Must-fix)** for defects such as requirement breakage, security vulnerabilities, data corruption/loss, broken authorization, severe regressions, or production-impacting reliability failures.
|
|
43
|
+
- Classify lower-risk performance, design, logging, test coverage, readability, and maintainability issues under **🟡 Suggestions** when they have a clear reason and actionable remediation.
|
|
43
44
|
|
|
44
45
|
---
|
|
45
46
|
|
|
package/skills/neo-code-review/references/review-checklist.md
CHANGED
|
@@ -1,76 +1,136 @@
|
|
|
1
1
|
# Code Review Checklist
|
|
2
2
|
|
|
3
|
-
This checklist
|
|
3
|
+
This checklist is the review baseline for `neo-code-review`. During review, compare the code changes against these criteria and categorize findings by severity: Critical Issues (🔴 Must-fix), Suggestions (🟡 Clean Code/Performance), and Praise (🟢 High Quality).
|
|
4
|
+
|
|
5
|
+
Only report a finding when the risk is supported by the diff, surrounding code, tests, runtime behavior, or a clearly stated requirement. Do not turn every checklist item into a finding.
|
|
4
6
|
|
|
5
7
|
---
|
|
6
8
|
|
|
7
9
|
## 1. Correctness & Logic
|
|
8
10
|
|
|
9
|
-
- [ ] **
|
|
10
|
-
- [ ] **
|
|
11
|
+
- [ ] **Requirement Preservation**: Does the change break an existing or stated requirement? Does it miss a required business path?
|
|
12
|
+
- [ ] **Logic Errors**: Are conditionals, loops, ordering, calculations, parsing, validation, or branching rules incorrect?
|
|
13
|
+
- [ ] **Boundary Conditions**: Are edge cases handled properly, such as `null`, `undefined`, empty values, maximum/minimum values, negative numbers, special characters, duplicate input, timezone boundaries, or partial data?
|
|
14
|
+
- [ ] **State Transitions**: Are status changes, lifecycle transitions, retries, rollbacks, or cleanup steps consistent and reversible where required?
|
|
11
15
|
- [ ] **Exception Handling**:
|
|
12
|
-
- Are error-prone operations
|
|
13
|
-
- Are caught exceptions
|
|
14
|
-
- [ ] **Concurrency & Thread Safety**: In multi-threaded or asynchronous environments, are there potential race conditions, deadlocks, or unawaited async operations?
|
|
15
|
-
- [ ] **State Management**: Are variable lifecycles correct? Are there any unintended side effects or global namespace pollution?
|
|
16
|
+
- Are error-prone operations such as I/O, network requests, database calls, parsing, serialization, or background jobs handled safely?
|
|
17
|
+
- Are caught exceptions recovered, propagated, or logged appropriately instead of being swallowed silently?
|
|
16
18
|
|
|
17
19
|
---
|
|
18
20
|
|
|
19
|
-
## 2.
|
|
21
|
+
## 2. Regression Risk & Compatibility
|
|
20
22
|
|
|
21
|
-
- [ ] **
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
- [ ] **
|
|
26
|
-
|
|
27
|
-
|
|
23
|
+
- [ ] **Existing APIs**: Does the change break public APIs, endpoint behavior, method signatures, event contracts, or SDK/client expectations?
|
|
24
|
+
- [ ] **Data Formats**: Does it change request/response schemas, database shape, serialized fields, enum values, date formats, number precision, or default values in an incompatible way?
|
|
25
|
+
- [ ] **Error Formats**: Does it break existing error codes, problem details, validation messages, HTTP status codes, or client-side error handling assumptions?
|
|
26
|
+
- [ ] **Permission Flows**: Does it alter authentication, authorization, role checks, ownership checks, or approval flows in a way that blocks valid users or grants excessive access?
|
|
27
|
+
- [ ] **User Workflows**: Does it break existing user journeys, backward compatibility, migrations, imports/exports, saved settings, deep links, or integrations?
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
31
|
+
## 3. Security — 🔴 Critical
|
|
32
|
+
|
|
33
|
+
- [ ] **Injection**:
|
|
34
|
+
- SQL injection, NoSQL injection, command injection, LDAP injection, or template injection.
|
|
35
|
+
- Unsafe string concatenation in queries, shell commands, templates, filters, or dynamic expressions.
|
|
36
|
+
- [ ] **Web Security**:
|
|
37
|
+
- XSS caused by missing output encoding, unsafe HTML rendering, unsafe markdown rendering, or scriptable user content.
|
|
38
|
+
- CSRF exposure on state-changing requests.
|
|
39
|
+
- CORS misconfiguration that allows unintended origins, credentials, or methods.
|
|
28
40
|
- [ ] **Authentication & Authorization**:
|
|
29
|
-
-
|
|
30
|
-
-
|
|
31
|
-
- [ ] **
|
|
32
|
-
-
|
|
33
|
-
-
|
|
41
|
+
- Broken authentication, broken access control, IDOR, missing ownership checks, weak session handling, or bypassable authorization logic.
|
|
42
|
+
- Missing rate limit or abuse protection on sensitive endpoints such as login, reset, invite, export, payment, or AI calls.
|
|
43
|
+
- [ ] **Sensitive Data Protection**:
|
|
44
|
+
- Sensitive data exposure, secret/token leakage, hardcoded credentials, connection strings, private keys, or unsafe debug output.
|
|
45
|
+
- Insecure logging of tokens, passwords, PII, connection strings, full sensitive payloads, or reversible sensitive data.
|
|
46
|
+
- [ ] **File, Network & Execution Risks**:
|
|
47
|
+
- SSRF, path traversal, arbitrary file read/write, unsafe upload handling, insecure deserialization, unsafe reflection, or dynamic execution.
|
|
48
|
+
- [ ] **Validation & Encoding**:
|
|
49
|
+
- Missing input validation, output encoding, authorization checks, allowlists, size limits, content-type validation, or schema validation.
|
|
50
|
+
- [ ] **Cryptography**:
|
|
51
|
+
- Insecure crypto, weak hashes, predictable random values, missing salt, hardcoded keys, custom crypto, or unsafe key management.
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## 4. Performance & Resource Management
|
|
56
|
+
|
|
57
|
+
- [ ] **Database Efficiency**:
|
|
58
|
+
- N+1 queries, repeated database roundtrips, missing batching, unnecessary joins, fetching unused columns, or repeated writes.
|
|
59
|
+
- Unpaginated large queries, unbounded result sets, missing limits, or missing necessary index guidance for hot queries.
|
|
60
|
+
- [ ] **Async & External Calls**:
|
|
61
|
+
- Synchronous blocking inside async flows, sync-over-async, unawaited tasks, missing cancellation token propagation, or thread-pool starvation risk.
|
|
62
|
+
- External API calls without timeout, retry/backoff, idempotency strategy, or circuit-breaker consideration where failures could cascade.
|
|
63
|
+
- [ ] **Memory & Resource Use**:
|
|
64
|
+
- Large allocations, loading entire large files or payloads into memory, unbounded buffers, unclosed streams, undisposed resources, or leaked handles.
|
|
65
|
+
- Unregistered event listeners, long-lived references, growing global caches, or background work that cannot be stopped.
|
|
66
|
+
- [ ] **Algorithmic Efficiency**:
|
|
67
|
+
- Inefficient algorithms, repeated computation, unnecessary nested loops, repeated serialization/deserialization, or expensive work in hot paths.
|
|
68
|
+
- [ ] **Logging Cost**:
|
|
69
|
+
- Excessive logging in hot paths, logging large payloads, or synchronous logging that can slow requests.
|
|
70
|
+
- [ ] **Caching**:
|
|
71
|
+
- Incorrect cache keys, stale cache invalidation, cache stampede risk, missing TTLs, or caching data with user-specific authorization requirements.
|
|
34
72
|
|
|
35
73
|
---
|
|
36
74
|
|
|
37
|
-
##
|
|
75
|
+
## 5. Data & Concurrency
|
|
38
76
|
|
|
39
|
-
- [ ] **
|
|
40
|
-
- [ ] **
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
- [ ] **Resource
|
|
44
|
-
|
|
45
|
-
- Are resources released in `finally` blocks or using structural patterns like `using` (.NET) or `with` (Python)?
|
|
46
|
-
- [ ] **Memory Leaks**:
|
|
47
|
-
- Are there unregistered event listeners, infinitely growing global caches, or long-lived objects retaining references to short-lived objects?
|
|
48
|
-
- [ ] **Caching & Lazy Loading**: Is an appropriate caching strategy used for high-frequency, low-variance expensive computations or I/O operations?
|
|
77
|
+
- [ ] **Transaction Consistency**: Are related writes protected by transactions or compensating logic where partial failure would corrupt state?
|
|
78
|
+
- [ ] **Duplicate Writes & Idempotency**: Can retries, double-clicks, queue redelivery, or webhook replay create duplicate records or repeated side effects?
|
|
79
|
+
- [ ] **Race Conditions**: Are there check-then-act races, lost updates, optimistic concurrency gaps, locking mistakes, deadlocks, or shared mutable state issues?
|
|
80
|
+
- [ ] **Cache Invalidation**: Does the change keep cache, database, search index, derived data, and client state consistent?
|
|
81
|
+
- [ ] **Resource Release**: Are locks, transactions, database connections, file handles, timers, subscriptions, and background workers released on success and failure?
|
|
82
|
+
- [ ] **Async Error Handling**: Are async failures observed, propagated, retried safely, or surfaced to monitoring instead of becoming unhandled background errors?
|
|
49
83
|
|
|
50
84
|
---
|
|
51
85
|
|
|
52
|
-
##
|
|
86
|
+
## 6. Test Gaps
|
|
53
87
|
|
|
54
|
-
- [ ] **
|
|
55
|
-
- [ ] **
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
- [ ] **
|
|
59
|
-
- Does the code follow the **DRY (Don't Repeat Yourself)** principle?
|
|
60
|
-
- Does the code follow **SOLID principles** (especially the Single Responsibility Principle - SRP)?
|
|
61
|
-
- [ ] **Testability**:
|
|
62
|
-
- Is the code easy to unit test?
|
|
63
|
-
- Is there appropriate dependency injection or decoupling to facilitate mocking external dependencies?
|
|
64
|
-
- Are there corresponding test cases for new features or bug fixes?
|
|
88
|
+
- [ ] **High-Risk Branches**: Are new critical branches, feature flags, migrations, fallbacks, retries, or failure branches covered?
|
|
89
|
+
- [ ] **Error Paths**: Are validation errors, dependency failures, timeouts, malformed data, and partial failures tested?
|
|
90
|
+
- [ ] **Permission Boundaries**: Are unauthorized, unauthenticated, wrong-owner, wrong-role, and cross-tenant cases tested?
|
|
91
|
+
- [ ] **Data Conversion**: Are serialization, parsing, mapping, rounding, timezone handling, enum changes, and schema compatibility tested?
|
|
92
|
+
- [ ] **Regression Scenarios**: Are tests present for the bug being fixed or the existing workflow that could be broken?
|
|
65
93
|
|
|
66
94
|
---
|
|
67
95
|
|
|
68
|
-
##
|
|
96
|
+
## 7. SOLID / Design Principles
|
|
69
97
|
|
|
70
|
-
- [ ] **
|
|
71
|
-
- [ ] **
|
|
72
|
-
|
|
73
|
-
|
|
98
|
+
- [ ] **SRP (Single Responsibility Principle)**: Does a class, method, component, or service take on too many responsibilities, making bugs or tests more likely?
|
|
99
|
+
- [ ] **OCP (Open/Closed Principle)**: Will common new requirements require repeatedly editing fragile core logic instead of extending through stable seams?
|
|
100
|
+
- [ ] **LSP (Liskov Substitution Principle)**: Do inheritance, interface, or polymorphic changes violate existing contracts or caller assumptions?
|
|
101
|
+
- [ ] **ISP (Interface Segregation Principle)**: Are callers forced to depend on methods, fields, DTO members, or service operations they do not need?
|
|
102
|
+
- [ ] **DIP (Dependency Inversion Principle)**: Do high-level modules directly depend on concrete implementations in a way that makes testing, replacement, or isolation difficult?
|
|
103
|
+
|
|
104
|
+
Only list SOLID/design findings when the violation clearly increases defect risk, breaks extension, makes tests meaningfully harder, or creates concrete maintenance risk. Do not report abstract design preferences as findings.
|
|
105
|
+
|
|
106
|
+
---
|
|
107
|
+
|
|
108
|
+
## 8. Logging / Observability
|
|
109
|
+
|
|
110
|
+
- [ ] **Coverage of Important Failure Points**: Do important error paths, external APIs, database operations, background jobs, scheduled jobs, payment flows, and AI calls have enough logging or telemetry to debug production failures?
|
|
111
|
+
- [ ] **Diagnostic Context**: Do logs include useful correlation fields such as request id, entity id, user/tenant identifier when safe, operation, status, duration, and exception details?
|
|
112
|
+
- [ ] **Log Levels**: Are expected business outcomes logged below `error`, and real failures not hidden as `debug` or omitted entirely?
|
|
113
|
+
- [ ] **Sensitive Data Safety**: Logs must not include tokens, passwords, connection strings, personal data, full sensitive payloads, or content that can reconstruct sensitive data.
|
|
114
|
+
- [ ] **Signal-to-Noise**: Do not require logging on every line. Report logging gaps only for failure points that would be difficult to investigate after deployment.
|
|
115
|
+
|
|
116
|
+
---
|
|
117
|
+
|
|
118
|
+
## 9. Maintainability & Style
|
|
119
|
+
|
|
120
|
+
- [ ] **Clear Naming**: Are variables, functions, classes, files, and modules named clearly enough to reflect intent?
|
|
121
|
+
- [ ] **Complexity Control**:
|
|
122
|
+
- Is a function, method, component, or query too long or deeply nested to review safely?
|
|
123
|
+
- Is cyclomatic complexity high enough to hide bugs or make tests brittle?
|
|
124
|
+
- [ ] **Modularity & Duplication**:
|
|
125
|
+
- Does the code avoid meaningful duplication, copy-pasted business rules, and divergent validation logic?
|
|
126
|
+
- Is shared behavior placed at the right abstraction level without over-engineering?
|
|
127
|
+
- [ ] **Testability**:
|
|
128
|
+
- Is the code easy to unit test or integration test?
|
|
129
|
+
- Are dependencies injectable or isolatable where tests need control over time, I/O, network calls, randomness, or external systems?
|
|
130
|
+
- [ ] **Idiomatic Best Practices**: Does the code use language and framework conventions appropriately without fighting the platform?
|
|
131
|
+
- [ ] **Formatting & Consistency**:
|
|
132
|
+
- Do indentation, spacing, imports, quotes, naming, and file organization match the codebase?
|
|
133
|
+
- Does the code pass the local formatter, linter, and type checker when applicable?
|
|
74
134
|
- [ ] **Comment Quality**:
|
|
75
|
-
- Do comments explain
|
|
76
|
-
- Has obsolete or commented-out dead code been
|
|
135
|
+
- Do comments explain non-obvious intent and tradeoffs instead of restating the code?
|
|
136
|
+
- Has obsolete, misleading, or commented-out dead code been removed?
|
|
package/skills/neo-dotnet-tag-helper/SKILL.md
CHANGED
|
@@ -18,7 +18,7 @@ metadata:
|
|
|
18
18
|
|
|
19
19
|
## Workflow (Generator Pattern)
|
|
20
20
|
|
|
21
|
-
|
|
21
|
+
Follow these steps exactly to create robust, high-performance C# Tag Helpers.
|
|
22
22
|
|
|
23
23
|
### Step 1: Perceive & Gather Requirements (Inversion)
|
|
24
24
|
Before writing any code, ask the user to clarify the following if not already provided:
|
package/skills/neo-rust/SKILL.md
CHANGED
|
@@ -11,7 +11,7 @@ metadata:
|
|
|
11
11
|
|
|
12
12
|
# Neo Rust Expert
|
|
13
13
|
|
|
14
|
-
|
|
14
|
+
Write safe, maintainable, and idiomatic Rust code by strictly following the official design patterns, leveraging Rust's powerful type system, and avoiding anti-patterns.
|
|
15
15
|
|
|
16
16
|
## Gotchas
|
|
17
17
|
|
|
package/skills/neo-sub-agent/SKILL.md
ADDED
|
@@ -0,0 +1,103 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: neo-sub-agent
|
|
3
|
+
description: >
|
|
4
|
+
Design, create, review, or convert cross-CLI sub-agents, custom agents, worker agents, reviewer agents, planner agents, explorer agents, background agents, and multi-agent workflows. Use this skill when the user asks to add a sub agent, create a custom agent, design agent delegation, or generate agent files for Antigravity CLI, Codex, Claude Code, or GitHub Copilot CLI.
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Neo Sub-Agent
|
|
8
|
+
|
|
9
|
+
Use this skill to design and generate focused sub-agent definitions for developer CLIs. Keep the user in control of scope, permissions, and target clients.
|
|
10
|
+
|
|
11
|
+
## Core Rule
|
|
12
|
+
Do not invent client-specific schema. If the requested CLI, field, or file location is not documented in `references/client-adapters.md` or discoverable in the project, state the missing fact and generate only a neutral blueprint.
|
|
13
|
+
|
|
14
|
+
## Workflow
|
|
15
|
+
|
|
16
|
+
### 1. Perceive
|
|
17
|
+
1. Inspect the project before asking questions:
|
|
18
|
+
- Existing agent definitions: `.claude/agents/`, `.codex/agents/`, `.github/agents/`, `.agents/skills/`.
|
|
19
|
+
- Project guidance: `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, `.github/copilot-instructions.md`.
|
|
20
|
+
- Related skills or conventions in `skills/`, `.agents/skills/`, `.github/skills/`, `.claude/skills/`, `.codex/skills/`.
|
|
21
|
+
2. Identify the target clients from the user request or project evidence: `claude`, `codex`, `copilot`, `agy`, or multiple.
|
|
22
|
+
3. If multiple clients are plausible and the request does not specify one, ask one concise question before writing files.
|
|
23
|
+
|
|
24
|
+
### 2. Decide Whether a Sub-Agent Fits
|
|
25
|
+
Read `references/sub-agent-design.md` when the request involves architecture, multiple agents, or tradeoffs.
|
|
26
|
+
|
|
27
|
+
Use a sub-agent when at least one is true:
|
|
28
|
+
- The work produces verbose logs, search results, or file reads that should not pollute the main context.
|
|
29
|
+
- The work is a repeated specialist role such as code review, planning, test execution, research, migration analysis, or docs verification.
|
|
30
|
+
- The work can run independently or in a clearly bounded sequence.
|
|
31
|
+
- The role needs stricter tool access, sandboxing, model choice, or output contract than the main agent.
|
|
32
|
+
|
|
33
|
+
Prefer a normal skill or main-conversation instruction when:
|
|
34
|
+
- The task is a quick, targeted change.
|
|
35
|
+
- The agent needs frequent back-and-forth with the user.
|
|
36
|
+
- Multiple workers would edit the same files concurrently.
|
|
37
|
+
- The workflow depends on undocumented client behavior.
|
|
38
|
+
|
|
39
|
+
### 3. Define the Agent Spec
|
|
40
|
+
Collect or infer these fields, using safe defaults where reasonable:
|
|
41
|
+
- `name`: lowercase kebab-case, under 64 characters.
|
|
42
|
+
- `description`: trigger-focused; state exactly when to use the agent.
|
|
43
|
+
- `instructions`: role, workflow, constraints, and output contract.
|
|
44
|
+
- `clients`: target clients.
|
|
45
|
+
- `scope`: `project` by default; use `user` only when the user asks for reusable personal agents.
|
|
46
|
+
- `tools`: use the smallest useful set; prefer read-only for reviewers, researchers, planners, and auditors.
|
|
47
|
+
- `model` and reasoning effort: omit unless the user asks or the client-specific reference gives a clear default.
|
|
48
|
+
- `handoff`: what the parent agent should pass in and what the sub-agent must return.
|
|
49
|
+
- `validation`: commands, checks, or review criteria the sub-agent should run or report.
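The field rules above could be checked before rendering with something like the following. This is a minimal sketch, not the logic in `render-sub-agent.py`: `validate_spec`, `KEBAB_CASE`, and `MAX_NAME_LENGTH` are hypothetical names, and "under 64 characters" is read here as at most 63.

```python
import re

# Hypothetical helper for illustration; not taken from render-sub-agent.py.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
MAX_NAME_LENGTH = 64  # assumption: "under 64 characters" means length < 64


def validate_spec(spec: dict) -> list[str]:
    """Return human-readable problems; an empty list means the spec passed."""
    problems = []
    name = spec.get("name", "")
    if not KEBAB_CASE.match(name):
        problems.append("name must be lowercase kebab-case")
    if len(name) >= MAX_NAME_LENGTH:
        problems.append("name must be under 64 characters")
    if not spec.get("description", "").strip():
        problems.append("description is required")
    if not spec.get("clients"):
        problems.append("at least one target client is required")
    # `scope` falls back to "project", matching the default stated above.
    if spec.get("scope", "project") not in {"project", "user"}:
        problems.append("scope must be 'project' or 'user'")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every issue to the user in a single pass.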
|
|
50
|
+
|
|
51
|
+
### 4. Generate Files
|
|
52
|
+
Read `references/client-adapters.md` before generating client-specific files.
|
|
53
|
+
|
|
54
|
+
For repeatable output, create a JSON spec and run:
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
uv run skills/neo-sub-agent/scripts/render-sub-agent.py --spec spec.json
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
If `uv` is not available and the script has no external dependencies, use:
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
python skills/neo-sub-agent/scripts/render-sub-agent.py --spec spec.json
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
Review the dry-run JSON. If the target paths and content are correct, write files:
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
uv run skills/neo-sub-agent/scripts/render-sub-agent.py --spec spec.json --write
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
Use `--force` only when replacing an existing agent file is explicitly intended.
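The dry-run, write, and force steps above can be pictured roughly as follows. This is a sketch under stated assumptions: `apply_plan` and its report shape are hypothetical, and the real script's internals and dry-run JSON format are not shown in this diff.

```python
from pathlib import Path


def apply_plan(plan: dict[str, str], root: Path, write: bool = False, force: bool = False) -> list[dict]:
    """Report what would happen to each target file, writing only when asked to.

    `plan` maps a path relative to `root` to the file content (hypothetical shape).
    """
    report = []
    for relative_path, content in plan.items():
        target = root / relative_path
        exists = target.exists()
        if exists and not force:
            # Never overwrite an existing agent file unless force is explicit.
            action = "skip-existing"
        elif write:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
            action = "written"
        else:
            # Dry run: describe the write without touching the filesystem.
            action = "would-write"
        report.append({"path": str(target), "exists": exists, "action": action})
    return report
```

Defaulting both `write` and `force` to `False` keeps the safe path the easy one: a caller has to opt in twice before an existing file is replaced.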
|
|
73
|
+
|
|
74
|
+
### 5. Review
|
|
75
|
+
Before finalizing, check:
|
|
76
|
+
- The agent has one narrow job and does not duplicate an existing agent.
|
|
77
|
+
- The description is specific enough for automatic delegation.
|
|
78
|
+
- Tool permissions match the role and avoid broad write/shell access when unnecessary.
|
|
79
|
+
- The output contract is short, concrete, and easy for the parent agent to synthesize.
|
|
80
|
+
- Antigravity output is labeled as a skill/delegation blueprint, not a custom sub-agent manifest.
|
|
81
|
+
|
|
82
|
+
## Output Format
|
|
83
|
+
When reporting back to the user, use Traditional Chinese (Taiwan):
|
|
84
|
+
|
|
85
|
+
```markdown
|
|
86
|
+
## 已新增
|
|
87
|
+
- [client] `path/to/agent-file`
|
|
88
|
+
|
|
89
|
+
## 設計重點
|
|
90
|
+
- 角色:
|
|
91
|
+
- 觸發條件:
|
|
92
|
+
- 權限:
|
|
93
|
+
- 回傳格式:
|
|
94
|
+
|
|
95
|
+
## 驗證
|
|
96
|
+
- `command`
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
## Constraints
|
|
100
|
+
- Do not create broad "do everything" agents.
|
|
101
|
+
- Do not give background agents write or shell access unless the task requires it.
|
|
102
|
+
- Do not create multiple agents that can edit the same file set in parallel.
|
|
103
|
+
- Do not treat model-generated review as a replacement for deterministic tests.
|
|
package/skills/neo-sub-agent/assets/templates/antigravity-skill.md
ADDED
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
---
|
|
2
|
+
{{frontmatter}}
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# {{title}}
|
|
6
|
+
|
|
7
|
+
Use this skill as an Antigravity CLI delegation blueprint. It defines when the main agent should route work to a focused background specialist and what that specialist must return.
|
|
8
|
+
|
|
9
|
+
## Compatibility Note
|
|
10
|
+
This file is an Agent Skill / delegation blueprint for Antigravity CLI. It is not a native custom sub-agent manifest; no stable standalone custom sub-agent manifest format is assumed here.
|
|
11
|
+
|
|
12
|
+
## Role
|
|
13
|
+
{{instructions}}
|
|
14
|
+
|
|
15
|
+
## Delegation Rules
|
|
16
|
+
- Use this role only when the task matches the description.
|
|
17
|
+
- Keep the delegated task bounded and self-contained.
|
|
18
|
+
- Return a concise result to the parent conversation instead of full logs.
|
|
19
|
+
- Ask the parent conversation for approval before destructive actions, broad rewrites, migrations, or production-impacting work.
|
|
20
|
+
|
|
21
|
+
## Output Contract
|
|
22
|
+
{{output_contract}}
|
|
23
|
+
|
|
24
|
+
## Validation
|
|
25
|
+
{{validation}}
|
|
package/skills/neo-sub-agent/assets/templates/codex-agent.toml
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{{toml}}
|
|
package/skills/neo-sub-agent/evals/eval_queries.json
ADDED
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
[
|
|
2
|
+
{
|
|
3
|
+
"query": "幫我新增一個 sub agent,專門在 Claude Code 裡做 code review",
|
|
4
|
+
"should_trigger": true
|
|
5
|
+
},
|
|
6
|
+
{
|
|
7
|
+
"query": "Create a Codex explorer agent that maps auth flows without editing files",
|
|
8
|
+
"should_trigger": true
|
|
9
|
+
},
|
|
10
|
+
{
|
|
11
|
+
"query": "我要一個 Copilot CLI testing specialist custom agent",
|
|
12
|
+
"should_trigger": true
|
|
13
|
+
},
|
|
14
|
+
{
|
|
15
|
+
"query": "幫 AGY 做一個背景研究 agent skill,可以整理大量 log",
|
|
16
|
+
"should_trigger": true
|
|
17
|
+
},
|
|
18
|
+
{
|
|
19
|
+
"query": "設計 planner、worker、reviewer 的 multi-agent workflow",
|
|
20
|
+
"should_trigger": true
|
|
21
|
+
},
|
|
22
|
+
{
|
|
23
|
+
"query": "幫我把現有 Claude subagent 轉成 Codex custom agent",
|
|
24
|
+
"should_trigger": true
|
|
25
|
+
},
|
|
26
|
+
{
|
|
27
|
+
"query": "請解釋 sub-agent 什麼時候不該用,並幫我判斷這個場景",
|
|
28
|
+
"should_trigger": true
|
|
29
|
+
},
|
|
30
|
+
{
|
|
31
|
+
"query": "幫我新增一個 Python skill,不需要 sub-agent",
|
|
32
|
+
"should_trigger": false
|
|
33
|
+
},
|
|
34
|
+
{
|
|
35
|
+
"query": "修復這個 Vue component 的 RWD 排版",
|
|
36
|
+
"should_trigger": false
|
|
37
|
+
},
|
|
38
|
+
{
|
|
39
|
+
"query": "請 review 目前 git diff",
|
|
40
|
+
"should_trigger": false
|
|
41
|
+
},
|
|
42
|
+
{
|
|
43
|
+
"query": "幫我寫 Conventional Commit 訊息",
|
|
44
|
+
"should_trigger": false
|
|
45
|
+
},
|
|
46
|
+
{
|
|
47
|
+
"query": "查詢今天台北天氣",
|
|
48
|
+
"should_trigger": false
|
|
49
|
+
}
|
|
50
|
+
]
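A labeled query list like this one can be scored against any trigger decision function. The sketch below is illustrative only: `score_trigger_evals` and `naive_predicate` are hypothetical, and the keyword check merely stands in for whatever routing the host CLI actually performs.

```python
import json


def score_trigger_evals(queries: list[dict], should_trigger) -> dict:
    """Compare a trigger predicate against labeled queries and summarize the result."""
    misses = [
        item["query"]
        for item in queries
        if bool(should_trigger(item["query"])) != item["should_trigger"]
    ]
    total = len(queries)
    return {"total": total, "correct": total - len(misses), "misses": misses}


# Two entries copied from the file above; a real run would json.load() the whole file.
sample = json.loads("""
[
  {"query": "Create a Codex explorer agent that maps auth flows without editing files", "should_trigger": true},
  {"query": "查詢今天台北天氣", "should_trigger": false}
]
""")


def naive_predicate(query: str) -> bool:
    # Placeholder keyword check standing in for the model's routing decision.
    return "agent" in query.lower()


result = score_trigger_evals(sample, naive_predicate)
```

Keeping the missed queries in the result, not just a count, makes it easier to see which phrasing the description fails to cover.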
|
|
package/skills/neo-sub-agent/evals/evals.json
ADDED
|
@@ -0,0 +1,49 @@
|
|
|
1
|
+
{
|
|
2
|
+
"skill_name": "neo-sub-agent",
|
|
3
|
+
"evals": [
|
|
4
|
+
{
|
|
5
|
+
"id": 1,
|
|
6
|
+
"prompt": "新增一個 Claude Code code-reviewer subagent,只能讀檔與搜尋,專門審查 correctness/security/tests。",
|
|
7
|
+
"expected_output": "產出 `.claude/agents/code-reviewer.md`,包含 `name`、trigger-focused `description`、read-only tools,以及明確的 findings-first review prompt。",
|
|
8
|
+
"assertions": [
|
|
9
|
+
"輸出檔案使用 Markdown YAML frontmatter",
|
|
10
|
+
"frontmatter 包含 name 與 description",
|
|
11
|
+
"工具權限不包含 edit 或 shell",
|
|
12
|
+
"prompt 要求先列出具體 findings 與檔案證據"
|
|
13
|
+
]
|
|
14
|
+
},
|
|
15
|
+
{
|
|
16
|
+
"id": 2,
|
|
17
|
+
"prompt": "Create a Codex auth-flow-explorer custom agent. It should inspect auth code paths and never modify files.",
|
|
18
|
+
"expected_output": "產出 `.codex/agents/auth-flow-explorer.toml`,包含 required Codex fields and read-only sandbox guidance.",
|
|
19
|
+
"assertions": [
|
|
20
|
+
"TOML 包含 name、description、developer_instructions",
|
|
21
|
+
"sandbox_mode 設為 read-only 或 instructions 明確禁止修改",
|
|
22
|
+
"description 說明何時使用此 explorer",
|
|
23
|
+
"developer_instructions 要求回傳 concise mapped paths"
|
|
24
|
+
]
|
|
25
|
+
},
|
|
26
|
+
{
|
|
27
|
+
"id": 3,
|
|
28
|
+
"prompt": "做一個 Copilot CLI test-specialist custom agent,負責找測試缺口與補測試,不要碰 production code。",
|
|
29
|
+
"expected_output": "產出 `.github/agents/test-specialist.agent.md`,包含 Copilot custom agent frontmatter and a scoped testing prompt.",
|
|
30
|
+
"assertions": [
|
|
31
|
+
"輸出檔案副檔名為 `.agent.md`",
|
|
32
|
+
"frontmatter 至少包含 description",
|
|
33
|
+
"tools 權限限制合理",
|
|
34
|
+
"instructions 明確禁止修改 production code"
|
|
35
|
+
]
|
|
36
|
+
},
|
|
37
|
+
{
|
|
38
|
+
"id": 4,
|
|
39
|
+
"prompt": "幫 Antigravity CLI 建立 log-researcher 背景研究 agent,但不要假設有原生 manifest。",
|
|
40
|
+
"expected_output": "產出 `.agents/skills/log-researcher/SKILL.md` 作為 Antigravity delegation blueprint,並明確標示不是原生 custom sub-agent manifest。",
|
|
41
|
+
"assertions": [
|
|
42
|
+
"輸出是 Agent Skill 格式且第一行為 YAML frontmatter",
|
|
43
|
+
"description 說明背景研究與大量 log 整理用途",
|
|
44
|
+
"內容包含 delegation rules",
|
|
45
|
+
"明確指出 Antigravity 原生 custom sub-agent manifest 資料不足"
|
|
46
|
+
]
|
|
47
|
+
}
|
|
48
|
+
]
|
|
49
|
+
}
|
|
package/skills/neo-sub-agent/evals/files/sample-sub-agent-spec.json
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
{
|
|
2
|
+
"clients": ["claude", "codex", "copilot", "agy"],
|
|
3
|
+
"scope": "project",
|
|
4
|
+
"name": "code-reviewer",
|
|
5
|
+
"description": "Reviews changed code for correctness, security, behavior regressions, and missing tests.",
|
|
6
|
+
"instructions": "You are a focused code reviewer. Prioritize correctness, security, behavior regressions, and missing tests. Return concrete findings first, each with file evidence and a suggested fix. Do not make code changes.",
|
|
7
|
+
"tools": {
|
|
8
|
+
"default": ["Read", "Grep", "Glob"],
|
|
9
|
+
"copilot": ["read", "search"]
|
|
10
|
+
},
|
|
11
|
+
"read_only": true,
|
|
12
|
+
"output_contract": [
|
|
13
|
+
"Findings ordered by severity",
|
|
14
|
+
"File paths and concrete evidence",
|
|
15
|
+
"Open questions only when they affect correctness",
|
|
16
|
+
"Test gaps and residual risk"
|
|
17
|
+
],
|
|
18
|
+
"validation": ["npm test"]
|
|
19
|
+
}
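The `tools` object in this sample carries a `default` list plus a per-client override for `copilot`. One plausible way to resolve it is sketched below; `tools_for` is a hypothetical helper, and the fallback order is an assumption rather than confirmed behavior of `render-sub-agent.py`.

```python
def tools_for(spec: dict, client: str) -> list[str]:
    """Pick the tool list for one client from a spec's `tools` field.

    Assumed rule: a client-specific key wins, otherwise `default` applies.
    A plain list is treated as applying to every client.
    """
    tools = spec.get("tools", [])
    if isinstance(tools, dict):
        return list(tools.get(client, tools.get("default", [])))
    return list(tools)


sample_spec = {
    "tools": {
        "default": ["Read", "Grep", "Glob"],
        "copilot": ["read", "search"],
    }
}
```

With this spec, `tools_for(sample_spec, "copilot")` picks the override, while `claude`, `codex`, and `agy` fall back to the `default` list.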
|
|
package/skills/neo-sub-agent/references/client-adapters.md
ADDED
|
@@ -0,0 +1,133 @@
|
|
|
1
|
+
# Client Adapter Reference
|
|
2
|
+
|
|
3
|
+
Use this reference before writing any client-specific sub-agent/custom-agent file. If a user asks for a field not listed here and it is not discoverable in the local project, state that the data is insufficient.
|
|
4
|
+
|
|
5
|
+
## Neutral Spec
|
|
6
|
+
Use this shape as the internal planning format before rendering files:
|
|
7
|
+
|
|
8
|
+
```json
|
|
9
|
+
{
|
|
10
|
+
"clients": ["claude", "codex", "copilot", "agy"],
|
|
11
|
+
"scope": "project",
|
|
12
|
+
"name": "code-reviewer",
|
|
13
|
+
"description": "Reviews changed code for correctness, security, and missing tests.",
|
|
14
|
+
"instructions": "You are a code reviewer...",
|
|
15
|
+
"tools": ["Read", "Grep", "Glob"],
|
|
16
|
+
"model": null,
|
|
17
|
+
"output_contract": "Return findings first, with file paths and concrete evidence.",
|
|
18
|
+
"validation": ["npm test"]
|
|
19
|
+
}
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
Use lowercase kebab-case for `name` to preserve cross-client compatibility.
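When a user supplies a display-style name, it may help to normalize it before filling in `name`. The helper below is only a sketch: `to_kebab_case` is a hypothetical function, and its splitting rules (camelCase boundaries, then runs of non-alphanumerics) are assumptions rather than part of the package.

```python
import re


def to_kebab_case(raw: str) -> str:
    """Convert a free-form label into lowercase kebab-case."""
    # Insert a hyphen at camelCase boundaries such as "logResearcher".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", raw.strip())
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", spaced.lower()).strip("-")


print(to_kebab_case("Code Reviewer"))       # code-reviewer
print(to_kebab_case("logResearcher"))       # log-researcher
print(to_kebab_case(" API_docs checker "))  # api-docs-checker
```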
|
|
23
|
+
|
|
24
|
+
## Claude Code
|
|
25
|
+
|
|
26
|
+
Verified format:
|
|
27
|
+
- Project path: `.claude/agents/<name>.md`
|
|
28
|
+
- User path: `~/.claude/agents/<name>.md`
|
|
29
|
+
- File type: Markdown with YAML frontmatter and Markdown body.
|
|
30
|
+
- Required frontmatter: `name`, `description`.
|
|
31
|
+
- Useful optional fields: `tools`, `disallowedTools`, `model`, `permissionMode`, `maxTurns`, `skills`, `mcpServers`, `memory`, `background`, `effort`, `isolation`, `color`.
|
|
32
|
+
- Body: system prompt for the subagent.
|
|
33
|
+
|
|
34
|
+
Default choices:
|
|
35
|
+
- Reviewer, planner, explorer: read-only tools.
|
|
36
|
+
- Implementation worker: only include edit/shell tools when requested.
|
|
37
|
+
- If model is unspecified, omit it so the subagent inherits the session default.
|
|
38
|
+
|
|
39
|
+
Example:
|
|
40
|
+
|
|
41
|
+
```markdown
|
|
42
|
+
---
|
|
43
|
+
name: code-reviewer
|
|
44
|
+
description: Reviews code for correctness, security, and missing tests.
|
|
45
|
+
tools: ["Read", "Grep", "Glob"]
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
You are a code reviewer...
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
## Codex
|
|
52
|
+
|
|
53
|
+
Verified format:
|
|
54
|
+
- Project path: `.codex/agents/<name>.toml`
|
|
55
|
+
- User path: `~/.codex/agents/<name>.toml`
|
|
56
|
+
- File type: standalone TOML custom agent file.
|
|
57
|
+
- Required fields: `name`, `description`, `developer_instructions`.
|
|
58
|
+
- Useful optional fields: `nickname_candidates`, `model`, `model_reasoning_effort`, `sandbox_mode`, `mcp_servers`, `skills.config`.
|
|
59
|
+
- Related global settings live under `[agents]` in Codex config, such as `max_threads`, `max_depth`, and `job_max_runtime_seconds`.
|
|
60
|
+
|
|
61
|
+
Default choices:
|
|
62
|
+
- Explorer/reviewer/planner: `sandbox_mode = "read-only"` when the user wants a non-mutating agent.
|
|
63
|
+
- Implementation worker: do not set `sandbox_mode` unless the user explicitly wants a different sandbox from the parent session.
|
|
64
|
+
- Keep `max_depth` at the default unless the user explicitly needs recursive delegation.
|
|
65
|
+
|
|
66
|
+
Example:
|
|
67
|
+
|
|
68
|
+
```toml
|
|
69
|
+
name = "code-reviewer"
|
|
70
|
+
description = "Reviews code for correctness, security, and missing tests."
|
|
71
|
+
sandbox_mode = "read-only"
|
|
72
|
+
developer_instructions = "Review code like an owner..."
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
## GitHub Copilot CLI
|
|
76
|
+
|
|
77
|
+
Verified format:
|
|
78
|
+
- Project path: `.github/agents/<name>.agent.md` or `.github/agents/<name>.md`
|
|
79
|
+
- User path: `~/.copilot/agents/<name>.agent.md`
|
|
80
|
+
- File type: Markdown with YAML frontmatter and Markdown body.
|
|
81
|
+
- Required frontmatter: `description`.
|
|
82
|
+
- Useful optional fields: `name`, `target`, `tools`, `model`, `disable-model-invocation`, `user-invocable`, `mcp-servers`, `metadata`.
|
|
83
|
+
- Tool aliases include `read`, `edit`, `search`, `execute`, `agent`, `web`, and `todo`; unsupported names are ignored by Copilot.
|
|
84
|
+
|
|
85
|
+
Default choices:
|
|
86
|
+
- Use `.agent.md` for clarity.
|
|
87
|
+
- Include `name` for human display, but rely on filename for the agent ID.
|
|
88
|
+
- Use `tools: ["read", "search"]` for reviewers and researchers unless edits are needed.
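Because unsupported tool names are ignored rather than rejected, a mistyped or Claude-style name can silently drop a permission. A small pre-render check could surface that. This is a sketch under assumptions: `split_copilot_tools` is a hypothetical helper, and the alias set is copied from the list above, which says the aliases "include" these names and so may not be exhaustive.

```python
# Aliases copied from the reference above; treat the set as a starting point.
KNOWN_COPILOT_ALIASES = {"read", "edit", "search", "execute", "agent", "web", "todo"}


def split_copilot_tools(tools: list[str]) -> tuple[list[str], list[str]]:
    """Return (recognized aliases, names worth double-checking)."""
    known = [tool for tool in tools if tool in KNOWN_COPILOT_ALIASES]
    unknown = [tool for tool in tools if tool not in KNOWN_COPILOT_ALIASES]
    return known, unknown


known, unknown = split_copilot_tools(["read", "Grep", "search"])
print(known)    # ['read', 'search']
print(unknown)  # ['Grep']
```

A name in the second list is not necessarily wrong; it is only a prompt to confirm support before relying on it.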
|
|
89
|
+
|
|
90
|
+
Example:
|
|
91
|
+
|
|
92
|
+
```markdown
|
|
93
|
+
---
|
|
94
|
+
name: code-reviewer
|
|
95
|
+
description: Reviews changed code for correctness, security, and missing tests.
|
|
96
|
+
tools: ["read", "search"]
|
|
97
|
+
---
|
|
98
|
+
|
|
99
|
+
You are a code reviewer...
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
## Antigravity CLI
|
|
103
|
+
|
|
104
|
+
Verified facts:
|
|
105
|
+
- Antigravity CLI supports runtime background subagents and an `/agents` panel.
|
|
106
|
+
- Antigravity CLI supports workspace skills in `.agents/skills/`.
|
|
107
|
+
- Antigravity CLI docs describe skills/plugins as the reusable customization path.
|
|
108
|
+
|
|
109
|
+
Insufficient data:
|
|
110
|
+
- No stable standalone custom sub-agent manifest format was verified for V1.
|
|
111
|
+
|
|
112
|
+
V1 adapter:
|
|
113
|
+
- Project path: `.agents/skills/<name>/SKILL.md`
|
|
114
|
+
- User/global path: `~/.gemini/antigravity-cli/skills/<name>/SKILL.md`
|
|
115
|
+
- Output type: Agent Skill delegation blueprint that instructs Antigravity to spawn or use a background specialist when appropriate.
|
|
116
|
+
|
|
117
|
+
Default choices:
|
|
118
|
+
- Label the output as a skill/delegation blueprint.
|
|
119
|
+
- Include clear conditions for when to delegate to a background subagent.
|
|
120
|
+
- Avoid claiming this creates a native Antigravity custom subagent manifest.
|
|
121
|
+
|
|
122
|
+
Example:
|
|
123
|
+
|
|
124
|
+
```markdown
|
|
125
|
+
---
|
|
126
|
+
name: code-reviewer
|
|
127
|
+
description: Reviews changed code in Antigravity CLI by delegating review work to a focused background agent when appropriate.
|
|
128
|
+
---
|
|
129
|
+
|
|
130
|
+
# Code Reviewer
|
|
131
|
+
|
|
132
|
+
Use this skill to run a focused code-review delegation...
|
|
133
|
+
```
|
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
# Sub-Agent Design Reference
|
|
2
|
+
|
|
3
|
+
This reference summarizes verified sub-agent design principles for developer CLIs and multi-agent workflows. Use it when designing a new agent role, deciding if delegation is appropriate, or reviewing an existing agent setup.
|
|
4
|
+
|
|
5
|
+
## Basis
|
|
6
|
+
- Agent Skills specification: a skill is a directory with `SKILL.md`; `name` and `description` drive discovery; detailed instructions should use progressive disclosure.
|
|
7
|
+
- Claude Code subagents: subagents run in isolated context windows with their own prompt, tool access, permissions, and optional model.
|
|
8
|
+
- Codex subagents: Codex can spawn specialized agents in parallel and collect results; custom agents live in standalone TOML files.
|
|
9
|
+
- GitHub Copilot CLI custom agents: custom agents are Markdown profiles that can be invoked manually, by prompt, or by inference.
|
|
10
|
+
- Antigravity CLI docs: background subagents are runtime workers, while custom reusable behavior is documented through skills/plugins. No stable standalone custom sub-agent manifest was verified for V1.
|
|
11
|
+
- ADK and LangGraph workflow docs: common patterns include routing, sequential pipelines, parallel fan-out/gather, evaluator-optimizer, and human-in-the-loop gates.
|
|
12
|
+
|
|
13
|
+
## Good Sub-Agent Jobs
|
|
14
|
+
- **Explorer**: read-only codebase mapping, dependency tracing, API discovery, log summarization.
|
|
15
|
+
- **Planner**: research and produce an implementation plan without editing.
|
|
16
|
+
- **Reviewer**: inspect diffs or files for correctness, security, tests, and regressions.
|
|
17
|
+
- **Executor**: perform bounded implementation from a clear spec.
|
|
18
|
+
- **Test runner**: run builds/tests and return failures with concise diagnostics.
|
|
19
|
+
- **Docs researcher**: verify current API behavior from official docs and return citations.
|
|
20
|
+
- **Migration auditor**: inspect many modules independently and return structured risk notes.
|
|
21
|
+
|
|
22
|
+
## Decision Rules
|
|
23
|
+
- Use sub-agents for context isolation, parallel independent work, specialized repeated roles, scoped permissions, or fresh review.
|
|
24
|
+
- Use the main conversation for small fixes, tightly sequential work, tasks needing repeated user clarification, or same-file edits.
|
|
25
|
+
- Use a skill instead of a sub-agent when the goal is reusable instructions inside the main context.
|
|
26
|
+
- Use a workflow/pipeline when order matters more than autonomy.
|
|
27
|
+
|
|
28
|
+
## Design Patterns
|
|
29
|
+
|
|
30
|
+
### Coordinator / Router
|
|
31
|
+
A main agent routes tasks to named specialists. Use when user requests fall into clear categories and each specialist has a distinct description.
|
|
32
|
+
|
|
33
|
+
Required decisions:
|
|
34
|
+
- Routing descriptions must be mutually distinguishable.
|
|
35
|
+
- The coordinator owns final synthesis unless the client uses handoff semantics.
|
|
36
|
+
- Specialists must return concise, structured outputs.
|
|
37
|
+
|
|
38
|
+
### Parallel Fan-Out / Gather
|
|
39
|
+
Run multiple independent agents at the same time and combine results. Use for module audits, multi-area research, PR review dimensions, or batch checks.
|
|
40
|
+
|
|
41
|
+
Required decisions:
|
|
42
|
+
- Each worker must have independent input.
|
|
43
|
+
- Avoid parallel writes to the same files.
|
|
44
|
+
- Cap concurrency when the client supports it.
|
|
45
|
+
- Ask workers to return fixed fields so synthesis is cheap.
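The decisions above can be sketched with the standard library: independent inputs, a concurrency cap, and a fixed result shape. Everything here is illustrative. `AuditResult`, `audit_module`, and `fan_out` are hypothetical names, and the worker is a local stand-in rather than a real sub-agent call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class AuditResult:
    # Fixed fields keep the gather step cheap for the parent agent.
    module: str
    risk: str
    evidence: str


def audit_module(module: str) -> AuditResult:
    # Stand-in for a delegated worker; it only reads its own input.
    risk = "high" if "auth" in module else "low"
    return AuditResult(module=module, risk=risk, evidence=f"inspected {module}")


def fan_out(modules: list[str], max_workers: int = 2) -> list[AuditResult]:
    # max_workers caps concurrency; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(audit_module, modules))


results = fan_out(["auth/login", "billing/invoice", "ui/theme"])
high = [result.module for result in results if result.risk == "high"]
print(high)  # ['auth/login']
```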
|
|
46
|
+
|
|
47
|
+
### Sequential Pipeline
|
|
48
|
+
Run agents in a fixed order: explore -> plan -> implement -> test -> review. Use when each stage produces an artifact the next stage consumes.
|
|
49
|
+
|
|
50
|
+
Required decisions:
|
|
51
|
+
- Define the artifact passed between stages.
|
|
52
|
+
- Stop at human approval gates for product scope, destructive actions, migrations, and production changes.
|
|
53
|
+
- Do not pretend a later reviewer can recover missing context if the prior artifact is vague.
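As a rough sketch of these decisions, the pipeline below passes one explicit artifact between stages and stops before any gated stage that lacks approval. The names `run_pipeline`, `Artifact`, and `Stage` are hypothetical, and the stage functions are placeholders for real agent calls.

```python
from typing import Callable, Optional

Artifact = dict[str, str]
# (stage name, stage function, whether a human must approve before it runs)
Stage = tuple[str, Callable[[Artifact], Artifact], bool]


def run_pipeline(
    stages: list[Stage],
    artifact: Artifact,
    approve: Callable[[str, Artifact], bool],
) -> tuple[Artifact, list[str], Optional[str]]:
    """Run stages in order; stop before the first unapproved gated stage."""
    completed: list[str] = []
    for name, run, needs_approval in stages:
        if needs_approval and not approve(name, artifact):
            return artifact, completed, name  # name is the pending approval
        artifact = run(artifact)
        completed.append(name)
    return artifact, completed, None


stages: list[Stage] = [
    ("explore", lambda a: {**a, "map": "3 modules touched"}, False),
    ("plan", lambda a: {**a, "plan": "rename helper, update callers"}, False),
    ("implement", lambda a: {**a, "diff": "applied"}, True),
]

# A background run cannot ask the user, so it declines every gate.
artifact, completed, pending = run_pipeline(
    stages, {"task": "refactor"}, lambda name, current: False
)
print(completed, pending)  # ['explore', 'plan'] implement
```

Returning the completed stages separately from the pending gate keeps finished work distinguishable from work that still awaits approval.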
|
|
54
|
+
|
|
55
|
+
### Generator / Critic
|
|
56
|
+
One agent generates an artifact and another reviews it. Use for plans, docs, tests, migrations, prompts, and generated configuration.
|
|
57
|
+
|
|
58
|
+
Required decisions:
|
|
59
|
+
- Reviewer criteria must be explicit.
|
|
60
|
+
- The critic should cite evidence and avoid style-only feedback unless it hides risk.
|
|
61
|
+
- The parent agent decides whether to apply changes.
|
|
62
|
+
|
|
63
|
+
### Human-In-The-Loop
|
|
64
|
+
Use for irreversible operations, security changes, permissions, production deploys, and broad rewrites.
|
|
65
|
+
|
|
66
|
+
Required decisions:
|
|
67
|
+
- State exactly what requires user approval.
|
|
68
|
+
- Background agents should fail safely when they cannot ask for approval.
|
|
69
|
+
- The final output must distinguish completed work from pending approvals.
|
|
70
|
+
|
|
71
|
+
## Prompt Design Checklist
|
|
72
|
+
- Start with a single role sentence.
|
|
73
|
+
- Include "use when" behavior in the description, not only the body.
|
|
74
|
+
- State allowed scope and explicit non-goals.
|
|
75
|
+
- State expected inputs from the parent agent.
|
|
76
|
+
- State output shape: summary, findings, changed files, commands, risks, next steps.
|
|
77
|
+
- Include permission constraints in the agent config when the client supports them.
|
|
78
|
+
- Prefer read-only tools for planning, review, research, and audit agents.
|
|
79
|
+
- Include validation commands only when they are discoverable or user-provided.
|
|
80
|
+
|
|
81
|
+
## Anti-Patterns
|
|
82
|
+
- A catch-all "senior engineer" agent with all tools and no narrow trigger.
|
|
83
|
+
- Many overlapping specialists whose descriptions compete for the same prompts.
|
|
84
|
+
- Background agents that require frequent questions.
|
|
85
|
+
- Parallel agents editing the same files.
|
|
86
|
+
- Recursive delegation without a hard depth/concurrency limit.
|
|
87
|
+
- Agents that return full logs instead of summaries and cited evidence.
|
|
88
|
+
- Client-specific fields copied across tools without checking support.
|
|
89
|
+
|
|
90
|
+
## Review Checklist
|
|
91
|
+
- The agent's role can be stated in one sentence.
|
|
92
|
+
- The description contains concrete trigger words users will actually type.
|
|
93
|
+
- Permissions are minimal for the job.
|
|
94
|
+
- The output contract is stable enough for the parent agent to synthesize.
|
|
95
|
+
- The agent states what it must not do.
|
|
96
|
+
- The design includes a validation or acceptance signal.
|
|
@@ -0,0 +1,401 @@
|
|
|
1
|
+
#!/usr/bin/env python3
|
|
2
|
+
# /// script
|
|
3
|
+
# requires-python = ">=3.11"
|
|
4
|
+
# dependencies = []
|
|
5
|
+
# ///
|
|
6
|
+
|
|
7
|
+
"""Render cross-client sub-agent files from a JSON spec.
|
|
8
|
+
|
|
9
|
+
Diagnostics go to stderr. JSON output goes to stdout.
|
|
10
|
+
"""
|
|
11
|
+
|
|
12
|
+
from __future__ import annotations
|
|
13
|
+
|
|
14
|
+
import argparse
|
|
15
|
+
import json
|
|
16
|
+
import re
|
|
17
|
+
import sys
|
|
18
|
+
from pathlib import Path
|
|
19
|
+
from typing import Any
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
CLIENT_ALIASES = {
|
|
23
|
+
"claude": "claude",
|
|
24
|
+
"claude-code": "claude",
|
|
25
|
+
"codex": "codex",
|
|
26
|
+
"copilot": "copilot",
|
|
27
|
+
"copilot-cli": "copilot",
|
|
28
|
+
"agy": "agy",
|
|
29
|
+
"antigravity": "agy",
|
|
30
|
+
"antigravity-cli": "agy",
|
|
31
|
+
}
|
|
32
|
+
|
|
33
|
+
ALL_CLIENTS = ["claude", "codex", "copilot", "agy"]
|
|
34
|
+
SAFE_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
|
|
35
|
+
|
|
36
|
+
|
|
37
|
+
class SpecError(ValueError):
|
|
38
|
+
"""Raised when the input spec is invalid."""
|
|
39
|
+
|
|
40
|
+
|
|
41
|
+
def log(message: str) -> None:
|
|
42
|
+
print(message, file=sys.stderr)
|
|
43
|
+
|
|
44
|
+
|
|
45
|
+
def read_json(path: Path) -> dict[str, Any]:
|
|
46
|
+
try:
|
|
47
|
+
data = json.loads(path.read_text(encoding="utf-8"))
|
|
48
|
+
except FileNotFoundError as exc:
|
|
49
|
+
raise SpecError(f"spec file not found: {path}") from exc
|
|
50
|
+
except json.JSONDecodeError as exc:
|
|
51
|
+
raise SpecError(f"spec is not valid JSON: {exc}") from exc
|
|
52
|
+
|
|
53
|
+
if not isinstance(data, dict):
|
|
54
|
+
raise SpecError("spec root must be a JSON object")
|
|
55
|
+
return data
|
|
56
|
+
|
|
57
|
+
|
|
58
|
+
def normalize_clients(value: Any) -> list[str]:
|
|
59
|
+
if value is None:
|
|
60
|
+
raise SpecError("spec.clients is required")
|
|
61
|
+
raw = value if isinstance(value, list) else [value]
|
|
62
|
+
clients: list[str] = []
|
|
63
|
+
for item in raw:
|
|
64
|
+
if not isinstance(item, str):
|
|
65
|
+
raise SpecError("spec.clients must contain strings")
|
|
66
|
+
key = item.strip().lower()
|
|
67
|
+
if key == "all":
|
|
68
|
+
for client in ALL_CLIENTS:
|
|
69
|
+
if client not in clients:
|
|
70
|
+
clients.append(client)
|
|
71
|
+
continue
|
|
72
|
+
client = CLIENT_ALIASES.get(key)
|
|
73
|
+
if not client:
|
|
74
|
+
raise SpecError(f"unsupported client: {item}")
|
|
75
|
+
if client not in clients:
|
|
76
|
+
clients.append(client)
|
|
77
|
+
if not clients:
|
|
78
|
+
raise SpecError("spec.clients must not be empty")
|
|
79
|
+
return clients
|
|
80
|
+
|
|
81
|
+
|
|
82
|
+
def validate_spec(spec: dict[str, Any]) -> dict[str, Any]:
|
|
83
|
+
name = spec.get("name")
|
|
84
|
+
description = spec.get("description")
|
|
85
|
+
instructions = (
|
|
86
|
+
spec.get("instructions")
|
|
87
|
+
or spec.get("prompt")
|
|
88
|
+
or spec.get("developer_instructions")
|
|
89
|
+
)
|
|
90
|
+
if not isinstance(name, str) or not name:
|
|
91
|
+
raise SpecError("spec.name is required")
|
|
92
|
+
if len(name) > 64 or not SAFE_NAME_RE.match(name):
|
|
93
|
+
raise SpecError("spec.name must be lowercase kebab-case, 1-64 chars")
|
|
94
|
+
if not isinstance(description, str) or not description.strip():
|
|
95
|
+
raise SpecError("spec.description is required")
|
|
96
|
+
if len(description) > 1024:
|
|
97
|
+
raise SpecError("spec.description must be 1024 chars or less")
|
|
98
|
+
if not isinstance(instructions, str) or not instructions.strip():
|
|
99
|
+
raise SpecError("spec.instructions is required")
|
|
100
|
+
|
|
101
|
+
scope = spec.get("scope", "project")
|
|
102
|
+
if scope not in ("project", "user"):
|
|
103
|
+
raise SpecError("spec.scope must be 'project' or 'user'")
|
|
104
|
+
|
|
105
|
+
normalized = dict(spec)
|
|
106
|
+
normalized["clients"] = normalize_clients(spec.get("clients"))
|
|
107
|
+
normalized["scope"] = scope
|
|
108
|
+
normalized["instructions"] = instructions.strip()
|
|
109
|
+
normalized["description"] = description.strip()
|
|
110
|
+
normalized["name"] = name
|
|
111
|
+
return normalized
|
|
112
|
+
|
|
113
|
+
|
|
114
|
+
def json_yaml(value: Any) -> str:
|
|
115
|
+
if isinstance(value, bool):
|
|
116
|
+
return "true" if value else "false"
|
|
117
|
+
if isinstance(value, (list, dict)):
|
|
118
|
+
return json.dumps(value, ensure_ascii=False)
|
|
119
|
+
if isinstance(value, (int, float)):
|
|
120
|
+
return str(value)
|
|
121
|
+
return json.dumps(str(value), ensure_ascii=False)
|
|
122
|
+
|
|
123
|
+
|
|
124
|
+
def yaml_frontmatter(fields: list[tuple[str, Any]]) -> str:
|
|
125
|
+
lines = []
|
|
126
|
+
for key, value in fields:
|
|
127
|
+
if value is None:
|
|
128
|
+
continue
|
|
129
|
+
if value == [] or value == {}:
|
|
130
|
+
lines.append(f"{key}: {json_yaml(value)}")
|
|
131
|
+
continue
|
|
132
|
+
if value == "":
|
|
133
|
+
continue
|
|
134
|
+
lines.append(f"{key}: {json_yaml(value)}")
|
|
135
|
+
return "\n".join(lines)
|
|
136
|
+
|
|
137
|
+
|
|
138
|
+
def toml_value(value: Any) -> str:
|
|
139
|
+
if isinstance(value, bool):
|
|
140
|
+
return "true" if value else "false"
|
|
141
|
+
if isinstance(value, (int, float)):
|
|
142
|
+
return str(value)
|
|
143
|
+
if isinstance(value, list):
|
|
144
|
+
return json.dumps(value, ensure_ascii=False)
|
|
145
|
+
return json.dumps(str(value), ensure_ascii=False)
|
|
146
|
+
|
|
147
|
+
|
|
148
|
+
def toml_document(fields: list[tuple[str, Any]], tables: dict[str, Any] | None = None) -> str:
|
|
149
|
+
lines = []
|
|
150
|
+
for key, value in fields:
|
|
151
|
+
if value is None or value == "":
|
|
152
|
+
continue
|
|
153
|
+
lines.append(f"{key} = {toml_value(value)}")
|
|
154
|
+
|
|
155
|
+
if tables:
|
|
156
|
+
for table_name, table_value in tables.items():
|
|
157
|
+
if not isinstance(table_value, dict):
|
|
158
|
+
continue
|
|
159
|
+
lines.append("")
|
|
160
|
+
lines.append(f"[mcp_servers.{table_name}]")
|
|
161
|
+
for key, value in table_value.items():
|
|
162
|
+
if isinstance(value, dict):
|
|
163
|
+
log(f"skipping nested mcp_servers.{table_name}.{key}; render manually")
|
|
164
|
+
continue
|
|
165
|
+
lines.append(f"{key} = {toml_value(value)}")
|
|
166
|
+
return "\n".join(lines) + "\n"
|
|
167
|
+
|
|
168
|
+
|
|
169
|
+
def template(name: str) -> str:
|
|
170
|
+
root = Path(__file__).resolve().parents[1]
|
|
171
|
+
return (root / "assets" / "templates" / name).read_text(encoding="utf-8")
|
|
172
|
+
|
|
173
|
+
|
|
174
|
+
def get_tools(spec: dict[str, Any], client: str) -> Any:
|
|
175
|
+
tools = spec.get("tools")
|
|
176
|
+
if tools is None:
|
|
177
|
+
return None
|
|
178
|
+
if isinstance(tools, dict):
|
|
179
|
+
return tools.get(client) or tools.get("default")
|
|
180
|
+
return tools
|
|
181
|
+
|
|
182
|
+
|
|
183
|
+
def output_contract(spec: dict[str, Any]) -> str:
|
|
184
|
+
value = spec.get("output_contract")
|
|
185
|
+
if isinstance(value, list):
|
|
186
|
+
return "\n".join(f"- {item}" for item in value)
|
|
187
|
+
if isinstance(value, str) and value.strip():
|
|
188
|
+
return value.strip()
|
|
189
|
+
return "- Summary\n- Evidence or changed files\n- Risks\n- Suggested next steps"
|
|
190
|
+
|
|
191
|
+
|
|
192
|
+
def validation(spec: dict[str, Any]) -> str:
|
|
193
|
+
value = spec.get("validation")
|
|
194
|
+
if isinstance(value, list):
|
|
195
|
+
return "\n".join(f"- `{item}`" for item in value)
|
|
196
|
+
if isinstance(value, str) and value.strip():
|
|
197
|
+
return f"- `{value.strip()}`"
|
|
198
|
+
return "- State any checks run. If no deterministic check is available, say so explicitly."
|
|
199
|
+
|
|
200
|
+
|
|
201
|
+
def target_path(client: str, spec: dict[str, Any], output_root: Path) -> Path:
|
|
202
|
+
name = spec["name"]
|
|
203
|
+
scope = spec["scope"]
|
|
204
|
+
if scope == "user":
|
|
205
|
+
home = Path.home()
|
|
206
|
+
if client == "claude":
|
|
207
|
+
return home / ".claude" / "agents" / f"{name}.md"
|
|
208
|
+
if client == "codex":
|
|
209
|
+
return home / ".codex" / "agents" / f"{name}.toml"
|
|
210
|
+
if client == "copilot":
|
|
211
|
+
return home / ".copilot" / "agents" / f"{name}.agent.md"
|
|
212
|
+
if client == "agy":
|
|
213
|
+
return home / ".gemini" / "antigravity-cli" / "skills" / name / "SKILL.md"
|
|
214
|
+
|
|
215
|
+
if client == "claude":
|
|
216
|
+
return output_root / ".claude" / "agents" / f"{name}.md"
|
|
217
|
+
if client == "codex":
|
|
218
|
+
return output_root / ".codex" / "agents" / f"{name}.toml"
|
|
219
|
+
if client == "copilot":
|
|
220
|
+
return output_root / ".github" / "agents" / f"{name}.agent.md"
|
|
221
|
+
if client == "agy":
|
|
222
|
+
return output_root / ".agents" / "skills" / name / "SKILL.md"
|
|
223
|
+
raise SpecError(f"unsupported client: {client}")
|
|
224
|
+
|
|
225
|
+
|
|
226
|
+
def render_claude(spec: dict[str, Any]) -> str:
|
|
227
|
+
fm = yaml_frontmatter(
|
|
228
|
+
[
|
|
229
|
+
("name", spec["name"]),
|
|
230
|
+
("description", spec["description"]),
|
|
231
|
+
("tools", get_tools(spec, "claude")),
|
|
232
|
+
("disallowedTools", spec.get("disallowed_tools") or spec.get("disallowedTools")),
|
|
233
|
+
("model", spec.get("model")),
|
|
234
|
+
("permissionMode", spec.get("permission_mode") or spec.get("permissionMode")),
|
|
235
|
+
("maxTurns", spec.get("max_turns") or spec.get("maxTurns")),
|
|
236
|
+
("skills", spec.get("skills")),
|
|
237
|
+
("mcpServers", spec.get("mcp_servers") or spec.get("mcpServers")),
|
|
238
|
+
("memory", spec.get("memory")),
|
|
239
|
+
("background", spec.get("background")),
|
|
240
|
+
("effort", spec.get("effort")),
|
|
241
|
+
("isolation", spec.get("isolation")),
|
|
242
|
+
("color", spec.get("color")),
|
|
243
|
+
]
|
|
244
|
+
)
|
|
245
|
+
return template("claude-agent.md").replace("{{frontmatter}}", fm).replace(
|
|
246
|
+
"{{instructions}}", spec["instructions"]
|
|
247
|
+
)
|
|
248
|
+
|
|
249
|
+
|
|
250
|
+
def render_codex(spec: dict[str, Any]) -> str:
|
|
251
|
+
sandbox_mode = spec.get("sandbox_mode")
|
|
252
|
+
if sandbox_mode is None and spec.get("read_only"):
|
|
253
|
+
sandbox_mode = "read-only"
|
|
254
|
+
doc = toml_document(
|
|
255
|
+
[
|
|
256
|
+
("name", spec["name"].replace("-", "_") if spec.get("codex_use_underscores") else spec["name"]),
|
|
257
|
+
("description", spec["description"]),
|
|
258
|
+
("model", spec.get("model")),
|
|
259
|
+
("model_reasoning_effort", spec.get("model_reasoning_effort")),
|
|
260
|
+
("sandbox_mode", sandbox_mode),
|
|
261
|
+
("developer_instructions", spec["instructions"]),
|
|
262
|
+
("nickname_candidates", spec.get("nickname_candidates")),
|
|
263
|
+
],
|
|
264
|
+
spec.get("mcp_servers"),
|
|
265
|
+
)
|
|
266
|
+
return template("codex-agent.toml").replace("{{toml}}", doc)
|
|
267
|
+
|
|
268
|
+
|
|
269
|
+
def render_copilot(spec: dict[str, Any]) -> str:
|
|
270
|
+
disable_model_invocation = spec.get("disable_model_invocation")
|
|
271
|
+
if disable_model_invocation is None and spec.get("auto_delegate") is False:
|
|
272
|
+
disable_model_invocation = True
|
|
273
|
+
fm = yaml_frontmatter(
|
|
274
|
+
[
|
|
275
|
+
("name", spec.get("display_name") or spec["name"]),
|
|
276
|
+
("description", spec["description"]),
|
|
277
|
+
("target", spec.get("target")),
|
|
278
|
+
("tools", get_tools(spec, "copilot")),
|
|
279
|
+
("model", spec.get("model")),
|
|
280
|
+
("disable-model-invocation", disable_model_invocation),
|
|
281
|
+
("user-invocable", spec.get("user_invocable")),
|
|
282
|
+
("mcp-servers", spec.get("mcp_servers")),
|
|
283
|
+
("metadata", spec.get("metadata")),
|
|
284
|
+
]
|
|
285
|
+
)
|
|
286
|
+
return template("copilot-agent.md").replace("{{frontmatter}}", fm).replace(
|
|
287
|
+
"{{instructions}}", spec["instructions"]
|
|
288
|
+
)
|
|
289
|
+
|
|
290
|
+
|
|
291
|
+
def render_agy(spec: dict[str, Any]) -> str:
|
|
292
|
+
fm = yaml_frontmatter(
|
|
293
|
+
[
|
|
294
|
+
("name", spec["name"]),
|
|
295
|
+
("description", spec["description"]),
|
|
296
|
+
]
|
|
297
|
+
)
|
|
298
|
+
title = spec.get("display_name") or spec["name"].replace("-", " ").title()
|
|
299
|
+
return (
|
|
300
|
+
template("antigravity-skill.md")
|
|
301
|
+
.replace("{{frontmatter}}", fm)
|
|
302
|
+
.replace("{{title}}", title)
|
|
303
|
+
.replace("{{instructions}}", spec["instructions"])
|
|
304
|
+
.replace("{{output_contract}}", output_contract(spec))
|
|
305
|
+
.replace("{{validation}}", validation(spec))
|
|
306
|
+
)
|
|
307
|
+
|
|
308
|
+
|
|
309
|
+
def render(client: str, spec: dict[str, Any]) -> tuple[str, list[str]]:
|
|
310
|
+
warnings: list[str] = []
|
|
311
|
+
if client == "claude":
|
|
312
|
+
return render_claude(spec), warnings
|
|
313
|
+
if client == "codex":
|
|
314
|
+
return render_codex(spec), warnings
|
|
315
|
+
if client == "copilot":
|
|
316
|
+
return render_copilot(spec), warnings
|
|
317
|
+
if client == "agy":
|
|
318
|
+
warnings.append(
|
|
319
|
+
"Antigravity output is a skill/delegation blueprint; no verified native custom sub-agent manifest is rendered."
|
|
320
|
+
)
|
|
321
|
+
return render_agy(spec), warnings
|
|
322
|
+
raise SpecError(f"unsupported client: {client}")
|
|
323
|
+
|
|
324
|
+
|
|
325
|
+
def parse_args() -> argparse.Namespace:
|
|
326
|
+
parser = argparse.ArgumentParser(
|
|
327
|
+
description="Render Claude, Codex, Copilot CLI, and Antigravity sub-agent files from JSON."
|
|
328
|
+
)
|
|
329
|
+
parser.add_argument("--spec", required=True, help="Path to JSON spec.")
|
|
330
|
+
parser.add_argument(
|
|
331
|
+
"--output-root",
|
|
332
|
+
default=".",
|
|
333
|
+
help="Project root for project-scoped outputs. Default: current directory.",
|
|
334
|
+
)
|
|
335
|
+
parser.add_argument(
|
|
336
|
+
"--write",
|
|
337
|
+
action="store_true",
|
|
338
|
+
help="Write files. Omit for dry-run JSON output only.",
|
|
339
|
+
)
|
|
340
|
+
parser.add_argument(
|
|
341
|
+
"--force",
|
|
342
|
+
action="store_true",
|
|
343
|
+
help="Overwrite existing files when used with --write.",
|
|
344
|
+
)
|
|
345
|
+
return parser.parse_args()
|
|
346
|
+
|
|
347
|
+
|
|
348
|
+
def main() -> int:
|
|
349
|
+
args = parse_args()
|
|
350
|
+
try:
|
|
351
|
+
spec = validate_spec(read_json(Path(args.spec)))
|
|
352
|
+
output_root = Path(args.output_root).resolve()
|
|
353
|
+
generated = []
|
|
354
|
+
all_warnings: list[str] = []
|
|
355
|
+
|
|
356
|
+
for client in spec["clients"]:
|
|
357
|
+
content, warnings = render(client, spec)
|
|
358
|
+
path = target_path(client, spec, output_root)
|
|
359
|
+
all_warnings.extend(warnings)
|
|
360
|
+
item = {
|
|
361
|
+
"client": client,
|
|
362
|
+
"target_path": str(path),
|
|
363
|
+
"content": content,
|
|
364
|
+
"warnings": warnings,
|
|
365
|
+
}
|
|
366
|
+
generated.append(item)
|
|
367
|
+
|
|
368
|
+
if args.write:
|
|
369
|
+
for item in generated:
|
|
370
|
+
path = Path(item["target_path"])
|
|
371
|
+
if path.exists() and not args.force:
|
|
372
|
+
raise SpecError(f"target already exists; use --force to overwrite: {path}")
|
|
373
|
+
for item in generated:
|
|
374
|
+
path = Path(item["target_path"])
|
|
375
|
+
path.parent.mkdir(parents=True, exist_ok=True)
|
|
376
|
+
path.write_text(item["content"], encoding="utf-8", newline="\n")
|
|
377
|
+
log(f"wrote {path}")
|
|
378
|
+
|
|
379
|
+
result = {
|
|
380
|
+
"status": "written" if args.write else "dry-run",
|
|
381
|
+
"generated": generated,
|
|
382
|
+
"warnings": all_warnings,
|
|
383
|
+
}
|
|
384
|
+
print(json.dumps(result, ensure_ascii=False, indent=2))
|
|
385
|
+
return 0
|
|
386
|
+
except SpecError as exc:
|
|
387
|
+
print(
|
|
388
|
+
json.dumps({"status": "error", "error_message": str(exc)}, ensure_ascii=False, indent=2)
|
|
389
|
+
)
|
|
390
|
+
log(f"error: {exc}")
|
|
391
|
+
return 3
|
|
392
|
+
except OSError as exc:
|
|
393
|
+
print(
|
|
394
|
+
json.dumps({"status": "error", "error_message": str(exc)}, ensure_ascii=False, indent=2)
|
|
395
|
+
)
|
|
396
|
+
log(f"I/O error: {exc}")
|
|
397
|
+
return 5
|
|
398
|
+
|
|
399
|
+
|
|
400
|
+
if __name__ == "__main__":
|
|
401
|
+
raise SystemExit(main())
|