@moon791017/neo-skills 1.1.8 → 1.1.10
package/README.md
CHANGED
|
@@ -11,7 +11,7 @@
|
|
|
11
11
|
|
|
12
12
|
The project currently provides:
|
|
13
13
|
|
|
14
|
-
- `skills
|
|
14
|
+
- `skills/`: multiple built-in expert skills.
|
|
15
15
|
- `bin/install-skills.js`: syncs skills to the Antigravity CLI global skills directory.
|
|
16
16
|
- `bin/install-system-instructions.js`: installs the system prompt into the instruction file for Claude Code, Copilot CLI, Codex, or Antigravity.
|
|
17
17
|
- `scripts/check-skills-syntax.py`: a non-interactive tool that validates skill structure.
|
package/package.json
CHANGED
|
@@ -2,9 +2,11 @@
|
|
|
2
2
|
name: neo-agent-harness
|
|
3
3
|
description: >
|
|
4
4
|
Use this skill when the user asks to improve AI-assisted development reliability,
|
|
5
|
-
AGENTS.md, skills, tests, CI, hooks, review loops,
|
|
6
|
-
|
|
7
|
-
|
|
5
|
+
AGENTS.md, skills, tests, CI, hooks, review loops, agent workflow governance,
|
|
6
|
+
loop engineering, agent automations, worktree isolation, maker/checker separation,
|
|
7
|
+
or external state design for long-running agents.
|
|
8
|
+
It designs feedforward guides, feedback sensors, verification gates, human
|
|
9
|
+
decision points, and loop architectures from repository evidence.
|
|
8
10
|
---
|
|
9
11
|
|
|
10
12
|
# Agent Harness Architect Skill
|
|
@@ -14,10 +16,17 @@ description: >
|
|
|
14
16
|
- The user wants coding agents to be more reliable, safer, or easier to supervise.
|
|
15
17
|
- The user asks for AGENTS.md, skills, tests, CI, hooks, review loops, or project rules to be planned together.
|
|
16
18
|
- The task is about harness engineering, agent harnessability, feedback loops, or "humans on the loop".
|
|
19
|
+
- The user wants to design scheduled automations, cron-driven agent loops, or recurring agent tasks.
|
|
20
|
+
- The user wants multiple agents to work in parallel and needs an isolation strategy (worktree isolation).
|
|
21
|
+
- The user wants to design maker/checker separation (generation and verification by different agents).
|
|
22
|
+
- The user wants to design external memory or cross-conversation state persistence for agents.
|
|
23
|
+
- The user asks how to evolve from a harness to loop-driven development.
|
|
17
24
|
|
|
18
25
|
## Core Principle
|
|
19
26
|
Treat agent-assisted development as a controlled engineering system. Do not only improve prompts; design the guides, sensors, verification steps, and human decision points that let agents work with higher confidence.
|
|
20
27
|
|
|
28
|
+
When the task involves automations, parallel agents, or continuous loops, extend the design to cover loop architecture: the scheduling, isolation, separation, integration, and state layers that let the harness run itself.
|
|
29
|
+
|
|
21
30
|
Use this skill for planning first. Do not modify files unless the user explicitly asks for implementation after the harness plan is clear.
|
|
22
31
|
|
|
23
32
|
## Perceive
|
|
@@ -27,6 +36,9 @@ Use this skill for planning first. Do not modify files unless the user explicitl
|
|
|
27
36
|
- Validation commands: `package.json`, `pyproject.toml`, `Makefile`, CI workflow files, build scripts.
|
|
28
37
|
- Safety and governance: hooks, linters, formatters, secret scanners, dependency scanners.
|
|
29
38
|
- Existing documentation: README, architecture docs, ADRs, testing docs, contribution docs.
|
|
39
|
+
- Agent definitions: `.codex/agents/`, `.claude/agents/`, subagent configs, agent team definitions.
|
|
40
|
+
- Automation and scheduling: cron configs, GitHub Actions workflows, hooks, `/loop` or `/goal` usage.
|
|
41
|
+
- State files: progress markdown, task boards, Linear integrations, cross-conversation state.
|
|
30
42
|
2. Identify the project type, primary languages, current test surface, CI status, and release/deploy path.
|
|
31
43
|
3. Separate discoverable repository facts from product or team preferences that require user confirmation.
|
|
32
44
|
|
|
@@ -48,8 +60,15 @@ Analyze the project through these dimensions:
|
|
|
48
60
|
- Behaviour: requirements, acceptance criteria, user journeys, fixtures, manual test points.
|
|
49
61
|
5. **Human role**
|
|
50
62
|
- Move human effort from repeated low-level correction to high-value decisions: scope, product fit, risk acceptance, architectural tradeoffs, and harness evolution.
|
|
63
|
+
6. **Loop design** (conditional: activate only when the task involves scheduled automations, parallel agents, or continuous loops)
|
|
64
|
+
- **Automations**: which tasks suit scheduled triggers? Frequency, trigger conditions, result routing (triage inbox vs auto-process).
|
|
65
|
+
- **Worktrees**: how to isolate parallel agents? Independent branches and working directories needed?
|
|
66
|
+
- **Maker/checker separation**: which steps need generation and verification by different agents? (For sub-agent definition and implementation, use the neo-sub-agent skill.)
|
|
67
|
+
- **Connectors**: which external tools does the loop need to reach? (issue tracker, CI, Slack, staging API, database)
|
|
68
|
+
- **State**: how to persist cross-conversation progress? Format (markdown / board / Linear)? Who updates it?
|
|
69
|
+
- **Loop risks**: assess comprehension debt, cognitive surrender, and unattended verification risk levels.
|
|
51
70
|
|
|
52
|
-
For detailed patterns and examples, read [reference/harness-patterns.md](reference/harness-patterns.md) when the task needs a full harness design, maturity model, or improvement roadmap. For the complete source article synthesis behind this skill, read [reference/agent-harness-engineering.md](reference/agent-harness-engineering.md) only when deeper conceptual background is needed.
|
|
71
|
+
For detailed patterns and examples, read [reference/harness-patterns.md](reference/harness-patterns.md) when the task needs a full harness design, maturity model, or improvement roadmap. For the complete source article synthesis behind this skill, read [reference/agent-harness-engineering.md](reference/agent-harness-engineering.md) only when deeper conceptual background is needed. For loop engineering patterns, the five loop primitives, risk model, and maturity guidance, read [reference/loop-engineering.md](reference/loop-engineering.md) when the task involves automations, parallel agents, maker/checker separation, or loop-driven development.
|
|
53
72
|
|
|
54
73
|
## Act
|
|
55
74
|
Output in Traditional Chinese (Taiwan). Use this structure:
|
|
@@ -57,6 +76,7 @@ Output in Traditional Chinese (Taiwan). Use this structure:
|
|
|
57
76
|
### 1. 現況盤點
|
|
58
77
|
- Summarize the repository facts discovered.
|
|
59
78
|
- Mention the current guides, sensors, and missing signals.
|
|
79
|
+
- If loop-relevant infrastructure exists (automations, agent definitions, state files), include it here.
|
|
60
80
|
|
|
61
81
|
### 2. Harnessability 評估
|
|
62
82
|
- Rate the current harnessability as `低`, `中`, or `高`.
|
|
@@ -74,9 +94,22 @@ Output in Traditional Chinese (Taiwan). Use this structure:
|
|
|
74
94
|
- Prioritize improvements as P0, P1, and P2.
|
|
75
95
|
- P0 must focus on changes that reduce repeated agent mistakes or prevent high-risk failures.
|
|
76
96
|
|
|
97
|
+
### 5.5 Loop 設計(條件式:僅當任務涉及 loop 時輸出)
|
|
98
|
+
- List tasks suited for scheduled automation, with suggested frequency.
|
|
99
|
+
- Describe the isolation strategy for parallel agents (worktree / branch / container).
|
|
100
|
+
- Identify which steps need maker/checker separation and their verification criteria.
|
|
101
|
+
- List required connectors and external integrations.
|
|
102
|
+
- Design the state persistence scheme and update responsibilities.
|
|
103
|
+
- Assess loop risks: comprehension debt, cognitive surrender, unattended verification, and propose safeguards.
|
|
104
|
+
|
|
77
105
|
### 6. 人類決策點
|
|
78
106
|
- State where humans should stay on the loop.
|
|
79
107
|
- Identify decisions that should not be delegated fully to agents.
|
|
108
|
+
- When loop design is involved, assess these three specific risks:
|
|
109
|
+
- **Verification responsibility**: error accumulation when the loop runs unattended.
|
|
110
|
+
- **Comprehension debt**: the faster the loop produces code, the wider the understanding gap.
|
|
111
|
+
- **Cognitive surrender**: tendency to stop having opinions when the loop runs itself.
|
|
112
|
+
- Clearly mark which loop stages still require human intervention (scope decisions, risk acceptance, architectural tradeoffs).
|
|
80
113
|
|
|
81
114
|
### 7. 驗證方式
|
|
82
115
|
- Provide exact commands or review steps when discoverable.
|
|
@@ -88,3 +121,7 @@ Output in Traditional Chinese (Taiwan). Use this structure:
|
|
|
88
121
|
- Do not recommend broad automation before the project has reliable local validation commands.
|
|
89
122
|
- Do not propose fully autonomous changes for security, compliance, production deploys, or product-scope decisions.
|
|
90
123
|
- Keep the output actionable and tied to repository evidence.
|
|
124
|
+
- Do not recommend loop-driven automations for projects that lack reliable local validation commands and stable CI.
|
|
125
|
+
- When designing loops, explicitly assess comprehension debt, cognitive surrender, and unattended verification risks.
|
|
126
|
+
- For sub-agent design and implementation details, defer to the neo-sub-agent skill to avoid responsibility overlap.
|
|
127
|
+
- Mark loop engineering concepts as emerging practice, not established best practice.
|
|
@@ -178,3 +178,17 @@ Common improvement sequence:
|
|
|
178
178
|
- Humans prioritize and approve harness changes.
|
|
179
179
|
- Low-risk improvements can be automated after repeated success.
|
|
180
180
|
- Harness quality itself is reviewed regularly.
|
|
181
|
+
|
|
182
|
+
### Level 5: Loop-Driven Development
|
|
183
|
+
|
|
184
|
+
- Harness is driven by scheduled automations, no longer relying on manual triggers.
|
|
185
|
+
- Multiple agents work in parallel across isolated worktrees.
|
|
186
|
+
- Generation and verification are handled by separate agents (maker/checker separation).
|
|
187
|
+
- Connectors let the loop directly operate issue trackers, PRs, CI, and notification channels.
|
|
188
|
+
- A state file persists progress across conversations; each run resumes from where the last one stopped.
|
|
189
|
+
- The human role shifts from in-the-loop line-by-line review to on-the-loop supervision: designing loops, sampling outputs, and evolving the harness.
|
|
190
|
+
- Comprehension debt, cognitive surrender, and unattended verification risks are explicitly assessed and managed.
|
|
191
|
+
- High-risk changes are forced out of the loop to await human decisions.
|
|
192
|
+
|
|
193
|
+
Prerequisites: Level 4 harness improvement loop is running stably, CI is reliable, and review processes are mature.
|
|
194
|
+
Note: Loop engineering is still an emerging practice. Start with low-risk, repetitive tasks.
|
|
@@ -0,0 +1,149 @@
|
|
|
1
|
+
# Loop Engineering
|
|
2
|
+
|
|
3
|
+
Use this reference when designing loop architectures that automate agent-driven workflows beyond a single session.
|
|
4
|
+
|
|
5
|
+
## How Loops Relate to the Harness
|
|
6
|
+
|
|
7
|
+
- Harness = the working environment for a single agent (guides + sensors + gates)
|
|
8
|
+
- Loop = a schedule-driven layer on top of the harness that lets the harness run itself
|
|
9
|
+
- Designing a loop does not replace prompting; it systematizes the prompting you would otherwise repeat by hand
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
Loop = Automations + Worktrees + Skills + Connectors + Sub-agents + State
|
|
13
|
+
─────────────────────────────────────────────────────────────────
|
|
14
|
+
running on top of the Harness
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
## Five Primitives + State
|
|
18
|
+
|
|
19
|
+
### 1. Automations (the heartbeat)
|
|
20
|
+
|
|
21
|
+
A loop without automations runs only once; automations are what make it repeat.
|
|
22
|
+
|
|
23
|
+
- Triggered on a schedule to run discovery and triage at set intervals.
|
|
24
|
+
- Runs that find issues go to the triage inbox; runs that find nothing are archived automatically.
|
|
25
|
+
- Pair automations with skills to keep scheduled tasks maintainable: invoke `$skill-name` instead of pasting a long block of instructions.
|
|
26
|
+
- `/loop` repeats at a set frequency; `/goal` keeps running until a stop condition holds, and a separate model judges whether the goal is complete.
|
|
27
|
+
|
|
28
|
+
Tool mapping:
|
|
29
|
+
|
|
30
|
+
- Codex: the Automations tab (choose project, prompt, frequency, and environment), with results delivered to the Triage inbox; `/goal` for run-until-done.
|
|
31
|
+
- Claude Code: `/loop`, `/goal`, hooks, cron, GitHub Actions.
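The routing rule in this section (runs with findings go to the triage inbox, empty runs are archived) can be sketched in a few lines of Python. This is only an illustration: `RunResult`, `route_run`, and the destination names are hypothetical and are not part of Codex or Claude Code.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Outcome of one scheduled automation run (hypothetical shape)."""
    task: str
    findings: list[str] = field(default_factory=list)


def route_run(result: RunResult) -> str:
    """Send runs with findings to a human triage inbox; archive empty runs."""
    return "triage_inbox" if result.findings else "archive"


runs = [
    RunResult("nightly-ci-scan", ["flaky test: test_login"]),
    RunResult("dependency-check"),  # nothing found on this run
]
routed = {run.task: route_run(run) for run in runs}
print(routed)
```

Keeping the routing decision in one small function makes it easy to review and to change when the team decides some findings can be auto-processed.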
|
|
32
|
+
|
|
33
|
+
### 2. Worktrees (isolation)
|
|
34
|
+
|
|
35
|
+
Avoid file conflicts when multiple agents run in parallel.
|
|
36
|
+
|
|
37
|
+
- Each agent works in its own git worktree while sharing the repo history.
|
|
38
|
+
- One agent's edits never touch another agent's checkout.
|
|
39
|
+
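- Human review bandwidth remains the bottleneck: worktrees resolve mechanical conflicts, but the number of threads you can review at once determines how many agents you can run (the orchestration tax).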
|
|
40
|
+
|
|
41
|
+
Tool mapping:
|
|
42
|
+
|
|
43
|
+
- Codex: built-in worktree per thread.
|
|
44
|
+
- Claude Code: `git worktree`, the `--worktree` flag, and the `isolation: worktree` setting for subagents.
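As a rough sketch of per-agent isolation with plain `git worktree`, the helper below builds one `git worktree add -b <branch> <path>` command per task. The `wt-` directory prefix, the `agent/` branch prefix, and both function names are assumptions made for illustration, not conventions from this package.

```python
import subprocess
from pathlib import Path


def worktree_command(base_dir: Path, task_id: str) -> list[str]:
    """Build a `git worktree add` command giving one task its own branch and directory."""
    path = base_dir / f"wt-{task_id}"   # hypothetical directory naming
    branch = f"agent/{task_id}"         # hypothetical branch naming
    return ["git", "worktree", "add", "-b", branch, str(path)]


def create_worktree(repo: Path, base_dir: Path, task_id: str) -> None:
    """Run the command inside the repository; raises CalledProcessError on failure."""
    subprocess.run(worktree_command(base_dir, task_id), cwd=repo, check=True)


cmd = worktree_command(Path("/tmp/worktrees"), "fix-123")
print(" ".join(cmd))
```

Separating command construction from execution keeps the naming scheme testable without touching a real repository.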
|
|
45
|
+
|
|
46
|
+
### 3. Skills (codified knowledge)
|
|
47
|
+
|
|
48
|
+
Write project context that you keep re-explaining into a SKILL.md.
|
|
49
|
+
|
|
50
|
+
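- Eliminates intent debt: on every cold start, an agent fills gaps in intent with confident guesses. A skill writes the intent down outside the conversation, so the agent reads it each time instead of reconstructing it.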
|
|
51
|
+
- A loop without skills re-derives your whole project from scratch every cycle; a loop with skills carries forward what was learned before.
|
|
52
|
+
- A skill is the authoring format and a plugin is the distribution format: package skills as a plugin to share them across repos.
|
|
53
|
+
|
|
54
|
+
Tool mapping:
|
|
55
|
+
|
|
56
|
+
- Codex: Agent Skills (`SKILL.md`), invoked with `$name` or `/skills`, or triggered automatically by the description.
|
|
57
|
+
- Claude Code: Agent Skills (`SKILL.md`).
|
|
58
|
+
|
|
59
|
+
### 4. Plugins / Connectors (external integration)
|
|
60
|
+
|
|
61
|
+
Connect external tools through MCP so the loop can act in real environments.
|
|
62
|
+
|
|
63
|
+
- Can connect to issue trackers, databases, staging APIs, and Slack.
|
|
64
|
+
- Codex and Claude Code both use MCP, so connectors usually work across tools.
|
|
65
|
+
- Plugins bundle connectors and skills together so teammates can install them in one step.
|
|
66
|
+
|
|
67
|
+
A loop without connectors can only output suggestions; a loop with connectors can open PRs, link tickets, and ping channels directly.
|
|
68
|
+
|
|
69
|
+
### 5. Sub-agents (separating generation from verification)
|
|
70
|
+
|
|
71
|
+
The structural prerequisite of a loop is separating the maker from the checker.
|
|
72
|
+
|
|
73
|
+
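- The model that writes the code grades its own work too leniently. A second agent with different instructions (and sometimes a different model) catches the problems the first one talked itself into accepting.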
|
|
74
|
+
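- Under the hood, `/goal` is also maker/checker separation: a separate small model judges whether the loop is done, rather than letting the working agent declare itself finished.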
|
|
75
|
+
- A common division of labor: one agent explores, one implements, and one verifies against the spec.
|
|
76
|
+
- Sub-agents burn more tokens, so spend them where a second opinion is worth it.
|
|
77
|
+
|
|
78
|
+
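> **Responsibility boundary**: this section covers only the design decision of why a loop needs maker/checker separation. For implementation details such as sub-agent definition format, instruction writing, and model selection, use the `neo-sub-agent` skill.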
|
|
79
|
+
|
|
80
|
+
Tool mapping:
|
|
81
|
+
|
|
82
|
+
- Codex: TOML definition files under `.codex/agents/`, each with a name, description, instructions, and an optional model and reasoning effort.
|
|
83
|
+
- Claude Code: subagent definitions under `.claude/agents/`, plus agent teams.
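The maker/checker idea can be illustrated without any agent framework. In the sketch below, `run_maker_checker`, `toy_maker`, and `toy_checker` are hypothetical stand-ins: in a real loop the two callables would be separately instructed agents, and the round limit of 3 is an arbitrary assumption.

```python
from typing import Callable, Optional

Maker = Callable[[str, str], str]                 # (task, feedback) -> draft
Checker = Callable[[str, str], tuple[bool, str]]  # (task, draft) -> (accepted, feedback)


def run_maker_checker(
    task: str, maker: Maker, checker: Checker, max_rounds: int = 3
) -> tuple[Optional[str], int]:
    """Alternate maker and checker until the checker accepts or rounds run out."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        draft = maker(task, feedback)
        accepted, feedback = checker(task, draft)
        if accepted:
            return draft, round_no
    return None, max_rounds  # never self-approve; escalate to a human instead


def toy_maker(task: str, feedback: str) -> str:
    # First draft is wrong on purpose; the second uses the checker's feedback.
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


def toy_checker(task: str, draft: str) -> tuple[bool, str]:
    namespace: dict = {}
    exec(draft, namespace)  # toy verification only; do not exec untrusted code
    ok = namespace["add"](2, 3) == 5
    return ok, "" if ok else "add(2, 3) should be 5"


result = run_maker_checker("implement add", toy_maker, toy_checker)
print(result)
```

The point of the structure is that acceptance is decided only by the checker; the maker has no path to declare its own work done.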
|
|
84
|
+
|
|
85
|
+
### 6. State (external memory)
|
|
86
|
+
|
|
87
|
+
Models forget between conversations, so progress must be written into the repo.
|
|
88
|
+
|
|
89
|
+
- Format: a markdown file, a Linear board, or any persistent storage outside the conversation.
|
|
90
|
+
- State records what was done, what passed, and what remains. Every long-running agent depends on it: the agent forgets, the repo does not.
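A minimal sketch of such external state follows. It assumes a JSON file for simplicity (the text above names markdown files and Linear boards as formats); the file name `loop-state.json`, the three keys, and the helper functions are all hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return saved progress, or an empty structure on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"attempted": [], "passed": [], "open": []}


def record_attempt(state: dict, finding: str, passed: bool) -> None:
    """Track what was tried, what passed, and what is still open."""
    if finding not in state["attempted"]:
        state["attempted"].append(finding)
    bucket, other = ("passed", "open") if passed else ("open", "passed")
    if finding not in state[bucket]:
        state[bucket].append(finding)
    if finding in state[other]:
        state[other].remove(finding)


def save_state(path: Path, state: dict) -> None:
    """Write progress to disk with a timestamp so the next run can resume."""
    state["updated_at"] = datetime.now(timezone.utc).isoformat()
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "loop-state.json"
    state = load_state(state_path)
    record_attempt(state, "flaky-login-test", passed=False)
    save_state(state_path, state)
    resumed = load_state(state_path)  # what the next run would see
```

In practice the file would live in the repository rather than a temporary directory, so that it survives between runs and shows up in review.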
|
|
91
|
+
|
|
92
|
+
## Primitive Reference Table
|
|
93
|
+
|
|
94
|
+
| Primitive | Role in the loop | Codex | Claude Code |
|
|
95
|
+
|:--|:--|:--|:--|
|
|
96
|
+
| Automations | Scheduled discovery and triage | Automations tab, `/goal` | `/loop`, `/goal`, hooks, cron, GitHub Actions |
|
|
97
|
+
| Worktrees | Isolated parallel work | Built-in worktree per thread | `git worktree`, `--worktree`, `isolation: worktree` |
|
|
98
|
+
| Skills | Codify project knowledge | Agent Skills (`SKILL.md`), `$name` | Agent Skills (`SKILL.md`) |
|
|
99
|
+
| Plugins / Connectors | External tool integration | Connectors (MCP) + Plugins | MCP servers + Plugins |
|
|
100
|
+
| Sub-agents | Separate generation from verification | `.codex/agents/` TOML | `.claude/agents/` + agent teams |
|
|
101
|
+
| State | Cross-conversation progress | Markdown / Linear connector | Markdown (`AGENTS.md`, progress files) / Linear MCP |
|
|
102
|
+
|
|
103
|
+
## Example: A Complete Loop Flow
|
|
104
|
+
|
|
105
|
+
1. An **automation** runs on the repo every morning; its prompt invokes a triage skill.
|
|
106
|
+
2. The triage skill reads yesterday's CI failures, open issues, and recent commits.
|
|
107
|
+
3. Findings worth acting on are written to the **state file** or a Linear board.
|
|
108
|
+
4. For each finding, open an isolated **worktree**.
|
|
109
|
+
5. Send a **sub-agent** (the maker) into the worktree to draft a fix.
|
|
110
|
+
6. Send a second **sub-agent** (the checker) to review the draft using the project **skills** and existing tests.
|
|
111
|
+
7. **Connectors** open the PR, update the ticket, and ping the channel once CI passes.
|
|
112
|
+
8. Findings that cannot be handled go to the triage inbox for a human.
|
|
113
|
+
9. The **state file** records what was attempted, what passed, and what is still open.
|
|
114
|
+
10. Tomorrow morning's run resumes from the state.
|
|
115
|
+
|
|
116
|
+
You design it once, and after that you no longer prompt any step by hand.
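The flow above can be condensed into a single-run skeleton. This is a sketch under stated assumptions: `run_once`, the `draft_fix` and `review_fix` callables, and the in-memory `state` dictionary are hypothetical placeholders for the maker agent, the checker agent, and the state file; worktree creation and connector calls are only marked in comments.

```python
from typing import Callable, Optional


def run_once(
    findings: list[str],
    state: dict,
    draft_fix: Callable[[str], Optional[str]],
    review_fix: Callable[[str, str], bool],
) -> list[str]:
    """Process one scheduled run and return the findings escalated to a human."""
    inbox: list[str] = []
    for finding in findings:
        if finding in state["passed"]:
            continue  # already fixed in an earlier run; resume past it
        state["attempted"].append(finding)
        draft = draft_fix(finding)  # maker, working in its own worktree
        if draft is not None and review_fix(finding, draft):  # checker
            state["passed"].append(finding)  # a connector would open the PR here
        else:
            inbox.append(finding)  # could not be handled: triage inbox
    state["open"] = inbox
    return inbox


state = {"attempted": [], "passed": ["lint-error"], "open": []}
inbox = run_once(
    ["lint-error", "typo-in-docs", "auth-redesign"],
    state,
    draft_fix=lambda f: None if f == "auth-redesign" else f"patch for {f}",
    review_fix=lambda f, draft: draft.startswith("patch"),
)
print(inbox, state["passed"])
```

Passing the maker and checker in as callables keeps the orchestration logic independent of whichever agent tooling actually performs each step.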
|
|
117
|
+
|
|
118
|
+
## Three Major Loop Risks
|
|
119
|
+
|
|
120
|
+
### 1. Verification Is Still on You
|
|
121
|
+
|
|
122
|
+
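A loop that runs unattended also makes mistakes unattended. Maker/checker separation is necessary but not sufficient: "done" is a claim, not a proof. Your job is to ship code you have confirmed works.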
|
|
123
|
+
|
|
124
|
+
### 2. Comprehension Debt
|
|
125
|
+
|
|
126
|
+
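The faster a loop produces code you did not write, the wider the gap in your understanding of the system. Unless you read what the loop produces, comprehension debt only accumulates faster.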
|
|
127
|
+
|
|
128
|
+
### 3. Cognitive Surrender
|
|
129
|
+
|
|
130
|
+
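When the loop runs itself, it is easy to stop having opinions and accept everything as given. With the same loop design, someone exercising judgment uses it to accelerate work they understand deeply, while someone without judgment uses it to avoid understanding the work at all: the same action, opposite results.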
|
|
131
|
+
|
|
132
|
+
### Risk Safeguards
|
|
133
|
+
|
|
134
|
+
- Spot-check loop output regularly; do not rely on green CI alone.
|
|
135
|
+
- Cap the loop's output volume so the review backlog does not spiral out of control.
|
|
136
|
+
- Record the time of the last human review in the state file.
|
|
137
|
+
- Force high-risk changes (security, compliance, product scope) out of the loop to wait for a human.
|
|
138
|
+
- Regularly feed the loop's error patterns back into harness improvements (the agentic flywheel).
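Two of these safeguards, the output cap and the last-review timestamp, lend themselves to a simple gate checked before each run. The sketch below is illustrative only: `loop_may_continue` is a hypothetical function, and the thresholds (5 open PRs, 7 days) are arbitrary assumptions, not recommendations from this reference.

```python
from datetime import datetime, timedelta, timezone


def loop_may_continue(
    open_prs: int,
    last_human_review: datetime,
    now: datetime,
    max_open_prs: int = 5,                          # assumed cap on unreviewed output
    max_review_age: timedelta = timedelta(days=7),  # assumed review freshness window
) -> tuple[bool, str]:
    """Pause the loop when the backlog is full or human review has gone stale."""
    if open_prs >= max_open_prs:
        return False, "review backlog at cap"
    if now - last_human_review > max_review_age:
        return False, "human review is stale"
    return True, "ok"


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
recent = datetime(2025, 1, 8, tzinfo=timezone.utc)
stale = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(loop_may_continue(2, recent, now))
print(loop_may_continue(5, recent, now))
print(loop_may_continue(2, stale, now))
```

Returning a reason alongside the boolean gives the loop something concrete to write into the state file or triage inbox when it pauses.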
|
|
139
|
+
|
|
140
|
+
## When to Introduce a Loop vs. Stay with the Harness
|
|
141
|
+
|
|
142
|
+
| Condition | Recommendation |
|
|
143
|
+
|:--|:--|
|
|
144
|
+
| The project has no reliable local validation commands | Build the harness first |
|
|
145
|
+
| CI is unstable or frequently red | Fix CI first |
|
|
146
|
+
| The team has no review process for agent output | Establish a review process first |
|
|
147
|
+
| Maturity Level < 3 | Upgrade the harness first |
|
|
148
|
+
| Highly repetitive, low-risk tasks (triage, formatting fixes, dependency updates) | Suitable for a loop |
|
|
149
|
+
| Changes involve product scope, security, or architectural tradeoffs | Not suitable for a fully automated loop |
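The table can also be read as an ordered checklist. The sketch below encodes it that way; `ProjectFacts` and `loop_recommendation` are hypothetical names, and both the evaluation order and the final fallback value are assumptions, since the table itself does not rank its rows.

```python
from dataclasses import dataclass


@dataclass
class ProjectFacts:
    """Inputs for the decision table (hypothetical field names)."""
    has_local_validation: bool
    ci_stable: bool
    has_review_process: bool
    maturity_level: int
    repetitive_low_risk_task: bool
    touches_scope_security_or_architecture: bool


def loop_recommendation(p: ProjectFacts) -> str:
    """Check prerequisites first, then decide whether the task suits a loop."""
    if not p.has_local_validation:
        return "build the harness first"
    if not p.ci_stable:
        return "fix CI first"
    if not p.has_review_process:
        return "establish a review process first"
    if p.maturity_level < 3:
        return "upgrade the harness first"
    if p.touches_scope_security_or_architecture:
        return "not suitable for a fully automated loop"
    if p.repetitive_low_risk_task:
        return "suitable for a loop"
    return "needs human judgment"  # assumed fallback; not a row in the table


ready = ProjectFacts(True, True, True, 4, True, False)
immature = ProjectFacts(True, True, True, 2, True, False)
print(loop_recommendation(ready))
print(loop_recommendation(immature))
```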
|