@ps-neko/nekowork 0.1.0-alpha.0 → 0.1.0-alpha.2
- package/CLAUDE.md +5 -2
- package/README.md +108 -13
- package/agent.yaml +3 -2
- package/docs/ADVANCED.md +26 -4
- package/docs/AI-DEVELOPMENT-LIFECYCLE.md +2 -1
- package/docs/ARCHITECTURE.md +11 -7
- package/docs/AUDIT.md +30 -22
- package/docs/CATALOG-PACKS.md +77 -0
- package/docs/CHANGELOG.md +26 -3
- package/docs/CLI-STAGES.md +6 -4
- package/docs/CODEMAPS/scripts.md +5 -1
- package/docs/CODEMAPS/skills.md +2 -0
- package/docs/CODEMAPS/tests.md +2 -0
- package/docs/CORE-INVARIANTS.md +3 -2
- package/docs/DEMO-REPORT.md +97 -0
- package/docs/DEMO.md +43 -2
- package/docs/EXAMPLE-PROJECT.md +1 -1
- package/docs/INTERNAL-PROVIDER.md +85 -0
- package/docs/PORTING.md +1 -1
- package/docs/PRODUCT-PRINCIPLES.md +22 -4
- package/docs/PUBLISH-ALPHA.md +90 -21
- package/docs/QUICKSTART.md +44 -13
- package/docs/RELEASE-READINESS.md +44 -14
- package/docs/ROADMAP.md +41 -0
- package/docs/RUNBOOK.md +1 -1
- package/docs/SETUP.md +3 -2
- package/docs/WHY-NEKOWORK.md +23 -1
- package/docs/assets/demo-terminal.svg +41 -0
- package/docs/case-studies/JSHTTP-BASIC-AUTH.md +168 -0
- package/docs/case-studies/PYTHON-HYPER-H11.md +168 -0
- package/docs/case-studies/README.md +2 -0
- package/docs/workflows-stash/harness-validate.yml +42 -9
- package/manifests/install-components.json +5 -0
- package/manifests/install-modules.json +1 -0
- package/manifests/install-profiles.json +44 -0
- package/package.json +1 -1
- package/schemas/install-profiles.schema.json +14 -0
- package/scripts/agents/dispatch.js +5 -1
- package/scripts/agents/runners/internal.js +91 -0
- package/scripts/ci/catalog.js +7 -0
- package/scripts/ci/validate-manifests.js +5 -0
- package/scripts/cli.js +96 -3
- package/scripts/demo-quick-run.js +13 -1
- package/scripts/doctor.js +1 -1
- package/scripts/install-apply.js +15 -2
- package/scripts/install-plan.js +42 -2
- package/scripts/orchestrators/report.js +276 -0
- package/scripts/sync-claude-md.js +4 -0
- package/skills/acceptance-coverage/SKILL.md +37 -0
- package/docs/dev-log/2026-04-29-p1-recovery.md +0 -142
- package/docs/dev-log/2026-04-29-week1-4.md +0 -81
package/CLAUDE.md
CHANGED
|
@@ -8,15 +8,16 @@
|
|
|
8
8
|
|
|
9
9
|
## Auto-updated section
|
|
10
10
|
|
|
11
|
-
<!-- HARNESS:START version=0.1.0-alpha.
|
|
11
|
+
<!-- HARNESS:START version=0.1.0-alpha.2 -->
|
|
12
12
|
<!-- This section is updated automatically by scripts/sync-claude-md.js. Do not edit it directly. -->
|
|
13
13
|
|
|
14
14
|
## Catalog summary
|
|
15
15
|
|
|
16
16
|
- agents: 11
|
|
17
|
-
- skills:
|
|
17
|
+
- skills: 10
|
|
18
18
|
- commands: 1 (legacy compat)
|
|
19
19
|
- hooks: 5 (gateguard-fact-force, config-protection, quality-gate, pre-bash-dispatcher, persistent-mode)
|
|
20
|
+
- packs: core, quality, security, frontend, testing, release, enterprise
|
|
20
21
|
- profiles: core, developer, security, product, quality, frontend, testing, research, full
|
|
21
22
|
- harnesses: claude, codex, cursor, gemini, opencode
|
|
22
23
|
|
|
@@ -40,12 +41,14 @@
|
|
|
40
41
|
|
|
41
42
|
```bash
|
|
42
43
|
harness install --plan --profile core # install dry-run
|
|
44
|
+
harness install --plan --pack quality # curated pack dry-run
|
|
43
45
|
harness ask "<task>" # question gate, no project mutation
|
|
44
46
|
harness team "<task>" # read-only worker handoffs
|
|
45
47
|
harness work "<task>" # single executor implement handoff
|
|
46
48
|
harness verify "<task>" --session <id> # Codex-only verification
|
|
47
49
|
harness gate status --session <id> # inspect or resolve HUMAN_GATE state
|
|
48
50
|
harness ship "<task>" --session <id> # ship/no-ship readiness handoff
|
|
51
|
+
harness report --session <id> # readable evidence report
|
|
49
52
|
harness apply --session <id> # apply verified SHIP_READY live-work diff
|
|
50
53
|
harness run "<task>" --session <id> # work -> verify -> ship, optional --apply
|
|
51
54
|
harness review "<task>" [--secure|--fast|--no-ship] # legacy full cycle
|
package/README.md
CHANGED
|
@@ -18,12 +18,54 @@ NEKOWORK = Claude work -> Codex verification -> Human Gate
|
|
|
18
18
|
|
|
19
19
|
NEKOWORK is not meant to become a large agent pack. Skills, hooks, profiles, and team modes are added only when they preserve the verification loop.
|
|
20
20
|
|
|
21
|
+
NEKOWORK intentionally keeps the catalog selective. Every agent, skill, hook, profile, module, and pack must preserve the verification loop.
|
|
22
|
+
|
|
23
|
+
**Public alpha evidence:** 7 packs / 9 profiles / 36 components / 5 harness targets / 6 case-study flows / 245 tests / 0 moderate+ npm audit issues / fresh `npx @alpha` smoke
|
|
24
|
+
|
|
25
|
+
NEKOWORK does not automatically commit, push, publish, deploy, or apply diffs. `apply` is explicit and requires verified ship-ready evidence.
|
|
26
|
+
|
|
27
|
+
**One-minute demo:** [terminal transcript](docs/DEMO.md#one-minute-terminal-transcript) / [full report example](docs/DEMO-REPORT.md) / [alpha feedback](https://github.com/Ps-Neko/NEKOWORK/issues/new?template=alpha-feedback.yml) / [roadmap](docs/ROADMAP.md)
|
|
28
|
+
|
|
29
|
+

|
|
30
|
+
|
|
31
|
+
## Example Report
|
|
32
|
+
|
|
33
|
+
`report` is the main trust surface. It turns session evidence into a readable `REPORT.md`:
|
|
34
|
+
|
|
35
|
+
```text
|
|
36
|
+
Verdict: approve_with_fixes
|
|
37
|
+
Ship ready: false
|
|
38
|
+
Human gate: required
|
|
39
|
+
Applied: false
|
|
40
|
+
Profile: quality
|
|
41
|
+
Strict quality: enabled
|
|
42
|
+
Acceptance coverage: 4/5
|
|
43
|
+
Quality warnings: 2
|
|
44
|
+
|
|
45
|
+
Evidence:
|
|
46
|
+
- work-summary.json
|
|
47
|
+
- verify-summary.json
|
|
48
|
+
- ship-summary.json
|
|
49
|
+
- gate-summary.json
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
See the full report contract and example artifact in [docs/DEMO-REPORT.md](docs/DEMO-REPORT.md), and the one-minute terminal transcript in [docs/DEMO.md](docs/DEMO.md).
|
|
53
|
+
|
|
54
|
+
## Compared With Agent Packs
|
|
55
|
+
|
|
56
|
+
| Tool pattern | Optimizes for | NEKOWORK optimizes for |
|
|
57
|
+
|---|---|---|
|
|
58
|
+
| Large Claude Code packs | More agents, commands, skills | Curated verification loop |
|
|
59
|
+
| Team simulation | More specialist perspectives | Read-only team plus one executor |
|
|
60
|
+
| Autopilot | Fast autonomous execution | Report, gate, explicit apply |
|
|
61
|
+
| Discipline workflows | Better development habits | Evidence-backed ship decision |
|
|
62
|
+
|
|
21
63
|
## Three Paths
|
|
22
64
|
|
|
23
65
|
Most users should start with the Beginner path. The other paths are for explicit phase control or legacy compatibility.
|
|
24
66
|
|
|
25
|
-
1. Beginner: `doctor -> ask -> run -> gate`
|
|
26
|
-
2. Advanced: `ask -> plan -> team -> work -> verify -> gate -> ship -> apply`
|
|
67
|
+
1. Beginner: `doctor -> ask -> run -> report -> gate`
|
|
68
|
+
2. Advanced: `ask -> plan -> team -> work -> verify -> gate -> ship -> report -> apply`
|
|
27
69
|
3. Legacy: `review` / `review-cycle`
|
|
28
70
|
|
|
29
71
|
## Why NEKOWORK
|
|
@@ -32,19 +74,43 @@ NEKOWORK is for teams that want AI-assisted development without making the agent
|
|
|
32
74
|
|
|
33
75
|
## Status
|
|
34
76
|
|
|
35
|
-
- Current version: `0.1.0-alpha.
|
|
77
|
+
- Current repository version: `0.1.0-alpha.2`
|
|
36
78
|
- Current package name: `@ps-neko/nekowork`
|
|
37
|
-
- npm
|
|
38
|
-
- Supported install path today: clone, submodule, or local repository integration
|
|
39
|
-
-
|
|
79
|
+
- Current npm alpha: `@ps-neko/nekowork@0.1.0-alpha.2`
|
|
80
|
+
- Supported install path today: npm alpha, clone, submodule, or local repository integration
|
|
81
|
+
- Dist-tag note: use `@alpha` until a stable release; `latest` still points at the first alpha line
|
|
40
82
|
- Default mode: mock providers, no API keys, no provider CLI calls
|
|
41
83
|
|
|
42
84
|
Current local verification:
|
|
43
85
|
|
|
44
86
|
- `npm run lint`: pass
|
|
45
|
-
- `npm test`:
|
|
87
|
+
- `npm test`: 245 tests pass
|
|
46
88
|
- `npm audit --audit-level=moderate`: 0 vulnerabilities
|
|
47
89
|
- `npm pack --dry-run --json`: pass
|
|
90
|
+
- `npx -y @ps-neko/nekowork@alpha doctor --quick`: pass with warnings only
|
|
91
|
+
|
|
92
|
+
## Case-study Evidence
|
|
93
|
+
|
|
94
|
+
| Flow | Risk type | Evidence produced |
|
|
95
|
+
|---|---|---|
|
|
96
|
+
| Financial UI mock | UI/product risk | report + Human Gate |
|
|
97
|
+
| GitHub Actions hardening | CI/security risk | security findings + no-ship/ship evidence |
|
|
98
|
+
| Quality lifecycle smoke | quality risk | strict-quality + acceptance coverage |
|
|
99
|
+
| npm package boundary | package/release risk | pack/audit evidence |
|
|
100
|
+
| Auth parser boundary | auth/security risk | parser boundary evidence |
|
|
101
|
+
| Python protocol parser | protocol correctness risk | test-backed verification |
|
|
102
|
+
|
|
103
|
+
## Official Packs
|
|
104
|
+
|
|
105
|
+
| Pack | Adds | Use when |
|
|
106
|
+
|---|---|---|
|
|
107
|
+
| `core` | minimal verification runtime | first install or repo smoke |
|
|
108
|
+
| `quality` | acceptance coverage, strict evidence prompts | feature work needs proof |
|
|
109
|
+
| `security` | auth/secrets/deploy risk prompts | sensitive changes |
|
|
110
|
+
| `frontend` | UI mockup, component review, accessibility checks | product-facing UI work |
|
|
111
|
+
| `testing` | regression planning and coverage handoffs | test confidence is the main risk |
|
|
112
|
+
| `release` | ship/no-ship evidence | pre-release checks |
|
|
113
|
+
| `enterprise` | full catalog with all gates | high-control teams |
|
|
48
114
|
|
|
49
115
|
## Quick Start
|
|
50
116
|
|
|
@@ -56,6 +122,12 @@ Requirements:
|
|
|
56
122
|
|
|
57
123
|
Fastest no-API demo:
|
|
58
124
|
|
|
125
|
+
```bash
|
|
126
|
+
npx -y @ps-neko/nekowork@alpha doctor --quick
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
Repository demo:
|
|
130
|
+
|
|
59
131
|
```bash
|
|
60
132
|
git clone https://github.com/Ps-Neko/NEKOWORK.git harness
|
|
61
133
|
cd harness
|
|
@@ -63,7 +135,7 @@ npm ci
|
|
|
63
135
|
npm run demo:quick -- --cleanup
|
|
64
136
|
```
|
|
65
137
|
|
|
66
|
-
This creates a disposable target project and runs `doctor -> run -> gate status`. It uses mock providers and does not call Claude, Codex, Gemini, or paid APIs.
|
|
138
|
+
This creates a disposable target project and runs `doctor -> run -> report -> gate status`. It uses mock providers and does not call Claude, Codex, Gemini, or paid APIs.
|
|
67
139
|
|
|
68
140
|
Recommended path for most users:
|
|
69
141
|
|
|
@@ -74,15 +146,16 @@ npm ci
|
|
|
74
146
|
node scripts/cli.js doctor --quick
|
|
75
147
|
node scripts/cli.js ask "clarify a risky or ambiguous request" --session first-ask
|
|
76
148
|
node scripts/cli.js run "implement, verify, and prepare ship readiness" --session first-run
|
|
149
|
+
node scripts/cli.js report --session first-run
|
|
77
150
|
node scripts/cli.js gate status --session first-run
|
|
78
151
|
```
|
|
79
152
|
|
|
80
|
-
`run` executes `work -> verify -> ship`. It does not apply by default. `apply` is always explicit and requires a verified `SHIP_READY` live-work diff.
|
|
153
|
+
`run` executes `work -> verify -> ship` and does not apply by default. `report` turns the session evidence into a readable `REPORT.md`. `apply` is always explicit and requires a verified `SHIP_READY` live-work diff.
|
|
81
154
|
|
|
82
155
|
Advanced path:
|
|
83
156
|
|
|
84
157
|
```text
|
|
85
|
-
ask -> plan -> team -> work -> verify -> gate -> ship -> apply
|
|
158
|
+
ask -> plan -> team -> work -> verify -> gate -> ship -> report -> apply
|
|
86
159
|
```
|
|
87
160
|
|
|
88
161
|
Legacy compatibility smoke:
|
|
@@ -101,13 +174,14 @@ To see the repository-based external project flow end to end:
|
|
|
101
174
|
npm run demo:external
|
|
102
175
|
```
|
|
103
176
|
|
|
104
|
-
To inspect small case-study targets, see [examples/trading-dashboard-mock](examples/trading-dashboard-mock), [examples/github-actions-hardening](examples/github-actions-hardening),
|
|
177
|
+
To inspect small case-study targets, see [examples/trading-dashboard-mock](examples/trading-dashboard-mock), [examples/github-actions-hardening](examples/github-actions-hardening), [examples/quality-lifecycle-smoke](examples/quality-lifecycle-smoke), and [docs/case-studies](docs/case-studies). They demonstrate financial UI, CI workflow, quality lifecycle, npm package, auth parser, and Python protocol library flows passing local checks while still preserving Codex verification, Human Gate policy, and explicit apply control.
|
|
105
178
|
|
|
106
179
|
## What You Get
|
|
107
180
|
|
|
108
181
|
```text
|
|
109
182
|
doctor ... OK
|
|
110
183
|
run workflow ... OK
|
|
184
|
+
report ... OK
|
|
111
185
|
gate status ... OK
|
|
112
186
|
Demo completed: verdict=approve_with_fixes, ship_ready=false, applied=false
|
|
113
187
|
```
|
|
@@ -116,6 +190,7 @@ Outputs are written under:
|
|
|
116
190
|
|
|
117
191
|
```text
|
|
118
192
|
.harness/state/sessions/<session-id>/handoffs/
|
|
193
|
+
.harness/state/sessions/<session-id>/REPORT.md
|
|
119
194
|
```
|
|
120
195
|
|
|
121
196
|
## Use It In Another Project
|
|
@@ -169,6 +244,7 @@ The public alpha surface is intentionally small:
|
|
|
169
244
|
- `ship`: produce a ship/no-ship readiness handoff after Codex verification
|
|
170
245
|
- `apply`: apply a verified `SHIP_READY` live-work diff to the target project
|
|
171
246
|
- `run`: execute the decomposed wrapper, `work -> verify -> ship`, with optional apply
|
|
247
|
+
- `report`: summarize session evidence into `REPORT.md` without project mutation
|
|
172
248
|
- `review`: run the legacy full Claude-led/Codex-reviewed workflow
|
|
173
249
|
- `review-cycle`: explicit compatibility alias for the legacy full review workflow
|
|
174
250
|
- `install --plan` / `install --apply`: project generated harness surfaces
|
|
@@ -179,13 +255,24 @@ Advanced features such as `team-lite`, `ralph`, `wait`, instincts, cost tracking
|
|
|
179
255
|
|
|
180
256
|
Use `--profile quality` or `--profile security` on `work`, `verify`, and `run` when a task needs stronger evidence prompts. Add `--strict-quality` to `verify` or `run` when missing evidence or acceptance coverage should become a fix-required verdict before ship.
|
|
181
257
|
|
|
258
|
+
Use official packs when choosing an install shape:
|
|
259
|
+
|
|
260
|
+
```bash
|
|
261
|
+
node scripts/install-plan.js --list
|
|
262
|
+
node scripts/install-plan.js --pack quality
|
|
263
|
+
node scripts/install-plan.js --pack security --target codex --json
|
|
264
|
+
```
|
|
265
|
+
|
|
266
|
+
Packs are aliases over validated profiles. They add clearer product packaging without weakening the core gates.
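The alias relationship described above can be sketched as a small lookup that refuses to resolve a pack unless it points at a known profile. This is a minimal illustration, not the package's `install-plan.js` logic: the function name `resolvePack`, its parameters, and the example mapping (including `enterprise` pointing at `full`) are assumptions, since the real pack-to-profile table is not shown in this diff.

```javascript
// Hypothetical sketch: a pack is an alias that resolves to a validated profile.
// The alias table and profile list are passed in by the caller; the values
// used below are illustrative and not taken from the package manifests.
function resolvePack(packName, packAliases, knownProfiles) {
  // Use an own-property check so inherited keys such as "constructor" are rejected.
  if (!Object.prototype.hasOwnProperty.call(packAliases, packName)) {
    throw new Error(`unknown pack: ${packName}`);
  }
  const profile = packAliases[packName];
  // A pack may only point at a profile that already passed validation.
  if (!knownProfiles.includes(profile)) {
    throw new Error(`pack ${packName} points at unknown profile: ${profile}`);
  }
  return profile;
}

// Example values (assumed for illustration only).
const exampleAliases = { quality: 'quality', enterprise: 'full' };
const exampleProfiles = ['core', 'quality', 'full'];
console.log(resolvePack('quality', exampleAliases, exampleProfiles));
```

Keeping the lookup strict means a pack can never introduce an install shape that the profile validators have not already seen, which is one way to read "without weakening the core gates".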
|
|
267
|
+
|
|
182
268
|
## Catalog
|
|
183
269
|
|
|
184
270
|
- Agents: 11
|
|
185
|
-
- Skills:
|
|
271
|
+
- Skills: 10
|
|
186
272
|
- Hooks: 5
|
|
187
273
|
- Modules: 7
|
|
188
274
|
- Profiles: `core`, `developer`, `security`, `product`, `quality`, `frontend`, `testing`, `research`, `full`
|
|
275
|
+
- Official packs: `core`, `quality`, `security`, `frontend`, `testing`, `release`, `enterprise`
|
|
189
276
|
- Harness targets: `claude`, `codex`, `cursor`, `gemini`, `opencode`
|
|
190
277
|
|
|
191
278
|
Key skills:
|
|
@@ -193,6 +280,7 @@ Key skills:
|
|
|
193
280
|
- `claude-led-codex-review`
|
|
194
281
|
- `plan-eng-review`
|
|
195
282
|
- `tdd-workflow`
|
|
283
|
+
- `acceptance-coverage`
|
|
196
284
|
- `review`
|
|
197
285
|
- `ship`
|
|
198
286
|
- `ralph`
|
|
@@ -207,6 +295,7 @@ node scripts/cli.js doctor
|
|
|
207
295
|
node scripts/cli.js doctor --quick --gemini-smoke
|
|
208
296
|
npm run demo:quick
|
|
209
297
|
node scripts/install-plan.js --list
|
|
298
|
+
node scripts/install-plan.js --pack quality
|
|
210
299
|
node scripts/install-plan.js --profile developer
|
|
211
300
|
node scripts/install-apply.js --profile developer --project-root <target>
|
|
212
301
|
|
|
@@ -218,8 +307,10 @@ node scripts/cli.js verify "verify the implemented change" --session work-smoke
|
|
|
218
307
|
node scripts/cli.js verify "verify quality evidence" --profile quality --strict-quality --session work-smoke
|
|
219
308
|
node scripts/cli.js gate status --session work-smoke
|
|
220
309
|
node scripts/cli.js ship "prepare ship readiness" --require-clean-gates --session work-smoke
|
|
310
|
+
node scripts/cli.js report --session work-smoke
|
|
221
311
|
node scripts/cli.js apply --session work-smoke
|
|
222
312
|
node scripts/cli.js run "implement, verify, and prepare ship readiness" --session run-smoke
|
|
313
|
+
node scripts/cli.js report --session run-smoke
|
|
223
314
|
node scripts/cli.js review "implement and review this change" --no-ship
|
|
224
315
|
node scripts/cli.js review-cycle "legacy full-cycle compatibility smoke" --no-ship
|
|
225
316
|
node scripts/cli.js review "security-sensitive change" --secure --no-ship
|
|
@@ -247,14 +338,18 @@ npm run security:hardening
|
|
|
247
338
|
npm pack --dry-run --json
|
|
248
339
|
```
|
|
249
340
|
|
|
250
|
-
`npm pack --dry-run --json` currently produces a package named like `ps-neko-nekowork-0.1.0-alpha.
|
|
341
|
+
`npm pack --dry-run --json` currently produces a package named like `ps-neko-nekowork-0.1.0-alpha.2.tgz`. It does not publish.
|
|
251
342
|
|
|
252
343
|
## Documentation
|
|
253
344
|
|
|
254
345
|
- [docs/QUICKSTART.md](docs/QUICKSTART.md) - first run and common paths
|
|
255
346
|
- [docs/WHY-NEKOWORK.md](docs/WHY-NEKOWORK.md) - comparison and product positioning
|
|
347
|
+
- [docs/CATALOG-PACKS.md](docs/CATALOG-PACKS.md) - curated catalog, official packs, and case-study evidence
|
|
256
348
|
- [docs/PUBLISH-ALPHA.md](docs/PUBLISH-ALPHA.md) - public npm alpha release plan
|
|
349
|
+
- [docs/ROADMAP.md](docs/ROADMAP.md) - small alpha roadmap and non-goals
|
|
350
|
+
- [docs/INTERNAL-PROVIDER.md](docs/INTERNAL-PROVIDER.md) - private command adapter protocol
|
|
257
351
|
- [docs/DEMO.md](docs/DEMO.md) - sample command output and generated files
|
|
352
|
+
- [docs/DEMO-REPORT.md](docs/DEMO-REPORT.md) - readable session report UX
|
|
258
353
|
- [docs/EXAMPLE-PROJECT.md](docs/EXAMPLE-PROJECT.md) - repository-based external project demo
|
|
259
354
|
- [docs/case-studies](docs/case-studies) - real external project run evidence
|
|
260
355
|
- [examples/trading-dashboard-mock](examples/trading-dashboard-mock) - standalone financial UI mock target and case-study evidence
|
package/agent.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
spec_version: gitagent/0.1.0
|
|
2
2
|
name: nekowork
|
|
3
3
|
runtime_name: harness
|
|
4
|
-
version: 0.1.0-alpha.
|
|
4
|
+
version: 0.1.0-alpha.2
|
|
5
5
|
description: "NEKOWORK HARNESS - Local-first multi-AI development verification runtime"
|
|
6
6
|
license: MIT
|
|
7
7
|
homepage: https://github.com/Ps-Neko/NEKOWORK
|
|
@@ -26,6 +26,7 @@ skills:
|
|
|
26
26
|
- claude-led-codex-review
|
|
27
27
|
- plan-eng-review
|
|
28
28
|
- tdd-workflow
|
|
29
|
+
- acceptance-coverage
|
|
29
30
|
- review
|
|
30
31
|
- ship
|
|
31
32
|
- security-hardening
|
|
@@ -97,7 +98,7 @@ profiles:
|
|
|
97
98
|
- full
|
|
98
99
|
|
|
99
100
|
modules:
|
|
100
|
-
#
|
|
101
|
+
# Current catalog: 7 modules. Future modules stay selective/profile-driven.
|
|
101
102
|
- rules-core
|
|
102
103
|
- agents-core
|
|
103
104
|
- hooks-runtime
|
package/docs/ADVANCED.md
CHANGED
|
@@ -174,7 +174,7 @@ Rules:
|
|
|
174
174
|
Policy:
|
|
175
175
|
|
|
176
176
|
- `run` is the short safe wrapper for new users.
|
|
177
|
-
- `run` does not call `plan` in the
|
|
177
|
+
- `run` does not call `plan` in the current alpha line.
|
|
178
178
|
- `plan` is recommended before `work` for larger changes.
|
|
179
179
|
- `work` still records `acceptance-criteria.json`, so `run` preserves success criteria evidence.
|
|
180
180
|
- `apply` is always explicit; use `run --apply` only after live work can produce a captured diff.
|
|
@@ -184,6 +184,28 @@ Outputs:
|
|
|
184
184
|
- `.harness/state/sessions/<id>/run-summary.json`
|
|
185
185
|
- all normal `work`, `verify`, `ship`, and optional `apply` outputs
|
|
186
186
|
|
|
187
|
+
## report
|
|
188
|
+
|
|
189
|
+
`report` turns existing session evidence into a readable inspect-only report:
|
|
190
|
+
|
|
191
|
+
```bash
|
|
192
|
+
node scripts/cli.js report --session run-smoke
|
|
193
|
+
node scripts/cli.js report --session run-smoke --stdout
|
|
194
|
+
node scripts/cli.js report --session run-smoke --output docs/session-report.md
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
Rules:
|
|
198
|
+
|
|
199
|
+
- Reads summaries, markers, acceptance criteria, and handoffs from `.harness/state/sessions/<id>/`.
|
|
200
|
+
- Writes `REPORT.md` and `report-summary.json` by default.
|
|
201
|
+
- Does not call providers, run git commands, apply diffs, or mutate target project files.
|
|
202
|
+
- Can run after `ask`, `work`, `verify`, `ship`, `run`, or `apply`.
|
|
203
|
+
|
|
204
|
+
Outputs:
|
|
205
|
+
|
|
206
|
+
- `.harness/state/sessions/<id>/REPORT.md`
|
|
207
|
+
- `.harness/state/sessions/<id>/report-summary.json`
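As a rough illustration of the inspect-only rule above, rendering a report can be a pure function from already-collected summary data to text. This is a sketch under assumptions: `renderReport` and the field names `verdict`, `shipReady`, `humanGate`, `applied`, and `evidence` are hypothetical, and the real summary schema used by `scripts/orchestrators/report.js` is not shown in this diff.

```javascript
// Hypothetical sketch of rendering a readable report from session summaries.
// It only formats data it is given: no provider calls, no git, no file writes.
// Field names are assumptions made for illustration.
function renderReport(summary) {
  const lines = [
    `Verdict: ${summary.verdict}`,
    `Ship ready: ${summary.shipReady}`,
    `Human gate: ${summary.humanGate}`,
    `Applied: ${summary.applied}`,
    '',
    'Evidence:',
    // One bullet per evidence file recorded for the session.
    ...summary.evidence.map((file) => `- ${file}`),
  ];
  return lines.join('\n');
}
```

A caller would decide separately where the text goes (the session directory by default, standard output, or a custom path), which keeps the formatting step free of side effects.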
|
|
208
|
+
|
|
187
209
|
## review-cycle
|
|
188
210
|
|
|
189
211
|
`review-cycle` is the explicit compatibility alias for the legacy full workflow:
|
|
@@ -194,7 +216,7 @@ node scripts/cli.js review-cycle "legacy full-cycle smoke" --no-ship
|
|
|
194
216
|
|
|
195
217
|
Rules:
|
|
196
218
|
|
|
197
|
-
- It is equivalent to `review` in the
|
|
219
|
+
- It is equivalent to `review` in the current alpha line.
|
|
198
220
|
- It keeps the old `ideate -> plan -> implement -> self-review -> codex-review -> codex-challenge -> ship` behavior discoverable while new automation migrates to `run` or the decomposed commands.
|
|
199
221
|
- It writes `review-summary.json` with `mode: legacy-full-review-cycle`.
|
|
200
222
|
- It may use legacy live-review behavior, so new controlled project mutation should prefer `work --live -> verify -> ship -> apply`.
|
|
@@ -280,7 +302,7 @@ node scripts/cli.js instincts ready --blocked
|
|
|
280
302
|
node scripts/cli.js instincts promote <id>
|
|
281
303
|
```
|
|
282
304
|
|
|
283
|
-
Promotion requires confidence `1.0`; automatic promotion without human confirmation is outside the
|
|
305
|
+
Promotion requires confidence `1.0`; automatic promotion without human confirmation is outside the current alpha release scope.
|
|
284
306
|
|
|
285
307
|
## Cost Tracking
|
|
286
308
|
|
|
@@ -304,7 +326,7 @@ Verify it with:
|
|
|
304
326
|
npm run verify:runtime
|
|
305
327
|
```
|
|
306
328
|
|
|
307
|
-
The Node CLI remains the primary
|
|
329
|
+
The Node CLI remains the primary alpha user path.
|
|
308
330
|
|
|
309
331
|
## Full Builder Surface
|
|
310
332
|
|
package/docs/AI-DEVELOPMENT-LIFECYCLE.md
CHANGED
|
@@ -54,10 +54,11 @@ ask
|
|
|
54
54
|
-> verify
|
|
55
55
|
-> gate
|
|
56
56
|
-> ship
|
|
57
|
+
-> report
|
|
57
58
|
-> apply
|
|
58
59
|
```
|
|
59
60
|
|
|
60
|
-
Quality enters early through `ask` and `plan`, not only at the final review step. Team mode collects multiple perspectives, but the write phase stays single-executor. Verification is independent, gate decisions are explicit, and apply requires evidence.
|
|
61
|
+
Quality enters early through `ask` and `plan`, not only at the final review step. Team mode collects multiple perspectives, but the write phase stays single-executor. Verification is independent, gate decisions are explicit, `report` makes evidence readable, and apply requires evidence.
|
|
61
62
|
|
|
62
63
|
## Quality Profile
|
|
63
64
|
|
package/docs/ARCHITECTURE.md
CHANGED
|
@@ -50,7 +50,7 @@ User command
|
|
|
50
50
|
|
|
|
51
51
|
|-- doctor
|
|
52
52
|
|-- install plan/apply
|
|
53
|
-
|-- ask / plan / team / work / verify / gate / ship / apply / run / review / review-cycle
|
|
53
|
+
|-- ask / plan / team / work / verify / gate / ship / report / apply / run / review / review-cycle
|
|
54
54
|
|-- ralph
|
|
55
55
|
|-- team-lite
|
|
56
56
|
|-- sessions / costs / instincts
|
|
@@ -74,6 +74,7 @@ The public alpha surface is intentionally small:
|
|
|
74
74
|
|
|
75
75
|
```bash
|
|
76
76
|
node scripts/cli.js doctor
|
|
77
|
+
node scripts/cli.js install --plan --pack quality
|
|
77
78
|
node scripts/cli.js install --plan --profile developer
|
|
78
79
|
node scripts/cli.js install --apply --profile developer --project-root <target>
|
|
79
80
|
node scripts/cli.js ask "clarify a risky or ambiguous request" --project-root <target>
|
|
@@ -83,6 +84,7 @@ node scripts/cli.js work "single executor implementation" --session work-smoke -
|
|
|
83
84
|
node scripts/cli.js verify "Codex verification" --session work-smoke --project-root <target>
|
|
84
85
|
node scripts/cli.js gate status --session work-smoke --project-root <target>
|
|
85
86
|
node scripts/cli.js ship "ship readiness" --session work-smoke --project-root <target>
|
|
87
|
+
node scripts/cli.js report --session work-smoke --project-root <target>
|
|
86
88
|
node scripts/cli.js apply --session work-smoke --project-root <target>
|
|
87
89
|
node scripts/cli.js run "decomposed wrapper" --session run-smoke --project-root <target>
|
|
88
90
|
node scripts/cli.js review "change request" --no-ship --project-root <target>
|
|
@@ -99,7 +101,7 @@ Advanced features are documented separately:
|
|
|
99
101
|
|
|
100
102
|
## Review Pipeline
|
|
101
103
|
|
|
102
|
-
The
|
|
104
|
+
The current alpha `review` command remains the Claude-led and Codex-reviewed legacy full cycle. `review-cycle` is an explicit compatibility alias for the same behavior:
|
|
103
105
|
|
|
104
106
|
```text
|
|
105
107
|
ideate
|
|
@@ -114,10 +116,10 @@ ideate
|
|
|
114
116
|
The long-term phase model is additive and keeps `review` compatibility during migration:
|
|
115
117
|
|
|
116
118
|
```text
|
|
117
|
-
ask -> plan -> team -> work -> verify -> gate -> ship -> apply
|
|
119
|
+
ask -> plan -> team -> work -> verify -> gate -> ship -> report -> apply
|
|
118
120
|
```
|
|
119
121
|
|
|
120
|
-
`ask` is a local question gate. `team` creates read-only handoffs from multiple worker perspectives. `work` lets one executor produce an implement handoff and, in live mode, an isolated workspace diff. `verify` runs Codex-only verification against that prior work handoff. `gate` records explicit human approve/block decisions for `HUMAN_GATE`. `ship` creates a ship/no-ship readiness handoff and refuses to bypass unresolved gates. `apply` is the only decomposed command in this chain that mutates the target project, and only by applying a verified `SHIP_READY` live-work diff. `team-lite` remains an advanced read-only staged handoff experiment. Future `review` can be retired or kept as a compatibility wrapper once callers have migrated to the decomposed commands.
|
|
122
|
+
`ask` is a local question gate. `team` creates read-only handoffs from multiple worker perspectives. `work` lets one executor produce an implement handoff and, in live mode, an isolated workspace diff. `verify` runs Codex-only verification against that prior work handoff. `gate` records explicit human approve/block decisions for `HUMAN_GATE`. `ship` creates a ship/no-ship readiness handoff and refuses to bypass unresolved gates. `report` summarizes existing session evidence without mutating project files. `apply` is the only decomposed command in this chain that mutates the target project, and only by applying a verified `SHIP_READY` live-work diff. `team-lite` remains an advanced read-only staged handoff experiment. Future `review` can be retired or kept as a compatibility wrapper once callers have migrated to the decomposed commands.
|
|
121
123
|
|
|
122
124
|
`work` does not run Codex review or ship. It also does not mutate the target project directly; live executor changes are captured as a session diff for later verification.
|
|
123
125
|
|
|
@@ -129,6 +131,8 @@ ask -> plan -> team -> work -> verify -> gate -> ship -> apply
|
|
|
129
131
|
|
|
130
132
|
`ship` does not implement, verify, publish, deploy, or mutate the target project. It requires both prior `work` and Codex verification handoffs. It writes `SHIP_READY` only for fully approved verification or explicit human gate approval, writes `NO_SHIP` for fixable findings, and stops with a human gate when `HUMAN_GATE` is unresolved or explicitly blocked.
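The ship rules above can be read as a small decision function. The following is a minimal sketch, not the package's implementation: `decideShip`, its input fields, the `approve` verdict string, and the gate states are assumptions made for illustration. Only the `SHIP_READY`, `NO_SHIP`, and `HUMAN_GATE` markers come from the text.

```javascript
// Hypothetical sketch of the ship decision described above.
// `verdict` stands for the Codex verification verdict; `gate` is one of
// 'none' | 'unresolved' | 'approved' | 'blocked' (illustrative states).
function decideShip({ hasWork, hasVerification, verdict, gate }) {
  // ship requires both prior work and Codex verification handoffs.
  if (!hasWork || !hasVerification) {
    throw new Error('ship requires prior work and verify handoffs');
  }
  // An unresolved or explicitly blocked human gate stops the flow.
  if (gate === 'unresolved' || gate === 'blocked') {
    return 'HUMAN_GATE';
  }
  // Fully approved verification, or explicit human approval, is ship-ready.
  if (verdict === 'approve' || gate === 'approved') {
    return 'SHIP_READY';
  }
  // Assumption: every other verdict is treated as fixable findings.
  return 'NO_SHIP';
}
```

Checking the gate before the verdict mirrors the statement that `ship` refuses to bypass unresolved gates, even when verification itself passed.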
|
|
131
133
|
|
|
134
|
+
`report` does not implement, verify, ship, apply, call providers, or inspect project source. It reads session summaries, markers, acceptance criteria, and handoffs, then writes `REPORT.md` and `report-summary.json` under the session directory.
|
|
135
|
+
|
|
132
136
|
`apply` requires `SHIP_READY`, no newer `NO_SHIP`, no unresolved gate, and a captured diff from `work --live`. It applies that diff with `git apply --3way`, records `APPLIED_DIFF`, and leaves commit/push/release actions to the human.
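The four preconditions listed above can be checked together before any diff is applied. This is a sketch under stated assumptions: `canApply`, the time-ordered `markers` array, and the boolean inputs are hypothetical names, and the actual session state layout is not shown in this diff.

```javascript
// Hypothetical sketch of the apply preconditions described above.
// `markers` is an illustrative list of marker names in the order they were written.
function canApply({ markers, gateUnresolved, hasLiveDiff }) {
  const lastShipReady = markers.lastIndexOf('SHIP_READY');
  const lastNoShip = markers.lastIndexOf('NO_SHIP');
  const reasons = [];
  if (lastShipReady === -1) {
    reasons.push('missing SHIP_READY');
  } else if (lastNoShip > lastShipReady) {
    // A NO_SHIP written after the last SHIP_READY invalidates it.
    reasons.push('newer NO_SHIP');
  }
  if (gateUnresolved) reasons.push('unresolved gate');
  if (!hasLiveDiff) reasons.push('no captured diff from work --live');
  return { ok: reasons.length === 0, reasons };
}
```

Returning the list of reasons rather than a bare boolean makes it easier to tell the human why `apply` was refused.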
|
|
133
137
|
|
|
134
138
|
`run` is the compatibility-friendly wrapper around the decomposed path. It runs `work -> verify -> ship` and only runs `apply` when `--apply` is explicitly requested and `SHIP_READY` exists. New automation should prefer `run` or the explicit decomposed commands; old automation can continue to use `review` or `review-cycle`.
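The wrapper behavior above can be sketched as a fixed sequence with one guarded optional step. This is an illustration only: `runWorkflow` and the caller-supplied `steps` functions are hypothetical, and the sketch is synchronous for brevity, which the real CLI may not be.

```javascript
// Hypothetical sketch of the run wrapper: work -> verify -> ship, optional apply.
// `steps` holds caller-supplied functions; none of these names come from the package.
function runWorkflow(steps, { apply = false } = {}) {
  const executed = [];
  steps.work();
  executed.push('work');
  steps.verify();
  executed.push('verify');
  // ship is expected to return 'SHIP_READY', 'NO_SHIP', or 'HUMAN_GATE'.
  const marker = steps.ship();
  executed.push('ship');
  // apply runs only when explicitly requested and ship produced SHIP_READY.
  if (apply && marker === 'SHIP_READY') {
    steps.apply();
    executed.push('apply');
  }
  return { executed, marker, applied: executed.includes('apply') };
}
```

Because `apply` defaults to `false`, a caller that forgets the flag gets the safe behavior, and a requested apply is skipped when `ship` did not produce `SHIP_READY`.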
|
|
@@ -198,8 +202,8 @@ Builders project the catalog into tool-specific files:
|
|
|
198
202
|
|
|
199
203
|
## Release State
|
|
200
204
|
|
|
201
|
-
The current release line is `0.1.0-alpha.
|
|
205
|
+
The current release line is `0.1.0-alpha.2`:
|
|
202
206
|
|
|
203
207
|
- Repository and GitHub tarball release are available.
|
|
204
|
-
- Public npm
|
|
205
|
-
- Clone, submodule, and local checkout integration remain
|
|
208
|
+
- Public npm alpha is published as `@ps-neko/nekowork@alpha`.
|
|
209
|
+
- Clone, submodule, and local checkout integration remain supported for repository-pinned workflows.
|
package/docs/AUDIT.md
CHANGED
|
@@ -1,25 +1,28 @@
|
|
|
1
1
|
# Audit
|
|
2
2
|
|
|
3
|
-
Status date: 2026-05-
|
|
3
|
+
Status date: 2026-05-08
|
|
4
4
|
|
|
5
|
-
This audit summarizes the current NEKOWORK state after the `
|
|
5
|
+
This audit summarizes the current NEKOWORK state after publishing the `0.1.0-alpha.2` public alpha. It replaces the older week-by-week scratch audit, which contained stale planning notes and encoding damage.
|
|
6
6
|
|
|
7
7
|
## Current Status
|
|
8
8
|
|
|
9
9
|
| Area | Status | Notes |
|
|
10
10
|
|---|---|---|
|
|
11
|
-
| Package metadata | OK | `@ps-neko/nekowork@0.1.0-alpha.
|
|
12
|
-
| npm publish |
|
|
11
|
+
| Package metadata | OK | `@ps-neko/nekowork@0.1.0-alpha.2`, `agent.yaml` uses `name: nekowork`, `runtime_name: harness` |
|
|
12
|
+
| npm publish | OK | `@ps-neko/nekowork@alpha` points at `0.1.0-alpha.2` |
|
|
13
13
|
| Source install | OK | Clone, local checkout, and submodule workflows are documented |
|
|
14
|
-
| Public npm alpha
|
|
14
|
+
| Public npm alpha | OK | `docs/PUBLISH-ALPHA.md` records the first alpha publish and the `0.1.0-alpha.2` alpha update |
|
|
15
15
|
| CLI doctor | OK | `doctor`, `doctor --quick`, and `doctor --gemini-smoke` are available |
|
|
16
16
|
| Provider auth | OK | Local delegated CLI auth is the default path |
|
|
17
|
-
|
|
|
17
|
+
| Internal provider adapter | OK | `HARNESS_PROVIDER_OVERRIDE=internal` can call an explicit JSON command adapter without weakening gates |
|
|
18
|
+
| Catalog | OK | 7 official packs, 11 agents, 10 skills, 5 hooks, 7 modules, 36 components, 9 profiles |
|
|
18
19
|
| Multi-harness output | OK | Claude, Codex, Cursor, Gemini, and OpenCode builders are present |
|
|
19
|
-
| Quick demo | OK | `npm run demo:quick` verifies the shortest no-API `doctor -> run -> gate status` path |
|
|
20
|
+
| Quick demo | OK | `npm run demo:quick` verifies the shortest no-API `doctor -> run -> report -> gate status` path |
|
|
21
|
+
| Fresh npm alpha smoke | OK | CI runs `npx -y @ps-neko/nekowork@alpha doctor --quick --json` from a disposable directory |
|
|
22
|
+
| Report UX | OK | `report` writes inspect-only `REPORT.md` and `report-summary.json` from session evidence |
|
|
20
23
|
| External demo | OK | `npm run demo:external` verifies a disposable target project flow |
|
|
21
|
-
| Third-party case
|
|
22
|
-
| Decomposed workflow | OK | `ask`, `team`, `work`, `verify`, `gate`, `ship`, `apply`, and `run` are available |
|
|
24
|
+
| Third-party case studies | OK | `docs/case-studies/` records real public repository runs for npm package, auth boundary, and Python protocol targets |
|
|
25
|
+
| Decomposed workflow | OK | `ask`, `team`, `work`, `verify`, `gate`, `ship`, `report`, `apply`, and `run` are available |
|
|
23
26
|
| Risk policy | OK | Shared classifier drives ask, routing traces, verify challenge/gates, and ship gate rechecks |
|
|
24
27
|
| Acceptance criteria | OK | `work` ensures every session has `acceptance-criteria.json` |
|
|
25
28
|
| Profile safety | OK | Manifest/catalog validators reject profiles that weaken core gates |
|
|
@@ -27,7 +30,7 @@ This audit summarizes the current NEKOWORK state after the `v0.0.3` repository r
|
|
|
27
30
|
| Persistent wakeup | OK | `wait` resumes supported active sessions and blocks on `HUMAN_GATE` |
|
|
28
31
|
| Generated docs | OK | CODEMAP output is stable ASCII and reproducible |
|
|
29
32
|
| Tests | OK | Unit, integration, and e2e suites pass locally and in CI |
|
|
30
|
-
| Release | OK | `v0.0.
|
|
33
|
+
| Release | OK | `v0.1.0-alpha.2` is tagged and published as a GitHub prerelease |
|
|
31
34
|
|
|
32
35
|
## Verification Gates
|
|
33
36
|
|
|
@@ -52,7 +55,7 @@ Current local result for this working tree:
|
|
|
52
55
|
- `npm run test:unit`: covered by full `npm test`
|
|
53
56
|
- `npm run validate:all`: pass
|
|
54
57
|
- `npm run lint`: pass
|
|
55
|
-
- `npm test`:
|
|
58
|
+
- `npm test`: 245 tests pass
|
|
56
59
|
- quick run demo: pass through `npm run demo:quick -- --cleanup`
|
|
57
60
|
- external project e2e smoke: pass through `npm test`
|
|
58
61
|
- `node scripts/sync-claude-md.js --check`: pass
|
|
@@ -60,11 +63,15 @@ Current local result for this working tree:
|
|
|
60
63
|
- `npm audit --audit-level=moderate`: 0 vulnerabilities
|
|
61
64
|
- `npm pack --dry-run --json`: pass
|
|
62
65
|
- `npm publish --dry-run --access public --tag alpha`: pass
|
|
63
|
-
- `npm publish --access public --tag alpha`:
|
|
66
|
+
- `npm publish --access public --tag alpha`: `0.1.0-alpha.2` published
|
|
67
|
+
- `npm view @ps-neko/nekowork dist-tags version versions --json`: `alpha` points at `0.1.0-alpha.2`; `latest` remains `0.1.0-alpha.0`
|
|
68
|
+
- `npx -y @ps-neko/nekowork@alpha doctor --quick`: passed for `0.1.0-alpha.2`; the summary is WARN because the project root was not a git repository and Gemini auth was not checked
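The dist-tag check in the list above can be scripted. The following is a minimal sketch, not part of NEKOWORK: `checkDistTags` is a hypothetical helper, and the sample object only mirrors the values recorded in this audit (its `versions` list is illustrative). It assumes the JSON printed by `npm view @ps-neko/nekowork dist-tags version versions --json` has already been parsed.

```javascript
'use strict';

// Hypothetical helper (not part of NEKOWORK): inspects the parsed output of
// `npm view <pkg> dist-tags version versions --json`.
function checkDistTags(view, expectedAlpha) {
  const tags = view['dist-tags'] || {};
  // npm may print a single version as a string, so normalize to an array.
  const versions = [].concat(view.versions || []);
  const problems = [];

  if (tags.alpha !== expectedAlpha) {
    problems.push(`alpha points at ${tags.alpha}, expected ${expectedAlpha}`);
  }
  if (!versions.includes(expectedAlpha)) {
    problems.push(`${expectedAlpha} is missing from the published versions`);
  }

  return { ok: problems.length === 0, alpha: tags.alpha, latest: tags.latest, problems };
}

// Sample data mirroring the values recorded in this audit.
// The `versions` list is illustrative, not a registry snapshot.
const sample = {
  'dist-tags': { alpha: '0.1.0-alpha.2', latest: '0.1.0-alpha.0' },
  version: '0.1.0-alpha.0',
  versions: ['0.1.0-alpha.0', '0.1.0-alpha.2'],
};

const result = checkDistTags(sample, '0.1.0-alpha.2');
console.log(result.ok ? 'alpha dist-tag OK' : result.problems.join('; '));
```

Keeping `latest` in the returned object makes it easy to report that it still trails `alpha` without treating that as a failure.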
|
|
64
69
|
|
|
65
70
|
## Completed Work
|
|
66
71
|
|
|
67
72
|
- Local-first provider auth policy implemented and documented.
|
|
73
|
+
- Internal provider command adapter implemented and documented without bypassing verification, Human Gate, or apply controls.
|
|
74
|
+
- `acceptance-coverage` skill added as a focused quality evidence helper.
|
|
68
75
|
- API-key override warnings and guards are in place.
|
|
69
76
|
- Provider CLI path trust checks are in place.
|
|
70
77
|
- `--project-root` separates NEKOWORK tool root from target project root.
|
|
@@ -82,22 +89,23 @@ Current local result for this working tree:
|
|
|
82
89
|
- Release docs, setup docs, runbook, quickstart, porting guide, and CODEMAP docs are readable for external users.
|
|
83
90
|
- The disposable external project demo proves the repository-based target-project flow end to end.
|
|
84
91
|
- The quick run demo proves the one-command no-API first experience.
|
|
92
|
+
- `report` gives public alpha users a readable inspect-only session artifact without applying or mutating project files.
|
|
93
|
+
- Official packs expose curated install shapes without creating a second safety model.
|
|
85
94
|
- Checked-in example fixtures now cover financial UI, CI hardening, and quality lifecycle evidence flows.
|
|
86
|
-
-
|
|
87
|
-
- Public npm alpha
|
|
95
|
+
- Third-party case studies record NEKOWORK runs against `sindresorhus/is-plain-obj`, `jshttp/basic-auth`, and `python-hyper/h11`.
|
|
96
|
+
- Public npm alpha `0.1.0-alpha.2` is published under the `alpha` dist-tag.
|
|
88
97
|
|
|
89
98
|
## Remaining Optional Work
|
|
90
99
|
|
|
91
100
|
| Item | Priority | Reason |
|
|
92
101
|
|---|---|---|
|
|
93
|
-
|
|
|
94
|
-
| More third-party case studies |
|
|
95
|
-
|
|
|
96
|
-
| More skill catalog expansion | Low | Should stay selective to preserve progressive disclosure |
|
|
102
|
+
| Stable `latest` promotion | Medium | The `alpha` dist-tag is correct; `latest` still points at the first alpha (`0.1.0-alpha.0`) and should move to a stable version once one exists |
|
|
103
|
+
| More third-party case studies | Low | Three public repo case studies exist; covering more frameworks would strengthen adoption evidence later |
|
|
104
|
+
| More skill catalog expansion | Low | Catalog expansion should stay selective to preserve progressive disclosure |
|
|
97
105
|
|
|
98
106
|
## Explicit Non-Goals
|
|
99
107
|
|
|
100
|
-
- No public npm publish for `0.0.3`; public alpha
|
|
108
|
+
- No public npm publish for `0.0.3`; public alpha starts at `0.1.0-alpha.0`.
|
|
101
109
|
- No automatic promotion of learned instincts without human confirmation.
|
|
102
110
|
- No tmux-first runtime import from OMC.
|
|
103
111
|
- No bulk import of large external skill catalogs.
|
|
@@ -105,10 +113,10 @@ Current local result for this working tree:
|
|
|
105
113
|
|
|
106
114
|
## External Readiness Score
|
|
107
115
|
|
|
108
|
-
Current external readiness, excluding
|
|
116
|
+
Current external readiness, excluding broader adoption evidence: **9.1 / 10**.
|
|
109
117
|
|
|
110
118
|
Main deductions:
|
|
111
119
|
|
|
112
|
-
-
|
|
113
|
-
-
|
|
120
|
+
- `latest` currently remains on the first alpha; docs still recommend `@alpha` until a stable release exists.
|
|
121
|
+
- Only three independent real-world external project case studies exist so far.
|
|
114
122
|
- Advanced surfaces exist but are intentionally secondary to the public decomposed workflow and install flow.
|
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
# Catalog Packs
|
|
2
|
+
|
|
3
|
+
NEKOWORK intentionally keeps the catalog selective. Every agent, skill, hook, module, profile, and pack must preserve the verification loop:
|
|
4
|
+
|
|
5
|
+
```text
|
|
6
|
+
Claude work -> Codex verification -> report -> Human Gate -> explicit apply
|
|
7
|
+
```
|
|
8
|
+
|
|
9
|
+
Packs are public install aliases over validated profiles. They make the catalog easier to choose without creating a second safety model.
|
|
10
|
+
|
|
11
|
+
## Current Shape
|
|
12
|
+
|
|
13
|
+
```text
|
|
14
|
+
7 official packs
|
|
15
|
+
9 install profiles
|
|
16
|
+
7 modules
|
|
17
|
+
36 components
|
|
18
|
+
11 agents
|
|
19
|
+
10 skills
|
|
20
|
+
5 hooks
|
|
21
|
+
5 harness targets
|
|
22
|
+
6 case-study flows
|
|
23
|
+
245 tests
|
|
24
|
+
```
|
|
25
|
+
|
|
26
|
+
Harness targets:
|
|
27
|
+
|
|
28
|
+
```text
|
|
29
|
+
Claude, Codex, Cursor, Gemini, OpenCode
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
Case-study flows:
|
|
33
|
+
|
|
34
|
+
```text
|
|
35
|
+
financial UI mock
|
|
36
|
+
GitHub Actions hardening
|
|
37
|
+
quality lifecycle smoke
|
|
38
|
+
npm package boundary
|
|
39
|
+
auth parser boundary
|
|
40
|
+
Python protocol parser boundary
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
## Official Packs
|
|
44
|
+
|
|
45
|
+
| Pack | Profile | Best For | Representative Workflow |
|
|
46
|
+
|---|---|---|---|
|
|
47
|
+
| `core` | `core` | Minimal verification runtime | `doctor -> ask -> run -> report -> gate` |
|
|
48
|
+
| `quality` | `quality` | Disciplined development and evidence coverage | `ask --profile quality -> run --profile quality --strict-quality -> report` |
|
|
49
|
+
| `security` | `security` | Auth, secrets, permissions, deploy, financial, or data-sensitive changes | `ask --profile security -> run --profile security --secure --strict-quality -> report -> gate` |
|
|
50
|
+
| `frontend` | `frontend` | UI mockups, component review, accessibility-oriented checks | `ask --profile product -> team -> run -> report` |
|
|
51
|
+
| `testing` | `testing` | Regression planning and coverage-oriented handoffs | `plan -> work -> verify --profile quality --strict-quality -> report` |
|
|
52
|
+
| `release` | `developer` | Release readiness, changelog, and no-ship/ship evidence | `run -> report -> gate -> ship` |
|
|
53
|
+
| `enterprise` | `full` | Full stable catalog evaluation with all gates intact | `ask -> plan -> team -> work -> verify -> gate -> ship -> report -> apply` |
|
|
54
|
+
|
|
55
|
+
## Commands
|
|
56
|
+
|
|
57
|
+
```bash
|
|
58
|
+
node scripts/install-plan.js --list
|
|
59
|
+
node scripts/install-plan.js --pack security
|
|
60
|
+
node scripts/install-plan.js --pack quality --target claude --json
|
|
61
|
+
node scripts/install-apply.js --pack core --project-root <target>
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
`--pack` and `--profile` cannot be used together. A pack resolves to exactly one profile, and profile safety validation still rejects any default that weakens Codex verification, Human Gate, or single-executor mutation policy.
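The resolution rule described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `install-plan.js` code: `resolveProfile` and `PACK_PROFILES` are hypothetical names, and only the pack-to-profile pairs are taken from the Official Packs table.

```javascript
'use strict';

// Pack -> profile aliases, copied from the Official Packs table.
const PACK_PROFILES = Object.freeze({
  core: 'core',
  quality: 'quality',
  security: 'security',
  frontend: 'frontend',
  testing: 'testing',
  release: 'developer',
  enterprise: 'full',
});

// Hypothetical helper: resolves CLI options to exactly one profile name.
function resolveProfile({ pack, profile } = {}) {
  // The two flags are mutually exclusive.
  if (pack !== undefined && profile !== undefined) {
    throw new Error('--pack and --profile cannot be used together');
  }
  if (pack !== undefined) {
    if (!Object.prototype.hasOwnProperty.call(PACK_PROFILES, pack)) {
      throw new Error(`unknown pack: ${pack}`);
    }
    // A pack is only an alias; it always maps to a single profile.
    return PACK_PROFILES[pack];
  }
  // No pack given: pass the explicit profile through (null when absent).
  return profile ?? null;
}

console.log(resolveProfile({ pack: 'release' }));
```

Because a pack only resolves to a profile name, any profile safety validation would run on the resolved profile, which is consistent with packs not introducing a second safety model.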
|
|
65
|
+
|
|
66
|
+
## Positioning
|
|
67
|
+
|
|
68
|
+
NEKOWORK does not try to be the largest catalog. It is a curated catalog for a reportable evidence pipeline:
|
|
69
|
+
|
|
70
|
+
```text
|
|
71
|
+
selective catalog
|
|
72
|
+
+ multi-surface projection
|
|
73
|
+
+ evidence report
|
|
74
|
+
+ Human Gate
|
|
75
|
+
+ explicit apply
|
|
76
|
+
= local-first AI development quality runtime
|
|
77
|
+
```
|