briefops 1.1.0 → 2.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (93)
  1. package/CHANGELOG.md +22 -0
  2. package/README.md +88 -49
  3. package/SECURITY.md +4 -3
  4. package/dist/cli.js +6 -0
  5. package/dist/cli.js.map +1 -1
  6. package/dist/commands/bootstrap.d.ts +2 -0
  7. package/dist/commands/bootstrap.js +59 -0
  8. package/dist/commands/bootstrap.js.map +1 -0
  9. package/dist/commands/continue.js +5 -4
  10. package/dist/commands/continue.js.map +1 -1
  11. package/dist/commands/doctor.js +27 -0
  12. package/dist/commands/doctor.js.map +1 -1
  13. package/dist/commands/finish.js +9 -1
  14. package/dist/commands/finish.js.map +1 -1
  15. package/dist/commands/harness.d.ts +2 -0
  16. package/dist/commands/harness.js +56 -0
  17. package/dist/commands/harness.js.map +1 -0
  18. package/dist/commands/memory.js +20 -3
  19. package/dist/commands/memory.js.map +1 -1
  20. package/dist/commands/obs.d.ts +2 -0
  21. package/dist/commands/obs.js +51 -0
  22. package/dist/commands/obs.js.map +1 -0
  23. package/dist/commands/skill.js +1 -1
  24. package/dist/commands/skill.js.map +1 -1
  25. package/dist/core/bootstrap.d.ts +25 -0
  26. package/dist/core/bootstrap.js +69 -0
  27. package/dist/core/bootstrap.js.map +1 -0
  28. package/dist/core/codex.js +32 -5
  29. package/dist/core/codex.js.map +1 -1
  30. package/dist/core/codexPlugin.js +53 -12
  31. package/dist/core/codexPlugin.js.map +1 -1
  32. package/dist/core/exportTargets.js +21 -18
  33. package/dist/core/exportTargets.js.map +1 -1
  34. package/dist/core/harness.d.ts +27 -0
  35. package/dist/core/harness.js +342 -0
  36. package/dist/core/harness.js.map +1 -0
  37. package/dist/core/lock.d.ts +1 -0
  38. package/dist/core/lock.js +18 -4
  39. package/dist/core/lock.js.map +1 -1
  40. package/dist/core/memory.d.ts +4 -1
  41. package/dist/core/memory.js +54 -2
  42. package/dist/core/memory.js.map +1 -1
  43. package/dist/core/memoryHygiene.d.ts +11 -0
  44. package/dist/core/memoryHygiene.js +93 -0
  45. package/dist/core/memoryHygiene.js.map +1 -1
  46. package/dist/core/memoryProposal.js +21 -8
  47. package/dist/core/memoryProposal.js.map +1 -1
  48. package/dist/core/observability.d.ts +36 -0
  49. package/dist/core/observability.js +70 -0
  50. package/dist/core/observability.js.map +1 -0
  51. package/dist/core/prime.js +16 -9
  52. package/dist/core/prime.js.map +1 -1
  53. package/dist/core/securityDoctor.js +3 -5
  54. package/dist/core/securityDoctor.js.map +1 -1
  55. package/dist/core/strictDoctor.d.ts +22 -0
  56. package/dist/core/strictDoctor.js +95 -0
  57. package/dist/core/strictDoctor.js.map +1 -0
  58. package/dist/core/workflow.d.ts +4 -0
  59. package/dist/core/workflow.js +21 -2
  60. package/dist/core/workflow.js.map +1 -1
  61. package/dist/core/workspace.js +13 -1
  62. package/dist/core/workspace.js.map +1 -1
  63. package/dist/schemas/memory.d.ts +100 -0
  64. package/dist/schemas/memory.js +9 -1
  65. package/dist/schemas/memory.js.map +1 -1
  66. package/dist/schemas/memoryProposal.d.ts +115 -0
  67. package/dist/schemas/memoryProposal.js +2 -1
  68. package/dist/schemas/memoryProposal.js.map +1 -1
  69. package/dist/version.d.ts +1 -1
  70. package/dist/version.js +1 -1
  71. package/docs/codex-resume.md +1 -1
  72. package/docs/compatibility.md +10 -6
  73. package/docs/concept.md +2 -2
  74. package/docs/file-format.md +3 -1
  75. package/docs/handoff-briefs.md +1 -1
  76. package/docs/impact-report.md +140 -0
  77. package/docs/integrations/harnesses.md +14 -6
  78. package/docs/master-harness.md +610 -0
  79. package/docs/memory-lifecycle.md +11 -5
  80. package/docs/privacy-model.md +7 -3
  81. package/docs/quickstart.md +22 -6
  82. package/docs/release-checklist.md +18 -4
  83. package/docs/roadmap.md +12 -5
  84. package/docs/superpowers/plans/2026-06-08-briefops-oss-readiness.md +1 -1
  85. package/docs/token-budget.md +8 -0
  86. package/package.json +2 -2
  87. package/plugins/briefops-codex/.codex-plugin/plugin.json +2 -1
  88. package/plugins/briefops-codex/README.md +8 -6
  89. package/plugins/briefops-codex/skills/briefops-continue-worker/SKILL.md +3 -3
  90. package/plugins/briefops-codex/skills/briefops-finish-task/SKILL.md +4 -4
  91. package/plugins/briefops-codex/skills/briefops-prime-context/SKILL.md +4 -4
  92. package/plugins/briefops-codex/skills/briefops-review-memory/SKILL.md +8 -7
  93. package/plugins/briefops-codex/skills/briefops-route-task/SKILL.md +33 -0
@@ -0,0 +1,610 @@
1
+ # BriefOps Master Harness
2
+
3
+ This document defines the Codex-first Master Harness for BriefOps. The harness is not a larger prompt bundle. It is the operating layer that decides which workflow depth is required for a task, which artifacts must exist, what verification evidence is acceptable, and what memory should survive the session.
4
+
5
+ ## 1. Executive Summary
6
+
7
+ - Build the MVP as one BriefOps plugin/skill pack with modular internal skills, not as separate plugins.
8
+ - Keep BriefOps as the canonical memory and handoff owner. Interoperate with Fable-style state concepts, but do not copy external state files into BriefOps by default.
9
+ - Reimplement Fable-like goal, findings, and evidence concepts in clean-room BriefOps schemas unless a future license review explicitly approves code reuse.
10
+ - Treat Spec-Kit as the specification/planning layer. Wrap or route to it when installed; do not embed its full workflow into the memory layer.
11
+ - Make routing the product center: `briefops harness route --task "<task>"` chooses workflow depth before implementation.
12
+ - Use a hybrid state model: Markdown for human narrative and JSON/JSONL/YAML for agent-readable ledgers.
13
+ - Prevent over-process by routing tiny work to light plans and targeted verification only.
14
+ - Prevent under-process by escalating risky tasks to goal ledgers, findings, full verification, visual evidence, incident logs, or handoffs.
15
+ - Keep the core runtime-agnostic. The first adapter is Codex skills and CLI prompts.
16
+ - The next implementation step is to extend the new harness router into persistent `.briefops/harness/` ledgers and final response checks.
17
+
18
+ ## 2. Source Project Findings
19
+
20
+ ### FableCodex
21
+
22
+ Source: <https://github.com/baskduf/FableCodex>
23
+
24
+ Confirmed facts:
25
+
26
+ | Area | Finding |
27
+ | --- | --- |
28
+ | Runtime target | Codex-focused. The repository describes FableCodex as a Codex plugin that ports Fable ideas to Codex and includes a Codex Fable 5 plugin. |
29
+ | Useful ideas | Goal ledger, findings ledger, verification gate, evidence-first completion, anti-false-done workflow. |
30
+ | Plugin structure | The repository includes `plugins/codex-fable5` and documentation for installing a local Codex plugin. |
31
+ | Command/state model | It is organized around task-local ledgers and evidence, not long-term project memory. |
32
+ | License | AGPL-3.0-or-later per the repository license/readme metadata. |
33
+ | Maturity | Small, focused reference implementation. Treat as conceptual prior art until deeper source audit. |
34
+
35
+ Architectural interpretation:
36
+
37
+ - FableCodex owns the execution tracking and verification ideas.
38
+ - It should not become BriefOps memory. BriefOps should own long-lived facts, decisions, work logs, handoffs, and project state.
39
+ - The MVP should reimplement Fable-like concepts as `goals`, `findings`, and `verification` artifacts under the BriefOps namespace.
40
+
41
+ Integration recommendation:
42
+
43
+ - Do not call FableCodex directly in the MVP.
44
+ - Borrow the lifecycle ideas, not code.
45
+ - Treat code/text reuse as license-sensitive because AGPL obligations may apply.
46
+ - Preserve an explicit compatibility note so existing FableCodex state can be imported later if users ask for it.
47
+
48
+ ### fablize
49
+
50
+ Source: <https://github.com/fivetaku/fablize>
51
+
52
+ Confirmed facts:
53
+
54
+ | Area | Finding |
55
+ | --- | --- |
56
+ | Runtime target | Claude Code-focused. The repository describes itself as a Claude Code plugin. |
57
+ | Useful ideas | Per-task router, escalation from simple tasks to deeper workflows, task-complete gate, verification discipline. |
58
+ | Plugin structure | The repository includes Claude plugin assets and slash-command style workflows. |
59
+ | Command/state model | Stronger as a workflow inspiration source than as a Codex drop-in. |
60
+ | License | MIT in the repository license file. |
61
+ | Maturity | Useful reference, but runtime-specific to Claude Code. |
62
+
63
+ Architectural interpretation:
64
+
65
+ - fablize validates the idea that workflow depth should be routed, not globally forced.
66
+ - FableCodex is the closer Codex implementation reference.
67
+
68
+ Integration recommendation:
69
+
70
+ - Do not graft fablize into Codex.
71
+ - Reuse the design pattern: route first, execute second, verify before completion.
72
+
73
+ ### Ponytail
74
+
75
+ Source: <https://github.com/DietrichGebert/ponytail>
76
+
77
+ Confirmed facts:
78
+
79
+ | Area | Finding |
80
+ | --- | --- |
81
+ | Runtime target | Codex plugin/prompt policy focused on implementation behavior. |
82
+ | Useful ideas | Smallest useful diff, avoid unnecessary refactors, preserve existing behavior, avoid unnecessary dependencies, keep local conventions. |
83
+ | Plugin structure | The repository presents policy guidance for Codex-style coding agents. |
84
+ | Command/state model | Policy-oriented, not a full lifecycle, memory, or verification system. |
85
+ | License | MIT in the repository license file. |
86
+ | Maturity | Best treated as an implementation policy layer. |
87
+
88
+ Architectural interpretation:
89
+
90
+ - Ponytail belongs between planning and patching.
91
+ - It should shape implementation behavior but not own task routing, memory, or verification.
92
+
93
+ Integration recommendation:
94
+
95
+ - Encode Ponytail-like rules into the `briefops-implement` skill and final review checklist.
96
+ - Keep it as policy, not state.
97
+
98
+ ### BriefOps
99
+
100
+ Sources: [README.md](../README.md), [docs/file-format.md](file-format.md), [docs/compatibility.md](compatibility.md), [docs/integrations/harnesses.md](integrations/harnesses.md)
101
+
102
+ Confirmed facts:
103
+
104
+ | Area | Finding |
105
+ | --- | --- |
106
+ | Runtime target | Local-first CLI for Codex, Claude Code, Cursor, and local harnesses. |
107
+ | Role | Persistent memory, work logs, handoffs, worker continuity, compact context priming. |
108
+ | State root | `.briefops/` is the canonical local data root. |
109
+ | Current boundary | BriefOps explicitly says it is not an agent harness or multi-agent orchestrator. |
110
+ | License | MIT. |
111
+
112
+ Architectural interpretation:
113
+
114
+ - The Master Harness should be a new layer above BriefOps memory, not a replacement for current BriefOps.
115
+ - The harness can live in the same CLI/plugin because it routes to existing BriefOps commands before and after work.
116
+
117
+ Integration recommendation:
118
+
119
+ - Add `briefops harness route` for task classification.
120
+ - Add `.briefops/harness/` artifacts only after schemas are stable enough to persist.
121
+
122
+ ### Spec-Kit
123
+
124
+ Sources: local github-spec-kit plugin skills at `/Users/simon/.codex/plugins/cache/local/github-spec-kit/0.1.0/skills/`.
125
+
126
+ Confirmed facts:
127
+
128
+ | Area | Finding |
129
+ | --- | --- |
130
+ | Runtime target | Codex plugin skills for specification, planning, task generation, analysis, and implementation handoff. |
131
+ | Role | Planning layer: requirements, scope, design, task decomposition. |
132
+ | Routing guidance | The local `project-orchestrator` skill says Spec Kit is required for new features, architecture changes, API contracts, security-sensitive work, large refactors, and unclear multi-file changes. |
133
+ | State root | `.specify/` and `specs/` when initialized in a repository. |
134
+
135
+ Architectural interpretation:
136
+
137
+ - Spec-Kit should be invoked only when the harness route says specification/planning is required.
138
+ - It should not own memory or final verification.
139
+
140
+ Integration recommendation:
141
+
142
+ - MVP: reference Spec-Kit in route output and Codex skills.
143
+ - v1: detect `.specify/` and route to `speckit-specify`, `speckit-plan`, and `speckit-tasks` when available.
144
+
145
+ ## 3. Proposed Architecture
146
+
147
+ ```
148
+ User Task
149
+ |
150
+ v
151
+ +-----------------------------+
152
+ | Intake / Classification |
153
+ | briefops harness route |
154
+ +-------------+---------------+
155
+ |
156
+ v
157
+ +-------------+---------------+
158
+ | Memory Layer |
159
+ | briefops prime |
160
+ +-------------+---------------+
161
+ |
162
+ v
163
+ +-------------+---------------+
164
+ | Spec / Plan Layer |
165
+ | Spec-Kit when route needs it|
166
+ +-------------+---------------+
167
+ |
168
+ v
169
+ +-------------+---------------+
170
+ | Implementation Policy |
171
+ | Ponytail-like guardrails |
172
+ +-------------+---------------+
173
+ |
174
+ v
175
+ +-------------+---------------+
176
+ | Execution Tracking |
177
+ | goals + findings |
178
+ +-------------+---------------+
179
+ |
180
+ v
181
+ +-------------+---------------+
182
+ | Verification Layer |
183
+ | evidence gates |
184
+ +-------------+---------------+
185
+ |
186
+ v
187
+ +-------------+---------------+
188
+ | Handoff / Learning |
189
+ | briefops finish / continue |
190
+ +-----------------------------+
191
+ ```
192
+
193
+ ### Layer Ownership
194
+
195
+ | Layer | Responsibility | Inputs | Outputs | State Files | Trigger | Failure Modes |
196
+ | --- | --- | --- | --- | --- | --- | --- |
197
+ | Memory | Durable project context, decisions, work logs, handoffs | Task, worker, project | Prime context, logs, memory proposals | `.briefops/memory/*.yaml`, `.briefops/logs/*.yaml`, `.briefops/handoffs/*.md` | Every meaningful task | Memory spam, stale facts, private data leakage |
198
+ | Intake / Classification | Select workflow depth | User task, repo signals, explicit task type | Route contract | MVP: none; v1: `.briefops/harness/routes.jsonl` | Start of task | Under-routing risky work, over-routing tiny work |
199
+ | Specification | Clarify what to build | Route, requirements, repo context | `spec.md`, acceptance criteria | `.specify/`, `specs/*/spec.md` | New/large/ambiguous features | Specs too heavy, implementation details leak into spec |
200
+ | Planning | Decompose work | Spec, repo constraints | Plan, tasks, risk notes | `specs/*/plan.md`, `tasks.md`, optional `.briefops/harness/goals.json` | Medium or larger work | Plan drift, tasks not executable |
201
+ | Implementation Policy | Keep changes small and local | Plan, existing code conventions | Scoped diff discipline | Skill text, route final contract | Before editing | Unnecessary refactor, new dependency drift |
202
+ | Execution Tracking | Track goals and findings | Route, plan, debugging evidence | Goal ledger, findings | v1 `.briefops/harness/goals.json`, `.briefops/harness/findings.jsonl` | Multi-step, debugging, research, incidents | Hidden blockers, stale goals |
203
+ | Verification | Require evidence before done | Route, changed files, commands | Verification record | v1 `.briefops/harness/verification.md` or `.jsonl` | Before final response | False done, unverifiable claims |
204
+ | Handoff / Learning | Close task and preserve durable knowledge | Result, risks, commands, decisions | Work log, memory update, resume pack | Existing BriefOps files | Finish meaningful work | Logging transient noise, missing next step |
205
+
206
+ ## 4. Orchestrator Routing Matrix
207
+
208
+ The MVP route table is implemented in [src/core/harness.ts](../src/core/harness.ts).
209
+
210
+ | Task Type | Spec | Plan | Goal Ledger | Findings | Verification | Memory Update |
211
+ | --- | --- | --- | --- | --- | --- | --- |
212
+ | Small bug fix | No | Light | Optional | Yes if debugging | Level 2 targeted test | Work log |
213
+ | Medium feature | Light | Yes | Yes | Optional | Level 2 or 3 | Work log plus decisions |
214
+ | Large feature | Yes | Yes | Yes | Yes | Level 3 full project verification | Decision, project state, handoff |
215
+ | Refactor | No unless behavior changes | Yes | Yes for multi-file | Optional | Level 3 regression verification | Work log plus decision if architecture changes |
216
+ | Dependency upgrade | No | Yes | Yes | Yes if breakage appears | Level 3 full project verification | Decision plus work log |
217
+ | UI change | Light | Light | Optional | Yes for visual defects | Level 4 visual evidence | Work log, decision if design pattern changes |
218
+ | Test repair | No | Light | Optional | Yes if uncertain | Level 2 targeted test | Work log |
219
+ | Production incident | No | Incident plan | Yes | Yes | Level 4 evidence-based verification | Incident, decision, handoff |
220
+ | Documentation task | No | No or light | No | No | Level 0 or 1 | Work log only if durable |
221
+ | Architecture decision | Yes | Yes | Optional | Research findings | Level 1 source inspection | Decision plus architecture memory |
222
+ | Exploratory research | No | Research plan | Optional | Yes | Level 1 source inspection | Findings or decision if durable |
223
+ | Code review | No | Review plan | Optional | Yes | Level 1 or 2 | Known issue if durable |
224
+ | Release preparation | No | Yes | Yes | Yes | Level 4 evidence-based verification | Project state plus handoff |
225
+
226
+ Minimum useful routing logic:
227
+
228
+ 1. Use explicit `--type` if the user or agent supplies it.
229
+ 2. Otherwise classify by task keywords and repo signals.
230
+ 3. Escalate when risk signals appear: production, auth, security, payment, database, dependency, release, UI evidence, large refactor.
231
+ 4. De-escalate when scope is documentation, typo, local test repair, or a tightly bounded bug.
232
+ 5. Let the agent override the inferred route only when it states the reason for the override.
233
+
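The routing steps above can be sketched in TypeScript as follows. This is a minimal illustration, not the shipped `src/core/harness.ts`, whose contents are not shown in this diff. The `TaskType` names (other than `large-feature`, which appears in the CLI examples), the signal lists, the mapping from risk signals to a route, and the function names `routeTask` and `overrideRoute` are all assumptions made for the example.

```typescript
// Hypothetical task types; only "large-feature" is confirmed by the CLI examples.
type TaskType = "small-bug-fix" | "large-feature" | "documentation" | "production-incident";

interface RouteDecision {
  type: TaskType;
  reason: string; // why this route was chosen
}

// Subset of the risk signals named in step 3 (assumed keyword spelling).
const RISK_SIGNALS = ["production", "auth", "security", "payment", "database", "release"];
// Subset of the low-risk signals named in step 4 (assumed keyword spelling).
const LIGHT_SIGNALS = ["typo", "docs", "documentation", "readme"];

function routeTask(task: string, explicitType?: TaskType): RouteDecision {
  // Step 1: an explicit type always wins.
  if (explicitType !== undefined) {
    return { type: explicitType, reason: "explicit --type" };
  }
  // Step 2: classify by keywords (repo signals are omitted in this sketch).
  const words = task.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);

  // Step 3: check risk before de-escalating, so "typo in payment docs" stays cautious.
  const risk: string | undefined = RISK_SIGNALS.filter((s) => words.indexOf(s) !== -1)[0];
  if (risk !== undefined) {
    // Assumed mapping: "production" routes to the incident flow, other risks to the deepest feature flow.
    const type: TaskType = words.indexOf("production") !== -1 ? "production-incident" : "large-feature";
    return { type, reason: "risk signal: " + risk };
  }

  // Step 4: de-escalate clearly bounded, low-risk work.
  const light: string | undefined = LIGHT_SIGNALS.filter((s) => words.indexOf(s) !== -1)[0];
  if (light !== undefined) {
    return { type: "documentation", reason: "low-risk signal: " + light };
  }
  return { type: "small-bug-fix", reason: "default for bounded work" };
}

// Step 5: an override is accepted only when a non-empty reason is supplied.
function overrideRoute(inferred: RouteDecision, type: TaskType, why: string): RouteDecision {
  if (why.trim().length === 0) {
    throw new Error("override requires a reason");
  }
  return { type, reason: "override of " + inferred.type + ": " + why.trim() };
}
```

Checking risk signals before low-risk signals is a deliberate choice in this sketch: a misrouted risky task costs more than an over-routed small one.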
234
+ ## 5. State Model
235
+
236
+ Recommended default layout:
237
+
238
+ ```text
239
+ .briefops/
240
+   config.yaml
241
+   projects/
242
+   skills/
243
+   workers/
244
+   memory/
245
+   logs/
246
+   handoffs/
247
+   codex/
248
+   harness/
249
+     routes.jsonl
250
+     goals.json
251
+     findings.jsonl
252
+     verification.md
253
+     final-response.md
254
+ ```
255
+
256
+ Artifact ownership:
257
+
258
+ | Artifact | Purpose | Owner | Format | Trigger | Append or Mutable | Human or Machine | Commit? | Sensitive? |
259
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- |
260
+ | `.briefops/memory/*.yaml` | Durable facts, decisions, lessons | BriefOps memory | YAML | `finish` durable candidates | Mutable by status | Both | Usually ignored by default | May contain private memory |
261
+ | `.briefops/logs/*.yaml` | Task work history | BriefOps memory | YAML | Meaningful task finish | Append-only files | Machine-readable | Usually ignored | Often private |
262
+ | `.briefops/handoffs/*.md` | Session continuation | Handoff layer | Markdown | Multi-step or risky finish | Generated | Human-readable | Usually ignored | Often private |
263
+ | `.briefops/harness/routes.jsonl` | Route decisions audit | Orchestrator | JSONL | Each routed task | Append-only | Machine-readable | Usually ignored | Low to medium |
264
+ | `.briefops/harness/goals.json` | Active goals and status | Execution tracking | JSON | Medium+ work | Mutable | Machine-readable | Usually ignored | Medium |
265
+ | `.briefops/harness/findings.jsonl` | Debug/research/review findings | Execution tracking | JSONL | Finding discovered | Append-only | Both | Usually ignored | Medium |
266
+ | `.briefops/harness/verification.md` | Evidence checklist | Verification | Markdown | Before final response | Mutable per task | Human-readable | Usually ignored | Medium |
267
+ | `.briefops/harness/final-response.md` | Final response contract template | Verification | Markdown | Harness install | Mutable template | Human-readable | Could commit if generic | Low |
268
+ | `.specify/` and `specs/` | Feature specs and plans | Spec-Kit | Markdown/JSON | Spec-required tasks | Mixed | Both | Usually commit for product specs | Low to medium |
269
+
270
+ Markdown vs JSON:
271
+
272
+ - Markdown is best for decisions, handoffs, verification narratives, and templates.
273
+ - JSON/JSONL/YAML is best for routing, goals, findings, status, and machine checks.
274
+ - Hybrid model: keep agent-controlled ledgers structured, keep human-facing summaries readable.
275
+
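To make the append-only JSONL idea concrete, here is a minimal TypeScript sketch of a ledger such as `.briefops/harness/routes.jsonl`. The `RouteRecord` fields and the helper names `appendJsonl` and `parseJsonl` are hypothetical; the diff does not show the real record schema. The helpers work on strings so that file I/O stays out of the example.

```typescript
// Hypothetical shape of one ledger line; the real schema is not shown in this diff.
interface RouteRecord {
  task: string;
  type: string;
  at: string; // ISO 8601 timestamp
}

// Append one record as a single JSON line. Existing content is never rewritten.
function appendJsonl(existing: string, record: RouteRecord): string {
  // JSON.stringify escapes newlines inside strings, so each record stays on one line.
  const line = JSON.stringify(record);
  const needsNewline = existing.length > 0 && existing.charAt(existing.length - 1) !== "\n";
  return existing + (needsNewline ? "\n" : "") + line + "\n";
}

// Parse a JSONL ledger back into records, skipping blank lines.
function parseJsonl(content: string): RouteRecord[] {
  return content
    .split("\n")
    .filter((l) => l.trim().length > 0)
    .map((l) => JSON.parse(l) as RouteRecord);
}
```

Because each record is one self-contained line, an agent can append without reading or reformatting earlier entries, which is what makes JSONL a natural fit for the append-only artifacts in the table above.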
276
+ ## 6. Skill / Command Interface
277
+
278
+ ### CLI
279
+
280
+ ```bash
281
+ briefops harness route --task "<task>"
282
+ briefops harness route --task "<task>" --type large-feature
283
+ briefops harness route --task "<task>" --json
284
+ briefops harness matrix
285
+ briefops prime --format codex --task "<task>"
286
+ briefops finish --task "<task>" --result "<verified result>"
287
+ ```
288
+
289
+ ### Public Codex skills
290
+
291
+ | Skill | Purpose | When Used | Inputs | Outputs | State Touched |
292
+ | --- | --- | --- | --- | --- | --- |
293
+ | `briefops-route-task` | Choose workflow depth | Start of task | User task | Route contract | MVP none |
294
+ | `briefops-prime-context` | Load compact memory | After route, before broad inspection | Task, worker, project | Prime context | Reads `.briefops/` |
295
+ | `briefops-finish-task` | Record outcome and durable memory | End of meaningful work | Result, lessons, risks | Work log, memory proposal/application | Writes `.briefops/` |
296
+ | `briefops-review-memory` | Inspect memory proposals | When pending proposals exist | Proposal id | Apply/reject decision | Writes `.briefops/` |
297
+ | `briefops-continue-worker` | Prepare fresh-thread handoff | Continuation tasks | Worker, task | Handoff/resume/pack | Writes `.briefops/` |
298
+
299
+ ### Future internal skills
300
+
301
+ | Skill | Purpose | User-facing? |
302
+ | --- | --- | --- |
303
+ | `route-task` | Classify task and emit workflow contract | No |
304
+ | `load-memory` | Run prime and summarize selected continuity context | No |
305
+ | `write-ledger` | Create/update goal ledger | No |
306
+ | `record-finding` | Append finding with evidence | No |
307
+ | `check-verification` | Validate route-specific completion evidence | No |
308
+ | `summarize-handoff` | Prepare durable handoff notes | No |
309
+
310
+ Smallest viable MVP skill set:
311
+
312
+ 1. `briefops-route-task`
313
+ 2. `briefops-prime-context`
314
+ 3. `briefops-finish-task`
315
+ 4. `briefops-continue-worker`
316
+
317
+ ## 7. Runtime Flows
318
+
319
+ ### New Feature
320
+
321
+ Flow:
322
+
323
+ ```text
324
+ intake -> context retrieval -> specify -> plan -> tasks -> implement -> verify -> handoff
325
+ ```
326
+
327
+ Required artifacts: route, prime context, spec if medium/large, plan, goal ledger, verification evidence, finish log.
328
+
329
+ Exit criteria: acceptance criteria covered, implementation scoped, verification evidence recorded, memory/handoff updated.
330
+
331
+ Final response: summary, files changed, acceptance criteria, verification, risks, memory update.
332
+
333
+ ### Bug Fix
334
+
335
+ Flow:
336
+
337
+ ```text
338
+ intake -> reproduce -> findings -> patch -> targeted verification -> worklog
339
+ ```
340
+
341
+ Required artifacts: route, reproduction note or reason reproduction was not possible, finding if debugging, targeted test evidence.
342
+
343
+ Exit criteria: root cause or bounded symptom understood, patch scoped, targeted verification passes.
344
+
345
+ Final response: cause, fix, files changed, verification, remaining risk.
346
+
347
+ ### Refactor
348
+
349
+ Flow:
350
+
351
+ ```text
352
+ intake -> scope boundary -> risk analysis -> plan -> implement incrementally -> regression verification -> handoff
353
+ ```
354
+
355
+ Required artifacts: behavior boundary, risk notes, plan, regression evidence.
356
+
357
+ Exit criteria: no intended behavior change unless documented, diff reviewable, tests cover touched behavior.
358
+
359
+ Final response: scope, behavior preservation claim, verification, risks.
360
+
361
+ ### Research
362
+
363
+ Flow:
364
+
365
+ ```text
366
+ intake -> source review -> findings -> recommendation -> decision log
367
+ ```
368
+
369
+ Required artifacts: route, source list, findings, recommendation, decision if durable.
370
+
371
+ Exit criteria: sources inspected directly, facts separated from interpretation, uncertainties named.
372
+
373
+ Final response: findings, sources, recommendation, open questions.
374
+
375
+ ### UI
376
+
377
+ Flow:
378
+
379
+ ```text
380
+ intake -> visual target -> implementation -> render/inspect -> screenshot/evidence -> verification
381
+ ```
382
+
383
+ Required artifacts: route, visual target, screenshot/render evidence, responsive notes.
384
+
385
+ Exit criteria: UI rendered in its normal runtime environment, target viewport checked, visual evidence supports completion.
386
+
387
+ Final response: UI outcome, files changed, visual evidence, remaining visual risk.
388
+
389
+ ## 8. Completion Criteria
390
+
391
+ Final responses should include:
392
+
393
+ - Summary of changes.
394
+ - Files changed.
395
+ - Tests or verification performed.
396
+ - Evidence.
397
+ - Remaining risks.
398
+ - Memory updates made or skipped.
399
+ - Next recommended action when useful.
400
+
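A final response check along these lines could be sketched as below. This assumes, purely for illustration, that the final response is Markdown with one heading per required section; the section titles and the function name `missingSections` are hypothetical and not taken from the package.

```typescript
// Required sections mirror the list above; the exact heading titles are an assumption.
const REQUIRED_SECTIONS = [
  "Summary",
  "Files changed",
  "Verification",
  "Evidence",
  "Remaining risks",
  "Memory updates",
];

// Return the required sections that have no matching Markdown heading.
function missingSections(response: string): string[] {
  const headings = response
    .split("\n")
    .filter((line) => /^#{1,6}\s+/.test(line))
    .map((line) => line.replace(/^#{1,6}\s+/, "").trim().toLowerCase());
  return REQUIRED_SECTIONS.filter((name) => headings.indexOf(name.toLowerCase()) === -1);
}
```

The optional "next recommended action" item is left out of the required list because the source marks it as applying only when useful.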
401
+ Standards by task kind:
402
+
403
+ | Work Type | Done Means |
404
+ | --- | --- |
405
+ | Documentation-only | Content matches current behavior; links/examples checked when practical; skipped execution explained. |
406
+ | Code change | Changed behavior is implemented; targeted or full checks pass; residual risk named. |
407
+ | UI change | Rendered output inspected; screenshot or equivalent evidence captured; viewport risk named. |
408
+ | Database change | Migration path, rollback or compatibility notes, and integration verification exist. |
409
+ | Dependency change | Lockfile/package changes are intentional; build/tests pass; migration notes recorded. |
410
+ | Production fix | Impact, mitigation, verification, and follow-up risk are recorded. |
411
+ | Research-only | Sources are cited; facts and interpretation are separated; recommendation is explicit. |
412
+
413
+ ## 9. Verification Policy
414
+
415
+ | Level | Required When | Acceptable Evidence | Unacceptable Evidence | Escalation |
416
+ | --- | --- | --- | --- | --- |
417
+ | Level 0: No execution | Pure planning or writing | Static review note | Claiming code works without code checks | Escalate if files affect runtime behavior |
418
+ | Level 1: Static inspection | Docs, config, research, review | File inspection, source links, type-aware reasoning | "Looks fine" without inspected evidence | Escalate if behavior changes |
419
+ | Level 2: Targeted verification | Bug fix, isolated change, test repair | Specific test, build target, reproduction check | Unrelated test command only | Escalate on shared module or uncertain coverage |
420
+ | Level 3: Full project verification | Feature, refactor, dependency upgrade | Build plus relevant suite or project-standard checks | Only lint for behavior changes | Escalate for release, UI, incident |
421
+ | Level 4: Evidence-based verification | UI, generated files, release, incident | Screenshot, artifact inspection, release checklist, logs | Verbal assertion only | Block completion if evidence cannot be produced and no explanation is given |
422
+
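One way to read the table above is as a gate that accepts a completion claim only when the recorded evidence matches the required level. The TypeScript sketch below illustrates that reading. The evidence identifiers (such as `targeted-test`) and the function name `verificationGate` are invented for the example, and levels are treated as independent rather than cumulative, which is an assumption.

```typescript
type Level = 0 | 1 | 2 | 3 | 4;

// Hypothetical evidence identifiers, grouped by the level whose "Acceptable Evidence" column they paraphrase.
const ACCEPTABLE: Record<Level, string[]> = {
  0: ["static-review-note"],
  1: ["file-inspection", "source-link"],
  2: ["targeted-test", "build-target", "reproduction-check"],
  3: ["build-and-suite", "project-standard-checks"],
  4: ["screenshot", "artifact-inspection", "release-checklist", "logs"],
};

interface GateResult {
  pass: boolean;
  reason: string;
}

// Pass only if at least one recorded evidence item is acceptable for the required level.
function verificationGate(required: Level, evidence: string[]): GateResult {
  const accepted = evidence.filter((e) => ACCEPTABLE[required].indexOf(e) !== -1);
  if (accepted.length > 0) {
    return { pass: true, reason: "accepted: " + accepted.join(", ") };
  }
  return { pass: false, reason: "no acceptable evidence for level " + required };
}
```

For example, `verificationGate(2, ["lint"])` fails in this sketch because lint output is not listed as Level 2 evidence, which matches the table's rule that an unrelated command does not count.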
423
+ ## 10. Memory Policy
424
+
425
+ Memory update types:
426
+
427
+ | Type | Required When | Avoid |
428
+ | --- | --- | --- |
429
+ | No update | Tiny transient tasks with no durable lesson | Losing decisions or risks |
430
+ | Work log only | Most meaningful tasks | Recording every micro-step |
431
+ | Decision log | Architecture, product, workflow, or policy choices | Storing preferences as facts |
432
+ | Project state update | Durable project constraints changed | Duplicating task logs |
433
+ | Handoff summary | Multi-step, interrupted, incident, release, large feature | Handing off stale blockers |
434
+ | Architecture memory | Durable technical rationale | Overwriting unresolved debate |
435
+ | Known issue/finding | Confirmed unresolved defect or risk | Storing unverified suspicion as fact |
436
+
437
+ Rules:
438
+
439
+ 1. Store durable decisions, assumptions, constraints, unresolved risks, and next steps.
440
+ 2. Do not store secrets, credentials, personal data, or transient command noise.
441
+ 3. Keep raw work logs local/private by default.
442
+ 4. Use shared-only export when context may leave the workspace.
443
+ 5. Prefer fewer, higher-quality memory entries.
444
+
445
+ ## 11. Repository Integration
446
+
447
+ Where files live:
448
+
449
+ - Harness source code belongs in the BriefOps CLI and plugin generator.
450
+ - User project state belongs under `.briefops/`.
451
+ - Spec-Kit state belongs under `.specify/` and `specs/` when initialized.
452
+ - Always-visible router instructions belong in `AGENTS.md`, `CLAUDE.md`, or Cursor rules through existing `briefops export`.
453
+
454
+ Commit guidance:
455
+
456
+ - Commit generic docs, router instructions, specs, and team-approved project rules.
457
+ - Ignore `.briefops/` by default because it can contain local/private work history.
458
+ - Allow teams to commit selected shared `.briefops` templates only after review.
459
+
460
+ Monorepos:
461
+
462
+ - Detect nearest workspace root.
463
+ - Support one `.briefops/` at repo root with project names for packages.
464
+ - v1 should support package-specific profiles and test command discovery.
465
+
466
+ Small projects:
467
+
468
+ - Keep a single `.briefops/` workspace.
469
+ - Use default worker and project.
470
+ - Avoid requiring Spec-Kit unless complexity warrants it.
471
+
472
+ Convention detection:
473
+
474
+ - Read `AGENTS.md`, `README.md`, package scripts, CI config, test scripts, and existing docs before choosing commands.
475
+ - Prefer existing test/build commands over invented ones.
476
+
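As an illustration of preferring existing commands over invented ones, the following TypeScript sketch picks a verification command from the `scripts` field of a `package.json`. The preference order and the function name `pickTestCommand` are assumptions for the example; the harness may discover commands differently.

```typescript
// Pick an existing npm script instead of inventing a command.
// Returns undefined when none of the preferred scripts is defined.
function pickTestCommand(packageJsonText: string): string | undefined {
  const pkg = JSON.parse(packageJsonText) as { scripts?: Record<string, string> };
  const scripts = pkg.scripts || {};
  // Hypothetical preference order; a real implementation might also read CI config.
  const preferred = ["test", "test:unit", "check"];
  for (const name of preferred) {
    if (typeof scripts[name] === "string") {
      return "npm run " + name;
    }
  }
  return undefined;
}
```

Returning `undefined` rather than a guessed default keeps the caller honest: when no known script exists, the agent should say so instead of running an invented command.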
477
+ ## 12. Licensing And Reuse
478
+
479
+ | Source | License | Code Reuse | Ideas Reimplementation | Attribution |
480
+ | --- | --- | --- | --- | --- |
481
+ | BriefOps | MIT | Owned here | Yes | Keep MIT notice |
482
+ | FableCodex | AGPL-3.0-or-later | Avoid copying into BriefOps | Yes, clean-room only | Attribute ideas; do not derive code/text without legal review |
483
+ | fablize | MIT | Possible after source audit | Yes | Attribute if deriving text/code |
484
+ | Ponytail | MIT | Possible after source audit | Yes | Attribute if deriving text/code |
485
+ | Spec-Kit local plugin | Check upstream for exact terms before copying | Do not copy into BriefOps MVP | Route to installed skills | Cite integration boundary |
486
+
487
+ Clean-room rule:
488
+
489
+ - Do not copy source code or long instruction text from external projects into BriefOps.
490
+ - Reimplement concepts as BriefOps-native routing, ledgers, and verification contracts.
491
+ - Keep source findings and design interpretation separate.
492
+
493
+ ## 13. MVP Scope
494
+
495
+ Goal:
496
+
497
+ > Make Codex behave more reliably on real development tasks.
498
+
499
+ MVP features:
500
+
501
+ - One installable Codex plugin/skill pack.
502
+ - `briefops harness route`.
503
+ - Routing matrix for common task categories.
504
+ - Existing `briefops prime` memory intake.
505
+ - Existing `briefops finish` work log and memory update.
506
+ - Goal/finding/verification concepts defined in docs.
507
+ - Final response contract defined by route output.
508
+
509
+ MVP non-goals:
510
+
511
+ - Cloud sync.
512
+ - TUI/dashboard.
513
+ - Database-backed state.
514
+ - Full multi-agent orchestration.
515
+ - Automatic source-code rewriting by harness.
516
+ - Mandatory Spec-Kit for all work.
517
+ - Direct FableCodex/fablize/Ponytail code import.
518
+
519
+ MVP file tree:
520
+
521
+ ```text
522
+ src/core/harness.ts
523
+ src/commands/harness.ts
524
+ plugins/briefops-codex/skills/briefops-route-task/SKILL.md
525
+ docs/master-harness.md
526
+ tests/harness.test.ts
527
+ ```
528
+
529
+ Implementation steps:
530
+
531
+ 1. Ship route command and skill entrypoint.
532
+ 2. Add persistent route audit file `.briefops/harness/routes.jsonl`.
533
+ 3. Add goal ledger writer for multi-step routes.
534
+ 4. Add findings append command for debugging/research/review.
535
+ 5. Add verification checklist generator keyed by route verification level.
536
+ 6. Add final response checker that prints missing required sections.
537
+ 7. Detect `.specify/` and suggest Spec-Kit commands only when required.
538
+
539
+ Test strategy:
540
+
541
+ - Unit-test route classification and explicit overrides.
542
+ - CLI-test `briefops harness route` and `briefops harness matrix`.
543
+ - Plugin-sync test for generated Codex skill files.
544
+ - Golden-output tests for route contracts.
545
+ - Later: fixture tests for `.briefops/harness/` persistence.
546
+
547
+ Risks:
548
+
549
+ - Keyword routing can misclassify tasks. Mitigation: explicit `--type` override and safer escalation rules.
550
+ - Process creep can make small tasks slow. Mitigation: route matrix defaults small work to light workflow.
551
+ - Verification can become performative. Mitigation: route-specific acceptable evidence.
552
+ - State can become noisy. Mitigation: memory policy and local/private defaults.
553
+
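The first risk above (keyword misclassification, mitigated by an explicit `--type` override) can be illustrated with a minimal sketch. The category names, the keyword table, and the `classifyTask` function are hypothetical and are not taken from `src/core/harness.ts`; the only behavior assumed from the text is that an explicit type wins over keyword matching and that unmatched work falls back to a light default.

```typescript
// Hypothetical task categories; the real routing matrix may differ.
type TaskType = "debugging" | "research" | "review" | "feature" | "small-change";

interface RouteDecision {
  type: TaskType;
  source: "override" | "keyword" | "default";
}

// Illustrative keyword table, checked in order. Not taken from BriefOps.
const KEYWORDS: Array<[TaskType, RegExp]> = [
  ["debugging", /\b(bug|crash|error|fail(s|ing|ure)?)\b/i],
  ["research", /\b(investigate|compare|evaluate)\b/i],
  ["review", /\b(review|audit)\b/i],
  ["feature", /\b(add|implement|build)\b/i],
];

function classifyTask(task: string, override?: TaskType): RouteDecision {
  // An explicit override always wins over keyword matching.
  if (override !== undefined) {
    return { type: override, source: "override" };
  }
  for (const [type, pattern] of KEYWORDS) {
    if (pattern.test(task)) {
      return { type, source: "keyword" };
    }
  }
  // Unmatched tasks fall back to the lightest workflow.
  return { type: "small-change", source: "default" };
}
```

In this sketch, `classifyTask("Add a review checklist")` is routed to `review` because the word "review" matches before "add" is checked, even though the task is arguably feature work. Passing the type explicitly, as in `classifyTask("Add a review checklist", "feature")`, corrects that.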
554
+ ## 14. Roadmap
555
+
556
+ ### MVP
557
+
558
+ Goal: Make Codex behave more reliably on real development tasks.
559
+
560
+ Features:
561
+
562
+ - Routing.
563
+ - Memory intake.
564
+ - Work log.
565
+ - Goal/finding/verification contracts.
566
+ - Handoff through existing BriefOps commands.
567
+
568
+ ### v1
569
+
570
+ Goal: Make BriefOps Master Harness dependable across multiple projects.
571
+
572
+ Features:
573
+
574
+ - Persistent harness schemas.
575
+ - Project profiles.
576
+ - Better `AGENTS.md` integration.
577
+ - Automatic test command discovery.
578
+ - Release/review/incident flows.
579
+ - Spec-Kit detection and route-aware suggestions.
580
+
581
+ ### v2
582
+
583
+ Goal: Make BriefOps runtime-agnostic.
584
+
585
+ Features:
586
+
587
+ - Adapters for Codex, Claude Code, OpenCode, Gemini CLI.
588
+ - Portable state model.
589
+ - Cross-agent handoff.
590
+ - Advanced memory compaction.
591
+ - Optional dashboard or TUI.
592
+
593
+ ## 15. Open Questions
594
+
595
+ 1. Should `.briefops/harness/routes.jsonl` be enabled by default or only with `--save`?
596
+ 2. Should goal/findings state be one shared ledger or one task-scoped directory per task?
597
+ 3. Should Spec-Kit integration call local skills directly or only print next-step instructions?
598
+ 4. Should route classification read `AGENTS.md` and package scripts in MVP or v1?
599
+ 5. Should final response checking be advisory or blocking?
600
+ 6. Should teams be able to customize routing policy in `.briefops/harness/policy.yaml`?
601
+
602
+ ## 16. Recommended Next Action
603
+
604
+ Implement persistent harness state:
605
+
606
+ ```bash
607
+ briefops harness route --task "<task>" --save
608
+ ```
609
+
610
+ The saved route should append to `.briefops/harness/routes.jsonl` and create a task-scoped verification checklist. That makes the orchestrator observable without turning the MVP into a heavy workflow engine.
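As a rough illustration of the append-only audit file described above, the sketch below serializes one route decision per line in JSON Lines form. The `RouteRecord` fields and both helper names are assumptions for illustration; the source does not show the actual `routes.jsonl` schema.

```typescript
// Hypothetical record shape; the real routes.jsonl schema is not shown in the source.
interface RouteRecord {
  timestamp: string; // ISO 8601 timestamp
  task: string;
  type: string;
  verificationLevel: string;
}

// One JSON object per line. JSON.stringify escapes newlines inside strings,
// so each record always occupies exactly one line.
function toJsonlLine(record: RouteRecord): string {
  return JSON.stringify(record) + "\n";
}

// Parse a whole JSONL file back into records, skipping blank lines.
function parseJsonl(content: string): RouteRecord[] {
  return content
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RouteRecord);
}
```

A caller could append the returned line to `.briefops/harness/routes.jsonl` with Node's `fs.appendFileSync`. Because existing lines are never rewritten, the file stays a plain audit trail that other tools can read line by line.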
@@ -2,14 +2,20 @@
2
2
 
3
3
  BriefOps separates immediate work continuity from durable memory.
4
4
 
5
- Work logs are written first. They can feed the next local handoff or resume immediately, so a fresh AI coding thread can continue from recent results, lessons, decisions, risks, incidents, and next steps without waiting for a memory review.
5
+ Work logs are written first. They can feed the next local handoff or resume immediately, so a fresh AI coding thread can continue from recent results, lessons, decisions, risks, incidents, and next steps without waiting for a review step.
6
6
 
7
- Durable memory is the curated layer. It is used for reusable lessons, decisions, facts, incidents, and constraints that should survive beyond the immediate handoff.
7
+ Durable memory is the curated directory-local layer. It is used for reusable lessons, decisions, facts, incidents, and constraints that should survive beyond the immediate handoff.
8
8
 
9
- Allowed flow:
9
+ Default flow:
10
10
 
11
11
  ```text
12
- log -> memory proposal -> human approval -> memory
12
+ finish -> work log -> memory proposal audit file -> local memory
13
+ ```
14
+
15
+ Review-mode flow:
16
+
17
+ ```text
18
+ finish --memory-review -> work log -> pending memory proposal -> apply/reject locally
13
19
  ```
14
20
 
15
21
  Commands:
@@ -24,4 +30,4 @@ briefops memory proposal-reject <proposal-id>
24
30
 
25
31
  Extraction is deterministic and local. Lessons become lesson memory candidates. Notes prefixed with `decision:` or `fact:` become matching memory candidates. Results containing risk or failure language become incident candidates.
26
32
 
27
- Proposal generation and approval are local file-backed operations protected by workspace locks. BriefOps never auto-approves memory or skill patches.
33
+ Proposal generation and application are local file-backed operations protected by workspace locks. BriefOps asks for explicit direction before applying skill patches or sharing private memory outside the local workspace.
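The deterministic extraction rules described in this hunk (lessons, `decision:` and `fact:` note prefixes, and risk or failure language in results) can be sketched as a pure function. The `WorkLog` shape, the `extractCandidates` name, and the exact risk word list are assumptions for illustration; the source does not show the real matching rules.

```typescript
type CandidateKind = "lesson" | "decision" | "fact" | "incident";

interface Candidate {
  kind: CandidateKind;
  text: string;
}

// Hypothetical work log shape, used only for this sketch.
interface WorkLog {
  lessons: string[];
  notes: string[];
  results: string[];
}

// Assumed word list for "risk or failure language"; the real list is not documented here.
const RISK_PATTERN = /\b(risk|fail(ed|s|ure)?|broke|regression)\b/i;

function extractCandidates(log: WorkLog): Candidate[] {
  const out: Candidate[] = [];
  // Every lesson becomes a lesson candidate.
  for (const lesson of log.lessons) {
    out.push({ kind: "lesson", text: lesson });
  }
  // Notes prefixed with `decision:` or `fact:` become matching candidates.
  for (const note of log.notes) {
    const trimmed = note.trim();
    for (const kind of ["decision", "fact"] as const) {
      const prefix = kind + ":";
      if (trimmed.startsWith(prefix)) {
        out.push({ kind, text: trimmed.slice(prefix.length).trim() });
      }
    }
  }
  // Results containing risk or failure language become incident candidates.
  for (const result of log.results) {
    if (RISK_PATTERN.test(result)) {
      out.push({ kind: "incident", text: result });
    }
  }
  return out;
}
```

Because the function depends only on its input and uses no model calls or network access, the same work log always yields the same candidates, which is the property the text describes as deterministic and local.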
@@ -4,7 +4,7 @@ BriefOps is local-first. It stores work history, memory, proposals, patches, gen
4
4
 
5
5
  ## Export Policies
6
6
 
7
- `local-private` is for local terminal and local Codex/Claude/Cursor use. It may include private project details, approved private memory, local work logs, risks, next steps, worker history, and metadata counts.
7
+ `local-private` is for local terminal and local Codex/Claude/Cursor use. It may include private project details, local private memory, local work logs, risks, next steps, worker history, and metadata counts.
8
8
 
9
9
  `shared-only` is for artifacts that may leave the local workspace. It includes only memory items marked `visibility: shared` and `exportable: true`.
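The two export policies above amount to a filter over memory items. The sketch below assumes a hypothetical `MemoryItem` shape whose `visibility` and `exportable` fields mirror the markers named in the text; the `"private"` visibility value and the `selectForExport` function are illustrative and not confirmed by the source.

```typescript
// Hypothetical item shape; only `visibility: shared` and `exportable: true`
// are named in the source.
interface MemoryItem {
  id: string;
  visibility: "private" | "shared";
  exportable: boolean;
  text: string;
}

type ExportPolicy = "local-private" | "shared-only";

function selectForExport(items: MemoryItem[], policy: ExportPolicy): MemoryItem[] {
  if (policy === "local-private") {
    // Local use may include private items.
    return items;
  }
  // shared-only requires BOTH markers; either one alone is not enough.
  return items.filter((item) => item.visibility === "shared" && item.exportable);
}
```

Requiring both markers means an item marked shared but not exportable stays local, which keeps the default conservative for anything that may leave the workspace.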
10
10
 
@@ -25,9 +25,11 @@ Shared-only omits:
25
25
 
26
26
  `briefops export agents-md`, `briefops export claude-md`, `briefops export cursor-rules`, and `briefops export all` generate router files. Router files point AI harnesses to BriefOps commands. They do not copy `.briefops` memory, logs, worker summaries, handoffs, incidents, or private decisions.
27
27
 
28
- ## Human Approval
28
+ ## Local Memory And Explicit Boundaries
29
29
 
30
- BriefOps never auto-approves memory proposals or skill patches. Approval is always an explicit user action through `briefops approve`, `briefops memory proposal-apply`, or `briefops skill apply-patch`.
30
+ BriefOps applies directory-local memory from `briefops finish` by default and keeps proposal files as an audit trail. This updates files under the current repository's `.briefops/` workspace only.
31
+
32
+ Explicit direction is still required before applying skill patches or sharing private memory outside the local workspace. Use `--export-policy shared-only` for portable or committable context.
31
33
 
32
34
  ## Doctor Checks
33
35
 
@@ -40,3 +42,5 @@ BriefOps never auto-approves memory proposals or skill patches. Approval is alwa
40
42
  `briefops doctor --stability` checks local workspace integrity, including schema validity, duplicate memory ids, broken references, managed-path symlinks, and orphaned review artifacts. It is read-only, reports bounded examples, and does not add detailed doctor output to generated prompt artifacts.
41
43
 
42
44
  `briefops doctor --security --fix-stale-locks` removes stale lock files only. It does not remove fresh locks or other workspace files.
45
+
46
+ `briefops doctor --strict --json` aggregates stability, security, privacy, and memory hygiene. It reports `releaseReady: false` when any warning or failure remains, so privacy warnings such as an unignored `.briefops/` workspace cannot silently pass a release gate.
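A release gate built on the strict report might look like the sketch below. Only the `releaseReady` field is taken from the text above; the `isReleaseReady` helper is hypothetical, and the rest of the JSON shape is deliberately ignored because the source does not document it.

```typescript
// Returns true only when the report parses as JSON and explicitly sets
// `releaseReady` to true. Anything else (invalid JSON, a missing field,
// a non-boolean value) is treated as not ready.
function isReleaseReady(doctorJson: string): boolean {
  let report: unknown;
  try {
    report = JSON.parse(doctorJson);
  } catch {
    return false;
  }
  return (
    typeof report === "object" &&
    report !== null &&
    (report as { releaseReady?: unknown }).releaseReady === true
  );
}
```

A CI step could capture the output of `briefops doctor --strict --json`, pass it to this function, and fail the job when it returns `false`. Treating unparseable or incomplete output as not ready keeps the gate from passing silently when the command itself breaks.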