npm - @deltafleet/codex-goalkeeper - Versions diffs - 0.1.0 → 0.1.1 - Mend

@deltafleet/codex-goalkeeper 0.1.0 → 0.1.1


Files changed (24)

package/README.md CHANGED

@@ -1,148 +1,126 @@
 # Codex Goalkeeper
-Context compaction does not kill long Codex runs. Direction drift does.
+Long Codex runs do not usually fail all at once.
-Codex Goalkeeper is a tiny skill that helps Codex keep its bearings during long `/goal` sessions, repeated compactions, handoffs, and multi-day implementation work.
+They drift.
-It does not try to be a hidden memory engine. It gives the agent a simple ritual:
+The agent still sounds confident. The tests still run. The plan still looks plausible. But after enough compaction, handoffs, and resumed turns, the session can quietly forget the thing that mattered most:
-1. Write down the current mission.
-2. Preserve the reasoning that should survive compaction.
-3. Record decisions, failures, and verification as evidence.
-4. Read the checkpoint first before continuing.
+> Why were we doing it this way?
-That is it. Boring files. Better continuity.
+Codex Goalkeeper is a small skill that helps Codex keep long-running `/goal` work oriented across compaction, resumes, and handoffs.
-[한국어](README.ko.md) | [日本語](README.ja.md) | [中文](README.zh-CN.md)
-## The Problem
-Long agent sessions do not usually fail because the model forgets a random detail. They fail because the "why" gets blurred:
+It does not add a hidden memory engine. It gives the agent a durable working ritual:
-- why a direction was chosen
-- what the user explicitly ruled out
-- which attempt already failed
-- what was actually verified
-- what the next action was supposed to be
+- keep a short checkpoint
+- keep a richer context pack
+- append decisions and verification to an event log
+- read the checkpoint before continuing after drift-prone boundaries
-After enough compaction, a session can still sound confident while quietly losing the thread.
+Boring files. Better continuity.
-Goalkeeper makes the thread explicit.
-## What It Creates
+[한국어](README.ko.md) | [日本語](README.ja.md) | [中文](README.zh-CN.md)
-Goalkeeper stores state inside the project you are working on:
+## Install
-```text
-.goalkeeper/
-  active-session
-  sessions/
-    <goal-session-id>/
-      checkpoint.md
-      context-pack.md
-      events.jsonl
+```bash
+npx skills add deltafleet/codex-goalkeeper
 ```
-The three files have different jobs:
+Requirements: Node.js 18+ and `npx`. Codex uses the skill's bundled Node helper scripts during long-goal workflows.
-- `checkpoint.md`: short, always-read recovery state
-- `context-pack.md`: medium-density reasoning that should survive compaction
-- `events.jsonl`: append-only evidence for decisions, failures, commands, risks, and handoffs
+Then ask Codex to use it on long-running work:
-The active Codex goal says where you are going. Goalkeeper preserves how not to get lost on the way.
+> Use codex-goalkeeper for this `/goal`. Keep the goal, constraints, decisions, verification state, failed attempts, and next action recoverable across compaction.
-## Install
+That is the intended user workflow. You should not have to manually run Goalkeeper's helper scripts during normal use. Codex runs them as part of the skill workflow.
-With the Skills CLI:
+## The Problem
-```bash
-npx skills add deltafleet/codex-goalkeeper
-```
+If you use Codex for small tasks, compaction is just a detail. The agent can usually recover.
-From a local checkout:
+But long goals are different.
-```bash
-git clone https://github.com/deltafleet/codex-goalkeeper
-cd codex-goalkeeper
-npx skills add .
-```
+Imagine a real session:
-## Start A Long Goal
+1. You ask Codex to harden a release.
+2. It finds a brittle edge case and chooses a conservative path.
+3. You explicitly reject a tempting shortcut.
+4. Two implementation attempts fail for subtle reasons.
+5. A test finally proves the right route.
+6. The context compacts.
+7. Later, the agent resumes with a neat summary, but not the decision pressure that made the route correct.
-Create a Goalkeeper session in your project:
+That is where drift starts.
-```bash
-node <skill-path>/src/scripts/goalkeeper-init.mjs \
-  --workspace <workspace> \
-  --session 2026-05-18-release-hardening \
-  --goal "Ship the release without losing constraints after compaction"
-```
+The failure mode is not "the model forgot everything." It is worse: it remembers enough to continue, but not enough to continue in the same direction.
-This creates the project-local `.goalkeeper/` directory and sets the active session pointer.
+You see it when an agent:
-## Resume Correctly
+- reopens an approach the user already rejected
+- repeats a failed attempt because the failure was summarized away
+- treats an unverified assumption as settled fact
+- loses the exact next action after a long handoff
+- preserves the goal but loses the operating constraints
+- gives a polished explanation that no longer matches the workstream
-At the start of a new turn, after handoff, or after suspected compaction:
+Goalkeeper exists for that gap between "the goal is still known" and "the session still has its bearings."
-```bash
-node <skill-path>/src/scripts/goalkeeper-turn-start.mjs \
-  --workspace <workspace>
-```
+## What Codex Does
-If the short checkpoint is not enough:
+When the skill is active, Codex maintains a project-local continuity folder:
-```bash
-node <skill-path>/src/scripts/goalkeeper-turn-start.mjs \
-  --workspace <workspace> \
-  --context
+```text
+.goalkeeper/
+  active-session
+  sessions/
+    <goal-session-id>/
+      checkpoint.md
+      context-pack.md
+      events.jsonl
 ```
-This is the core behavior. The agent reads the current mission before touching the rest of the project.
+Each file has a different job:
-## Record What Matters
+- `checkpoint.md` is the short "read this first" recovery state.
+- `context-pack.md` preserves the reasoning chain that is too detailed for the checkpoint.
+- `events.jsonl` records decisions, failed attempts, command evidence, verification, risks, and handoffs.
-Append meaningful events:
+The active Codex goal says where the work is going. Goalkeeper preserves why this is still the right route.
-```bash
-node <skill-path>/src/scripts/goalkeeper-append-event.mjs \
-  --workspace <workspace> \
-  --type decision \
-  --text "Keep the MVP skill-only; no MCP server or background daemon."
-```
+## How It Works
-Refresh the checkpoint after a real state change:
+Goalkeeper turns long agent work into a simple loop:
-```bash
-node <skill-path>/src/scripts/goalkeeper-update-checkpoint.mjs \
-  --workspace <workspace> \
-  --goal "Ship the release without losing constraints after compaction" \
-  --status "Docs complete; validation pending." \
-  --next "Run validation and cut v0.1.0."
+```text
+Long /goal begins
+  -> Codex creates or resumes a Goalkeeper session
+  -> important constraints and decisions are recorded
+  -> failed attempts are kept so they are not repeated
+  -> verification evidence is logged when confidence changes
+  -> checkpoint.md is refreshed at meaningful boundaries
+  -> context-pack.md keeps the deeper reasoning chain
+  -> after resume, handoff, or suspected compaction, Codex reads checkpoint.md first
+  -> if the checkpoint is too thin, Codex reads context-pack.md
+  -> if exact proof is needed, Codex checks events.jsonl or source files
 ```
-Check the workspace before trusting it for a long run:
+This is not transcript storage. It is working-state preservation.
-```bash
-node <skill-path>/src/scripts/goalkeeper-doctor.mjs \
-  --workspace <workspace> \
-  --session 2026-05-18-release-hardening \
-  --strict
-```
+## Why It Is Small On Purpose
-## A Typical Loop
+The obvious version of this project is too big:
-```text
-User starts a long /goal
-  -> Goalkeeper initializes a project-local session
-  -> Codex works normally
-  -> important decisions and verification go into events.jsonl
-  -> checkpoint.md is refreshed at meaningful boundaries
-  -> context-pack.md preserves the reasoning chain
-  -> after resume or compaction, Codex reads checkpoint.md first
-  -> if needed, Codex reads context-pack.md and exact event evidence
-```
+- a daemon
+- a database
+- a session rewriter
+- a private runtime hook
+- a vector memory layer
+- a full transcript engine
-Goalkeeper does not remove compaction. It makes recovery from compaction less dependent on vibes.
+Goalkeeper intentionally avoids that.
+It uses files because files are visible, reviewable, portable, and easy for agents to read after compaction. The point is not to make Codex omniscient. The point is to make the next turn start from the right state.
 ## What This Is Not
@@ -150,39 +128,42 @@ Goalkeeper does not remove compaction. It makes recovery from compaction less de
 - Not an MCP server.
 - Not a database.
 - Not a transcript archive.
-- Not a private runtime hook.
-- Not a promise of perfect memory.
+- Not a private Codex runtime hook.
+- Not a guarantee of perfect memory.
 - Not a way to reduce compaction frequency.
-Goalkeeper is intentionally small because small rituals survive real work.
-## Why It Works
+Goalkeeper improves continuity. It does not pretend to eliminate context limits.
-Agent memory fails when the only durable state is the conversation itself.
+## What Gets Better
-Goalkeeper moves the minimum useful continuity state into the repository:
+With Goalkeeper, a resumed session has a better chance to recover:
-- constraints become visible
-- decisions become inspectable
-- failed paths stop repeating
-- verification survives handoff
-- the next action is explicit
+- the user's non-negotiable constraints
+- the current implementation direction
+- the reason rejected alternatives stayed rejected
+- the tests or commands that changed confidence
+- the real next action
+- unresolved risks that should not be hand-waved away
-The model still has to do the work. Goalkeeper just gives it a better starting point after the context window gets rewritten.
+That is enough to prevent many of the boring, expensive failures in long agent runs.
 ## Repository Layout
 ```text
-SKILL.md                    # skill entrypoint
-agents/openai.yaml          # skill metadata
-src/scripts/                # deterministic helpers
-src/templates/              # starter Goalkeeper files
-src/references/             # workflow and schema references
+src/codex-goalkeeper/       # installable skill payload
+  SKILL.md
+  agents/openai.yaml
+  scripts/
+  templates/
+  references/
+tests/                      # maintainer tests
 examples/goalkeeper-session # static example state
 docs/                       # roadmap and release policy
 ```
-## Validation
+## Maintainer Validation
+For repository maintainers:
 ```bash
 npm run validate
@@ -191,8 +172,8 @@ npm run validate
 Equivalent manual checks:
 ```bash
-find src/scripts -name '*.mjs' -print0 | xargs -0 -n1 node --check
-node src/scripts/test-goalkeeper-update-checkpoint.mjs
+find src/codex-goalkeeper/scripts tests -name '*.mjs' -print0 | xargs -0 -n1 node --check
+node tests/test-goalkeeper-update-checkpoint.mjs
 npx skills add . --list
 ```
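
The recovery order in the new "How It Works" section (checkpoint first, then the context pack, then the event log) can be sketched as two small pure functions. This is a minimal illustration, not the skill's actual helper code: the function names and option names are hypothetical, and the `type` and `text` event fields are an assumption based on the `--type` and `--text` flags shown in the 0.1.0 README. The real `events.jsonl` schema may differ.

```javascript
// Hypothetical sketch: file names come from the README, everything else is assumed.

// Build the ordered list of files a resumed session would read.
function recoveryReadOrder(sessionId, { checkpointTooThin = false, needExactProof = false } = {}) {
  const base = `.goalkeeper/sessions/${sessionId}`;
  const order = [`${base}/checkpoint.md`]; // always read first
  if (checkpointTooThin) order.push(`${base}/context-pack.md`); // deeper reasoning chain
  if (needExactProof) order.push(`${base}/events.jsonl`); // exact evidence
  return order;
}

// Parse JSONL text (one JSON object per line) and keep events of one type.
// Assumes each event has a `type` field; the real schema may differ.
function eventsOfType(jsonlText, type) {
  return jsonlText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines
    .map((line) => JSON.parse(line))
    .filter((event) => event.type === type);
}

// Sample log: the first entry reuses the decision text from the 0.1.0 README,
// the second is a made-up event for illustration only.
const sampleLog = [
  '{"type":"decision","text":"Keep the MVP skill-only; no MCP server or background daemon."}',
  '{"type":"failure","text":"Illustrative failed attempt."}',
].join("\n");

console.log(recoveryReadOrder("2026-05-18-release-hardening", { checkpointTooThin: true }));
console.log(eventsOfType(sampleLog, "decision").map((event) => event.text));
```

Keeping the read order as data makes the "checkpoint first" rule easy to inspect, which matches the README's preference for visible, reviewable state.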

package/README.zh-CN.md CHANGED

@@ -1,196 +1,203 @@
 # Codex Goalkeeper
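-What makes long Codex tasks fail is usually not context compaction itself, but direction drift.
+Long Codex tasks do not usually fail all at once.
-Codex Goalkeeper is a very small skill that helps Codex stay on course during long `/goal` sessions, repeated compaction, handoffs, and multi-day implementation work.
+They slowly drift off course.
-It does not pretend to be a hidden memory engine. It only gives the agent one simple habit:
+The agent still appears confident. The tests still run. The plan still looks reasonable. But after many compactions, handoffs, and resumes, the most important thing can quietly blur:
-1. Write down the current task.
-2. Preserve the reasoning chain that must survive compaction.
-3. Record decisions, failures, and verification results as events.
-4. Read the checkpoint before continuing work.
+> Why are we doing it this way?
-That is all. Boring files. Better continuity.
+Codex Goalkeeper is a very small skill that helps Codex stay on course in long-running `/goal` work across compaction, resumes, and handoffs.
-[English](README.md) | [한국어](README.ko.md) | [日本語](README.ja.md)
-## The Problem
-Long agent sessions usually fail not because some small detail was forgotten, but because the `why are we doing it this way` becomes blurred.
+It does not add a hidden memory engine. It gives the agent a durable working habit:
-- why this direction was chosen
-- what the user explicitly prohibited
-- which attempts have already failed
-- what was actually verified
-- what the next step was supposed to be
+- keep a short checkpoint
+- keep a richer context pack
+- write decisions and verification to the event log
+- after drift-prone boundaries, read the checkpoint before continuing
-After many compactions, a session can still sound confident while having quietly strayed from the main thread.
+Boring files. Better continuity.
-Goalkeeper writes that main thread explicitly into files.
-## What It Creates
+[English](README.md) | [한국어](README.ko.md) | [日本語](README.ja.md)
-Goalkeeper stores state inside the current project:
+## Install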
-```text
-.goalkeeper/
-  active-session
-  sessions/
-    <goal-session-id>/
-      checkpoint.md
-      context-pack.md
-      events.jsonl
+```bash
+npx skills add deltafleet/codex-goalkeeper
 ```
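-The three files have different jobs:
+Requirements: Node.js 18+ and `npx`. During long-goal workflows, Codex uses the skill's bundled Node helper scripts.
-- `checkpoint.md`: short state read first on every recovery
-- `context-pack.md`: medium-density reasoning context that must survive compaction
-- `events.jsonl`: append-only record of decisions, failures, commands, verification, risks, and handoffs
+Then, on a long task, tell Codex:
-The Codex goal states the destination. Goalkeeper preserves the sense of route needed to avoid getting lost.
+> Use codex-goalkeeper for this `/goal`. Keep the goal, constraints, decisions, verification state, failed attempts, and next action recoverable across compaction.
-## Install
+That is all normal use requires. Users do not need to run Goalkeeper's helper scripts manually. Codex runs them as part of the skill workflow.
-With the Skills CLI:
+## The Problem
-```bash
-npx skills add deltafleet/codex-goalkeeper
-```
+For short tasks, compaction is usually just a detail. The agent can mostly recover.
-You can also install from a local checkout:
+But long goals are different.
-```bash
-git clone https://github.com/deltafleet/codex-goalkeeper
-cd codex-goalkeeper
-npx skills add .
-```
+Imagine a real session:
-## Start A Long Goal
+1. You ask Codex to do release hardening.
+2. Codex finds a brittle edge case and chooses a conservative route.
+3. You explicitly reject a shortcut that looks tempting but is dangerous.
+4. Two implementation attempts fail for subtle reasons.
+5. A test finally proves the right route.
+6. The context is compacted.
+7. Later, the agent returns with a neat summary, but the pressure that originally made this route correct has faded.
-Create a Goalkeeper session in the working project:
+That is where drift starts.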
-```bash
-node <skill-path>/src/scripts/goalkeeper-init.mjs \
-  --workspace <workspace> \
-  --session 2026-05-18-release-hardening \
-  --goal "Ship the release without losing constraints after compaction"
-```
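+The failure is not "the model forgot everything." It is harder to handle: it remembers enough to keep working, but not enough to keep going in the same direction.
-This command creates the project-local `.goalkeeper/` directory and the active session pointer.
+You see it in situations like these:
-## Resume Correctly
+- reopening an approach the user already rejected
+- repeating the same attempt because the reason it failed vanished from the summary
+- treating an unverified assumption as settled fact
+- losing the exact next action after a long handoff
+- still remembering the goal but dropping the operating constraints
+- giving a fluent explanation that no longer matches the actual workstream
-At a new turn, after a handoff, or when compaction is suspected, first run:
+Goalkeeper addresses this gap: the goal is still there, but the session's sense of direction has weakened.
-```bash
-node <skill-path>/src/scripts/goalkeeper-turn-start.mjs \
-  --workspace <workspace>
-```
+## What Codex Does
-If the short checkpoint is not enough, read the context pack as well:
+Once the skill is active, Codex maintains a continuity folder inside the project: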
-```bash
-node <skill-path>/src/scripts/goalkeeper-turn-start.mjs \
-  --workspace <workspace> \
-  --context
+```text
+.goalkeeper/
+  active-session
+  sessions/
+    <goal-session-id>/
+      checkpoint.md
+      context-pack.md
+      events.jsonl
 ```
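-The core behavior is simple: the agent reads the current task state before touching any other project files.
+Each file has a different job:
-## Record Only What Really Matters
+- `checkpoint.md`: short state read first on recovery
+- `context-pack.md`: the reasoning chain that is too long for the checkpoint but should still survive compaction
+- `events.jsonl`: record of decisions, failed attempts, command evidence, verification, risks, and handoffs
-Append important events:
+The active Codex goal states the destination. Goalkeeper preserves why this route is still correct.
-```bash
-node <skill-path>/src/scripts/goalkeeper-append-event.mjs \
-  --workspace <workspace> \
-  --type decision \
-  --text "Keep the MVP skill-only; no MCP server or background daemon."
-```
+## How It Works
-Refresh the checkpoint when the work state actually changes:
+Goalkeeper turns long agent work into a simple loop: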
-```bash
-node <skill-path>/src/scripts/goalkeeper-update-checkpoint.mjs \
-  --workspace <workspace> \
-  --goal "Ship the release without losing constraints after compaction" \
-  --status "Docs complete; validation pending." \
-  --next "Run validation and cut v0.1.0."
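+```text
+Long /goal begins
+  -> Codex creates or resumes a Goalkeeper session
+  -> important constraints and decisions are recorded
+  -> failed attempts are kept to avoid repeating mistakes
+  -> verification evidence is recorded when confidence changes
+  -> checkpoint.md is refreshed at meaningful boundaries
+  -> context-pack.md keeps the deeper reasoning chain
+  -> after resume, handoff, or suspected compaction, Codex reads checkpoint.md first
+  -> if the checkpoint is too thin, Codex then reads context-pack.md
+  -> if exact evidence is needed, Codex checks events.jsonl or source files
 ```
-Before relying on it for a long task, check the state with doctor:
+This is not storing a conversation transcript. What it stores is working state.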
-```bash
-node <skill-path>/src/scripts/goalkeeper-doctor.mjs \
-  --workspace <workspace> \
-  --session 2026-05-18-release-hardening \
-  --strict
-```
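+## Why It Is Small On Purpose
-## A Typical Flow
+It would be easy to make this project big:
-```text
-User starts a long /goal
-  -> Goalkeeper creates a project-local session
-  -> Codex works normally
-  -> important decisions and verification are written to events.jsonl
-  -> checkpoint.md is refreshed at meaningful boundaries
-  -> context-pack.md preserves the reasoning chain
-  -> after resume or compaction, Codex reads checkpoint.md first
-  -> if needed, it then reads context-pack.md and event evidence
-```
+- daemon
+- database
+- session rewriter
+- private runtime hook
+- vector memory layer
+- full transcript engine
-Goalkeeper does not eliminate compaction. It makes recovery after compaction depend on explicit state rather than on feel.
+Goalkeeper deliberately avoids all of these.
+It uses files because files are visible, reviewable, portable, and easy for an agent to re-read after compaction. The goal is not to make Codex omniscient. The goal is to make the next turn start from the right state.
 ## What This Is Not
 - Not a Codex plugin.
 - Not an MCP server.
-- Not a 数据库 (database).
+- Not a database.
 - Not a full conversation transcript repository.
-- Not a private runtime hook.
+- Not a private Codex runtime hook.
 - Does not guarantee perfect memory.
 - Does not reduce compaction frequency.
-The simpler a small habit is, the longer it survives in real work. That is why Goalkeeper stays small.
+Goalkeeper improves continuity. It does not pretend to eliminate context limits.
+## What Gets Better
+With Goalkeeper, a resumed session has a better chance of recovering:
+- the user's non-negotiable constraints
+- the current implementation direction
+- why rejected alternatives should stay rejected
+- the tests or commands that changed confidence
+- the real next action
+- unresolved risks that should not be glossed over
+These alone can reduce many of the boring but expensive failures in long agent work.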
+## Repository Layout
+```text
+src/codex-goalkeeper/       # installable skill payload
+  SKILL.md
+  agents/openai.yaml
+  scripts/
+  templates/
+  references/
+tests/                      # maintainer tests
+examples/goalkeeper-session # static example state
+docs/                       # roadmap and release policy
+```
+## Maintainer Validation
-## Validation
+For repository maintainers:
 ```bash
 npm run validate
 ```
-You can also run the checks manually:
+Equivalent manual checks:
 ```bash
-find src/scripts -name '*.mjs' -print0 | xargs -0 -n1 node --check
-node src/scripts/test-goalkeeper-update-checkpoint.mjs
+find src/codex-goalkeeper/scripts tests -name '*.mjs' -print0 | xargs -0 -n1 node --check
+node tests/test-goalkeeper-update-checkpoint.mjs
 npx skills add . --list
 ```
-## Version Management
+## Versioning
-Goalkeeper uses SemVer.
+Goalkeeper uses SemVer.
-- Patch: docs, examples, tests, compatibility bug fixes
-- Minor: new compatible helpers or workflow fields
-- Major: breaking changes to checkpoint, event, or script contracts
+- Patch: docs, examples, tests, and compatible bug fixes
+- Minor: new compatible helpers or workflow fields
+- Major: breaking changes to checkpoint, event, or script contracts
-For release steps, see [docs/RELEASE.md](docs/RELEASE.md).
+See [docs/RELEASE.md](docs/RELEASE.md) for release steps.
-## Contributing
+## Contributing
-Issues and PRs are welcome. But the project's trade-offs are strict:
+Issues and PRs are welcome. The project bias is strict:
-- keep the core workflow small enough
-- do not add hidden runtime dependencies
-- do not promise perfect recovery
-- prefer project-local files over global state
-- prove changes with validation commands
+- keep the core workflow small
+- do not add hidden runtime dependencies
+- do not promise perfect recovery
+- prefer project-local files over global state
+- prove changes with the validation commands above
-Please read [CONTRIBUTING.md](CONTRIBUTING.md), [SECURITY.md](SECURITY.md), and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).
+See [CONTRIBUTING.md](CONTRIBUTING.md), [SECURITY.md](SECURITY.md), and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).
 ## License

package/docs/RELEASE.md CHANGED

@@ -2,11 +2,13 @@
 Codex Goalkeeper is released as a GitHub repository and optional npm package.
-The skill source of truth is the repository root:
+The installable skill source of truth is:
-- `SKILL.md`
-- `agents/openai.yaml`
-- `src/`
+- `src/codex-goalkeeper/SKILL.md`
+- `src/codex-goalkeeper/agents/openai.yaml`
+- `src/codex-goalkeeper/scripts/`
+- `src/codex-goalkeeper/references/`
+- `src/codex-goalkeeper/templates/`
 - `examples/`
 - `docs/`
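
Because the source of truth moved from the repository root to `src/codex-goalkeeper/`, a release check could compare a list of packed files against the entries above. The sketch below is a hypothetical illustration, not part of the project's validation scripts: the function name, the sample file lists, and the individual file names under `templates/`, `references/`, and `examples/` are assumptions.

```javascript
// Entries taken from the updated docs/RELEASE.md list; a trailing slash marks a directory.
const REQUIRED_PAYLOAD = [
  "src/codex-goalkeeper/SKILL.md",
  "src/codex-goalkeeper/agents/openai.yaml",
  "src/codex-goalkeeper/scripts/",
  "src/codex-goalkeeper/references/",
  "src/codex-goalkeeper/templates/",
  "examples/",
  "docs/",
];

// Return the required entries that no packed file satisfies.
// A directory entry is satisfied when at least one packed file lives under it.
function missingPayloadEntries(packedFiles) {
  return REQUIRED_PAYLOAD.filter((entry) =>
    entry.endsWith("/")
      ? !packedFiles.some((file) => file.startsWith(entry))
      : !packedFiles.includes(entry)
  );
}

// Hypothetical file lists for illustration; the real package contents may differ.
const oldLayout = ["SKILL.md", "agents/openai.yaml", "src/scripts/goalkeeper-init.mjs", "docs/RELEASE.md"];
const newLayout = [
  "src/codex-goalkeeper/SKILL.md",
  "src/codex-goalkeeper/agents/openai.yaml",
  "src/codex-goalkeeper/scripts/goalkeeper-init.mjs",
  "src/codex-goalkeeper/references/workflow.md",
  "src/codex-goalkeeper/templates/checkpoint.md",
  "examples/goalkeeper-session/checkpoint.md",
  "docs/RELEASE.md",
];

console.log(missingPayloadEntries(oldLayout)); // lists every entry except "docs/"
console.log(missingPayloadEntries(newLayout)); // empty array
```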