@jaguilar87/gaia 5.0.2 → 5.0.5
- package/.claude-plugin/marketplace.json +2 -2
- package/.claude-plugin/plugin.json +1 -1
- package/ARCHITECTURE.md +0 -1
- package/CHANGELOG.md +110 -0
- package/INSTALL.md +0 -2
- package/README.md +1 -6
- package/bin/README.md +0 -1
- package/bin/cli/_install_helpers.py +1 -1
- package/bin/cli/approvals.py +23 -21
- package/bin/cli/cleanup.py +0 -1
- package/bin/cli/doctor.py +1 -1
- package/bin/cli/memory.py +2 -0
- package/bin/cli/update.py +1 -1
- package/bin/pre-publish-validate.js +48 -5
- package/config/README.md +22 -44
- package/config/surface-routing.json +0 -2
- package/dist/gaia-ops/.claude-plugin/plugin.json +1 -1
- package/dist/gaia-ops/config/README.md +22 -44
- package/dist/gaia-ops/config/surface-routing.json +0 -2
- package/dist/gaia-ops/hooks/modules/agents/contract_validator.py +18 -0
- package/dist/gaia-ops/hooks/modules/agents/handoff_persister.py +214 -2
- package/dist/gaia-ops/hooks/modules/agents/response_contract.py +26 -0
- package/dist/gaia-ops/hooks/modules/agents/transcript_reader.py +15 -0
- package/dist/gaia-ops/hooks/modules/security/__init__.py +0 -5
- package/dist/gaia-ops/hooks/modules/security/approval_grants.py +124 -19
- package/dist/gaia-ops/hooks/modules/security/mutative_verbs.py +99 -7
- package/dist/gaia-ops/hooks/modules/tools/bash_validator.py +127 -24
- package/dist/gaia-ops/hooks/modules/validation/commit_validator.py +90 -55
- package/dist/gaia-ops/skills/README.md +1 -1
- package/dist/gaia-ops/skills/agent-contract-handoff/SKILL.md +3 -0
- package/dist/gaia-ops/skills/agent-response/SKILL.md +4 -2
- package/dist/gaia-ops/skills/gaia-patterns/SKILL.md +1 -1
- package/dist/gaia-ops/skills/gaia-patterns/reference.md +2 -3
- package/dist/gaia-ops/skills/gaia-release/SKILL.md +60 -24
- package/dist/gaia-ops/skills/gaia-release/reference.md +35 -11
- package/dist/gaia-ops/skills/git-conventions/SKILL.md +6 -2
- package/dist/gaia-ops/skills/orchestrator-present-approval/SKILL.md +30 -7
- package/dist/gaia-ops/skills/orchestrator-present-approval/reference.md +32 -15
- package/dist/gaia-ops/skills/readme-writing/SKILL.md +1 -1
- package/dist/gaia-ops/skills/readme-writing/reference.md +0 -1
- package/dist/gaia-ops/skills/security-tiers/SKILL.md +5 -1
- package/dist/gaia-ops/skills/security-tiers/reference.md +3 -1
- package/dist/gaia-ops/skills/subagent-request-approval/SKILL.md +43 -6
- package/dist/gaia-ops/skills/subagent-request-approval/reference.md +66 -16
- package/dist/gaia-ops/tools/context/README.md +1 -1
- package/dist/gaia-ops/tools/gaia_simulator/extractor.py +0 -1
- package/dist/gaia-ops/tools/scan/ui.py +20 -4
- package/dist/gaia-ops/tools/scan/verify.py +3 -3
- package/dist/gaia-ops/tools/validation/README.md +15 -24
- package/dist/gaia-security/.claude-plugin/plugin.json +1 -1
- package/dist/gaia-security/hooks/modules/agents/contract_validator.py +18 -0
- package/dist/gaia-security/hooks/modules/agents/handoff_persister.py +214 -2
- package/dist/gaia-security/hooks/modules/agents/response_contract.py +26 -0
- package/dist/gaia-security/hooks/modules/agents/transcript_reader.py +15 -0
- package/dist/gaia-security/hooks/modules/security/__init__.py +0 -5
- package/dist/gaia-security/hooks/modules/security/approval_grants.py +124 -19
- package/dist/gaia-security/hooks/modules/security/mutative_verbs.py +99 -7
- package/dist/gaia-security/hooks/modules/tools/bash_validator.py +127 -24
- package/dist/gaia-security/hooks/modules/validation/commit_validator.py +90 -55
- package/gaia/state/transitions.py +4 -4
- package/gaia/store/writer.py +56 -0
- package/hooks/modules/README.md +2 -4
- package/hooks/modules/agents/contract_validator.py +18 -0
- package/hooks/modules/agents/handoff_persister.py +214 -2
- package/hooks/modules/agents/response_contract.py +26 -0
- package/hooks/modules/agents/transcript_reader.py +15 -0
- package/hooks/modules/security/__init__.py +0 -5
- package/hooks/modules/security/approval_grants.py +124 -19
- package/hooks/modules/security/mutative_verbs.py +99 -7
- package/hooks/modules/tools/bash_validator.py +127 -24
- package/hooks/modules/validation/commit_validator.py +90 -55
- package/index.js +2 -12
- package/package.json +4 -6
- package/pyproject.toml +3 -3
- package/scripts/bootstrap_database.sh +88 -439
- package/scripts/check_schema_drift.py +208 -0
- package/scripts/migrations/README.md +78 -28
- package/scripts/migrations/schema.checksum +8 -0
- package/scripts/release-prepare.mjs +199 -0
- package/skills/README.md +1 -1
- package/skills/agent-contract-handoff/SKILL.md +3 -0
- package/skills/agent-response/SKILL.md +4 -2
- package/skills/gaia-patterns/SKILL.md +1 -1
- package/skills/gaia-patterns/reference.md +2 -3
- package/skills/gaia-release/SKILL.md +60 -24
- package/skills/gaia-release/reference.md +35 -11
- package/skills/git-conventions/SKILL.md +6 -2
- package/skills/orchestrator-present-approval/SKILL.md +30 -7
- package/skills/orchestrator-present-approval/reference.md +32 -15
- package/skills/readme-writing/SKILL.md +1 -1
- package/skills/readme-writing/reference.md +0 -1
- package/skills/security-tiers/SKILL.md +5 -1
- package/skills/security-tiers/reference.md +3 -1
- package/skills/subagent-request-approval/SKILL.md +43 -6
- package/skills/subagent-request-approval/reference.md +66 -16
- package/tools/context/README.md +1 -1
- package/tools/gaia_simulator/extractor.py +0 -1
- package/tools/scan/ui.py +20 -4
- package/tools/scan/verify.py +3 -3
- package/tools/validation/README.md +15 -24
- package/commands/README.md +0 -64
- package/commands/gaia.md +0 -37
- package/commands/scan-project.md +0 -74
- package/config/crons-schema.md +0 -81
- package/config/git_standards.json +0 -72
- package/dist/gaia-ops/commands/gaia.md +0 -37
- package/dist/gaia-ops/config/crons-schema.md +0 -81
- package/dist/gaia-ops/config/git_standards.json +0 -72
- package/dist/gaia-ops/hooks/modules/security/gitops_validator.py +0 -179
- package/dist/gaia-ops/tools/agentic-loop/decide-status.py +0 -210
- package/dist/gaia-ops/tools/agentic-loop/parse-metric.py +0 -106
- package/dist/gaia-ops/tools/agentic-loop/record-iteration.py +0 -223
- package/dist/gaia-security/hooks/modules/security/gitops_validator.py +0 -179
- package/git-hooks/commit-msg +0 -41
- package/hooks/modules/security/gitops_validator.py +0 -179
- package/scripts/migrations/v10_to_v11.sql +0 -170
- package/scripts/migrations/v10_to_v11_fresh.sql +0 -18
- package/scripts/migrations/v11_to_v12.sql +0 -195
- package/scripts/migrations/v11_to_v12_fresh.sql +0 -19
- package/scripts/migrations/v12_to_v13.sql +0 -48
- package/scripts/migrations/v12_to_v13_fresh.sql +0 -17
- package/scripts/migrations/v13_to_v14.sql +0 -44
- package/scripts/migrations/v13_to_v14_fresh.sql +0 -17
- package/scripts/migrations/v14_to_v15.sql +0 -71
- package/scripts/migrations/v14_to_v15_fresh.sql +0 -19
- package/scripts/migrations/v15_to_v16.sql +0 -57
- package/scripts/migrations/v15_to_v16_fresh.sql +0 -18
- package/scripts/migrations/v16_to_v17.sql +0 -51
- package/scripts/migrations/v16_to_v17_fresh.sql +0 -18
- package/scripts/migrations/v17_to_v18.sql +0 -66
- package/scripts/migrations/v17_to_v18_fresh.sql +0 -24
- package/scripts/migrations/v1_to_v2.sql +0 -97
- package/scripts/migrations/v2_to_v3.sql +0 -68
- package/scripts/migrations/v2_to_v3_merge.sql +0 -69
- package/scripts/migrations/v3_to_v4.sql +0 -67
- package/scripts/migrations/v3_to_v4_fresh.sql +0 -20
- package/scripts/migrations/v4_to_v5.sql +0 -55
- package/scripts/migrations/v4_to_v5_fresh.sql +0 -20
- package/scripts/migrations/v5_to_v6.sql +0 -48
- package/scripts/migrations/v5_to_v6_fresh.sql +0 -17
- package/scripts/migrations/v6_to_v7.sql +0 -26
- package/scripts/migrations/v6_to_v7_fresh.sql +0 -13
- package/scripts/migrations/v7_to_v8.sql +0 -44
- package/scripts/migrations/v7_to_v8_fresh.sql +0 -14
- package/scripts/migrations/v8_to_v9.sql +0 -87
- package/scripts/migrations/v8_to_v9_fresh.sql +0 -15
- package/scripts/migrations/v9_to_v10.sql +0 -109
- package/scripts/migrations/v9_to_v10_episodes_workspace.sql +0 -109
- package/scripts/migrations/v9_to_v10_fresh.sql +0 -18
- package/templates/README.md +0 -70
- package/templates/managed-settings.template.json +0 -43
- package/tools/agentic-loop/decide-status.py +0 -210
- package/tools/agentic-loop/parse-metric.py +0 -106
- package/tools/agentic-loop/record-iteration.py +0 -223
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gaia-release
|
|
3
|
-
description: Use when testing, validating, or publishing Gaia releases --
|
|
3
|
+
description: Use when testing, validating, or publishing Gaia releases -- "install local", "dry-run", "release", live install, sandbox verify, RC, or stable
|
|
4
4
|
metadata:
|
|
5
5
|
user-invocable: false
|
|
6
6
|
type: technique
|
|
@@ -8,24 +8,52 @@ metadata:
|
|
|
8
8
|
|
|
9
9
|
# Gaia Release
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
The norm for getting Gaia onto a machine and into the registry. The user expresses exactly one of three intentions -- **install local**, **dry-run**, or **release** -- and each maps to a complete, automated sequence the orchestrator runs end-to-end. The user never recalls a sub-step and never runs a release script by hand: the script is a tool the flow invokes, not a command the human must remember. This is the lesson of the sagas that shipped broken: a release failed because a version source was bumped one file at a time and a forgotten `pyproject.toml` drifted; another needed a force-push to reconcile a tag. Every one of those was a manual step a human was trusted to remember and didn't. The fix is to norm the sequence so the steps cannot be forgotten -- they are the flow, not a checklist beside it.
|
|
12
12
|
|
|
13
|
-
##
|
|
13
|
+
## The three intentions
|
|
14
14
|
|
|
15
|
-
|
|
16
|
-
|------|---------|-------------|
|
|
17
|
-
| **live self** | `cd /path/to/gaia && npm run gaia:install-local` | Re-install Gaia in the same workspace where Claude Code is running (e.g. `me/`). Validates working-tree changes against your dev environment. |
|
|
18
|
-
| **live external** | `cd /path/to/gaia && bash bin/validate-sandbox.sh --tarball ./jaguilar87-gaia-*.tgz --target local --workspace /path/to/target` | Install the working tree into a different workspace (e.g. `qxo/`). Validates consumer-real conditions without touching your dev environment. |
|
|
19
|
-
| **live fresh** | Add `--fresh` to either of the above | Wipes `node_modules/`, `package.json`, and `package-lock.json` from the target before install. Forces a clean postinstall run. |
|
|
20
|
-
| **dry-run** | `npm run gaia:verify-install:local` | Pack + install into `/tmp/gaia-sandbox-<ts>/` + run the 8-check harness. Validates exactly what `npm publish` would ship. |
|
|
21
|
-
| **RC** | Version bump to `X.Y.Z-rc.N` + GitHub Release | Pipeline publishes to npm with `--tag rc`. Consumers opt-in: `npm install @jaguilar87/gaia@rc`. |
|
|
22
|
-
| **stable** | Version bump to `X.Y.Z` + GitHub Release | Pipeline publishes to npm with `--tag latest`. Default install: `npm install @jaguilar87/gaia`. |
|
|
15
|
+
When the user says one of these, run the *whole* sequence. Do not stop after the first command and wait to be told the next one -- the sequence below IS the intention.
|
|
23
16
|
|
|
24
|
-
|
|
17
|
+
### "install local" -- put the working tree into a real workspace
|
|
25
18
|
|
|
26
|
-
|
|
19
|
+
```
|
|
20
|
+
npm run gaia:install-local
|
|
21
|
+
```
|
|
22
|
+
Then, without being asked:
|
|
23
|
+
1. Run the **Wire-up verification checklist** (below). If any check fails, jump to `reference.md` -> "Diagnostic guide".
|
|
24
|
+
2. **Remind the user to restart `claude`** -- skills, hooks, and agents cache at startup, so a fresh install is invisible until restart.
|
|
27
25
|
|
|
28
|
-
|
|
26
|
+
Installing into a *different* workspace (e.g. `qxo/`) or wiping install metadata first is the same intention with a different target -- see `reference.md` -> "Mode runbooks" for the `--workspace` and `--fresh` forms. Always pass `--workspace` explicitly when invoking from inside the gaia repo (the self-referencing `node_modules/@jaguilar87/gaia/` tricks auto-detect; guarded by `is_gaia_repo_root()` in `validate-sandbox.sh`).
|
|
27
|
+
|
|
28
|
+
### "dry-run" -- prove a clean install works, reproducing CI locally
|
|
29
|
+
|
|
30
|
+
This is not just the sandbox harness -- it is the **local stand-in for CI**, so it must run the same gates CI runs (see the pre-flight principle below). Run, in order:
|
|
31
|
+
1. `npm run pre-publish:validate` -- the version-drift gate (`validate-manifests` in `ci.yml`).
|
|
32
|
+
2. `npm run gaia:verify-install:local` -- packs, installs into `/tmp/gaia-sandbox-<ts>/`, runs the 8-check harness. Validates exactly what `npm publish` would ship.
|
|
33
|
+
3. `npm test` -- the L1 suite (the harness/tests CI runs that reasonably reproduce locally).
|
|
34
|
+
|
|
35
|
+
A green dry-run that skips step 1 is a *subset* of CI, not a stand-in for it -- the gap surfaces only after publish, when the fix costs another release.
|
|
36
|
+
|
|
37
|
+
### "release [version]" -- end-to-end publish, fully automated
|
|
38
|
+
|
|
39
|
+
The orchestrator runs every step below in order. The user supplies (or confirms) the version and approves the T3 operations; the orchestrator does the rest. **The user does not run `release:prepare` -- step (b) invokes it.**
|
|
40
|
+
|
|
41
|
+
| Step | Action | Notes |
|
|
42
|
+
|------|--------|-------|
|
|
43
|
+
| **(a)** | Determine the version | Default to the next **patch**. If the change is major/minor, **confirm with the user** (`NEEDS_INPUT`) before proceeding -- never silently pick major/minor. |
|
|
44
|
+
| **(b)** | `npm run release:prepare <version>` | The atomic core: bumps ALL version sources at once (`package.json`, `pyproject.toml`, `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, `CHANGELOG.md`), runs `build:plugins`, then `pre-publish:validate`. Fails loud on any drift. This is `scripts/release-prepare.mjs` -- invoked by the flow, never by the user. |
|
|
45
|
+
| **(c)** | Pre-flight that reproduces CI | `pre-publish:validate` already ran inside (b). Now run `npm test` (plus any harness that applies) so the local gate matches CI before the tag exists. |
|
|
46
|
+
| **(d)** | Commit | `git add` + `git commit` -- local-only, not T3. |
|
|
47
|
+
| **(e)** | Tag, **force-free** | A *new* tag (`v<version>`); never move an existing one. If the remote diverged, reconcile with **merge, not rebase** (rebase forces a tag move, hard-denied locally). See `reference.md` -> "Reconciling a diverged remote". |
|
|
48
|
+
| **(f)** | Push | `git push` (T3). If diverged, the merge from (e) makes this force-free. |
|
|
49
|
+
| **(g)** | `gh release create v<version>` | Triggers `publish.yml`, which builds, validates, and publishes to npm with the auto-detected tag (`-rc.` -> rc, else latest). Mark RC as pre-release. |
|
|
50
|
+
| **(h)** | Monitor to the outcome | Watch the workflow run to completion, then verify the package landed on npm (`npm run gaia:verify-install:rc` / `:latest`). The release is not done when the tag is pushed -- it is done when npm serves the new version. |
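The default in step (a) can be sketched as follows. This is a minimal illustration, assuming the current version is a stable `X.Y.Z` string; `next_patch` is a hypothetical helper, not something the package is known to ship, and it rejects pre-release or `v`-prefixed input rather than guessing.

```python
import re

# Stable versions only: three dot-separated integers, no prefix, no suffix.
_STABLE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def next_patch(current: str) -> str:
    """Return the next patch version for a stable X.Y.Z string.

    Hypothetical helper for illustration. Anything that is not a plain
    X.Y.Z (for example an RC, or a tag name with a leading "v") raises
    ValueError so the caller can ask the user instead of guessing.
    """
    match = _STABLE.fullmatch(current)
    if match is None:
        raise ValueError(f"not a stable X.Y.Z version: {current!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(next_patch("5.0.4"))
```

A major or minor bump is deliberately not offered here: per the table, that choice goes back to the user.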
|
|
51
|
+
|
|
52
|
+
For the full command forms, the schema-migration lockstep, and the diverged-remote reconciliation, see `reference.md`.
|
|
53
|
+
|
|
54
|
+
## Wire-up verification checklist
|
|
55
|
+
|
|
56
|
+
After any install (install local, dry-run sandbox, RC, stable), the same checklist applies. If any check fails, jump to `reference.md` -> "Diagnostic guide".
|
|
29
57
|
|
|
30
58
|
1. `ls -la <workspace>/.claude/` -- 7 symlinks (agents, hooks, skills, commands, config, templates, tools) + `logs/`, `approvals/`, `plugin-registry.json`, `settings.local.json`.
|
|
31
59
|
2. `cat <workspace>/.claude/plugin-registry.json` -- `installed[].name` includes `gaia-ops` (or `gaia-security`) at the expected version.
|
|
@@ -36,25 +64,33 @@ After any install (live, dry-run, RC, stable), the same checklist applies. If an
|
|
|
36
64
|
|
|
37
65
|
These six checks are not redundant with `gaia doctor`. Steps 1-5 catch what doctor cannot reach when the wire-up is so broken that doctor itself walks up to the user `.claude/` instead of the workspace.
|
|
38
66
|
|
|
39
|
-
## Release Checklist
|
|
40
|
-
|
|
41
|
-
Pre-publish, publish, and post-publish steps -- plus the schema migration protocol when `EXPECTED_SCHEMA_VERSION` changes -- live in `reference.md` -> "Release checklist" and "Schema migration protocol". Both are read on-demand from the SKILL when actually doing a release; they are not in this file because they would dominate the line budget without informing the day-to-day mode decision.
|
|
42
|
-
|
|
43
67
|
## CI/CD
|
|
44
68
|
|
|
45
69
|
| Workflow | File | Triggers |
|
|
46
70
|
|----------|------|----------|
|
|
47
|
-
| CI | `.github/workflows/ci.yml` | Push / PR -- runs pytest (Python 3.
|
|
71
|
+
| CI | `.github/workflows/ci.yml` | Push / PR -- runs pytest (Python 3.11/3.12), Node tests, plugin build verification, and `validate-manifests` |
|
|
48
72
|
| Publish | `.github/workflows/publish.yml` | GitHub Release event -- builds plugins, validates artifacts, auto-detects npm tag from version (`-rc.` -> rc, `-beta.` -> beta, else -> latest), and publishes |
|
|
49
73
|
|
|
50
74
|
`NPM_TOKEN` lives in GitHub Secrets; local `npm publish` bypasses build verification and is not the supported path.
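The tag auto-detection described in the Publish row can be sketched as below. This is an illustration of the stated rule only (`-rc.` -> rc, `-beta.` -> beta, else latest); `detect_npm_tag` is a hypothetical name, and the actual workflow step may implement the rule differently.

```python
def detect_npm_tag(version: str) -> str:
    """Map a version string to an npm dist-tag using the documented rule.

    Hypothetical helper: mirrors the rule in the table, not the
    workflow's real implementation.
    """
    if "-rc." in version:
        return "rc"
    if "-beta." in version:
        return "beta"
    return "latest"


if __name__ == "__main__":
    for v in ("5.0.5", "5.1.0-rc.1", "5.2.0-beta.3"):
        print(v, "->", detect_npm_tag(v))
```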
|
|
51
75
|
|
|
76
|
+
## Principles -- why the sequence is normed, not optional
|
|
77
|
+
|
|
78
|
+
- **The pre-flight reproduces what CI validates, not a subset of it.** When the local check skips a gate CI runs (`pre-publish:validate`), that gate's failures surface only *after* publishing, on the published tarball, where the only remedy is another release. That is exactly how a `pyproject.toml` drift shipped green-local and red-CI. The "dry-run" intention and step (c) of "release" close the gap -- `pre-publish:validate` for drift. See `reference.md` -> "The pre-flight reproduces what CI validates".
|
|
79
|
+
- **Bump every version source in one step, never one at a time.** `pre-publish:validate` requires `package.json`, `pyproject.toml`, `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, and the `CHANGELOG.md` top header to agree. A partial bump leaves the tree in a state the validator rejects and lets a stale source ship. `release:prepare` writes all of them from one target version, so a hand-desync is impossible. See `scripts/release-prepare.mjs`.
|
|
80
|
+
- **Tag force-free; reconcile with merge, never rebase.** `publish.yml` commits built artifacts back to `main`, so the remote leads after every release. Rebasing rewrites hashes and forces a tag move (`git tag -f` / `--force`), hard-denied by local hooks (`git_destructive` in `blocked_commands.py`, exit 2, not approvable). Merge preserves hashes and tags; a new release gets a *new* tag, never a moved one. See `reference.md` -> "Reconciling a diverged remote".
|
|
81
|
+
- **A release ends at npm, not at the tag.** Pushing the tag only starts `publish.yml`. The intention is not satisfied until the workflow reaches its outcome and npm serves the new version -- step (h) is part of the sequence, not a follow-up.
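The agreement rule behind the second principle can be sketched as a comparison of every version source against one target. This is a simplified illustration, assuming the versions have already been read out of each file; `find_version_drift` is a hypothetical helper and says nothing about how `pre-publish:validate` is actually written.

```python
def find_version_drift(sources: dict[str, str], target: str) -> dict[str, str]:
    """Return the sources whose version differs from the target version.

    `sources` maps a file name to the version string found in it.
    An empty result means every source agrees with the target.
    """
    return {name: found for name, found in sources.items() if found != target}


if __name__ == "__main__":
    # Example data only: a hand-bump that forgot pyproject.toml.
    found = {
        "package.json": "5.0.5",
        "pyproject.toml": "5.0.4",
        ".claude-plugin/plugin.json": "5.0.5",
        ".claude-plugin/marketplace.json": "5.0.5",
        "CHANGELOG.md": "5.0.5",
    }
    drift = find_version_drift(found, "5.0.5")
    if drift:
        raise SystemExit(f"version drift: {drift}")
```

Writing every source from one target version, as `release:prepare` does, is what makes this check pass by construction.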
|
|
82
|
+
|
|
52
83
|
## Anti-Patterns
|
|
53
84
|
|
|
54
|
-
- **
|
|
85
|
+
- **Stopping after the first command of an intention** -- "install local" is not just `gaia:install-local`; "release" is not just `release:prepare`. Each intention is the *whole* sequence. Running one command and waiting to be told the next reintroduces the forgettable manual step the norm exists to remove.
|
|
86
|
+
- **Asking the user to run `release:prepare`** -- it is a tool the "release" flow invokes at step (b), not a command the human runs. Surfacing it as a manual step is the same failure mode (a step someone must remember) wearing a new script.
|
|
87
|
+
- **Pre-flight that is a subset of CI** -- skipping `pre-publish:validate` locally means the version drift surfaces after publish. Reproduce CI; do not approximate it.
|
|
88
|
+
- **Bumping version sources one at a time** -- desyncs a source by hand; `pre-publish:validate` rejects the tree and a forgotten file ships if the check is skipped. Always go through `release:prepare`.
|
|
89
|
+
- **Rebase to reconcile a diverged remote** -- forces a tag move, hard-denied locally. Merge instead.
|
|
90
|
+
- **Live-only testing** -- live install runs against accumulated workspace state; only dry-run proves a clean install works.
|
|
55
91
|
- **Local npm publish** -- bypasses the pipeline's build verification step.
|
|
56
92
|
- **Single-mode testing** -- `ops` and `security` load different skill sets and hook configurations; one can break while the other passes.
|
|
57
|
-
- **Stale dist/** -- forgetting `npm run build:plugins` before pack means validating old code.
|
|
58
|
-
- **Missing restart** -- the process caches skills, hooks, and agents at startup;
|
|
59
|
-
- **Ignoring `~/.gaia/last-install-error.json`** -- when postinstall fails silently, this is the marker
|
|
60
|
-
- **Relying on auto-detect when cwd is inside the gaia repo** -- the
|
|
93
|
+
- **Stale dist/** -- forgetting `npm run build:plugins` before pack means validating old code. `release:prepare` and `build:plugins` regenerate it; dry-run packs fresh.
|
|
94
|
+
- **Missing restart** -- the process caches skills, hooks, and agents at startup; installs and mode switches require restarting `claude`.
|
|
95
|
+
- **Ignoring `~/.gaia/last-install-error.json`** -- when postinstall fails silently, this is the marker. Treat its presence as a hard failure regardless of what `gaia doctor` reports.
|
|
96
|
+
- **Relying on auto-detect when cwd is inside the gaia repo** -- the self-referencing `node_modules/@jaguilar87/gaia/` entry tricks the workspace detector. Pass `--workspace /home/jorge/ws/me` explicitly; verify with `readlink /home/jorge/ws/me/.claude/hooks`.
|
|
@@ -44,7 +44,7 @@ Append `--fresh` to either form. The harness will delete `node_modules/`, `packa
|
|
|
44
44
|
**What postinstall does:**
|
|
45
45
|
1. Ships `scripts/` (bootstrap_database.sh) -- failed silently in pre-rc.4 builds; verified in `npm pack --dry-run`.
|
|
46
46
|
2. Creates `.claude/` if missing.
|
|
47
|
-
3. Runs `bootstrap_database.sh` -- seeds the schema (
|
|
47
|
+
3. Runs `bootstrap_database.sh` -- seeds the schema (current floor v18), agent rows, and `schema_version`. Fails loud on any error (writes `~/.gaia/last-install-error.json` and exits non-zero).
|
|
48
48
|
4. Merges hooks into `settings.local.json` via the consolidated `merge_hooks` step.
|
|
49
49
|
5. Creates 7 symlinks under `.claude/` to `node_modules/@jaguilar87/gaia/<dir>/`.
|
|
50
50
|
6. Writes `plugin-registry.json` with `installed[].name == "gaia-ops"`.
|
|
@@ -87,20 +87,32 @@ A change that works in one mode can break the other because they load different
|
|
|
87
87
|
|
|
88
88
|
### RC and stable (pipeline)
|
|
89
89
|
|
|
90
|
-
Both modes share the same pipeline. The pipeline auto-detects the npm tag from the version string.
|
|
90
|
+
Both modes share the same pipeline. The pipeline auto-detects the npm tag from the version string. These steps are the expansion of the "release" intention in `SKILL.md`; the orchestrator runs them, the user supplies/confirms the version.
|
|
91
91
|
|
|
92
92
|
1. Dry-run must pass locally first.
|
|
93
|
-
2.
|
|
94
|
-
-
|
|
95
|
-
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
5.
|
|
93
|
+
2. **`npm run release:prepare <version>`** -- the atomic bump. This is `scripts/release-prepare.mjs`, invoked by the flow, **never run by the user by hand**. In one command it:
|
|
94
|
+
- writes `<version>` to ALL sources at once -- `package.json`, `pyproject.toml`, `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json` (every plugin entry), and the `CHANGELOG.md` top header (inserts a dated stub above the current top if absent -- edit its body before release);
|
|
95
|
+
- runs `npm run build:plugins` to regenerate `dist/` (including the per-plugin manifests that carry the version);
|
|
96
|
+
- runs `npm run pre-publish:validate` and fails loud on any drift.
|
|
97
|
+
|
|
98
|
+
This replaces hand-bumping one file at a time. `pre-publish:validate` fails the release unless every version source agrees. A hand-bump leaves two real escapes: a `pyproject.toml` left behind on a prior version (caught only by `pre-publish:validate`) and a `marketplace.json` that still advertises the old tag. `release:prepare` makes the desync impossible because all sources are written from one target version. Pass a bare semver as `<version>`: `5.0.5` for stable, `5.1.0-rc.1` for RC (no leading `v` -- the tag adds it). The script is idempotent: re-running with the same version is a no-op bump that re-validates.
|
|
99
|
+
3. Pre-flight that reproduces CI: run `npm test`. `pre-publish:validate` already ran inside `release:prepare` in step 2.
|
|
100
|
+
4. Commit (`git add` + `git commit` -- local-only, not T3). **If the remote diverged, reconcile with MERGE, never rebase** (see "Reconciling a diverged remote" below).
|
|
101
|
+
5. Tag (force-free -- a *new* `v<version>`, never moved) + push (`git push`, T3). The merge in step 4 keeps the push force-free.
|
|
99
102
|
6. Create a GitHub Release:
|
|
100
103
|
- Tag: the version from `package.json` (e.g., `v5.0.0-rc.4` or `v5.3.0`).
|
|
101
104
|
- Title: the version.
|
|
102
105
|
- Mark RC releases as pre-release.
|
|
103
106
|
7. `publish.yml` triggers automatically and publishes with `--tag <auto-detected>`.
|
|
107
|
+
8. Monitor the workflow run to its outcome, then verify npm serves the new version (`npm run gaia:verify-install:rc` / `:latest`). The release is done at npm, not at the tag.
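The version-to-tag convention used in the steps above (bare semver in, `v<version>` tag out, RC marked as pre-release) can be sketched like this. It is an illustration under assumptions: `tag_for` and `is_prerelease` are hypothetical names, and the pattern accepts only a simple `X.Y.Z` with an optional `-suffix`, which is narrower than full semver.

```python
import re

# Simplified semver: X.Y.Z with an optional pre-release suffix such as -rc.1.
_BARE_SEMVER = re.compile(r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?")


def tag_for(version: str) -> str:
    """Return the git tag name for a bare version; reject a leading 'v'."""
    if not _BARE_SEMVER.fullmatch(version):
        raise ValueError(f"expected a bare semver like 5.0.5, got {version!r}")
    return f"v{version}"


def is_prerelease(version: str) -> bool:
    """True when the version carries a pre-release suffix (e.g. -rc.1)."""
    return "-" in version


if __name__ == "__main__":
    for v in ("5.0.5", "5.1.0-rc.1"):
        print(tag_for(v), "pre-release" if is_prerelease(v) else "stable")
```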
|
|
108
|
+
|
|
109
|
+
### Reconciling a diverged remote -- merge, never rebase; never move a tag
|
|
110
|
+
|
|
111
|
+
`publish.yml` commits built artifacts back to `main` and pushes (the "Commit built plugins" step), so after a release the remote `main` is *ahead* of your local. When you next go to release and find the remote diverged, the reconciliation choice is forced by local policy:
|
|
112
|
+
|
|
113
|
+
- **Reconcile with merge, not rebase.** Rebase rewrites your local commit hashes. If a tag already pointed at one of those commits, you would have to re-point it -- which means `git tag -f` or a force-push of the tag. Both match the `git_destructive` pattern in `hooks/modules/security/blocked_commands.py` and are **hard-denied locally** (exit 2, not approvable) -- there is no `approval_id` that unblocks them. Merge preserves the existing hashes, so existing tags stay valid and no force is ever needed.
|
|
114
|
+
- **Tags are create-only -- never move one.** A published tag is immutable history; a new release gets a *new* tag (`-rc.N+1`, next patch/minor), it does not re-point an old one. Moving a tag requires the same force path that local hooks deny.
|
|
115
|
+
- **The force-deny is a local hooks policy, not a CI one.** `publish.yml` itself runs `git tag -f` and `git push --force` for the tag after committing `dist/` -- that is the pipeline operating under its own permissions, outside the local hook layer. Do not read the pipeline's force-push as license to force locally; the local deny stands regardless of what CI does.
|
|
104
116
|
|
|
105
117
|
**Verify from npm** (registry round-trip):
|
|
106
118
|
- RC: `npm run gaia:verify-install:rc`
|
|
@@ -134,17 +146,29 @@ When you bump `EXPECTED_SCHEMA_VERSION` in `bin/cli/doctor.py`, the four steps b
|
|
|
134
146
|
3. Add migration SQL (`UPDATE` / `ALTER TABLE`) when existing DBs need to upgrade in place; otherwise old workspaces stay below the expected version and `gaia doctor` fails.
|
|
135
147
|
4. Run `pytest tests/cli/test_schema_version_lockstep.py` -- it cross-references the constant, the bootstrap insert, and the migration SQL to confirm they all agree.
|
|
136
148
|
|
|
149
|
+
### Build/pre-publish Schema-Drift Guard
|
|
150
|
+
|
|
151
|
+
`bin/pre-publish-validate.js` Step 5c runs `scripts/check_schema_drift.py`, which sha256-fingerprints `gaia/store/schema.sql` and compares it against `scripts/migrations/schema.checksum` (pinned to `EXPECTED_SCHEMA_VERSION`). If the schema has changed but the version was not bumped and no migration file added, the guard fails the build.
|
|
152
|
+
|
|
153
|
+
**Consequence:** if you edit `schema.sql` you MUST either (a) bump `EXPECTED_SCHEMA_VERSION` + add the migration file (following the lockstep above), OR (b) re-pin the checksum with `python3 scripts/check_schema_drift.py --record` (the escape hatch for pure-comment or non-semantic edits). Without one of these, `npm run pre-publish:validate` -- and therefore `release:prepare` -- will FAIL.
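The fingerprint comparison at the heart of the guard can be sketched as follows. This is a minimal illustration of the sha256-and-compare idea only: the function names are hypothetical, and the pinned digest is passed in as a string because the on-disk layout of `schema.checksum` is not described here.

```python
import hashlib
from pathlib import Path


def fingerprint(data: bytes) -> str:
    """Return the sha256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def schema_has_drifted(schema_path: Path, pinned_digest: str) -> bool:
    """True when the schema file no longer matches the pinned digest.

    Hypothetical helper: the real guard also ties the pin to
    EXPECTED_SCHEMA_VERSION, which this sketch does not model.
    """
    current = fingerprint(schema_path.read_bytes())
    return current != pinned_digest.strip().lower()
```

Any byte-level change, including a comment edit, changes the digest, which is why the `--record` escape hatch exists for non-semantic edits.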
|
|
154
|
+
|
|
137
155
|
## Release Checklist
|
|
138
156
|
|
|
157
|
+
### The pre-flight reproduces what CI validates, not a subset of it
|
|
158
|
+
|
|
159
|
+
A green pre-flight only protects the release if it runs the same gates CI runs. When the local check is a *subset* of CI, the gaps CI covers are discovered after publishing -- on the published tarball, where the only remedy is another release. `npm run gaia:verify-install:local` packs and installs into a sandbox, but it does **not** run `pre-publish:validate`. CI (`.github/workflows/ci.yml`) runs it separately. A real failure escaped exactly through that gap: a `pyproject.toml` version drift that only `pre-publish:validate` catches, which was green locally and red in CI *after* the tag was pushed.
|
|
160
|
+
|
|
161
|
+
So the pre-flight must close that gap before any tag or push:
|
|
162
|
+
|
|
139
163
|
**Pre-publish:**
|
|
140
164
|
- `pytest tests/` green (or `npm test` for the L1 subset).
|
|
165
|
+
- **`npm run pre-publish:validate` green locally** -- this is the version-drift gate (`validate-manifests` job in `ci.yml`). Run it before tag/push, not only in CI. It is what catches a `pyproject.toml` / `package.json` / `plugin.json` / `marketplace.json` desync before it ships.
|
|
141
166
|
- `npm pack --dry-run | grep scripts/` confirms `scripts/bootstrap_database.sh` is included in the tarball.
|
|
142
167
|
- `bash bin/validate-sandbox.sh --tarball ./jaguilar87-gaia-*.tgz --target sandbox --fresh` green (or `npm run gaia:verify-install:local`).
|
|
143
168
|
- Optional smoke: `npm run gaia:install-local -- --workspace /tmp/test-install --fresh`.
|
|
144
169
|
|
|
145
170
|
**Publish:**
|
|
146
|
-
-
|
|
147
|
-
- `npm run build:plugins` regenerates `dist/`.
|
|
171
|
+
- Run `npm run release:prepare <version>` -- atomically bumps all version sources (`package.json`, `pyproject.toml`, `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, and the `CHANGELOG.md` top header), regenerates `dist/` via `build:plugins`, and runs `pre-publish:validate` (including the schema-drift guard). Nothing else should write a version.
|
|
148
172
|
- Commit + push.
|
|
149
173
|
- Create GitHub Release with the version tag.
|
|
150
174
|
- Pipeline publishes (`publish.yml` triggers on Release event):
|
|
@@ -165,7 +189,7 @@ The workflow at `.github/workflows/publish.yml` runs on every GitHub Release eve
|
|
|
165
189
|
- Installs deps with `npm ci`.
|
|
166
190
|
- Builds plugins with `npm run build:plugins`.
|
|
167
191
|
- Verifies all expected artifacts in `dist/`.
|
|
168
|
-
- Commits built artifacts back if changed.
|
|
192
|
+
- Commits built artifacts back if changed (commit message carries `[skip ci]` so the dist commit-back does not re-trigger CI).
|
|
169
193
|
- Runs `npm run pre-publish:validate`.
|
|
170
194
|
- Auto-detects npm tag from version string (see "Publish" above).
|
|
171
195
|
- Publishes with `npm publish --access public --tag <detected>`.
|
|
@@ -43,5 +43,9 @@ requires explicit user instruction.
|
|
|
43
43
|
|
|
44
44
|
## Hook Enforcement
|
|
45
45
|
|
|
46
|
-
The `commit_validator.py` hook validates against
|
|
47
|
-
|
|
46
|
+
The `commit_validator.py` hook validates against standards inlined as
|
|
47
|
+
module-level constants in that file (`TYPE_ALLOWED`, `SUBJECT_MAX_LENGTH`,
|
|
48
|
+
`SUBJECT_RULES`, `BODY_MAX_LINE_LENGTH`) -- it covers the conventional-commits
|
|
49
|
+
format, subject, and body rules. Forbidden-footer detection lives separately
|
|
50
|
+
in `bash_validator` (hardcoded there). Format violations block the commit.
|
|
51
|
+
Body line length triggers warnings only.
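
The split between blocking format violations and warn-only body length can be sketched like this. The constant names come from the text above, but their values, the header regex, and the function `validate_commit_message` are assumptions for illustration; `SUBJECT_RULES` is left out because its contents are not shown here.

```python
import re

# Assumed values for illustration; the real constants live in commit_validator.py.
TYPE_ALLOWED = ("feat", "fix", "docs", "refactor", "test", "chore")
SUBJECT_MAX_LENGTH = 72
BODY_MAX_LINE_LENGTH = 72

# Conventional-commits header: type, optional (scope), optional "!", then ": subject".
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<subject>.+)$")


def validate_commit_message(message: str) -> tuple[list[str], list[str]]:
    """Return (errors, warnings): errors block the commit, warnings do not."""
    errors: list[str] = []
    warnings: list[str] = []
    lines = message.splitlines()
    header = lines[0] if lines else ""
    match = HEADER_RE.match(header)
    if match is None:
        errors.append("header is not in 'type(scope): subject' form")
    else:
        if match["type"] not in TYPE_ALLOWED:
            errors.append(f"type '{match['type']}' is not allowed")
        if len(match["subject"]) > SUBJECT_MAX_LENGTH:
            errors.append("subject exceeds SUBJECT_MAX_LENGTH")
    # Body starts after the blank separator line; long lines only warn.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > BODY_MAX_LINE_LENGTH:
            warnings.append(f"line {number} exceeds BODY_MAX_LINE_LENGTH")
    return errors, warnings
```

A caller would refuse the commit when `errors` is non-empty and merely surface `warnings`.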
|
|
@@ -30,9 +30,16 @@ without the data needed to decide. The job is **verbatim relay, not
|
|
|
30
30
|
re-authoring**: rewriting any of the 7 sealed fields breaks the fingerprint and
|
|
31
31
|
`verify_fingerprint` (`gaia/approvals/chain.py`) raises `ChainTamperError`.
|
|
32
32
|
|
|
33
|
-
## Step 0 --
|
|
33
|
+
## Step 0 -- Verify the approval against the DB (mandatory before SHOWN)
|
|
34
34
|
|
|
35
|
-
|
|
35
|
+
A subagent's reported `approval_id` is an unverified claim, not a fact. The agent runs in its own context and can relay an id that is stale, from another session, or simply wrong -- and a stale id presented as a fresh block walks the user into consenting to nothing real (or to a grant that no longer exists). The DB is the source of truth; the agent's report is a pointer into it that you must resolve, never the authority itself.
|
|
36
|
+
|
|
37
|
+
So before AskUserQuestion, two checks against the DB, in order:
|
|
38
|
+
|
|
39
|
+
1. **The approval exists, is fresh, and is from the current session.** Query `gaia approvals pending --session "$CLAUDE_SESSION_ID"` (or `--json` for parsing). The reported `approval_id` MUST appear in that result. If it appears only under `--all-sessions` but not the current session, it is leakage from another session (a test session such as `e2e-sim`, a prior run) -- **do not present**. If it does not appear at all, it does not exist or was already consumed/rejected -- **do not present**. Freshness is the `created_at` of the pending row plus its presence as still-`pending`; an id the agent reports that is not currently pending in *this* session is not a fresh block, whatever the agent says.
|
|
40
|
+
2. **The payload is untampered.** Call `verify_fingerprint(approval_id, payload_json, con) -> bool` from `gaia/approvals/chain.py`. It raises `ChainTamperError` if the payload was modified between subagent emission and your relay (security boundary, do not present), and `ValueError` if no REQUESTED event exists for this `approval_id`. Either case: **do not present**, report the failure, stop.
|
|
41
|
+
|
|
42
|
+
**For a `command_set` (plan-first batch) the agent does not know the id at all.** The hook mints the `approval_id` at SubagentStop (`_intake_command_set_pending` -- see Rule 3); the subagent emits the `command_set` with **no** `approval_id`. So you do not have an agent-reported id to trust even if you wanted to -- you ALWAYS recover the freshly minted id from `gaia approvals pending` for the current session. This is the general shape made unavoidable: the DB mints, the orchestrator recovers, the agent never owns the id.
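
Check 1 above can be sketched against an in-memory list of rows. This is only an illustration of the decision, assuming a simplified `PendingRow` shape and a hypothetical `resolve_reported_id` helper; the real lookup goes through `gaia approvals pending`, and check 2 (`verify_fingerprint`) is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingRow:
    """Simplified stand-in for a row in the approvals DB (assumed shape)."""
    approval_id: str
    session_id: str
    status: str


def resolve_reported_id(reported_id: str, session_id: str, all_rows: list[PendingRow]) -> tuple[bool, str]:
    """Decide whether an agent-reported id may be presented to the user."""
    pending = [r for r in all_rows if r.approval_id == reported_id and r.status == "pending"]
    if not pending:
        # Absent entirely, or already consumed/rejected: do not present.
        return False, "not pending: missing, consumed, or rejected"
    if not any(r.session_id == session_id for r in pending):
        # Visible only across sessions: leakage from another session.
        return False, "pending only in another session"
    return True, "pending in the current session"
```

Only a `True` result moves on to the fingerprint check; either failure reason means do not present.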
|
|
36
43
|
|
|
37
44
|
## Mandatory presentation -- 5 labeled fields + nonce-suffixed label
|
|
38
45
|
|
|
@@ -71,12 +78,27 @@ Fields above are extracted from the DB-stored canonical payload (`payload_json`
|
|
|
71
78
|
grant consumed by the first retry (`consume_db_semantic_grant` in
|
|
72
79
|
`gaia/store/writer.py`). A second invocation is a new APPROVAL_REQUEST.
|
|
73
80
|
|
|
74
|
-
3. **
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
81
|
+
3. **Batch grant is `COMMAND_SET` -- one consent, N commands.** Legacy
|
|
82
|
+
`verb_family` was removed; its replacement, `COMMAND_SET`, is now wired
|
|
83
|
+
end-to-end (intake, activation, consume). When a subagent emits a plan-first
|
|
84
|
+
`APPROVAL_REQUEST` carrying a `command_set` of >= 2 `{command, rationale}`
|
|
85
|
+
items and **no** `approval_id`, the SubagentStop processor
|
|
86
|
+
(`handoff_persister._intake_command_set_pending`) mints ONE pending
|
|
87
|
+
`COMMAND_SET` with one `approval_id`. You present that single approval: list
|
|
88
|
+
**all N commands** in the question body, but use **one** Approve label with
|
|
89
|
+
**one** `[P-{nonce8}]` suffix -- one consent covers the whole batch. On
|
|
90
|
+
approval, `activate_db_pending_by_prefix` Step 3b creates a single
|
|
91
|
+
`COMMAND_SET` grant (60-min TTL); each command is consumed byte-for-byte on
|
|
92
|
+
its own retry. `batch_scope` is still ignored (the signal is `command_set`).
|
|
78
93
|
See `reference.md` -> "On batch intents".
|
|
79
94
|
|
|
95
|
+
You present the batch the subagent chose to send; you do not steer it toward
|
|
96
|
+
batching. Whether grouping is warranted is the subagent's judgment (known
|
|
97
|
+
batch, >= 2, friction reduced -- see `subagent-request-approval`). A singular
|
|
98
|
+
approval arriving where you imagined a batch is not a defect to correct: the
|
|
99
|
+
default is just-in-time, and a batch you would have manufactured asks the
|
|
100
|
+
user to consent to commands that may never run.
|
|
101
|
+
|
|
80
102
|
4. **Re-dispatch, do not resume.** `mode` does not survive a SendMessage resume:
|
|
81
103
|
the resume runs in `default` and re-blocks the next protected operation even
|
|
82
104
|
after the Gaia grant activated. Prefer a fresh re-dispatch with the same
|
|
@@ -97,5 +119,6 @@ wording, see `reference.md` -> "GOOD vs BAD Examples", "Option Label Patterns",
|
|
|
97
119
|
| "I'll skip the [P-...] suffix, it's cosmetic" | The hook extracts the nonce from the label to find the right pending row; without it, targeted activation fails and no grant is created. |
|
|
98
120
|
| "Similar command, slightly different path -- I'll reuse / wrap it" | Grants match the statement signature byte-for-byte. Any wrapper, redirect, flag, or path drift is a different signature and a fresh re-block. |
|
|
99
121
|
| "The same command emitted a new approval_id" | Grants are single-use and consumed on the first retry. A second run is a new APPROVAL_REQUEST -- approve again. |
|
|
100
|
-
| "I'll set batch_scope to approve many at once" |
|
|
122
|
+
| "I'll set batch_scope to approve many at once" | `batch_scope` is ignored -- but a real batch path exists: a plan-first `command_set` (>= 2 items, no `approval_id`) is taken in as ONE pending `COMMAND_SET`. Present that single approval (N commands shown, one `[P-...]` nonce, one consent), not N separate approvals. |
|
|
101
123
|
| "I can paraphrase a field before relaying" | The fingerprint covers all 7 sealed fields; any modification raises `ChainTamperError` in Step 0 and the presentation is refused. |
|
|
124
|
+
| **"The agent reported an `approval_id`, so it's a real fresh block"** -- trusting a nonce relayed by the subagent | The agent's reported id is an unverified pointer, not a fact. It can be stale or belong to another session -- subagents have presented a STALE nonce from a test session (`e2e-sim`) as if it were a fresh block. Resolve every reported id against `gaia approvals pending --session "$CLAUDE_SESSION_ID"` (Step 0): it must be currently pending in *this* session. Visible only under `--all-sessions`, or absent entirely, means do not present. For `command_set` the hook mints the id and the agent never has one -- you always recover it from the DB. |
|
|
@@ -107,32 +107,49 @@ contain `[P-<hex>]`. Reject labels never carry a nonce. The captured hex is the
|
|
|
107
107
|
`get_pending(all_sessions=True)` and selects the one whose `id` starts with
|
|
108
108
|
`P-{prefix}`.
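
The label-to-row lookup described above can be sketched as follows. The regex, the example label text, and the refusal to pick among ambiguous matches are assumptions for illustration, not the hook's actual code.

```python
import re
from typing import Optional

# Matches a "[P-<hex>]" suffix anywhere in an option label (assumed pattern).
NONCE_RE = re.compile(r"\[P-([0-9a-f]+)\]")


def extract_nonce(label: str) -> Optional[str]:
    """Return the hex prefix carried by an Approve label, or None if absent."""
    match = NONCE_RE.search(label)
    return match.group(1) if match else None


def select_pending(prefix: str, pending_ids: list[str]) -> Optional[str]:
    """Pick the pending id that starts with P-{prefix}; None if zero or several match."""
    matches = [pid for pid in pending_ids if pid.startswith(f"P-{prefix}")]
    return matches[0] if len(matches) == 1 else None
```

A label without the suffix yields `None`, which is why dropping the nonce leaves nothing to activate.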
|
|
109
109
|
|
|
110
|
-
## On batch intents --
|
|
110
|
+
## On batch intents -- the COMMAND_SET grant (one consent, N commands)
|
|
111
111
|
|
|
112
112
|
The old `verb_family` design (one approval covering many commands of the same
|
|
113
113
|
`base_cmd + verb`) **was removed**. The module docstring in
|
|
114
114
|
`hooks/modules/security/approval_grants.py` is explicit: "The legacy verb_family
|
|
115
115
|
path has been removed."
|
|
116
116
|
|
|
117
|
-
|
|
117
|
+
Its replacement is the `COMMAND_SET` grant: an explicit list of
|
|
118
118
|
`{command, rationale}` items, each matched **byte-for-byte** (D10: no whitespace
|
|
119
119
|
normalization, no quote canonicalization, no shell expansion) and consumed
|
|
120
120
|
individually (`create_command_set_grant` and `match_command_set_grant` in
|
|
121
121
|
`approval_grants.py`).
|
|
122
122
|
|
|
123
|
-
**Current state of the code
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
`
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
123
|
+
**Current state of the code: all three sides are wired -- intake, activation,
|
|
124
|
+
consume.** It is a **plan-first** flow: the subagent declares the batch up-front
|
|
125
|
+
by emitting an `APPROVAL_REQUEST` whose `approval_request` carries a
|
|
126
|
+
`command_set` list and **no** `approval_id`.
|
|
127
|
+
|
|
128
|
+
- **Intake.** The SubagentStop processor
|
|
129
|
+
`hooks/modules/agents/handoff_persister.py` ->
|
|
130
|
+
`_intake_command_set_pending()` reads the `command_set`; when it holds **>= 2**
|
|
131
|
+
items it calls `gaia.approvals.store.insert_requested()` with a payload that
|
|
132
|
+
contains the `command_set` key, minting **exactly ONE** pending `COMMAND_SET`
|
|
133
|
+
approval with one `approval_id`. A set of `<= 1` item is declined (no
|
|
134
|
+
COMMAND_SET is minted for one command).
|
|
135
|
+
- **Activation.** When the user approves, `activate_db_pending_by_prefix()`
|
|
136
|
+
(`hooks/modules/security/approval_grants.py`) reads `payload["command_set"]`,
|
|
137
|
+
and, because it holds more than one item, branches at **Step 3b** into
|
|
138
|
+
`create_command_set_grant()`, inserting ONE `COMMAND_SET` grant row (status
|
|
139
|
+
`PENDING`, `command_set_json` holding the whole set, 60-min TTL via
|
|
140
|
+
`DEFAULT_COMMAND_SET_TTL_MINUTES`) instead of a singular
|
|
141
|
+
`SCOPE_SEMANTIC_SIGNATURE` grant.
|
|
142
|
+
- **Consume.** On each retry, `bash_validator` calls `match_command_set_grant()`
|
|
143
|
+
(byte-for-byte index match), then `mark_command_set_item_consumed()`; a
|
|
144
|
+
consumed index never matches again (replay protection), and when every index
|
|
145
|
+
is consumed the grant flips to `CONSUMED`.
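
The consume side can be sketched with a small in-memory model. The class and function names here are hypothetical stand-ins, not `match_command_set_grant()` or `mark_command_set_item_consumed()` themselves, and the 60-minute TTL is deliberately not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CommandSetGrant:
    """Simplified stand-in for one COMMAND_SET grant row (assumed shape)."""
    commands: list[str]
    consumed_indexes: set[int] = field(default_factory=set)
    status: str = "PENDING"


def match_index(grant: CommandSetGrant, command: str) -> Optional[int]:
    """Find an unconsumed item equal to `command` byte-for-byte (no normalization)."""
    for index, approved in enumerate(grant.commands):
        if index not in grant.consumed_indexes and approved == command:
            return index
    return None


def consume(grant: CommandSetGrant, command: str) -> bool:
    """Consume one matching item; flip the grant to CONSUMED when none remain."""
    index = match_index(grant, command)
    if index is None:
        return False
    grant.consumed_indexes.add(index)
    if len(grant.consumed_indexes) == len(grant.commands):
        grant.status = "CONSUMED"
    return True
```

Because the comparison is plain string equality, an extra space or an added flag matches nothing, and a second run of an already consumed command is refused.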
|
|
146
|
+
|
|
147
|
+
**Practical consequence:** a `batch_scope` field still does nothing -- the signal
|
|
148
|
+
is `command_set`. To approve a sweep of N related commands under one consent,
|
|
149
|
+
present the single `COMMAND_SET` approval the intake minted: show **all N
|
|
150
|
+
commands** in the question body, with **one** Approve label carrying **one**
|
|
151
|
+
`[P-{nonce8}]` suffix. The user gives one consent; each command then runs on its
|
|
152
|
+
own retry within the 60-minute window. You do NOT issue N separate approvals.
|
|
136
153
|
|
|
137
154
|
## Grant Activation Mechanics
|
|
138
155
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: readme-writing
|
|
3
|
-
description: Use when writing or updating a README for a Gaia component folder (agents/, skills/, hooks/, commands/, config/, bin/, tests/, build/,
|
|
3
|
+
description: Use when writing or updating a README for a Gaia component folder (agents/, skills/, hooks/, commands/, config/, bin/, tests/, build/, or the repo root)
|
|
4
4
|
metadata:
|
|
5
5
|
user-invocable: false
|
|
6
6
|
type: technique
|
|
@@ -185,4 +185,3 @@ Copy this when writing a README from scratch. Fill every section -- do not delet
|
|
|
185
185
|
| `bin/` | Low -- CLI tools, user-invoked | No |
|
|
186
186
|
| `tests/` | Low -- run by CI or developer | No |
|
|
187
187
|
| `build/` | Medium -- triggered by npm run build | Optional |
|
|
188
|
-
| `templates/` | Low -- read by build scripts | No |
|
|
@@ -17,7 +17,11 @@ security-tiers classifies every operation into four tiers so an agent knows whet
|
|
|
17
17
|
| **T0** | Read-only; observes state, changes nothing | No | get, list, describe, show, logs, status |
|
|
18
18
|
| **T1** | Local validation; no remote calls, no state | No | validate, lint, fmt, check |
|
|
19
19
|
| **T2** | Simulation / dry-run; may read remote, never writes | No | plan, diff, dry-run, template |
|
|
20
|
-
| **T3** | State-mutating; creates, updates, or destroys | **Yes** | apply, create, delete,
|
|
20
|
+
| **T3** | State-mutating; creates, updates, or destroys | **Yes** | apply, create, delete, push, deploy |
|
|
21
|
+
|
|
22
|
+
`git commit` and `git add` are **not** T3 -- they are local-only operations (they touch the working tree and local refs, never remote state), so they classify as safe by elimination. Only `git push` mutates remote state and is T3. This matches `GIT_LOCAL_SAFE_SUBCOMMANDS` in `mutative_verbs.py`, where `commit` and `add` are listed as local-safe.
|
|
23
|
+
|
|
24
|
+
**T3 gates a direction, not a category of verb.** An operation needs consent because it moves the system toward *more* capability (it grants) or *less* recoverability (it destroys). An operation that only moves the other way -- that *reduces* capability already granted -- does not need consent, because the worst it can do is take back power that was given. So within Gaia's own consent layer, `gaia approvals revoke|reject|reject-all|clean` are **not** T3: they only revoke or discard grants Gaia itself issued, never reaching outside the local approval store. The asymmetry is deliberate -- `gaia approvals approve` *grants* capability without the AskUserQuestion flow, so it stays T3. This is anchored to the `gaia approvals` group in `CONSENT_REDUCING_SUBCOMMAND_EXCEPTIONS` (`mutative_verbs.py`), not generalized to every CLI's "revoke" -- a cloud IAM revoke is a real remote mutation and remains T3.
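
The two carve-outs above can be sketched as a small classifier. The constant names are taken from the text, but their contents and data shapes, the `MUTATIVE_VERBS` set, and the `cloudctl` command used in the usage note are assumptions for illustration; the real logic lives in `mutative_verbs.py`.

```python
# Assumed contents and shapes; only the names come from mutative_verbs.py.
GIT_LOCAL_SAFE_SUBCOMMANDS = frozenset({"add", "commit"})
CONSENT_REDUCING_SUBCOMMAND_EXCEPTIONS = {
    ("gaia", "approvals"): frozenset({"revoke", "reject", "reject-all", "clean"}),
}
MUTATIVE_VERBS = frozenset({"apply", "create", "delete", "push", "deploy", "approve", "revoke"})


def requires_consent(argv: list[str]) -> bool:
    """Return True when the command should be treated as T3."""
    # Consent-reducing subcommands are exempt only for their anchored CLI group.
    if len(argv) >= 3:
        exempt = CONSENT_REDUCING_SUBCOMMAND_EXCEPTIONS.get((argv[0], argv[1]))
        if exempt is not None and argv[2] in exempt:
            return False
    # Local-only git subcommands never reach remote state.
    if len(argv) >= 2 and argv[0] == "git" and argv[1] in GIT_LOCAL_SAFE_SUBCOMMANDS:
        return False
    # Everything else: any mutative verb among the arguments gates on consent.
    return any(token in MUTATIVE_VERBS for token in argv[1:])
```

Under these assumptions `gaia approvals revoke` passes without consent while a hypothetical `cloudctl iam revoke` still gates, which is the anchoring the paragraph describes.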
|
|
21
25
|
|
|
22
26
|
## Classification heuristic
|
|
23
27
|
|
|
@@ -36,7 +36,9 @@ Read on-demand by infrastructure agents. Not injected automatically.
|
|
|
36
36
|
- `kubectl apply -f manifest.yaml`
|
|
37
37
|
- `helm upgrade` (without `--dry-run`)
|
|
38
38
|
- `flux reconcile` (write operations)
|
|
39
|
-
- `git
|
|
39
|
+
- `git push` (any branch) -- mutates remote state
|
|
40
|
+
|
|
41
|
+
Note: `git commit` and `git add` are **not** T3. They are local-only (working tree + local refs, never remote), classified safe by elimination via `GIT_LOCAL_SAFE_SUBCOMMANDS` in `mutative_verbs.py`. Only `git push` reaches remote state.
|
|
40
42
|
|
|
41
43
|
## Edge Cases
|
|
42
44
|
|
|
@@ -63,12 +63,47 @@ prose are invisible to the presentation -- the user would approve blind.
|
|
|
63
63
|
- **The grant is single-use.** It is consumed on your first matching retry. A
|
|
64
64
|
second run within the TTL will not match -- it needs a fresh approval.
|
|
65
65
|
|
|
66
|
-
## Batch / many-command intents
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
66
|
+
## Batch / many-command intents -- COMMAND_SET as a judgment, not a default
|
|
67
|
+
|
|
68
|
+
Grouping commands under one consent is a **judgment call you earn, not the
|
|
69
|
+
reflex you reach for**. The default is singular, just-in-time approval: attempt
|
|
70
|
+
the command, let the hook block it, request that one. Reach for `COMMAND_SET`
|
|
71
|
+
**only when all three hold** -- the batch is already **known** (the commands are
|
|
72
|
+
determined, not predicted), there are **>= 2** of them, and grouping **actually
|
|
73
|
+
reduces friction** versus approving each as it arrives. If any fails (a single
|
|
74
|
+
command, a sequential flow where the next depends on the last's output, or a
|
|
75
|
+
set you cannot yet name), use the singular path. The principle with its
|
|
76
|
+
consequence: **grouping trades the user's per-command visibility for fewer
|
|
77
|
+
prompts; make that trade only when the batch is real and known, because a batch
|
|
78
|
+
you guessed at asks the user to approve commands that may never run.**
|
|
79
|
+
|
|
80
|
+
The hard prohibition this rules out: **never invent or predict commands just to
|
|
81
|
+
have something to group.** Speculatively enumerating a `command_set` to "save
|
|
82
|
+
turns" inverts the cost -- it manufactures ceremony (a multi-command consent
|
|
83
|
+
surface) around work that was never determined, which is more overhead than the
|
|
84
|
+
just-in-time blocks it was meant to avoid. If you do not already know the
|
|
85
|
+
commands, you do not have a batch.
|
|
86
|
+
|
|
87
|
+
When the three conditions do hold, emit an `APPROVAL_REQUEST` whose
|
|
88
|
+
`approval_request` carries a `command_set` -- a list of `{command, rationale}`
|
|
89
|
+
items -- and **no `approval_id`** (nothing has been attempted yet). The
|
|
90
|
+
per-command rationale is what makes the grouped consent honest: the user sees
|
|
91
|
+
why each *known* command is in the batch before approving (D10).
|
|
92
|
+
|
|
93
|
+
What happens to that envelope: the SubagentStop processor
|
|
94
|
+
(`hooks/modules/agents/handoff_persister.py` -> `_intake_command_set_pending`)
|
|
95
|
+
reads the `command_set`, and when it holds **>= 2** items it calls
|
|
96
|
+
`gaia.approvals.store.insert_requested` with a payload containing the
|
|
97
|
+
`command_set` key. That mints **exactly ONE pending `COMMAND_SET` approval**
|
|
98
|
+
with one `approval_id` -- so a batch of N commands is **one consent, N
|
|
99
|
+
commands**, not N approvals. A set of `<= 1` item is not a batch: it does not
|
|
100
|
+
mint a COMMAND_SET (use the normal singular block path for a single command).
|
|
101
|
+
|
|
102
|
+
On the user's approval, that one pending activates into a single `COMMAND_SET`
|
|
103
|
+
grant (60-minute TTL); each item is then consumed byte-for-byte on its own
|
|
104
|
+
retry, with replay protection, until the whole set is `CONSUMED`. See
|
|
105
|
+
`reference.md` for the envelope shape, the intake processor, the grant TTL, and
|
|
106
|
+
the consume path.
|
|
72
107
|
|
|
73
108
|
## Pointers
|
|
74
109
|
|
|
@@ -84,3 +119,5 @@ approval, so emitting `batch_scope` does nothing. See `reference.md` for why.
|
|
|
84
119
|
- **Fabricating `approval_id`, fingerprint, or `sealed_payload`** -- the orchestrator validates against the DB; invented values never match.
|
|
85
120
|
- **Reusing a prior approval** -- single-use, consumed on first retry.
|
|
86
121
|
- **Emitting `batch_scope`** -- the field does not exist; it is ignored.
|
|
122
|
+
- **Grouping by reflex** -- reaching for `COMMAND_SET` because a batch *might* form, instead of because a known batch of >= 2 already exists that grouping makes cheaper. The default is singular just-in-time; grouping is the exception you justify.
|
|
123
|
+
- **Predicting commands to fill a batch** -- inventing commands you have not determined so a `command_set` has >= 2 items. You cannot ask consent for work that does not yet exist; the speculative batch is pure overhead.
|
|
@@ -69,35 +69,85 @@ On your retry, `check_approval_grant()` matches it and immediately consumes it
|
|
|
69
69
|
TTL will NOT match -- the grant is gone. This is replay protection by design;
|
|
70
70
|
re-approve if you need to run the command again.
|
|
71
71
|
|
|
72
|
-
## Batch / COMMAND_SET --
|
|
72
|
+
## Batch / COMMAND_SET -- wired
|
|
73
73
|
|
|
74
74
|
The legacy `verb_family` multi-use grant was removed (see module docstring in
|
|
75
|
-
`approval_grants.py`: "The legacy verb_family path has been removed"). Its
|
|
75
|
+
`approval_grants.py`: "The legacy verb_family path has been removed"). Its
|
|
76
76
|
replacement is the `COMMAND_SET` grant -- an explicit list of `{command, rationale}`
|
|
77
77
|
items, each matched byte-for-byte and consumed individually
|
|
78
78
|
(`approval_grants.create_command_set_grant()`; `approval_grants.match_command_set_grant()`).
|
|
79
|
+
All three sides are now wired end-to-end -- **intake**, **activation**, and
|
|
80
|
+
**consume** -- so one consent covers N commands.
|
|
81
|
+
|
|
82
|
+
**Intake -- plan-first, one pending.** The batch is declared up-front: you emit
|
|
83
|
+
an `APPROVAL_REQUEST` whose `approval_request` carries a `command_set` list and
|
|
84
|
+
**no `approval_id`** (you have attempted nothing). The production intake caller
|
|
85
|
+
is the SubagentStop processor `handoff_persister.persist_handoff()`, which calls
|
|
86
|
+
`_intake_command_set_pending()`. That helper normalizes the `command_set` and,
|
|
87
|
+
when it holds **>= 2** `{command, rationale}` items, builds a sealed_payload
|
|
88
|
+
carrying the `command_set` key (mirroring the shape
|
|
89
|
+
`bash_validator._build_sealed_payload()` emits) and calls
|
|
90
|
+
`gaia.approvals.store.insert_requested()` -- minting **exactly ONE** pending
|
|
91
|
+
`COMMAND_SET` approval with one `approval_id`. A set of length `<= 1` is not a
|
|
92
|
+
batch: the intake declines and the singular semantic-signature path owns it (no
|
|
93
|
+
COMMAND_SET is ever minted for one command). The intake runs independently of
|
|
94
|
+
the audit handoff-row write, so a batch consent is never lost to an unrelated
|
|
95
|
+
DB failure.
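
The intake decision can be sketched as below. This is a simplified model, not `_intake_command_set_pending()` itself: the list used as a store, the id format, and the normalization rules are assumptions made for the example.

```python
import uuid
from typing import Optional


def normalize_command_set(raw: object) -> list[dict]:
    """Keep only well-formed {command, rationale} items (assumed normalization)."""
    items = []
    for entry in raw if isinstance(raw, list) else []:
        if isinstance(entry, dict) and isinstance(entry.get("command"), str) and entry["command"]:
            items.append({"command": entry["command"], "rationale": str(entry.get("rationale", ""))})
    return items


def intake_command_set(approval_request: dict, pending_store: list[dict]) -> Optional[str]:
    """Mint ONE pending approval for a plan-first batch of >= 2 items, else decline."""
    if approval_request.get("approval_id"):
        return None  # already hook-blocked: the singular path owns it
    items = normalize_command_set(approval_request.get("command_set"))
    if len(items) < 2:
        return None  # zero or one command is not a batch
    approval_id = "P-" + uuid.uuid4().hex  # hypothetical id format
    pending_store.append({
        "approval_id": approval_id,
        "status": "pending",
        "payload": {"commands": [i["command"] for i in items], "command_set": items},
    })
    return approval_id
```

However many commands the batch holds, the store gains exactly one row, which is what makes it one consent rather than N.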
|
|
96
|
+
|
|
97
|
+
**Envelope shape.** The sealed_payload the intake writes carries a `command_set`
|
|
98
|
+
key holding the verbatim list of `{command, rationale}` items, and `commands`
|
|
99
|
+
listing every command string in the set:
|
|
79
100
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
101
|
+
```json
|
|
102
|
+
{
|
|
103
|
+
"operation": "MUTATIVE command intercepted: push",
|
|
104
|
+
"exact_content": "git add -A",
|
|
105
|
+
"commands": ["git add -A", "git commit -m 'v1.2.0'", "git push origin main"],
|
|
106
|
+
"command_set": [
|
|
107
|
+
{"command": "git add -A", "rationale": "stage release files"},
|
|
108
|
+
{"command": "git commit -m 'v1.2.0'", "rationale": "record the release commit"},
|
|
109
|
+
{"command": "git push origin main", "rationale": "publish to the remote"}
|
|
110
|
+
]
|
|
111
|
+
}
|
|
112
|
+
```
|
|
86
113
|
|
|
87
|
-
**
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
114
|
+
**Activation -- one consent, one grant.** When the user approves, the
|
|
115
|
+
ElicitationResult hook (`approval_grants.activate_db_pending_by_prefix()`)
|
|
116
|
+
detects the `command_set` and branches to `approval_grants.create_command_set_grant()`,
|
|
117
|
+
which inserts a single `COMMAND_SET` grant row into `approval_grants`
|
|
118
|
+
(status `PENDING`, `command_set_json` holding the whole set). The grant TTL is
|
|
119
|
+
**60 minutes** (`DEFAULT_COMMAND_SET_TTL_MINUTES`), aligned to the singular
|
|
120
|
+
active-grant TTL so the batch does not expire mid-consume across sessions.
|
|
121
|
+
|
|
122
|
+
**Consume -- item by item, replay-protected.** On each retry,
|
|
123
|
+
`bash_validator._validate_single_command()` calls `match_command_set_grant()`,
|
|
124
|
+
which finds the matching command's index byte-for-byte and returns it; the
|
|
125
|
+
validator then calls `mark_command_set_item_consumed()`, appending that index to
|
|
126
|
+
`consumed_indexes_json`. A consumed index never matches again (replay
|
|
127
|
+
protection), and when every index is consumed the grant flips to `CONSUMED`.
|
|
128
|
+
Wrapping an approved command -- adding `cd`, a redirect, a pipe, or a flag --
|
|
129
|
+
produces a different string and matches nothing in the set; it requires fresh
|
|
130
|
+
approval.
|
|
131
|
+
|
|
132
|
+
**Consequence:** for a set of N related T3 commands, emit the `command_set`
|
|
133
|
+
envelope and the user approves once. Each command runs on its own retry,
|
|
134
|
+
single-use within the 60-minute window.
|
|
91
135
|
|
|
92
136
|
## Status to emit -- with vs without approval_id
|
|
93
137
|
|
|
94
138
|
Always `plan_status: "APPROVAL_REQUEST"`. The presence of `approval_id` tells the
|
|
95
139
|
orchestrator which path:
|
|
96
140
|
|
|
97
|
-
- **With `approval_id`** -- the hook blocked; orchestrator
|
|
98
|
-
fingerprint and activates the grant on user
|
|
99
|
-
|
|
100
|
-
|
|
141
|
+
- **With `approval_id`** -- the hook blocked a single command; orchestrator
|
|
142
|
+
validates the fingerprint and activates the single-use semantic grant on user
|
|
143
|
+
approval.
|
|
144
|
+
- **Without `approval_id`, with a `command_set` of >= 2 items** -- plan-first
|
|
145
|
+
batch. The SubagentStop intake processor mints ONE pending `COMMAND_SET` and
|
|
146
|
+
the orchestrator presents that single approval (N commands, one nonce) before
|
|
147
|
+
any execution. See "Batch / COMMAND_SET -- wired" above.
|
|
148
|
+
- **Without `approval_id` and without a multi-item `command_set`** -- plan-first
|
|
149
|
+
single (you are presenting one T3 plan before attempting); the orchestrator
|
|
150
|
+
gates on user consent before any execution.
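
The three paths above reduce to a small decision on the envelope. The function name and the returned labels are hypothetical, chosen only to mirror the bullets.

```python
def approval_path(approval_request: dict) -> str:
    """Classify an APPROVAL_REQUEST envelope into one of the three paths."""
    if approval_request.get("approval_id"):
        # The hook already blocked a single command and minted the id.
        return "hook-blocked-single"
    command_set = approval_request.get("command_set") or []
    if len(command_set) >= 2:
        # Plan-first batch: intake mints ONE pending COMMAND_SET.
        return "plan-first-batch"
    # No id and no multi-item set: one T3 plan presented before attempting.
    return "plan-first-single"
```

Note that a `command_set` holding a single item falls through to the plan-first single path, consistent with the intake declining it.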
|
|
101
151
|
|
|
102
152
|
## Examples
|
|
103
153
|
|
package/tools/context/README.md
CHANGED
|
@@ -77,7 +77,7 @@ Agent contracts live in `~/.gaia/gaia.db` (`project_context_contracts` + `agent_
|
|
|
77
77
|
**cloud-troubleshooter:**
|
|
78
78
|
- project_identity, stack, git, environment, infrastructure, orchestration
|
|
79
79
|
- cluster_details, infrastructure_topology, terraform_infrastructure
|
|
80
|
-
- gitops_configuration, application_services,
|
|
80
|
+
- gitops_configuration, application_services, architecture_overview
|
|
81
81
|
|
|
82
82
|
The same contracts are exposed under `write_permissions`:
|
|
83
83
|
- `readable_sections`
|
|
@@ -221,7 +221,6 @@ class LogExtractor:
|
|
|
221
221
|
# Exit 2 BLOCK (block_response is None):
|
|
222
222
|
# - "Command blocked by security policy ..." -- permanent deny list
|
|
223
223
|
# - "Commit message validation failed ..." -- validation error
|
|
224
|
-
# - "GitOps policy violation ..." -- GitOps validation
|
|
225
224
|
# - "Empty command not allowed"
|
|
226
225
|
if (
|
|
227
226
|
reason.startswith("Dangerous")
|