@jetrabbits/agentic 0.3.0 → 0.3.2
- package/AGENTS.md +17 -23
- package/CHANGELOG.md +19 -0
- package/MEMORY.md +41 -87
- package/Makefile +80 -22
- package/README.md +17 -7
- package/agentic +634 -124
- package/areas/devops/ci-cd/AGENTS.md +1 -15
- package/areas/devops/database-ops/AGENTS.md +1 -15
- package/areas/devops/devsecops/AGENTS.md +1 -15
- package/areas/devops/infrastructure/AGENTS.md +1 -15
- package/areas/devops/kubernetes/AGENTS.md +1 -15
- package/areas/devops/networking/AGENTS.md +1 -15
- package/areas/devops/observability/AGENTS.md +1 -15
- package/areas/devops/sre/AGENTS.md +1 -15
- package/areas/software/backend/AGENTS.md +1 -16
- package/areas/software/data-engineering/AGENTS.md +1 -16
- package/areas/software/frontend/AGENTS.md +1 -16
- package/areas/software/full-stack/AGENTS.md +1 -16
- package/areas/software/general/AGENTS.md +1 -7
- package/areas/software/mlops/AGENTS.md +1 -16
- package/areas/software/mobile/AGENTS.md +1 -16
- package/areas/software/platform/AGENTS.md +1 -16
- package/areas/software/qa/AGENTS.md +1 -16
- package/areas/software/security/AGENTS.md +1 -16
- package/areas/template/AGENTS.tmpl.md +1 -17
- package/docs/agentic-lifecycle.md +7 -3
- package/docs/agentic-stabilization/README.md +11 -7
- package/docs/agentic-token-minimization/README.md +7 -5
- package/docs/agentic-usage.md +12 -5
- package/docs/guidance-updates/2026-05-22-centralized-guidance-memory.md +19 -0
- package/docs/opencode_setup.md +7 -5
- package/docs/review-pipeline.md +82 -0
- package/extensions/claude/agents/instruction_reviewer.md +132 -0
- package/extensions/claude/agents/memory_curator.md +97 -0
- package/extensions/codex/AGENTS.override.md +17 -0
- package/extensions/codex/agents/instruction_reviewer.toml +139 -0
- package/extensions/codex/agents/memory_curator.toml +104 -0
- package/extensions/gemini/agents/instruction_reviewer.md +132 -0
- package/extensions/gemini/agents/memory_curator.md +97 -0
- package/extensions/opencode/agents/instruction_reviewer.md +133 -0
- package/extensions/opencode/agents/memory_curator.md +98 -0
- package/extensions/opencode/opencode.json +27 -3
- package/extensions/opencode/plugins/agent-model-mapper.ts +13 -2
- package/extensions/opencode/plugins/telegram-notification.ts +14 -14
- package/package.json +1 -1
- package/scripts/generate_how_to_use_agentic_gif.py +565 -0
|
@@ -4,16 +4,19 @@
 
 - Post-install doctor checks run independently for `codex`, `opencode`, `claude`, and `gemini`.
 - `AGENTIC_DOCTOR_TIMEOUT_SECONDS` defaults to `10`; a timeout is reported as a doctor failure and install continues.
-- Codex doctor runs non-interactively with `--ephemeral
+- Codex doctor runs non-interactively with `--ephemeral`, `--sandbox workspace-write`, and the same lightweight smoke prompt as other supported doctor targets.
 - OpenCode uses `agent-model-mapper` instead of the removed `model-checker` artifacts.
 - `agent-model-mapper` writes `.opencode/opencode.json` during interactive install only after confirmation.
 - `agent-model-mapper` uses `fzf` for install-time model dropdowns when available and OpenCode startup skips once all roles are mapped.
 - The runtime OpenCode plugin never opens `fzf`, asks questions, or writes project files.
-- Context7
-- OpenCode MemPalace setup writes `mempalace-mcp` config
-- Telegram notification credentials are read
+- Context7 offers an interactive key mode: configure without a key or enter `CONTEXT7_API_KEY` for the selected target configs.
+- OpenCode MemPalace setup writes `mempalace-mcp` config and initializes/mines project memory into a project-specific wing without LLM calls.
+- Telegram notification credentials are read from project `.agentic.json` when the plugin is enabled.
 - MemPalace-enabled installs create a managed `.mempalaceignore` unless the target project already has one.
--
+- `make test` runs the fast deterministic e2e suite; longer deterministic checks, real blackbox, and coverage checks are explicit targets.
+- `make test-all` runs the full local suite including longer deterministic checks, install/evidence blackbox, and coverage.
+- Real Codex, OpenCode, and Telegram blackbox install/evidence scenarios run through `make test-real-blackbox`.
+- Live Codex/OpenCode/Telegram blackbox sessions require `AGENTIC_REAL_BLACKBOX_LIVE=1`.
 - `make test-coverage` traces `agentic` through e2e runs and fails below 90% line coverage.
 
 ## Acceptance Criteria
@@ -24,10 +27,11 @@
 - `extensions/opencode/opencode.json` lists `agent-model-mapper`.
 - Runtime model mapper execution does not prompt or modify project files.
 - Telegram plugin tests prove environment-only credentials and no secret output.
-- Real blackbox tests print created files,
+- Real blackbox tests print created files, managed guidance sources, and MCP config evidence, then save instruction evidence to a temp file without printing Telegram secrets.
 
 ## Operational Constraints
 
-- `make test`
+- `make test` is deterministic, designed for a sub-minute local loop, and does not require real agent binaries, model auth, network access, Context7/MemPalace access, or Telegram credentials.
+- `AGENTIC_REAL_BLACKBOX_LIVE=1 make test-real-blackbox` requires real `codex` and `opencode` binaries, working model auth, network access, Context7/MemPalace access, and Telegram credentials for the Telegram case.
 - Telegram credentials must never be committed or written to Agentic config.
 - Coverage is line-based Bash trace coverage for the `agentic` script, not branch coverage.
|
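The coverage rule stated in the hunks above (line-based coverage of the `agentic` script, failing below 90%) can be illustrated with a minimal Python sketch. The function names, the set-based inputs, and the treatment of an empty script as fully covered are assumptions made for illustration; the diff does not show how `make test-coverage` actually parses the Bash trace.

```python
def line_coverage_percent(executable_lines, traced_lines):
    """Return the percentage of executable lines that appear in the trace."""
    executable = set(executable_lines)
    if not executable:
        # Assumption: a script with no executable lines counts as fully covered.
        return 100.0
    covered = executable & set(traced_lines)
    return 100.0 * len(covered) / len(executable)


def coverage_gate(executable_lines, traced_lines, threshold=90.0):
    """Return True when line coverage meets the threshold, False otherwise."""
    return line_coverage_percent(executable_lines, traced_lines) >= threshold
```

Because the measure is line-based, a line counts as covered once any traced run executes it, regardless of which branch of a conditional was taken.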
@@ -32,15 +32,17 @@ When installing for OpenCode, `agentic` writes optional plugin state to:
 ~/.config/agentic/opencode-plugins.json
 ```
 
-Interactive installs ask whether to enable Telegram notifications and model
+Interactive installs ask whether to enable Telegram notifications and model mapping. Non-interactive installs default optional plugins to disabled when no config exists.
 
-The OpenCode plugins read
+The OpenCode plugins read project `.agentic.json` at startup and return no hooks when disabled. When Telegram is enabled, credentials are stored in plaintext at:
 
 ```text
-
-
+settings.opencode_plugins.telegram.botToken
+settings.opencode_plugins.telegram.chatId
 ```
 
+Do not commit a Telegram-enabled `.agentic.json` to public repositories.
+
 ## Context7
 
 `agentic` adds Context7 MCP configuration for known project-level formats:
@@ -54,7 +56,7 @@ OPENCODE_TELEGRAM_CHAT_ID
 - `.kilocode/mcp.json` for `kilocode`
 - `~/.gemini/antigravity/mcp_config.json` for `antigravity` (global user config)
 
-Interactive installs ask whether to enable Context7. If enabled, Context7
+Interactive installs ask whether to enable Context7. If enabled, Context7 can be configured without a key or with a `CONTEXT7_API_KEY` entered during setup. Non-interactive installs enable Context7 when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set. Generated guidance requires agents to use Context7 for framework, SDK, library, and API documentation before relying on model memory when the project config is present.
 
 Directory copies are processed in batches so large specialization installs avoid spawning a separate marker/manifest process for every copied file. Manifest protection still applies: existing unmanaged files are skipped on rerun, user-modified managed files are skipped, and new generated files can be added by newer `agentic` versions.
 
package/docs/agentic-usage.md
CHANGED
|
@@ -104,7 +104,7 @@ The final install line prints the exact path:
 Agentic log file: /tmp/agentic-20260512-114203.ABC123
 ```
 
-`agentic` also runs a final doctor smoke check for selected real agent targets (`codex`, `opencode`, `claude`, `gemini`). The doctor runs in a temporary copy of the project and prints one status row per selected agent.
+`agentic` also runs a final doctor smoke check for selected real agent targets (`codex`, `opencode`, `claude`, `gemini`). The doctor runs in a temporary copy of the project and prints one status row per selected agent. All supported doctor targets use a lightweight pure smoke prompt, so install-time doctor checks do not start a long SDLC workflow. Each agent has an independent timeout controlled by `AGENTIC_DOCTOR_TIMEOUT_SECONDS` and defaults to `10` seconds. Doctor failures and timeouts are reported but do not roll back or fail the install. Disable doctor for cheap checks with:
 
 ```bash
 AGENTIC_DOCTOR=0 agentic install ...
@@ -169,11 +169,11 @@ When `opencode` is selected, interactive installs ask whether to enable Telegram
 ~/.config/agentic/opencode-plugins.json
 ```
 
-Non-interactive installs create a disabled config when no config exists. Telegram
+Non-interactive installs create a disabled config when no config exists. Interactive installs ask for Telegram `botToken` and `chatId` when `telegram-notification` is selected. Those credentials are written to the target project `.agentic.json` under `settings.opencode_plugins.telegram`, not to `~/.config/agentic/opencode-plugins.json`. Treat `.agentic.json` as plaintext secret-bearing project config when Telegram is enabled and do not commit it to public repositories. When enabled, `agent-model-mapper` runs during interactive `agentic install`/`agentic tui`, uses `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after a Confirm action. OpenCode startup never prompts for model mapping; the runtime plugin only reports whether install-time mapping is already complete.
 
 ## Context7
 
-For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigravity`, interactive installs ask whether to add Context7 MCP configuration. If enabled,
+For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigravity`, interactive installs ask whether to add Context7 MCP configuration. If enabled, a follow-up menu chooses either keyless mode or entering `CONTEXT7_API_KEY`. The selected key is written to all selected target configs for the current project. Most targets use project-level files, while `antigravity` is written to the global user path `~/.gemini/antigravity/mcp_config.json`.
 
 Non-interactive installs enable Context7 when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set. Agents are instructed to use Context7 for framework, library, SDK, API, and setup documentation when the project config is present.
 
|
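The non-interactive Context7 rule in the hunk above (enable when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set) can be read as a simple predicate. The sketch below is a hypothetical Python illustration, not the installer's Bash logic; the function name is invented, and whether values other than `y` (for example `yes` or `1`) are accepted is not stated in the source, so only `y` is matched here.

```python
import os


def context7_enabled(env=None):
    """Return True when a non-interactive install would enable Context7.

    Assumption: only the literal value "y" enables the flag, and an empty
    CONTEXT7_API_KEY is treated as unset.
    """
    env = os.environ if env is None else env
    flag_on = env.get("AGENTIC_ENABLE_CONTEXT7") == "y"
    key_present = bool(env.get("CONTEXT7_API_KEY"))
    return flag_on or key_present
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise without touching the real environment.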
@@ -183,14 +183,21 @@ For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigrav
 
 Generated configs run `mempalace-mcp` without arguments for all supported agent targets. Runtime startup and MCP tool errors are checked by the post-install doctor stage.
 
-During install, if MemPalace is enabled, `agentic` writes a managed `.mempalaceignore` when the target project does not already have one.
+During install, if MemPalace is enabled, `agentic` writes a managed `.mempalaceignore` when the target project does not already have one. It initializes the project with `mempalace init --yes --no-llm`, then mines project knowledge into a wing named from the target project basename. If `docs/` exists, those files are also mined into the shared `shared_docs` wing for cross-project Markdown knowledge. MemPalace commands time out after `60` seconds by default; override with `AGENTIC_MEMPALACE_TIMEOUT_SECONDS`. If auto-install, initialization, mining, timeout, or runtime checks fail, install continues, manual setup instructions are printed, and agents fall back to standard context discovery. If `pip install mempalace` fails, `agentic` prints the pip exit status, a temporary pip output log path, and the first non-empty pip output line as the likely reason; the full pip output is also copied into the main Agentic run log.
+
+For environments that already provide `mempalace-mcp` and need a fast install path, set `AGENTIC_MEMPALACE_SETUP=skip`. This writes the same MCP configuration and `.mempalaceignore` but skips Python package installation and project indexing.
 
 ## Real agent blackbox E2E
 
-`make test`
+`make test` runs the fast deterministic e2e suite and is intended to finish in under a minute on a normal local machine. Longer deterministic checks (`test-doctor`, `test-markers`), real-agent blackbox, and coverage checks are explicit targets so the default local loop stays short.
+
+`make test-real-blackbox` validates real Codex/OpenCode install artifacts, generated guidance, MCP config, and manifest evidence without starting live LLM sessions by default. Set `AGENTIC_REAL_BLACKBOX_LIVE=1` to execute live `codex`/`opencode` runs. Live Telegram checks require Telegram credentials to be present in the target project `.agentic.json`; secrets are redacted from test output.
+
+Makefile test targets print timestamped `[make-timing]` start, success/failure, exit status, and elapsed seconds for every top-level test step. `test-real-blackbox` is split into separate Codex, OpenCode, OpenCode mapper, and Telegram timed steps. Blackbox instruction evidence is saved to a `/tmp/agentic-instruction-evidence.*` file and the path is printed. `test-coverage` also prints timing for each traced coverage scenario before the final coverage parser. Use `make test-all` when you want the fast suite, longer deterministic checks, install/evidence blackbox, and coverage in one command.
 
 ```bash
 make test-real-blackbox
+AGENTIC_REAL_BLACKBOX_LIVE=1 make test-real-blackbox
 ```
 
 ## Deprecated wrapper
@@ -0,0 +1,19 @@
+# Centralized guidance loading and memory writes
+
+## User-facing behavior
+
+Agent guidance loading rules are defined in the root `AGENTS.md` instead of being repeated in each `areas/**/AGENTS.md` specialization index. Area files now focus on scope, inherited constraints, overrides, and spec maps.
+
+`MEMORY.md` now explicitly tells agents to use `mempalace_store` proactively for durable project facts when those facts are discovered, decided, or corrected.
+
+## Acceptance criteria
+
+- Root `AGENTS.md` contains the canonical guidance chain and `.agent/**/*.md` discovery patterns.
+- Area specialization `AGENTS.md` files do not repeat `## Guidance chain` or `## Discovery patterns`.
+- `areas/template/AGENTS.tmpl.md` does not reintroduce the duplicated sections for future specs.
+- `MEMORY.md` includes a concise `mempalace_store` example with wing, optional confirmed room, text, and tags.
+
+## Operational constraints
+
+- Token-budget reporting uses a dependency-free estimate of `ceil(chars / 4)` unless a tokenizer dependency is intentionally added later.
+- Validation continues to run through Makefile targets: `make lint` and `make build`.
|
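The `ceil(chars / 4)` token estimate named in the operational constraints above is small enough to show directly. This is a minimal Python sketch; the function name `estimate_tokens` is hypothetical, and counting Unicode code points via `len()` is an assumption, since the source does not say whether "chars" means bytes or characters.

```python
import math


def estimate_tokens(text: str) -> int:
    """Estimate token count as ceil(chars / 4), with no tokenizer dependency.

    Assumption: "chars" is the number of Unicode code points in the text.
    """
    return math.ceil(len(text) / 4)
```

The estimate is deliberately coarse. It trades accuracy for having no external dependency, which matches the stated constraint.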
package/docs/opencode_setup.md
CHANGED
|
@@ -33,18 +33,20 @@ When `agentic` installs the OpenCode extension, it configures optional plugins i
 ~/.config/agentic/opencode-plugins.json
 ```
 
-Telegram notifications and agent model mapping are opt-in. If the config is absent or a plugin is disabled, the plugin returns no hooks and OpenCode continues without that behavior.
+Telegram notifications and agent model mapping are opt-in. Interactive `agentic install` and `agentic tui` ask for OpenCode plugin selection whenever `opencode` is selected; the answer rewrites this config. During manifest-based upgrade/re-install sync, existing plugin settings are kept so automated refreshes do not open prompts. If the config is absent or a plugin is disabled, the plugin returns no hooks and OpenCode continues without that behavior.
 
-
+When `telegram-notification` is selected interactively, `agentic` asks for `botToken` and `chatId` and stores them in the target project's `.agentic.json`:
 
 ```text
-
-
+settings.opencode_plugins.telegram.botToken
+settings.opencode_plugins.telegram.chatId
 ```
 
+The runtime plugin reads credentials from the project `.agentic.json`; it does not read Telegram credentials from environment variables. This file stores credentials in plaintext, so do not commit a Telegram-enabled `.agentic.json` to public repositories.
+
 Non-interactive `agentic install` defaults optional plugins to disabled when no config exists.
 
-`agent-model-mapper` reads roles from target `.opencode/agents/*.md` and discovers model names from `~/.config/opencode/opencode.json`,
+`agent-model-mapper` reads roles from target `.opencode/agents/*.md` and discovers model names from `~/.config/opencode/opencode.json`, then adds models from active providers in `~/.local/share/opencode/auth.json` using non-deprecated entries in `~/.cache/opencode/models.json`. When enabled, interactive `agentic install`/`agentic tui` prompts for a main and fallback model per role, using `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after a Confirm action. OpenCode startup never opens `fzf` or waits for model input; the runtime plugin only reports whether install-time mapping is complete.
 
 For OpenCode targets, `agentic` writes generated operating guidance to `.opencode/AGENTS.md`. If OpenCode is installed
 alongside another agent target, root `AGENTS.md` is generated as well for the non-OpenCode target.
|
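The key path documented above (`settings.opencode_plugins.telegram.botToken` and `.chatId` inside the project `.agentic.json`) can be illustrated with a short lookup. This is a hypothetical Python sketch: the shipped plugin is TypeScript and its code is not shown in this hunk, the function names are invented, and treating a missing or empty value as "not configured" is an assumption.

```python
import json


def load_telegram_settings(agentic_json_text):
    """Return (bot_token, chat_id) from .agentic.json text, or None if unset."""
    config = json.loads(agentic_json_text)
    telegram = (
        config.get("settings", {})
        .get("opencode_plugins", {})
        .get("telegram", {})
    )
    bot_token = telegram.get("botToken")
    chat_id = telegram.get("chatId")
    if not bot_token or not chat_id:
        return None
    return bot_token, chat_id


def redact(secret):
    """Mask a secret before it is written to logs or test output."""
    return "***" if secret else ""
```

Because the file holds the token in plaintext, anything that prints configuration for debugging should pass values through something like `redact` first.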
@@ -0,0 +1,82 @@
+# Review Pipeline
+
+Agentic ships two optional post-task specialist agents:
+
+- `instruction_reviewer`: reviews how instructions affected task execution.
+- `memory_curator`: recommends long-term memory store, update, merge, ignore, and delete-candidate actions.
+
+These agents are outside the mandatory SDLC role matrix. They do not replace `product-owner`, `pm`, `team-lead`,
+`developer`, `qa`, `designer`, or `devops-engineer`.
+
+## Guidance-mode integration
+
+Agentic currently provides guidance and IDE agent definitions for the review pipeline. It does not run a generic
+post-task review runner. The parent or orchestrating agent should call the specialists after task execution when the
+task size and risk justify the extra review.
+
+Small tasks may skip this pipeline.
+
+```yaml
+review_pipeline:
+  enabled: true
+  default:
+    - qa
+    - instruction_reviewer
+    - memory_curator
+  task_types:
+    agent_system:
+      - qa
+      - instruction_reviewer
+      - memory_curator
+    docs:
+      - instruction_reviewer
+      - memory_curator
+    code:
+      - qa
+      - instruction_reviewer
+      - memory_curator
+```
+
+`tool_optimizer` may be added to `agent_system` tasks in projects that install such a role. This repository does not
+ship a `tool_optimizer` role.
+
+## Output files
+
+When the orchestrating agent writes review artifacts, use this layout:
+
+```text
+.reviews/<task-id>/
+├── instruction-review.md
+├── memory-curation.md
+└── summary.md
+```
+
+If the task id is unavailable, use a timestamp in `YYYY-MM-DD-HHMMSS` format, for example:
+
+```text
+.reviews/2026-05-26-153000/
+```
+
+The specialist agents only produce Markdown reports. They do not write memory automatically and do not create review
+files unless the parent task explicitly grants file-writing scope.
+
+Example reports live under `docs/review-pipeline/examples/`.
+
+## Report boundaries
+
+`instruction_reviewer` reviews instruction effects only:
+
+- `AGENTS.md`, `MEMORY.md`, role prompts, workflows, and tool guidance
+- instruction clarity, usefulness, conflicts, redundancy, and missing rules
+- repeated search loops, unnecessary memory lookups, unnecessary MCP calls, and token/tool waste
+
+It must not review code quality or product requirements.
+
+`memory_curator` reviews memory hygiene only:
+
+- durable project facts, conventions, workflows, decisions, constraints, and rationale
+- duplicate, stale, contradictory, or low-value memory candidates
+- store/update/merge/ignore/delete recommendations
+
+It must not store temporary logs, one-time commands, transient errors, generated code, secrets, temporary URLs, noisy
+debug output, or current task state.
|
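The `review_pipeline` mapping and the `.reviews/<task-id>/` layout described in the file above can be sketched as two small helpers that an orchestrating agent might mirror. This is a hypothetical Python illustration only: the file states that Agentic does not ship a review runner, and the function names, the fallback to `default` for unknown task types, and the empty result when the pipeline is disabled are all assumptions.

```python
from datetime import datetime

# Mirrors the YAML guidance; held as a plain dict so no YAML parser is needed.
REVIEW_PIPELINE = {
    "enabled": True,
    "default": ["qa", "instruction_reviewer", "memory_curator"],
    "task_types": {
        "agent_system": ["qa", "instruction_reviewer", "memory_curator"],
        "docs": ["instruction_reviewer", "memory_curator"],
        "code": ["qa", "instruction_reviewer", "memory_curator"],
    },
}


def reviewers_for(task_type, pipeline=REVIEW_PIPELINE):
    """Return the reviewer list for a task type (assumed fallback: default)."""
    if not pipeline.get("enabled"):
        return []
    return list(pipeline["task_types"].get(task_type, pipeline["default"]))


def review_dir(task_id=None, now=None):
    """Return the review directory, using a timestamp when no task id exists."""
    if task_id:
        return f".reviews/{task_id}/"
    now = now or datetime.now()
    return f".reviews/{now.strftime('%Y-%m-%d-%H%M%S')}/"
```

The `strftime` pattern `%Y-%m-%d-%H%M%S` produces the `YYYY-MM-DD-HHMMSS` form used in the example directory name.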
@@ -0,0 +1,132 @@
+---
+name: instruction_reviewer
+description: Use this agent after task execution to review how AGENTS.md, MEMORY.md, role prompts, and tool-use instructions affected the run. It does not review code quality or product requirements.
+---
+
+# Instruction Reviewer
+
+You are Instruction Reviewer.
+Your job is to evaluate how agent instructions affected task execution.
+You do NOT review code quality.
+You do NOT review product requirements.
+You do NOT rewrite the implementation unless an instruction directly caused a problem.
+
+Analyze:
+- AGENTS.md
+- MEMORY.md
+- role prompts
+- task description
+- execution log
+- tool calls
+- final diff
+- test results
+- review artifacts
+
+Focus on:
+- instruction clarity
+- instruction usefulness
+- instruction conflicts
+- redundant rules
+- missing rules
+- excessive tool usage
+- repeated search loops
+- unnecessary memory lookups
+- unnecessary MCP calls
+- token waste
+- context reuse
+
+Output only a markdown report.
+Use this structure:
+
+# Instruction Effectiveness Review
+
+## Summary
+
+Brief 3-5 sentence summary.
+
+## Scores
+
+| Category | Score 0-10 | Notes |
+|---|---:|---|
+| Clarity | | |
+| Usefulness | | |
+| Tool discipline | | |
+| Memory discipline | | |
+| Ambiguity resistance | | |
+| Token efficiency | | |
+| Overall | | |
+
+## Effective instructions
+
+| Instruction | Impact | Evidence |
+|---|---|---|
+| | | |
+
+## Harmful instructions
+
+| Instruction | Problem | Evidence |
+|---|---|---|
+| | | |
+
+## Missing instructions
+
+| Missing instruction | Why needed | Suggested text |
+|---|---|---|
+| | | |
+
+## Redundant instructions
+
+| Instruction | Reason |
+|---|---|
+| | |
+
+## Tool usage findings
+
+| Tool | Calls | Useful | Waste | Notes |
+|---|---:|---:|---:|---|
+| | | | | |
+
+## Suggested edits
+
+### Remove
+
+```md
+...
+```
+
+### Replace
+
+```md
+...
+```
+
+with:
+
+```md
+...
+```
+
+### Add
+
+```md
+...
+```
+
+## Estimated waste
+
+| Metric | Estimate |
+|---|---:|
+| Extra tokens | |
+| Extra tool calls | |
+| Extra retries | |
+| Extra runtime | |
+
+## Final recommendation
+
+Choose one:
+
+- Keep as-is
+- Minor edits
+- Significant rewrite
+
+Explain in 2-5 sentences.
@@ -0,0 +1,97 @@
+---
+name: memory_curator
+description: Use this agent after task execution to recommend high-quality long-term memory stores, updates, merges, ignores, and delete candidates. It does not write memory automatically.
+---
+
+# Memory Curator
+
+You are Memory Curator.
+Your job is to maintain high-quality long-term memory.
+Store only facts that are likely to be useful in future tasks.
+Prefer fewer, higher-quality memories.
+
+Store:
+- stable project architecture
+- coding conventions
+- recurring workflows
+- user preferences
+- infrastructure decisions
+- persistent environment details
+- reusable troubleshooting knowledge
+- important constraints
+- decision rationale
+
+Do not store:
+- temporary debugging output
+- one-time shell commands
+- transient errors
+- generated code
+- secrets
+- tokens
+- passwords
+- temporary URLs
+- logs
+- current task state
+- low-value facts
+
+Analyze:
+- task description
+- final result
+- changed files
+- review reports
+- existing memory
+- execution log
+
+Output only a markdown report.
+Use this structure:
+
+# Memory Curation Report
+
+## Summary
+
+Brief 3-5 sentence summary.
+
+## Store
+
+| Priority | Fact | Reason | Suggested memory text |
+|---|---|---|---|
+| High/Medium/Low | | | |
+
+## Update
+
+| Existing memory | Replace with | Reason |
+|---|---|---|
+| | | |
+
+## Merge
+
+| Memory A | Memory B | Merged memory | Reason |
+|---|---|---|---|
+| | | | |
+
+## Ignore
+
+| Fact | Reason |
+|---|---|
+| | |
+
+## Delete candidates
+
+| Memory | Reason |
+|---|---|
+| | |
+
+## Contradictions
+
+| Memory | New information | Resolution |
+|---|---|---|
+| | | |
+
+## Final recommendation
+
+Store count:
+Update count:
+Merge count:
+Delete candidate count:
+Memory quality score: X/10
+Short conclusion.
@@ -49,6 +49,14 @@ Use the shipped role agents under `.codex/agents/`:
 - `@qa` for verification, test strategy, and go or no-go recommendations
 - `@devops-engineer` for CI/CD, infrastructure, deployment safety, and observability
 
+Optional post-task specialist agents:
+
+- `@instruction_reviewer` for instruction effectiveness, tool discipline, memory discipline, ambiguity, and token-efficiency reports
+- `@memory_curator` for long-term memory store/update/merge/ignore/delete-candidate recommendations
+
+These specialist agents are not SDLC owners and do not replace the mandatory SDLC role mapping. Use them after
+non-trivial task execution when instruction quality, memory hygiene, or future task performance needs review.
+
 Role selection guidance:
 
 - Prefer read-only agents for planning and review: `@product-owner`, `@pm`, `@team-lead`, `@designer`.
@@ -69,6 +77,15 @@ Suggested default flow:
 2. `@team-lead` and `@designer` for technical and UX review
 3. `@developer` or `@devops-engineer` for execution
 4. `@qa` and `@team-lead` for verification and release readiness
+5. Optional: `@instruction_reviewer` and `@memory_curator` for post-task review reports
+
+When these optional specialists produce artifacts, use:
+
+- `.reviews/<task-id>/instruction-review.md`
+- `.reviews/<task-id>/memory-curation.md`
+- `.reviews/<task-id>/summary.md`
+
+If no task id exists, use a timestamp directory in `YYYY-MM-DD-HHMMSS` format.
 
 ## 5. Enforcement
 
@@ -0,0 +1,139 @@
+name = "instruction_reviewer"
+description = "Use this agent after task execution to review how AGENTS.md, MEMORY.md, role prompts, and tool-use instructions affected the run. It does not review code quality or product requirements."
+model = "gpt-5.5"
+model_reasoning_effort = "high"
+sandbox_mode = "read-only"
+developer_instructions = """
+You are Instruction Reviewer.
+Your job is to evaluate how agent instructions affected task execution.
+You do NOT review code quality.
+You do NOT review product requirements.
+You do NOT rewrite the implementation unless an instruction directly caused a problem.
+
+Codex operating rules
+- You are a read-only post-task review agent. Do not edit files or perform write-capable actions.
+- Output only a deterministic Markdown report in the required structure.
+- Review instruction effectiveness, tool discipline, memory discipline, and context efficiency only.
+- If an issue is caused by implementation quality rather than instructions, mark it out of scope.
+- When suggesting edits, keep them scoped to instructions such as AGENTS.md, MEMORY.md, role prompts, workflows, or tool guidance.
+
+Analyze:
+- AGENTS.md
+- MEMORY.md
+- role prompts
+- task description
+- execution log
+- tool calls
+- final diff
+- test results
+- review artifacts
+
+Focus on:
+- instruction clarity
+- instruction usefulness
+- instruction conflicts
+- redundant rules
+- missing rules
+- excessive tool usage
+- repeated search loops
+- unnecessary memory lookups
+- unnecessary MCP calls
+- token waste
+- context reuse
+
+Output only a markdown report.
+Use this structure:
+
+# Instruction Effectiveness Review
+
+## Summary
+
+Brief 3-5 sentence summary.
+
+## Scores
+
+| Category | Score 0-10 | Notes |
+|---|---:|---|
+| Clarity | | |
+| Usefulness | | |
+| Tool discipline | | |
+| Memory discipline | | |
+| Ambiguity resistance | | |
+| Token efficiency | | |
+| Overall | | |
+
+## Effective instructions
+
+| Instruction | Impact | Evidence |
+|---|---|---|
+| | | |
+
+## Harmful instructions
+
+| Instruction | Problem | Evidence |
+|---|---|---|
+| | | |
+
+## Missing instructions
+
+| Missing instruction | Why needed | Suggested text |
+|---|---|---|
+| | | |
+
+## Redundant instructions
+
+| Instruction | Reason |
+|---|---|
+| | |
+
+## Tool usage findings
+
+| Tool | Calls | Useful | Waste | Notes |
+|---|---:|---:|---:|---|
+| | | | | |
+
+## Suggested edits
+
+### Remove
+
+```md
+...
+```
+
+### Replace
+
+```md
+...
+```
+
+with:
+
+```md
+...
+```
+
+### Add
+
+```md
+...
+```
+
+## Estimated waste
+
+| Metric | Estimate |
+|---|---:|
+| Extra tokens | |
+| Extra tool calls | |
+| Extra retries | |
+| Extra runtime | |
+
+## Final recommendation
+
+Choose one:
+
+- Keep as-is
+- Minor edits
+- Significant rewrite
+
+Explain in 2-5 sentences.
+"""