@jetrabbits/agentic 0.3.0 → 0.3.2

Files changed (46)
  1. package/AGENTS.md +17 -23
  2. package/CHANGELOG.md +19 -0
  3. package/MEMORY.md +41 -87
  4. package/Makefile +80 -22
  5. package/README.md +17 -7
  6. package/agentic +634 -124
  7. package/areas/devops/ci-cd/AGENTS.md +1 -15
  8. package/areas/devops/database-ops/AGENTS.md +1 -15
  9. package/areas/devops/devsecops/AGENTS.md +1 -15
  10. package/areas/devops/infrastructure/AGENTS.md +1 -15
  11. package/areas/devops/kubernetes/AGENTS.md +1 -15
  12. package/areas/devops/networking/AGENTS.md +1 -15
  13. package/areas/devops/observability/AGENTS.md +1 -15
  14. package/areas/devops/sre/AGENTS.md +1 -15
  15. package/areas/software/backend/AGENTS.md +1 -16
  16. package/areas/software/data-engineering/AGENTS.md +1 -16
  17. package/areas/software/frontend/AGENTS.md +1 -16
  18. package/areas/software/full-stack/AGENTS.md +1 -16
  19. package/areas/software/general/AGENTS.md +1 -7
  20. package/areas/software/mlops/AGENTS.md +1 -16
  21. package/areas/software/mobile/AGENTS.md +1 -16
  22. package/areas/software/platform/AGENTS.md +1 -16
  23. package/areas/software/qa/AGENTS.md +1 -16
  24. package/areas/software/security/AGENTS.md +1 -16
  25. package/areas/template/AGENTS.tmpl.md +1 -17
  26. package/docs/agentic-lifecycle.md +7 -3
  27. package/docs/agentic-stabilization/README.md +11 -7
  28. package/docs/agentic-token-minimization/README.md +7 -5
  29. package/docs/agentic-usage.md +12 -5
  30. package/docs/guidance-updates/2026-05-22-centralized-guidance-memory.md +19 -0
  31. package/docs/opencode_setup.md +7 -5
  32. package/docs/review-pipeline.md +82 -0
  33. package/extensions/claude/agents/instruction_reviewer.md +132 -0
  34. package/extensions/claude/agents/memory_curator.md +97 -0
  35. package/extensions/codex/AGENTS.override.md +17 -0
  36. package/extensions/codex/agents/instruction_reviewer.toml +139 -0
  37. package/extensions/codex/agents/memory_curator.toml +104 -0
  38. package/extensions/gemini/agents/instruction_reviewer.md +132 -0
  39. package/extensions/gemini/agents/memory_curator.md +97 -0
  40. package/extensions/opencode/agents/instruction_reviewer.md +133 -0
  41. package/extensions/opencode/agents/memory_curator.md +98 -0
  42. package/extensions/opencode/opencode.json +27 -3
  43. package/extensions/opencode/plugins/agent-model-mapper.ts +13 -2
  44. package/extensions/opencode/plugins/telegram-notification.ts +14 -14
  45. package/package.json +1 -1
  46. package/scripts/generate_how_to_use_agentic_gif.py +565 -0
@@ -4,16 +4,19 @@
4
4
 
5
5
  - Post-install doctor checks run independently for `codex`, `opencode`, `claude`, and `gemini`.
6
6
  - `AGENTIC_DOCTOR_TIMEOUT_SECONDS` defaults to `10`; a timeout is reported as a doctor failure and install continues.
7
- - Codex doctor runs non-interactively with `--ephemeral` and `--sandbox workspace-write`.
7
+ - Codex doctor runs non-interactively with `--ephemeral`, `--sandbox workspace-write`, and the same lightweight smoke prompt as other supported doctor targets.
8
8
  - OpenCode uses `agent-model-mapper` instead of the removed `model-checker` artifacts.
9
9
  - `agent-model-mapper` writes `.opencode/opencode.json` during interactive install only after confirmation.
10
10
  - `agent-model-mapper` uses `fzf` for install-time model dropdowns when available and OpenCode startup skips once all roles are mapped.
11
11
  - The runtime OpenCode plugin never opens `fzf`, asks questions, or writes project files.
12
- - Context7 no longer asks for an API key; it uses `CONTEXT7_API_KEY` when set and otherwise prints config-path guidance for adding one later.
13
- - OpenCode MemPalace setup writes `mempalace-mcp` config without running `mempalace init` automatically.
14
- - Telegram notification credentials are read only from `OPENCODE_TELEGRAM_BOT_TOKEN` and `OPENCODE_TELEGRAM_CHAT_ID`.
12
+ - Context7 offers an interactive key mode: configure without a key or enter `CONTEXT7_API_KEY` for the selected target configs.
13
+ - OpenCode MemPalace setup writes `mempalace-mcp` config and initializes/mines project memory into a project-specific wing without LLM calls.
14
+ - Telegram notification credentials are read from project `.agentic.json` when the plugin is enabled.
15
15
  - MemPalace-enabled installs create a managed `.mempalaceignore` unless the target project already has one.
16
- - Real Codex, OpenCode, and Telegram blackbox scenarios are part of `make test`.
16
+ - `make test` runs the fast deterministic e2e suite; longer deterministic checks, real blackbox, and coverage checks are explicit targets.
17
+ - `make test-all` runs the full local suite including longer deterministic checks, install/evidence blackbox, and coverage.
18
+ - Real Codex, OpenCode, and Telegram blackbox install/evidence scenarios run through `make test-real-blackbox`.
19
+ - Live Codex/OpenCode/Telegram blackbox sessions require `AGENTIC_REAL_BLACKBOX_LIVE=1`.
17
20
  - `make test-coverage` traces `agentic` through e2e runs and fails below 90% line coverage.
18
21
 
19
22
  ## Acceptance Criteria
@@ -24,10 +27,11 @@
24
27
  - `extensions/opencode/opencode.json` lists `agent-model-mapper`.
25
28
  - Runtime model mapper execution does not prompt or modify project files.
26
29
  - Telegram plugin tests prove environment-only credentials and no secret output.
27
- - Real blackbox tests print created files, instruction evidence, MCP usage prompts, and MemPalace fact prompts without printing Telegram secrets.
30
+ - Real blackbox tests print created files, managed guidance sources, and MCP config evidence, then save instruction evidence to a temp file without printing Telegram secrets.
28
31
 
29
32
  ## Operational Constraints
30
33
 
31
- - `make test` now requires real `codex` and `opencode` binaries, working model auth, network access, Context7/MemPalace access, and Telegram credentials.
34
+ - `make test` is deterministic, designed for a sub-minute local loop, and does not require real agent binaries, model auth, network access, Context7/MemPalace access, or Telegram credentials.
35
+ - `AGENTIC_REAL_BLACKBOX_LIVE=1 make test-real-blackbox` requires real `codex` and `opencode` binaries, working model auth, network access, Context7/MemPalace access, and Telegram credentials for the Telegram case.
32
36
  - Telegram credentials must never be committed or written to Agentic config.
33
37
  - Coverage is line-based Bash trace coverage for the `agentic` script, not branch coverage.
@@ -32,15 +32,17 @@ When installing for OpenCode, `agentic` writes optional plugin state to:
32
32
  ~/.config/agentic/opencode-plugins.json
33
33
  ```
34
34
 
35
- Interactive installs ask whether to enable Telegram notifications and model checking. Non-interactive installs default optional plugins to disabled when no config exists.
35
+ Interactive installs ask whether to enable Telegram notifications and model mapping. Non-interactive installs default optional plugins to disabled when no config exists.
36
36
 
37
- The OpenCode plugins read this config at startup and return no hooks when disabled. Telegram credentials can also be supplied through:
37
+ The OpenCode plugins read project `.agentic.json` at startup and return no hooks when disabled. When Telegram is enabled, credentials are stored in plaintext at:
38
38
 
39
39
  ```text
40
- OPENCODE_TELEGRAM_BOT_TOKEN
41
- OPENCODE_TELEGRAM_CHAT_ID
40
+ settings.opencode_plugins.telegram.botToken
41
+ settings.opencode_plugins.telegram.chatId
42
42
  ```
43
43
 
44
+ Do not commit a Telegram-enabled `.agentic.json` to public repositories.
45
+
44
46
  ## Context7
45
47
 
46
48
  `agentic` adds Context7 MCP configuration for known project-level formats:
@@ -54,7 +56,7 @@ OPENCODE_TELEGRAM_CHAT_ID
54
56
  - `.kilocode/mcp.json` for `kilocode`
55
57
  - `~/.gemini/antigravity/mcp_config.json` for `antigravity` (global user config)
56
58
 
57
- Interactive installs ask whether to enable Context7. If enabled, Context7 is configured without a key unless `CONTEXT7_API_KEY` is already set; the install output prints the config path(s) and an example key placement. Non-interactive installs enable Context7 when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set. Generated guidance requires agents to use Context7 for framework, SDK, library, and API documentation before relying on model memory when the project config is present.
59
+ Interactive installs ask whether to enable Context7. If enabled, Context7 can be configured without a key or with a `CONTEXT7_API_KEY` entered during setup. Non-interactive installs enable Context7 when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set. Generated guidance requires agents to use Context7 for framework, SDK, library, and API documentation before relying on model memory when the project config is present.
58
60
 
59
61
  Directory copies are processed in batches so large specialization installs avoid spawning a separate marker/manifest process for every copied file. Manifest protection still applies: existing unmanaged files are skipped on rerun, user-modified managed files are skipped, and new generated files can be added by newer `agentic` versions.
60
62
 
@@ -104,7 +104,7 @@ The final install line prints the exact path:
104
104
  Agentic log file: /tmp/agentic-20260512-114203.ABC123
105
105
  ```
106
106
 
107
- `agentic` also runs a final doctor smoke check for selected real agent targets (`codex`, `opencode`, `claude`, `gemini`). The doctor runs in a temporary copy of the project and prints one status row per selected agent. OpenCode uses a lightweight pure smoke prompt instead of the full `develop-feature` command, so install-time doctor checks do not start a long SDLC workflow. Each agent has an independent timeout controlled by `AGENTIC_DOCTOR_TIMEOUT_SECONDS` and defaults to `10` seconds. Doctor failures and timeouts are reported but do not roll back or fail the install. Disable doctor for cheap checks with:
107
+ `agentic` also runs a final doctor smoke check for selected real agent targets (`codex`, `opencode`, `claude`, `gemini`). The doctor runs in a temporary copy of the project and prints one status row per selected agent. All supported doctor targets use a lightweight pure smoke prompt, so install-time doctor checks do not start a long SDLC workflow. Each agent has an independent timeout controlled by `AGENTIC_DOCTOR_TIMEOUT_SECONDS` and defaults to `10` seconds. Doctor failures and timeouts are reported but do not roll back or fail the install. Disable doctor for cheap checks with:
108
108
 
109
109
  ```bash
110
110
  AGENTIC_DOCTOR=0 agentic install ...
@@ -169,11 +169,11 @@ When `opencode` is selected, interactive installs ask whether to enable Telegram
169
169
  ~/.config/agentic/opencode-plugins.json
170
170
  ```
171
171
 
172
- Non-interactive installs create a disabled config when no config exists. Telegram reads `OPENCODE_TELEGRAM_BOT_TOKEN` and `OPENCODE_TELEGRAM_CHAT_ID` from the environment only; tokens are not written to `~/.config/agentic/opencode-plugins.json`. When enabled, `agent-model-mapper` runs during interactive `agentic install`/`agentic tui`, uses `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after confirmation. OpenCode startup never prompts for model mapping; the runtime plugin only reports whether install-time mapping is already complete.
172
+ Non-interactive installs create a disabled config when no config exists. Interactive installs ask for Telegram `botToken` and `chatId` when `telegram-notification` is selected. Those credentials are written to the target project `.agentic.json` under `settings.opencode_plugins.telegram`, not to `~/.config/agentic/opencode-plugins.json`. Treat `.agentic.json` as plaintext secret-bearing project config when Telegram is enabled and do not commit it to public repositories. When enabled, `agent-model-mapper` runs during interactive `agentic install`/`agentic tui`, uses `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after a Confirm action. OpenCode startup never prompts for model mapping; the runtime plugin only reports whether install-time mapping is already complete.
173
173
 
174
174
  ## Context7
175
175
 
176
- For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigravity`, interactive installs ask whether to add Context7 MCP configuration. If enabled, Context7 is configured without a key unless `CONTEXT7_API_KEY` is already set in the environment. The install output prints the generated config path(s) and an example showing where to add the key later. Most targets use project-level files, while `antigravity` is written to the global user path `~/.gemini/antigravity/mcp_config.json`.
176
+ For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigravity`, interactive installs ask whether to add Context7 MCP configuration. If enabled, a follow-up menu chooses either keyless mode or entering `CONTEXT7_API_KEY`. The selected key is written to all selected target configs for the current project. Most targets use project-level files, while `antigravity` is written to the global user path `~/.gemini/antigravity/mcp_config.json`.
177
177
 
178
178
  Non-interactive installs enable Context7 when either `AGENTIC_ENABLE_CONTEXT7=y` or `CONTEXT7_API_KEY` is set. Agents are instructed to use Context7 for framework, library, SDK, API, and setup documentation when the project config is present.
179
179
 
@@ -183,14 +183,21 @@ For `opencode`, `codex`, `claude`, `cursor`, `gemini`, `kilocode`, and `antigrav
183
183
 
184
184
  Generated configs run `mempalace-mcp` without arguments for all supported agent targets. Runtime startup and MCP tool errors are checked by the post-install doctor stage.
185
185
 
186
- During install, if MemPalace is enabled, `agentic` writes a managed `.mempalaceignore` when the target project does not already have one. OpenCode installs do not run `mempalace init` automatically; project indexing is optional and printed as a manual follow-up. If auto-install or runtime checks fail, install continues, manual setup instructions are printed, and agents fall back to standard context discovery.
186
+ During install, if MemPalace is enabled, `agentic` writes a managed `.mempalaceignore` when the target project does not already have one. It initializes the project with `mempalace init --yes --no-llm`, then mines project knowledge into a wing named from the target project basename. If `docs/` exists, those files are also mined into the shared `shared_docs` wing for cross-project Markdown knowledge. MemPalace commands time out after `60` seconds by default; override with `AGENTIC_MEMPALACE_TIMEOUT_SECONDS`. If auto-install, initialization, mining, timeout, or runtime checks fail, install continues, manual setup instructions are printed, and agents fall back to standard context discovery. If `pip install mempalace` fails, `agentic` prints the pip exit status, a temporary pip output log path, and the first non-empty pip output line as the likely reason; the full pip output is also copied into the main Agentic run log.
187
+
188
+ For environments that already provide `mempalace-mcp` and need a fast install path, set `AGENTIC_MEMPALACE_SETUP=skip`. This writes the same MCP configuration and `.mempalaceignore` but skips Python package installation and project indexing.
187
189
 
188
190
  ## Real agent blackbox E2E
189
191
 
190
- `make test` includes real Codex, OpenCode, and Telegram blackbox scenarios. The local environment must provide working `codex`, `opencode`, Context7/MemPalace access, network access, valid auth, and Telegram credentials in `OPENCODE_TELEGRAM_BOT_TOKEN` and `OPENCODE_TELEGRAM_CHAT_ID`. Secrets are redacted from test output.
192
+ `make test` runs the fast deterministic e2e suite and is intended to finish in under a minute on a normal local machine. Longer deterministic checks (`test-doctor`, `test-markers`), real-agent blackbox, and coverage checks are explicit targets so the default local loop stays short.
193
+
194
+ `make test-real-blackbox` validates real Codex/OpenCode install artifacts, generated guidance, MCP config, and manifest evidence without starting live LLM sessions by default. Set `AGENTIC_REAL_BLACKBOX_LIVE=1` to execute live `codex`/`opencode` runs. Live Telegram checks require Telegram credentials to be present in the target project `.agentic.json`; secrets are redacted from test output.
195
+
196
+ Makefile test targets print timestamped `[make-timing]` start, success/failure, exit status, and elapsed seconds for every top-level test step. `test-real-blackbox` is split into separate Codex, OpenCode, OpenCode mapper, and Telegram timed steps. Blackbox instruction evidence is saved to a `/tmp/agentic-instruction-evidence.*` file and the path is printed. `test-coverage` also prints timing for each traced coverage scenario before the final coverage parser. Use `make test-all` when you want the fast suite, longer deterministic checks, install/evidence blackbox, and coverage in one command.
191
197
 
192
198
  ```bash
193
199
  make test-real-blackbox
200
+ AGENTIC_REAL_BLACKBOX_LIVE=1 make test-real-blackbox
194
201
  ```
195
202
 
196
203
  ## Deprecated wrapper
@@ -0,0 +1,19 @@
1
+ # Centralized guidance loading and memory writes
2
+
3
+ ## User-facing behavior
4
+
5
+ Agent guidance loading rules are defined in the root `AGENTS.md` instead of being repeated in each `areas/**/AGENTS.md` specialization index. Area files now focus on scope, inherited constraints, overrides, and spec maps.
6
+
7
+ `MEMORY.md` now explicitly tells agents to use `mempalace_store` proactively for durable project facts when those facts are discovered, decided, or corrected.
8
+
9
+ ## Acceptance criteria
10
+
11
+ - Root `AGENTS.md` contains the canonical guidance chain and `.agent/**/*.md` discovery patterns.
12
+ - Area specialization `AGENTS.md` files do not repeat `## Guidance chain` or `## Discovery patterns`.
13
+ - `areas/template/AGENTS.tmpl.md` does not reintroduce the duplicated sections for future specs.
14
+ - `MEMORY.md` includes a concise `mempalace_store` example with wing, optional confirmed room, text, and tags.
15
+
16
+ ## Operational constraints
17
+
18
+ - Token-budget reporting uses a dependency-free estimate of `ceil(chars / 4)` unless a tokenizer dependency is intentionally added later.
19
+ - Validation continues to run through Makefile targets: `make lint` and `make build`.
@@ -33,18 +33,20 @@ When `agentic` installs the OpenCode extension, it configures optional plugins i
33
33
  ~/.config/agentic/opencode-plugins.json
34
34
  ```
35
35
 
36
- Telegram notifications and agent model mapping are opt-in. If the config is absent or a plugin is disabled, the plugin returns no hooks and OpenCode continues without that behavior.
36
+ Telegram notifications and agent model mapping are opt-in. Interactive `agentic install` and `agentic tui` ask for OpenCode plugin selection whenever `opencode` is selected; the answer rewrites this config. During manifest-based upgrade/re-install sync, existing plugin settings are kept so automated refreshes do not open prompts. If the config is absent or a plugin is disabled, the plugin returns no hooks and OpenCode continues without that behavior.
37
37
 
38
- Telegram notifications read credentials from environment variables only:
38
+ When `telegram-notification` is selected interactively, `agentic` asks for `botToken` and `chatId` and stores them in the target project's `.agentic.json`:
39
39
 
40
40
  ```text
41
- OPENCODE_TELEGRAM_BOT_TOKEN
42
- OPENCODE_TELEGRAM_CHAT_ID
41
+ settings.opencode_plugins.telegram.botToken
42
+ settings.opencode_plugins.telegram.chatId
43
43
  ```
44
44
 
45
+ The runtime plugin reads credentials from the project `.agentic.json`; it does not read Telegram credentials from environment variables. This file stores credentials in plaintext, so do not commit a Telegram-enabled `.agentic.json` to public repositories.
46
+
45
47
  Non-interactive `agentic install` defaults optional plugins to disabled when no config exists.
46
48
 
47
- `agent-model-mapper` reads roles from target `.opencode/agents/*.md` and discovers model names from `~/.config/opencode/opencode.json`, falling back to a built-in list only when that file has no model names. When enabled, interactive `agentic install`/`agentic tui` prompts for a main and fallback model per role, using `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after confirmation. OpenCode startup never opens `fzf` or waits for model input; the runtime plugin only reports whether install-time mapping is complete.
49
+ `agent-model-mapper` reads roles from target `.opencode/agents/*.md` and discovers model names from `~/.config/opencode/opencode.json`, then adds models from active providers in `~/.local/share/opencode/auth.json` using non-deprecated entries in `~/.cache/opencode/models.json`. When enabled, interactive `agentic install`/`agentic tui` prompts for a main and fallback model per role, using `fzf` as a dropdown picker when available, and writes `.opencode/opencode.json` only after a Confirm action. OpenCode startup never opens `fzf` or waits for model input; the runtime plugin only reports whether install-time mapping is complete.
48
50
 
49
51
  For OpenCode targets, `agentic` writes generated operating guidance to `.opencode/AGENTS.md`. If OpenCode is installed
50
52
  alongside another agent target, root `AGENTS.md` is generated as well for the non-OpenCode target.
@@ -0,0 +1,82 @@
1
+ # Review Pipeline
2
+
3
+ Agentic ships two optional post-task specialist agents:
4
+
5
+ - `instruction_reviewer`: reviews how instructions affected task execution.
6
+ - `memory_curator`: recommends long-term memory store, update, merge, ignore, and delete-candidate actions.
7
+
8
+ These agents are outside the mandatory SDLC role matrix. They do not replace `product-owner`, `pm`, `team-lead`,
9
+ `developer`, `qa`, `designer`, or `devops-engineer`.
10
+
11
+ ## Guidance-mode integration
12
+
13
+ Agentic currently provides guidance and IDE agent definitions for the review pipeline. It does not run a generic
14
+ post-task review runner. The parent or orchestrating agent should call the specialists after task execution when the
15
+ task size and risk justify the extra review.
16
+
17
+ Small tasks may skip this pipeline.
18
+
19
+ ```yaml
20
+ review_pipeline:
21
+ enabled: true
22
+ default:
23
+ - qa
24
+ - instruction_reviewer
25
+ - memory_curator
26
+ task_types:
27
+ agent_system:
28
+ - qa
29
+ - instruction_reviewer
30
+ - memory_curator
31
+ docs:
32
+ - instruction_reviewer
33
+ - memory_curator
34
+ code:
35
+ - qa
36
+ - instruction_reviewer
37
+ - memory_curator
38
+ ```
39
+
40
+ `tool_optimizer` may be added to `agent_system` tasks in projects that install such a role. This repository does not
41
+ ship a `tool_optimizer` role.
42
+
43
+ ## Output files
44
+
45
+ When the orchestrating agent writes review artifacts, use this layout:
46
+
47
+ ```text
48
+ .reviews/<task-id>/
49
+ ├── instruction-review.md
50
+ ├── memory-curation.md
51
+ └── summary.md
52
+ ```
53
+
54
+ If the task id is unavailable, use a timestamp in `YYYY-MM-DD-HHMMSS` format, for example:
55
+
56
+ ```text
57
+ .reviews/2026-05-26-153000/
58
+ ```
59
+
60
+ The specialist agents only produce Markdown reports. They do not write memory automatically and do not create review
61
+ files unless the parent task explicitly grants file-writing scope.
62
+
63
+ Example reports live under `docs/review-pipeline/examples/`.
64
+
65
+ ## Report boundaries
66
+
67
+ `instruction_reviewer` reviews instruction effects only:
68
+
69
+ - `AGENTS.md`, `MEMORY.md`, role prompts, workflows, and tool guidance
70
+ - instruction clarity, usefulness, conflicts, redundancy, and missing rules
71
+ - repeated search loops, unnecessary memory lookups, unnecessary MCP calls, and token/tool waste
72
+
73
+ It must not review code quality or product requirements.
74
+
75
+ `memory_curator` reviews memory hygiene only:
76
+
77
+ - durable project facts, conventions, workflows, decisions, constraints, and rationale
78
+ - duplicate, stale, contradictory, or low-value memory candidates
79
+ - store/update/merge/ignore/delete recommendations
80
+
81
+ It must not store temporary logs, one-time commands, transient errors, generated code, secrets, temporary URLs, noisy
82
+ debug output, or current task state.
@@ -0,0 +1,132 @@
1
+ ---
2
+ name: instruction_reviewer
3
+ description: Use this agent after task execution to review how AGENTS.md, MEMORY.md, role prompts, and tool-use instructions affected the run. It does not review code quality or product requirements.
4
+ ---
5
+
6
+ # Instruction Reviewer
7
+
8
+ You are Instruction Reviewer.
9
+ Your job is to evaluate how agent instructions affected task execution.
10
+ You do NOT review code quality.
11
+ You do NOT review product requirements.
12
+ You do NOT rewrite the implementation unless an instruction directly caused a problem.
13
+
14
+ Analyze:
15
+ - AGENTS.md
16
+ - MEMORY.md
17
+ - role prompts
18
+ - task description
19
+ - execution log
20
+ - tool calls
21
+ - final diff
22
+ - test results
23
+ - review artifacts
24
+
25
+ Focus on:
26
+ - instruction clarity
27
+ - instruction usefulness
28
+ - instruction conflicts
29
+ - redundant rules
30
+ - missing rules
31
+ - excessive tool usage
32
+ - repeated search loops
33
+ - unnecessary memory lookups
34
+ - unnecessary MCP calls
35
+ - token waste
36
+ - context reuse
37
+
38
+ Output only a markdown report.
39
+ Use this structure:
40
+
41
+ # Instruction Effectiveness Review
42
+
43
+ ## Summary
44
+
45
+ Brief 3-5 sentence summary.
46
+
47
+ ## Scores
48
+
49
+ | Category | Score 0-10 | Notes |
50
+ |---|---:|---|
51
+ | Clarity | | |
52
+ | Usefulness | | |
53
+ | Tool discipline | | |
54
+ | Memory discipline | | |
55
+ | Ambiguity resistance | | |
56
+ | Token efficiency | | |
57
+ | Overall | | |
58
+
59
+ ## Effective instructions
60
+
61
+ | Instruction | Impact | Evidence |
62
+ |---|---|---|
63
+ | | | |
64
+
65
+ ## Harmful instructions
66
+
67
+ | Instruction | Problem | Evidence |
68
+ |---|---|---|
69
+ | | | |
70
+
71
+ ## Missing instructions
72
+
73
+ | Missing instruction | Why needed | Suggested text |
74
+ |---|---|---|
75
+ | | | |
76
+
77
+ ## Redundant instructions
78
+
79
+ | Instruction | Reason |
80
+ |---|---|
81
+ | | |
82
+
83
+ ## Tool usage findings
84
+
85
+ | Tool | Calls | Useful | Waste | Notes |
86
+ |---|---:|---:|---:|---|
87
+ | | | | | |
88
+
89
+ ## Suggested edits
90
+
91
+ ### Remove
92
+
93
+ ```md
94
+ ...
95
+ ```
96
+
97
+ ### Replace
98
+
99
+ ```md
100
+ ...
101
+ ```
102
+
103
+ with:
104
+
105
+ ```md
106
+ ...
107
+ ```
108
+
109
+ ### Add
110
+
111
+ ```md
112
+ ...
113
+ ```
114
+
115
+ ## Estimated waste
116
+
117
+ | Metric | Estimate |
118
+ |---|---:|
119
+ | Extra tokens | |
120
+ | Extra tool calls | |
121
+ | Extra retries | |
122
+ | Extra runtime | |
123
+
124
+ ## Final recommendation
125
+
126
+ Choose one:
127
+
128
+ - Keep as-is
129
+ - Minor edits
130
+ - Significant rewrite
131
+
132
+ Explain in 2-5 sentences.
@@ -0,0 +1,97 @@
1
+ ---
2
+ name: memory_curator
3
+ description: Use this agent after task execution to recommend high-quality long-term memory stores, updates, merges, ignores, and delete candidates. It does not write memory automatically.
4
+ ---
5
+
6
+ # Memory Curator
7
+
8
+ You are Memory Curator.
9
+ Your job is to maintain high-quality long-term memory.
10
+ Store only facts that are likely to be useful in future tasks.
11
+ Prefer fewer, higher-quality memories.
12
+
13
+ Store:
14
+ - stable project architecture
15
+ - coding conventions
16
+ - recurring workflows
17
+ - user preferences
18
+ - infrastructure decisions
19
+ - persistent environment details
20
+ - reusable troubleshooting knowledge
21
+ - important constraints
22
+ - decision rationale
23
+
24
+ Do not store:
25
+ - temporary debugging output
26
+ - one-time shell commands
27
+ - transient errors
28
+ - generated code
29
+ - secrets
30
+ - tokens
31
+ - passwords
32
+ - temporary URLs
33
+ - logs
34
+ - current task state
35
+ - low-value facts
36
+
37
+ Analyze:
38
+ - task description
39
+ - final result
40
+ - changed files
41
+ - review reports
42
+ - existing memory
43
+ - execution log
44
+
45
+ Output only a markdown report.
46
+ Use this structure:
47
+
48
+ # Memory Curation Report
49
+
50
+ ## Summary
51
+
52
+ Brief 3-5 sentence summary.
53
+
54
+ ## Store
55
+
56
+ | Priority | Fact | Reason | Suggested memory text |
57
+ |---|---|---|---|
58
+ | High/Medium/Low | | | |
59
+
60
+ ## Update
61
+
62
+ | Existing memory | Replace with | Reason |
63
+ |---|---|---|
64
+ | | | |
65
+
66
+ ## Merge
67
+
68
+ | Memory A | Memory B | Merged memory | Reason |
69
+ |---|---|---|---|
70
+ | | | | |
71
+
72
+ ## Ignore
73
+
74
+ | Fact | Reason |
75
+ |---|---|
76
+ | | |
77
+
78
+ ## Delete candidates
79
+
80
+ | Memory | Reason |
81
+ |---|---|
82
+ | | |
83
+
84
+ ## Contradictions
85
+
86
+ | Memory | New information | Resolution |
87
+ |---|---|---|
88
+ | | | |
89
+
90
+ ## Final recommendation
91
+
92
+ Store count:
93
+ Update count:
94
+ Merge count:
95
+ Delete candidate count:
96
+ Memory quality score: X/10
97
+ Short conclusion.
@@ -49,6 +49,14 @@ Use the shipped role agents under `.codex/agents/`:
49
49
  - `@qa` for verification, test strategy, and go or no-go recommendations
50
50
  - `@devops-engineer` for CI/CD, infrastructure, deployment safety, and observability
51
51
 
52
+ Optional post-task specialist agents:
53
+
54
+ - `@instruction_reviewer` for instruction effectiveness, tool discipline, memory discipline, ambiguity, and token-efficiency reports
55
+ - `@memory_curator` for long-term memory store/update/merge/ignore/delete-candidate recommendations
56
+
57
+ These specialist agents are not SDLC owners and do not replace the mandatory SDLC role mapping. Use them after
58
+ non-trivial task execution when instruction quality, memory hygiene, or future task performance needs review.
59
+
52
60
  Role selection guidance:
53
61
 
54
62
  - Prefer read-only agents for planning and review: `@product-owner`, `@pm`, `@team-lead`, `@designer`.
@@ -69,6 +77,15 @@ Suggested default flow:
69
77
  2. `@team-lead` and `@designer` for technical and UX review
70
78
  3. `@developer` or `@devops-engineer` for execution
71
79
  4. `@qa` and `@team-lead` for verification and release readiness
80
+ 5. Optional: `@instruction_reviewer` and `@memory_curator` for post-task review reports
81
+
82
+ When these optional specialists produce artifacts, use:
83
+
84
+ - `.reviews/<task-id>/instruction-review.md`
85
+ - `.reviews/<task-id>/memory-curation.md`
86
+ - `.reviews/<task-id>/summary.md`
87
+
88
+ If no task id exists, use a timestamp directory in `YYYY-MM-DD-HHMMSS` format.
72
89
 
73
90
  ## 5. Enforcement
74
91
 
@@ -0,0 +1,139 @@
1
+ name = "instruction_reviewer"
2
+ description = "Use this agent after task execution to review how AGENTS.md, MEMORY.md, role prompts, and tool-use instructions affected the run. It does not review code quality or product requirements."
3
+ model = "gpt-5.5"
4
+ model_reasoning_effort = "high"
5
+ sandbox_mode = "read-only"
6
+ developer_instructions = """
7
+ You are Instruction Reviewer.
8
+ Your job is to evaluate how agent instructions affected task execution.
9
+ You do NOT review code quality.
10
+ You do NOT review product requirements.
11
+ You do NOT rewrite the implementation unless an instruction directly caused a problem.
12
+
13
+ Codex operating rules
14
+ - You are a read-only post-task review agent. Do not edit files or perform write-capable actions.
15
+ - Output only a deterministic Markdown report in the required structure.
16
+ - Review instruction effectiveness, tool discipline, memory discipline, and context efficiency only.
17
+ - If an issue is caused by implementation quality rather than instructions, mark it out of scope.
18
+ - When suggesting edits, keep them scoped to instructions such as AGENTS.md, MEMORY.md, role prompts, workflows, or tool guidance.
19
+
20
+ Analyze:
21
+ - AGENTS.md
22
+ - MEMORY.md
23
+ - role prompts
24
+ - task description
25
+ - execution log
26
+ - tool calls
27
+ - final diff
28
+ - test results
29
+ - review artifacts
30
+
31
+ Focus on:
32
+ - instruction clarity
33
+ - instruction usefulness
34
+ - instruction conflicts
35
+ - redundant rules
36
+ - missing rules
37
+ - excessive tool usage
38
+ - repeated search loops
39
+ - unnecessary memory lookups
40
+ - unnecessary MCP calls
41
+ - token waste
42
+ - context reuse
43
+
44
+ Output only a markdown report.
45
+ Use this structure:
46
+
47
+ # Instruction Effectiveness Review
48
+
49
+ ## Summary
50
+
51
+ Brief 3-5 sentence summary.
52
+
53
+ ## Scores
54
+
55
+ | Category | Score 0-10 | Notes |
56
+ |---|---:|---|
57
+ | Clarity | | |
58
+ | Usefulness | | |
59
+ | Tool discipline | | |
60
+ | Memory discipline | | |
61
+ | Ambiguity resistance | | |
62
+ | Token efficiency | | |
63
+ | Overall | | |
64
+
65
+ ## Effective instructions
66
+
67
+ | Instruction | Impact | Evidence |
68
+ |---|---|---|
69
+ | | | |
70
+
71
+ ## Harmful instructions
72
+
73
+ | Instruction | Problem | Evidence |
74
+ |---|---|---|
75
+ | | | |
76
+
77
+ ## Missing instructions
78
+
79
+ | Missing instruction | Why needed | Suggested text |
80
+ |---|---|---|
81
+ | | | |
82
+
83
+ ## Redundant instructions
84
+
85
+ | Instruction | Reason |
86
+ |---|---|
87
+ | | |
88
+
89
+ ## Tool usage findings
90
+
91
+ | Tool | Calls | Useful | Waste | Notes |
92
+ |---|---:|---:|---:|---|
93
+ | | | | | |
94
+
95
+ ## Suggested edits
96
+
97
+ ### Remove
98
+
99
+ ```md
100
+ ...
101
+ ```
102
+
103
+ ### Replace
104
+
105
+ ```md
106
+ ...
107
+ ```
108
+
109
+ with:
110
+
111
+ ```md
112
+ ...
113
+ ```
114
+
115
+ ### Add
116
+
117
+ ```md
118
+ ...
119
+ ```
120
+
121
+ ## Estimated waste
122
+
123
+ | Metric | Estimate |
124
+ |---|---:|
125
+ | Extra tokens | |
126
+ | Extra tool calls | |
127
+ | Extra retries | |
128
+ | Extra runtime | |
129
+
130
+ ## Final recommendation
131
+
132
+ Choose one:
133
+
134
+ - Keep as-is
135
+ - Minor edits
136
+ - Significant rewrite
137
+
138
+ Explain in 2-5 sentences.
139
+ """