autonomous-coding-toolkit 1.0.0
- package/.claude-plugin/marketplace.json +22 -0
- package/.claude-plugin/plugin.json +13 -0
- package/LICENSE +21 -0
- package/Makefile +21 -0
- package/README.md +140 -0
- package/SECURITY.md +28 -0
- package/agents/bash-expert.md +113 -0
- package/agents/dependency-auditor.md +138 -0
- package/agents/integration-tester.md +120 -0
- package/agents/lesson-scanner.md +149 -0
- package/agents/python-expert.md +179 -0
- package/agents/service-monitor.md +141 -0
- package/agents/shell-expert.md +147 -0
- package/benchmarks/runner.sh +147 -0
- package/benchmarks/tasks/01-rest-endpoint/rubric.sh +29 -0
- package/benchmarks/tasks/01-rest-endpoint/task.md +17 -0
- package/benchmarks/tasks/02-refactor-module/task.md +8 -0
- package/benchmarks/tasks/03-fix-integration-bug/task.md +8 -0
- package/benchmarks/tasks/04-add-test-coverage/task.md +8 -0
- package/benchmarks/tasks/05-multi-file-feature/task.md +8 -0
- package/bin/act.js +238 -0
- package/commands/autocode.md +6 -0
- package/commands/cancel-ralph.md +18 -0
- package/commands/code-factory.md +53 -0
- package/commands/create-prd.md +55 -0
- package/commands/ralph-loop.md +18 -0
- package/commands/run-plan.md +117 -0
- package/commands/submit-lesson.md +122 -0
- package/docs/ARCHITECTURE.md +630 -0
- package/docs/CONTRIBUTING.md +125 -0
- package/docs/lessons/0001-bare-exception-swallowing.md +34 -0
- package/docs/lessons/0002-async-def-without-await.md +28 -0
- package/docs/lessons/0003-create-task-without-callback.md +28 -0
- package/docs/lessons/0004-hardcoded-test-counts.md +28 -0
- package/docs/lessons/0005-sqlite-without-closing.md +33 -0
- package/docs/lessons/0006-venv-pip-path.md +27 -0
- package/docs/lessons/0007-runner-state-self-rejection.md +35 -0
- package/docs/lessons/0008-quality-gate-blind-spot.md +33 -0
- package/docs/lessons/0009-parser-overcount-empty-batches.md +36 -0
- package/docs/lessons/0010-local-outside-function-bash.md +33 -0
- package/docs/lessons/0011-batch-tests-for-unimplemented-code.md +36 -0
- package/docs/lessons/0012-api-markdown-unescaped-chars.md +33 -0
- package/docs/lessons/0013-export-prefix-env-parsing.md +33 -0
- package/docs/lessons/0014-decorator-registry-import-side-effect.md +43 -0
- package/docs/lessons/0015-frontend-backend-schema-drift.md +43 -0
- package/docs/lessons/0016-event-driven-cold-start-seeding.md +44 -0
- package/docs/lessons/0017-copy-paste-logic-diverges.md +43 -0
- package/docs/lessons/0018-layer-passes-pipeline-broken.md +45 -0
- package/docs/lessons/0019-systemd-envfile-ignores-export.md +41 -0
- package/docs/lessons/0020-persist-state-incrementally.md +44 -0
- package/docs/lessons/0021-dual-axis-testing.md +48 -0
- package/docs/lessons/0022-jsx-factory-shadowing.md +43 -0
- package/docs/lessons/0023-static-analysis-spiral.md +51 -0
- package/docs/lessons/0024-shared-pipeline-implementation.md +55 -0
- package/docs/lessons/0025-defense-in-depth-all-entry-points.md +65 -0
- package/docs/lessons/0026-linter-no-rules-false-enforcement.md +54 -0
- package/docs/lessons/0027-jsx-silent-prop-drop.md +64 -0
- package/docs/lessons/0028-no-infrastructure-in-client-code.md +49 -0
- package/docs/lessons/0029-never-write-secrets-to-files.md +61 -0
- package/docs/lessons/0030-cache-merge-not-replace.md +62 -0
- package/docs/lessons/0031-verify-units-at-boundaries.md +66 -0
- package/docs/lessons/0032-module-lifecycle-subscribe-unsubscribe.md +89 -0
- package/docs/lessons/0033-async-iteration-mutable-snapshot.md +72 -0
- package/docs/lessons/0034-caller-missing-await-silent-discard.md +65 -0
- package/docs/lessons/0035-duplicate-registration-silent-overwrite.md +85 -0
- package/docs/lessons/0036-websocket-dirty-disconnect.md +33 -0
- package/docs/lessons/0037-parallel-agents-worktree-corruption.md +31 -0
- package/docs/lessons/0038-subscribe-no-stored-ref.md +36 -0
- package/docs/lessons/0039-fallback-or-default-hides-bugs.md +34 -0
- package/docs/lessons/0040-event-firehose-filter-first.md +36 -0
- package/docs/lessons/0041-ambiguous-base-dir-path-nesting.md +32 -0
- package/docs/lessons/0042-spec-compliance-insufficient.md +36 -0
- package/docs/lessons/0043-exact-count-extensible-collections.md +32 -0
- package/docs/lessons/0044-relative-file-deps-worktree.md +39 -0
- package/docs/lessons/0045-iterative-design-improvement.md +33 -0
- package/docs/lessons/0046-plan-assertion-math-bugs.md +38 -0
- package/docs/lessons/0047-pytest-single-threaded-default.md +37 -0
- package/docs/lessons/0048-integration-wiring-batch.md +40 -0
- package/docs/lessons/0049-ab-verification.md +41 -0
- package/docs/lessons/0050-editing-sourced-files-during-execution.md +33 -0
- package/docs/lessons/0051-infrastructure-fixes-cant-self-heal.md +30 -0
- package/docs/lessons/0052-uncommitted-changes-poison-quality-gates.md +31 -0
- package/docs/lessons/0053-jq-compact-flag-inconsistency.md +31 -0
- package/docs/lessons/0054-parser-matches-inside-code-blocks.md +30 -0
- package/docs/lessons/0055-agents-compensate-for-garbled-prompts.md +31 -0
- package/docs/lessons/0056-grep-count-exit-code-on-zero.md +42 -0
- package/docs/lessons/0057-new-artifacts-break-git-clean-gates.md +42 -0
- package/docs/lessons/0058-dead-config-keys-never-consumed.md +49 -0
- package/docs/lessons/0059-contract-test-shared-structures.md +53 -0
- package/docs/lessons/0060-set-e-silent-death-in-runners.md +53 -0
- package/docs/lessons/0061-context-injection-dirty-state.md +50 -0
- package/docs/lessons/0062-sibling-bug-neighborhood-scan.md +29 -0
- package/docs/lessons/0063-one-flag-two-lifetimes.md +31 -0
- package/docs/lessons/0064-test-passes-wrong-reason.md +31 -0
- package/docs/lessons/0065-pipefail-grep-count-double-output.md +39 -0
- package/docs/lessons/0066-local-keyword-outside-function.md +37 -0
- package/docs/lessons/0067-stdin-hang-non-interactive-shell.md +36 -0
- package/docs/lessons/0068-agent-builds-wrong-thing-correctly.md +31 -0
- package/docs/lessons/0069-plan-quality-dominates-execution.md +30 -0
- package/docs/lessons/0070-spec-echo-back-prevents-drift.md +31 -0
- package/docs/lessons/0071-positive-instructions-outperform-negative.md +30 -0
- package/docs/lessons/0072-lost-in-the-middle-context-placement.md +30 -0
- package/docs/lessons/0073-unscoped-lessons-cause-false-positives.md +30 -0
- package/docs/lessons/0074-stale-context-injection-wrong-batch.md +32 -0
- package/docs/lessons/0075-research-artifacts-must-persist.md +32 -0
- package/docs/lessons/0076-wrong-decomposition-contaminates-downstream.md +30 -0
- package/docs/lessons/0077-cherry-pick-merges-need-manual-resolution.md +30 -0
- package/docs/lessons/0078-static-review-without-live-test.md +30 -0
- package/docs/lessons/0079-integration-wiring-batch-required.md +32 -0
- package/docs/lessons/FRAMEWORK.md +161 -0
- package/docs/lessons/SUMMARY.md +201 -0
- package/docs/lessons/TEMPLATE.md +85 -0
- package/docs/plans/2026-02-21-code-factory-v2-design.md +204 -0
- package/docs/plans/2026-02-21-code-factory-v2-implementation-plan.md +2189 -0
- package/docs/plans/2026-02-21-code-factory-v2-phase4-design.md +537 -0
- package/docs/plans/2026-02-21-code-factory-v2-phase4-implementation-plan.md +2012 -0
- package/docs/plans/2026-02-21-hardening-pass-design.md +108 -0
- package/docs/plans/2026-02-21-hardening-pass-plan.md +1378 -0
- package/docs/plans/2026-02-21-mab-research-report.md +406 -0
- package/docs/plans/2026-02-21-marketplace-restructure-design.md +240 -0
- package/docs/plans/2026-02-21-marketplace-restructure-plan.md +832 -0
- package/docs/plans/2026-02-21-phase4-completion-plan.md +697 -0
- package/docs/plans/2026-02-21-validator-suite-design.md +148 -0
- package/docs/plans/2026-02-21-validator-suite-plan.md +540 -0
- package/docs/plans/2026-02-22-mab-research-round2.md +556 -0
- package/docs/plans/2026-02-22-mab-run-design.md +462 -0
- package/docs/plans/2026-02-22-mab-run-plan.md +2046 -0
- package/docs/plans/2026-02-22-operations-design-methodology-research.md +681 -0
- package/docs/plans/2026-02-22-research-agent-failure-taxonomy.md +532 -0
- package/docs/plans/2026-02-22-research-code-guideline-policies.md +886 -0
- package/docs/plans/2026-02-22-research-codebase-audit-refactoring.md +908 -0
- package/docs/plans/2026-02-22-research-coding-standards-documentation.md +541 -0
- package/docs/plans/2026-02-22-research-competitive-landscape.md +687 -0
- package/docs/plans/2026-02-22-research-comprehensive-testing.md +1076 -0
- package/docs/plans/2026-02-22-research-context-utilization.md +459 -0
- package/docs/plans/2026-02-22-research-cost-quality-tradeoff.md +548 -0
- package/docs/plans/2026-02-22-research-lesson-transferability.md +508 -0
- package/docs/plans/2026-02-22-research-multi-agent-coordination.md +312 -0
- package/docs/plans/2026-02-22-research-phase-integration.md +602 -0
- package/docs/plans/2026-02-22-research-plan-quality.md +428 -0
- package/docs/plans/2026-02-22-research-prompt-engineering.md +558 -0
- package/docs/plans/2026-02-22-research-unconventional-perspectives.md +528 -0
- package/docs/plans/2026-02-22-research-user-adoption.md +638 -0
- package/docs/plans/2026-02-22-research-verification-effectiveness.md +433 -0
- package/docs/plans/2026-02-23-agent-suite-design.md +299 -0
- package/docs/plans/2026-02-23-agent-suite-plan.md +578 -0
- package/docs/plans/2026-02-23-phase3-cost-infrastructure-design.md +148 -0
- package/docs/plans/2026-02-23-phase3-cost-infrastructure-plan.md +1062 -0
- package/docs/plans/2026-02-23-research-bash-expert-agent.md +543 -0
- package/docs/plans/2026-02-23-research-dependency-auditor-agent.md +564 -0
- package/docs/plans/2026-02-23-research-improving-existing-agents.md +503 -0
- package/docs/plans/2026-02-23-research-integration-tester-agent.md +454 -0
- package/docs/plans/2026-02-23-research-python-expert-agent.md +429 -0
- package/docs/plans/2026-02-23-research-service-monitor-agent.md +425 -0
- package/docs/plans/2026-02-23-research-shell-expert-agent.md +533 -0
- package/docs/plans/2026-02-23-roadmap-to-completion.md +530 -0
- package/docs/plans/2026-02-24-headless-module-split-design.md +98 -0
- package/docs/plans/2026-02-24-headless-module-split.md +443 -0
- package/docs/plans/2026-02-24-lesson-scope-metadata-design.md +228 -0
- package/docs/plans/2026-02-24-lesson-scope-metadata-plan.md +968 -0
- package/docs/plans/2026-02-24-npm-packaging-design.md +841 -0
- package/docs/plans/2026-02-24-npm-packaging-plan.md +1965 -0
- package/docs/plans/audit-findings.md +186 -0
- package/docs/telegram-notification-format.md +98 -0
- package/examples/example-plan.md +51 -0
- package/examples/example-prd.json +72 -0
- package/examples/example-roadmap.md +33 -0
- package/examples/quickstart-plan.md +63 -0
- package/hooks/hooks.json +26 -0
- package/hooks/setup-symlinks.sh +48 -0
- package/hooks/stop-hook.sh +135 -0
- package/package.json +47 -0
- package/policies/bash.md +71 -0
- package/policies/python.md +71 -0
- package/policies/testing.md +61 -0
- package/policies/universal.md +60 -0
- package/scripts/analyze-report.sh +97 -0
- package/scripts/architecture-map.sh +145 -0
- package/scripts/auto-compound.sh +273 -0
- package/scripts/batch-audit.sh +42 -0
- package/scripts/batch-test.sh +101 -0
- package/scripts/entropy-audit.sh +221 -0
- package/scripts/failure-digest.sh +51 -0
- package/scripts/generate-ast-rules.sh +96 -0
- package/scripts/init.sh +112 -0
- package/scripts/lesson-check.sh +428 -0
- package/scripts/lib/common.sh +61 -0
- package/scripts/lib/cost-tracking.sh +153 -0
- package/scripts/lib/ollama.sh +60 -0
- package/scripts/lib/progress-writer.sh +128 -0
- package/scripts/lib/run-plan-context.sh +215 -0
- package/scripts/lib/run-plan-echo-back.sh +231 -0
- package/scripts/lib/run-plan-headless.sh +396 -0
- package/scripts/lib/run-plan-notify.sh +57 -0
- package/scripts/lib/run-plan-parser.sh +81 -0
- package/scripts/lib/run-plan-prompt.sh +215 -0
- package/scripts/lib/run-plan-quality-gate.sh +132 -0
- package/scripts/lib/run-plan-routing.sh +315 -0
- package/scripts/lib/run-plan-sampling.sh +170 -0
- package/scripts/lib/run-plan-scoring.sh +146 -0
- package/scripts/lib/run-plan-state.sh +142 -0
- package/scripts/lib/run-plan-team.sh +199 -0
- package/scripts/lib/telegram.sh +54 -0
- package/scripts/lib/thompson-sampling.sh +176 -0
- package/scripts/license-check.sh +74 -0
- package/scripts/mab-run.sh +575 -0
- package/scripts/module-size-check.sh +146 -0
- package/scripts/patterns/async-no-await.yml +5 -0
- package/scripts/patterns/bare-except.yml +6 -0
- package/scripts/patterns/empty-catch.yml +6 -0
- package/scripts/patterns/hardcoded-localhost.yml +9 -0
- package/scripts/patterns/retry-loop-no-backoff.yml +12 -0
- package/scripts/pipeline-status.sh +197 -0
- package/scripts/policy-check.sh +226 -0
- package/scripts/prior-art-search.sh +133 -0
- package/scripts/promote-mab-lessons.sh +126 -0
- package/scripts/prompts/agent-a-superpowers.md +29 -0
- package/scripts/prompts/agent-b-ralph.md +29 -0
- package/scripts/prompts/judge-agent.md +61 -0
- package/scripts/prompts/planner-agent.md +44 -0
- package/scripts/pull-community-lessons.sh +90 -0
- package/scripts/quality-gate.sh +266 -0
- package/scripts/research-gate.sh +90 -0
- package/scripts/run-plan.sh +329 -0
- package/scripts/scope-infer.sh +159 -0
- package/scripts/setup-ralph-loop.sh +155 -0
- package/scripts/telemetry.sh +230 -0
- package/scripts/tests/run-all-tests.sh +52 -0
- package/scripts/tests/test-act-cli.sh +46 -0
- package/scripts/tests/test-agents-md.sh +87 -0
- package/scripts/tests/test-analyze-report.sh +114 -0
- package/scripts/tests/test-architecture-map.sh +89 -0
- package/scripts/tests/test-auto-compound.sh +169 -0
- package/scripts/tests/test-batch-test.sh +65 -0
- package/scripts/tests/test-benchmark-runner.sh +25 -0
- package/scripts/tests/test-common.sh +168 -0
- package/scripts/tests/test-cost-tracking.sh +158 -0
- package/scripts/tests/test-echo-back.sh +180 -0
- package/scripts/tests/test-entropy-audit.sh +146 -0
- package/scripts/tests/test-failure-digest.sh +66 -0
- package/scripts/tests/test-generate-ast-rules.sh +145 -0
- package/scripts/tests/test-helpers.sh +82 -0
- package/scripts/tests/test-init.sh +47 -0
- package/scripts/tests/test-lesson-check.sh +278 -0
- package/scripts/tests/test-lesson-local.sh +55 -0
- package/scripts/tests/test-license-check.sh +109 -0
- package/scripts/tests/test-mab-run.sh +182 -0
- package/scripts/tests/test-ollama-lib.sh +49 -0
- package/scripts/tests/test-ollama.sh +60 -0
- package/scripts/tests/test-pipeline-status.sh +198 -0
- package/scripts/tests/test-policy-check.sh +124 -0
- package/scripts/tests/test-prior-art-search.sh +96 -0
- package/scripts/tests/test-progress-writer.sh +140 -0
- package/scripts/tests/test-promote-mab-lessons.sh +110 -0
- package/scripts/tests/test-pull-community-lessons.sh +149 -0
- package/scripts/tests/test-quality-gate.sh +241 -0
- package/scripts/tests/test-research-gate.sh +132 -0
- package/scripts/tests/test-run-plan-cli.sh +86 -0
- package/scripts/tests/test-run-plan-context.sh +305 -0
- package/scripts/tests/test-run-plan-e2e.sh +153 -0
- package/scripts/tests/test-run-plan-headless.sh +424 -0
- package/scripts/tests/test-run-plan-notify.sh +124 -0
- package/scripts/tests/test-run-plan-parser.sh +217 -0
- package/scripts/tests/test-run-plan-prompt.sh +254 -0
- package/scripts/tests/test-run-plan-quality-gate.sh +222 -0
- package/scripts/tests/test-run-plan-routing.sh +178 -0
- package/scripts/tests/test-run-plan-scoring.sh +148 -0
- package/scripts/tests/test-run-plan-state.sh +261 -0
- package/scripts/tests/test-run-plan-team.sh +157 -0
- package/scripts/tests/test-scope-infer.sh +150 -0
- package/scripts/tests/test-setup-ralph-loop.sh +63 -0
- package/scripts/tests/test-telegram-env.sh +38 -0
- package/scripts/tests/test-telegram.sh +121 -0
- package/scripts/tests/test-telemetry.sh +46 -0
- package/scripts/tests/test-thompson-sampling.sh +139 -0
- package/scripts/tests/test-validate-all.sh +60 -0
- package/scripts/tests/test-validate-commands.sh +89 -0
- package/scripts/tests/test-validate-hooks.sh +98 -0
- package/scripts/tests/test-validate-lessons.sh +150 -0
- package/scripts/tests/test-validate-plan-quality.sh +235 -0
- package/scripts/tests/test-validate-plans.sh +187 -0
- package/scripts/tests/test-validate-plugin.sh +106 -0
- package/scripts/tests/test-validate-prd.sh +184 -0
- package/scripts/tests/test-validate-skills.sh +134 -0
- package/scripts/validate-all.sh +57 -0
- package/scripts/validate-commands.sh +67 -0
- package/scripts/validate-hooks.sh +89 -0
- package/scripts/validate-lessons.sh +98 -0
- package/scripts/validate-plan-quality.sh +369 -0
- package/scripts/validate-plans.sh +120 -0
- package/scripts/validate-plugin.sh +86 -0
- package/scripts/validate-policies.sh +42 -0
- package/scripts/validate-prd.sh +118 -0
- package/scripts/validate-skills.sh +96 -0
- package/skills/autocode/SKILL.md +285 -0
- package/skills/autocode/ab-verification.md +51 -0
- package/skills/autocode/code-quality-standards.md +37 -0
- package/skills/autocode/competitive-mode.md +364 -0
- package/skills/brainstorming/SKILL.md +97 -0
- package/skills/capture-lesson/SKILL.md +187 -0
- package/skills/check-lessons/SKILL.md +116 -0
- package/skills/dispatching-parallel-agents/SKILL.md +110 -0
- package/skills/executing-plans/SKILL.md +85 -0
- package/skills/finishing-a-development-branch/SKILL.md +201 -0
- package/skills/receiving-code-review/SKILL.md +72 -0
- package/skills/requesting-code-review/SKILL.md +59 -0
- package/skills/requesting-code-review/code-reviewer.md +82 -0
- package/skills/research/SKILL.md +145 -0
- package/skills/roadmap/SKILL.md +115 -0
- package/skills/subagent-driven-development/SKILL.md +98 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +18 -0
- package/skills/subagent-driven-development/implementer-prompt.md +73 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +57 -0
- package/skills/systematic-debugging/SKILL.md +134 -0
- package/skills/systematic-debugging/condition-based-waiting.md +64 -0
- package/skills/systematic-debugging/defense-in-depth.md +32 -0
- package/skills/systematic-debugging/root-cause-tracing.md +55 -0
- package/skills/test-driven-development/SKILL.md +167 -0
- package/skills/using-git-worktrees/SKILL.md +219 -0
- package/skills/using-superpowers/SKILL.md +54 -0
- package/skills/verification-before-completion/SKILL.md +140 -0
- package/skills/verify/SKILL.md +82 -0
- package/skills/writing-plans/SKILL.md +128 -0
- package/skills/writing-skills/SKILL.md +93 -0
@@ -0,0 +1,443 @@
# Headless Module Split Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Split `scripts/lib/run-plan-headless.sh` (681 lines) into 3 modules, fix issue #73 (MAB path bug), and update all tests.

**Architecture:** Extract echo-back gate and sampling logic into standalone lib modules sourced by `run-plan.sh`. Headless retains the batch orchestration loop. Each extracted module is self-contained with documented interfaces for reuse across execution modes and projects.

**Tech Stack:** Bash (shellcheck-clean), existing test harness (assert_eq pattern)

**Design doc:** `docs/plans/2026-02-24-headless-module-split-design.md`

---

## Batch 1: Extract echo-back gate + update tests

### Task 1: Create `scripts/lib/run-plan-echo-back.sh`

**Files:**
- Create: `scripts/lib/run-plan-echo-back.sh`
- Modify: `scripts/lib/run-plan-headless.sh` (remove lines 1-163)
- Modify: `scripts/run-plan.sh:47` (add source line)

**Step 1: Create the new module file**

Create `scripts/lib/run-plan-echo-back.sh` with the shebang, header comment, and both functions copied verbatim from `run-plan-headless.sh` lines 1-163:

```bash
#!/usr/bin/env bash
# run-plan-echo-back.sh — Spec echo-back gate for verifying agent understanding
#
# Standalone module: can be sourced by any execution mode (headless, team, ralph).
# No dependencies on batch loop state — only reads SKIP_ECHO_BACK and STRICT_ECHO_BACK globals.
#
# Functions:
#   _echo_back_check <batch_text> <log_file>
#     Lightweight keyword-match gate on agent output. Non-blocking by default.
#   echo_back_check <batch_text> <log_dir> <batch_num> [claude_cmd]
#     Full spec verification: agent restatement → haiku verdict → retry once.
#
# Globals (read-only): SKIP_ECHO_BACK, STRICT_ECHO_BACK
#
# Echo-back gate behavior (--strict-echo-back / --skip-echo-back):
#   Default: NON-BLOCKING — prints a WARNING if agent echo-back looks wrong, then continues.
#   --skip-echo-back: disables the echo-back check entirely (no prompt, no warning).
#   --strict-echo-back: makes the echo-back check BLOCKING — returns 1 on mismatch, aborting the batch.
```
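
The three gate modes described in the header above can be sketched as follows. This is a minimal illustration, not the code from `run-plan-headless.sh`: the function name `demo_echo_back_gate`, the `true`/`false` string convention for the globals, and the keyword-match rule are all assumptions made for the example. Only the `SKIP_ECHO_BACK` and `STRICT_ECHO_BACK` names and the warn-versus-return-1 behavior come from the header.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the three echo-back modes. The real functions are the
# ones moved into run-plan-echo-back.sh; this only mirrors the documented behavior.

# demo_echo_back_gate <expected_keyword> <agent_output>
# Returns 0 to let the batch continue, 1 to abort it.
demo_echo_back_gate() {
    local keyword="$1"
    local agent_output="$2"

    # --skip-echo-back: no check, no warning.
    if [[ "${SKIP_ECHO_BACK:-false}" == "true" ]]; then
        return 0
    fi

    # Assumed match rule: the output must mention the keyword (case-insensitive).
    if grep -qiF -- "$keyword" <<<"$agent_output"; then
        return 0
    fi

    echo "WARNING: echo-back does not mention '$keyword'" >&2

    # --strict-echo-back: a mismatch blocks the batch.
    if [[ "${STRICT_ECHO_BACK:-false}" == "true" ]]; then
        return 1
    fi

    # Default: non-blocking, warn and continue.
    return 0
}
```

With both globals unset, a mismatch only prints the warning; setting `STRICT_ECHO_BACK=true` turns the same mismatch into a non-zero return that a caller could use to abort the batch.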

Then paste the two functions (`_echo_back_check` and `echo_back_check`) exactly as they appear in `run-plan-headless.sh` lines 14-163.

**Step 2: Remove echo-back functions from headless**

In `scripts/lib/run-plan-headless.sh`, delete everything from line 1 through line 163 (the closing `}` of `echo_back_check`). After the deletion, the first remaining line is `run_mode_headless() {`.

Then prepend the updated file header:

```bash
#!/usr/bin/env bash
# run-plan-headless.sh — Headless batch execution loop for run-plan
#
# Requires globals: WORKTREE, RESUME, START_BATCH, END_BATCH, NOTIFY,
#   PLAN_FILE, QUALITY_GATE_CMD, PYTHON, MAX_RETRIES, ON_FAILURE, VERIFY, MODE,
#   SKIP_ECHO_BACK, STRICT_ECHO_BACK
# Requires libs: run-plan-parser, state, quality-gate, notify, prompt, scoring, echo-back
```

**Step 3: Add source line in `run-plan.sh`**

In `scripts/run-plan.sh`, add this line BEFORE line 47 (`source "$SCRIPT_DIR/lib/run-plan-headless.sh"`):

```bash
source "$SCRIPT_DIR/lib/run-plan-echo-back.sh"
```

**Step 4: Run tests to verify no regressions**

Run: `bash scripts/tests/test-echo-back.sh`
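Expected: ALL PASSED (5/5). If tests fail here because `_echo_back_check` or `echo_back_check` is undefined, the likely cause is that line 7 of this file still sources `run-plan-headless.sh`; Task 2 Step 1 fixes that source path.

Run: `bash scripts/tests/test-run-plan-headless.sh`
Expected: Some echo-back tests will FAIL because they grep `$RPH` for `_echo_back_check()`. That is expected; Task 2 fixes them.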

**Step 5: Commit**

```bash
git add scripts/lib/run-plan-echo-back.sh scripts/lib/run-plan-headless.sh scripts/run-plan.sh
git commit -m "refactor: extract echo-back gate to run-plan-echo-back.sh"
```

### Task 2: Update test files for echo-back extraction

**Files:**
- Modify: `scripts/tests/test-echo-back.sh:7` (change source path)
- Modify: `scripts/tests/test-run-plan-headless.sh` (update echo-back test assertions)

**Step 1: Fix `test-echo-back.sh` source path**

In `scripts/tests/test-echo-back.sh` line 7, change:

```bash
# Before:
source "$SCRIPT_DIR/../lib/run-plan-headless.sh" 2>/dev/null || true
# After:
source "$SCRIPT_DIR/../lib/run-plan-echo-back.sh" 2>/dev/null || true
```
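
Because the source line ends in `2>/dev/null || true`, a wrong path fails silently and the tests only break later with "command not found". One possible guard, shown here as a sketch rather than as part of the existing test file, is to confirm that the expected functions are defined right after sourcing. The helper name `require_functions` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fail fast when sourced functions are missing.

# require_functions <name>...
# Prints each missing function name to stderr; returns 1 if any are missing.
require_functions() {
    local missing=0 fn
    for fn in "$@"; do
        # declare -F returns 0 only when the name is a defined shell function.
        if ! declare -F "$fn" >/dev/null; then
            echo "missing function: $fn" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In `test-echo-back.sh` this could be called as `require_functions _echo_back_check echo_back_check || exit 1` immediately after the source line, so a bad path surfaces at the top of the run.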

**Step 2: Update `test-run-plan-headless.sh` echo-back assertions**

The headless test file has tests that grep `$RPH` (the headless file) for echo-back functions. These need to point at the new echo-back file instead. Update these tests:

Lines 235-242: Change `$RPH` to the echo-back file. Start by defining a variable for it next to `RPH`:
```bash
# Existing (already defined at line 7, leave unchanged):
RPH="$SCRIPT_DIR/../lib/run-plan-headless.sh"

# Add near the top (after RPH definition, around line 8):
RPEB="$SCRIPT_DIR/../lib/run-plan-echo-back.sh"
```

Then update the 6 echo-back grep tests (lines 235-260) to use `$RPEB` instead of `$RPH`:

- Line 237: `grep -q '_echo_back_check()' "$RPEB"` — and update PASS/FAIL messages to say "run-plan-echo-back.sh"
- Line 246: `grep -q 'STRICT_ECHO_BACK' "$RPEB"`
- Line 255: `grep -q 'NON-BLOCKING' "$RPEB"`

And update the 3 behavioral tests (lines 262-305) to source `$RPEB` instead of `$RPH`:

- Line 265: `source "$RPEB" 2>/dev/null || true`
- Line 278: `source "$RPEB" 2>/dev/null || true`
- Line 293: `source "$RPEB" 2>/dev/null || true`
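
The grep tests listed above are structural checks: they assert that a file contains a given string. A minimal sketch of that style follows. It is not the toolkit's own harness (whose `assert_eq` helper is not shown in this plan); `assert_file_contains` and the throwaway demo file are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical structural test helper, not the toolkit's own harness.

# assert_file_contains <file> <fixed_string> <label>
# Prints PASS/FAIL and returns 0/1.
assert_file_contains() {
    local file="$1" needle="$2" label="$3"
    # -F matches the needle as a fixed string, so "()" needs no escaping.
    if grep -qF -- "$needle" "$file"; then
        echo "PASS: $label"
        return 0
    fi
    echo "FAIL: $label (no '$needle' in $file)"
    return 1
}

# Demo against a throwaway file standing in for run-plan-echo-back.sh.
demo_file="$(mktemp)"
trap 'rm -f "$demo_file"' EXIT
printf '%s\n' '_echo_back_check() {' '    : # NON-BLOCKING by default' '}' >"$demo_file"

assert_file_contains "$demo_file" '_echo_back_check()' "gate function is defined"
assert_file_contains "$demo_file" 'NON-BLOCKING' "default mode is documented"
```

Passing the target file as an argument means the same assertion works for `$RPH` and `$RPEB`, which is the only change the bullets above require.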

**Step 3: Run both test files**

Run: `bash scripts/tests/test-echo-back.sh`
Expected: ALL PASSED (5/5)

Run: `bash scripts/tests/test-run-plan-headless.sh`
Expected: ALL PASSED (all tests pass with updated references)

**Step 4: Commit**

```bash
git add scripts/tests/test-echo-back.sh scripts/tests/test-run-plan-headless.sh
git commit -m "test: update echo-back test references after extraction"
```

## Batch 2: Extract sampling logic + fix #73

### Task 3: Create `scripts/lib/run-plan-sampling.sh`

**Files:**
- Create: `scripts/lib/run-plan-sampling.sh`
- Modify: `scripts/lib/run-plan-headless.sh` (replace inline sampling with function call)
- Modify: `scripts/run-plan.sh` (add source line)

**Step 1: Create the new sampling module**

Create `scripts/lib/run-plan-sampling.sh` with two functions extracted from `run-plan-headless.sh`:

```bash
#!/usr/bin/env bash
# run-plan-sampling.sh — Parallel candidate sampling for batch execution
#
# Standalone module: spawns N parallel candidates with prompt variants,
# scores each via quality gate, picks the winner. Uses patch files (not stash)
# to manage worktree state across candidates.
#
# Functions:
#   check_memory_for_sampling
#     Returns 0 if enough memory for SAMPLE_COUNT candidates, 1 otherwise.
#     Prints warning and sets SAMPLE_COUNT=0 on insufficient memory.
#   run_sampling_candidates <worktree> <plan_file> <batch> <prompt> <quality_gate_cmd>
#     Spawns SAMPLE_COUNT candidates, scores them, applies winner's patch.
#     Returns 0 if winner found, 1 if no candidate passed.
#
# Globals: SAMPLE_COUNT (read; reset to 0 by check_memory_for_sampling on fallback),
#   SAMPLE_MIN_MEMORY_PER_GB (read-only)
# Requires libs: run-plan-scoring (score_candidate, select_winner, classify_batch_type, get_prompt_variants)
#   run-plan-quality-gate (run_quality_gate)
#   run-plan-state (get_previous_test_count)
```

**`check_memory_for_sampling` function:** Extract from current headless lines 354-369:

```bash
# check_memory_for_sampling
# Checks if sufficient memory is available for SAMPLE_COUNT parallel candidates.
# Returns: 0 if OK, 1 if insufficient (also sets SAMPLE_COUNT=0 and prints warning)
check_memory_for_sampling() {
    local avail_mb
    avail_mb=$(free -m 2>/dev/null | awk '/Mem:/{print $7}')
    if [[ -z "$avail_mb" ]]; then
        echo " WARNING: Cannot determine available memory. Falling back to single attempt."
        SAMPLE_COUNT=0
        return 1
    fi

    local needed_mb=$(( SAMPLE_COUNT * ${SAMPLE_MIN_MEMORY_PER_GB:-4} * 1024 ))
    if [[ "$avail_mb" -lt "$needed_mb" ]]; then
        local avail_display needed_display
        avail_display=$(awk "BEGIN {printf \"%.1f\", $avail_mb / 1024}")
        needed_display=$(( SAMPLE_COUNT * ${SAMPLE_MIN_MEMORY_PER_GB:-4} ))
        echo " WARNING: Not enough memory for sampling (${avail_display}G < ${needed_display}G needed). Falling back to single attempt."
        SAMPLE_COUNT=0
        return 1
    fi
    return 0
}
```
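
To make the threshold concrete: with `SAMPLE_COUNT=3` and the default of 4 GB per candidate, the check needs 3 × 4 × 1024 = 12288 MB available. The sketch below isolates that arithmetic in pure helpers so it can be exercised without calling `free`. The names `needed_memory_mb` and `has_memory_for` are hypothetical and not part of the module.

```bash
#!/usr/bin/env bash
# Hypothetical pure helpers mirroring the arithmetic in check_memory_for_sampling.

# needed_memory_mb <sample_count> [gb_per_candidate]
needed_memory_mb() {
    local count="$1"
    local per_gb="${2:-4}"   # same default as ${SAMPLE_MIN_MEMORY_PER_GB:-4}
    echo $(( count * per_gb * 1024 ))
}

# has_memory_for <avail_mb> <sample_count> [gb_per_candidate]
# Returns 0 when avail_mb covers the requirement, 1 otherwise.
has_memory_for() {
    local avail_mb="$1"
    local needed_mb
    needed_mb="$(needed_memory_mb "$2" "${3:-4}")"
    [[ "$avail_mb" -ge "$needed_mb" ]]
}

needed_memory_mb 3      # 3 candidates at the default 4 GB each -> 12288
needed_memory_mb 2 8    # 2 candidates at 8 GB each -> 16384
```

Separating the calculation from the `free` call is only one possible design; it keeps the threshold logic testable on machines where `free` is unavailable.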

**`run_sampling_candidates` function:** Extract from current headless lines 373-494. Wrap the block in a function that takes 5 parameters:

```bash
# run_sampling_candidates <worktree> <plan_file> <batch> <prompt> <quality_gate_cmd>
# Spawns SAMPLE_COUNT parallel candidates with batch-type-aware prompt variants.
# Uses patch files for worktree state management (no git stash — bug #2/#27).
# Returns: 0 if winner found (worktree contains winner's changes), 1 if no candidate passed.
# Side-effects: writes logs/sampling-outcomes.json
run_sampling_candidates() {
    local worktree="$1"
    local plan_file="$2"
    local batch="$3"
    local prompt="$4"
    local quality_gate_cmd="$5"

    echo " Sampling $SAMPLE_COUNT candidates for batch $batch..."
    local scores=""
    local candidate_logs=()

    # Save baseline state using a patch file rather than git stash.
    # (rest of the code from lines 382-494, replacing $WORKTREE with $worktree,
    # $PLAN_FILE with $plan_file, $QUALITY_GATE_CMD with $quality_gate_cmd)

    # ... (full extraction — replace global refs with parameters)

    # Return 0 for winner found, 1 for no candidate passed
}
```

Inside the function body, replace these global variable references with function parameters:
- `$WORKTREE` → `$worktree`
- `$PLAN_FILE` → `$plan_file`
- `$QUALITY_GATE_CMD` → `$quality_gate_cmd`

Keep `$SAMPLE_COUNT` as a global read (it's set by the caller and used across the module).
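
The global-to-parameter substitution can be illustrated with a toy function. This sketch is not the extracted code: `describe_run_global`, `describe_run`, and the sample paths are made up for the example. It shows why the parameterized form is easier to reuse, since the caller decides the worktree and plan file while `SAMPLE_COUNT` remains a global read.

```bash
#!/usr/bin/env bash
# Toy illustration of replacing global reads with positional parameters.

# Before: depends on WORKTREE and PLAN_FILE being set by whoever sourced us.
describe_run_global() {
    echo "sampling ${SAMPLE_COUNT:-0} candidates in $WORKTREE for $PLAN_FILE"
}

# After: worktree and plan file arrive as parameters; SAMPLE_COUNT stays global.
describe_run() {
    local worktree="$1"
    local plan_file="$2"
    echo "sampling ${SAMPLE_COUNT:-0} candidates in $worktree for $plan_file"
}

SAMPLE_COUNT=3
WORKTREE="/tmp/wt-a"
PLAN_FILE="plan.md"

describe_run_global
# The parameterized version can target another worktree without touching globals.
describe_run "/tmp/wt-b" "other-plan.md"
```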

**Step 2: Replace inline sampling in headless with function call**

In `run-plan-headless.sh`, replace the entire sampling block (the `if [[ "${SAMPLE_COUNT:-0}" -gt 0 && $attempt -ge 2 ]]` block through its closing `fi` and `continue`) with:

```bash
# If sampling enabled and this is a retry, use parallel candidates
if [[ "${SAMPLE_COUNT:-0}" -gt 0 && $attempt -ge 2 ]]; then
    check_memory_for_sampling || true
    if [[ "${SAMPLE_COUNT:-0}" -gt 0 ]]; then
        if run_sampling_candidates "$WORKTREE" "$PLAN_FILE" "$batch" "$prompt" "$QUALITY_GATE_CMD"; then
            batch_passed=true
            break
        fi
        continue # Skip normal retry path below
    fi
fi
```
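
How `break` and `continue` interact with the surrounding retry loop is easy to misread, so here is a self-contained simulation. Everything in it is a stand-in: the loop shape, `MAX_RETRIES=3`, the counters, and the stubbed `check_memory_for_sampling` and `run_sampling_candidates` are assumptions for illustration, not the real headless loop.

```bash
#!/usr/bin/env bash
# Simulated retry loop; stubs replace the real module functions.

check_memory_for_sampling() { return 0; }    # stub: memory is always fine

sampling_calls=0
run_sampling_candidates() {                  # stub: fails once, then wins
    sampling_calls=$(( sampling_calls + 1 ))
    [[ "$sampling_calls" -ge 2 ]]
}

SAMPLE_COUNT=2
MAX_RETRIES=3
batch_passed=false
normal_attempts=0

for (( attempt = 1; attempt <= MAX_RETRIES; attempt++ )); do
    if [[ "${SAMPLE_COUNT:-0}" -gt 0 && $attempt -ge 2 ]]; then
        check_memory_for_sampling || true
        if [[ "${SAMPLE_COUNT:-0}" -gt 0 ]]; then
            if run_sampling_candidates; then
                batch_passed=true
                break
            fi
            continue   # sampling failed: skip the normal path for this attempt
        fi
    fi
    # Normal single-attempt path (always fails in this simulation).
    normal_attempts=$(( normal_attempts + 1 ))
done

echo "batch_passed=$batch_passed normal_attempts=$normal_attempts sampling_calls=$sampling_calls"
```

In this simulation the first attempt takes the normal path, the second attempt samples and fails, and the third samples and wins, so the script prints `batch_passed=true normal_attempts=1 sampling_calls=2`. If the memory check had reset `SAMPLE_COUNT` to 0, the inner `if` would be skipped and the attempt would fall through to the normal path.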

Also replace the memory guard block (lines 354-369) with:

```bash
# Memory guard for sampling
if [[ "${SAMPLE_COUNT:-0}" -gt 0 ]]; then
    check_memory_for_sampling || true
fi
```

**Step 3: Add source line in `run-plan.sh`**

In `scripts/run-plan.sh`, add BEFORE the headless source line:

```bash
source "$SCRIPT_DIR/lib/run-plan-sampling.sh"
```

**Step 4: Run tests**

Run: `bash scripts/tests/test-run-plan-headless.sh`
|
|
285
|
+
Expected: ALL PASSED, with one caveat. The sampling grep tests check `$RPH` for `_baseline_patch`, `_winner_patch`, and `git apply`. If those patterns no longer appear in the headless file after the extraction, these tests fail here and must be pointed at the new module (Task 5 covers that update). Check, and fix if needed.

Run: `bash scripts/tests/test-echo-back.sh`
Expected: ALL PASSED (5/5, unchanged)

**Step 5: Commit**

```bash
git add scripts/lib/run-plan-sampling.sh scripts/lib/run-plan-headless.sh scripts/run-plan.sh
git commit -m "refactor: extract sampling logic to run-plan-sampling.sh"
```

### Task 4: Fix issue #73 (MAB path resolution)

**Files:**
- Modify: `scripts/lib/run-plan-headless.sh` (one line change)

**Step 1: Fix the path**

In `scripts/lib/run-plan-headless.sh`, find the MAB invocation line (originally line 251, now shifted after extraction). Change:

```bash
# Before:
"$SCRIPT_DIR/../mab-run.sh" \
# After:
"$SCRIPT_DIR/mab-run.sh" \
```

**Step 2: Verify path resolves correctly**

Run: `ls -la "$(cd "$(dirname "$(readlink -f scripts/run-plan.sh)")" && pwd)/mab-run.sh"`
Expected: Shows `scripts/mab-run.sh` exists at the resolved path.
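
To see why the extra `..` breaks resolution, it can help to test both candidate paths against the directory that `SCRIPT_DIR` points at. This is a minimal sketch, assuming `SCRIPT_DIR` is the `scripts/` directory (which the `source "$SCRIPT_DIR/lib/run-plan-sampling.sh"` line in Task 3 suggests). `check_mab_path` is a hypothetical helper, not something the toolkit ships.

```bash
# Hypothetical helper: given the directory SCRIPT_DIR points at, report
# which of the two candidate paths to mab-run.sh actually exists.
check_mab_path() {
    local script_dir="$1"
    local candidate
    for candidate in "$script_dir/mab-run.sh" "$script_dir/../mab-run.sh"; do
        if [[ -f "$candidate" ]]; then
            echo "found: $candidate"
        else
            echo "missing: $candidate"
        fi
    done
}
```

Calling `check_mab_path scripts` from the repository root should report the first path as found and the `..` variant as missing, provided `mab-run.sh` lives only in `scripts/`.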

**Step 3: Commit**

```bash
git add scripts/lib/run-plan-headless.sh
git commit -m "fix: MAB path resolution — \$SCRIPT_DIR/mab-run.sh not \$SCRIPT_DIR/../ (closes #73)"
```

### Task 5: Update sampling tests in `test-run-plan-headless.sh`

**Files:**
- Modify: `scripts/tests/test-run-plan-headless.sh`

**Step 1: Add sampling module reference**

Near the top of `test-run-plan-headless.sh` (after line 8 where `RPEB` was added), add:

```bash
RPS="$SCRIPT_DIR/../lib/run-plan-sampling.sh"
```

**Step 2: Update sampling grep tests**

Lines 176-213 grep `$RPH` for sampling patterns (`_baseline_patch`, `_winner_patch`, `git stash`, `git apply`). Update:

- The `_baseline_patch` test (line 177): grep `$RPS` instead of `$RPH`
- The `_winner_patch` test (line 186): grep `$RPS` instead of `$RPH`
- The `git stash` test (line 196): extract sampling block from `$RPS` instead of `$RPH`
- The `git apply` test (line 208): extract from `$RPS` instead of `$RPH`
- Update PASS/FAIL messages to reference "run-plan-sampling.sh"
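
The existing test bodies are not reproduced in this plan, so the following is only a sketch of what a relocated presence check might look like. It reuses the `TESTS`/`FAILURES` counter convention that the Step 3 snippet shows; `assert_file_contains` is a hypothetical helper introduced here for illustration, and the real tests may well be written inline instead.

```bash
# Counters follow the convention used elsewhere in the test file.
TESTS=${TESTS:-0}
FAILURES=${FAILURES:-0}

# Hypothetical helper: pass if the file contains the pattern.
#   $1 = grep pattern, $2 = file path, $3 = label used in the message
assert_file_contains() {
    local pattern="$1" file="$2" label="$3"
    TESTS=$((TESTS + 1))
    if grep -q -- "$pattern" "$file"; then
        echo "PASS: $label contains '$pattern'"
    else
        echo "FAIL: $label should contain '$pattern'"
        FAILURES=$((FAILURES + 1))
    fi
}
```

With `RPS` defined as in Step 1, a call such as `assert_file_contains '_baseline_patch' "$RPS" "run-plan-sampling.sh"` would cover the first bullet. Whether the `git stash` test asserts presence or absence is not stated here, so it is left out of the sketch.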

**Step 3: Add existence test for new modules**

Add near the existing file-existence test (after line 33):

```bash
# === Extracted echo-back file exists ===
TESTS=$((TESTS + 1))
if [[ -f "$RPEB" ]]; then
    echo "PASS: run-plan-echo-back.sh exists"
else
    echo "FAIL: run-plan-echo-back.sh should exist at scripts/lib/"
    FAILURES=$((FAILURES + 1))
fi

# === Extracted sampling file exists ===
TESTS=$((TESTS + 1))
if [[ -f "$RPS" ]]; then
    echo "PASS: run-plan-sampling.sh exists"
else
    echo "FAIL: run-plan-sampling.sh should exist at scripts/lib/"
    FAILURES=$((FAILURES + 1))
fi

# === run-plan.sh sources new modules ===
TESTS=$((TESTS + 1))
if grep -q 'source.*lib/run-plan-echo-back.sh' "$RP"; then
    echo "PASS: run-plan.sh sources lib/run-plan-echo-back.sh"
else
    echo "FAIL: run-plan.sh should source lib/run-plan-echo-back.sh"
    FAILURES=$((FAILURES + 1))
fi

TESTS=$((TESTS + 1))
if grep -q 'source.*lib/run-plan-sampling.sh' "$RP"; then
    echo "PASS: run-plan.sh sources lib/run-plan-sampling.sh"
else
    echo "FAIL: run-plan.sh should source lib/run-plan-sampling.sh"
    FAILURES=$((FAILURES + 1))
fi
```

**Step 4: Run all test files**

Run: `bash scripts/tests/test-run-plan-headless.sh`
Expected: ALL PASSED

Run: `bash scripts/tests/test-echo-back.sh`
Expected: ALL PASSED

Run: `bash scripts/tests/test-mab-run.sh`
Expected: ALL PASSED (MAB wiring test should still find the check in headless)

**Step 5: Commit**

```bash
git add scripts/tests/test-run-plan-headless.sh
git commit -m "test: update headless tests for sampling extraction + new module checks"
```

## Batch 3: Verify line counts + full suite

### Task 6: Verify final line counts and run full test suite

**Files:** (read-only verification)

**Step 1: Check line counts**

Run: `wc -l scripts/lib/run-plan-headless.sh scripts/lib/run-plan-echo-back.sh scripts/lib/run-plan-sampling.sh`

Expected (approximate):
- `run-plan-headless.sh`: ~416 lines
- `run-plan-echo-back.sh`: ~145 lines
- `run-plan-sampling.sh`: ~135 lines

**Step 2: Run full test suite**

Run: `for t in scripts/tests/test-*.sh; do echo "=== $t ==="; bash "$t" || echo "FAILED: $t"; done`

Expected: All test files pass.
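
The one-liner above prints a `FAILED:` line for each failing file, but the loop itself still exits 0, so a failure is easy to miss in long output. A variant that counts failures and returns a non-zero status might look like the sketch below; `run_all_tests` is a hypothetical name and is not part of the toolkit.

```bash
# Hypothetical wrapper: run every test-*.sh in a directory, report each
# failure, and return non-zero if any test file failed.
run_all_tests() {
    local dir="$1" failed=0 t
    for t in "$dir"/test-*.sh; do
        [[ -e "$t" ]] || continue   # glob matched nothing
        echo "=== $t ==="
        if ! bash "$t"; then
            echo "FAILED: $t"
            failed=$((failed + 1))
        fi
    done
    echo "$failed test file(s) failed"
    [[ "$failed" -eq 0 ]]
}
```

Usage would be `run_all_tests scripts/tests`, which makes the step usable as a gate in a script or CI job.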

**Step 3: shellcheck all new and modified files**

Run: `shellcheck scripts/lib/run-plan-echo-back.sh scripts/lib/run-plan-sampling.sh scripts/lib/run-plan-headless.sh`

Expected: No errors. Fix any warnings before proceeding.

**Step 4: Close issue #73**

Run: `gh issue close 73 --comment "Fixed in $(git log --oneline -1 --grep='MAB path' | cut -d' ' -f1). \$SCRIPT_DIR/mab-run.sh resolves correctly now."`

**Step 5: Final commit (if shellcheck fixes needed)**

```bash
git add -A
git commit -m "chore: shellcheck fixes after module split"
```
@@ -0,0 +1,228 @@
# Lesson Scope Metadata — Design

**Date:** 2026-02-24
**Status:** Approved
**Phase:** 5A (Adoption & Polish)
**Evidence:** Lesson #63 — 67% false positive rate predicted at ~100 lessons without scope metadata (Zimmermann, 622 predictions)

---

## Problem

The lesson system has 146 lessons (70 toolkit + 76 workspace) — past the ~100 threshold where unscoped lessons hit untenable false positive rates. Domain-specific lessons (HA entity resolution, Telegram bot polling) fire on unrelated projects. This erodes trust and slows quality gates.

Two systems need scope:
1. **Toolkit lessons** (`docs/lessons/0001-*.md`) — YAML frontmatter, scanned by `lesson-check.sh`
2. **Workspace lessons** (`~/Documents/docs/lessons/2026-*.md`) — freeform markdown, scanned by `lesson-scanner` agent

## Goals

1. **Project goal**: Reduce false positive rate below 20% as lesson count grows
2. **Workspace goal**: Lessons compound across projects — bugs found in one project propagate to others through evidence, not manual tagging

---

## Design

### 1. Scope Vocabulary

Hierarchical tag system with three tiers:

```
universal               # applies to all projects (default)
language:<lang>         # python, bash, javascript, typescript
framework:<framework>   # pytest, preact, systemd, docker
domain:<domain>         # ha-aria, telegram, notion, ollama
project:<name>          # exact project directory name match
```

**Matching rule**: A lesson is relevant if ANY of its scope tags matches the project's scope set, OR if the lesson's scope includes `universal`.

**Relationship to `languages:`**: The existing `languages:` field handles file-level filtering (which extensions to scan). `scope:` handles project-level filtering (should this lesson be loaded at all?). They are orthogonal:
- `scope: [domain:ha-aria]` + `languages: [python]` = only scan `.py` files, only in ha-aria
- `scope: [language:python]` + `languages: [python]` = scan `.py` files in any Python project
- `scope: [universal]` + `languages: [python]` = scan `.py` files in every project
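
The matching rule can be sketched as a small bash function. This is one possible shape for the `scope_matches()` function the design names, not its actual implementation. It assumes both scopes are passed as comma-separated strings, and it also treats a project scope containing `universal` as matching everything, which is how the fallback described in section 2 ("all lessons apply") reads.

```bash
# Sketch of the matching rule. Returns 0 if the lesson applies to the
# project, 1 otherwise. Both arguments are comma-separated tag lists,
# e.g. "language:python,domain:ha-aria" (an assumed input format).
scope_matches() {
    local lesson_scope="$1" project_scope="$2"
    local lesson_tag project_tag
    local -a lesson_tags project_tags
    IFS=',' read -r -a lesson_tags <<< "$lesson_scope"
    IFS=',' read -r -a project_tags <<< "$project_scope"

    # A lesson with no scope defaults to universal.
    [[ ${#lesson_tags[@]} -eq 0 ]] && return 0

    # Assumption: a project whose scope fell back to "universal"
    # receives every lesson.
    for project_tag in "${project_tags[@]}"; do
        [[ "${project_tag//[[:space:]]/}" == "universal" ]] && return 0
    done

    for lesson_tag in "${lesson_tags[@]}"; do
        lesson_tag="${lesson_tag//[[:space:]]/}"
        [[ "$lesson_tag" == "universal" ]] && return 0
        for project_tag in "${project_tags[@]}"; do
            project_tag="${project_tag//[[:space:]]/}"
            [[ -n "$lesson_tag" && "$lesson_tag" == "$project_tag" ]] && return 0
        done
    done
    return 1
}
```

For instance, `scope_matches "domain:ha-aria" "language:python,domain:telegram"` returns 1, so that lesson would be skipped for a Telegram project.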

### 2. Project Manifests (Supply Side)

Each project declares its identity via scope tags in its CLAUDE.md:

```markdown
## Scope Tags
language:python, framework:pytest, domain:ha-aria
```

`lesson-check.sh` reads this field. Benefits:
- **No heuristic code** — projects declare themselves
- **Extensible** — new domains need no code changes
- **Projects own their identity** — Cluster B (integration boundary) bugs eliminated

**Fallback**: If no `## Scope Tags` section exists, auto-detect from:
1. `detect_project_type()` for language (already exists)
2. Framework markers: `pyproject.toml` containing `pytest` → `framework:pytest`
3. Default to `{universal}` — all lessons apply

**CLI override**: `--scope "language:python,domain:ha-aria"` for ad-hoc runs.
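
Reading the manifest could be as simple as taking the first non-blank line under the heading. The sketch below covers only that step plus the final `universal` default. It does not attempt the language or framework auto-detection, since `detect_project_type()` is not shown here. `read_scope_tags` is a hypothetical name, and the single-line tag format is assumed from the example above.

```bash
# Hypothetical helper: print the project's scope tags as a
# comma-separated list with whitespace removed. Falls back to
# "universal" when CLAUDE.md or the "## Scope Tags" section is missing.
read_scope_tags() {
    local project_dir="$1"
    local claude_md="$project_dir/CLAUDE.md"
    local tags=""
    if [[ -f "$claude_md" ]]; then
        tags=$(awk '
            /^## Scope Tags[ \t]*$/ { in_section = 1; next }
            in_section && /^#/      { exit }          # next heading: section was empty
            in_section && NF        { print; exit }   # first non-blank line holds the tags
        ' "$claude_md")
    fi
    tags="${tags//[[:space:]]/}"
    echo "${tags:-universal}"
}
```

The output is in the same comma-separated form as the `--scope` value, so either source could feed the same matching step.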

### 3. Lesson Scope Field (Demand Side)

#### Toolkit lessons (YAML frontmatter)

New field in YAML block:

```yaml
---
id: 1
title: "Bare exception swallowing hides failures"
scope: [language:python]    # NEW — project-level filtering
languages: [python]         # existing — file-level filtering
severity: blocker
# ...
---
```

Default when omitted: `[universal]` — backward compatible.
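
A minimal sketch of how the field and its default might be read from frontmatter follows. It assumes the inline-list form shown above (`scope: [a, b]`) and does not handle multi-line YAML lists. `parse_lesson_scope` is a hypothetical standalone function, not the toolkit's existing `parse_lesson()`.

```bash
# Hypothetical helper: print a lesson's scope as a comma-separated list,
# or "universal" when the frontmatter has no scope field.
parse_lesson_scope() {
    local lesson_file="$1"
    local scope
    scope=$(awk '
        /^---[ \t]*$/ { fences++; if (fences == 2) exit; next }
        fences == 1 && /^scope:/ {
            sub(/^scope:[ \t]*/, "")   # drop the key
            sub(/[ \t]*#.*$/, "")      # drop a trailing comment
            gsub(/[ \t]/, "")          # drop whitespace
            gsub(/\[/, "")             # drop the opening bracket
            gsub(/\]/, "")             # drop the closing bracket
            print
            exit
        }
    ' "$lesson_file")
    echo "${scope:-universal}"
}
```

Only lines between the first and second `---` are considered, so a `scope:` string in the lesson body is ignored.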

#### Workspace lessons (freeform markdown header)

New field in the metadata block:

```markdown
# Lesson: HA Entity Area Resolution

**Date:** 2026-02-14
**System:** ARIA (ha-aria)
**Tier:** lesson_learned
**Scope:** domain:ha-aria # NEW
**Category:** data-model
**Keywords:** HA, entity, area
```

Default when omitted: `universal`.

### 4. lesson-check.sh Changes

```
parse_lesson()         — add scope parsing from YAML (new field: lesson_scope)
detect_project_scope() — NEW: read CLAUDE.md ## Scope Tags, fallback to detect_project_type()
scope_matches()        — NEW: returns 0 if lesson scope intersects project scope
main loop              — add scope_matches() gate before language/grep check
build_help()           — show scope per lesson in --help output
CLI flags:
  --all-scopes         — bypass scope filtering (scan everything)
  --show-scope         — display detected project scope and exit
  --scope <tags>       — override project scope manually
```

### 5. Lesson-Scanner Agent Changes

Update `~/.claude/agents/lesson-scanner.md` prompt to:
1. Read `**Scope:**` from workspace lesson headers
2. Detect project context from working directory (read CLAUDE.md scope tags)
3. Filter lessons by scope match before applying
4. Report scope mismatch in output: `"Skipped 23 lessons (scope mismatch)"`

### 6. Scope Inference at Creation Time

Integrate into the `/capture-lesson` skill:

1. When a new lesson is created, analyze its content for scope signals:
   - File paths mentioned → infer project/domain
   - System names (HA, Telegram, Notion) → infer domain
   - Language-specific patterns → infer language
2. Propose scope tags to the user for confirmation
3. Write the scope field into the lesson

This prevents decay — new lessons get scope at birth, not retroactively.

### 7. Scope Inference Script (Bulk Tagging)

`scripts/scope-infer.sh` for the initial migration:

```
scope-infer.sh [--dir <lessons-dir>] [--dry-run] [--apply]
```

For each lesson without a scope field:
1. Read content, extract signals (keywords, file paths, system references)
2. Apply heuristics:
   - Title/body contains "HA", "entity", "area", "automation" → `domain:ha-aria`
   - Title/body contains "Telegram", "bot", "polling" → `domain:telegram`
   - Title/body contains "Notion", "database", "sync" → `domain:notion`
   - `languages: [python]` with no domain signals → `language:python`
   - No signals → `universal`
3. Output proposed scope as a diff (or apply with `--apply`)
4. Generate summary: "Inferred scope for N lessons: X universal, Y domain-specific, Z language-specific"
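
The keyword heuristics in step 2 could be expressed roughly as below. This is a sketch under stated assumptions, not the script's design: `infer_scope` is a hypothetical function, matching is whole-word and case-insensitive, and the first matching rule wins. The design does not say whether a lesson may receive several domain tags.

```bash
# Hypothetical helper: guess one scope tag from a lesson's text.
#   $1 = title/body text, $2 = value of the languages field (optional)
# Rules are tried in the order listed in step 2; first match wins.
infer_scope() {
    local text="$1" languages="${2:-}"
    if grep -qiwE 'HA|entity|area|automation' <<< "$text"; then
        echo "domain:ha-aria"
    elif grep -qiwE 'Telegram|bot|polling' <<< "$text"; then
        echo "domain:telegram"
    elif grep -qiwE 'Notion|database|sync' <<< "$text"; then
        echo "domain:notion"
    elif [[ "$languages" == *python* ]]; then
        echo "language:python"
    else
        echo "universal"
    fi
}
```

Generic words such as "area", "bot", or "database" will produce false matches on unrelated lessons, which is one reason to review the `--dry-run` diff before using `--apply`.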

### 8. Propagation Tracking

New optional field in lesson YAML:

```yaml
validated_in: [ha-aria, telegram-brief, autonomous-coding-toolkit]
```

**Workflow**:
1. Lesson created with `scope: [domain:ha-aria]` and `validated_in: [ha-aria]`
2. Same anti-pattern found in telegram-brief → `validated_in: [ha-aria, telegram-brief]`
3. At 3+ validations across different domains → scope auto-widens to `universal`

**Implementation**: The lesson-scanner agent appends to `validated_in` when it finds a violation and the fix is confirmed. `scope-infer.sh --update-propagation` can check for scope-widening candidates.

This is the compounding mechanism: lessons start narrow and earn broader scope through evidence.
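
The widening check in step 3 of the workflow might be sketched as follows. Both function names are hypothetical. The sketch counts distinct project names in `validated_in` as a stand-in for "different domains", which is an assumption, because the design does not define how a project maps to a domain.

```bash
# Hypothetical helper: count distinct entries in a validated_in value
# such as "[ha-aria, telegram-brief, autonomous-coding-toolkit]".
count_distinct_validations() {
    local list="$1"
    list="${list//\[/}"   # strip opening bracket
    list="${list//\]/}"   # strip closing bracket
    list="${list// /}"    # strip spaces
    tr ',' '\n' <<< "$list" | sed '/^$/d' | sort -u | wc -l | tr -d ' '
}

# Returns 0 when the lesson has reached the 3+ threshold from the workflow.
should_widen_scope() {
    local count
    count=$(count_distinct_validations "$1")
    [[ "$count" -ge 3 ]]
}
```

Duplicate entries are collapsed before counting, so repeated validations in the same project do not widen a lesson's scope.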

---

## File Inventory

| File | Action | Description |
|------|--------|-------------|
| `scripts/lesson-check.sh` | Modify | Add scope parsing, project detection, filtering, CLI flags |
| `scripts/scope-infer.sh` | Create | Bulk scope inference for existing lessons |
| `scripts/tests/test-scope-filtering.sh` | Create | Tests for scope matching, project detection |
| `scripts/tests/test-lesson-check.sh` | Modify | Add scope-aware test cases |
| `docs/lessons/TEMPLATE.md` | Modify | Add scope field to template |
| `docs/lessons/0001-*.md` through `0010-*.md` | Modify | Add scope tags to 10 existing toolkit lessons |
| `~/.claude/agents/lesson-scanner.md` | Modify | Add scope filtering to agent prompt |
| `~/.claude/skills/capture-lesson/SKILL.md` | Modify | Add scope inference to creation workflow |
| `~/Documents/docs/lessons/2026-*.md` | Modify (via scope-infer.sh) | Add **Scope:** field to 76 workspace lessons |

## Scope Tag Assignments (Existing Toolkit Lessons)

| ID | Title | Proposed Scope |
|----|-------|---------------|
| 0001 | Bare exception swallowing | `[language:python]` |
| 0002 | Async def without await | `[language:python]` |
| 0003 | create_task without callback | `[language:python]` |
| 0004 | Hardcoded test counts | `[universal]` |
| 0005 | sqlite without closing | `[language:python]` |
| 0006 | venv pip path | `[language:python, framework:pytest]` |
| 0007 | Runner state self-rejection | `[project:autonomous-coding-toolkit]` |
| 0008 | Quality gate blind spot | `[project:autonomous-coding-toolkit]` |
| 0009 | Parser overcount empty batches | `[project:autonomous-coding-toolkit]` |
| 0010 | local outside function bash | `[language:bash]` |

## Testing Plan

1. **Scope parsing**: lesson with scope field parsed correctly; missing scope defaults to universal
2. **Project detection**: CLAUDE.md scope tags read; fallback to language detection; --scope override
3. **Scope matching**: universal matches everything; domain:ha-aria only matches ha-aria projects; empty intersection skips lesson
4. **--all-scopes**: bypasses filtering, all lessons applied
5. **--show-scope**: displays detected scope and exits
6. **Integration**: full lesson-check run with scope filtering on a non-Python project skips Python-only lessons
7. **scope-infer.sh**: correct inference from lesson content; --dry-run shows diff; --apply writes

## Dependencies

- None on other phases
- `detect_project_type()` in `lib/common.sh` already exists
- TEMPLATE.md update is backward-compatible (scope optional, defaults universal)

## Risks

| Risk | Mitigation |
|------|-----------|
| Over-scoping hides real violations | `--all-scopes` escape hatch; `validated_in` promotes proven lessons |
| Scope tags go stale | Propagation tracking auto-widens; scope-infer.sh can re-run periodically |
| CLAUDE.md scope tags missing | Fallback to language detection; warn in --show-scope output |
| Manual tagging burden for new lessons | `/capture-lesson` skill infers scope at creation time |