@jokerized/getresearchdone 0.4.1
- package/.claude-plugin/plugin.json +103 -0
- package/README.md +211 -0
- package/agents/grd-baseline-assessor.md +684 -0
- package/agents/grd-code-reviewer.md +300 -0
- package/agents/grd-codebase-mapper.md +355 -0
- package/agents/grd-critique-agent.md +119 -0
- package/agents/grd-debugger.md +519 -0
- package/agents/grd-deep-diver.md +737 -0
- package/agents/grd-eval-planner.md +913 -0
- package/agents/grd-eval-reporter.md +717 -0
- package/agents/grd-executor.md +683 -0
- package/agents/grd-feasibility-analyst.md +624 -0
- package/agents/grd-integration-checker.md +367 -0
- package/agents/grd-knowledge-miner.md +81 -0
- package/agents/grd-migrator.md +88 -0
- package/agents/grd-phase-researcher.md +697 -0
- package/agents/grd-plan-checker.md +443 -0
- package/agents/grd-planner.md +1532 -0
- package/agents/grd-product-owner.md +562 -0
- package/agents/grd-project-researcher.md +513 -0
- package/agents/grd-research-synthesizer.md +273 -0
- package/agents/grd-roadmapper.md +798 -0
- package/agents/grd-surveyor.md +566 -0
- package/agents/grd-verifier.md +893 -0
- package/bin/gd.js +4 -0
- package/bin/gd.ts +227 -0
- package/bin/grd-manifest.js +4 -0
- package/bin/grd-manifest.ts +286 -0
- package/bin/grd-mcp-server.js +4 -0
- package/bin/grd-mcp-server.ts +124 -0
- package/bin/grd-tools.js +4 -0
- package/bin/grd-tools.ts +2471 -0
- package/bin/postinstall.js +4 -0
- package/bin/postinstall.ts +80 -0
- package/commands/add-phase.md +123 -0
- package/commands/add-todo.md +87 -0
- package/commands/assess-baseline.md +289 -0
- package/commands/autopilot.md +100 -0
- package/commands/autoplan.md +55 -0
- package/commands/check-todos.md +87 -0
- package/commands/compare-methods.md +262 -0
- package/commands/complete-milestone.md +225 -0
- package/commands/debug.md +372 -0
- package/commands/deep-dive.md +288 -0
- package/commands/discover.md +281 -0
- package/commands/discuss-phase.md +188 -0
- package/commands/discuss.md +55 -0
- package/commands/eval-report.md +310 -0
- package/commands/evolve.md +79 -0
- package/commands/execute-phase.md +1017 -0
- package/commands/feasibility.md +292 -0
- package/commands/help.md +407 -0
- package/commands/init.md +1508 -0
- package/commands/insert-phase.md +113 -0
- package/commands/iterate.md +327 -0
- package/commands/list-phase-assumptions.md +217 -0
- package/commands/long-term-roadmap.md +202 -0
- package/commands/map-codebase.md +111 -0
- package/commands/migrate.md +159 -0
- package/commands/new-milestone.md +169 -0
- package/commands/pause-work.md +83 -0
- package/commands/plan-milestone-gaps.md +373 -0
- package/commands/plan-phase.md +655 -0
- package/commands/principles.md +328 -0
- package/commands/product-plan.md +319 -0
- package/commands/progress.md +481 -0
- package/commands/quick.md +167 -0
- package/commands/reapply-patches.md +154 -0
- package/commands/remove-phase.md +97 -0
- package/commands/requirement.md +96 -0
- package/commands/resume-project.md +113 -0
- package/commands/settings.md +1144 -0
- package/commands/survey.md +242 -0
- package/commands/sync.md +246 -0
- package/commands/tracker-setup.md +322 -0
- package/commands/update.md +202 -0
- package/commands/verify-phase.md +335 -0
- package/commands/verify-work.md +701 -0
- package/commands/wireup.md +29 -0
- package/dist/bin/gd.d.ts +3 -0
- package/dist/bin/gd.d.ts.map +1 -0
- package/dist/bin/gd.js +178 -0
- package/dist/bin/gd.js.map +1 -0
- package/dist/bin/grd-manifest.d.ts +3 -0
- package/dist/bin/grd-manifest.d.ts.map +1 -0
- package/dist/bin/grd-manifest.js +202 -0
- package/dist/bin/grd-manifest.js.map +1 -0
- package/dist/bin/grd-mcp-server.d.ts +3 -0
- package/dist/bin/grd-mcp-server.d.ts.map +1 -0
- package/dist/bin/grd-mcp-server.js +71 -0
- package/dist/bin/grd-mcp-server.js.map +1 -0
- package/dist/bin/grd-tools.d.ts +3 -0
- package/dist/bin/grd-tools.d.ts.map +1 -0
- package/dist/bin/grd-tools.js +1680 -0
- package/dist/bin/grd-tools.js.map +1 -0
- package/dist/bin/postinstall.d.ts +3 -0
- package/dist/bin/postinstall.d.ts.map +1 -0
- package/dist/bin/postinstall.js +61 -0
- package/dist/bin/postinstall.js.map +1 -0
- package/dist/lib/autopilot-milestone.d.ts +2 -0
- package/dist/lib/autopilot-milestone.d.ts.map +1 -0
- package/dist/lib/autopilot-milestone.js +94 -0
- package/dist/lib/autopilot-milestone.js.map +1 -0
- package/dist/lib/autopilot-pipeline.d.ts +2 -0
- package/dist/lib/autopilot-pipeline.d.ts.map +1 -0
- package/dist/lib/autopilot-pipeline.js +830 -0
- package/dist/lib/autopilot-pipeline.js.map +1 -0
- package/dist/lib/autopilot-waves.d.ts +2 -0
- package/dist/lib/autopilot-waves.d.ts.map +1 -0
- package/dist/lib/autopilot-waves.js +266 -0
- package/dist/lib/autopilot-waves.js.map +1 -0
- package/dist/lib/autopilot.d.ts +2 -0
- package/dist/lib/autopilot.d.ts.map +1 -0
- package/dist/lib/autopilot.js +1314 -0
- package/dist/lib/autopilot.js.map +1 -0
- package/dist/lib/autoplan.d.ts +2 -0
- package/dist/lib/autoplan.d.ts.map +1 -0
- package/dist/lib/autoplan.js +198 -0
- package/dist/lib/autoplan.js.map +1 -0
- package/dist/lib/autoresearch.d.ts +2 -0
- package/dist/lib/autoresearch.d.ts.map +1 -0
- package/dist/lib/autoresearch.js +626 -0
- package/dist/lib/autoresearch.js.map +1 -0
- package/dist/lib/backend.d.ts +2 -0
- package/dist/lib/backend.d.ts.map +1 -0
- package/dist/lib/backend.js +1036 -0
- package/dist/lib/backend.js.map +1 -0
- package/dist/lib/benchmark.d.ts +99 -0
- package/dist/lib/benchmark.d.ts.map +1 -0
- package/dist/lib/benchmark.js +278 -0
- package/dist/lib/benchmark.js.map +1 -0
- package/dist/lib/citations.d.ts +2 -0
- package/dist/lib/citations.d.ts.map +1 -0
- package/dist/lib/citations.js +642 -0
- package/dist/lib/citations.js.map +1 -0
- package/dist/lib/cleanup.d.ts +2 -0
- package/dist/lib/cleanup.d.ts.map +1 -0
- package/dist/lib/cleanup.js +1222 -0
- package/dist/lib/cleanup.js.map +1 -0
- package/dist/lib/cli/adapters.d.ts +10 -0
- package/dist/lib/cli/adapters.d.ts.map +1 -0
- package/dist/lib/cli/adapters.js +27 -0
- package/dist/lib/cli/adapters.js.map +1 -0
- package/dist/lib/cli/agent.d.ts +17 -0
- package/dist/lib/cli/agent.d.ts.map +1 -0
- package/dist/lib/cli/agent.js +53 -0
- package/dist/lib/cli/agent.js.map +1 -0
- package/dist/lib/cli/index.d.ts +21 -0
- package/dist/lib/cli/index.d.ts.map +1 -0
- package/dist/lib/cli/index.js +264 -0
- package/dist/lib/cli/index.js.map +1 -0
- package/dist/lib/cli/output.d.ts +20 -0
- package/dist/lib/cli/output.d.ts.map +1 -0
- package/dist/lib/cli/output.js +22 -0
- package/dist/lib/cli/output.js.map +1 -0
- package/dist/lib/cli/scan-dispatch.d.ts +9 -0
- package/dist/lib/cli/scan-dispatch.d.ts.map +1 -0
- package/dist/lib/cli/scan-dispatch.js +107 -0
- package/dist/lib/cli/scan-dispatch.js.map +1 -0
- package/dist/lib/cli/tools.d.ts +16 -0
- package/dist/lib/cli/tools.d.ts.map +1 -0
- package/dist/lib/cli/tools.js +168 -0
- package/dist/lib/cli/tools.js.map +1 -0
- package/dist/lib/commands/_dashboard-parsers.d.ts +2 -0
- package/dist/lib/commands/_dashboard-parsers.d.ts.map +1 -0
- package/dist/lib/commands/_dashboard-parsers.js +192 -0
- package/dist/lib/commands/_dashboard-parsers.js.map +1 -0
- package/dist/lib/commands/analysis.d.ts +2 -0
- package/dist/lib/commands/analysis.d.ts.map +1 -0
- package/dist/lib/commands/analysis.js +1418 -0
- package/dist/lib/commands/analysis.js.map +1 -0
- package/dist/lib/commands/assumptions.d.ts +2 -0
- package/dist/lib/commands/assumptions.d.ts.map +1 -0
- package/dist/lib/commands/assumptions.js +166 -0
- package/dist/lib/commands/assumptions.js.map +1 -0
- package/dist/lib/commands/blame.d.ts +2 -0
- package/dist/lib/commands/blame.d.ts.map +1 -0
- package/dist/lib/commands/blame.js +133 -0
- package/dist/lib/commands/blame.js.map +1 -0
- package/dist/lib/commands/budget.d.ts +2 -0
- package/dist/lib/commands/budget.d.ts.map +1 -0
- package/dist/lib/commands/budget.js +100 -0
- package/dist/lib/commands/budget.js.map +1 -0
- package/dist/lib/commands/check-plans.d.ts +2 -0
- package/dist/lib/commands/check-plans.d.ts.map +1 -0
- package/dist/lib/commands/check-plans.js +190 -0
- package/dist/lib/commands/check-plans.js.map +1 -0
- package/dist/lib/commands/config.d.ts +2 -0
- package/dist/lib/commands/config.d.ts.map +1 -0
- package/dist/lib/commands/config.js +188 -0
- package/dist/lib/commands/config.js.map +1 -0
- package/dist/lib/commands/dashboard.d.ts +2 -0
- package/dist/lib/commands/dashboard.d.ts.map +1 -0
- package/dist/lib/commands/dashboard.js +466 -0
- package/dist/lib/commands/dashboard.js.map +1 -0
- package/dist/lib/commands/estimate.d.ts +2 -0
- package/dist/lib/commands/estimate.d.ts.map +1 -0
- package/dist/lib/commands/estimate.js +148 -0
- package/dist/lib/commands/estimate.js.map +1 -0
- package/dist/lib/commands/eval-diff.d.ts +2 -0
- package/dist/lib/commands/eval-diff.d.ts.map +1 -0
- package/dist/lib/commands/eval-diff.js +213 -0
- package/dist/lib/commands/eval-diff.js.map +1 -0
- package/dist/lib/commands/freshness.d.ts +2 -0
- package/dist/lib/commands/freshness.d.ts.map +1 -0
- package/dist/lib/commands/freshness.js +163 -0
- package/dist/lib/commands/freshness.js.map +1 -0
- package/dist/lib/commands/health.d.ts +2 -0
- package/dist/lib/commands/health.d.ts.map +1 -0
- package/dist/lib/commands/health.js +435 -0
- package/dist/lib/commands/health.js.map +1 -0
- package/dist/lib/commands/index.d.ts +2 -0
- package/dist/lib/commands/index.d.ts.map +1 -0
- package/dist/lib/commands/index.js +128 -0
- package/dist/lib/commands/index.js.map +1 -0
- package/dist/lib/commands/install.d.ts +56 -0
- package/dist/lib/commands/install.d.ts.map +1 -0
- package/dist/lib/commands/install.js +214 -0
- package/dist/lib/commands/install.js.map +1 -0
- package/dist/lib/commands/knowhow-aggregator.d.ts +2 -0
- package/dist/lib/commands/knowhow-aggregator.d.ts.map +1 -0
- package/dist/lib/commands/knowhow-aggregator.js +279 -0
- package/dist/lib/commands/knowhow-aggregator.js.map +1 -0
- package/dist/lib/commands/knowledge-search.d.ts +2 -0
- package/dist/lib/commands/knowledge-search.d.ts.map +1 -0
- package/dist/lib/commands/knowledge-search.js +113 -0
- package/dist/lib/commands/knowledge-search.js.map +1 -0
- package/dist/lib/commands/long-term-roadmap.d.ts +2 -0
- package/dist/lib/commands/long-term-roadmap.d.ts.map +1 -0
- package/dist/lib/commands/long-term-roadmap.js +272 -0
- package/dist/lib/commands/long-term-roadmap.js.map +1 -0
- package/dist/lib/commands/patterns.d.ts +91 -0
- package/dist/lib/commands/patterns.d.ts.map +1 -0
- package/dist/lib/commands/patterns.js +391 -0
- package/dist/lib/commands/patterns.js.map +1 -0
- package/dist/lib/commands/phase-info.d.ts +2 -0
- package/dist/lib/commands/phase-info.d.ts.map +1 -0
- package/dist/lib/commands/phase-info.js +509 -0
- package/dist/lib/commands/phase-info.js.map +1 -0
- package/dist/lib/commands/plan-lint.d.ts +56 -0
- package/dist/lib/commands/plan-lint.d.ts.map +1 -0
- package/dist/lib/commands/plan-lint.js +481 -0
- package/dist/lib/commands/plan-lint.js.map +1 -0
- package/dist/lib/commands/plan-phase.d.ts +53 -0
- package/dist/lib/commands/plan-phase.d.ts.map +1 -0
- package/dist/lib/commands/plan-phase.js +288 -0
- package/dist/lib/commands/plan-phase.js.map +1 -0
- package/dist/lib/commands/progress.d.ts +2 -0
- package/dist/lib/commands/progress.d.ts.map +1 -0
- package/dist/lib/commands/progress.js +266 -0
- package/dist/lib/commands/progress.js.map +1 -0
- package/dist/lib/commands/quality.d.ts +2 -0
- package/dist/lib/commands/quality.d.ts.map +1 -0
- package/dist/lib/commands/quality.js +80 -0
- package/dist/lib/commands/quality.js.map +1 -0
- package/dist/lib/commands/rollback.d.ts +2 -0
- package/dist/lib/commands/rollback.d.ts.map +1 -0
- package/dist/lib/commands/rollback.js +145 -0
- package/dist/lib/commands/rollback.js.map +1 -0
- package/dist/lib/commands/scan.d.ts +25 -0
- package/dist/lib/commands/scan.d.ts.map +1 -0
- package/dist/lib/commands/scan.js +28 -0
- package/dist/lib/commands/scan.js.map +1 -0
- package/dist/lib/commands/search.d.ts +2 -0
- package/dist/lib/commands/search.d.ts.map +1 -0
- package/dist/lib/commands/search.js +212 -0
- package/dist/lib/commands/search.js.map +1 -0
- package/dist/lib/commands/select-candidate.d.ts +128 -0
- package/dist/lib/commands/select-candidate.d.ts.map +1 -0
- package/dist/lib/commands/select-candidate.js +518 -0
- package/dist/lib/commands/select-candidate.js.map +1 -0
- package/dist/lib/commands/singularity.d.ts +2 -0
- package/dist/lib/commands/singularity.d.ts.map +1 -0
- package/dist/lib/commands/singularity.js +185 -0
- package/dist/lib/commands/singularity.js.map +1 -0
- package/dist/lib/commands/slug-timestamp.d.ts +2 -0
- package/dist/lib/commands/slug-timestamp.d.ts.map +1 -0
- package/dist/lib/commands/slug-timestamp.js +54 -0
- package/dist/lib/commands/slug-timestamp.js.map +1 -0
- package/dist/lib/commands/tail.d.ts +2 -0
- package/dist/lib/commands/tail.d.ts.map +1 -0
- package/dist/lib/commands/tail.js +100 -0
- package/dist/lib/commands/tail.js.map +1 -0
- package/dist/lib/commands/todo.d.ts +2 -0
- package/dist/lib/commands/todo.d.ts.map +1 -0
- package/dist/lib/commands/todo.js +200 -0
- package/dist/lib/commands/todo.js.map +1 -0
- package/dist/lib/commands/watch.d.ts +2 -0
- package/dist/lib/commands/watch.d.ts.map +1 -0
- package/dist/lib/commands/watch.js +72 -0
- package/dist/lib/commands/watch.js.map +1 -0
- package/dist/lib/complexity.d.ts +55 -0
- package/dist/lib/complexity.d.ts.map +1 -0
- package/dist/lib/complexity.js +80 -0
- package/dist/lib/complexity.js.map +1 -0
- package/dist/lib/context/agents.d.ts +2 -0
- package/dist/lib/context/agents.d.ts.map +1 -0
- package/dist/lib/context/agents.js +344 -0
- package/dist/lib/context/agents.js.map +1 -0
- package/dist/lib/context/base.d.ts +2 -0
- package/dist/lib/context/base.d.ts.map +1 -0
- package/dist/lib/context/base.js +81 -0
- package/dist/lib/context/base.js.map +1 -0
- package/dist/lib/context/execute.d.ts +2 -0
- package/dist/lib/context/execute.d.ts.map +1 -0
- package/dist/lib/context/execute.js +753 -0
- package/dist/lib/context/execute.js.map +1 -0
- package/dist/lib/context/index.d.ts +2 -0
- package/dist/lib/context/index.d.ts.map +1 -0
- package/dist/lib/context/index.js +88 -0
- package/dist/lib/context/index.js.map +1 -0
- package/dist/lib/context/progress.d.ts +2 -0
- package/dist/lib/context/progress.d.ts.map +1 -0
- package/dist/lib/context/progress.js +178 -0
- package/dist/lib/context/progress.js.map +1 -0
- package/dist/lib/context/project.d.ts +2 -0
- package/dist/lib/context/project.d.ts.map +1 -0
- package/dist/lib/context/project.js +413 -0
- package/dist/lib/context/project.js.map +1 -0
- package/dist/lib/context/research.d.ts +2 -0
- package/dist/lib/context/research.d.ts.map +1 -0
- package/dist/lib/context/research.js +466 -0
- package/dist/lib/context/research.js.map +1 -0
- package/dist/lib/dead-ends.d.ts +28 -0
- package/dist/lib/dead-ends.d.ts.map +1 -0
- package/dist/lib/dead-ends.js +451 -0
- package/dist/lib/dead-ends.js.map +1 -0
- package/dist/lib/deps.d.ts +2 -0
- package/dist/lib/deps.d.ts.map +1 -0
- package/dist/lib/deps.js +630 -0
- package/dist/lib/deps.js.map +1 -0
- package/dist/lib/discussion.d.ts +2 -0
- package/dist/lib/discussion.d.ts.map +1 -0
- package/dist/lib/discussion.js +1041 -0
- package/dist/lib/discussion.js.map +1 -0
- package/dist/lib/drift.d.ts +36 -0
- package/dist/lib/drift.d.ts.map +1 -0
- package/dist/lib/drift.js +481 -0
- package/dist/lib/drift.js.map +1 -0
- package/dist/lib/evolve/_dimensions-features.d.ts +2 -0
- package/dist/lib/evolve/_dimensions-features.d.ts.map +1 -0
- package/dist/lib/evolve/_dimensions-features.js +369 -0
- package/dist/lib/evolve/_dimensions-features.js.map +1 -0
- package/dist/lib/evolve/_dimensions.d.ts +2 -0
- package/dist/lib/evolve/_dimensions.d.ts.map +1 -0
- package/dist/lib/evolve/_dimensions.js +358 -0
- package/dist/lib/evolve/_dimensions.js.map +1 -0
- package/dist/lib/evolve/_product-ideation.d.ts +2 -0
- package/dist/lib/evolve/_product-ideation.d.ts.map +1 -0
- package/dist/lib/evolve/_product-ideation.js +281 -0
- package/dist/lib/evolve/_product-ideation.js.map +1 -0
- package/dist/lib/evolve/_prompts.d.ts +2 -0
- package/dist/lib/evolve/_prompts.d.ts.map +1 -0
- package/dist/lib/evolve/_prompts.js +153 -0
- package/dist/lib/evolve/_prompts.js.map +1 -0
- package/dist/lib/evolve/cli.d.ts +2 -0
- package/dist/lib/evolve/cli.d.ts.map +1 -0
- package/dist/lib/evolve/cli.js +224 -0
- package/dist/lib/evolve/cli.js.map +1 -0
- package/dist/lib/evolve/discovery.d.ts +2 -0
- package/dist/lib/evolve/discovery.d.ts.map +1 -0
- package/dist/lib/evolve/discovery.js +391 -0
- package/dist/lib/evolve/discovery.js.map +1 -0
- package/dist/lib/evolve/index.d.ts +2 -0
- package/dist/lib/evolve/index.d.ts.map +1 -0
- package/dist/lib/evolve/index.js +88 -0
- package/dist/lib/evolve/index.js.map +1 -0
- package/dist/lib/evolve/orchestrator.d.ts +2 -0
- package/dist/lib/evolve/orchestrator.d.ts.map +1 -0
- package/dist/lib/evolve/orchestrator.js +851 -0
- package/dist/lib/evolve/orchestrator.js.map +1 -0
- package/dist/lib/evolve/scoring.d.ts +2 -0
- package/dist/lib/evolve/scoring.d.ts.map +1 -0
- package/dist/lib/evolve/scoring.js +118 -0
- package/dist/lib/evolve/scoring.js.map +1 -0
- package/dist/lib/evolve/state.d.ts +2 -0
- package/dist/lib/evolve/state.d.ts.map +1 -0
- package/dist/lib/evolve/state.js +264 -0
- package/dist/lib/evolve/state.js.map +1 -0
- package/dist/lib/evolve/types.d.ts +249 -0
- package/dist/lib/evolve/types.d.ts.map +1 -0
- package/dist/lib/evolve/types.js +3 -0
- package/dist/lib/evolve/types.js.map +1 -0
- package/dist/lib/frontmatter.d.ts +2 -0
- package/dist/lib/frontmatter.d.ts.map +1 -0
- package/dist/lib/frontmatter.js +513 -0
- package/dist/lib/frontmatter.js.map +1 -0
- package/dist/lib/gates.d.ts +2 -0
- package/dist/lib/gates.d.ts.map +1 -0
- package/dist/lib/gates.js +578 -0
- package/dist/lib/gates.js.map +1 -0
- package/dist/lib/genome.d.ts +10 -0
- package/dist/lib/genome.d.ts.map +1 -0
- package/dist/lib/genome.js +368 -0
- package/dist/lib/genome.js.map +1 -0
- package/dist/lib/got.d.ts +2 -0
- package/dist/lib/got.d.ts.map +1 -0
- package/dist/lib/got.js +280 -0
- package/dist/lib/got.js.map +1 -0
- package/dist/lib/invariants.d.ts +2 -0
- package/dist/lib/invariants.d.ts.map +1 -0
- package/dist/lib/invariants.js +298 -0
- package/dist/lib/invariants.js.map +1 -0
- package/dist/lib/knowledge.d.ts +2 -0
- package/dist/lib/knowledge.d.ts.map +1 -0
- package/dist/lib/knowledge.js +658 -0
- package/dist/lib/knowledge.js.map +1 -0
- package/dist/lib/long-term-roadmap.d.ts +2 -0
- package/dist/lib/long-term-roadmap.d.ts.map +1 -0
- package/dist/lib/long-term-roadmap.js +602 -0
- package/dist/lib/long-term-roadmap.js.map +1 -0
- package/dist/lib/markdown-split.d.ts +2 -0
- package/dist/lib/markdown-split.d.ts.map +1 -0
- package/dist/lib/markdown-split.js +199 -0
- package/dist/lib/markdown-split.js.map +1 -0
- package/dist/lib/mcp-server.d.ts +2 -0
- package/dist/lib/mcp-server.d.ts.map +1 -0
- package/dist/lib/mcp-server.js +2424 -0
- package/dist/lib/mcp-server.js.map +1 -0
- package/dist/lib/metrics.d.ts +16 -0
- package/dist/lib/metrics.d.ts.map +1 -0
- package/dist/lib/metrics.js +48 -0
- package/dist/lib/metrics.js.map +1 -0
- package/dist/lib/overstory.d.ts +2 -0
- package/dist/lib/overstory.d.ts.map +1 -0
- package/dist/lib/overstory.js +211 -0
- package/dist/lib/overstory.js.map +1 -0
- package/dist/lib/parallel.d.ts +2 -0
- package/dist/lib/parallel.d.ts.map +1 -0
- package/dist/lib/parallel.js +349 -0
- package/dist/lib/parallel.js.map +1 -0
- package/dist/lib/paths.d.ts +2 -0
- package/dist/lib/paths.d.ts.map +1 -0
- package/dist/lib/paths.js +254 -0
- package/dist/lib/paths.js.map +1 -0
- package/dist/lib/phase-complete-llm.d.ts +22 -0
- package/dist/lib/phase-complete-llm.d.ts.map +1 -0
- package/dist/lib/phase-complete-llm.js +331 -0
- package/dist/lib/phase-complete-llm.js.map +1 -0
- package/dist/lib/phase-complete.d.ts +46 -0
- package/dist/lib/phase-complete.d.ts.map +1 -0
- package/dist/lib/phase-complete.js +278 -0
- package/dist/lib/phase-complete.js.map +1 -0
- package/dist/lib/phase-io.d.ts +2 -0
- package/dist/lib/phase-io.d.ts.map +1 -0
- package/dist/lib/phase-io.js +126 -0
- package/dist/lib/phase-io.js.map +1 -0
- package/dist/lib/phase.d.ts +2 -0
- package/dist/lib/phase.d.ts.map +1 -0
- package/dist/lib/phase.js +1344 -0
- package/dist/lib/phase.js.map +1 -0
- package/dist/lib/plan-tournament.d.ts +63 -0
- package/dist/lib/plan-tournament.d.ts.map +1 -0
- package/dist/lib/plan-tournament.js +353 -0
- package/dist/lib/plan-tournament.js.map +1 -0
- package/dist/lib/refinement.d.ts +74 -0
- package/dist/lib/refinement.d.ts.map +1 -0
- package/dist/lib/refinement.js +283 -0
- package/dist/lib/refinement.js.map +1 -0
- package/dist/lib/requirements.d.ts +2 -0
- package/dist/lib/requirements.d.ts.map +1 -0
- package/dist/lib/requirements.js +355 -0
- package/dist/lib/requirements.js.map +1 -0
- package/dist/lib/research-bundle.d.ts +2 -0
- package/dist/lib/research-bundle.d.ts.map +1 -0
- package/dist/lib/research-bundle.js +246 -0
- package/dist/lib/research-bundle.js.map +1 -0
- package/dist/lib/roadmap.d.ts +2 -0
- package/dist/lib/roadmap.d.ts.map +1 -0
- package/dist/lib/roadmap.js +541 -0
- package/dist/lib/roadmap.js.map +1 -0
- package/dist/lib/sample.d.ts +16 -0
- package/dist/lib/sample.d.ts.map +1 -0
- package/dist/lib/sample.js +20 -0
- package/dist/lib/sample.js.map +1 -0
- package/dist/lib/scaffold.d.ts +2 -0
- package/dist/lib/scaffold.d.ts.map +1 -0
- package/dist/lib/scaffold.js +355 -0
- package/dist/lib/scaffold.js.map +1 -0
- package/dist/lib/scan/_utils.d.ts +11 -0
- package/dist/lib/scan/_utils.d.ts.map +1 -0
- package/dist/lib/scan/_utils.js +36 -0
- package/dist/lib/scan/_utils.js.map +1 -0
- package/dist/lib/scan/base64.d.ts +15 -0
- package/dist/lib/scan/base64.d.ts.map +1 -0
- package/dist/lib/scan/base64.js +66 -0
- package/dist/lib/scan/base64.js.map +1 -0
- package/dist/lib/scan/ignorefile.d.ts +30 -0
- package/dist/lib/scan/ignorefile.d.ts.map +1 -0
- package/dist/lib/scan/ignorefile.js +101 -0
- package/dist/lib/scan/ignorefile.js.map +1 -0
- package/dist/lib/scan/injection.d.ts +14 -0
- package/dist/lib/scan/injection.d.ts.map +1 -0
- package/dist/lib/scan/injection.js +39 -0
- package/dist/lib/scan/injection.js.map +1 -0
- package/dist/lib/scan/patterns.d.ts +17 -0
- package/dist/lib/scan/patterns.d.ts.map +1 -0
- package/dist/lib/scan/patterns.js +123 -0
- package/dist/lib/scan/patterns.js.map +1 -0
- package/dist/lib/scan/strip-markdown.d.ts +7 -0
- package/dist/lib/scan/strip-markdown.d.ts.map +1 -0
- package/dist/lib/scan/strip-markdown.js +38 -0
- package/dist/lib/scan/strip-markdown.js.map +1 -0
- package/dist/lib/scan/types.d.ts +23 -0
- package/dist/lib/scan/types.d.ts.map +1 -0
- package/dist/lib/scan/types.js +3 -0
- package/dist/lib/scan/types.js.map +1 -0
- package/dist/lib/scheduler-wait.d.ts +2 -0
- package/dist/lib/scheduler-wait.d.ts.map +1 -0
- package/dist/lib/scheduler-wait.js +59 -0
- package/dist/lib/scheduler-wait.js.map +1 -0
- package/dist/lib/scheduler.d.ts +254 -0
- package/dist/lib/scheduler.d.ts.map +1 -0
- package/dist/lib/scheduler.js +1147 -0
- package/dist/lib/scheduler.js.map +1 -0
- package/dist/lib/state.d.ts +2 -0
- package/dist/lib/state.d.ts.map +1 -0
- package/dist/lib/state.js +744 -0
- package/dist/lib/state.js.map +1 -0
- package/dist/lib/think.d.ts +18 -0
- package/dist/lib/think.d.ts.map +1 -0
- package/dist/lib/think.js +317 -0
- package/dist/lib/think.js.map +1 -0
- package/dist/lib/tracker.d.ts +2 -0
- package/dist/lib/tracker.d.ts.map +1 -0
- package/dist/lib/tracker.js +1121 -0
- package/dist/lib/tracker.js.map +1 -0
- package/dist/lib/types.d.ts +1514 -0
- package/dist/lib/types.d.ts.map +1 -0
- package/dist/lib/types.js +4 -0
- package/dist/lib/types.js.map +1 -0
- package/dist/lib/utils.d.ts +2 -0
- package/dist/lib/utils.d.ts.map +1 -0
- package/dist/lib/utils.js +1363 -0
- package/dist/lib/utils.js.map +1 -0
- package/dist/lib/verify.d.ts +2 -0
- package/dist/lib/verify.d.ts.map +1 -0
- package/dist/lib/verify.js +1153 -0
- package/dist/lib/verify.js.map +1 -0
- package/dist/lib/wireup/autofix.d.ts +2 -0
- package/dist/lib/wireup/autofix.d.ts.map +1 -0
- package/dist/lib/wireup/autofix.js +188 -0
- package/dist/lib/wireup/autofix.js.map +1 -0
- package/dist/lib/wireup/cli.d.ts +2 -0
- package/dist/lib/wireup/cli.d.ts.map +1 -0
- package/dist/lib/wireup/cli.js +194 -0
- package/dist/lib/wireup/cli.js.map +1 -0
- package/dist/lib/wireup/detection.d.ts +47 -0
- package/dist/lib/wireup/detection.d.ts.map +1 -0
- package/dist/lib/wireup/detection.js +410 -0
- package/dist/lib/wireup/detection.js.map +1 -0
- package/dist/lib/wireup/discovery.d.ts +2 -0
- package/dist/lib/wireup/discovery.d.ts.map +1 -0
- package/dist/lib/wireup/discovery.js +934 -0
- package/dist/lib/wireup/discovery.js.map +1 -0
- package/dist/lib/wireup/execution.d.ts +2 -0
- package/dist/lib/wireup/execution.d.ts.map +1 -0
- package/dist/lib/wireup/execution.js +573 -0
- package/dist/lib/wireup/execution.js.map +1 -0
- package/dist/lib/wireup/index.d.ts +2 -0
- package/dist/lib/wireup/index.d.ts.map +1 -0
- package/dist/lib/wireup/index.js +85 -0
- package/dist/lib/wireup/index.js.map +1 -0
- package/dist/lib/wireup/orchestrator.d.ts +2 -0
- package/dist/lib/wireup/orchestrator.d.ts.map +1 -0
- package/dist/lib/wireup/orchestrator.js +366 -0
- package/dist/lib/wireup/orchestrator.js.map +1 -0
- package/dist/lib/wireup/report.d.ts +47 -0
- package/dist/lib/wireup/report.d.ts.map +1 -0
- package/dist/lib/wireup/report.js +201 -0
- package/dist/lib/wireup/report.js.map +1 -0
- package/dist/lib/wireup/scenarios.d.ts +2 -0
- package/dist/lib/wireup/scenarios.d.ts.map +1 -0
- package/dist/lib/wireup/scenarios.js +516 -0
- package/dist/lib/wireup/scenarios.js.map +1 -0
- package/dist/lib/wireup/state.d.ts +2 -0
- package/dist/lib/wireup/state.d.ts.map +1 -0
- package/dist/lib/wireup/state.js +102 -0
- package/dist/lib/wireup/state.js.map +1 -0
- package/dist/lib/wireup/types.d.ts +376 -0
- package/dist/lib/wireup/types.d.ts.map +1 -0
- package/dist/lib/wireup/types.js +3 -0
- package/dist/lib/wireup/types.js.map +1 -0
- package/dist/lib/worktree.d.ts +2 -0
- package/dist/lib/worktree.d.ts.map +1 -0
- package/dist/lib/worktree.js +999 -0
- package/dist/lib/worktree.js.map +1 -0
- package/lib/autopilot-milestone.ts +136 -0
- package/lib/autopilot-pipeline.ts +1179 -0
- package/lib/autopilot-waves.ts +361 -0
- package/lib/autopilot.ts +1874 -0
- package/lib/autoplan.ts +280 -0
- package/lib/autoresearch.js +4 -0
- package/lib/autoresearch.ts +886 -0
- package/lib/backend.ts +1252 -0
- package/lib/benchmark.ts +341 -0
- package/lib/citations.ts +760 -0
- package/lib/cleanup.ts +1588 -0
- package/lib/cli/adapters.ts +41 -0
- package/lib/cli/agent.ts +83 -0
- package/lib/cli/index.ts +273 -0
- package/lib/cli/output.ts +33 -0
- package/lib/cli/scan-dispatch.ts +130 -0
- package/lib/cli/tools.ts +198 -0
- package/lib/commands/_dashboard-parsers.ts +275 -0
- package/lib/commands/analysis.ts +1851 -0
- package/lib/commands/assumptions.ts +232 -0
- package/lib/commands/blame.ts +174 -0
- package/lib/commands/budget.ts +148 -0
- package/lib/commands/check-plans.ts +233 -0
- package/lib/commands/config.ts +287 -0
- package/lib/commands/dashboard.ts +680 -0
- package/lib/commands/estimate.ts +204 -0
- package/lib/commands/eval-diff.ts +252 -0
- package/lib/commands/freshness.ts +213 -0
- package/lib/commands/health.ts +607 -0
- package/lib/commands/index.ts +266 -0
- package/lib/commands/install.ts +307 -0
- package/lib/commands/knowhow-aggregator.ts +345 -0
- package/lib/commands/knowledge-search.ts +153 -0
- package/lib/commands/long-term-roadmap.ts +390 -0
- package/lib/commands/patterns.ts +465 -0
- package/lib/commands/phase-info.ts +698 -0
- package/lib/commands/plan-lint.ts +546 -0
- package/lib/commands/plan-phase.ts +375 -0
- package/lib/commands/progress.ts +319 -0
- package/lib/commands/quality.ts +138 -0
- package/lib/commands/rollback.ts +195 -0
- package/lib/commands/scan.ts +72 -0
- package/lib/commands/search.ts +300 -0
- package/lib/commands/select-candidate.ts +687 -0
- package/lib/commands/singularity.ts +222 -0
- package/lib/commands/slug-timestamp.ts +74 -0
- package/lib/commands/tail.ts +129 -0
- package/lib/commands/todo.ts +273 -0
- package/lib/commands/watch.ts +80 -0
- package/lib/complexity.ts +117 -0
- package/lib/context/agents.ts +505 -0
- package/lib/context/base.ts +123 -0
- package/lib/context/execute.ts +977 -0
- package/lib/context/index.ts +110 -0
- package/lib/context/progress.ts +278 -0
- package/lib/context/project.ts +531 -0
- package/lib/context/research.ts +646 -0
- package/lib/dead-ends.ts +506 -0
- package/lib/deps.ts +773 -0
- package/lib/discussion.ts +1275 -0
- package/lib/drift.ts +519 -0
- package/lib/evolve/_dimensions-features.ts +525 -0
- package/lib/evolve/_dimensions.ts +511 -0
- package/lib/evolve/_product-ideation.ts +405 -0
- package/lib/evolve/_prompts.ts +178 -0
- package/lib/evolve/cli.ts +330 -0
- package/lib/evolve/discovery.ts +571 -0
- package/lib/evolve/index.ts +105 -0
- package/lib/evolve/orchestrator.ts +1139 -0
- package/lib/evolve/scoring.ts +167 -0
- package/lib/evolve/state.ts +330 -0
- package/lib/evolve/types.ts +290 -0
- package/lib/frontmatter.ts +615 -0
- package/lib/gates.ts +695 -0
- package/lib/genome.ts +402 -0
- package/lib/got.js +4 -0
- package/lib/got.ts +361 -0
- package/lib/invariants.ts +378 -0
- package/lib/knowledge.ts +768 -0
- package/lib/long-term-roadmap.ts +806 -0
- package/lib/markdown-split.ts +273 -0
- package/lib/mcp-server.ts +3292 -0
- package/lib/metrics.ts +49 -0
- package/lib/overstory.ts +270 -0
- package/lib/parallel.ts +570 -0
- package/lib/paths.ts +293 -0
- package/lib/phase-complete-llm.ts +376 -0
- package/lib/phase-complete.ts +366 -0
- package/lib/phase-io.ts +101 -0
- package/lib/phase.ts +1981 -0
- package/lib/plan-tournament.ts +426 -0
- package/lib/refinement.ts +349 -0
- package/lib/requirements.ts +469 -0
- package/lib/research-bundle.ts +300 -0
- package/lib/roadmap.ts +775 -0
- package/lib/scaffold.ts +480 -0
- package/lib/scan/_utils.ts +37 -0
- package/lib/scan/base64.ts +90 -0
- package/lib/scan/ignorefile.ts +109 -0
- package/lib/scan/injection.ts +67 -0
- package/lib/scan/patterns.ts +139 -0
- package/lib/scan/strip-markdown.ts +39 -0
- package/lib/scan/types.ts +28 -0
- package/lib/scheduler-wait.ts +58 -0
- package/lib/scheduler.ts +1370 -0
- package/lib/state.ts +1000 -0
- package/lib/think.ts +365 -0
- package/lib/tracker.ts +1591 -0
- package/lib/types.ts +1663 -0
- package/lib/utils.ts +1479 -0
- package/lib/verify.ts +1434 -0
- package/lib/wireup/autofix.ts +241 -0
- package/lib/wireup/cli.ts +278 -0
- package/lib/wireup/detection.ts +542 -0
- package/lib/wireup/discovery.ts +1063 -0
- package/lib/wireup/execution.ts +686 -0
- package/lib/wireup/index.ts +117 -0
- package/lib/wireup/orchestrator.ts +519 -0
- package/lib/wireup/report.ts +286 -0
- package/lib/wireup/scenarios.ts +616 -0
- package/lib/wireup/state.ts +139 -0
- package/lib/wireup/types.ts +436 -0
- package/lib/worktree.ts +1309 -0
- package/package.json +67 -0
@@ -0,0 +1,310 @@
+---
+description: Collect and analyze quantitative evaluation results, compare against baselines, run ablations
+argument-hint: <phase number or name>
+---
+
+<purpose>
+Collect and report evaluation results after phase execution. Runs the evaluation protocol
+defined in EVAL.md, compares results against baselines and targets, performs ablation
+analysis, and updates BENCHMARKS.md with new data points. If results fall below targets,
+suggests iteration via /grd:iterate.
+</purpose>
+
+<context>
+CLAUDE.md rules: @CLAUDE.md
+
+**Project structure** (paths resolved via init):
+- `${phase_dir}/EVAL.md` — evaluation plan (must exist)
+- `${phase_dir}/PLAN.md` — phase execution plan
+- `.planning/BASELINE.md` — current performance baseline
+- `.planning/BENCHMARKS.md` — historical benchmark data across phases
+- `${research_dir}/LANDSCAPE.md` — SOTA references
+- `.planning/config.json` — GRD configuration
+
+**Agent available:**
+- `grd-eval-reporter` — specialized evaluation execution and analysis agent
+</context>
+
+<process>
+
+## Step 0: INITIALIZE — Load Evaluation Context
+
+0. **Run initialization**:
+```bash
+INIT=$(node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js init eval-report "$PHASE")
+```
+Parse JSON for: `research_dir`, `phases_dir`, `phase_dir` (resolve from phases_dir + phase number), `landscape_exists`, `baseline_exists`, `autonomous_mode`.
+
+1. **Parse arguments**: Extract phase identifier from `$ARGUMENTS`
+- If phase number: resolve to `${phases_dir}/{N}-{name}/`
+- If empty: detect current active phase
+- Validate phase directory exists
+
+2. **Load EVAL.md**:
+- Path: `${phase_dir}/EVAL.md`
+- If missing: STOP, suggest `/grd:eval-plan {N}` first
+- Parse all three tiers: sanity, proxy, deferred
+- Extract metric definitions, targets, measurement commands
+
+3. **Load context**:
+- Read BASELINE.md (baseline values for comparison)
|
|
51
|
+
- Read BENCHMARKS.md (historical data if available)
|
|
52
|
+
- Read LANDSCAPE.md (SOTA references)
|
|
53
|
+
|
|
54
|
+
4. **Determine evaluation scope**:
|
|
55
|
+
- Default: run Tier 1 (sanity) + Tier 2 (proxy)
|
|
56
|
+
- If milestone flag set: also run Tier 3 (deferred)
|
|
57
|
+
- If user specifies tier: run only that tier
|
|
58
|
+
|
|
59
|
+
**STEP_0_CHECKPOINT:**
|
|
60
|
+
- [ ] Phase identified and EVAL.md loaded
|
|
61
|
+
- [ ] Metric definitions and commands extracted
|
|
62
|
+
- [ ] Baseline and historical data loaded
|
|
63
|
+
- [ ] Evaluation scope determined
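
The initialization and scope rules in Step 0 can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the field names come from the "Parse JSON for" line above, while their types, the `EvalInit` interface, and the helpers `parseInit` and `resolveScope` are assumptions made for the example.

```typescript
// Hypothetical shape of the init JSON. Field names are taken from Step 0;
// the types are assumptions.
interface EvalInit {
  research_dir: string;
  phases_dir: string;
  landscape_exists: boolean;
  baseline_exists: boolean;
  autonomous_mode: boolean;
}

type Tier = 1 | 2 | 3;

// Parse the JSON string captured in $INIT and fail loudly on missing paths.
function parseInit(raw: string): EvalInit {
  const data = JSON.parse(raw) as Partial<EvalInit>;
  for (const key of ["research_dir", "phases_dir"] as const) {
    if (typeof data[key] !== "string") {
      throw new Error(`init JSON is missing "${key}"`);
    }
  }
  return {
    research_dir: data.research_dir as string,
    phases_dir: data.phases_dir as string,
    // Missing flags are treated as false (an assumption).
    landscape_exists: Boolean(data.landscape_exists),
    baseline_exists: Boolean(data.baseline_exists),
    autonomous_mode: Boolean(data.autonomous_mode),
  };
}

// Scope rules from item 4: an explicit tier wins; otherwise Tier 1 + Tier 2,
// plus Tier 3 when the milestone flag is set.
function resolveScope(opts: { tier?: Tier; milestone?: boolean }): Tier[] {
  if (opts.tier !== undefined) return [opts.tier];
  return opts.milestone ? [1, 2, 3] : [1, 2];
}
```

Giving an explicit tier precedence over the milestone flag is one reading of "run only that tier"; the source does not say how the two interact.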
|
|
64
|
+
|
|
65
|
+
---
|
|
66
|
+
|
|
67
|
+
## Step 1: DISPLAY BANNER
|
|
68
|
+
|
|
69
|
+
```
|
|
70
|
+
╔══════════════════════════════════════════════════════════════╗
|
|
71
|
+
║ GRD >>> EVAL REPORT ║
|
|
72
|
+
║ ║
|
|
73
|
+
║ Phase: {N} — {phase_name} ║
|
|
74
|
+
║ Scope: {Tier 1 + Tier 2 | All tiers} ║
|
|
75
|
+
║ Metrics to evaluate: {count} ║
|
|
76
|
+
║ Baseline: {available | unavailable} ║
|
|
77
|
+
║ Previous results: {count from EVAL.md history} ║
|
|
78
|
+
╚══════════════════════════════════════════════════════════════╝
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
---
|
|
82
|
+
|
|
83
|
+
## Step 2: SPAWN EVAL REPORTER AGENT
|
|
84
|
+
|
|
85
|
+
**Launch `grd-eval-reporter` agent via Task tool:**
|
|
86
|
+
|
|
87
|
+
Use Task tool with `subagent_type="grd:grd-eval-reporter"`:
|
|
88
|
+
|
|
89
|
+
```
|
|
90
|
+
Execute evaluation protocol and analyze results for phase: {N} — {phase_name}
|
|
91
|
+
|
|
92
|
+
PATHS:
|
|
93
|
+
research_dir: ${research_dir}
|
|
94
|
+
phases_dir: ${phases_dir}
|
|
95
|
+
phase_dir: ${phase_dir}
|
|
96
|
+
|
|
97
|
+
EVALUATION PLAN:
|
|
98
|
+
{Full EVAL.md content}
|
|
99
|
+
|
|
100
|
+
BASELINE:
|
|
101
|
+
{BASELINE.md content, or "No baseline"}
|
|
102
|
+
|
|
103
|
+
HISTORICAL RESULTS:
|
|
104
|
+
{Previous entries from EVAL.md Results section, or "First evaluation"}
|
|
105
|
+
|
|
106
|
+
EVALUATION SCOPE: {tiers to run}
|
|
107
|
+
|
|
108
|
+
EXECUTE THE FOLLOWING PROTOCOL:
|
|
109
|
+
|
|
110
|
+
## 1. RUN TIER 1 — SANITY CHECKS
|
|
111
|
+
For each sanity check in EVAL.md:
|
|
112
|
+
- Execute the specified command
|
|
113
|
+
- Capture pass/fail result and output
|
|
114
|
+
- Record runtime
|
|
115
|
+
- If any sanity check FAILS: flag as critical, continue running remaining checks
|
|
116
|
+
|
|
117
|
+
## 2. RUN TIER 2 — PROXY METRICS
|
|
118
|
+
For each proxy metric in EVAL.md:
|
|
119
|
+
- Execute the specified evaluation command/script
|
|
120
|
+
- Capture numeric result
|
|
121
|
+
- Compare against: baseline, target, SOTA
|
|
122
|
+
- Compute progress toward target: (result - baseline) / (target - baseline) * 100% (undefined when target equals baseline)
|
|
123
|
+
- Record runtime and resource usage
|
|
124
|
+
|
|
125
|
+
## 3. RUN TIER 3 — DEFERRED (if in scope)
|
|
126
|
+
For each deferred evaluation:
|
|
127
|
+
- Execute the specified protocol
|
|
128
|
+
- Capture comprehensive results
|
|
129
|
+
- Document any issues encountered during execution
|
|
130
|
+
|
|
131
|
+
## 4. ABLATION ANALYSIS (if ablation plan defined)
|
|
132
|
+
For each ablation component:
|
|
133
|
+
- Disable the component
|
|
134
|
+
- Re-run Tier 2 proxy metrics
|
|
135
|
+
- Record impact: delta from full-system result
|
|
136
|
+
- Rank components by contribution
|
|
137
|
+
|
|
138
|
+
## 5. ANALYSIS
|
|
139
|
+
- Which metrics met targets? Which fell short?
|
|
140
|
+
- What is the gap for metrics below target?
|
|
141
|
+
- Trend analysis: improving, stable, or regressing?
|
|
142
|
+
- Ablation insights: which components matter most?
|
|
143
|
+
- Statistical significance of improvements (if multiple runs)
|
|
144
|
+
|
|
145
|
+
## 6. VERDICT
|
|
146
|
+
- ALL_TARGETS_MET: phase evaluation passes
|
|
147
|
+
- PARTIAL: some targets met, some below
|
|
148
|
+
- BELOW_TARGETS: most metrics below target
|
|
149
|
+
- REGRESSION: worse than baseline
|
|
150
|
+
|
|
151
|
+
OUTPUT FORMAT:
|
|
152
|
+
Return structured results with:
|
|
153
|
+
- Per-metric results table (metric, baseline, target, result, status)
|
|
154
|
+
- Ablation impact table (component, delta, significance)
|
|
155
|
+
- Trend chart data (if historical results available)
|
|
156
|
+
- Analysis narrative
|
|
157
|
+
- Verdict with specifics on gaps
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
**STEP_2_CHECKPOINT:**
|
|
161
|
+
- [ ] Eval reporter agent launched
|
|
162
|
+
- [ ] All specified tiers executed
|
|
163
|
+
- [ ] Results captured for each metric
|
|
164
|
+
- [ ] Analysis and verdict provided
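
The Tier 2 ratio `(result - baseline) / (target - baseline)` and the four verdicts from the agent prompt could be computed along these lines. The sketch rests on assumptions the source does not state: higher metric values are better, any metric below its baseline counts as a regression, and "most metrics" means more than half. `MetricResult`, `progressTowardTarget`, and `classify` are hypothetical names.

```typescript
type Verdict = "ALL_TARGETS_MET" | "PARTIAL" | "BELOW_TARGETS" | "REGRESSION";

interface MetricResult {
  name: string;
  baseline: number;
  target: number;
  result: number;
}

// Percentage of the baseline-to-target gap that has been closed.
// Returns null when target equals baseline, where the ratio is undefined.
function progressTowardTarget(m: MetricResult): number | null {
  const span = m.target - m.baseline;
  if (span === 0) return null;
  return ((m.result - m.baseline) / span) * 100;
}

// Assumptions (not from the source): higher is better, any metric below its
// baseline means REGRESSION, and "most" means more than half of the metrics.
function classify(metrics: MetricResult[]): Verdict {
  if (metrics.length === 0) throw new Error("no metrics to classify");
  if (metrics.some((m) => m.result < m.baseline)) return "REGRESSION";
  const missed = metrics.filter((m) => m.result < m.target).length;
  if (missed === 0) return "ALL_TARGETS_MET";
  return missed > metrics.length / 2 ? "BELOW_TARGETS" : "PARTIAL";
}
```

A value of 100 from `progressTowardTarget` means the target was reached exactly, and values above 100 mean it was exceeded. Metrics where lower is better would need the comparisons in `classify` flipped.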
|
|
165
|
+
|
|
166
|
+
---
|
|
167
|
+
|
|
168
|
+
## Step 3: DISPLAY RESULTS DASHBOARD
|
|
169
|
+
|
|
170
|
+
```
|
|
171
|
+
╔══════════════════════════════════════════════════════════════╗
|
|
172
|
+
║ EVALUATION RESULTS ║
|
|
173
|
+
╠══════════════════════════════════════════════════════════════╣
|
|
174
|
+
║ ║
|
|
175
|
+
║ Phase: {N} — {phase_name} ║
|
|
176
|
+
║ Verdict: {ALL_TARGETS_MET | PARTIAL | BELOW_TARGETS | ║
|
|
177
|
+
║ REGRESSION} ║
|
|
178
|
+
║ ║
|
|
179
|
+
║ TIER 1 — SANITY: {passed}/{total} checks passed ║
|
|
180
|
+
║ TIER 2 — PROXY: ║
|
|
181
|
+
║ {metric_1}: {result} / {target} {PASS|MISS} {+/-delta} ║
|
|
182
|
+
║ {metric_2}: {result} / {target} {PASS|MISS} {+/-delta} ║
|
|
183
|
+
║ TIER 3 — DEFERRED: {ran | skipped} ║
|
|
184
|
+
║ ║
|
|
185
|
+
║ vs Baseline: {overall_improvement}% improvement ║
|
|
186
|
+
║ vs SOTA: {gap_to_sota}% gap remaining ║
|
|
187
|
+
║ ║
|
|
188
|
+
║ Top ablation finding: ║
|
|
189
|
+
║ {component}: {contribution}% of total improvement ║
|
|
190
|
+
║ ║
|
|
191
|
+
╚══════════════════════════════════════════════════════════════╝
|
|
192
|
+
```
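
The dashboard's "{contribution}% of total improvement" line suggests a calculation like the one below. This is a sketch under assumptions: "total improvement" is taken to mean the full-system result minus the baseline, and `Ablation`, `Contribution`, and `rankAblations` are hypothetical names.

```typescript
interface Ablation {
  component: string;
  ablatedResult: number; // proxy metric with the component disabled
}

interface Contribution {
  component: string;
  delta: number; // full-system result minus ablated result
  contributionPct: number | null; // share of (full - baseline); null if that is zero
}

// Compute each component's delta and rank components by contribution,
// largest first.
function rankAblations(full: number, baseline: number, runs: Ablation[]): Contribution[] {
  const total = full - baseline;
  return runs
    .map((r) => {
      const delta = full - r.ablatedResult;
      return {
        component: r.component,
        delta,
        contributionPct: total === 0 ? null : (delta / total) * 100,
      };
    })
    .sort((a, b) => b.delta - a.delta);
}
```

The percentages need not sum to 100, because components can interact: disabling two of them together may cost more or less than the sum of their individual deltas.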
|
|
193
|
+
|
|
194
|
+
---
|
|
195
|
+
|
|
196
|
+
## Step 4: UPDATE EVAL.md WITH RESULTS
|
|
197
|
+
|
|
198
|
+
1. **Populate Results section** in `${phase_dir}/EVAL.md`:
|
|
199
|
+
```markdown
|
|
200
|
+
## Results — {YYYY-MM-DD}
|
|
201
|
+
|
|
202
|
+
### Tier 1: Sanity Checks
|
|
203
|
+
| Check | Status | Runtime | Notes |
|
|
204
|
+
|-------|--------|---------|-------|
|
|
205
|
+
| {check} | PASS/FAIL | {time} | {notes} |
|
|
206
|
+
|
|
207
|
+
### Tier 2: Proxy Metrics
|
|
208
|
+
| Metric | Baseline | Target | Result | Status | Delta |
|
|
209
|
+
|--------|----------|--------|--------|--------|-------|
|
|
210
|
+
| {metric} | {base} | {target} | {result} | PASS/MISS | {+/-} |
|
|
211
|
+
|
|
212
|
+
### Tier 3: Deferred (if ran)
|
|
213
|
+
| Evaluation | Result | Notes |
|
|
214
|
+
|------------|--------|-------|
|
|
215
|
+
| {eval} | {result} | {notes} |
|
|
216
|
+
|
|
217
|
+
### Ablation Analysis
|
|
218
|
+
| Component | Full Result | Ablated Result | Delta | Contribution |
|
|
219
|
+
|-----------|-------------|----------------|-------|--------------|
|
|
220
|
+
| {comp} | {full} | {ablated} | {delta} | {%} |
|
|
221
|
+
|
|
222
|
+
### Verdict: {verdict}
|
|
223
|
+
{analysis narrative}
|
|
224
|
+
```
|
|
225
|
+
|
|
226
|
+
2. **Append to History section** in EVAL.md:
|
|
227
|
+
```markdown
|
|
228
|
+
## History
|
|
229
|
+
| Date | Iteration | Verdict | Key Metric | Value | Notes |
|
|
230
|
+
|------|-----------|---------|------------|-------|-------|
|
|
231
|
+
| {date} | {N} | {verdict} | {metric} | {value} | {notes} |
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
## Step 5: UPDATE BENCHMARKS.md
|
|
237
|
+
|
|
238
|
+
1. **Load or create** `.planning/BENCHMARKS.md`
|
|
239
|
+
2. **Append new data points**:
|
|
240
|
+
```markdown
|
|
241
|
+
## Phase {N}: {phase_name} — {date}
|
|
242
|
+
|
|
243
|
+
| Metric | Value | vs_Baseline | vs_SOTA | Trend |
|
|
244
|
+
|--------|-------|-------------|---------|-------|
|
|
245
|
+
| {metric} | {value} | {delta} | {gap} | {up/down/flat} |
|
|
246
|
+
```
|
|
247
|
+
3. **Preserve all historical entries** — this is an append-only log
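
One way to honor the append-only rule is to build the new section as a string and concatenate it to the existing file content without rewriting anything that is already there. The sketch below is illustrative: `BenchmarkRow`, `renderBenchmarkSection`, and `appendSection` are hypothetical helpers, and the caller is assumed to pass the section heading already formatted to match the template above.

```typescript
interface BenchmarkRow {
  metric: string;
  value: number;
  vsBaseline: string;
  vsSota: string;
  trend: "up" | "down" | "flat";
}

// Render one benchmark section using the column layout from Step 5.
function renderBenchmarkSection(heading: string, rows: BenchmarkRow[]): string {
  const lines = [
    heading,
    "",
    "| Metric | Value | vs_Baseline | vs_SOTA | Trend |",
    "|--------|-------|-------------|---------|-------|",
    ...rows.map((r) => `| ${r.metric} | ${r.value} | ${r.vsBaseline} | ${r.vsSota} | ${r.trend} |`),
  ];
  return lines.join("\n") + "\n";
}

// Append-only: existing content is kept byte for byte, and the new section
// follows it after a blank line.
function appendSection(existing: string, section: string): string {
  let separator = "\n\n";
  if (existing === "" || existing.endsWith("\n\n")) separator = "";
  else if (existing.endsWith("\n")) separator = "\n";
  return existing + separator + section;
}
```

Keeping both helpers pure leaves the actual file read and write to the caller, which also makes the append-only property easy to check: the result always starts with the previous content.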
|
|
248
|
+
|
|
249
|
+
---
|
|
250
|
+
|
|
251
|
+
## Step 6: COMMIT
|
|
252
|
+
|
|
253
|
+
```bash
|
|
254
|
+
git add "${phase_dir}"/*EVAL.md
|
|
255
|
+
git add .planning/BENCHMARKS.md
|
|
256
|
+
git commit -m "eval: report phase {N} results — {verdict}"
|
|
257
|
+
```
|
|
258
|
+
|
|
259
|
+
---
|
|
260
|
+
|
|
261
|
+
## Step 7: ROUTE NEXT ACTION
|
|
262
|
+
|
|
263
|
+
| Verdict | Suggestion |
|
|
264
|
+
|---------|------------|
|
|
265
|
+
| ALL_TARGETS_MET | `/grd:verify-phase {N}` — proceed to verification |
|
|
266
|
+
| PARTIAL | Review gaps, decide: iterate or accept |
|
|
267
|
+
| BELOW_TARGETS | `/grd:iterate {N}` — trigger iteration loop |
|
|
268
|
+
| REGRESSION | `/grd:iterate {N}` — urgent, something broke |
|
|
269
|
+
|
|
270
|
+
**If BELOW_TARGETS or REGRESSION:**
|
|
271
|
+
```
|
|
272
|
+
WARNING: Results below targets.
|
|
273
|
+
|
|
274
|
+
Metrics below target:
|
|
275
|
+
{metric_1}: {result} vs target {target} (gap: {gap})
|
|
276
|
+
{metric_2}: {result} vs target {target} (gap: {gap})
|
|
277
|
+
|
|
278
|
+
Suggested action: /grd:iterate {N}
|
|
279
|
+
This will analyze gaps and suggest corrections.
|
|
280
|
+
```
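
The routing table and the warning template in Step 7 map directly onto a small lookup. In the sketch below, `nextAction` and `gapLine` are hypothetical helper names, and the gap is assumed to be `target - result`, which the source does not spell out.

```typescript
type Verdict = "ALL_TARGETS_MET" | "PARTIAL" | "BELOW_TARGETS" | "REGRESSION";

// Map a verdict to the suggestion listed in the Step 7 table.
function nextAction(verdict: Verdict, phase: number): string {
  switch (verdict) {
    case "ALL_TARGETS_MET":
      return `/grd:verify-phase ${phase}`;
    case "PARTIAL":
      return "Review gaps, decide: iterate or accept";
    case "BELOW_TARGETS":
    case "REGRESSION":
      return `/grd:iterate ${phase}`;
  }
}

// Format one line of the "Metrics below target" warning.
// Assumption: the gap is reported as target minus result.
function gapLine(metric: string, result: number, target: number): string {
  return `${metric}: ${result} vs target ${target} (gap: ${target - result})`;
}
```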
|
|
281
|
+
|
|
282
|
+
</process>
|
|
283
|
+
|
|
284
|
+
<output>
|
|
285
|
+
**FILES_UPDATED:**
|
|
286
|
+
- `${phase_dir}/EVAL.md` — results and history populated
|
|
287
|
+
- `.planning/BENCHMARKS.md` — new benchmark data appended
|
|
288
|
+
|
|
289
|
+
**DISPLAY**: Results dashboard with per-metric status, ablation findings, and verdict
|
|
290
|
+
|
|
291
|
+
**GIT**: Committed: `eval: report phase {N} results — {verdict}`
|
|
292
|
+
</output>
|
|
293
|
+
|
|
294
|
+
<error_handling>
|
|
295
|
+
- **EVAL.md missing**: STOP, direct to `/grd:eval-plan {N}` first
|
|
296
|
+
- **Evaluation command fails**: Record failure, continue with remaining metrics, flag in report
|
|
297
|
+
- **No baseline for comparison**: Show absolute values only, note baseline gap
|
|
298
|
+
- **Metric returns non-numeric output**: Record a parse error and ask the user for a manual metric value
|
|
299
|
+
- **All sanity checks fail**: STOP evaluation, suggest debugging: `/grd:debug`
|
|
300
|
+
- **BENCHMARKS.md corrupted**: Back up and recreate from EVAL.md history sections
|
|
301
|
+
</error_handling>
|
|
302
|
+
|
|
303
|
+
<success_criteria>
|
|
304
|
+
- All specified tiers executed with results captured
|
|
305
|
+
- Results compared against baseline, target, and SOTA
|
|
306
|
+
- EVAL.md Results section is populated with complete data
|
|
307
|
+
- BENCHMARKS.md updated with new data points
|
|
308
|
+
- Verdict is clear with specific gap identification
|
|
309
|
+
- Below-target metrics trigger clear iteration suggestion
|
|
310
|
+
</success_criteria>
|
|
@@ -0,0 +1,79 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Run autonomous self-improvement loop with sonnet-tier models
|
|
3
|
+
argument-hint: "[--iterations N] [--pick-pct N] [--dry-run] [--no-worktree] [--infinite]"
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
Run the evolve command to discover improvements and execute them autonomously:
|
|
7
|
+
|
|
8
|
+
```bash
|
|
9
|
+
node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js evolve run $ARGUMENTS
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
The evolve loop uses a paired discover→execute architecture per iteration:
|
|
13
|
+
1. Discovers 5-10 specific, immediately implementable improvements on the current codebase state
|
|
14
|
+
2. Groups items by theme and selects top priority groups using `--pick-pct` (default 50%)
|
|
15
|
+
3. Executes selected groups in a single subprocess call (max 10 items)
|
|
16
|
+
4. Runs a review/verification pass
|
|
17
|
+
5. Writes evolution notes and commits changes
|
|
18
|
+
6. Next iteration discovers again on the NOW-EVOLVED codebase (sees its own changes)
|
|
19
|
+
7. Repeats — each iteration builds on the previous one's improvements
|
|
20
|
+
|
|
21
|
+
This means each iteration is a tight discover→execute→discover cycle that progressively improves the codebase.
|
|
22
|
+
|
|
23
|
+
Flags:
|
|
24
|
+
- `--iterations N` — Number of iterations (0 = unlimited, runs until all groups done)
|
|
25
|
+
- `--pick-pct N` — Percentage of total groups to pick per iteration (default: 50, min 1 group)
|
|
26
|
+
- `--dry-run` — Discover and group only, don't execute
|
|
27
|
+
- `--no-worktree` — Disable git worktree isolation (by default, enabled when `branching_strategy !== 'none'`)
|
|
28
|
+
- `--timeout N` — Timeout per subprocess in minutes
|
|
29
|
+
- `--max-turns N` — Max turns per subprocess
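
As a rough illustration of how `--pick-pct` might translate into a group count: the source states only the default (50) and the one-group minimum, so rounding up, clamping the percentage to 0-100, and the function name `groupsToPick` are all assumptions.

```typescript
// Number of groups to execute in one iteration.
// Assumption: fractional results are rounded up; the source does not say.
function groupsToPick(totalGroups: number, pickPct = 50): number {
  if (totalGroups <= 0) return 0;
  const clampedPct = Math.min(100, Math.max(0, pickPct));
  const picked = Math.ceil((totalGroups * clampedPct) / 100);
  // At least one group, never more than exist.
  return Math.max(1, Math.min(totalGroups, picked));
}
```

With six groups and the default percentage this selects three. With a single group it selects that group whatever percentage is given.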
|
|
30
|
+
|
|
31
|
+
When git isolation is enabled (config `branching_strategy` is not `'none'`), all execution runs in a dedicated worktree. A PR is created automatically after completion. Use `--no-worktree` to disable.
|
|
32
|
+
|
|
33
|
+
All operations enforce a sonnet model ceiling — no opus-class models are used.
|
|
34
|
+
|
|
35
|
+
IMPORTANT: This command is long-running (spawns multiple Claude subprocesses). You MUST run it in the background using `run_in_background: true` on the Bash tool to avoid hitting the Bash tool's default timeout. Use `TaskOutput` with `block: false` to check progress periodically.
|
|
36
|
+
|
|
37
|
+
Report the JSON results. If any groups failed, explain what happened. Suggest running again for continued improvement.
|
|
38
|
+
|
|
39
|
+
## Infinite Mode
|
|
40
|
+
|
|
41
|
+
To run a fully autonomous development loop (discover→autoplan→autopilot→repeat):
|
|
42
|
+
|
|
43
|
+
```bash
|
|
44
|
+
node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js evolve run --infinite $ARGUMENTS
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
The infinite evolve loop:
|
|
48
|
+
1. Discovers improvements across the codebase
|
|
49
|
+
2. Creates a milestone from discoveries using autoplan
|
|
50
|
+
3. Executes all phases in that milestone using autopilot
|
|
51
|
+
4. Repeats until: max cycles reached, time budget exhausted, or no discoveries remain
|
|
52
|
+
|
|
53
|
+
### Infinite Mode Flags
|
|
54
|
+
|
|
55
|
+
| Flag | Description | Default |
|
|
56
|
+
|------|-------------|---------|
|
|
57
|
+
| `--infinite` | Enable infinite evolve mode | false |
|
|
58
|
+
| `--max-cycles N` | Maximum discover-plan-execute cycles | 10 |
|
|
59
|
+
| `--time-budget N` | Total time budget in minutes (0 = unlimited) | 0 |
|
|
60
|
+
| `--max-milestones N` | Max milestones per autopilot run per cycle | 1 |
|
|
61
|
+
| `--pick-pct N` | Discovery pick percentage per cycle | 50 |
|
|
62
|
+
| `--dry-run` | Preview each step without executing | false |
|
|
63
|
+
| `--timeout N` | Per-subprocess timeout in minutes | -- |
|
|
64
|
+
| `--max-turns N` | Max turns per subprocess | -- |
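
The three stop conditions for infinite mode, combined with the `--max-cycles` and `--time-budget` defaults from the table above, can be expressed as a single predicate. This is a sketch only: the order in which the conditions are checked, and the names `LoopState`, `LoopLimits`, and `stopReason`, are assumptions.

```typescript
interface LoopState {
  cyclesDone: number;
  elapsedMinutes: number;
  lastDiscoveryCount: number; // discoveries found in the most recent cycle
}

interface LoopLimits {
  maxCycles: number;
  timeBudgetMinutes: number; // 0 = unlimited, per the flags table
}

type StopReason = "max-cycles" | "time-budget" | "no-discoveries" | null;

// Returns why the loop should stop, or null if another cycle may run.
function stopReason(
  state: LoopState,
  limits: LoopLimits = { maxCycles: 10, timeBudgetMinutes: 0 },
): StopReason {
  if (state.cyclesDone >= limits.maxCycles) return "max-cycles";
  if (limits.timeBudgetMinutes > 0 && state.elapsedMinutes >= limits.timeBudgetMinutes) {
    return "time-budget";
  }
  // Only meaningful once at least one discovery pass has run.
  if (state.cyclesDone > 0 && state.lastDiscoveryCount === 0) return "no-discoveries";
  return null;
}
```

Because a time budget of 0 means unlimited, the budget check is skipped entirely in that case rather than comparing against zero.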
|
|
65
|
+
|
|
66
|
+
### Infinite Mode Examples
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
# Preview what infinite evolve would do
|
|
70
|
+
node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js evolve run --infinite --dry-run
|
|
71
|
+
|
|
72
|
+
# Run 3 cycles with 60-minute time budget
|
|
73
|
+
node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js evolve run --infinite --max-cycles 3 --time-budget 60
|
|
74
|
+
|
|
75
|
+
# Run with custom pick percentage
|
|
76
|
+
node ${CLAUDE_PLUGIN_ROOT}/bin/grd-tools.js evolve run --infinite --pick-pct 30
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
IMPORTANT: Infinite mode is extremely long-running. You MUST run it in the background using `run_in_background: true`. Monitor progress via the log file at `.planning/autopilot/infinite-evolve.log`.
|