cue-ai 0.9.0 → 0.9.2
- package/CHANGELOG.md +40 -0
- package/README.md +82 -33
- package/bin/cue-review-progress +107 -0
- package/bin/cue-review-watch +98 -0
- package/dist/cue.js +7352 -3744
- package/package.json +16 -5
- package/profiles/_types.ts +9 -0
- package/profiles/backend/profile.yaml +2 -0
- package/profiles/blog-writer/profile.yaml +10 -0
- package/profiles/browser/profile.yaml +9 -2
- package/profiles/builder/profile.yaml +3 -6
- package/profiles/career/profile.yaml +13 -2
- package/profiles/claude-api/profile.yaml +1 -1
- package/profiles/commerce/profile.yaml +27 -3
- package/profiles/core/logo.png +0 -0
- package/profiles/core/profile.yaml +62 -2
- package/profiles/dash-merge-test/profile.yaml +109 -0
- package/profiles/designer/profile.yaml +2 -0
- package/profiles/designer-medusa-next/profile.yaml +4 -1
- package/profiles/designer-medusa-vite/profile.yaml +4 -1
- package/profiles/docs-writer/profile.yaml +3 -1
- package/profiles/eu-tender-research/README.md +48 -0
- package/profiles/eu-tender-research/logo.png +0 -0
- package/profiles/eu-tender-research/profile.yaml +108 -0
- package/profiles/finance/logo.png +0 -0
- package/profiles/finance/profile.yaml +46 -0
- package/profiles/frontend/profile.yaml +5 -9
- package/profiles/growth/profile.yaml +2 -3
- package/profiles/gstack/profile.yaml +15 -0
- package/profiles/higgsfield/profile.yaml +3 -0
- package/profiles/hyperframes/logo.png +0 -0
- package/profiles/hyperframes/profile.yaml +59 -0
- package/profiles/improver/profile.yaml +88 -0
- package/profiles/marketing/profile.yaml +0 -3
- package/profiles/medusa-dev/profile.yaml +2 -0
- package/profiles/medusa-next/profile.yaml +2 -3
- package/profiles/medusa-vite/profile.yaml +2 -3
- package/profiles/n8n/logo.png +0 -0
- package/profiles/n8n/profile.yaml +50 -0
- package/profiles/nextjs/profile.yaml +2 -3
- package/profiles/ops/profile.yaml +2 -0
- package/profiles/postizz/profile.yaml +13 -3
- package/profiles/python/profile.yaml +3 -0
- package/profiles/research/profile.yaml +3 -1
- package/profiles/schema.json +10 -0
- package/profiles/secops/profile.yaml +2 -0
- package/profiles/seo/profile.yaml +56 -0
- package/profiles/skill-writer/profile.yaml +8 -0
- package/profiles/ssh/profile.yaml +32 -0
- package/profiles/strapi/logo.png +0 -0
- package/profiles/strapi/profile.yaml +45 -0
- package/profiles/stripe/logo.png +0 -0
- package/profiles/stripe/profile.yaml +1 -0
- package/profiles/supabase/logo.png +0 -0
- package/profiles/supabase/profile.yaml +85 -0
- package/profiles/vercel/logo.png +0 -0
- package/profiles/vercel/profile.yaml +25 -1
- package/profiles/vite/profile.yaml +4 -3
- package/profiles/web-frontend-base/profile.yaml +5 -4
- package/profiles/webshop/profile.yaml +23 -5
- package/profiles/x-growth-bot/profile.yaml +44 -0
- package/resources/icons/generate-icons.py +128 -2
- package/resources/mcps/configs/claude.sanitized.json +42 -0
- package/resources/mcps/configs/codex.sanitized.json +7 -0
- package/resources/skills/skills/career/resume-version-manager/SKILL.md +351 -0
- package/resources/skills/skills/career/salary-negotiation-prep/SKILL.md +378 -0
- package/resources/skills/skills/content/pdf/SKILL.md +2 -0
- package/resources/skills/skills/content/postiz-cards/SKILL.md +48 -0
- package/resources/skills/skills/content/postiz-cards/scripts/analytics.sh +38 -0
- package/resources/skills/skills/content/postiz-cards/scripts/card.sh +42 -0
- package/resources/skills/skills/content/postiz-cards/scripts/lint.py +38 -0
- package/resources/skills/skills/design/headless-gif-demo/SKILL.md +1 -1
- package/resources/skills/skills/design/readme-svg-design/SKILL.md +1 -1
- package/resources/skills/skills/eu-funding/grant-outreach/SKILL.md +70 -0
- package/resources/skills/skills/eu-funding/hu-grant-finder/SKILL.md +114 -0
- package/resources/skills/skills/eu-funding/hu-grant-finder/evals.md +26 -0
- package/resources/skills/skills/eu-funding/ted-tender-search/SKILL.md +80 -0
- package/resources/skills/skills/eu-funding/ted-tender-search/evals.md +26 -0
- package/resources/skills/skills/eu-funding/ted-tender-search/scripts/ted-search.sh +46 -0
- package/resources/skills/skills/event-design/wedding-invitations/SKILL.md +1 -1
- package/resources/skills/skills/github/gx-agents/SKILL.md +96 -0
- package/resources/skills/skills/gstack/design-shotgun/SKILL.md +1 -1
- package/resources/skills/skills/marketing/ab-test-analyzer/SKILL.md +1 -1
- package/resources/skills/skills/marketing/ab-test-setup-and-analysis/SKILL.md +1 -1
- package/resources/skills/skills/marketing/account-structure-review/SKILL.md +1 -1
- package/resources/skills/skills/marketing/ad-copy-variant-generator/SKILL.md +1 -1
- package/resources/skills/skills/marketing/ad-extension-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/ad-spend-allocator/SKILL.md +1 -1
- package/resources/skills/skills/marketing/anomaly-detection/SKILL.md +1 -1
- package/resources/skills/skills/marketing/attribution-model-comparison/SKILL.md +1 -1
- package/resources/skills/skills/marketing/audience-overlap-analysis/SKILL.md +7 -1
- package/resources/skills/skills/marketing/bid-strategy-recommendations/SKILL.md +7 -1
- package/resources/skills/skills/marketing/budget-scenario-planner/SKILL.md +6 -1
- package/resources/skills/skills/marketing/campaign-naming-convention-builder/SKILL.md +7 -1
- package/resources/skills/skills/marketing/channel-mix-optimizer/SKILL.md +7 -1
- package/resources/skills/skills/marketing/client-report-narratives/SKILL.md +6 -1
- package/resources/skills/skills/marketing/competitor-creative-analysis/SKILL.md +1 -1
- package/resources/skills/skills/marketing/competitor-teardown/SKILL.md +1 -1
- package/resources/skills/skills/marketing/content-repurposer/SKILL.md +1 -1
- package/resources/skills/skills/marketing/conversion-path-analysis/SKILL.md +1 -1
- package/resources/skills/skills/marketing/cpa-diagnostics/SKILL.md +1 -1
- package/resources/skills/skills/marketing/creative-fatigue-detection/SKILL.md +1 -1
- package/resources/skills/skills/marketing/day-hour-performance-breakdown/SKILL.md +1 -1
- package/resources/skills/skills/marketing/device-performance-split/SKILL.md +1 -1
- package/resources/skills/skills/marketing/e2e-seo-assistant/SKILL.md +1 -1
- package/resources/skills/skills/marketing/email-sequence-writer/SKILL.md +1 -1
- package/resources/skills/skills/marketing/frequency-cap-recommendations/SKILL.md +1 -1
- package/resources/skills/skills/marketing/geo-performance-analysis/SKILL.md +1 -1
- package/resources/skills/skills/marketing/google-ads-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/icp-research-assistant/SKILL.md +1 -1
- package/resources/skills/skills/marketing/keyword-cannibalization-check/SKILL.md +1 -1
- package/resources/skills/skills/marketing/landing-page-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/landing-page-audit-quick/SKILL.md +1 -1
- package/resources/skills/skills/marketing/linkedin-ads-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/meta-ads-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/pacing-monitor/SKILL.md +1 -1
- package/resources/skills/skills/marketing/performance-benchmarking/SKILL.md +1 -1
- package/resources/skills/skills/marketing/programmatic-seo-builder/SKILL.md +1 -1
- package/resources/skills/skills/marketing/quality-score-breakdown/SKILL.md +1 -1
- package/resources/skills/skills/marketing/reddit-ads-audit/SKILL.md +1 -1
- package/resources/skills/skills/marketing/retargeting-window-analysis/SKILL.md +1 -1
- package/resources/skills/skills/marketing/roas-forecasting/SKILL.md +1 -1
- package/resources/skills/skills/marketing/search-term-mining/SKILL.md +1 -1
- package/resources/skills/skills/marketing/utm-tracking-generator/SKILL.md +1 -1
- package/resources/skills/skills/marketing/wasted-spend-finder/SKILL.md +1 -1
- package/resources/skills/skills/marketing/weekly-account-summary/SKILL.md +1 -1
- package/resources/skills/skills/meta/awesome-list-submit/SKILL.md +4 -4
- package/resources/skills/skills/meta/cue-dashboard/SKILL.md +109 -0
- package/resources/skills/skills/meta/cue-developer/SKILL.md +161 -0
- package/resources/skills/skills/meta/cue-developer/evals/evals.json +57 -0
- package/resources/skills/skills/meta/cue-developer/references/architecture.md +65 -0
- package/resources/skills/skills/meta/cue-developer/references/build_and_test.md +72 -0
- package/resources/skills/skills/meta/cue-developer/references/contributing.md +75 -0
- package/resources/skills/skills/meta/cue-developer/references/conventions.md +57 -0
- package/resources/skills/skills/meta/cue-developer/references/first_time_setup.md +51 -0
- package/resources/skills/skills/meta/cue-developer/references/skill_and_mcp_authoring.md +84 -0
- package/resources/skills/skills/meta/cue-developer/references/troubleshooting.md +42 -0
- package/resources/skills/skills/meta/delegation-check/SKILL.md +148 -0
- package/resources/skills/skills/meta/delegation-check/specs/scan-algorithm.md +125 -0
- package/resources/skills/skills/meta/delegation-check/specs/separation-rules.md +190 -0
- package/resources/skills/skills/meta/focus/SKILL.md +62 -0
- package/resources/skills/skills/meta/help/SKILL.md +1 -1
- package/resources/skills/skills/meta/integrity-tags/SKILL.md +2 -0
- package/resources/skills/skills/meta/next-steps/SKILL.md +124 -0
- package/resources/skills/skills/meta/next-steps/evals/eval-set.json +92 -0
- package/resources/skills/skills/meta/profile-from-docs/SKILL.md +141 -0
- package/resources/skills/skills/meta/ralph-loop/SKILL.md +83 -0
- package/resources/skills/skills/meta/ralph-loop/scripts/loop.sh +73 -0
- package/resources/skills/skills/meta/skill-simplify/SKILL.md +136 -0
- package/resources/skills/skills/meta/skill-simplify/phases/01-analysis.md +173 -0
- package/resources/skills/skills/meta/skill-simplify/phases/02-optimize.md +104 -0
- package/resources/skills/skills/meta/skill-simplify/phases/03-check.md +145 -0
- package/resources/skills/skills/meta/smart-loader/scripts/smart-lookup.sh +13 -4
- package/resources/skills/skills/meta/verify-council/SKILL.md +182 -0
- package/resources/skills/skills/meta/verify-council/references/lane-prompts.md +103 -0
- package/resources/skills/skills/meta/verify-council/references/workflow.js +217 -0
- package/resources/skills/skills/nvidia/aiq-research/SKILL.md +1 -1
- package/resources/skills/skills/nvidia/cuopt-developer/SKILL.md +16 -1
- package/resources/skills/skills/nvidia/cuopt-developer/resources/contributing.md +2 -2
- package/resources/skills/skills/nvidia/cuopt-developer/resources/numerical_debugging.md +128 -0
- package/resources/skills/skills/nvidia/cuopt-developer/resources/python_bindings.md +2 -9
- package/resources/skills/skills/nvidia/cuopt-developer/resources/vrp_skills.md +166 -0
- package/resources/skills/skills/nvidia/cuopt-install/SKILL.md +2 -10
- package/resources/skills/skills/nvidia/cuopt-numerical-optimization-api-c/SKILL.md +3 -23
- package/resources/skills/skills/nvidia/cuopt-numerical-optimization-api-c/resources/examples.md +40 -20
- package/resources/skills/skills/nvidia/cuopt-numerical-optimization-api-python/SKILL.md +5 -1
- package/resources/skills/skills/nvidia/skill-evolution/SKILL.md +4 -5
- package/resources/skills/skills/research/trendradar/SKILL.md +1 -1
- package/resources/skills/skills/ssh/ssh-config/SKILL.md +94 -0
- package/resources/skills/skills/ssh/ssh-copy/SKILL.md +92 -0
- package/resources/skills/skills/ssh/ssh-harden/SKILL.md +108 -0
- package/resources/skills/skills/ssh/ssh-keys/SKILL.md +82 -0
- package/resources/skills/skills/ssh/ssh-paste-image/LICENSE +28 -0
- package/resources/skills/skills/ssh/ssh-paste-image/SKILL.md +149 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/build.sh +29 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/client/go.mod +3 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/client/main.go +79 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/daemon/ccimgd.service +12 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/daemon/com.ccimgd.plist +20 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/daemon/go.mod +3 -0
- package/resources/skills/skills/ssh/ssh-paste-image/scripts/daemon/main.go +98 -0
- package/resources/skills/skills/ssh/ssh-tunnel/SKILL.md +96 -0
- package/resources/skills/skills/strapi/building-with-strapi/SKILL.md +112 -0
- package/resources/skills/skills/strapi/strapi-cli/SKILL.md +93 -0
- package/resources/skills/skills/strapi/strapi-content-api/SKILL.md +115 -0
- package/resources/skills/skills/strapi/strapi-deploy/SKILL.md +89 -0
- package/resources/skills/skills/strapi/strapi-mcp-setup/SKILL.md +101 -0
- package/resources/skills/skills/strapi/strapi-plugins/SKILL.md +97 -0
- package/resources/skills/skills/tools/context7/SKILL.md +101 -0
- package/resources/skills/skills/tools/opensrc/SKILL.md +1 -1
- package/resources/skills/skills/tools/portless/SKILL.md +186 -0
- package/resources/skills/skills/xbot/operate/SKILL.md +229 -0
- package/src/commands/_index.ts +8 -0
- package/src/commands/ai-score.e2e.test.ts +11 -4
- package/src/commands/ai.ts +3 -4
- package/src/commands/auto-detect.ts +1 -1
- package/src/commands/cli.test.ts +1 -2
- package/src/commands/cli.ts +1 -1
- package/src/commands/cloud.ts +1 -1
- package/src/commands/current.ts +1 -4
- package/src/commands/dash.test.ts +110 -0
- package/src/commands/dash.ts +194 -0
- package/src/commands/dashboard.ts +26 -0
- package/src/commands/diff.ts +1 -1
- package/src/commands/discover.test.ts +1 -1
- package/src/commands/discover.ts +90 -40
- package/src/commands/doctor.test.ts +58 -0
- package/src/commands/doctor.ts +79 -3
- package/src/commands/eval-behavior.ts +1 -1
- package/src/commands/eval.ts +2 -2
- package/src/commands/evolve.ts +4 -3
- package/src/commands/failures.test.ts +1 -1
- package/src/commands/features-batch1.test.ts +6 -1
- package/src/commands/icon.ts +1 -5
- package/src/commands/import-profile.ts +1 -1
- package/src/commands/init.ts +50 -7
- package/src/commands/install-sh.e2e.test.ts +65 -0
- package/src/commands/launch-handoff.e2e.test.ts +88 -0
- package/src/commands/launch.e2e.test.ts +8 -1
- package/src/commands/launch.test.ts +29 -0
- package/src/commands/launch.ts +185 -131
- package/src/commands/lock.ts +0 -1
- package/src/commands/marketplace.ts +0 -4
- package/src/commands/materialize.ts +1 -1
- package/src/commands/mem.ts +341 -0
- package/src/commands/optimizer.ts +0 -3
- package/src/commands/playground.ts +1 -2
- package/src/commands/profile-draft-skill.ts +1 -1
- package/src/commands/replay-whatif.ts +1 -6
- package/src/commands/score.ts +2 -2
- package/src/commands/security.test.ts +88 -0
- package/src/commands/security.ts +74 -28
- package/src/commands/shell.test.ts +65 -4
- package/src/commands/shell.ts +67 -7
- package/src/commands/skills-test.ts +0 -1
- package/src/commands/skills.ts +28 -2
- package/src/commands/sources.ts +1 -2
- package/src/commands/status.ts +2 -6
- package/src/commands/submit-profile.ts +1 -1
- package/src/commands/suggest.ts +35 -10
- package/src/commands/trigger-gaps.test.ts +50 -0
- package/src/commands/trigger-gaps.ts +63 -29
- package/src/commands/update.ts +1 -1
- package/src/commands/validate.ts +16 -4
- package/src/commands/watch-live.ts +1 -1
- package/src/commands/workspace.ts +1 -1
- package/src/index.ts +26 -10
- package/src/lib/active-sessions.ts +1 -1
- package/src/lib/agent-adapters.test.ts +100 -0
- package/src/lib/agent-adapters.ts +2 -2
- package/src/lib/analytics.test.ts +88 -0
- package/src/lib/analytics.ts +82 -1
- package/src/lib/auto-detect.test.ts +10 -4
- package/src/lib/auto-detect.ts +19 -23
- package/src/lib/brand-icons.ts +0 -1
- package/src/lib/cache.ts +2 -3
- package/src/lib/claude-mem-env.test.ts +148 -0
- package/src/lib/claude-mem-env.ts +172 -0
- package/src/lib/combo-history.test.ts +53 -0
- package/src/lib/combo-history.ts +83 -0
- package/src/lib/companion-detect.test.ts +108 -0
- package/src/lib/companion-detect.ts +140 -0
- package/src/lib/companion-fetch.ts +4 -6
- package/src/lib/conditional-skills.test.ts +1 -1
- package/src/lib/config-paths.test.ts +53 -0
- package/src/lib/config-paths.ts +33 -0
- package/src/lib/dashboard-server.test.ts +351 -0
- package/src/lib/dashboard-server.ts +1476 -27
- package/src/lib/debug-log.test.ts +66 -0
- package/src/lib/debug-log.ts +45 -0
- package/src/lib/mcp-catalog.test.ts +102 -0
- package/src/lib/mcp-catalog.ts +193 -0
- package/src/lib/pair-suggestions.test.ts +111 -0
- package/src/lib/pair-suggestions.ts +98 -5
- package/src/lib/permissions.test.ts +76 -0
- package/src/lib/permissions.ts +125 -0
- package/src/lib/picker.test.ts +1106 -1
- package/src/lib/picker.ts +1230 -142
- package/src/lib/plugin-discovery.ts +126 -0
- package/src/lib/pr-poster.ts +1 -1
- package/src/lib/pr-throttle.ts +2 -6
- package/src/lib/profile-linter.test.ts +67 -1
- package/src/lib/profile-linter.ts +59 -14
- package/src/lib/profile-loader.test.ts +21 -0
- package/src/lib/profile-loader.ts +22 -3
- package/src/lib/profile-metrics.ts +2 -6
- package/src/lib/profile-names.test.ts +58 -0
- package/src/lib/repos.test.ts +57 -0
- package/src/lib/repos.ts +167 -0
- package/src/lib/resolver-npx.ts +10 -1
- package/src/lib/runtime-materializer.test.ts +200 -3
- package/src/lib/runtime-materializer.ts +129 -20
- package/src/lib/shared-profiles.ts +2 -3
- package/src/lib/skill-clis.test.ts +113 -0
- package/src/lib/skill-clis.ts +232 -0
- package/src/lib/skill-dependencies.ts +9 -1
- package/src/lib/skill-deps.ts +1 -1
- package/src/lib/skill-linter.ts +1 -1
- package/src/lib/skill-quality.ts +0 -1
- package/src/lib/skill-sandbox.test.ts +1 -1
- package/src/lib/skills-lock.test.ts +1 -1
- package/src/lib/telemetry-consent.ts +3 -5
- package/src/lib/telemetry-report.test.ts +2 -2
- package/src/lib/token-budget.ts +111 -0
- package/src/lib/trigger-gaps.test.ts +70 -0
- package/src/lib/trigger-gaps.ts +48 -6
- package/src/lib/tui/data.ts +1 -5
- package/src/lib/workflow-store.ts +150 -0
- package/src/lib/workspace-secrets.ts +0 -4
- package/src/lib/workspaces.ts +1 -1
@@ -0,0 +1,104 @@
+# Phase 2: Optimize
+
+Apply the Phase 1 plan with Edit, in priority order, then write the result.
+Preserve every functional element. When in doubt, keep the original.
+
+## Objective
+
+- Run every operation in order: delete, merge, simplify, format.
+- Keep every functional element from the Phase 1 inventory.
+- Fix the flagged format issues.
+- Write the optimized content back to the target file.
+
+## Step 2.1: Apply operations in order
+
+**Priority 1, delete** (safest, highest impact):
+
+| Target | Action |
+|--------|--------|
+| Duplicate Overview | Remove `## Overview` if it restates the frontmatter description |
+| ASCII flowchart | Remove if a phase table or structure already covers it |
+| "When to use" / "Use Cases" | Remove |
+| Best Practices section | Remove if it duplicates Core Rules |
+| Duplicate folder tree | Remove the ASCII tree if an Output Artifacts table covers it |
+| "Next Phase" prose | Remove when a table or TodoWrite handles flow |
+| Standalone example sections | Remove if the logic is already shown inline |
+| Descriptive code blocks | Remove if nearby prose or a table covers the content |
+
+**Priority 2, merge** (structural):
+
+| Target | Action |
+|--------|--------|
+| Similar AskUserQuestion blocks | One block with a mode parameter |
+| Repeated Option A/B/C routing | One dispatch |
+| Sequential single-line commands | One code block |
+| Repeated TodoWrite blocks | Template once, the rest as one-line comments |
+| Duplicate error handling | One `## Error handling` table |
+| Equivalent template variants | One template plus a comment naming the dropped variant, for example `// variant: multi-perspective adds Perspective` |
+| Multiple output-artifact tables | One combined table with a phase column |
+
+**Priority 3, simplify** (compress descriptive content):
+
+| Target | Action |
+|--------|--------|
+| Verbose comments | One line, drop obvious restatements |
+| Display-format blocks | Convert a logging-only block to a prose sentence describing the output shape |
+| Wordy intros | Drop the preamble |
+| Prompt padding | Drop generic advice from agent or exploration prompts |
+| Long success-criteria lists | Trim to the essential 5 to 7, drop the obvious |
+
+**Priority 4, format fixes**:
+
+| Target | Action |
+|--------|--------|
+| Nested backtick literals | Convert the block to prose, or use a four-backtick fence |
+| Hardcoded option lists | Replace with dynamic generation: name the source and the generation logic |
+| Handoff without steps | Add concrete steps referencing the target command's interface |
+| Unclosed brackets | Match the brackets |
+| Undefined variables | Add the declaration or link the source |
+
+## Step 2.2: Language unification (only if needed)
+
+If the file mixes languages in functional comments, unify the non-functional
+text to the majority language. Never change variable names, function names,
+schema fields, or error message strings inside code.
+
+## Step 2.3: Write the result
+
+Apply edits with Edit on the target file. After writing, record the new line
+count for the report:
+
+```bash
+NEW_LINES=$(wc -l < "$FILE")
+SAVED=$(( ORIGINAL_LINES - NEW_LINES ))
+PCT=$(( SAVED * 100 / ORIGINAL_LINES ))
+echo "Reduced $FILE: $ORIGINAL_LINES -> $NEW_LINES (-${PCT}%)"
+```
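The percentage above divides by `ORIGINAL_LINES`, so an empty original file would stop the script with a division-by-zero error. A minimal sketch of a guarded variant follows. The helper name `report_savings` and its argument order are hypothetical and not part of the skill.

```bash
# Hypothetical helper (not part of the skill): report_savings FILE ORIGINAL_LINES
# Prints the same summary as the snippet above, but omits the percentage
# when the original line count is zero.
report_savings() (
  file=$1
  original_lines=$2
  # tr strips the padding that some wc implementations add around the count
  new_lines=$(wc -l < "$file" | tr -d ' ')
  saved=$(( original_lines - new_lines ))
  if [ "$original_lines" -gt 0 ]; then
    pct=$(( saved * 100 / original_lines ))
    echo "Reduced $file: $original_lines -> $new_lines (-${pct}%)"
  else
    echo "Reduced $file: $original_lines -> $new_lines"
  fi
)
```

The function body runs in a subshell, so its variables do not leak into the caller.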
+
+## Step 2.4: Keep a change record
+
+Track what changed so Phase 3 can explain a WARN and a human can audit:
+
+```
+optimizationRecord = {
+  deletedSections: [ ... ],        // section names removed
+  mergedGroups: [ { from, to } ],  // what folded into what
+  simplifiedAreas: [ { section, strategy } ],
+  formatFixes: [ { line, type, fix } ],
+  linesBefore: ORIGINAL_LINES,
+  linesAfter: NEW_LINES
+}
+```
+
+## Key rules
+
+1. Never modify a functional code block beyond compressing its comments and
+   whitespace.
+2. Descriptive blocks may be deleted only when prose or a table covers them.
+3. Never change a function signature, variable name, or schema field.
+4. A merge must keep every original branch. The unified version handles every
+   case the originals did.
+5. When uncertain, keep the original. Conservative beats clever here.
+6. Format fixes change presentation only, never semantics.
+
+Hand Phase 3 the `optimizationRecord` and the original snapshot path.
@@ -0,0 +1,145 @@
+# Phase 3: Integrity check
+
+Re-extract the inventory from the optimized file with the same logic as Phase
+1, compare counts category by category, validate format, and report PASS,
+WARN, or FAIL. Revert on FAIL. This phase is the entire reason the skill is
+safe to run. Do not skip it.
+
+## Objective
+
+- Re-run the Phase 1 extraction on the optimized content.
+- Compare counts with role-aware classification.
+- Confirm no new format issues appeared.
+- Report a result with actionable detail.
+- Restore the original when a critical element is missing.
+
+## Step 3.1: Re-extract
+
+```bash
+NEW_LINES=$(wc -l < "$FILE")
+```
+
+Read the optimized file fresh and rebuild the inventory using the exact same
+rules from `01-analysis.md` steps 1.2, 1.2.1, 1.2.2, 1.2.3. Same extraction,
+same role classification, same counting. A different method here would make
+the comparison meaningless.
+
+## Step 3.2: Compare, role-aware
+
+Diff each category as `after - before`. The category decides the verdict.
+
+**CRITICAL**, must not decrease, any drop is a FAIL:
+
+```
+functionalCodeBlocks dataStructures routingBranches errorHandlers
+conditionalLogic askUserQuestions inputModes outputArtifacts
+skillInvocations
+```
+
+**MERGE_AWARE**, a decrease is a WARN that needs coverage proof:
+
+```
+agentCalls codeBlocks
+```
+
+**EXPECTED_DECREASE**, a decrease is the goal, always OK:
+
+```
+descriptiveCodeBlocks todoWriteBlocks phaseHandoffs tables schemas
+```
+
+Verdict per category:
+
+```
+for each category:
+  diff = after - before
+  if CRITICAL:         status = diff < 0 ? FAIL : OK ; FAIL sets hasCriticalLoss
+  else if MERGE_AWARE: status = diff < 0 ? WARN : OK ; WARN sets hasWarning
+  else:                status = OK
+```
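The per-category rule above can be sketched as a small shell function. The name `verdict` and its argument order are assumptions made for illustration; the skill specifies only the rule, not an implementation.

```bash
# Hypothetical helper (not part of the skill): verdict CLASS BEFORE AFTER
# CLASS is CRITICAL, MERGE_AWARE, or EXPECTED_DECREASE.
# Prints OK, WARN, or FAIL for one inventory category.
verdict() (
  class=$1
  delta=$(( $3 - $2 ))   # after - before
  case "$class" in
    CRITICAL)
      if [ "$delta" -lt 0 ]; then echo FAIL; else echo OK; fi ;;
    MERGE_AWARE)
      if [ "$delta" -lt 0 ]; then echo WARN; else echo OK; fi ;;
    *)
      echo OK ;;          # EXPECTED_DECREASE: any change is acceptable
  esac
)
```

For example, `verdict CRITICAL 5 4` prints `FAIL` because one critical element disappeared, while `verdict EXPECTED_DECREASE 9 1` prints `OK`.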
+
+## Step 3.3: Deep verification
+
+**CRITICAL drop**: name exactly what was lost. For each FAIL category, list the
+before-items that have no match in the after-inventory using the matching
+rules below. That list is what you report to the user.
+
+**MERGE_AWARE drop**: check the merged version covers every original variant.
+For each dropped item, look for a merge comment in a surviving item that names
+it, for example `// variant: multi adds Perspective`. If a dropped item has no
+such comment anywhere, it is truly lost. Promote that category to FAIL and set
+`hasCriticalLoss`. Otherwise the WARN stands and the merge is fine.
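One way to check for a merge comment is a plain text search of the optimized file. The sketch below assumes merge comments sit on a single line and start with `// variant:` or `// For`; the helper name `is_covered` is hypothetical and not part of the skill.

```bash
# Hypothetical helper (not part of the skill): is_covered FILE NAME
# Succeeds when some line in FILE carries a "// variant:" or "// For ..."
# merge comment that also mentions NAME (fixed-string, case-sensitive).
is_covered() (
  file=$1
  name=$2
  grep -E '//[[:space:]]*(variant:|For )' "$file" | grep -qF -- "$name"
)
```

A caller could promote the category to FAIL whenever `is_covered "$FILE" "$dropped_item"` returns non-zero for any dropped item.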
+
+## Step 3.4: Format validation
+
+Re-scan the optimized `functional` blocks for format issues. Any issue that
+exists now but did not exist in the Phase 1 list is a NEW issue introduced by
+the edit, and a new issue is a FAIL.
+
+| Check | Detection | On failure |
+|-------|-----------|------------|
+| Bracket matching | Count `{([` vs `})]` per block | FAIL, fix or revert |
+| Variable consistency | `${var}` used but never declared | WARNING, note it |
+| Structural completeness | Body has an entry but no return, Write, or output | WARNING |
+| Nested backticks | Backtick literal inside a code fence | WARNING if pre-existing, FAIL if new |
+| Schema field preservation | After fields match before fields | FAIL if any field lost |
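The bracket-matching row describes a count comparison, which can be sketched with `tr` and `wc`. The helper name `brackets_balanced` is hypothetical. Like the table, it only compares totals, so it does not catch brackets that are balanced in number but closed in the wrong order.

```bash
# Hypothetical helper (not part of the skill): brackets_balanced < block
# Reads one code block on stdin and succeeds when the number of opening
# characters {([ equals the number of closing characters })].
brackets_balanced() (
  block=$(cat)
  opens=$(printf '%s' "$block" | tr -cd '{([' | wc -c | tr -d ' ')
  closes=$(printf '%s' "$block" | tr -cd '})]' | wc -c | tr -d ' ')
  [ "$opens" -eq "$closes" ]
)
```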
+
+## Step 3.5: Report
+
+```
+status = hasCriticalLoss ? FAIL : (hasWarning ? WARN : PASS)
+```
+
+Show a table with every category: before, after, delta, status. Highlight the
+FAIL and WARN rows. Add a one-line format summary: issues resolved, new issues
+(should be zero). Report lines before, lines after, and the percentage saved.
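The roll-up from per-category statuses to one overall status can be sketched as follows. The helper name `overall_status` and the one-status-per-line input format are assumptions for illustration.

```bash
# Hypothetical helper (not part of the skill): overall_status
# Reads one per-category status per line (OK, WARN, or FAIL) on stdin and
# prints the overall result: FAIL beats WARN, WARN beats PASS.
overall_status() (
  has_critical_loss=0
  has_warning=0
  while read -r status; do
    case "$status" in
      FAIL) has_critical_loss=1 ;;
      WARN) has_warning=1 ;;
    esac
  done
  if [ "$has_critical_loss" -eq 1 ]; then
    echo FAIL
  elif [ "$has_warning" -eq 1 ]; then
    echo WARN
  else
    echo PASS
  fi
)
```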
+
+## Step 3.6: Act on the result
+
+```bash
+case "$STATUS" in
+  FAIL)
+    cp "$FILE.simplify.bak" "$FILE"  # restore the original
+    echo "FAIL: critical element lost or new format issue. Reverted."
+    ;;
+  WARN)
+    echo "WARN: merge or descriptive decrease. Verify the merge covers every case."
+    # keep the edit, show the merge justification from optimizationRecord
+    ;;
+  PASS)
+    rm -f "$FILE.simplify.bak"  # safe to drop the backup
+    echo "PASS: every functional element preserved. ${SAVED} lines saved."
+    ;;
+esac
+```
+
+On FAIL, restore and report which category dropped and which elements were
+lost. Do not patch in place after a FAIL. Restore first, then a human decides.
+
+## Element matching rules
+
+How a before-element is judged present in the after-inventory:
+
+| Element | Match on |
+|---------|----------|
+| codeBlocks | Same language plus first meaningful line, ignoring whitespace and comments |
+| agentCalls | Same agentType plus prompt keyword overlap above 60% |
+| dataStructures | Same variable name, or same field set |
+| routingBranches | Same condition, normalized |
+| errorHandlers | Same error type or pattern |
+| conditionalLogic | Same condition plus same outcome set |
+| askUserQuestions | Same question count plus similar option labels |
+| inputModes | Same mode identifier |
+| outputArtifacts | Same file path pattern or artifact name |
+| skillInvocations | Same skill name |
+| todoWriteBlocks | Same phase names, order-independent |
+| phaseHandoffs | Same target phase reference |
+| tables | Same column headers |
+| schemas | Same schema name or field set |
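The `agentCalls` row relies on a keyword overlap figure that the skill does not define. The sketch below assumes one possible reading: the share of distinct lowercase words in the original prompt that also occur in the surviving prompt. Both helper names are hypothetical.

```bash
# Hypothetical helpers (not part of the skill). The word-level definition of
# "keyword overlap" used here is an assumption.

# keywords TEXT: print the distinct lowercase alphanumeric words, sorted.
keywords() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alnum:]' '\n' | sed '/^$/d' | sort -u
}

# keyword_overlap BEFORE_PROMPT AFTER_PROMPT: print the integer percentage of
# BEFORE_PROMPT keywords that also appear in AFTER_PROMPT.
keyword_overlap() (
  before=$(mktemp) && after=$(mktemp) || exit 1
  keywords "$1" > "$before"
  keywords "$2" > "$after"
  total=$(wc -l < "$before" | tr -d ' ')
  shared=$(comm -12 "$before" "$after" | wc -l | tr -d ' ')
  rm -f "$before" "$after"
  if [ "$total" -eq 0 ]; then
    echo 0
  else
    echo $(( shared * 100 / total ))
  fi
)
```

Under this reading, two agent calls with the same agentType would match when `keyword_overlap` prints a value above 60.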
+
+Merge coverage (`coversElement`):
+
+- Agent calls: a surviving template carries a `// For multi:` or
+  `// variant:` comment naming the missing variant.
+- Code blocks: a surviving block carries a comment noting the alternative was
+  folded in.
@@ -195,10 +195,19 @@ resolve_active_profile
 
 loaded_set=""
 if [ "$EXCLUDE_LOADED" -eq 1 ] && [ -n "$active_profile" ]; then
-
-
-
-
+  # Preferred: the materializer's manifest of loaded <category>/<slug> ids. The
+  # runtime skills/ dir is now flat (skills/<slug>), so the old depth-2 find no
+  # longer reflects category identity; the manifest does.
+  manifest="$HOME/.config/cue/runtime/$active_profile/claude/.cue-skills"
+  if [ -f "$manifest" ]; then
+    loaded_set=$(sort -u "$manifest" 2>/dev/null)
+  else
+    # Fallback for runtimes built before the manifest (nested <category>/<slug>).
+    runtime="$HOME/.config/cue/runtime/$active_profile/claude/skills"
+    if [ -d "$runtime" ]; then
+      loaded_set=$(find "$runtime" -mindepth 2 -maxdepth 2 -name "*" -print 2>/dev/null \
+        | sed "s|^$runtime/||" | sort -u)
+    fi
   fi
 fi
 is_loaded() {
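The manifest-first lookup in this hunk can be exercised in isolation with a small function. The sketch below is not the shipped script: the helper name `loaded_skills` is hypothetical, and it takes the runtime root (the directory holding `.cue-skills` and `skills/`) as an argument instead of deriving it from `$HOME` and the active profile.

```bash
# Hypothetical helper (not the shipped script): loaded_skills ROOT
# ROOT is assumed to contain either a .cue-skills manifest or a nested
# skills/<category>/<slug> tree. Prints sorted <category>/<slug> ids.
loaded_skills() (
  root=$1
  if [ -f "$root/.cue-skills" ]; then
    # Preferred: the manifest already lists <category>/<slug> ids.
    sort -u "$root/.cue-skills"
  elif [ -d "$root/skills" ]; then
    # Fallback: derive ids from the nested directory layout.
    cd "$root/skills" || exit 1
    find . -mindepth 2 -maxdepth 2 -print | sed 's|^\./||' | sort -u
  fi
)
```

Changing into the directory before running `find` keeps the `sed` pattern fixed, so characters in the runtime path cannot be misread as part of the pattern.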
@@ -0,0 +1,182 @@
+---
+name: verify-council
+description: >-
+  Use when finishing a visual or hard-to-reverse change, or the user says
+  "council", "panel of agents", or "visually verify". Runs independent verifier
+  lanes plus a real-screen proof, token-scaled.
+tags: [meta, verification, review, workflow]
+category: meta
+version: 1.0.0
+requires_mcps: []
+allowed-tools: []
+triggers:
+  - "verify with a council"
+  - "council of agents"
+  - "panel of agents"
+  - "verify harder"
+  - "visually verify"
+  - "prove it visually"
+  - "second opinion on this change"
+---
+
+# verify-council
+
+The closing "VERIFIED" step is where false confidence hides. One auditor shares
+the author's blind spots, and a visual claim "checked" by reading the render
+code is at best `[INFERRED]`, never `[VERIFIED]`. This skill replaces the single
+self-check with a small **council** of independent verifier lanes that run in
+parallel, one of which drives the **real screen** and reports a measured value.
+
+It is the panel upgrade to `/verify` (one auditor). Use `/verify` for a quick
+single check; use this when the change is visual, decision-critical, or hard to
+reverse, and you want diverse lenses plus a real-screen proof.
+
+## When to activate
+
+Run the council as the last step of a turn when **all three** hold (the
+`/verify` triage gate):
+
+1. **Decision-relevant**: the user will act on the result.
+2. **Hard to reverse**: being wrong costs real recovery (shipped bug, wrong
+   layout, broken contract).
+3. **Mechanically checkable**: a fresh agent can confirm by reading files,
+   running a command, or measuring the rendered screen.
+
+Skip it for cosmetic or one-line changes: an inline `[VERIFIED]` with a quoted
+line is enough. In minimal-safe-mode, ask before spawning the lanes.
+
+Also activate on: "council", "panel of agents", "verify harder", "visually
+verify".
|
|
50
|
+
|
|
51
|
+
## Phase 0: triage and mechanical gate (free, no agents)
|
|
52
|
+
|
|
53
|
+
Run the deterministic checks **first**. They are free and catch most breakage
|
|
54
|
+
before a single agent is paid for.
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
bun test <touched-files> # or the project's test runner
|
|
58
|
+
bun run typecheck # tsc --noEmit
|
|
59
|
+
bun run lint # biome / eslint
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
If any gate fails, stop and fix it. Reviewing broken code with agents wastes
|
|
63
|
+
tokens. Then classify the change surface, which decides the lanes:
|
|
64
|
+
|
|
65
|
+
- `code`: logic only, no rendered surface.
|
|
66
|
+
- `tui`: a terminal UI (cue picker, dashboards drawn to the pane).
|
|
67
|
+
- `web`: the `web/` dashboard or any browser surface.
|
|
68
|
+
|
|
69
|
+
## Phase 1: the council (parallel, scoped, cheap models)
|
|
70
|
+
|
|
71
|
+
Spawn only the lanes the change needs. Each lane gets the **diff and the list of
|
|
72
|
+
claims**, not the repo, and returns a structured verdict. Lanes run on a cheap
|
|
73
|
+
model (sonnet); the author adjudicates on the strong one.
|
|
74
|
+
|
|
75
|
+
| Lane | When | What it does | Returns |
|
|
76
|
+
|---|---|---|---|
|
|
77
|
+
| A correctness | always | Re-derives each logic claim against the diff, neutral prompt, no shared context | PASS / FAIL / PARTIAL + quoted line |
|
|
78
|
+
| B red-team | always for code | Runs the CRITICAL/HIGH review pass over the diff | findings or `REVIEW_CLEAN` |
|
|
79
|
+
| C visual | surface is `tui` or `web` | Drives the real screen and measures it (see below) | observed value vs claim |
|
|
80
|
+
| D skeptic | claim is hard to reverse | Independent refuter, told to REFUTE, default refuted when unsure | refuted? + reason |
|
|
81
|
+
|
|
82
|
+
The visual lane is the one that turns yellow into green:
|
|
83
|
+
|
|
84
|
+
- **TUI**: drive the actual terminal with the `cue-tty-watch` MCP: launch the
|
|
85
|
+
app in a tmux pane (`tmux_pane`), `send_keys_tmux` to reach the target state,
|
|
86
|
+
`screenshot`, then `find_text` or `ask_about_image` to assert the rendered
|
|
87
|
+
claim (footer string present, columns aligned).
|
|
88
|
+
- **Web**: drive the dev server with the `lightpanda` MCP: `goto` the URL,
|
|
89
|
+
`eval` `getComputedStyle()` / `getBoundingClientRect()`, and return the
|
|
90
|
+
measured pixel values.
|
|
91
|
+
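The lane table above amounts to a small selection rule. A minimal sketch, assuming lanes A and B form the default pair for every change (one reading of "always for code"); `selectLanes` and the lane labels are illustrative names, not identifiers from the skill's `workflow.js`.

```typescript
type Surface = "code" | "tui" | "web";
type Lane = "correctness" | "red-team" | "visual" | "skeptic";

function selectLanes(surface: Surface, hardToReverse: boolean): Lane[] {
  // Lanes A and B: assumed to run for every change that reaches the council.
  const lanes: Lane[] = ["correctness", "red-team"];
  // Lane C: only when there is a rendered surface to measure.
  if (surface === "tui" || surface === "web") lanes.push("visual");
  // Lane D: only when the claim is hard to reverse.
  if (hardToReverse) lanes.push("skeptic");
  return lanes;
}
```

For a `code` change that is easy to reverse this yields two lanes; a hard-to-reverse `tui` change yields all four.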
+## Phase 2: adjudication (author, strong model)
+
+The lanes surface disagreement; the **source settles** it. For every FAIL,
+PARTIAL, or finding:
+
+- Re-read the source at its **absolute** path. Quote the exact line, do not
+paraphrase. Use `grep -H` (never `-h`) so a match is credited to the right
+file.
+- If the source confirms the lane, issue a `[CORRECTION]` per the liedetector
+protocol and fix it.
+- If the source contradicts the lane, the source wins; the lane finding was a
+hallucination, no correction.
+- A claim earns `[VERIFIED]` only when its lane PASSed with quoted or measured
+evidence. A **visual** claim needs the Phase 1 measurement, never a code read.
+
+Fix every real CRITICAL and HIGH, re-run the relevant Phase 0 gate to prove the
+fix, then write the confidence audit and sign off.
+
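The adjudication rules above can be sketched as one decision function. This is an illustration only: the `LaneResult` shape and the `"dismissed"` and `"needs-reread"` labels are hypothetical. Tagging a PASS without evidence as `[INFERRED]` is also an assumption for non-visual claims; the source states it only for visual claims checked by a code read.

```typescript
type Verdict = "PASS" | "FAIL" | "PARTIAL";
type Evidence = "quoted" | "measured" | "none";

interface LaneResult {
  claimId: string;
  visual: boolean; // claim concerns layout, spacing, or rendered text
  verdict: Verdict;
  evidence: Evidence;
  // Result of the author's re-read of the source; unset until that happens.
  sourceConfirmsLane?: boolean;
}

type Outcome =
  | "[VERIFIED]"
  | "[INFERRED]"
  | "[CORRECTION]"
  | "dismissed"
  | "needs-reread";

function adjudicate(r: LaneResult): Outcome {
  if (r.verdict !== "PASS") {
    // A FAIL or PARTIAL is never trusted on its own: the source decides.
    if (r.sourceConfirmsLane === undefined) return "needs-reread";
    return r.sourceConfirmsLane ? "[CORRECTION]" : "dismissed";
  }
  // A PASS needs quoted or measured evidence to count.
  if (r.evidence === "none") return "[INFERRED]";
  // A visual claim needs a measurement, never a code read.
  if (r.visual && r.evidence !== "measured") return "[INFERRED]";
  return "[VERIFIED]";
}
```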
+## Running it
+
+**Preferred (Claude Code): the Workflow engine.** It encodes the adaptive
+fan-out, schema-validated verdicts, and the parallel barrier in one place. Pass
+the capped diff, the claim list, and the surface as `args`:
+
+```text
+Workflow({
+  scriptPath: ".../meta/verify-council/references/workflow.js",
+  args: { diff, claims, surface, hardToReverse, appCmd, url }
+})
+```
+
+Read the returned verdicts and run Phase 2 yourself. The engine spends nothing
+on adjudication, keeping the strong model out of the fan-out.
+
+**Fallback (no Workflow tool, e.g. Codex): manual fan-out.** Spawn the same
+lanes with the Agent tool, one per lane, using the neutral prompts in
+[references/lane-prompts.md](references/lane-prompts.md). The protocol is
+identical; only the orchestration differs.
+
+## Token discipline
+
+This runs "each time" only because it stays cheap:
+
+- Mechanical gate short-circuits before any agent; triage early-exits trivial
+diffs to zero agents.
+- Lanes see the diff and claims, capped at 60k chars, not the repo.
+- Adaptive count: 2 lanes default, plus the visual lane only when visual, plus
+the skeptic only when hard to reverse.
+- Cheap model for lanes; strong model only for adjudication.
+- One parallel barrier, so wall-clock is the slowest lane, not the sum.
+- Each lane carries a fixed floor (~35k tokens for agent boot + tool registry),
+so the council pays off only for genuinely high-stakes changes. For the rest, a
+clean mechanical gate is the right answer: do not convene a council for a diff
+the test suite already proves.
+- Pass the diff INLINE, not via a fetch command. An inline diff gives the lane
+nothing to roam toward; handed only a command, an agent tends to explore the
+wider repo (more tokens, unscoped findings).
+
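Two of the numbers in the list above lend themselves to a quick sketch: the 60k-character diff cap and the roughly 35k-token floor per lane. The function names are hypothetical, and head truncation is an assumption, since the source does not say how the cap is applied.

```typescript
const DIFF_CAP_CHARS = 60000; // "capped at 60k chars"
const LANE_FLOOR_TOKENS = 35000; // "~35k tokens" per lane, approximate

// Assumption: the cap keeps the head of the diff and drops the rest.
function capDiff(diff: string, cap: number = DIFF_CAP_CHARS): string {
  return diff.length <= cap ? diff : diff.slice(0, cap);
}

// Lower bound on council cost before any lane reads a single diff line.
function floorTokens(laneCount: number): number {
  return laneCount * LANE_FLOOR_TOKENS;
}
```

Under these figures a full four-lane council starts at about 140k tokens, which is why the list reserves it for high-stakes changes.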
+## Rules
+
+- Never tag a visual claim `[VERIFIED]` from a code read. The visual lane's
+measured value is the only path to green for layout, spacing, or rendered text.
+- Never trust a FAIL on its own. Re-read the source at its absolute path and
+quote the line before acting.
+- Never skip Phase 0. Paying agents to review code that fails its own tests is
+the most wasteful mistake here.
+- Never spawn the heavy lanes for a cosmetic diff. Match the lane count to the
+stakes.
+- In minimal-safe-mode, ask before spawning the council.
+
+## Example
+
+A TUI footer + column-alignment change to the cue profile picker
+(`src/lib/picker.ts`). Surface is `tui` and the layout is hard to reverse, so
+the council runs correctness + red-team + visual.
+
+```text
+1. Phase 0 (free): bun test src/lib/picker.test.ts → 155 pass
+bun run typecheck → clean
+2. Phase 1 (council): correctness + red-team over the diff, plus the visual lane:
+cue-tty-watch launches the picker in a tmux pane, keys to the combine screen,
+screenshots it, find_text "⏎ enter to continue" → present; columns aligned.
+3. Phase 2 (author): no FAILs to adjudicate → the footer claim moves from
+🟡 [INFERRED] (read the render code) to 🟢 [VERIFIED] (measured on screen).
+```
+
+## See also
+
+- `/verify`: the single-auditor fast path this skill scales up.
+- `/code-review-deep`: the standalone diff review that lane B reuses.
+- `meta/liedetector`: the confidence-tag output contract Phase 2 writes in.
@@ -0,0 +1,103 @@
+# Lane prompts (Agent-tool fallback)
+
+Use these when the Workflow tool is unavailable (e.g. Codex). Spawn one Agent
+per lane, in parallel, each on a cheap model (sonnet). They are the same prompts
+the Workflow engine builds inline; this file is the human-readable canonical
+copy. Fill in `<DIFF>`, `<CLAIMS>`, `<URL>`, `<APP_CMD>` before spawning.
+
+Every lane shares this preamble:
+
+```text
+You are an INDEPENDENT verifier with no shared context. Do not edit anything.
+Several assertions below may be false — do not assume they are correct.
+For each claim return PASS / FAIL / PARTIAL / NA with evidence that QUOTES the
+exact line or MEASURES the value. A remembered or paraphrased line is not evidence.
+Return a terse list: one claim id + verdict + one evidence line each.
+
+SCOPE: the change under review is EXACTLY <DIFF> (or the verbatim output of the one
+command you are given). Do not open or report on any file outside it; discard any
+finding you cannot tie to a line shown there. Reviewing the wider repo wastes the
+token budget.
+CLAIMS: answer exactly the claim ids listed, verbatim. Do not invent, rename, or add
+ids. If a claim cannot be judged from the change, mark it NA.
+```
+
+The scope + claim-binding lines matter: without them a lane drifts into a free-form
+repo review, which both blows the token budget and leaves the caller's actual claims
+unverified.
+
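Filling the `<DIFF>`, `<CLAIMS>`, `<URL>`, and `<APP_CMD>` placeholders before spawning could be sketched as below. `fillPrompt` is a hypothetical helper, not part of the package. It substitutes in a single pass, so placeholder-like text inside the diff itself is left alone, and it throws on a missing value so a prompt never ships with an unfilled placeholder.

```typescript
// Hypothetical helper: replaces the four documented placeholders in one pass.
function fillPrompt(
  template: string,
  values: { [key: string]: string }
): string {
  return template.replace(
    /<(DIFF|CLAIMS|URL|APP_CMD)>/g,
    (_match: string, key: string) => {
      const value = values[key];
      if (value === undefined) {
        throw new Error("missing value for <" + key + ">");
      }
      // Returned text is inserted literally and is not rescanned.
      return value;
    }
  );
}

// Usage with a lane-style body: the claims and diff are passed inline.
const laneBody = "CLAIMS:\n<CLAIMS>\n\n--- DIFF ---\n<DIFF>\n--- END DIFF ---";
const filledPrompt = fillPrompt(laneBody, {
  CLAIMS: "c1: footer string is present",
  DIFF: "+ const footer = \"enter to continue\";",
});
```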
+## Lane A — correctness (always)
+
+```text
+Lane: correctness. Re-derive each claim strictly from the diff.
+
+CLAIMS:
+<CLAIMS>
+
+--- DIFF ---
+<DIFF>
+--- END DIFF ---
+```
+
+## Lane B — red-team (always for code)
+
+```text
+Lane: red-team. Review ONLY the diff for blocking defects.
+CRITICAL: security holes, data loss, crashes, injection, broken auth.
+HIGH: real bugs, broken contracts, race conditions, wrong logic.
+List each as a finding with severity. Skip LOW/style nits.
+If there are no CRITICAL or HIGH defects, reply exactly: REVIEW_CLEAN.
+
+--- DIFF ---
+<DIFF>
+--- END DIFF ---
+```
+
+## Lane C — visual proof (only when the surface is rendered)
+
+A code read is not acceptable here. Only a screenshot or a measured value counts.
+
+TUI surface:
+
+```text
+Lane: visual proof. Surface: TUI.
+Use the cue-tty-watch MCP tools (find them with ToolSearch): launch the app in a
+tmux pane with `<APP_CMD>`, send_keys_tmux to reach the target screen, screenshot
+it, then find_text / ask_about_image to assert each claim against what is on screen.
+
+VISUAL CLAIMS:
+<CLAIMS>
+```
+
+Web surface:
+
+```text
+Lane: visual proof. Surface: WEB.
+Use the lightpanda MCP tools (find them with ToolSearch): goto <URL>, then eval
+getComputedStyle() / getBoundingClientRect() on the relevant selectors and return
+the MEASURED pixel values for each claim.
+
+VISUAL CLAIMS:
+<CLAIMS>
+```
+
+## Lane D — skeptic (only when the change is hard to reverse)
+
+```text
+Lane: skeptic. Your job is to REFUTE the riskiest claim, not confirm it. Default
+each claim to FAIL ("refuted") unless the diff makes it undeniable. Be adversarial.
+
+CLAIMS:
+<CLAIMS>
+
+--- DIFF ---
+<DIFF>
+--- END DIFF ---
+```
+
+## Adjudication (back on the strong model)
+
+The lanes surface disagreement; the source settles it. For every FAIL / PARTIAL /
+finding: re-read the source at its absolute path, quote the exact line (`grep -H`,
+never `-h`), and let the source win. Tag a claim `[VERIFIED]` only when its lane
+PASSed with quoted or measured evidence; a visual claim needs Lane C's measurement.
|