npm - @event4u/agent-config - Versions diffs - 5.6.1 → 5.8.0 - Mend

@event4u/agent-config 5.6.1 → 5.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (225)

package/.agent-src/commands/agent-handoff.md +1 -1
package/.agent-src/commands/agent-status.md +1 -1
package/.agent-src/commands/agents/audit.md +1 -1
package/.agent-src/commands/agents/init.md +1 -1
package/.agent-src/commands/agents/user/accept.md +3 -3
package/.agent-src/commands/agents/user/init.md +4 -4
package/.agent-src/commands/agents/user/show.md +3 -3
package/.agent-src/commands/agents/user/update.md +3 -3
package/.agent-src/commands/agents/user.md +1 -1
package/.agent-src/commands/agents.md +1 -1
package/.agent-src/commands/analytics/prune.md +1 -1
package/.agent-src/commands/analytics/show.md +1 -1
package/.agent-src/commands/analytics.md +1 -1
package/.agent-src/commands/bug-fix.md +1 -1
package/.agent-src/commands/challenge-me.md +1 -1
package/.agent-src/commands/chat-history/import.md +1 -1
package/.agent-src/commands/chat-history/learn.md +1 -1
package/.agent-src/commands/chat-history/show.md +1 -1
package/.agent-src/commands/chat-history.md +1 -1
package/.agent-src/commands/check-current-md.md +1 -1
package/.agent-src/commands/condense.md +1 -1
package/.agent-src/commands/context.md +1 -1
package/.agent-src/commands/cost-report.md +13 -8
package/.agent-src/commands/council.md +3 -3
package/.agent-src/commands/create-pr/description-only.md +1 -1
package/.agent-src/commands/create-pr.md +1 -1
package/.agent-src/commands/e2e-heal.md +1 -1
package/.agent-src/commands/e2e-plan.md +1 -1
package/.agent-src/commands/feature.md +1 -1
package/.agent-src/commands/fix/ci.md +1 -1
package/.agent-src/commands/fix/portability.md +1 -1
package/.agent-src/commands/fix/pr-bot-comments.md +1 -1
package/.agent-src/commands/fix/pr-comments.md +1 -1
package/.agent-src/commands/fix/pr-developer-comments.md +1 -1
package/.agent-src/commands/fix/refs.md +1 -1
package/.agent-src/commands/fix/seeder.md +1 -1
package/.agent-src/commands/fix.md +1 -1
package/.agent-src/commands/judge.md +1 -1
package/.agent-src/commands/knowledge/cross-repo.md +1 -1
package/.agent-src/commands/knowledge/forget.md +1 -1
package/.agent-src/commands/knowledge/ingest.md +1 -1
package/.agent-src/commands/knowledge/list.md +1 -1
package/.agent-src/commands/knowledge.md +1 -1
package/.agent-src/commands/memory/add.md +1 -1
package/.agent-src/commands/memory/learn-low-impact.md +1 -1
package/.agent-src/commands/memory/load.md +1 -1
package/.agent-src/commands/memory/mine-session.md +1 -1
package/.agent-src/commands/memory/promote.md +1 -1
package/.agent-src/commands/memory/propose.md +1 -1
package/.agent-src/commands/memory.md +1 -1
package/.agent-src/commands/mode.md +1 -1
package/.agent-src/commands/optimize/agents-dir.md +1 -1
package/.agent-src/commands/optimize/augmentignore.md +1 -1
package/.agent-src/commands/optimize/rtk.md +1 -1
package/.agent-src/commands/optimize/skills.md +1 -1
package/.agent-src/commands/optimize.md +1 -1
package/.agent-src/commands/orchestrate.md +1 -1
package/.agent-src/commands/override/create.md +1 -1
package/.agent-src/commands/override/manage.md +1 -1
package/.agent-src/commands/override.md +1 -1
package/.agent-src/commands/package-reset.md +1 -1
package/.agent-src/commands/prediction-pool.md +234 -0
package/.agent-src/commands/profile/activate.md +81 -0
package/.agent-src/commands/profile/deactivate.md +68 -0
package/.agent-src/commands/profile/show.md +70 -0
package/.agent-src/commands/profile.md +68 -0
package/.agent-src/commands/project-health.md +1 -1
package/.agent-src/commands/quality-fix.md +1 -1
package/.agent-src/commands/roadmap/process-full.md +1 -1
package/.agent-src/commands/roadmap/process-phase.md +1 -1
package/.agent-src/commands/roadmap/process-step.md +1 -1
package/.agent-src/commands/roadmap.md +1 -1
package/.agent-src/commands/set-cost-profile.md +9 -9
package/.agent-src/commands/skill/preview.md +3 -3
package/.agent-src/commands/skill.md +1 -1
package/.agent-src/commands/skills/discover.md +1 -1
package/.agent-src/commands/skills.md +1 -1
package/.agent-src/commands/sync-agent-settings.md +3 -3
package/.agent-src/commands/sync-gitignore/fix.md +1 -1
package/.agent-src/commands/sync-gitignore.md +1 -1
package/.agent-src/commands/update-form-request-messages.md +1 -1
package/.agent-src/presets/README.md +1 -1
package/.agent-src/profiles/README.md +1 -1
package/.agent-src/rules/non-destructive-by-default.md +2 -1
package/.agent-src/skills/check-refs/SKILL.md +1 -1
package/.agent-src/skills/finishing-a-development-branch/SKILL.md +1 -1
package/.agent-src/skills/git-workflow/SKILL.md +1 -1
package/.agent-src/skills/jira-integration/SKILL.md +1 -1
package/.agent-src/skills/markitdown/SKILL.md +1 -1
package/.agent-src/skills/prediction-pool-optimizer/SKILL.md +314 -0
package/.agent-src/skills/prediction-pool-optimizer/evals/triggers.json +20 -0
package/.agent-src/skills/prediction-pool-optimizer/reference/ev-fixtures.md +175 -0
package/.agent-src/skills/prediction-pool-optimizer/reference/odds-and-bonus.md +109 -0
package/.agent-src/skills/rtk-output-filtering/SKILL.md +1 -1
package/.agent-src/skills/script-writing/SKILL.md +1 -1
package/.agent-src/skills/token-optimizer/SKILL.md +1 -1
package/.agent-src/skills/using-git-worktrees/SKILL.md +1 -1
package/.agent-src/templates/agent-settings.md +7 -7
package/.agent-src/templates/agents/agent-project-settings.example.yml +2 -2
package/.agent-src/templates/scripts/work_engine/_lib/agent_settings.py +54 -6
package/.agent-src/templates/scripts/work_engine/hook_bootstrap.py +1 -1
package/.agent-src/templates/scripts/work_engine/hooks/builtin/memory_visibility.py +9 -7
package/.agent-src/templates/scripts/work_engine/hooks/settings.py +9 -10
package/.agent-src/templates/scripts/work_engine/scoring/memory_visibility.py +17 -4
package/.claude-plugin/marketplace.json +370 -364
package/CHANGELOG.md +108 -0
package/README.md +2 -2
package/config/agent-settings.template.yml +11 -2
package/config/discovery/packs.yml +11 -0
package/config/discovery/session-profiles.yml +37 -0
package/config/discovery/workspaces.yml +1 -1
package/config/profiles/balanced.ini +1 -1
package/config/profiles/full.ini +1 -1
package/config/profiles/minimal.ini +1 -1
package/dist/discovery/deprecation-report.md +1 -1
package/dist/discovery/discovery-manifest.json +254 -100
package/dist/discovery/discovery-manifest.json.sha256 +1 -1
package/dist/discovery/discovery-manifest.summary.md +4 -3
package/dist/discovery/orphan-report.md +1 -1
package/dist/discovery/packs.json +41 -6
package/dist/discovery/trust-report.md +3 -3
package/dist/discovery/workspaces.json +19 -6
package/dist/mcp/registry-manifest.json +3 -3
package/dist/server/io/substituteTemplate.js +3 -3
package/dist/server/io/substituteTemplate.js.map +1 -1
package/dist/server/routes/settings.js +2 -2
package/dist/server/routes/settings.js.map +1 -1
package/dist/server/schemas/settings.js +4 -2
package/dist/server/schemas/settings.js.map +1 -1
package/dist/ui/assets/{index-DVsyUMZe.js → index-5lFqAKL0.js} +2 -2
package/dist/ui/assets/index-5lFqAKL0.js.map +1 -0
package/dist/ui/index.html +1 -1
package/docs/architecture/current-onboard-baseline.md +3 -3
package/docs/architecture.md +2 -2
package/docs/catalog.md +11 -5
package/docs/contracts/adr-level-6-productization.md +1 -1
package/docs/contracts/command-clusters.md +2 -0
package/docs/contracts/config-presets.md +2 -2
package/docs/contracts/cost-profile-defaults.md +5 -5
package/docs/contracts/discovery-manifest.schema.json +1 -1
package/docs/contracts/explain-trace.schema.json +3 -3
package/docs/contracts/memory-visibility-v1.md +15 -7
package/docs/contracts/profile-system.md +2 -2
package/docs/contracts/session-profile-overlay.md +120 -0
package/docs/contracts/settings-api.md +3 -3
package/docs/contracts/value-report-schema.md +14 -1
package/docs/customization.md +47 -5
package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +47 -11
package/docs/decisions/ADR-013-discovery-frontmatter-contract.md +16 -2
package/docs/decisions/ADR-034-per-skill-model-recommendation-transport.md +1 -1
package/docs/decisions/ADR-036-global-install-browser-wizard-handoff.md +106 -0
package/docs/decisions/ADR-037-cost-profile-untangle.md +117 -0
package/docs/decisions/ADR-038-canonical-settings-path.md +66 -0
package/docs/decisions/ADR-039-claude-skills-untracked.md +139 -0
package/docs/decisions/ADR-rule-kernel-and-router.md +1 -1
package/docs/decisions/INDEX.md +4 -0
package/docs/development.md +12 -0
package/docs/getting-started.md +2 -2
package/docs/guidelines/agent-infra/layered-settings.md +10 -4
package/docs/installation.md +3 -3
package/docs/setup/mcp-client-config.md +1 -1
package/docs/skills-catalog.md +5 -1
package/docs/value.md +9 -7
package/docs/wizard.md +1 -1
package/llms.txt +4 -0
package/package.json +1 -1
package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
package/scripts/_cli/cmd_doctor.py +3 -2
package/scripts/_cli/cmd_explain.py +1 -1
package/scripts/_cli/cmd_versions.py +2 -2
package/scripts/_cli/explain_last/inputs.py +11 -8
package/scripts/_cli/explain_last/sections/inputs.py +1 -1
package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
package/scripts/_lib/agent_settings.py +54 -6
package/scripts/_lib/agent_src.py +30 -0
package/scripts/_lib/value_ladder.py +99 -2
package/scripts/_lib/value_report.py +30 -16
package/scripts/ai_council/modes.py +1 -1
package/scripts/ai_council/session.py +5 -1
package/scripts/audit_command_surface.py +7 -1
package/scripts/audit_initial_context.py +26 -2
package/scripts/check_gate_paths.py +117 -0
package/scripts/check_references.py +51 -2
package/scripts/check_skill_requires.py +143 -0
package/scripts/check_test_coverage_diff.py +180 -0
package/scripts/compile_router.py +5 -1
package/scripts/condense.py +92 -4
package/scripts/config/session_profiles.py +492 -0
package/scripts/council_cli.py +5 -1
package/scripts/first-run.sh +11 -11
package/scripts/hook_manifest.yaml +15 -7
package/scripts/hooks/dispatch_hook.py +8 -0
package/scripts/install +14 -1
package/scripts/install-hooks.sh +2 -1
package/scripts/install.py +203 -433
package/scripts/install_anthropic_key.sh +1 -1
package/scripts/install_openai_key.sh +1 -1
package/scripts/inventory_abstraction_budget.py +6 -1
package/scripts/lint_agents_md.py +11 -4
package/scripts/lint_discovery_vocabulary.py +5 -5
package/scripts/lint_hook_concern_budget.py +5 -1
package/scripts/lint_marketplace.py +18 -7
package/scripts/lint_roadmap_ci_steps.py +5 -1
package/scripts/lint_roadmap_complexity.py +5 -1
package/scripts/lint_value_dashboard.py +1 -1
package/scripts/mcp_server/prompts.py +5 -1
package/scripts/prediction-pool/adapters/_schema.md +42 -0
package/scripts/prediction-pool/adapters/kicktipp.yml +23 -0
package/scripts/prediction-pool/poisson_sim.py +167 -0
package/scripts/prediction-pool/pool_winsim.py +236 -0
package/scripts/prediction-pool/score_ev.py +188 -0
package/scripts/profile_staleness_hook.py +69 -0
package/scripts/render_value_md.py +1 -0
package/scripts/roadmap_progress_hook.py +56 -6
package/scripts/schemas/agent-settings.schema.json +77 -0
package/scripts/schemas/skill.schema.json +7 -0
package/scripts/smoke_quickstart.py +7 -6
package/scripts/sync_agent_settings.py +12 -5
package/scripts/validate_agent_settings.py +124 -0
package/scripts/validate_decision_engine.py +5 -1
package/templates/minimal/.agent-settings.yml +1 -1
package/dist/ui/assets/index-DVsyUMZe.js.map +0 -1
package/scripts/measure_roadmap_trajectory.py +0 -112
package/scripts/verify_roadmap_closure.py +0 -327

package/.agent-src/commands/skills.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: skills
 tier: 2
 description: Skill discovery orchestrator — routes to discover. Local, explained skill recommendations over the catalog + role shortlists + optional local analytics.

package/.agent-src/commands/sync-agent-settings.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: sync-agent-settings
 tier: 1
 description: Sync `.agent-settings.yml` against the current template + profile — adds new sections/keys, preserves user values, shows a diff before writing
@@ -41,7 +41,7 @@ Use when:
 ## When NOT to use
-- To change a value (`ide`, `cost_profile`, `max_parallel`) → edit the
+- To change a value (`ide`, `rule_loading_tier`, `max_parallel`) → edit the
   file directly or ask the agent; the sync only reconciles structure.
 - To create `.agent-project-settings.yml` (team file) → that is a
   separate concern; this command only touches the developer file.
@@ -95,7 +95,7 @@ Free-text replies (`"nö"`, `"leave it"`, unrecognized input) count as
 ### 4. Profile override
-The script auto-detects the profile from the target's `cost_profile`
+The script auto-detects the profile from the target's `rule_loading_tier`
 key and falls back to `minimal`. To sync against a different profile
 (e.g. during a profile change), pass `--profile balanced` or
 `--profile full` — but ask the user first; changing the profile is a

package/.agent-src/commands/sync-gitignore/fix.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: sync-gitignore:fix
 tier: 2
 cluster: sync-gitignore

package/.agent-src/commands/sync-gitignore.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: sync-gitignore
 tier: 1
 cluster: sync-gitignore

package/.agent-src/commands/update-form-request-messages.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: update-form-request-messages
 tier: 2
 framework: laravel

package/.agent-src/presets/README.md CHANGED

@@ -4,7 +4,7 @@ Seed presets for the [preset system](../../docs/contracts/config-presets.md).
 Each preset bundles governance knobs (autonomy / confidence / risk /
 council / mcp / cost / notifications) so the user picks a stance, not
 a dozen individual values. Boundary against `profile.id`, `pack.id`,
-and `cost_profile` lives in
+and `rule_loading_tier` lives in
 [ADR-010](../../docs/decisions/ADR-010-profile-pack-preset-boundary.md).
 ## Seed set (v2.x — fixed)

package/.agent-src/profiles/README.md CHANGED

@@ -4,7 +4,7 @@ Seed profiles for the [profile system](../../docs/contracts/profile-system.md).
 Each profile answers *who is the user?* — audience identity that
 selects the default skill/command surface, README entry-paragraph,
 and persona pre-selection. Boundary against `preset.id`, `pack.id`,
-and `cost_profile` lives in
+and `rule_loading_tier` lives in
 [ADR-010](../../docs/decisions/ADR-010-profile-pack-preset-boundary.md).
 ## Seed set (v2.x — fixed)

package/.agent-src/rules/non-destructive-by-default.md CHANGED

@@ -1,7 +1,7 @@
 ---
 type: "always"
 tier: "safety-floor"
-description: "Agent is never destructive — Hard Floor always asks for prod-trunk merges, deploys, pushes, prod data/infra, bulk deletions, and bulk-deletion/infra commits; no autonomy or roadmap bypass"
+description: "Hard Floor: agent asks before prod-trunk commits/merges, deploys, pushes, prod data/infra, bulk deletions/infra commits; verify branch before each commit; no autonomy or roadmap bypass"
 alwaysApply: true
 load_context:
   - ../contexts/authority/destructive-mechanics.md
@@ -28,6 +28,7 @@ Triggers below require explicit user confirmation **on this turn** — not from
 | Trigger | Examples |
 |---|---|
 | **Production-branch merge** | `main`, `master`, `prod`, `production`, `release/*`, or any branch the project marks as deployment trunk |
+| **Commit on a production branch** | any `git commit` while `HEAD` is on a prod trunk (set above). **Verify branch before every commit** — `main` is opt-in only, never inferred from a prior turn or a merged PR that left the repo on `main` |
 | **Deploy / release** | `terraform apply` on prod, `kubectl apply` on prod, deploy scripts, release commands, tag pushes that trigger CI deployment |
 | **Push to remote** | any `git push` (also covered by [`scope-control`](scope-control.md), restated so the floor never weakens) |
 | **Production data / infra** | prod DB writes / migrations, prod config, secrets rotation, IAM / role / policy, DNS, anything in a `prod`-scoped path or pipeline |

package/.agent-src/skills/check-refs/SKILL.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: check-refs
 description: "Use when verifying cross-references between skills, rules, commands, guidelines, and context documents are not broken after edits, renames, or deletions."
 domain: process

package/.agent-src/skills/finishing-a-development-branch/SKILL.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: finishing-a-development-branch
 description: "Use when the feature is implementation-complete and the next step is 'ship it' — verifies, cleans up, and routes to merge/PR/park/discard — even when the user just says 'I'm done, what now?'."
 domain: process

package/.agent-src/skills/git-workflow/SKILL.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: git-workflow
 description: "Use when working with Git — branch naming, commit messages, PR creation, rebasing, or the code review process — even when the user says 'push this' or 'merge the branch' without naming Git."
 domain: process

package/.agent-src/skills/jira-integration/SKILL.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: jira-integration
 description: "Use when the user says "check Jira", "create ticket", "update issue", or needs JQL queries, ticket transitions, or branch-to-ticket linking."
 domain: process

package/.agent-src/skills/markitdown/SKILL.md CHANGED

@@ -1,5 +1,5 @@
 ---
-model_tier: inherit
+model_tier: medium
 name: markitdown
 description: "Use when converting PDF, DOCX, XLSX, PPTX, EPUB, images, or audio to Markdown for LLM ingestion via the upstream markitdown-mcp server — 'extract this PDF', 'OCR this image', 'transcribe this audio'."
 status: active

package/.agent-src/skills/prediction-pool-optimizer/SKILL.md ADDED

@@ -0,0 +1,314 @@
+---
+model_tier: high
+name: prediction-pool-optimizer
+description: "Optimize prediction-pool tips (kicktipp etc.): rules + multi-book consensus odds → expected-points-max answer for every question, scores AND bonus. Triggers 'optimize my pool tips', 'predict'."
+domain: product
+personas: []
+workspaces:
+  - small-business
+packs:
+  - fun
+lifecycle: experimental
+trust:
+  level: experimental
+install:
+  default: false
+  removable: true
+---
+# prediction-pool-optimizer
+> Turn a prediction pool's **scoring rules** plus a **consensus of the major
+> bookmakers' odds** into the answer that maximizes **expected points** — not
+> the most likely outcome — for **every open question in the pool**: match
+> scores AND every bonus / award / special question (top scorer, group
+> winners, champion, most cards …). Sport-agnostic core with per-sport
+> probability blocks. Consumed by [`/prediction-pool`](../../commands/prediction-pool.md).
+> The optimization target is the pool's score, so the chain is always
+> **rules → odds → expected value → participant field → answer**, never
+> "who wins this match?".
+## When to use
+When someone wants the best tips for a prediction / betting pool
+(kicktipp-style company pools — football WM, basketball WM, …) and the
+target is **pool points**, not match truth. Triggered by
+[`/prediction-pool`](../../commands/prediction-pool.md) (Steps 3–5) or directly
+when a user asks to optimize / maximize their pool picks.
+**The one idea that makes this skill correct:** the highest-probability
+result is **not** the highest-expected-value tip. Under most pool rules a
+2:1 or 1:0 scores the same partial points as the "obvious" pick but hits
+more often; under quote/rarity rules a rare-but-plausible result is worth
+more. **Always optimize the pool's points, never the truth of the match.**
+## Hard rules
+- **Rules before tips.** Never produce a tip before the pool's scoring is
+  parsed (Procedure step 1). Strategy is a function of the rules.
+- **Answer EVERY open question.** A pool has scores *and* bonus / award /
+  special questions ("which team supplies the top scorer?", "most yellow
+  cards?", "champion?"). Scorelines only, bonus questions blank = a **failed
+  run** — enumerate every open question in step 1, carry each to an answer
+  (steps 5–6). No silent skips.
+- **Odds are the primary signal — multi-book consensus, not one book.**
+  Bookmaker probabilities already fold in form, squad, injuries, travel,
+  climate. Build the base from a **consensus across the 5–10 biggest
+  publicly-viewable books** (step 2), de-vigged, **sharpness-weighted** —
+  never mirror a single portal. Override only with *current* info (confirmed
+  lineups, late injuries, suspensions, manager change).
+- **No invented numbers.** Emit no probability you cannot derive from real
+  odds or **actually executed** code. Tournament/outright/award numbers come
+  from real markets **or** the executed Poisson helper — never a claimed
+  "I ran 10,000 simulations".
+- **Scorelines computed, not guessed.** EV-max tip per match from the executed
+  grid optimiser (`score_ev.py`, step 4a), never the eye. A 3:2 / 4:1 / 1:4 in
+  the output = signature of a skipped computation.
+- **One-sentence justification** per answer. Short.
+## Procedure
+### 1. Parse the pool rules AND enumerate every open question
+From the pool's rule page, extract and document:
+- Points for **exact result** / **goal (point) difference** / **tendency**.
+- **Every bonus / award / special question** (champion, top scorer, "team of
+  the top scorer", group winners, most cards, longest unbeaten,
+  will-there-be-a-red-card, over/under totals …). **Write them all down as an
+  explicit checklist** — this list is the run's contract; every entry must
+  reach an answer.
+- **Joker / multiplier** rules, per-question point weights.
+- **Quote / rarity** scoring (rare correct tips score more)? — flips strategy
+  toward contrarian (step 4).
+- Special scorings, **per-question deadlines**, **strategy limits** (e.g. max
+  N identical tips).
+- **The goal**: place well, or *win* a large pool? (changes variance — step 4.)
+### 2. Build the data base — a consensus across the major books
+Primary signal: current bookmaker odds, **aggregated across the 5–10 biggest
+publicly-viewable books**, not a single portal:
+1. **Collect** odds for each market (1X2, exact-score, outrights, and each
+   special/award market a bonus question needs) from several books.
+   Odds-comparison aggregators (Oddschecker, Oddsportal / Betexplorer) show
+   many books at once; supplement with named books. Book list + weighting
+   recipe in [`reference/odds-and-bonus.md`](reference/odds-and-bonus.md).
+2. **De-vig each book** independently (remove its margin) → per-book implied
+   probabilities. Raw odds sum to >100%; never treat them as probabilities.
+3. **Aggregate with a healthy weighting**, not a blind average: weight
+   **sharp, low-margin books higher** (Pinnacle, Betfair Exchange),
+   recreational books lower; weighted mean or trimmed median so one outlier
+   book cannot swing the base. Result = the **consensus probability** — the
+   calibration base.
+4. **Single-book outlier = flag, not truth** — investigate *why* (priced-in
+   injury? stale line?) before moving off consensus. Cross-portal agreement is
+   signal; one portal disagreeing is a prompt to check, not to follow.
+Secondary (only when it adds signal the consensus has not absorbed): confirmed
+lineups, injuries, suspensions, manager change, recent form, home advantage,
+head-to-head, rest/travel, weather, model forecasts (Opta), Elo/SPI ratings.
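Illustrative sketch (not part of the packaged file): the de-vig and weighting steps of step 2 could look roughly like the code below. It assumes proportional normalisation as the de-vig method and a plain weighted mean as the aggregate; the skill's actual recipe lives in `reference/odds-and-bonus.md`, which this diff excerpt does not show. The book names, odds, and weights are hypothetical.

```python
from typing import Dict, List


def devig(decimal_odds: List[float]) -> List[float]:
    """Proportional de-vig: scale implied probabilities so they sum to 1.

    This is one common method, assumed here for illustration.
    """
    implied = [1.0 / o for o in decimal_odds]
    total = sum(implied)  # greater than 1.0 when the book carries a margin
    return [p / total for p in implied]


def consensus(books: Dict[str, List[float]],
              weights: Dict[str, float]) -> List[float]:
    """Weighted mean of per-book de-vigged probabilities."""
    n_outcomes = len(next(iter(books.values())))
    acc = [0.0] * n_outcomes
    weight_sum = 0.0
    for name, odds in books.items():
        w = weights.get(name, 1.0)  # unknown books get a neutral weight
        for i, p in enumerate(devig(odds)):  # de-vig each book on its own
            acc[i] += w * p
        weight_sum += w
    return [a / weight_sum for a in acc]


# Hypothetical 1X2 decimal odds (home, draw, away) from three books.
books = {
    "sharp_book": [2.10, 3.40, 3.60],
    "rec_book_a": [2.00, 3.30, 3.50],
    "rec_book_b": [2.05, 3.25, 3.70],
}
# Hypothetical weights: the low-margin book counts double.
weights = {"sharp_book": 2.0, "rec_book_a": 1.0, "rec_book_b": 1.0}

probs = consensus(books, weights)
print([round(p, 3) for p in probs])
```

A trimmed median, which the text also allows, would replace the weighted mean when a single outlier book should not move the base at all.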
+### 3. Per-match probabilities (sport block)
+Compute, per match, the outcome distribution and the most plausible exact
+results. Pick the block for the event's sport:
+**Football / soccer**
+- Model goals as **Poisson** per side from each team's expected goals;
+  draws are real (~22–28% baseline) — people under-tip them.
+- Outcome split: home-win / draw / away-win; then the exact-score grid.
+- Common EV-strong exact results: 1:0, 2:1, 1:1, 2:0.
+**Basketball**
+- **No draws.** Model the points margin as roughly **Gaussian** around the
+  market spread; pair with the moneyline for win probability and the
+  total (over/under) for the score level.
+- Tendency = sign of (margin); "exact result" rules are rare — read step 1.
+**Generic fallback (other sports)**
+- Derive the outcome split straight from de-vigged moneyline odds; estimate
+  a plausible score from the market total. State the model used.
+Cross-check the model against the consensus; on a large divergence, re-check
+the data and explain the cause before trusting it.
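Illustrative sketch (not part of the packaged file): the basketball block's "roughly Gaussian margin around the market spread" can be turned into a win probability as below. The standard deviation `sigma` is a hypothetical input that would need to be estimated for the league in question; the source does not give a value.

```python
import math


def win_probability(expected_margin: float, sigma: float) -> float:
    """P(margin > 0) when margin ~ Normal(expected_margin, sigma**2).

    expected_margin: points the team is expected to win by (market spread,
    positive for the favourite). sigma: assumed spread of the final margin.
    """
    z = expected_margin / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Hypothetical inputs: favourite by 5 points, margin std-dev of 12 points.
p_fav = win_probability(5.0, 12.0)
print(round(p_fav, 3), round(1.0 - p_fav, 3))
```

Because there are no draws, the underdog's probability is simply the complement, which makes it easy to cross-check against the de-vigged moneyline.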
+### 4. Convert to the EV-maximizing tip
+Map probabilities to the tip with the **highest expected points under the
+step-1 rules** — not the prettiest match.
+#### 4a. The EV-max scoreline is computed, never eyeballed
+Don't hand-pick a scoreline. Run the executed grid optimiser — builds the full
+Poisson score grid, returns the EV-max tip under the step-1 point tiers:
+```bash
+python3 scripts/prediction-pool/score_ev.py --lh <home-xg> --la <away-xg> \
+    --tendency <t> --diff <d> --exact <e>          # one match
+python3 scripts/prediction-pool/score_ev.py matches.json \
+    --tendency <t> --diff <d> --exact <e>          # batch, prints a ranked table
+```
+Two facts the grid makes unavoidable, intuition gets wrong:
+- **High scorelines almost never EV-max.** Under partial points a moderate
+  favourite peaks at **1:0 / 2:0 / 2:1**; **1:0 wins surprisingly often**, top
+  of the surface is *flat* (1:0 vs 2:1 vs 2:0 within hundredths). 3:2 / 4:1 /
+  1:4 never optimal — such a tip means the grid wasn't run.
+- **Draws under-tipped.** A correct draw banks the goal-difference tier on
+  every draw scoreline, so in a close match (xG within ~0.4) a 1:1 can
+  out-score a 1:0 — and for low-scoring even games (λ ≲ 1.0/side) a 0:0 is the
+  EV-max. Let the grid decide; the eye tips too few draws.
+- **Standard fixed-point scoring + goal "place well"** → tip the grid's EV-max
+  per match. **No contrarian** — only your tip scores, tipping "different"
+  burns EV.
+- **Quote / rarity scoring** → weigh rarer-but-plausible results against payout;
+  take rarity when `payout × probability` wins (raise `--exact` or post-process
+  the ranked table by the multiplier).
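Illustrative sketch (not part of the packaged file, and not the implementation of `score_ev.py`, whose source this excerpt does not show): the grid idea in step 4a can be reproduced in a few lines. It assumes independent Poisson goals per side, a grid truncated at `max_goals`, and hypothetical point tiers of 2 / 3 / 4 for tendency / goal difference / exact result. Following the text above, a draw tip earns the goal-difference tier on every draw scoreline.

```python
import math
from typing import Dict, List, Tuple

Score = Tuple[int, int]


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def points(tip: Score, result: Score,
           tendency: float, diff: float, exact: float) -> float:
    """Points for one tip against one result; highest matching tier wins."""
    if tip == result:
        return exact
    if tip[0] - tip[1] == result[0] - result[1]:
        return diff  # same goal difference, including all draws
    if sign(tip[0] - tip[1]) == sign(result[0] - result[1]):
        return tendency
    return 0.0


def rank_tips(lam_home: float, lam_away: float,
              tendency: float = 2, diff: float = 3, exact: float = 4,
              max_goals: int = 8) -> List[Tuple[Score, float]]:
    """Expected points of every scoreline tip, best first."""
    goals = range(max_goals + 1)
    # Probability of each result under independent Poisson scoring.
    grid: Dict[Score, float] = {
        (h, a): poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
        for h in goals for a in goals
    }
    ev = {
        tip: sum(p * points(tip, res, tendency, diff, exact)
                 for res, p in grid.items())
        for tip in grid
    }
    return sorted(ev.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expected goals for a moderate home favourite.
ranked = rank_tips(1.6, 1.0)
for tip, value in ranked[:5]:
    print(f"{tip[0]}:{tip[1]}  EV={value:.3f}")
```

For quote / rarity scoring, the same ranking can be post-processed by multiplying each tip's exact-result term by its payout multiplier before sorting.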
+#### 4b. Large pool, goal "win it" — measure P(finish 1st), don't guess
+Goal = **win** a large pool → target flips from E(points) to **P(finish ahead
+of the field)**; pure EV-max converges with the crowd, can't open a gap.
+Measure it with the executed field simulator, not a "rough Kelly" hand-wave:
+```bash
+python3 scripts/prediction-pool/pool_winsim.py pool.json --runs 4000 --max-flips 4
+```
+Models the field as softmax-EV tippers, reports `P(win)` for EV-max-everywhere,
+then greedily reports **which few tips to flip** off EV-max (EV cost + P(win)
+gain each). Read it as the field threshold, empirically:
+- Pool **N < 20** → sim shows flips barely move P(win); maximize EV, ignore the
+  field.
+- **20 ≤ N < 100 and in the prize positions** → maximize EV.
+- **N ≥ 100, or outside the top ~20%** → take the sim's suggested flips: a
+  handful of higher-variance scorelines on high-consensus matches lift P(win)
+  most per unit EV given up. Flip only what the sim says pays — variance you
+  don't need is wasted EV.
+Respect all strategy limits from step 1 (max identical tips, etc.).
+### 5. Tournament, bonus & special questions — answer every one (no hallucination)
+Walk the **step-1 checklist** and answer **each** entry. Pick the method by
+question type — full taxonomy + per-type method in
+[`reference/odds-and-bonus.md`](reference/odds-and-bonus.md):
+- **Tournament structure** (group winners, KO rounds, finalists, champion):
+  real **outright market odds** ("to win group", "to reach final", "outright
+  winner") aggregated per step 2, **or** the executed Poisson simulator:
+  ```bash
+  python3 scripts/prediction-pool/poisson_sim.py <teams-xg.json> --runs 20000
+  ```
+  It plays the bracket from per-team expected goals and prints empirical
+  advancement / title probabilities. **Run it — never report simulated
+  numbers you did not actually compute.**
+- **Award / player markets** (top scorer, most assists, "which team supplies
+  the top scorer", golden boot, most cards): use the matching **special
+  market** — e.g. aggregate per-player "top goalscorer" odds **by team** to
+  answer "which team has the top scorer". No clean market → derive from a
+  stated model (squad strength × games-expected) and **label it a model
+  estimate**, not a market number.
+- **Binary / over-under specials** (red card yes/no, over/under total
+  goals/cards): de-vig the consensus probability for the line, pick the EV-max
+  side under the question's point weight.
+Optimize every answer on the same expected-points basis as the scores. Re-run
+as late as each question's deadline allows: re-check confirmed lineups,
+injuries, suspensions, odds movement, then adjust. The per-question deadline is
+the only hard constraint.
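Illustrative sketch (not part of the packaged file): answering "which team supplies the top scorer" by aggregating a per-player market by team, as step 5 describes, might look like this. Player names, teams, and odds are hypothetical, and normalising over the listed players assumes the market list is close to exhaustive; unlisted players would otherwise inflate every listed probability.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def team_top_scorer_probs(
    player_odds: List[Tuple[str, str, float]],
) -> Dict[str, float]:
    """Sum de-vigged per-player top-scorer probabilities by team.

    player_odds: (player, team, decimal_odds) rows from one consensus market.
    """
    implied = {(player, team): 1.0 / odds
               for player, team, odds in player_odds}
    total = sum(implied.values())  # above 1.0 because of the book margin
    by_team: Dict[str, float] = defaultdict(float)
    for (_player, team), p in implied.items():
        by_team[team] += p / total  # proportional de-vig, then aggregate
    return dict(by_team)


# Hypothetical market rows.
market = [
    ("Player A", "Team X", 5.0),
    ("Player B", "Team X", 9.0),
    ("Player C", "Team Y", 6.0),
    ("Player D", "Team Z", 11.0),
]
team_probs = team_top_scorer_probs(market)
best_team = max(team_probs, key=team_probs.get)
print(best_team, {t: round(p, 3) for t, p in team_probs.items()})
```

A team with two mid-priced candidates can beat the team of the single favourite, which is why the per-player odds are summed rather than compared one by one.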
+## Output format
+1. **Approval table** — one row per match:
+   ```
+   Match | Tip | Prob / EV | Risk (low/med/high) | 1-line reason | Books used
+   ```
+   `Books used` names the consensus base (e.g. "consensus of 7 books, sharp-weighted").
+2. **Bonus & special answers** — one row per open question from the step-1
+   checklist, **every entry answered** (none blank):
+   ```
+   Question | Answer | Prob / EV | Risk | 1-line reason | Source (market / model)
+   ```
+3. **Group standings and the full bracket** where the event has them.
+4. **Self-check note** — (a) tips reconcile with
+   [`reference/ev-fixtures.md`](reference/ev-fixtures.md) (known rules + odds →
+   known-good EV tip); (b) bonus table has the **same number of rows as the
+   step-1 checklist** — a shorter table means a question was dropped. If your
+   method disagrees with a fixture, your method is wrong — find the error
+   (usually a forgotten partial-points term, un-de-vigged odds, or following
+   one book instead of the consensus), don't ship the tip.
+Handed back to [`/prediction-pool`](../../commands/prediction-pool.md) for the approval
+gate — the skill never enters or submits anything.
+## Gotcha
+- **Answering only the scores.** Bonus / award questions carry real points;
+  leaving them blank because they are "not a scoreline" forfeits them. The
+  step-1 checklist exists so every question is answered.
+- **Following one portal.** A single book can be stale or shaded; build the
+  base from a sharp-weighted consensus across several; an outlier is a flag to
+  investigate, not a number to copy.
+- **Tipping the modal result, not the EV-maximal one.** The single most likely
+  scoreline rarely maximizes partial points — run `score_ev.py` across the
+  result grid, don't eyeball the favourite.
+- **Hand-picking a high scoreline.** 3:2 / 4:1 / 1:4 never EV-max under partial
+  points — moderate favourites peak at 1:0 / 2:0 / 2:1. A high tip = grid
+  skipped; run `score_ev.py`.
+- **Under-tipping draws.** A correct draw banks the goal-difference tier on
+  every draw scoreline, so a close match can want 1:1 (or 0:0). Let the grid
+  decide; the eye tips too few draws.
+- **"Rough Kelly" variance for a large pool.** Don't guess how far to deviate;
+  run `pool_winsim.py`, which returns the exact flips that raise P(finish 1st)
+  most per unit of EV given up.
+- **Forgetting to de-vig.** Raw bookmaker odds sum to >100%; treating them as
+  probabilities inflates the favourite. Remove the margin **per book** before
+  aggregating.
+- **Contrarian under fixed points.** Deviating "to stand out" only helps under
+  quote/rarity rules or a win-a-large-pool goal — otherwise it burns EV.
+- **Claimed-but-unrun simulation.** "I ran 10,000 tournaments" without
+  executing `poisson_sim.py` is hallucinated — run the code or use outright odds.
+## Do NOT
+- Leave any open pool question (bonus / award / special) unanswered.
+- Build the base from a single bookmaker, or skip de-vigging before aggregating.
+- Tip the most likely result instead of the EV-maximal one.
+- Hand-pick a scoreline instead of running `score_ev.py`. Never emit a
+  3:2 / 4:1 / 1:4 tip; these are never EV-max under partial points.
+- Go contrarian under standard fixed-point scoring with a "place well" goal.
+- Guess large-pool variance ("rough Kelly") instead of running `pool_winsim.py`.
+- Report Monte-Carlo numbers without running `poisson_sim.py` / `pool_winsim.py`.
+- Treat raw odds as probabilities without removing the vig.
+- Give betting or financial advice — this optimizes a game; the human submits.
+## See also
+- [`/prediction-pool`](../../commands/prediction-pool.md) — the orchestrator (event,
+  persistence, Playwright entry, gates).
+- [`reference/odds-and-bonus.md`](reference/odds-and-bonus.md) — major-book list
+  + sharpness-weighted consensus recipe, and the bonus / award / special
+  question taxonomy with a per-type method.
+- [`reference/ev-fixtures.md`](reference/ev-fixtures.md) — known-good
+  rules+odds → EV examples.
+- [`scripts/prediction-pool/score_ev.py`](../../../../scripts/prediction-pool/score_ev.py) —
+  executed exact-score EV optimiser (step 4a; λ + rule → EV-max scoreline).
+- [`scripts/prediction-pool/pool_winsim.py`](../../../../scripts/prediction-pool/pool_winsim.py) —
+  executed field model + P(finish 1st) simulator and flip-finder (step 4b).
+- [`scripts/prediction-pool/poisson_sim.py`](../../../../scripts/prediction-pool/poisson_sim.py) —
+  the executed tournament simulator (step 5).

package/.agent-src/skills/prediction-pool-optimizer/evals/triggers.json ADDED

@@ -0,0 +1,20 @@
+{
+  "skill": "prediction-pool-optimizer",
+  "description": "9 should-trigger + 5 should-not-trigger queries. Should-trigger covers DE + EN phrasings and the core intent (pool tips, kicktipp, expected-points optimization across sports, plus answering the bonus / award questions and using bookmaker-consensus odds). Should-not-trigger covers near-miss neighbours: regulated financial advice (finance pack), plain match-result prediction with no pool, generic web research, AI video, and real-money sportsbook betting (out of scope / refuse).",
+  "queries": [
+    {"q": "optimize my kicktipp tips for the football WM 2026", "trigger": true},
+    {"q": "fill my company Tippspiel for the basketball world cup", "trigger": true},
+    {"q": "welche Tipps maximieren meine Punkte im kicktipp-Tippspiel?", "trigger": true},
+    {"q": "best picks for our office prediction pool given the scoring rules", "trigger": true},
+    {"q": "maximiere meine erwarteten Punkte im Tippspiel, nicht nur wer gewinnt", "trigger": true},
+    {"q": "predict our office kicktipp pool for the WM", "trigger": true},
+    {"q": "mach mein Tippspiel für die WM", "trigger": true},
+    {"q": "beantworte auch alle Bonusfragen im kicktipp, z.B. welche Mannschaft den Torschützenkönig stellt", "trigger": true},
+    {"q": "use the odds from the big betting sites to optimize my pool picks", "trigger": true},
+    {"q": "should we invest in this startup based on a DCF?", "trigger": false, "note": "regulated financial valuation → dcf-modeling / finance pack"},
+    {"q": "who will win tonight's match?", "trigger": false, "note": "plain result prediction, no pool / no scoring rules to optimize"},
+    {"q": "research the best running shoes for me", "trigger": false, "note": "generic web research → research / deep-research"},
+    {"q": "make a hype video for the world cup final", "trigger": false, "note": "AI video pipeline → /video"},
+    {"q": "place a €50 bet on the favourite at my bookmaker", "trigger": false, "note": "real-money sportsbook wagering — out of scope, not what this fun pool tool does"}
+  ]
+}

package/.agent-src/skills/prediction-pool-optimizer/reference/ev-fixtures.md ADDED

@@ -0,0 +1,175 @@
+# EV fixtures — known-good rules + odds → tip
+Sanity-check fixtures for `prediction-pool-optimizer` Step "Self-check". Each
+fixture states a scoring rule, the (de-vigged) market probabilities, and
+the expected-points-maximizing tip. If your method disagrees with a
+fixture, your method is wrong — find the error before shipping a tip.
+These are illustrative, not exhaustive. Add fixtures for any pool rule
+shape you encounter so future runs catch the same class of drift.
+---
+## Fixture 1 — standard fixed points, goal "place well"
+**Rule:** exact result = 4, goal-difference = 3, tendency = 2, else 0.
+No quote rule. No strategy limit. Goal: place well.
+**Match (football):** Poisson on market xG ≈ 1.7 : 0.8.
+**Script-verified** (`score_ev.py --lh 1.7 --la 0.8 --exact 4 --diff 3 --tendency 2`):
+```
+EV-max tip : 1:0  (EV 1.574)
+  1:0  1.574  <- EV-max
+  2:1  1.530
+  2:0  1.477
+```
+**Reasoning:** the top of the EV surface is **flat**: 1:0, 2:1 and 2:0 all bank
+the tendency tier (2) or the goal-difference tier (3) on many neighbouring
+scorelines, so they land within about 0.1 EV of each other. The grid puts
+**1:0 narrowly first**; eyeballing a "typical" favourite score such as 2:1
+lands a near-tie, not the optimum. Run the grid; don't assert the favourite's
+"obvious" score.
+**Known-good tip:** **1:0 home** (2:1 essentially tied; with the real de-vigged
+λ either can lead — the grid decides). (Risk: low.) **Not** contrarian — under
+fixed points only your own tip scores, so deviating costs EV.
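The grid argument above can be sketched in a few lines. This is not the package's `score_ev.py`, whose source is not shown in this diff; it is a minimal illustration that assumes independent Poisson goal counts and tiered scoring where only the best matching tier counts (exact, else same goal difference, else same tendency). The function names `poisson_pmf`, `tier_points`, `expected_points` and `ev_grid` are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def tier_points(tip, result, exact, diff, tendency):
    """Assumed rule: only the best matching tier scores."""
    th, ta = tip
    rh, ra = result
    if (th, ta) == (rh, ra):
        return exact
    if th - ta == rh - ra:
        return diff  # for a tipped draw this covers every draw scoreline
    if sign(th - ta) == sign(rh - ra):
        return tendency
    return 0


def expected_points(tip, lam_home, lam_away, exact, diff, tendency, max_goals=12):
    """Sum points x probability over a truncated result grid."""
    total = 0.0
    for rh in range(max_goals + 1):
        for ra in range(max_goals + 1):
            p = poisson_pmf(rh, lam_home) * poisson_pmf(ra, lam_away)
            total += p * tier_points(tip, (rh, ra), exact, diff, tendency)
    return total


def ev_grid(lam_home, lam_away, exact, diff, tendency, max_tip=5):
    """Rank every candidate tip up to max_tip:max_tip by expected points."""
    tips = [(h, a) for h in range(max_tip + 1) for a in range(max_tip + 1)]
    scored = [(expected_points(t, lam_home, lam_away, exact, diff, tendency), t)
              for t in tips]
    return sorted(scored, reverse=True)


# Fixture 1 inputs: lambda 1.7 : 0.8, exact 4 / diff 3 / tendency 2.
ranking = ev_grid(1.7, 0.8, exact=4, diff=3, tendency=2)
for ev, (h, a) in ranking[:3]:
    print(f"{h}:{a}  {ev:.3f}")
```

Under these assumptions the top three rows match the script-verified listing in this fixture (1:0, then 2:1, then 2:0), which suggests the tiering assumption is the one the fixture uses.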
+---
+## Fixture 2 — quote / rarity scoring
+**Rule:** points = base × rarity multiplier (rarer correct tips score
+more); tendency still banks a small base.
+**Match (football):** same probabilities as Fixture 1.
+**Reasoning:** the rarity multiplier can make a plausible-but-uncommon
+exact result (e.g. 3:1, 2:2) outscore a commonly tipped one such as 2:1 when
+`payout(result) × P(result)` is higher. Compute EV per candidate including
+the multiplier; take the max.
+**Known-good tip:** the result with the highest `multiplier × probability`,
+**not** the highest probability — typically a step rarer than 2:1
+(e.g. 3:1 or 2:2 depending on the multiplier curve). (Risk: medium.)
+---
+## Fixture 3 — large pool, goal "win it"
+**Rule:** standard fixed points. Pool N = 400. You are outside the top 20%.
+**Match (football):** a near-coin-flip favourite, Home 52% / Draw 26% /
+Away 22%.
+**Reasoning:** with N ≥ 100 and you behind, pure EV play converges with the
+field and cannot create the gap; the target is **P(finish 1st)**, not E(points).
+Don't guess the variance: run `pool_winsim.py` with the pool's `N` and your
+`my_lead`. It shows P(win) collapsing under EV-max-everywhere and returns the
+**specific flips** (higher-variance scorelines on high-consensus matches) that
+raise P(win) most per unit of EV given up.
+**Known-good tip:** EV-max on the safe matches; the **simulator's suggested
+flips** on the 2–4 matches it names, to manufacture upside. (Risk: high,
+intentionally.) Verify that the sim shows a P(win) gain; if the flips do not
+move it (small N), don't add variance you don't need.
+---
+## Fixture 4 — basketball, no draws
+**Rule:** correct winner = 3, correct margin bucket = +2.
+**Match (basketball):** market spread Home −6.5, moneyline Home 78%.
+Margin modelled Gaussian, mean ≈ 6.5, sd ≈ 11.
+**Reasoning:** no draw term exists; optimize winner first (Home banks 3 at
+78%), then the margin bucket from the Gaussian (most mass straddles the
+spread). Tip the winner plus the modal margin bucket.
+**Known-good tip:** **Home win, margin ~5–9.** (Risk: low on winner.)
+---
+## Fixture 5 — multi-book consensus (de-vig per book, sharp-weighted)
+**Rule:** any — checks the **odds base**, not the EV map.
+**Market (football, 1X2):** two books.
+- Book S (sharp, weight 3): 1.80 / 3.60 / 4.50 → de-vig 0.526 / 0.263 / 0.211.
+- Book R (recreational, weight 1): 1.75 / 3.50 / 4.20 → de-vig 0.522 / 0.261 / 0.217.
+**Reasoning:** de-vig **each book** first (raw `1/o` sums to >1; normalise),
+then sharp-weighted mean per outcome and renormalise. Aggregating raw odds, or
+using one book, is wrong.
+**Known-good base:** **Home 0.525 / Draw 0.263 / Away 0.212.** A run that fed
+the EV grid one book's raw odds has the wrong base — fix it before the tip.
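The recipe in this fixture (de-vig each book, then take a sharp-weighted mean and renormalise) can be sketched as follows. It uses the proportional normalisation the fixture describes; the helper names `devig` and `consensus` are hypothetical and not taken from the package.

```python
def devig(odds):
    """Proportional de-vig: implied 1/o per outcome, rescaled to sum to 1."""
    implied = [1.0 / o for o in odds]
    total = sum(implied)  # exceeds 1 whenever the book carries a margin
    return [p / total for p in implied]


def consensus(books):
    """books: list of (decimal_odds, weight). De-vig per book, then weight."""
    fair = [(devig(odds), weight) for odds, weight in books]
    n_outcomes = len(fair[0][0])
    mixed = [sum(p[i] * w for p, w in fair) for i in range(n_outcomes)]
    total = sum(mixed)  # equals the weight sum; dividing renormalises to 1
    return [m / total for m in mixed]


# Fixture 5 inputs: Book S (weight 3) and Book R (weight 1), 1X2 decimal odds.
books = [((1.80, 3.60, 4.50), 3), ((1.75, 3.50, 4.20), 1)]
base = consensus(books)
print([round(p, 3) for p in base])
```

Averaging the raw odds first and de-vigging afterwards would give a different base, which is the error this fixture is meant to catch.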
+---
+## Fixture 6 — "team of the top scorer" (aggregate player market by team)
+**Rule:** bonus question = 6 points: "which team supplies the tournament top
+scorer?"
+**Market (top-goalscorer outright, de-vigged player probabilities):**
+- Team A: A1 14%, A2 5% → team A total **19%**.
+- Team B: B1 16% → team B total **16%**.
+- Team C: C1 9%, C2 4% → team C total **13%**.
+**Reasoning:** the most-likely *player* (B1, 16%) is on team B, but the
+question asks the **team** — sum each squad's players. Team A 19% beats team B
+16%. Answer the asked question, not the adjacent one.
+**Known-good answer:** **Team A.** (Source: market, aggregated by team. Risk:
+medium.) **Not** team B — the modal-player trap.
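The aggregation step is small enough to show directly. A minimal sketch, assuming the de-vigged player probabilities are available as a mapping keyed by (team, player); the name `team_totals` is hypothetical.

```python
from collections import defaultdict

# De-vigged top-scorer probabilities from the fixture, keyed by (team, player).
player_probs = {
    ("A", "A1"): 0.14,
    ("A", "A2"): 0.05,
    ("B", "B1"): 0.16,
    ("C", "C1"): 0.09,
    ("C", "C2"): 0.04,
}


def team_totals(probs):
    """Sum each squad's player probabilities into a per-team probability."""
    totals = defaultdict(float)
    for (team, _player), p in probs.items():
        totals[team] += p
    return dict(totals)


totals = team_totals(player_probs)
best_team = max(totals, key=totals.get)
best_player = max(player_probs, key=player_probs.get)
print(best_team, round(totals[best_team], 2))  # the team answer
print(best_player)                             # the modal player, a different team
```

Keeping the player-level and team-level maxima side by side makes the modal-player trap visible whenever the two disagree.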
+---
+## Fixture 7 — high-scoreline trap (the "EV-optimized" model that wasn't)
+**Rule:** kicktipp 2 / 3 / 5 — tendency = 2, goal-difference = 3, exact = 5.
+**Matches (script-verified, `score_ev.py … --tendency 2 --diff 3 --exact 5`):**
+| Match (λ) | EV-max | a high tip's EV | verdict |
+|---|---|---|---|
+| Senegal–Iraq (2.0:0.7) | **1:0** (1.881) | 4:1 ≈ 1.55 | high tip leaks ~0.33 |
+| Qatar–Switzerland (0.6:2.1) | **0:1** (1.981) | 1:4 ≈ 1.65 | tipping the underdog's goals = costliest move on the board |
+| Spain–CapeVerde (2.3:0.6) | **2:0** (2.033) | 3:1 ≈ 1.88 | only at λ ≳ 2.3 does 2:0 edge past 1:0; never higher |
+**Reasoning:** under partial points the value sits in the tendency and
+goal-difference tiers, not the exact high score. **1:0 is the optimum
+astonishingly often** (even for clear favourites at λ ≈ 2.0); 2:0 takes over
+only near λ ≈ 2.3–2.4; above that, never. **3:2 / 4:1 / 4:2 / 1:4 are never
+EV-max.** Adding goals — especially the underdog's — only shrinks the hit
+probability without protecting the diff/tendency points.
+**Known-good behaviour:** any 3:2 / 4:x / x:4 tip in the run → the grid wasn't
+run; `score_ev.py` is the gate. (Risk: low; correctness fixture, not strategy.)
+---
+## Fixture 8 — draws are under-tipped
+**Rule:** kicktipp 2 / 3 / 5 (as Fixture 7).
+**Matches (script-verified, `score_ev.py … --tendency 2 --diff 3 --exact 5`):**
+```
+λ 1.0:1.0  ->  EV-max 0:0 (1.196), 1:1 tied (1.196)   # a draw IS the optimum
+λ 0.9:0.9  ->  EV-max 0:0 (1.317), 1:1 second
+λ 1.2:1.2  ->  EV-max 1:0 (1.150), draw third (1.091)  # 1-goal win edges it
+```
+**Reasoning:** people tip too few draws. A correct draw banks the
+goal-difference tier (3) on *every* draw scoreline, so in a **low-scoring even
+match (λ ≲ 1.0/side) the draw — usually 0:0 — is the EV-max**, tied with 1:1.
+As λ rises past ~1.1 a one-goal win edges ahead, but the draw stays in the top
+tips. Grid surfaces this; intuition suppresses it.
+**Known-good behaviour:** a tip set with **near-zero draws across many
+low-scoring even matches** is a red flag — re-run `score_ev.py`, let the grid
+decide, don't default every close game to 1:0.
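The draw effect can be checked with a short grid calculation. This is a sketch rather than the package's `score_ev.py`: it assumes independent Poisson goals and the tiered 2 / 3 / 5 rule in which a tipped draw earns the goal-difference tier on every draw result. The helper names `pmf` and `ev` are hypothetical.

```python
from math import exp, factorial


def pmf(k, lam):
    """Poisson probability of exactly k goals at mean lam."""
    return exp(-lam) * lam**k / factorial(k)


def sign(x):
    return (x > 0) - (x < 0)


def ev(tip, lam_h, lam_a, tendency=2, diff=3, exact=5, max_goals=12):
    """Expected points of one tip; only the best matching tier scores."""
    th, ta = tip
    total = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            if (h, a) == (th, ta):
                pts = exact
            elif h - a == th - ta:
                pts = diff  # every draw result when a draw was tipped
            elif sign(h - a) == sign(th - ta):
                pts = tendency
            else:
                pts = 0
            total += pts * pmf(h, lam_h) * pmf(a, lam_a)
    return total


# Compare the two draw tips with a one-goal home win for even matches.
for lam in (0.9, 1.0, 1.2):
    row = {f"{h}:{a}": round(ev((h, a), lam, lam), 3)
           for h, a in [(0, 0), (1, 1), (1, 0)]}
    print(lam, row)
```

Sweeping `lam` in finer steps between 1.0 and 1.2 would locate the point where the one-goal win overtakes the draw for a given rule.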