npm - feed-the-machine - Versions diffs - 1.6.0 → 1.7.0 - Mend

feed-the-machine 1.6.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (269)

package/LICENSE +21 -21
package/README.md +170 -170
package/bin/brain.py +1340 -0
package/bin/convert_claude_skills_to_codex.py +490 -0
package/bin/generate-manifest.mjs +463 -463
package/bin/harden_codex_skills.py +141 -0
package/bin/install.mjs +491 -491
package/bin/migrate-eng-buddy-data.py +875 -0
package/bin/playbook_engine/__init__.py +1 -0
package/bin/playbook_engine/conftest.py +8 -0
package/bin/playbook_engine/extractor.py +33 -0
package/bin/playbook_engine/manager.py +102 -0
package/bin/playbook_engine/models.py +84 -0
package/bin/playbook_engine/registry.py +35 -0
package/bin/playbook_engine/test_extractor.py +72 -0
package/bin/playbook_engine/test_integration.py +129 -0
package/bin/playbook_engine/test_manager.py +85 -0
package/bin/playbook_engine/test_models.py +166 -0
package/bin/playbook_engine/test_registry.py +67 -0
package/bin/playbook_engine/test_tracer.py +86 -0
package/bin/playbook_engine/tracer.py +93 -0
package/bin/tasks_db.py +456 -0
package/docs/HOOKS.md +243 -243
package/docs/INBOX.md +233 -233
package/ftm/SKILL.md +125 -122
package/ftm-audit/SKILL.md +623 -623
package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -91
package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -66
package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -135
package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -69
package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -96
package/ftm-audit/scripts/run-knip.sh +23 -23
package/ftm-audit.yml +2 -2
package/ftm-brainstorm/SKILL.md +1003 -498
package/ftm-brainstorm/evals/evals.json +180 -100
package/ftm-brainstorm/evals/promptfoo.yaml +109 -109
package/ftm-brainstorm/references/agent-prompts.md +552 -224
package/ftm-brainstorm/references/plan-template.md +209 -121
package/ftm-brainstorm.yml +2 -2
package/ftm-browse/SKILL.md +454 -454
package/ftm-browse/daemon/browser-manager.ts +206 -206
package/ftm-browse/daemon/bun.lock +30 -30
package/ftm-browse/daemon/cli.ts +347 -347
package/ftm-browse/daemon/commands.ts +410 -410
package/ftm-browse/daemon/main.ts +357 -357
package/ftm-browse/daemon/package.json +17 -17
package/ftm-browse/daemon/server.ts +189 -189
package/ftm-browse/daemon/snapshot.ts +519 -519
package/ftm-browse/daemon/tsconfig.json +22 -22
package/ftm-browse.yml +4 -4
package/ftm-capture/SKILL.md +370 -370
package/ftm-capture.yml +4 -4
package/ftm-codex-gate/SKILL.md +361 -361
package/ftm-codex-gate.yml +2 -2
package/ftm-config/SKILL.md +422 -345
package/ftm-config.default.yml +125 -82
package/ftm-config.yml +44 -2
package/ftm-council/SKILL.md +416 -416
package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -60
package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -58
package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -58
package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -57
package/ftm-council/references/protocols/PREREQUISITES.md +47 -47
package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -46
package/ftm-council.yml +2 -2
package/ftm-dashboard/SKILL.md +163 -163
package/ftm-dashboard.yml +4 -4
package/ftm-debug/SKILL.md +1037 -1037
package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -58
package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -46
package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -279
package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -436
package/ftm-debug/references/protocols/BLACKBOARD.md +86 -86
package/ftm-debug/references/protocols/EDGE-CASES.md +103 -103
package/ftm-debug.yml +2 -2
package/ftm-diagram/SKILL.md +277 -277
package/ftm-diagram.yml +2 -2
package/ftm-executor/SKILL.md +777 -777
package/ftm-executor/references/STYLE-TEMPLATE.md +73 -73
package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -62
package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -34
package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -38
package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +72 -72
package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -66
package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -73
package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -36
package/ftm-executor/references/protocols/MODEL-PROFILE.md +59 -59
package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -66
package/ftm-executor/runtime/ftm-runtime.mjs +252 -252
package/ftm-executor/runtime/package.json +8 -8
package/ftm-executor.yml +2 -2
package/ftm-git/SKILL.md +441 -441
package/ftm-git/evals/evals.json +26 -26
package/ftm-git/evals/promptfoo.yaml +75 -75
package/ftm-git/hooks/post-commit-experience.sh +92 -92
package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -104
package/ftm-git/references/protocols/REMEDIATION.md +139 -139
package/ftm-git/scripts/pre-commit-secrets.sh +110 -110
package/ftm-git.yml +2 -2
package/ftm-inbox/backend/__pycache__/main.cpython-314.pyc +0 -0
package/ftm-inbox/backend/adapters/_retry.py +64 -64
package/ftm-inbox/backend/adapters/base.py +230 -230
package/ftm-inbox/backend/adapters/freshservice.py +104 -104
package/ftm-inbox/backend/adapters/gmail.py +125 -125
package/ftm-inbox/backend/adapters/jira.py +136 -136
package/ftm-inbox/backend/adapters/registry.py +192 -192
package/ftm-inbox/backend/adapters/slack.py +110 -110
package/ftm-inbox/backend/db/connection.py +54 -54
package/ftm-inbox/backend/db/schema.py +78 -78
package/ftm-inbox/backend/executor/__init__.py +7 -7
package/ftm-inbox/backend/executor/engine.py +149 -149
package/ftm-inbox/backend/executor/step_runner.py +98 -98
package/ftm-inbox/backend/main.py +103 -103
package/ftm-inbox/backend/models/__init__.py +1 -1
package/ftm-inbox/backend/models/unified_task.py +36 -36
package/ftm-inbox/backend/planner/__init__.py +6 -6
package/ftm-inbox/backend/planner/__pycache__/__init__.cpython-314.pyc +0 -0
package/ftm-inbox/backend/planner/__pycache__/generator.cpython-314.pyc +0 -0
package/ftm-inbox/backend/planner/__pycache__/schema.cpython-314.pyc +0 -0
package/ftm-inbox/backend/planner/generator.py +127 -127
package/ftm-inbox/backend/planner/schema.py +34 -34
package/ftm-inbox/backend/requirements.txt +5 -5
package/ftm-inbox/backend/routes/__pycache__/plan.cpython-314.pyc +0 -0
package/ftm-inbox/backend/routes/execute.py +186 -186
package/ftm-inbox/backend/routes/health.py +52 -52
package/ftm-inbox/backend/routes/inbox.py +68 -68
package/ftm-inbox/backend/routes/plan.py +271 -271
package/ftm-inbox/bin/launchagent.mjs +91 -91
package/ftm-inbox/bin/setup.mjs +188 -188
package/ftm-inbox/bin/start.sh +10 -10
package/ftm-inbox/bin/status.sh +17 -17
package/ftm-inbox/bin/stop.sh +8 -8
package/ftm-inbox/config.example.yml +55 -55
package/ftm-inbox/package-lock.json +2898 -2898
package/ftm-inbox/package.json +26 -26
package/ftm-inbox/postcss.config.js +6 -6
package/ftm-inbox/src/app.css +199 -199
package/ftm-inbox/src/app.html +18 -18
package/ftm-inbox/src/lib/api.ts +166 -166
package/ftm-inbox/src/lib/components/ExecutionLog.svelte +81 -81
package/ftm-inbox/src/lib/components/InboxFeed.svelte +143 -143
package/ftm-inbox/src/lib/components/PlanStep.svelte +271 -271
package/ftm-inbox/src/lib/components/PlanView.svelte +206 -206
package/ftm-inbox/src/lib/components/StreamPanel.svelte +99 -99
package/ftm-inbox/src/lib/components/TaskCard.svelte +190 -190
package/ftm-inbox/src/lib/components/ui/EmptyState.svelte +63 -63
package/ftm-inbox/src/lib/components/ui/KawaiiCard.svelte +86 -86
package/ftm-inbox/src/lib/components/ui/PillButton.svelte +106 -106
package/ftm-inbox/src/lib/components/ui/StatusBadge.svelte +67 -67
package/ftm-inbox/src/lib/components/ui/StreamDrawer.svelte +149 -149
package/ftm-inbox/src/lib/components/ui/ThemeToggle.svelte +80 -80
package/ftm-inbox/src/lib/theme.ts +47 -47
package/ftm-inbox/src/routes/+layout.svelte +76 -76
package/ftm-inbox/src/routes/+page.svelte +401 -401
package/ftm-inbox/svelte.config.js +12 -12
package/ftm-inbox/tailwind.config.ts +63 -63
package/ftm-inbox/tsconfig.json +13 -13
package/ftm-inbox/vite.config.ts +6 -6
package/ftm-intent/SKILL.md +241 -241
package/ftm-intent.yml +2 -2
package/ftm-manifest.json +3794 -3794
package/ftm-map/SKILL.md +291 -291
package/ftm-map/scripts/db.py +712 -712
package/ftm-map/scripts/index.py +415 -415
package/ftm-map/scripts/parser.py +224 -224
package/ftm-map/scripts/queries/go-tags.scm +20 -20
package/ftm-map/scripts/queries/javascript-tags.scm +35 -35
package/ftm-map/scripts/queries/python-tags.scm +31 -31
package/ftm-map/scripts/queries/ruby-tags.scm +19 -19
package/ftm-map/scripts/queries/rust-tags.scm +37 -37
package/ftm-map/scripts/queries/typescript-tags.scm +41 -41
package/ftm-map/scripts/query.py +301 -301
package/ftm-map/scripts/ranker.py +377 -377
package/ftm-map/scripts/requirements.txt +5 -5
package/ftm-map/scripts/setup-hooks.sh +27 -27
package/ftm-map/scripts/setup.sh +56 -56
package/ftm-map/scripts/test_db.py +364 -364
package/ftm-map/scripts/test_parser.py +174 -174
package/ftm-map/scripts/test_query.py +183 -183
package/ftm-map/scripts/test_ranker.py +199 -199
package/ftm-map/scripts/views.py +591 -591
package/ftm-map.yml +2 -2
package/ftm-mind/SKILL.md +201 -1943
package/ftm-mind/evals/promptfoo.yaml +142 -142
package/ftm-mind/references/blackboard-protocol.md +110 -0
package/ftm-mind/references/blackboard-schema.md +328 -328
package/ftm-mind/references/complexity-guide.md +110 -110
package/ftm-mind/references/complexity-sizing.md +138 -0
package/ftm-mind/references/decide-act-protocol.md +172 -0
package/ftm-mind/references/direct-execution.md +51 -0
package/ftm-mind/references/environment-discovery.md +77 -0
package/ftm-mind/references/event-registry.md +319 -319
package/ftm-mind/references/mcp-inventory.md +300 -296
package/ftm-mind/references/ops-routing.md +47 -0
package/ftm-mind/references/orient-protocol.md +234 -0
package/ftm-mind/references/personality.md +40 -0
package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -72
package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -32
package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -80
package/ftm-mind/references/reflexion-protocol.md +249 -249
package/ftm-mind/references/routing/SCENARIOS.md +22 -22
package/ftm-mind/references/routing-scenarios.md +35 -35
package/ftm-mind.yml +2 -2
package/ftm-ops.yml +4 -0
package/ftm-pause/SKILL.md +395 -395
package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -186
package/ftm-pause/references/protocols/VALIDATION.md +80 -80
package/ftm-pause.yml +2 -2
package/ftm-researcher/SKILL.md +275 -275
package/ftm-researcher/evals/agent-diversity.yaml +17 -17
package/ftm-researcher/evals/synthesis-quality.yaml +12 -12
package/ftm-researcher/evals/trigger-accuracy.yaml +39 -39
package/ftm-researcher/references/adaptive-search.md +116 -116
package/ftm-researcher/references/agent-prompts.md +193 -193
package/ftm-researcher/references/council-integration.md +193 -193
package/ftm-researcher/references/output-format.md +203 -203
package/ftm-researcher/references/synthesis-pipeline.md +165 -165
package/ftm-researcher/scripts/score_credibility.py +234 -234
package/ftm-researcher/scripts/validate_research.py +92 -92
package/ftm-researcher.yml +2 -2
package/ftm-resume/SKILL.md +518 -518
package/ftm-resume/references/protocols/VALIDATION.md +172 -172
package/ftm-resume.yml +2 -2
package/ftm-retro/SKILL.md +380 -380
package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -89
package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -109
package/ftm-retro.yml +2 -2
package/ftm-routine/SKILL.md +170 -170
package/ftm-routine.yml +4 -4
package/ftm-state/blackboard/capabilities.json +5 -5
package/ftm-state/blackboard/capabilities.schema.json +27 -27
package/ftm-state/blackboard/context.json +37 -23
package/ftm-state/blackboard/experiences/doom-statusline-fix.json +26 -0
package/ftm-state/blackboard/experiences/hackathon-pages-site.json +26 -0
package/ftm-state/blackboard/experiences/hindsight-sso-kickoff.json +42 -0
package/ftm-state/blackboard/experiences/index.json +58 -9
package/ftm-state/blackboard/experiences/learning-ragnarok-api-access.json +23 -0
package/ftm-state/blackboard/experiences/nordlayer-members-auto-assign.json +26 -0
package/ftm-state/blackboard/experiences/saml2aws-stale-session-fix.json +41 -0
package/ftm-state/blackboard/patterns.json +6 -6
package/ftm-state/schemas/context.schema.json +130 -130
package/ftm-state/schemas/experience-index.schema.json +77 -77
package/ftm-state/schemas/experience.schema.json +78 -78
package/ftm-state/schemas/patterns.schema.json +44 -44
package/ftm-upgrade/SKILL.md +194 -194
package/ftm-upgrade/scripts/check-version.sh +76 -76
package/ftm-upgrade/scripts/upgrade.sh +143 -143
package/ftm-upgrade.yml +2 -2
package/ftm-verify.yml +2 -2
package/ftm.yml +2 -2
package/hooks/ftm-auto-log.sh +137 -0
package/hooks/ftm-blackboard-enforcer.sh +93 -93
package/hooks/ftm-discovery-reminder.sh +90 -90
package/hooks/ftm-drafts-gate.sh +61 -61
package/hooks/ftm-event-logger.mjs +107 -107
package/hooks/ftm-install-hooks.sh +240 -0
package/hooks/ftm-learning-capture.sh +117 -0
package/hooks/ftm-map-autodetect.sh +79 -79
package/hooks/ftm-pending-sync-check.sh +22 -22
package/hooks/ftm-plan-gate.sh +92 -92
package/hooks/ftm-post-commit-trigger.sh +57 -57
package/hooks/ftm-post-compaction.sh +138 -0
package/hooks/ftm-pre-compaction.sh +147 -0
package/hooks/ftm-session-end.sh +52 -0
package/hooks/ftm-session-snapshot.sh +213 -0
package/hooks/settings-template.json +81 -81
package/install.sh +363 -363
package/package.json +84 -84
package/uninstall.sh +25 -25

package/ftm-researcher/references/synthesis-pipeline.md CHANGED

@@ -1,165 +1,165 @@
-# Synthesis Pipeline
-5-phase pipeline that takes raw findings from finder agents and produces a structured disagreement map.
----
-## Phase 1: Normalize & Deduplicate
-Input: Raw findings from all finder agents (7 agents x 3-8 findings each = 21-56 findings)
-Steps:
-1. Flatten all findings into a single list
-2. Group by semantic similarity (same claim from different agents)
-3. For each group:
-   - Merge into a single canonical claim
-   - Track which agents found it (agent_count)
-   - Track source type diversity (source_diversity_score = unique source types / total sources)
-   - Flag circular sourcing: if all sources in a group cite the same original source, mark as circular=true
-4. Output: unique_claims[] sorted by agent_count DESC, source_diversity_score DESC
-### Semantic Similarity Heuristics
-Two claims are considered semantically similar when:
-- They make the same factual assertion about the same subject, even with different wording
-- One is a subset of the other (e.g., "X uses Y" vs "X uses Y for Z")
-- They cite the same source for the same conclusion
-Two claims are NOT similar when:
-- They address different aspects of the same topic
-- They reach different conclusions about the same subject
-- One is general and the other is specific with additional qualifying conditions
-When merging, keep the most specific version as the canonical claim.
----
-## Phase 2: Adversarial Review (ftm-council)
-Input: Top claims from Phase 1 (all claims with agent_count >= 2, plus any high-confidence unique claims with confidence > 0.8)
-Council invocation:
-- Send claims as a structured prompt to ftm-council
-- Ask: "Evaluate each claim. For each: Is the evidence sufficient? What would make this wrong? Are there alternative explanations? Rate confidence 0-1."
-- Council runs Claude + Codex + Gemini independently, then reconciles
-Output: claims[] with council_verdict (agreed | contested | insufficient_evidence), provider_disagreements[]
-### FALLBACK (if Codex/Gemini unavailable):
-Spawn 2 standalone agents on the review model:
-**Devil's Advocate:** "Your job is to find reasons each claim is WRONG. Search for counter-evidence, flag single-source claims, identify logical gaps."
-**Edge Case Hunter:** "Your job is to find where each claim BREAKS. Scaling limits, security concerns, accessibility gaps, failure modes under load."
-Both receive all claims and return challenge_findings[]
----
-## Phase 3: Pairwise Rank (for contested claims)
-Input: Claims marked as "contested" by council
-For each pair of conflicting claims:
-- LLM-as-judge prompt: "Given research question Q, Claim A says [X] with evidence [E1]. Claim B says [Y] with evidence [E2]. Which claim is better supported? Why? Consider: source authority, evidence specificity, logical coherence, relevance to the question."
-- Tournament bracket: winners advance, losers are demoted to "minority view"
-Output: ranked_claims[] with rank_position, judge_rationale
-### Ranking Criteria (in priority order)
-1. **Source authority**: Primary sources and peer-reviewed research outweigh blog posts and forum answers
-2. **Evidence specificity**: Concrete data points (benchmarks, case studies with numbers) outweigh general assertions
-3. **Logical coherence**: Claims with clear causal reasoning outweigh correlational arguments
-4. **Relevance to question**: Claims that directly address the research question outweigh tangentially related findings
-5. **Recency**: For fast-moving topics, newer evidence outweighs older evidence (all else equal)
----
-## Phase 4: Reconcile — Disagreement Map
-Input: All processed claims (normalized, council-reviewed, ranked)
-The Reconciler agent produces structured output in 4 tiers:
-### Tier 1: Consensus Claims
-3+ agents agree, council agreed, multiple source types.
-- Highest confidence. Present as established findings.
-- Include: canonical claim, supporting agents, source count, source diversity, council verdict, confidence score
-### Tier 2: Contested Claims
-Council disagreed, or pairwise ranking was close.
-- Present BOTH sides with the specific disagreement.
-- Include: claim_a, claim_b, agents_for_a, agents_for_b, council positions, rank winner, judge rationale
-### Tier 3: Unique Insights
-Found by 1 agent only, not contradicted.
-- High value OR hallucination — flag for user judgment.
-- Include: claim, agent_role, confidence, source, note flagging single-source status
-### Tier 4: Refuted Claims
-Council rejected, or pairwise loser with low evidence.
-- Still present briefly — knowing what's wrong is valuable.
-- Include: claim, rejection_reason, original_agent
----
-## Phase 5: Render
-Produce both:
-- **Structured JSON artifact** (see output-format.md for schema)
-- **Rendered markdown** for user display (see output-format.md for template)
-The JSON artifact is the primary output for skill-to-skill consumption. The markdown is for human reading.
----
-## Reconciler Agent Prompt
-```
-You are the Reconciler — the final judge in a multi-agent research pipeline.
-You receive findings from 7 research agents that have been normalized,
-deduplicated, and adversarially reviewed.
-Your job is NOT to average or blend. Your job is to JUDGE:
-- Which claims are strong? (multiple independent sources, council agreement)
-- Which claims are contested? (present both sides, don't pick a winner)
-- Which claims are unique insights? (valuable if true, flag for verification)
-- Which claims should be rejected? (weak evidence, circular sourcing, council rejection)
-Produce a structured disagreement map, not a smooth summary.
-The user should see WHERE agents agreed, WHERE they disagreed, and WHY.
-INPUT:
-- normalized_claims: [list of deduplicated claims with agent_count and source_diversity]
-- council_verdicts: [list of claims with agreed/contested/insufficient verdicts]
-- pairwise_rankings: [list of contested claim pairs with winners and rationale]
-- credibility_scores: [list of claims with scored credibility from score_credibility.py]
-OUTPUT FORMAT:
-Return a JSON object with these exact keys:
-{
-  "consensus": [{ claim, supporting_agents, source_count, source_diversity, council_verdict, confidence }],
-  "contested": [{ claim_a, claim_b, agents_for_a, agents_for_b, council_verdict, provider_positions, rank_winner, judge_rationale }],
-  "unique_insights": [{ claim, agent_role, confidence, note }],
-  "refuted": [{ claim, rejection_reason, original_agent }]
-}
-RULES:
-- A claim needs 3+ agents AND council agreement to be consensus
-- A claim with 2 agents but council agreement goes to consensus with a "moderate confidence" flag
-- A claim with council disagreement ALWAYS goes to contested, even if 5 agents agree
-- A single-agent claim with confidence > 0.8 goes to unique_insights
-- A single-agent claim with confidence <= 0.5 goes to refuted
-- Everything else goes to unique_insights with appropriate flagging
-- NEVER merge contested claims into a smooth middle ground — preserve the disagreement
-```
----
-## Pipeline Skip Rules
-- **Quick mode**: Skip Phases 2, 3, 4. Orchestrator does a single-pass synthesis directly from normalized findings.
-- **Standard mode**: Skip Phase 2 (council). Run Phases 1, 3, 4, 5.
-- **Deep mode**: Run all 5 phases.
+# Synthesis Pipeline
+5-phase pipeline that takes raw findings from finder agents and produces a structured disagreement map.
+---
+## Phase 1: Normalize & Deduplicate
+Input: Raw findings from all finder agents (7 agents x 3-8 findings each = 21-56 findings)
+Steps:
+1. Flatten all findings into a single list
+2. Group by semantic similarity (same claim from different agents)
+3. For each group:
+   - Merge into a single canonical claim
+   - Track which agents found it (agent_count)
+   - Track source type diversity (source_diversity_score = unique source types / total sources)
+   - Flag circular sourcing: if all sources in a group cite the same original source, mark as circular=true
+4. Output: unique_claims[] sorted by agent_count DESC, source_diversity_score DESC
+### Semantic Similarity Heuristics
+Two claims are considered semantically similar when:
+- They make the same factual assertion about the same subject, even with different wording
+- One is a subset of the other (e.g., "X uses Y" vs "X uses Y for Z")
+- They cite the same source for the same conclusion
+Two claims are NOT similar when:
+- They address different aspects of the same topic
+- They reach different conclusions about the same subject
+- One is general and the other is specific with additional qualifying conditions
+When merging, keep the most specific version as the canonical claim.
+---
+## Phase 2: Adversarial Review (ftm-council)
+Input: Top claims from Phase 1 (all claims with agent_count >= 2, plus any high-confidence unique claims with confidence > 0.8)
+Council invocation:
+- Send claims as a structured prompt to ftm-council
+- Ask: "Evaluate each claim. For each: Is the evidence sufficient? What would make this wrong? Are there alternative explanations? Rate confidence 0-1."
+- Council runs Claude + Codex + Gemini independently, then reconciles
+Output: claims[] with council_verdict (agreed | contested | insufficient_evidence), provider_disagreements[]
+### FALLBACK (if Codex/Gemini unavailable):
+Spawn 2 standalone agents on the review model:
+**Devil's Advocate:** "Your job is to find reasons each claim is WRONG. Search for counter-evidence, flag single-source claims, identify logical gaps."
+**Edge Case Hunter:** "Your job is to find where each claim BREAKS. Scaling limits, security concerns, accessibility gaps, failure modes under load."
+Both receive all claims and return challenge_findings[]
+---
+## Phase 3: Pairwise Rank (for contested claims)
+Input: Claims marked as "contested" by council
+For each pair of conflicting claims:
+- LLM-as-judge prompt: "Given research question Q, Claim A says [X] with evidence [E1]. Claim B says [Y] with evidence [E2]. Which claim is better supported? Why? Consider: source authority, evidence specificity, logical coherence, relevance to the question."
+- Tournament bracket: winners advance, losers are demoted to "minority view"
+Output: ranked_claims[] with rank_position, judge_rationale
+### Ranking Criteria (in priority order)
+1. **Source authority**: Primary sources and peer-reviewed research outweigh blog posts and forum answers
+2. **Evidence specificity**: Concrete data points (benchmarks, case studies with numbers) outweigh general assertions
+3. **Logical coherence**: Claims with clear causal reasoning outweigh correlational arguments
+4. **Relevance to question**: Claims that directly address the research question outweigh tangentially related findings
+5. **Recency**: For fast-moving topics, newer evidence outweighs older evidence (all else equal)
+---
+## Phase 4: Reconcile — Disagreement Map
+Input: All processed claims (normalized, council-reviewed, ranked)
+The Reconciler agent produces structured output in 4 tiers:
+### Tier 1: Consensus Claims
+3+ agents agree, council agreed, multiple source types.
+- Highest confidence. Present as established findings.
+- Include: canonical claim, supporting agents, source count, source diversity, council verdict, confidence score
+### Tier 2: Contested Claims
+Council disagreed, or pairwise ranking was close.
+- Present BOTH sides with the specific disagreement.
+- Include: claim_a, claim_b, agents_for_a, agents_for_b, council positions, rank winner, judge rationale
+### Tier 3: Unique Insights
+Found by 1 agent only, not contradicted.
+- High value OR hallucination — flag for user judgment.
+- Include: claim, agent_role, confidence, source, note flagging single-source status
+### Tier 4: Refuted Claims
+Council rejected, or pairwise loser with low evidence.
+- Still present briefly — knowing what's wrong is valuable.
+- Include: claim, rejection_reason, original_agent
+---
+## Phase 5: Render
+Produce both:
+- **Structured JSON artifact** (see output-format.md for schema)
+- **Rendered markdown** for user display (see output-format.md for template)
+The JSON artifact is the primary output for skill-to-skill consumption. The markdown is for human reading.
+---
+## Reconciler Agent Prompt
+```
+You are the Reconciler — the final judge in a multi-agent research pipeline.
+You receive findings from 7 research agents that have been normalized,
+deduplicated, and adversarially reviewed.
+Your job is NOT to average or blend. Your job is to JUDGE:
+- Which claims are strong? (multiple independent sources, council agreement)
+- Which claims are contested? (present both sides, don't pick a winner)
+- Which claims are unique insights? (valuable if true, flag for verification)
+- Which claims should be rejected? (weak evidence, circular sourcing, council rejection)
+Produce a structured disagreement map, not a smooth summary.
+The user should see WHERE agents agreed, WHERE they disagreed, and WHY.
+INPUT:
+- normalized_claims: [list of deduplicated claims with agent_count and source_diversity]
+- council_verdicts: [list of claims with agreed/contested/insufficient verdicts]
+- pairwise_rankings: [list of contested claim pairs with winners and rationale]
+- credibility_scores: [list of claims with scored credibility from score_credibility.py]
+OUTPUT FORMAT:
+Return a JSON object with these exact keys:
+{
+  "consensus": [{ claim, supporting_agents, source_count, source_diversity, council_verdict, confidence }],
+  "contested": [{ claim_a, claim_b, agents_for_a, agents_for_b, council_verdict, provider_positions, rank_winner, judge_rationale }],
+  "unique_insights": [{ claim, agent_role, confidence, note }],
+  "refuted": [{ claim, rejection_reason, original_agent }]
+}
+RULES:
+- A claim needs 3+ agents AND council agreement to be consensus
+- A claim with 2 agents but council agreement goes to consensus with a "moderate confidence" flag
+- A claim with council disagreement ALWAYS goes to contested, even if 5 agents agree
+- A single-agent claim with confidence > 0.8 goes to unique_insights
+- A single-agent claim with confidence <= 0.5 goes to refuted
+- Everything else goes to unique_insights with appropriate flagging
+- NEVER merge contested claims into a smooth middle ground — preserve the disagreement
+```
+---
+## Pipeline Skip Rules
+- **Quick mode**: Skip Phases 2, 3, 4. Orchestrator does a single-pass synthesis directly from normalized findings.
+- **Standard mode**: Skip Phase 2 (council). Run Phases 1, 3, 4, 5.
+- **Deep mode**: Run all 5 phases.
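
The RULES list in the Reconciler Agent Prompt in the hunk above reads as a decision procedure, so a small sketch may help when reviewing it. The Python below is illustrative only and is not part of the package. The `Claim` dataclass, the `assign_tier` function, the "needs verification" flag text, and the order in which the rules are checked are all assumptions made for this example. The prompt leaves some cases open, such as a single-agent claim the council agreed with, and the sketch resolves them in one plausible way.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Claim:
    # Hypothetical container; field names mirror terms used in the prompt.
    text: str
    agent_count: int
    council_verdict: str  # "agreed" | "contested" | "insufficient_evidence"
    confidence: float


def assign_tier(claim: Claim) -> Tuple[str, Optional[str]]:
    """Return (tier, flag) for one claim, following the RULES list.

    The rule ordering here is an assumption; the prompt does not fix one.
    """
    # Council disagreement always sends the claim to "contested",
    # no matter how many agents found it.
    if claim.council_verdict == "contested":
        return "contested", None

    # Consensus requires council agreement plus 3+ agents;
    # 2 agents with agreement is consensus with a moderate flag.
    if claim.council_verdict == "agreed":
        if claim.agent_count >= 3:
            return "consensus", None
        if claim.agent_count == 2:
            return "consensus", "moderate confidence"

    # Single-agent claims are split by confidence.
    if claim.agent_count == 1:
        if claim.confidence > 0.8:
            return "unique_insights", None
        if claim.confidence <= 0.5:
            return "refuted", None

    # Everything else goes to unique_insights with a flag
    # (flag wording is hypothetical).
    return "unique_insights", "needs verification"


if __name__ == "__main__":
    sample = Claim("X uses Y for Z", agent_count=5,
                   council_verdict="contested", confidence=0.9)
    print(assign_tier(sample))
```

Checking the contested case first matches the "ALWAYS" wording of that rule. A real implementation would also need to decide how `insufficient_evidence` verdicts on multi-agent claims are handled, which this sketch simply routes to the flagged `unique_insights` bucket.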