feed-the-machine 1.6.1 → 1.7.1

Files changed (272)
  1. package/LICENSE +21 -21
  2. package/README.md +262 -170
  3. package/bin/__pycache__/tasks_db.cpython-314.pyc +0 -0
  4. package/bin/brain.py +1340 -0
  5. package/bin/convert_claude_skills_to_codex.py +490 -0
  6. package/bin/generate-manifest.mjs +463 -463
  7. package/bin/harden_codex_skills.py +141 -0
  8. package/bin/install.mjs +491 -491
  9. package/bin/migrate-eng-buddy-data.py +875 -0
  10. package/bin/playbook_engine/__init__.py +1 -0
  11. package/bin/playbook_engine/conftest.py +8 -0
  12. package/bin/playbook_engine/extractor.py +33 -0
  13. package/bin/playbook_engine/manager.py +102 -0
  14. package/bin/playbook_engine/models.py +84 -0
  15. package/bin/playbook_engine/registry.py +35 -0
  16. package/bin/playbook_engine/test_extractor.py +72 -0
  17. package/bin/playbook_engine/test_integration.py +129 -0
  18. package/bin/playbook_engine/test_manager.py +85 -0
  19. package/bin/playbook_engine/test_models.py +166 -0
  20. package/bin/playbook_engine/test_registry.py +67 -0
  21. package/bin/playbook_engine/test_tracer.py +86 -0
  22. package/bin/playbook_engine/tracer.py +93 -0
  23. package/bin/tasks_db.py +456 -0
  24. package/docs/HOOKS.md +243 -243
  25. package/docs/INBOX.md +233 -233
  26. package/ftm/SKILL.md +125 -122
  27. package/ftm-audit/SKILL.md +673 -623
  28. package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -91
  29. package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -66
  30. package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -135
  31. package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -69
  32. package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -96
  33. package/ftm-audit/scripts/run-knip.sh +23 -23
  34. package/ftm-audit.yml +2 -2
  35. package/ftm-brainstorm/SKILL.md +1003 -498
  36. package/ftm-brainstorm/evals/evals.json +180 -100
  37. package/ftm-brainstorm/evals/promptfoo.yaml +109 -109
  38. package/ftm-brainstorm/references/agent-prompts.md +552 -224
  39. package/ftm-brainstorm/references/plan-template.md +209 -121
  40. package/ftm-brainstorm.yml +2 -2
  41. package/ftm-browse/SKILL.md +454 -454
  42. package/ftm-browse/daemon/browser-manager.ts +206 -206
  43. package/ftm-browse/daemon/bun.lock +30 -30
  44. package/ftm-browse/daemon/cli.ts +347 -347
  45. package/ftm-browse/daemon/commands.ts +410 -410
  46. package/ftm-browse/daemon/main.ts +357 -357
  47. package/ftm-browse/daemon/package.json +17 -17
  48. package/ftm-browse/daemon/server.ts +189 -189
  49. package/ftm-browse/daemon/snapshot.ts +519 -519
  50. package/ftm-browse/daemon/tsconfig.json +22 -22
  51. package/ftm-browse.yml +4 -4
  52. package/ftm-capture/SKILL.md +370 -370
  53. package/ftm-capture.yml +4 -4
  54. package/ftm-codex-gate/SKILL.md +361 -361
  55. package/ftm-codex-gate.yml +2 -2
  56. package/ftm-config/SKILL.md +422 -345
  57. package/ftm-config.default.yml +125 -82
  58. package/ftm-config.yml +44 -2
  59. package/ftm-council/SKILL.md +416 -416
  60. package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -60
  61. package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -58
  62. package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -58
  63. package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -57
  64. package/ftm-council/references/protocols/PREREQUISITES.md +47 -47
  65. package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -46
  66. package/ftm-council-chat.yml +2 -0
  67. package/ftm-council.yml +2 -2
  68. package/ftm-dashboard/SKILL.md +163 -163
  69. package/ftm-dashboard.yml +4 -4
  70. package/ftm-debug/SKILL.md +1037 -1037
  71. package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -58
  72. package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -46
  73. package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -279
  74. package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -436
  75. package/ftm-debug/references/protocols/BLACKBOARD.md +86 -86
  76. package/ftm-debug/references/protocols/EDGE-CASES.md +103 -103
  77. package/ftm-debug.yml +2 -2
  78. package/ftm-diagram/SKILL.md +277 -277
  79. package/ftm-diagram.yml +2 -2
  80. package/ftm-executor/SKILL.md +777 -777
  81. package/ftm-executor/references/STYLE-TEMPLATE.md +73 -73
  82. package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -62
  83. package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -34
  84. package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -38
  85. package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +81 -72
  86. package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -66
  87. package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -73
  88. package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -36
  89. package/ftm-executor/references/protocols/MODEL-PROFILE.md +59 -59
  90. package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -66
  91. package/ftm-executor/runtime/ftm-runtime.mjs +252 -252
  92. package/ftm-executor/runtime/package.json +8 -8
  93. package/ftm-executor.yml +2 -2
  94. package/ftm-git/SKILL.md +441 -441
  95. package/ftm-git/evals/evals.json +26 -26
  96. package/ftm-git/evals/promptfoo.yaml +75 -75
  97. package/ftm-git/hooks/post-commit-experience.sh +92 -92
  98. package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -104
  99. package/ftm-git/references/protocols/REMEDIATION.md +139 -139
  100. package/ftm-git/scripts/pre-commit-secrets.sh +110 -110
  101. package/ftm-git.yml +2 -2
  102. package/ftm-inbox/backend/__pycache__/main.cpython-314.pyc +0 -0
  103. package/ftm-inbox/backend/adapters/_retry.py +64 -64
  104. package/ftm-inbox/backend/adapters/base.py +230 -230
  105. package/ftm-inbox/backend/adapters/freshservice.py +104 -104
  106. package/ftm-inbox/backend/adapters/gmail.py +125 -125
  107. package/ftm-inbox/backend/adapters/jira.py +136 -136
  108. package/ftm-inbox/backend/adapters/registry.py +192 -192
  109. package/ftm-inbox/backend/adapters/slack.py +110 -110
  110. package/ftm-inbox/backend/db/connection.py +54 -54
  111. package/ftm-inbox/backend/db/schema.py +78 -78
  112. package/ftm-inbox/backend/executor/__init__.py +7 -7
  113. package/ftm-inbox/backend/executor/engine.py +149 -149
  114. package/ftm-inbox/backend/executor/step_runner.py +98 -98
  115. package/ftm-inbox/backend/main.py +103 -103
  116. package/ftm-inbox/backend/models/__init__.py +1 -1
  117. package/ftm-inbox/backend/models/unified_task.py +36 -36
  118. package/ftm-inbox/backend/planner/__init__.py +6 -6
  119. package/ftm-inbox/backend/planner/__pycache__/__init__.cpython-314.pyc +0 -0
  120. package/ftm-inbox/backend/planner/__pycache__/generator.cpython-314.pyc +0 -0
  121. package/ftm-inbox/backend/planner/__pycache__/schema.cpython-314.pyc +0 -0
  122. package/ftm-inbox/backend/planner/generator.py +127 -127
  123. package/ftm-inbox/backend/planner/schema.py +34 -34
  124. package/ftm-inbox/backend/requirements.txt +5 -5
  125. package/ftm-inbox/backend/routes/__pycache__/plan.cpython-314.pyc +0 -0
  126. package/ftm-inbox/backend/routes/execute.py +186 -186
  127. package/ftm-inbox/backend/routes/health.py +52 -52
  128. package/ftm-inbox/backend/routes/inbox.py +68 -68
  129. package/ftm-inbox/backend/routes/plan.py +271 -271
  130. package/ftm-inbox/bin/launchagent.mjs +91 -91
  131. package/ftm-inbox/bin/setup.mjs +188 -188
  132. package/ftm-inbox/bin/start.sh +10 -10
  133. package/ftm-inbox/bin/status.sh +17 -17
  134. package/ftm-inbox/bin/stop.sh +8 -8
  135. package/ftm-inbox/config.example.yml +55 -55
  136. package/ftm-inbox/package-lock.json +2898 -2898
  137. package/ftm-inbox/package.json +26 -26
  138. package/ftm-inbox/postcss.config.js +6 -6
  139. package/ftm-inbox/src/app.css +199 -199
  140. package/ftm-inbox/src/app.html +18 -18
  141. package/ftm-inbox/src/lib/api.ts +166 -166
  142. package/ftm-inbox/src/lib/components/ExecutionLog.svelte +81 -81
  143. package/ftm-inbox/src/lib/components/InboxFeed.svelte +143 -143
  144. package/ftm-inbox/src/lib/components/PlanStep.svelte +271 -271
  145. package/ftm-inbox/src/lib/components/PlanView.svelte +206 -206
  146. package/ftm-inbox/src/lib/components/StreamPanel.svelte +99 -99
  147. package/ftm-inbox/src/lib/components/TaskCard.svelte +190 -190
  148. package/ftm-inbox/src/lib/components/ui/EmptyState.svelte +63 -63
  149. package/ftm-inbox/src/lib/components/ui/KawaiiCard.svelte +86 -86
  150. package/ftm-inbox/src/lib/components/ui/PillButton.svelte +106 -106
  151. package/ftm-inbox/src/lib/components/ui/StatusBadge.svelte +67 -67
  152. package/ftm-inbox/src/lib/components/ui/StreamDrawer.svelte +149 -149
  153. package/ftm-inbox/src/lib/components/ui/ThemeToggle.svelte +80 -80
  154. package/ftm-inbox/src/lib/theme.ts +47 -47
  155. package/ftm-inbox/src/routes/+layout.svelte +76 -76
  156. package/ftm-inbox/src/routes/+page.svelte +401 -401
  157. package/ftm-inbox/svelte.config.js +12 -12
  158. package/ftm-inbox/tailwind.config.ts +63 -63
  159. package/ftm-inbox/tsconfig.json +13 -13
  160. package/ftm-inbox/vite.config.ts +6 -6
  161. package/ftm-intent/SKILL.md +241 -241
  162. package/ftm-intent.yml +2 -2
  163. package/ftm-manifest.json +3794 -3794
  164. package/ftm-map/SKILL.md +291 -291
  165. package/ftm-map/scripts/db.py +712 -712
  166. package/ftm-map/scripts/index.py +415 -415
  167. package/ftm-map/scripts/parser.py +224 -224
  168. package/ftm-map/scripts/queries/go-tags.scm +20 -20
  169. package/ftm-map/scripts/queries/javascript-tags.scm +35 -35
  170. package/ftm-map/scripts/queries/python-tags.scm +31 -31
  171. package/ftm-map/scripts/queries/ruby-tags.scm +19 -19
  172. package/ftm-map/scripts/queries/rust-tags.scm +37 -37
  173. package/ftm-map/scripts/queries/typescript-tags.scm +41 -41
  174. package/ftm-map/scripts/query.py +301 -301
  175. package/ftm-map/scripts/ranker.py +377 -377
  176. package/ftm-map/scripts/requirements.txt +5 -5
  177. package/ftm-map/scripts/setup-hooks.sh +27 -27
  178. package/ftm-map/scripts/setup.sh +56 -56
  179. package/ftm-map/scripts/test_db.py +364 -364
  180. package/ftm-map/scripts/test_parser.py +174 -174
  181. package/ftm-map/scripts/test_query.py +183 -183
  182. package/ftm-map/scripts/test_ranker.py +199 -199
  183. package/ftm-map/scripts/views.py +591 -591
  184. package/ftm-map.yml +2 -2
  185. package/ftm-mind/SKILL.md +201 -1943
  186. package/ftm-mind/evals/promptfoo.yaml +142 -142
  187. package/ftm-mind/references/blackboard-protocol.md +110 -0
  188. package/ftm-mind/references/blackboard-schema.md +328 -328
  189. package/ftm-mind/references/complexity-guide.md +110 -110
  190. package/ftm-mind/references/complexity-sizing.md +138 -0
  191. package/ftm-mind/references/decide-act-protocol.md +172 -0
  192. package/ftm-mind/references/direct-execution.md +51 -0
  193. package/ftm-mind/references/environment-discovery.md +77 -0
  194. package/ftm-mind/references/event-registry.md +319 -319
  195. package/ftm-mind/references/mcp-inventory.md +300 -296
  196. package/ftm-mind/references/ops-routing.md +47 -0
  197. package/ftm-mind/references/orient-protocol.md +234 -0
  198. package/ftm-mind/references/personality.md +40 -0
  199. package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -72
  200. package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -32
  201. package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -80
  202. package/ftm-mind/references/reflexion-protocol.md +249 -249
  203. package/ftm-mind/references/routing/SCENARIOS.md +22 -22
  204. package/ftm-mind/references/routing-scenarios.md +35 -35
  205. package/ftm-mind.yml +2 -2
  206. package/ftm-ops.yml +4 -0
  207. package/ftm-pause/SKILL.md +395 -395
  208. package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -186
  209. package/ftm-pause/references/protocols/VALIDATION.md +80 -80
  210. package/ftm-pause.yml +2 -2
  211. package/ftm-researcher/SKILL.md +275 -275
  212. package/ftm-researcher/evals/agent-diversity.yaml +17 -17
  213. package/ftm-researcher/evals/synthesis-quality.yaml +12 -12
  214. package/ftm-researcher/evals/trigger-accuracy.yaml +39 -39
  215. package/ftm-researcher/references/adaptive-search.md +116 -116
  216. package/ftm-researcher/references/agent-prompts.md +193 -193
  217. package/ftm-researcher/references/council-integration.md +193 -193
  218. package/ftm-researcher/references/output-format.md +203 -203
  219. package/ftm-researcher/references/synthesis-pipeline.md +165 -165
  220. package/ftm-researcher/scripts/score_credibility.py +234 -234
  221. package/ftm-researcher/scripts/validate_research.py +92 -92
  222. package/ftm-researcher.yml +2 -2
  223. package/ftm-resume/SKILL.md +518 -518
  224. package/ftm-resume/references/protocols/VALIDATION.md +172 -172
  225. package/ftm-resume.yml +2 -2
  226. package/ftm-retro/SKILL.md +380 -380
  227. package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -89
  228. package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -109
  229. package/ftm-retro.yml +2 -2
  230. package/ftm-routine/SKILL.md +170 -170
  231. package/ftm-routine.yml +4 -4
  232. package/ftm-state/blackboard/capabilities.json +5 -5
  233. package/ftm-state/blackboard/capabilities.schema.json +27 -27
  234. package/ftm-state/blackboard/context.json +37 -23
  235. package/ftm-state/blackboard/experiences/doom-statusline-fix.json +26 -0
  236. package/ftm-state/blackboard/experiences/hackathon-pages-site.json +26 -0
  237. package/ftm-state/blackboard/experiences/hindsight-sso-kickoff.json +42 -0
  238. package/ftm-state/blackboard/experiences/index.json +58 -9
  239. package/ftm-state/blackboard/experiences/learning-ragnarok-api-access.json +23 -0
  240. package/ftm-state/blackboard/experiences/nordlayer-members-auto-assign.json +26 -0
  241. package/ftm-state/blackboard/experiences/saml2aws-stale-session-fix.json +41 -0
  242. package/ftm-state/blackboard/patterns.json +6 -6
  243. package/ftm-state/schemas/context.schema.json +130 -130
  244. package/ftm-state/schemas/experience-index.schema.json +77 -77
  245. package/ftm-state/schemas/experience.schema.json +78 -78
  246. package/ftm-state/schemas/patterns.schema.json +44 -44
  247. package/ftm-upgrade/SKILL.md +194 -194
  248. package/ftm-upgrade/scripts/check-version.sh +76 -76
  249. package/ftm-upgrade/scripts/upgrade.sh +143 -143
  250. package/ftm-upgrade.yml +2 -2
  251. package/ftm-verify.yml +2 -2
  252. package/ftm.yml +2 -2
  253. package/hooks/ftm-auto-log.sh +137 -0
  254. package/hooks/ftm-blackboard-enforcer.sh +93 -93
  255. package/hooks/ftm-discovery-reminder.sh +90 -90
  256. package/hooks/ftm-drafts-gate.sh +61 -61
  257. package/hooks/ftm-event-logger.mjs +107 -107
  258. package/hooks/ftm-install-hooks.sh +240 -0
  259. package/hooks/ftm-learning-capture.sh +117 -0
  260. package/hooks/ftm-map-autodetect.sh +79 -79
  261. package/hooks/ftm-pending-sync-check.sh +22 -22
  262. package/hooks/ftm-plan-gate.sh +92 -92
  263. package/hooks/ftm-post-commit-trigger.sh +57 -57
  264. package/hooks/ftm-post-compaction.sh +138 -0
  265. package/hooks/ftm-pre-compaction.sh +147 -0
  266. package/hooks/ftm-session-end.sh +52 -0
  267. package/hooks/ftm-session-snapshot.sh +213 -0
  268. package/hooks/ftm-task-loader.sh +100 -0
  269. package/hooks/settings-template.json +91 -81
  270. package/install.sh +363 -363
  271. package/package.json +84 -84
  272. package/uninstall.sh +25 -25
package/ftm-researcher/references/synthesis-pipeline.md +165 -165
@@ -1,165 +1,165 @@
1
- # Synthesis Pipeline
2
-
3
- 5-phase pipeline that takes raw findings from finder agents and produces a structured disagreement map.
4
-
5
- ---
6
-
7
- ## Phase 1: Normalize & Deduplicate
8
-
9
- Input: Raw findings from all finder agents (7 agents x 3-8 findings each = 21-56 findings)
10
-
11
- Steps:
12
- 1. Flatten all findings into a single list
13
- 2. Group by semantic similarity (same claim from different agents)
14
- 3. For each group:
15
- - Merge into a single canonical claim
16
- - Track which agents found it (agent_count)
17
- - Track source type diversity (source_diversity_score = unique source types / total sources)
18
- - Flag circular sourcing: if all sources in a group cite the same original source, mark as circular=true
19
- 4. Output: unique_claims[] sorted by agent_count DESC, source_diversity_score DESC
20
-
21
- ### Semantic Similarity Heuristics
22
-
23
- Two claims are considered semantically similar when:
24
- - They make the same factual assertion about the same subject, even with different wording
25
- - One is a subset of the other (e.g., "X uses Y" vs "X uses Y for Z")
26
- - They cite the same source for the same conclusion
27
-
28
- Two claims are NOT similar when:
29
- - They address different aspects of the same topic
30
- - They reach different conclusions about the same subject
31
- - One is general and the other is specific with additional qualifying conditions
32
-
33
- When merging, keep the most specific version as the canonical claim.
34
-
35
- ---
36
-
37
- ## Phase 2: Adversarial Review (ftm-council)
38
-
39
- Input: Top claims from Phase 1 (all claims with agent_count >= 2, plus any high-confidence unique claims with confidence > 0.8)
40
-
41
- Council invocation:
42
- - Send claims as a structured prompt to ftm-council
43
- - Ask: "Evaluate each claim. For each: Is the evidence sufficient? What would make this wrong? Are there alternative explanations? Rate confidence 0-1."
44
- - Council runs Claude + Codex + Gemini independently, then reconciles
45
-
46
- Output: claims[] with council_verdict (agreed | contested | insufficient_evidence), provider_disagreements[]
47
-
48
- ### FALLBACK (if Codex/Gemini unavailable):
49
-
50
- Spawn 2 standalone agents on the review model:
51
-
52
- **Devil's Advocate:** "Your job is to find reasons each claim is WRONG. Search for counter-evidence, flag single-source claims, identify logical gaps."
53
-
54
- **Edge Case Hunter:** "Your job is to find where each claim BREAKS. Scaling limits, security concerns, accessibility gaps, failure modes under load."
55
-
56
- Both receive all claims and return challenge_findings[]
57
-
58
- ---
59
-
60
- ## Phase 3: Pairwise Rank (for contested claims)
61
-
62
- Input: Claims marked as "contested" by council
63
-
64
- For each pair of conflicting claims:
65
- - LLM-as-judge prompt: "Given research question Q, Claim A says [X] with evidence [E1]. Claim B says [Y] with evidence [E2]. Which claim is better supported? Why? Consider: source authority, evidence specificity, logical coherence, relevance to the question."
66
- - Tournament bracket: winners advance, losers are demoted to "minority view"
67
-
68
- Output: ranked_claims[] with rank_position, judge_rationale
69
-
70
- ### Ranking Criteria (in priority order)
71
-
72
- 1. **Source authority**: Primary sources and peer-reviewed research outweigh blog posts and forum answers
73
- 2. **Evidence specificity**: Concrete data points (benchmarks, case studies with numbers) outweigh general assertions
74
- 3. **Logical coherence**: Claims with clear causal reasoning outweigh correlational arguments
75
- 4. **Relevance to question**: Claims that directly address the research question outweigh tangentially related findings
76
- 5. **Recency**: For fast-moving topics, newer evidence outweighs older evidence (all else equal)
77
-
78
- ---
79
-
80
- ## Phase 4: Reconcile — Disagreement Map
81
-
82
- Input: All processed claims (normalized, council-reviewed, ranked)
83
-
84
- The Reconciler agent produces structured output in 4 tiers:
85
-
86
- ### Tier 1: Consensus Claims
87
- 3+ agents agree, council agreed, multiple source types.
88
- - Highest confidence. Present as established findings.
89
- - Include: canonical claim, supporting agents, source count, source diversity, council verdict, confidence score
90
-
91
- ### Tier 2: Contested Claims
92
- Council disagreed, or pairwise ranking was close.
93
- - Present BOTH sides with the specific disagreement.
94
- - Include: claim_a, claim_b, agents_for_a, agents_for_b, council positions, rank winner, judge rationale
95
-
96
- ### Tier 3: Unique Insights
97
- Found by 1 agent only, not contradicted.
98
- - High value OR hallucination — flag for user judgment.
99
- - Include: claim, agent_role, confidence, source, note flagging single-source status
100
-
101
- ### Tier 4: Refuted Claims
102
- Council rejected, or pairwise loser with low evidence.
103
- - Still present briefly — knowing what's wrong is valuable.
104
- - Include: claim, rejection_reason, original_agent
105
-
106
- ---
107
-
108
- ## Phase 5: Render
109
-
110
- Produce both:
111
- - **Structured JSON artifact** (see output-format.md for schema)
112
- - **Rendered markdown** for user display (see output-format.md for template)
113
-
114
- The JSON artifact is the primary output for skill-to-skill consumption. The markdown is for human reading.
115
-
116
- ---
117
-
118
- ## Reconciler Agent Prompt
119
-
120
- ```
121
- You are the Reconciler — the final judge in a multi-agent research pipeline.
122
- You receive findings from 7 research agents that have been normalized,
123
- deduplicated, and adversarially reviewed.
124
-
125
- Your job is NOT to average or blend. Your job is to JUDGE:
126
- - Which claims are strong? (multiple independent sources, council agreement)
127
- - Which claims are contested? (present both sides, don't pick a winner)
128
- - Which claims are unique insights? (valuable if true, flag for verification)
129
- - Which claims should be rejected? (weak evidence, circular sourcing, council rejection)
130
-
131
- Produce a structured disagreement map, not a smooth summary.
132
- The user should see WHERE agents agreed, WHERE they disagreed, and WHY.
133
-
134
- INPUT:
135
- - normalized_claims: [list of deduplicated claims with agent_count and source_diversity]
136
- - council_verdicts: [list of claims with agreed/contested/insufficient verdicts]
137
- - pairwise_rankings: [list of contested claim pairs with winners and rationale]
138
- - credibility_scores: [list of claims with scored credibility from score_credibility.py]
139
-
140
- OUTPUT FORMAT:
141
- Return a JSON object with these exact keys:
142
- {
143
- "consensus": [{ claim, supporting_agents, source_count, source_diversity, council_verdict, confidence }],
144
- "contested": [{ claim_a, claim_b, agents_for_a, agents_for_b, council_verdict, provider_positions, rank_winner, judge_rationale }],
145
- "unique_insights": [{ claim, agent_role, confidence, note }],
146
- "refuted": [{ claim, rejection_reason, original_agent }]
147
- }
148
-
149
- RULES:
150
- - A claim needs 3+ agents AND council agreement to be consensus
151
- - A claim with 2 agents but council agreement goes to consensus with a "moderate confidence" flag
152
- - A claim with council disagreement ALWAYS goes to contested, even if 5 agents agree
153
- - A single-agent claim with confidence > 0.8 goes to unique_insights
154
- - A single-agent claim with confidence <= 0.5 goes to refuted
155
- - Everything else goes to unique_insights with appropriate flagging
156
- - NEVER merge contested claims into a smooth middle ground — preserve the disagreement
157
- ```
158
-
159
- ---
160
-
161
- ## Pipeline Skip Rules
162
-
163
- - **Quick mode**: Skip Phases 2, 3, 4. Orchestrator does a single-pass synthesis directly from normalized findings.
164
- - **Standard mode**: Skip Phase 2 (council). Run Phases 1, 3, 4, 5.
165
- - **Deep mode**: Run all 5 phases.
1
+ # Synthesis Pipeline
2
+
3
+ 5-phase pipeline that takes raw findings from finder agents and produces a structured disagreement map.
4
+
5
+ ---
6
+
7
+ ## Phase 1: Normalize & Deduplicate
8
+
9
+ Input: Raw findings from all finder agents (7 agents x 3-8 findings each = 21-56 findings)
10
+
11
+ Steps:
12
+ 1. Flatten all findings into a single list
13
+ 2. Group by semantic similarity (same claim from different agents)
14
+ 3. For each group:
15
+ - Merge into a single canonical claim
16
+ - Track which agents found it (agent_count)
17
+ - Track source type diversity (source_diversity_score = unique source types / total sources)
18
+ - Flag circular sourcing: if all sources in a group cite the same original source, mark as circular=true
19
+ 4. Output: unique_claims[] sorted by agent_count DESC, source_diversity_score DESC
20
+
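
The Phase 1 steps above can be sketched roughly as follows. This is a minimal illustration, not the package's code: the `Finding` fields, the precomputed `group_key` (standing in for the semantic-similarity grouping), the `origin` field, and the use of text length as a proxy for "most specific" are all assumptions. It also assumes one source per finding and only flags circular sourcing when a group has more than one source.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str        # finder agent that reported the claim
    group_key: str    # hypothetical: label produced by a semantic-similarity step
    text: str         # the claim as worded by this agent
    source_type: str  # e.g. "docs", "blog", "paper"
    origin: str       # hypothetical: original source the citation traces back to


def normalize(findings):
    """Flatten, group, merge, and sort findings into unique claims."""
    groups = {}
    for f in findings:
        groups.setdefault(f.group_key, []).append(f)

    unique_claims = []
    for members in groups.values():
        agents = sorted({m.agent for m in members})
        source_types = {m.source_type for m in members}
        origins = {m.origin for m in members}
        unique_claims.append({
            # Assumption: the longest wording stands in for "most specific".
            "claim": max((m.text for m in members), key=len),
            "agents": agents,
            "agent_count": len(agents),
            # unique source types / total sources (one source per finding here)
            "source_diversity_score": len(source_types) / len(members),
            # Assumption: a single-source group is not treated as circular.
            "circular": len(members) > 1 and len(origins) == 1,
        })

    # agent_count DESC, then source_diversity_score DESC
    unique_claims.sort(
        key=lambda c: (-c["agent_count"], -c["source_diversity_score"])
    )
    return unique_claims
```

In a real pipeline the grouping would come from whatever similarity judgment the orchestrator applies; the sketch only shows the bookkeeping once groups exist.
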
21
+ ### Semantic Similarity Heuristics
22
+
23
+ Two claims are considered semantically similar when:
24
+ - They make the same factual assertion about the same subject, even with different wording
25
+ - One is a subset of the other (e.g., "X uses Y" vs "X uses Y for Z")
26
+ - They cite the same source for the same conclusion
27
+
28
+ Two claims are NOT similar when:
29
+ - They address different aspects of the same topic
30
+ - They reach different conclusions about the same subject
31
+ - One is general and the other is specific with additional qualifying conditions
32
+
33
+ When merging, keep the most specific version as the canonical claim.
34
+
35
+ ---
36
+
37
+ ## Phase 2: Adversarial Review (ftm-council)
38
+
39
+ Input: Top claims from Phase 1 (all claims with agent_count >= 2, plus any high-confidence unique claims with confidence > 0.8)
40
+
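
The Phase 2 input rule above amounts to a simple filter. The sketch below assumes each claim is a dict carrying `agent_count` and an optional `confidence` field; the function name and the `confidence_floor` parameter are hypothetical.

```python
def select_for_council(unique_claims, confidence_floor=0.8):
    """Pick the claims that Phase 2 would send to the council."""
    selected = []
    for claim in unique_claims:
        multi_agent = claim["agent_count"] >= 2
        # Single-agent claims qualify only with confidence strictly above the floor.
        confident_unique = (
            claim["agent_count"] == 1
            and claim.get("confidence", 0.0) > confidence_floor
        )
        if multi_agent or confident_unique:
            selected.append(claim)
    return selected
```
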
41
+ Council invocation:
42
+ - Send claims as a structured prompt to ftm-council
43
+ - Ask: "Evaluate each claim. For each: Is the evidence sufficient? What would make this wrong? Are there alternative explanations? Rate confidence 0-1."
44
+ - Council runs Claude + Codex + Gemini independently, then reconciles
45
+
46
+ Output: claims[] with council_verdict (agreed | contested | insufficient_evidence), provider_disagreements[]
47
+
48
+ ### FALLBACK (if Codex/Gemini unavailable):
49
+
50
+ Spawn 2 standalone agents on the review model:
51
+
52
+ **Devil's Advocate:** "Your job is to find reasons each claim is WRONG. Search for counter-evidence, flag single-source claims, identify logical gaps."
53
+
54
+ **Edge Case Hunter:** "Your job is to find where each claim BREAKS. Scaling limits, security concerns, accessibility gaps, failure modes under load."
55
+
56
+ Both receive all claims and return challenge_findings[]
57
+
58
+ ---
59
+
60
+ ## Phase 3: Pairwise Rank (for contested claims)
61
+
62
+ Input: Claims marked as "contested" by council
63
+
64
+ For each pair of conflicting claims:
65
+ - LLM-as-judge prompt: "Given research question Q, Claim A says [X] with evidence [E1]. Claim B says [Y] with evidence [E2]. Which claim is better supported? Why? Consider: source authority, evidence specificity, logical coherence, relevance to the question."
66
+ - Tournament bracket: winners advance, losers are demoted to "minority view"
67
+
68
+ Output: ranked_claims[] with rank_position, judge_rationale
69
+
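
One way to read the tournament bracket above is as a single-elimination loop. In this sketch the `judge` callable stands in for the LLM-as-judge prompt and is assumed to return `(winner, rationale)`; the `minority_view` flag and the handling of byes are illustrative choices, not something the source specifies.

```python
def run_bracket(claims, judge):
    """Single-elimination ranking; judge(a, b) returns (winner, rationale)."""
    remaining = list(claims)
    eliminated = []  # (claim, rationale), earliest elimination first

    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining) - 1, 2):
            a, b = remaining[i], remaining[i + 1]
            winner, rationale = judge(a, b)
            if winner != a and winner != b:
                raise ValueError("judge must return one of the two claims")
            loser = b if winner == a else a
            next_round.append(winner)
            eliminated.append((loser, rationale))
        if len(remaining) % 2:
            # Odd claim out gets a bye into the next round.
            next_round.append(remaining[-1])
        remaining = next_round

    # Champion first, then losers from latest to earliest elimination.
    ranked = [(c, None) for c in remaining] + eliminated[::-1]
    return [
        {
            "claim": claim,
            "rank_position": position,
            "judge_rationale": rationale,
            "minority_view": position > 1,
        }
        for position, (claim, rationale) in enumerate(ranked, start=1)
    ]
```

A stub judge is enough to exercise the bracket offline, for example `lambda a, b: (min(a, b), "alphabetical")` on plain strings.
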
70
+ ### Ranking Criteria (in priority order)
71
+
72
+ 1. **Source authority**: Primary sources and peer-reviewed research outweigh blog posts and forum answers
73
+ 2. **Evidence specificity**: Concrete data points (benchmarks, case studies with numbers) outweigh general assertions
74
+ 3. **Logical coherence**: Claims with clear causal reasoning outweigh correlational arguments
75
+ 4. **Relevance to question**: Claims that directly address the research question outweigh tangentially related findings
76
+ 5. **Recency**: For fast-moving topics, newer evidence outweighs older evidence (all else equal)
77
+
78
+ ---
79
+
80
+ ## Phase 4: Reconcile — Disagreement Map
81
+
82
+ Input: All processed claims (normalized, council-reviewed, ranked)
83
+
84
+ The Reconciler agent produces structured output in 4 tiers:
85
+
86
+ ### Tier 1: Consensus Claims
87
+ 3+ agents agree, council agreed, multiple source types.
88
+ - Highest confidence. Present as established findings.
89
+ - Include: canonical claim, supporting agents, source count, source diversity, council verdict, confidence score
90
+
91
+ ### Tier 2: Contested Claims
92
+ Council disagreed, or pairwise ranking was close.
93
+ - Present BOTH sides with the specific disagreement.
94
+ - Include: claim_a, claim_b, agents_for_a, agents_for_b, council positions, rank winner, judge rationale
95
+
96
+ ### Tier 3: Unique Insights
97
+ Found by 1 agent only, not contradicted.
98
+ - High value OR hallucination — flag for user judgment.
99
+ - Include: claim, agent_role, confidence, source, note flagging single-source status
100
+
101
+ ### Tier 4: Refuted Claims
102
+ Council rejected, or pairwise loser with low evidence.
103
+ - Still present briefly — knowing what's wrong is valuable.
104
+ - Include: claim, rejection_reason, original_agent
105
+
106
+ ---
107
+
108
+ ## Phase 5: Render
109
+
110
+ Produce both:
111
+ - **Structured JSON artifact** (see output-format.md for schema)
112
+ - **Rendered markdown** for user display (see output-format.md for template)
113
+
114
+ The JSON artifact is the primary output for skill-to-skill consumption. The markdown is for human reading.
115
+
116
+ ---
117
+
118
+ ## Reconciler Agent Prompt
119
+
120
+ ```
121
+ You are the Reconciler — the final judge in a multi-agent research pipeline.
122
+ You receive findings from 7 research agents that have been normalized,
123
+ deduplicated, and adversarially reviewed.
124
+
125
+ Your job is NOT to average or blend. Your job is to JUDGE:
126
+ - Which claims are strong? (multiple independent sources, council agreement)
127
+ - Which claims are contested? (present both sides, don't pick a winner)
128
+ - Which claims are unique insights? (valuable if true, flag for verification)
129
+ - Which claims should be rejected? (weak evidence, circular sourcing, council rejection)
130
+
131
+ Produce a structured disagreement map, not a smooth summary.
132
+ The user should see WHERE agents agreed, WHERE they disagreed, and WHY.
133
+
134
+ INPUT:
135
+ - normalized_claims: [list of deduplicated claims with agent_count and source_diversity]
136
+ - council_verdicts: [list of claims with agreed/contested/insufficient verdicts]
137
+ - pairwise_rankings: [list of contested claim pairs with winners and rationale]
138
+ - credibility_scores: [list of claims with scored credibility from score_credibility.py]
139
+
140
+ OUTPUT FORMAT:
141
+ Return a JSON object with these exact keys:
142
+ {
143
+ "consensus": [{ claim, supporting_agents, source_count, source_diversity, council_verdict, confidence }],
144
+ "contested": [{ claim_a, claim_b, agents_for_a, agents_for_b, council_verdict, provider_positions, rank_winner, judge_rationale }],
145
+ "unique_insights": [{ claim, agent_role, confidence, note }],
146
+ "refuted": [{ claim, rejection_reason, original_agent }]
147
+ }
148
+
149
+ RULES:
150
+ - A claim needs 3+ agents AND council agreement to be consensus
151
+ - A claim with 2 agents but council agreement goes to consensus with a "moderate confidence" flag
152
+ - A claim with council disagreement ALWAYS goes to contested, even if 5 agents agree
153
+ - A single-agent claim with confidence > 0.8 goes to unique_insights
154
+ - A single-agent claim with confidence <= 0.5 goes to refuted
155
+ - Everything else goes to unique_insights with appropriate flagging
156
+ - NEVER merge contested claims into a smooth middle ground — preserve the disagreement
157
+ ```
158
+
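
The RULES in the prompt above can be read as an ordered decision procedure. The sketch below is one possible reading, with the council-disagreement rule checked first because it is stated as absolute. The function name, the `(tier, flag)` return shape, and the "needs verification" flag text are assumptions; only the tier names and thresholds come from the prompt.

```python
def assign_tier(agent_count, council_verdict, confidence):
    """Map one claim to (tier, flag) following the Reconciler RULES."""
    # Council disagreement always wins, regardless of how many agents agree.
    if council_verdict == "contested":
        return "contested", None
    if council_verdict == "agreed" and agent_count >= 3:
        return "consensus", None
    if council_verdict == "agreed" and agent_count == 2:
        return "consensus", "moderate confidence"
    if agent_count == 1 and confidence > 0.8:
        return "unique_insights", None
    if agent_count == 1 and confidence <= 0.5:
        return "refuted", None
    # Everything else: unique_insights with a flag (flag wording is illustrative).
    return "unique_insights", "needs verification"
```

Under this reading, a multi-agent claim with an `insufficient_evidence` verdict falls through to the last rule, as does a single-agent claim with confidence between 0.5 and 0.8.
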
159
+ ---
160
+
161
+ ## Pipeline Skip Rules
162
+
163
+ - **Quick mode**: Skip Phases 2, 3, 4. Orchestrator does a single-pass synthesis directly from normalized findings.
164
+ - **Standard mode**: Skip Phase 2 (council). Run Phases 1, 3, 4, 5.
165
+ - **Deep mode**: Run all 5 phases.
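
The skip rules above reduce to a lookup from mode to the phases that run. A small sketch, assuming the mode names are passed as lowercase strings and that quick mode still runs Phases 1 and 5 (the source only lists what is skipped); `PHASES_BY_MODE` and `phases_for` are hypothetical names.

```python
PHASES_BY_MODE = {
    "quick": [1, 5],             # skips 2, 3, 4; orchestrator synthesizes in one pass
    "standard": [1, 3, 4, 5],    # skips Phase 2 (council)
    "deep": [1, 2, 3, 4, 5],     # runs all five phases
}


def phases_for(mode):
    """Return the phase numbers to run for a given research mode."""
    try:
        return PHASES_BY_MODE[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```
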