feed-the-machine 1.6.0 → 1.7.0
- package/LICENSE +21 -21
- package/README.md +170 -170
- package/bin/brain.py +1340 -0
- package/bin/convert_claude_skills_to_codex.py +490 -0
- package/bin/generate-manifest.mjs +463 -463
- package/bin/harden_codex_skills.py +141 -0
- package/bin/install.mjs +491 -491
- package/bin/migrate-eng-buddy-data.py +875 -0
- package/bin/playbook_engine/__init__.py +1 -0
- package/bin/playbook_engine/conftest.py +8 -0
- package/bin/playbook_engine/extractor.py +33 -0
- package/bin/playbook_engine/manager.py +102 -0
- package/bin/playbook_engine/models.py +84 -0
- package/bin/playbook_engine/registry.py +35 -0
- package/bin/playbook_engine/test_extractor.py +72 -0
- package/bin/playbook_engine/test_integration.py +129 -0
- package/bin/playbook_engine/test_manager.py +85 -0
- package/bin/playbook_engine/test_models.py +166 -0
- package/bin/playbook_engine/test_registry.py +67 -0
- package/bin/playbook_engine/test_tracer.py +86 -0
- package/bin/playbook_engine/tracer.py +93 -0
- package/bin/tasks_db.py +456 -0
- package/docs/HOOKS.md +243 -243
- package/docs/INBOX.md +233 -233
- package/ftm/SKILL.md +125 -122
- package/ftm-audit/SKILL.md +623 -623
- package/ftm-audit/references/protocols/PROJECT-PATTERNS.md +91 -91
- package/ftm-audit/references/protocols/RUNTIME-WIRING.md +66 -66
- package/ftm-audit/references/protocols/WIRING-CONTRACTS.md +135 -135
- package/ftm-audit/references/strategies/AUTO-FIX-STRATEGIES.md +69 -69
- package/ftm-audit/references/templates/REPORT-FORMAT.md +96 -96
- package/ftm-audit/scripts/run-knip.sh +23 -23
- package/ftm-audit.yml +2 -2
- package/ftm-brainstorm/SKILL.md +1003 -498
- package/ftm-brainstorm/evals/evals.json +180 -100
- package/ftm-brainstorm/evals/promptfoo.yaml +109 -109
- package/ftm-brainstorm/references/agent-prompts.md +552 -224
- package/ftm-brainstorm/references/plan-template.md +209 -121
- package/ftm-brainstorm.yml +2 -2
- package/ftm-browse/SKILL.md +454 -454
- package/ftm-browse/daemon/browser-manager.ts +206 -206
- package/ftm-browse/daemon/bun.lock +30 -30
- package/ftm-browse/daemon/cli.ts +347 -347
- package/ftm-browse/daemon/commands.ts +410 -410
- package/ftm-browse/daemon/main.ts +357 -357
- package/ftm-browse/daemon/package.json +17 -17
- package/ftm-browse/daemon/server.ts +189 -189
- package/ftm-browse/daemon/snapshot.ts +519 -519
- package/ftm-browse/daemon/tsconfig.json +22 -22
- package/ftm-browse.yml +4 -4
- package/ftm-capture/SKILL.md +370 -370
- package/ftm-capture.yml +4 -4
- package/ftm-codex-gate/SKILL.md +361 -361
- package/ftm-codex-gate.yml +2 -2
- package/ftm-config/SKILL.md +422 -345
- package/ftm-config.default.yml +125 -82
- package/ftm-config.yml +44 -2
- package/ftm-council/SKILL.md +416 -416
- package/ftm-council/references/prompts/CLAUDE-INVESTIGATION.md +60 -60
- package/ftm-council/references/prompts/CODEX-INVESTIGATION.md +58 -58
- package/ftm-council/references/prompts/GEMINI-INVESTIGATION.md +58 -58
- package/ftm-council/references/prompts/REBUTTAL-TEMPLATE.md +57 -57
- package/ftm-council/references/protocols/PREREQUISITES.md +47 -47
- package/ftm-council/references/protocols/STEP-0-FRAMING.md +46 -46
- package/ftm-council.yml +2 -2
- package/ftm-dashboard/SKILL.md +163 -163
- package/ftm-dashboard.yml +4 -4
- package/ftm-debug/SKILL.md +1037 -1037
- package/ftm-debug/references/phases/PHASE-0-INTAKE.md +58 -58
- package/ftm-debug/references/phases/PHASE-1-TRIAGE.md +46 -46
- package/ftm-debug/references/phases/PHASE-2-WAR-ROOM-AGENTS.md +279 -279
- package/ftm-debug/references/phases/PHASE-3-TO-6-EXECUTION.md +436 -436
- package/ftm-debug/references/protocols/BLACKBOARD.md +86 -86
- package/ftm-debug/references/protocols/EDGE-CASES.md +103 -103
- package/ftm-debug.yml +2 -2
- package/ftm-diagram/SKILL.md +277 -277
- package/ftm-diagram.yml +2 -2
- package/ftm-executor/SKILL.md +777 -777
- package/ftm-executor/references/STYLE-TEMPLATE.md +73 -73
- package/ftm-executor/references/phases/PHASE-0-VERIFICATION.md +62 -62
- package/ftm-executor/references/phases/PHASE-2-AGENT-ASSEMBLY.md +34 -34
- package/ftm-executor/references/phases/PHASE-3-WORKTREES.md +38 -38
- package/ftm-executor/references/phases/PHASE-4-5-AUDIT.md +72 -72
- package/ftm-executor/references/phases/PHASE-4-DISPATCH.md +66 -66
- package/ftm-executor/references/phases/PHASE-5-5-CODEX-GATE.md +73 -73
- package/ftm-executor/references/protocols/DOCUMENTATION-BOOTSTRAP.md +36 -36
- package/ftm-executor/references/protocols/MODEL-PROFILE.md +59 -59
- package/ftm-executor/references/protocols/PROGRESS-TRACKING.md +66 -66
- package/ftm-executor/runtime/ftm-runtime.mjs +252 -252
- package/ftm-executor/runtime/package.json +8 -8
- package/ftm-executor.yml +2 -2
- package/ftm-git/SKILL.md +441 -441
- package/ftm-git/evals/evals.json +26 -26
- package/ftm-git/evals/promptfoo.yaml +75 -75
- package/ftm-git/hooks/post-commit-experience.sh +92 -92
- package/ftm-git/references/patterns/SECRET-PATTERNS.md +104 -104
- package/ftm-git/references/protocols/REMEDIATION.md +139 -139
- package/ftm-git/scripts/pre-commit-secrets.sh +110 -110
- package/ftm-git.yml +2 -2
- package/ftm-inbox/backend/__pycache__/main.cpython-314.pyc +0 -0
- package/ftm-inbox/backend/adapters/_retry.py +64 -64
- package/ftm-inbox/backend/adapters/base.py +230 -230
- package/ftm-inbox/backend/adapters/freshservice.py +104 -104
- package/ftm-inbox/backend/adapters/gmail.py +125 -125
- package/ftm-inbox/backend/adapters/jira.py +136 -136
- package/ftm-inbox/backend/adapters/registry.py +192 -192
- package/ftm-inbox/backend/adapters/slack.py +110 -110
- package/ftm-inbox/backend/db/connection.py +54 -54
- package/ftm-inbox/backend/db/schema.py +78 -78
- package/ftm-inbox/backend/executor/__init__.py +7 -7
- package/ftm-inbox/backend/executor/engine.py +149 -149
- package/ftm-inbox/backend/executor/step_runner.py +98 -98
- package/ftm-inbox/backend/main.py +103 -103
- package/ftm-inbox/backend/models/__init__.py +1 -1
- package/ftm-inbox/backend/models/unified_task.py +36 -36
- package/ftm-inbox/backend/planner/__init__.py +6 -6
- package/ftm-inbox/backend/planner/__pycache__/__init__.cpython-314.pyc +0 -0
- package/ftm-inbox/backend/planner/__pycache__/generator.cpython-314.pyc +0 -0
- package/ftm-inbox/backend/planner/__pycache__/schema.cpython-314.pyc +0 -0
- package/ftm-inbox/backend/planner/generator.py +127 -127
- package/ftm-inbox/backend/planner/schema.py +34 -34
- package/ftm-inbox/backend/requirements.txt +5 -5
- package/ftm-inbox/backend/routes/__pycache__/plan.cpython-314.pyc +0 -0
- package/ftm-inbox/backend/routes/execute.py +186 -186
- package/ftm-inbox/backend/routes/health.py +52 -52
- package/ftm-inbox/backend/routes/inbox.py +68 -68
- package/ftm-inbox/backend/routes/plan.py +271 -271
- package/ftm-inbox/bin/launchagent.mjs +91 -91
- package/ftm-inbox/bin/setup.mjs +188 -188
- package/ftm-inbox/bin/start.sh +10 -10
- package/ftm-inbox/bin/status.sh +17 -17
- package/ftm-inbox/bin/stop.sh +8 -8
- package/ftm-inbox/config.example.yml +55 -55
- package/ftm-inbox/package-lock.json +2898 -2898
- package/ftm-inbox/package.json +26 -26
- package/ftm-inbox/postcss.config.js +6 -6
- package/ftm-inbox/src/app.css +199 -199
- package/ftm-inbox/src/app.html +18 -18
- package/ftm-inbox/src/lib/api.ts +166 -166
- package/ftm-inbox/src/lib/components/ExecutionLog.svelte +81 -81
- package/ftm-inbox/src/lib/components/InboxFeed.svelte +143 -143
- package/ftm-inbox/src/lib/components/PlanStep.svelte +271 -271
- package/ftm-inbox/src/lib/components/PlanView.svelte +206 -206
- package/ftm-inbox/src/lib/components/StreamPanel.svelte +99 -99
- package/ftm-inbox/src/lib/components/TaskCard.svelte +190 -190
- package/ftm-inbox/src/lib/components/ui/EmptyState.svelte +63 -63
- package/ftm-inbox/src/lib/components/ui/KawaiiCard.svelte +86 -86
- package/ftm-inbox/src/lib/components/ui/PillButton.svelte +106 -106
- package/ftm-inbox/src/lib/components/ui/StatusBadge.svelte +67 -67
- package/ftm-inbox/src/lib/components/ui/StreamDrawer.svelte +149 -149
- package/ftm-inbox/src/lib/components/ui/ThemeToggle.svelte +80 -80
- package/ftm-inbox/src/lib/theme.ts +47 -47
- package/ftm-inbox/src/routes/+layout.svelte +76 -76
- package/ftm-inbox/src/routes/+page.svelte +401 -401
- package/ftm-inbox/svelte.config.js +12 -12
- package/ftm-inbox/tailwind.config.ts +63 -63
- package/ftm-inbox/tsconfig.json +13 -13
- package/ftm-inbox/vite.config.ts +6 -6
- package/ftm-intent/SKILL.md +241 -241
- package/ftm-intent.yml +2 -2
- package/ftm-manifest.json +3794 -3794
- package/ftm-map/SKILL.md +291 -291
- package/ftm-map/scripts/db.py +712 -712
- package/ftm-map/scripts/index.py +415 -415
- package/ftm-map/scripts/parser.py +224 -224
- package/ftm-map/scripts/queries/go-tags.scm +20 -20
- package/ftm-map/scripts/queries/javascript-tags.scm +35 -35
- package/ftm-map/scripts/queries/python-tags.scm +31 -31
- package/ftm-map/scripts/queries/ruby-tags.scm +19 -19
- package/ftm-map/scripts/queries/rust-tags.scm +37 -37
- package/ftm-map/scripts/queries/typescript-tags.scm +41 -41
- package/ftm-map/scripts/query.py +301 -301
- package/ftm-map/scripts/ranker.py +377 -377
- package/ftm-map/scripts/requirements.txt +5 -5
- package/ftm-map/scripts/setup-hooks.sh +27 -27
- package/ftm-map/scripts/setup.sh +56 -56
- package/ftm-map/scripts/test_db.py +364 -364
- package/ftm-map/scripts/test_parser.py +174 -174
- package/ftm-map/scripts/test_query.py +183 -183
- package/ftm-map/scripts/test_ranker.py +199 -199
- package/ftm-map/scripts/views.py +591 -591
- package/ftm-map.yml +2 -2
- package/ftm-mind/SKILL.md +201 -1943
- package/ftm-mind/evals/promptfoo.yaml +142 -142
- package/ftm-mind/references/blackboard-protocol.md +110 -0
- package/ftm-mind/references/blackboard-schema.md +328 -328
- package/ftm-mind/references/complexity-guide.md +110 -110
- package/ftm-mind/references/complexity-sizing.md +138 -0
- package/ftm-mind/references/decide-act-protocol.md +172 -0
- package/ftm-mind/references/direct-execution.md +51 -0
- package/ftm-mind/references/environment-discovery.md +77 -0
- package/ftm-mind/references/event-registry.md +319 -319
- package/ftm-mind/references/mcp-inventory.md +300 -296
- package/ftm-mind/references/ops-routing.md +47 -0
- package/ftm-mind/references/orient-protocol.md +234 -0
- package/ftm-mind/references/personality.md +40 -0
- package/ftm-mind/references/protocols/COMPLEXITY-SIZING.md +72 -72
- package/ftm-mind/references/protocols/MCP-HEURISTICS.md +32 -32
- package/ftm-mind/references/protocols/PLAN-APPROVAL.md +80 -80
- package/ftm-mind/references/reflexion-protocol.md +249 -249
- package/ftm-mind/references/routing/SCENARIOS.md +22 -22
- package/ftm-mind/references/routing-scenarios.md +35 -35
- package/ftm-mind.yml +2 -2
- package/ftm-ops.yml +4 -0
- package/ftm-pause/SKILL.md +395 -395
- package/ftm-pause/references/protocols/SKILL-RESTORE-PROTOCOLS.md +186 -186
- package/ftm-pause/references/protocols/VALIDATION.md +80 -80
- package/ftm-pause.yml +2 -2
- package/ftm-researcher/SKILL.md +275 -275
- package/ftm-researcher/evals/agent-diversity.yaml +17 -17
- package/ftm-researcher/evals/synthesis-quality.yaml +12 -12
- package/ftm-researcher/evals/trigger-accuracy.yaml +39 -39
- package/ftm-researcher/references/adaptive-search.md +116 -116
- package/ftm-researcher/references/agent-prompts.md +193 -193
- package/ftm-researcher/references/council-integration.md +193 -193
- package/ftm-researcher/references/output-format.md +203 -203
- package/ftm-researcher/references/synthesis-pipeline.md +165 -165
- package/ftm-researcher/scripts/score_credibility.py +234 -234
- package/ftm-researcher/scripts/validate_research.py +92 -92
- package/ftm-researcher.yml +2 -2
- package/ftm-resume/SKILL.md +518 -518
- package/ftm-resume/references/protocols/VALIDATION.md +172 -172
- package/ftm-resume.yml +2 -2
- package/ftm-retro/SKILL.md +380 -380
- package/ftm-retro/references/protocols/SCORING-RUBRICS.md +89 -89
- package/ftm-retro/references/templates/REPORT-FORMAT.md +109 -109
- package/ftm-retro.yml +2 -2
- package/ftm-routine/SKILL.md +170 -170
- package/ftm-routine.yml +4 -4
- package/ftm-state/blackboard/capabilities.json +5 -5
- package/ftm-state/blackboard/capabilities.schema.json +27 -27
- package/ftm-state/blackboard/context.json +37 -23
- package/ftm-state/blackboard/experiences/doom-statusline-fix.json +26 -0
- package/ftm-state/blackboard/experiences/hackathon-pages-site.json +26 -0
- package/ftm-state/blackboard/experiences/hindsight-sso-kickoff.json +42 -0
- package/ftm-state/blackboard/experiences/index.json +58 -9
- package/ftm-state/blackboard/experiences/learning-ragnarok-api-access.json +23 -0
- package/ftm-state/blackboard/experiences/nordlayer-members-auto-assign.json +26 -0
- package/ftm-state/blackboard/experiences/saml2aws-stale-session-fix.json +41 -0
- package/ftm-state/blackboard/patterns.json +6 -6
- package/ftm-state/schemas/context.schema.json +130 -130
- package/ftm-state/schemas/experience-index.schema.json +77 -77
- package/ftm-state/schemas/experience.schema.json +78 -78
- package/ftm-state/schemas/patterns.schema.json +44 -44
- package/ftm-upgrade/SKILL.md +194 -194
- package/ftm-upgrade/scripts/check-version.sh +76 -76
- package/ftm-upgrade/scripts/upgrade.sh +143 -143
- package/ftm-upgrade.yml +2 -2
- package/ftm-verify.yml +2 -2
- package/ftm.yml +2 -2
- package/hooks/ftm-auto-log.sh +137 -0
- package/hooks/ftm-blackboard-enforcer.sh +93 -93
- package/hooks/ftm-discovery-reminder.sh +90 -90
- package/hooks/ftm-drafts-gate.sh +61 -61
- package/hooks/ftm-event-logger.mjs +107 -107
- package/hooks/ftm-install-hooks.sh +240 -0
- package/hooks/ftm-learning-capture.sh +117 -0
- package/hooks/ftm-map-autodetect.sh +79 -79
- package/hooks/ftm-pending-sync-check.sh +22 -22
- package/hooks/ftm-plan-gate.sh +92 -92
- package/hooks/ftm-post-commit-trigger.sh +57 -57
- package/hooks/ftm-post-compaction.sh +138 -0
- package/hooks/ftm-pre-compaction.sh +147 -0
- package/hooks/ftm-session-end.sh +52 -0
- package/hooks/ftm-session-snapshot.sh +213 -0
- package/hooks/settings-template.json +81 -81
- package/install.sh +363 -363
- package/package.json +84 -84
- package/uninstall.sh +25 -25
@@ -1,193 +1,193 @@
-# ftm-council Integration
-
-## When Council Is Invoked
-
-- Deep mode only (standard and quick skip council)
-- After normalize & dedup (Phase 1 of synthesis)
-- Input: all claims with agent_count >= 2, plus high-confidence unique claims (confidence > 0.8)
-
----
-
-## Interface Contract
-
-ftm-researcher prepares a structured prompt for ftm-council:
-
-```
-Evaluate these research findings for accuracy, completeness, and potential bias.
-For each claim below, independently assess:
-1. Is the evidence sufficient to support this claim?
-2. What would make this claim wrong?
-3. Are there alternative explanations the research may have missed?
-4. Rate your confidence in this claim (0-1).
-
-[claims formatted as numbered list with evidence and sources]
-
-Return your assessment for each claim with: verdict (supported/contested/insufficient),
-confidence, and reasoning.
-```
-
-### Payload Format
-
-```json
-{
-  "context": "Research evaluation for: [query]",
-  "claims": [
-    {
-      "id": "f-001",
-      "claim": "...",
-      "evidence": "...",
-      "sources": ["url1", "url2"],
-      "source_types": ["peer_reviewed", "blog"],
-      "agent_count": 3,
-      "credibility_score": 0.78
-    }
-  ],
-  "evaluation_criteria": "accuracy, completeness, potential bias"
-}
-```
-
-### Expected Response Format
-
-```json
-{
-  "evaluations": [
-    {
-      "claim_id": "f-001",
-      "verdict": "supported | contested | insufficient",
-      "confidence": 0.85,
-      "reasoning": "...",
-      "what_would_make_this_wrong": "...",
-      "alternative_explanations": ["..."]
-    }
-  ],
-  "provider_positions": {
-    "claude": { "f-001": "supported", ... },
-    "codex": { "f-001": "contested", ... },
-    "gemini": { "f-001": "supported", ... }
-  }
-}
-```
-
----
-
-## How Council Results Map Back
-
-| Council Verdict | Mapping |
-|---|---|
-| All 3 providers: "supported" | consensus tier |
-| 2 agree "supported", 1 contests | consensus tier with minority note |
-| 2 contest, 1 supports | contested tier |
-| All 3 contest | refuted tier |
-| Mixed with "insufficient" | unique_insights tier (needs more evidence) |
-| 2 "insufficient", 1 "supported" | unique_insights tier |
-| 2 "insufficient", 1 "contested" | refuted tier (not enough evidence to contest = rejection) |
-
-### Tie-Breaking Rules
-
-When the mapping is ambiguous:
-1. Prefer the more conservative tier (contested over consensus, refuted over unique_insights)
-2. If all three providers give different verdicts, place in contested with full position details
-3. If confidence scores diverge significantly (spread > 0.3), flag as high-uncertainty
-
----
-
-## Fallback: Standalone Challengers
-
-When ftm-council is unavailable (Codex CLI or Gemini CLI not installed):
-
-Spawn 2 agents on the `review` model from ftm-config:
-
-### Devil's Advocate Agent
-
-```
-You are the Devil's Advocate in a research pipeline.
-
-Your sole purpose is to find reasons each claim is WRONG.
-
-For each claim below:
-1. Search for counter-evidence using WebSearch
-2. Identify logical gaps in the reasoning
-3. Flag claims supported by only one source type
-4. Check if the evidence actually supports the claim or if the claim overstates the evidence
-5. Look for cherry-picked data or survivorship bias
-
-Be adversarial. The goal is to stress-test, not to confirm.
-
-CLAIMS TO CHALLENGE:
-[formatted list of claims with evidence]
-
-RETURN FORMAT:
-For each claim challenged, return:
-- claim_challenged: [the claim text]
-- challenge_type: counter_evidence | logical_gap | single_source | overstated | bias
-- counter_evidence: [what you found that contradicts or weakens the claim]
-- severity: high | medium | low
-- recommendation: reject | weaken | flag_for_review | accept_with_caveat
-```
-
-### Edge Case Hunter Agent
-
-```
-You are the Edge Case Hunter in a research pipeline.
-
-Your sole purpose is to find where each claim BREAKS.
-
-For each claim below:
-1. What happens at scale? (10x, 100x, 1000x users/data/requests)
-2. What happens under adversarial conditions? (malicious input, DDoS, data poisoning)
-3. What about accessibility? (screen readers, keyboard-only, low bandwidth)
-4. What about the 1% case? (rare but catastrophic failure modes)
-5. What about 5 years from now? (technology shifts, dependency deprecation, scaling limits)
-6. What happens when the key assumption changes? (the market shifts, the API breaks, the team grows)
-
-CLAIMS TO STRESS-TEST:
-[formatted list of claims with evidence]
-
-RETURN FORMAT:
-For each claim stressed, return:
-- claim_challenged: [the claim text]
-- challenge_type: scale | adversarial | accessibility | edge_case | longevity | assumption_shift
-- failure_scenario: [specific scenario where this claim breaks]
-- severity: high | medium | low
-- recommendation: reject | weaken | flag_for_review | accept_with_caveat
-```
-
-### Fallback Mapping
-
-Map challenger results to tiers:
-
-| Challenger Result | Mapping |
-|---|---|
-| No challenges from either agent | consensus |
-| Challenges with weak counter-evidence (low severity) | consensus with note |
-| One agent challenges with medium severity | contested |
-| Both agents challenge with medium+ severity | contested (strong) |
-| Multiple high-severity challenges | refuted |
-| Only edge case challenges, no factual counter-evidence | consensus with edge-case notes |
-
----
-
-## Council Availability Detection
-
-Before invoking ftm-council, check availability:
-
-1. Check if `codex` CLI is installed: `which codex`
-2. Check if `gemini` CLI is installed: `which gemini`
-3. If both are available: use full council
-4. If only one is available: use 2-provider council (reduced confidence in verdicts)
-5. If neither is available: use fallback challenger agents
-
-Log the availability status in the research metadata.
-
----
-
-## Per-Claim Council Invocation
-
-The conversational iteration protocol supports council invocation for individual claims:
-
-When the user says "council #N":
-1. Extract finding N from the current research state
-2. Send ONLY that claim to ftm-council with full evidence
-3. Update the claim's tier based on council verdict
-4. Re-render the disagreement map with the updated position
-5. Report the council's reasoning to the user
+# ftm-council Integration
+
+## When Council Is Invoked
+
+- Deep mode only (standard and quick skip council)
+- After normalize & dedup (Phase 1 of synthesis)
+- Input: all claims with agent_count >= 2, plus high-confidence unique claims (confidence > 0.8)
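The input rule in the last bullet can be read as a simple filter. The following is a minimal Python sketch of that reading, not code from the package. The helper name `select_council_input` and the `confidence` field are assumptions: the payload example in this file carries `agent_count` and `credibility_score`, and does not show where the confidence value is stored.

```python
def select_council_input(claims, min_agents=2, min_confidence=0.8):
    """Return the claims that would be forwarded to the council.

    Hypothetical helper: keeps claims reported by at least `min_agents`
    agents, plus claims whose (assumed) `confidence` field is strictly
    above `min_confidence`.
    """
    selected = []
    for claim in claims:
        corroborated = claim.get("agent_count", 0) >= min_agents
        high_confidence = claim.get("confidence", 0.0) > min_confidence
        if corroborated or high_confidence:
            selected.append(claim)
    return selected


claims = [
    {"id": "f-001", "agent_count": 3, "confidence": 0.6},
    {"id": "f-002", "agent_count": 1, "confidence": 0.9},
    {"id": "f-003", "agent_count": 1, "confidence": 0.8},
]
selected_ids = [c["id"] for c in select_council_input(claims)]
```

Under this reading, a unique claim at exactly 0.8 is left out because the file states the threshold as strictly greater than 0.8.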
+
+---
+
+## Interface Contract
+
+ftm-researcher prepares a structured prompt for ftm-council:
+
+```
+Evaluate these research findings for accuracy, completeness, and potential bias.
+For each claim below, independently assess:
+1. Is the evidence sufficient to support this claim?
+2. What would make this claim wrong?
+3. Are there alternative explanations the research may have missed?
+4. Rate your confidence in this claim (0-1).
+
+[claims formatted as numbered list with evidence and sources]
+
+Return your assessment for each claim with: verdict (supported/contested/insufficient),
+confidence, and reasoning.
+```
+
+### Payload Format
+
+```json
+{
+  "context": "Research evaluation for: [query]",
+  "claims": [
+    {
+      "id": "f-001",
+      "claim": "...",
+      "evidence": "...",
+      "sources": ["url1", "url2"],
+      "source_types": ["peer_reviewed", "blog"],
+      "agent_count": 3,
+      "credibility_score": 0.78
+    }
+  ],
+  "evaluation_criteria": "accuracy, completeness, potential bias"
+}
+```
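As an illustration of the payload shape above, here is a minimal Python sketch that serializes claims into that structure. The `Claim` dataclass and `build_council_payload` are hypothetical names introduced for this example; only the JSON keys come from the file.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Claim:
    # Field names mirror the keys in the payload example.
    id: str
    claim: str
    evidence: str
    sources: list = field(default_factory=list)
    source_types: list = field(default_factory=list)
    agent_count: int = 1
    credibility_score: float = 0.0


def build_council_payload(query, claims):
    """Hypothetical helper: serialize claims into the payload shape."""
    return json.dumps(
        {
            "context": f"Research evaluation for: {query}",
            "claims": [asdict(c) for c in claims],
            "evaluation_criteria": "accuracy, completeness, potential bias",
        },
        indent=2,
    )


example = build_council_payload(
    "example query",
    [Claim("f-001", "...", "...", ["url1", "url2"], ["peer_reviewed", "blog"], 3, 0.78)],
)
```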
+
+### Expected Response Format
+
+```json
+{
+  "evaluations": [
+    {
+      "claim_id": "f-001",
+      "verdict": "supported | contested | insufficient",
+      "confidence": 0.85,
+      "reasoning": "...",
+      "what_would_make_this_wrong": "...",
+      "alternative_explanations": ["..."]
+    }
+  ],
+  "provider_positions": {
+    "claude": { "f-001": "supported", ... },
+    "codex": { "f-001": "contested", ... },
+    "gemini": { "f-001": "supported", ... }
+  }
+}
+```
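A consumer of this response would probably want to check each entry before mapping it to a tier. The sketch below is one hedged way to do that in Python. `validate_evaluation` is a hypothetical helper, and the checks it applies (a single verdict out of the three listed, a confidence between 0 and 1) are inferred from the example rather than stated as requirements by the file.

```python
ALLOWED_VERDICTS = {"supported", "contested", "insufficient"}


def validate_evaluation(evaluation):
    """Return a list of problems found in one entry of `evaluations`.

    Hypothetical check; an empty list means the entry looks well-formed.
    """
    problems = []
    if not evaluation.get("claim_id"):
        problems.append("missing claim_id")
    if evaluation.get("verdict") not in ALLOWED_VERDICTS:
        problems.append("verdict must be supported, contested, or insufficient")
    confidence = evaluation.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        problems.append("confidence must be a number between 0 and 1")
    return problems
```

Note that the placeholder string `"supported | contested | insufficient"` from the example would itself fail this check, since a real response is expected to pick one of the three values.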
+
+---
+
+## How Council Results Map Back
+
+| Council Verdict | Mapping |
+|---|---|
+| All 3 providers: "supported" | consensus tier |
+| 2 agree "supported", 1 contests | consensus tier with minority note |
+| 2 contest, 1 supports | contested tier |
+| All 3 contest | refuted tier |
+| Mixed with "insufficient" | unique_insights tier (needs more evidence) |
+| 2 "insufficient", 1 "supported" | unique_insights tier |
+| 2 "insufficient", 1 "contested" | refuted tier (not enough evidence to contest = rejection) |
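The table above can be sketched as a lookup over verdict counts. This is one possible reading in Python, not the package's implementation: `map_council_verdicts` is a hypothetical name, and the handling of combinations the table does not list explicitly (for example three "insufficient" verdicts) is an assumption marked in the comments.

```python
from collections import Counter


def map_council_verdicts(verdicts):
    """Map three provider verdicts to a (tier, note) pair.

    One possible reading of the mapping table. Unlisted combinations
    are resolved by assumption, as commented below.
    """
    if len(verdicts) != 3:
        raise ValueError("expected exactly three provider verdicts")
    counts = Counter(verdicts)
    supported = counts["supported"]
    contested = counts["contested"]
    insufficient = counts["insufficient"]
    if supported + contested + insufficient != 3:
        raise ValueError("unknown verdict in input")

    if supported == 3:
        return ("consensus", "")
    if supported == 2 and contested == 1:
        return ("consensus", "minority note")
    if contested == 2 and supported == 1:
        return ("contested", "")
    if contested == 3:
        return ("refuted", "")
    if insufficient == 2 and contested == 1:
        return ("refuted", "not enough evidence to contest")
    if supported == 1 and contested == 1 and insufficient == 1:
        # Three different verdicts: the file's tie-breaking rules place
        # this case in contested with full position details.
        return ("contested", "full position details")
    # Everything left involves "insufficient". Treating all of these,
    # including three "insufficient" verdicts, as unique_insights is an
    # assumption based on the 'Mixed with "insufficient"' row.
    return ("unique_insights", "needs more evidence")
```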
+
+### Tie-Breaking Rules
+
+When the mapping is ambiguous:
+1. Prefer the more conservative tier (contested over consensus, refuted over unique_insights)
+2. If all three providers give different verdicts, place in contested with full position details
+3. If confidence scores diverge significantly (spread > 0.3), flag as high-uncertainty
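Rules 1 and 3 lend themselves to a short sketch. The Python below is illustrative only. Both function names are hypothetical, the spread is assumed to be the maximum confidence minus the minimum (the file does not define it), and rule 1 is encoded only for the two tier pairs the file names, because no full ordering of tiers is given.

```python
# Pairs named in rule 1, keyed by the two tiers, valued by the preferred one.
CONSERVATIVE_CHOICE = {
    frozenset({"consensus", "contested"}): "contested",
    frozenset({"unique_insights", "refuted"}): "refuted",
}


def prefer_conservative(tier_a, tier_b):
    """Pick the more conservative of two candidate tiers (rule 1)."""
    if tier_a == tier_b:
        return tier_a
    key = frozenset({tier_a, tier_b})
    if key not in CONSERVATIVE_CHOICE:
        # The file ranks only two pairs; other pairs are left undecided here.
        raise ValueError("rule 1 does not rank this pair of tiers")
    return CONSERVATIVE_CHOICE[key]


def is_high_uncertainty(confidences, max_spread=0.3):
    """Flag a claim when provider confidences diverge (rule 3).

    Spread is assumed to be max minus min.
    """
    if not confidences:
        return False
    return max(confidences) - min(confidences) > max_spread
```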
+
+---
+
+## Fallback: Standalone Challengers
+
+When ftm-council is unavailable (Codex CLI or Gemini CLI not installed):
+
+Spawn 2 agents on the `review` model from ftm-config:
+
+### Devil's Advocate Agent
+
+```
+You are the Devil's Advocate in a research pipeline.
+
+Your sole purpose is to find reasons each claim is WRONG.
+
+For each claim below:
+1. Search for counter-evidence using WebSearch
+2. Identify logical gaps in the reasoning
+3. Flag claims supported by only one source type
+4. Check if the evidence actually supports the claim or if the claim overstates the evidence
+5. Look for cherry-picked data or survivorship bias
+
+Be adversarial. The goal is to stress-test, not to confirm.
+
+CLAIMS TO CHALLENGE:
+[formatted list of claims with evidence]
+
+RETURN FORMAT:
+For each claim challenged, return:
+- claim_challenged: [the claim text]
+- challenge_type: counter_evidence | logical_gap | single_source | overstated | bias
+- counter_evidence: [what you found that contradicts or weakens the claim]
+- severity: high | medium | low
+- recommendation: reject | weaken | flag_for_review | accept_with_caveat
+```
+
+### Edge Case Hunter Agent
+
+```
+You are the Edge Case Hunter in a research pipeline.
+
+Your sole purpose is to find where each claim BREAKS.
+
+For each claim below:
+1. What happens at scale? (10x, 100x, 1000x users/data/requests)
+2. What happens under adversarial conditions? (malicious input, DDoS, data poisoning)
+3. What about accessibility? (screen readers, keyboard-only, low bandwidth)
+4. What about the 1% case? (rare but catastrophic failure modes)
+5. What about 5 years from now? (technology shifts, dependency deprecation, scaling limits)
+6. What happens when the key assumption changes? (the market shifts, the API breaks, the team grows)
+
+CLAIMS TO STRESS-TEST:
+[formatted list of claims with evidence]
+
+RETURN FORMAT:
+For each claim stressed, return:
+- claim_challenged: [the claim text]
+- challenge_type: scale | adversarial | accessibility | edge_case | longevity | assumption_shift
+- failure_scenario: [specific scenario where this claim breaks]
+- severity: high | medium | low
+- recommendation: reject | weaken | flag_for_review | accept_with_caveat
+```
+
+### Fallback Mapping
+
+Map challenger results to tiers:
+
+| Challenger Result | Mapping |
+|---|---|
+| No challenges from either agent | consensus |
+| Challenges with weak counter-evidence (low severity) | consensus with note |
+| One agent challenges with medium severity | contested |
+| Both agents challenge with medium+ severity | contested (strong) |
+| Multiple high-severity challenges | refuted |
+| Only edge case challenges, no factual counter-evidence | consensus with edge-case notes |
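The fallback table does not say which row wins when several apply, so any implementation has to pick an order. The Python sketch below shows one such choice and should be read as an assumption throughout: `map_challenger_results`, the `agent` key, and the agent labels `devils_advocate` and `edge_case_hunter` are hypothetical, and only the `severity` values come from the prompts in this file.

```python
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3}


def map_challenger_results(challenges):
    """Map the challenges raised against one claim to a tier label.

    `challenges` is a list of dicts with an assumed `agent` key and a
    `severity` key. The order of the checks below is an assumption.
    """
    if not challenges:
        return "consensus"
    ranks = [SEVERITY_RANK[c["severity"]] for c in challenges]
    if sum(1 for rank in ranks if rank == 3) >= 2:
        return "refuted"
    agents = {c["agent"] for c in challenges}
    if agents == {"edge_case_hunter"}:
        # No factual counter-evidence from the Devil's Advocate.
        return "consensus with edge-case notes"
    if max(ranks) == 1:
        return "consensus with note"
    strong_agents = {
        c["agent"] for c in challenges if SEVERITY_RANK[c["severity"]] >= 2
    }
    if len(strong_agents) == 2:
        return "contested (strong)"
    # One agent at medium severity or above; a single high-severity
    # challenge also lands here, which the table does not cover.
    return "contested"
```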
+
+---
+
+## Council Availability Detection
+
+Before invoking ftm-council, check availability:
+
+1. Check if `codex` CLI is installed: `which codex`
+2. Check if `gemini` CLI is installed: `which gemini`
+3. If both are available: use full council
+4. If only one is available: use 2-provider council (reduced confidence in verdicts)
+5. If neither is available: use fallback challenger agents
+
+Log the availability status in the research metadata.
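The detection steps above could be sketched in Python with `shutil.which`, the standard-library counterpart of the shell `which` lookup. The function name `detect_council_mode` and the three mode labels are hypothetical; the file names the outcomes but not any identifiers for them.

```python
import shutil


def detect_council_mode(which=shutil.which):
    """Return (mode, status) based on which provider CLIs are on PATH.

    `which` is injectable so the check can be exercised without the
    CLIs installed. Mode names are hypothetical labels for the three
    outcomes listed in the steps above.
    """
    status = {name: which(name) is not None for name in ("codex", "gemini")}
    available = sum(status.values())
    if available == 2:
        mode = "full_council"
    elif available == 1:
        mode = "two_provider_council"
    else:
        mode = "fallback_challengers"
    return mode, status
```

The returned `status` dict is the kind of value that could be written into the research metadata, as the last line of the section asks.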
+
+---
+
+## Per-Claim Council Invocation
+
+The conversational iteration protocol supports council invocation for individual claims:
+
+When the user says "council #N":
+1. Extract finding N from the current research state
+2. Send ONLY that claim to ftm-council with full evidence
+3. Update the claim's tier based on council verdict
+4. Re-render the disagreement map with the updated position
+5. Report the council's reasoning to the user
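Step 1 of the per-claim flow, recognizing "council #N" and pulling out finding N, might look like the Python sketch below. Both helpers are hypothetical, and the assumption that findings are numbered from 1 in the order they are stored is mine; the file does not say how findings are indexed.

```python
import re

# Matches messages such as "council #3"; case and spacing are tolerated.
COUNCIL_COMMAND = re.compile(r"^\s*council\s*#(\d+)\s*$", re.IGNORECASE)


def parse_council_command(message):
    """Return the finding number from "council #N", or None if no match."""
    match = COUNCIL_COMMAND.match(message)
    return int(match.group(1)) if match else None


def extract_finding(findings, number):
    """Return finding `number` (assumed 1-based), or None if out of range."""
    if 1 <= number <= len(findings):
        return findings[number - 1]
    return None
```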