@balpal4495/quorum 0.1.0
- package/.github/copilot-instructions.md +94 -0
- package/CLAUDE.md +86 -0
- package/GEMINI.md +73 -0
- package/LICENSE +21 -0
- package/README.md +202 -0
- package/SETUP.md +256 -0
- package/bin/init.js +366 -0
- package/modules/AGENTS.md +66 -0
- package/modules/CLAUDE.md +64 -0
- package/modules/README.md +251 -0
- package/modules/council/advisors.ts +68 -0
- package/modules/council/chairman.ts +112 -0
- package/modules/council/deliberate.ts +106 -0
- package/modules/council/frame.ts +54 -0
- package/modules/council/index.ts +4 -0
- package/modules/council/personas.ts +57 -0
- package/modules/council/reviewers.ts +81 -0
- package/modules/council/types.ts +45 -0
- package/modules/jury/evaluate.ts +112 -0
- package/modules/jury/index.ts +3 -0
- package/modules/jury/schema.ts +15 -0
- package/modules/jury/types.ts +31 -0
- package/modules/oracle/adapters/lance-db.ts +81 -0
- package/modules/oracle/adapters/xenova-embedder.ts +43 -0
- package/modules/oracle/bm25.ts +92 -0
- package/modules/oracle/index.ts +36 -0
- package/modules/oracle/log.ts +15 -0
- package/modules/oracle/propose.ts +148 -0
- package/modules/oracle/query.ts +145 -0
- package/modules/oracle/summary.ts +115 -0
- package/modules/oracle/types.ts +32 -0
- package/modules/sentinel/assert.ts +95 -0
- package/modules/sentinel/coverage.ts +106 -0
- package/modules/sentinel/drift.ts +159 -0
- package/modules/sentinel/index.ts +6 -0
- package/modules/sentinel/review.ts +207 -0
- package/modules/setup.ts +153 -0
- package/modules/shared/types.ts +148 -0
- package/package.json +47 -0
@@ -0,0 +1,64 @@
# modules/ — Claude Instructions

Supplements the root-level instructions. Read this when working inside the `modules/` folder.

---

## What these modules are

Three portable TypeScript modules — Oracle, Jury, Council — that form the knowledge and reasoning layer of an agentic workflow. They are designed to be dropped into any Node.js codebase.

The entry point for a host application is `setup.ts`. Everything else is internal.

---

## Key design decisions to preserve

### Dependency injection throughout

No module imports a specific LLM provider, vector store, or embedder. All external dependencies are passed in as function arguments or via a deps object. If you add a new capability, follow this pattern — do not hardcode providers.
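
A minimal sketch of this pattern, not code from the package: the `LLMProvider` shape is inferred from how providers are called elsewhere in these modules, and `summariseEntry`, `SummariseDeps`, and `stubLLM` are hypothetical names used only for illustration.

```typescript
// Assumed shape: chat messages plus an optional model name in, text out.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string }
type LLMProvider = (messages: ChatMessage[], model?: string) => Promise<string>

// Hypothetical deps object: the capability receives its provider and never imports one.
interface SummariseDeps {
  llm: LLMProvider
  model?: string
}

// Hypothetical capability written in the injected style.
async function summariseEntry(text: string, deps: SummariseDeps): Promise<string> {
  return deps.llm(
    [
      { role: "system", content: "Summarise the entry in one sentence." },
      { role: "user", content: text },
    ],
    deps.model,
  )
}

// A stub provider is enough to exercise the capability without any network call.
const stubLLM: LLMProvider = async (messages, model = "stub-model") => {
  const last = messages[messages.length - 1]
  return `${model}: ${last ? last.content.length : 0} chars`
}
```

Because the provider arrives through `deps`, swapping a real client for the stub requires no change to `summariseEntry`.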

### council_brief is computed, not trusted

In `jury/evaluate.ts`, the `council_brief` field in the LLM response is **always overridden** based on the numeric `confidence` value after parsing. The LLM is not trusted to compute this correctly. Do not remove this override.
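
The override can be sketched as follows. The 0.6 cut-off is taken from the doc comment in `council/deliberate.ts`; `councilBriefFor` and `applyBriefOverride` are hypothetical helper names and may not match the actual code in `jury/evaluate.ts`.

```typescript
type CouncilBrief = "challenge" | "pressure-test"

// Cut-off from the deliberate.ts doc comment: confidence < 0.6 means "challenge".
const BRIEF_THRESHOLD = 0.6

// Hypothetical helper: derive the brief from the numeric score alone.
function councilBriefFor(confidence: number): CouncilBrief {
  return confidence < BRIEF_THRESHOLD ? "challenge" : "pressure-test"
}

// Whatever the LLM put in council_brief is discarded and recomputed.
function applyBriefOverride<T extends { confidence: number }>(
  parsed: T,
): T & { council_brief: CouncilBrief } {
  return { ...parsed, council_brief: councilBriefFor(parsed.confidence) }
}

// The LLM claimed "pressure-test" with a low score; the override corrects it.
const fixed = applyBriefOverride({ confidence: 0.42, council_brief: "pressure-test" })
```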

### Throw on bad LLM output — never default to passing

Both `jury/evaluate.ts` and `council/chairman.ts` throw if the LLM returns non-JSON or output that fails Zod validation. This is intentional. A silently passing Jury score is worse than an error. Do not add fallbacks or defaults.
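
To illustrate the rule without pulling in Zod, here is a hand-written sketch of the same idea. The field names follow the chairman schema in `council/chairman.ts`; `parseVerdictStrict` is a hypothetical function, and the manual checks stand in for the schema validation the package actually uses.

```typescript
type Recommendation = "proceed" | "redesign" | "investigate-more"

interface Verdict {
  satisfied: boolean
  verdict: string
  challenges: string[]
  evidence_cited: string[]
  recommendation: Recommendation
}

const RECOMMENDATIONS: readonly string[] = ["proceed", "redesign", "investigate-more"]

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every(item => typeof item === "string")
}

// Hypothetical strict parser: every failure path throws, none returns a default.
function parseVerdictStrict(raw: string): Verdict {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    throw new Error(`LLM returned non-JSON: ${raw.slice(0, 300)}`)
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("LLM output is not a JSON object")
  }
  const { satisfied, verdict, challenges, evidence_cited, recommendation } =
    parsed as Record<string, unknown>
  if (
    typeof satisfied !== "boolean" ||
    typeof verdict !== "string" ||
    verdict.length === 0 ||
    !isStringArray(challenges) ||
    !isStringArray(evidence_cited) ||
    typeof recommendation !== "string" ||
    !RECOMMENDATIONS.includes(recommendation)
  ) {
    throw new Error("LLM output failed validation")
  }
  return {
    satisfied,
    verdict,
    challenges,
    evidence_cited,
    recommendation: recommendation as Recommendation,
  }
}
```

The point of the sketch is the absence of any `return defaultVerdict` branch: a caller either gets a fully validated object or an exception.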

### oracle.commit() is a human gate

`council/deliberate.ts` calls `oracle.propose()` at the end of every deliberation. It never calls `oracle.commit()`. If you see a code path that calls `oracle.commit()` without explicit human input, that is a bug.
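
One way a host application might enforce this gate is sketched below. Everything here is an assumption for illustration: `HumanApproval`, `CommitCapable`, and `commitWithApproval` are hypothetical, and the real `OracleClient` interface in `shared/types.ts` may expose `commit()` with a different return type.

```typescript
// Hypothetical record of an explicit human decision.
interface HumanApproval {
  proposalId: string
  approvedBy: string
}

// Only the one method needed here; the real OracleClient interface is larger.
interface CommitCapable {
  commit(proposalId: string): Promise<unknown>
}

// Hypothetical guard: commit is reachable only through a matching approval.
async function commitWithApproval(
  oracle: CommitCapable,
  proposalId: string,
  approval: HumanApproval | undefined,
): Promise<void> {
  if (!approval || approval.proposalId !== proposalId || approval.approvedBy.trim() === "") {
    throw new Error(`Refusing to commit ${proposalId}: no explicit human approval`)
  }
  await oracle.commit(proposalId)
}
```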

### Query logging is best-effort

`oracle/log.ts` writes to a JSONL file. The `query()` function wraps this in a try/catch that swallows errors silently. This is correct behaviour — a log write failure must never fail a query.
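
The shape of that try/catch can be sketched like this. `QueryLogger` and `queryWithBestEffortLog` are hypothetical names; the real `query()` embeds this logic rather than taking the logger as a parameter, as far as the description above indicates.

```typescript
// Hypothetical logger signature; the real one in oracle/log.ts appends to a JSONL file.
type QueryLogger = (entry: { query: string; resultCount: number }) => Promise<void>

// Hypothetical wrapper: run the query, then try to log it, swallowing log failures.
async function queryWithBestEffortLog<T>(
  query: string,
  runQuery: (q: string) => Promise<T[]>,
  log: QueryLogger,
): Promise<T[]> {
  const results = await runQuery(query)
  try {
    await log({ query, resultCount: results.length })
  } catch {
    // Deliberately ignored: a failed log write must not fail the query.
  }
  return results
}
```

Note that only the log call sits inside the `try`; an error from `runQuery` itself still propagates to the caller.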

---

## When modifying oracle/query.ts

The retrieval pipeline has two passes:
1. **Vector search** — embed query, retrieve `limit × 3` candidates from the vector store
2. **BM25 re-ranking** — score candidates, enrich query with domain terms from Pass 1, fuse ranks via RRF

RRF constant is `k=60`. Score threshold default is `0.031`. Results below the threshold are dropped entirely — not returned as low-confidence results. If you change the threshold, update the default in `query.ts` and the `QueryOptions` type comment in `shared/types.ts`.
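
The fusion and threshold step can be sketched with the textbook reciprocal rank fusion formula. This is an illustration under stated assumptions, not the code in `oracle/query.ts`: ranks are taken to be 1-based, and `fuseRanks` and `applyThreshold` are hypothetical names.

```typescript
const RRF_K = 60 // constant named in the text
const SCORE_THRESHOLD = 0.031 // default threshold named in the text

// Standard reciprocal rank fusion: each ranking contributes 1 / (k + rank).
// Assumption: ranks are 1-based; the package's own convention is not shown here.
function fuseRanks(rankings: string[][], k: number = RRF_K): Map<string, number> {
  const scores = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1))
    })
  }
  return scores
}

// Entries below the threshold are dropped, not returned with a low score.
function applyThreshold(
  scores: Map<string, number>,
  threshold: number = SCORE_THRESHOLD,
): string[] {
  return Array.from(scores.entries())
    .filter(([, score]) => score >= threshold)
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id)
}

const vectorRanking = ["a", "b", "c"]
const bm25Ranking = ["a", "c", "d"]
const kept = applyThreshold(fuseRanks([vectorRanking, bm25Ranking]))
// kept → ["a", "c"]: only entries present in both rankings clear 0.031
```

Under these assumptions an entry found by one pass alone scores at most 1/61 ≈ 0.016, so the 0.031 default effectively requires an entry to rank near the top of both lists.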

---

## When modifying council/deliberate.ts

The pipeline order is fixed: `frameQuestion → fanOutAdvisors → fanOutReviewers → chairman → oracle.propose()`. The phases run one after another, because reviewers need the advisor responses. Within each phase, the advisor calls and the reviewer calls run in parallel via `Promise.all`. Do not make the calls inside a phase sequential; that defeats the independence of the panel.

Anonymisation of advisor responses happens inside `fanOutReviewers()` before any reviewer sees them. It must stay there.
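
What anonymisation might look like is sketched below. The `AdvisorResponse` shape matches `council/advisors.ts`; the `AnonymisedResponse` type, the `anonymise` helper, and the "Advisor N" labelling scheme are assumptions, since `council/reviewers.ts` is not reproduced here.

```typescript
interface AdvisorResponse {
  persona: string
  response: string
}

// Hypothetical anonymised form: persona names replaced by neutral labels.
interface AnonymisedResponse {
  label: string
  response: string
}

// Assumed labelling scheme ("Advisor 1", "Advisor 2", ...); reviewers.ts may differ.
function anonymise(responses: AdvisorResponse[]): AnonymisedResponse[] {
  return responses.map((r, i) => ({ label: `Advisor ${i + 1}`, response: r.response }))
}
```

Doing this inside the reviewer fan-out, rather than in the caller, means no reviewer prompt can be built from the un-anonymised list by accident.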

---

## Safe to change

- `council/personas.ts` — add or adjust personas freely
- `models` defaults in `setup.ts` — adjust model names as providers evolve
- BM25 constants (`K1`, `B`) in `oracle/bm25.ts` — tunable, well-commented
- `CANDIDATE_MULTIPLIER` and `RRF_K` in `oracle/query.ts` — tunable retrieval parameters

## Do not change without strong reason

- The `VectorStore` interface in `oracle/types.ts` — changing it breaks all adapters
- The `ChronicleEntry` type in `shared/types.ts` — changing it breaks stored data
- The Zod schemas in `jury/schema.ts` and `council/chairman.ts` — these are the output contracts
- The `OracleClient` interface in `shared/types.ts` — Jury and Council depend on it
@@ -0,0 +1,251 @@
# Oracle · Jury · Council · Sentinel

Four portable modules for the knowledge and reasoning layer of any agentic workflow.
Drop the `modules/` folder into your project and wire up the dependencies.

```
Oracle → Jury → Council → human gate → Executor
Sentinel → coverage + drift + PR coverage map
```

---

## Modules

| Module | Responsibility | LLM? |
|---|---|---|
| **Oracle** | Query and write interface to Chronicle (the persistent knowledge store) | No |
| **Jury** | Evaluate a design against Oracle evidence — produces a confidence score | Yes |
| **Council** | Adversarial validation via parallel advisor/reviewer fan-out — produces a verdict | Yes |
| **Sentinel** | Chronicle coverage reporting, drift detection, and PR coverage maps | Optional |

---

## Chronicle

Chronicle is the data that underpins the system. It is not a module — it lives at `.chronicle/` in your project root.

```
.chronicle/
  committed/    ← approved entries as JSON (committed to git, source of truth)
  proposals/    ← staged entries awaiting human approval (JSON, not indexed yet)
  SUMMARY.md    ← auto-generated agent context, rebuilt on every commit
```

Every entry goes through `oracle.propose()` → human approval → `oracle.commit()`. There are no auto-commits.

---

## Dependencies

**Required** (must be in your project):
```
zod
```

**Optional** — only needed if using the included default adapters:
```
vectordb               ← LanceDB adapter (oracle/adapters/lance-db.ts)
@xenova/transformers   ← local ONNX embedder (oracle/adapters/xenova-embedder.ts)
```

You can substitute any vector store and embedder by implementing the `VectorStore` and `embedder` interfaces.

---

## TypeScript

Requires TypeScript 4.7+ and `zod` v3.

Recommended `tsconfig.json` settings:
```json
{
  "compilerOptions": {
    "strict": true,
    "moduleResolution": "node"
  }
}
```

---

## Quick start

```typescript
import { createOracleClient, xenovaEmbed, createLanceDBStore } from "./modules/oracle"
import { evaluate } from "./modules/jury"
import { deliberate } from "./modules/council"

// 1. Wire Oracle (no LLM required)
const oracle = createOracleClient({
  embedder: xenovaEmbed,
  vectorStore: await createLanceDBStore(".chronicle"),
})

// 2. Retrieve evidence for the task at hand
const evidence = await oracle.query("authentication patterns in this codebase")

// 3. Jury evaluates the design against the evidence
const juryOutput = await evaluate(
  {
    outcome: "Add JWT authentication to the API",
    design: "RS256 tokens, 15-min expiry, refresh rotation in httpOnly cookies",
    evidence,
  },
  { llm: yourLLMProvider, model: "gpt-4o-mini" },
)

// 4. Council validates adversarially
const verdict = await deliberate(
  {
    outcome: "Add JWT authentication to the API",
    design: "RS256 tokens, 15-min expiry, refresh rotation in httpOnly cookies",
    evidence,
    jury_output: juryOutput,
  },
  {
    llm: yourLLMProvider,
    oracle,
    models: {
      frame: "gpt-4o-mini",
      advisors: "gpt-4o-mini",
      reviewers: "gpt-4o",
      chairman: "gpt-4o",
    },
  },
)

// 5. Route on verdict
if (verdict.satisfied) {
  // → human gate → Executor
} else if (verdict.recommendation === "redesign") {
  // → return to Designer with verdict.verdict as feedback
} else {
  // → return to Detective with juryOutput.gaps
}

// 6. Human approves the proposed Chronicle entry
// The Council automatically called oracle.propose() — you just need to commit:
// await oracle.commit(proposalId)
```

---

## LLM provider interface

The `LLMProvider` type is a simple function. Wire it to any provider:

```typescript
import type { LLMProvider } from "./modules/shared/types"

// OpenAI example
const openaiProvider: LLMProvider = async (messages, model = "gpt-4o") => {
  const res = await openai.chat.completions.create({ model, messages })
  return res.choices[0].message.content ?? ""
}

// Anthropic example
const anthropicProvider: LLMProvider = async (messages, model = "claude-3-5-sonnet-20241022") => {
  const system = messages.find(m => m.role === "system")?.content ?? ""
  const userMessages = messages.filter(m => m.role !== "system")
  const res = await anthropic.messages.create({ model, system, messages: userMessages, max_tokens: 2048 })
  return res.content[0].type === "text" ? res.content[0].text : ""
}
```

---

## Output routing

### Jury

| `recommendation` | Next step |
|---|---|
| `proceed` | Pass to Council |
| `investigate-more` | Return to Detective with `gaps` |
| `redesign` | Return to Designer |

### Council

| `satisfied` | `recommendation` | Next step |
|---|---|---|
| `true` | `proceed` | Human gate → Executor |
| `false` | `redesign` | Return to Designer with `verdict` |
| `false` | `investigate-more` | Return to Detective with `juryOutput.gaps` |
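
The Council table can be expressed as a small routing function. This is a sketch for a host application, not part of the modules: `NextStep`, its labels, and `routeCouncilVerdict` are hypothetical, and `CouncilVerdict` lists only the two fields the table uses.

```typescript
type Recommendation = "proceed" | "redesign" | "investigate-more"

// Only the fields the routing table depends on.
interface CouncilVerdict {
  satisfied: boolean
  recommendation: Recommendation
}

// Hypothetical destination labels for the host application's router.
type NextStep = "human-gate" | "designer" | "detective"

function routeCouncilVerdict(v: CouncilVerdict): NextStep {
  if (v.satisfied) return "human-gate"
  if (v.recommendation === "redesign") return "designer"
  return "detective"
}
```

A contradictory verdict (`satisfied: false` with `proceed`) is not covered by the table; in this sketch it falls through to the Detective branch, and a host may prefer to treat it as an error instead.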

---

## Sentinel

Sentinel is the health and visibility layer. It operates independently of the Oracle → Jury → Council pipeline and has no LLM dependency for its core functions.

### Coverage

Reports which source files have Chronicle entries and which are blind spots.

```typescript
import { coverage } from "./modules/sentinel"

const report = await coverage(".chronicle", "src", {
  excludeTestFiles: true, // default — __tests__/, *.test.ts, *.spec.ts are excluded
})
// report.percentage, report.uncoveredFiles, report.coverageByFile
```

### Drift detection

For each Chronicle entry, asks the LLM whether the `key_insight` still accurately describes the current code. Advisory only — never modifies entries.

```typescript
import { detectDrift } from "./modules/sentinel"

const report = await detectDrift(".chronicle", "src", llmProvider)
// report.flags (potentially stale), report.confirmed, report.skipped
```

### Vitest assertions

Drop into any Vitest suite to get coverage and drift as named tests.

```typescript
import { describe } from "vitest"
import { sentinelAssertions } from "./modules/sentinel"

const assertions = sentinelAssertions({
  chronicleDir: ".chronicle",
  codebasePath: "src", // defaults to "." — scan from project root
  llm: myLLMProvider, // omit to skip drift tests
  minCoveragePercent: 50, // default 0 = advisory only, never fails CI
})

describe("sentinel", () => { assertions.forEach(a => a()) })
```

`minCoveragePercent: 0` (the default) means the coverage test is purely advisory — it logs gaps to the console but never fails the build. Raise it as the project matures.
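
The advisory behaviour described above can be sketched as a tiny gate function. `coverageGate` and `CoverageGateResult` are hypothetical; the actual check lives in `sentinel/assert.ts`, which is not reproduced here and may be structured differently.

```typescript
interface CoverageGateResult {
  pass: boolean
  advisory: boolean
}

// Hypothetical helper mirroring the described behaviour of minCoveragePercent.
function coverageGate(percentage: number, minCoveragePercent: number = 0): CoverageGateResult {
  // A threshold of 0 (the default) turns the check into a report that cannot fail.
  const advisory = minCoveragePercent <= 0
  return { pass: advisory || percentage >= minCoveragePercent, advisory }
}
```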

### PR coverage map

`sentinel/review.ts` exports `reviewContext(changedFiles, chronicleDir, codebasePath)` — used by the `sentinel-pr.yml` GitHub Actions workflow to post a PR comment showing the full-project coverage table and a colour-coded Mermaid heatmap. Test files are excluded from the scan.

---

## Running tests

Tests use [Vitest](https://vitest.dev/). Add to your project's test config or run directly:

```bash
npx vitest run modules/
```

---

## What these modules do NOT include

The following are application-specific and must be built in the host project:

- **Detective** — investigation and task intake
- **Designer** — solution proposal
- **Executor** — task execution (existing tools/agents)
- **Validator** — satisfaction evaluator on implementation
- **Human gate UI** — approval interface for Chronicle proposals
- **Workflow orchestration** — LangGraph, Inngest, or equivalent
@@ -0,0 +1,68 @@
import type { LLMProvider, OracleResult } from "../shared/types"
import type { AdvisorPersona } from "./personas"

export interface AdvisorResponse {
  persona: string
  response: string
}

function formatEvidence(evidence: OracleResult[]): string {
  if (evidence.length === 0) {
    return "No Oracle entries are available. Reason from absence of evidence — name what is missing."
  }
  return evidence
    .map(e =>
      `[${e.id}] (${e.status})\n${e.key_insight}\nAreas: ${e.affected_areas.join(", ")}`,
    )
    .join("\n\n")
}

/**
 * Run all advisors in parallel.
 * Each advisor receives the framed question, the Oracle evidence pack,
 * and their persona's system prompt fragment.
 *
 * Advisors MUST cite specific Oracle entry IDs — this is enforced in the prompt.
 */
export async function fanOutAdvisors(
  framedQuestion: string,
  evidence: OracleResult[],
  personas: readonly AdvisorPersona[],
  llm: LLMProvider,
  model?: string,
): Promise<AdvisorResponse[]> {
  const evidenceText = formatEvidence(evidence)

  return Promise.all(
    personas.map(async (persona): Promise<AdvisorResponse> => {
      const systemPrompt = [
        `You are a Council advisor — ${persona.name}.`,
        "",
        persona.systemFragment,
        "",
        "Rules:",
        "- Reason ONLY from the Oracle evidence provided. Do not use general knowledge.",
        "- Cite specific Oracle entry IDs (e.g. [abc-123]) for every claim you make.",
        "- If the evidence is insufficient to support a claim, say so explicitly.",
        "- Keep your response focused and under 400 words.",
      ].join("\n")

      const userPrompt = [
        framedQuestion,
        "",
        "## Oracle Evidence",
        evidenceText,
      ].join("\n")

      const response = await llm(
        [
          { role: "system", content: systemPrompt },
          { role: "user", content: userPrompt },
        ],
        model,
      )

      return { persona: persona.name, response }
    }),
  )
}
@@ -0,0 +1,112 @@
import { z } from "zod"
import type { LLMProvider, OracleResult } from "../shared/types"
import type { AdvisorResponse } from "./advisors"
import type { ReviewerResponse } from "./reviewers"
import type { CouncilOutput } from "./types"

const ChairmanOutputSchema = z.object({
  satisfied: z.boolean(),
  verdict: z.string().min(1),
  challenges: z.array(z.string()),
  evidence_cited: z.array(z.string()),
  recommendation: z.enum(["proceed", "redesign", "investigate-more"]),
})

function formatAdvisors(responses: AdvisorResponse[]): string {
  return responses
    .map(r => `## ${r.persona}\n${r.response}`)
    .join("\n\n---\n\n")
}

function formatReviewers(responses: ReviewerResponse[]): string {
  return responses
    .map(r => `## ${r.reviewerId}\n${r.review}`)
    .join("\n\n---\n\n")
}

function formatEvidence(evidence: OracleResult[]): string {
  if (evidence.length === 0) return "No Oracle evidence."
  return evidence
    .map(
      e =>
        `[${e.id}] (${e.status}, confidence: ${e.confidence.toFixed(2)}) ${e.key_insight}`,
    )
    .join("\n")
}

const CHAIRMAN_SYSTEM_PROMPT = [
  "You are the Council Chairman. You synthesise the final verdict from all advisor and reviewer inputs.",
  "",
  "Your verdict must:",
  "1. Be grounded in Oracle evidence — cite specific entry IDs for every material conclusion",
  "2. Summarise what was challenged and what held up under scrutiny",
  "3. State a clear recommendation",
  "4. List every Oracle entry ID that materially influenced the verdict in evidence_cited",
  "",
  "satisfied = true → design holds up, can proceed to the human gate",
  "satisfied = false → fundamental flaw, unresolved gap, or design needs rework",
  "",
  "Return ONLY valid JSON — no markdown fences, no explanation:",
  JSON.stringify({
    satisfied: "<boolean>",
    verdict: "<string ≤400 words — clear synthesis>",
    challenges: ["<string — each challenge raised>"],
    evidence_cited: ["<Oracle entry ID>"],
    recommendation: "proceed | redesign | investigate-more",
  }),
].join("\n")

/**
 * Chairman synthesises the verdict from all advisor and reviewer inputs.
 * Every material conclusion must cite specific Oracle entry IDs.
 *
 * Throws if the LLM returns non-JSON or output fails schema validation.
 */
export async function chairman(
  advisorResponses: AdvisorResponse[],
  reviewerResponses: ReviewerResponse[],
  evidence: OracleResult[],
  llm: LLMProvider,
  model?: string,
): Promise<CouncilOutput> {
  const userPrompt = [
    "## Advisor Responses",
    formatAdvisors(advisorResponses),
    "",
    "## Reviewer Critiques",
    formatReviewers(reviewerResponses),
    "",
    "## Oracle Evidence",
    formatEvidence(evidence),
  ].join("\n")

  const raw = await llm(
    [
      { role: "system", content: CHAIRMAN_SYSTEM_PROMPT },
      { role: "user", content: userPrompt },
    ],
    model,
  )

  let parsed: unknown
  try {
    const cleaned = raw
      .replace(/^```(?:json)?\s*/m, "")
      .replace(/\s*```$/m, "")
      .trim()
    parsed = JSON.parse(cleaned)
  } catch {
    throw new Error(
      `Council chairman: LLM returned non-JSON. Raw (first 300 chars): ${raw.slice(0, 300)}`,
    )
  }

  const result = ChairmanOutputSchema.safeParse(parsed)
  if (!result.success) {
    throw new Error(
      `Council chairman: output failed schema validation. Issues: ${JSON.stringify(result.error.issues)}`,
    )
  }

  return result.data
}
@@ -0,0 +1,106 @@
import type { CouncilInput, CouncilOutput, CouncilDeps } from "./types"
import { DEFAULT_PERSONAS } from "./personas"
import { frameQuestion } from "./frame"
import { fanOutAdvisors } from "./advisors"
import { fanOutReviewers } from "./reviewers"
import { chairman } from "./chairman"

const DEFAULT_ADVISOR_COUNT = 5
const DEFAULT_REVIEWER_COUNT = 5

/**
 * Run the Council deliberation pipeline.
 *
 * Pipeline:
 *   1. frameQuestion — reframe outcome + design into a deliberation brief
 *   2. fanOutAdvisors — N advisors reason in parallel from Oracle evidence
 *   3. fanOutReviewers — N reviewers critique anonymised advisor responses in parallel
 *   4. chairman — synthesises verdict, cites Oracle entry IDs
 *   5. oracle.propose() — proposes verdict to Chronicle (human approval required to commit)
 *
 * The council_brief from jury_output determines framing tone:
 *   "challenge" → find what is wrong (Jury confidence < 0.6)
 *   "pressure-test" → try to break what looks solid (Jury confidence ≥ 0.6)
 *
 * Routing on output:
 *   satisfied: true → proceed to human gate → Executor
 *   satisfied: false, recommendation: redesign → return to Designer
 *   satisfied: false, recommendation: investigate-more → return to Detective with gaps list
 */
export async function deliberate(
  input: CouncilInput,
  deps: CouncilDeps,
): Promise<CouncilOutput> {
  const {
    llm,
    oracle,
    advisorCount = DEFAULT_ADVISOR_COUNT,
    reviewerCount = DEFAULT_REVIEWER_COUNT,
    models = {},
  } = deps

  // Select personas — cycle DEFAULT_PERSONAS if advisorCount > 5
  const personas = Array.from(
    { length: advisorCount },
    (_, i) => DEFAULT_PERSONAS[i % DEFAULT_PERSONAS.length],
  )

  // 1. Frame the deliberation question
  const framedQuestion = await frameQuestion(input, llm, models.frame)

  // 2. Advisors reason in parallel
  const advisorResponses = await fanOutAdvisors(
    framedQuestion,
    input.evidence,
    personas,
    llm,
    models.advisors,
  )

  // 3. Reviewers critique in parallel (advisor responses anonymised inside fanOutReviewers)
  const reviewerResponses = await fanOutReviewers(
    advisorResponses,
    input.evidence,
    reviewerCount,
    llm,
    models.reviewers,
  )

  // 4. Chairman synthesises verdict
  const verdict = await chairman(
    advisorResponses,
    reviewerResponses,
    input.evidence,
    llm,
    models.chairman,
  )

  // 5. Propose verdict to Oracle — human must call oracle.commit() to index it
  // Truncate to 200 chars so it passes propose()'s schema validation.
  const firstSentence = verdict.verdict.split(/[.!?]/)[0]?.trim() ?? ""
  const keyInsight = (firstSentence.length >= 20 ? firstSentence : verdict.verdict)
    .slice(0, 200)

  await oracle.propose({
    key_insight: keyInsight,
    affected_areas: extractAffectedAreas(input.outcome, input.design),
    status: "open",
    confidence: input.jury_output.confidence,
    source_module: "council",
    evidence_cited: verdict.evidence_cited,
  })

  return verdict
}

/**
 * Extract candidate affected areas from the outcome and design text.
 * Looks for capitalised noun phrases as a simple heuristic.
 * The host application may override by post-processing CouncilOutput.
 */
function extractAffectedAreas(outcome: string, design: string): string[] {
  const text = `${outcome} ${design}`
  const phrases = text.match(/\b[A-Z][a-zA-Z]+(?:\s[A-Z][a-zA-Z]+)*\b/g) ?? []
  const unique = [...new Set(phrases)]
  return unique.length > 0 ? unique.slice(0, 5) : ["general"]
}
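
The two heuristics that feed `oracle.propose()` can be tried in isolation. The sketch below copies the regular expression and the 20- and 200-character limits from `deliberate.ts`; `deriveKeyInsight` is a hypothetical name for the inline `key_insight` logic, and the sample strings are made up for illustration.

```typescript
// Hypothetical wrapper around the inline key_insight logic in deliberate().
function deriveKeyInsight(verdictText: string): string {
  const firstSentence = verdictText.split(/[.!?]/)[0]?.trim() ?? ""
  // Fall back to the whole verdict when the first sentence is shorter than 20 chars.
  return (firstSentence.length >= 20 ? firstSentence : verdictText).slice(0, 200)
}

// Same heuristic as extractAffectedAreas: runs of capitalised words, first five unique.
function extractAffectedAreas(outcome: string, design: string): string[] {
  const text = `${outcome} ${design}`
  const phrases = text.match(/\b[A-Z][a-zA-Z]+(?:\s[A-Z][a-zA-Z]+)*\b/g) ?? []
  const unique = Array.from(new Set(phrases))
  return unique.length > 0 ? unique.slice(0, 5) : ["general"]
}

const insight = deriveKeyInsight("The design holds up under scrutiny. Minor gaps remain.")
// insight === "The design holds up under scrutiny"

const areas = extractAffectedAreas(
  "Add JWT authentication to the API",
  "RS256 tokens, 15-min expiry, refresh rotation in httpOnly cookies",
)
// areas → ["Add JWT", "API"]
```

The second result shows the limits of the heuristic: the sentence-initial verb "Add" is swept into a phrase, while `RS256` and `httpOnly` are missed because they contain digits or start in lower case. That is consistent with the doc comment's note that a host application may post-process the output.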
@@ -0,0 +1,54 @@
import type { LLMProvider } from "../shared/types"
import type { CouncilInput } from "./types"

/**
 * Reframe the outcome + design into a clear deliberation brief for the advisor panel.
 * Tone and scope are set by the Jury's council_brief value.
 */
export async function frameQuestion(
  input: CouncilInput,
  llm: LLMProvider,
  model?: string,
): Promise<string> {
  const { outcome, design, jury_output } = input

  const briefInstruction =
    jury_output.council_brief === "challenge"
      ? `The Jury has LOW confidence (score: ${jury_output.confidence.toFixed(2)}). ` +
        "Find what is WRONG with this design. Look for fundamental flaws, not just edge cases."
      : `The Jury has HIGH confidence (score: ${jury_output.confidence.toFixed(2)}). ` +
        "PRESSURE-TEST this design. Assume it is broadly correct — try to break it. " +
        "Find edge cases, scaling failures, and hidden assumptions."

  const systemPrompt = [
    "You are the Council Framer. You write the deliberation brief that a panel of expert advisors will work from.",
    "",
    "Write a clear, precise brief that:",
    "1. States what needs to be achieved (the outcome)",
    "2. States what is being proposed (the design)",
    "3. States the Jury's assessment and the gaps it identified",
    "4. Sets the council directive — challenge or pressure-test",
    "",
    "Keep it under 300 words. Be direct. Advisors must know exactly what to evaluate.",
  ].join("\n")

  const userPrompt = [
    `Outcome: ${outcome}`,
    "",
    `Design: ${design}`,
    "",
    `Jury assessment: ${jury_output.assessment}`,
    `Jury confidence: ${jury_output.confidence.toFixed(2)}`,
    `Jury gaps: ${jury_output.gaps.join("; ") || "none identified"}`,
    "",
    briefInstruction,
  ].join("\n")

  return llm(
    [
      { role: "system", content: systemPrompt },
      { role: "user", content: userPrompt },
    ],
    model,
  )
}