@balpal4495/quorum 3.0.3 → 3.1.0
- package/bin/commands/compass.js +4 -4
- package/bin/shared/llm.js +2 -2
- package/dist/advisor/ask.d.ts +13 -0
- package/dist/advisor/ask.d.ts.map +1 -0
- package/dist/advisor/ask.js +67 -0
- package/dist/advisor/ask.js.map +1 -0
- package/dist/advisor/index.d.ts +3 -0
- package/dist/advisor/index.d.ts.map +1 -0
- package/dist/advisor/index.js +2 -0
- package/dist/advisor/index.js.map +1 -0
- package/dist/advisor/prompt.d.ts +5 -0
- package/dist/advisor/prompt.d.ts.map +1 -0
- package/{modules/advisor/prompt.ts → dist/advisor/prompt.js} +22 -26
- package/dist/advisor/prompt.js.map +1 -0
- package/dist/advisor/types.d.ts +23 -0
- package/dist/advisor/types.d.ts.map +1 -0
- package/dist/advisor/types.js +2 -0
- package/dist/advisor/types.js.map +1 -0
- package/dist/compass/behavior.d.ts +4 -0
- package/dist/compass/behavior.d.ts.map +1 -0
- package/dist/compass/behavior.js +138 -0
- package/dist/compass/behavior.js.map +1 -0
- package/dist/compass/create.d.ts +3 -0
- package/dist/compass/create.d.ts.map +1 -0
- package/dist/compass/create.js +289 -0
- package/dist/compass/create.js.map +1 -0
- package/dist/compass/evidence/collect.d.ts +11 -0
- package/dist/compass/evidence/collect.d.ts.map +1 -0
- package/dist/compass/evidence/collect.js +86 -0
- package/dist/compass/evidence/collect.js.map +1 -0
- package/dist/compass/index.d.ts +8 -0
- package/dist/compass/index.d.ts.map +1 -0
- package/dist/compass/index.js +8 -0
- package/dist/compass/index.js.map +1 -0
- package/dist/compass/prompts/index.d.ts +28 -0
- package/dist/compass/prompts/index.d.ts.map +1 -0
- package/{modules/compass/prompts/index.ts → dist/compass/prompts/index.js} +13 -38
- package/dist/compass/prompts/index.js.map +1 -0
- package/dist/compass/prompts/system.d.ts +2 -0
- package/dist/compass/prompts/system.d.ts.map +1 -0
- package/{modules/compass/prompts/system.ts → dist/compass/prompts/system.js} +2 -1
- package/dist/compass/prompts/system.js.map +1 -0
- package/dist/compass/propose.d.ts +15 -0
- package/dist/compass/propose.d.ts.map +1 -0
- package/dist/compass/propose.js +128 -0
- package/dist/compass/propose.js.map +1 -0
- package/dist/compass/schemas.d.ts +1271 -0
- package/dist/compass/schemas.d.ts.map +1 -0
- package/dist/compass/schemas.js +113 -0
- package/dist/compass/schemas.js.map +1 -0
- package/dist/compass/score.d.ts +25 -0
- package/dist/compass/score.d.ts.map +1 -0
- package/dist/compass/score.js +89 -0
- package/dist/compass/score.js.map +1 -0
- package/dist/compass/sources/index.d.ts +9 -0
- package/dist/compass/sources/index.d.ts.map +1 -0
- package/dist/compass/sources/index.js +408 -0
- package/dist/compass/sources/index.js.map +1 -0
- package/dist/compass/types.d.ts +334 -0
- package/dist/compass/types.d.ts.map +1 -0
- package/dist/compass/types.js +2 -0
- package/dist/compass/types.js.map +1 -0
- package/dist/council/advisors.d.ts +15 -0
- package/dist/council/advisors.d.ts.map +1 -0
- package/dist/council/advisors.js +46 -0
- package/dist/council/advisors.js.map +1 -0
- package/dist/council/chairman.d.ts +13 -0
- package/dist/council/chairman.d.ts.map +1 -0
- package/dist/council/chairman.js +145 -0
- package/dist/council/chairman.js.map +1 -0
- package/dist/council/deliberate.d.ts +22 -0
- package/dist/council/deliberate.d.ts.map +1 -0
- package/dist/council/deliberate.js +99 -0
- package/dist/council/deliberate.js.map +1 -0
- package/dist/council/frame.d.ts +8 -0
- package/dist/council/frame.d.ts.map +1 -0
- package/dist/council/frame.js +40 -0
- package/dist/council/frame.js.map +1 -0
- package/dist/council/index.d.ts +6 -0
- package/dist/council/index.d.ts.map +1 -0
- package/dist/council/index.js +4 -0
- package/dist/council/index.js.map +1 -0
- package/dist/council/personas.d.ts +18 -0
- package/dist/council/personas.d.ts.map +1 -0
- package/dist/council/personas.js +44 -0
- package/dist/council/personas.js.map +1 -0
- package/dist/council/reviewers.d.ts +13 -0
- package/dist/council/reviewers.d.ts.map +1 -0
- package/dist/council/reviewers.js +59 -0
- package/dist/council/reviewers.js.map +1 -0
- package/dist/council/risk.d.ts +16 -0
- package/dist/council/risk.d.ts.map +1 -0
- package/dist/council/risk.js +74 -0
- package/dist/council/risk.js.map +1 -0
- package/dist/council/types.d.ts +95 -0
- package/dist/council/types.d.ts.map +1 -0
- package/dist/council/types.js +2 -0
- package/dist/council/types.js.map +1 -0
- package/dist/jury/evaluate.d.ts +13 -0
- package/dist/jury/evaluate.d.ts.map +1 -0
- package/{modules/jury/evaluate.ts → dist/jury/evaluate.js} +60 -84
- package/dist/jury/evaluate.js.map +1 -0
- package/dist/jury/index.d.ts +6 -0
- package/dist/jury/index.d.ts.map +1 -0
- package/dist/jury/index.js +4 -0
- package/dist/jury/index.js.map +1 -0
- package/dist/jury/preflight.d.ts +26 -0
- package/dist/jury/preflight.d.ts.map +1 -0
- package/dist/jury/preflight.js +71 -0
- package/dist/jury/preflight.js.map +1 -0
- package/dist/jury/schema.d.ts +57 -0
- package/dist/jury/schema.d.ts.map +1 -0
- package/dist/jury/schema.js +21 -0
- package/dist/jury/schema.js.map +1 -0
- package/dist/jury/types.d.ts +47 -0
- package/dist/jury/types.d.ts.map +1 -0
- package/dist/jury/types.js +2 -0
- package/dist/jury/types.js.map +1 -0
- package/dist/oracle/adapters/lance-db.d.ts +15 -0
- package/dist/oracle/adapters/lance-db.d.ts.map +1 -0
- package/dist/oracle/adapters/lance-db.js +68 -0
- package/dist/oracle/adapters/lance-db.js.map +1 -0
- package/dist/oracle/adapters/xenova-embedder.d.ts +21 -0
- package/dist/oracle/adapters/xenova-embedder.d.ts.map +1 -0
- package/dist/oracle/adapters/xenova-embedder.js +36 -0
- package/dist/oracle/adapters/xenova-embedder.js.map +1 -0
- package/dist/oracle/bm25.d.ts +20 -0
- package/dist/oracle/bm25.d.ts.map +1 -0
- package/dist/oracle/bm25.js +82 -0
- package/dist/oracle/bm25.js.map +1 -0
- package/dist/oracle/index.d.ts +21 -0
- package/dist/oracle/index.d.ts.map +1 -0
- package/dist/oracle/index.js +25 -0
- package/dist/oracle/index.js.map +1 -0
- package/dist/oracle/log.d.ts +6 -0
- package/dist/oracle/log.d.ts.map +1 -0
- package/dist/oracle/log.js +12 -0
- package/dist/oracle/log.js.map +1 -0
- package/dist/oracle/propose.d.ts +25 -0
- package/dist/oracle/propose.d.ts.map +1 -0
- package/dist/oracle/propose.js +133 -0
- package/dist/oracle/propose.js.map +1 -0
- package/dist/oracle/query.d.ts +17 -0
- package/dist/oracle/query.d.ts.map +1 -0
- package/dist/oracle/query.js +106 -0
- package/dist/oracle/query.js.map +1 -0
- package/dist/oracle/summary.d.ts +11 -0
- package/dist/oracle/summary.d.ts.map +1 -0
- package/dist/oracle/summary.js +102 -0
- package/dist/oracle/summary.js.map +1 -0
- package/dist/oracle/types.d.ts +31 -0
- package/dist/oracle/types.d.ts.map +1 -0
- package/dist/oracle/types.js +2 -0
- package/dist/oracle/types.js.map +1 -0
- package/dist/sentinel/assert.d.ts +28 -0
- package/dist/sentinel/assert.d.ts.map +1 -0
- package/dist/sentinel/assert.js +63 -0
- package/dist/sentinel/assert.js.map +1 -0
- package/dist/sentinel/coverage.d.ts +14 -0
- package/dist/sentinel/coverage.d.ts.map +1 -0
- package/dist/sentinel/coverage.js +96 -0
- package/dist/sentinel/coverage.js.map +1 -0
- package/dist/sentinel/drift.d.ts +12 -0
- package/dist/sentinel/drift.d.ts.map +1 -0
- package/dist/sentinel/drift.js +149 -0
- package/dist/sentinel/drift.js.map +1 -0
- package/dist/sentinel/index.d.ts +7 -0
- package/dist/sentinel/index.d.ts.map +1 -0
- package/dist/sentinel/index.js +5 -0
- package/dist/sentinel/index.js.map +1 -0
- package/dist/sentinel/review.d.ts +15 -0
- package/dist/sentinel/review.d.ts.map +1 -0
- package/dist/sentinel/review.js +177 -0
- package/dist/sentinel/review.js.map +1 -0
- package/dist/setup.d.ts +103 -0
- package/dist/setup.d.ts.map +1 -0
- package/dist/setup.js +87 -0
- package/dist/setup.js.map +1 -0
- package/dist/shared/types.d.ts +173 -0
- package/dist/shared/types.d.ts.map +1 -0
- package/dist/shared/types.js +16 -0
- package/dist/shared/types.js.map +1 -0
- package/package.json +13 -8
- package/.github/copilot-instructions.md +0 -117
- package/CLAUDE.md +0 -146
- package/GEMINI.md +0 -73
- package/SETUP.md +0 -264
- package/evals/__tests__/eval.test.ts +0 -31
- package/evals/cases/auth_hs256_rejected.json +0 -46
- package/evals/cases/auth_rs256_valid.json +0 -30
- package/evals/cases/cache_missing_lock.json +0 -31
- package/evals/cases/db_naive_not_null.json +0 -32
- package/evals/cases/logging_pii_leak.json +0 -32
- package/evals/cases/migration_with_rollback.json +0 -43
- package/evals/cases/no_evidence_novel_design.json +0 -16
- package/evals/cases/payment_no_idempotency.json +0 -33
- package/evals/cases/redis_session_rejected.json +0 -32
- package/evals/cases/safe_refactor.json +0 -17
- package/evals/runner.ts +0 -226
- package/modules/AGENTS.md +0 -78
- package/modules/CLAUDE.md +0 -93
- package/modules/README.md +0 -504
- package/modules/advisor/ask.ts +0 -87
- package/modules/advisor/index.ts +0 -2
- package/modules/advisor/types.ts +0 -26
- package/modules/compass/behavior.ts +0 -161
- package/modules/compass/create.ts +0 -365
- package/modules/compass/evidence/collect.ts +0 -109
- package/modules/compass/index.ts +0 -7
- package/modules/compass/propose.ts +0 -152
- package/modules/compass/schemas.ts +0 -121
- package/modules/compass/score.ts +0 -77
- package/modules/compass/sources/index.ts +0 -413
- package/modules/compass/types.ts +0 -431
- package/modules/council/advisors.ts +0 -71
- package/modules/council/chairman.ts +0 -183
- package/modules/council/deliberate.ts +0 -141
- package/modules/council/frame.ts +0 -54
- package/modules/council/index.ts +0 -9
- package/modules/council/personas.ts +0 -57
- package/modules/council/reviewers.ts +0 -82
- package/modules/council/risk.ts +0 -89
- package/modules/council/types.ts +0 -107
- package/modules/jury/index.ts +0 -5
- package/modules/jury/preflight.ts +0 -101
- package/modules/jury/schema.ts +0 -24
- package/modules/jury/types.ts +0 -50
- package/modules/oracle/adapters/lance-db.ts +0 -81
- package/modules/oracle/adapters/xenova-embedder.ts +0 -43
- package/modules/oracle/bm25.ts +0 -92
- package/modules/oracle/index.ts +0 -36
- package/modules/oracle/log.ts +0 -15
- package/modules/oracle/propose.ts +0 -164
- package/modules/oracle/query.ts +0 -146
- package/modules/oracle/summary.ts +0 -116
- package/modules/oracle/types.ts +0 -32
- package/modules/sentinel/assert.ts +0 -95
- package/modules/sentinel/coverage.ts +0 -106
- package/modules/sentinel/drift.ts +0 -163
- package/modules/sentinel/index.ts +0 -6
- package/modules/sentinel/review.ts +0 -208
- package/modules/setup.ts +0 -202
- package/modules/shared/types.ts +0 -193
@@ -1,30 +0,0 @@
-{
-  "id": "auth_rs256_valid",
-  "description": "Proposing the already-approved RS256 pattern — should proceed",
-  "outcome": "Add JWT authentication to the API",
-  "design": "RS256 tokens with 15-minute expiry and refresh rotation stored in httpOnly cookies, matching the approved pattern in Chronicle",
-  "oracle_evidence": [
-    {
-      "id": "auth-031",
-      "key_insight": "RS256 with short-lived tokens and refresh rotation in httpOnly cookies is the approved pattern",
-      "decision": "RS256 with 15-min access tokens and refresh rotation stored in httpOnly cookies is the approved auth pattern",
-      "schema_version": 2,
-      "affected_areas": ["src/auth/", "src/middleware/"],
-      "scope": ["auth", "sessions"],
-      "status": "validated",
-      "confidence": 0.88,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-03-15T11:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_min_confidence": 0.65,
-    "council_recommendation": "proceed",
-    "must_not_flag": ["key rotation problem"],
-    "risk_level": "critical",
-    "preflight_expects": {
-      "touches_sensitive_area": true
-    }
-  }
-}
@@ -1,31 +0,0 @@
-{
-  "id": "cache_missing_lock",
-  "description": "Cache implementation missing stampede protection — should warn or block",
-  "outcome": "Cache expensive product catalogue queries in Redis",
-  "design": "On cache miss, fetch from database and write to Redis with a 5-minute TTL. No locking strategy.",
-  "oracle_evidence": [
-    {
-      "id": "cache-008",
-      "key_insight": "Redis cache without stampede protection caused DB overload during peak traffic",
-      "decision": "Redis cache without stampede protection caused DB overload — all cache misses hit DB simultaneously during spikes",
-      "schema_version": 2,
-      "affected_areas": ["src/cache/", "src/api/products.ts"],
-      "scope": ["cache", "performance"],
-      "status": "refuted",
-      "confidence": 0.85,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-02-20T14:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_max_confidence": 0.60,
-    "council_recommendation": "redesign",
-    "must_flag": ["stampede", "lock"],
-    "must_cite": ["cache-008"],
-    "risk_level": "medium",
-    "preflight_expects": {
-      "chronicle_conflicts": ["cache-008"]
-    }
-  }
-}
@@ -1,32 +0,0 @@
-{
-  "id": "db_naive_not_null",
-  "description": "Naive NOT NULL migration on a large table — should block",
-  "outcome": "Add a NOT NULL region column to the users table",
-  "design": "Run ALTER TABLE users ADD COLUMN region TEXT NOT NULL DEFAULT 'uk'",
-  "oracle_evidence": [
-    {
-      "id": "db-017",
-      "key_insight": "Avoid naive locking migrations on large tables",
-      "decision": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-      "schema_version": 2,
-      "affected_areas": ["db/migrations", "src/models/user.ts"],
-      "scope": ["database", "migrations"],
-      "status": "refuted",
-      "confidence": 0.91,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-01-10T10:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_max_confidence": 0.55,
-    "council_recommendation": "redesign",
-    "must_flag": ["lock strategy", "rollback"],
-    "must_cite": ["db-017"],
-    "risk_level": "high",
-    "preflight_expects": {
-      "rollback_mentioned": false,
-      "chronicle_conflicts": ["db-017"]
-    }
-  }
-}
@@ -1,32 +0,0 @@
-{
-  "id": "logging_pii_leak",
-  "description": "Logging request bodies that contain PII — should block",
-  "outcome": "Add structured request logging to the API for debugging",
-  "design": "Log full request body including email, name, and any user-submitted fields to stdout using JSON format",
-  "oracle_evidence": [
-    {
-      "id": "sec-003",
-      "key_insight": "Logging PII fields violates GDPR and our data retention policy",
-      "decision": "Logging PII fields (email, name, address) violates GDPR Article 5 — all request logging must scrub or omit PII fields",
-      "schema_version": 2,
-      "affected_areas": ["src/middleware/logger.ts", "src/api/"],
-      "scope": ["pii", "compliance", "logging"],
-      "status": "validated",
-      "confidence": 0.95,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-01-05T08:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_max_confidence": 0.40,
-    "council_recommendation": "redesign",
-    "must_flag": ["PII", "GDPR", "email"],
-    "must_cite": ["sec-003"],
-    "risk_level": "high",
-    "preflight_expects": {
-      "touches_sensitive_area": true,
-      "sensitive_areas_include": ["pii"]
-    }
-  }
-}
@@ -1,43 +0,0 @@
-{
-  "id": "migration_with_rollback",
-  "description": "DB migration that explicitly addresses rollback and uses safe pattern — should proceed",
-  "outcome": "Add a NOT NULL region column to the users table",
-  "design": "Use shadow column pattern: add region TEXT NULLABLE, backfill via batched update, then add NOT NULL constraint after 100% fill confirmed. Rollback: drop shadow column. Uses pg_repack to avoid exclusive locks.",
-  "oracle_evidence": [
-    {
-      "id": "db-017",
-      "key_insight": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-      "decision": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-      "schema_version": 2,
-      "affected_areas": ["db/migrations", "src/models/user.ts"],
-      "scope": ["database", "migrations"],
-      "status": "refuted",
-      "confidence": 0.91,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-01-10T10:00:00Z"
-    },
-    {
-      "id": "db-019",
-      "key_insight": "Shadow column pattern with batched backfill is the approved approach for NOT NULL migrations",
-      "decision": "Shadow column pattern with batched backfill is the approved approach for large NOT NULL migrations",
-      "schema_version": 2,
-      "affected_areas": ["db/migrations"],
-      "scope": ["database", "migrations"],
-      "status": "validated",
-      "confidence": 0.87,
-      "source_module": "council",
-      "evidence_cited": ["db-017"],
-      "timestamp": "2025-02-01T12:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_min_confidence": 0.65,
-    "council_recommendation": "proceed",
-    "risk_level": "high",
-    "preflight_expects": {
-      "rollback_mentioned": true,
-      "chronicle_conflicts": ["db-017"]
-    }
-  }
-}
@@ -1,16 +0,0 @@
-{
-  "id": "no_evidence_novel_design",
-  "description": "Novel design with no Chronicle evidence either way — should investigate-more",
-  "outcome": "Implement real-time collaboration features using WebSockets",
-  "design": "Use Socket.io for bi-directional communication, Redis pub/sub for multi-instance message fanout, and optimistic UI updates with conflict resolution via last-write-wins",
-  "oracle_evidence": [],
-  "expected": {
-    "jury_max_confidence": 0.65,
-    "council_recommendation": "investigate-more",
-    "risk_level": "medium",
-    "preflight_expects": {
-      "touches_sensitive_area": false,
-      "chronicle_conflicts": []
-    }
-  }
-}
@@ -1,33 +0,0 @@
-{
-  "id": "payment_no_idempotency",
-  "description": "Payment charge without idempotency key — should block",
-  "outcome": "Implement one-click repurchase for customers",
-  "design": "On button click, POST /api/charge with the stored card token and amount. Retry on network failure up to 3 times.",
-  "oracle_evidence": [
-    {
-      "id": "pay-004",
-      "key_insight": "Payment charges without idempotency keys caused duplicate charges during network retries",
-      "decision": "All payment charge requests must include a Stripe idempotency key — retries without idempotency keys caused duplicate charges in production",
-      "schema_version": 2,
-      "affected_areas": ["src/payments/", "src/api/checkout.ts"],
-      "scope": ["payments", "stripe"],
-      "status": "refuted",
-      "confidence": 0.97,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-04-01T16:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_max_confidence": 0.40,
-    "council_recommendation": "redesign",
-    "must_flag": ["idempotency", "duplicate charge"],
-    "must_cite": ["pay-004"],
-    "risk_level": "critical",
-    "preflight_expects": {
-      "touches_sensitive_area": true,
-      "sensitive_areas_include": ["payments"],
-      "chronicle_conflicts": ["pay-004"]
-    }
-  }
-}
@@ -1,32 +0,0 @@
-{
-  "id": "redis_session_rejected",
-  "description": "Proposing Redis sessions when they were already removed — should block",
-  "outcome": "Implement user session management",
-  "design": "Store session data in Redis with a 30-minute TTL and auto-extend on activity. Use express-session with connect-redis.",
-  "oracle_evidence": [
-    {
-      "id": "auth-015",
-      "key_insight": "Redis sessions removed due to memory overhead at scale and operational complexity",
-      "decision": "Redis sessions removed — memory overhead at scale was unsustainable and operational complexity (Redis cluster, failover) added too much risk",
-      "schema_version": 2,
-      "affected_areas": ["src/auth/", "src/middleware/session.ts"],
-      "scope": ["auth", "sessions", "infrastructure"],
-      "status": "refuted",
-      "confidence": 0.89,
-      "source_module": "council",
-      "evidence_cited": [],
-      "timestamp": "2025-02-10T15:00:00Z"
-    }
-  ],
-  "expected": {
-    "jury_max_confidence": 0.50,
-    "council_recommendation": "redesign",
-    "must_flag": ["memory overhead", "Redis"],
-    "must_cite": ["auth-015"],
-    "risk_level": "critical",
-    "preflight_expects": {
-      "touches_sensitive_area": true,
-      "chronicle_conflicts": ["auth-015"]
-    }
-  }
-}
@@ -1,17 +0,0 @@
-{
-  "id": "safe_refactor",
-  "description": "Low-risk internal refactor with no sensitive areas — should proceed without friction",
-  "outcome": "Rename internal helper functions in the reporting module for consistency",
-  "design": "Rename generateCsvReport to exportReportAsCsv and generatePdfReport to exportReportAsPdf in src/reports/. Update all callers. No behaviour change.",
-  "oracle_evidence": [],
-  "expected": {
-    "jury_min_confidence": 0.70,
-    "council_recommendation": "proceed",
-    "risk_level": "low",
-    "preflight_expects": {
-      "touches_sensitive_area": false,
-      "rollback_mentioned": false,
-      "chronicle_conflicts": []
-    }
-  }
-}
package/evals/runner.ts
DELETED
|
@@ -1,226 +0,0 @@
|
|
|
1
|
-
/**
|
|
2
|
-
* Eval runner for Quorum Jury + Council.
|
|
3
|
-
*
|
|
4
|
-
* Each case in evals/cases/ defines a proposal and what the system should produce.
|
|
5
|
-
* The runner validates:
|
|
6
|
-
* - Jury confidence is within expected bounds
|
|
7
|
-
* - Preflight detects the expected signals
|
|
8
|
-
* - Risk classifier assigns the expected level
|
|
9
|
-
* - Council recommendation matches (when an LLM provider is available)
|
|
10
|
-
*
|
|
11
|
-
* Jury + preflight run without any LLM (deterministic).
|
|
12
|
-
* Council assertions are skipped if no LLM provider is injected.
|
|
13
|
-
*
|
|
14
|
-
* Usage:
|
|
15
|
-
* npx vitest run evals/
|
|
16
|
-
*
|
|
17
|
-
* Or run against a real LLM:
|
|
18
|
-
* EVAL_LLM=openai npx vitest run evals/
|
|
19
|
-
*/
|
|
20
|
-
|
|
21
|
-
import { promises as fs } from "fs"
|
|
22
|
-
import path from "path"
|
|
23
|
-
import type { OracleResult, LLMProvider } from "../modules/shared/types"
|
|
24
|
-
import { runPreflight } from "../modules/jury/preflight"
|
|
25
|
-
import { classifyRisk } from "../modules/council/risk"
|
|
26
|
-
|
|
27
|
-
export interface EvalCase {
|
|
28
|
-
id: string
|
|
29
|
-
description: string
|
|
30
|
-
outcome: string
|
|
31
|
-
design: string
|
|
32
|
-
oracle_evidence: OracleResult[]
|
|
33
|
-
expected: {
|
|
34
|
-
jury_min_confidence?: number
|
|
35
|
-
jury_max_confidence?: number
|
|
36
|
-
council_recommendation?: "proceed" | "redesign" | "investigate-more"
|
|
37
|
-
must_flag?: string[]
|
|
38
|
-
must_not_flag?: string[]
|
|
39
|
-
must_cite?: string[]
|
|
40
|
-
risk_level?: string
|
|
41
|
-
preflight_expects?: {
|
|
42
|
-
touches_sensitive_area?: boolean
|
|
43
|
-
sensitive_areas_include?: string[]
|
|
44
|
-
rollback_mentioned?: boolean
|
|
45
|
-
test_strategy_mentioned?: boolean
|
|
46
|
-
chronicle_conflicts?: string[]
|
|
47
|
-
}
|
|
48
|
-
}
|
|
49
|
-
}
|
|
50
|
-
|
|
51
|
-
export interface EvalResult {
|
|
52
|
-
caseId: string
|
|
53
|
-
description: string
|
|
54
|
-
passed: boolean
|
|
55
|
-
failures: string[]
|
|
56
|
-
preflight: ReturnType<typeof runPreflight>
|
|
57
|
-
risk: ReturnType<typeof classifyRisk>
|
|
58
|
-
juryOutput?: unknown
|
|
59
|
-
councilOutput?: unknown
|
|
60
|
-
durationMs: number
|
|
61
|
-
}
|
|
62
|
-
|
|
63
|
-
export async function loadCases(casesDir?: string): Promise<EvalCase[]> {
|
|
64
|
-
const dir = casesDir ?? path.join(__dirname, "cases")
|
|
65
|
-
const files = (await fs.readdir(dir)).filter(f => f.endsWith(".json"))
|
|
66
|
-
const cases = await Promise.all(
|
|
67
|
-
files.map(async f => {
|
|
68
|
-
const raw = await fs.readFile(path.join(dir, f), "utf8")
|
|
69
|
-
return JSON.parse(raw) as EvalCase
|
|
70
|
-
}),
|
|
71
|
-
)
|
|
72
|
-
return cases
|
|
73
|
-
}
|
|
74
|
-
|
|
75
|
-
export async function runCase(
|
|
76
|
-
evalCase: EvalCase,
|
|
77
|
-
llm?: LLMProvider,
|
|
78
|
-
): Promise<EvalResult> {
|
|
79
|
-
const start = Date.now()
|
|
80
|
-
const failures: string[] = []
|
|
81
|
-
|
|
82
|
-
const { outcome, design, oracle_evidence: evidence, expected } = evalCase
|
|
83
|
-
|
|
84
|
-
// ── Deterministic checks (no LLM) ──────────────────────────────────────────
|
|
85
|
-
|
|
86
|
-
const preflight = runPreflight(outcome, design, evidence)
|
|
87
|
-
const risk = classifyRisk(outcome, design, evidence)
|
|
88
|
-
|
|
89
|
-
// Risk level
|
|
90
|
-
if (expected.risk_level && risk.level !== expected.risk_level) {
|
|
91
|
-
failures.push(
|
|
92
|
-
`risk_level: expected "${expected.risk_level}", got "${risk.level}" (reasons: ${risk.reasons.join(", ")})`,
|
|
93
|
-
)
|
|
94
|
-
}
|
|
95
|
-
|
|
96
|
-
// Preflight assertions
|
|
97
|
-
const pf = expected.preflight_expects
|
|
98
|
-
if (pf) {
|
|
99
|
-
if (pf.touches_sensitive_area !== undefined && preflight.touches_sensitive_area !== pf.touches_sensitive_area) {
|
|
100
|
-
failures.push(`preflight.touches_sensitive_area: expected ${pf.touches_sensitive_area}, got ${preflight.touches_sensitive_area}`)
|
|
101
|
-
}
|
|
102
|
-
if (pf.rollback_mentioned !== undefined && preflight.rollback_mentioned !== pf.rollback_mentioned) {
|
|
103
|
-
failures.push(`preflight.rollback_mentioned: expected ${pf.rollback_mentioned}, got ${preflight.rollback_mentioned}`)
|
|
104
|
-
}
|
|
105
|
-
if (pf.test_strategy_mentioned !== undefined && preflight.test_strategy_mentioned !== pf.test_strategy_mentioned) {
|
|
106
|
-
failures.push(`preflight.test_strategy_mentioned: expected ${pf.test_strategy_mentioned}, got ${preflight.test_strategy_mentioned}`)
|
|
107
|
-
}
|
|
108
|
-
if (pf.chronicle_conflicts) {
|
|
109
|
-
for (const id of pf.chronicle_conflicts) {
|
|
110
|
-
if (!preflight.chronicle_conflicts.includes(id)) {
|
|
111
|
-
failures.push(`preflight.chronicle_conflicts: expected "${id}" to be flagged`)
|
|
112
|
-
}
|
|
113
|
-
}
|
|
114
|
-
}
|
|
115
|
-
if (pf.sensitive_areas_include) {
|
|
116
|
-
for (const area of pf.sensitive_areas_include) {
|
|
117
|
-
if (!preflight.sensitive_areas.includes(area)) {
|
|
118
|
-
failures.push(`preflight.sensitive_areas: expected "${area}" to be detected`)
|
|
119
|
-
}
|
|
120
|
-
}
|
|
121
|
-
}
|
|
122
|
-
}
|
|
123
|
-
|
|
124
|
-
let juryOutput: unknown
|
|
125
|
-
let councilOutput: unknown
|
|
126
|
-
|
|
127
|
-
// ── LLM-dependent checks (skipped if no provider) ──────────────────────────
|
|
128
|
-
|
|
129
|
-
if (llm) {
|
|
130
|
-
const { evaluate } = await import("../modules/jury/evaluate")
|
|
131
|
-
try {
|
|
132
|
-
juryOutput = await evaluate({ outcome, design, evidence }, { llm })
|
|
133
|
-
const jury = juryOutput as { confidence: number; recommendation: string; assessment: string; gaps: string[] }
|
|
134
|
-
|
|
135
|
-
if (expected.jury_min_confidence !== undefined && jury.confidence < expected.jury_min_confidence) {
|
|
136
|
-
failures.push(`jury.confidence: expected ≥ ${expected.jury_min_confidence}, got ${jury.confidence}`)
|
|
137
|
-
}
|
|
138
|
-
if (expected.jury_max_confidence !== undefined && jury.confidence > expected.jury_max_confidence) {
|
|
139
|
-
failures.push(`jury.confidence: expected ≤ ${expected.jury_max_confidence}, got ${jury.confidence}`)
|
|
140
|
-
}
|
|
141
|
-
} catch (err) {
|
|
142
|
-
failures.push(`jury threw: ${String(err)}`)
|
|
143
|
-
}
|
|
144
|
-
|
|
145
|
-
if (expected.council_recommendation && juryOutput) {
|
|
146
|
-
const { deliberate } = await import("../modules/council/deliberate")
|
|
147
|
-
const mockOracle = {
|
|
148
|
-
query: async () => [],
|
|
149
|
-
propose: async () => ({ proposalId: "eval-proposal" }),
|
|
150
|
-
commit: async () => { throw new Error("commit not available in eval") },
|
|
151
|
-
}
|
|
152
|
-
try {
|
|
153
|
-
councilOutput = await deliberate(
|
|
154
|
-
{ outcome, design, evidence, jury_output: juryOutput as never },
|
|
155
|
-
{ llm, oracle: mockOracle, advisorCount: 2, reviewerCount: 2 },
|
|
156
|
-
)
|
|
157
|
-
const council = councilOutput as { recommendation: string; verdict: string; blockers: Array<{ issue: string }>; evidence_cited: string[] }
|
|
158
|
-
|
|
159
|
-
if (council.recommendation !== expected.council_recommendation) {
|
|
160
|
-
failures.push(
|
|
161
|
-
`council.recommendation: expected "${expected.council_recommendation}", got "${council.recommendation}"`,
|
|
162
|
-
)
|
|
163
|
-
}
|
|
164
|
-
|
|
165
|
-
const verdictText = [
|
|
166
|
-
council.verdict,
|
|
167
|
-
...council.blockers.map(b => b.issue),
|
|
168
|
-
].join(" ").toLowerCase()
|
|
169
|
-
|
|
170
|
-
if (expected.must_flag) {
|
|
171
|
-
for (const term of expected.must_flag) {
|
|
172
|
-
if (!verdictText.includes(term.toLowerCase())) {
|
|
173
|
-
failures.push(`council must_flag: "${term}" not mentioned in verdict or blockers`)
|
|
174
|
-
}
|
|
175
|
-
}
|
|
176
|
-
}
|
|
177
|
-
if (expected.must_not_flag) {
|
|
178
|
-
for (const term of expected.must_not_flag) {
|
|
179
|
-
if (verdictText.includes(term.toLowerCase())) {
|
|
180
|
-
failures.push(`council must_not_flag: "${term}" was mentioned but should not be`)
|
|
181
|
-
}
|
|
182
|
-
}
|
|
183
|
-
}
|
|
184
|
-
if (expected.must_cite) {
|
|
185
|
-
for (const id of expected.must_cite) {
|
|
186
|
-
if (!council.evidence_cited.includes(id)) {
|
|
187
|
-
failures.push(`council must_cite: entry ID "${id}" not in evidence_cited`)
|
|
188
|
-
}
|
|
189
|
-
}
|
|
190
|
-
}
|
|
191
|
-
} catch (err) {
|
|
192
|
-
failures.push(`council threw: ${String(err)}`)
|
|
193
|
-
}
|
|
194
|
-
}
|
|
195
|
-
}
|
|
196
|
-
|
|
197
|
-
return {
|
|
198
|
-
caseId: evalCase.id,
|
|
199
|
-
description: evalCase.description,
|
|
200
|
-
passed: failures.length === 0,
|
|
201
|
-
failures,
|
|
202
|
-
preflight,
|
|
203
|
-
risk,
|
|
204
|
-
juryOutput,
|
|
205
|
-
councilOutput,
|
|
206
|
-
durationMs: Date.now() - start,
|
|
207
|
-
}
|
|
208
|
-
}
|
|
209
|
-
|
|
210
|
-
export function printEvalSummary(results: EvalResult[]): void {
|
|
211
|
-
const passed = results.filter(r => r.passed).length
|
|
212
|
-
const total = results.length
|
|
213
|
-
console.log(`\n${"─".repeat(60)}`)
|
|
214
|
-
console.log(`Eval results: ${passed}/${total} passed`)
|
|
215
|
-
console.log("─".repeat(60))
|
|
216
|
-
for (const r of results) {
|
|
217
|
-
const icon = r.passed ? "✓" : "✗"
|
|
218
|
-
console.log(`${icon} ${r.caseId} (${r.durationMs}ms)`)
|
|
219
|
-
if (!r.passed) {
|
|
220
|
-
for (const f of r.failures) {
|
|
221
|
-
console.log(` → ${f}`)
|
|
222
|
-
}
|
|
223
|
-
}
|
|
224
|
-
}
|
|
225
|
-
console.log("─".repeat(60))
|
|
226
|
-
}
|
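The `must_flag` / `must_not_flag` checks in the removed eval runner above reduce to case-insensitive substring tests over the council verdict joined with its blocker issues. A minimal standalone sketch of that logic follows; `checkTerms`, `CouncilLike`, and `TermExpectations` are hypothetical names introduced for illustration and are not part of the package.

```typescript
// Hypothetical helper mirroring the substring checks in the removed eval runner.
// The interface names below are assumptions, not the package's own types.
interface CouncilLike {
  verdict: string
  blockers: Array<{ issue: string }>
}

interface TermExpectations {
  must_flag?: string[]
  must_not_flag?: string[]
}

function checkTerms(council: CouncilLike, expected: TermExpectations): string[] {
  const failures: string[] = []

  // Verdict and blocker issues are searched as one lowercased string.
  const verdictText = [council.verdict, ...council.blockers.map(b => b.issue)]
    .join(" ")
    .toLowerCase()

  // Every must_flag term has to appear somewhere in that text.
  for (const term of expected.must_flag ?? []) {
    if (!verdictText.includes(term.toLowerCase())) {
      failures.push(`council must_flag: "${term}" not mentioned in verdict or blockers`)
    }
  }

  // No must_not_flag term may appear anywhere in that text.
  for (const term of expected.must_not_flag ?? []) {
    if (verdictText.includes(term.toLowerCase())) {
      failures.push(`council must_not_flag: "${term}" was mentioned but should not be`)
    }
  }

  return failures
}
```

Because the match is a plain substring test, a short term such as "ok" would also match inside longer words, so eval cases probably want reasonably specific terms.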
package/modules/AGENTS.md
DELETED
|
@@ -1,78 +0,0 @@
|
|
|
1
|
-
# modules/ — Agent Instructions
|
|
2
|
-
|
|
3
|
-
Supplements the root `AGENTS.md` / `copilot-instructions.md` with module-specific internals.
|
|
4
|
-
When working inside this folder, follow these rules in addition to the root guidelines.
|
|
5
|
-
|
|
6
|
-
---
|
|
7
|
-
|
|
8
|
-
## File ownership
|
|
9
|
-
|
|
10
|
-
### Oracle
|
|
11
|
-
| File | Owns |
|
|
12
|
-
|---|---|
|
|
13
|
-
| `oracle/query.ts` | Two-pass retrieval (vector → BM25 → RRF fusion). Score threshold. Query log. |
|
|
14
|
-
| `oracle/bm25.ts` | BM25 scoring algorithm. Domain term extraction for query enrichment. |
|
|
15
|
-
| `oracle/propose.ts` | `propose()` + `commit()`. The human-gated write path. Do not add auto-commit logic here. |
|
|
16
|
-
| `oracle/log.ts` | Best-effort JSONL query log writer. Must never throw to callers. |
|
|
17
|
-
| `oracle/adapters/lance-db.ts` | LanceDB `VectorStore` implementation. Swappable — do not couple oracle internals to this. |
|
|
18
|
-
| `oracle/adapters/xenova-embedder.ts` | Local ONNX embedder. Swappable — do not couple oracle internals to this. |
|
|
19
|
-
|
|
20
|
-
### Jury
|
|
21
|
-
| File | Owns |
|
|
22
|
-
|---|---|
|
|
23
|
-
| `jury/schema.ts` | Zod schema for structured LLM output. Source of truth for `JuryOutput` shape including `confidence_breakdown` and `blocking_gaps`. |
|
|
24
|
-
| `jury/evaluate.ts` | Four-dimension evaluation. **Confidence is always recomputed from the breakdown average here — do not remove this. `council_brief` is also overridden from confidence.** |
|
|
25
|
-
| `jury/preflight.ts` | Deterministic preflight — no LLM. Detects sensitive areas, rollback mention, and Chronicle conflicts before the LLM runs. Safe to extend with new patterns. |
|
|
26
|
-
|
|
27
|
-
### Council
|
|
28
|
-
| File | Owns |
|
|
29
|
-
|---|---|
|
|
30
|
-
| `council/personas.ts` | Default advisor personas. Safe to extend. Do not remove existing personas without good reason. |
|
|
31
|
-
| `council/frame.ts` | Sets deliberation tone from `council_brief`. Challenge vs pressure-test framing lives here. |
|
|
32
|
-
| `council/advisors.ts` | Parallel advisor fan-out. Advisors must cite Oracle entry IDs — enforced in the prompt. |
|
|
33
|
-
| `council/reviewers.ts` | Anonymisation of advisor responses + parallel reviewer fan-out. Anonymisation must happen before reviewers see responses. |
|
|
34
|
-
| `council/chairman.ts` | Verdict synthesis + Zod validation. Produces structured `blockers`/`warnings`, validates citations, tracks `advisor_split`. Throws on bad output — do not add fallbacks. |
|
|
35
|
-
| `council/risk.ts` | Deterministic risk classifier — no LLM. Assigns `low/medium/high/critical` and `council_mode` from design text and refuted evidence. Drives advisor/reviewer fan-out counts. |
|
|
36
|
-
| `council/deliberate.ts` | Full pipeline orchestration. Calls `oracle.propose()` at the end — never `oracle.commit()`. Risk classifier runs first to set fan-out counts. |
|
|
37
|
-
|
|
38
|
-
### Advisor
|
|
39
|
-
| File | Owns |
|
|
40
|
-
|---|---|
|
|
41
|
-
| `advisor/ask.ts` | Main entry point. Queries Oracle, calls LLM, validates answer against satisfaction threshold (confidence ≥ 0.7, no blockers). Retries up to 2 times with previous answer as context. Throws on bad LLM output — do not add fallbacks. |
|
|
42
|
-
| `advisor/prompt.ts` | SYSTEM_PROMPT, evidence formatter, user prompt builder. The plain-language framing lives here. |
|
|
43
|
-
| `advisor/types.ts` | `AdvisorInput`, `AdvisorAnswer`, `AdvisorOutput`, `AdvisorDeps` types. |
|
|
44
|
-
|
|
45
|
-
---
|
|
46
|
-
|
|
47
|
-
## Extension points
|
|
48
|
-
|
|
49
|
-
**Swap the vector store** — implement `VectorStore` from `oracle/types.ts` and pass it to `createOracleClient()` or `setup()`.
|
|
50
|
-
|
|
51
|
-
**Swap the embedder** — pass `embedder: yourFn` to `setup()`. Must return a consistent-dimension float array.
|
|
52
|
-
|
|
53
|
-
**Add advisor personas** — extend `DEFAULT_PERSONAS` in `council/personas.ts`, or pass a custom personas array directly to `fanOutAdvisors()`.
|
|
54
|
-
|
|
55
|
-
**Use different models per step** — pass `models` to `setup()` or `council.deliberate()` deps. Cheaper models for advisors, stronger for chairman is the intended pattern.
|
|
56
|
-
|
|
57
|
-
---
|
|
58
|
-
|
|
59
|
-
## Invariants — do not break these
|
|
60
|
-
|
|
61
|
-
- `advisor/ask.ts` never calls `oracle.propose()` or `oracle.commit()`. It is a read-only path.
|
|
62
|
-
- `oracle.commit()` is never called without explicit human input. `deliberate()` calls `propose()` only.
|
|
63
|
-
- `jury/evaluate.ts` recomputes `confidence` as the exact average of `confidence_breakdown` dimensions — the LLM value is discarded.
|
|
64
|
-
- `jury/evaluate.ts` derives `council_brief` from the recomputed confidence — never trusts the LLM value.
|
|
65
|
-
- `chairman.ts` and `jury/evaluate.ts` throw on schema validation failure. Do not add try/catch that swallows these errors.
|
|
66
|
-
- `deliberate.ts` passes `citation_validation.valid_ids` (not raw `evidence_cited`) to `oracle.propose()` — hallucinated IDs are stripped.
|
|
67
|
-
- Query logging in `oracle/log.ts` is always best-effort — callers must not fail because of a log write error.
|
|
68
|
-
- `VectorStore` and `embedder` are always injected — never imported directly inside Oracle logic.
|
|
69
|
-
|
|
70
|
-
---
|
|
71
|
-
|
|
72
|
-
## Tests
|
|
73
|
-
|
|
74
|
-
```bash
|
|
75
|
-
npx vitest run modules/
|
|
76
|
-
```
|
|
77
|
-
|
|
78
|
-
Tests live in `__tests__/` inside each module folder. Use `vi.fn()` for LLM providers and vector stores — never call a real LLM in tests.
|
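The invariants in the deleted `AGENTS.md` state that `jury/evaluate.ts` discards the LLM-supplied `confidence`, recomputes it as the average of `confidence_breakdown`, and derives `council_brief` from the recomputed value. A rough sketch of that idea follows. The `JuryDraft` shape, the `normalizeJuryOutput` name, the 0.7 threshold, and the mapping of high confidence to `"pressure-test"` and low confidence to `"challenge"` are all assumptions made for illustration; the real schema and cut-off are not shown in this diff.

```typescript
// Hypothetical shape: the real JuryOutput schema lives in jury/schema.ts
// and is not visible in this diff.
interface JuryDraft {
  confidence: number
  confidence_breakdown: Record<string, number>
  council_brief: string
}

// Assumed threshold, for illustration only.
const ASSUMED_THRESHOLD = 0.7

function normalizeJuryOutput(draft: JuryDraft): JuryDraft {
  const scores = Object.keys(draft.confidence_breakdown).map(
    key => draft.confidence_breakdown[key],
  )
  if (scores.length === 0) {
    throw new Error("confidence_breakdown must contain at least one dimension")
  }

  // The LLM-supplied confidence is ignored; the average of the breakdown wins.
  const confidence = scores.reduce((sum, s) => sum + s, 0) / scores.length

  // council_brief is derived from the recomputed value, never taken from the LLM.
  // Which label goes with which side of the threshold is an assumption.
  const council_brief = confidence >= ASSUMED_THRESHOLD ? "pressure-test" : "challenge"

  return { ...draft, confidence, council_brief }
}
```

Throwing on an empty breakdown, rather than defaulting to some value, follows the document's stated preference for failing loudly on bad LLM output instead of adding fallbacks.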