@balpal4495/quorum 3.0.3 → 3.1.0

Files changed (243)
  1. package/bin/commands/compass.js +4 -4
  2. package/bin/shared/llm.js +2 -2
  3. package/dist/advisor/ask.d.ts +13 -0
  4. package/dist/advisor/ask.d.ts.map +1 -0
  5. package/dist/advisor/ask.js +67 -0
  6. package/dist/advisor/ask.js.map +1 -0
  7. package/dist/advisor/index.d.ts +3 -0
  8. package/dist/advisor/index.d.ts.map +1 -0
  9. package/dist/advisor/index.js +2 -0
  10. package/dist/advisor/index.js.map +1 -0
  11. package/dist/advisor/prompt.d.ts +5 -0
  12. package/dist/advisor/prompt.d.ts.map +1 -0
  13. package/{modules/advisor/prompt.ts → dist/advisor/prompt.js} +22 -26
  14. package/dist/advisor/prompt.js.map +1 -0
  15. package/dist/advisor/types.d.ts +23 -0
  16. package/dist/advisor/types.d.ts.map +1 -0
  17. package/dist/advisor/types.js +2 -0
  18. package/dist/advisor/types.js.map +1 -0
  19. package/dist/compass/behavior.d.ts +4 -0
  20. package/dist/compass/behavior.d.ts.map +1 -0
  21. package/dist/compass/behavior.js +138 -0
  22. package/dist/compass/behavior.js.map +1 -0
  23. package/dist/compass/create.d.ts +3 -0
  24. package/dist/compass/create.d.ts.map +1 -0
  25. package/dist/compass/create.js +289 -0
  26. package/dist/compass/create.js.map +1 -0
  27. package/dist/compass/evidence/collect.d.ts +11 -0
  28. package/dist/compass/evidence/collect.d.ts.map +1 -0
  29. package/dist/compass/evidence/collect.js +86 -0
  30. package/dist/compass/evidence/collect.js.map +1 -0
  31. package/dist/compass/index.d.ts +8 -0
  32. package/dist/compass/index.d.ts.map +1 -0
  33. package/dist/compass/index.js +8 -0
  34. package/dist/compass/index.js.map +1 -0
  35. package/dist/compass/prompts/index.d.ts +28 -0
  36. package/dist/compass/prompts/index.d.ts.map +1 -0
  37. package/{modules/compass/prompts/index.ts → dist/compass/prompts/index.js} +13 -38
  38. package/dist/compass/prompts/index.js.map +1 -0
  39. package/dist/compass/prompts/system.d.ts +2 -0
  40. package/dist/compass/prompts/system.d.ts.map +1 -0
  41. package/{modules/compass/prompts/system.ts → dist/compass/prompts/system.js} +2 -1
  42. package/dist/compass/prompts/system.js.map +1 -0
  43. package/dist/compass/propose.d.ts +15 -0
  44. package/dist/compass/propose.d.ts.map +1 -0
  45. package/dist/compass/propose.js +128 -0
  46. package/dist/compass/propose.js.map +1 -0
  47. package/dist/compass/schemas.d.ts +1271 -0
  48. package/dist/compass/schemas.d.ts.map +1 -0
  49. package/dist/compass/schemas.js +113 -0
  50. package/dist/compass/schemas.js.map +1 -0
  51. package/dist/compass/score.d.ts +25 -0
  52. package/dist/compass/score.d.ts.map +1 -0
  53. package/dist/compass/score.js +89 -0
  54. package/dist/compass/score.js.map +1 -0
  55. package/dist/compass/sources/index.d.ts +9 -0
  56. package/dist/compass/sources/index.d.ts.map +1 -0
  57. package/dist/compass/sources/index.js +408 -0
  58. package/dist/compass/sources/index.js.map +1 -0
  59. package/dist/compass/types.d.ts +334 -0
  60. package/dist/compass/types.d.ts.map +1 -0
  61. package/dist/compass/types.js +2 -0
  62. package/dist/compass/types.js.map +1 -0
  63. package/dist/council/advisors.d.ts +15 -0
  64. package/dist/council/advisors.d.ts.map +1 -0
  65. package/dist/council/advisors.js +46 -0
  66. package/dist/council/advisors.js.map +1 -0
  67. package/dist/council/chairman.d.ts +13 -0
  68. package/dist/council/chairman.d.ts.map +1 -0
  69. package/dist/council/chairman.js +145 -0
  70. package/dist/council/chairman.js.map +1 -0
  71. package/dist/council/deliberate.d.ts +22 -0
  72. package/dist/council/deliberate.d.ts.map +1 -0
  73. package/dist/council/deliberate.js +99 -0
  74. package/dist/council/deliberate.js.map +1 -0
  75. package/dist/council/frame.d.ts +8 -0
  76. package/dist/council/frame.d.ts.map +1 -0
  77. package/dist/council/frame.js +40 -0
  78. package/dist/council/frame.js.map +1 -0
  79. package/dist/council/index.d.ts +6 -0
  80. package/dist/council/index.d.ts.map +1 -0
  81. package/dist/council/index.js +4 -0
  82. package/dist/council/index.js.map +1 -0
  83. package/dist/council/personas.d.ts +18 -0
  84. package/dist/council/personas.d.ts.map +1 -0
  85. package/dist/council/personas.js +44 -0
  86. package/dist/council/personas.js.map +1 -0
  87. package/dist/council/reviewers.d.ts +13 -0
  88. package/dist/council/reviewers.d.ts.map +1 -0
  89. package/dist/council/reviewers.js +59 -0
  90. package/dist/council/reviewers.js.map +1 -0
  91. package/dist/council/risk.d.ts +16 -0
  92. package/dist/council/risk.d.ts.map +1 -0
  93. package/dist/council/risk.js +74 -0
  94. package/dist/council/risk.js.map +1 -0
  95. package/dist/council/types.d.ts +95 -0
  96. package/dist/council/types.d.ts.map +1 -0
  97. package/dist/council/types.js +2 -0
  98. package/dist/council/types.js.map +1 -0
  99. package/dist/jury/evaluate.d.ts +13 -0
  100. package/dist/jury/evaluate.d.ts.map +1 -0
  101. package/{modules/jury/evaluate.ts → dist/jury/evaluate.js} +60 -84
  102. package/dist/jury/evaluate.js.map +1 -0
  103. package/dist/jury/index.d.ts +6 -0
  104. package/dist/jury/index.d.ts.map +1 -0
  105. package/dist/jury/index.js +4 -0
  106. package/dist/jury/index.js.map +1 -0
  107. package/dist/jury/preflight.d.ts +26 -0
  108. package/dist/jury/preflight.d.ts.map +1 -0
  109. package/dist/jury/preflight.js +71 -0
  110. package/dist/jury/preflight.js.map +1 -0
  111. package/dist/jury/schema.d.ts +57 -0
  112. package/dist/jury/schema.d.ts.map +1 -0
  113. package/dist/jury/schema.js +21 -0
  114. package/dist/jury/schema.js.map +1 -0
  115. package/dist/jury/types.d.ts +47 -0
  116. package/dist/jury/types.d.ts.map +1 -0
  117. package/dist/jury/types.js +2 -0
  118. package/dist/jury/types.js.map +1 -0
  119. package/dist/oracle/adapters/lance-db.d.ts +15 -0
  120. package/dist/oracle/adapters/lance-db.d.ts.map +1 -0
  121. package/dist/oracle/adapters/lance-db.js +68 -0
  122. package/dist/oracle/adapters/lance-db.js.map +1 -0
  123. package/dist/oracle/adapters/xenova-embedder.d.ts +21 -0
  124. package/dist/oracle/adapters/xenova-embedder.d.ts.map +1 -0
  125. package/dist/oracle/adapters/xenova-embedder.js +36 -0
  126. package/dist/oracle/adapters/xenova-embedder.js.map +1 -0
  127. package/dist/oracle/bm25.d.ts +20 -0
  128. package/dist/oracle/bm25.d.ts.map +1 -0
  129. package/dist/oracle/bm25.js +82 -0
  130. package/dist/oracle/bm25.js.map +1 -0
  131. package/dist/oracle/index.d.ts +21 -0
  132. package/dist/oracle/index.d.ts.map +1 -0
  133. package/dist/oracle/index.js +25 -0
  134. package/dist/oracle/index.js.map +1 -0
  135. package/dist/oracle/log.d.ts +6 -0
  136. package/dist/oracle/log.d.ts.map +1 -0
  137. package/dist/oracle/log.js +12 -0
  138. package/dist/oracle/log.js.map +1 -0
  139. package/dist/oracle/propose.d.ts +25 -0
  140. package/dist/oracle/propose.d.ts.map +1 -0
  141. package/dist/oracle/propose.js +133 -0
  142. package/dist/oracle/propose.js.map +1 -0
  143. package/dist/oracle/query.d.ts +17 -0
  144. package/dist/oracle/query.d.ts.map +1 -0
  145. package/dist/oracle/query.js +106 -0
  146. package/dist/oracle/query.js.map +1 -0
  147. package/dist/oracle/summary.d.ts +11 -0
  148. package/dist/oracle/summary.d.ts.map +1 -0
  149. package/dist/oracle/summary.js +102 -0
  150. package/dist/oracle/summary.js.map +1 -0
  151. package/dist/oracle/types.d.ts +31 -0
  152. package/dist/oracle/types.d.ts.map +1 -0
  153. package/dist/oracle/types.js +2 -0
  154. package/dist/oracle/types.js.map +1 -0
  155. package/dist/sentinel/assert.d.ts +28 -0
  156. package/dist/sentinel/assert.d.ts.map +1 -0
  157. package/dist/sentinel/assert.js +63 -0
  158. package/dist/sentinel/assert.js.map +1 -0
  159. package/dist/sentinel/coverage.d.ts +14 -0
  160. package/dist/sentinel/coverage.d.ts.map +1 -0
  161. package/dist/sentinel/coverage.js +96 -0
  162. package/dist/sentinel/coverage.js.map +1 -0
  163. package/dist/sentinel/drift.d.ts +12 -0
  164. package/dist/sentinel/drift.d.ts.map +1 -0
  165. package/dist/sentinel/drift.js +149 -0
  166. package/dist/sentinel/drift.js.map +1 -0
  167. package/dist/sentinel/index.d.ts +7 -0
  168. package/dist/sentinel/index.d.ts.map +1 -0
  169. package/dist/sentinel/index.js +5 -0
  170. package/dist/sentinel/index.js.map +1 -0
  171. package/dist/sentinel/review.d.ts +15 -0
  172. package/dist/sentinel/review.d.ts.map +1 -0
  173. package/dist/sentinel/review.js +177 -0
  174. package/dist/sentinel/review.js.map +1 -0
  175. package/dist/setup.d.ts +103 -0
  176. package/dist/setup.d.ts.map +1 -0
  177. package/dist/setup.js +87 -0
  178. package/dist/setup.js.map +1 -0
  179. package/dist/shared/types.d.ts +173 -0
  180. package/dist/shared/types.d.ts.map +1 -0
  181. package/dist/shared/types.js +16 -0
  182. package/dist/shared/types.js.map +1 -0
  183. package/package.json +13 -8
  184. package/.github/copilot-instructions.md +0 -117
  185. package/CLAUDE.md +0 -146
  186. package/GEMINI.md +0 -73
  187. package/SETUP.md +0 -264
  188. package/evals/__tests__/eval.test.ts +0 -31
  189. package/evals/cases/auth_hs256_rejected.json +0 -46
  190. package/evals/cases/auth_rs256_valid.json +0 -30
  191. package/evals/cases/cache_missing_lock.json +0 -31
  192. package/evals/cases/db_naive_not_null.json +0 -32
  193. package/evals/cases/logging_pii_leak.json +0 -32
  194. package/evals/cases/migration_with_rollback.json +0 -43
  195. package/evals/cases/no_evidence_novel_design.json +0 -16
  196. package/evals/cases/payment_no_idempotency.json +0 -33
  197. package/evals/cases/redis_session_rejected.json +0 -32
  198. package/evals/cases/safe_refactor.json +0 -17
  199. package/evals/runner.ts +0 -226
  200. package/modules/AGENTS.md +0 -78
  201. package/modules/CLAUDE.md +0 -93
  202. package/modules/README.md +0 -504
  203. package/modules/advisor/ask.ts +0 -87
  204. package/modules/advisor/index.ts +0 -2
  205. package/modules/advisor/types.ts +0 -26
  206. package/modules/compass/behavior.ts +0 -161
  207. package/modules/compass/create.ts +0 -365
  208. package/modules/compass/evidence/collect.ts +0 -109
  209. package/modules/compass/index.ts +0 -7
  210. package/modules/compass/propose.ts +0 -152
  211. package/modules/compass/schemas.ts +0 -121
  212. package/modules/compass/score.ts +0 -77
  213. package/modules/compass/sources/index.ts +0 -413
  214. package/modules/compass/types.ts +0 -431
  215. package/modules/council/advisors.ts +0 -71
  216. package/modules/council/chairman.ts +0 -183
  217. package/modules/council/deliberate.ts +0 -141
  218. package/modules/council/frame.ts +0 -54
  219. package/modules/council/index.ts +0 -9
  220. package/modules/council/personas.ts +0 -57
  221. package/modules/council/reviewers.ts +0 -82
  222. package/modules/council/risk.ts +0 -89
  223. package/modules/council/types.ts +0 -107
  224. package/modules/jury/index.ts +0 -5
  225. package/modules/jury/preflight.ts +0 -101
  226. package/modules/jury/schema.ts +0 -24
  227. package/modules/jury/types.ts +0 -50
  228. package/modules/oracle/adapters/lance-db.ts +0 -81
  229. package/modules/oracle/adapters/xenova-embedder.ts +0 -43
  230. package/modules/oracle/bm25.ts +0 -92
  231. package/modules/oracle/index.ts +0 -36
  232. package/modules/oracle/log.ts +0 -15
  233. package/modules/oracle/propose.ts +0 -164
  234. package/modules/oracle/query.ts +0 -146
  235. package/modules/oracle/summary.ts +0 -116
  236. package/modules/oracle/types.ts +0 -32
  237. package/modules/sentinel/assert.ts +0 -95
  238. package/modules/sentinel/coverage.ts +0 -106
  239. package/modules/sentinel/drift.ts +0 -163
  240. package/modules/sentinel/index.ts +0 -6
  241. package/modules/sentinel/review.ts +0 -208
  242. package/modules/setup.ts +0 -202
  243. package/modules/shared/types.ts +0 -193
package/evals/cases/auth_rs256_valid.json DELETED
@@ -1,30 +0,0 @@
- {
-   "id": "auth_rs256_valid",
-   "description": "Proposing the already-approved RS256 pattern — should proceed",
-   "outcome": "Add JWT authentication to the API",
-   "design": "RS256 tokens with 15-minute expiry and refresh rotation stored in httpOnly cookies, matching the approved pattern in Chronicle",
-   "oracle_evidence": [
-     {
-       "id": "auth-031",
-       "key_insight": "RS256 with short-lived tokens and refresh rotation in httpOnly cookies is the approved pattern",
-       "decision": "RS256 with 15-min access tokens and refresh rotation stored in httpOnly cookies is the approved auth pattern",
-       "schema_version": 2,
-       "affected_areas": ["src/auth/", "src/middleware/"],
-       "scope": ["auth", "sessions"],
-       "status": "validated",
-       "confidence": 0.88,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-03-15T11:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_min_confidence": 0.65,
-     "council_recommendation": "proceed",
-     "must_not_flag": ["key rotation problem"],
-     "risk_level": "critical",
-     "preflight_expects": {
-       "touches_sensitive_area": true
-     }
-   }
- }
package/evals/cases/cache_missing_lock.json DELETED
@@ -1,31 +0,0 @@
- {
-   "id": "cache_missing_lock",
-   "description": "Cache implementation missing stampede protection — should warn or block",
-   "outcome": "Cache expensive product catalogue queries in Redis",
-   "design": "On cache miss, fetch from database and write to Redis with a 5-minute TTL. No locking strategy.",
-   "oracle_evidence": [
-     {
-       "id": "cache-008",
-       "key_insight": "Redis cache without stampede protection caused DB overload during peak traffic",
-       "decision": "Redis cache without stampede protection caused DB overload — all cache misses hit DB simultaneously during spikes",
-       "schema_version": 2,
-       "affected_areas": ["src/cache/", "src/api/products.ts"],
-       "scope": ["cache", "performance"],
-       "status": "refuted",
-       "confidence": 0.85,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-02-20T14:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_max_confidence": 0.60,
-     "council_recommendation": "redesign",
-     "must_flag": ["stampede", "lock"],
-     "must_cite": ["cache-008"],
-     "risk_level": "medium",
-     "preflight_expects": {
-       "chronicle_conflicts": ["cache-008"]
-     }
-   }
- }
package/evals/cases/db_naive_not_null.json DELETED
@@ -1,32 +0,0 @@
- {
-   "id": "db_naive_not_null",
-   "description": "Naive NOT NULL migration on a large table — should block",
-   "outcome": "Add a NOT NULL region column to the users table",
-   "design": "Run ALTER TABLE users ADD COLUMN region TEXT NOT NULL DEFAULT 'uk'",
-   "oracle_evidence": [
-     {
-       "id": "db-017",
-       "key_insight": "Avoid naive locking migrations on large tables",
-       "decision": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-       "schema_version": 2,
-       "affected_areas": ["db/migrations", "src/models/user.ts"],
-       "scope": ["database", "migrations"],
-       "status": "refuted",
-       "confidence": 0.91,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-01-10T10:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_max_confidence": 0.55,
-     "council_recommendation": "redesign",
-     "must_flag": ["lock strategy", "rollback"],
-     "must_cite": ["db-017"],
-     "risk_level": "high",
-     "preflight_expects": {
-       "rollback_mentioned": false,
-       "chronicle_conflicts": ["db-017"]
-     }
-   }
- }
package/evals/cases/logging_pii_leak.json DELETED
@@ -1,32 +0,0 @@
- {
-   "id": "logging_pii_leak",
-   "description": "Logging request bodies that contain PII — should block",
-   "outcome": "Add structured request logging to the API for debugging",
-   "design": "Log full request body including email, name, and any user-submitted fields to stdout using JSON format",
-   "oracle_evidence": [
-     {
-       "id": "sec-003",
-       "key_insight": "Logging PII fields violates GDPR and our data retention policy",
-       "decision": "Logging PII fields (email, name, address) violates GDPR Article 5 — all request logging must scrub or omit PII fields",
-       "schema_version": 2,
-       "affected_areas": ["src/middleware/logger.ts", "src/api/"],
-       "scope": ["pii", "compliance", "logging"],
-       "status": "validated",
-       "confidence": 0.95,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-01-05T08:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_max_confidence": 0.40,
-     "council_recommendation": "redesign",
-     "must_flag": ["PII", "GDPR", "email"],
-     "must_cite": ["sec-003"],
-     "risk_level": "high",
-     "preflight_expects": {
-       "touches_sensitive_area": true,
-       "sensitive_areas_include": ["pii"]
-     }
-   }
- }
package/evals/cases/migration_with_rollback.json DELETED
@@ -1,43 +0,0 @@
- {
-   "id": "migration_with_rollback",
-   "description": "DB migration that explicitly addresses rollback and uses safe pattern — should proceed",
-   "outcome": "Add a NOT NULL region column to the users table",
-   "design": "Use shadow column pattern: add region TEXT NULLABLE, backfill via batched update, then add NOT NULL constraint after 100% fill confirmed. Rollback: drop shadow column. Uses pg_repack to avoid exclusive locks.",
-   "oracle_evidence": [
-     {
-       "id": "db-017",
-       "key_insight": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-       "decision": "Avoid naive locking migrations on large tables — use shadow column pattern or pg_repack",
-       "schema_version": 2,
-       "affected_areas": ["db/migrations", "src/models/user.ts"],
-       "scope": ["database", "migrations"],
-       "status": "refuted",
-       "confidence": 0.91,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-01-10T10:00:00Z"
-     },
-     {
-       "id": "db-019",
-       "key_insight": "Shadow column pattern with batched backfill is the approved approach for NOT NULL migrations",
-       "decision": "Shadow column pattern with batched backfill is the approved approach for large NOT NULL migrations",
-       "schema_version": 2,
-       "affected_areas": ["db/migrations"],
-       "scope": ["database", "migrations"],
-       "status": "validated",
-       "confidence": 0.87,
-       "source_module": "council",
-       "evidence_cited": ["db-017"],
-       "timestamp": "2025-02-01T12:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_min_confidence": 0.65,
-     "council_recommendation": "proceed",
-     "risk_level": "high",
-     "preflight_expects": {
-       "rollback_mentioned": true,
-       "chronicle_conflicts": ["db-017"]
-     }
-   }
- }
package/evals/cases/no_evidence_novel_design.json DELETED
@@ -1,16 +0,0 @@
- {
-   "id": "no_evidence_novel_design",
-   "description": "Novel design with no Chronicle evidence either way — should investigate-more",
-   "outcome": "Implement real-time collaboration features using WebSockets",
-   "design": "Use Socket.io for bi-directional communication, Redis pub/sub for multi-instance message fanout, and optimistic UI updates with conflict resolution via last-write-wins",
-   "oracle_evidence": [],
-   "expected": {
-     "jury_max_confidence": 0.65,
-     "council_recommendation": "investigate-more",
-     "risk_level": "medium",
-     "preflight_expects": {
-       "touches_sensitive_area": false,
-       "chronicle_conflicts": []
-     }
-   }
- }
package/evals/cases/payment_no_idempotency.json DELETED
@@ -1,33 +0,0 @@
- {
-   "id": "payment_no_idempotency",
-   "description": "Payment charge without idempotency key — should block",
-   "outcome": "Implement one-click repurchase for customers",
-   "design": "On button click, POST /api/charge with the stored card token and amount. Retry on network failure up to 3 times.",
-   "oracle_evidence": [
-     {
-       "id": "pay-004",
-       "key_insight": "Payment charges without idempotency keys caused duplicate charges during network retries",
-       "decision": "All payment charge requests must include a Stripe idempotency key — retries without idempotency keys caused duplicate charges in production",
-       "schema_version": 2,
-       "affected_areas": ["src/payments/", "src/api/checkout.ts"],
-       "scope": ["payments", "stripe"],
-       "status": "refuted",
-       "confidence": 0.97,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-04-01T16:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_max_confidence": 0.40,
-     "council_recommendation": "redesign",
-     "must_flag": ["idempotency", "duplicate charge"],
-     "must_cite": ["pay-004"],
-     "risk_level": "critical",
-     "preflight_expects": {
-       "touches_sensitive_area": true,
-       "sensitive_areas_include": ["payments"],
-       "chronicle_conflicts": ["pay-004"]
-     }
-   }
- }
package/evals/cases/redis_session_rejected.json DELETED
@@ -1,32 +0,0 @@
- {
-   "id": "redis_session_rejected",
-   "description": "Proposing Redis sessions when they were already removed — should block",
-   "outcome": "Implement user session management",
-   "design": "Store session data in Redis with a 30-minute TTL and auto-extend on activity. Use express-session with connect-redis.",
-   "oracle_evidence": [
-     {
-       "id": "auth-015",
-       "key_insight": "Redis sessions removed due to memory overhead at scale and operational complexity",
-       "decision": "Redis sessions removed — memory overhead at scale was unsustainable and operational complexity (Redis cluster, failover) added too much risk",
-       "schema_version": 2,
-       "affected_areas": ["src/auth/", "src/middleware/session.ts"],
-       "scope": ["auth", "sessions", "infrastructure"],
-       "status": "refuted",
-       "confidence": 0.89,
-       "source_module": "council",
-       "evidence_cited": [],
-       "timestamp": "2025-02-10T15:00:00Z"
-     }
-   ],
-   "expected": {
-     "jury_max_confidence": 0.50,
-     "council_recommendation": "redesign",
-     "must_flag": ["memory overhead", "Redis"],
-     "must_cite": ["auth-015"],
-     "risk_level": "critical",
-     "preflight_expects": {
-       "touches_sensitive_area": true,
-       "chronicle_conflicts": ["auth-015"]
-     }
-   }
- }
package/evals/cases/safe_refactor.json DELETED
@@ -1,17 +0,0 @@
- {
-   "id": "safe_refactor",
-   "description": "Low-risk internal refactor with no sensitive areas — should proceed without friction",
-   "outcome": "Rename internal helper functions in the reporting module for consistency",
-   "design": "Rename generateCsvReport to exportReportAsCsv and generatePdfReport to exportReportAsPdf in src/reports/. Update all callers. No behaviour change.",
-   "oracle_evidence": [],
-   "expected": {
-     "jury_min_confidence": 0.70,
-     "council_recommendation": "proceed",
-     "risk_level": "low",
-     "preflight_expects": {
-       "touches_sensitive_area": false,
-       "rollback_mentioned": false,
-       "chronicle_conflicts": []
-     }
-   }
- }
package/evals/runner.ts DELETED
@@ -1,226 +0,0 @@
- /**
-  * Eval runner for Quorum Jury + Council.
-  *
-  * Each case in evals/cases/ defines a proposal and what the system should produce.
-  * The runner validates:
-  *   - Jury confidence is within expected bounds
-  *   - Preflight detects the expected signals
-  *   - Risk classifier assigns the expected level
-  *   - Council recommendation matches (when an LLM provider is available)
-  *
-  * Jury + preflight run without any LLM (deterministic).
-  * Council assertions are skipped if no LLM provider is injected.
-  *
-  * Usage:
-  *   npx vitest run evals/
-  *
-  * Or run against a real LLM:
-  *   EVAL_LLM=openai npx vitest run evals/
-  */
-
- import { promises as fs } from "fs"
- import path from "path"
- import type { OracleResult, LLMProvider } from "../modules/shared/types"
- import { runPreflight } from "../modules/jury/preflight"
- import { classifyRisk } from "../modules/council/risk"
-
- export interface EvalCase {
-   id: string
-   description: string
-   outcome: string
-   design: string
-   oracle_evidence: OracleResult[]
-   expected: {
-     jury_min_confidence?: number
-     jury_max_confidence?: number
-     council_recommendation?: "proceed" | "redesign" | "investigate-more"
-     must_flag?: string[]
-     must_not_flag?: string[]
-     must_cite?: string[]
-     risk_level?: string
-     preflight_expects?: {
-       touches_sensitive_area?: boolean
-       sensitive_areas_include?: string[]
-       rollback_mentioned?: boolean
-       test_strategy_mentioned?: boolean
-       chronicle_conflicts?: string[]
-     }
-   }
- }
-
- export interface EvalResult {
-   caseId: string
-   description: string
-   passed: boolean
-   failures: string[]
-   preflight: ReturnType<typeof runPreflight>
-   risk: ReturnType<typeof classifyRisk>
-   juryOutput?: unknown
-   councilOutput?: unknown
-   durationMs: number
- }
-
- export async function loadCases(casesDir?: string): Promise<EvalCase[]> {
-   const dir = casesDir ?? path.join(__dirname, "cases")
-   const files = (await fs.readdir(dir)).filter(f => f.endsWith(".json"))
-   const cases = await Promise.all(
-     files.map(async f => {
-       const raw = await fs.readFile(path.join(dir, f), "utf8")
-       return JSON.parse(raw) as EvalCase
-     }),
-   )
-   return cases
- }
-
- export async function runCase(
-   evalCase: EvalCase,
-   llm?: LLMProvider,
- ): Promise<EvalResult> {
-   const start = Date.now()
-   const failures: string[] = []
-
-   const { outcome, design, oracle_evidence: evidence, expected } = evalCase
-
-   // ── Deterministic checks (no LLM) ──────────────────────────────────────────
-
-   const preflight = runPreflight(outcome, design, evidence)
-   const risk = classifyRisk(outcome, design, evidence)
-
-   // Risk level
-   if (expected.risk_level && risk.level !== expected.risk_level) {
-     failures.push(
-       `risk_level: expected "${expected.risk_level}", got "${risk.level}" (reasons: ${risk.reasons.join(", ")})`,
-     )
-   }
-
-   // Preflight assertions
-   const pf = expected.preflight_expects
-   if (pf) {
-     if (pf.touches_sensitive_area !== undefined && preflight.touches_sensitive_area !== pf.touches_sensitive_area) {
-       failures.push(`preflight.touches_sensitive_area: expected ${pf.touches_sensitive_area}, got ${preflight.touches_sensitive_area}`)
-     }
-     if (pf.rollback_mentioned !== undefined && preflight.rollback_mentioned !== pf.rollback_mentioned) {
-       failures.push(`preflight.rollback_mentioned: expected ${pf.rollback_mentioned}, got ${preflight.rollback_mentioned}`)
-     }
-     if (pf.test_strategy_mentioned !== undefined && preflight.test_strategy_mentioned !== pf.test_strategy_mentioned) {
-       failures.push(`preflight.test_strategy_mentioned: expected ${pf.test_strategy_mentioned}, got ${preflight.test_strategy_mentioned}`)
-     }
-     if (pf.chronicle_conflicts) {
-       for (const id of pf.chronicle_conflicts) {
-         if (!preflight.chronicle_conflicts.includes(id)) {
-           failures.push(`preflight.chronicle_conflicts: expected "${id}" to be flagged`)
-         }
-       }
-     }
-     if (pf.sensitive_areas_include) {
-       for (const area of pf.sensitive_areas_include) {
-         if (!preflight.sensitive_areas.includes(area)) {
-           failures.push(`preflight.sensitive_areas: expected "${area}" to be detected`)
-         }
-       }
-     }
-   }
-
-   let juryOutput: unknown
-   let councilOutput: unknown
-
-   // ── LLM-dependent checks (skipped if no provider) ──────────────────────────
-
-   if (llm) {
-     const { evaluate } = await import("../modules/jury/evaluate")
-     try {
-       juryOutput = await evaluate({ outcome, design, evidence }, { llm })
-       const jury = juryOutput as { confidence: number; recommendation: string; assessment: string; gaps: string[] }
-
-       if (expected.jury_min_confidence !== undefined && jury.confidence < expected.jury_min_confidence) {
-         failures.push(`jury.confidence: expected ≥ ${expected.jury_min_confidence}, got ${jury.confidence}`)
-       }
-       if (expected.jury_max_confidence !== undefined && jury.confidence > expected.jury_max_confidence) {
-         failures.push(`jury.confidence: expected ≤ ${expected.jury_max_confidence}, got ${jury.confidence}`)
-       }
-     } catch (err) {
-       failures.push(`jury threw: ${String(err)}`)
-     }
-
-     if (expected.council_recommendation && juryOutput) {
-       const { deliberate } = await import("../modules/council/deliberate")
-       const mockOracle = {
-         query: async () => [],
-         propose: async () => ({ proposalId: "eval-proposal" }),
-         commit: async () => { throw new Error("commit not available in eval") },
-       }
-       try {
-         councilOutput = await deliberate(
-           { outcome, design, evidence, jury_output: juryOutput as never },
-           { llm, oracle: mockOracle, advisorCount: 2, reviewerCount: 2 },
-         )
-         const council = councilOutput as { recommendation: string; verdict: string; blockers: Array<{ issue: string }>; evidence_cited: string[] }
-
-         if (council.recommendation !== expected.council_recommendation) {
-           failures.push(
-             `council.recommendation: expected "${expected.council_recommendation}", got "${council.recommendation}"`,
-           )
-         }
-
-         const verdictText = [
-           council.verdict,
-           ...council.blockers.map(b => b.issue),
-         ].join(" ").toLowerCase()
-
-         if (expected.must_flag) {
-           for (const term of expected.must_flag) {
-             if (!verdictText.includes(term.toLowerCase())) {
-               failures.push(`council must_flag: "${term}" not mentioned in verdict or blockers`)
-             }
-           }
-         }
-         if (expected.must_not_flag) {
-           for (const term of expected.must_not_flag) {
-             if (verdictText.includes(term.toLowerCase())) {
-               failures.push(`council must_not_flag: "${term}" was mentioned but should not be`)
-             }
-           }
-         }
-         if (expected.must_cite) {
-           for (const id of expected.must_cite) {
-             if (!council.evidence_cited.includes(id)) {
-               failures.push(`council must_cite: entry ID "${id}" not in evidence_cited`)
-             }
-           }
-         }
-       } catch (err) {
-         failures.push(`council threw: ${String(err)}`)
-       }
-     }
-   }
-
-   return {
-     caseId: evalCase.id,
-     description: evalCase.description,
-     passed: failures.length === 0,
-     failures,
-     preflight,
-     risk,
-     juryOutput,
-     councilOutput,
-     durationMs: Date.now() - start,
-   }
- }
-
- export function printEvalSummary(results: EvalResult[]): void {
-   const passed = results.filter(r => r.passed).length
-   const total = results.length
-   console.log(`\n${"─".repeat(60)}`)
-   console.log(`Eval results: ${passed}/${total} passed`)
-   console.log("─".repeat(60))
-   for (const r of results) {
-     const icon = r.passed ? "✓" : "✗"
-     console.log(`${icon} ${r.caseId} (${r.durationMs}ms)`)
-     if (!r.passed) {
-       for (const f of r.failures) {
-         console.log(` → ${f}`)
-       }
-     }
-   }
-   console.log("─".repeat(60))
- }
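Reviewer note: the deleted runner's pass/fail logic comes down to two kinds of check, numeric bounds on jury confidence and case-insensitive substring matches over the council verdict plus blocker text. The sketch below is a minimal, standalone illustration of that idea and is not code from the package. The `Expected` and `Observed` interfaces and the `collectFailures` function are hypothetical names introduced here. Only the field names `jury_min_confidence`, `jury_max_confidence`, `must_flag`, and `must_not_flag` are taken from the removed `EvalCase` type.

```typescript
// Hypothetical, trimmed-down mirror of the `expected` block in the removed EvalCase type.
interface Expected {
  jury_min_confidence?: number;
  jury_max_confidence?: number;
  must_flag?: string[];
  must_not_flag?: string[];
}

// Hypothetical stand-in for the jury and council outputs the runner inspected.
interface Observed {
  confidence: number;
  verdict: string;
  blockers: string[];
}

// Returns one message per violated expectation; an empty array means the case passes.
function collectFailures(expected: Expected, observed: Observed): string[] {
  const failures: string[] = [];
  const { jury_min_confidence: min, jury_max_confidence: max } = expected;

  // Bounds are optional, so each is only checked when present.
  if (min !== undefined && observed.confidence < min) {
    failures.push(`confidence ${observed.confidence} is below ${min}`);
  }
  if (max !== undefined && observed.confidence > max) {
    failures.push(`confidence ${observed.confidence} is above ${max}`);
  }

  // Case-insensitive substring match over the verdict plus all blocker text.
  const text = [observed.verdict, ...observed.blockers].join(" ").toLowerCase();
  for (const term of expected.must_flag ?? []) {
    if (!text.includes(term.toLowerCase())) failures.push(`missing "${term}"`);
  }
  for (const term of expected.must_not_flag ?? []) {
    if (text.includes(term.toLowerCase())) failures.push(`unexpected "${term}"`);
  }
  return failures;
}
```

Because matching is by substring, a `must_flag` term such as "lock" is also satisfied by words like "locking" or "block", which seems worth keeping in mind when reading the case files above.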
package/modules/AGENTS.md DELETED
@@ -1,78 +0,0 @@
- # modules/ — Agent Instructions
-
- Supplements the root `AGENTS.md` / `copilot-instructions.md` with module-specific internals.
- When working inside this folder, follow these rules in addition to the root guidelines.
-
- ---
-
- ## File ownership
-
- ### Oracle
- | File | Owns |
- |---|---|
- | `oracle/query.ts` | Two-pass retrieval (vector → BM25 → RRF fusion). Score threshold. Query log. |
- | `oracle/bm25.ts` | BM25 scoring algorithm. Domain term extraction for query enrichment. |
- | `oracle/propose.ts` | `propose()` + `commit()`. The human-gated write path. Do not add auto-commit logic here. |
- | `oracle/log.ts` | Best-effort JSONL query log writer. Must never throw to callers. |
- | `oracle/adapters/lance-db.ts` | LanceDB `VectorStore` implementation. Swappable — do not couple oracle internals to this. |
- | `oracle/adapters/xenova-embedder.ts` | Local ONNX embedder. Swappable — do not couple oracle internals to this. |
-
- ### Jury
- | File | Owns |
- |---|---|
- | `jury/schema.ts` | Zod schema for structured LLM output. Source of truth for `JuryOutput` shape including `confidence_breakdown` and `blocking_gaps`. |
- | `jury/evaluate.ts` | Four-dimension evaluation. **Confidence is always recomputed from the breakdown average here — do not remove this. `council_brief` is also overridden from confidence.** |
- | `jury/preflight.ts` | Deterministic preflight — no LLM. Detects sensitive areas, rollback mention, and Chronicle conflicts before the LLM runs. Safe to extend with new patterns. |
-
- ### Council
- | File | Owns |
- |---|---|
- | `council/personas.ts` | Default advisor personas. Safe to extend. Do not remove existing personas without good reason. |
- | `council/frame.ts` | Sets deliberation tone from `council_brief`. Challenge vs pressure-test framing lives here. |
- | `council/advisors.ts` | Parallel advisor fan-out. Advisors must cite Oracle entry IDs — enforced in the prompt. |
- | `council/reviewers.ts` | Anonymisation of advisor responses + parallel reviewer fan-out. Anonymisation must happen before reviewers see responses. |
- | `council/chairman.ts` | Verdict synthesis + Zod validation. Produces structured `blockers`/`warnings`, validates citations, tracks `advisor_split`. Throws on bad output — do not add fallbacks. |
- | `council/risk.ts` | Deterministic risk classifier — no LLM. Assigns `low/medium/high/critical` and `council_mode` from design text and refuted evidence. Drives advisor/reviewer fan-out counts. |
- | `council/deliberate.ts` | Full pipeline orchestration. Calls `oracle.propose()` at the end — never `oracle.commit()`. Risk classifier runs first to set fan-out counts. |
-
- ### Advisor
- | File | Owns |
- |---|---|
- | `advisor/ask.ts` | Main entry point. Queries Oracle, calls LLM, validates answer against satisfaction threshold (confidence ≥ 0.7, no blockers). Retries up to 2 times with previous answer as context. Throws on bad LLM output — do not add fallbacks. |
42
- | `advisor/prompt.ts` | SYSTEM_PROMPT, evidence formatter, user prompt builder. The plain-language framing lives here. |
43
- | `advisor/types.ts` | `AdvisorInput`, `AdvisorAnswer`, `AdvisorOutput`, `AdvisorDeps` types. |
44
-
45
- ---
46
-
47
- ## Extension points
48
-
49
- **Swap the vector store** — implement `VectorStore` from `oracle/types.ts` and pass it to `createOracleClient()` or `setup()`.
50
-
51
- **Swap the embedder** — pass `embedder: yourFn` to `setup()`. Must return a consistent-dimension float array.
52
-
53
- **Add advisor personas** — extend `DEFAULT_PERSONAS` in `council/personas.ts`, or pass a custom personas array directly to `fanOutAdvisors()`.
54
-
55
- **Use different models per step** — pass `models` to `setup()` or `council.deliberate()` deps. Cheaper models for advisors, stronger for chairman is the intended pattern.
56
-
57
- ---
58
-
59
- ## Invariants — do not break these
60
-
61
- - `advisor/ask.ts` never calls `oracle.propose()` or `oracle.commit()`. It is a read-only path.
62
- - `oracle.commit()` is never called without explicit human input. `deliberate()` calls `propose()` only.
63
- - `jury/evaluate.ts` recomputes `confidence` as the exact average of `confidence_breakdown` dimensions — the LLM value is discarded.
64
- - `jury/evaluate.ts` derives `council_brief` from the recomputed confidence — never trusts the LLM value.
65
- - `chairman.ts` and `jury/evaluate.ts` throw on schema validation failure. Do not add try/catch that swallows these errors.
66
- - `deliberate.ts` passes `citation_validation.valid_ids` (not raw `evidence_cited`) to `oracle.propose()` — hallucinated IDs are stripped.
67
- - Query logging in `oracle/log.ts` is always best-effort — callers must not fail because of a log write error.
68
- - `VectorStore` and `embedder` are always injected — never imported directly inside Oracle logic.
69
-
70
- ---
71
-
72
- ## Tests
73
-
74
- ```bash
75
- npx vitest run modules/
76
- ```
77
-
78
- Tests live in `__tests__/` inside each module folder. Use `vi.fn()` for LLM providers and vector stores — never call a real LLM in tests.
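
The two `jury/evaluate.ts` invariants listed in the deleted file (confidence recomputed as the exact average of `confidence_breakdown`, and `council_brief` derived from that recomputed value) can be illustrated with a minimal sketch. This is not the package's code: the function name `normalizeJuryOutput`, the dimension names, the `0.7` cut-off, and the mapping of low confidence to `"challenge"` and high confidence to `"pressure-test"` are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real ones live in jury/schema.ts and may differ.
type CouncilBrief = "challenge" | "pressure-test";

interface JuryOutputSketch {
  confidence: number; // as reported by the LLM, overwritten below
  council_brief: CouncilBrief; // as reported by the LLM, overwritten below
  confidence_breakdown: Record<string, number>;
}

// Assumed threshold, chosen only for this example.
const BRIEF_THRESHOLD = 0.7;

// Recompute confidence from the breakdown and derive council_brief from it,
// ignoring whatever the LLM put in those two fields.
function normalizeJuryOutput(raw: JuryOutputSketch): JuryOutputSketch {
  const keys = Object.keys(raw.confidence_breakdown);
  if (keys.length === 0) {
    // Mirrors the "throw, do not fall back" rule stated in the invariants.
    throw new Error("confidence_breakdown must contain at least one dimension");
  }
  const total = keys.reduce((sum, k) => sum + raw.confidence_breakdown[k], 0);
  const confidence = total / keys.length;
  const council_brief: CouncilBrief =
    confidence >= BRIEF_THRESHOLD ? "pressure-test" : "challenge";
  return { ...raw, confidence, council_brief };
}

// Usage: the LLM claims 0.99, but the breakdown averages to 0.375.
const normalized = normalizeJuryOutput({
  confidence: 0.99,
  council_brief: "pressure-test",
  confidence_breakdown: { a: 0.25, b: 0.25, c: 0.5, d: 0.5 }, // hypothetical dimensions
});
console.log(normalized.confidence, normalized.council_brief);
```

Keeping both derived fields in one pure function means a test can assert the invariant without mocking an LLM provider at all.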