ultimate-pi 0.1.7 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (524)
  1. package/.agents/skills/graphify/.graphify_version +1 -0
  2. package/.agents/skills/graphify/SKILL.md +1204 -0
  3. package/.agents/skills/wiki-autoresearch/SKILL.md +225 -97
  4. package/.agents/skills/wiki-autoresearch/references/program.md +28 -62
  5. package/.agents/skills/wiki-autoresearch/references/quality-sites.md +32 -0
  6. package/.env.example +5 -1
  7. package/.gitattributes +1 -0
  8. package/.github/workflows/publish-github-packages.yml +1 -1
  9. package/.pi/SYSTEM.md +72 -18
  10. package/.pi/agents/harness/adversary.md +32 -0
  11. package/.pi/agents/harness/evaluator.md +32 -0
  12. package/.pi/agents/harness/executor.md +34 -0
  13. package/.pi/agents/harness/meta-optimizer.md +33 -0
  14. package/.pi/agents/harness/planner.md +33 -0
  15. package/.pi/agents/harness/tie-breaker.md +35 -0
  16. package/.pi/agents/harness/trace-librarian.md +32 -0
  17. package/.pi/extensions/banner.png +0 -0
  18. package/.pi/extensions/budget-guard.ts +265 -0
  19. package/.pi/extensions/custom-footer.ts +194 -22
  20. package/.pi/extensions/custom-header.ts +47 -9
  21. package/.pi/extensions/debate-orchestrator.ts +479 -0
  22. package/.pi/extensions/harness-live-widget.ts +438 -0
  23. package/.pi/extensions/policy-gate.ts +349 -0
  24. package/.pi/extensions/review-integrity.ts +198 -0
  25. package/.pi/extensions/test-diff-integrity.ts +240 -0
  26. package/.pi/extensions/trace-recorder.ts +315 -0
  27. package/.pi/harness/README.md +23 -0
  28. package/.pi/harness/router/README.md +35 -0
  29. package/.pi/harness/router/apply-router-proposal.mjs +153 -0
  30. package/.pi/harness/router/propose-router-tuning.mjs +149 -0
  31. package/.pi/harness/specs/README.md +37 -0
  32. package/.pi/harness/specs/adversary-report.schema.json +53 -0
  33. package/.pi/harness/specs/budget-exhausted-event.schema.json +93 -0
  34. package/.pi/harness/specs/consensus-packet.schema.json +175 -0
  35. package/.pi/harness/specs/eval-verdict.schema.json +59 -0
  36. package/.pi/harness/specs/incident-record.schema.json +84 -0
  37. package/.pi/harness/specs/plan-packet.schema.json +90 -0
  38. package/.pi/harness/specs/round-result.schema.json +126 -0
  39. package/.pi/harness/specs/router-tuning-proposal.schema.json +114 -0
  40. package/.pi/harness/specs/run-trace.schema.json +107 -0
  41. package/.pi/lib/harness-ui-state.ts +311 -0
  42. package/.pi/mcp.json +4 -0
  43. package/.pi/model-router.json +93 -93
  44. package/.pi/prompts/graphify.md +23 -0
  45. package/.pi/prompts/harness-abort.md +41 -0
  46. package/.pi/prompts/harness-auto.md +83 -0
  47. package/.pi/prompts/harness-critic.md +52 -0
  48. package/.pi/prompts/harness-eval.md +51 -0
  49. package/.pi/prompts/harness-incident.md +51 -0
  50. package/.pi/prompts/harness-plan.md +64 -0
  51. package/.pi/prompts/harness-review.md +52 -0
  52. package/.pi/prompts/harness-router-tune.md +74 -0
  53. package/.pi/prompts/harness-run.md +59 -0
  54. package/.pi/prompts/harness-setup.md +316 -216
  55. package/.pi/prompts/harness-trace.md +51 -0
  56. package/.pi/prompts/wiki-autoresearch.md +9 -7
  57. package/.pi/prompts/wiki-save.md +20 -0
  58. package/.pi/skills/agent-router/SKILL.md +2 -4
  59. package/.pi/skills/ast-grep/SKILL.md +354 -0
  60. package/.pi/sounds/project-sounds.json +18 -24
  61. package/AGENTS.md +30 -0
  62. package/CHANGELOG.md +89 -0
  63. package/CONTRIBUTING.md +51 -1
  64. package/README.md +264 -20
  65. package/biome.json +8 -2
  66. package/lefthook.yml +3 -2
  67. package/node_modules/@sting8k/pi-vcc/README.md +200 -0
  68. package/node_modules/@sting8k/pi-vcc/index.ts +14 -0
  69. package/node_modules/@sting8k/pi-vcc/package.json +26 -0
  70. package/node_modules/@sting8k/pi-vcc/scripts/audit-sessions.ts +88 -0
  71. package/node_modules/@sting8k/pi-vcc/scripts/benchmark-real-sessions.ts +25 -0
  72. package/node_modules/@sting8k/pi-vcc/scripts/compare-before-after.ts +36 -0
  73. package/node_modules/@sting8k/pi-vcc/scripts/dump-branch-output.ts +20 -0
  74. package/node_modules/@sting8k/pi-vcc/src/commands/pi-vcc.ts +36 -0
  75. package/node_modules/@sting8k/pi-vcc/src/commands/vcc-recall.ts +65 -0
  76. package/node_modules/@sting8k/pi-vcc/src/core/brief.ts +381 -0
  77. package/node_modules/@sting8k/pi-vcc/src/core/build-sections.ts +79 -0
  78. package/node_modules/@sting8k/pi-vcc/src/core/content.ts +60 -0
  79. package/node_modules/@sting8k/pi-vcc/src/core/filter-noise.ts +42 -0
  80. package/node_modules/@sting8k/pi-vcc/src/core/format-recall.ts +27 -0
  81. package/node_modules/@sting8k/pi-vcc/src/core/format.ts +49 -0
  82. package/node_modules/@sting8k/pi-vcc/src/core/lineage.ts +26 -0
  83. package/node_modules/@sting8k/pi-vcc/src/core/load-messages.ts +41 -0
  84. package/node_modules/@sting8k/pi-vcc/src/core/normalize.ts +66 -0
  85. package/node_modules/@sting8k/pi-vcc/src/core/recall-scope.ts +14 -0
  86. package/node_modules/@sting8k/pi-vcc/src/core/render-entries.ts +55 -0
  87. package/node_modules/@sting8k/pi-vcc/src/core/report.ts +237 -0
  88. package/node_modules/@sting8k/pi-vcc/src/core/sanitize.ts +5 -0
  89. package/node_modules/@sting8k/pi-vcc/src/core/search-entries.ts +221 -0
  90. package/node_modules/@sting8k/pi-vcc/src/core/settings.ts +77 -0
  91. package/node_modules/@sting8k/pi-vcc/src/core/skill-collapse.ts +35 -0
  92. package/node_modules/@sting8k/pi-vcc/src/core/summarize.ts +157 -0
  93. package/node_modules/@sting8k/pi-vcc/src/core/tool-args.ts +14 -0
  94. package/node_modules/@sting8k/pi-vcc/src/details.ts +7 -0
  95. package/node_modules/@sting8k/pi-vcc/src/extract/commits.ts +69 -0
  96. package/node_modules/@sting8k/pi-vcc/src/extract/files.ts +80 -0
  97. package/node_modules/@sting8k/pi-vcc/src/extract/goals.ts +79 -0
  98. package/node_modules/@sting8k/pi-vcc/src/extract/preferences.ts +55 -0
  99. package/node_modules/@sting8k/pi-vcc/src/hooks/before-compact.ts +322 -0
  100. package/node_modules/@sting8k/pi-vcc/src/sections.ts +12 -0
  101. package/node_modules/@sting8k/pi-vcc/src/tools/recall.ts +109 -0
  102. package/node_modules/@sting8k/pi-vcc/src/types.ts +14 -0
  103. package/node_modules/@sting8k/pi-vcc/tests/before-compact-hook.test.ts +181 -0
  104. package/node_modules/@sting8k/pi-vcc/tests/before-compact.test.ts +140 -0
  105. package/node_modules/@sting8k/pi-vcc/tests/brief.test.ts +206 -0
  106. package/node_modules/@sting8k/pi-vcc/tests/build-sections.test.ts +59 -0
  107. package/node_modules/@sting8k/pi-vcc/tests/compile.test.ts +80 -0
  108. package/node_modules/@sting8k/pi-vcc/tests/content.test.ts +31 -0
  109. package/node_modules/@sting8k/pi-vcc/tests/extract-goals.test.ts +86 -0
  110. package/node_modules/@sting8k/pi-vcc/tests/extract-preferences.test.ts +30 -0
  111. package/node_modules/@sting8k/pi-vcc/tests/filter-noise.test.ts +61 -0
  112. package/node_modules/@sting8k/pi-vcc/tests/fixtures.ts +61 -0
  113. package/node_modules/@sting8k/pi-vcc/tests/format-recall.test.ts +30 -0
  114. package/node_modules/@sting8k/pi-vcc/tests/format.test.ts +62 -0
  115. package/node_modules/@sting8k/pi-vcc/tests/lineage.test.ts +33 -0
  116. package/node_modules/@sting8k/pi-vcc/tests/load-messages.test.ts +51 -0
  117. package/node_modules/@sting8k/pi-vcc/tests/normalize.test.ts +97 -0
  118. package/node_modules/@sting8k/pi-vcc/tests/real-sessions.test.ts +38 -0
  119. package/node_modules/@sting8k/pi-vcc/tests/recall-expand.test.ts +15 -0
  120. package/node_modules/@sting8k/pi-vcc/tests/recall-scope.test.ts +32 -0
  121. package/node_modules/@sting8k/pi-vcc/tests/recall-tool-scope.test.ts +67 -0
  122. package/node_modules/@sting8k/pi-vcc/tests/render-entries.test.ts +62 -0
  123. package/node_modules/@sting8k/pi-vcc/tests/report.test.ts +44 -0
  124. package/node_modules/@sting8k/pi-vcc/tests/sanitize.test.ts +24 -0
  125. package/node_modules/@sting8k/pi-vcc/tests/search-entries.test.ts +144 -0
  126. package/node_modules/@sting8k/pi-vcc/tests/support/load-session.ts +23 -0
  127. package/node_modules/@sting8k/pi-vcc/tests/support/real-sessions.ts +51 -0
  128. package/package.json +15 -4
  129. package/scripts/__pycache__/merge_graphify_corpora.cpython-314.pyc +0 -0
  130. package/scripts/index_youtube_urls.py +376 -0
  131. package/scripts/merge_graphify_corpora.py +398 -0
  132. package/scripts/regen_graphify_html.py +46 -0
  133. package/.agents/skills/defuddle/SKILL.md +0 -90
  134. package/.agents/skills/wiki/SKILL.md +0 -215
  135. package/.agents/skills/wiki/references/css-snippets.md +0 -122
  136. package/.agents/skills/wiki/references/frontmatter.md +0 -107
  137. package/.agents/skills/wiki/references/git-setup.md +0 -58
  138. package/.agents/skills/wiki/references/mcp-setup.md +0 -149
  139. package/.agents/skills/wiki/references/modes.md +0 -259
  140. package/.agents/skills/wiki/references/plugins.md +0 -96
  141. package/.agents/skills/wiki/references/rest-api.md +0 -124
  142. package/.agents/skills/wiki-fold/SKILL.md +0 -204
  143. package/.agents/skills/wiki-fold/references/fold-template.md +0 -133
  144. package/.agents/skills/wiki-ingest/SKILL.md +0 -288
  145. package/.agents/skills/wiki-lint/SKILL.md +0 -183
  146. package/.agents/skills/wiki-query/SKILL.md +0 -176
  147. package/.pi/agents/rethink.md +0 -140
  148. package/.pi/agents/wiki-ingest.md +0 -67
  149. package/.pi/agents/wiki-lint.md +0 -75
  150. package/.pi/internal/cursor-sdk-transcript-parser.ts +0 -59
  151. package/.pi/prompts/save.md +0 -16
  152. package/.pi/prompts/wiki.md +0 -23
  153. package/.pi/providers/cursor-sdk-provider.test.mjs +0 -476
  154. package/.pi/providers/cursor-sdk-provider.ts +0 -1085
  155. package/vault/AGENTS.md +0 -37
  156. package/vault/wiki/_templates/comparison.md +0 -39
  157. package/vault/wiki/_templates/concept.md +0 -40
  158. package/vault/wiki/_templates/decision.md +0 -21
  159. package/vault/wiki/_templates/entity.md +0 -32
  160. package/vault/wiki/_templates/flow.md +0 -14
  161. package/vault/wiki/_templates/module.md +0 -18
  162. package/vault/wiki/_templates/question.md +0 -31
  163. package/vault/wiki/_templates/source.md +0 -39
  164. package/vault/wiki/concepts/AST-Aware Code Chunking.md +0 -44
  165. package/vault/wiki/concepts/Build-Time Prompt Compilation.md +0 -107
  166. package/vault/wiki/concepts/Context Engine (AI Coding).md +0 -47
  167. package/vault/wiki/concepts/Context-Aware System Reminders.md +0 -61
  168. package/vault/wiki/concepts/Contextualized Text Embedding.md +0 -42
  169. package/vault/wiki/concepts/Contractor vs Employee AI Model.md +0 -55
  170. package/vault/wiki/concepts/Dual-Model Agent Architecture.md +0 -65
  171. package/vault/wiki/concepts/Late Chunking vs Early Chunking.md +0 -43
  172. package/vault/wiki/concepts/Majority Vote Ensembling.md +0 -68
  173. package/vault/wiki/concepts/Meta-Harness.md +0 -16
  174. package/vault/wiki/concepts/Multi-Agent AI Coding Architecture.md +0 -75
  175. package/vault/wiki/concepts/Prompt Enhancement.md +0 -90
  176. package/vault/wiki/concepts/Prompt Renderer.md +0 -89
  177. package/vault/wiki/concepts/Semantic Codebase Indexing.md +0 -67
  178. package/vault/wiki/concepts/additive-config-hierarchy.md +0 -16
  179. package/vault/wiki/concepts/agent-artifacts-verifiable-deliverables.md +0 -71
  180. package/vault/wiki/concepts/agent-browser-browser-automation.md +0 -99
  181. package/vault/wiki/concepts/agent-codebase-interface.md +0 -43
  182. package/vault/wiki/concepts/agent-harness-architecture.md +0 -67
  183. package/vault/wiki/concepts/agent-loop-detection-patterns.md +0 -133
  184. package/vault/wiki/concepts/agent-search-enforcement.md +0 -126
  185. package/vault/wiki/concepts/agent-skills-ecosystem.md +0 -74
  186. package/vault/wiki/concepts/agent-skills-pattern.md +0 -68
  187. package/vault/wiki/concepts/agentic-harness-context-enforcement.md +0 -91
  188. package/vault/wiki/concepts/agentic-harness.md +0 -34
  189. package/vault/wiki/concepts/agentic-orchestration-pipeline.md +0 -56
  190. package/vault/wiki/concepts/agentic-search-no-embeddings.md +0 -18
  191. package/vault/wiki/concepts/anthropic-context-engineering.md +0 -13
  192. package/vault/wiki/concepts/antigravity-agent-first-architecture.md +0 -61
  193. package/vault/wiki/concepts/ast-compression.md +0 -19
  194. package/vault/wiki/concepts/ast-truncation.md +0 -66
  195. package/vault/wiki/concepts/barrel-files.md +0 -37
  196. package/vault/wiki/concepts/browser-harness-agent.md +0 -41
  197. package/vault/wiki/concepts/browser-subagent-visual-verification.md +0 -82
  198. package/vault/wiki/concepts/codebase-intelligence-ecosystem-comparison.md +0 -192
  199. package/vault/wiki/concepts/codebase-intelligence-harness-integration.md +0 -161
  200. package/vault/wiki/concepts/codebase-to-context-ingestion.md +0 -46
  201. package/vault/wiki/concepts/codex-harness-innovations.md +0 -147
  202. package/vault/wiki/concepts/consensus-debate-flow.md +0 -17
  203. package/vault/wiki/concepts/consensus-debate.md +0 -206
  204. package/vault/wiki/concepts/content-addressed-spec-identity.md +0 -166
  205. package/vault/wiki/concepts/context-anxiety.md +0 -57
  206. package/vault/wiki/concepts/context-compression-techniques.md +0 -19
  207. package/vault/wiki/concepts/context-continuity.md +0 -22
  208. package/vault/wiki/concepts/context-drift-in-agents.md +0 -106
  209. package/vault/wiki/concepts/context-engineering.md +0 -62
  210. package/vault/wiki/concepts/context-folding.md +0 -67
  211. package/vault/wiki/concepts/context-mode.md +0 -38
  212. package/vault/wiki/concepts/cursor-harness-innovations.md +0 -107
  213. package/vault/wiki/concepts/deterministic-session-compaction.md +0 -79
  214. package/vault/wiki/concepts/drift-detection-unified.md +0 -296
  215. package/vault/wiki/concepts/execution-feedback-loop.md +0 -46
  216. package/vault/wiki/concepts/feedforward-feedback-harness.md +0 -60
  217. package/vault/wiki/concepts/five-root-cause-metrics-sentrux.md +0 -40
  218. package/vault/wiki/concepts/fork-safe-spec-storage.md +0 -89
  219. package/vault/wiki/concepts/fts5-sandbox.md +0 -19
  220. package/vault/wiki/concepts/fuzzy-edit-matching.md +0 -71
  221. package/vault/wiki/concepts/gemini-cli-architecture.md +0 -104
  222. package/vault/wiki/concepts/generator-evaluator-architecture.md +0 -64
  223. package/vault/wiki/concepts/guardian-agent-pattern.md +0 -67
  224. package/vault/wiki/concepts/harness-configuration-layers.md +0 -89
  225. package/vault/wiki/concepts/harness-control-frameworks.md +0 -155
  226. package/vault/wiki/concepts/harness-engineering-first-principles.md +0 -90
  227. package/vault/wiki/concepts/harness-h-formalism.md +0 -53
  228. package/vault/wiki/concepts/hybrid-code-search.md +0 -61
  229. package/vault/wiki/concepts/inline-post-edit-validation.md +0 -112
  230. package/vault/wiki/concepts/legendary-engineering-patterns-harness.md +0 -110
  231. package/vault/wiki/concepts/lifecycle-hooks.md +0 -94
  232. package/vault/wiki/concepts/mcp-tool-routing.md +0 -102
  233. package/vault/wiki/concepts/memory-system-of-record-vs-ephemeral-cache.md +0 -47
  234. package/vault/wiki/concepts/meta-agent-context-pruning.md +0 -151
  235. package/vault/wiki/concepts/model-adaptive-harness.md +0 -122
  236. package/vault/wiki/concepts/model-routing-agents.md +0 -101
  237. package/vault/wiki/concepts/monorepo-architecture.md +0 -45
  238. package/vault/wiki/concepts/multi-agent-specialization.md +0 -61
  239. package/vault/wiki/concepts/permission-subsystem.md +0 -16
  240. package/vault/wiki/concepts/pi-messenger-analysis.md +0 -243
  241. package/vault/wiki/concepts/pi-vscode-extension-landscape.md +0 -37
  242. package/vault/wiki/concepts/policy-engine-pattern.md +0 -78
  243. package/vault/wiki/concepts/progressive-disclosure-agents.md +0 -53
  244. package/vault/wiki/concepts/progressive-skill-disclosure.md +0 -17
  245. package/vault/wiki/concepts/provider-native-prompting.md +0 -203
  246. package/vault/wiki/concepts/quality-signal-sentrux.md +0 -37
  247. package/vault/wiki/concepts/repo-map-ranking.md +0 -42
  248. package/vault/wiki/concepts/result-monad-error-handling.md +0 -47
  249. package/vault/wiki/concepts/safety-defense-in-depth.md +0 -83
  250. package/vault/wiki/concepts/sandbox-os-enforcement.md +0 -18
  251. package/vault/wiki/concepts/selective-debate-routing.md +0 -70
  252. package/vault/wiki/concepts/self-evolving-harness.md +0 -60
  253. package/vault/wiki/concepts/sentrux-mcp-integration.md +0 -36
  254. package/vault/wiki/concepts/sentrux-rules-engine.md +0 -49
  255. package/vault/wiki/concepts/shell-pattern-compression.md +0 -24
  256. package/vault/wiki/concepts/skill-first-architecture.md +0 -166
  257. package/vault/wiki/concepts/structured-compaction.md +0 -78
  258. package/vault/wiki/concepts/subagent-orchestration.md +0 -17
  259. package/vault/wiki/concepts/subagent-worktree-isolation.md +0 -68
  260. package/vault/wiki/concepts/superpowers-methodology.md +0 -78
  261. package/vault/wiki/concepts/think-in-code.md +0 -73
  262. package/vault/wiki/concepts/ts-execution-layer.md +0 -100
  263. package/vault/wiki/concepts/typescript-strict-mode.md +0 -37
  264. package/vault/wiki/concepts/vcc-conversation-compaction-for-pi.md +0 -53
  265. package/vault/wiki/concepts/verification-drift-detection.md +0 -19
  266. package/vault/wiki/consensus/consensus-records.md +0 -58
  267. package/vault/wiki/decisions/2026-04-30-pi-lean-ctx-native.md +0 -122
  268. package/vault/wiki/decisions/2026-05-07-replace-lean-ctx-with-context-mode.md +0 -59
  269. package/vault/wiki/decisions/adr-008.md +0 -40
  270. package/vault/wiki/decisions/adr-009.md +0 -46
  271. package/vault/wiki/decisions/adr-010.md +0 -55
  272. package/vault/wiki/decisions/adr-011.md +0 -165
  273. package/vault/wiki/decisions/adr-012.md +0 -102
  274. package/vault/wiki/decisions/adr-013.md +0 -59
  275. package/vault/wiki/decisions/adr-014.md +0 -73
  276. package/vault/wiki/decisions/adr-015.md +0 -81
  277. package/vault/wiki/decisions/adr-016.md +0 -91
  278. package/vault/wiki/decisions/adr-017.md +0 -79
  279. package/vault/wiki/decisions/adr-018.md +0 -100
  280. package/vault/wiki/decisions/adr-019.md +0 -75
  281. package/vault/wiki/decisions/adr-020.md +0 -106
  282. package/vault/wiki/decisions/adr-021.md +0 -86
  283. package/vault/wiki/decisions/adr-022.md +0 -113
  284. package/vault/wiki/decisions/adr-023.md +0 -113
  285. package/vault/wiki/decisions/adr-024.md +0 -73
  286. package/vault/wiki/decisions/adr-025.md +0 -130
  287. package/vault/wiki/decisions/adr-026.md +0 -56
  288. package/vault/wiki/decisions/adr-027.md +0 -94
  289. package/vault/wiki/decisions/colocate-wiki.md +0 -34
  290. package/vault/wiki/entities/Anders Hejlsberg.md +0 -29
  291. package/vault/wiki/entities/Anthropic.md +0 -17
  292. package/vault/wiki/entities/Augment Code.md +0 -49
  293. package/vault/wiki/entities/Bjarne Stroustrup.md +0 -26
  294. package/vault/wiki/entities/Bolt.new (StackBlitz).md +0 -39
  295. package/vault/wiki/entities/Boris Cherny.md +0 -11
  296. package/vault/wiki/entities/Claude Code.md +0 -19
  297. package/vault/wiki/entities/Dennis Ritchie.md +0 -26
  298. package/vault/wiki/entities/Emergent Labs.md +0 -32
  299. package/vault/wiki/entities/Google Cloud.md +0 -16
  300. package/vault/wiki/entities/Guido van Rossum.md +0 -28
  301. package/vault/wiki/entities/Ken Thompson.md +0 -28
  302. package/vault/wiki/entities/Lee et al.md +0 -16
  303. package/vault/wiki/entities/Linus Torvalds.md +0 -28
  304. package/vault/wiki/entities/Lovable (company).md +0 -40
  305. package/vault/wiki/entities/Martin Fowler.md +0 -16
  306. package/vault/wiki/entities/Meng et al.md +0 -16
  307. package/vault/wiki/entities/OpenAI.md +0 -16
  308. package/vault/wiki/entities/Rocket.new.md +0 -38
  309. package/vault/wiki/entities/VILA-Lab.md +0 -15
  310. package/vault/wiki/entities/autodev-codebase.md +0 -18
  311. package/vault/wiki/entities/ck-tool.md +0 -59
  312. package/vault/wiki/entities/codesearch.md +0 -18
  313. package/vault/wiki/entities/disler-indydevdan.md +0 -33
  314. package/vault/wiki/entities/gsd-get-shit-done.md +0 -56
  315. package/vault/wiki/entities/javascript-runtimes.md +0 -48
  316. package/vault/wiki/entities/jesse-vincent.md +0 -38
  317. package/vault/wiki/entities/lean-ctx.md +0 -32
  318. package/vault/wiki/entities/opendev.md +0 -41
  319. package/vault/wiki/entities/ops-codegraph-tool.md +0 -18
  320. package/vault/wiki/entities/pi-coding-agent.md +0 -53
  321. package/vault/wiki/entities/sentrux.md +0 -54
  322. package/vault/wiki/entities/vgrep-tool.md +0 -57
  323. package/vault/wiki/entities/vitest.md +0 -41
  324. package/vault/wiki/flows/harness-wiki-pipeline.md +0 -204
  325. package/vault/wiki/hot.md +0 -932
  326. package/vault/wiki/index.md +0 -437
  327. package/vault/wiki/log.md +0 -422
  328. package/vault/wiki/meta/dashboard.md +0 -30
  329. package/vault/wiki/meta/lint-report-2026-04-30.md +0 -86
  330. package/vault/wiki/meta/lint-report-2026-05-02.md +0 -251
  331. package/vault/wiki/meta/overview.canvas +0 -43
  332. package/vault/wiki/modules/adversarial-verification.md +0 -57
  333. package/vault/wiki/modules/automated-observability.md +0 -54
  334. package/vault/wiki/modules/bench.md +0 -20
  335. package/vault/wiki/modules/extensions.md +0 -23
  336. package/vault/wiki/modules/grounding-checkpoints.md +0 -62
  337. package/vault/wiki/modules/harness-implementation-plan.md +0 -345
  338. package/vault/wiki/modules/harness-wiki-skill-mapping.md +0 -135
  339. package/vault/wiki/modules/harness.md +0 -86
  340. package/vault/wiki/modules/persistent-memory.md +0 -85
  341. package/vault/wiki/modules/schema-orchestration.md +0 -68
  342. package/vault/wiki/modules/skills.md +0 -27
  343. package/vault/wiki/modules/spec-hardening.md +0 -58
  344. package/vault/wiki/modules/structured-planning.md +0 -53
  345. package/vault/wiki/modules/think-in-code-enforcement.md +0 -153
  346. package/vault/wiki/modules/wiki-query-interface.md +0 -64
  347. package/vault/wiki/overview.md +0 -51
  348. package/vault/wiki/questions/Research-pi-vs-claude-code-agentic-orchestration-pipeline.md +0 -87
  349. package/vault/wiki/questions/Research-sentrux-dev.md +0 -123
  350. package/vault/wiki/questions/Research-superpowers-skill-for-agentic-coding-agents.md +0 -164
  351. package/vault/wiki/questions/Research: Augment Code Context Engine.md +0 -244
  352. package/vault/wiki/questions/Research: Automating Software Engineering - Lovable, Bolt, Emergent, Rocket.md +0 -112
  353. package/vault/wiki/questions/Research: Claude Code State-of-the-Art Harness Improvements.md +0 -209
  354. package/vault/wiki/questions/Research: Codex State-of-the-Art Harness Improvements.md +0 -99
  355. package/vault/wiki/questions/Research: Engineering Workflows of Legendary Programmers and AI Harness Mapping.md +0 -107
  356. package/vault/wiki/questions/Research: Fallow Codebase Intelligence Harness Integration.md +0 -72
  357. package/vault/wiki/questions/Research: Gemini CLI SOTA Harness Integration.md +0 -166
  358. package/vault/wiki/questions/Research: GitHub Issues as Harness Spec Storage.md +0 -188
  359. package/vault/wiki/questions/Research: Google Antigravity Harness Integration.md +0 -120
  360. package/vault/wiki/questions/Research: Meta-Agent Context Drift Detection.md +0 -236
  361. package/vault/wiki/questions/Research: Model-Adaptive Agent Harness Design.md +0 -95
  362. package/vault/wiki/questions/Research: Model-Specific Prompting Guides.md +0 -165
  363. package/vault/wiki/questions/Research: Prompt Renderer for Multi-Model Agent Harness.md +0 -216
  364. package/vault/wiki/questions/Research: Skill-First Harness Architecture.md +0 -91
  365. package/vault/wiki/questions/Research: TypeScript Best Practices and Codebase Structure.md +0 -88
  366. package/vault/wiki/questions/Research: TypeScript Execution Layer for Agent Tool Calling.md +0 -81
  367. package/vault/wiki/questions/Research: claude-mem over Obsidian for Harness Layer.md +0 -71
  368. package/vault/wiki/questions/Research: claude-mem over obsidian wiki as the knowledge base for our agentic harness pipeline. think from first principles. does this replace or complement our current setup? no hard feelings about previous decisions. gimme accurate points.md +0 -80
  369. package/vault/wiki/questions/Research: context-mode vs lean-ctx.md +0 -72
  370. package/vault/wiki/questions/Research: cursor.sh Harness Innovations.md +0 -92
  371. package/vault/wiki/questions/Research: executor.sh Harness Integration.md +0 -170
  372. package/vault/wiki/questions/Research: how GSD fits into our coding harness setup.md +0 -97
  373. package/vault/wiki/questions/Research: how claude-mem fits into our workflow. and whether it should replace obsidian in the codebase. no hard feelings about previous actions, rethink from first principles always.md +0 -80
  374. package/vault/wiki/questions/Research: pi-vcc.md +0 -113
  375. package/vault/wiki/questions/Research: semantic code search tools.md +0 -69
  376. package/vault/wiki/questions/Research: vcc extension for pi coding agent.md +0 -73
  377. package/vault/wiki/questions/how-to-enable-semantic-code-search-now.md +0 -111
  378. package/vault/wiki/questions/mvp-implementation-blueprint.md +0 -552
  379. package/vault/wiki/questions/research-agent-first-codebase-exploration.md +0 -199
  380. package/vault/wiki/questions/research-agentic-coding-harness-latest-papers.md +0 -142
  381. package/vault/wiki/questions/research-gitingest-gitreverse-integration.md +0 -100
  382. package/vault/wiki/questions/research-wozcode-token-reduction.md +0 -67
  383. package/vault/wiki/questions/resolved-context-pruning-inplace-vs-restart.md +0 -95
  384. package/vault/wiki/questions/resolved-context-window-economics.md +0 -167
  385. package/vault/wiki/questions/resolved-imad-debate-gating-transfer.md +0 -126
  386. package/vault/wiki/questions/resolved-mcp-tool-preference.md +0 -112
  387. package/vault/wiki/questions/resolved-small-model-meta-agents.md +0 -107
  388. package/vault/wiki/questions/resolved-treesitter-dynamic-languages.md +0 -95
  389. package/vault/wiki/sources/Auggie Context MCP Server.md +0 -63
  390. package/vault/wiki/sources/Augment Code Codacy AI Giants.md +0 -61
  391. package/vault/wiki/sources/Augment Code MCP SiliconAngle.md +0 -49
  392. package/vault/wiki/sources/Augment Code WorkOS ERC 2025.md +0 -55
  393. package/vault/wiki/sources/Augment Context Engine Official.md +0 -71
  394. package/vault/wiki/sources/Augment SWE-bench Agent GitHub.md +0 -74
  395. package/vault/wiki/sources/Augment SWE-bench Pro Blog.md +0 -58
  396. package/vault/wiki/sources/Source: AgentBus Jinja2 Prompt Pipelines.md +0 -75
  397. package/vault/wiki/sources/Source: Arxiv — Don't Break the Cache.md +0 -85
  398. package/vault/wiki/sources/Source: Augment - Harness Engineering for AI Coding Agents.md +0 -58
  399. package/vault/wiki/sources/Source: Blake Crosley Agent Architecture Guide.md +0 -100
  400. package/vault/wiki/sources/Source: Bolt.new Architecture & Case Study.md +0 -75
  401. package/vault/wiki/sources/Source: Build-Time Prompt Compilation Architecture.md +0 -107
  402. package/vault/wiki/sources/Source: Claude API Agent Skills Overview.md +0 -70
  403. package/vault/wiki/sources/Source: Gemini CLI Changelogs.md +0 -88
  404. package/vault/wiki/sources/Source: Google Blog - Gemini CLI Announcement.md +0 -57
  405. package/vault/wiki/sources/Source: Google Gemini CLI Architecture Docs.md +0 -53
  406. package/vault/wiki/sources/Source: LangChain - Anatomy of Agent Harness.md +0 -65
  407. package/vault/wiki/sources/Source: Lovable Architecture & Clone Analysis.md +0 -83
  408. package/vault/wiki/sources/Source: Martin Fowler - Harness Engineering.md +0 -70
  409. package/vault/wiki/sources/Source: OpenAI Harness Engineering Five Principles.md +0 -58
  410. package/vault/wiki/sources/Source: OpenAI Harness Engineering — 0 Lines of Human Code.md +0 -101
  411. package/vault/wiki/sources/Source: OpenDev — Building AI Coding Agents for the Terminal.md +0 -100
  412. package/vault/wiki/sources/Source: Render AI Coding Agents Benchmark 2025.md +0 -53
  413. package/vault/wiki/sources/Source: Rocket.new — Vibe Solutioning Platform.md +0 -70
  414. package/vault/wiki/sources/Source: SwirlAI Agent Skills Progressive Disclosure.md +0 -71
  415. package/vault/wiki/sources/Source: TianPan Prompt Caching Architecture.md +0 -89
  416. package/vault/wiki/sources/Source: Vercel Labs agent-browser.md +0 -155
  417. package/vault/wiki/sources/Source: browser-harness CDP Harness.md +0 -126
  418. package/vault/wiki/sources/agent-drift-academic-paper.md +0 -79
  419. package/vault/wiki/sources/aider-repomap-tree-sitter.md +0 -42
  420. package/vault/wiki/sources/anthropic-compaction-api.md +0 -58
  421. package/vault/wiki/sources/anthropic-effective-harnesses.md +0 -42
  422. package/vault/wiki/sources/anthropic-prompt-best-practices.md +0 -100
  423. package/vault/wiki/sources/anthropic2026-harness-design.md +0 -63
  424. package/vault/wiki/sources/barrel-files-tkdodo.md +0 -38
  425. package/vault/wiki/sources/birth-of-unix-kernighan-interview.md +0 -57
  426. package/vault/wiki/sources/bockeler2026-harness-engineering.md +0 -69
  427. package/vault/wiki/sources/cast-code-chunking-paper.md +0 -50
  428. package/vault/wiki/sources/ck-semantic-search.md +0 -78
  429. package/vault/wiki/sources/claude-code-architecture-karaxai-2026.md +0 -71
  430. package/vault/wiki/sources/claude-code-architecture-qubytes-2026.md +0 -50
  431. package/vault/wiki/sources/claude-code-architecture-vila-lab-2026.md +0 -64
  432. package/vault/wiki/sources/claude-code-security-architecture-penligent-2026.md +0 -70
  433. package/vault/wiki/sources/claude-context-editing-docs.md +0 -13
  434. package/vault/wiki/sources/cloudflare-codemode.md +0 -63
  435. package/vault/wiki/sources/code-chunk-library-supermemory.md +0 -63
  436. package/vault/wiki/sources/codeact-apple-2024.md +0 -62
  437. package/vault/wiki/sources/codex-dsc-rfc-8573.md +0 -41
  438. package/vault/wiki/sources/codex-open-source-agent-2026.md +0 -110
  439. package/vault/wiki/sources/coir-code-retrieval-benchmark.md +0 -51
  440. package/vault/wiki/sources/colinmcnamara-context-optimization-codemode.md +0 -48
  441. package/vault/wiki/sources/context-folding-paper.md +0 -61
  442. package/vault/wiki/sources/context-mode-website.md +0 -63
  443. package/vault/wiki/sources/cursor-agent-best-practices-2026.md +0 -62
  444. package/vault/wiki/sources/cursor-fork-29b-2025.md +0 -50
  445. package/vault/wiki/sources/cursor-harness-april-2026.md +0 -76
  446. package/vault/wiki/sources/cursor-instant-apply-2024.md +0 -45
  447. package/vault/wiki/sources/cursor-shadow-workspace-2024.md +0 -52
  448. package/vault/wiki/sources/cursor-shipped-coding-agent-2026.md +0 -53
  449. package/vault/wiki/sources/cursor-vs-antigravity-2026.md +0 -51
  450. package/vault/wiki/sources/disler-pi-vs-claude-code.md +0 -69
  451. package/vault/wiki/sources/distill-deterministic-context-compression.md +0 -53
  452. package/vault/wiki/sources/embedding-models-benchmark-supermemory-2025.md +0 -48
  453. package/vault/wiki/sources/executor-rhyssullivan.md +0 -122
  454. package/vault/wiki/sources/fallow-rs-codebase-intelligence.md +0 -125
  455. package/vault/wiki/sources/fan2025-imad.md +0 -60
  456. package/vault/wiki/sources/forgecode-gpt5-agent-improvements.md +0 -63
  457. package/vault/wiki/sources/gemini-3-prompting-guide.md +0 -78
  458. package/vault/wiki/sources/gh-cli-sub-issue-rfc.md +0 -50
  459. package/vault/wiki/sources/gh-sub-issue-extension.md +0 -72
  460. package/vault/wiki/sources/github-fork-issues-discussion.md +0 -44
  461. package/vault/wiki/sources/github-issue-dependencies-docs.md +0 -49
  462. package/vault/wiki/sources/github-sub-issues-docs.md +0 -51
  463. package/vault/wiki/sources/gitingest.md +0 -91
  464. package/vault/wiki/sources/gitreverse.md +0 -63
  465. package/vault/wiki/sources/google-antigravity-official-blog.md +0 -47
  466. package/vault/wiki/sources/google-antigravity-wikipedia.md +0 -53
  467. package/vault/wiki/sources/gsd-codecentric-deep-dive.md +0 -57
  468. package/vault/wiki/sources/gsd-github-repo.md +0 -51
  469. package/vault/wiki/sources/gsd-hn-discussion.md +0 -59
  470. package/vault/wiki/sources/guido-python-design-philosophy.md +0 -56
  471. package/vault/wiki/sources/hejlsberg-7-learnings.md +0 -48
  472. package/vault/wiki/sources/ironclaw-drift-monitor.md +0 -80
  473. package/vault/wiki/sources/langsight-loop-detection.md +0 -80
  474. package/vault/wiki/sources/leanctx-website.md +0 -69
  475. package/vault/wiki/sources/lee2026-meta-harness.md +0 -59
  476. package/vault/wiki/sources/linux-kernel-coding-workflow.md +0 -50
  477. package/vault/wiki/sources/lou2026-autoharness.md +0 -53
  478. package/vault/wiki/sources/martin-fowler-harness-engineering.md +0 -73
  479. package/vault/wiki/sources/mcp-architecture-docs.md +0 -13
  480. package/vault/wiki/sources/meng2026-agent-harness-survey.md +0 -79
  481. package/vault/wiki/sources/mindstudio-four-agent-types.md +0 -68
  482. package/vault/wiki/sources/ms-chat-history-management.md +0 -13
  483. package/vault/wiki/sources/openai-prompt-guidance.md +0 -104
  484. package/vault/wiki/sources/openclaw-session-pruning.md +0 -13
  485. package/vault/wiki/sources/opencode-dcp.md +0 -13
  486. package/vault/wiki/sources/opendev-arxiv-2603.05344v1.md +0 -79
  487. package/vault/wiki/sources/openhands-platform.md +0 -39
  488. package/vault/wiki/sources/oss-guide-codebase-exploration.md +0 -53
  489. package/vault/wiki/sources/pi-compaction-extensions-ecosystem.md +0 -102
  490. package/vault/wiki/sources/pi-context-prune-github-repo.md +0 -38
  491. package/vault/wiki/sources/pi-mono-compaction-docs.md +0 -38
  492. package/vault/wiki/sources/pi-omni-compact-github-repo.md +0 -50
  493. package/vault/wiki/sources/pi-rtk-optimizer-github-repo.md +0 -45
  494. package/vault/wiki/sources/pi-vcc-github-repo.md +0 -69
  495. package/vault/wiki/sources/pi-vscode-marketplace.md +0 -41
  496. package/vault/wiki/sources/pi-vscode-model-provider-marketplace.md +0 -39
  497. package/vault/wiki/sources/py-tree-sitter.md +0 -13
  498. package/vault/wiki/sources/sentrux-dev-landing.md +0 -40
  499. package/vault/wiki/sources/sentrux-docs-pro-architecture.md +0 -75
  500. package/vault/wiki/sources/sentrux-docs-quality-signal.md +0 -46
  501. package/vault/wiki/sources/sentrux-docs-root-cause-metrics.md +0 -57
  502. package/vault/wiki/sources/sentrux-docs-rules-engine.md +0 -58
  503. package/vault/wiki/sources/sentrux-github-repo.md +0 -56
  504. package/vault/wiki/sources/superpowers-github-repo.md +0 -56
  505. package/vault/wiki/sources/superpowers-release-blog.md +0 -54
  506. package/vault/wiki/sources/superpowers-termdock-analysis.md +0 -45
  507. package/vault/wiki/sources/swe-agent-aci.md +0 -42
  508. package/vault/wiki/sources/swe-bench.md +0 -45
  509. package/vault/wiki/sources/swe-pruner-context-pruning.md +0 -13
  510. package/vault/wiki/sources/think-in-code-blog.md +0 -48
  511. package/vault/wiki/sources/tree-sitter-docs.md +0 -13
  512. package/vault/wiki/sources/ts-best-practices-2025-devto.md +0 -42
  513. package/vault/wiki/sources/ts-folder-structure-mingyang.md +0 -58
  514. package/vault/wiki/sources/ts-monorepo-koerselman.md +0 -44
  515. package/vault/wiki/sources/ts-result-error-handling-kkalamarski.md +0 -52
  516. package/vault/wiki/sources/ts-runtimes-comparison-betterstack.md +0 -42
  517. package/vault/wiki/sources/ts-strict-mode-rishikc.md +0 -43
  518. package/vault/wiki/sources/unix-philosophy.md +0 -48
  519. package/vault/wiki/sources/vectara-chunking-vs-embedding-naacl2025.md +0 -39
  520. package/vault/wiki/sources/vectara-guardian-agents.md +0 -79
  521. package/vault/wiki/sources/vgrep-semantic-search.md +0 -76
  522. package/vault/wiki/sources/vitest-official.md +0 -41
  523. package/vault/wiki/sources/vscode-pi-community-extension.md +0 -40
  524. package/vault/wiki/sources/wozcode.md +0 -79
@@ -1,65 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Dual-Model Agent Architecture"
4
- created: 2026-04-30
5
- status: developing
6
- tags:
7
- - agent-architecture
8
- - llm
9
- - ensembling
10
- - swe-bench
11
- aliases:
12
- - Two-Model Agent
13
- related:
14
- - "[[Majority Vote Ensembling]]"
15
- - "[[Agentic Coding Harness]]"
16
- sources:
17
- - "[[Augment SWE-bench Agent GitHub]]"
18
- - "[[Augment SWE-bench Pro Blog]]"
19
- updated: 2026-05-02
20
-
21
- ---# Dual-Model Agent Architecture
22
-
23
- An agent architecture that uses two different LLMs for distinct phases: a fast, capable model for iterative reasoning and coding, and a more deliberative model for solution selection and verification.
24
-
25
- ## Augment Code's Implementation
26
-
27
- ### Phase 1: Core Reasoning (Claude Sonnet 3.7)
28
- - Handles the iterative coding loop: read files, write code, run tests, debug.
29
- - Fast, capable, good at following instructions.
30
- - Runs in a loop with tool access (bash, file edit, sequential thinking).
31
-
32
- ### Phase 2: Solution Ensembling (OpenAI o1)
33
- - Runs after the core agent has generated N candidate solutions (typically 8).
34
- - Presents all candidates to o1 with evaluation outcomes.
35
- - o1 analyzes and selects the best solution.
36
- - o1 is slower but more deliberative — better at comparative analysis.
37
-
38
- ## Why Two Models?
39
-
40
- 1. **Cost optimization**: Fast model for roughly 95% of the work; expensive model only for selection.
41
- 2. **Complementary strengths**: Claude excels at code generation; o1 excels at analysis and comparison.
42
- 3. **Error reduction**: Majority vote ensembling catches errors that any single run might miss.
43
- 4. **Separation of concerns**: Generation and evaluation use different reasoning patterns.
44
-
45
- ## Alternative Patterns
46
-
47
- ### Single-Model Multi-Pass
48
- - Same model generates multiple solutions then self-reviews.
49
- - Simpler but less effective than cross-model ensembling.
50
-
51
- ### Model Cascade
52
- - Start with fast/cheap model; escalate to stronger model on failure.
53
- - Used by SWE-agent and some production systems.
54
-
55
- ### Committee of Models
56
- - 3+ different models generate solutions independently.
57
- - Voting or LLM-based selection.
58
-
59
- ## Implementation for Our Harness
60
-
61
- We can implement dual-model architecture as a configurable strategy:
62
- - **Primary model**: Claude (fast, code-capable) for the main agent loop.
63
- - **Ensembler model**: GPT-5 or o1 for solution verification and selection.
64
- - Generate 3-5 candidate solutions, then use the ensembler to pick the best.
65
- - Configurable via harness config.
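A minimal sketch of the configurable strategy described above, assuming the harness exposes model calls as plain callables. `DualModelConfig`, `solve`, `Generate`, and `Select` are hypothetical names introduced for illustration, and the model identifiers are placeholders rather than real model ids.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical callables; a real harness would wrap its own model clients here.
Generate = Callable[[str, str], str]            # (model, task) -> candidate diff
Select = Callable[[str, str, List[str]], int]   # (model, task, candidates) -> index


@dataclass(frozen=True)
class DualModelConfig:
    primary_model: str = "primary-model"      # placeholder name, not a real model id
    ensembler_model: str = "ensembler-model"  # placeholder name, not a real model id
    num_candidates: int = 3

    def __post_init__(self) -> None:
        if self.num_candidates < 1:
            raise ValueError("num_candidates must be at least 1")


def solve(task: str, config: DualModelConfig, generate: Generate, select: Select) -> str:
    """Generate candidates with the primary model, then let the ensembler pick one."""
    candidates = [generate(config.primary_model, task) for _ in range(config.num_candidates)]
    index = select(config.ensembler_model, task, candidates)
    if not 0 <= index < len(candidates):
        raise IndexError(f"ensembler returned invalid index {index}")
    return candidates[index]


# Stand-in implementations so the sketch runs without network access.
_calls = {"n": 0}


def fake_generate(model: str, task: str) -> str:
    _calls["n"] += 1
    return f"{model} candidate {_calls['n']} for {task}"


def fake_select(model: str, task: str, candidates: List[str]) -> int:
    return len(candidates) - 1  # pretend the ensembler prefers the last candidate


chosen = solve("fix login timeout", DualModelConfig(), fake_generate, fake_select)
print(chosen)
```

Keeping generation and selection behind separate callables means either model can be swapped through config without touching the loop.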
@@ -1,43 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Late Chunking vs Early Chunking"
4
- created: 2026-04-30
5
- status: developing
6
- tags:
7
- - chunking
8
- - embeddings
9
- - rag
10
- - semantic-search
11
- related:
12
- - "[[AST-Aware Code Chunking]]"
13
- - "[[Contextualized Text Embedding]]"
14
- sources:
15
- - "[[vectara-chunking-vs-embedding-naacl2025]]"
16
- updated: 2026-05-02
17
-
18
- ---# Late Chunking vs Early Chunking
19
-
20
- ## Definitions
21
-
22
- - **Early chunking (standard)**: Split text → embed each chunk separately. Each chunk's embedding only sees its own text.
23
- - **Late chunking**: Embed the entire document first (producing token-level embeddings), then pool token embeddings into chunk-level embeddings using chunk boundaries. Each chunk's embedding "sees" the full document context.
24
- - **Contextual retrieval**: An intermediate approach: prepend document-level context to each chunk before embedding. Simpler than late chunking, captures some cross-chunk context.
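The pooling step of late chunking can be sketched as follows. This assumes mean pooling (the definition above only says "pool") and uses toy vectors in place of real token embeddings; `mean_pool` and `late_chunk` are illustrative names, not part of any library.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty run of equal-length vectors component-wise."""
    if not vectors:
        raise ValueError("cannot pool an empty span")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk(token_embeddings: Sequence[Vector],
               boundaries: Sequence[Tuple[int, int]]) -> List[Vector]:
    """Pool document-level token embeddings into one vector per [start, end) span."""
    return [mean_pool(token_embeddings[start:end]) for start, end in boundaries]


# Toy token embeddings standing in for the output of a long-context embedding model.
tokens = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
chunks = late_chunk(tokens, [(0, 2), (2, 4)])
print(chunks)  # [[2.0, 0.0], [0.0, 3.0]]
```

The pooling itself is cheap. The cross-chunk context comes from the embedding model having processed the whole document before the split, which is also where the extra compute goes.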
25
-
26
- ## Trade-offs
27
-
28
- | Approach | Semantic Coherence | Compute Cost | Implementation Complexity |
29
- |----------|-------------------|--------------|---------------------------|
30
- | Early chunking | Lowest | Lowest | Simplest |
31
- | Contextual retrieval | Medium | Medium | Moderate |
32
- | Late chunking | Highest | Highest | Complex |
33
-
34
- ## Research Findings (arXiv:2504.19754)
35
-
36
- Late chunking + contextual retrieval evaluated for RAG systems:
37
- - Contextual retrieval preserves semantic coherence more effectively than early chunking
38
- - But requires greater computational resources (embeds full documents)
39
- - For code: contextual retrieval (prepending scope/file context) is the sweet spot — better than bare early chunking, cheaper than full late chunking
40
-
41
- ## Relevance to Our Implementation
42
-
43
- We implement **contextual retrieval** (not full late chunking): prepend file path, scope chain, signatures, and imports to each chunk before embedding. This gives us much of the benefit at moderate cost.
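A small sketch of the prepending step described above, assuming chunk metadata is already available from the parser. `CodeChunk`, its fields, and the comment-style header layout are hypothetical choices for illustration, not the harness's actual format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodeChunk:
    file_path: str
    scope_chain: List[str]   # e.g. ["AuthService", "login"]
    signature: str
    body: str
    imports: List[str] = field(default_factory=list)


def contextualize(chunk: CodeChunk) -> str:
    """Build the text that is embedded: a context header followed by the chunk body."""
    header = [
        f"# file: {chunk.file_path}",
        f"# scope: {' > '.join(chunk.scope_chain) or '<module>'}",
        f"# signature: {chunk.signature}",
    ]
    if chunk.imports:
        header.append(f"# imports: {', '.join(chunk.imports)}")
    return "\n".join(header) + "\n\n" + chunk.body


chunk = CodeChunk(
    file_path="src/auth.py",
    scope_chain=["AuthService", "login"],
    signature="def login(self, user: str) -> bool",
    body="return self.store.check(user)",
    imports=["hashlib"],
)
text_to_embed = contextualize(chunk)
print(text_to_embed)
```

Only the embedded text changes; the stored chunk body can stay unprefixed so retrieved results are shown without the header.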
@@ -1,68 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Majority Vote Ensembling"
4
- created: 2026-04-30
5
- status: developing
6
- tags:
7
- - agent-architecture
8
- - llm
9
- - ensembling
10
- aliases:
11
- - Solution Ensembling
12
- related:
13
- - "[[Dual-Model Agent Architecture]]"
14
- sources:
15
- - "[[Augment SWE-bench Agent GitHub]]"
16
- updated: 2026-05-02
17
-
18
- ---# Majority Vote Ensembling
19
-
20
- A technique where an agent generates multiple candidate solutions to the same problem, then uses an LLM (or voting mechanism) to select the best one. Used by Augment Code's SWE-bench agent to boost success rates.
21
-
22
- ## How Augment Implements It
23
-
24
- 1. Run the core agent (Claude Sonnet 3.7) N times on the same problem (typically N=8).
25
- 2. Each run produces a candidate solution (diff).
26
- 3. Run evaluation harness on each candidate to get pass/fail outcomes.
27
- 4. Feed all candidates + outcomes to OpenAI o1 with a prompt asking it to select the best solution.
28
- 5. o1 returns the index of the selected solution.
29
-
30
- ## Input Format
31
- ```json
32
- {
33
- "id": "problem-1",
34
- "instruction": "Fix the login timeout issue",
35
- "diffs": ["diff1", "diff2", "..."],
36
- "eval_outcomes": [
37
- {"is_success": true},
38
- {"is_success": false}
39
- ]
40
- }
41
- ```
42
-
43
- ## Why It Works
44
-
45
- 1. **Variance reduction**: Multiple independent runs reduce the impact of any single bad generation.
46
- 2. **Complementary failures**: Different runs fail on different aspects; ensembling can pick the run that succeeded.
47
- 3. **LLM-as-judge**: o1's reasoning capabilities are better suited for comparative analysis than code generation.
48
- 4. **Evaluation-guided**: Including eval outcomes helps the ensembler distinguish between functionally correct and incorrect solutions.
49
-
50
- ## Cost Consideration
51
-
52
- Running N candidates multiplies cost by N. Augment's approach: use a fast/cheap model (Sonnet) for the N runs, then an expensive model (o1) only for the single ensembling step.
53
-
54
- ## Implementation for Our Harness
55
-
56
- ```python
57
- def ensemble_solutions(problem: str, candidates: int = 5) -> str:
58
-     solutions = []
59
-     for i in range(candidates):
60
-         # Run agent independently
61
-         diff = run_agent(problem)
62
-         result = evaluate(diff)
63
-         solutions.append({"diff": diff, "success": result.passed})
64
-
65
-     # Select best via LLM ensembler
66
-     best = llm_ensembler.select_best(problem, solutions)
67
-     return best.diff
68
- ```
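Where no LLM ensembler is available, the "voting mechanism" mentioned at the top of this note could look like the sketch below. It is an assumption, not Augment's method: candidates that fail evaluation are discarded, and the most frequent remaining diff wins. Comparing diffs by stripped text is a simplification, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(diffs: List[str], successes: List[bool]) -> Optional[int]:
    """Return the index of the first diff in the largest passing group, or None if none pass."""
    if len(diffs) != len(successes):
        raise ValueError("diffs and successes must have the same length")
    passing = [d.strip() for d, ok in zip(diffs, successes) if ok]
    if not passing:
        return None
    # most_common keeps first-seen order among equal counts, so ties favor earlier runs.
    winner, _ = Counter(passing).most_common(1)[0]
    for index, (diff, ok) in enumerate(zip(diffs, successes)):
        if ok and diff.strip() == winner:
            return index
    return None


picked = majority_vote(["a", "b", "a", "c"], [True, True, True, False])
print(picked)  # 0
```

The inputs mirror the `diffs` and `eval_outcomes` fields of the input format shown earlier.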
@@ -1,16 +0,0 @@
1
- ---
2
- type: concept
3
- status: stub
4
- created: 2026-05-02
5
- updated: 2026-05-02
6
- tags: [concept, harness, meta-learning]
7
- ---
8
-
9
- # Meta-Harness
10
-
11
- Outer-loop harness optimization framework from Lee et al. (Stanford/Together AI). A harness that optimizes the inner harness by selecting the best configurations, prompts, and patterns across multiple agent runs.
12
-
13
- ## References
14
-
15
- - [[lee2026-meta-harness]]
16
- - [[self-evolving-harness]]
@@ -1,75 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Multi-Agent AI Coding Architecture"
4
- created: 2026-05-03
5
- updated: 2026-05-03
6
- status: developing
7
- tags:
8
- - multi-agent
9
- - architecture
10
- - agentic-coding
11
- - harness
12
- related:
13
- - "[[subagent-orchestration]]"
14
- - "[[generator-evaluator-architecture]]"
15
- - "[[agentic-harness]]"
16
- - "[[Source: Lovable Architecture & Clone Analysis]]"
17
- - "[[anthropic2026-harness-design]]"
18
- sources:
19
- - "[[Source: Lovable Architecture & Clone Analysis]]"
20
- - "[[anthropic2026-harness-design]]"
21
- - "[[Source: OpenAI Harness Engineering — 0 Lines of Human Code]]"
22
- - "[[Source: OpenDev — Building AI Coding Agents for the Terminal]]"
23
-
24
- ---# Multi-Agent AI Coding Architecture
25
-
26
- The decomposition of software engineering tasks across specialized agents, each with a defined role, input/output contract, and tool surface. This is the **universal pattern** across all successful AI coding platforms.
27
-
28
- ## Three Common Decompositions
29
-
30
- ### Lovable/Clone Pattern: Planner → Architect → Coder
31
- ```
32
- User prompt → Planner (structured Plan) → Architect (TaskPlan) → Coder (files on disk)
33
- ```
34
- - Each agent receives Pydantic-validated inputs
35
- - LangGraph orchestrates with conditional edges
36
- - Coder uses ReAct pattern with file system tools
37
-
38
- ### Anthropic Pattern: Planner → Generator → Evaluator
39
- ```
40
- User prompt → Planner (product spec) → Generator (implements) ⇄ Evaluator (grades)
41
- ```
42
- - Generator and Evaluator negotiate "sprint contracts" before coding
43
- - Evaluator uses Playwright to actually click through the app
44
- - Hard thresholds on grading criteria — fall below any, sprint fails
45
-
46
- ### OpenAI Pattern: Agent-to-Agent Review Loops
47
- ```
48
- Codex generates → Codex reviews locally → Additional agent review (cloud) → Human/agent feedback → Iterate
49
- ```
50
- - "Ralph Wiggum Loop": agent reviews its own changes, requests additional reviews, responds to feedback, and iterates until all agent reviewers are satisfied
51
- - Humans may review PRs but aren't required to
52
- - Pushed "almost all review effort towards being handled agent-to-agent"
53
-
54
- ## First-Principles Architecture
55
-
56
- ### 1. Separate Planning from Execution
57
- Do not let the same agent plan and code in one step. The Planner should have read-only tools only — structurally prevented from writing code. This forces deliberation before action and prevents premature implementation.
58
-
59
- ### 2. Structured Handoffs Between Agents
60
- Every handoff must be a validated data contract, not free text. Pydantic schemas, typed dicts, or structured files. The downstream agent processes objects, not unstructured descriptions.
61
-
62
- ### 3. Independent Evaluator with Hard Criteria
63
- The agent that builds cannot be trusted to evaluate. Separate evaluator with explicit, gradable criteria. Each criterion has a hard threshold — not negotiable. "Claude is a poor QA agent out of the box" — evaluator requires explicit tuning to be skeptical.
64
-
65
- ### 4. Sprint Contracts (Agree on "Done" Before Work)
66
- Before coding starts, the implementer and evaluator negotiate what success looks like. This prevents scope creep and provides concrete verification targets. Communication via files, not chat.
67
-
68
- ### 5. Tool Surface = Agent Capability Boundary
69
- Each agent's available tools define its actual capability — not its prompt, not its role description. Remove write tools from planners. Remove subagent-spawning from subagents. Make capabilities structural, not aspirational.
70
-
71
- ## Relevance to Our Harness
72
- - L2 (Planning) should be a separate agent with read-only tools
73
- - L3 (Execution) should work from L2's structured output
74
- - L4 (Verification) needs hard criteria with thresholds, not narrative feedback
75
- - Sprint contracts between L2 and L4 before L3 begins
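A minimal sketch of a sprint contract with hard thresholds, assuming scores are reported as floats per criterion. `Criterion`, `SprintContract`, `Verdict`, and `grade` are hypothetical names; the point is only that the handoff is a typed object and that falling below any single threshold fails the sprint.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    name: str
    threshold: float  # minimum acceptable score; not negotiable after agreement


@dataclass(frozen=True)
class SprintContract:
    goal: str
    criteria: List[Criterion]


@dataclass(frozen=True)
class Verdict:
    passed: bool
    failures: List[str]


def grade(contract: SprintContract, scores: Dict[str, float]) -> Verdict:
    """Fail the sprint if any criterion is unscored or below its threshold."""
    failures: List[str] = []
    for criterion in contract.criteria:
        score = scores.get(criterion.name)
        if score is None:
            failures.append(f"{criterion.name}: no score reported")
        elif score < criterion.threshold:
            failures.append(f"{criterion.name}: {score} < {criterion.threshold}")
    return Verdict(passed=not failures, failures=failures)


contract = SprintContract(
    goal="login flow works end to end",
    criteria=[Criterion("functionality", 0.8), Criterion("design", 0.6)],
)
print(grade(contract, {"functionality": 0.9, "design": 0.5}))
```

Treating a missing score as a failure keeps the evaluator from passing a sprint by omission.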
@@ -1,90 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Prompt Enhancement"
4
- created: 2026-04-30
5
- status: developing
6
- tags:
7
- - prompt-engineering
8
- - context
9
- - retrieval
10
- aliases:
11
- - Prompt Enrichment
12
- - Context Injection
13
- related:
14
- - "[[Context Engine (AI Coding)]]"
15
- - "[[Semantic Codebase Indexing]]"
16
- sources:
17
- - "[[Augment Code WorkOS ERC 2025]]"
18
- - "[[Augment Code Codacy AI Giants]]"
19
- updated: 2026-05-02
20
-
21
- ---# Prompt Enhancement
22
-
23
- The process of automatically enriching a user's query with relevant codebase context before it reaches the LLM. The goal is to give the LLM the same understanding a senior engineer would have when approaching a task.
24
-
25
- ## How Augment's Prompt Enhancer Works
26
-
27
- 1. User types a query: "add logging to payment API."
28
- 2. Context Engine semantically searches the codebase for relevant code.
29
- 3. Enhancer constructs an augmented prompt containing:
30
- - The original query.
31
- - Relevant source files and their paths.
32
- - Existing patterns (how logging is done elsewhere).
33
- - Related utilities and libraries already in the codebase.
34
- - Team conventions and coding standards.
35
- 4. The augmented prompt is sent to the LLM.
36
-
37
- ## Key Design Principles
38
-
39
- ### Reuse Over Reinvention
40
- The enhancer actively detects existing utilities and libraries. In Augment's demo, when asked to add Git branch info to a status bar, the enhancer detected an existing internal Git library and guided the agent to use it instead of shelling out to git.
41
-
42
- ### Context Budget Management
43
- The enhancer must balance context richness with token budget:
44
- - Retrieve only what's relevant (semantic search).
45
- - Compress retrieved context (summarize large files).
46
- - Rank by relevance, not just similarity.
47
- - Respect the model's context window.
48
-
49
- ### Pattern Recognition
50
- The enhancer learns from the codebase:
51
- - Naming conventions.
52
- - Error handling patterns.
53
- - Import structure.
54
- - Testing patterns.
55
- - Architectural layering.
56
-
57
- ## Implementation for Our Harness
58
-
59
- ```python
60
- def enhance_prompt(query: str, workspace: str) -> str:
61
-     # 1. Semantic search for relevant code
62
-     relevant_files = semantic_search(query, workspace, top_k=10)
63
-
64
-     # 2. Extract patterns from relevant files
65
-     patterns = extract_patterns(relevant_files)
66
-
67
-     # 3. Find existing utilities/libraries
68
-     utilities = find_related_utilities(query, workspace)
69
-
70
-     # 4. Fetch wiki knowledge (our existing knowledge base)
71
-     wiki_context = query_wiki(query)
72
-
73
-     # 5. Build augmented prompt
74
-     return build_prompt(
75
-         query=query,
76
-         relevant_code=relevant_files,
77
-         patterns=patterns,
78
-         utilities=utilities,
79
-         wiki=wiki_context
80
-     )
81
- ```
82
-
83
- ## Integration with Existing Harness
84
-
85
- Our harness already has several context sources:
86
- - **lean-ctx**: Exact file retrieval (grep, find, read).
87
- - **wiki**: Architectural knowledge, research, patterns.
88
- - **ctx_knowledge**: Persistent project conventions and gotchas.
89
-
90
- Prompt enhancement would unify these into a preprocessing step before the main agent loop.
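The context budget rules listed earlier (rank by relevance, respect the window) can be sketched as a greedy packer. This is one possible approach under stated assumptions: the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, and `Snippet`, `estimate_tokens`, and `pack_context` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Snippet:
    path: str
    text: str
    relevance: float  # higher is more relevant; assumed to come from the retriever


def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly four characters per token."""
    return max(1, len(text) // 4)


def pack_context(snippets: Sequence[Snippet], budget_tokens: int) -> List[Snippet]:
    """Greedily keep the most relevant snippets that still fit in the budget."""
    chosen: List[Snippet] = []
    used = 0
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget_tokens:
            chosen.append(snippet)
            used += cost
    return chosen


snippets = [
    Snippet("a.py", "x" * 40, 0.9),   # about 10 tokens
    Snippet("b.py", "x" * 80, 0.8),   # about 20 tokens
    Snippet("c.py", "x" * 20, 0.5),   # about 5 tokens
]
print([s.path for s in pack_context(snippets, budget_tokens=16)])  # ['a.py', 'c.py']
```

A snippet that does not fit is skipped rather than ending the loop, so smaller lower-ranked snippets can still use the remaining budget.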
@@ -1,89 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Prompt Renderer"
4
- created: 2026-05-02
5
- updated: 2026-05-02
6
- tags:
7
- - prompt-renderer
8
- - multi-model
9
- - build-time-compilation
10
- - harness
11
- status: developing
12
- related:
13
- - "[[provider-native-prompting]]"
14
- - "[[model-adaptive-harness]]"
15
- - "[[research: Prompt Renderer for Multi-Model Agent Harness]]"
16
- sources:
17
- - "[[Source: Build-Time Prompt Compilation Architecture]]"
18
- - "[[Source: AgentBus Jinja2 Prompt Pipelines]]"
19
-
20
- ---# Prompt Renderer
21
-
22
- A build-time prompt compilation system that takes a **base prompt spec** (model-agnostic) and renders **per-model optimized prompts** by applying each model's official prompting conventions, substituting variables, and caching compiled output.
23
-
24
- ## Architecture
25
-
26
- ```
27
- Base Prompt Spec (JSON/YAML)
28
-
29
- [Compile-time Renderer]
30
-
31
- ┌───────┼───────┬─────────┐
32
- │ GPT │Claude │Gemini │ ← Per-model compiled prompts
33
- │.json │.json │.json │
34
- └───────┴───────┴─────────┘
35
-
36
- [npm package] ← Shipped in lib
37
-
38
- [Runtime] → load pre-compiled prompt → substitute runtime vars → send to LLM
39
- ```
40
-
41
- ## Key Properties
42
-
43
- - **Build-time, not runtime**: Compiler runs during `npm run build`, output shipped as JSON in npm package
44
- - **Base spec is model-agnostic**: Single source of truth that describes WHAT the prompt should do, not HOW
45
- - **Per-model renderers**: Each model gets a plugin that knows its official prompting conventions
46
- - **Variable system**: Two-phase — compile-time variables (resolved at build) vs runtime variables (resolved at call time)
47
- - **Caching layer**: Pre-compiled prompts are the cache — no runtime compilation, no warmup needed
48
- - **Deterministic**: Same spec + same renderer version → identical output (hash-verifiable)
49
-
50
- ## Rendering Pipeline
51
-
52
- 1. **Parse base spec**: Validate structure, required fields, variable declarations
53
- 2. **Select model renderer**: Load per-model plugin (GPT, Claude, Gemini, etc.)
54
- 3. **Apply model conventions**: XML tags for Claude, constraints-first for GPT, constraints-last for Gemini
55
- 4. **Substitute compile-time variables**: Resolve all vars marked `compile: true`
56
- 5. **Validate output**: Check token count, syntax, caching thresholds
57
- 6. **Serialize**: Write compiled prompt to JSON with hash + metadata
58
- 7. **Cache**: Store hash → compiled output for incremental builds
59
-
60
- ## Model-Specific Rendering Rules
61
-
62
- | Convention | GPT (OpenAI) | Claude (Anthropic) | Gemini (Google) |
63
- |-----------|-------------|-------------------|-----------------|
64
- | System prompt | `system` role message | `system` parameter | `systemInstruction` |
65
- | Structure | Constraints-first, flat | XML tags, nesting OK | Constraints-last, plain text |
66
- | Instruction style | Outcome-first, shorter | Long-form, detailed | Multimodal-friendly |
67
- | Cache control | Auto (no code) | `cache_control: {type: "ephemeral"}` | Explicit context cache |
68
- | Output format | Function calling | Structured output API | Controlled generation |
69
- | Best practice source | platform.openai.com/docs/guides/prompt-engineering | docs.anthropic.com + interactive tutorial | cloud.google.com/vertex-ai/docs |
70
-
71
- ## Variable Substitution
72
-
73
- Two-phase variable system:
74
-
75
- ```yaml
76
- variables:
77
- model_name: { type: string, compile: true } # Resolved at build
78
- user_query: { type: string, compile: false } # Resolved at runtime
79
- max_tokens: { type: number, compile: true, default: 4096 }
80
- ```
81
-
82
- Compile-time variables produce multiple compiled variants if multiple values are specified (e.g., `model_name: [gpt-5.2, claude-sonnet-4.5]`).
83
-
84
- ## Caching Strategy
85
-
86
- - **Build cache**: Incremental — only recompile prompts whose spec hash changed
87
- - **Output cache**: Compiled prompts stored by `{spec_hash}-{model}-{var_hash}.json`
88
- - **Runtime**: Zero cost — load pre-compiled JSON, substitute runtime vars, send
89
- - **npm distribution**: Compiled prompts are regular files in the package — no compilation code shipped
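The two-phase variable system and the `{spec_hash}-{model}-{var_hash}.json` naming can be illustrated with Python's standard library. This is a sketch, not the renderer itself: it assumes `$name` placeholders via `string.Template`, truncates SHA-256 digests to 12 characters for readability, and the function names and model label are hypothetical.

```python
import hashlib
import json
from string import Template
from typing import Any, Dict


def _short_hash(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()[:12]


def compile_prompt(spec_text: str, compile_vars: Dict[str, Any], model: str) -> Dict[str, str]:
    """Build step: resolve compile-time variables and derive a deterministic file name."""
    # safe_substitute fills known placeholders and leaves unknown ones untouched,
    # so runtime variables survive the build step.
    prompt = Template(spec_text).safe_substitute(compile_vars)
    var_hash = _short_hash(json.dumps(compile_vars, sort_keys=True))
    return {
        "file_name": f"{_short_hash(spec_text)}-{model}-{var_hash}.json",
        "prompt": prompt,
    }


def render_at_runtime(compiled: Dict[str, str], runtime_vars: Dict[str, Any]) -> str:
    """Call time: substitute raises KeyError if a runtime variable is missing."""
    return Template(compiled["prompt"]).substitute(runtime_vars)


spec = "You are $model_name. Limit: $max_tokens tokens. Task: $user_query"
compiled = compile_prompt(spec, {"model_name": "example-model", "max_tokens": 4096}, "claude")
print(compiled["prompt"])
print(render_at_runtime(compiled, {"user_query": "add logging"}))
```

Because the file name depends only on the spec text, the model label, and the sorted variables, rebuilding with unchanged inputs produces the same name, which is what makes the incremental build cache possible.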
@@ -1,67 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Semantic Codebase Indexing"
4
- created: 2026-04-30
5
- status: developing
6
- tags:
7
- - code-indexing
8
- - embeddings
9
- - vector-search
10
- - ast
11
- aliases:
12
- - Code Embedding
13
- related:
14
- - "[[Context Engine (AI Coding)]]"
15
- - "[[Prompt Enhancement]]"
16
- sources:
17
- - "[[Augment Context Engine Official]]"
18
- - "[[Augment Code Codacy AI Giants]]"
19
- updated: 2026-05-02
20
-
21
- ---# Semantic Codebase Indexing
22
-
23
- The process of converting source code into vector embeddings that capture semantic meaning, enabling similarity search across a codebase without relying on exact keyword matching.
24
-
25
- ## How It Works
26
-
27
- ### 1. Code Chunking
28
- - Split source files into logical units: functions, classes, methods, modules.
29
- - Use tree-sitter AST parsing for language-aware chunk boundaries.
30
- - Typical chunk size: 200-500 tokens for optimal embedding quality.
31
-
32
- ### 2. Embedding Generation
33
- - Pass each chunk through an embedding model.
34
- - Options: all-MiniLM-L6-v2 (384-dim, local), CodeBERT, or Voyage AI code embeddings.
35
- - Augment Code uses custom embedding models trained in pairs for maximum retrieval quality.
36
-
37
- ### 3. Vector Database Storage
38
- - Store embeddings in LanceDB, ChromaDB, or Qdrant.
39
- - Index for fast approximate nearest neighbor (ANN) search.
40
- - Attach metadata: file path, line range, function/class name, dependencies.
41
-
42
- ### 4. Real-time Sync
43
- - Watch filesystem for changes using watchdog/inotify.
44
- - Re-embed changed files incrementally.
45
- - Augment claims "millisecond-level sync."
46
-
47
- ### 5. Hybrid Search
48
- - Combine vector similarity (semantic) + BM25/keyword (lexical).
49
- - Re-rank results by relevance, recency, and relationship proximity.
50
-
51
- ## Why Semantic > Grep
52
-
53
- | Aspect | Grep/Keyword | Semantic Indexing |
54
- |--------|-------------|-------------------|
55
- | Finds related code | Only exact matches | Finds semantically similar code |
56
- | Understands intent | No | Yes — "payment logging" finds telemetry, billing, audit |
57
- | Cross-language | No | Partially — embeddings capture patterns |
58
- | Relationship aware | No | Yes — understands call graphs and imports |
59
- | Noise filtering | Manual | Automatic relevance ranking |
60
-
61
- ## Implementation Stack (for our harness)
62
-
63
- - **Parser**: tree-sitter (18 languages via lean-ctx).
64
- - **Embeddings**: sentence-transformers (all-MiniLM-L6-v2) or voyage-code-2.
65
- - **Vector DB**: LanceDB (embedded, zero-config) or ChromaDB.
66
- - **Sync**: watchdog (Python).
67
- - **Search**: hybrid BM25 + cosine similarity with re-ranking.
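A toy sketch of the hybrid scoring step. To stay dependency-free it swaps BM25 for a plain keyword-overlap ratio and uses hand-written 2-dimensional vectors in place of real embeddings; both are stand-ins, and the `alpha` weighting is an assumed design choice rather than a documented one.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query words present in the text (a simple stand-in for BM25)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_search(query: str, query_vec: Sequence[float],
                  index: Sequence[Tuple[str, str, Sequence[float]]],
                  alpha: float = 0.5, top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank (path, text, vector) entries by a weighted mix of semantic and lexical scores."""
    scored = [
        (path, alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_overlap(query, text))
        for path, text, vec in index
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


index = [
    ("billing.py", "def charge card payment", [1.0, 0.0]),
    ("audit.py", "def write audit log", [0.6, 0.8]),
    ("ui.py", "def render button", [0.0, 1.0]),
]
results = hybrid_search("payment log", [0.8, 0.6], index)
print([path for path, _ in results])
```

A production setup would replace both scoring functions with the vector database's ANN query and a real BM25 index, then apply the re-ranking step on the merged list.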
@@ -1,16 +0,0 @@
1
- ---
2
- type: concept
3
- status: stub
4
- created: 2026-05-02
5
- updated: 2026-05-02
6
- tags: [concept, configuration, claude-code]
7
- ---
8
-
9
- # Additive Config Hierarchy
10
-
11
- Configuration pattern from Claude Code: config layers stack additively (CLAUDE.md → project-level → user-level → system-level) rather than overriding. Each layer adds context rather than replacing previous layers.
12
-
13
- ## References
14
-
15
- - [[claude-code-architecture-karaxai-2026]]
16
- - [[harness-configuration-layers]]
@@ -1,71 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Agent Artifacts (Trust via Verifiable Deliverables)"
4
- status: developing
5
- created: 2026-05-01
6
- updated: 2026-05-01
7
- tags:
8
- - antigravity
9
- - verification
10
- - trust
11
- - harness-design
12
- aliases: ["Artifact system", "verifiable artifacts"]
13
- related:
14
- - "[[adversarial-verification]]"
15
- - "[[automated-observability]]"
16
- - "[[harness-implementation-plan]]"
17
- - "[[antigravity-agent-first-architecture]]"
18
- sources:
19
- - "[[google-antigravity-official-blog]]"
20
- - "[[cursor-vs-antigravity-2026]]"
21
-
22
- ---# Agent Artifacts: Trust via Verifiable Deliverables
23
-
24
- Google Antigravity's Artifact system replaces raw tool-call logs with human-readable, verifiable deliverables that agents generate as they work.
25
-
26
- ## What Are Artifacts?
27
-
28
- Structured, verifiable outputs agents produce during execution:
29
- - Task lists and implementation plans
30
- - Screenshots and browser recordings
31
- - Walkthrough documents
32
- - Test result summaries
33
- - Architecture diagrams
34
-
35
- Artifacts represent work at a **task level**, not an API-call level. They are designed to be audited by humans, not parsed by machines.
36
-
37
- ## How Artifacts Build Trust
38
-
39
- ```
40
- Raw tool logs: "execute_command: npm install" → "exit 0" → "write_file: src/auth.ts" → ...
41
- Artifact: "Authentication migration plan" → "Screenshot: login page working" → "Test results: 23/23 pass"
42
- ```
43
-
44
- The second format is reviewable in seconds. The first requires scrolling through hundreds of lines.
45
-
46
- ## Feedback on Artifacts
47
-
48
- - Developers comment on artifacts (Google Docs-style commenting)
49
- - Agents incorporate feedback **without stopping execution**
50
- - Feedback is asynchronous: you comment, the agent picks it up at the next checkpoint
51
- - No need to restart tasks for mid-course corrections
52
-
53
- ## Comparison with Our Harness
54
-
55
- | Dimension | Our Harness (L4 + L5) | Antigravity Artifacts |
56
- |-----------|----------------------|----------------------|
57
- | Verification type | Adversarial critic agents | Human-reviewable deliverables |
58
- | Feedback loop | Multi-round debate (selective) | Async comments on artifacts |
59
- | Trust mechanism | Critic proves work wrong | Agent proves work right |
60
- | Cost | LLM tokens (critic rounds) | Human attention (review artifacts) |
61
-
62
- ## Gap Analysis
63
-
64
- Our L4 adversarial verification asks: "Is this correct?" (critic finds flaws).
65
- Antigravity's Artifacts ask: "Here's proof this is correct" (agent demonstrates success).
66
-
67
- These are **complementary**. The critic catches what the agent missed. The artifact proves what the agent got right. Both should exist in the harness.
68
-
69
- ## Proposed Integration: Phase P31
70
-
71
- Add an **Artifact Generation Layer** after L4 verification. Agents generate screenshots, browser recordings, and test result summaries as verifiable proof of work. These artifacts feed into L5 observability and serve as the human-reviewable interface.