ultimate-pi 0.1.7 → 0.2.2

This diff shows the contents of publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.
Files changed (524)
  1. package/.agents/skills/graphify/.graphify_version +1 -0
  2. package/.agents/skills/graphify/SKILL.md +1204 -0
  3. package/.agents/skills/wiki-autoresearch/SKILL.md +225 -97
  4. package/.agents/skills/wiki-autoresearch/references/program.md +28 -62
  5. package/.agents/skills/wiki-autoresearch/references/quality-sites.md +32 -0
  6. package/.env.example +5 -1
  7. package/.gitattributes +1 -0
  8. package/.github/workflows/publish-github-packages.yml +1 -1
  9. package/.pi/SYSTEM.md +72 -18
  10. package/.pi/agents/harness/adversary.md +32 -0
  11. package/.pi/agents/harness/evaluator.md +32 -0
  12. package/.pi/agents/harness/executor.md +34 -0
  13. package/.pi/agents/harness/meta-optimizer.md +33 -0
  14. package/.pi/agents/harness/planner.md +33 -0
  15. package/.pi/agents/harness/tie-breaker.md +35 -0
  16. package/.pi/agents/harness/trace-librarian.md +32 -0
  17. package/.pi/extensions/banner.png +0 -0
  18. package/.pi/extensions/budget-guard.ts +265 -0
  19. package/.pi/extensions/custom-footer.ts +194 -22
  20. package/.pi/extensions/custom-header.ts +47 -9
  21. package/.pi/extensions/debate-orchestrator.ts +479 -0
  22. package/.pi/extensions/harness-live-widget.ts +438 -0
  23. package/.pi/extensions/policy-gate.ts +349 -0
  24. package/.pi/extensions/review-integrity.ts +198 -0
  25. package/.pi/extensions/test-diff-integrity.ts +240 -0
  26. package/.pi/extensions/trace-recorder.ts +315 -0
  27. package/.pi/harness/README.md +23 -0
  28. package/.pi/harness/router/README.md +35 -0
  29. package/.pi/harness/router/apply-router-proposal.mjs +153 -0
  30. package/.pi/harness/router/propose-router-tuning.mjs +149 -0
  31. package/.pi/harness/specs/README.md +37 -0
  32. package/.pi/harness/specs/adversary-report.schema.json +53 -0
  33. package/.pi/harness/specs/budget-exhausted-event.schema.json +93 -0
  34. package/.pi/harness/specs/consensus-packet.schema.json +175 -0
  35. package/.pi/harness/specs/eval-verdict.schema.json +59 -0
  36. package/.pi/harness/specs/incident-record.schema.json +84 -0
  37. package/.pi/harness/specs/plan-packet.schema.json +90 -0
  38. package/.pi/harness/specs/round-result.schema.json +126 -0
  39. package/.pi/harness/specs/router-tuning-proposal.schema.json +114 -0
  40. package/.pi/harness/specs/run-trace.schema.json +107 -0
  41. package/.pi/lib/harness-ui-state.ts +311 -0
  42. package/.pi/mcp.json +4 -0
  43. package/.pi/model-router.json +93 -93
  44. package/.pi/prompts/graphify.md +23 -0
  45. package/.pi/prompts/harness-abort.md +41 -0
  46. package/.pi/prompts/harness-auto.md +83 -0
  47. package/.pi/prompts/harness-critic.md +52 -0
  48. package/.pi/prompts/harness-eval.md +51 -0
  49. package/.pi/prompts/harness-incident.md +51 -0
  50. package/.pi/prompts/harness-plan.md +64 -0
  51. package/.pi/prompts/harness-review.md +52 -0
  52. package/.pi/prompts/harness-router-tune.md +74 -0
  53. package/.pi/prompts/harness-run.md +59 -0
  54. package/.pi/prompts/harness-setup.md +316 -216
  55. package/.pi/prompts/harness-trace.md +51 -0
  56. package/.pi/prompts/wiki-autoresearch.md +9 -7
  57. package/.pi/prompts/wiki-save.md +20 -0
  58. package/.pi/skills/agent-router/SKILL.md +2 -4
  59. package/.pi/skills/ast-grep/SKILL.md +354 -0
  60. package/.pi/sounds/project-sounds.json +18 -24
  61. package/AGENTS.md +30 -0
  62. package/CHANGELOG.md +89 -0
  63. package/CONTRIBUTING.md +51 -1
  64. package/README.md +264 -20
  65. package/biome.json +8 -2
  66. package/lefthook.yml +3 -2
  67. package/node_modules/@sting8k/pi-vcc/README.md +200 -0
  68. package/node_modules/@sting8k/pi-vcc/index.ts +14 -0
  69. package/node_modules/@sting8k/pi-vcc/package.json +26 -0
  70. package/node_modules/@sting8k/pi-vcc/scripts/audit-sessions.ts +88 -0
  71. package/node_modules/@sting8k/pi-vcc/scripts/benchmark-real-sessions.ts +25 -0
  72. package/node_modules/@sting8k/pi-vcc/scripts/compare-before-after.ts +36 -0
  73. package/node_modules/@sting8k/pi-vcc/scripts/dump-branch-output.ts +20 -0
  74. package/node_modules/@sting8k/pi-vcc/src/commands/pi-vcc.ts +36 -0
  75. package/node_modules/@sting8k/pi-vcc/src/commands/vcc-recall.ts +65 -0
  76. package/node_modules/@sting8k/pi-vcc/src/core/brief.ts +381 -0
  77. package/node_modules/@sting8k/pi-vcc/src/core/build-sections.ts +79 -0
  78. package/node_modules/@sting8k/pi-vcc/src/core/content.ts +60 -0
  79. package/node_modules/@sting8k/pi-vcc/src/core/filter-noise.ts +42 -0
  80. package/node_modules/@sting8k/pi-vcc/src/core/format-recall.ts +27 -0
  81. package/node_modules/@sting8k/pi-vcc/src/core/format.ts +49 -0
  82. package/node_modules/@sting8k/pi-vcc/src/core/lineage.ts +26 -0
  83. package/node_modules/@sting8k/pi-vcc/src/core/load-messages.ts +41 -0
  84. package/node_modules/@sting8k/pi-vcc/src/core/normalize.ts +66 -0
  85. package/node_modules/@sting8k/pi-vcc/src/core/recall-scope.ts +14 -0
  86. package/node_modules/@sting8k/pi-vcc/src/core/render-entries.ts +55 -0
  87. package/node_modules/@sting8k/pi-vcc/src/core/report.ts +237 -0
  88. package/node_modules/@sting8k/pi-vcc/src/core/sanitize.ts +5 -0
  89. package/node_modules/@sting8k/pi-vcc/src/core/search-entries.ts +221 -0
  90. package/node_modules/@sting8k/pi-vcc/src/core/settings.ts +77 -0
  91. package/node_modules/@sting8k/pi-vcc/src/core/skill-collapse.ts +35 -0
  92. package/node_modules/@sting8k/pi-vcc/src/core/summarize.ts +157 -0
  93. package/node_modules/@sting8k/pi-vcc/src/core/tool-args.ts +14 -0
  94. package/node_modules/@sting8k/pi-vcc/src/details.ts +7 -0
  95. package/node_modules/@sting8k/pi-vcc/src/extract/commits.ts +69 -0
  96. package/node_modules/@sting8k/pi-vcc/src/extract/files.ts +80 -0
  97. package/node_modules/@sting8k/pi-vcc/src/extract/goals.ts +79 -0
  98. package/node_modules/@sting8k/pi-vcc/src/extract/preferences.ts +55 -0
  99. package/node_modules/@sting8k/pi-vcc/src/hooks/before-compact.ts +322 -0
  100. package/node_modules/@sting8k/pi-vcc/src/sections.ts +12 -0
  101. package/node_modules/@sting8k/pi-vcc/src/tools/recall.ts +109 -0
  102. package/node_modules/@sting8k/pi-vcc/src/types.ts +14 -0
  103. package/node_modules/@sting8k/pi-vcc/tests/before-compact-hook.test.ts +181 -0
  104. package/node_modules/@sting8k/pi-vcc/tests/before-compact.test.ts +140 -0
  105. package/node_modules/@sting8k/pi-vcc/tests/brief.test.ts +206 -0
  106. package/node_modules/@sting8k/pi-vcc/tests/build-sections.test.ts +59 -0
  107. package/node_modules/@sting8k/pi-vcc/tests/compile.test.ts +80 -0
  108. package/node_modules/@sting8k/pi-vcc/tests/content.test.ts +31 -0
  109. package/node_modules/@sting8k/pi-vcc/tests/extract-goals.test.ts +86 -0
  110. package/node_modules/@sting8k/pi-vcc/tests/extract-preferences.test.ts +30 -0
  111. package/node_modules/@sting8k/pi-vcc/tests/filter-noise.test.ts +61 -0
  112. package/node_modules/@sting8k/pi-vcc/tests/fixtures.ts +61 -0
  113. package/node_modules/@sting8k/pi-vcc/tests/format-recall.test.ts +30 -0
  114. package/node_modules/@sting8k/pi-vcc/tests/format.test.ts +62 -0
  115. package/node_modules/@sting8k/pi-vcc/tests/lineage.test.ts +33 -0
  116. package/node_modules/@sting8k/pi-vcc/tests/load-messages.test.ts +51 -0
  117. package/node_modules/@sting8k/pi-vcc/tests/normalize.test.ts +97 -0
  118. package/node_modules/@sting8k/pi-vcc/tests/real-sessions.test.ts +38 -0
  119. package/node_modules/@sting8k/pi-vcc/tests/recall-expand.test.ts +15 -0
  120. package/node_modules/@sting8k/pi-vcc/tests/recall-scope.test.ts +32 -0
  121. package/node_modules/@sting8k/pi-vcc/tests/recall-tool-scope.test.ts +67 -0
  122. package/node_modules/@sting8k/pi-vcc/tests/render-entries.test.ts +62 -0
  123. package/node_modules/@sting8k/pi-vcc/tests/report.test.ts +44 -0
  124. package/node_modules/@sting8k/pi-vcc/tests/sanitize.test.ts +24 -0
  125. package/node_modules/@sting8k/pi-vcc/tests/search-entries.test.ts +144 -0
  126. package/node_modules/@sting8k/pi-vcc/tests/support/load-session.ts +23 -0
  127. package/node_modules/@sting8k/pi-vcc/tests/support/real-sessions.ts +51 -0
  128. package/package.json +15 -4
  129. package/scripts/__pycache__/merge_graphify_corpora.cpython-314.pyc +0 -0
  130. package/scripts/index_youtube_urls.py +376 -0
  131. package/scripts/merge_graphify_corpora.py +398 -0
  132. package/scripts/regen_graphify_html.py +46 -0
  133. package/.agents/skills/defuddle/SKILL.md +0 -90
  134. package/.agents/skills/wiki/SKILL.md +0 -215
  135. package/.agents/skills/wiki/references/css-snippets.md +0 -122
  136. package/.agents/skills/wiki/references/frontmatter.md +0 -107
  137. package/.agents/skills/wiki/references/git-setup.md +0 -58
  138. package/.agents/skills/wiki/references/mcp-setup.md +0 -149
  139. package/.agents/skills/wiki/references/modes.md +0 -259
  140. package/.agents/skills/wiki/references/plugins.md +0 -96
  141. package/.agents/skills/wiki/references/rest-api.md +0 -124
  142. package/.agents/skills/wiki-fold/SKILL.md +0 -204
  143. package/.agents/skills/wiki-fold/references/fold-template.md +0 -133
  144. package/.agents/skills/wiki-ingest/SKILL.md +0 -288
  145. package/.agents/skills/wiki-lint/SKILL.md +0 -183
  146. package/.agents/skills/wiki-query/SKILL.md +0 -176
  147. package/.pi/agents/rethink.md +0 -140
  148. package/.pi/agents/wiki-ingest.md +0 -67
  149. package/.pi/agents/wiki-lint.md +0 -75
  150. package/.pi/internal/cursor-sdk-transcript-parser.ts +0 -59
  151. package/.pi/prompts/save.md +0 -16
  152. package/.pi/prompts/wiki.md +0 -23
  153. package/.pi/providers/cursor-sdk-provider.test.mjs +0 -476
  154. package/.pi/providers/cursor-sdk-provider.ts +0 -1085
  155. package/vault/AGENTS.md +0 -37
  156. package/vault/wiki/_templates/comparison.md +0 -39
  157. package/vault/wiki/_templates/concept.md +0 -40
  158. package/vault/wiki/_templates/decision.md +0 -21
  159. package/vault/wiki/_templates/entity.md +0 -32
  160. package/vault/wiki/_templates/flow.md +0 -14
  161. package/vault/wiki/_templates/module.md +0 -18
  162. package/vault/wiki/_templates/question.md +0 -31
  163. package/vault/wiki/_templates/source.md +0 -39
  164. package/vault/wiki/concepts/AST-Aware Code Chunking.md +0 -44
  165. package/vault/wiki/concepts/Build-Time Prompt Compilation.md +0 -107
  166. package/vault/wiki/concepts/Context Engine (AI Coding).md +0 -47
  167. package/vault/wiki/concepts/Context-Aware System Reminders.md +0 -61
  168. package/vault/wiki/concepts/Contextualized Text Embedding.md +0 -42
  169. package/vault/wiki/concepts/Contractor vs Employee AI Model.md +0 -55
  170. package/vault/wiki/concepts/Dual-Model Agent Architecture.md +0 -65
  171. package/vault/wiki/concepts/Late Chunking vs Early Chunking.md +0 -43
  172. package/vault/wiki/concepts/Majority Vote Ensembling.md +0 -68
  173. package/vault/wiki/concepts/Meta-Harness.md +0 -16
  174. package/vault/wiki/concepts/Multi-Agent AI Coding Architecture.md +0 -75
  175. package/vault/wiki/concepts/Prompt Enhancement.md +0 -90
  176. package/vault/wiki/concepts/Prompt Renderer.md +0 -89
  177. package/vault/wiki/concepts/Semantic Codebase Indexing.md +0 -67
  178. package/vault/wiki/concepts/additive-config-hierarchy.md +0 -16
  179. package/vault/wiki/concepts/agent-artifacts-verifiable-deliverables.md +0 -71
  180. package/vault/wiki/concepts/agent-browser-browser-automation.md +0 -99
  181. package/vault/wiki/concepts/agent-codebase-interface.md +0 -43
  182. package/vault/wiki/concepts/agent-harness-architecture.md +0 -67
  183. package/vault/wiki/concepts/agent-loop-detection-patterns.md +0 -133
  184. package/vault/wiki/concepts/agent-search-enforcement.md +0 -126
  185. package/vault/wiki/concepts/agent-skills-ecosystem.md +0 -74
  186. package/vault/wiki/concepts/agent-skills-pattern.md +0 -68
  187. package/vault/wiki/concepts/agentic-harness-context-enforcement.md +0 -91
  188. package/vault/wiki/concepts/agentic-harness.md +0 -34
  189. package/vault/wiki/concepts/agentic-orchestration-pipeline.md +0 -56
  190. package/vault/wiki/concepts/agentic-search-no-embeddings.md +0 -18
  191. package/vault/wiki/concepts/anthropic-context-engineering.md +0 -13
  192. package/vault/wiki/concepts/antigravity-agent-first-architecture.md +0 -61
  193. package/vault/wiki/concepts/ast-compression.md +0 -19
  194. package/vault/wiki/concepts/ast-truncation.md +0 -66
  195. package/vault/wiki/concepts/barrel-files.md +0 -37
  196. package/vault/wiki/concepts/browser-harness-agent.md +0 -41
  197. package/vault/wiki/concepts/browser-subagent-visual-verification.md +0 -82
  198. package/vault/wiki/concepts/codebase-intelligence-ecosystem-comparison.md +0 -192
  199. package/vault/wiki/concepts/codebase-intelligence-harness-integration.md +0 -161
  200. package/vault/wiki/concepts/codebase-to-context-ingestion.md +0 -46
  201. package/vault/wiki/concepts/codex-harness-innovations.md +0 -147
  202. package/vault/wiki/concepts/consensus-debate-flow.md +0 -17
  203. package/vault/wiki/concepts/consensus-debate.md +0 -206
  204. package/vault/wiki/concepts/content-addressed-spec-identity.md +0 -166
  205. package/vault/wiki/concepts/context-anxiety.md +0 -57
  206. package/vault/wiki/concepts/context-compression-techniques.md +0 -19
  207. package/vault/wiki/concepts/context-continuity.md +0 -22
  208. package/vault/wiki/concepts/context-drift-in-agents.md +0 -106
  209. package/vault/wiki/concepts/context-engineering.md +0 -62
  210. package/vault/wiki/concepts/context-folding.md +0 -67
  211. package/vault/wiki/concepts/context-mode.md +0 -38
  212. package/vault/wiki/concepts/cursor-harness-innovations.md +0 -107
  213. package/vault/wiki/concepts/deterministic-session-compaction.md +0 -79
  214. package/vault/wiki/concepts/drift-detection-unified.md +0 -296
  215. package/vault/wiki/concepts/execution-feedback-loop.md +0 -46
  216. package/vault/wiki/concepts/feedforward-feedback-harness.md +0 -60
  217. package/vault/wiki/concepts/five-root-cause-metrics-sentrux.md +0 -40
  218. package/vault/wiki/concepts/fork-safe-spec-storage.md +0 -89
  219. package/vault/wiki/concepts/fts5-sandbox.md +0 -19
  220. package/vault/wiki/concepts/fuzzy-edit-matching.md +0 -71
  221. package/vault/wiki/concepts/gemini-cli-architecture.md +0 -104
  222. package/vault/wiki/concepts/generator-evaluator-architecture.md +0 -64
  223. package/vault/wiki/concepts/guardian-agent-pattern.md +0 -67
  224. package/vault/wiki/concepts/harness-configuration-layers.md +0 -89
  225. package/vault/wiki/concepts/harness-control-frameworks.md +0 -155
  226. package/vault/wiki/concepts/harness-engineering-first-principles.md +0 -90
  227. package/vault/wiki/concepts/harness-h-formalism.md +0 -53
  228. package/vault/wiki/concepts/hybrid-code-search.md +0 -61
  229. package/vault/wiki/concepts/inline-post-edit-validation.md +0 -112
  230. package/vault/wiki/concepts/legendary-engineering-patterns-harness.md +0 -110
  231. package/vault/wiki/concepts/lifecycle-hooks.md +0 -94
  232. package/vault/wiki/concepts/mcp-tool-routing.md +0 -102
  233. package/vault/wiki/concepts/memory-system-of-record-vs-ephemeral-cache.md +0 -47
  234. package/vault/wiki/concepts/meta-agent-context-pruning.md +0 -151
  235. package/vault/wiki/concepts/model-adaptive-harness.md +0 -122
  236. package/vault/wiki/concepts/model-routing-agents.md +0 -101
  237. package/vault/wiki/concepts/monorepo-architecture.md +0 -45
  238. package/vault/wiki/concepts/multi-agent-specialization.md +0 -61
  239. package/vault/wiki/concepts/permission-subsystem.md +0 -16
  240. package/vault/wiki/concepts/pi-messenger-analysis.md +0 -243
  241. package/vault/wiki/concepts/pi-vscode-extension-landscape.md +0 -37
  242. package/vault/wiki/concepts/policy-engine-pattern.md +0 -78
  243. package/vault/wiki/concepts/progressive-disclosure-agents.md +0 -53
  244. package/vault/wiki/concepts/progressive-skill-disclosure.md +0 -17
  245. package/vault/wiki/concepts/provider-native-prompting.md +0 -203
  246. package/vault/wiki/concepts/quality-signal-sentrux.md +0 -37
  247. package/vault/wiki/concepts/repo-map-ranking.md +0 -42
  248. package/vault/wiki/concepts/result-monad-error-handling.md +0 -47
  249. package/vault/wiki/concepts/safety-defense-in-depth.md +0 -83
  250. package/vault/wiki/concepts/sandbox-os-enforcement.md +0 -18
  251. package/vault/wiki/concepts/selective-debate-routing.md +0 -70
  252. package/vault/wiki/concepts/self-evolving-harness.md +0 -60
  253. package/vault/wiki/concepts/sentrux-mcp-integration.md +0 -36
  254. package/vault/wiki/concepts/sentrux-rules-engine.md +0 -49
  255. package/vault/wiki/concepts/shell-pattern-compression.md +0 -24
  256. package/vault/wiki/concepts/skill-first-architecture.md +0 -166
  257. package/vault/wiki/concepts/structured-compaction.md +0 -78
  258. package/vault/wiki/concepts/subagent-orchestration.md +0 -17
  259. package/vault/wiki/concepts/subagent-worktree-isolation.md +0 -68
  260. package/vault/wiki/concepts/superpowers-methodology.md +0 -78
  261. package/vault/wiki/concepts/think-in-code.md +0 -73
  262. package/vault/wiki/concepts/ts-execution-layer.md +0 -100
  263. package/vault/wiki/concepts/typescript-strict-mode.md +0 -37
  264. package/vault/wiki/concepts/vcc-conversation-compaction-for-pi.md +0 -53
  265. package/vault/wiki/concepts/verification-drift-detection.md +0 -19
  266. package/vault/wiki/consensus/consensus-records.md +0 -58
  267. package/vault/wiki/decisions/2026-04-30-pi-lean-ctx-native.md +0 -122
  268. package/vault/wiki/decisions/2026-05-07-replace-lean-ctx-with-context-mode.md +0 -59
  269. package/vault/wiki/decisions/adr-008.md +0 -40
  270. package/vault/wiki/decisions/adr-009.md +0 -46
  271. package/vault/wiki/decisions/adr-010.md +0 -55
  272. package/vault/wiki/decisions/adr-011.md +0 -165
  273. package/vault/wiki/decisions/adr-012.md +0 -102
  274. package/vault/wiki/decisions/adr-013.md +0 -59
  275. package/vault/wiki/decisions/adr-014.md +0 -73
  276. package/vault/wiki/decisions/adr-015.md +0 -81
  277. package/vault/wiki/decisions/adr-016.md +0 -91
  278. package/vault/wiki/decisions/adr-017.md +0 -79
  279. package/vault/wiki/decisions/adr-018.md +0 -100
  280. package/vault/wiki/decisions/adr-019.md +0 -75
  281. package/vault/wiki/decisions/adr-020.md +0 -106
  282. package/vault/wiki/decisions/adr-021.md +0 -86
  283. package/vault/wiki/decisions/adr-022.md +0 -113
  284. package/vault/wiki/decisions/adr-023.md +0 -113
  285. package/vault/wiki/decisions/adr-024.md +0 -73
  286. package/vault/wiki/decisions/adr-025.md +0 -130
  287. package/vault/wiki/decisions/adr-026.md +0 -56
  288. package/vault/wiki/decisions/adr-027.md +0 -94
  289. package/vault/wiki/decisions/colocate-wiki.md +0 -34
  290. package/vault/wiki/entities/Anders Hejlsberg.md +0 -29
  291. package/vault/wiki/entities/Anthropic.md +0 -17
  292. package/vault/wiki/entities/Augment Code.md +0 -49
  293. package/vault/wiki/entities/Bjarne Stroustrup.md +0 -26
  294. package/vault/wiki/entities/Bolt.new (StackBlitz).md +0 -39
  295. package/vault/wiki/entities/Boris Cherny.md +0 -11
  296. package/vault/wiki/entities/Claude Code.md +0 -19
  297. package/vault/wiki/entities/Dennis Ritchie.md +0 -26
  298. package/vault/wiki/entities/Emergent Labs.md +0 -32
  299. package/vault/wiki/entities/Google Cloud.md +0 -16
  300. package/vault/wiki/entities/Guido van Rossum.md +0 -28
  301. package/vault/wiki/entities/Ken Thompson.md +0 -28
  302. package/vault/wiki/entities/Lee et al.md +0 -16
  303. package/vault/wiki/entities/Linus Torvalds.md +0 -28
  304. package/vault/wiki/entities/Lovable (company).md +0 -40
  305. package/vault/wiki/entities/Martin Fowler.md +0 -16
  306. package/vault/wiki/entities/Meng et al.md +0 -16
  307. package/vault/wiki/entities/OpenAI.md +0 -16
  308. package/vault/wiki/entities/Rocket.new.md +0 -38
  309. package/vault/wiki/entities/VILA-Lab.md +0 -15
  310. package/vault/wiki/entities/autodev-codebase.md +0 -18
  311. package/vault/wiki/entities/ck-tool.md +0 -59
  312. package/vault/wiki/entities/codesearch.md +0 -18
  313. package/vault/wiki/entities/disler-indydevdan.md +0 -33
  314. package/vault/wiki/entities/gsd-get-shit-done.md +0 -56
  315. package/vault/wiki/entities/javascript-runtimes.md +0 -48
  316. package/vault/wiki/entities/jesse-vincent.md +0 -38
  317. package/vault/wiki/entities/lean-ctx.md +0 -32
  318. package/vault/wiki/entities/opendev.md +0 -41
  319. package/vault/wiki/entities/ops-codegraph-tool.md +0 -18
  320. package/vault/wiki/entities/pi-coding-agent.md +0 -53
  321. package/vault/wiki/entities/sentrux.md +0 -54
  322. package/vault/wiki/entities/vgrep-tool.md +0 -57
  323. package/vault/wiki/entities/vitest.md +0 -41
  324. package/vault/wiki/flows/harness-wiki-pipeline.md +0 -204
  325. package/vault/wiki/hot.md +0 -932
  326. package/vault/wiki/index.md +0 -437
  327. package/vault/wiki/log.md +0 -422
  328. package/vault/wiki/meta/dashboard.md +0 -30
  329. package/vault/wiki/meta/lint-report-2026-04-30.md +0 -86
  330. package/vault/wiki/meta/lint-report-2026-05-02.md +0 -251
  331. package/vault/wiki/meta/overview.canvas +0 -43
  332. package/vault/wiki/modules/adversarial-verification.md +0 -57
  333. package/vault/wiki/modules/automated-observability.md +0 -54
  334. package/vault/wiki/modules/bench.md +0 -20
  335. package/vault/wiki/modules/extensions.md +0 -23
  336. package/vault/wiki/modules/grounding-checkpoints.md +0 -62
  337. package/vault/wiki/modules/harness-implementation-plan.md +0 -345
  338. package/vault/wiki/modules/harness-wiki-skill-mapping.md +0 -135
  339. package/vault/wiki/modules/harness.md +0 -86
  340. package/vault/wiki/modules/persistent-memory.md +0 -85
  341. package/vault/wiki/modules/schema-orchestration.md +0 -68
  342. package/vault/wiki/modules/skills.md +0 -27
  343. package/vault/wiki/modules/spec-hardening.md +0 -58
  344. package/vault/wiki/modules/structured-planning.md +0 -53
  345. package/vault/wiki/modules/think-in-code-enforcement.md +0 -153
  346. package/vault/wiki/modules/wiki-query-interface.md +0 -64
  347. package/vault/wiki/overview.md +0 -51
  348. package/vault/wiki/questions/Research-pi-vs-claude-code-agentic-orchestration-pipeline.md +0 -87
  349. package/vault/wiki/questions/Research-sentrux-dev.md +0 -123
  350. package/vault/wiki/questions/Research-superpowers-skill-for-agentic-coding-agents.md +0 -164
  351. package/vault/wiki/questions/Research: Augment Code Context Engine.md +0 -244
  352. package/vault/wiki/questions/Research: Automating Software Engineering - Lovable, Bolt, Emergent, Rocket.md +0 -112
  353. package/vault/wiki/questions/Research: Claude Code State-of-the-Art Harness Improvements.md +0 -209
  354. package/vault/wiki/questions/Research: Codex State-of-the-Art Harness Improvements.md +0 -99
  355. package/vault/wiki/questions/Research: Engineering Workflows of Legendary Programmers and AI Harness Mapping.md +0 -107
  356. package/vault/wiki/questions/Research: Fallow Codebase Intelligence Harness Integration.md +0 -72
  357. package/vault/wiki/questions/Research: Gemini CLI SOTA Harness Integration.md +0 -166
  358. package/vault/wiki/questions/Research: GitHub Issues as Harness Spec Storage.md +0 -188
  359. package/vault/wiki/questions/Research: Google Antigravity Harness Integration.md +0 -120
  360. package/vault/wiki/questions/Research: Meta-Agent Context Drift Detection.md +0 -236
  361. package/vault/wiki/questions/Research: Model-Adaptive Agent Harness Design.md +0 -95
  362. package/vault/wiki/questions/Research: Model-Specific Prompting Guides.md +0 -165
  363. package/vault/wiki/questions/Research: Prompt Renderer for Multi-Model Agent Harness.md +0 -216
  364. package/vault/wiki/questions/Research: Skill-First Harness Architecture.md +0 -91
  365. package/vault/wiki/questions/Research: TypeScript Best Practices and Codebase Structure.md +0 -88
  366. package/vault/wiki/questions/Research: TypeScript Execution Layer for Agent Tool Calling.md +0 -81
  367. package/vault/wiki/questions/Research: claude-mem over Obsidian for Harness Layer.md +0 -71
  368. package/vault/wiki/questions/Research: claude-mem over obsidian wiki as the knowledge base for our agentic harness pipeline. think from first principles. does this replace or complement our current setup? no hard feelings about previous decisions. gimme accurate points.md +0 -80
  369. package/vault/wiki/questions/Research: context-mode vs lean-ctx.md +0 -72
  370. package/vault/wiki/questions/Research: cursor.sh Harness Innovations.md +0 -92
  371. package/vault/wiki/questions/Research: executor.sh Harness Integration.md +0 -170
  372. package/vault/wiki/questions/Research: how GSD fits into our coding harness setup.md +0 -97
  373. package/vault/wiki/questions/Research: how claude-mem fits into our workflow. and whether it should replace obsidian in the codebase. no hard feelings about previous actions, rethink from first principles always.md +0 -80
  374. package/vault/wiki/questions/Research: pi-vcc.md +0 -113
  375. package/vault/wiki/questions/Research: semantic code search tools.md +0 -69
  376. package/vault/wiki/questions/Research: vcc extension for pi coding agent.md +0 -73
  377. package/vault/wiki/questions/how-to-enable-semantic-code-search-now.md +0 -111
  378. package/vault/wiki/questions/mvp-implementation-blueprint.md +0 -552
  379. package/vault/wiki/questions/research-agent-first-codebase-exploration.md +0 -199
  380. package/vault/wiki/questions/research-agentic-coding-harness-latest-papers.md +0 -142
  381. package/vault/wiki/questions/research-gitingest-gitreverse-integration.md +0 -100
  382. package/vault/wiki/questions/research-wozcode-token-reduction.md +0 -67
  383. package/vault/wiki/questions/resolved-context-pruning-inplace-vs-restart.md +0 -95
  384. package/vault/wiki/questions/resolved-context-window-economics.md +0 -167
  385. package/vault/wiki/questions/resolved-imad-debate-gating-transfer.md +0 -126
  386. package/vault/wiki/questions/resolved-mcp-tool-preference.md +0 -112
  387. package/vault/wiki/questions/resolved-small-model-meta-agents.md +0 -107
  388. package/vault/wiki/questions/resolved-treesitter-dynamic-languages.md +0 -95
  389. package/vault/wiki/sources/Auggie Context MCP Server.md +0 -63
  390. package/vault/wiki/sources/Augment Code Codacy AI Giants.md +0 -61
  391. package/vault/wiki/sources/Augment Code MCP SiliconAngle.md +0 -49
  392. package/vault/wiki/sources/Augment Code WorkOS ERC 2025.md +0 -55
  393. package/vault/wiki/sources/Augment Context Engine Official.md +0 -71
  394. package/vault/wiki/sources/Augment SWE-bench Agent GitHub.md +0 -74
  395. package/vault/wiki/sources/Augment SWE-bench Pro Blog.md +0 -58
  396. package/vault/wiki/sources/Source: AgentBus Jinja2 Prompt Pipelines.md +0 -75
  397. package/vault/wiki/sources/Source: Arxiv — Don't Break the Cache.md +0 -85
  398. package/vault/wiki/sources/Source: Augment - Harness Engineering for AI Coding Agents.md +0 -58
  399. package/vault/wiki/sources/Source: Blake Crosley Agent Architecture Guide.md +0 -100
  400. package/vault/wiki/sources/Source: Bolt.new Architecture & Case Study.md +0 -75
  401. package/vault/wiki/sources/Source: Build-Time Prompt Compilation Architecture.md +0 -107
  402. package/vault/wiki/sources/Source: Claude API Agent Skills Overview.md +0 -70
  403. package/vault/wiki/sources/Source: Gemini CLI Changelogs.md +0 -88
  404. package/vault/wiki/sources/Source: Google Blog - Gemini CLI Announcement.md +0 -57
  405. package/vault/wiki/sources/Source: Google Gemini CLI Architecture Docs.md +0 -53
  406. package/vault/wiki/sources/Source: LangChain - Anatomy of Agent Harness.md +0 -65
  407. package/vault/wiki/sources/Source: Lovable Architecture & Clone Analysis.md +0 -83
  408. package/vault/wiki/sources/Source: Martin Fowler - Harness Engineering.md +0 -70
  409. package/vault/wiki/sources/Source: OpenAI Harness Engineering Five Principles.md +0 -58
  410. package/vault/wiki/sources/Source: OpenAI Harness Engineering — 0 Lines of Human Code.md +0 -101
  411. package/vault/wiki/sources/Source: OpenDev — Building AI Coding Agents for the Terminal.md +0 -100
  412. package/vault/wiki/sources/Source: Render AI Coding Agents Benchmark 2025.md +0 -53
  413. package/vault/wiki/sources/Source: Rocket.new — Vibe Solutioning Platform.md +0 -70
  414. package/vault/wiki/sources/Source: SwirlAI Agent Skills Progressive Disclosure.md +0 -71
  415. package/vault/wiki/sources/Source: TianPan Prompt Caching Architecture.md +0 -89
  416. package/vault/wiki/sources/Source: Vercel Labs agent-browser.md +0 -155
  417. package/vault/wiki/sources/Source: browser-harness CDP Harness.md +0 -126
  418. package/vault/wiki/sources/agent-drift-academic-paper.md +0 -79
  419. package/vault/wiki/sources/aider-repomap-tree-sitter.md +0 -42
  420. package/vault/wiki/sources/anthropic-compaction-api.md +0 -58
  421. package/vault/wiki/sources/anthropic-effective-harnesses.md +0 -42
  422. package/vault/wiki/sources/anthropic-prompt-best-practices.md +0 -100
  423. package/vault/wiki/sources/anthropic2026-harness-design.md +0 -63
  424. package/vault/wiki/sources/barrel-files-tkdodo.md +0 -38
  425. package/vault/wiki/sources/birth-of-unix-kernighan-interview.md +0 -57
  426. package/vault/wiki/sources/bockeler2026-harness-engineering.md +0 -69
  427. package/vault/wiki/sources/cast-code-chunking-paper.md +0 -50
  428. package/vault/wiki/sources/ck-semantic-search.md +0 -78
  429. package/vault/wiki/sources/claude-code-architecture-karaxai-2026.md +0 -71
  430. package/vault/wiki/sources/claude-code-architecture-qubytes-2026.md +0 -50
  431. package/vault/wiki/sources/claude-code-architecture-vila-lab-2026.md +0 -64
  432. package/vault/wiki/sources/claude-code-security-architecture-penligent-2026.md +0 -70
  433. package/vault/wiki/sources/claude-context-editing-docs.md +0 -13
  434. package/vault/wiki/sources/cloudflare-codemode.md +0 -63
  435. package/vault/wiki/sources/code-chunk-library-supermemory.md +0 -63
  436. package/vault/wiki/sources/codeact-apple-2024.md +0 -62
  437. package/vault/wiki/sources/codex-dsc-rfc-8573.md +0 -41
  438. package/vault/wiki/sources/codex-open-source-agent-2026.md +0 -110
  439. package/vault/wiki/sources/coir-code-retrieval-benchmark.md +0 -51
  440. package/vault/wiki/sources/colinmcnamara-context-optimization-codemode.md +0 -48
  441. package/vault/wiki/sources/context-folding-paper.md +0 -61
  442. package/vault/wiki/sources/context-mode-website.md +0 -63
  443. package/vault/wiki/sources/cursor-agent-best-practices-2026.md +0 -62
  444. package/vault/wiki/sources/cursor-fork-29b-2025.md +0 -50
  445. package/vault/wiki/sources/cursor-harness-april-2026.md +0 -76
  446. package/vault/wiki/sources/cursor-instant-apply-2024.md +0 -45
  447. package/vault/wiki/sources/cursor-shadow-workspace-2024.md +0 -52
  448. package/vault/wiki/sources/cursor-shipped-coding-agent-2026.md +0 -53
  449. package/vault/wiki/sources/cursor-vs-antigravity-2026.md +0 -51
  450. package/vault/wiki/sources/disler-pi-vs-claude-code.md +0 -69
  451. package/vault/wiki/sources/distill-deterministic-context-compression.md +0 -53
  452. package/vault/wiki/sources/embedding-models-benchmark-supermemory-2025.md +0 -48
  453. package/vault/wiki/sources/executor-rhyssullivan.md +0 -122
  454. package/vault/wiki/sources/fallow-rs-codebase-intelligence.md +0 -125
  455. package/vault/wiki/sources/fan2025-imad.md +0 -60
  456. package/vault/wiki/sources/forgecode-gpt5-agent-improvements.md +0 -63
  457. package/vault/wiki/sources/gemini-3-prompting-guide.md +0 -78
  458. package/vault/wiki/sources/gh-cli-sub-issue-rfc.md +0 -50
  459. package/vault/wiki/sources/gh-sub-issue-extension.md +0 -72
  460. package/vault/wiki/sources/github-fork-issues-discussion.md +0 -44
  461. package/vault/wiki/sources/github-issue-dependencies-docs.md +0 -49
  462. package/vault/wiki/sources/github-sub-issues-docs.md +0 -51
  463. package/vault/wiki/sources/gitingest.md +0 -91
  464. package/vault/wiki/sources/gitreverse.md +0 -63
  465. package/vault/wiki/sources/google-antigravity-official-blog.md +0 -47
  466. package/vault/wiki/sources/google-antigravity-wikipedia.md +0 -53
  467. package/vault/wiki/sources/gsd-codecentric-deep-dive.md +0 -57
  468. package/vault/wiki/sources/gsd-github-repo.md +0 -51
  469. package/vault/wiki/sources/gsd-hn-discussion.md +0 -59
  470. package/vault/wiki/sources/guido-python-design-philosophy.md +0 -56
  471. package/vault/wiki/sources/hejlsberg-7-learnings.md +0 -48
  472. package/vault/wiki/sources/ironclaw-drift-monitor.md +0 -80
  473. package/vault/wiki/sources/langsight-loop-detection.md +0 -80
  474. package/vault/wiki/sources/leanctx-website.md +0 -69
  475. package/vault/wiki/sources/lee2026-meta-harness.md +0 -59
  476. package/vault/wiki/sources/linux-kernel-coding-workflow.md +0 -50
  477. package/vault/wiki/sources/lou2026-autoharness.md +0 -53
  478. package/vault/wiki/sources/martin-fowler-harness-engineering.md +0 -73
  479. package/vault/wiki/sources/mcp-architecture-docs.md +0 -13
  480. package/vault/wiki/sources/meng2026-agent-harness-survey.md +0 -79
  481. package/vault/wiki/sources/mindstudio-four-agent-types.md +0 -68
  482. package/vault/wiki/sources/ms-chat-history-management.md +0 -13
  483. package/vault/wiki/sources/openai-prompt-guidance.md +0 -104
  484. package/vault/wiki/sources/openclaw-session-pruning.md +0 -13
  485. package/vault/wiki/sources/opencode-dcp.md +0 -13
  486. package/vault/wiki/sources/opendev-arxiv-2603.05344v1.md +0 -79
  487. package/vault/wiki/sources/openhands-platform.md +0 -39
  488. package/vault/wiki/sources/oss-guide-codebase-exploration.md +0 -53
  489. package/vault/wiki/sources/pi-compaction-extensions-ecosystem.md +0 -102
  490. package/vault/wiki/sources/pi-context-prune-github-repo.md +0 -38
  491. package/vault/wiki/sources/pi-mono-compaction-docs.md +0 -38
  492. package/vault/wiki/sources/pi-omni-compact-github-repo.md +0 -50
  493. package/vault/wiki/sources/pi-rtk-optimizer-github-repo.md +0 -45
  494. package/vault/wiki/sources/pi-vcc-github-repo.md +0 -69
  495. package/vault/wiki/sources/pi-vscode-marketplace.md +0 -41
  496. package/vault/wiki/sources/pi-vscode-model-provider-marketplace.md +0 -39
  497. package/vault/wiki/sources/py-tree-sitter.md +0 -13
  498. package/vault/wiki/sources/sentrux-dev-landing.md +0 -40
  499. package/vault/wiki/sources/sentrux-docs-pro-architecture.md +0 -75
  500. package/vault/wiki/sources/sentrux-docs-quality-signal.md +0 -46
  501. package/vault/wiki/sources/sentrux-docs-root-cause-metrics.md +0 -57
  502. package/vault/wiki/sources/sentrux-docs-rules-engine.md +0 -58
  503. package/vault/wiki/sources/sentrux-github-repo.md +0 -56
  504. package/vault/wiki/sources/superpowers-github-repo.md +0 -56
  505. package/vault/wiki/sources/superpowers-release-blog.md +0 -54
  506. package/vault/wiki/sources/superpowers-termdock-analysis.md +0 -45
  507. package/vault/wiki/sources/swe-agent-aci.md +0 -42
  508. package/vault/wiki/sources/swe-bench.md +0 -45
  509. package/vault/wiki/sources/swe-pruner-context-pruning.md +0 -13
  510. package/vault/wiki/sources/think-in-code-blog.md +0 -48
  511. package/vault/wiki/sources/tree-sitter-docs.md +0 -13
  512. package/vault/wiki/sources/ts-best-practices-2025-devto.md +0 -42
  513. package/vault/wiki/sources/ts-folder-structure-mingyang.md +0 -58
  514. package/vault/wiki/sources/ts-monorepo-koerselman.md +0 -44
  515. package/vault/wiki/sources/ts-result-error-handling-kkalamarski.md +0 -52
  516. package/vault/wiki/sources/ts-runtimes-comparison-betterstack.md +0 -42
  517. package/vault/wiki/sources/ts-strict-mode-rishikc.md +0 -43
  518. package/vault/wiki/sources/unix-philosophy.md +0 -48
  519. package/vault/wiki/sources/vectara-chunking-vs-embedding-naacl2025.md +0 -39
  520. package/vault/wiki/sources/vectara-guardian-agents.md +0 -79
  521. package/vault/wiki/sources/vgrep-semantic-search.md +0 -76
  522. package/vault/wiki/sources/vitest-official.md +0 -41
  523. package/vault/wiki/sources/vscode-pi-community-extension.md +0 -40
  524. package/vault/wiki/sources/wozcode.md +0 -79
@@ -1,99 +0,0 @@
1
- ---
2
- type: concept
3
- title: "agent-browser — Rust-Native Browser Automation for AI Agents"
4
- status: developing
5
- created: 2026-05-02
6
- updated: 2026-05-02
7
- tags:
8
- - browser-automation
9
- - ai-agents
10
- - vercel-labs
11
- - rust
12
- - cdp
13
- - headless-browser
14
- aliases: ["agent-browser", "Vercel Labs agent-browser"]
15
- related:
16
- - "[[browser-subagent-visual-verification]]"
17
- - "[[harness-implementation-plan]]"
18
- - "[[Source: Vercel Labs agent-browser]]"
19
- sources:
20
- - "[[Source: Vercel Labs agent-browser]]"
21
- ---
22
-
23
- # agent-browser — Rust-Native Browser Automation for AI Agents
24
-
25
- Vercel Labs agent-browser (31.4K GitHub stars, Apache 2.0, v0.26.0) is the leading open-source browser automation CLI built specifically for AI agents. Rust-native single binary, 112 contributors, 81 releases, 568 commits.
26
-
27
- **Supersedes**: [[browser-harness-agent]] (9.4K stars, MIT, Python), replaced in May 2026 for P30. agent-browser has roughly 3.3× as many stars (31.4K vs 9.4K), richer AI agent integration, and Rust-native performance.
28
-
29
- ## Core Design
30
-
31
- Unlike Puppeteer/Playwright (human scripting APIs) and browser-harness (raw CDP with self-healing), agent-browser provides an **agent-native interface**: snapshot-based element refs (`@e1`, `@e2`), JSON output, annotated screenshots, structured diff, and a built-in skills system. The AI agent thinks in terms of refs from snapshots — not CSS selectors, not CDP method calls.
32
-
33
- ## Key Innovations for AI Agents
34
-
35
- ### 1. Snapshot + Refs Workflow
36
- ```
37
- agent-browser snapshot -i --json
38
- → Returns: {"refs": {"e1": {"role":"button","name":"Submit"}, "e2": {"role":"textbox","name":"Email"}}}
39
- agent-browser click @e1 # deterministic, no DOM re-query
40
- agent-browser fill @e2 "text" # refs survive page changes until re-snapshot
41
- ```
42
-
43
- ### 2. Annotated Screenshots
44
- ```
45
- agent-browser screenshot --annotate
46
- → Screenshot with numbered labels [1], [2], [3] matching @e1, @e2, @e3 refs
47
- → Multimodal models can reason about visual layout + refs simultaneously
48
- ```
49
-
50
- ### 3. Structured Diff
51
- ```
52
- agent-browser diff screenshot --baseline before.png -o diff.png
53
- agent-browser diff snapshot --baseline before-snapshot.txt
54
- → Structural + visual diff for verifying UI changes
55
- ```
56
-
57
- ### 4. React Introspection
58
- ```
59
- agent-browser open --enable react-devtools <url>
60
- agent-browser react tree # full component tree
61
- agent-browser react suspense # suspense boundaries + classifier
62
- agent-browser vitals # LCP/CLS/TTFB/FCP/INP + React hydration
63
- ```
64
-
65
- ### 5. Batch Mode
66
- ```
67
- agent-browser batch "open url" "snapshot -i" "click @e1" "screenshot"
68
- → Multiple commands in single CLI invocation, reduces process startup overhead
69
- ```
70
-
71
- ### 6. Built-in Skills
72
- ```
73
- agent-browser skills get core # 420-line usage guide for agents
74
- npx skills add vercel-labs/agent-browser # install skill stub
75
- ```
76
-
77
- ## Architecture
78
-
79
- - **Rust CLI** + **Rust Daemon**: Single binary. Daemon auto-starts, persists between commands
80
- - **Client-daemon**: Fast subsequent commands (no browser restart)
81
- - **Direct CDP**: Like browser-harness — raw DevTools Protocol, no Puppeteer wrappers
82
- - **Multi-provider**: Local Chrome + 6 cloud providers (Browserless, Browserbase, Browser Use, Kernel, AgentCore, iOS)
83
-
84
- ## Integration with P30
85
-
86
- P30 Browser Subagent dispatches via P25 router for UI tasks. Harness invokes `agent-browser` CLI as a subprocess (or via batch mode for multi-step workflows). Config at `.pi/harness/browser.json`.
87
-
88
- **What we use**:
89
- - Snapshot + refs for element interaction
90
- - Annotated screenshots for visual verification
91
- - Diff for before/after comparison
92
- - Batch mode for multi-step agent workflows
93
- - `--json` for structured output parsing
94
-
95
- **What we skip**:
96
- - Dashboard (CLI harness only)
97
- - AI Chat (our agent IS the chat)
98
- - Cloud providers (local Chrome only; opt-in for serverless)
99
- - iOS Simulator (web-focused; opt-in)
@@ -1,43 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Agent-Codebase Interface (ACI)"
4
- created: 2026-04-30
5
- updated: 2026-04-30
6
- tags:
7
- - agent-architecture
8
- - codebase-exploration
9
- - interface-design
10
- related:
11
- - "[[swe-agent-aci]]"
12
- - "[[research-agent-first-codebase-exploration]]"
13
- status: developing
14
-
15
- ---
- # Agent-Codebase Interface (ACI)
16
-
17
- The design of tool interfaces specifically for AI agents — not humans — to interact with codebases. Extends the SWE-agent concept of Agent-Computer Interfaces to codebase exploration specifically.
18
-
19
- ## Core Principle
20
-
21
- Agents process information differently from humans. They have:
22
- - **Fixed context windows** (not infinite working memory)
23
- - **Token-based costs** (every byte of context has a cost)
24
- - **No visual cortex** (can't "see" code structure, need explicit representations)
25
- - **No intuition** (can't form mental models from partial exposure)
26
- - **Perfect recall within context** (but zero recall outside it)
27
-
28
- Therefore, the interface must:
29
- 1. Maximize information density per token
30
- 2. Present structured, machine-parseable representations
31
- 3. Support progressive disclosure (drill down on demand)
32
- 4. Enable autonomous navigation decisions
33
-
34
- ## Contrast with Human Interfaces
35
-
36
- | Human Interface | Agent Interface |
37
- |----------------|-----------------|
38
- | Syntax highlighting, file trees | AST symbol maps, dependency graphs |
39
- | Scroll through files | Fetch specific symbol definitions |
40
- | Visual pattern recognition | Semantic search + structured queries |
41
- | Gradual immersion ("Paper Cuts") | Bulk ingestion + ranking algorithms |
42
- | IDE debugging (step-through) | Execution feedback loops (run tests, check output) |
43
- | "Use the project" to learn | "Map the project" to learn |
@@ -1,67 +0,0 @@
1
- ---
2
- type: concept
3
- tags:
4
- - harness
5
- - architecture
6
- - context-engineering
7
- - safety
8
- related:
9
- - "[[Agentic Orchestration Pipeline]]"
10
- - "[[Context Engineering]]"
11
- - "[[Safety Defense-in-Depth]]"
12
- - "[[sources/martin-fowler-harness-engineering]]"
13
- - "[[sources/opendev-arxiv-2603.05344v1]]"
14
- ---
15
-
16
- # Agent Harness Architecture
17
-
18
- The harness is everything in an AI coding agent except the model itself: the runtime orchestration layer that wraps the reasoning loop and coordinates tool dispatch, context management, safety enforcement, and session persistence. Defined as: **Agent = Model + Harness**.
19
-
20
- ## Two-Phase Model
21
-
22
- ### Scaffolding (Pre-Runtime)
23
- Runs once before the first prompt. Assembles the agent:
24
- - System prompt compilation (conditional, priority-ordered sections)
25
- - Tool schema building (from registry, MCP discovery, subagent schemas)
26
- - Subagent registration and initialization
27
-
28
- ### Harness (Runtime)
29
- Operates continuously during execution:
30
- - Tool dispatch with safety gating
31
- - Context lifecycle management (compaction, reminders, memory)
32
- - Approval workflows (Manual/Semi-Auto/Auto)
33
- - Session persistence and undo tracking
34
-
35
- ## Feedforward + Feedback Model
36
-
37
- | Direction | Type | Examples |
38
- |-----------|------|----------|
39
- | **Feedforward (Guides)** | Steer before action | System prompts, AGENTS.md, Skills, coding conventions, architecture docs |
40
- | **Feedback (Sensors)** | Observe after action | Linters, tests, review agents, type checkers, structural analysis |
41
-
42
- Two execution modes:
43
- - **Computational**: Deterministic, fast — tests, linters, type checkers
44
- - **Inferential**: LLM-based, semantic — AI code reviews, "LLM as judge"
45
-
46
- ## The Steering Loop
47
-
48
- Human developers iterate on the harness: whenever an issue occurs repeatedly, improve feedforward guides or feedback sensors. Agents can help build harness components (write tests, generate linter rules, create documentation).
49
-
50
- ## Harness Layers (OpenDev Reference)
51
-
52
- 1. **Prompt Composition**: Conditional sections sorted by priority, provider-specific variants, ${VAR} substitution, two-part caching
53
- 2. **Context Engineering**: Staged compaction, event-driven reminders, dual-memory architecture, tool result optimization
54
- 3. **Tool System**: Registry with handler categories, lazy MCP discovery, batch execution, 9-pass fuzzy edit matching
55
- 4. **Safety System**: 5-layer defense-in-depth (prompt → schema → approval → validation → hooks)
56
- 5. **Persistence**: Session storage, operation log/undo, configuration hierarchy, provider cache
57
-
58
- ## Harness Templates
59
-
60
- For common topologies (CRUD APIs, event processors, dashboards), a harness template bundles guides + sensors as a reusable package. Teams select tech stacks partly based on available harnesses.
61
-
62
- ## Relevance to Our Harness
63
-
64
- Our current harness architecture:
65
- - **Scaffolding**: `.pi/skills/` system, agent prompt engineering, wiki as knowledge base
66
- - **Runtime**: `lean-ctx` for tool routing, `Agent` for subagent spawning, `wiki-autoresearch` for research
67
- - **Gaps**: No safety defense-in-depth, no staged compaction, no event-driven reminders, no team dispatch, no sequential chaining
@@ -1,133 +0,0 @@
1
- ---
2
- aliases: ["agent loop patterns", "stuck agent detection", "tool call loops"]
3
- type: concept
4
- title: "Agent Loop Detection Patterns"
5
- created: 2026-04-30
6
- status: developing
7
- tags:
8
- - concept
9
- - loop-detection
10
- - agent-reliability
11
- - production
12
- related:
13
- - "[[Research: Meta-Agent Context Drift Detection]]"
14
- - "[[context-drift-in-agents]]"
15
- - "[[meta-agent-context-pruning]]"
16
- - "[[langsight-loop-detection]]"
17
- - "[[ironclaw-drift-monitor]]"
18
- updated: 2026-05-02
19
-
20
- ---
- # Agent Loop Detection Patterns
21
-
22
- Production-grade detection patterns for identifying when an AI agent is stuck in a non-productive loop. Based on LangSight's production experience and ironclaw's DriftMonitor proposal.
23
-
24
- ## Three Loop Types
25
-
26
- ### 1. Direct Repetition
27
-
28
- Same tool called with identical arguments multiple times in a row. Most common pattern.
29
-
30
- **Cause**: Tool returns error or unexpected result. LLM's retry logic doesn't distinguish "transient failure, retry" from "structural failure, give up."
31
-
32
- **Real-world example**: Support agent called `crm-mcp/lookup_customer` 89 times with identical arguments. CRM returned slightly malformed response. Agent decided it needed more data, called same tool, got same malformed response, repeated. Cost: $214.
33
-
34
- **Detection**: `SHA256(tool_name + normalized_args)[:16]`. If same hash appears ≥3 times in session window, flag as loop.
35
-
36
- ### 2. Ping-Pong Between Tools
37
-
38
- Two tools called alternately without state change between calls.
39
-
40
- **Example**: Agent calls CRM → gets customer → calls Billing → gets invoices → calls CRM again with same args → calls Billing again.
41
-
42
- **Detection**: Sequence pattern matching on last 6 calls. A-B-A-B-A-B pattern triggers detection.
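The A-B-A-B-A-B check above can be sketched as follows. This is a minimal illustration, not LangSight's or ironclaw's implementation: `PingPongDetector` is a hypothetical name, and it assumes calls are compared by tool name only, so "no state change" is approximated rather than verified.

```python
from collections import deque


class PingPongDetector:
    """Flags a strict A-B-A-B... alternation over the most recent tool calls."""

    def __init__(self, window: int = 6):
        if window < 4 or window % 2:
            raise ValueError("window must be an even number >= 4")
        self.recent = deque(maxlen=window)

    def record_call(self, tool_name: str) -> bool:
        self.recent.append(tool_name)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough history yet
        calls = list(self.recent)
        a, b = calls[0], calls[1]
        if a == b:
            return False  # one tool repeated is direct repetition, not ping-pong
        # Even positions must all be A, odd positions must all be B
        return all(c == (a if i % 2 == 0 else b) for i, c in enumerate(calls))
```

Comparing argument hashes instead of bare tool names would reduce false positives for agents that legitimately alternate between two tools with different inputs.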
43
-
44
- ### 3. Retry-Without-Progress
45
-
46
- Tool call succeeds (no error) but response doesn't satisfy agent's internal goal. Agent keeps calling with minor argument variations.
47
-
48
- **Detection**: Semantic similarity of consecutive reasoning outputs >0.95 cosine across multiple steps. Computationally expensive.
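A rough sketch of the similarity check, assuming the caller already has one embedding vector per reasoning step (how those embeddings are produced is out of scope here). The names `is_stalled` and `min_steps`, and the choice of three consecutive pairs for "multiple steps", are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_stalled(embeddings: list[list[float]], threshold: float = 0.95, min_steps: int = 3) -> bool:
    """True when the last `min_steps` consecutive pairs all exceed the threshold."""
    if len(embeddings) < min_steps + 1:
        return False
    recent = embeddings[-(min_steps + 1):]
    return all(cosine_similarity(p, q) > threshold for p, q in zip(recent, recent[1:]))
```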
49
-
50
- ## Detection Approaches
51
-
52
- ### Approach 1: Argument Hash (Recommended)
53
-
54
- ```python
55
- import hashlib, json
56
- from collections import Counter
57
-
58
- def compute_call_hash(tool_name: str, args: dict) -> str:
59
-     payload = f"{tool_name}:{json.dumps(args, sort_keys=True)}"
60
-     return hashlib.sha256(payload.encode()).hexdigest()[:16]
61
-
62
- class LoopDetector:
63
-     def __init__(self, threshold: int = 3):
64
-         self.threshold = threshold
65
-         self.call_counts = Counter()
66
-
67
-     def record_call(self, tool_name: str, args: dict) -> bool:
68
-         call_hash = compute_call_hash(tool_name, args)
69
-         self.call_counts[call_hash] += 1
70
-         return self.call_counts[call_hash] >= self.threshold
71
- ```
72
-
73
- Catches >90% of real-world loops with zero false positives at threshold 3.
74
-
75
- ### Approach 2: Sliding Window Rate
76
-
77
- Count tool calls regardless of argument variation. If tool called >N times in M seconds, flag.
78
-
79
- ```python
80
- from collections import deque
81
- from datetime import datetime, timedelta, timezone
82
-
83
- class RateLoopDetector:
84
-     def __init__(self, max_calls: int = 10, window_seconds: int = 60):
85
-         self.max_calls = max_calls
86
-         self.window = timedelta(seconds=window_seconds)
87
-         self.call_times: dict[str, deque] = {}
88
-
89
-     def record_call(self, tool_name: str) -> bool:
90
-         now = datetime.now(timezone.utc)  # datetime.utcnow() is deprecated since Python 3.12
91
-         if tool_name not in self.call_times:
92
-             self.call_times[tool_name] = deque()
93
-         times = self.call_times[tool_name]
94
-         while times and now - times[0] > self.window:
95
-             times.popleft()
96
-         times.append(now)
97
-         return len(times) > self.max_calls  # flag once the tool exceeds max_calls within the window
98
- ```
99
-
100
- ### Approach 3: LLM Similarity
101
-
102
- Compare semantic similarity between consecutive reasoning outputs. Most sophisticated but computationally expensive. Usually overkill — Approaches 1+2 catch >90%.
103
-
104
- ## Intervention Strategies
105
-
106
- | Strategy | When | Risk |
107
- |----------|------|------|
108
- | **Warn + continue** | Early monitoring, unsure about thresholds | No false-termination risk, but loops continue |
109
- | **Terminate session** | Production, confident in thresholds | False termination loses partial work |
110
- | **Inject recovery** | Want agent to self-correct | Agent may ignore or loop again |
111
- | **Prune + restart** | Proposed meta-agent pattern | Pruning may remove useful context |
112
-
113
- ## Threshold Tuning
114
-
115
- - **Default**: 3 identical calls. Works for most agents.
116
- - **Polling agents**: Use time-based windows (Approach 2), not count-based.
117
- - **Retry-heavy workflows**: Increase to 5-7.
118
- - **Sub-agents**: Each sub-agent gets own detector. Parent calling same sub-agent multiple times is not a loop.
119
- - **Start with warn, switch to terminate**: Monitor for a week, then enforce.
120
-
121
- ## Always Combine With Budget Guardrails
122
-
123
- Loop detection catches known patterns. Budget guardrails catch unknown patterns:
124
- - Max cost per session ($1 default)
125
- - Max steps (25 default)
126
- - Max wall time (120s default)
127
- - Soft alert at 80% of budget
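The guardrails listed above could be combined into a single check along these lines. This is a sketch under assumptions: `BudgetGuard`, its field names, and the `"ok"`/`"warn"`/`"stop"` return values are hypothetical, and the caller is assumed to supply the cost of each step.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BudgetGuard:
    max_cost_usd: float = 1.0
    max_steps: int = 25
    max_wall_seconds: float = 120.0
    soft_alert_ratio: float = 0.8
    cost_usd: float = 0.0
    steps: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def record_step(self, cost_usd: float) -> str:
        """Record one agent step and return 'ok', 'warn', or 'stop'."""
        self.steps += 1
        self.cost_usd += cost_usd
        elapsed = time.monotonic() - self.started_at
        # The most-consumed budget decides the outcome
        usage = max(
            self.cost_usd / self.max_cost_usd,
            self.steps / self.max_steps,
            elapsed / self.max_wall_seconds,
        )
        if usage >= 1.0:
            return "stop"
        if usage >= self.soft_alert_ratio:
            return "warn"
        return "ok"
```

Taking the maximum across the three budgets means any single exhausted limit stops the session, which matches the idea that budgets catch loops the pattern detectors miss.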
128
-
129
- ## See Also
130
-
131
- - [[meta-agent-context-pruning]] — Extends detection with pruning + restart
132
- - [[langsight-loop-detection]] — Source: production deployment guide
133
- - [[ironclaw-drift-monitor]] — Source: 5-rule DriftMonitor proposal
@@ -1,126 +0,0 @@
1
- ---
2
- type: concept
3
- status: developing
4
- created: 2026-04-30
5
- updated: 2026-04-30
6
- tags:
7
- - agentic-harness
8
- - tool-enforcement
9
- - semantic-search
10
- - mcp
11
- related:
12
- - "[[ck-tool]]"
13
- - "[[mcp-tool-routing]]"
14
- - "[[agentic-harness-context-enforcement]]"
15
- - "[[Research: semantic code search tools]]"
16
- title: "agent search enforcement"
17
-
18
- ---
- # agent search enforcement
19
-
20
- Strategies to force AI coding agents to use semantic code search tools (ck, vgrep) instead of raw `grep`, `cat`, and pipe commands.
21
-
22
- ## Problem
23
-
24
- AI coding agents default to shell tools: `grep -r "pattern" .`, `cat file | grep foo`, `find . -name "*.py" | xargs grep bar`. These are:
25
- - **Lexical-only**: Miss conceptual matches, require exact keyword knowledge
26
- - **Noisy**: Return too many or too few results
27
- - **Token-inefficient**: Raw grep output wastes context window on irrelevant matches
28
- - **Non-indexed**: Every query scans the entire codebase (slow on large repos)
29
-
30
- Semantic tools (ck --sem) solve these problems but agents don't use them by default because they're not native tools.
31
-
32
- ## Enforcement Strategies
33
-
34
- ### 1. System Prompt Rules (Weak)
35
-
36
- Add to agent system prompt / CLAUDE.md:
37
- ```markdown
38
- ## Search Policy
39
- - NEVER use raw `grep` for codebase exploration.
40
- - ALWAYS use `ck --sem` or `ck --hybrid` for conceptual searches.
41
- - `grep` is permitted ONLY for exact literal string matching (e.g., finding a specific error message).
42
- - Before any grep, consider: "Can I express this as a ck query?"
43
- ```
44
-
45
- **Effectiveness**: Low-Medium. Depends on model compliance. Claude 4 Opus follows rules well; smaller models may ignore. Costs zero infrastructure.
46
-
47
- ### 2. MCP Tool Registration (Medium)
48
-
49
- Register ck as an MCP tool:
50
- ```bash
51
- claude mcp add ck-search -s user -- ck --serve
52
- ```
53
-
54
- The agent sees `ck_search`, `ck_get`, `ck_info`, `ck_reindex` as first-class tools alongside `bash` and `read`. If the prompt emphasizes preferring MCP tools, the agent may route code searches through ck.
55
-
56
- **Effectiveness**: Medium. Agent still has `bash` available. Needs prompt reinforcement. Best when combined with Strategy 1.
57
-
58
- ### 3. Shell Wrapper Interception (Medium-Strong)
59
-
60
- Create a wrapper script that intercepts grep and routes semantic-looking queries to ck:
61
-
62
- ```bash
63
- #!/bin/bash
64
- # ~/bin/grep (wrapper for agent's PATH)
65
-
66
- # Route to ck if query looks conceptual (multi-word, no obvious regex)
67
- if [[ "$*" =~ [[:space:]] ]] && [[ ! "$*" =~ [\^\$\.\*\[\]\\] ]]; then
68
-   if command -v ck &>/dev/null; then
69
-     ck --hybrid "$@" 2>/dev/null && exit 0  # no exec here: exec would replace the shell, so the grep fallback could never run
70
-   fi
71
- fi
72
- exec /usr/bin/grep "$@"
73
- ```
74
-
75
- Place this in the agent's PATH before `/usr/bin`.
76
-
77
- **Risks**:
78
- - False positives: `grep "TODO: fix this"` gets intercepted but should be lexical
79
- - Breaks scripts that parse grep output format
80
- - Adding `--hybrid` changes output format (score fields, different line format)
81
- - Hard to distinguish "the agent wants grep" from "the agent typed something that looks semantic"
82
-
83
- **Mitigation**: Only wrap for known agent users, not system-wide. Use an explicit env var: `CK_ENFORCE=1 grep ...`
84
-
85
- ### 4. Harness-Level Tool Routing (Strong)
86
-
87
- Modify the agent harness (e.g., lean-ctx bash tool) to inspect every bash command before execution:
88
-
89
- ```python
90
- import re
91
-
92
- def pre_exec_hook(command: str) -> str:
93
-     """Reroute grep calls with conceptual (multi-word, non-regex) patterns to ck."""
94
-     # Accept bare or absolute-path grep, skip short flags, then capture pattern and path
95
-     match = re.match(r'^(?:/usr/bin/|/bin/)?grep\s+(?:-[a-zA-Z]+\s+)*["\']?([^"\']+)["\']?\s+(.*)', command)
96
-     if match:
97
-         pattern, path = match.groups()
98
-         if ' ' in pattern and not re.search(r'[\^\$\.\*\[\]\\]', pattern):
99
-             return f'ck --hybrid "{pattern}" {path}'
100
-     return command  # pass through unchanged
101
- ```
102
-
103
- **Effectiveness**: Strong. Catches all grep invocations. Can log/report non-compliance. Requires modifying harness code.
104
-
105
- ### 5. Post-Hoc Validation (Weak)
106
-
107
- A checker that scans agent action logs and flags grep usage. Reactive — doesn't prevent the bad behavior, only reports it.
108
-
109
- ```bash
110
- # Check agent logs for grep usage
111
- grep -c '"command": "grep' agent-session.log
112
- ```
113
-
114
- ## Recommended Approach
115
-
116
- **Three-layer defense for the ultimate-pi harness:**
117
-
118
- 1. **Layer 1 (immediate)**: System prompt rules in AGENTS.md + install ck + register MCP
119
- 2. **Layer 2 (medium-term)**: Add pre-exec hook to lean-ctx bash tool that warns/logs grep usage and suggests ck
120
- 3. **Layer 3 (optional)**: Shell wrapper for known agent sessions with `CK_ENFORCE` env var
121
-
122
- ## Open Questions
123
-
124
- - [ ] How does Claude Code's native `Grep` tool interact with custom MCP tools? Does it prefer its own?
125
- - [ ] Can MCP tools be marked as "preferred" or given higher priority?
126
- - [ ] What's the false-positive rate of shell interception on real-world agent queries?
@@ -1,74 +0,0 @@
1
- ---
2
- type: concept
3
- status: developing
4
- created: 2026-05-05
5
- tags:
6
- - agent-skills
7
- - ecosystem
8
- - open-standard
9
- - progressive-disclosure
10
- related:
11
- - "[[superpowers-methodology]]"
12
- - "[[agent-skills-pattern]]"
13
- - "[[skill-first-architecture]]"
14
- - "[[policy-engine-pattern]]"
15
- ---
16
-
17
- # Agent Skills Ecosystem
18
-
19
- ## Definition
20
-
21
- The Agent Skills ecosystem is the open-standard marketplace and format for packaging reusable AI agent expertise as SKILL.md files. Originally developed by Anthropic, released as an open standard in October 2025, and adopted by all major agent platforms within weeks. As of May 2026: 490K+ skills across multiple marketplaces.
22
-
23
- ## The SKILL.md Open Standard
24
-
25
- Every skill is a directory containing a `SKILL.md` file with:
26
- - **YAML frontmatter**: `name` (lowercase-hyphenated, ≤64 chars), `description` (≤1024 chars — the trigger), optional `allowed-tools`, `metadata`, `license`
27
- - **Markdown instructions**: What the agent should do when the skill activates
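The frontmatter limits above lend themselves to a small validation step. The sketch below assumes the YAML has already been parsed into a dict; `validate_skill_frontmatter` is a hypothetical helper, and the regex is one assumed reading of "lowercase-hyphenated", not the standard's official grammar.

```python
import re

# Lowercase alphanumeric words joined by single hyphens (assumed interpretation)
_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_skill_frontmatter(frontmatter: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields pass."""
    problems = []
    name = frontmatter.get("name", "")
    description = frontmatter.get("description", "")
    if not isinstance(name, str) or not _NAME_RE.fullmatch(name):
        problems.append("name must be lowercase words joined by hyphens")
    elif len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not isinstance(description, str) or not description.strip():
        problems.append("description is required")
    elif len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    return problems
```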
28
-
29
- Progressive disclosure architecture:
30
- 1. **Discovery** (always loaded): Name + description only (~100 tokens per skill)
31
- 2. **Activation** (on-demand): Full SKILL.md body loaded when task matches description
32
- 3. **Execution** (on-demand): Scripts, reference files, templates loaded as needed
33
-
34
- ## Marketplaces
35
-
36
- | Marketplace | Skills | Key Differentiator |
37
- |-------------|--------|-------------------|
38
- | **Skills.sh** (Vercel) | 83K+ | Curated quality, CLI-native install, Snyk security scanning, leaderboard |
39
- | **SkillsMP** | 400K+ | Volume leader, GitHub crawl, AI-powered semantic search |
40
- | **ClawHub** (OpenClaw) | ~10K+ | Open platform, hit by ClawHavoc malware campaign |
41
-
42
- ## Installation
43
-
44
- Universal: `npx skills add owner/repo`
45
-
46
- Per-agent paths:
47
- - Claude Code: `.claude/skills/` (project) or `~/.claude/skills/` (personal)
48
- - Codex CLI: `.agents/skills/` or `.codex/skills/`
49
- - Cursor: `.cursor/skills/`
50
- - Gemini CLI: `.gemini/skills/`
51
- - GitHub Copilot: `.github/skills/`
52
- - Windsurf: `.windsurf/skills/`
53
-
54
- ## Two Skill Types
55
-
56
- 1. **Capability Uplift** — Gives agent abilities it doesn't have. Before the skill, agent can't do the task. Examples: Firecrawl (web scraping), Document Skills (PDF/DOCX creation), Webapp Testing (Playwright).
57
-
58
- 2. **Encoded Preference** — Agent already knows how, but the skill encodes your team's specific way. Examples: Code review checklists, commit message formats, API conventions.
59
-
60
- ## Security Risks
61
-
62
- Snyk's ToxicSkills study (Feb 2026) scanned 3,984 skills:
63
- - 36.8% had at least one security flaw
64
- - 13.4% contained critical-level issues
65
- - 76 skills were confirmed malicious payloads
66
- - 91% of malicious skills combined prompt injection with traditional malware
67
-
68
- The ClawHavoc campaign (Jan-Feb 2026): 341 malicious skills on ClawHub distributing Atomic macOS Stealer.
69
-
70
- ## Ecosystem Trajectory
71
- Zero to 490K skills in six months (Oct 2025 – Mar 2026). All major platforms adopted within weeks. The format's simplicity (anyone who can write Markdown can create a skill) drove adoption. Network effects accelerating: more skills → more agent users → more skill authors.
72
-
73
- ## Relevance to Harness
74
- Our `.pi/skills/` system uses the same progressive disclosure pattern. The Agent Skills ecosystem validates that markdown-based skills are the right primitive — and that cross-agent portability is the winning strategy. We should consider SKILL.md compatibility for maximum reuse of the 490K+ ecosystem.
@@ -1,68 +0,0 @@
1
- ---
2
- type: concept
3
- title: "Agent Skills Pattern (Progressive Disclosure)"
4
- created: 2026-05-01
5
- updated: 2026-05-01
6
- status: developing
7
- tags:
8
- - harness
9
- - skills
10
- - context-engineering
11
- - gemini-cli
12
- related:
13
- - "[[harness-engineering-first-principles]]"
14
- - "[[gemini-cli-architecture]]"
15
- sources:
16
- - "[[Source: Gemini CLI Changelogs]]"
17
- - "[[Source: LangChain - Anatomy of Agent Harness]]"
18
-
19
- ---
- # Agent Skills Pattern: Progressive Disclosure
20
-
21
- ## What It Is
22
-
23
- Agent Skills is a harness-level primitive for **progressive disclosure**: skills are loaded on-demand via an activation mechanism rather than all at context start. This prevents context rot — the observed degradation in model performance as the context window fills with irrelevant tool definitions and instructions.
24
-
25
- ## Why It Matters
26
-
27
- Too many tools or MCP servers loaded into context on agent start degrades performance _before_ the agent can start working. Skills solve this by loading only when needed:
28
-
29
- 1. Agent starts with minimal context (core tools + system prompt)
30
- 2. Agent analyzes task, determines which skills are relevant
31
- 3. Agent calls `activate_skill` tool to load specific skill's instructions + tools
32
- 4. Skill's context injected into current conversation
33
- 5. Agent uses skill, then moves on (skill context may persist or be compacted)
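The five steps above can be sketched as a small in-memory registry. This is an illustration only, not Gemini CLI's implementation: `Skill`, `SkillRegistry`, and `discovery_prompt` are hypothetical names, and only the `activate_skill` tool name comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # always visible to the model (discovery)
    body: str         # loaded only on activation


class SkillRegistry:
    def __init__(self, skills: list[Skill]):
        self._skills = {s.name: s for s in skills}
        self._active: set[str] = set()

    def discovery_prompt(self) -> str:
        """Cheap index injected at context start: names and descriptions only."""
        return "\n".join(f"- {s.name}: {s.description}" for s in self._skills.values())

    def activate_skill(self, name: str) -> str:
        """Tool handler: return the full skill body to inject into the conversation."""
        skill = self._skills.get(name)
        if skill is None:
            return f"Unknown skill: {name}"
        if name in self._active:
            return f"Skill {name} is already active."
        self._active.add(name)
        return skill.body
```

Tracking active skills avoids injecting the same body twice, which keeps the context lean if the model calls the tool again for a skill it already loaded.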
34
-
35
- ## Gemini CLI Implementation (v0.23+)
36
-
37
- - **v0.23 (Jan 2026)**: Experimental Agent Skills support via agentskills.io
38
- - **v0.24**: Built-in agent skills, `/skills install/uninstall`, `/agents refresh`
39
- - **v0.25**: `activate_skill` tool formalized, `pr-creator` skill, skills enabled by default
40
- - **v0.26**: `skill-creator` meta-skill (skills that create skills)
41
- - **v0.30**: SDK package enabling custom skills with dynamic system instructions
42
- - **v0.39**: `/memory inbox` for reviewing and patching skills extracted during sessions
43
-
44
- ## Key Design Decisions
45
-
46
- 1. **Frontmatter metadata**: Each skill has structured metadata describing when to activate
47
- 2. **Activation tool**: Model decides when to call `activate_skill` based on task analysis
48
- 3. **Skill inbox**: Extracted skills don't auto-install — human reviews first via `/memory inbox`
49
- 4. **Skill-creator**: Meta-skill enables agent to create new skills from observed patterns
50
-
51
- ## Ultimate-PI Current State
52
-
53
- We have a `.pi/skills/` directory with 16+ skills, but all of them load at context start (no progressive disclosure). This follows the "delivery mechanism for context engineering" pattern but lacks the activation mechanism that prevents context rot.
54
-
55
- ## Integration Path (P-F2)
56
-
57
- 1. Add frontmatter to each skill: `activation_triggers`, `required_capabilities`, `token_budget`
58
- 2. Add `activate_skill` tool to tool registry
59
- 3. Implement skill registry that loads skills on-demand
60
- 4. Add `/memory inbox` for reviewing AI-extracted patterns before they become permanent skills
61
- 5. Implement skill-creator meta-skill for autonomous skill generation from observed failures
62
-
63
- ## Relationship to Other Harness Primitives
64
-
65
- - **Context Compression**: Skills reduce the _need_ for compression by keeping context lean
66
- - **Subagents**: Skills can be loaded into subagents independently, each with relevant context
67
- - **Policy Engine**: Skill activation can be gated by policy (e.g., "never activate browser skill on production")
68
- - **Memory Systems**: Skills extracted from sessions feed into persistent memory (wiki in our case)