ultimate-pi 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (516) hide show
  1. package/.agents/skills/ck-search/SKILL.md +99 -0
  2. package/.agents/skills/defuddle/SKILL.md +90 -0
  3. package/.agents/skills/find-skills/SKILL.md +142 -0
  4. package/.agents/skills/firecrawl/SKILL.md +150 -0
  5. package/.agents/skills/firecrawl/rules/install.md +82 -0
  6. package/.agents/skills/firecrawl/rules/security.md +26 -0
  7. package/.agents/skills/firecrawl-agent/SKILL.md +57 -0
  8. package/.agents/skills/firecrawl-build-interact/SKILL.md +67 -0
  9. package/.agents/skills/firecrawl-build-onboarding/SKILL.md +102 -0
  10. package/.agents/skills/firecrawl-build-onboarding/references/auth-flow.md +39 -0
  11. package/.agents/skills/firecrawl-build-onboarding/references/project-setup.md +20 -0
  12. package/.agents/skills/firecrawl-build-onboarding/references/sdk-installation.md +17 -0
  13. package/.agents/skills/firecrawl-build-scrape/SKILL.md +68 -0
  14. package/.agents/skills/firecrawl-build-search/SKILL.md +68 -0
  15. package/.agents/skills/firecrawl-crawl/SKILL.md +58 -0
  16. package/.agents/skills/firecrawl-download/SKILL.md +69 -0
  17. package/.agents/skills/firecrawl-interact/SKILL.md +83 -0
  18. package/.agents/skills/firecrawl-map/SKILL.md +50 -0
  19. package/.agents/skills/firecrawl-parse/SKILL.md +61 -0
  20. package/.agents/skills/firecrawl-scrape/SKILL.md +68 -0
  21. package/.agents/skills/firecrawl-search/SKILL.md +59 -0
  22. package/.agents/skills/obsidian-bases/SKILL.md +299 -0
  23. package/.agents/skills/obsidian-markdown/SKILL.md +237 -0
  24. package/.agents/skills/posthog-analyst/SKILL.md +306 -0
  25. package/.agents/skills/posthog-analyst/evals/evals.json +23 -0
  26. package/.agents/skills/wiki/SKILL.md +215 -0
  27. package/.agents/skills/wiki/references/css-snippets.md +122 -0
  28. package/.agents/skills/wiki/references/frontmatter.md +107 -0
  29. package/.agents/skills/wiki/references/git-setup.md +58 -0
  30. package/.agents/skills/wiki/references/mcp-setup.md +149 -0
  31. package/.agents/skills/wiki/references/modes.md +259 -0
  32. package/.agents/skills/wiki/references/plugins.md +96 -0
  33. package/.agents/skills/wiki/references/rest-api.md +124 -0
  34. package/.agents/skills/wiki-autoresearch/SKILL.md +211 -0
  35. package/.agents/skills/wiki-autoresearch/references/program.md +75 -0
  36. package/.agents/skills/wiki-fold/SKILL.md +204 -0
  37. package/.agents/skills/wiki-fold/references/fold-template.md +133 -0
  38. package/.agents/skills/wiki-ingest/SKILL.md +288 -0
  39. package/.agents/skills/wiki-lint/SKILL.md +183 -0
  40. package/.agents/skills/wiki-query/SKILL.md +176 -0
  41. package/.agents/skills/wiki-save/SKILL.md +128 -0
  42. package/.ckignore +41 -0
  43. package/.env.example +9 -0
  44. package/.github/workflows/lint.yml +33 -0
  45. package/.github/workflows/publish-github-packages.yml +35 -0
  46. package/.github/workflows/publish-npm.yml +1 -1
  47. package/.pi/SYSTEM.md +107 -40
  48. package/.pi/agents/pi-pi/agent-expert.md +205 -0
  49. package/.pi/agents/pi-pi/cli-expert.md +47 -0
  50. package/.pi/agents/pi-pi/config-expert.md +67 -0
  51. package/.pi/agents/pi-pi/ext-expert.md +53 -0
  52. package/.pi/agents/pi-pi/keybinding-expert.md +123 -0
  53. package/.pi/agents/pi-pi/pi-orchestrator.md +103 -0
  54. package/.pi/agents/pi-pi/prompt-expert.md +83 -0
  55. package/.pi/agents/pi-pi/skill-expert.md +52 -0
  56. package/.pi/agents/pi-pi/theme-expert.md +46 -0
  57. package/.pi/agents/pi-pi/tui-expert.md +100 -0
  58. package/.pi/agents/rethink.md +140 -0
  59. package/.pi/agents/wiki-ingest.md +67 -0
  60. package/.pi/agents/wiki-lint.md +75 -0
  61. package/.pi/auto-commit.json +20 -0
  62. package/.pi/extensions/banner.png +0 -0
  63. package/.pi/extensions/ck-enforce.ts +216 -0
  64. package/.pi/extensions/custom-footer.ts +308 -0
  65. package/.pi/extensions/custom-header.ts +116 -0
  66. package/.pi/extensions/dotenv-loader.ts +170 -0
  67. package/.pi/internal/cursor-sdk-transcript-parser.ts +59 -0
  68. package/.pi/model-router.json +95 -0
  69. package/.pi/npm/.gitignore +2 -0
  70. package/.pi/prompts/git-sync.md +124 -0
  71. package/.pi/prompts/harness-setup.md +509 -0
  72. package/.pi/prompts/save.md +16 -0
  73. package/.pi/prompts/wiki-autoresearch.md +19 -0
  74. package/.pi/prompts/wiki.md +23 -0
  75. package/.pi/providers/cursor-sdk-provider.test.mjs +476 -0
  76. package/.pi/providers/cursor-sdk-provider.ts +1085 -0
  77. package/.pi/settings.json +14 -4
  78. package/.pi/skills/agent-router/SKILL.md +174 -0
  79. package/.pi/sounds/alert/1-kaching-track.mp3 +0 -0
  80. package/.pi/sounds/error/1-ksi-wth-track.mp3 +0 -0
  81. package/.pi/sounds/error/2-smash-track.mp3 +0 -0
  82. package/.pi/sounds/error/3-buzzer-track.mp3 +0 -0
  83. package/.pi/sounds/notification/1-soft-notification-track.mp3 +0 -0
  84. package/.pi/sounds/project-sounds.json +25 -0
  85. package/.pi/sounds/reminder/1-soft-notification-track.mp3 +0 -0
  86. package/.pi/sounds/success/1-tada-track.mp3 +0 -0
  87. package/.pi/sounds/success/2-jobs-done-track.mp3 +0 -0
  88. package/.pi/sounds/success/3-yay-track.mp3 +0 -0
  89. package/CONTRIBUTING.md +116 -0
  90. package/README.md +32 -39
  91. package/biome.json +34 -0
  92. package/firecrawl/.env.template +58 -0
  93. package/firecrawl/README.md +49 -0
  94. package/firecrawl/docker-compose.yaml +201 -0
  95. package/firecrawl/searxng/searxng.env +3 -0
  96. package/firecrawl/searxng/settings.yml +85 -0
  97. package/lefthook.yml +8 -0
  98. package/package.json +55 -24
  99. package/vault/AGENTS.md +37 -0
  100. package/vault/wiki/_templates/comparison.md +39 -0
  101. package/vault/wiki/_templates/concept.md +40 -0
  102. package/vault/wiki/_templates/decision.md +21 -0
  103. package/vault/wiki/_templates/entity.md +32 -0
  104. package/vault/wiki/_templates/flow.md +14 -0
  105. package/vault/wiki/_templates/module.md +18 -0
  106. package/vault/wiki/_templates/question.md +31 -0
  107. package/vault/wiki/_templates/source.md +39 -0
  108. package/vault/wiki/concepts/AST-Aware Code Chunking.md +44 -0
  109. package/vault/wiki/concepts/Build-Time Prompt Compilation.md +107 -0
  110. package/vault/wiki/concepts/Context Engine (AI Coding).md +47 -0
  111. package/vault/wiki/concepts/Context-Aware System Reminders.md +61 -0
  112. package/vault/wiki/concepts/Contextualized Text Embedding.md +42 -0
  113. package/vault/wiki/concepts/Contractor vs Employee AI Model.md +55 -0
  114. package/vault/wiki/concepts/Dual-Model Agent Architecture.md +65 -0
  115. package/vault/wiki/concepts/Late Chunking vs Early Chunking.md +43 -0
  116. package/vault/wiki/concepts/Majority Vote Ensembling.md +68 -0
  117. package/vault/wiki/concepts/Meta-Harness.md +16 -0
  118. package/vault/wiki/concepts/Multi-Agent AI Coding Architecture.md +75 -0
  119. package/vault/wiki/concepts/Prompt Enhancement.md +90 -0
  120. package/vault/wiki/concepts/Prompt Renderer.md +89 -0
  121. package/vault/wiki/concepts/Semantic Codebase Indexing.md +67 -0
  122. package/vault/wiki/concepts/additive-config-hierarchy.md +16 -0
  123. package/vault/wiki/concepts/agent-artifacts-verifiable-deliverables.md +71 -0
  124. package/vault/wiki/concepts/agent-browser-browser-automation.md +99 -0
  125. package/vault/wiki/concepts/agent-codebase-interface.md +43 -0
  126. package/vault/wiki/concepts/agent-harness-architecture.md +67 -0
  127. package/vault/wiki/concepts/agent-loop-detection-patterns.md +133 -0
  128. package/vault/wiki/concepts/agent-search-enforcement.md +126 -0
  129. package/vault/wiki/concepts/agent-skills-ecosystem.md +74 -0
  130. package/vault/wiki/concepts/agent-skills-pattern.md +68 -0
  131. package/vault/wiki/concepts/agentic-harness-context-enforcement.md +91 -0
  132. package/vault/wiki/concepts/agentic-harness.md +34 -0
  133. package/vault/wiki/concepts/agentic-orchestration-pipeline.md +56 -0
  134. package/vault/wiki/concepts/agentic-search-no-embeddings.md +18 -0
  135. package/vault/wiki/concepts/anthropic-context-engineering.md +13 -0
  136. package/vault/wiki/concepts/antigravity-agent-first-architecture.md +61 -0
  137. package/vault/wiki/concepts/ast-compression.md +19 -0
  138. package/vault/wiki/concepts/ast-truncation.md +66 -0
  139. package/vault/wiki/concepts/barrel-files.md +37 -0
  140. package/vault/wiki/concepts/browser-harness-agent.md +41 -0
  141. package/vault/wiki/concepts/browser-subagent-visual-verification.md +82 -0
  142. package/vault/wiki/concepts/codebase-intelligence-ecosystem-comparison.md +192 -0
  143. package/vault/wiki/concepts/codebase-intelligence-harness-integration.md +161 -0
  144. package/vault/wiki/concepts/codebase-to-context-ingestion.md +46 -0
  145. package/vault/wiki/concepts/codex-harness-innovations.md +147 -0
  146. package/vault/wiki/concepts/consensus-debate-flow.md +17 -0
  147. package/vault/wiki/concepts/consensus-debate.md +206 -0
  148. package/vault/wiki/concepts/content-addressed-spec-identity.md +166 -0
  149. package/vault/wiki/concepts/context-anxiety.md +57 -0
  150. package/vault/wiki/concepts/context-compression-techniques.md +19 -0
  151. package/vault/wiki/concepts/context-continuity.md +22 -0
  152. package/vault/wiki/concepts/context-drift-in-agents.md +106 -0
  153. package/vault/wiki/concepts/context-engineering.md +62 -0
  154. package/vault/wiki/concepts/context-folding.md +67 -0
  155. package/vault/wiki/concepts/context-mode.md +38 -0
  156. package/vault/wiki/concepts/cursor-harness-innovations.md +107 -0
  157. package/vault/wiki/concepts/deterministic-session-compaction.md +79 -0
  158. package/vault/wiki/concepts/drift-detection-unified.md +296 -0
  159. package/vault/wiki/concepts/execution-feedback-loop.md +46 -0
  160. package/vault/wiki/concepts/feedforward-feedback-harness.md +60 -0
  161. package/vault/wiki/concepts/five-root-cause-metrics-sentrux.md +40 -0
  162. package/vault/wiki/concepts/fork-safe-spec-storage.md +89 -0
  163. package/vault/wiki/concepts/fts5-sandbox.md +19 -0
  164. package/vault/wiki/concepts/fuzzy-edit-matching.md +71 -0
  165. package/vault/wiki/concepts/gemini-cli-architecture.md +104 -0
  166. package/vault/wiki/concepts/generator-evaluator-architecture.md +64 -0
  167. package/vault/wiki/concepts/guardian-agent-pattern.md +67 -0
  168. package/vault/wiki/concepts/harness-configuration-layers.md +89 -0
  169. package/vault/wiki/concepts/harness-control-frameworks.md +155 -0
  170. package/vault/wiki/concepts/harness-engineering-first-principles.md +90 -0
  171. package/vault/wiki/concepts/harness-h-formalism.md +53 -0
  172. package/vault/wiki/concepts/hybrid-code-search.md +61 -0
  173. package/vault/wiki/concepts/inline-post-edit-validation.md +112 -0
  174. package/vault/wiki/concepts/legendary-engineering-patterns-harness.md +110 -0
  175. package/vault/wiki/concepts/lifecycle-hooks.md +94 -0
  176. package/vault/wiki/concepts/mcp-tool-routing.md +102 -0
  177. package/vault/wiki/concepts/memory-system-of-record-vs-ephemeral-cache.md +47 -0
  178. package/vault/wiki/concepts/meta-agent-context-pruning.md +151 -0
  179. package/vault/wiki/concepts/model-adaptive-harness.md +122 -0
  180. package/vault/wiki/concepts/model-routing-agents.md +101 -0
  181. package/vault/wiki/concepts/monorepo-architecture.md +45 -0
  182. package/vault/wiki/concepts/multi-agent-specialization.md +61 -0
  183. package/vault/wiki/concepts/permission-subsystem.md +16 -0
  184. package/vault/wiki/concepts/pi-messenger-analysis.md +243 -0
  185. package/vault/wiki/concepts/pi-vscode-extension-landscape.md +37 -0
  186. package/vault/wiki/concepts/policy-engine-pattern.md +78 -0
  187. package/vault/wiki/concepts/progressive-disclosure-agents.md +53 -0
  188. package/vault/wiki/concepts/progressive-skill-disclosure.md +17 -0
  189. package/vault/wiki/concepts/provider-native-prompting.md +203 -0
  190. package/vault/wiki/concepts/quality-signal-sentrux.md +37 -0
  191. package/vault/wiki/concepts/repo-map-ranking.md +42 -0
  192. package/vault/wiki/concepts/result-monad-error-handling.md +47 -0
  193. package/vault/wiki/concepts/safety-defense-in-depth.md +83 -0
  194. package/vault/wiki/concepts/sandbox-os-enforcement.md +18 -0
  195. package/vault/wiki/concepts/selective-debate-routing.md +70 -0
  196. package/vault/wiki/concepts/self-evolving-harness.md +60 -0
  197. package/vault/wiki/concepts/sentrux-mcp-integration.md +36 -0
  198. package/vault/wiki/concepts/sentrux-rules-engine.md +49 -0
  199. package/vault/wiki/concepts/shell-pattern-compression.md +24 -0
  200. package/vault/wiki/concepts/skill-first-architecture.md +166 -0
  201. package/vault/wiki/concepts/structured-compaction.md +78 -0
  202. package/vault/wiki/concepts/subagent-orchestration.md +17 -0
  203. package/vault/wiki/concepts/subagent-worktree-isolation.md +68 -0
  204. package/vault/wiki/concepts/superpowers-methodology.md +78 -0
  205. package/vault/wiki/concepts/think-in-code.md +73 -0
  206. package/vault/wiki/concepts/ts-execution-layer.md +100 -0
  207. package/vault/wiki/concepts/typescript-strict-mode.md +37 -0
  208. package/vault/wiki/concepts/vcc-conversation-compaction-for-pi.md +51 -0
  209. package/vault/wiki/concepts/verification-drift-detection.md +19 -0
  210. package/vault/wiki/consensus/consensus-records.md +58 -0
  211. package/vault/wiki/decisions/2026-04-30-pi-lean-ctx-native.md +122 -0
  212. package/vault/wiki/decisions/adr-008.md +40 -0
  213. package/vault/wiki/decisions/adr-009.md +46 -0
  214. package/vault/wiki/decisions/adr-010.md +55 -0
  215. package/vault/wiki/decisions/adr-011.md +165 -0
  216. package/vault/wiki/decisions/adr-012.md +102 -0
  217. package/vault/wiki/decisions/adr-013.md +59 -0
  218. package/vault/wiki/decisions/adr-014.md +73 -0
  219. package/vault/wiki/decisions/adr-015.md +81 -0
  220. package/vault/wiki/decisions/adr-016.md +91 -0
  221. package/vault/wiki/decisions/adr-017.md +79 -0
  222. package/vault/wiki/decisions/adr-018.md +100 -0
  223. package/vault/wiki/decisions/adr-019.md +75 -0
  224. package/vault/wiki/decisions/adr-020.md +106 -0
  225. package/vault/wiki/decisions/adr-021.md +86 -0
  226. package/vault/wiki/decisions/adr-022.md +113 -0
  227. package/vault/wiki/decisions/adr-023.md +113 -0
  228. package/vault/wiki/decisions/adr-024.md +73 -0
  229. package/vault/wiki/decisions/adr-025.md +130 -0
  230. package/vault/wiki/decisions/adr-026.md +56 -0
  231. package/vault/wiki/decisions/colocate-wiki.md +34 -0
  232. package/vault/wiki/entities/Anders Hejlsberg.md +29 -0
  233. package/vault/wiki/entities/Anthropic.md +17 -0
  234. package/vault/wiki/entities/Augment Code.md +49 -0
  235. package/vault/wiki/entities/Bjarne Stroustrup.md +26 -0
  236. package/vault/wiki/entities/Bolt.new (StackBlitz).md +39 -0
  237. package/vault/wiki/entities/Boris Cherny.md +11 -0
  238. package/vault/wiki/entities/Claude Code.md +19 -0
  239. package/vault/wiki/entities/Dennis Ritchie.md +26 -0
  240. package/vault/wiki/entities/Emergent Labs.md +32 -0
  241. package/vault/wiki/entities/Google Cloud.md +16 -0
  242. package/vault/wiki/entities/Guido van Rossum.md +28 -0
  243. package/vault/wiki/entities/Ken Thompson.md +28 -0
  244. package/vault/wiki/entities/Lee et al.md +16 -0
  245. package/vault/wiki/entities/Linus Torvalds.md +28 -0
  246. package/vault/wiki/entities/Lovable (company).md +40 -0
  247. package/vault/wiki/entities/Martin Fowler.md +16 -0
  248. package/vault/wiki/entities/Meng et al.md +16 -0
  249. package/vault/wiki/entities/OpenAI.md +16 -0
  250. package/vault/wiki/entities/Rocket.new.md +38 -0
  251. package/vault/wiki/entities/VILA-Lab.md +15 -0
  252. package/vault/wiki/entities/autodev-codebase.md +18 -0
  253. package/vault/wiki/entities/ck-tool.md +59 -0
  254. package/vault/wiki/entities/codesearch.md +18 -0
  255. package/vault/wiki/entities/disler-indydevdan.md +33 -0
  256. package/vault/wiki/entities/gsd-get-shit-done.md +56 -0
  257. package/vault/wiki/entities/javascript-runtimes.md +48 -0
  258. package/vault/wiki/entities/jesse-vincent.md +38 -0
  259. package/vault/wiki/entities/lean-ctx.md +32 -0
  260. package/vault/wiki/entities/opendev.md +41 -0
  261. package/vault/wiki/entities/ops-codegraph-tool.md +18 -0
  262. package/vault/wiki/entities/pi-coding-agent.md +53 -0
  263. package/vault/wiki/entities/sentrux.md +54 -0
  264. package/vault/wiki/entities/vgrep-tool.md +57 -0
  265. package/vault/wiki/entities/vitest.md +41 -0
  266. package/vault/wiki/flows/harness-wiki-pipeline.md +204 -0
  267. package/vault/wiki/hot.md +932 -0
  268. package/vault/wiki/index.md +437 -0
  269. package/vault/wiki/log.md +418 -0
  270. package/vault/wiki/meta/dashboard.md +30 -0
  271. package/vault/wiki/meta/lint-report-2026-04-30.md +86 -0
  272. package/vault/wiki/meta/lint-report-2026-05-02.md +251 -0
  273. package/vault/wiki/meta/overview.canvas +43 -0
  274. package/vault/wiki/modules/adversarial-verification.md +57 -0
  275. package/vault/wiki/modules/automated-observability.md +54 -0
  276. package/vault/wiki/modules/bench.md +20 -0
  277. package/vault/wiki/modules/extensions.md +23 -0
  278. package/vault/wiki/modules/grounding-checkpoints.md +62 -0
  279. package/vault/wiki/modules/harness-implementation-plan.md +345 -0
  280. package/vault/wiki/modules/harness-wiki-skill-mapping.md +135 -0
  281. package/vault/wiki/modules/harness.md +86 -0
  282. package/vault/wiki/modules/persistent-memory.md +85 -0
  283. package/vault/wiki/modules/schema-orchestration.md +68 -0
  284. package/vault/wiki/modules/skills.md +27 -0
  285. package/vault/wiki/modules/spec-hardening.md +58 -0
  286. package/vault/wiki/modules/structured-planning.md +53 -0
  287. package/vault/wiki/modules/think-in-code-enforcement.md +153 -0
  288. package/vault/wiki/modules/wiki-query-interface.md +64 -0
  289. package/vault/wiki/overview.md +51 -0
  290. package/vault/wiki/questions/Research-pi-vs-claude-code-agentic-orchestration-pipeline.md +87 -0
  291. package/vault/wiki/questions/Research-sentrux-dev.md +123 -0
  292. package/vault/wiki/questions/Research-superpowers-skill-for-agentic-coding-agents.md +164 -0
  293. package/vault/wiki/questions/Research: Augment Code Context Engine.md +244 -0
  294. package/vault/wiki/questions/Research: Automating Software Engineering - Lovable, Bolt, Emergent, Rocket.md +112 -0
  295. package/vault/wiki/questions/Research: Claude Code State-of-the-Art Harness Improvements.md +209 -0
  296. package/vault/wiki/questions/Research: Codex State-of-the-Art Harness Improvements.md +99 -0
  297. package/vault/wiki/questions/Research: Engineering Workflows of Legendary Programmers and AI Harness Mapping.md +107 -0
  298. package/vault/wiki/questions/Research: Fallow Codebase Intelligence Harness Integration.md +72 -0
  299. package/vault/wiki/questions/Research: Gemini CLI SOTA Harness Integration.md +166 -0
  300. package/vault/wiki/questions/Research: GitHub Issues as Harness Spec Storage.md +188 -0
  301. package/vault/wiki/questions/Research: Google Antigravity Harness Integration.md +120 -0
  302. package/vault/wiki/questions/Research: Meta-Agent Context Drift Detection.md +236 -0
  303. package/vault/wiki/questions/Research: Model-Adaptive Agent Harness Design.md +95 -0
  304. package/vault/wiki/questions/Research: Model-Specific Prompting Guides.md +165 -0
  305. package/vault/wiki/questions/Research: Prompt Renderer for Multi-Model Agent Harness.md +216 -0
  306. package/vault/wiki/questions/Research: Skill-First Harness Architecture.md +91 -0
  307. package/vault/wiki/questions/Research: TypeScript Best Practices and Codebase Structure.md +88 -0
  308. package/vault/wiki/questions/Research: TypeScript Execution Layer for Agent Tool Calling.md +81 -0
  309. package/vault/wiki/questions/Research: claude-mem over Obsidian for Harness Layer.md +71 -0
  310. package/vault/wiki/questions/Research: claude-mem over obsidian wiki as the knowledge base for our agentic harness pipeline. think from first principles. does this replace or complement our current setup? no hard feelings about previous decisions. gimme accurate points.md +80 -0
  311. package/vault/wiki/questions/Research: context-mode vs lean-ctx.md +72 -0
  312. package/vault/wiki/questions/Research: cursor.sh Harness Innovations.md +92 -0
  313. package/vault/wiki/questions/Research: executor.sh Harness Integration.md +170 -0
  314. package/vault/wiki/questions/Research: how GSD fits into our coding harness setup.md +97 -0
  315. package/vault/wiki/questions/Research: how claude-mem fits into our workflow. and whether it should replace obsidian in the codebase. no hard feelings about previous actions, rethink from first principles always.md +80 -0
  316. package/vault/wiki/questions/Research: pi-vcc.md +113 -0
  317. package/vault/wiki/questions/Research: semantic code search tools.md +69 -0
  318. package/vault/wiki/questions/Research: vcc extension for pi coding agent.md +73 -0
  319. package/vault/wiki/questions/how-to-enable-semantic-code-search-now.md +111 -0
  320. package/vault/wiki/questions/mvp-implementation-blueprint.md +552 -0
  321. package/vault/wiki/questions/research-agent-first-codebase-exploration.md +199 -0
  322. package/vault/wiki/questions/research-agentic-coding-harness-latest-papers.md +142 -0
  323. package/vault/wiki/questions/research-gitingest-gitreverse-integration.md +100 -0
  324. package/vault/wiki/questions/research-wozcode-token-reduction.md +67 -0
  325. package/vault/wiki/questions/resolved-context-pruning-inplace-vs-restart.md +95 -0
  326. package/vault/wiki/questions/resolved-context-window-economics.md +167 -0
  327. package/vault/wiki/questions/resolved-imad-debate-gating-transfer.md +126 -0
  328. package/vault/wiki/questions/resolved-mcp-tool-preference.md +112 -0
  329. package/vault/wiki/questions/resolved-small-model-meta-agents.md +107 -0
  330. package/vault/wiki/questions/resolved-treesitter-dynamic-languages.md +95 -0
  331. package/vault/wiki/sources/Auggie Context MCP Server.md +63 -0
  332. package/vault/wiki/sources/Augment Code Codacy AI Giants.md +61 -0
  333. package/vault/wiki/sources/Augment Code MCP SiliconAngle.md +49 -0
  334. package/vault/wiki/sources/Augment Code WorkOS ERC 2025.md +55 -0
  335. package/vault/wiki/sources/Augment Context Engine Official.md +71 -0
  336. package/vault/wiki/sources/Augment SWE-bench Agent GitHub.md +74 -0
  337. package/vault/wiki/sources/Augment SWE-bench Pro Blog.md +58 -0
  338. package/vault/wiki/sources/Source: AgentBus Jinja2 Prompt Pipelines.md +75 -0
  339. package/vault/wiki/sources/Source: Arxiv /342/200/224 Don't Break the Cache.md" +85 -0
  340. package/vault/wiki/sources/Source: Augment - Harness Engineering for AI Coding Agents.md +58 -0
  341. package/vault/wiki/sources/Source: Blake Crosley Agent Architecture Guide.md +100 -0
  342. package/vault/wiki/sources/Source: Bolt.new Architecture & Case Study.md +75 -0
  343. package/vault/wiki/sources/Source: Build-Time Prompt Compilation Architecture.md +107 -0
  344. package/vault/wiki/sources/Source: Claude API Agent Skills Overview.md +70 -0
  345. package/vault/wiki/sources/Source: Gemini CLI Changelogs.md +88 -0
  346. package/vault/wiki/sources/Source: Google Blog - Gemini CLI Announcement.md +57 -0
  347. package/vault/wiki/sources/Source: Google Gemini CLI Architecture Docs.md +53 -0
  348. package/vault/wiki/sources/Source: LangChain - Anatomy of Agent Harness.md +65 -0
  349. package/vault/wiki/sources/Source: Lovable Architecture & Clone Analysis.md +83 -0
  350. package/vault/wiki/sources/Source: Martin Fowler - Harness Engineering.md +70 -0
  351. package/vault/wiki/sources/Source: OpenAI Harness Engineering Five Principles.md +58 -0
  352. package/vault/wiki/sources/Source: OpenAI Harness Engineering /342/200/224 0 Lines of Human Code.md" +101 -0
  353. package/vault/wiki/sources/Source: OpenDev /342/200/224 Building AI Coding Agents for the Terminal.md" +100 -0
  354. package/vault/wiki/sources/Source: Render AI Coding Agents Benchmark 2025.md +53 -0
  355. package/vault/wiki/sources/Source: Rocket.new /342/200/224 Vibe Solutioning Platform.md" +70 -0
  356. package/vault/wiki/sources/Source: SwirlAI Agent Skills Progressive Disclosure.md +71 -0
  357. package/vault/wiki/sources/Source: TianPan Prompt Caching Architecture.md +89 -0
  358. package/vault/wiki/sources/Source: Vercel Labs agent-browser.md +155 -0
  359. package/vault/wiki/sources/Source: browser-harness CDP Harness.md +126 -0
  360. package/vault/wiki/sources/agent-drift-academic-paper.md +79 -0
  361. package/vault/wiki/sources/aider-repomap-tree-sitter.md +42 -0
  362. package/vault/wiki/sources/anthropic-compaction-api.md +58 -0
  363. package/vault/wiki/sources/anthropic-effective-harnesses.md +42 -0
  364. package/vault/wiki/sources/anthropic-prompt-best-practices.md +100 -0
  365. package/vault/wiki/sources/anthropic2026-harness-design.md +63 -0
  366. package/vault/wiki/sources/barrel-files-tkdodo.md +38 -0
  367. package/vault/wiki/sources/birth-of-unix-kernighan-interview.md +57 -0
  368. package/vault/wiki/sources/bockeler2026-harness-engineering.md +69 -0
  369. package/vault/wiki/sources/cast-code-chunking-paper.md +50 -0
  370. package/vault/wiki/sources/ck-semantic-search.md +78 -0
  371. package/vault/wiki/sources/claude-code-architecture-karaxai-2026.md +71 -0
  372. package/vault/wiki/sources/claude-code-architecture-qubytes-2026.md +50 -0
  373. package/vault/wiki/sources/claude-code-architecture-vila-lab-2026.md +64 -0
  374. package/vault/wiki/sources/claude-code-security-architecture-penligent-2026.md +70 -0
  375. package/vault/wiki/sources/claude-context-editing-docs.md +13 -0
  376. package/vault/wiki/sources/cloudflare-codemode.md +63 -0
  377. package/vault/wiki/sources/code-chunk-library-supermemory.md +63 -0
  378. package/vault/wiki/sources/codeact-apple-2024.md +62 -0
  379. package/vault/wiki/sources/codex-dsc-rfc-8573.md +41 -0
  380. package/vault/wiki/sources/codex-open-source-agent-2026.md +110 -0
  381. package/vault/wiki/sources/coir-code-retrieval-benchmark.md +51 -0
  382. package/vault/wiki/sources/colinmcnamara-context-optimization-codemode.md +48 -0
  383. package/vault/wiki/sources/context-folding-paper.md +61 -0
  384. package/vault/wiki/sources/context-mode-website.md +63 -0
  385. package/vault/wiki/sources/cursor-agent-best-practices-2026.md +62 -0
  386. package/vault/wiki/sources/cursor-fork-29b-2025.md +50 -0
  387. package/vault/wiki/sources/cursor-harness-april-2026.md +76 -0
  388. package/vault/wiki/sources/cursor-instant-apply-2024.md +45 -0
  389. package/vault/wiki/sources/cursor-shadow-workspace-2024.md +52 -0
  390. package/vault/wiki/sources/cursor-shipped-coding-agent-2026.md +53 -0
  391. package/vault/wiki/sources/cursor-vs-antigravity-2026.md +51 -0
  392. package/vault/wiki/sources/disler-pi-vs-claude-code.md +69 -0
  393. package/vault/wiki/sources/distill-deterministic-context-compression.md +53 -0
  394. package/vault/wiki/sources/embedding-models-benchmark-supermemory-2025.md +48 -0
  395. package/vault/wiki/sources/executor-rhyssullivan.md +122 -0
  396. package/vault/wiki/sources/fallow-rs-codebase-intelligence.md +125 -0
  397. package/vault/wiki/sources/fan2025-imad.md +60 -0
  398. package/vault/wiki/sources/forgecode-gpt5-agent-improvements.md +63 -0
  399. package/vault/wiki/sources/gemini-3-prompting-guide.md +78 -0
  400. package/vault/wiki/sources/gh-cli-sub-issue-rfc.md +50 -0
  401. package/vault/wiki/sources/gh-sub-issue-extension.md +72 -0
  402. package/vault/wiki/sources/github-fork-issues-discussion.md +44 -0
  403. package/vault/wiki/sources/github-issue-dependencies-docs.md +49 -0
  404. package/vault/wiki/sources/github-sub-issues-docs.md +51 -0
  405. package/vault/wiki/sources/gitingest.md +91 -0
  406. package/vault/wiki/sources/gitreverse.md +63 -0
  407. package/vault/wiki/sources/google-antigravity-official-blog.md +47 -0
  408. package/vault/wiki/sources/google-antigravity-wikipedia.md +53 -0
  409. package/vault/wiki/sources/gsd-codecentric-deep-dive.md +57 -0
  410. package/vault/wiki/sources/gsd-github-repo.md +51 -0
  411. package/vault/wiki/sources/gsd-hn-discussion.md +59 -0
  412. package/vault/wiki/sources/guido-python-design-philosophy.md +56 -0
  413. package/vault/wiki/sources/hejlsberg-7-learnings.md +48 -0
  414. package/vault/wiki/sources/ironclaw-drift-monitor.md +80 -0
  415. package/vault/wiki/sources/langsight-loop-detection.md +80 -0
  416. package/vault/wiki/sources/leanctx-website.md +69 -0
  417. package/vault/wiki/sources/lee2026-meta-harness.md +59 -0
  418. package/vault/wiki/sources/linux-kernel-coding-workflow.md +50 -0
  419. package/vault/wiki/sources/lou2026-autoharness.md +53 -0
  420. package/vault/wiki/sources/martin-fowler-harness-engineering.md +73 -0
  421. package/vault/wiki/sources/mcp-architecture-docs.md +13 -0
  422. package/vault/wiki/sources/meng2026-agent-harness-survey.md +79 -0
  423. package/vault/wiki/sources/mindstudio-four-agent-types.md +68 -0
  424. package/vault/wiki/sources/ms-chat-history-management.md +13 -0
  425. package/vault/wiki/sources/openai-prompt-guidance.md +104 -0
  426. package/vault/wiki/sources/openclaw-session-pruning.md +13 -0
  427. package/vault/wiki/sources/opencode-dcp.md +13 -0
  428. package/vault/wiki/sources/opendev-arxiv-2603.05344v1.md +79 -0
  429. package/vault/wiki/sources/openhands-platform.md +39 -0
  430. package/vault/wiki/sources/oss-guide-codebase-exploration.md +53 -0
  431. package/vault/wiki/sources/pi-compaction-extensions-ecosystem.md +102 -0
  432. package/vault/wiki/sources/pi-context-prune-github-repo.md +38 -0
  433. package/vault/wiki/sources/pi-mono-compaction-docs.md +38 -0
  434. package/vault/wiki/sources/pi-omni-compact-github-repo.md +50 -0
  435. package/vault/wiki/sources/pi-rtk-optimizer-github-repo.md +45 -0
  436. package/vault/wiki/sources/pi-vcc-github-repo.md +69 -0
  437. package/vault/wiki/sources/pi-vscode-marketplace.md +41 -0
  438. package/vault/wiki/sources/pi-vscode-model-provider-marketplace.md +39 -0
  439. package/vault/wiki/sources/py-tree-sitter.md +13 -0
  440. package/vault/wiki/sources/sentrux-dev-landing.md +40 -0
  441. package/vault/wiki/sources/sentrux-docs-pro-architecture.md +75 -0
  442. package/vault/wiki/sources/sentrux-docs-quality-signal.md +46 -0
  443. package/vault/wiki/sources/sentrux-docs-root-cause-metrics.md +57 -0
  444. package/vault/wiki/sources/sentrux-docs-rules-engine.md +58 -0
  445. package/vault/wiki/sources/sentrux-github-repo.md +56 -0
  446. package/vault/wiki/sources/superpowers-github-repo.md +56 -0
  447. package/vault/wiki/sources/superpowers-release-blog.md +54 -0
  448. package/vault/wiki/sources/superpowers-termdock-analysis.md +45 -0
  449. package/vault/wiki/sources/swe-agent-aci.md +42 -0
  450. package/vault/wiki/sources/swe-bench.md +45 -0
  451. package/vault/wiki/sources/swe-pruner-context-pruning.md +13 -0
  452. package/vault/wiki/sources/think-in-code-blog.md +48 -0
  453. package/vault/wiki/sources/tree-sitter-docs.md +13 -0
  454. package/vault/wiki/sources/ts-best-practices-2025-devto.md +42 -0
  455. package/vault/wiki/sources/ts-folder-structure-mingyang.md +58 -0
  456. package/vault/wiki/sources/ts-monorepo-koerselman.md +44 -0
  457. package/vault/wiki/sources/ts-result-error-handling-kkalamarski.md +52 -0
  458. package/vault/wiki/sources/ts-runtimes-comparison-betterstack.md +42 -0
  459. package/vault/wiki/sources/ts-strict-mode-rishikc.md +43 -0
  460. package/vault/wiki/sources/unix-philosophy.md +48 -0
  461. package/vault/wiki/sources/vectara-chunking-vs-embedding-naacl2025.md +39 -0
  462. package/vault/wiki/sources/vectara-guardian-agents.md +79 -0
  463. package/vault/wiki/sources/vgrep-semantic-search.md +76 -0
  464. package/vault/wiki/sources/vitest-official.md +41 -0
  465. package/vault/wiki/sources/vscode-pi-community-extension.md +40 -0
  466. package/vault/wiki/sources/wozcode.md +79 -0
  467. package/.agents/skills/compress/SKILL.md +0 -111
  468. package/.agents/skills/compress/scripts/__init__.py +0 -9
  469. package/.agents/skills/compress/scripts/__main__.py +0 -3
  470. package/.agents/skills/compress/scripts/benchmark.py +0 -78
  471. package/.agents/skills/compress/scripts/cli.py +0 -73
  472. package/.agents/skills/compress/scripts/compress.py +0 -227
  473. package/.agents/skills/compress/scripts/detect.py +0 -121
  474. package/.agents/skills/compress/scripts/validate.py +0 -189
  475. package/.agents/skills/emil-design-eng/SKILL.md +0 -679
  476. package/.agents/skills/lean-ctx/SKILL.md +0 -149
  477. package/.agents/skills/lean-ctx/scripts/install.sh +0 -95
  478. package/.agents/skills/scrapling-official/LICENSE.txt +0 -28
  479. package/.agents/skills/scrapling-official/SKILL.md +0 -390
  480. package/.agents/skills/scrapling-official/examples/01_fetcher_session.py +0 -26
  481. package/.agents/skills/scrapling-official/examples/02_dynamic_session.py +0 -26
  482. package/.agents/skills/scrapling-official/examples/03_stealthy_session.py +0 -26
  483. package/.agents/skills/scrapling-official/examples/04_spider.py +0 -58
  484. package/.agents/skills/scrapling-official/examples/README.md +0 -45
  485. package/.agents/skills/scrapling-official/references/fetching/choosing.md +0 -78
  486. package/.agents/skills/scrapling-official/references/fetching/dynamic.md +0 -352
  487. package/.agents/skills/scrapling-official/references/fetching/static.md +0 -432
  488. package/.agents/skills/scrapling-official/references/fetching/stealthy.md +0 -255
  489. package/.agents/skills/scrapling-official/references/mcp-server.md +0 -214
  490. package/.agents/skills/scrapling-official/references/migrating_from_beautifulsoup.md +0 -86
  491. package/.agents/skills/scrapling-official/references/parsing/adaptive.md +0 -212
  492. package/.agents/skills/scrapling-official/references/parsing/main_classes.md +0 -586
  493. package/.agents/skills/scrapling-official/references/parsing/selection.md +0 -494
  494. package/.agents/skills/scrapling-official/references/spiders/advanced.md +0 -344
  495. package/.agents/skills/scrapling-official/references/spiders/architecture.md +0 -94
  496. package/.agents/skills/scrapling-official/references/spiders/getting-started.md +0 -164
  497. package/.agents/skills/scrapling-official/references/spiders/proxy-blocking.md +0 -235
  498. package/.agents/skills/scrapling-official/references/spiders/requests-responses.md +0 -196
  499. package/.agents/skills/scrapling-official/references/spiders/sessions.md +0 -205
  500. package/PLAN.md +0 -11
  501. package/extensions/lean-ctx-enforce.ts +0 -166
  502. package/skills-lock.json +0 -35
  503. package/wiki/README.md +0 -19
  504. package/wiki/decisions/0001-establish-project-wiki-and-decision-record-format.md +0 -25
  505. package/wiki/decisions/0002-add-project-banner-to-readme.md +0 -26
  506. package/wiki/decisions/0003-remove-redundant-readme-title-heading.md +0 -26
  507. package/wiki/decisions/0004-publish-package-to-npm-as-ultimate-pi.md +0 -26
  508. package/wiki/decisions/0005-automate-npm-publish-with-github-actions.md +0 -27
  509. package/wiki/decisions/0006-switch-to-npm-trusted-publishing.md +0 -26
  510. package/wiki/decisions/0007-use-absolute-banner-url-for-npm-readme-rendering.md +0 -26
  511. package/wiki/decisions/0008-rename-banner-asset-for-cache-busting.md +0 -26
  512. package/wiki/decisions/0009-force-oidc-path-by-clearing-node-auth-token-in-publish-step.md +0 -25
  513. package/wiki/decisions/0010-simplify-setup-node-for-npm-trusted-publishing.md +0 -26
  514. package/wiki/decisions/0011-add-noop-workflow-change-to-force-fresh-publish-run.md +0 -25
  515. package/wiki/decisions/0012-align-workflow-runtime-with-npm-trusted-publishing-requirements.md +0 -26
  516. package/wiki/decisions/0013-add-package-repository-url-for-provenance-validation.md +0 -25
@@ -0,0 +1,100 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: official-documentation
5
+ title: "Anthropic Prompt Engineering Best Practices (Claude Opus 4.7 through Haiku 4.5)"
6
+ author: "Anthropic"
7
+ date_published: 2026-04-01
8
+ date_fetched: 2026-05-01
9
+ url: "https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/claude-prompting-best-practices"
10
+ confidence: high
11
+ key_claims:
12
+ - "Claude Opus 4.7 interprets prompts more literally and explicitly than Opus 4.6"
13
+ - "Effort parameter (max/xhigh/high/medium/low) is the primary control knob replacing budget_tokens"
14
+ - "Adaptive thinking dynamically calibrates reasoning depth per step"
15
+ - "XML tags are the recommended structure format for complex prompts"
16
+ - "Long content at top + query at bottom improves performance up to 30%"
17
+ - "Claude Opus 4.7 has stronger default design aesthetic with specific house style"
18
+ - "Code review harnesses need explicit lowering of the reporting bar for Opus 4.7"
19
+ tags:
20
+ - prompting
21
+ - anthropic
22
+ - claude
23
+ - model-specific
24
+ - harness-design
25
+ created: 2026-05-02
26
+ updated: 2026-05-02
27
+
28
+ ---# Anthropic Prompt Engineering Best Practices
29
+
30
+ Official comprehensive prompt engineering guide for Claude's latest models (Opus 4.7, Opus 4.6, Sonnet 4.6, Haiku 4.5). Single reference covering foundational techniques, output control, tool use, thinking, and agentic systems.
31
+
32
+ ## Model-Specific Key Findings
33
+
34
+ ### Claude Opus 4.7
35
+ - **More literal instruction following**: Will not silently generalize instructions; precision over thrash
36
+ - **Response length calibrated to task complexity**: Shorter on lookups, longer on analysis
37
+ - **Effort parameter critical**: `xhigh` for coding/agentic, `high` minimum for intelligence-sensitive
38
+ - **Tool use triggering**: Uses tools LESS than Opus 4.6; needs explicit guidance or higher effort
39
+ - **User-facing progress updates**: Better native updates; remove scaffolding that forces interim messages
40
+ - **Tone shift**: More direct/opinionated, less validation-forward, fewer emoji than Opus 4.6
41
+ - **Subagent spawning**: Tends to spawn FEWER subagents by default
42
+ - **Default frontend aesthetic**: Warm cream (~`#F4F1EA`), serif display type, terracotta/amber accent
43
+ - **Code review**: Better at finding bugs but follows "only high-severity" filters too faithfully
44
+ - **Design steering**: "Propose 4 directions first" pattern breaks the default
45
+
46
+ ### Claude Opus 4.6
47
+ - **Adaptive thinking**: Replaces budget_tokens; effort controls depth
48
+ - **Overthinking risk**: Excessive upfront exploration; may gather extensive context without prompting
49
+ - **Subagent predilection**: Strong tendency to spawn subagents; may overuse
50
+ - **Better vision**: Improved multi-image processing, computer use
51
+ - **Prefilled responses deprecated**: Last assistant turn prefill returns 400 on Mythos Preview
52
+
53
+ ### Claude Sonnet 4.6
54
+ - **Effort default**: `high` (was no effort parameter in Sonnet 4.5)
55
+ - **Recommended settings**: `medium` for most apps, `low` for latency-sensitive
56
+ - **Adaptive thinking**: Best for autonomous multi-step agents, computer use, bimodal workloads
57
+ - **64k max_tokens recommended** at medium/high effort
58
+
59
+ ## General Principles (All Models)
60
+
61
+ ### Prompt Structure
62
+ - **XML tags preferred**: `<instructions>`, `<context>`, `<examples>`, `<input>` — unambiguous parsing
63
+ - **Long content at top, query at bottom**: Up to 30% quality improvement
64
+ - **Role setting**: Even single sentence makes difference
65
+ - **Examples in `<example>` tags**: 3-5 examples; diverse, structured
66
+ - **Be clear and direct**: "Golden rule — show prompt to colleague; if they'd be confused, Claude will be too"
67
+ - **Provide context/why**: Explaining motivation helps understanding
68
+ - **Prefer general instructions over prescriptive steps**: Claude's reasoning exceeds human-prescribed steps
69
+
70
+ ### Tool Use
71
+ - **Explicit direction needed**: "can you suggest" vs "implement" distinction
72
+ - **Proactive action default**: `<default_to_action>` block for autonomous behavior
73
+ - **Conservative action default**: `<do_not_act_before_instructions>` for safety-critical
74
+ - **Parallel tool calling**: Maximize by default, steerable
75
+ - **Older aggressive prompts cause overtriggering**: Dial back "CRITICAL: You MUST" language
76
+
77
+ ### Thinking & Reasoning
78
+ - **Adaptive thinking**: Dynamic calibration; higher effort = more thinking
79
+ - **Steerable**: "Thinking adds latency; only use when it will meaningfully improve quality"
80
+ - **Self-check**: "Before you finish, verify your answer against [criteria]"
81
+ - **Multishot with `<thinking>` tags**: Show reasoning pattern in examples
82
+
83
+ ### Agentic Systems
84
+ - **Context awareness**: Model tracks remaining context window tokens
85
+ - **State tracking across windows**: Save progress to files, use git, structured formats
86
+ - **Multi-window workflows**: First window sets up framework, future windows iterate
87
+ - **Balancing autonomy/safety**: Reversible actions OK, destructive actions need confirmation
88
+ - **Research mode**: Competing hypotheses, confidence tracking, hypothesis trees
89
+
90
+ ### Output Control
91
+ - **Tell what to do, not what not to do**: "Write in flowing prose" not "Don't use markdown"
92
+ - **Match prompt style to desired output**: Remove markdown from prompt to reduce markdown in output
93
+ - **XML format indicators**: `<smoothly_flowing_prose_paragraphs>` tags
94
+ - **Avoid overengineering**: Don't add features, refactors, or abstractions beyond what was asked
95
+ - **Minimize hallucinations**: `<investigate_before_answering>` block
96
+
97
+ ### Frontend Design
98
+ - **Don't settle for "AI slop"**: Distinctive typography, cohesive themes, purposeful motion
99
+ - **Opus 4.7 default**: Warm cream + serif + terracotta; steer via concrete specs or option proposal
100
+ - **Frontend aesthetics block**: Explicit guidance against generic patterns
@@ -0,0 +1,63 @@
1
+ ---
2
+ type: source
3
+ source_type: blog
4
+ title: "Harness Design for Long-Running Application Development"
5
+ author: "Prithvi Rajasekaran, Anthropic Engineering"
6
+ date_published: 2026-03-24
7
+ url: "https://www.anthropic.com/engineering/harness-design-long-running-apps"
8
+ confidence: high
9
+ key_claims:
10
+ - "Self-evaluation is fundamentally broken: agents praise their own mediocre work"
11
+ - "Separating generator from evaluator (GAN-inspired) dramatically improves output quality"
12
+ - "Sprint contracts: agree on 'done' before writing code"
13
+ - "Harness simplification: as models improve, remove non-load-bearing components"
14
+ - "Cost: 20x more expensive but dramatically better quality"
15
+ tags:
16
+ - harness
17
+ - anthropic
18
+ - multi-agent
19
+ - evaluator
20
+ - generator
21
+ created: 2026-04-30
22
+ updated: 2026-04-30
23
+ status: ingested
24
+
25
+ ---# Harness Design for Long-Running Application Development
26
+
27
+ Anthropic Engineering, March 2026. Prithvi Rajasekaran.
28
+
29
+ ## Three-Agent Architecture
30
+
31
+ **Planner**: Takes 1-4 sentence prompt → full product spec. Stays at product-context level, avoids granular technical details to prevent cascading errors.
32
+
33
+ **Generator**: Implements one feature at a time. Self-evaluates after each sprint before handing off.
34
+
35
+ **Evaluator**: Uses Playwright MCP to interactively test the running application. Grades against explicit criteria (design quality, originality, craft, functionality). Each criterion has a hard threshold — if any falls below, sprint fails.
36
+
37
+ ## Critical Findings
38
+
39
+ ### Self-Evaluation Is Broken
40
+ When asked to evaluate their own work, agents "respond by confidently praising the work — even when, to a human observer, the quality is obviously mediocre." Separating generator from evaluator is essential.
41
+
42
+ ### Context Anxiety
43
+ Models start wrapping up prematurely when approaching context limit. Compaction alone insufficient — context resets with structured handoffs required for Sonnet 4.5. Opus 4.5+ largely fixed this.
44
+
45
+ ### Evaluator Tuning
46
+ Claude is "a poor QA agent out of the box" — identifies flaws then talks itself out of flagging them. Needs explicit tuning to be skeptical. Multiple rounds of development loop required.
47
+
48
+ ### Sprint Contracts
49
+ Before each sprint, generator and evaluator negotiate a contract defining what "done" looks like. Generator proposes, evaluator reviews. They iterate until agreement. Communication via files.
50
+
51
+ ### Harness Simplification Principle
52
+ "Every component in a harness encodes an assumption about what the model can't do on its own, and those assumptions are worth stress testing." When Opus 4.6 arrived, sprint construct was removed (model handled decomposition natively). Evaluator became conditional — worth the cost only when task sits beyond what the model does reliably solo.
53
+
54
+ ## Results
55
+
56
+ | Harness | Duration | Cost | Quality |
57
+ |---------|----------|------|---------|
58
+ | Solo | 20 min | $9 | Core feature broken |
59
+ | Full harness | 6 hr | $200 | All features working, AI integration |
60
+
61
+ ## Key Takeaway
62
+
63
+ "The space of interesting harness combinations doesn't shrink as models improve. Instead, it moves."
@@ -0,0 +1,38 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: article
5
+ author: "Dominik Dorfmeister (TkDodo)"
6
+ date_published: 2024-07-26
7
+ url: "https://tkdodo.eu/blog/please-stop-using-barrel-files"
8
+ confidence: high
9
+ key_claims:
10
+ - "Barrel files cause circular imports when internal modules import from the barrel"
11
+ - "Next.js projects saw 11K → 3.5K module load reduction (68%) by removing barrels"
12
+ - "Barrel files slow development server startup by 5-10 seconds in large projects"
13
+ - "Barrels are appropriate only for library entry points (package.json `main` field)"
14
+ tags:
15
+ - typescript
16
+ - barrel-files
17
+ - code-organization
18
+ - performance
19
+ created: 2026-05-02
20
+ updated: 2026-05-02
21
+
22
+ ---# Please Stop Using Barrel Files
23
+
24
+ Source: TkDodo's blog (Dominik Dorfmeister), July 2024. Author of TanStack Query.
25
+
26
+ ## Summary
27
+
28
+ Argues against the widespread practice of using `index.ts` barrel files to re-export from directories. Documents real-world performance problems and circular import issues caused by barrel files in production Next.js applications.
29
+
30
+ ## Key Arguments
31
+
32
+ **Circular imports**: When a module inside a directory imports from its own barrel (`import { X } from '@/dir'`), it creates a circular dependency. ESLint `import/no-cycle` can catch some but not all cases.
33
+
34
+ **Development speed**: Barrel files force JavaScript to load and parse every module in the barrel synchronously, even if only one export is needed. A real Next.js project saw module count drop from 11K to 3.5K (68% reduction) after removing barrels, cutting startup time from 5-10 seconds down significantly.
35
+
36
+ **Next.js `optimizePackageImports`**: Automatically transforms barrel imports to direct module paths, but only works if the barrel is a "pure" re-export file with no other code.
37
+
38
+ **When barrels are OK**: Library entry points only (the `main` field in `package.json`). For application code, direct imports are preferred.
@@ -0,0 +1,57 @@
1
+ ---
2
+ type: source
3
+ source_type: interview-podcast
4
+ title: "The Birth of UNIX — Brian Kernighan on Bell Labs"
5
+ author: "Brian Kernighan (interviewed by Adam Gordon Bell)"
6
+ date_published: 2020-11-01
7
+ url: "https://corecursive.com/brian-kernighan-unix-bell-labs1/"
8
+ confidence: high
9
+ key_claims:
10
+ - "Ken Thompson built the first working Unix in 3 weeks"
11
+ - "Bell Labs culture: shared machine, shared source tree, everyone on same filesystem"
12
+ - "The Unix Room as collaborative physical space"
13
+ - "Ken Thompson reversed-engineered a typesetter in hours: disassembler, assembler, B interpreter"
14
+ - "Pipes enabled Cambrian explosion of composable tools"
15
+ - "Community built around shared machine and shared source — `who` command as social tool"
16
+ - "The only rule: you changed it last, it's yours"
17
+ tags: [unix, bell-labs, ken-thompson, brian-kernighan, history]
18
+ ---
19
+
20
+ # The Birth of UNIX — Brian Kernighan Interview
21
+
22
+ ## Ken Thompson's Productivity
23
+
24
+ - Built first working Unix in 3 weeks while wife was on vacation.
25
+ - Reverse-engineered a typesetter in hours: wrote a disassembler for an unfamiliar CPU from binary code, then an assembler, then a B language interpreter — all in about a day.
26
+ - Brian Kernighan: "For Ken, it was just like breathing. Oh, okay, done. Next."
27
+
28
+ ## The Unix Room Culture
29
+
30
+ - Physical shared space on Bell Labs 6th floor with a PDP-11 and teletypes.
31
+ - "If you wanted, you could go sit in your office and think deep thoughts... then come back to the common space when you wanted to."
32
+ - Shared machine + shared filesystem: everyone could see everyone's source code.
33
+ - "The only real rule: you changed it last, it's yours."
34
+ - `who` command as community builder — showed who was logged in and when they last acted.
35
+ - 10-kilo chocolate bars on the table, Private Eye magazine from Dennis Ritchie.
36
+
37
+ ## The Pipes Breakthrough
38
+
39
+ - Doug McIlroy pushed for program composition for years.
40
+ - Ken Thompson implemented it. The pipe symbol `|` "just clicked instantly."
41
+ - Within days: "frenzy of fixing up programs so that they would work properly in pipelines."
42
+ - Sort was repackaged to read stdin/write stdout — pattern used daily by millions since.
43
+
44
+ ## Kernighan on Modern Programming
45
+
46
+ - "I found it easier to program when I was trying to figure out the logic for myself rather than trying to figure out where in the infinite stack of documentation was the function I needed."
47
+ - "Too much of today's programming is more like looking it up."
48
+
49
+ ## Richard Hamming's Influence
50
+
51
+ - "He would reserve Friday afternoons for thinking great thoughts."
52
+ - Asked chemists: "Could your work lead to a Nobel Prize? If not, why are you working on it?"
53
+ - But the Unix work itself didn't seem important at the time — it was just making programming easier for themselves.
54
+
55
+ ## Kernighan's Thesis as Tool-Building Metaphor
56
+
57
+ - His PhD thesis formatting program: the first 500 cards were the program, the remaining 5,500 were the thesis. "It's building tools that let you do things, and the tools are often some kind of specialized language."
@@ -0,0 +1,69 @@
1
+ ---
2
+ type: source
3
+ source_type: blog
4
+ title: "Harness Engineering for Coding Agent Users"
5
+ author: "Birgitta Böckeler, Martin Fowler"
6
+ date_published: 2026-04-02
7
+ url: "https://martinfowler.com/articles/harness-engineering.html"
8
+ confidence: high
9
+ key_claims:
10
+ - "Feedforward (guides) + Feedback (sensors) = harness control framework"
11
+ - "Computational controls: deterministic, fast (tests, linters, type checkers)"
12
+ - "Inferential controls: semantic, probabilistic (AI code review, LLM-as-judge)"
13
+ - "Three regulation categories: Maintainability, Architecture Fitness, Behaviour"
14
+ - "Behavioural harness (functional correctness) remains unsolved"
15
+ - "Ashby's Law: harness must match system variety; topologies reduce variety"
16
+ tags:
17
+ - harness
18
+ - feedforward
19
+ - feedback
20
+ - martin-fowler
21
+ - maintainability
22
+ created: 2026-04-30
23
+ updated: 2026-04-30
24
+ status: ingested
25
+
26
+ ---# Harness Engineering for Coding Agent Users
27
+
28
+ Birgitta Böckeler, Martin Fowler. April 2026.
29
+
30
+ ## The Framework
31
+
32
+ ### Feedforward Controls (Guides)
33
+ Anticipate agent behavior, steer BEFORE it acts:
34
+ - AGENTS.md, skills, rules, how-to guides
35
+ - Language servers, CLIs, scripts, codemods
36
+
37
+ ### Feedback Controls (Sensors)
38
+ Observe AFTER agent acts, enable self-correction:
39
+ - AI code review agents
40
+ - Static analysis, linters, logs, browser testing
41
+
42
+ ### Computational vs Inferential
43
+
44
+ | Type | Speed | Reliability | Examples |
45
+ |------|-------|-------------|----------|
46
+ | Computational | ms-sec | Deterministic | Tests, linters, type checkers, structural analysis |
47
+ | Inferential | sec-min | Probabilistic | AI code review, LLM-as-judge, semantic analysis |
48
+
49
+ ## Three Regulation Categories
50
+
51
+ 1. **Maintainability Harness**: Internal code quality. Computational sensors catch structural issues reliably. LLMs partially address semantic issues but expensively.
52
+
53
+ 2. **Architecture Fitness Harness**: Architecture characteristics. Fitness functions + observability standards.
54
+
55
+ 3. **Behaviour Harness**: Functional correctness. **THE UNSOLVED PROBLEM.** Current approach (AI-generated tests + manual testing) insufficient.
56
+
57
+ ## Harnessability
58
+
59
+ Not every codebase is equally harnessable. Strongly typed languages, clear module boundaries, framework abstractions increase harnessability. "Ambient affordances" — structural properties that make the environment legible to agents.
60
+
61
+ ## Harness Templates
62
+
63
+ Pre-bundled guides + sensors for service topologies (CRUD, event processor, data dashboard). Ashby's Law: topology narrows the solution space, making comprehensive harnesses achievable.
64
+
65
+ ## Key Insight
66
+
67
+ > "The human's job is to STEER the agent by iterating on the harness. Whenever an issue happens multiple times, the feedforward and feedback controls should be improved."
68
+
69
+ Harness engineering is an ongoing practice, not a one-time configuration.
@@ -0,0 +1,50 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: research-paper
5
+ author: Yilin Zhang, Xinran Zhao, Zora Zhiruo Wang, Chenyang Yang, Jiayi Wei, Tongshuang Wu (CMU)
6
+ date_published: 2025-06-18
7
+ url: https://arxiv.org/abs/2506.15655
8
+ confidence: high
9
+ key_claims:
10
+ - "AST-based chunking (cAST) boosts Recall@5 by 4.3 points on RepoEval retrieval and Pass@1 by 2.67 on SWE-bench generation"
11
+ - "Existing line-based chunking heuristics break semantic structures, splitting functions or merging unrelated code"
12
+ - "cAST recursively breaks large AST nodes into smaller chunks and merges sibling nodes while respecting size limits"
13
+ - "Structure-aware chunking generates self-contained, semantically coherent units across programming languages"
14
+ tags:
15
+ - chunking
16
+ - AST
17
+ - code-rag
18
+ - embedding
19
+ - arxiv
20
+ created: 2026-05-02
21
+ updated: 2026-05-02
22
+
23
+ ---# cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
24
+
25
+ ## Summary
26
+
27
+ Peer-reviewed paper (arXiv:2506.15655, June 2025) from CMU researchers proposing AST-based chunking for code RAG pipelines. The core insight: line-based chunking breaks semantic structures, splitting functions mid-body or merging unrelated code. cAST parses code into ASTs and uses recursive split-then-merge to create self-contained, semantically coherent chunks.
28
+
29
+ ## Key Details
30
+
31
+ ### Problem
32
+ - RAG pipelines split documents into retrievable units (chunks)
33
+ - Line-based heuristics often break semantic structures
34
+ - Splitting functions or merging unrelated code degrades generation quality
35
+
36
+ ### Solution: cAST
37
+ - Parse code into Abstract Syntax Tree
38
+ - Recursively break large AST nodes into smaller chunks
39
+ - Merge sibling nodes while respecting size limits
40
+ - Uses non-whitespace character count (not line count) for sizing
41
+ - Greedy window assignment with merge-adjacent optimization
42
+
43
+ ### Results
44
+ - Recall@5: +4.3 points on RepoEval retrieval
45
+ - Pass@1: +2.67 on SWE-bench generation
46
+ - Works across programming languages
47
+
48
+ ## Relevance to Our Implementation
49
+
50
+ This is the foundational paper for AST-aware chunking. The `code-chunk` library (supermemoryai) implements this algorithm in production. We should adopt AST-aware chunking via tree-sitter (already available in lean-ctx) rather than naive text splitting.
@@ -0,0 +1,78 @@
1
+ ---
2
+ type: source
3
+ source_type: official-documentation
4
+ title: "ck: Hybrid Code Search"
5
+ author: BeaconBay
6
+ date_published: 2025-08-30
7
+ url: https://beaconbay.github.io/ck/
8
+ repo: https://github.com/BeaconBay/ck
9
+ confidence: high
10
+ key_claims:
11
+ - "ck is a grep-compatible hybrid code search tool combining BM25 lexical search with embedding-based semantic search"
12
+ - "~1M LOC indexed in under 2 minutes, sub-500ms queries"
13
+ - "Completely offline: no code or queries sent to external services"
14
+ - "Built-in MCP server for AI agent integration (ck --serve)"
15
+ - "Supports 80+ languages via tree-sitter chunking"
16
+ tags:
17
+ - code-search
18
+ - semantic-search
19
+ - grep
20
+ - mcp
21
+ - rust
22
+ related:
23
+ - "[[ck-tool]]"
24
+ - "[[hybrid-code-search]]"
25
+ - "[[Research: semantic code search tools]]"
26
+ created: 2025-08-30
27
+ updated: 2026-04-30
28
+ status: ingested
29
+
30
+ ---# ck (seek): Hybrid Code Search
31
+
32
+ ## Summary
33
+
34
+ ck is a Rust-based hybrid code search tool that fuses lexical (BM25/grep) precision with embedding-based semantic recall, then re-ranks results using Reciprocal Rank Fusion (RRF). It positions itself as a drop-in grep replacement with added semantic capabilities.
35
+
36
+ ## What It Contributes
37
+
38
+ **Primary contribution to AI coding agents**: ck provides a grep-compatible CLI that agents can use directly (`ck --sem "error handling" src/`) while also serving as an MCP server for deeper integration. The MCP tools (`ck_search`, `ck_get`, `ck_info`, `ck_reindex`) give agents first-class access to semantic code search without parsing CLI output.
39
+
40
+ ## Key Capabilities
41
+
42
+ | Capability | Details |
43
+ |---|---|
44
+ | **Lexical Search** | BM25-based, grep-compatible flags (-n, -A, -B, -C, -r, -l, -i, -w) |
45
+ | **Semantic Search** | `ck --sem "query"` — embedding-based, finds by concept not keywords |
46
+ | **Hybrid Search** | `ck --hybrid "query"` — RRF fusion of lexical + semantic results |
47
+ | **TUI Mode** | `ck-tui` — interactive terminal interface with live results |
48
+ | **Editor Integration** | VSCode/Cursor extension (`code --install-extension ck-search`) |
49
+ | **MCP Server** | `ck --serve` — Model Context Protocol for AI agent integration |
50
+ | **Incremental Indexing** | Chunk-level re-indexing: only re-embeds changed files |
51
+
52
+ ## Installation
53
+
54
+ ```bash
55
+ # From NPM (recommended)
56
+ npm install -g @beaconbay/ck-search
57
+
58
+ # From crates.io
59
+ cargo install ck-search
60
+
61
+ # MCP setup for Claude Code
62
+ claude mcp add ck-search -s user -- ck --serve
63
+ ```
64
+
65
+ ## Limitations (Documented)
66
+
67
+ 1. **No code-aware embeddings**: Uses generic text embeddings (fastembed), not code-specialized models. Structural patterns may be missed.
68
+ 2. **80 language max**: Tree-sitter chunking covers 80 languages. Unsupported languages fall back to line-based chunking.
69
+ 3. **No custom model training**: Pre-trained models only. Cannot fine-tune for domain-specific codebases.
70
+ 4. **HuggingFace cache control**: Cache location controlled by HF env vars (`$HF_HOME`), no ck-specific config.
71
+ 5. **Memory**: 4-8GB RAM recommended for large codebases (10M+ LOC).
72
+ 6. **Result pagination**: Max 100 results per page. Exhaustive search requires cursor-based pagination.
73
+ 7. **No team/cloud sync**: Local-only indexes. No shared or remote indexes.
74
+ 8. **No AST-level understanding**: Chunking is tree-sitter-based, but embeddings are text, not AST-aware.
75
+
76
+ ## Confidence Assessment
77
+
78
+ **High confidence** for feature claims — all verified against official documentation and GitHub repo. The limitations section is unusually thorough for a young project (transparency is a good signal). The tool is actively maintained (last commit within days as of April 2026). Stars growth: 1,572 in ~8 months suggests strong community validation.
@@ -0,0 +1,71 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: blog
5
+ title: "How Claude Code Actually Works: A Systems-Level Deep Dive"
6
+ author: "KaraxAI"
7
+ date_published: 2026-03-19
8
+ url: "https://karaxai.com/posts/how-claude-code-works-systems-deep-dive/"
9
+ confidence: medium
10
+ tags: [claude-code, architecture, CLAUDE.md, agent-loop, skills, plugins, MCP, subagents, hooks]
11
+ key_claims:
12
+ - "Claude Code has 82,000+ GitHub stars and handles millions of coding sessions"
13
+ - "CLAUDE.md is injected into user messages in <system-reminder> tags, every turn — not in system prompt"
14
+ - "96% compliance with 5 conditional rule files (30 lines each) vs 92% with single 150-line CLAUDE.md"
15
+ - "Three memory systems: CLAUDE.md (reliable), auto-memory (200-line limit, lossy), session memory (lossy)"
16
+ - "Auto-compaction at ~83.5% of 200K window, ~85% payload reduction"
17
+ - "Skills use progressive disclosure: 100 tokens at startup, full body on-demand"
18
+ - "Subagents get fresh 200K context, only summary returns, cannot spawn own subagents"
19
+ - "Hooks achieve 100% compliance; CLAUDE.md rules achieve ~92%"
20
+ - "Deliberately no embeddings: 'agentic search generally works better' — Boris Cherny"
21
+ - "The model is the commodity; the agent is the product"
22
+ created: 2026-05-02
23
+ updated: 2026-05-02
24
+ ---
25
+ # Claude Code Systems Deep-Dive (KaraxAI, 2026)
26
+
27
+ ## Source Summary
28
+
29
+ Comprehensive technical walkthrough published March 2026. Covers the full stack: context assembly, agentic loop, MCP, plugins. Based on reverse-engineered internals from mitmproxy interception, npm tarball analysis, and systematic prompt extraction. Notable for providing specific compliance numbers and the direct quote from Claude Code's creator about rejecting embeddings.
30
+
31
+ ## CLAUDE.md Loading Hierarchy
32
+
33
+ ```
34
+ Global (~/.claude/CLAUDE.md) → Enterprise → Project → Local → Notebook (cursor rules)
35
+ ```
36
+
37
+ All tiers are additive. When instructions conflict, more specific (local) wins. Conditional rules via YAML frontmatter (`match: "*.test.ts"`) since v1.0.16.
38
+
39
+ ## Agentic Loop
40
+
41
+ Single-threaded. Model receives context → produces response → if tool calls, execute, append to history, call again → if `stop_reason === "end_turn"`, stop. Between iterations: permission enforcement (hooks → deny rules → allow rules → ask rules → permission mode), context monitoring (auto-compaction at ~83.5%), state re-injection (CLAUDE.md re-sent every turn), mid-task steering (async dual-buffer queue).
42
+
43
+ ## Context Compression
44
+
45
+ At ~167K/200K tokens, auto-compaction triggers. Summary in `<summary>` tags. All prior messages dropped. ~85% reduction (167K → ~25K). Lossy: old file contents, tool outputs lost; new summary + last 5 messages + CLAUDE.md survive.
46
+
47
+ ## Skills
48
+
49
+ Progressive disclosure: scans `.claude/skills/` and `~/.claude/skills/`, loads only `name` + `description` (~100 tokens each) into `<available_skills>` block. Full content loads on invocation via Skill tool. Skills can include supporting files, restrict tools, spawn subagents. Built-in skills (`/simplify`, `/review`, `/batch`, `/loop`, `/debug`) are prompt-based, not hardcoded.
50
+
51
+ ## Plugins
52
+
53
+ Directory with `.claude-plugin/plugin.json` manifest. Bundles any combination of: skills, agents, hooks, MCP servers, commands, CLAUDE.md. Namespacing: `/my-plugin:hello`. Agent override: plugin can replace main agent's system prompt. 9,000+ plugins across registries. Official marketplace ships built-in.
54
+
55
+ ## No Embeddings
56
+
57
+ > "Early versions used RAG + a local vector db, but we found pretty quickly that agentic search generally works better." — Boris Cherny, Claude Code creator
58
+
59
+ Search hierarchy: Glob (file path matching, near-zero token cost) → Grep (ripgrep, regex-powered) → Read (full file load, reserved for confirmed-relevant files). Explore subagent on Haiku for deep exploration.
60
+
61
+ ## Hooks
62
+
63
+ Deterministic escape hatch. Shell commands fire on lifecycle events. Exit codes: 0 = allow, 2 = block (stderr fed to Claude), other = non-blocking error. CLAUDE.md ~92% compliance. Hooks 100% for matched conditions.
64
+
65
+ ## Key Quotes
66
+
67
+ > "The model is the commodity; the agent is the product."
68
+
69
+ > "Agentic search generally works better." — Boris Cherny
70
+
71
+ > "CLAUDE.md content is injected into user messages, wrapped in <system-reminder> XML tags. Every turn. Not once at session start — every single API call re-sends it."
@@ -0,0 +1,50 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: blog
5
+ title: "Inside Claude Code: The Architecture That Makes AI Actually Do the Work"
6
+ author: "Vijendra (The Neural Blueprint / Qubytes)"
7
+ date_published: 2026-04-30
8
+ url: "https://qubytes.substack.com/p/claude-code-architecture-explained"
9
+ confidence: medium
10
+ tags: [claude-code, architecture, agent-loop, compaction, hooks, subagents]
11
+ key_claims:
12
+ - "Claude Code is a while-loop surrounded by serious infrastructure"
13
+ - "Five critical subsystems: Agent Loop, Permission System, Tools & Execution Environment, State & Persistence, Compaction Pipeline"
14
+ - "Compaction pipeline is the most underappreciated component — five layers, forked subagent, structured summary"
15
+ - "Hooks are the enterprise integration surface"
16
+ - "Subagents enable horizontal scaling of reasoning"
17
+ - "Safety as a subsystem, not an afterthought"
18
+ created: 2026-05-02
19
+ updated: 2026-05-02
20
+ ---
21
+ # Claude Code Architecture (Qubytes, 2026)
22
+
23
+ ## Source Summary
24
+
25
+ Technical deep-dive by Vijendra (The Neural Blueprint) analyzing Claude Code as a layered architecture. Published April 30, 2026 — same day as Cursor's harness evolution blog. Synthesized from the leaked source code and official documentation.
26
+
27
+ ## Five Subsystems
28
+
29
+ ### 1. Agent Loop
30
+ The heart. Orchestrates everything: assembles context window, dispatches requests, routes tool-use responses, commits state. Feedback controller, not a pipeline. Non-deterministic iterations driven by task complexity.
31
+
32
+ ### 2. Permission System
33
+ First-class architectural concern. Sits between agent loop and tool execution. ML-based auto classifier with 7 permission modes. Diamond-shaped decision node: deny sends feedback to loop, accept lets execution proceed.
34
+
35
+ ### 3. Tools & Execution Environment
36
+ Built-in tools (file read/write, bash, grep, glob) + MCP extensions. All tool execution runs through Shell Sandbox. Remote execution backends (local/cloud/remote).
37
+
38
+ ### 4. State & Persistence
39
+ Append-oriented session transcript. Not just logging — substrate for resume, fork, rewind. CLAUDE.md + memory inject persistent project context. Sidechain transcripts for subagent interactions, preventing context pollution.
40
+
41
+ ### 5. Compaction Pipeline
42
+ Five layers: forked subagent produces ~6,500 token structured summary. Preserves: last 5 file attachments, active skills, plan state, tool deltas. "Structured extraction followed by selective reconstruction — not summarization."
43
+
44
+ ## Key Quotes
45
+
46
+ > "The core agent loop — assemble context, call the model, receive a tool request, execute it, repeat — is conceptually simple. The real engineering genius lives in everything around that loop."
47
+
48
+ > "Context is a managed resource, not an infinite buffer."
49
+
50
+ > "If you can't answer: How does your permission system work? What's your compaction strategy? Can I hook into the lifecycle? How does subagent delegation handle context isolation? — you're not looking at a production-ready agentic system."
@@ -0,0 +1,64 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: academic-paper
5
+ title: "Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems"
6
+ author: "Jiacheng Liu, Xiaohan Zhao, Xinyi Shang, Zhiqiang Shen"
7
+ date_published: 2026-04-14
8
+ url: "https://arxiv.org/abs/2604.14228"
9
+ confidence: high
10
+ tags: [claude-code, agent-architecture, source-code-analysis, design-principles]
11
+ key_claims:
12
+ - "Claude Code architecture centers on a simple while-loop with four surrounding subsystems"
13
+ - "Five human values motivate the architecture: human decision authority, safety, reliable execution, capability amplification, contextual adaptability"
14
+ - "Thirteen design principles trace from values to specific implementation choices"
15
+ - "Core subsystems: permission system with ML classifier, five-layer compaction pipeline, four extensibility mechanisms, subagent delegation with worktree isolation"
16
+ - "Comparison with OpenClaw reveals how same design questions produce different answers under different deployment contexts"
17
+ - "Six open design directions identified for future agent systems"
18
+ created: 2026-05-02
19
+ updated: 2026-05-02
20
+ ---
21
+ # Dive into Claude Code (VILA-Lab, 2026)
22
+
23
+ ## Source Summary
24
+
25
+ Academic paper by researchers at VILA-Lab analyzing Claude Code's TypeScript source code (publicly available after accidental leak). Reverse-engineers the complete architecture from 510K+ lines of TypeScript. The most comprehensive architectural analysis of Claude Code available.
26
+
27
+ ## Architecture Components
28
+
29
+ ### Core Loop
30
+ The center is a simple `while`-loop: assemble context → call model → receive tool request → execute → repeat. Most of the 510K lines lives in systems around this loop, not in the loop itself.
31
+
32
+ ### Five Human Values → 13 Design Principles
33
+ 1. **Human Decision Authority**: Permission system, plan mode, manual approval gates
34
+ 2. **Safety and Security**: ML-based auto classifier, sandboxing, permission modes
35
+ 3. **Reliable Execution**: Compaction for long sessions, checkpointing, error recovery
36
+ 4. **Capability Amplification**: MCP, subagents, skills, plugins
37
+ 5. **Contextual Adaptability**: CLAUDE.md hierarchy, conditional rules, dynamic context loading
38
+
39
+ ### Four Extensibility Mechanisms
40
+ 1. **MCP** (Model Context Protocol): Open standard for tool connections. JSON-RPC 2.0, stdio and HTTP transports. Donated to Linux Foundation Dec 2025. Adopted by OpenAI, Google, GitHub, JetBrains.
41
+ 2. **Plugins**: Distribution layer bundling skills + agents + hooks + MCP. 9,000+ ecosystem. Namespaced, versioned.
42
+ 3. **Skills**: Progressive disclosure. Name+description at startup, full body on demand.
43
+ 4. **Hooks**: Deterministic lifecycle events. Exit-code semantics for allow/deny.
44
+
45
+ ### Comparison with OpenClaw
46
+ OpenClaw is a multi-channel personal assistant gateway. Same design questions, different answers due to different deployment context: per-action classification vs perimeter-level access control, single CLI loop vs embedded runtime in gateway control plane, context-window extensions vs gateway-wide capability registration.
47
+
48
+ ## Six Open Design Directions
49
+ 1. Cross-agent state sharing
50
+ 2. Long-horizon task decomposition
51
+ 3. Agent-to-agent negotiation protocols
52
+ 4. Formal verification of agent safety
53
+ 5. Energy-aware scheduling
54
+ 6. Multi-modal grounding in software engineering
55
+
56
+ ## Relevance to Our Harness
57
+
58
+ This paper provides the foundational framework for understanding Claude Code as a harness architecture. The "five human values → 13 design principles → implementation choices" methodology is directly applicable to our own harness design documentation. The comparison with OpenClaw validates our multi-source research approach (Cursor vs Antigravity vs Claude Code — different deployment contexts surface different design answers).
59
+
60
+ ## Key Quotes
61
+
62
+ > "The core of the system is a simple while-loop that calls the model, runs tools, and repeats. Most of the code, however, lives in the systems around this loop."
63
+
64
+ > "Our analysis identifies five human values, philosophies, and needs that motivate the architecture and traces them through thirteen design principles to specific implementation choices."