ultimate-pi 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (516) hide show
  1. package/.agents/skills/ck-search/SKILL.md +99 -0
  2. package/.agents/skills/defuddle/SKILL.md +90 -0
  3. package/.agents/skills/find-skills/SKILL.md +142 -0
  4. package/.agents/skills/firecrawl/SKILL.md +150 -0
  5. package/.agents/skills/firecrawl/rules/install.md +82 -0
  6. package/.agents/skills/firecrawl/rules/security.md +26 -0
  7. package/.agents/skills/firecrawl-agent/SKILL.md +57 -0
  8. package/.agents/skills/firecrawl-build-interact/SKILL.md +67 -0
  9. package/.agents/skills/firecrawl-build-onboarding/SKILL.md +102 -0
  10. package/.agents/skills/firecrawl-build-onboarding/references/auth-flow.md +39 -0
  11. package/.agents/skills/firecrawl-build-onboarding/references/project-setup.md +20 -0
  12. package/.agents/skills/firecrawl-build-onboarding/references/sdk-installation.md +17 -0
  13. package/.agents/skills/firecrawl-build-scrape/SKILL.md +68 -0
  14. package/.agents/skills/firecrawl-build-search/SKILL.md +68 -0
  15. package/.agents/skills/firecrawl-crawl/SKILL.md +58 -0
  16. package/.agents/skills/firecrawl-download/SKILL.md +69 -0
  17. package/.agents/skills/firecrawl-interact/SKILL.md +83 -0
  18. package/.agents/skills/firecrawl-map/SKILL.md +50 -0
  19. package/.agents/skills/firecrawl-parse/SKILL.md +61 -0
  20. package/.agents/skills/firecrawl-scrape/SKILL.md +68 -0
  21. package/.agents/skills/firecrawl-search/SKILL.md +59 -0
  22. package/.agents/skills/obsidian-bases/SKILL.md +299 -0
  23. package/.agents/skills/obsidian-markdown/SKILL.md +237 -0
  24. package/.agents/skills/posthog-analyst/SKILL.md +306 -0
  25. package/.agents/skills/posthog-analyst/evals/evals.json +23 -0
  26. package/.agents/skills/wiki/SKILL.md +215 -0
  27. package/.agents/skills/wiki/references/css-snippets.md +122 -0
  28. package/.agents/skills/wiki/references/frontmatter.md +107 -0
  29. package/.agents/skills/wiki/references/git-setup.md +58 -0
  30. package/.agents/skills/wiki/references/mcp-setup.md +149 -0
  31. package/.agents/skills/wiki/references/modes.md +259 -0
  32. package/.agents/skills/wiki/references/plugins.md +96 -0
  33. package/.agents/skills/wiki/references/rest-api.md +124 -0
  34. package/.agents/skills/wiki-autoresearch/SKILL.md +211 -0
  35. package/.agents/skills/wiki-autoresearch/references/program.md +75 -0
  36. package/.agents/skills/wiki-fold/SKILL.md +204 -0
  37. package/.agents/skills/wiki-fold/references/fold-template.md +133 -0
  38. package/.agents/skills/wiki-ingest/SKILL.md +288 -0
  39. package/.agents/skills/wiki-lint/SKILL.md +183 -0
  40. package/.agents/skills/wiki-query/SKILL.md +176 -0
  41. package/.agents/skills/wiki-save/SKILL.md +128 -0
  42. package/.ckignore +41 -0
  43. package/.env.example +9 -0
  44. package/.github/workflows/lint.yml +33 -0
  45. package/.github/workflows/publish-github-packages.yml +35 -0
  46. package/.github/workflows/publish-npm.yml +1 -1
  47. package/.pi/SYSTEM.md +107 -40
  48. package/.pi/agents/pi-pi/agent-expert.md +205 -0
  49. package/.pi/agents/pi-pi/cli-expert.md +47 -0
  50. package/.pi/agents/pi-pi/config-expert.md +67 -0
  51. package/.pi/agents/pi-pi/ext-expert.md +53 -0
  52. package/.pi/agents/pi-pi/keybinding-expert.md +123 -0
  53. package/.pi/agents/pi-pi/pi-orchestrator.md +103 -0
  54. package/.pi/agents/pi-pi/prompt-expert.md +83 -0
  55. package/.pi/agents/pi-pi/skill-expert.md +52 -0
  56. package/.pi/agents/pi-pi/theme-expert.md +46 -0
  57. package/.pi/agents/pi-pi/tui-expert.md +100 -0
  58. package/.pi/agents/rethink.md +140 -0
  59. package/.pi/agents/wiki-ingest.md +67 -0
  60. package/.pi/agents/wiki-lint.md +75 -0
  61. package/.pi/auto-commit.json +20 -0
  62. package/.pi/extensions/banner.png +0 -0
  63. package/.pi/extensions/ck-enforce.ts +216 -0
  64. package/.pi/extensions/custom-footer.ts +308 -0
  65. package/.pi/extensions/custom-header.ts +116 -0
  66. package/.pi/extensions/dotenv-loader.ts +170 -0
  67. package/.pi/internal/cursor-sdk-transcript-parser.ts +59 -0
  68. package/.pi/model-router.json +95 -0
  69. package/.pi/npm/.gitignore +2 -0
  70. package/.pi/prompts/git-sync.md +124 -0
  71. package/.pi/prompts/harness-setup.md +509 -0
  72. package/.pi/prompts/save.md +16 -0
  73. package/.pi/prompts/wiki-autoresearch.md +19 -0
  74. package/.pi/prompts/wiki.md +23 -0
  75. package/.pi/providers/cursor-sdk-provider.test.mjs +476 -0
  76. package/.pi/providers/cursor-sdk-provider.ts +1085 -0
  77. package/.pi/settings.json +14 -4
  78. package/.pi/skills/agent-router/SKILL.md +174 -0
  79. package/.pi/sounds/alert/1-kaching-track.mp3 +0 -0
  80. package/.pi/sounds/error/1-ksi-wth-track.mp3 +0 -0
  81. package/.pi/sounds/error/2-smash-track.mp3 +0 -0
  82. package/.pi/sounds/error/3-buzzer-track.mp3 +0 -0
  83. package/.pi/sounds/notification/1-soft-notification-track.mp3 +0 -0
  84. package/.pi/sounds/project-sounds.json +25 -0
  85. package/.pi/sounds/reminder/1-soft-notification-track.mp3 +0 -0
  86. package/.pi/sounds/success/1-tada-track.mp3 +0 -0
  87. package/.pi/sounds/success/2-jobs-done-track.mp3 +0 -0
  88. package/.pi/sounds/success/3-yay-track.mp3 +0 -0
  89. package/CONTRIBUTING.md +116 -0
  90. package/README.md +32 -39
  91. package/biome.json +34 -0
  92. package/firecrawl/.env.template +58 -0
  93. package/firecrawl/README.md +49 -0
  94. package/firecrawl/docker-compose.yaml +201 -0
  95. package/firecrawl/searxng/searxng.env +3 -0
  96. package/firecrawl/searxng/settings.yml +85 -0
  97. package/lefthook.yml +8 -0
  98. package/package.json +55 -24
  99. package/vault/AGENTS.md +37 -0
  100. package/vault/wiki/_templates/comparison.md +39 -0
  101. package/vault/wiki/_templates/concept.md +40 -0
  102. package/vault/wiki/_templates/decision.md +21 -0
  103. package/vault/wiki/_templates/entity.md +32 -0
  104. package/vault/wiki/_templates/flow.md +14 -0
  105. package/vault/wiki/_templates/module.md +18 -0
  106. package/vault/wiki/_templates/question.md +31 -0
  107. package/vault/wiki/_templates/source.md +39 -0
  108. package/vault/wiki/concepts/AST-Aware Code Chunking.md +44 -0
  109. package/vault/wiki/concepts/Build-Time Prompt Compilation.md +107 -0
  110. package/vault/wiki/concepts/Context Engine (AI Coding).md +47 -0
  111. package/vault/wiki/concepts/Context-Aware System Reminders.md +61 -0
  112. package/vault/wiki/concepts/Contextualized Text Embedding.md +42 -0
  113. package/vault/wiki/concepts/Contractor vs Employee AI Model.md +55 -0
  114. package/vault/wiki/concepts/Dual-Model Agent Architecture.md +65 -0
  115. package/vault/wiki/concepts/Late Chunking vs Early Chunking.md +43 -0
  116. package/vault/wiki/concepts/Majority Vote Ensembling.md +68 -0
  117. package/vault/wiki/concepts/Meta-Harness.md +16 -0
  118. package/vault/wiki/concepts/Multi-Agent AI Coding Architecture.md +75 -0
  119. package/vault/wiki/concepts/Prompt Enhancement.md +90 -0
  120. package/vault/wiki/concepts/Prompt Renderer.md +89 -0
  121. package/vault/wiki/concepts/Semantic Codebase Indexing.md +67 -0
  122. package/vault/wiki/concepts/additive-config-hierarchy.md +16 -0
  123. package/vault/wiki/concepts/agent-artifacts-verifiable-deliverables.md +71 -0
  124. package/vault/wiki/concepts/agent-browser-browser-automation.md +99 -0
  125. package/vault/wiki/concepts/agent-codebase-interface.md +43 -0
  126. package/vault/wiki/concepts/agent-harness-architecture.md +67 -0
  127. package/vault/wiki/concepts/agent-loop-detection-patterns.md +133 -0
  128. package/vault/wiki/concepts/agent-search-enforcement.md +126 -0
  129. package/vault/wiki/concepts/agent-skills-ecosystem.md +74 -0
  130. package/vault/wiki/concepts/agent-skills-pattern.md +68 -0
  131. package/vault/wiki/concepts/agentic-harness-context-enforcement.md +91 -0
  132. package/vault/wiki/concepts/agentic-harness.md +34 -0
  133. package/vault/wiki/concepts/agentic-orchestration-pipeline.md +56 -0
  134. package/vault/wiki/concepts/agentic-search-no-embeddings.md +18 -0
  135. package/vault/wiki/concepts/anthropic-context-engineering.md +13 -0
  136. package/vault/wiki/concepts/antigravity-agent-first-architecture.md +61 -0
  137. package/vault/wiki/concepts/ast-compression.md +19 -0
  138. package/vault/wiki/concepts/ast-truncation.md +66 -0
  139. package/vault/wiki/concepts/barrel-files.md +37 -0
  140. package/vault/wiki/concepts/browser-harness-agent.md +41 -0
  141. package/vault/wiki/concepts/browser-subagent-visual-verification.md +82 -0
  142. package/vault/wiki/concepts/codebase-intelligence-ecosystem-comparison.md +192 -0
  143. package/vault/wiki/concepts/codebase-intelligence-harness-integration.md +161 -0
  144. package/vault/wiki/concepts/codebase-to-context-ingestion.md +46 -0
  145. package/vault/wiki/concepts/codex-harness-innovations.md +147 -0
  146. package/vault/wiki/concepts/consensus-debate-flow.md +17 -0
  147. package/vault/wiki/concepts/consensus-debate.md +206 -0
  148. package/vault/wiki/concepts/content-addressed-spec-identity.md +166 -0
  149. package/vault/wiki/concepts/context-anxiety.md +57 -0
  150. package/vault/wiki/concepts/context-compression-techniques.md +19 -0
  151. package/vault/wiki/concepts/context-continuity.md +22 -0
  152. package/vault/wiki/concepts/context-drift-in-agents.md +106 -0
  153. package/vault/wiki/concepts/context-engineering.md +62 -0
  154. package/vault/wiki/concepts/context-folding.md +67 -0
  155. package/vault/wiki/concepts/context-mode.md +38 -0
  156. package/vault/wiki/concepts/cursor-harness-innovations.md +107 -0
  157. package/vault/wiki/concepts/deterministic-session-compaction.md +79 -0
  158. package/vault/wiki/concepts/drift-detection-unified.md +296 -0
  159. package/vault/wiki/concepts/execution-feedback-loop.md +46 -0
  160. package/vault/wiki/concepts/feedforward-feedback-harness.md +60 -0
  161. package/vault/wiki/concepts/five-root-cause-metrics-sentrux.md +40 -0
  162. package/vault/wiki/concepts/fork-safe-spec-storage.md +89 -0
  163. package/vault/wiki/concepts/fts5-sandbox.md +19 -0
  164. package/vault/wiki/concepts/fuzzy-edit-matching.md +71 -0
  165. package/vault/wiki/concepts/gemini-cli-architecture.md +104 -0
  166. package/vault/wiki/concepts/generator-evaluator-architecture.md +64 -0
  167. package/vault/wiki/concepts/guardian-agent-pattern.md +67 -0
  168. package/vault/wiki/concepts/harness-configuration-layers.md +89 -0
  169. package/vault/wiki/concepts/harness-control-frameworks.md +155 -0
  170. package/vault/wiki/concepts/harness-engineering-first-principles.md +90 -0
  171. package/vault/wiki/concepts/harness-h-formalism.md +53 -0
  172. package/vault/wiki/concepts/hybrid-code-search.md +61 -0
  173. package/vault/wiki/concepts/inline-post-edit-validation.md +112 -0
  174. package/vault/wiki/concepts/legendary-engineering-patterns-harness.md +110 -0
  175. package/vault/wiki/concepts/lifecycle-hooks.md +94 -0
  176. package/vault/wiki/concepts/mcp-tool-routing.md +102 -0
  177. package/vault/wiki/concepts/memory-system-of-record-vs-ephemeral-cache.md +47 -0
  178. package/vault/wiki/concepts/meta-agent-context-pruning.md +151 -0
  179. package/vault/wiki/concepts/model-adaptive-harness.md +122 -0
  180. package/vault/wiki/concepts/model-routing-agents.md +101 -0
  181. package/vault/wiki/concepts/monorepo-architecture.md +45 -0
  182. package/vault/wiki/concepts/multi-agent-specialization.md +61 -0
  183. package/vault/wiki/concepts/permission-subsystem.md +16 -0
  184. package/vault/wiki/concepts/pi-messenger-analysis.md +243 -0
  185. package/vault/wiki/concepts/pi-vscode-extension-landscape.md +37 -0
  186. package/vault/wiki/concepts/policy-engine-pattern.md +78 -0
  187. package/vault/wiki/concepts/progressive-disclosure-agents.md +53 -0
  188. package/vault/wiki/concepts/progressive-skill-disclosure.md +17 -0
  189. package/vault/wiki/concepts/provider-native-prompting.md +203 -0
  190. package/vault/wiki/concepts/quality-signal-sentrux.md +37 -0
  191. package/vault/wiki/concepts/repo-map-ranking.md +42 -0
  192. package/vault/wiki/concepts/result-monad-error-handling.md +47 -0
  193. package/vault/wiki/concepts/safety-defense-in-depth.md +83 -0
  194. package/vault/wiki/concepts/sandbox-os-enforcement.md +18 -0
  195. package/vault/wiki/concepts/selective-debate-routing.md +70 -0
  196. package/vault/wiki/concepts/self-evolving-harness.md +60 -0
  197. package/vault/wiki/concepts/sentrux-mcp-integration.md +36 -0
  198. package/vault/wiki/concepts/sentrux-rules-engine.md +49 -0
  199. package/vault/wiki/concepts/shell-pattern-compression.md +24 -0
  200. package/vault/wiki/concepts/skill-first-architecture.md +166 -0
  201. package/vault/wiki/concepts/structured-compaction.md +78 -0
  202. package/vault/wiki/concepts/subagent-orchestration.md +17 -0
  203. package/vault/wiki/concepts/subagent-worktree-isolation.md +68 -0
  204. package/vault/wiki/concepts/superpowers-methodology.md +78 -0
  205. package/vault/wiki/concepts/think-in-code.md +73 -0
  206. package/vault/wiki/concepts/ts-execution-layer.md +100 -0
  207. package/vault/wiki/concepts/typescript-strict-mode.md +37 -0
  208. package/vault/wiki/concepts/vcc-conversation-compaction-for-pi.md +51 -0
  209. package/vault/wiki/concepts/verification-drift-detection.md +19 -0
  210. package/vault/wiki/consensus/consensus-records.md +58 -0
  211. package/vault/wiki/decisions/2026-04-30-pi-lean-ctx-native.md +122 -0
  212. package/vault/wiki/decisions/adr-008.md +40 -0
  213. package/vault/wiki/decisions/adr-009.md +46 -0
  214. package/vault/wiki/decisions/adr-010.md +55 -0
  215. package/vault/wiki/decisions/adr-011.md +165 -0
  216. package/vault/wiki/decisions/adr-012.md +102 -0
  217. package/vault/wiki/decisions/adr-013.md +59 -0
  218. package/vault/wiki/decisions/adr-014.md +73 -0
  219. package/vault/wiki/decisions/adr-015.md +81 -0
  220. package/vault/wiki/decisions/adr-016.md +91 -0
  221. package/vault/wiki/decisions/adr-017.md +79 -0
  222. package/vault/wiki/decisions/adr-018.md +100 -0
  223. package/vault/wiki/decisions/adr-019.md +75 -0
  224. package/vault/wiki/decisions/adr-020.md +106 -0
  225. package/vault/wiki/decisions/adr-021.md +86 -0
  226. package/vault/wiki/decisions/adr-022.md +113 -0
  227. package/vault/wiki/decisions/adr-023.md +113 -0
  228. package/vault/wiki/decisions/adr-024.md +73 -0
  229. package/vault/wiki/decisions/adr-025.md +130 -0
  230. package/vault/wiki/decisions/adr-026.md +56 -0
  231. package/vault/wiki/decisions/colocate-wiki.md +34 -0
  232. package/vault/wiki/entities/Anders Hejlsberg.md +29 -0
  233. package/vault/wiki/entities/Anthropic.md +17 -0
  234. package/vault/wiki/entities/Augment Code.md +49 -0
  235. package/vault/wiki/entities/Bjarne Stroustrup.md +26 -0
  236. package/vault/wiki/entities/Bolt.new (StackBlitz).md +39 -0
  237. package/vault/wiki/entities/Boris Cherny.md +11 -0
  238. package/vault/wiki/entities/Claude Code.md +19 -0
  239. package/vault/wiki/entities/Dennis Ritchie.md +26 -0
  240. package/vault/wiki/entities/Emergent Labs.md +32 -0
  241. package/vault/wiki/entities/Google Cloud.md +16 -0
  242. package/vault/wiki/entities/Guido van Rossum.md +28 -0
  243. package/vault/wiki/entities/Ken Thompson.md +28 -0
  244. package/vault/wiki/entities/Lee et al.md +16 -0
  245. package/vault/wiki/entities/Linus Torvalds.md +28 -0
  246. package/vault/wiki/entities/Lovable (company).md +40 -0
  247. package/vault/wiki/entities/Martin Fowler.md +16 -0
  248. package/vault/wiki/entities/Meng et al.md +16 -0
  249. package/vault/wiki/entities/OpenAI.md +16 -0
  250. package/vault/wiki/entities/Rocket.new.md +38 -0
  251. package/vault/wiki/entities/VILA-Lab.md +15 -0
  252. package/vault/wiki/entities/autodev-codebase.md +18 -0
  253. package/vault/wiki/entities/ck-tool.md +59 -0
  254. package/vault/wiki/entities/codesearch.md +18 -0
  255. package/vault/wiki/entities/disler-indydevdan.md +33 -0
  256. package/vault/wiki/entities/gsd-get-shit-done.md +56 -0
  257. package/vault/wiki/entities/javascript-runtimes.md +48 -0
  258. package/vault/wiki/entities/jesse-vincent.md +38 -0
  259. package/vault/wiki/entities/lean-ctx.md +32 -0
  260. package/vault/wiki/entities/opendev.md +41 -0
  261. package/vault/wiki/entities/ops-codegraph-tool.md +18 -0
  262. package/vault/wiki/entities/pi-coding-agent.md +53 -0
  263. package/vault/wiki/entities/sentrux.md +54 -0
  264. package/vault/wiki/entities/vgrep-tool.md +57 -0
  265. package/vault/wiki/entities/vitest.md +41 -0
  266. package/vault/wiki/flows/harness-wiki-pipeline.md +204 -0
  267. package/vault/wiki/hot.md +932 -0
  268. package/vault/wiki/index.md +437 -0
  269. package/vault/wiki/log.md +418 -0
  270. package/vault/wiki/meta/dashboard.md +30 -0
  271. package/vault/wiki/meta/lint-report-2026-04-30.md +86 -0
  272. package/vault/wiki/meta/lint-report-2026-05-02.md +251 -0
  273. package/vault/wiki/meta/overview.canvas +43 -0
  274. package/vault/wiki/modules/adversarial-verification.md +57 -0
  275. package/vault/wiki/modules/automated-observability.md +54 -0
  276. package/vault/wiki/modules/bench.md +20 -0
  277. package/vault/wiki/modules/extensions.md +23 -0
  278. package/vault/wiki/modules/grounding-checkpoints.md +62 -0
  279. package/vault/wiki/modules/harness-implementation-plan.md +345 -0
  280. package/vault/wiki/modules/harness-wiki-skill-mapping.md +135 -0
  281. package/vault/wiki/modules/harness.md +86 -0
  282. package/vault/wiki/modules/persistent-memory.md +85 -0
  283. package/vault/wiki/modules/schema-orchestration.md +68 -0
  284. package/vault/wiki/modules/skills.md +27 -0
  285. package/vault/wiki/modules/spec-hardening.md +58 -0
  286. package/vault/wiki/modules/structured-planning.md +53 -0
  287. package/vault/wiki/modules/think-in-code-enforcement.md +153 -0
  288. package/vault/wiki/modules/wiki-query-interface.md +64 -0
  289. package/vault/wiki/overview.md +51 -0
  290. package/vault/wiki/questions/Research-pi-vs-claude-code-agentic-orchestration-pipeline.md +87 -0
  291. package/vault/wiki/questions/Research-sentrux-dev.md +123 -0
  292. package/vault/wiki/questions/Research-superpowers-skill-for-agentic-coding-agents.md +164 -0
  293. package/vault/wiki/questions/Research: Augment Code Context Engine.md +244 -0
  294. package/vault/wiki/questions/Research: Automating Software Engineering - Lovable, Bolt, Emergent, Rocket.md +112 -0
  295. package/vault/wiki/questions/Research: Claude Code State-of-the-Art Harness Improvements.md +209 -0
  296. package/vault/wiki/questions/Research: Codex State-of-the-Art Harness Improvements.md +99 -0
  297. package/vault/wiki/questions/Research: Engineering Workflows of Legendary Programmers and AI Harness Mapping.md +107 -0
  298. package/vault/wiki/questions/Research: Fallow Codebase Intelligence Harness Integration.md +72 -0
  299. package/vault/wiki/questions/Research: Gemini CLI SOTA Harness Integration.md +166 -0
  300. package/vault/wiki/questions/Research: GitHub Issues as Harness Spec Storage.md +188 -0
  301. package/vault/wiki/questions/Research: Google Antigravity Harness Integration.md +120 -0
  302. package/vault/wiki/questions/Research: Meta-Agent Context Drift Detection.md +236 -0
  303. package/vault/wiki/questions/Research: Model-Adaptive Agent Harness Design.md +95 -0
  304. package/vault/wiki/questions/Research: Model-Specific Prompting Guides.md +165 -0
  305. package/vault/wiki/questions/Research: Prompt Renderer for Multi-Model Agent Harness.md +216 -0
  306. package/vault/wiki/questions/Research: Skill-First Harness Architecture.md +91 -0
  307. package/vault/wiki/questions/Research: TypeScript Best Practices and Codebase Structure.md +88 -0
  308. package/vault/wiki/questions/Research: TypeScript Execution Layer for Agent Tool Calling.md +81 -0
  309. package/vault/wiki/questions/Research: claude-mem over Obsidian for Harness Layer.md +71 -0
  310. package/vault/wiki/questions/Research: claude-mem over obsidian wiki as the knowledge base for our agentic harness pipeline. think from first principles. does this replace or complement our current setup? no hard feelings about previous decisions. gimme accurate points.md +80 -0
  311. package/vault/wiki/questions/Research: context-mode vs lean-ctx.md +72 -0
  312. package/vault/wiki/questions/Research: cursor.sh Harness Innovations.md +92 -0
  313. package/vault/wiki/questions/Research: executor.sh Harness Integration.md +170 -0
  314. package/vault/wiki/questions/Research: how GSD fits into our coding harness setup.md +97 -0
  315. package/vault/wiki/questions/Research: how claude-mem fits into our workflow. and whether it should replace obsidian in the codebase. no hard feelings about previous actions, rethink from first principles always.md +80 -0
  316. package/vault/wiki/questions/Research: pi-vcc.md +113 -0
  317. package/vault/wiki/questions/Research: semantic code search tools.md +69 -0
  318. package/vault/wiki/questions/Research: vcc extension for pi coding agent.md +73 -0
  319. package/vault/wiki/questions/how-to-enable-semantic-code-search-now.md +111 -0
  320. package/vault/wiki/questions/mvp-implementation-blueprint.md +552 -0
  321. package/vault/wiki/questions/research-agent-first-codebase-exploration.md +199 -0
  322. package/vault/wiki/questions/research-agentic-coding-harness-latest-papers.md +142 -0
  323. package/vault/wiki/questions/research-gitingest-gitreverse-integration.md +100 -0
  324. package/vault/wiki/questions/research-wozcode-token-reduction.md +67 -0
  325. package/vault/wiki/questions/resolved-context-pruning-inplace-vs-restart.md +95 -0
  326. package/vault/wiki/questions/resolved-context-window-economics.md +167 -0
  327. package/vault/wiki/questions/resolved-imad-debate-gating-transfer.md +126 -0
  328. package/vault/wiki/questions/resolved-mcp-tool-preference.md +112 -0
  329. package/vault/wiki/questions/resolved-small-model-meta-agents.md +107 -0
  330. package/vault/wiki/questions/resolved-treesitter-dynamic-languages.md +95 -0
  331. package/vault/wiki/sources/Auggie Context MCP Server.md +63 -0
  332. package/vault/wiki/sources/Augment Code Codacy AI Giants.md +61 -0
  333. package/vault/wiki/sources/Augment Code MCP SiliconAngle.md +49 -0
  334. package/vault/wiki/sources/Augment Code WorkOS ERC 2025.md +55 -0
  335. package/vault/wiki/sources/Augment Context Engine Official.md +71 -0
  336. package/vault/wiki/sources/Augment SWE-bench Agent GitHub.md +74 -0
  337. package/vault/wiki/sources/Augment SWE-bench Pro Blog.md +58 -0
  338. package/vault/wiki/sources/Source: AgentBus Jinja2 Prompt Pipelines.md +75 -0
  339. package/vault/wiki/sources/Source: Arxiv /342/200/224 Don't Break the Cache.md" +85 -0
  340. package/vault/wiki/sources/Source: Augment - Harness Engineering for AI Coding Agents.md +58 -0
  341. package/vault/wiki/sources/Source: Blake Crosley Agent Architecture Guide.md +100 -0
  342. package/vault/wiki/sources/Source: Bolt.new Architecture & Case Study.md +75 -0
  343. package/vault/wiki/sources/Source: Build-Time Prompt Compilation Architecture.md +107 -0
  344. package/vault/wiki/sources/Source: Claude API Agent Skills Overview.md +70 -0
  345. package/vault/wiki/sources/Source: Gemini CLI Changelogs.md +88 -0
  346. package/vault/wiki/sources/Source: Google Blog - Gemini CLI Announcement.md +57 -0
  347. package/vault/wiki/sources/Source: Google Gemini CLI Architecture Docs.md +53 -0
  348. package/vault/wiki/sources/Source: LangChain - Anatomy of Agent Harness.md +65 -0
  349. package/vault/wiki/sources/Source: Lovable Architecture & Clone Analysis.md +83 -0
  350. package/vault/wiki/sources/Source: Martin Fowler - Harness Engineering.md +70 -0
  351. package/vault/wiki/sources/Source: OpenAI Harness Engineering Five Principles.md +58 -0
  352. package/vault/wiki/sources/Source: OpenAI Harness Engineering /342/200/224 0 Lines of Human Code.md" +101 -0
  353. package/vault/wiki/sources/Source: OpenDev /342/200/224 Building AI Coding Agents for the Terminal.md" +100 -0
  354. package/vault/wiki/sources/Source: Render AI Coding Agents Benchmark 2025.md +53 -0
  355. package/vault/wiki/sources/Source: Rocket.new /342/200/224 Vibe Solutioning Platform.md" +70 -0
  356. package/vault/wiki/sources/Source: SwirlAI Agent Skills Progressive Disclosure.md +71 -0
  357. package/vault/wiki/sources/Source: TianPan Prompt Caching Architecture.md +89 -0
  358. package/vault/wiki/sources/Source: Vercel Labs agent-browser.md +155 -0
  359. package/vault/wiki/sources/Source: browser-harness CDP Harness.md +126 -0
  360. package/vault/wiki/sources/agent-drift-academic-paper.md +79 -0
  361. package/vault/wiki/sources/aider-repomap-tree-sitter.md +42 -0
  362. package/vault/wiki/sources/anthropic-compaction-api.md +58 -0
  363. package/vault/wiki/sources/anthropic-effective-harnesses.md +42 -0
  364. package/vault/wiki/sources/anthropic-prompt-best-practices.md +100 -0
  365. package/vault/wiki/sources/anthropic2026-harness-design.md +63 -0
  366. package/vault/wiki/sources/barrel-files-tkdodo.md +38 -0
  367. package/vault/wiki/sources/birth-of-unix-kernighan-interview.md +57 -0
  368. package/vault/wiki/sources/bockeler2026-harness-engineering.md +69 -0
  369. package/vault/wiki/sources/cast-code-chunking-paper.md +50 -0
  370. package/vault/wiki/sources/ck-semantic-search.md +78 -0
  371. package/vault/wiki/sources/claude-code-architecture-karaxai-2026.md +71 -0
  372. package/vault/wiki/sources/claude-code-architecture-qubytes-2026.md +50 -0
  373. package/vault/wiki/sources/claude-code-architecture-vila-lab-2026.md +64 -0
  374. package/vault/wiki/sources/claude-code-security-architecture-penligent-2026.md +70 -0
  375. package/vault/wiki/sources/claude-context-editing-docs.md +13 -0
  376. package/vault/wiki/sources/cloudflare-codemode.md +63 -0
  377. package/vault/wiki/sources/code-chunk-library-supermemory.md +63 -0
  378. package/vault/wiki/sources/codeact-apple-2024.md +62 -0
  379. package/vault/wiki/sources/codex-dsc-rfc-8573.md +41 -0
  380. package/vault/wiki/sources/codex-open-source-agent-2026.md +110 -0
  381. package/vault/wiki/sources/coir-code-retrieval-benchmark.md +51 -0
  382. package/vault/wiki/sources/colinmcnamara-context-optimization-codemode.md +48 -0
  383. package/vault/wiki/sources/context-folding-paper.md +61 -0
  384. package/vault/wiki/sources/context-mode-website.md +63 -0
  385. package/vault/wiki/sources/cursor-agent-best-practices-2026.md +62 -0
  386. package/vault/wiki/sources/cursor-fork-29b-2025.md +50 -0
  387. package/vault/wiki/sources/cursor-harness-april-2026.md +76 -0
  388. package/vault/wiki/sources/cursor-instant-apply-2024.md +45 -0
  389. package/vault/wiki/sources/cursor-shadow-workspace-2024.md +52 -0
  390. package/vault/wiki/sources/cursor-shipped-coding-agent-2026.md +53 -0
  391. package/vault/wiki/sources/cursor-vs-antigravity-2026.md +51 -0
  392. package/vault/wiki/sources/disler-pi-vs-claude-code.md +69 -0
  393. package/vault/wiki/sources/distill-deterministic-context-compression.md +53 -0
  394. package/vault/wiki/sources/embedding-models-benchmark-supermemory-2025.md +48 -0
  395. package/vault/wiki/sources/executor-rhyssullivan.md +122 -0
  396. package/vault/wiki/sources/fallow-rs-codebase-intelligence.md +125 -0
  397. package/vault/wiki/sources/fan2025-imad.md +60 -0
  398. package/vault/wiki/sources/forgecode-gpt5-agent-improvements.md +63 -0
  399. package/vault/wiki/sources/gemini-3-prompting-guide.md +78 -0
  400. package/vault/wiki/sources/gh-cli-sub-issue-rfc.md +50 -0
  401. package/vault/wiki/sources/gh-sub-issue-extension.md +72 -0
  402. package/vault/wiki/sources/github-fork-issues-discussion.md +44 -0
  403. package/vault/wiki/sources/github-issue-dependencies-docs.md +49 -0
  404. package/vault/wiki/sources/github-sub-issues-docs.md +51 -0
  405. package/vault/wiki/sources/gitingest.md +91 -0
  406. package/vault/wiki/sources/gitreverse.md +63 -0
  407. package/vault/wiki/sources/google-antigravity-official-blog.md +47 -0
  408. package/vault/wiki/sources/google-antigravity-wikipedia.md +53 -0
  409. package/vault/wiki/sources/gsd-codecentric-deep-dive.md +57 -0
  410. package/vault/wiki/sources/gsd-github-repo.md +51 -0
  411. package/vault/wiki/sources/gsd-hn-discussion.md +59 -0
  412. package/vault/wiki/sources/guido-python-design-philosophy.md +56 -0
  413. package/vault/wiki/sources/hejlsberg-7-learnings.md +48 -0
  414. package/vault/wiki/sources/ironclaw-drift-monitor.md +80 -0
  415. package/vault/wiki/sources/langsight-loop-detection.md +80 -0
  416. package/vault/wiki/sources/leanctx-website.md +69 -0
  417. package/vault/wiki/sources/lee2026-meta-harness.md +59 -0
  418. package/vault/wiki/sources/linux-kernel-coding-workflow.md +50 -0
  419. package/vault/wiki/sources/lou2026-autoharness.md +53 -0
  420. package/vault/wiki/sources/martin-fowler-harness-engineering.md +73 -0
  421. package/vault/wiki/sources/mcp-architecture-docs.md +13 -0
  422. package/vault/wiki/sources/meng2026-agent-harness-survey.md +79 -0
  423. package/vault/wiki/sources/mindstudio-four-agent-types.md +68 -0
  424. package/vault/wiki/sources/ms-chat-history-management.md +13 -0
  425. package/vault/wiki/sources/openai-prompt-guidance.md +104 -0
  426. package/vault/wiki/sources/openclaw-session-pruning.md +13 -0
  427. package/vault/wiki/sources/opencode-dcp.md +13 -0
  428. package/vault/wiki/sources/opendev-arxiv-2603.05344v1.md +79 -0
  429. package/vault/wiki/sources/openhands-platform.md +39 -0
  430. package/vault/wiki/sources/oss-guide-codebase-exploration.md +53 -0
  431. package/vault/wiki/sources/pi-compaction-extensions-ecosystem.md +102 -0
  432. package/vault/wiki/sources/pi-context-prune-github-repo.md +38 -0
  433. package/vault/wiki/sources/pi-mono-compaction-docs.md +38 -0
  434. package/vault/wiki/sources/pi-omni-compact-github-repo.md +50 -0
  435. package/vault/wiki/sources/pi-rtk-optimizer-github-repo.md +45 -0
  436. package/vault/wiki/sources/pi-vcc-github-repo.md +69 -0
  437. package/vault/wiki/sources/pi-vscode-marketplace.md +41 -0
  438. package/vault/wiki/sources/pi-vscode-model-provider-marketplace.md +39 -0
  439. package/vault/wiki/sources/py-tree-sitter.md +13 -0
  440. package/vault/wiki/sources/sentrux-dev-landing.md +40 -0
  441. package/vault/wiki/sources/sentrux-docs-pro-architecture.md +75 -0
  442. package/vault/wiki/sources/sentrux-docs-quality-signal.md +46 -0
  443. package/vault/wiki/sources/sentrux-docs-root-cause-metrics.md +57 -0
  444. package/vault/wiki/sources/sentrux-docs-rules-engine.md +58 -0
  445. package/vault/wiki/sources/sentrux-github-repo.md +56 -0
  446. package/vault/wiki/sources/superpowers-github-repo.md +56 -0
  447. package/vault/wiki/sources/superpowers-release-blog.md +54 -0
  448. package/vault/wiki/sources/superpowers-termdock-analysis.md +45 -0
  449. package/vault/wiki/sources/swe-agent-aci.md +42 -0
  450. package/vault/wiki/sources/swe-bench.md +45 -0
  451. package/vault/wiki/sources/swe-pruner-context-pruning.md +13 -0
  452. package/vault/wiki/sources/think-in-code-blog.md +48 -0
  453. package/vault/wiki/sources/tree-sitter-docs.md +13 -0
  454. package/vault/wiki/sources/ts-best-practices-2025-devto.md +42 -0
  455. package/vault/wiki/sources/ts-folder-structure-mingyang.md +58 -0
  456. package/vault/wiki/sources/ts-monorepo-koerselman.md +44 -0
  457. package/vault/wiki/sources/ts-result-error-handling-kkalamarski.md +52 -0
  458. package/vault/wiki/sources/ts-runtimes-comparison-betterstack.md +42 -0
  459. package/vault/wiki/sources/ts-strict-mode-rishikc.md +43 -0
  460. package/vault/wiki/sources/unix-philosophy.md +48 -0
  461. package/vault/wiki/sources/vectara-chunking-vs-embedding-naacl2025.md +39 -0
  462. package/vault/wiki/sources/vectara-guardian-agents.md +79 -0
  463. package/vault/wiki/sources/vgrep-semantic-search.md +76 -0
  464. package/vault/wiki/sources/vitest-official.md +41 -0
  465. package/vault/wiki/sources/vscode-pi-community-extension.md +40 -0
  466. package/vault/wiki/sources/wozcode.md +79 -0
  467. package/.agents/skills/compress/SKILL.md +0 -111
  468. package/.agents/skills/compress/scripts/__init__.py +0 -9
  469. package/.agents/skills/compress/scripts/__main__.py +0 -3
  470. package/.agents/skills/compress/scripts/benchmark.py +0 -78
  471. package/.agents/skills/compress/scripts/cli.py +0 -73
  472. package/.agents/skills/compress/scripts/compress.py +0 -227
  473. package/.agents/skills/compress/scripts/detect.py +0 -121
  474. package/.agents/skills/compress/scripts/validate.py +0 -189
  475. package/.agents/skills/emil-design-eng/SKILL.md +0 -679
  476. package/.agents/skills/lean-ctx/SKILL.md +0 -149
  477. package/.agents/skills/lean-ctx/scripts/install.sh +0 -95
  478. package/.agents/skills/scrapling-official/LICENSE.txt +0 -28
  479. package/.agents/skills/scrapling-official/SKILL.md +0 -390
  480. package/.agents/skills/scrapling-official/examples/01_fetcher_session.py +0 -26
  481. package/.agents/skills/scrapling-official/examples/02_dynamic_session.py +0 -26
  482. package/.agents/skills/scrapling-official/examples/03_stealthy_session.py +0 -26
  483. package/.agents/skills/scrapling-official/examples/04_spider.py +0 -58
  484. package/.agents/skills/scrapling-official/examples/README.md +0 -45
  485. package/.agents/skills/scrapling-official/references/fetching/choosing.md +0 -78
  486. package/.agents/skills/scrapling-official/references/fetching/dynamic.md +0 -352
  487. package/.agents/skills/scrapling-official/references/fetching/static.md +0 -432
  488. package/.agents/skills/scrapling-official/references/fetching/stealthy.md +0 -255
  489. package/.agents/skills/scrapling-official/references/mcp-server.md +0 -214
  490. package/.agents/skills/scrapling-official/references/migrating_from_beautifulsoup.md +0 -86
  491. package/.agents/skills/scrapling-official/references/parsing/adaptive.md +0 -212
  492. package/.agents/skills/scrapling-official/references/parsing/main_classes.md +0 -586
  493. package/.agents/skills/scrapling-official/references/parsing/selection.md +0 -494
  494. package/.agents/skills/scrapling-official/references/spiders/advanced.md +0 -344
  495. package/.agents/skills/scrapling-official/references/spiders/architecture.md +0 -94
  496. package/.agents/skills/scrapling-official/references/spiders/getting-started.md +0 -164
  497. package/.agents/skills/scrapling-official/references/spiders/proxy-blocking.md +0 -235
  498. package/.agents/skills/scrapling-official/references/spiders/requests-responses.md +0 -196
  499. package/.agents/skills/scrapling-official/references/spiders/sessions.md +0 -205
  500. package/PLAN.md +0 -11
  501. package/extensions/lean-ctx-enforce.ts +0 -166
  502. package/skills-lock.json +0 -35
  503. package/wiki/README.md +0 -19
  504. package/wiki/decisions/0001-establish-project-wiki-and-decision-record-format.md +0 -25
  505. package/wiki/decisions/0002-add-project-banner-to-readme.md +0 -26
  506. package/wiki/decisions/0003-remove-redundant-readme-title-heading.md +0 -26
  507. package/wiki/decisions/0004-publish-package-to-npm-as-ultimate-pi.md +0 -26
  508. package/wiki/decisions/0005-automate-npm-publish-with-github-actions.md +0 -27
  509. package/wiki/decisions/0006-switch-to-npm-trusted-publishing.md +0 -26
  510. package/wiki/decisions/0007-use-absolute-banner-url-for-npm-readme-rendering.md +0 -26
  511. package/wiki/decisions/0008-rename-banner-asset-for-cache-busting.md +0 -26
  512. package/wiki/decisions/0009-force-oidc-path-by-clearing-node-auth-token-in-publish-step.md +0 -25
  513. package/wiki/decisions/0010-simplify-setup-node-for-npm-trusted-publishing.md +0 -26
  514. package/wiki/decisions/0011-add-noop-workflow-change-to-force-fresh-publish-run.md +0 -25
  515. package/wiki/decisions/0012-align-workflow-runtime-with-npm-trusted-publishing-requirements.md +0 -26
  516. package/wiki/decisions/0013-add-package-repository-url-for-provenance-validation.md +0 -25
@@ -0,0 +1,83 @@
1
+ ---
2
+ type: source
3
+ source_type: blog
4
+ title: "Lovable Architecture & Clone Analysis"
5
+ author: "JIN (blog.devgenius.io), Neel S (Medium), Lovable Docs"
6
+ date_published: 2025-09-05
7
+ url:
8
+ - "https://blog.devgenius.io/lovables-architecture-decoded-how-ai-transforms-intent-into-production-ready-code-ceead05003e4"
9
+ - "https://docs.lovable.dev/introduction/welcome"
10
+ - "https://medium.com/@indraneelsarode22neel/building-a-lovable-clone-inside-the-architecture-of-agentic-ai-platforms-4d423dc53a9c"
11
+ confidence: medium
12
+ key_claims:
13
+ - "Lovable's key innovation is the orchestration layer on top of models, not the models themselves"
14
+ - "Multi-agent architecture: Planner → Architect → Coder with Pydantic-typed handoffs"
15
+ - "Structured outputs (Pydantic schemas) prevent chaos — transforms AI from demo to production"
16
+ - "LangGraph enables state-driven multi-agent workflows with conditional edges"
17
+ - "Groq's sub-100ms inference makes iterative development enjoyable"
18
+ - "Lovable supports full lifecycle: prototyping → deployment → operation with code ownership via GitHub sync"
19
+ tags:
20
+ - lovable
21
+ - multi-agent
22
+ - agentic-ai
23
+ - structured-outputs
24
+ - langgraph
25
+ created: 2026-05-03
26
+ updated: 2026-05-03
27
+ status: ingested
28
+
29
+ ---# Lovable Architecture & Clone Analysis
30
+
31
+ Lovable (formerly GPT Engineer) is a full-stack AI development platform that transforms natural language into production-ready web applications. Built for enterprises with SOC 2 Type II, ISO 27001, and GDPR compliance.
32
+
33
+ ## Key Architecture Insight
34
+
35
+ The critical point: Lovable's breakthrough is not about using better models — it's about the **orchestration layer** sitting on top of them. The system architecture bridges the "intent-to-execution chasm" that raw AI code generators fail at.
36
+
37
+ ## Lovable Clone Architecture (Neel S, Sept 2025)
38
+
39
+ A simplified Lovable clone built with LangGraph, Groq, and Pydantic:
40
+
41
+ ### Three-Agent Pipeline
42
+
43
+ **1. Planner Agent**: Raw user prompt → structured project plan (name, techstack, features, files). Output: Pydantic `Plan` object.
44
+
45
+ **2. Architect Agent**: Project plan → detailed implementation steps with file-specific tasks. Output: `TaskPlan` object.
46
+
47
+ **3. Coder Agent**: Implementation tasks → actual files on disk. Uses ReAct pattern with file system tools (read_file, write_file, list_files).
48
+
49
+ ### State Management
50
+
51
+ State flows through agents as structured dict:
52
+ ```
53
+ {
54
+ "user_prompt": str,
55
+ "plan": Plan,
56
+ "task_plan": TaskPlan,
57
+ "coder_state": CoderState,
58
+ "status": str
59
+ }
60
+ ```
61
+
62
+ LangGraph orchestrates: `graph.add_conditional_edges("coder", lambda s: "END" if s.get("status") == "DONE" else "coder")`
63
+
64
+ ### Key Patterns
65
+
66
+ - **Structured outputs**: `llm.with_structured_output(Plan).invoke(prompt)` — no text parsing
67
+ - **ReAct pattern**: Coder has real tools, not just text generation
68
+ - **Handoffs via validated data contracts**: Each agent produces typed objects for downstream consumption
69
+
70
+ ## Lovable Production Architecture
71
+
72
+ From official docs:
73
+ - **Full-stack**: Frontend, backend, database, authentication, integrations
74
+ - **Code ownership**: Sync to GitHub, integrate into existing workflows
75
+ - **Enterprise**: SOC 2 Type II, ISO 27001, SSO/SCIM
76
+ - **Security**: Built-in checks, data usage controls, data opt-out
77
+
78
+ ## Relevance to AI Coding Harness
79
+
80
+ 1. **Multi-agent decomposition with typed handoffs** is the central pattern — directly applicable to our harness L2 (planning) → L3 (execution) flow.
81
+ 2. **Structured outputs as reliability mechanism** — our harness should enforce schema-validated handoffs between phases, not free-text.
82
+ 3. **State management as first-class concern** — LangGraph's state graph pattern maps well to harness session state.
83
+ 4. **Orchestration layer > model layer** — invest in harness infrastructure, ride model improvements.
@@ -0,0 +1,70 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: engineering-blog
5
+ author: Birgitta Böckeler (Thoughtworks)
6
+ date_published: 2026-04-02
7
+ date_accessed: 2026-05-01
8
+ url: https://martinfowler.com/articles/harness-engineering.html
9
+ confidence: high
10
+ key_claims:
11
+ - Harness = everything in agent except the model itself
12
+ - Two control types: Feedforward (guides, prevent) + Feedback (sensors, self-correct)
13
+ - Two execution types: Computational (deterministic, fast) + Inferential (LLM-based, expensive)
14
+ - Three regulation categories: Maintainability, Architecture Fitness, Behaviour
15
+ - The steering loop: human iterates on harness when issues recur
16
+ - Keep quality left: fast checks pre-commit, expensive checks post-integration
17
+ - Harnessability: not every codebase equally amenable. "Ambient affordances" matter.
18
+ - Ashby's Law: regulator must have at least as much variety as system it governs
19
+ - Behaviour harness is the elephant in the room — unresolved
20
+ created: 2026-05-02
21
+ updated: 2026-05-02
22
+ tags: [source]
23
+ ---
24
+ # Martin Fowler: Harness Engineering for Coding Agent Users
25
+
26
+ ## What It Is
27
+
28
+ Canonical framework for harness engineering from Martin Fowler (Thoughtworks). Published April 2, 2026. Supersedes earlier memo from Feb 2026. Defines the mental model for building trust in coding agents through constraints and feedback loops.
29
+
30
+ ## Core Framework
31
+
32
+ ### Feedforward and Feedback
33
+
34
+ - **Guides (feedforward controls)**: Anticipate agent behaviour, steer before it acts. Increase probability of good first-attempt results.
35
+ - **Sensors (feedback controls)**: Observe after agent acts, help it self-correct. Most powerful when signals are optimized for LLM consumption (e.g., custom linter messages with fix instructions).
36
+ - Separately: agent repeats mistakes (feedback-only) OR encodes rules never tested (feedforward-only). Both needed.
37
+
38
+ ### Computational vs Inferential
39
+
40
+ | Type | Computational | Inferential |
41
+ |------|--------------|-------------|
42
+ | Speed | milliseconds-seconds | seconds-minutes |
43
+ | Cost | cheap | expensive |
44
+ | Determinism | deterministic | non-deterministic |
45
+ | Examples | linters, tests, type checkers | AI code review, "LLM as judge" |
46
+ | Run frequency | every change | selectively |
47
+
48
+ ### The Steering Loop
49
+
50
+ Human iterates on harness. When issue happens multiple times → improve feedforward/feedback controls. Agents can help write harness controls (custom linters, structural tests, how-to guides).
51
+
52
+ ### Regulation Categories
53
+
54
+ 1. **Maintainability harness**: Code quality, conventions. Easiest — lots of pre-existing computational tooling.
55
+ 2. **Architecture fitness harness**: Fitness functions for architecture characteristics (performance, observability, etc.).
56
+ 3. **Behaviour harness**: Functional correctness. "Elephant in the room" — AI-generated tests aren't reliable enough yet. Approved fixtures pattern shows promise.
57
+
58
+ ### Harnessability
59
+
60
+ Not every codebase equally harnessable. Strongly typed languages, clear module boundaries, frameworks that abstract details all increase harnessability. "Ambient affordances" (Ned Letcher): structural properties that make environment legible to agents.
61
+
62
+ ### Harness Templates
63
+
64
+ Pre-bundled guides + sensors for common service topologies (CRUD business service, event processor, data dashboard). Teams may pick tech stacks based on available harness templates.
65
+
66
+ ## Relevance to Ultimate-PI
67
+
68
+ Our 8-layer pipeline directly implements Feedforward+Feedback. L1-L2 (Spec Hardening, Planning) are feedforward. L2.5-L4 (Drift, Grounding, Adversarial) are feedback. L5-L8 (Observability, Memory, Orchestration, Query) are the steering loop infrastructure. Our three drift paradigms map to the three regulation categories: Implementation drift = Maintainability, Spec drift = Behaviour, Tool-call drift crosses all three.
69
+
70
+ Key gap: we don't separate computational vs inferential controls explicitly. Our drift detection is inferential; we could strengthen with computational sensors (custom linters, structural tests).
@@ -0,0 +1,58 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: engineering-blog
5
+ author: Tony Lee
6
+ date_published: 2026-02-12
7
+ date_accessed: 2026-05-01
8
+ url: https://tonylee.im/en/blog/openai-harness-engineering-five-principles-codex/
9
+ confidence: high
10
+ key_claims:
11
+ - OpenAI Codex team built 1M-line product using only agents, zero human-written code
12
+ - Took 1/10th the time vs manual (internal estimate, uncontrolled conditions)
13
+ - Five principles: visibility, capability-gap thinking, mechanical enforcement, agent eyes, map-not-manual
14
+ - Custom concurrency helpers instead of external libraries (API stability favors agents)
15
+ - Custom linters + structural tests enforce layered architecture; linters themselves written by Codex
16
+ - Chrome DevTools Protocol gives agent DOM snapshots, screenshots, navigation
17
+ - "A map, not a manual": ARCHITECTURE.md as bird's-eye view, not exhaustive documentation
18
+ created: 2026-05-02
19
+ updated: 2026-05-02
20
+ tags: [source]
21
+ ---
22
+ # OpenAI Harness Engineering: Five Principles
23
+
24
+ ## What It Is
25
+
26
+ Summary of OpenAI's internal harness engineering practices from their Codex team, which built a 1M-line product using only AI agents (zero human-written code). Based on OpenAI's official post (openai.com/index/harness-engineering, Feb 11, 2026) but that page is 403-walled — this summary from Tony Lee's analysis provides the five principles.
27
+
28
+ ## The Five Principles
29
+
30
+ ### 1. What the Agent Can't See Doesn't Exist
31
+
32
+ All decisions pushed into repository as markdown, schemas, and ExecPlans (PLANS.md). ExecPlan = self-contained design doc written so a beginner could implement end-to-end. Codex worked continuously for 7+ hours on single prompts — only possible with complete, stable context.
33
+
34
+ ### 2. Ask What Capability Is Missing, Not Why the Agent Is Failing
35
+
36
+ When velocity was slow, team asked "what capability is missing?" instead of "why is the agent failing?" Reframed work from prompting harder to instrumenting environment better. Built custom concurrency helpers rather than external libraries — API stability + training data representation favor "boring technology."
37
+
38
+ ### 3. Mechanical Enforcement Over Documentation
39
+
40
+ Enforced invariant rules mechanically (linters, structural tests) rather than prescribing implementation in text. Architecture locked into layered domain structure: Providers → Service → Runtime → UI. Dependency directions verified by linters. Custom linters written by Codex itself.
41
+
42
+ ### 4. Give the Agent Eyes
43
+
44
+ Connected Chrome DevTools Protocol to agent runtime. Pre/post-task snapshot comparison + runtime event observation = agent fixes in loop until clean. Single Codex runs sustained 6+ hours on one task. Temporary observability stack per git worktree: Victoria Logs + Victoria Metrics. Prompts like "make the service start in under 800ms" become executable.
45
+
46
+ ### 5. A Map, Not a Manual
47
+
48
+ ARCHITECTURE.md as bird's-eye view of project structure, including only what rarely changes. Architectural invariants expressed as "something does not exist here" — counterintuitive but effective. Stating boundaries explicitly constrains all downstream implementation.
49
+
50
+ ## Unresolved Questions
51
+
52
+ - Can agent-built system maintain architectural consistency over years? Unknown.
53
+ - How must harness change as models improve? Unknown.
54
+ - 1M-line number represents single internal project under controlled conditions. Extrapolation requires caution.
55
+
56
+ ## Relevance to Ultimate-PI
57
+
58
+ Principle 1 maps to L3 Grounding (everything in repo). Principle 2 maps to our tool-first approach (ck, Gitingest, pi-lean-ctx). Principle 3 maps to L2.5 Drift Monitor + Phase 16 Lint Gate — but we need *mechanical* enforcement (linters), not just drift detection. Principle 4 maps to L5 Observability but we lack browser/visual verification. Principle 5 maps to our wiki/overview.md + index.md but we could formalize ARCHITECTURE.md pattern.
@@ -0,0 +1,101 @@
1
+ ---
2
+ type: source
3
+ source_type: blog
4
+ title: "OpenAI Harness Engineering — 0 Lines of Human Code"
5
+ author: "Ryan Lopopolo, OpenAI Engineering"
6
+ date_published: 2026-02-11
7
+ url: "https://openai.com/index/harness-engineering/"
8
+ confidence: high
9
+ key_claims:
10
+ - "Built a product with 0 lines of manually-written code over 5 months"
11
+ - "~1M lines of code, ~1,500 PRs, 3-7 engineers steering Codex agents"
12
+ - "Average throughput: 3.5 PRs per engineer per day, increasing as team scaled"
13
+ - "Context is a scarce resource — use AGENTS.md as table of contents, not encyclopedia"
14
+ - "Enforce architecture mechanically via custom linters, not via prompts"
15
+ - "Codex can run single tasks for 6+ hours autonomously"
16
+ - "Dedicated doc-gardening agents scan for stale documentation"
17
+ - "Prefer 'boring' technology — easier for agents to model"
18
+ tags:
19
+ - openai
20
+ - codex
21
+ - harness-engineering
22
+ - context-engineering
23
+ - agentic-coding
24
+ created: 2026-05-03
25
+ updated: 2026-05-03
26
+ status: ingested
27
+
28
+ ---# OpenAI Harness Engineering — 0 Lines of Human Code
29
+
30
+ OpenAI Engineering, February 2026. Ryan Lopopolo on building a product with Codex where humans never directly contributed any code.
31
+
32
+ ## Core Philosophy
33
+
34
+ **"Humans steer. Agents execute."** The team's primary job became designing environments, specifying intent, and building feedback loops that allow Codex agents to do reliable work.
35
+
36
+ ## Key Architectural Decisions
37
+
38
+ ### 1. Progressive Disclosure (Maps, Not Encyclopedias)
39
+
40
+ The "one big AGENTS.md" approach failed:
41
+ - Context scarcity: giant file crowds out the task
42
+ - Too much guidance becomes non-guidance
43
+ - Rots instantly — agents can't tell what's stale
44
+ - Hard to verify mechanically
45
+
46
+ Solution: **AGENTS.md as table of contents** (~100 lines), pointing to structured `docs/` directory:
47
+ ```
48
+ docs/
49
+ ├── design-docs/ (index, core beliefs)
50
+ ├── exec-plans/ (active, completed, tech-debt)
51
+ ├── product-specs/ (index, feature specs)
52
+ ├── references/ (design system, tool docs)
53
+ ├── DESIGN.md, FRONTEND.md, PLANS.md, QUALITY_SCORE.md
54
+ ```
55
+
56
+ ### 2. Mechanical Architecture Enforcement
57
+
58
+ Layered domain architecture with strictly validated dependency directions:
59
+ ```
60
+ Types → Config → Repo → Service → Runtime → UI
61
+ ```
62
+ - Cross-cutting concerns enter only through explicit Providers interface
63
+ - Enforced via custom linters and structural tests
64
+ - Error messages injected as remediation instructions into agent context
65
+ - "With agents, constraints become multipliers: once encoded, they apply everywhere at once"
66
+
67
+ ### 3. Agent Legibility as System of Record
68
+
69
+ "From the agent's point of view, anything it can't access in-context while running effectively doesn't exist." Knowledge from Slack, Google Docs, or people's heads is invisible. All knowledge must be encoded into the repository as markdown.
70
+
71
+ ### 4. Environment Control
72
+
73
+ Codex drives apps via Chrome DevTools Protocol: snapshots DOM, navigates, validates UI behavior. Ephemeral observability stack per worktree: logs (LogQL), metrics (PromQL), traces. Single Codex runs work on one task for 6+ hours.
74
+
75
+ ### 5. Garbage Collection for AI Slop
76
+
77
+ Initial approach: humans spent Fridays (20% of week) cleaning "AI slop." Didn't scale. Solution: encode "golden principles" mechanically, run recurring background Codex tasks scanning for deviations, open targeted refactoring PRs. "Technical debt is like a high-interest loan — pay it down continuously in small increments."
78
+
79
+ ### 6. Minimal Blocking Merge Gates
80
+
81
+ PRs are short-lived. Test flakes addressed with follow-up runs. "Corrections are cheap, waiting is expensive." In high-throughput agent systems, this is often the right tradeoff.
82
+
83
+ ## Full Autonomy Achieved
84
+
85
+ Codex can now end-to-end drive a new feature from one prompt:
86
+ 1. Validate codebase state
87
+ 2. Reproduce reported bug
88
+ 3. Record video demonstrating failure
89
+ 4. Implement fix
90
+ 5. Validate fix by driving application
91
+ 6. Record video demonstrating resolution
92
+ 7. Open PR
93
+ 8. Respond to agent and human feedback
94
+ 9. Detect and remediate build failures
95
+ 10. Escalate to human only when judgment required
96
+ 11. Merge the change
97
+
98
+ ## Open Questions (from OpenAI)
99
+ - How does architectural coherence evolve over years in a fully agent-generated system?
100
+ - Where does human judgment add the most leverage?
101
+ - How does the system evolve as models improve?
@@ -0,0 +1,100 @@
1
+ ---
2
+ type: source
3
+ source_type: paper
4
+ title: "OpenDev — Building AI Coding Agents for the Terminal"
5
+ author: "Nghi D. Q. Bui, OpenDev"
6
+ date_published: 2026-03-05
7
+ url: "https://arxiv.org/html/2603.05344v1"
8
+ confidence: high
9
+ key_claims:
10
+ - "First comprehensive technical report for an open-source, terminal-native, interactive coding agent"
11
+ - "Compound AI system: per-workflow LLM binding (action, thinking, critique, vision, compact)"
12
+ - "5-stage adaptive context compaction reduces peak context consumption by ~54%"
13
+ - "Event-driven system reminders counteract instruction fade-out in long sessions"
14
+ - "5-layer defense-in-depth safety architecture (prompt, schema, runtime, tool-level, hooks)"
15
+ - "Lazy MCP tool discovery reduces startup context cost from 40% to <5%"
16
+ - "9-pass fuzzy edit matching chain resolves LLM formatting imprecision"
17
+ tags:
18
+ - opendev
19
+ - terminal-agent
20
+ - context-engineering
21
+ - safety
22
+ - mcp
23
+ - compound-ai
24
+ created: 2026-05-03
25
+ updated: 2026-05-03
26
+ status: ingested
27
+
28
+ ---# OpenDev — Building AI Coding Agents for the Terminal
29
+
30
+ arXiv paper, March 2026. OpenDev is an open-source CLI coding agent with a published technical report — bridging the gap between closed-source industrial practice and open academic discourse.
31
+
32
+ ## Core Architecture
33
+
34
+ ### Compound AI System
35
+ Not a single model but a structured ensemble of agents and workflows, each independently bound to a user-configured LLM. Five model roles with fallback chains:
36
+ - **Action model**: Primary execution model for tool-based reasoning
37
+ - **Thinking model**: Extended reasoning without tool access (prevents premature action)
38
+ - **Critique model**: Self-evaluation (Reflexion-inspired, selective activation)
39
+ - **Vision model**: Vision-language for screenshots/images
40
+ - **Compact model**: Smaller/faster model for summarization during compaction
41
+
42
+ ### Dual-Agent Separation
43
+ Main agent for execution + Planner subagent for planning. Planner has **read-only tools only** — write tools are absent from its schema entirely, making write attempts structurally impossible.
44
+
45
+ ### Extended ReAct Loop
46
+ Four phases per iteration:
47
+ 1. **Context management**: 5-stage adaptive compaction (70% → 99% thresholds)
48
+ 2. **Thinking**: Separate LLM call without tools, at configurable depth (OFF/LOW/MEDIUM/HIGH)
49
+ 3. **Action**: Full LLM call with tool schemas
50
+ 4. **Decision**: Doom-loop detection, tool dispatch, error recovery
51
+
52
+ ## Context Engineering (First-Class Concern)
53
+
54
+ ### Adaptive Context Compaction (ACC)
55
+ Five graduated stages:
56
+ - **Stage 1 (70%)**: Warning — log utilization, no reduction
57
+ - **Stage 2 (80%)**: Observation masking — replace old results with reference pointers
58
+ - **Stage 2.5 (85%)**: Fast pruning — delete old tool outputs beyond recency window
59
+ - **Stage 3 (90%)**: Aggressive masking — only most recent outputs preserved
60
+ - **Stage 4 (99%)**: Full LLM compaction — summarize middle history, preserve recent
61
+
62
+ Result: 54% reduction in peak context consumption. Artifact index tracks all files touched.
63
+
64
+ ### Event-Driven System Reminders
65
+ 24 reminder templates injected as `role: user` messages at decision points. Address attention-decay: after 30+ tool calls, agents silently stop following system prompt instructions. Reminders fire at precise decision points (tool failure, exploration spiral, premature completion, incomplete todos). Guardrail counters prevent noise (max 2-3 nudges per type).
66
+
67
+ ### Dual-Memory Architecture
68
+ - **Episodic memory**: LLM-generated summary of full conversation (strategic context)
69
+ - **Working memory**: Last 6 message pairs verbatim (operational detail)
70
+ - Summary regenerated every 5 messages from full history to prevent drift accumulation
71
+
72
+ ### Dynamic System Prompt Construction
73
+ Priority-ordered conditional sections. Each section has a predicate condition — gets loaded only when contextually relevant (e.g., git workflow section only in git repos). Provider-specific sections for Anthropic vs OpenAI vs Fireworks. Two-part composition for Anthropic prompt caching (88% cost reduction on cached portion).
74
+
75
+ ## Safety — Defense in Depth
76
+
77
+ Five independent safety layers:
78
+ 1. **Prompt-level guardrails**: Security policy, action safety, git workflow
79
+ 2. **Schema-level tool gating**: Dangerous tools invisible to agent, not just blocked
80
+ 3. **Runtime approval system**: Manual/Semi-Auto/Auto levels, persistent permissions, pattern matching
81
+ 4. **Tool-level validation**: DANGEROUS_PATTERNS blocklist, stale-read detection, timeouts
82
+ 5. **Lifecycle hooks**: External scripts intercept 10 lifecycle events, can block or mutate
83
+
84
+ ## Tool System
85
+
86
+ 35 built-in tools across 12 categories. Key innovations:
87
+ - **9-pass fuzzy edit matching**: Absorbs LLM formatting imprecision (trailing whitespace, indentation, escape sequences)
88
+ - **Lazy MCP discovery**: `search_tools` with keyword scoring. Startup context cost: 40% → <5%
89
+ - **Auto-promote server commands**: 16 regex patterns detect dev servers, auto-background them
90
+ - **Dual-mode search**: ripgrep (text) + ast-grep (structural) with LSP for semantic code analysis
91
+
92
+ ## Discussion: Transferable Lessons
93
+
94
+ 1. **Context is a budget, not a buffer** — graduated reduction beats binary emergency compaction
95
+ 2. **Inject reminders at decision points, not upfront** — `role: user` beats `role: system`
96
+ 3. **Separate thinking from action** — absence of tool schemas changes behavior, not instructions
97
+ 4. **Make unsafe tools invisible, not blocked** — schema gating > runtime permission checks
98
+ 5. **Design tools to absorb LLM imprecision** — chain-of-responsibility matchers convert near-misses
99
+ 6. **Bound every resource that grows with session length** — caps on everything
100
+ 7. **Calibrate from API-reported token counts, not local estimates** — providers inject invisible content
@@ -0,0 +1,53 @@
1
+ ---
2
+ type: source
3
+ status: ingested
4
+ source_type: benchmark-report
5
+ author: Mitch Alderson (Render)
6
+ date_published: 2025-08-12
7
+ date_accessed: 2026-05-01
8
+ url: https://render.com/blog/ai-coding-agents-benchmark
9
+ confidence: high
10
+ key_claims:
11
+ - Cursor leads overall (8/10): best setup speed, Docker/Render deployment, code quality
12
+ - Claude Code (6.8/10): best for rapid prototypes, productive terminal UX
13
+ - Gemini CLI (6.8/10): wins large-context refactors, weak on greenfield
14
+ - OpenAI Codex (6/10): powerful model, hampered by UX issues
15
+ - Gemini CLI pattern: excels at editing existing codebases (context-driven), struggles generating from scratch
16
+ - Free tier: 60 req/min, 1,000 req/day (industry best)
17
+ created: 2026-05-02
18
+ updated: 2026-05-02
19
+ tags: [source]
20
+ ---
21
+ # Render AI Coding Agents Benchmark (August 2025)
22
+
23
+ ## What It Is
24
+
25
+ Independent benchmark comparing Cursor, Claude Code, Gemini CLI, and OpenAI Codex on production codebases in 2025. Two test categories: "vibe coding" (greenfield URL shortener) and production code tasks (Go monorepo, Astro.js site).
26
+
27
+ ## Final Scores
28
+
29
+ | Tool | Setup | Cost | Quality | Context | Integration | Speed | Specialized | **Avg** |
30
+ |------|-------|------|---------|---------|-------------|-------|-------------|---------|
31
+ | Cursor | 9 | 5 | 9 | 8 | 8 | 9 | 8 | **8** |
32
+ | Claude Code | 8 | 6 | 7 | 5 | 9 | 7 | 6 | **6.8** |
33
+ | Gemini CLI | 6 | 8 | 7 | 9 | 5 | 5 | 8 | **6.8** |
34
+ | Codex | 3 | 6 | 8 | 7 | 4 | 7 | 7 | **6** |
35
+
36
+ ## Gemini CLI Specific Findings
37
+
38
+ - **Context: 9/10** — best in class. 1M token window + automatic codebase loading. Loaded most/all relevant files without manual intervention.
39
+ - **Quality: 7/10** — solid on production refactors (first-try Go refactor with proper error handling), but 3/10 on vibe coding (7 follow-up error prompts needed, barebones output).
40
+ - **Speed: 5/10** — slow due to automatic full-context loading.
41
+ - **Hypothesis** (unconfirmed): Gemini may be tuned to make decisions based on context rather than pre-training, favoring editing existing codebases over generating from scratch.
42
+
43
+ ## Key Takeaways
44
+
45
+ - Each tool excels in different areas; no single winner
46
+ - For production refactoring: Gemini + Cursor best (context matters most)
47
+ - For greenfield: Cursor + Claude Code best (model quality + UX matters)
48
+ - AI agents best used by experienced engineers who audit output
49
+ - All agents were great as "error assistants" — troubleshooting via chat
50
+
51
+ ## Relevance to Ultimate-PI
52
+
53
+ Gemini CLI's context-driven approach validates our L3 Grounding layer (Gitingest + ck). The benchmark's finding that context quality beats model quality for production tasks reinforces our first-principles decision to invest heavily in grounding/context engineering.
@@ -0,0 +1,70 @@
1
+ ---
2
+ type: source
3
+ source_type: news_product
4
+ title: "Rocket.new — Vibe Solutioning Platform"
5
+ author: "Rocket (website), Jagmeet Singh (TechCrunch)"
6
+ date_published: 2026-04-06
7
+ url:
8
+ - "https://www.rocket.new/"
9
+ - "https://techcrunch.com/2026/04/06/indian-startup-rocket-wants-its-ai-to-do-mckinsey-style-consulting-at-a-fraction-of-the-cost/"
10
+ confidence: medium
11
+ key_claims:
12
+ - "World's first Vibe Solutioning platform: strategy → build → competitive intelligence in one system"
13
+ - "Code generation is a commodity — deciding what to build is the missing piece"
14
+ - "$15M seed from Accel, Salesforce Ventures, Together Fund"
15
+ - "1.5M users across 180 countries, 57 employees, based in Surat, India"
16
+ - "Generates 'McKinsey-grade' consulting-style reports from simple prompts"
17
+ - "Subscriptions: $25-$350/month"
18
+ tags:
19
+ - rocket
20
+ - vibe-solutioning
21
+ - strategy
22
+ - competitive-intelligence
23
+ created: 2026-05-03
24
+ updated: 2026-05-03
25
+ status: ingested
26
+
27
+ ---# Rocket.new — Vibe Solutioning Platform
28
+
29
+ Rocket describes itself as the world's first "Vibe Solutioning" platform. It covers the full arc from market research → product strategy → app building → competitive intelligence in one system with shared context.
30
+
31
+ ## Three Capabilities
32
+
33
+ ### 1. Solve (The Thinking Before the Build)
34
+ - Describe a market problem → AI returns research, evidence, and recommendation
35
+ - Outputs: market analysis, what-to-build, GTM strategy, PRD, regulatory research
36
+ - "Ready to present to a room, hand to a developer, or take straight into Build"
37
+ - Draws on 1,000+ data sources: Meta ad libraries, Similarweb API, own crawlers
38
+
39
+ ### 2. Build (Production-Grade from First Prompt)
40
+ - Web apps, mobile apps, landing pages, SaaS, internal tools, dashboards
41
+ - Import from Figma, reimagine existing designs
42
+ - One-click deploy with staging and production environments
43
+ - Claims "100x better than anything else" in user testimonials
44
+
45
+ ### 3. Intelligence (Know What Your Competition Just Did)
46
+ - Continuous monitoring of competitor pricing, messaging, launches, website changes
47
+ - Daily briefs, hiring signals, social media intel
48
+ - "Rocket saw it, connected it, and already knows what it means for you"
49
+
50
+ ## Business Model
51
+ - $25/mo: Build only
52
+ - $250/mo: Strategy + Research (2-3 "McKinsey-grade" reports + builds)
53
+ - $350/mo: Full platform including competitive intelligence
54
+ - ARPU ~$4,000/year; 20-30% customers are SMBs
55
+ - Gross margins >50%
56
+
57
+ ## Funding & Traction
58
+ - $15M seed (Sept 2025) from Accel, Salesforce Ventures, Together Fund
59
+ - Grew from 400K to 1.5M+ users post-funding
60
+ - Founded by Vishal Virani (previously co-founder of DhiWise, which pivoted to Rocket)
61
+
62
+ ## Key Thesis
63
+ "Everyone can generate the code now — it has become a commodity. But what to build is something which everyone is missing. Running a business and just building a codebase are two different things."
64
+
65
+ ## Relevance to AI Coding Harness
66
+
67
+ 1. **Pre-build strategy layer**: Rocket validates that the "what to build" gap is real and commercially viable. Our harness could integrate a planning phase that does market/competitive analysis before generating code.
68
+ 2. **Shared context across lifecycle**: From strategy → build → monitoring, context compounds. A harness should treat context as persistent across all phases, not resetting between planning and coding.
69
+ 3. **Competitive intelligence as feedback loop**: After deployment, monitor what competitors do and feed that back into the planning phase. This creates a continuous improvement loop.
70
+ 4. **Limitation noted by TechCrunch**: Analysis is synthesized from existing data, not independently verifiable. Users should validate outputs before business decisions.
@@ -0,0 +1,71 @@
1
+ ---
2
+ type: source
3
+ source_type: newsletter
4
+ title: "SwirlAI — Agent Skills Progressive Disclosure"
5
+ author: "Aurimas Griciūnas"
6
+ date_published: 2026-03-11
7
+ url: "https://www.newsletter.swirlai.com/p/agent-skills-progressive-disclosure"
8
+ confidence: high
9
+ key_claims:
10
+ - "Agent Skills use three-tier progressive disclosure: Discovery (~80 tokens/skill), Activation (~2,000 tokens median), Execution (unlimited supporting files)"
11
+ - "Anthropic released Agent Skills open standard Dec 18, 2025. Within weeks, OpenAI, Google, GitHub, Cursor adopted it."
12
+ - "Skills marketplaces like SkillsMP index over 400,000 skills across platforms."
13
+ - "Progressive disclosure is a SYSTEM DESIGN PATTERN, not just a coding agent feature."
14
+ - "Context windows are finite and lossy — models miss information in the middle of long contexts ('lost in the middle')"
15
+ - "Best practice: fewer than 20 tools available to an agent, accuracy degrades past 10"
16
+ - "Skill description quality directly determines routing accuracy — Claude selects skills through pure LLM reasoning"
17
+ tags: [source, skills, progressive-disclosure, agent-architecture]
18
+ related:
19
+ - "[[agent-skills-pattern]]"
20
+ - "[[progressive-disclosure-agents]]"
21
+ - "[[skill-first-architecture]]"
22
+ ---
23
+
24
+ # SwirlAI — Agent Skills: Progressive Disclosure as a System Design Pattern
25
+
26
+ ## Summary
27
+
28
+ Comprehensive analysis by Aurimas Griciūnas (SwirlAI Newsletter, 35K+ subscribers) on why Agent Skills became an industry standard within weeks. Published March 11, 2026 — three months after Anthropic's open standard release.
29
+
30
+ ## Key Contributions
31
+
32
+ ### Three-Tier Progressive Disclosure Architecture
33
+
34
+ The `SKILL.md` file organizes information into three layers. The platform implements the loading logic.
35
+
36
+ **Layer 1: Discovery** (~80 tokens/skill median). At startup, the platform reads only `name` and `description` from YAML frontmatter. All 17 of Anthropic's official skills together cost ~1,700 tokens at discovery — an agent can be aware of dozens of skills for less context than a single activated skill.
37
+
38
+ **Layer 2: Activation** (~2,000 tokens median). When the platform determines a skill is relevant, it loads the full `SKILL.md` markdown body. Body sizes range from ~275 tokens (internal-comms) to ~8,000 tokens (skill-creator).
39
+
40
+ **Layer 3: Execution** (unlimited). Supporting files (scripts, reference docs, templates, configs) loaded on demand. Scripts execute without their code entering context — only output consumes tokens.
41
+
42
+ ### Industry Adoption Speed
43
+
44
+ - **Dec 18, 2025**: Anthropic releases open standard
45
+ - **Within weeks**: OpenAI (Codex CLI, ChatGPT), Google (Gemini CLI), GitHub Copilot, Cursor all adopt
46
+ - **By Mar 2026**: SkillsMP indexes 400,000+ skills
47
+
48
+ > "Every one of these platforms faces the same two problems: how to give agents broad knowledge without destroying context quality, and how to let users configure agent behavior without requiring engineering expertise. The skills format solves both."
49
+
50
+ ### Non-Coding Applications
51
+
52
+ OpenClaw (175K GitHub stars in <2 weeks) demonstrates the pattern works beyond coding agents: calendar management, email drafting, smart home control, meal planning, cross-platform coordination. Community registry ClawHub hosts 13,000+ skills, most non-technical.
53
+
54
+ ### Context Engineering
55
+
56
+ > "Best practice recommends fewer than 20 tools available to an agent at once, with accuracy degrading past 10. The same principle applies to instructions."
57
+
58
+ Context windows are finite and lossy. The "lost in the middle" phenomenon: models reliably miss information placed in the middle of long contexts.
59
+
60
+ ## What We Adopt
61
+
62
+ - Three-tier progressive disclosure as the architectural model for harness skills
63
+ - Skills as the atomic unit of harness behavior (not code modules)
64
+ - Description quality as the routing mechanism (not keyword matching)
65
+ - The insight that markdown skills make agent behavior configurable by non-engineers
66
+
67
+ ## What We Note
68
+
69
+ - The ecosystem moved fast because the problem (context bloat + configuration accessibility) is universal
70
+ - Skills compose with hooks — skills can define deterministic behavior in frontmatter
71
+ - Marketplaces are forming — our harness skills could be published to SkillsMP