ultimate-pi 0.1.7 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (524)
  1. package/.agents/skills/graphify/.graphify_version +1 -0
  2. package/.agents/skills/graphify/SKILL.md +1204 -0
  3. package/.agents/skills/wiki-autoresearch/SKILL.md +225 -97
  4. package/.agents/skills/wiki-autoresearch/references/program.md +28 -62
  5. package/.agents/skills/wiki-autoresearch/references/quality-sites.md +32 -0
  6. package/.env.example +5 -1
  7. package/.gitattributes +1 -0
  8. package/.github/workflows/publish-github-packages.yml +1 -1
  9. package/.pi/SYSTEM.md +72 -18
  10. package/.pi/agents/harness/adversary.md +32 -0
  11. package/.pi/agents/harness/evaluator.md +32 -0
  12. package/.pi/agents/harness/executor.md +34 -0
  13. package/.pi/agents/harness/meta-optimizer.md +33 -0
  14. package/.pi/agents/harness/planner.md +33 -0
  15. package/.pi/agents/harness/tie-breaker.md +35 -0
  16. package/.pi/agents/harness/trace-librarian.md +32 -0
  17. package/.pi/extensions/banner.png +0 -0
  18. package/.pi/extensions/budget-guard.ts +265 -0
  19. package/.pi/extensions/custom-footer.ts +194 -22
  20. package/.pi/extensions/custom-header.ts +47 -9
  21. package/.pi/extensions/debate-orchestrator.ts +479 -0
  22. package/.pi/extensions/harness-live-widget.ts +438 -0
  23. package/.pi/extensions/policy-gate.ts +349 -0
  24. package/.pi/extensions/review-integrity.ts +198 -0
  25. package/.pi/extensions/test-diff-integrity.ts +240 -0
  26. package/.pi/extensions/trace-recorder.ts +315 -0
  27. package/.pi/harness/README.md +23 -0
  28. package/.pi/harness/router/README.md +35 -0
  29. package/.pi/harness/router/apply-router-proposal.mjs +153 -0
  30. package/.pi/harness/router/propose-router-tuning.mjs +149 -0
  31. package/.pi/harness/specs/README.md +37 -0
  32. package/.pi/harness/specs/adversary-report.schema.json +53 -0
  33. package/.pi/harness/specs/budget-exhausted-event.schema.json +93 -0
  34. package/.pi/harness/specs/consensus-packet.schema.json +175 -0
  35. package/.pi/harness/specs/eval-verdict.schema.json +59 -0
  36. package/.pi/harness/specs/incident-record.schema.json +84 -0
  37. package/.pi/harness/specs/plan-packet.schema.json +90 -0
  38. package/.pi/harness/specs/round-result.schema.json +126 -0
  39. package/.pi/harness/specs/router-tuning-proposal.schema.json +114 -0
  40. package/.pi/harness/specs/run-trace.schema.json +107 -0
  41. package/.pi/lib/harness-ui-state.ts +311 -0
  42. package/.pi/mcp.json +4 -0
  43. package/.pi/model-router.json +93 -93
  44. package/.pi/prompts/graphify.md +23 -0
  45. package/.pi/prompts/harness-abort.md +41 -0
  46. package/.pi/prompts/harness-auto.md +83 -0
  47. package/.pi/prompts/harness-critic.md +52 -0
  48. package/.pi/prompts/harness-eval.md +51 -0
  49. package/.pi/prompts/harness-incident.md +51 -0
  50. package/.pi/prompts/harness-plan.md +64 -0
  51. package/.pi/prompts/harness-review.md +52 -0
  52. package/.pi/prompts/harness-router-tune.md +74 -0
  53. package/.pi/prompts/harness-run.md +59 -0
  54. package/.pi/prompts/harness-setup.md +316 -216
  55. package/.pi/prompts/harness-trace.md +51 -0
  56. package/.pi/prompts/wiki-autoresearch.md +9 -7
  57. package/.pi/prompts/wiki-save.md +20 -0
  58. package/.pi/skills/agent-router/SKILL.md +2 -4
  59. package/.pi/skills/ast-grep/SKILL.md +354 -0
  60. package/.pi/sounds/project-sounds.json +18 -24
  61. package/AGENTS.md +30 -0
  62. package/CHANGELOG.md +89 -0
  63. package/CONTRIBUTING.md +51 -1
  64. package/README.md +264 -20
  65. package/biome.json +8 -2
  66. package/lefthook.yml +3 -2
  67. package/node_modules/@sting8k/pi-vcc/README.md +200 -0
  68. package/node_modules/@sting8k/pi-vcc/index.ts +14 -0
  69. package/node_modules/@sting8k/pi-vcc/package.json +26 -0
  70. package/node_modules/@sting8k/pi-vcc/scripts/audit-sessions.ts +88 -0
  71. package/node_modules/@sting8k/pi-vcc/scripts/benchmark-real-sessions.ts +25 -0
  72. package/node_modules/@sting8k/pi-vcc/scripts/compare-before-after.ts +36 -0
  73. package/node_modules/@sting8k/pi-vcc/scripts/dump-branch-output.ts +20 -0
  74. package/node_modules/@sting8k/pi-vcc/src/commands/pi-vcc.ts +36 -0
  75. package/node_modules/@sting8k/pi-vcc/src/commands/vcc-recall.ts +65 -0
  76. package/node_modules/@sting8k/pi-vcc/src/core/brief.ts +381 -0
  77. package/node_modules/@sting8k/pi-vcc/src/core/build-sections.ts +79 -0
  78. package/node_modules/@sting8k/pi-vcc/src/core/content.ts +60 -0
  79. package/node_modules/@sting8k/pi-vcc/src/core/filter-noise.ts +42 -0
  80. package/node_modules/@sting8k/pi-vcc/src/core/format-recall.ts +27 -0
  81. package/node_modules/@sting8k/pi-vcc/src/core/format.ts +49 -0
  82. package/node_modules/@sting8k/pi-vcc/src/core/lineage.ts +26 -0
  83. package/node_modules/@sting8k/pi-vcc/src/core/load-messages.ts +41 -0
  84. package/node_modules/@sting8k/pi-vcc/src/core/normalize.ts +66 -0
  85. package/node_modules/@sting8k/pi-vcc/src/core/recall-scope.ts +14 -0
  86. package/node_modules/@sting8k/pi-vcc/src/core/render-entries.ts +55 -0
  87. package/node_modules/@sting8k/pi-vcc/src/core/report.ts +237 -0
  88. package/node_modules/@sting8k/pi-vcc/src/core/sanitize.ts +5 -0
  89. package/node_modules/@sting8k/pi-vcc/src/core/search-entries.ts +221 -0
  90. package/node_modules/@sting8k/pi-vcc/src/core/settings.ts +77 -0
  91. package/node_modules/@sting8k/pi-vcc/src/core/skill-collapse.ts +35 -0
  92. package/node_modules/@sting8k/pi-vcc/src/core/summarize.ts +157 -0
  93. package/node_modules/@sting8k/pi-vcc/src/core/tool-args.ts +14 -0
  94. package/node_modules/@sting8k/pi-vcc/src/details.ts +7 -0
  95. package/node_modules/@sting8k/pi-vcc/src/extract/commits.ts +69 -0
  96. package/node_modules/@sting8k/pi-vcc/src/extract/files.ts +80 -0
  97. package/node_modules/@sting8k/pi-vcc/src/extract/goals.ts +79 -0
  98. package/node_modules/@sting8k/pi-vcc/src/extract/preferences.ts +55 -0
  99. package/node_modules/@sting8k/pi-vcc/src/hooks/before-compact.ts +322 -0
  100. package/node_modules/@sting8k/pi-vcc/src/sections.ts +12 -0
  101. package/node_modules/@sting8k/pi-vcc/src/tools/recall.ts +109 -0
  102. package/node_modules/@sting8k/pi-vcc/src/types.ts +14 -0
  103. package/node_modules/@sting8k/pi-vcc/tests/before-compact-hook.test.ts +181 -0
  104. package/node_modules/@sting8k/pi-vcc/tests/before-compact.test.ts +140 -0
  105. package/node_modules/@sting8k/pi-vcc/tests/brief.test.ts +206 -0
  106. package/node_modules/@sting8k/pi-vcc/tests/build-sections.test.ts +59 -0
  107. package/node_modules/@sting8k/pi-vcc/tests/compile.test.ts +80 -0
  108. package/node_modules/@sting8k/pi-vcc/tests/content.test.ts +31 -0
  109. package/node_modules/@sting8k/pi-vcc/tests/extract-goals.test.ts +86 -0
  110. package/node_modules/@sting8k/pi-vcc/tests/extract-preferences.test.ts +30 -0
  111. package/node_modules/@sting8k/pi-vcc/tests/filter-noise.test.ts +61 -0
  112. package/node_modules/@sting8k/pi-vcc/tests/fixtures.ts +61 -0
  113. package/node_modules/@sting8k/pi-vcc/tests/format-recall.test.ts +30 -0
  114. package/node_modules/@sting8k/pi-vcc/tests/format.test.ts +62 -0
  115. package/node_modules/@sting8k/pi-vcc/tests/lineage.test.ts +33 -0
  116. package/node_modules/@sting8k/pi-vcc/tests/load-messages.test.ts +51 -0
  117. package/node_modules/@sting8k/pi-vcc/tests/normalize.test.ts +97 -0
  118. package/node_modules/@sting8k/pi-vcc/tests/real-sessions.test.ts +38 -0
  119. package/node_modules/@sting8k/pi-vcc/tests/recall-expand.test.ts +15 -0
  120. package/node_modules/@sting8k/pi-vcc/tests/recall-scope.test.ts +32 -0
  121. package/node_modules/@sting8k/pi-vcc/tests/recall-tool-scope.test.ts +67 -0
  122. package/node_modules/@sting8k/pi-vcc/tests/render-entries.test.ts +62 -0
  123. package/node_modules/@sting8k/pi-vcc/tests/report.test.ts +44 -0
  124. package/node_modules/@sting8k/pi-vcc/tests/sanitize.test.ts +24 -0
  125. package/node_modules/@sting8k/pi-vcc/tests/search-entries.test.ts +144 -0
  126. package/node_modules/@sting8k/pi-vcc/tests/support/load-session.ts +23 -0
  127. package/node_modules/@sting8k/pi-vcc/tests/support/real-sessions.ts +51 -0
  128. package/package.json +15 -4
  129. package/scripts/__pycache__/merge_graphify_corpora.cpython-314.pyc +0 -0
  130. package/scripts/index_youtube_urls.py +376 -0
  131. package/scripts/merge_graphify_corpora.py +398 -0
  132. package/scripts/regen_graphify_html.py +46 -0
  133. package/.agents/skills/defuddle/SKILL.md +0 -90
  134. package/.agents/skills/wiki/SKILL.md +0 -215
  135. package/.agents/skills/wiki/references/css-snippets.md +0 -122
  136. package/.agents/skills/wiki/references/frontmatter.md +0 -107
  137. package/.agents/skills/wiki/references/git-setup.md +0 -58
  138. package/.agents/skills/wiki/references/mcp-setup.md +0 -149
  139. package/.agents/skills/wiki/references/modes.md +0 -259
  140. package/.agents/skills/wiki/references/plugins.md +0 -96
  141. package/.agents/skills/wiki/references/rest-api.md +0 -124
  142. package/.agents/skills/wiki-fold/SKILL.md +0 -204
  143. package/.agents/skills/wiki-fold/references/fold-template.md +0 -133
  144. package/.agents/skills/wiki-ingest/SKILL.md +0 -288
  145. package/.agents/skills/wiki-lint/SKILL.md +0 -183
  146. package/.agents/skills/wiki-query/SKILL.md +0 -176
  147. package/.pi/agents/rethink.md +0 -140
  148. package/.pi/agents/wiki-ingest.md +0 -67
  149. package/.pi/agents/wiki-lint.md +0 -75
  150. package/.pi/internal/cursor-sdk-transcript-parser.ts +0 -59
  151. package/.pi/prompts/save.md +0 -16
  152. package/.pi/prompts/wiki.md +0 -23
  153. package/.pi/providers/cursor-sdk-provider.test.mjs +0 -476
  154. package/.pi/providers/cursor-sdk-provider.ts +0 -1085
  155. package/vault/AGENTS.md +0 -37
  156. package/vault/wiki/_templates/comparison.md +0 -39
  157. package/vault/wiki/_templates/concept.md +0 -40
  158. package/vault/wiki/_templates/decision.md +0 -21
  159. package/vault/wiki/_templates/entity.md +0 -32
  160. package/vault/wiki/_templates/flow.md +0 -14
  161. package/vault/wiki/_templates/module.md +0 -18
  162. package/vault/wiki/_templates/question.md +0 -31
  163. package/vault/wiki/_templates/source.md +0 -39
  164. package/vault/wiki/concepts/AST-Aware Code Chunking.md +0 -44
  165. package/vault/wiki/concepts/Build-Time Prompt Compilation.md +0 -107
  166. package/vault/wiki/concepts/Context Engine (AI Coding).md +0 -47
  167. package/vault/wiki/concepts/Context-Aware System Reminders.md +0 -61
  168. package/vault/wiki/concepts/Contextualized Text Embedding.md +0 -42
  169. package/vault/wiki/concepts/Contractor vs Employee AI Model.md +0 -55
  170. package/vault/wiki/concepts/Dual-Model Agent Architecture.md +0 -65
  171. package/vault/wiki/concepts/Late Chunking vs Early Chunking.md +0 -43
  172. package/vault/wiki/concepts/Majority Vote Ensembling.md +0 -68
  173. package/vault/wiki/concepts/Meta-Harness.md +0 -16
  174. package/vault/wiki/concepts/Multi-Agent AI Coding Architecture.md +0 -75
  175. package/vault/wiki/concepts/Prompt Enhancement.md +0 -90
  176. package/vault/wiki/concepts/Prompt Renderer.md +0 -89
  177. package/vault/wiki/concepts/Semantic Codebase Indexing.md +0 -67
  178. package/vault/wiki/concepts/additive-config-hierarchy.md +0 -16
  179. package/vault/wiki/concepts/agent-artifacts-verifiable-deliverables.md +0 -71
  180. package/vault/wiki/concepts/agent-browser-browser-automation.md +0 -99
  181. package/vault/wiki/concepts/agent-codebase-interface.md +0 -43
  182. package/vault/wiki/concepts/agent-harness-architecture.md +0 -67
  183. package/vault/wiki/concepts/agent-loop-detection-patterns.md +0 -133
  184. package/vault/wiki/concepts/agent-search-enforcement.md +0 -126
  185. package/vault/wiki/concepts/agent-skills-ecosystem.md +0 -74
  186. package/vault/wiki/concepts/agent-skills-pattern.md +0 -68
  187. package/vault/wiki/concepts/agentic-harness-context-enforcement.md +0 -91
  188. package/vault/wiki/concepts/agentic-harness.md +0 -34
  189. package/vault/wiki/concepts/agentic-orchestration-pipeline.md +0 -56
  190. package/vault/wiki/concepts/agentic-search-no-embeddings.md +0 -18
  191. package/vault/wiki/concepts/anthropic-context-engineering.md +0 -13
  192. package/vault/wiki/concepts/antigravity-agent-first-architecture.md +0 -61
  193. package/vault/wiki/concepts/ast-compression.md +0 -19
  194. package/vault/wiki/concepts/ast-truncation.md +0 -66
  195. package/vault/wiki/concepts/barrel-files.md +0 -37
  196. package/vault/wiki/concepts/browser-harness-agent.md +0 -41
  197. package/vault/wiki/concepts/browser-subagent-visual-verification.md +0 -82
  198. package/vault/wiki/concepts/codebase-intelligence-ecosystem-comparison.md +0 -192
  199. package/vault/wiki/concepts/codebase-intelligence-harness-integration.md +0 -161
  200. package/vault/wiki/concepts/codebase-to-context-ingestion.md +0 -46
  201. package/vault/wiki/concepts/codex-harness-innovations.md +0 -147
  202. package/vault/wiki/concepts/consensus-debate-flow.md +0 -17
  203. package/vault/wiki/concepts/consensus-debate.md +0 -206
  204. package/vault/wiki/concepts/content-addressed-spec-identity.md +0 -166
  205. package/vault/wiki/concepts/context-anxiety.md +0 -57
  206. package/vault/wiki/concepts/context-compression-techniques.md +0 -19
  207. package/vault/wiki/concepts/context-continuity.md +0 -22
  208. package/vault/wiki/concepts/context-drift-in-agents.md +0 -106
  209. package/vault/wiki/concepts/context-engineering.md +0 -62
  210. package/vault/wiki/concepts/context-folding.md +0 -67
  211. package/vault/wiki/concepts/context-mode.md +0 -38
  212. package/vault/wiki/concepts/cursor-harness-innovations.md +0 -107
  213. package/vault/wiki/concepts/deterministic-session-compaction.md +0 -79
  214. package/vault/wiki/concepts/drift-detection-unified.md +0 -296
  215. package/vault/wiki/concepts/execution-feedback-loop.md +0 -46
  216. package/vault/wiki/concepts/feedforward-feedback-harness.md +0 -60
  217. package/vault/wiki/concepts/five-root-cause-metrics-sentrux.md +0 -40
  218. package/vault/wiki/concepts/fork-safe-spec-storage.md +0 -89
  219. package/vault/wiki/concepts/fts5-sandbox.md +0 -19
  220. package/vault/wiki/concepts/fuzzy-edit-matching.md +0 -71
  221. package/vault/wiki/concepts/gemini-cli-architecture.md +0 -104
  222. package/vault/wiki/concepts/generator-evaluator-architecture.md +0 -64
  223. package/vault/wiki/concepts/guardian-agent-pattern.md +0 -67
  224. package/vault/wiki/concepts/harness-configuration-layers.md +0 -89
  225. package/vault/wiki/concepts/harness-control-frameworks.md +0 -155
  226. package/vault/wiki/concepts/harness-engineering-first-principles.md +0 -90
  227. package/vault/wiki/concepts/harness-h-formalism.md +0 -53
  228. package/vault/wiki/concepts/hybrid-code-search.md +0 -61
  229. package/vault/wiki/concepts/inline-post-edit-validation.md +0 -112
  230. package/vault/wiki/concepts/legendary-engineering-patterns-harness.md +0 -110
  231. package/vault/wiki/concepts/lifecycle-hooks.md +0 -94
  232. package/vault/wiki/concepts/mcp-tool-routing.md +0 -102
  233. package/vault/wiki/concepts/memory-system-of-record-vs-ephemeral-cache.md +0 -47
  234. package/vault/wiki/concepts/meta-agent-context-pruning.md +0 -151
  235. package/vault/wiki/concepts/model-adaptive-harness.md +0 -122
  236. package/vault/wiki/concepts/model-routing-agents.md +0 -101
  237. package/vault/wiki/concepts/monorepo-architecture.md +0 -45
  238. package/vault/wiki/concepts/multi-agent-specialization.md +0 -61
  239. package/vault/wiki/concepts/permission-subsystem.md +0 -16
  240. package/vault/wiki/concepts/pi-messenger-analysis.md +0 -243
  241. package/vault/wiki/concepts/pi-vscode-extension-landscape.md +0 -37
  242. package/vault/wiki/concepts/policy-engine-pattern.md +0 -78
  243. package/vault/wiki/concepts/progressive-disclosure-agents.md +0 -53
  244. package/vault/wiki/concepts/progressive-skill-disclosure.md +0 -17
  245. package/vault/wiki/concepts/provider-native-prompting.md +0 -203
  246. package/vault/wiki/concepts/quality-signal-sentrux.md +0 -37
  247. package/vault/wiki/concepts/repo-map-ranking.md +0 -42
  248. package/vault/wiki/concepts/result-monad-error-handling.md +0 -47
  249. package/vault/wiki/concepts/safety-defense-in-depth.md +0 -83
  250. package/vault/wiki/concepts/sandbox-os-enforcement.md +0 -18
  251. package/vault/wiki/concepts/selective-debate-routing.md +0 -70
  252. package/vault/wiki/concepts/self-evolving-harness.md +0 -60
  253. package/vault/wiki/concepts/sentrux-mcp-integration.md +0 -36
  254. package/vault/wiki/concepts/sentrux-rules-engine.md +0 -49
  255. package/vault/wiki/concepts/shell-pattern-compression.md +0 -24
  256. package/vault/wiki/concepts/skill-first-architecture.md +0 -166
  257. package/vault/wiki/concepts/structured-compaction.md +0 -78
  258. package/vault/wiki/concepts/subagent-orchestration.md +0 -17
  259. package/vault/wiki/concepts/subagent-worktree-isolation.md +0 -68
  260. package/vault/wiki/concepts/superpowers-methodology.md +0 -78
  261. package/vault/wiki/concepts/think-in-code.md +0 -73
  262. package/vault/wiki/concepts/ts-execution-layer.md +0 -100
  263. package/vault/wiki/concepts/typescript-strict-mode.md +0 -37
  264. package/vault/wiki/concepts/vcc-conversation-compaction-for-pi.md +0 -53
  265. package/vault/wiki/concepts/verification-drift-detection.md +0 -19
  266. package/vault/wiki/consensus/consensus-records.md +0 -58
  267. package/vault/wiki/decisions/2026-04-30-pi-lean-ctx-native.md +0 -122
  268. package/vault/wiki/decisions/2026-05-07-replace-lean-ctx-with-context-mode.md +0 -59
  269. package/vault/wiki/decisions/adr-008.md +0 -40
  270. package/vault/wiki/decisions/adr-009.md +0 -46
  271. package/vault/wiki/decisions/adr-010.md +0 -55
  272. package/vault/wiki/decisions/adr-011.md +0 -165
  273. package/vault/wiki/decisions/adr-012.md +0 -102
  274. package/vault/wiki/decisions/adr-013.md +0 -59
  275. package/vault/wiki/decisions/adr-014.md +0 -73
  276. package/vault/wiki/decisions/adr-015.md +0 -81
  277. package/vault/wiki/decisions/adr-016.md +0 -91
  278. package/vault/wiki/decisions/adr-017.md +0 -79
  279. package/vault/wiki/decisions/adr-018.md +0 -100
  280. package/vault/wiki/decisions/adr-019.md +0 -75
  281. package/vault/wiki/decisions/adr-020.md +0 -106
  282. package/vault/wiki/decisions/adr-021.md +0 -86
  283. package/vault/wiki/decisions/adr-022.md +0 -113
  284. package/vault/wiki/decisions/adr-023.md +0 -113
  285. package/vault/wiki/decisions/adr-024.md +0 -73
  286. package/vault/wiki/decisions/adr-025.md +0 -130
  287. package/vault/wiki/decisions/adr-026.md +0 -56
  288. package/vault/wiki/decisions/adr-027.md +0 -94
  289. package/vault/wiki/decisions/colocate-wiki.md +0 -34
  290. package/vault/wiki/entities/Anders Hejlsberg.md +0 -29
  291. package/vault/wiki/entities/Anthropic.md +0 -17
  292. package/vault/wiki/entities/Augment Code.md +0 -49
  293. package/vault/wiki/entities/Bjarne Stroustrup.md +0 -26
  294. package/vault/wiki/entities/Bolt.new (StackBlitz).md +0 -39
  295. package/vault/wiki/entities/Boris Cherny.md +0 -11
  296. package/vault/wiki/entities/Claude Code.md +0 -19
  297. package/vault/wiki/entities/Dennis Ritchie.md +0 -26
  298. package/vault/wiki/entities/Emergent Labs.md +0 -32
  299. package/vault/wiki/entities/Google Cloud.md +0 -16
  300. package/vault/wiki/entities/Guido van Rossum.md +0 -28
  301. package/vault/wiki/entities/Ken Thompson.md +0 -28
  302. package/vault/wiki/entities/Lee et al.md +0 -16
  303. package/vault/wiki/entities/Linus Torvalds.md +0 -28
  304. package/vault/wiki/entities/Lovable (company).md +0 -40
  305. package/vault/wiki/entities/Martin Fowler.md +0 -16
  306. package/vault/wiki/entities/Meng et al.md +0 -16
  307. package/vault/wiki/entities/OpenAI.md +0 -16
  308. package/vault/wiki/entities/Rocket.new.md +0 -38
  309. package/vault/wiki/entities/VILA-Lab.md +0 -15
  310. package/vault/wiki/entities/autodev-codebase.md +0 -18
  311. package/vault/wiki/entities/ck-tool.md +0 -59
  312. package/vault/wiki/entities/codesearch.md +0 -18
  313. package/vault/wiki/entities/disler-indydevdan.md +0 -33
  314. package/vault/wiki/entities/gsd-get-shit-done.md +0 -56
  315. package/vault/wiki/entities/javascript-runtimes.md +0 -48
  316. package/vault/wiki/entities/jesse-vincent.md +0 -38
  317. package/vault/wiki/entities/lean-ctx.md +0 -32
  318. package/vault/wiki/entities/opendev.md +0 -41
  319. package/vault/wiki/entities/ops-codegraph-tool.md +0 -18
  320. package/vault/wiki/entities/pi-coding-agent.md +0 -53
  321. package/vault/wiki/entities/sentrux.md +0 -54
  322. package/vault/wiki/entities/vgrep-tool.md +0 -57
  323. package/vault/wiki/entities/vitest.md +0 -41
  324. package/vault/wiki/flows/harness-wiki-pipeline.md +0 -204
  325. package/vault/wiki/hot.md +0 -932
  326. package/vault/wiki/index.md +0 -437
  327. package/vault/wiki/log.md +0 -422
  328. package/vault/wiki/meta/dashboard.md +0 -30
  329. package/vault/wiki/meta/lint-report-2026-04-30.md +0 -86
  330. package/vault/wiki/meta/lint-report-2026-05-02.md +0 -251
  331. package/vault/wiki/meta/overview.canvas +0 -43
  332. package/vault/wiki/modules/adversarial-verification.md +0 -57
  333. package/vault/wiki/modules/automated-observability.md +0 -54
  334. package/vault/wiki/modules/bench.md +0 -20
  335. package/vault/wiki/modules/extensions.md +0 -23
  336. package/vault/wiki/modules/grounding-checkpoints.md +0 -62
  337. package/vault/wiki/modules/harness-implementation-plan.md +0 -345
  338. package/vault/wiki/modules/harness-wiki-skill-mapping.md +0 -135
  339. package/vault/wiki/modules/harness.md +0 -86
  340. package/vault/wiki/modules/persistent-memory.md +0 -85
  341. package/vault/wiki/modules/schema-orchestration.md +0 -68
  342. package/vault/wiki/modules/skills.md +0 -27
  343. package/vault/wiki/modules/spec-hardening.md +0 -58
  344. package/vault/wiki/modules/structured-planning.md +0 -53
  345. package/vault/wiki/modules/think-in-code-enforcement.md +0 -153
  346. package/vault/wiki/modules/wiki-query-interface.md +0 -64
  347. package/vault/wiki/overview.md +0 -51
  348. package/vault/wiki/questions/Research-pi-vs-claude-code-agentic-orchestration-pipeline.md +0 -87
  349. package/vault/wiki/questions/Research-sentrux-dev.md +0 -123
  350. package/vault/wiki/questions/Research-superpowers-skill-for-agentic-coding-agents.md +0 -164
  351. package/vault/wiki/questions/Research: Augment Code Context Engine.md +0 -244
  352. package/vault/wiki/questions/Research: Automating Software Engineering - Lovable, Bolt, Emergent, Rocket.md +0 -112
  353. package/vault/wiki/questions/Research: Claude Code State-of-the-Art Harness Improvements.md +0 -209
  354. package/vault/wiki/questions/Research: Codex State-of-the-Art Harness Improvements.md +0 -99
  355. package/vault/wiki/questions/Research: Engineering Workflows of Legendary Programmers and AI Harness Mapping.md +0 -107
  356. package/vault/wiki/questions/Research: Fallow Codebase Intelligence Harness Integration.md +0 -72
  357. package/vault/wiki/questions/Research: Gemini CLI SOTA Harness Integration.md +0 -166
  358. package/vault/wiki/questions/Research: GitHub Issues as Harness Spec Storage.md +0 -188
  359. package/vault/wiki/questions/Research: Google Antigravity Harness Integration.md +0 -120
  360. package/vault/wiki/questions/Research: Meta-Agent Context Drift Detection.md +0 -236
  361. package/vault/wiki/questions/Research: Model-Adaptive Agent Harness Design.md +0 -95
  362. package/vault/wiki/questions/Research: Model-Specific Prompting Guides.md +0 -165
  363. package/vault/wiki/questions/Research: Prompt Renderer for Multi-Model Agent Harness.md +0 -216
  364. package/vault/wiki/questions/Research: Skill-First Harness Architecture.md +0 -91
  365. package/vault/wiki/questions/Research: TypeScript Best Practices and Codebase Structure.md +0 -88
  366. package/vault/wiki/questions/Research: TypeScript Execution Layer for Agent Tool Calling.md +0 -81
  367. package/vault/wiki/questions/Research: claude-mem over Obsidian for Harness Layer.md +0 -71
  368. package/vault/wiki/questions/Research: claude-mem over obsidian wiki as the knowledge base for our agentic harness pipeline. think from first principles. does this replace or complement our current setup? no hard feelings about previous decisions. gimme accurate points.md +0 -80
  369. package/vault/wiki/questions/Research: context-mode vs lean-ctx.md +0 -72
  370. package/vault/wiki/questions/Research: cursor.sh Harness Innovations.md +0 -92
  371. package/vault/wiki/questions/Research: executor.sh Harness Integration.md +0 -170
  372. package/vault/wiki/questions/Research: how GSD fits into our coding harness setup.md +0 -97
  373. package/vault/wiki/questions/Research: how claude-mem fits into our workflow. and whether it should replace obsidian in the codebase. no hard feelings about previous actions, rethink from first principles always.md +0 -80
  374. package/vault/wiki/questions/Research: pi-vcc.md +0 -113
  375. package/vault/wiki/questions/Research: semantic code search tools.md +0 -69
  376. package/vault/wiki/questions/Research: vcc extension for pi coding agent.md +0 -73
  377. package/vault/wiki/questions/how-to-enable-semantic-code-search-now.md +0 -111
  378. package/vault/wiki/questions/mvp-implementation-blueprint.md +0 -552
  379. package/vault/wiki/questions/research-agent-first-codebase-exploration.md +0 -199
  380. package/vault/wiki/questions/research-agentic-coding-harness-latest-papers.md +0 -142
  381. package/vault/wiki/questions/research-gitingest-gitreverse-integration.md +0 -100
  382. package/vault/wiki/questions/research-wozcode-token-reduction.md +0 -67
  383. package/vault/wiki/questions/resolved-context-pruning-inplace-vs-restart.md +0 -95
  384. package/vault/wiki/questions/resolved-context-window-economics.md +0 -167
  385. package/vault/wiki/questions/resolved-imad-debate-gating-transfer.md +0 -126
  386. package/vault/wiki/questions/resolved-mcp-tool-preference.md +0 -112
  387. package/vault/wiki/questions/resolved-small-model-meta-agents.md +0 -107
  388. package/vault/wiki/questions/resolved-treesitter-dynamic-languages.md +0 -95
  389. package/vault/wiki/sources/Auggie Context MCP Server.md +0 -63
  390. package/vault/wiki/sources/Augment Code Codacy AI Giants.md +0 -61
  391. package/vault/wiki/sources/Augment Code MCP SiliconAngle.md +0 -49
  392. package/vault/wiki/sources/Augment Code WorkOS ERC 2025.md +0 -55
  393. package/vault/wiki/sources/Augment Context Engine Official.md +0 -71
  394. package/vault/wiki/sources/Augment SWE-bench Agent GitHub.md +0 -74
  395. package/vault/wiki/sources/Augment SWE-bench Pro Blog.md +0 -58
  396. package/vault/wiki/sources/Source: AgentBus Jinja2 Prompt Pipelines.md +0 -75
  397. package/vault/wiki/sources/Source: Arxiv — Don't Break the Cache.md +0 -85
  398. package/vault/wiki/sources/Source: Augment - Harness Engineering for AI Coding Agents.md +0 -58
  399. package/vault/wiki/sources/Source: Blake Crosley Agent Architecture Guide.md +0 -100
  400. package/vault/wiki/sources/Source: Bolt.new Architecture & Case Study.md +0 -75
  401. package/vault/wiki/sources/Source: Build-Time Prompt Compilation Architecture.md +0 -107
  402. package/vault/wiki/sources/Source: Claude API Agent Skills Overview.md +0 -70
  403. package/vault/wiki/sources/Source: Gemini CLI Changelogs.md +0 -88
  404. package/vault/wiki/sources/Source: Google Blog - Gemini CLI Announcement.md +0 -57
  405. package/vault/wiki/sources/Source: Google Gemini CLI Architecture Docs.md +0 -53
  406. package/vault/wiki/sources/Source: LangChain - Anatomy of Agent Harness.md +0 -65
  407. package/vault/wiki/sources/Source: Lovable Architecture & Clone Analysis.md +0 -83
  408. package/vault/wiki/sources/Source: Martin Fowler - Harness Engineering.md +0 -70
  409. package/vault/wiki/sources/Source: OpenAI Harness Engineering Five Principles.md +0 -58
  410. package/vault/wiki/sources/Source: OpenAI Harness Engineering — 0 Lines of Human Code.md +0 -101
  411. package/vault/wiki/sources/Source: OpenDev — Building AI Coding Agents for the Terminal.md +0 -100
  412. package/vault/wiki/sources/Source: Render AI Coding Agents Benchmark 2025.md +0 -53
  413. package/vault/wiki/sources/Source: Rocket.new — Vibe Solutioning Platform.md +0 -70
  414. package/vault/wiki/sources/Source: SwirlAI Agent Skills Progressive Disclosure.md +0 -71
  415. package/vault/wiki/sources/Source: TianPan Prompt Caching Architecture.md +0 -89
  416. package/vault/wiki/sources/Source: Vercel Labs agent-browser.md +0 -155
  417. package/vault/wiki/sources/Source: browser-harness CDP Harness.md +0 -126
  418. package/vault/wiki/sources/agent-drift-academic-paper.md +0 -79
  419. package/vault/wiki/sources/aider-repomap-tree-sitter.md +0 -42
  420. package/vault/wiki/sources/anthropic-compaction-api.md +0 -58
  421. package/vault/wiki/sources/anthropic-effective-harnesses.md +0 -42
  422. package/vault/wiki/sources/anthropic-prompt-best-practices.md +0 -100
  423. package/vault/wiki/sources/anthropic2026-harness-design.md +0 -63
  424. package/vault/wiki/sources/barrel-files-tkdodo.md +0 -38
  425. package/vault/wiki/sources/birth-of-unix-kernighan-interview.md +0 -57
  426. package/vault/wiki/sources/bockeler2026-harness-engineering.md +0 -69
  427. package/vault/wiki/sources/cast-code-chunking-paper.md +0 -50
  428. package/vault/wiki/sources/ck-semantic-search.md +0 -78
  429. package/vault/wiki/sources/claude-code-architecture-karaxai-2026.md +0 -71
  430. package/vault/wiki/sources/claude-code-architecture-qubytes-2026.md +0 -50
  431. package/vault/wiki/sources/claude-code-architecture-vila-lab-2026.md +0 -64
  432. package/vault/wiki/sources/claude-code-security-architecture-penligent-2026.md +0 -70
  433. package/vault/wiki/sources/claude-context-editing-docs.md +0 -13
  434. package/vault/wiki/sources/cloudflare-codemode.md +0 -63
  435. package/vault/wiki/sources/code-chunk-library-supermemory.md +0 -63
  436. package/vault/wiki/sources/codeact-apple-2024.md +0 -62
  437. package/vault/wiki/sources/codex-dsc-rfc-8573.md +0 -41
  438. package/vault/wiki/sources/codex-open-source-agent-2026.md +0 -110
  439. package/vault/wiki/sources/coir-code-retrieval-benchmark.md +0 -51
  440. package/vault/wiki/sources/colinmcnamara-context-optimization-codemode.md +0 -48
  441. package/vault/wiki/sources/context-folding-paper.md +0 -61
  442. package/vault/wiki/sources/context-mode-website.md +0 -63
  443. package/vault/wiki/sources/cursor-agent-best-practices-2026.md +0 -62
  444. package/vault/wiki/sources/cursor-fork-29b-2025.md +0 -50
  445. package/vault/wiki/sources/cursor-harness-april-2026.md +0 -76
  446. package/vault/wiki/sources/cursor-instant-apply-2024.md +0 -45
  447. package/vault/wiki/sources/cursor-shadow-workspace-2024.md +0 -52
  448. package/vault/wiki/sources/cursor-shipped-coding-agent-2026.md +0 -53
  449. package/vault/wiki/sources/cursor-vs-antigravity-2026.md +0 -51
  450. package/vault/wiki/sources/disler-pi-vs-claude-code.md +0 -69
  451. package/vault/wiki/sources/distill-deterministic-context-compression.md +0 -53
  452. package/vault/wiki/sources/embedding-models-benchmark-supermemory-2025.md +0 -48
  453. package/vault/wiki/sources/executor-rhyssullivan.md +0 -122
  454. package/vault/wiki/sources/fallow-rs-codebase-intelligence.md +0 -125
  455. package/vault/wiki/sources/fan2025-imad.md +0 -60
  456. package/vault/wiki/sources/forgecode-gpt5-agent-improvements.md +0 -63
  457. package/vault/wiki/sources/gemini-3-prompting-guide.md +0 -78
  458. package/vault/wiki/sources/gh-cli-sub-issue-rfc.md +0 -50
  459. package/vault/wiki/sources/gh-sub-issue-extension.md +0 -72
  460. package/vault/wiki/sources/github-fork-issues-discussion.md +0 -44
  461. package/vault/wiki/sources/github-issue-dependencies-docs.md +0 -49
  462. package/vault/wiki/sources/github-sub-issues-docs.md +0 -51
  463. package/vault/wiki/sources/gitingest.md +0 -91
  464. package/vault/wiki/sources/gitreverse.md +0 -63
  465. package/vault/wiki/sources/google-antigravity-official-blog.md +0 -47
  466. package/vault/wiki/sources/google-antigravity-wikipedia.md +0 -53
  467. package/vault/wiki/sources/gsd-codecentric-deep-dive.md +0 -57
  468. package/vault/wiki/sources/gsd-github-repo.md +0 -51
  469. package/vault/wiki/sources/gsd-hn-discussion.md +0 -59
  470. package/vault/wiki/sources/guido-python-design-philosophy.md +0 -56
  471. package/vault/wiki/sources/hejlsberg-7-learnings.md +0 -48
  472. package/vault/wiki/sources/ironclaw-drift-monitor.md +0 -80
  473. package/vault/wiki/sources/langsight-loop-detection.md +0 -80
  474. package/vault/wiki/sources/leanctx-website.md +0 -69
  475. package/vault/wiki/sources/lee2026-meta-harness.md +0 -59
  476. package/vault/wiki/sources/linux-kernel-coding-workflow.md +0 -50
  477. package/vault/wiki/sources/lou2026-autoharness.md +0 -53
  478. package/vault/wiki/sources/martin-fowler-harness-engineering.md +0 -73
  479. package/vault/wiki/sources/mcp-architecture-docs.md +0 -13
  480. package/vault/wiki/sources/meng2026-agent-harness-survey.md +0 -79
  481. package/vault/wiki/sources/mindstudio-four-agent-types.md +0 -68
  482. package/vault/wiki/sources/ms-chat-history-management.md +0 -13
  483. package/vault/wiki/sources/openai-prompt-guidance.md +0 -104
  484. package/vault/wiki/sources/openclaw-session-pruning.md +0 -13
  485. package/vault/wiki/sources/opencode-dcp.md +0 -13
  486. package/vault/wiki/sources/opendev-arxiv-2603.05344v1.md +0 -79
  487. package/vault/wiki/sources/openhands-platform.md +0 -39
  488. package/vault/wiki/sources/oss-guide-codebase-exploration.md +0 -53
  489. package/vault/wiki/sources/pi-compaction-extensions-ecosystem.md +0 -102
  490. package/vault/wiki/sources/pi-context-prune-github-repo.md +0 -38
  491. package/vault/wiki/sources/pi-mono-compaction-docs.md +0 -38
  492. package/vault/wiki/sources/pi-omni-compact-github-repo.md +0 -50
  493. package/vault/wiki/sources/pi-rtk-optimizer-github-repo.md +0 -45
  494. package/vault/wiki/sources/pi-vcc-github-repo.md +0 -69
  495. package/vault/wiki/sources/pi-vscode-marketplace.md +0 -41
  496. package/vault/wiki/sources/pi-vscode-model-provider-marketplace.md +0 -39
  497. package/vault/wiki/sources/py-tree-sitter.md +0 -13
  498. package/vault/wiki/sources/sentrux-dev-landing.md +0 -40
  499. package/vault/wiki/sources/sentrux-docs-pro-architecture.md +0 -75
  500. package/vault/wiki/sources/sentrux-docs-quality-signal.md +0 -46
  501. package/vault/wiki/sources/sentrux-docs-root-cause-metrics.md +0 -57
  502. package/vault/wiki/sources/sentrux-docs-rules-engine.md +0 -58
  503. package/vault/wiki/sources/sentrux-github-repo.md +0 -56
  504. package/vault/wiki/sources/superpowers-github-repo.md +0 -56
  505. package/vault/wiki/sources/superpowers-release-blog.md +0 -54
  506. package/vault/wiki/sources/superpowers-termdock-analysis.md +0 -45
  507. package/vault/wiki/sources/swe-agent-aci.md +0 -42
  508. package/vault/wiki/sources/swe-bench.md +0 -45
  509. package/vault/wiki/sources/swe-pruner-context-pruning.md +0 -13
  510. package/vault/wiki/sources/think-in-code-blog.md +0 -48
  511. package/vault/wiki/sources/tree-sitter-docs.md +0 -13
  512. package/vault/wiki/sources/ts-best-practices-2025-devto.md +0 -42
  513. package/vault/wiki/sources/ts-folder-structure-mingyang.md +0 -58
  514. package/vault/wiki/sources/ts-monorepo-koerselman.md +0 -44
  515. package/vault/wiki/sources/ts-result-error-handling-kkalamarski.md +0 -52
  516. package/vault/wiki/sources/ts-runtimes-comparison-betterstack.md +0 -42
  517. package/vault/wiki/sources/ts-strict-mode-rishikc.md +0 -43
  518. package/vault/wiki/sources/unix-philosophy.md +0 -48
  519. package/vault/wiki/sources/vectara-chunking-vs-embedding-naacl2025.md +0 -39
  520. package/vault/wiki/sources/vectara-guardian-agents.md +0 -79
  521. package/vault/wiki/sources/vgrep-semantic-search.md +0 -76
  522. package/vault/wiki/sources/vitest-official.md +0 -41
  523. package/vault/wiki/sources/vscode-pi-community-extension.md +0 -40
  524. package/vault/wiki/sources/wozcode.md +0 -79
@@ -0,0 +1,398 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Merge graphify-out with optional graphify-books-out and graphify-yt-transcripts-out into graphify-out.
4
+
5
+ (Books/YouTube dirs were removed after a successful one-time merge; restore them from git to re-run.)
6
+
7
+ - Prefixes all book and YouTube node IDs to avoid collisions and preserve provenance.
8
+ - Merges hyperedges (normalizing books' member_nodes -> nodes).
9
+ - Adds cross-corpus INFERRED edges via token overlap / Jaccard on normalized labels.
10
+ - Re-clusters with graphify, writes graph.json, GRAPH_REPORT.md, analysis, labels, and graph.html (full viz via explicit node_limit).
11
+ """
12
+ from __future__ import annotations
13
+
14
+ import json
15
+ import re
16
+ import shutil
17
+ import sys
18
+ from collections import defaultdict
19
+ from datetime import datetime, timezone
20
+ from pathlib import Path
21
+
22
+ import networkx as nx
23
+ from networkx.readwrite import json_graph
24
+
25
+ from graphify.analyze import god_nodes, surprising_connections, suggest_questions
26
+ from graphify.cluster import cluster, score_all
27
+ from graphify.export import to_html, to_json
28
+ from graphify.report import generate
29
+
30
+ ROOT = Path(__file__).resolve().parents[1]
31
+ OUT = ROOT / "graphify-out"
32
+ MAIN_JSON = ROOT / "graphify-out" / "graph.json"
33
+ BOOKS_JSON = ROOT / "graphify-books-out" / "graph.json"
34
+ YT_JSON = ROOT / "graphify-yt-transcripts-out" / "graph.json"
35
+ YT_SEM = ROOT / "graphify-yt-transcripts-out" / "semantic_extraction.json"
36
+
37
+ BOOK_PREFIX = "books__"
38
+ YT_PREFIX = "yt__"
39
+
40
+
41
+ def _norm_tokens(text: str) -> set[str]:
42
+ s = re.sub(r"[^a-z0-9\s]", " ", (text or "").lower())
43
+ return {t for t in s.split() if len(t) > 2}
44
+
45
+
46
+ def load_node_link(path: Path) -> nx.Graph:
47
+ data = json.loads(path.read_text(encoding="utf-8"))
48
+ return json_graph.node_link_graph(data, edges="links")
49
+
50
+
51
+ def load_youtube_nx(path: Path) -> nx.Graph:
52
+ data = json.loads(path.read_text(encoding="utf-8"))
53
+ G = nx.Graph()
54
+ for n in data.get("nodes", []):
55
+ nid = n["id"]
56
+ attrs = {k: v for k, v in n.items() if k != "id"}
57
+ if "source_file" not in attrs or attrs["source_file"] in (None, ""):
58
+ attrs["source_file"] = "graphify-yt-transcripts-out/transcripts"
59
+ if "file_type" not in attrs:
60
+ attrs["file_type"] = "document"
61
+ G.add_node(nid, **attrs)
62
+ for e in data.get("edges", []):
63
+ u, v = e["source"], e["target"]
64
+ if u not in G or v not in G:
65
+ continue
66
+ ed = {k: v for k, v in e.items() if k not in ("source", "target")}
67
+ G.add_edge(u, v, **ed)
68
+ return G
69
+
70
+
71
+ def prefix_graph(G: nx.Graph, prefix: str) -> tuple[nx.Graph, dict[str, str]]:
72
+ """Return new graph with prefixed node ids; mapping old_id -> new_id."""
73
+ mapping = {n: f"{prefix}{n}" for n in G.nodes()}
74
+ H = nx.relabel_nodes(G, mapping, copy=True)
75
+ return H, mapping
76
+
77
+
78
+ def strip_community(G: nx.Graph) -> None:
79
+ for _, d in G.nodes(data=True):
80
+ d.pop("community", None)
81
+
82
+
83
+ def collect_hyperedges_main(data: dict) -> list[dict]:
84
+ g = data.get("graph") or {}
85
+ return list(g.get("hyperedges") or [])
86
+
87
+
88
+ def collect_hyperedges_books(data: dict, id_map: dict[str, str]) -> list[dict]:
89
+ out: list[dict] = []
90
+ for h in (data.get("graph") or {}).get("hyperedges") or []:
91
+ members = h.get("member_nodes") or h.get("nodes") or []
92
+ remapped = [id_map[m] for m in members if m in id_map]
93
+ if len(remapped) < 2:
94
+ continue
95
+ h2 = dict(h)
96
+ h2["nodes"] = remapped
97
+ h2.pop("member_nodes", None)
98
+ if "label" not in h2 and h2.get("description"):
99
+ h2["label"] = str(h2["description"])[:200]
100
+ if "relation" not in h2:
101
+ h2["relation"] = "participate_in"
102
+ if "confidence" not in h2:
103
+ h2["confidence"] = "INFERRED"
104
+ if "confidence_score" not in h2:
105
+ h2["confidence_score"] = 0.7
106
+ out.append(h2)
107
+ return out
108
+
109
+
110
+ def collect_hyperedges_yt(semantic: dict, id_map: dict[str, str]) -> list[dict]:
111
+ out: list[dict] = []
112
+ for h in semantic.get("hyperedges") or []:
113
+ nodes = h.get("nodes") or []
114
+ remapped = [id_map[n] for n in nodes if n in id_map]
115
+ if len(remapped) < 2:
116
+ continue
117
+ h2 = dict(h)
118
+ h2["nodes"] = remapped
119
+ out.append(h2)
120
+ return out
121
+
122
+
123
+ def build_token_index(G: nx.Graph) -> tuple[dict[str, set[str]], dict[str, str]]:
124
+ """node_id -> tokens, node_id -> display string for matching."""
125
+ tokens: dict[str, set[str]] = {}
126
+ labels: dict[str, str] = {}
127
+ for nid, d in G.nodes(data=True):
128
+ lab = d.get("norm_label") or d.get("label") or str(nid)
129
+ labels[nid] = lab if isinstance(lab, str) else str(lab)
130
+ tokens[nid] = _norm_tokens(labels[nid])
131
+ return tokens, labels
132
+
133
+
134
+ def add_cross_corpus_edges(
135
+ G: nx.Graph,
136
+ parts: list[tuple[str, nx.Graph, dict[str, set[str]], dict[str, str]]],
137
+ *,
138
+ max_edges: int = 12000,
139
+ min_jaccard: float = 0.32,
140
+ min_shared: int = 2,
141
+ max_per_target_corpus: int = 2,
142
+ ) -> int:
143
+ """
144
+ parts: (name, subgraph, tokens_map, labels_map) for each corpus.
145
+ Adds INFERRED semantically_similar_to edges only between different corpora (id prefix).
146
+ """
147
+ inverted: dict[str, list[tuple[str, str]]] = defaultdict(list)
148
+ for corpus, _Sg, tok_map, _lab in parts:
149
+ for nid, toks in tok_map.items():
150
+ for t in toks:
151
+ inverted[t].append((corpus, nid))
152
+
153
+ token_maps = {name: tm for name, _Sg, tm, _ in parts}
154
+ def corpus_of(nid: str) -> str:
155
+ if nid.startswith(BOOK_PREFIX):
156
+ return "books"
157
+ if nid.startswith(YT_PREFIX):
158
+ return "yt"
159
+ return "main"
160
+
161
+ existing = {frozenset((u, v)) for u, v in G.edges()}
162
+ added = 0
163
+
164
+ for corpus_a, _Ga, tok_a, _lab_a in parts:
165
+ for u, tu in tok_a.items():
166
+ if not tu:
167
+ continue
168
+ cand: set[str] = set()
169
+ for t in tu:
170
+ for corp_b, v in inverted[t]:
171
+ if corp_b == corpus_a:
172
+ continue
173
+ if corpus_of(u) == corpus_of(v):
174
+ continue
175
+ cand.add(v)
176
+
177
+ scored: list[tuple[float, str]] = []
178
+ for v in cand:
179
+ tv = None
180
+ for name in token_maps:
181
+ if v in token_maps[name]:
182
+ tv = token_maps[name][v]
183
+ break
184
+ if not tv:
185
+ continue
186
+ inter = len(tu & tv)
187
+ if inter < min_shared:
188
+ continue
189
+ union = len(tu | tv) or 1
190
+ j = inter / union
191
+ if j < min_jaccard:
192
+ continue
193
+ scored.append((j, v))
194
+
195
+ scored.sort(reverse=True)
196
+ tgt_corpus_count: dict[str, int] = defaultdict(int)
197
+ for j, v in scored:
198
+ if added >= max_edges:
199
+ return added
200
+ cb = corpus_of(v)
201
+ if tgt_corpus_count[cb] >= max_per_target_corpus:
202
+ continue
203
+ pair = frozenset((u, v))
204
+ if pair in existing:
205
+ continue
206
+ existing.add(pair)
207
+ tgt_corpus_count[cb] += 1
208
+ rationale = f"cross_corpus token overlap jaccard={j:.2f}"
209
+ G.add_edge(
210
+ u,
211
+ v,
212
+ relation="semantically_similar_to",
213
+ confidence="INFERRED",
214
+ confidence_score=min(0.95, 0.55 + 0.4 * j),
215
+ source_file="graphify_merge/cross_corpus",
216
+ source_location=f"{corpus_a}->{cb}",
217
+ weight=1.0,
218
+ rationale=rationale[:500],
219
+ )
220
+ added += 1
221
+ return added
222
+
223
+
224
+ def auto_community_labels(
225
+ G: nx.Graph, communities: dict[int, list[str]]
226
+ ) -> dict[int, str]:
227
+ """Short names from highest-degree node labels in each community."""
228
+ deg = dict(G.degree())
229
+ out: dict[int, str] = {}
230
+ for cid, members in communities.items():
231
+ ranked = sorted(members, key=lambda n: deg.get(n, 0), reverse=True)
232
+ bits: list[str] = []
233
+ seen_words: set[str] = set()
234
+ for nid in ranked[:12]:
235
+ lab = G.nodes[nid].get("label") or nid
236
+ if not isinstance(lab, str):
237
+ lab = str(lab)
238
+ # shorten
239
+ short = lab.strip()
240
+ if len(short) > 42:
241
+ short = short[:39] + "…"
242
+ w = _norm_tokens(short)
243
+ if not w:
244
+ continue
245
+ if short and short not in bits:
246
+ bits.append(short)
247
+ seen_words |= w
248
+ if len(bits) >= 3:
249
+ break
250
+ if bits:
251
+ name = " · ".join(bits[:3])
252
+ else:
253
+ name = f"Community {cid}"
254
+ if len(name) > 90:
255
+ name = name[:87] + "…"
256
+ out[cid] = name
257
+ return out
258
+
259
+
260
+ def polish_labels(labels: dict[int, str], G: nx.Graph, communities: dict[int, list[str]]) -> dict[int, str]:
261
+ """Short-circuit noisy labels from ingested graph-report summary nodes."""
262
+ out = dict(labels)
263
+ for cid, name in list(out.items()):
264
+ nlow = name.lower()
265
+ if "graph report" in nlow and "communities" in nlow:
266
+ out[cid] = "Ingested graph-report hubs (books merge artifact)"
267
+ elif "communities (" in nlow and "thin omitted" in nlow:
268
+ out[cid] = "Book community index nodes (metadata)"
269
+ return out
270
+
271
+
272
+ def main() -> None:
273
+ for p in (BOOKS_JSON, YT_JSON):
274
+ if not p.exists():
275
+ print(
276
+ f"Missing {p}. Books/YouTube graphs were merged into graphify-out and "
277
+ "the source dirs were removed; restore graphify-books-out/ and "
278
+ "graphify-yt-transcripts-out/ from git (or a backup) to re-run this merge.",
279
+ file=sys.stderr,
280
+ )
281
+ raise SystemExit(1)
282
+
283
+ ts = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
284
+ backup = OUT / f"graph.json.pre-merge-{ts}.bak"
285
+ if MAIN_JSON.exists():
286
+ shutil.copy2(MAIN_JSON, backup)
287
+ print(f"Backed up graph.json -> {backup.name}")
288
+
289
+ raw_main = json.loads(MAIN_JSON.read_text(encoding="utf-8"))
290
+ raw_books = json.loads(BOOKS_JSON.read_text(encoding="utf-8"))
291
+
292
+ G_main = load_node_link(MAIN_JSON)
293
+ G_books = load_node_link(BOOKS_JSON)
294
+ G_yt = load_youtube_nx(YT_JSON)
295
+
296
+ strip_community(G_main)
297
+ strip_community(G_books)
298
+ strip_community(G_yt)
299
+
300
+ G_books_p, map_b = prefix_graph(G_books, BOOK_PREFIX)
301
+ G_yt_p, map_y = prefix_graph(G_yt, YT_PREFIX)
302
+
303
+ G = nx.compose_all([G_main, G_books_p, G_yt_p])
304
+
305
+ hyper: list[dict] = []
306
+ hyper += collect_hyperedges_main(raw_main)
307
+ hyper += collect_hyperedges_books(raw_books, map_b)
308
+ if YT_SEM.exists():
309
+ sem = json.loads(YT_SEM.read_text(encoding="utf-8"))
310
+ hyper += collect_hyperedges_yt(sem, map_y)
311
+ G.graph["hyperedges"] = hyper
312
+ print(f"Merged hyperedges: {len(hyper)}")
313
+
314
+ parts = []
315
+ for name, sub in (
316
+ ("main", G_main),
317
+ ("books", G_books_p),
318
+ ("yt", G_yt_p),
319
+ ):
320
+ tm, lm = build_token_index(sub)
321
+ parts.append((name, sub, tm, lm))
322
+
323
+ n_cross = add_cross_corpus_edges(G, parts)
324
+ print(f"Cross-corpus edges added: {n_cross}")
325
+ print(f"Combined graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")
326
+
327
+ communities = cluster(G)
328
+ cohesion = score_all(G, communities)
329
+ gods = god_nodes(G)
330
+ surprises = surprising_connections(G, communities)
331
+
332
+ labels = polish_labels(auto_community_labels(G, communities), G, communities)
333
+ questions = suggest_questions(G, communities, labels)
334
+
335
+ detection = {
336
+ "total_files": 0,
337
+ "total_words": 0,
338
+ "needs_graph": True,
339
+ "warning": None,
340
+ "files": {"paper": [], "code": [], "document": [], "image": [], "video": []},
341
+ "skipped_sensitive": [],
342
+ "graphifyignore_patterns": 0,
343
+ }
344
+ tokens = {"input": 0, "output": 0}
345
+
346
+ report = generate(
347
+ G,
348
+ communities,
349
+ cohesion,
350
+ labels,
351
+ gods,
352
+ surprises,
353
+ detection,
354
+ tokens,
355
+ str(ROOT),
356
+ suggested_questions=questions,
357
+ )
358
+ OUT.mkdir(parents=True, exist_ok=True)
359
+ (OUT / "GRAPH_REPORT.md").write_text(report, encoding="utf-8")
360
+
361
+ ok = to_json(G, communities, str(OUT / "graph.json"), force=True)
362
+ if not ok:
363
+ raise SystemExit("to_json refused to write; check stderr")
364
+
365
+ analysis = {
366
+ "communities": {str(k): v for k, v in communities.items()},
367
+ "cohesion": {str(k): v for k, v in cohesion.items()},
368
+ "gods": gods,
369
+ "surprises": surprises,
370
+ "questions": questions,
371
+ "merge_meta": {
372
+ "merged_at": datetime.now(timezone.utc).isoformat(),
373
+ "sources": ["graphify-out", "graphify-books-out", "graphify-yt-transcripts-out"],
374
+ "cross_corpus_edges": n_cross,
375
+ "hyperedges": len(hyper),
376
+ },
377
+ }
378
+ (OUT / ".graphify_analysis.json").write_text(
379
+ json.dumps(analysis, indent=2), encoding="utf-8"
380
+ )
381
+ (OUT / ".graphify_labels.json").write_text(
382
+ json.dumps({str(k): v for k, v in labels.items()}, indent=2),
383
+ encoding="utf-8",
384
+ )
385
+
386
+ n = G.number_of_nodes()
387
+ to_html(
388
+ G,
389
+ communities,
390
+ str(OUT / "graph.html"),
391
+ community_labels=labels,
392
+ node_limit=n,
393
+ )
394
+ print(f"Wrote graph.html ({n} nodes, node_limit=n for graphify viz cap)")
395
+
396
+
397
+ if __name__ == "__main__":
398
+ main()
@@ -0,0 +1,46 @@
1
+ #!/usr/bin/env python3
2
+ """Write graphify-out/graph.html from existing graph.json (full graph, bypasses 5k default cap)."""
3
+ from __future__ import annotations
4
+
5
+ import json
6
+ import sys
7
+ from pathlib import Path
8
+
9
+ from networkx.readwrite import json_graph
10
+
11
+ from graphify.export import to_html
12
+
13
+ ROOT = Path(__file__).resolve().parents[1]
14
+ OUT = ROOT / "graphify-out"
15
+
16
+
17
+ def main() -> None:
18
+ gj = OUT / "graph.json"
19
+ if not gj.exists():
20
+ print(f"Missing {gj}", file=sys.stderr)
21
+ sys.exit(1)
22
+ G = json_graph.node_link_graph(json.loads(gj.read_text(encoding="utf-8")), edges="links")
23
+ analysis_path = OUT / ".graphify_analysis.json"
24
+ if not analysis_path.exists():
25
+ print(f"Missing {analysis_path}", file=sys.stderr)
26
+ sys.exit(1)
27
+ analysis = json.loads(analysis_path.read_text(encoding="utf-8"))
28
+ communities = {int(k): v for k, v in analysis["communities"].items()}
29
+ labels_path = OUT / ".graphify_labels.json"
30
+ labels: dict[int, str] = {}
31
+ if labels_path.exists():
32
+ labels = {int(k): v for k, v in json.loads(labels_path.read_text(encoding="utf-8")).items()}
33
+ n = G.number_of_nodes()
34
+ # graphify skips full HTML when n > default limit; pass explicit limit for full-node viz.
35
+ to_html(
36
+ G,
37
+ communities,
38
+ str(OUT / "graph.html"),
39
+ community_labels=labels or None,
40
+ node_limit=n,
41
+ )
42
+ print(f"Wrote {OUT / 'graph.html'} ({n} nodes)")
43
+
44
+
45
+ if __name__ == "__main__":
46
+ main()
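
Reviewer note on the hunk above: the script rebuilds `communities` and `labels` with `int(k)` because JSON object keys are always strings, and the merge script writes them with `str(k)`. A minimal round trip, using made-up community data, illustrates why the conversion is needed:

```python
import json

# Hypothetical community assignment: integer community id -> member node ids.
communities = {0: ["a", "b"], 1: ["books__c"]}

# Writing: keys are stringified explicitly, as the merge script does.
encoded = json.dumps({str(k): v for k, v in communities.items()})

# Reading: keys come back as strings and must be converted to int again.
as_loaded = json.loads(encoded)
decoded = {int(k): v for k, v in as_loaded.items()}
```

Without the `int(k)` step, lookups by integer community id against the loaded mapping would miss every entry.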
@@ -1,90 +0,0 @@
1
- ---
2
- name: defuddle
3
- description: "Strip clutter from web pages before ingesting into the wiki. Removes ads, navigation, headers, footers, and boilerplate, leaving clean readable markdown that saves 40-60% of tokens. Triggers on: defuddle, clean this page, strip this url, fetch and clean, clean web content before ingesting, strip ads, remove clutter, clean URL content, readable markdown from URL."
4
- allowed-tools: Read Bash
5
- ---
6
-
7
- # defuddle: Web Page Cleaner
8
-
9
- Defuddle extracts the meaningful content from a web page and drops everything else: ads, cookie banners, nav bars, related articles, footers, social sharing buttons. What remains is the article body as clean markdown.
10
-
11
- Use this before any URL ingestion. It is optional but strongly recommended. It cuts token usage by 40-60% on typical web articles and produces cleaner wiki pages.
12
-
13
- ---
14
-
15
- ## Wiki Path Resolution
16
-
17
- This skill saves cleaned content to `.raw/` (relative to vault root). It does NOT write to `wiki/` directly. The vault root is the working directory. Other skills (wiki-ingest) handle wiki path resolution via `VAULT_WIKI_PATH` when reading from `.raw/` and writing to `wiki/`.
18
-
19
- ---
20
-
21
- ## Install
22
-
23
- ```bash
24
- npm install -g defuddle-cli
25
- ```
26
-
27
- Verify: `defuddle --version`
28
-
29
- ---
30
-
31
- ## Usage
32
-
33
- ### Clean a URL directly
34
- ```bash
35
- defuddle https://example.com/article
36
- ```
37
- Outputs clean markdown to stdout.
38
-
39
- ### Save to .raw/
40
- ```bash
41
- defuddle https://example.com/article > .raw/articles/article-slug-$(date +%Y-%m-%d).md
42
- ```
43
-
44
- ### Add frontmatter header after saving
45
- After running defuddle, prepend the source URL and fetch date:
46
- ```bash
47
- SLUG="article-slug-$(date +%Y-%m-%d)"
48
- { echo "---"; echo "source_url: https://example.com/article"; echo "fetched: $(date +%Y-%m-%d)"; echo "---"; echo ""; defuddle https://example.com/article; } > .raw/articles/$SLUG.md
49
- ```
50
-
51
- ### Clean a local HTML file
52
- ```bash
53
- defuddle page.html
54
- ```
55
-
56
- ---
57
-
58
- ## When to Use
59
-
60
- **Use defuddle when:**
61
- - Ingesting a news article, blog post, or documentation page from a URL
62
- - The page has a lot of surrounding content (most web pages do)
63
- - You want to stay within token budget on a long article
64
-
65
- **Skip defuddle when:**
66
- - The source is already a clean markdown or PDF file
67
- - The page is a dashboard, app, or structured data (defuddle expects article-style content)
68
- - defuddle is not installed and the article is short enough to process raw
69
-
70
- ---
71
-
72
- ## Fallback
73
-
74
- To check whether defuddle is installed:
75
-
76
- ```bash
77
- which defuddle 2>/dev/null || echo "not installed"
78
- ```
79
-
80
- If it is not installed, use WebFetch directly. The content will be less clean but still workable.
81
-
82
- ---
83
-
84
- ## Integration with /wiki-ingest
85
-
86
- The `/wiki-ingest` skill checks for defuddle automatically when a URL is passed. You do not need to run defuddle manually before ingesting a URL. The ingest skill will call it if available.
87
-
88
- To manually clean a page and save before ingesting:
89
- 1. Run the save command above
90
- 2. Then: `ingest .raw/articles/[slug].md`
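
Reviewer note on the removed skill: the "Add frontmatter header after saving" step prepends a small YAML header to the cleaned markdown. The sketch below reproduces that layout in Python for comparison with the shell one-liner. The helper names `with_frontmatter` and `raw_article_path` are hypothetical, and the cleaned markdown is assumed to be already available as a string. The sketch does not invoke defuddle itself.

```python
from datetime import date
from pathlib import PurePosixPath


def with_frontmatter(markdown: str, source_url: str, fetched: date) -> str:
    """Prepend the source_url / fetched header, followed by one blank line."""
    header = [
        "---",
        f"source_url: {source_url}",
        f"fetched: {fetched.isoformat()}",
        "---",
        "",
    ]
    return "\n".join(header) + "\n" + markdown


def raw_article_path(slug: str, fetched: date) -> PurePosixPath:
    """Build the .raw/articles/<slug>-<YYYY-MM-DD>.md path used in the skill."""
    return PurePosixPath(".raw") / "articles" / f"{slug}-{fetched.isoformat()}.md"


if __name__ == "__main__":
    day = date(2026, 1, 2)
    print(raw_article_path("article-slug", day))
    print(with_frontmatter("# Title\n", "https://example.com/article", day))
```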