@skill-graph/cli 0.5.6

Files changed (330)
  1. package/CHANGELOG.md +247 -0
  2. package/LICENSE +200 -0
  3. package/NOTICE +62 -0
  4. package/README.md +398 -0
  5. package/SKILL_GRAPH.md +443 -0
  6. package/bin/skill-graph.js +374 -0
  7. package/docs/ADOPTION.md +117 -0
  8. package/docs/CONFORMANCE.md +66 -0
  9. package/docs/PRIMER.md +384 -0
  10. package/docs/QUICKSTART-30MIN.md +333 -0
  11. package/docs/ROUTING-METRICS.md +120 -0
  12. package/docs/SKILL-MD-FORMAT-COMPATIBILITY.md +127 -0
  13. package/docs/SKILL_AUDIT_CHECKLIST.md +199 -0
  14. package/docs/SKILL_AUDIT_LOOP.md +195 -0
  15. package/docs/SKILL_METADATA_PROTOCOL.md +609 -0
  16. package/docs/_archived/marketplace-publication-priority-2026-05-18.md +239 -0
  17. package/docs/adr/0001-predicate-set.md +69 -0
  18. package/docs/adr/0002-json-ld-context.md +82 -0
  19. package/docs/adr/0003-ontoclean-rigidity-tags.md +65 -0
  20. package/docs/adr/0004-persistent-identifiers.md +74 -0
  21. package/docs/adr/0005-freshness-consolidation.md +70 -0
  22. package/docs/adr/0006-revise-predicate-rename.md +105 -0
  23. package/docs/adr/0007-audit-loop-cadence.md +99 -0
  24. package/docs/adr/0008-skill-surface-split-and-curation-policy.md +93 -0
  25. package/docs/category-consumers.md +168 -0
  26. package/docs/concept-map.md +194 -0
  27. package/docs/diagrams/drift-states.mmd +21 -0
  28. package/docs/diagrams/manifest-pipeline.mmd +25 -0
  29. package/docs/diagrams/routing-harness.mmd +41 -0
  30. package/docs/diagrams/starter-graph.mmd +53 -0
  31. package/docs/field-decision-guide.md +315 -0
  32. package/docs/field-rationale.md +211 -0
  33. package/docs/field-reference.generated.md +624 -0
  34. package/docs/field-reference.md +1426 -0
  35. package/docs/glossary.md +190 -0
  36. package/docs/head-noun-glossary.md +63 -0
  37. package/docs/images/audit-phases.png +0 -0
  38. package/docs/images/drift-states.png +0 -0
  39. package/docs/images/graded-mode.png +0 -0
  40. package/docs/images/manifest-pipeline.png +0 -0
  41. package/docs/images/routing-harness.png +0 -0
  42. package/docs/images/skill-anatomy.png +0 -0
  43. package/docs/images/starter-graph.png +0 -0
  44. package/docs/images/system-model.png +0 -0
  45. package/docs/integrations/github-actions.md +155 -0
  46. package/docs/manifest-field-mapping.md +443 -0
  47. package/docs/marketplace-publication-queue.generated.md +240 -0
  48. package/docs/marketplace-release-agent-prompt.md +82 -0
  49. package/docs/marketplace-skill-candidate-list.md +272 -0
  50. package/docs/marketplace-syndication.md +222 -0
  51. package/docs/migration-sample-review.md +155 -0
  52. package/docs/migrations/v4-to-v5.md +168 -0
  53. package/docs/migrations/v5-to-v6.md +221 -0
  54. package/docs/name-exceptions.yaml +37 -0
  55. package/docs/plans/marketplace-p1-public-migration-plan.md +41 -0
  56. package/docs/plans/multi-root-workspace.md +148 -0
  57. package/docs/plans/scripts-roadmap.md +107 -0
  58. package/docs/plans/v4-schema-bump.md +160 -0
  59. package/docs/plans/wave-2-extraction.md +122 -0
  60. package/docs/positioning-vs-marketplaces.md +175 -0
  61. package/docs/proposals/skill-audit-loop-positioning.md +160 -0
  62. package/docs/quality-doctrine.md +138 -0
  63. package/docs/recommended-skills.md +150 -0
  64. package/docs/research/skill-comprehension-eval-research.md +1830 -0
  65. package/docs/research/skill-retrieval-evidence.md +66 -0
  66. package/docs/skill-metadata-protocol.md +471 -0
  67. package/docs/skills-sh-maintainer-cleanup-request.md +80 -0
  68. package/examples/audits/a11y/findings.md +52 -0
  69. package/examples/audits/a11y/scorecard.md +21 -0
  70. package/examples/audits/a11y/verdict.md +44 -0
  71. package/examples/audits/debugging/findings.md +59 -0
  72. package/examples/audits/debugging/scorecard.md +22 -0
  73. package/examples/audits/debugging/verdict.md +33 -0
  74. package/examples/audits/documentation/findings.md +59 -0
  75. package/examples/audits/documentation/scorecard.md +22 -0
  76. package/examples/audits/documentation/verdict.md +33 -0
  77. package/examples/evals/a11y.json +140 -0
  78. package/examples/evals/api-design.json +52 -0
  79. package/examples/evals/code-review.json +52 -0
  80. package/examples/evals/data-modeling.json +52 -0
  81. package/examples/evals/database-migration.json +52 -0
  82. package/examples/evals/debugging.json +118 -0
  83. package/examples/evals/dependency-architecture.json +52 -0
  84. package/examples/evals/design-system-architecture.json +52 -0
  85. package/examples/evals/error-tracking.json +52 -0
  86. package/examples/evals/event-contract-design.json +52 -0
  87. package/examples/evals/form-ux-architecture.json +52 -0
  88. package/examples/evals/framework-fit-analysis.json +52 -0
  89. package/examples/evals/graph-audit.json +139 -0
  90. package/examples/evals/information-architecture.json +52 -0
  91. package/examples/evals/interaction-feedback.json +52 -0
  92. package/examples/evals/interaction-patterns.json +52 -0
  93. package/examples/evals/layout-composition.json +52 -0
  94. package/examples/evals/lint-overlay.json +117 -0
  95. package/examples/evals/microcopy.json +52 -0
  96. package/examples/evals/observability-modeling.json +52 -0
  97. package/examples/evals/pattern-recognition.json +96 -0
  98. package/examples/evals/performance-engineering.json +52 -0
  99. package/examples/evals/refactor.json +128 -0
  100. package/examples/evals/semiotics.json +52 -0
  101. package/examples/evals/skill-infrastructure.json +96 -0
  102. package/examples/evals/skill-router.json +140 -0
  103. package/examples/evals/skill-router.routing.json +113 -0
  104. package/examples/evals/system-interface-contracts.json +52 -0
  105. package/examples/evals/task-analysis.json +52 -0
  106. package/examples/evals/testing-strategy.json +118 -0
  107. package/examples/evals/type-safety.json +249 -0
  108. package/examples/evals/visual-design-foundations.json +52 -0
  109. package/examples/evals/webhook-integration.json +52 -0
  110. package/examples/exports/a11y.skill-md.md +80 -0
  111. package/examples/exports/debugging.skill-md.md +80 -0
  112. package/examples/exports/refactor.skill-md.md +78 -0
  113. package/examples/exports/testing-strategy.skill-md.md +81 -0
  114. package/examples/projects/markdown-static-site/README.md +115 -0
  115. package/examples/projects/markdown-static-site/skills/content-source-router/SKILL.md +131 -0
  116. package/examples/projects/markdown-static-site/skills/image-optimization-pipeline-config/SKILL.md +132 -0
  117. package/examples/projects/markdown-static-site/skills/link-rot-detection/SKILL.md +103 -0
  118. package/examples/projects/markdown-static-site/skills/markdown-post-frontmatter-validation/SKILL.md +133 -0
  119. package/examples/projects/markdown-static-site/skills/migrate-posts-to-v2-frontmatter/SKILL.md +140 -0
  120. package/examples/projects/saas-stripe-postgres/README.md +208 -0
  121. package/examples/projects/saas-stripe-postgres/db/migrations/0004_canonicalize_orders.sql +37 -0
  122. package/examples/projects/saas-stripe-postgres/db/schema.sql +112 -0
  123. package/examples/projects/saas-stripe-postgres/skills/migrate-orders-to-canonical-schema/SKILL.md +149 -0
  124. package/examples/projects/saas-stripe-postgres/skills/nextjs-server-action-validation/SKILL.md +154 -0
  125. package/examples/projects/saas-stripe-postgres/skills/payment-provider-router/SKILL.md +153 -0
  126. package/examples/projects/saas-stripe-postgres/skills/postgres-rls-pattern/SKILL.md +163 -0
  127. package/examples/projects/saas-stripe-postgres/skills/stripe-webhook-signature-verification/SKILL.md +137 -0
  128. package/examples/protocol/skill-metadata-template.md +301 -0
  129. package/examples/protocol/skills.manifest.sample.json +13245 -0
  130. package/examples/skill-metadata-template.md +317 -0
  131. package/examples/skills.manifest.sample.json +13519 -0
  132. package/examples/tests/v3-1-skos-fixture/SKILL.md +93 -0
  133. package/marketplace/README.md +17 -0
  134. package/marketplace/skills/a11y/SKILL.md +66 -0
  135. package/marketplace/skills/acid-fundamentals/SKILL.md +106 -0
  136. package/marketplace/skills/agent-engineering/SKILL.md +386 -0
  137. package/marketplace/skills/agent-eval-design/SKILL.md +55 -0
  138. package/marketplace/skills/ai-native-development/SKILL.md +294 -0
  139. package/marketplace/skills/api-design/SKILL.md +60 -0
  140. package/marketplace/skills/architecture-decision-records/SKILL.md +55 -0
  141. package/marketplace/skills/background-jobs/SKILL.md +265 -0
  142. package/marketplace/skills/bounded-context-mapping/SKILL.md +55 -0
  143. package/marketplace/skills/cap-theorem-tradeoffs/SKILL.md +127 -0
  144. package/marketplace/skills/client-server-boundary/SKILL.md +187 -0
  145. package/marketplace/skills/code-review/SKILL.md +120 -0
  146. package/marketplace/skills/color-system-design/SKILL.md +43 -0
  147. package/marketplace/skills/component-architecture/SKILL.md +126 -0
  148. package/marketplace/skills/compression/SKILL.md +112 -0
  149. package/marketplace/skills/conceptual-modeling/SKILL.md +181 -0
  150. package/marketplace/skills/connection-pooling/SKILL.md +105 -0
  151. package/marketplace/skills/constraint-awareness/SKILL.md +287 -0
  152. package/marketplace/skills/content-monitor/SKILL.md +209 -0
  153. package/marketplace/skills/context-engineering/SKILL.md +320 -0
  154. package/marketplace/skills/context-graph/SKILL.md +174 -0
  155. package/marketplace/skills/context-management/SKILL.md +174 -0
  156. package/marketplace/skills/context-window/SKILL.md +239 -0
  157. package/marketplace/skills/contract-testing/SKILL.md +120 -0
  158. package/marketplace/skills/cron-scheduling/SKILL.md +223 -0
  159. package/marketplace/skills/dark-mode-implementation/SKILL.md +47 -0
  160. package/marketplace/skills/data-modeling/SKILL.md +59 -0
  161. package/marketplace/skills/data-modeling-fundamentals/SKILL.md +117 -0
  162. package/marketplace/skills/database-migration/SKILL.md +429 -0
  163. package/marketplace/skills/debugging/SKILL.md +67 -0
  164. package/marketplace/skills/dependency-architecture/SKILL.md +58 -0
  165. package/marketplace/skills/design-module-composition/SKILL.md +43 -0
  166. package/marketplace/skills/design-system-architecture/SKILL.md +61 -0
  167. package/marketplace/skills/design-thinking/SKILL.md +44 -0
  168. package/marketplace/skills/diagnosis/SKILL.md +296 -0
  169. package/marketplace/skills/diff-analysis/SKILL.md +188 -0
  170. package/marketplace/skills/e2e-test-design/SKILL.md +113 -0
  171. package/marketplace/skills/entity-relationship-modeling/SKILL.md +218 -0
  172. package/marketplace/skills/epistemic-grounding/SKILL.md +112 -0
  173. package/marketplace/skills/error-boundary/SKILL.md +235 -0
  174. package/marketplace/skills/error-tracking/SKILL.md +261 -0
  175. package/marketplace/skills/eval-driven-development/SKILL.md +147 -0
  176. package/marketplace/skills/evaluation/SKILL.md +113 -0
  177. package/marketplace/skills/event-contract-design/SKILL.md +60 -0
  178. package/marketplace/skills/event-storming/SKILL.md +56 -0
  179. package/marketplace/skills/form-ux-architecture/SKILL.md +60 -0
  180. package/marketplace/skills/framework-fit-analysis/SKILL.md +59 -0
  181. package/marketplace/skills/frontend-architecture/SKILL.md +43 -0
  182. package/marketplace/skills/generative-ui/SKILL.md +118 -0
  183. package/marketplace/skills/graph-audit/SKILL.md +81 -0
  184. package/marketplace/skills/guardrails/SKILL.md +118 -0
  185. package/marketplace/skills/hooks-patterns/SKILL.md +185 -0
  186. package/marketplace/skills/http-semantics/SKILL.md +136 -0
  187. package/marketplace/skills/ideation/SKILL.md +41 -0
  188. package/marketplace/skills/indexing-strategy/SKILL.md +108 -0
  189. package/marketplace/skills/information-architecture/SKILL.md +59 -0
  190. package/marketplace/skills/integration-test-design/SKILL.md +111 -0
  191. package/marketplace/skills/intent-recognition/SKILL.md +136 -0
  192. package/marketplace/skills/interaction-feedback/SKILL.md +59 -0
  193. package/marketplace/skills/interaction-patterns/SKILL.md +59 -0
  194. package/marketplace/skills/journey-mapping/SKILL.md +41 -0
  195. package/marketplace/skills/keywords/SKILL.md +213 -0
  196. package/marketplace/skills/knowledge-modeling/SKILL.md +232 -0
  197. package/marketplace/skills/layout-composition/SKILL.md +59 -0
  198. package/marketplace/skills/linguistics/SKILL.md +429 -0
  199. package/marketplace/skills/lint-overlay/SKILL.md +76 -0
  200. package/marketplace/skills/mental-models/SKILL.md +126 -0
  201. package/marketplace/skills/merge-queue/SKILL.md +94 -0
  202. package/marketplace/skills/methodology/SKILL.md +317 -0
  203. package/marketplace/skills/microcopy/SKILL.md +232 -0
  204. package/marketplace/skills/middleware-patterns/SKILL.md +363 -0
  205. package/marketplace/skills/mobile-responsive-ux/SKILL.md +287 -0
  206. package/marketplace/skills/mutation-testing/SKILL.md +112 -0
  207. package/marketplace/skills/naming-conventions/SKILL.md +112 -0
  208. package/marketplace/skills/observability-modeling/SKILL.md +59 -0
  209. package/marketplace/skills/ontology-modeling/SKILL.md +67 -0
  210. package/marketplace/skills/owasp-security/SKILL.md +153 -0
  211. package/marketplace/skills/pattern-recognition/SKILL.md +472 -0
  212. package/marketplace/skills/performance-budgets/SKILL.md +185 -0
  213. package/marketplace/skills/performance-engineering/SKILL.md +58 -0
  214. package/marketplace/skills/performance-testing/SKILL.md +125 -0
  215. package/marketplace/skills/printify/SKILL.md +42 -0
  216. package/marketplace/skills/prioritization/SKILL.md +118 -0
  217. package/marketplace/skills/problem-framing/SKILL.md +41 -0
  218. package/marketplace/skills/problem-locating-solving/SKILL.md +203 -0
  219. package/marketplace/skills/project-knowledge-extraction/SKILL.md +54 -0
  220. package/marketplace/skills/prompt-craft/SKILL.md +134 -0
  221. package/marketplace/skills/prompt-injection-defense/SKILL.md +132 -0
  222. package/marketplace/skills/property-based-testing/SKILL.md +100 -0
  223. package/marketplace/skills/prototyping/SKILL.md +43 -0
  224. package/marketplace/skills/query-optimization/SKILL.md +144 -0
  225. package/marketplace/skills/real-time-updates/SKILL.md +324 -0
  226. package/marketplace/skills/ref-patterns/SKILL.md +284 -0
  227. package/marketplace/skills/refactor/SKILL.md +65 -0
  228. package/marketplace/skills/rendering-models/SKILL.md +142 -0
  229. package/marketplace/skills/replication-patterns/SKILL.md +110 -0
  230. package/marketplace/skills/research-synthesis/SKILL.md +41 -0
  231. package/marketplace/skills/route-handler-design/SKILL.md +347 -0
  232. package/marketplace/skills/schema-evolution/SKILL.md +140 -0
  233. package/marketplace/skills/security-fundamentals/SKILL.md +139 -0
  234. package/marketplace/skills/semantic-center/SKILL.md +194 -0
  235. package/marketplace/skills/semantic-relations/SKILL.md +250 -0
  236. package/marketplace/skills/semantics/SKILL.md +366 -0
  237. package/marketplace/skills/semiotics/SKILL.md +230 -0
  238. package/marketplace/skills/seo-strategy/SKILL.md +260 -0
  239. package/marketplace/skills/server-actions-design/SKILL.md +243 -0
  240. package/marketplace/skills/server-components-design/SKILL.md +190 -0
  241. package/marketplace/skills/sharding-strategy/SKILL.md +123 -0
  242. package/marketplace/skills/shopify/SKILL.md +42 -0
  243. package/marketplace/skills/skill-infrastructure/SKILL.md +320 -0
  244. package/marketplace/skills/skill-router/SKILL.md +71 -0
  245. package/marketplace/skills/skill-scaffold/SKILL.md +105 -0
  246. package/marketplace/skills/snapshot-testing/SKILL.md +120 -0
  247. package/marketplace/skills/spec-driven-development/SKILL.md +148 -0
  248. package/marketplace/skills/state-machine-modeling/SKILL.md +56 -0
  249. package/marketplace/skills/state-management/SKILL.md +134 -0
  250. package/marketplace/skills/streaming-architecture/SKILL.md +194 -0
  251. package/marketplace/skills/summarization/SKILL.md +156 -0
  252. package/marketplace/skills/suspense-patterns/SKILL.md +265 -0
  253. package/marketplace/skills/system-interface-contracts/SKILL.md +59 -0
  254. package/marketplace/skills/task-analysis/SKILL.md +201 -0
  255. package/marketplace/skills/taxonomy-design/SKILL.md +66 -0
  256. package/marketplace/skills/test-coverage-strategy/SKILL.md +108 -0
  257. package/marketplace/skills/test-doubles-design/SKILL.md +98 -0
  258. package/marketplace/skills/test-driven-development/SKILL.md +96 -0
  259. package/marketplace/skills/testing-strategy/SKILL.md +67 -0
  260. package/marketplace/skills/theme-system-design/SKILL.md +43 -0
  261. package/marketplace/skills/tool-call-flow/SKILL.md +229 -0
  262. package/marketplace/skills/tool-call-strategy/SKILL.md +292 -0
  263. package/marketplace/skills/transaction-isolation/SKILL.md +98 -0
  264. package/marketplace/skills/type-safety/SKILL.md +177 -0
  265. package/marketplace/skills/typography-system/SKILL.md +43 -0
  266. package/marketplace/skills/usability-testing/SKILL.md +43 -0
  267. package/marketplace/skills/user-research/SKILL.md +43 -0
  268. package/marketplace/skills/vercel-composition-patterns/SKILL.md +157 -0
  269. package/marketplace/skills/version-control/SKILL.md +233 -0
  270. package/marketplace/skills/visual-design-foundations/SKILL.md +59 -0
  271. package/marketplace/skills/visual-hierarchy/SKILL.md +43 -0
  272. package/marketplace/skills/webhook-integration/SKILL.md +331 -0
  273. package/marketplace/skills/writing-humanizer/SKILL.md +380 -0
  274. package/package.json +67 -0
  275. package/schemas/manifest.schema.json +811 -0
  276. package/schemas/manifest.v2.schema.json +164 -0
  277. package/schemas/manifest.v3.schema.json +758 -0
  278. package/schemas/manifest.v4.schema.json +755 -0
  279. package/schemas/manifest.v5.schema.json +755 -0
  280. package/schemas/manifest.v6.schema.json +811 -0
  281. package/schemas/skill.context.jsonld +279 -0
  282. package/schemas/skill.schema.json +919 -0
  283. package/schemas/skill.v2.schema.json +201 -0
  284. package/schemas/skill.v3.schema.json +827 -0
  285. package/schemas/skill.v4.schema.json +822 -0
  286. package/schemas/skill.v5.schema.json +830 -0
  287. package/schemas/skill.v6.schema.json +946 -0
  288. package/schemas/vocabulary/keywords.json +180 -0
  289. package/schemas/vocabulary/workspace_tags.json +23 -0
  290. package/scripts/__tests__/migrate-skill-v2-to-v3.test.js +161 -0
  291. package/scripts/__tests__/migrate-skill-v3-to-v4.test.js +158 -0
  292. package/scripts/__tests__/test-export-parser-drift.js +149 -0
  293. package/scripts/__tests__/test-marketplace-export.js +114 -0
  294. package/scripts/__tests__/test-router-paths.js +82 -0
  295. package/scripts/__tests__/test-stability-promotion.js +244 -0
  296. package/scripts/__tests__/test-v3-1-alias-contract.js +109 -0
  297. package/scripts/__tests__/test-v3-1-skos-runtime.js +116 -0
  298. package/scripts/backfill-schema-version.js +198 -0
  299. package/scripts/build-field-reference.js +160 -0
  300. package/scripts/build-retrieval-baseline.js +511 -0
  301. package/scripts/check-markdown-links.js +211 -0
  302. package/scripts/check-protocol-consistency.js +979 -0
  303. package/scripts/export-marketplace-skills.js +610 -0
  304. package/scripts/export-skill.js +374 -0
  305. package/scripts/generate-manifest.js +787 -0
  306. package/scripts/lib/alias-contract.js +83 -0
  307. package/scripts/lib/audit-prompt-builder.js +771 -0
  308. package/scripts/lib/mock-grader.js +134 -0
  309. package/scripts/lib/parse-frontmatter.js +429 -0
  310. package/scripts/lib/roots.js +119 -0
  311. package/scripts/lint/check-archetype-sections.js +185 -0
  312. package/scripts/lint/check-category-enum.js +83 -0
  313. package/scripts/lint/check-routing-eval.js +146 -0
  314. package/scripts/lint/check-routing-quality.js +211 -0
  315. package/scripts/lint/check-stability-promotion.js +220 -0
  316. package/scripts/lint/format-code-frame.js +206 -0
  317. package/scripts/marketplace-install.js +125 -0
  318. package/scripts/migrate-category-to-enum.js +169 -0
  319. package/scripts/migrate-skill-v2-to-v3.js +424 -0
  320. package/scripts/migrate-skill-v3-to-v4.js +200 -0
  321. package/scripts/migrate-skill-v5-to-v6.js +304 -0
  322. package/scripts/restructure-by-category.js +85 -0
  323. package/scripts/seed-publication-classification.js +282 -0
  324. package/scripts/skill-audit.js +893 -0
  325. package/scripts/skill-graph-drift.js +483 -0
  326. package/scripts/skill-graph-route.js +766 -0
  327. package/scripts/skill-graph-routing-eval.js +393 -0
  328. package/scripts/skill-lint.js +1317 -0
  329. package/scripts/skill-overlap.js +213 -0
  330. package/scripts/verify-skill-md-export.js +201 -0
@@ -0,0 +1,320 @@
1
+ ---
2
+ name: context-engineering
3
+ description: "Use when designing what information reaches an LLM agent before it reasons — system prompt, persistent memory, always-loaded rules, injected skills, and the user prompt — or when diagnosing why an agent produced a wrong answer despite a clear instruction. Covers the four context failure modes (missing, stale, wrong, overwhelming), the five-layer context stack, four context quality metrics (injection precision and recall, utilization, freshness), the Frequent Intentional Compaction (FIC) protocol, subagent delegation for context-heavy work, and the failure-mode decision tree. Do NOT use for prompt wording (use `prompt-craft`), authoring a new SKILL.md (use `skill-scaffold`), or deciding which skill the router activates for a given query (use `skill-router`)."
4
+ license: MIT
5
+ compatibility: "Provider-agnostic; principles apply across Claude, GPT, Gemini, and open-weight models. Layer mapping varies by harness (Claude Code, OpenCode, Cursor, Continue, Aider) but the five-layer abstraction holds."
6
+ allowed-tools: Read Grep Bash Edit
7
+ metadata:
8
+ metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"agent\",\"domain\":\"agent/context\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-06\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-06\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"context engineering\\\\\\\",\\\\\\\"context failure\\\\\\\",\\\\\\\"agent context\\\\\\\",\\\\\\\"context quality\\\\\\\",\\\\\\\"context design\\\\\\\",\\\\\\\"missing context\\\\\\\",\\\\\\\"stale context\\\\\\\",\\\\\\\"wrong context\\\\\\\",\\\\\\\"overwhelming context\\\\\\\",\\\\\\\"context window\\\\\\\",\\\\\\\"context utilization\\\\\\\",\\\\\\\"injection precision\\\\\\\",\\\\\\\"injection recall\\\\\\\",\\\\\\\"frequent intentional compaction\\\\\\\",\\\\\\\"FIC\\\\\\\",\\\\\\\"context compaction\\\\\\\",\\\\\\\"subagent delegation\\\\\\\",\\\\\\\"context stack\\\\\\\",\\\\\\\"context layers\\\\\\\",\\\\\\\"skill injection\\\\\\\",\\\\\\\"context engineering stack\\\\\\\",\\\\\\\"agent failure diagnosis\\\\\\\",\\\\\\\"why did the agent fail\\\\\\\"]\",\"examples\":\"[\\\\\\\"the agent ignored the instruction and used the wrong query helper — was the right skill loaded?\\\\\\\",\\\\\\\"we keep getting generic answers from the agent even though the skill has the answer — what's wrong?\\\\\\\",\\\\\\\"I want to design which skills get injected for which prompts — where do I start?\\\\\\\",\\\\\\\"the agent's quality drops in long sessions — when should I compact?\\\\\\\",\\\\\\\"diagnose this agent failure: it had the file open but produced wrong output anyway\\\\\\\",\\\\\\\"we have 200 skills and the agent picks the wrong ones — fix the injection\\\\\\\",\\\\\\\"should I read this 5K-line file directly or delegate to a subagent?\\\\\\\",\\\\\\\"audit our context pipeline — what's loaded when, and is any of it 
stale?\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"improve this prompt's wording to get better outputs\\\\\\\",\\\\\\\"scaffold a new SKILL.md for our team's deploy procedure\\\\\\\",\\\\\\\"the router picked the wrong skill for this query — debug it\\\\\\\",\\\\\\\"review this AI-generated PR for correctness\\\\\\\",\\\\\\\"write a doc explaining our agent system to a new joiner\\\\\\\",\\\\\\\"investigate why production crashed at 3am\\\\\\\"]\",\"relations\":\"{\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"prompt-craft\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"prompt-craft writes the wording of one instruction; context-engineering shapes the entire surrounding payload (rules, memory, skills, file reads) that the prompt sits inside\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"skill-scaffold\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"skill-scaffold authors the structure of a single SKILL.md file; context-engineering decides which skills should exist, get loaded, and reach the model in the first place\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"skill-router\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"skill-router is the runtime mechanism that selects skills for a query; context-engineering is the design discipline behind the entire context stack the router operates within\\\\\\\"}],\\\\\\\"related\\\\\\\":[\\\\\\\"prompt-craft\\\\\\\",\\\\\\\"skill-router\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"code-review\\\\\\\"]}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":90,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/context-engineering/SKILL.md\"}"
9
+ skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
10
+ skill_graph_protocol: Skill Metadata Protocol v4
11
+ skill_graph_project: Skill Graph
12
+ skill_graph_canonical_skill: skills/context-engineering/SKILL.md
13
+ ---
14
+
15
+ # Context Engineering
16
+
17
+ ## Coverage
18
+
19
+ - Core principle: the model is a reasoning engine that reasons over whatever is in its context window — wrong context produces correct reasoning over false premises
20
+ - The five-layer context stack: system prompt, persistent memory, always-loaded rules, injected skills, agent prompt — what each layer does and how each can fail
21
+ - The four context failure modes: missing, stale, wrong, overwhelming — diagnostic questions for each, table of symptoms, and prevention strategies
22
+ - Four context quality metrics: injection precision, injection recall, context utilization, freshness score — definitions, healthy ranges, and how to measure each
23
+ - Frequent Intentional Compaction (FIC): proactive compaction at task boundaries, target utilization range, and the difference between planned and forced compaction
24
+ - Subagent delegation pattern: when to delegate context-heavy investigation to a subagent so the main agent receives a summary instead of raw evidence
25
+ - Debugging decision tree: how to diagnose any agent failure by walking from missing-context through overwhelming-context before blaming the model
26
+ - The verification checklist: gates a context-engineering review must pass before declaring the pipeline healthy
27
+
28
+ ## Philosophy
29
+
30
+ The model is a reasoning engine that reasons over whatever is in its context window. If the context is wrong, the reasoning can be valid and the conclusion still wrong. This means most agent failures are context failures, not model failures.
31
+
32
+ Without this discipline, teams blame the model for mistakes caused by missing keywords, stale skill content, or an overwhelmed window. Context engineering provides the diagnostic framework to identify *why* an agent produced a wrong answer and the design principles to prevent recurrence. It treats the context window as a deliberate design surface — not a dumping ground — so that the model's native reasoning produces the correct output without heroic prompting.
33
+
34
+ > Most agent failures are context failures, not model failures. Context engineering is the discipline of designing what information the model sees, when it sees it, and in what form — so that the model's native reasoning produces the correct output without heroic prompting.
35
+
36
+ ## Core Principle
37
+
38
+ The model is a reasoning engine. It reasons over whatever context is in its window. If the context is wrong, the reasoning can be valid and the conclusion still wrong. **You cannot prompt your way out of a context problem.**
39
+
40
+ Three implications:
41
+
42
+ 1. **Garbage in, garbage out** — a perfectly written prompt cannot compensate for missing domain knowledge. The model will hallucinate confidently.
43
+ 2. **Signal-to-noise ratio matters** — flooding the context window with irrelevant information degrades performance just as surely as omitting critical information. The model attends to everything.
44
+ 3. **Context is a design surface** — the information in the context window is as intentional as a database schema or an API contract. It should be designed, measured, and iterated.
45
+
46
+ ## The Five-Layer Context Stack
47
+
48
+ ```
49
+ Layer 5 — Agent prompt (the task-specific instruction from user or orchestrator)
50
+ Layer 4 — Injected skills (domain knowledge selected per task by a router or injector)
51
+ Layer 3 — Always-loaded rules (universal guardrails: security, naming, GDPR, etc.)
52
+ Layer 2 — Persistent memory (cross-session knowledge: user preferences, prior decisions)
53
+ Layer 1 — System prompt (foundational identity, non-negotiable rules, reading order)
54
+ ```
55
+
56
+ Each layer adds context. Each layer can introduce failure. The context engineer's job is to ensure the right information reaches the right layer at the right time.
57
+
58
+ | Layer | Loaded when | Failure mode if broken |
59
+ |-------|-------------|-----------------------|
60
+ | Layer 1 — System prompt | Always (turn 0) | Agent ignores fundamental rules and identity |
61
+ | Layer 2 — Persistent memory | Always (auto-loaded at session start) | Agent repeats prior mistakes or ignores user preferences |
62
+ | Layer 3 — Always-loaded rules | Always (auto-loaded by harness) | Agent violates universal guardrails |
63
+ | Layer 4 — Injected skills | Per task, via router or keyword match | Agent lacks domain knowledge for the task |
64
+ | Layer 5 — Agent prompt | Per request | Agent operates on an ambiguous or underspecified instruction |
65
+
66
+ The exact mechanism varies by harness — Claude Code uses `CLAUDE.md` + `.claude/rules/` + `MEMORY.md`; Cursor uses `.cursorrules`; OpenCode uses `AGENTS.md`; Aider uses `CONVENTIONS.md`. The abstraction holds across all of them.
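
A minimal sketch of the stack as data may help make the table concrete. It assumes a hypothetical `Layer` record and a hypothetical `assemble_context` helper; neither is part of the Skill Graph CLI or any harness, and real harnesses assemble context in their own ways. The sketch orders layers from 1 to 5 and reports which expected layers are empty, which is the first thing to check when a layer-specific failure from the table shows up.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer of the five-layer stack (hypothetical record, for illustration)."""
    number: int   # 1 = system prompt ... 5 = agent prompt
    name: str
    content: str


def assemble_context(layers: list[Layer]) -> str:
    """Join non-empty layers, lowest layer number first, under labeled headers."""
    ordered = sorted(layers, key=lambda layer: layer.number)
    parts = [
        f"## Layer {layer.number}: {layer.name}\n{layer.content.strip()}"
        for layer in ordered
        if layer.content.strip()
    ]
    return "\n\n".join(parts)


def missing_layers(layers: list[Layer], expected=(1, 2, 3, 4, 5)) -> list[int]:
    """Return the expected layer numbers that are absent or empty."""
    present = {layer.number for layer in layers if layer.content.strip()}
    return [number for number in expected if number not in present]


layers = [
    Layer(5, "Agent prompt", "Fix the failing checkout test."),
    Layer(1, "System prompt", "You are a careful coding agent."),
    Layer(2, "Persistent memory", "   "),  # loaded, but empty
    Layer(3, "Always-loaded rules", "Never log secrets."),
    Layer(4, "Injected skills", "Use the shared query helper."),
]

context = assemble_context(layers)
gaps = missing_layers(layers)
```

Here `gaps` flags layer 2, which matches the table's "agent repeats prior mistakes or ignores user preferences" row: the layer exists in the design but contributed nothing to this request.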
67
+
68
+ ## The Four Context Failure Modes
69
+
70
+ Most agent mistakes can be traced to one of four context failures. Diagnosing which failure mode caused a mistake is the first step to fixing it.
71
+
72
+ ### Missing Context
73
+
74
+ The agent does not have the information it needs. This is the most common failure mode and the easiest to diagnose.
75
+
76
+ | Symptom | Root cause | Fix |
77
+ |---------|-----------|-----|
78
+ | Agent uses wrong API or helper despite a skill that names the correct one | Skill not injected for this prompt's keywords | Add task-phrase keywords to the skill; verify routing |
79
+ | Agent ignores a project convention (naming, structure, format) | Convention not in always-loaded rules | Promote the rule from skill to always-loaded layer |
80
+ | Agent contradicts a decision made in a prior session | Decision not in persistent memory | Save the decision to a memory topic file |
81
+ | Agent uses a deprecated pattern | Skill content does not name the current replacement | Update the skill with the current pattern + deprecation note |
82
+
83
+ **Diagnostic question:** "Did the agent have access to the information that would have prevented this mistake?"
84
+
85
+ ### Stale Context
86
+
87
+ The agent has the information, but it is outdated. The skill says X, but the codebase has moved to Y.
88
+
89
+ | Symptom | Root cause | Fix |
90
+ |---------|-----------|-----|
91
+ | Agent references a file that was renamed several sessions ago | Skill references not updated after the rename | Run drift check; update file paths in the skill |
92
+ | Agent uses an old version of an external API | Skill content not refreshed after the upstream change | Update the skill, bump `freshness`, link new docs |
93
+ | Agent applies a pattern that was superseded by a recorded decision | The supersession is not cross-referenced from the skill | Add a "superseded by" pointer in the skill body |
94
+ | Memory says "blocked by X" but X was resolved last week | Memory topic file not pruned | Update or remove the stale memory entry |
95
+
96
+ **Diagnostic question:** "Was this information correct when it was written? Has the source of truth changed since?"
97
+
98
+ **Prevention:** every skill carries a `drift_check.last_verified` date. Skills whose verification is older than the lifecycle policy (e.g. 90 days for portable, 30 days for integration skills) should be reviewed before use in a high-stakes task.
99
+
100
+ ### Wrong Context
101
+
102
+ The agent has information, but it is incorrect. The most dangerous failure mode because the agent acts confidently on false premises.
103
+
104
+ | Symptom | Root cause | Fix |
105
+ |---------|-----------|-----|
106
+ | Agent applies a pattern from a different project / different library version | Skill content was copied without adapting | Audit skill for cross-project accuracy; add scope qualifier |
107
+ | Agent follows a rule that contradicts another rule | Two rules conflict without precedence | Establish explicit precedence or merge the rules |
108
+ | Agent uses a formula with incorrect semantics | Formula in skill has a bug | Verify formulas against actual implementation code, with a line-number citation |
109
+ | Agent cites a reference that says the opposite of what the skill claims | Hallucinated or misquoted reference | Add source attribution with file path and line numbers |
110
+
111
+ **Diagnostic question:** "Was the information the agent acted on actually correct?"
112
+
113
+ **Prevention:** skill content cites specific files and line numbers. Generic "best practice" advice without grounded evidence is a wrong-context risk.
114
+
115
+ ### Overwhelming Context
116
+
117
+ The agent has too much information. The signal is diluted by noise; the model's attention is spread too thin.
118
+
119
+ | Symptom | Root cause | Fix |
120
+ |---------|-----------|-----|
121
+ | Agent ignores a critical rule buried in a 2000-line skill | Skill is a monolith without internal structure | Split into thin SKILL.md + `references/` for depth |
122
+ | Agent produces generic output despite specific guidance | Too many skills injected, none deeply read | Improve injection precision; remove overly broad keywords |
123
+ | Agent's work quality degrades in the second half of a long session | Context window approaching capacity | Apply FIC at task breakpoints |
124
+ | Agent follows a less-relevant rule over a more-relevant one | Rules not prioritised by specificity | Use always-loaded rules for universal guardrails; skills for domain-specific |
125
+
126
+ **Diagnostic question:** "Did the agent have so much context that it couldn't focus on the right information?"
127
+
128
+ **Prevention:** measure injection precision. If more than 30% of injected skills are irrelevant to the task, the injection system needs tuning.
129
+
130
+ ## Context Quality Metrics
131
+
132
+ Four metrics measure the health of a context engineering system. Track these over time to identify systemic issues.
133
+
134
+ ### Injection Precision
135
+
136
+ Of all skills injected for a task, what percentage were actually needed?
137
+
138
+ ```
139
+ Injection Precision = (skills used / skills injected) × 100
140
+ ```
141
+
142
+ | Score | Interpretation | Action |
143
+ |-------|---------------|--------|
144
+ | > 80% | Healthy | Maintain |
145
+ | 50–80% | Noisy | Tighten keywords; remove overly broad trigger phrases |
146
+ | < 50% | Broken | Audit keywords; too many skills match too many prompts |
147
+
148
+ **How to measure:** after completing a task, review which injected skills were actually referenced in the agent's work. Skills injected but never consulted are false positives.
149
+
150
+ ### Injection Recall
151
+
152
+ Of all skills that would have been useful for a task, what percentage were actually injected?
153
+
154
+ ```
155
+ Injection Recall = (relevant skills injected / relevant skills total) × 100
156
+ ```
157
+
158
+ | Score | Interpretation | Action |
159
+ |-------|---------------|--------|
160
+ | > 90% | Healthy | Maintain |
161
+ | 70–90% | Gaps exist | Add missing keywords to under-matched skills |
162
+ | < 70% | Systematic failure | Review skill descriptions; add trigger phrases |
163
+
164
+ **How to measure:** after a context failure, check whether the skill that would have prevented it was indexed and whether its keywords matched the prompt.
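The two injection formulas above can be computed from three sets recorded per task. This is a minimal sketch, not part of the package: the function names, the band labels, and the example skill names are all hypothetical, and treating an empty injection as 0% precision is an assumed convention.

```python
def injection_metrics(injected: set[str], used: set[str], relevant: set[str]) -> dict[str, float]:
    """Return injection precision and recall as percentages.

    injected: skills the router loaded for the task
    used:     injected skills the agent actually consulted
    relevant: skills judged useful for the task in hindsight
    """
    # Assumed convention: nothing injected counts as 0% precision.
    precision = 100 * len(used & injected) / len(injected) if injected else 0.0
    # Assumed convention: nothing relevant counts as 100% recall.
    recall = 100 * len(relevant & injected) / len(relevant) if relevant else 100.0
    return {"precision": precision, "recall": recall}


def band(score: float, healthy: float, floor: float) -> str:
    """Map a score onto a three-row table: above healthy, in between, below floor."""
    if score > healthy:
        return "healthy"
    return "degraded" if score >= floor else "failing"


# Hypothetical task: four skills injected, two consulted, one useful skill missed.
metrics = injection_metrics(
    injected={"billing", "webhooks", "charts", "auth"},
    used={"billing", "webhooks"},
    relevant={"billing", "webhooks", "idempotency"},
)
print(metrics)
print(band(metrics["precision"], healthy=80, floor=50))
print(band(metrics["recall"], healthy=90, floor=70))
```

Keeping the thresholds as arguments lets the same helper serve both tables, since precision and recall use different bands.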
165
+
166
+ ### Context Utilization
167
+
168
+ What percentage of the context window is used productively (contributes to correct output)?
169
+
170
+ ```
171
+ Context Utilization = (productive context / total context used) × 100
172
+ ```
173
+
174
+ Productive context: relevant skill content, necessary file reads, on-topic conversation history. Unproductive context: irrelevant skills, stale memory, redundant file reads, verbose tool output.
175
+
176
+ | Score | Interpretation | Action |
177
+ |-------|---------------|--------|
178
+ | > 70% | Efficient | Maintain |
179
+ | 40–70% | Acceptable but improvable | Compact stale conversation; reduce verbose tool output |
180
+ | < 40% | Wasteful | Apply FIC; review what loads at startup |
181
+
182
+ ### Freshness Score
183
+
184
+ What percentage of injected skill content is current (drift_check within the lifecycle window)?
185
+
186
+ ```
187
+ Freshness Score = (skills with drift_check inside window / total injected skills) × 100
188
+ ```
189
+
190
+ | Score | Interpretation | Action |
191
+ |-------|---------------|--------|
192
+ | > 90% | Current | Maintain |
193
+ | 70–90% | Drifting | Schedule drift check for stale skills |
194
+ | < 70% | Dangerous | Stop; audit all stale skills before continuing |
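As an illustration of the Freshness Score formula, the sketch below compares each injected skill's `drift_check.last_verified` date with a per-skill window in days. The function name, the input shapes, and the skill names other than `context-graph` are assumptions made for the example; the 90-day and 30-day windows are the example values from the Stale Context section.

```python
from datetime import date


def freshness_score(last_verified: dict[str, date], window_days: dict[str, int], today: date) -> float:
    """Percentage of injected skills verified inside their lifecycle window."""
    if not last_verified:
        return 100.0  # assumed convention: no injected skills means nothing is stale
    fresh = sum(
        1
        for name, verified in last_verified.items()
        if (today - verified).days <= window_days[name]
    )
    return 100 * fresh / len(last_verified)


# Hypothetical injection set: two portable skills (90 days), one integration skill (30 days).
score = freshness_score(
    last_verified={
        "context-graph": date(2026, 5, 13),
        "billing-webhooks": date(2026, 3, 1),
        "naming-rules": date(2026, 4, 20),
    },
    window_days={"context-graph": 90, "billing-webhooks": 30, "naming-rules": 90},
    today=date(2026, 6, 1),
)
print(round(score, 1))
```

In this example only the integration skill falls outside its window, which is enough to pull the score below the 70% line in the table.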
195
+
196
+ ## Frequent Intentional Compaction (FIC)
197
+
198
+ FIC is a proactive context-management strategy. Instead of waiting for the window to fill and reacting with an emergency compact, plan compaction points into the workflow.
199
+
200
+ **Target utilization:** 40–60% of the context window during steady-state work.
201
+
202
+ **When to compact:** at natural breakpoints, not when forced by pressure.
203
+
204
+ | Breakpoint | Action |
205
+ |-----------|--------|
206
+ | Task boundary (one task done, before starting next) | Compact — summarise what was accomplished, discard working details |
207
+ | Research complete, implementation starting | Compact — keep conclusions, discard search results and exploration |
208
+ | After reading a large file or running an enumeration | Summarise key findings; do not keep raw output in conversation |
209
+ | After a debugging session | Keep the fix and the root cause; discard the investigation steps |
210
+
211
+ The forced compact is dangerous because it is uncontrolled — you lose whatever the compaction algorithm decides is least important, which may include critical task context.
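One way to turn the target band and the breakpoint table into a decision rule is sketched below. The function name and the returned labels are hypothetical. The 40% and 60% cut-offs come from the target utilization above and the 80% cut-off from the debugging decision tree, but combining them into a single policy is an assumption of this sketch, not something the harness provides.

```python
def compaction_action(used_tokens: int, window_tokens: int, at_breakpoint: bool) -> str:
    """Suggest a compaction step from window utilization and workflow position."""
    utilization = used_tokens / window_tokens
    if at_breakpoint and utilization >= 0.40:
        # Natural boundary with something worth discarding: compact on your own terms.
        return "compact"
    if utilization > 0.80:
        # A forced compact is close; stop at the nearest safe point instead.
        return "compact-at-nearest-safe-point"
    if utilization > 0.60:
        # Above the steady-state target; plan the next breakpoint.
        return "schedule-compaction"
    return "continue"


print(compaction_action(90_000, 200_000, at_breakpoint=True))
print(compaction_action(130_000, 200_000, at_breakpoint=False))
```

The breakpoint check comes first on purpose: a planned compact at a task boundary is preferred over waiting for the utilization thresholds to fire.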
212
+
213
+ ### FIC Anti-Patterns
214
+
215
+ | Anti-pattern | Why it fails | Fix |
216
+ |-------------|-------------|-----|
217
+ | Never compacting ("I might need that later") | Window fills; forced compact loses more than planned compact would | Compact proactively at breakpoints |
218
+ | Compacting too aggressively ("keep only the plan") | Loses critical decisions and constraints | Keep decisions, constraints, and the active plan; discard exploration |
219
+ | Reading entire large files "just in case" | Wastes 5–10K tokens per file | Read targeted sections; use grep to find relevant lines |
220
+ | Keeping full tool output in context | JSON or log output is enormous and rarely re-used | Summarise tool results immediately after reading |
221
+
222
+ ## Subagent Delegation for Context-Heavy Work
223
+
224
+ Some tasks consume enormous context (reading 20 files, searching across the codebase, analysing dependencies). Doing this in the main session pollutes its context for every subsequent step. Delegate to a subagent instead:
225
+
226
+ ```
227
+ Main agent: "I need to understand the data pipeline that feeds the dashboard"
228
+ ↓ spawn subagent with narrow scope
229
+ Subagent: reads 15 files, traces the pipeline, builds a mental model
230
+ ↓ returns a 200-word structured summary
231
+ Main agent: receives summary (a few hundred tokens, not ≈15,000)
232
+ ```
233
+
234
+ The subagent's context is disposable. The main agent gets the conclusion without the investigation cost.
235
+
236
+ This pattern is especially valuable for: codebase exploration, audit work, dependency tracing, multi-file refactor planning, and any "look at everything before deciding" investigation.
237
+
238
+ ## Debugging Agent Failures
239
+
240
+ When an agent produces wrong output, walk this decision tree before blaming the model:
241
+
242
+ ```
243
+ Agent produced wrong output
244
+
245
+
246
+ Was the right skill (or rule) loaded?
247
+
248
+ NO ──▶ MISSING CONTEXT
249
+ │ Fix: add keywords; promote rule to always-loaded layer; save the decision
250
+
251
+ YES
252
+
253
+
254
+ Was the loaded content correct and current?
255
+
256
+ NO ──▶ STALE or WRONG CONTEXT
257
+ │ Fix: update the skill; verify against source; cite line numbers
258
+
259
+ YES
260
+
261
+
262
+ Was the context window > 80% full?
263
+
264
+ YES ──▶ OVERWHELMING CONTEXT
265
+ │ Fix: apply FIC; delegate investigation to a subagent
266
+
267
+ NO
268
+
269
+
270
+ Was the prompt itself ambiguous or underspecified?
271
+
272
+ YES ──▶ Prompt problem (use prompt-craft)
273
+
274
+ NO
275
+
276
+
277
+ Genuine model reasoning failure (rare; try a different model or add chain-of-thought)
278
+ ```
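The decision tree above can also be written as a short function, which helps when failure reports are filed in bulk. This is a sketch only: `FailureEvidence` and `diagnose` are hypothetical names, and the four inputs are answers a reviewer supplies by hand.

```python
from dataclasses import dataclass


@dataclass
class FailureEvidence:
    right_skill_loaded: bool           # was the right skill (or rule) loaded?
    content_correct_and_current: bool  # was the loaded content correct and current?
    window_utilization: float          # fraction of the context window in use, 0.0 to 1.0
    prompt_ambiguous: bool             # was the prompt ambiguous or underspecified?


def diagnose(evidence: FailureEvidence) -> str:
    """Walk the questions in the same order as the decision tree."""
    if not evidence.right_skill_loaded:
        return "Missing context"
    if not evidence.content_correct_and_current:
        return "Stale or wrong context"
    if evidence.window_utilization > 0.80:
        return "Overwhelming context"
    if evidence.prompt_ambiguous:
        return "Prompt problem"
    return "Model reasoning failure"


print(diagnose(FailureEvidence(True, True, 0.91, False)))
```

The order of the checks matters: a full window is only blamed after missing, stale, and wrong context have been ruled out.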
279
+
280
+ ### Failure analysis template
281
+
282
+ When recording a context failure for retrospective:
283
+
284
+ ```markdown
285
+ ## Context Failure Report
286
+
287
+ Date: YYYY-MM-DD
288
+ Task: [what the agent was asked to do]
289
+ Failure: [what went wrong]
290
+ Failure mode: Missing | Stale | Wrong | Overwhelming
291
+ Root cause: [why the context was bad]
292
+ Fix applied: [what was changed to prevent recurrence]
293
+ Layer affected: Layer 1 / 2 / 3 / 4 / 5 (system prompt / memory / rules / skills / prompt)
294
+ ```
295
+
296
+ ## Verification
297
+
298
+ Use this checklist when designing a new skill, debugging a failure, or auditing the context pipeline:
299
+
300
+ - [ ] Every agent failure was diagnosed to one of the four failure modes (Missing, Stale, Wrong, Overwhelming) before any fix landed
301
+ - [ ] Skills involved in any failure now have keywords that match how users actually phrased the request
302
+ - [ ] Skill content cites specific files and line numbers, not generic advice
303
+ - [ ] `drift_check.last_verified` is inside the lifecycle window for every active skill
304
+ - [ ] No two skills give contradictory advice on the same topic (`relations.boundary` is honest about ownership)
305
+ - [ ] Context utilization is tracked at task breakpoints; FIC fires at natural boundaries
306
+ - [ ] Injection precision is above 80% (most injected skills are actually consulted)
307
+ - [ ] Injection recall is above 90% (most needed skills are injected)
308
+ - [ ] Context-heavy investigations are delegated to subagents so the main agent receives summaries, not raw evidence
309
+ - [ ] Always-loaded rules contain only universal guardrails; domain-specific guidance lives in skills
310
+
311
+ ## Do NOT Use When
312
+
313
+ | Use instead | When |
314
+ |---|---|
315
+ | `prompt-craft` | The fix is in the wording of one instruction (clarity, format, few-shot examples), not the surrounding stack |
316
+ | `skill-scaffold` | Authoring or restructuring a single SKILL.md file (the contract, not the system around it) |
317
+ | `skill-router` | Debugging which skill the router activates for a specific query — that is a routing-mechanism question, not a context-design question |
318
+ | `code-review` | Reviewing AI-generated code for correctness, security, or style |
319
+ | `documentation` | Writing prose for a human reader about how the agent system works |
320
+ | `debugging` | Investigating a runtime production failure that is not specifically an agent context failure |
@@ -0,0 +1,174 @@
1
+ ---
2
+ name: context-graph
3
+ description: "Use when designing or auditing the multi-graph context architecture of an AI-coding workspace: skill graph, document routing graph, memory index, script registry, and the cross-graph edges between them. Covers edge typing, orphan detection, connectivity health, deterministic graph synthesis signals, change-propagation checks, and drift or hub-and-spoke anti-patterns. Do NOT use for authoring one SKILL.md (use `skill-scaffold`), validating one skill (use `graph-audit`), live routing decisions (use `skill-router`), context-window budgeting (use `context-window`), or session load/drop choices (use `context-management`)."
4
+ license: MIT
5
+ compatibility: "Architecture-level skill. Applies to any agent-coding workspace that has more than one skill / doc-routing / memory artifact and any way to traverse them — Claude Code, OpenCode, Cursor, Aider, Continue, Copilot Workspace, or a custom harness. The four-graph model and the orphan / connectivity metrics are independent of the specific runtime."
6
+ allowed-tools: Read Grep
7
+ metadata:
8
+ metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"agent\",\"domain\":\"agent/context\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-06\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-13\\\\\\\",\\\\\\\"truth_source_hashes\\\\\\\":{\\\\\\\"SKILL_GRAPH.md\\\\\\\":\\\\\\\"a63fc59a1d99933fc6bc5c4033f8be86ee2f5460f3b5d5ab232a3f77eff71c8f\\\\\\\",\\\\\\\"docs/PRIMER.md\\\\\\\":\\\\\\\"e6bd99468c224fe4c9606e147c5db94dff889feeb9ca5d80084480039c7e9296\\\\\\\",\\\\\\\"docs/concept-map.md\\\\\\\":\\\\\\\"053e5cf891d7abf7efb30c9014fd2365da7d13588ef6b55057520756a315dc8c\\\\\\\",\\\\\\\"docs/diagrams/starter-graph.mmd\\\\\\\":\\\\\\\"6aeaa417e08efbb6bb90b34f56c8df7f06dc608e3c8cd743d428da9fcfaf278c\\\\\\\",\\\\\\\"scripts/generate-manifest.js\\\\\\\":\\\\\\\"9d7bbbdae440fdb1763d61ffa7bda10c9efae92359d1c2139d0e971582d59e0e\\\\\\\",\\\\\\\"scripts/skill-overlap.js\\\\\\\":\\\\\\\"ed642cbc677cc76ec1321300b37d6752337b6b5541c7a9f558fd315d6f934e4b\\\\\\\"}}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"context graph architecture\\\\\\\",\\\\\\\"multi-graph context model\\\\\\\",\\\\\\\"skill knowledge graph\\\\\\\",\\\\\\\"document routing graph\\\\\\\",\\\\\\\"memory index graph\\\\\\\",\\\\\\\"script command registry graph\\\\\\\",\\\\\\\"cross-graph edges\\\\\\\",\\\\\\\"orphan detection skill graph\\\\\\\",\\\\\\\"graph connectivity metrics\\\\\\\",\\\\\\\"average node degree\\\\\\\",\\\\\\\"hub-and-spoke anti-pattern\\\\\\\",\\\\\\\"reciprocal relations\\\\\\\",\\\\\\\"bidirectional graph edges\\\\\\\",\\\\\\\"change propagation across graphs\\\\\\\",\\\\\\\"edge type taxonomy\\\\\\\",\\\\\\\"adjacent boundary verify_with\\\\\\\",\\\\\\\"deterministic graph synthesis\\\\\\\",\\\\\\\"bundle co-membership\\\\\\\"]\",\"examples\":\"[\\\\\\\"we have ~300 skills but the agent never 
finds half of them — what's the diagnostic frame?\\\\\\\",\\\\\\\"how do I measure whether our skill graph is actually navigable vs just present?\\\\\\\",\\\\\\\"I changed a webhook handler — what's the discipline for tracing the impact across docs, skills, memory, and scripts?\\\\\\\",\\\\\\\"we keep accumulating orphan skills and our connectivity drops every quarter — how do I make graph-health a deliberate gate?\\\\\\\",\\\\\\\"the agent is loading 15 skills per task and burning context — is the underlying graph too dense, too sparse, or wrong-shaped?\\\\\\\",\\\\\\\"design a deterministic recipe for synthesizing the skill graph from frontmatter without running an LLM\\\\\\\",\\\\\\\"what's the right cap on adjacent / boundary / verify_with relations per skill?\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"scaffold a new SKILL.md from a template\\\\\\\",\\\\\\\"validate that this single skill's frontmatter matches the schema\\\\\\\",\\\\\\\"decide which skill to inject for this query right now\\\\\\\",\\\\\\\"this skill says 'use orgQuery'; that one says 'never use orgQuery' — fix the conflict\\\\\\\",\\\\\\\"decide what should and shouldn't be in this agent's context window for this task\\\\\\\",\\\\\\\"review this AI-generated PR for correctness\\\\\\\"]\",\"relations\":\"{\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"skill-router\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"skill-router is the per-query dispatch decision (which skill activates now); context-graph is the underlying graph the router traverses\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"graph-audit\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"graph-audit validates one skill's schema and relation-target existence; context-graph reasons about the topology of the whole graph (orphans, connectivity, edge cap discipline)\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"skill-infrastructure\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"skill-infrastructure owns the live skill library tooling (census, conflict detection, 
routing-gap reporting); context-graph owns the architectural model behind it\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"skill-scaffold\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"skill-scaffold authors a single SKILL.md; context-graph designs the graph that those authored skills participate in\\\\\\\"}],\\\\\\\"related\\\\\\\":[\\\\\\\"skill-router\\\\\\\",\\\\\\\"graph-audit\\\\\\\",\\\\\\\"skill-infrastructure\\\\\\\",\\\\\\\"skill-scaffold\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"graph-audit\\\\\\\",\\\\\\\"skill-infrastructure\\\\\\\"]}\",\"grounding\":\"{\\\\\\\"domain_object\\\\\\\":\\\\\\\"Skill Graph library topology and context discovery model\\\\\\\",\\\\\\\"grounding_mode\\\\\\\":\\\\\\\"hybrid\\\\\\\",\\\\\\\"truth_sources\\\\\\\":[\\\\\\\"SKILL_GRAPH.md\\\\\\\",\\\\\\\"docs/PRIMER.md\\\\\\\",\\\\\\\"docs/concept-map.md\\\\\\\",\\\\\\\"docs/diagrams/starter-graph.mmd\\\\\\\",\\\\\\\"scripts/generate-manifest.js\\\\\\\",\\\\\\\"scripts/skill-overlap.js\\\\\\\"],\\\\\\\"failure_modes\\\\\\\":[\\\\\\\"inferred_edges_replace_authored_relations\\\\\\\",\\\\\\\"orphan_skills_remain_unreachable\\\\\\\",\\\\\\\"relation_caps_turn_into_hub_and_spoke_graph\\\\\\\",\\\\\\\"change_propagation_ignores_cross_graph_edges\\\\\\\"],\\\\\\\"evidence_priority\\\\\\\":\\\\\\\"repo_code_first\\\\\\\"}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":365,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/context-graph/SKILL.md\",\"skill_graph_export_description\":\"shortened for Agent Skills 1024-character description limit; canonical source keeps the full routing contract\",\"skill_graph_canonical_description_length\":\"1597\"}"
9
+ skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
10
+ skill_graph_protocol: Skill Metadata Protocol v4
11
+ skill_graph_project: Skill Graph
12
+ skill_graph_canonical_skill: skills/context-graph/SKILL.md
13
+ ---
14
+
15
+ # Context Graph
16
+
17
+ ## Coverage
18
+
19
+ The architectural model behind navigable context in an AI-coding workspace. Names the four interconnected graphs that any mature workspace accumulates (Skill Knowledge Graph, Document Routing Graph, Memory Index, Script / Command Registry) and the cross-graph edges that connect them (skill → script, memory → skill, doc-routing → doc, script → command). Specifies the three skill-graph edge types (`adjacent`, `boundary`, `verify_with`) and their per-edge-type caps. Defines orphan detection (a node with zero or near-zero incoming edges that agents cannot find by traversal) and the priority order for remediation (security skills first, then financial, integration, infrastructure, then UX). Specifies graph-connectivity metrics with healthy / unhealthy bands: connectivity, average degree, orphan rate, max degree, cluster count, hub-spoke ratio. Names the five deterministic signals that should drive graph synthesis (explicit prose references, manual `relations` frontmatter, bundle co-membership, shared routing labels, keyword overlap), never an LLM at synthesis time. Walks the change-propagation checklist that traces a single edit across all four graphs. Catalogs the anti-patterns that quietly destroy graph quality: edge inflation, one-way edges, optional-metadata mindset, AI-inferred edges that drift on rebuild, ignoring cross-graph edges.
20
+
21
+ ## Philosophy
22
+
23
+ Without a navigable graph, agents cannot discover context they did not already know existed. The original failure mode looks like this: a skill exists, the agent doesn't reference it by name in the current prompt, and the routing layer has no edge to find it from — so the skill might as well not exist. A workspace can ship hundreds of skills and still operate as if it had ten, because the other 290 are unreachable from any traversal an agent actually performs.
24
+
25
+ Context discovery is therefore a precondition for context quality. If the right skill, doc, or memory file cannot be found by following edges from the current task, content quality is irrelevant. Graph maintenance — adding edges, fixing orphans, capping inflation, keeping cross-graph references current — is a quality gate, not optional metadata. Every new skill enters the system with a question attached: who reaches this from where, by which edges?
26
+
27
+ The deterministic-signal discipline is the second non-negotiable. Graph synthesis must be a deterministic function of the authored artifacts (frontmatter relations, bundle membership, prose references, shared routing labels, keyword overlap) — not an LLM inference. If the graph drifts on rebuild, agents lose the one stable surface they have. Use AI to _suggest_ edges during authoring; never to _generate_ the live graph at runtime.
28
+
29
+ ## 1. The Four Context Graphs
30
+
31
+ A mature AI-coding workspace converges on four interconnected graphs:
32
+
33
+ ### Graph 1 — Skill Knowledge Graph
34
+
35
+ Nodes are skill files; edges are the typed relations declared in skill frontmatter. The job of this graph is _what knowledge exists in the workspace, and what knowledge teaches alongside what other knowledge_. The graph's vital signs are connectivity (no large isolated components), orphan rate (no skills nobody references), and edge-type discipline (each edge has a typed reason).
36
+
37
+ ### Graph 2 — Document Routing Graph
38
+
39
+ Nodes are documentation targets and change categories; edges express "when this kind of code changes, those docs must be updated." The job of this graph is _propagation_ — preventing stale docs by making the doc-update obligation visible at the point of code change. The graph is most valuable when it is read by humans during PR review and by agents during the wrap / closeout protocol, not when it is read by no one.
40
+
41
+ ### Graph 3 — Memory Index
42
+
43
+ Nodes are persistent memory topic files (decisions, observations, durable preferences); edges are the index entries that point from a topic table to the underlying file. The job of this graph is _cross-session knowledge persistence_ — the answer to "what did we already decide about X, why, and when did the decision become true." A memory graph that records facts but not the _why_ and _how_ of decisions cannot answer audit questions like "why did the agent choose Y?". Workspaces that need decision provenance extend the memory graph with the Process Knowledge Ontology pattern (modeling decisions, triggers, state transitions, and outcomes as first-class entities).
44
+
45
+ ### Graph 4 — Script / Command Registry
46
+
47
+ Nodes are scripts and commands; edges are the categorisations that group them by purpose. The job of this graph is _agent tooling discovery_ — when an agent needs a deterministic script or a slash command, the registry is what makes it findable without trial-and-error.
48
+
49
+ ### Cross-graph edges
50
+
51
+ The four graphs are _interconnected_. The cross-graph edges are where most of the propagation value lives:
52
+
53
+ | From | To | Edge type | Example |
54
+ | ----------- | ------- | -------------------------------------------------- | ------------------------------------------------------------------------- |
55
+ | Skill | Script | `key_file` (frontmatter `paths` or body reference) | A health-audit skill points at the script that runs the audit |
56
+ | Skill | Skill | `adjacent` / `boundary` / `verify_with` | Frontmatter relations |
57
+ | Script | Command | `consumed_by` | A loop-supervisor script is consumed by a manage-style command |
58
+ | Memory | Skill | `informs` | A memory file recording a billing strategy informs an agent-routing skill |
59
+ | Doc-routing | Doc | `requires_update` | A code change row points at the docs that must be updated together |
60
+
61
+ A workspace that names all four graphs _and_ their cross-graph edges has a complete map. A workspace that names only the skill graph has roughly a quarter of the picture.
62
+
63
+ ## 2. Edge Types in the Skill Graph
64
+
65
+ The skill graph uses three relation types. Each has a different _meaning_ and a different _cap_. Mixing them collapses the graph into noise.
66
+
67
+ | Type | Recommended cap | Meaning | Example |
68
+ | ------------- | --------------- | -------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- |
69
+ | `adjacent` | ≤ 5 per skill | Closely related — teach together; an agent loading one would benefit from also loading the other | A data-reconciliation skill ↔ a financial-correctness skill |
70
+ | `boundary` | ≤ 5 per skill | Contrasting — "do NOT use X for this; use Y instead." The router should _exclude_ the boundary skill when both match | A financial-correctness skill ↮ a data-visualisation skill |
71
+ | `verify_with` | ≤ 3 per skill | Cross-check skill output against this skill before trusting it | A financial-correctness skill → a code-logic skill |
72
+
73
+ The caps exist to prevent edge inflation. A skill with 12 `adjacent` relations is not "well-connected" — it is a hub that pulls every adjacent traversal toward itself, hiding more specific signals. Edge discipline beats edge volume.
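A cap check is easy to automate. The sketch below assumes a skill's relations have already been parsed into a plain mapping from edge type to target names; `CAPS` mirrors the recommended caps in the table, while `cap_violations` and the example skill names are hypothetical.

```python
# Recommended caps from the edge-type table.
CAPS = {"adjacent": 5, "boundary": 5, "verify_with": 3}


def cap_violations(relations: dict[str, list[str]]) -> dict[str, int]:
    """Return {edge_type: number of edges over the cap} for each capped type that exceeds it."""
    return {
        edge_type: len(targets) - CAPS[edge_type]
        for edge_type, targets in relations.items()
        if edge_type in CAPS and len(targets) > CAPS[edge_type]
    }


# Hypothetical skill with seven adjacent relations and two verify_with relations.
relations = {
    "adjacent": ["s1", "s2", "s3", "s4", "s5", "s6", "s7"],
    "verify_with": ["code-logic", "graph-audit"],
}
print(cap_violations(relations))
```

Edge types without a recommended cap are ignored by this check rather than rejected.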
74
+
75
+ ### `boundary` is exclusion, not adjacency
76
+
77
+ The most common edge-type confusion: putting "topical neighbour" skills in `boundary`. Boundary edges tell the router "if both this skill and the boundary skill match, route AWAY from the boundary skill" — they are _exclusion-with-a-reason_, not "see also." Putting a skill in `boundary` that should be in `related` will hijack the boundary skill's positive cases and depress its routing-eval pass rate. When in doubt, prefer `related` and only promote to `boundary` when the two skills genuinely _compete_ for the same prompt with different correct answers.
78
+
79
+ ## 3. Orphan Detection
80
+
81
+ An **orphan** is a node with zero (or near-zero) incoming edges. Nothing points at it, so traversal cannot reach it; agents have to know its exact name to find it. In a healthy graph, the orphan rate is below 10%. In an unhealthy graph it is the majority — and adding more skills makes the problem worse, not better, because each new skill is also unreachable.
82
+
83
+ ### Orphan-detection recipe
84
+
85
+ 1. Rebuild the graph from authored artefacts (deterministic synthesis).
86
+ 2. Walk every node, count its `degree` (incoming + outgoing).
87
+ 3. Flag every node with degree ≤ 1 as an orphan candidate.
88
+ 4. For each orphan: identify its domain cluster (layer, keywords, examples) and find 3–5 sibling skills that _should_ reference it.
89
+ 5. Add `relations` to the orphan and reciprocal references to its siblings — bidirectionally. A one-way edge from the orphan does not solve discovery, because the existing skills are where traversal _starts_.
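Steps 2 and 3 of the recipe reduce to a degree count. The sketch below assumes the rebuilt graph is available as a node list plus directed `(source, target)` pairs; `orphan_candidates` and the single-letter node names are hypothetical.

```python
from collections import Counter


def orphan_candidates(nodes: list[str], edges: list[tuple[str, str]]) -> list[str]:
    """Return nodes whose degree (incoming + outgoing) is at most 1, sorted by name."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # outgoing edge
        degree[target] += 1  # incoming edge
    return sorted(node for node in nodes if degree[node] <= 1)


nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("e", "a")]
print(orphan_candidates(nodes, edges))
```

In this example `d` has no edges at all, and `e` has only a single outgoing edge, so both are flagged. The second case is the situation step 5 warns about: an edge from the orphan alone does not make it discoverable.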
90
+
91
+ ### Remediation priority
92
+
93
+ Fix orphans in order of blast radius, not alphabetically:
94
+
95
+ 1. **Security and compliance skills** — data exposure risk if agents miss them
96
+ 2. **Correctness-critical skills** — financial, accounting, time, irreversible mutations
97
+ 3. **Integration skills** — webhook signature verification, idempotency, retry
98
+ 4. **Infrastructure skills** — operational impact (deploy, migrate, rollback)
99
+ 5. **UX / display skills** — lower blast radius; fix once higher-priority orphans are gone
100
+
101
+ ## 4. Graph Connectivity Metrics
102
+
103
+ These are the vital signs of a skill graph. Run them after every batch of skill additions or edge edits.
104
+
105
+ | Metric | Formula | Healthy band | Unhealthy signal |
106
+ | ------------------- | -------------------------------------- | --------------- | -------------------------------------------------------------------------------- |
107
+ | **Connectivity** | `connected_skills / total_skills` | > 95% | Multiple disconnected clusters indicate domain silos |
108
+ | **Average degree** | `total_edges × 2 / total_nodes` | > 5 | Below 3 means the graph is too sparse for traversal to be useful |
109
+ | **Orphan rate** | `nodes with degree ≤ 1 / total_nodes` | < 10% | Above 30% means agents cannot discover most of the library |
110
+ | **Max degree** | Highest degree of any node | < 30 | A single node with degree 50+ is a hub-and-spoke anti-pattern |
111
+ | **Cluster count** | Connected components | < 3 (ideally 1) | Many clusters means the workspace has informal silos that traversal can't bridge |
112
+ | **Hub-spoke ratio** | `nodes with degree > 15 / total_nodes` | < 5% | More than 10% means the graph is degenerating into a star around a few mega-hubs |
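Four of the metrics in the table can be computed in one pass over an edge list. This is a sketch under stated assumptions: edges are given as pairs and treated as undirected for component counting, `graph_metrics` is a hypothetical name, and the connectivity metric is left out because the table does not say how `connected_skills` is counted.

```python
from collections import Counter, defaultdict


def graph_metrics(nodes: list[str], edges: list[tuple[str, str]], hub_degree: int = 15) -> dict[str, float]:
    """Compute average degree, max degree, cluster count, and hub-spoke ratio."""
    if not nodes:
        raise ValueError("graph has no nodes")
    degree = Counter()
    neighbours = defaultdict(set)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Count connected components with an iterative depth-first search.
    seen: set[str] = set()
    clusters = 0
    for start in nodes:
        if start in seen:
            continue
        clusters += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)

    total = len(nodes)
    return {
        "average_degree": 2 * len(edges) / total,
        "max_degree": max(degree[n] for n in nodes),
        "cluster_count": clusters,
        "hub_spoke_ratio": sum(1 for n in nodes if degree[n] > hub_degree) / total,
    }


print(graph_metrics(["a", "b", "c", "d", "e"], [("a", "b"), ("b", "c"), ("d", "e")]))
```

The example graph has two components and an average degree well under 3, so it would land in the unhealthy column on both counts.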
113
+
114
+ ### Five deterministic signals for graph synthesis
115
+
116
+ Synthesise the skill graph from these signals only — never from an LLM at runtime:
117
+
118
+ 1. **Explicit prose references** — patterns like "Do NOT use X — use skill-name" in skill bodies
119
+ 2. **Manual `relations` frontmatter** — author-declared edges
120
+ 3. **Bundle co-membership** — skills declared in the same routing bundle
121
+ 4. **Shared routing labels / triggers** — overlapping `triggers` or label declarations
122
+ 5. **Keyword overlap** — shared keywords via the routing-config map
123
+
124
+ A graph built from these signals is _reproducible_: rebuild today and tomorrow and the edges are identical. A graph that uses LLM inference at synthesis time will drift on every rebuild and the routing layer cannot trust it.
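The reproducibility claim can be checked mechanically. The sketch below implements only two of the five signals (author-declared relations and keyword overlap) over a simplified, hypothetical metadata shape in which `relations` is a flat list of target names. `synthesize_edges` is an illustrative name, not a function in the package.

```python
def synthesize_edges(skills: dict[str, dict]) -> list[tuple[str, str, str]]:
    """Build (source, target, signal) edges from authored metadata only."""
    edges: set[tuple[str, str, str]] = set()
    for name, meta in skills.items():
        for target in meta.get("relations", []):
            edges.add((name, target, "relations"))
    names = sorted(skills)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Any shared keyword yields one edge between the pair.
            if set(skills[a].get("keywords", [])) & set(skills[b].get("keywords", [])):
                edges.add((a, b, "keyword_overlap"))
    # Sorting makes the output independent of input order.
    return sorted(edges)


skills = {
    "context-graph": {"relations": ["graph-audit"], "keywords": ["orphan detection", "edge types"]},
    "graph-audit": {"relations": [], "keywords": ["schema validation", "edge types"]},
    "skill-router": {"relations": ["context-graph"], "keywords": ["dispatch"]},
}
first = synthesize_edges(skills)
second = synthesize_edges(dict(reversed(list(skills.items()))))
print(first == second)
```

Because every edge is derived from a set operation and the result is sorted, feeding the same skills in a different order produces the same edge list, which is the property a routing layer needs in order to trust a rebuild.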
125
+
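One way to check the reproducibility claim is to canonicalise the merged edge list and fingerprint it, so two rebuilds can be compared byte for byte. This is a sketch under assumptions: the signal names, the `(source, target)` pair shape, and the choice to treat edges as undirected are illustrative and are not taken from the package.

```python
import hashlib
import json


def synthesise_edges(signals):
    """Merge per-signal edge sets into one canonical, sorted edge list.

    signals: mapping of signal name -> iterable of (source, target) pairs.
    Assumption: edges are undirected, so (a, b) and (b, a) are the same edge.
    """
    merged = {}
    for signal_name in sorted(signals):
        for source, target in signals[signal_name]:
            key = tuple(sorted((source, target)))
            merged.setdefault(key, set()).add(signal_name)
    return [
        {"a": a, "b": b, "signals": sorted(names)}
        for (a, b), names in sorted(merged.items())
    ]


def fingerprint(edges):
    """Stable SHA-256 digest of the canonical edge list."""
    payload = json.dumps(edges, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same inputs in a different order must yield the same graph.
first = synthesise_edges({
    "relations": [("a", "b")],
    "bundles": [("b", "a"), ("c", "a")],
})
second = synthesise_edges({
    "bundles": [("a", "c"), ("a", "b")],
    "relations": [("b", "a")],
})
```

Comparing `fingerprint(first)` with `fingerprint(second)` in CI is a cheap way to detect a rebuild that has stopped being deterministic.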
126
+ ## 5. Change-Propagation Analysis
127
+
128
+ When a single artefact changes, trace the propagation across all four graphs. This is the discipline that prevents silent staleness.
129
+
130
+ ### Propagation checklist
131
+
132
+ | Step | Action | Tool |
133
+ | ---- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------- |
134
+ | 1 | Read the document-routing graph. Find the change category (e.g., "DB migration", "webhook handler change") and list the docs that must be updated. | Read the routing table |
135
+ | 2    | Grep the changed file path / function name across all `*.md` for stale references                                                                  | `grep -r "<changed_id>" --include="*.md" .` |
136
+ | 3 | Check skill key-file sections for references to the changed file | `grep -r "<changed_id>" skills/` |
137
+ | 4 | Check the memory index for related topic files; update or add records if a decision changed | Read the memory index |
138
+ | 5 | Verify no stale references remain — run any doc-verification gate the workspace ships | Local doc-verification script |
139
+
140
+ Each step exercises a different edge type. Skipping a step leaves a stale edge somewhere in the system, and the staleness compounds — the next change inherits a wrong baseline.
141
+
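Steps 2 and 3 of the checklist can also be scripted when a workspace wants file-and-line output in a fixed order. The sketch below is a hypothetical helper, not part of the package: it does a literal substring match (like `grep -F`), and the directory layout and file names in the demo are invented for illustration.

```python
import tempfile
from pathlib import Path


def find_references(root, changed_id, pattern="*.md"):
    """Return sorted (relative_path, line_number, line) hits for changed_id."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if changed_id in line:  # literal match, no regex
                relative = path.relative_to(root).as_posix()
                hits.append((relative, number, line.strip()))
    return hits


# Demo on a throwaway workspace with two stale references and one clean file.
with tempfile.TemporaryDirectory() as workspace:
    root = Path(workspace)
    (root / "docs").mkdir()
    (root / "skills" / "payments").mkdir(parents=True)
    (root / "docs" / "webhooks.md").write_text(
        "Intro\nSee src/webhook_handler.py\n", encoding="utf-8"
    )
    (root / "docs" / "other.md").write_text("Nothing relevant\n", encoding="utf-8")
    (root / "skills" / "payments" / "SKILL.md").write_text(
        "Key files: src/webhook_handler.py\n", encoding="utf-8"
    )
    hits = find_references(root, "src/webhook_handler.py")
```

An empty result after the update is the signal step 5 looks for; any remaining hit is a stale edge that still needs attention.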
142
+ ## 6. Anti-Patterns
143
+
144
+ | Anti-pattern | Why it fails | What to do instead |
145
+ | --------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
146
+ | **Edge inflation** — adding 10+ `adjacent` relations to one skill | Creates a hub-and-spoke; traversal pulls everything toward the hub and hides specific signals | Cap at 5; pick the most semantically-close siblings |
147
+ | **One-way edges** — adding edges _from_ a new skill _to_ existing skills only     | Existing skills gain no edge back to the new skill, so traversal from them never reaches it   | Add reciprocal references — update the existing skill too                                               |
148
+ | **Optional-metadata mindset** — treating relations as nice-to-have | Orphan rate drifts up silently; eventually most skills are unreachable | Graph maintenance is a quality gate; CI should fail on degraded connectivity |
149
+ | **AI-inferred runtime edges** — letting an LLM "infer" relations on every rebuild | Graph drifts non-deterministically; routing layer cannot trust it | Use deterministic signals at synthesis time; use AI only as an _authoring suggestion_ the human accepts |
150
+ | **Ignoring cross-graph edges** — only maintaining the skill graph | Skills reference scripts and memory references skills, but those edges are unmaintained | Map all four graphs and the cross-graph edges between them |
151
+ | **Boundary-as-adjacency** — putting topical neighbours in `boundary` | Hijacks the neighbour's positive cases; depresses its routing-eval | Use `related` for neighbours; reserve `boundary` for genuine routing-exclusion |
152
+
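Two of the anti-patterns above, edge inflation and one-way edges, are mechanical enough to lint. The following sketch assumes each skill's relations have already been parsed into a mapping of edge type to target list; that shape, the function name `lint_relations`, and the sample skill names are assumptions, and the real frontmatter layout may differ.

```python
ADJACENT_CAP = 5  # cap taken from the "Edge inflation" row


def lint_relations(relations):
    """Report adjacent-edge cap breaches and missing reciprocal edges.

    relations: mapping of skill -> mapping of edge type -> list of targets
    (a hypothetical, already-parsed view of the frontmatter).
    """
    problems = []
    for skill in sorted(relations):
        adjacent = relations[skill].get("adjacent", [])
        if len(adjacent) > ADJACENT_CAP:
            problems.append(
                f"{skill}: {len(adjacent)} adjacent edges exceeds cap of {ADJACENT_CAP}"
            )
        for target in sorted(adjacent):
            back = relations.get(target, {}).get("adjacent", [])
            if skill not in back:
                problems.append(f"{skill} -> {target}: no reciprocal adjacent edge")
    return problems


# "new-skill" points at "b", but "b" was never updated to point back.
problems = lint_relations({
    "new-skill": {"adjacent": ["a", "b"]},
    "a": {"adjacent": ["new-skill"]},
    "b": {"adjacent": []},
})
```

Sorting both the skills and their targets keeps the report order stable, which makes the output diffable between runs.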
153
+ ## Verification
154
+
155
+ - [ ] All four graphs in the workspace are named and have an authoritative source-of-truth file
156
+ - [ ] Cross-graph edges are explicit (skill → script, skill → memory, doc-routing → doc, script → command) — not implicit
157
+ - [ ] Graph rebuild is deterministic — same input artefacts produce identical edge set on every run
158
+ - [ ] Orphan rate is below 10%, or the orphans that push it above that threshold have been triaged by blast radius
159
+ - [ ] Average degree is above 5; max degree is below 30; cluster count is 1 (or below 3 with a documented reason)
160
+ - [ ] Edge-type discipline is enforced — `adjacent` ≤ 5, `boundary` ≤ 5, `verify_with` ≤ 3 per skill
161
+ - [ ] `boundary` is used for routing-exclusion only, not for "see also"
162
+ - [ ] The change-propagation checklist has been applied for the most recent significant change, end-to-end across all four graphs
163
+ - [ ] CI (or an equivalent gate) fails the merge when connectivity, orphan rate, or max degree breaches the healthy bands
164
+
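The last checklist item can be approximated with a small gate that maps metric values to an exit code. This is a sketch only: the thresholds are copied from the healthy bands stated in this document, while the metric keys and the `gate` function are hypothetical names that a workspace would adapt to its own tooling.

```python
import sys

# Healthy bands from this document; a metric passes when its check is True.
BANDS = {
    "connectivity": lambda value: value > 0.95,
    "orphan_rate": lambda value: value < 0.10,
    "max_degree": lambda value: value < 30,
}


def breaches(metrics):
    """Return the sorted names of metrics outside their healthy band."""
    return sorted(
        name for name, is_healthy in BANDS.items() if not is_healthy(metrics[name])
    )


def gate(metrics):
    """Print each breach to stderr and return a process exit code."""
    failed = breaches(metrics)
    for name in failed:
        print(f"FAIL {name}={metrics[name]}", file=sys.stderr)
    return 1 if failed else 0
```

A CI step would typically end with `sys.exit(gate(metrics))`, so that any breach fails the merge.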
165
+ ## Do NOT Use When
166
+
167
+ | Use instead | When |
168
+ | ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
169
+ | `skill-scaffold` | Authoring or restructuring a single SKILL.md — the per-skill craft, not the whole-graph architecture |
170
+ | `skill-infrastructure` | Running the live skill library tooling — census, overlap detection, routing-gap reporting, drift checks on individual skills |
171
+ | `graph-audit` | Validating that one skill's frontmatter matches the schema and its relation targets exist |
172
+ | `skill-router` | Deciding which skill activates for a specific query at dispatch time — that is the _consumer_ of this graph, not the graph's design |
173
+ | `documentation` | Writing the prose of a single document for a human reader — the change-propagation framework here is upstream |
174
+ | `code-review` | Reviewing AI-generated code — orthogonal concern |