npm - @skill-graph/cli - Versions diffs - 0.5.6 - Mend

@skill-graph/cli 0.5.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (330)

package/CHANGELOG.md +247 -0
package/LICENSE +200 -0
package/NOTICE +62 -0
package/README.md +398 -0
package/SKILL_GRAPH.md +443 -0
package/bin/skill-graph.js +374 -0
package/docs/ADOPTION.md +117 -0
package/docs/CONFORMANCE.md +66 -0
package/docs/PRIMER.md +384 -0
package/docs/QUICKSTART-30MIN.md +333 -0
package/docs/ROUTING-METRICS.md +120 -0
package/docs/SKILL-MD-FORMAT-COMPATIBILITY.md +127 -0
package/docs/SKILL_AUDIT_CHECKLIST.md +199 -0
package/docs/SKILL_AUDIT_LOOP.md +195 -0
package/docs/SKILL_METADATA_PROTOCOL.md +609 -0
package/docs/_archived/marketplace-publication-priority-2026-05-18.md +239 -0
package/docs/adr/0001-predicate-set.md +69 -0
package/docs/adr/0002-json-ld-context.md +82 -0
package/docs/adr/0003-ontoclean-rigidity-tags.md +65 -0
package/docs/adr/0004-persistent-identifiers.md +74 -0
package/docs/adr/0005-freshness-consolidation.md +70 -0
package/docs/adr/0006-revise-predicate-rename.md +105 -0
package/docs/adr/0007-audit-loop-cadence.md +99 -0
package/docs/adr/0008-skill-surface-split-and-curation-policy.md +93 -0
package/docs/category-consumers.md +168 -0
package/docs/concept-map.md +194 -0
package/docs/diagrams/drift-states.mmd +21 -0
package/docs/diagrams/manifest-pipeline.mmd +25 -0
package/docs/diagrams/routing-harness.mmd +41 -0
package/docs/diagrams/starter-graph.mmd +53 -0
package/docs/field-decision-guide.md +315 -0
package/docs/field-rationale.md +211 -0
package/docs/field-reference.generated.md +624 -0
package/docs/field-reference.md +1426 -0
package/docs/glossary.md +190 -0
package/docs/head-noun-glossary.md +63 -0
package/docs/images/audit-phases.png +0 -0
package/docs/images/drift-states.png +0 -0
package/docs/images/graded-mode.png +0 -0
package/docs/images/manifest-pipeline.png +0 -0
package/docs/images/routing-harness.png +0 -0
package/docs/images/skill-anatomy.png +0 -0
package/docs/images/starter-graph.png +0 -0
package/docs/images/system-model.png +0 -0
package/docs/integrations/github-actions.md +155 -0
package/docs/manifest-field-mapping.md +443 -0
package/docs/marketplace-publication-queue.generated.md +240 -0
package/docs/marketplace-release-agent-prompt.md +82 -0
package/docs/marketplace-skill-candidate-list.md +272 -0
package/docs/marketplace-syndication.md +222 -0
package/docs/migration-sample-review.md +155 -0
package/docs/migrations/v4-to-v5.md +168 -0
package/docs/migrations/v5-to-v6.md +221 -0
package/docs/name-exceptions.yaml +37 -0
package/docs/plans/marketplace-p1-public-migration-plan.md +41 -0
package/docs/plans/multi-root-workspace.md +148 -0
package/docs/plans/scripts-roadmap.md +107 -0
package/docs/plans/v4-schema-bump.md +160 -0
package/docs/plans/wave-2-extraction.md +122 -0
package/docs/positioning-vs-marketplaces.md +175 -0
package/docs/proposals/skill-audit-loop-positioning.md +160 -0
package/docs/quality-doctrine.md +138 -0
package/docs/recommended-skills.md +150 -0
package/docs/research/skill-comprehension-eval-research.md +1830 -0
package/docs/research/skill-retrieval-evidence.md +66 -0
package/docs/skill-metadata-protocol.md +471 -0
package/docs/skills-sh-maintainer-cleanup-request.md +80 -0
package/examples/audits/a11y/findings.md +52 -0
package/examples/audits/a11y/scorecard.md +21 -0
package/examples/audits/a11y/verdict.md +44 -0
package/examples/audits/debugging/findings.md +59 -0
package/examples/audits/debugging/scorecard.md +22 -0
package/examples/audits/debugging/verdict.md +33 -0
package/examples/audits/documentation/findings.md +59 -0
package/examples/audits/documentation/scorecard.md +22 -0
package/examples/audits/documentation/verdict.md +33 -0
package/examples/evals/a11y.json +140 -0
package/examples/evals/api-design.json +52 -0
package/examples/evals/code-review.json +52 -0
package/examples/evals/data-modeling.json +52 -0
package/examples/evals/database-migration.json +52 -0
package/examples/evals/debugging.json +118 -0
package/examples/evals/dependency-architecture.json +52 -0
package/examples/evals/design-system-architecture.json +52 -0
package/examples/evals/error-tracking.json +52 -0
package/examples/evals/event-contract-design.json +52 -0
package/examples/evals/form-ux-architecture.json +52 -0
package/examples/evals/framework-fit-analysis.json +52 -0
package/examples/evals/graph-audit.json +139 -0
package/examples/evals/information-architecture.json +52 -0
package/examples/evals/interaction-feedback.json +52 -0
package/examples/evals/interaction-patterns.json +52 -0
package/examples/evals/layout-composition.json +52 -0
package/examples/evals/lint-overlay.json +117 -0
package/examples/evals/microcopy.json +52 -0
package/examples/evals/observability-modeling.json +52 -0
package/examples/evals/pattern-recognition.json +96 -0
package/examples/evals/performance-engineering.json +52 -0
package/examples/evals/refactor.json +128 -0
package/examples/evals/semiotics.json +52 -0
package/examples/evals/skill-infrastructure.json +96 -0
package/examples/evals/skill-router.json +140 -0
package/examples/evals/skill-router.routing.json +113 -0
package/examples/evals/system-interface-contracts.json +52 -0
package/examples/evals/task-analysis.json +52 -0
package/examples/evals/testing-strategy.json +118 -0
package/examples/evals/type-safety.json +249 -0
package/examples/evals/visual-design-foundations.json +52 -0
package/examples/evals/webhook-integration.json +52 -0
package/examples/exports/a11y.skill-md.md +80 -0
package/examples/exports/debugging.skill-md.md +80 -0
package/examples/exports/refactor.skill-md.md +78 -0
package/examples/exports/testing-strategy.skill-md.md +81 -0
package/examples/projects/markdown-static-site/README.md +115 -0
package/examples/projects/markdown-static-site/skills/content-source-router/SKILL.md +131 -0
package/examples/projects/markdown-static-site/skills/image-optimization-pipeline-config/SKILL.md +132 -0
package/examples/projects/markdown-static-site/skills/link-rot-detection/SKILL.md +103 -0
package/examples/projects/markdown-static-site/skills/markdown-post-frontmatter-validation/SKILL.md +133 -0
package/examples/projects/markdown-static-site/skills/migrate-posts-to-v2-frontmatter/SKILL.md +140 -0
package/examples/projects/saas-stripe-postgres/README.md +208 -0
package/examples/projects/saas-stripe-postgres/db/migrations/0004_canonicalize_orders.sql +37 -0
package/examples/projects/saas-stripe-postgres/db/schema.sql +112 -0
package/examples/projects/saas-stripe-postgres/skills/migrate-orders-to-canonical-schema/SKILL.md +149 -0
package/examples/projects/saas-stripe-postgres/skills/nextjs-server-action-validation/SKILL.md +154 -0
package/examples/projects/saas-stripe-postgres/skills/payment-provider-router/SKILL.md +153 -0
package/examples/projects/saas-stripe-postgres/skills/postgres-rls-pattern/SKILL.md +163 -0
package/examples/projects/saas-stripe-postgres/skills/stripe-webhook-signature-verification/SKILL.md +137 -0
package/examples/protocol/skill-metadata-template.md +301 -0
package/examples/protocol/skills.manifest.sample.json +13245 -0
package/examples/skill-metadata-template.md +317 -0
package/examples/skills.manifest.sample.json +13519 -0
package/examples/tests/v3-1-skos-fixture/SKILL.md +93 -0
package/marketplace/README.md +17 -0
package/marketplace/skills/a11y/SKILL.md +66 -0
package/marketplace/skills/acid-fundamentals/SKILL.md +106 -0
package/marketplace/skills/agent-engineering/SKILL.md +386 -0
package/marketplace/skills/agent-eval-design/SKILL.md +55 -0
package/marketplace/skills/ai-native-development/SKILL.md +294 -0
package/marketplace/skills/api-design/SKILL.md +60 -0
package/marketplace/skills/architecture-decision-records/SKILL.md +55 -0
package/marketplace/skills/background-jobs/SKILL.md +265 -0
package/marketplace/skills/bounded-context-mapping/SKILL.md +55 -0
package/marketplace/skills/cap-theorem-tradeoffs/SKILL.md +127 -0
package/marketplace/skills/client-server-boundary/SKILL.md +187 -0
package/marketplace/skills/code-review/SKILL.md +120 -0
package/marketplace/skills/color-system-design/SKILL.md +43 -0
package/marketplace/skills/component-architecture/SKILL.md +126 -0
package/marketplace/skills/compression/SKILL.md +112 -0
package/marketplace/skills/conceptual-modeling/SKILL.md +181 -0
package/marketplace/skills/connection-pooling/SKILL.md +105 -0
package/marketplace/skills/constraint-awareness/SKILL.md +287 -0
package/marketplace/skills/content-monitor/SKILL.md +209 -0
package/marketplace/skills/context-engineering/SKILL.md +320 -0
package/marketplace/skills/context-graph/SKILL.md +174 -0
package/marketplace/skills/context-management/SKILL.md +174 -0
package/marketplace/skills/context-window/SKILL.md +239 -0
package/marketplace/skills/contract-testing/SKILL.md +120 -0
package/marketplace/skills/cron-scheduling/SKILL.md +223 -0
package/marketplace/skills/dark-mode-implementation/SKILL.md +47 -0
package/marketplace/skills/data-modeling/SKILL.md +59 -0
package/marketplace/skills/data-modeling-fundamentals/SKILL.md +117 -0
package/marketplace/skills/database-migration/SKILL.md +429 -0
package/marketplace/skills/debugging/SKILL.md +67 -0
package/marketplace/skills/dependency-architecture/SKILL.md +58 -0
package/marketplace/skills/design-module-composition/SKILL.md +43 -0
package/marketplace/skills/design-system-architecture/SKILL.md +61 -0
package/marketplace/skills/design-thinking/SKILL.md +44 -0
package/marketplace/skills/diagnosis/SKILL.md +296 -0
package/marketplace/skills/diff-analysis/SKILL.md +188 -0
package/marketplace/skills/e2e-test-design/SKILL.md +113 -0
package/marketplace/skills/entity-relationship-modeling/SKILL.md +218 -0
package/marketplace/skills/epistemic-grounding/SKILL.md +112 -0
package/marketplace/skills/error-boundary/SKILL.md +235 -0
package/marketplace/skills/error-tracking/SKILL.md +261 -0
package/marketplace/skills/eval-driven-development/SKILL.md +147 -0
package/marketplace/skills/evaluation/SKILL.md +113 -0
package/marketplace/skills/event-contract-design/SKILL.md +60 -0
package/marketplace/skills/event-storming/SKILL.md +56 -0
package/marketplace/skills/form-ux-architecture/SKILL.md +60 -0
package/marketplace/skills/framework-fit-analysis/SKILL.md +59 -0
package/marketplace/skills/frontend-architecture/SKILL.md +43 -0
package/marketplace/skills/generative-ui/SKILL.md +118 -0
package/marketplace/skills/graph-audit/SKILL.md +81 -0
package/marketplace/skills/guardrails/SKILL.md +118 -0
package/marketplace/skills/hooks-patterns/SKILL.md +185 -0
package/marketplace/skills/http-semantics/SKILL.md +136 -0
package/marketplace/skills/ideation/SKILL.md +41 -0
package/marketplace/skills/indexing-strategy/SKILL.md +108 -0
package/marketplace/skills/information-architecture/SKILL.md +59 -0
package/marketplace/skills/integration-test-design/SKILL.md +111 -0
package/marketplace/skills/intent-recognition/SKILL.md +136 -0
package/marketplace/skills/interaction-feedback/SKILL.md +59 -0
package/marketplace/skills/interaction-patterns/SKILL.md +59 -0
package/marketplace/skills/journey-mapping/SKILL.md +41 -0
package/marketplace/skills/keywords/SKILL.md +213 -0
package/marketplace/skills/knowledge-modeling/SKILL.md +232 -0
package/marketplace/skills/layout-composition/SKILL.md +59 -0
package/marketplace/skills/linguistics/SKILL.md +429 -0
package/marketplace/skills/lint-overlay/SKILL.md +76 -0
package/marketplace/skills/mental-models/SKILL.md +126 -0
package/marketplace/skills/merge-queue/SKILL.md +94 -0
package/marketplace/skills/methodology/SKILL.md +317 -0
package/marketplace/skills/microcopy/SKILL.md +232 -0
package/marketplace/skills/middleware-patterns/SKILL.md +363 -0
package/marketplace/skills/mobile-responsive-ux/SKILL.md +287 -0
package/marketplace/skills/mutation-testing/SKILL.md +112 -0
package/marketplace/skills/naming-conventions/SKILL.md +112 -0
package/marketplace/skills/observability-modeling/SKILL.md +59 -0
package/marketplace/skills/ontology-modeling/SKILL.md +67 -0
package/marketplace/skills/owasp-security/SKILL.md +153 -0
package/marketplace/skills/pattern-recognition/SKILL.md +472 -0
package/marketplace/skills/performance-budgets/SKILL.md +185 -0
package/marketplace/skills/performance-engineering/SKILL.md +58 -0
package/marketplace/skills/performance-testing/SKILL.md +125 -0
package/marketplace/skills/printify/SKILL.md +42 -0
package/marketplace/skills/prioritization/SKILL.md +118 -0
package/marketplace/skills/problem-framing/SKILL.md +41 -0
package/marketplace/skills/problem-locating-solving/SKILL.md +203 -0
package/marketplace/skills/project-knowledge-extraction/SKILL.md +54 -0
package/marketplace/skills/prompt-craft/SKILL.md +134 -0
package/marketplace/skills/prompt-injection-defense/SKILL.md +132 -0
package/marketplace/skills/property-based-testing/SKILL.md +100 -0
package/marketplace/skills/prototyping/SKILL.md +43 -0
package/marketplace/skills/query-optimization/SKILL.md +144 -0
package/marketplace/skills/real-time-updates/SKILL.md +324 -0
package/marketplace/skills/ref-patterns/SKILL.md +284 -0
package/marketplace/skills/refactor/SKILL.md +65 -0
package/marketplace/skills/rendering-models/SKILL.md +142 -0
package/marketplace/skills/replication-patterns/SKILL.md +110 -0
package/marketplace/skills/research-synthesis/SKILL.md +41 -0
package/marketplace/skills/route-handler-design/SKILL.md +347 -0
package/marketplace/skills/schema-evolution/SKILL.md +140 -0
package/marketplace/skills/security-fundamentals/SKILL.md +139 -0
package/marketplace/skills/semantic-center/SKILL.md +194 -0
package/marketplace/skills/semantic-relations/SKILL.md +250 -0
package/marketplace/skills/semantics/SKILL.md +366 -0
package/marketplace/skills/semiotics/SKILL.md +230 -0
package/marketplace/skills/seo-strategy/SKILL.md +260 -0
package/marketplace/skills/server-actions-design/SKILL.md +243 -0
package/marketplace/skills/server-components-design/SKILL.md +190 -0
package/marketplace/skills/sharding-strategy/SKILL.md +123 -0
package/marketplace/skills/shopify/SKILL.md +42 -0
package/marketplace/skills/skill-infrastructure/SKILL.md +320 -0
package/marketplace/skills/skill-router/SKILL.md +71 -0
package/marketplace/skills/skill-scaffold/SKILL.md +105 -0
package/marketplace/skills/snapshot-testing/SKILL.md +120 -0
package/marketplace/skills/spec-driven-development/SKILL.md +148 -0
package/marketplace/skills/state-machine-modeling/SKILL.md +56 -0
package/marketplace/skills/state-management/SKILL.md +134 -0
package/marketplace/skills/streaming-architecture/SKILL.md +194 -0
package/marketplace/skills/summarization/SKILL.md +156 -0
package/marketplace/skills/suspense-patterns/SKILL.md +265 -0
package/marketplace/skills/system-interface-contracts/SKILL.md +59 -0
package/marketplace/skills/task-analysis/SKILL.md +201 -0
package/marketplace/skills/taxonomy-design/SKILL.md +66 -0
package/marketplace/skills/test-coverage-strategy/SKILL.md +108 -0
package/marketplace/skills/test-doubles-design/SKILL.md +98 -0
package/marketplace/skills/test-driven-development/SKILL.md +96 -0
package/marketplace/skills/testing-strategy/SKILL.md +67 -0
package/marketplace/skills/theme-system-design/SKILL.md +43 -0
package/marketplace/skills/tool-call-flow/SKILL.md +229 -0
package/marketplace/skills/tool-call-strategy/SKILL.md +292 -0
package/marketplace/skills/transaction-isolation/SKILL.md +98 -0
package/marketplace/skills/type-safety/SKILL.md +177 -0
package/marketplace/skills/typography-system/SKILL.md +43 -0
package/marketplace/skills/usability-testing/SKILL.md +43 -0
package/marketplace/skills/user-research/SKILL.md +43 -0
package/marketplace/skills/vercel-composition-patterns/SKILL.md +157 -0
package/marketplace/skills/version-control/SKILL.md +233 -0
package/marketplace/skills/visual-design-foundations/SKILL.md +59 -0
package/marketplace/skills/visual-hierarchy/SKILL.md +43 -0
package/marketplace/skills/webhook-integration/SKILL.md +331 -0
package/marketplace/skills/writing-humanizer/SKILL.md +380 -0
package/package.json +67 -0
package/schemas/manifest.schema.json +811 -0
package/schemas/manifest.v2.schema.json +164 -0
package/schemas/manifest.v3.schema.json +758 -0
package/schemas/manifest.v4.schema.json +755 -0
package/schemas/manifest.v5.schema.json +755 -0
package/schemas/manifest.v6.schema.json +811 -0
package/schemas/skill.context.jsonld +279 -0
package/schemas/skill.schema.json +919 -0
package/schemas/skill.v2.schema.json +201 -0
package/schemas/skill.v3.schema.json +827 -0
package/schemas/skill.v4.schema.json +822 -0
package/schemas/skill.v5.schema.json +830 -0
package/schemas/skill.v6.schema.json +946 -0
package/schemas/vocabulary/keywords.json +180 -0
package/schemas/vocabulary/workspace_tags.json +23 -0
package/scripts/__tests__/migrate-skill-v2-to-v3.test.js +161 -0
package/scripts/__tests__/migrate-skill-v3-to-v4.test.js +158 -0
package/scripts/__tests__/test-export-parser-drift.js +149 -0
package/scripts/__tests__/test-marketplace-export.js +114 -0
package/scripts/__tests__/test-router-paths.js +82 -0
package/scripts/__tests__/test-stability-promotion.js +244 -0
package/scripts/__tests__/test-v3-1-alias-contract.js +109 -0
package/scripts/__tests__/test-v3-1-skos-runtime.js +116 -0
package/scripts/backfill-schema-version.js +198 -0
package/scripts/build-field-reference.js +160 -0
package/scripts/build-retrieval-baseline.js +511 -0
package/scripts/check-markdown-links.js +211 -0
package/scripts/check-protocol-consistency.js +979 -0
package/scripts/export-marketplace-skills.js +610 -0
package/scripts/export-skill.js +374 -0
package/scripts/generate-manifest.js +787 -0
package/scripts/lib/alias-contract.js +83 -0
package/scripts/lib/audit-prompt-builder.js +771 -0
package/scripts/lib/mock-grader.js +134 -0
package/scripts/lib/parse-frontmatter.js +429 -0
package/scripts/lib/roots.js +119 -0
package/scripts/lint/check-archetype-sections.js +185 -0
package/scripts/lint/check-category-enum.js +83 -0
package/scripts/lint/check-routing-eval.js +146 -0
package/scripts/lint/check-routing-quality.js +211 -0
package/scripts/lint/check-stability-promotion.js +220 -0
package/scripts/lint/format-code-frame.js +206 -0
package/scripts/marketplace-install.js +125 -0
package/scripts/migrate-category-to-enum.js +169 -0
package/scripts/migrate-skill-v2-to-v3.js +424 -0
package/scripts/migrate-skill-v3-to-v4.js +200 -0
package/scripts/migrate-skill-v5-to-v6.js +304 -0
package/scripts/restructure-by-category.js +85 -0
package/scripts/seed-publication-classification.js +282 -0
package/scripts/skill-audit.js +893 -0
package/scripts/skill-graph-drift.js +483 -0
package/scripts/skill-graph-route.js +766 -0
package/scripts/skill-graph-routing-eval.js +393 -0
package/scripts/skill-lint.js +1317 -0
package/scripts/skill-overlap.js +213 -0
package/scripts/verify-skill-md-export.js +201 -0

package/marketplace/skills/design-system-architecture/SKILL.md ADDED

@@ -0,0 +1,61 @@
+---
+name: design-system-architecture
+description: "Use when designing or auditing a design system's architecture: token taxonomy, semantic tokens, component APIs, theming, accessibility contracts, documentation, governance, and migration strategy. Do NOT use for information hierarchy and navigation (use `information-architecture`), page-specific layout (use `layout-composition`), visual craft direction (use `visual-design-foundations`), sentence-level UI copy (use `microcopy`), or accessibility-only audits (use `a11y`)."
+license: MIT
+compatibility: "Portable design-system architecture guidance for web and app component systems, token systems, and multi-theme UI libraries."
+allowed-tools: Read Grep
+metadata:
+  metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"design\",\"domain\":\"design/system\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-11\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-11\\\\\\\"}\",\"eval_artifacts\":\"present\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"design tokens\\\\\\\",\\\\\\\"semantic tokens\\\\\\\",\\\\\\\"component API\\\\\\\",\\\\\\\"theming\\\\\\\",\\\\\\\"component library\\\\\\\",\\\\\\\"token taxonomy\\\\\\\",\\\\\\\"design system migration\\\\\\\",\\\\\\\"design system audit\\\\\\\",\\\\\\\"component library audit\\\\\\\",\\\\\\\"token drift\\\\\\\",\\\\\\\"component API consistency\\\\\\\",\\\\\\\"tokens components variants\\\\\\\",\\\\\\\"icons design system\\\\\\\",\\\\\\\"icon library systematization\\\\\\\",\\\\\\\"iconography consistency\\\\\\\"]\",\"examples\":\"[\\\\\\\"define semantic tokens so charts, status colors, and surfaces do not hardcode raw colors\\\\\\\",\\\\\\\"audit this component library for API consistency and token drift\\\\\\\",\\\\\\\"design the theming architecture before adding dark mode\\\\\\\",\\\\\\\"how should we migrate old CSS variables to canonical design-system tokens?\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"organize pages, nav, sitemap, and wayfinding\\\\\\\",\\\\\\\"rewrite the empty-state text and tooltip labels\\\\\\\",\\\\\\\"add aria-labels and keyboard behavior to this component\\\\\\\",\\\\\\\"draft an architecture note explaining why we chose Postgres over DynamoDB\\\\\\\"]\",\"relations\":\"{\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"information-architecture\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"information-architecture owns navigation and content structure; design-system-architecture owns tokens and component systems\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"microcopy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"microcopy owns UI text patterns; design-system-architecture owns reusable component and token contracts\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"a11y\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"a11y verifies accessibility behavior; design-system-architecture embeds accessibility contracts into components\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"layout-composition\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"layout-composition owns page-specific responsive structure; design-system-architecture owns reusable component and token rules\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"visual-design-foundations\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"visual-design-foundations owns surface-level visual craft; design-system-architecture owns reusable system contracts\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"interaction-patterns\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"interaction-patterns owns selecting the right pattern for a task; design-system-architecture owns reusable component APIs once the pattern belongs in the system\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"refactor\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"refactor restructures code behavior-preservingly; design-system-architecture changes the UI system contract\\\\\\\"}],\\\\\\\"related\\\\\\\":[\\\\\\\"a11y\\\\\\\",\\\\\\\"microcopy\\\\\\\",\\\\\\\"information-architecture\\\\\\\",\\\\\\\"semantics\\\\\\\",\\\\\\\"layout-composition\\\\\\\",\\\\\\\"visual-design-foundations\\\\\\\",\\\\\\\"interaction-patterns\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"a11y\\\\\\\",\\\\\\\"code-review\\\\\\\"]}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":365,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/design-system-architecture/SKILL.md\"}"
+  skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
+  skill_graph_protocol: Skill Metadata Protocol v4
+  skill_graph_project: Skill Graph
+  skill_graph_canonical_skill: skills/design-system-architecture/SKILL.md
+---
+# Design System Architecture
+## Coverage
+Design and audit reusable UI systems. Covers token taxonomy, semantic vs raw tokens, component APIs, variants, slots, theming, accessibility contracts, responsive behavior, documentation, governance, migration, and drift detection between code and design intent.
+## Philosophy
+A design system is a product architecture layer, not a style pile. Tokens and components should encode durable decisions so product work becomes faster and more consistent. If every screen still makes local choices for color, spacing, state, and behavior, the design system is only decorative.
+Optimize for clear constraints. A system with too many escape hatches is not flexible; it is ungoverned.
+## Method
+1. Inventory tokens, components, variants, and usage hotspots.
+2. Separate raw tokens from semantic tokens.
+3. Define component contracts: purpose, props/slots, states, accessibility, and composition rules.
+4. Establish theming and density rules before multiplying variants.
+5. Mark forbidden local overrides and migration paths.
+6. Add docs examples that show expected use and anti-use.
+7. Verify real screens can be built without one-off styling.
+## Evals
+This skill ships a comprehension-eval artifact at [`examples/evals/design-system-architecture.json`](https://github.com/jacob-balslev/skill-graph/blob/main/examples/evals/design-system-architecture.json). The checklist below is the authoring gate for design-system architecture decisions; the eval file is the grader surface.
+## Verification
+- [ ] Semantic tokens cover product meaning without leaking palette names
+- [ ] Components have clear ownership and API boundaries
+- [ ] Variants map to real use cases, not visual guesses
+- [ ] Accessibility behavior is part of the component contract
+- [ ] Theming does not require component-level rewrites
+- [ ] Deprecated tokens or components have migration paths
+- [ ] Real product screens can use the system without local escape hatches
+## Do NOT Use When
+| Use instead | When |
+|---|---|
+| `information-architecture` | You need page hierarchy, navigation, sitemap, or wayfinding. |
+| `microcopy` | You need UI wording, labels, empty states, or error copy. |
+| `a11y` | You need focused accessibility compliance verification. |
+| `layout-composition` | You need page-specific responsive structure, section order, or breakpoints. |
+| `visual-design-foundations` | You need color, typography, spacing, density, or visual craft direction. |
+| `interaction-patterns` | You need to choose a control or interaction pattern before systemizing it. |
+| `refactor` | You are only restructuring existing code without changing design-system contracts. |
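
The Method steps "Separate raw tokens from semantic tokens" and "Establish theming and density rules" in the skill above, together with the first Verification item, can be illustrated with a minimal TypeScript sketch. Nothing here comes from the package itself: the token names, hex values, theme names, and the helpers `resolveToken` and `leaksPaletteName` are all hypothetical, chosen only to show the shape of the idea.

```typescript
// Raw tokens: palette values with no product meaning.
// (Hypothetical names and values, not taken from the skill-graph package.)
const rawTokens = {
  "blue-600": "#2563eb",
  "red-600": "#dc2626",
  "gray-50": "#f9fafb",
  "gray-900": "#111827",
} as const;

type RawTokenName = keyof typeof rawTokens;
type ThemeName = "light" | "dark";
type SemanticTokenName =
  | "surface.default"
  | "text.default"
  | "status.danger"
  | "action.primary";

// Semantic tokens: names carry product meaning and point at one raw token per theme.
const semanticTokens: Record<SemanticTokenName, Record<ThemeName, RawTokenName>> = {
  "surface.default": { light: "gray-50", dark: "gray-900" },
  "text.default": { light: "gray-900", dark: "gray-50" },
  "status.danger": { light: "red-600", dark: "red-600" },
  "action.primary": { light: "blue-600", dark: "blue-600" },
};

// Components ask for meaning; the theme decides the concrete value,
// so switching themes needs no component-level rewrite.
function resolveToken(name: SemanticTokenName, theme: ThemeName): string {
  return rawTokens[semanticTokens[name][theme]];
}

// Rough check for "semantic tokens do not leak palette names":
// flags a semantic name if any of its segments equals a palette hue.
function leaksPaletteName(semanticName: string): boolean {
  const hues = Object.keys(rawTokens).map((raw) => raw.replace(/-\d+$/, ""));
  const segments = semanticName.toLowerCase().split(/[.\-_]/);
  return segments.some((segment) => hues.indexOf(segment) !== -1);
}
```

In this sketch a component would call `resolveToken("surface.default", theme)` rather than reading `rawTokens` directly, and a lint step could reject a proposed name such as `button.blue-hover` through `leaksPaletteName`. Segment matching is deliberately naive; a real system would likely need an allowlist for legitimate uses of hue words.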

package/marketplace/skills/design-thinking/SKILL.md ADDED

@@ -0,0 +1,44 @@
+---
+name: design-thinking
+description: "Use when orchestrating a full human-centered design process across discovery, definition, ideation, prototyping, and testing — when uncertain which stage of the arc a team is in, when deciding whether to loop back, or when routing to the right stage-specific sibling skill. Do NOT use for single-stage execution (go directly to problem-framing, user-research, research-synthesis, journey-mapping, ideation, prototyping, or usability-testing) or for engineering domain discovery (use event-storming)."
+license: CC-BY-4.0
+metadata:
+  metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"design\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-12\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-12\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"design thinking process\\\\\\\",\\\\\\\"double diamond\\\\\\\",\\\\\\\"five stage design process\\\\\\\",\\\\\\\"empathize define ideate prototype test\\\\\\\",\\\\\\\"human centered design\\\\\\\",\\\\\\\"Stanford d.school\\\\\\\",\\\\\\\"IDEO method\\\\\\\",\\\\\\\"design sprint\\\\\\\",\\\\\\\"discover define develop deliver\\\\\\\",\\\\\\\"looping back\\\\\\\",\\\\\\\"stage routing\\\\\\\",\\\\\\\"MIT Sloan design thinking\\\\\\\",\\\\\\\"Tim Brown HBR\\\\\\\"]\",\"triggers\":\"[\\\\\\\"design thinking\\\\\\\",\\\\\\\"human-centered design\\\\\\\",\\\\\\\"double diamond\\\\\\\",\\\\\\\"which stage\\\\\\\",\\\\\\\"design process\\\\\\\"]\",\"examples\":\"[\\\\\\\"We have user interviews done but no synthesis yet — which design-thinking stage are we in and what's next?\\\\\\\",\\\\\\\"Plan a full design-thinking arc for a four-week project on rural healthcare access.\\\\\\\",\\\\\\\"We just finished a usability test and three findings broke our framing — should we loop back to define?\\\\\\\",\\\\\\\"Route this brief to the right stage-specific skill: 'help us figure out what to build for new homeowners'.\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"Run a single crazy-8s round on this specific how-might-we.\\\\\\\",\\\\\\\"Write the React component for the dashboard widget.\\\\\\\",\\\\\\\"Model the bounded contexts for the order-fulfillment domain.\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"problem-framing\\\\\\\",\\\\\\\"user-research\\\\\\\",\\\\\\\"research-synthesis\\\\\\\",\\\\\\\"journey-mapping\\\\\\\",\\\\\\\"ideation\\\\\\\",\\\\\\\"prototyping\\\\\\\",\\\\\\\"usability-testing\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"event-storming\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"event-storming is a collaborative engineering discovery practice for mapping domain events, commands, and aggregates with developers and domain experts. design-thinking orchestrates a human-centered design arc with users at the center. Both are 'discovery' practices but they differ in subject (system vs. human), participants, and output.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"problem-locating-solving\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"problem-locating-solving handles concrete code-level bug localization. design-thinking handles the upstream open-ended question of what should be designed at all — different problem class entirely.\\\\\\\"}]}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/design-thinking/SKILL.md\"}"
+  skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
+  skill_graph_protocol: Skill Metadata Protocol v4
+  skill_graph_project: Skill Graph
+  skill_graph_canonical_skill: skills/design-thinking/SKILL.md
+---
+# Design Thinking
+## Coverage
+Design thinking is the meta-skill that orchestrates a full human-centered design arc and routes specific work to the appropriate stage-specific sibling skill. Multiple canonical framings exist and largely agree on the shape. The **Stanford d.school** describes five stages: **Empathize → Define → Ideate → Prototype → Test**. The **MIT Sloan** framing renders it as **Understand → Involve → Ideate → Prototype-test → Implement**. The **UK Design Council's Double Diamond** maps the same arc onto two diamonds: **Discover → Define** (the problem-space diamond, diverge then converge on the right problem) and **Develop → Deliver** (the solution-space diamond, diverge then converge on the right solution). Tim Brown's HBR essay (2008) and the IDEO Field Guide describe the same arc under different stage labels.
+Across framings the meta-skill covers (a) **stage recognition** — knowing which stage a team is currently in based on what artifacts exist and what question is open; (b) **stage routing** — handing the work to the right sibling skill (problem-framing for definition work, user-research for empathy/discovery, research-synthesis for sense-making, journey-mapping for cross-touchpoint experience, ideation for divergent/convergent concept generation, prototyping for learning artifacts, usability-testing for evaluation); (c) **transition criteria** — knowing what evidence justifies moving from one stage to the next; and (d) **loop-back conditions** — knowing when findings in a later stage invalidate work in an earlier one and the team should return rather than press forward.
+The skill includes the **non-linearity principle**: although the stages are described in order, real projects loop. A prototype test (Test stage) commonly produces evidence that the team's problem framing (Define stage) was wrong, and the right response is to loop back to Define rather than ship the prototype as-is. Recognizing this is part of the meta-skill — a team that refuses to loop is performing the ritual of design thinking without practicing it. Conversely, looping endlessly without committing is its own failure mode, and the meta-skill includes naming when "we have enough" to proceed.
+The skill also covers **format choices** for orchestration: multi-week project arcs versus compressed **Design Sprints** (Jake Knapp, Google Ventures), which run a full Define-through-Test cycle in five days. The sprint format trades depth for speed; both have valid uses.
+## Philosophy
+Design thinking exists because complex human problems do not yield to either pure analysis or pure intuition, and the discipline insists that iterating between empathy with users and concrete artifacts is more productive than either alone. The arc is not a procedure to be executed once; it is a structured way to make uncertainty visible. Each stage produces a specific kind of evidence (qualitative observations, framed problems, concept variants, learning artifacts, behavioral findings), and the discipline rewards teams that can name what kind of evidence they have versus what kind they still need.
+The meta-skill is sceptical of two opposite failure modes. The first is **stage skipping** — leaping from a vague brief directly to prototyping because building feels like progress, with no framing and no research; the resulting prototype answers a question nobody asked. The second is **stage stalling** — researching indefinitely, framing endlessly, ideating without ever building, because each new round of empathy raises new questions and the team mistakes activity for progress. Both failures stem from the same root: not knowing which stage's question is currently open. The meta-skill names the open question explicitly and chooses the next stage to address it.
+## Verification
+- The team can name which stage of the arc they are currently in (Empathize / Define / Ideate / Prototype / Test, or the equivalent in whichever framing) and what specific open question that stage exists to answer.
+- The next-stage transition has a written criterion — what evidence will count as "done" for the current stage — not just a calendar date.
+- When a later-stage finding contradicts an earlier-stage assumption, the team explicitly decides whether to loop back or continue, and records the rationale; the question is not silently dropped.
+- Each stage's output uses the appropriate sibling skill's methods (real interviews and not just team brainstorming for empathy; affinity mapping and not just memory for synthesis; divergent rounds for ideation; learning-goal contracts for prototypes; task scenarios for tests).
+- The arc preserves **human centrality** — at no stage is the user replaced by a stakeholder proxy or a single team member's opinion; if it has been, that is a flag for loop-back.
+- The team can articulate what they no longer believe that they believed before the arc started — design thinking that produces no changed beliefs has either been performed superficially or applied to a problem that did not require it.
+## Do NOT Use When
+- The work is fully inside a single stage and the right sibling skill is obvious — go directly to **problem-framing**, **user-research**, **research-synthesis**, **journey-mapping**, **ideation**, **prototyping**, or **usability-testing** rather than invoking the meta-skill.
+- The problem is a well-specified engineering implementation task with a clear acceptance criterion — there is no design uncertainty to discover; just build it.
+- The discovery target is an engineering domain model, event flow, or bounded-context map — use **event-storming**, **conceptual-modeling**, or **bounded-context-mapping**.
+- The problem is a concrete code defect that needs localization — use **problem-locating-solving**.
+- The work is automated test design, CI architecture, or any engineering verification — use **testing-strategy**.
+- The "user" is an internal system, agent, or non-human actor — design thinking's empathy stage assumes human subjects; methods do not transfer cleanly.
+- The team's question is purely strategic prioritization (which of these well-understood things should we build first) rather than open-ended design — use prioritization frameworks rather than the full arc.

package/marketplace/skills/diagnosis/SKILL.md ADDED

@@ -0,0 +1,296 @@
+---
+name: diagnosis
+description: "Use when facing an unknown software failure, when symptoms point to different root causes, or when an initial debugging attempt has not converged. Provides a triage-first diagnostic routing framework: classify the failure, collect the right evidence, choose a technique, track confidence, and escalate when stuck. Do NOT use for executing scientific debugging after triage (use `debugging`), code-quality review (use `code-review`), or proactive observability setup."
+license: MIT
+compatibility: "Language- and stack-agnostic. The classification taxonomy, evidence protocol, and confidence ladder apply to any software failure investigation; specific technique names (git bisect, EXPLAIN plans, HMAC verification) are illustrative — substitute the equivalents of your stack."
+allowed-tools: Read Grep
+metadata:
+  metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"engineering\",\"domain\":\"engineering/debugging\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-06\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-06\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"diagnostic triage software failure\\\\\\\",\\\\\\\"symptom classification taxonomy\\\\\\\",\\\\\\\"what kind of bug is this\\\\\\\",\\\\\\\"which debugging approach\\\\\\\",\\\\\\\"diagnostic routing framework\\\\\\\",\\\\\\\"evidence collection before hypothesis\\\\\\\",\\\\\\\"confidence ladder debugging\\\\\\\",\\\\\\\"escalation criteria debugging\\\\\\\",\\\\\\\"cascade vs coincidence failure\\\\\\\",\\\\\\\"environment ghost\\\\\\\",\\\\\\\"failure not converging\\\\\\\",\\\\\\\"misclassified symptom debugging\\\\\\\",\\\\\\\"stuck at level 1 diagnosis\\\\\\\",\\\\\\\"debug technique selection matrix\\\\\\\",\\\\\\\"configuration vs code error\\\\\\\",\\\\\\\"timing vs logic error\\\\\\\",\\\\\\\"integration boundary failure\\\\\\\",\\\\\\\"data integrity vs logic error\\\\\\\"]\",\"examples\":\"[\\\\\\\"the agent has been chasing this bug for 30 minutes — what's the structural fix?\\\\\\\",\\\\\\\"the symptoms span data integrity and UI rendering — which is the root cause?\\\\\\\",\\\\\\\"the build fails locally but passes in CI — how do I diagnose that class first?\\\\\\\",\\\\\\\"I have a stack trace and an unhandled exception — what's the cheapest technique?\\\\\\\",\\\\\\\"intermittent failure that doesn't reproduce on retry — which class is this?\\\\\\\",\\\\\\\"we ran profiling, instrumentation, and bisect — none converge. 
What did we misclassify?\\\\\\\",\\\\\\\"two engineers disagree on whether this is a config issue or a logic error — what evidence settles it?\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"actually execute scientific-method debugging on this stack trace\\\\\\\",\\\\\\\"review this AI-generated PR for correctness\\\\\\\",\\\\\\\"scan this repo for OWASP top 10 vulnerabilities\\\\\\\",\\\\\\\"design observability instrumentation for this service\\\\\\\",\\\\\\\"decide which agent should pick up this ticket\\\\\\\",\\\\\\\"what's the right test pyramid for this feature\\\\\\\"]\",\"relations\":\"{\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"debugging\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"debugging is the *execution* phase (run a chosen technique against an already-classified failure); diagnosis is the *triage* phase before debugging — classify first, then debug\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"code-review\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"code-review evaluates code for quality / correctness in advance; diagnosis investigates an already-broken behavior\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"owasp-security\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"owasp-security is a domain-specific scan against a known threat list; diagnosis is the cross-domain triage that routes to security investigation only when symptoms point there\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"testing-strategy decides what to test proactively; diagnosis decides how to investigate after a test (or production) has revealed a 
failure\\\\\\\"}],\\\\\\\"related\\\\\\\":[\\\\\\\"debugging\\\\\\\",\\\\\\\"error-tracking\\\\\\\",\\\\\\\"code-review\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"debugging\\\\\\\"]}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":365,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/diagnosis/SKILL.md\"}"
+  skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
+  skill_graph_protocol: Skill Metadata Protocol v4
+  skill_graph_project: Skill Graph
+  skill_graph_canonical_skill: skills/diagnosis/SKILL.md
+---
+# Diagnosis
+## Coverage
+The triage-first framework that classifies a software failure into a _problem class_ and routes it to the right diagnostic technique before root-cause investigation begins. Names nine symptom classes — Logic Error, Runtime Crash, Data Integrity, Timing / Race, Performance, Configuration, Security, Integration, Tooling / Build / Script-path — and provides a classification decision tree that walks from "is there a stack trace?" to a single class. Specifies a universal evidence-collection protocol (exact error message, reproduction steps, last-known-good state, environment facts) and class-specific evidence checklists. Lays out the technique-selection matrix — stack-trace reading, data-flow tracing, git bisect, differential comparison, instrumentation, MRE isolation, profiling, boundary probing — with each technique's time cost, best-case class, and evidence prerequisite. Defines the diagnostic confidence ladder (level 0 Symptom → 1 Classified → 2 Localized → 3 Root Cause → 4 Verified Fix) with explicit "you can say / you cannot say" boundaries at each level and stuck-state checkpoints (5-min, 10-min, 15-min, oscillation). Names escalation criteria for switching approach, switching class, or escalating to a human. Covers three cross-domain patterns where multiple classes apply simultaneously: the Cascade (one root cause, many symptoms), the Coincidence (two unrelated bugs that look like one), the Environment Ghost (works in one environment, fails in another). Catalogues diagnostic anti-patterns and ships a structured diagnostic-session template.
+## Philosophy
+Debugging fails most often not because the engineer lacks skill, but because the wrong methodology is applied to the problem class. A timing bug needs different tools than a data-integrity bug. A multi-tenant leak needs different thinking than a rendering glitch. The most expensive debugging mistake is spending 30 minutes applying scientific-method debugging to what is actually a configuration error discoverable in 2 minutes.
+This skill is the triage _nurse_, not the surgeon. A nurse does not treat the patient — they take vital signs, route to cardiology or neurology, and escalate to the attending physician when criteria are met. Software diagnosis works the same way: collect evidence, classify the symptom, route to the right specialist technique, and pivot when convergence stalls. The 2–5 minute cost of triage is always smaller than the 30-minute cost of misdiagnosis. Skipping triage because "the cause is obvious" fails roughly 60% of the time on non-trivial bugs (confirmation bias) — even seasoned engineers benefit from making the classification step explicit.
+## 1. The Diagnostic Triage Protocol
+Before debugging, diagnose which _kind_ of problem you have. The class determines the technique and the technique determines the time-to-fix.
+```
+1. Collect baseline evidence       (Section 3)
+2. Classify the symptom            (Section 2)
+3. Select the diagnostic technique (Section 4)
+4. Execute using the routed technique
+5. If not converging after 3 attempts, escalate (Section 6)
+```
+**Rule:** never start fixing before completing steps 1–3. The cost of misclassification exceeds the cost of five minutes of triage every time.
+## 2. Symptom-Classification Taxonomy
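+Most failures fall into one of nine classes, each with a primary diagnostic technique. Failures that span several classes are covered in Section 7; anything the decision tree cannot place is marked Unknown / Unclassified and sent back to evidence collection.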
+| Class                             | Symptoms                                                                              | Primary technique                                                                                               |
+| --------------------------------- | ------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
+| **Logic Error**                   | Wrong output, wrong calculation, wrong state transition                               | Trace data flow; compare expected vs actual at each stage                                                       |
+| **Runtime Crash**                 | Unhandled exception, process exit, 500 error                                          | Read stack trace; find the throwing line; check preconditions                                                   |
+| **Data Integrity**                | Missing records, wrong totals, duplicate entries, cross-tenant leak                   | Compare source data to derived data at each transform stage                                                     |
+| **Timing / Race**                 | Intermittent failure, works on retry, order-dependent                                 | Add timestamps to logs; look for concurrent mutations; check locks                                              |
+| **Performance**                   | Slow response, timeout, memory growth, CPU spike                                      | Profile _first_ (measure before hypothesizing); find the hot path                                               |
+| **Configuration**                 | Works locally but not in staging / prod, env-dependent                                | Diff environments — env vars, versions, feature flags, DNS, SSL                                                 |
+| **Security**                      | Auth bypass, data exposure, HMAC failure, injection                                   | Follow data flow from untrusted input to sensitive operation                                                    |
+| **Integration**                   | Webhook not arriving, API returning unexpected shape, sync drift                      | Check both sides of the boundary independently, then compare                                                    |
+| **Tooling / Build / Script-path** | `Cannot find module`, wrong cwd, stale script paths, `read EIO`, `ENOENT` on a script | Verify path resolution; check cwd; verify dependency install; compare referenced path vs actual filesystem path |
+### Classification decision tree
+```
+Is there a stack trace or error message?
+  YES → Does it point to a specific line?
+          YES → Runtime Crash (read the line; check preconditions)
+          NO  → Is it a timeout or OOM?
+                  YES → Performance
+                  NO  → Logic Error (the error is a symptom of wrong state)
+  NO  → Is the output wrong but no error thrown?
+          YES → Is the wrongness in calculated numbers or records?
+                  YES → Data Integrity
+                  NO  → Logic Error
+          NO  → Is it intermittent?
+                  YES → Timing / Race
+                  NO  → Does it depend on environment?
+                          YES → Configuration
+                          NO  → Does the error message contain a file/module path?
+                                  YES → Tooling / Build / Script-path
+                                  NO  → Does it involve external services?
+                                          YES → Integration
+                                          NO  → Are there security signals
+                                                (auth failure, permission error,
+                                                unexpected data exposure, HMAC failure,
+                                                access-control bypass)?
+                                                  YES → Security
+                                                  NO  → Unknown / Unclassified
+                                                          → restart evidence collection;
+                                                            run a fresh investigative sweep
+```
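The decision tree above can be read as a chain of early returns. The following is a minimal Python sketch of that reading, not part of the package; the `Symptom` field names and the `classify` function are hypothetical, chosen only to mirror the questions in the tree.

```python
from dataclasses import dataclass


@dataclass
class Symptom:
    """Yes/no answers to the tree's questions (field names are illustrative)."""
    has_error_or_trace: bool = False
    points_to_line: bool = False
    timeout_or_oom: bool = False
    wrong_output: bool = False
    wrong_numbers_or_records: bool = False
    intermittent: bool = False
    env_dependent: bool = False
    path_in_message: bool = False
    external_service: bool = False
    security_signal: bool = False


def classify(s: Symptom) -> str:
    """Walk the tree top to bottom and return the first class that matches."""
    if s.has_error_or_trace:
        if s.points_to_line:
            return "Runtime Crash"
        return "Performance" if s.timeout_or_oom else "Logic Error"
    if s.wrong_output:
        return "Data Integrity" if s.wrong_numbers_or_records else "Logic Error"
    if s.intermittent:
        return "Timing / Race"
    if s.env_dependent:
        return "Configuration"
    if s.path_in_message:
        return "Tooling / Build / Script-path"
    if s.external_service:
        return "Integration"
    if s.security_signal:
        return "Security"
    # Nothing matched: the tree sends you back to evidence collection.
    return "Unknown / Unclassified"
```

Because the questions are asked in a fixed order, a symptom that would answer "yes" to several of them gets the first matching class only; Section 7 covers failures where more than one class applies.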
+## 3. Evidence-Collection Protocol
+Before forming any hypothesis, collect baseline evidence. The class determines the additional evidence needed beyond the universal set.
+### Universal evidence (always collect)
+| Evidence                            | How to collect                                               | Why                                      |
+| ----------------------------------- | ------------------------------------------------------------ | ---------------------------------------- |
+| Exact error message or wrong output | Copy from logs, terminal, or UI                              | Prevents paraphrasing errors             |
+| Reproduction steps                  | The minimal sequence that triggers the failure               | Proves the bug exists and is testable    |
+| Last-known-good state               | `git log --oneline -10`, recent deploys, recent data changes | Brackets the introduction window         |
+| Environment facts                   | Runtime version, env vars, database state, running services  | Eliminates the Configuration class early |
+### Class-specific evidence
+| Class                         | Additional evidence to collect                                                                                                      |
+| ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
+| Logic Error                   | Input data, expected output, actual output, intermediate values at key transform points                                             |
+| Runtime Crash                 | Full stack trace, request payload, database state at crash time                                                                     |
+| Data Integrity                | Source record count vs derived count, sample rows from each stage, tenant / scope identifiers                                       |
+| Timing / Race                 | Timestamps of concurrent operations, lock state, retry behaviour, whether it reproduces under load                                  |
+| Performance                   | Response-time baseline, CPU / memory profile, query plans (EXPLAIN), N+1 query check                                                |
+| Configuration                 | Env-var diff (local vs staging vs prod), package-version diff, feature-flag state                                                   |
+| Security                      | Auth state, session-token contents, role / permission, request headers, HMAC comparison                                             |
+| Integration                   | Request / response pair from both sides, delivery logs, timestamp alignment                                                         |
+| Tooling / Build / Script-path | Module-resolution output, current working directory at failure, dependency-install verification, referenced path vs filesystem path |
+**Rule:** if you cannot fill the universal evidence table, you are not ready to hypothesize. Collect first, think second.
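The rule above amounts to a gate that blocks hypothesizing until all four universal items are present. A minimal sketch, assuming evidence is kept in a plain dict; the key names and both function names are hypothetical:

```python
# The four rows of the universal evidence table, as illustrative dict keys.
UNIVERSAL_EVIDENCE = (
    "exact_error_or_output",
    "reproduction_steps",
    "last_known_good",
    "environment_facts",
)


def missing_evidence(collected: dict) -> list:
    """Return the universal evidence items that are absent or blank."""
    return [key for key in UNIVERSAL_EVIDENCE
            if not str(collected.get(key, "")).strip()]


def ready_to_hypothesize(collected: dict) -> bool:
    """True only when every universal evidence item has been filled in."""
    return not missing_evidence(collected)
```

For example, a session that has recorded only the error message and the reproduction steps would get `["last_known_good", "environment_facts"]` back from `missing_evidence`, which is the list of things to collect next.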
+## 4. Technique-Selection Matrix
+Once the symptom is classified, pick the cheapest technique that could resolve the class.
+| Technique                        | Best for                                     | Time cost      | Evidence required                       |
+| -------------------------------- | -------------------------------------------- | -------------- | --------------------------------------- |
+| **Stack-trace reading**          | Runtime crashes, unhandled exceptions        | 1–2 min        | Stack trace                             |
+| **Data-flow tracing**            | Logic errors, data integrity                 | 5–15 min       | Input + output at each stage            |
+| **Binary search (`git bisect`)** | Regressions with known-good state            | 3–10 min       | Known-good commit + reproducible test   |
+| **Differential comparison**      | Configuration, environment-dependent failure | 2–5 min        | Two environments to compare             |
+| **Instrumentation (logging)**    | Timing / race, intermittent failures         | 5–10 min setup | Hypothesis about where to instrument    |
+| **Isolation (MRE)**              | Complex failures with many variables         | 10–20 min      | Reproducible failure                    |
+| **Profiling**                    | Performance, memory, CPU                     | 5–15 min       | Running system under load               |
+| **Boundary probing**             | Integration failures                         | 5–10 min       | Access to both sides of the integration |
+### Technique-ordering principle
+Always start with the cheapest technique that could resolve the class:
+1. **Read the error** (~30 s) — solves ~40% of runtime crashes
+2. **Check the environment** (~1 min) — solves ~30% of configuration issues
+3. **Trace the data flow** (~5 min) — solves ~50% of logic / data errors
+4. **Isolate with MRE** (~10 min) — solves most of what remains
+5. **Instrument and observe** (~10+ min) — last resort for timing / intermittent failures
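One way to apply "cheapest technique first" is to encode the matrix rows that name a taxonomy class and sort the candidates by the lower bound of their time cost. This is a sketch under two assumptions that the source does not state: MRE isolation is treated as a class-independent fallback, and the lower bound of each time range is used as the cost. The names `MATRIX`, `FALLBACK`, and `plan` are hypothetical.

```python
# (technique, classes named in the "Best for" column, lower-bound minutes)
# Only rows that map onto a taxonomy class are listed; git bisect targets
# regressions rather than a single class, so it is left out of this sketch.
MATRIX = [
    ("Stack-trace reading", {"Runtime Crash"}, 1),
    ("Data-flow tracing", {"Logic Error", "Data Integrity"}, 5),
    ("Differential comparison", {"Configuration"}, 2),
    ("Instrumentation (logging)", {"Timing / Race"}, 5),
    ("Profiling", {"Performance"}, 5),
    ("Boundary probing", {"Integration"}, 5),
]

# Assumption: MRE isolation can be attempted for any class as a fallback.
FALLBACK = ("Isolation (MRE)", 10)


def plan(problem_class: str) -> list:
    """Return candidate techniques for a class, cheapest first."""
    candidates = [(cost, name) for name, classes, cost in MATRIX
                  if problem_class in classes]
    candidates.append((FALLBACK[1], FALLBACK[0]))
    return [name for _cost, name in sorted(candidates)]
```

Classes with no dedicated row in the matrix (Security and Tooling / Build / Script-path) fall through to the fallback alone in this sketch; their primary techniques are described in the taxonomy table instead.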
+## 5. The Diagnostic Confidence Ladder
+As evidence accumulates, confidence in the diagnosis should increase _monotonically_. If it drops, the most likely explanation is that the symptom has been misclassified.
+| Level            | Confidence | You can say                                             | You cannot say           |
+| ---------------- | ---------- | ------------------------------------------------------- | ------------------------ |
+| 0 — Symptom      | 0%         | "Something is wrong"                                    | Anything about the cause |
+| 1 — Classified   | 20%        | "This is a [class] problem"                             | Where specifically       |
+| 2 — Localized    | 50%        | "The failure is in [module / file / function]"          | What exactly is wrong    |
+| 3 — Root cause   | 80%        | "The cause is [specific condition]"                     | That the fix will work   |
+| 4 — Verified fix | 95%        | "This fix resolves the root cause and does not regress" | Nothing — ship it        |
+### Stuck-state checkpoints
+- **Stuck at level 0 for > 5 min** → you need more evidence; restart Section 3
+- **Stuck at level 1 for > 10 min** → likely misclassification; re-run the classification tree
+- **Stuck at level 2 for > 15 min** → the problem may be cross-domain; check whether multiple classes apply
+- **Oscillating between levels** → stop. Write down what you _know_ vs what you're _assuming_. At least one of those assumptions is wrong.
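The time-based checkpoints above can be expressed as a small lookup. A minimal sketch; the `CHECKPOINTS` table and the `checkpoint` function are hypothetical names, and the oscillation case is not covered here because it depends on history rather than elapsed time.

```python
from typing import Optional

# Ladder level -> (minutes allowed at that level, action once exceeded)
CHECKPOINTS = {
    0: (5, "collect more evidence (Section 3)"),
    1: (10, "re-run the classification tree"),
    2: (15, "check whether multiple classes apply"),
}


def checkpoint(level: int, minutes_at_level: float) -> Optional[str]:
    """Return the suggested action if the time limit for this level is exceeded."""
    entry = CHECKPOINTS.get(level)
    if entry is not None and minutes_at_level > entry[0]:
        return entry[1]
    # Levels 3 and 4 have no time-based checkpoint in the list above.
    return None
```

Calling `checkpoint(1, 12)` returns the re-classification action, while `checkpoint(1, 10)` returns `None` because the limit is "more than 10 minutes", not "10 minutes or more".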
+## 6. Escalation Criteria
+### Switch diagnostic approach when
+| Signal                                           | Action                                              |
+| ------------------------------------------------ | --------------------------------------------------- |
+| Three hypotheses tested, none confirmed          | Re-classify the symptom from scratch                |
+| Fix works locally but not in target env          | Switch to Configuration-class techniques            |
+| Multiple symptoms that don't share a root cause  | You may have 2+ bugs; triage each independently     |
+| Evidence contradicts the classification          | Trust the evidence; re-classify                     |
+| Confidence has _decreased_ over the last 3 steps | Stop. You're making it worse. Fresh context needed. |
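The last signal in the table can be checked mechanically if the ladder level is logged after each diagnostic step. The sketch below takes one possible reading of "decreased over the last 3 steps": the current level is lower than the level three steps earlier. Both function names are hypothetical.

```python
def confidence_dropped(levels: list, window: int = 3) -> bool:
    """True when the latest level is lower than the level `window` steps earlier."""
    if len(levels) <= window:
        return False  # not enough history to compare
    return levels[-1] < levels[-1 - window]


def is_monotonic(levels: list) -> bool:
    """True when the recorded levels never decrease from one step to the next."""
    return all(later >= earlier for earlier, later in zip(levels, levels[1:]))
```

A history such as `[1, 2, 2, 1, 1]` trips `confidence_dropped`, which under this reading is the cue to stop and restart with fresh context.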
+### Escalate to human when
+| Signal                                                                | Why a human is needed           |
+| --------------------------------------------------------------------- | ------------------------------- |
+| Requires access you don't have (production DB, third-party dashboard) | Authorization boundary          |
+| Business-logic ambiguity ("should this return 0 or null?")            | Product decision, not technical |
+| Fix requires a breaking change to a public API                        | Stakeholder alignment needed    |
+| Reproduction requires real user data you cannot access                | Privacy / compliance boundary   |
+| 30 minutes of investigation with no progress                          | Fresh perspective needed        |
+## 7. Cross-Domain Patterns
+Some failures span multiple classes simultaneously. These compound failures are the hardest to diagnose.
+### Pattern: the Cascade
+A single root cause triggers symptoms across multiple classes.
+```
+Root cause: missing null-check in a data transform
+  → Data Integrity symptom: wrong totals
+  → Logic Error symptom:    UI shows negative values
+  → Integration symptom:    webhook payload rejected by partner
+```
+**Diagnostic approach:** find the _earliest_ symptom in the data flow. That's closest to the root cause.
+### Pattern: the Coincidence
+Two unrelated bugs appear simultaneously, creating a misleading compound symptom.
+```
+Bug A: CSS regression from a recent deploy        (Logic Error)
+Bug B: slow API from an unrelated query change    (Performance)
+Combined symptom: "the page is broken and slow"
+```
+**Diagnostic approach:** separate the symptoms. Test each independently. If fixing one doesn't affect the other, they're independent bugs.
+### Pattern: the Environment Ghost
+Works in one environment, fails in another, with no code difference.
+```
+Local:    works   (runtime 20.11, .env.local, fresh DB)
+Staging:  fails   (runtime 20.9,  CI env vars, migrated DB)
+```
+**Diagnostic approach:** diff _everything_: runtime versions, env vars, DB state, feature flags, DNS, SSL, headers. Treat every difference as a candidate cause and eliminate them one at a time, starting with the first one you find.
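Diffing two environments is easy to script once the facts for each are gathered into a dict. A minimal sketch; `diff_facts` and the dictionary keys are hypothetical, and the sample values are taken from the Local / Staging example above (the shared `feature_x` flag is invented for illustration).

```python
def diff_facts(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every fact that differs.

    A key present on only one side is reported with "<unset>" on the other.
    """
    differences = {}
    for key in sorted(a.keys() | b.keys()):
        value_a = a.get(key, "<unset>")
        value_b = b.get(key, "<unset>")
        if value_a != value_b:
            differences[key] = (value_a, value_b)
    return differences


local = {"runtime": "20.11", "env_source": ".env.local", "db": "fresh", "feature_x": "on"}
staging = {"runtime": "20.9", "env_source": "CI env vars", "db": "migrated", "feature_x": "on"}

for key, (in_local, in_staging) in diff_facts(local, staging).items():
    print(f"{key}: local={in_local!r} staging={in_staging!r}")
```

Facts that match on both sides are left out, so the output is the list of candidates to investigate.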
+## 8. Anti-Patterns
+| Anti-pattern                               | Why it fails                                            | Correct                                          |
+| ------------------------------------------ | ------------------------------------------------------- | ------------------------------------------------ |
+| Fixing before diagnosing                   | Treats the symptom; root cause persists                 | Complete the triage protocol first               |
+| Hypothesis without evidence                | Confirmation bias drives you toward your guess          | Collect universal evidence before any hypothesis |
+| Changing multiple variables at once        | Cannot determine which change had the effect            | One variable at a time                           |
+| Assuming the obvious cause                 | The obvious cause is wrong ~60% of the time             | Verify with evidence even when "obvious"         |
+| Debugging by `printf` without a hypothesis | Random instrumentation wastes time                      | Instrument to test a _specific_ hypothesis       |
+| Applying the wrong class's technique       | Performance profiling won't find a logic error          | Re-classify if the technique isn't converging    |
+| Escalating too early                       | Hasn't gathered enough evidence for a useful escalation | Fill the evidence table before escalating        |
+| Escalating too late                        | Spent 45 minutes on what a human could resolve in 5     | Follow the time-based escalation triggers        |
+## 9. Diagnostic-Session Template
+Use this template to structure a diagnostic session. It prevents skipping steps.
+```markdown
+## Diagnostic Session: [Brief description]
+### 1. Symptom
+- What: [exact error or wrong behavior]
+- Where: [route / component / job]
+- When: [always / intermittent / environment-specific]
+- Since: [commit / deploy / data change]
+### 2. Classification
+- Primary class: [from taxonomy]
+- Confidence: [0–4 level]
+- Technique: [from technique matrix]
+### 3. Evidence Collected
+- [ ] Error message / wrong output (exact)
+- [ ] Reproduction steps (minimal)
+- [ ] Last-known-good state
+- [ ] Environment facts
+- [ ] Class-specific evidence: [list]
+### 4. Hypotheses Tested
+| #   | Hypothesis | Test | Result | Confidence after |
+| --- | ---------- | ---- | ------ | ---------------- |
+| 1   |            |      |        |                  |
+### 5. Resolution
+- Root cause: [one sentence]
+- Fix: [what was changed]
+- Prevention: [test / guard / doc added]
+```
+## Verification
+- [ ] The symptom was classified before any debugging technique was chosen
+- [ ] Baseline evidence was collected before any hypothesis was formed
+- [ ] The cheapest technique that could resolve this class was tried first
+- [ ] Confidence increased monotonically — or the symptom was re-classified the moment it didn't
+- [ ] If the approach was changed, the reason was documented (which signal triggered the switch)
+- [ ] The time-based stuck-state checkpoints were respected (5-min / 10-min / 15-min triggers)
+- [ ] If the failure spanned multiple classes, the cross-domain pattern (Cascade / Coincidence / Environment Ghost) was named explicitly
+## Do NOT Use When
+| Use instead        | When                                                                                                                                                    |
+| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `debugging`        | Actually executing scientific-method debugging on a failure that has _already_ been classified — this skill routes to debugging; it does not replace it |
+| `code-review`      | Reviewing code for quality / correctness _before_ a failure exists — diagnosis is downstream                                                            |
+| `owasp-security`   | A focused security audit against a known threat list — diagnosis only routes here when symptoms point at security                                       |
+| `testing-strategy` | Deciding what to test proactively — diagnosis is for _reactive_ investigation after a failure                                                           |
+| `error-tracking`   | Setting up the production-error-capture / sampling / alerting stack — diagnosis investigates a _specific_ failure already in front of you               |
+| `skill-router`     | Choosing which agent skill activates for an arbitrary query — that's cross-skill dispatch, not failure triage                                           |

package/marketplace/skills/diff-analysis/SKILL.md ADDED

@@ -0,0 +1,188 @@
+---
+name: diff-analysis
+description: "Use when analyzing `git diff`, reviewing a patch before commit, or explaining what a changeset does. Covers unified diff anatomy, hunk interpretation, semantic-vs-formatting separation, blast-radius tracing, hidden-risk scanning, and intent-vs-diff comparison. Do NOT use for full code-review verdicts (use `code-review`), git workflow decisions (use `version-control`), or visual diffs."
+license: MIT
+compatibility: "Markdown, Git, agent-skill runtimes"
+allowed-tools: Read Grep Bash
+metadata:
+  metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"quality\",\"domain\":\"quality/doctrine\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-03-28\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-03-28\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"git diff\\\\\\\",\\\\\\\"unified diff\\\\\\\",\\\\\\\"patch analysis\\\\\\\",\\\\\\\"changeset review\\\\\\\",\\\\\\\"diff analysis\\\\\\\",\\\\\\\"read a diff\\\\\\\"]\",\"triggers\":\"[\\\\\\\"diff-skill\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"semantics\\\\\\\"],\\\\\\\"boundary\\\\\\\":[\\\\\\\"code-review\\\\\\\",\\\\\\\"version-control\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"refactor\\\\\\\"]}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":90,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/diff-analysis/SKILL.md\"}"
+  skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
+  skill_graph_protocol: Skill Metadata Protocol v4
+  skill_graph_project: Skill Graph
+  skill_graph_canonical_skill: skills/diff-analysis/SKILL.md
+---
+# Diff Analysis
+## Domain Context
+**What is this skill?** This skill provides disciplined diff analysis for AI agents: reading code changes as a structured before/after artifact, isolating semantic changes from formatting noise, tracing blast radius across files, and extracting review-ready findings from a patch. Covers unified diff anatomy, hunk-by-hunk interpretation, scope validation, hidden-risk scanning, and intent-vs-diff comparison. Use when analyzing `git diff`, reviewing a patch before commit, or explaining what a changeset actually does. Do NOT use for full code-review verdicts (use code-review), git workflow decisions (use version-control), or pixel/image comparison (use playwright-cli or visual diff tooling).
+## Key Files
+| File | Purpose |
+|---|---|
+| `skills/diff-analysis/references/repo-diff-patterns.md` | Repo-grounded patch examples showing how diff classes map to real changes in this workspace. |
+| `skills/diff-analysis/references/diff-reading-checklist.md` | Step-by-step checklist for reading hunks, isolating semantic deltas, and naming blast radius. |
+## Coverage
+This skill covers reading and interpreting unified diffs and patches: the anatomy of `diff --git` output, hunk-by-hunk semantic extraction, file-level change classification (rename, mechanical rewrite, local logic edit, contract edit, test-only edit), separating signal from formatting noise, blast radius estimation for changed contracts and types, intent-vs-diff mismatch detection, and writing concise behavior-focused diff summaries. Does not cover full code review verdicts, git workflow/branching decisions, or visual/pixel comparison.
+## Philosophy
+Agents that skim diffs miss hidden behavior changes buried in formatting churn. A one-line guard removal inside a 300-line reformat can silently widen access. This skill exists because agents need a repeatable reading discipline -- structure first, meaning second, risk last -- instead of narrating every added and removed line equally. Without it, agents produce line-by-line restatements that miss the semantic delta and say "looks safe" without naming the blast radius.
+A diff is not just a list of changed lines. It is a compact representation of intent, scope, and risk. This skill helps agents read a patch accurately, separate real behavior changes from noise, and turn raw hunks into useful conclusions.
+When you need concrete patch shapes instead of the general rubric, read `references/repo-diff-patterns.md` for repo-grounded examples and `references/diff-reading-checklist.md` for the step-by-step reading checklist.
+## 1. What This Skill Owns
+| Owns | Does not own |
+| --- | --- |
+| Reading unified diffs and patches | Deciding branch strategy or release flow |
+| Separating semantic change from formatting churn | Full review sign-off across correctness/security/performance |
+| Mapping changed hunks to probable blast radius | Visual screenshot or image diffs |
+| Explaining what changed in plain language | Commit-policy or git-history governance |
+## 2. The Diff Reading Loop
+Read diffs in this order:
+1. Identify the file set.
+2. Classify each file by change type.
+3. Read hunk headers before line edits.
+4. Extract semantic change from each hunk.
+5. Check for scope mismatch or hidden blast radius.
+6. Summarize intent, risk, and verification needs.
+Do not start by reading every added and removed line equally. Start from structure, then meaning.
+## 3. Diff Anatomy
+| Diff part | What it tells you | How to use it |
+| --- | --- | --- |
+| `diff --git a/... b/...` | File identity | Build the file-level scope list |
+| `index ...` | Blob/version change | Usually low-value unless debugging patch application |
+| `---` / `+++` | Before and after file path | Confirm rename vs in-place edit |
+| `@@ ... @@` | Hunk location and nearby context | Understand where the change lands before reading lines |
+| `-` lines | Removed behavior/content | Ask what guarantee or behavior disappeared |
+| `+` lines | Added behavior/content | Ask what new state, branch, or dependency now exists |
+| context lines | Stable neighborhood | Use to infer surrounding intent and call path |
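As a rough illustration of the anatomy table above, the sketch below splits unified diff text into files and hunks and buckets each hunk's lines into removed, added, and context. The names `parse_unified_diff`, `FileDiff`, `Hunk`, and the `SAMPLE` patch are hypothetical and not part of this skill; the parser ignores `index` lines, binary patches, and rename headers, so treat it as a minimal reading aid rather than a full diff parser.

```python
import re
from dataclasses import dataclass, field

# Hunk header: @@ -old_start,old_len +new_start,new_len @@ optional section text
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")

@dataclass
class Hunk:  # hypothetical container, for illustration only
    old_start: int
    new_start: int
    section: str  # text after the second @@, often the enclosing function
    removed: list = field(default_factory=list)
    added: list = field(default_factory=list)
    context: list = field(default_factory=list)

@dataclass
class FileDiff:  # hypothetical container, for illustration only
    old_path: str = ""
    new_path: str = ""
    hunks: list = field(default_factory=list)

def parse_unified_diff(text):
    files, current, hunk = [], None, None
    for line in text.splitlines():
        if line.startswith("diff --git "):
            current, hunk = FileDiff(), None
            files.append(current)
        elif current is None:
            continue  # skip anything before the first file header
        elif hunk is None and line.startswith("--- "):
            current.old_path = line[4:]
        elif hunk is None and line.startswith("+++ "):
            current.new_path = line[4:]
        elif line.startswith("@@"):
            m = HUNK_RE.match(line)
            if m:
                hunk = Hunk(int(m.group(1)), int(m.group(3)), m.group(5))
                current.hunks.append(hunk)
        elif hunk is not None:
            if line.startswith("-"):
                hunk.removed.append(line[1:])
            elif line.startswith("+"):
                hunk.added.append(line[1:])
            elif line.startswith(" "):
                hunk.context.append(line[1:])
    return files

# Made-up patch: a guard is removed from a permission check.
SAMPLE = """\
diff --git a/auth.py b/auth.py
index 1111111..2222222 100644
--- a/auth.py
+++ b/auth.py
@@ -10,5 +10,3 @@ def can_view(user):
     if user is None:
         return False
-    if not user.active:
-        return False
     return True
"""

for f in parse_unified_diff(SAMPLE):
    for h in f.hunks:
        print(f.new_path, h.section, len(h.removed), len(h.added))
```

Reading the hunk header first, as the table suggests, already says the change lands inside `can_view`; only then do the two removed lines become meaningful.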
+## 4. File-Level Change Classification
+Before reading hunks, tag each changed file.
+| Change class | Typical signal | Primary question |
+| --- | --- | --- |
+| Rename/move | Path changed, little content churn | Is behavior unchanged but references now need updates? |
+| Mechanical rewrite | Many lines changed, low semantic delta | Is this formatting or real logic? |
+| Local logic edit | Small hunk in one function | What behavior changed here? |
+| Contract edit | Types, schemas, API responses, SQL view shape | What downstream consumers now need adjustment? |
+| Test-only edit | Only assertions/fixtures changed | Is the test following behavior or masking a regression? |
+This classification decides how deeply to inspect the diff.
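The tagging step can be approximated with a few heuristics. The sketch below is one possible reading of the table, not the skill's own implementation: the function names, the churn threshold, and the path hints for tests and contracts are all assumptions chosen for illustration, and paths are expected with any `a/` or `b/` prefix already stripped.

```python
RENAME_MAX_CHURN = 2  # assumed cutoff for "little content churn"

def _normalize(lines):
    # Collapse whitespace runs so pure re-indentation compares equal.
    return sorted(" ".join(l.split()) for l in lines if l.strip())

def is_test_path(path):
    # Hypothetical path hints; adjust to the repository's conventions.
    name = path.rsplit("/", 1)[-1]
    return ("/tests/" in f"/{path}" or name.startswith("test_")
            or ".test." in name or ".spec." in name)

def is_contract_path(path):
    # Hypothetical hints for types, schemas, and SQL shapes.
    name = path.rsplit("/", 1)[-1].lower()
    return name.endswith((".d.ts", ".sql", ".proto", ".graphql")) or "schema" in name

def classify_file_change(old_path, new_path, added, removed):
    """added/removed: changed line texts for one file, without +/- prefixes."""
    churn = len(added) + len(removed)
    if old_path != new_path and churn <= RENAME_MAX_CHURN:
        return "rename/move"
    if churn and _normalize(added) == _normalize(removed):
        return "mechanical rewrite"
    if is_test_path(new_path):
        return "test-only edit"
    if is_contract_path(new_path):
        return "contract edit"
    return "local logic edit"

print(classify_file_change("src/a.py", "src/a.py", ["return x + 1"], ["return x"]))
```

A tag produced this way is only a starting hypothesis: a file tagged as a mechanical rewrite still needs the proof-of-cosmetic check before its churn is ignored.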
+## 5. Semantic Extraction Per Hunk
+For each hunk, answer four questions:
+1. What behavior or contract existed before?
+2. What behavior or contract exists now?
+3. Is the change additive, restrictive, or substitutive?
+4. What adjacent path could now behave differently?
+### Hunk interpretation rules
+- A one-line edit can still be a contract break.
+- Large churn can still be mostly noise.
+- Added guards often narrow behavior; removed guards widen risk.
+- Type-only changes can imply runtime consequences if APIs or assumptions shift.
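The rule about guards can be turned into a quick screening pass. The sketch below assumes that a guard is any line starting with a keyword such as `if` or `assert`; that pattern, and the name `guard_changes`, are illustrative assumptions and will miss guards written in other forms.

```python
import re

# Assumed guard shapes; real code bases will need a broader pattern.
GUARD_RE = re.compile(r"^\s*(if|unless|assert|require|guard)\b")

def guard_changes(removed, added):
    """Return guard-like lines that exist on only one side of a hunk."""
    def norm(line):
        return " ".join(line.split())

    removed_guards = [norm(l) for l in removed if GUARD_RE.match(l)]
    added_guards = [norm(l) for l in added if GUARD_RE.match(l)]
    return {
        # Removed and not re-added: behavior may have widened.
        "guards_removed": [g for g in removed_guards if g not in added_guards],
        # Added with no earlier counterpart: behavior may have narrowed.
        "guards_added": [g for g in added_guards if g not in removed_guards],
    }

print(guard_changes(["    if not user.active:", "        return False"], []))
```

A non-empty `guards_removed` list is a prompt to answer the fourth question above (which adjacent path could now behave differently), not a verdict in itself.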
+## 6. Noise vs Signal
+| Looks noisy | May still matter because |
+| --- | --- |
+| Import reorder | It can hide a new dependency or removal of a side-effect import |
+| Rename-only edit | It can change route ownership, dynamic import paths, or symbol meaning |
+| Formatting rewrite | It can bury one real branch or condition change |
+| Test snapshot update | It can normalize a regression instead of proving a fix |
+### Signal extraction rules
+- First identify files with likely semantic impact.
+- Then ignore purely cosmetic churn, but only after proving it is cosmetic.
+- If one hunk mixes formatting and behavior, rewrite the summary around the behavior change only.
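One way to prove that churn is cosmetic is to cancel out lines that match after whitespace normalization and look at what remains. The sketch below does this with a multiset difference; `split_noise_from_signal` is a hypothetical helper, and it assumes that whitespace is the only kind of formatting noise.

```python
from collections import Counter

def split_noise_from_signal(removed, added):
    """Pair up lines that differ only in whitespace; report the residue."""
    def norm(line):
        return " ".join(line.split())

    r = Counter(norm(l) for l in removed if l.strip())
    a = Counter(norm(l) for l in added if l.strip())
    return {
        "cosmetic": sorted((r & a).elements()),        # present on both sides
        "removed_signal": sorted((r - a).elements()),  # truly gone
        "added_signal": sorted((a - r).elements()),    # truly new
    }

# Made-up hunk: a reformat that also drops one call.
result = split_noise_from_signal(
    removed=["if ok:  ", "  run( )", "check()"],
    added=["if ok:", "run( )"],
)
print(result["removed_signal"])
```

Because matching ignores position, a reordered line counts as cosmetic here. That is exactly the import-reorder case the table warns about, so reordered side-effect imports still need a manual look.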
+## 7. Blast Radius Checks
+After understanding the diff itself, ask what else the patch implicitly touches.
+| Change type | Likely blast radius |
+| --- | --- |
+| Public type/interface change | Callers, tests, route contracts, docs |
+| Query/view change | Services, report math, downstream consumers |
+| Auth/guard change | Access paths, redirects, error handling |
+| Config/env change | Startup paths, deployment docs, feature gates |
+| Utility change | Every call site using the helper |
+This skill does not require opening every dependent file. It requires naming the probable risk surface correctly.
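The blast-radius table can be carried as a small lookup so that the risk surface is named rather than implied. The dictionary keys and the `risk_surface` function below are illustrative labels, assumed for this sketch; the listed areas are taken from the table above.

```python
# Areas copied from the blast-radius table; the keys are hypothetical labels.
BLAST_RADIUS = {
    "public type/interface": ["callers", "tests", "route contracts", "docs"],
    "query/view": ["services", "report math", "downstream consumers"],
    "auth/guard": ["access paths", "redirects", "error handling"],
    "config/env": ["startup paths", "deployment docs", "feature gates"],
    "utility": ["every call site using the helper"],
}

def risk_surface(change_types):
    """Merge the probable risk areas for the given change types, in order."""
    surface = []
    for kind in change_types:
        # Unknown kinds are surfaced explicitly instead of being dropped.
        for area in BLAST_RADIUS.get(kind, ["unknown: name the surface manually"]):
            if area not in surface:
                surface.append(area)
    return surface

print(risk_surface(["auth/guard", "utility"]))
```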
+## 8. Intent vs Diff
+Compare the stated goal against the actual patch.
+| If the stated intent is... | Check whether the diff actually... |
+| --- | --- |
+| Fix a bug | Closes the failing path without silently broadening scope |
+| Refactor | Preserves behavior while changing structure |
+| Add a feature | Includes the necessary contract, UI, and verification changes |
+| Clean up | Removes dead weight without deleting active value |
+If the diff and stated intent disagree, the patch needs clarification or further work.
+## 9. Good Diff Summaries
+A good summary says why the change matters, not just what lines moved.
+### Use this format
+- File scope: which files changed and what kinds of changes they represent
+- Semantic delta: what behavior or contract changed
+- Risk surface: where regressions could now appear
+- Verify next: what should be tested or re-read next
+### Avoid
+- line-by-line narration of the whole patch
+- repeating obvious rename churn
+- calling a diff safe without naming the risk surface
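The four-part format can be enforced with a small structure that refuses to render without a named risk surface. `DiffSummary` and its field names are assumptions made for this sketch, and the example values are invented.

```python
from dataclasses import dataclass

@dataclass
class DiffSummary:  # hypothetical structure mirroring the format above
    file_scope: str
    semantic_delta: str
    risk_surface: str
    verify_next: str

    def render(self):
        # Mirrors the "Avoid" list: no summary without a named risk surface.
        if not self.risk_surface.strip():
            raise ValueError("name the risk surface before calling a diff safe")
        return "\n".join([
            f"- File scope: {self.file_scope}",
            f"- Semantic delta: {self.semantic_delta}",
            f"- Risk surface: {self.risk_surface}",
            f"- Verify next: {self.verify_next}",
        ])

summary = DiffSummary(
    file_scope="auth.py (local logic edit)",
    semantic_delta="inactive users are no longer rejected by can_view",
    risk_surface="access paths that rely on can_view",
    verify_next="test can_view with an inactive user",
)
print(summary.render())
```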
+## 10. Boundaries
+- Use `code-review` when you need a full review verdict and comment severity.
+- Use `version-control` for branching, rebasing, squash, release, or provenance policy.
+- Use `playwright-cli` or visual diff tools for screenshots and pixel comparison.
+- Use `scanning` when you need to move from a diff into exact files and line slices efficiently.
+## Verification
+After applying this skill, verify:
+- [ ] I classified the file set before reading hunks deeply.
+- [ ] I used hunk context, not only added/removed lines.
+- [ ] I separated semantic change from cosmetic noise.
+- [ ] I identified the likely blast radius of the patch.
+- [ ] I compared the diff against the claimed intent.
+- [ ] My summary explains behavior change, risk, and next verification step.
+- [ ] I did not produce a line-by-line narration of the whole patch.
+- [ ] I named the risk surface explicitly, not just "looks safe."
+## Do NOT Use When
+| When the task is | Use instead | Why |
+|---|---|---|
+| Full correctness/security review with blocking vs advisory comments | `code-review` | code-review owns the verdict structure and comment severity |
+| Git branching, rebasing, squash, release flow decisions | `version-control` | version-control owns branch strategy and release governance |
+| Visual screenshot or pixel comparison | `playwright-cli` | playwright-cli owns browser-based visual verification |
+| Finding specific code locations from a diff | `scanning` | scanning owns efficient file and line-slice navigation |