@skill-graph/cli 0.5.6
- package/CHANGELOG.md +247 -0
- package/LICENSE +200 -0
- package/NOTICE +62 -0
- package/README.md +398 -0
- package/SKILL_GRAPH.md +443 -0
- package/bin/skill-graph.js +374 -0
- package/docs/ADOPTION.md +117 -0
- package/docs/CONFORMANCE.md +66 -0
- package/docs/PRIMER.md +384 -0
- package/docs/QUICKSTART-30MIN.md +333 -0
- package/docs/ROUTING-METRICS.md +120 -0
- package/docs/SKILL-MD-FORMAT-COMPATIBILITY.md +127 -0
- package/docs/SKILL_AUDIT_CHECKLIST.md +199 -0
- package/docs/SKILL_AUDIT_LOOP.md +195 -0
- package/docs/SKILL_METADATA_PROTOCOL.md +609 -0
- package/docs/_archived/marketplace-publication-priority-2026-05-18.md +239 -0
- package/docs/adr/0001-predicate-set.md +69 -0
- package/docs/adr/0002-json-ld-context.md +82 -0
- package/docs/adr/0003-ontoclean-rigidity-tags.md +65 -0
- package/docs/adr/0004-persistent-identifiers.md +74 -0
- package/docs/adr/0005-freshness-consolidation.md +70 -0
- package/docs/adr/0006-revise-predicate-rename.md +105 -0
- package/docs/adr/0007-audit-loop-cadence.md +99 -0
- package/docs/adr/0008-skill-surface-split-and-curation-policy.md +93 -0
- package/docs/category-consumers.md +168 -0
- package/docs/concept-map.md +194 -0
- package/docs/diagrams/drift-states.mmd +21 -0
- package/docs/diagrams/manifest-pipeline.mmd +25 -0
- package/docs/diagrams/routing-harness.mmd +41 -0
- package/docs/diagrams/starter-graph.mmd +53 -0
- package/docs/field-decision-guide.md +315 -0
- package/docs/field-rationale.md +211 -0
- package/docs/field-reference.generated.md +624 -0
- package/docs/field-reference.md +1426 -0
- package/docs/glossary.md +190 -0
- package/docs/head-noun-glossary.md +63 -0
- package/docs/images/audit-phases.png +0 -0
- package/docs/images/drift-states.png +0 -0
- package/docs/images/graded-mode.png +0 -0
- package/docs/images/manifest-pipeline.png +0 -0
- package/docs/images/routing-harness.png +0 -0
- package/docs/images/skill-anatomy.png +0 -0
- package/docs/images/starter-graph.png +0 -0
- package/docs/images/system-model.png +0 -0
- package/docs/integrations/github-actions.md +155 -0
- package/docs/manifest-field-mapping.md +443 -0
- package/docs/marketplace-publication-queue.generated.md +240 -0
- package/docs/marketplace-release-agent-prompt.md +82 -0
- package/docs/marketplace-skill-candidate-list.md +272 -0
- package/docs/marketplace-syndication.md +222 -0
- package/docs/migration-sample-review.md +155 -0
- package/docs/migrations/v4-to-v5.md +168 -0
- package/docs/migrations/v5-to-v6.md +221 -0
- package/docs/name-exceptions.yaml +37 -0
- package/docs/plans/marketplace-p1-public-migration-plan.md +41 -0
- package/docs/plans/multi-root-workspace.md +148 -0
- package/docs/plans/scripts-roadmap.md +107 -0
- package/docs/plans/v4-schema-bump.md +160 -0
- package/docs/plans/wave-2-extraction.md +122 -0
- package/docs/positioning-vs-marketplaces.md +175 -0
- package/docs/proposals/skill-audit-loop-positioning.md +160 -0
- package/docs/quality-doctrine.md +138 -0
- package/docs/recommended-skills.md +150 -0
- package/docs/research/skill-comprehension-eval-research.md +1830 -0
- package/docs/research/skill-retrieval-evidence.md +66 -0
- package/docs/skill-metadata-protocol.md +471 -0
- package/docs/skills-sh-maintainer-cleanup-request.md +80 -0
- package/examples/audits/a11y/findings.md +52 -0
- package/examples/audits/a11y/scorecard.md +21 -0
- package/examples/audits/a11y/verdict.md +44 -0
- package/examples/audits/debugging/findings.md +59 -0
- package/examples/audits/debugging/scorecard.md +22 -0
- package/examples/audits/debugging/verdict.md +33 -0
- package/examples/audits/documentation/findings.md +59 -0
- package/examples/audits/documentation/scorecard.md +22 -0
- package/examples/audits/documentation/verdict.md +33 -0
- package/examples/evals/a11y.json +140 -0
- package/examples/evals/api-design.json +52 -0
- package/examples/evals/code-review.json +52 -0
- package/examples/evals/data-modeling.json +52 -0
- package/examples/evals/database-migration.json +52 -0
- package/examples/evals/debugging.json +118 -0
- package/examples/evals/dependency-architecture.json +52 -0
- package/examples/evals/design-system-architecture.json +52 -0
- package/examples/evals/error-tracking.json +52 -0
- package/examples/evals/event-contract-design.json +52 -0
- package/examples/evals/form-ux-architecture.json +52 -0
- package/examples/evals/framework-fit-analysis.json +52 -0
- package/examples/evals/graph-audit.json +139 -0
- package/examples/evals/information-architecture.json +52 -0
- package/examples/evals/interaction-feedback.json +52 -0
- package/examples/evals/interaction-patterns.json +52 -0
- package/examples/evals/layout-composition.json +52 -0
- package/examples/evals/lint-overlay.json +117 -0
- package/examples/evals/microcopy.json +52 -0
- package/examples/evals/observability-modeling.json +52 -0
- package/examples/evals/pattern-recognition.json +96 -0
- package/examples/evals/performance-engineering.json +52 -0
- package/examples/evals/refactor.json +128 -0
- package/examples/evals/semiotics.json +52 -0
- package/examples/evals/skill-infrastructure.json +96 -0
- package/examples/evals/skill-router.json +140 -0
- package/examples/evals/skill-router.routing.json +113 -0
- package/examples/evals/system-interface-contracts.json +52 -0
- package/examples/evals/task-analysis.json +52 -0
- package/examples/evals/testing-strategy.json +118 -0
- package/examples/evals/type-safety.json +249 -0
- package/examples/evals/visual-design-foundations.json +52 -0
- package/examples/evals/webhook-integration.json +52 -0
- package/examples/exports/a11y.skill-md.md +80 -0
- package/examples/exports/debugging.skill-md.md +80 -0
- package/examples/exports/refactor.skill-md.md +78 -0
- package/examples/exports/testing-strategy.skill-md.md +81 -0
- package/examples/projects/markdown-static-site/README.md +115 -0
- package/examples/projects/markdown-static-site/skills/content-source-router/SKILL.md +131 -0
- package/examples/projects/markdown-static-site/skills/image-optimization-pipeline-config/SKILL.md +132 -0
- package/examples/projects/markdown-static-site/skills/link-rot-detection/SKILL.md +103 -0
- package/examples/projects/markdown-static-site/skills/markdown-post-frontmatter-validation/SKILL.md +133 -0
- package/examples/projects/markdown-static-site/skills/migrate-posts-to-v2-frontmatter/SKILL.md +140 -0
- package/examples/projects/saas-stripe-postgres/README.md +208 -0
- package/examples/projects/saas-stripe-postgres/db/migrations/0004_canonicalize_orders.sql +37 -0
- package/examples/projects/saas-stripe-postgres/db/schema.sql +112 -0
- package/examples/projects/saas-stripe-postgres/skills/migrate-orders-to-canonical-schema/SKILL.md +149 -0
- package/examples/projects/saas-stripe-postgres/skills/nextjs-server-action-validation/SKILL.md +154 -0
- package/examples/projects/saas-stripe-postgres/skills/payment-provider-router/SKILL.md +153 -0
- package/examples/projects/saas-stripe-postgres/skills/postgres-rls-pattern/SKILL.md +163 -0
- package/examples/projects/saas-stripe-postgres/skills/stripe-webhook-signature-verification/SKILL.md +137 -0
- package/examples/protocol/skill-metadata-template.md +301 -0
- package/examples/protocol/skills.manifest.sample.json +13245 -0
- package/examples/skill-metadata-template.md +317 -0
- package/examples/skills.manifest.sample.json +13519 -0
- package/examples/tests/v3-1-skos-fixture/SKILL.md +93 -0
- package/marketplace/README.md +17 -0
- package/marketplace/skills/a11y/SKILL.md +66 -0
- package/marketplace/skills/acid-fundamentals/SKILL.md +106 -0
- package/marketplace/skills/agent-engineering/SKILL.md +386 -0
- package/marketplace/skills/agent-eval-design/SKILL.md +55 -0
- package/marketplace/skills/ai-native-development/SKILL.md +294 -0
- package/marketplace/skills/api-design/SKILL.md +60 -0
- package/marketplace/skills/architecture-decision-records/SKILL.md +55 -0
- package/marketplace/skills/background-jobs/SKILL.md +265 -0
- package/marketplace/skills/bounded-context-mapping/SKILL.md +55 -0
- package/marketplace/skills/cap-theorem-tradeoffs/SKILL.md +127 -0
- package/marketplace/skills/client-server-boundary/SKILL.md +187 -0
- package/marketplace/skills/code-review/SKILL.md +120 -0
- package/marketplace/skills/color-system-design/SKILL.md +43 -0
- package/marketplace/skills/component-architecture/SKILL.md +126 -0
- package/marketplace/skills/compression/SKILL.md +112 -0
- package/marketplace/skills/conceptual-modeling/SKILL.md +181 -0
- package/marketplace/skills/connection-pooling/SKILL.md +105 -0
- package/marketplace/skills/constraint-awareness/SKILL.md +287 -0
- package/marketplace/skills/content-monitor/SKILL.md +209 -0
- package/marketplace/skills/context-engineering/SKILL.md +320 -0
- package/marketplace/skills/context-graph/SKILL.md +174 -0
- package/marketplace/skills/context-management/SKILL.md +174 -0
- package/marketplace/skills/context-window/SKILL.md +239 -0
- package/marketplace/skills/contract-testing/SKILL.md +120 -0
- package/marketplace/skills/cron-scheduling/SKILL.md +223 -0
- package/marketplace/skills/dark-mode-implementation/SKILL.md +47 -0
- package/marketplace/skills/data-modeling/SKILL.md +59 -0
- package/marketplace/skills/data-modeling-fundamentals/SKILL.md +117 -0
- package/marketplace/skills/database-migration/SKILL.md +429 -0
- package/marketplace/skills/debugging/SKILL.md +67 -0
- package/marketplace/skills/dependency-architecture/SKILL.md +58 -0
- package/marketplace/skills/design-module-composition/SKILL.md +43 -0
- package/marketplace/skills/design-system-architecture/SKILL.md +61 -0
- package/marketplace/skills/design-thinking/SKILL.md +44 -0
- package/marketplace/skills/diagnosis/SKILL.md +296 -0
- package/marketplace/skills/diff-analysis/SKILL.md +188 -0
- package/marketplace/skills/e2e-test-design/SKILL.md +113 -0
- package/marketplace/skills/entity-relationship-modeling/SKILL.md +218 -0
- package/marketplace/skills/epistemic-grounding/SKILL.md +112 -0
- package/marketplace/skills/error-boundary/SKILL.md +235 -0
- package/marketplace/skills/error-tracking/SKILL.md +261 -0
- package/marketplace/skills/eval-driven-development/SKILL.md +147 -0
- package/marketplace/skills/evaluation/SKILL.md +113 -0
- package/marketplace/skills/event-contract-design/SKILL.md +60 -0
- package/marketplace/skills/event-storming/SKILL.md +56 -0
- package/marketplace/skills/form-ux-architecture/SKILL.md +60 -0
- package/marketplace/skills/framework-fit-analysis/SKILL.md +59 -0
- package/marketplace/skills/frontend-architecture/SKILL.md +43 -0
- package/marketplace/skills/generative-ui/SKILL.md +118 -0
- package/marketplace/skills/graph-audit/SKILL.md +81 -0
- package/marketplace/skills/guardrails/SKILL.md +118 -0
- package/marketplace/skills/hooks-patterns/SKILL.md +185 -0
- package/marketplace/skills/http-semantics/SKILL.md +136 -0
- package/marketplace/skills/ideation/SKILL.md +41 -0
- package/marketplace/skills/indexing-strategy/SKILL.md +108 -0
- package/marketplace/skills/information-architecture/SKILL.md +59 -0
- package/marketplace/skills/integration-test-design/SKILL.md +111 -0
- package/marketplace/skills/intent-recognition/SKILL.md +136 -0
- package/marketplace/skills/interaction-feedback/SKILL.md +59 -0
- package/marketplace/skills/interaction-patterns/SKILL.md +59 -0
- package/marketplace/skills/journey-mapping/SKILL.md +41 -0
- package/marketplace/skills/keywords/SKILL.md +213 -0
- package/marketplace/skills/knowledge-modeling/SKILL.md +232 -0
- package/marketplace/skills/layout-composition/SKILL.md +59 -0
- package/marketplace/skills/linguistics/SKILL.md +429 -0
- package/marketplace/skills/lint-overlay/SKILL.md +76 -0
- package/marketplace/skills/mental-models/SKILL.md +126 -0
- package/marketplace/skills/merge-queue/SKILL.md +94 -0
- package/marketplace/skills/methodology/SKILL.md +317 -0
- package/marketplace/skills/microcopy/SKILL.md +232 -0
- package/marketplace/skills/middleware-patterns/SKILL.md +363 -0
- package/marketplace/skills/mobile-responsive-ux/SKILL.md +287 -0
- package/marketplace/skills/mutation-testing/SKILL.md +112 -0
- package/marketplace/skills/naming-conventions/SKILL.md +112 -0
- package/marketplace/skills/observability-modeling/SKILL.md +59 -0
- package/marketplace/skills/ontology-modeling/SKILL.md +67 -0
- package/marketplace/skills/owasp-security/SKILL.md +153 -0
- package/marketplace/skills/pattern-recognition/SKILL.md +472 -0
- package/marketplace/skills/performance-budgets/SKILL.md +185 -0
- package/marketplace/skills/performance-engineering/SKILL.md +58 -0
- package/marketplace/skills/performance-testing/SKILL.md +125 -0
- package/marketplace/skills/printify/SKILL.md +42 -0
- package/marketplace/skills/prioritization/SKILL.md +118 -0
- package/marketplace/skills/problem-framing/SKILL.md +41 -0
- package/marketplace/skills/problem-locating-solving/SKILL.md +203 -0
- package/marketplace/skills/project-knowledge-extraction/SKILL.md +54 -0
- package/marketplace/skills/prompt-craft/SKILL.md +134 -0
- package/marketplace/skills/prompt-injection-defense/SKILL.md +132 -0
- package/marketplace/skills/property-based-testing/SKILL.md +100 -0
- package/marketplace/skills/prototyping/SKILL.md +43 -0
- package/marketplace/skills/query-optimization/SKILL.md +144 -0
- package/marketplace/skills/real-time-updates/SKILL.md +324 -0
- package/marketplace/skills/ref-patterns/SKILL.md +284 -0
- package/marketplace/skills/refactor/SKILL.md +65 -0
- package/marketplace/skills/rendering-models/SKILL.md +142 -0
- package/marketplace/skills/replication-patterns/SKILL.md +110 -0
- package/marketplace/skills/research-synthesis/SKILL.md +41 -0
- package/marketplace/skills/route-handler-design/SKILL.md +347 -0
- package/marketplace/skills/schema-evolution/SKILL.md +140 -0
- package/marketplace/skills/security-fundamentals/SKILL.md +139 -0
- package/marketplace/skills/semantic-center/SKILL.md +194 -0
- package/marketplace/skills/semantic-relations/SKILL.md +250 -0
- package/marketplace/skills/semantics/SKILL.md +366 -0
- package/marketplace/skills/semiotics/SKILL.md +230 -0
- package/marketplace/skills/seo-strategy/SKILL.md +260 -0
- package/marketplace/skills/server-actions-design/SKILL.md +243 -0
- package/marketplace/skills/server-components-design/SKILL.md +190 -0
- package/marketplace/skills/sharding-strategy/SKILL.md +123 -0
- package/marketplace/skills/shopify/SKILL.md +42 -0
- package/marketplace/skills/skill-infrastructure/SKILL.md +320 -0
- package/marketplace/skills/skill-router/SKILL.md +71 -0
- package/marketplace/skills/skill-scaffold/SKILL.md +105 -0
- package/marketplace/skills/snapshot-testing/SKILL.md +120 -0
- package/marketplace/skills/spec-driven-development/SKILL.md +148 -0
- package/marketplace/skills/state-machine-modeling/SKILL.md +56 -0
- package/marketplace/skills/state-management/SKILL.md +134 -0
- package/marketplace/skills/streaming-architecture/SKILL.md +194 -0
- package/marketplace/skills/summarization/SKILL.md +156 -0
- package/marketplace/skills/suspense-patterns/SKILL.md +265 -0
- package/marketplace/skills/system-interface-contracts/SKILL.md +59 -0
- package/marketplace/skills/task-analysis/SKILL.md +201 -0
- package/marketplace/skills/taxonomy-design/SKILL.md +66 -0
- package/marketplace/skills/test-coverage-strategy/SKILL.md +108 -0
- package/marketplace/skills/test-doubles-design/SKILL.md +98 -0
- package/marketplace/skills/test-driven-development/SKILL.md +96 -0
- package/marketplace/skills/testing-strategy/SKILL.md +67 -0
- package/marketplace/skills/theme-system-design/SKILL.md +43 -0
- package/marketplace/skills/tool-call-flow/SKILL.md +229 -0
- package/marketplace/skills/tool-call-strategy/SKILL.md +292 -0
- package/marketplace/skills/transaction-isolation/SKILL.md +98 -0
- package/marketplace/skills/type-safety/SKILL.md +177 -0
- package/marketplace/skills/typography-system/SKILL.md +43 -0
- package/marketplace/skills/usability-testing/SKILL.md +43 -0
- package/marketplace/skills/user-research/SKILL.md +43 -0
- package/marketplace/skills/vercel-composition-patterns/SKILL.md +157 -0
- package/marketplace/skills/version-control/SKILL.md +233 -0
- package/marketplace/skills/visual-design-foundations/SKILL.md +59 -0
- package/marketplace/skills/visual-hierarchy/SKILL.md +43 -0
- package/marketplace/skills/webhook-integration/SKILL.md +331 -0
- package/marketplace/skills/writing-humanizer/SKILL.md +380 -0
- package/package.json +67 -0
- package/schemas/manifest.schema.json +811 -0
- package/schemas/manifest.v2.schema.json +164 -0
- package/schemas/manifest.v3.schema.json +758 -0
- package/schemas/manifest.v4.schema.json +755 -0
- package/schemas/manifest.v5.schema.json +755 -0
- package/schemas/manifest.v6.schema.json +811 -0
- package/schemas/skill.context.jsonld +279 -0
- package/schemas/skill.schema.json +919 -0
- package/schemas/skill.v2.schema.json +201 -0
- package/schemas/skill.v3.schema.json +827 -0
- package/schemas/skill.v4.schema.json +822 -0
- package/schemas/skill.v5.schema.json +830 -0
- package/schemas/skill.v6.schema.json +946 -0
- package/schemas/vocabulary/keywords.json +180 -0
- package/schemas/vocabulary/workspace_tags.json +23 -0
- package/scripts/__tests__/migrate-skill-v2-to-v3.test.js +161 -0
- package/scripts/__tests__/migrate-skill-v3-to-v4.test.js +158 -0
- package/scripts/__tests__/test-export-parser-drift.js +149 -0
- package/scripts/__tests__/test-marketplace-export.js +114 -0
- package/scripts/__tests__/test-router-paths.js +82 -0
- package/scripts/__tests__/test-stability-promotion.js +244 -0
- package/scripts/__tests__/test-v3-1-alias-contract.js +109 -0
- package/scripts/__tests__/test-v3-1-skos-runtime.js +116 -0
- package/scripts/backfill-schema-version.js +198 -0
- package/scripts/build-field-reference.js +160 -0
- package/scripts/build-retrieval-baseline.js +511 -0
- package/scripts/check-markdown-links.js +211 -0
- package/scripts/check-protocol-consistency.js +979 -0
- package/scripts/export-marketplace-skills.js +610 -0
- package/scripts/export-skill.js +374 -0
- package/scripts/generate-manifest.js +787 -0
- package/scripts/lib/alias-contract.js +83 -0
- package/scripts/lib/audit-prompt-builder.js +771 -0
- package/scripts/lib/mock-grader.js +134 -0
- package/scripts/lib/parse-frontmatter.js +429 -0
- package/scripts/lib/roots.js +119 -0
- package/scripts/lint/check-archetype-sections.js +185 -0
- package/scripts/lint/check-category-enum.js +83 -0
- package/scripts/lint/check-routing-eval.js +146 -0
- package/scripts/lint/check-routing-quality.js +211 -0
- package/scripts/lint/check-stability-promotion.js +220 -0
- package/scripts/lint/format-code-frame.js +206 -0
- package/scripts/marketplace-install.js +125 -0
- package/scripts/migrate-category-to-enum.js +169 -0
- package/scripts/migrate-skill-v2-to-v3.js +424 -0
- package/scripts/migrate-skill-v3-to-v4.js +200 -0
- package/scripts/migrate-skill-v5-to-v6.js +304 -0
- package/scripts/restructure-by-category.js +85 -0
- package/scripts/seed-publication-classification.js +282 -0
- package/scripts/skill-audit.js +893 -0
- package/scripts/skill-graph-drift.js +483 -0
- package/scripts/skill-graph-route.js +766 -0
- package/scripts/skill-graph-routing-eval.js +393 -0
- package/scripts/skill-lint.js +1317 -0
- package/scripts/skill-overlap.js +213 -0
- package/scripts/verify-skill-md-export.js +201 -0
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: transaction-isolation
|
|
3
|
+
description: "Use when reasoning about the I in ACID (the isolation level a database provides between concurrent transactions): the four SQL-standard levels (read uncommitted, read committed, repeatable read, serializable) plus snapshot isolation; the anomalies each level admits (dirty reads, non-repeatable reads, phantom reads, write skew, lost updates); the Berenson et al. 1995 critique that exposed the standard's looseness; the difference between locking-based and MVCC-based isolation; Postgres's Serializable Snapshot Isolation (SSI) as one rigorous implementation; how to choose an isolation level for a workload by enumerating the anomalies the workload cannot tolerate. Do NOT use for the broader ACID frame (use acid-fundamentals), distributed-replica consistency (use cap-theorem-tradeoffs), query performance tuning (use query-optimization), or schema design (use data-modeling)."
|
|
4
|
+
license: MIT
|
|
5
|
+
allowed-tools: Read Grep
|
|
6
|
+
metadata:
|
|
7
|
+
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"engineering\",\"domain\":\"engineering/data\",\"scope\":\"reference\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-16\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-16\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"comprehension_state\":\"present\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"isolation level\\\\\\\",\\\\\\\"read committed\\\\\\\",\\\\\\\"repeatable read\\\\\\\",\\\\\\\"serializable\\\\\\\",\\\\\\\"snapshot isolation\\\\\\\",\\\\\\\"SSI\\\\\\\",\\\\\\\"MVCC\\\\\\\",\\\\\\\"dirty read\\\\\\\",\\\\\\\"non-repeatable read\\\\\\\",\\\\\\\"phantom read\\\\\\\",\\\\\\\"write skew\\\\\\\",\\\\\\\"lost update\\\\\\\",\\\\\\\"Berenson\\\\\\\"]\",\"triggers\":\"[\\\\\\\"what isolation level do we need\\\\\\\",\\\\\\\"is read committed enough\\\\\\\",\\\\\\\"what's write skew\\\\\\\",\\\\\\\"MVCC vs locking\\\\\\\",\\\\\\\"Postgres serializable vs MySQL serializable\\\\\\\"]\",\"examples\":\"[\\\\\\\"choose an isolation level for a workload that has concurrent balance-decrement operations\\\\\\\",\\\\\\\"diagnose a data-correctness bug caused by an anomaly the chosen isolation level permits\\\\\\\",\\\\\\\"explain the difference between snapshot isolation and full serializability\\\\\\\",\\\\\\\"decide whether to use SELECT FOR UPDATE or upgrade isolation level\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"explain the four ACID properties (use acid-fundamentals)\\\\\\\",\\\\\\\"reason about distributed-replica consistency under partition (use cap-theorem-tradeoffs)\\\\\\\",\\\\\\\"tune a slow query (use 
query-optimization)\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"acid-fundamentals\\\\\\\",\\\\\\\"cap-theorem-tradeoffs\\\\\\\",\\\\\\\"data-modeling\\\\\\\",\\\\\\\"query-optimization\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"acid-fundamentals\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"acid-fundamentals owns the four-property ACID frame as a whole; this skill owns the I axis specifically — the choice and semantics of isolation levels as a tunable. The two compose: acid-fundamentals names isolation as one of four guarantees; this skill makes the I axis operational.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"cap-theorem-tradeoffs\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"cap-theorem-tradeoffs owns distributed-replica agreement (CAP's C); this skill owns single-cluster transaction isolation (ACID's I). The two C/I letters concern different layers of the system.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"query-optimization\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"query-optimization owns the performance dimension of query execution; this skill owns the correctness dimension of concurrent execution. 
Optimizations that change locking behavior can shift anomaly exposure; the two interact.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"data-modeling\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"data-modeling owns schema and access-pattern design; this skill owns the concurrency-correctness contract under whichever schema and access pattern are chosen.\\\\\\\"}],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"acid-fundamentals\\\\\\\",\\\\\\\"query-optimization\\\\\\\"]}\",\"mental_model\":\"|\",\"purpose\":\"|\",\"boundary\":\"|\",\"analogy\":\"An isolation level is to a database what a confidentiality regime is to a research lab — Read Uncommitted is the open whiteboard (anyone can read anyone's half-finished work); Read Committed is the rule that you only photograph your colleague's notebook after they have signed each page; Repeatable Read is a sealed envelope (you read once, you keep reading the same thing for the duration); Serializable is a locked vault (you take exclusive custody, others queue); Snapshot Isolation is a private photocopy (you read from your photocopy, others read from theirs, and only at commit time do the photocopies have to agree on the world they were taken from).\",\"misconception\":\"|\",\"concept\":\"{\\\\\\\"definition\\\\\\\":\\\\\\\"Transaction isolation is the property — and the configurable choice — that determines whether concurrent transactions appear to execute serially or are permitted to observe each other's intermediate effects. The SQL standard defines four isolation levels (read uncommitted, read committed, repeatable read, serializable) by enumerating the anomalies each level may or may not permit (dirty reads, non-repeatable reads, phantom reads). Snapshot isolation, ubiquitous in modern MVCC databases, is a fifth practical level that the standard did not define. 
Stronger isolation eliminates more anomalies at the cost of concurrency (more transactions block or retry); weaker isolation admits anomalies that the application must handle, either by tolerating them, by upgrading the isolation level, or by using explicit locking. The discipline is choosing the isolation level by *naming the anomalies the application cannot tolerate*, not by reflex.\\\\\\\",\\\\\\\"mental_model\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"purpose\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"boundary\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"taxonomy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"analogy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"misconception\\\\\\\":\\\\\\\"|\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/transaction-isolation/SKILL.md\"}"
|
|
8
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
9
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
10
|
+
skill_graph_project: Skill Graph
|
|
11
|
+
skill_graph_canonical_skill: skills/transaction-isolation/SKILL.md
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
# Transaction Isolation
|
|
15
|
+
|
|
16
|
+
## Coverage
|
|
17
|
+
|
|
18
|
+
The I axis of ACID — the property and operational choice that determines what concurrent transactions can observe of each other. Covers the five practical levels (read uncommitted, read committed, repeatable read, snapshot isolation, serializable), the anomalies each admits (dirty reads, non-repeatable reads, phantom reads, lost updates, write skew, read-only-transaction anomalies), the Berenson et al. (1995) critique that exposed the SQL standard's looseness, the two dominant implementation mechanisms (two-phase locking and MVCC), Postgres's Serializable Snapshot Isolation (SSI), and the choice procedure: enumerate the anomalies the workload cannot tolerate, choose the lowest level that prevents them.
|
|
19
|
+
|
|
20
|
+
## Philosophy
|
|
21
|
+
|
|
22
|
+
Isolation is the most often mis-defaulted and most misunderstood part of ACID. The standard's level names are not universal across databases; the level the team thinks it is running may not be the one the database actually provides; and the workload's vulnerability to specific anomalies determines the required level, yet most teams don't enumerate that vulnerability before choosing.
|
|
23
|
+
|
|
24
|
+
The discipline is to make the choice explicit, per transaction, against a named anomaly set. "Default to serializable for safety" is over-conservative; "default to read committed for performance" is under-safe; "this workload has invariants across rows that are vulnerable to write skew under SI, so this transaction runs at serializable" is the discipline.
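To make the write-skew case concrete, here is a minimal in-memory sketch in Python. It is a toy model, not any database's implementation: `SnapshotStore`, `Txn`, and `go_off_call` are hypothetical names, and the commit rule assumed here (abort only on write-write conflicts) is the simplified first-committer-wins rule usually attributed to snapshot isolation.

```python
# Toy model of snapshot isolation (an illustration, not a real engine).
# Each transaction reads from a private snapshot taken at begin() and
# buffers its writes until commit(). Commit aborts only when a key the
# transaction WROTE was changed by someone else since the snapshot.


class SnapshotStore:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)          # private copy of the data
        self.start_version = dict(store.version)  # versions seen at start
        self.writes = {}

    def read(self, key):
        # Reads see the transaction's own writes first, then the snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins, checked on written keys only.
        for key in self.writes:
            if self.store.version[key] != self.start_version[key]:
                raise RuntimeError("write-write conflict on " + key)
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1


def go_off_call(txn, me, other):
    # Intended invariant: at least one doctor stays on call.
    if txn.read(other):
        txn.write(me, False)


store = SnapshotStore({"alice": True, "bob": True})
t1, t2 = store.begin(), store.begin()
go_off_call(t1, "alice", "bob")   # t1 sees bob on call, takes alice off
go_off_call(t2, "bob", "alice")   # t2 sees alice on call, takes bob off
t1.commit()
t2.commit()  # succeeds: the two write sets are disjoint

on_call = [name for name, on in store.data.items() if on]
print(on_call)  # nobody is left on call: the invariant is violated
```

Both transactions pass their own check and both commit, because neither wrote a row the other wrote. Preventing this requires tracking what each transaction read, which is what a serializable level adds on top of the snapshot.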
|
|
25
|
+
|
|
26
|
+
The implementation matters as much as the level. A team running on Postgres at "serializable" is using SSI, with commit-time aborts that require retry-loop logic; a team running on MySQL InnoDB at "repeatable read" gets gap locks that prevent phantoms, which the standard does not require at that level. Knowing the specific database's implementation of the chosen level is operational hygiene.
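The retry-loop logic mentioned above might look roughly like the following Python sketch. `SerializationFailure` is a hypothetical stand-in for whatever error the database driver raises on a serialization failure (Postgres reports these with SQLSTATE 40001); `run_with_retry` and `flaky_transfer` are illustrative names, not part of any library.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver's serialization-failure error."""


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn_fn, retrying the whole transaction on serialization failure.

    txn_fn must be safe to re-run from the start: it should begin a fresh
    transaction each time and have no side effects outside the database.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Exponential backoff with random jitter to spread out retries.
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())


# Simulated transaction that is aborted twice before it commits.
attempts = {"count": 0}


def flaky_transfer():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SerializationFailure("simulated commit-time abort")
    return "committed"


result = run_with_retry(flaky_transfer, sleep=lambda _seconds: None)
print(result, attempts["count"])
```

The important design point is that the retry wraps the entire transaction, not the failing statement: after an abort, every read the transaction made is suspect and has to be repeated.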
|
|
27
|
+
|
|
28
|
+
## The Anomaly Catalog
|
|
29
|
+
|
|
30
|
+
| Anomaly | What it is | Eliminated by |
|
|
31
|
+
|---|---|---|
|
|
32
|
+
| Dirty read | Reading uncommitted data from another transaction | Read committed and above |
|
|
33
|
+
| Non-repeatable read | Same row, different values within transaction | Repeatable read and above |
|
|
34
|
+
| Phantom read | Same range query, different result set | Serializable (and some RR implementations) |
|
|
35
|
+
| Lost update | Read-modify-write race between two transactions | Serializable; snapshot-based levels that abort the second writer (e.g. Postgres repeatable read); mitigated by SELECT FOR UPDATE |
|
|
36
|
+
| Write skew | Two transactions act on a snapshot; jointly violate an invariant | Serializable / SSI |
|
|
37
|
+
| Read-only transaction anomaly | Read-only transaction produces inconsistent output | SSI |
|
|
38
|
+
|
|
39
|
+
The discipline is reading this table for the workload at hand: which of these anomalies, if they occurred, would produce a correctness bug? Pick the lowest level that prevents those.
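Reading the table for a workload can be sketched as a small lookup in Python. This is a deliberately conservative model that uses only the four standard level names and assumes each anomaly is guaranteed gone only at the level shown; `LEVELS`, `PREVENTED_FROM`, and `lowest_level` are illustrative names, and a specific database may prevent an anomaly at a lower level than this mapping says.

```python
# Standard level names, weakest to strongest.
LEVELS = ["read uncommitted", "read committed", "repeatable read", "serializable"]

# Conservative assumption: the weakest standard level at which each anomaly
# is treated as prevented, regardless of database. Verify per database.
PREVENTED_FROM = {
    "dirty read": "read committed",
    "non-repeatable read": "repeatable read",
    "phantom read": "serializable",
    "lost update": "serializable",
    "write skew": "serializable",
    "read-only transaction anomaly": "serializable",
}


def lowest_level(intolerable):
    """Return the weakest level that prevents every named anomaly."""
    unknown = set(intolerable) - set(PREVENTED_FROM)
    if unknown:
        raise ValueError("unknown anomalies: " + ", ".join(sorted(unknown)))
    needed = [LEVELS.index(PREVENTED_FROM[name]) for name in intolerable]
    # With no intolerable anomalies, the weakest level is sufficient.
    return LEVELS[max(needed, default=0)]


print(lowest_level({"dirty read", "non-repeatable read"}))
print(lowest_level({"dirty read", "write skew"}))
```

The value of writing it this way is that the input is the list of anomalies the workload cannot tolerate, so the level becomes an output of the analysis rather than a starting assumption.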
|
|
40
|
+
|
|
41
|
+
## Level vs Implementation Matrix
|
|
42
|
+
|
|
43
|
+
| Level (standard) | Postgres | MySQL InnoDB | SQL Server | Oracle |
|
|
44
|
+
|---|---|---|---|---|
|
|
45
|
+
| Read Uncommitted | Same as read committed | Available; dirty reads | Available | Same as read committed |
|
|
46
|
+
| Read Committed | Default; MVCC | Available | Default | Default; MVCC |
|
|
47
|
+
| Repeatable Read | MVCC snapshot; phantoms not observed, write skew possible | Default; gap locks prevent phantoms | Available; lock-based | Same as serializable |
|
|
48
|
+
| Snapshot Isolation | Not directly named (RR is SI-like) | n/a | SNAPSHOT level (opt-in); RCSI is the statement-level variant | Offered under the name SERIALIZABLE |
|
|
49
|
+
| Serializable | SSI (since 9.1) | Lock-based | Lock-based with range locks | Snapshot isolation; write skew possible |
|
|
50
|
+
|
|
51
|
+
Naming is not consistent; reading the database's documentation is required.
|
|
52
|
+
|
|
53
|
+
## The Choice Procedure
|
|
54
|
+
|
|
55
|
+
1. **Enumerate the workload's anomaly vulnerabilities.** For each transaction class: which anomalies would produce a correctness bug if they occurred? (A balance-decrement is vulnerable to lost updates. A doctor-on-call check is vulnerable to write skew. A read-only report is vulnerable to non-repeatable reads if cross-table consistency matters.)
|
|
56
|
+
|
|
57
|
+
2. **Find the lowest level that prevents the named anomalies.** Walk the anomaly catalog upward. Stop at the first level that prevents every anomaly named in step 1.
|
|
58
|
+
|
|
59
|
+
3. **Verify the database's implementation actually prevents what the standard says it should.** Read the specific database's documentation; don't assume the standard's table is what your database does.
|
|
60
|
+
|
|
61
|
+
4. **Add explicit locking where the level alone is insufficient.** `SELECT FOR UPDATE`, advisory locks, and optimistic-concurrency tokens are tools for targeted correctness without raising the whole transaction's isolation level.
|
|
62
|
+
|
|
63
|
+
5. **Handle the retry-required failure modes.** SSI can abort transactions at commit; the application must retry. Repeatable read can throw serialization errors; the application must handle them. Higher isolation introduces new failure modes, not zero failure modes.
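As an illustration of step 4, the following Python sketch shows an optimistic-concurrency token guarding a balance decrement against lost updates. It is an in-memory model under stated assumptions: `Accounts`, `StaleVersion`, and `decrement` are hypothetical names, and `update_if_unchanged` stands in for a conditional SQL update of the form `UPDATE ... SET balance = ?, version = version + 1 WHERE id = ? AND version = ?`.

```python
class StaleVersion(Exception):
    """Raised when the row changed since it was read."""


class Accounts:
    def __init__(self):
        self.rows = {}  # account id -> (balance, version)

    def insert(self, account_id, balance):
        self.rows[account_id] = (balance, 0)

    def read(self, account_id):
        return self.rows[account_id]

    def update_if_unchanged(self, account_id, new_balance, expected_version):
        # Models a conditional UPDATE that matches zero rows when the
        # version token no longer equals the one the caller read.
        _, current_version = self.rows[account_id]
        if current_version != expected_version:
            raise StaleVersion(account_id)
        self.rows[account_id] = (new_balance, current_version + 1)


def decrement(accounts, account_id, amount, max_attempts=3):
    # Re-read and retry when another writer got in first.
    for _ in range(max_attempts):
        balance, version = accounts.read(account_id)
        try:
            accounts.update_if_unchanged(account_id, balance - amount, version)
            return
        except StaleVersion:
            continue
    raise StaleVersion(account_id)


accounts = Accounts()
accounts.insert("acct-1", 100)

# Two writers read the same row version before either writes.
balance_a, version_a = accounts.read("acct-1")
balance_b, version_b = accounts.read("acct-1")

accounts.update_if_unchanged("acct-1", balance_a - 30, version_a)
try:
    accounts.update_if_unchanged("acct-1", balance_b - 20, version_b)
    second_write = "applied"
except StaleVersion:
    second_write = "rejected"
    decrement(accounts, "acct-1", 20)  # retry against the fresh row

print(second_write, accounts.read("acct-1"))
```

Without the version check, the second writer would have stored 80 and silently erased the first decrement. With it, the stale write is rejected and the retry produces a balance of 50, all without raising the transaction's isolation level.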
|
|
64
|
+
|
|
65
|
+
## Verification
|
|
66
|
+
|
|
67
|
+
After applying this skill, verify:
|
|
68
|
+
- [ ] The team can name the database's default isolation level and the level's implementation mechanism (MVCC, SSI, locking). The default is verified, not assumed.
|
|
69
|
+
- [ ] For each transaction class, the team has enumerated the anomaly vulnerabilities and chosen an isolation level that prevents them. Levels are not picked by reflex.
|
|
70
|
+
- [ ] Explicit locking (SELECT FOR UPDATE, advisory locks) is used where targeted correctness is needed without upgrading the whole transaction.
|
|
71
|
+
- [ ] Application code that runs at SSI or repeatable-read with serialization errors has retry-on-conflict logic. Higher isolation's failure modes are handled, not ignored.
|
|
72
|
+
- [ ] Write skew vulnerability is recognized for transactions that read X and write Y based on it under SI; either the level is upgraded to serializable or explicit locking guards the read set.
|
|
73
|
+
- [ ] Read-only transactions that join multiple tables run at at least repeatable read or snapshot isolation when cross-table consistency matters.
|
|
74
|
+
- [ ] The specific database's documentation has been consulted, not the SQL standard's level names. The team knows the specific anomaly set the level actually prevents.
|
|
75
|
+
- [ ] Isolation-level changes in production are treated as behavior changes, not config changes. New failure modes are handled before the change rolls.

## Do NOT Use When

| Instead of this skill | Use | Why |
|---|---|---|
| Explaining the broader ACID frame | `acid-fundamentals` | acid-fundamentals owns the four-property frame; this owns the I axis |
| Reasoning about replica agreement across nodes | `cap-theorem-tradeoffs` | CAP is the distributed-systems frame; this is the single-cluster transactional frame |
| Tuning a slow query's performance | `query-optimization` | query-optimization owns performance; this owns concurrency correctness |
| Choosing an index for a query | `indexing-strategy` | indexing owns retrieval performance; this owns concurrent execution |
| Designing schema and access patterns | `data-modeling` | data-modeling owns design; this owns the concurrency contract |
| Coordinating across multiple transactions or services | sagas / distributed locks (out of this skill's scope) | cross-transaction coordination is a different problem |

## Key Sources

- Berenson, H., Bernstein, P., Gray, J., Melton, J., O'Neil, E., & O'Neil, P. (1995). ["A Critique of ANSI SQL Isolation Levels"](https://dl.acm.org/doi/10.1145/568271.223785). *SIGMOD 1995*. Foundational paper formalizing the anomalies and showing the SQL standard's looseness; required reading for serious treatment of isolation.
- Adya, A. (1999). ["Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions"](http://pmg.csail.mit.edu/papers/adya-phd.pdf). PhD thesis, MIT. Extends Berenson et al. with a more rigorous framework; basis for modern isolation reasoning.
- Cahill, M. J., Röhm, U., & Fekete, A. D. (2008). ["Serializable Isolation for Snapshot Databases"](https://dl.acm.org/doi/10.1145/1376616.1376690). *SIGMOD 2008*. The paper that introduced Serializable Snapshot Isolation (SSI); the basis of Postgres's serializable mode since 9.1.
- Fekete, A., Liarokapis, D., O'Neil, E., O'Neil, P., & Shasha, D. (2005). ["Making Snapshot Isolation Serializable"](https://dl.acm.org/doi/10.1145/1071610.1071615). *ACM TODS*, 30(2). Precursor to the SSI paper; characterization of snapshot isolation's anomaly set.
- PostgreSQL Global Development Group. ["PostgreSQL Documentation: Transaction Isolation"](https://www.postgresql.org/docs/current/transaction-iso.html). Canonical reference for Postgres's specific isolation implementation; covers SSI behavior and the abort-and-retry contract.
- Kleppmann, M. (2017). *Designing Data-Intensive Applications*. O'Reilly. Chapter 7 (Transactions): modern practitioner treatment of all isolation levels, anomalies, and the implementation strategies.
- Gray, J., & Reuter, A. (1992). *Transaction Processing: Concepts and Techniques*. Morgan Kaufmann. The canonical textbook; deep treatment of locking and concurrency control.
- Bernstein, P. A., & Goodman, N. (1981). ["Concurrency Control in Distributed Database Systems"](https://dl.acm.org/doi/10.1145/356842.356846). *ACM Computing Surveys*, 13(2). Foundational survey of concurrency-control techniques.
- MySQL Reference Manual. ["Transaction Isolation Levels"](https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html). Reference for MySQL InnoDB's specific implementation; documents the gap-lock prevention of phantoms at RR.
@@ -0,0 +1,177 @@
---
name: type-safety
description: "Use when reasoning about types as a quality property of code: what guarantees the type system actually provides, the difference between sound and unsound systems, structural vs nominal typing, type narrowing and exhaustiveness, the runtime/compile-time boundary, and where validation must happen because the type system cannot. Covers TypeScript, Flow, Hindley-Milner languages, and gradual typing in general. Do NOT use for runtime input validation library choice (use api-design for API surface validation; use individual library docs for library mechanics), for SQL type mapping (use data-modeling), or for type system implementation (compilers — out of scope)."
license: MIT
allowed-tools: Read Grep
metadata:
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"quality\",\"domain\":\"quality/types\",\"scope\":\"reference\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-15\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-15\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"comprehension_state\":\"present\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"type safety\\\\\\\",\\\\\\\"TypeScript\\\\\\\",\\\\\\\"sound type system\\\\\\\",\\\\\\\"unsound type system\\\\\\\",\\\\\\\"structural typing\\\\\\\",\\\\\\\"nominal typing\\\\\\\",\\\\\\\"type narrowing\\\\\\\",\\\\\\\"exhaustiveness check\\\\\\\",\\\\\\\"gradual typing\\\\\\\",\\\\\\\"runtime boundary\\\\\\\"]\",\"triggers\":\"[\\\\\\\"is this type-safe\\\\\\\",\\\\\\\"should this be `any` or `unknown`\\\\\\\",\\\\\\\"exhaustiveness check\\\\\\\",\\\\\\\"narrowing\\\\\\\",\\\\\\\"where does validation belong\\\\\\\"]\",\"examples\":\"[\\\\\\\"review whether this discriminated union has an exhaustiveness check at the switch\\\\\\\",\\\\\\\"decide whether to use `any` or `unknown` for this third-party JSON payload\\\\\\\",\\\\\\\"explain why TypeScript's `as` cast doesn't actually validate at runtime\\\\\\\",\\\\\\\"design where Zod (or any validator) parses at the application boundary\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"implement HMAC verification for an inbound webhook (use webhook-integration)\\\\\\\",\\\\\\\"design the JSON shape of an API endpoint (use api-design)\\\\\\\",\\\\\\\"choose between Postgres column types (use data-modeling)\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"api-design\\\\\\\",\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"code-review\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"api-design\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"api-design owns the external request/response surface; type-safety owns the discipline of expressing 
internal program correctness as types.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"testing-strategy owns the runtime verification of behavior; type-safety owns the compile-time verification of structure. They cover different failure modes.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"data-modeling\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"data-modeling owns persistence and entity shape; type-safety owns the in-memory type contracts that consume that shape.\\\\\\\"}],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"code-review\\\\\\\"]}\",\"mental_model\":\"|\",\"purpose\":\"|\",\"boundary\":\"|\",\"analogy\":\"Type safety is to programs what a passport check is to international travel — the document (type annotation) certifies identity within the issuing country's records, but on the way through customs (the I/O boundary), the document is re-verified against the actual traveler, and any mismatch is rejected before they enter the trusted zone.\",\"misconception\":\"|\",\"concept\":\"{\\\\\\\"definition\\\\\\\":\\\\\\\"Type safety is the property of a program in which type errors — operations applied to values of the wrong kind — are detected before they cause incorrect behavior. A type system provides type safety to the extent that it formally rules out classes of errors at compile time. 
A sound type system rules out all errors of the kinds it tracks; an unsound system rules out some but allows others through escape hatches.\\\\\\\",\\\\\\\"mental_model\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"purpose\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"boundary\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"taxonomy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"analogy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"misconception\\\\\\\":\\\\\\\"|\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/type-safety/SKILL.md\"}"
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
skill_graph_protocol: Skill Metadata Protocol v4
skill_graph_project: Skill Graph
skill_graph_canonical_skill: skills/type-safety/SKILL.md
---

# Type Safety

## Coverage

The discipline of using a type system to rule out classes of runtime errors before they occur. Covers what soundness means and where TypeScript (and other gradual systems) is unsound, structural vs nominal typing, type narrowing and exhaustiveness checking, the runtime boundary problem, the difference between `any` and `unknown`, when to use type assertions (rarely) and when to validate (always at I/O boundaries), and the connection to runtime validation libraries.

## Philosophy

Types are claims about values; type-checking is proof-checking. A program that compiles is a program whose claims have been checked for internal consistency, but a program is more than its compiler. Values that arrive from outside the program (HTTP responses, environment variables, parsed JSON, untrusted user input) have no type until you parse them, no matter what type annotation sits next to them.

The discipline of type-safety is to take the compile-time guarantees seriously and to know exactly where they stop. A codebase that pretends `JSON.parse(x) as User` is safe has confused a syntactic claim with a semantic guarantee. A codebase that validates at the boundary and trusts the type inside has correctly aligned the two layers.

In gradual systems like TypeScript, the discipline includes treating escape hatches (`any`, `as`, `// @ts-ignore`) as exceptional, justified, and rare, not as the default response to a type error.

## Soundness: What the System Actually Promises

| System | Soundness | Where it leaks |
|---|---|---|
| TypeScript | Unsound (by design) | `any`, `as` type assertions, method-parameter bivariance, ambient declarations, `Object.keys()` typed as `string[]`, indexed access (`arr[i]`, `record[key]`) typed as `T` rather than `T \| undefined` unless `noUncheckedIndexedAccess` is enabled, mutable arrays in covariant positions |
| Flow | Unsound | Similar escape hatches; less broadly adopted |
| Python + mypy | Unsound (gradual) | `Any`, `cast`, dynamic-only constructs |
| Java | Sound for types, unsound for null | NullPointerException; generics erased at runtime |
| C# | Mostly sound | Reflection, `dynamic` |
| Go | Sound for types, structural interfaces | Empty interface (`interface{}`) is the escape hatch; type assertions panic on failure |
| Rust | Sound (memory + types) | `unsafe` blocks are documented escape hatches |
| Haskell | Sound (within purity) | `unsafePerformIO`, FFI |

**Practical TypeScript stance:** Enable strict mode (`"strict": true`, which already includes `noImplicitAny` and `strictNullChecks`) and additionally enable `noUncheckedIndexedAccess`, which `strict` does not turn on. These flags close the most common leakage points. Without them, the system's guarantees are substantially weaker than developers assume.
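
To make the effect of `noUncheckedIndexedAccess` concrete, here is a small sketch. `firstOrDefault` is a hypothetical helper, not something from the TypeScript library; the point is only the type of `xs[0]` under each setting.

```typescript
// With `noUncheckedIndexedAccess` off, `xs[0]` is typed `number`,
// even though an empty array yields `undefined` at runtime.
// With the flag on, `xs[0]` is typed `number | undefined`,
// so the explicit check below is required before using it as a number.
function firstOrDefault(xs: readonly number[], fallback: number): number {
  const first = xs[0];
  return first === undefined ? fallback : first;
}
```

The function compiles under both settings, but only with the flag enabled does the compiler reject a version that returns `xs[0]` directly.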

## Narrowing

Narrowing is the type checker's mechanism for refining a broad type based on control-flow evidence. Use it instead of casts.

```typescript
function process(x: string | number | null) {
  if (x === null) return;            // narrows to string | number
  if (typeof x === 'string') {       // narrows to string
    return x.toUpperCase();
  }
  // here x is narrowed to number
  return x.toFixed(2);
}
```

| Narrowing technique | Use when |
|---|---|
| `typeof x === '...'` | Distinguishing primitives |
| `x instanceof Class` | Distinguishing class instances |
| `'field' in x` | Distinguishing object shapes by property presence |
| `x === literal` | Distinguishing literal types |
| Discriminated union via tag field | Designed-in narrowing for ADTs |
| User-defined type guards (`x is T`) | Custom predicates |
| `Array.isArray(x)` | Array vs non-array |
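
Two of the techniques in the table, sketched briefly. The names (`isStringArray`, `joinTags`, `Cat`, `Dog`, `speak`) are illustrative only.

```typescript
// User-defined type guard: the `x is string[]` return type tells the checker
// to narrow `x` wherever this function returned true.
function isStringArray(x: unknown): x is string[] {
  return Array.isArray(x) && x.every((item) => typeof item === 'string');
}

function joinTags(input: unknown): string {
  if (isStringArray(input)) return input.join(','); // input: string[]
  return '';
}

// `in` narrowing: distinguishes object shapes that have no shared tag field.
type Cat = { meow: () => string };
type Dog = { bark: () => string };

function speak(pet: Cat | Dog): string {
  return 'meow' in pet ? pet.meow() : pet.bark();
}
```

Note that the compiler trusts the body of a type guard without verifying it; a predicate that returns `true` for the wrong values reintroduces the unsoundness that narrowing was meant to avoid.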

Discriminated unions are the strongest pattern:

```typescript
// Placeholder so the snippet stands alone; substitute the application's real type.
type User = { id: string };

type Result =
  | { ok: true; value: User }
  | { ok: false; error: string };

function handle(r: Result) {
  if (r.ok) return r.value;  // narrows; `r.error` is not accessible here
  return r.error;            // narrowed to the error branch
}
```

## Exhaustiveness Checking

Force the compiler to verify that all cases of a union are handled. The pattern uses an unreachable `never` branch.

```typescript
type Method = 'GET' | 'POST' | 'PUT' | 'DELETE';

function describe(m: Method): string {
  switch (m) {
    case 'GET': return 'safe';
    case 'POST': return 'mutation';
    case 'PUT': return 'idempotent replacement';
    case 'DELETE': return 'idempotent removal';
    default: {
      const _exhaustive: never = m;  // compile error if a case is missing
      throw new Error(`unhandled: ${_exhaustive}`);
    }
  }
}
```

When a new variant is added to `Method`, the `never` assignment fails to type-check in every `switch` that does not yet handle it, so the compiler points at each site with a missing branch. This converts a runtime "unhandled case" bug into a compile-time error.
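
The same check is often factored into a reusable helper so each `switch` needs only one line. This is a common community idiom rather than a built-in; `assertNever`, `Shape`, and `area` are illustrative names.

```typescript
// Accepts only `never`, so a call compiles only when every variant has been
// handled. At runtime it still throws, which covers values that bypassed the
// type system (for example, unvalidated input).
function assertNever(value: never, context = 'value'): never {
  throw new Error(`unhandled ${context}: ${JSON.stringify(value)}`);
}

type Shape =
  | { kind: 'circle'; radius: number }
  | { kind: 'rect'; width: number; height: number };

function area(shape: Shape): number {
  switch (shape.kind) {
    case 'circle': return Math.PI * shape.radius ** 2;
    case 'rect': return shape.width * shape.height;
    default: return assertNever(shape, 'shape');
  }
}
```

Adding a third variant to `Shape` makes the `assertNever(shape, ...)` call fail to compile until `area` handles it.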

## The Runtime Boundary

Type information stops at the runtime boundary. Values crossing in must be parsed; values crossing out are serialized.

| Boundary | Risk | Discipline |
|---|---|---|
| `JSON.parse(networkResponse)` | Returns `any` (or `unknown` with stricter typing); no validation | Parse with a schema (Zod, io-ts) before trusting the type |
| `process.env.X` | All env vars are `string \| undefined`, but TypeScript may type them as `string` via globals | Validate at startup with a typed env config object |
| `localStorage.getItem(k)` | Returns `string \| null`, but the contents are untyped | Parse + validate before use |
| `fetch(url).then(r => r.json())` | The promise resolves with `any` | Validate against an expected schema |
| Database driver results | Library-typed; trust depends on the library's contract | Verify the library actually checks types at the boundary |
| `Function(string)` / `eval` | Arbitrary code; arbitrary types | Avoid; if necessary, type the result as `unknown` |

The pattern: **validate at the boundary; trust the type inside.**

```typescript
import { z } from 'zod';

const UserSchema = z.object({
  id: z.string(),
  email: z.string().email(),
  createdAt: z.coerce.date(),
});

type User = z.infer<typeof UserSchema>;

async function fetchUser(id: string): Promise<User> {
  const res = await fetch(`/api/users/${id}`);
  if (!res.ok) throw new Error(`request failed: ${res.status}`);
  const raw: unknown = await res.json();  // treat the payload as untyped
  return UserSchema.parse(raw);           // throws on validation failure
}
// Inside the rest of the program, User is trusted.
```
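
The `process.env.X` row can be handled the same way without a library. The sketch below is one possible shape, with hypothetical variable names (`DATABASE_URL`, `PORT`) and a hypothetical `AppConfig` type; a schema validator would serve equally well.

```typescript
interface AppConfig {
  databaseUrl: string;
  port: number;
}

// `env` is passed in (for example `process.env`) so the parser can be tested
// without touching the real environment.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const databaseUrl = env['DATABASE_URL'];
  if (databaseUrl === undefined || databaseUrl === '') {
    throw new Error('DATABASE_URL is required');
  }

  // Default of 3000 is an arbitrary choice for this sketch.
  const port = Number(env['PORT'] ?? '3000');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer in 1..65535, got ${env['PORT']}`);
  }

  return { databaseUrl, port };
}
```

Calling `loadConfig` once at startup means a misconfigured deployment fails immediately, and the rest of the program reads a typed `AppConfig` instead of raw strings.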

## `any` vs `unknown` vs `never`

| Type | Set of values | What you can do with it |
|---|---|---|
| `any` | All values | Anything (escape hatch; no checking) |
| `unknown` | All values | Nothing until you narrow (safe escape hatch) |
| `never` | No values | Nothing (used for exhaustiveness checks and unreachable code) |
| `void` | Returned, not consumed | Function return only; the value is "no value worth using" |

Rule: prefer `unknown` over `any` always. The cost is one narrowing step; the benefit is type-safety preserved.
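
A short side-by-side sketch of the rule. Both function names are made up for illustration.

```typescript
// `any` switches checking off: this compiles for every argument,
// including ones that have no `length` at all.
function lengthViaAny(value: any): number {
  return value.length;
}

// `unknown` forces a narrowing step before the value can be used.
function lengthViaUnknown(value: unknown): number {
  // return value.length;  // would not compile: 'value' is of type 'unknown'
  if (typeof value === 'string' || Array.isArray(value)) return value.length;
  return 0;
}
```

`lengthViaAny(42)` type-checks as `number` but evaluates to `undefined` at runtime, which is exactly the kind of mismatch the `unknown` version rules out.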

## Verification

After applying this skill, verify:
- [ ] TypeScript strict mode is enabled (`"strict": true` in tsconfig).
- [ ] `noUncheckedIndexedAccess` is enabled; array/object access produces `T | undefined`.
- [ ] No `any` appears in committed code without an inline comment explaining why `unknown` is insufficient.
- [ ] No `as Type` cast appears without an inline comment explaining why narrowing is insufficient.
- [ ] Every I/O boundary parses with a runtime validator (Zod, io-ts, valibot, etc.) before the value is treated as typed.
- [ ] Discriminated unions have an exhaustiveness check at every consumer site.
- [ ] Public API boundaries (exported functions, route handlers, library entry points) have explicit return types, not just inferred ones.
- [ ] `// @ts-ignore` and `// @ts-expect-error` are accompanied by a justification and a tracking comment.

## Do NOT Use When

| Instead of this skill | Use | Why |
|---|---|---|
| Designing the JSON shape of an API endpoint | `api-design` | api-design owns the external surface contract; this skill owns internal type discipline |
| Verifying behavior at runtime with tests | `testing-strategy` | testing-strategy owns runtime verification; this skill owns compile-time |
| Designing database schema and column types | `data-modeling` | data-modeling owns persistence shape; this skill owns in-memory type contracts |
| Choosing between Zod / io-ts / valibot | individual library docs + `api-design` | Library choice is a tactical decision below this skill |
| Implementing the compiler / type-checker | language compiler implementation references | Out of scope; this skill is about *using* type systems, not building them |

## Key Sources

- Pierce, B. C. (2002). *Types and Programming Languages*. MIT Press. The canonical textbook. Chapter 5 covers the untyped lambda calculus, Chapter 8 introduces the safety = progress + preservation framework on typed arithmetic expressions, and Chapter 9 covers the simply typed lambda calculus; together they underpin the soundness arguments used for type systems generally.
- Siek, J. G., & Taha, W. (2006). "Gradual Typing for Functional Languages." *Scheme and Functional Programming 2006*. The original gradual typing paper.
- Siek, J. G., & Vachharajani, M. (2008). "Gradual typing with unification-based inference." *Proceedings of the 2008 symposium on Dynamic languages*. Formalizes the soundness vs ergonomics trade-off in gradual systems.
- Microsoft. [TypeScript Handbook](https://www.typescriptlang.org/docs/handbook/intro.html). The reference for TypeScript's type system, including the documented unsoundness in *Type Compatibility* and *Narrowing* chapters.
- Anders Hejlsberg et al. [TypeScript Design Goals](https://github.com/microsoft/TypeScript/wiki/TypeScript-Design-Goals). Explicit acknowledgement that TypeScript trades soundness for ergonomics: "non-goals: apply a sound or 'provably correct' type system."
- Curry, H. B., & Feys, R. (1958). *Combinatory Logic, Volume I*. Early source for the Curry-Howard correspondence: types as propositions, programs as proofs.
@@ -0,0 +1,43 @@
---
name: typography-system
description: "Use when designing a typography system — typeface selection and pairing, modular type scale, vertical rhythm, line-height and measure rules, and web font delivery (subsetting, font-display, variable fonts). Do NOT use for body copy writing, single-headline font pairing, or non-text design tokens."
license: CC-BY-4.0
metadata:
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"design\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-12\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-12\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"type scale\\\\\\\",\\\\\\\"typeface pairing\\\\\\\",\\\\\\\"vertical rhythm\\\\\\\",\\\\\\\"line height\\\\\\\",\\\\\\\"measure line length\\\\\\\",\\\\\\\"web font delivery\\\\\\\",\\\\\\\"variable fonts\\\\\\\",\\\\\\\"font-display swap\\\\\\\",\\\\\\\"font subsetting\\\\\\\",\\\\\\\"modular scale\\\\\\\",\\\\\\\"typographic tokens\\\\\\\",\\\\\\\"opentype features\\\\\\\"]\",\"triggers\":\"[\\\\\\\"typography system\\\\\\\",\\\\\\\"type scale\\\\\\\",\\\\\\\"font pairing\\\\\\\",\\\\\\\"variable fonts\\\\\\\",\\\\\\\"vertical rhythm\\\\\\\"]\",\"examples\":\"[\\\\\\\"Build a type scale with seven steps using a 1.25 ratio and assign each step to a semantic token (display, h1, body, caption)\\\\\\\",\\\\\\\"Pair a serif display face with a sans-serif text face and document when to use each\\\\\\\",\\\\\\\"Self-host a variable font with WOFF2 subsetting and font-display: swap\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"Write the headline copy for the landing page\\\\\\\",\\\\\\\"Pick the brand's primary color\\\\\\\",\\\\\\\"Decide where the headline component lives in the folder structure\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"visual-hierarchy\\\\\\\",\\\\\\\"visual-design-foundations\\\\\\\",\\\\\\\"theme-system-design\\\\\\\",\\\\\\\"layout-composition\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"visual-hierarchy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"visual-hierarchy decides how to deploy type as an ordering signal on a given surface; this skill defines the scale and faces that get 
deployed.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"theme-system-design\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"theme-system-design covers how typography tokens are layered and switched; this skill produces the typographic decisions inside them.\\\\\\\"}]}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/typography-system/SKILL.md\"}"
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
skill_graph_protocol: Skill Metadata Protocol v4
skill_graph_project: Skill Graph
skill_graph_canonical_skill: skills/typography-system/SKILL.md
---

# Typography System

## Coverage
A typography system has four components: a small set of typefaces (often one for display, one for text, optionally one monospaced), a modular scale of sizes, a set of weight and style variants per face, and rules for line-height, letter-spacing, and measure (characters per line). Each component is encoded as design tokens (font-family-text, font-size-100 through font-size-900, line-height-tight/normal/loose, letter-spacing-tight/normal) and consumed by components through semantic tokens (display, heading-1, body, caption).

Type scales are usually built from a single ratio applied iteratively to a base size. Common ratios: 1.125 (major second, subtle, content-dense UIs), 1.2 (minor third), 1.25 (major third, common default), 1.333 (perfect fourth), 1.414 (augmented fourth), 1.5 (perfect fifth, loud), 1.618 (golden ratio, very loud). Most production systems use 5–9 steps; more steps dilute the visual distinction between adjacent levels.
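
The "single ratio applied iteratively" rule is size = base × ratio^step. A minimal sketch follows; `modularScale` is a hypothetical helper, and rounding to two decimals is an arbitrary choice for readability.

```typescript
// Generates a modular type scale: size(step) = basePx * ratio ** step,
// for steps from -stepsBelow to +stepsAbove, rounded to two decimals.
function modularScale(
  basePx: number,
  ratio: number,
  stepsBelow: number,
  stepsAbove: number,
): number[] {
  const sizes: number[] = [];
  for (let step = -stepsBelow; step <= stepsAbove; step++) {
    sizes.push(Math.round(basePx * ratio ** step * 100) / 100);
  }
  return sizes;
}

// 16px base, major-third ratio, one step below and four above:
// modularScale(16, 1.25, 1, 4) → [12.8, 16, 20, 25, 31.25, 39.06]
```

Each resulting size would then be assigned to a semantic token (for example caption, body, heading levels, display) rather than used as a raw number in components.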

Line-height and measure are coupled. Longer measures need taller line-heights to keep the eye from skipping lines; shorter measures need tighter line-heights to avoid feeling sparse. The widely-cited target is 45–75 characters per line for body text, with line-height between 1.4 and 1.6 for body and 1.1–1.3 for display. Letter-spacing (CSS `letter-spacing`, also called tracking) generally tightens at large sizes (around -0.02em or tighter at display sizes) and stays neutral at body sizes; uppercase text benefits from positive tracking (+0.05em or more) for legibility.

Variable fonts (OpenType Font Variations) deliver multiple weights, widths, and optical sizes in a single file via continuous axes (`wght` 100–900, `wdth`, `opsz`, etc.), exposed in CSS via `font-variation-settings` and `font-weight: <number>`. They reduce HTTP requests and enable weight as a continuous design decision.

Web font delivery best practices: serve WOFF2 (universally supported, ~30% smaller than WOFF); subset to Latin or the languages actually used (the Google Fonts CSS API does this automatically; for self-hosted fonts use `pyftsubset`, which ships with fontTools); set `font-display: swap` to render fallback text immediately and swap in the web font when loaded; preload the most-used font files with `<link rel="preload" as="font" crossorigin>`; and use `size-adjust`, `ascent-override`, `descent-override`, and `line-gap-override` on the fallback `@font-face` to match metrics and minimize cumulative layout shift.

## Philosophy
Restraint is the practice. One text face and one display face cover most product needs; a third is a deliberate choice that requires justification. Each additional typeface costs bandwidth, hierarchy clarity, and rendering consistency across operating systems.

Typography is the densest carrier of brand. A wordmark, a heading face, and a body face shape voice more than any color does. Treat the choice of faces with the same seriousness as the choice of brand color, and treat the system around them (scale, rhythm, measure) as the structure that makes the choices work across surfaces.

## Verification
- The system uses at most three typefaces (display, text, monospace); each is justified by a use that the others cannot serve.
- Type scale steps are derived from a single ratio applied to a base size; sizes are not picked individually.
- Body text measure falls within 45–75 characters on the most common viewport widths; verified by inspecting actual rendered lines, not assumed.
- Web fonts are served as WOFF2, subset to required glyphs, with font-display: swap and metrics-matched fallback @font-face to minimize layout shift.
- Cumulative Layout Shift attributable to web font loading is below 0.1 on a representative page.
- Variable fonts (where used) load a single file and access weight/width through font-variation-settings or numeric font-weight, not separate files per weight.
- Headings and body share a consistent vertical rhythm; line-heights and margins snap to a baseline grid (typically a 4px or 8px subgrid).

## Do NOT Use When
- The task is writing copy. Typography systems shape how copy reads; they do not produce copy.
- The decision is a one-off pairing for a single graphic or asset with no system implications.
- The work is color, spacing, or motion tokens. Use color-system-design or visual-design-foundations.
- The concern is how typography tokens reach components and switch between themes. Use theme-system-design.
- You are debugging a single rendering issue (font kerning, ligature behavior in a specific browser) without a system change. That is browser-specific debugging.
@@ -0,0 +1,43 @@
---
name: usability-testing
description: "Use when observing real users attempting tasks on a prototype or live product to surface usability issues — moderated or unmoderated, think-aloud protocol, task scenarios, severity rating, sample sizing per Nielsen's heuristics. Do NOT use for automated test suites, code coverage analysis, CI pipelines, unit/integration testing, or any engineering verification — those are testing-strategy concerns, not human-behavior observation."
license: CC-BY-4.0
metadata:
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"design\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-12\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-12\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"think aloud protocol\\\\\\\",\\\\\\\"task scenario\\\\\\\",\\\\\\\"moderated usability test\\\\\\\",\\\\\\\"unmoderated test\\\\\\\",\\\\\\\"severity rating\\\\\\\",\\\\\\\"five user rule\\\\\\\",\\\\\\\"formative testing\\\\\\\",\\\\\\\"summative testing\\\\\\\",\\\\\\\"hallway test\\\\\\\",\\\\\\\"moderator neutrality\\\\\\\",\\\\\\\"usability heuristics\\\\\\\",\\\\\\\"SUS score\\\\\\\",\\\\\\\"task success rate\\\\\\\",\\\\\\\"critical incident\\\\\\\"]\",\"triggers\":\"[\\\\\\\"usability test\\\\\\\",\\\\\\\"think aloud\\\\\\\",\\\\\\\"test this prototype\\\\\\\",\\\\\\\"task scenarios\\\\\\\",\\\\\\\"test with users\\\\\\\"]\",\"examples\":\"[\\\\\\\"Write three task scenarios for a usability test of this onboarding flow.\\\\\\\",\\\\\\\"How many participants do I need for a formative round on this prototype?\\\\\\\",\\\\\\\"Review my moderator script for neutrality and leading prompts.\\\\\\\",\\\\\\\"Rate the severity of these eight usability findings using Nielsen's scale.\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"Add unit tests for the order-total calculation function.\\\\\\\",\\\\\\\"Set up the CI pipeline for the new repo.\\\\\\\",\\\\\\\"Run a load test against the checkout API.\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"prototyping\\\\\\\",\\\\\\\"user-research\\\\\\\",\\\\\\\"research-synthesis\\\\\\\",\\\\\\\"design-thinking\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"testing-strategy is an engineering practice for automated test suites that verify 
code behavior against specifications. usability-testing is a research practice for observing humans interacting with artifacts. The shared word 'testing' is the only thing in common.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"a11y\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"a11y covers accessibility conformance criteria (WCAG, screen reader behavior, keyboard operability). usability-testing can include accessibility-focused sessions but its scope is broader and its method is empirical observation rather than spec conformance.\\\\\\\"}]}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/usability-testing/SKILL.md\"}"
|
|
7
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
8
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
9
|
+
skill_graph_project: Skill Graph
|
|
10
|
+
skill_graph_canonical_skill: skills/usability-testing/SKILL.md
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Usability Testing
|
|
14
|
+
|
|
15
|
+
## Coverage
|
|
16
|
+
Usability testing covers the evaluative research practice of watching people attempt realistic tasks on a prototype or product, then identifying the obstacles they encounter. The dominant method is the **think-aloud protocol** (Ericsson & Simon), where participants narrate their thoughts as they work, surfacing the mental model they are using and the points where it diverges from the design. Sessions are organized around **task scenarios** — short narratives that frame a goal without prescribing the steps ("you want to find out how much you owe in taxes this quarter") — and a **moderator** who maintains neutrality, resists answering questions, and prompts only with open-ended interventions like "what are you thinking now?" or "what did you expect to happen?".
|
|
17
|
+
|
|
18
|
+
The skill covers **sample sizing**. The widely cited Nielsen/Landauer "5-user rule" estimates that 5 users surface ~85% of major usability problems for a homogeneous user group on a discrete task, with steeply diminishing returns afterward. The rule has important limits: it applies per distinct user segment, per discrete task scope, and to **formative** (iterative diagnostic) testing. It does not apply to **summative** (benchmark) studies, which require much larger samples for valid statistical comparison. Misapplying the 5-user rule to summative claims is a common error.
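The arithmetic behind the 5-user rule can be sketched as follows. This is a minimal illustration, not part of the skill itself: it assumes the commonly quoted problem-discovery model, `1 - (1 - L)^n`, with an average per-participant detection rate `L` of about 0.31. The function names `proportion_found` and `users_needed` are hypothetical, and the detection rate for a real product may differ considerably.

```python
def proportion_found(n_users: int, detect_rate: float = 0.31) -> float:
    """Expected share of problems observed at least once in n_users sessions.

    detect_rate is an assumed average probability that a single participant
    exposes any given problem (0.31 is the commonly quoted average).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 < detect_rate < 1.0:
        raise ValueError("detect_rate must be strictly between 0 and 1")
    return 1.0 - (1.0 - detect_rate) ** n_users


def users_needed(target: float, detect_rate: float = 0.31) -> int:
    """Smallest number of participants whose expected coverage reaches target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 0
    while proportion_found(n, detect_rate) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Coverage grows quickly at first, then flattens (diminishing returns).
    for n in (1, 3, 5, 10, 15):
        print(f"{n:>2} users -> {proportion_found(n):.0%} of problems expected")
```

Under these assumptions the curve flattens quickly after the fifth participant, which is the diminishing-returns argument for running several small formative rounds rather than one large one. The model says nothing about the precision of a measured metric, so it does not justify small samples for summative claims.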
|
|
19
|
+
|
|
20
|
+
Findings are organized by **severity rating** (Nielsen's 0–4 scale: not a problem, cosmetic, minor, major, catastrophic) so the team can triage. **Task success rate**, **time on task**, and standardized instruments like **SUS** (System Usability Scale, Brooke 1996) provide quantitative complements when needed. The practice distinguishes **moderated** sessions (richer data, higher cost, requires scheduling) from **unmoderated** tools (lower cost, scales to dozens of sessions, sacrifices the moderator's ability to follow up on surprises).
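For readers unfamiliar with how a SUS score is derived, the sketch below applies the standard scoring rule for the 10-item questionnaire: odd-numbered items contribute `response - 1`, even-numbered items contribute `5 - response`, and the sum is multiplied by 2.5 to give a 0–100 score. The function name `sus_score` and the input format (a list of ten 1–5 responses from one participant) are assumptions made for illustration.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's SUS questionnaire on the 0-100 scale.

    responses holds the ten answers in questionnaire order, each an
    integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    return total * 2.5


if __name__ == "__main__":
    neutral = [3] * 10
    print(sus_score(neutral))
```

A study-level SUS figure is normally the mean of the per-participant scores. With only a handful of participants that mean is a rough indicator and not a benchmark.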
|
|
21
|
+
|
|
22
|
+
The skill also covers what NOT to do: asking leading questions, defending the design, explaining how the design "is supposed to work" when the participant gets stuck, and, during analysis, over-fitting interpretations to a single dramatic finding from one participant.
|
|
23
|
+
|
|
24
|
+
## Philosophy
|
|
25
|
+
Usability testing is built on a humbling claim: designers and engineers cannot reliably predict where users will struggle. The mental models that make a design feel obvious to its creators are exactly the models a fresh user lacks, and only direct observation closes that gap. The discipline rejects "I think users will understand this" in favor of "we watched users; here is what happened." Each session that confirms the design entirely is mildly suspicious — either the tasks were too easy or the moderator was unintentionally helping.
|
|
26
|
+
|
|
27
|
+
The practice is opinionated about moderator behavior. The moderator's job is to be uninteresting — to let the silence sit, to let the participant struggle long enough for the obstacle to become visible, to not rescue. This is hard because the social instinct is to help, and the design instinct is to defend. A moderator who explains the design after a participant gets stuck has destroyed the evidence; the obstacle the participant just encountered is the finding, and it cannot be re-observed in that session.
|
|
28
|
+
|
|
29
|
+
## Verification
|
|
30
|
+
- Tasks are written as goals, not as instructions: a participant who has never seen the design can understand what to accomplish. For example, "find out how much you owe", not "click the Tax Summary tab and then click View Details."
|
|
31
|
+
- The moderator script contains no leading prompts and no defensive explanations; the moderator's most common utterances are "what are you thinking?" and silence.
|
|
32
|
+
- Findings are rated by severity, not just listed; the team can identify the catastrophic issues distinctly from cosmetic ones.
|
|
33
|
+
- Sample size matches the claim type — 5 users for formative diagnostic findings is defensible; for summative or benchmark claims, sample size is justified separately.
|
|
34
|
+
- At least one finding contradicts a designer or PM expectation; if every finding confirms prior beliefs, the tasks were likely too constrained or the moderation too helpful.
|
|
35
|
+
- Recordings or detailed notes preserve specific participant behavior so synthesis works from observation, not from moderator impressions.
|
|
36
|
+
|
|
37
|
+
## Do NOT Use When
|
|
38
|
+
- The target is automated verification of code correctness — use **testing-strategy** for unit, integration, and end-to-end engineering tests.
|
|
39
|
+
- The goal is to discover what users need before any artifact exists — use **user-research** for generative interviews and contextual studies.
|
|
40
|
+
- The artifact has not yet been built or sketched — build a prototype first via **prototyping**, then test it.
|
|
41
|
+
- The question requires statistical significance across a large population (benchmarking, A/B comparison): usability testing surfaces issues, while statistical comparison needs larger-sample summative studies or controlled experiments.
|
|
42
|
+
- The evaluation is purely about accessibility conformance to a specification — use **a11y** for WCAG/ARIA conformance review; usability testing complements this with empirical observation of assistive-tech users but is not a conformance audit.
|
|
43
|
+
- The output should be themes from a corpus of completed sessions — move to **research-synthesis** for affinity mapping and insight extraction.
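The statistical-significance caveat in the list above can be made concrete with a confidence interval on task success rate. The sketch below uses the Wilson score interval, chosen here as one reasonable option for small samples; the skill does not prescribe a particular interval method. The function name `wilson_interval` and the example counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")

    p = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Same observed success rate (80%), very different certainty.
    for successes, n in ((4, 5), (32, 40)):
        low, high = wilson_interval(successes, n)
        print(f"{successes}/{n} succeeded: interval {low:.2f} to {high:.2f}")
```

With 4 of 5 participants succeeding, the interval spans more than half of the 0–1 range, so the observed 80% cannot support a benchmark claim. The same rate observed across 40 participants gives a much narrower interval.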
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: user-research
|
|
3
|
+
description: "Use when planning or conducting generative qualitative research with real users — interviews, contextual inquiry, ethnographic observation, diary studies — to learn what people do, think, and need in their own context. Do NOT use for analytics review, survey statistics, A/B test interpretation, or agent-side intent classification — those are different research practices entirely."
|
|
4
|
+
license: CC-BY-4.0
|
|
5
|
+
metadata:
|
|
6
|
+
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"design\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-12\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-12\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"user interviews\\\\\\\",\\\\\\\"contextual inquiry\\\\\\\",\\\\\\\"ethnographic observation\\\\\\\",\\\\\\\"diary study\\\\\\\",\\\\\\\"generative research\\\\\\\",\\\\\\\"qualitative research\\\\\\\",\\\\\\\"interview guide\\\\\\\",\\\\\\\"leading questions\\\\\\\",\\\\\\\"master-apprentice model\\\\\\\",\\\\\\\"in-context observation\\\\\\\",\\\\\\\"field study\\\\\\\",\\\\\\\"listening for needs\\\\\\\",\\\\\\\"five whys interview\\\\\\\"]\",\"triggers\":\"[\\\\\\\"interview users\\\\\\\",\\\\\\\"user research plan\\\\\\\",\\\\\\\"what to ask users\\\\\\\",\\\\\\\"contextual inquiry\\\\\\\",\\\\\\\"diary study\\\\\\\"]\",\"examples\":\"[\\\\\\\"Draft an interview guide for SMB founders adopting their first accounting software.\\\\\\\",\\\\\\\"How do I observe ICU nurses on shift without disturbing the workflow?\\\\\\\",\\\\\\\"Review my interview script for leading questions and solution-prompts.\\\\\\\",\\\\\\\"Plan a two-week diary study for commuters using public transit apps.\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"Analyze last quarter's NPS results and produce a dashboard.\\\\\\\",\\\\\\\"Classify whether this agent request from the user is high-risk before executing.\\\\\\\",\\\\\\\"Set up an A/B test of two onboarding flows.\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"problem-framing\\\\\\\",\\\\\\\"research-synthesis\\\\\\\",\\\\\\\"usability-testing\\\\\\\",\\\\\\\"design-thinking\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"intent-recognition\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"intent-recognition 
classifies an agent request's risk level at runtime from the agent's perspective. user-research investigates real human users' goals, contexts, and needs through fieldwork — these are entirely different practices that share only the word 'intent'.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"usability-testing\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"usability-testing is evaluative — it watches users attempt tasks on an artifact to find usability defects. user-research is generative — it studies users before any artifact exists, to discover needs and context.\\\\\\\"}]}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/user-research/SKILL.md\"}"
|
|
7
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
8
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
9
|
+
skill_graph_project: Skill Graph
|
|
10
|
+
skill_graph_canonical_skill: skills/user-research/SKILL.md
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# User Research
|
|
14
|
+
|
|
15
|
+
## Coverage
|
|
16
|
+
User research covers the generative qualitative methods that surface what people do, think, feel, and need — typically before a solution exists. The core methods are **semi-structured interviews** (Steve Portigal, Tomer Sharon), **contextual inquiry** with its master-apprentice stance (Beyer & Holtzblatt), **ethnographic observation** in the user's actual environment, **diary studies** for behaviors that unfold over days or weeks, and **intercept studies** for in-the-moment reactions. Each method trades off depth, naturalism, scale, and scheduling cost differently; choosing well depends on what kind of evidence the project needs.
|
|
17
|
+
|
|
18
|
+
The skill includes the craft of **interview construction**: opening with broad context questions, moving to specific recent episodes ("tell me about the last time you…"), avoiding hypotheticals ("would you use…") and leading prompts ("don't you find it frustrating that…"), and using silence as a tool. The **critical incident technique** (Flanagan) and **5 Whys** laddering are used in-session to push past surface answers. The practice also includes what NOT to do: solution-prompting, confirmation seeking, anchoring on the interviewer's own hypothesis, interrupting, and steering toward a preferred narrative.
|
|
19
|
+
|
|
20
|
+
Contextual methods extend interviews into the user's environment. **Contextual inquiry** treats the user as the master craftsperson and the researcher as an apprentice asking clarifying questions while the user works. **Fly-on-the-wall observation** removes the researcher's questions entirely. **Shadowing** follows a single user through their day. Each makes different trade-offs between naturalism (less intrusion → more authentic behavior) and depth (more questions → richer interpretation).
|
|
21
|
+
|
|
22
|
+
Diary and longitudinal methods cover behaviors that do not surface in a single session. Daily prompts, photo diaries, and experience sampling (Csíkszentmihályi) capture in-context moments and reduce recall bias.
|
|
23
|
+
|
|
24
|
+
## Philosophy
|
|
25
|
+
User research is harder than it looks because the natural conversational instincts that make humans good company — finishing each other's sentences, offering sympathy, confirming what the other person seems to want to hear — actively destroy data quality. The discipline trains interviewers to do the opposite: leave silence intact, ask the participant to "say more about that" instead of paraphrasing, and treat surprise as the signal that the conversation is producing new information.
|
|
26
|
+
|
|
27
|
+
The practice is grounded in a specific epistemological claim: people are unreliable narrators of their own behavior, especially when asked hypothetical or future-tense questions, but they are reasonably reliable when describing concrete recent episodes. This is why methods skew toward "tell me about the last time" over "would you ever" — episodic memory is more trustworthy than self-prediction. It is also why observation outranks interview when the project can afford it: what people do and what people say they do are routinely different.
|
|
28
|
+
|
|
29
|
+
## Verification
|
|
30
|
+
- The interview guide contains no leading, hypothetical, or solution-prompting questions; every question can be answered by describing a real past event.
|
|
31
|
+
- Sessions are recorded (with consent) so synthesis works from primary data, not interviewer memory.
|
|
32
|
+
- At least one finding from the research contradicts a hypothesis the team held going in — if every finding confirms prior beliefs, the questions were probably leading.
|
|
33
|
+
- For contextual studies, observation happened in the user's real environment, not a lab simulation of it.
|
|
34
|
+
- Sample composition is documented and matches the recruitment criteria — including who was excluded and why.
|
|
35
|
+
- The researcher can name what they don't yet know after the session, not just what they confirmed.
|
|
36
|
+
|
|
37
|
+
## Do NOT Use When
|
|
38
|
+
- The question is quantitative (how many, what percentage, statistical significance) — use survey or analytics methods, not generative interviews.
|
|
39
|
+
- A working artifact already exists and the question is "does this artifact work for users" — use **usability-testing**.
|
|
40
|
+
- The team needs to make sense of research that has already been collected — use **research-synthesis**.
|
|
41
|
+
- The "user" is an agent or system, not a human — interview methods do not transfer to non-humans.
|
|
42
|
+
- The team has not yet agreed on what problem they are studying — return to **problem-framing** first, then design research to investigate the framed problem.
|
|
43
|
+
- The need is to evaluate a feature against a hypothesis with a control group — use experimental methods (A/B, RCT), not interviews.
|