@skill-graph/cli 0.5.6
- package/CHANGELOG.md +247 -0
- package/LICENSE +200 -0
- package/NOTICE +62 -0
- package/README.md +398 -0
- package/SKILL_GRAPH.md +443 -0
- package/bin/skill-graph.js +374 -0
- package/docs/ADOPTION.md +117 -0
- package/docs/CONFORMANCE.md +66 -0
- package/docs/PRIMER.md +384 -0
- package/docs/QUICKSTART-30MIN.md +333 -0
- package/docs/ROUTING-METRICS.md +120 -0
- package/docs/SKILL-MD-FORMAT-COMPATIBILITY.md +127 -0
- package/docs/SKILL_AUDIT_CHECKLIST.md +199 -0
- package/docs/SKILL_AUDIT_LOOP.md +195 -0
- package/docs/SKILL_METADATA_PROTOCOL.md +609 -0
- package/docs/_archived/marketplace-publication-priority-2026-05-18.md +239 -0
- package/docs/adr/0001-predicate-set.md +69 -0
- package/docs/adr/0002-json-ld-context.md +82 -0
- package/docs/adr/0003-ontoclean-rigidity-tags.md +65 -0
- package/docs/adr/0004-persistent-identifiers.md +74 -0
- package/docs/adr/0005-freshness-consolidation.md +70 -0
- package/docs/adr/0006-revise-predicate-rename.md +105 -0
- package/docs/adr/0007-audit-loop-cadence.md +99 -0
- package/docs/adr/0008-skill-surface-split-and-curation-policy.md +93 -0
- package/docs/category-consumers.md +168 -0
- package/docs/concept-map.md +194 -0
- package/docs/diagrams/drift-states.mmd +21 -0
- package/docs/diagrams/manifest-pipeline.mmd +25 -0
- package/docs/diagrams/routing-harness.mmd +41 -0
- package/docs/diagrams/starter-graph.mmd +53 -0
- package/docs/field-decision-guide.md +315 -0
- package/docs/field-rationale.md +211 -0
- package/docs/field-reference.generated.md +624 -0
- package/docs/field-reference.md +1426 -0
- package/docs/glossary.md +190 -0
- package/docs/head-noun-glossary.md +63 -0
- package/docs/images/audit-phases.png +0 -0
- package/docs/images/drift-states.png +0 -0
- package/docs/images/graded-mode.png +0 -0
- package/docs/images/manifest-pipeline.png +0 -0
- package/docs/images/routing-harness.png +0 -0
- package/docs/images/skill-anatomy.png +0 -0
- package/docs/images/starter-graph.png +0 -0
- package/docs/images/system-model.png +0 -0
- package/docs/integrations/github-actions.md +155 -0
- package/docs/manifest-field-mapping.md +443 -0
- package/docs/marketplace-publication-queue.generated.md +240 -0
- package/docs/marketplace-release-agent-prompt.md +82 -0
- package/docs/marketplace-skill-candidate-list.md +272 -0
- package/docs/marketplace-syndication.md +222 -0
- package/docs/migration-sample-review.md +155 -0
- package/docs/migrations/v4-to-v5.md +168 -0
- package/docs/migrations/v5-to-v6.md +221 -0
- package/docs/name-exceptions.yaml +37 -0
- package/docs/plans/marketplace-p1-public-migration-plan.md +41 -0
- package/docs/plans/multi-root-workspace.md +148 -0
- package/docs/plans/scripts-roadmap.md +107 -0
- package/docs/plans/v4-schema-bump.md +160 -0
- package/docs/plans/wave-2-extraction.md +122 -0
- package/docs/positioning-vs-marketplaces.md +175 -0
- package/docs/proposals/skill-audit-loop-positioning.md +160 -0
- package/docs/quality-doctrine.md +138 -0
- package/docs/recommended-skills.md +150 -0
- package/docs/research/skill-comprehension-eval-research.md +1830 -0
- package/docs/research/skill-retrieval-evidence.md +66 -0
- package/docs/skill-metadata-protocol.md +471 -0
- package/docs/skills-sh-maintainer-cleanup-request.md +80 -0
- package/examples/audits/a11y/findings.md +52 -0
- package/examples/audits/a11y/scorecard.md +21 -0
- package/examples/audits/a11y/verdict.md +44 -0
- package/examples/audits/debugging/findings.md +59 -0
- package/examples/audits/debugging/scorecard.md +22 -0
- package/examples/audits/debugging/verdict.md +33 -0
- package/examples/audits/documentation/findings.md +59 -0
- package/examples/audits/documentation/scorecard.md +22 -0
- package/examples/audits/documentation/verdict.md +33 -0
- package/examples/evals/a11y.json +140 -0
- package/examples/evals/api-design.json +52 -0
- package/examples/evals/code-review.json +52 -0
- package/examples/evals/data-modeling.json +52 -0
- package/examples/evals/database-migration.json +52 -0
- package/examples/evals/debugging.json +118 -0
- package/examples/evals/dependency-architecture.json +52 -0
- package/examples/evals/design-system-architecture.json +52 -0
- package/examples/evals/error-tracking.json +52 -0
- package/examples/evals/event-contract-design.json +52 -0
- package/examples/evals/form-ux-architecture.json +52 -0
- package/examples/evals/framework-fit-analysis.json +52 -0
- package/examples/evals/graph-audit.json +139 -0
- package/examples/evals/information-architecture.json +52 -0
- package/examples/evals/interaction-feedback.json +52 -0
- package/examples/evals/interaction-patterns.json +52 -0
- package/examples/evals/layout-composition.json +52 -0
- package/examples/evals/lint-overlay.json +117 -0
- package/examples/evals/microcopy.json +52 -0
- package/examples/evals/observability-modeling.json +52 -0
- package/examples/evals/pattern-recognition.json +96 -0
- package/examples/evals/performance-engineering.json +52 -0
- package/examples/evals/refactor.json +128 -0
- package/examples/evals/semiotics.json +52 -0
- package/examples/evals/skill-infrastructure.json +96 -0
- package/examples/evals/skill-router.json +140 -0
- package/examples/evals/skill-router.routing.json +113 -0
- package/examples/evals/system-interface-contracts.json +52 -0
- package/examples/evals/task-analysis.json +52 -0
- package/examples/evals/testing-strategy.json +118 -0
- package/examples/evals/type-safety.json +249 -0
- package/examples/evals/visual-design-foundations.json +52 -0
- package/examples/evals/webhook-integration.json +52 -0
- package/examples/exports/a11y.skill-md.md +80 -0
- package/examples/exports/debugging.skill-md.md +80 -0
- package/examples/exports/refactor.skill-md.md +78 -0
- package/examples/exports/testing-strategy.skill-md.md +81 -0
- package/examples/projects/markdown-static-site/README.md +115 -0
- package/examples/projects/markdown-static-site/skills/content-source-router/SKILL.md +131 -0
- package/examples/projects/markdown-static-site/skills/image-optimization-pipeline-config/SKILL.md +132 -0
- package/examples/projects/markdown-static-site/skills/link-rot-detection/SKILL.md +103 -0
- package/examples/projects/markdown-static-site/skills/markdown-post-frontmatter-validation/SKILL.md +133 -0
- package/examples/projects/markdown-static-site/skills/migrate-posts-to-v2-frontmatter/SKILL.md +140 -0
- package/examples/projects/saas-stripe-postgres/README.md +208 -0
- package/examples/projects/saas-stripe-postgres/db/migrations/0004_canonicalize_orders.sql +37 -0
- package/examples/projects/saas-stripe-postgres/db/schema.sql +112 -0
- package/examples/projects/saas-stripe-postgres/skills/migrate-orders-to-canonical-schema/SKILL.md +149 -0
- package/examples/projects/saas-stripe-postgres/skills/nextjs-server-action-validation/SKILL.md +154 -0
- package/examples/projects/saas-stripe-postgres/skills/payment-provider-router/SKILL.md +153 -0
- package/examples/projects/saas-stripe-postgres/skills/postgres-rls-pattern/SKILL.md +163 -0
- package/examples/projects/saas-stripe-postgres/skills/stripe-webhook-signature-verification/SKILL.md +137 -0
- package/examples/protocol/skill-metadata-template.md +301 -0
- package/examples/protocol/skills.manifest.sample.json +13245 -0
- package/examples/skill-metadata-template.md +317 -0
- package/examples/skills.manifest.sample.json +13519 -0
- package/examples/tests/v3-1-skos-fixture/SKILL.md +93 -0
- package/marketplace/README.md +17 -0
- package/marketplace/skills/a11y/SKILL.md +66 -0
- package/marketplace/skills/acid-fundamentals/SKILL.md +106 -0
- package/marketplace/skills/agent-engineering/SKILL.md +386 -0
- package/marketplace/skills/agent-eval-design/SKILL.md +55 -0
- package/marketplace/skills/ai-native-development/SKILL.md +294 -0
- package/marketplace/skills/api-design/SKILL.md +60 -0
- package/marketplace/skills/architecture-decision-records/SKILL.md +55 -0
- package/marketplace/skills/background-jobs/SKILL.md +265 -0
- package/marketplace/skills/bounded-context-mapping/SKILL.md +55 -0
- package/marketplace/skills/cap-theorem-tradeoffs/SKILL.md +127 -0
- package/marketplace/skills/client-server-boundary/SKILL.md +187 -0
- package/marketplace/skills/code-review/SKILL.md +120 -0
- package/marketplace/skills/color-system-design/SKILL.md +43 -0
- package/marketplace/skills/component-architecture/SKILL.md +126 -0
- package/marketplace/skills/compression/SKILL.md +112 -0
- package/marketplace/skills/conceptual-modeling/SKILL.md +181 -0
- package/marketplace/skills/connection-pooling/SKILL.md +105 -0
- package/marketplace/skills/constraint-awareness/SKILL.md +287 -0
- package/marketplace/skills/content-monitor/SKILL.md +209 -0
- package/marketplace/skills/context-engineering/SKILL.md +320 -0
- package/marketplace/skills/context-graph/SKILL.md +174 -0
- package/marketplace/skills/context-management/SKILL.md +174 -0
- package/marketplace/skills/context-window/SKILL.md +239 -0
- package/marketplace/skills/contract-testing/SKILL.md +120 -0
- package/marketplace/skills/cron-scheduling/SKILL.md +223 -0
- package/marketplace/skills/dark-mode-implementation/SKILL.md +47 -0
- package/marketplace/skills/data-modeling/SKILL.md +59 -0
- package/marketplace/skills/data-modeling-fundamentals/SKILL.md +117 -0
- package/marketplace/skills/database-migration/SKILL.md +429 -0
- package/marketplace/skills/debugging/SKILL.md +67 -0
- package/marketplace/skills/dependency-architecture/SKILL.md +58 -0
- package/marketplace/skills/design-module-composition/SKILL.md +43 -0
- package/marketplace/skills/design-system-architecture/SKILL.md +61 -0
- package/marketplace/skills/design-thinking/SKILL.md +44 -0
- package/marketplace/skills/diagnosis/SKILL.md +296 -0
- package/marketplace/skills/diff-analysis/SKILL.md +188 -0
- package/marketplace/skills/e2e-test-design/SKILL.md +113 -0
- package/marketplace/skills/entity-relationship-modeling/SKILL.md +218 -0
- package/marketplace/skills/epistemic-grounding/SKILL.md +112 -0
- package/marketplace/skills/error-boundary/SKILL.md +235 -0
- package/marketplace/skills/error-tracking/SKILL.md +261 -0
- package/marketplace/skills/eval-driven-development/SKILL.md +147 -0
- package/marketplace/skills/evaluation/SKILL.md +113 -0
- package/marketplace/skills/event-contract-design/SKILL.md +60 -0
- package/marketplace/skills/event-storming/SKILL.md +56 -0
- package/marketplace/skills/form-ux-architecture/SKILL.md +60 -0
- package/marketplace/skills/framework-fit-analysis/SKILL.md +59 -0
- package/marketplace/skills/frontend-architecture/SKILL.md +43 -0
- package/marketplace/skills/generative-ui/SKILL.md +118 -0
- package/marketplace/skills/graph-audit/SKILL.md +81 -0
- package/marketplace/skills/guardrails/SKILL.md +118 -0
- package/marketplace/skills/hooks-patterns/SKILL.md +185 -0
- package/marketplace/skills/http-semantics/SKILL.md +136 -0
- package/marketplace/skills/ideation/SKILL.md +41 -0
- package/marketplace/skills/indexing-strategy/SKILL.md +108 -0
- package/marketplace/skills/information-architecture/SKILL.md +59 -0
- package/marketplace/skills/integration-test-design/SKILL.md +111 -0
- package/marketplace/skills/intent-recognition/SKILL.md +136 -0
- package/marketplace/skills/interaction-feedback/SKILL.md +59 -0
- package/marketplace/skills/interaction-patterns/SKILL.md +59 -0
- package/marketplace/skills/journey-mapping/SKILL.md +41 -0
- package/marketplace/skills/keywords/SKILL.md +213 -0
- package/marketplace/skills/knowledge-modeling/SKILL.md +232 -0
- package/marketplace/skills/layout-composition/SKILL.md +59 -0
- package/marketplace/skills/linguistics/SKILL.md +429 -0
- package/marketplace/skills/lint-overlay/SKILL.md +76 -0
- package/marketplace/skills/mental-models/SKILL.md +126 -0
- package/marketplace/skills/merge-queue/SKILL.md +94 -0
- package/marketplace/skills/methodology/SKILL.md +317 -0
- package/marketplace/skills/microcopy/SKILL.md +232 -0
- package/marketplace/skills/middleware-patterns/SKILL.md +363 -0
- package/marketplace/skills/mobile-responsive-ux/SKILL.md +287 -0
- package/marketplace/skills/mutation-testing/SKILL.md +112 -0
- package/marketplace/skills/naming-conventions/SKILL.md +112 -0
- package/marketplace/skills/observability-modeling/SKILL.md +59 -0
- package/marketplace/skills/ontology-modeling/SKILL.md +67 -0
- package/marketplace/skills/owasp-security/SKILL.md +153 -0
- package/marketplace/skills/pattern-recognition/SKILL.md +472 -0
- package/marketplace/skills/performance-budgets/SKILL.md +185 -0
- package/marketplace/skills/performance-engineering/SKILL.md +58 -0
- package/marketplace/skills/performance-testing/SKILL.md +125 -0
- package/marketplace/skills/printify/SKILL.md +42 -0
- package/marketplace/skills/prioritization/SKILL.md +118 -0
- package/marketplace/skills/problem-framing/SKILL.md +41 -0
- package/marketplace/skills/problem-locating-solving/SKILL.md +203 -0
- package/marketplace/skills/project-knowledge-extraction/SKILL.md +54 -0
- package/marketplace/skills/prompt-craft/SKILL.md +134 -0
- package/marketplace/skills/prompt-injection-defense/SKILL.md +132 -0
- package/marketplace/skills/property-based-testing/SKILL.md +100 -0
- package/marketplace/skills/prototyping/SKILL.md +43 -0
- package/marketplace/skills/query-optimization/SKILL.md +144 -0
- package/marketplace/skills/real-time-updates/SKILL.md +324 -0
- package/marketplace/skills/ref-patterns/SKILL.md +284 -0
- package/marketplace/skills/refactor/SKILL.md +65 -0
- package/marketplace/skills/rendering-models/SKILL.md +142 -0
- package/marketplace/skills/replication-patterns/SKILL.md +110 -0
- package/marketplace/skills/research-synthesis/SKILL.md +41 -0
- package/marketplace/skills/route-handler-design/SKILL.md +347 -0
- package/marketplace/skills/schema-evolution/SKILL.md +140 -0
- package/marketplace/skills/security-fundamentals/SKILL.md +139 -0
- package/marketplace/skills/semantic-center/SKILL.md +194 -0
- package/marketplace/skills/semantic-relations/SKILL.md +250 -0
- package/marketplace/skills/semantics/SKILL.md +366 -0
- package/marketplace/skills/semiotics/SKILL.md +230 -0
- package/marketplace/skills/seo-strategy/SKILL.md +260 -0
- package/marketplace/skills/server-actions-design/SKILL.md +243 -0
- package/marketplace/skills/server-components-design/SKILL.md +190 -0
- package/marketplace/skills/sharding-strategy/SKILL.md +123 -0
- package/marketplace/skills/shopify/SKILL.md +42 -0
- package/marketplace/skills/skill-infrastructure/SKILL.md +320 -0
- package/marketplace/skills/skill-router/SKILL.md +71 -0
- package/marketplace/skills/skill-scaffold/SKILL.md +105 -0
- package/marketplace/skills/snapshot-testing/SKILL.md +120 -0
- package/marketplace/skills/spec-driven-development/SKILL.md +148 -0
- package/marketplace/skills/state-machine-modeling/SKILL.md +56 -0
- package/marketplace/skills/state-management/SKILL.md +134 -0
- package/marketplace/skills/streaming-architecture/SKILL.md +194 -0
- package/marketplace/skills/summarization/SKILL.md +156 -0
- package/marketplace/skills/suspense-patterns/SKILL.md +265 -0
- package/marketplace/skills/system-interface-contracts/SKILL.md +59 -0
- package/marketplace/skills/task-analysis/SKILL.md +201 -0
- package/marketplace/skills/taxonomy-design/SKILL.md +66 -0
- package/marketplace/skills/test-coverage-strategy/SKILL.md +108 -0
- package/marketplace/skills/test-doubles-design/SKILL.md +98 -0
- package/marketplace/skills/test-driven-development/SKILL.md +96 -0
- package/marketplace/skills/testing-strategy/SKILL.md +67 -0
- package/marketplace/skills/theme-system-design/SKILL.md +43 -0
- package/marketplace/skills/tool-call-flow/SKILL.md +229 -0
- package/marketplace/skills/tool-call-strategy/SKILL.md +292 -0
- package/marketplace/skills/transaction-isolation/SKILL.md +98 -0
- package/marketplace/skills/type-safety/SKILL.md +177 -0
- package/marketplace/skills/typography-system/SKILL.md +43 -0
- package/marketplace/skills/usability-testing/SKILL.md +43 -0
- package/marketplace/skills/user-research/SKILL.md +43 -0
- package/marketplace/skills/vercel-composition-patterns/SKILL.md +157 -0
- package/marketplace/skills/version-control/SKILL.md +233 -0
- package/marketplace/skills/visual-design-foundations/SKILL.md +59 -0
- package/marketplace/skills/visual-hierarchy/SKILL.md +43 -0
- package/marketplace/skills/webhook-integration/SKILL.md +331 -0
- package/marketplace/skills/writing-humanizer/SKILL.md +380 -0
- package/package.json +67 -0
- package/schemas/manifest.schema.json +811 -0
- package/schemas/manifest.v2.schema.json +164 -0
- package/schemas/manifest.v3.schema.json +758 -0
- package/schemas/manifest.v4.schema.json +755 -0
- package/schemas/manifest.v5.schema.json +755 -0
- package/schemas/manifest.v6.schema.json +811 -0
- package/schemas/skill.context.jsonld +279 -0
- package/schemas/skill.schema.json +919 -0
- package/schemas/skill.v2.schema.json +201 -0
- package/schemas/skill.v3.schema.json +827 -0
- package/schemas/skill.v4.schema.json +822 -0
- package/schemas/skill.v5.schema.json +830 -0
- package/schemas/skill.v6.schema.json +946 -0
- package/schemas/vocabulary/keywords.json +180 -0
- package/schemas/vocabulary/workspace_tags.json +23 -0
- package/scripts/__tests__/migrate-skill-v2-to-v3.test.js +161 -0
- package/scripts/__tests__/migrate-skill-v3-to-v4.test.js +158 -0
- package/scripts/__tests__/test-export-parser-drift.js +149 -0
- package/scripts/__tests__/test-marketplace-export.js +114 -0
- package/scripts/__tests__/test-router-paths.js +82 -0
- package/scripts/__tests__/test-stability-promotion.js +244 -0
- package/scripts/__tests__/test-v3-1-alias-contract.js +109 -0
- package/scripts/__tests__/test-v3-1-skos-runtime.js +116 -0
- package/scripts/backfill-schema-version.js +198 -0
- package/scripts/build-field-reference.js +160 -0
- package/scripts/build-retrieval-baseline.js +511 -0
- package/scripts/check-markdown-links.js +211 -0
- package/scripts/check-protocol-consistency.js +979 -0
- package/scripts/export-marketplace-skills.js +610 -0
- package/scripts/export-skill.js +374 -0
- package/scripts/generate-manifest.js +787 -0
- package/scripts/lib/alias-contract.js +83 -0
- package/scripts/lib/audit-prompt-builder.js +771 -0
- package/scripts/lib/mock-grader.js +134 -0
- package/scripts/lib/parse-frontmatter.js +429 -0
- package/scripts/lib/roots.js +119 -0
- package/scripts/lint/check-archetype-sections.js +185 -0
- package/scripts/lint/check-category-enum.js +83 -0
- package/scripts/lint/check-routing-eval.js +146 -0
- package/scripts/lint/check-routing-quality.js +211 -0
- package/scripts/lint/check-stability-promotion.js +220 -0
- package/scripts/lint/format-code-frame.js +206 -0
- package/scripts/marketplace-install.js +125 -0
- package/scripts/migrate-category-to-enum.js +169 -0
- package/scripts/migrate-skill-v2-to-v3.js +424 -0
- package/scripts/migrate-skill-v3-to-v4.js +200 -0
- package/scripts/migrate-skill-v5-to-v6.js +304 -0
- package/scripts/restructure-by-category.js +85 -0
- package/scripts/seed-publication-classification.js +282 -0
- package/scripts/skill-audit.js +893 -0
- package/scripts/skill-graph-drift.js +483 -0
- package/scripts/skill-graph-route.js +766 -0
- package/scripts/skill-graph-routing-eval.js +393 -0
- package/scripts/skill-lint.js +1317 -0
- package/scripts/skill-overlap.js +213 -0
- package/scripts/verify-skill-md-export.js +201 -0
|
@@ -0,0 +1,113 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: e2e-test-design
|
|
3
|
+
description: "Use when designing end-to-end tests that exercise a user-visible path through the whole system, including the UI layer: the user-journey unit-of-test that distinguishes e2e from integration testing, the five-primitive structure (user journey, environment, test data, observable assertion, recovery), why e2e tests are expensive and how to keep them few-and-load-bearing, the wait/synchronization discipline that makes them not-flaky, the page-object and trace-test patterns, the role of e2e tests in the test pyramid/trophy (the top tier — fewest in count but highest in coverage of user-observable behavior), and the modern e2e tool landscape (Playwright, Cypress, Selenium). Do NOT use for testing internal seams of the system (use integration-test-design), single-unit isolated tests (use testing-strategy + test-doubles-design), consumer-driven contract verification (use contract-testing), or visual regression of specific components (use snapshot-testing)."
|
|
4
|
+
license: MIT
|
|
5
|
+
allowed-tools: Read Grep
|
|
6
|
+
metadata:
|
|
7
|
+
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"quality\",\"domain\":\"quality/testing\",\"scope\":\"reference\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-16\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-16\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"comprehension_state\":\"present\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"end-to-end testing\\\\\\\",\\\\\\\"e2e test\\\\\\\",\\\\\\\"user journey test\\\\\\\",\\\\\\\"Playwright\\\\\\\",\\\\\\\"Cypress\\\\\\\",\\\\\\\"Selenium\\\\\\\",\\\\\\\"page object\\\\\\\",\\\\\\\"test flake\\\\\\\",\\\\\\\"wait strategy\\\\\\\",\\\\\\\"trace test\\\\\\\",\\\\\\\"smoke test\\\\\\\"]\",\"triggers\":\"[\\\\\\\"do we need e2e tests\\\\\\\",\\\\\\\"the e2e tests are flaky\\\\\\\",\\\\\\\"Playwright vs Cypress\\\\\\\",\\\\\\\"how many e2e tests is too many\\\\\\\",\\\\\\\"page object pattern\\\\\\\"]\",\"examples\":\"[\\\\\\\"design an e2e test suite for an onboarding journey: signup → email verify → first action\\\\\\\",\\\\\\\"decide which user journeys deserve e2e coverage vs integration-test coverage\\\\\\\",\\\\\\\"diagnose flaky e2e tests — usually wait-strategy or test-data problems\\\\\\\",\\\\\\\"explain why fewer e2e tests with higher load-bearing value beats many e2e tests with low value\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"test internal seams of the system (use integration-test-design)\\\\\\\",\\\\\\\"test a single component in isolation (use testing-strategy + test-doubles-design)\\\\\\\",\\\\\\\"verify a service's contract against consumers (use 
contract-testing)\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"integration-test-design\\\\\\\",\\\\\\\"snapshot-testing\\\\\\\",\\\\\\\"test-driven-development\\\\\\\",\\\\\\\"contract-testing\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"testing-strategy owns the strategic ratio of test levels; this skill owns the design of the e2e tier specifically — the smallest tier in the pyramid/trophy, the most expensive per test, the most user-meaningful per test.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"integration-test-design\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"integration-test-design owns tests of internal seams between units; this skill owns user-journey tests through the whole stack including UI. The cost difference is an order of magnitude; conflating them either inflates CI cost or misses real e2e coverage.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"contract-testing\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"contract-testing verifies the interface between consumer and provider via consumer-driven contracts; this skill verifies the user-journey behavior end-to-end. Contracts replace e2e tests across service boundaries when the journey is service-to-service; e2e is for journeys with humans at one end.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"snapshot-testing\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"Visual snapshot tests are a regression net at e2e or component scope; this skill owns the user-journey behavior end-to-end. 
They compose: visual snapshots within e2e tests catch UI changes the journey assertions don't.\\\\\\\"}],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"testing-strategy\\\\\\\",\\\\\\\"integration-test-design\\\\\\\"]}\",\"mental_model\":\"|\",\"purpose\":\"|\",\"boundary\":\"|\",\"analogy\":\"An e2e test is to a software system what a flight rehearsal is to a launch — you do not certify a rocket by testing each bolt in a clean room (units), nor by firing each engine in isolation (integration), nor by writing a specification of what the avionics should do (contract); you certify it by performing the entire launch sequence, with real fuel, against a real flight plan, with the actual crew, and you do this rarely because each rehearsal costs millions and ten high-fidelity rehearsals tell you more than a thousand quick ones.\",\"misconception\":\"|\",\"concept\":\"{\\\\\\\"definition\\\\\\\":\\\\\\\"End-to-end (e2e) test design is the discipline of designing tests that exercise a complete user-visible path through the entire system — the UI, the application layer, the data layer, external integrations, and back. The unit of test is the *user journey*: a sequence of user actions and the observable outcomes the user experiences. E2e tests are the highest-scope tier of the test pyramid (or trophy) — the fewest in count, the slowest per test, the most user-meaningful per test. Their value is the confidence that the system, assembled, works for a real user task; their cost is real and growing with the system's complexity. 
The discipline of e2e test design is keeping the count small enough that the cost stays manageable while keeping the coverage broad enough that the tests are load-bearing evidence the system works for users.\\\\\\\",\\\\\\\"mental_model\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"purpose\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"boundary\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"taxonomy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"analogy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"misconception\\\\\\\":\\\\\\\"|\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/e2e-test-design/SKILL.md\"}"
|
|
8
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
9
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
10
|
+
skill_graph_project: Skill Graph
|
|
11
|
+
skill_graph_canonical_skill: skills/e2e-test-design/SKILL.md
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
# E2e Test Design
|
|
15
|
+
|
|
16
|
+
## Coverage
|
|
17
|
+
|
|
18
|
+
The discipline of designing tests that exercise a complete user-visible path through the entire system — UI, application, data, third parties, and back — with the user journey as the unit of test. Covers the five primitives (journey, environment, test data, observable assertion, wait strategy), the pyramid/trophy position (top tier, fewest tests, highest cost, highest user-meaningful coverage), the test-data strategies that determine isolation and flake rate, the wait-for-condition synchronization discipline that prevents flake, the framework landscape (Playwright, Cypress, Selenium, Puppeteer), and the cost-and-count trade-off that distinguishes good e2e suites from over-investment.
|
|
19
|
+
|
|
20
|
+
## Philosophy
|
|
21
|
+
|
|
22
|
+
E2e tests are the smallest tier of the test suite by count and the largest by individual cost. They are the evidence that the *assembled system* works for *real users*; nothing else in the test suite provides that evidence. They are also the most expensive tests to write, run, and maintain.
|
|
23
|
+
|
|
24
|
+
The discipline is making each e2e test *load-bearing*: it covers a journey that matters, asserts outcomes that prove the system works for users, and runs reliably with near-zero flake. A team with one hundred low-value flaky e2e tests is worse off than a team with ten high-value reliable e2e tests, because the suite's value is trust: the willingness to take a red build as evidence of a real bug.
|
|
25
|
+
|
|
26
|
+
The right e2e count is much lower than most teams intuit. The pyramid/trophy framings put e2e at the top — fewest tests, by design. Teams that grow e2e suites to hundreds of tests usually have many tests that should be integration tests; the discipline is reversing that drift.
|
|
27
|
+
|
|
28
|
+
## What An E2e Test Looks Like
|
|
29
|
+
|
|
30
|
+
| Component | Detail |
|---|---|
| Scope | A complete user journey (signup, checkout, primary creative action) |
| Environment | Production-like; all real components, recorded fakes for paid third parties |
| Stack | Full UI, application, database, message bus, file storage |
| Test data | Isolated per test, typically via per-test user accounts with unique IDs |
| Assertions | DOM elements visible, URL changes, key user-observable outcomes |
| Wait strategy | Wait-for-condition with generous timeouts; never hardcoded delays |
| Diagnostics | Screenshot on failure, video capture, trace viewer for replay |
| Runtime | Seconds to tens of seconds per test |
| Suite size | 10-50 critical-path; 50-200 for broader regression |
| Suite runtime | Under 30 minutes total via parallelization |
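
Several rows of the table above (wait strategy, diagnostics, parallel runtime) map onto runner configuration. As a minimal sketch, assuming Playwright is the chosen runner and the file is the conventional `playwright.config.ts`, the settings might look like the fragment below. The numeric values are illustrative assumptions, not recommendations from this skill.

```typescript
// playwright.config.ts: illustrative fragment, values are assumptions
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Run test files in parallel to keep total suite runtime down.
  fullyParallel: true,
  // Generous per-test budget; e2e tests take seconds to tens of seconds.
  timeout: 60_000,
  expect: {
    // Default timeout for auto-retrying assertions such as toBeVisible().
    timeout: 10_000,
  },
  use: {
    // Failure diagnostics kept as CI artifacts.
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
    trace: 'retain-on-failure',
  },
});
```

Other runners expose comparable settings under different names, so treat the keys above as specific to Playwright.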
|
|
42
|
+
|
|
43
|
+
## The Wait-Strategy Discipline
|
|
44
|
+
|
|
45
|
+
E2e flake's primary cause is bad synchronization. The discipline:
|
|
46
|
+
|
|
47
|
+
| Anti-pattern | Pattern |
|---|---|
| `page.click('button'); sleep(2); expect(elem).toBeVisible()` | `await page.click('button'); await expect(elem).toBeVisible()` (auto-waits up to timeout) |
| `sleep(5); expect(await toast.textContent()).toBe('Saved')` | `await expect(toast).toHaveText('Saved', { timeout: 10_000 })` (retries until match) |
| `page.click('a'); sleep(3); page.click('next')` | `await page.click('a'); await page.waitForURL(/expected/); await page.click('next')` |
| Hardcoded delay for network call | `await page.waitForResponse(predicate)`, with the wait started before the action that triggers the request |
| Hardcoded delay for animation | Animation-aware wait or disable animations in test config |
|
|
54
|
+
|
|
55
|
+
Playwright, Cypress, and Selenium all support wait-for-condition. The discipline is using it everywhere instead of `sleep`.
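
The idea behind wait-for-condition can be sketched without any framework. The helper below is hypothetical (`waitForCondition`, `timeoutMs`, and `intervalMs` are names invented for illustration, not part of Playwright, Cypress, or Selenium); it polls a condition and returns as soon as it holds, which is roughly what the frameworks' built-in waits do for you.

```typescript
// Hypothetical helper: polls a condition until it holds or a deadline passes.
type WaitOptions = { timeoutMs?: number; intervalMs?: number };

async function waitForCondition(
  condition: () => boolean | Promise<boolean>,
  { timeoutMs = 5_000, intervalMs = 50 }: WaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Return as soon as the condition holds; no fixed delay is paid.
    if (await condition()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    // Short pause between polls so the loop does not spin.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Unlike `sleep(2)`, this costs only as long as the condition takes to become true, and it fails with a clear error when the timeout expires instead of asserting against a page that is not ready. In a real suite, prefer the framework's own waits over a hand-rolled helper.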
|
|
56
|
+
|
|
57
|
+
## Test-Data Strategy Comparison
|
|
58
|
+
|
|
59
|
+
| Strategy | Speed | Isolation | Best for |
|---|---|---|---|
| Fresh environment per test | Slowest | Strongest | Highest-stakes journeys; very few tests |
| Per-test data with unique IDs | Fast | Strong | Most production e2e suites |
| Fixture seed + per-user partition | Fast | Strong (if partition discipline holds) | Suites with shared lookup data |
| Shared snapshot, read-only discipline | Fastest | Relies on discipline | Pure read flows; rare |
|
|
65
|
+
|
|
66
|
+
Per-test data with unique IDs is the working standard. Each test creates the users, orders, or whatever entities it needs, with IDs that don't collide with other tests. Cleanup happens at test end or suite end.
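
A minimal sketch of per-test data with unique IDs follows. The names (`uniqueSuffix`, `makeTestUser`, the `TestUser` shape, the `example.test` domain) are assumptions made for illustration; a real suite would create the user through its own API or database fixtures.

```typescript
// Hypothetical factory for collision-free test entities.
interface TestUser {
  id: string;
  email: string;
}

let sequence = 0;

// Combines a timestamp, an in-process counter, and a random part so that
// IDs differ within one worker and are unlikely to collide across workers.
function uniqueSuffix(): string {
  sequence += 1;
  const time = Date.now().toString(36);
  const random = Math.random().toString(36).slice(2, 8);
  return `${time}-${sequence}-${random}`;
}

function makeTestUser(testName: string): TestUser {
  const suffix = uniqueSuffix();
  // Slug of the test name makes leftover rows traceable to the test.
  const slug = testName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return {
    id: `user-${suffix}`,
    email: `${slug}-${suffix}@example.test`,
  };
}
```

Embedding the test name in the email is a design choice, not a requirement: it makes rows left behind by a crashed run easy to attribute and clean up.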
|
|
67
|
+
|
|
68
|
+
## Framework Selection
|
|
69
|
+
|
|
70
|
+
| Framework | Strengths | Weaknesses | Best fit |
|---|---|---|---|
| Playwright | Modern, fast, multi-browser, auto-waiting, trace viewer | Newer ecosystem, smaller community than Selenium | New projects; recommended default |
| Cypress | Excellent DX, retry-until-pass, in-browser test execution | Chromium-family focused, iframe and tab handling awkward | Developer-experience-focused teams |
| Selenium WebDriver | Largest ecosystem, cross-language, every browser | Slower, more boilerplate, more flake-prone without careful setup | Large enterprise / legacy / cross-browser-matrix |
| Puppeteer | Lower-level Chrome API | Chromium-only, less abstraction | Lower-level scripting and scraping |
| Detox / Appium / Maestro | Mobile-native | Mobile-only | Mobile app e2e |
|
|
77
|
+
|
|
78
|
+
## Verification
|
|
79
|
+
|
|
80
|
+
After applying this skill, verify:
|
|
81
|
+
- [ ] Every e2e test covers a complete user journey, not a UI interaction in isolation. "Click this button" is not an e2e test; "complete signup" is.
|
|
82
|
+
- [ ] The test environment is production-like: real UI, real backend, real database, real message bus. Recorded fakes are used only for paid or unavailable third parties.
|
|
83
|
+
- [ ] Wait strategy is wait-for-condition with generous timeouts everywhere. No hardcoded `sleep` calls.
|
|
84
|
+
- [ ] Test data is isolated per test (per-test user accounts with unique IDs, or per-test data with cleanup). Shared mutable state is the flake source.
|
|
85
|
+
- [ ] Each e2e test has a clear load-bearing reason for existing — names a journey that matters, asserts an outcome that would not be caught by a lower-tier test.
|
|
86
|
+
- [ ] Suite size and runtime are tracked. Suites trending toward 500+ tests or 30+ minutes are diagnosed; many should be lower-tier tests.
|
|
87
|
+
- [ ] Flake rate is monitored and held near zero. Flaky tests are diagnosed and fixed, not retried-and-ignored.
|
|
88
|
+
- [ ] Failure diagnostics (screenshot, video, trace) are configured so that test failures are debuggable from the CI artifacts.
|
|
89
|
+
- [ ] E2e tests run on every PR (critical-path subset) and on every merge or nightly (broader regression). They are not the primary test for behaviors lower-tier tests can verify.
|
|
90
|
+
|
|
91
|
+
## Do NOT Use When
|
|
92
|
+
|
|
93
|
+
| Instead of this skill | Use | Why |
|
|
94
|
+
|---|---|---|
|
|
95
|
+
| Testing an internal seam between modules | `integration-test-design` | integration tests are cheaper and faster for non-user-journey verification |
|
|
96
|
+
| Testing a single unit in isolation | `testing-strategy` + `test-doubles-design` | unit-scope is much cheaper for implementation-detail verification |
|
|
97
|
+
| Verifying a service-to-service contract | `contract-testing` | contract tests are more targeted than e2e for service-boundary verification |
|
|
98
|
+
| Visual regression of a specific component | `snapshot-testing` | snapshot is cheaper and more targeted than e2e for visual regression |
|
|
99
|
+
| Choosing the ratio of test levels | `testing-strategy` | strategy owns ratios; this skill owns e2e-tier design |
|
|
100
|
+
| Measuring whether the test suite catches defects | `mutation-testing` | mutation is the test-suite quality signal; this skill is e2e tier design |
|
|
101
|
+
|
|
102
|
+
## Key Sources
|
|
103
|
+
|
|
104
|
+
- Cohn, M. (2009). *Succeeding with Agile: Software Development Using Scrum*. The test pyramid framing places e2e at the top tier; canonical reference for why e2e count should be small.
|
|
105
|
+
- Vocke, H. (2018). ["The Practical Test Pyramid"](https://martinfowler.com/articles/practical-test-pyramid.html). Practitioner essay on the pyramid, published on martinfowler.com; sections on UI/e2e tests' cost-benefit profile.
|
|
106
|
+
- Dodds, K. C. (2018). ["The Testing Trophy and Testing Classifications"](https://kentcdodds.com/blog/the-testing-trophy-and-testing-classifications). Alternative framing that retains e2e at the top tier with similar cost-and-count reasoning.
|
|
107
|
+
- Microsoft / Playwright Team. ["Playwright — Documentation"](https://playwright.dev/docs/intro). Canonical reference for the modern multi-browser e2e framework; includes the auto-waiting and trace-viewer discipline.
|
|
108
|
+
- Cypress.io. ["Cypress — Documentation"](https://docs.cypress.io/). Canonical reference for the Cypress framework; includes the retry-until-pass assertion model and the in-browser test execution architecture.
|
|
109
|
+
- Selenium Project. ["Selenium WebDriver — Documentation"](https://www.selenium.dev/documentation/webdriver/). The canonical cross-language WebDriver protocol reference; foundation for many derived tools.
|
|
110
|
+
- Meszaros, G. (2007). *xUnit Test Patterns: Refactoring Test Code*. Catalog includes e2e-relevant patterns for test data lifecycle, shared fixtures, and test independence.
|
|
111
|
+
- Fowler, M. ["Page Object"](https://martinfowler.com/bliki/PageObject.html). The canonical reference for the page-object pattern; useful for organizing large e2e suites.
|
|
112
|
+
- Google Testing Blog. ["Testing on the Toilet — Just say no to more end-to-end tests"](https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html). Industrial perspective on why more e2e tests is usually the wrong response to bug pressure.
|
|
113
|
+
- North, D. ["BDD as Outside-In Development"](https://dannorth.net/introducing-bdd/). Adjacent thread: BDD's outside-in style produces user-journey-scope tests at the outermost layer, conceptually aligned with e2e.
|
|
@@ -0,0 +1,218 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: entity-relationship-modeling
|
|
3
|
+
description: "Use when designing database tables, reviewing schema changes, planning migrations, or translating conceptual models into physical database structures. Covers ER notation, entity/attribute/key design, normalization and denormalization, junction tables, inheritance mapping, temporal modeling, ER-to-SQL translation, indexing, and constraints. Do NOT use for conceptual domain analysis (use `conceptual-modeling`), formal ontology (use `ontology`), or cross-system API contracts (use `system-interface-contracts`)."
|
|
4
|
+
license: MIT
|
|
5
|
+
compatibility: "Markdown, Git, agent-skill runtimes"
|
|
6
|
+
allowed-tools: Read Grep Bash
|
|
7
|
+
metadata:
|
|
8
|
+
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"engineering\",\"domain\":\"engineering/modeling\",\"scope\":\"portable\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-03-29\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-03-29\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"entity relationship\\\\\\\",\\\\\\\"ER diagram\\\\\\\",\\\\\\\"ER model\\\\\\\",\\\\\\\"database design\\\\\\\",\\\\\\\"schema design\\\\\\\",\\\\\\\"normalization\\\\\\\",\\\\\\\"foreign key\\\\\\\",\\\\\\\"primary key\\\\\\\",\\\\\\\"junction table\\\\\\\",\\\\\\\"database modeling\\\\\\\"]\",\"triggers\":\"[\\\\\\\"er-modeling-skill\\\\\\\",\\\\\\\"database-design-skill\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"database-migration\\\\\\\"],\\\\\\\"boundary\\\\\\\":[\\\\\\\"conceptual-modeling\\\\\\\"],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"code-review\\\\\\\"]}\",\"portability\":\"{\\\\\\\"readiness\\\\\\\":\\\\\\\"scripted\\\\\\\",\\\\\\\"targets\\\\\\\":[\\\\\\\"skill-md\\\\\\\"]}\",\"lifecycle\":\"{\\\\\\\"stale_after_days\\\\\\\":90,\\\\\\\"review_cadence\\\\\\\":\\\\\\\"quarterly\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/entity-relationship-modeling/SKILL.md\"}"
|
|
9
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
10
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
11
|
+
skill_graph_project: Skill Graph
|
|
12
|
+
skill_graph_canonical_skill: skills/entity-relationship-modeling/SKILL.md
|
|
13
|
+
---
|
|
14
|
+
# Entity-Relationship Modeling
|
|
15
|
+
|
|
16
|
+
## Domain Context
|
|
17
|
+
|
|
18
|
+
**What is this skill?** This skill provides entity-relationship (ER) modeling patterns for designing database schemas from domain requirements: Chen notation and Crow's Foot notation, entity identification and attribute analysis, primary/foreign key design, normalization (1NF through BCNF), denormalization trade-offs, junction table patterns, inheritance mapping strategies (single-table, class-table, concrete-table), temporal data modeling, and schema evolution/migration patterns. Covers the ER-to-SQL translation pipeline, indexing strategy from access patterns, constraint specification (NOT NULL, UNIQUE, CHECK, FK), and anti-patterns like EAV abuse, polymorphic associations, and over-normalization. Use when designing new database tables, reviewing schema changes, planning migrations, or translating conceptual models into physical database structures. Do NOT use for conceptual domain analysis (use `conceptual-modeling`), formal ontology (use `ontology`), or cross-system data mapping (use `relational-mapping`).
|
|
19
|
+
|
|
20
|
+
## Coverage
|
|
21
|
+
|
|
22
|
+
Entity-relationship modeling for database schema design: Chen notation and Crow's Foot notation, entity identification and attribute analysis, primary/foreign key design (natural vs. surrogate, UUID vs. serial), normalization forms (1NF through BCNF) with trade-off analysis, denormalization patterns for read performance, junction table design for M:N relationships, inheritance mapping strategies (single-table, class-table, concrete-table), temporal data modeling (SCD Type 1/2/3, bi-temporal), schema evolution and migration patterns, ER-to-SQL translation, indexing strategy from access patterns, constraint specification (NOT NULL, UNIQUE, CHECK, FK, EXCLUDE), and anti-patterns (EAV, polymorphic associations, over-normalization, mega-tables). Does not cover conceptual domain analysis (`conceptual-modeling`), formal ontology (`ontology`), or cross-system data mapping (`relational-mapping`).
|
|
23
|
+
|
|
24
|
+
## Philosophy
|
|
25
|
+
|
|
26
|
+
A database schema is a commitment about what the business considers true. Every table is a claim that a category of things exists; every foreign key is a claim that two categories are related; every constraint is a claim about what the business considers valid. Bad ER design does not just cause slow queries — it causes business logic bugs, data integrity violations, and migration nightmares. This skill exists because agents commonly produce schemas that "work" for the happy path but fail under real-world conditions: concurrent updates, schema evolution, multi-tenancy, and audit requirements. The goal is schemas that are correct first, performant second, and evolvable always.
|
|
27
|
+
|
|
28
|
+
## 1. Entity Identification
|
|
29
|
+
|
|
30
|
+
### What Makes a Good Entity
|
|
31
|
+
|
|
32
|
+
| Criterion | Pass | Fail |
|
|
33
|
+
|-----------|------|------|
|
|
34
|
+
| **Has identity** | Two orders can be distinguished by ID | "OrderType" — just an enum value |
|
|
35
|
+
| **Has multiple attributes** | Order has status, amount, date, customer | "Color" with just a name |
|
|
36
|
+
| **Has a lifecycle** | Order transitions through states | A constant lookup value |
|
|
37
|
+
| **Participates in relationships** | Order belongs to Customer, has LineItems | An isolated value with no connections |
|
|
38
|
+
| **Business users name it** | "Customer," "Product," "Order" | "DataRecord," "Item," "Thing" |
|
|
39
|
+
|
|
40
|
+
### Entity vs. Attribute vs. Relationship
|
|
41
|
+
|
|
42
|
+
| If it has... | It is probably... |
|
|
43
|
+
|-------------|-------------------|
|
|
44
|
+
| Multiple attributes of its own | An entity |
|
|
45
|
+
| Only a name/label | An attribute (or enum) |
|
|
46
|
+
| Attributes that describe a connection | A relationship entity (reified relationship) |
|
|
47
|
+
| Multiple instances per parent | A child entity (not a multi-valued attribute) |
|
|
48
|
+
|
|
49
|
+
## 2. Primary Key Design
|
|
50
|
+
|
|
51
|
+
> For philosophical identity questions (what makes two entities "the same"), see `ontology`. This section covers the database implementation of those decisions.
|
|
52
|
+
|
|
53
|
+
| Strategy | When to use | Trade-offs |
|
|
54
|
+
|----------|-------------|-----------|
|
|
55
|
+
| **UUID v4** | Distributed systems, multi-tenancy, external exposure | 16 bytes, random (not sortable, poor index locality), not human-readable |
|
|
56
|
+
| **UUID v7** | Need sortable UUIDs for index performance | Timestamp-prefixed and sortable; still 16 bytes, and the embedded timestamp reveals creation time |
|
|
57
|
+
| **Serial/BIGSERIAL** | Single-database, internal only | Compact, sortable, but reveals sequence |
|
|
58
|
+
| **Natural key** | Immutable business identifier (ISO codes, SKUs) | Only if truly immutable; rare in practice |
|
|
59
|
+
| **Composite key** | Junction tables, external system references | Complex JOINs, ORM friction |
|
|
60
|
+
|
|
61
|
+
Rules:
|
|
62
|
+
- Default to UUID v7 for new tables in multi-tenant SaaS (sortable, no sequence leakage, distributed-safe).
|
|
63
|
+
- Never expose serial IDs externally (information leakage: competitor can count your orders).
|
|
64
|
+
- Natural keys are tempting but dangerous — "immutable" business identifiers change more often than you think.
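
To make the "sortable" property concrete, here is a sketch of the UUID v7 bit layout as described in RFC 9562: a 48-bit Unix millisecond timestamp, the version nibble, then random bits around the variant field. `uuid7_sketch` is a hypothetical name and this is an illustration, not a production generator. It does not guarantee ordering for IDs created within the same millisecond.

```python
import os
import time
import uuid


def uuid7_sketch(ts_ms=None):
    """Assemble a UUID with the version-7 layout (illustrative only)."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 of them used

    value = (ts_ms & ((1 << 48) - 1)) << 80       # bits 127..80: Unix time in ms
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= ((rand >> 62) & 0xFFF) << 64         # bits 75..64: 12 random bits
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand & ((1 << 62) - 1)               # bits 61..0: 62 random bits
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs generated in different milliseconds sort in creation order, which is what keeps B-tree inserts clustered at the right-hand edge of the index.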
|
|
65
|
+
|
|
66
|
+
## 3. Relationship Patterns
|
|
67
|
+
|
|
68
|
+
### Cardinality Implementation
|
|
69
|
+
|
|
70
|
+
| Relationship | Implementation |
|
|
71
|
+
|-------------|---------------|
|
|
72
|
+
| **1:1** | FK with UNIQUE on child table, or merge into one table |
|
|
73
|
+
| **1:N** | FK on the N-side (child) referencing parent PK |
|
|
74
|
+
| **M:N** | Junction table with two FKs + composite PRIMARY KEY (or composite UNIQUE if the table has its own surrogate key) |
|
|
75
|
+
| **M:N with attributes** | Junction table promoted to entity (with its own PK and additional columns) |
|
|
76
|
+
| **Self-referential** | FK referencing same table (e.g., `manager_id` → `employees.id`) |
|
|
77
|
+
| **Polymorphic** | Avoid; use a junction table per target type, or a separate nullable FK per target table with a CHECK that exactly one is set |
|
|
78
|
+
|
|
79
|
+
### Junction Table Design
|
|
80
|
+
|
|
81
|
+
```sql
|
|
82
|
+
-- Simple M:N
|
|
83
|
+
CREATE TABLE product_categories (
|
|
84
|
+
product_id UUID REFERENCES products(id) ON DELETE CASCADE,
|
|
85
|
+
category_id UUID REFERENCES categories(id) ON DELETE CASCADE,
|
|
86
|
+
PRIMARY KEY (product_id, category_id)
|
|
87
|
+
);
|
|
88
|
+
|
|
89
|
+
-- M:N with attributes (promoted to entity)
|
|
90
|
+
CREATE TABLE order_line_items (
|
|
91
|
+
id UUID PRIMARY KEY DEFAULT gen_random_uuid(), -- PostgreSQL: generates UUID v4; substitute a v7 generator if following the UUID v7 default
|
|
92
|
+
order_id UUID NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
|
|
93
|
+
product_id UUID NOT NULL REFERENCES products(id),
|
|
94
|
+
quantity INTEGER NOT NULL CHECK (quantity > 0),
|
|
95
|
+
unit_price_cents INTEGER NOT NULL CHECK (unit_price_cents >= 0),
|
|
96
|
+
created_at TIMESTAMPTZ DEFAULT now()
|
|
97
|
+
);
|
|
98
|
+
```
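
The behavior of the simple M:N table can be checked with a small runnable sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, so the types are simplified (`TEXT` ids instead of `UUID`), and the sample ids are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE products   (id TEXT PRIMARY KEY);
CREATE TABLE categories (id TEXT PRIMARY KEY);
CREATE TABLE product_categories (
    product_id  TEXT NOT NULL REFERENCES products(id)   ON DELETE CASCADE,
    category_id TEXT NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
    PRIMARY KEY (product_id, category_id)
);
INSERT INTO products VALUES ('p1');
INSERT INTO categories VALUES ('c1'), ('c2');
INSERT INTO product_categories VALUES ('p1', 'c1'), ('p1', 'c2');
""")


def link_twice():
    """Return True if linking the same pair again is rejected by the composite key."""
    try:
        conn.execute("INSERT INTO product_categories VALUES ('p1', 'c1')")
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = link_twice()
conn.execute("DELETE FROM products WHERE id = 'p1'")  # cascades to the junction rows
remaining_links = conn.execute("SELECT COUNT(*) FROM product_categories").fetchone()[0]
```

The composite primary key is what prevents the same product from being linked to the same category twice, and `ON DELETE CASCADE` removes the links when either side disappears.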
|
|
99
|
+
|
|
100
|
+
## 4. Normalization
|
|
101
|
+
|
|
102
|
+
| Form | Rule | Violation example | Fix |
|
|
103
|
+
|------|------|-------------------|-----|
|
|
104
|
+
| **1NF** | Atomic values, no repeating groups | `tags: "red,blue,green"` | Separate table for tags |
|
|
105
|
+
| **2NF** | No partial dependencies (all non-key attributes depend on entire PK) | Junction table with attributes depending on only one FK | Move to parent table or new entity |
|
|
106
|
+
| **3NF** | No transitive dependencies (non-key attributes don't depend on other non-key attributes) | `order.customer_name` duplicating `customer.name` | Remove; join when needed |
|
|
107
|
+
| **BCNF** | Every determinant is a candidate key | `(student, course, instructor)` where `instructor → course` but `instructor` is not a key. Rare in practice; usually 3NF suffices | Decompose table |
|
|
108
|
+
|
|
109
|
+
Rules:
|
|
110
|
+
- Normalize to 3NF by default for transactional tables.
|
|
111
|
+
- Denormalize deliberately for read-heavy analytics with documented justification.
|
|
112
|
+
- Never skip normalization analysis — even if you plan to denormalize, understanding the normal form reveals the true dependencies.
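
As a worked instance of the 1NF row in the table above, the sketch below moves a comma-separated `tags` value into a child table. It uses `sqlite3` for runnability; the table names and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_tags (
    product_id INTEGER NOT NULL REFERENCES products(id),
    tag        TEXT    NOT NULL,
    PRIMARY KEY (product_id, tag)
);
""")

# Legacy shape: one column holding a repeating group.
legacy_rows = [(1, "mug", "red,blue"), (2, "shirt", "blue,green")]

for product_id, name, tags in legacy_rows:
    conn.execute("INSERT INTO products VALUES (?, ?)", (product_id, name))
    conn.executemany(
        "INSERT INTO product_tags VALUES (?, ?)",
        [(product_id, tag.strip()) for tag in tags.split(",")],
    )

# With atomic values, "which products are blue?" is an indexed equality lookup
# rather than a substring match on a delimited string.
blue_products = [
    row[0]
    for row in conn.execute(
        "SELECT p.name FROM products p "
        "JOIN product_tags t ON t.product_id = p.id "
        "WHERE t.tag = ? ORDER BY p.name",
        ("blue",),
    )
]
```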
|
|
113
|
+
|
|
114
|
+
## 5. Denormalization Patterns
|
|
115
|
+
|
|
116
|
+
| Pattern | When | Risk |
|
|
117
|
+
|---------|------|------|
|
|
118
|
+
| **Materialized view** | Read-heavy aggregations, dashboards | Staleness, refresh overhead |
|
|
119
|
+
| **Computed column** | Frequently derived values | Must be maintained on writes |
|
|
120
|
+
| **Redundant column** | Avoid frequent JOINs for hot queries | Update anomalies |
|
|
121
|
+
| **JSON column** | Flexible schema within structured data | Query complexity, no FK enforcement |
|
|
122
|
+
| **Pre-aggregated table** | Time-series rollups, analytics | Dual-write consistency |
|
|
123
|
+
|
|
124
|
+
Rules:
|
|
125
|
+
- Every denormalization must document: what is denormalized, why, and how consistency is maintained.
|
|
126
|
+
- Prefer materialized views (DB-maintained) over application-level redundancy (app-maintained).
|
|
127
|
+
- JSON columns are not a substitute for proper entity design; use only for genuinely flexible/unstructured data.
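
The first rule (document what is denormalized, why, and how consistency is maintained) can be sketched with a redundant column that the database keeps in sync through a trigger. This is an illustration in `sqlite3` with invented table names. It handles inserts only, so a real schema would need matching UPDATE and DELETE triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    -- Denormalized: sum of line item amounts, kept for hot order-list queries.
    total_cents INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE line_items (
    id           INTEGER PRIMARY KEY,
    order_id     INTEGER NOT NULL REFERENCES orders(id),
    amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0)
);
-- Consistency mechanism: the database updates the copy on every insert.
CREATE TRIGGER line_items_after_insert AFTER INSERT ON line_items
BEGIN
    UPDATE orders SET total_cents = total_cents + NEW.amount_cents
    WHERE id = NEW.order_id;
END;
""")

conn.execute("INSERT INTO orders (id) VALUES (1)")
conn.executemany(
    "INSERT INTO line_items (order_id, amount_cents) VALUES (1, ?)",
    [(250,), (1000,)],
)

stored_total = conn.execute(
    "SELECT total_cents FROM orders WHERE id = 1"
).fetchone()[0]
true_total = conn.execute(
    "SELECT SUM(amount_cents) FROM line_items WHERE order_id = 1"
).fetchone()[0]
```

Comparing the stored value against the recomputed sum, as the last two queries do, is also a cheap periodic check that the redundant copy has not drifted.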
|
|
128
|
+
|
|
129
|
+
## 6. Inheritance Mapping
|
|
130
|
+
|
|
131
|
+
| Strategy | When | Trade-offs |
|
|
132
|
+
|----------|------|-----------|
|
|
133
|
+
| **Single-table** (STI) | Few subtypes, similar attributes, simple queries | Nullable columns, wasted space |
|
|
134
|
+
| **Class-table** | Many shared attributes, need FK to parent | JOIN overhead, complex inserts |
|
|
135
|
+
| **Concrete-table** | Subtypes are queried independently, few shared operations | No polymorphic queries, attribute duplication |
|
|
136
|
+
|
|
137
|
+
### Decision Matrix
|
|
138
|
+
|
|
139
|
+
| Criterion | Single-table | Class-table | Concrete-table |
|
|
140
|
+
|-----------|-------------|-------------|----------------|
|
|
141
|
+
| Query simplicity | Best | Worst | Medium |
|
|
142
|
+
| Storage efficiency | Worst | Best | Medium |
|
|
143
|
+
| Polymorphic queries | Built-in | Requires JOIN | Requires UNION |
|
|
144
|
+
| Subtype isolation | None | Partial | Full |
|
|
145
|
+
| Schema evolution | Easy (add column) | Medium (alter multiple) | Hard (alter each) |
|
|
146
|
+
|
|
147
|
+
Rules:
|
|
148
|
+
- Default to single-table inheritance for <= 3 subtypes with mostly shared attributes.
|
|
149
|
+
- Switch to class-table when subtypes diverge significantly (many subtype-specific columns).
|
|
150
|
+
- Concrete-table only when subtypes are operationally independent and never queried together.
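
A minimal single-table inheritance sketch, assuming a hypothetical `payments` table with two subtypes. The table-level CHECK shows one way to limit the nullable-column risk by tying each subtype-specific column to its discriminator value. It runs on `sqlite3`, and the column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (
    id           INTEGER PRIMARY KEY,
    kind         TEXT    NOT NULL CHECK (kind IN ('card', 'bank_transfer')),
    amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0),
    card_last4   TEXT,   -- only meaningful for kind = 'card'
    iban         TEXT,   -- only meaningful for kind = 'bank_transfer'
    CHECK (
        (kind = 'card'          AND card_last4 IS NOT NULL AND iban IS NULL) OR
        (kind = 'bank_transfer' AND iban IS NOT NULL AND card_last4 IS NULL)
    )
);
""")


def try_insert(kind, card_last4, iban):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO payments (kind, amount_cents, card_last4, iban) "
            "VALUES (?, 100, ?, ?)",
            (kind, card_last4, iban),
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid_card = try_insert("card", "4242", None)
valid_transfer = try_insert("bank_transfer", None, "DE00TEST")
card_with_iban = try_insert("card", "4242", "DE00TEST")

# Polymorphic queries need no JOIN or UNION: every subtype lives in one table.
payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
```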
|
|
151
|
+
|
|
152
|
+
## 7. Temporal Data Modeling
|
|
153
|
+
|
|
154
|
+
| Pattern | Tracks | Implementation |
|
|
155
|
+
|---------|--------|---------------|
|
|
156
|
+
| **SCD Type 1** | Current value only (overwrite) | Simple UPDATE |
|
|
157
|
+
| **SCD Type 2** | Full history with validity periods | `valid_from`, `valid_to`, `is_current` |
|
|
158
|
+
| **SCD Type 3** | Previous + current value | `current_value`, `previous_value` columns |
|
|
159
|
+
| **Bi-temporal** | Both business time and system time | `valid_from`, `valid_to`, `recorded_at`, `superseded_at` |
|
|
160
|
+
|
|
161
|
+
Rules:
|
|
162
|
+
- Financial data requires at minimum SCD Type 2 (full audit trail).
|
|
163
|
+
- Use `valid_to IS NULL` or a boolean `is_current` flag for efficient current-value queries.
|
|
164
|
+
- Bi-temporal modeling is necessary when you need to answer "what did we think was true on date X about date Y?"
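
The SCD Type 2 row can be sketched as a close-then-insert operation. This example uses the `valid_to IS NULL` convention for the current row and omits `is_current`. The `product_prices` table, the helper names, and the dates are hypothetical, and `sqlite3` stores the dates as ISO-8601 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_prices (
    product_id  TEXT    NOT NULL,
    price_cents INTEGER NOT NULL CHECK (price_cents >= 0),
    valid_from  TEXT    NOT NULL,   -- ISO-8601 date
    valid_to    TEXT,               -- NULL marks the current row
    PRIMARY KEY (product_id, valid_from)
);
""")


def set_price(product_id, price_cents, effective):
    """Close the current row (if any) and open a new one, in one transaction."""
    with conn:
        conn.execute(
            "UPDATE product_prices SET valid_to = ? "
            "WHERE product_id = ? AND valid_to IS NULL",
            (effective, product_id),
        )
        conn.execute(
            "INSERT INTO product_prices (product_id, price_cents, valid_from) "
            "VALUES (?, ?, ?)",
            (product_id, price_cents, effective),
        )


def price_on(product_id, day):
    """Return the price valid on `day`, or None if no row covers that date."""
    row = conn.execute(
        "SELECT price_cents FROM product_prices "
        "WHERE product_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (product_id, day, day),
    ).fetchone()
    return row[0] if row else None


set_price("p1", 1000, "2026-01-01")
set_price("p1", 1200, "2026-03-01")
```

History is never overwritten: the January price stays queryable after the March change, which is the audit property the first rule asks for.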
|
|
165
|
+
|
|
166
|
+
## 8. Constraint Specification
|
|
167
|
+
|
|
168
|
+
| Constraint | Purpose | Example |
|
|
169
|
+
|-----------|---------|---------|
|
|
170
|
+
| `NOT NULL` | Attribute is mandatory | Every order has a customer |
|
|
171
|
+
| `UNIQUE` | No duplicate values | Email addresses |
|
|
172
|
+
| `CHECK` | Business rule enforcement | `quantity > 0`, `amount_cents >= 0` |
|
|
173
|
+
| `FOREIGN KEY` | Referential integrity | Order references Customer |
|
|
174
|
+
| `EXCLUDE` | No overlapping ranges | Booking date ranges |
|
|
175
|
+
| `DEFAULT` | Sensible initial value | `created_at DEFAULT now()` |
|
|
176
|
+
|
|
177
|
+
Rules:
|
|
178
|
+
- Push validation to the database whenever possible. Application-level validation is skipped by any writer that bypasses the application (scripts, migrations, other services); DB constraints apply to every writer.
|
|
179
|
+
- Every financial amount column needs `CHECK (amount >= 0)` or explicit handling of negatives.
|
|
180
|
+
- Prefer `ON DELETE CASCADE` for composition (line items); `ON DELETE RESTRICT` for association (customer has orders).
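
The rules above can be exercised end to end in a short sketch. It uses `sqlite3` with an invented customers/orders/line-items schema. `EXCLUDE` is PostgreSQL-specific and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id) ON DELETE RESTRICT
);
CREATE TABLE line_items (
    id       INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
    quantity INTEGER NOT NULL CHECK (quantity > 0)
);
INSERT INTO customers VALUES (1, 'a@example.test');
INSERT INTO orders VALUES (10, 1);
INSERT INTO line_items VALUES (100, 10, 2);
""")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


zero_quantity_rejected = rejected("INSERT INTO line_items VALUES (101, 10, 0)")
duplicate_email_rejected = rejected("INSERT INTO customers VALUES (2, 'a@example.test')")
customer_delete_blocked = rejected("DELETE FROM customers WHERE id = 1")  # association

with conn:
    conn.execute("DELETE FROM orders WHERE id = 10")  # composition: items go with it
items_left = conn.execute("SELECT COUNT(*) FROM line_items").fetchone()[0]
```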
|
|
181
|
+
|
|
182
|
+
## 9. Anti-Patterns
|
|
183
|
+
|
|
184
|
+
| Anti-Pattern | Symptom | Fix |
|
|
185
|
+
|-------------|---------|-----|
|
|
186
|
+
| **EAV (Entity-Attribute-Value)** | Generic `key/value` table instead of proper columns | Design explicit entities; use JSON column for genuinely dynamic attributes |
|
|
187
|
+
| **Polymorphic association** | One FK column + type discriminator pointing to multiple tables | Separate FK per related table, or junction table per relationship type |
|
|
188
|
+
| **Mega-table** | 50+ columns, many nullable | Decompose into related entities by business concern |
|
|
189
|
+
| **Over-normalization** | 15 JOINs for a simple query | Selectively denormalize with materialized views |
|
|
190
|
+
| **God table** | One table serves orders, invoices, quotes, and returns | Separate by business entity; share via FK to common parent if needed |
|
|
191
|
+
| **Missing constraints** | No CHECK, FK, or UNIQUE — all validation in app code | Add database-level constraints as the source of truth |
|
|
192
|
+
| **Implicit deletion** | `is_deleted` boolean instead of proper lifecycle | Use soft delete with `deleted_at` timestamp, or archive tables |
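
The "separate FK per related table" fix for polymorphic associations can be sketched as follows. A `comments` table that may belong to either a post or a photo is a hypothetical example. Each parent gets a real foreign key, and a CHECK requires that exactly one of them is set. The example runs on `sqlite3`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE posts  (id INTEGER PRIMARY KEY);
CREATE TABLE photos (id INTEGER PRIMARY KEY);
CREATE TABLE comments (
    id       INTEGER PRIMARY KEY,
    body     TEXT NOT NULL,
    post_id  INTEGER REFERENCES posts(id)  ON DELETE CASCADE,
    photo_id INTEGER REFERENCES photos(id) ON DELETE CASCADE,
    -- Exactly one parent: one column is NULL and the other is not.
    CHECK ((post_id IS NULL) <> (photo_id IS NULL))
);
INSERT INTO posts VALUES (1);
INSERT INTO photos VALUES (1);
""")


def add_comment(post_id, photo_id):
    """Return True if the comment is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO comments (body, post_id, photo_id) VALUES ('hi', ?, ?)",
                (post_id, photo_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


on_post = add_comment(1, None)
on_both = add_comment(1, 1)
on_nothing = add_comment(None, None)
on_missing_post = add_comment(99, None)  # the FK catches a dangling reference
```

Unlike a single `target_id` plus `target_type` pair, every reference here is checked by the database, so a comment can never point at a row that does not exist.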
|
|
193
|
+
|
|
194
|
+
## Verification
|
|
195
|
+
|
|
196
|
+
> **Scope note:** This checklist covers the implementation (ER) layer — primary keys, foreign keys, normalization, and index strategy. For relationship-level verification (named associations, semantic cardinality), use [`conceptual-modeling`]. For axiom-level verification (formal class definitions, property domains/ranges), use [`ontology`].
|
|
197
|
+
|
|
198
|
+
After applying this skill, verify:
|
|
199
|
+
- [ ] Primary keys are defined for every entity with an explicit strategy (UUID v7 preferred for new tables)
|
|
200
|
+
- [ ] Foreign keys correctly reference parent PKs with explicit ON DELETE behavior (CASCADE for composition, RESTRICT for association)
|
|
201
|
+
- [ ] Normalization level is documented (3NF for OLTP default; any denormalization is justified and documented)
|
|
202
|
+
- [ ] Index strategy is documented — at minimum, covering PKs, FK columns, and query predicates for hot paths
|
|
203
|
+
- [ ] Financial columns have CHECK constraints for valid ranges (`amount >= 0` or explicit negatives handling)
|
|
204
|
+
- [ ] Temporal data has the appropriate SCD type for audit requirements
|
|
205
|
+
- [ ] No EAV or polymorphic association patterns without explicit justification
|
|
206
|
+
|
|
207
|
+
## Do NOT Use When
|
|
208
|
+
|
|
209
|
+
| Instead of this skill | Use | Why |
|
|
210
|
+
|---|---|---|
|
|
211
|
+
| Analyzing business requirements into implementation-independent domain concepts | `conceptual-modeling` | Conceptual modeling captures the domain; ER modeling implements the storage |
|
|
212
|
+
| Defining formal type hierarchies with reasoning and axioms | `ontology` | Ontology is the philosophical layer; ER modeling is the physical layer |
|
|
213
|
+
| Mapping entities between different systems or representations | `relational-mapping` | Relational mapping connects systems; ER modeling designs one system's schema |
|
|
214
|
+
| Running SQL migrations on Neon Postgres | `database-migration` | Migration handles the change process; ER modeling handles the target design |
|
|
215
|
+
|
|
216
|
+
---
|
|
217
|
+
|
|
218
|
+
*Version 1.0.0 — 2026-03-29. Initial creation.*
|
|
@@ -0,0 +1,112 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: epistemic-grounding
|
|
3
|
+
description: "Use when authoring any artifact that makes claims — skill content, documentation, audit findings, architecture proposals, code review comments. Covers the discipline of grounding every claim to a verifiable source, distinguishing verified-by-evidence from inferred-from-context, using normative vocabulary precisely (RFC 2119 MUST/SHOULD/MAY), and structuring arguments so the warrant from data to claim is visible to a reader. Do NOT use for verification protocol mechanics in this repo (use the verification-protocol rule file), for output-completeness enforcement (use methodology), or for self-scoring on a 1-5 scale (use self-evaluation)."
|
|
4
|
+
license: MIT
|
|
5
|
+
allowed-tools: Read Grep
|
|
6
|
+
metadata:
|
|
7
|
+
metadata: "{\"schema_version\":6,\"version\":\"1.0.0\",\"type\":\"capability\",\"category\":\"foundations\",\"domain\":\"foundations/epistemics\",\"scope\":\"reference\",\"owner\":\"skill-graph-maintainer\",\"freshness\":\"2026-05-15\",\"drift_check\":\"{\\\\\\\"last_verified\\\\\\\":\\\\\\\"2026-05-15\\\\\\\"}\",\"eval_artifacts\":\"planned\",\"eval_state\":\"unverified\",\"routing_eval\":\"absent\",\"comprehension_state\":\"present\",\"stability\":\"experimental\",\"keywords\":\"[\\\\\\\"epistemic grounding\\\\\\\",\\\\\\\"claim grounding\\\\\\\",\\\\\\\"source citation\\\\\\\",\\\\\\\"RFC 2119 modality\\\\\\\",\\\\\\\"Toulmin argument\\\\\\\",\\\\\\\"evidence-based assertion\\\\\\\",\\\\\\\"hallucination prevention\\\\\\\",\\\\\\\"normative vocabulary\\\\\\\",\\\\\\\"warrant\\\\\\\",\\\\\\\"epistemic hedge\\\\\\\"]\",\"triggers\":\"[\\\\\\\"ground this claim\\\\\\\",\\\\\\\"cite a source\\\\\\\",\\\\\\\"MUST vs SHOULD\\\\\\\",\\\\\\\"is this verified\\\\\\\",\\\\\\\"how do you know that\\\\\\\"]\",\"examples\":\"[\\\\\\\"before stating that this library supports X, confirm against the actual docs\\\\\\\",\\\\\\\"rewrite this finding so each assertion either cites a file or is marked as inference\\\\\\\",\\\\\\\"should this be a MUST or a SHOULD? 
what's the strength of the claim?\\\\\\\",\\\\\\\"the agent reported 'fix works' but no test was run — flag the gap in grounding\\\\\\\"]\",\"anti_examples\":\"[\\\\\\\"verify every step of an audit task with concrete evidence (use methodology)\\\\\\\",\\\\\\\"decide which lint rule to add for a specific kind of drift (use skill-infrastructure)\\\\\\\",\\\\\\\"evaluate a finished SKILL.md against the comprehension grader (use evaluation)\\\\\\\"]\",\"relations\":\"{\\\\\\\"related\\\\\\\":[\\\\\\\"methodology\\\\\\\",\\\\\\\"semantics\\\\\\\"],\\\\\\\"boundary\\\\\\\":[{\\\\\\\"skill\\\\\\\":\\\\\\\"methodology\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"methodology enforces output-level completeness and step-level evidence receipts; epistemic-grounding is the upstream discipline that decides what counts as evidence in the first place.\\\\\\\"},{\\\\\\\"skill\\\\\\\":\\\\\\\"semantics\\\\\\\",\\\\\\\"reason\\\\\\\":\\\\\\\"semantics owns the rules for naming and meaning-making; epistemic-grounding owns the rules for grounding a claim to a verifiable source.\\\\\\\"}],\\\\\\\"verify_with\\\\\\\":[\\\\\\\"methodology\\\\\\\",\\\\\\\"evaluation\\\\\\\"]}\",\"mental_model\":\"|\",\"purpose\":\"|\",\"boundary\":\"|\",\"analogy\":\"Epistemic grounding is to claims what double-entry bookkeeping is to financial transactions — every assertion has a corresponding source on the other side of the ledger, and any entry without its pair is a red flag in the audit.\",\"misconception\":\"|\",\"concept\":\"{\\\\\\\"definition\\\\\\\":\\\\\\\"Epistemic grounding is the discipline of binding every assertion to a verifiable source, marking the modality (strength) of the claim, and making the warrant (the inference from source to claim) explicit. 
It is the practice that turns a generated statement into a defended statement.\\\\\\\",\\\\\\\"mental_model\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"purpose\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"boundary\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"taxonomy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"analogy\\\\\\\":\\\\\\\"|\\\\\\\",\\\\\\\"misconception\\\\\\\":\\\\\\\"|\\\\\\\"}\",\"skill_graph_source_repo\":\"https://github.com/jacob-balslev/skill-graph\",\"skill_graph_protocol\":\"Skill Metadata Protocol v5\",\"skill_graph_project\":\"Skill Graph\",\"skill_graph_canonical_skill\":\"skills/epistemic-grounding/SKILL.md\"}"
|
|
8
|
+
skill_graph_source_repo: "https://github.com/jacob-balslev/skill-graph"
|
|
9
|
+
skill_graph_protocol: Skill Metadata Protocol v4
|
|
10
|
+
skill_graph_project: Skill Graph
|
|
11
|
+
skill_graph_canonical_skill: skills/epistemic-grounding/SKILL.md
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
# Epistemic Grounding
|
|
15
|
+
|
|
16
|
+
## Coverage
|
|
17
|
+
|
|
18
|
+
The discipline of grounding every claim to a verifiable source, marking the modality (RFC 2119 MUST/SHOULD/MAY) of the claim, and making the warrant from source to claim explicit. Covers the six-primitive Toulmin argument structure (claim/data/warrant/backing/qualifier/rebuttal), the RFC 2119 normative vocabulary, the distinction between verified, inferred, and asserted claims, how to hedge honestly without softening, and the failure modes (cargo-cult citation, stale grounding, vibe-based assertion). Applies to any artifact that makes claims: SKILL.md content, code review comments, audit findings, documentation, architecture proposals, agent output.
|
|
19
|
+
|
|
20
|
+
## Philosophy
|
|
21
|
+
|
|
22
|
+
Confidence is not evidence. An LLM trained to be helpful produces fluent, confident text whether or not the underlying claim is true — RLHF rewards plausibility, not verification. The result is that ungrounded text is indistinguishable from grounded text by surface signals alone: tone, structure, and vocabulary look identical.
|
|
23
|
+
|
|
24
|
+
The countermeasure is not to ask the model to "be more careful" (it can't see its own confidence). It is to require structural signals that make grounding state visible: every claim has a source-warrant-qualifier triple in the text, or it carries an explicit hedge ("not verified", "as of 2026-05-15", "inferred from context"). A reader scanning the text can then tell at a glance which claims are grounded and which are not.
|
|
25
|
+
|
|
26
|
+
This discipline serves three downstream uses: (1) audit and review become tractable because the reviewer knows what to verify; (2) drift detection becomes possible because grounded claims expose their dependencies; (3) downstream agents reading the artifact can decide which claims to trust and which to re-verify.
|
|
27
|
+
|
|
28
|
+
## The Six-Primitive Argument
|
|
29
|
+
|
|
30
|
+
Every grounded claim has six primitives, derived from Toulmin's argument structure (1958). Most agent output surfaces only the Claim and silently elides the rest.
|
|
31
|
+
|
|
32
|
+
| Primitive | What it answers | Example |
|
|
33
|
+
|---|---|---|
|
|
34
|
+
| **Claim** | What is being asserted? | "HTTP DELETE is idempotent." |
|
|
35
|
+
| **Data** | What source supports the claim? | "RFC 9110 § 9.3.5" |
|
|
36
|
+
| **Warrant** | What rule connects data to claim? | "RFC 9110 normatively defines HTTP method semantics; § 9.3.5 specifies DELETE's idempotency." |
|
|
37
|
+
| **Backing** | What gives the warrant authority? | "IETF standards body; the RFC has been ratified and is the canonical reference for HTTP semantics." |
|
|
38
|
+
| **Qualifier** | How strong is the claim? | "MUST" (per RFC 2119) — DELETE is by-definition idempotent. |
|
|
39
|
+
| **Rebuttal** | Under what conditions does the claim fail? | "Server-side state changes that cause subsequent DELETEs to return different status codes do not violate idempotency in the spec sense — idempotency is about the server's resource state, not the response code." |
|
|
40
|
+
|
|
41
|
+
A grounded version of "HTTP DELETE is idempotent" might read in full: *"DELETE is idempotent (RFC 9110 § 9.3.5, MUST). Repeated calls produce the same resource state; response codes may differ (404 after the first 204), which does not violate spec-level idempotency."* That sentence has all six primitives. The ungrounded version — "DELETE is idempotent" — has only the claim.
|
|
42
|
+
|
|
43
|
+
In a SKILL.md, you don't enumerate all six explicitly in every sentence. You inline them via citation form, qualifier word, and parenthetical rebuttal. The discipline is to write each claim such that the six primitives can be *reconstructed* by a reader, not necessarily that they are all surface-visible.
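
One way to picture the six primitives is as a record with one field per primitive, where empty fields are the parts a reader cannot reconstruct. The `GroundedClaim` class and its `missing` method are hypothetical names for this sketch and are not part of the skill or of any tooling. The field values are abbreviated from the table above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GroundedClaim:
    claim: str
    data: str = ""
    warrant: str = ""
    backing: str = ""
    qualifier: str = ""
    rebuttal: str = ""

    def missing(self):
        """Names of the primitives left empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


bare = GroundedClaim(claim="HTTP DELETE is idempotent.")

grounded = GroundedClaim(
    claim="HTTP DELETE is idempotent.",
    data="RFC 9110 § 9.3.5",
    warrant="RFC 9110 normatively defines HTTP method semantics.",
    backing="IETF standards body.",
    qualifier="MUST",
    rebuttal="Response codes may differ between calls; resource state does not.",
)
```

Here `bare.missing()` lists five names and `grounded.missing()` is empty, which mirrors the contrast between the ungrounded and grounded sentences above.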
|
|
44
|
+
|
|
45
|
+
## RFC 2119 Modality

The normative vocabulary from RFC 2119 gives Qualifiers a shared, machine-readable meaning. Use these words *only* in their RFC 2119 sense; in non-normative writing, use weaker words.

| Word | Meaning |
|---|---|
| **MUST** / **REQUIRED** / **SHALL** | Absolute requirement. The claim is binding; violation breaks the contract. |
| **MUST NOT** / **SHALL NOT** | Absolute prohibition. |
| **SHOULD** / **RECOMMENDED** | Strong recommendation. Violation requires understanding and justifying the deviation. |
| **SHOULD NOT** / **NOT RECOMMENDED** | Strong recommendation against. |
| **MAY** / **OPTIONAL** | Truly optional. Either choice is conformant. |

A skill that says "the handler MUST verify HMAC" makes a different commitment than "the handler should verify HMAC" (lowercase, weak). The capitalized RFC 2119 form has contract weight; the lowercase form is advisory English. Mixing them is a grounding failure.

In SKILL.md authoring, use RFC 2119 vocabulary *only* when the claim is genuinely normative (a protocol requirement, a security invariant, a financial correctness rule). For preferences and patterns, use weaker prose: "prefer", "by default", "in this repo we use".

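The uppercase/lowercase distinction can be checked mechanically. The following is a minimal sketch, not part of the package: `scan_modality` is a hypothetical helper that buckets RFC 2119 key words by case. It will also flag ordinary English uses such as "may" or "required", so its output is a review aid rather than a verdict.

```python
import re

# RFC 2119 key words. Multi-word forms come first so that "MUST NOT"
# is matched as one key word rather than as "MUST".
KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]
_PATTERN = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def scan_modality(text: str) -> dict[str, list[str]]:
    """Split key-word hits into normative (ALL CAPS) and advisory (any other case)."""
    hits: dict[str, list[str]] = {"normative": [], "advisory": []}
    for match in _PATTERN.finditer(text):
        word = " ".join(match.group(1).split())  # collapse internal whitespace
        bucket = "normative" if word.isupper() else "advisory"
        hits[bucket].append(word)
    return hits


sample = "The handler MUST verify HMAC. Clients should retry, and MUST NOT log secrets."
result = scan_modality(sample)
```

A paragraph whose `normative` and `advisory` buckets are both non-empty is a candidate for the mixing failure described above and is worth a second read.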
## Verified vs Inferred vs Asserted

Every claim falls into one of three states. Epistemic grounding requires labeling the state explicitly when it isn't obvious from context.

| State | Definition | Required marker |
|---|---|---|
| **Verified** | The claim was confirmed by a tool call (Read, Grep, curl, test run) in the same writing session or has an attached evidence receipt. | Cite the verification ("ran `npm test`, all 47 passed") or attach the source path with a line range. |
| **Inferred** | The claim is a reasonable conclusion from cited evidence, but the conclusion itself wasn't directly tested. | Mark with "inferred from", "follows from", or a hedge like "likely". |
| **Asserted** | The claim is from the writer's prior knowledge with no in-session verification. | Mark with "as of <date>", "I haven't verified this", or "in my experience". This is the weakest state and should be rare in grounded writing. |

The failure mode is silent state-drift: presenting an assertion as if it were verified, or an inference as if it were directly observed. The discipline is to mark the state when it differs from what context implies.

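The markers in the table lend themselves to a simple lookup. The sketch below is illustrative only: `State`, `MARKERS`, and `classify` are hypothetical names, the inferred and asserted phrases come from the table, and the verified phrases (`ran` followed by a backtick, "verified by", "confirmed by") are an assumption made for this example.

```python
from enum import Enum


class State(Enum):
    VERIFIED = "verified"
    INFERRED = "inferred"
    ASSERTED = "asserted"
    UNMARKED = "unmarked"  # no marker found; the reader must guess the state


# Lowercase marker phrases per state. The VERIFIED phrases are assumed.
MARKERS = {
    State.VERIFIED: ("ran `", "verified by", "confirmed by"),
    State.INFERRED: ("inferred from", "follows from", "likely"),
    State.ASSERTED: ("as of ", "i haven't verified", "in my experience"),
}


def classify(sentence: str) -> State:
    """Return the first state whose marker phrase appears in the sentence."""
    lowered = sentence.lower()
    for state, phrases in MARKERS.items():
        if any(phrase in lowered for phrase in phrases):
            return state
    return State.UNMARKED
```

Substring matching is deliberately crude; the useful output is the list of `UNMARKED` sentences, since those are where silent state-drift can hide.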
## Failure Modes

| Failure | What it looks like | Why it fails |
|---|---|---|
| **Cargo-cult citation** | A reference is added at the end of a paragraph, but doesn't actually support the specific claims in that paragraph. | The warrant from source to claim is missing; the citation is decorative. |
| **Stale grounding** | A claim cites a source that has since been moved, renamed, or rewritten. | The grounding was once valid but isn't anymore; the reader trusts it without knowing. |
| **Vibe-based assertion** | A claim is stated with no marker, in a context where the reader will assume it's grounded. | Silent state-drift from asserted to verified. |
| **Overstated modality** | "MUST" used for a preference; "MAY" used for a hard requirement. | RFC 2119 vocabulary loses its load-bearing meaning. |
| **Pseudo-hedge** | "Probably", "in most cases", "generally" applied where the writer actually knows the answer. | Soft hedging looks like grounding but obscures the actual state of knowledge. |
| **Citation laundering** | Citing a secondary source (a blog post citing an RFC) without going to the primary. | The chain of warrants is weak; the secondary source may have misread the primary. |
| **Authority projection** | Citing a high-authority source for a claim it doesn't actually make. | The source's authority is borrowed for a claim the source never made. |

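Of these failure modes, stale grounding is the easiest to check mechanically when citations point at local files. The sketch below assumes a `path:start-end` citation form, which is an illustrative convention rather than one the package defines, and `check_citation` is a hypothetical helper. It catches moved, renamed, or shortened files; a file rewritten in place at the same length would still pass.

```python
from pathlib import Path


def check_citation(root: Path, citation: str) -> str:
    """Return 'ok', 'missing-file', or 'out-of-range' for a 'path:start-end' citation."""
    rel_path, _, line_range = citation.partition(":")
    target = root / rel_path
    if not target.is_file():
        return "missing-file"
    if not line_range:
        return "ok"  # bare path with no line range: existence is all we can check
    start, _, end = line_range.partition("-")
    first = int(start)
    last = int(end or start)  # "path:12" cites a single line
    line_count = len(target.read_text(encoding="utf-8").splitlines())
    return "ok" if 1 <= first <= last <= line_count else "out-of-range"
```

For example, `check_citation(Path("."), "src/handler.py:10-24")` would report `missing-file` after a rename, which is the signal to re-verify the claim rather than just update the path.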
## Verification

After applying this skill, verify:

- [ ] Every claim that would surprise a domain expert has a source attached.
- [ ] RFC 2119 vocabulary is used only in normative claims, not in advisory prose.
- [ ] No paragraph has trailing-citation decoration where the citation doesn't tie to specific claims.
- [ ] Claims that are inferences (not taken from a cited source) are explicitly marked.
- [ ] The strongest claims have the most explicit grounding; the weakest claims have hedges.
- [ ] No "MUST" applies to a preference; no "MAY" applies to a requirement.
- [ ] At least one Rebuttal is acknowledged for the central claim of the artifact.

## Do NOT Use When

| Instead of this skill | Use | Why |
|---|---|---|
| Enforcing step-level evidence receipts and output completeness | `methodology` | methodology owns the execution discipline; this skill is the upstream grounding discipline that decides what counts as evidence in the first place |
| Naming things precisely (variables, functions, files) | `semantics` | semantics owns naming precision; this skill owns claim grounding |
| Drawing inferences from premises | `reasoning` | reasoning is the cognitive primitive; this skill is the marking discipline for distinguishing inference from observation |
| Verifying that a specific implementation works | `evaluation` or repo-local verification protocol | evaluation owns grader frameworks; this skill is the structural discipline upstream of any verification |
| Designing the rules of a skill audit | `skill-infrastructure` | skill-infrastructure owns lint and census tooling; this skill governs what counts as a grounded claim that lint might check |

## Key Sources

- Toulmin, S. (1958). *The Uses of Argument*. Cambridge University Press. The canonical six-primitive argument structure (claim/data/warrant/backing/qualifier/rebuttal).
- Bradner, S. (1997). [RFC 2119: Key words for use in RFCs to Indicate Requirement Levels](https://datatracker.ietf.org/doc/html/rfc2119). IETF. The standardized MUST/SHOULD/MAY normative vocabulary.
- Leiba, B. (2017). [RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words](https://datatracker.ietf.org/doc/html/rfc8174). IETF. Clarifies that only ALL-CAPS forms carry RFC 2119 weight.
- Northeastern University (2025). "AI sycophancy: 58.19% rate across frontier models." The measurement that justifies structural countermeasures over behavioral ones.
- Royal Society Open Science (2025). "LLM summarization bias: overgeneralization in 26-73% of cases." The measurement that justifies explicit source-to-claim warrant tracking.