rsc-universal 0.1.1
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +279 -0
- package/manifest.json +4761 -0
- package/package.json +59 -0
- package/schema/frontmatter.schema.json +12 -0
- package/scripts/build-manifest.js +72 -0
- package/scripts/consult.js +106 -0
- package/scripts/detect-repo.js +118 -0
- package/scripts/doctor.js +21 -0
- package/scripts/eval-lint.sh +179 -0
- package/scripts/install-apply.js +52 -0
- package/scripts/install-plan.js +13 -0
- package/scripts/lib/behavior-score.js +103 -0
- package/scripts/lib/frontmatter.js +47 -0
- package/scripts/lib/harden-policy.js +41 -0
- package/scripts/lib/manifest.js +18 -0
- package/scripts/lib/recommend.js +36 -0
- package/scripts/lib/registry.js +110 -0
- package/scripts/lib/result-envelope.js +35 -0
- package/scripts/lib/state.js +12 -0
- package/scripts/lib/ui.js +17 -0
- package/scripts/reviewer-guard.sh +67 -0
- package/scripts/rsc.js +108 -0
- package/scripts/skill-behavior-eval.js +33 -0
- package/scripts/skill-behavior-eval.workflow.js +136 -0
- package/scripts/skill-behavior-rubric.md +63 -0
- package/scripts/skill-harden-rubric.md +40 -0
- package/scripts/skill-harden.workflow.js +161 -0
- package/scripts/skill-rubric.md +39 -0
- package/scripts/skill-scoreboard.workflow.js +35 -0
- package/skills/ab-testing/SKILL.md +191 -0
- package/skills/ab-testing/evals/README.md +8 -0
- package/skills/ab-testing/evals/cases.yaml +49 -0
- package/skills/ab-testing/references/pitfalls.md +74 -0
- package/skills/ab-testing/references/sample-size-and-cuped.md +128 -0
- package/skills/ab-testing/scripts/verify.sh +89 -0
- package/skills/accessibility/SKILL.md +218 -0
- package/skills/accessibility/evals/README.md +3 -0
- package/skills/accessibility/evals/cases.yaml +47 -0
- package/skills/accessibility/references/aria-patterns.md +113 -0
- package/skills/accessibility/references/wcag22-checklist.md +83 -0
- package/skills/accessibility/scripts/verify.sh +103 -0
- package/skills/ads/SKILL.md +175 -0
- package/skills/ads/evals/README.md +15 -0
- package/skills/ads/evals/cases.yaml +58 -0
- package/skills/ads/references/platform-specs.md +73 -0
- package/skills/ads/references/roas-model.md +77 -0
- package/skills/ads/scripts/verify.sh +210 -0
- package/skills/agent-eval/SKILL.md +213 -0
- package/skills/agent-eval/evals/README.md +12 -0
- package/skills/agent-eval/evals/cases.yaml +45 -0
- package/skills/agent-eval/references/judge-design.md +118 -0
- package/skills/agent-eval/references/runner-and-gate.md +183 -0
- package/skills/agent-eval/scripts/verify.sh +161 -0
- package/skills/agent-safety/SKILL.md +176 -0
- package/skills/agent-safety/evals/README.md +12 -0
- package/skills/agent-safety/evals/cases.yaml +46 -0
- package/skills/agent-safety/references/threat-model.md +51 -0
- package/skills/ai-media/SKILL.md +196 -0
- package/skills/ai-media/evals/README.md +3 -0
- package/skills/ai-media/evals/cases.yaml +45 -0
- package/skills/ai-media/references/ffmpeg-assembly.md +117 -0
- package/skills/ai-media/references/models-and-params.md +78 -0
- package/skills/ai-media/scripts/verify.sh +103 -0
- package/skills/analytics/SKILL.md +219 -0
- package/skills/analytics/evals/README.md +9 -0
- package/skills/analytics/evals/cases.yaml +53 -0
- package/skills/analytics/references/event-taxonomy.md +75 -0
- package/skills/analytics/references/ga4-setup.md +122 -0
- package/skills/analytics/references/posthog-setup.md +100 -0
- package/skills/analytics/scripts/verify.sh +95 -0
- package/skills/analyze/SKILL.md +136 -0
- package/skills/analyze/evals/README.md +72 -0
- package/skills/analyze/evals/cases.yaml +74 -0
- package/skills/angular/SKILL.md +288 -0
- package/skills/angular/evals/README.md +3 -0
- package/skills/angular/evals/cases.yaml +38 -0
- package/skills/angular/references/migration.md +81 -0
- package/skills/angular/references/signals-rxjs.md +92 -0
- package/skills/angular/scripts/verify.sh +122 -0
- package/skills/api-connector-builder/SKILL.md +285 -0
- package/skills/api-connector-builder/evals/README.md +11 -0
- package/skills/api-connector-builder/evals/cases.yaml +47 -0
- package/skills/api-connector-builder/references/auth-flows.md +132 -0
- package/skills/api-connector-builder/references/pagination.md +144 -0
- package/skills/api-connector-builder/scripts/verify.sh +172 -0
- package/skills/api-design/SKILL.md +189 -0
- package/skills/api-design/evals/README.md +3 -0
- package/skills/api-design/evals/cases.yaml +45 -0
- package/skills/api-design/references/graphql-design.md +70 -0
- package/skills/api-design/references/openapi-contract.md +86 -0
- package/skills/api-design/references/rest-conventions.md +63 -0
- package/skills/api-design/references/versioning-and-evolution.md +49 -0
- package/skills/api-design/scripts/verify.sh +138 -0
- package/skills/article-writing/SKILL.md +175 -0
- package/skills/article-writing/evals/README.md +3 -0
- package/skills/article-writing/evals/cases.yaml +47 -0
- package/skills/article-writing/references/ai-tell-banlist.md +114 -0
- package/skills/article-writing/references/on-page-seo.md +133 -0
- package/skills/article-writing/scripts/verify.sh +165 -0
- package/skills/astro/SKILL.md +275 -0
- package/skills/astro/evals/README.md +3 -0
- package/skills/astro/evals/cases.yaml +41 -0
- package/skills/astro/references/content-layer.md +118 -0
- package/skills/astro/references/deploy-and-integrations.md +163 -0
- package/skills/astro/scripts/verify.sh +137 -0
- package/skills/author-skill/SKILL.md +206 -0
- package/skills/author-skill/evals/README.md +66 -0
- package/skills/author-skill/evals/cases.yaml +75 -0
- package/skills/author-skill/references/description-recipe.md +84 -0
- package/skills/author-skill/references/eval-authoring.md +74 -0
- package/skills/author-skill/references/rsc-conventions.md +91 -0
- package/skills/automation-flows/SKILL.md +132 -0
- package/skills/automation-flows/evals/README.md +5 -0
- package/skills/automation-flows/evals/cases.yaml +44 -0
- package/skills/automation-flows/references/error-handling.md +58 -0
- package/skills/automation-flows/references/n8n-workflow-json.md +63 -0
- package/skills/automation-flows/scripts/verify.sh +78 -0
- package/skills/aws-essentials/SKILL.md +223 -0
- package/skills/aws-essentials/evals/README.md +10 -0
- package/skills/aws-essentials/evals/cases.yaml +44 -0
- package/skills/aws-essentials/references/iam-least-privilege.md +134 -0
- package/skills/aws-essentials/references/rds-cloudfront-recipes.md +127 -0
- package/skills/aws-essentials/scripts/verify.sh +99 -0
- package/skills/backups/SKILL.md +137 -0
- package/skills/backups/evals/README.md +3 -0
- package/skills/backups/evals/cases.yaml +42 -0
- package/skills/backups/references/engine-recipes.md +121 -0
- package/skills/backups/references/restore-runbook.md +65 -0
- package/skills/backups/scripts/verify.sh +80 -0
- package/skills/bash-scripting/SKILL.md +231 -0
- package/skills/bash-scripting/evals/README.md +3 -0
- package/skills/bash-scripting/evals/cases.yaml +45 -0
- package/skills/bash-scripting/references/portability.md +97 -0
- package/skills/bash-scripting/scripts/verify.sh +140 -0
- package/skills/bookkeeping/SKILL.md +184 -0
- package/skills/bookkeeping/evals/README.md +5 -0
- package/skills/bookkeeping/evals/cases.yaml +52 -0
- package/skills/bookkeeping/references/chart-of-accounts.md +87 -0
- package/skills/bookkeeping/references/reconciliation-playbook.md +54 -0
- package/skills/bookkeeping/references/tricky-transactions.md +192 -0
- package/skills/brand-identity/SKILL.md +161 -0
- package/skills/brand-identity/evals/README.md +14 -0
- package/skills/brand-identity/evals/cases.yaml +43 -0
- package/skills/brand-identity/references/color-and-tokens.md +129 -0
- package/skills/brand-identity/references/logo-and-assets.md +117 -0
- package/skills/brand-identity/scripts/verify.sh +224 -0
- package/skills/brand-voice/SKILL.md +183 -0
- package/skills/brand-voice/evals/README.md +3 -0
- package/skills/brand-voice/evals/cases.yaml +57 -0
- package/skills/brand-voice/references/voice-guide-template.md +150 -0
- package/skills/brand-voice/references/word-bank.md +61 -0
- package/skills/brand-voice/scripts/verify.sh +190 -0
- package/skills/building-agents/SKILL.md +469 -0
- package/skills/building-agents/evals/README.md +68 -0
- package/skills/building-agents/evals/cases.yaml +60 -0
- package/skills/building-agents/references/agent-loops-and-harness.md +371 -0
- package/skills/building-agents/references/evals-and-observability.md +420 -0
- package/skills/building-agents/references/mcp-servers.md +294 -0
- package/skills/building-agents/references/provider-abstraction.md +489 -0
- package/skills/building-agents/references/tools-and-rag.md +417 -0
- package/skills/building-agents/scripts/verify.sh +121 -0
- package/skills/business-intelligence/SKILL.md +176 -0
- package/skills/business-intelligence/evals/README.md +3 -0
- package/skills/business-intelligence/evals/cases.yaml +43 -0
- package/skills/business-intelligence/references/authoring-semantic-models.md +120 -0
- package/skills/business-intelligence/references/wiring-agents-and-apis.md +79 -0
- package/skills/business-intelligence/scripts/verify.sh +143 -0
- package/skills/calendar-scheduling/SKILL.md +196 -0
- package/skills/calendar-scheduling/evals/README.md +14 -0
- package/skills/calendar-scheduling/evals/cases.yaml +45 -0
- package/skills/calendar-scheduling/references/google-calendar-sync.md +78 -0
- package/skills/calendar-scheduling/references/provider-matrix.md +71 -0
- package/skills/calendar-scheduling/scripts/verify.sh +117 -0
- package/skills/case-studies/SKILL.md +147 -0
- package/skills/case-studies/evals/README.md +3 -0
- package/skills/case-studies/evals/cases.yaml +63 -0
- package/skills/case-studies/references/case-study-skeleton.md +90 -0
- package/skills/case-studies/references/consent-and-substantiation.md +80 -0
- package/skills/case-studies/scripts/verify.sh +161 -0
- package/skills/chatbot/SKILL.md +168 -0
- package/skills/chatbot/evals/README.md +13 -0
- package/skills/chatbot/evals/cases.yaml +43 -0
- package/skills/chatbot/references/handoff-and-sales.md +71 -0
- package/skills/chatbot/references/system-prompt-and-guardrails.md +78 -0
- package/skills/chatbot/scripts/verify.sh +162 -0
- package/skills/chrome-extension/SKILL.md +169 -0
- package/skills/chrome-extension/evals/README.md +12 -0
- package/skills/chrome-extension/evals/cases.yaml +40 -0
- package/skills/chrome-extension/references/store-and-migration.md +84 -0
- package/skills/chrome-extension/scripts/verify.sh +62 -0
- package/skills/clarify/SKILL.md +159 -0
- package/skills/clarify/evals/README.md +70 -0
- package/skills/clarify/evals/cases.yaml +71 -0
- package/skills/clickhouse-analytics/SKILL.md +165 -0
- package/skills/clickhouse-analytics/evals/README.md +3 -0
- package/skills/clickhouse-analytics/evals/cases.yaml +45 -0
- package/skills/clickhouse-analytics/references/ingestion-and-mvs.md +109 -0
- package/skills/clickhouse-analytics/references/query-optimization.md +76 -0
- package/skills/clickhouse-analytics/references/schema-and-engines.md +63 -0
- package/skills/clickhouse-analytics/scripts/verify.sh +109 -0
- package/skills/client-onboarding/SKILL.md +254 -0
- package/skills/client-onboarding/evals/README.md +14 -0
- package/skills/client-onboarding/evals/cases.yaml +40 -0
- package/skills/client-onboarding/references/onboarding-playbook.md +126 -0
- package/skills/cloudflare/SKILL.md +191 -0
- package/skills/cloudflare/evals/README.md +15 -0
- package/skills/cloudflare/evals/cases.yaml +46 -0
- package/skills/cloudflare/references/storage-primitives.md +104 -0
- package/skills/cloudflare/references/wrangler-config.md +91 -0
- package/skills/cloudflare/scripts/verify.sh +133 -0
- package/skills/code-review/SKILL.md +143 -0
- package/skills/code-review/evals/README.md +3 -0
- package/skills/code-review/evals/cases.yaml +55 -0
- package/skills/code-review/references/pr-workflow.md +67 -0
- package/skills/codebase-onboarding/SKILL.md +133 -0
- package/skills/codebase-onboarding/evals/README.md +3 -0
- package/skills/codebase-onboarding/evals/cases.yaml +69 -0
- package/skills/codebase-onboarding/references/recon-playbook.md +57 -0
- package/skills/codebase-onboarding/scripts/verify.sh +54 -0
- package/skills/cold-outreach/SKILL.md +206 -0
- package/skills/cold-outreach/evals/README.md +3 -0
- package/skills/cold-outreach/evals/cases.yaml +60 -0
- package/skills/cold-outreach/references/compliance-footer.md +50 -0
- package/skills/cold-outreach/references/hook-derivation.md +73 -0
- package/skills/cold-outreach/references/templates.md +88 -0
- package/skills/cold-outreach/scripts/verify.sh +170 -0
- package/skills/community/SKILL.md +225 -0
- package/skills/community/evals/README.md +3 -0
- package/skills/community/evals/cases.yaml +40 -0
- package/skills/community/references/metrics-and-rituals.md +58 -0
- package/skills/community/references/platform-playbooks.md +64 -0
- package/skills/community/scripts/verify.sh +83 -0
- package/skills/competitor-watch/SKILL.md +193 -0
- package/skills/competitor-watch/evals/README.md +19 -0
- package/skills/competitor-watch/evals/cases.yaml +54 -0
- package/skills/competitor-watch/references/monitoring-config.md +124 -0
- package/skills/competitor-watch/references/tracker-schema.md +79 -0
- package/skills/competitor-watch/scripts/verify.sh +253 -0
- package/skills/compliance/SKILL.md +184 -0
- package/skills/compliance/evals/README.md +14 -0
- package/skills/compliance/evals/cases.yaml +46 -0
- package/skills/compliance/references/frameworks.md +108 -0
- package/skills/compliance/references/operating-rhythm.md +79 -0
- package/skills/compliance/scripts/verify.sh +168 -0
- package/skills/compose-multiplatform/SKILL.md +198 -0
- package/skills/compose-multiplatform/evals/README.md +3 -0
- package/skills/compose-multiplatform/evals/cases.yaml +40 -0
- package/skills/compose-multiplatform/references/ios-interop.md +91 -0
- package/skills/compose-multiplatform/references/project-setup.md +96 -0
- package/skills/compose-multiplatform/scripts/verify.sh +123 -0
- package/skills/constitution/SKILL.md +160 -0
- package/skills/constitution/evals/README.md +68 -0
- package/skills/constitution/evals/cases.yaml +72 -0
- package/skills/constitution/references/constitution-template.md +90 -0
- package/skills/content-engine/SKILL.md +164 -0
- package/skills/content-engine/evals/README.md +17 -0
- package/skills/content-engine/evals/cases.yaml +62 -0
- package/skills/content-engine/references/atomization.md +81 -0
- package/skills/content-engine/references/brief-and-pipeline.md +90 -0
- package/skills/content-engine/scripts/verify.sh +146 -0
- package/skills/context-budget/SKILL.md +132 -0
- package/skills/context-budget/evals/README.md +11 -0
- package/skills/context-budget/evals/cases.yaml +40 -0
- package/skills/context-budget/references/handoff-and-compaction.md +96 -0
- package/skills/continuous-learning/SKILL.md +136 -0
- package/skills/continuous-learning/evals/README.md +16 -0
- package/skills/continuous-learning/evals/cases.yaml +39 -0
- package/skills/continuous-learning/references/lesson-routing.md +106 -0
- package/skills/contracts/SKILL.md +124 -0
- package/skills/contracts/evals/README.md +3 -0
- package/skills/contracts/evals/cases.yaml +42 -0
- package/skills/contracts/references/clause-library.md +129 -0
- package/skills/contracts/references/review-playbook.md +49 -0
- package/skills/contracts/scripts/verify.sh +53 -0
- package/skills/coolify/SKILL.md +201 -0
- package/skills/coolify/evals/README.md +21 -0
- package/skills/coolify/evals/cases.yaml +46 -0
- package/skills/coolify/references/databases-and-backups.md +99 -0
- package/skills/coolify/references/deploy-recipes.md +105 -0
- package/skills/coolify/references/install-and-proxy.md +80 -0
- package/skills/coolify/scripts/verify.sh +123 -0
- package/skills/cost-tracking/SKILL.md +183 -0
- package/skills/cost-tracking/evals/README.md +3 -0
- package/skills/cost-tracking/evals/cases.yaml +45 -0
- package/skills/cost-tracking/references/cloud-caps.md +52 -0
- package/skills/cost-tracking/references/pricing-tables.md +51 -0
- package/skills/cost-tracking/scripts/verify.sh +135 -0
- package/skills/course-builder/SKILL.md +186 -0
- package/skills/course-builder/evals/README.md +16 -0
- package/skills/course-builder/evals/cases.yaml +49 -0
- package/skills/course-builder/references/assessment-design.md +74 -0
- package/skills/course-builder/references/grounding-and-scoping.md +69 -0
- package/skills/course-builder/references/outcomes-and-blooms.md +82 -0
- package/skills/course-builder/scripts/verify.sh +247 -0
- package/skills/course-storytelling/SKILL.md +205 -0
- package/skills/course-storytelling/evals/README.md +54 -0
- package/skills/course-storytelling/evals/cases.yaml +50 -0
- package/skills/course-storytelling/references/brunson-frameworks.md +190 -0
- package/skills/course-storytelling/references/concept-landing-recipe.md +136 -0
- package/skills/course-storytelling/references/course-analysis.md +124 -0
- package/skills/course-storytelling/references/learner-grounding.md +183 -0
- package/skills/course-storytelling/references/mental-models.md +115 -0
- package/skills/course-storytelling/scripts/verify.sh +223 -0
- package/skills/cpp/SKILL.md +349 -0
- package/skills/cpp/evals/README.md +14 -0
- package/skills/cpp/evals/cases.yaml +44 -0
- package/skills/cpp/references/cmake.md +167 -0
- package/skills/cpp/references/move-and-templates.md +130 -0
- package/skills/cpp/references/undefined-behavior.md +86 -0
- package/skills/cpp/scripts/verify.sh +165 -0
- package/skills/csharp-dotnet/SKILL.md +291 -0
- package/skills/csharp-dotnet/evals/README.md +3 -0
- package/skills/csharp-dotnet/evals/cases.yaml +48 -0
- package/skills/csharp-dotnet/references/aspnetcore.md +99 -0
- package/skills/csharp-dotnet/references/async.md +82 -0
- package/skills/csharp-dotnet/references/efcore.md +96 -0
- package/skills/csharp-dotnet/scripts/verify.sh +90 -0
- package/skills/customer-support/SKILL.md +193 -0
- package/skills/customer-support/evals/README.md +13 -0
- package/skills/customer-support/evals/cases.yaml +61 -0
- package/skills/customer-support/references/macros-and-sla.md +142 -0
- package/skills/dashboard/SKILL.md +205 -0
- package/skills/dashboard/evals/README.md +3 -0
- package/skills/dashboard/evals/cases.yaml +50 -0
- package/skills/dashboard/references/chart-selection.md +34 -0
- package/skills/dashboard/references/tile-schema.md +164 -0
- package/skills/dashboard/scripts/verify.sh +130 -0
- package/skills/data-cleaning/SKILL.md +285 -0
- package/skills/data-cleaning/evals/README.md +16 -0
- package/skills/data-cleaning/evals/cases.yaml +57 -0
- package/skills/data-cleaning/references/normalization-recipes.md +136 -0
- package/skills/data-cleaning/references/validation-patterns.md +134 -0
- package/skills/data-cleaning/scripts/verify.sh +115 -0
- package/skills/data-policy/SKILL.md +163 -0
- package/skills/data-policy/evals/README.md +15 -0
- package/skills/data-policy/evals/cases.yaml +44 -0
- package/skills/data-policy/references/consent-and-ropa.md +97 -0
- package/skills/data-policy/references/retention-schedule.md +83 -0
- package/skills/data-policy/scripts/verify.sh +143 -0
- package/skills/data-scraper/SKILL.md +134 -0
- package/skills/data-scraper/evals/README.md +3 -0
- package/skills/data-scraper/evals/cases.yaml +46 -0
- package/skills/data-scraper/references/anti-bot.md +85 -0
- package/skills/data-scraper/references/frameworks.md +116 -0
- package/skills/data-scraper/references/legal-compliance.md +59 -0
- package/skills/data-scraper/scripts/verify.sh +166 -0
- package/skills/db-migrations/SKILL.md +254 -0
- package/skills/db-migrations/evals/README.md +10 -0
- package/skills/db-migrations/evals/cases.yaml +46 -0
- package/skills/db-migrations/references/backfill-and-batching.md +105 -0
- package/skills/db-migrations/references/expand-contract-playbook.md +152 -0
- package/skills/db-migrations/references/tools-and-runners.md +88 -0
- package/skills/db-migrations/scripts/verify.sh +112 -0
- package/skills/debug/SKILL.md +227 -0
- package/skills/debug/evals/README.md +88 -0
- package/skills/debug/evals/cases.yaml +74 -0
- package/skills/decision-records/SKILL.md +189 -0
- package/skills/decision-records/evals/README.md +3 -0
- package/skills/decision-records/evals/cases.yaml +43 -0
- package/skills/decision-records/references/templates.md +232 -0
- package/skills/decision-records/scripts/verify.sh +105 -0
- package/skills/deployment/SKILL.md +439 -0
- package/skills/deployment/evals/README.md +50 -0
- package/skills/deployment/evals/cases.yaml +53 -0
- package/skills/deployment/references/coolify.md +216 -0
- package/skills/deployment/references/dockerfiles-by-stack.md +319 -0
- package/skills/deployment/references/github-actions.md +295 -0
- package/skills/deployment/references/hosting-targets.md +272 -0
- package/skills/deployment/scripts/verify.sh +134 -0
- package/skills/design/SKILL.md +399 -0
- package/skills/design/evals/README.md +53 -0
- package/skills/design/evals/cases.yaml +56 -0
- package/skills/design/references/brand-grounding.md +187 -0
- package/skills/design/references/copywriting-frameworks.md +138 -0
- package/skills/design/references/landing-anatomy-and-cro.md +202 -0
- package/skills/design/references/motion-and-interaction.md +182 -0
- package/skills/design/references/research-method.md +147 -0
- package/skills/design/references/signature-and-craft.md +148 -0
- package/skills/design/references/trends-2026.md +80 -0
- package/skills/design/references/visual-system.md +236 -0
- package/skills/design/scripts/verify.sh +248 -0
- package/skills/digitalocean/SKILL.md +251 -0
- package/skills/digitalocean/evals/README.md +10 -0
- package/skills/digitalocean/evals/cases.yaml +37 -0
- package/skills/digitalocean/references/app-spec.md +126 -0
- package/skills/digitalocean/references/droplet-ops.md +95 -0
- package/skills/digitalocean/scripts/verify.sh +102 -0
- package/skills/django/SKILL.md +268 -0
- package/skills/django/evals/README.md +11 -0
- package/skills/django/evals/cases.yaml +47 -0
- package/skills/django/references/drf.md +109 -0
- package/skills/django/references/orm-performance.md +91 -0
- package/skills/django/references/security.md +81 -0
- package/skills/django/references/testing.md +86 -0
- package/skills/django/scripts/verify.sh +115 -0
- package/skills/docker/SKILL.md +283 -0
- package/skills/docker/evals/README.md +10 -0
- package/skills/docker/evals/cases.yaml +44 -0
- package/skills/docker/references/base-images-and-stages.md +104 -0
- package/skills/docker/references/compose-recipes.md +109 -0
- package/skills/docker/scripts/verify.sh +149 -0
- package/skills/document-processing/SKILL.md +214 -0
- package/skills/document-processing/evals/README.md +3 -0
- package/skills/document-processing/evals/cases.yaml +65 -0
- package/skills/document-processing/references/engines.md +67 -0
- package/skills/document-processing/scripts/verify.sh +172 -0
- package/skills/domains-dns/SKILL.md +146 -0
- package/skills/domains-dns/evals/README.md +16 -0
- package/skills/domains-dns/evals/cases.yaml +47 -0
- package/skills/domains-dns/references/record-cookbook.md +94 -0
- package/skills/domains-dns/references/tls-and-acme.md +90 -0
- package/skills/domains-dns/references/verify-and-debug.md +64 -0
- package/skills/domains-dns/scripts/verify.sh +163 -0
- package/skills/drizzle-orm/SKILL.md +234 -0
- package/skills/drizzle-orm/evals/README.md +12 -0
- package/skills/drizzle-orm/evals/cases.yaml +47 -0
- package/skills/drizzle-orm/references/relations-and-drivers.md +118 -0
- package/skills/drizzle-orm/scripts/verify.sh +155 -0
- package/skills/duckdb/SKILL.md +207 -0
- package/skills/duckdb/evals/README.md +31 -0
- package/skills/duckdb/evals/cases.yaml +41 -0
- package/skills/duckdb/references/python-and-interop.md +105 -0
- package/skills/duckdb/references/remote-and-lakehouse.md +101 -0
- package/skills/duckdb/scripts/verify.sh +71 -0
- package/skills/dynamodb/SKILL.md +217 -0
- package/skills/dynamodb/evals/README.md +8 -0
- package/skills/dynamodb/evals/cases.yaml +46 -0
- package/skills/dynamodb/references/access-patterns.md +127 -0
- package/skills/dynamodb/references/capacity-and-limits.md +78 -0
- package/skills/dynamodb/scripts/verify.sh +108 -0
- package/skills/e-signature/SKILL.md +185 -0
- package/skills/e-signature/evals/README.md +3 -0
- package/skills/e-signature/evals/cases.yaml +44 -0
- package/skills/e-signature/references/docusign.md +83 -0
- package/skills/e-signature/references/dropbox-sign.md +73 -0
- package/skills/e-signature/references/legal-tiers.md +37 -0
- package/skills/e-signature/scripts/verify.sh +81 -0
- package/skills/e2e-testing/SKILL.md +243 -0
- package/skills/e2e-testing/evals/README.md +10 -0
- package/skills/e2e-testing/evals/cases.yaml +64 -0
- package/skills/e2e-testing/references/config-and-ci.md +156 -0
- package/skills/e2e-testing/references/flakiness-playbook.md +124 -0
- package/skills/e2e-testing/scripts/verify.sh +117 -0
- package/skills/electron/SKILL.md +221 -0
- package/skills/electron/evals/README.md +13 -0
- package/skills/electron/evals/cases.yaml +38 -0
- package/skills/electron/references/packaging-and-updates.md +122 -0
- package/skills/electron/references/security-and-ipc.md +158 -0
- package/skills/electron/scripts/verify.sh +143 -0
- package/skills/elixir/SKILL.md +217 -0
- package/skills/elixir/evals/README.md +3 -0
- package/skills/elixir/evals/cases.yaml +41 -0
- package/skills/elixir/references/mix-and-releases.md +91 -0
- package/skills/elixir/references/otp-patterns.md +96 -0
- package/skills/elixir/scripts/verify.sh +76 -0
- package/skills/email-connector/SKILL.md +294 -0
- package/skills/email-connector/evals/README.md +19 -0
- package/skills/email-connector/evals/cases.yaml +39 -0
- package/skills/email-connector/references/providers.md +107 -0
- package/skills/email-connector/scripts/verify.sh +72 -0
- package/skills/email-deliverability/SKILL.md +168 -0
- package/skills/email-deliverability/evals/README.md +21 -0
- package/skills/email-deliverability/evals/cases.yaml +45 -0
- package/skills/email-deliverability/scripts/verify.sh +98 -0
- package/skills/embeddings-search/SKILL.md +193 -0
- package/skills/embeddings-search/evals/README.md +10 -0
- package/skills/embeddings-search/evals/cases.yaml +44 -0
- package/skills/embeddings-search/references/evaluation.md +86 -0
- package/skills/embeddings-search/references/models.md +73 -0
- package/skills/embeddings-search/scripts/verify.sh +103 -0
- package/skills/error-handling/SKILL.md +307 -0
- package/skills/error-handling/evals/README.md +12 -0
- package/skills/error-handling/evals/cases.yaml +46 -0
- package/skills/error-handling/references/boundaries-and-messaging.md +120 -0
- package/skills/error-handling/references/retry-and-resilience.md +154 -0
- package/skills/error-handling/scripts/verify.sh +110 -0
- package/skills/expo/SKILL.md +253 -0
- package/skills/expo/evals/README.md +13 -0
- package/skills/expo/evals/cases.yaml +44 -0
- package/skills/expo/references/config-plugins.md +117 -0
- package/skills/expo/references/eas-update.md +118 -0
- package/skills/expo/scripts/verify.sh +132 -0
- package/skills/fal/SKILL.md +210 -0
- package/skills/fal/evals/README.md +3 -0
- package/skills/fal/evals/cases.yaml +42 -0
- package/skills/fal/references/models-and-cost.md +53 -0
- package/skills/fal/references/queue-and-webhooks.md +153 -0
- package/skills/fal/scripts/verify.sh +72 -0
- package/skills/fastapi/SKILL.md +499 -0
- package/skills/fastapi/evals/README.md +50 -0
- package/skills/fastapi/evals/cases.yaml +55 -0
- package/skills/fastapi/references/database.md +347 -0
- package/skills/fastapi/references/production.md +338 -0
- package/skills/fastapi/references/security.md +330 -0
- package/skills/fastapi/references/testing.md +349 -0
- package/skills/fastapi/scripts/verify.sh +116 -0
- package/skills/finance-ops/SKILL.md +149 -0
- package/skills/finance-ops/evals/README.md +3 -0
- package/skills/finance-ops/evals/cases.yaml +39 -0
- package/skills/finance-ops/references/cash-flow-forecast.md +57 -0
- package/skills/finance-ops/references/month-close.md +59 -0
- package/skills/finance-ops/references/reconciliation.md +65 -0
- package/skills/finance-ops/scripts/verify.sh +166 -0
- package/skills/financial-model/SKILL.md +170 -0
- package/skills/financial-model/evals/README.md +3 -0
- package/skills/financial-model/evals/cases.yaml +53 -0
- package/skills/financial-model/references/benchmarks-and-scenarios.md +55 -0
- package/skills/financial-model/references/model-structure.md +67 -0
- package/skills/financial-model/references/revenue-build.md +68 -0
- package/skills/financial-model/scripts/verify.sh +232 -0
- package/skills/firebase/SKILL.md +251 -0
- package/skills/firebase/evals/README.md +12 -0
- package/skills/firebase/evals/cases.yaml +45 -0
- package/skills/firebase/references/cloud-functions.md +102 -0
- package/skills/firebase/references/data-modeling.md +108 -0
- package/skills/firebase/references/security-rules.md +137 -0
- package/skills/firebase/scripts/verify.sh +98 -0
- package/skills/flutter/SKILL.md +448 -0
- package/skills/flutter/evals/README.md +54 -0
- package/skills/flutter/evals/cases.yaml +69 -0
- package/skills/flutter/references/architecture-and-state.md +499 -0
- package/skills/flutter/references/i18n-and-dependencies.md +197 -0
- package/skills/flutter/references/performance.md +299 -0
- package/skills/flutter/references/testing.md +385 -0
- package/skills/flutter/references/ui-and-navigation.md +378 -0
- package/skills/flutter/scripts/verify.sh +104 -0
- package/skills/fly-io/SKILL.md +206 -0
- package/skills/fly-io/evals/README.md +3 -0
- package/skills/fly-io/evals/cases.yaml +42 -0
- package/skills/fly-io/references/fly-toml.md +155 -0
- package/skills/fly-io/references/multi-region.md +66 -0
- package/skills/fly-io/scripts/verify.sh +90 -0
- package/skills/forecasting/SKILL.md +139 -0
- package/skills/forecasting/evals/README.md +13 -0
- package/skills/forecasting/evals/cases.yaml +47 -0
- package/skills/forecasting/references/accuracy-and-backtesting.md +104 -0
- package/skills/forecasting/references/methods-cheatsheet.md +94 -0
- package/skills/forecasting/scripts/verify.sh +99 -0
- package/skills/fundraising/SKILL.md +162 -0
- package/skills/fundraising/evals/README.md +18 -0
- package/skills/fundraising/evals/cases.yaml +76 -0
- package/skills/fundraising/references/funnel-math.md +90 -0
- package/skills/fundraising/references/process-playbook.md +97 -0
- package/skills/gcp-essentials/SKILL.md +327 -0
- package/skills/gcp-essentials/evals/README.md +12 -0
- package/skills/gcp-essentials/evals/cases.yaml +38 -0
- package/skills/gcp-essentials/references/deploy-recipes.md +81 -0
- package/skills/gcp-essentials/references/iam-and-auth.md +94 -0
- package/skills/gcp-essentials/references/networking-and-sql.md +74 -0
- package/skills/gcp-essentials/scripts/verify.sh +158 -0
- package/skills/gdpr-privacy/SKILL.md +167 -0
- package/skills/gdpr-privacy/evals/README.md +3 -0
- package/skills/gdpr-privacy/evals/cases.yaml +47 -0
- package/skills/gdpr-privacy/references/dpa-and-transfers.md +63 -0
- package/skills/gdpr-privacy/references/dsar-and-consent.md +83 -0
- package/skills/gdpr-privacy/references/privacy-policy-blueprint.md +99 -0
- package/skills/gdpr-privacy/scripts/verify.sh +84 -0
- package/skills/git-workflow/SKILL.md +190 -0
- package/skills/git-workflow/evals/README.md +10 -0
- package/skills/git-workflow/evals/cases.yaml +47 -0
- package/skills/git-workflow/references/interactive-rebase.md +89 -0
- package/skills/github-actions/SKILL.md +256 -0
- package/skills/github-actions/evals/README.md +3 -0
- package/skills/github-actions/evals/cases.yaml +45 -0
- package/skills/github-actions/references/caching-and-matrix.md +92 -0
- package/skills/github-actions/references/oidc-deploys.md +130 -0
- package/skills/github-actions/scripts/verify.sh +105 -0
- package/skills/go/SKILL.md +438 -0
- package/skills/go/evals/README.md +56 -0
- package/skills/go/evals/cases.yaml +55 -0
- package/skills/go/references/concurrency.md +557 -0
- package/skills/go/references/http-services.md +529 -0
- package/skills/go/references/testing.md +338 -0
- package/skills/go/scripts/verify.sh +109 -0
- package/skills/google-workspace/SKILL.md +287 -0
- package/skills/google-workspace/evals/README.md +16 -0
- package/skills/google-workspace/evals/cases.yaml +44 -0
- package/skills/google-workspace/references/api-recipes.md +148 -0
- package/skills/google-workspace/references/auth-setup.md +100 -0
- package/skills/google-workspace/scripts/verify.sh +128 -0
- package/skills/grants/SKILL.md +171 -0
- package/skills/grants/evals/README.md +3 -0
- package/skills/grants/evals/cases.yaml +69 -0
- package/skills/grants/references/budget-justification.md +71 -0
- package/skills/grants/references/jurisdictions.md +35 -0
- package/skills/grants/references/logic-model.md +66 -0
- package/skills/grants/scripts/verify.sh +193 -0
- package/skills/harness/SKILL.md +329 -0
- package/skills/harness/assets/_TEMPLATE/.env.example +8 -0
- package/skills/harness/assets/_TEMPLATE/CREDENTIALS.md +25 -0
- package/skills/harness/assets/_TEMPLATE/README.md +25 -0
- package/skills/harness/assets/_TEMPLATE/test_connection.sh +30 -0
- package/skills/harness/evals/README.md +54 -0
- package/skills/harness/evals/cases.yaml +72 -0
- package/skills/harness/examples/audit-example.md +120 -0
- package/skills/harness/references/agents-md-template.md +41 -0
- package/skills/harness/references/audit-report-template.html +140 -0
- package/skills/harness/references/audit-report-template.md +116 -0
- package/skills/harness/references/claude-md-template.md +98 -0
- package/skills/harness/references/inbox-readme-template.md +51 -0
- package/skills/harness/references/ingest-formats.md +185 -0
- package/skills/harness/references/providers.yaml +3410 -0
- package/skills/harness/references/tools-readme-template.md +88 -0
- package/skills/harness/references/wiki-archive-template.html +81 -0
- package/skills/harness/references/wiki-article-template.md +20 -0
- package/skills/harness/references/wiki-dashboard-template.html +136 -0
- package/skills/harness/references/wiki-deep-improve-report-template.html +126 -0
- package/skills/harness/references/wiki-gaps-template.md +18 -0
- package/skills/harness/references/wiki-index-template.md +23 -0
- package/skills/harness/references/wiki-protocol.md +699 -0
- package/skills/harness/references/wiki-raw-template.md +7 -0
- package/skills/hetzner/SKILL.md +221 -0
- package/skills/hetzner/evals/README.md +35 -0
- package/skills/hetzner/evals/cases.yaml +46 -0
- package/skills/hetzner/references/cloud-init.md +120 -0
- package/skills/hetzner/references/plans-and-locations.md +56 -0
- package/skills/hetzner/scripts/verify.sh +122 -0
- package/skills/hiring/SKILL.md +248 -0
- package/skills/hiring/evals/README.md +13 -0
- package/skills/hiring/evals/cases.yaml +41 -0
- package/skills/hiring/references/templates.md +118 -0
- package/skills/htmx/SKILL.md +261 -0
- package/skills/htmx/evals/README.md +3 -0
- package/skills/htmx/evals/cases.yaml +38 -0
- package/skills/htmx/references/patterns.md +113 -0
- package/skills/htmx/references/server-contract.md +91 -0
- package/skills/htmx/scripts/verify.sh +93 -0
- package/skills/huggingface/SKILL.md +190 -0
- package/skills/huggingface/evals/README.md +11 -0
- package/skills/huggingface/evals/cases.yaml +41 -0
- package/skills/huggingface/references/endpoints-and-spaces.md +99 -0
- package/skills/huggingface/references/hub-and-cli.md +85 -0
- package/skills/huggingface/references/inference-providers.md +115 -0
- package/skills/huggingface/scripts/verify.sh +123 -0
- package/skills/implement/SKILL.md +283 -0
- package/skills/implement/evals/README.md +56 -0
- package/skills/implement/evals/cases.yaml +43 -0
- package/skills/init/SKILL.md +184 -0
- package/skills/init/evals/README.md +49 -0
- package/skills/init/evals/cases.yaml +74 -0
- package/skills/init/references/accompaniment-and-profile.md +140 -0
- package/skills/init/references/discovery.md +90 -0
- package/skills/init/references/recommend-skills.md +115 -0
- package/skills/init/scripts/verify.sh +122 -0
- package/skills/instagram-api/SKILL.md +241 -0
- package/skills/instagram-api/evals/README.md +3 -0
- package/skills/instagram-api/evals/cases.yaml +43 -0
- package/skills/instagram-api/references/insights-metrics.md +88 -0
- package/skills/instagram-api/references/publish-reel.md +98 -0
- package/skills/instagram-api/scripts/verify.sh +137 -0
- package/skills/inventory/SKILL.md +131 -0
- package/skills/inventory/evals/README.md +3 -0
- package/skills/inventory/evals/cases.yaml +43 -0
- package/skills/inventory/references/abc-xyz.md +52 -0
- package/skills/inventory/references/ddmrp.md +32 -0
- package/skills/inventory/references/reorder-policies.md +85 -0
- package/skills/inventory/references/safety-stock.md +63 -0
- package/skills/inventory/scripts/verify.sh +155 -0
- package/skills/investor-materials/SKILL.md +175 -0
- package/skills/investor-materials/evals/README.md +15 -0
- package/skills/investor-materials/evals/cases.yaml +60 -0
- package/skills/investor-materials/references/dataroom-checklist.md +134 -0
- package/skills/investor-materials/references/update-and-onepager-templates.md +152 -0
- package/skills/investor-materials/scripts/verify.sh +148 -0
- package/skills/invoicing/SKILL.md +154 -0
- package/skills/invoicing/evals/README.md +5 -0
- package/skills/invoicing/evals/cases.yaml +49 -0
- package/skills/invoicing/references/dunning-ladder.md +53 -0
- package/skills/invoicing/references/e-invoicing-mandates.md +43 -0
- package/skills/invoicing/scripts/fixtures/broken-invoice.json +13 -0
- package/skills/invoicing/scripts/fixtures/valid-invoice.json +15 -0
- package/skills/invoicing/scripts/verify.sh +133 -0
- package/skills/ip-trademark/SKILL.md +186 -0
- package/skills/ip-trademark/evals/README.md +10 -0
- package/skills/ip-trademark/evals/cases.yaml +47 -0
- package/skills/ip-trademark/references/jurisdictions.md +63 -0
- package/skills/ip-trademark/references/ownership-and-licensing.md +90 -0
- package/skills/java/SKILL.md +341 -0
- package/skills/java/evals/README.md +23 -0
- package/skills/java/evals/cases.yaml +43 -0
- package/skills/java/references/builds.md +133 -0
- package/skills/java/references/concurrency.md +108 -0
- package/skills/java/references/streams.md +102 -0
- package/skills/java/scripts/verify.sh +107 -0
- package/skills/knowledge-ops/SKILL.md +125 -0
- package/skills/knowledge-ops/evals/README.md +16 -0
- package/skills/knowledge-ops/evals/cases.yaml +50 -0
- package/skills/knowledge-ops/references/gardening-playbook.md +116 -0
- package/skills/kotlin-android/SKILL.md +245 -0
- package/skills/kotlin-android/evals/README.md +13 -0
- package/skills/kotlin-android/evals/cases.yaml +56 -0
- package/skills/kotlin-android/references/architecture.md +200 -0
- package/skills/kotlin-android/references/gradle-setup.md +125 -0
- package/skills/kotlin-android/scripts/verify.sh +109 -0
- package/skills/kpi-framework/SKILL.md +199 -0
- package/skills/kpi-framework/evals/README.md +11 -0
- package/skills/kpi-framework/evals/cases.yaml +42 -0
- package/skills/kpi-framework/references/definition-and-targets.md +64 -0
- package/skills/kpi-framework/references/metric-catalog.md +84 -0
- package/skills/landing-copy/SKILL.md +153 -0
- package/skills/landing-copy/evals/README.md +18 -0
- package/skills/landing-copy/evals/cases.yaml +63 -0
- package/skills/landing-copy/references/frameworks.md +61 -0
- package/skills/landing-copy/references/page-skeleton.md +92 -0
- package/skills/landing-copy/scripts/verify.sh +164 -0
- package/skills/laravel/SKILL.md +301 -0
- package/skills/laravel/evals/README.md +10 -0
- package/skills/laravel/evals/cases.yaml +45 -0
- package/skills/laravel/references/eloquent-patterns.md +126 -0
- package/skills/laravel/references/queues-and-scheduling.md +153 -0
- package/skills/laravel/scripts/verify.sh +128 -0
- package/skills/lead-gen/SKILL.md +155 -0
- package/skills/lead-gen/evals/README.md +3 -0
- package/skills/lead-gen/evals/cases.yaml +43 -0
- package/skills/lead-gen/references/data-sources.md +87 -0
- package/skills/lead-gen/references/scoring-model.md +93 -0
- package/skills/lead-gen/scripts/verify.sh +179 -0
- package/skills/linkedin-api/SKILL.md +211 -0
- package/skills/linkedin-api/evals/README.md +3 -0
- package/skills/linkedin-api/evals/cases.yaml +41 -0
- package/skills/linkedin-api/references/api-reference.md +168 -0
- package/skills/linkedin-api/scripts/verify.sh +98 -0
- package/skills/linkedin-carousels/SKILL.md +239 -0
- package/skills/linkedin-carousels/evals/README.md +13 -0
- package/skills/linkedin-carousels/evals/cases.yaml +62 -0
- package/skills/linkedin-carousels/references/carousel-patterns.md +200 -0
- package/skills/linkedin-carousels/scripts/verify.sh +160 -0
- package/skills/linkedin-content/SKILL.md +162 -0
- package/skills/linkedin-content/evals/README.md +13 -0
- package/skills/linkedin-content/evals/cases.yaml +62 -0
- package/skills/linkedin-content/references/hooks-and-formats.md +114 -0
- package/skills/linkedin-content/scripts/verify.sh +154 -0
- package/skills/linkedin-outreach/SKILL.md +174 -0
- package/skills/linkedin-outreach/evals/README.md +3 -0
- package/skills/linkedin-outreach/evals/cases.yaml +43 -0
- package/skills/linkedin-outreach/references/ledger-schema.md +48 -0
- package/skills/linkedin-outreach/references/sales-navigator-playbook.md +61 -0
- package/skills/linkedin-outreach/scripts/verify.sh +120 -0
- package/skills/linkedin-strategy/SKILL.md +167 -0
- package/skills/linkedin-strategy/evals/README.md +3 -0
- package/skills/linkedin-strategy/evals/cases.yaml +49 -0
- package/skills/linkedin-strategy/references/ssi-and-pillars.md +59 -0
- package/skills/linkedin-strategy/references/wiki-records.md +62 -0
- package/skills/linkedin-strategy/scripts/verify.sh +120 -0
- package/skills/llm-pipeline/SKILL.md +155 -0
- package/skills/llm-pipeline/evals/README.md +3 -0
- package/skills/llm-pipeline/evals/cases.yaml +44 -0
- package/skills/llm-pipeline/references/caching-layers.md +60 -0
- package/skills/llm-pipeline/references/litellm-router.md +101 -0
- package/skills/llm-pipeline/scripts/verify.sh +169 -0
- package/skills/logistics-ops/SKILL.md +219 -0
- package/skills/logistics-ops/evals/README.md +20 -0
- package/skills/logistics-ops/evals/cases.yaml +48 -0
- package/skills/logistics-ops/references/carriers-and-claims.md +105 -0
- package/skills/market-research/SKILL.md +145 -0
- package/skills/market-research/evals/README.md +3 -0
- package/skills/market-research/evals/cases.yaml +48 -0
- package/skills/market-research/references/demand-signals.md +63 -0
- package/skills/market-research/references/sizing-playbook.md +121 -0
- package/skills/market-research/scripts/verify.sh +215 -0
- package/skills/marketing/SKILL.md +233 -0
- package/skills/marketing/evals/README.md +61 -0
- package/skills/marketing/evals/cases.yaml +84 -0
- package/skills/marketing/references/brand-grounding.md +197 -0
- package/skills/marketing/references/campaigns-and-channels.md +151 -0
- package/skills/marketing/references/copy-frameworks.md +166 -0
- package/skills/marketing/references/landing-copy.md +191 -0
- package/skills/marketing/references/seo-geo.md +391 -0
- package/skills/marketing/scripts/seo_audit.py +166 -0
- package/skills/marketing/scripts/verify.sh +233 -0
- package/skills/medium-publishing/SKILL.md +152 -0
- package/skills/medium-publishing/evals/README.md +3 -0
- package/skills/medium-publishing/evals/cases.yaml +42 -0
- package/skills/medium-publishing/references/cross-post-and-canonical.md +65 -0
- package/skills/medium-publishing/references/legacy-api.md +100 -0
- package/skills/medium-strategy/SKILL.md +161 -0
- package/skills/medium-strategy/evals/README.md +3 -0
- package/skills/medium-strategy/evals/cases.yaml +50 -0
- package/skills/medium-strategy/references/distribution-and-boost.md +65 -0
- package/skills/medium-strategy/references/wiki-records.md +60 -0
- package/skills/medium-strategy/scripts/verify.sh +118 -0
- package/skills/medium-writing/SKILL.md +140 -0
- package/skills/medium-writing/evals/README.md +5 -0
- package/skills/medium-writing/evals/cases.yaml +39 -0
- package/skills/medium-writing/references/title-patterns.md +79 -0
- package/skills/meeting-notes/SKILL.md +168 -0
- package/skills/meeting-notes/evals/README.md +14 -0
- package/skills/meeting-notes/evals/cases.yaml +46 -0
- package/skills/meeting-notes/references/templates.md +140 -0
- package/skills/modal/SKILL.md +307 -0
- package/skills/modal/evals/README.md +29 -0
- package/skills/modal/evals/cases.yaml +50 -0
- package/skills/modal/references/images-gpu-cookbook.md +160 -0
- package/skills/modal/references/web-and-scaling.md +138 -0
- package/skills/modal/scripts/verify.sh +127 -0
- package/skills/mongodb/SKILL.md +342 -0
- package/skills/mongodb/evals/README.md +29 -0
- package/skills/mongodb/evals/cases.yaml +41 -0
- package/skills/mongodb/references/aggregation.md +115 -0
- package/skills/mongodb/references/data-modeling.md +135 -0
- package/skills/mongodb/references/transactions-and-ops.md +128 -0
- package/skills/mongodb/scripts/verify.sh +151 -0
- package/skills/monitoring/SKILL.md +155 -0
- package/skills/monitoring/evals/README.md +3 -0
- package/skills/monitoring/evals/cases.yaml +47 -0
- package/skills/monitoring/references/burn-rate-and-oncall.md +128 -0
- package/skills/monitoring/references/tool-setup.md +154 -0
- package/skills/monitoring/scripts/verify.sh +145 -0
- package/skills/mysql/SKILL.md +249 -0
- package/skills/mysql/evals/README.md +12 -0
- package/skills/mysql/evals/cases.yaml +49 -0
- package/skills/mysql/references/indexing-and-explain.md +161 -0
- package/skills/mysql/references/mysql-vs-mariadb.md +78 -0
- package/skills/mysql/references/online-ddl-and-migrations.md +120 -0
- package/skills/mysql/references/replication-and-ha.md +115 -0
- package/skills/mysql/scripts/verify.sh +141 -0
- package/skills/neon/SKILL.md +218 -0
- package/skills/neon/evals/README.md +11 -0
- package/skills/neon/evals/cases.yaml +45 -0
- package/skills/neon/references/branching-ci.md +86 -0
- package/skills/neon/scripts/verify.sh +78 -0
- package/skills/nestjs/SKILL.md +225 -0
- package/skills/nestjs/evals/README.md +3 -0
- package/skills/nestjs/evals/cases.yaml +38 -0
- package/skills/nestjs/references/cross-cutting.md +135 -0
- package/skills/nestjs/references/testing-recipes.md +105 -0
- package/skills/nestjs/scripts/verify.sh +98 -0
- package/skills/netlify/SKILL.md +208 -0
- package/skills/netlify/evals/README.md +13 -0
- package/skills/netlify/evals/cases.yaml +43 -0
- package/skills/netlify/references/functions.md +97 -0
- package/skills/netlify/references/netlify-toml.md +115 -0
- package/skills/netlify/scripts/verify.sh +95 -0
- package/skills/newsletter/SKILL.md +162 -0
- package/skills/newsletter/evals/README.md +12 -0
- package/skills/newsletter/evals/cases.yaml +42 -0
- package/skills/newsletter/references/growth-loops.md +73 -0
- package/skills/newsletter/references/welcome-sequence.md +62 -0
- package/skills/newsletter/scripts/verify.sh +173 -0
- package/skills/nextjs/SKILL.md +472 -0
- package/skills/nextjs/evals/README.md +59 -0
- package/skills/nextjs/evals/cases.yaml +56 -0
- package/skills/nextjs/references/data-and-caching.md +309 -0
- package/skills/nextjs/references/metadata.md +208 -0
- package/skills/nextjs/references/performance.md +325 -0
- package/skills/nextjs/references/react.md +383 -0
- package/skills/nextjs/references/security.md +239 -0
- package/skills/nextjs/references/testing.md +290 -0
- package/skills/nextjs/scripts/verify.sh +141 -0
- package/skills/no-code-app/SKILL.md +153 -0
- package/skills/no-code-app/evals/README.md +3 -0
- package/skills/no-code-app/evals/cases.yaml +43 -0
- package/skills/no-code-app/references/platform-limits.md +100 -0
- package/skills/nodejs/SKILL.md +242 -0
- package/skills/nodejs/evals/README.md +3 -0
- package/skills/nodejs/evals/cases.yaml +39 -0
- package/skills/nodejs/references/express5-migration.md +53 -0
- package/skills/nodejs/references/graceful-shutdown.md +73 -0
- package/skills/nodejs/scripts/verify.sh +122 -0
- package/skills/notion-connector/SKILL.md +234 -0
- package/skills/notion-connector/evals/README.md +15 -0
- package/skills/notion-connector/evals/cases.yaml +45 -0
- package/skills/notion-connector/references/api-versions.md +63 -0
- package/skills/notion-connector/references/property-shapes.md +110 -0
- package/skills/notion-connector/references/sync-patterns.md +95 -0
- package/skills/notion-connector/scripts/verify.sh +162 -0
- package/skills/observability/SKILL.md +231 -0
- package/skills/observability/evals/README.md +3 -0
- package/skills/observability/evals/cases.yaml +49 -0
- package/skills/observability/references/collector-config.md +98 -0
- package/skills/observability/references/instrumentation-recipes.md +115 -0
- package/skills/observability/scripts/verify.sh +156 -0
- package/skills/ollama/SKILL.md +213 -0
- package/skills/ollama/evals/README.md +9 -0
- package/skills/ollama/evals/cases.yaml +43 -0
- package/skills/ollama/references/api.md +148 -0
- package/skills/ollama/references/hardware-sizing.md +87 -0
- package/skills/ollama/scripts/verify.sh +116 -0
- package/skills/orient/SKILL.md +54 -0
- package/skills/orient/evals/README.md +16 -0
- package/skills/orient/evals/cases.yaml +57 -0
- package/skills/orient/references/orientation-contract.md +34 -0
- package/skills/parallel/SKILL.md +198 -0
- package/skills/parallel/evals/README.md +62 -0
- package/skills/parallel/evals/cases.yaml +44 -0
- package/skills/people-ops/SKILL.md +122 -0
- package/skills/people-ops/evals/README.md +14 -0
- package/skills/people-ops/evals/cases.yaml +43 -0
- package/skills/people-ops/references/templates.md +129 -0
- package/skills/performance/SKILL.md +221 -0
- package/skills/performance/evals/README.md +3 -0
- package/skills/performance/evals/cases.yaml +47 -0
- package/skills/performance/references/profiling-playbook.md +54 -0
- package/skills/performance/scripts/verify.sh +94 -0
- package/skills/phoenix/SKILL.md +169 -0
- package/skills/phoenix/evals/README.md +3 -0
- package/skills/phoenix/evals/cases.yaml +40 -0
- package/skills/phoenix/references/auth-and-scopes.md +82 -0
- package/skills/phoenix/references/ecto-patterns.md +93 -0
- package/skills/phoenix/references/liveview.md +134 -0
- package/skills/phoenix/scripts/verify.sh +73 -0
- package/skills/php/SKILL.md +397 -0
- package/skills/php/evals/README.md +12 -0
- package/skills/php/evals/cases.yaml +45 -0
- package/skills/php/references/tooling.md +170 -0
- package/skills/php/references/type-system.md +220 -0
- package/skills/php/scripts/verify.sh +155 -0
- package/skills/pitch-deck/SKILL.md +209 -0
- package/skills/pitch-deck/evals/README.md +15 -0
- package/skills/pitch-deck/evals/cases.yaml +55 -0
- package/skills/pitch-deck/references/numbers-that-matter.md +78 -0
- package/skills/pitch-deck/references/slide-spine.md +149 -0
- package/skills/pitch-deck/scripts/verify.sh +186 -0
- package/skills/plan/SKILL.md +204 -0
- package/skills/plan/evals/README.md +62 -0
- package/skills/plan/evals/cases.yaml +49 -0
- package/skills/plan/references/plan-template.md +124 -0
- package/skills/planetscale/SKILL.md +223 -0
- package/skills/planetscale/evals/README.md +11 -0
- package/skills/planetscale/evals/cases.yaml +46 -0
- package/skills/planetscale/references/deploy-requests.md +75 -0
- package/skills/planetscale/references/no-foreign-keys.md +88 -0
- package/skills/planetscale/scripts/verify.sh +115 -0
- package/skills/podcast/SKILL.md +166 -0
- package/skills/podcast/evals/README.md +17 -0
- package/skills/podcast/evals/cases.yaml +61 -0
- package/skills/podcast/references/rss-and-namespace.md +136 -0
- package/skills/podcast/scripts/verify.sh +246 -0
- package/skills/postgresdb/SKILL.md +372 -0
- package/skills/postgresdb/evals/README.md +55 -0
- package/skills/postgresdb/evals/cases.yaml +57 -0
- package/skills/postgresdb/references/migrations.md +279 -0
- package/skills/postgresdb/references/operations-and-security.md +267 -0
- package/skills/postgresdb/references/query-optimization.md +374 -0
- package/skills/postgresdb/references/schema-and-indexing.md +379 -0
- package/skills/postgresdb/scripts/verify.sh +191 -0
- package/skills/presentations/SKILL.md +296 -0
- package/skills/presentations/evals/README.md +61 -0
- package/skills/presentations/evals/cases.yaml +56 -0
- package/skills/presentations/references/brand-grounding.md +160 -0
- package/skills/presentations/references/markdown-decks.md +290 -0
- package/skills/presentations/references/pptx-python.md +242 -0
- package/skills/presentations/references/slide-design.md +261 -0
- package/skills/presentations/references/storytelling-and-decks.md +150 -0
- package/skills/presentations/scripts/verify.sh +252 -0
- package/skills/press-kit/SKILL.md +243 -0
- package/skills/press-kit/evals/README.md +15 -0
- package/skills/press-kit/evals/cases.yaml +55 -0
- package/skills/press-kit/references/release-types.md +102 -0
- package/skills/press-kit/references/templates.md +132 -0
- package/skills/press-kit/scripts/verify.sh +161 -0
- package/skills/pricing/SKILL.md +160 -0
- package/skills/pricing/evals/README.md +5 -0
- package/skills/pricing/evals/cases.yaml +44 -0
- package/skills/pricing/references/localization.md +56 -0
- package/skills/pricing/references/pricing-models.md +55 -0
- package/skills/pricing/scripts/verify.sh +91 -0
- package/skills/prisma-orm/SKILL.md +320 -0
- package/skills/prisma-orm/evals/README.md +12 -0
- package/skills/prisma-orm/evals/cases.yaml +56 -0
- package/skills/prisma-orm/references/migrations-and-v7-upgrade.md +197 -0
- package/skills/prisma-orm/references/queries-and-performance.md +169 -0
- package/skills/prisma-orm/scripts/verify.sh +137 -0
- package/skills/procurement/SKILL.md +179 -0
- package/skills/procurement/evals/README.md +20 -0
- package/skills/procurement/evals/cases.yaml +49 -0
- package/skills/procurement/references/scorecard-and-tco.md +100 -0
- package/skills/procurement/references/sourcing-requests.md +116 -0
- package/skills/procurement/scripts/verify.sh +280 -0
- package/skills/project-ops/SKILL.md +130 -0
- package/skills/project-ops/evals/README.md +3 -0
- package/skills/project-ops/evals/cases.yaml +71 -0
- package/skills/project-ops/references/raid-and-rag.md +58 -0
- package/skills/project-ops/references/status-report-template.md +68 -0
- package/skills/project-ops/scripts/verify.sh +257 -0
- package/skills/prompt-engineering/SKILL.md +138 -0
- package/skills/prompt-engineering/evals/README.md +11 -0
- package/skills/prompt-engineering/evals/cases.yaml +46 -0
- package/skills/prompt-engineering/references/eval-templates.md +94 -0
- package/skills/prompt-engineering/references/output-contracts.md +120 -0
- package/skills/prompt-engineering/scripts/verify.sh +84 -0
- package/skills/proposals/SKILL.md +159 -0
- package/skills/proposals/evals/README.md +3 -0
- package/skills/proposals/evals/cases.yaml +53 -0
- package/skills/proposals/references/proposal-skeleton.md +110 -0
- package/skills/proposals/references/sow-skeleton.md +79 -0
- package/skills/proposals/scripts/verify.sh +201 -0
- package/skills/python/SKILL.md +369 -0
- package/skills/python/evals/README.md +19 -0
- package/skills/python/evals/cases.yaml +46 -0
- package/skills/python/references/async.md +136 -0
- package/skills/python/references/stdlib.md +162 -0
- package/skills/python/references/typing.md +160 -0
- package/skills/python/scripts/verify.sh +125 -0
- package/skills/rag/SKILL.md +226 -0
- package/skills/rag/evals/README.md +13 -0
- package/skills/rag/evals/cases.yaml +45 -0
- package/skills/rag/references/evaluation.md +99 -0
- package/skills/rag/references/pipeline.md +151 -0
- package/skills/rag/scripts/verify.sh +99 -0
- package/skills/rails/SKILL.md +264 -0
- package/skills/rails/evals/README.md +12 -0
- package/skills/rails/evals/cases.yaml +47 -0
- package/skills/rails/references/activerecord.md +148 -0
- package/skills/rails/references/hotwire.md +139 -0
- package/skills/rails/references/testing.md +110 -0
- package/skills/rails/scripts/verify.sh +128 -0
- package/skills/railway/SKILL.md +245 -0
- package/skills/railway/evals/README.md +14 -0
- package/skills/railway/evals/cases.yaml +44 -0
- package/skills/railway/references/cli-cookbook.md +137 -0
- package/skills/railway/references/config-as-code.md +120 -0
- package/skills/railway/scripts/verify.sh +162 -0
- package/skills/react/SKILL.md +222 -0
- package/skills/react/evals/README.md +3 -0
- package/skills/react/evals/cases.yaml +43 -0
- package/skills/react/references/data-and-state.md +152 -0
- package/skills/react/references/performance.md +75 -0
- package/skills/react/references/routing.md +99 -0
- package/skills/react/scripts/verify.sh +123 -0
- package/skills/react-native/SKILL.md +220 -0
- package/skills/react-native/evals/README.md +3 -0
- package/skills/react-native/evals/cases.yaml +42 -0
- package/skills/react-native/references/native-modules.md +123 -0
- package/skills/react-native/references/performance-debugging.md +46 -0
- package/skills/react-native/scripts/verify.sh +117 -0
- package/skills/redis/SKILL.md +298 -0
- package/skills/redis/evals/README.md +10 -0
- package/skills/redis/evals/cases.yaml +43 -0
- package/skills/redis/references/caching.md +116 -0
- package/skills/redis/references/locks-and-rate-limiting.md +140 -0
- package/skills/redis/references/queues.md +102 -0
- package/skills/redis/scripts/verify.sh +164 -0
- package/skills/remotion-video/SKILL.md +218 -0
- package/skills/remotion-video/evals/README.md +23 -0
- package/skills/remotion-video/evals/cases.yaml +64 -0
- package/skills/remotion-video/references/captions-pipeline.md +163 -0
- package/skills/remotion-video/references/render-and-pipeline.md +131 -0
- package/skills/remotion-video/scripts/verify.sh +169 -0
- package/skills/render/SKILL.md +256 -0
- package/skills/render/evals/README.md +12 -0
- package/skills/render/evals/cases.yaml +45 -0
- package/skills/render/references/blueprint-reference.md +203 -0
- package/skills/render/scripts/verify.sh +167 -0
- package/skills/replicate/SKILL.md +210 -0
- package/skills/replicate/evals/README.md +9 -0
- package/skills/replicate/evals/cases.yaml +45 -0
- package/skills/replicate/references/cog-packaging.md +89 -0
- package/skills/replicate/references/deployments-api.md +87 -0
- package/skills/replicate/references/webhooks-and-async.md +110 -0
- package/skills/replicate/scripts/verify.sh +162 -0
- package/skills/replicate-images/SKILL.md +241 -0
- package/skills/replicate-images/evals/README.md +13 -0
- package/skills/replicate-images/evals/cases.yaml +41 -0
- package/skills/replicate-images/references/editing-recipes.md +129 -0
- package/skills/replicate-images/references/models.md +131 -0
- package/skills/replicate-images/scripts/verify.sh +178 -0
- package/skills/reporting/SKILL.md +178 -0
- package/skills/reporting/evals/README.md +12 -0
- package/skills/reporting/evals/cases.yaml +46 -0
- package/skills/reporting/references/pipeline.md +213 -0
- package/skills/reporting/scripts/verify.sh +149 -0
- package/skills/research-ops/SKILL.md +200 -0
- package/skills/research-ops/evals/README.md +13 -0
- package/skills/research-ops/evals/cases.yaml +38 -0
- package/skills/research-ops/references/credibility-rubric.md +78 -0
- package/skills/research-ops/references/memo-template.md +63 -0
- package/skills/research-ops/scripts/verify.sh +181 -0
- package/skills/retention/SKILL.md +206 -0
- package/skills/retention/evals/README.md +13 -0
- package/skills/retention/evals/cases.yaml +42 -0
- package/skills/retention/references/health-score-and-metrics.md +97 -0
- package/skills/retention/references/save-and-winback-plays.md +65 -0
- package/skills/review/SKILL.md +222 -0
- package/skills/review/evals/README.md +84 -0
- package/skills/review/evals/cases.yaml +55 -0
- package/skills/review-management/SKILL.md +204 -0
- package/skills/review-management/evals/README.md +13 -0
- package/skills/review-management/evals/cases.yaml +60 -0
- package/skills/review-management/references/platform-apis.md +86 -0
- package/skills/review-management/scripts/verify.sh +128 -0
- package/skills/ruby/SKILL.md +316 -0
- package/skills/ruby/evals/README.md +12 -0
- package/skills/ruby/evals/cases.yaml +41 -0
- package/skills/ruby/references/gems-and-testing.md +208 -0
- package/skills/ruby/references/metaprogramming.md +161 -0
- package/skills/ruby/scripts/verify.sh +83 -0
- package/skills/runpod/SKILL.md +238 -0
- package/skills/runpod/evals/README.md +11 -0
- package/skills/runpod/evals/cases.yaml +47 -0
- package/skills/runpod/references/cost-and-scaling.md +85 -0
- package/skills/runpod/references/serverless-workers.md +101 -0
- package/skills/runpod/scripts/verify.sh +126 -0
- package/skills/rust/SKILL.md +395 -0
- package/skills/rust/evals/README.md +12 -0
- package/skills/rust/evals/cases.yaml +42 -0
- package/skills/rust/references/async-tokio.md +141 -0
- package/skills/rust/references/axum-service.md +132 -0
- package/skills/rust/references/ownership.md +86 -0
- package/skills/rust/references/testing.md +108 -0
- package/skills/rust/scripts/verify.sh +91 -0
- package/skills/sales-pipeline/SKILL.md +162 -0
- package/skills/sales-pipeline/evals/README.md +13 -0
- package/skills/sales-pipeline/evals/cases.yaml +60 -0
- package/skills/sales-pipeline/references/forecasting-math.md +82 -0
- package/skills/sales-pipeline/references/stage-playbook.md +84 -0
- package/skills/sales-pipeline/scripts/verify.sh +210 -0
- package/skills/scaling/SKILL.md +137 -0
- package/skills/scaling/evals/README.md +3 -0
- package/skills/scaling/evals/cases.yaml +42 -0
- package/skills/scaling/references/load-testing-k6.md +127 -0
- package/skills/scaling/scripts/example.load.js +24 -0
- package/skills/scaling/scripts/verify.sh +70 -0
- package/skills/sdd/SKILL.md +203 -0
- package/skills/sdd/evals/README.md +60 -0
- package/skills/sdd/evals/cases.yaml +78 -0
- package/skills/sdd-init/SKILL.md +148 -0
- package/skills/sdd-init/evals/README.md +3 -0
- package/skills/sdd-init/evals/cases.yaml +43 -0
- package/skills/secure-coding/SKILL.md +365 -0
- package/skills/secure-coding/evals/README.md +68 -0
- package/skills/secure-coding/evals/cases.yaml +55 -0
- package/skills/secure-coding/references/authn-authz.md +249 -0
- package/skills/secure-coding/references/owasp-by-stack.md +574 -0
- package/skills/secure-coding/references/secrets-and-supply-chain.md +205 -0
- package/skills/secure-coding/references/threat-modeling.md +213 -0
- package/skills/secure-coding/scripts/verify.sh +208 -0
- package/skills/security-scan/SKILL.md +239 -0
- package/skills/security-scan/evals/README.md +14 -0
- package/skills/security-scan/evals/cases.yaml +50 -0
- package/skills/security-scan/references/tools.md +98 -0
- package/skills/security-scan/references/triage.md +93 -0
- package/skills/security-scan/scripts/verify.sh +108 -0
- package/skills/seo-geo/SKILL.md +192 -0
- package/skills/seo-geo/evals/README.md +14 -0
- package/skills/seo-geo/evals/cases.yaml +45 -0
- package/skills/seo-geo/references/ai-crawler-control.md +104 -0
- package/skills/seo-geo/references/schema-recipes.md +130 -0
- package/skills/seo-geo/scripts/verify.sh +236 -0
- package/skills/ship/SKILL.md +258 -0
- package/skills/ship/evals/README.md +89 -0
- package/skills/ship/evals/cases.yaml +44 -0
- package/skills/shopify/SKILL.md +229 -0
- package/skills/shopify/evals/README.md +14 -0
- package/skills/shopify/evals/cases.yaml +41 -0
- package/skills/shopify/references/apps-graphql.md +103 -0
- package/skills/shopify/references/checkout-extensibility.md +71 -0
- package/skills/shopify/references/liquid-themes.md +89 -0
- package/skills/shopify/scripts/verify.sh +120 -0
- package/skills/shortform-editing/SKILL.md +161 -0
- package/skills/shortform-editing/evals/README.md +16 -0
- package/skills/shortform-editing/evals/cases.yaml +61 -0
- package/skills/shortform-editing/references/captions.md +85 -0
- package/skills/shortform-editing/references/ffmpeg-pipeline.md +126 -0
- package/skills/shortform-editing/scripts/verify.sh +148 -0
- package/skills/shortform-ideation/SKILL.md +153 -0
- package/skills/shortform-ideation/evals/README.md +20 -0
- package/skills/shortform-ideation/evals/cases.yaml +58 -0
- package/skills/shortform-ideation/references/experiment-ledger.md +85 -0
- package/skills/shortform-ideation/references/trend-sources.md +69 -0
- package/skills/shortform-ideation/scripts/verify.sh +172 -0
- package/skills/shortform-packaging/SKILL.md +247 -0
- package/skills/shortform-packaging/evals/README.md +10 -0
- package/skills/shortform-packaging/evals/cases.yaml +48 -0
- package/skills/shortform-packaging/references/package-templates.md +117 -0
- package/skills/shortform-packaging/scripts/verify.sh +210 -0
- package/skills/shortform-strategy/SKILL.md +149 -0
- package/skills/shortform-strategy/evals/README.md +3 -0
- package/skills/shortform-strategy/evals/cases.yaml +52 -0
- package/skills/shortform-strategy/references/learning-loop-template.md +49 -0
- package/skills/shortform-strategy/references/platform-signals-2026.md +46 -0
- package/skills/shortform-strategy/scripts/verify.sh +176 -0
- package/skills/skill-scout/SKILL.md +133 -0
- package/skills/skill-scout/evals/README.md +12 -0
- package/skills/skill-scout/evals/cases.yaml +56 -0
- package/skills/skill-scout/references/install-commands.md +76 -0
- package/skills/skill-scout/scripts/verify.sh +154 -0
- package/skills/social-publisher/SKILL.md +179 -0
- package/skills/social-publisher/evals/README.md +14 -0
- package/skills/social-publisher/evals/cases.yaml +55 -0
- package/skills/social-publisher/references/calendar-schema.md +97 -0
- package/skills/social-publisher/references/platform-limits.md +56 -0
- package/skills/social-publisher/scripts/verify.sh +232 -0
- package/skills/solid-js/SKILL.md +260 -0
- package/skills/solid-js/evals/README.md +3 -0
- package/skills/solid-js/evals/cases.yaml +38 -0
- package/skills/solid-js/references/reactivity-deep-dive.md +89 -0
- package/skills/solid-js/references/router-and-start.md +93 -0
- package/skills/solid-js/scripts/verify.sh +130 -0
- package/skills/sop-builder/SKILL.md +233 -0
- package/skills/sop-builder/evals/README.md +14 -0
- package/skills/sop-builder/evals/cases.yaml +48 -0
- package/skills/sop-builder/references/sop-skeleton.md +170 -0
- package/skills/specify/SKILL.md +214 -0
- package/skills/specify/evals/README.md +73 -0
- package/skills/specify/evals/cases.yaml +80 -0
- package/skills/specify/references/eliciting-requirements.md +77 -0
- package/skills/specify/references/spec-template.md +60 -0
- package/skills/spreadsheet-ops/SKILL.md +180 -0
- package/skills/spreadsheet-ops/evals/README.md +33 -0
- package/skills/spreadsheet-ops/evals/cases.yaml +42 -0
- package/skills/spreadsheet-ops/references/formula-cookbook.md +70 -0
- package/skills/spreadsheet-ops/references/python-excel.md +87 -0
- package/skills/spreadsheet-ops/references/sheets-api-appsscript.md +118 -0
- package/skills/spreadsheet-ops/scripts/verify.sh +152 -0
- package/skills/spring-boot/SKILL.md +375 -0
- package/skills/spring-boot/evals/README.md +11 -0
- package/skills/spring-boot/evals/cases.yaml +49 -0
- package/skills/spring-boot/references/jpa.md +94 -0
- package/skills/spring-boot/references/security.md +92 -0
- package/skills/spring-boot/references/testing.md +95 -0
- package/skills/spring-boot/scripts/verify.sh +115 -0
- package/skills/sql/SKILL.md +286 -0
- package/skills/sql/evals/README.md +9 -0
- package/skills/sql/evals/cases.yaml +49 -0
- package/skills/sql/references/ctes-and-recursion.md +63 -0
- package/skills/sql/references/joins-and-sets.md +71 -0
- package/skills/sql/references/portability.md +38 -0
- package/skills/sql/references/window-functions.md +72 -0
- package/skills/sql/scripts/verify.sh +139 -0
- package/skills/sqlite-turso/SKILL.md +214 -0
- package/skills/sqlite-turso/evals/README.md +24 -0
- package/skills/sqlite-turso/evals/cases.yaml +45 -0
- package/skills/sqlite-turso/references/embedded-replicas.md +96 -0
- package/skills/sqlite-turso/scripts/verify.sh +95 -0
- package/skills/stripe/SKILL.md +269 -0
- package/skills/stripe/evals/README.md +11 -0
- package/skills/stripe/evals/cases.yaml +45 -0
- package/skills/stripe/references/going-live.md +64 -0
- package/skills/stripe/references/webhook-events.md +79 -0
- package/skills/stripe/scripts/verify.sh +130 -0
- package/skills/structured-extraction/SKILL.md +230 -0
- package/skills/structured-extraction/evals/README.md +13 -0
- package/skills/structured-extraction/evals/cases.yaml +70 -0
- package/skills/structured-extraction/references/providers.md +152 -0
- package/skills/structured-extraction/scripts/verify.sh +160 -0
- package/skills/suggest/SKILL.md +30 -0
- package/skills/suggest/evals/README.md +14 -0
- package/skills/suggest/evals/cases.yaml +51 -0
- package/skills/supabase/SKILL.md +268 -0
- package/skills/supabase/evals/README.md +12 -0
- package/skills/supabase/evals/cases.yaml +42 -0
- package/skills/supabase/references/auth-ssr.md +173 -0
- package/skills/supabase/references/rls-cookbook.md +122 -0
- package/skills/supabase/scripts/verify.sh +149 -0
- package/skills/svelte/SKILL.md +238 -0
- package/skills/svelte/evals/README.md +3 -0
- package/skills/svelte/evals/cases.yaml +41 -0
- package/skills/svelte/references/runes.md +97 -0
- package/skills/svelte/references/sveltekit-data.md +156 -0
- package/skills/svelte/scripts/verify.sh +128 -0
- package/skills/swift-ios/SKILL.md +217 -0
- package/skills/swift-ios/evals/README.md +3 -0
- package/skills/swift-ios/evals/cases.yaml +46 -0
- package/skills/swift-ios/references/concurrency.md +132 -0
- package/skills/swift-ios/references/testing.md +112 -0
- package/skills/swift-ios/scripts/verify.sh +98 -0
- package/skills/tasks/SKILL.md +260 -0
- package/skills/tasks/evals/README.md +70 -0
- package/skills/tasks/evals/cases.yaml +75 -0
- package/skills/tauri/SKILL.md +224 -0
- package/skills/tauri/evals/README.md +12 -0
- package/skills/tauri/evals/cases.yaml +46 -0
- package/skills/tauri/references/bundling-distribution.md +129 -0
- package/skills/tauri/references/security.md +143 -0
- package/skills/tauri/scripts/verify.sh +178 -0
- package/skills/technical-writing/SKILL.md +230 -0
- package/skills/technical-writing/evals/README.md +12 -0
- package/skills/technical-writing/evals/cases.yaml +53 -0
- package/skills/technical-writing/references/diataxis-modes.md +131 -0
- package/skills/technical-writing/references/vale-starter.md +90 -0
- package/skills/technical-writing/scripts/verify.sh +83 -0
- package/skills/terms-conditions/SKILL.md +147 -0
- package/skills/terms-conditions/evals/README.md +14 -0
- package/skills/terms-conditions/evals/cases.yaml +48 -0
- package/skills/terms-conditions/references/clause-library.md +158 -0
- package/skills/terms-conditions/references/notices-and-aup.md +125 -0
- package/skills/terms-conditions/scripts/verify.sh +92 -0
- package/skills/testing-go/SKILL.md +246 -0
- package/skills/testing-go/evals/README.md +3 -0
- package/skills/testing-go/evals/cases.yaml +44 -0
- package/skills/testing-go/references/coverage-and-benchmarks.md +85 -0
- package/skills/testing-go/references/mocks-and-fakes.md +140 -0
- package/skills/testing-go/references/synctest-and-concurrency.md +82 -0
- package/skills/testing-go/scripts/verify.sh +72 -0
- package/skills/testing-py/SKILL.md +179 -0
- package/skills/testing-py/evals/README.md +5 -0
- package/skills/testing-py/evals/cases.yaml +44 -0
- package/skills/testing-py/references/mocking.md +141 -0
- package/skills/testing-py/references/property-testing.md +99 -0
- package/skills/testing-py/scripts/verify.sh +117 -0
- package/skills/testing-web/SKILL.md +224 -0
- package/skills/testing-web/evals/README.md +11 -0
- package/skills/testing-web/evals/cases.yaml +52 -0
- package/skills/testing-web/references/jest-setup.md +88 -0
- package/skills/testing-web/references/recipes.md +116 -0
- package/skills/testing-web/scripts/verify.sh +111 -0
- package/skills/tiktok-api/SKILL.md +315 -0
- package/skills/tiktok-api/evals/README.md +17 -0
- package/skills/tiktok-api/evals/cases.yaml +51 -0
- package/skills/tiktok-api/references/metrics-and-publish.md +127 -0
- package/skills/tiktok-api/references/oauth-setup.md +105 -0
- package/skills/tiktok-api/references/wiki-schema.md +85 -0
- package/skills/tiktok-api/scripts/verify.sh +96 -0
- package/skills/together-fireworks/SKILL.md +181 -0
- package/skills/together-fireworks/evals/README.md +3 -0
- package/skills/together-fireworks/evals/cases.yaml +50 -0
- package/skills/together-fireworks/references/batch-and-tuning.md +59 -0
- package/skills/together-fireworks/references/models-and-pricing.md +79 -0
- package/skills/together-fireworks/scripts/verify.sh +165 -0
- package/skills/translation-l10n/SKILL.md +229 -0
- package/skills/translation-l10n/evals/README.md +3 -0
- package/skills/translation-l10n/evals/cases.yaml +39 -0
- package/skills/translation-l10n/references/icu-cookbook.md +82 -0
- package/skills/translation-l10n/references/rtl-and-bidi.md +60 -0
- package/skills/typescript/SKILL.md +258 -0
- package/skills/typescript/evals/README.md +15 -0
- package/skills/typescript/evals/cases.yaml +46 -0
- package/skills/typescript/references/build-and-monorepo.md +141 -0
- package/skills/typescript/references/type-system.md +162 -0
- package/skills/typescript/scripts/verify.sh +52 -0
- package/skills/unit-economics/SKILL.md +180 -0
- package/skills/unit-economics/evals/README.md +5 -0
- package/skills/unit-economics/evals/cases.yaml +43 -0
- package/skills/unit-economics/references/formulas.md +144 -0
- package/skills/unit-economics/scripts/verify.sh +179 -0
- package/skills/vector-db/SKILL.md +189 -0
- package/skills/vector-db/evals/README.md +10 -0
- package/skills/vector-db/evals/cases.yaml +45 -0
- package/skills/vector-db/references/engines.md +175 -0
- package/skills/vector-db/references/tuning.md +62 -0
- package/skills/vector-db/scripts/verify.sh +110 -0
- package/skills/vercel/SKILL.md +242 -0
- package/skills/vercel/evals/README.md +23 -0
- package/skills/vercel/evals/cases.yaml +45 -0
- package/skills/vercel/references/cli-cookbook.md +98 -0
- package/skills/vercel/references/vercel-json.md +120 -0
- package/skills/vercel/scripts/verify.sh +168 -0
- package/skills/verify/SKILL.md +188 -0
- package/skills/verify/evals/README.md +78 -0
- package/skills/verify/evals/cases.yaml +74 -0
- package/skills/video-shorts/SKILL.md +163 -0
- package/skills/video-shorts/evals/README.md +15 -0
- package/skills/video-shorts/evals/cases.yaml +56 -0
- package/skills/video-shorts/references/hook-and-script-patterns.md +95 -0
- package/skills/video-shorts/references/specs-and-safe-zones.md +74 -0
- package/skills/video-shorts/scripts/verify.sh +172 -0
- package/skills/vue-nuxt/SKILL.md +384 -0
- package/skills/vue-nuxt/evals/README.md +11 -0
- package/skills/vue-nuxt/evals/cases.yaml +49 -0
- package/skills/vue-nuxt/references/data-and-state.md +127 -0
- package/skills/vue-nuxt/references/migration-nuxt4.md +79 -0
- package/skills/vue-nuxt/references/nitro-and-rendering.md +117 -0
- package/skills/vue-nuxt/references/reactivity.md +135 -0
- package/skills/vue-nuxt/scripts/verify.sh +148 -0
- package/skills/webhooks/SKILL.md +246 -0
- package/skills/webhooks/evals/README.md +15 -0
- package/skills/webhooks/evals/cases.yaml +46 -0
- package/skills/webhooks/references/framework-raw-body.md +97 -0
- package/skills/webhooks/references/signature-schemes.md +66 -0
- package/skills/webhooks/scripts/verify.sh +142 -0
- package/skills/webinar/SKILL.md +196 -0
- package/skills/webinar/evals/README.md +14 -0
- package/skills/webinar/evals/cases.yaml +44 -0
- package/skills/webinar/references/email-cadence.md +75 -0
- package/skills/webinar/references/run-of-show.md +83 -0
- package/skills/whatsapp-telegram/SKILL.md +235 -0
- package/skills/whatsapp-telegram/evals/README.md +11 -0
- package/skills/whatsapp-telegram/evals/cases.yaml +44 -0
- package/skills/whatsapp-telegram/references/telegram-bot-api.md +91 -0
- package/skills/whatsapp-telegram/references/whatsapp-cloud-api.md +103 -0
- package/skills/whatsapp-telegram/scripts/verify.sh +90 -0
- package/skills/wordpress/SKILL.md +224 -0
- package/skills/wordpress/evals/README.md +3 -0
- package/skills/wordpress/evals/cases.yaml +50 -0
- package/skills/wordpress/references/hardening.md +108 -0
- package/skills/wordpress/references/performance.md +80 -0
- package/skills/wordpress/references/woocommerce.md +65 -0
- package/skills/wordpress/scripts/verify.sh +96 -0
- package/skills/worktrees/SKILL.md +199 -0
- package/skills/worktrees/evals/README.md +78 -0
- package/skills/worktrees/evals/cases.yaml +47 -0
- package/skills/youtube-api/SKILL.md +286 -0
- package/skills/youtube-api/evals/README.md +3 -0
- package/skills/youtube-api/evals/cases.yaml +50 -0
- package/skills/youtube-api/references/analytics-queries.md +89 -0
- package/skills/youtube-api/references/oauth-setup.md +55 -0
- package/skills/youtube-api/references/wiki-schema.md +70 -0
- package/skills/youtube-api/scripts/verify.sh +84 -0
- package/skills/youtube-ideation/SKILL.md +234 -0
- package/skills/youtube-ideation/evals/README.md +14 -0
- package/skills/youtube-ideation/evals/cases.yaml +52 -0
- package/skills/youtube-ideation/references/idea-ledger-and-loop.md +89 -0
- package/skills/youtube-ideation/references/research-and-signals.md +92 -0
- package/skills/youtube-ideation/scripts/verify.sh +237 -0
- package/skills/youtube-packaging/SKILL.md +220 -0
- package/skills/youtube-packaging/evals/README.md +16 -0
- package/skills/youtube-packaging/evals/cases.yaml +48 -0
- package/skills/youtube-packaging/references/description-and-chapters.md +135 -0
- package/skills/youtube-packaging/scripts/verify.sh +250 -0
- package/skills/youtube-strategy/SKILL.md +157 -0
- package/skills/youtube-strategy/evals/README.md +5 -0
- package/skills/youtube-strategy/evals/cases.yaml +61 -0
- package/skills/youtube-strategy/references/channel-architecture.md +46 -0
- package/skills/youtube-strategy/references/wiki-records.md +86 -0
- package/skills/youtube-strategy/scripts/verify.sh +118 -0
- package/skills/youtube-thumbnails/SKILL.md +180 -0
- package/skills/youtube-thumbnails/evals/README.md +11 -0
- package/skills/youtube-thumbnails/evals/cases.yaml +48 -0
- package/skills/youtube-thumbnails/references/composition-and-specs.md +69 -0
- package/skills/youtube-thumbnails/references/experiment-log-format.md +65 -0
- package/skills/youtube-thumbnails/scripts/verify.sh +123 -0
- package/targets/claude.js +23 -0
- package/targets/codex.js +29 -0
- package/targets/cursor.js +20 -0
- package/targets/gemini.js +29 -0
- package/targets/index.js +55 -0
@@ -0,0 +1,469 @@
---
name: building-agents
description: "Use when designing or building an LLM agent, tool-using system, RAG pipeline, eval harness, or MCP server in this repo — across any provider (OpenAI, Anthropic, Google Gemini, or OSS via OpenAI-compatible endpoints / litellm). Triggers: 'build an agent', 'add tool calling / function calling', 'structured JSON output', 'RAG / retrieval / embeddings / rerank', 'agent loop / ReAct / orchestrator-worker / multi-agent', 'LLM eval / golden set / LLM-as-judge / regression gate', 'prompt caching / model routing / token budget / cost control', 'trace / observability for LLM calls', 'build an MCP server', or 'make our LLM code provider-agnostic / swap models'. FastAPI/Python, Next.js, Go, Flutter, Postgres stacks."
tags: [agents, llm, mcp, rag, evals, ai]
recommends: [secure-coding, deployment]
origin: risco
---

# Building production LLM agents (model-agnostic)

Build production LLM agents that are model-agnostic by construction — a thin provider adapter, a disciplined agent loop, schema-validated tools, provider-neutral RAG, eval gates, OTel tracing, and optionally an MCP server — so swapping OpenAI ↔ Anthropic ↔ Gemini ↔ OSS is a config change, not a rewrite.

## The one rule

> Program against a **capability interface**, never a vendor SDK. Vendor specifics (model id, tool-schema shape, JSON mode, caching, token limits) live behind one adapter resolved from config. If a model name or price appears in business logic, it's a bug.

> Model names and prices rot. Never hardcode them in app logic — resolve from config/registry, and re-verify the dated tables before quoting a number.
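
One way to keep prices out of business logic is a small dated registry looked up by config key. The sketch below is illustrative only: `ModelPrice`, `PRICES`, `cost_usd`, `is_stale`, the `example:small` key, and the figures are all hypothetical placeholders, not real vendor prices or part of this package.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelPrice:
    # USD per one million tokens. The numbers used below are placeholders.
    input_per_mtok: float
    output_per_mtok: float
    verified_on: date  # when the figure was last checked against the vendor's page


# Hypothetical registry; in practice this would be loaded from a config file.
PRICES: dict[str, ModelPrice] = {
    "example:small": ModelPrice(1.0, 2.0, date(2025, 1, 1)),
}


def cost_usd(spec: str, input_tokens: int, output_tokens: int) -> float:
    """Look up the price by config key and fail loudly if the model is unregistered."""
    price = PRICES.get(spec)
    if price is None:
        raise KeyError(f"no price registered for {spec!r}")
    total = input_tokens * price.input_per_mtok + output_tokens * price.output_per_mtok
    return total / 1_000_000


def is_stale(spec: str, today: date, max_age_days: int = 90) -> bool:
    """True when the dated entry is old enough to need re-verification."""
    return (today - PRICES[spec].verified_on).days > max_age_days
```

Callers pass the same `provider:model` string they read from config, so no model id or price literal needs to appear in a handler. The 90-day staleness window is an arbitrary default for the sketch.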

## When to use / When NOT to use

**Use when:** starting any production-bound LLM feature; code is hardwired to one SDK and you want to swap/route/fallback models; adding tools/function calling, structured output, or streaming; standing up RAG over Postgres/`pgvector` or an external store; building an eval harness / CI quality gate; adding tracing, cost tracking, caching, or routing/cascades; building or hardening an MCP server.

**Do NOT use when:**

- One-shot throwaway prompt, no tools, no eval, no production path → just call the SDK directly.
- Pure prompt-wording improvement with no architecture → that's prompt engineering, not this.
- Anthropic-SDK-specific tuning (caching internals, thinking, batch) in a file that *only* imports `anthropic` → defer to a dedicated Anthropic-SDK skill if your environment provides one (e.g. `claude-api`); this skill stays multi-provider.
- Workspace scaffolding (`01-TOOLS`/`02-DOCS` layout) → **`harness`**.
- Picking *which* coding agent (Claude Code vs Aider) → agent-eval territory, not this.
- No retrieval, no tools, no loop, no evals at all → you don't need an agent; say so.

## Decision rules (read before writing code)

1. **Adapter first** — define the `LLMProvider` Protocol before any provider call.
2. **Smallest loop that works** — single-agent before multi-agent; ReAct only when the path is uncertain; plan-execute when steps are knowable.
3. **Tools are typed contracts** — schema + validation + idempotency key on every side-effecting tool; no catch-all tools.
4. **Retrieve, don't stuff** — RAG when ground truth lives in data; cite or refuse.
5. **Eval before ship** — a golden set + regression gate in CI, or it's not production.
6. **Cheapest model that passes the eval** — route/cascade up, never default to flagship.
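
Rule 6 can be sketched as a cheapest-first cascade. This is a minimal illustration under stated assumptions: a "model" here is any async callable from prompt to text, and `cascade`, `Model`, and `passes` are hypothetical names, not this package's `route()` registry.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical shape: a tier is (name, async callable taking a prompt and returning text).
Model = Callable[[str], Awaitable[str]]


async def cascade(
    prompt: str,
    tiers: list[tuple[str, Model]],
    passes: Callable[[str], bool],
) -> tuple[str, str, bool]:
    """Try tiers in the given (cheapest-first) order.

    Returns (tier_name, output, ok). `ok` is False when no tier passed, in which
    case the last tier's output is returned so the caller can decide what to do.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    name, output = "", ""
    for name, model in tiers:
        output = await model(prompt)
        if passes(output):
            return name, output, True
    return name, output, False
```

The `passes` check is whatever acceptance test the eval suite defines (schema validation, a grader threshold, and so on). Escalation only happens when the cheaper tier fails it.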

## Architecture at a glance

```text
          ┌───────────────────────────────────────────────────────────┐
Caller ──▶│  Agent loop (perceive → decide → act → observe, bounded)  │
          └───────────────────────────────────────────────────────────┘
                 │              │              │              │
                 ▼              ▼              ▼              ▼
           ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐
           │ Provider   │ │ Tool       │ │ Retriever  │ │ Tracer     │
           │ adapter    │ │ registry   │ │ (pgvector) │ │ (OTel)     │
           │ ↔ OpenAI / │ │ → sandboxed│ │ → cite/    │ │ gen_ai.*   │
           │ Anthropic /│ │   tools    │ │   refuse   │ │            │
           │ Gemini /OSS│ └────────────┘ └────────────┘ └────────────┘
           └────────────┘
                 ▲                                          │
                 └──────────── Eval gate (CI) ◀─────────────┘
  provider-abstraction.md   agent-loops…/tools-and-rag.md   evals-and-observability.md
```

## The provider adapter (the heart of the skill)

The one payload to internalize. Python 3.12+, Pydantic v2, **async** so it composes
directly with the agent loop in `references/agent-loops-and-harness.md`. Streaming,
the Gemini and OSS/litellm adapters, tool-result plumbing, and a `route()` registry live
in `references/provider-abstraction.md` — this excerpt is the load-bearing core, not the
whole interface.

```python
from __future__ import annotations

import os
from typing import Literal, Protocol, runtime_checkable

from pydantic import BaseModel, Field


class Message(BaseModel):
    role: Literal["system", "user", "assistant", "tool"]
    content: str


class ToolSpec(BaseModel):
    name: str
    description: str
    parameters: dict  # JSON Schema for the tool's arguments


class Usage(BaseModel):
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0


class CompletionRequest(BaseModel):
    model: str  # resolved from config, e.g. "claude-sonnet-4-6" — never literal in logic
    messages: list[Message]
    tools: list[ToolSpec] = Field(default_factory=list)
    response_schema: dict | None = None  # JSON Schema -> structured output
    temperature: float = 0.0
    max_tokens: int = 1024


class CompletionResponse(BaseModel):
    text: str = ""
    tool_calls: list[dict] = Field(default_factory=list)  # [{id, name, arguments}]
    usage: Usage = Field(default_factory=Usage)
    raw: dict | None = None


@runtime_checkable
class LLMProvider(Protocol):
    # Async so it drives the async agent loop directly. The full interface in
    # references/provider-abstraction.md adds stream() and embed().
    async def complete(self, req: CompletionRequest) -> CompletionResponse: ...


class OpenAIAdapter:
    def __init__(self, model: str) -> None:
        from openai import AsyncOpenAI

        self.model, self.client = model, AsyncOpenAI()

    async def complete(self, req: CompletionRequest) -> CompletionResponse:
        # Chat Completions shape (universal, still current); references/provider-abstraction.md
        # gives the preferred Responses-API adapter. system stays a `system` role message here.
        kwargs: dict = {"model": self.model, "messages": [m.model_dump() for m in req.messages],
                        "temperature": req.temperature, "max_tokens": req.max_tokens}
        if req.tools:
            kwargs["tools"] = [{"type": "function", "function": {"name": t.name, "description": t.description, "parameters": t.parameters}} for t in req.tools]
        if req.response_schema:
            kwargs["response_format"] = {"type": "json_schema", "json_schema": {"name": "out", "schema": req.response_schema, "strict": True}}
        r = await self.client.chat.completions.create(**kwargs)
        msg = r.choices[0].message
        calls = [{"id": c.id, "name": c.function.name, "arguments": c.function.arguments} for c in (msg.tool_calls or [])]
        return CompletionResponse(text=msg.content or "", tool_calls=calls, raw=r.model_dump(),
                                  usage=Usage(input_tokens=r.usage.prompt_tokens, output_tokens=r.usage.completion_tokens))


class AnthropicAdapter:
    def __init__(self, model: str) -> None:
        from anthropic import AsyncAnthropic

        self.model, self.client = model, AsyncAnthropic()

    async def complete(self, req: CompletionRequest) -> CompletionResponse:
        # QUIRKS: system is a top-level param (not a message); tools use input_schema (not function).
        system = "\n".join(m.content for m in req.messages if m.role == "system") or None
        turns = [{"role": m.role, "content": m.content} for m in req.messages if m.role != "system"]
        kwargs: dict = {"model": self.model, "system": system, "messages": turns, "max_tokens": req.max_tokens, "temperature": req.temperature}
        if req.tools:
            kwargs["tools"] = [{"name": t.name, "description": t.description, "input_schema": t.parameters} for t in req.tools]
        if req.response_schema:  # structured output via tool-forcing
            kwargs["tools"] = [{"name": "out", "description": "Emit the result", "input_schema": req.response_schema}]
            kwargs["tool_choice"] = {"type": "tool", "name": "out"}
        r = await self.client.messages.create(**kwargs)
        text = "".join(b.text for b in r.content if b.type == "text")
        calls = [{"id": b.id, "name": b.name, "arguments": b.input} for b in r.content if b.type == "tool_use"]
        return CompletionResponse(text=text, tool_calls=calls, raw=r.model_dump(),
                                  usage=Usage(input_tokens=r.usage.input_tokens, output_tokens=r.usage.output_tokens))


def get_provider(spec: str | None = None) -> LLMProvider:
    """Parse 'provider:model' (default from env LLM) into a concrete adapter."""
    provider, _, model = (spec or os.environ["LLM"]).partition(":")
    if provider == "openai":
        return OpenAIAdapter(model)
    if provider == "anthropic":
        return AnthropicAdapter(model)
    raise ValueError(f"unknown provider: {provider!r}")
# Gemini + OSS/litellm adapters, streaming, tool-result plumbing, and route() registry
# -> references/provider-abstraction.md
```
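
In the excerpt above, the two adapters put different shapes into `tool_calls[*]["arguments"]`: the OpenAI adapter passes `c.function.arguments` through unchanged (Chat Completions returns it as a JSON-encoded string), while the Anthropic adapter passes `b.input` (already a parsed object). A caller that dispatches on a `dict` may want to normalize first. The helper below is a hypothetical sketch of that step, not part of the package.

```python
import json
from typing import Any


def normalize_arguments(arguments: Any) -> dict:
    """Return tool-call arguments as a dict, whichever shape the adapter produced."""
    if isinstance(arguments, dict):  # already parsed (Anthropic-style `input`)
        return arguments
    if isinstance(arguments, str):  # JSON-encoded string (OpenAI-style `arguments`)
        parsed = json.loads(arguments or "{}")  # treat an empty string as "no arguments"
        if not isinstance(parsed, dict):
            raise ValueError("tool arguments must decode to a JSON object")
        return parsed
    raise TypeError(f"unsupported arguments type: {type(arguments).__name__}")
```

Malformed JSON raises `json.JSONDecodeError` (a `ValueError` subclass), so the failure surfaces before any tool runs rather than being swallowed.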

## Good vs Bad

```python
# BAD — vendor SDK + model id hardwired into a handler; swapping = rewrite every call-site.
client = OpenAI()
def summarize(text: str) -> str:
    r = client.chat.completions.create(model="gpt-5.5", messages=[{"role": "user", "content": text}])
    return r.choices[0].message.content
```

```python
# GOOD — one adapter resolved from config; logic never names a model.
provider = get_provider(settings.llm)  # e.g. "anthropic:claude-sonnet-4-6"
async def summarize(text: str) -> str:
    req = CompletionRequest(model=settings.model_id, messages=[Message(role="user", content=text)])
    return (await provider.complete(req)).text
```

```python
# BAD — parse-and-pray; wrong shape fails silently at 3am.
raw = (await provider.complete(req)).text
try:
    data = json.loads(raw)
except json.JSONDecodeError:
    data = {}  # the bug is now invisible
```

```python
# GOOD — strict structured output + schema validation that fails loudly on drift.
class Answer(BaseModel):
    sentiment: Literal["pos", "neg", "neu"]
    score: float

req.response_schema = Answer.model_json_schema()
ans = Answer.model_validate_json((await provider.complete(req)).text)
```

```python
# BAD — unbounded loop; no cap/timeout/idempotency. Burns budget, repeats side effects, wedges.
while True:
    resp = await provider.complete(req)
    if not resp.tool_calls:
        break
    for call in resp.tool_calls:
        await run_tool(call)
```

```python
# GOOD — bounded loop: step cap + per-tool timeout + idempotency key (safe to retry).
for step in range(max_steps):
    resp = await provider.complete(req)
    if not resp.tool_calls:
        break
    for call in resp.tool_calls:
        async with asyncio.timeout(tool_timeout_s):
            await run_tool(call, idempotency_key=call["id"])
# full loop, budgets, recovery -> references/agent-loops-and-harness.md
```
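
The bounded loop above exits the same way whether the model finished or the step cap ran out. A caller usually wants to tell those apart. The following is a self-contained sketch under assumptions: `run_bounded`, `step`, and the stop-reason strings are hypothetical, and `asyncio.wait_for` stands in for `asyncio.timeout` so the example also runs on older interpreters.

```python
import asyncio
from typing import Awaitable, Callable


async def run_bounded(
    step: Callable[[], Awaitable[list[dict]]],
    run_tool: Callable[[dict], Awaitable[None]],
    max_steps: int = 8,
    tool_timeout_s: float = 10.0,
) -> str:
    """Drive the loop until the model stops calling tools, and report why it stopped."""
    for _ in range(max_steps):
        calls = await step()  # hypothetical: one model turn, returning its tool calls
        if not calls:
            return "done"
        for call in calls:
            try:
                await asyncio.wait_for(run_tool(call), timeout=tool_timeout_s)
            except asyncio.TimeoutError:
                return "tool_timeout"
    return "step_cap"
```

Returning a reason instead of silently falling out of the loop lets the caller log, retry, or escalate differently for `"step_cap"` and `"tool_timeout"`.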

## Tools & structured output (minimum viable)

```python
from typing import Callable, Literal

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class CreateInvoiceArgs(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject unknown keys from the model
    customer_id: str = Field(min_length=1)
    amount_cents: int = Field(gt=0)
    currency: Literal["EUR", "USD"] = "EUR"


class ToolResult(BaseModel):
    status: Literal["success", "warning", "error"]
    summary: str
    data: dict | None = None
    next_actions: list[str] = Field(default_factory=list)


def _create_invoice(args: CreateInvoiceArgs) -> ToolResult:
    invoice_id = f"inv_{args.customer_id}_{args.amount_cents}"  # real impl: DB insert + idempotency
    return ToolResult(status="success", summary=f"Created {invoice_id}", data={"id": invoice_id})


TOOLS: dict[str, tuple[type[BaseModel], Callable]] = {
    "create_invoice": (CreateInvoiceArgs, _create_invoice),
}


def dispatch(name: str, raw_args: dict) -> ToolResult:
    spec = TOOLS.get(name)
    if spec is None:
        return ToolResult(status="error", summary=f"unknown tool {name!r}", next_actions=["pick a registered tool"])
    args_model, handler = spec
    try:
        args = args_model.model_validate(raw_args)  # validate BEFORE side effects
    except ValidationError as e:
        return ToolResult(status="error", summary="invalid args", data={"errors": e.errors()},
                          next_actions=["fix the arguments and retry"])
    return handler(args)
# schema design, sandboxing, idempotency, DI-scoped DB sessions -> references/tools-and-rag.md
```
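
The `_create_invoice` comment above leaves idempotency to the real implementation. As a rough illustration of the idea only, the sketch below records each result under its idempotency key so a retried call does not repeat the side effect. `run_once` and the in-memory `_results` dict are hypothetical; a production version would presumably use a durable store with a uniqueness constraint and handle concurrent callers, which this does not.

```python
from typing import Callable

# In-memory stand-in for a durable store (assumption for the sketch).
_results: dict[str, dict] = {}


def run_once(idempotency_key: str, effect: Callable[[], dict]) -> dict:
    """Run `effect` at most once per key; repeat calls return the recorded result."""
    if idempotency_key in _results:
        return _results[idempotency_key]
    result = effect()  # the side effect happens only on the first call for this key
    _results[idempotency_key] = result
    return result
```

With the tool-call id as the key, as in the bounded-loop example, retrying a step after a timeout or crash returns the earlier result instead of creating a second invoice.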
|
|
279
|
+
|
|
280
|
+
## RAG in 30 lines (provider-agnostic embeddings)
|
|
281
|
+
|
|
282
|
+
```sql
|
|
283
|
+
CREATE EXTENSION IF NOT EXISTS vector;
|
|
284
|
+
CREATE TABLE IF NOT EXISTS docs (
|
|
285
|
+
id bigserial PRIMARY KEY,
|
|
286
|
+
content text NOT NULL,
|
|
287
|
+
embedding vector(1536) NOT NULL,
|
|
288
|
+
meta jsonb NOT NULL DEFAULT '{}'
|
|
289
|
+
);
|
|
290
|
+
CREATE INDEX IF NOT EXISTS docs_embedding_hnsw
|
|
291
|
+
ON docs USING hnsw (embedding vector_cosine_ops);
|
|
292
|
+
```

```python
async def embed(texts: list[str]) -> list[list[float]]:
    # Same provider interface as completions; impl in references/tools-and-rag.md.
    return await provider.embed(texts)  # returns one 1536-d vector per text


async def retrieve(query: str, k: int = 5, min_sim: float = 0.25) -> list[dict]:
    [q] = await embed([query])
    rows = await db.fetch(  # cosine distance <=>; similarity = 1 - distance
        "SELECT id, content, 1 - (embedding <=> $1) AS sim "
        "FROM docs ORDER BY embedding <=> $1 LIMIT $2",
        q, k,
    )
    return [dict(r) for r in rows if r["sim"] >= min_sim]


async def answer(query: str) -> str:
    chunks = await retrieve(query)
    if not chunks:  # refuse rather than hallucinate
        return "I don't have grounded information to answer that."
    context = "\n".join(f"[{c['id']}] {c['content']}" for c in chunks)
    req = CompletionRequest(
        model=settings.model_id,
        messages=[Message(role="system", content="Answer ONLY from context; cite chunk ids like [12]."),
                  Message(role="user", content=f"{context}\n\nQ: {query}")],
    )
    return (await provider.complete(req)).text
# chunking, hybrid RRF, rerank, citation grader, memory -> references/tools-and-rag.md
```
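
The pointer at the end of the block above mentions hybrid retrieval with RRF. As a rough illustration only, reciprocal rank fusion can be sketched in a few lines of Python. The function name `rrf_fuse`, the constant `k=60`, and the sample ids are assumptions made for this sketch; they are not taken from `references/tools-and-rag.md`.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[int]], k: int = 60, top_n: int = 5) -> list[int]:
    """Fuse ranked id lists with reciprocal rank fusion: score(d) = sum of 1 / (k + rank)."""
    scores: dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


vector_hits = [12, 7, 3]   # hypothetical ids from the pgvector query
keyword_hits = [7, 9, 12]  # hypothetical ids from a full-text query
print(rrf_fuse([vector_hits, keyword_hits]))
```

Ids that appear in both lists accumulate score from each, so they tend to rise above ids that only one retriever found, without any need to calibrate the two retrievers' raw scores against each other.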

## Evals & cost gates (the production line)

```python
import json
import statistics
import sys
import time


async def run_eval(golden_path: str, graders: list, thresholds: dict[str, float]) -> None:
    with open(golden_path) as f:  # JSONL: one {"input","expected","meta"} object per line
        cases = [json.loads(line) for line in f if line.strip()]
    results = []
    for case in cases:
        t0 = time.perf_counter()
        out = await provider.complete(CompletionRequest(model=settings.model_id,
            messages=[Message(role="user", content=case["input"])]))
        scores = {g.name: g.grade(case, out) for g in graders}  # exact / schema / LLM-judge
        results.append({"scores": scores, "cost": out.usage.cost_usd,
                        "ms": (time.perf_counter() - t0) * 1000})
    n = len(results)
    metrics = {
        "accuracy": sum(r["scores"]["exact"] for r in results) / n,
        "faithfulness": sum(r["scores"]["judge"] for r in results) / n,
        "p95_latency_ms": statistics.quantiles([r["ms"] for r in results], n=20)[-1],
        "cost_per_task": sum(r["cost"] for r in results) / n,
    }
    # Thresholds are floors, so they suit accuracy/faithfulness; latency and cost need ceilings.
    failed = [k for k, lo in thresholds.items() if metrics[k] < lo]
    print(json.dumps(metrics, indent=2))
    sys.exit(1 if failed else 0)  # CI gate: non-zero blocks the merge
```
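
The gate in the block above compares every metric against a lower bound, which fits accuracy and faithfulness but not latency or cost, where lower is better. One possible way to handle both directions is sketched here. The helper `failed_gates`, the `floors`/`ceilings` split, and all threshold values are hypothetical and are not part of the skill's runner.

```python
def failed_gates(metrics: dict[str, float],
                 floors: dict[str, float],
                 ceilings: dict[str, float]) -> list[str]:
    """Return the names of metrics that fall below a floor or exceed a ceiling."""
    too_low = [k for k, lo in floors.items() if metrics[k] < lo]
    too_high = [k for k, hi in ceilings.items() if metrics[k] > hi]
    return too_low + too_high


metrics = {"accuracy": 0.93, "faithfulness": 0.88, "p95_latency_ms": 2400.0, "cost_per_task": 0.012}
failed = failed_gates(
    metrics,
    floors={"accuracy": 0.90, "faithfulness": 0.85},             # hypothetical quality floors
    ceilings={"p95_latency_ms": 2000.0, "cost_per_task": 0.02},  # hypothetical budget ceilings
)
print(failed)  # a non-empty list would map to sys.exit(1) in the CI gate
```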

Routing cascade in one line: `route(task) → cheapest model whose eval passes; escalate only on a failed self-check`.
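
A minimal sketch of that cascade, assuming a cheapest-first model list and three injected callables. `MODELS_BY_COST`, `passes_eval`, `run`, and `self_check` are placeholder names for illustration, and the model ids are not real model names.

```python
from typing import Callable

# Hypothetical registry, cheapest first; the ids are placeholders.
MODELS_BY_COST = ["small-model", "mid-model", "flagship-model"]


def route(task: str,
          passes_eval: Callable[[str, str], bool],
          run: Callable[[str, str], str],
          self_check: Callable[[str, str], bool]) -> tuple[str, str]:
    """Try eligible models cheapest-first; escalate only when the self-check fails."""
    eligible = [m for m in MODELS_BY_COST if passes_eval(m, task)]
    if not eligible:
        raise RuntimeError("no model passes the eval for this task")
    answer = ""
    for model in eligible:
        answer = run(model, task)
        if self_check(task, answer):
            return model, answer
    return eligible[-1], answer  # every self-check failed: return the strongest attempt


# Stubbed usage: the small model fails the eval, the mid model fails its self-check.
model, answer = route(
    "classify sentiment",
    passes_eval=lambda m, t: m != "small-model",
    run=lambda m, t: f"{m}: ok",
    self_check=lambda t, a: a.startswith("flagship"),
)
print(model)
```

Keeping the eval verdict and the self-check as separate inputs means the offline gate decides which models are eligible at all, while the runtime check decides whether a given answer needs escalation.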

Pointer: full runner, judge, CI gate, caching, batching, budgets → `references/evals-and-observability.md`.

## Observability (OTel GenAI, vendor-neutral)

```python
from opentelemetry import trace

tracer = trace.get_tracer("agent")


async def traced_complete(provider: LLMProvider, req: CompletionRequest) -> CompletionResponse:
    with tracer.start_as_current_span("chat") as span:
        span.set_attribute("gen_ai.system", settings.llm.split(":")[0])
        span.set_attribute("gen_ai.request.model", req.model)
        resp = await provider.complete(req)
        span.set_attributes({"gen_ai.usage.input_tokens": resp.usage.input_tokens,
                             "gen_ai.usage.output_tokens": resp.usage.output_tokens,
                             "gen_ai.usage.cost_usd": resp.usage.cost_usd})
        return resp
# Langfuse / Phoenix / Braintrust are swappable OTLP backends: emit spans, swap the exporter.
# span-per-tool, trace-id propagation, exporters -> references/evals-and-observability.md
```

## MCP: when and the smallest server

**Native tools** when the agent and tools share a process/repo. **MCP** when tools must be reused across clients/teams or run out-of-process — accept the MCP cost (schema tokens, transport, ops) in exchange for reuse.

```python
from fastmcp import FastMCP  # standalone fastmcp 2.x; see references/mcp-servers.md

mcp = FastMCP("invoices")


@mcp.tool()
def create_invoice(customer_id: str, amount_cents: int, currency: str = "EUR") -> dict:
    """Create an invoice. amount_cents must be > 0."""
    if amount_cents <= 0:
        raise ValueError("amount_cents must be positive")
    return {"id": f"inv_{customer_id}_{amount_cents}", "currency": currency}


@mcp.resource("invoice://{invoice_id}")
def read_invoice(invoice_id: str) -> str:
    """Read-only invoice lookup by id."""
    return f"Invoice {invoice_id}: status=open"


if __name__ == "__main__":
    mcp.run()  # stdio transport
# (MCP spec 2025-11-25; stateless-core RC 2026-07-28; verify before quoting)
# TypeScript server, transports, HTTP+auth, testing -> references/mcp-servers.md
```

## Anti-patterns → STOP

| Rationalization | Reality |
|---|---|
| "I'll just call the OpenAI SDK directly, we'll never switch" | The adapter is ~40 lines; retrofitting it across 30 call-sites later is a rewrite. Adapter first. |
| "JSON output is usually valid, I'll parse it" | "Usually" = pages at 3am. Use strict structured output + schema validation. |
| "The agent loop works, I don't need a step cap" | Unbounded loops burn budget and wedge on errors. Cap steps, timeouts, and budget. |
| "One mega-tool that takes a freeform command is flexible" | It's unobservable and unsafe. Narrow typed tools with idempotency keys. |
| "We can eval by eyeballing outputs" | Vibes don't gate CI. Golden set + graders + threshold or it's not production. |
| "Default everything to the flagship model, it's smartest" | 5–20× cost for no measured gain. Route to the cheapest model that passes the eval. |
| "Stuff the whole doc in the prompt instead of RAG" | Blows context + cost and still hallucinates. Retrieve + cite + refuse. |
| "Retry on every exception" | Retrying a 400/401 wastes budget. Retry only transient (429/5xx/timeout) with backoff+jitter. |
| "Hardcode the model name, it's fine" | Names rot (Opus 4.7 → 4.8 in weeks). Resolve from config/registry. |
| "MCP for everything" | In-process native tools are simpler and faster when reuse isn't needed. MCP only for cross-client reuse. |
| "Tool results just return the raw API blob" | Give the model `status/summary/next_actions`; raw blobs waste context and stall recovery. |
| "Prompt caching is Anthropic-only so skip caching" | Each provider has its own caching/dedup; abstract it behind the adapter, don't skip it. |
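
The "Retry on every exception" row can be made concrete with a small sketch of transient-only retries using exponential backoff and full jitter. It is synchronous for brevity, and `ProviderError`, `is_transient`, and `call_with_retry` are hypothetical names for this illustration; a real adapter would map each vendor's exceptions onto something similar.

```python
import random
import time


class ProviderError(Exception):
    """Hypothetical normalized error carrying the HTTP status from the provider."""

    def __init__(self, status: int) -> None:
        super().__init__(f"provider returned {status}")
        self.status = status


def is_transient(exc: Exception) -> bool:
    """Only timeouts, 429, and 5xx are worth retrying."""
    if isinstance(exc, TimeoutError):
        return True
    return isinstance(exc, ProviderError) and (exc.status == 429 or 500 <= exc.status < 600)


def call_with_retry(fn, max_attempts: int = 4, base_delay: float = 0.5,
                    max_delay: float = 8.0, sleep=time.sleep):
    """Call fn(); retry transient errors with capped exponential backoff and full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_transient(exc) or attempt == max_attempts:
                raise  # 400/401 and exhausted retries surface immediately
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))  # full jitter: wait anywhere in [0, delay]
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.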

## Quick reference

| Task | Do this | Reference |
|---|---|---|
| Define a provider | `LLMProvider` Protocol + normalized request/response | `references/provider-abstraction.md` |
| Add a tool | Pydantic args model + `ToolResult` + validating dispatcher | `references/tools-and-rag.md` |
| Structured output | Strict JSON Schema (OpenAI) / tool-forcing (Anthropic) / `response_json_schema` (Gemini) | `references/provider-abstraction.md` |
| Build the loop | Bounded perceive → decide → act → observe with budgets | `references/agent-loops-and-harness.md` |
| Multi-agent | Orchestrator-worker + parallel fan-out with a semaphore | `references/agent-loops-and-harness.md` |
| RAG | `pgvector` ANN + hybrid RRF + rerank + cite | `references/tools-and-rag.md` |
| Eval gate | Golden set + graders + CI threshold exit code | `references/evals-and-observability.md` |
| Trace | OTel GenAI `gen_ai.*` spans, swappable exporter | `references/evals-and-observability.md` |
| Cut cost | Cache + route/cascade + batch + budgets | `references/evals-and-observability.md` |
| MCP server | FastMCP stdio / Streamable HTTP server | `references/mcp-servers.md` |

## verify.sh

`scripts/verify.sh` lints example agent code and dry-runs the eval smoke test in **the user's project** — not in this skill repo. It detects each tool (`ruff`, `mypy`, `tsc`/`node`, `go`, the eval entrypoint, `markdownlint`) and skips any that are missing with a yellow WARN; a missing tool never fails the run. Invoke it with `bash scripts/verify.sh` from the project root. Exit 0 means clean (or only skips); a non-zero exit means a real lint/typecheck/vet/eval failure.

## Project grounding (02-DOCS + CLAUDE.md)

When this skill runs in a project with a `02-DOCS/` layer (the
[`harness`](../harness/SKILL.md) Karpathy wiki), record this
project's agent decisions there and index them from the root `CLAUDE.md`, so the next
agent inherits the conventions instead of re-deriving them.

1. **Find the article** `02-DOCS/wiki/stack/agents.md`, linked from a `## Knowledge map` section in the root
   `CLAUDE.md`.
2. **If missing or stale**, create/update it with the project's real choices — the provider(s) and model routing, where the provider adapter lives, tool/RAG conventions, the eval gates, and the observability backend —
   then add/refresh the `CLAUDE.md` link (create the `## Knowledge map` section, and
   `CLAUDE.md` itself, if absent).
3. **Read it first on every use** and stay consistent; when a convention changes, update the
   article (bump its `Updated` date) in the same change.

No `02-DOCS/` layer? Skip silently (optionally suggest `harness`). Unlike the
brand study, technical conventions are *recorded, not gated* — never block the task on this.

## See Also

- `../harness/SKILL.md` — workspace `01-TOOLS`/`02-DOCS` scaffolding.
- Stack siblings the examples target: `../fastapi/SKILL.md`, `../nextjs/SKILL.md`, `../go/SKILL.md`, `../postgresdb/SKILL.md`, `../flutter/SKILL.md`; plus `../secure-coding/SKILL.md` and `../deployment/SKILL.md` for hardening and shipping the agent service.
- External skills (no sibling in this repo; use if your environment provides them): `claude-api` — Anthropic-SDK-specific tuning (caching internals, thinking, batch) when a file only imports `anthropic`; `deep-research` — the research-harness fan-out / verify pattern.
- ECC analogues (external, no links): `agent-harness-construction`, `eval-harness`, `cost-aware-llm-pipeline`, `mcp-server-patterns`, `context-budget`.
@@ -0,0 +1,68 @@
# Eval harness — `building-agents`

This evaluates two things: (1) **triggering** — does the skill activate on the
right prompts and stay quiet on near-misses — and (2) **capability** — does
loading the skill measurably improve the agent's answer. These run through an
**agent harness** (a Claude Code / SDK session), not a pure shell script: a
human or a driver script feeds prompts and grades outcomes against
`cases.yaml`.

## Files

- `cases.yaml` — `should_trigger`, `should_not_trigger`, and `capability` cases.

## 1. Triggering eval

Goal: confirm the skill's `description` routes correctly.

1. Load **only** this skill into the agent (its `SKILL.md` description in the
   skill index). For near-miss realism, also expose the sibling skills named in
   each `route_to` so the agent can pick the better match instead of defaulting.
2. For each prompt in `should_trigger` and `should_not_trigger`, start a fresh
   session and send the prompt verbatim. Do **not** answer the task — observe
   only whether `building-agents` is selected.
3. Run **3–5 trials** per prompt (the selection is stochastic). Record the
   fire-rate per prompt.
4. Score:
   - A `should_trigger` item passes if the skill fires in a strong majority of
     trials (e.g. >= 4/5 when running five).
   - A `should_not_trigger` item passes if the skill does **not** fire, and
     ideally the agent routes to the `route_to` sibling (or to nothing when
     `route_to: none`).

**Pass bar:** >= 90% trigger accuracy across all items
((true-positives + true-negatives) / total trials). Any `should_not_trigger`
that fires the skill is a false-positive and must be investigated — usually a
too-greedy `description`.
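
The scoring above can be sketched as a small helper. This is an illustration under assumptions, not the harness's actual driver: `trigger_report`, the prompt ids, and the shape of the trial log are hypothetical.

```python
def trigger_report(trials: dict[str, list[bool]], should_fire: dict[str, bool]) -> dict:
    """trials maps prompt id -> per-trial 'skill fired' flags; should_fire holds the expected label."""
    fire_rate = {pid: sum(flags) / len(flags) for pid, flags in trials.items()}
    # A trial is correct when the observed firing matches the expected label (TP or TN).
    correct = sum(flag == should_fire[pid] for pid, flags in trials.items() for flag in flags)
    total = sum(len(flags) for flags in trials.values())
    false_positives = [pid for pid, flags in trials.items() if not should_fire[pid] and any(flags)]
    accuracy = correct / total
    return {"fire_rate": fire_rate, "accuracy": accuracy,
            "false_positives": false_positives, "passed": accuracy >= 0.90}


trials = {
    "t1": [True, True, True, True, True],
    "t2": [True, True, True, True, False],
    "n1": [False, False, False, False, False],
    "n2": [False, False, False, False, True],
}
should_fire = {"t1": True, "t2": True, "n1": False, "n2": False}
report = trigger_report(trials, should_fire)
print(report["accuracy"], report["false_positives"])
```

Reporting `false_positives` separately matters because a run can clear the 90% bar overall while one near-miss prompt still fires the skill and needs a look at the `description`.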

## 2. Capability eval

Goal: prove the skill changes the output, not just the routing.

1. For each `capability` scenario, run it **twice** in matched sessions:
   - **WITHOUT** the skill loaded (baseline).
   - **WITH** the skill loaded.
2. Grade each answer against that scenario's `must_include` rubric: count how
   many points are concretely present (code or prose), not merely gestured at.
3. Compute coverage = points covered / total points, for both runs.

**Pass bar:** WITH-skill coverage >= 80% of the rubric, AND a clear lift over
the WITHOUT-skill baseline (the skill must add the load-bearing points the
baseline misses — typically the adapter/Protocol, idempotency, cite-or-refuse,
CI gate exit code, and transient-only retries). If the baseline already scores
roughly as high, the skill isn't earning its place on those points.
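
As a worked example of the coverage arithmetic, the sketch below grades one scenario with a seven-point rubric. The `min_lift` value of 0.20 is an assumed reading of "clear lift", which this README does not quantify, and the function names and graded flags are hypothetical.

```python
def coverage(covered: list[bool]) -> float:
    """Fraction of must_include points judged concretely present."""
    return sum(covered) / len(covered)


def capability_passes(with_skill: list[bool], without_skill: list[bool],
                      min_coverage: float = 0.80, min_lift: float = 0.20) -> bool:
    """Pass when WITH-skill coverage clears the bar and beats the baseline by min_lift (assumed)."""
    cov_with, cov_without = coverage(with_skill), coverage(without_skill)
    return cov_with >= min_coverage and (cov_with - cov_without) >= min_lift


with_skill = [True, True, True, True, True, True, False]        # 6 of 7 points covered
without_skill = [True, False, False, True, False, False, True]  # 3 of 7 points covered
print(round(coverage(with_skill), 2), round(coverage(without_skill), 2))
print(capability_passes(with_skill, without_skill))
```

With a baseline that already covers most of the rubric, the same function returns `False` even at high WITH-skill coverage, which matches the README's point that the skill has to add something the baseline misses.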

## Honest caveats

- Selection is non-deterministic; always use multiple trials and report rates,
  not a single pass/fail.
- Grading `must_include` and "routes to the right sibling" requires judgment —
  an LLM-as-judge can assist but spot-check it.
- Keep `cases.yaml` in sync with `SKILL.md`: when the description's triggers or
  the "Do NOT use" list change, update the cases in the same commit.

## Counts

- should_trigger: 7
- should_not_trigger: 7
- capability: 2
@@ -0,0 +1,60 @@
skill: building-agents

should_trigger:
  - prompt: "We're hardwired to the OpenAI SDK everywhere and the boss wants us to be able to switch to Claude or Gemini without a rewrite. How should I restructure this?"
    why: "Provider-agnostic restructuring / 'swap models is a config change' is the core thesis of the skill (the capability interface + adapter)."
  - prompt: "Add function calling to our support assistant so it can look up orders and issue refunds, but I don't want it to blow up if the model sends garbage arguments."
    why: "Adding tool calling with schema validation + typed contracts + idempotency on side-effecting tools is squarely in scope."
  - prompt: "Build me a chatbot that answers questions strictly from our internal handbook PDFs and says 'I don't know' when the answer isn't there."
    why: "RAG over a private corpus with cite-or-refuse grounding — a named trigger — even though the user never says 'RAG', 'embeddings', or 'retrieval'."
  - prompt: "Our agent sometimes spins forever and runs up a huge API bill. Help me make the loop safe and put a cost ceiling on each task."
    why: "Bounded agent loop with step cap, timeouts, and token/cost budgets — the loop + cost-control trigger — phrased as a symptom, not a tool name."
  - prompt: "Before we ship this LLM feature, I want a quality gate in CI that fails the build if accuracy or faithfulness drops on our test set."
    why: "Golden set + graders + regression gate as a CI exit code is the 'eval before ship' decision rule and an explicit trigger."
  - prompt: "I want to expose our internal pricing tools to several different AI clients so multiple teams can reuse them. What's the right way to wrap them?"
    why: "Cross-client tool reuse out-of-process = build an MCP server, an explicit trigger; the user describes the reuse need rather than naming MCP."
  - prompt: "Stand up tracing for our LLM calls so I can see token usage and cost per request in Langfuse without locking us to one backend."
    why: "Vendor-neutral OTel GenAI observability with a swappable exporter is an explicit trigger."

should_not_trigger:
  - prompt: "Reword this prompt so the model gives more concise answers — no code changes, just better wording."
    route_to: "none"
    why: "Pure prompt-wording improvement with no architecture is prompt engineering; the skill's 'Do NOT use' list excludes it explicitly."
  - prompt: "Write a one-off script that calls the OpenAI API once to summarize a string I paste in. No tools, no tests, throwaway."
    route_to: "none"
    why: "One-shot throwaway prompt with no tools/eval/production path — the skill says just call the SDK directly."
  - prompt: "Tune Anthropic prompt caching and extended thinking in our anthropic-only client.py to cut latency on a file that only imports the anthropic SDK."
    route_to: "claude-api"
    why: "Single-vendor Anthropic-SDK internals tuning in a file that only imports anthropic defers to the dedicated Anthropic skill; this skill stays multi-provider."
  - prompt: "Write the pgvector schema and an HNSW-indexed nearest-neighbour SQL query, plus tuning advice — just the database side, no LLM calls."
    route_to: "postgresdb"
    why: "Pure SQL / pgvector indexing with no embeddings pipeline, retrieval grounding, or agent is Postgres territory."
  - prompt: "Set up the CI pipeline and Docker deploy for our FastAPI service that happens to host an LLM endpoint — focus on the deployment, not the model code."
    route_to: "deployment"
    why: "Shipping/CI/containerization of the service is deployment; the skill itself points to deployment for that concern."
  - prompt: "Audit our auth handler for injection and secret-leak vulnerabilities — it's a normal REST endpoint, not an agent."
    route_to: "secure-coding"
    why: "A security-only review of a non-agent endpoint is secure-coding, not LLM agent architecture."
  - prompt: "Bootstrap the 01-TOOLS and 02-DOCS workspace layout for this repo and generate the root CLAUDE.md."
    route_to: "harness"
    why: "Workspace scaffolding (the 01-TOOLS/02-DOCS layout) is explicitly delegated to harness."

capability:
  - scenario: "Our codebase calls `OpenAI().chat.completions.create(model='gpt-5.5', ...)` directly in three handlers. We need to support switching to Anthropic Claude or Gemini via config, with strict JSON output for a sentiment classifier and one side-effecting `create_invoice` tool. Lay out the architecture and the key code."
    must_include:
      - "Defines a capability/provider interface (e.g. an LLMProvider Protocol with a normalized request/response) that business logic programs against instead of a vendor SDK"
      - "Model id is resolved from config/registry (e.g. 'provider:model'), never hardcoded in handler logic"
      - "Per-provider adapters that hide vendor quirks (OpenAI tools/function vs Anthropic system-as-top-level-param + input_schema / tool-forcing for structured output)"
      - "Strict structured output via JSON Schema with validation that fails loudly (e.g. Pydantic model_validate), not try/except json.loads that swallows errors"
      - "The create_invoice tool has a typed args model with extra='forbid', validates arguments BEFORE side effects, and carries an idempotency key"
      - "Returns a structured ToolResult (status/summary/next_actions), not the raw API blob"
      - "Mentions selecting the cheapest model that passes an eval rather than defaulting to the flagship"
  - scenario: "We have a RAG support bot and want to make it production-ready before launch. Show how to bound the agent loop, eval it, and observe it."
    must_include:
      - "Bounded agent loop: explicit step cap, per-tool timeout, and a token/cost budget (no while True)"
      - "Retrieval grounded with cite-or-refuse: cite chunk ids and refuse when no chunk passes a similarity threshold rather than hallucinating"
      - "A golden set plus graders (exact/schema/LLM-as-judge) with a CI threshold that exits non-zero to block the merge"
      - "Reports metrics including accuracy/faithfulness, p95 latency, and cost per task"
      - "OTel GenAI tracing with gen_ai.* attributes (model, input/output tokens, cost) and a swappable OTLP exporter/backend"
      - "Retries only transient errors (429/5xx/timeout) with backoff+jitter, not 4xx like 400/401"
      - "Routing/cascade to the cheapest model whose eval passes, escalating only on a failed self-check"