@heytherevibin/skillforge 0.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +16 -0
- package/CODE_OF_CONDUCT.md +34 -0
- package/CONTRIBUTING.md +38 -0
- package/LICENSE +21 -0
- package/README.md +337 -0
- package/RELEASING.md +93 -0
- package/SECURITY.md +31 -0
- package/STRATEGY.md +26 -0
- package/bin/cli.js +547 -0
- package/lib/packs.js +184 -0
- package/package.json +38 -0
- package/python/app/__init__.py +0 -0
- package/python/app/__pycache__/__init__.cpython-312.pyc +0 -0
- package/python/app/__pycache__/auth.cpython-312.pyc +0 -0
- package/python/app/__pycache__/main.cpython-312.pyc +0 -0
- package/python/app/auth.py +63 -0
- package/python/app/cli.py +78 -0
- package/python/app/db_paths.py +26 -0
- package/python/app/events_cli.py +175 -0
- package/python/app/main.py +647 -0
- package/python/app/materialize.py +138 -0
- package/python/app/mcp_server.py +610 -0
- package/python/app/route_cli.py +117 -0
- package/python/requirements-dev.txt +1 -0
- package/python/requirements.txt +7 -0
- package/python/tests/test_db_paths.py +41 -0
- package/skills/accessibility/SKILL.md +145 -0
- package/skills/agent-architecture-audit/SKILL.md +256 -0
- package/skills/agent-eval/SKILL.md +144 -0
- package/skills/agent-harness-construction/SKILL.md +72 -0
- package/skills/agent-introspection-debugging/SKILL.md +152 -0
- package/skills/agent-payment-x402/SKILL.md +224 -0
- package/skills/agent-sort/SKILL.md +214 -0
- package/skills/agentic-engineering/SKILL.md +62 -0
- package/skills/agentic-os/SKILL.md +386 -0
- package/skills/ai-first-engineering/SKILL.md +50 -0
- package/skills/ai-regression-testing/SKILL.md +384 -0
- package/skills/android-clean-architecture/SKILL.md +338 -0
- package/skills/angular-developer/SKILL.md +153 -0
- package/skills/angular-developer/references/angular-animations.md +160 -0
- package/skills/angular-developer/references/angular-aria.md +410 -0
- package/skills/angular-developer/references/cli.md +86 -0
- package/skills/angular-developer/references/component-harnesses.md +59 -0
- package/skills/angular-developer/references/component-styling.md +91 -0
- package/skills/angular-developer/references/components.md +117 -0
- package/skills/angular-developer/references/creating-services.md +97 -0
- package/skills/angular-developer/references/data-resolvers.md +69 -0
- package/skills/angular-developer/references/define-routes.md +67 -0
- package/skills/angular-developer/references/defining-providers.md +72 -0
- package/skills/angular-developer/references/di-fundamentals.md +120 -0
- package/skills/angular-developer/references/e2e-testing.md +56 -0
- package/skills/angular-developer/references/effects.md +83 -0
- package/skills/angular-developer/references/hierarchical-injectors.md +43 -0
- package/skills/angular-developer/references/host-elements.md +80 -0
- package/skills/angular-developer/references/injection-context.md +63 -0
- package/skills/angular-developer/references/inputs.md +101 -0
- package/skills/angular-developer/references/linked-signal.md +59 -0
- package/skills/angular-developer/references/loading-strategies.md +61 -0
- package/skills/angular-developer/references/mcp.md +108 -0
- package/skills/angular-developer/references/navigate-to-routes.md +69 -0
- package/skills/angular-developer/references/outputs.md +86 -0
- package/skills/angular-developer/references/reactive-forms.md +122 -0
- package/skills/angular-developer/references/rendering-strategies.md +44 -0
- package/skills/angular-developer/references/resource.md +77 -0
- package/skills/angular-developer/references/route-animations.md +56 -0
- package/skills/angular-developer/references/route-guards.md +52 -0
- package/skills/angular-developer/references/router-lifecycle.md +45 -0
- package/skills/angular-developer/references/router-testing.md +87 -0
- package/skills/angular-developer/references/show-routes-with-outlets.md +68 -0
- package/skills/angular-developer/references/signal-forms.md +795 -0
- package/skills/angular-developer/references/signals-overview.md +94 -0
- package/skills/angular-developer/references/tailwind-css.md +69 -0
- package/skills/angular-developer/references/template-driven-forms.md +114 -0
- package/skills/angular-developer/references/testing-fundamentals.md +65 -0
- package/skills/api-connector-builder/SKILL.md +120 -0
- package/skills/api-design/SKILL.md +522 -0
- package/skills/architecture-decision-records/SKILL.md +178 -0
- package/skills/article-writing/SKILL.md +78 -0
- package/skills/automation-audit-ops/SKILL.md +141 -0
- package/skills/autonomous-agent-harness/SKILL.md +272 -0
- package/skills/autonomous-loops/SKILL.md +609 -0
- package/skills/backend-patterns/SKILL.md +560 -0
- package/skills/benchmark/SKILL.md +92 -0
- package/skills/blueprint/SKILL.md +104 -0
- package/skills/browser-qa/SKILL.md +86 -0
- package/skills/bun-runtime/SKILL.md +83 -0
- package/skills/canary-watch/SKILL.md +98 -0
- package/skills/carrier-relationship-management/SKILL.md +211 -0
- package/skills/cisco-ios-patterns/SKILL.md +163 -0
- package/skills/ck/SKILL.md +147 -0
- package/skills/ck/commands/forget.mjs +44 -0
- package/skills/ck/commands/info.mjs +24 -0
- package/skills/ck/commands/init.mjs +143 -0
- package/skills/ck/commands/list.mjs +40 -0
- package/skills/ck/commands/migrate.mjs +202 -0
- package/skills/ck/commands/resume.mjs +36 -0
- package/skills/ck/commands/save.mjs +210 -0
- package/skills/ck/commands/shared.mjs +387 -0
- package/skills/ck/hooks/session-start.mjs +224 -0
- package/skills/claude-devfleet/SKILL.md +103 -0
- package/skills/click-path-audit/SKILL.md +244 -0
- package/skills/clickhouse-io/SKILL.md +438 -0
- package/skills/code-tour/SKILL.md +235 -0
- package/skills/codebase-onboarding/SKILL.md +232 -0
- package/skills/coding-standards/SKILL.md +548 -0
- package/skills/compose-multiplatform-patterns/SKILL.md +298 -0
- package/skills/connections-optimizer/SKILL.md +188 -0
- package/skills/content-engine/SKILL.md +126 -0
- package/skills/content-hash-cache-pattern/SKILL.md +160 -0
- package/skills/context-budget/SKILL.md +134 -0
- package/skills/continuous-agent-loop/SKILL.md +44 -0
- package/skills/continuous-learning/SKILL.md +129 -0
- package/skills/continuous-learning/config.json +18 -0
- package/skills/continuous-learning/evaluate-session.sh +69 -0
- package/skills/continuous-learning-v2/SKILL.md +358 -0
- package/skills/continuous-learning-v2/agents/observer-loop.sh +322 -0
- package/skills/continuous-learning-v2/agents/observer.md +198 -0
- package/skills/continuous-learning-v2/agents/session-guardian.sh +150 -0
- package/skills/continuous-learning-v2/agents/start-observer.sh +248 -0
- package/skills/continuous-learning-v2/config.json +8 -0
- package/skills/continuous-learning-v2/hooks/observe.sh +476 -0
- package/skills/continuous-learning-v2/scripts/detect-project.sh +288 -0
- package/skills/continuous-learning-v2/scripts/instinct-cli.py +1519 -0
- package/skills/continuous-learning-v2/scripts/lib/homunculus-dir.sh +31 -0
- package/skills/continuous-learning-v2/scripts/migrate-homunculus.sh +62 -0
- package/skills/continuous-learning-v2/scripts/test_parse_instinct.py +1018 -0
- package/skills/cost-aware-llm-pipeline/SKILL.md +182 -0
- package/skills/cost-tracking/SKILL.md +147 -0
- package/skills/council/SKILL.md +202 -0
- package/skills/cpp-coding-standards/SKILL.md +722 -0
- package/skills/cpp-testing/SKILL.md +323 -0
- package/skills/crosspost/SKILL.md +110 -0
- package/skills/csharp-testing/SKILL.md +320 -0
- package/skills/customer-billing-ops/SKILL.md +139 -0
- package/skills/customs-trade-compliance/SKILL.md +262 -0
- package/skills/dart-flutter-patterns/SKILL.md +562 -0
- package/skills/dashboard-builder/SKILL.md +108 -0
- package/skills/data-scraper-agent/SKILL.md +764 -0
- package/skills/database-migrations/SKILL.md +428 -0
- package/skills/deep-research/SKILL.md +158 -0
- package/skills/defi-amm-security/SKILL.md +166 -0
- package/skills/deployment-patterns/SKILL.md +426 -0
- package/skills/design-system/SKILL.md +81 -0
- package/skills/django-celery/SKILL.md +456 -0
- package/skills/django-patterns/SKILL.md +733 -0
- package/skills/django-security/SKILL.md +592 -0
- package/skills/django-tdd/SKILL.md +728 -0
- package/skills/django-verification/SKILL.md +468 -0
- package/skills/dmux-workflows/SKILL.md +190 -0
- package/skills/docker-patterns/SKILL.md +363 -0
- package/skills/documentation-lookup/SKILL.md +89 -0
- package/skills/dotnet-patterns/SKILL.md +320 -0
- package/skills/e2e-testing/SKILL.md +325 -0
- package/skills/email-ops/SKILL.md +120 -0
- package/skills/energy-procurement/SKILL.md +227 -0
- package/skills/enterprise-agent-ops/SKILL.md +49 -0
- package/skills/error-handling/SKILL.md +375 -0
- package/skills/eval-harness/SKILL.md +269 -0
- package/skills/evm-token-decimals/SKILL.md +130 -0
- package/skills/exa-search/SKILL.md +106 -0
- package/skills/fal-ai-media/SKILL.md +287 -0
- package/skills/fastapi-patterns/SKILL.md +327 -0
- package/skills/finance-billing-ops/SKILL.md +126 -0
- package/skills/flox-environments/SKILL.md +496 -0
- package/skills/flutter-dart-code-review/SKILL.md +434 -0
- package/skills/foundation-models-on-device/SKILL.md +243 -0
- package/skills/frontend-design-direction/SKILL.md +92 -0
- package/skills/frontend-patterns/SKILL.md +641 -0
- package/skills/frontend-slides/SKILL.md +183 -0
- package/skills/frontend-slides/STYLE_PRESETS.md +330 -0
- package/skills/frontend-slides/animation-patterns.md +122 -0
- package/skills/frontend-slides/html-template.md +419 -0
- package/skills/frontend-slides/scripts/export-pdf.sh +418 -0
- package/skills/frontend-slides/scripts/extract-pptx.py +96 -0
- package/skills/frontend-slides/viewport-base.css +153 -0
- package/skills/fsharp-testing/SKILL.md +279 -0
- package/skills/gan-style-harness/SKILL.md +278 -0
- package/skills/gateguard/SKILL.md +125 -0
- package/skills/git-workflow/SKILL.md +714 -0
- package/skills/github-ops/SKILL.md +143 -0
- package/skills/golang-patterns/SKILL.md +673 -0
- package/skills/golang-testing/SKILL.md +719 -0
- package/skills/google-workspace-ops/SKILL.md +94 -0
- package/skills/healthcare-cdss-patterns/SKILL.md +245 -0
- package/skills/healthcare-emr-patterns/SKILL.md +159 -0
- package/skills/healthcare-eval-harness/SKILL.md +207 -0
- package/skills/healthcare-phi-compliance/SKILL.md +145 -0
- package/skills/hermes-imports/SKILL.md +87 -0
- package/skills/hexagonal-architecture/SKILL.md +275 -0
- package/skills/hipaa-compliance/SKILL.md +78 -0
- package/skills/homelab-network-readiness/SKILL.md +169 -0
- package/skills/homelab-network-setup/SKILL.md +129 -0
- package/skills/homelab-pihole-dns/SKILL.md +274 -0
- package/skills/homelab-vlan-segmentation/SKILL.md +311 -0
- package/skills/homelab-wireguard-vpn/SKILL.md +305 -0
- package/skills/hookify-rules/SKILL.md +128 -0
- package/skills/inventory-demand-planning/SKILL.md +246 -0
- package/skills/investor-materials/SKILL.md +95 -0
- package/skills/investor-outreach/SKILL.md +90 -0
- package/skills/ios-icon-gen/SKILL.md +157 -0
- package/skills/ios-icon-gen/scripts/generate_icons.swift +258 -0
- package/skills/ios-icon-gen/scripts/iconify_gen.sh +235 -0
- package/skills/iterative-retrieval/SKILL.md +209 -0
- package/skills/java-coding-standards/SKILL.md +382 -0
- package/skills/jira-integration/SKILL.md +292 -0
- package/skills/jpa-patterns/SKILL.md +150 -0
- package/skills/knowledge-ops/SKILL.md +153 -0
- package/skills/kotlin-coroutines-flows/SKILL.md +283 -0
- package/skills/kotlin-exposed-patterns/SKILL.md +718 -0
- package/skills/kotlin-ktor-patterns/SKILL.md +688 -0
- package/skills/kotlin-patterns/SKILL.md +710 -0
- package/skills/kotlin-testing/SKILL.md +823 -0
- package/skills/laravel-patterns/SKILL.md +414 -0
- package/skills/laravel-plugin-discovery/SKILL.md +228 -0
- package/skills/laravel-security/SKILL.md +284 -0
- package/skills/laravel-tdd/SKILL.md +282 -0
- package/skills/laravel-verification/SKILL.md +178 -0
- package/skills/lead-intelligence/SKILL.md +320 -0
- package/skills/lead-intelligence/agents/enrichment-agent.md +85 -0
- package/skills/lead-intelligence/agents/mutual-mapper.md +75 -0
- package/skills/lead-intelligence/agents/outreach-drafter.md +98 -0
- package/skills/lead-intelligence/agents/signal-scorer.md +60 -0
- package/skills/liquid-glass-design/SKILL.md +279 -0
- package/skills/llm-trading-agent-security/SKILL.md +146 -0
- package/skills/logistics-exception-management/SKILL.md +221 -0
- package/skills/make-interfaces-feel-better/SKILL.md +151 -0
- package/skills/manim-video/SKILL.md +88 -0
- package/skills/manim-video/assets/network_graph_scene.py +52 -0
- package/skills/market-research/SKILL.md +74 -0
- package/skills/mcp-server-patterns/SKILL.md +68 -0
- package/skills/messages-ops/SKILL.md +103 -0
- package/skills/mle-workflow/SKILL.md +345 -0
- package/skills/motion-advanced/SKILL.md +596 -0
- package/skills/motion-foundations/SKILL.md +299 -0
- package/skills/motion-patterns/SKILL.md +435 -0
- package/skills/motion-ui/SKILL.md +574 -0
- package/skills/mysql-patterns/SKILL.md +411 -0
- package/skills/nanoclaw-repl/SKILL.md +32 -0
- package/skills/nestjs-patterns/SKILL.md +229 -0
- package/skills/netmiko-ssh-automation/SKILL.md +173 -0
- package/skills/network-bgp-diagnostics/SKILL.md +167 -0
- package/skills/network-config-validation/SKILL.md +210 -0
- package/skills/network-interface-health/SKILL.md +152 -0
- package/skills/nextjs-turbopack/SKILL.md +43 -0
- package/skills/nodejs-keccak256/SKILL.md +102 -0
- package/skills/nutrient-document-processing/SKILL.md +166 -0
- package/skills/nuxt4-patterns/SKILL.md +99 -0
- package/skills/openclaw-persona-forge/SKILL.md +288 -0
- package/skills/openclaw-persona-forge/gacha.py +224 -0
- package/skills/openclaw-persona-forge/gacha.sh +5 -0
- package/skills/openclaw-persona-forge/references/avatar-style.md +124 -0
- package/skills/openclaw-persona-forge/references/boundary-rules.md +53 -0
- package/skills/openclaw-persona-forge/references/error-handling.md +53 -0
- package/skills/openclaw-persona-forge/references/identity-tension.md +48 -0
- package/skills/openclaw-persona-forge/references/naming-system.md +39 -0
- package/skills/openclaw-persona-forge/references/output-template.md +166 -0
- package/skills/opensource-pipeline/SKILL.md +254 -0
- package/skills/perl-patterns/SKILL.md +503 -0
- package/skills/perl-security/SKILL.md +502 -0
- package/skills/perl-testing/SKILL.md +474 -0
- package/skills/plan-orchestrate/SKILL.md +253 -0
- package/skills/plankton-code-quality/SKILL.md +236 -0
- package/skills/postgres-patterns/SKILL.md +146 -0
- package/skills/product-capability/SKILL.md +140 -0
- package/skills/product-lens/SKILL.md +91 -0
- package/skills/production-audit/SKILL.md +206 -0
- package/skills/production-scheduling/SKILL.md +237 -0
- package/skills/project-flow-ops/SKILL.md +110 -0
- package/skills/prompt-optimizer/SKILL.md +398 -0
- package/skills/python-patterns/SKILL.md +749 -0
- package/skills/python-testing/SKILL.md +815 -0
- package/skills/pytorch-patterns/SKILL.md +395 -0
- package/skills/quality-nonconformance/SKILL.md +259 -0
- package/skills/quarkus-patterns/SKILL.md +721 -0
- package/skills/quarkus-security/SKILL.md +466 -0
- package/skills/quarkus-tdd/SKILL.md +810 -0
- package/skills/quarkus-verification/SKILL.md +478 -0
- package/skills/ralphinho-rfc-pipeline/SKILL.md +66 -0
- package/skills/redis-patterns/SKILL.md +402 -0
- package/skills/regex-vs-llm-structured-text/SKILL.md +219 -0
- package/skills/remotion-video-creation/SKILL.md +43 -0
- package/skills/remotion-video-creation/rules/3d.md +86 -0
- package/skills/remotion-video-creation/rules/animations.md +29 -0
- package/skills/remotion-video-creation/rules/assets/charts-bar-chart.tsx +173 -0
- package/skills/remotion-video-creation/rules/assets/text-animations-typewriter.tsx +100 -0
- package/skills/remotion-video-creation/rules/assets/text-animations-word-highlight.tsx +108 -0
- package/skills/remotion-video-creation/rules/assets.md +78 -0
- package/skills/remotion-video-creation/rules/audio.md +172 -0
- package/skills/remotion-video-creation/rules/calculate-metadata.md +104 -0
- package/skills/remotion-video-creation/rules/can-decode.md +75 -0
- package/skills/remotion-video-creation/rules/charts.md +58 -0
- package/skills/remotion-video-creation/rules/compositions.md +146 -0
- package/skills/remotion-video-creation/rules/display-captions.md +126 -0
- package/skills/remotion-video-creation/rules/extract-frames.md +229 -0
- package/skills/remotion-video-creation/rules/fonts.md +152 -0
- package/skills/remotion-video-creation/rules/get-audio-duration.md +58 -0
- package/skills/remotion-video-creation/rules/get-video-dimensions.md +68 -0
- package/skills/remotion-video-creation/rules/get-video-duration.md +58 -0
- package/skills/remotion-video-creation/rules/gifs.md +138 -0
- package/skills/remotion-video-creation/rules/images.md +130 -0
- package/skills/remotion-video-creation/rules/import-srt-captions.md +67 -0
- package/skills/remotion-video-creation/rules/lottie.md +67 -0
- package/skills/remotion-video-creation/rules/measuring-dom-nodes.md +34 -0
- package/skills/remotion-video-creation/rules/measuring-text.md +143 -0
- package/skills/remotion-video-creation/rules/sequencing.md +106 -0
- package/skills/remotion-video-creation/rules/tailwind.md +11 -0
- package/skills/remotion-video-creation/rules/text-animations.md +20 -0
- package/skills/remotion-video-creation/rules/timing.md +179 -0
- package/skills/remotion-video-creation/rules/transcribe-captions.md +19 -0
- package/skills/remotion-video-creation/rules/transitions.md +122 -0
- package/skills/remotion-video-creation/rules/trimming.md +52 -0
- package/skills/remotion-video-creation/rules/videos.md +171 -0
- package/skills/repo-scan/SKILL.md +78 -0
- package/skills/research-ops/SKILL.md +111 -0
- package/skills/returns-reverse-logistics/SKILL.md +239 -0
- package/skills/rules-distill/SKILL.md +263 -0
- package/skills/rules-distill/scripts/scan-rules.sh +58 -0
- package/skills/rules-distill/scripts/scan-skills.sh +129 -0
- package/skills/rust-patterns/SKILL.md +498 -0
- package/skills/rust-testing/SKILL.md +499 -0
- package/skills/safety-guard/SKILL.md +74 -0
- package/skills/santa-method/SKILL.md +306 -0
- package/skills/scientific-db-pubmed-database/SKILL.md +175 -0
- package/skills/scientific-db-uspto-database/SKILL.md +177 -0
- package/skills/scientific-pkg-gget/SKILL.md +166 -0
- package/skills/scientific-thinking-literature-review/SKILL.md +192 -0
- package/skills/scientific-thinking-scholar-evaluation/SKILL.md +160 -0
- package/skills/search-first/SKILL.md +181 -0
- package/skills/security-bounty-hunter/SKILL.md +99 -0
- package/skills/security-review/SKILL.md +502 -0
- package/skills/security-review/cloud-infrastructure-security.md +361 -0
- package/skills/seo/SKILL.md +153 -0
- package/skills/skill-comply/SKILL.md +57 -0
- package/skills/skill-comply/fixtures/compliant_trace.jsonl +5 -0
- package/skills/skill-comply/fixtures/noncompliant_trace.jsonl +3 -0
- package/skills/skill-comply/fixtures/tdd_spec.yaml +44 -0
- package/skills/skill-comply/prompts/classifier.md +24 -0
- package/skills/skill-comply/prompts/scenario_generator.md +62 -0
- package/skills/skill-comply/prompts/spec_generator.md +42 -0
- package/skills/skill-comply/pyproject.toml +15 -0
- package/skills/skill-comply/scripts/__init__.py +0 -0
- package/skills/skill-comply/scripts/classifier.py +85 -0
- package/skills/skill-comply/scripts/grader.py +124 -0
- package/skills/skill-comply/scripts/parser.py +107 -0
- package/skills/skill-comply/scripts/report.py +170 -0
- package/skills/skill-comply/scripts/run.py +127 -0
- package/skills/skill-comply/scripts/runner.py +186 -0
- package/skills/skill-comply/scripts/scenario_generator.py +70 -0
- package/skills/skill-comply/scripts/spec_generator.py +72 -0
- package/skills/skill-comply/scripts/utils.py +13 -0
- package/skills/skill-comply/tests/test_grader.py +197 -0
- package/skills/skill-comply/tests/test_parser.py +90 -0
- package/skills/skill-comply/tests/test_runner.py +172 -0
- package/skills/skill-scout/SKILL.md +139 -0
- package/skills/skill-stocktake/SKILL.md +193 -0
- package/skills/skill-stocktake/scripts/quick-diff.sh +87 -0
- package/skills/skill-stocktake/scripts/save-results.sh +56 -0
- package/skills/skill-stocktake/scripts/scan.sh +170 -0
- package/skills/social-graph-ranker/SKILL.md +153 -0
- package/skills/springboot-patterns/SKILL.md +313 -0
- package/skills/springboot-security/SKILL.md +271 -0
- package/skills/springboot-tdd/SKILL.md +157 -0
- package/skills/springboot-verification/SKILL.md +230 -0
- package/skills/strategic-compact/SKILL.md +129 -0
- package/skills/strategic-compact/suggest-compact.sh +54 -0
- package/skills/swift-actor-persistence/SKILL.md +142 -0
- package/skills/swift-concurrency-6-2/SKILL.md +216 -0
- package/skills/swift-protocol-di-testing/SKILL.md +189 -0
- package/skills/swiftui-patterns/SKILL.md +259 -0
- package/skills/tdd-workflow/SKILL.md +462 -0
- package/skills/team-builder/SKILL.md +166 -0
- package/skills/terminal-ops/SKILL.md +108 -0
- package/skills/tinystruct-patterns/SKILL.md +130 -0
- package/skills/tinystruct-patterns/references/architecture.md +77 -0
- package/skills/tinystruct-patterns/references/data-handling.md +35 -0
- package/skills/tinystruct-patterns/references/routing.md +57 -0
- package/skills/tinystruct-patterns/references/system-usage.md +74 -0
- package/skills/tinystruct-patterns/references/testing.md +59 -0
- package/skills/token-budget-advisor/SKILL.md +133 -0
- package/skills/ui-demo/SKILL.md +464 -0
- package/skills/ui-to-vue/SKILL.md +134 -0
- package/skills/unified-notifications-ops/SKILL.md +186 -0
- package/skills/verification-loop/SKILL.md +125 -0
- package/skills/video-editing/SKILL.md +309 -0
- package/skills/videodb/SKILL.md +373 -0
- package/skills/videodb/reference/api-reference.md +550 -0
- package/skills/videodb/reference/capture-reference.md +407 -0
- package/skills/videodb/reference/capture.md +101 -0
- package/skills/videodb/reference/editor.md +443 -0
- package/skills/videodb/reference/generative.md +331 -0
- package/skills/videodb/reference/rtstream-reference.md +564 -0
- package/skills/videodb/reference/rtstream.md +65 -0
- package/skills/videodb/reference/search.md +230 -0
- package/skills/videodb/reference/streaming.md +406 -0
- package/skills/videodb/reference/use-cases.md +118 -0
- package/skills/videodb/scripts/ws_listener.py +282 -0
- package/skills/visa-doc-translate/README.md +86 -0
- package/skills/visa-doc-translate/SKILL.md +117 -0
- package/skills/vite-patterns/SKILL.md +448 -0
- package/skills/windows-desktop-e2e/SKILL.md +787 -0
- package/skills/workspace-surface-audit/SKILL.md +124 -0
- package/skills/x-api/SKILL.md +233 -0
@@ -0,0 +1,279 @@
---
name: fsharp-testing
description: F# testing patterns with xUnit, FsUnit, Unquote, FsCheck property-based testing, integration tests, and test organization best practices.
---

# F# Testing Patterns

Comprehensive testing patterns for F# applications using xUnit, FsUnit, Unquote, FsCheck, and modern .NET testing practices.

## When to Activate

- Writing new tests for F# code
- Reviewing test quality and coverage
- Setting up test infrastructure for F# projects
- Debugging flaky or slow tests

## Test Framework Stack

| Tool | Purpose |
|---|---|
| **xUnit** | Test framework (standard .NET ecosystem choice) |
| **FsUnit.xUnit** | F#-friendly assertion syntax for xUnit |
| **Unquote** | Assertion library using F# quotations for clear failure messages |
| **FsCheck.xUnit** | Property-based testing integrated with xUnit |
| **NSubstitute** | Mocking .NET dependencies |
| **Testcontainers** | Real infrastructure in integration tests |
| **WebApplicationFactory** | ASP.NET Core integration tests |

## Unit Tests with xUnit + FsUnit

### Basic Test Structure

```fsharp
module OrderServiceTests

open Xunit
open FsUnit.Xunit

[<Fact>]
let ``create sets status to Pending`` () =
    let order = Order.create "cust-1" [ validItem ]
    order.Status |> should equal Pending

[<Fact>]
let ``confirm changes status to Confirmed`` () =
    let order = Order.create "cust-1" [ validItem ]
    let confirmed = Order.confirm order
    confirmed.Status |> should be (ofCase <@ Confirmed @>)
```

### Assertions with Unquote

Unquote uses F# quotations so failure messages show the full expression that failed, not just "expected X got Y".

```fsharp
module OrderValidationTests

open Xunit
open Swensen.Unquote

[<Fact>]
let ``PlaceOrder returns success when request is valid`` () =
    let request = { CustomerId = "cust-123"; Items = [ validItem ] }
    let result = OrderService.placeOrder request
    test <@ Result.isOk result @>

[<Fact>]
let ``order total sums item prices`` () =
    let items = [ { Sku = "A"; Quantity = 2; Price = 10m }
                  { Sku = "B"; Quantity = 1; Price = 5m } ]
    let total = Order.calculateTotal items
    test <@ total = 25m @>

[<Fact>]
let ``validated email rejects empty input`` () =
    let result = ValidatedEmail.create ""
    test <@ Result.isError result @>
```

### Async Tests

```fsharp
[<Fact>]
let ``PlaceOrder returns success when request is valid`` () = task {
    let deps = createTestDeps ()
    let request = { CustomerId = "cust-123"; Items = [ validItem ] }

    let! result = OrderService.placeOrder deps request

    test <@ Result.isOk result @>
}

[<Fact>]
let ``PlaceOrder returns error when items are empty`` () = task {
    let deps = createTestDeps ()
    let request = { CustomerId = "cust-123"; Items = [] }

    let! result = OrderService.placeOrder deps request

    test <@ Result.isError result @>
}
```

### Parameterized Tests with Theory

```fsharp
[<Theory>]
[<InlineData("")>]
[<InlineData(" ")>]
let ``PlaceOrder rejects empty customer ID`` (customerId: string) =
    let request = { CustomerId = customerId; Items = [ validItem ] }
    let result = OrderService.placeOrder request
    result |> should be (ofCase <@ Error @>)

[<Theory>]
[<InlineData("", false)>]
[<InlineData("a", false)>]
[<InlineData("user@example.com", true)>]
[<InlineData("user+tag@example.co.uk", true)>]
let ``IsValidEmail returns expected result`` (email: string, expected: bool) =
    test <@ EmailValidator.isValid email = expected @>
```

## Property-Based Testing with FsCheck

### Using FsCheck.xUnit

```fsharp
open System.Text.Json
open FsCheck
open FsCheck.Xunit

[<Property>]
let ``order total is always non-negative`` (items: NonEmptyArray<PositiveInt * decimal>) =
    let orderItems =
        items.Get
        |> Array.toList
        |> List.map (fun (qty, price) ->
            { Sku = "SKU"; Quantity = qty.Get; Price = abs price })
    let total = Order.calculateTotal orderItems
    total >= 0m

[<Property>]
let ``serialization roundtrips`` (order: Order) =
    let json = JsonSerializer.Serialize order
    let deserialized = JsonSerializer.Deserialize<Order> json
    deserialized = order
```

### Custom Generators

```fsharp
type OrderGenerators =
    static member ValidEmail () =
        gen {
            let! user = Gen.elements [ "alice"; "bob"; "carol" ]
            let! domain = Gen.elements [ "example.com"; "test.org" ]
            return $"{user}@{domain}"
        }
        |> Arb.fromGen

[<Property(Arbitrary = [| typeof<OrderGenerators> |])>]
let ``valid emails pass validation`` (email: string) =
    EmailValidator.isValid email
```

## Mocking Dependencies

### Function Stubs (Preferred)

```fsharp
open System.Threading.Tasks

let createTestDeps () =
    // Closures cannot capture `let mutable` locals, so saves go into a ResizeArray.
    let savedOrders = ResizeArray<Order>()
    { FindOrder = fun id -> task { return Map.tryFind id testData }
      SaveOrder = fun order -> task { savedOrders.Add order }
      SendNotification = fun _ -> Task.CompletedTask }

[<Fact>]
let ``PlaceOrder saves the confirmed order`` () = task {
    // Fresh recorder per test; overrides the default SaveOrder stub.
    let saved = ResizeArray<Order>()
    let deps =
        { createTestDeps () with
            SaveOrder = fun order -> task { saved.Add order } }

    let! _ = OrderService.placeOrder deps validRequest

    test <@ saved.Count = 1 @>
}
```
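
The stubs above rely on a dependency record that the skill never shows. A minimal sketch of what it might look like follows, assuming a record of functions; `Order`, `OrderDeps`, `sampleOrder`, `testData`, and `createRecordingDeps` are hypothetical names chosen for illustration, not types from the package. Saves are recorded in a `ResizeArray` because F# closures cannot capture `let mutable` locals.

```fsharp
open System
open System.Threading.Tasks

// Hypothetical domain type; the real Order lives in the application under test.
type Order = { Id: Guid; CustomerId: string }

// Hypothetical dependency record matching the field names used by createTestDeps.
type OrderDeps =
    { FindOrder: Guid -> Task<Order option>
      SaveOrder: Order -> Task<unit>
      SendNotification: Order -> Task }

let sampleOrder = { Id = Guid.NewGuid(); CustomerId = "cust-1" }
let testData = Map.ofList [ sampleOrder.Id, sampleOrder ]

// Returns the stubbed dependencies together with the list that records saves,
// so each caller gets fresh state and can inspect what was written.
let createRecordingDeps () =
    let saved = ResizeArray<Order>()
    let deps =
        { FindOrder = fun id -> task { return Map.tryFind id testData }
          SaveOrder = fun order -> task { saved.Add order }
          SendNotification = fun _ -> Task.CompletedTask }
    deps, saved

// Exercise the stubs synchronously, as a script would.
let deps, saved = createRecordingDeps ()
(deps.SaveOrder sampleOrder).Wait()
let found = (deps.FindOrder sampleOrder.Id).Result
printfn "saved=%d found=%b" saved.Count found.IsSome
```

Returning the recorder alongside the record gives each test its own state and something concrete to assert on, without reaching for a mocking library.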

### NSubstitute for .NET Interfaces

```fsharp
open System
open System.Threading
open System.Threading.Tasks
open NSubstitute

[<Fact>]
let ``calls repository with correct ID`` () = task {
    let repo = Substitute.For<IOrderRepository>()
    repo.FindByIdAsync(Arg.Any<Guid>(), Arg.Any<CancellationToken>())
        .Returns(Task.FromResult(Some testOrder))
    |> ignore

    let service = OrderService(repo)
    let! _ = service.GetOrder(testOrder.Id, CancellationToken.None)

    // Received(1) throws if the call was not made exactly once; the returned task is not needed.
    repo.Received(1).FindByIdAsync(testOrder.Id, Arg.Any<CancellationToken>())
    |> ignore
}
```

## ASP.NET Core Integration Tests

```fsharp
open System
open System.Net
open Microsoft.AspNetCore.Mvc.Testing
open Microsoft.EntityFrameworkCore
open Microsoft.Extensions.DependencyInjection
open Microsoft.Extensions.DependencyInjection.Extensions
open Xunit
open Swensen.Unquote

type OrderApiTests (factory: WebApplicationFactory<Program>) =
    // In an F# class, let bindings must come before interface and member declarations.
    let client =
        factory.WithWebHostBuilder(fun builder ->
            builder.ConfigureServices(fun (services: IServiceCollection) ->
                services.RemoveAll<DbContextOptions<AppDbContext>>() |> ignore
                services.AddDbContext<AppDbContext>(fun (options: DbContextOptionsBuilder) ->
                    options.UseInMemoryDatabase("TestDb") |> ignore) |> ignore)
            |> ignore)
            .CreateClient()

    interface IClassFixture<WebApplicationFactory<Program>>

    [<Fact>]
    member _.``GET order returns 404 when not found`` () = task {
        let! response = client.GetAsync($"/api/orders/{Guid.NewGuid()}")
        test <@ response.StatusCode = HttpStatusCode.NotFound @>
    }
```

## Test Organization

```
tests/
  MyApp.Tests/
    Unit/
      OrderServiceTests.fs
      PaymentServiceTests.fs
    Integration/
      OrderApiTests.fs
      OrderRepositoryTests.fs
    Properties/
      OrderPropertyTests.fs
    Helpers/
      TestData.fs
      TestDeps.fs
```
|
|
245
|
+
|
|
246
|
+
## Common Anti-Patterns
|
|
247
|
+
|
|
248
|
+
| Anti-Pattern | Fix |
|
|
249
|
+
|---|---|
|
|
250
|
+
| Testing implementation details | Test behavior and outcomes |
|
|
251
|
+
| Mutable shared test state | Fresh state per test |
|
|
252
|
+
| `Thread.Sleep` in async tests | Use `Task.Delay` with timeout, or polling helpers |
|
|
253
|
+
| Asserting on `sprintf` output | Assert on typed values and pattern matches |
|
|
254
|
+
| Ignoring `CancellationToken` | Always pass and verify cancellation |
|
|
255
|
+
| Skipping property-based tests | Use FsCheck for any function with clear invariants |
|
|
256
|
+
|
|
257
|
+
## Related Skills
|
|
258
|
+
|
|
259
|
+
- `dotnet-patterns` - Idiomatic .NET patterns, dependency injection, and architecture
|
|
260
|
+
- `csharp-testing` - C# testing patterns (shared infrastructure like WebApplicationFactory and Testcontainers applies to F# too)
|
|
261
|
+
|
|
262
|
+
## Running Tests
|
|
263
|
+
|
|
264
|
+
```bash
|
|
265
|
+
# Run all tests
|
|
266
|
+
dotnet test
|
|
267
|
+
|
|
268
|
+
# Run with coverage
|
|
269
|
+
dotnet test --collect:"XPlat Code Coverage"
|
|
270
|
+
|
|
271
|
+
# Run specific project
|
|
272
|
+
dotnet test tests/MyApp.Tests/
|
|
273
|
+
|
|
274
|
+
# Filter by test name
|
|
275
|
+
dotnet test --filter "FullyQualifiedName~OrderService"
|
|
276
|
+
|
|
277
|
+
# Watch mode during development
|
|
278
|
+
dotnet watch test --project tests/MyApp.Tests/
|
|
279
|
+
```
|
|
@@ -0,0 +1,278 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: gan-style-harness
|
|
3
|
+
description: "GAN-inspired Generator-Evaluator agent harness for building high-quality applications autonomously. Based on Anthropic's March 2026 harness design paper."
|
|
4
|
+
origin: the toolset-community
|
|
5
|
+
tools: Read, Write, Edit, Bash, Grep, Glob, Task
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# GAN-Style Harness Skill
|
|
9
|
+
|
|
10
|
+
> Inspired by [Anthropic's Harness Design for Long-Running Application Development](https://www.anthropic.com/engineering/harness-design-long-running-apps) (March 24, 2026)
|
|
11
|
+
|
|
12
|
+
A multi-agent harness that separates **generation** from **evaluation**, creating an adversarial feedback loop that drives quality far beyond what a single agent can achieve.
|
|
13
|
+
|
|
14
|
+
## Core Insight
|
|
15
|
+
|
|
16
|
+
> When asked to evaluate their own work, agents are pathological optimists — they praise mediocre output and talk themselves out of legitimate issues. But engineering a **separate evaluator** to be ruthlessly strict is far more tractable than teaching a generator to self-critique.
|
|
17
|
+
|
|
18
|
+
This is the same dynamic as GANs (Generative Adversarial Networks): the Generator produces, the Evaluator critiques, and that feedback drives the next iteration.
|
|
19
|
+
|
|
20
|
+
## When to Use
|
|
21
|
+
|
|
22
|
+
- Building complete applications from a one-line prompt
|
|
23
|
+
- Frontend design tasks requiring high visual quality
|
|
24
|
+
- Full-stack projects that need working features, not just code
|
|
25
|
+
- Any task where "AI slop" aesthetics are unacceptable
|
|
26
|
+
- Projects where you want to invest $50-200 for production-quality output
|
|
27
|
+
|
|
28
|
+
## When NOT to Use
|
|
29
|
+
|
|
30
|
+
- Quick single-file fixes (use standard `claude -p`)
|
|
31
|
+
- Tasks with tight budget constraints (<$10)
|
|
32
|
+
- Simple refactoring (use de-sloppify pattern instead)
|
|
33
|
+
- Tasks that are already well-specified with tests (use TDD workflow)
|
|
34
|
+
|
|
35
|
+
## Architecture
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
┌─────────────┐
|
|
39
|
+
│ PLANNER │
|
|
40
|
+
│ (Opus 4.6) │
|
|
41
|
+
└──────┬──────┘
|
|
42
|
+
│ Product Spec
|
|
43
|
+
│ (features, sprints, design direction)
|
|
44
|
+
▼
|
|
45
|
+
┌────────────────────────┐
|
|
46
|
+
│ │
|
|
47
|
+
│ GENERATOR-EVALUATOR │
|
|
48
|
+
│ FEEDBACK LOOP │
|
|
49
|
+
│ │
|
|
50
|
+
│ ┌──────────┐ │
|
|
51
|
+
│ │GENERATOR │--build-->│──┐
|
|
52
|
+
│ │(Opus 4.6)│ │ │
|
|
53
|
+
│ └────▲─────┘ │ │
|
|
54
|
+
│ │ │ │ live app
|
|
55
|
+
│ feedback │ │
|
|
56
|
+
│ │ │ │
|
|
57
|
+
│ ┌────┴─────┐ │ │
|
|
58
|
+
│ │EVALUATOR │<-test----│──┘
|
|
59
|
+
│ │(Opus 4.6)│ │
|
|
60
|
+
│ │+Playwright│ │
|
|
61
|
+
│ └──────────┘ │
|
|
62
|
+
│ │
|
|
63
|
+
│ 5-15 iterations │
|
|
64
|
+
└────────────────────────┘
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## The Three Agents
|
|
68
|
+
|
|
69
|
+
### 1. Planner Agent
|
|
70
|
+
|
|
71
|
+
**Role:** Product manager — expands a brief prompt into a full product specification.
|
|
72
|
+
|
|
73
|
+
**Key behaviors:**
|
|
74
|
+
- Takes a one-line prompt and produces a 16-feature, multi-sprint specification
|
|
75
|
+
- Defines user stories, technical requirements, and visual design direction
|
|
76
|
+
- Is deliberately **ambitious** — conservative planning leads to underwhelming results
|
|
77
|
+
- Produces evaluation criteria that the Evaluator will use later
|
|
78
|
+
|
|
79
|
+
**Model:** Opus 4.6 (needs deep reasoning for spec expansion)
|
|
80
|
+
|
|
81
|
+
### 2. Generator Agent
|
|
82
|
+
|
|
83
|
+
**Role:** Developer — implements features according to the spec.
|
|
84
|
+
|
|
85
|
+
**Key behaviors:**
|
|
86
|
+
- Works in structured sprints (or continuous mode with newer models)
|
|
87
|
+
- Negotiates a "sprint contract" with the Evaluator before writing code
|
|
88
|
+
- Uses full-stack tooling: React, FastAPI/Express, databases, CSS
|
|
89
|
+
- Manages git for version control between iterations
|
|
90
|
+
- Reads Evaluator feedback and incorporates it in next iteration
|
|
91
|
+
|
|
92
|
+
**Model:** Opus 4.6 (needs strong coding capability)
|
|
93
|
+
|
|
94
|
+
### 3. Evaluator Agent
|
|
95
|
+
|
|
96
|
+
**Role:** QA engineer — tests the live running application, not just code.
|
|
97
|
+
|
|
98
|
+
**Key behaviors:**
|
|
99
|
+
- Uses **Playwright MCP** to interact with the live application
|
|
100
|
+
- Clicks through features, fills forms, tests API endpoints
|
|
101
|
+
- Scores against four criteria (configurable):
|
|
102
|
+
1. **Design Quality** — Does it feel like a coherent whole?
|
|
103
|
+
2. **Originality** — Custom decisions vs. template/AI patterns?
|
|
104
|
+
3. **Craft** — Typography, spacing, animations, micro-interactions?
|
|
105
|
+
4. **Functionality** — Do all features actually work?
|
|
106
|
+
- Returns structured feedback with scores and specific issues
|
|
107
|
+
- Is engineered to be **ruthlessly strict** — never praises mediocre work
|
|
108
|
+
|
|
109
|
+
**Model:** Opus 4.6 (needs strong judgment + tool use)
|
|
110
|
+
|
|
111
|
+
## Evaluation Criteria
|
|
112
|
+
|
|
113
|
+
The default four criteria, each scored 1-10:
|
|
114
|
+
|
|
115
|
+
```markdown
|
|
116
|
+
## Evaluation Rubric
|
|
117
|
+
|
|
118
|
+
### Design Quality (weight: 0.3)
|
|
119
|
+
- 1-3: Generic, template-like, "AI slop" aesthetics
|
|
120
|
+
- 4-6: Competent but unremarkable, follows conventions
|
|
121
|
+
- 7-8: Distinctive, cohesive visual identity
|
|
122
|
+
- 9-10: Could pass for a professional designer's work
|
|
123
|
+
|
|
124
|
+
### Originality (weight: 0.2)
|
|
125
|
+
- 1-3: Default colors, stock layouts, no personality
|
|
126
|
+
- 4-6: Some custom choices, mostly standard patterns
|
|
127
|
+
- 7-8: Clear creative vision, unique approach
|
|
128
|
+
- 9-10: Surprising, delightful, genuinely novel
|
|
129
|
+
|
|
130
|
+
### Craft (weight: 0.3)
|
|
131
|
+
- 1-3: Broken layouts, missing states, no animations
|
|
132
|
+
- 4-6: Works but feels rough, inconsistent spacing
|
|
133
|
+
- 7-8: Polished, smooth transitions, responsive
|
|
134
|
+
- 9-10: Pixel-perfect, delightful micro-interactions
|
|
135
|
+
|
|
136
|
+
### Functionality (weight: 0.2)
|
|
137
|
+
- 1-3: Core features broken or missing
|
|
138
|
+
- 4-6: Happy path works, edge cases fail
|
|
139
|
+
- 7-8: All features work, good error handling
|
|
140
|
+
- 9-10: Bulletproof, handles every edge case
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
### Scoring
|
|
144
|
+
|
|
145
|
+
- **Weighted score** = sum of (criterion_score * weight)
|
|
146
|
+
- **Pass threshold** = 7.0 (configurable)
|
|
147
|
+
- **Max iterations** = 15 (configurable, typically 5-15 sufficient)
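
The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the harness's own code: the names `DEFAULT_WEIGHTS`, `weighted_score`, and `passes` are hypothetical, and the short criterion keys are assumed to match the default `GAN_EVAL_CRITERIA` values. The weights are taken from the rubric and sum to 1.0.

```python
# Weights copied from the default rubric; they sum to 1.0.
# The dictionary keys are assumed names, chosen for this sketch.
DEFAULT_WEIGHTS = {
    "design": 0.3,
    "originality": 0.2,
    "craft": 0.3,
    "functionality": 0.2,
}


def weighted_score(scores, weights=DEFAULT_WEIGHTS):
    """Return the sum of (criterion_score * weight) over all weighted criteria."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(scores[name] * weight for name, weight in weights.items())


def passes(scores, threshold=7.0, weights=DEFAULT_WEIGHTS):
    """True when the weighted score meets the pass threshold.

    The score is rounded to 6 decimals so that float noise does not
    turn an exact 7.0 into a failing 6.999999...
    """
    return round(weighted_score(scores, weights), 6) >= threshold


example = {"design": 8, "originality": 6, "craft": 7, "functionality": 9}
print(round(weighted_score(example), 2))
print(passes(example))
```

Because the weights sum to 1.0, the weighted score stays on the same 1-10 scale as the individual criteria, which is what makes a single pass threshold meaningful.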
|
|
148
|
+
|
|
149
|
+
## Usage
|
|
150
|
+
|
|
151
|
+
### Via Command
|
|
152
|
+
|
|
153
|
+
```bash
|
|
154
|
+
# Full three-agent harness
|
|
155
|
+
/project:gan-build "Build a project management app with Kanban boards, team collaboration, and dark mode"
|
|
156
|
+
|
|
157
|
+
# With custom config
|
|
158
|
+
/project:gan-build "Build a recipe sharing platform" --max-iterations 10 --pass-threshold 7.5
|
|
159
|
+
|
|
160
|
+
# Frontend design mode (generator + evaluator only, no planner)
|
|
161
|
+
/project:gan-design "Create a landing page for a crypto portfolio tracker"
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
### Via Shell Script
|
|
165
|
+
|
|
166
|
+
```bash
|
|
167
|
+
# Basic usage
|
|
168
|
+
./scripts/gan-harness.sh "Build a music streaming dashboard"
|
|
169
|
+
|
|
170
|
+
# With options
|
|
171
|
+
GAN_MAX_ITERATIONS=10 \
|
|
172
|
+
GAN_PASS_THRESHOLD=7.5 \
|
|
173
|
+
GAN_EVAL_CRITERIA="functionality,performance,security" \
|
|
174
|
+
./scripts/gan-harness.sh "Build a REST API for task management"
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
### Via Claude Code (Manual)
|
|
178
|
+
|
|
179
|
+
```bash
|
|
180
|
+
# Step 1: Plan
|
|
181
|
+
claude -p --model opus "You are a Product Planner. Read PLANNER_PROMPT.md. Expand this brief into a full product spec: 'Build a Kanban board app'. Write spec to spec.md"
|
|
182
|
+
|
|
183
|
+
# Step 2: Generate (iteration 1)
|
|
184
|
+
claude -p --model opus "You are a Generator. Read spec.md. Implement Sprint 1. Start the dev server on port 3000."
|
|
185
|
+
|
|
186
|
+
# Step 3: Evaluate (iteration 1)
|
|
187
|
+
claude -p --model opus --allowedTools "Read,Bash,mcp__playwright__*" "You are an Evaluator. Read EVALUATOR_PROMPT.md. Test the live app at http://localhost:3000. Score against the rubric. Write feedback to feedback-001.md"
|
|
188
|
+
|
|
189
|
+
# Step 4: Generate (iteration 2 — reads feedback)
|
|
190
|
+
claude -p --model opus "You are a Generator. Read spec.md and feedback-001.md. Address all issues. Improve the scores."
|
|
191
|
+
|
|
192
|
+
# Repeat steps 3-4 until pass threshold met
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
## Evolution Across Model Capabilities
|
|
196
|
+
|
|
197
|
+
The harness should simplify as models improve. Following Anthropic's evolution:
|
|
198
|
+
|
|
199
|
+
### Stage 1 — Weaker Models (Sonnet-class)
|
|
200
|
+
- Full sprint decomposition required
|
|
201
|
+
- Context resets between sprints (avoid context anxiety)
|
|
202
|
+
- 2-agent minimum: Initializer + Coding Agent
|
|
203
|
+
- Heavy scaffolding compensates for model limitations
|
|
204
|
+
|
|
205
|
+
### Stage 2 — Capable Models (Opus 4.5-class)
|
|
206
|
+
- Full 3-agent harness: Planner + Generator + Evaluator
|
|
207
|
+
- Sprint contracts before each implementation phase
|
|
208
|
+
- 10-sprint decomposition for complex apps
|
|
209
|
+
- Context resets still useful but less critical
|
|
210
|
+
|
|
211
|
+
### Stage 3 — Frontier Models (Opus 4.6-class)
|
|
212
|
+
- Simplified harness: single planning pass, continuous generation
|
|
213
|
+
- Evaluation reduced to single end-pass (model is smarter)
|
|
214
|
+
- No sprint structure needed
|
|
215
|
+
- Automatic compaction handles context growth
|
|
216
|
+
|
|
217
|
+
> **Key principle:** Every harness component encodes an assumption about what the model can't do alone. When models improve, re-test those assumptions. Strip away what's no longer needed.
|
|
218
|
+
|
|
219
|
+
## Configuration
|
|
220
|
+
|
|
221
|
+
### Environment Variables
|
|
222
|
+
|
|
223
|
+
| Variable | Default | Description |
|
|
224
|
+
|----------|---------|-------------|
|
|
225
|
+
| `GAN_MAX_ITERATIONS` | `15` | Maximum generator-evaluator cycles |
|
|
226
|
+
| `GAN_PASS_THRESHOLD` | `7.0` | Weighted score to pass (1-10) |
|
|
227
|
+
| `GAN_PLANNER_MODEL` | `opus` | Model for planning agent |
|
|
228
|
+
| `GAN_GENERATOR_MODEL` | `opus` | Model for generator agent |
|
|
229
|
+
| `GAN_EVALUATOR_MODEL` | `opus` | Model for evaluator agent |
|
|
230
|
+
| `GAN_EVAL_CRITERIA` | `design,originality,craft,functionality` | Comma-separated criteria |
|
|
231
|
+
| `GAN_DEV_SERVER_PORT` | `3000` | Port for the live app |
|
|
232
|
+
| `GAN_DEV_SERVER_CMD` | `npm run dev` | Command to start dev server |
|
|
233
|
+
| `GAN_PROJECT_DIR` | `.` | Project working directory |
|
|
234
|
+
| `GAN_SKIP_PLANNER` | `false` | Skip planner, use spec directly |
|
|
235
|
+
| `GAN_EVAL_MODE` | `playwright` | `playwright`, `screenshot`, or `code-only` |
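
As a rough illustration of how a wrapper might read a few of these variables, the Python sketch below applies the defaults from the table. The `GanConfig` class and `load_config` function are hypothetical names invented for this example; the actual `gan-harness.sh` script may parse its environment differently.

```python
import os
from dataclasses import dataclass

# Modes listed for GAN_EVAL_MODE in the table above.
VALID_EVAL_MODES = ("playwright", "screenshot", "code-only")


@dataclass(frozen=True)
class GanConfig:
    """Hypothetical container for a subset of the documented settings."""
    max_iterations: int
    pass_threshold: float
    eval_criteria: tuple
    eval_mode: str
    skip_planner: bool


def load_config(env=None):
    """Build a GanConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    mode = env.get("GAN_EVAL_MODE", "playwright")
    if mode not in VALID_EVAL_MODES:
        raise ValueError(f"unknown GAN_EVAL_MODE: {mode!r}")
    raw_criteria = env.get(
        "GAN_EVAL_CRITERIA", "design,originality,craft,functionality"
    )
    criteria = tuple(p.strip() for p in raw_criteria.split(",") if p.strip())
    return GanConfig(
        max_iterations=int(env.get("GAN_MAX_ITERATIONS", "15")),
        pass_threshold=float(env.get("GAN_PASS_THRESHOLD", "7.0")),
        eval_criteria=criteria,
        eval_mode=mode,
        skip_planner=env.get("GAN_SKIP_PLANNER", "false").lower() == "true",
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, for example `load_config({"GAN_MAX_ITERATIONS": "10"})`.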
|
|
236
|
+
|
|
237
|
+
### Evaluation Modes
|
|
238
|
+
|
|
239
|
+
| Mode | Tools | Best For |
|
|
240
|
+
|------|-------|----------|
|
|
241
|
+
| `playwright` | Browser MCP + live interaction | Full-stack apps with UI |
|
|
242
|
+
| `screenshot` | Screenshot + visual analysis | Static sites, design-only |
|
|
243
|
+
| `code-only` | Tests + linting + build | APIs, libraries, CLI tools |
|
|
244
|
+
|
|
245
|
+
## Anti-Patterns
|
|
246
|
+
|
|
247
|
+
1. **Evaluator too lenient** — If the evaluator passes everything on iteration 1, your rubric is too generous. Tighten scoring criteria and add explicit penalties for common AI patterns.
|
|
248
|
+
|
|
249
|
+
2. **Generator ignoring feedback** — Ensure feedback is passed as a file, not inline. The generator should read `feedback-NNN.md` at the start of each iteration.
|
|
250
|
+
|
|
251
|
+
3. **Infinite loops** — Always set `GAN_MAX_ITERATIONS`. If the generator can't improve past a score plateau after 3 iterations, stop and flag for human review.
|
|
252
|
+
|
|
253
|
+
4. **Evaluator testing superficially** — The evaluator must use Playwright to **interact** with the live app, not just screenshot it. Click buttons, fill forms, test error states.
|
|
254
|
+
|
|
255
|
+
5. **Evaluator praising its own fixes** — Never let the evaluator suggest fixes and then evaluate those fixes. The evaluator only critiques; the generator fixes.
|
|
256
|
+
|
|
257
|
+
6. **Context exhaustion** — For long sessions, use Claude Agent SDK's automatic compaction or reset context between major phases.
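
The stopping rule from item 3 can be sketched as a small Python helper. This is one possible reading, not the harness's implementation: `should_stop` is a hypothetical name, and `min_gain` (the smallest improvement that still counts as progress) is an assumption, since the text only says "score plateau".

```python
def should_stop(history, max_iterations=15, pass_threshold=7.0,
                patience=3, min_gain=0.1):
    """Decide whether the generator-evaluator loop should end.

    history: weighted scores, one per completed iteration, oldest first.
    Returns a (stop, reason) tuple.
    """
    if not history:
        return (False, "no iterations yet")
    if history[-1] >= pass_threshold:
        return (True, "passed")
    if len(history) >= max_iterations:
        return (True, "max iterations reached")
    if len(history) > patience:
        # Compare the best of the last `patience` scores with the best before them.
        best_before = max(history[:-patience])
        best_recent = max(history[-patience:])
        if best_recent - best_before < min_gain:
            return (True, "plateau: flag for human review")
    return (False, "continue")
```

Checking the hard iteration cap before the plateau test means the loop always terminates, even if scores keep creeping up by tiny amounts.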
|
|
258
|
+
|
|
259
|
+
## Results: What to Expect
|
|
260
|
+
|
|
261
|
+
Based on Anthropic's published results:
|
|
262
|
+
|
|
263
|
+
| Metric | Solo Agent | GAN Harness | Improvement |
|
|
264
|
+
|--------|-----------|-------------|-------------|
|
|
265
|
+
| Time | 20 min | 4-6 hours | 12-18x longer |
|
|
266
|
+
| Cost | $9 | $125-200 | 14-22x more |
|
|
267
|
+
| Quality | Barely functional | Production-ready | Phase change |
|
|
268
|
+
| Core features | Broken | All working | N/A |
|
|
269
|
+
| Design | Generic AI slop | Distinctive, polished | N/A |
|
|
270
|
+
|
|
271
|
+
**The tradeoff is clear:** roughly 12-22x more time and cost for a qualitative leap in output quality. Reserve this harness for projects where that quality justifies the spend.
|
|
272
|
+
|
|
273
|
+
## References
|
|
274
|
+
|
|
275
|
+
- [Anthropic: Harness Design for Long-Running Apps](https://www.anthropic.com/engineering/harness-design-long-running-apps) — Original paper by Prithvi Rajasekaran
|
|
276
|
+
- [Epsilla: The GAN-Style Agent Loop](https://www.epsilla.com/blogs/anthropic-harness-engineering-multi-agent-gan-architecture) — Architecture deconstruction
|
|
277
|
+
- [Martin Fowler: Harness Engineering](https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html) — Broader industry context
|
|
278
|
+
- [OpenAI: Harness Engineering](https://openai.com/index/harness-engineering/) — OpenAI's parallel work
|
|
@@ -0,0 +1,125 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: gateguard
|
|
3
|
+
description: Fact-forcing gate that blocks Edit/Write/Bash (including MultiEdit) and demands concrete investigation (importers, data schemas, user instruction) before allowing the action. Measurably improves output quality by +2.25 points vs ungated agents.
|
|
4
|
+
origin: community
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# GateGuard — Fact-Forcing Pre-Action Gate
|
|
8
|
+
|
|
9
|
+
A PreToolUse hook that forces Claude to investigate before editing. Instead of self-evaluation ("are you sure?"), it demands concrete facts. The act of investigation creates awareness that self-evaluation never did.
|
|
10
|
+
|
|
11
|
+
## When to Activate
|
|
12
|
+
|
|
13
|
+
- Working on any codebase where file edits affect multiple modules
|
|
14
|
+
- Projects with data files that have specific schemas or date formats
|
|
15
|
+
- Teams where AI-generated code must match existing patterns
|
|
16
|
+
- Any workflow where Claude tends to guess instead of investigating
|
|
17
|
+
|
|
18
|
+
## Core Concept
|
|
19
|
+
|
|
20
|
+
LLM self-evaluation doesn't work. Ask "did you violate any policies?" and the answer is always "no." This is verified experimentally.
|
|
21
|
+
|
|
22
|
+
But asking "list every file that imports this module" forces the LLM to run Grep and Read. The investigation itself creates context that changes the output.
|
|
23
|
+
|
|
24
|
+
**Three-stage gate:**
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
1. DENY — block the first Edit/Write/Bash attempt
|
|
28
|
+
2. FORCE — tell the model exactly which facts to gather
|
|
29
|
+
3. ALLOW — permit retry after facts are presented
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
No competitor does all three. Most stop at deny.
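
To make the deny/force/allow cycle concrete, here is a minimal Python sketch of the state it implies. It is not the code in `scripts/hooks/gateguard-fact-force.js`: the `FactGate` class, its method names, and the rule that every demanded fact must be non-empty are all assumptions made for illustration.

```python
class FactGate:
    """Deny the first attempt per key, then allow once facts are presented."""

    def __init__(self):
        self._pending = {}     # key -> names of the facts that were demanded
        self._cleared = set()  # keys that have already passed the gate

    def check(self, key, required_facts):
        """Stage 1 and 2: deny and say which facts to gather, unless cleared."""
        if key in self._cleared:
            return ("allow", [])
        self._pending[key] = list(required_facts)
        return ("deny", list(required_facts))

    def present_facts(self, key, facts):
        """Stage 3: clear the key when every demanded fact is non-empty."""
        required = self._pending.get(key)
        if required is None:
            return False
        if all(str(facts.get(name, "")).strip() for name in required):
            self._cleared.add(key)
            del self._pending[key]
            return True
        return False


gate = FactGate()
print(gate.check("app/models.py", ["importers", "instruction"]))
gate.present_facts("app/models.py", {
    "importers": "app/views.py, app/tasks.py",
    "instruction": "rename the status field",
})
print(gate.check("app/models.py", ["importers", "instruction"]))
```

Keying the state by file path mirrors the "first edit per file" behavior described for the Edit gate; a session-wide key would model the routine Bash gate instead.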
|
|
33
|
+
|
|
34
|
+
## Evidence
|
|
35
|
+
|
|
36
|
+
Two independent A/B tests, identical agents, same task:
|
|
37
|
+
|
|
38
|
+
| Task | Gated | Ungated | Gap |
|
|
39
|
+
| --- | --- | --- | --- |
|
|
40
|
+
| Analytics module | 8.0/10 | 6.5/10 | +1.5 |
|
|
41
|
+
| Webhook validator | 10.0/10 | 7.0/10 | +3.0 |
|
|
42
|
+
| **Average** | **9.0** | **6.75** | **+2.25** |
|
|
43
|
+
|
|
44
|
+
Both agents produce code that runs and passes tests. The difference is design depth.
|
|
45
|
+
|
|
46
|
+
## Gate Types
|
|
47
|
+
|
|
48
|
+
### Edit / MultiEdit Gate (first edit per file)
|
|
49
|
+
|
|
50
|
+
MultiEdit is handled identically — each file in the batch is gated individually.
|
|
51
|
+
|
|
52
|
+
```
|
|
53
|
+
Before editing {file_path}, present these facts:
|
|
54
|
+
|
|
55
|
+
1. List ALL files that import/require this file (use Grep)
|
|
56
|
+
2. List the public functions/classes affected by this change
|
|
57
|
+
3. If this file reads/writes data files, show field names, structure,
|
|
58
|
+
and date format (use redacted or synthetic values, not raw production data)
|
|
59
|
+
4. Quote the user's current instruction verbatim
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
### Write Gate (first new file creation)
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
Before creating {file_path}, present these facts:
|
|
66
|
+
|
|
67
|
+
1. Name the file(s) and line(s) that will call this new file
|
|
68
|
+
2. Confirm no existing file serves the same purpose (use Glob)
|
|
69
|
+
3. If this file reads/writes data files, show field names, structure,
|
|
70
|
+
and date format (use redacted or synthetic values, not raw production data)
|
|
71
|
+
4. Quote the user's current instruction verbatim
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
### Destructive Bash Gate (every destructive command)
|
|
75
|
+
|
|
76
|
+
Triggers on: `rm -rf`, `git reset --hard`, `git push --force`, `drop table`, etc.
|
|
77
|
+
|
|
78
|
+
```
|
|
79
|
+
1. List all files/data this command will modify or delete
|
|
80
|
+
2. Write a one-line rollback procedure
|
|
81
|
+
3. Quote the user's current instruction verbatim
|
|
82
|
+
```
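
One way to recognize the destructive commands listed above is a small set of regular expressions. The Python sketch below covers only the four examples named in this section; `DESTRUCTIVE_PATTERNS` and `is_destructive` are hypothetical names, and the real hook's pattern list (the "etc.") is not reproduced here.

```python
import re

# Patterns for the four examples named in the text. Matching is case-insensitive
# so that SQL such as "DROP TABLE" is caught as well.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(?=\w*r)(?=\w*f)\w+",   # rm with both r and f flags, in any order
    r"\bgit\s+reset\s+--hard\b",
    r"\bgit\s+push\b.*--force\b",     # also matches --force-with-lease
    r"\bdrop\s+table\b",
]

_COMPILED = [re.compile(p, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS]


def is_destructive(command):
    """Return True if the shell command matches any destructive pattern."""
    return any(pattern.search(command) for pattern in _COMPILED)


for cmd in ("rm -rf build/", "git push origin main", "psql -c 'DROP TABLE users'"):
    print(cmd, "->", is_destructive(cmd))
```

A pattern list like this errs on the side of gating: a false positive costs one extra investigation step, while a false negative lets a destructive command through ungated.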
|
|
83
|
+
|
|
84
|
+
### Routine Bash Gate (once per session)
|
|
85
|
+
|
|
86
|
+
```
|
|
87
|
+
1. The current user request in one sentence
|
|
88
|
+
2. What this specific command verifies or produces
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
## Quick Start
|
|
92
|
+
|
|
93
|
+
### Option A: Use the toolset hook (zero install)
|
|
94
|
+
|
|
95
|
+
The hook at `scripts/hooks/gateguard-fact-force.js` is included in this plugin. Enable it via hooks.json.
|
|
96
|
+
|
|
97
|
+
If GateGuard blocks setup or repair work, start the session with
|
|
98
|
+
`SKILLFORGE_GATEGUARD=off`. For hook-level control, keep using
|
|
99
|
+
`SKILLFORGE_DISABLED_HOOKS` with the GateGuard hook ID.
|
|
100
|
+
|
|
101
|
+
### Option B: Full package with config
|
|
102
|
+
|
|
103
|
+
```bash
|
|
104
|
+
pip install gateguard-ai
|
|
105
|
+
gateguard init
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
This adds `.gateguard.yml` for per-project configuration (custom messages, ignore paths, gate toggles).
|
|
109
|
+
|
|
110
|
+
## Anti-Patterns
|
|
111
|
+
|
|
112
|
+
- **Don't use self-evaluation instead.** "Are you sure?" always gets "yes." This is experimentally verified.
|
|
113
|
+
- **Don't skip the data schema check.** Both A/B test agents assumed ISO-8601 dates when real data used `%Y/%m/%d %H:%M`. Checking data structure (with redacted values) prevents this entire class of bugs.
|
|
114
|
+
- **Don't gate every single Bash command.** Routine bash gates once per session. Destructive bash gates every time. This balance avoids slowdown while catching real risks.
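
The date-format mismatch from the second bullet is easy to reproduce. In this Python sketch the sample value is synthetic and the names `OBSERVED_FORMAT`, `ASSUMED_ISO_FORMAT`, and `parse_with` are illustrative; only the `%Y/%m/%d %H:%M` format string comes from the text above.

```python
from datetime import datetime

OBSERVED_FORMAT = "%Y/%m/%d %H:%M"        # format reported for the real data
ASSUMED_ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # what an agent might guess (ISO-8601)


def parse_with(fmt, raw):
    """Parse raw with fmt, returning None instead of raising on a mismatch."""
    try:
        return datetime.strptime(raw, fmt)
    except ValueError:
        return None


sample = "2024/03/05 14:30"  # synthetic value, not production data

print(parse_with(ASSUMED_ISO_FORMAT, sample))  # guess fails to parse
print(parse_with(OBSERVED_FORMAT, sample))     # observed format parses
```

Inspecting one redacted or synthetic row before writing the parser is enough to surface this kind of mismatch, which is the point of the schema check.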
|
|
115
|
+
|
|
116
|
+
## Best Practices
|
|
117
|
+
|
|
118
|
+
- Let the gate fire naturally. Don't try to pre-answer the gate questions — the investigation itself is what improves quality.
|
|
119
|
+
- Customize gate messages for your domain. If your project has specific conventions, add them to the gate prompts.
|
|
120
|
+
- Use `.gateguard.yml` to ignore paths like `.venv/`, `node_modules/`, `.git/`.
|
|
121
|
+
|
|
122
|
+
## Related Skills
|
|
123
|
+
|
|
124
|
+
- `safety-guard` — Runtime safety checks (complementary, not overlapping)
|
|
125
|
+
- `code-reviewer` — Post-edit review (GateGuard is pre-edit investigation)
|