@harness-engineering/cli 1.23.1 → 1.23.2
- package/dist/agents/commands/codex/harness/add-harness-component/SKILL.md +21 -12
- package/dist/agents/commands/codex/harness/cleanup-dead-code/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/detect-doc-drift/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/enforce-architecture/SKILL.md +5 -15
- package/dist/agents/commands/codex/harness/harness-architecture-advisor/SKILL.md +5 -15
- package/dist/agents/commands/codex/harness/harness-autopilot/SKILL.md +10 -0
- package/dist/agents/commands/codex/harness/harness-brainstorming/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-code-review/SKILL.md +5 -15
- package/dist/agents/commands/codex/harness/harness-codebase-cleanup/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-debugging/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-dependency-health/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-docs-pipeline/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-execution/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-hotspot-detector/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-impact-analysis/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-integrity/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-onboarding/SKILL.md +18 -10
- package/dist/agents/commands/codex/harness/harness-perf/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-planning/SKILL.md +10 -0
- package/dist/agents/commands/codex/harness/harness-refactoring/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-release-readiness/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-roadmap/SKILL.md +10 -1
- package/dist/agents/commands/codex/harness/harness-security-scan/SKILL.md +5 -15
- package/dist/agents/commands/codex/harness/harness-skill-authoring/SKILL.md +20 -1
- package/dist/agents/commands/codex/harness/harness-soundness-review/SKILL.md +10 -0
- package/dist/agents/commands/codex/harness/harness-supply-chain-audit/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-tdd/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-test-advisor/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/harness-verification/SKILL.md +9 -0
- package/dist/agents/commands/codex/harness/harness-verify/SKILL.md +8 -0
- package/dist/agents/commands/codex/harness/initialize-harness-project/SKILL.md +22 -13
- package/dist/agents/commands/cursor/harness/add-harness-component.mdc +12 -12
- package/dist/agents/commands/cursor/harness/harness-onboarding.mdc +10 -10
- package/dist/agents/commands/cursor/harness/harness-roadmap.mdc +1 -1
- package/dist/agents/commands/cursor/harness/initialize-harness-project.mdc +13 -13
- package/dist/agents/skills/claude-code/add-harness-component/SKILL.md +21 -12
- package/dist/agents/skills/claude-code/align-documentation/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/check-mechanical-constraints/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/cleanup-dead-code/SKILL.md +11 -0
- package/dist/agents/skills/claude-code/detect-doc-drift/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/enforce-architecture/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-accessibility/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-api-design/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-architecture-advisor/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-auth/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-brainstorming/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-caching/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-chaos/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-code-review/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-codebase-cleanup/SKILL.md +11 -0
- package/dist/agents/skills/claude-code/harness-compliance/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-containerization/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-data-pipeline/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-data-validation/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-database/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-debugging/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-dependency-health/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-deployment/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-design/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-design-mobile/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-design-system/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-design-web/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-diagnostics/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-docs-pipeline/SKILL.md +11 -0
- package/dist/agents/skills/claude-code/harness-dx/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-e2e/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-event-driven/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-execution/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-feature-flags/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-git-workflow/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-hotspot-detector/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-i18n/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-i18n-process/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-i18n-workflow/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-impact-analysis/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-incident-response/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-infrastructure-as-code/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-integration-test/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-integrity/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-knowledge-mapper/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-load-testing/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-ml-ops/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-mobile-patterns/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-mutation-test/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-observability/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-onboarding/SKILL.md +18 -10
- package/dist/agents/skills/claude-code/harness-parallel-agents/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-perf/SKILL.md +11 -0
- package/dist/agents/skills/claude-code/harness-perf-tdd/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-planning/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-pre-commit-review/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-product-spec/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-property-test/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-refactoring/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-release-readiness/SKILL.md +11 -0
- package/dist/agents/skills/claude-code/harness-resilience/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-roadmap/SKILL.md +10 -1
- package/dist/agents/skills/claude-code/harness-roadmap-pilot/SKILL.md +8 -0
- package/dist/agents/skills/claude-code/harness-secrets/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-security-review/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-security-scan/SKILL.md +5 -15
- package/dist/agents/skills/claude-code/harness-skill-authoring/SKILL.md +29 -1
- package/dist/agents/skills/claude-code/harness-soundness-review/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-sql-review/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-state-management/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-supply-chain-audit/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-tdd/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-test-advisor/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-test-data/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-ux-copy/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-verification/SKILL.md +9 -0
- package/dist/agents/skills/claude-code/harness-verify/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/harness-visual-regression/SKILL.md +10 -0
- package/dist/agents/skills/claude-code/initialize-harness-project/SKILL.md +22 -13
- package/dist/agents/skills/claude-code/validate-context-engineering/SKILL.md +9 -0
- package/dist/agents/skills/codex/add-harness-component/SKILL.md +21 -12
- package/dist/agents/skills/codex/align-documentation/SKILL.md +9 -0
- package/dist/agents/skills/codex/check-mechanical-constraints/SKILL.md +9 -0
- package/dist/agents/skills/codex/cleanup-dead-code/SKILL.md +11 -0
- package/dist/agents/skills/codex/detect-doc-drift/SKILL.md +9 -0
- package/dist/agents/skills/codex/enforce-architecture/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-accessibility/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-api-design/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-architecture-advisor/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-auth/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-autopilot/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-brainstorming/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-caching/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-chaos/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-code-review/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-codebase-cleanup/SKILL.md +11 -0
- package/dist/agents/skills/codex/harness-compliance/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-containerization/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-data-pipeline/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-data-validation/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-database/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-debugging/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-dependency-health/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-deployment/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-design/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-design-mobile/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-design-system/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-design-web/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-diagnostics/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-docs-pipeline/SKILL.md +11 -0
- package/dist/agents/skills/codex/harness-dx/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-e2e/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-event-driven/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-execution/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-feature-flags/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-git-workflow/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-hotspot-detector/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-i18n/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-i18n-process/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-i18n-workflow/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-impact-analysis/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-incident-response/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-infrastructure-as-code/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-integration-test/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-integrity/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-knowledge-mapper/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-load-testing/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-ml-ops/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-mobile-patterns/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-mutation-test/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-observability/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-onboarding/SKILL.md +18 -10
- package/dist/agents/skills/codex/harness-parallel-agents/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-perf/SKILL.md +11 -0
- package/dist/agents/skills/codex/harness-perf-tdd/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-planning/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-pre-commit-review/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-product-spec/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-property-test/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-refactoring/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-release-readiness/SKILL.md +11 -0
- package/dist/agents/skills/codex/harness-resilience/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-roadmap/SKILL.md +10 -1
- package/dist/agents/skills/codex/harness-roadmap-pilot/SKILL.md +8 -0
- package/dist/agents/skills/codex/harness-secrets/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-security-review/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-security-scan/SKILL.md +5 -15
- package/dist/agents/skills/codex/harness-skill-authoring/SKILL.md +29 -1
- package/dist/agents/skills/codex/harness-soundness-review/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-sql-review/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-state-management/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-supply-chain-audit/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-tdd/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-test-advisor/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-test-data/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-ux-copy/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-verification/SKILL.md +9 -0
- package/dist/agents/skills/codex/harness-verify/SKILL.md +10 -0
- package/dist/agents/skills/codex/harness-visual-regression/SKILL.md +10 -0
- package/dist/agents/skills/codex/initialize-harness-project/SKILL.md +22 -13
- package/dist/agents/skills/codex/validate-context-engineering/SKILL.md +9 -0
- package/dist/agents/skills/cursor/add-harness-component/SKILL.md +21 -12
- package/dist/agents/skills/cursor/align-documentation/SKILL.md +9 -0
- package/dist/agents/skills/cursor/check-mechanical-constraints/SKILL.md +9 -0
- package/dist/agents/skills/cursor/cleanup-dead-code/SKILL.md +11 -0
- package/dist/agents/skills/cursor/detect-doc-drift/SKILL.md +9 -0
- package/dist/agents/skills/cursor/enforce-architecture/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-accessibility/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-api-design/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-architecture-advisor/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-auth/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-autopilot/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-brainstorming/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-caching/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-chaos/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-code-review/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-codebase-cleanup/SKILL.md +11 -0
- package/dist/agents/skills/cursor/harness-compliance/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-containerization/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-data-pipeline/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-data-validation/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-database/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-debugging/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-dependency-health/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-deployment/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-design/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-design-mobile/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-design-system/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-design-web/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-diagnostics/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-docs-pipeline/SKILL.md +11 -0
- package/dist/agents/skills/cursor/harness-dx/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-e2e/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-event-driven/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-execution/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-feature-flags/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-git-workflow/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-hotspot-detector/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-i18n/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-i18n-process/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-i18n-workflow/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-impact-analysis/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-incident-response/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-infrastructure-as-code/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-integration-test/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-integrity/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-knowledge-mapper/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-load-testing/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-ml-ops/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-mobile-patterns/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-mutation-test/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-observability/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-onboarding/SKILL.md +18 -10
- package/dist/agents/skills/cursor/harness-parallel-agents/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-perf/SKILL.md +11 -0
- package/dist/agents/skills/cursor/harness-perf-tdd/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-planning/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-pre-commit-review/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-product-spec/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-property-test/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-refactoring/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-release-readiness/SKILL.md +11 -0
- package/dist/agents/skills/cursor/harness-resilience/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-roadmap/SKILL.md +10 -1
- package/dist/agents/skills/cursor/harness-roadmap-pilot/SKILL.md +8 -0
- package/dist/agents/skills/cursor/harness-secrets/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-security-review/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-security-scan/SKILL.md +5 -15
- package/dist/agents/skills/cursor/harness-skill-authoring/SKILL.md +29 -1
- package/dist/agents/skills/cursor/harness-soundness-review/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-sql-review/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-state-management/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-supply-chain-audit/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-tdd/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-test-advisor/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-test-data/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-ux-copy/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-verification/SKILL.md +9 -0
- package/dist/agents/skills/cursor/harness-verify/SKILL.md +10 -0
- package/dist/agents/skills/cursor/harness-visual-regression/SKILL.md +10 -0
- package/dist/agents/skills/cursor/initialize-harness-project/SKILL.md +22 -13
- package/dist/agents/skills/cursor/validate-context-engineering/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/add-harness-component/SKILL.md +21 -12
- package/dist/agents/skills/gemini-cli/align-documentation/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/check-mechanical-constraints/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/cleanup-dead-code/SKILL.md +11 -0
- package/dist/agents/skills/gemini-cli/detect-doc-drift/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/enforce-architecture/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-accessibility/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-api-design/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-architecture-advisor/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-auth/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-brainstorming/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-caching/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-chaos/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-code-review/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-codebase-cleanup/SKILL.md +11 -0
- package/dist/agents/skills/gemini-cli/harness-compliance/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-containerization/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-data-pipeline/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-data-validation/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-database/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-debugging/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-dependency-health/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-deployment/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-design/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-design-mobile/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-design-system/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-design-web/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-diagnostics/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-docs-pipeline/SKILL.md +11 -0
- package/dist/agents/skills/gemini-cli/harness-dx/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-e2e/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-event-driven/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-feature-flags/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-git-workflow/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-hotspot-detector/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-i18n/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-i18n-process/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-i18n-workflow/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-impact-analysis/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-incident-response/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-infrastructure-as-code/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-integration-test/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-integrity/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-knowledge-mapper/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-load-testing/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-ml-ops/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-mobile-patterns/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-mutation-test/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-observability/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-onboarding/SKILL.md +18 -10
- package/dist/agents/skills/gemini-cli/harness-parallel-agents/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-perf/SKILL.md +11 -0
- package/dist/agents/skills/gemini-cli/harness-perf-tdd/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-planning/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-pre-commit-review/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-product-spec/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-property-test/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-refactoring/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-release-readiness/SKILL.md +11 -0
- package/dist/agents/skills/gemini-cli/harness-resilience/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-roadmap/SKILL.md +10 -1
- package/dist/agents/skills/gemini-cli/harness-roadmap-pilot/SKILL.md +8 -0
- package/dist/agents/skills/gemini-cli/harness-secrets/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-security-review/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-security-scan/SKILL.md +5 -15
- package/dist/agents/skills/gemini-cli/harness-skill-authoring/SKILL.md +29 -1
- package/dist/agents/skills/gemini-cli/harness-soundness-review/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-sql-review/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-state-management/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-supply-chain-audit/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-tdd/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-test-advisor/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-test-data/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-ux-copy/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-verification/SKILL.md +9 -0
- package/dist/agents/skills/gemini-cli/harness-verify/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/harness-visual-regression/SKILL.md +10 -0
- package/dist/agents/skills/gemini-cli/initialize-harness-project/SKILL.md +22 -13
- package/dist/agents/skills/gemini-cli/validate-context-engineering/SKILL.md +9 -0
- package/dist/agents-md-HCCCO5PK.js +9 -0
- package/dist/{architecture-FVERI7BQ.js → architecture-S2H624W7.js} +5 -5
- package/dist/{assess-project-UGL5KLBV.js → assess-project-XSGK44S5.js} +1 -1
- package/dist/bin/harness-mcp.js +18 -18
- package/dist/bin/harness.js +124 -35
- package/dist/{check-phase-gate-C7JPPKMX.js → check-phase-gate-UGBJ237T.js} +5 -5
- package/dist/{chunk-RQ3AKUJB.js → chunk-2DHX6TAP.js} +4 -4
- package/dist/{chunk-7XZSHTYZ.js → chunk-2GT3HO2T.js} +3 -3
- package/dist/{chunk-ZLTFDTK7.js → chunk-2YA4XRI3.js} +5 -5
- package/dist/{chunk-GZKSBLQL.js → chunk-35EQ5UEI.js} +1 -1
- package/dist/{chunk-T5QWCVGK.js → chunk-4FHBPA3E.js} +11 -3
- package/dist/{chunk-ERS5EVUZ.js → chunk-5LMZA5LZ.js} +10 -10
- package/dist/{chunk-L57RL7MC.js → chunk-BK52Z6DR.js} +869 -419
- package/dist/{chunk-EUCASOD7.js → chunk-CLD4KL7O.js} +341 -71
- package/dist/{chunk-OD3S2NHN.js → chunk-E2GTL3YS.js} +1 -1
- package/dist/{chunk-YLN34N65.js → chunk-FP53DDB5.js} +1 -1
- package/dist/{chunk-7V5Y2L67.js → chunk-I47JLISV.js} +1 -1
- package/dist/{chunk-LAKMOIU6.js → chunk-KC5CTCEL.js} +9 -9
- package/dist/{chunk-UJHNGRS6.js → chunk-KTL3PHNQ.js} +6445 -6222
- package/dist/{chunk-DBSOCI3G.js → chunk-KV4M6Y5J.js} +1 -1
- package/dist/{chunk-FIAPHX37.js → chunk-LM5Z2WCA.js} +1 -1
- package/dist/{chunk-SD3SQOZ2.js → chunk-LOUH2LIC.js} +1 -1
- package/dist/{chunk-FNVAW5NG.js → chunk-MHOO7NLG.js} +11 -11
- package/dist/{chunk-HRUCT5YX.js → chunk-MZAHE4DK.js} +12 -12
- package/dist/{chunk-WKLLNUAT.js → chunk-NKL53UBL.js} +6 -6
- package/dist/{chunk-AQN7GFKU.js → chunk-PGF44T2D.js} +6 -6
- package/dist/{chunk-H7Y5CKTM.js → chunk-Q3XYV5UC.js} +1 -1
- package/dist/{chunk-KIR5PQX5.js → chunk-S5ZXT3TZ.js} +1 -1
- package/dist/{chunk-6KWBH4EO.js → chunk-UGD37ECK.js} +5 -5
- package/dist/{chunk-QBATHQXU.js → chunk-V27WDRYV.js} +540 -490
- package/dist/{chunk-YQ6KC6TE.js → chunk-YDRB55Q4.js} +1 -1
- package/dist/{chunk-CZEPCYVX.js → chunk-ZRYDYDB2.js} +6 -6
- package/dist/{chunk-7DMF3VT5.js → chunk-ZYJJUPNE.js} +1 -1
- package/dist/ci-workflow-I3V7FZNV.js +9 -0
- package/dist/{create-skill-U3XCFRZN.js → create-skill-AO25CJFM.js} +2 -2
- package/dist/{dist-USY2C5JL.js → dist-666AAZQ6.js} +1 -1
- package/dist/{dist-DZ63LLUD.js → dist-KQSTRP36.js} +1 -1
- package/dist/{dist-LPGVPYOZ.js → dist-MKWF5CXR.js} +7 -3
- package/dist/{dist-K56VJ4UJ.js → dist-WU3TVNNG.js} +7 -1
- package/dist/{docs-CGUBALYL.js → docs-R7UVQBMQ.js} +5 -5
- package/dist/engine-JGI3MWAC.js +9 -0
- package/dist/{entropy-H5OOCI57.js → entropy-IDHIG7HS.js} +4 -4
- package/dist/{feedback-XTDR7E3R.js → feedback-JZETY4UR.js} +1 -1
- package/dist/{generate-agent-definitions-RBI7Z4RY.js → generate-agent-definitions-D7B25YTM.js} +6 -6
- package/dist/{graph-loader-GRXDUWXO.js → graph-loader-BJULJYGG.js} +1 -1
- package/dist/index.d.ts +12 -8
- package/dist/index.js +54 -54
- package/dist/loader-E4KNTOP2.js +11 -0
- package/dist/mcp-67I2DBNM.js +37 -0
- package/dist/{performance-FSXEQJYB.js → performance-744OSR6P.js} +5 -5
- package/dist/{review-pipeline-VLKL7NV2.js → review-pipeline-HIO7HBW4.js} +1 -1
- package/dist/runtime-JXQ26U4Z.js +10 -0
- package/dist/{security-B76X5RL7.js → security-GDKHVFUC.js} +1 -1
- package/dist/{validate-KN6A2GN3.js → validate-2IUR3OWX.js} +5 -5
- package/dist/validate-cross-check-AM4T6P2K.js +9 -0
- package/package.json +5 -5
- package/dist/agents-md-FJXDMZPJ.js +0 -9
- package/dist/ci-workflow-S7VY625R.js +0 -9
- package/dist/engine-PEHFAFOT.js +0 -9
- package/dist/loader-IOC5L7NL.js +0 -11
- package/dist/mcp-7RPKBGIR.js +0 -37
- package/dist/runtime-3X2MV6R4.js +0 -10
- package/dist/validate-cross-check-LITTM24O.js +0 -9
- package/dist/{chunk-CJDVBBPB.js → chunk-3ISINLYT.js} +1 -1
|
@@ -493,6 +493,16 @@ emails.welcome.greeting -> "Hello {name}, welcome aboard!"
 Approve to continue scaffolding, or provide corrections.
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| ------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The user already told me they want Spanish and French — I can skip the configuration phase and go straight to scaffolding." | The configuration phase writes the `i18n` block to `harness.config.json`. Without it, subsequent runs of harness-i18n have no enabled flag, no strictness level, and no locale list to work against. Verbal confirmation does not substitute for written config. |
+| "In retrofit mode, the key naming is straightforward — I'll apply the generated key catalog directly without showing it to the user for review." | The retrofit key catalog checkpoint is a hard gate. Key names become permanent identifiers that translation teams, TMS tools, and source code will reference for years. The user must review and approve them before any files are written. |
+| "The pseudo-locale transformation for this string with `{name}` is obvious — I'll just wrap the entire string including the placeholder." | ICU MessageFormat placeholders must be preserved exactly. Transforming `{name}` to `{ñàmë}` breaks the interpolation at runtime. The pseudo-locale algorithm must detect and skip all placeholder syntax before applying accent and expansion transforms. |
+| "These target locale files already exist from a previous run — I'll overwrite them with the new extraction output to keep things clean." | Existing target locale translations must never be overwritten. A key with a translated (non-empty, non-source-identical) value in a target locale represents real translation work. Overwriting it destroys that work silently. Always preserve existing translations. |
+| "We found 120 strings in retrofit mode — I'll just run the full extraction without the audit phase since we clearly need everything extracted." | The retrofit audit results are what tell the user how much effort the extraction requires and let them prioritize high-traffic flows. Skipping the audit and going straight to extraction removes the user's ability to scope the work before it happens. |
+
 ## Gates
 
 These are hard stops. Violating any gate means the process has broken down.
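The placeholder rule in the table above (skip `{name}` before applying accent transforms) can be sketched roughly as follows. This is an illustration under stated assumptions, not the harness-i18n implementation: the accent map, the bracket markers, and the function name `pseudo_localize` are all hypothetical, and the regex only recognizes simple, non-nested `{...}` placeholders rather than full ICU MessageFormat syntax.

```python
import re

# Hypothetical accent map for illustration; a real pseudo-locale covers more characters.
ACCENTS = str.maketrans("aeinouAEIOU", "àëïñöüÀËÏÖÜ")

# Assumption: placeholders are simple, non-nested {...} groups.
PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def pseudo_localize(text: str) -> str:
    """Accent the literal text but copy every placeholder through unchanged."""
    parts = []
    last = 0
    for match in PLACEHOLDER.finditer(text):
        # Transform only the literal text before the placeholder.
        parts.append(text[last:match.start()].translate(ACCENTS))
        # Keep the placeholder byte-for-byte so interpolation still works.
        parts.append(match.group(0))
        last = match.end()
    parts.append(text[last:].translate(ACCENTS))
    # Brackets are a stand-in for an expansion marker.
    return "[" + "".join(parts) + "]"


print(pseudo_localize("Hello {name}, welcome aboard!"))
```

Nested ICU constructs such as plural or select arguments would need a real parser; the regex here would only protect their innermost braces.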
@@ -151,6 +151,16 @@ When no graph is available, use static analysis to approximate impact:
 - Report follows the structured output format
 - All findings are backed by graph query evidence (with graph) or systematic static analysis (without graph)
 
+## Rationalizations to Reject
+
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+
+| Rationalization | Why It Is Wrong |
+| -------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| "The change is small so the blast radius must be low -- I can skip the transitive dependent check" | Small changes to shared utilities can have outsized blast radius. A one-line change to auth.ts can affect 23 transitive dependents. |
+| "The graph is a few commits behind but it is close enough for this analysis" | If the graph is more than 2 commits behind, the skill requires a refresh before proceeding. Recent commits may have added new consumers. |
+| "No graph exists so I cannot produce a useful impact analysis" | The fallback strategy using import parsing and naming conventions achieves ~70% completeness. Missing the graph does not mean stopping. |
+
 ## Examples
 
 ### Example: Analyzing a Change to auth.ts
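The transitive dependent check mentioned in the table above amounts to a reachability query over the reversed import graph. A minimal sketch follows, assuming the import graph is already available as a plain dictionary; the module names and the `transitive_dependents` helper are hypothetical and do not reflect how the skill or its graph queries are actually implemented.

```python
from collections import deque

# Hypothetical import graph: each module maps to the modules it imports.
IMPORTS = {
    "routes/login.ts": {"services/session.ts"},
    "services/session.ts": {"auth.ts"},
    "utils/date.ts": set(),
}


def transitive_dependents(imports: dict[str, set[str]], changed: str) -> set[str]:
    """Return every module that directly or indirectly imports `changed`."""
    # Invert the graph: module -> modules that import it.
    reverse: dict[str, set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(module)

    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return seen


print(sorted(transitive_dependents(IMPORTS, "auth.ts")))
```

Even in this three-module toy graph, a change to `auth.ts` reaches a route file that never imports it directly, which is the effect the "small change, low blast radius" rationalization overlooks.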
@@ -208,6 +208,16 @@ Phase 4: IMPROVE
 4. [P2] Create secret rotation runbook for all services (owner: @sre)
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| --------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The root cause was human error — someone pushed a bad config" | Human error is a symptom, not a root cause. The root cause is the system that allowed a bad config to reach production undetected. A postmortem that stops at "human error" prevents no future incidents because it identifies no systemic fix. |
+| "We know what happened — we don't need to write a full postmortem for a minor incident" | The decision about what is "minor" is made under the stress of recovery, not under calm analysis. Contributing factors and near-misses that look minor in the moment are frequently the root cause of the next major incident. Document while the context is fresh. |
+| "The action items are in Slack — we don't need to track them formally" | Action items not tracked in a formal system with owners and due dates are not completed. Slack messages are buried within hours. The improvement phase of an incident exists only if its outputs are tracked to completion. |
+| "We don't have SLOs yet so we can't calculate error budget impact" | The absence of SLOs is itself a finding. Without SLOs, there is no objective basis for deciding whether reliability is acceptable. The incident is the forcing function to establish baseline SLOs. Document this gap as a P0 action item. |
+| "The incident was caused by a third-party outage — nothing we could have done" | Third-party outages expose missing circuit breakers, absent fallbacks, and insufficient multi-region routing. The postmortem should document why the third-party outage caused a customer-visible incident and what resilience improvements would have isolated the blast radius. |
+
 ## Gates
 
 - **No postmortem without a root cause statement.** A postmortem that says "cause unknown" is incomplete. If the root cause cannot be determined, the postmortem must document what was investigated, what was ruled out, and what additional data is needed. Do not close the investigation.
@@ -264,6 +264,16 @@ Phase 4: VALIDATE
 Result: WARN -- 2 security improvements needed
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We store state locally because it's just a dev environment" | Local state is not shared between team members. Two developers running `terraform apply` against the same environment with diverged local state will produce conflicting resource definitions, duplicate resources, or state corruption that requires manual recovery. |
+| "We haven't pinned the provider version because we want to automatically get security patches" | Unpinned providers can silently change resource behavior on `terraform init`. A `~> 5.0` constraint without an upper bound can pull a provider with breaking changes. Pin the minor version and upgrade explicitly via reviewed PRs so changes are intentional. |
+| "That S3 bucket has public access because it hosts our static site" | Static site hosting does not require a public bucket ACL. CloudFront with an Origin Access Control (OAC) policy serves files from a private bucket. Public bucket ACLs are a common misconfiguration vector because they apply to all objects, including accidentally uploaded sensitive files. |
+| "We'll tag resources properly before we go to production" | Untagged resources accumulate. Cost allocation reports become impossible, security audits cannot identify owners, and decommissioning requires manual investigation of every resource. Tagging must be enforced at resource creation — retroactive tagging at scale is a weeks-long engineering project. |
+| "Manual changes are fine for urgent hotfixes — we'll import them to Terraform afterward" | Manual changes without immediate import create drift that may be overwritten by the next `terraform apply`. The "import it later" step is almost never done. Every manual change that goes unimported erodes the reliability guarantee that IaC provides. |
+
 ## Gates
 
 - **No local state for shared infrastructure.** Terraform configurations managing shared resources must use a remote backend with locking. Local state is blocking for any non-experimental configuration.
@@ -256,6 +256,15 @@ describe('ProjectService contract', () => {
 });
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Why It Is Wrong |
+| ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "Testing the happy path is sufficient -- error scenarios are edge cases" | The success criteria require error scenarios (400, 401, 403, 404, 500, timeout) for all public endpoints. Error paths are where real-world failures happen. |
+| "We can test against the staging environment instead of setting up local mocks" | No integration tests that require external staging environments for CI. Tests must run with local test doubles. |
+| "The consumer contract changed, so I will update the consumer test to match the provider" | Contract changes must be coordinated. The provider may have introduced a bug, not an intentional change. |
+| "Tests pass when I run them in order, so they are fine" | Phase 4 requires running tests in random order. Any test that fails only in a specific order has a shared-state bug. |
+
 ## Gates
 
 - **No integration tests that require external staging environments for CI.** Every integration test must run with local test doubles (mocks, containers, in-memory databases). Tests that fail without a staging VPN are not integration tests -- they are environment tests.
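The random-order rule in the table above targets tests that silently depend on state left behind by an earlier test. The sketch below is only an illustration of that failure mode: the shared `_cache`, the two check functions, and the `run_in_order` / `run_shuffled` helpers are all hypothetical, and a real suite would rely on its test runner's own random-ordering support rather than a hand-rolled loop.

```python
import random

# Hypothetical module-level state shared between checks (the bug being illustrated).
_cache: dict[str, str] = {}


def reset_state() -> None:
    _cache.clear()


def check_writes_user() -> None:
    _cache["user"] = "alice"
    assert _cache["user"] == "alice"


def check_reads_user() -> None:
    # Hidden dependency: only passes if check_writes_user already ran.
    assert _cache.get("user") == "alice"


def run_in_order(checks) -> list[str]:
    """Run checks in the given order from a clean state; return names of failures."""
    reset_state()
    failures = []
    for check in checks:
        try:
            check()
        except AssertionError:
            failures.append(check.__name__)
    return failures


def run_shuffled(checks, seed: int) -> list[str]:
    """Shuffle with a recorded seed so a failing order can be replayed."""
    order = list(checks)
    random.Random(seed).shuffle(order)
    return run_in_order(order)


print(run_in_order([check_writes_user, check_reads_user]))
print(run_in_order([check_reads_user, check_writes_user]))
```

The first order reports no failures and the second reports `check_reads_user`, so the suite only looks healthy when it happens to run in declaration order. Recording the seed in `run_shuffled` is what makes a failing order reproducible.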
@@ -122,6 +122,16 @@ Rules:
 - [ ] Unified report follows the exact format
 - [ ] Overall verdict correctly reflects both mechanical and review results
 
+## Rationalizations to Reject
+
+These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
+
+| Rationalization | Why It Is Wrong |
+| -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "All three mechanical checks failed, but I should still run the AI review to get useful feedback" | When ALL three checks fail, stop immediately. Do not proceed to Phase 2. AI review on code that does not compile is wasted effort. |
+| "The security scanner found a warning but it is not high severity, so it should not affect the overall result" | Error-severity security findings are blocking. The distinction is severity, not the agent's opinion of importance. |
+| "The AI review flagged an architectural concern as blocking, so the integrity check should fail" | Only runtime errors, data loss, and security vulnerabilities count as blocking review findings. Architectural concerns are noted but do not block. |
+
 ## Examples
 
 ### Example: All Clear
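The three rules in the table above can be read as a small decision procedure. The sketch below is one possible encoding, offered only as an illustration: the check names, severity strings, finding kinds, and function names are all hypothetical and are not taken from the skill's actual implementation.

```python
# Hypothetical finding kinds; the table names runtime errors, data loss,
# and security vulnerabilities as the only blocking review findings.
BLOCKING_REVIEW_KINDS = {"runtime_error", "data_loss", "security_vulnerability"}


def should_run_ai_review(mechanical_passed: dict[str, bool]) -> bool:
    """Proceed to Phase 2 unless every mechanical check failed."""
    return any(mechanical_passed.values())


def security_finding_blocks(severity: str) -> bool:
    """Severity alone decides; 'error' is assumed to be the blocking level."""
    return severity == "error"


def review_finding_blocks(kind: str) -> bool:
    """Architectural concerns are recorded but never block."""
    return kind in BLOCKING_REVIEW_KINDS


# Hypothetical check names for the three mechanical checks.
all_failed = {"tests": False, "lint": False, "types": False}
print(should_run_ai_review(all_failed))
print(review_finding_blocks("architecture"))
```

Keeping each rule as a pure function of its input is a deliberate choice here: it leaves no room for the "agent's opinion of importance" that the table warns against.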
@@ -162,6 +162,15 @@ This ensures subsequent graph queries (impact analysis, drift detection) include
 - Report follows the structured output format
 - All findings are backed by graph query evidence (with graph) or directory/file analysis (without graph)
 
+## Rationalizations to Reject
+
+| Rationalization | Why It Is Wrong |
+| --------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "The graph is a few commits behind, but it is close enough for knowledge mapping" | If the graph is more than 10 commits behind, run harness scan before proceeding. A stale graph produces a knowledge map with missing modules. |
+| "No graph exists, so this skill cannot produce useful output" | The fallback strategy is explicit: use directory structure and file analysis. Fallback completeness is ~50%, significantly better than nothing. |
+| "The existing AGENTS.md is outdated, so I will overwrite it with the generated version" | Never overwrite without confirmation. Existing AGENTS.md may contain carefully authored context the graph cannot infer. |
+| "The module descriptions I inferred from function names are accurate enough" | Inferred descriptions are starting points. Phase 3 (AUDIT) exists to identify coverage gaps. Name-based inference misses purpose, constraints, and relationships. |
+
 ## Examples
 
 ### Example: Generating AGENTS.md from Graph
@@ -259,6 +259,16 @@ Phase 4: ANALYZE
 Recommendation: Add DataLoader for orders resolver, re-test after fix
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| ----------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| "The smoke test passed, so the full load test will probably be fine too." | A smoke test at 1-2 VUs tells you the script runs — it says nothing about behavior at 100 or 1000 VUs. Connection pool exhaustion, lock contention, and GC pressure only appear under load. Smoke passing is the floor, not the ceiling. |
+| "Staging is smaller than production, so results won't be accurate anyway — no point running the full test." | Staging results are always useful as a proxy: they reveal algorithmic bottlenecks, N+1 queries, and missing indexes that scale identically regardless of instance count. Document the scale factor and use it. Do not skip testing because the environment is imperfect. |
+| "We haven't changed the API, so the old load test baselines still apply." | Baselines go stale when dependencies update, traffic patterns shift, or adjacent services change. A deployment that adds one middleware layer or changes a database index can move p99 by 200ms. Baselines must be re-validated, not assumed. |
+| "The p95 threshold is arbitrary — let's just relax it until the test passes." | A threshold without a documented basis is a guess. A threshold lowered to make a failing test pass is a suppressed regression. Thresholds must be derived from SLOs or measured baselines. If the SLO is wrong, change the SLO explicitly with stakeholder sign-off. |
+| "We'll run the soak test later — we just need to ship the load test first." | Soak tests catch failures that only emerge over hours: memory leaks, connection pool exhaustion, log file growth. If the feature involves a long-lived process, background worker, or WebSocket, skipping the soak test means the failure surfaces in production. |
+
 ## Gates
 
 - **No load tests against production without explicit human approval.** Load tests can cause real outages. The target environment must be verified as non-production before execution. If production testing is required, a `[checkpoint:human-verify]` must be passed with documented approval.
@@ -326,6 +326,16 @@ Phase 4: VALIDATE
|
|
|
326
326
|
After fixes: projected NEEDS_ATTENTION (missing precision/recall metrics)
|
|
327
327
|
```
|
|
328
328
|
|
|
329
|
+
## Rationalizations to Reject
|
|
330
|
+
|
|
331
|
+
| Rationalization | Reality |
|
|
332
|
+
| ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
333
|
+
| "We re-trained with more data but the architecture is the same — the previous evaluation still applies." | Evaluation results are bound to a specific model artifact, not to the architecture. A re-trained model with different weights can have dramatically different failure modes even if accuracy appears similar. Every model version that goes to production must be evaluated against the golden set, not inherited from its predecessor. |
|
|
334
|
+
| "The model file is only 8MB — committing it to git is more convenient than setting up an artifact store." | Model files in git corrupt repository history, explode clone times for all contributors, and cannot be versioned alongside experiment metadata. Convenience now creates permanent technical debt. The artifact store setup is a one-time cost; git pollution is permanent. |
|
|
335
|
+
| "Loading the model inside the request handler is simpler — the model is small enough that latency won't be noticeable." | Per-request model loading adds I/O and deserialization on every inference call, holds no persistent state across requests, and collapses under any meaningful concurrency. "Small enough" is a guess without measurement. Models must be loaded at startup and held in memory. |
|
|
336
|
+
| "We can add experiment tracking after we get the model working — right now we just need to iterate quickly." | Experiment tracking is hardest to add retroactively because you cannot reconstruct the conditions of runs you did not log. The runs being executed without tracking right now are the ones producing the model that may go to production. Log them now or accept that the model is not reproducible. |
|
|
337
|
+
| "The prompt template is short enough to read in context — version controlling it adds unnecessary process." | Prompts embedded in application code change silently when developers edit them, have no history of what changed and why, and cannot be evaluated independently. A prompt is a model artifact. It requires the same versioning, evaluation, and promotion discipline as model weights. |
|
|
338
|
+
|
|
329
339
|
## Gates
|
|
330
340
|
|
|
331
341
|
- **No deploying models without evaluation.** A model that has not been evaluated against a golden set or baseline cannot be promoted to production. This is always an error.
|
|
@@ -311,6 +311,16 @@ Phase 4: VALIDATE
 Store submission ready: PASS
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| -------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We request all permissions at launch to get them out of the way — users can deny them if they want." | App stores treat permissions-at-launch as a review red flag and users deny at much higher rates when there is no contextual explanation. Permissions requested at the moment they are needed, with a sentence explaining why, consistently achieve higher grant rates and reduce store rejection risk. |
+| "Universal Links are optional — the URL scheme fallback works fine for deep linking." | URL scheme fallbacks (`myapp://`) can be claimed by any installed app on the device. A malicious or coincidentally named app can intercept links intended for yours. Universal Links with verified `apple-app-site-association` files are cryptographically bound to your domain and cannot be hijacked. |
+| "The push notification handler works in foreground and background — we can handle the terminated state separately after launch." | Users often first interact with an app by tapping a push notification when the app is terminated. The cold-start tap handler is commonly the first impression. Shipping without it means a class of users experiences a broken entry point from day one. |
+| "The staging configuration is slightly different but we'll remember to change it before the App Store build." | "Remember to change it" is not a process. Staging URLs, debug API keys, and sandbox APNs environments in production builds have shipped before and will again. Separate build configurations and environment-specific entitlement files are the only reliable mitigation. |
+| "The privacy manifest requirement is new — we'll add it in the next release after the store flags it." | Apple has enforced PrivacyInfo.xcprivacy requirements for new submissions and updates since May 2024. Submitting without it results in rejection, which blocks the entire release. Adding it retroactively under rejection pressure is strictly more costly than adding it now. |
+
 ## Gates
 
 - **No missing permission usage descriptions.** Every permission requested in code must have a corresponding usage description in the platform manifest. Missing descriptions cause automatic App Store rejection on iOS and are a best practice requirement on Android.
@@ -236,6 +236,15 @@ mvn org.pitest:pitest-maven:mutationCoverage
 # Report generated at target/pit-reports/index.html
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Why It Is Wrong |
+| ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
+| "We have 80% line coverage, so test quality is already good" | Line coverage measures execution, not verification. Mutation testing reveals missing assertions and weak assertions. |
+| "The survived mutants are in non-critical utility code, so we can ignore them" | Every survived mutant must be either addressed with a test or explicitly justified as an equivalent mutant. |
+| "I will write a test that targets the specific mutation to kill it" | No gaming the mutation score. Every new test must test a meaningful behavior, not just kill a specific mutant. |
+| "The test suite has some failures, but we can still run mutation testing to see what we learn" | No mutation testing against a failing test suite. Mutations against broken tests produce garbage results. |
+
 ## Gates
 
 - **No mutation testing against a failing test suite.** All tests must pass before mutants are generated. Running mutations against broken tests produces garbage results. Fix the tests first.
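The distinction drawn in the table above between execution and verification can be shown with a hand-written mutant. This is a toy sketch, not how PIT or any other mutation tool works internally: `is_adult`, its mutant, and the two test helpers are hypothetical names chosen for illustration.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    # Boundary mutation: ">=" replaced by ">".
    return age > 18


def weak_test(fn) -> bool:
    # Executes the function (full line coverage) but asserts nothing about the result.
    fn(30)
    return True


def strong_test(fn) -> bool:
    # Verifies behavior on both sides of the boundary.
    return fn(18) is True and fn(17) is False


def mutant_survives(test, mutant) -> bool:
    """A mutant survives when the test still passes against the mutated code."""
    return test(mutant)


print(mutant_survives(weak_test, is_adult_mutant))
print(mutant_survives(strong_test, is_adult_mutant))
```

Both tests give `is_adult` full line coverage, yet only `strong_test` kills the mutant. It does so by asserting the meaningful boundary behavior (18 counts as adult), not by targeting the mutation itself.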
@@ -268,6 +268,16 @@ Phase 4: VALIDATE
 Result: WARN -- 3 instrumentation gaps, alerting needs SLO alignment
 ```
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| -------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| "We can see what's happening in CloudWatch logs — we don't need structured logging" | Unstructured log lines cannot be queried, aggregated, or correlated across services. When an incident spans three services, searching for a request ID across unstructured logs is manual forensics. Structured logging is not a nicety — it is the foundation for incident response. |
+| "We'll add alerting once we've seen a few incidents and know what to alert on" | The first incident is the worst time to define alerting. SLO-based burn rate alerts can be defined from traffic patterns before any incidents occur. Waiting for incidents to define thresholds means every early failure goes undetected. |
+| "User ID is a useful label for the latency metric — it helps us debug per-user issues" | User ID as a metric label creates one time series per user, which at 100,000 users means 100,000 label combinations. High-cardinality labels exhaust metric storage, cause query timeouts, and make the entire metrics system unstable. Use logs for per-user debugging; use metrics for aggregate signals. |
+| "The tracing library is initialized, so we have distributed tracing" | Initializing the library creates root spans but does not propagate context across HTTP boundaries, instrument database calls, or connect traces to logs. Trace initialization without verified end-to-end propagation produces disconnected, useless traces. |
+| "We have alerts — they're just not linked to runbooks yet" | An alert that fires at 3am without a runbook link requires the on-call engineer to start debugging from scratch. The absence of a runbook is not a documentation gap; it is a mean-time-to-recover multiplier. |
+
 ## Gates
 
 - **No sensitive data in logs.** If PII, credentials, or tokens are detected in log output, it is a blocking finding. The logging configuration must sanitize or redact sensitive fields before any other improvements are made.
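The cardinality argument in the table above is simple arithmetic: the worst-case number of time series for a metric is the product of the distinct values of each label. The sketch below makes that explicit. The label names and counts other than the 100,000 users are hypothetical, and the result is an upper bound that assumes every label combination actually occurs.

```python
from math import prod


def max_series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series: the product of distinct values per label."""
    return prod(label_cardinalities.values())


# Hypothetical aggregate labels for a latency metric.
aggregate_labels = {"endpoint": 20, "status_class": 5}

# The same metric with a per-user label added (100,000 users, as in the table above).
with_user_id = {**aggregate_labels, "user_id": 100_000}

print(max_series_count(aggregate_labels))
print(max_series_count(with_user_id))
```

With `user_id` as the only label the bound is the 100,000 series the table mentions. Combined with even two small aggregate labels it multiplies into the millions, which is why per-user detail belongs in logs.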
@@ -23,7 +23,7 @@
 - Constraints and forbidden patterns
 - Any special instructions or warnings
 
-2. **Read `harness.
+2. **Read `harness.config.json`.** Extract:
 - Project name and stack
 - Adoption level (basic, intermediate, advanced)
 - Layer definitions and their directory mappings
@@ -48,7 +48,7 @@
 2. **Map the architecture.** Walk the directory structure and identify:
 - Top-level organization pattern (monorepo, single package, workspace)
 - Source code location and entry points
-- Layer boundaries (from `harness.
+- Layer boundaries (from `harness.config.json` and actual directory structure)
 - Shared utilities or common modules
 - Configuration files and their purposes
 
@@ -61,7 +61,7 @@
 - Code formatting (detect from config files: `.prettierrc`, `.eslintrc`, `biome.json`)
 
 4. **Map the constraints.** Identify what is restricted:
-- Forbidden imports (from `harness.
+- Forbidden imports (from `harness.config.json` dependency constraints)
 - Layer boundary rules (which layers can import from which)
 - Linting rules that encode architectural decisions
 - Any constraints documented in `AGENTS.md` that are not yet automated
@@ -95,8 +95,8 @@ Graph queries produce a complete architecture map in seconds, including transiti
 
 ### Phase 3: ORIENT — Identify Adoption Level and Maturity
 
-1. **Confirm the adoption level** matches what `harness.
-- Basic: `AGENTS.md` and `harness.
+1. **Confirm the adoption level** matches what `harness.config.json` declares:
+- Basic: `AGENTS.md` and `harness.config.json` exist but no layers or constraints
 - Intermediate: Layers defined, dependency constraints enforced, at least one custom skill
 - Advanced: Personas, state management, learnings, CI integration
 
@@ -184,21 +184,29 @@ Graph queries produce a complete architecture map in seconds, including transiti
 - **`harness check-deps`** — Run to verify dependency constraints are passing, which confirms layer boundaries are respected.
 - **`harness state show`** — View current state to understand where the last session left off.
 - **`AGENTS.md`** — Primary source of project context and agent instructions.
-- **`harness.
+- **`harness.config.json`** — Source of structural configuration (layers, constraints, skills).
 - **`.harness/learnings.md`** — Historical context and institutional knowledge.
 
 ## Success Criteria
 
-- All four configuration sources were read (`AGENTS.md`, `harness.
+- All four configuration sources were read (`AGENTS.md`, `harness.config.json`, `.harness/learnings.md`, `.harness/state.json`)
 - Technology stack is accurately identified (language, framework, test runner, build tool)
 - Architecture is mapped with correct layer boundaries and dependency directions
 - Conventions are identified from actual code patterns, not assumed
-- Constraints are enumerated from both `harness.
+- Constraints are enumerated from both `harness.config.json` and `AGENTS.md`
 - Adoption level is confirmed (not just declared — validated)
 - A structured orientation summary is produced with all sections filled
 - The "Getting Started" section is actionable and tailored to the audience
 - `harness validate` was run and results are reported
 
+## Rationalizations to Reject
+
+| Rationalization | Reality |
+| -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| "I can skip reading .harness/learnings.md since it is just historical notes" | Learnings contain hard-won insights from previous sessions -- decisions made, gotchas discovered, patterns that worked or failed. Skipping them means repeating mistakes already diagnosed. |
+| "The harness.config.json says intermediate, so I can report that without validation" | Declared adoption level must be confirmed, not assumed. A project that declares intermediate but fails harness validate is not truly intermediate. |
+| "I will map the architecture by reading the directory names since that is faster than checking conventions in actual code" | Conventions must be identified from actual code patterns, not assumed from directory structure. File naming, import style, and error handling can only be verified by reading real source files. |
+
 ## Examples
 
 ### Example: Onboarding to an Intermediate TypeScript Project
@@ -211,7 +219,7 @@ Read AGENTS.md:
|
|
|
211
219
|
- Stack: TypeScript, Express, Vitest, PostgreSQL
|
|
212
220
|
- Conventions: zod validation, repository pattern, kebab-case files
|
|
213
221
|
|
|
214
|
-
Read harness.
|
|
222
|
+
Read harness.config.json:
|
|
215
223
|
- Level: intermediate
|
|
216
224
|
- Layers: presentation (src/routes/), business (src/services/), data (src/repositories/)
|
|
217
225
|
- Constraints: presentation → business OK, business → data OK, data → presentation FORBIDDEN
|
|
@@ -258,7 +266,7 @@ Produce orientation with all sections. Getting Started for this context:
|
|
|
258
266
|
|
|
259
267
|
```
|
|
260
268
|
Read AGENTS.md — exists, minimal content
|
|
261
|
-
Read harness.
|
|
269
|
+
Read harness.config.json — level: basic, no layers defined
|
|
262
270
|
No .harness/learnings.md
|
|
263
271
|
No .harness/state.json
|
|
264
272
|
```
|
|
@@ -159,6 +159,15 @@ For each independent task, write a focused agent brief:
|
|
|
159
159
|
- `harness validate` passes after integration
|
|
160
160
|
- No agent modified files outside its declared scope
|
|
161
161
|
|
|
162
|
+
## Rationalizations to Reject
|
|
163
|
+
|
|
164
|
+
| Rationalization | Why It Is Wrong |
|
|
165
|
+
| -------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
166
|
+
| "These two tasks touch different functions in the same file, so they are independent enough" | If both tasks write to the same file, they are NOT independent. Even edits to different functions in the same file can create merge conflicts. |
|
|
167
|
+
| "I verified independence manually -- no need to run check_task_independence" | Manual verification misses transitive dependency overlap. check_task_independence with graph-expanded analysis catches transitive conflicts. |
|
|
168
|
+
| "There are only 2 independent tasks, but parallelism would save time"                        | Do not parallelize when there are fewer than 3 independent tasks. Coordination overhead outweighs the parallelism benefit for 2 tasks.        |
|
|
169
|
+
| "Each agent's tests pass, so integration is fine" | Step 4 requires running the FULL test suite after integration. Parallel changes can cause integration failures that individual test runs miss. |
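The first and third rows can be read as a simple pre-check. The following is a minimal sketch, assuming each task is described by a hypothetical mapping from task name to the set of files it writes. It only covers direct file overlap; per the table, `check_task_independence` also catches transitive conflicts, which this sketch does not attempt to reproduce.

```python
from itertools import combinations

# From the table: fewer than 3 independent tasks is not worth parallelizing.
MIN_PARALLEL_TASKS = 3


def overlapping_pairs(task_files: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of tasks that writes to at least one common file."""
    conflicts = []
    for a, b in combinations(sorted(task_files), 2):
        shared = task_files[a] & task_files[b]
        if shared:
            conflicts.append((a, b, shared))
    return conflicts


def should_parallelize(task_files: dict[str, set[str]]) -> bool:
    """Parallelize only with at least 3 tasks and no shared files."""
    return len(task_files) >= MIN_PARALLEL_TASKS and not overlapping_pairs(task_files)
```

Two tasks that edit different functions in the same file still share that file, so they show up in `overlapping_pairs` and block parallel dispatch.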
|
|
170
|
+
|
|
162
171
|
## Examples
|
|
163
172
|
|
|
164
173
|
### Example: Parallel Implementation of Three Independent Services
|
|
@@ -187,6 +187,17 @@ This phase runs only when `.bench.ts` files exist in the project. If none are fo
|
|
|
187
187
|
- Gate decision is recorded in state
|
|
188
188
|
- `harness validate` passes after enforcement
|
|
189
189
|
|
|
190
|
+
## Rationalizations to Reject
|
|
191
|
+
|
|
192
|
+
These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
|
|
193
|
+
|
|
194
|
+
| Rationalization | Why It Is Wrong |
|
|
195
|
+
| ------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
196
|
+
| "The cyclomatic complexity is 16 but the function is straightforward, so I can override the Tier 1 threshold" | Tier 1 violations are non-negotiable blockers. No merge with Tier 1 performance violations. If a threshold needs adjustment, reconfigure with documented justification. |
|
|
197
|
+
| "The benchmark regression is only 6% and it is probably just noise" | The noise margin (default 3%) is applied before flagging. A 6% regression on a perf-critical path exceeds the Tier 1 threshold even after noise consideration. |
|
|
198
|
+
| "The working tree has a small uncommitted change but it should not affect benchmark results" | No running benchmarks with a dirty working tree. Uncommitted changes invalidate benchmark results. |
|
|
199
|
+
| "I will update the baselines to match the new performance numbers rather than fixing the regression" | Baselines must come from fresh runs against committed code. Silently moving the goalposts defeats the purpose of performance gates. |
|
|
200
|
+
|
|
190
201
|
## Examples
|
|
191
202
|
|
|
192
203
|
### Example: PR with High Complexity Function
|
|
@@ -235,6 +235,16 @@ harness check-perf — complexity reduced from 12 to 8 (improvement)
|
|
|
235
235
|
harness perf baselines update — new baseline saved
|
|
236
236
|
```
|
|
237
237
|
|
|
238
|
+
## Rationalizations to Reject
|
|
239
|
+
|
|
240
|
+
| Rationalization | Reality |
|
|
241
|
+
| --------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
242
|
+
| "The correctness test is green, I'll add the benchmark later when we know performance is an issue." | The benchmark is not optional — it is the mechanism that defines "performance issue." Without a baseline captured at implementation time, you have nothing to compare against when a regression appears months later. Later never comes. |
|
|
243
|
+
| "I'll skip the REFACTOR phase since the spec doesn't mention performance requirements." | The spec not mentioning a requirement means there is no user-facing SLO, not that performance is irrelevant. The benchmark still captures the baseline that future work must not regress from. Phase 3 is optional; the benchmark file is not. |
|
|
244
|
+
| "The benchmark results vary too much between runs to be meaningful — I'll just omit it." | Variance is a signal, not a reason to skip. High variance means the benchmark needs warmup iterations, more samples, or isolation from I/O. Fix the benchmark, do not delete it. An absent benchmark offers zero protection against regressions. |
|
|
245
|
+
| "This function is only called during startup, so its performance doesn't matter at runtime." | Startup performance determines deployment speed, lambda cold-start latency, and test suite duration. "Not in the hot path at runtime" does not mean performance is free to ignore. Measure it so the baseline exists if startup behavior changes. |
|
|
246
|
+
| "We already have an integration test that covers this — writing a separate benchmark would be redundant." | Integration tests verify correctness under realistic conditions. Benchmarks measure isolated performance with precise input control. An integration test that passes in 2 seconds tells you nothing about whether the function itself takes 1ms or 800ms. |
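The advice on high variance (warmup iterations, more samples) can be sketched as follows. This is an illustration in Python using only the standard library, not the harness benchmark runner; the function name `measure` and its parameters are hypothetical.

```python
import statistics
import time
from typing import Callable


def measure(fn: Callable[[], object], warmup: int = 5, samples: int = 30) -> dict[str, float]:
    """Time fn() after discarding warmup runs; report median and relative spread."""
    for _ in range(warmup):
        fn()  # warmup runs are not timed (caches, lazy initialization)

    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)

    median = statistics.median(timings)
    # Relative spread: population standard deviation as a fraction of the median.
    spread = statistics.pstdev(timings) / median if median > 0 else 0.0
    return {"median_s": median, "relative_spread": spread}
```

A large `relative_spread` suggests raising `warmup` or `samples`, or isolating the measured function from I/O, rather than dropping the benchmark.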
|
|
247
|
+
|
|
238
248
|
## Gates
|
|
239
249
|
|
|
240
250
|
- **No code before test AND benchmark.** Both must exist before implementation begins.
|
|
@@ -468,6 +468,16 @@ When `docs/changes/` exists in the project, produce `docs/changes/<feature>/delt
|
|
|
468
468
|
- When `rigorLevel` is `standard` and task count < 8, the skeleton is skipped
|
|
469
469
|
- The skeleton format is lightweight (~200 tokens): numbered groups with task count and time estimates
|
|
470
470
|
|
|
471
|
+
## Rationalizations to Reject
|
|
472
|
+
|
|
473
|
+
| Rationalization | Reality |
|
|
474
|
+
| ------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
475
|
+
| "The task is conceptually clear so I do not need to include exact code in the plan" | Every task must have exact file paths, exact code, and exact commands. If you cannot write the code in the plan, you do not understand the task well enough to plan it. |
|
|
476
|
+
| "This task touches 5 files but it is logically one unit of work, so splitting it would add overhead" | Tasks touching more than 3 files must be split. The overhead of splitting is far less than the cost of a failed oversized task. |
|
|
477
|
+
| "Tests for this task can be added in a follow-up task since the implementation is straightforward" | No skipping TDD in tasks. Every code-producing task must start with writing a test. "Add tests later" is explicitly forbidden. |
|
|
478
|
+
| "The spec does not cover this edge case, but I can fill in the gap during planning" | When the spec is missing information, do not fill in the gaps yourself. Escalate. Filling gaps silently creates undocumented design decisions that no one reviewed. |
|
|
479
|
+
| "I discovered we need an additional file during decomposition, but updating the file map is just bookkeeping" | The file map must be complete. Every file that will be created or modified must appear in the file map before task decomposition. |
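Two of the rules above are mechanical enough to check automatically: the 3-file limit per task and file-map completeness. A minimal sketch, assuming a hypothetical plan representation (a set of file-map paths and a mapping from task name to the files it touches):

```python
# From the table: tasks touching more than 3 files must be split.
MAX_FILES_PER_TASK = 3


def plan_problems(file_map: set[str], tasks: dict[str, list[str]]) -> list[str]:
    """Return human-readable problems; an empty list means both rules hold."""
    problems = []
    for name, files in tasks.items():
        unique = set(files)
        if len(unique) > MAX_FILES_PER_TASK:
            problems.append(f"{name}: touches {len(unique)} files, split it")
        missing = sorted(unique - file_map)
        if missing:
            problems.append(f"{name}: not in file map: {', '.join(missing)}")
    return problems
```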
|
|
480
|
+
|
|
471
481
|
## Examples
|
|
472
482
|
|
|
473
483
|
### Example: Planning a User Notification Feature
|
|
@@ -284,6 +284,15 @@ fi
|
|
|
284
284
|
- [ ] AI review focused on high-signal issues only (no style nits)
|
|
285
285
|
- [ ] Report follows the structured format exactly
|
|
286
286
|
|
|
287
|
+
## Rationalizations to Reject
|
|
288
|
+
|
|
289
|
+
| Rationalization | Why It Is Wrong |
|
|
290
|
+
| ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
291
|
+
| "The lint errors are just warnings, so I can proceed to AI review" | The gate is absolute: any mechanical check failure means STOP. AI review does not run until lint, typecheck, and tests all pass. |
|
|
292
|
+
| "This is a docs-only change but let me run AI review anyway for thoroughness" | The fast path is mandatory. If only docs/config files changed, AI review is skipped. Running it anyway wastes tokens. |
|
|
293
|
+
| "The AI found a style issue, so I should block the commit" | AI review observations are advisory only. Only mechanical check failures block the commit. |
|
|
294
|
+
| "I will skip the security scan since this is an internal endpoint" | Phase 3 runs the security scanner against all staged source files regardless of exposure. Hardcoded secrets and injection are blocking even in internal code. |
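The gate logic in this table can be summarized as a decision function. This is a sketch under stated assumptions: the `ReviewInput` shape is hypothetical, and treating security findings as blocking before the docs-only fast path is one reading of the rows above, not a confirmed ordering.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewInput:
    mechanical: dict[str, bool]            # e.g. {"lint": True, "typecheck": True, "tests": True}
    docs_or_config_only: bool              # True when only docs/config files changed
    security_blockers: list[str] = field(default_factory=list)  # e.g. hardcoded secrets
    ai_observations: list[str] = field(default_factory=list)    # advisory only


def decide(r: ReviewInput) -> tuple[str, str]:
    """Return (verdict, reason). Only mechanical and security failures block."""
    failed = sorted(name for name, ok in r.mechanical.items() if not ok)
    if failed:
        return ("STOP", "mechanical checks failed: " + ", ".join(failed))
    if r.security_blockers:
        return ("STOP", "security findings: " + ", ".join(r.security_blockers))
    if r.docs_or_config_only:
        return ("PASS", "fast path: AI review skipped")
    # AI observations never change the verdict; they are reported alongside it.
    return ("PASS", f"{len(r.ai_observations)} advisory observation(s)")
```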
|
|
295
|
+
|
|
287
296
|
## Examples
|
|
288
297
|
|
|
289
298
|
### Example: Clean Commit
|
|
@@ -197,6 +197,16 @@
|
|
|
197
197
|
- Output format matches existing project conventions when they exist
|
|
198
198
|
- Generated PRD is saved to the correct directory with consistent naming
|
|
199
199
|
|
|
200
|
+
## Rationalizations to Reject
|
|
201
|
+
|
|
202
|
+
| Rationalization | Why It Is Wrong |
|
|
203
|
+
| -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
204
|
+
| "The feature request is clear enough -- I can skip the ambiguity check and start writing stories" | The gate: no generating specs from ambiguous input without clarification. Missing actors or undefined triggers lead to untestable acceptance criteria. |
|
|
205
|
+
| "This acceptance criterion is understood by the team, so it does not need to be formally testable" | "No untestable acceptance criteria" is a hard gate. Every criterion must be verifiable by an automated test or a specific manual procedure.            |
|
|
206
|
+
| "The happy path scenarios are enough -- edge cases are unlikely" | The skill requires at least one unwanted-behavior criterion for every user-facing action. Edge cases are where production bugs live. |
|
|
207
|
+
| "The existing PRD is outdated, so I will just replace it with a fresh one"                         | "No overwriting existing specs" is a gate. Present the diff rather than replacing the file.                                                            |
|
|
208
|
+
| "We can figure out the success metrics later during implementation" | Every success metric must be measurable, time-bound, and specific at spec time. |
|
|
209
|
+
|
|
200
210
|
## Examples
|
|
201
211
|
|
|
202
212
|
### Example: GitHub Issue to PRD for Team Notifications
|
|
@@ -266,6 +266,16 @@ def test_sort_handles_floats(xs):
|
|
|
266
266
|
assert result[i] <= result[i + 1]
|
|
267
267
|
```
|
|
268
268
|
|
|
269
|
+
## Rationalizations to Reject
|
|
270
|
+
|
|
271
|
+
| Rationalization | Reality |
|
|
272
|
+
| ------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
273
|
+
| "We already have example-based tests that cover the edge cases — property tests would just be redundant." | Example-based tests cover the cases the author thought of. Property tests cover the cases they did not. The entire value of generative testing is that it explores regions of the input space that human intuition misses — off-by-one errors, Unicode combining characters, signed integer overflow at boundaries. |
|
|
274
|
+
| "The generator keeps producing rejected inputs, so I'll just filter more aggressively to make the test pass faster." | Heavy `filter` usage is a symptom of a broken generator, not a solution. Each rejected sample wastes an iteration, and `filter` destroys the shrinking chain, leaving you with an unhelpful counterexample when a bug is found. Rewrite the generator using `map` and `flatMap` to construct valid inputs directly. |
|
|
275
|
+
| "The counterexample is too strange to be a real-world case — I'll just increase the iteration count so it appears less often." | A shrunk counterexample that triggers a property failure is a real bug by definition. "Unlikely in practice" is not a property of correctness — the question is whether the invariant holds. If the counterexample is a valid input the function might receive, fix the function. If it is not a valid input, constrain the generator. |
|
|
276
|
+
| "This function has too many invariants to specify — I'll just skip property testing and trust the unit tests." | Complex functions with many invariants are exactly the functions most in need of property testing. High complexity means a larger bug-hiding surface. Start with the most important invariants (no-crash, round-trip, idempotence) rather than attempting to encode all properties at once. |
|
|
277
|
+
| "Property tests are too slow — they'll block CI for 10 minutes." | Run 100 iterations on PR, 10,000 iterations nightly. The CI time argument justifies reducing iteration count, never eliminating property tests entirely. A suite that runs 0 property tests found 0 edge cases. |
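The second row contrasts filtering with constructing valid inputs directly. The sketch below illustrates the difference using only the standard library and a toy constraint (even integers). It is not tied to any property-testing framework; in such a framework the constructive version would typically be expressed with `map`. The function names are hypothetical.

```python
import random


def even_by_filter(rng: random.Random, max_tries: int = 1000) -> tuple[int, int]:
    """Rejection sampling: draw until a value happens to be valid.

    Returns (value, number_of_rejected_draws).
    """
    rejected = 0
    for _ in range(max_tries):
        n = rng.randint(0, 1000)
        if n % 2 == 0:
            return n, rejected
        rejected += 1
    raise RuntimeError("generator rejected every sample")


def even_by_map(rng: random.Random) -> tuple[int, int]:
    """Constructive: every draw is mapped to a valid value, so nothing is rejected."""
    return rng.randint(0, 500) * 2, 0
```

Both functions cover the same range of even values, but only the filtering version spends draws on inputs it then throws away. The shrinking benefit mentioned in the table is specific to property-testing frameworks and is not shown here.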
|
|
278
|
+
|
|
269
279
|
## Gates
|
|
270
280
|
|
|
271
281
|
- **No property tests without shrinking.** If the framework's automatic shrinking is disabled or the generator uses patterns that break shrinking (excessive `filter`), counterexamples will be unhelpfully large. Fix the generator to support shrinking.
|
|
@@ -134,6 +134,15 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
|
|
|
134
134
|
- No behavioral changes were introduced (the test suite is the proof)
|
|
135
135
|
- No dead code was left behind (run `harness cleanup` to verify)
|
|
136
136
|
|
|
137
|
+
## Rationalizations to Reject
|
|
138
|
+
|
|
139
|
+
| Rationalization | Reality |
|
|
140
|
+
| ------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
141
|
+
| "The tests are mostly passing, so I can start refactoring and fix the remaining failures as I go" | All tests must pass BEFORE refactoring starts. If tests are not green before you start, you are not refactoring -- you are debugging. |
|
|
142
|
+
| "This refactoring changes a small amount of behavior, but it is a clear improvement" | Refactoring must not change behavior. The test suite is the proof. If the refactoring requires changing tests, you may be changing behavior. |
|
|
143
|
+
| "I will make several changes at once and run tests at the end since each change is small" | Tests must run after EVERY single change. If a test breaks, you must undo the LAST change immediately. |
|
|
144
|
+
| "The refactoring did not produce a measurable improvement, but the code is different so it must be somewhat better" | If the refactoring introduced no measurable improvement, revert the entire sequence. Refactoring for its own sake is churn. |
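The first and third rows describe a loop: start green, apply one change, run the tests, and undo the last change on failure. A minimal sketch, assuming each change is modeled as a hypothetical `Step` with `apply` and `undo` callbacks and that `tests_pass` runs the full suite:

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    name: str
    apply: Callable[[], None]
    undo: Callable[[], None]


def refactor(steps: list[Step], tests_pass: Callable[[], bool]) -> list[str]:
    """Apply steps one at a time; return the names of the steps that were kept."""
    if not tests_pass():
        raise RuntimeError("tests must be green before refactoring starts")
    kept = []
    for step in steps:
        step.apply()
        if not tests_pass():
            step.undo()  # undo only the LAST change, then stop
            break
        kept.append(step.name)
    return kept
```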
|
|
145
|
+
|
|
137
146
|
## Examples
|
|
138
147
|
|
|
139
148
|
### Example: Moving business logic out of a UI component
|
|
@@ -537,6 +537,17 @@ This framing is informational — it does not block anything. It gives the team
|
|
|
537
537
|
8. Monorepo support: each package is audited independently with per-package results in the report
|
|
538
538
|
9. `harness validate` passes after the skill's SKILL.md and skill.yaml are written
|
|
539
539
|
|
|
540
|
+
## Rationalizations to Reject
|
|
541
|
+
|
|
542
|
+
These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
|
|
543
|
+
|
|
544
|
+
| Rationalization | Why It Is Wrong |
|
|
545
|
+
| ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
|
|
546
|
+
| "The MAINTAIN phase takes too long, so I will skip dispatching the 4 maintenance agents" | No skipping the MAINTAIN phase. Maintenance checks catch issues that release-specific checks miss. |
|
|
547
|
+
| "This auto-fix is obviously correct, so I can apply it without prompting the user" | No auto-fix without prompting. Every fix must be presented to the human before being applied. |
|
|
548
|
+
| "Most checks pass and only a few warnings remain, so the release is ready" | A "mostly passing" report is not a passing report. The result is PASS only when zero failures exist across all categories. |
|
|
549
|
+
| "The previous run found these issues and I fixed them, so I can trust the cached results" | Session resumption requires re-running all checks. Code may have changed since the last run. |
|
|
550
|
+
|
|
540
551
|
## Examples
|
|
541
552
|
|
|
542
553
|
### Example: First Run on a Monorepo with Gaps
|
|
@@ -240,6 +240,16 @@ Phase 4: VALIDATE
|
|
|
240
240
|
Redis fallback serves from LRU when Redis is down
|
|
241
241
|
```
|
|
242
242
|
|
|
243
|
+
## Rationalizations to Reject
|
|
244
|
+
|
|
245
|
+
| Rationalization | Reality |
|
|
246
|
+
| ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
247
|
+
| "That third-party API has 99.99% uptime — we don't need a circuit breaker" | 99.99% uptime means 52 minutes of downtime per year. That downtime will not occur as one predictable window — it will happen as degraded responses and timeouts during a traffic spike. Without a circuit breaker, every caller blocks for the full timeout duration, exhausting thread pools and cascading across the system. |
|
|
248
|
+
| "We have retry logic, so failures are handled" | Retry logic without a circuit breaker amplifies failures. When the downstream service is degraded, retries multiply the load on an already struggling system. Circuit breakers and retries are complementary controls, not alternatives. |
|
|
249
|
+
| "The fallback adds complexity — we'll add it if the circuit breaker actually opens" | A circuit breaker without a fallback is a different kind of failure mode, not resilience. When the circuit opens, users see an error instead of a degraded-but-functional experience. Fallbacks must be designed and tested before the circuit ever opens in production. |
|
|
250
|
+
| "Our database connection pool is 100 connections — that's plenty" | Connection pool size without query timeouts means slow queries hold connections indefinitely. A single slow query spike can exhaust the pool, causing every subsequent request to wait. Pool sizing and query timeouts are both required. |
|
|
251
|
+
| "The service is internal — it doesn't need rate limiting" | Internal services are often called by automated processes, CI pipelines, and batch jobs that can spike traffic in ways user-facing services do not. Missing rate limiting on internal services is a common cause of self-inflicted outages during deployments and data migrations. |
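The first three rows describe a circuit breaker paired with a fallback. The following is a minimal single-threaded sketch of that pattern, assuming hypothetical parameter names and defaults. A production breaker would also need thread safety and per-call timeouts.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()      # open: fail fast, do not touch the dependency
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        self.failures = 0              # success closes the circuit
        return result
```

While the circuit is open, callers get the fallback immediately instead of blocking for the full timeout, which is the degraded-but-functional behavior the table asks for.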
|
|
252
|
+
|
|
243
253
|
## Gates
|
|
244
254
|
|
|
245
255
|
- **No retry on non-idempotent operations without idempotency keys.** Retrying a POST or DELETE that lacks an idempotency mechanism can cause data duplication or data loss. This is a blocking finding. The operation must be made idempotent before retry logic is added.
|
|
@@ -42,7 +42,7 @@ If the human has not seen and approved the milestone groupings and feature list,
|
|
|
42
42
|
- Has spec + plan but no implementation -> `planned`
|
|
43
43
|
- Has spec but no plan -> `backlog`
|
|
44
44
|
- Has plan but no spec -> `planned` (unusual, flag for human review)
|
|
45
|
-
6. Detect project name from `harness.
|
|
45
|
+
6. Detect project name from `harness.config.json` `project` field, or `package.json` `name` field, or directory name as fallback.
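The fallback order in step 6 can be sketched as below. This assumes both files are JSON objects at the project root and that the fields hold non-empty strings; the helper names are hypothetical and this is not the CLI's own implementation.

```python
import json
from pathlib import Path
from typing import Optional


def _read_field(path: Path, field: str) -> Optional[str]:
    """Return a non-empty string field from a JSON file, or None if unavailable."""
    try:
        value = json.loads(path.read_text(encoding="utf-8")).get(field)
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or a top-level value that is not an object.
        return None
    return value if isinstance(value, str) and value else None


def detect_project_name(root: Path) -> str:
    """harness.config.json `project`, then package.json `name`, then directory name."""
    return (
        _read_field(root / "harness.config.json", "project")
        or _read_field(root / "package.json", "name")
        or root.resolve().name
    )
```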
|
|
46
46
|
|
|
47
47
|
Present scan summary:
|
|
48
48
|
|
|
@@ -457,6 +457,15 @@ Choice?
|
|
|
457
457
|
19. `--query` filters features by status or milestone and displays results with milestone context
|
|
458
458
|
20. `--query` errors gracefully when no roadmap exists, directing the user to `--create`
|
|
459
459
|
|
|
460
|
+
## Rationalizations to Reject
|
|
461
|
+
|
|
462
|
+
| Rationalization | Reality |
|
|
463
|
+
| ----------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
|
|
464
|
+
| "The feature list looks correct, so I can skip the PROPOSE phase and write the roadmap directly" | The Iron Law: never write docs/roadmap.md without the human confirming the proposed structure first. |
|
|
465
|
+
| "This sync detected a status change and the inference is clearly correct, so I can apply it without confirmation" | The sync PROPOSE phase requires presenting proposed changes and waiting for human confirmation. The human-always-wins rule applies. |
|
|
466
|
+
| "The existing roadmap is outdated, so I will recreate it with --create to get a fresh start" | No overwriting an existing roadmap without explicit user consent. Silent overwrites destroy prior manual edits and status tracking. |
|
|
467
|
+
| "There is no roadmap yet but the user asked me to add a feature, so I will create one as a side effect of --add" | When the roadmap does not exist, --add must error with a clear message directing the user to --create. |
|
|
468
|
+
|
|
460
469
|
## Examples
|
|
461
470
|
|
|
462
471
|
### Example: `--create` -- Bootstrap a Roadmap from Existing Artifacts
|
|
@@ -150,6 +150,14 @@ Proceed with Feature A? (y/n/pick another)
|
|
|
150
150
|
7. Transition routes to brainstorming (no spec) or autopilot (spec exists)
|
|
151
151
|
8. `harness validate` passes after all changes
|
|
152
152
|
|
|
153
|
+
## Rationalizations to Reject
|
|
154
|
+
|
|
155
|
+
| Rationalization | Reality |
|
|
156
|
+
| ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
|
|
157
|
+
| "The top-scored candidate is obviously correct, so I can assign it without asking the human" | The Iron Law: never assign or transition without the human confirming the recommendation first. |
|
|
158
|
+
| "Affinity data is not available so the scoring is degraded -- I should just pick the first planned item" | Proceed without affinity scoring by zeroing out the affinity weight. Position and dependents signals still produce meaningful rankings. |
|
|
159
|
+
| "The feature has no spec, but I can skip brainstorming and jump straight to planning since the summary is clear enough" | A feature with no spec routes to brainstorming; a feature with a spec routes to autopilot. A one-line roadmap summary is not a spec.    |
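The second and third rows can be illustrated with a small scoring and routing sketch. The weighted-sum form, the weight values, and the signal scales are all assumptions for illustration; the table only states that the affinity weight is zeroed when affinity data is unavailable and that routing depends on whether a spec exists.

```python
from typing import Optional


def score(position: float, dependents: float, affinity: Optional[float],
          weights: dict[str, float]) -> float:
    """Weighted sum of signals; a missing affinity signal contributes nothing."""
    w = dict(weights)
    if affinity is None:
        w["affinity"] = 0.0  # affinity data unavailable: drop only that signal
        affinity = 0.0
    return (w["position"] * position
            + w["dependents"] * dependents
            + w["affinity"] * affinity)


def route(has_spec: bool) -> str:
    """No spec goes to brainstorming; an existing spec goes to autopilot."""
    return "autopilot" if has_spec else "brainstorming"
```

With affinity missing, candidates are still ranked by position and dependents instead of defaulting to the first planned item.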
|
|
160
|
+
|
|
153
161
|
## Examples
|
|
154
162
|
|
|
155
163
|
### Example: Pick Next Item from a Multi-Milestone Roadmap
|
|
@@ -278,6 +278,16 @@ Phase 4: VALIDATE
|
|
|
278
278
|
Result: FAIL -- rotation required before deployment, history rewrite recommended
|
|
279
279
|
```
|
|
280
280
|
|
|
281
|
+
## Rationalizations to Reject
|
|
282
|
+
|
|
283
|
+
| Rationalization | Reality |
|
|
284
|
+
| ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
285
|
+
| "That key is read-only so it's not a big deal if it leaks" | Read-only credentials still enable data exfiltration, reconnaissance, and discovery of other vulnerabilities. A leaked read-only database credential exposes every row in the database. Scope does not eliminate risk. |
|
|
286
|
+
| "We removed it from the file — it's cleaned up now" | Removing a secret from the current tree does not remove it from git history. Anyone with a clone of the repository can recover the secret with `git log -p`. Rotation is required regardless of file deletion. |
|
|
287
|
+
| "That's a test environment key, not production" | Test environment credentials are frequently reused, shared informally, and rotated less often. Leaked test keys also reveal credential patterns and naming conventions that help attackers guess production secrets. |
|
|
288
|
+
| "It's in a private repo so only our team can see it" | Private repos are accessed by CI/CD systems, third-party integrations, contractors, and former employees. Repository access controls are not a substitute for secret externalization. Breaches routinely originate from compromised internal access. |
|
|
289
|
+
| "We'll move it to an environment variable before we deploy" | Intent does not prevent exposure. The secret is in the codebase now and may already be in commit history, CI logs, or developer machine caches. Remediation must happen at the moment of detection, not at deployment time. |
|
|
290
|
+
|
|
281
291
|
## Gates
|
|
282
292
|
|
|
283
293
|
- **No CRITICAL findings may remain unaddressed.** Production credentials exposed in source code are blocking. Execution halts until the credential is rotated and the code is remediated.
|