ai-driven-dev-v2 0.1.0a1.tar.gz
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- ai_driven_dev_v2-0.1.0a1/.agents/skills/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/aidd-eval/SKILL.md +109 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/aidd-eval/references/e2e-flow-audit.md +20 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/backlog-ops/SKILL.md +175 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/live-e2e/SKILL.md +213 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/project-navigation/SKILL.md +23 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/runtime-log-triage/SKILL.md +23 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/stage-contract-change/SKILL.md +25 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/task-slicing/SKILL.md +78 -0
- ai_driven_dev_v2-0.1.0a1/.agents/skills/user-story-check/SKILL.md +22 -0
- ai_driven_dev_v2-0.1.0a1/.editorconfig +12 -0
- ai_driven_dev_v2-0.1.0a1/.github/pull_request_template.md +12 -0
- ai_driven_dev_v2-0.1.0a1/.github/workflows/ci.yml +88 -0
- ai_driven_dev_v2-0.1.0a1/.github/workflows/manual-live-e2e.yml +160 -0
- ai_driven_dev_v2-0.1.0a1/.github/workflows/release.yml +259 -0
- ai_driven_dev_v2-0.1.0a1/.gitignore +13 -0
- ai_driven_dev_v2-0.1.0a1/AGENTS.md +102 -0
- ai_driven_dev_v2-0.1.0a1/CLAUDE.md +7 -0
- ai_driven_dev_v2-0.1.0a1/CONTRIBUTING.md +150 -0
- ai_driven_dev_v2-0.1.0a1/Dockerfile +14 -0
- ai_driven_dev_v2-0.1.0a1/LICENSE +202 -0
- ai_driven_dev_v2-0.1.0a1/MANIFEST.md +296 -0
- ai_driven_dev_v2-0.1.0a1/Makefile +24 -0
- ai_driven_dev_v2-0.1.0a1/PKG-INFO +381 -0
- ai_driven_dev_v2-0.1.0a1/README.md +354 -0
- ai_driven_dev_v2-0.1.0a1/aidd.example.toml +40 -0
- ai_driven_dev_v2-0.1.0a1/contracts/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/answers.md +47 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/idea-brief.md +23 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/implementation-report.md +24 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/plan.md +39 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/qa-report.md +27 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/questions.md +45 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/repair-brief.md +64 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/research-notes.md +40 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/review-report.md +26 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/review-spec-report.md +45 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/stage-brief.md +58 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/stage-result.md +76 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/tasklist.md +24 -0
- ai_driven_dev_v2-0.1.0a1/contracts/documents/validator-report.md +81 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/answers.md +4 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/questions.md +4 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/repair-brief.md +18 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/stage-brief.md +27 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/stage-result.md +34 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/validator-report.md +19 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/answers.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/idea-brief.md +19 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/questions.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/stage-result.md +40 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/README.md +6 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/implementation-report.md +22 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/repair-brief.md +20 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/stage-result.md +39 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/validator-report.md +27 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/implementation-report.md +24 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/README.md +6 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/plan.md +34 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/questions.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/stage-result.md +39 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/validator-report.md +26 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/plan.md +47 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/README.md +6 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/qa-report.md +21 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/repair-brief.md +20 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/stage-result.md +40 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/validator-report.md +27 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/qa-report.md +33 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/README.md +6 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/answers.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/questions.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/research-notes.md +30 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/stage-result.md +40 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/questions.md +5 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/research-notes.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/stage-result.md +39 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/README.md +6 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/repair-brief.md +20 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/review-report.md +17 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/stage-result.md +39 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/validator-report.md +27 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/review-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/review-spec-report.md +29 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/AGENTS.md +3 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/stage-result.md +38 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/tasklist.md +43 -0
- ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/validator-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/idea.md +112 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/implement.md +109 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/plan.md +112 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/qa.md +118 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/research.md +111 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/review-spec.md +117 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/review.md +123 -0
- ai_driven_dev_v2-0.1.0a1/contracts/stages/tasklist.md +116 -0
- ai_driven_dev_v2-0.1.0a1/docs/AGENTS.md +11 -0
- ai_driven_dev_v2-0.1.0a1/docs/analysis/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/docs/analysis/analytical-note.md +66 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/adapter-conformance-matrix.md +37 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/adapter-protocol.md +219 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/distribution-and-development.md +136 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/document-contracts.md +227 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/eval-harness-integration.md +236 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/operator-frontend.md +124 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/project-set-workspace.md +110 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/runtime-matrix.md +45 -0
- ai_driven_dev_v2-0.1.0a1/docs/architecture/target-architecture.md +515 -0
- ai_driven_dev_v2-0.1.0a1/docs/backlog/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/docs/backlog/backlog.md +87 -0
- ai_driven_dev_v2-0.1.0a1/docs/backlog/rebuild-plan.md +142 -0
- ai_driven_dev_v2-0.1.0a1/docs/backlog/roadmap.md +5256 -0
- ai_driven_dev_v2-0.1.0a1/docs/compatibility-policy.md +185 -0
- ai_driven_dev_v2-0.1.0a1/docs/e2e/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/docs/e2e/live-e2e-catalog.md +169 -0
- ai_driven_dev_v2-0.1.0a1/docs/e2e/live-quality-rubric.md +93 -0
- ai_driven_dev_v2-0.1.0a1/docs/e2e/operator-ui-local-project.md +104 -0
- ai_driven_dev_v2-0.1.0a1/docs/e2e/scenario-matrix.md +96 -0
- ai_driven_dev_v2-0.1.0a1/docs/operator-handbook.md +257 -0
- ai_driven_dev_v2-0.1.0a1/docs/operator-support-policy.md +141 -0
- ai_driven_dev_v2-0.1.0a1/docs/operator-troubleshooting.md +186 -0
- ai_driven_dev_v2-0.1.0a1/docs/product/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/docs/product/user-stories.md +144 -0
- ai_driven_dev_v2-0.1.0a1/docs/release-checklist.md +140 -0
- ai_driven_dev_v2-0.1.0a1/harness/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/aidd_fixture_runtime.py +206 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/pyproject.toml +4 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/src/minimal_app/__init__.py +2 -0
- ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/tests/test_minimal_app.py +5 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/minimal-python-bounded-workflow.yaml +42 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/minimal-python-full-workflow.yaml +41 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/project-set-plan-context.yaml +87 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/hono-non-error-throw-handling.yaml +82 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/hono-router-double-star-parity.yaml +116 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/httpx-cli-docs-sync.yaml +59 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/httpx-invalid-header-message.yaml +81 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml +98 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/sqlite-utils-yielded-rows-interview.yaml +125 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/typer-boolean-help-rendering.yaml +59 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/typer-styled-help-alignment.yaml +82 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/installed-local-project-fixture.yaml +97 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/plan-stage-minimal-fixture.yaml +33 -0
- ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/plan-stagepack-smoke.yaml +55 -0
- ai_driven_dev_v2-0.1.0a1/manifest.txt +262 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/common/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/common/run-rules.md +7 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/idea/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/implement/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/plan/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/qa/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/research/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/review/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/review-spec/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/interview.md +12 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/repair.md +80 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/run.md +70 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/system.md +20 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/interview.md +13 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/repair.md +86 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/run.md +80 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/system.md +22 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/interview.md +18 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/repair.md +75 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/run.md +81 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/system.md +22 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/interview.md +12 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/repair.md +82 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/run.md +85 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/system.md +23 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/interview.md +18 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/repair.md +71 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/run.md +72 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/system.md +21 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/interview.md +12 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/repair.md +89 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/run.md +90 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/system.md +22 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/interview.md +18 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/repair.md +81 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/run.md +75 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/system.md +21 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/interview.md +12 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/repair.md +80 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/run.md +71 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/system.md +22 -0
- ai_driven_dev_v2-0.1.0a1/prompt-packs/tasklist/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/pyproject.toml +64 -0
- ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/backlog-coverage.md +35 -0
- ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/blockers-and-next-actions.md +52 -0
- ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/repo-readiness-report.md +77 -0
- ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/user-story-traceability.md +17 -0
- ai_driven_dev_v2-0.1.0a1/scripts/release_live_proof_runtime.py +612 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/__init__.py +5 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/base.py +19 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/__init__.py +5 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/probe.py +56 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/runner.py +742 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/__init__.py +5 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/probe.py +48 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/runner.py +309 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/__init__.py +5 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/probe.py +28 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/runner.py +212 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/native_prompt.py +187 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/__init__.py +5 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/probe.py +48 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/runner.py +343 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/path_resolution.py +46 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/probe_support.py +207 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runner_support.py +169 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_artifacts.py +32 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_events.py +244 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_execution.py +61 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_registry.py +123 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/subprocess_streaming.py +196 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/surface.py +464 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/AGENTS.md +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/doctor.py +84 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/eval.py +114 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/init_command.py +25 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/main.py +123 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/run.py +431 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/run_lookup.py +31 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage.py +89 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage_inspection.py +129 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage_run.py +485 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/support.py +145 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/cli/ui.py +801 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/compatibility.py +42 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/config.py +465 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/adapter_interview.py +25 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/contracts.py +17 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/interview.py +470 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/markdown.py +199 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/run.py +347 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/operator_frontend.py +236 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/project_set.py +142 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/repair.py +639 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/resources.py +109 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_inspection.py +646 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_lookup.py +369 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_store.py +873 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/runtime_readiness.py +91 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_graph.py +310 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_interview_routing.py +117 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_invocation.py +267 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_manifest.py +82 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_models.py +198 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_outputs.py +256 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_paths.py +17 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_preparation.py +163 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_registry.py +257 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_runner.py +397 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_terminal.py +193 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_validation.py +338 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/stages.py +29 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/state_machine.py +69 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/work_item.py +8 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/workflow_service.py +264 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/core/workspace.py +202 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/log_analysis.py +788 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/quality.py +610 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/reporting.py +278 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/self_repair_probes.py +111 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/stage_timing.py +819 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/evals/verdicts.py +193 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/__init__.py +18 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/adapter_conformance.py +138 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/conformance_matrix.py +113 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_classification.py +255 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_execution.py +217 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_models.py +110 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_preparation.py +149 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_reports.py +918 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_runner.py +248 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/install_artifact.py +230 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/live_runtime_config.py +256 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/live_workspace_bootstrap.py +173 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/repo_prep.py +299 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/result_bundle.py +375 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/runner.py +399 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/scenario_loader.py +22 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/harness/scenarios.py +576 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/model.py +10 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/__init__.py +1 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/cross_document.py +309 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/document_loader.py +210 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/documents.py +13 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/models.py +61 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/reports.py +106 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic.py +31 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/__init__.py +9 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/blocks.py +149 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/common.py +467 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/evidence.py +83 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/findings.py +21 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/idea.py +76 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/ids.py +38 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/implement.py +321 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/placeholders.py +173 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/plan.py +194 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/qa.py +312 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/registry.py +120 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/research.py +86 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/review.py +236 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/review_spec.py +179 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/risks.py +52 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/tasklist.py +271 -0
- ai_driven_dev_v2-0.1.0a1/src/aidd/validators/structural.py +242 -0
- ai_driven_dev_v2-0.1.0a1/tests/AGENTS.md +9 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_claude_code_probe.py +87 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_claude_code_runner.py +1098 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_codex_probe.py +116 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_codex_runner.py +274 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_document_handshake.py +372 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_probe.py +62 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_runner.py +324 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_native_prompt.py +61 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_opencode_probe.py +115 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_opencode_runner.py +289 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_events.py +88 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_execution_contract.py +36 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_registry.py +27 -0
- ai_driven_dev_v2-0.1.0a1/tests/adapters/test_subprocess_streaming.py +90 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_doctor.py +87 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_doctor.py +73 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_run.py +146 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_summary.py +45 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_release_live_proof_runtime.py +88 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_artifacts.py +207 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_logs.py +190 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_show.py +152 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_workflow.py +509 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_runtime_timeout.py +47 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_questions.py +94 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_run.py +990 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_summary.py +196 -0
- ai_driven_dev_v2-0.1.0a1/tests/cli/test_ui.py +428 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_interview.py +336 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_operator_frontend.py +267 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_project_set.py +130 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_repair.py +442 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_repair_flow.py +245 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_resources.py +56 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_run_lookup.py +534 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_run_store_layout.py +598 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_graph.py +533 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_manifest.py +73 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_registry.py +303 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_runner.py +2411 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_terminal.py +129 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_state_machine.py +53 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_workflow_service.py +93 -0
- ai_driven_dev_v2-0.1.0a1/tests/core/test_workspace_layout.py +171 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_events_jsonl.py +89 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_first_boundary.py +147 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_regressions.py +100 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_runtime_log.py +136 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_taxonomy.py +134 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_validation_inputs.py +89 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_quality.py +442 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_latest_summary.py +55 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_markdown_summary.py +165 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_runtime_aggregation.py +85 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_summary_regressions.py +114 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_summary_rows.py +65 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_self_repair_probes.py +46 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_stage_timing.py +305 -0
- ai_driven_dev_v2-0.1.0a1/tests/evals/test_verdicts.py +321 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_adapter_conformance_lane.py +40 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_conformance_matrix.py +40 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_eval_runner.py +1026 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_install_artifact.py +104 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_live_runtime_config.py +257 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_repo_prep.py +444 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_artifacts.py +59 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_completeness.py +220 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_layout.py +61 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_persistence.py +167 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_integration.py +187 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_invoke.py +283 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_setup.py +91 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_teardown.py +92 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_verify.py +119 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_model.py +304 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_substitutions.py +104 -0
- ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_validation.py +474 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_cli_run_lookup.py +239 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_cli_smoke.py +28 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_config.py +341 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_contract_registry.py +12 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_docs_consistency.py +301 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_packaging_resources.py +32 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_prompt_quality.py +72 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_release_workflow.py +122 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_reporting.py +11 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_scenario_loader.py +16 -0
- ai_driven_dev_v2-0.1.0a1/tests/test_scenario_taxonomy.py +193 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/AGENTS.md +8 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/README.md +43 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-invalid-noop/workspace/workitems/WI-SEM-IMPLEMENT-NOOP/stages/implement/implementation-report.md +22 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-invalid-verification/workspace/workitems/WI-SEM-IMPLEMENT-VERIFY/stages/implement/implementation-report.md +23 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-valid/workspace/workitems/WI-SEM-IMPLEMENT-VALID/stages/implement/implementation-report.md +24 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/invalid/workspace/workitems/WI-SEM-INVALID/stages/idea/idea-brief.md +17 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/invalid-list-format/workspace/workitems/WI-SEM-LIST-INVALID/stages/idea/idea-brief.md +17 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/plan-invalid/workspace/workitems/WI-SEM-PLAN-INVALID/stages/plan/plan.md +34 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/plan-valid/workspace/workitems/WI-SEM-PLAN-VALID/stages/plan/plan.md +39 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/qa-invalid/workspace/workitems/WI-SEM-QA-INVALID/stages/qa/qa-report.md +17 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/qa-valid/workspace/workitems/WI-SEM-QA-VALID/stages/qa/qa-report.md +20 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-invalid-missing-source/workspace/workitems/WI-SEM-RESEARCH-MISSING-SOURCE/stages/research/research-notes.md +25 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-invalid-unresolved-question/workspace/workitems/WI-SEM-RESEARCH-UNRESOLVED/stages/research/research-notes.md +25 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-valid/workspace/workitems/WI-SEM-RESEARCH-VALID/stages/research/research-notes.md +25 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-invalid/workspace/workitems/WI-SEM-REVIEW-INVALID/stages/review/review-report.md +18 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-spec-invalid/workspace/workitems/WI-SEM-REVIEW-SPEC-INVALID/stages/review-spec/review-spec-report.md +25 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-spec-valid/workspace/workitems/WI-SEM-REVIEW-SPEC-VALID/stages/review-spec/review-spec-report.md +29 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-valid/workspace/workitems/WI-SEM-REVIEW-VALID/stages/review/review-report.md +23 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/tasklist-invalid/workspace/workitems/WI-SEM-TASKLIST-INVALID/stages/tasklist/tasklist.md +18 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/tasklist-valid/workspace/workitems/WI-SEM-TASKLIST-VALID/stages/tasklist/tasklist.md +24 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/valid/workspace/workitems/WI-SEM-VALID/stages/idea/idea-brief.md +18 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/valid-list-format/workspace/workitems/WI-SEM-LIST-VALID/stages/idea/idea-brief.md +18 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_cross_document.py +581 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_document_loader.py +224 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_models.py +41 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_reports.py +90 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_semantic.py +2670 -0
- ai_driven_dev_v2-0.1.0a1/tests/validators/test_structural.py +575 -0
@@ -0,0 +1,9 @@
+# AGENTS.md
+
+This directory holds reusable development workflows for coding agents.
+
+## Rules
+
+- Keep each skill focused on one repeatable workflow.
+- Prefer repo-specific instructions over generic advice.
+- Update skills when the roadmap, architecture, or contributor workflow changes.
@@ -0,0 +1,109 @@
+---
+name: aidd-eval
+description: Run harness and eval scenarios for ai_driven_dev_v2, validate document-first stage outputs, preserve runtime logs, analyze failures, and produce durable audit artifacts for deterministic and manual-live lanes.
+---
+
+# aidd-eval
+
+## Use when
+
+- You need to run a harness scenario against one of the maintained runtimes.
+- You need to validate stage outputs against Markdown document contracts.
+- You need to check self-repair behavior after validator failures.
+- You need to capture runtime logs, normalized events, and log-analysis artifacts.
+- You need to audit generated artifacts and generated code after execution.
+
+For **local live-run operator guidance**, prefer `live-e2e`.
+Use `aidd-eval` when the main task is generic eval execution, artifact analysis,
+validation discipline, grading, and failure classification across deterministic
+and manual-live lanes.
+
+## Required reading
+
+1. `docs/architecture/eval-harness-integration.md`
+2. `docs/architecture/document-contracts.md`
+3. `docs/architecture/adapter-protocol.md`
+4. `docs/architecture/runtime-matrix.md`
+5. `docs/e2e/scenario-matrix.md`
+6. `docs/e2e/live-quality-rubric.md` for live scenarios
+7. the selected scenario under `harness/scenarios/`
+8. `.agents/skills/aidd-eval/references/e2e-flow-audit.md`
+
+## Lane split
+
+- Deterministic scenarios use `feature_source.mode: fixture-seed` and may run in `ci` or `manual`.
+- Live scenarios use `feature_source.mode: curated-issue-pool`, must live under `harness/scenarios/live/`, and are manual-only.
+
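The lane split above can be read as a small set of checks on a scenario. The following is a minimal sketch, not the harness's actual validator: the function name `lane_errors`, its arguments, and the handling of unknown modes are assumptions made for illustration; only the mode names, the lane names, and the `harness/scenarios/live/` location come from the skill text.

```python
from pathlib import PurePosixPath

# Directory that live scenarios must live under, per the lane split rules.
LIVE_ROOT = PurePosixPath("harness/scenarios/live")


def lane_errors(path: str, feature_mode: str, automation_lane: str) -> list[str]:
    """Return rule violations for one scenario; an empty list means it conforms.

    Hypothetical helper: the real manifest loader may expose these values differently.
    """
    errors: list[str] = []
    in_live_dir = LIVE_ROOT in PurePosixPath(path).parents

    if feature_mode == "fixture-seed":
        # Deterministic scenarios may run in either lane.
        if automation_lane not in {"ci", "manual"}:
            errors.append("deterministic scenarios may run only in ci or manual")
    elif feature_mode == "curated-issue-pool":
        # Live scenarios have a fixed location and are manual-only.
        if not in_live_dir:
            errors.append("live scenarios must live under harness/scenarios/live/")
        if automation_lane != "manual":
            errors.append("live scenarios are manual-only")
    else:
        # Assumption: any other mode is treated as an error in this sketch.
        errors.append(f"unknown feature_source.mode: {feature_mode}")
    return errors
```

Returning a list of messages rather than raising on the first problem keeps all violations visible in one pass, which suits an audit-style report.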
+## Hard rules
+
+1. Never hand-edit runtime-generated stage output documents during an eval run.
+2. Always probe the adapter first.
+3. Always preserve raw runtime logs.
+4. Always validate output Markdown documents against their contracts.
+5. Always allow the stage self-repair loop to run if the scenario expects repairable failures.
+6. Always keep question/answer events as durable artifacts.
+7. Always generate log-analysis output.
+8. Keep infrastructure failures separate from model or document failures.
+9. For live scenarios, preserve install evidence, issue-selection evidence, and quality artifacts.
+10. Never mutate roadmap or backlog files as part of live quality auditing.
+
+## Default procedure
+
+1. Load the scenario and confirm the requested runtime is allowed.
+2. Probe the adapter and record capability information.
+3. Prepare or reset the fixture workspace or target repository.
+4. Run the requested stage or flow through the harness.
+   For live scenarios, select the first curated issue, install the artifact under test first, and run AIDD from the target repository root.
+5. Capture:
+   - install transcript and artifact identity for live scenarios,
+   - issue-selection evidence for live scenarios,
+   - fixture-seed metadata for deterministic scenarios,
+   - raw runtime logs,
+   - structured runtime logs when available,
+   - normalized events,
+   - question/answer events,
+   - validator outcomes,
+   - repair attempts.
+6. Validate all required output documents.
+7. Run live quality commands and score artifact/code quality when the scenario requires it.
+8. Run graders.
+9. Run log analysis.
+10. Write the final audit artifacts.
+11. Report the final execution verdict and quality conclusion explicitly.
+
+## Canonical output locations
+
+- `.aidd/reports/evals/<run_id>/runtime.log`
+- `.aidd/reports/evals/<run_id>/runtime.jsonl` when supported
+- `.aidd/reports/evals/<run_id>/events.jsonl` when supported
+- `.aidd/reports/evals/<run_id>/install-transcript.json`
+- `.aidd/reports/evals/<run_id>/issue-selection.json`
+- `.aidd/reports/evals/<run_id>/validator-report.md`
+- `.aidd/reports/evals/<run_id>/repair-history.md`
+- `.aidd/reports/evals/<run_id>/log-analysis.md`
+- `.aidd/reports/evals/<run_id>/grader.json`
+- `.aidd/reports/evals/<run_id>/verdict.md`
+- `.aidd/reports/evals/<run_id>/quality-report.md`
+- `.aidd/reports/evals/<run_id>/quality-transcript.json`
+
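A reviewer could check a bundle against the canonical locations with a short script. This is a sketch under stated assumptions: the split into `CORE_FILES` and `LIVE_FILES` is a guess based on which artifacts the skill ties to live scenarios, the "when supported" files are left out because they are optional, and `missing_bundle_files` is a hypothetical helper rather than part of the `aidd` CLI.

```python
from pathlib import Path

# Bundle root taken from the canonical output locations above.
BUNDLE_ROOT = Path(".aidd/reports/evals")

# Assumption: these files are expected for every run.
CORE_FILES = (
    "runtime.log",
    "validator-report.md",
    "repair-history.md",
    "log-analysis.md",
    "grader.json",
    "verdict.md",
)
# Assumption: these files are expected only for live scenarios.
LIVE_FILES = (
    "install-transcript.json",
    "issue-selection.json",
    "quality-report.md",
    "quality-transcript.json",
)


def missing_bundle_files(run_id: str, live: bool, root: Path = Path(".")) -> list[str]:
    """List expected artifact names that are absent from the run's bundle directory."""
    bundle = root / BUNDLE_ROOT / run_id
    expected = CORE_FILES + (LIVE_FILES if live else ())
    return [name for name in expected if not (bundle / name).is_file()]
```

An empty result only shows that the files exist; it says nothing about whether their contents validate.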
+## Execution verdict taxonomy
+
+For eval harness runs, preserve the stable execution verdict taxonomy:
+
+- `pass`
+- `fail`
+- `blocked`
+- `infra-fail`
+
+Quality remains additive and must be reported separately as:
+
+- `pass`
+- `warn`
+- `fail`
+- `none`
+
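One way to keep the two axes from being merged by accident is to model them as separate types. The sketch below is illustrative only: the class names and the `summary` format are assumptions, while the eight string values are the ones listed above.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionVerdict(str, Enum):
    """Stable execution verdict taxonomy."""
    PASS = "pass"
    FAIL = "fail"
    BLOCKED = "blocked"
    INFRA_FAIL = "infra-fail"


class QualityVerdict(str, Enum):
    """Additive quality result, reported separately from execution."""
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"
    NONE = "none"


@dataclass(frozen=True)
class RunOutcome:
    """Hypothetical container that carries both verdicts side by side."""
    execution: ExecutionVerdict
    quality: QualityVerdict = QualityVerdict.NONE

    def summary(self) -> str:
        # Both axes are always printed, so neither can silently stand in for the other.
        return f"execution={self.execution.value} quality={self.quality.value}"
```

Because the enums are distinct types, a value such as `warn` cannot be parsed as an execution verdict; `ExecutionVerdict("warn")` raises `ValueError`.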
+## Example command shape
+
+```bash
+aidd eval run harness/scenarios/smoke/plan-stagepack-smoke.yaml --runtime opencode
+```
@@ -0,0 +1,20 @@
+# E2E Flow Audit Reference
+
+Use this reference when writing or reviewing eval audit output.
+
+## Minimum audit sections
+
+- scenario summary
+- runtime and adapter used
+- repository pin or fixture identity
+- stage or flow scope
+- validator outcomes
+- repair history
+- user questions and answers
+- log analysis
+- final verdict
+- follow-up actions
+
+## First-failure principle
+
+The audit should name the earliest decisive failure signal, not only the last visible symptom.
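The minimum audit sections listed above lend themselves to a mechanical completeness check. The sketch below assumes, purely for illustration, that each section appears in the audit as a Markdown heading whose text matches the list entry; the reference does not prescribe that layout, and `missing_sections` is a hypothetical helper.

```python
# Section names copied from the "Minimum audit sections" list.
MINIMUM_SECTIONS = (
    "scenario summary",
    "runtime and adapter used",
    "repository pin or fixture identity",
    "stage or flow scope",
    "validator outcomes",
    "repair history",
    "user questions and answers",
    "log analysis",
    "final verdict",
    "follow-up actions",
)


def missing_sections(audit_markdown: str) -> list[str]:
    """Return the minimum sections that have no matching heading in the audit text."""
    headings = {
        line.lstrip("#").strip().lower()
        for line in audit_markdown.splitlines()
        if line.startswith("#")  # assumption: sections are Markdown headings
    }
    return [name for name in MINIMUM_SECTIONS if name not in headings]
```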
@@ -0,0 +1,175 @@
+---
+name: backlog-ops
+description: Select, split, create, promote, and close roadmap tasks while keeping `roadmap.md` and `backlog.md` synchronized.
+---
+
+# backlog-ops
+
+Use this skill whenever you touch `docs/backlog/roadmap.md` or `docs/backlog/backlog.md`.
+
+## Planning sources
+
+Read these in order:
+
+1. `docs/backlog/backlog.md`
+2. `docs/backlog/roadmap.md`
+3. `docs/product/user-stories.md`
+4. the nearest `AGENTS.md` for the code or docs area you will touch
+
+## Canonical rules
+
+- `docs/backlog/roadmap.md` is the canonical hierarchy.
+- `docs/backlog/backlog.md` is the short actionable queue.
+- Work must always fit `wave -> epic -> slice -> local task`.
+- A local task must be reviewable without another decomposition pass.
+- Update `roadmap.md` first, then update `backlog.md`.
+
+## Taking a task
+
+1. Read the `Next` section in `docs/backlog/backlog.md`.
+2. Pick the first local task marked `next` unless it is blocked by a documented dependency.
+3. Read the full parent slice in `docs/backlog/roadmap.md`.
+4. Read the linked user stories and architecture notes for the touched area.
+5. Restate the task in your own words before coding:
+   - exact output;
+   - touched module or file family;
+   - main verification signal;
+   - dependencies that must already exist.
+
+Do not start coding if you cannot name all four items above.
+
+## Local-task quality bar
+
+A valid local task has:
+
+- one clear output artifact or code change;
+- one dominant touched area;
+- one main verification path;
+- explicit upstream dependencies;
+- wording that starts with a concrete verb.
+
+A task must be split immediately if any of these are true:
+
+- it touches more than one subsystem family, such as core + adapter + harness;
+- it mixes contract design and broad downstream rollout;
+- it has multiple independent outputs that could be reviewed separately;
+- it has no single pass/fail check;
+- it would require another planning discussion during implementation.
+
+## Local-task template
+
+When you create or rewrite a local task, make it fit this template:
+
+- **ID** — `W<wave>-E<epic>-S<slice>-T<task>`
+- **Action** — starts with a verb such as `Define`, `Implement`, `Write`, `Add`, `Expose`, `Render`
+- **Output** — name the artifact, module, or command that changes
+- **Scope** — keep one dominant touched area
+- **Verification** — state how the task will be proven done
+
+Example:
+
+- `W4-E1-S2-T4` Implement stdout and stderr streaming to the CLI while the subprocess runs.
+
+That is good because it names the subsystem, the behavior, and the direct review target.
+
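The ID shape in the template, `W<wave>-E<epic>-S<slice>-T<task>`, can be parsed with a single regular expression. This is a minimal sketch that assumes every component is a non-negative integer, as in the `W4-E1-S2-T4` example; `TaskId` and `parse_task_id` are hypothetical names, not part of the repository.

```python
import re
from typing import NamedTuple

# Assumption: each placeholder in the template is a plain decimal number.
TASK_ID = re.compile(r"W(\d+)-E(\d+)-S(\d+)-T(\d+)")


class TaskId(NamedTuple):
    wave: int
    epic: int
    slice_: int  # trailing underscore avoids shadowing the built-in `slice`
    task: int


def parse_task_id(text: str) -> TaskId:
    """Parse a local task id, rejecting anything that is not a full four-level id."""
    match = TASK_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a local task id: {text!r}")
    return TaskId(*(int(part) for part in match.groups()))
```

Using `fullmatch` means a slice-level id such as `W4-E1-S2` is rejected, which matches the rule that only local task ids belong in `backlog.md`.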
+## Creating a new local task
+
+Create a new local task when the discovered work:
+
+- clearly belongs to an existing slice goal;
+- can be reviewed independently;
+- has one dominant output and one verification signal.
+
+Workflow:
+
+1. Add the new task under the correct slice in `roadmap.md`.
+2. Keep the existing slice goal unless the outcome changed materially.
+3. Preserve the current task id for the first surviving piece whenever you split active work.
+4. Append new task ids after the preserved one.
+5. Update slice dependencies, touched areas, or exit evidence if the new task changes them.
+6. Pull the new task into `backlog.md` only if it is immediately actionable.
+
+## Creating a new slice
+
+Create a new slice only when the discovered work is a different meaningful outcome, for example:
+
+- a new stage contract;
+- a new adapter capability;
+- a separate harness scenario lane;
+- a separate operator command surface.
+
+Do **not** create a new slice just because the current task is too large. Split into more local tasks first.
+
+A good slice has:
+
+- one outcome sentence in the goal;
+- explicit primary outputs;
+- touched areas;
+- dependencies;
+- exit evidence.
+
+## Creating a new epic
+
+Create a new epic only when the theme changes enough that the work is no longer one coherent track, for example:
+
+- moving from validators into runtime adapters;
+- moving from harness execution into release operations.
+
+If the work still serves the same theme, keep it inside the current epic.
+
+## Splitting workflow
+
+When a task or slice is too large:
+
+1. Identify the dominant outputs hidden inside the oversized work.
+2. Keep the current id for the first smallest reviewable piece.
+3. Create follow-up task ids for the remaining pieces.
+4. Reword each new task so it names the output directly.
+5. Check whether the parent slice still has one clear outcome.
+6. Update `backlog.md` so only the immediate next pieces stay in `Next`.
+
+## Dependency rules
+
+- Dependencies belong on the slice, not repeated on every task unless there is an exception.
+- A task may assume slice dependencies are already satisfied.
+- If one task inside a slice depends on another task in the same slice, order the tasks so the dependency is obvious.
+- If discovered work depends on another wave or epic, add that dependency explicitly to the slice.
+
+## Promotion rules for `backlog.md`
+
+Use `backlog.md` as a queue, not as a second roadmap.
+
+- `Next` contains immediately actionable local tasks only.
+- `Soon` contains tasks that are likely next but still depend on `Next`.
+- `Parking lot` holds later-wave tasks that should stay visible.
+- Never place a slice or epic in `backlog.md`; only local task ids belong there.
+- Never add a task to `backlog.md` unless it already exists in `roadmap.md`.
+
+## Closing work
+
+After implementation:
+
+1. Mark the task or slice state in `roadmap.md` if it materially changed.
+2. Remove completed tasks from `backlog.md`.
+3. Add follow-up tasks to `roadmap.md` before mentioning them elsewhere.
+4. Delete stale wording instead of leaving historical clutter.
+5. Make sure the new plan still reads cleanly from wave to task.
+
+## Sync checklist
+
+Any change to planning files should leave all of these true:
+
+- every backlog id exists in the roadmap;
+- every `Next` item is a local task, not a slice;
+- no task wording is ambiguous or multi-output;
+- parent slices still have one meaningful outcome;
+- dependencies and exit evidence still match the work.
+
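The first item of the sync checklist, that every backlog id exists in the roadmap, is easy to check mechanically. The sketch below assumes ids follow the `W<wave>-E<epic>-S<slice>-T<task>` shape with numeric parts and appear verbatim in both files; the function name is hypothetical, and the other checklist items still need human judgment.

```python
import re

# Assumption: local task ids are written with numeric components, e.g. W4-E1-S2-T4.
TASK_ID = re.compile(r"\bW\d+-E\d+-S\d+-T\d+\b")


def backlog_ids_missing_from_roadmap(roadmap_text: str, backlog_text: str) -> list[str]:
    """Return backlog task ids that never appear in the roadmap, in first-seen order."""
    roadmap_ids = set(TASK_ID.findall(roadmap_text))
    missing: list[str] = []
    for task_id in TASK_ID.findall(backlog_text):
        if task_id not in roadmap_ids and task_id not in missing:
            missing.append(task_id)
    return missing
```

In practice the two arguments would be the contents of `docs/backlog/roadmap.md` and `docs/backlog/backlog.md`; a non-empty result means the roadmap should be updated first, as the canonical rules require.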
+## Output when reporting planning work
+
+When you finish a planning update, report:
+
+- the local task you took or the slice you decomposed;
+- new or changed task ids;
+- any new dependencies you added;
+- which items moved into `Next`, `Soon`, or `Parking lot`.
@@ -0,0 +1,213 @@
+---
+name: live-e2e
+description: Run or prepare a manual full-flow live end-to-end scenario against a public GitHub repository with repository pinning, curated issue selection, quality checks, and full log capture.
+---
+
+# live-e2e
+
+## Use when
+
+- You need to execute or author a scenario from `docs/e2e/live-e2e-catalog.md`.
+- You need to compare live provider behavior on a real repository.
+- You need to prove the installed operator flow from `idea` through `qa` as a manual external audit.
+- You need a **local source-checkout runbook** for manual live E2E, not just the abstract eval contract.
+
+## This skill vs `aidd-eval`
+
+- Use `live-e2e` when the main question is: "How do I make a local live run work from this checkout?"
+- Use `aidd-eval` when the main question is: "How do I audit artifacts, validation, grading, and failure classification across eval lanes?"
+- `live-e2e` is the primary local operator playbook for live runs.
+- `aidd-eval` remains the generic eval and artifact-analysis skill.
+
+## Read first
+
+This skill is intended to be sufficient for a prepared local run, but these files
+remain the authoritative deeper references:
+
+1. `docs/e2e/live-e2e-catalog.md`
+2. `docs/e2e/scenario-matrix.md`
+3. `docs/operator-handbook.md`
+4. the selected manifest in `harness/scenarios/live/`
+
+## What must already exist
+
+If you only use this skill from the current project, the run still needs these
+external prerequisites to already be true:
+
+- you are in a prepared local **source checkout** of this repository;
+- `uv sync --extra dev` has already completed successfully;
+- the selected live manifest exists under `harness/scenarios/live/`;
+- the requested runtime appears in the scenario's `runtime_targets`;
+- the machine has network access to clone the pinned public target repository;
+- the selected provider is already authenticated and runnable on the machine;
+- the selected provider CLI is available, or you have an AIDD-compatible wrapper
+command override for the chosen live runtime.
+
+This skill does **not** provision runtime authentication, wrapper scripts, or provider setup for you.
+
+## Runtime-command contract
+
+For local manual live runs, `claude-code`, `codex`, and `opencode` use native provider CLI
+commands by default. You may provide a runtime-command override through
+environment variables when you need a custom wrapper:
+
+- `AIDD_EVAL_CLAUDE_CODE_COMMAND` for `claude-code`
+- `AIDD_EVAL_CODEX_COMMAND` for `codex`
+- `AIDD_EVAL_OPENCODE_COMMAND` for `opencode`
+
+When set, the value must point to an **AIDD-compatible wrapper command**:
+
+- it must be invokable from the shell on the current machine;
+- it must accept the adapter flags AIDD passes for that runtime;
+- it may be a wrapper around the upstream provider CLI rather than the raw provider binary;
+- `aidd doctor` distinguishes provider probe readiness from execution command readiness.
+
+There are no repo-local wrapper templates in this wave. The operator must already
+have provider auth and a working provider CLI or wrapper execution surface.
+
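The override rule in the runtime-command contract can be pictured as a lookup with a fallback. The sketch below is not how AIDD resolves commands internally: `resolve_command` is a hypothetical helper, and the native executable names in `NATIVE_COMMAND` are assumptions. The environment variable names and the `native` and `adapter-flags` mode labels are taken from this skill.

```python
import os
import shlex
from typing import Mapping, Optional

# Environment variables named in the runtime-command contract.
OVERRIDE_ENV = {
    "claude-code": "AIDD_EVAL_CLAUDE_CODE_COMMAND",
    "codex": "AIDD_EVAL_CODEX_COMMAND",
    "opencode": "AIDD_EVAL_OPENCODE_COMMAND",
}

# Assumption: illustrative native executable names, not confirmed by the skill.
NATIVE_COMMAND = {
    "claude-code": "claude",
    "codex": "codex",
    "opencode": "opencode",
}


def resolve_command(
    runtime: str, env: Optional[Mapping[str, str]] = None
) -> tuple[str, list[str]]:
    """Return (mode, argv prefix): a wrapper override if set, else the native CLI."""
    env = os.environ if env is None else env
    override = env.get(OVERRIDE_ENV[runtime], "").strip()
    if override:
        # A non-empty override selects the wrapper path ("adapter-flags" mode).
        return "adapter-flags", shlex.split(override)
    return "native", [NATIVE_COMMAND[runtime]]
```

Passing `env` explicitly keeps the function testable without mutating the process environment.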
+## Local preflight checklist
+
+Before the live run, confirm all of these:
+
+1. `uv sync --extra dev`
+2. `uv run aidd doctor`
+3. the selected scenario is under `harness/scenarios/live/`
+4. the scenario has `automation_lane: manual`
+5. the scenario forces `stage_scope: idea -> qa`
+6. the runtime you plan to use appears in `runtime_targets`
+7. `uv run aidd eval doctor <manifest> --runtime <runtime>` reports execution readiness
+8. any wrapper env var you choose to set resolves on the machine and uses the expected auth state
+
+Recommended local preflight:
+
+```bash
+uv sync --extra dev
+uv run aidd doctor
+uv run aidd eval doctor harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime codex
+```
+
+Optional wrapper override:
+
+```bash
+export AIDD_EVAL_CODEX_COMMAND='<aidd-compatible codex wrapper>'
+export AIDD_EVAL_OPENCODE_COMMAND='<aidd-compatible opencode wrapper>'
+export AIDD_EVAL_CLAUDE_CODE_COMMAND='<aidd-compatible claude-code wrapper>'
+```
+
+## Canonical local launch
+
+The primary execution path for this skill is a local run from the AIDD source checkout:
+
+```bash
+uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime codex
+```
+
+or:
+
+```bash
+uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime opencode
+```
+
+or:
+
+```bash
+uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime claude-code
+```
+
+The GitHub `manual-live-e2e` workflow is a secondary alternate entrypoint, not the primary flow described by this skill.
+
+## What the harness will do
+
+During a successful local live run, the harness will:
+
+1. load the selected scenario and validate the live-lane contract;
+2. resolve and record the pinned target repository commit;
+3. prepare a clean working copy of the target repository;
+4. select the **first listed issue** from the curated issue pool;
+5. write issue-selection evidence to the eval bundle and target-repo context;
+6. seed `.aidd/` inside the target repository;
+7. write a live `aidd.example.toml` with the runtime command and execution mode for the chosen provider;
+8. build and install the AIDD artifact under test with `uv tool`;
+9. run installed `aidd` from the target repository root with explicit workflow bounds `idea -> qa`;
+10. run setup, verify, and quality commands and write the final audit artifacts.
+
+## Validations and blockers
+
+The live run can be rejected or downgraded at several layers:
+
+- manifest validation rejects non-live scenarios, non-manual live scenarios, missing `quality`, invalid `runtime_targets`, or any live scenario that is not bounded to `idea -> qa`;
+- runtime admission rejects a requested runtime that is not declared in `runtime_targets`;
+- stage execution stays bounded to `idea -> qa`;
+- stage outputs must validate against Markdown document contracts;
+- repair loops are allowed to run when validation failures are repairable;
+- interview scenarios block when required answers are missing;
+- repo-local `verify.commands` must pass;
+- repo-local `quality.commands` must pass for a clean quality result;
+- execution `pass` is impossible if any stage in scope is missing required validated artifacts.
+
+Live execution verdicts remain:
+
+- `pass`
+- `fail`
+- `blocked`
+- `infra-fail`
+
+Quality is additive:
+
+- `pass`
+- `warn`
+- `fail`
+- `none`
+
+## Output locations and success criteria
+
+The canonical eval bundle for a local live run lives under:
+
+- `.aidd/reports/evals/<run_id>/`
+
+Expected live artifacts include:
+
+- `issue-selection.json`
+- `install-transcript.json`
+- `runtime.log`
+- `validator-report.md`
+- `repair-history.md`
+- `log-analysis.md`
+- `grader.json`
+- `verdict.md`
+- `quality-report.md`
+- `quality-transcript.json`
+
+A live run is only "clean" when execution evidence exists, verification output is present, and the bundle includes `quality-report.md` plus `quality-transcript.json`.
+
183
|
+
## First triage for common failures
|
|
184
|
+
|
|
185
|
+
- Provider executable missing: install/login to the selected provider CLI, or export `AIDD_EVAL_CODEX_COMMAND` / `AIDD_EVAL_OPENCODE_COMMAND` for a wrapper.
|
|
186
|
+
- Runtime launches but immediately fails in native mode: inspect provider auth, model selection, and sandbox permissions.
|
|
187
|
+
- Runtime launches but immediately fails in `adapter-flags` mode: the configured command is probably not an AIDD-compatible wrapper command.
|
|
188
|
+
- `unsupported-runtime`: the runtime is not declared in the scenario's `runtime_targets`.
|
|
189
|
+
- `blocked`: inspect `questions.md` / `answers.md` expectations for interview scenarios.
|
|
190
|
+
- `fail` after run success: inspect `verify-transcript.json`, `quality-transcript.json`, and the stage-local validator reports.
|
|
191
|
+
- Missing clean execution despite zero exit codes: inspect `verdict.md` and `grader.json` for pass-guard failures caused by missing `stage-result.md` or `validator-report.md`.
|
|
192
|
+
|
|
193
|
+
## Procedure
|
|
194
|
+
|
|
195
|
+
1. Confirm the selected scenario is in `harness/scenarios/live/`, has `automation_lane: manual`, and declares the requested runtime in `runtime_targets`.
|
|
196
|
+
2. Run the local preflight checks from this skill, including `aidd eval doctor`.
|
|
197
|
+
3. Export a wrapper env var only when you intentionally want `adapter-flags` mode.
|
|
198
|
+
4. Launch `uv run aidd eval run <manifest> --runtime <runtime>`.
|
|
199
|
+
5. Preserve the resulting bundle and inspect `verdict.md`, `grader.json`, `quality-report.md`, and transcripts before judging the run.
|
|
200
|
+
6. If the setup, provider coverage, size classification, quality recipe, or verification recipe had to change, update the scenario manifest, matrix doc, and catalog after the run as separate follow-up work.
|
|
201
|
+
|
|
202
|
+
## Hard rules
|
|
203
|
+
|
|
204
|
+
- Never treat live E2E as a CI or release lane.
|
|
205
|
+
- Never assume this skill provisions runtime auth, wrappers, or provider setup.
|
|
206
|
+
- Never dispatch the manual GitHub workflow without provider execution readiness for the selected runtime.
|
|
207
|
+
- Never run a live scenario without storing the resolved repo pin.
|
|
208
|
+
- Never run a live scenario without storing the selected issue snapshot.
|
|
209
|
+
- Never treat a live scenario as canonical unless it executes `idea -> qa`.
|
|
210
|
+
- Never treat a live scenario as passed without install evidence and verification output.
|
|
211
|
+
- Never treat a live scenario as clean without `quality-report.md` and `quality-transcript.json`.
|
|
212
|
+
- Preserve all runtime logs.
|
|
213
|
+
- Keep `.aidd` rooted inside the target repository for installed live runs.
|
|
@@ -0,0 +1,23 @@
+---
+name: project-navigation
+description: Map a task to the right AIDD docs, modules, checks, and scenario assets before making changes.
+---
+
+# project-navigation
+
+## Use when
+
+- You are starting work in this repository.
+- You do not know which module or document set owns the change.
+
+## Procedure
+
+1. Read `AGENTS.md` and `docs/product/user-stories.md`.
+2. Classify the task as one of: docs, contracts, core, adapters, validators, harness, evals, or CLI.
+3. Read the nearest nested `AGENTS.md` for that area.
+4. Identify the expected checks and scenario updates.
+5. Name the primary files that should change before editing.
+
+## Output
+
+Produce a short work map: owning area, likely files, checks to run, and whether a scenario update is required.
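The work map described in the project-navigation skill above has four parts, so it can be held in a small record. This is an illustrative sketch only: the area names come from the skill's classification step, while the `WorkMap` class, its field names, and the rendered text layout are assumptions and do not reflect any format the package defines.

```python
from dataclasses import dataclass, field

# Area names from the skill's classification step (lowercased here).
AREAS = ("docs", "contracts", "core", "adapters", "validators", "harness", "evals", "cli")


@dataclass
class WorkMap:
    """Hypothetical container for the four parts of a work map."""

    owning_area: str
    likely_files: list = field(default_factory=list)
    checks: list = field(default_factory=list)
    scenario_update_required: bool = False

    def __post_init__(self):
        # Reject areas outside the skill's classification list.
        if self.owning_area not in AREAS:
            raise ValueError(f"unknown area: {self.owning_area!r}")

    def render(self):
        """Return the work map as four short lines of text."""
        lines = [
            f"Owning area: {self.owning_area}",
            "Likely files: " + (", ".join(self.likely_files) or "none named yet"),
            "Checks to run: " + (", ".join(self.checks) or "none named yet"),
            "Scenario update required: " + ("yes" if self.scenario_update_required else "no"),
        ]
        return "\n".join(lines)
```

Validating the area in `__post_init__` forces the classification step to happen before any files are named, which matches the order of the procedure.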
@@ -0,0 +1,23 @@
+---
+name: runtime-log-triage
+description: Analyze runtime and adapter logs to identify the first decisive failure signal and classify it correctly.
+---
+
+# runtime-log-triage
+
+## Use when
+
+- A scenario or stage run failed.
+- You need to separate document, model, adapter, auth, permission, timeout, or environment failures.
+
+## Procedure
+
+1. Read `runtime.log`, `events.jsonl`, and `validator-report.md`.
+2. Identify the earliest decisive signal.
+3. Separate runtime startup failures from document validation failures.
+4. Check whether a user question should have blocked the run.
+5. Write a short `log-analysis.md` that names the first cause, not just the final symptom.
+
+## Output
+
+Return the likely failure class and the evidence chain that supports it.
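The "earliest decisive signal" step in the runtime-log-triage procedure above amounts to a first-match scan over the log in order. The sketch below illustrates that idea under stated assumptions: the failure class names come from the skill text, but every regular expression in `SIGNALS` is a hypothetical placeholder, and the function works on plain lines because the actual structure of `runtime.log` and `events.jsonl` is not described here.

```python
import re

# Failure classes come from the skill text. The patterns are hypothetical
# placeholders and would need tuning against real runtime and adapter logs.
SIGNALS = [
    ("auth", re.compile(r"unauthorized|invalid api key", re.I)),
    ("permission", re.compile(r"permission denied", re.I)),
    ("timeout", re.compile(r"timed out|timeout", re.I)),
    ("environment", re.compile(r"command not found|no such file", re.I)),
    ("document", re.compile(r"validation failed|missing section", re.I)),
]


def first_decisive_signal(lines):
    """Return (failure_class, line_number, text) for the earliest match, else None."""
    for number, line in enumerate(lines, start=1):
        for failure_class, pattern in SIGNALS:
            if pattern.search(line):
                return failure_class, number, line.strip()
    return None


log = [
    "starting runtime",
    "error: permission denied writing .aidd",
    "run timed out after 600s",
    "validation failed: missing section",
]
print(first_decisive_signal(log))
# → ('permission', 2, 'error: permission denied writing .aidd')
```

Returning on the first matching line, rather than the last, is what keeps the analysis pointed at the first cause instead of the final symptom: here the timeout and the validation failure are downstream of the permission error.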
@@ -0,0 +1,25 @@
+---
+name: stage-contract-change
+description: Make a safe change to a stage or document contract by updating contracts, validators, prompts, and scenarios together.
+---
+
+# stage-contract-change
+
+## Use when
+
+- You are changing a stage input or output document.
+- You are changing validation rules or repair behavior.
+
+## Procedure
+
+1. Update the relevant contract doc first.
+2. Update validator logic or validator plan.
+3. Update prompt files or prompt-pack references if the runtime needs new instructions.
+4. Update stage-result expectations and repair behavior if needed.
+5. Add or update at least one smoke or eval scenario.
+
+## Hard rules
+
+- Never change a stage contract in code only.
+- Never widen a stage output implicitly.
+- Keep Markdown as the canonical runtime-authored output form.
@@ -0,0 +1,78 @@
+---
+name: task-slicing
+description: Turn a coarse roadmap item into reviewable local tasks with one output, one dominant touched area, and one main verification signal.
+---
+
+# task-slicing
+
+Use this skill when a roadmap task or slice still feels too vague to implement directly.
+
+## What "good slicing" means
+
+A strong local task has:
+
+- one concrete output;
+- one dominant touched area;
+- one main verification path;
+- wording that starts with a verb;
+- a scope small enough for one focused review.
+
+## Smells that mean "split again"
+
+Split again if the proposed task:
+
+- touches core plus adapter plus harness together;
+- mixes contract design with broad rollout;
+- produces multiple independent artifacts;
+- needs different verification strategies at once;
+- still contains words like `implement stage`, `finish adapter`, `wire everything`, or `support all cases`.
+
+## Split order
+
+Always try this order:
+
+1. split into more local tasks in the same slice;
+2. create a new slice only if there is a different meaningful outcome;
+3. create a new epic only if the theme changes.
+
+## Recipe
+
+1. Name the parent outcome in one sentence.
+2. List the concrete outputs hidden inside it.
+3. Group outputs by touched area.
+4. Turn each group into a verb-led task.
+5. Check that each task has one main verification signal.
+6. Reorder tasks so dependencies read top to bottom.
+
+## Examples
+
+Too broad:
+
+- `Implement the Claude Code adapter.`
+
+Better:
+
+- `Implement Claude Code command assembly from stage brief, workspace path, and prompt-pack inputs.`
+- `Stream raw Claude Code stdout and stderr to the operator CLI in real time.`
+- `Persist a full runtime.log that matches the raw streamed output as closely as possible.`
+- `Detect Claude Code question or pause events when the runtime exposes them.`
+
+Too broad:
+
+- `Finalize the implement stage.`
+
+Better:
+
+- `Define the required implement inputs, including task selection, repository state, and allowed write scope.`
+- `Define the required implement outputs, including change summary, touched files, and verification notes.`
+- `Define validator rules for missing diffs, unverifiable claims, and incomplete execution summaries.`
+- `Create the implement prompt-pack scaffold with explicit edit and verification guidance.`
+
+## Output
+
+When you use this skill, report:
+
+- the parent item you decomposed;
+- the new task ids;
+- why the old task was too broad;
+- the output and verification signal for each new task.
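Some of the "split again" smells in the task-slicing skill above are concrete enough to check automatically. The sketch below is one possible reading, not something the package ships: the vague phrases and the core/adapter/harness combination are taken from the skill text, while the function name `slicing_smells`, its parameters, and the idea of passing touched areas and an output count as explicit arguments are assumptions. Smells that need judgment, such as mixed verification strategies or verb-led wording, are left out.

```python
# Phrases quoted in the skill's smell list.
VAGUE_PHRASES = ("implement stage", "finish adapter", "wire everything", "support all cases")

# The one area combination the skill names explicitly.
BROAD_COMBINATION = {"core", "adapter", "harness"}


def slicing_smells(task, touched_areas, outputs=1):
    """Return a list of detected smells; an empty list means none were found."""
    smells = []
    lowered = task.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            smells.append(f"vague phrase: {phrase}")
    if BROAD_COMBINATION <= set(touched_areas):
        smells.append("touches core, adapter, and harness together")
    if outputs > 1:
        smells.append("produces multiple independent artifacts")
    return smells


print(slicing_smells("Finalize the implement stage.", ["core"]))
# → ['vague phrase: implement stage']
```

An empty result does not prove a task is well sliced; it only means none of the mechanical smells fired, so the recipe's verification-signal check still applies.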