ai-driven-dev-v2 0.1.0a1__tar.gz

Files changed (489)
  1. ai_driven_dev_v2-0.1.0a1/.agents/skills/AGENTS.md +9 -0
  2. ai_driven_dev_v2-0.1.0a1/.agents/skills/aidd-eval/SKILL.md +109 -0
  3. ai_driven_dev_v2-0.1.0a1/.agents/skills/aidd-eval/references/e2e-flow-audit.md +20 -0
  4. ai_driven_dev_v2-0.1.0a1/.agents/skills/backlog-ops/SKILL.md +175 -0
  5. ai_driven_dev_v2-0.1.0a1/.agents/skills/live-e2e/SKILL.md +213 -0
  6. ai_driven_dev_v2-0.1.0a1/.agents/skills/project-navigation/SKILL.md +23 -0
  7. ai_driven_dev_v2-0.1.0a1/.agents/skills/runtime-log-triage/SKILL.md +23 -0
  8. ai_driven_dev_v2-0.1.0a1/.agents/skills/stage-contract-change/SKILL.md +25 -0
  9. ai_driven_dev_v2-0.1.0a1/.agents/skills/task-slicing/SKILL.md +78 -0
  10. ai_driven_dev_v2-0.1.0a1/.agents/skills/user-story-check/SKILL.md +22 -0
  11. ai_driven_dev_v2-0.1.0a1/.editorconfig +12 -0
  12. ai_driven_dev_v2-0.1.0a1/.github/pull_request_template.md +12 -0
  13. ai_driven_dev_v2-0.1.0a1/.github/workflows/ci.yml +88 -0
  14. ai_driven_dev_v2-0.1.0a1/.github/workflows/manual-live-e2e.yml +160 -0
  15. ai_driven_dev_v2-0.1.0a1/.github/workflows/release.yml +259 -0
  16. ai_driven_dev_v2-0.1.0a1/.gitignore +13 -0
  17. ai_driven_dev_v2-0.1.0a1/AGENTS.md +102 -0
  18. ai_driven_dev_v2-0.1.0a1/CLAUDE.md +7 -0
  19. ai_driven_dev_v2-0.1.0a1/CONTRIBUTING.md +150 -0
  20. ai_driven_dev_v2-0.1.0a1/Dockerfile +14 -0
  21. ai_driven_dev_v2-0.1.0a1/LICENSE +202 -0
  22. ai_driven_dev_v2-0.1.0a1/MANIFEST.md +296 -0
  23. ai_driven_dev_v2-0.1.0a1/Makefile +24 -0
  24. ai_driven_dev_v2-0.1.0a1/PKG-INFO +381 -0
  25. ai_driven_dev_v2-0.1.0a1/README.md +354 -0
  26. ai_driven_dev_v2-0.1.0a1/aidd.example.toml +40 -0
  27. ai_driven_dev_v2-0.1.0a1/contracts/AGENTS.md +10 -0
  28. ai_driven_dev_v2-0.1.0a1/contracts/documents/AGENTS.md +9 -0
  29. ai_driven_dev_v2-0.1.0a1/contracts/documents/answers.md +47 -0
  30. ai_driven_dev_v2-0.1.0a1/contracts/documents/idea-brief.md +23 -0
  31. ai_driven_dev_v2-0.1.0a1/contracts/documents/implementation-report.md +24 -0
  32. ai_driven_dev_v2-0.1.0a1/contracts/documents/plan.md +39 -0
  33. ai_driven_dev_v2-0.1.0a1/contracts/documents/qa-report.md +27 -0
  34. ai_driven_dev_v2-0.1.0a1/contracts/documents/questions.md +45 -0
  35. ai_driven_dev_v2-0.1.0a1/contracts/documents/repair-brief.md +64 -0
  36. ai_driven_dev_v2-0.1.0a1/contracts/documents/research-notes.md +40 -0
  37. ai_driven_dev_v2-0.1.0a1/contracts/documents/review-report.md +26 -0
  38. ai_driven_dev_v2-0.1.0a1/contracts/documents/review-spec-report.md +45 -0
  39. ai_driven_dev_v2-0.1.0a1/contracts/documents/stage-brief.md +58 -0
  40. ai_driven_dev_v2-0.1.0a1/contracts/documents/stage-result.md +76 -0
  41. ai_driven_dev_v2-0.1.0a1/contracts/documents/tasklist.md +24 -0
  42. ai_driven_dev_v2-0.1.0a1/contracts/documents/validator-report.md +81 -0
  43. ai_driven_dev_v2-0.1.0a1/contracts/examples/AGENTS.md +9 -0
  44. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/AGENTS.md +3 -0
  45. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/answers.md +4 -0
  46. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/questions.md +4 -0
  47. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/repair-brief.md +18 -0
  48. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/stage-brief.md +27 -0
  49. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/stage-result.md +34 -0
  50. ai_driven_dev_v2-0.1.0a1/contracts/examples/common-documents/validator-report.md +19 -0
  51. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/AGENTS.md +3 -0
  52. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/answers.md +5 -0
  53. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/idea-brief.md +19 -0
  54. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/questions.md +5 -0
  55. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/stage-result.md +40 -0
  56. ai_driven_dev_v2-0.1.0a1/contracts/examples/idea/validator-report.md +25 -0
  57. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/AGENTS.md +3 -0
  58. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/README.md +6 -0
  59. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/implementation-report.md +22 -0
  60. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/repair-brief.md +20 -0
  61. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/stage-result.md +39 -0
  62. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/repair-needed/validator-report.md +27 -0
  63. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/implementation-report.md +24 -0
  64. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/stage-result.md +38 -0
  65. ai_driven_dev_v2-0.1.0a1/contracts/examples/implement/success/validator-report.md +25 -0
  66. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/AGENTS.md +3 -0
  67. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/README.md +6 -0
  68. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/plan.md +34 -0
  69. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/questions.md +5 -0
  70. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/stage-result.md +39 -0
  71. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/invalid/validator-report.md +26 -0
  72. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/plan.md +47 -0
  73. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/stage-result.md +38 -0
  74. ai_driven_dev_v2-0.1.0a1/contracts/examples/plan/valid/validator-report.md +25 -0
  75. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/AGENTS.md +3 -0
  76. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/README.md +6 -0
  77. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/qa-report.md +21 -0
  78. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/repair-brief.md +20 -0
  79. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/stage-result.md +40 -0
  80. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/repair-needed/validator-report.md +27 -0
  81. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/qa-report.md +33 -0
  82. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/stage-result.md +38 -0
  83. ai_driven_dev_v2-0.1.0a1/contracts/examples/qa/success/validator-report.md +25 -0
  84. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/AGENTS.md +3 -0
  85. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/README.md +6 -0
  86. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/answers.md +5 -0
  87. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/questions.md +5 -0
  88. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/research-notes.md +30 -0
  89. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/stage-result.md +40 -0
  90. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/answered/validator-report.md +25 -0
  91. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/questions.md +5 -0
  92. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/research-notes.md +25 -0
  93. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/stage-result.md +39 -0
  94. ai_driven_dev_v2-0.1.0a1/contracts/examples/research/unresolved/validator-report.md +25 -0
  95. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/AGENTS.md +3 -0
  96. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/README.md +6 -0
  97. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/repair-brief.md +20 -0
  98. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/review-report.md +17 -0
  99. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/stage-result.md +39 -0
  100. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/repair-needed/validator-report.md +27 -0
  101. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/review-report.md +25 -0
  102. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/stage-result.md +38 -0
  103. ai_driven_dev_v2-0.1.0a1/contracts/examples/review/success/validator-report.md +25 -0
  104. ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/AGENTS.md +3 -0
  105. ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/review-spec-report.md +29 -0
  106. ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/stage-result.md +38 -0
  107. ai_driven_dev_v2-0.1.0a1/contracts/examples/review-spec/validator-report.md +25 -0
  108. ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/AGENTS.md +3 -0
  109. ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/stage-result.md +38 -0
  110. ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/tasklist.md +43 -0
  111. ai_driven_dev_v2-0.1.0a1/contracts/examples/tasklist/validator-report.md +25 -0
  112. ai_driven_dev_v2-0.1.0a1/contracts/stages/AGENTS.md +10 -0
  113. ai_driven_dev_v2-0.1.0a1/contracts/stages/idea.md +112 -0
  114. ai_driven_dev_v2-0.1.0a1/contracts/stages/implement.md +109 -0
  115. ai_driven_dev_v2-0.1.0a1/contracts/stages/plan.md +112 -0
  116. ai_driven_dev_v2-0.1.0a1/contracts/stages/qa.md +118 -0
  117. ai_driven_dev_v2-0.1.0a1/contracts/stages/research.md +111 -0
  118. ai_driven_dev_v2-0.1.0a1/contracts/stages/review-spec.md +117 -0
  119. ai_driven_dev_v2-0.1.0a1/contracts/stages/review.md +123 -0
  120. ai_driven_dev_v2-0.1.0a1/contracts/stages/tasklist.md +116 -0
  121. ai_driven_dev_v2-0.1.0a1/docs/AGENTS.md +11 -0
  122. ai_driven_dev_v2-0.1.0a1/docs/analysis/AGENTS.md +9 -0
  123. ai_driven_dev_v2-0.1.0a1/docs/analysis/analytical-note.md +66 -0
  124. ai_driven_dev_v2-0.1.0a1/docs/architecture/AGENTS.md +10 -0
  125. ai_driven_dev_v2-0.1.0a1/docs/architecture/adapter-conformance-matrix.md +37 -0
  126. ai_driven_dev_v2-0.1.0a1/docs/architecture/adapter-protocol.md +219 -0
  127. ai_driven_dev_v2-0.1.0a1/docs/architecture/distribution-and-development.md +136 -0
  128. ai_driven_dev_v2-0.1.0a1/docs/architecture/document-contracts.md +227 -0
  129. ai_driven_dev_v2-0.1.0a1/docs/architecture/eval-harness-integration.md +236 -0
  130. ai_driven_dev_v2-0.1.0a1/docs/architecture/operator-frontend.md +124 -0
  131. ai_driven_dev_v2-0.1.0a1/docs/architecture/project-set-workspace.md +110 -0
  132. ai_driven_dev_v2-0.1.0a1/docs/architecture/runtime-matrix.md +45 -0
  133. ai_driven_dev_v2-0.1.0a1/docs/architecture/target-architecture.md +515 -0
  134. ai_driven_dev_v2-0.1.0a1/docs/backlog/AGENTS.md +10 -0
  135. ai_driven_dev_v2-0.1.0a1/docs/backlog/backlog.md +87 -0
  136. ai_driven_dev_v2-0.1.0a1/docs/backlog/rebuild-plan.md +142 -0
  137. ai_driven_dev_v2-0.1.0a1/docs/backlog/roadmap.md +5256 -0
  138. ai_driven_dev_v2-0.1.0a1/docs/compatibility-policy.md +185 -0
  139. ai_driven_dev_v2-0.1.0a1/docs/e2e/AGENTS.md +9 -0
  140. ai_driven_dev_v2-0.1.0a1/docs/e2e/live-e2e-catalog.md +169 -0
  141. ai_driven_dev_v2-0.1.0a1/docs/e2e/live-quality-rubric.md +93 -0
  142. ai_driven_dev_v2-0.1.0a1/docs/e2e/operator-ui-local-project.md +104 -0
  143. ai_driven_dev_v2-0.1.0a1/docs/e2e/scenario-matrix.md +96 -0
  144. ai_driven_dev_v2-0.1.0a1/docs/operator-handbook.md +257 -0
  145. ai_driven_dev_v2-0.1.0a1/docs/operator-support-policy.md +141 -0
  146. ai_driven_dev_v2-0.1.0a1/docs/operator-troubleshooting.md +186 -0
  147. ai_driven_dev_v2-0.1.0a1/docs/product/AGENTS.md +9 -0
  148. ai_driven_dev_v2-0.1.0a1/docs/product/user-stories.md +144 -0
  149. ai_driven_dev_v2-0.1.0a1/docs/release-checklist.md +140 -0
  150. ai_driven_dev_v2-0.1.0a1/harness/AGENTS.md +9 -0
  151. ai_driven_dev_v2-0.1.0a1/harness/fixtures/AGENTS.md +9 -0
  152. ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/AGENTS.md +9 -0
  153. ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/aidd_fixture_runtime.py +206 -0
  154. ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/pyproject.toml +4 -0
  155. ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/src/minimal_app/__init__.py +2 -0
  156. ai_driven_dev_v2-0.1.0a1/harness/fixtures/minimal-python/tests/test_minimal_app.py +5 -0
  157. ai_driven_dev_v2-0.1.0a1/harness/scenarios/AGENTS.md +9 -0
  158. ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/minimal-python-bounded-workflow.yaml +42 -0
  159. ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/minimal-python-full-workflow.yaml +41 -0
  160. ai_driven_dev_v2-0.1.0a1/harness/scenarios/deterministic/project-set-plan-context.yaml +87 -0
  161. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/AGENTS.md +9 -0
  162. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/hono-non-error-throw-handling.yaml +82 -0
  163. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/hono-router-double-star-parity.yaml +116 -0
  164. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/httpx-cli-docs-sync.yaml +59 -0
  165. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/httpx-invalid-header-message.yaml +81 -0
  166. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml +98 -0
  167. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/sqlite-utils-yielded-rows-interview.yaml +125 -0
  168. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/typer-boolean-help-rendering.yaml +59 -0
  169. ai_driven_dev_v2-0.1.0a1/harness/scenarios/live/typer-styled-help-alignment.yaml +82 -0
  170. ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/AGENTS.md +9 -0
  171. ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/installed-local-project-fixture.yaml +97 -0
  172. ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/plan-stage-minimal-fixture.yaml +33 -0
  173. ai_driven_dev_v2-0.1.0a1/harness/scenarios/smoke/plan-stagepack-smoke.yaml +55 -0
  174. ai_driven_dev_v2-0.1.0a1/manifest.txt +262 -0
  175. ai_driven_dev_v2-0.1.0a1/prompt-packs/AGENTS.md +9 -0
  176. ai_driven_dev_v2-0.1.0a1/prompt-packs/common/AGENTS.md +9 -0
  177. ai_driven_dev_v2-0.1.0a1/prompt-packs/common/run-rules.md +7 -0
  178. ai_driven_dev_v2-0.1.0a1/prompt-packs/idea/AGENTS.md +9 -0
  179. ai_driven_dev_v2-0.1.0a1/prompt-packs/implement/AGENTS.md +9 -0
  180. ai_driven_dev_v2-0.1.0a1/prompt-packs/plan/AGENTS.md +9 -0
  181. ai_driven_dev_v2-0.1.0a1/prompt-packs/qa/AGENTS.md +9 -0
  182. ai_driven_dev_v2-0.1.0a1/prompt-packs/research/AGENTS.md +9 -0
  183. ai_driven_dev_v2-0.1.0a1/prompt-packs/review/AGENTS.md +9 -0
  184. ai_driven_dev_v2-0.1.0a1/prompt-packs/review-spec/AGENTS.md +9 -0
  185. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/AGENTS.md +9 -0
  186. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/AGENTS.md +10 -0
  187. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/interview.md +12 -0
  188. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/repair.md +80 -0
  189. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/run.md +70 -0
  190. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/idea/system.md +20 -0
  191. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/AGENTS.md +10 -0
  192. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/interview.md +13 -0
  193. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/repair.md +86 -0
  194. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/run.md +80 -0
  195. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/implement/system.md +22 -0
  196. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/AGENTS.md +10 -0
  197. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/interview.md +18 -0
  198. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/repair.md +75 -0
  199. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/run.md +81 -0
  200. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/plan/system.md +22 -0
  201. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/AGENTS.md +10 -0
  202. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/interview.md +12 -0
  203. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/repair.md +82 -0
  204. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/run.md +85 -0
  205. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/qa/system.md +23 -0
  206. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/AGENTS.md +10 -0
  207. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/interview.md +18 -0
  208. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/repair.md +71 -0
  209. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/run.md +72 -0
  210. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/research/system.md +21 -0
  211. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/AGENTS.md +10 -0
  212. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/interview.md +12 -0
  213. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/repair.md +89 -0
  214. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/run.md +90 -0
  215. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review/system.md +22 -0
  216. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/AGENTS.md +10 -0
  217. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/interview.md +18 -0
  218. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/repair.md +81 -0
  219. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/run.md +75 -0
  220. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/review-spec/system.md +21 -0
  221. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/AGENTS.md +10 -0
  222. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/interview.md +12 -0
  223. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/repair.md +80 -0
  224. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/run.md +71 -0
  225. ai_driven_dev_v2-0.1.0a1/prompt-packs/stages/tasklist/system.md +22 -0
  226. ai_driven_dev_v2-0.1.0a1/prompt-packs/tasklist/AGENTS.md +9 -0
  227. ai_driven_dev_v2-0.1.0a1/pyproject.toml +64 -0
  228. ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/backlog-coverage.md +35 -0
  229. ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/blockers-and-next-actions.md +52 -0
  230. ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/repo-readiness-report.md +77 -0
  231. ai_driven_dev_v2-0.1.0a1/reports/repo-readiness/user-story-traceability.md +17 -0
  232. ai_driven_dev_v2-0.1.0a1/scripts/release_live_proof_runtime.py +612 -0
  233. ai_driven_dev_v2-0.1.0a1/src/aidd/AGENTS.md +9 -0
  234. ai_driven_dev_v2-0.1.0a1/src/aidd/__init__.py +5 -0
  235. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/AGENTS.md +10 -0
  236. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/__init__.py +1 -0
  237. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/base.py +19 -0
  238. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/AGENTS.md +10 -0
  239. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/__init__.py +5 -0
  240. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/probe.py +56 -0
  241. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/claude_code/runner.py +742 -0
  242. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/AGENTS.md +10 -0
  243. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/__init__.py +5 -0
  244. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/probe.py +48 -0
  245. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/codex/runner.py +309 -0
  246. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/AGENTS.md +10 -0
  247. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/__init__.py +5 -0
  248. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/probe.py +28 -0
  249. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/generic_cli/runner.py +212 -0
  250. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/native_prompt.py +187 -0
  251. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/AGENTS.md +10 -0
  252. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/__init__.py +5 -0
  253. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/probe.py +48 -0
  254. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/opencode/runner.py +343 -0
  255. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/path_resolution.py +46 -0
  256. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/probe_support.py +207 -0
  257. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runner_support.py +169 -0
  258. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_artifacts.py +32 -0
  259. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_events.py +244 -0
  260. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_execution.py +61 -0
  261. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/runtime_registry.py +123 -0
  262. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/subprocess_streaming.py +196 -0
  263. ai_driven_dev_v2-0.1.0a1/src/aidd/adapters/surface.py +464 -0
  264. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/AGENTS.md +10 -0
  265. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/__init__.py +1 -0
  266. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/doctor.py +84 -0
  267. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/eval.py +114 -0
  268. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/init_command.py +25 -0
  269. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/main.py +123 -0
  270. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/run.py +431 -0
  271. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/run_lookup.py +31 -0
  272. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage.py +89 -0
  273. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage_inspection.py +129 -0
  274. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/stage_run.py +485 -0
  275. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/support.py +145 -0
  276. ai_driven_dev_v2-0.1.0a1/src/aidd/cli/ui.py +801 -0
  277. ai_driven_dev_v2-0.1.0a1/src/aidd/compatibility.py +42 -0
  278. ai_driven_dev_v2-0.1.0a1/src/aidd/config.py +465 -0
  279. ai_driven_dev_v2-0.1.0a1/src/aidd/core/AGENTS.md +9 -0
  280. ai_driven_dev_v2-0.1.0a1/src/aidd/core/__init__.py +1 -0
  281. ai_driven_dev_v2-0.1.0a1/src/aidd/core/adapter_interview.py +25 -0
  282. ai_driven_dev_v2-0.1.0a1/src/aidd/core/contracts.py +17 -0
  283. ai_driven_dev_v2-0.1.0a1/src/aidd/core/interview.py +470 -0
  284. ai_driven_dev_v2-0.1.0a1/src/aidd/core/markdown.py +199 -0
  285. ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/AGENTS.md +8 -0
  286. ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/__init__.py +1 -0
  287. ai_driven_dev_v2-0.1.0a1/src/aidd/core/models/run.py +347 -0
  288. ai_driven_dev_v2-0.1.0a1/src/aidd/core/operator_frontend.py +236 -0
  289. ai_driven_dev_v2-0.1.0a1/src/aidd/core/project_set.py +142 -0
  290. ai_driven_dev_v2-0.1.0a1/src/aidd/core/repair.py +639 -0
  291. ai_driven_dev_v2-0.1.0a1/src/aidd/core/resources.py +109 -0
  292. ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_inspection.py +646 -0
  293. ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_lookup.py +369 -0
  294. ai_driven_dev_v2-0.1.0a1/src/aidd/core/run_store.py +873 -0
  295. ai_driven_dev_v2-0.1.0a1/src/aidd/core/runtime_readiness.py +91 -0
  296. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_graph.py +310 -0
  297. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_interview_routing.py +117 -0
  298. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_invocation.py +267 -0
  299. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_manifest.py +82 -0
  300. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_models.py +198 -0
  301. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_outputs.py +256 -0
  302. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_paths.py +17 -0
  303. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_preparation.py +163 -0
  304. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_registry.py +257 -0
  305. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_runner.py +397 -0
  306. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_terminal.py +193 -0
  307. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stage_validation.py +338 -0
  308. ai_driven_dev_v2-0.1.0a1/src/aidd/core/stages.py +29 -0
  309. ai_driven_dev_v2-0.1.0a1/src/aidd/core/state_machine.py +69 -0
  310. ai_driven_dev_v2-0.1.0a1/src/aidd/core/work_item.py +8 -0
  311. ai_driven_dev_v2-0.1.0a1/src/aidd/core/workflow_service.py +264 -0
  312. ai_driven_dev_v2-0.1.0a1/src/aidd/core/workspace.py +202 -0
  313. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/AGENTS.md +9 -0
  314. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/__init__.py +1 -0
  315. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/log_analysis.py +788 -0
  316. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/quality.py +610 -0
  317. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/reporting.py +278 -0
  318. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/self_repair_probes.py +111 -0
  319. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/stage_timing.py +819 -0
  320. ai_driven_dev_v2-0.1.0a1/src/aidd/evals/verdicts.py +193 -0
  321. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/AGENTS.md +9 -0
  322. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/__init__.py +18 -0
  323. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/adapter_conformance.py +138 -0
  324. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/conformance_matrix.py +113 -0
  325. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_classification.py +255 -0
  326. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_execution.py +217 -0
  327. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_models.py +110 -0
  328. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_preparation.py +149 -0
  329. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_reports.py +918 -0
  330. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/eval_runner.py +248 -0
  331. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/install_artifact.py +230 -0
  332. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/live_runtime_config.py +256 -0
  333. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/live_workspace_bootstrap.py +173 -0
  334. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/repo_prep.py +299 -0
  335. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/result_bundle.py +375 -0
  336. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/runner.py +399 -0
  337. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/scenario_loader.py +22 -0
  338. ai_driven_dev_v2-0.1.0a1/src/aidd/harness/scenarios.py +576 -0
  339. ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/AGENTS.md +8 -0
  340. ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/__init__.py +1 -0
  341. ai_driven_dev_v2-0.1.0a1/src/aidd/runtime_logs/model.py +10 -0
  342. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/AGENTS.md +9 -0
  343. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/__init__.py +1 -0
  344. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/cross_document.py +309 -0
  345. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/document_loader.py +210 -0
  346. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/documents.py +13 -0
  347. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/models.py +61 -0
  348. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/reports.py +106 -0
  349. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic.py +31 -0
  350. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/__init__.py +9 -0
  351. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/blocks.py +149 -0
  352. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/common.py +467 -0
  353. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/evidence.py +83 -0
  354. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/findings.py +21 -0
  355. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/idea.py +76 -0
  356. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/ids.py +38 -0
  357. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/implement.py +321 -0
  358. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/placeholders.py +173 -0
  359. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/plan.py +194 -0
  360. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/qa.py +312 -0
  361. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/registry.py +120 -0
  362. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/research.py +86 -0
  363. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/review.py +236 -0
  364. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/review_spec.py +179 -0
  365. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/risks.py +52 -0
  366. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/semantic_rules/tasklist.py +271 -0
  367. ai_driven_dev_v2-0.1.0a1/src/aidd/validators/structural.py +242 -0
  368. ai_driven_dev_v2-0.1.0a1/tests/AGENTS.md +9 -0
  369. ai_driven_dev_v2-0.1.0a1/tests/adapters/AGENTS.md +8 -0
  370. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_claude_code_probe.py +87 -0
  371. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_claude_code_runner.py +1098 -0
  372. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_codex_probe.py +116 -0
  373. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_codex_runner.py +274 -0
  374. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_document_handshake.py +372 -0
  375. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_probe.py +62 -0
  376. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_generic_cli_runner.py +324 -0
  377. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_native_prompt.py +61 -0
  378. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_opencode_probe.py +115 -0
  379. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_opencode_runner.py +289 -0
  380. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_events.py +88 -0
  381. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_execution_contract.py +36 -0
  382. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_runtime_registry.py +27 -0
  383. ai_driven_dev_v2-0.1.0a1/tests/adapters/test_subprocess_streaming.py +90 -0
  384. ai_driven_dev_v2-0.1.0a1/tests/cli/AGENTS.md +8 -0
  385. ai_driven_dev_v2-0.1.0a1/tests/cli/test_doctor.py +87 -0
  386. ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_doctor.py +73 -0
  387. ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_run.py +146 -0
  388. ai_driven_dev_v2-0.1.0a1/tests/cli/test_eval_summary.py +45 -0
  389. ai_driven_dev_v2-0.1.0a1/tests/cli/test_release_live_proof_runtime.py +88 -0
  390. ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_artifacts.py +207 -0
  391. ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_logs.py +190 -0
  392. ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_show.py +152 -0
  393. ai_driven_dev_v2-0.1.0a1/tests/cli/test_run_workflow.py +509 -0
  394. ai_driven_dev_v2-0.1.0a1/tests/cli/test_runtime_timeout.py +47 -0
  395. ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_questions.py +94 -0
  396. ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_run.py +990 -0
  397. ai_driven_dev_v2-0.1.0a1/tests/cli/test_stage_summary.py +196 -0
  398. ai_driven_dev_v2-0.1.0a1/tests/cli/test_ui.py +428 -0
  399. ai_driven_dev_v2-0.1.0a1/tests/core/AGENTS.md +8 -0
  400. ai_driven_dev_v2-0.1.0a1/tests/core/test_interview.py +336 -0
  401. ai_driven_dev_v2-0.1.0a1/tests/core/test_operator_frontend.py +267 -0
  402. ai_driven_dev_v2-0.1.0a1/tests/core/test_project_set.py +130 -0
  403. ai_driven_dev_v2-0.1.0a1/tests/core/test_repair.py +442 -0
  404. ai_driven_dev_v2-0.1.0a1/tests/core/test_repair_flow.py +245 -0
  405. ai_driven_dev_v2-0.1.0a1/tests/core/test_resources.py +56 -0
  406. ai_driven_dev_v2-0.1.0a1/tests/core/test_run_lookup.py +534 -0
  407. ai_driven_dev_v2-0.1.0a1/tests/core/test_run_store_layout.py +598 -0
  408. ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_graph.py +533 -0
  409. ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_manifest.py +73 -0
  410. ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_registry.py +303 -0
  411. ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_runner.py +2411 -0
  412. ai_driven_dev_v2-0.1.0a1/tests/core/test_stage_terminal.py +129 -0
  413. ai_driven_dev_v2-0.1.0a1/tests/core/test_state_machine.py +53 -0
  414. ai_driven_dev_v2-0.1.0a1/tests/core/test_workflow_service.py +93 -0
  415. ai_driven_dev_v2-0.1.0a1/tests/core/test_workspace_layout.py +171 -0
  416. ai_driven_dev_v2-0.1.0a1/tests/evals/AGENTS.md +8 -0
  417. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_events_jsonl.py +89 -0
  418. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_first_boundary.py +147 -0
  419. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_regressions.py +100 -0
  420. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_runtime_log.py +136 -0
  421. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_taxonomy.py +134 -0
  422. ai_driven_dev_v2-0.1.0a1/tests/evals/test_log_analysis_validation_inputs.py +89 -0
  423. ai_driven_dev_v2-0.1.0a1/tests/evals/test_quality.py +442 -0
  424. ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_latest_summary.py +55 -0
  425. ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_markdown_summary.py +165 -0
  426. ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_runtime_aggregation.py +85 -0
  427. ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_summary_regressions.py +114 -0
  428. ai_driven_dev_v2-0.1.0a1/tests/evals/test_reporting_summary_rows.py +65 -0
  429. ai_driven_dev_v2-0.1.0a1/tests/evals/test_self_repair_probes.py +46 -0
  430. ai_driven_dev_v2-0.1.0a1/tests/evals/test_stage_timing.py +305 -0
  431. ai_driven_dev_v2-0.1.0a1/tests/evals/test_verdicts.py +321 -0
  432. ai_driven_dev_v2-0.1.0a1/tests/harness/AGENTS.md +8 -0
  433. ai_driven_dev_v2-0.1.0a1/tests/harness/test_adapter_conformance_lane.py +40 -0
  434. ai_driven_dev_v2-0.1.0a1/tests/harness/test_conformance_matrix.py +40 -0
  435. ai_driven_dev_v2-0.1.0a1/tests/harness/test_eval_runner.py +1026 -0
  436. ai_driven_dev_v2-0.1.0a1/tests/harness/test_install_artifact.py +104 -0
  437. ai_driven_dev_v2-0.1.0a1/tests/harness/test_live_runtime_config.py +257 -0
  438. ai_driven_dev_v2-0.1.0a1/tests/harness/test_repo_prep.py +444 -0
  439. ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_artifacts.py +59 -0
  440. ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_completeness.py +220 -0
  441. ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_layout.py +61 -0
  442. ai_driven_dev_v2-0.1.0a1/tests/harness/test_result_bundle_persistence.py +167 -0
  443. ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_integration.py +187 -0
  444. ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_invoke.py +283 -0
  445. ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_setup.py +91 -0
  446. ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_teardown.py +92 -0
  447. ai_driven_dev_v2-0.1.0a1/tests/harness/test_runner_verify.py +119 -0
  448. ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_model.py +304 -0
  449. ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_substitutions.py +104 -0
  450. ai_driven_dev_v2-0.1.0a1/tests/harness/test_scenario_loader_validation.py +474 -0
  451. ai_driven_dev_v2-0.1.0a1/tests/test_cli_run_lookup.py +239 -0
  452. ai_driven_dev_v2-0.1.0a1/tests/test_cli_smoke.py +28 -0
  453. ai_driven_dev_v2-0.1.0a1/tests/test_config.py +341 -0
  454. ai_driven_dev_v2-0.1.0a1/tests/test_contract_registry.py +12 -0
  455. ai_driven_dev_v2-0.1.0a1/tests/test_docs_consistency.py +301 -0
  456. ai_driven_dev_v2-0.1.0a1/tests/test_packaging_resources.py +32 -0
  457. ai_driven_dev_v2-0.1.0a1/tests/test_prompt_quality.py +72 -0
  458. ai_driven_dev_v2-0.1.0a1/tests/test_release_workflow.py +122 -0
  459. ai_driven_dev_v2-0.1.0a1/tests/test_reporting.py +11 -0
  460. ai_driven_dev_v2-0.1.0a1/tests/test_scenario_loader.py +16 -0
  461. ai_driven_dev_v2-0.1.0a1/tests/test_scenario_taxonomy.py +193 -0
  462. ai_driven_dev_v2-0.1.0a1/tests/validators/AGENTS.md +8 -0
  463. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/README.md +43 -0
  464. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-invalid-noop/workspace/workitems/WI-SEM-IMPLEMENT-NOOP/stages/implement/implementation-report.md +22 -0
  465. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-invalid-verification/workspace/workitems/WI-SEM-IMPLEMENT-VERIFY/stages/implement/implementation-report.md +23 -0
  466. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/implement-valid/workspace/workitems/WI-SEM-IMPLEMENT-VALID/stages/implement/implementation-report.md +24 -0
  467. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/invalid/workspace/workitems/WI-SEM-INVALID/stages/idea/idea-brief.md +17 -0
  468. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/invalid-list-format/workspace/workitems/WI-SEM-LIST-INVALID/stages/idea/idea-brief.md +17 -0
  469. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/plan-invalid/workspace/workitems/WI-SEM-PLAN-INVALID/stages/plan/plan.md +34 -0
  470. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/plan-valid/workspace/workitems/WI-SEM-PLAN-VALID/stages/plan/plan.md +39 -0
  471. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/qa-invalid/workspace/workitems/WI-SEM-QA-INVALID/stages/qa/qa-report.md +17 -0
  472. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/qa-valid/workspace/workitems/WI-SEM-QA-VALID/stages/qa/qa-report.md +20 -0
  473. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-invalid-missing-source/workspace/workitems/WI-SEM-RESEARCH-MISSING-SOURCE/stages/research/research-notes.md +25 -0
  474. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-invalid-unresolved-question/workspace/workitems/WI-SEM-RESEARCH-UNRESOLVED/stages/research/research-notes.md +25 -0
  475. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/research-valid/workspace/workitems/WI-SEM-RESEARCH-VALID/stages/research/research-notes.md +25 -0
  476. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-invalid/workspace/workitems/WI-SEM-REVIEW-INVALID/stages/review/review-report.md +18 -0
  477. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-spec-invalid/workspace/workitems/WI-SEM-REVIEW-SPEC-INVALID/stages/review-spec/review-spec-report.md +25 -0
  478. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-spec-valid/workspace/workitems/WI-SEM-REVIEW-SPEC-VALID/stages/review-spec/review-spec-report.md +29 -0
  479. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/review-valid/workspace/workitems/WI-SEM-REVIEW-VALID/stages/review/review-report.md +23 -0
  480. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/tasklist-invalid/workspace/workitems/WI-SEM-TASKLIST-INVALID/stages/tasklist/tasklist.md +18 -0
  481. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/tasklist-valid/workspace/workitems/WI-SEM-TASKLIST-VALID/stages/tasklist/tasklist.md +24 -0
  482. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/valid/workspace/workitems/WI-SEM-VALID/stages/idea/idea-brief.md +18 -0
  483. ai_driven_dev_v2-0.1.0a1/tests/validators/fixtures/semantic/valid-list-format/workspace/workitems/WI-SEM-LIST-VALID/stages/idea/idea-brief.md +18 -0
  484. ai_driven_dev_v2-0.1.0a1/tests/validators/test_cross_document.py +581 -0
  485. ai_driven_dev_v2-0.1.0a1/tests/validators/test_document_loader.py +224 -0
  486. ai_driven_dev_v2-0.1.0a1/tests/validators/test_models.py +41 -0
  487. ai_driven_dev_v2-0.1.0a1/tests/validators/test_reports.py +90 -0
  488. ai_driven_dev_v2-0.1.0a1/tests/validators/test_semantic.py +2670 -0
  489. ai_driven_dev_v2-0.1.0a1/tests/validators/test_structural.py +575 -0
@@ -0,0 +1,9 @@
1
+ # AGENTS.md
2
+
3
+ This directory holds reusable development workflows for coding agents.
4
+
5
+ ## Rules
6
+
7
+ - Keep each skill focused on one repeatable workflow.
8
+ - Prefer repo-specific instructions over generic advice.
9
+ - Update skills when the roadmap, architecture, or contributor workflow changes.
@@ -0,0 +1,109 @@
1
+ ---
2
+ name: aidd-eval
3
+ description: Run harness and eval scenarios for ai_driven_dev_v2, validate document-first stage outputs, preserve runtime logs, analyze failures, and produce durable audit artifacts for deterministic and manual-live lanes.
4
+ ---
5
+
6
+ # aidd-eval
7
+
8
+ ## Use when
9
+
10
+ - You need to run a harness scenario against one of the maintained runtimes.
11
+ - You need to validate stage outputs against Markdown document contracts.
12
+ - You need to check self-repair behavior after validator failures.
13
+ - You need to capture runtime logs, normalized events, and log-analysis artifacts.
14
+ - You need to audit generated artifacts and generated code after execution.
15
+
16
+ For **local live-run operator guidance**, prefer `live-e2e`.
17
+ Use `aidd-eval` when the main task is generic eval execution, artifact analysis,
18
+ validation discipline, grading, and failure classification across deterministic
19
+ and manual-live lanes.
20
+
21
+ ## Required reading
22
+
23
+ 1. `docs/architecture/eval-harness-integration.md`
24
+ 2. `docs/architecture/document-contracts.md`
25
+ 3. `docs/architecture/adapter-protocol.md`
26
+ 4. `docs/architecture/runtime-matrix.md`
27
+ 5. `docs/e2e/scenario-matrix.md`
28
+ 6. `docs/e2e/live-quality-rubric.md` for live scenarios
29
+ 7. the selected scenario under `harness/scenarios/`
30
+ 8. `.agents/skills/aidd-eval/references/e2e-flow-audit.md`
31
+
32
+ ## Lane split
33
+
34
+ - Deterministic scenarios use `feature_source.mode: fixture-seed` and may run in `ci` or `manual`.
35
+ - Live scenarios use `feature_source.mode: curated-issue-pool`, must live under `harness/scenarios/live/`, and are manual-only.
36
+
37
+ ## Hard rules
38
+
39
+ 1. Never hand-edit runtime-generated stage output documents during an eval run.
40
+ 2. Always probe the adapter first.
41
+ 3. Always preserve raw runtime logs.
42
+ 4. Always validate output Markdown documents against their contracts.
43
+ 5. Always allow the stage self-repair loop to run if the scenario expects repairable failures.
44
+ 6. Always keep question/answer events as durable artifacts.
45
+ 7. Always generate log-analysis output.
46
+ 8. Keep infrastructure failures separate from model or document failures.
47
+ 9. For live scenarios, preserve install evidence, issue-selection evidence, and quality artifacts.
48
+ 10. Never mutate roadmap or backlog files as part of live quality auditing.
49
+
50
+ ## Default procedure
51
+
52
+ 1. Load the scenario and confirm the requested runtime is allowed.
53
+ 2. Probe the adapter and record capability information.
54
+ 3. Prepare or reset the fixture workspace or target repository.
55
+ 4. Run the requested stage or flow through the harness.
56
+ For live scenarios, select the first curated issue, install the artifact under test first, and run AIDD from the target repository root.
57
+ 5. Capture:
58
+ - install transcript and artifact identity for live scenarios,
59
+ - issue-selection evidence for live scenarios,
60
+ - fixture-seed metadata for deterministic scenarios,
61
+ - raw runtime logs,
62
+ - structured runtime logs when available,
63
+ - normalized events,
64
+ - question/answer events,
65
+ - validator outcomes,
66
+ - repair attempts.
67
+ 6. Validate all required output documents.
68
+ 7. Run live quality commands and score artifact/code quality when the scenario requires it.
69
+ 8. Run graders.
70
+ 9. Run log analysis.
71
+ 10. Write the final audit artifacts.
72
+ 11. Report the final execution verdict and quality conclusion explicitly.
73
+
74
+ ## Canonical output locations
75
+
76
+ - `.aidd/reports/evals/<run_id>/runtime.log`
77
+ - `.aidd/reports/evals/<run_id>/runtime.jsonl` when supported
78
+ - `.aidd/reports/evals/<run_id>/events.jsonl` when supported
79
+ - `.aidd/reports/evals/<run_id>/install-transcript.json`
80
+ - `.aidd/reports/evals/<run_id>/issue-selection.json`
81
+ - `.aidd/reports/evals/<run_id>/validator-report.md`
82
+ - `.aidd/reports/evals/<run_id>/repair-history.md`
83
+ - `.aidd/reports/evals/<run_id>/log-analysis.md`
84
+ - `.aidd/reports/evals/<run_id>/grader.json`
85
+ - `.aidd/reports/evals/<run_id>/verdict.md`
86
+ - `.aidd/reports/evals/<run_id>/quality-report.md`
87
+ - `.aidd/reports/evals/<run_id>/quality-transcript.json`
88
+
89
+ ## Execution verdict taxonomy
90
+
91
+ For eval harness runs, preserve the stable execution verdict taxonomy:
92
+
93
+ - `pass`
94
+ - `fail`
95
+ - `blocked`
96
+ - `infra-fail`
97
+
98
+ Quality remains additive and must be reported separately as:
99
+
100
+ - `pass`
101
+ - `warn`
102
+ - `fail`
103
+ - `none`
104
+
105
+ ## Example command shape
106
+
107
+ ```bash
108
+ aidd eval run harness/scenarios/smoke/plan-stagepack-smoke.yaml --runtime opencode
109
+ ```
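Reviewer note (not part of the packaged file): the skill above keeps the execution verdict and the quality result as two separate taxonomies. A minimal sketch of that separation is shown here, assuming hypothetical names (`ExecutionVerdict`, `QualityVerdict`, `EvalOutcome`) that do not come from the package; only the string values are taken from the skill text.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionVerdict(str, Enum):
    """Execution verdict values listed in the skill (class name is hypothetical)."""

    PASS = "pass"
    FAIL = "fail"
    BLOCKED = "blocked"
    INFRA_FAIL = "infra-fail"


class QualityVerdict(str, Enum):
    """Additive quality values listed in the skill (class name is hypothetical)."""

    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"
    NONE = "none"


@dataclass(frozen=True)
class EvalOutcome:
    """Keeps the two results side by side instead of merging them into one field."""

    execution: ExecutionVerdict
    quality: QualityVerdict = QualityVerdict.NONE

    def summary(self) -> str:
        # Report both values explicitly, as the default procedure asks.
        return f"execution={self.execution.value} quality={self.quality.value}"


outcome = EvalOutcome(ExecutionVerdict("pass"), QualityVerdict("warn"))
print(outcome.summary())
```

Because the two enums are distinct types, a value such as `warn` cannot be stored as an execution verdict by accident; `ExecutionVerdict("warn")` raises `ValueError`.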
@@ -0,0 +1,20 @@
1
+ # E2E Flow Audit Reference
2
+
3
+ Use this reference when writing or reviewing eval audit output.
4
+
5
+ ## Minimum audit sections
6
+
7
+ - scenario summary
8
+ - runtime and adapter used
9
+ - repository pin or fixture identity
10
+ - stage or flow scope
11
+ - validator outcomes
12
+ - repair history
13
+ - user questions and answers
14
+ - log analysis
15
+ - final verdict
16
+ - follow-up actions
17
+
18
+ ## First-failure principle
19
+
20
+ The audit should name the earliest decisive failure signal, not only the last visible symptom.
@@ -0,0 +1,175 @@
1
+ ---
2
+ name: backlog-ops
3
+ description: Select, split, create, promote, and close roadmap tasks while keeping `roadmap.md` and `backlog.md` synchronized.
4
+ ---
5
+
6
+ # backlog-ops
7
+
8
+ Use this skill whenever you touch `docs/backlog/roadmap.md` or `docs/backlog/backlog.md`.
9
+
10
+ ## Planning sources
11
+
12
+ Read these in order:
13
+
14
+ 1. `docs/backlog/backlog.md`
15
+ 2. `docs/backlog/roadmap.md`
16
+ 3. `docs/product/user-stories.md`
17
+ 4. the nearest `AGENTS.md` for the code or docs area you will touch
18
+
19
+ ## Canonical rules
20
+
21
+ - `docs/backlog/roadmap.md` is the canonical hierarchy.
22
+ - `docs/backlog/backlog.md` is the short actionable queue.
23
+ - Work must always fit `wave -> epic -> slice -> local task`.
24
+ - A local task must be reviewable without another decomposition pass.
25
+ - Update `roadmap.md` first, then update `backlog.md`.
26
+
27
+ ## Taking a task
28
+
29
+ 1. Read the `Next` section in `docs/backlog/backlog.md`.
30
+ 2. Pick the first local task marked `next` unless it is blocked by a documented dependency.
31
+ 3. Read the full parent slice in `docs/backlog/roadmap.md`.
32
+ 4. Read the linked user stories and architecture notes for the touched area.
33
+ 5. Restate the task in your own words before coding:
34
+ - exact output;
35
+ - touched module or file family;
36
+ - main verification signal;
37
+ - dependencies that must already exist.
38
+
39
+ Do not start coding if you cannot name all four items above.
40
+
41
+ ## Local-task quality bar
42
+
43
+ A valid local task has:
44
+
45
+ - one clear output artifact or code change;
46
+ - one dominant touched area;
47
+ - one main verification path;
48
+ - explicit upstream dependencies;
49
+ - wording that starts with a concrete verb.
50
+
51
+ A task must be split immediately if any of these are true:
52
+
53
+ - it touches more than one subsystem family, such as core + adapter + harness;
54
+ - it mixes contract design and broad downstream rollout;
55
+ - it has multiple independent outputs that could be reviewed separately;
56
+ - it has no single pass/fail check;
57
+ - it would require another planning discussion during implementation.
58
+
59
+ ## Local-task template
60
+
61
+ When you create or rewrite a local task, make it fit this template:
62
+
63
+ - **ID** — `W<wave>-E<epic>-S<slice>-T<task>`
64
+ - **Action** — starts with a verb such as `Define`, `Implement`, `Write`, `Add`, `Expose`, `Render`
65
+ - **Output** — name the artifact, module, or command that changes
66
+ - **Scope** — keep one dominant touched area
67
+ - **Verification** — state how the task will be proven done
68
+
69
+ Example:
70
+
71
+ - `W4-E1-S2-T4` Implement stdout and stderr streaming to the CLI while the subprocess runs.
72
+
73
+ That is good because it names the subsystem, the behavior, and the direct review target.
74
+
75
+ ## Creating a new local task
76
+
77
+ Create a new local task when the discovered work:
78
+
79
+ - clearly belongs to an existing slice goal;
80
+ - can be reviewed independently;
81
+ - has one dominant output and one verification signal.
82
+
83
+ Workflow:
84
+
85
+ 1. Add the new task under the correct slice in `roadmap.md`.
86
+ 2. Keep the existing slice goal unless the outcome changed materially.
87
+ 3. Preserve the current task id for the first surviving piece whenever you split active work.
88
+ 4. Append new task ids after the preserved one.
89
+ 5. Update slice dependencies, touched areas, or exit evidence if the new task changes them.
90
+ 6. Pull the new task into `backlog.md` only if it is immediately actionable.
91
+
92
+ ## Creating a new slice
93
+
94
+ Create a new slice only when the discovered work is a different meaningful outcome, for example:
95
+
96
+ - a new stage contract;
97
+ - a new adapter capability;
98
+ - a separate harness scenario lane;
99
+ - a separate operator command surface.
100
+
101
+ Do **not** create a new slice just because the current task is too large. Split into more local tasks first.
102
+
103
+ A good slice has:
104
+
105
+ - one outcome sentence in the goal;
106
+ - explicit primary outputs;
107
+ - touched areas;
108
+ - dependencies;
109
+ - exit evidence.
110
+
111
+ ## Creating a new epic
112
+
113
+ Create a new epic only when the theme changes enough that the work is no longer one coherent track, for example:
114
+
115
+ - moving from validators into runtime adapters;
116
+ - moving from harness execution into release operations.
117
+
118
+ If the work still serves the same theme, keep it inside the current epic.
119
+
120
+ ## Splitting workflow
121
+
122
+ When a task or slice is too large:
123
+
124
+ 1. Identify the dominant outputs hidden inside the oversized work.
125
+ 2. Keep the current id for the first, smallest reviewable piece.
126
+ 3. Create follow-up task ids for the remaining pieces.
127
+ 4. Reword each new task so it names the output directly.
128
+ 5. Check whether the parent slice still has one clear outcome.
129
+ 6. Update `backlog.md` so only the immediate next pieces stay in `Next`.
130
+
131
+ ## Dependency rules
132
+
133
+ - Dependencies belong on the slice, not repeated on every task unless there is an exception.
134
+ - A task may assume slice dependencies are already satisfied.
135
+ - If one task inside a slice depends on another task in the same slice, order the tasks so the dependency is obvious.
136
+ - If discovered work depends on another wave or epic, add that dependency explicitly to the slice.
137
+
138
+ ## Promotion rules for `backlog.md`
139
+
140
+ Use `backlog.md` as a queue, not as a second roadmap.
141
+
142
+ - `Next` contains immediately actionable local tasks only.
143
+ - `Soon` contains tasks that are likely next but still depend on `Next`.
144
+ - `Parking lot` holds later-wave tasks that should stay visible.
145
+ - Never place a slice or epic in `backlog.md`; only local task ids belong there.
146
+ - Never add a task to `backlog.md` unless it already exists in `roadmap.md`.
147
+
148
+ ## Closing work
149
+
150
+ After implementation:
151
+
152
+ 1. Mark the task or slice state in `roadmap.md` if it materially changed.
153
+ 2. Remove completed tasks from `backlog.md`.
154
+ 3. Add follow-up tasks to `roadmap.md` before mentioning them elsewhere.
155
+ 4. Delete stale wording instead of leaving historical clutter.
156
+ 5. Make sure the new plan still reads cleanly from wave to task.
157
+
158
+ ## Sync checklist
159
+
160
+ Any change to planning files should leave all of these true:
161
+
162
+ - every backlog id exists in the roadmap;
163
+ - every `Next` item is a local task, not a slice;
164
+ - no task wording is ambiguous or multi-output;
165
+ - parent slices still have one meaningful outcome;
166
+ - dependencies and exit evidence still match the work.
167
+
168
+ ## Output when reporting planning work
169
+
170
+ When you finish a planning update, report:
171
+
172
+ - the local task you took or the slice you decomposed;
173
+ - new or changed task ids;
174
+ - any new dependencies you added;
175
+ - which items moved into `Next`, `Soon`, or `Parking lot`.
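Reviewer note (not part of the packaged file): the ID template `W<wave>-E<epic>-S<slice>-T<task>` and the sync rule "every backlog id exists in the roadmap" lend themselves to a mechanical check. The sketch below assumes that each component is a decimal integer (as in the example `W4-E1-S2-T4`) and that ids appear verbatim in both Markdown files; the function names are hypothetical and are not taken from the package.

```python
import re
from typing import NamedTuple

# Assumption: every id component is a decimal integer, as in `W4-E1-S2-T4`.
TASK_ID = re.compile(r"\bW(\d+)-E(\d+)-S(\d+)-T(\d+)\b")


class TaskId(NamedTuple):
    wave: int
    epic: int
    slice_: int
    task: int


def parse_task_id(text: str) -> TaskId:
    """Parse one local-task id, rejecting anything that is not a full match."""
    match = TASK_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a local task id: {text!r}")
    return TaskId(*(int(part) for part in match.groups()))


def find_task_ids(markdown: str) -> set[str]:
    """Collect every local-task id mentioned anywhere in a Markdown document."""
    return {match.group(0) for match in TASK_ID.finditer(markdown)}


def backlog_ids_missing_from_roadmap(backlog_md: str, roadmap_md: str) -> list[str]:
    """Return backlog ids that the roadmap does not mention (should be empty)."""
    return sorted(find_task_ids(backlog_md) - find_task_ids(roadmap_md))
```

A non-empty result from `backlog_ids_missing_from_roadmap` would indicate a task that was added to `backlog.md` before it existed in `roadmap.md`, which the promotion rules forbid.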
@@ -0,0 +1,213 @@
1
+ ---
2
+ name: live-e2e
3
+ description: Run or prepare a manual full-flow live end-to-end scenario against a public GitHub repository with repository pinning, curated issue selection, quality checks, and full log capture.
4
+ ---
5
+
6
+ # live-e2e
7
+
8
+ ## Use when
9
+
10
+ - You need to execute or author a scenario from `docs/e2e/live-e2e-catalog.md`.
11
+ - You need to compare live provider behavior on a real repository.
12
+ - You need to prove the installed operator flow from `idea` through `qa` as a manual external audit.
13
+ - You need a **local source-checkout runbook** for manual live E2E, not just the abstract eval contract.
14
+
15
+ ## This skill vs `aidd-eval`
16
+
17
+ - Use `live-e2e` when the main question is: "How do I make a local live run work from this checkout?"
18
+ - Use `aidd-eval` when the main question is: "How do I audit artifacts, validation, grading, and failure classification across eval lanes?"
19
+ - `live-e2e` is the primary local operator playbook for live runs.
20
+ - `aidd-eval` remains the generic eval and artifact-analysis skill.
21
+
22
+ ## Read first
23
+
24
+ This skill is intended to be sufficient for a prepared local run, but these files
25
+ remain the authoritative deeper references:
26
+
27
+ 1. `docs/e2e/live-e2e-catalog.md`
28
+ 2. `docs/e2e/scenario-matrix.md`
29
+ 3. `docs/operator-handbook.md`
30
+ 4. the selected manifest in `harness/scenarios/live/`
31
+
32
+ ## What must already exist
33
+
34
+ Even if you rely only on this skill within the current project, the run still requires these
35
+ external prerequisites:
36
+
37
+ - you are in a prepared local **source checkout** of this repository;
38
+ - `uv sync --extra dev` has already completed successfully;
39
+ - the selected live manifest exists under `harness/scenarios/live/`;
40
+ - the requested runtime appears in the scenario's `runtime_targets`;
41
+ - the machine has network access to clone the pinned public target repository;
42
+ - the selected provider is already authenticated and runnable on the machine;
43
+ - the selected provider CLI is available, or you have an AIDD-compatible wrapper
44
+ command override for the chosen live runtime.
45
+
46
+ This skill does **not** provision runtime authentication, wrapper scripts, or provider setup for you.
47
+
48
+ ## Runtime-command contract
49
+
50
+ For local manual live runs, `claude-code`, `codex`, and `opencode` use native provider CLI
51
+ commands by default. You may provide a runtime-command override through
52
+ environment variables when you need a custom wrapper:
53
+
54
+ - `AIDD_EVAL_CLAUDE_CODE_COMMAND` for `claude-code`
55
+ - `AIDD_EVAL_CODEX_COMMAND` for `codex`
56
+ - `AIDD_EVAL_OPENCODE_COMMAND` for `opencode`
57
+
58
+ When set, the value must point to an **AIDD-compatible wrapper command**:
59
+
60
+ - it must be invokable from the shell on the current machine;
61
+ - it must accept the adapter flags AIDD passes for that runtime;
62
+ - it may be a wrapper around the upstream provider CLI rather than the raw provider binary;
63
+ - `aidd doctor` distinguishes provider probe readiness from execution command readiness.
64
+
65
+ There are no repo-local wrapper templates in this wave. The operator must already
66
+ have provider auth and a working provider CLI or wrapper execution surface.
67
+
68
+ ## Local preflight checklist
69
+
70
+ Before the live run, confirm all of these:
71
+
72
+ 1. `uv sync --extra dev`
73
+ 2. `uv run aidd doctor`
74
+ 3. the selected scenario is under `harness/scenarios/live/`
75
+ 4. the scenario has `automation_lane: manual`
76
+ 5. the scenario forces `stage_scope: idea -> qa`
77
+ 6. the runtime you plan to use appears in `runtime_targets`
78
+ 7. `uv run aidd eval doctor <manifest> --runtime <runtime>` reports execution readiness
79
+ 8. any wrapper env var you choose to set resolves on the machine and uses the expected auth state
80
+
81
+ Recommended local preflight:
82
+
83
+ ```bash
84
+ uv sync --extra dev
85
+ uv run aidd doctor
86
+ uv run aidd eval doctor harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime codex
87
+ ```
88
+
89
+ Optional wrapper override:
90
+
91
+ ```bash
92
+ export AIDD_EVAL_CODEX_COMMAND='<aidd-compatible codex wrapper>'
93
+ export AIDD_EVAL_OPENCODE_COMMAND='<aidd-compatible opencode wrapper>'
94
+ export AIDD_EVAL_CLAUDE_CODE_COMMAND='<aidd-compatible claude-code wrapper>'
95
+ ```
96
+
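Reviewer note (not part of the packaged file): the runtime-command contract above maps each runtime to one override variable, and the skill's procedure says a wrapper variable should be exported only when `adapter-flags` mode is intended. The following sketch illustrates that selection under stated assumptions: the function name is hypothetical, the override value is assumed to be a shell-style command string, and the native command is left as `None` because the skill does not name the provider binaries.

```python
import os
import shlex
from typing import Mapping, Optional

# Variable names are taken from the skill text; the mapping itself is illustrative.
WRAPPER_ENV_VARS = {
    "claude-code": "AIDD_EVAL_CLAUDE_CODE_COMMAND",
    "codex": "AIDD_EVAL_CODEX_COMMAND",
    "opencode": "AIDD_EVAL_OPENCODE_COMMAND",
}


def resolve_runtime_command(
    runtime: str, environ: Optional[Mapping[str, str]] = None
) -> tuple[str, Optional[list[str]]]:
    """Return (mode, argv): a wrapper override if set, otherwise native mode."""
    env = os.environ if environ is None else environ
    if runtime not in WRAPPER_ENV_VARS:
        raise ValueError(f"unknown runtime: {runtime}")
    override = env.get(WRAPPER_ENV_VARS[runtime], "").strip()
    if override:
        # Assumption: the override is a shell-style command line.
        return "adapter-flags", shlex.split(override)
    # Native mode: the provider CLI default is not specified here.
    return "native", None


print(resolve_runtime_command("codex", {"AIDD_EVAL_CODEX_COMMAND": "my-wrapper --fast"}))
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the real environment.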
97
+ ## Canonical local launch
98
+
99
+ The primary execution path for this skill is a local run from the AIDD source checkout:
100
+
101
+ ```bash
102
+ uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime codex
103
+ ```
104
+
105
+ or:
106
+
107
+ ```bash
108
+ uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime opencode
109
+ ```
110
+
111
+ or:
112
+
113
+ ```bash
114
+ uv run aidd eval run harness/scenarios/live/sqlite-utils-detect-types-header-only.yaml --runtime claude-code
115
+ ```
116
+
117
+ The GitHub `manual-live-e2e` workflow is a secondary entrypoint, not the primary flow described by this skill.
118
+
119
+ ## What the harness will do
120
+
121
+ During a successful local live run, the harness will:
122
+
123
+ 1. load the selected scenario and validate the live-lane contract;
124
+ 2. resolve and record the pinned target repository commit;
125
+ 3. prepare a clean working copy of the target repository;
126
+ 4. select the **first listed issue** from the curated issue pool;
127
+ 5. write issue-selection evidence to the eval bundle and target-repo context;
128
+ 6. seed `.aidd/` inside the target repository;
129
+ 7. write a live `aidd.example.toml` with the runtime command and execution mode for the chosen provider;
130
+ 8. build and install the AIDD artifact under test with `uv tool`;
131
+ 9. run installed `aidd` from the target repository root with explicit workflow bounds `idea -> qa`;
132
+ 10. run setup, verify, and quality commands and write the final audit artifacts.
133
+
134
+ ## Validations and blockers
135
+
136
+ The live run can be rejected or downgraded at several layers:
137
+
138
+ - manifest validation rejects non-live scenarios, non-manual live scenarios, missing `quality`, invalid `runtime_targets`, or any live scenario that is not bounded to `idea -> qa`;
139
+ - runtime admission rejects a requested runtime that is not declared in `runtime_targets`;
140
+ - stage execution stays bounded to `idea -> qa`;
141
+ - stage outputs must validate against Markdown document contracts;
142
+ - repair loops are allowed to run when validation failures are repairable;
143
+ - interview scenarios block when required answers are missing;
144
+ - repo-local `verify.commands` must pass;
145
+ - repo-local `quality.commands` must pass for a clean quality result;
146
+ - execution `pass` is impossible if any stage in scope is missing required validated artifacts.
147
+
148
+ Live execution verdicts remain:
149
+
150
+ - `pass`
151
+ - `fail`
152
+ - `blocked`
153
+ - `infra-fail`
154
+
155
+ Quality is additive:
156
+
157
+ - `pass`
158
+ - `warn`
159
+ - `fail`
160
+ - `none`
161
+
162
+ ## Output locations and success criteria
163
+
164
+ The canonical eval bundle for a local live run lives under:
165
+
166
+ - `.aidd/reports/evals/<run_id>/`
167
+
168
+ Expected live artifacts include:
169
+
170
+ - `issue-selection.json`
171
+ - `install-transcript.json`
172
+ - `runtime.log`
173
+ - `validator-report.md`
174
+ - `repair-history.md`
175
+ - `log-analysis.md`
176
+ - `grader.json`
177
+ - `verdict.md`
178
+ - `quality-report.md`
179
+ - `quality-transcript.json`
180
+
181
+ A live run is only "clean" when execution evidence exists, verification output is present, and the bundle includes `quality-report.md` plus `quality-transcript.json`.
182
+
183
+ ## First triage for common failures
184
+
185
+ - Provider executable missing: install/login to the selected provider CLI, or export `AIDD_EVAL_CLAUDE_CODE_COMMAND` / `AIDD_EVAL_CODEX_COMMAND` / `AIDD_EVAL_OPENCODE_COMMAND` for a wrapper.
186
+ - Runtime launches but immediately fails in native mode: inspect provider auth, model selection, and sandbox permissions.
187
+ - Runtime launches but immediately fails in `adapter-flags` mode: the configured command is probably not an AIDD-compatible wrapper command.
188
+ - `unsupported-runtime`: the runtime is not declared in the scenario's `runtime_targets`.
189
+ - `blocked`: inspect `questions.md` / `answers.md` expectations for interview scenarios.
190
+ - `fail` after run success: inspect `verify-transcript.json`, `quality-transcript.json`, and the stage-local validator reports.
191
+ - Missing clean execution despite zero exit codes: inspect `verdict.md` and `grader.json` for pass-guard failures caused by missing `stage-result.md` or `validator-report.md`.
192
+
193
+ ## Procedure
194
+
195
+ 1. Confirm the selected scenario is in `harness/scenarios/live/`, has `automation_lane: manual`, and declares the requested runtime in `runtime_targets`.
196
+ 2. Run the local preflight checks from this skill, including `aidd eval doctor`.
197
+ 3. Export a wrapper env var only when you intentionally want `adapter-flags` mode.
198
+ 4. Launch `uv run aidd eval run <manifest> --runtime <runtime>`.
199
+ 5. Preserve the resulting bundle and inspect `verdict.md`, `grader.json`, `quality-report.md`, and transcripts before judging the run.
200
+ 6. If the setup, provider coverage, size classification, quality recipe, or verification recipe had to change, update the scenario manifest, matrix doc, and catalog after the run as separate follow-up work.
201
+
202
+ ## Hard rules
203
+
204
+ - Never treat live E2E as a CI or release lane.
205
+ - Never assume this skill provisions runtime auth, wrappers, or provider setup.
206
+ - Never dispatch the manual GitHub workflow without provider execution readiness for the selected runtime.
207
+ - Never run a live scenario without storing the resolved repo pin.
208
+ - Never run a live scenario without storing the selected issue snapshot.
209
+ - Never treat a live scenario as canonical unless it executes `idea -> qa`.
210
+ - Never treat a live scenario as passed without install evidence and verification output.
211
+ - Never treat a live scenario as clean without `quality-report.md` and `quality-transcript.json`.
212
+ - Preserve all runtime logs.
213
+ - Keep `.aidd` rooted inside the target repository for installed live runs.
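Reviewer note (not part of the packaged file): the skill lists the artifacts expected in a live bundle and states that a run is not clean without `quality-report.md` and `quality-transcript.json`. A minimal presence check could look like the sketch below. The function names are hypothetical, and file presence is only one part of the rule: the skill also requires execution evidence and verification output, which this sketch does not inspect.

```python
from pathlib import Path
from typing import Iterable, Sequence

# File names are copied from the "Expected live artifacts" list in the skill.
EXPECTED_LIVE_ARTIFACTS = (
    "issue-selection.json",
    "install-transcript.json",
    "runtime.log",
    "validator-report.md",
    "repair-history.md",
    "log-analysis.md",
    "grader.json",
    "verdict.md",
    "quality-report.md",
    "quality-transcript.json",
)

# The two files the skill names as mandatory for a "clean" result.
CLEAN_RUN_ARTIFACTS = ("quality-report.md", "quality-transcript.json")


def missing_artifacts(
    present: Iterable[str], required: Sequence[str] = EXPECTED_LIVE_ARTIFACTS
) -> list[str]:
    """Return the required file names that are absent, in their listed order."""
    present_set = set(present)
    return [name for name in required if name not in present_set]


def missing_in_bundle(bundle_dir: Path) -> list[str]:
    """Apply the same check to a bundle directory such as .aidd/reports/evals/<run_id>/."""
    names = (entry.name for entry in bundle_dir.iterdir() if entry.is_file())
    return missing_artifacts(names)
```

An empty list from `missing_artifacts(..., CLEAN_RUN_ARTIFACTS)` is necessary but not sufficient for calling the run clean; the verdict and transcripts still need to be read.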
@@ -0,0 +1,23 @@
1
+ ---
2
+ name: project-navigation
3
+ description: Map a task to the right AIDD docs, modules, checks, and scenario assets before making changes.
4
+ ---
5
+
6
+ # project-navigation
7
+
8
+ ## Use when
9
+
10
+ - You are starting work in this repository.
11
+ - You do not know which module or document set owns the change.
12
+
13
+ ## Procedure
14
+
15
+ 1. Read `AGENTS.md` and `docs/product/user-stories.md`.
16
+ 2. Classify the task as one of: docs, contracts, core, adapters, validators, harness, evals, or CLI.
17
+ 3. Read the nearest nested `AGENTS.md` for that area.
18
+ 4. Identify the expected checks and scenario updates.
19
+ 5. Name the primary files that should change before editing.
20
+
21
+ ## Output
22
+
23
+ Produce a short work map: owning area, likely files, checks to run, and whether a scenario update is required.
@@ -0,0 +1,23 @@
+ ---
+ name: runtime-log-triage
+ description: Analyze runtime and adapter logs to identify the first decisive failure signal and classify it correctly.
+ ---
+
+ # runtime-log-triage
+
+ ## Use when
+
+ - A scenario or stage run failed.
+ - You need to separate document, model, adapter, auth, permission, timeout, or environment failures.
+
+ ## Procedure
+
+ 1. Read `runtime.log`, `events.jsonl`, and `validator-report.md`.
+ 2. Identify the earliest decisive signal.
+ 3. Separate runtime startup failures from document validation failures.
+ 4. Check whether a user question should have blocked the run.
+ 5. Write a short `log-analysis.md` that names the first cause, not just the final symptom.
+
+ ## Output
+
+ Return the likely failure class and the evidence chain that supports it.
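The triage procedure in the hunk above can be sketched as a single pass over log lines that stops at the first match. This is a minimal sketch only: the keyword patterns, the names `classify_line` and `first_decisive_signal`, and the sample log lines are hypothetical and do not come from the package. Only the failure class names (auth, permission, timeout, environment, document) are taken from the skill text.

```python
import re
from typing import Optional

# Hypothetical keyword patterns, one per failure class named in the skill.
# Real AIDD logs may use different wording; adjust before relying on this.
FAILURE_PATTERNS = [
    ("auth", re.compile(r"unauthorized|invalid api key|401", re.I)),
    ("permission", re.compile(r"permission denied|403", re.I)),
    ("timeout", re.compile(r"timed out|timeout", re.I)),
    ("environment", re.compile(r"command not found|no such file", re.I)),
    ("document", re.compile(r"validation failed|missing section", re.I)),
]


def classify_line(line: str) -> Optional[str]:
    """Return the first failure class whose pattern matches, else None."""
    for failure_class, pattern in FAILURE_PATTERNS:
        if pattern.search(line):
            return failure_class
    return None


def first_decisive_signal(lines):
    """Return (line_number, failure_class, text) for the earliest match."""
    for number, line in enumerate(lines, start=1):
        failure_class = classify_line(line)
        if failure_class is not None:
            return number, failure_class, line.strip()
    return None


# Hypothetical log excerpt: the timeout appears before the validation error,
# so the timeout is reported as the first cause.
sample_log = [
    "starting stage run",
    "adapter: request timed out after 60s",
    "validator: validation failed, missing section",
]
print(first_decisive_signal(sample_log))
```

Returning the earliest match rather than the last one mirrors step 5 of the procedure, which asks for the first cause and not the final symptom.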
@@ -0,0 +1,25 @@
+ ---
+ name: stage-contract-change
+ description: Make a safe change to a stage or document contract by updating contracts, validators, prompts, and scenarios together.
+ ---
+
+ # stage-contract-change
+
+ ## Use when
+
+ - You are changing a stage input or output document.
+ - You are changing validation rules or repair behavior.
+
+ ## Procedure
+
+ 1. Update the relevant contract doc first.
+ 2. Update validator logic or validator plan.
+ 3. Update prompt files or prompt-pack references if the runtime needs new instructions.
+ 4. Update stage-result expectations and repair behavior if needed.
+ 5. Add or update at least one smoke or eval scenario.
+
+ ## Hard rules
+
+ - Never change a stage contract in code only.
+ - Never widen a stage output implicitly.
+ - Keep Markdown as the canonical runtime-authored output form.
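The rule "never change a stage contract in code only" could be checked mechanically against a list of changed paths. The sketch below assumes a hypothetical directory layout (`docs/contracts/`, `src/validators/`, `prompt-packs/`, `scenarios/`) and hypothetical helper names; none of these are confirmed by the package. Prompt updates are left out of the required set because the procedure makes them conditional.

```python
# Hypothetical path prefixes; the real repository layout may differ.
AREA_PREFIXES = {
    "contract": "docs/contracts/",
    "validator": "src/validators/",
    "prompt": "prompt-packs/",
    "scenario": "scenarios/",
}


def touched_areas(changed_paths):
    """Map changed file paths to the set of areas they belong to."""
    areas = set()
    for path in changed_paths:
        for area, prefix in AREA_PREFIXES.items():
            if path.startswith(prefix):
                areas.add(area)
    return areas


def missing_areas(changed_paths, required=("contract", "validator", "scenario")):
    """List required areas that the change set does not touch."""
    touched = touched_areas(changed_paths)
    return [area for area in required if area not in touched]


# Hypothetical change set that updates code and a scenario but no contract doc.
change_set = ["src/validators/plan.py", "scenarios/smoke-plan.yaml"]
print(missing_areas(change_set))
```

A non-empty result signals that the change set skipped at least one of the areas the procedure expects to move together.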
@@ -0,0 +1,78 @@
+ ---
+ name: task-slicing
+ description: Turn a coarse roadmap item into reviewable local tasks with one output, one dominant touched area, and one main verification signal.
+ ---
+
+ # task-slicing
+
+ Use this skill when a roadmap task or slice still feels too vague to implement directly.
+
+ ## What "good slicing" means
+
+ A strong local task has:
+
+ - one concrete output;
+ - one dominant touched area;
+ - one main verification path;
+ - wording that starts with a verb;
+ - a scope small enough for one focused review.
+
+ ## Smells that mean "split again"
+
+ Split again if the proposed task:
+
+ - touches core plus adapter plus harness together;
+ - mixes contract design with broad rollout;
+ - produces multiple independent artifacts;
+ - needs different verification strategies at once;
+ - still contains words like `implement stage`, `finish adapter`, `wire everything`, or `support all cases`.
+
+ ## Split order
+
+ Always try this order:
+
+ 1. split into more local tasks in the same slice;
+ 2. create a new slice only if there is a different meaningful outcome;
+ 3. create a new epic only if the theme changes.
+
+ ## Recipe
+
+ 1. Name the parent outcome in one sentence.
+ 2. List the concrete outputs hidden inside it.
+ 3. Group outputs by touched area.
+ 4. Turn each group into a verb-led task.
+ 5. Check that each task has one main verification signal.
+ 6. Reorder tasks so dependencies read top to bottom.
+
+ ## Examples
+
+ Too broad:
+
+ - `Implement the Claude Code adapter.`
+
+ Better:
+
+ - `Implement Claude Code command assembly from stage brief, workspace path, and prompt-pack inputs.`
+ - `Stream raw Claude Code stdout and stderr to the operator CLI in real time.`
+ - `Persist a full runtime.log that matches the raw streamed output as closely as possible.`
+ - `Detect Claude Code question or pause events when the runtime exposes them.`
+
+ Too broad:
+
+ - `Finalize the implement stage.`
+
+ Better:
+
+ - `Define the required implement inputs, including task selection, repository state, and allowed write scope.`
+ - `Define the required implement outputs, including change summary, touched files, and verification notes.`
+ - `Define validator rules for missing diffs, unverifiable claims, and incomplete execution summaries.`
+ - `Create the implement prompt-pack scaffold with explicit edit and verification guidance.`
+
+ ## Output
+
+ When you use this skill, report:
+
+ - the parent item you decomposed;
+ - the new task ids;
+ - why the old task was too broad;
+ - the output and verification signal for each new task.
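Two of the slicing checks in the hunk above are purely textual and can be sketched as a small lint: the broad-phrase smell and the verb-led wording rule. The phrase list is taken from the skill. The verb list and the function name `slicing_smells` are hypothetical, and the verb list is only a starter set. The other smells (multiple touched areas, mixed verification strategies) need human judgment and are not covered.

```python
# Phrases the skill lists as signs that a task is still too broad.
BROAD_PHRASES = (
    "implement stage",
    "finish adapter",
    "wire everything",
    "support all cases",
)

# Hypothetical starter list of leading verbs; extend it for real use.
LEADING_VERBS = {
    "implement", "stream", "persist", "detect",
    "define", "create", "add", "update", "finalize",
}


def slicing_smells(task: str):
    """Return a list of human-readable smells for one task description."""
    smells = []
    lowered = task.lower()
    for phrase in BROAD_PHRASES:
        if phrase in lowered:
            smells.append(f"contains broad phrase: {phrase}")
    words = lowered.split()
    first_word = words[0] if words else ""
    if first_word not in LEADING_VERBS:
        smells.append("does not start with a known verb")
    return smells


# The two calls reuse example task wordings from the skill text.
print(slicing_smells("Finalize the implement stage."))
print(slicing_smells("Stream raw Claude Code stdout and stderr to the operator CLI in real time."))
```

An empty list means only that the wording passes these two checks, not that the task is well sliced.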