@mmerterden/multi-agent-pipeline 8.6.2 → 10.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (927)
  1. package/CHANGELOG.md +544 -2484
  2. package/README.md +99 -101
  3. package/docs/features.md +1 -1
  4. package/index.js +8 -10
  5. package/install/_adapters.mjs +5 -1
  6. package/install/_common.mjs +63 -0
  7. package/install/claude.mjs +14 -14
  8. package/install/copilot.mjs +14 -8
  9. package/install/index.mjs +85 -19
  10. package/install/templates/claude-hooks.json +18 -0
  11. package/install/templates/copilot-instructions.md +3 -3
  12. package/package.json +21 -6
  13. package/pipeline/adapters/_base.mjs +366 -14
  14. package/pipeline/adapters/antigravity.mjs +140 -0
  15. package/pipeline/adapters/codex.mjs +159 -0
  16. package/pipeline/adapters/copilot-chat-orchestration.mjs +148 -0
  17. package/pipeline/adapters/copilot-chat.mjs +34 -68
  18. package/pipeline/adapters/cursor-orchestration.mjs +152 -0
  19. package/pipeline/adapters/cursor.mjs +49 -90
  20. package/pipeline/agents/android-architect.md +5 -5
  21. package/pipeline/agents/backend-architect.md +4 -4
  22. package/pipeline/agents/code-reviewer.md +10 -10
  23. package/pipeline/agents/dev-critic.md +17 -17
  24. package/pipeline/agents/explorer.md +3 -3
  25. package/pipeline/agents/ios-architect.md +4 -4
  26. package/pipeline/agents/security-auditor.md +12 -12
  27. package/pipeline/agents/task-clarifier.md +18 -18
  28. package/pipeline/claude-md-template.md +3 -3
  29. package/pipeline/commands/archive-guard.md +3 -3
  30. package/pipeline/commands/figma-to-swiftui.md +10 -10
  31. package/pipeline/commands/multi-agent/_account-picker.md +12 -8
  32. package/pipeline/commands/multi-agent/_dev-context.md +15 -15
  33. package/pipeline/commands/multi-agent/_input-parser.md +4 -4
  34. package/pipeline/commands/multi-agent/_repo-picker.md +9 -9
  35. package/pipeline/commands/multi-agent/analysis-resolve.md +129 -0
  36. package/pipeline/commands/multi-agent/analysis.md +667 -0
  37. package/pipeline/commands/multi-agent/autopilot.md +22 -22
  38. package/pipeline/commands/multi-agent/build-optimize.md +77 -0
  39. package/pipeline/commands/multi-agent/channels.md +96 -96
  40. package/pipeline/commands/multi-agent/delete.md +19 -17
  41. package/pipeline/commands/multi-agent/dev-autopilot.md +23 -23
  42. package/pipeline/commands/multi-agent/dev-local-autopilot.md +23 -23
  43. package/pipeline/commands/multi-agent/dev-local.md +25 -22
  44. package/pipeline/commands/multi-agent/dev.md +49 -49
  45. package/pipeline/commands/multi-agent/diff-explain.md +4 -4
  46. package/pipeline/commands/multi-agent/garbage-collect.md +58 -0
  47. package/pipeline/commands/multi-agent/help.md +75 -66
  48. package/pipeline/commands/multi-agent/issue.md +3 -3
  49. package/pipeline/commands/multi-agent/jira.md +12 -12
  50. package/pipeline/commands/multi-agent/kill.md +6 -6
  51. package/pipeline/commands/multi-agent/language.md +12 -12
  52. package/pipeline/commands/multi-agent/local-autopilot.md +34 -34
  53. package/pipeline/commands/multi-agent/local.md +24 -25
  54. package/pipeline/commands/multi-agent/log.md +6 -6
  55. package/pipeline/commands/multi-agent/manual-test.md +3 -3
  56. package/pipeline/commands/multi-agent/prune-logs.md +60 -0
  57. package/pipeline/commands/multi-agent/purge.md +10 -7
  58. package/pipeline/commands/multi-agent/refactor.md +9 -9
  59. package/pipeline/commands/multi-agent/refs/analysis-template.md +1062 -0
  60. package/pipeline/commands/multi-agent/refs/android-guide.md +15 -13
  61. package/pipeline/commands/multi-agent/refs/audit-guide.md +20 -20
  62. package/pipeline/commands/multi-agent/refs/backend-guide.md +9 -9
  63. package/pipeline/commands/multi-agent/refs/channels/confluence.md +17 -17
  64. package/pipeline/commands/multi-agent/refs/channels/issue-comment.md +30 -30
  65. package/pipeline/commands/multi-agent/refs/channels/jira.md +15 -15
  66. package/pipeline/commands/multi-agent/refs/channels/pr-review-actions.md +19 -17
  67. package/pipeline/commands/multi-agent/refs/channels/pr.md +22 -22
  68. package/pipeline/commands/multi-agent/refs/channels/wiki.md +19 -19
  69. package/pipeline/commands/multi-agent/refs/component-dispatch.md +11 -11
  70. package/pipeline/commands/multi-agent/refs/conventions-defaults.md +179 -0
  71. package/pipeline/commands/multi-agent/refs/cross-cli-contract.md +35 -33
  72. package/pipeline/commands/multi-agent/refs/features/dev-critic.md +5 -5
  73. package/pipeline/commands/multi-agent/refs/features/external-context-injection.md +6 -6
  74. package/pipeline/commands/multi-agent/refs/features/model-fallback.md +73 -0
  75. package/pipeline/commands/multi-agent/refs/features/plan-todos.md +1 -1
  76. package/pipeline/commands/multi-agent/refs/features/prior-fix-detection.md +4 -4
  77. package/pipeline/commands/multi-agent/refs/features/repo-map.md +6 -6
  78. package/pipeline/commands/multi-agent/refs/features/shadow-git.md +2 -2
  79. package/pipeline/commands/multi-agent/refs/frontend-guide.md +16 -16
  80. package/pipeline/commands/multi-agent/refs/issue-jira-triad.md +18 -18
  81. package/pipeline/commands/multi-agent/refs/keychain.md +18 -8
  82. package/pipeline/commands/multi-agent/refs/knowledge.md +9 -9
  83. package/pipeline/commands/multi-agent/refs/multi-repo-integration-build.md +19 -19
  84. package/pipeline/commands/multi-agent/refs/phases/log-format.md +29 -9
  85. package/pipeline/commands/multi-agent/refs/phases/modes.md +33 -33
  86. package/pipeline/commands/multi-agent/refs/phases/operations.md +11 -11
  87. package/pipeline/commands/multi-agent/refs/phases/phase-0-init.md +93 -57
  88. package/pipeline/commands/multi-agent/refs/phases/phase-1-analysis.md +59 -28
  89. package/pipeline/commands/multi-agent/refs/phases/phase-2-planning.md +115 -63
  90. package/pipeline/commands/multi-agent/refs/phases/phase-3-dev.md +99 -36
  91. package/pipeline/commands/multi-agent/refs/phases/phase-4-review.md +160 -63
  92. package/pipeline/commands/multi-agent/refs/phases/phase-5-test.md +33 -18
  93. package/pipeline/commands/multi-agent/refs/phases/phase-6-commit.md +45 -43
  94. package/pipeline/commands/multi-agent/refs/phases/phase-7-report.md +54 -28
  95. package/pipeline/commands/multi-agent/refs/phases.md +17 -17
  96. package/pipeline/commands/multi-agent/refs/picker-contract.md +65 -0
  97. package/pipeline/commands/multi-agent/refs/progress-contract.md +37 -21
  98. package/pipeline/commands/multi-agent/refs/rules.md +83 -25
  99. package/pipeline/commands/multi-agent/refs/swiftui-guide.md +32 -30
  100. package/pipeline/commands/multi-agent/refs/tracker-contract.md +54 -30
  101. package/pipeline/commands/multi-agent/refs/wiki-capture.md +36 -33
  102. package/pipeline/commands/multi-agent/resume.md +9 -9
  103. package/pipeline/commands/multi-agent/review.md +24 -24
  104. package/pipeline/commands/multi-agent/scan.md +10 -10
  105. package/pipeline/commands/multi-agent/search.md +8 -8
  106. package/pipeline/commands/multi-agent/setup.md +111 -84
  107. package/pipeline/commands/multi-agent/stack.md +5 -5
  108. package/pipeline/commands/multi-agent/status.md +5 -5
  109. package/pipeline/commands/multi-agent/sync.md +123 -111
  110. package/pipeline/commands/multi-agent/test.md +6 -6
  111. package/pipeline/commands/multi-agent/update.md +1 -1
  112. package/pipeline/commands/multi-agent.md +66 -60
  113. package/pipeline/commands/sim-test.md +14 -14
  114. package/pipeline/eval/golden-tasks/01-ios-bugfix-darkmode/expected/phase-1-analysis.json +1 -1
  115. package/pipeline/eval/golden-tasks/01-ios-bugfix-darkmode/expected/phase-4-review.json +2 -2
  116. package/pipeline/eval/golden-tasks/01-ios-bugfix-darkmode/expected/phase-4-triage.json +2 -2
  117. package/pipeline/eval/golden-tasks/01-ios-bugfix-darkmode/metadata.json +1 -1
  118. package/pipeline/eval/golden-tasks/02-android-feature-compose/expected/phase-1-analysis.json +2 -2
  119. package/pipeline/eval/golden-tasks/02-android-feature-compose/expected/phase-4-review.json +3 -3
  120. package/pipeline/eval/golden-tasks/02-android-feature-compose/expected/phase-4-triage.json +4 -4
  121. package/pipeline/eval/golden-tasks/02-android-feature-compose/metadata.json +1 -1
  122. package/pipeline/eval/golden-tasks/02-android-feature-compose/task.json +1 -1
  123. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/expected/phase-1-analysis.json +29 -0
  124. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/expected/phase-2-plan.json +42 -0
  125. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/expected/phase-4-review.json +20 -0
  126. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/expected/phase-4-triage.json +15 -0
  127. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/metadata.json +14 -0
  128. package/pipeline/eval/golden-tasks/03-backend-python-ratelimit/task.json +12 -0
  129. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/expected/phase-1-analysis.json +29 -0
  130. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/expected/phase-2-plan.json +40 -0
  131. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/expected/phase-4-review.json +20 -0
  132. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/expected/phase-4-triage.json +15 -0
  133. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/metadata.json +14 -0
  134. package/pipeline/eval/golden-tasks/04-frontend-next-hydration/task.json +12 -0
  135. package/pipeline/eval/golden-tasks/05-ios-security-keychain/expected/phase-1-analysis.json +29 -0
  136. package/pipeline/eval/golden-tasks/05-ios-security-keychain/expected/phase-2-plan.json +42 -0
  137. package/pipeline/eval/golden-tasks/05-ios-security-keychain/expected/phase-4-review.json +28 -0
  138. package/pipeline/eval/golden-tasks/05-ios-security-keychain/expected/phase-4-triage.json +27 -0
  139. package/pipeline/eval/golden-tasks/05-ios-security-keychain/metadata.json +14 -0
  140. package/pipeline/eval/golden-tasks/05-ios-security-keychain/task.json +12 -0
  141. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/expected/phase-1-analysis.json +29 -0
  142. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/expected/phase-2-plan.json +41 -0
  143. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/expected/phase-4-review.json +12 -0
  144. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/expected/phase-4-triage.json +6 -0
  145. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/metadata.json +14 -0
  146. package/pipeline/eval/golden-tasks/06-android-refactor-usecase/task.json +12 -0
  147. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/expected/phase-1-analysis.json +29 -0
  148. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/expected/phase-2-plan.json +42 -0
  149. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/expected/phase-4-review.json +28 -0
  150. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/expected/phase-4-triage.json +27 -0
  151. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/metadata.json +14 -0
  152. package/pipeline/eval/golden-tasks/07-backend-node-idempotency/task.json +12 -0
  153. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/expected/phase-1-analysis.json +25 -0
  154. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/expected/phase-2-plan.json +31 -0
  155. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/expected/phase-4-review.json +12 -0
  156. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/expected/phase-4-triage.json +18 -0
  157. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/metadata.json +14 -0
  158. package/pipeline/eval/golden-tasks/08-ios-auth-consensus-unverified/task.json +12 -0
  159. package/pipeline/eval/golden-tasks/README.md +14 -14
  160. package/pipeline/eval/intent-cases.json +40 -0
  161. package/pipeline/eval/run-metrics-fixture.json +46 -0
  162. package/pipeline/eval/triage/01-empty-findings/notes.md +1 -1
  163. package/pipeline/eval/triage/02-real-blocker/notes.md +2 -2
  164. package/pipeline/eval/triage/03-out-of-scope-defer/notes.md +1 -1
  165. package/pipeline/eval/triage/04-false-positive-reject/notes.md +1 -1
  166. package/pipeline/eval/triage/05-mixed-classification/notes.md +2 -2
  167. package/pipeline/eval/triage/06-severity-mismatch/notes.md +2 -2
  168. package/pipeline/eval/triage/07-duplicate-reviewers/notes.md +1 -1
  169. package/pipeline/eval/triage/08-style-misclassified/notes.md +1 -1
  170. package/pipeline/eval/triage/09-cascading-finding/notes.md +2 -2
  171. package/pipeline/eval/triage/10-deferred-crossref/notes.md +2 -2
  172. package/pipeline/eval/triage/11-vercel-token-leak-blocker/expected.json +3 -3
  173. package/pipeline/eval/triage/11-vercel-token-leak-blocker/input.json +2 -2
  174. package/pipeline/eval/triage/11-vercel-token-leak-blocker/notes.md +5 -5
  175. package/pipeline/eval/triage/README.md +4 -4
  176. package/pipeline/lib/account-resolver.sh +3 -3
  177. package/pipeline/lib/ask-choice.sh +98 -0
  178. package/pipeline/lib/channels-multi-repo.sh +3 -3
  179. package/pipeline/lib/classify-intent.sh +110 -0
  180. package/pipeline/lib/context-link-extractor.sh +3 -3
  181. package/pipeline/lib/credential-store-resolver.sh +3 -3
  182. package/pipeline/lib/credential-store.sh +9 -5
  183. package/pipeline/lib/extract-conventions.sh +1034 -0
  184. package/pipeline/lib/fetch-confluence.sh +3 -3
  185. package/pipeline/lib/fetch-crashlytics.sh +5 -5
  186. package/pipeline/lib/fetch-fortify.sh +5 -21
  187. package/pipeline/lib/fetch-swagger.sh +5 -5
  188. package/pipeline/lib/figma-screenshot.sh +536 -0
  189. package/pipeline/lib/issue-fetcher.sh +46 -20
  190. package/pipeline/lib/md2confluence-v3.py +1076 -0
  191. package/pipeline/lib/multi-repo-pipeline.sh +13 -22
  192. package/pipeline/lib/plan-todos.sh +7 -7
  193. package/pipeline/lib/post-pr-review.sh +53 -21
  194. package/pipeline/lib/repo-cache.sh +5 -5
  195. package/pipeline/lib/review-watch.sh +17 -13
  196. package/pipeline/lib/shadow-git.sh +7 -7
  197. package/pipeline/lib/submodule-detector.sh +3 -3
  198. package/pipeline/lib/vercel-deploy.sh +28 -15
  199. package/pipeline/preferences-template.json +21 -4
  200. package/pipeline/rules/app-store-guidelines.md +2 -2
  201. package/pipeline/rules/code-style.md +6 -6
  202. package/pipeline/rules/figma-pipeline.md +100 -2
  203. package/pipeline/rules/kotlin-android.md +8 -8
  204. package/pipeline/rules/security.md +4 -4
  205. package/pipeline/rules/tdd.md +1 -1
  206. package/pipeline/rules/testing.md +5 -5
  207. package/pipeline/schemas/agent-state.schema.json +55 -20
  208. package/pipeline/schemas/analysis-output.schema.json +7 -2
  209. package/pipeline/schemas/analysis-spec.schema.json +484 -0
  210. package/pipeline/schemas/clarify-output.schema.json +5 -5
  211. package/pipeline/schemas/conventions-output.schema.json +70 -0
  212. package/pipeline/schemas/dev-critic-output.schema.json +2 -2
  213. package/pipeline/schemas/diff-risk.schema.json +3 -3
  214. package/pipeline/schemas/figma-project-config.schema.json +3 -3
  215. package/pipeline/schemas/learnings-ledger.schema.json +39 -0
  216. package/pipeline/schemas/migrations/README.md +2 -2
  217. package/pipeline/schemas/migrations/figma-config-1.0.0-to-2.0.0.mjs +5 -5
  218. package/pipeline/schemas/migrations/prefs-2.0.0-to-2.1.0.mjs +3 -3
  219. package/pipeline/schemas/migrations/prefs-2.1.0-to-2.2.0.mjs +4 -4
  220. package/pipeline/schemas/migrations/prefs-2.2.0-to-2.3.0.mjs +5 -5
  221. package/pipeline/schemas/migrations/state-2.0.0-to-2.1.0.mjs +3 -3
  222. package/pipeline/schemas/plan-todos.schema.json +4 -4
  223. package/pipeline/schemas/planning-output.schema.json +3 -3
  224. package/pipeline/schemas/prefs.schema.json +95 -11
  225. package/pipeline/schemas/reviewer-output.schema.json +7 -3
  226. package/pipeline/schemas/test-gap.schema.json +1 -1
  227. package/pipeline/schemas/token-budget.json +8 -8
  228. package/pipeline/schemas/triage-corpus.schema.json +1 -1
  229. package/pipeline/schemas/triage-output.schema.json +44 -6
  230. package/pipeline/scripts/README.md +64 -64
  231. package/pipeline/scripts/aggregate-metrics.mjs +55 -16
  232. package/pipeline/scripts/audit-log-rotate.sh +3 -3
  233. package/pipeline/scripts/audit-log.sh +20 -7
  234. package/pipeline/scripts/benchmark-phase-0.sh +6 -6
  235. package/pipeline/scripts/build-skills-index.mjs +15 -15
  236. package/pipeline/scripts/check-md-links.mjs +59 -0
  237. package/pipeline/scripts/classify-plan-safety.mjs +24 -18
  238. package/pipeline/scripts/cost-budget-check.mjs +160 -0
  239. package/pipeline/scripts/cost-table.json +23 -13
  240. package/pipeline/scripts/diff-explain.mjs +12 -12
  241. package/pipeline/scripts/diff-risk-score.mjs +18 -17
  242. package/pipeline/scripts/eval-golden-tasks-live.mjs +13 -10
  243. package/pipeline/scripts/eval-golden-tasks.mjs +3 -14
  244. package/pipeline/scripts/eval-intent.mjs +103 -0
  245. package/pipeline/scripts/eval-triage.mjs +3 -3
  246. package/pipeline/scripts/evidence-gate.mjs +155 -0
  247. package/pipeline/scripts/fixtures/install-layout.tsv +9 -9
  248. package/pipeline/scripts/gc-tmp.sh +102 -0
  249. package/pipeline/scripts/gen-mode-dispatch.mjs +27 -21
  250. package/pipeline/scripts/gen-skills-index.mjs +6 -6
  251. package/pipeline/scripts/github-ssh-setup.sh +1 -1
  252. package/pipeline/scripts/keychain-save.sh +1 -1
  253. package/pipeline/scripts/keychain.py +6 -6
  254. package/pipeline/scripts/learnings-ledger.mjs +284 -0
  255. package/pipeline/scripts/lint-skills.mjs +80 -0
  256. package/pipeline/scripts/log-metric.sh +18 -9
  257. package/pipeline/scripts/match-skills.mjs +13 -8
  258. package/pipeline/scripts/memory-load.sh +3 -3
  259. package/pipeline/scripts/memory-save.sh +5 -5
  260. package/pipeline/scripts/migrate-prefs.mjs +17 -17
  261. package/pipeline/scripts/migrate-state.mjs +12 -12
  262. package/pipeline/scripts/output-quality-check.sh +7 -7
  263. package/pipeline/scripts/phase-banner.sh +5 -5
  264. package/pipeline/scripts/phase-tracker.sh +90 -53
  265. package/pipeline/scripts/pre-commit-check.sh +45 -5
  266. package/pipeline/scripts/pre-push-check.sh +7 -7
  267. package/pipeline/scripts/prune-logs.sh +118 -0
  268. package/pipeline/scripts/render-agent-log-cost.sh +55 -18
  269. package/pipeline/scripts/render-cost-summary.sh +9 -9
  270. package/pipeline/scripts/render-work-summary.sh +4 -4
  271. package/pipeline/scripts/repo-map.mjs +9 -9
  272. package/pipeline/scripts/run-aggregator.mjs +7 -6
  273. package/pipeline/scripts/run-metrics.mjs +129 -0
  274. package/pipeline/scripts/run-smokes.mjs +76 -0
  275. package/pipeline/scripts/scan-skills.sh +11 -11
  276. package/pipeline/scripts/search-logs.sh +8 -8
  277. package/pipeline/scripts/sign-skills.sh +2 -2
  278. package/pipeline/scripts/smoke-adapters.sh +79 -10
  279. package/pipeline/scripts/smoke-add-detail.sh +5 -5
  280. package/pipeline/scripts/smoke-agent-log-cost.sh +85 -6
  281. package/pipeline/scripts/smoke-agent-model-routing.sh +3 -3
  282. package/pipeline/scripts/smoke-ask-choice.sh +42 -0
  283. package/pipeline/scripts/smoke-bitbucket-contract.sh +19 -3
  284. package/pipeline/scripts/smoke-changelog-version.sh +47 -0
  285. package/pipeline/scripts/smoke-channels-flow.sh +1 -1
  286. package/pipeline/scripts/smoke-ci-workflows.sh +5 -5
  287. package/pipeline/scripts/smoke-clarify.sh +3 -3
  288. package/pipeline/scripts/smoke-commands-skills-parity.sh +4 -4
  289. package/pipeline/scripts/smoke-community-gates.sh +75 -0
  290. package/pipeline/scripts/smoke-compliance-skills.sh +5 -5
  291. package/pipeline/scripts/smoke-cost-budget.sh +70 -0
  292. package/pipeline/scripts/smoke-cost-summary.sh +4 -4
  293. package/pipeline/scripts/smoke-cross-cli-behavior.sh +50 -9
  294. package/pipeline/scripts/smoke-cross-phase-cohesion.sh +5 -5
  295. package/pipeline/scripts/smoke-delete-flow.sh +5 -5
  296. package/pipeline/scripts/smoke-dev-critic.sh +2 -2
  297. package/pipeline/scripts/smoke-diff-explain.sh +22 -3
  298. package/pipeline/scripts/smoke-diff-risk.sh +1 -1
  299. package/pipeline/scripts/smoke-dynamic-skill-loading.sh +1 -1
  300. package/pipeline/scripts/smoke-eval-live.sh +4 -4
  301. package/pipeline/scripts/smoke-evidence-gate.sh +93 -0
  302. package/pipeline/scripts/smoke-existing-discovery-gate.sh +1 -1
  303. package/pipeline/scripts/smoke-extract-conventions.sh +163 -0
  304. package/pipeline/scripts/smoke-figma-android-parity.sh +1 -1
  305. package/pipeline/scripts/smoke-figma-credential-store.sh +3 -3
  306. package/pipeline/scripts/smoke-figma-cross-cli-inventory.sh +12 -12
  307. package/pipeline/scripts/smoke-figma-dispatch.sh +5 -5
  308. package/pipeline/scripts/smoke-figma-sync.sh +1 -1
  309. package/pipeline/scripts/smoke-gate-hooks.sh +56 -0
  310. package/pipeline/scripts/smoke-gc-tmp.sh +84 -0
  311. package/pipeline/scripts/smoke-identity-isolation.sh +7 -7
  312. package/pipeline/scripts/smoke-install-layout.sh +10 -10
  313. package/pipeline/scripts/smoke-intent-guard.sh +86 -0
  314. package/pipeline/scripts/smoke-issue-comment-template.sh +3 -3
  315. package/pipeline/scripts/smoke-issue-jira-triad.sh +1 -1
  316. package/pipeline/scripts/smoke-keychain.sh +6 -6
  317. package/pipeline/scripts/smoke-language-axis.sh +2 -2
  318. package/pipeline/scripts/smoke-learnings-ledger.sh +86 -0
  319. package/pipeline/scripts/smoke-lib-scripts.sh +2 -2
  320. package/pipeline/scripts/smoke-mcp-gate.sh +68 -0
  321. package/pipeline/scripts/smoke-md-links.sh +8 -0
  322. package/pipeline/scripts/smoke-md2confluence.sh +126 -0
  323. package/pipeline/scripts/smoke-metrics-cache-ratio.sh +72 -0
  324. package/pipeline/scripts/smoke-migrate-state.sh +10 -10
  325. package/pipeline/scripts/smoke-mode-dispatch-drift.sh +7 -4
  326. package/pipeline/scripts/smoke-model-fallback.sh +80 -0
  327. package/pipeline/scripts/smoke-multi-repo-integration.sh +3 -3
  328. package/pipeline/scripts/smoke-multi-repo-worktree.sh +1 -1
  329. package/pipeline/scripts/smoke-no-mcp-in-dev-phases.sh +115 -0
  330. package/pipeline/scripts/smoke-no-token-prompt.sh +31 -15
  331. package/pipeline/scripts/smoke-pat-audit.sh +26 -5
  332. package/pipeline/scripts/smoke-per-repo-memory.sh +1 -1
  333. package/pipeline/scripts/smoke-phase-0-multi-repo.sh +1 -1
  334. package/pipeline/scripts/smoke-phase-6-multi.sh +2 -2
  335. package/pipeline/scripts/smoke-phase-banner.sh +1 -1
  336. package/pipeline/scripts/smoke-phase-tracker.sh +1 -1
  337. package/pipeline/scripts/smoke-phase0-bridge-contract.sh +4 -4
  338. package/pipeline/scripts/smoke-phase4-triage.sh +94 -7
  339. package/pipeline/scripts/smoke-plan-approval-gate.sh +3 -3
  340. package/pipeline/scripts/smoke-plan-safety.sh +1 -1
  341. package/pipeline/scripts/smoke-plan-todos.sh +2 -2
  342. package/pipeline/scripts/smoke-pr-review-actions.sh +2 -2
  343. package/pipeline/scripts/smoke-pre-commit.sh +34 -2
  344. package/pipeline/scripts/smoke-pref-migration.sh +1 -1
  345. package/pipeline/scripts/smoke-prefs-language.sh +5 -5
  346. package/pipeline/scripts/smoke-progress-contract.sh +3 -3
  347. package/pipeline/scripts/smoke-prune-logs.sh +87 -0
  348. package/pipeline/scripts/smoke-push-retry.sh +1 -1
  349. package/pipeline/scripts/smoke-readme-counts.sh +1 -1
  350. package/pipeline/scripts/smoke-repo-map.sh +9 -9
  351. package/pipeline/scripts/smoke-review-watch.sh +12 -0
  352. package/pipeline/scripts/smoke-run-aggregator.sh +7 -7
  353. package/pipeline/scripts/smoke-run-metrics.sh +50 -0
  354. package/pipeline/scripts/smoke-schema-validation.sh +18 -11
  355. package/pipeline/scripts/smoke-search.sh +5 -5
  356. package/pipeline/scripts/smoke-shared-runtime.sh +108 -0
  357. package/pipeline/scripts/smoke-skill-authoring.sh +13 -13
  358. package/pipeline/scripts/smoke-skill-language.sh +4 -4
  359. package/pipeline/scripts/smoke-skill-manifest.sh +2 -2
  360. package/pipeline/scripts/smoke-skill-scan.sh +2 -2
  361. package/pipeline/scripts/smoke-stack-swap.sh +2 -2
  362. package/pipeline/scripts/smoke-subagent-validators.sh +8 -5
  363. package/pipeline/scripts/smoke-sync-adapters.sh +1 -1
  364. package/pipeline/scripts/smoke-sync-delegation.sh +7 -7
  365. package/pipeline/scripts/smoke-sync-parity.sh +1 -1
  366. package/pipeline/scripts/smoke-tasklist-ordering.sh +7 -7
  367. package/pipeline/scripts/smoke-telemetry.sh +1 -1
  368. package/pipeline/scripts/smoke-test-gap.sh +5 -5
  369. package/pipeline/scripts/smoke-token-budget.sh +1 -1
  370. package/pipeline/scripts/smoke-tracker-contract.sh +6 -6
  371. package/pipeline/scripts/smoke-tracker-tokens-invocation.sh +9 -1
  372. package/pipeline/scripts/smoke-triage-memory.sh +2 -2
  373. package/pipeline/scripts/smoke-url-enrichment.sh +2 -2
  374. package/pipeline/scripts/smoke-validator-contradiction.sh +1 -1
  375. package/pipeline/scripts/smoke-validator-gates.sh +164 -0
  376. package/pipeline/scripts/smoke-vercel-deploy-redact.sh +11 -11
  377. package/pipeline/scripts/smoke-wiki-integration.sh +2 -2
  378. package/pipeline/scripts/smoke-work-summary.sh +3 -3
  379. package/pipeline/scripts/smoke-worktree-path-convention.sh +4 -4
  380. package/pipeline/scripts/smoke-write-state.sh +2 -2
  381. package/pipeline/scripts/stack-swap.sh +3 -3
  382. package/pipeline/scripts/sync-adapters.mjs +37 -10
  383. package/pipeline/scripts/sync-parity-check.sh +6 -6
  384. package/pipeline/scripts/test-gap-scan.mjs +11 -13
  385. package/pipeline/scripts/token-budget-report.mjs +4 -4
  386. package/pipeline/scripts/triage-memory.mjs +6 -6
  387. package/pipeline/scripts/uninstall.mjs +42 -4
  388. package/pipeline/scripts/update-issue-progress.sh +2 -2
  389. package/pipeline/scripts/validate-analysis.mjs +19 -21
  390. package/pipeline/scripts/validate-diff-risk.mjs +4 -4
  391. package/pipeline/scripts/validate-planning.mjs +3 -3
  392. package/pipeline/scripts/validate-reviewer.mjs +4 -4
  393. package/pipeline/scripts/validate-schemas.mjs +4 -4
  394. package/pipeline/scripts/validate-test-gap.mjs +4 -4
  395. package/pipeline/scripts/validate-triage.mjs +68 -9
  396. package/pipeline/scripts/verify-skills.sh +7 -7
  397. package/pipeline/scripts/write-state.mjs +49 -11
  398. package/pipeline/skills/.skill-manifest.json +245 -149
  399. package/pipeline/skills/.skills-index.json +236 -47
  400. package/pipeline/skills/figma-android/README.md +5 -5
  401. package/pipeline/skills/figma-android/figma-component-code-connect/SKILL.md +3 -3
  402. package/pipeline/skills/figma-android/figma-component-implement/SKILL.md +8 -8
  403. package/pipeline/skills/figma-android/figma-component-test/SKILL.md +4 -4
  404. package/pipeline/skills/figma-android/figma-component-wiki/SKILL.md +5 -5
  405. package/pipeline/skills/figma-android/figma-to-component/SKILL.md +14 -14
  406. package/pipeline/skills/figma-common/README.md +29 -29
  407. package/pipeline/skills/figma-common/figma-cli-iterate/SKILL.md +20 -15
  408. package/pipeline/skills/figma-common/figma-cli-iterate-mend/SKILL.md +35 -30
  409. package/pipeline/skills/figma-common/figma-cli-lean-iterate/SKILL.md +35 -30
  410. package/pipeline/skills/figma-common/figma-cli-skip/SKILL.md +20 -20
  411. package/pipeline/skills/figma-common/figma-commit/COMMON_REBASE.md +32 -32
  412. package/pipeline/skills/figma-common/figma-commit/REVIEW.md +9 -9
  413. package/pipeline/skills/figma-common/figma-commit/SKILL.md +25 -20
  414. package/pipeline/skills/figma-common/figma-component-confluence-sync/SKILL.md +11 -6
  415. package/pipeline/skills/figma-common/figma-component-start/SKILL.md +30 -25
  416. package/pipeline/skills/figma-common/figma-component-status-update/SKILL.md +9 -4
  417. package/pipeline/skills/figma-common/figma-fix/SKILL.md +27 -22
  418. package/pipeline/skills/figma-common/figma-form-integration/SKILL.md +38 -38
  419. package/pipeline/skills/figma-common/figma-issue/SKILL.md +39 -34
  420. package/pipeline/skills/figma-common/figma-iterate/SKILL.md +20 -15
  421. package/pipeline/skills/figma-common/figma-iteration-commit/SKILL.md +44 -39
  422. package/pipeline/skills/figma-common/figma-mend/SKILL.md +6 -6
  423. package/pipeline/skills/figma-common/figma-price-integration/SKILL.md +30 -30
  424. package/pipeline/skills/figma-common/figma-remote-mcp-auth/SKILL.md +1 -1
  425. package/pipeline/skills/figma-common/figma-review/SKILL.md +31 -26
  426. package/pipeline/skills/figma-common/figma-setup/SKILL.md +11 -11
  427. package/pipeline/skills/figma-common/figma-setup/scripts/fetch-mcp-token.py +5 -5
  428. package/pipeline/skills/figma-common/figma-skip/SKILL.md +6 -6
  429. package/pipeline/skills/figma-common/figma-ui-patterns/SKILL.md +12 -12
  430. package/pipeline/skills/figma-common/figma-utility/SKILL.md +4 -4
  431. package/pipeline/skills/figma-common/figma-utility/scripts/figma-utility.py +1 -1
  432. package/pipeline/skills/figma-common/figma-validate/SKILL.md +48 -48
  433. package/pipeline/skills/figma-common/performance-iteration-commit-all/SKILL.md +42 -37
  434. package/pipeline/skills/figma-common/performance-review-next/SKILL.md +23 -18
  435. package/pipeline/skills/figma-common/performance-start/SKILL.md +52 -47
  436. package/pipeline/skills/figma-common/performance-swiftui/SKILL.md +68 -68
  437. package/pipeline/skills/figma-common/performance-tour/SKILL.md +42 -37
  438. package/pipeline/skills/figma-ios/REVIEW_CHECKLIST.md +16 -16
  439. package/pipeline/skills/figma-ios/figma-component-code-connect/SKILL.md +15 -15
  440. package/pipeline/skills/figma-ios/figma-component-implement/SKILL.md +9 -9
  441. package/pipeline/skills/figma-ios/figma-component-test/SKILL.md +15 -15
  442. package/pipeline/skills/figma-ios/figma-component-wiki/SKILL.md +18 -18
  443. package/pipeline/skills/figma-ios/figma-to-component/SKILL.md +38 -38
  444. package/pipeline/skills/figma-ios/figma-to-component/halt-return-protocol.md +2 -2
  445. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-0-init.md +12 -12
  446. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-1-gathering.md +5 -5
  447. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-1.5-existing-discovery.md +19 -19
  448. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-2-orchestrator.md +25 -25
  449. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-2a-testing-identifiers.md +7 -7
  450. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-2b-localization.md +6 -6
  451. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-2c-accessibility.md +38 -38
  452. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-2d-analytics.md +3 -3
  453. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3-orchestrator.md +29 -29
  454. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3a-location.md +6 -6
  455. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3b-tokens.md +3 -3
  456. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3c-nested.md +12 -12
  457. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3d-patterns.md +57 -57
  458. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3e-assets.md +5 -5
  459. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3f-utilities.md +6 -6
  460. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3g-property-coverage.md +10 -10
  461. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-3h-variant-config.md +16 -16
  462. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4-orchestrator.md +23 -23
  463. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4a-configuration.md +26 -26
  464. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4b-view.md +43 -43
  465. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4c-documentation.md +17 -17
  466. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4d-preview.md +19 -19
  467. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-4e-modifiers.md +15 -15
  468. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-5-orchestrator.md +39 -39
  469. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-5a-viewinspector.md +7 -7
  470. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-5b-snapshot.md +29 -29
  471. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-5c-unit.md +9 -9
  472. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-6-code-connect.md +31 -31
  473. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-7-wiki.md +5 -5
  474. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-7a-confluence-generate.md +18 -18
  475. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-7a-wiki-generate.md +16 -16
  476. package/pipeline/skills/figma-ios/figma-to-component/phases/phase-8-cleanup.md +2 -2
  477. package/pipeline/skills/figma-ios/figma-to-component/reference/accessibility.md +1 -1
  478. package/pipeline/skills/figma-ios/figma-to-component/reference/code-connect.md +49 -49
  479. package/pipeline/skills/figma-ios/figma-to-component/reference/figma-to-swiftui-effects.md +8 -8
  480. package/pipeline/skills/figma-ios/figma-to-component/reference/halt-return-protocol.md +2 -2
  481. package/pipeline/skills/figma-ios/figma-to-component/reference/macros.md +9 -9
  482. package/pipeline/skills/figma-ios/figma-to-component/reference/missing-tokens.md +4 -4
  483. package/pipeline/skills/figma-ios/figma-to-component/reference/orchestrator-discipline.md +10 -10
  484. package/pipeline/skills/figma-ios/figma-to-component/reference/remote-mcp-script.md +5 -5
  485. package/pipeline/skills/figma-ios/figma-to-component/reference/rest-api-script.md +11 -11
  486. package/pipeline/skills/figma-ios/figma-to-component/reference/scripts-inventory.md +14 -14
  487. package/pipeline/skills/figma-ios/figma-to-component/reference/snapshot-testing.md +2 -2
  488. package/pipeline/skills/figma-ios/figma-to-component/reference/subcomponent-graph.md +4 -4
  489. package/pipeline/skills/figma-ios/figma-to-component/reference/testing-identifiers-naming.md +6 -6
  490. package/pipeline/skills/figma-ios/figma-to-component/reference/tools.md +9 -9
  491. package/pipeline/skills/figma-ios/figma-to-component/reference/viewinspector.md +1 -1
  492. package/pipeline/skills/figma-ios/figma-to-component/reference/wiki-to-confluence-mapping.md +1 -1
  493. package/pipeline/skills/figma-ios/figma-to-component/scripts/apply-author-login-map.py +5 -5
  494. package/pipeline/skills/figma-ios/figma-to-component/scripts/backfill-status.py +18 -18
  495. package/pipeline/skills/figma-ios/figma-to-component/scripts/build-author-registry.py +4 -4
  496. package/pipeline/skills/figma-ios/figma-to-component/scripts/bulk-sync-issues.py +4 -4
  497. package/pipeline/skills/figma-ios/figma-to-component/scripts/code-connect-data-gather.py +1 -1
  498. package/pipeline/skills/figma-ios/figma-to-component/scripts/code-connect-publish.sh +3 -3
  499. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-component-status-upload.py +18 -18
  500. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-component-status.py +4 -4
  501. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-data-gather.py +5 -5
  502. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-page-ids.example.json +9 -0
  503. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-publish.py +3 -3
  504. package/pipeline/skills/figma-ios/figma-to-component/scripts/figma-subcomponent-graph.py +1 -1
  505. package/pipeline/skills/figma-ios/figma-to-component/scripts/figma-update.py +5 -5
  506. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/issue_sync_propagate.py +1 -1
  507. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/registry_writer.py +4 -4
  508. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/test_figma_update.py +1 -1
  509. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/test_registry_writer.py +3 -3
  510. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/test_skill_figma_issue.py +1 -1
  511. package/pipeline/skills/figma-ios/figma-to-component/scripts/lib/test_update_issue_gh.py +1 -1
  512. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase1-gather.py +12 -12
  513. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase2-finalize.py +3 -3
  514. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase3-scripts.py +26 -26
  515. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase4-finalize.py +4 -4
  516. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase5-finalize.py +4 -4
  517. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase6-finalize.py +5 -5
  518. package/pipeline/skills/figma-ios/figma-to-component/scripts/phase7-finalize.py +4 -4
  519. package/pipeline/skills/figma-ios/figma-to-component/scripts/register-icons-codeconnect.py +4 -4
  520. package/pipeline/skills/figma-ios/figma-to-component/scripts/remote-mcp-fetch.py +5 -5
  521. package/pipeline/skills/figma-ios/figma-to-component/scripts/resolve-author-logins.py +2 -2
  522. package/pipeline/skills/figma-ios/figma-to-component/scripts/run-uicomponents-tests.sh +1 -1
  523. package/pipeline/skills/figma-ios/figma-to-component/scripts/sidebar-generator.py +5 -5
  524. package/pipeline/skills/figma-ios/figma-to-component/scripts/update-issue-from-registry.py +41 -41
  525. package/pipeline/skills/figma-ios/figma-to-component/scripts/validate-phase4.sh +8 -8
  526. package/pipeline/skills/figma-ios/figma-to-component/scripts/validate-phase6.sh +7 -7
  527. package/pipeline/skills/shared/README.md +62 -41
  528. package/pipeline/skills/shared/core/apple-archive-compliance/SKILL.md +39 -39
  529. package/pipeline/skills/shared/core/google-play-compliance/SKILL.md +44 -44
  530. package/pipeline/skills/shared/core/multi-agent/SKILL.md +182 -176
  531. package/pipeline/skills/shared/core/multi-agent-analysis/SKILL.md +55 -0
  532. package/pipeline/skills/shared/core/multi-agent-analysis-resolve/SKILL.md +48 -0
  533. package/pipeline/skills/shared/core/multi-agent-autopilot/SKILL.md +16 -16
  534. package/pipeline/skills/shared/core/multi-agent-build-optimize/SKILL.md +48 -0
  535. package/pipeline/skills/shared/core/multi-agent-channels/SKILL.md +40 -40
  536. package/pipeline/skills/shared/core/multi-agent-delete/SKILL.md +33 -30
  537. package/pipeline/skills/shared/core/multi-agent-dev/SKILL.md +26 -26
  538. package/pipeline/skills/shared/core/multi-agent-dev-autopilot/SKILL.md +22 -22
  539. package/pipeline/skills/shared/core/multi-agent-dev-local/SKILL.md +6 -6
  540. package/pipeline/skills/shared/core/multi-agent-dev-local-autopilot/SKILL.md +12 -12
  541. package/pipeline/skills/shared/core/multi-agent-diff-explain/SKILL.md +20 -20
  542. package/pipeline/skills/shared/core/multi-agent-garbage-collect/SKILL.md +61 -0
  543. package/pipeline/skills/shared/core/multi-agent-help/SKILL.md +22 -22
  544. package/pipeline/skills/shared/core/multi-agent-issue/SKILL.md +15 -15
  545. package/pipeline/skills/shared/core/multi-agent-jira/SKILL.md +12 -12
  546. package/pipeline/skills/shared/core/multi-agent-kill/SKILL.md +14 -14
  547. package/pipeline/skills/shared/core/multi-agent-language/SKILL.md +12 -12
  548. package/pipeline/skills/shared/core/multi-agent-local/SKILL.md +10 -10
  549. package/pipeline/skills/shared/core/multi-agent-local-autopilot/SKILL.md +18 -18
  550. package/pipeline/skills/shared/core/multi-agent-log/SKILL.md +9 -9
  551. package/pipeline/skills/shared/core/multi-agent-manual-test/SKILL.md +20 -20
  552. package/pipeline/skills/shared/core/multi-agent-prune-logs/SKILL.md +63 -0
  553. package/pipeline/skills/shared/core/multi-agent-purge/SKILL.md +16 -13
  554. package/pipeline/skills/shared/core/multi-agent-refactor/SKILL.md +110 -110
  555. package/pipeline/skills/shared/core/multi-agent-resume/SKILL.md +13 -13
  556. package/pipeline/skills/shared/core/multi-agent-review/SKILL.md +22 -22
  557. package/pipeline/skills/shared/core/multi-agent-scan/SKILL.md +18 -18
  558. package/pipeline/skills/shared/core/multi-agent-search/SKILL.md +13 -13
  559. package/pipeline/skills/shared/core/multi-agent-setup/SKILL.md +33 -30
  560. package/pipeline/skills/shared/core/multi-agent-stack/SKILL.md +14 -14
  561. package/pipeline/skills/shared/core/multi-agent-status/SKILL.md +9 -9
  562. package/pipeline/skills/shared/core/multi-agent-sync/SKILL.md +79 -79
  563. package/pipeline/skills/shared/core/multi-agent-test/SKILL.md +5 -5
  564. package/pipeline/skills/shared/core/multi-agent-update/SKILL.md +10 -10
  565. package/pipeline/skills/shared/external/NOTICE-swift-ios-skills.md +41 -0
  566. package/pipeline/skills/shared/external/NOTICE-xcode-build-skills.md +53 -0
  567. package/pipeline/skills/shared/external/agentflow/SKILL.md +9 -9
  568. package/pipeline/skills/shared/external/alarmkit/SKILL.md +113 -52
  569. package/pipeline/skills/shared/external/alarmkit/evals/evals.json +41 -0
  570. package/pipeline/skills/shared/external/alarmkit/references/alarmkit-patterns.md +23 -16
  571. package/pipeline/skills/shared/external/app-clips/SKILL.md +85 -354
  572. package/pipeline/skills/shared/external/app-clips/evals/evals.json +50 -0
  573. package/pipeline/skills/shared/external/app-clips/references/data-handoff-notifications-location.md +135 -0
  574. package/pipeline/skills/shared/external/app-clips/references/routing-and-experiences.md +125 -0
  575. package/pipeline/skills/shared/external/app-clips/references/size-capabilities-and-promotion.md +113 -0
  576. package/pipeline/skills/shared/external/app-intents/SKILL.md +152 -59
  577. package/pipeline/skills/shared/external/app-intents/evals/evals.json +47 -0
  578. package/pipeline/skills/shared/external/app-intents/references/appintents-advanced.md +161 -118
  579. package/pipeline/skills/shared/external/app-store-optimization/SKILL.md +289 -392
  580. package/pipeline/skills/shared/external/app-store-optimization/evals/evals.json +46 -0
  581. package/pipeline/skills/shared/external/app-store-optimization/references/keyword-research-methodology.md +174 -0
  582. package/pipeline/skills/shared/external/app-store-optimization/references/product-page-variants.md +191 -0
  583. package/pipeline/skills/shared/external/app-store-review/SKILL.md +57 -107
  584. package/pipeline/skills/shared/external/app-store-review/evals/evals.json +44 -0
  585. package/pipeline/skills/shared/external/app-store-review/references/privacy-manifest.md +35 -12
  586. package/pipeline/skills/shared/external/app-store-review/references/review-checklists.md +28 -26
  587. package/pipeline/skills/shared/external/apple-on-device-ai/SKILL.md +53 -62
  588. package/pipeline/skills/shared/external/apple-on-device-ai/evals/evals.json +47 -0
  589. package/pipeline/skills/shared/external/apple-on-device-ai/references/coreml-conversion.md +7 -1
  590. package/pipeline/skills/shared/external/apple-on-device-ai/references/coreml-optimization.md +4 -1
  591. package/pipeline/skills/shared/external/apple-on-device-ai/references/foundation-models.md +32 -12
  592. package/pipeline/skills/shared/external/apple-on-device-ai/references/mlx-swift.md +34 -30
  593. package/pipeline/skills/shared/external/authentication/SKILL.md +134 -138
  594. package/pipeline/skills/shared/external/authentication/evals/evals.json +48 -0
  595. package/pipeline/skills/shared/external/authentication/references/keychain-biometric.md +56 -29
  596. package/pipeline/skills/shared/external/authentication/references/passkeys.md +183 -0
  597. package/pipeline/skills/shared/external/avkit/SKILL.md +497 -0
  598. package/pipeline/skills/shared/external/avkit/evals/evals.json +55 -0
  599. package/pipeline/skills/shared/external/avkit/references/avkit-patterns.md +668 -0
  600. package/pipeline/skills/shared/external/background-processing/SKILL.md +29 -29
  601. package/pipeline/skills/shared/external/background-processing/evals/evals.json +44 -0
  602. package/pipeline/skills/shared/external/background-processing/references/background-task-patterns.md +44 -19
  603. package/pipeline/skills/shared/external/callkit-voip/SKILL.md +136 -99
  604. package/pipeline/skills/shared/external/callkit-voip/evals/evals.json +47 -0
  605. package/pipeline/skills/shared/external/callkit-voip/references/callkit-patterns.md +27 -8
  606. package/pipeline/skills/shared/external/ci-cd-pipelines/SKILL.md +7 -6
  607. package/pipeline/skills/shared/external/clean-code/SKILL.md +2 -2
  608. package/pipeline/skills/shared/external/cloudkit-sync/SKILL.md +63 -56
  609. package/pipeline/skills/shared/external/cloudkit-sync/evals/evals.json +47 -0
  610. package/pipeline/skills/shared/external/cloudkit-sync/references/cloudkit-patterns.md +7 -4
  611. package/pipeline/skills/shared/external/contacts-framework/SKILL.md +31 -11
  612. package/pipeline/skills/shared/external/contacts-framework/evals/evals.json +41 -0
  613. package/pipeline/skills/shared/external/contacts-framework/references/contacts-patterns.md +51 -51
  614. package/pipeline/skills/shared/external/core-bluetooth/SKILL.md +70 -65
  615. package/pipeline/skills/shared/external/core-bluetooth/evals/evals.json +44 -0
  616. package/pipeline/skills/shared/external/core-bluetooth/references/ble-patterns.md +25 -1
  617. package/pipeline/skills/shared/external/core-data/SKILL.md +496 -0
  618. package/pipeline/skills/shared/external/core-data/evals/evals.json +44 -0
  619. package/pipeline/skills/shared/external/core-motion/SKILL.md +47 -14
  620. package/pipeline/skills/shared/external/core-motion/evals/evals.json +49 -0
  621. package/pipeline/skills/shared/external/core-motion/references/motion-patterns.md +47 -16
  622. package/pipeline/skills/shared/external/core-nfc/SKILL.md +43 -54
  623. package/pipeline/skills/shared/external/core-nfc/evals/evals.json +49 -0
  624. package/pipeline/skills/shared/external/core-nfc/references/nfc-patterns.md +32 -2
  625. package/pipeline/skills/shared/external/coreml/SKILL.md +89 -48
  626. package/pipeline/skills/shared/external/coreml/evals/evals.json +44 -0
  627. package/pipeline/skills/shared/external/coreml/references/coreml-swift-integration.md +82 -37
  628. package/pipeline/skills/shared/external/cryptokit/SKILL.md +493 -0
  629. package/pipeline/skills/shared/external/cryptokit/evals/evals.json +44 -0
  630. package/pipeline/skills/shared/external/cryptokit/references/cryptokit-patterns.md +602 -0
  631. package/pipeline/skills/shared/external/css-modern/SKILL.md +3 -2
  632. package/pipeline/skills/shared/external/database-patterns/SKILL.md +6 -5
  633. package/pipeline/skills/shared/external/debugging-instruments/SKILL.md +77 -47
  634. package/pipeline/skills/shared/external/debugging-instruments/evals/evals.json +47 -0
  635. package/pipeline/skills/shared/external/debugging-instruments/references/instruments-guide.md +42 -34
  636. package/pipeline/skills/shared/external/debugging-instruments/references/lldb-patterns.md +2 -2
  637. package/pipeline/skills/shared/external/device-integrity/SKILL.md +136 -176
  638. package/pipeline/skills/shared/external/device-integrity/evals/evals.json +45 -0
  639. package/pipeline/skills/shared/external/device-integrity/references/device-integrity-patterns.md +240 -0
  640. package/pipeline/skills/shared/external/energykit/SKILL.md +73 -34
  641. package/pipeline/skills/shared/external/energykit/evals/evals.json +45 -0
  642. package/pipeline/skills/shared/external/energykit/references/energykit-patterns.md +80 -38
  643. package/pipeline/skills/shared/external/eventkit-calendar/SKILL.md +67 -53
  644. package/pipeline/skills/shared/external/eventkit-calendar/evals/evals.json +44 -0
  645. package/pipeline/skills/shared/external/eventkit-calendar/references/eventkit-patterns.md +53 -3
  646. package/pipeline/skills/shared/external/healthkit/SKILL.md +57 -124
  647. package/pipeline/skills/shared/external/healthkit/evals/evals.json +46 -0
  648. package/pipeline/skills/shared/external/healthkit/references/healthkit-patterns.md +82 -1
  649. package/pipeline/skills/shared/external/homekit-matter/SKILL.md +43 -41
  650. package/pipeline/skills/shared/external/homekit-matter/evals/evals.json +45 -0
  651. package/pipeline/skills/shared/external/homekit-matter/references/matter-commissioning.md +13 -8
  652. package/pipeline/skills/shared/external/html-semantic/SKILL.md +5 -4
  653. package/pipeline/skills/shared/external/humanizer/SKILL.md +4 -4
  654. package/pipeline/skills/shared/external/ios-accessibility/SKILL.md +174 -18
  655. package/pipeline/skills/shared/external/ios-accessibility/evals/evals.json +49 -0
  656. package/pipeline/skills/shared/external/ios-accessibility/references/a11y-patterns.md +262 -4
  657. package/pipeline/skills/shared/external/ios-accessibility/references/media-accessibility.md +117 -0
  658. package/pipeline/skills/shared/external/ios-accessibility/references/nutrition-labels.md +141 -0
  659. package/pipeline/skills/shared/external/ios-localization/SKILL.md +67 -14
  660. package/pipeline/skills/shared/external/ios-localization/evals/evals.json +49 -0
  661. package/pipeline/skills/shared/external/ios-localization/references/formatstyle-locale.md +20 -3
  662. package/pipeline/skills/shared/external/ios-localization/references/string-catalogs.md +131 -22
  663. package/pipeline/skills/shared/external/ios-networking/SKILL.md +69 -22
  664. package/pipeline/skills/shared/external/ios-networking/evals/evals.json +50 -0
  665. package/pipeline/skills/shared/external/ios-networking/references/background-websocket.md +28 -16
  666. package/pipeline/skills/shared/external/ios-networking/references/file-storage-patterns.md +354 -0
  667. package/pipeline/skills/shared/external/ios-networking/references/network-framework.md +69 -44
  668. package/pipeline/skills/shared/external/ios-networking/references/urlsession-patterns.md +35 -69
  669. package/pipeline/skills/shared/external/ios-security/references/file-storage-patterns.md +8 -8
  670. package/pipeline/skills/shared/external/ios-simulator/SKILL.md +485 -0
  671. package/pipeline/skills/shared/external/ios-simulator/evals/evals.json +44 -0
  672. package/pipeline/skills/shared/external/ios-simulator/references/simctl-commands.md +316 -0
  673. package/pipeline/skills/shared/external/live-activities/SKILL.md +120 -131
  674. package/pipeline/skills/shared/external/live-activities/evals/evals.json +44 -0
  675. package/pipeline/skills/shared/external/live-activities/references/{live-activity-patterns.md → activitykit-patterns.md} +148 -63
  676. package/pipeline/skills/shared/external/mapkit-location/SKILL.md +40 -21
  677. package/pipeline/skills/shared/external/mapkit-location/evals/evals.json +47 -0
  678. package/pipeline/skills/shared/external/mapkit-location/references/{corelocation-patterns.md → mapkit-corelocation-patterns.md} +88 -41
  679. package/pipeline/skills/shared/external/mapkit-location/references/mapkit-patterns.md +27 -24
  680. package/pipeline/skills/shared/external/metrickit-diagnostics/SKILL.md +129 -172
  681. package/pipeline/skills/shared/external/metrickit-diagnostics/evals/evals.json +46 -0
  682. package/pipeline/skills/shared/external/metrickit-diagnostics/references/metrickit-patterns.md +180 -0
  683. package/pipeline/skills/shared/external/musickit-audio/SKILL.md +45 -18
  684. package/pipeline/skills/shared/external/musickit-audio/evals/evals.json +44 -0
  685. package/pipeline/skills/shared/external/musickit-audio/references/musickit-patterns.md +26 -6
  686. package/pipeline/skills/shared/external/natural-language/SKILL.md +48 -18
  687. package/pipeline/skills/shared/external/natural-language/evals/evals.json +47 -0
  688. package/pipeline/skills/shared/external/natural-language/references/translation-patterns.md +20 -7
  689. package/pipeline/skills/shared/external/nextjs-app-router/SKILL.md +4 -3
  690. package/pipeline/skills/shared/external/passkit-wallet/SKILL.md +156 -66
  691. package/pipeline/skills/shared/external/passkit-wallet/evals/evals.json +51 -0
  692. package/pipeline/skills/shared/external/passkit-wallet/references/wallet-passes.md +69 -19
  693. package/pipeline/skills/shared/external/pdfkit/SKILL.md +499 -0
  694. package/pipeline/skills/shared/external/pdfkit/evals/evals.json +42 -0
  695. package/pipeline/skills/shared/external/pdfkit/references/pdfkit-patterns.md +844 -0
  696. package/pipeline/skills/shared/external/pencilkit-drawing/SKILL.md +122 -28
  697. package/pipeline/skills/shared/external/pencilkit-drawing/evals/evals.json +44 -0
  698. package/pipeline/skills/shared/external/pencilkit-drawing/references/pencilkit-patterns.md +49 -18
  699. package/pipeline/skills/shared/external/permissionkit/SKILL.md +100 -51
  700. package/pipeline/skills/shared/external/permissionkit/evals/evals.json +47 -0
  701. package/pipeline/skills/shared/external/permissionkit/references/permissionkit-patterns.md +48 -8
  702. package/pipeline/skills/shared/external/photos-camera-media/SKILL.md +13 -15
  703. package/pipeline/skills/shared/external/photos-camera-media/references/camera-capture.md +4 -4
  704. package/pipeline/skills/shared/external/photos-camera-media/references/image-loading-caching.md +2 -2
  705. package/pipeline/skills/shared/external/photos-camera-media/references/{photospicker-patterns.md → photokit-patterns.md} +3 -3
  706. package/pipeline/skills/shared/external/push-notifications/SKILL.md +45 -48
  707. package/pipeline/skills/shared/external/push-notifications/evals/evals.json +46 -0
  708. package/pipeline/skills/shared/external/push-notifications/references/notification-patterns.md +22 -33
  709. package/pipeline/skills/shared/external/push-notifications/references/rich-notifications.md +56 -37
  710. package/pipeline/skills/shared/external/python-patterns/SKILL.md +4 -3
  711. package/pipeline/skills/shared/external/react-best-practices/SKILL.md +1 -0
  712. package/pipeline/skills/shared/external/realitykit-ar/SKILL.md +74 -53
  713. package/pipeline/skills/shared/external/realitykit-ar/evals/evals.json +47 -0
  714. package/pipeline/skills/shared/external/realitykit-ar/references/realitykit-patterns.md +10 -10
  715. package/pipeline/skills/shared/external/rest-api-design/SKILL.md +21 -20
  716. package/pipeline/skills/shared/external/shareplay-activities/SKILL.md +81 -64
  717. package/pipeline/skills/shared/external/shareplay-activities/evals/evals.json +47 -0
  718. package/pipeline/skills/shared/external/shareplay-activities/references/shareplay-patterns.md +48 -9
  719. package/pipeline/skills/shared/external/speech-recognition/SKILL.md +118 -104
  720. package/pipeline/skills/shared/external/speech-recognition/evals/evals.json +49 -0
  721. package/pipeline/skills/shared/external/speech-recognition/references/speechanalyzer-patterns.md +171 -0
  722. package/pipeline/skills/shared/external/spm-build-analysis/SKILL.md +93 -0
  723. package/pipeline/skills/shared/external/spm-build-analysis/references/build-optimization-sources.md +155 -0
  724. package/pipeline/skills/shared/external/spm-build-analysis/references/recommendation-format.md +85 -0
  725. package/pipeline/skills/shared/external/spm-build-analysis/references/spm-analysis-checks.md +105 -0
  726. package/pipeline/skills/shared/external/spm-build-analysis/scripts/check_spm_pins.py +118 -0
  727. package/pipeline/skills/shared/external/storekit/SKILL.md +110 -44
  728. package/pipeline/skills/shared/external/storekit/evals/evals.json +44 -0
  729. package/pipeline/skills/shared/external/storekit/references/app-review-guidelines.md +94 -43
  730. package/pipeline/skills/shared/external/storekit/references/storekit-advanced.md +82 -33
  731. package/pipeline/skills/shared/external/swift-api-design-guidelines/SKILL.md +449 -0
  732. package/pipeline/skills/shared/external/swift-api-design-guidelines/evals/evals.json +50 -0
  733. package/pipeline/skills/shared/external/swift-api-design-guidelines/references/argument-labels-and-parameters.md +164 -0
  734. package/pipeline/skills/shared/external/swift-api-design-guidelines/references/conventions-and-special-rules.md +219 -0
  735. package/pipeline/skills/shared/external/swift-api-design-guidelines/references/naming-and-clarity.md +184 -0
  736. package/pipeline/skills/shared/external/swift-api-design-guidelines/references/side-effects-and-mutating-pairs.md +158 -0
  737. package/pipeline/skills/shared/external/swift-architecture/SKILL.md +499 -0
  738. package/pipeline/skills/shared/external/swift-architecture/evals/evals.json +45 -0
  739. package/pipeline/skills/shared/external/swift-charts/SKILL.md +52 -40
  740. package/pipeline/skills/shared/external/swift-charts/evals/evals.json +47 -0
  741. package/pipeline/skills/shared/external/swift-charts/references/charts-patterns.md +92 -11
  742. package/pipeline/skills/shared/external/swift-codable/SKILL.md +43 -16
  743. package/pipeline/skills/shared/external/swift-codable/evals/evals.json +43 -0
  744. package/pipeline/skills/shared/external/swift-concurrency/SKILL.md +50 -30
  745. package/pipeline/skills/shared/external/swift-concurrency/evals/evals.json +44 -0
  746. package/pipeline/skills/shared/external/swift-concurrency/references/approachable-concurrency.md +11 -4
  747. package/pipeline/skills/shared/external/swift-concurrency/references/async-algorithms.md +113 -0
  748. package/pipeline/skills/shared/external/swift-concurrency/references/bridging-interop.md +150 -0
  749. package/pipeline/skills/shared/external/swift-concurrency/references/{swift-6-2-concurrency.md → concurrency-patterns.md} +22 -11
  750. package/pipeline/skills/shared/external/swift-concurrency/references/diagnostics.md +52 -0
  751. package/pipeline/skills/shared/external/swift-concurrency/references/swiftui-concurrency.md +2 -2
  752. package/pipeline/skills/shared/external/swift-concurrency/references/synchronization-primitives.md +21 -15
  753. package/pipeline/skills/shared/external/swift-concurrency-expert/SKILL.md +3 -3
  754. package/pipeline/skills/shared/external/swift-concurrency-pro/SKILL.md +2 -2
  755. package/pipeline/skills/shared/external/swift-concurrency-pro/references/actors.md +3 -3
  756. package/pipeline/skills/shared/external/swift-concurrency-pro/references/async-streams.md +1 -1
  757. package/pipeline/skills/shared/external/swift-concurrency-pro/references/bridging.md +3 -3
  758. package/pipeline/skills/shared/external/swift-concurrency-pro/references/bug-patterns.md +3 -3
  759. package/pipeline/skills/shared/external/swift-concurrency-pro/references/cancellation.md +8 -8
  760. package/pipeline/skills/shared/external/swift-concurrency-pro/references/diagnostics.md +1 -1
  761. package/pipeline/skills/shared/external/swift-concurrency-pro/references/hotspots.md +2 -2
  762. package/pipeline/skills/shared/external/swift-concurrency-pro/references/interop.md +4 -4
  763. package/pipeline/skills/shared/external/swift-concurrency-pro/references/new-features.md +1 -1
  764. package/pipeline/skills/shared/external/swift-concurrency-pro/references/structured.md +2 -2
  765. package/pipeline/skills/shared/external/swift-concurrency-pro/references/testing.md +2 -2
  766. package/pipeline/skills/shared/external/swift-concurrency-pro/references/unstructured.md +3 -3
  767. package/pipeline/skills/shared/external/swift-formatstyle/SKILL.md +339 -0
  768. package/pipeline/skills/shared/external/swift-language/SKILL.md +33 -34
  769. package/pipeline/skills/shared/external/swift-language/evals/evals.json +47 -0
  770. package/pipeline/skills/shared/external/swift-language/references/swift-attributes-interop.md +97 -0
  771. package/pipeline/skills/shared/external/swift-language/references/swift-patterns-extended.md +19 -6
  772. package/pipeline/skills/shared/external/swift-security/SKILL.md +195 -0
  773. package/pipeline/skills/shared/external/swift-security/evals/evals.json +48 -0
  774. package/pipeline/skills/shared/external/swift-security/references/biometric-authentication.md +595 -0
  775. package/pipeline/skills/shared/external/swift-security/references/certificate-trust.md +611 -0
  776. package/pipeline/skills/shared/external/swift-security/references/common-anti-patterns.md +708 -0
  777. package/pipeline/skills/shared/external/swift-security/references/compliance-owasp-mapping.md +573 -0
  778. package/pipeline/skills/shared/external/swift-security/references/credential-storage-patterns.md +752 -0
  779. package/pipeline/skills/shared/external/swift-security/references/cryptokit-public-key.md +538 -0
  780. package/pipeline/skills/shared/external/swift-security/references/cryptokit-symmetric.md +530 -0
  781. package/pipeline/skills/shared/external/swift-security/references/keychain-access-control.md +543 -0
  782. package/pipeline/skills/shared/external/swift-security/references/keychain-fundamentals.md +620 -0
  783. package/pipeline/skills/shared/external/swift-security/references/keychain-item-classes.md +515 -0
  784. package/pipeline/skills/shared/external/swift-security/references/keychain-sharing.md +496 -0
  785. package/pipeline/skills/shared/external/swift-security/references/migration-legacy-stores.md +747 -0
  786. package/pipeline/skills/shared/external/swift-security/references/secure-enclave.md +566 -0
  787. package/pipeline/skills/shared/external/swift-security/references/testing-security-code.md +813 -0
  788. package/pipeline/skills/shared/external/swift-testing/SKILL.md +97 -297
  789. package/pipeline/skills/shared/external/swift-testing/evals/evals.json +44 -0
  790. package/pipeline/skills/shared/external/swift-testing/references/testing-advanced.md +123 -0
  791. package/pipeline/skills/shared/external/swift-testing/references/testing-patterns.md +162 -34
  792. package/pipeline/skills/shared/external/swift-testing-pro/SKILL.md +2 -2
  793. package/pipeline/skills/shared/external/swift-testing-pro/references/async-tests.md +3 -3
  794. package/pipeline/skills/shared/external/swift-testing-pro/references/core-rules.md +2 -2
  795. package/pipeline/skills/shared/external/swift-testing-pro/references/migrating-from-xctest.md +5 -5
  796. package/pipeline/skills/shared/external/swift-testing-pro/references/new-features.md +3 -3
  797. package/pipeline/skills/shared/external/swift-testing-pro/references/writing-better-tests.md +5 -5
  798. package/pipeline/skills/shared/external/swiftdata/SKILL.md +44 -23
  799. package/pipeline/skills/shared/external/swiftdata/evals/evals.json +47 -0
  800. package/pipeline/skills/shared/external/swiftdata/references/core-data-coexistence.md +3 -3
  801. package/pipeline/skills/shared/external/swiftdata/references/indexing.md +75 -0
  802. package/pipeline/skills/shared/external/swiftdata/references/predicate-pitfalls.md +54 -0
  803. package/pipeline/skills/shared/external/swiftdata/references/swiftdata-advanced.md +14 -10
  804. package/pipeline/skills/shared/external/swiftdata/references/swiftdata-queries.md +5 -5
  805. package/pipeline/skills/shared/external/swiftdata-pro/SKILL.md +2 -2
  806. package/pipeline/skills/shared/external/swiftdata-pro/references/class-inheritance.md +2 -2
  807. package/pipeline/skills/shared/external/swiftdata-pro/references/cloudkit.md +1 -1
  808. package/pipeline/skills/shared/external/swiftdata-pro/references/core-rules.md +6 -6
  809. package/pipeline/skills/shared/external/swiftlint/SKILL.md +337 -0
  810. package/pipeline/skills/shared/external/swiftlint/references/adoption-and-configuration.md +297 -0
  811. package/pipeline/skills/shared/external/swiftlint/references/custom-rules-and-analyze.md +170 -0
  812. package/pipeline/skills/shared/external/swiftlint/references/plugins-run-scripts-and-integrations.md +307 -0
  813. package/pipeline/skills/shared/external/swiftlint/references/rule-reference.md +35 -0
  814. package/pipeline/skills/shared/external/swiftlint/references/rules-suppressions-and-baselines.md +306 -0
  815. package/pipeline/skills/shared/external/swiftui-animation/SKILL.md +56 -65
  816. package/pipeline/skills/shared/external/swiftui-animation/references/animation-advanced.md +48 -44
  817. package/pipeline/skills/shared/external/swiftui-animation/references/core-animation-bridge.md +6 -6
  818. package/pipeline/skills/shared/external/swiftui-expert-skill/references/charts-accessibility.md +13 -13
  819. package/pipeline/skills/shared/external/swiftui-expert-skill/references/charts.md +3 -3
  820. package/pipeline/skills/shared/external/swiftui-expert-skill/references/image-optimization.md +1 -1
  821. package/pipeline/skills/shared/external/swiftui-expert-skill/references/latest-apis.md +4 -4
  822. package/pipeline/skills/shared/external/swiftui-expert-skill/references/layout-best-practices.md +2 -2
  823. package/pipeline/skills/shared/external/swiftui-expert-skill/references/list-patterns.md +1 -1
  824. package/pipeline/skills/shared/external/swiftui-expert-skill/references/macos-scenes.md +16 -16
  825. package/pipeline/skills/shared/external/swiftui-expert-skill/references/macos-views.md +11 -11
  826. package/pipeline/skills/shared/external/swiftui-expert-skill/references/macos-window-styling.md +7 -7
  827. package/pipeline/skills/shared/external/swiftui-expert-skill/references/state-management.md +5 -5
  828. package/pipeline/skills/shared/external/swiftui-expert-skill/references/view-structure.md +6 -6
  829. package/pipeline/skills/shared/external/swiftui-gestures/SKILL.md +38 -16
  830. package/pipeline/skills/shared/external/swiftui-gestures/references/gesture-patterns.md +13 -3
  831. package/pipeline/skills/shared/external/swiftui-layout-components/SKILL.md +32 -28
  832. package/pipeline/skills/shared/external/swiftui-layout-components/references/form.md +1 -1
  833. package/pipeline/skills/shared/external/swiftui-layout-components/references/grids.md +202 -41
  834. package/pipeline/skills/shared/external/swiftui-layout-components/references/list.md +16 -25
  835. package/pipeline/skills/shared/external/swiftui-layout-components/references/scrollview.md +71 -26
  836. package/pipeline/skills/shared/external/swiftui-liquid-glass/SKILL.md +284 -65
  837. package/pipeline/skills/shared/external/swiftui-liquid-glass/references/liquid-glass.md +387 -0
  838. package/pipeline/skills/shared/external/swiftui-navigation/SKILL.md +10 -10
  839. package/pipeline/skills/shared/external/swiftui-navigation/references/deeplinks.md +15 -3
  840. package/pipeline/skills/shared/external/swiftui-navigation/references/navigationstack.md +2 -2
  841. package/pipeline/skills/shared/external/swiftui-navigation/references/tabview.md +1 -1
  842. package/pipeline/skills/shared/external/swiftui-patterns/SKILL.md +51 -25
  843. package/pipeline/skills/shared/external/swiftui-patterns/references/architecture-patterns.md +78 -6
  844. package/pipeline/skills/shared/external/swiftui-patterns/references/deprecated-migration.md +161 -16
  845. package/pipeline/skills/shared/external/swiftui-patterns/references/design-polish.md +85 -27
  846. package/pipeline/skills/shared/external/swiftui-patterns/references/platform-and-sharing.md +37 -33
  847. package/pipeline/skills/shared/external/swiftui-performance/SKILL.md +39 -51
  848. package/pipeline/skills/shared/external/swiftui-performance/references/demystify-swiftui-performance-wwdc23.md +204 -30
  849. package/pipeline/skills/shared/external/swiftui-performance/references/optimizing-swiftui-performance-instruments.md +226 -21
  850. package/pipeline/skills/shared/external/swiftui-performance/references/understanding-hangs-in-your-app.md +220 -20
  851. package/pipeline/skills/shared/external/swiftui-performance/references/understanding-improving-swiftui-performance.md +159 -34
  852. package/pipeline/skills/shared/external/swiftui-performance/references/wwdc-session-sources.md +27 -0
  853. package/pipeline/skills/shared/external/swiftui-pro/SKILL.md +2 -2
  854. package/pipeline/skills/shared/external/swiftui-pro/references/accessibility.md +4 -4
  855. package/pipeline/skills/shared/external/swiftui-pro/references/api.md +1 -1
  856. package/pipeline/skills/shared/external/swiftui-pro/references/data.md +2 -2
  857. package/pipeline/skills/shared/external/swiftui-pro/references/design.md +4 -4
  858. package/pipeline/skills/shared/external/swiftui-pro/references/hygiene.md +2 -2
  859. package/pipeline/skills/shared/external/swiftui-pro/references/navigation.md +1 -1
  860. package/pipeline/skills/shared/external/swiftui-pro/references/performance.md +1 -1
  861. package/pipeline/skills/shared/external/swiftui-pro/references/swift.md +2 -2
  862. package/pipeline/skills/shared/external/swiftui-pro/references/views.md +2 -2
  863. package/pipeline/skills/shared/external/swiftui-ui-patterns/SKILL.md +1 -1
  864. package/pipeline/skills/shared/external/swiftui-uikit-interop/SKILL.md +12 -12
  865. package/pipeline/skills/shared/external/swiftui-uikit-interop/references/hosting-migration.md +3 -3
  866. package/pipeline/skills/shared/external/swiftui-uikit-interop/references/representable-recipes.md +1 -1
  867. package/pipeline/skills/shared/external/swiftui-webkit/SKILL.md +11 -11
  868. package/pipeline/skills/shared/external/swiftui-webkit/references/migration-and-fallbacks.md +124 -10
  869. package/pipeline/skills/shared/external/tailwind-css/SKILL.md +3 -2
  870. package/pipeline/skills/shared/external/testing-backend/SKILL.md +2 -1
  871. package/pipeline/skills/shared/external/tipkit/SKILL.md +3 -3
  872. package/pipeline/skills/shared/external/tipkit/references/tipkit-patterns.md +9 -9
  873. package/pipeline/skills/shared/external/typescript-patterns/SKILL.md +17 -16
  874. package/pipeline/skills/shared/external/vision-framework/SKILL.md +11 -11
  875. package/pipeline/skills/shared/external/vision-framework/references/vision-requests.md +1 -1
  876. package/pipeline/skills/shared/external/vision-framework/references/visionkit-scanner.md +5 -5
  877. package/pipeline/skills/shared/external/vue-composition/SKILL.md +7 -6
  878. package/pipeline/skills/shared/external/weatherkit/SKILL.md +3 -3
  879. package/pipeline/skills/shared/external/weatherkit/references/weatherkit-patterns.md +9 -9
  880. package/pipeline/skills/shared/external/web-accessibility/SKILL.md +1 -0
  881. package/pipeline/skills/shared/external/web-performance/SKILL.md +8 -7
  882. package/pipeline/skills/shared/external/web-testing/SKILL.md +7 -6
  883. package/pipeline/skills/shared/external/widgetkit/SKILL.md +23 -17
  884. package/pipeline/skills/shared/external/widgetkit/references/widgetkit-advanced.md +99 -0
  885. package/pipeline/skills/shared/external/xcode-build-benchmark/SKILL.md +89 -0
  886. package/pipeline/skills/shared/external/xcode-build-benchmark/references/benchmark-artifacts.md +94 -0
  887. package/pipeline/skills/shared/external/xcode-build-benchmark/references/benchmarking-workflow.md +67 -0
  888. package/pipeline/skills/shared/external/xcode-build-benchmark/schemas/build-benchmark.schema.json +230 -0
  889. package/pipeline/skills/shared/external/xcode-build-benchmark/scripts/benchmark_builds.py +308 -0
  890. package/pipeline/skills/shared/external/xcode-build-fixer/SKILL.md +219 -0
  891. package/pipeline/skills/shared/external/xcode-build-fixer/references/build-settings-best-practices.md +216 -0
  892. package/pipeline/skills/shared/external/xcode-build-fixer/references/fix-patterns.md +290 -0
  893. package/pipeline/skills/shared/external/xcode-build-fixer/references/recommendation-format.md +85 -0
  894. package/pipeline/skills/shared/external/xcode-build-fixer/scripts/benchmark_builds.py +308 -0
  895. package/pipeline/skills/shared/external/xcode-build-orchestrator/SKILL.md +157 -0
  896. package/pipeline/skills/shared/external/xcode-build-orchestrator/references/benchmark-artifacts.md +94 -0
  897. package/pipeline/skills/shared/external/xcode-build-orchestrator/references/build-settings-best-practices.md +216 -0
  898. package/pipeline/skills/shared/external/xcode-build-orchestrator/references/orchestration-report-template.md +143 -0
  899. package/pipeline/skills/shared/external/xcode-build-orchestrator/references/recommendation-format.md +85 -0
  900. package/pipeline/skills/shared/external/xcode-build-orchestrator/scripts/benchmark_builds.py +308 -0
  901. package/pipeline/skills/shared/external/xcode-build-orchestrator/scripts/diagnose_compilation.py +273 -0
  902. package/pipeline/skills/shared/external/xcode-build-orchestrator/scripts/generate_optimization_report.py +533 -0
  903. package/pipeline/skills/shared/external/xcode-compilation-analyzer/SKILL.md +90 -0
  904. package/pipeline/skills/shared/external/xcode-compilation-analyzer/references/build-optimization-sources.md +155 -0
  905. package/pipeline/skills/shared/external/xcode-compilation-analyzer/references/code-compilation-checks.md +106 -0
  906. package/pipeline/skills/shared/external/xcode-compilation-analyzer/references/recommendation-format.md +85 -0
  907. package/pipeline/skills/shared/external/xcode-compilation-analyzer/scripts/diagnose_compilation.py +273 -0
  908. package/pipeline/skills/shared/external/xcode-project-analyzer/SKILL.md +77 -0
  909. package/pipeline/skills/shared/external/xcode-project-analyzer/references/build-optimization-sources.md +155 -0
  910. package/pipeline/skills/shared/external/xcode-project-analyzer/references/build-settings-best-practices.md +216 -0
  911. package/pipeline/skills/shared/external/xcode-project-analyzer/references/project-audit-checks.md +101 -0
  912. package/pipeline/skills/shared/external/xcode-project-analyzer/references/recommendation-format.md +85 -0
  913. package/pipeline/skills/skills-index.md +213 -192
  914. package/docs/GENERICITY-REVIEW.md +0 -277
  915. package/docs/STABILITY-FIX-PLAN.md +0 -168
  916. package/pipeline/scripts/README-figma-smokes.md +0 -34
  917. package/pipeline/scripts/figma-placeholder-map.json +0 -191
  918. package/pipeline/scripts/import-figma-skills.sh +0 -253
  919. package/pipeline/scripts/smoke-figma-config-schema.sh +0 -144
  920. package/pipeline/scripts/smoke-figma-skill-import.sh +0 -174
  921. package/pipeline/scripts/smoke-install-leak-gate.sh +0 -125
  922. package/pipeline/scripts/smoke-personal-data.sh +0 -84
  923. package/pipeline/scripts/sync-figma-source.sh +0 -228
  924. package/pipeline/skills/figma-ios/figma-to-component/scripts/confluence-page-ids.json +0 -94
  925. package/pipeline/skills/shared/external/app-store-review/references/code-signing.md +0 -259
  926. package/pipeline/skills/shared/external/app-store-review/references/rejection-patterns.md +0 -152
  927. package/pipeline/skills/shared/external/pencilkit-drawing/references/paperkit-integration.md +0 -376
package/CHANGELOG.md CHANGED
@@ -14,2610 +14,670 @@ Internal file-layout changes that don't affect the slash-command surface are sti
14
14
 
15
15
  ---
16
16
 
17
- ## [8.6.0] 2026-05-11
17
+ ## [10.0.6] - 2026-06-23
18
18
 
19
- **Minor** — Seven opt-in capabilities derived from external pattern research (Anthropic, Aider, Cursor Bugbot, Devin Review/Knowledge, Windsurf Cascade, Cline). All default OFF so existing-user baselines are unchanged. Plus a `MANDATORY` scrub across docs after maintainer feedback.
19
+ Release-plumbing patch so the stable line reaches npmjs. No behavior change to
20
+ the slash-command surface.
20
21
 
21
- ### Why
22
-
23
- Research pass against current docs (May 2026) of: Aider, SWE-agent, Cline, OpenHands, Cursor, Windsurf, Devin, LangGraph, AutoGen, CrewAI, and Anthropic's "Building Effective Agents" + Claude Code best-practices. Findings (verified citations in the agent-spec frontmatter of each new feature) shaped which patterns to borrow and which to skip. Multi-agent frameworks (LangGraph/AutoGen/CrewAI) explicitly rejected — pipeline already maps to orchestrator-workers with a state file, the frameworks would add weight without adding value.
24
-
25
- ### Added (all opt-in)
26
-
27
- - **Phase 3.5 Dev Critic** (Anthropic evaluator-optimizer): `agents/dev-critic.md` runs 4 deterministic gates (build/lint/test/secrets) + the platform checklist BEFORE Phase 4 reviewers see the diff. Catches failures that would otherwise burn 2-3 reviewer calls + Opus triage. Loop cap strict max 2 iterations, then escalate. `prefs.global.devCritic.{enabled,maxIterations,model,minDiffLoc}`.
28
- - **Aider-style Repo Map** (`pipeline/scripts/repo-map.mjs`): graph-ranked PageRank-style index with TF-IDF credit splitting so shared helper names (`pass`/`fail`/`init`) don't dominate. Token-budgeted markdown injection into Phase 1 + Phase 4 prompts. Deterministic, no embeddings infra. Supports Swift / Kotlin / TS/JS / Python / Bash. `prefs.global.repoMap.{enabled,tokenBudget,topFiles,include,exclude}`.
29
- - **Auto-review incoming PRs** (`pipeline/lib/review-watch.sh`): Cursor Bugbot + Devin Review pattern. Polls watched GitHub repos for PRs the user did NOT author and dispatches `/multi-agent:review` per new/updated PR. State per repo under `~/.claude/state/review-watch/`. `prefs.global.reviewWatch.{enabled,repos,intervalSeconds,labelFilter}`.
30
- - **Prior-comment dedupe** (Bugbot parity): every inline comment now carries `<!-- multi-agent-finding: <sha-16> -->` marker. Re-runs of `/multi-agent:review` skip findings whose fingerprint (path | line | issue) already appears in the PR's existing comments. Works on both GitHub (`/pulls/{n}/comments` + `/issues/{n}/comments`) and Bitbucket Server (`/pull-requests/{id}/activities?fromType=COMMENT`). `prefs.global.review.dedupeInlineComments` (default **on** — Bugbot parity).
31
- - **Phase 0 Step 8 Clarification** (Devin Knowledge / Ask Devin): Haiku scorer reads task title + body + acceptance, emits clarity score 0-10 + up to N targeted questions with discrete options (each with `recommended` flag + tradeoff reason). Three autopilot modes: `skip` (drop questions, proceed), `log` (append to agent-log, proceed — default), `abort` (pause, require `multi-agent:resume`). `prefs.global.clarifyAmbiguous.{enabled,model,minScoreToProceed,maxQuestions,autopilotMode}`.
32
- - **Plan-as-live-Todo-list** (Windsurf Cascade, Cursor Plan Mode): Phase 2 Step 4.5 emits `agent-state.plan.todos[]` with id / task / status / deps. Phase 3 iterates via `pipeline/lib/plan-todos.sh next` (deps-respecting picker) → `start` → `complete | fail | skip`. Phase 7 renders the rollup into `agent-log.md`. `prefs.global.planTodos.enabled`.
33
- - **Shadow-Git checkpoints** (Cline): per-tool-call (or per-todo-step) snapshots in a separate git repo under `~/.claude/state/shadow-git/<task-id>/.git/`. Worktree rollback via `pipeline/lib/shadow-git.sh restore --files <sha>` without touching the project's real `.git` history. Auto-excludes `node_modules` / `Pods` / `.build` / `DerivedData` / `.next` / `.gradle` / `__pycache__` / `.venv`. `prefs.global.shadowGit.{enabled,mode,pruneAfterDays}`.
34
-
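The deps-respecting `next` picker described in the Plan-as-live-Todo-list entry can be pictured roughly as follows. This is a minimal sketch, not the code in `pipeline/lib/plan-todos.sh`: the function name `nextTodo`, the status strings (`pending`, `completed`, `skipped`), and the choice to let a skipped todo satisfy a dependency are all assumptions made for illustration. Only the `id` / `task` / `status` / `deps` fields come from the entry above.

```javascript
// Hypothetical sketch of a deps-respecting picker over agent-state.plan.todos[].
// Status names and the "skipped satisfies a dependency" choice are assumptions.
function nextTodo(todos) {
  // Ids of todos that no longer block their dependents.
  const done = new Set(
    todos
      .filter((t) => t.status === "completed" || t.status === "skipped")
      .map((t) => t.id)
  );
  // First pending todo whose every dependency is already done, else null.
  const pick = todos.find(
    (t) => t.status === "pending" && (t.deps || []).every((d) => done.has(d))
  );
  return pick ?? null;
}

const samplePlan = [
  { id: "t1", task: "Add model", status: "completed", deps: [] },
  { id: "t2", task: "Add view", status: "pending", deps: ["t1"] },
  { id: "t3", task: "Wire navigation", status: "pending", deps: ["t2"] },
];
console.log(nextTodo(samplePlan)?.id);
```

In this sketch `t3` is never offered before `t2` completes, which is the property the "deps-respecting" wording implies.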
35
- ### Fixed (review pipeline hardening)
36
-
37
- - Inline post 422 falls back to top-level PR comment (was silently dropping the finding).
38
- - `gh api -F body` → `-f body` to avoid typecast on numeric/bool-looking bodies.
39
- - `agent-state.review.headCommitSha` now persisted by `/multi-agent:review` Step 2 (was orphaned in `/tmp`).
40
- - Informative `--request-changes --body` instead of single-space placeholder.
41
- - Bitbucket diff fetch validates HTTP 200 before treating body as patch.
42
- - Standalone `/multi-agent:review` seeds `TASK_ID` and traps `/tmp` cleanup on EXIT.
43
- - `OUT_LANG` validated against `{tr, en}` enum.
44
- - Per-finding `jq` calls collapsed 6 → 1 (TSV + base64 round-trip).
45
- - `count_accepted` single-pass jq.
46
- - `pr-review-actions.md` em-dash rule scoped to free-form prose (template separator allowed).
47
-
48
- ### Changed (docs)
49
-
50
- - `gen-mode-dispatch.mjs` is now mode-aware. Previously hardcoded `--dev` references in the TaskCreate paragraph, so dev-autopilot / autopilot / local docs all carried text that was incorrect for their actual phase set. Each mode file now carries text matching its own active phase set. All 7 mode entry docs regenerated.
51
- - `MANDATORY` scrub — the word appeared 111 times across 47 files, including doc headings, smoke test labels, schema descriptions, and generator output. Replaced with `required` / `strict` / restructured headings after maintainer feedback. The `gen-mode-dispatch.mjs` change cascades to all 7 mode files; `tracker-contract.md` anchor renamed to `TaskCreate ordering (strict)`; `smoke-tasklist-ordering.sh` and `smoke-mode-dispatch-drift.sh` updated to grep for the new wording.
52
-
53
- ### Tooling counts (post-release)
54
-
55
- - Smoke suites: 84 → **87** (+ smoke-dev-critic / smoke-repo-map / smoke-review-watch / smoke-clarify / smoke-plan-todos / smoke-shadow-git; smoke-cost-summary etc. unchanged)
56
- - JSON schemas: 10 → **13** (+ dev-critic-output, clarify-output, plan-todos)
57
- - Agent personas: 6 → **8** (+ dev-critic, task-clarifier)
58
-
59
- ### Pre-existing test debt (unchanged)
60
-
61
- 10 smoke scripts were failing on the baseline before this release: figma-credential-store, install-layout, issue-comment-template, multi-repo-worktree, phase-0-multi-repo, phase-6-multi, skill-authoring, sync-delegation, token-budget, url-enrichment. None of the v8.6.0 work touched these areas, and none of them regressed. Cleanup is its own backlog item.
62
-
63
- ---
64
-
65
- ## [8.5.6] — 2026-05-08
66
-
67
- **Breaking** — `/multi-agent:review` PR-mode contract rewritten. The v8.5.5 monolith advisory comment is gone, replaced by per-finding inline comments + an explicit approve/needs-work review state. Bitbucket Server PRs are now first-class. Output language is honored on every PR-side body.
68
-
69
- ### Why
70
-
71
- Two real-world issues found on first dogfood of v8.5.5:
72
-
73
- 1. `outputLanguage="tr"` set globally, but the PR comment body was rendered in English (rules.md Language Application matrix violation).
74
- 2. The "post one big advisory comment with verdict + tables + finding sections" pattern is the wrong shape for code review. Reviewers expect inline comments anchored to `file:line` and an explicit approve/changes signal — not a wall of text.
75
-
76
- ### Changed (breaking)
77
-
78
- - `pipeline/commands/multi-agent/refs/channels/pr-review-comment.md` — **DELETED**. Replaced by `pr-review-actions.md`.
79
- - `pipeline/lib/render-pr-review-body.sh` — **DELETED**. Replaced by `lib/post-pr-review.sh`.
80
- - `pipeline/scripts/smoke-pr-review-comment-template.sh` — **DELETED**. Replaced by `smoke-pr-review-actions.sh`.
81
-
82
- Forks that wrapped the legacy renderer must update to call `post-pr-review.sh "$TASK_ID"`.
83
-
84
- ### Added
85
-
86
- - New decision rule (in `pr-review-actions.md`): 0 accepted blocking AND 0 accepted important → **APPROVE** (no comments). ≥1 of either → **NEEDS_WORK / REQUEST CHANGES** + one inline comment per accepted blocking + important. Suggestions are chat-only and never trigger a PR comment.
87
- - `pipeline/lib/post-pr-review.sh` — provider-aware orchestrator: reads `agent-state.review`, computes the decision, posts inline comments in `outputLanguage`, sets the review state.
88
- - **Bitbucket Server adapter** — full URL parser (`https://bitbucket.<host>/projects/<KEY>/repos/<slug>/pull-requests/<N>`), diff fetch via REST + Basic auth, comments via `POST /pull-requests/{n}/comments` with proper `anchor` block, approve/needs-work via `PUT /pull-requests/{n}/participants/{user}` with `{"status":"APPROVED"|"NEEDS_WORK"}`. Credentials resolved via `prefs.keychainMapping.bitbucket_user / .bitbucket_token` and the cross-platform credential-store helper (no hardcoded key names).
89
- - Both `tr` and `en` inline-comment shapes documented in the canonical channel doc; `post-pr-review.sh` honors `prefs.global.outputLanguage` per finding.
90
- - `pipeline/scripts/smoke-pr-review-actions.sh` — new smoke gate enforcing the decision rule, both languages, the github + bitbucket-server provider switch, and the absence of v8.5.5 legacy artifacts.
91
-
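The decision rule in the first Added bullet is small enough to express as a pure function. The sketch below is illustrative only: the finding shape (`severity`, `accepted`) and the function name `decideReview` are assumptions, not the documented `agent-state.review` schema or the logic inside `post-pr-review.sh`.

```javascript
// Hypothetical sketch of the approve / needs-work rule.
// The { severity, accepted } finding shape is an assumption.
function decideReview(findings) {
  // Only accepted blocking and important findings become inline comments;
  // suggestions stay chat-only and never affect the decision.
  const toPost = findings.filter(
    (f) => f.accepted && (f.severity === "blocking" || f.severity === "important")
  );
  return {
    decision: toPost.length === 0 ? "APPROVE" : "NEEDS_WORK",
    inlineComments: toPost,
  };
}

const sampleFindings = [
  { severity: "suggestion", accepted: true, issue: "rename variable" },
  { severity: "important", accepted: false, issue: "rejected in triage" },
  { severity: "blocking", accepted: true, issue: "force unwrap on nil path" },
];
console.log(decideReview(sampleFindings).decision);
```

Keeping the rule free of I/O means the same function could serve both the GitHub and Bitbucket Server paths, with only the posting step differing per provider.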
92
- ### Hard prohibitions enforced
93
-
94
- - No monolith description comment on the PR.
95
- - No suggestion-severity inline comments (chat-only).
96
- - No `Closes #` / `Fixes #` / `Resolves #` keywords in any comment body.
97
- - No em-dashes in prose; ` · ` and `→` only.
98
- - No mixing languages in one comment body.
99
-
100
- ### Migration
101
-
102
- Branch-mode invocations (`/multi-agent:review` no args, or with a branch name) are unchanged. Anyone using the legacy `render-pr-review-body.sh` directly must switch to `post-pr-review.sh "$TASK_ID"` — the new script reads everything from `agent-state.review` end-to-end.
103
-
104
- ---
105
-
106
- ## [8.5.5] — 2026-05-08
107
-
108
- Minor — `/multi-agent:review` now accepts PR URL / `#N` / `repo#N` input shapes and can post the parallel reviewer verdict back to the PR as one canonical comment per run.
109
-
110
- ### Added
111
-
112
- - 5 input shapes for `/multi-agent:review`: empty (current branch), branch name, `#N`, `repo#N`, full `https://github.com/.../pull/N` URL.
113
- - `gh pr diff <N> --repo <org/repo>` flow — no local checkout required to review a PR diff.
114
- - `pipeline/commands/multi-agent/refs/channels/pr-review-comment.md` — canonical PR review comment template (verdict block + per-severity findings + triage notes + build/test + footer marker).
115
- - `pipeline/lib/render-pr-review-body.sh` — reads `agent-state.review.*` and renders the comment body to stdout for `gh pr comment --body-file`.
116
- - `pipeline/scripts/smoke-pr-review-comment-template.sh` — lints template anchors, the review.md wiring, the input parser, and the renderer's contract with `agent-state.json`.
117
- - Opt-in PR-comment prompt: interactive default `Post comment`; autopilot auto-posts when findings exist; `--no-comment` flag skips.
118
-
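The five input shapes listed above could be told apart with a small classifier like the one below. It is a sketch under assumptions: the function name `parseReviewTarget`, the returned `kind` labels, and the exact regular expressions are hypothetical and are not taken from `review.md`.

```javascript
// Hypothetical classifier for the five /multi-agent:review input shapes.
// Names and regexes are illustrative assumptions.
function parseReviewTarget(input) {
  const arg = (input ?? "").trim();
  if (arg === "") return { kind: "current-branch" };

  // Full GitHub PR URL: https://github.com/<org>/<repo>/pull/<N>
  let m = arg.match(/^https:\/\/github\.com\/([^/]+)\/([^/]+)\/pull\/(\d+)/);
  if (m) return { kind: "pr-url", repo: `${m[1]}/${m[2]}`, number: Number(m[3]) };

  // Bare PR number: #N
  m = arg.match(/^#(\d+)$/);
  if (m) return { kind: "pr-number", number: Number(m[1]) };

  // Repo-qualified PR: repo#N
  m = arg.match(/^([^#\s]+)#(\d+)$/);
  if (m) return { kind: "repo-pr", repo: m[1], number: Number(m[2]) };

  // Anything else is treated as a branch name.
  return { kind: "branch", branch: arg };
}

console.log(parseReviewTarget("https://github.com/org/repo/pull/42"));
```

Checking the URL form first keeps a URL from being misread as a branch name, and falling through to `branch` last preserves the existing branch-mode behavior.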
119
- ### Changed
120
-
121
- - `pipeline/commands/multi-agent/review.md` rewritten into 8 steps (parse → fetch diff → parallel review → store-compliance cross-ref → triage → chat summary → optional PR comment → token telemetry).
122
- - `README.md` smoke-suite count 75 → 81 (keeps `smoke-readme-counts` green).
123
-
124
- ### Compatibility
125
-
126
- Branch-mode invocations stay chat-only — no auto-post. Existing `/multi-agent:review` (no args, current branch) and `/multi-agent:review feature/foo` behave identically to v8.5.4.
127
-
128
- ### Hard prohibitions enforced by the new template
129
-
130
- - No `Closes #` / `Fixes #` / `Resolves #` keywords in PR comment body
131
- - No em-dashes (`—`) in prose; ` · ` and `→` only
132
- - No model vendor names in verdict prose
133
- - One comment per `/multi-agent:review` invocation (re-runs append, never edit)
134
-
135
- ---
136
-
137
- ## [8.5.4] — 2026-05-08
138
-
139
- Patch — drops 4 unused third-party adapter targets (Windsurf, Cline, Zed AI, Continue.dev) per pipeline-owner preference. Tier 1 (Claude Code + Copilot CLI orchestration) and Tier 2 (Cursor + GitHub Copilot Chat knowledge layer) are unchanged.
140
-
141
- ### Removed
142
-
143
- - `pipeline/adapters/windsurf.mjs`
144
- - `pipeline/adapters/cline.mjs`
145
- - `pipeline/adapters/zed.mjs`
146
- - `pipeline/adapters/continue.mjs`
147
- - `pipeline/scripts/smoke-adapters-tier3.sh` (covered only the dropped adapters)
148
- - `--windsurf`, `--cline`, `--zed`, `--continue` flags from `install.js` and `uninstall.mjs`
149
- - README's previous "Tier 2" Windsurf/Cline rows + "Tier 3" Continue/Zed table
150
-
151
- ### Changed
152
-
153
- - `install/_adapters.mjs` flag set narrowed to `{ cursor, copilotChat }`.
154
- - `install/index.mjs` TOOL_FLAGS reduced; install summary block trimmed.
155
- - `pipeline/scripts/uninstall.mjs` flag table + adapter dispatch reduced.
156
- - `pipeline/scripts/smoke-adapters.sh` rewritten around the cursor + copilot-chat pair (16 assertions, was 22).
157
- - `pipeline/scripts/smoke-delete-flow.sh` retargets the Windsurf/Cline selective-uninstall tests to copilot-chat.
158
- - `README.md` install commands, npx commands, Tier table all reflect the narrower set; new `v8.5.4 dropped-tools` note for forks who want them back.
159
-
160
- ### Reintroduction is cheap
161
-
162
- Re-adding any dropped adapter is one file against `pipeline/adapters/_base.mjs` plus a flag in `install/_adapters.mjs`. The base framework is preserved.
163
-
164
- ---
165
-
166
- ## [8.5.3] — 2026-05-08
167
-
168
- Patch — fully closes Defect 1 (picker language axis). Surfaced when a user with `outputLanguage="tr"` saw the Phase 6 WIP-checkout picker render in English ("Phase 6 — Check out the feature branch locally for a quick WIP test before commit?").
169
-
170
- ### Fixed
171
-
172
- - **`refs/phases/phase-6-commit.md`** — the WIP-checkout prompt block previously instructed "ask in English (`promptLanguage` is locked to `en`)", which contradicted the v8.5.1 `rules.md` matrix. Now the block spells out the matrix per-field: `question` + `description` follow `outputLanguage`, `label` + `header` stay English. Adds the canonical button labels.
173
- - **`commands/multi-agent/_dev-context.md`** — replaces "User-facing strings always render in English" header note with a per-field matrix reference.
174
- - **`commands/multi-agent/_repo-picker.md`** — same replacement.
175
-
176
- ### Added
177
-
178
- - **`scripts/smoke-language-axis.sh`** — forbids the `always English` / `ask in English (promptLanguage)` phrases in phase docs and orchestrator commands; positive-asserts the `Per-field language matrix` section exists in `rules.md` with the right per-field routing. Allowlist scoped to `rules.md`, `language.md`, `setup.md`, `picker-contract.md` (these legitimately quote the rule).
179
-
180
- ### Why this matters
181
-
182
- Defect 1 was nominally fixed in 8.5.1 by narrowing the matrix in `rules.md`, but the phase docs that drive runtime picker rendering carried legacy wording that overrode the matrix. The smoke gate prevents the regression from re-appearing.
183
-
184
- ---
185
-
186
- ## [8.5.2] — 2026-05-08
187
-
188
- Patch — closes the 3 remaining v9 stability defects (5, 8, 10) from `docs/STABILITY-FIX-PLAN.md`.
189
-
190
- ### Added
191
-
192
- - **`skills/figma-ios/figma-to-component/phases/phase-1.5-existing-discovery.md`** — new gate before Phase 2. Statuses: `GREENFIELD_OK` / `ALREADY_MAPPED` / `EXISTING_SOURCE_NO_CC` / `AMBIGUOUS_SOURCE` / `GAP_ANALYSIS` / `CANCELLED`. Inspects `code_connect_map_response.json` for self-references, globs for an existing `{ComponentName}.swift`, and routes the run to gap analysis instead of overwriting existing source.
193
- - **`scripts/smoke-tracker-tokens-invocation.sh`** — fails the build if any of phases 1/2/3/4 drops the `phase-tracker.sh tokens <N> <in> <out>` invocation. Closes Defect 5.
194
- - **`scripts/smoke-worktree-path-convention.sh`** — greps for forbidden worktree paths (`$HOME/.worktrees`, `~/.worktrees`, `$HOME/repo--taskid`). Closes Defect 8.
195
- - **`scripts/smoke-existing-discovery-gate.sh`** — asserts the new Phase 1.5 doc has all canonical statuses and Phase 2 / Phase 6 ENTRY GATEs consult `discovery.status`. Closes Defect 10.
196
-
197
- ### Changed
198
-
199
- - **Phase 1 / 2 / 3 / 4 docs** — append a `Token telemetry — invoke after every LLM call` section. The mechanic (`phase-tracker.sh tokens`) was always there; this patch enforces invocation per phase.
200
- - **`phase-2-orchestrator.md` ENTRY GATE** — now consults `agent-state.discovery.status`. Only `GREENFIELD_OK` (or absent for backward compat) proceeds with greenfield generation.
201
- - **`phase-6-code-connect.md` ENTRY GATE** — now skips publish when `discovery.status === "ALREADY_MAPPED"` and `"Code Connect" not in discovery.gaps`, OR when a sibling `.figma.swift` is already published. Idempotent on Figma's side, so duplicates are harmless — but cleaner this way.
202
-
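The two ENTRY GATE conditions above reduce to a pair of boolean checks. The sketch below assumes a plain object for `agent-state` and a caller-supplied flag for "a sibling `.figma.swift` is already published"; the function names and that flag are hypothetical, and the real gates are prose in the phase docs rather than this code.

```javascript
// Hypothetical sketch of the Phase 2 and Phase 6 entry gates.
// Function names and the siblingAlreadyPublished flag are assumptions.

// Phase 2: only GREENFIELD_OK, or an absent status (backward compat), proceeds.
function mayGenerateGreenfield(state) {
  const status = state.discovery?.status;
  return status === undefined || status === "GREENFIELD_OK";
}

// Phase 6: skip publish when already mapped with no Code Connect gap,
// or when a sibling .figma.swift has already been published.
function shouldSkipPublish(state, siblingAlreadyPublished) {
  const d = state.discovery ?? {};
  const mappedWithoutGap =
    d.status === "ALREADY_MAPPED" && !(d.gaps ?? []).includes("Code Connect");
  return mappedWithoutGap || siblingAlreadyPublished;
}

const exampleState = { discovery: { status: "ALREADY_MAPPED", gaps: [] } };
console.log(mayGenerateGreenfield(exampleState), shouldSkipPublish(exampleState, false));
```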
203
- ### Defects closed
204
-
205
- | # | Symptom | Fix |
206
- |---|---|---|
207
- | 5 | Tracker tokens silently dropped from phase docs | `smoke-tracker-tokens-invocation.sh` enforces invocation in phases 1/2/3/4 |
208
- | 8 | Worktree could land outside `{projectRoot}/.worktrees/{taskId}` | `smoke-worktree-path-convention.sh` greps for forbidden patterns |
209
- | 10 | Pipeline overwrites existing components silently | Phase 1.5 discovery gate, 5 statuses, conservative fallback to greenfield |
210
-
211
- ### Out of scope
212
-
213
- - Pre-existing ESLint failures in `pipeline/scripts/test-gap-scan.mjs` + `test/install-telemetry.test.mjs` (last edited in `94fbc01` v8.3 and `f37f60b` v8.0.0)
214
- - Token live verbose render mode in tracker — mechanic exists, only invocation gate is what was missing
215
-
216
- ---
217
-
218
- ## [8.5.1] — 2026-05-08
219
-
220
- Patch — stability fixes for 9 user-reported defects from the GH1241 end-to-end run, plus a fork/rebrand readiness audit.
221
-
222
- ### Added
223
-
224
- - **`refs/channels/issue-comment.md`** — canonical template for the GitHub Issue channel. Mandates one comment per run (never state-only), defines section anchors (Geliştirme Detayı / Pull Requests / Acceptance Criteria / Build & Test / Sonraki adımlar / Ref), and pairs the comment with `update-issue-progress.sh` (comment first, flags second).
225
- - **`scripts/update-issue-progress.sh`** — idempotent script that rewrites the issue body's Progress flag table from `agent-state.flags`. Replaces the missing python script referenced in the issue body legend.
226
- - **`scripts/smoke-issue-comment-template.sh`** — asserts the canonical template has all required anchors and Phase 7 wires both the template and the progress sync in the right order.
227
- - **`scripts/smoke-no-token-prompt.sh`** — greps the pipeline source for hardcoded token-entry prompts; fails the build if any are introduced outside the Setup Wizard allowlist.
228
- - **`docs/STABILITY-FIX-PLAN.md`** — defect-to-root-cause-to-fix mapping for the 9 reported issues.
229
- - **`docs/GENERICITY-REVIEW.md`** — fork/rebrand readiness audit (7.5/10), with an adoption matrix for solo / SME / enterprise / open-source / web-only scenarios and a concrete fork task list.
230
-
231
- ### Changed
232
-
233
- - **`refs/rules.md` Language Application matrix narrowed.** `AskUserQuestion.label` and `header` stay English (UI button + chip contract), but `question` and `description` now follow `outputLanguage`. Adds a per-field language matrix as the canonical reference. The previous "all picker text English" rule was over-broad and caused half-Turkish dialogues when `outputLanguage="tr"` was set.
234
- - **`refs/keychain.md` adds Rule 1 — never prompt for a token mid-run.** Tokens resolved only via `credential-store.sh`; missing/expired returns a structured error that routes to the Setup Wizard. Smoke gate enforces this.
235
- - **`refs/phases/phase-7-report.md` adds Step 1.5 (Issue channel).** Pairs the canonical comment with the progress flag sync. Comment first (timestamp marks the run), flags second (one logical body diff).
236
- - **`scripts/phase-tracker.sh`** — pointer-fallback now emits a warning when used without `MULTI_AGENT_TASK_ID` env. `init` prints a session-tip line. Documents the concurrency pitfall in the resolver block.
237
- - **`commands/multi-agent/dev.md`** — Phase 3 Dev now branches on `agent-state.taskType`. When `taskType === "component"` (figma URL detected at Phase 0 Step 7), the full 17-substep figma orchestrator runs including 3.5A/B/C tests, 3.6 Code Connect, and 3.7 Wiki — even in `--dev` mode. The "tests optional" exception applies only to non-component work.
238
- - **`commands/multi-agent/channels.md`** — replaces hardcoded `M_ERDEN3_Github_Access_Token` example in the github-projects board adapter section with the standard `${USER}_Github_Access_Token` convention. Caught by `smoke-personal-data.sh` during the genericity review.
239
-
240
- ### Defects addressed (with mapping)
241
-
242
- | # | Symptom | Fix |
243
- |---|---|---|
244
- | 1 | Picker question/description English even with `outputLanguage="tr"` | `rules.md` matrix narrowed |
245
- | 2 | Issue Progress flag flow done by hand, no automation | NEW `update-issue-progress.sh` + Phase 7 Step 1.5 wiring |
246
- | 3 | Token sometimes prompted mid-run despite keychain | `keychain.md` Rule 1 + smoke gate |
247
- | 4 + 7 | Issue comment shape different every run | Canonical template `refs/channels/issue-comment.md` + smoke gate |
248
- | 5 | Tracker step granularity + token usage display missing | Documented in plan; the mechanism exists, phase-doc invocation is a follow-up |
249
- | 6 | Pipeline ID flips mid-session under concurrent inits | `phase-tracker.sh` pointer fallback warn + init session-tip |
250
- | 8 | Worktree independence doubt | Manually verified clean; defensive smoke gate scheduled |
251
- | 9 | Figma component tests skipped in `--dev` mode | `dev.md` Phase 3 dispatch on `taskType === "component"` |
252
-
253
- ### Out of scope (deferred to v9 stability)
254
-
255
- - Pre-existing ESLint failures in `pipeline/scripts/test-gap-scan.mjs` + `test/install-telemetry.test.mjs` (last edited in `94fbc01` v8.3 and `f37f60b` v8.0.0)
256
- - `smoke-worktree-path-convention.sh` (v9 plan)
257
- - Token usage live display in tracker render (the mechanism exists; phase docs need to invoke `phase-tracker.sh tokens <N> <in> <out>` after every LLM call)
258
-
259
- ---
260
-
261
- ## [8.5.0] — 2026-05-07
262
-
263
- Minor — new channel adapter + figma source incremental sync.
264
-
265
- ### Added
266
-
267
- - **`/multi-agent:channels` Board adapter (5th channel).** Moves the GitHub Issue across GitHub Projects v2 columns (default `In Develop → In Review`) via the `updateProjectV2ItemFieldValue` mutation. Greyed unless `figmaConfig.board.enabled === true` AND `figmaConfig.board.provider === "github-projects-v2"` AND `figmaConfig.github.projectV2Id` set — single-repo tasks without a project board see no behavioural change. Token resolved via `keychainMapping.github_projects` (preferred — must include `project,read:project` scopes), with fallback to `keychainMapping.github` and finally to `M_ERDEN3_Github_Access_Token`. Always exported as `GH_TOKEN`, never on the CLI (token-leak feedback rule). Multi-repo: runs once on the primary repo's issue. Idempotent: no-op if already in the target column. Surface lands in: Step 3b menu, Step 6 dispatch table + `#### Adapter: Board`, Step 7 summary, `argument-hint` (`--channels ...,board` + `--board-status <key>`). Mirror added to the Copilot CLI summary.
268
- - **Figma source sync (Step 0) — 12-skill pull.** `pipeline/scripts/sync-figma-source.sh` ran an incremental sync (`cd0cde2f → 67026735`): `figma-iteration-commit` lands +18KB of safety guards (Generated Swift keys integrity check on `LocalizationStringKeys.swift` + `LocalizedAccessibilityStringKeys.swift`, `Sources/Suggested/*.json` file count integrity check), plus a new `figma-to-component` orchestrator imported under `figma-ios/`. Smaller refreshes on `figma-form-integration`, `performance-swiftui`, `performance-tour`, `figma-cli-iterate`, `figma-cli-lean-iterate`, `figma-iterate`, `figma-to-swift-ui-start`. `figma-component-implement` skipped per `local-overlay: true`.
269
-
270
- ### Triggered by
271
-
272
- Issue #1237 (`ManageBookingHomePageFooterLayout`) retrospective: the lifecycle had no automated "In Develop → In Review" board move and pipeline iteration-commit was 18KB behind upstream (silent key-loss guards missing).
273
-
274
- ---
275
-
276
- ## [8.4.1] — 2026-05-07
277
-
278
- Patch — Phase 0 + Phase 6 safety-net rules + language axis clarification.
279
-
280
- ### Added (safety nets)
281
-
282
- - **Phase 0 — provider-aware account picker.** The picker now detects the git remote provider (`bitbucket.*` / `github.com` / `gitlab.*`) before listing accounts and labels itself accordingly ("Bitbucket account?" not "GitHub account?"). Stops the long-standing bug where Bitbucket-hosted repos were prompted with `gh auth` GitHub accounts.
283
- - **Phase 0 — network reachability gate.** A 5-second `git ls-remote --heads origin <baseBranch>` probe runs before the branch picker. On an unreachable host (VPN/DNS failure), the agent stops and offers three explicit choices: enable VPN and retry, continue from the stale local ref (recorded as `"baseRefFreshness": "stale"` in `agent-state.json`), or cancel. This eliminates the silent fallback that branched off stale local refs.
284
- - **Phase 6 — PR creation contract.** Codifies four rules: (1) always attach default reviewers at create time; (2) `PUT` on a Bitbucket Server PR replaces the resource — include `reviewers` in the payload or use `/participants` instead; (3) re-fetch and verify reviewer count after every PR write; (4) PR description body honors `outputLanguage`.
285
-
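The reachability gate described above can be sketched roughly as follows. This is a minimal Python illustration, not the pipeline's actual implementation: the function name `remote_reachable`, its parameters, and the decision to treat a timeout or a missing `git` binary as "unreachable" are all assumptions. Only the `git ls-remote --heads origin <baseBranch>` command and the 5-second limit come from the entry.

```python
import os
import subprocess


def remote_reachable(base_branch, timeout_s=5.0, cwd=None):
    """Return True if `origin` answers an ls-remote probe within the timeout.

    Hypothetical helper: name and signature are illustrative only.
    """
    # Assumption: disable interactive credential prompts so the probe cannot hang.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", "origin", base_branch],
            cwd=cwd,
            env=env,
            capture_output=True,
            timeout=timeout_s,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timeout, missing git binary, or unusable working directory.
        return False
    return result.returncode == 0
```

A caller would presumably branch on the result: proceed to the branch picker when it returns `True`, otherwise present the three choices listed in the entry.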
286
- ### Changed (clarification)
287
-
288
- - **Language axes spelled out.** Spec docs now state both axes explicitly: `promptLanguage=en` is fixed for everything the LLM reads as instruction (slash commands, skill specs, agent system prompts); `outputLanguage` is user-selectable for chat output and external payloads (PR description body, Jira comment, Confluence/Wiki body). Stay-English exceptions enumerated: `AskUserQuestion` UI, commit messages, branch names, code identifiers, log lines, PR title prefix.
289
- - **Copilot multi-agent-dev SKILL.md flipped to English.** Was Turkish; spec docs are LLM-facing and must be English. Frontmatter `language: en` now matches.
290
-
291
- ## [8.4.0] — 2026-05-06
292
-
293
- Minor — apple-archive-compliance backend swap + ecosystem cleanup.
294
-
295
- ### Changed (behavior)
296
-
297
- - **apple-archive-compliance backend swap.** The standalone ArchiveGuard Swift binary is retired. The 17-rule App Store audit now ships inside `@mmerterden/dev-toolkit-mcp` ≥ v2.4 as the `ios_app_store_audit` MCP tool (pure-Node port, identical JSON output shape). All four consumer surfaces — `/multi-agent:test "store-ready"`, Phase 4 Security Auditor, `/multi-agent:review`, `/multi-agent:channels` — keep working without changes; rule IDs and JSON contract preserved. Cache filename harmonized to `/tmp/archiveguard-$$.json` so `channels.md`'s auto-augmentation glob (`archiveguard-*.json`) picks it up correctly.
298
- - **`smoke-compliance-skills.sh` +1 assertion** (47 total): the apple skill writes the cache filename the channels glob reads — producer↔consumer filename mismatch will now break the build instead of silently disabling the store-compliance auto-augmentation.
299
- - **Pipeline-wide cleanup.** ~50 inline `(vX.Y+)` version annotations stripped from prompts, picker UI, command headings, and READMEs; legacy "v5.0.0 rebuild in progress" banner stripped from the always-loaded global rules; phase docs consolidated around hub-and-spoke references (canonical contracts in `refs/`, no duplicate spec in dispatcher); `phase-7-report.md`'s three "v3.7+ retained verbatim" archive blocks de-stamped to current spec; `channels.md`'s ASCII art flow diagram replaced with one-line prose; "Renamed from :test in v5.7.4" footnotes (EN+TR) dropped from `help.md`. Net per-session token reduction: ~−140 lines across CLAUDE-loaded surfaces.
300
- - **`help.md` expansion.** Both EN + TR blocks. Utility commands reorganized into Status & Resume / Post-Hoc & Side-Channel / Setup & Maintenance. Seven previously-undocumented commands added: `/multi-agent:diff-explain`, `:search`, `:scan`, `:refactor`, `:stack`, `:update`, `:delete`. New "Quality & Telemetry (advisory)" section covers the previously-invisible Diff Risk Score, Test Gap Report, Cost Breakdown, Triage Memory, Prior-Art Lookup, and Per-Persona model routing. "Dual Submodule" replaced with explicit Multi-Repo + Identity Routing rows. Store Compliance row added.
301
-
302
- ### Removed (cleanup)
303
-
304
- - Standalone ArchiveGuard Swift binary at `~/ArchiveGuard` and its GitHub repo `mmerterden/archiveGuard` are retired.
305
- - `docs/archive/` (4 files: `MIGRATION-v3.6-to-v3.7.md`, `MIGRATION-v4-to-v5.md`, `MIGRATION_4.0.md`, `plans/PLAN_v5.0.md`), `benchmarks/v3.6-baseline.json`, and `docs/MIGRATION.md` deleted. Single-maintainer repo with no external users on those versions; the per-release CHANGELOG and ROADMAP carry forward what's still relevant. `pipeline/commands/multi-agent/refs/cross-cli-contract.md` § 7 dead `REFACTOR_PLAN_v3.7.md` link replaced with the canonical inline rule. `WS-N` workstream tags from finished refactors stripped (`component-dispatch.md`, `progress-contract.md`, `issue-jira-triad.md` title).
306
- - Cross-CLI command count drift fixed: `sync.md` and `cross-cli-contract.md` (×2) said "26 commands"; actual public count is 29.
307
- - Companion repos: `@mmerterden/dev-toolkit-mcp` shipped 2.4.0 → 2.4.1 (npm `files` field bug — `tools/` directory was excluded from the v2.4.0 tarball; v2.4.1 adds 22 missing files / +60kB unpacked). `mmerterden.dev` website synced to current versions and tool counts.
308
-
309
- ---
310
-
311
- ## [8.3.3] — 2026-05-05
312
-
313
- Patch — closes the v8.3.2 follow-up sweep. Three new drift gates lock the
314
- keychain refactor and README counts in place; the figma skill family is
315
- fully migrated off raw `security`; a credential-helper resolver script
316
- gives skill bash blocks an actionable error instead of a silent failure
317
- when no install dir exists.
318
-
319
- ### Added
320
-
321
- - **`pipeline/scripts/smoke-figma-credential-store.sh`** (20 assertions) —
322
- drift gate that re-fails the build if any of the 8 migrated figma surfaces
323
- reverts to a raw `security` call against one of our managed token names
324
- (`FIGMA_ACCESS_TOKEN`, `FIGMA_MCP_TOKEN`, `FIGMA_API_USER`, `JIRA_TOKEN`,
325
- `CONFLUENCE_TOKEN`). Also asserts the cross-cli contract and the Copilot
326
- install template still declare the helper as canonical.
327
- - **`pipeline/scripts/smoke-readme-counts.sh`** (11 assertions) —
328
- every "Current at-a-glance" row in `README.md` is recomputed from the
329
- filesystem and compared against the declared value. The table is
330
- hand-edited; this gate is what stops it going stale every time someone
331
- adds a smoke / skill / schema.
332
- - **`pipeline/lib/credential-store-resolver.sh`** — single-source-of-truth
333
- resolver for skill bash blocks that need `CRED_STORE` set. Sourcing it
334
- exports the right path; if neither install dir has the helper, it prints
335
- an actionable remediation message (with the exact `npx ... install`
336
- command) and returns non-zero, so the caller halts on a clear error
337
- instead of `bash: $CRED_STORE: command not found`.
338
-
339
- ### Changed
340
-
341
- - **Figma skill family — Keychain hot-path migration completed.**
342
- The 8 surfaces that still called `security find-generic-password`
343
- directly are now routed through the cross-platform helper:
344
- `figma-issue`, `figma-component-confluence-sync`, `figma-setup`,
345
- `figma-validate`, `figma-to-component/phases/phase-6-code-connect.md`,
346
- and the `code-connect` / `tools` / `rest-api-script` reference docs.
347
- `figma-setup/SKILL.md` carries a top-of-file `CRED_STORE` preamble; the
348
- 17 token-touching bash blocks below it use `"$CRED_STORE"`. The
349
- `phase1-gather.py` Python script resolves the helper portably via
350
- `subprocess` (no more macOS-only assumption).
351
- Two intentional exceptions remain: `subprocess.check_output` calls that
352
- read `Claude Code-credentials` (Claude Code's own keychain entry, not
353
- ours) — those reach into another app's store and stay raw on purpose.
354
- - **`pipeline/commands/multi-agent/refs/cross-cli-contract.md`** — the
355
- Keychain row of the platform table now shows `credential-store.sh` as
356
- the single canonical column; the OS-specific commands (`security`,
357
- `secret-tool`, `cmdkey`) move to a "backend / for debugging" reference
358
- column. Callers no longer need to write per-OS `if command -v ...`
359
- branches for Keychain operations.
360
- - **`install/templates/copilot-instructions.md`** — Copilot allow-list now
361
- points at `~/.copilot/lib/credential-store.sh` and `~/.copilot/scripts/keychain.py`
362
- as the canonical Keychain interface; `security` / `secret-tool` / `cmdkey`
363
- are listed as the underlying platform backends.
364
- - **`pipeline/scripts/build-skills-index.mjs`** regenerated — the
365
- `.skills-index.json` runtime registry was 2 weeks stale (193 entries when
366
- actual was 159 / 196). Build-skills-index now re-runs as part of the
367
- release flow; smoke-readme-counts catches future drift.
368
- - **`README.md`** "Current at-a-glance" table corrected against
369
- filesystem reality:
370
- slash commands 32 → 29 (the picker fragments were incorrectly counted),
371
- total `SKILL.md` 195 → 196, smoke suites 73 → 75, JSON schemas
372
- `11 + token-budget config` → `10` (token-budget.json is a config file,
373
- not a `*.schema.json`, so it should not count toward the schema total).
374
-
375
- ### Cross-platform status (honest list)
376
-
377
- The credential helper is coded for all three platforms but only the macOS
378
- backend was end-to-end exercised this release:
379
-
380
- - **macOS** — `/usr/bin/security` backend. Roundtrip-tested in CI via
381
- `smoke-keychain.sh` (15 assertions). Production-ready.
382
- - **Linux** — `secret-tool` (libsecret) backend. Code path exists; smoke
383
- skips the live roundtrip because typical CI hosts have no keyring agent
384
- running. Field testing required before claiming parity.
385
- - **Windows** — PowerShell `CredentialManager` backend. Code path exists
386
- and is documented; not exercised this release. Field testing required.
387
-
388
- For now the safest claim is "production-ready on macOS Claude Code; cross-
389
- platform code paths in place but unverified." Filing field reports under
390
- issues with the `cross-platform-keychain` label is the fastest way to
391
- close that gap.
392
-
393
- ---
394
-
395
- ## [8.3.2] — 2026-05-05
396
-
397
- Patch — production-ready cleanup. Adds a deterministic Python helper for Keychain access, strips accumulated version-tag noise from user-facing docs, recalibrates per-phase token budgets, and clears stale doc/smoke drift. The shell credential store driver auto-delegates to Python on macOS/Linux so token reads no longer depend on shell-quoting subtleties; items are written under both `-l` (label) and `-s` (service) attributes so callers using either convention find them. Compliance smoke updated to recognise the v8.2 humanizer language gating (`--lang=` flag, decoupled from the locked `promptLanguage`).
398
-
399
- ### Added
400
-
401
- - **`pipeline/scripts/keychain.py`** — Python 3.10+ stdlib helper exposing `get` / `set` / `delete` / `list` / `doctor` subcommands. macOS shells out to `/usr/bin/security`; Linux uses `secret-tool`. `set <label> -` reads the secret from stdin to keep it out of shell history. Returns clean exit codes (0 ok / 1 missing / 2 backend / 3 usage / 4 backend error).
402
- - **`pipeline/scripts/smoke-keychain.sh`** (15 assertions) — verifies parse, argparse surface, doctor JSON shape, full set/get/rotate/delete roundtrip on macOS, cross-convention parity (every entry written reads back via both `-l` and `-s`), and remediation text on unsupported platforms.
403
- - **`installLib()` step in `install/claude.mjs` and `install/copilot.mjs`** — closes a long-standing install bug where `pipeline/lib/` was never copied into `~/.claude/lib/` or `~/.copilot/lib/`. Updates to `credential-store.sh`, `multi-repo-pipeline.sh`, and the rest of the shell library tree silently never reached user installs. Without this, the v8.3.2 Python delegate in `credential-store.sh` would have been invisible to actual users on next sync. `install-layout.tsv` fixture now expects `.claude/lib (8)` and `.copilot/lib (8)`.
404
-
405
- ### Changed
406
-
407
- - **`pipeline/lib/credential-store.sh`** — `do_get`, `do_set`, `do_delete` now delegate to `keychain.py` on macOS/Linux when `python3` is available. Opt out with `KEYCHAIN_DELEGATE=0`. Windows path unchanged (PowerShell `CredentialManager`).
408
- - **`pipeline/commands/multi-agent/setup.md`** — references the Python helper alongside the shell driver and documents the dual-attribute write behaviour.
409
- - **`pipeline/scripts/smoke-compliance-skills.sh`** — humanizer assertions now grep for `### EN ... --lang=en` / `### TR ... --lang=tr` section markers (the current gating) instead of the obsolete `promptLanguage == "en"|"tr"` strings (`promptLanguage` is locked to `en`; humanizer language now flows from the per-run `--lang=` flag).
410
- - **`pipeline/scripts/fixtures/install-layout.tsv`** — bumps `.claude/scripts` and `.copilot/scripts` to 125 (was 123) to account for the two new files above.
411
- - **All mode entry docs, phase refs, tracker-contract, Copilot SKILL.md, and the dispatch generator** — strip `(v8.3.1+)` / `(v6.1.0+)` style inline version tags from rule headings. Current-state writing only; version history belongs in this changelog. ~77 occurrences cleaned.
412
- - **`pipeline/schemas/token-budget.json`** — per-phase token budgets recalibrated against the post-v8.3 doc surface using the standing rule (`warn = current+10%`, `max = current+25%`, rounded to nearest 50). Phase totals: 0/8900, 1/2550, 2/4200, 3/4950, 4/6900, 5/2300, 6/5550, 7/4400 (total max 39750).
413
- - **`pipeline/schemas/prefs.schema.json`** — adds the `npm` slot to `keychainMapping` (was 11 services, now 12: `jira`, `bitbucket`, `bitbucket_token`, `bitbucket_user`, `github`, `confluence`, `figma`, `figma_mcp`, `fortify`, `firebase`, `jenkins`, `npm`).
414
-
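The budget rule quoted above (`warn = current+10%`, `max = current+25%`, rounded to the nearest 50) can be illustrated with a short Python sketch. The function names and the half-up tie-breaking are assumptions; the changelog does not say how exact ties are rounded, and the real recalibration may have been done differently.

```python
from fractions import Fraction
from math import floor


def round_to_step(value, step=50):
    """Round to the nearest multiple of `step` (assumption: ties round up)."""
    return floor(Fraction(value) / step + Fraction(1, 2)) * step


def phase_budget(current_tokens):
    """Derive warn/max thresholds from a phase's current token count."""
    current = Fraction(current_tokens)
    return {
        "warn": round_to_step(current * Fraction(110, 100)),  # current + 10%
        "max": round_to_step(current * Fraction(125, 100)),   # current + 25%
    }
```

Exact fractions are used instead of floats so that values such as `2000 * 1.10` do not drift across a rounding boundary.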
415
- ### Migration
416
-
417
- No action required. Existing tokens written by the shell driver (under `-s` only) continue to be findable; tokens written via the Python helper are findable by both lookup conventions.
418
-
419
- ---
420
-
421
- ## [8.3.1] — 2026-05-05
422
-
423
- Patch — fixes the TaskCreate ordering bug that produced visually scrambled tile stacks (e.g. `1 ✓ · 2 ✓ · 4 ✓ · 0 ▶ · 3 ☐`). The native TaskList widget renders tiles in TaskCreate creation order, not by phase-number metadata. Pre-marking phases as completed/skipped before Phase 0 starts (a behavioral pattern the orchestrator agent fell into for figma component tasks and `--dev` mode skipped phases) flipped the visual order even though the underlying tracker state was correct.
424
-
425
- ### Changed
426
-
427
- - **`refs/tracker-contract.md`** — adds the "MANDATORY: TaskCreate ordering rule (v8.3.1+)" section: all TaskCreate calls MUST fire in strict phase-number order BEFORE any TaskUpdate is applied. Mode-specific phase sets documented (full pipeline 0→7, `--dev` 0→3→5→6→7).
428
- - **`refs/phases/phase-0-init.md`** — Step −1 now declares the universal ordering rule.
429
- - **`refs/phases.md`** — replaces the "register tile then mark `[SKIPPED]` activeForm" guidance with the in-order rule.
430
- - **All 7 mode entry docs** — `dev.md`, `autopilot.md`, `local.md`, `local-autopilot.md`, `dev-autopilot.md`, `dev-local.md`, `dev-local-autopilot.md` — each carries the explicit rule under their "Visual channel — Claude Code" section.
431
- - **Copilot full-inline orchestrator** (`pipeline/skills/shared/core/multi-agent/SKILL.md`) — mirrors the rule in the Canonical phase labels section.
432
- - **`pipeline/commands/multi-agent/dev.md`** — `--dev` mode no longer pre-creates skipped phases for 1/2/4; only the 5-phase set (0/3/5/6/7) is registered.
433
-
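The ordering rule in this release (all TaskCreate calls in strict phase-number order, before any TaskUpdate) can be expressed as a small checker. The sketch below is illustrative only: the event tuples and the name `violates_ordering` are hypothetical and do not correspond to any real TaskList API.

```python
def violates_ordering(events):
    """Return True if the (action, phase) sequence breaks the ordering rule.

    Hypothetical event model: action is "create" or "update",
    phase is the integer phase number.
    """
    seen_update = False
    last_created = -1
    for action, phase in events:
        if action == "update":
            seen_update = True
        elif action == "create":
            # A create after any update, or out of ascending order, is a violation.
            if seen_update or phase <= last_created:
                return True
            last_created = phase
    return False
```

Under this model the scrambled stack from the release summary (creation order 1, 2, 4, 0, 3) is flagged, while the `--dev` sequence 0, 3, 5, 6, 7 followed by updates passes.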
434
- ### Added
435
-
436
- - **`pipeline/scripts/smoke-tasklist-ordering.sh`** (16 assertions) — verifies the rule appears in tracker-contract.md, phase-0-init.md, all 7 mode entry docs, the Copilot SKILL.md mirror, and that no doc retains the legacy "pre-mark skipped" wording without the new rule. Negative-pattern check catches future regressions.
437
-
438
- ### Migration
439
-
440
- No prefs migration. Pure docs/contract change — runtime behavior unchanged on the state file side. Visual rendering will correct itself on next run because the orchestrator now reads the explicit ordering rule from every mode entry doc it touches.
441
-
442
- ---
443
-
444
- ## [8.3.0] — 2026-05-05
445
-
446
- Minor — four orthogonal observability + quality additions, all advisory by default, with Cross-CLI parity preserved. Inspired by a feature audit of `ruvnet/ruflo`: kept four ideas, dropped the marketing-heavy ones (federation, GOAP planner, swarm intelligence, WASM kernels).
447
-
448
- ### Added
449
-
450
- - **Per-task Cost Breakdown in `agent-log.md`** — every Phase 7 run appends a `## Cost Breakdown` block with per-phase tokens (in/out) + estimated USD. Sourced from `phase-tracker.sh tokens` accumulators and `cost-table.json` prices. Independent of the channels-side `reportContent.costSummary` (PR/Jira gating). Token forwarder `LOG_METRIC_FORWARD_TO_TRACKER=1` keeps `metrics.jsonl` and the tracker in sync from one call site.
451
- - `pipeline/scripts/render-agent-log-cost.sh` (handles both array- and object-shaped tracker JSON)
452
- - `pipeline/scripts/log-metric.sh` gains opt-in tracker forwarder
453
- - `pipeline/scripts/phase-tracker.sh` gains `model <phase_id> <model_name>` action
454
- - `pipeline/scripts/smoke-agent-log-cost.sh` (13 assertions)
455
- - `refs/phases/log-format.md` Timeline table gets `Tokens (in/out)` column + Cost Breakdown section + emission contract
456
-
457
- - **Phase 4 Step 1.75 — Diff Risk Scoring (advisory)** — `pipeline/scripts/diff-risk-score.mjs` runs before reviewer dispatch and injects a top-N risk-ranked priority list into each reviewer's prompt. Heuristic, deterministic, sub-second, no LLM.
458
- - Signals: security paths (×3), schema migrations (×4), public API surfaces (×2), no-test-change (×2.5), complexity delta (×1.5), UI-critical paths (×1.5), loc changed (×1)
459
- - `pipeline/schemas/diff-risk.schema.json` (1.0.0) + `pipeline/scripts/validate-diff-risk.mjs`
460
- - `pipeline/scripts/fixtures/diff-risk-{ios,android}.diff` + `pipeline/scripts/smoke-diff-risk.sh` (14 assertions)
461
- - `pipeline/agents/code-reviewer.md` declares `${PRIORITY_FILES}` placeholder
462
- - `prefs.global.diffRiskAdvisory` toggle (default `true`)
463
-
464
- - **Phase 5 Step 0 — Test Gap Report (advisory)** — `pipeline/scripts/test-gap-scan.mjs` walks the diff for newly added public symbols and reports those with no paired test. Stack-specific rules ship for iOS, Android, Python, and Node.js.
465
- - Severity defaults: iOS Views, Android `@Composable`, interfaces, public protocols → `important`; other public API additions → `suggestion`. `--severity-promote` flag forces all to `important` (audit mode).
466
- - `pipeline/scripts/test-gap-rules/{ios,android,python,node}.json`
467
- - `pipeline/schemas/test-gap.schema.json` (1.0.0) + `pipeline/scripts/validate-test-gap.mjs`
468
- - `pipeline/scripts/fixtures/test-gap-{python,node}.diff` + reuses `diff-risk-{ios,android}.diff`
469
- - `pipeline/scripts/smoke-test-gap.sh` (22 assertions across 4 stacks)
470
- - `prefs.global.testGap` group: `enabled` (default true), `scanTree` (default false), `blockingThreshold` (null), `promoteSeverity` (default false)
471
-
472
- - **Phase 4 Triage Memory (advisory)** — per-repo append-only JSONL corpus at `~/.claude/memory/multi-agent/<repo-slug>/triage-corpus.jsonl` records every accepted/deferred/rejected finding. Phase 7 ingests on completion (idempotent); Phase 1 enriches the analysis with similar past tasks; Phase 4 triage attaches prior-art hits to each raw finding with explicit bias hedge ("context, not commands; current scope decides").
473
- - `pipeline/scripts/triage-memory.mjs` (subcommands: `ingest`, `query`, `path`, `stats`)
474
- - Token-overlap recall, zero deps, Node-18-compatible (no SQLite). Schema is forward-compatible with a future `vector BLOB` column.
475
- - `pipeline/schemas/triage-corpus.schema.json` (1.0.0)
476
- - `pipeline/scripts/smoke-triage-memory.sh` (11 assertions, uses HOME override to never touch the user's real memory dir)
477
- - `pipeline/scripts/search-logs.sh` + `commands/multi-agent/search.md` + Copilot `multi-agent-search/SKILL.md` gain `--semantic` flag that routes the query to the corpus
478
- - `prefs.global.priorArtEnrichment` group: `enabled` (default true), `ingestOnComplete` (default true), `topN` (default 3)
479
-
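To make the Diff Risk Scoring signals above concrete, here is a minimal Python sketch that uses the listed multipliers. How `diff-risk-score.mjs` actually combines the signals is not stated in the changelog; the weighted sum, the signal key names, and the function names are assumptions made for illustration.

```python
# Multipliers taken from the changelog entry; key names are hypothetical.
WEIGHTS = {
    "security_path": 3.0,
    "schema_migration": 4.0,
    "public_api": 2.0,
    "no_test_change": 2.5,
    "complexity_delta": 1.5,
    "ui_critical": 1.5,
    "loc_changed": 1.0,
}


def risk_score(signals):
    """Assumed combination rule: weighted sum of per-file signal values."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())


def top_n(files, n=3):
    """Rank file paths by descending score; ties break alphabetically."""
    ranked = sorted(files, key=lambda path: (-risk_score(files[path]), path))
    return ranked[:n]
```

Sorting on `(-score, path)` keeps the output deterministic, which matches the entry's description of the scorer as heuristic and deterministic.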
480
- ### Changed
481
-
482
- - **`refs/progress-contract.md`** — adds canonical "Token telemetry forwarding (v8.3+)" section documenting the `LOG_METRIC_FORWARD_TO_TRACKER=1` flag.
483
- - **Phase ref docs** (`phase-1-analysis`, `phase-2-planning`, `phase-3-dev`, `phase-4-review`, `phase-5-test`, `phase-6-commit`, `phase-7-report`) — every billable LLM dispatch now uses `LOG_METRIC_FORWARD_TO_TRACKER=1` in its telemetry snippet so the cost block stays complete.
484
- - **`pipeline/skills/shared/core/multi-agent/SKILL.md`** (Copilot full-inline orchestrator) — Timeline table gains Tokens column, agent-log gets Cost Breakdown section, Phase 4 lists step 0 (diff risk), Phase 5 lists step 0 (test gap).
485
- - **`render-cost-summary.sh`** — comma-grouping bug fix: 3-digit token counts no longer render as `,200` (also applied to the new `render-agent-log-cost.sh`).
486
-
487
- ### Migration
488
-
489
- No schema bump. New prefs keys (`diffRiskAdvisory`, `testGap`, `priorArtEnrichment`) all carry sensible defaults — existing `multi-agent-preferences.json` files keep working without manual edit. Flip them via `/multi-agent:setup` re-run or by editing the file directly.
490
-
491
- ### Smoke + Schema counts
492
-
493
- - Smoke suites: 66 → 71 (+5: `smoke-agent-log-cost`, `smoke-diff-risk`, `smoke-test-gap`, `smoke-triage-memory`; existing `smoke-cost-summary` extended)
494
- - JSON schemas: 8 → 11 (+3: `diff-risk`, `test-gap`, `triage-corpus`)
495
-
496
- ---
497
-
498
- ## [8.2.1] — 2026-05-04
499
-
500
- Patch — locks `prefs.global.promptLanguage` to `"en"`. Picker UI, confirmation prompts, error messages, and phase titles always render in English; only `outputLanguage` is user-toggleable. Follow-up to v8.2.0, which left both axes user-facing.
501
-
502
- ### Changed
503
-
504
- - **`/multi-agent:language` skill** — single-token form (`en|tr`) now sets only `outputLanguage`. The `prompt en|tr` form is rejected (`promptLanguage is fixed to "en" and cannot be changed`). Self-heals stale `promptLanguage` values back to `"en"` on every successful invocation.
505
- - **`/multi-agent:setup` Step 0** — collapsed to one question (`outputLanguage`); `promptLanguage` is seeded as `"en"` and never offered to the user.
506
- - **Pickers** (`_account-picker`, `_dev-context`, `_repo-picker`, `_input-parser`) — Turkish alternations dropped; UI strings English only.
507
- - **Phase 5 / Phase 6 prompts** — bilingual prompt copy removed; ask in English.
508
- - **`multi-agent/SKILL.md`** — phase title table collapsed to single English column.
509
- - **`channels.md` + Copilot mirror, `sim-test.md`, `jira.md`, `issue.md`** — wording updated to "English (promptLanguage is locked)"; `--lang=tr|en|both` override flags retained.
510
- - **`apple-archive-compliance`, `google-play-compliance`** — TR humanizer templates retained (only fire under explicit `--lang=tr` override); default-resolution wording reframed.
511
-
512
- ### Migration
513
-
514
- No prefs migration required. Existing prefs files with `promptLanguage="tr"` are silently healed back to `"en"` on the next `/multi-agent:language` or `/multi-agent:setup` run. Schema enum stays `{en, tr}` for round-trip compatibility.
515
-
516
- ---
517
-
518
- ## [8.2.0] — 2026-05-04
519
-
520
- Minor — second language axis (`outputLanguage`), new `/multi-agent:language` skill, mandatory Step 0 selection, host-visible-fields-stay-English smoke.
521
-
522
- The pipeline always supported `prefs.global.promptLanguage` for interactive pickers, but the assistant's own explanations and pipeline-generated reports rode the same field — so users who wanted Turkish pickers but English summaries (or vice versa) had no clean option. v8.2.0 splits the two axes and bakes the dual question into setup.
523
-
524
- ### Added
525
-
526
- - **`prefs.global.outputLanguage`** (enum `en`/`tr`, default `en`) — controls the assistant's NON-INTERACTIVE explanations, status updates, error messages, and pipeline-generated reports rendered to the user. Independent of `promptLanguage`. External payloads (commits, PR/Jira/wiki) and skill-picker UI stay English regardless.
527
- - **`/multi-agent:language` skill** + Copilot mirror — interactive picker, single-token shortcut (`/multi-agent:language en` sets both), per-axis form (`/multi-agent:language prompt en|tr`, `… output en|tr`), bilingual current-state echo, atomic write.
528
- - **Setup Step 0 — two-axis language picker** — `setup.md` now asks `promptLanguage` then `outputLanguage` in sequence, both bilingual until set. The post-setup confirmation echo renders in the just-chosen `outputLanguage`.
529
- - **`pipeline/scripts/smoke-skill-language.sh`** (56 assertions) — fails CI when any `pipeline/skills/shared/core/*/SKILL.md` frontmatter `description` or `argument-hint` field contains Turkish-only characters (`ç ğ ı ö ş ü` + uppercase). The skill picker UI exposed by the host CLI (Claude Code, Copilot CLI) renders these fields, so they must stay English regardless of the user's `outputLanguage`. Wired into `ci-lite.yml`.
530
-
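The check performed by `smoke-skill-language.sh` can be approximated in a few lines of Python. This is a sketch under assumptions: the real gate is a shell script, the frontmatter is modelled here as an already-parsed dict, and the uppercase set (`Ç Ğ İ Ö Ş Ü`) is a guess at what "+ uppercase" covers.

```python
# Lowercase characters come from the changelog entry; the uppercase set is assumed.
TURKISH_CHARS = set("çğıöşüÇĞİÖŞÜ")


def offending_fields(frontmatter, fields=("description", "argument-hint")):
    """Return the host-visible fields that contain any flagged character."""
    return [
        name
        for name in fields
        if TURKISH_CHARS & set(frontmatter.get(name, ""))
    ]
```

An empty result means the skill's picker-visible fields pass; a non-empty list names the fields that would fail the gate.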
531
- ### Changed
532
-
533
- - **`prefs.schema.json`** — `promptLanguage` description rewritten to scope it strictly to interactive prompts; `outputLanguage` added with the explanations/reports/errors scope.
534
- - **`pipeline/preferences-template.json`** — adds `outputLanguage: "en"` next to `promptLanguage`.
535
- - **`pipeline/commands/multi-agent/help.md`** + Copilot mirror — language resolution prefers `outputLanguage` (falls back to `promptLanguage` for v8.1.x prefs files that pre-date the field).
536
- - **`refs/phases/phase-5-test.md`, `refs/phases/phase-6-commit.md`** — removed the silent `default "en"` fallback for `promptLanguage`. If the field is unset, the phase halts with a `/multi-agent:setup` hint. Language is a Step 0 mandatory selection, not a Phase 5/6 fallback.
537
-
538
- ### Fixed
539
-
540
- - **Two SKILL.md argument-hint fields** had Turkish prose (`multi-agent-scan` and `multi-agent-help`) — caught by the new smoke and rewritten in English.
541
-
542
- ### Migration
543
-
544
- Existing prefs files keep working — `outputLanguage` is read with a `// .promptLanguage // "en"` chain. The next `/multi-agent:setup` run (or a manual `/multi-agent:language output <en|tr>`) will fill in the field.
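The fallback chain mentioned above (`// .promptLanguage // "en"`) behaves like jq's alternative operator, which falls through on `null` and `false`. A rough Python equivalent, assuming the prefs file has already been parsed into a dict and using a hypothetical function name:

```python
def resolve_output_language(prefs):
    """Resolve the output language the way the jq-style chain would.

    Hypothetical helper: falls through on missing, None, or False values.
    """
    global_prefs = prefs.get("global") or {}
    for key in ("outputLanguage", "promptLanguage"):
        value = global_prefs.get(key)
        if value is not None and value is not False:
            return value
    return "en"
```

With a v8.1.x prefs file that only sets `promptLanguage`, this returns that value; once `outputLanguage` is written, it takes precedence.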
545
-
546
- ---
547
-
548
- ## [8.1.1] — 2026-05-04
549
-
550
- Patch — finishes v8.1.0's `--dev` redefine: registration loops, the generator, the smoke split, the shared SKILL mirror, and the website simulator now all agree that `--dev` is 5 phases (`0/3/5/6/7`).
551
-
552
- Driven by a real `/multi-agent:dev` run where the Claude Code TaskList tile stack rendered as `0 → 5 → 3 → 6 → 7`. v8.1.0's prose said `--dev` keeps Phase 5, but the `phase-tracker.sh add` loop and `gen-mode-dispatch.mjs` template still emitted the pre-v8.1.0 4-phase set, so agents either skipped Phase 5 or registered it out of order.
553
-
554
- ### Fixed
555
-
556
- - **`pipeline/commands/multi-agent/dev.md`** — registration loops now list `0:Init 3:Dev 5:Test 6:Commit 7:Report` (was `0/3/6/7`). Frontmatter description and intro line updated to "5-phase" for consistency with the body.
557
- - **`pipeline/scripts/gen-mode-dispatch.mjs`** — `dev` mode entry emits the 5-phase set; the smoke's byte-equal check now passes.
558
- - **`pipeline/scripts/smoke-mode-dispatch-drift.sh`** — `MODE_FAST` split into `MODE_FAST_WITH_TEST` (`dev.md`, 5-phase) and `MODE_FAST_NO_TEST` (`dev-autopilot.md`, `dev-local.md`, `dev-local-autopilot.md`, 4-phase). Each group asserts its own phase list.
559
- - **`pipeline/skills/shared/core/multi-agent-dev/SKILL.md`** — pre-v8.1.0 mirror: rewrote prose to match `dev.md` (5 phases, no "Phase 5 atlandığı için" ("because Phase 5 was skipped") fallback prompt at Phase 6).
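The split between the 5-phase and 4-phase groups can be illustrated with a small drift check. This is a sketch under assumptions: the real gate is a shell script, the names `EXPECTED`, `parsePhaseLoop`, and `checkDrift` are hypothetical, and the loop line is assumed to list quoted `N:Name` pairs as in the registration loops.

```javascript
// Expected phase sets per mode file, taken from the entries above.
const WITH_TEST = ["0:Init", "3:Dev", "5:Test", "6:Commit", "7:Report"];
const NO_TEST = ["0:Init", "3:Dev", "6:Commit", "7:Report"];

const EXPECTED = {
  "dev.md": WITH_TEST,
  "dev-autopilot.md": NO_TEST,
  "dev-local.md": NO_TEST,
  "dev-local-autopilot.md": NO_TEST,
};

// Extracts quoted "N:Name" pairs from a registration loop line.
function parsePhaseLoop(line) {
  return [...line.matchAll(/"(\d+:[A-Za-z]+)"/g)].map((m) => m[1]);
}

// Compares the phases a mode file registers against its expected set, in order.
function checkDrift(file, loopLine) {
  const actual = parsePhaseLoop(loopLine);
  const expected = EXPECTED[file];
  return { ok: JSON.stringify(actual) === JSON.stringify(expected), actual, expected };
}

// The pre-fix 4-phase loop in dev.md is reported as drift.
console.log(checkDrift("dev.md", 'for p in "0:Init" "3:Dev" "6:Commit" "7:Report"; do').ok);
```

Comparing ordered lists rather than sets also catches the out-of-order registration that produced the `0 → 5 → 3 → 6 → 7` tile stack.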
560
-
561
- ### Note
562
-
563
- Autopilot and local variants of `--dev` (`dev-autopilot`, `dev-local`, `dev-local-autopilot`) still skip Phase 5 by design — autopilot has no interactive surface, and local mode has no worktree to check out from. Their 4-phase set is intentional.
564
-
565
- Companion website (`mmerterden.dev`) PipelineSimulator scenarios shipped alongside this release: `dev-jira` and `dev-issue` show 5 phases (with Phase 5 typically rendering as `skipped`); `dev-autopilot-*` keeps 4.
566
-
567
- ---
568
-
569
- ## [8.1.0] — 2026-04-28
570
-
571
- Minor — CLI-aware visual tracker contract, `phase-tracker.sh meta` action, and `--dev` mode redefined to skip only LLM-heavy phases (Analysis, Planning, Review) while keeping every audit prompt mandatory.
572
-
573
- Driven by a real `--dev` run on `GH-942-uicomponents`: the agent skipped the account picker, project picker, dev-context picker, maturity check, and Phase 7 channels prompt, interpreting "fast" as "silent." The user flagged each missed prompt explicitly. The fix narrows what `--dev` is allowed to skip and clarifies the visual surface per CLI.
574
-
575
- ### Added
576
-
577
- - **`pipeline/scripts/phase-tracker.sh meta <phase> <key> <value>`** — new action stores arbitrary key/value pairs in the active phase's `meta` object. Render shows the meta block under the active phase's line (Files / Tests / Build / Now). Insertion order preserved via `keys_unsorted` so the pipeline's narrative sequence reads top-to-bottom.
578
- - **STOP-AND-CONFIRM section** in `pipeline/commands/multi-agent/dev.md` — formalizes Phase 0 mandatory user prompts (account, project, dev-context, base branch, branch name, maturity ack), single-option-still-asks rule, no-state-inheritance rule, single-decision-per-prompt rule.
579
- - **Build "pre-existing" rule** in `dev.md` — before claiming a build failure is pre-existing infrastructure, agent must `git stash` + checkout the baseline SHA + reproduce. Skipping this turns "pre-existing" into an unverified assumption.
580
- - **Shared-branch final-confirm rule** in `dev.md` — every direct push to `iteration/develop`, `develop`, `main`, or `master` requires a one-line final confirmation immediately before the push command, even when consent was given earlier in the run.
581
-
582
- ### Changed
583
-
584
- - **`pipeline/commands/multi-agent/refs/tracker-contract.md` rewritten CLI-aware.** Two channels documented: state file (every CLI, identical) + visual surface (per CLI). Claude Code uses `TaskCreate`/`TaskUpdate` for the native sticky widget; Copilot CLI / Cursor / Windsurf / Cline / plain shell / Git Bash / WSL print `phase-tracker.sh render` output as the last tool result. Anti-pattern note added: do not bind a `statusLine` config to phase-tracker render — it lands below the input prompt and looks like a placeholder.
585
- - **`pipeline/commands/multi-agent/dev.md` skip list narrowed.** Was: skip 1, 2, 4, 5. Now: skip **1, 2, 4 only**. Phase 5 (User Test) runs full-pipeline-equivalent. Phase 3 Dev model is **Opus**. The Phase 0 picker, Phase 5 test prompt, Phase 6 commit + shared-branch confirm, and Phase 7 channels prompt all run identically to the full pipeline.
586
- - **`pipeline/scripts/phase-tracker.sh render` simplified.** Progress bar removed (the formula was misleading: it filled relative to `max_elapsed` across phases, so the active phase always read 100%). Output now: glyph + phase name + elapsed + tokens + meta block (active only) + sub-phases. Clean, no magic widths.
587
- - **`pipeline/scripts/phase-tracker.sh` dead code removed.** `progress_bar()` function deleted (no callers after render simplification).
588
-
589
- ### Compatibility
590
-
591
- - No slash-command surface changes. `/multi-agent:dev` still runs; the phase set it consumes is narrower in scope but reaches Commit + Report by the same call shape.
592
- - `phase-tracker.sh` adds the `meta` action and removes the unused `progress_bar` helper. Existing `init` / `add` / `update` / `tokens` / `sub` / `render` calls are untouched.
593
- - Tracker state file schema gains an optional `phases[].meta` object; older state files without it render normally.
594
-
595
- ### Verified
596
-
597
- - `npm test` exit 0 (smoke suites + golden tasks + triage eval + schema validation green).
598
- - Manual smoke: `phase-tracker.sh meta 3 Files "3 modified"` → render shows `Files: 3 modified` line under active Phase 3 in insertion order.
599
- - Manual smoke: `phase-tracker.sh render` no longer references the deleted `progress_bar` function.
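The `meta` behavior verified above (insertion order preserved, older state files still render) can be modeled in a few lines. This is only an in-memory sketch: the real tool is a shell script operating on a JSON state file, and the `id` / `phases` field names and both function names are assumptions; only `phases[].meta` comes from the changelog.

```javascript
// Stores a key/value pair on a phase's optional `meta` object.
// String keys such as "Files" keep insertion order in JavaScript objects.
function setMeta(state, phaseId, key, value) {
  const phase = state.phases.find((p) => p.id === phaseId);
  if (!phase) throw new Error(`unknown phase ${phaseId}`);
  phase.meta = { ...(phase.meta ?? {}), [key]: value };
  return state;
}

// Renders the meta block lines; a phase without `meta` renders nothing.
function renderMeta(phase) {
  return Object.entries(phase.meta ?? {}).map(([key, value]) => `${key}: ${value}`);
}

const state = { phases: [{ id: 0, name: "Init" }, { id: 3, name: "Dev" }] };
setMeta(state, 3, "Files", "3 modified");
setMeta(state, 3, "Tests", "12 passed");
console.log(renderMeta(state.phases[1]).join("\n"));
```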
600
-
601
- ---
602
-
603
- ## [8.0.0] — 2026-04-27
604
-
605
- Major refactor — `install.js` split from a 1246-LOC monolith into focused modules under `install/`, plus four new regression gates (install layout, mode dispatch drift, cross-phase cohesion, Vercel token redaction). No slash-command API changes — every existing invocation continues to work — but the file-layout shift triggers a major bump per the project's versioning policy.
606
-
607
- Driven by two real incidents in the v7.9.1 release window:
608
-
609
- 1. **Vercel token leak.** `vercel deploy --token=vcp_…` failed; the CLI's retry hint printed argv verbatim, leaking the token into the conversation transcript. Required a token rotation.
610
- 2. **Wrong git author identity.** A `git -c user.email=…` override pushed seven v7.7.0 → v7.9.1 commits with the wrong author email, which Vercel's contributor gate then blocked.
611
-
612
- Both were prevented post-incident by feedback memories; v8.0.0 is the architectural answer that prevents them from recurring at all.
613
-
614
- ### Added
615
-
616
- - **`install/` module split.** New modules: `index.mjs` (entry + flag parsing), `claude.mjs`, `copilot.mjs`, `_adapters.mjs`, `_common.mjs`, `_platform-filter.mjs`, `_telemetry.mjs`, `_copilot-instructions.mjs` (static generator extracted), `_dev-only-files.mjs` (single source of truth for excluded files).
617
- - **`install.js` shim.** Now 24 lines: imports `runInstall` from `install/index.mjs`. The bin entry, the package.json `./install` export, and every smoke test that runs `node install.js` continue to work unchanged.
618
- - **`pipeline/lib/vercel-deploy.sh`** — safe Vercel CLI wrapper. Refuses to deploy with `--token=` argv (the original v7.9.1 leak shape), reads token from `VERCEL_TOKEN` env, pipes every stdout/stderr line through a redact filter that scrubs `--token=…`, `vcp_…`, `Bearer …`, `"token":"…"`, and `"VERCEL_TOKEN":"…"` patterns.
619
- - **`pipeline/scripts/smoke-install-layout.sh`** (35 assertions) — install layout regression gate. Runs the live installer into a temp HOME for `--claude`, `--copilot`, `--all`; asserts module presence, shim size, file-count parity, hook registration.
620
- - **`pipeline/scripts/smoke-vercel-deploy-redact.sh`** (12 assertions) — every known token-leak shape has a fixture; the redact filter must scrub it.
621
- - **`pipeline/scripts/gen-mode-dispatch.mjs`** — canonical generator for the "Required: Phase Tracker Contract" section embedded in every mode dispatch file.
622
- - **`pipeline/scripts/smoke-mode-dispatch-drift.sh`** (61 assertions) — verifies each mode file declares the right phase list (4-phase for `dev*`, 8-phase for `local`/`autopilot`) and links `refs/tracker-contract.md`.
623
- - **`pipeline/scripts/smoke-cross-phase-cohesion.sh`** (17 assertions) — Phase 1→2 (`analysis-output`), 2→3 (`planning-output`), 4→4 triage (`reviewer-output`), 4 triage→6 (`triage-output`), 6→7 (`agent-state`) hand-offs documented on both sides.
624
- - **`.github/workflows/ci-lite.yml`** — workflow_dispatch-only fast lane (Ubuntu + Node 22). The full matrix `test.yml` keeps macOS/Windows + credential-store round-trips.
625
- - **`pipeline/commands/multi-agent/refs/channels/{pr,jira,confluence,wiki}.md`** — per-channel adapter contracts split out of `channels.md` (501 → 456 LOC). Each refs file is < 80 LOC, self-contained.
626
- - **`docs/adr/0008-installer-modularization-and-secret-leak-defense.md`** — ADR for this refactor.
627
- - **Recovery playbooks** in `docs/recovery-guide.md` for: worktree corruption, tracker state corruption, token rotation mid-pipeline, failed push + force-with-lease decision tree.
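The redact filter in `vercel-deploy.sh` is described above only by the patterns it scrubs. A sketch of the same idea in JavaScript, assuming one regex per leak shape, could look like the following. The regexes, the `[REDACTED]` placeholder, and the function names are all assumptions, as is the token character set.

```javascript
// One [pattern, replacement] pair per leak shape named in the changelog.
// The --token= rule runs first so the whole argument is scrubbed at once.
const REDACTIONS = [
  [/--token=\S+/g, "--token=[REDACTED]"],
  [/\bvcp_[A-Za-z0-9_-]+/g, "vcp_[REDACTED]"],
  [/\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]"],
  [/"(token|VERCEL_TOKEN)"\s*:\s*"[^"]*"/g, '"$1":"[REDACTED]"'],
];

// Applies every rule to a single output line.
function redact(line) {
  return REDACTIONS.reduce((out, [pattern, replacement]) => out.replace(pattern, replacement), line);
}

// Mirrors the "refuse --token= argv" rule: the token should come from the environment.
function hasTokenArg(argv) {
  return argv.some((arg) => arg.startsWith("--token="));
}

console.log(redact("Retry with: vercel deploy --token=vcp_abc123"));
```

Piping every stdout and stderr line through `redact` is what keeps a CLI retry hint from echoing the secret, even if the argv guard is bypassed.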
628
-
629
- ### Changed
630
-
631
- - `pipeline/scripts/smoke-multi-repo-integration.sh` greps `install/_copilot-instructions.mjs` instead of `install.js` (instruction body moved during the split).
632
- - `pipeline/commands/multi-agent/refs/phases/{phase-2-planning,phase-3-dev,phase-6-commit}.md` gain `#### Input contract` sections that name the schema they consume — closes the cross-phase cohesion gap the new smoke detects.
633
-
634
- ### Compatibility
635
-
636
- - No slash-command surface changes.
637
- - `install.js` retains its CLI contract; existing scripts that invoke it via `npx`, the bin entry, or the package export are unaffected.
638
- - The `./install` package export still resolves to `install.js`.
639
- - `package.json` adds `install/**/*` to `files` — npm publish ships the new modules.
640
-
641
- ---
642
-
643
- ## [7.9.1] — 2026-04-27
644
-
645
- Patch — fixes a regression that had been hiding since v6.0: 8 mode dispatch files (`dev.md`, `dev-local.md`, `autopilot.md`, etc.) had **zero references to phase-tracker.sh**. They told the agent to "see refs/phases.md" but never inlined the actual `init` / `update` / `tokens` / `render` calls. Result: agents fell through to Claude Code's TaskList UI, which has no elapsed time, no token count, no cost — users saw only static checkboxes for the entire pipeline run.
646
-
647
- Discovered when a real run on `dev-local` mode produced `Sleep Timer pipeline — dev-local · ☑ Phase 0 · ■ Phase 3 · ☐ Phase 6` with **no progress info whatsoever**, while `~/.claude/logs/multi-agent/.tracker-current` was still pointing at a 7-day-old task. The tracker had never been initialized.
648
-
649
- ### Added
650
-
651
- - **`pipeline/commands/multi-agent/refs/tracker-contract.md`** (new shared ref) — canonical specification of the phase-tracker invocation pattern: `init` → `add` per phase → `update <N> in_progress|completed|failed|skipped` at every boundary → `tokens <N> <in> <out>` after every LLM call → `render` for user-facing output. Includes anti-pattern guard ("never fall through to TaskCreate/TaskList for phase-level state — elapsed and tokens vanish") and cross-references to `refs/phases.md` (canonical contract) + `pipeline/scripts/phase-tracker.sh` (impl) + `pipeline/scripts/run-aggregator.mjs` (post-hoc).
652
- - **`pipeline/scripts/smoke-tracker-contract.sh`** (new, 34 assertions) — regression gate. Enforces: contract file exists; contract documents all 5 invocation verbs (init/add/update/tokens/render); each of 7 mode files references phase-tracker ≥ 4 times; each mode file links to `tracker-contract.md`; each mode file mentions the TaskCreate anti-pattern; each mode file inlines the `phase-tracker.sh init` call (refs-only is not enough — agents skip refs).
653
-
654
- ### Changed
655
-
656
- - **7 mode dispatch files** gain a "## Required: Phase Tracker Contract" section with inline `phase-tracker.sh` calls + link to `refs/tracker-contract.md` + anti-pattern guard:
657
- - Dev modes (Init/Dev/Commit/Report phase set): `dev.md`, `dev-local.md`, `dev-autopilot.md`, `dev-local-autopilot.md`
658
- - Full pipeline modes (8-phase): `autopilot.md`, `local.md`, `local-autopilot.md`
659
- - Each section is mode-aware — dev modes inline a 4-phase `for p in "0:Init" "3:Dev" "6:Commit" "7:Report"` loop; full-pipeline modes inline an 8-phase loop.
660
-
661
- ### Verified
662
-
663
- - 62 smoke suites total (61 v7.9.0 + 1 new `smoke-tracker-contract.sh` 34 assertions), exit 0.
664
- - Each of 7 mode files now has 6 phase-tracker references (was 0).
665
- - New runs on any of these modes will initialize the tracker correctly; users will see the rich `╭─ Pipeline ─╮ │ ✓ Phase 0 [████] 4m 48s · 2.2k tok` rendering in their console.
666
-
667
- ### Why patch and not minor
668
-
669
- Pure bug fix; no new commands, no new flags, no new features. The user-facing contract was already documented in `refs/phases.md` since v3.x — this just enforces that mode dispatch files honor it.
670
-
671
- ### Note on already-running tasks
672
-
673
- This patch fixes **future** runs. Tasks already in flight (like the one that triggered the bug report) cannot be retroactively given a tracker — the state was never created. Restart the task to get the rich rendering.
674
-
675
- ---
676
-
677
- ## [7.9.0] — 2026-04-27
678
-
679
- Minor — Tier 3 third-party AI tool adapters. Three new install targets in addition to v7.7's Cursor/Windsurf/Cline. With v7.9.0 the pipeline's knowledge layer ports to **all 6 of the most-used non-orchestration AI tools**.
680
-
681
- ### Added — three new adapters
682
-
683
- | Tool | Flag | Output | Format |
684
- |---|---|---|---|
685
- | **GitHub Copilot Chat** | `--copilot-chat` | `.github/copilot-instructions.md` (marker-wrapped) + `.github/instructions/multi-agent-*.instructions.md` | Repo-scoped per-skill instructions with `applyTo` glob frontmatter |
686
- | **Continue.dev** | `--continue` | `.continue/rules/multi-agent-*.md` | Per-skill rules with name + description + globs frontmatter |
687
- | **Zed AI** | `--zed` | `.rules` (single, marker-wrapped) | Concatenated knowledge digest |
688
-
689
- All three are **knowledge-layer-only** — orchestration commands (the 26 `multi-agent-*` core skills) are filtered out because none of these tools have subagent dispatch.
690
-
691
- ### Changed
692
-
693
- - **`install.js`** flag set: `TOOL_FLAGS` extended from 5 → 8 (`--copilot-chat`, `--continue`, `--zed` added). Backward compat unchanged: with no flag, it installs to Claude Code.
694
- - **`uninstall.mjs`** updated for token-preserving removal of all 3 new adapters. Static no-touch contract still holds (zero credential-store deletion API references).
695
- - **`index.js` help** mentions all 6 adapter flags; `--all-tools` now installs Tier 1 + Tier 2 + Tier 3 in one call.
696
-
697
- ### New smoke
698
-
699
- - **`smoke-adapters-tier3.sh`** (30 assertions): per-adapter round-trip, marker preservation (Copilot Chat main + Zed `.rules`), per-skill user file preservation (Copilot Chat instructions / Continue rules), `--all-tools` dispatches all 6 adapter targets, selective uninstall (`--zed` doesn't touch Tier 2), token-preservation static check on Tier 3 adapter modules, uninstall.mjs flag handlers covered.
700
-
701
- ### Total tool support — Tier 1 + Tier 2 + Tier 3
702
-
703
- | Tier | Tools | Surface |
704
- |---|---|---|
705
- | **Tier 1** (full pipeline) | Claude Code, Copilot CLI | 8-phase orchestration + knowledge |
706
- | **Tier 2** (knowledge layer, v7.7) | Cursor, Windsurf, Cline | Per-skill rules / single rules file |
707
- | **Tier 3** (knowledge layer, v7.9) | GitHub Copilot Chat, Continue.dev, Zed AI | Per-skill instructions / single rules file |
708
-
709
- 8 install flags total. `--all-tools` covers all 6 (Tier 2 + Tier 3) in one call; Tier 1 stays opt-in by default to keep `$HOME` modifications behind explicit consent.
710
-
711
- ### Verified
712
-
713
- - 61 smoke suites total (60 v7.8.0 + 1 new), exit 0.
714
- - Round-trip per Tier 3 adapter: 161 Continue rules · 161 Copilot Chat instructions · 1 Zed `.rules` block — all install and uninstall cleanly.
715
- - `--all-tools` test plants user-authored content in 4 of the 6 outputs; uninstall preserves all of it.
716
- - Token preservation static check now scans 6 adapter modules + uninstall.mjs.
717
-
718
- ### Not breaking
719
-
720
- No removed flags, no behavior flips. v7.8.0 → v7.9.0 is pure addition.
721
-
722
- ---
723
-
724
- ## [7.8.0] — 2026-04-27
725
-
726
- Minor — four post-v7.7 internal improvements bundled as one minor. None of these change the slash-command surface; they harden internal contracts and add post-hoc analysis tooling.
727
-
728
- ### Added — Package B: `/multi-agent:diff-explain` (post-hoc Phase 4 → diff bridge)
729
-
730
- Read-only command that maps Phase 4 triage findings to the actual `git diff`. For each accepted / deferred / rejected finding, the script locates the corresponding hunk in the branch's diff against base and renders an annotated markdown report with severity icons (🚫 blocking, ⚠️ important, 💡 suggestion) and bucket icons (✅ accepted, ⏭️ deferred, ❌ rejected).
731
-
732
- - **`pipeline/scripts/diff-explain.mjs`** — zero-dep ES module. Inputs: `--triage <path>`, `--state <path>`, or `--task-id`. Diff source: `--branch <name>` + `--base <name>`, or pre-computed `--diff <path>` for CI/sandbox.
733
- - **`/multi-agent:diff-explain`** slash command + **`multi-agent-diff-explain`** Copilot peer with byte-eq description (parity smoke covers this, now 55 passing).
734
- - **`smoke-diff-explain.sh`** (15 assertions): bucket/severity icons, hunk binding via synthetic diff, out-of-diff fallback, --help, missing-input error, **read-only static check** (no `git apply`/`checkout`/`reset`/`writeFileSync`).
735
-
736
- Use cases: between Phase 4 and Phase 5 ("which finding maps to which code?"), Phase 7 wiki content, post-hoc audit on closed PRs.
737
-
738
- ### Added — Package C: Unified telemetry aggregator (`pipeline-run.jsonl`)
739
-
740
- Combines three independent telemetry sources for a single task into one normalized JSONL stream that downstream consumers can ingest without re-stitching:
741
-
742
- - `phase-tracker.json` (always present)
743
- - `otel-spans.jsonl` (opt-in via `MULTI_AGENT_OTEL_SPANS=1`, v6.1.C)
744
- - `cost-table.json` (per-model pricing, v6.1.J)
745
-
746
- **`pipeline/scripts/run-aggregator.mjs`** — produces `pipeline-run.jsonl` with event kinds: `run.start`, `phase.start`/`phase.complete`/`phase.failed`, `subphase.transition`, `span` (passthrough), `summary` (last event with totals). Time-sorted. Cost computed per phase using model-aware pricing; unknown models are gracefully marked with `cost_unknown_model: true` rather than crashing.
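The per-phase cost rule, including the graceful handling of unknown models, can be sketched as follows. The price table, its field names (`inputPerMTok`, `outputPerMTok`), the phase shape, and the model name `model-a` are all hypothetical; real prices live in `cost-table.json`, whose layout is not shown here.

```javascript
// Hypothetical prices in USD per million tokens, for illustration only.
const COST_TABLE = {
  "model-a": { inputPerMTok: 3, outputPerMTok: 15 },
};

// Computes the cost of one phase, or marks it instead of throwing
// when the model is missing from the table.
function phaseCost(phase, table = COST_TABLE) {
  const price = table[phase.model];
  if (!price) return { phase: phase.name, cost_unknown_model: true };
  const usd = (phase.tokensIn * price.inputPerMTok + phase.tokensOut * price.outputPerMTok) / 1e6;
  return { phase: phase.name, cost_usd: usd };
}

console.log(phaseCost({ name: "Dev", model: "model-a", tokensIn: 1000000, tokensOut: 200000 }));
console.log(phaseCost({ name: "Review", model: "not-in-table", tokensIn: 10, tokensOut: 10 }));
```

Returning a flag rather than a zero cost keeps the summary honest: a total that silently omitted an unpriced model would look cheaper than the run really was.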
747
-
748
- **`smoke-run-aggregator.sh`** (17 assertions): output produced, source files unchanged (read-only contract), time-sorted events, correct cost computation ($1.041 expected for the fixture, matched to the cent), summary as last event, span passthrough with task_id stamping, --summary-only mode, unknown-model graceful degradation, missing input error, idempotency.
749
-
750
- ### Added — Package A: Opt-in live golden-task runner (cost-guarded)
751
-
752
- Optional live evaluation harness for golden-task fixtures. The always-on contract harness (`eval-golden-tasks.mjs`) only validates schema shape; this new runner optionally invokes a real model via the local `claude` CLI and compares actual output against the fixture's expected scope.
753
-
754
- **Default behavior is dry-run.** Live mode requires BOTH `--live` flag AND `MULTI_AGENT_LIVE_EVAL=1` in env (double cost gate). Static check enforces this in smoke: the env-check is on line 13, the `claude` invocation is on line 144 — so any future refactor that puts the call before the gate fails the build.
755
-
756
- - **`pipeline/scripts/eval-golden-tasks-live.mjs`** — zero-dep, shells out to `claude -p` (honors ADR-4 and reuses the user's existing CLI auth).
757
- - **Cost guard:** `--budget=<usd>` per-case cap (default $1), `--max-cases=<N>` run cap (default 1), `--total-budget=<usd>` ceiling (default budget × max-cases).
758
- - **Exit codes:** 0 pass · 1 mismatch · 2 setup error · 3 budget exhausted.
759
- - **`smoke-eval-live.sh`** (12 assertions): dry-run is the default, no model invocation in dry-run, --live without env exits 2 with cost-gate message, gate-before-call static line ordering, budget cap, --max-cases cap, --case selector, --help mentions the cost gate, invalid --budget exits 2.
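The double cost gate and the budget defaults listed above can be sketched like this. It is a simplified model, not the runner itself: the function names and the returned object shapes are assumptions, and flag parsing is reduced to the `--flag=value` form.

```javascript
// Live mode needs BOTH the --live flag and MULTI_AGENT_LIVE_EVAL=1.
// Anything short of that is a dry run, or a setup error (exit 2) if --live was requested.
function resolveRunMode(argv, env) {
  if (!argv.includes("--live")) return { mode: "dry-run" };
  if (env.MULTI_AGENT_LIVE_EVAL !== "1") {
    return { mode: "blocked", exitCode: 2, message: "cost gate: set MULTI_AGENT_LIVE_EVAL=1 to run live" };
  }
  return { mode: "live" };
}

// Reads the three budget flags with the defaults from the changelog:
// $1 per case, 1 case, and a total ceiling of budget x max-cases.
function resolveBudget(argv) {
  const read = (flag, fallback) => {
    const hit = argv.find((arg) => arg.startsWith(`${flag}=`));
    if (!hit) return fallback;
    const value = Number(hit.slice(flag.length + 1));
    // A real CLI would map this to exit code 2 (setup error).
    if (!Number.isFinite(value) || value <= 0) throw new Error(`invalid ${flag}`);
    return value;
  };
  const budget = read("--budget", 1);
  const maxCases = read("--max-cases", 1);
  return { budget, maxCases, totalBudget: read("--total-budget", budget * maxCases) };
}

console.log(resolveRunMode(["--live"], {}));
console.log(resolveBudget(["--budget=2", "--max-cases=3"]));
```

Checking the gate before any model call is the property the static line-ordering check protects.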
760
-
761
- ### Added — Package D: Per-step migration round-trip tests
762
-
763
- `test/migrate-prefs.test.mjs` gains 7 new assertions exercising each `pipeline/schemas/migrations/prefs-X.X.X-to-Y.Y.Y.mjs` module in isolation:
764
-
765
- - 2.0.0 → 2.1.0 step: schemaVersion bump + gitIdentities → identities rename
766
- - 2.1.0 → 2.2.0 step: backfills reportChannels + reportContent + wikiScope
767
- - 2.2.0 → 2.3.0 step: initializes recentAccounts + accounts arrays
768
- - Full chain 2.0.0 → 2.3.0 produces a self-consistent v2.3.0 document
769
- - Idempotency: re-applying 2.1.0 → 2.2.0 preserves user-customized values
770
- - Non-destructive: unknown top-level keys survive every step
771
- - Idempotency on v2.3.0 input: 2.2.0 → 2.3.0 is no-op when already v2.3.0
772
-
773
- A future migration that breaks one step now fails the unit suite even if the wrapper script's end-to-end behavior masks the regression.
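The properties these tests exercise (stepwise chaining, idempotency, preservation of unknown keys) can be illustrated with a toy migration chain. The step bodies below are assumptions written to match the bullet descriptions: the default values are placeholders, and the document is flattened for brevity rather than mirroring the real prefs nesting.

```javascript
// Each step spreads defaults first and the document last, so user values win.
const STEPS = [
  {
    from: "2.0.0",
    apply(doc) {
      const { gitIdentities, ...rest } = doc; // rename gitIdentities -> identities
      return { ...rest, identities: rest.identities ?? gitIdentities ?? [], schemaVersion: "2.1.0" };
    },
  },
  {
    from: "2.1.0",
    apply(doc) {
      // Placeholder defaults; the real backfill values are not stated in the changelog.
      return { reportChannels: [], reportContent: {}, wikiScope: "placeholder", ...doc, schemaVersion: "2.2.0" };
    },
  },
  {
    from: "2.2.0",
    apply(doc) {
      return { recentAccounts: [], accounts: [], ...doc, schemaVersion: "2.3.0" };
    },
  },
];

// Runs every step whose `from` matches the current version; a v2.3.0 input is returned as-is.
function migrate(doc) {
  let current = doc;
  for (const step of STEPS) {
    if (current.schemaVersion === step.from) current = step.apply(current);
  }
  return current;
}

console.log(migrate({ schemaVersion: "2.0.0", gitIdentities: [{ name: "work" }], customKey: 1 }));
```

Testing each `apply` in isolation is what catches a broken middle step that a later step would otherwise paper over.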
774
-
775
- ### Verified
776
-
777
- - 60 smoke suites total (56 v7.7.0 + 4 new: `smoke-diff-explain.sh` 15 · `smoke-run-aggregator.sh` 17 · `smoke-eval-live.sh` 12 · plus migration round-trip in unit suite), exit 0.
778
- - Unit tests: 19 pass (was 12, +7 round-trip).
779
- - `eval-triage.mjs` 10/10 · `eval-golden-tasks.mjs` 2/2 · `validate-schemas.mjs` 7/7.
780
- - Aggregator cost computation matches expected pricing to the cent ($1.041 on the fixture).
781
- - Live runner static check: env gate (line 13) precedes `claude` call (line 144) — refactor-resistant.
782
-
783
- ### Not breaking
784
-
785
- No slash-command renames, no removed flags, no default behavior flips. v7.7.0 → v7.8.0 is a pure addition.
786
-
787
- ---
788
-
789
- ## [7.7.0] — 2026-04-27
790
-
791
- Minor — third-party AI tool adapters + token-preserving uninstall. The 8-phase pipeline orchestration still requires Claude Code or Copilot CLI (subagent dispatch is a hard prerequisite). What ports cleanly is the knowledge layer — rules tree + skills catalog — and v7.7.0 ships native installers for the three most-used non-orchestration tools. Plus, every install path now has a documented opposite that respects credential storage.
792
-
793
- ### Added
794
-
795
- - **Cursor adapter** (`pipeline/adapters/cursor.mjs`). `npx ... install --cursor [--target=<path>]` emits `.cursor/rules/multi-agent-*.mdc` files (modern Cursor 2025+ format with `description` / `globs` / `alwaysApply` frontmatter) plus a `.cursorrules` legacy fallback for older Cursor versions. Glob inference: Swift skills get `**/*.swift`, Kotlin skills `**/*.{kt,kts}`, web skills `**/*.{tsx,ts,jsx,js}`, etc. Orchestration skills (the 26 `multi-agent-*` core skills) are filtered out — they have no callable surface in Cursor.
796
- - **Windsurf adapter** (`pipeline/adapters/windsurf.mjs`). `npx ... install --windsurf [--target=<path>]` writes `.windsurfrules` as a single concatenated rules document, wrapped in `<!-- multi-agent-pipeline:begin -->` / `<!-- multi-agent-pipeline:end -->` markers. User-authored rules above or below the markers survive uninstall untouched.
797
- - **Cline adapter** (`pipeline/adapters/cline.mjs`). `npx ... install --cline [--target=<path>]` writes per-skill files to `.clinerules/multi-agent-*.md` (modern Cline 2.x format). User-authored rules in the same directory are preserved by name-prefix scoping.
798
- - **`--all-tools` flag** — installs Claude + Copilot + Cursor + Windsurf + Cline in one call.
799
- - **`--target=<path>` flag** — overrides the adapter destination. Default is `process.cwd()` because Cursor / Windsurf / Cline rules are repo-scoped, not user-scoped. Ignored for Claude / Copilot (those are HOME-scoped).
800
- - **Token-preserving uninstall** (`pipeline/scripts/uninstall.mjs`). `npx ... uninstall` removes every installed surface (Claude Code commands / skills / agents / scripts / rules + Copilot equivalents + adapter targets) without touching personal access tokens in macOS Keychain, Windows Credential Manager, or Linux libsecret. Static check enforces this in `smoke-delete-flow.sh`: the script must contain zero references to credential-store deletion APIs.
801
- - **`/multi-agent:delete` slash command** — Claude Code interactive entry point. Dual confirmation gate (`y/N` then `delete` typed verbatim) mirrors `:purge`. Dry-run preview is shown before either confirmation.
802
- - **`multi-agent-delete` skill** — Copilot CLI peer with byte-eq description for parity smoke. Wired into `smoke-commands-skills-parity` (now 54 passing).
803
- - **`--dry-run`** flag — reports what would be removed, modifies nothing. Safe to run anywhere.
804
- - **`--yes` / `-y`** — skip prompts (used internally by the slash command after both confirmations pass).
805
- - **Selective targets** — `--claude / --copilot / --cursor / --windsurf / --cline` each work independently. Default (no flag) removes from every installed target.
806
- - **Adapter base module** (`pipeline/adapters/_base.mjs`) — shared zero-dep helpers: YAML-ish frontmatter parser, skill walker, glob inference, marker-wrapped block read/replace/remove, orchestration-skill filter. Adapters share these primitives so a future fourth adapter (Continue, Zed AI, Roo Code) is mostly format-mapping.
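The marker-wrapped block handling that several adapters rely on can be sketched as three small string helpers. The marker strings are the ones quoted above for Windsurf; the function names and the exact spacing rules are assumptions, not the `_base.mjs` implementation.

```javascript
const BEGIN = "<!-- multi-agent-pipeline:begin -->";
const END = "<!-- multi-agent-pipeline:end -->";

// Locates the generated block, or returns null when no complete block exists.
function findBlock(text) {
  const start = text.indexOf(BEGIN);
  if (start === -1) return null;
  const end = text.indexOf(END, start + BEGIN.length);
  return end === -1 ? null : { start, end: end + END.length };
}

// Replaces the existing block in place, or appends one after the user's content.
function upsertBlock(text, body) {
  const block = `${BEGIN}\n${body}\n${END}`;
  const found = findBlock(text);
  if (found) return text.slice(0, found.start) + block + text.slice(found.end);
  return text ? `${text.replace(/\n*$/, "\n\n")}${block}\n` : `${block}\n`;
}

// Removes only the generated block; everything outside the markers is left as-is.
function removeBlock(text) {
  const found = findBlock(text);
  return found ? text.slice(0, found.start) + text.slice(found.end) : text;
}

const installed = upsertBlock("# my own rules\n", "generated rules");
console.log(removeBlock(installed).includes("# my own rules"));
```

Because install replaces the block in place, running it twice yields the same file, which is the idempotency property the adapter smoke checks.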
807
-
808
- ### New smoke suites
809
-
810
- - **`smoke-adapters.sh`** (19 assertions) — round-trip integrity for cursor / windsurf / cline. Install + uninstall + idempotency check + platform filter (`--platform=ios` vs `--platform=android` narrows the skill set) + backward-compat default behavior (no flags still installs Claude).
811
- - **`smoke-delete-flow.sh`** (12 assertions) — slash command + Copilot peer existence, `--dry-run` zero-side-effect contract, real uninstall completeness, marker-wrapped user-content survival, selective target isolation, CLI router (`uninstall` and `delete` aliases), and the credential-store no-touch static check.
812
-
813
- ### Changed
814
-
815
- - **`install.js`** — explicit-target detection. When any tool flag (`--cursor`, `--windsurf`, `--cline`, `--copilot`, `--claude`) or `--all` / `--all-tools` is set, only those targets install. Default behavior (no flag) still installs Claude Code, exactly as before.
816
- - **`index.js`** — gains `uninstall` and `delete` (alias) routes, plus expanded `help` output.
817
- - **`package.json`** — version `7.6.0` → `7.7.0`, description rewritten to mention adapter portability + token-preserving uninstall, keywords add `cursor` / `windsurf` / `cline`.
818
-
819
- ### Why
820
-
821
- User asked: "will it support all AIs?" The honest answer is no — subagent dispatch is a hard prerequisite for the 8-phase pipeline, and only Claude Code + Copilot CLI expose it. But the knowledge layer (193 SKILL.md + 12 rules) ports cleanly. v7.7.0 ships that port for the three most-used non-orchestration tools, with a native uninstall that respects the credential store on every supported OS.
822
-
823
- ### Verified
824
-
825
- - 56 smoke suites total (incl. 2 new: `smoke-adapters.sh` 19 assertions + `smoke-delete-flow.sh` 12 assertions), exit 0.
826
- - Round-trip cursor: 174 install / 174 remove / 0 leak. Windsurf user-content preservation across uninstall: PASS. Cline selective uninstall: cursor tree intact (173 files), cline cleaned (0 multi-agent-*).
827
- - Token preservation static check: zero references to `security delete-generic-password`, `cmdkey /delete`, `secret-tool clear`, or `credential-store.sh delete` in the uninstaller.
828
- - Backward compat: `node install.js` (no flags) still produces a Claude-only install identical to v7.6.0.
829
-
830
- ---
831
-
832
- ## [7.1.1] — 2026-04-22
833
-
834
- Patch — UX refinement on v7.1.0 Work Summary requested during post-ship demo. Previous flow made the user pick channels + content FIRST, then rendered Work Summary in Step 5. New flow flips it: Work Summary previews at the TOP of the menu, then the user picks where to route it.
835
-
836
- ### Changed
837
-
838
- - **`/multi-agent:channels` menu — preview-first flow** (`channels.md` Step 3a, new). When `agent-state.json` is available AND Work Summary can be generated, `render-work-summary.sh` runs BEFORE the channel picker. The user sees the executive "what did this ship?" block inline, then makes the channel + content selections. The `workSummary` content row in Step 3b is auto-ticked when the preview was shown, which matches the 80% case (the user just saw it and wants to send it somewhere). The 20% case (preview was enough, no remote post needed) takes one keystroke to uncheck.
839
- - **Greyed-out rules** updated: the Work Summary ("Yapılan iş özeti") row notes auto-tick behavior when preview was shown.
840
- - **`smoke-work-summary.sh`** grows to 11 assertions — new assertion verifies `channels.md` documents the Step 3a summary-first flow + auto-tick contract.
841
-
842
- ### Why
843
-
844
- User asked directly during review: "it will show the work summary first and then have me choose Jira / Confluence / wiki, right?" — previous flow did not work that way. This patch makes the mental model match the implementation: see outcome first → decide routing → confirm.
845
-
846
- ### Verified
847
-
848
- - 50 smoke suites (unchanged count — `smoke-work-summary.sh` assertion count 10 → 11), ~881 total assertions, exit 0.
849
- - `channels.md` Step 3a section added; Step 3 renamed to 3b for clarity.
850
-
851
- ---
852
-
853
- ## [7.1.0] — 2026-04-22
854
-
855
- Minor — one new Phase 7 content option requested directly by the user: a structured "Work Summary" executive block for the multi-agent pipeline report. No changes to existing behavior; the new option is opt-in and default off so baseline PR body / Jira / Confluence output stays unchanged.
856
-
857
- ### Added
858
-
859
- - **Phase 7 Work Summary** (`prefs.global.reportContent.workSummary`, default `false`). `/multi-agent:channels` multi-select menu gains a new "Yapılan iş özeti / Work Summary" content option. When selected, appends a `### Work Summary` block that distills the whole pipeline run into a single-screen executive summary:
860
- - **Task header** — task id, branch, base branch, PR number
861
- - **Scope delivered** — Phase 2 task list with `✅` (done) or `⏳` (pending/deferred) marks
862
- - **Changed files** — per-file +/- counts from `git diff --numstat base...HEAD`, capped at 20 rows with `_... +N more files not shown_` footer
863
- - **Review outcome** — `N accepted · M deferred · K rejected · approved={bool}` from Phase 4 triage (suppressed when all buckets empty, normal for `--dev` runs)
864
- - **Phase tick strip** — single-line `0 Init ✅ · 1 Analysis ✅ · … · 7 Report ▶` from `phase-tracker.json` (marks: `✅` completed · `▶` in_progress · `❌` failed · `⏭` skipped · `·` pending)
865
- - **New script** `pipeline/scripts/render-work-summary.sh <task-id> [--worktree <path>] [--branch <name>] [--base-branch <name>]`. Reads `agent-state.json` + `phase-tracker.json` + `git diff --numstat`. Exit 2 when insufficient state (caller should skip section). Supports post-hoc invocation with explicit flags when state file has been archived.
866
- - **New smoke suite** `smoke-work-summary.sh` (10 assertions) — state-missing exit code, header rendering, scope ✅/⏳ marks, review outcome line shape, phase tick marks, outcome-suppression-when-empty path, post-hoc flags, prefs schema exposure, channels.md doc presence.
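Two of the rendering rules above, the phase tick strip and the 20-row cap on changed files, are simple enough to sketch. The status-to-mark mapping and the footer text come from the entry; the function names and the phase object shape (`id`, `name`, `status`) are assumptions, and the shipped renderer is a shell script.

```javascript
// Marks per phase status, as listed in the changelog entry.
const MARKS = { completed: "✅", in_progress: "▶", failed: "❌", skipped: "⏭", pending: "·" };

// Renders the single-line strip; unknown statuses fall back to the pending mark.
function renderTickStrip(phases) {
  return phases.map((p) => `${p.id} ${p.name} ${MARKS[p.status] ?? MARKS.pending}`).join(" · ");
}

// Caps the changed-files table and appends the "more files" footer when rows were dropped.
function capRows(rows, limit = 20) {
  if (rows.length <= limit) return rows;
  return [...rows.slice(0, limit), `_... +${rows.length - limit} more files not shown_`];
}

console.log(renderTickStrip([
  { id: 0, name: "Init", status: "completed" },
  { id: 5, name: "Test", status: "skipped" },
  { id: 7, name: "Report", status: "in_progress" },
]));
```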
867
-
868
- ### Changed
869
-
870
- - **`pipeline/schemas/prefs.schema.json`** — `reportContent.workSummary` (boolean, default `false`) added to v2.2.0 schema. Schema stays additive.
871
- - **`pipeline/commands/multi-agent/channels.md`** — menu example gains the new row, greyed-out rules document the `"(no agent-state.json for {taskId})"` fallback, generator table gains the new row, output template chain adds `### Work Summary` slot right above `### Cost Summary`, new "Work summary generation (v7.1.0+)" doc subsection with the full output template example.
872
-
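The changed-files rows in the Work Summary template come from `git diff --numstat`, capped at 20 with a footer. A rough sketch of that capping logic is below; the row format (`path +added -deleted`) is an assumption for illustration, and only the 20-row cap and the footer text come from this entry.

```javascript
// Parse `git diff --numstat` text. Each line is "<added>\t<deleted>\t<path>";
// git reports "-" for both counts on binary files, so counts stay strings.
function parseNumstat(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const [added, deleted, ...rest] = line.split("\t");
      return { path: rest.join("\t"), added, deleted };
    });
}

// Render at most `cap` rows, then a footer counting the hidden files.
function renderChangedFiles(rows, cap = 20) {
  const lines = rows
    .slice(0, cap)
    .map((r) => `${r.path} +${r.added} -${r.deleted}`);
  const hidden = rows.length - cap;
  if (hidden > 0) lines.push(`_... +${hidden} more files not shown_`);
  return lines;
}

const sample = parseNumstat("3\t1\tsrc/a.js\n-\t-\tassets/logo.png\n");
console.log(renderChangedFiles(sample));
```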
873
- ### Verified
874
-
875
- - 50 smoke suites (49 v7.0.0 + 1 new `smoke-work-summary.sh`), ~880 assertions, exit 0.
876
- - Full regression: `eval-triage` 10/10, `eval-golden-tasks` 2/2, 7 schemas valid.
877
- - Manual fixture render: PROJ-2001 mixed-case (2 done + 1 deferred · 1 accepted + 1 rejected · 5 skipped + 7 in_progress) renders cleanly.
878
-
879
- ### Why this shipped
880
-
881
- User asked for a "work done report" in the Phase 7 section — the previous `reportContent` options (normal analysis / technical details / test scenarios / auto-diff / manual note / cost summary) covered *what was analyzed* and *how much it cost* but not *what was actually shipped*. Work Summary fills the gap with a single executive-readable block that answers "what did this run change?" without requiring the reader to parse the full agent-log.
882
-
883
- ---
884
-
885
- ## [7.0.0] — 2026-04-22
886
-
887
- Major — first major bump since v6.0.0. Three net-new capabilities that change the autopilot trust surface and context-loading default, plus a new `/multi-agent:local-autopilot` command that closes an obvious gap in the mode grid. No breaking changes to existing behavior — everything new is opt-in or opt-out with sane defaults.
888
-
889
- ### Added
890
-
891
- - **v7.0.G — Autopilot Phase 2 safety classifier** (`prefs.global.autopilotSafetyGate`, default `true` — opt-out). `pipeline/scripts/classify-plan-safety.mjs` reads the approved Phase 2 plan and scores it against six heuristics, with file count split into a high and a medium tier (file-count-high +30 / file-count-medium +10 / destructive-verb +25 / security-path +35 / delete-without-test +30 / schema-migration +25 / infrastructure +20). A score ≥ 50 sets `recommendPause: true`, and the orchestrator then injects a one-time manual approval prompt even in autopilot mode. Phase 2 doc Step 5c wires it in. Telemetry: `phase.plan.safety` OTel span. Smoke: `smoke-plan-safety.sh` (9 assertions). Default-on because the asymmetry favors pausing: a pause on a high-blast-radius plan costs seconds; a silent auto-merge costs hours of rollback.
892
- - **v7.0.I — Dynamic trigger-based skill loading** (`prefs.global.dynamicSkillLoading`, default `false` — opt-in). `pipeline/scripts/build-skills-index.mjs` walks all SKILL.md files and emits `.skills-index.json` (193 entries) + `skills-index.md`. `pipeline/scripts/match-skills.mjs` scores each skill against a task description + touched files + stack using four additive signals (`trigger-paths` globs, `trigger-keywords`, description keywords, platform match). Orchestrator reads matched top-N and injects only those into subagent prompts instead of relying on eager-load. Large token win on small tasks; no visible change on big ones. `install.js --index-only` ships just the index (no 193 SKILL.md bodies) for CI / disk-constrained installs. Smoke: `smoke-dynamic-skill-loading.sh` (8 assertions).
893
- - **`/multi-agent:local-autopilot`** — closes the obvious gap in the mode grid: full 8-phase pipeline + no worktree + no confirmations. Parallel to the existing `/multi-agent:autopilot` (full + worktree + autopilot), `/multi-agent:local` (full + no worktree + interactive), and `/multi-agent:dev-local-autopilot` (fast + no worktree + autopilot). Copilot peer: `multi-agent-local-autopilot`. Routing + help + sync inventory updated.
894
-
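The additive scoring in the safety classifier can be illustrated with a short sketch. The weights and the threshold of 50 are taken from the entry above; everything else is an assumption. In particular, this sketch takes the list of fired heuristic names as input and does not attempt to reproduce how `classify-plan-safety.mjs` detects them.

```javascript
// Weights copied from the changelog entry above.
const WEIGHTS = {
  "file-count-high": 30,
  "file-count-medium": 10,
  "destructive-verb": 25,
  "security-path": 35,
  "delete-without-test": 30,
  "schema-migration": 25,
  infrastructure: 20,
};
const PAUSE_THRESHOLD = 50;

// `signals` is a hypothetical input: the names of the heuristics that fired.
// Unknown names are ignored rather than treated as errors.
function classifyPlanSafety(signals) {
  const fired = signals.filter((name) => Object.hasOwn(WEIGHTS, name));
  const score = fired.reduce((sum, name) => sum + WEIGHTS[name], 0);
  return { score, fired, recommendPause: score >= PAUSE_THRESHOLD };
}

console.log(classifyPlanSafety(["security-path", "infrastructure"]));
```

With these weights, no single heuristic reaches the threshold on its own; a pause needs at least two signals to coincide.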
895
- ### Changed
896
-
897
- - **Command inventory** bumped 25 → 26 everywhere (`refs/cross-cli-contract.md` § 1, `sync.md`, `multi-agent-sync/SKILL.md`, `help.md` EN + TR, README fact sheet). Full list: `autopilot, channels, dev, dev-autopilot, dev-local, dev-local-autopilot, help, issue, jira, kill, local, local-autopilot, log, manual-test, purge, refactor, resume, review, scan, search, setup, stack, status, sync, test, update`.
898
- - **Skill count** bumped 192 → 193 — `multi-agent-local-autopilot/SKILL.md` added as the Copilot peer to the new Claude colon-form command. Skill manifest re-signed (193 SHA-256 entries).
899
- - **`prefs.schema.json`** — added `global.autopilotSafetyGate` (boolean, default `true`) and `global.dynamicSkillLoading` (boolean, default `false`). Schema stays v2.2.0-compatible — additive only.
900
- - **`pipeline/commands/multi-agent/refs/phases/phase-2-planning.md`** — Step 5 scope guard now checks `classify-plan-safety.mjs` verdict before honoring autopilot's interaction-skip. New Step 5c documents the classifier contract. Step 6 mode-awareness table updated — autopilot column now reads `conditional` with a safety-classifier column.
901
- - **Token budget** — `phase-2-planning` max_tokens bumped 3050 → 3850, warn_tokens 2700 → 3400 (accommodates the Step 5c addition; 3079 current / 3400 warn headroom).
902
- - **`pipeline/skills/shared/core/multi-agent/SKILL.md`** — new "Dynamic skill loading" section documenting the opt-in runtime contract (read index → match → inject), fallback path (eager-load when index missing), and `--index-only` install interaction.
903
- - **`install.js`** — new `--index-only` flag writes only `.skills-index.json` + `skills-index.md` to `~/.claude/skills/` + `~/.copilot/skills/`, skipping the 193 SKILL.md bodies. Full-install path unchanged.
904
-
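The read index, match, inject contract for dynamic skill loading can be sketched as below. This is a simplified illustration under several assumptions: the index entry fields (`triggerPaths`, `triggerKeywords`, `descriptionKeywords`, `platform`), the per-signal weights, the skill names, and the glob handling are all hypothetical and are not taken from `match-skills.mjs`.

```javascript
// Minimal glob support: "**" spans directories, "*" stays within one segment.
// Unlike full glob libraries, "**/" here requires at least one directory.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000")
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Four additive signals; the weights (3 / 2 / 1 / 1) are placeholders.
function scoreSkill(skill, task) {
  const text = task.description.toLowerCase();
  let score = 0;
  const pathHit = skill.triggerPaths.some((g) =>
    task.files.some((f) => globToRegExp(g).test(f))
  );
  if (pathHit) score += 3;
  score += 2 * skill.triggerKeywords.filter((k) => text.includes(k.toLowerCase())).length;
  score += skill.descriptionKeywords.filter((k) => text.includes(k.toLowerCase())).length;
  if (skill.platform === task.platform) score += 1;
  return score;
}

// Return the top-N skills with a non-zero score, best first.
function matchSkills(index, task, topN = 5) {
  return index
    .map((skill) => ({ name: skill.name, score: scoreSkill(skill, task) }))
    .filter((m) => m.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}

const index = [
  { name: "swiftui-dark-mode", triggerPaths: ["src/**/*.swift"], triggerKeywords: ["dark mode"], descriptionKeywords: ["swiftui"], platform: "ios" },
  { name: "compose-theming", triggerPaths: ["app/**/*.kt"], triggerKeywords: ["compose"], descriptionKeywords: ["theme"], platform: "android" },
];
const task = { description: "Fix dark mode colors in SwiftUI settings", files: ["src/ui/Settings.swift"], platform: "ios" };
const matched = matchSkills(index, task);
console.log(matched);
```

An empty result would correspond to the fallback path mentioned above, where the orchestrator reverts to eager loading.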
905
- ### Verified
906
-
907
- - 49 smoke suites (47 v6.2.0 + 2 new: `smoke-plan-safety.sh`, `smoke-dynamic-skill-loading.sh`), 870+ assertions, exit 0.
908
- - `smoke-commands-skills-parity.sh` — 52/52 (command ↔ skill directory parity, up from 50/50 after adding `local-autopilot` pair).
909
- - `eval-triage.mjs` — 10/10. `eval-golden-tasks.mjs` — 2/2. 7/7 schemas valid.
910
- - Skill manifest clean (193 entries, zero drift).
911
-
912
- ### Migration notes
913
-
914
- - **Autopilot users who rely on zero-interaction for known-safe workflows**: the default flips toward safety — plans that touch security paths, schemas, or many files will now prompt once even in autopilot. Set `prefs.global.autopilotSafetyGate = false` in `~/.claude/preferences.json` to restore pre-v7.0 behavior.
915
- - **Dynamic skill loading**: off by default; no migration needed. Flip `prefs.global.dynamicSkillLoading = true` to opt in.
916
- - **`--index-only` installs**: new flag, no effect on existing full-install users.
917
-
918
- ---
919
-
920
- ## [6.2.0] — 2026-04-21
921
-
922
- Minor — three opt-in improvements that extend the v6.1.0 observability posture with persistent context, regression fixtures, and supply-chain integrity. All three are additive; none of them fire without a pref toggle or explicit script invocation.
923
-
924
- ### Added
925
-
926
- - **v6.2.E — Per-repo file-system memory layer** (`prefs.global.perRepoMemory`, default `false`). When enabled, Phase 0 reads `$PROJECT_ROOT/.multi-agent/memory/MEMORY.md` (capped at 30 pointer lines, wrapped in `<repo-memory>` tag) and Phase 7 dispatches a scoped synthesis subagent (sonnet, 10 turns) that may emit 0–3 new memory entries in JSON. Categories: user / feedback / project / reference. Helpers: `memory-load.sh`, `memory-save.sh`. Memory is local-only — installer also writes `.multi-agent/.gitignore` to keep it out of version control. Smoke: `smoke-per-repo-memory.sh` (11 assertions).
927
- - **v6.2.F — Golden-task eval harness** (`pipeline/eval/golden-tasks/` + `pipeline/scripts/eval-golden-tasks.mjs`). Contract regression fixtures for the whole pipeline — each case captures task input + expected Phase 1/2/4-review/4-triage outputs. Runner validates schemas, cross-checks internal consistency (Phase 2 files ⊆ Phase 1 touched areas, Phase 4 findings traceable to plan+triage), and enforces blocker/deferral counts. Ships with 2 seed fixtures (`01-ios-bugfix-darkmode`, `02-android-feature-compose`). Does NOT invoke live models — that's separate future work. Wired into `npm test`.
928
- - **v6.2.H — Skill manifest signing + verify** (`pipeline/scripts/sign-skills.sh` + `verify-skills.sh`). Generates `.skill-manifest.json` with SHA-256 per `SKILL.md` under `pipeline/skills/` (192 entries in v6.2.0). Install.js runs verify pre-deploy; drift → warn-only (never halts install); missing manifest → fail-open (first install). Detects mutation (exit 2), deletion (exit 2), and extra untracked skills (exit 3, info-only). Smoke: `smoke-skill-manifest.sh` (6 assertions).
929
-
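The manifest verification outcomes listed for v6.2.H (mutation, deletion, extra untracked skills) can be sketched as a comparison of two hash maps. This is an illustration only: the map shape (path to SHA-256 hex digest) is an assumption about `.skill-manifest.json`, hashing itself is left out, and giving exit 2 precedence over exit 3 when both apply is also an assumption.

```javascript
// manifest: recorded digests; actual: digests computed from disk.
// Both are hypothetical { "<skill path>": "<sha256 hex>" } maps.
function classifyDrift(manifest, actual) {
  const mutated = [];
  const deleted = [];
  const extra = [];
  for (const [path, hash] of Object.entries(manifest)) {
    if (!Object.hasOwn(actual, path)) deleted.push(path);
    else if (actual[path] !== hash) mutated.push(path);
  }
  for (const path of Object.keys(actual)) {
    if (!Object.hasOwn(manifest, path)) extra.push(path);
  }
  // Exit codes from the changelog: 2 for mutation or deletion, 3 for extras.
  let exitCode = 0;
  if (mutated.length > 0 || deleted.length > 0) exitCode = 2;
  else if (extra.length > 0) exitCode = 3;
  return { mutated, deleted, extra, exitCode };
}

const drift = classifyDrift(
  { "a/SKILL.md": "aa11", "b/SKILL.md": "bb22" },
  { "a/SKILL.md": "aa11", "b/SKILL.md": "ffff", "c/SKILL.md": "cc33" }
);
console.log(drift);
```

Because the installer treats drift as warn-only, a caller would log the result and continue rather than abort on a non-zero code.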
930
- ### Changed
931
-
932
- - **`install.js`** gains a manifest-verify step right after the existing pre-deploy skill scan. Warn-only — never blocks install.
933
- - **`package.json` test script** now runs `eval-golden-tasks.mjs` alongside `eval-triage.mjs`.
934
- - **`prefs.schema.json`** — `global.perRepoMemory` (boolean, default `false`) added.
935
- - **`pipeline/commands/multi-agent/refs/phases/phase-1-analysis.md`** — Step 1 Knowledge Injection block now documents the `memory-load.sh` call.
936
- - **`pipeline/commands/multi-agent/refs/phases/phase-7-report.md`** — Step 3 Knowledge Capture now documents the synthesis subagent + `memory-save.sh` loop.
937
-
938
- ### Verified
939
-
940
- - 47 smoke suites (45 v6.1.0 + 2 new: `smoke-per-repo-memory.sh`, `smoke-skill-manifest.sh`), 850+ assertions, exit 0.
941
- - `eval-golden-tasks.mjs` — 2/2 fixtures pass all schema + consistency checks.
942
- - `eval-triage.mjs` — 10/10 fixtures pass (unchanged).
943
- - `validate-schemas.mjs` — 7/7 schemas pass.
944
-
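One of the consistency checks named for the golden-task runner is that Phase 2 files are a subset of the Phase 1 touched areas. A minimal sketch of such a check follows. It assumes touched areas are directory prefixes or exact file paths, which this changelog does not confirm, and the fixture paths are made up.

```javascript
// Return the planned files that fall outside every touched area.
// An empty result means the subset check passes.
function filesOutsideTouchedAreas(planFiles, touchedAreas) {
  return planFiles.filter(
    (file) =>
      !touchedAreas.some((area) => {
        const prefix = area.endsWith("/") ? area : area + "/";
        return file === area || file.startsWith(prefix);
      })
  );
}

const stray = filesOutsideTouchedAreas(
  ["ios/Theme/Colors.swift", "ios/Settings/View.swift"],
  ["ios/Theme"]
);
console.log(stray);
```

Appending the slash before the prefix test keeps `ios/Theme` from accidentally covering a sibling such as `ios/ThemeLegacy`.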
945
- ---
946
-
947
- ## [6.1.0] — 2026-04-21
948
-
949
- Minor — four opt-in observability + quality improvements from the post-v6.0.1 multi-agent research roadmap. No slash-command API changes; everything added here is off-by-default or progressively enhances existing behavior when new env vars are set. The v6.2.x and v7.0.0 items from the approved roadmap remain in [`ROADMAP.md`](./ROADMAP.md).
950
-
951
- ### Added
952
-
953
- - **v6.1.A — Progressive-disclosure skill-authoring smoke** (`pipeline/scripts/smoke-skill-authoring.sh`). Asserts every authored `SKILL.md` under `skills/shared/core/` + `skills/figma-{ios,android}/` carries YAML frontmatter, that `name` matches the directory, description is ≥ 20 chars, and body length stays under 500 lines (warn) / 1000 lines (hard fail). External/ and figma-common/ trees are intentionally excluded as upstream imports. Caught 4 drift items during initial run — `figma-to-component` had `name: figma` (wrong), `figma-component-{code-connect, implement, test}` were missing frontmatter entirely. All four fixed. Current state: 3 pass, 1 warn (orchestrator SKILL.md expected over-limit), 0 fail.
954
- - **v6.1.B — Phase 4 disagreement-round loop** (`refs/phases/phase-4-review.md` Step 2.5, gated by `prefs.global.reviewDisagreementRound`, default `false`). When reviewers return mixed verdicts (blocker vs pass), run one rebuttal round — each reviewer sees the others' blocker findings verbatim and re-evaluates, returning the same schema. Max one round. Both CLI sides (Claude 2-model, Copilot 3-model) run identically. Off by default so baseline cost stays the same; high-stakes tasks can flip the flag.
955
- - **v6.1.C — OpenTelemetry span emission from phase-tracker** (`pipeline/scripts/phase-tracker.sh`). Opt-in via `MULTI_AGENT_OTEL_SPANS=1`. Emits JSONL spans to `otel-spans.jsonl` for `update`, `sub`, and `tokens` actions with deterministic trace_id (task_id) + span_id (event+phase+timestamp, shasum-hashed). Ingestible by any OTel collector via `filelog` receiver without protobuf/HTTP complexity in bash.
956
- - **v6.1.D — Per-persona model routing** (`pipeline/agents/*.md`). All 6 agent personas gain `preferredModel` + `modelRationale` frontmatter fields. `explorer` drops from opus → sonnet (scan work, cost-efficient without quality loss). Architects (iOS/Android/backend) + security-auditor + code-reviewer stay on opus. Orchestrator reads `preferredModel` and exports `CLAUDE_CODE_SUBAGENT_MODEL` (Claude Code) / passes `--model` (Copilot CLI) before dispatch. Precedence: per-dispatch `PHASE_MODEL_OVERRIDE` > persona `preferredModel` > `opus`. Smoke: `smoke-agent-model-routing.sh` (21 assertions).
957
- - **v6.1.J — Cost attribution in Phase 7 PR body** (`prefs.global.reportContent.costSummary`, default `false`). When enabled, `/multi-agent:channels` appends a `### Cost Summary` section to the PR body with per-phase token tally + est. USD table. Source: `phase-tracker.sh` token columns with OTel-spans fallback. Prices from `pipeline/scripts/cost-table.json` (opus/sonnet/haiku verified against Anthropic 2026-04-21 pricing). Unknown models render USD as `—` with a footnote — never blocks generation. Renderer: `pipeline/scripts/render-cost-summary.sh`. Smoke: `smoke-cost-summary.sh` (8 assertions).
958
-
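The cost attribution in v6.1.J reduces to tokens multiplied by a per-model price, with unknown models rendered as `—`. The sketch below shows that fallback. The price table uses a made-up model name and placeholder numbers, not the values in `cost-table.json`, and the row fields are hypothetical.

```javascript
// Placeholder prices in USD per million tokens. These are NOT real prices.
const PRICES = {
  "example-model": { inputPerMTok: 2, outputPerMTok: 10 },
};

// Estimate the USD cost for one phase row, or "—" when the model is unknown.
function estimateUsd(row, prices) {
  if (!Object.hasOwn(prices, row.model)) return "—";
  const price = prices[row.model];
  const usd =
    (row.inputTokens * price.inputPerMTok +
      row.outputTokens * price.outputPerMTok) /
    1_000_000;
  return `$${usd.toFixed(4)}`;
}

const known = estimateUsd(
  { model: "example-model", inputTokens: 100000, outputTokens: 20000 },
  PRICES
);
const unknown = estimateUsd(
  { model: "unlisted-model", inputTokens: 100000, outputTokens: 20000 },
  PRICES
);
console.log(known, unknown);
```

Returning a display string for the unknown case, instead of throwing, matches the stated goal that a missing price never blocks report generation.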
959
- ### Changed
960
-
961
- - **`pipeline/schemas/prefs.schema.json`** — `reportContent.costSummary` (boolean, default `false`) added to v2.2.0 schema. `global.reviewDisagreementRound` (boolean, default `false`) added.
962
- - **`pipeline/agents/code-reviewer.md`** — description clarifies that Phase 4 Reviewer 3 override happens via `CLAUDE_CODE_SUBAGENT_MODEL`, not frontmatter manipulation.
963
- - **`pipeline/commands/multi-agent/refs/phases/phase-4-review.md`** — Step 2 now references the `PHASE_MODEL_OVERRIDE` → `CLAUDE_CODE_SUBAGENT_MODEL` wiring instead of assuming a fixed model ID per reviewer row.
964
-
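The model precedence stated for v6.1.D (per-dispatch `PHASE_MODEL_OVERRIDE`, then persona `preferredModel`, then `opus`) can be written as a short fallback chain. This is a sketch: the function name and argument shape are hypothetical, and how the orchestrator actually reads the frontmatter is not shown here.

```javascript
// Resolve the model for one subagent dispatch.
// Precedence: per-dispatch override > persona preferredModel > "opus".
function resolveModel({ phaseModelOverride, persona } = {}) {
  return phaseModelOverride || persona?.preferredModel || "opus";
}

console.log(resolveModel({ persona: { preferredModel: "sonnet" } }));
```

Using `||` treats an empty string the same as an unset value, which is usually what is wanted when the override comes from an environment variable.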
965
- ### Verified
966
-
967
- - 44 smoke suites (42 v6.0.x + 2 new: `smoke-agent-model-routing.sh`, `smoke-cost-summary.sh`), 830+ assertions, exit 0.
968
- - Full regression pass on `smoke-schema-validation.sh` (27/27), `smoke-cross-cli-behavior.sh` (19/19), `smoke-channels-flow.sh` (44/44), `smoke-skill-authoring.sh` (3 pass / 1 warn / 0 fail).
969
-
970
- ---
971
-
972
- ## [6.0.1] — 2026-04-21
973
-
974
- Patch — the v6.0.0 audit was not exhaustive enough. User asked "did you handle multi-agent-sync detaylı" ("in detail") and the miss surfaced: `sync.md` + `multi-agent-sync/SKILL.md` + `docs/architecture.md` still carried stale command counts (16 / 18 / 20) and `v5.0.1+` "new" version markers from earlier releases. The v6.0.0 cross-cli-contract rewrite caught those numbers in `refs/cross-cli-contract.md` § 1 but I didn't grep the rest of the repo for matching claims.
975
-
976
- ### Fixed
977
-
978
- - **`pipeline/commands/multi-agent/sync.md` + `pipeline/skills/shared/core/multi-agent-sync/SKILL.md`** — `16 komut sync edilir` ("16 commands are synced", plus its incomplete 16-item list) replaced with the canonical 25-command v6.0.0 inventory (autopilot, channels, dev, dev-autopilot, dev-local, dev-local-autopilot, help, issue, jira, kill, local, log, manual-test, purge, refactor, resume, review, scan, search, setup, stack, status, sync, test, update). The 9 previously-missing commands — `dev-autopilot`, `dev-local`, `dev-local-autopilot`, `local`, `manual-test`, `scan`, `search`, `stack`, `update` — were actually being installed by `install.js` all along; only the sync doc's inventory table was stale. `20 sub-command` → `25 sub-command skills`. Stale `v5.0.1+` markers on the "Hedefler" ("Targets") table and the "yeni" ("new") banner removed (no longer "new" at v6.0.1).
979
- - **`docs/architecture.md` Pipeline Specs subgraph** — `figma/ 40K+ lines` (v5.0.0 purged this dir) replaced with the current three-tree structure (`skills/figma-{ios,android,common}/` 37 skills, `skills/shared/core/` 28 orchestration skills incl. compliance, `skills/shared/external/` 127 curated skills). `Quality Gates` subgraph: `schemas/ 4` → `7 + token-budget`, `smoke-* 10 smoke suites` → `42`.
980
-
981
- ### Process fix
982
-
983
- Added `feedback_audit_exhaustiveness.md` to memory — new rule: any audit that states a finding count must grep the ENTIRE set of plausible surface files (every command `.md`, every `SKILL.md`, every `refs/*.md`, every `docs/**.md`, README, CHANGELOG, ROADMAP) before declaring the audit complete. v6.0.0's audit sampled — this patch is the cost of that sampling.
984
-
985
- ### Verified
986
-
987
- - 41 `══` smoke suites + 1 custom shape = 42 total, 822+ assertions, exit 0.
988
- - Final drift grep returned zero matches for stale count patterns.
989
-
990
- ---
991
-
992
- ## [6.0.0] — 2026-04-21
993
-
994
- Major — comprehensive drift-sweep release closing 27 audit findings across three waves. No slash-command API changes; the major bump reflects the breadth of doc / contract / schema cleanup, not user-facing behavior changes.
995
-
996
- ### Wave 1 — Blockers (doc / contract drift)
997
-
998
- - **`refs/cross-cli-contract.md` § 1 rewritten** — command inventory 21 → 25 (actual); stale `pipeline/skills/figma-ios/figma-<N>/` paths (23 missing dirs from the old flat-structure era) corrected to their current `figma-common/` locations; new § 1.2 added for store-compliance skills (`apple-archive-compliance`, `google-play-compliance`); new § 2.6 declares the intentional structural divergence between `pipeline/commands/multi-agent.md` (thin 300-line dispatcher) and `pipeline/skills/shared/core/multi-agent/SKILL.md` (830-line inline orchestrator) so auditors stop flagging it as parity drift.
999
- - **README "Current at-a-glance" fact sheet** added near the top — live filesystem counts (192 SKILL.md files across 6 groups, 42 smoke suites, 25 slash commands, 7 JSON schemas, 6 agent personas, 8 phases). Readers no longer hit the 5 conflicting smoke counts scattered across old "What's new" blocks. Historical blocks stay for release traceability, now clearly marked as point-in-time records.
1000
- - **Stale skill-count claims fixed** — 149 / 179 → 192 `SKILL.md` files with per-group breakdown (28 core + 5 iOS + 5 Android + 27 common + 127 external).
1001
- - **7 colon ↔ dash description drifts aligned byte-identical** — `channels`, `help`, `manual-test`, `refactor`, `review`, `setup`, `test`.
1002
- - **`multi-agent-scan` + `multi-agent-search` SKILL.md** gain the missing `user-invocable: true` + `argument-hint` frontmatter; all 28 shared/core skills are now parity-consistent.
1003
- - **`smoke-add-detail.sh` output format normalized** to the standard `══ add-detail smoke: N passed, M failed ══` shape. 42/42 smokes now emit a consistent header.
1004
- - **ROADMAP rewritten** — current release tracked honestly (was stuck at "5.0.1"); 5.0.2 – 5.8.1 backfilled as per-series summaries; new "Future — v6.1+" section with 9 concrete candidates; extended "Not Planned" block.
1005
-
1006
- ### Wave 2 — Important (correctness, maintenance debt)
1007
-
1008
- - **`multi-agent.md` ↔ SKILL.md** now both carry the canonical 5-type Input Parsing table (the colon form had it, the dash form didn't). Contract § 2.6 formally documents what must stay identical vs what may diverge.
1009
- - **Phase 7 token budget recalibrated** — warn 4100 → 3100, max 4700 → 3500. v5.7.0's channels.md delegation shrank the file to 2782 tokens; the prior warn budget was 1318 tokens too loose relative to the current+10% calibration rule. `total_max_tokens` 32950 → 31750.
1010
- - **Migration docs consolidated**: `MIGRATION_4.0.md`, `MIGRATION-v3.6-to-v3.7.md`, `MIGRATION-v4-to-v5.md` moved to `docs/archive/`. `docs/MIGRATION.md` gains the v6.0.0 section + archive pointer.
1011
- - **`docs/PLAN_v5.0.md`** moved to `docs/archive/plans/`; cross-reference links updated in `pipeline/rules/figma-pipeline.md`, `pipeline/skills/figma-android/README.md`, `docs/FIGMA_PIPELINE.md`, `README.md`.
1012
- - **Unit test expansion** — new `test/migrate-prefs.test.mjs` (12 tests across 3 describe blocks) covers `migrate-prefs.mjs` v2.0.0 → 2.2.0 chain, `reportChannels` sub-migration, idempotency, `--dry-run` safety, error handling. Unit suite: 66 → 78 tests.
1013
- - **Compliance skill frontmatter updated to declare all 4 consumers** (not just "primary"). Previous state implied only `/multi-agent:test "store-ready"` consumed the skills; accurate state is 4 consumer surfaces wired in v5.8.1 (`:test`, Phase 4 Security Auditor, `:review`, `:channels`).
1014
- - **`install.js --platform=ios|android|all` flag** — selective copy of `pipeline/skills/shared/external/` for platform-specific installs. Default `all` preserves backwards compatibility; `ios` skips 13 Android-classified dirs, `android` skips 55 iOS-classified dirs. Prefix-based heuristic with "generic" fallback. Core / figma / non-external trees always install in full.
1015
- - **Prefs migration 2.1.0 → 2.2.0** — formalizes v5.7 / v5.8 field additions (`reportChannels`, `reportContent`, `wikiScope`, `autopilotReportTimeoutSeconds`) that had been running as 2.1.0 sub-migrations without a proper version tag. `pipeline/schemas/migrations/prefs-2.1.0-to-2.2.0.mjs` created; `prefs.schema.json` enum extended; `migrate-prefs.mjs` chain updated. Defensive backfill ensures v2.2.0 always carries the 4 fields even if upstream migration is skipped.
1016
- - **Figma smoke consolidation evaluated + documented** — `pipeline/scripts/README-figma-smokes.md` explains why the 6 figma smoke suites stay separate (distinct contracts per file; merging would hurt debug signal-to-noise). Reason recorded for future maintainers.
1017
-
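The defensive backfill described for the 2.1.0 → 2.2.0 prefs migration can be sketched as an idempotent function. The four field names come from the entry above; the default values, the assumption that all four live under `global`, and the function name are placeholders, not the contents of `prefs-2.1.0-to-2.2.0.mjs`.

```javascript
// Placeholder defaults; the real migration's defaults are not shown here.
const BACKFILL_DEFAULTS = {
  reportChannels: [],
  reportContent: {},
  wikiScope: null,
  autopilotReportTimeoutSeconds: 0,
};

// Stamp schemaVersion 2.2.0 and add any of the four fields that are missing.
// Existing values are never overwritten, so re-running is a no-op.
function migrateTo220(prefs) {
  const global = { ...(prefs.global ?? {}) };
  for (const [key, value] of Object.entries(BACKFILL_DEFAULTS)) {
    if (!Object.hasOwn(global, key)) {
      global[key] = JSON.parse(JSON.stringify(value)); // copy, don't share
    }
  }
  return { ...prefs, schemaVersion: "2.2.0", global };
}

const once = migrateTo220({ schemaVersion: "2.1.0", global: { wikiScope: "team" } });
const twice = migrateTo220(once);
console.log(JSON.stringify(once) === JSON.stringify(twice));
```

Copying each default before assignment avoids two preference objects sharing the same array or object instance.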
1018
- ### Wave 3 — Polish
1019
-
1020
- - **README "What's new" pruned to last 2 releases** (v5.8.1 + v5.8.0); earlier entries collapsed to a summary "Earlier releases" block with CHANGELOG link. README 952 → 832 lines.
1021
- - **Colon-form `argument-hint` added** to 9 commands that take arguments: `test`, `review`, `channels`, `kill`, `resume`, `log`, `jira`, `issue`, `autopilot`. Claude Code command picker now shows usage hints the same way Copilot's skill picker does.
1022
- - **`pipeline/scripts/` category index** — new `pipeline/scripts/README.md` navigates the 59 shell scripts + 12 `.mjs` helpers by purpose (smoke / hooks / runtime / Node.js). Audit evaluated moving files into `smoke/` + `hooks/` + `runtime/` subdirs; rejected because the 40+ path-reference touch radius (npm glob, install.js deploy, sibling script calls) had no runtime value.
1023
- - **`.github/` audit** — 7 files confirmed current (CODEOWNERS, FUNDING, 3 issue templates, dependabot, PR template). No stale workflows (already removed in v5.2.1).
1024
- - **ROADMAP `## Future — v6.1+` section** added with 9 concrete candidates (public benchmark suite, community skill-pack format, multi-repo pipeline coordination, VS Code / JetBrains integrations, Copilot Chat adapter, Windows / Linux portability, telemetry dashboard, compliance rule catalog as data, on-device compliance smoke).
1025
- - **Skill `language: en|tr` frontmatter tag** — 28/28 shared/core skills classified (18 TR, 10 EN) based on description text. Maintainer signal for "which language does this doc's description use" without requiring deep reads.
1026
- - **Compliance skill Examples expanded** — both `apple-archive-compliance` and `google-play-compliance` SKILL.md Examples sections now include concrete usage for all 4 consumer surfaces (primary `:test`, Phase 4 Security Auditor finding format, `:review` branch-specific invocation, `:channels` PR-body augmentation output template).
1027
- - **Eval fixture naming standardized** — 3 long names shortened: `07-duplicate-findings-from-two-reviewers` → `07-duplicate-reviewers`; `08-stylistic-blocker-misclassification` → `08-style-misclassified`; `10-deferred-with-cross-reference` → `10-deferred-crossref`. 10/10 eval cases still pass.
1028
- - **`.prettierrc` → `.prettierrc.json`** — modern Prettier config discoverability.
1029
- - **`install.js` console channel split** — one misclassified `console.log` for the "Skill scan unavailable" non-fatal warning converted to `console.warn`. Other info-level paths correctly retain `console.log`; genuine errors at lines 42 / 90 already used `console.error`.
1030
-
1031
- ### Verified
1032
-
1033
- - Full suite: **42 runnable smoke suites** (41 emit the standard `══` header as before; the remaining one, `smoke-add-detail.sh`, now emits the `══ add-detail smoke:` shape), **822+ assertions**, eval-triage 10/10, 7 schemas valid. Exit 0.
1034
- - Unit tests: **14 suites, 78 tests**, 0 failures.
1035
- - `node install.js --all` on both CLIs — clean (0 personal-data findings, stale prunes fire correctly, both skill dirs get the full 192 SKILL.md inventory).
1036
-
1037
- ### Migration
1038
-
1039
- No API-breaking changes on the slash-command surface. Existing installations will auto-upgrade preferences to schemaVersion `2.2.0` on next `multi-agent:update` (idempotent; no data loss). Downstream parsers that read `"Results: N passed"` from `smoke-add-detail.sh` must rewrite to match the standard `══ <name> smoke: N passed, M failed ══` shape.
1040
-
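For downstream parsers that need to move to the standard header shape, a regular expression along these lines may be enough. It is a sketch based only on the `══ <name> smoke: N passed, M failed ══` shape quoted above; the function name and the sample counts are illustrative.

```javascript
// Matches the standard smoke header: "══ <name> smoke: N passed, M failed ══".
const SMOKE_HEADER = /^══ (.+) smoke: (\d+) passed, (\d+) failed ══$/;

// Return { name, passed, failed } for a header line, or null otherwise.
function parseSmokeHeader(line) {
  const match = SMOKE_HEADER.exec(line.trim());
  if (!match) return null;
  return {
    name: match[1],
    passed: Number(match[2]),
    failed: Number(match[3]),
  };
}

console.log(parseSmokeHeader("══ add-detail smoke: 12 passed, 0 failed ══"));
```

Lines in the old `Results: N passed` form return `null`, so a parser can detect un-migrated output instead of silently misreading it.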
1041
- ---
1042
-
1043
- ## [5.8.1] — 2026-04-21
1044
-
1045
- Patch — v5.8.0 review caught the two compliance skills claiming four consumer surfaces (`/multi-agent:test "store-ready"`, Phase 4 Security Auditor, `/multi-agent:review`, `/multi-agent:channels`) while only one (`:test "store-ready"`) was actually wired. The other three files had zero references to `apple-archive-compliance` / `google-play-compliance` — the promise was aspirational. v5.8.1 turns the promise into code.
1046
-
1047
- ### Wired
1048
-
1049
- - **`pipeline/agents/security-auditor.md`** — added a "Store-compliance catalog cross-reference" section. Lists iOS trigger files (Info.plist / PrivacyInfo / entitlements / AppDelegate / project.pbxproj), Android trigger files (AndroidManifest / build.gradle / proguard-rules / network_security_config), and tells the auditor to cite `(apple-archive-compliance / <ruleID> — <ref>)` or `(google-play-compliance / <ruleID> — <Play ref>)` verbatim on flagged diff lines. Binary invocation not required for diff review — catalog only.
1050
- - **`pipeline/commands/multi-agent/review.md` + `pipeline/skills/shared/core/multi-agent-review/SKILL.md`** — step 4 of the review flow now loads the matching catalog when the diff touches release-relevant paths. Cross-CLI parity kept (both Claude and Copilot review surfaces carry the same Store-compliance step).
1051
- - **`pipeline/commands/multi-agent/channels.md` + `pipeline/skills/shared/core/multi-agent-channels/SKILL.md`** — added "Store compliance auto-augmentation" logic. When a cached `archiveguard-*.json` or `play-compliance-*.json` lives under `/tmp` (≤ 6 hours old), channels now auto-appends a `### Store compliance` section to the PR body / Jira comment / Confluence page with blocker count + per-rule citations (capped at 5 rows). No cache → skip silently; zero blockers + zero warnings → skip silently.
1052
-
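The skip rules for the store-compliance augmentation (no cache, cache older than 6 hours, or nothing to report) and the 5-row cap can be sketched as follows. The cache object shape (`mtimeMs`, `report.blockers`, `report.warnings`, `ruleId`, `message`) and the row format are assumptions; the real `archiveguard-*.json` and `play-compliance-*.json` layouts are not shown in this changelog.

```javascript
const MAX_AGE_MS = 6 * 60 * 60 * 1000; // 6 hours
const MAX_ROWS = 5;

// Return the Markdown section, or null when the section should be skipped.
function renderStoreCompliance(cache, nowMs) {
  if (!cache) return null; // no cache: skip silently
  if (nowMs - cache.mtimeMs > MAX_AGE_MS) return null; // stale cache
  const { blockers = [], warnings = [] } = cache.report;
  if (blockers.length === 0 && warnings.length === 0) return null;
  const rows = [...blockers, ...warnings]
    .slice(0, MAX_ROWS)
    .map((f) => `- ${f.ruleId}: ${f.message}`);
  return ["### Store compliance", `Blockers: ${blockers.length}`, ...rows].join("\n");
}

const now = 10 * 60 * 60 * 1000;
const fresh = {
  mtimeMs: now - 60 * 60 * 1000, // one hour old
  report: { blockers: [{ ruleId: "privacy-manifest", message: "missing" }], warnings: [] },
};
console.log(renderStoreCompliance(fresh, now));
```

Passing the current time in as an argument keeps the function deterministic and easy to exercise in a smoke test.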
1053
- ### Added
1054
-
1055
- - **`smoke-compliance-skills.sh`** (45 assertions) — enforces every piece of the contract: frontmatter shape on both skills, rule-row counts (17 Apple / 21 Google), 4-category header presence on Google side, EN + TR humanizer blocks, severity mapping table, tool orchestration references, sim-test.md platform-detect contract, and the three consumer surfaces actually carrying the skill names. Any future drift that unwires a consumer fails this smoke.
1056
-
1057
- ### Verified
1058
-
1059
- - Full suite: 40 smoke suites · 822+ assertions · eval-triage 10/10 · 7 schemas valid.
1060
- - Smoke count bumped 39 → 40 with the new suite.
1061
-
1062
- ---
1063
-
1064
- ## [5.8.0] — 2026-04-21
1065
-
1066
- Minor — store-ready compliance for iOS **and** Android via two new shared skills. User flagged their external `~/ArchiveGuard` Swift project ("tam istediğim çıktıyı vermiyor ama mantıksal iyi tarafları alabilirmiyiz": it doesn't give exactly the output I want, but can we take its logically good parts?) and asked whether `/multi-agent:test "store-ready"` could invoke it. After alignment — "pipeline'a taşımayalım" (don't vendor the binary), "yaklaşımları bence skill olarak eklemek daha iyi" (prefer skill-catalog over inlined scenario branch), "android içinde bir skill olsa güzel olur google standartları tarzı" (a skill for Android would be nice too, along the lines of Google's standards) — the result is two parallel skills that wrap external tooling and report back in a normalized severity-grouped shape.
1067
-
1068
- ### Added
1069
-
1070
- - **`apple-archive-compliance` skill** (`pipeline/skills/shared/core/apple-archive-compliance/SKILL.md`) — wraps the external ArchiveGuard binary (`~/ArchiveGuard/.build/release/archive-guard`). 17-rule catalog cross-referenced to Apple ITMS error codes + App Store Review Guidelines (privacy-manifest, required-reason-api, info-plist, code-signing, embedded-sdk, entitlement, asset-validation, binary-size, team-id-consistency, provisioning-profile, swift-abi, extension-signing, ipv6-compliance, debug-tool-leak, production-hygiene, duplicate-resource, dead-reference). EN + TR humanizer templates. Severity mapping: error → blocker, warning → risk, info → log-only.
1071
- - **`google-play-compliance` skill** (`pipeline/skills/shared/core/google-play-compliance/SKILL.md`) — orchestrates `bundletool` + `aapt2` + `apksigner` (the Android analogue; Google doesn't ship a single binary). 21-rule policy catalog in four categories: A Technical (target-api-level, native-64bit, app-bundle-format, base-module-size, package-visibility, foreground-service-types), B Security (app-signing, cleartext-traffic, exported-components, proguard-mapping), C Privacy (dangerous-permissions, advertising-id, background-location, data-safety-consistency, sensitive-permission-special-access), D Hygiene (debug-tool-leak, log-statements, mock-endpoints, minify-enabled, version-code-progression, app-links-verification). Same EN + TR humanizer + severity-grouped output shape as the Apple skill. Notes call out that Play policy thresholds shift yearly — Play Console pre-launch report remains complementary, not redundant.
1072
- - **`/multi-agent:test "store-ready"` scenario branch** in `sim-test.md` — auto-detects platform from cwd (`.xcodeproj`/`Package.swift` → iOS, `build.gradle`/`build.gradle.kts` → Android), runs the standard visual + accessibility sweep, dispatches the matching skill, merges findings into one severity-grouped report, and offers `/multi-agent:channels` follow-up so results can land in Jira / Confluence / Wiki / PR via Phase 7 machinery.
1073
-
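The platform auto-detection and severity mapping described above can be sketched together. The marker files and the error → blocker, warning → risk, info → log-only mapping come from these entries; the function names, the finding shape, and checking iOS markers before Android ones are assumptions.

```javascript
// Severity mapping from the apple-archive-compliance entry above.
const SEVERITY_MAP = { error: "blocker", warning: "risk", info: "log-only" };

// `entries` is a hypothetical list of file and directory names in the cwd.
function detectPlatform(entries) {
  if (entries.some((n) => n.endsWith(".xcodeproj") || n === "Package.swift")) {
    return "ios";
  }
  if (entries.some((n) => n === "build.gradle" || n === "build.gradle.kts")) {
    return "android";
  }
  return null; // no known marker found
}

// Group findings into the normalized severity buckets.
function groupBySeverity(findings) {
  const groups = { blocker: [], risk: [], "log-only": [] };
  for (const finding of findings) {
    const bucket = SEVERITY_MAP[finding.severity];
    if (bucket) groups[bucket].push(finding);
  }
  return groups;
}

console.log(detectPlatform(["App.xcodeproj", "README.md"]));
```

A repository containing both marker sets would resolve to iOS in this sketch; a real implementation might prefer to ask the user instead.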
1074
- ### Integration points
1075
-
1076
- Both skills are consumable from four surfaces with identical invocation contracts:
1077
-
1078
- 1. `/multi-agent:test "store-ready"` — primary interactive path.
1079
- 2. **Phase 4 Security Auditor** — opportunistic when the task touches `Info.plist` / `AndroidManifest.xml` / entitlements / signing.
1080
- 3. `/multi-agent:review` — opt-in deep compliance audit before PR.
1081
- 4. `/multi-agent:channels` — Phase 7 report augmentation; compliance findings flow into the selected channels with their ITMS / Play references intact.
1082
-
1083
- ### Notes
1084
-
1085
- - ArchiveGuard source stays external (`~/ArchiveGuard/`); this package depends on its CLI binary only. Missing binary → skill surfaces a clear "install / build ArchiveGuard first" error, doesn't crash the pipeline.
1086
- - Android side requires Android SDK build-tools in PATH; missing tools degrade gracefully with per-tool guidance.
1087
- - TR humanizer strings respect `prefs.global.promptLanguage` the same way the channels menu, Phase 0/5/6 prompts, and help screen already do.
1088
-
1089
- ---
1090
-
1091
- ## [5.7.5] — 2026-04-21
1092
-
1093
- Patch — `/multi-agent:help` TR localization. User asked "tr seçtiysem `/multi-agent:help` tr mi açılır" (if I selected TR, does `/multi-agent:help` open in TR?) and the honest answer was no: help.md was English-only even after `/multi-agent:language tr`. Other surfaces (Phase 0/5/6 prompts, channels menu, setup) already respected `prefs.global.promptLanguage`; help was the last EN-only holdout.
1094
-
1095
- ### Added
1096
-
1097
- - **help.md + multi-agent-help/SKILL.md now render in EN or TR** based on `prefs.global.promptLanguage`. Single file, two blocks — a language-resolution header tells the model which block to print. Turkish block is a full translation (modes, phases, utility commands, interactive launchers, UI Testing, Manual Test, Setup, Key Features, Examples), not a courtesy stub. Fallback: missing / malformed pref → renders English.
1098
-
1099
- ### Changed
1100
-
1101
- - Help content bumped to reflect v5.7.4 test/manual-test split (both languages).
1102
- - Autopilot row now mentions the Phase 7 channels-menu pause exception.
1103
-
1104
- ---
1105
-
1106
- ## [5.7.4] — 2026-04-21
1107
-
1108
- Patch (breaking alias rename): the `/multi-agent:test` colon-form was loading the 43-line Phase 5 Manual Test file while README + help.md had always described it as the UI Bug Hunter. User flagged: "what can we test in the test section, and is help up to date?" The docs vs. actual behavior mismatch was confirmed.
1109
-
1110
- ### Changed (slash-command surface)
1111
-
1112
- - **`/multi-agent:test` now loads the UI Bug Hunter** via a thin dispatcher in `commands/multi-agent/test.md` that reads `sim-test.md`. Matches README + help.md expectations. Examples (`dark mode` / `accessibility` / `dynamic type` / `screenshot tr`) now actually work on the colon form — previously only the space form (`multi-agent test ...`) through the main router did.
1113
- - **Phase 5 Manual Test (the old 43-line file) moved to `/multi-agent:manual-test`** — task branch checkout + Xcode/SourceTree hints, waits for user `ok` / `fix: ...` verdict. Still a standalone utility, just under a more descriptive name.
1114
- - **Copilot side** mirrored: `multi-agent-test` skill dir now contains the UI Bug Hunter dispatcher, `multi-agent-manual-test` dir contains the Phase 5 flow.
1115
- - **Cross-CLI inventory bumped** 20 → 21 commands (cross-cli-contract.md).
1116
-
1117
- ### Migration for existing users
1118
-
1119
- Old colon-form scripts that called `/multi-agent:test #N` expecting Xcode hint flow should be updated to `/multi-agent:manual-test #N`. Space-form invocations (`multi-agent test ...`) already resolved to the UI Bug Hunter via the main router and behave the same as before.
1120
-
1121
- ---
1122
-
1123
- ## [5.7.3] — 2026-04-21
1124
-
1125
- Patch: open-items cleanup. User asked "take care of the open items" after the detailed review surfaced three minor loose ends.
1126
-
1127
- ### Added
1128
-
1129
- - **`technicalAnalysis` as a distinct content option in the channels menu.** The user's original spec (message #5) listed three content items: test scenario, technical analysis, normal analysis. v5.7.0 merged "technical analysis" into the broader Normal analysis entry based on an exchange about PR body duplication. v5.7.3 restores it as its own tick: it produces a `### Technical Details` section with Changes / Architecture / Dependencies (same shape as the PR body's Technical Details, sourced from Phase 2 plan + Phase 3 dev log + `git diff --stat`). Default off (the PR body already covers it when channel=PR is selected); opt in when the channel selection does not include PR and the Jira/Confluence reader needs the technical detail. Schema + template + migration + menu + smoke assertion updated.
1130
-
1131
- ### Changed
1132
-
1133
- - **Phase 0 Step 1b progress line renamed: `→ enrich input:` → `→ URL context:`** — the old log line was a collision with the removed `/multi-agent:enrich` command name. Step 1b itself is still called URL Enrichment (that's the feature title), only the one-liner log string changed. smoke-url-enrichment updated to match.
1134
-
1135
- ### Verified (no action needed)
1136
-
1137
- - **Vercel deployment of mmerterden.dev v5.7.2** — `gh api commits/main/status` reports `state: success` for the Vercel check. Auto-deploy succeeded.
1138
-
1139
- ---
1140
-
1141
- ## [5.7.2] — 2026-04-21
1142
-
1143
- Patch: second-round validation catch. User asked "review it one last time" and the deep audit found that `~/.claude/skills/README.md` + `~/.copilot/skills/README.md` (the shared skills index) had drifted away from the source: they still listed `multi-agent-enrich` because install.js never refreshed those files.
1144
-
1145
- ### Fixed
1146
-
1147
- - **install.js now mirrors `pipeline/skills/shared/README.md`** to both `~/.claude/skills/README.md` and `~/.copilot/skills/README.md` on every run. The source file (committed in the repo) stays canonical; the skill-index README on install can no longer drift. Without this, v5.7.0 upstream deletes like `multi-agent-enrich` left dead table rows in the installed skill index that the user saw in their skill picker.
1148
-
1149
- ---
1150
-
1151
- ## [5.7.1] — 2026-04-21
1152
-
1153
- Patch: install.js stale-skill cleanup gap. User asked the validation question "you did everything I said, completely, right? We keep skipping things" and an honest audit found a real leftover: `~/.copilot/skills/multi-agent-enrich/` survived the v5.6.3→v5.7.0 upgrade because install.js's `copyDir` only adds files and never prunes dirs removed upstream. The Claude side was fine (`pruneLegacyMultiAgentSkills` already covers all `multi-agent-*` on Claude). The Copilot side had no equivalent.
1154
-
1155
- ### Fixed
1156
-
1157
- - **install.js Copilot-side stale skill prune** — before copying `pipeline/skills/shared/core/*` to `~/.copilot/skills/`, install.js now scans for `multi-agent-*` dirs on disk that no longer exist in the source tree and removes them. Only the `multi-agent-*` namespace is touched; external / user-installed skills stay intact. Prints `-> pruned N stale multi-agent-* skill dir(s)` when it happens. Regression-proofed manually against a seeded `multi-agent-enrich` fixture — pruned cleanly.
1158
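A rough sketch of the prune selection described above, assuming the installer already has two lists of directory names (installed and source). `findStaleSkillDirs` is a hypothetical name, not a function confirmed to exist in `install.js`; the actual deletion step is left out.

```javascript
// Hypothetical helper: picks installed skill dirs that should be pruned.
// Only the multi-agent-* namespace is considered, as the changelog entry states.
function findStaleSkillDirs(installedDirs, sourceDirs) {
  const source = new Set(sourceDirs);
  return installedDirs.filter(
    (name) => name.startsWith("multi-agent-") && !source.has(name)
  );
}

const stale = findStaleSkillDirs(
  ["multi-agent-enrich", "multi-agent-dev", "my-custom-skill"],
  ["multi-agent-dev", "multi-agent-channels"]
);
console.log(`-> pruned ${stale.length} stale multi-agent-* skill dir(s)`);
```

Filtering on the prefix before comparing against the source tree is what keeps external or user-installed skills out of the prune set.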
-
1159
- ---
1160
-
1161
- ## [5.7.0] — 2026-04-21
1162
-
1163
- **Breaking** — `/multi-agent:enrich` removed. Replaced by `/multi-agent:channels` (Claude) / `multi-agent-channels` (Copilot). Phase 7 Report rewritten to delegate all external delivery to the new command; internal capture stays inline.
1164
-
1165
- User ask: "it should ask these things in this part: Jira / Confluence / Wiki multi-select, then a content selection for each channel (Normal analysis / Test scenarios / Auto-diff / Manual note). Let's move all of enrich's power into channels; I don't want to see the enrich keyword." Delivered.
1166
-
1167
- ### Added
1168
-
1169
- - **`/multi-agent:channels` command** — single entry point for Jira / Confluence / Wiki / PR description reports. Multi-select channels × multi-select content sources. Phase 7 delegates to it; post-hoc invocable by Jira ID / PR URL / #N / cwd+branch. Inherits all `enrich` capabilities (diff auto-summarize, manual mode, reviewer-preserving Bitbucket PR PUT) plus two new channels (Confluence page, Wiki pages). Claude: `pipeline/commands/multi-agent/channels.md`. Copilot: `pipeline/skills/shared/core/multi-agent-channels/SKILL.md`. Cross-CLI byte-identical prompts asserted by `smoke-cross-cli-behavior.sh`.
1170
- - **Case A / Case B Wiki menus** (`refs/wiki-capture.md`) — Case A offers scope multi-select (main page / iOS sub-page / screenshots / index updates / manual override) when `taskType=component` + `wiki.enabled=true`. Case B offers actionable menu ([1] setup Figma config, [2] skip this run, [3] disable forever, [4] manual note) when preconditions fail. Previously Wiki silently no-op'd on precondition failure — no more.
1171
- - **Autopilot Phase 7 pause contract** (`refs/phases/modes.md`) — Phase 7 is the single exception to the autopilot zero-interaction rule. ALL modes (autopilot, --dev autopilot, --local autopilot) pause at the channels multi-select menu. 30-minute timeout → session ends cleanly → internal capture still runs → resumable via `/multi-agent:resume`. Rationale: silently posting defaults to Jira/Confluence after timeout would leak wrong-tone content to team-visible artifacts.
1172
- - **Schema v5.7 fields** (`prefs.schema.json`) — `reportChannels{pr,jira,confluence,wiki}`, `reportContent{normalAnalysis,testScenarios,autoDiff,manualNote}`, `autopilotReportTimeoutSeconds` (default 1800s), `wikiScope[main|ios|screenshots|index]`. Template (`preferences-template.json`) updated to match.
1173
- - **`migrate-prefs.mjs migrateReportChannels()` sub-migration** — promotes legacy `jiraCommentDefault` / `confluenceDefault` / `wikiDefault` booleans to the new `reportChannels` structure, inits `reportContent` + `wikiScope` + timeout defaults, prunes old fields. Idempotent.
1174
- - **`smoke-channels-flow.sh`** (43 assertions) — validates command exists in both CLIs, enrich fully removed, 4 channels declared, 4 content sources declared, Bitbucket reviewer-preserving PUT payload, 30-min timeout referenced cross-file, Phase 7 autopilot-always-pause, Phase 7 delegates without inline adapters, schema fields present, legacy schema fields purged, migration handles v5.7, template uses new fields.
1175
-
1176
- ### Changed
1177
-
1178
- - **Phase 7 sub-steps reduced from 5 to 3** — `1:Channels-Dispatch` (delegated), `2:Log+Telemetry`, `3:Knowledge+Memory`. Old 5-step list (Jira-Comment / Wiki+Figma / Confluence / Log+Telemetry / Knowledge+Memory) consolidated under channels.
1179
- - **`/multi-agent:review-only` → `/multi-agent:review`** — simpler name, space-form alias removed. `review.md` file already exists; only the keyword routing changed.
1180
- - **Modern colon-form consistency in help docs.** The `help.md` + `multi-agent-help/SKILL.md` utility/launcher/test/setup/examples sections now use `/multi-agent:status` / `/multi-agent:log` / etc. instead of space-form. User feedback: "it still looks like the old one; it's unified now".
1181
- - **Docs updated** — README, docs/features.md, docs/FIGMA_PIPELINE.md, multi-agent.md routing table, sync.md 16-command list, setup.md, help.md, cross-cli-contract.md (20 commands inventory), shared/README.md, shared/core/multi-agent-{help,setup,sync}/SKILL.md. All enrich references replaced with channels.
1182
-
1183
- ### Removed
1184
-
1185
- - **`pipeline/commands/multi-agent/enrich.md`** — full file deleted.
1186
- - **`pipeline/skills/shared/core/multi-agent-enrich/`** — full skill dir deleted.
1187
- - **`prefs.schema.json` legacy fields** — `jiraCommentDefault`, `confluenceDefault`, `wikiDefault` removed entirely (not kept as deprecated, per user feedback). Migration handles existing prefs files.
1188
-
1189
- ### Migration for existing users
1190
-
1191
- Run `/multi-agent:update` — it runs `migrate-prefs.mjs` which:
1192
- 1. Copies `jiraCommentDefault` → `reportChannels.jira` (default `true`).
1193
- 2. Copies `confluenceDefault` → `reportChannels.confluence` (default `false`).
1194
- 3. Copies `wikiDefault` → `reportChannels.wiki` + `wikiScope` (true → [main,ios,screenshots,index]; false → []).
1195
- 4. Initializes `reportContent` defaults and `autopilotReportTimeoutSeconds=1800`.
1196
- 5. Removes legacy fields from your prefs file.
1197
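The five migration steps above could look roughly like this. It is a sketch under stated assumptions, not the body of `migrate-prefs.mjs`: the fields are treated as living on one flat object, `wiki` is assumed to default to `false` when `wikiDefault` is absent, and `reportContent` is initialized empty because the changelog does not list its default values.

```javascript
// Hypothetical re-creation of the migrateReportChannels() steps listed above.
// Field nesting, the wiki fallback, and the empty reportContent are assumptions.
function migrateReportChannels(prefs) {
  const next = { ...prefs };
  const wikiOn = next.wikiDefault ?? false;

  // Steps 1-3: promote legacy booleans, keeping any already-migrated value.
  if (next.reportChannels === undefined) {
    next.reportChannels = {
      jira: next.jiraCommentDefault ?? true,
      confluence: next.confluenceDefault ?? false,
      wiki: wikiOn,
    };
  }
  if (next.wikiScope === undefined) {
    next.wikiScope = wikiOn ? ["main", "ios", "screenshots", "index"] : [];
  }

  // Step 4: initialize defaults only when missing.
  if (next.reportContent === undefined) next.reportContent = {};
  if (next.autopilotReportTimeoutSeconds === undefined) {
    next.autopilotReportTimeoutSeconds = 1800;
  }

  // Step 5: drop the legacy fields.
  delete next.jiraCommentDefault;
  delete next.confluenceDefault;
  delete next.wikiDefault;
  return next;
}

console.log(migrateReportChannels({ jiraCommentDefault: false, wikiDefault: true }));
```

Guarding every write with an `undefined` check is what makes a second run a no-op, which is one way to get the idempotency the entry mentions.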
-
1198
- If you had `enrich "target"` in scripts or aliases, replace with `channels "target"` — the diff auto-summarize behavior is preserved under the `--content auto-diff` content option (default when no pipeline log exists).
1199
-
1200
- ---
1201
-
1202
- ## [5.6.3] — 2026-04-20
1203
-
1204
- Patch: install parity gap, agent persona files missing from the Copilot CLI side. User asked "when multi-agent-pipeline is installed, does everything get installed completely for both Copilot and Claude CLI?" and an honest audit of the install.js flow revealed the gap:
1205
-
1206
- | Target | Claude Code | Copilot CLI (before) | Copilot CLI (after) |
1207
- |--------|------------|---------------------|---------------------|
1208
- | commands/ | 56 | — (uses skills instead, by design) | — (unchanged) |
1209
- | scripts/ | 71 | 71 | 71 |
1210
- | **agents/** | **6** | **❌ missing** | **✅ 6** |
1211
- | rules/ | 12 | — (user-personal CLAUDE.md convention) | — (unchanged) |
1212
- | skills/ | 409 | 434 | 434 |
1213
- | instructions | CLAUDE.md + refs | copilot-instructions.md | copilot-instructions.md (+ Sub-Agent Personas section) |
1214
-
1215
- ### Fixed
1216
-
1217
- - **`install.js` now copies `pipeline/agents/` to `~/.copilot/agents/`**, using wipe-before-copy for idempotency, the same pattern as the Claude-side install at line 462. Phase 1 explorer + Phase 4 reviewer personas (code-reviewer, explorer, ios-architect, android-architect, backend-architect, security-auditor) now resolve by filename on BOTH CLIs. Before this fix, `pipeline/commands/multi-agent.md:102-107` referenced `$HOME/.claude/agents/*.md` as the persona source, but Copilot had nothing at `$HOME/.copilot/agents/`, so the orchestrator had to improvise reviewer personas from scratch. No more.
1218
- - **`generateCopilotInstructions()` — new `## Sub-Agent Personas` section** — lists all 6 personas with filename + phase mapping so the Copilot orchestrator knows to load the persona file before each Phase 4 reviewer dispatch. Placed right before Phase 0 so it's discoverable early in the instruction read.
1219
-
1220
- ### Tests
1221
-
1222
- - **`smoke-cross-cli-behavior.sh`** — 17 → 19 assertions (Step 8): enforces all 6 persona files present on both CLIs AND the `agents/` file count matches between sides. Regression-proof: any future install.js drift that ships agents to only one CLI will fail this smoke.
1223
- - **Full `npm test`**: 39 suites, **0 warnings**, 0 failures. `cross-cli-behavior: 19 passed`, `phase-tracker: 44 passed`.
1224
-
1225
- ### Notes
1226
-
1227
- - Why `rules/` isn't a parity gap: the 12 files under `pipeline/rules/` (code-style, testing, tdd, swiftui-qa, etc.) are user-personal modular rules loaded by Claude Code's `~/.claude/CLAUDE.md` convention — they're not part of the pipeline's runtime contract (`multi-agent.md` references `refs/rules.md`, not `~/.claude/rules/*`). Copilot's equivalent is content inside `copilot-instructions.md`, which the generator already writes. No missing behavior.
1228
- - Why `commands/` isn't a gap: Claude Code uses slash commands (`/multi-agent:dev`), Copilot uses skills (`multi-agent-dev`). The skills tree already contains byte-identical invocation surface on both sides. Documented asymmetry in `refs/cross-cli-contract.md § 1.2`.
1229
-
1230
- ## [5.6.2] — 2026-04-20
1231
-
1232
- Patch: drift cleanup for Copilot CLI + tracker visual polish. User told me bluntly "you keep skipping things" and asked me to list exactly what I hadn't finished. Two items: (1) **162 lines of pre-v5.0 stale pipeline content** had been festering in users' `~/.copilot/copilot-instructions.md` across every upgrade since v5.0, and (2) the progress bar design polish the user specifically asked for had been brushed off.
1233
-
1234
- ### Fixed — pre-v5.0 Copilot drift survives every upgrade
1235
-
1236
- - **`install.js` now scrubs `## Multi-Agent Task Orchestrator` + `## Instruction Sync` sections** on every Copilot install. Previous install logic only replaced content *after* the `# Multi-Agent Development Pipeline` marker — so any content written by older install scripts (before the stable marker existed) stayed in the file forever. In the user's case that meant:
1237
- - "Full 9 phases" mode table (actually 8 since v5.1.0)
1238
- - "Phase 6.5: WIKI" line (merged into Phase 7 in v5.1.0)
1239
- - ALL-CAPS phase labels (`Phase 3: DEV`, `Phase 4: REVIEW` — converted to sentence case in v5.3.8)
1240
- - `--dev autopilot` flag syntax referenced as primary (replaced by dedicated `multi-agent-dev-autopilot` in v5.4.0)
1241
- - `/sync-instructions to-copilot` references (command retired)
1242
- - **Matcher pins to explicit end-anchors** — `## Git Identity Routing`, `## GitHub Account Routing`, `## Hooks & Context Management`, `# Multi-Agent Development Pipeline`. Generic `\n## ` / `\n# ` would have eaten bash code-fence comments (`# Personal repos (default)` etc.) and truncated early, leaving orphan content. Tested against the author's 844-line file → 739 lines post-scrub, personal sections 100% intact, zero stale drift references.
1243
- - **Console output announces the scrub** — `Updated existing pipeline section in copilot-instructions.md (also scrubbed pre-v5.0 drift section)` — so users know when their file was cleaned.
1244
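The scrub with explicit end-anchors can be illustrated with a line-based sketch. The changelog describes a regex in `install.js`; the version below swaps in a plain line scan to show the same idea, and `scrubSections` is a hypothetical name. Only the heading strings are taken from the entries above.

```javascript
// Hypothetical line-based version of the scrub (the real installer uses a regex).
// A section is dropped from its start heading up to the next explicit end-anchor.
const START_HEADINGS = ["## Multi-Agent Task Orchestrator", "## Instruction Sync"];
const END_ANCHORS = [
  "## Git Identity Routing",
  "## GitHub Account Routing",
  "## Hooks & Context Management",
  "# Multi-Agent Development Pipeline",
];

function scrubSections(text) {
  const kept = [];
  let skipping = false;
  for (const line of text.split("\n")) {
    if (START_HEADINGS.includes(line)) skipping = true;
    else if (skipping && END_ANCHORS.includes(line)) skipping = false;
    if (!skipping) kept.push(line);
  }
  return kept.join("\n");
}

const sample = [
  "## Multi-Agent Task Orchestrator",
  "```bash",
  "# Personal repos (default)", // a bash comment, not a heading: must not end the scrub
  "```",
  "## Git Identity Routing",
  "personal content",
].join("\n");
console.log(scrubSections(sample));
```

Matching whole lines against a fixed allowlist is what keeps a `# ...` comment inside a code fence from being mistaken for the end of the stale section.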
-
1245
- ### Fixed — progress bar design polish
1246
-
1247
- - **Tracker phase bars wrapped in `[` `]` brackets.** User ask: "a nice design would be good; it would look better if the remaining parts were dotted". The `█░` pattern was already dotted on the pending side, but without visual delimitation the bar blurred into adjacent text at certain terminal widths. Brackets now delimit the fixed 16-char region so the progress reads cleanly at any bar fill level.
1248
- ```
1249
- │ ✓ Phase 0 Init [░░░░░░░░░░░░░░░░] · 230 tok
1250
- │ ● Phase 3 Dev [████████████████] 1s · 18.2k tok
1251
- │ ○ Phase 4 Review [░░░░░░░░░░░░░░░░]
1252
- ```
1253
-
1254
- ### Tests
1255
-
1256
- - **`smoke-phase-tracker.sh`**: 44/44 pass (bracket change is cosmetic; existing assertions still match the bar substring).
1257
- - **Full `npm test`**: 39 suites, 0 warnings, 0 failures.
1258
-
1259
- ### Notes
1260
-
1261
- - Users upgrading from v5.6.1 or earlier get the drift scrub automatically on next `node install.js --all` or `--copilot`. One-way: old content is removed without backup. If a user had manually added custom content inside `## Multi-Agent Task Orchestrator` they'll need to move it to a differently-named section first.
1262
- - Personal user sections (identity routing, gh account routing, hooks) are preserved via explicit allowlist in the regex end-anchor.
1263
-
1264
- ## [5.6.1] — 2026-04-20
1265
-
1266
- Patch: cross-CLI parity drift-sweep uncovered 3 real gaps. User feedback: "Claude Code and Copilot CLI should work exactly the same; something always gets skipped". Parity audit (via Explore subagent) flagged 2 missing sections in the Copilot instruction generator + 1 stale token budget, all now fixed.
1267
-
1268
- ### Fixed — cross-CLI parity
1269
-
1270
- - **Copilot instructions now expose `prefs.global.promptLanguage` (en/tr)** — Claude's Phase 6 local-checkout prompt, Plan Approval Gate, and branch picker labels all honor the setting, but the generated `copilot-instructions.md` had no mention of it. Copilot orchestrator couldn't discover the feature. `install.js` `generateCopilotInstructions()` now includes a **Prompt Language (en/tr)** section explaining what it controls (interactive prompts only), what stays English (commits, PR bodies, Jira comments, external payloads), and the `PHASE_LANG env auto-reads prefs` fallback for `phase-banner.sh`.
1271
- - **Copilot instructions enumerate the 2-model vs 3-model reviewer asymmetry** — Phase 4 line previously said just "parallel review + Opus triage". Users couldn't see that Copilot CLI gets an additional GPT-5.4 reviewer (edge cases + cross-provider diversity) that Claude Code can't reach. Now spelled out inline in the Pipeline Overview Phase 4 bullet, flagged as "the only intentional cross-CLI asymmetry for Phase 4".
1272
-
1273
- ### Fixed — live token increment
1274
-
1275
- - **`phase-tracker.sh tokens` action now re-renders.** v5.5.0 added token accumulation but `tokens` was silent (state updated, no output). User complaint: "it would be nice if you also showed how many tokens it spends, incrementing live". Now each `tokens` call re-renders the tracker so the active phase's `· Nk tok` label ticks up live as LLM dispatches complete. Tight accumulation loops can opt out with `TRACKER_QUIET=1` (state still writes, render suppressed).
1276
-
1277
- ### Fixed — stale token budget threshold
1278
-
1279
- - **`phase-7-report`: 3400 warn → 4100 warn, 3850 max → 4700 max.** The v5.5.0 wiki visible-skip contract + sub-step tracker registration, plus the v5.6.0 multi-repo integration host memory, pushed Phase 7 content from well under the old warn threshold to 3749 tokens (above the stale warn, still under max). Re-baselined per the doc's stated rule (`warn = current+10%, max = current+25%`). `npm test` now emits `0 warnings` on the whole suite instead of 1.
1280
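The re-baseline arithmetic above works out as follows. The `+10%` / `+25%` rule and the 3749-token figure come from the entry; rounding to the nearest 100 is an assumption inferred from the published 4100 / 4700 values, and `rebaseline` is a hypothetical name.

```javascript
// Hypothetical helper reproducing the stated rule: warn = current+10%, max = current+25%.
// Rounding to the nearest 100 is an assumption inferred from the published thresholds.
function rebaseline(currentTokens) {
  const roundTo100 = (n) => Math.round(n / 100) * 100;
  return {
    warn: roundTo100(currentTokens * 1.1),
    max: roundTo100(currentTokens * 1.25),
  };
}

console.log(rebaseline(3749));
```

Unrounded, the two values are about 4124 and 4686, which is why some rounding step is assumed here.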
-
1281
- ### Tests
1282
-
1283
- - **`smoke-phase-tracker.sh`** 41 → 44 assertions: tokens re-render visible on active phase, `TRACKER_QUIET=1` suppresses render but still accumulates state.
1284
- - **Full `npm test`**: 39 smoke suites, **0 warnings**, 0 failures. 10 eval fixtures + 7 schema validations all green.
1285
-
1286
- ### Notes
1287
-
1288
- - Parity asymmetries explicitly documented in `cross-cli-contract.md` (native TaskCreate on Claude + `phase-tracker.sh` on both, slash-routing vs skill-dispatch, hooks vs bash) remain by design — the audit confirmed no other gaps hide behind them.
1289
- - Website (mmerterden.dev) update in same release cycle: `ArchitectureDiagram.tsx` command catalog now renders BOTH Claude slash form (`/multi-agent:dev`) AND Copilot dash form (`multi-agent-dev`) side-by-side per row. `PipelineSimulator.tsx` gets full EN/TR translation via `useI18n` + 19 new simulator keys.
1290
-
1291
- ## [5.6.0] — 2026-04-20
1292
-
1293
- Minor — post-development multi-repo integration build with learning layer. When a task touches ≥2 repos with a producer→consumer dependency (e.g. shared codegen library + consuming UI library), the pipeline now builds the **host project** that integrates them before commit/PR. First encounter asks once, persists to `prefs.global.multiRepoIntegrationHosts`, and auto-applies on subsequent runs.
1294
-
1295
- ### Why
1296
-
1297
- Codegen outputs (identifiers, localization keys, tokens) in Repo A get referenced by Repo B source, which is consumed as a submodule or SPM/Gradle dependency by host Repo C. Changes in A or B can silently break C if key structures diverge (nested enum vs flat access pattern). Building repos in isolation gave false confidence; skipping caused post-merge build failures that required follow-up fix PRs.
1298
-
1299
- ### Added
1300
-
1301
- - **`pipeline/commands/multi-agent/refs/multi-repo-integration-build.md`** — contract doc. Lookup via sorted `repoSet` equality, learn-once prompt (3 options: register host / mark `noHost` / skip-this-run), auto-run loop (submodule update → resolve packages → build → evaluate errors vs baseline), autopilot contract (refuses to prompt on unknown combos — logs `SKIPPED — run in normal mode once to teach` and proceeds).
1302
- - **Phase 6 Step 0** — `phase-6-commit.md` gains `Step 0 — Multi-Repo Integration Build (v5.6.0+)`. Fires only when `state.projects.length >= 2`; single-repo tasks skip silently with zero overhead. Tracker sub-step `6.0 Integration build` shows live progress; errors dispatch the 3-option fix prompt (return to Phase 3 / pause for manual fix / override).
1303
- - **`prefs.schema.json` — `global.multiRepoIntegrationHosts[]`** — registry schema. Required: `repoSet` (sorted array ≥2). Either `(hostPath + platform)` or `(noHost: true)`. Optional: `hostScheme`, `submodulePaths`, `resolveCommand`, `buildCommand`, `lastUsed`, `count`, `lastResult` (`success`/`failed`/`skipped`). `platform` enum covers `ios`/`android`/`mixed`.
1304
- - **Phase 7 knowledge capture** — when a learn-once prompt writes a new entry, Phase 7 Step 5 also writes a `reference` memory (`multi-repo integration host — {combo}`). These memories never auto-prune; combos may stay dormant for many runs but remain indexed for future teammates.
1305
- - **Generic rule in `install.js` `generateCopilotInstructions()`** — `Post-Development Integration Build (Multi-Repo) — MANDATORY (v5.6.0+)` block propagates to Copilot CLI's `copilot-instructions.md`.
1306
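The "lookup via sorted `repoSet` equality" mentioned above might be sketched like this. `findIntegrationHost` and the sample registry values are hypothetical; only the field names (`repoSet`, `hostPath`, `platform`, `noHost`) come from the schema description.

```javascript
// Hypothetical lookup: two repo sets match when their sorted contents are equal.
function findIntegrationHost(registry, touchedRepos) {
  const keyOf = (repos) => [...repos].sort().join("\u0000");
  const wanted = keyOf(touchedRepos);
  return registry.find((entry) => keyOf(entry.repoSet) === wanted) ?? null;
}

// Illustrative registry entries; repo names and the host path are made up.
const registry = [
  { repoSet: ["codegen-lib", "ui-lib"], hostPath: "/work/host-app", platform: "ios" },
  { repoSet: ["docs", "ui-lib"], noHost: true },
];

console.log(findIntegrationHost(registry, ["ui-lib", "codegen-lib"]));
console.log(findIntegrationHost(registry, ["ui-lib"]));
```

Sorting both sides before comparing makes the lookup independent of the order in which the task touched the repos; a `null` result corresponds to the unknown-combo case that triggers the learn-once prompt.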
-
1307
- ### Tests
1308
-
1309
- - **`smoke-multi-repo-integration.sh`** — new 20-assertion suite. Verifies schema structure (required `repoSet`, `minItems=2`, platform enum), contract-doc sections (Why/When/Learn-once/Autopilot/noHost/tracker), phase-6 cross-reference, `install.js` generator inclusion, sorted-equality lookup, negative-entry (`noHost: true`) handling.
1310
- - **Full `npm test`**: 39 smoke suites + 10 eval fixtures + 7 schema validations, all green.
1311
-
1312
- ### Notes
1313
-
1314
- - Fully backward-compatible. Single-repo tasks see zero change. Existing multi-repo runs get one learn-once prompt the first time the combo is touched, then auto-run every subsequent time.
1315
- - Autopilot behavior is explicit: an unknown combo refuses to prompt (won't silently build the wrong thing) and proceeds with a logged skip. A known combo with `count ≥ 3` and `lastResult == success` still treats a new failure as blocking — confidence lowers surprise, not safety.
1316
- - Error evaluation uses baseline-delta when available: if the host already had N pre-existing errors before our changes, we treat `current ≤ N` as zero new errors and proceed; the pre-existing count is logged in the Phase 7 Report.
1317
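The baseline-delta rule in the last note reduces to a small calculation. The sketch below is illustrative only; `countNewErrors` is a hypothetical name, and treating a missing baseline as zero pre-existing errors is an assumption.

```javascript
// Hypothetical helper: errors beyond the pre-existing baseline count as new.
// Assumption: with no baseline available, every current error is treated as new.
function countNewErrors(currentErrors, baselineErrors) {
  const baseline = baselineErrors ?? 0;
  return Math.max(0, currentErrors - baseline);
}

console.log(countNewErrors(4, 4)); // current <= baseline: nothing new, proceed
console.log(countNewErrors(6, 4)); // two errors introduced by the change
```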
-
1318
- ## [5.5.0] — 2026-04-20
1319
-
1320
- Minor: per-phase token tracking in live tracker + Phase 7 visibility contract. User feedback drove both: "it would be nice to see how many tokens each step spent" and "the wiki step asked nothing during component production; is that normal?".
1321
-
1322
- ### Added — Per-phase token tracking
1323
-
1324
- - **`phase-tracker.sh tokens <phase_id> <in> <out>`** — new action accumulates LLM token usage against the active phase. Multiple calls on the same phase sum (so each LLM dispatch during a long phase is just another `tokens` call with that call's usage). Non-negative integer validation rejects malformed input with exit 64.
1325
- - **State schema** — `phases[].tokens_in` and `phases[].tokens_out` (both optional, default 0 on read). Old state files without these fields render unchanged.
1326
- - **Render shows token column + Total footer** — each phase line now has ` · Nk tok` next to its elapsed label (omitted when zero). Below the phase list, a separator + `Total 34.8k tok` footer sums all phases. Format: raw count under 1000 (`230`), k-suffix otherwise (`3.8k`, `18.7k`).
1327
- - **Example live render** during a typical component-production run:
1328
-
1329
- ```
1330
- ╭─ Pipeline: PROJ-12345 ────────────────────────────────────╮
1331
- │ ✓ Phase 0 Init █░░░░░░░░░░░░░░░ 5s · 230 tok
1332
- │ ✓ Phase 1 Analysis ██░░░░░░░░░░░░░░ 12s · 3.8k tok
1333
- │ ✓ Phase 3 Dev ████████████████ 2m 35s · 18.7k tok
1334
- │ ✓ Phase 4 Review ████░░░░░░░░░░░░ 37s · 8k tok
1335
- │ ● Phase 7 Report ██░░░░░░░░░░░░░░ 21s · 450 tok
1336
- │ ────────────────────────────────────────────────────────
1337
- │ Total 34.8k tok
1338
- ╰────────────────────────────────────────────────────────────╯
1339
- ```
1340
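The token label format shown in the render above (raw count under 1000, k-suffix otherwise) could be produced by something like the following. `formatTokens` is a hypothetical name; one-decimal rounding and dropping a trailing `.0` are assumptions inferred from labels such as `3.8k` and `8k`.

```javascript
// Hypothetical formatter for the token column.
// One-decimal rounding and the trailing ".0" trim are inferred from the example render.
function formatTokens(count) {
  if (count < 1000) return String(count);
  const k = (count / 1000).toFixed(1);
  return (k.endsWith(".0") ? k.slice(0, -2) : k) + "k";
}

for (const n of [230, 3800, 8000, 18700]) {
  console.log(`${formatTokens(n)} tok`);
}
```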
-
1341
- ### Added — Phase 7 visibility contract (was v5.4.1 before bundling)
1342
-
1343
- - **Sub-step tracking enforced at Phase 7 entry** — register all 5 Phase 7 sub-steps (Jira comment, Wiki+Figma, Confluence, Log+Telemetry, Knowledge+Memory) as tracker sub-phases with status `pending`, then advance through `in_progress` → `completed` / `failed` / `skipped` as each step runs. Previously sub-steps were invisible in the live tracker and skipped steps left no trace.
1344
- - **Wiki step visible-skip contract** — component tasks where `figma-config.wiki.enabled` or the Figma Keychain token are missing used to silent-skip. Now, when preconditions aren't met, the step prints a preconditions table and a 4-option prompt:
1345
- 1. Add Figma token + create figma-config.json (Token Save Flow)
1346
- 2. Pick a wiki adapter now + persist to project config
1347
- 3. Skip this run only
1348
- 4. Disable wiki for this project forever
1349
- Autopilot honors `prefs.global.wikiDefault` without prompting. Log on skip names the specific reason ("no figma token" / "wiki.enabled=false" / "figma-config.json missing").
1350
- - **Jira comment optional user note** — Phase 7 Step 1 now asks "Add a note to the Jira comment?" before the humanizer pass. User-typed content appends as an `h2. Notes` section below test scenarios. `no` persists to `prefs.global.jiraCommentUserNote = "never-ask"` so subsequent runs skip the prompt. `--dev` + autopilot skip the prompt (zero-interaction contract).
1351
-
1352
- ### Changed — generator docs
1353
-
1354
- - **`install.js` `generateCopilotInstructions()`** — Progress Tracking section now documents the `tokens` action and the Phase 7 sub-step registration protocol. Copilot's instruction surface gets the v5.5.0 example render with token columns + Total footer.
1355
-
1356
- ### Tests
1357
-
1358
- - **`smoke-phase-tracker.sh`** — 32 → 41 assertions (Steps 15–18 added): `tokens` accumulation, integer validation, render output format (raw counts + k-suffix + Total footer), pending-phase no-label guard.
1359
- - **Full `npm test`**: 38 smoke suites + 10 eval fixtures + 7 schema validations, all green.
1360
-
1361
- ### Notes
1362
-
1363
- - Backward-compatible. Old state files without `tokens_in`/`tokens_out` fields render fine (no token column). Phase 7 docs are specs for the LLM orchestrator; behavior depends on the orchestrator honoring the new sub-step + visible-skip contract.
1364
- - Token values are the orchestrator's responsibility to record. The pipeline uses `phase-tracker.sh tokens <N> <in> <out>` after each LLM dispatch. If your orchestrator doesn't emit these, the live render just omits the token column — no breakage.
1365
- - Wiki visibility patch addresses a real user report: a component was generated but Phase 7 Step 2 silently skipped because the user's `keychainMapping.figma` was `null` and no `figma-config.json` existed in the project. Before v5.5.0 they'd have no way to know without parsing the log. Now they see the preconditions table and a menu.
1366
-
1367
- ## [5.4.0] — 2026-04-20
1368
-
1369
- Minor — two additive features. No breaking changes to existing surfaces. Bumped to **5.4.0** (not a patch) because new slash commands and a new state-file schema field are surface additions.
1370
-
1371
- ### Added — Live progress bar in phase tracker
1372
-
1373
- - **`phase-tracker.sh` now renders per-phase ASCII progress bar + elapsed time** in the live card stack. Previously the tracker only showed glyph + status word (`● active` / `✓ done`); users reported wanting the `log-format.md` style bars (already in the post-completion summary) to appear **during** the run so they can see relative phase durations in real time.
1374
- - **State schema additions** — each phase now carries `started_at` (stamped on first transition to `in_progress`, idempotent on repeats) and `completed_at` (stamped on terminal states `completed` / `failed` / `skipped`). Old state files without these fields render without bars/elapsed — graceful fallback verified by smoke.
1375
- - **Elapsed computation** — completed phases use `completed_at − started_at` (stamped); in-progress phases use `now − started_at` (live, updates on every `render` call); pending phases show empty bar.
1376
- - **Bar normalization** — 16-char width (`█` filled, `░` empty). Filled width = `round(16 × phase_elapsed / max_elapsed)` where max is the longest phase in the current run. This matches `log-format.md` post-completion summary so live and final views use the same visual vocabulary.
1377
- - **Format** — elapsed shown as `5s` / `2m` / `2m 35s`. Bar color follows status glyph (cyan = in_progress, green = done, red = failed, yellow = skipped, dim = pending).
1378
- - **Portable time math** — `iso_to_epoch()` helper tries BSD `date -j -f` first, falls back to GNU `date -d`. Works on macOS and Linux without GNU coreutils.
1379
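The bar normalization and elapsed format described in the list above can be sketched as below. The real tracker is a shell script (`phase-tracker.sh`), so this JavaScript version is only an illustration of the stated rule; `renderBar` and `formatElapsed` are hypothetical names.

```javascript
// Illustrative re-creation of the rule: filled = round(16 * phase_elapsed / max_elapsed).
const BAR_WIDTH = 16;

function renderBar(elapsedSeconds, maxElapsedSeconds) {
  const raw = maxElapsedSeconds > 0
    ? Math.round((BAR_WIDTH * elapsedSeconds) / maxElapsedSeconds)
    : 0;
  const filled = Math.min(BAR_WIDTH, Math.max(0, raw));
  return "█".repeat(filled) + "░".repeat(BAR_WIDTH - filled);
}

// Elapsed labels follow the documented shapes: 5s / 2m / 2m 35s.
function formatElapsed(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  if (minutes === 0) return `${seconds}s`;
  return seconds === 0 ? `${minutes}m` : `${minutes}m ${seconds}s`;
}

// Durations in seconds for three phases; the longest one sets the scale.
const phases = [["Dev", 155], ["Review", 37], ["Commit", 3]];
const longest = Math.max(...phases.map(([, s]) => s));
for (const [name, s] of phases) {
  console.log(`${name.padEnd(8)} ${renderBar(s, longest)} ${formatElapsed(s)}`);
}
```

With the longest phase at 2m 35s, a 37-second phase fills 4 of 16 cells and a 3-second phase rounds down to an empty bar.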
-
1380
- Example render after a typical component-production run:
1381
-
1382
- ```
1383
- ╭─ Pipeline: PROJ-12345 ────────────────────────────────────╮
1384
- │ ✓ Phase 0 Init █░░░░░░░░░░░░░░░ 5s
1385
- │ ✓ Phase 1 Analysis █░░░░░░░░░░░░░░░ 12s
1386
- │ ✓ Phase 2 Planning █░░░░░░░░░░░░░░░ 8s
1387
- │ ✓ Phase 3 Dev ████████████████ 2m 35s
1388
- │ ✓ Phase 4 Review ████░░░░░░░░░░░░ 37s
1389
- │ ↷ Phase 5 Test ░░░░░░░░░░░░░░░░
1390
- │ ✓ Phase 6 Commit ░░░░░░░░░░░░░░░░ 3s
1391
- │ ● Phase 7 Report ██░░░░░░░░░░░░░░ 21s
1392
- ╰────────────────────────────────────────────────────────────╯
1393
- ```
1394
-
1395
- ### Added — Local command variants
1396
-
1397
- - **Three new dedicated commands**, each paired with Copilot skill: `:local` / `-local` (full 8-phase pipeline on current branch, no worktree), `:dev-local` / `-dev-local` (fast mode + local), `:dev-local-autopilot` / `-dev-local-autopilot` (fastest + local, zero interaction). Previously the only way to combine modes was the flag form `multi-agent "task" --dev --local autopilot`, which the pipeline supported but wasn't surfaced as a slash command — parallel to existing `:dev`, `:autopilot`, `:dev-autopilot` dedicated forms.
1398
- - **New files**:
1399
- - `pipeline/commands/multi-agent/local.md` + `pipeline/skills/shared/core/multi-agent-local/SKILL.md`
1400
- - `pipeline/commands/multi-agent/dev-local.md` + `pipeline/skills/shared/core/multi-agent-dev-local/SKILL.md`
1401
- - `pipeline/commands/multi-agent/dev-local-autopilot.md` + `pipeline/skills/shared/core/multi-agent-dev-local-autopilot/SKILL.md`
1402
- - **Routing table updated** (`pipeline/commands/multi-agent.md`): `local`, `dev-local`, `dev-local-autopilot` subcommands delegate to their dedicated files.
1403
- - **Help docs updated** — `help.md` and `multi-agent-help/SKILL.md` command tables now enumerate all six dedicated mode commands with their flag-form equivalents.
1404
- - **Install surface** — `install.js` `generateCopilotInstructions()` Modes section now lists all 7 modes (Normal / Fast / Autopilot / Fastest / Local / Fast+Local / Fastest+Local).
1405
-
1406
- ### Changed
1407
-
1408
- - **`install.js` Progress Tracking section** rewritten to document the new state schema (started_at/completed_at stamping) and the bar+elapsed render output. Bootstrap example now shows the full 8-phase tile registration loop instead of a placeholder `init 8` call.
1409
-
1410
- ### Tests
1411
-
1412
- - **`smoke-phase-tracker.sh`** extended from 19 to 32 assertions. New coverage (Steps 10–14): started_at stamping, completed_at stamping on all three terminal states, idempotent repeat updates, 16-char bar render with filled/empty characters, elapsed formatting (`5s` + `2m 40s`), longest-phase normalization, graceful fallback on old-schema state files.
1413
- - **`smoke-commands-skills-parity.sh`** — count rises from 42 to 48 (3 new local variants × 2 assertions each). No test file changes; the smoke auto-picks up new paired files.
1414
- - **Full `npm test`**: 38 smoke suites (unchanged: `smoke-phase-tracker.sh` already existed and only gained assertions) + 10 eval fixtures + 7 schema validations, all green.
1415
-
1416
- ### Notes
1417
-
1418
- - No breaking changes. Existing `--local` and `--dev --local` flag combinations continue to work exactly as before; the new slash commands are sugar. State-file schema is backward-compatible via graceful fallback.
1419
- - Users running pipelines with long-running phases will see bars update on every `phase-tracker.sh render` call — recommend calling `render` periodically (e.g. during long build steps) so the in-progress bar visibly grows. Progress-contract lines (`→ <verb> <object>`) complement the bars: lines say *what* is happening right now, bars say *how long* it has been running.
1420
-
1421
- ## [5.3.8] — 2026-04-20
1422
-
1423
- Patch — phase label sentence-case normalization across the entire codebase.
1424
-
1425
- ### Changed
1426
-
1427
- - **Phase labels unified to sentence case** (e.g. `Init`, `Dev`, `Review`, `Commit`, `Report`) across 22 files; previously they were a mix of ALL CAPS and Title Case. User feedback (translated from Turkish): "the steps should be in sentence case, not all capitals". The ALL CAPS style (`Phase 3: DEVELOPMENT`, `Phase 4: REVIEW`, `Phase 6: COMMIT`) felt shouty and was inconsistent with the canonical label table in `multi-agent/SKILL.md`, which already used Title Case.
1428
- - **Short forms over long forms** — aligned to the short canonical (matching `:dev`, `:test`, etc. skill command names and `phase-N-*.md` file names): `Development` → `Dev`, `User Test` → `Test`. Keep `Commit & PR` (the "& PR" is a legitimate secondary marker).
1429
- - **Affected surfaces** — README, CHANGELOG, install.js (`generateCopilotInstructions()`), all 8 phase docs (`refs/phases/phase-*.md`), orchestrator skills (`multi-agent/SKILL.md`, `multi-agent-dev/SKILL.md`, `multi-agent-dev-autopilot/SKILL.md`, `multi-agent-help/SKILL.md`), dev/dev-autopilot command docs, sync.md, progress-contract.md, knowledge.md, architecture.md, features.md, and smoke-phase-banner.sh assertions. Canonical label table in `SKILL.md` regenerated with consistent short forms. `phase-banner.sh` label table aligned (`en:3` → `Dev`, `en:5` → `Test`).
1430
- - **22 files, ~110 lines each way** — pure content replacement, no structural changes.
1431
-
1432
- ### Notes
1433
-
1434
- - No schema changes. No CLI surface changes.
1435
- - Full `npm test`: 38 smoke suites + 10 eval + 7 schema all green. `smoke-phase-banner` updated to match the new banner labels.
1436
- - `CHANGELOG.md` itself NOT normalized (historical entries preserve their original wording — the prior ALL CAPS usage was the state at the time of those releases).
1437
-
1438
- ## [5.3.7] — 2026-04-20
1439
-
1440
- Patch — Phase 0 multi-repo branch-picker contract clarification.
1441
-
1442
- ### Fixed
1443
-
1444
- - **Per-repo branch picker contract made unambiguous.** v5.3.6's MANDATORY CONTRACT block in `multi-agent/SKILL.md` enforced the 8-step Phase 0 flow and called out "mandatory branch picker", but didn't spell out that in **multi-repo mode the branch picker MUST fire SEPARATELY for each selected repo**. User report: a component-production task that selected component-repo + consumer-app-repo should have asked "base branch for component repo?" (expected `iteration/develop`) **and** "base branch for consumer app?" (expected `develop`) — two prompts, not one. The standard contract (already documented in `refs/phases/phase-0-init.md` at L221 "the prompt fires per-repo") was silently collapsing into a single branch applied to all repos.
1445
- - **Multi-repo per-repo loop now explicit on both CLIs.** Updated `multi-agent/SKILL.md` Phase 0 contract block with per-repo callouts for Steps 5 (branch picker), 7 (git identity — per repo), and 8 (worktree creation — serial per repo). Each step now carries an explicit phase-0-init.md line-number reference so implementers know where to look for the full schema. Same enumeration mirrored in `install.js` `generateCopilotInstructions()` for Copilot-side visibility.
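
The per-repo contract can be pictured as a loop in which the branch prompt is never shared across repos. The following JavaScript sketch is illustrative only: `planPhase0`, `pickBranch`, and the step labels are hypothetical names, and the real contract is prose in `SKILL.md` and `phase-0-init.md`, not code.

```javascript
// Hypothetical sketch: Steps 5, 7 and 8 of Phase 0 run once per selected repo.
// `pickBranch` stands in for the interactive base-branch prompt.
function planPhase0(selectedRepos, pickBranch) {
  return selectedRepos.map((repo) => ({
    repo,
    // Step 5: the prompt fires separately for each repo, so two repos
    // can end up with two different base branches.
    baseBranch: pickBranch(repo),
    // Steps 7 and 8 are likewise per repo (worktrees are created serially).
    perRepoSteps: ["branch-picker", "git-identity", "worktree"],
  }));
}
```

For the component-production case in the user report, a picker that answers `iteration/develop` for the component repo and `develop` for the consumer app yields two distinct base branches, which is the behavior the clarified contract asks for.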
1446
-
1447
- ### Notes
1448
-
1449
- - No schema changes, no new tests, no CLI surface changes. Contract clarification only — the underlying mechanism (per-repo fetch, per-repo worktree, shared branch name, rollback-on-failure) was already in `phase-0-init.md` and `agent-state.schema.json`. This patch fixes the way both LLM-facing surfaces (shared `SKILL.md` + generated Copilot instructions) describe that mechanism so it stops being silently short-circuited.
1450
-
1451
- ## [5.3.6] — 2026-04-20
1452
-
1453
- Patch — Copilot CLI experience fixes surfaced in real pipeline runs + README post-v5.3.5 doc drift sweep.
1454
-
1455
- ### Fixed — Copilot CLI pipeline experience
1456
-
1457
- - **Copilot now sees phase progress** — `generateCopilotInstructions()` in `install.js` was producing a pipeline-section in `~/.copilot/copilot-instructions.md` that mentioned 9 phases (stale) and, more importantly, **never told Copilot to invoke `phase-tracker.sh` or `phase-banner.sh`**. Claude Code has the native `TaskCreate` UI for phase tiles; Copilot has nothing without these shell scripts. Users reported "phase-by-phase view didn't appear" — this is the root cause. The generator now documents the mandatory tracker bootstrap (`phase-tracker.sh init 8`), per-boundary `update <N> active/done`, optional banner, and the progress-line contract (`→ <verb> <object>`).
1458
- - **Phase 0 interactive steps enforced on BOTH CLIs** — the shared orchestrator `multi-agent/SKILL.md` had a condensed 2-step Phase 0 ("Step 1 — parse input" + "Step 2 — worktree"). Users reported the pipeline skipping branch selection AND multi-repo picker on **both** Claude Code and Copilot CLI for component-production tasks (where the correct base branch is `iteration/develop`, not `develop`, and the task spans multiple repos). The condensed version was silently winning over the full 8-step `refs/phases/phase-0-init.md` contract on both sides. Fix: **SKILL.md Phase 0 now opens with a MANDATORY CONTRACT block enumerating all 8 steps** with explicit "do NOT short-circuit" language, calling out the two critical UX gaps (multi-repo selection + mandatory branch picker). The condensed walkthrough stays for implementation detail but is explicitly labeled as "does not replace the full contract". The generated Copilot instructions duplicate this enumeration so Copilot — which has no slash-command routing to the ref file — sees it at the top of its instruction pipeline.
1459
- - **`generateCopilotInstructions()` phase list corrected** — was listing 9 phases (WIKI broken out as its own phase); actual pipeline has 8 phases (Phase 0–7), wiki is a sub-step of Phase 7 REPORT. Now matches reality. Mode names also updated to the dedicated commands (`multi-agent-dev`, `multi-agent-autopilot`, `multi-agent-dev-autopilot`) instead of flag forms.
1460
- - **Permissions expectation documented**: a new section in the generated instructions lists the command groups the pipeline uses regularly (`cd`, `ls`, `git *`, `gh *` including the previously-omitted `gh auth switch`, `node/npm/npx`, `security *`, `bash` for pipeline scripts). Users still own `~/.copilot/permissions-config.json`; the pipeline never mutates it. This is guidance for what to allow.
1461
- - **Stale path references removed** — `generateCopilotInstructions()` had two unused `_phasesDir` / `_mainFile` variables pointing at `pipeline/commands/multi-agent/phases/` which does not exist (real path is `refs/phases/`). Removed.
1462
- - **Stack swap documented in Copilot instructions** — added section covering the `multi-agent-stack` command + auto-detect script, mirroring the Claude Code docs.
1463
-
1464
- ### Changed — README post-v5.3.5 sweep
1465
-
1466
- - **Stale test counts corrected** — README said "128 assertions across three layers" and "10 scripts, 115 assertions" for smoke. Real numbers after v5.3.5's new `smoke-stack-swap.sh`: 38 smoke suites, ~675 smoke assertions, 66 node unit tests across 14 suites, 10 eval fixtures, 7 schema validations — roughly 760 total gates across four layers.
1467
- - **`.github/workflows/` tree removed from What's Included** — GitHub Actions was removed in v5.2.1; README tree was still listing `smoke.yml` + `release.yml`. Tree trimmed to reality.
1468
- - **"145 unified skills" → "149 unified skills"** in 5 locations (Supported CLIs table, mermaid ecosystem diagram, shared/ tree, Ecosystem Sync table). Actual count: `shared/core/` 22 + `shared/external/` 127 = 149. Also split the `shared/` tree in What's Included to show the `core/` + `external/` organization introduced in v5.3.3.
1469
- - **Copilot install surface updated** — "Supported CLIs", mermaid diagram, and Ecosystem Sync table now reflect that v5.3.5 deploys `pipeline/scripts/` to `~/.copilot/scripts/` (previously Claude-only).
1470
- - **Local CI paragraph rewritten** — was framed as "backup for when GitHub Actions is unavailable" (pre-v5.2.1 reality); now correctly states "CI is local-only via pre-push hook".
1471
-
1472
- ### Notes
1473
-
1474
- - No schema changes, no CLI-surface changes. This patch is doc + generator-only.
1475
- - The tracker/banner wiring lands in the *instructions* file — Copilot reads that file as its system prompt. Actual invocation behavior depends on Copilot picking up the updated instructions (effective after the next session start, or immediately if the CLI reloads instructions each run).
1476
- - Users who already customized `~/.copilot/permissions-config.json` keep their customizations; this patch only ships documentation. Review the new "Permissions Expectation" section after install to see if any allowlist entries are missing.
1477
-
1478
- ## [5.3.5] — 2026-04-20
1479
-
1480
- Patch — detailed review follow-up. Closes drift sweep gaps missed in v5.3.4, hardens installer idempotency on all managed directories, and lands the first smoke suite for the `stack` command.
1481
-
1482
- ### Fixed
1483
-
1484
- - **v5.3.4 9-phase sweep was incomplete.** README was cleaned but 6 more references lived under `pipeline/`, `docs/`, and `examples/`:
1485
- - `pipeline/commands/multi-agent.md:216` (orchestrator mode-selector prompt)
1486
- - `pipeline/commands/multi-agent/help.md:58` (user-facing mode table)
1487
- - `pipeline/commands/multi-agent/dev.md:13` + `pipeline/skills/shared/core/multi-agent-dev/SKILL.md:16` (dev-mode rationale)
1488
- - `pipeline/skills/shared/core/multi-agent-help/SKILL.md:61` (Copilot mirror of help)
1489
- - `pipeline/commands/multi-agent/refs/swiftui-guide.md:254` (Figma reference)
1490
- - `docs/features.md:75` (SubPhase convention — wrongly claimed "9 phases (0-7 plus 6.5)", contradicted the v5.1.0 Phase 6.5→7 merge)
1491
- - `examples/01-bugfix-from-jira.md:5,154,166` (user-facing example transcript)
1492
- - **`sync.md` ↔ `multi-agent-sync/SKILL.md` sub-command count drift** — command doc said "16 sub-command", skill said "20". Skill was correct; command doc updated.
1493
- - **Dead path in `pipeline/rules/figma-pipeline.md:97`** — referenced `pipeline/figma/abstractions/` (removed in v5.0.0). Updated to point at the real `skills/figma-common/`, `skills/figma-ios/`, `skills/figma-android/` locations.
1494
-
1495
- ### Changed — `install.js` idempotency hardening
1496
-
1497
- - **Wipe-before-copy now applied to scripts, agents, and rules** (was only applied to commands). Previously, if an upstream pipeline release removed a script, agent, or rule, re-installing left the old file behind as a ghost in `~/.claude/scripts/` etc. Commands already had this idempotency contract; it now extends to all 100%-managed destination trees. Skills keep the existing `pruneLegacyMultiAgentSkills` behavior (skills are mixed ownership: users may install third-party skills alongside pipeline ones).
1498
- - **Copilot side now installs scripts too.** Before, `--copilot` alone left `~/.copilot/` with no `scripts/` dir, so `multi-agent-stack` (new in v5.3.4) was broken on Copilot-only installs because it referenced `~/.claude/scripts/stack-swap.sh`. Now scripts deploy to `~/.copilot/scripts/` with the same wipe-before-copy contract.
1499
- - **`multi-agent-stack/SKILL.md`** prefers `~/.copilot/scripts/stack-swap.sh` when running under Copilot, falls back to `~/.claude/scripts/stack-swap.sh` only if Copilot's copy is missing. Clearer error message if neither exists.
1500
- - **`symlinkDir()` hardening**: replaced ``execSync(`rm -rf "${dest}"`)`` with `rmSync(dest, { recursive: true, force: true })`. This avoids shell interpolation entirely and handles paths with whitespace or special characters safely.
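
The wipe-before-copy contract and the `rmSync` hardening combine naturally. A minimal sketch, assuming a hypothetical `wipeAndCopy` helper and Node.js 16.7 or later for `fs.cpSync`; it is not the code in `install.js`:

```javascript
const fs = require("node:fs");

// Hypothetical helper: make a fully managed destination tree an exact copy
// of the source tree, so files removed upstream cannot linger as ghosts.
function wipeAndCopy(srcDir, destDir) {
  // The path is passed as an argument rather than spliced into a shell
  // command, so whitespace and special characters need no quoting.
  // `force: true` turns a missing destination into a no-op.
  fs.rmSync(destDir, { recursive: true, force: true });
  fs.cpSync(srcDir, destDir, { recursive: true });
}
```

Running it twice in a row produces the same destination tree, which is the idempotency property the entry describes. It is only appropriate for directories the installer owns outright, which is why skills (mixed ownership) are handled differently.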
1501
-
1502
- ### Added
1503
-
1504
- - **`smoke-stack-swap.sh`** — 25-assertion smoke suite for the `/multi-agent:stack` backend. Covers: `status` mode (JSON output contract, no file movement), all seven forced modes (`ios`, `android`, `mobile`, `backend`, `frontend`, `fullstack`, `all`), skill directory movement between active and `skills-inactive/`, unrelated-skill isolation, unknown-arg rejection with exit 1, and exit-0 under every valid mode. Auto-picked up by `npm test` and `pre-push-check.sh` via glob.
1505
-
1506
- ### Notes
1507
-
1508
- - No schema changes, no CLI-surface changes.
1509
- - Full `npm test` result: 38 smoke suites (was 37) + 10 eval fixtures + 7 schema validations, all green. `smoke-commands-skills-parity`: 42 passed.
1510
- - Detailed review that surfaced these gaps is preserved in the commit narrative — the drift-sweep lesson is that "grep matches in the top-level README" ≠ "all files updated"; future sweeps should scan the entire repo tree.
1511
-
1512
- ## [5.3.4] — 2026-04-20
1513
-
1514
- Patch — command-surface standardization + documentation drift sweep. The `stack` subcommand joins the single-standard command pair; README catches up to the actual 8-phase pipeline and full command catalog.
1515
-
1516
- ### Added
1517
-
1518
- - **`/multi-agent:stack` (Claude Code) + `multi-agent-stack` (Copilot CLI)** — `stack` was previously an inline-only argument of the main orchestrator (`/multi-agent stack ios`), which broke the single-standard command contract every other subcommand followed. Now ships as a dedicated slash command (Claude) and dash-named skill (Copilot), parallel to `:dev` / `-dev`, `:autopilot` / `-autopilot`, `:purge` / `-purge`, etc. Delegates to `pipeline/scripts/stack-swap.sh` — no behavior change, only surface.
1519
- - `pipeline/commands/multi-agent/stack.md` (new — Claude)
1520
- - `pipeline/skills/shared/core/multi-agent-stack/SKILL.md` (new — Copilot)
1521
- - `pipeline/commands/multi-agent.md` routing table: inline `stack` handler → delegation to `stack.md`
1522
-
1523
- ### Changed
1524
-
1525
- - **`README.md`** — doc drift sweep:
1526
- - Pipeline phase count unified to **8** (was inconsistent 8 / 9 across 5 locations — tagline, "Claude Code (Full Mode)" block, Modes table, context-management note, Key Features list). Phase list itself (Phase 0 → Phase 7) was already 8; the "9-phase" label was a v4 carryover.
1527
- - Claude Code command block expanded from 5 lines to the full 20+ command catalog (`:status`, `:log`, `:resume`, `:kill`, `:review`, `:setup`, `:test`, `:enrich`, `:search`, `:scan`, `:refactor`, `:update`, `:sync`, `:purge`, `:autopilot`, `:dev-autopilot`) to match Copilot CLI side symmetry. Flag form (`--dev`, `--dev autopilot`) kept as "equivalent" footnote.
1528
- - Copilot CLI helper list filled in missing commands (`-status`, `-resume`, `-autopilot`, `-search`, `-scan`, `-refactor`, `-update`, `-sync`, `-purge`).
1529
- - `What's Included` tree: `multi-agent/` subdirectory listing expanded from 3 files to the actual 20 files. `sync-instructions.md` (never existed) removed. Stale `pipeline/figma/` tree (removed in v5.0.0) replaced with the real `skills/figma-common/` + `skills/figma-ios/` + `skills/figma-android/`.
1530
- - Modes table: flag-form rows (`Fast`, `Autopilot`, `Fastest`) converted to dedicated-command form.
1531
- - `/multi-agent setup` → `/multi-agent:setup` (missed colon).
1532
- - **`examples/README.md`** — `full 9-phase pipeline` → `full 8-phase pipeline` (one match).
1533
- - **`package.json` `description`** — same 9 → 8 correction.
1534
-
1535
- ### Notes
1536
-
1537
- - No schema changes, no install.js changes. `install.js --all` picks up the new `stack.md` and `multi-agent-stack/` automatically via existing command- and skill-tree deployment logic.
1538
- - `smoke-commands-skills-parity.sh`: 40 → 42 passed (new pair wired correctly).
1539
- - Full `npm test`: all 22 smoke suites + 10 eval fixtures + 7 schema validations green.
1540
-
1541
- ## [5.3.3] — 2026-04-17
1542
-
1543
- Patch — source reorganization. Skills split into `core/` and `external/`. Install destination unchanged.
1544
-
1545
- ### Changed
1546
-
1547
- - **`pipeline/skills/shared/` reorganized into `core/` + `external/`** (v5.3.3, closes the last v5.3.0 audit item):
1548
- - `core/` — 21 `multi-agent*` orchestration skills (pipeline-critical — changes here affect pipeline behavior)
1549
- - `external/` — 127 iOS / Android / generic skills imported from upstream (mirrors of third-party guidance)
1550
- - Install destination stays flat: both trees flatten into `~/.claude/skills/` and `~/.copilot/skills/`. Skill discovery at runtime is unchanged; 148 skills appear as 148 sibling directories after install, same as before.
1551
- - **`install.js`** — now walks `shared/core/` and `shared/external/` separately (6-line change, preserves all prune behavior + figma-tree handling).
1552
- - **Smoke tests updated** — `smoke-commands-skills-parity.sh`, `smoke-plan-approval-gate.sh`, `smoke-url-enrichment.sh` paths now point at `shared/core/`.
1553
- - **`shared/README.md`** — documents the new layout + links to ADR-0006.
1554
- - **ADR-0003** (Unified shared skills) — amended with v5.3.3 note that internal `core/`/`external/` split preserves the "single install target" decision.
1555
-
1556
- ### Added
1557
-
1558
- - **ADR-0006** (`docs/adr/0006-skills-core-external-split.md`) — rationale for the split, alternatives considered (per-stack tree, frontmatter-only categorization, leaving the flat tree alone), and preserved ADR-0003 commitments.
1559
-
1560
- ### Notes
1561
-
1562
- - No user-visible change. Install destination is identical. Skill invocation (slash commands on Claude Code, skill names on Copilot CLI) unchanged.
1563
- - Grep / Glob queries can now scope to `shared/core/**` or `shared/external/**` for faster navigation.
1564
- - Future per-stack install filter (ADR-0003's deferred idea) now has a clean `external/` unit to filter against when the stack-detection story lands.
1565
-
1566
- ## [5.3.2] — 2026-04-17
1567
-
1568
- Patch — quality-signal hygiene. Token budget calibrated so the warn channel stops flooding; install.js `pruneLegacyMultiAgentSkills` now has direct test coverage.
1569
-
1570
- ### Changed
1571
-
1572
- - **Token budget calibration pass.** Before: 7 of 8 phases tripped `warn` on every CI run, which buried the one legitimate growth signal in noise. After: `warn = current+10%` and `max = current+25%` per phase (rounded to nearest 50 tokens), giving ~6 edit cycles of headroom before the signal trips. New per-phase numbers in `pipeline/schemas/token-budget.json`; total max went 25700 → 32100 to absorb the sum of per-phase maxes. Smoke result: 8 warnings → 0.
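
The calibration rule is simple arithmetic. A small JavaScript sketch, assuming a hypothetical `calibrate` helper; the actual per-phase numbers live in `pipeline/schemas/token-budget.json` and are not reproduced here:

```javascript
const STEP = 50; // thresholds are rounded to the nearest 50 tokens

const roundToStep = (tokens) => Math.round(tokens / STEP) * STEP;

// warn = current + 10%, max = current + 25%, both rounded to the nearest 50.
function calibrate(currentTokens) {
  return {
    warn: roundToStep((currentTokens * 110) / 100),
    max: roundToStep((currentTokens * 125) / 100),
  };
}
```

For a phase doc that currently measures 1000 tokens, this gives `warn: 1100` and `max: 1250`. Multiplying before dividing keeps the intermediate values exact for integer inputs, which avoids floating-point drift right at a rounding boundary.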
1573
-
1574
- ### Added
1575
-
1576
- - **`install.js` `pruneLegacyMultiAgentSkills` test coverage** (`test/index.test.mjs`). The v5.2.1 install prunes 21 legacy `multi-agent-*` skill directories from `~/.claude/skills/` on every run — this function was previously untested, which was a real gap since it performs destructive filesystem operations. New `describe` block adds 4 tests: prunes `multi-agent` + `multi-agent-*` directories, does NOT touch `figma-ios`/`figma-android`/`figma-common`/unrelated skills, returns 0 on missing skillsDir (idempotent), ignores root-level files. Total test count 17 → 21.
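
The four behaviors under test can be read as a specification. The sketch below is one way to satisfy them; it is not the `install.js` source. The function name is deliberately different, and returning the number of pruned directories is an assumption (the entry only states that a missing directory returns 0).

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical re-creation of the tested behavior, for illustration only.
function pruneLegacySkillsSketch(skillsDir) {
  // Missing skills directory: nothing to prune, and not an error.
  if (!fs.existsSync(skillsDir)) return 0;

  let pruned = 0;
  for (const entry of fs.readdirSync(skillsDir, { withFileTypes: true })) {
    const isLegacy =
      entry.name === "multi-agent" || entry.name.startsWith("multi-agent-");
    // Root-level files and unrelated skills (figma-ios, etc.) are left alone.
    if (!entry.isDirectory() || !isLegacy) continue;
    fs.rmSync(path.join(skillsDir, entry.name), { recursive: true, force: true });
    pruned += 1;
  }
  return pruned;
}
```

Because the function deletes directories, matching on an exact name or a fixed prefix (rather than a looser substring test) is what keeps third-party skills safe.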
1577
-
1578
- ### Notes
1579
-
1580
- - No schema changes, no user-visible feature changes.
1581
-
1582
- ## [5.3.1] — 2026-04-17
1583
-
1584
- Patch — review-driven drift fixes after the v5.3.0 audit.
1585
-
1586
- ### Fixed
1587
-
1588
- - **`multi-agent-sync` SKILL personal-data leak** — three occurrences of the hardcoded `mmerterden` handle replaced with `{owner}` / `${USER}_{work-gh-alias}` placeholders (`pipeline/skills/shared/multi-agent-sync/SKILL.md` lines 98, 168, 173). The command mirror `sync.md` was already generic; drift was on the skill side only. Also cleaned `multi-agent-issue/SKILL.md:34` (`gh auth restore` target).
1589
- - **`cross-cli-contract.md` inventory stale** — Section 1 listed 17 commands but the repo has 20. Added `scan`, `search`, `update` to the canonical list and updated the follow-on references (Section 1.1 preamble). Added a reminder that `smoke-commands-skills-parity.sh` enforces directory parity but not this doc; keep both in sync manually.
1590
- - **`progress-contract.md` drift (v5.3.0 gate)** — Phase 2 section was missing the 5 new progress lines the Plan Approval Gate emits (`clarification-ask`, `clarification-answer`, `plan-edit-request`, `plan-approved`, `plan-aborted`). Without these, `smoke-progress-contract.sh` couldn't detect gate-emitted lines that weren't in the contract — audit gap closed.
1591
- - **ESLint errors (pre-existing since v5.2.x)** — `pipeline/scripts/migrate-prefs.mjs` and `pipeline/schemas/migrations/prefs-2.0.0-to-2.1.0.mjs` used `== null` / `!= null` as the null-or-undefined idiom but the eslint rule `eqeqeq: always` flagged them. Loosened the rule to `eqeqeq: ["error", "always", { null: "ignore" }]` (Airbnb/Google style) — strict `===` everywhere else, idiomatic `x == null` allowed for null-OR-undefined checks. Also removed dead `bumpedVersion` variable in `migrate-prefs.mjs:143` (assigned twice, never read).
1592
- - **`test/project.test.mjs` stale workflow assertions** — removed the two `assert.ok(existsSync(".github/workflows/{smoke,release}.yml"))` checks that failed after v5.2.1 removed both workflows. `npm test` now passes 45/45.
1593
- - **`sync.md` vs `multi-agent-sync/SKILL.md` step-count drift** — command said "6 targets (v5.0.1+)" but skill said "5 targets"; skill never picked up the Figma Step 0 addition. Reconciled: both now list 6 targets with identical step diagrams and the "v5.0.1+ Figma source sync" step 0.
1594
- - **`dev.md` vs `multi-agent-dev/SKILL.md` Phase 6 prompt drift** — command specified the local-checkout prompt (`1=Evet (WIP commit) / 2=Hayır (devam)`, default 2, `autopilot` skip) for dev-mode Phase 6; skill simplified it to "Normal commit + push + PR". These are cross-CLI mirrors and must behave byte-identically — reconciled to the full prompt flow.
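
The `eqeqeq` change above relies on a specific JavaScript behavior: loose comparison against `null` matches exactly `null` and `undefined`. A short illustration, with `isNil` as a hypothetical helper name:

```javascript
// ESLint rule as quoted in the entry above (placed under `rules`):
//   eqeqeq: ["error", "always", { null: "ignore" }]

// `x == null` is true only for null and undefined; other falsy values
// such as 0, "" and false do not match.
function isNil(value) {
  return value == null;
}

// Strict equivalent that the rule would otherwise force everywhere:
function isNilStrict(value) {
  return value === null || value === undefined;
}
```

Both functions agree on every input, so the relaxed rule changes style only: strict `===` stays mandatory everywhere except comparisons against `null`.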
1595
-
1596
- ### Notes
1597
-
1598
- - Pre-push gate (`pre-push-check.sh`) now passes all three code gates: lint ✓, unit tests ✓, token budget ✓ (personal-data smoke already clean).
1599
- - No schema changes, no user-visible feature changes, no breaking changes.
1600
-
1601
- ## [5.3.0] — 2026-04-17
1602
-
1603
- Minor — **Phase 2 Plan Approval Gate** (normal mode only). Pipeline now refuses to start Phase 3 on underspecified tickets without a human read-through.
1604
-
1605
- ### Added
1606
-
1607
- - **Phase 2 Plan Approval Gate** (`pipeline/commands/multi-agent/refs/phases/phase-2-planning.md` Step 5) — two chained modes:
1608
- - **5a: Clarification Mode** (conditional, max 2 rounds). Triggers on any ambiguity signal from Phase 1: vague acceptance criteria (description < 200 chars, no acceptance section), UI work without a Figma link, API work without an endpoint/contract reference, ambiguous language (`ambiguityScore >= 2`), or parent-story scope drift. The orchestrator asks structured questions; the user answers, and the plan is regenerated with the answers as context. If the ticket is still unclear after round 2, the plan is shown anyway with a `⚠️ best-effort (madde hâlâ net değil)` banner (Turkish for "item still not clear"), which prevents an infinite grooming loop.
1609
- - **5b: Approval Loop** (always runs after clarification resolves or is skipped). The plan is rendered with summary / approach / risk / scope / files to touch / todos. The user responds with `onayla` ("approve": proceed), `iptal` ("cancel": pause), or **free text** (a free-text edit request, e.g. "let's also look at the auth service, but keep LoginView out of scope"). Edit requests loop through Opus → plan validator → re-render until approve/abort. There is no hard iteration cap; the user controls the exit.
1610
- - **`agent-state.schema.json` phase fields** (six new, all under `phases["2"]`): `clarificationRounds` (0-2), `clarificationQuestions[]`, `clarificationAnswers[]`, `planIterations` (≥1), `planApprovedAt` (date-time or null), `planEditRequests[]`. Preserved verbatim for audit + resume.
1611
- - **Mode-awareness matrix** — the gate is explicitly documented as **skipped** for `--dev`, `autopilot`, and `--dev autopilot`. Their fast/zero-interaction contracts are preserved:
1612
- - `--dev` → direct to Phase 3 (fast path, no plan discussion by design)
1613
- - `autopilot` → log plan, direct to Phase 3 (zero-interaction contract: user explicitly trusts scope)
1614
- - `--dev autopilot` → direct to Phase 3 (both contracts compose)
1615
- - `pipeline/commands/multi-agent/autopilot.md`, `multi-agent/dev.md`, `multi-agent/dev-autopilot.md`, `multi-agent/help.md`, mirror skills, `refs/phases/modes.md`, and `SKILL.md` autopilot matrix — all updated to reference the gate and its mode-specific skip semantics.
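
The gate's control flow (mode skip, conditional clarification capped at 2 rounds, then the approval loop) can be sketched as a small decision function. Everything here is illustrative: the function names, the `"normal"` mode label, and all `ticket` fields except `ambiguityScore` are assumptions, and treating "description < 200 chars, no acceptance section" as a combined condition is one reading of the entry above.

```javascript
const MAX_CLARIFICATION_ROUNDS = 2;
const GATE_SKIPPED_MODES = new Set(["--dev", "autopilot", "--dev autopilot"]);

// Collect the ambiguity signals listed for 5a. Field names are hypothetical.
function ambiguitySignals(ticket) {
  const signals = [];
  const description = ticket.description || "";
  if (description.length < 200 && !ticket.hasAcceptanceSection) {
    signals.push("vague-acceptance-criteria");
  }
  if (ticket.isUiWork && !ticket.figmaLink) signals.push("ui-without-figma");
  if (ticket.isApiWork && !ticket.contractRef) signals.push("api-without-contract");
  if (ticket.ambiguityScore >= 2) signals.push("ambiguous-language");
  if (ticket.parentScopeDrift) signals.push("parent-scope-drift");
  return signals;
}

// Decide what Phase 2 Step 5 does next.
function nextGateStep(mode, ticket, clarificationRounds) {
  // Fast and zero-interaction modes skip the gate entirely.
  if (GATE_SKIPPED_MODES.has(mode)) return "phase-3";

  const unclear = ambiguitySignals(ticket).length > 0;
  // 5a: ask again only while rounds remain; after round 2 the plan is
  // shown anyway (with the best-effort banner) instead of looping forever.
  if (unclear && clarificationRounds < MAX_CLARIFICATION_ROUNDS) return "clarify";

  // 5b: always runs once clarification resolves, is exhausted, or is skipped.
  return "approval-loop";
}
```

The `clarificationRounds` argument corresponds to the state field of the same name, which is what would let a resumed run pick up at the correct step.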
1616
-
1617
- ### Changed
1618
-
1619
- - **Phase 2 `"Plan hazır, devam edeyim mi?"` one-liner prompt** (Turkish for "Plan ready, shall I continue?") replaced by the full gate flow. The old prompt offered y/n/edit; the new gate supports clarification, conversational free-text edits, and validator-checked revisions. Backward compatible: autopilot/--dev users see no behavioral change (gate skipped).
1620
- - **Progress contract** — new progress lines emitted by Phase 2: `plan-draft start`, `clarification-ask`, `clarification-answer`, `plan render`, `plan-edit-request`, `plan-approved`, `plan-aborted`.
1621
-
1622
- ### Token budget
1623
-
1624
- - `phase-2-planning.md` max bumped 1100 → 2600, warn 900 → 2200 (gate specification ~1500 tokens). Total doc budget 24200 → 25700.
1625
-
1626
- ### Rationale
1627
-
1628
- Under-specified Jira tickets are the top cause of Phase 3 rework. A single y/n confirmation at the end of Phase 2 couldn't catch ambiguity: the pipeline would plan against its own interpretation, then work in the wrong direction until Phase 4 review or the user test surfaced the mismatch. The gate moves the clarification conversation up front, where it's cheapest. Autopilot/--dev keep their speed contracts because users who pick those modes have explicitly signaled "trust scope, don't ask".
1629
-
1630
- ## [5.2.2] — 2026-04-17
1631
-
1632
- Patch — CLI-aware reviewer labeling, TR banner/TaskCreate localization, Phase 5 prompt clarity.
1633
-
1634
- ### Changed
1635
-
1636
- - **Phase 4 reviewer set is now CLI-aware.** Claude Code dispatches **2 parallel reviewers** (Opus + Sonnet); Copilot CLI dispatches **3 parallel reviewers** (GPT-5.4 + Opus + Sonnet). Opus triage runs on both. The prior "3-model" label was misleading on Claude Code — GPT-5.4 is only natively reachable from Copilot CLI, so Claude was silently running 2 reviewers while docs/banners claimed 3. All references across `SKILL.md`, `phase-4-review.md`, `modes.md`, `help.md`, `review.md`, mirror skills, `multi-agent.md`, `claude-md-template.md`, README, `features.md`, `architecture.md`, ADR-0001, and `reviewer-output.schema.json` updated. GPT-5.4 reviewer telemetry now emits only on Copilot CLI (`CLI_HOST=copilot`).
1637
- - **`prefs.global.promptLanguage` now governs TaskCreate subjects and phase-banner headers**, not only interactive prompts. Previously the banner/TaskCreate text was hardcoded English regardless of TR selection, producing mixed-language task lists (e.g. "Phase 0: Init" alongside Turkish prompt copy). Canonical phase label table (EN ↔ TR × 8 phases) now lives in `pipeline/skills/shared/multi-agent/SKILL.md` and is the single source of truth. External payloads — reviewer/triage system prompts, commit messages, PR bodies, Jira comments, wiki content, `agent-log.md` — remain English regardless of this setting.
1638
- - **`phase-banner.sh`** — new `PHASE_LANG=en|tr` env var + `auto` name sentinel + new `label <phase>` action. `PHASE_LANG=tr phase-banner.sh start 4 auto` now prints `Faz 4: İnceleme`; `label` action lets callers fetch the canonical name for their own UI. Smoke test grew from 12 to 15 assertions covering TR, auto sentinel, and label action.
1639
- - **Phase 5 prompt clarified.** Prompt now explicitly states the local-checkout side effect instead of implying it:
1640
- - EN: `Want to check out locally to test? (y/N) (yes → worktree is removed and branch is checked out locally for Xcode/manual test · no → stay in worktree and go to Phase 6)`
1641
- - TR: `Lokalinde test etmek ister misin? (e/H) (evet → worktree kaldırılır, branch lokale çekilir → Xcode'da test · hayır → worktree'de kalır, direkt Faz 6'ya geçer)`
1642
- Previous copy ("Want to test the code?") left users surprised that saying yes ripped down the worktree and swapped their local repo to the task branch.
1643
- - **ADR-0001 amended** to document the CLI-aware reviewer set rationale: GPT-5.4 native reachability is per-CLI; 2-model Claude + 3-model Copilot is an accepted asymmetry against a Claude-side GPT bridge that would add latency without improving signal.
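
The CLI-aware reviewer set reduces to a lookup on the host CLI. A minimal sketch, assuming a hypothetical `reviewersFor` helper; only the model names and the `CLI_HOST=copilot` value come from the entries above:

```javascript
// Hypothetical helper: which parallel reviewers Phase 4 dispatches per CLI.
function reviewersFor(cliHost) {
  const shared = ["Opus", "Sonnet"]; // dispatched on both CLIs
  // GPT-5.4 is only natively reachable from Copilot CLI.
  return cliHost === "copilot" ? ["GPT-5.4", ...shared] : shared;
}
```

A caller could pass `process.env.CLI_HOST`, in which case any value other than `copilot` (including an unset variable) falls back to the two-reviewer set. Opus triage runs on both CLIs and is not part of this list.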
1644
-
1645
- ### Token budget
1646
-
1647
- - `phase-4-review.md` max bumped 3600 → 3800 (CLI-aware reviewer tables). Total doc budget 24000 → 24200.
1648
-
1649
- ## [5.2.1] — 2026-04-17
1650
-
1651
- Patch — remove GitHub Actions. CI is now local-only.
1652
-
1653
- ### Removed
1654
-
1655
- - **`.github/workflows/smoke.yml`** and **`.github/workflows/release.yml`**. Both workflows were blocked by a GitHub Actions billing stop, and the smoke gate already runs locally via `pre-push-check.sh`. Removing them avoids confusing red "failed" runs on every push and eliminates a dependency on a paid execution environment.
1656
-
1657
- ### Changed
1658
-
1659
- - **README** — swapped the `Smoke Tests` status badge for a dynamic GitHub Release badge, and the "GitHub Packages v5.1.0" hardcoded badge for the same release badge (auto-updates with new tags). The Testing section now documents the local `pre-push` hook as the primary gate.
1660
- - **`pre-push-check.sh`** — top-of-file comment updated: it is no longer a "CI mirror" (there is nothing to mirror), it is the primary gate.
1661
- - **`multi-agent-scan` skill + slash command** — CI-integration line replaced: the strict-mode scan now runs via `pre-push-check.sh` locally rather than a workflow step.
1662
-
1663
- ### Release flow (new)
1664
-
1665
- ```bash
1666
- # 1. Bump version + CHANGELOG locally
1667
- # 2. Commit + push main
1668
- # 3. Tag + push tag
1669
- git tag vX.Y.Z && git push origin vX.Y.Z
1670
- # 4. Create GitHub Release manually (metadata only, no billing)
1671
- gh release create vX.Y.Z --notes-file <(sed -n '/## \[X.Y.Z\]/,/## \[/p' CHANGELOG.md | head -n -1)
1672
- # 5. (Optional) publish to GitHub Packages
1673
- npm publish --registry=https://npm.pkg.github.com
1674
- ```
1675
-
1676
- ---
1677
-
1678
- ## [5.2.0] — 2026-04-17
1679
-
1680
- Minor release — Phase 7 label tweak + Phase 0 URL enrichment. Additive; no breaking changes.
1681
-
1682
- ### Changed
1683
-
1684
- - **Phase 7 label: `HANDOFF` → `Report`** (proper case). Purely cosmetic — the 5-step structure (Jira comment → Wiki+Figma → Confluence → Log+Metrics → Knowledge+memory) is unchanged. File already renamed `phase-7-handoff.md` → `phase-7-report.md` in 5.1.0; this release sweeps the display name across the pipeline, website, and remote-control UI.
1685
- - **"Firebase SA JSON" / "Service Account JSON" → "Firebase JSON"** in all user-facing labels (schema descriptions, setup.md Token Source Mapping, multi-agent-setup SKILL.md). Token type, Keychain key name (`${USER}_Firebase_Access_Json`), and base64 storage are identical; only the display label is shortened.
1686
- - **"How it works" listing in `multi-agent/SKILL.md`** rewritten from the legacy 9-step (8 WIKI + 9 REPORT) to the canonical 8-phase flow with Phase 7 Report showing the 5 consolidated sub-steps. Matches `phases.md` and the website simulator.
1687
-
1688
- ### Added
1689
-
1690
- - **Phase 0 Step 1b — URL Enrichment.** After Step 1 parses the input, Phase 0 scans every URL in the Jira description (or free-text) for two patterns and prepends the fetched context into later phases:
1691
- - **Firebase Crashlytics URL** (`console.firebase.google.com/.../crashlytics/app/.../issues/<id>`) → resolves `prefs.global.keychainMapping.firebase` → decodes SA JSON → exchanges for a short-lived GCP access token → fetches issue detail + optional session timeline via `firebasecrashlytics.googleapis.com/v1alpha/...`. Stored as `state.crashContext`; Phase 1 Analysis MUST include it under "Known Crash Context". Host is fixed (Google) — no `hosts.firebase` pref.
1692
- - **Fortify SSC URL** (`<prefs.global.hosts.fortify>/ssc/html/ssc/version/<id>/...`) → resolves `prefs.global.keychainMapping.fortify` → queries `/ssc/api/v1/projectVersions/<id>/issues` + `/issueDetails/<id>` with Bearer (fallback FortifyToken). Stored as `state.fortifyFinding`; Phase 2 Planning MUST inject it under "Known Security Finding" and the task breakdown MUST contain `sec(fortify-<id>): <name>`. Host comes from Step 3.5 Host Prompt.
1693
- - Both enrichments **soft-fail**: auth / VPN / host-mismatch errors log a warning and continue. Progress line shape: `→ enrich input: crashlytics=<hit|miss|skipped>, fortify=<hit|miss|skipped>`.
1694
-
1695
- - **`prefs.global.hosts.fortify`** — new host slot. Collected inline at Token Save Flow Step 3.5 when the user adds their first Fortify token (same flow as `jira` / `confluence` / `bitbucket`). Firebase is explicitly skipped from Step 3.5 since its host is fixed.
1696
-
1697
- - **`smoke-url-enrichment.sh`** (18 assertions) — validates that phase-0-init.md documents Step 1b, state shapes, progress-line contract, schema has `hosts.fortify`, setup.md routes fortify through Step 3.5 and skips firebase, and Token Source Mapping uses "Firebase JSON" label.
1698
-
1699
- ### Notes
22
+ ### Fixed
1700
23
 
1701
- - No changes to Keychain key names — existing `${USER}_Firebase_Access_Json` and `${USER}_Fortify_Access_Token` work unchanged; users only need to supply the Fortify host on next setup run (prompt fires automatically when missing).
24
+ - **Regenerated the stale `install-layout.tsv` fixture.** The v10.0.5 line added
25
+ `smoke-metrics-cache-ratio.sh` without refreshing the layout fingerprint, so
26
+ `smoke-install-layout.sh` failed (install writes 173 scripts per target, the
27
+ fixture still claimed 172). The fixture now records 173 for `.claude/scripts`
28
+ and `.copilot/scripts`; this was the only gate blocking a clean release.
1702
29
 
1703
30
  ---
1704
31
 
1705
- ## [5.1.0] - 2026-04-17
32
+ ## [10.0.5] - 2026-06-19
1706
33
 
1707
- Minor release - pre-team-rollout feature drop: skill security scanner + log search. Both are opt-in and warn-only by default (the scanner never halts install). No breaking changes.
34
+ Telemetry backbone for prompt-cache reuse, plus a self-deriving PR review
35
+ iteration counter. Ships with a consistency fix so the two consumers of the
36
+ token ledger agree on what an input token is.
1708
37
 
1709
38
  ### Added
1710
39
 
1711
- - **`/multi-agent:scan` - Skill Security Scanner.** Tiered pattern catalog covering:
1712
- - **critical:** shell-pipe exec (`curl|bash`, `wget|sh`), base64-decode exec, `eval $(curl …)`, unicode bidirectional override characters (U+202D-202E, U+2066-2069 — the "trojan source" vector)
1713
- - **high:** JS `eval()` / `new Function()`, Python `exec()` / `eval()` (with `re.compile` + `subprocess.*` whitelisted as legitimate), hardcoded API credentials (AWS `AKIA*`, OpenAI `sk-live|test|proj-*`, GitHub `ghp_*` / `gho_*` / `github_pat_*`, Slack `xoxb|xoxp-*`) — with FORBIDDEN / NEVER / example / placeholder context whitelisted so docs aren't false-flagged
1714
- - **medium:** long base64 blobs (>200 chars) — scanned only in `.sh` / `.py` / `.js` (docs in `.md` skipped to avoid false positives)
1715
- - **low:** unknown network endpoints (not in allow-list of github / anthropic / figma / apple / npm / vercel / placeholder hosts), missing SKILL.md frontmatter
1716
-
1717
- Three integration points: (1) `install.js` pre-deploy hook (warn-only, threshold=high, always exit 0 so install never halts); (2) `.github/workflows/smoke.yml` — CI step with `--strict` (non-zero exit blocks PR); (3) standalone `/multi-agent:scan` slash command + `multi-agent-scan` Copilot skill. Output: colored text (default), `--json`, and strict exit codes (0/1/2/3/4 by severity).
1718
-
1719
- - **`/multi-agent:search` - Log Search.** Full-text search across `~/.claude/logs/multi-agent/**/agent-log.md` with smart ranking (match count × recency weight: last 7d = 1.0×, 7-30d = 0.5×, older = 0.2×). Ripgrep primary, grep fallback. Filters: `--project`, `--since 7d|30d|YYYY-MM-DD`, `--phase N` (only match inside `## Phase N` section), `--limit N`. Output: colored text (default), `--json`, `--tsv`. Cross-CLI parity: slash command + Copilot skill + shared `search-logs.sh` script. Exit codes: 0 match, 1 no match, 64 bad args.
1720
-
1721
- - **Scanner + Search simulator helpers.** Website simulator on [mmerterden.vercel.app/projects/multi-agent-pipeline](https://mmerterden.vercel.app/projects/multi-agent-pipeline) gains `scan` and `search` helper scenarios. Helpers grid bumped to 17 total.
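The ranking rule described for `/multi-agent:search` (match count × recency weight) can be sketched as below. This is only an illustration, not the logic in `search-logs.sh`: the hit shape (`path`, `matchCount`, `mtimeMs`), the function names, and the inclusive handling of the 7-day and 30-day boundaries are assumptions; only the three weights come from the entry.

```javascript
// Sketch only: hit shape and boundary handling are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;

// Weights from the changelog entry: last 7d = 1.0, 7-30d = 0.5, older = 0.2.
function recencyWeight(ageDays) {
  if (ageDays <= 7) return 1.0;
  if (ageDays <= 30) return 0.5;
  return 0.2;
}

// Score each hit as matchCount * recencyWeight and sort best-first.
function rankLogs(hits, now = Date.now()) {
  return hits
    .map((hit) => {
      const ageDays = (now - hit.mtimeMs) / DAY_MS;
      return { ...hit, score: hit.matchCount * recencyWeight(ageDays) };
    })
    .sort((a, b) => b.score - a.score);
}
```

Under these assumptions a fresh log with a few matches can outrank an old log with many matches, which is the point of weighting by recency.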
1722
-
1723
- ### Security
1724
-
1725
- - Scanner runs on every `install.js --all` before any skill is copied to `~/.claude/skills/` or `~/.copilot/skills/`. If a teammate accidentally imports a malicious third-party skill locally, running `/multi-agent:sync` or `/multi-agent:update` surfaces critical/high findings in the install log before propagation.
1726
-
1727
- ### Smoke
1728
-
1729
- - `smoke-skill-scan.sh` — 18 assertions with synthetic fixtures for every pattern tier. Real `pipeline/skills/` tree scans clean at threshold=high (no false positives). `re.compile` whitelisted; FORBIDDEN-context credentials whitelisted.
1730
- - `smoke-search.sh` — 18 assertions with synthetic fixture tree (never touches real `~/.claude/logs/`): query match, project filter, --since filter, --phase filter, --limit cap, JSON validity, TSV header + rows, no-match exit 1, recency-weighted ranking.
1731
-
1732
- ---
1733
-
1734
- ## [5.0.3] — 2026-04-17
1735
-
1736
- Patch release focused on pre-team-rollout polish. No behaviour changes to existing workflows — tightens the installer, closes a personal-data scan regression, and sweeps stale documentation references.
40
+ - **Prompt-cache reuse ratio in metrics.** `aggregate-metrics.mjs` now sums
41
+ `tokens_cached` per model and reports `cache_ratio = cached / (in + cached)`
42
+ (the share of input tokens served from the host prompt cache), per model and
43
+ overall, in all three output modes (json / markdown / text). This is the
44
+ single number that says whether cache-friendly prompt structuring is paying
45
+ off. Backward-compatible: a phase that omits `tokens_cached` reads as 0%.
46
+ - **PR review iteration counter derived from the PR itself.** On a `needs_work`
47
+ post, the iteration number is re-derived as `max("iteration #N" already on the
48
+ PR) + 1` instead of trusting agent-state. Standalone `/multi-agent:review`
49
+ runs use a fresh task id each time, so the PR comments are the only reliable
50
+ cross-run source of truth.
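The `cache_ratio` formula above can be sketched as follows. This is a minimal illustration, not the code in `aggregate-metrics.mjs`: the row shape (`model`, `tokens_in`, `tokens_cached`) and the name `cacheRatios` are assumptions; only the formula `cached / (in + cached)` and the treat-missing-as-zero behavior come from the entry.

```javascript
// Sketch only: row shape and function name are hypothetical.
// tokens_in and tokens_cached are treated as disjoint counts.
function cacheRatios(rows) {
  const perModel = new Map();
  for (const row of rows) {
    const acc = perModel.get(row.model) ?? { fresh: 0, cached: 0 };
    acc.fresh += row.tokens_in ?? 0;
    acc.cached += row.tokens_cached ?? 0; // an omitted field counts as 0
    perModel.set(row.model, acc);
  }

  // cached / (in + cached), with an empty ledger reading as 0.
  const ratio = ({ fresh, cached }) =>
    fresh + cached === 0 ? 0 : cached / (fresh + cached);

  const byModel = {};
  const overall = { fresh: 0, cached: 0 };
  for (const [model, acc] of perModel) {
    byModel[model] = ratio(acc);
    overall.fresh += acc.fresh;
    overall.cached += acc.cached;
  }
  return { byModel, overall: ratio(overall) };
}
```

Because the two counts are disjoint, a row with `cached > in` is valid input and simply yields a ratio above 0.5.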
1737
51
 
1738
52
  ### Fixed
1739
53
 
1740
- - **Single-standard commands on Claude Code** — `install.js` now prunes legacy `~/.claude/skills/multi-agent-*` directories after copy so the `/` picker no longer shows duplicate entries like `/multi-agent:issue` + `/multi-agent-issue`. Claude Code uses the colon-namespaced slash-command surface (`~/.claude/commands/multi-agent/`), Copilot CLI keeps the dash-skill surface (no infra overlap, each CLI uses its native idiom).
1741
- - **Renamed-file ghost cleanup** - `install.js` wipes `~/.claude/commands/multi-agent/` before copy so renames in the source (e.g. the 5.0.2 `phase-7-report.md` → `phase-7-handoff.md`) don't linger as ghost slash commands on existing installs.
1742
- - **Personal-data scanner case-insensitive** - `smoke-personal-data.sh` now uses `grep -Ei`. It caught an uppercase `MUSTAFAERDEN@THY.COM` that the old case-sensitive scanner missed. Four new forbidden patterns added: `jira\.thy\.com`, `confluence\.thy\.com`, `bitbucket\.thy\.com`, `[a-z0-9._%+-]+@thy\.com`. 13/13 clean.
1743
-
1744
- ### Documentation
1745
-
1746
- - Swept remaining "Phase 6.5 WIKI" / "Phase 7: REPORT" / `phase-7-report.md` references out of `setup.md`, `help.md`, shared skill SKILL.md files, `docs/architecture.md`, `docs/features.md`, `docs/performance.md`, and the README mode-mermaid subgraphs. Every user-facing surface now names Phase 7 as HANDOFF with the 5-step sub-flow (Jira → Wiki+Figma → Confluence → Log → Knowledge).
1747
- - Step D of the Token Save Flow now explicitly routes to Step 3.5 (Host Prompt) for hosted services — removes the risk of an agent skipping the Jira/Confluence/Bitbucket host collection.
1748
-
1749
- ## [5.0.2] - 2026-04-17
1750
-
1751
- Minor release - Phase 6.5 absorbed into Phase 7 HANDOFF, personal-data leaks closed, corporate hosts moved into preferences.
1752
-
1753
- ### Changed
1754
-
1755
- - **Phase 6.5 → Phase 7 HANDOFF sub-step.** The hybrid Phase 6.5 WIKI has been removed from the phase contract; wiki + Figma screenshots are now Step 2 of the Phase 7 HANDOFF sequence (Step 1 Jira comment → Step 2 Wiki + Figma → Step 3 Confluence → Step 4 Report + log → Step 5 Knowledge + memory). External delivery happens before internal capture — cleaner mental model, single post-commit phase.
1756
- - **File rename.** `phase-7-report.md` → `phase-7-handoff.md`. All 18 referring files (smoke scripts, token-budget, refs, command + skill markdowns) updated. Tracker init registers 8 phases (0-7), not 9 + 6.5.
1757
- - **Token budget for phase-7-handoff** bumped to 3200/2700 (was 3000/2500 for phase-7-report). Post-refactor content is 3071 tokens, under max.
1758
-
1759
- ### Added
1760
-
1761
- - **Corporate hosts in preferences.** `prefs.global.hosts` block (jira / confluence / bitbucket / corpDomain) + `defaultJiraKey`. Collected inline during Token Save Flow (Step 3.5) when a hosted-service token is first added. Team members cloning the pipeline enter hosts once per user; values stay in `~/.claude/multi-agent-preferences.json`, never in the repo.
1762
- - **`repos.artifactsRepo` + `repos.packagesContainer`** fields in `figma-project-config.schema.json` — replaces hardcoded `turkish-airlines-ios-implementation-artifacts` references with proper schema placeholders.
1763
- - **4 new `figma-placeholder-map.json` rules** so future upstream figma-source syncs auto-genericize artifactsRepo, bitbucket host, corporate email, and the broader artifacts path.
54
+ - **Token-count convention mismatch between the two ledger consumers.**
55
+ `render-agent-log-cost.sh` treated `tokens_in` as cache-inclusive and
56
+ subtracted the cached count (`fresh = in - min(cached, in)`), while the new
57
+ `aggregate-metrics.mjs` treated `tokens_in` as cache-exclusive (`total =
58
+ in + cached`). Fed real data with high cache reuse (`cached > in`), the
59
+ renderer collapsed `fresh` to 0 and underpriced the row. Standardized on the
60
+ **cache-exclusive** convention that matches the host usage report
61
+ (`input_tokens` and `cache_read_input_tokens` are disjoint): the renderer no
62
+ longer clamps or subtracts, and `log-format.md` now documents the disjoint
63
+ contract that both consumers rely on.
64
+ - **PR iteration scan could be inflated by a human comment.** The
65
+ `max("iteration #N")` scan now matches only inside the Multi-Agent Review
66
+ footer, so a reviewer who writes "iteration #99" in prose cannot jump the
67
+ counter.
68
+ - **README smoke-suite count.** Bumped 107 -> 108 to match the actual suite
69
+ count after the cache-ratio smoke landed.
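The footer-scoped iteration scan can be sketched as below. This is an illustration under stated assumptions, not the shipped script: the marker string `Multi-Agent Review` as the start of the footer, the comment bodies as plain strings, and the name `nextIteration` are all hypothetical; only the `max("iteration #N") + 1` rule and the footer-only matching come from the entries above.

```javascript
// Sketch only: the footer marker text and comment shape are assumed.
const FOOTER_MARKER = "Multi-Agent Review";

// Return max("iteration #N" found inside review footers) + 1.
function nextIteration(commentBodies) {
  let max = 0;
  for (const body of commentBodies) {
    const at = body.lastIndexOf(FOOTER_MARKER);
    if (at === -1) continue; // no footer: treat as a human comment and skip
    const footer = body.slice(at); // text before the footer is ignored
    for (const match of footer.matchAll(/iteration #(\d+)/gi)) {
      max = Math.max(max, Number(match[1]));
    }
  }
  return max + 1;
}
```

A PR with no review footers yields iteration 1, and prose such as "iteration #99" outside a footer does not move the counter. A real implementation would likely need a stricter footer match than a bare substring.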
70
+
71
+ ## [10.0.4] - 2026-06-12
72
+
73
+ Fixes the final two clean-runner failures (ubuntu matrix); a green 5-job matrix is
74
+ the target state.
1764
75
 
1765
76
  ### Fixed
1766
77
 
1767
- - **Personal-data leak closure — 40 hits → 0 across 10 files.** Hardcoded `jira.thy.com`, `confluence.thy.com`, `bitbucket.thy.com`, corporate email, git `user.name`+`user.email`, and `turkish-airlines-ios-implementation-artifacts` paths all replaced with placeholders resolved from `prefs.global.hosts` + `identities[]` at runtime.
1768
-
1769
- ---
78
+ - **macOS-only project-slug glob.** `plan-todos.sh`, `post-pr-review.sh`, and
79
+ `update-issue-progress.sh` resolved task state by globbing
80
+ `~/.claude/projects/-Users-*` - Linux slugs start with `-home-`, so state
81
+ resolution silently found nothing there. All three now glob `projects/*`.
82
+ - **`smoke-bitbucket-contract.sh` stubbed only macOS Keychain.** The smoke
83
+ PATH-shadows `security`, but on Linux `credential-store.sh get` routes
84
+ through `secret-tool`, so the mock token stayed empty and every parse
85
+ assertion came back blank. A `secret-tool` stub now covers the Linux path.
1770
86
 
1771
- ## [5.0.1] - 2026-04-16
87
+ ## [10.0.3] - 2026-06-12
1772
88
 
1773
- Post-5.0.0 integration polish. No behaviour changes to existing workflows — lands the follow-ups that shipped alongside the 5.0.0 tag.
89
+ Cross-platform portability fixes - the v10.0.2 SMOKE_NO_BAIL run surfaced
90
+ every remaining clean-runner failure; all nine are fixed. Linux behavior was
91
+ reproduced locally by putting GNU coreutils' gnubin ahead of PATH.
1774
92
 
1775
- ### Added
93
+ ### Fixed
1776
94
 
1777
- - **WS-7b — `figma-common/` extraction** — 27 platform-agnostic skills moved from `pipeline/skills/figma-ios/` to `pipeline/skills/figma-common/` so both platform orchestrators dispatch to the same location without duplication. Platform-specific 5 remain under `figma-ios/`. README documents the split; Android README drops the "pending WS-7b" hedge.
1778
- - **WS-2b Python/shell script import** — `import-figma-skills.sh` now walks `scripts/` subdirectories, applies the same placeholder transform to `.py` / `.sh` files, preserves executable bits on shebanged entries. 42 scripts imported from upstream (sidebar-generator, confluence-publish, phase-finalize helpers, update-issue-from-registry, etc.). `smoke-figma-skill-import.sh` gains script-count + shebang executable assertions (17 → 19).
1779
- - **WS-9b — `/multi-agent:sync` delegation** — `sync.md` Adim 0 runs `sync-figma-source.sh` before the existing ecosystem sync when `prefs.global.figmaSource.path` is set. Silent skip otherwise. New smoke `smoke-sync-delegation.sh` (6/6).
1780
- - **WS-8b - installed-tree figma deployment** - `install.js` now copies `figma-ios/`, `figma-android/`, `figma-common/` into both `~/.claude/skills/` and `~/.copilot/skills/` as named subtrees. `smoke-cross-cli-behavior.sh` gains § 7 that asserts per-tree skill-count parity between CLIs (14/14 → 17/17).
1781
- - **Plan-driven sync** — `sync-figma-source.sh --plan` emits a structured JSON plan to `.last-figma-sync-plan.json` (source commit, per-skill rename/tree/overlaySkip resolution, postActions) so upstream changes are reviewed before they land. Apply mode (no flag) executes the same plan. `smoke-figma-sync.sh` 11 → 15.
1782
- - **Overlay protection** - skills with `local-overlay: true` in frontmatter are skipped on re-import (protects v5-specific content like `figma-component-wiki`'s 4-adapter contract from being flattened by upstream sync). `figma-component-wiki/SKILL.md` gains the marker + restored overlay.
1783
- - **Dual-target import routing** - `import-figma-skills.sh` resolves destination tree (`figma-ios/` vs `figma-common/`) from the `IOS_ONLY` list; `--target` override preserved for single-tree runs.
1784
- - **WS-11 full adoption** — Phase 1/2/4/5/6/7 now carry the `<!-- progress-contract: applied -->` marker with canonical per-phase progress-line summaries. `smoke-progress-contract.sh` 9+warn → 18/18, zero warnings.
95
+ - **`stat -f %m` is not a safe BSD probe.** GNU stat also accepts `-f` (as
96
+ "filesystem status") and SUCCEEDS, printing a mount point - so the
97
+ `|| stat -c %Y` fallback never ran and mtimes became `/` on Linux, breaking
98
+ `search-logs.sh` (since-filter, scoring, JSON/TSV output) and
99
+ `repo-cache.sh` TTLs. Probe order flipped to GNU-first (`stat -c` first;
100
+ BSD rejects `-c`, so the fallback chain is safe both ways). `date -r
101
+ <epoch>` display calls gained a GNU `date -d @` fallback.
102
+ - **Hardcoded maintainer layout `$HOME/multi-agent-pipeline` removed** from
103
+ `smoke-schema-validation.sh` (preferences-template path) and
104
+ `smoke-pat-audit.sh` (.gitignore audit-log check) - both now derive the
105
+ repo root from the script location; the template check skips gracefully on
106
+ installed-tree execution (which also un-cascades `smoke-install-leak-gate`
107
+ step 5).
108
+ - **`smoke-lib-scripts.sh` fixture bare repos init with `-b main`** - on
109
+ runners without `init.defaultBranch=main` the bare HEAD pointed at master
110
+ and `prepare` ended detached instead of on the task branch.
111
+ - **`vercel-deploy.sh` rejects `--token` argv BEFORE the CLI-presence
112
+ check** - the refusal is a security guarantee and must not depend on the
113
+ vercel CLI being installed.
114
+ - **`smoke-install-layout.sh` fingerprint pinned to `LC_ALL=C sort`** -
115
+ runner locales order dot-prefixed paths differently, producing
116
+ same-content/different-order fingerprint mismatches on macOS runners;
117
+ fixture regenerated under the pinned locale.
118
+
119
+ ## [10.0.2] - 2026-06-12
120
+
121
+ Second round of cross-platform CI repairs (first real clean-runner exercise
122
+ of the full smoke suite).
1785
123
 
1786
124
  ### Fixed
1787
125
 
1788
- - **Stale smoke assertions** — `smoke-identity-isolation.sh` and `smoke-phase-0-multi-repo.sh` were reading from `~/.claude/commands/...` (installed, potentially stale) and asserting against pre-10caff1 Phase 0 wording. Switched to source-tree reads; updated assertions to match current (post-refactor) doc shape (collision handling is automatic `-v2` now, `settings.identityRoutingEnabled` gate removed, `gitIdentities` → `identities`).
1789
- - **Token-budget drift** — `token-budget.json` per-phase limits bumped to absorb WS-4/5/6/11 content additions (total cap 22000 → 24000). No content trim needed; prior drift went from 2 failing phases + 1 total-cap fail to 8 warnings, zero failures.
126
+ - **Windows: `New-StoredCredential` not recognized.** `ps_run` preferred
127
+ `pwsh` (PowerShell 7), but the CredentialManager binary module targets .NET
128
+ Framework and installs into Windows PowerShell's module path - `ps_run`
129
+ now prefers `powershell.exe` and falls back to `pwsh`.
130
+ - **Linux: headless keyring unlock.** `gnome-keyring-daemon --unlock` spawns
131
+ SystemPrompter, which exits 1 on ubuntu-24.04 runners and never creates the
132
+ `login` collection. The round-trip now pre-seeds an unencrypted default
133
+ keyring file (libsecret CI recipe) and starts the daemon on it.
134
+ - **test.yml smoke suite ran against a missing install.** Install-dependent
135
+ smokes (cross-cli-behavior etc.) verify `~/.claude` / `~/.copilot`; a clean
136
+ runner has neither, so the full matrix could never go green off the
137
+ maintainer machine. The workflow now runs `node install.js --all` first and
138
+ executes the suite with `SMOKE_NO_BAIL=1` + a 600s per-suite timeout so one
139
+ CI run surfaces every remaining environment-dependent failure.
140
+
141
+ ## [10.0.1] - 2026-06-12
142
+
143
+ CI matrix repairs - the cross-platform `test.yml` jobs had been failing on
144
+ runner-environment drift since before v10.0.0; all three root causes fixed.
1790
145
 
1791
- ---
146
+ ### Fixed
1792
147
 
1793
- ## [5.0.0] - 2026-04-16
148
+ - **Windows: `credential-store.sh: line 137: USER: unbound variable`.** Git
149
+ Bash on Windows runners does not export `USER`; under `set -u` the Windows
150
+ `set` backend crashed before reaching Credential Manager. Now falls back
151
+ `${USER:-${USERNAME:-claude}}` (same default the macOS backend already used).
152
+ - **Linux: `secret-tool: Object does not exist .../collection/login`.**
153
+ Persisting `gnome-keyring-daemon` env across workflow steps via
154
+ `$GITHUB_ENV` stopped working on ubuntu-24.04 runners (the default `login`
155
+ collection never materializes, `GNOME_KEYRING_CONTROL` lands empty). The
156
+ round-trip now runs inside a single `dbus-run-session` with an inline
157
+ unlock.
158
+ - **macOS: `pip3 install --user jsonschema` -> `externally-managed-environment`.**
159
+ macos-latest runners ship PEP 668-managed Pythons; the install step now
160
+ falls back to `--break-system-packages`.
161
+
162
+ ## [10.0.0] - 2026-06-12
163
+
164
+ Quality major: a 10-category self-audit (73.5/100) plus a competitive sweep
165
+ (GitHub / Reddit / HN / X) drove hardening across CI, tests, installer,
166
+ adapters, and the phase contracts. No slash command was renamed or removed;
167
+ the major bump covers the behavioral defaults below.
1794
168
 
1795
- Generic Figma-to-Component pipeline rebuild. `pipeline/figma/` is removed and the skill set is rebuilt as `pipeline/skills/figma-{ios,android}/`. Major version bump because Phase 3 component-dispatch signature changes and `figma-project-config.json` advances to v2.0.0 (with idempotent auto-migration). Migration guide: [`docs/MIGRATION-v4-to-v5.md`](./docs/MIGRATION-v4-to-v5.md). End-to-end user guide: [`docs/FIGMA_PIPELINE.md`](./docs/FIGMA_PIPELINE.md). Original workstream plan: [`docs/PLAN_v5.0.md`](./docs/PLAN_v5.0.md).
169
+ ### Breaking
1796
170
 
1797
- 10 workstreams landed + WS-2 pass-1 (markdown import). Follow-ups (WS-2b Python/shell import, WS-7b `figma-common/` extraction, WS-9b `/multi-agent:sync` delegation, remaining WS-11 phase adoptions) do not block v5.0.0 adoption.
171
+ - **`npm test` smoke loop replaced by `pipeline/scripts/run-smokes.mjs`.**
172
+ Per-suite timeout (default 180s, `SMOKE_TIMEOUT_SECONDS` override), substring
173
+ filters (`node pipeline/scripts/run-smokes.mjs adapters keychain`),
174
+ `SMOKE_NO_BAIL=1` run-all mode, and a pass/fail summary. A hung smoke can no
175
+ longer block `npm test` or CI indefinitely.
176
+ - **`publishConfig` corrected to `registry.npmjs.org` (`access: public`).** It
177
+ pointed at `npm.pkg.github.com`, so a plain `npm publish` targeted the wrong
178
+ registry - npmjs had been stuck at 9.8.0 while 9.9.0-9.10.2 shipped only as
179
+ git tags. New `release.yml` publishes on `v*` tags after gating on
180
+ tag==package.json version and a matching CHANGELOG entry.
181
+ - **CHANGELOG split.** Entries v2.0.0-v8.13.0 moved to `CHANGELOG-archive.md`
182
+ (repo-only, excluded from the npm tarball); CHANGELOG.md shrinks 310KB -> 40KB.
183
+ `smoke-changelog-version.sh` gates top-entry==package.json drift.
184
+ - **Phase 1 / 2 / 4 structured outputs now pass deterministic validator gates.**
185
+ `validate-analysis/planning/reviewer/triage.mjs` accept a file argument and run
186
+ inside the phases (fails CLOSED: one self-correction rework, then halt with a
187
+ recovery hint). Previously the validators only ran in `npm test`.
188
+ `smoke-validator-gates.sh` (23 assertions).
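The release gate mentioned above (tag equals the `package.json` version, plus a matching CHANGELOG entry) can be sketched as a pure check. This is a hedged illustration, not the contents of `release.yml`: the function name `releaseGate`, its arguments, and the error strings are hypothetical; the `v*` tag shape and the `## [X.Y.Z]` heading shape are taken from this changelog.

```javascript
// Sketch only: function name, arguments, and messages are hypothetical.
// Returns a list of gate failures; an empty list means the release may proceed.
function releaseGate(tag, pkgVersion, changelogText) {
  const errors = [];
  if (!tag.startsWith("v")) {
    errors.push(`tag ${tag} does not match v*`);
  }
  const tagVersion = tag.replace(/^v/, "");
  if (tagVersion !== pkgVersion) {
    errors.push(`tag ${tagVersion} != package.json ${pkgVersion}`);
  }
  // A matching entry is a heading line such as "## [10.0.5] - 2026-06-19".
  const heading = `## [${pkgVersion}]`;
  const hasEntry = changelogText
    .split("\n")
    .some((line) => line.startsWith(heading));
  if (!hasEntry) {
    errors.push(`CHANGELOG has no entry for ${pkgVersion}`);
  }
  return errors;
}
```

Returning every failure at once, rather than stopping at the first, lets a single run report both a version mismatch and a missing entry.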
189
+
190
+ ### Added
191
+
192
+ - **Skill frontmatter linter** (`lint-skills.mjs`, wired into `npm test` +
193
+ ci-lite + release.yml): every SKILL.md must carry a frontmatter block with
194
+ `name` + `description`; `user-invocable`/`platform` value checks. First run
195
+ found and fixed 16 figma-common SKILL.md files with no frontmatter at all and 15
196
+ external skills missing `name:`; `.skill-manifest.json` re-signed (217).
197
+ - **Installer `--dry-run`** - same code path (guards live in `install/_common.mjs`
198
+ primitives), prints `[dry-run] would ...` per operation, writes nothing.
199
+ Unknown `--flags` now exit 1 with the supported list instead of being
200
+ silently ignored. Both covered by new unit tests (23 total).
201
+ - **Phase 3 Step 3.6 code-simplifier pass** - one Sonnet subagent shrinks the
202
+ working diff (comment bloat, unrelated rewrites, dead code, over-abstraction)
203
+ before Phase 4; safe edits only, build+tests re-verified, wholesale revert if
204
+ anything breaks, tokens in the cost ledger.
205
+ - **Phase 4 lesson memory loop** - each fix/rework round appends a one-line
206
+ root-cause lesson per resolved blocking/important finding to the existing
207
+ learnings ledger.
208
+ - **Phase 2 cross-artifact consistency gate** - plan is checked against the
209
+ analysis doc (requirement -> task mapping, no orphan tasks, open questions
210
+ carried) before approval. `smoke-community-gates.sh` (19 assertions) covers
211
+ all three. Sourced from the June 2026 competitive sweep - see ROADMAP for
212
+ the adopted/declined ledger.
213
+ - **CI hardening**: lint is blocking on ci-lite (was warn-only),
214
+ `npm audit --audit-level=moderate` gates both workflows, shellcheck
215
+ (severity=error) gates all pipeline shell scripts, skill lint step added.
216
+ - **Docs drift gates**: `smoke-md-links.sh` + `check-md-links.mjs` (internal
217
+ markdown link checker, code spans excluded), `smoke-changelog-version.sh`.
218
+
219
+ ### Changed
220
+
221
+ - **Adapter family deduplicated.** Shared install/uninstall flow extracted into
222
+ `pipeline/adapters/_base.mjs` (skill collection sources, managed-block
223
+ removal, rules walking, digest rendering); cursor / copilot-chat /
224
+ antigravity / codex + both orchestration modules thinned by ~490 lines with
225
+ byte-identical installed output (`smoke-adapters.sh` 33 assertions green).
226
+ - **README restructured**: Quick Start now precedes "What's new"; the release
227
+ wall condensed to the latest three entries (CHANGELOG remains canonical);
228
+ `--dry-run` documented; smoke count 103 -> 107.
229
+ - **ROADMAP**: new "v10.x candidates" table from the competitive sweep
230
+ (autonomous loop mode, mid-run steering, watchdog + review debt,
231
+ constitution artifact, evidence receipts, marketplace distribution,
232
+ cross-session backlog, pre-flight quota check) plus a declined list.
233
+ - Help text aligned to "8-phase" (index.js said 9-phase; phases are 0-7).
1798
234
 
1799
- ### Progress
235
+ ### Fixed
1800
236
 
1801
- - **WS-1 - Purge** Removed `pipeline/figma/` (132 files, ~40 500 lines). Rollback anchor: tag `pre-figma-rebuild`. Previous v4.0.0 extract-as-separate-package plan (`docs/MIGRATION_4.0.md`) superseded.
1802
- - **WS-3 — Config schema v2.0.0** ✅ New top-level sections (`github`, `jira`, `repos`, `build`, `teams`, `artifactBasePath`). `wiki.mode` enum adds `submodule`/`in-repo`/`github-wiki`/`separate-repo`. Idempotent migration `figma-config-1.0.0-to-2.0.0.mjs` lifts legacy `submodules.*` → `repos.*` and `board.projectId` → `github.projectV2Id`. Covered by `smoke-figma-config-schema.sh` (18/18).
1803
- - **WS-11 — Progress-line contract** ✅ Contract at `refs/progress-contract.md` — canonical `→ <verb> <object>` shape, per-phase emit matrix, autopilot forces verbose, parallel `progress.step` telemetry event. `prefs.global.settings.progressVerbosity` (quiet / normal / verbose). Smoke `smoke-progress-contract.sh` (9/9, 2 advisory warnings until phases adopt marker).
1804
- - **WS-2 pass-1 — Markdown skill import** ✅ 32 skills / 86 markdown files / 255 placeholders imported into `pipeline/skills/figma-ios/`. Substitution map + idempotent importer (`import-figma-skills.sh`, `figma-placeholder-map.json`). Smoke `smoke-figma-skill-import.sh` (17/17). Per-skill review checklist at `pipeline/skills/figma-ios/REVIEW_CHECKLIST.md`. Python/shell scripts deferred to WS-2b.
1805
- - **WS-4 - Phase 3 delegated dispatch** ✅ Phase 3 short-circuits `taskType === "component"` into the `figma-to-component` orchestrator skill. Contract lives in `refs/component-dispatch.md` (iOS/Android routing, dispatch call shape, `subphases[]` state schema, multi-repo handoff, retry/resume semantics, `--dev` elisions, cross-CLI parity requirements). `phase-3-dev.md` kept lean with a pointer section. Phase 0 Step 7 classification table updated to the new orchestrator path. Both phases now carry the progress-contract adoption marker. Smoke `smoke-figma-dispatch.sh` (12/12).
1806
- - **WS-5 — Component wiki capture** ✅ Phase 7 Step 3.5 runs component wiki generation for `taskType === "component"` tasks via a pluggable 4-adapter system (`submodule` / `in-repo` / `github-wiki` / `separate-repo`). `prefs.global.wikiDefault` (bool, default `true`) drives autopilot; interactive prompts reuse it as default. Full contract at `refs/wiki-capture.md`; adapter dispatch documented in `pipeline/skills/figma-ios/figma-component-wiki/SKILL.md` with WS-5 frontmatter stamp. Non-blocking — adapter failures log and Phase 7 continues. Smoke `smoke-wiki-integration.sh` (14/14). Python helpers (`sidebar-generator`, `status-page`) land in WS-5b alongside the script import.
1807
- - **WS-6 — Issue → Jira → Wiki triad** ✅ Phase 0's GitHub-issue input path applies an auto-create policy (`prefs.global.autoJiraFromGithubIssue`: `ask` | `always` | `never`, default `ask`) — autopilot forces `ask` to `always`. New Jira tickets inherit title/body/labels and the GitHub issue body gets a bidirectional link patch. Phase 7's wiki capture (Step 3.5) then posts a humanizer'd component-docs summary to the linked Jira (`prefs.global.wikiToJiraComment`, default `true`). Failures on either side are non-blocking (transient Jira 5xx doesn't halt code work). Contract at `refs/issue-jira-triad.md`. Smoke `smoke-issue-jira-triad.sh` (17/17).
1808
- - **WS-7 — Android parity (scaffold)** ✅ `pipeline/skills/figma-android/` with five platform-specific skills: `figma-to-component` orchestrator (gradle build target, 8-phase shape), `figma-component-implement` (Kotlin/Compose with `@Immutable` Configuration + Composable + Modifier extensions + `@Preview` grid), `figma-component-test` (Compose Testing + Paparazzi/Roborazzi), `figma-component-code-connect` (`@FigmaConnect` for Compose), `figma-component-wiki` (Android peer of 4-adapter wiki generator). Common skills reuse `figma-ios/` pending `figma-common/` extraction in WS-7b. Smoke `smoke-figma-android-parity.sh` (32/32) — inventory, frontmatter `platform: android`, Compose/Paparazzi/gradle markers, schema cross-check.
1809
- - **WS-8 — Cross-CLI parity matrix** ✅ `refs/cross-cli-contract.md` extended with § 1.1 (figma component subphase skills inventory table: 5 platform-specific paired across iOS+Android, 22 common with WS-7b extraction path noted, 5 performance), § 1.2 (Phase 3 platform routing contract), § 1.3 (mandatory frontmatter: name + status + sourced-from + platform stamp for platform-specific skills). Inventory parity enforced by new `smoke-figma-cross-cli-inventory.sh` (38/38) — source-tree check (platform-specific on both sides, common on iOS today, platform stamp correctness, no orphans, contract heading presence).
1810
- - **WS-9 — Incremental source sync** ✅ `pipeline/scripts/sync-figma-source.sh` pulls changes from an upstream figma-to-component source repo into `pipeline/skills/figma-ios/` on demand. Driven by `prefs.global.figmaSource { path, branch, lastSyncedAt, lastCommit }` (nullable, default null). Idempotent (placeholder transforms re-apply safely); preserves `REVIEW_CHECKLIST.md`; atomic watermark update via tmpfile+mv. `--dry-run` mode for inspection, `--force-full` for complete re-import. Appends a `### Sync YYYY-MM-DD (@ sha)` block under `[Unreleased]` in `CHANGELOG.md` per run with per-skill update lines. Smoke `smoke-figma-sync.sh` (11/11).
1811
- - **WS-10 - Consolidation** ✅ `docs/FIGMA_PIPELINE.md` lands as the end-to-end user guide (single entry point with cross-references to all WS-3 to WS-11 ref docs). `docs/MIGRATION-v4-to-v5.md` covers the three breaking changes (figma purge, schema v2, Phase 3 dispatch), new preferences with safe defaults, the `pre-figma-rebuild` rollback path, and named follow-ups. README.md migration section now points at both. Final smoke inventory: 28 suites, ≥ 470 assertions.
237
+ - `brace-expansion` moderate DoS advisory (GHSA-jxxr-4gwj-5jf2) via
238
+ `npm audit fix` - 0 vulnerabilities.
239
+ - `smoke-keychain.sh` SC1087 unbraced variable in a grep class (shellcheck).
240
+ - `eslint.config.js` missing `Response` global (5 no-undef errors in
241
+ install-telemetry tests under blocking lint).
242
+ - Stale `fixtures/install-layout.tsv` (claimed 86 command files; 87 on disk
243
+ since v9.9.0) regenerated.
244
+
245
+ ## [9.10.2] - 2026-06-11
246
+
247
+ Live per-phase token narration (user-reported gap: tiles showed duration only).
248
+
249
+ - **tracker-contract section 5 now MANDATES a completion narration line.** The
250
+ native TaskList widget on Claude Code shows name/status/duration - it has no
251
+ per-phase token field, so token telemetry written via `phase-tracker.sh tokens`
252
+ was invisible until the Phase 7 Cost Breakdown. On every phase transition to
253
+ `completed`, the orchestrator now prints one chat line in `outputLanguage`:
254
+ `Phase <N> <name> done - ~<in> in / ~<out> out tokens (<model>, ~$<usd>)`,
255
+ priced from `cost-table.json` (floor-to-cents, same math as the cost renderer).
256
+ Zero-token phases print `(no LLM calls)`; counts are content-size estimates
257
+ prefixed with `~` (the orchestrator does not receive its own usage metering) -
258
+ the state file + Phase 7 breakdown stay authoritative.
259
+ - `smoke-tracker-tokens-invocation.sh` extended to assert the contract carries
260
+ the narration mandate.
261
+
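The narration format above can be sketched as a small formatter. This is only an illustration under stated assumptions: the function name `narratePhase`, its field names, the inline `COST_TABLE` object, and the exact layout of the zero-token line are hypothetical; the per-MTok prices are the ones the 9.10.0 entry lists for `fable` and `opus`.

```javascript
// Hypothetical inline price table, USD per million tokens (values from the 9.10.0 entry).
const COST_TABLE = {
  fable: { in: 10, out: 50 },
  opus: { in: 5, out: 25 },
};

// Build the completion narration line for one phase.
// Field names and the zero-token layout are assumptions, not the package's API.
function narratePhase({ phase, name, inTokens, outTokens, model }) {
  if (inTokens === 0 && outTokens === 0) {
    return `Phase ${phase} ${name} done - (no LLM calls)`;
  }
  const price = COST_TABLE[model];
  // USD per MTok equals micro-USD per token, so the sum stays in integers.
  const microUsd = inTokens * price.in + outTokens * price.out;
  // Floor to whole cents (1 cent = 10,000 micro-USD).
  const cents = Math.floor(microUsd / 10000);
  const usd = (cents / 100).toFixed(2);
  return `Phase ${phase} ${name} done - ~${inTokens} in / ~${outTokens} out tokens (${model}, ~$${usd})`;
}
```

Working in integer micro-dollars before flooring avoids a floating-point sum landing just under a cent boundary; whether the real cost renderer does the same is not stated in the changelog.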
262
+ ## [9.10.1] - 2026-06-11
263
+
264
+ Model fallback contract for fable personas.
265
+
266
+ - **New `refs/features/model-fallback.md`.** Fable access can be plan-window-limited
267
+ or quota-limited; the contract defines a deterministic tier ladder
268
+ (`fable -> opus -> sonnet`) applied via the existing `PHASE_MODEL_OVERRIDE` /
269
+ `CLAUDE_CODE_SUBAGENT_MODEL` per-dispatch override - persona files are never
270
+ edited at runtime. Three triggers, checked in order: (1) **date gate** -
271
+ `prefs.global.modelFallback.premiumTierUntil` (ISO date); past it, fable
272
+ personas dispatch on `fallbackModel` with a one-line WARN (autopilot never
273
+ asks); (2) **dispatch error** - a failed fable dispatch retries once on
274
+ `fallbackModel` instead of aborting the phase; (3) **budget ceiling** - after
275
+ a `cost-budget-check.mjs` exit-11 pause, fable downgrades for the rest of the
276
+ run. Every fallback logs a `model_fallback` metric + agent-log line.
277
+ - **Prefs knob** `global.modelFallback` added to `preferences-template.json`
278
+ (`enabled: true`, `premiumTierUntil: null`, `fallbackModel: "opus"`,
279
+ `onDispatchError: true`). Wired at Phase 0 Step 0 (date gate) and Phase 4
280
+ (per-dispatch); `phase-4-review.md` model-override paragraph also fixed
281
+ (still said `preferredModel: opus` after v9.10.0).
282
+ - **New gate `smoke-model-fallback.sh`** (10 assertions): contract doc + three
283
+ triggers + template defaults + phase wiring + fable personas intact.
284
+ Smoke suites 102 -> 103.
285
+
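As a rough sketch of the three triggers checked in order, assuming a hypothetical `resolveModel` helper and a hypothetical per-run state object (`today`, `dispatchFailed`, `budgetPaused`); only the prefs keys and their defaults come from the entry above, and the real contract lives in `refs/features/model-fallback.md`:

```javascript
// Defaults as listed for `global.modelFallback` in the entry above.
const DEFAULT_FALLBACK = {
  enabled: true,
  premiumTierUntil: null,
  fallbackModel: "opus",
  onDispatchError: true,
};

// Decide which model a persona dispatches on.
// `run` is a hypothetical state object: { today, dispatchFailed, budgetPaused }.
function resolveModel(personaModel, prefs, run) {
  // Only fable personas are subject to fallback; persona files are never edited.
  if (personaModel !== "fable" || !prefs.enabled) {
    return { model: personaModel, reason: null };
  }
  // (1) Date gate: ISO dates (YYYY-MM-DD) compare correctly as plain strings.
  if (prefs.premiumTierUntil !== null && run.today > prefs.premiumTierUntil) {
    return { model: prefs.fallbackModel, reason: "date-gate" };
  }
  // (2) Dispatch error: retry on the fallback model instead of aborting.
  if (run.dispatchFailed && prefs.onDispatchError) {
    return { model: prefs.fallbackModel, reason: "dispatch-error" };
  }
  // (3) Budget ceiling: downgrade after a cost-budget pause.
  if (run.budgetPaused) {
    return { model: prefs.fallbackModel, reason: "budget-ceiling" };
  }
  return { model: personaModel, reason: null };
}
```

The returned `reason` is where a caller could hang the `model_fallback` metric and agent-log line; the reason strings here are made up for the example.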
286
+ ## [9.10.0] - 2026-06-11
287
+
288
+ Claude Fable 5 adoption on the Claude Code side.
289
+
290
+ - **Heavy personas route to Fable.** `code-reviewer`, `security-auditor`, `ios-architect`,
291
+ `android-architect`, and `backend-architect` move from `model: opus` to `model: fable`
292
+ (`claude-fable-5` - the tier above Opus 4.8). `explorer` / `dev-critic` stay on sonnet,
293
+ `task-clarifier` on haiku. `smoke-agent-model-routing.sh` enum extended to
294
+ `fable|opus|sonnet|haiku`; modelRationale lines refreshed.
295
+ - **Model-assignment labels updated** across the Claude-side docs: Phase 1 Analysis and
296
+ Phase 2 Planning headings, Phase 4 reviewer pair (Claude Code now dispatches
297
+ Fable + Sonnet; Copilot CLI keeps GPT-5.4 + Opus + Sonnet), Phase 4 triage
298
+ (Opus triage -> Fable triage), and the `--dev` fast-mode dev model in
299
+ `dev.md` / `dev-autopilot.md` / `dev-local.md` / `dev-local-autopilot.md` / `help.md`
300
+ (EN + TR). Adapter-platform picker labels (VS Code / Antigravity / Codex) untouched.
301
+ - **Cost ledger: `fable` pricing added + stale `opus` entry fixed.** `cost-table.json`
302
+ gains `fable` ($10 in / $50 out / $1 cache-read per MTok, `claude-fable-5`) and the
303
+ `opus` entry is corrected from the 4.7-era $15/$75 to the actual Opus 4.8 pricing
304
+ ($5/$25/$0.50, `claude-opus-4-8`) - every prior cost report priced opus phases 3x too
305
+ high. Cost smokes re-baselined to the corrected math (`smoke-agent-log-cost`,
306
+ `smoke-cost-budget`).
307
+
308
+ ## [9.9.0] - 2026-06-10
309
+
310
+ Analysis open-question resolver + repo hygiene hardening.
311
+
312
+ - **New command `/multi-agent:analysis-resolve`** (+ dash-form `multi-agent-analysis-resolve`
313
+ skill). Walks the Section 20 Risks and Open Questions of an analysis v3 document one
314
+ row at a time: up to 3 source-labeled candidates per row (`From evidence` / `From repo`
315
+ / `AI reasoned`), the pick merges into the target body section, the doc saves after
316
+ every answer. Stop tokens (`stop` / `pause` / `dur` / `kes`), immediate follow-up
317
+ insertion, optional verbatim-match sibling propagation across per-platform files,
318
+ changelog bump on finalize. Inherits the analysis Locked decisions: citation
319
+ discipline, forward-looking voice (repo answers demote to `> Legacy reference:`
320
+ blockquotes), humanizer punctuation policy, no Figma access (Locked 30 - design-gap
321
+ rows get only Defer + a re-run recommendation), no auto-commit. Command inventory
322
+ 33 -> 34; `/multi-agent:analysis` Phase 5 report now suggests the resolver when
323
+ Section 20 has open rows. Pattern ported from the ai-mobile-toolkit resolver skills.
324
+ - **Dead references removed.** `analysis.md` Reusable refs no longer points at a
325
+ non-existent `fetch-wiki.sh` (the wiki fetch chain is inline: clone -> gh api ->
326
+ WebFetch); `refs/features/external-context-injection.md` figma row routed to the real
327
+ 3-tier chain instead of a non-existent `fetch-figma.sh`; `channels.md` board adapter
328
+ no longer links a missing `refs/channels/board.md` (behaviors documented inline).
329
+ - **Bitbucket credential hardening.** `post-pr-review.sh` Basic auth moved off
330
+ `curl -u user:token` argv (visible to `ps` / process audit) onto a curl config fed
331
+ through process substitution (`-K <(...)`).
332
+ - **First offline test coverage for the two biggest untested libs.**
333
+ `smoke-extract-conventions.sh` (8 assertions: throwaway iOS fixture repo, 12-field
334
+ output shape, confidence enum, evidenceFiles cap, error paths) plus a new
335
+ `conventions-output.schema.json` contract for the Phase 1c extractor output.
336
+ `smoke-md2confluence.sh` (9 assertions: front-matter parsing, TR punctuation gate
337
+ with diacritic preservation, storage-XML rendering for headings / tables / code).
338
+ Smoke suites 100 -> 102, JSON schemas 15 -> 16.
339
+ - **Stale-version cleanup.** `multi-agent-analysis/SKILL.md` rewritten from the
340
+ v8.8.1-era "fixed 7-section" description to the v3 contract (23-section Full /
341
+ 7-section Lite, Pass A/B render, Phase 3.5 output picker); `help.md` analysis line
342
+ updated in both languages; README at-a-glance counts re-synced to the filesystem
343
+ (external catalog 133 -> 143 and total SKILL.md 206 -> 217 had drifted in v9.8.0);
344
+ ROADMAP current-release pointer moved off 8.4.1.
345
+
346
+ ## [9.8.0] - 2026-06-03
347
+
348
+ Interactive credential-expiry handling and analysis improvements.
349
+
350
+ - **Token expiry now asks, never silently skips.** When a credential resolves but
351
+ the service rejects it (401/403), the pipeline surfaces an Expired-token decision
352
+ instead of silently dropping the source or falling to a lower tier: `Regenerate`
353
+ (replace in place) / `Use a different token` / `Skip and continue` (which halts
354
+ only when the token is structurally required for the input). The question asks a
355
+ choice; the replacement value still enters through the clipboard Save Flow, never
356
+ chat, so `smoke-no-token-prompt.sh` stays green. `refs/keychain.md` Rule 1 reframed
357
+ to distinguish "never prompt for a token value" (kept) from "may ask a decision"
358
+ (new). `setup.md` Token Save Flow Step A split into missing vs expired branches.
359
+ - **Figma access tiers stop silently degrading.** A dead Figma MCP token (after one
360
+ re-auth retry) now asks `Recreate the MCP token` vs `Continue with Figma PAT`
361
+ instead of silently dropping to the PAT. A 401/403 on the Figma PAT runs the same
362
+ Expired-token decision before falling to the user-screenshot tier. Mirrored in the
363
+ `refs/rules.md` Tier table.
364
+ - **Picker step narration.** New breadcrumb contract in `refs/picker-contract.md`:
365
+ every picker step prints a localized `Step <i>/<n>: <what this step decides>`
366
+ narrator line (auto-resolved steps included), so the native picker shows, step by
367
+ step, what it is doing. Wired into the account / repo / dev-context pickers and the
368
+ analysis Phase 0 chain.
369
+ - **Analysis picker questions follow `outputLanguage`.** Fixed `_account-picker.md`,
370
+ which hardcoded English prompt strings and contradicted the canonical Language
371
+ Application matrix; its questions now render in `outputLanguage` (`label` / `header`
372
+ stay English per the UI contract). This was the source of the half-English picker
373
+ on Turkish runs.
374
+ - **Analysis reuses already-built components via Code Connect.** New Phase 1b.1 builds
375
+ a Code Connect index from existing `*.figma.swift` / `*.figma.kt` bindings
376
+ (`{fileKey, nodeId -> component, path}`). figma-to-swiftui already built and bound
377
+ the components, so this index is the source of truth for "what already exists":
378
+ matched design nodes emit `reuse` rows, only unmatched nodes become new components.
379
+ Empty index falls back to the prior `uiComponents` heuristic. A scope note clarifies
380
+ the command is a development analysis, not a screen-anatomy spec.
381
+
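The Code Connect reuse rule in this release can be pictured roughly as follows. This is a hedged sketch, not the Phase 1b.1 implementation: `buildIndex`, `planComponents`, and the binding record shape are hypothetical names, and parsing of the `*.figma.swift` / `*.figma.kt` files is assumed to have happened already.

```javascript
// Build a lookup from existing bindings. Each binding is a hypothetical record
// { fileKey, nodeId, component, path } extracted from a Code Connect file.
function buildIndex(bindings) {
  const index = new Map();
  for (const b of bindings) {
    index.set(`${b.fileKey}#${b.nodeId}`, { component: b.component, path: b.path });
  }
  return index;
}

// Matched design nodes become "reuse" rows; unmatched nodes become "new" rows.
function planComponents(designNodes, index) {
  return designNodes.map((node) => {
    const hit = index.get(`${node.fileKey}#${node.nodeId}`);
    return hit
      ? { action: "reuse", nodeId: node.nodeId, component: hit.component, path: hit.path }
      : { action: "new", nodeId: node.nodeId };
  });
}
```

An empty index would mark every node as new in this sketch; the entry above says the real flow falls back to the earlier `uiComponents` heuristic in that case instead.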
382
+ ## [9.7.0] - 2026-06-02
383
+
384
+ OpenAI Codex CLI support - the 6th supported AI surface.
385
+
386
+ - **New `--codex` adapter (`pipeline/adapters/codex.mjs`).** Codex CLI is a
387
+ global-config tool: its custom prompts (slash commands) and MCP servers live
388
+ under `~/.codex/`, and it has no per-project prompt directory. So unlike the
389
+ Cursor / Antigravity / Copilot Chat per-project adapters, the Codex adapter
390
+ installs globally and writes Codex's real surfaces: `~/.codex/prompts/multi-agent.md`
391
+ (the `/multi-agent` slash command), `~/.codex/AGENTS.md` (marker-wrapped skill
392
+ index), and a marker-wrapped `[mcp_servers.dev-toolkit]` block in
393
+ `~/.codex/config.toml` (TOML managed block that preserves the user's existing
394
+ config). Codex has no subagent fan-out, so the Phase 4 parallel review degrades
395
+ to a sequential adversarial two-pass (encoded in the prompt).
396
+ - **Wiring.** `--codex` flag in `install/index.mjs` + `install/_adapters.mjs`
397
+ (and `--all-tools`), uninstall support in `pipeline/scripts/uninstall.mjs`,
398
+ `REVIEWER_MODELS.codex` (GPT-5.5-Codex primary + Claude Opus 4.8 cross-model)
399
+ in `_base.mjs`, and a global one-shot pass in `sync-adapters.mjs` (fires when
400
+ `~/.codex` exists). Sync Step 2a documents the global flow.
401
+ - **Gate.** `smoke-adapters.sh` test 2d covers the Codex round-trip (prompt +
402
+ AGENTS.md + config.toml MCP merge preserving user TOML, then uninstall
403
+ restoring the user's config). Adapters smoke 28 -> 33 assertions.
404
+ - Platforms 5 -> 6. No new slash command (Codex is a consumer, not a command),
405
+ so the Cross-CLI command inventory stays 33.
406
+
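A marker-wrapped managed block, as described for `~/.codex/config.toml`, can be sketched with plain string handling. The marker lines and the helper names `upsertManagedBlock` / `removeManagedBlock` below are assumptions for illustration; the changelog does not show the adapter's actual markers or block contents.

```javascript
// Hypothetical marker lines (TOML comments); the adapter's real markers are not shown in the changelog.
const BEGIN = "# >>> multi-agent managed block >>>";
const END = "# <<< multi-agent managed block <<<";

// Insert or replace the managed block, leaving all other user config untouched.
function upsertManagedBlock(configText, blockBody) {
  const managed = `${BEGIN}\n${blockBody}\n${END}`;
  const start = configText.indexOf(BEGIN);
  const end = configText.indexOf(END);
  if (start !== -1 && end > start) {
    // A block already exists: swap it in place.
    return configText.slice(0, start) + managed + configText.slice(end + END.length);
  }
  // No block yet: append after the user's config, adding a newline if needed.
  const sep = configText === "" || configText.endsWith("\n") ? "" : "\n";
  return configText + sep + managed + "\n";
}

// Remove the managed block on uninstall, restoring the user's original text.
function removeManagedBlock(configText) {
  const start = configText.indexOf(BEGIN);
  const end = configText.indexOf(END);
  if (start === -1 || end < start) return configText;
  let after = configText.slice(end + END.length);
  if (after.startsWith("\n")) after = after.slice(1);
  return configText.slice(0, start) + after;
}
```

Treating the file as text rather than parsing and re-serializing TOML is one way to keep the user's comments and formatting intact, which is the property the round-trip smoke test is described as checking.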
407
+ ## [9.6.0] - 2026-05-31
408
+
409
+ Disk hygiene + correctness hardening, all gate-backed.
410
+
411
+ - **Two new maintenance commands.** `/multi-agent:garbage-collect` (`gc-tmp.sh`)
412
+ sweeps leftover `/tmp` scratch from past runs (picker state, review diffs,
413
+ `issue-progress-*`, `channels-*`, `context-*`, analysis drafts); `--older-than`
414
+ spares in-flight scratch. `/multi-agent:prune-logs` (`prune-logs.sh`) deletes
415
+ per-task log dirs under `~/.claude/logs/multi-agent` with
416
+ `--older-than`/`--project`/`--task` filters and ALWAYS preserves the audit
417
+ trail + metrics corpus + `.counter`. Both are dry-run by default and confirm
418
+ before deleting (purge-style). New smokes: `smoke-gc-tmp` (13), `smoke-prune-logs`
419
+ (15). Command inventory 31 -> 33; help catalog now points at a real log cleaner
420
+ (the dangling `clear-logs` reference had no backing command).
421
+ - **Intent guard hardened against adversarial input.** The 26-case corpus was
422
+ non-adversarial, so the green eval hid real misclassifications: `fix the bug?`
423
+ read as a question (work silently skipped), `can you explain how to add X` read
424
+ as a task (worktree for nothing), and TR mid-sentence interrogatives
425
+ (`nasil refactor ederim`) missed. `classify-intent.sh` now lets a leading
426
+ imperative beat a trailing `?`, keeps a polite-but-conceptual phrasing a
427
+ question, and detects TR SOV interrogatives mid-sentence. 8 adversarial EN+TR
428
+ cases added; eval-intent safe 100% (34/34), exact 97.1%.
429
+ - **Schema conformance test (ajv).** The hand-rolled `validate-*.mjs` validators
430
+ were a parallel re-implementation of the `.schema.json` contracts that nothing
431
+ cross-checked. A new ajv-backed test (43 cases) validates the eval fixtures
432
+ against the real schemas. It immediately caught two drifts, now fixed:
433
+ `reviewer-output` forbade the `reviewer` label the merged Phase 4 array carries;
434
+ `agent-state` modeled `buildStatus` as a string-only enum and `reviewIterations`
435
+ as flat counts while `phase-3-dev.md` writes `buildStatus={ok,attempts,lastError}`
436
+ and the iteration record carries `reviewers`/`triage`/`validatorResult` (which
437
+ `run-metrics.mjs` reads). `ajv`/`ajv-formats` added as devDependencies (runtime
438
+ deps stay zero).
439
+ - **Correctness fixes.** `run-metrics.mjs` degrades instead of crashing on a
440
+ partial/null `agent-state.json` (a documented use case); `eval-intent.mjs` fails
441
+ loudly on an invalid `--min` instead of silently disabling the gate (NaN
442
+ threshold); `audit-log.sh` coerces a non-bool `success` arg so the JSONL line
443
+ always parses; `keychain-save.sh` prints usage instead of crashing on no args.
444
+ - **CI re-armed** on push + PR to main (ci-lite + the macOS/Windows test matrix),
445
+ and dead code removed (`runReviewerValidator`, unused imports/vars). Counts:
446
+ commands 31 -> 33, smoke suites 98 -> 100, SKILL.md 204 -> 206.
447
+
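The `--min` fix in the correctness bullet is easy to reproduce in isolation: `Number("abc")` is `NaN`, and every comparison against `NaN` is false, so a check such as `accuracy < min` can never fail the gate. A minimal sketch of a loud parser follows; the helper name `parseMin` and the accepted range of 0 to 100 are assumptions, not the code in `eval-intent.mjs`.

```javascript
// Parse a --min threshold given as a percentage.
// Throws on anything that is not a finite number in [0, 100] (range is an assumption).
function parseMin(raw) {
  const value = Number(raw);
  if (raw === undefined || raw === "" || !Number.isFinite(value) || value < 0 || value > 100) {
    throw new Error(`invalid --min value: ${JSON.stringify(raw)}`);
  }
  return value;
}

// The gate itself: with a validated threshold this comparison is meaningful.
function gatePasses(accuracyPct, minPct) {
  return accuracyPct >= minPct;
}
```

The explicit empty-string check matters because `Number("")` is `0`, which would otherwise pass as a valid, and useless, threshold.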
448
+ ## [9.5.0] - 2026-05-30
449
+
450
+ Toward proven (not just designed): measure the features instead of asserting them, and ship the evidence-collection harness.
451
+
452
+ ### Added
453
+ - **Measured intent-guard accuracy** (`eval-intent.mjs` + `pipeline/eval/intent-cases.json`). 26 labeled EN+TR cases run through `classify-intent.sh`; the gate uses operationally-safe accuracy (the only dangerous errors are a task read as a question -> work skipped, or a question read as a task -> a spurious worktree; `ambiguous` proceeds as a task so it is safe for task cases). Currently 100% safe / 96.2% exact. Wired into `npm test`. Turns the heuristic into a number with a regression set.
454
+ - **Per-run outcome metrics** (`run-metrics.mjs` + fixture + `smoke-run-metrics.sh`). Parses an `agent-state.json` into the numbers that answer "did this run go well": review iterations (rework loops), first-pass-clean, reviewer signal-to-noise (accepted / raw findings), consensus verdict, build outcome. Phase 7 emits it; accumulating the output across real runs is the real-world validation corpus that golden tasks + benchmarks only approximate.
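The difference between exact and operationally-safe accuracy can be illustrated with a short scorer. This is a sketch under one reading of the entry above: `ambiguous` is treated as routing to a task, so it counts as safe for task cases and unsafe for question cases. The function name `scoreIntent` and the case shape are hypothetical.

```javascript
// Each case: { expected: "task" | "question", predicted: "task" | "question" | "ambiguous" }.
function scoreIntent(cases) {
  let exact = 0;
  let safe = 0;
  for (const { expected, predicted } of cases) {
    if (predicted === expected) exact += 1;
    // "ambiguous" proceeds as a task, so judge safety on the effective route.
    const effective = predicted === "ambiguous" ? "task" : predicted;
    if (effective === expected) safe += 1;
  }
  // Percentages rounded to one decimal place.
  const pct = (n) => Math.round((n / cases.length) * 1000) / 10;
  return { exactPct: pct(exact), safePct: pct(safe) };
}
```

Under this scoring, 96.2% exact on 26 cases would correspond to 25 exact matches, with the remaining case still landing on a safe route.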
1812
455
 
1813
- ---
456
+ ### Notes
457
+ - These address the honest "measure, don't assume" gap from the self-review: the intent guard and review signal are now quantified, and the harness exists to turn real runs into evidence. The remaining step (running real tasks + a public benchmark) is the user's, and cannot be fabricated.
458
+ - CI auto-run stays disabled in `test.yml` (the maintainer paused it for GitHub Actions billing); re-enabling the push/PR triggers is a billing decision, not changed here.
1814
459
 
1815
- ## [3.8.0] - 2026-04-16
460
+ ## [9.4.0] - 2026-05-30
1816
461
 
1817
- Setup UX polish release: the first-run wizard now speaks the user's language, and the placeholder leak that kept `smoke-personal-data.sh` red is closed. No breaking changes; migration is automatic.
462
+ Closes the structural gaps the adversarial review surfaced: the deterministic gates now actually RUN on the three adapter platforms, and the multi-model review is restored there using each platform's real model lineup.
1818
463
 
1819
464
  ### Added
1820
-
1821
- - **Setup wizard — Step 0 (Language Selection)** — first-run setup opens with a bilingual `en` / `tr` picker that writes `prefs.global.promptLanguage` before any other prompt. Setup continues in the chosen language; subsequent pipeline prompts (Phase 0 selectors, Token Save Flow, Phase 5/6 prompts, Phase 7 summary) honour the same value. External payloads (commits, PR bodies, Jira comments, reviewer prompts) remain English for team readability. Existing users are unaffected: the default stays `en`, and `/multi-agent language en|tr` still flips the value without re-running setup. Covered by `smoke-prefs-language.sh` (12/12, up from 10).
465
+ - **Shared runtime so gates execute on Cursor / Antigravity / VS Code Copilot Chat.** The gate scripts + lib + schemas are installed once to `~/.multi-agent/` (dev-only / PII files excluded) and the emitted agents/commands/workflow reference them by absolute path (`installSharedRuntime` / `rewriteScriptRefs` in `_base.mjs`). Previously the emitted agents referenced `pipeline/scripts/...` which did not exist in the consumer project, so the deterministic gates could not run there at all. Uninstall removes the runtime. Enforced by `smoke-shared-runtime.sh`.
466
+ - **Cross-vendor 2-model review on the adapter platforms.** A second reviewer agent (`ma-code-reviewer-x`) is emitted pinned to a different vendor, using each platform's actual model lineup (researched mid-2026, centralized in `_base.mjs#REVIEWER_MODELS`): Cursor `inherit` + `gpt-5.5`; VS Code Copilot Chat `Claude Opus 4.8` + `GPT-5.5`; Antigravity documents a `Gemini 3 Pro` + `Claude Opus 4.6` pair (its models are dropdown-selected, not file-pinned). Restores the cross-model diversity that was Claude-Code / Copilot-CLI-only.
467
+ - **Recommended PreToolUse hooks template** (`install/templates/claude-hooks.json`) wiring the secret scan as a HARD pre-commit gate on Claude Code; `multi-agent:setup` Step 8 offers to merge it. The secret scan is the one gate that is OS-hookable (no run-specific args); the others are phase-invoked by contract. Enforced by `smoke-gate-hooks.sh`.
468
+ - **Golden-task fixture 08** (`08-ios-auth-consensus-unverified`) exercising the consensus block through the eval harness (both reviewers approve on a security surface -> `unverified`). Golden tasks 7 -> 8.
469
+ - **Reviewer-count contract checks** in `smoke-cross-cli-behavior.sh`: locks Claude=2 / Copilot=3 + the documented adapter-platform reviewer set against drift.
1822
470
 
1823
471
  ### Fixed
472
+ - VS Code Copilot Chat agents emitted `model: inherit`, which is not a valid Copilot model (there is no `inherit` keyword; omitting `model` inherits the picker). Normal personas now omit `model`; the two reviewers pin a picker label.
1824
473
 
1825
- - **Personal-data leak closed** — three doc examples (`setup.md`, `phase-0-init.md`, `multi-agent-setup/SKILL.md`) used the author's private Jira project key (`DIJITAL-123`). Swapped to the generic `ABC-123` / `ABC-12345` pattern that matches the rest of the codebase. `smoke-personal-data.sh` back to 9/9 green.
1826
-
1827
- ### Known Issues
1828
-
1829
- - `smoke-token-budget.sh` still flags `phase-3-dev.md` (3036 > 2500) and `phase-4-review.md` (3366 > 3200) above their documented budgets. Pre-existing drift from v3.7.0; scheduled to resolve in v5.0.0 when the Figma-component pipeline restructure will rewrite both phases.
1830
-
1831
- ---
1832
-
1833
- ## [3.7.0] — 2026-04-15
474
+ ### Notes (stated honestly)
475
+ - The adapter platforms still have no `PreToolUse` equivalent, so their gates are workflow-enforced (run as steps) rather than OS-blocked.
476
+ - Pinned adapter models depend on the user's subscription; swap them in `REVIEWER_MODELS` / the Antigravity dropdown if a model is unavailable.
477
+ - Old PII-bearing versions (9.3.0-9.3.3) remain in the private registry: GitHub Packages does not support `npm deprecate` (E400), and deleting versions / rewriting git history are irreversible and were left to an explicit owner decision. The package + repo are private, so this is not a public exposure.
1834
478
 
1835
- The "multi-repo + identity routing" release. Every Phase 0 single-repo assumption is now extended with an additive multi-repo mode; identity-to-PAT routing prevents corporate credentials leaking into personal repos and vice versa. Schema bumped to v2.1.0 with full backward-compat migration.
479
+ ## [9.3.4] - 2026-05-30
1836
480
 
1837
- ### Added
481
+ Second round of review-driven fixes - the lower-severity findings left open in 9.3.3, plus an honesty correction on the multi-platform claim.
1838
482
 
1839
- - **Schema v2.1.0** — `prefs.schema.json` expanded from 4 to 11 keychain services. New global keys: `identities[].servicePatMap`, `platformIdentityRouting`, `recentGroups`, `recentBranches`, `serviceStatus`, `settings`. New `agent-state.projects[]` array for multi-repo task state (single-repo scalar fields preserved as backward-compat mirrors of `projects[0]`).
1840
- - **Setup wizard expansion** - Step 4 collects per-identity `servicePatMap`. New optional Steps 7 (repo discovery, 3-source merge: local + Bitbucket + GitHub 90-day API), 8 (multi-repo group LRU), 9 (platform identity routing rules with auto-suggest from `servicePatMap` ownership). Cross-Platform Notes section maps macOS `pbpaste`/`security` → Linux `xclip`/`secret-tool`.
1841
- - **Phase 0 Multi-Repo Mode** (`settings.multiRepoEnabled`) - multi-select project picker, per-repo branch collision detection, TTL-filtered recent branches, serial worktree creation with atomicity rollback, per-repo identity routing (no "first identity for all" fallback).
1842
- - **Phase 0 Fetch-Fail Flow** - replaces silent stale-cache fallback with a 4-option prompt (resume on VPN / cached-stale / local-branch / abort). Persists `baseFetchStatus` to state. Phase 6 must re-fetch before push.
1843
- - **Phase 4 Combined Diff** — multi-repo reviewers see all repos' diffs with `=== repo: <name> ===` headers; finding schema gains `crossRepo` + `affectedRepos` fields. Token-budget guard truncates at 80% of Phase 4 allowance.
1844
- - **Phase 6 Per-Repo Commit + PR** — shared subject + per-repo scope; PR per repo with cross-linked sibling URLs; GitHub issue body lists one row per repo. Push-must-succeed policy (`settings.pushMustSucceed`, default on): rebase → push → up to 5 retries with back-off → final 4-option user prompt. `recentGroups` LRU bumps on full success.
1845
- - **Cross-CLI parity contract** — `refs/cross-cli-contract.md` defines canonical placeholder vocabulary, frontmatter transform, TaskCreate↔phase-tracker parity, platform guards, change control. Enforced by `smoke-cross-cli-behavior.sh` (15/15 in v3.7.0).
1846
- - **Telemetry & Quality** — `output-quality-check.sh`, `token-budget-report.mjs` (per-task / period rollups + budget breach detection), `audit-log.sh` (PAT-lookup audit with SHA-256-hashed repo URLs), `audit-log-rotate.sh` (daily rotation, 30-day retention), `benchmark-phase-0.sh` (latency proxy). `metrics.jsonl` events may carry an `outputQuality` object; Phase 7 reports include a Quality & Metrics block.
1847
- - **9 new smoke tests** — schema-validation (27), pref-migration (19, incl. firebase consolidation), cross-cli-behavior (15), phase-0-multi-repo (41), multi-repo-worktree (10), phase-6-multi (16), push-retry (19), identity-isolation (9), pat-audit (11). **All green.**
483
+ ### Fixed
484
+ - **Arg parsers dropped values starting with `--`.** `learnings-ledger.mjs` and `evidence-gate.mjs` now accept the `--key=value` form, so a statement / pattern that begins with `--` (e.g. `--statement="-- prefer let"`) is preserved instead of silently failing.
485
+ - **Secret scan skipped filenames with spaces.** `pre-commit-check.sh` now iterates staged files NUL-delimited (`git diff --name-only -z`), closing a false-negative where a secret in `my file.txt` went unscanned.
486
+ - **`learnings-ledger forget` rewrite is now atomic** (temp file + rename) so a crash or concurrent reader never sees a half-written ledger.
487
+ - **`from-triage` scope for a top-level file** is now the filename itself, not a stray `./*` glob.
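The arg-parser fix in this list can be illustrated with a small parser that accepts both `--key value` and `--key=value`. This is a sketch, not the code in `learnings-ledger.mjs` or `evidence-gate.mjs`; `parseArgs` and the `--scope` flag are hypothetical.

```javascript
// Parse argv into an object, supporting "--key value" and "--key=value".
function parseArgs(argv) {
  const out = {};
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (!arg.startsWith("--")) continue;
    const eq = arg.indexOf("=");
    if (eq !== -1) {
      // Everything after the first "=" is the value, even if it starts with "--".
      out[arg.slice(2, eq)] = arg.slice(eq + 1);
    } else if (i + 1 < argv.length && !argv[i + 1].startsWith("--")) {
      // Space-separated form: only safe when the value does not look like a flag.
      out[arg.slice(2)] = argv[i + 1];
      i += 1;
    } else {
      // Bare flag, or a following token that looks like another flag.
      out[arg.slice(2)] = true;
    }
  }
  return out;
}
```

With the space-separated form, a value such as `-- prefer let` is indistinguishable from the next flag, which is why the `=` form is the one that preserves it.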
1848
488
 
1849
489
  ### Changed
490
+ - **Honest multi-platform claim.** The README and the adapter-emitted orchestration commands now state plainly that Claude Code + Copilot CLI run the pipeline natively (gate scripts installed), while Cursor / Antigravity / VS Code Copilot Chat receive the workflow + subagents + MCP but run the deterministic gates as ADVISORY (the gate scripts are not copied into those projects). Making those gates execute on the three adapter platforms is tracked work (needs a uniform script-path resolution + per-platform testing), not yet shipped. Stale `cursor.mjs` header ("26 commands can't run there") corrected.
1850
491
 
1851
- - **`gitIdentities` → `identities`** (BREAKING — auto-migrated). Legacy field is accepted by `migrate-prefs.mjs` and renamed in place; back-compat schema preserves the old name as `deprecated: true` for one minor version.
1852
- - **`firebase_sa` + `firebase_project` → `firebase`** (BREAKING — auto-migrated). The two slots merge into a single `firebase` Service Account JSON entry; project id is parsed from the decoded JSON. The "sa" jargon is dropped per user feedback (consistency with `jira`/`github`/etc. single-word service IDs).
1853
- - **Phase 0 Step 6** — reads `prefs.global.identities`. When `identityRoutingEnabled` is on, auto-picks identity by origin URL; otherwise falls back to legacy multi-identity picker.
492
+ ## [9.3.3] - 2026-05-30
1854
493
 
1855
- ### Deprecated
1856
-
1857
- - `prefs.global.gitIdentities` — accepted by migration but renamed automatically. Will be removed in v3.9.
1858
- - `prefs.global.keychainMapping.firebase_sa` and `firebase_project` — accepted by migration but consolidated under `firebase`.
1859
-
1860
- ### Migration
1861
-
1862
- - Run `node ~/.claude/scripts/migrate-prefs.mjs` (idempotent, atomic, takes a `.v1.bak.<timestamp>` backup).
1863
- - Run `/multi-agent setup` to fill in the new optional fields (Steps 4 servicePatMap + 7-9).
1864
- - Identity routing + multi-repo are **opt-in** via flags — set `settings.identityRoutingEnabled` / `settings.multiRepoEnabled` to `true` to enable. Default-off keeps single-repo workflows bit-for-bit identical to v3.6.
1865
- - Full guide: `docs/MIGRATION-v3.6-to-v3.7.md`.
1866
-
1867
- ### Breaking Changes
1868
-
1869
- - `gitIdentities` field rename → migration handles it.
1870
- - `firebase_sa` + `firebase_project` consolidation → migration handles it.
1871
- - Phase 0 fetch failures no longer silently fall back to a stale cached ref → existing tasks unaffected (only matters when `git fetch` actually fails).
1872
-
1873
- No public command surface (`/multi-agent`, `/multi-agent setup`, etc.) changed. All breaking changes are in the preferences file and are auto-migrated.
1874
-
1875
- ---
1876
-
1877
- ## [3.6.0] — 2026-04-15
1878
-
1879
- The "scorecard 10/10" release. Every dimension flagged in the v3.5.0 architecture review is now enforced by tests, documented, or explicitly deferred to 4.0 with a concrete plan.
1880
-
1881
- ### Added
1882
-
1883
- - **Runtime validators for the three subagent schemas** — `validate-reviewer.mjs`, `validate-analysis.mjs`, `validate-planning.mjs`. Mandatory invocation is now documented in phase-1/2/4 docs; each validator has a matching smoke test (21 assertions across all three).
1884
- - **Structural parity gate** — `smoke-commands-skills-parity.sh` (34 assertions) verifies every `pipeline/commands/multi-agent/*.md` has a corresponding `skills/shared/multi-agent-*/` and vice versa. Auto-included by `npm test`.
1885
- - **Architecture Decision Records** — `docs/adr/` with 5 foundational decisions (3-model triage, instructionDriven flag, unified skills, zero-dependency philosophy, lazy phase docs). Format adapted from Michael Nygard's ADR template.
1886
- - **Performance surface** — `aggregate-metrics.mjs --markdown` flag produces GitHub-flavored tables suitable for PR descriptions and wiki embeds. `docs/performance.md` documents the full metrics pipeline, interpretation, and per-phase token budgets.
1887
- - **Community artifacts** — `.github/pull_request_template.md` mirroring CI checklist, `.github/CODEOWNERS` for auto-review assignment, `.github/FUNDING.yml`, expanded `CONTRIBUTING.md` with three how-to recipes (add a skill, add a phase, add a schema migration).
1888
- - **Error Recovery Playbook** — `docs/recovery-guide.md` consolidates every failure mode + recovery path into one reference.
1889
- - **Migration Plan doc** — `docs/MIGRATION_4.0.md` pre-publishes the Figma package split plan so integrators can prepare.
1890
- - **Worked examples** — `examples/` directory with four real transcripts (bugfix, autopilot, `--dev`, recovery).
1891
- - **Schema migration runner** — `scripts/migrate-state.mjs` + `pipeline/schemas/migrations/` scaffold. Runner resolves version chains via BFS; `smoke-migrate-state.sh` covers 8 edge cases.
1892
- - **Atomic state writer** — `scripts/write-state.mjs` with tmp+rename + advisory lock. Handles 5 concurrent writers without data loss (`smoke-write-state.sh` proves it).
1893
- - **Local CI mirror** — `scripts/pre-push-check.sh` reproduces the CI smoke workflow locally for users with Actions quota issues or air-gapped environments.
1894
- - **Skill discovery index** — `pipeline/skills/shared/README.md` auto-generated by `scripts/gen-skills-index.mjs` (6 categories, 145 skills).
1895
- - **Public ROADMAP** — `ROADMAP.md` with 3.6/4.0/long-term targets + "Not Planned" list.
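
The BFS version-chain resolution mentioned for the schema migration runner can be sketched roughly as below. This is an illustration only, not the code in `scripts/migrate-state.mjs`: the `resolveChain` name and the `{ from, to }` migration shape are assumptions made for the example.

```javascript
// Sketch: find the shortest chain of migrations from one schema version to another.
// `migrations` is a hypothetical array of { from, to } records.
function resolveChain(migrations, from, to) {
  if (from === to) return []; // already at the target version
  const queue = [from];
  // Maps each reached version to the migration step that reached it.
  const cameFrom = new Map([[from, null]]);
  while (queue.length > 0) {
    const version = queue.shift();
    for (const step of migrations) {
      if (step.from !== version || cameFrom.has(step.to)) continue;
      cameFrom.set(step.to, step);
      if (step.to === to) {
        // Walk back to the start to rebuild the ordered chain.
        const chain = [];
        for (let s = step; s; s = cameFrom.get(s.from)) chain.unshift(s);
        return chain;
      }
      queue.push(step.to);
    }
  }
  return null; // no migration path exists
}
```

Because BFS explores versions in order of distance, the first chain found is also the one with the fewest steps; a `null` result would be the natural place for a runner to report an unreachable target.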
1896
-
1897
- ### Changed
1898
-
1899
- - **Telemetry flipped to opt-in** — was `MULTI_AGENT_NO_TELEMETRY=1` to opt out; now `MULTI_AGENT_TELEMETRY=1` to opt in. Default is zero network calls.
1900
- - **Phase 6 instructionDriven dispatch** now uses a deterministic truth table (documented at phase-6-commit.md:25). Missing instruction files set `instructionDrivenFallback=true` and fall through to standard path instead of silent miss.
1901
- - **Phase 4 reviewers** explicitly use the `code-reviewer` subagent definition; previously the agent file was orphaned. Reviewer 2/3 stack-specific override is now per-reviewer rather than inline.
1902
- - **Phase 5 security audit** added as opt-in (`--audit` flag or release branches). Previously `security-auditor` agent had no caller.
1903
- - **Phase docs** trimmed where budget got tight — phase-4-review.md is now 2163 tokens (within 2200 max).
494
+ A 4-agent adversarial review of the v9.3.x work surfaced real defects in the features just shipped; this release fixes them.
1904
495
 
1905
496
  ### Fixed
1906
-
1907
- - **`node install.js --all` now works from Quick Start Option A.** The previous `slice(3)` assumed an `install` subcommand prefix (worked only via npx/bin). Now tolerates direct invocation.
1908
- - Two pre-existing lint errors in v3.5 scripts (unused param, missing Buffer import) caught and fixed by the new pre-push gate.
1909
- - `code-reviewer.md` frontmatter had duplicate `description:` keys; fixed.
497
+ - **evidence-gate was bypassable.** A failing build log that also contained the word "SUCCESS" (cached-step note, banner) passed because success and failure were weighed equally. Failure markers are now DECISIVE (a definitive failure marker fails the claim regardless of success text), success markers were narrowed (dropped the generic `\bSUCCESS\b`), and caller-supplied `--success/--failure-pattern` are length-capped + compiled in a try/catch so a bad pattern is a clean usage error, not a crash. (`evidence-gate.mjs`, `smoke-evidence-gate.sh`)
498
+ - **intent-guard misclassified questions as tasks.** "does it support offline mode", "should we enable caching" were read as tasks (the imperative check beat the interrogative) and would spin up a worktree. A strong question signal (interrogative lead / trailing `?` / TR particle) now wins over a bare imperative verb; an explicit polite request ("can you split this file") stays a task. (`classify-intent.sh`, `smoke-intent-guard.sh`)
499
+ - **consensus block was decorative.** `validate-triage.mjs` validated the v3.1.0 consensus block structurally but never cross-checked it: `unanimous-block` + `approved:true`, `unanimous-pass` + an accepted blocker, and a single-reviewer "unanimous" verdict now all fail validation. (`smoke-phase4-triage.sh`)
500
+ - **A malformed `tokens_cached` poisoned the whole tokens call.** `log-metric.sh` now sanitizes a non-integer cached count to 0 before forwarding, so the valid in/out counts still land. (`smoke-agent-log-cost.sh`)
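
The "failure markers are decisive" rule from the evidence-gate fix above could look something like the following minimal sketch. The marker lists, the 200-character cap, and the function names are assumptions chosen for illustration; the actual patterns in `evidence-gate.mjs` are not listed in this changelog.

```javascript
// Hypothetical marker lists; the real ones in evidence-gate.mjs may differ.
const FAILURE_MARKERS = [/\bBUILD FAILED\b/i, /^FAILURE:/m];
const SUCCESS_MARKERS = [/\bBUILD SUCCEEDED\b/i, /\bBUILD SUCCESSFUL\b/i];
const MAX_PATTERN_LENGTH = 200; // assumed cap for caller-supplied patterns

// Compile a caller-supplied pattern; a bad pattern becomes a usage error, not a crash.
function compilePattern(source) {
  if (typeof source !== "string" || source.length === 0 || source.length > MAX_PATTERN_LENGTH) {
    return { ok: false, error: "pattern must be 1-200 characters" };
  }
  try {
    return { ok: true, regex: new RegExp(source) };
  } catch (err) {
    return { ok: false, error: `invalid pattern: ${err.message}` };
  }
}

// Fail closed: missing evidence fails, and any failure marker wins over success text.
function evaluateEvidence(logText) {
  if (typeof logText !== "string" || logText.trim() === "") {
    return { pass: false, reason: "missing-or-empty" };
  }
  if (FAILURE_MARKERS.some((re) => re.test(logText))) {
    return { pass: false, reason: "failure-marker" };
  }
  if (SUCCESS_MARKERS.some((re) => re.test(logText))) {
    return { pass: true, reason: "success-marker" };
  }
  return { pass: false, reason: "no-success-marker" };
}
```

Checking failure markers before success markers is what closes the bypass: a log containing both a cached-step "SUCCESS" note and a definitive failure line is rejected, and a bare "SUCCESS" with no specific success marker is not enough to pass.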
1910
501
 
1911
502
  ### Security
1912
-
1913
- - `agent-state.schema.json` gains `instructionDrivenFallback` boolean for audit trail.
1914
- - Runtime validators block malformed subagent output instead of letting it flow into triage / Phase 3 TDD iteration.
1915
-
1916
- ### Migration Notes
1917
-
1918
- No action required for users. All changes are additive. Schema versions unchanged (agent-state 2.0.0, prefs 2.0.0). If you had `MULTI_AGENT_NO_TELEMETRY=1` set, remove it — it's a no-op now; set `MULTI_AGENT_TELEMETRY=1` if you want to opt in.
1919
-
1920
- ---
1921
-
1922
- ## [3.5.0] — 2026-04-15
503
+ - **Stopped real maintainer/employer identifiers from shipping in the npm tarball.** The dev-only figma substitution map (a scrub table that by design holds real upstream values), the two personal-data scanners, and two internal planning docs were excluded from the package via negated `files` entries; stray corporate hosts / repo names / a private Jira key in CHANGELOG + docs examples were genericized. The leak gate now scans the published npm tarball (not just the install tree), closing the hole that let these ship in 9.3.0-9.3.2. The repo/package are private, so this was not a public exposure. (`smoke-install-leak-gate.sh`, `.npmignore`, `package.json` files)
1923
504
 
1924
505
  ### Changed
506
+ - README "What's new" refreshed to v9.3.3; em-dashes removed from README and the `MANDATORY` keyword removed from `install/templates/copilot-instructions.md` (project style rules).
1925
507
 
1926
- - **Skills unified** — `pipeline/skills/claude/` + `pipeline/skills/copilot/` merged into a single `pipeline/skills/shared/` tree (145 skills total). Both Claude Code and Copilot CLI now install from the same source, so downloaders get identical skill coverage regardless of which CLI flag they pick. Previously Claude got 84 iOS-only skills with no multi-agent orchestration, Copilot got 60 Android/web + pipeline skills — zero overlap, asymmetric experience.
1927
- - **Skill format normalized** — 15 loose `*.md` files in the old copilot tree converted to `folder/SKILL.md` shape, matching the established convention used by the other 130 skills.
1928
- - **Personal data genericized** — 88 occurrences of author-specific identifiers removed from pipeline skills: `DIJITAL` → `PROJ`, `smart-mobile-ios` → `my-ios-app`, `turkish-airlines-ios-*` → `my-figma-app` / `my-ui-components`, private GitHub org name → `my-org`, author's domains → placeholders. Pipeline is now truly generic for downstream users.
1929
-
1930
- ### Added
1931
-
1932
- - **`smoke-personal-data.sh`** — CI guard that fails the build if private Jira keys, project directory names, or author-specific identifiers leak back into `pipeline/`. Auto-picked up by `npm test` via the existing `smoke-*.sh` glob.
508
+ ## [9.3.2] - 2026-05-30
1933
509
 
1934
510
  ### Fixed
1935
-
1936
- - **`node install.js --all` now works** — flag parsing previously assumed `install` was the first argv (valid only via `npx` / `bin`). Quick Start Option A (`git clone && node install.js --all`) silently installed only Claude Code because the `--all` flag was being dropped. Now tolerates both invocation styles.
1937
- - **Structure tests** — `test/project.test.mjs` updated to reflect the unified layout.
1938
-
1939
- ### Migration Notes
1940
-
1941
- No action required for users. Slash commands and install flags are unchanged. If you have local tooling that reads `pipeline/skills/claude/*` or `pipeline/skills/copilot/*` directly, point it at `pipeline/skills/shared/*` instead.
1942
-
1943
- ---
1944
-
1945
- ## [3.4.2] — 2026-04-15
511
+ - **Cost-ledger cache pricing is now wired end-to-end.** v9.3.0 added `cacheReadPerMtok` pricing + a cache-reads line to `render-agent-log-cost.sh`, but nothing fed `tokens_cached` to the tracker, so the feature was dormant. `phase-tracker.sh tokens` now accepts an optional 4th `cached` arg (defaults to 0, fully back-compatible), `log-metric.sh` forwards `tokens_cached=` into it, and the Phase 4 telemetry doc documents passing the host's `cache_read_input_tokens`. Verified end-to-end in `smoke-agent-log-cost.sh`.
512
+ - `evidence-gate.mjs` made executable to match its sibling `.mjs` scripts.
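
To make the cached-token flow concrete, here is a small sketch of how an optional cached count might default to 0 and be priced at the discounted rate. It is written in JavaScript for illustration even though `phase-tracker.sh` and `log-metric.sh` are shell scripts. The function names, the `inputPerMtok` / `outputPerMtok` rate keys, and the choice to bill cached reads separately from `tokensIn` are all assumptions; only `cacheReadPerMtok` and `tokens_cached` come from the entries above.

```javascript
// Treat anything that is not a plain non-negative integer as 0 cached tokens.
function sanitizeCount(raw) {
  return /^\d+$/.test(String(raw)) ? Number(raw) : 0;
}

// Price one phase. `tokensCached` is optional, mirroring the optional 4th tracker argument.
// Assumption: cached reads are counted separately from tokensIn.
function phaseCost({ tokensIn, tokensOut, tokensCached }, rates) {
  const cached = sanitizeCount(tokensCached);
  const total =
    tokensIn * rates.inputPerMtok +
    tokensOut * rates.outputPerMtok +
    cached * rates.cacheReadPerMtok;
  return total / 1e6; // rates are per million tokens
}
```

With this shape, callers that never pass a cached count get the same cost as before, and a malformed count degrades to "no cache hits" instead of invalidating the in/out totals.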
1946
513
 
1947
514
  ### Changed
515
+ - README "What's new" refreshed to v9.3.x (was stale at v8.8.1).
1948
516
 
1949
- - **Humanizer skill** — replaced bundled copy with original prompt skill inspired by blader/humanizer patterns
1950
- - **No external dependencies** — humanizer is now a self-written skill in `pipeline/skills/shared/`, no runtime fetch
1951
- - **Shared skills** — `pipeline/skills/shared/` installs to both Claude and Copilot CLIs via `cpSync`
1952
-
1953
- ### Removed
1954
-
1955
- - Runtime `fetch()` of external skills during install (zero network dependency)
1956
-
1957
- ---
1958
-
1959
- ## [3.4.1] — 2026-04-15
1960
-
1961
- ### Added
1962
-
1963
- - **Skills delivery** — `install.js` now installs skills to `~/.claude/skills/` and `~/.copilot/skills/`
1964
- - **84 Claude skills** — iOS/Swift/SwiftUI framework skills (SwiftUI patterns, Swift concurrency, StoreKit, HealthKit, MapKit, etc.)
1965
- - **45 Copilot skills** — Android/Kotlin, web, backend, and pipeline orchestration skills
1966
- - **15 loose Copilot skills** — standalone .md skills (React, Vue, TypeScript, CSS, etc.)
1967
- - **Humanizer skill** (shared) — removes AI-generated writing patterns, based on blader/humanizer (MIT)
1968
- - **Shared skills directory** — `pipeline/skills/shared/` installs to both Claude and Copilot
1969
- - **Skills directory tests** — pipeline directory structure tests now verify skills/copilot, skills/claude, skills/shared
517
+ ## [9.3.1] - 2026-05-30
1970
518
 
1971
519
  ### Fixed
1972
-
1973
- - Skills were missing from the repo — cloning and installing would not deliver any skills to users
1974
-
1975
- ---
1976
-
1977
- ## [3.2.1] — 2026-07-24
520
+ - **Learnings ledger no longer auto-suppresses rejected BLOCKING findings.** `learnings-ledger.mjs from-triage` previously distilled every rejected finding into a durable "do not re-flag" preference regardless of severity; a single wrong rejection of a blocking issue could permanently silence that class on future runs. Blocking-severity rejections are now skipped (reported as `skippedBlocking`); only lower-severity rejections become durable preferences, and they are recorded at `low` confidence.
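
The severity filter described in this fix can be sketched as follows. This is not the `learnings-ledger.mjs from-triage` implementation: the `distillRejections` name and the `severity`, `statement`, and `kind` field names are assumptions. Only `skippedBlocking` and the `low` confidence level are taken from the entry above.

```javascript
// Sketch: turn rejected findings into durable preferences, but never for blocking ones.
function distillRejections(rejectedFindings) {
  const entries = [];
  let skippedBlocking = 0;
  for (const finding of rejectedFindings) {
    if (finding.severity === "blocking") {
      skippedBlocking += 1; // reported, never persisted as "do not re-flag"
      continue;
    }
    entries.push({
      kind: "rejected-preference", // hypothetical kind label
      statement: finding.statement,
      confidence: "low",
    });
  }
  return { entries, skippedBlocking };
}
```

Reporting the skipped count, rather than dropping those findings silently, keeps the decision auditable: a reader of the run output can see that a blocking rejection happened and was deliberately not learned.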
1978
521
 
1979
522
  ### Added
523
+ - **`learnings-ledger.mjs forget`** subcommand to remove a bad or stale ledger entry by statement substring and/or kind (the one non-append operation), so a wrong learning can be cleared instead of persisting forever. Enforced by `smoke-learnings-ledger.sh`.
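
A minimal sketch of removal "by statement substring and/or kind" is shown below, under the assumption that ledger entries are objects with `statement` and `kind` fields. The `forget` function signature here is hypothetical and is not the CLI's actual interface.

```javascript
// Sketch: drop ledger entries matching a statement substring and/or a kind.
// At least one criterion is required so a bare call cannot wipe the ledger.
function forget(entries, { statement, kind } = {}) {
  if (!statement && !kind) {
    throw new Error("forget needs a statement substring and/or a kind");
  }
  const matches = (entry) =>
    (!statement || entry.statement.includes(statement)) &&
    (!kind || entry.kind === kind);
  const kept = entries.filter((entry) => !matches(entry));
  return { kept, removed: entries.length - kept.length };
}
```

When both criteria are given, this sketch requires an entry to match both before it is removed, which is the more conservative reading of "and/or" for a destructive operation.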
1980
524
 
1981
- - **`dev-autopilot` command** — dedicated fastest-path command (Dev + Autopilot combined)
1982
- - **5 operating modes** — normal, autopilot, dev, dev-autopilot, local (was 4)
1983
-
1984
- ### Changed
1985
-
1986
- - Help display updated with dedicated command aliases section
1987
- - Routing table expanded with `dev-autopilot` entry
1988
- - README: 8→9 phases, 2→3 model, add-detail→enrich across all references
1989
-
1990
- ---
1991
-
1992
- ## [3.2.0] — 2026-07-24
1993
-
1994
- Major pipeline update: 9-phase pipeline, 3-model review, 15 unified slash commands.
525
+ ## [9.3.0] - 2026-05-30
1995
526
 
1996
527
  ### Added
1997
-
1998
- - **Phase 6.5: Wiki**: auto-generates wiki markdown pages + Figma screenshots for component tasks
1999
- - **3-model parallel review**: Opus + GPT-5.4 + Sonnet replace the old 2-model (Opus + Sonnet) setup
2000
- - **11 new slash commands**: `enrich`, `dev`, `review`, `autopilot`, `test`, `issue`, `jira`, `sync`, `kill`, `purge`, `resume` (all modular .md files)
2001
- - **`/sync-instructions`** — cross-platform sync command with genericization + publish flow
2002
- - **Autopilot mode** — `autopilot` flag skips confirmations for unattended pipeline execution
2003
-
2004
- ### Changed
2005
-
2006
- - `add-detail` renamed to `enrich` (all references updated)
2007
- - Pipeline phases: 8 → 9 (added Phase 6.5 Wiki between Commit and Report)
2008
- - Review models updated: `claude-opus-4` → `claude-opus-4.6`, `claude-sonnet-4` → `claude-sonnet-4.6`
2009
- - Help display updated with 9-phase pipeline explanation
2010
- - Routing table expanded with all 15 commands
2011
-
2012
- ### Removed
2013
-
2014
- - `sync-instructions.md` from root commands (moved to `multi-agent/sync.md`)
2015
- - `add-detail.md` (replaced by `enrich.md`)
2016
-
2017
- ### Fixed
2018
-
2019
- - Stale 2-model references in phase-4-review.md, modes.md, help.md
2020
- - Stale 8-phase references in phases.md, swiftui-guide.md, dev.md
2021
- - Missing routing entries for sync, clear-logs, review-only, stack, language
2022
-
2023
- ---
2024
-
2025
- ## [3.1.0] — 2026-04-14
2026
-
2027
- New feature: interactive issue launchers for Jira and GitHub.
2028
-
2029
- ### Added — Interactive launchers
2030
-
2031
- - `multi-agent jira` — lists open Jira issues assigned to you (newest first), pick one to start the pipeline.
2032
- - `multi-agent issue` — lists unassigned open GitHub issues from the project remote (newest first), pick one to start. Auto-assigns the selected issue to you via `gh issue edit --add-assignee @me`.
2033
- - **Shared interactive flow** after selection: branch pick → mode (full/--dev) → autopilot? → pipeline starts.
528
+ - **Review consensus surfacing (anti-correlation).** Phase 4 triage now records an optional `consensus` block (triage-output schema v3.1.0): `reviewerCount`, a `verdict` (`unanimous-pass` / `unanimous-block` / `split` / `unverified`), and `disagreements[]`. Unanimous agreement among same-base-model reviewers on a judgment-heavy surface (security, auth, concurrency, money, migration) is marked `unverified` and surfaced to the user instead of being trusted as a pass. Disagreements are shown at the Step 4 checkpoint and written to the agent-log "Review Consensus" section. Validated by `validate-triage.mjs` + new fixtures in `smoke-phase4-triage.sh`.
529
+ - **Persistent learnings ledger** (`pipeline/scripts/learnings-ledger.mjs`, schema `learnings-ledger.schema.json`). A per-repo, append-only store of durable architectural facts, conventions, and explicitly rejected review preferences, stored next to the triage corpus. A compact `<repo-learnings>` brief is injected into Phase 1 analysis and Phase 4 triage so agents stop re-discovering structure and reviewers stop re-flagging rejected feedback (the most-cited cold-boot-amnesia complaint). Phase 7 distills each run's rejected findings into the ledger. On by default via `prefs.global.learningsLedger`; per-repo isolated. Enforced by `smoke-learnings-ledger.sh`.
530
+ - **Default-FAIL evidence gate** (`pipeline/scripts/evidence-gate.mjs`). A build/test/review "passed" claim is only trusted when a substantiating log artifact exists and shows success; the gate fails CLOSED on missing, empty, or contradicting evidence. Wired into Phase 3 (build), Phase 4 Stage 1 gates (build + test), and Phase 6 (commit). Enforced by `smoke-evidence-gate.sh`.
531
+ - **Conceptual-vs-edit intent guard** (`pipeline/lib/classify-intent.sh`). A deterministic, language-aware (EN + TR) classifier runs on free-text input at Phase 0; a question is answered in place instead of spinning up a branch/worktree. On by default via `prefs.global.intentGuard`. Enforced by `smoke-intent-guard.sh`.
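
The consensus block introduced above lends itself to a simple cross-check against the rest of the triage output. The sketch below covers the three contradictions that the later 9.3.3 fix says are rejected. It is not the `validate-triage.mjs` code: `checkConsensus` and the `accepted[].severity === "blocking"` shape are assumptions, while `consensus`, `reviewerCount`, `verdict`, and `approved` come from the changelog text.

```javascript
// Sketch: cross-check an optional consensus block against the triage verdict.
// Returns a list of human-readable errors; an empty list means consistent.
function checkConsensus(triage) {
  const errors = [];
  const consensus = triage.consensus;
  if (!consensus) return errors; // the block is optional

  const unanimous =
    consensus.verdict === "unanimous-pass" || consensus.verdict === "unanimous-block";
  if (unanimous && consensus.reviewerCount < 2) {
    errors.push("a unanimous verdict needs at least two reviewers");
  }
  if (consensus.verdict === "unanimous-block" && triage.approved === true) {
    errors.push("unanimous-block contradicts approved:true");
  }
  const hasAcceptedBlocker = (triage.accepted ?? []).some(
    (finding) => finding.severity === "blocking"
  );
  if (consensus.verdict === "unanimous-pass" && hasAcceptedBlocker) {
    errors.push("unanimous-pass contradicts an accepted blocker");
  }
  return errors;
}
```

Returning every contradiction, instead of stopping at the first, would let a validator report all inconsistencies of a malformed triage output in a single run.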
2034
532
 
2035
533
  ### Changed
2036
-
2037
- - Routing table in `multi-agent.md` expanded with `jira` and `issue` entries.
2038
- - Help display updated with new Interactive Launchers section + examples.
2039
- - README "Key Features" count: Nine → Ten highlights.
2040
-
2041
- ## [3.0.1] — 2026-04-14
2042
-
2043
- Final polish: all 8 phase docs under warn threshold, new pre-commit smoke test, `npm test` script.
2044
-
2045
- ### Added
2046
-
2047
- - `pipeline/scripts/smoke-pre-commit.sh` — end-to-end test for secret detection hook using real temp git repos (12 assertions: 6 secret patterns detected, 3 filename patterns, 2 clean file passes, 1 empty-stage check).
2048
- - `npm test` script in package.json — runs all 10 smoke suites + eval fixtures + schema validation (128 total assertions).
2049
- - **Testing** section in README with assertion counts and CI reference.
2050
-
2051
- ### Changed — Phase doc token budget: 0 warnings
2052
-
2053
- - phase-0-init.md: 3729 → 3366 tokens (compressed prefs example, project scan, Jira creation)
2054
- - phase-1-analysis.md: 1329 → 1166 tokens (removed verbose find commands, compressed stack table)
2055
- - phase-3-dev.md: 1609 → 1451 tokens (compressed build queue lock implementation)
2056
- - phase-4-review.md: 1910 → 1767 tokens (compressed gate examples)
2057
- - phase-2-planning.md, phase-5-test.md, phase-7-report.md: minor trims
2058
- - Token budget: 0 warnings, 0 failures (was 7 warnings in v3.0.0)
2059
-
2060
- ### Fixed
2061
-
2062
- - Removed stray `robustness:` file created by background agent.
2063
- - CI workflow comment updated to reflect actual test count (118+ assertions).
2064
-
2065
- ## [3.0.0] — 2026-04-14
2066
-
2067
- Major release: token cost −30%, full English docs, schema v3, new smoke tests, cost instrumentation.
2068
-
2069
- ### Breaking
2070
-
2071
- - `triage-output.schema.json` version bumped to `"3.0.0"` — downstream tools parsing the `version` field should update their checks.
2072
- - `docs/features.md` fully restructured as a versioned feature timeline (v2.0–v2.6). The old flat list is gone.
2073
- - `add-detail.md` is now English-primary (was Turkish-only). Same semantics, same flags.
2074
-
2075
- ### Added — Token budget enforcement
2076
-
2077
- - `pipeline/schemas/token-budget.json` — per-phase token limits with warn/max thresholds.
2078
- - `pipeline/scripts/smoke-token-budget.sh` — CI smoke test that fails when any phase doc exceeds budget.
2079
- - Token budget table added to `phases.md`.
2080
-
2081
- ### Added — Validator contradiction smoke test
2082
-
2083
- - `pipeline/scripts/smoke-validator-contradiction.sh` — 7 scenarios testing both directions of approved↔blocking auto-correction (11 assertions).
2084
-
2085
- ### Changed — Phase doc token cost −30%
2086
-
2087
- - phase-0-init.md: 462 → 263 lines (−43%)
2088
- - phase-4-review.md: trimmed (−20%+)
2089
- - phase-6-commit.md: 290 → 166 lines (−43%)
2090
- - phase-7-report.md: 300 → 166 lines (−45%)
2091
- - Total pipeline token cost: ~18,450 → ~13,282 tokens (−28%, all phases under warn threshold).
2092
-
2093
- ### Changed — Documentation excellence
2094
-
2095
- - `add-detail.md`: full Turkish → English translation; all MANDATORY rules preserved.
2096
- - `features.md`: complete rewrite — removes Haiku references, adds all v2.1–v2.6 features.
2097
- - `README.md`: "Seven highlights" → "Nine highlights" (accurate count).
2098
- - `MIGRATION.md`: v2.6→v3.0 upgrade section added.
2099
-
2100
- ### Changed — Script robustness
2101
-
2102
- - `keychain-save.sh`: credential passed via process substitution (not visible in `ps`).
2103
- - `log-metric.sh`: newline/control-char stripping before JSON encoding.
2104
- - `phase-tracker.sh`: error handling on corrupted state (exit 65).
2105
- - `smoke-telemetry.sh`: per-job exit code capture via PIDS array.
2106
-
2107
- ### Changed — Cost instrumentation
2108
-
2109
- - `aggregate-metrics.mjs`: `cost_per_model` block already present (v2.5.0); Phase 7 cost summary documented in phases.md budget table.
2110
-
2111
- ## [2.6.1] — 2026-04-13
2112
-
2113
- 19 findings from 4-agent review — all fixed in one batch.
534
+ - **Secret pre-commit gate** (`pre-commit-check.sh`) extended beyond pattern matching: high-signal provider-token prefixes (GitHub PAT, Slack, Google API key, Stripe, npm, GitLab), JWT detection, and a Shannon-entropy scan that catches custom/unknown secrets while exempting lockfiles, integrity hashes, source maps, and snapshots.
535
+ - **Per-phase cost ledger** (`render-agent-log-cost.sh`) now prices prompt-cache reads at the discounted `cacheReadPerMtok` rate (cost-table schema 1.1.0; backward-compatible, defaults to 0 cached), appends a "Top cost driver" line so the report shows where spend went, and surfaces a cache-reads line when the tracker recorded cache hits.
536
+ - Uninstall header and package description refreshed to the current 5-platform set (Cursor / Antigravity / VS Code Copilot Chat), replacing stale Windsurf/Cline references.
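
For readers unfamiliar with the Shannon-entropy scan mentioned for the secret pre-commit gate, the idea can be sketched as below. The 4.0 bits-per-character threshold, the 20-character minimum, and the exempt-path patterns are assumptions for the example; the values used by `pre-commit-check.sh` are not stated in this changelog, and that script is shell rather than JavaScript.

```javascript
// Shannon entropy in bits per character: H = -sum(p * log2(p)) over character frequencies.
function shannonEntropy(text) {
  const chars = [...text];
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical exemptions: lockfiles, source maps, and snapshots.
function isExemptPath(path) {
  return /(^|\/)(package-lock\.json|yarn\.lock|pnpm-lock\.yaml)$/.test(path) ||
    /\.map$/.test(path) ||
    /\.snap$/.test(path);
}

// Flag long, high-entropy tokens outside exempt files (assumed thresholds).
function looksLikeSecret(token, path, { minLength = 20, threshold = 4.0 } = {}) {
  if (isExemptPath(path) || token.length < minLength) return false;
  return shannonEntropy(token) >= threshold;
}
```

Random-looking keys spread their characters almost uniformly and so score high, while ordinary identifiers and prose repeat characters and score lower; the exemptions exist because files such as lockfiles legitimately contain long hashes.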
2114
537
 
2115
538
  ### Fixed
539
+ - Cursor uninstall left an empty `.cursor/commands/` directory behind, and the `.cursor` parent-empty cleanup ran before orchestration teardown so the parent was never reclaimed. Both now clean up after the orchestration uninstall.
2116
540
 
2117
- - Bidirectional approved↔blocking validator check (both directions).
2118
- - `acceptedFinding` allOf: second schema now has `additionalProperties: false`.
2119
- - Schema if/then constraint: `accepted contains blocking → approved must be false`.
2120
- - Stale Haiku references removed from knowledge.md + log-format.md.
2121
- - `--local` mode added to routing table in multi-agent.md.
2122
- - `mktemp` failure guarded in smoke-add-detail.sh.
2123
- - `set -uo pipefail` added to all smoke scripts.
2124
- - `pbcopy` fallback for Linux in phase-banner.sh.
2125
- - `$HOME` validation in keychain-save.sh.
2126
-
2127
- ## [2.6.0] — 2026-04-13
2128
-
2129
- Closing the v2.5.0 evaluation gap: eval suite gets adversarial cases,
2130
- docs get a token-budget trim.
2131
-
2132
- ### Added — 5 adversarial eval fixtures
2133
-
2134
- - `06-severity-mismatch` — reviewer tagged a real security issue as suggestion
2135
- - `07-duplicate-findings-from-two-reviewers` — Opus + Sonnet flagged same issue
2136
- - `08-stylistic-blocker-misclassification` — naming preference tagged as blocking
2137
- - `09-cascading-finding` — symptom + root cause both flagged independently
2138
- - `10-deferred-with-cross-reference` — out-of-scope but blocks next sprint
2139
- - Total eval suite: 10/10 passing (was 5/5).
2140
-
2141
- ### Changed — Doc trim (-15-25% per file)
2142
-
2143
- - phase-0-init.md, phase-4-review.md, phase-6-commit.md, phases.md trimmed.
2144
- - All MANDATORY rules, exit codes, schemas preserved verbatim.
2145
- - Token cost per pipeline run reduced proportionally.
2146
-
2147
- ## [2.5.0] — 2026-04-13
2148
-
2149
- Closing the v2.2.0 evaluation loop: triage gets a semantic regression
2150
- suite, telemetry tracks per-model cost, and the single-point-of-failure
2151
- gets an opt-in second opinion.
2152
-
2153
- ### Added — Triage eval suite (`pipeline/eval/triage/`)
2154
-
2155
- - 5 fixture cases (`01-empty-findings`, `02-real-blocker`,
2156
- `03-out-of-scope-defer`, `04-false-positive-reject`,
2157
- `05-mixed-classification`) — each with `input.json`, `expected.json`,
2158
- and a `notes.md` explaining the failure mode it guards against.
2159
- - `pipeline/scripts/eval-triage.mjs` — zero-dep runner. For each case:
2160
- validates `expected.json` against the schema + the runtime validator,
2161
- asserts coverage (every raw finding lands in exactly one bucket), and
2162
- asserts no inventions (no fabricated findings). Filter via
2163
- `--case <name>`; CI-friendly via `--json`.
2164
- - CI runs the eval suite on every push (added to `.github/workflows/smoke.yml`).
2165
- - `pipeline/eval/triage/README.md` documents the "how to add a new case"
2166
- flow so future production failure modes can be captured as fixtures.
2167
-
2168
- ### Added — Per-model cost / token telemetry
2169
-
2170
- - New convention (v2.5.0+): events involving a model call include
2171
- `model=<name>`, `tokens_in=<N>`, `tokens_out=<N>` in their details.
2172
- - `aggregate-metrics.mjs` now produces a `cost_per_model` block summing
2173
- calls, duration, and tokens per model. Plain-text rendering shows a
2174
- per-model table; JSON output adds the `cost_per_model` field.
2175
- - Phase 4 spec updated to emit `review.reviewer_call` (one per Opus +
2176
- one per Sonnet) and `review.triage_call` events with the cost fields.
2177
- - Token counts are best-effort — emitters omit fields the host CLI
2178
- doesn't expose; aggregator handles missing fields gracefully.
2179
-
2180
- ### Added — Optional triage cross-check (single-point-of-failure mitigation)
2181
-
2182
- - New preference: `prefs.global.triageCrossCheck` (default off).
2183
- When enabled, a configurable percentage of triage runs are
2184
- re-classified by a Sonnet "second opinion" agent and the diff vs.
2185
- Opus is logged.
2186
- - Schema enforces: `samplePct` 1-100 (default 10), `model` locked to
2187
- `sonnet` (Opus would just agree with itself), `blockOnDisagreement`
2188
- default false (log-only mode).
2189
- - Phase 4 §3.5 documents the dispatch + diff logic, including the
2190
- deterministic dice (so `task_id + iteration` reproduces the same
2191
- sample decision).
2192
- - New events: `triage.cross_check` (summary per run) and
2193
- `triage.cross_check_diff` (per-finding disagreement).
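
One way to get the "deterministic dice" behaviour described above, where `task_id + iteration` reproduces the same sample decision, is to hash the pair and compare it against `samplePct`. The sketch below uses a 32-bit FNV-1a hash purely as an example; the changelog does not say which function the pipeline uses, and `shouldCrossCheck` is a hypothetical name.

```javascript
// 32-bit FNV-1a over UTF-16 code units; chosen here only because it needs no imports.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value as an unsigned 32-bit int
  }
  return hash;
}

// Same task id and iteration always land in the same bucket (0-99),
// so the sampling decision is reproducible across reruns.
function shouldCrossCheck(taskId, iteration, samplePct = 10) {
  return fnv1a(`${taskId}:${iteration}`) % 100 < samplePct;
}
```

Compared with `Math.random()`, a hash-based decision can be replayed when debugging a specific run, at the cost of sample rates that are only approximately `samplePct` over small numbers of tasks.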
2194
-
2195
- ### Smoke tests (+8 assertions, 88 total)
2196
-
2197
- - `smoke-telemetry.sh` — 4 new assertions: per-model summing
2198
- (calls, tokens_in, duration_ms across multiple events), and
2199
- empty `cost_per_model` when no model fields are emitted.
2200
- - `smoke-prefs-language.sh` — 4 new assertions: `triageCrossCheck`
2201
- schema shape (default false, samplePct 10 default, model locked to
2202
- sonnet, samplePct bounded [1, 100]).
2203
-
2204
- ### Total test surface
2205
-
2206
- 88 smoke assertions + 5 eval cases + 3 schema validations = **96 passing
2207
- contract checks**, all zero-dep, all run in CI on every push.
2208
-
2209
- ### Why this matters
2210
-
2211
- v2.2.0 evaluation flagged three weaknesses the framework couldn't
2212
- self-detect: triage hallucination/drift, no cost visibility, and a
2213
- single-model dependency. v2.5.0 directly addresses each:
2214
-
2215
- - Semantic drift → caught by eval fixtures + CI
2216
- - Cost blindness → telemetry now tracks tokens per model
2217
- - SPoF → opt-in Sonnet cross-check available
2218
-
2219
- The pipeline now has self-monitoring loops at three levels: structural
2220
- (validators), behavioral (smoke), and semantic (eval).
2221
-
2222
- ---
2223
-
2224
- ## [2.4.0] — 2026-04-13
2225
-
2226
- Cross-CLI visual parity + migration documentation.
2227
-
2228
- ### Fixed — Copilot was blind to phase progression (urgent)
2229
-
2230
- v2.3.0 and earlier relied on Claude Code's `TaskCreate` for the visual
2231
- "phase tile" UI — but `TaskCreate` is Claude-only. Copilot CLI users saw
2232
- no progress signal at all between phases. v2.4.0 closes the gap.
2233
-
2234
- ### Added — Cross-CLI phase tracker
2235
-
2236
- - **`pipeline/scripts/phase-tracker.sh`** — stateful card-stack tracker.
2237
- Both Claude Code and Copilot CLI shell out to it; both render identical
2238
- ASCII card stacks (○ pending, ● in_progress, ✓ done, ✗ failed,
2239
- ↷ skipped). State persists per-task at
2240
- `$HOME/.claude/logs/multi-agent/<task_id>/tracker-state.json`.
2241
- - **`pipeline/scripts/phase-banner.sh`** — single-event ANSI banner
2242
- companion (start/end/sub) for extra emphasis on long-running phases.
2243
- - **Phase 0 Step −1** runs the tracker bootstrap as the FIRST thing in
2244
- every pipeline run (before prefs, before input parse).
2245
- - **`commands/multi-agent/refs/phases.md`** "Visual Phase Tracker"
2246
- section now documents the three-tier signal model and the MANDATORY
2247
- contract: every phase boundary must call
2248
- `phase-tracker.sh update <N> <status>`.
2249
-
2250
- ### Added — Migration guide
2251
-
2252
- - **`docs/MIGRATION.md`** — per-version upgrade notes for v1→v2.0,
2253
- v2.0→v2.1 (Haiku removal), v2.1→v2.2 (Phase 5 state flag),
2254
- v2.2→v2.3 (validator MANDATORY), v2.3→v2.4 (tracker bootstrap).
2255
- Each section lists "what changed" + "action required" + state-file
2256
- compatibility notes.
2257
- - README links it alongside CHANGELOG.
2258
-
2259
- ### Added — Smoke tests (+30 assertions, 75 total)
2260
-
2261
- - `smoke-phase-tracker.sh` — 19 assertions covering init, add, update,
2262
- sub-add, sub-update-in-place, status enum, render shape, ANSI
2263
- stripping, unknown-action exit code.
2264
- - `smoke-phase-banner.sh` — 11 assertions covering start/end/sub
2265
- rendering, status glyphs, plain-text mode, enum enforcement,
2266
- minimal-args handling.
2267
-
2268
- ### Why this matters
2269
-
2270
- The v2.3.0 evaluation didn't flag this because both reviewers used
2271
- Claude — neither tested the Copilot path. v2.4.0 is the first release
2272
- that treats Copilot CLI as a first-class consumer rather than a mirror
2273
- target. Future phases of the pipeline will be required to use the
2274
- tracker as the source of truth for "what stage are we at" so the two
2275
- CLIs can never drift visually again.
2276
-
2277
- ---
2278
-
2279
- ## [2.3.0] — 2026-04-13
2280
-
2281
- Markdown → code: Phase 4 triage now has a real runtime gate, the pipeline
2282
- emits structured telemetry, and a sync-parity script catches mirror drift
2283
- before it ships.
2284
-
2285
- ### Added — Runtime triage validator
2286
-
2287
- - **`pipeline/scripts/validate-triage.mjs`** — zero-dep Node validator that
2288
- enforces the `triage-output.schema.json` contract AND the §3.3 edge-case
2289
- rules in code. Pipeline pipes triage output through it; exit codes are the
2290
- contract:
2291
- - `0` valid + clean → act on output
2292
- - `1` invalid → retry once, then fallback (treat raw findings as accepted)
2293
- - `2` over-rejection guard tripped → pause for human (autopilot logs and
2294
- accepts triage verdict)
2295
- - `3` `approved:false` with no blocker → use `result.corrected` field
2296
- - **Phase 4 §3.2.1** — MANDATORY validator step before Phase 5/Phase 6 entry.
2297
- Captures validator stdout into `state.reviewIterations[-1].validatorResult`
2298
- for Phase 7 audit.
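
The exit-code contract listed above maps naturally onto a small dispatch function. The sketch below is a reading of that list, not pipeline code: `nextAction`, its option names, and the returned action labels are all made up for illustration.

```javascript
// Sketch: translate a validate-triage exit code into the pipeline's next step.
function nextAction(exitCode, { alreadyRetried = false, autopilot = false } = {}) {
  switch (exitCode) {
    case 0:
      return "act-on-output"; // valid and clean
    case 1:
      // invalid: retry once, then fall back to treating raw findings as accepted
      return alreadyRetried ? "fallback-accept-raw-findings" : "retry";
    case 2:
      // over-rejection guard: a human confirms, unless autopilot logs and accepts
      return autopilot ? "log-and-accept-verdict" : "pause-for-human";
    case 3:
      return "use-corrected-result"; // approved:false with no blocker
    default:
      return "unknown-exit-code";
  }
}
```

Keeping the mapping in one place makes the contract testable on its own, independent of how the validator is invoked.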
2299
-
2300
- ### Added — Telemetry
2301
-
2302
- - **`pipeline/scripts/log-metric.sh`** — JSONL appender. One event per line,
2303
- atomic append (within `PIPE_BUF`), best-effort (never fails the pipeline).
2304
- Default file: `$HOME/.claude/logs/multi-agent/metrics.jsonl`.
2305
- - **`pipeline/scripts/aggregate-metrics.mjs`** — zero-dep aggregator. Reads
2306
- metrics, prints summary (text or JSON), supports `--since` / `--task-id` /
2307
- `--phase` filters. Computes review cycles avg/p95, triage classification
2308
- rates, edge-case occurrences, rework iteration distribution, language
2309
- preference distribution.
2310
- - **Phase 4 §3.4** — emit `review.completed` and `triage.edge_case` events.
2311
- - **Phase 3 re-entry** — emit `rework.started` event.
2312
- - **Phase 7 §1** — emit `task.completed` event AND embed
2313
- `aggregate-metrics.mjs --since=30d` summary in `agent-log.md`.
2314
-
2315
- ### Added — Sync parity
2316
-
2317
- - **`pipeline/scripts/sync-parity-check.sh`** — diffs two mirror trees,
2318
- reports `MISSING-FROM-LEFT` / `MISSING-FROM-RIGHT` / `DIFFERS`, supports
2319
- `--ignore <regex>` for intentional divergences. Exit 0 = parity, 1 = drift,
2320
- 2 = usage error. Suggested use: pre-push hook or before
2321
- `/sync-instructions full`.
2322
-
2323
- ### Added — Smoke tests (+25 assertions)
2324
-
2325
- - `smoke-phase4-triage.sh` — rewritten to exercise the real validator (was
2326
- previously fixture-shape only). 7 assertions covering all 4 exit codes.
2327
- - `smoke-telemetry.sh` — 8 assertions: required-field shape, type coercion
2328
- (bool/int/string), 10-way concurrent append (no interleaving), missing-file
2329
- graceful handling, accept_rate/reject_rate math, valid `--json` output.
2330
- - `smoke-sync-parity.sh` — 10 assertions: identical / extra-left / extra-right
2331
- / drifted / `--ignore` filter / bad-args.
2332
-
2333
- ### Why this matters
2334
-
2335
- - v2.2.0 documented the triage edge cases in markdown — runtime relied on
2336
- Claude reading and following those rules. v2.3.0 turns that documentation
2337
- into a deterministic gate. Misbehaving triage agents can no longer leak
2338
- bad output downstream silently.
2339
- - Telemetry closes the "we have no idea how this performs in practice" gap
2340
- flagged in the v2.2.0 evaluation. Future tuning decisions can use data.
2341
- - Sync parity is the catch-net for the "I forgot to run /sync-instructions"
2342
- failure mode that was a real risk after the personal/Copilot/repo split.
2343
-
2344
- ---
2345
-
2346
- ## [2.2.0] — 2026-04-13
2347
-
2348
- Triage hardening + Phase 3 re-entry contract + smoke tests.
541
+ ## [9.2.0] - 2026-05-30
2349
542
 
2350
543
  ### Added
2351
-
2352
- - **`schemas/triage-output.schema.json`** - formal JSON Schema for Phase 4 triage output. Declares required fields, enum for `reviewer` (opus / sonnet; Haiku is not in the enum, so outputs claiming `reviewer: "haiku"` are rejected), and makes `fix` mandatory on `accepted` items so Phase 3 re-entry always has actionable direction.
2353
- - **Phase 4 edge-case matrix** (`phase-4-review.md` §3.3) - explicit rules for invalid JSON, schema validation failures, triage timeouts, the over-rejection "gaslighting" guard (>80% rejected + ≥5 findings → pause for human confirm), `approved:false` with zero accepted (auto-corrected), and hallucinated findings that don't map to raw input (stripped).
2354
- - **Phase 4 short-circuit** - if both reviewers return zero findings, triage is skipped entirely with `{approved: true}` written directly.
2355
- - **Phase 3 re-entry section** (`phase-3-dev.md`) — explicit contract that rework consumes `triage.accepted` only. `deferred` items go to Phase 7; `rejected` items are never touched. Covers per-severity loop behavior and the 3-iteration hard-kill.
2356
- - **Phase 5 explicit skip state** — when the user declines testing, Phase 5 now writes `state.phases["5"].status = "skipped"`. Phase 6 reads this flag explicitly instead of relying on heuristics.
2357
- - **Smoke tests:**
2358
- - `smoke-phase4-triage.sh` — 6 contract assertions: empty-findings short-circuit, accepted-with-fix validation, `fix`-missing rejection, `reviewer: "haiku"` rejection, over-rejection guard math.
2359
- - `smoke-prefs-language.sh` — 6 assertions: enum is exactly {en, tr}, default is "en", round-trip for each value, unknown-value rejection, missing-field tolerance.
544
+ - **Full-pipeline orchestration on three more platforms** (previously knowledge-layer only). Cursor (`.cursor/agents/ma-*.md` subagents + `.cursor/commands/multi-agent.md` + `.cursor/mcp.json`), Antigravity (`.agent/workflows/multi-agent.md` + `.agent/rules/` + `AGENTS.md` + `.agent/mcp_config.json`), and VS Code Copilot Chat (`.github/agents/ma-*.agent.md` + `.github/prompts/multi-agent.prompt.md` + `.vscode/mcp.json`). Each adapter transforms the pipeline personas into the platform's subagent/agent format and registers the dev-toolkit MCP server. Install with `--cursor` / `--antigravity` / `--copilot-chat` (or `--all-tools`).
545
+ - **Picker contract** (`refs/picker-contract.md`) + `pipeline/lib/ask-choice.sh`: a cross-platform single-choice abstraction so confirmations degrade gracefully where there is no native `AskUserQuestion` (numbered-menu fallback; `ASK_CHOICE_DEFAULT` for autopilot/CI).
546
+ - **Proactive token-budget cap** (`prefs.global.costBudget` + `cost-budget-check.mjs`): prices the phase-tracker accumulators live and warns/halts before spend runs away.
547
+ - **Eval harness** expanded from 2 to 7 golden tasks across all stacks and every triage bucket.
2360
548
 
2361
549
  ### Changed
2362
-
2363
- - **Triage prompt skeleton** now references `pipeline/schemas/triage-output.schema.json` directly so the agent knows its contract. Existing reviewer prompts unchanged.
2364
-
2365
- ### Documentation
2366
-
2367
- - Every Phase 4 edge case lands in `state.reviewIterations[].decision` with a new audit string (e.g. `triage=failed-json-parse`, `triage=high-rejection-rate`, `triage=hallucinated-finding-stripped`) so post-run analysis can trace why a specific iteration did or didn't loop back.
2368
-
2369
- ---
2370
-
2371
- ## [2.1.1] — 2026-04-13
2372
-
2373
- Release-hygiene patch for v2.1.0 — documents the prior release retroactively,
2374
- adds a language preference for interactive prompts, and ships the
2375
- `/multi-agent language` command.
2376
-
2377
- ### Added
2378
-
2379
- - **`global.promptLanguage: "en" | "tr"`** in `prefs.schema.json` (default `en`).
2380
- Controls the language of user-facing prompts only. Reviewer/triage prompts,
2381
- commit messages, PR bodies, and Jira comments stay in English regardless.
2382
- - **`/multi-agent language [en|tr]`** — new action to show or set the prompt
2383
- language. No arg prints the current value; `en`/`tr` sets it atomically with
2384
- schema validation. See `pipeline/commands/multi-agent.md` "language Command".
2385
- - **First-run prompt-language picker** — on the first pipeline run after the
2386
- upgrade (or when the key is missing), Phase 6 asks once and persists the
2387
- answer; never re-asks unless the user runs `/multi-agent language`.
2388
- - **Bilingual Phase 5 / Phase 6 prompts** — "Want to test the code?" and the
2389
- pre-commit local-checkout prompt now ship English + Turkish variants, picked
2390
- via `promptLanguage`.
550
+ - **Confirmations are now native pickers** instead of typed keywords (`AskUserQuestion` on Claude Code, degrading per the picker contract elsewhere). Removed the typed `y/N` / `onayla`/`iptal` prompts.
551
+ - **Command/skill instruction files are English** throughout (token efficiency + model comprehension); `outputLanguage` still governs all runtime user-facing text.
552
+ - Phase 5 (User Test) now runs only in interactive worktree-backed modes (`dev`, `full`); every autopilot/local variant skips it.
553
+ - Analysis->plan contract field names aligned across schema, validator, and phase docs; the "no MCP outside analysis" gate made enforceable (telemetry recorded + checked).
2391
554
 
2392
555
  ### Fixed
556
+ - Command-injection vectors in `diff-explain.mjs` and `figma-screenshot.sh`; `review-watch` cursor loss (re-reviewed PRs forever); `diff-risk` / `classify-plan-safety` / `match-skills` logic defects; `write-state` stale-lock deadlock; several pre-existing test failures (mode-dispatch drift, README/install-layout counts, token budgets).
2393
557
 
2394
- - **Genericization:** removed Turkish-only copy from the open-source Phase 6
2395
- prompt (was embedded in v2.1.0 by mistake). Default is now English; Turkish
2396
- is opt-in via the preference.
2397
-
2398
- ### Release notes retro-fix
2399
-
2400
- - GitHub Release for **v2.1.0** published alongside this patch with full notes
2401
- (the v2.1.0 tag was pushed on 2026-04-13 without a Release page — this
2402
- patch restores parity).
2403
-
2404
- ---
558
+ ### Removed
559
+ - Dead `--windsurf` / `--cline` / `--continue` / `--zed` install flags (the adapters were dropped in 8.5.4; only the advertising lingered).
2405
560
 
2406
- ## [2.1.0] - 2026-04-13
561
+ ## [9.1.1] - 2026-05-16
2407
562
 
2408
- Review topology change + Phase 6 local-checkout prompt.
563
+ **Patch** - Pipeline-wide humanizer punctuation sweep. 558 files, 5660 character replacements. Pre-existing em-dash / en-dash / ellipsis / curly quotes / section sign codepoints (U+2013, U+2014, U+2026, U+201C, U+201D, U+2018, U+2019, U+00A7) replaced with ASCII equivalents per Locked decision 7 (humanizer punctuation policy is non-negotiable).
2409
564
 
2410
565
  ### Changed
2411
566
 
2412
- - **Phase 4 review: 3-model -> 2-model + Opus triage.** Reviewer set is now
2413
- Opus (security + architecture) and Sonnet (quality + correctness + edge
2414
- cases). Haiku reviewer removed. After the parallel pass, a single Opus
2415
- triage agent classifies each finding as **accepted / deferred / rejected**
2416
- against task scope — only `accepted` blocking items loop back to Phase 3.
2417
- Rationale: reviewer outputs are raw signals, not commands; the triage step
2418
- prevents hallucinated or out-of-scope findings from triggering rework.
2419
- See `pipeline/commands/multi-agent/refs/phases/phase-4-review.md`.
2420
-
2421
- ### Added
2422
-
2423
- - **Phase 6 pre-commit local-checkout prompt.** Before committing, if Phase 5
2424
- was skipped (e.g. `--dev` mode or user declined), the pipeline now asks the
2425
- user whether to checkout the branch locally and test first. Skipped in
2426
- `autopilot`. See Step 1 of the standard Phase 6 flow.
2427
-
2428
- ### Migration
2429
-
2430
- - **Practical-breaking for consumers who read reviewer output directly:** the
2431
- Haiku reviewer findings are no longer produced. Downstream tooling that
2432
- parsed a third reviewer's output should drop that parse path. Accepted
2433
- findings still land in the same JSON shape.
2434
- - **Default behavior unchanged** for users who let the pipeline act on findings
2435
- (accepted blocking items still loop to Phase 3; deferred items still go to
2436
- the Phase 7 report).
2437
-
2438
- ---
2439
-
2440
- ## [2.0.1] — 2026-04-13
2441
-
2442
- Repo hygiene patch — no runtime behavior changes; ships CI infrastructure,
2443
- security policy, and a zero-dep schema validator alongside the v2.0.0 package.
567
+ - 558 pipeline files swept for banned characters. Top changes by file:
568
+ - `pipeline/skills/skills-index.md`: 214 chars
569
+ - `pipeline/commands/multi-agent/channels.md`: 103 chars
570
+ - `pipeline/skills/shared/core/multi-agent/SKILL.md`: 97 chars
571
+ - `pipeline/commands/multi-agent/setup.md`: 79 chars
572
+ - Mapping: `U+2014` -> ` - `, `U+2013` -> `-`, `U+2026` -> `...`, `U+201C/D` -> `"`, `U+2018/9` -> `'`, `U+00A7` -> `section`.
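The mapping above can be sketched as a single translation table. This is a minimal illustration, assuming a plain string-to-string sweep; the names `PUNCT_TO_ASCII` and `to_ascii_punctuation` are hypothetical and are not taken from the package's actual sweep tooling.

```python
# Hypothetical sketch of the ASCII mapping listed in the 9.1.1 entry.
# Not the package's real sweep script; names are illustrative only.
PUNCT_TO_ASCII = {
    "\u2014": " - ",      # em dash
    "\u2013": "-",        # en dash
    "\u2026": "...",      # horizontal ellipsis
    "\u201c": '"',        # left double quote
    "\u201d": '"',        # right double quote
    "\u2018": "'",        # left single quote
    "\u2019": "'",        # right single quote
    "\u00a7": "section",  # section sign
}

# str.maketrans accepts single-character keys mapped to strings of any length.
_TABLE = str.maketrans(PUNCT_TO_ASCII)


def to_ascii_punctuation(text: str) -> tuple[str, int]:
    """Return the swept text and the number of characters replaced."""
    replaced = sum(text.count(ch) for ch in PUNCT_TO_ASCII)
    return text.translate(_TABLE), replaced
```

Counting replacements before translating is what would let a sweep report per-file totals like the "214 chars" figures above, though how the real tool counts is not stated in the entry.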
2444
573
 
2445
- ### Added
2446
-
2447
- - **`.github/workflows/smoke.yml`** — runs on every push and PR, three jobs:
2448
- - `smoke` auto-discovers and executes every `pipeline/scripts/smoke-*.sh`
2449
- - `schema-validate` runs the zero-dep validator (no Ajv install)
2450
- - `no-personal-data` greps for known corp identifiers with proper word
2451
- boundaries (so `mmerterden` doesn't false-positive on the `MERTERDEN`
2452
- pattern)
2453
- - **`SECURITY.md`** — coordinated disclosure policy: 5-business-day ack,
2454
- 30-day patch SLA for high/critical, in/out-of-scope clearly defined
2455
- - **`pipeline/scripts/validate-schemas.mjs`** — zero-dep shallow validator
2456
- for `*.schema.json`. Uses only Node.js built-ins; no Ajv, no peer deps.
2457
- Catches the regressions we've actually hit (JSON parse errors, missing
2458
- `$schema`/`$id`/`title`/`type`). For deep validation, run
2459
- `npx ajv compile -s <file>` locally — no install needed.
2460
- - **README**: Smoke Tests CI badge + SECURITY.md pointer + tree updates
2461
-
2462
- ### Changed
574
+ ### Excluded from sweep
2463
575
 
2464
- - **`pipeline/commands/multi-agent/help.md`** - `screenshots in Turkish` ->
2465
- `screenshots in the requested locale (e.g. tr, en, de)` for a generic
2466
- example (the locale was always parameterized; only the description
2467
- named one specific locale)
2468
- - **`pipeline/commands/multi-agent/refs/rules.md`** — removed redundant
2469
- Turkish-language duplicate of the External System Payloads section.
2470
- Merged the relevant terminal-output note into the existing English rule
2471
- list as item #4.
576
+ - `CHANGELOG.md`: historical version blocks preserved as audit trail (humanizer policy applies to new content, not retroactive history rewrite).
577
+ - `pipeline/lib/md2confluence-v3.py`: PUNCT_MAP intentionally uses Unicode escapes (`\u2014` etc.) for replacement source; replacing the keys would break the converter.
578
+ - `pipeline/scripts/smoke-personal-data.sh`: pattern literals contain U+00A7 references for matching purposes.
2472
579
 
2473
- ### Fixed
580
+ ### Verified
2474
581
 
2475
- - **CI leak guard**: the case-insensitive `MERTERDEN` pattern was matching
2476
- the legitimate public GitHub username `mmerterden` (substring match).
2477
- Refactored to a `TAG|REGEX|FLAGS` triplet table so personal-name
2478
- patterns can use `\bWORD\b` word boundaries while corp-host patterns
2479
- stay simple. Verified: 0 leaks.
582
+ - Punctuation gate: 0 files with banned characters remaining (was 558 files, 5389 lines).
583
+ - Smoke gate `smoke-personal-data.sh`: 22 patterns clean, 0 leaks.
584
+ - `bash -n` clean on sampled scripts.
585
+ - `python3 -m py_compile` clean on `md2confluence-v3.py`.
2480
586
 
2481
- ### Reaffirmed
587
+ ### Migration notes
2482
588
 
2483
- - **Zero runtime dependencies** - `package.json` has `dependencies: null`,
2484
- `devDependencies: null`, `peerDependencies: null`. The previous draft of
2485
- `validate-schemas.mjs` imported `ajv` + `ajv-formats`; that was reverted
2486
- before this release. The npm package ships with Node built-ins only.
589
+ - No behavior change. Pure cosmetic / typographic consistency.
590
+ - Downstream consumers reading these docs see ASCII-only prose now. Code blocks, URLs, and front-matter YAML were already ASCII.
2487
591
 
2488
592
  ---
2489
593
 
2490
- ## [2.0.0] - 2026-04-13
594
+ ## [9.1.0] - 2026-05-16
2491
595
 
2492
- First stable release. All prior 1.x versions were the discovery phase; 2.0 locks the API surface.
596
+ **Minor** - Consolidation release. Locked decision category index (5 groups), Lite mode scoring relaxed from AND to 2/3, legacy v2 analysis soft-skip flag (`MULTI_AGENT_LEGACY_V2_ANALYSIS=allow`), dead helper cleanup, 31-command contract drift fix, broader PII smoke gate, and 6 measurable performance improvements across the hot path.
2493
597
 
2494
598
  ### Added
2495
599
 
2496
- - **`CHANGELOG.md`** (this file). All releases traceable from one place.
2497
- - **JSON schemas** for `agent-state.json` and `multi-agent-preferences.json` under `schemas/` for tooling and LLM reference.
2498
- - **Platform Support** section in README - macOS is primary; Linux/Windows supported with documented token-storage fallbacks.
2499
- - **Install telemetry transparency** - README explicitly documents the one-time install ping and the `MULTI_AGENT_NO_TELEMETRY=1` opt-out env var.
2500
- - **Phase 6 flow diagram** - ASCII diagram showing the order of reviewers fetch -> draft-or-ready -> body -> PUT payload.
2501
- - **Phase file TLDRs** — every phase-spec file starts with a 3-4 line summary so the main command can load `head -10` of a phase file for context-lite reads (significant token savings on full-pipeline runs).
2502
- - **Smoke test** — `scripts/smoke-add-detail.sh` dry-runs the `add-detail` payload against a test PR and verifies reviewer preservation (regression guard for the v1.8 wipeout incident).
2503
- - **`refs/rules.md`** — global pipeline rules extracted from `multi-agent.md` so the top-level command stays focused on routing.
600
+ - `pipeline/commands/multi-agent/analysis.md` "Locked decisions Index by category": 30 decisions grouped into Governance, Citation and Evidence, Output Format and Structure, Design Source and Pipeline Architecture, UI/Variant/Test Coverage. Browse-friendly navigation; canonical numbering unchanged.
601
+ - Lite mode 2/3 scoring (Locked 25): each of three signals (spec lines < 100, figma frames <= 1, repo direct-match >= 8) scores 1 point; `liteModeAuto = (score >= 2)`. Previous AND-threshold (v8.12.0..v9.0.x) forced small features into Full mode when one signal was marginal. Explicit `--lite` / `--full` flags still win over scoring.
602
+ - Legacy v2 analysis soft-skip (Locked 30): Phase 2 and Phase 3 Pre-flight degrade `template_version: v2` analysis docs to warning mode when `prefs.global.legacyV2AnalysisAllowed == true` OR env `MULTI_AGENT_LEGACY_V2_ANALYSIS=allow`. Standards binding enforcement and Pass B footnote check drop to warning level; v3-specific section coverage checks skip. Escape hatch removed in v9.2.0.
603
+ - `pipeline/scripts/smoke-personal-data.sh`: 7 new patterns covering airline-specific brand literals, page ID ranges, and Keychain alias forms. Now 22 patterns total. Exclude list extended for `confluence-page-ids.example.json` and CHANGELOG historical references.
604
+ - `pipeline/skills/figma-ios/figma-to-component/scripts/confluence-page-ids.example.json`: placeholder schema reference file (5 generic entries) replacing the 91-entry corporate page ID inventory which moved to project-local storage.
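The Lite mode 2/3 scoring rule in this list can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the function and parameter names are hypothetical, and only the three thresholds, the `score >= 2` rule, and the flag precedence come from the entry above.

```python
from typing import Optional


def lite_mode_auto(spec_lines: int, figma_frames: int, repo_direct_match: int) -> bool:
    """Each signal that holds contributes 1 point; Lite activates at 2 of 3."""
    score = sum([
        spec_lines < 100,        # signal 1: short spec
        figma_frames <= 1,       # signal 2: at most one Figma frame
        repo_direct_match >= 8,  # signal 3: strong repo direct-match
    ])
    return score >= 2


def resolve_mode(spec_lines: int, figma_frames: int, repo_direct_match: int,
                 flag: Optional[str] = None) -> str:
    """Explicit --lite / --full flags win over the automatic score."""
    if flag == "--lite":
        return "lite"
    if flag == "--full":
        return "full"
    auto = lite_mode_auto(spec_lines, figma_frames, repo_direct_match)
    return "lite" if auto else "full"
```

For example, a feature with an 80-line spec, 3 Figma frames, and a direct-match of 9 meets two signals, so it resolves to Lite here, whereas the earlier AND threshold would have required all three.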
2504
605
 
2505
606
  ### Changed
2506
607
 
2507
- - **Phase 6.5 (WIKI & ISSUE UPDATE) removed from main pipeline** and moved into the figma-to-swiftui skill as SubPhase 4F (Wiki) + Phase 7 (Issue Body Update). Rationale: conditional phases are an anti-pattern - they force every reader to learn "this phase runs only when X". The main pipeline now has a fixed 8-phase contract (0-7) regardless of task type. Specialized work (component generation, future RN migration, etc.) slots into its parent main phase as SubPhases.
2508
- - **Task Type Detection** added as Phase 0 Step 9. Deterministic classification (component / bugfix / feature / refactor / chore) persisted to `agent-state.taskType` and consumed by every downstream phase - no more per-phase "is this a component?" guesses.
2509
- - **Unused commands removed**: `/fix-issue` (equivalent to `/multi-agent X --dev`) and `/review` (equivalent to `/multi-agent review-only`). `/security-review` reduced to a thin wrapper around Phase 4 Reviewer 1.
2510
- - **SubPhase Convention** documented in `refs/phases.md` — specialized skills taking over a main phase report progress as nested SubPhases (e.g. `SubPhase 3.4F: Wiki`) instead of inflating the top-level phase count.
2511
- - **`pipeline-output-formatting.md` merged** into `refs/rules.md` under "External System Payloads". Single source of truth for heredoc + rawfile + no-HTML-entities contract.
2512
-
2513
- ### Changed (prior)
608
+ - 9 HIGH severity PII redactions across pipeline + Copilot skills (airline brand names, corporate page IDs, Keychain alias literals, accountId examples). Pipeline source and Copilot mirror aligned.
609
+ - `pipeline/commands/multi-agent/refs/cross-cli-contract.md` Section 1: "31 commands" enumeration drift fix - added `delete`, `diff-explain`, `language` to the visible list (they existed as files but were missing from the doc enumeration).
610
+ - 5 dead helper functions removed (~70 LOC total): `top_suffix()` from `extract-conventions.sh`, `auth_attempt()` from `fetch-fortify.sh`, `now_ts()` and `download_one()` from `figma-screenshot.sh`, `jq_get()` from `multi-repo-pipeline.sh`. All were defined but never called (inline duplication at the call site).
2514
611
 
2515
- - **README highlights** — trimmed to 7 core bullets; long feature matrix moved to `docs/features.md`.
2516
- - **Portfolio description** (in this repo's metadata) refreshed to reflect v2.0 feature set.
612
+ ### Performance
2517
613
 
2518
- ### Fixed / Hardened (carried from 1.x)
614
+ Six measurable improvements on the hot path:
2519
615
 
2520
- - Reviewer preservation on Bitbucket PUT (from 1.8.0).
2521
- - Real newlines + no HTML entities in external bodies (from 1.6.1).
2522
- - `refs/` layout separation from actions (from 1.9.0).
616
+ | Fix | Before | After | Speedup |
617
+ |---|---|---|---|
618
+ | `smoke-personal-data.sh` pattern alternation (single grep vs 22 invocations) | 3.348s | 1.545s | 2.2x |
619
+ | `smoke-no-token-prompt.sh` multi-`-e` (single grep vs 7x7=49 invocations) | 0.485s | 0.014s | 35x |
620
+ | `phase-tracker.sh` render batched jq (U+001F separator preserves empty fields) | ~7 jq calls per phase + ~3 per sub | 1 batch + 1 per active sub/meta | ~120 fewer subprocesses per render |
621
+ | `issue-fetcher.sh` python3 batch (single inline vs 7 separate calls per fetch) | 0.806s / 3 iter | 0.136s / 3 iter | 5.9x |
622
+ | `md2confluence-v3.py` HTTP retry wrapper (3-attempt exponential backoff on 5xx / 429) + `ThreadPoolExecutor(max_workers=4)` parallel attachment | sequential N x ~1.5s | parallel ~max(individual) | up to 4x on multi-screenshot pages, plus transient-error resilience |
623
+ | `extract-conventions.sh` env override `EXTRACT_CONV_EXTRA_ROOTS` + auto-add `.gitmodules` paths, bucket timeout 30s -> 10s, `xargs basename` -> `awk -F/ '{print $NF}'` (8 callsites) | scan roots too narrow on monorepos with submodules; per-bucket 30s budget | submodule paths auto-detected, faster fail | resolves "confidence: none" on submodule-heavy repos; ~100-300ms saved per bucket |
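The `md2confluence-v3.py` row combines two techniques: a bounded retry with exponential backoff on 5xx / 429, and a four-worker thread pool. A minimal sketch of that combination is below. It assumes a hypothetical `HTTPStatusError` exception and a caller-supplied `upload_one` callable, and the 0.5s base delay is illustrative; none of these come from the actual script.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(status: int) -> bool:
    # Retry only on rate limiting (429) and server errors (5xx).
    return status == 429 or 500 <= status <= 599


def with_retry(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(); on a transient error wait base_delay * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except HTTPStatusError as err:
            is_last = attempt == attempts - 1
            if is_last or not is_transient(err.status):
                raise
            sleep(base_delay * (2 ** attempt))


def upload_all(items, upload_one, max_workers=4):
    """Process items in parallel, each wrapped in the retry helper."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(lambda item: with_retry(lambda: upload_one(item)), items))
```

Injecting `sleep` keeps the backoff schedule observable without real waiting, which is one way a smoke test could assert the retry behavior.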
2523
624
 
2524
- ### Breaking (internal only)
625
+ ### Verified
2525
626
 
2526
- - `agent-state.json.phases["6.5"]` no longer written. Existing state files with this key continue to resume correctly; the key is ignored.
2527
- - `~/.claude/commands/multi-agent/refs/phases/phase-6.5-wiki.md` deleted (content moved to figma-to-swiftui.md SubPhase 4F).
627
+ - `bash -n` clean on 9 modified shell scripts.
628
+ - `python3 -m py_compile` clean on `md2confluence-v3.py`.
629
+ - Smoke gate `smoke-personal-data.sh`: 22 patterns clean, 0 leaks.
630
+ - Smoke gate `smoke-no-mcp-in-dev-phases.sh`: behavior unchanged, mock fixtures pass.
631
+ - Pre/post smoke output byte-identical (behavior parity preserved despite faster path).
2528
632
 
2529
- ### Breaking (none at slash-command surface)
633
+ ### Migration notes
2530
634
 
2531
- Internal file paths consolidated under `refs/` (was done in 1.9.0).
635
+ - Existing v2 analyses still abort in default mode (v9.0.x behavior unchanged). Set `prefs.global.legacyV2AnalysisAllowed = true` or env `MULTI_AGENT_LEGACY_V2_ANALYSIS=allow` to opt into the soft-skip. Plan to regenerate v3 docs before v9.2.0.
636
+ - Lite mode now activates on 2 of 3 signals; existing analyses that previously rendered Full may now render Lite. To force Full, pass `--full` to `/multi-agent:analysis`.
637
+ - `cross-cli-contract.md` "31 commands" enumeration now matches reality. Smoke gate behavior unchanged - this was a documentation drift only.
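The opt-in for legacy v2 analyses described in these notes reduces to a small decision. The sketch below assumes prefs arrive as a nested dict and treats any non-v3, non-v2 template as an abort; the function names and the three return labels are hypothetical, while the pref key and the env variable come from the entry.

```python
import os


def legacy_v2_allowed(prefs: dict, env=os.environ) -> bool:
    """True when either the preference or the env escape hatch is set."""
    pref_on = prefs.get("global", {}).get("legacyV2AnalysisAllowed") is True
    env_on = env.get("MULTI_AGENT_LEGACY_V2_ANALYSIS") == "allow"
    return pref_on or env_on


def preflight_level(template_version: str, prefs: dict, env=os.environ) -> str:
    """Return 'enforce' for v3, 'warn' for opted-in v2, otherwise 'abort'."""
    if template_version == "v3":
        return "enforce"
    if template_version == "v2" and legacy_v2_allowed(prefs, env):
        return "warn"
    return "abort"
```

With neither switch set, a v2 document still yields `"abort"`, matching the unchanged v9.0.x default.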
2532
638
 
2533
639
  ---
2534
640
 
2535
- ## [1.9.0] - 2026-04-13
2536
-
2537
- ### Changed (breaking — internal only)
641
+ ## [9.0.0] - 2026-05-16
2538
642
 
2539
- - **File layout refactor**: reference docs moved from `commands/multi-agent/*.md` to `commands/multi-agent/refs/**`.
2540
- - Autocomplete now surfaces only actions (`/multi-agent:add-detail`, `/multi-agent:setup`, `/multi-agent:help`) at the top level; refs show as `/multi-agent:refs:...`.
2541
- - Path references in `multi-agent.md` Modular Loading table and `refs/phases.md` phase index updated.
2542
- - No change to invocable slash commands for end users.
643
+ **Major** - Pipeline-wide design source contract: analysis document is the sole design source for Phase 2 through Phase 7. Figma MCP / REST calls are forbidden outside `/multi-agent:analysis` Phase 1. Phase 2 Planning and Phase 3 Dev gain BLOCKING pre-flight checks that abort the run when `analysis/<feature>-<platform>.md` is missing or below template `v3`. Locked decision 30 codifies the rule; smoke gate `smoke-no-mcp-in-dev-phases.sh` enforces it by reading `state.telemetry.mcpCalls[]` and failing any entry with `phase >= 2`.
2543
644
 
2544
- ## [1.8.0] - 2026-04-13
645
+ This is a major bump because the pre-flight is hard - existing dev mode invocations that previously ran without an analysis document will now abort. The mitigation path is to run `/multi-agent:analysis "<feature>"` first.
2545
646
 
2546
- ### Breaking
647
+ ### Why
2547
648
 
2548
- - **Renamed `pr-detail` -> `add-detail`.** Reason: the command updates both the Jira comment and the PR description; "pr" alone was misleading.
649
+ Re-fetching Figma during Phase 2+ duplicates analysis work, burns MCP tokens, splits the design source of truth, and breaks the analysis -> dev contract. The user-facing pain in v8.x was Phase 3 silently calling MCP again, producing variants that drifted from the analysis Section 13.1 concept table. v9.0.0 binds Phase 2+ to the analysis document plus repo Code Connect mappings only.
2549
650
 
2550
651
  ### Added
2551
652
 
2552
- - `add-detail` accepts Jira URL / ID directly (resolves linked PR via dev-status API).
2553
- - Manual mode: `--message` / `--message-file` enrich Jira + PR even when the fix was made outside the pipeline.
2554
- - **MANDATORY Bitbucket PUT contract**: every PR update must re-send `reviewers`, `fromRef`, `toRef`, `draft`. Previous behavior (sending only title + description) wiped default reviewers on every run. Now codified as a spec rule, not optional.
653
+ - `pipeline/commands/multi-agent/refs/phases/phase-3-dev.md`: new "Phase 3 Pre-flight (BLOCKING, v9.0.0)" section (7 steps): analysis document presence, YAML front-matter parse, evidence digest cache validation, Code Connect mapping lookup, standards binding citation, conventions handoff (Pass B concept-to-realization), MCP forbidden gate.
654
+ - `pipeline/commands/multi-agent/refs/phases/phase-2-planning.md`: matching "Phase 2 Pre-flight (BLOCKING, v9.0.0)" section (5 steps): analysis presence, front-matter parse, section coverage check (required sections per Locked 2), task seed from analysis Section 14, MCP forbidden gate.
655
+ - `pipeline/rules/figma-pipeline.md`: new "MUST: No MCP outside analysis phase (BLOCKING, pipeline-wide)" section with phase access matrix, halt condition, rationale, and verification gate reference.
656
+ - `pipeline/commands/multi-agent/refs/rules.md`: new "Figma Access by Phase (pipeline-wide BLOCKING, v9.0.0)" matrix under existing "Figma Access Tier" section.
657
+ - `pipeline/scripts/smoke-no-mcp-in-dev-phases.sh`: smoke gate reading `state.telemetry.mcpCalls[]`. Fails when any entry has `phase >= 2`.
658
+ - `pipeline/commands/multi-agent/analysis.md`: Locked decisions 29 (variant usage explicit) and 30 (analysis self-contained, pipeline-wide MCP forbidden). Total Locked count now 30.
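The rule enforced by `smoke-no-mcp-in-dev-phases.sh` (fail on any `state.telemetry.mcpCalls[]` entry with `phase >= 2`) can be illustrated in a few lines. The real gate is a shell script; this Python sketch only mirrors the stated rule, assumes `phase` is numeric, and uses hypothetical function names and a hypothetical `tool` field.

```python
import json


def mcp_violations(state: dict) -> list:
    """Return the recorded MCP calls made in Phase 2 or later."""
    calls = state.get("telemetry", {}).get("mcpCalls", [])
    return [call for call in calls if call.get("phase", 0) >= 2]


def gate_exit_code(state_json: str) -> int:
    """0 when no MCP call happened outside analysis, 1 otherwise."""
    violations = mcp_violations(json.loads(state_json))
    for call in violations:
        print(f"FAIL: MCP call recorded in phase {call['phase']}")
    return 1 if violations else 0
```

A state file with no telemetry at all passes in this sketch; whether the real gate treats missing telemetry as a pass is not stated in the entry.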
2555
659
 
2556
660
  ### Changed
2557
661
 
2558
- - Jira comment is now posted **by default** (previously required `--jira` flag). Opt out with `--no-jira`.
2559
- - Default output shortened: single-screen summary, no code snippets, no `<!-- pr-detail:start -->` markers.
2560
- - Each run is a full description replace; `--append` preserves the existing body.
2561
-
2562
- ## [1.7.0] — 2026-04-13
2563
-
2564
- ### Added
2565
-
2566
- - **`pr-detail` command** (renamed to `add-detail` in 1.8.0) — read an open PR's diff, generate technical analysis + test scenarios, update PR description.
2567
- - **DRAFT vs READY prompt in Phase 6** — ask before creating a PR; persist choice in `prefs.projects[].defaultPrMode`.
2568
- - Bitbucket Data Center 8.x+: `"draft": true` flag; older Bitbucket: `[DRAFT]` title prefix fallback.
2569
- - GitHub: `gh pr create --draft` + `gh pr ready` for promotion.
2570
-
2571
- ## [1.6.1] — 2026-04-13
2572
-
2573
- ### Added
2574
-
2575
- - **Default reviewer auto-injection on PR creation**:
2576
- - Bitbucket: fetches `/rest/default-reviewers/1.0/.../reviewers`, filters out the PR author, includes them in the create payload (REST API does not auto-assign).
2577
- - GitHub: honors `CODEOWNERS` auto-request; falls back to `required_reviewers` → `prefs.projects[].githubDefaultReviewers`.
2578
-
2579
- ### Fixed
2580
-
2581
- - **PR description newline preservation** — Bitbucket and Jira comments now always built via `heredoc + jq --rawfile + curl --data-binary @file`. Previous inline `"\n"` embedding rendered as literal `\n` in the UI.
2582
- - **HTML entity ban** (`&amp;`, `&lt;`, `&quot;`) in all user-visible fields — never decoded by Bitbucket/Jira.
2583
-
2584
- ## [1.6.0] — 2026-04-13
2585
-
2586
- ### Added
2587
-
2588
- - Install telemetry — one-time install ping to notify the owner of new installs.
2589
-
2590
- ## [1.5.0] — 2026-04-13
2591
-
2592
- ### Added
662
+ - `pipeline/commands/multi-agent/refs/phases/phase-3-dev.md`: removed the 80-line legacy "MUST: Figma access 3-tier fallback chain" block. The 3-tier chain is exclusive to Phase 1 of `/multi-agent:analysis`; Phase 3 Dev now reads design context from the analysis document only. About 12 MCP / Figma fetch call sites cleaned up.
663
+ - `pipeline/commands/multi-agent/dev-local.md`: drift fix. Phase 4 (Review) references removed; header normalized to "5-phase fast pipeline (no worktree)" covering Phase 0, 3, 5, 6, 7 only.
664
+ - `pipeline/commands/multi-agent/refs/phases/modes.md`: Table 1 drift fix. Phase 5 (Test) cells now distinguish interactive prompt (normal, local, --dev, --dev-local) from skip (autopilot, local-autopilot, dev-autopilot, dev-local-autopilot).
2593
665
 
2594
- - **Mandatory status update + post-mutation verify** in Phase 3. Re-reads the issue-tracker Status field after mutation and retries once on silent `VALIDATION` failures (e.g. stale Projects V2 option IDs after a board rebuild).
2595
-
2596
- ## [1.4.0] — 2026-04-12
2597
-
2598
- ### Added
2599
-
2600
- - **Cross-session memory capture** — Phase 7 persists user corrections, project decisions, and external references; Phase 1 injects them on future runs.
2601
-
2602
- ## [1.3.0] — 2026-04-11
2603
-
2604
- ### Added
2605
-
2606
- - Full experience: rules, commands, figma-to-swiftui pipeline, `CLAUDE.md` template.
2607
-
2608
- ## [1.2.0] — 2026-04-11
2609
-
2610
- ### Added
666
+ ### Migration notes
2611
667
 
2612
- - Pre-commit secret detection hook.
2613
- - Context management (auto-compact at 65%).
668
+ - Existing v2 / v8.x analyses are not v3 and will fail the pre-flight `template_version` check. Re-run `/multi-agent:analysis "<feature>"` to regenerate.
669
+ - If your dev mode invocation does not have an analysis document yet, run `/multi-agent:analysis` first. The pipeline now treats `/multi-agent:analysis` as a prerequisite for Phase 2+ work.
670
+ - Smoke gate runs after every pipeline invocation. CI invocations should add `bash pipeline/scripts/smoke-no-mcp-in-dev-phases.sh` to their post-run checks.
2614
671
 
2615
- ## [1.1.0] — 2026-04-11
672
+ ### Verified
2616
673
 
2617
- ### Changed
674
+ - Personal-data smoke: 15 patterns clean, 0 leaks.
675
+ - Humanizer punctuation: zero hits in new content.
676
+ - `bash -n pipeline/scripts/smoke-no-mcp-in-dev-phases.sh`: zero output.
677
+ - Mock state fixtures: gate passes on `phase: 1` MCP entry, fails on `phase: 3` MCP entry.
2618
678
 
2619
- - Decoupled from `dev-toolkit-mcp`; all audits now run standalone via Bash (`xcrun simctl`, `adb`, `codesign`, etc.).
679
+ ---
2620
680
 
2621
- ## [1.0.0] — 2026-04-11
681
+ ---
2622
682
 
2623
- Initial release - 8-phase AI development pipeline.
683
+ Entries for v2.0.0 - v8.13.0 live in [CHANGELOG-archive.md](./CHANGELOG-archive.md) (repo only, not shipped in the npm tarball).