npm - @kudusov.takhir/ba-toolkit - Versions diffs - 2.0.0 → 3.0.0 - Mend

@kudusov.takhir/ba-toolkit 2.0.0 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18) hide show

package/CHANGELOG.md +60 -1
package/README.md +13 -17
package/bin/ba-toolkit.js +447 -133
package/package.json +2 -2
package/skills/ac/SKILL.md +3 -1
package/skills/apicontract/SKILL.md +3 -1
package/skills/brief/SKILL.md +3 -1
package/skills/datadict/SKILL.md +3 -1
package/skills/nfr/SKILL.md +3 -1
package/skills/principles/SKILL.md +3 -1
package/skills/references/interview-protocol.md +53 -0
package/skills/references/templates/agents-template.md +3 -1
package/skills/research/SKILL.md +3 -1
package/skills/scenarios/SKILL.md +3 -1
package/skills/srs/SKILL.md +3 -1
package/skills/stories/SKILL.md +3 -1
package/skills/usecases/SKILL.md +3 -1
package/skills/wireframes/SKILL.md +3 -1

package/CHANGELOG.md CHANGED Viewed

@@ -11,6 +11,64 @@ Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ---
+## [3.0.0] — 2026-04-09
+### ⚠️ BREAKING — Cursor and Windsurf install paths moved to native Agent Skills
+Cursor and Windsurf both have **two separate features**: Rules (`.cursor/rules/*.mdc`, `.windsurf/rules/*.mdc`) and Agent Skills (`.cursor/skills/<skill>/SKILL.md`, `.windsurf/skills/<skill>/SKILL.md`). BA Toolkit is a pipeline of skills, not rules, but every previous version installed it as `.mdc` rules under `.cursor/rules/` and `.windsurf/rules/`. Both editors loaded those files as **rules**, never as skills, so `/brief`, `/srs`, … slash commands were never registered with the agent. Users reported that the skills did not show up. v3.0 fixes this by switching to the native Agent Skills paths and the native folder-per-skill `SKILL.md` layout (the same one Claude Code, Codex CLI, and Gemini CLI use). Confirmed against the official Cursor and Windsurf documentation via ctx7 MCP:
+| Agent | v2.0 path (broken) | v3.0 path (correct) | Format |
+|---|---|---|---|
+| Cursor | `.cursor/rules/<skill>.mdc` (flat) | `.cursor/skills/<skill>/SKILL.md` (folder-per-skill) | Native |
+| Windsurf | `.windsurf/rules/<skill>.mdc` (flat) | `.windsurf/skills/<skill>/SKILL.md` (folder-per-skill) | Native |
+**Migration for v2.0 Cursor/Windsurf users:**
+```bash
+# 1. Upgrade the package
+npm install -g @kudusov.takhir/ba-toolkit@latest
+# 2. Reinstall — the old install was at the wrong path; upgrade can't find
+#    its manifest there, so just run install fresh against the correct path.
+ba-toolkit install --for cursor      # writes to .cursor/skills/
+ba-toolkit install --for windsurf    # writes to .windsurf/skills/
+# 3. Manually clean up the orphaned old install (it never actually worked
+#    as skills anyway — Cursor/Windsurf were loading it as rules):
+rm -rf .cursor/rules/*.mdc           # if those .mdc files came from BA Toolkit
+rm -rf .windsurf/rules/*.mdc
+```
+After this you'll see the BA Toolkit skills register as actual Agent Skills in Cursor and Windsurf for the first time — `/brief`, `/srs`, `/ac`, `/nfr`, … become real slash commands. Reload the editor window after install. Claude Code, Codex CLI, and Gemini CLI users are unaffected — their paths and behavior are unchanged.
+### Added
+- **Integration test suite for every CLI subcommand** (`test/cli.integration.test.js`, 33 tests). Spawns the real CLI as a child process against temporary directories and asserts exit codes, stdout/stderr content, and filesystem state. Covers `--version`/`--help`/no-args, typo detection for unknown flags, `init` with flag combinations and validation failures, `install`/`upgrade`/`uninstall` dry-runs for every supported agent, end-to-end install → manifest → status → upgrade → uninstall round-trips for both Claude Code and Cursor (native skill format), and a regression guard proving that `uninstall` leaves the user's own unrelated skills in the shared destination untouched (manifest-driven removal guarantee).
+- **Test gate in `.github/workflows/release.yml`**. A new `Run tests` step runs `npm test` between the smoke test and the npm publish step — if any unit or integration test fails, the `publish-npm` job exits before the classic-auth strip + publish steps, and no broken release reaches npm. The GitHub Release created by the preceding job still happens; npm and GitHub are intentionally independent.
+- **Test job in `.github/workflows/validate.yml`**. A new `run-tests` job mirrors the release gate at PR time, so regressions are caught on the PR instead of at tag push. Triggered for changes under `bin/**`, `test/**`, `skills/**`, `output/**`, or `package.json`.
+- **ASCII banner at the top of `ba-toolkit init`**. Decorative `ba-toolkit` wordmark printed before the "New Project Setup" heading. Suppressed on non-TTY stdout (CI logs, piped output, captured test stdout) via an `isTTY` guard in `printBanner()`, so the banner never pollutes automation. Covered by a regression test that asserts the banner glyphs don't appear in non-TTY runs.
+- **Arrow-key menu navigation in `ba-toolkit init`** for the domain and agent selection prompts. Real terminals now get an interactive menu — `↑/↓` (also `j/k`) to move, `1-9` to jump by index, `Enter` to confirm, `Esc` or `Ctrl+C` to cancel cleanly with exit code 130. The previous numbered prompt is the automatic fallback when stdin/stdout is not a TTY (CI, piped input, `TERM=dumb`, IDE shells), so all CI/integration tests and the `printf | ba-toolkit init` use case keep working unchanged. Cross-platform via Node's `readline.emitKeypressEvents` + `setRawMode` — works the same on bash, zsh, fish, Windows Terminal (PowerShell, cmd, WSL), Git Bash, and VSCode integrated terminal. Three-layer design for testability: `menuStep(state, key)` — pure state machine, 11 unit tests; `renderMenu(state, opts)` — pure renderer, 7 unit tests; `runMenuTty` — the I/O glue, manually smoke-tested.
+- **Interview Protocol for every interview-phase skill** (`skills/references/interview-protocol.md`). Codifies the rule that AI skills must ask ONE question at a time, offer 3–5 domain-appropriate options per question sourced from `references/domains/{domain}.md`, always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question. Replaces the previous "dump a numbered questionnaire of 5+ questions at once" style that made users abandon long interviews. Every shipped SKILL.md with an `Interview` heading — 12 files: `brief`, `srs`, `stories`, `usecases`, `ac`, `nfr`, `datadict`, `apicontract`, `wireframes`, `scenarios`, `research`, `principles` — now opens its Interview section with a blockquote pointing to this protocol. A Node-level regression test in `test/cli.test.js` walks every shipped SKILL.md and fails if any interview skill ever ships without the protocol link; a mirror validation step in `.github/workflows/validate.yml` catches the same at PR time.
+### Changed
+- **`ba-toolkit init` now merges `AGENTS.md` instead of overwriting it.** The `agents-template.md` wraps the Active Project block (name/slug/domain/date) in `<!-- ba-toolkit:begin managed -->` / `<!-- ba-toolkit:end managed -->` anchors. On re-init, only the managed block is refreshed — Pipeline Status edits, Key Constraints, Open Questions, user notes, and anything else outside the anchors is preserved byte-for-byte. If an existing `AGENTS.md` has no anchors (legacy file or fully user-authored), it's left untouched and a `preserved` note is printed instead of overwriting. The old interactive "AGENTS.md already exists. Overwrite? (y/N)" prompt is removed — the merge is always safe. New pure helper `mergeAgentsMd(existing, ctx)` exported from `bin/ba-toolkit.js` with unit tests covering all three branches (`created`, `merged`, `preserved`) plus malformed-anchor edge cases; integration tests verify the double-init scenario and the unmanaged-file preservation.
+- `npm test` now runs both the pre-existing unit suite (`test/cli.test.js`) and the new integration suite — 154 tests total, ~2 seconds, still zero dependencies (only Node built-ins: `node:test`, `node:assert`, `node:child_process`).
+- `.gitignore` now excludes `.claude/settings.local.json` and `.claude/skills/` — local Claude Code state, never part of the package. `settings.local.json` was previously tracked and required a stash dance around `npm version`; that workaround is no longer needed.
+### Fixed
+- **Cursor install now targets `.cursor/skills/` with native SKILL.md format, not `.cursor/rules/` with `.mdc` conversion.** See the BREAKING section above for the full story and migration steps.
+- **Windsurf install now targets `.windsurf/skills/` with native SKILL.md format, not `.windsurf/rules/` with `.mdc` conversion.** Mirror of the Cursor fix. Confirmed against the [Windsurf Agent Skills documentation](https://docs.windsurf.com/windsurf/cascade/skills) via ctx7 MCP: Windsurf loads skills from `.windsurf/skills/<skill-name>/SKILL.md` as folder-per-skill with the same YAML frontmatter (`name`, `description`) as Claude Code and Cursor.
+- **`ba-toolkit init` no longer crashes on a single typo during interactive input.** The domain menu, agent menu, and manual slug entry now re-prompt on invalid input via a new `promptUntilValid(question, resolver, { maxAttempts, invalidMessage })` helper. After three consecutive invalid answers the command aborts with a clear "Too many invalid attempts — aborting." message so piped input can't infinite-loop. The flag-path (`--domain=banana`, `--for=vim`) still hard-fails immediately — that's the correct behavior for CI and scripting, and its tests are untouched.
+- **`prompt()` race condition that silently dropped piped input lines.** The previous implementation used `rl.question()` on a shared `readline.Interface`; when stdin was piped with multiple answers at once (e.g. `printf "banana\nsaas\n" | ba-toolkit init`, or any test feeding a full input buffer via `spawnSync`), readline emitted `'line'` events before the next `question()` handler had been attached, and those answers were silently lost. The second prompt then saw EOF and aborted with `INPUT_CLOSED` despite the answer being in the buffer. The new `prompt()` owns the `'line'` event directly, buffers arriving lines into an internal `lineQueue`, and parks `waiters` when the queue is empty — no more lost input. Uncovered while wiring up the `promptUntilValid` retry path.
+### Removed
+- **`.mdc` rule format conversion path is gone** — every shipped agent now uses native Agent Skills, so the conversion was dead code. Deleted: `skillToMdcContent()` (and its 2 unit tests), the `mdc` branch of `copySkills()`, the `format` field on every entry in `AGENTS`, the `format` parameter passed to `copySkills` and `writeManifest`, the runtime "format: .mdc (converted from SKILL.md)" log line. The `format` field is also gone from new manifests, but `readManifest` still parses legacy manifests that have it (covered by a forward-compat unit test). Net removal: ~80 lines of code + 2 unit tests; CLI surface unchanged.
+---
 ## [2.0.0] — 2026-04-09
 ### ⚠️ BREAKING — install layout dropped the `ba-toolkit/` wrapper
@@ -350,7 +408,8 @@ CI scripts that relied on the old behaviour (`init` creates files only, `install
 ---
-[Unreleased]: https://github.com/TakhirKudusov/ba-toolkit/compare/v2.0.0...HEAD
+[Unreleased]: https://github.com/TakhirKudusov/ba-toolkit/compare/v3.0.0...HEAD
+[3.0.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v2.0.0...v3.0.0
 [2.0.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.5.0...v2.0.0
 [1.5.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.4.0...v1.5.0
 [1.4.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.3.2...v1.4.0

package/README.md CHANGED Viewed

@@ -13,8 +13,8 @@ Structured BA pipeline for AI coding agents — brief to handoff, 21 skills, 9 d
 <img src="https://img.shields.io/badge/Claude_Code-✓-6C5CE7" alt="Claude Code">
 <img src="https://img.shields.io/badge/Codex_CLI-✓-00D26A" alt="Codex CLI">
 <img src="https://img.shields.io/badge/Gemini_CLI-✓-4285F4" alt="Gemini CLI">
-<img src="https://img.shields.io/badge/Cursor-convert-F5A623" alt="Cursor">
-<img src="https://img.shields.io/badge/Windsurf-convert-1ABCFE" alt="Windsurf">
+<img src="https://img.shields.io/badge/Cursor-✓-F5A623" alt="Cursor">
+<img src="https://img.shields.io/badge/Windsurf-✓-1ABCFE" alt="Windsurf">
 </div>
@@ -46,7 +46,7 @@ npm install -g @kudusov.takhir/ba-toolkit
 ba-toolkit init
 ```
-Supported agents: `claude-code`, `codex`, `gemini`, `cursor`, `windsurf`. Cursor and Windsurf installs auto-convert `SKILL.md` into the `.mdc` rule format. Pass `--dry-run` to preview the install step without writing files, or `--no-install` to create only the project structure and install skills later with `ba-toolkit install --for <agent>`.
+Supported agents: `claude-code`, `codex`, `gemini`, `cursor`, `windsurf`. All five use the native Agent Skills format (folder-per-skill with `SKILL.md`) — Claude Code at `.claude/skills/`, Codex at `~/.codex/skills/`, Gemini at `.gemini/skills/`, Cursor at `.cursor/skills/`, Windsurf at `.windsurf/skills/`. Pass `--dry-run` to preview the install step without writing files, or `--no-install` to create only the project structure and install skills later with `ba-toolkit install --for <agent>`.
 `ba-toolkit --help` shows the full CLI reference. Zero runtime dependencies — only Node.js ≥ 18.
@@ -98,21 +98,17 @@ cp -R ba-toolkit/skills/. /path/to/project/.gemini/skills/
 Reload the CLI after copying.
-### Cursor, Windsurf, Aider
+### Cursor
-These use their own rules format instead of `SKILL.md`. Convert first, then copy:
+Cursor has two separate features — Rules (`.cursor/rules/*.mdc`) and [Agent Skills](https://cursor.com/docs/skills) (`.cursor/skills/<skill>/SKILL.md`). BA Toolkit is a set of skills, not rules, so `ba-toolkit install --for cursor` drops the 21 skills directly into `.cursor/skills/` using the native folder-per-skill `SKILL.md` format — no conversion needed. Reload the Cursor window to pick them up.
-```bash
-# Option 1: community converter
-# https://github.com/alirezarezvani/claude-skills/blob/main/scripts/convert.sh
-./convert.sh --tool cursor --target /path/to/project
-./convert.sh --tool windsurf --target /path/to/project
+### Windsurf
-# Option 2: ask your AI agent
-# "Convert all SKILL.md files in skills/ to Cursor .mdc rule format"
-```
+Windsurf's [Agent Skills](https://docs.windsurf.com/windsurf/cascade/skills) feature loads skills from `.windsurf/skills/<skill>/SKILL.md`, the same folder-per-skill layout as Claude Code and Cursor. `ba-toolkit install --for windsurf` writes the 21 skills there natively. Reload the Windsurf window to pick them up.
+### Aider
-Cursor rules live in [`.cursor/rules/`](https://cursor.com/docs/rules) as `.mdc` files with YAML frontmatter (`description`, optional `globs`, `alwaysApply`). A plain rename of `SKILL.md` to `.mdc` is not enough — the metadata block is required. [Cursor CLI](https://cursor.com/docs/cli/using) reads the same `.cursor/rules` setup and may also pick up `AGENTS.md` / `CLAUDE.md` at repo root.
+Aider has no native skills feature. Convert manually with the community script at <https://github.com/alirezarezvani/claude-skills/blob/main/scripts/convert.sh> or ask your AI agent to convert `SKILL.md` files to the target format.
 ### Starting a new project (shell scripts)
@@ -205,11 +201,11 @@ BA Toolkit uses the open Agent Skills specification (`SKILL.md` format) publishe
 | **Claude Code** | Native | `cp -R skills/. .claude/skills/` |
 | **OpenAI Codex CLI** | Native | `cp -R skills/. ~/.codex/skills/` |
 | **Gemini CLI** | Native | Copy `skills/.` contents to `~/.gemini/skills/` (user) or `.gemini/skills/` (workspace) |
-| **Cursor** | Convert | `SKILL.md` → `.mdc` rules in `.cursor/rules/` |
-| **Windsurf** | Convert | `SKILL.md` → rules in `.windsurf/rules/` |
+| **Cursor** | Native | Copy `skills/.` contents to `.cursor/skills/` |
+| **Windsurf** | Native | Copy `skills/.` contents to `.windsurf/skills/` |
 | **Aider** | Convert | `SKILL.md` → conventions file |
-Native platforms read `SKILL.md` as-is. Convert platforms need a one-time format conversion — the content is the same, only the file format differs. `ba-toolkit install --for cursor|windsurf` does this automatically.
+All five officially supported platforms read `SKILL.md` as-is — no conversion. `ba-toolkit install --for <agent>` lands skills directly in the agent's native skills root.
 Skills do not hardcode platform paths — they reference `skills/references/environment.md`, which contains the output directory logic for each platform. Edit that file to customize; all skills pick up the change automatically.

package/bin/ba-toolkit.js CHANGED Viewed

@@ -18,57 +18,55 @@ const PACKAGE_ROOT = path.resolve(__dirname, '..');
 const SKILLS_DIR = path.join(PACKAGE_ROOT, 'skills');
 const PKG = JSON.parse(fs.readFileSync(path.join(PACKAGE_ROOT, 'package.json'), 'utf8'));
-// In v2.0 the install paths dropped the previous `ba-toolkit/` wrapper
-// directory. Claude Code, Codex CLI, and Gemini CLI all expect skills
-// to be discoverable as direct subfolders of their skills root —
-// `.claude/skills/<skill-name>/SKILL.md`, not nested one level deeper.
-// The wrapper made all 21 skills invisible to every agent.
+// All five supported agents — Claude Code, Codex CLI, Gemini CLI,
+// Cursor, and Windsurf — load Agent Skills as direct subfolders of
+// their skills root: `<skills-root>/<skill-name>/SKILL.md`. The toolkit
+// installs the 21 skills natively in this layout for every agent. No
+// .mdc conversion. Confirmed against the Agent Skills documentation
+// for each platform via ctx7 MCP / official docs.
 //
-// Cursor and Windsurf load `.mdc` rule files directly from their rules
-// root, so v2.0 also flattens that layout: the per-skill subfolders
-// produced by the previous version are gone, and rules sit at
-// `.cursor/rules/<skill-name>.mdc`.
+// Earlier versions tried to install Cursor and Windsurf via `.mdc`
+// rules under `.cursor/rules/` and `.windsurf/rules/` — but Rules and
+// Agent Skills are two separate features in both editors, and the
+// toolkit is a pipeline of skills, not rules. The wrong-feature install
+// silently failed: skills loaded as rules never surfaced as `/brief`,
+// `/srs`, … slash commands. v2.x corrects this for Cursor, and the
+// Windsurf cleanup in this changelog entry finishes the job.
 //
-// To stay safe sharing the skills root with the user's other skills /
-// rules, every install also drops a `.ba-toolkit-manifest.json` next to
-// the installed items. uninstall and upgrade read this manifest to
-// remove only what the toolkit owns; without it they refuse to touch
-// anything.
+// To stay safe sharing the skills root with the user's other skills,
+// every install also drops a `.ba-toolkit-manifest.json` next to the
+// installed items. uninstall and upgrade read this manifest to remove
+// only what the toolkit owns; without it they refuse to touch anything.
 const AGENTS = {
   'claude-code': {
     name: 'Claude Code',
     projectPath: '.claude/skills',
     globalPath: path.join(os.homedir(), '.claude', 'skills'),
-    format: 'skill',
     restartHint: 'Restart Claude Code to load the new skills.',
   },
   codex: {
     name: 'OpenAI Codex CLI',
     projectPath: null, // Codex uses only global
     globalPath: path.join(process.env.CODEX_HOME || path.join(os.homedir(), '.codex'), 'skills'),
-    format: 'skill',
     restartHint: 'Restart the Codex CLI to load the new skills.',
   },
   gemini: {
     name: 'Google Gemini CLI',
     projectPath: '.gemini/skills',
     globalPath: path.join(os.homedir(), '.gemini', 'skills'),
-    format: 'skill',
     restartHint: 'Reload Gemini CLI to pick up the new skills.',
   },
   cursor: {
     name: 'Cursor',
-    projectPath: '.cursor/rules',
-    globalPath: null, // Cursor rules are project-scoped
-    format: 'mdc',
-    restartHint: 'Reload the Cursor window to apply new rules.',
+    projectPath: '.cursor/skills',
+    globalPath: null, // Cursor skills are project-scoped for now
+    restartHint: 'Reload the Cursor window to apply new skills.',
   },
   windsurf: {
     name: 'Windsurf',
-    projectPath: '.windsurf/rules',
-    globalPath: null,
-    format: 'mdc',
-    restartHint: 'Reload the Windsurf window to apply new rules.',
+    projectPath: '.windsurf/skills',
+    globalPath: null, // Windsurf skills are project-scoped for now
+    restartHint: 'Reload the Windsurf window to apply new skills.',
   },
 };
@@ -85,6 +83,21 @@ const DOMAINS = [
   { id: 'custom',       name: 'Custom',       desc: 'Any other domain — general interview questions' },
 ];
+// ASCII banner shown at the top of `ba-toolkit init`. Suppressed on
+// non-TTY stdout so it doesn't end up in CI logs or piped output.
+// Stored as an array of literal lines (not a template literal) so the
+// `$` characters stay out of any interpolation path.
+const BANNER = [
+  ' /$$                           /$$                         /$$ /$$       /$$   /$$    ',
+  '| $$                          | $$                        | $$| $$      |__/  | $$    ',
+  '| $$$$$$$   /$$$$$$          /$$$$$$    /$$$$$$   /$$$$$$ | $$| $$   /$$ /$$ /$$$$$$  ',
+  '| $$__  $$ |____  $$ /$$$$$$|_  $$_/   /$$__  $$ /$$__  $$| $$| $$  /$$/| $$|_  $$_/  ',
+  '| $$  \\ $$  /$$$$$$$|______/  | $$    | $$  \\ $$| $$  \\ $$| $$| $$$$$$/ | $$  | $$    ',
+  '| $$  | $$ /$$__  $$          | $$ /$$| $$  | $$| $$  | $$| $$| $$_  $$ | $$  | $$ /$$',
+  '| $$$$$$$/|  $$$$$$$          |  $$$$/|  $$$$$$/|  $$$$$$/| $$| $$ \\  $$| $$  |  $$$$/',
+  '|_______/  \\_______/           \\___/   \\______/  \\______/ |__/|__/  \\__/|__/   \\___/  ',
+];
 // --- Terminal helpers --------------------------------------------------
 const NO_COLOR = !!process.env.NO_COLOR || !process.stdout.isTTY;
@@ -99,6 +112,17 @@ const bold = colour(1);
 function log(...args) { console.log(...args); }
 function logError(...args) { console.error(red('error:'), ...args); }
+// Print the BANNER to stdout if — and only if — stdout is a real TTY.
+// Piped / redirected runs (CI, test spawn, `ba-toolkit init | tee ...`)
+// get a clean log without the 8-line block. The banner is decorative,
+// not load-bearing, so suppressing it in non-interactive contexts is
+// the right default.
+function printBanner() {
+  if (!process.stdout.isTTY) return;
+  for (const line of BANNER) log(cyan(line));
+  log('');
+}
 // --- Arg parsing -------------------------------------------------------
 function parseArgs(argv) {
@@ -140,30 +164,66 @@ function parseArgs(argv) {
 // --- Prompt helper -----------------------------------------------------
 // Shared across all prompts in a single CLI invocation. Creating a new
-// readline.Interface for every question (the previous approach) made Ctrl+C
+// readline.Interface for every question (the earlier approach) made Ctrl+C
 // handling unreliable, leaked listeners on stdin, and broke when stdin was
-// piped (EOF on the second create). One interface per process, closed by
-// closeReadline() once main() finishes (or by the SIGINT handler).
+// piped. One interface per process, closed by closeReadline() once main()
+// finishes (or by the SIGINT handler).
+//
+// prompt() does NOT use `rl.question(...)` — that method races with
+// readline's internal line buffering when stdin is piped. If input arrives
+// faster than prompts are issued (the common piped case: the user pipes a
+// here-doc with multiple answers, or a test feeds the entire stdin buffer
+// upfront), readline emits 'line' events before the question listener is
+// attached and those lines are silently dropped. The second prompt then
+// sees EOF and errors with INPUT_CLOSED despite the answer actually being
+// in the buffer.
+//
+// Instead we own the 'line' event ourselves and keep a line queue: every
+// line that arrives is pushed onto `lineQueue` if no one is waiting, or
+// delivered directly to the oldest waiter. A prompt() call takes the head
+// of the queue if non-empty, otherwise parks a waiter. The 'close' event
+// drains all waiting waiters with INPUT_CLOSED.
 let sharedRl = null;
+const lineQueue = [];
+const waiters = [];
+let inputClosed = false;
+function ensureReadline() {
+  if (sharedRl) return;
+  sharedRl = readline.createInterface({ input: process.stdin, output: process.stdout });
+  sharedRl.on('line', (line) => {
+    if (waiters.length > 0) {
+      waiters.shift().resolve(line);
+    } else {
+      lineQueue.push(line);
+    }
+  });
+  sharedRl.on('close', () => {
+    inputClosed = true;
+    while (waiters.length > 0) {
+      const err = new Error('input stream closed before answer');
+      err.code = 'INPUT_CLOSED';
+      waiters.shift().reject(err);
+    }
+  });
+}
 function prompt(question) {
-  if (!sharedRl) {
-    sharedRl = readline.createInterface({ input: process.stdin, output: process.stdout });
+  ensureReadline();
+  // Render the question ourselves — we're not using rl.question().
+  process.stdout.write(question);
+  if (lineQueue.length > 0) {
+    return Promise.resolve(String(lineQueue.shift()).trim());
+  }
+  if (inputClosed) {
+    const err = new Error('input stream closed before answer');
+    err.code = 'INPUT_CLOSED';
+    return Promise.reject(err);
   }
   return new Promise((resolve, reject) => {
-    let answered = false;
-    const onClose = () => {
-      if (!answered) {
-        const err = new Error('input stream closed before answer');
-        err.code = 'INPUT_CLOSED';
-        reject(err);
-      }
-    };
-    sharedRl.once('close', onClose);
-    sharedRl.question(question, (answer) => {
-      answered = true;
-      sharedRl.removeListener('close', onClose);
-      resolve(answer.trim());
+    waiters.push({
+      resolve: (line) => resolve(String(line).trim()),
+      reject,
     });
   });
 }
@@ -175,6 +235,213 @@ function closeReadline() {
   }
 }
+// --- Arrow-key menus -----------------------------------------------------
+//
+// Three layers, separated for testability:
+//
+//   1. menuStep(state, key) — pure state machine. Given the current
+//      menu state and a normalised key action, returns the new state.
+//      Unit-tested directly. No dependencies, no I/O.
+//
+//   2. renderMenu(state, opts) — pure renderer. Returns the frame to
+//      print as a string. Unit-tested too — uses the colour helpers,
+//      which collapse to identity strings under NO_COLOR (i.e., in
+//      tests), so the assertions are stable.
+//
+//   3. runMenuTty(items, opts) / selectMenu(items, opts) — the I/O
+//      glue. Detects TTY, sets raw mode, listens for keypress events,
+//      drives the loop, falls back to a numbered prompt under
+//      promptUntilValid when the terminal is non-interactive (CI,
+//      piped input, TERM=dumb). Not unit-tested — covered by manual
+//      smoke and the existing fallback-path integration tests.
+//
+// Cross-platform note: Node's `readline.emitKeypressEvents` decodes
+// arrow-key escape sequences uniformly across bash/zsh/fish on
+// Linux/macOS, Windows Terminal (PowerShell, cmd, WSL), Git Bash /
+// MSYS2, and VSCode's integrated terminal. Modern Node also enables VT
+// mode automatically on Windows when raw mode is requested, so legacy
+// cmd.exe on Win10+ works too. The only environment we explicitly bail
+// out of is `TERM=dumb` (emacs M-x shell, some IDE shells) — keypress
+// decoding is unreliable there.
+function menuStep(state, key) {
+  if (state.done) return state;
+  const len = state.items.length;
+  if (len === 0) return state;
+  switch (key) {
+    case 'up':
+      return { ...state, index: (state.index - 1 + len) % len };
+    case 'down':
+      return { ...state, index: (state.index + 1) % len };
+    case 'enter':
+      return { ...state, done: true, choice: state.items[state.index] };
+    case 'cancel':
+      return { ...state, done: true, choice: null };
+    default:
+      if (/^[0-9]$/.test(key)) {
+        const n = parseInt(key, 10);
+        if (n >= 1 && n <= len) {
+          return { ...state, index: n - 1 };
+        }
+      }
+      return state;
+  }
+}
+function renderMenu(state, { title } = {}) {
+  const lines = [];
+  if (title) {
+    lines.push('  ' + yellow(title));
+    lines.push('');
+  }
+  const labelWidth = Math.max(...state.items.map((it) => it.label.length));
+  state.items.forEach((item, i) => {
+    const selected = i === state.index;
+    const marker = selected ? cyan('>') : ' ';
+    const idx = String(i + 1).padStart(2);
+    const label = selected ? bold(item.label.padEnd(labelWidth)) : item.label.padEnd(labelWidth);
+    const desc = item.desc ? '  ' + gray('— ' + item.desc) : '';
+    lines.push(`  ${marker} ${idx}) ${label}${desc}`);
+  });
+  lines.push('');
+  lines.push('  ' + gray('↑/↓ navigate · Enter select · 1-9 jump · Esc cancel'));
+  return lines.join('\n') + '\n';
+}
+// True when arrow-key menus are usable in this process. False under
+// piped stdin/stdout, dumb terminals, or when raw mode is unavailable.
+function isInteractiveTerminal() {
+  if (!process.stdin.isTTY) return false;
+  if (!process.stdout.isTTY) return false;
+  if (process.env.TERM === 'dumb') return false;
+  if (typeof process.stdin.setRawMode !== 'function') return false;
+  return true;
+}
+// TTY runner: drive the menu state machine via raw-mode keypress
+// events. Returns the chosen item or null if the user cancelled.
+// The caller is responsible for not invoking this when
+// isInteractiveTerminal() is false.
+function runMenuTty(items, { title } = {}) {
+  // The shared line-mode readline (used by `prompt()`) and a raw-mode
+  // keypress reader can't both own stdin at the same time. Close any
+  // line-mode interface before we take over; the next prompt() call
+  // will lazily recreate it via ensureReadline().
+  closeReadline();
+  return new Promise((resolve) => {
+    let state = { items, index: 0, done: false, choice: null };
+    let lastFrameLineCount = 0;
+    const render = () => {
+      // Erase the previous frame in place: move the cursor up over its
+      // line count, then clear from cursor to end of screen. First
+      // render has nothing to erase.
+      if (lastFrameLineCount > 0) {
+        process.stdout.write(`\x1b[${lastFrameLineCount}A\x1b[J`);
+      }
+      const frame = renderMenu(state, { title });
+      process.stdout.write(frame);
+      // Count lines actually printed (frame ends with a trailing \n).
+      lastFrameLineCount = frame.split('\n').length - 1;
+    };
+    const cleanup = () => {
+      process.stdin.removeListener('keypress', onKey);
+      try {
+        process.stdin.setRawMode(false);
+      } catch { /* setRawMode can throw if stdin is not a TTY anymore */ }
+      process.stdin.pause();
+    };
+    const onKey = (_str, key) => {
+      if (!key) return;
+      let action = null;
+      if (key.ctrl && key.name === 'c') action = 'cancel';
+      else if (key.name === 'escape') action = 'cancel';
+      else if (key.name === 'up' || key.name === 'k') action = 'up';
+      else if (key.name === 'down' || key.name === 'j') action = 'down';
+      else if (key.name === 'return') action = 'enter';
+      else if (key.sequence && /^[0-9]$/.test(key.sequence)) action = key.sequence;
+      if (!action) return;
+      state = menuStep(state, action);
+      if (state.done) {
+        cleanup();
+        resolve(state.choice);
+      } else {
+        render();
+      }
+    };
+    readline.emitKeypressEvents(process.stdin);
+    process.stdin.setRawMode(true);
+    process.stdin.resume();
+    process.stdin.on('keypress', onKey);
+    render();
+  });
+}
+// Top-level selector: interactive arrow-key menu in real terminals,
+// numbered prompt fallback everywhere else (CI, piped input, dumb
+// TERM, EditorIDE shells). Always returns either an item from `items`
+// or null on cancel.
+async function selectMenu(items, { title, fallbackPrompt }) {
+  if (isInteractiveTerminal()) {
+    return await runMenuTty(items, { title });
+  }
+  // Non-TTY fallback: print the numbered list once, then prompt with
+  // promptUntilValid so a single typo doesn't kill the wizard.
+  log('');
+  if (title) log('  ' + yellow(title));
+  const labelWidth = Math.max(...items.map((it) => it.label.length));
+  items.forEach((item, i) => {
+    const idx = String(i + 1).padStart(2);
+    const desc = item.desc ? '  ' + gray('— ' + item.desc) : '';
+    log(`    ${idx}) ${bold(item.label.padEnd(labelWidth))}${desc}`);
+  });
+  log('');
+  return await promptUntilValid(
+    fallbackPrompt,
+    (raw) => {
+      const trimmed = String(raw || '').toLowerCase().trim();
+      if (!trimmed) return null;
+      if (/^\d+$/.test(trimmed)) {
+        const n = parseInt(trimmed, 10);
+        return n >= 1 && n <= items.length ? items[n - 1] : null;
+      }
+      return items.find((it) => it.id === trimmed) || null;
+    },
+    { invalidMessage: `Invalid selection — pick a number between 1 and ${items.length} or an id.` },
+  );
+}
+// Ask the user `question`, run `resolver` on the trimmed answer, and
+// loop while the resolver returns null/undefined. Prints a yellow
+// "try again" message between attempts. Aborts with process.exit(1)
+// after `maxAttempts` consecutive invalid answers so a piped input
+// can't infinite-loop us.
+//
+// Previously, cmdInit called `resolveDomain` / `resolveAgent` /
+// `sanitiseSlug` once and hard-failed on the first typo — users who
+// mistyped "saass" lost the whole wizard and had to start over. With
+// the retry loop, they just read the error and try again.
+async function promptUntilValid(question, resolver, {
+  maxAttempts = 3,
+  invalidMessage = 'Invalid selection — try again.',
+} = {}) {
+  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
+    const raw = await prompt(question);
+    const result = resolver(raw);
+    if (result != null && result !== '') return result;
+    const remaining = maxAttempts - attempt;
+    if (remaining > 0) {
+      log('  ' + yellow(`${invalidMessage} (${remaining} attempt${remaining === 1 ? '' : 's'} left)`));
+    }
+  }
+  logError(`Too many invalid attempts — aborting.`);
+  process.exit(1);
+}
 // --- Utilities ---------------------------------------------------------
 function sanitiseSlug(input) {
@@ -352,8 +619,10 @@ function copyDirRecursive(src, dest, { dryRun, copied }) {
 //   - Quoted scalars (single or double quoted) — names would keep quotes
 //
 // Returns { name, description, body }. `description` is always
-// flattened to a single line (whitespace collapsed) because the .mdc
-// rule format expects a one-line description.
+// flattened to a single line (whitespace collapsed) — keeps the
+// downstream consumers (manifest summary, status output, agent skill
+// loaders that expect a single-line description) free of multi-line
+// surprises.
 function parseSkillFrontmatter(content) {
   const fmMatch = content.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/);
   if (!fmMatch) {
@@ -402,34 +671,18 @@ function parseSkillFrontmatter(content) {
   };
 }
-// Transform a SKILL.md file's contents to the Cursor/Windsurf .mdc rule
-// format: replace the YAML frontmatter with the two fields the rule
-// loader expects (description, alwaysApply), keep the body unchanged.
-function skillToMdcContent(content) {
-  const { description, body } = parseSkillFrontmatter(content);
-  return `---\ndescription: ${description}\nalwaysApply: false\n---\n\n` + body;
-}
-// Install the package's skills/ tree into the given destination, picking
-// the layout the target agent expects.
-//
-// For 'skill' format (Claude Code, Codex, Gemini): each source skill
-// folder lands as `<destRoot>/<skillName>/SKILL.md`. The references/
+// Install the package's skills/ tree into the given destination. Every
+// supported agent uses the same Agent Skills layout: each source skill
+// folder lands as `<destRoot>/<skillName>/SKILL.md`. The `references/`
 // folder is copied as-is to `<destRoot>/references/`.
 //
-// For 'mdc' format (Cursor, Windsurf): each source skill folder is
-// flattened to a single `<destRoot>/<skillName>.mdc` file containing
-// the transformed content. References still go to `<destRoot>/references/`
-// — non-.mdc files there are ignored by the rule loaders, but the LLM
-// can still find them at runtime via the Read tool.
-//
 // Skill names come from the SKILL.md `name:` frontmatter field, falling
 // back to the source folder name. Returns:
 //   { copied, items }
 // where `copied` is the list of absolute file paths written and `items`
 // is the list of top-level entries in destRoot that the toolkit owns
 // (used to write the manifest).
-function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
+function copySkills(srcRoot, destRoot, { dryRun = false } = {}) {
   if (!fs.existsSync(srcRoot)) {
     throw new Error(`Source directory not found: ${srcRoot}`);
   }
@@ -456,17 +709,9 @@ function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
     const { name } = parseSkillFrontmatter(content);
     const skillName = name || entry.name;
-    if (format === 'mdc') {
-      const transformed = skillToMdcContent(content);
-      const destFile = path.join(destRoot, `${skillName}.mdc`);
-      if (!dryRun) fs.writeFileSync(destFile, transformed);
-      copied.push(destFile);
-      items.push(`${skillName}.mdc`);
-    } else {
-      const skillDestDir = path.join(destRoot, skillName);
-      copyDirRecursive(srcPath, skillDestDir, { dryRun, copied });
-      items.push(skillName);
-    }
+    const skillDestDir = path.join(destRoot, skillName);
+    copyDirRecursive(srcPath, skillDestDir, { dryRun, copied });
+    items.push(skillName);
   }
   return { copied, items };
@@ -486,6 +731,13 @@ function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
 // 21-skill list and ordering; keep that template in sync with it.
 const AGENTS_TEMPLATE_PATH = path.join(SKILLS_DIR, 'references', 'templates', 'agents-template.md');
+// Anchor markers delimit the block inside AGENTS.md that `ba-toolkit
+// init` owns and is allowed to rewrite on re-init. Everything outside
+// the anchors (Pipeline Status, Key Constraints, Open Questions, user
+// notes) is preserved untouched. See agents-template.md.
+const AGENTS_MANAGED_BEGIN = '<!-- ba-toolkit:begin managed -->';
+const AGENTS_MANAGED_END = '<!-- ba-toolkit:end managed -->';
 function renderAgentsMd({ name, slug, domain }) {
   let template;
   try {
@@ -500,9 +752,50 @@ function renderAgentsMd({ name, slug, domain }) {
     .replace(/\[DATE\]/g, today());
 }
+// Merge the fresh AGENTS.md content into whatever already exists at
+// the project root. Three branches:
+//
+//   1. No existing file (existing == null) — return the fresh template,
+//      action 'created'.
+//   2. Existing file has both anchor markers — replace only the managed
+//      block content between the anchors, leave the rest of the file
+//      (Pipeline Status, Key Constraints, user notes) untouched. Action
+//      'merged'.
+//   3. Existing file has no anchors — it's either a legacy AGENTS.md
+//      from a pre-merge version of the toolkit or a fully user-authored
+//      file. Leave it untouched and return { action: 'preserved' } so
+//      the caller can print a note. We never silently overwrite
+//      user content.
+//
+// Pure function for easy testing. Exported so test/cli.test.js can
+// cover all three branches without spawning a process.
+function mergeAgentsMd(existing, ctx) {
+  const fresh = renderAgentsMd(ctx);
+  if (existing == null) {
+    return { content: fresh, action: 'created' };
+  }
+  const beginIdx = existing.indexOf(AGENTS_MANAGED_BEGIN);
+  const endIdx = existing.indexOf(AGENTS_MANAGED_END);
+  if (beginIdx === -1 || endIdx === -1 || endIdx < beginIdx) {
+    return { content: existing, action: 'preserved' };
+  }
+  const freshBeginIdx = fresh.indexOf(AGENTS_MANAGED_BEGIN);
+  const freshEndIdx = fresh.indexOf(AGENTS_MANAGED_END);
+  if (freshBeginIdx === -1 || freshEndIdx === -1) {
+    // Template is broken — fall back to returning fresh. Should be
+    // caught in unit tests if the template file ever loses its anchors.
+    return { content: fresh, action: 'created' };
+  }
+  const freshManaged = fresh.slice(freshBeginIdx, freshEndIdx + AGENTS_MANAGED_END.length);
+  const before = existing.slice(0, beginIdx);
+  const after = existing.slice(endIdx + AGENTS_MANAGED_END.length);
+  return { content: before + freshManaged + after, action: 'merged' };
+}
 // --- Commands ----------------------------------------------------------
 async function cmdInit(args) {
+  printBanner();
   log('');
   log('  ' + cyan('BA Toolkit — New Project Setup'));
   log('  ' + cyan('================================'));
@@ -537,20 +830,43 @@ async function cmdInit(args) {
       }
       slug = derived;
     } else if (derived) {
-      const custom = await prompt(`  Project slug [${cyan(derived)}]: `);
-      slug = custom || derived;
+      // Default branch: the derived slug is offered as the suggested
+      // answer. Empty input accepts the suggestion; anything the user
+      // types is run through sanitiseSlug and must produce something
+      // non-empty — otherwise re-prompt.
+      slug = await promptUntilValid(
+        `  Project slug [${cyan(derived)}]: `,
+        (raw) => {
+          const typed = String(raw || '').trim();
+          if (!typed) return derived;
+          const cleaned = sanitiseSlug(typed);
+          return cleaned || null;
+        },
+        { invalidMessage: 'Invalid slug — must produce at least one ASCII letter/digit after sanitisation.' },
+      );
     } else {
       log('  ' + gray(`(could not derive a slug from "${name}" — please type one manually)`));
-      slug = await prompt('  Project slug (lowercase, hyphens only): ');
+      slug = await promptUntilValid(
+        '  Project slug (lowercase, hyphens only): ',
+        (raw) => {
+          const cleaned = sanitiseSlug(String(raw || '').trim());
+          return cleaned || null;
+        },
+        { invalidMessage: 'Invalid slug — must contain at least one ASCII letter or digit.' },
+      );
     }
   }
+  // At this point `slug` is already a sanitised, non-empty string from
+  // one of the branches above. The final sanitiseSlug call is a
+  // defensive no-op for the flag path (--slug) where we haven't
+  // cleaned it yet.
   slug = sanitiseSlug(slug);
   if (!slug) {
     logError('Invalid or empty slug.');
     process.exit(1);
   }
-  // --- 3. Domain (numbered menu) ---
+  // --- 3. Domain (arrow menu in TTY, numbered fallback elsewhere) ---
   const domainFlag = stringFlag(args, 'domain');
   let domain;
   if (domainFlag) {
@@ -561,20 +877,18 @@ async function cmdInit(args) {
       process.exit(1);
     }
   } else {
-    log('');
-    log('  ' + yellow('Pick a domain:'));
-    const domainNameWidth = Math.max(...DOMAINS.map((d) => d.name.length));
-    DOMAINS.forEach((d, i) => {
-      const idx = String(i + 1).padStart(2);
-      log(`    ${idx}) ${bold(d.name.padEnd(domainNameWidth))} ${gray('— ' + d.desc)}`);
-    });
-    log('');
-    const raw = await prompt(`  Select [1-${DOMAINS.length}]: `);
-    domain = resolveDomain(raw);
-    if (!domain) {
-      logError(`Invalid selection: ${raw || '(empty)'}`);
-      process.exit(1);
+    const chosen = await selectMenu(
+      DOMAINS.map((d) => ({ id: d.id, label: d.name, desc: d.desc })),
+      {
+        title: 'Pick a domain:',
+        fallbackPrompt: `  Select [1-${DOMAINS.length}]: `,
+      },
+    );
+    if (chosen == null) {
+      log('  ' + yellow('Cancelled.'));
+      process.exit(130);
     }
+    domain = chosen.id;
   }
   // --- 4. Agent (numbered menu), unless --no-install ---
@@ -590,21 +904,19 @@ async function cmdInit(args) {
         process.exit(1);
       }
     } else {
-      log('');
-      log('  ' + yellow('Pick your AI agent:'));
       const agentEntries = Object.entries(AGENTS);
-      const agentNameWidth = Math.max(...agentEntries.map(([, a]) => a.name.length));
-      agentEntries.forEach(([id, a], i) => {
-        const idx = String(i + 1).padStart(2);
-        log(`    ${idx}) ${bold(a.name.padEnd(agentNameWidth))} ${gray('(' + id + ')')}`);
-      });
-      log('');
-      const raw = await prompt(`  Select [1-${agentEntries.length}]: `);
-      agentId = resolveAgent(raw);
-      if (!agentId) {
-        logError(`Invalid selection: ${raw || '(empty)'}`);
-        process.exit(1);
+      const chosen = await selectMenu(
+        agentEntries.map(([id, a]) => ({ id, label: a.name, desc: '(' + id + ')' })),
+        {
+          title: 'Pick your AI agent:',
+          fallbackPrompt: `  Select [1-${agentEntries.length}]: `,
+        },
+      );
+      if (chosen == null) {
+        log('  ' + yellow('Cancelled.'));
+        process.exit(130);
       }
+      agentId = chosen.id;
     }
   }
@@ -620,18 +932,23 @@ async function cmdInit(args) {
     log(`    exists   ${outputDir}`);
   }
+  // AGENTS.md: merge-on-reinit instead of overwrite. Everything outside
+  // the managed block (Pipeline Status, Key Constraints, user notes) is
+  // preserved. See mergeAgentsMd for the three branches (created,
+  // merged, preserved).
   const agentsPath = 'AGENTS.md';
-  let writeAgents = true;
-  if (fs.existsSync(agentsPath)) {
-    const answer = await prompt('  AGENTS.md already exists. Overwrite? (y/N): ');
-    if (answer.toLowerCase() !== 'y') {
-      writeAgents = false;
-      log('    skipped  AGENTS.md');
-    }
-  }
-  if (writeAgents) {
-    fs.writeFileSync(agentsPath, renderAgentsMd({ name, slug, domain }));
-    log('    created  AGENTS.md');
+  const existingAgents = fs.existsSync(agentsPath)
+    ? fs.readFileSync(agentsPath, 'utf8')
+    : null;
+  const { content: agentsContent, action: agentsAction } = mergeAgentsMd(
+    existingAgents,
+    { name, slug, domain },
+  );
+  if (agentsAction === 'preserved') {
+    log('    ' + gray('preserved AGENTS.md (no ba-toolkit managed block — left untouched)'));
+  } else {
+    fs.writeFileSync(agentsPath, agentsContent);
+    log(`    ${agentsAction === 'merged' ? 'updated ' : 'created '} AGENTS.md`);
   }
   // --- 6. Install skills for the selected agent ---
@@ -683,8 +1000,8 @@ async function cmdInit(args) {
 // only what we own without touching the user's other skills sitting in
 // the same directory.
 //
-// Hidden filename with no `.md` / `.mdc` extension so the skill loader
-// of every supported agent ignores it.
+// Hidden filename with no `.md` extension so the skill loader of every
+// supported agent ignores it.
 const MANIFEST_FILENAME = '.ba-toolkit-manifest.json';
 function readManifest(destDir) {
@@ -697,11 +1014,10 @@ function readManifest(destDir) {
   }
 }
-function writeManifest(destDir, format, items) {
+function writeManifest(destDir, items) {
   const payload = {
     version: PKG.version,
     installedAt: new Date().toISOString(),
-    format,
     items,
   };
   fs.writeFileSync(
@@ -755,7 +1071,7 @@ async function runInstall({ agentId, isGlobal, isProject, dryRun, showHeader = t
   log(`    source:       ${SKILLS_DIR}`);
   log(`    destination:  ${destDir}`);
   log(`    scope:        ${effectiveGlobal ? 'global (user-wide)' : 'project-level'}`);
-  log(`    format:       ${agent.format === 'mdc' ? '.mdc (converted from SKILL.md)' : 'SKILL.md (native)'}`);
+  log(`    format:       SKILL.md (native)`);
   if (dryRun) log('    ' + yellow('mode:         dry-run (no files will be written)'));
   // Warn about a v1.x wrapper folder if one is sitting in the same
@@ -784,29 +1100,25 @@ async function runInstall({ agentId, isGlobal, isProject, dryRun, showHeader = t
   let result;
   try {
-    result = copySkills(SKILLS_DIR, destDir, { format: agent.format, dryRun });
+    result = copySkills(SKILLS_DIR, destDir, { dryRun });
   } catch (err) {
     logError(err.message);
     process.exit(1);
   }
   if (!dryRun) {
-    writeManifest(destDir, agent.format, result.items);
+    writeManifest(destDir, result.items);
   }
   log('    ' + green(`${dryRun ? 'would copy' : 'copied'} ${result.copied.length} files (${result.items.length} items).`));
-  if (!dryRun && agent.format === 'mdc') {
-    log('    ' + gray('SKILL.md files converted to .mdc rule format.'));
-  }
   return true;
 }
 // Remove every item listed in the given manifest from destDir, then
-// remove the manifest file itself. Items are paths relative to destDir
-// — for 'skill' format they're folder names (`brief`, `srs`, ...,
-// `references`), for 'mdc' they're file names (`brief.mdc`, ...,
-// plus the `references` folder). Anything not in the manifest is left
-// alone, including the user's other skills/rules in the same directory.
+// remove the manifest file itself. Items are top-level entries
+// relative to destDir — folder names like `brief`, `srs`, …,
+// `references`. Anything not in the manifest is left alone, including
+// the user's other skills sitting in the same directory.
 function removeManifestItems(destDir, manifest) {
   for (const item of manifest.items) {
     const p = path.join(destDir, item);
@@ -1264,10 +1576,12 @@ module.exports = {
   levenshtein,
   closestMatch,
   parseSkillFrontmatter,
-  skillToMdcContent,
   readManifest,
   detectLegacyInstall,
   renderAgentsMd,
+  mergeAgentsMd,
+  menuStep,
+  renderMenu,
   KNOWN_FLAGS,
   DOMAINS,
   AGENTS,

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@kudusov.takhir/ba-toolkit",
-  "version": "2.0.0",
+  "version": "3.0.0",
   "description": "AI-powered Business Analyst pipeline — 21 skills from project brief to development handoff. Works with Claude Code, Codex CLI, Gemini CLI, Cursor, and Windsurf.",
   "keywords": [
     "business-analyst",
@@ -43,6 +43,6 @@
     "node": ">=18"
   },
   "scripts": {
-    "test": "node --test test/cli.test.js"
+    "test": "node --test test/cli.test.js test/cli.integration.test.js"
   }
 }

package/skills/ac/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Which business rules should be reflected in AC (limits, formulas, thresholds)?

package/skills/apicontract/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Protocol — REST, WebSocket, GraphQL, combination?

package/skills/brief/SKILL.md CHANGED Viewed

@@ -34,7 +34,9 @@ The domain is written into the brief metadata and passed to all subsequent pipel
 ### 4. Interview
-3–7 questions per round, 2–4 rounds. Do not generate the artifact until sufficient information is collected.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+Cover 3–7 topics per round, 2–4 rounds. Do not generate the artifact until sufficient information is collected.
 **Required topics (all domains):**
 1. Product type — what exactly is being built?

package/skills/datadict/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. DBMS — MongoDB, PostgreSQL, MySQL, other?

package/skills/nfr/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Performance — target CCU (Concurrent Users), RPS (Requests Per Second), acceptable response time?

package/skills/principles/SKILL.md CHANGED Viewed

@@ -27,7 +27,9 @@ If `01_brief_*.md` already exists, extract the slug and domain from it. Otherwis
 ### 3. Interview
-1–2 rounds, 3–5 questions each. Do not ask about topics the user can accept as defaults.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+1–2 rounds, 3–5 topics each. Do not ask about topics the user can accept as defaults.
 **Required topics:**
 1. Artifact language — which language should all artifacts be generated in? (default: the language of the user's first message)

package/skills/references/interview-protocol.md ADDED Viewed

@@ -0,0 +1,53 @@
+# Interview Protocol
+Every BA Toolkit skill that gathers information from the user MUST follow this protocol during its Interview phase. The goal is a conversation, not a questionnaire dump — users answer better when each question has focus and concrete options to react to.
+## Rules
+1. **One question at a time.** Never send a numbered list of 5+ questions in a single message. Ask one question, wait for the answer, acknowledge it in one line, then ask the next.
+2. **Offer 3–5 answer options per question.** For every question, present a short numbered list of the most likely answers based on:
+   - The project domain (load `references/domains/{domain}.md` and reuse its vocabulary, typical entities, and business goals verbatim when they fit — do not invent domain-specific options when the reference file already lists them).
+   - What the user has already said earlier in the interview.
+   - Industry conventions for the artifact being built.
+   Options should be **concrete**, not abstract — e.g. for "Who is your primary user?" in a SaaS project, offer "Product Manager at a 50–500-person SaaS startup", "Engineering Lead", "Ops/Support team", not "End user", "Customer", "User".
+3. **Always include a free-text option.** The last numbered option must always be something like `5. Other — type your own answer`. If the user picks it, accept arbitrary text. Never force the user into one of the predefined options.
+4. **Wait for the answer.** Do not generate the next question or any part of the artifact until the user has replied. A non-answer (e.g. "I don't know", "skip") is a valid answer — record it as "unknown" and move on.
+5. **Acknowledge, then proceed.** After each answer, reflect it back in one line (e.g. "Got it — primary user is the Ops team at mid-size logistics companies.") before asking the next question. This catches misunderstandings early.
+6. **Batch only when the user asks.** If the user explicitly says "just give me all the questions at once" or "I'll answer in one go", switch to a single numbered list. Otherwise stay one-at-a-time.
+7. **Stop when you have enough.** Each skill specifies a required set of topics. Once every required topic has a recorded answer, stop asking and move to the Generation phase. Do not pad the interview with "nice-to-have" questions.
+## Example
+Bad (old style):
+> Please answer the following questions:
+> 1. What is the product?
+> 2. Who is the target user?
+> 3. What problem does it solve?
+> 4. What are the success metrics?
+> 5. What are the key constraints?
+Good (protocol style):
+> Let's start with the product itself. What are you building?
+>
+> 1. A B2B SaaS tool for internal teams (dashboards, automation, reporting)
+> 2. A customer-facing web application (marketplace, portal, community)
+> 3. A mobile app (consumer or B2B)
+> 4. An API / developer platform
+> 5. Other — type your own answer
+*User picks 1 or types custom.*
+> Got it — internal B2B SaaS tool. Who is the primary user? [next question with 3–5 options tailored to B2B SaaS internal tooling]
+## When this protocol applies
+This protocol applies to every skill that has an `### Interview` (or `## Interview`) section in its SKILL.md — currently: `brief`, `srs`, `stories`, `usecases`, `ac`, `nfr`, `datadict`, `apicontract`, `wireframes`, `scenarios`, `research`, `principles`. Each of those skills MUST link to this file from its Interview section and follow the rules above.

package/skills/references/templates/agents-template.md CHANGED Viewed

@@ -1,6 +1,7 @@
 # BA Toolkit — Project Context
-> Auto-generated by `ba-toolkit init` on [DATE]. Updated automatically by /brief and /srs.
+<!-- ba-toolkit:begin managed -->
+> Auto-generated by `ba-toolkit init` on [DATE]. The Active Project block below is refreshed on every re-init. Everything outside this managed block is preserved — add your own notes, update the Pipeline Status, and edit the Key Constraints / Open Questions sections freely; `ba-toolkit init` will not touch them.
 ## Active Project
@@ -9,6 +10,7 @@
 **Domain:** [DOMAIN]
 **Language:** English
 **Output folder:** output/[SLUG]/
+<!-- ba-toolkit:end managed -->
 ## Pipeline Status

package/skills/research/SKILL.md CHANGED Viewed

@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-1–2 rounds, 4–6 questions.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+1–2 rounds, 4–6 topics.
 **Required topics:**
 1. Existing infrastructure — is there a current backend, database, or API the new system must integrate with or extend?

package/skills/scenarios/SKILL.md CHANGED Viewed

@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-1 round, 3–5 questions.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+1 round, 3–5 topics.
 **Required topics:**
 1. Coverage priority — generate scenarios for Must-priority US only, or include Should as well?

package/skills/srs/SKILL.md CHANGED Viewed

@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds. Do not re-ask information already known from the brief.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds. Do not re-ask information already known from the brief.
 **Required topics:**
 1. User roles — which roles interact with the system?

package/skills/stories/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Which user flows are most critical?

package/skills/usecases/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Detail level — summary, user-goal, subfunction?

package/skills/wireframes/SKILL.md CHANGED Viewed

@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
 ## Interview
-3–7 questions per round, 2–4 rounds.
+> **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
+3–7 topics per round, 2–4 rounds.
 **Required topics:**
 1. Platform — web (desktop, mobile responsive), native app, Telegram Mini App?