@kudusov.takhir/ba-toolkit 2.0.0 → 3.0.0

package/CHANGELOG.md CHANGED
@@ -11,6 +11,64 @@ Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
11
11
 
12
12
  ---
13
13
 
14
+ ## [3.0.0] — 2026-04-09
15
+
16
+ ### ⚠️ BREAKING — Cursor and Windsurf install paths moved to native Agent Skills
17
+
18
+ Cursor and Windsurf both have **two separate features**: Rules (`.cursor/rules/*.mdc`, `.windsurf/rules/*.mdc`) and Agent Skills (`.cursor/skills/<skill>/SKILL.md`, `.windsurf/skills/<skill>/SKILL.md`). BA Toolkit is a pipeline of skills, not rules, but every previous version installed it as `.mdc` rules under `.cursor/rules/` and `.windsurf/rules/`. Both editors loaded those files as **rules**, never as skills, so `/brief`, `/srs`, … slash commands were never registered with the agent. Users reported that the skills did not show up. v3.0 fixes this by switching to the native Agent Skills paths and the native folder-per-skill `SKILL.md` layout (the same one Claude Code, Codex CLI, and Gemini CLI use). Confirmed against the official Cursor and Windsurf documentation via ctx7 MCP:
19
+
20
+ | Agent | v2.0 path (broken) | v3.0 path (correct) | Format |
21
+ |---|---|---|---|
22
+ | Cursor | `.cursor/rules/<skill>.mdc` (flat) | `.cursor/skills/<skill>/SKILL.md` (folder-per-skill) | Native |
23
+ | Windsurf | `.windsurf/rules/<skill>.mdc` (flat) | `.windsurf/skills/<skill>/SKILL.md` (folder-per-skill) | Native |
24
+
25
+ **Migration for v2.0 Cursor/Windsurf users:**
26
+
27
+ ```bash
28
+ # 1. Upgrade the package
29
+ npm install -g @kudusov.takhir/ba-toolkit@latest
30
+
31
+ # 2. Reinstall — the old install was at the wrong path; upgrade can't find
32
+ # its manifest there, so just run install fresh against the correct path.
33
+ ba-toolkit install --for cursor # writes to .cursor/skills/
34
+ ba-toolkit install --for windsurf # writes to .windsurf/skills/
35
+
36
+ # 3. Manually clean up the orphaned old install (it never actually worked
37
+ # as skills anyway — Cursor/Windsurf were loading it as rules):
38
+ rm -rf .cursor/rules/*.mdc # if those .mdc files came from BA Toolkit
39
+ rm -rf .windsurf/rules/*.mdc
40
+ ```
41
+
42
+ After this you'll see the BA Toolkit skills register as actual Agent Skills in Cursor and Windsurf for the first time — `/brief`, `/srs`, `/ac`, `/nfr`, … become real slash commands. Reload the editor window after install. Claude Code, Codex CLI, and Gemini CLI users are unaffected — their paths and behavior are unchanged.
43
+
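The cleanup in step 3 removes every `.mdc` file in the rules folder, including rules the user wrote. A more cautious approach is to list only the files that look like toolkit leftovers before deleting anything. The following is a minimal sketch, assuming the v2.0 install wrote `<skill>.mdc` and the v3.0 install created a matching `<skill>/` folder, as in the table above. `findOrphanedRules` is a hypothetical helper, not part of the CLI.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: list old rule files whose name matches a skill folder
// created by the v3.0 install, e.g. rules/brief.mdc next to skills/brief/.
// `editorDir` is '.cursor' or '.windsurf'.
function findOrphanedRules(projectDir, editorDir) {
  const rulesDir = path.join(projectDir, editorDir, 'rules');
  const skillsDir = path.join(projectDir, editorDir, 'skills');
  if (!fs.existsSync(rulesDir) || !fs.existsSync(skillsDir)) return [];

  // Names of the skill folders that exist after the fresh install.
  const skills = new Set(
    fs.readdirSync(skillsDir, { withFileTypes: true })
      .filter((entry) => entry.isDirectory())
      .map((entry) => entry.name),
  );

  // Keep only .mdc files that share a name with an installed skill.
  return fs.readdirSync(rulesDir)
    .filter((file) => file.endsWith('.mdc') && skills.has(path.basename(file, '.mdc')))
    .map((file) => path.join(rulesDir, file))
    .sort();
}
```

Calling `findOrphanedRules(process.cwd(), '.cursor')` returns the candidate paths for review. Rules with unrelated names are never listed.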
44
+ ### Added
45
+
46
+ - **Integration test suite for every CLI subcommand** (`test/cli.integration.test.js`, 33 tests). Spawns the real CLI as a child process against temporary directories and asserts exit codes, stdout/stderr content, and filesystem state. Covers `--version`/`--help`/no-args, typo detection for unknown flags, `init` with flag combinations and validation failures, `install`/`upgrade`/`uninstall` dry-runs for every supported agent, end-to-end install → manifest → status → upgrade → uninstall round-trips for both Claude Code and Cursor (native skill format), and a regression guard proving that `uninstall` leaves the user's own unrelated skills in the shared destination untouched (manifest-driven removal guarantee).
47
+ - **Test gate in `.github/workflows/release.yml`**. A new `Run tests` step runs `npm test` between the smoke test and the npm publish step — if any unit or integration test fails, the `publish-npm` job exits before the classic-auth strip + publish steps, and no broken release reaches npm. The GitHub Release created by the preceding job still happens; npm and GitHub are intentionally independent.
48
+ - **Test job in `.github/workflows/validate.yml`**. A new `run-tests` job mirrors the release gate at PR time, so regressions are caught on the PR instead of at tag push. Triggered for changes under `bin/**`, `test/**`, `skills/**`, `output/**`, or `package.json`.
49
+ - **ASCII banner at the top of `ba-toolkit init`**. Decorative `ba-toolkit` wordmark printed before the "New Project Setup" heading. Suppressed on non-TTY stdout (CI logs, piped output, captured test stdout) via an `isTTY` guard in `printBanner()`, so the banner never pollutes automation. Covered by a regression test that asserts the banner glyphs don't appear in non-TTY runs.
50
+ - **Arrow-key menu navigation in `ba-toolkit init`** for the domain and agent selection prompts. Real terminals now get an interactive menu — `↑/↓` (also `j/k`) to move, `1-9` to jump by index, `Enter` to confirm, `Esc` or `Ctrl+C` to cancel cleanly with exit code 130. The previous numbered prompt is the automatic fallback when stdin/stdout is not a TTY (CI, piped input, `TERM=dumb`, IDE shells), so all CI/integration tests and the `printf | ba-toolkit init` use case keep working unchanged. Cross-platform via Node's `readline.emitKeypressEvents` + `setRawMode` — works the same on bash, zsh, fish, Windows Terminal (PowerShell, cmd, WSL), Git Bash, and VSCode integrated terminal. Three-layer design for testability: `menuStep(state, key)` — pure state machine, 11 unit tests; `renderMenu(state, opts)` — pure renderer, 7 unit tests; `runMenuTty` — the I/O glue, manually smoke-tested.
51
+ - **Interview Protocol for every interview-phase skill** (`skills/references/interview-protocol.md`). Codifies the rule that AI skills must ask ONE question at a time, offer 3–5 domain-appropriate options per question sourced from `references/domains/{domain}.md`, always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question. Replaces the previous "dump a numbered questionnaire of 5+ questions at once" style that made users abandon long interviews. Every shipped SKILL.md with an `Interview` heading — 12 files: `brief`, `srs`, `stories`, `usecases`, `ac`, `nfr`, `datadict`, `apicontract`, `wireframes`, `scenarios`, `research`, `principles` — now opens its Interview section with a blockquote pointing to this protocol. A Node-level regression test in `test/cli.test.js` walks every shipped SKILL.md and fails if any interview skill ever ships without the protocol link; a mirror validation step in `.github/workflows/validate.yml` catches the same at PR time.
52
+
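The protocol-link regression check in the last bullet could look roughly like the sketch below. It assumes skills live at `<root>/<skill>/SKILL.md`, that an interview skill is any `SKILL.md` with a Markdown heading containing the word "Interview", and that the link can be detected by the substring `interview-protocol.md`. The function name is hypothetical and the real test in `test/cli.test.js` may differ.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: returns the names of interview skills whose SKILL.md
// does not mention the protocol file.
function findSkillsMissingProtocol(skillsRoot) {
  const missing = [];
  for (const entry of fs.readdirSync(skillsRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const skillFile = path.join(skillsRoot, entry.name, 'SKILL.md');
    if (!fs.existsSync(skillFile)) continue; // e.g. a references/ folder
    const text = fs.readFileSync(skillFile, 'utf8');
    // Assumption: an interview skill has a heading that contains "Interview".
    const hasInterviewHeading = /^#{1,6}\s.*\bInterview\b/m.test(text);
    if (hasInterviewHeading && !text.includes('interview-protocol.md')) {
      missing.push(entry.name);
    }
  }
  return missing.sort();
}
```

A test would then assert that the returned list is empty for the shipped `skills/` directory.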
53
+ ### Changed
54
+
55
+ - **`ba-toolkit init` now merges `AGENTS.md` instead of overwriting it.** The `agents-template.md` wraps the Active Project block (name/slug/domain/date) in `<!-- ba-toolkit:begin managed -->` / `<!-- ba-toolkit:end managed -->` anchors. On re-init, only the managed block is refreshed — Pipeline Status edits, Key Constraints, Open Questions, user notes, and anything else outside the anchors is preserved byte-for-byte. If an existing `AGENTS.md` has no anchors (legacy file or fully user-authored), it's left untouched and a `preserved` note is printed instead of overwriting. The old interactive "AGENTS.md already exists. Overwrite? (y/N)" prompt is removed — the merge is always safe. New pure helper `mergeAgentsMd(existing, ctx)` exported from `bin/ba-toolkit.js` with unit tests covering all three branches (`created`, `merged`, `preserved`) plus malformed-anchor edge cases; integration tests verify the double-init scenario and the unmanaged-file preservation.
56
+ - `npm test` now runs both the pre-existing unit suite (`test/cli.test.js`) and the new integration suite — 154 tests total, ~2 seconds, still zero dependencies (only Node built-ins: `node:test`, `node:assert`, `node:child_process`).
57
+ - `.gitignore` now excludes `.claude/settings.local.json` and `.claude/skills/` — local Claude Code state, never part of the package. `settings.local.json` was previously tracked and required a stash dance around `npm version`; that workaround is no longer needed.
58
+
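The anchor-based merge described for `AGENTS.md` can be illustrated with a short sketch. This is not the package's `mergeAgentsMd(existing, ctx)`. The function name, the `template` argument, and the `{ action, content }` return shape are assumptions chosen to mirror the three branches named above (`created`, `merged`, `preserved`).

```javascript
const BEGIN = '<!-- ba-toolkit:begin managed -->';
const END = '<!-- ba-toolkit:end managed -->';

// Hypothetical sketch. `existing` is the current file text, or null when the
// file does not exist. `template` is a freshly rendered file that contains
// exactly one BEGIN...END block.
function mergeManagedBlock(existing, template) {
  const tStart = template.indexOf(BEGIN);
  const tEnd = template.indexOf(END);
  if (tStart === -1 || tEnd < tStart) throw new Error('template has no managed block');

  if (existing === null) return { action: 'created', content: template };

  const start = existing.indexOf(BEGIN);
  const end = existing.indexOf(END);
  // No anchors, or anchors out of order: treat the file as user-owned.
  if (start === -1 || end === -1 || end < start) {
    return { action: 'preserved', content: existing };
  }

  // Swap only the managed block; everything outside the anchors is untouched.
  const freshBlock = template.slice(tStart, tEnd + END.length);
  const content = existing.slice(0, start) + freshBlock + existing.slice(end + END.length);
  return { action: 'merged', content };
}
```

Because the text before and after the anchors is copied with `slice`, user edits outside the managed block survive unchanged.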
59
+ ### Fixed
60
+
61
+ - **Cursor install now targets `.cursor/skills/` with native SKILL.md format, not `.cursor/rules/` with `.mdc` conversion.** See the BREAKING section above for the full story and migration steps.
62
+ - **Windsurf install now targets `.windsurf/skills/` with native SKILL.md format, not `.windsurf/rules/` with `.mdc` conversion.** Mirror of the Cursor fix. Confirmed against the [Windsurf Agent Skills documentation](https://docs.windsurf.com/windsurf/cascade/skills) via ctx7 MCP: Windsurf loads skills from `.windsurf/skills/<skill-name>/SKILL.md` as folder-per-skill with the same YAML frontmatter (`name`, `description`) as Claude Code and Cursor.
63
+ - **`ba-toolkit init` no longer crashes on a single typo during interactive input.** The domain menu, agent menu, and manual slug entry now re-prompt on invalid input via a new `promptUntilValid(question, resolver, { maxAttempts, invalidMessage })` helper. After three consecutive invalid answers the command aborts with a clear "Too many invalid attempts — aborting." message so piped input can't infinite-loop. The flag-path (`--domain=banana`, `--for=vim`) still hard-fails immediately — that's the correct behavior for CI and scripting, and its tests are untouched.
64
+ - **`prompt()` race condition that silently dropped piped input lines.** The previous implementation used `rl.question()` on a shared `readline.Interface`; when stdin was piped with multiple answers at once (e.g. `printf "banana\nsaas\n" | ba-toolkit init`, or any test feeding a full input buffer via `spawnSync`), readline emitted `'line'` events before the next `question()` handler had been attached, and those answers were silently lost. The second prompt then saw EOF and aborted with `INPUT_CLOSED` despite the answer being in the buffer. The new `prompt()` owns the `'line'` event directly, buffers arriving lines into an internal `lineQueue`, and parks `waiters` when the queue is empty — no more lost input. Uncovered while wiring up the `promptUntilValid` retry path.
65
+
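The bounded retry behaviour described for interactive input can be sketched as follows. This is an illustration under stated assumptions, not the package's `promptUntilValid`. The name `retryPrompt`, the injected `ask` function (standing in for the CLI's `prompt()`), the default message, and the `TOO_MANY_ATTEMPTS` error code are all hypothetical.

```javascript
// Hypothetical sketch: `ask(question)` must return a Promise<string>;
// `resolver(raw)` returns a value, or null when the input is invalid.
async function retryPrompt(ask, question, resolver, opts = {}) {
  const { maxAttempts = 3, invalidMessage = 'Invalid input, try again.' } = opts;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await ask(question);
    const value = resolver(raw);
    if (value !== null && value !== undefined) return value;
    // Only nag when another attempt is still available.
    if (attempt < maxAttempts) console.error(invalidMessage);
  }
  // Bounded retries mean piped garbage cannot loop forever.
  const err = new Error('too many invalid attempts');
  err.code = 'TOO_MANY_ATTEMPTS'; // assumed code, not taken from the package
  throw err;
}
```

Injecting `ask` keeps the retry logic testable without a terminal: a test can pass a function that returns canned answers.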
66
+ ### Removed
67
+
68
+ - **`.mdc` rule format conversion path is gone** — every shipped agent now uses native Agent Skills, so the conversion was dead code. Deleted: `skillToMdcContent()` (and its 2 unit tests), the `mdc` branch of `copySkills()`, the `format` field on every entry in `AGENTS`, the `format` parameter passed to `copySkills` and `writeManifest`, the runtime "format: .mdc (converted from SKILL.md)" log line. The `format` field is also gone from new manifests, but `readManifest` still parses legacy manifests that have it (covered by a forward-compat unit test). Net removal: ~80 lines of code + 2 unit tests; CLI surface unchanged.
69
+
70
+ ---
71
+
14
72
  ## [2.0.0] — 2026-04-09
15
73
 
16
74
  ### ⚠️ BREAKING — install layout dropped the `ba-toolkit/` wrapper
@@ -350,7 +408,8 @@ CI scripts that relied on the old behaviour (`init` creates files only, `install
350
408
 
351
409
  ---
352
410
 
353
- [Unreleased]: https://github.com/TakhirKudusov/ba-toolkit/compare/v2.0.0...HEAD
411
+ [Unreleased]: https://github.com/TakhirKudusov/ba-toolkit/compare/v3.0.0...HEAD
412
+ [3.0.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v2.0.0...v3.0.0
354
413
  [2.0.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.5.0...v2.0.0
355
414
  [1.5.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.4.0...v1.5.0
356
415
  [1.4.0]: https://github.com/TakhirKudusov/ba-toolkit/compare/v1.3.2...v1.4.0
package/README.md CHANGED
@@ -13,8 +13,8 @@ Structured BA pipeline for AI coding agents — brief to handoff, 21 skills, 9 d
13
13
  <img src="https://img.shields.io/badge/Claude_Code-✓-6C5CE7" alt="Claude Code">
14
14
  <img src="https://img.shields.io/badge/Codex_CLI-✓-00D26A" alt="Codex CLI">
15
15
  <img src="https://img.shields.io/badge/Gemini_CLI-✓-4285F4" alt="Gemini CLI">
16
- <img src="https://img.shields.io/badge/Cursor-convert-F5A623" alt="Cursor">
17
- <img src="https://img.shields.io/badge/Windsurf-convert-1ABCFE" alt="Windsurf">
16
+ <img src="https://img.shields.io/badge/Cursor-✓-F5A623" alt="Cursor">
17
+ <img src="https://img.shields.io/badge/Windsurf-✓-1ABCFE" alt="Windsurf">
18
18
 
19
19
  </div>
20
20
 
@@ -46,7 +46,7 @@ npm install -g @kudusov.takhir/ba-toolkit
46
46
  ba-toolkit init
47
47
  ```
48
48
 
49
- Supported agents: `claude-code`, `codex`, `gemini`, `cursor`, `windsurf`. Cursor and Windsurf installs auto-convert `SKILL.md` into the `.mdc` rule format. Pass `--dry-run` to preview the install step without writing files, or `--no-install` to create only the project structure and install skills later with `ba-toolkit install --for <agent>`.
49
+ Supported agents: `claude-code`, `codex`, `gemini`, `cursor`, `windsurf`. All five use the native Agent Skills format (folder-per-skill with `SKILL.md`): Claude Code at `.claude/skills/`, Codex at `~/.codex/skills/`, Gemini at `.gemini/skills/`, Cursor at `.cursor/skills/`, Windsurf at `.windsurf/skills/`. Pass `--dry-run` to preview the install step without writing files, or `--no-install` to create only the project structure and install skills later with `ba-toolkit install --for <agent>`.
50
50
 
51
51
  `ba-toolkit --help` shows the full CLI reference. Zero runtime dependencies — only Node.js ≥ 18.
52
52
 
@@ -98,21 +98,17 @@ cp -R ba-toolkit/skills/. /path/to/project/.gemini/skills/
98
98
 
99
99
  Reload the CLI after copying.
100
100
 
101
- ### Cursor, Windsurf, Aider
101
+ ### Cursor
102
102
 
103
- These use their own rules format instead of `SKILL.md`. Convert first, then copy:
103
+ Cursor has two separate features: Rules (`.cursor/rules/*.mdc`) and [Agent Skills](https://cursor.com/docs/skills) (`.cursor/skills/<skill>/SKILL.md`). BA Toolkit is a set of skills, not rules, so `ba-toolkit install --for cursor` drops the 21 skills directly into `.cursor/skills/` using the native folder-per-skill `SKILL.md` format, with no conversion needed. Reload the Cursor window to pick them up.
104
104
 
105
- ```bash
106
- # Option 1: community converter
107
- # https://github.com/alirezarezvani/claude-skills/blob/main/scripts/convert.sh
108
- ./convert.sh --tool cursor --target /path/to/project
109
- ./convert.sh --tool windsurf --target /path/to/project
105
+ ### Windsurf
110
106
 
111
- # Option 2: ask your AI agent
112
- # "Convert all SKILL.md files in skills/ to Cursor .mdc rule format"
113
- ```
107
+ Windsurf's [Agent Skills](https://docs.windsurf.com/windsurf/cascade/skills) feature loads skills from `.windsurf/skills/<skill>/SKILL.md`, the same folder-per-skill layout as Claude Code and Cursor. `ba-toolkit install --for windsurf` writes the 21 skills there natively. Reload the Windsurf window to pick them up.
108
+
109
+ ### Aider
114
110
 
115
- Cursor rules live in [`.cursor/rules/`](https://cursor.com/docs/rules) as `.mdc` files with YAML frontmatter (`description`, optional `globs`, `alwaysApply`). A plain rename of `SKILL.md` to `.mdc` is not enough — the metadata block is required. [Cursor CLI](https://cursor.com/docs/cli/using) reads the same `.cursor/rules` setup and may also pick up `AGENTS.md` / `CLAUDE.md` at repo root.
111
+ Aider has no native skills feature. Convert manually with the community script at <https://github.com/alirezarezvani/claude-skills/blob/main/scripts/convert.sh> or ask your AI agent to convert `SKILL.md` files to the target format.
116
112
 
117
113
  ### Starting a new project (shell scripts)
118
114
 
@@ -205,11 +201,11 @@ BA Toolkit uses the open Agent Skills specification (`SKILL.md` format) publishe
205
201
  | **Claude Code** | Native | `cp -R skills/. .claude/skills/` |
206
202
  | **OpenAI Codex CLI** | Native | `cp -R skills/. ~/.codex/skills/` |
207
203
  | **Gemini CLI** | Native | Copy `skills/.` contents to `~/.gemini/skills/` (user) or `.gemini/skills/` (workspace) |
208
- | **Cursor** | Convert | `SKILL.md` → `.mdc` rules in `.cursor/rules/` |
209
- | **Windsurf** | Convert | `SKILL.md` rules in `.windsurf/rules/` |
204
+ | **Cursor** | Native | Copy `skills/.` contents to `.cursor/skills/` |
205
+ | **Windsurf** | Native | Copy `skills/.` contents to `.windsurf/skills/` |
210
206
  | **Aider** | Convert | `SKILL.md` → conventions file |
211
207
 
212
- Native platforms read `SKILL.md` as-is. Convert platforms need a one-time format conversion — the content is the same, only the file format differs. `ba-toolkit install --for cursor|windsurf` does this automatically.
208
+ All five officially supported platforms read `SKILL.md` as-is, with no conversion. `ba-toolkit install --for <agent>` lands skills directly in the agent's native skills root.
213
209
 
214
210
  Skills do not hardcode platform paths — they reference `skills/references/environment.md`, which contains the output directory logic for each platform. Edit that file to customize; all skills pick up the change automatically.
215
211
 
package/bin/ba-toolkit.js CHANGED
@@ -18,57 +18,55 @@ const PACKAGE_ROOT = path.resolve(__dirname, '..');
18
18
  const SKILLS_DIR = path.join(PACKAGE_ROOT, 'skills');
19
19
  const PKG = JSON.parse(fs.readFileSync(path.join(PACKAGE_ROOT, 'package.json'), 'utf8'));
20
20
 
21
- // In v2.0 the install paths dropped the previous `ba-toolkit/` wrapper
22
- // directory. Claude Code, Codex CLI, and Gemini CLI all expect skills
23
- // to be discoverable as direct subfolders of their skills root
24
- // `.claude/skills/<skill-name>/SKILL.md`, not nested one level deeper.
25
- // The wrapper made all 21 skills invisible to every agent.
21
+ // All five supported agents (Claude Code, Codex CLI, Gemini CLI,
22
+ // Cursor, and Windsurf) load Agent Skills as direct subfolders of
23
+ // their skills root: `<skills-root>/<skill-name>/SKILL.md`. The toolkit
24
+ // installs the 21 skills natively in this layout for every agent. No
25
+ // .mdc conversion. Confirmed against the Agent Skills documentation
26
+ // for each platform via ctx7 MCP / official docs.
26
27
  //
27
- // Cursor and Windsurf load `.mdc` rule files directly from their rules
28
- // root, so v2.0 also flattens that layout: the per-skill subfolders
29
- // produced by the previous version are gone, and rules sit at
30
- // `.cursor/rules/<skill-name>.mdc`.
28
+ // Earlier versions tried to install Cursor and Windsurf via `.mdc`
29
+ // rules under `.cursor/rules/` and `.windsurf/rules/`, but Rules and
30
+ // Agent Skills are two separate features in both editors, and the
31
+ // toolkit is a pipeline of skills, not rules. The wrong-feature install
32
+ // silently failed: skills loaded as rules never surfaced as `/brief`,
33
+ // `/srs`, … slash commands. v3.0 corrects this for both Cursor and
34
+ // Windsurf by installing into their native skills roots.
31
35
  //
32
- // To stay safe sharing the skills root with the user's other skills /
33
- // rules, every install also drops a `.ba-toolkit-manifest.json` next to
34
- // the installed items. uninstall and upgrade read this manifest to
35
- // remove only what the toolkit owns; without it they refuse to touch
36
- // anything.
36
+ // To stay safe sharing the skills root with the user's other skills,
37
+ // every install also drops a `.ba-toolkit-manifest.json` next to the
38
+ // installed items. uninstall and upgrade read this manifest to remove
39
+ // only what the toolkit owns; without it they refuse to touch anything.
37
40
  const AGENTS = {
38
41
  'claude-code': {
39
42
  name: 'Claude Code',
40
43
  projectPath: '.claude/skills',
41
44
  globalPath: path.join(os.homedir(), '.claude', 'skills'),
42
- format: 'skill',
43
45
  restartHint: 'Restart Claude Code to load the new skills.',
44
46
  },
45
47
  codex: {
46
48
  name: 'OpenAI Codex CLI',
47
49
  projectPath: null, // Codex uses only global
48
50
  globalPath: path.join(process.env.CODEX_HOME || path.join(os.homedir(), '.codex'), 'skills'),
49
- format: 'skill',
50
51
  restartHint: 'Restart the Codex CLI to load the new skills.',
51
52
  },
52
53
  gemini: {
53
54
  name: 'Google Gemini CLI',
54
55
  projectPath: '.gemini/skills',
55
56
  globalPath: path.join(os.homedir(), '.gemini', 'skills'),
56
- format: 'skill',
57
57
  restartHint: 'Reload Gemini CLI to pick up the new skills.',
58
58
  },
59
59
  cursor: {
60
60
  name: 'Cursor',
61
- projectPath: '.cursor/rules',
62
- globalPath: null, // Cursor rules are project-scoped
63
- format: 'mdc',
64
- restartHint: 'Reload the Cursor window to apply new rules.',
61
+ projectPath: '.cursor/skills',
62
+ globalPath: null, // Cursor skills are project-scoped for now
63
+ restartHint: 'Reload the Cursor window to apply new skills.',
65
64
  },
66
65
  windsurf: {
67
66
  name: 'Windsurf',
68
- projectPath: '.windsurf/rules',
69
- globalPath: null,
70
- format: 'mdc',
71
- restartHint: 'Reload the Windsurf window to apply new rules.',
67
+ projectPath: '.windsurf/skills',
68
+ globalPath: null, // Windsurf skills are project-scoped for now
69
+ restartHint: 'Reload the Windsurf window to apply new skills.',
72
70
  },
73
71
  };
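To make the `projectPath` / `globalPath` convention concrete, here is a minimal sketch of how a destination directory could be chosen from an `AGENTS` entry. The helper name `resolveSkillsRoot` and its `scope` parameter are assumptions for illustration; the CLI's own resolution logic is not shown in this diff and may work differently.

```javascript
const os = require('node:os');
const path = require('node:path');

// Two entries copied in shape from the AGENTS table above.
const AGENTS = {
  codex: {
    projectPath: null, // global only
    globalPath: path.join(process.env.CODEX_HOME || path.join(os.homedir(), '.codex'), 'skills'),
  },
  cursor: { projectPath: '.cursor/skills', globalPath: null }, // project only
};

// Hypothetical helper: pick the directory an install would write to.
// Falls back to the only scope an agent supports when the requested one is null.
function resolveSkillsRoot(agentId, { scope = 'project', cwd = process.cwd() } = {}) {
  const agent = AGENTS[agentId];
  if (!agent) throw new Error(`unknown agent: ${agentId}`);
  const wantProject = scope === 'project' && agent.projectPath !== null;
  if (wantProject || agent.globalPath === null) {
    return path.resolve(cwd, agent.projectPath);
  }
  return agent.globalPath;
}
```

With this shape, a `null` path simply means "this agent has no such scope", so no separate capability flag is needed.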
74
72
 
@@ -85,6 +83,21 @@ const DOMAINS = [
85
83
  { id: 'custom', name: 'Custom', desc: 'Any other domain — general interview questions' },
86
84
  ];
87
85
 
86
+ // ASCII banner shown at the top of `ba-toolkit init`. Suppressed on
87
+ // non-TTY stdout so it doesn't end up in CI logs or piped output.
88
+ // Stored as an array of literal lines (not a template literal) so the
89
+ // `$` characters stay out of any interpolation path.
90
+ const BANNER = [
91
+ ' /$$ /$$ /$$ /$$ /$$ /$$ ',
92
+ '| $$ | $$ | $$| $$ |__/ | $$ ',
93
+ '| $$$$$$$ /$$$$$$ /$$$$$$ /$$$$$$ /$$$$$$ | $$| $$ /$$ /$$ /$$$$$$ ',
94
+ '| $$__ $$ |____ $$ /$$$$$$|_ $$_/ /$$__ $$ /$$__ $$| $$| $$ /$$/| $$|_ $$_/ ',
95
+ '| $$ \\ $$ /$$$$$$$|______/ | $$ | $$ \\ $$| $$ \\ $$| $$| $$$$$$/ | $$ | $$ ',
96
+ '| $$ | $$ /$$__ $$ | $$ /$$| $$ | $$| $$ | $$| $$| $$_ $$ | $$ | $$ /$$',
97
+ '| $$$$$$$/| $$$$$$$ | $$$$/| $$$$$$/| $$$$$$/| $$| $$ \\ $$| $$ | $$$$/',
98
+ '|_______/ \\_______/ \\___/ \\______/ \\______/ |__/|__/ \\__/|__/ \\___/ ',
99
+ ];
100
+
88
101
  // --- Terminal helpers --------------------------------------------------
89
102
 
90
103
  const NO_COLOR = !!process.env.NO_COLOR || !process.stdout.isTTY;
@@ -99,6 +112,17 @@ const bold = colour(1);
99
112
  function log(...args) { console.log(...args); }
100
113
  function logError(...args) { console.error(red('error:'), ...args); }
101
114
 
115
+ // Print the BANNER to stdout if — and only if — stdout is a real TTY.
116
+ // Piped / redirected runs (CI, test spawn, `ba-toolkit init | tee ...`)
117
+ // get a clean log without the 8-line block. The banner is decorative,
118
+ // not load-bearing, so suppressing it in non-interactive contexts is
119
+ // the right default.
120
+ function printBanner() {
121
+ if (!process.stdout.isTTY) return;
122
+ for (const line of BANNER) log(cyan(line));
123
+ log('');
124
+ }
125
+
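The `isTTY` guard is easier to verify when the output stream is passed in rather than read from `process.stdout`. The sketch below shows that variant; `printBannerTo` and `fakeStream` are hypothetical names, and the shipped `printBanner()` takes no arguments.

```javascript
// Hypothetical variant of printBanner() that takes the stream as a parameter
// so the TTY guard can be exercised without a real terminal.
function printBannerTo(stream, lines) {
  if (!stream.isTTY) return false; // piped, redirected, or captured output
  for (const line of lines) stream.write(line + '\n');
  stream.write('\n'); // blank line after the banner
  return true;
}

// Minimal fake stream that records what was written to it.
function fakeStream(isTTY) {
  const chunks = [];
  return { isTTY, write: (text) => chunks.push(text), chunks };
}
```

A non-TTY fake stream should end up with zero recorded writes, which is the property the regression test mentioned in the changelog checks against the real CLI.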
102
126
  // --- Arg parsing -------------------------------------------------------
103
127
 
104
128
  function parseArgs(argv) {
@@ -140,30 +164,66 @@ function parseArgs(argv) {
140
164
  // --- Prompt helper -----------------------------------------------------
141
165
 
142
166
  // Shared across all prompts in a single CLI invocation. Creating a new
143
- // readline.Interface for every question (the previous approach) made Ctrl+C
167
+ // readline.Interface for every question (the earlier approach) made Ctrl+C
144
168
  // handling unreliable, leaked listeners on stdin, and broke when stdin was
145
- // piped (EOF on the second create). One interface per process, closed by
146
- // closeReadline() once main() finishes (or by the SIGINT handler).
169
+ // piped. One interface per process, closed by closeReadline() once main()
170
+ // finishes (or by the SIGINT handler).
171
+ //
172
+ // prompt() does NOT use `rl.question(...)` — that method races with
173
+ // readline's internal line buffering when stdin is piped. If input arrives
174
+ // faster than prompts are issued (the common piped case: the user pipes a
175
+ // here-doc with multiple answers, or a test feeds the entire stdin buffer
176
+ // upfront), readline emits 'line' events before the question listener is
177
+ // attached and those lines are silently dropped. The second prompt then
178
+ // sees EOF and errors with INPUT_CLOSED despite the answer actually being
179
+ // in the buffer.
180
+ //
181
+ // Instead we own the 'line' event ourselves and keep a line queue: every
182
+ // line that arrives is pushed onto `lineQueue` if no one is waiting, or
183
+ // delivered directly to the oldest waiter. A prompt() call takes the head
184
+ // of the queue if non-empty, otherwise parks a waiter. The 'close' event
185
+ // drains all waiting waiters with INPUT_CLOSED.
147
186
  let sharedRl = null;
187
+ const lineQueue = [];
188
+ const waiters = [];
189
+ let inputClosed = false;
190
+
191
+ function ensureReadline() {
192
+ if (sharedRl) return;
193
+ sharedRl = readline.createInterface({ input: process.stdin, output: process.stdout });
194
+ sharedRl.on('line', (line) => {
195
+ if (waiters.length > 0) {
196
+ waiters.shift().resolve(line);
197
+ } else {
198
+ lineQueue.push(line);
199
+ }
200
+ });
201
+ sharedRl.on('close', () => {
202
+ inputClosed = true;
203
+ while (waiters.length > 0) {
204
+ const err = new Error('input stream closed before answer');
205
+ err.code = 'INPUT_CLOSED';
206
+ waiters.shift().reject(err);
207
+ }
208
+ });
209
+ }
148
210
 
149
211
  function prompt(question) {
150
- if (!sharedRl) {
151
- sharedRl = readline.createInterface({ input: process.stdin, output: process.stdout });
212
+ ensureReadline();
213
+ // Render the question ourselves, since we're not using rl.question().
214
+ process.stdout.write(question);
215
+ if (lineQueue.length > 0) {
216
+ return Promise.resolve(String(lineQueue.shift()).trim());
217
+ }
218
+ if (inputClosed) {
219
+ const err = new Error('input stream closed before answer');
220
+ err.code = 'INPUT_CLOSED';
221
+ return Promise.reject(err);
152
222
  }
153
223
  return new Promise((resolve, reject) => {
154
- let answered = false;
155
- const onClose = () => {
156
- if (!answered) {
157
- const err = new Error('input stream closed before answer');
158
- err.code = 'INPUT_CLOSED';
159
- reject(err);
160
- }
161
- };
162
- sharedRl.once('close', onClose);
163
- sharedRl.question(question, (answer) => {
164
- answered = true;
165
- sharedRl.removeListener('close', onClose);
166
- resolve(answer.trim());
224
+ waiters.push({
225
+ resolve: (line) => resolve(String(line).trim()),
226
+ reject,
167
227
  });
168
228
  });
169
229
  }
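The queue-plus-waiters pattern in the new `prompt()` can be studied in isolation from readline. The sketch below mirrors the logic of the hunk above in a small class. `LineQueue` and `closedError` are hypothetical names; the CLI keeps the same state in module-level variables (`lineQueue`, `waiters`, `inputClosed`).

```javascript
// Sketch of the line queue, detached from readline so it runs on its own.
class LineQueue {
  constructor() {
    this.lines = [];   // lines that arrived before anyone asked
    this.waiters = []; // prompts parked until a line arrives
    this.closed = false;
  }

  // Call for every 'line' event.
  push(line) {
    if (this.waiters.length > 0) this.waiters.shift().resolve(line);
    else this.lines.push(line);
  }

  // Call on 'close': anyone still waiting is rejected with INPUT_CLOSED.
  close() {
    this.closed = true;
    while (this.waiters.length > 0) this.waiters.shift().reject(closedError());
  }

  // Call once per prompt: buffered line first, otherwise park a waiter.
  next() {
    if (this.lines.length > 0) return Promise.resolve(String(this.lines.shift()).trim());
    if (this.closed) return Promise.reject(closedError());
    return new Promise((resolve, reject) => {
      this.waiters.push({ resolve: (line) => resolve(String(line).trim()), reject });
    });
  }
}

function closedError() {
  const err = new Error('input stream closed before answer');
  err.code = 'INPUT_CLOSED';
  return err;
}
```

The key property is ordering: lines pushed before the first `next()` call are still delivered, which is exactly the piped-input case that `rl.question()` dropped.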
@@ -175,6 +235,213 @@ function closeReadline() {
175
235
  }
176
236
  }
177
237
 
238
+ // --- Arrow-key menus -----------------------------------------------------
239
+ //
240
+ // Three layers, separated for testability:
241
+ //
242
+ // 1. menuStep(state, key) — pure state machine. Given the current
243
+ // menu state and a normalised key action, returns the new state.
244
+ // Unit-tested directly. No dependencies, no I/O.
245
+ //
246
+ // 2. renderMenu(state, opts) — pure renderer. Returns the frame to
247
+ // print as a string. Unit-tested too — uses the colour helpers,
248
+ // which collapse to identity strings under NO_COLOR (i.e., in
249
+ // tests), so the assertions are stable.
250
+ //
251
+ // 3. runMenuTty(items, opts) / selectMenu(items, opts) — the I/O
252
+ // glue. Detects TTY, sets raw mode, listens for keypress events,
253
+ // drives the loop, falls back to a numbered prompt under
254
+ // promptUntilValid when the terminal is non-interactive (CI,
255
+ // piped input, TERM=dumb). Not unit-tested — covered by manual
256
+ // smoke and the existing fallback-path integration tests.
257
+ //
258
+ // Cross-platform note: Node's `readline.emitKeypressEvents` decodes
259
+ // arrow-key escape sequences uniformly across bash/zsh/fish on
260
+ // Linux/macOS, Windows Terminal (PowerShell, cmd, WSL), Git Bash /
261
+ // MSYS2, and VSCode's integrated terminal. Modern Node also enables VT
262
+ // mode automatically on Windows when raw mode is requested, so legacy
263
+ // cmd.exe on Win10+ works too. The only environment we explicitly bail
264
+ // out of is `TERM=dumb` (emacs M-x shell, some IDE shells) — keypress
265
+ // decoding is unreliable there.
266
+
267
+ function menuStep(state, key) {
268
+ if (state.done) return state;
269
+ const len = state.items.length;
270
+ if (len === 0) return state;
271
+ switch (key) {
272
+ case 'up':
273
+ return { ...state, index: (state.index - 1 + len) % len };
274
+ case 'down':
275
+ return { ...state, index: (state.index + 1) % len };
276
+ case 'enter':
277
+ return { ...state, done: true, choice: state.items[state.index] };
278
+ case 'cancel':
279
+ return { ...state, done: true, choice: null };
280
+ default:
281
+ if (/^[0-9]$/.test(key)) {
282
+ const n = parseInt(key, 10);
283
+ if (n >= 1 && n <= len) {
284
+ return { ...state, index: n - 1 };
285
+ }
286
+ }
287
+ return state;
288
+ }
289
+ }
290
+
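Because `menuStep` is a pure function, a whole key sequence can be replayed with a single `reduce`. The sketch below includes a condensed copy of `menuStep` so that it runs on its own; `replayKeys` is a hypothetical helper and the item labels are placeholders.

```javascript
// Condensed copy of menuStep() from the hunk above.
function menuStep(state, key) {
  if (state.done || state.items.length === 0) return state;
  const len = state.items.length;
  if (key === 'up') return { ...state, index: (state.index - 1 + len) % len };
  if (key === 'down') return { ...state, index: (state.index + 1) % len };
  if (key === 'enter') return { ...state, done: true, choice: state.items[state.index] };
  if (key === 'cancel') return { ...state, done: true, choice: null };
  if (/^[0-9]$/.test(key)) {
    const n = parseInt(key, 10);
    if (n >= 1 && n <= len) return { ...state, index: n - 1 };
  }
  return state;
}

// Hypothetical helper: replay a list of normalised key actions from the top.
function replayKeys(items, keys) {
  const initial = { items, index: 0, done: false, choice: null };
  return keys.reduce(menuStep, initial);
}

const items = [{ label: 'a' }, { label: 'b' }, { label: 'c' }];
console.log(replayKeys(items, ['up']).index);           // 2 (wraps to the last item)
console.log(replayKeys(items, ['2', 'enter']).choice);  // { label: 'b' }
console.log(replayKeys(items, ['9']).index);            // 0 (out-of-range digit ignored)
```

Once `done` is set, further keys are ignored, so trailing input after `enter` or `cancel` cannot change the choice.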
291
+ function renderMenu(state, { title } = {}) {
292
+ const lines = [];
293
+ if (title) {
294
+ lines.push(' ' + yellow(title));
295
+ lines.push('');
296
+ }
297
+ const labelWidth = Math.max(...state.items.map((it) => it.label.length));
298
+ state.items.forEach((item, i) => {
299
+ const selected = i === state.index;
300
+ const marker = selected ? cyan('>') : ' ';
301
+ const idx = String(i + 1).padStart(2);
302
+ const label = selected ? bold(item.label.padEnd(labelWidth)) : item.label.padEnd(labelWidth);
303
+ const desc = item.desc ? ' ' + gray('— ' + item.desc) : '';
304
+ lines.push(` ${marker} ${idx}) ${label}${desc}`);
305
+ });
306
+ lines.push('');
307
+ lines.push(' ' + gray('↑/↓ navigate · Enter select · 1-9 jump · Esc cancel'));
308
+ return lines.join('\n') + '\n';
309
+ }
310
+
311
+ // True when arrow-key menus are usable in this process. False under
312
+ // piped stdin/stdout, dumb terminals, or when raw mode is unavailable.
313
+ function isInteractiveTerminal() {
314
+ if (!process.stdin.isTTY) return false;
315
+ if (!process.stdout.isTTY) return false;
316
+ if (process.env.TERM === 'dumb') return false;
317
+ if (typeof process.stdin.setRawMode !== 'function') return false;
318
+ return true;
319
+ }
320
+
321
+ // TTY runner: drive the menu state machine via raw-mode keypress
322
+ // events. Returns the chosen item or null if the user cancelled.
323
+ // The caller is responsible for not invoking this when
324
+ // isInteractiveTerminal() is false.
325
+ function runMenuTty(items, { title } = {}) {
326
+ // The shared line-mode readline (used by `prompt()`) and a raw-mode
327
+ // keypress reader can't both own stdin at the same time. Close any
328
+ // line-mode interface before we take over; the next prompt() call
329
+ // will lazily recreate it via ensureReadline().
330
+ closeReadline();
331
+
332
+ return new Promise((resolve) => {
333
+ let state = { items, index: 0, done: false, choice: null };
334
+ let lastFrameLineCount = 0;
335
+
336
+ const render = () => {
337
+ // Erase the previous frame in place: move the cursor up over its
338
+ // line count, then clear from cursor to end of screen. First
339
+ // render has nothing to erase.
340
+ if (lastFrameLineCount > 0) {
341
+ process.stdout.write(`\x1b[${lastFrameLineCount}A\x1b[J`);
342
+ }
343
+ const frame = renderMenu(state, { title });
344
+ process.stdout.write(frame);
345
+ // Count lines actually printed (frame ends with a trailing \n).
346
+ lastFrameLineCount = frame.split('\n').length - 1;
347
+ };
348
+
349
+ const cleanup = () => {
350
+ process.stdin.removeListener('keypress', onKey);
351
+ try {
352
+ process.stdin.setRawMode(false);
353
+ } catch { /* setRawMode can throw if stdin is not a TTY anymore */ }
354
+ process.stdin.pause();
355
+ };
356
+
357
+ const onKey = (_str, key) => {
358
+ if (!key) return;
359
+ let action = null;
360
+ if (key.ctrl && key.name === 'c') action = 'cancel';
361
+ else if (key.name === 'escape') action = 'cancel';
362
+ else if (key.name === 'up' || key.name === 'k') action = 'up';
363
+ else if (key.name === 'down' || key.name === 'j') action = 'down';
364
+ else if (key.name === 'return') action = 'enter';
365
+ else if (key.sequence && /^[0-9]$/.test(key.sequence)) action = key.sequence;
366
+ if (!action) return;
367
+ state = menuStep(state, action);
368
+ if (state.done) {
369
+ cleanup();
370
+ resolve(state.choice);
371
+ } else {
372
+ render();
373
+ }
374
+ };
375
+
376
+ readline.emitKeypressEvents(process.stdin);
377
+ process.stdin.setRawMode(true);
378
+ process.stdin.resume();
379
+ process.stdin.on('keypress', onKey);
380
+ render();
381
+ });
382
+ }
383
+
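`runMenuTty` delegates every keypress to `menuStep`, whose body is not part of this hunk. As a rough illustration of what such a pure reducer could look like, here is a sketch under stated assumptions: the name `menuStepSketch`, the wrap-around cursor, and the "digit jumps to that item" behaviour are all guesses, and the package's real `menuStep` may behave differently.

```javascript
// Hypothetical reducer over { items, index, done, choice }.
// Assumptions (not confirmed by the source): the cursor wraps around,
// and a digit "1".."9" moves the cursor to that item without confirming.
function menuStepSketch(state, action) {
  if (state.done) return state;
  const { items, index } = state;
  const count = Math.max(items.length, 1); // avoid modulo by zero on an empty list
  switch (action) {
    case 'up':
      return { ...state, index: (index - 1 + count) % count };
    case 'down':
      return { ...state, index: (index + 1) % count };
    case 'enter':
      return { ...state, done: true, choice: items[index] ?? null };
    case 'cancel':
      return { ...state, done: true, choice: null };
    default: {
      const n = Number.parseInt(action, 10);
      if (Number.isInteger(n) && n >= 1 && n <= items.length) {
        return { ...state, index: n - 1 };
      }
      return state; // unknown action: leave the state unchanged
    }
  }
}

let s = { items: ['a', 'b', 'c'], index: 0, done: false, choice: null };
s = menuStepSketch(s, 'down');
s = menuStepSketch(s, 'enter');
console.log(s.choice); // b
```

Keeping the reducer free of I/O is what would allow it to be unit-tested without spawning a terminal, which is presumably why `menuStep` and `renderMenu` are exported.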
384
+ // Top-level selector: interactive arrow-key menu in real terminals,
385
+ // numbered prompt fallback everywhere else (CI, piped input, dumb
386
+ // TERM, editor/IDE shells). Always returns either an item from `items`
387
+ // or null on cancel.
388
+ async function selectMenu(items, { title, fallbackPrompt }) {
389
+ if (isInteractiveTerminal()) {
390
+ return await runMenuTty(items, { title });
391
+ }
392
+ // Non-TTY fallback: print the numbered list once, then prompt with
393
+ // promptUntilValid so a single typo doesn't kill the wizard.
394
+ log('');
395
+ if (title) log(' ' + yellow(title));
396
+ const labelWidth = Math.max(...items.map((it) => it.label.length));
397
+ items.forEach((item, i) => {
398
+ const idx = String(i + 1).padStart(2);
399
+ const desc = item.desc ? ' ' + gray('— ' + item.desc) : '';
400
+ log(` ${idx}) ${bold(item.label.padEnd(labelWidth))}${desc}`);
401
+ });
402
+ log('');
403
+ return await promptUntilValid(
404
+ fallbackPrompt,
405
+ (raw) => {
406
+ const trimmed = String(raw || '').toLowerCase().trim();
407
+ if (!trimmed) return null;
408
+ if (/^\d+$/.test(trimmed)) {
409
+ const n = parseInt(trimmed, 10);
410
+ return n >= 1 && n <= items.length ? items[n - 1] : null;
411
+ }
412
+ return items.find((it) => it.id === trimmed) || null;
413
+ },
414
+ { invalidMessage: `Invalid selection — pick a number between 1 and ${items.length} or an id.` },
415
+ );
416
+ }
417
+
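The non-TTY resolver passed to `promptUntilValid` accepts either a 1-based number or an item id. Pulled out as a standalone function it can be tried directly; `resolveSelection` and the two demo items are hypothetical names used only for this sketch.

```javascript
// Standalone copy of the fallback resolver logic shown above.
// Returns the matching item, or null so the caller can re-prompt.
function resolveSelection(raw, items) {
  const trimmed = String(raw || '').toLowerCase().trim();
  if (!trimmed) return null;
  if (/^\d+$/.test(trimmed)) {
    const n = parseInt(trimmed, 10);
    return n >= 1 && n <= items.length ? items[n - 1] : null;
  }
  return items.find((it) => it.id === trimmed) || null;
}

// Illustrative items; the real list comes from DOMAINS or AGENTS.
const demoItems = [
  { id: 'saas', label: 'SaaS' },
  { id: 'fintech', label: 'Fintech' },
];

console.log(resolveSelection('2', demoItems).id);      // fintech
console.log(resolveSelection(' SaaS ', demoItems).id); // saas
console.log(resolveSelection('9', demoItems));         // null
```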
418
+ // Ask the user `question`, run `resolver` on the trimmed answer, and
419
+ // loop while the resolver returns null/undefined. Prints a yellow
420
+ // "try again" message between attempts. Aborts with process.exit(1)
421
+ // after `maxAttempts` consecutive invalid answers so a piped input
422
+ // can't infinite-loop us.
423
+ //
424
+ // Previously, cmdInit called `resolveDomain` / `resolveAgent` /
425
+ // `sanitiseSlug` once and hard-failed on the first typo — users who
426
+ // mistyped "saass" lost the whole wizard and had to start over. With
427
+ // the retry loop, they just read the error and try again.
428
+ async function promptUntilValid(question, resolver, {
429
+ maxAttempts = 3,
430
+ invalidMessage = 'Invalid selection — try again.',
431
+ } = {}) {
432
+ for (let attempt = 1; attempt <= maxAttempts; attempt++) {
433
+ const raw = await prompt(question);
434
+ const result = resolver(raw);
435
+ if (result != null && result !== '') return result;
436
+ const remaining = maxAttempts - attempt;
437
+ if (remaining > 0) {
438
+ log(' ' + yellow(`${invalidMessage} (${remaining} attempt${remaining === 1 ? '' : 's'} left)`));
439
+ }
440
+ }
441
+ logError(`Too many invalid attempts — aborting.`);
442
+ process.exit(1);
443
+ }
444
+
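The retry loop can be tried in isolation by injecting the question function. This is a simplified sketch, not the package's code: `retryUntilValid` is a hypothetical name, it takes `ask` as a parameter instead of calling `prompt()`, and it returns `null` after the last attempt where the real function logs an error and calls `process.exit(1)`.

```javascript
// Simplified retry loop: ask, resolve, repeat until valid or attempts run out.
async function retryUntilValid(ask, resolver, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = resolver(await ask());
    if (result != null && result !== '') return result;
  }
  return null; // assumption: the real function aborts the process here instead
}

// Scripted "user" that mistypes once, then answers correctly.
const answers = ['saass', 'saas'];
const ask = async () => answers.shift();
const resolver = (raw) => (raw === 'saas' ? raw : null);

retryUntilValid(ask, resolver).then((value) => console.log(value)); // prints "saas"
```

Bounding the loop matters mostly for piped input: once stdin is exhausted every answer is empty, and an unbounded loop would never terminate.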
178
445
  // --- Utilities ---------------------------------------------------------
179
446
 
180
447
  function sanitiseSlug(input) {
@@ -352,8 +619,10 @@ function copyDirRecursive(src, dest, { dryRun, copied }) {
352
619
  // - Quoted scalars (single or double quoted) — names would keep quotes
353
620
  //
354
621
  // Returns { name, description, body }. `description` is always
355
- // flattened to a single line (whitespace collapsed) because the .mdc
356
- // rule format expects a one-line description.
622
+ // flattened to a single line (whitespace collapsed), which keeps the
623
+ // downstream consumers (manifest summary, status output, agent skill
624
+ // loaders that expect a single-line description) free of multi-line
625
+ // surprises.
357
626
  function parseSkillFrontmatter(content) {
358
627
  const fmMatch = content.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/);
359
628
  if (!fmMatch) {
@@ -402,34 +671,18 @@ function parseSkillFrontmatter(content) {
402
671
  };
403
672
  }
404
673
 
405
- // Transform a SKILL.md file's contents to the Cursor/Windsurf .mdc rule
406
- // format: replace the YAML frontmatter with the two fields the rule
407
- // loader expects (description, alwaysApply), keep the body unchanged.
408
- function skillToMdcContent(content) {
409
- const { description, body } = parseSkillFrontmatter(content);
410
- return `---\ndescription: ${description}\nalwaysApply: false\n---\n\n` + body;
411
- }
412
-
413
- // Install the package's skills/ tree into the given destination, picking
414
- // the layout the target agent expects.
415
- //
416
- // For 'skill' format (Claude Code, Codex, Gemini): each source skill
417
- // folder lands as `<destRoot>/<skillName>/SKILL.md`. The references/
674
+ // Install the package's skills/ tree into the given destination. Every
675
+ // supported agent uses the same Agent Skills layout: each source skill
676
+ // folder lands as `<destRoot>/<skillName>/SKILL.md`. The `references/`
418
677
  // folder is copied as-is to `<destRoot>/references/`.
419
678
  //
420
- // For 'mdc' format (Cursor, Windsurf): each source skill folder is
421
- // flattened to a single `<destRoot>/<skillName>.mdc` file containing
422
- // the transformed content. References still go to `<destRoot>/references/`
423
- // — non-.mdc files there are ignored by the rule loaders, but the LLM
424
- // can still find them at runtime via the Read tool.
425
- //
426
679
  // Skill names come from the SKILL.md `name:` frontmatter field, falling
427
680
  // back to the source folder name. Returns:
428
681
  // { copied, items }
429
682
  // where `copied` is the list of absolute file paths written and `items`
430
683
  // is the list of top-level entries in destRoot that the toolkit owns
431
684
  // (used to write the manifest).
432
- function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
685
+ function copySkills(srcRoot, destRoot, { dryRun = false } = {}) {
433
686
  if (!fs.existsSync(srcRoot)) {
434
687
  throw new Error(`Source directory not found: ${srcRoot}`);
435
688
  }
@@ -456,17 +709,9 @@ function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
456
709
  const { name } = parseSkillFrontmatter(content);
457
710
  const skillName = name || entry.name;
458
711
 
459
- if (format === 'mdc') {
460
- const transformed = skillToMdcContent(content);
461
- const destFile = path.join(destRoot, `${skillName}.mdc`);
462
- if (!dryRun) fs.writeFileSync(destFile, transformed);
463
- copied.push(destFile);
464
- items.push(`${skillName}.mdc`);
465
- } else {
466
- const skillDestDir = path.join(destRoot, skillName);
467
- copyDirRecursive(srcPath, skillDestDir, { dryRun, copied });
468
- items.push(skillName);
469
- }
712
+ const skillDestDir = path.join(destRoot, skillName);
713
+ copyDirRecursive(srcPath, skillDestDir, { dryRun, copied });
714
+ items.push(skillName);
470
715
  }
471
716
 
472
717
  return { copied, items };
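With the `.mdc` branch removed, every agent gets the same folder-per-skill layout. The sketch below only computes the expected destination paths; `plannedSkillFiles` is a hypothetical helper, not something the package exports, and it uses `path.posix` so the printed output is the same on every platform.

```javascript
const path = require('node:path');

// Hypothetical helper: where each skill's SKILL.md lands under destRoot
// in the folder-per-skill layout described above.
function plannedSkillFiles(destRoot, skillNames) {
  return skillNames.map((name) => path.posix.join(destRoot, name, 'SKILL.md'));
}

console.log(plannedSkillFiles('.cursor/skills', ['brief', 'srs']));
// [ '.cursor/skills/brief/SKILL.md', '.cursor/skills/srs/SKILL.md' ]
```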
@@ -486,6 +731,13 @@ function copySkills(srcRoot, destRoot, { format, dryRun = false }) {
486
731
  // 21-skill list and ordering; keep that template in sync with it.
487
732
  const AGENTS_TEMPLATE_PATH = path.join(SKILLS_DIR, 'references', 'templates', 'agents-template.md');
488
733
 
734
+ // Anchor markers delimit the block inside AGENTS.md that `ba-toolkit
735
+ // init` owns and is allowed to rewrite on re-init. Everything outside
736
+ // the anchors (Pipeline Status, Key Constraints, Open Questions, user
737
+ // notes) is preserved untouched. See agents-template.md.
738
+ const AGENTS_MANAGED_BEGIN = '<!-- ba-toolkit:begin managed -->';
739
+ const AGENTS_MANAGED_END = '<!-- ba-toolkit:end managed -->';
740
+
489
741
  function renderAgentsMd({ name, slug, domain }) {
490
742
  let template;
491
743
  try {
@@ -500,9 +752,50 @@ function renderAgentsMd({ name, slug, domain }) {
500
752
  .replace(/\[DATE\]/g, today());
501
753
  }
502
754
 
755
+ // Merge the fresh AGENTS.md content into whatever already exists at
756
+ // the project root. Three branches:
757
+ //
758
+ // 1. No existing file (existing == null) — return the fresh template,
759
+ // action 'created'.
760
+ // 2. Existing file has both anchor markers — replace only the managed
761
+ // block content between the anchors, leave the rest of the file
762
+ // (Pipeline Status, Key Constraints, user notes) untouched. Action
763
+ // 'merged'.
764
+ // 3. Existing file has no anchors — it's either a legacy AGENTS.md
765
+ // from a pre-merge version of the toolkit or a fully user-authored
766
+ // file. Leave it untouched and return { action: 'preserved' } so
767
+ // the caller can print a note. We never silently overwrite
768
+ // user content.
769
+ //
770
+ // Pure function for easy testing. Exported so test/cli.test.js can
771
+ // cover all three branches without spawning a process.
772
+ function mergeAgentsMd(existing, ctx) {
773
+ const fresh = renderAgentsMd(ctx);
774
+ if (existing == null) {
775
+ return { content: fresh, action: 'created' };
776
+ }
777
+ const beginIdx = existing.indexOf(AGENTS_MANAGED_BEGIN);
778
+ const endIdx = existing.indexOf(AGENTS_MANAGED_END);
779
+ if (beginIdx === -1 || endIdx === -1 || endIdx < beginIdx) {
780
+ return { content: existing, action: 'preserved' };
781
+ }
782
+ const freshBeginIdx = fresh.indexOf(AGENTS_MANAGED_BEGIN);
783
+ const freshEndIdx = fresh.indexOf(AGENTS_MANAGED_END);
784
+ if (freshBeginIdx === -1 || freshEndIdx === -1) {
785
+ // Template is broken — fall back to returning fresh. Should be
786
+ // caught in unit tests if the template file ever loses its anchors.
787
+ return { content: fresh, action: 'created' };
788
+ }
789
+ const freshManaged = fresh.slice(freshBeginIdx, freshEndIdx + AGENTS_MANAGED_END.length);
790
+ const before = existing.slice(0, beginIdx);
791
+ const after = existing.slice(endIdx + AGENTS_MANAGED_END.length);
792
+ return { content: before + freshManaged + after, action: 'merged' };
793
+ }
794
+
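The anchor-based merge can be seen end to end with small inline strings. This sketch mirrors the splice logic above but takes the fresh content as an argument instead of rendering a template; `spliceManaged` and the sample documents are hypothetical, and the broken-template fallback is omitted for brevity.

```javascript
const BEGIN = '<!-- ba-toolkit:begin managed -->';
const END = '<!-- ba-toolkit:end managed -->';

// Replace only the managed block of `existing` with the one from `fresh`.
// Assumes `fresh` contains both anchors (the real code guards against that too).
function spliceManaged(existing, fresh) {
  const beginIdx = existing.indexOf(BEGIN);
  const endIdx = existing.indexOf(END);
  if (beginIdx === -1 || endIdx === -1 || endIdx < beginIdx) {
    return { content: existing, action: 'preserved' };
  }
  const managed = fresh.slice(fresh.indexOf(BEGIN), fresh.indexOf(END) + END.length);
  const before = existing.slice(0, beginIdx);
  const after = existing.slice(endIdx + END.length);
  return { content: before + managed + after, action: 'merged' };
}

const existing = `# Ctx\n${BEGIN}\nold block\n${END}\n## Notes\nmine\n`;
const fresh = `# Ctx\n${BEGIN}\nnew block\n${END}\n## Pipeline Status\n`;

const merged = spliceManaged(existing, fresh);
console.log(merged.action);                      // merged
console.log(merged.content.includes('mine'));    // true  (user notes kept)
console.log(merged.content.includes('old block')); // false (managed block replaced)
```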
503
795
  // --- Commands ----------------------------------------------------------
504
796
 
505
797
  async function cmdInit(args) {
798
+ printBanner();
506
799
  log('');
507
800
  log(' ' + cyan('BA Toolkit — New Project Setup'));
508
801
  log(' ' + cyan('================================'));
@@ -537,20 +830,43 @@ async function cmdInit(args) {
537
830
  }
538
831
  slug = derived;
539
832
  } else if (derived) {
540
- const custom = await prompt(` Project slug [${cyan(derived)}]: `);
541
- slug = custom || derived;
833
+ // Default branch: the derived slug is offered as the suggested
834
+ // answer. Empty input accepts the suggestion; anything the user
835
+ // types is run through sanitiseSlug and must produce something
836
+ // non-empty — otherwise re-prompt.
837
+ slug = await promptUntilValid(
838
+ ` Project slug [${cyan(derived)}]: `,
839
+ (raw) => {
840
+ const typed = String(raw || '').trim();
841
+ if (!typed) return derived;
842
+ const cleaned = sanitiseSlug(typed);
843
+ return cleaned || null;
844
+ },
845
+ { invalidMessage: 'Invalid slug — must produce at least one ASCII letter/digit after sanitisation.' },
846
+ );
542
847
  } else {
543
848
  log(' ' + gray(`(could not derive a slug from "${name}" — please type one manually)`));
544
- slug = await prompt(' Project slug (lowercase, hyphens only): ');
849
+ slug = await promptUntilValid(
850
+ ' Project slug (lowercase, hyphens only): ',
851
+ (raw) => {
852
+ const cleaned = sanitiseSlug(String(raw || '').trim());
853
+ return cleaned || null;
854
+ },
855
+ { invalidMessage: 'Invalid slug — must contain at least one ASCII letter or digit.' },
856
+ );
545
857
  }
546
858
  }
859
+ // At this point `slug` is already a sanitised, non-empty string from
860
+ // one of the branches above. The final sanitiseSlug call is a
861
+ // defensive no-op for the flag path (--slug) where we haven't
862
+ // cleaned it yet.
547
863
  slug = sanitiseSlug(slug);
548
864
  if (!slug) {
549
865
  logError('Invalid or empty slug.');
550
866
  process.exit(1);
551
867
  }
552
868
 
553
- // --- 3. Domain (numbered menu) ---
869
+ // --- 3. Domain (arrow menu in TTY, numbered fallback elsewhere) ---
554
870
  const domainFlag = stringFlag(args, 'domain');
555
871
  let domain;
556
872
  if (domainFlag) {
@@ -561,20 +877,18 @@ async function cmdInit(args) {
561
877
  process.exit(1);
562
878
  }
563
879
  } else {
564
- log('');
565
- log(' ' + yellow('Pick a domain:'));
566
- const domainNameWidth = Math.max(...DOMAINS.map((d) => d.name.length));
567
- DOMAINS.forEach((d, i) => {
568
- const idx = String(i + 1).padStart(2);
569
- log(` ${idx}) ${bold(d.name.padEnd(domainNameWidth))} ${gray('— ' + d.desc)}`);
570
- });
571
- log('');
572
- const raw = await prompt(` Select [1-${DOMAINS.length}]: `);
573
- domain = resolveDomain(raw);
574
- if (!domain) {
575
- logError(`Invalid selection: ${raw || '(empty)'}`);
576
- process.exit(1);
880
+ const chosen = await selectMenu(
881
+ DOMAINS.map((d) => ({ id: d.id, label: d.name, desc: d.desc })),
882
+ {
883
+ title: 'Pick a domain:',
884
+ fallbackPrompt: ` Select [1-${DOMAINS.length}]: `,
885
+ },
886
+ );
887
+ if (chosen == null) {
888
+ log(' ' + yellow('Cancelled.'));
889
+ process.exit(130);
577
890
  }
891
+ domain = chosen.id;
578
892
  }
579
893
 
580
894
  // --- 4. Agent (numbered menu), unless --no-install ---
@@ -590,21 +904,19 @@ async function cmdInit(args) {
590
904
  process.exit(1);
591
905
  }
592
906
  } else {
593
- log('');
594
- log(' ' + yellow('Pick your AI agent:'));
595
907
  const agentEntries = Object.entries(AGENTS);
596
- const agentNameWidth = Math.max(...agentEntries.map(([, a]) => a.name.length));
597
- agentEntries.forEach(([id, a], i) => {
598
- const idx = String(i + 1).padStart(2);
599
- log(` ${idx}) ${bold(a.name.padEnd(agentNameWidth))} ${gray('(' + id + ')')}`);
600
- });
601
- log('');
602
- const raw = await prompt(` Select [1-${agentEntries.length}]: `);
603
- agentId = resolveAgent(raw);
604
- if (!agentId) {
605
- logError(`Invalid selection: ${raw || '(empty)'}`);
606
- process.exit(1);
908
+ const chosen = await selectMenu(
909
+ agentEntries.map(([id, a]) => ({ id, label: a.name, desc: '(' + id + ')' })),
910
+ {
911
+ title: 'Pick your AI agent:',
912
+ fallbackPrompt: ` Select [1-${agentEntries.length}]: `,
913
+ },
914
+ );
915
+ if (chosen == null) {
916
+ log(' ' + yellow('Cancelled.'));
917
+ process.exit(130);
607
918
  }
919
+ agentId = chosen.id;
608
920
  }
609
921
  }
610
922
 
@@ -620,18 +932,23 @@ async function cmdInit(args) {
620
932
  log(` exists ${outputDir}`);
621
933
  }
622
934
 
935
+ // AGENTS.md: merge-on-reinit instead of overwrite. Everything outside
936
+ // the managed block (Pipeline Status, Key Constraints, user notes) is
937
+ // preserved. See mergeAgentsMd for the three branches (created,
938
+ // merged, preserved).
623
939
  const agentsPath = 'AGENTS.md';
624
- let writeAgents = true;
625
- if (fs.existsSync(agentsPath)) {
626
- const answer = await prompt(' AGENTS.md already exists. Overwrite? (y/N): ');
627
- if (answer.toLowerCase() !== 'y') {
628
- writeAgents = false;
629
- log(' skipped AGENTS.md');
630
- }
631
- }
632
- if (writeAgents) {
633
- fs.writeFileSync(agentsPath, renderAgentsMd({ name, slug, domain }));
634
- log(' created AGENTS.md');
940
+ const existingAgents = fs.existsSync(agentsPath)
941
+ ? fs.readFileSync(agentsPath, 'utf8')
942
+ : null;
943
+ const { content: agentsContent, action: agentsAction } = mergeAgentsMd(
944
+ existingAgents,
945
+ { name, slug, domain },
946
+ );
947
+ if (agentsAction === 'preserved') {
948
+ log(' ' + gray('preserved AGENTS.md (no ba-toolkit managed block — left untouched)'));
949
+ } else {
950
+ fs.writeFileSync(agentsPath, agentsContent);
951
+ log(` ${agentsAction === 'merged' ? 'updated ' : 'created '} AGENTS.md`);
635
952
  }
636
953
 
637
954
  // --- 6. Install skills for the selected agent ---
@@ -683,8 +1000,8 @@ async function cmdInit(args) {
683
1000
  // only what we own without touching the user's other skills sitting in
684
1001
  // the same directory.
685
1002
  //
686
- // Hidden filename with no `.md` / `.mdc` extension so the skill loader
687
- // of every supported agent ignores it.
1003
+ // Hidden filename with no `.md` extension so the skill loader of every
1004
+ // supported agent ignores it.
688
1005
  const MANIFEST_FILENAME = '.ba-toolkit-manifest.json';
689
1006
 
690
1007
  function readManifest(destDir) {
@@ -697,11 +1014,10 @@ function readManifest(destDir) {
697
1014
  }
698
1015
  }
699
1016
 
700
- function writeManifest(destDir, format, items) {
1017
+ function writeManifest(destDir, items) {
701
1018
  const payload = {
702
1019
  version: PKG.version,
703
1020
  installedAt: new Date().toISOString(),
704
- format,
705
1021
  items,
706
1022
  };
707
1023
  fs.writeFileSync(
@@ -755,7 +1071,7 @@ async function runInstall({ agentId, isGlobal, isProject, dryRun, showHeader = t
755
1071
  log(` source: ${SKILLS_DIR}`);
756
1072
  log(` destination: ${destDir}`);
757
1073
  log(` scope: ${effectiveGlobal ? 'global (user-wide)' : 'project-level'}`);
758
- log(` format: ${agent.format === 'mdc' ? '.mdc (converted from SKILL.md)' : 'SKILL.md (native)'}`);
1074
+ log(` format: SKILL.md (native)`);
759
1075
  if (dryRun) log(' ' + yellow('mode: dry-run (no files will be written)'));
760
1076
 
761
1077
  // Warn about a v1.x wrapper folder if one is sitting in the same
@@ -784,29 +1100,25 @@ async function runInstall({ agentId, isGlobal, isProject, dryRun, showHeader = t
784
1100
 
785
1101
  let result;
786
1102
  try {
787
- result = copySkills(SKILLS_DIR, destDir, { format: agent.format, dryRun });
1103
+ result = copySkills(SKILLS_DIR, destDir, { dryRun });
788
1104
  } catch (err) {
789
1105
  logError(err.message);
790
1106
  process.exit(1);
791
1107
  }
792
1108
 
793
1109
  if (!dryRun) {
794
- writeManifest(destDir, agent.format, result.items);
1110
+ writeManifest(destDir, result.items);
795
1111
  }
796
1112
 
797
1113
  log(' ' + green(`${dryRun ? 'would copy' : 'copied'} ${result.copied.length} files (${result.items.length} items).`));
798
- if (!dryRun && agent.format === 'mdc') {
799
- log(' ' + gray('SKILL.md files converted to .mdc rule format.'));
800
- }
801
1114
  return true;
802
1115
  }
803
1116
 
804
1117
  // Remove every item listed in the given manifest from destDir, then
805
- // remove the manifest file itself. Items are paths relative to destDir:
806
- // for 'skill' format they're folder names (`brief`, `srs`, ...,
807
- // `references`), for 'mdc' they're file names (`brief.mdc`, ...,
808
- // plus the `references` folder). Anything not in the manifest is left
809
- // alone, including the user's other skills/rules in the same directory.
1118
+ // remove the manifest file itself. Items are top-level entries
1119
+ // relative to destDir: folder names like `brief`, `srs`, …,
1120
+ // `references`. Anything not in the manifest is left alone, including
1121
+ // the user's other skills sitting in the same directory.
810
1122
  function removeManifestItems(destDir, manifest) {
811
1123
  for (const item of manifest.items) {
812
1124
  const p = path.join(destDir, item);
@@ -1264,10 +1576,12 @@ module.exports = {
1264
1576
  levenshtein,
1265
1577
  closestMatch,
1266
1578
  parseSkillFrontmatter,
1267
- skillToMdcContent,
1268
1579
  readManifest,
1269
1580
  detectLegacyInstall,
1270
1581
  renderAgentsMd,
1582
+ mergeAgentsMd,
1583
+ menuStep,
1584
+ renderMenu,
1271
1585
  KNOWN_FLAGS,
1272
1586
  DOMAINS,
1273
1587
  AGENTS,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@kudusov.takhir/ba-toolkit",
3
- "version": "2.0.0",
3
+ "version": "3.0.0",
4
4
  "description": "AI-powered Business Analyst pipeline — 21 skills from project brief to development handoff. Works with Claude Code, Codex CLI, Gemini CLI, Cursor, and Windsurf.",
5
5
  "keywords": [
6
6
  "business-analyst",
@@ -43,6 +43,6 @@
43
43
  "node": ">=18"
44
44
  },
45
45
  "scripts": {
46
- "test": "node --test test/cli.test.js"
46
+ "test": "node --test test/cli.test.js test/cli.integration.test.js"
47
47
  }
48
48
  }
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Which business rules should be reflected in AC (limits, formulas, thresholds)?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Protocol — REST, WebSocket, GraphQL, combination?
@@ -34,7 +34,9 @@ The domain is written into the brief metadata and passed to all subsequent pipel
34
34
 
35
35
  ### 4. Interview
36
36
 
37
- 3–7 questions per round, 2–4 rounds. Do not generate the artifact until sufficient information is collected.
37
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
38
+
39
+ Cover 3–7 topics per round, 2–4 rounds. Do not generate the artifact until sufficient information is collected.
38
40
 
39
41
  **Required topics (all domains):**
40
42
  1. Product type — what exactly is being built?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. DBMS — MongoDB, PostgreSQL, MySQL, other?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Performance — target CCU (Concurrent Users), RPS (Requests Per Second), acceptable response time?
@@ -27,7 +27,9 @@ If `01_brief_*.md` already exists, extract the slug and domain from it. Otherwis
27
27
 
28
28
  ### 3. Interview
29
29
 
30
- 1–2 rounds, 3–5 questions each. Do not ask about topics the user can accept as defaults.
30
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
31
+
32
+ 1–2 rounds, 3–5 topics each. Do not ask about topics the user can accept as defaults.
31
33
 
32
34
  **Required topics:**
33
35
  1. Artifact language — which language should all artifacts be generated in? (default: the language of the user's first message)
@@ -0,0 +1,53 @@
1
+ # Interview Protocol
2
+
3
+ Every BA Toolkit skill that gathers information from the user MUST follow this protocol during its Interview phase. The goal is a conversation, not a questionnaire dump — users answer better when each question has focus and concrete options to react to.
4
+
5
+ ## Rules
6
+
7
+ 1. **One question at a time.** Never send a numbered list of 5+ questions in a single message. Ask one question, wait for the answer, acknowledge it in one line, then ask the next.
8
+
9
+ 2. **Offer 3–5 answer options per question.** For every question, present a short numbered list of the most likely answers based on:
10
+ - The project domain (load `references/domains/{domain}.md` and reuse its vocabulary, typical entities, and business goals verbatim when they fit — do not invent domain-specific options when the reference file already lists them).
11
+ - What the user has already said earlier in the interview.
12
+ - Industry conventions for the artifact being built.
13
+
14
+ Options should be **concrete**, not abstract — e.g. for "Who is your primary user?" in a SaaS project, offer "Product Manager at a 50–500-person SaaS startup", "Engineering Lead", "Ops/Support team", not "End user", "Customer", "User".
15
+
16
+ 3. **Always include a free-text option.** The last numbered option must always be something like `5. Other — type your own answer`. If the user picks it, accept arbitrary text. Never force the user into one of the predefined options.
17
+
18
+ 4. **Wait for the answer.** Do not generate the next question or any part of the artifact until the user has replied. A non-answer (e.g. "I don't know", "skip") is a valid answer — record it as "unknown" and move on.
19
+
20
+ 5. **Acknowledge, then proceed.** After each answer, reflect it back in one line (e.g. "Got it — primary user is the Ops team at mid-size logistics companies.") before asking the next question. This catches misunderstandings early.
21
+
22
+ 6. **Batch only when the user asks.** If the user explicitly says "just give me all the questions at once" or "I'll answer in one go", switch to a single numbered list. Otherwise stay one-at-a-time.
23
+
24
+ 7. **Stop when you have enough.** Each skill specifies a required set of topics. Once every required topic has a recorded answer, stop asking and move to the Generation phase. Do not pad the interview with "nice-to-have" questions.
25
+
26
+ ## Example
27
+
28
+ Bad (old style):
29
+
30
+ > Please answer the following questions:
31
+ > 1. What is the product?
32
+ > 2. Who is the target user?
33
+ > 3. What problem does it solve?
34
+ > 4. What are the success metrics?
35
+ > 5. What are the key constraints?
36
+
37
+ Good (protocol style):
38
+
39
+ > Let's start with the product itself. What are you building?
40
+ >
41
+ > 1. A B2B SaaS tool for internal teams (dashboards, automation, reporting)
42
+ > 2. A customer-facing web application (marketplace, portal, community)
43
+ > 3. A mobile app (consumer or B2B)
44
+ > 4. An API / developer platform
45
+ > 5. Other — type your own answer
46
+
47
+ *User picks 1 or types custom.*
48
+
49
+ > Got it — internal B2B SaaS tool. Who is the primary user? [next question with 3–5 options tailored to B2B SaaS internal tooling]
50
+
51
+ ## When this protocol applies
52
+
53
+ This protocol applies to every skill that has an `### Interview` (or `## Interview`) section in its SKILL.md — currently: `brief`, `srs`, `stories`, `usecases`, `ac`, `nfr`, `datadict`, `apicontract`, `wireframes`, `scenarios`, `research`, `principles`. Each of those skills MUST link to this file from its Interview section and follow the rules above.
@@ -1,6 +1,7 @@
1
1
  # BA Toolkit — Project Context
2
2
 
3
- > Auto-generated by `ba-toolkit init` on [DATE]. Updated automatically by /brief and /srs.
3
+ <!-- ba-toolkit:begin managed -->
4
+ > Auto-generated by `ba-toolkit init` on [DATE]. The Active Project block below is refreshed on every re-init. Everything outside this managed block is preserved — add your own notes, update the Pipeline Status, and edit the Key Constraints / Open Questions sections freely; `ba-toolkit init` will not touch them.
4
5
 
5
6
  ## Active Project
6
7
 
@@ -9,6 +10,7 @@
9
10
  **Domain:** [DOMAIN]
10
11
  **Language:** English
11
12
  **Output folder:** output/[SLUG]/
13
+ <!-- ba-toolkit:end managed -->
12
14
 
13
15
  ## Pipeline Status
14
16
 
@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
23
23
 
24
24
  ## Interview
25
25
 
26
- 1–2 rounds, 4–6 questions.
26
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
27
+
28
+ 1–2 rounds, 4–6 topics.
27
29
 
28
30
  **Required topics:**
29
31
  1. Existing infrastructure — is there a current backend, database, or API the new system must integrate with or extend?
@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
23
23
 
24
24
  ## Interview
25
25
 
26
- 1 round, 3–5 questions.
26
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
27
+
28
+ 1 round, 3–5 topics.
27
29
 
28
30
  **Required topics:**
29
31
  1. Coverage priority — generate scenarios for Must-priority US only, or include Should as well?
@@ -23,7 +23,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
23
23
 
24
24
  ## Interview
25
25
 
26
- 3–7 questions per round, 2–4 rounds. Do not re-ask information already known from the brief.
26
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
27
+
28
+ 3–7 topics per round, 2–4 rounds. Do not re-ask information already known from the brief.
27
29
 
28
30
  **Required topics:**
29
31
  1. User roles — which roles interact with the system?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Which user flows are most critical?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Detail level — summary, user-goal, subfunction?
@@ -21,7 +21,9 @@ Read `references/environment.md` from the `ba-toolkit` directory to determine th
21
21
 
22
22
  ## Interview
23
23
 
24
- 3–7 questions per round, 2–4 rounds.
24
+ > **Follow the [Interview Protocol](../references/interview-protocol.md):** ask one question at a time, offer 3–5 domain-appropriate options (load `references/domains/{domain}.md` for the ones that fit), always include a free-text "Other" option as the last choice, and wait for an answer before asking the next question.
25
+
26
+ 3–7 topics per round, 2–4 rounds.
25
27
 
26
28
  **Required topics:**
27
29
  1. Platform — web (desktop, mobile responsive), native app, Telegram Mini App?