@roadmapperai/mcp 0.9.3 → 0.9.4
- package/AGENTS.md +95 -16
- package/package.json +1 -1
- package/server.mjs +875 -15
package/AGENTS.md
CHANGED
|
@@ -85,6 +85,13 @@ that's almost always what you actually want.
|
|
|
85
85
|
CANDIDATE for `propose_capability` (human-confirmed) followed by
|
|
86
86
|
`move_tasks`. Tune `minClusterSize` (default 3) / `fitThreshold`
|
|
87
87
|
(default 0.2) to control sensitivity.
|
|
88
|
+
- `detect_theme_sprawl` — the theme-level companion. Scores every
|
|
89
|
+
active theme against every other and flags pairs that overlap enough
|
|
90
|
+
to be consolidation candidates, each with a suggested merge
|
|
91
|
+
(`move_capabilities` the lighter theme's bets into the heavier, then
|
|
92
|
+
`archive_theme`). This is how you keep theme count in check now that
|
|
93
|
+
agents create themes autonomously (see `propose_theme`). Run it at
|
|
94
|
+
quarterly review. Tune `threshold` (default 0.34).
|
|
88
95
|
- `get_agents_md` — re-read this contract on demand.
|
|
89
96
|
|
|
90
97
|
**Multi-workspace addressing.** A single MCP install can talk to any
|
|
@@ -137,18 +144,37 @@ to "tell the human what I'd do"):
|
|
|
137
144
|
succeeds (this is a nudge, not a gate — unlike `propose_capability`,
|
|
138
145
|
which hard-blocks until you've done discovery). Surface the warning
|
|
139
146
|
to the user.
|
|
147
|
+
- `propose_tasks` — **bulk** create many tasks under ONE capability in
|
|
148
|
+
a single call. **This is the token-efficient path: prefer it over N
|
|
149
|
+
separate `propose_task` calls when filing a plan** — one request, one
|
|
150
|
+
compact `{id,title}` array back. Each task spec needs `title` +
|
|
151
|
+
`effort`; intra-batch dependencies work by giving a task a `ref` and
|
|
152
|
+
listing that ref in a sibling's `dependsOn` (refs are rewritten to
|
|
153
|
+
real ids after minting). A validation error fails the whole batch
|
|
154
|
+
before writing; once valid, per-row RPC failures are reported in
|
|
155
|
+
`tasks[].error` without sinking the rest.
|
|
140
156
|
- `propose_capability` — create a new capability under an **existing**
|
|
141
157
|
theme. Required: `name`, `pillarId`. Sensible defaults are applied
|
|
142
158
|
(`reach: 100`, `impact: 1`, `confidence: 70`). Pass `outcome` and
|
|
143
159
|
`specRef` whenever you have them — capabilities without an outcome
|
|
144
160
|
rarely survive review. Pass `idempotencyKey`.
|
|
145
|
-
- `propose_theme` — create a new theme. **
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
161
|
+
- `propose_theme` — create a new theme. **Theme-autonomy is ON by
|
|
162
|
+
default**, so you MAY create a theme without human confirmation when
|
|
163
|
+
the work genuinely needs a new years-stable pillar. You don't police
|
|
164
|
+
sprawl by asking a human every time — the server does it for you:
|
|
165
|
+
`propose_theme` returns `error: "too_similar"` (naming the match) if
|
|
166
|
+
your theme overlaps an existing one above the block bar, so you reuse
|
|
167
|
+
or `update_theme` that one instead. Still prefer the deepest existing
|
|
168
|
+
parent (new task > new capability > new theme); a theme is the rare
|
|
169
|
+
case. If a workspace turned autonomy OFF (Settings → Agent
|
|
170
|
+
automation), `propose_theme` returns `error: "confirmation_required"`
|
|
171
|
+
until you surface the new theme to the user and retry with
|
|
172
|
+
`confirm: true`. Use `force: true` only to override a `too_similar`
|
|
173
|
+
block that's a genuine false positive. Run `detect_theme_sprawl`
|
|
174
|
+
periodically to catch themes that have drifted into overlap.
|
|
175
|
+
- `submit_acceptance_grades` — stamp `{ status: pass | fail, note? }`
|
|
176
|
+
per criterion index, after you've actually verified the work. The
|
|
177
|
+
server stamps `gradedAt` and `gradedBy: "mcp:agent"`.
|
|
152
178
|
- `submit_acceptance_grades` — stamp `{ status: pass | fail, note? }`
|
|
153
179
|
per criterion index, after you've actually verified the work. The
|
|
154
180
|
server stamps `gradedAt` and `gradedBy: "mcp:agent"`.
|
|
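The `propose_theme` contract described in this hunk amounts to a small decision table on the agent side. The following is a minimal sketch of that table, assuming the tool result has already been parsed into a plain object. The helper name `nextThemeStep`, its `userApproved` / `isFalsePositive` inputs, and the action labels it returns are hypothetical and not part of the package; only the error strings and the `confirm` / `force` flags come from AGENTS.md.

```javascript
// Hypothetical helper: maps a parsed propose_theme result to the agent's next move.
// Only the error codes and the confirm/force flags are taken from AGENTS.md;
// the function name, inputs, and action labels are illustrative assumptions.
function nextThemeStep(result, { userApproved = false, isFalsePositive = false } = {}) {
  if (result.error == null) return { action: "done" };
  switch (result.error) {
    case "discovery_missing":
      // Discovery gate: inspect the existing themes first, then retry.
      return { action: "run_discovery", tool: "suggest_theme_for" };
    case "too_similar":
      // Sprawl guard: reuse or update the named match unless it is a false positive.
      return isFalsePositive
        ? { action: "retry", extraArgs: { force: true } }
        : { action: "reuse_existing", tool: "update_theme" };
    case "confirmation_required":
      // Autonomy is OFF for this workspace: a human has to approve the theme.
      return userApproved
        ? { action: "retry", extraArgs: { confirm: true } }
        : { action: "ask_user" };
    default:
      return { action: "surface_error", error: result.error };
  }
}
```

The point of the sketch is that `force: true` and `confirm: true` answer different errors and are not interchangeable: `force` overrides the similarity block, `confirm` attests to user approval.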
@@ -223,6 +249,45 @@ disagrees with the cwd snapshot are refused — set
|
|
|
223
249
|
`ROADMAPPER_ALLOW_CROSS_WORKSPACE=1` in the env to override.
|
|
224
250
|
Reads can target any workspace freely.
|
|
225
251
|
|
|
252
|
+
**Repo-link gate (write path).** If you're in a git repo that
|
|
253
|
+
isn't mapped to a workspace, a mutator would silently land on the
|
|
254
|
+
install's env-default workspace — almost never what you want. So
|
|
255
|
+
the first such write is **refused** with a structured error:
|
|
256
|
+
|
|
257
|
+
```json
|
|
258
|
+
{
|
|
259
|
+
"error": "repo_unmapped",
|
|
260
|
+
"message": "\"owner/name\" isn't mapped to a workspace ...",
|
|
261
|
+
"fix": "link_repo()",
|
|
262
|
+
"alt": "<tool>({ workspaceId: \"<target>\", ... })"
|
|
263
|
+
}
|
|
264
|
+
```
|
|
265
|
+
|
|
266
|
+
Resolve it one of two ways, then retry:
|
|
267
|
+
- **`link_repo()`** — maps the repo you're in to your key's
|
|
268
|
+
workspace, so every future session resolves silently. This is
|
|
269
|
+
the right move when the repo *should* feed this workspace.
|
|
270
|
+
- **Pass `workspaceId` explicitly** on the call — proceeds without
|
|
271
|
+
mapping the repo. This is the escape hatch when you're working
|
|
272
|
+
across **several repos in one session** and just want this write
|
|
273
|
+
to land on a specific existing workspace.
|
|
274
|
+
|
|
275
|
+
The gate is repo-aware and only fires for a genuinely unmapped
|
|
276
|
+
repo: an explicit `workspaceId`, an already-mapped repo (resolves
|
|
277
|
+
via `repo_workspace_map`), or not being in a git repo at all
|
|
278
|
+
pass straight through — a multi-repo chat is never bricked.
|
|
279
|
+
Operators can disable the gate entirely with
|
|
280
|
+
`ROADMAPPER_ALLOW_UNMAPPED_REPO=1`.
|
|
281
|
+
|
|
282
|
+
This gate and the **seed-workspace guard** below are complementary,
|
|
283
|
+
not redundant: the repo-link gate covers the *in a git repo but
|
|
284
|
+
unmapped* case (and gives the more actionable `link_repo` fix),
|
|
285
|
+
while the seed-workspace guard still catches a write that fell
|
|
286
|
+
through to the bundled `"default"` workspace when you're **not** in
|
|
287
|
+
any repo (nothing to link). A write can be refused by at most one of
|
|
288
|
+
them; both protect the same env-default footgun from different
|
|
289
|
+
angles.
|
|
290
|
+
|
|
226
291
|
Authoring discipline:
|
|
227
292
|
- Read first (`list_themes`, `list_capabilities`, `list_tasks`) before
|
|
228
293
|
proposing anything, so you don't invent a new theme/capability that
|
|
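The two resolutions for `repo_unmapped` described in this hunk can be sketched as a tiny planner. This is an illustration under stated assumptions, not code from the package: `planRepoUnmappedRecovery` and the `{ tool, args }` call shape are hypothetical, and only `link_repo`, `workspaceId`, and the `repo_unmapped` error code come from the text.

```javascript
// Hypothetical planner: given a parsed error and the call that triggered it,
// return the tool calls to issue next, or null if the error is something else.
// The { tool, args } shape is an assumption made for illustration.
function planRepoUnmappedRecovery(error, call, targetWorkspaceId) {
  if (!error || error.error !== "repo_unmapped") return null;
  if (targetWorkspaceId) {
    // Escape hatch: explicit workspaceId on the retry, the repo stays unmapped.
    return [{ tool: call.tool, args: { ...call.args, workspaceId: targetWorkspaceId } }];
  }
  // Default: map the repo once, then retry the original call unchanged.
  return [{ tool: "link_repo", args: {} }, call];
}
```

Choosing between the two branches is a judgment about intent: link when the repo should feed the key's workspace from now on, pass `workspaceId` when the write is a one-off in a multi-repo session.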
@@ -402,12 +467,17 @@ Before writing anything:
|
|
|
402
467
|
- **Top score 0.2–0.4**: weak overlap. The top match is still
|
|
403
468
|
usually the right home; re-using a "close-enough" theme is
|
|
404
469
|
almost always better than creating a duplicate.
|
|
405
|
-
- **Empty matches or top < 0.2**: no existing theme fits.
|
|
406
|
-
|
|
407
|
-
|
|
408
|
-
|
|
409
|
-
|
|
410
|
-
|
|
470
|
+
- **Empty matches or top < 0.2**: no existing theme fits. With
|
|
471
|
+
theme-autonomy ON (the default), you may call `propose_theme`
|
|
472
|
+
directly when this is a genuinely new years-stable pillar — you
|
|
473
|
+
don't need to stop and ask. The server is the sprawl guard: it
|
|
474
|
+
refuses a near-duplicate (`error: "too_similar"`), so a theme that
|
|
475
|
+
gets created is one that doesn't already exist. With autonomy OFF,
|
|
476
|
+
`propose_theme` returns `error: "confirmation_required"` — surface
|
|
477
|
+
the theme to the user and retry with `confirm: true`. Either way,
|
|
478
|
+
prefer a capability under the closest theme when one is even
|
|
479
|
+
plausibly a fit; a new theme is the rare case.
|
|
480
|
+
The propose_theme tool enforces discovery: skipping
|
|
411
481
|
`suggest_theme_for` (or `list_themes` / `get_roadmap_snapshot`)
|
|
412
482
|
returns a `discovery_missing` error.
|
|
413
483
|
2. **Decompose into Capabilities.** A Capability is "a quarterly
|
|
@@ -427,9 +497,18 @@ Before writing anything:
|
|
|
427
497
|
|
|
428
498
|
### What to emit (template)
|
|
429
499
|
|
|
430
|
-
|
|
431
|
-
|
|
432
|
-
|
|
500
|
+
**If the write tools are live (you can call `propose_capability` /
|
|
501
|
+
`propose_tasks`), FILE DIRECTLY — do not also emit this JSON block.**
|
|
502
|
+
Dumping the full plan as JSON *and* filing it via tools pays for the
|
|
503
|
+
plan twice in tokens, and the tools are the canonical path anyway. Use
|
|
504
|
+
`propose_capability` for the bet, then ONE `propose_tasks` call for all
|
|
505
|
+
its tasks (not N `propose_task` calls), and report back a short summary
|
|
506
|
+
with the returned ids — not the whole record set. The JSON block below
|
|
507
|
+
is only for the **no-write-tools** case (seed/live-read tier), where
|
|
508
|
+
the user pastes it into Roadmapper's import manually.
|
|
509
|
+
|
|
510
|
+
When you do emit it: return a single JSON block. The field names must
|
|
511
|
+
match exactly. IDs use the `__NEW__` placeholder prefix when you're
|
|
433
512
|
proposing a new record — Roadmapper assigns the real `TH-NNNNNN` /
|
|
434
513
|
`CAP-NNN` / `TK-NNNNNN` ID at import time.
|
|
435
514
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@roadmapperai/mcp",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.4",
|
|
4
4
|
"description": "Roadmapper AI MCP server — exposes a planning surface (themes, capabilities, tasks, sprints, PRs) to coding agents via stdio JSON-RPC. Pairs with the Roadmapper AI workspace at dashboard.roadmapperai.com.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"mcp",
|
package/server.mjs
CHANGED
|
@@ -326,6 +326,13 @@ async function fetchWorkspaceEntitiesViaBroker() {
|
|
|
326
326
|
pillars: Array.isArray(parsed.pillars) ? parsed.pillars : [],
|
|
327
327
|
capabilities: Array.isArray(parsed.capabilities) ? parsed.capabilities : [],
|
|
328
328
|
tasks: Array.isArray(parsed.tasks) ? parsed.tasks : [],
|
|
329
|
+
// Additive (migration 0108): the workspace_settings row, used to
|
|
330
|
+
// resolve agent_theme_autonomy. Absent on older backends → null,
|
|
331
|
+
// which the projection treats as "all defaults" (autonomy on).
|
|
332
|
+
settings:
|
|
333
|
+
parsed.settings && typeof parsed.settings === "object"
|
|
334
|
+
? parsed.settings
|
|
335
|
+
: null,
|
|
329
336
|
};
|
|
330
337
|
} catch {
|
|
331
338
|
return null;
|
|
@@ -853,6 +860,7 @@ async function readWorkspaceProjected(wsIdOverride) {
|
|
|
853
860
|
themes: ent.pillars.map(rowToThemeProjected),
|
|
854
861
|
capabilities: ent.capabilities.map(rowToCapabilityProjected),
|
|
855
862
|
tasks: ent.tasks.map(rowToTaskProjected),
|
|
863
|
+
settings: rowToSettingsProjected(ent.settings),
|
|
856
864
|
};
|
|
857
865
|
}
|
|
858
866
|
// Broker failed — fall through to the direct read below. On a pure
|
|
@@ -876,15 +884,22 @@ async function readWorkspaceProjected(wsIdOverride) {
|
|
|
876
884
|
return res.json();
|
|
877
885
|
};
|
|
878
886
|
try {
|
|
879
|
-
const [pillars, caps, tasks] = await Promise.all([
|
|
887
|
+
const [pillars, caps, tasks, settingsRows] = await Promise.all([
|
|
880
888
|
fetchTable("pillars?select=*"),
|
|
881
889
|
fetchTable("capabilities?select=*"),
|
|
882
890
|
fetchTable("tasks?select=*"),
|
|
891
|
+
// Operator path: workspace_settings is one row per workspace.
|
|
892
|
+
// Tolerate the table not existing on an older DB (404) — fall
|
|
893
|
+
// back to defaults rather than failing the whole read.
|
|
894
|
+
fetchTable("workspace_settings?select=*").catch(() => []),
|
|
883
895
|
]);
|
|
884
896
|
return {
|
|
885
897
|
themes: pillars.map(rowToThemeProjected),
|
|
886
898
|
capabilities: caps.map(rowToCapabilityProjected),
|
|
887
899
|
tasks: tasks.map(rowToTaskProjected),
|
|
900
|
+
settings: rowToSettingsProjected(
|
|
901
|
+
Array.isArray(settingsRows) ? settingsRows[0] : settingsRows
|
|
902
|
+
),
|
|
888
903
|
};
|
|
889
904
|
} catch (e) {
|
|
890
905
|
log("supabase entity read failed:", e.message);
|
|
@@ -896,6 +911,20 @@ async function readWorkspaceProjected(wsIdOverride) {
|
|
|
896
911
|
* the same camelCase keys the SPA + agent surfaces have always
|
|
897
912
|
* used; the legacy JSONB shape and these table rows agree on
|
|
898
913
|
* every field. */
|
|
914
|
+
/**
|
|
915
|
+
* Project a workspace_settings row to the camelCase shape the server
|
|
916
|
+
* reads. Tolerant of null / {} (no row yet, or an older backend that
|
|
917
|
+
* doesn't return settings): every flag falls back to its product
|
|
918
|
+
* default. agent_theme_autonomy defaults TRUE — agents create themes
|
|
919
|
+
* autonomously unless a workspace explicitly turns it off.
|
|
920
|
+
*/
|
|
921
|
+
function rowToSettingsProjected(r) {
|
|
922
|
+
const row = r && typeof r === "object" ? r : {};
|
|
923
|
+
return {
|
|
924
|
+
// Default true: missing column / row / backend all mean "on".
|
|
925
|
+
agentThemeAutonomy: row.agent_theme_autonomy !== false,
|
|
926
|
+
};
|
|
927
|
+
}
|
|
899
928
|
function rowToThemeProjected(r) {
|
|
900
929
|
return stripUndefined({
|
|
901
930
|
id: r.id,
|
|
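The `!== false` comparison in `rowToSettingsProjected` is what makes autonomy default-on. The snippet below isolates that expression to show which inputs count as "on"; the wrapper name `autonomyEnabled` and the sample inputs are illustrative and do not appear in the package.

```javascript
// Same expression rowToSettingsProjected uses for the flag, isolated for illustration.
function autonomyEnabled(row) {
  const r = row && typeof row === "object" ? row : {};
  return r.agent_theme_autonomy !== false;
}

const samples = [
  null,                            // older backend: no settings at all
  {},                              // row exists, column missing
  { agent_theme_autonomy: null },  // column present but unset
  { agent_theme_autonomy: true },
  { agent_theme_autonomy: false }, // the only input that turns autonomy off
];
console.log(samples.map(autonomyEnabled));
```

Only the literal boolean `false` disables autonomy. One consequence of the strict comparison is that a non-boolean value such as the string `"false"` would still be treated as on.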
@@ -1396,6 +1425,27 @@ function jaccardScore(a, b) {
|
|
|
1396
1425
|
return overlap / Math.max(a.size, b.size);
|
|
1397
1426
|
}
|
|
1398
1427
|
|
|
1428
|
+
// ── Theme sprawl control ──────────────────────────────────────────
|
|
1429
|
+
//
|
|
1430
|
+
// With agent_theme_autonomy ON (the default), the old "stop and ask a
|
|
1431
|
+
// human before any new theme" guard is gone — so the sprawl guard has
|
|
1432
|
+
// to live server-side instead. A proposed theme whose name+description
|
|
1433
|
+
// overlaps an existing active theme at or above this bar is almost
|
|
1434
|
+
// certainly a near-duplicate ("Data Intelligence" vs "Data &
|
|
1435
|
+
// Intelligence"); propose_theme refuses it and points at the match so
|
|
1436
|
+
// the agent reuses/updates that theme instead of minting a sibling.
|
|
1437
|
+
// Set deliberately high: themes are coarse, so only a strong overlap
|
|
1438
|
+
// is a real duplicate. 0.6 blocks name-containment dups ("Data
|
|
1439
|
+
// Intelligence" ⊂ "Data Intelligence Platform" = 0.67) without
|
|
1440
|
+
// false-positiving on two distinct short themes that happen to share
|
|
1441
|
+
// ONE word ("Customer Loyalty" vs "Customer Retention" = 0.5 < 0.6).
|
|
1442
|
+
// force:true overrides for the rare legitimate case.
|
|
1443
|
+
const THEME_SPRAWL_BLOCK = 0.6;
|
|
1444
|
+
// Two existing themes overlapping at/above this are flagged as a
|
|
1445
|
+
// consolidation candidate by detect_theme_sprawl (lower than the block
|
|
1446
|
+
// bar — we want to surface drift before it's an exact dup).
|
|
1447
|
+
const THEME_SPRAWL_WARN = 0.34;
|
|
1448
|
+
|
|
1399
1449
|
// ── Session state + enforcement gates ─────────────────────────────
|
|
1400
1450
|
//
|
|
1401
1451
|
// One process serves one MCP client (stdio). State below is the
|
|
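The worked numbers in the comments of this hunk (0.67 and 0.5) can be reproduced with a short sketch. It assumes a simple tokenizer (lowercase, split on non-alphanumerics), which is a guess: the package's real tokenizer is not visible in this diff. The scoring line mirrors the `overlap / Math.max(a.size, b.size)` return shown in `jaccardScore`; `tokens`, `overlapScore`, and `findSprawlPairs` are hypothetical names.

```javascript
// Thresholds copied from server.mjs; everything else is an illustrative assumption.
const THEME_SPRAWL_BLOCK = 0.6;
const THEME_SPRAWL_WARN = 0.34;

// Assumed tokenizer: lowercase, split on non-alphanumerics, drop empty strings.
function tokens(text) {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Shared tokens divided by the size of the larger set.
function overlapScore(a, b) {
  if (a.size === 0 || b.size === 0) return 0;
  let overlap = 0;
  for (const t of a) if (b.has(t)) overlap += 1;
  return overlap / Math.max(a.size, b.size);
}

// Score every unordered pair of themes and keep those at or above the threshold,
// strongest overlap first.
function findSprawlPairs(themes, threshold = THEME_SPRAWL_WARN) {
  const sets = themes.map((t) => tokens(`${t.name} ${t.description ?? ""}`));
  const pairs = [];
  for (let i = 0; i < themes.length; i++) {
    for (let j = i + 1; j < themes.length; j++) {
      const score = overlapScore(sets[i], sets[j]);
      if (score >= threshold) pairs.push({ a: themes[i].id, b: themes[j].id, score });
    }
  }
  return pairs.sort((x, y) => y.score - x.score);
}
```

With names only, "Data Intelligence" against "Data Intelligence Platform" shares 2 of 3 tokens (about 0.67, above the block bar), while "Customer Loyalty" against "Customer Retention" shares 1 of 2 (0.5, below the block bar but above the warn bar). Under these assumptions the second pair would not be refused by `propose_theme` but would be listed by a sprawl scan at the default threshold; descriptions, which the real tools also score, would shift both numbers.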
@@ -1485,6 +1535,88 @@ function discoveryMissingResult(toolName, fixCall, rationale) {
|
|
|
1485
1535
|
};
|
|
1486
1536
|
}
|
|
1487
1537
|
|
|
1538
|
+
/**
|
|
1539
|
+
* Block result for a mutator whose target workspace fell through to the
|
|
1540
|
+
* install's env default WHILE the agent is sitting in a git repo that
|
|
1541
|
+
* isn't mapped to any workspace. Same shape + self-heal rationale as the
|
|
1542
|
+
* rubric gate: name the exact fix so the LLM links the repo, then retries.
|
|
1543
|
+
*
|
|
1544
|
+
* Why this is repo-aware, not session-aware — a developer routinely has
|
|
1545
|
+
* SEVERAL repos open in one chat. The gate must only fire for the specific
|
|
1546
|
+
* unmapped repo, and must never brick a legitimate cross-repo write:
|
|
1547
|
+
* • An explicit `workspaceId` arg → caller is intentionally targeting a
|
|
1548
|
+
* workspace; never blocked (checked before this is reached).
|
|
1549
|
+
* • source === "repo"/"snapshot"/"arg" → already resolved to a real
|
|
1550
|
+
* mapping; this only fires on "env" (the silent install-default
|
|
1551
|
+
* fall-through), which — because resolveWorkspaceWithSource prefers a
|
|
1552
|
+
* repo_workspace_map hit — means THIS repo genuinely isn't mapped.
|
|
1553
|
+
* • No git slug (not in a repo) → nothing to link; fall through to the
|
|
1554
|
+
* env default rather than deadlock.
|
|
1555
|
+
* The message offers BOTH escape hatches so a multi-repo chat is never
|
|
1556
|
+
* stuck: link_repo (map this repo) OR pass workspaceId (target an existing
|
|
1557
|
+
* workspace without mapping the repo at all).
|
|
1558
|
+
*/
|
|
1559
|
+
function repoUnmappedResult(toolName, slug, envWsId) {
|
|
1560
|
+
return {
|
|
1561
|
+
content: [
|
|
1562
|
+
{
|
|
1563
|
+
type: "text",
|
|
1564
|
+
text: JSON.stringify(
|
|
1565
|
+
{
|
|
1566
|
+
error: "repo_unmapped",
|
|
1567
|
+
message:
|
|
1568
|
+
`"${slug}" isn't mapped to a workspace, so ${toolName} would land on the install-default workspace "${envWsId}" — probably not what you want. ` +
|
|
1569
|
+
`Map it once with link_repo (this repo → your key's workspace, resolves silently forever after), then retry ${toolName}. ` +
|
|
1570
|
+
`Or, if you meant a specific existing workspace, pass workspaceId on the call and it proceeds without mapping the repo.`,
|
|
1571
|
+
repo: slug,
|
|
1572
|
+
envDefaultWorkspace: envWsId,
|
|
1573
|
+
fix: "link_repo()",
|
|
1574
|
+
alt: `${toolName}({ workspaceId: "<target>", ... })`,
|
|
1575
|
+
},
|
|
1576
|
+
null,
|
|
1577
|
+
2
|
|
1578
|
+
),
|
|
1579
|
+
},
|
|
1580
|
+
],
|
|
1581
|
+
isError: true,
|
|
1582
|
+
};
|
|
1583
|
+
}
|
|
1584
|
+
|
|
1585
|
+
/**
|
|
1586
|
+
* Decide whether a mutator should be blocked because the agent is in an
|
|
1587
|
+
* unmapped repo and the write would silently hit the env default. Returns
|
|
1588
|
+
* a block result, or null to proceed. Pure + sync (no network) so it's
|
|
1589
|
+
* cheap on every mutator: the per-repo "is it mapped" question was already
|
|
1590
|
+
* answered by resolveWorkspaceWithSource (a mapped repo resolves to
|
|
1591
|
+
* source "repo", never "env"), so we only need the cwd's git slug here.
|
|
1592
|
+
*
|
|
1593
|
+
* Escape hatches, in order:
|
|
1594
|
+
* 1. Explicit workspaceId arg → intentional target, allow.
|
|
1595
|
+
* 2. Writes disabled → not our concern (set_credentials path handles it).
|
|
1596
|
+
* 3. Source isn't "env" → already resolved to a real mapping, allow.
|
|
1597
|
+
* 4. No client roots / no git slug → not in a linkable repo, allow
|
|
1598
|
+
* (fall through to env default; blocking would deadlock).
|
|
1599
|
+
* 5. Bypass env var set → allow (operator opt-out).
|
|
1600
|
+
*/
|
|
1601
|
+
async function repoLinkGate(name, args, source, envWsId) {
|
|
1602
|
+
if (args?.workspaceId) return null; // explicit target — never block
|
|
1603
|
+
if (writeMode() === "read-only") return null; // different problem
|
|
1604
|
+
if (source !== "env") return null; // resolved via repo/snapshot/arg
|
|
1605
|
+
if (process.env.ROADMAPPER_ALLOW_UNMAPPED_REPO === "1") return null;
|
|
1606
|
+
if (_clientRoots.length === 0) return null; // not in a repo at all
|
|
1607
|
+
|
|
1608
|
+
// Find the first open root with a resolvable origin slug. If none, the
|
|
1609
|
+
// agent isn't in a linkable git repo — don't block (let env default win).
|
|
1610
|
+
let slug = null;
|
|
1611
|
+
for (const dir of _clientRoots) {
|
|
1612
|
+
slug = await repoSlugForDir(dir);
|
|
1613
|
+
if (slug) break;
|
|
1614
|
+
}
|
|
1615
|
+
if (!slug) return null;
|
|
1616
|
+
|
|
1617
|
+
return repoUnmappedResult(name, slug, envWsId);
|
|
1618
|
+
}
|
|
1619
|
+
|
|
1488
1620
|
/**
|
|
1489
1621
|
* Telemetry write — fire-and-forget POST to public.mcp_telemetry
|
|
1490
1622
|
* via PostgREST when a service-role key is set. Never blocks the
|
|
@@ -1768,13 +1900,70 @@ const TOOLS = [
|
|
|
1768
1900
|
additionalProperties: false,
|
|
1769
1901
|
},
|
|
1770
1902
|
},
|
|
1903
|
+
{
|
|
1904
|
+
name: "propose_tasks",
|
|
1905
|
+
description:
|
|
1906
|
+
"Bulk-create MANY tasks under ONE capability in a single call. Token-efficient: prefer this over N separate propose_task calls when filing a plan — one request, one compact {id,title} array back instead of N round trips. When write tools are live, file directly via this tool; do NOT also paste the full JSON plan into chat (that pays for the plan twice).\n\n" +
|
|
1907
|
+
"USE WHEN: decomposing a capability into its 3-8 tasks, or importing a planned backlog. All tasks share the one capabilityId.\n" +
|
|
1908
|
+
"PREREQUISITE: get_agents_md once this session (enforced). The capability must already exist — propose_capability first if needed.\n" +
|
|
1909
|
+
"INTRA-BATCH DEPENDENCIES: give a task a `ref` (any alias string) and reference it in another task's `dependsOn` — refs are rewritten to the real TK ids after minting. dependsOn entries that aren't a sibling ref pass through as literal existing TK ids.\n" +
|
|
1910
|
+
"PARTIAL SUCCESS: a structural/validation error in any row fails the whole call before writing (fix the batch). Once validated, per-row RPC failures are reported in tasks[].error without sinking the rest.\n" +
|
|
1911
|
+
"ANTI-PATTERN: don't use for a single task (use propose_task); don't spread one capability's tasks across multiple capabilities (call once per capability).\n" +
|
|
1912
|
+
"EXAMPLE: propose_tasks({ capabilityId: 'CAP-018', tasks: [{ ref: 'a', title: 'Schema + migration', effort: 'M' }, { title: 'API endpoint', effort: 'M', dependsOn: ['a'] }] })\n\n" +
|
|
1913
|
+
"Requires write auth (set ROADMAPPER_API_KEY). Pass dryRun:true to validate + preview ids without writing. Pass workspaceId to target a workspace other than the env default.",
|
|
1914
|
+
inputSchema: {
|
|
1915
|
+
type: "object",
|
|
1916
|
+
properties: {
|
|
1917
|
+
capabilityId: { type: "string" },
|
|
1918
|
+
tasks: {
|
|
1919
|
+
type: "array",
|
|
1920
|
+
minItems: 1,
|
|
1921
|
+
maxItems: 100,
|
|
1922
|
+
description: "Task specs. Each needs title + effort; everything else is optional.",
|
|
1923
|
+
items: {
|
|
1924
|
+
type: "object",
|
|
1925
|
+
properties: {
|
|
1926
|
+
ref: {
|
|
1927
|
+
type: "string",
|
|
1928
|
+
description:
|
|
1929
|
+
"Optional caller alias for intra-batch dependsOn references. Not stored.",
|
|
1930
|
+
},
|
|
1931
|
+
title: { type: "string" },
|
|
1932
|
+
summary: { type: "string" },
|
|
1933
|
+
effort: { type: "string", enum: ["XS", "S", "M", "L", "XL"] },
|
|
1934
|
+
kind: { type: "string", enum: ["feature", "bug", "chore", "spike"] },
|
|
1935
|
+
priority: { type: "string", enum: ["P0", "P1", "P2", "P3"] },
|
|
1936
|
+
acceptance: { type: "array", items: { type: "string" } },
|
|
1937
|
+
dependsOn: {
|
|
1938
|
+
type: "array",
|
|
1939
|
+
items: { type: "string" },
|
|
1940
|
+
description:
|
|
1941
|
+
"Sibling refs (rewritten to real ids) and/or existing TK-NNNNNN ids.",
|
|
1942
|
+
},
|
|
1943
|
+
owner: { type: "string" },
|
|
1944
|
+
expectedPRs: { type: "number" },
|
|
1945
|
+
expectedScope: { type: "number" },
|
|
1946
|
+
idempotencyKey: { type: "string" },
|
|
1947
|
+
},
|
|
1948
|
+
required: ["title", "effort"],
|
|
1949
|
+
additionalProperties: false,
|
|
1950
|
+
},
|
|
1951
|
+
},
|
|
1952
|
+
dryRun: { type: "boolean" },
|
|
1953
|
+
workspaceId: { type: "string" },
|
|
1954
|
+
},
|
|
1955
|
+
required: ["capabilityId", "tasks"],
|
|
1956
|
+
additionalProperties: false,
|
|
1957
|
+
},
|
|
1958
|
+
},
|
|
1771
1959
|
{
|
|
1772
1960
|
name: "propose_theme",
|
|
1773
1961
|
description:
|
|
1774
|
-
"Propose a new strategic theme (pillar). Themes are years-stable —
|
|
1775
|
-
"
|
|
1776
|
-
"
|
|
1777
|
-
"
|
|
1962
|
+
"Propose a new strategic theme (pillar). Themes are years-stable, coarse pillars — the small top tier of the tree.\n\n" +
|
|
1963
|
+
"AUTONOMY: by default (agent_theme_autonomy ON) you may create a theme without human confirmation when the work genuinely needs a new pillar. The server controls sprawl for you — it REFUSES a near-duplicate of an existing theme (returns error:\"too_similar\" naming the match) so you reuse/update that one instead. If a workspace turned autonomy OFF, propose_theme returns error:\"confirmation_required\" until you surface the theme to the user and retry with confirm:true.\n" +
|
|
1964
|
+
"USE WHEN: the work doesn't fit any existing theme AND represents a distinct multi-year strategic direction. Most planning needs a capability under an existing theme, not a new theme.\n" +
|
|
1965
|
+
"PREREQUISITE: get_agents_md once this session (enforced). Theme discovery once this session, satisfied by suggest_theme_for (preferred — returns ranked matches), list_themes, or get_roadmap_snapshot (enforced — discovery_missing with a fix field otherwise).\n" +
|
|
1966
|
+
"ANTI-PATTERN: do not call to organize a quarter of work — that's a capability. Do not retry with force:true to bypass a too_similar block unless the overlap is a genuine false positive — that's the sprawl guard working.\n" +
|
|
1778
1967
|
"EXAMPLE: propose_theme({ name: 'AI Agent Reliability', description: 'Multi-year bet on making agent workflows reproducible.', targetRoi: 20000000, idempotencyKey: 'session-1-theme-1' })\n\n" +
|
|
1779
1968
|
"Requires write auth (set ROADMAPPER_API_KEY). targetRoi is RAW ANNUAL DOLLARS (e.g. 20000000 = $20M), not millions. Pass idempotencyKey so retries don't duplicate. Pass dryRun: true to validate without writing. Pass workspaceId to target a workspace other than the env default.",
|
|
1780
1969
|
inputSchema: {
|
|
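The intra-batch dependency behavior described for `propose_tasks` (sibling `ref` values rewritten to minted ids, other `dependsOn` entries passed through) can be sketched as follows. The body of `proposeTasks` is not shown in this part of the diff, so this is an assumption about the shape of the technique rather than the package's implementation; `rewriteRefs`, `mintId`, and the counter-based id format are hypothetical.

```javascript
// Hypothetical sketch of intra-batch ref rewriting. `mintId` stands in for
// whatever allocates real task ids; it is not a function from the package.
function rewriteRefs(specs, mintId) {
  // Mint one id per spec up front so later specs can point at earlier or later siblings.
  const ids = specs.map(() => mintId());
  const refToId = new Map();
  specs.forEach((spec, i) => {
    if (spec.ref) refToId.set(spec.ref, ids[i]);
  });
  return specs.map((spec, i) => {
    const { ref, ...rest } = spec; // `ref` is a caller alias and is not stored
    return {
      ...rest,
      id: ids[i],
      // Sibling refs become minted ids; anything else passes through as a literal id.
      dependsOn: (spec.dependsOn ?? []).map((d) => refToId.get(d) ?? d),
    };
  });
}

// Usage with a counter-based minter (the id values here are only illustrative).
let counter = 100;
const rewritten = rewriteRefs(
  [
    { ref: "a", title: "Schema + migration", effort: "M" },
    { title: "API endpoint", effort: "M", dependsOn: ["a", "TK-000042"] },
  ],
  () => `TK-${String(++counter).padStart(6, "0")}`
);
console.log(rewritten.map((t) => [t.id, t.dependsOn]));
```

Minting every id before rewriting is the design choice that lets a task depend on a sibling that appears later in the array. The sketch does not validate duplicate or unknown refs; an unknown string is simply treated as an existing task id, matching the pass-through rule in the tool description.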
@@ -1784,6 +1973,16 @@ const TOOLS = [
|
|
|
1784
1973
|
description: { type: "string" },
|
|
1785
1974
|
color: { type: "string" },
|
|
1786
1975
|
targetRoi: { type: "number", description: "Annual ROI target in raw dollars (e.g. 20000000 = $20M)." },
|
|
1976
|
+
force: {
|
|
1977
|
+
type: "boolean",
|
|
1978
|
+
description:
|
|
1979
|
+
"Override the too_similar sprawl block. Use ONLY when a flagged overlap with an existing theme is a genuine false positive and this is truly a distinct strategic pillar.",
|
|
1980
|
+
},
|
|
1981
|
+
confirm: {
|
|
1982
|
+
type: "boolean",
|
|
1983
|
+
description:
|
|
1984
|
+
"Set true to proceed when the workspace has agent theme-autonomy turned OFF — your attestation that the user explicitly approved this new theme. Ignored when autonomy is on (the default).",
|
|
1985
|
+
},
|
|
1787
1986
|
idempotencyKey: { type: "string" },
|
|
1788
1987
|
dryRun: { type: "boolean" },
|
|
1789
1988
|
workspaceId: { type: "string" },
|
|
@@ -2019,6 +2218,31 @@ const TOOLS = [
|
|
|
2019
2218
|
additionalProperties: false,
|
|
2020
2219
|
},
|
|
2021
2220
|
},
|
|
2221
|
+
{
|
|
2222
|
+
name: "detect_theme_sprawl",
|
|
2223
|
+
description:
|
|
2224
|
+
"Find pairs/clusters of EXISTING themes that overlap enough to be candidates for consolidation — the 'we have too many near-duplicate pillars' signal. The companion to agent_theme_autonomy: autonomy lets agents create themes freely, this is how you periodically detect and clean up the drift.\n\n" +
|
|
2225
|
+
"How it works: scores every active theme against every other by name+description token overlap, and reports pairs at or above the warn threshold (default 0.34). Each pair comes with the overlap score and a suggested action (merge via move_capabilities + archive_theme).\n" +
|
|
2226
|
+
"USE WHEN: quarterly review, or any time the theme list feels bloated. With autonomy on, run this occasionally to catch sibling themes that should be one.\n" +
|
|
2227
|
+
"PREREQUISITE: none — read-only. Enumerates every theme, so it satisfies the propose_theme discovery gate.\n" +
|
|
2228
|
+
"ANTI-PATTERN: don't auto-merge on a single weak overlap — a human owns theme structure. Tune threshold rather than acting on noise. Two themes CAN legitimately share vocabulary (e.g. 'Data Ingestion' vs 'Data Governance').\n" +
|
|
2229
|
+
"EXAMPLE: detect_theme_sprawl({ threshold: 0.34 })",
|
|
2230
|
+
inputSchema: {
|
|
2231
|
+
type: "object",
|
|
2232
|
+
properties: {
|
|
2233
|
+
threshold: {
|
|
2234
|
+
type: "number",
|
|
2235
|
+
minimum: 0,
|
|
2236
|
+
maximum: 1,
|
|
2237
|
+
description:
|
|
2238
|
+
"Min name+description Jaccard overlap between two themes to flag as a consolidation candidate. Default 0.34. Raise to surface only the most blatant duplicates.",
|
|
2239
|
+
},
|
|
2240
|
+
includeArchived: { type: "boolean" },
|
|
2241
|
+
workspaceId: { type: "string" },
|
|
2242
|
+
},
|
|
2243
|
+
additionalProperties: false,
|
|
2244
|
+
},
|
|
2245
|
+
},
|
|
2022
2246
|
];
|
|
2023
2247
|
|
|
2024
2248
|
/**
|
|
@@ -2337,6 +2561,7 @@ function updateLifecycleTools() {
|
|
|
2337
2561
|
/** Tools that mutate the workspace — all gated on rubric fetch. */
|
|
2338
2562
|
const MUTATOR_TOOLS = new Set([
|
|
2339
2563
|
"propose_task",
|
|
2564
|
+
"propose_tasks",
|
|
2340
2565
|
"propose_theme",
|
|
2341
2566
|
"propose_capability",
|
|
2342
2567
|
"submit_acceptance_grades",
|
|
@@ -2619,6 +2844,27 @@ async function callTool(name, args) {
|
|
|
2619
2844
|
);
|
|
2620
2845
|
return rubricMissingResult(name);
|
|
2621
2846
|
}
|
|
2847
|
+
// Repo-link gate. If the agent is in a git repo that isn't mapped to a
|
|
2848
|
+
// workspace, this write would silently land on the install's env
|
|
2849
|
+
// default. Block once with the link_repo fix (or the workspaceId escape
|
|
2850
|
+
// hatch) so the mapping gets done instead of writes scattering onto the
|
|
2851
|
+
// wrong workspace. Repo-aware so a multi-repo chat is never bricked —
|
|
2852
|
+
// see repoLinkGate / repoUnmappedResult for the full escape-hatch list.
|
|
2853
|
+
{
|
|
2854
|
+
const { source: wsSource } = resolveWorkspaceWithSource(
|
|
2855
|
+
args?.workspaceId
|
|
2856
|
+
);
|
|
2857
|
+
const linkBlock = await repoLinkGate(name, args, wsSource, wsId);
|
|
2858
|
+
if (linkBlock) {
|
|
2859
|
+
session.mutatorBlocks += 1;
|
|
2860
|
+
recordTelemetry(
|
|
2861
|
+
"mutator_blocked_repo_unmapped",
|
|
2862
|
+
{ tool: name, targetId },
|
|
2863
|
+
wsId
|
|
2864
|
+
);
|
|
2865
|
+
return linkBlock;
|
|
2866
|
+
}
|
|
2867
|
+
}
|
|
2622
2868
|
// Per-tool discovery gates. Block propose_theme until the agent
|
|
2623
2869
|
// has actually inspected the existing theme catalogue, and
|
|
2624
2870
|
// propose_capability until they've ranked existing caps for fit.
|
|
@@ -2836,6 +3082,8 @@ async function callTool(name, args) {
|
|
|
2836
3082
|
}
|
|
2837
3083
|
case "propose_task":
|
|
2838
3084
|
return proposeTask(args, projected, wsId);
|
|
3085
|
+
case "propose_tasks":
|
|
3086
|
+
return proposeTasks(args, projected, wsId);
|
|
2839
3087
|
case "propose_theme":
|
|
2840
3088
|
return proposeTheme(args, projected, wsId);
|
|
2841
3089
|
case "propose_capability":
|
|
@@ -2899,6 +3147,11 @@ async function callTool(name, args) {
|
|
|
2899
3147
|
// propose_capability gate (the natural next step on a gap).
|
|
2900
3148
|
session.capsDiscoveredAt = Date.now();
|
|
2901
3149
|
return detectCapabilityGaps(args, projected);
|
|
3150
|
+
case "detect_theme_sprawl":
|
|
3151
|
+
// Enumerates every active theme, so it satisfies the propose_theme
|
|
3152
|
+
// discovery gate (consolidating or proposing is the natural next step).
|
|
3153
|
+
session.themesListedAt = Date.now();
|
|
3154
|
+
return detectThemeSprawl(args, projected);
|
|
2902
3155
|
default:
|
|
2903
3156
|
return errorResult(`Unknown tool: ${name}`);
|
|
2904
3157
|
}
|
|
@@ -3087,16 +3340,235 @@ async function proposeTask(args, projected, wsId) {
|
|
|
3087
3340
|
);
|
|
3088
3341
|
}
|
|
3089
3342
|
|
|
3090
|
-
|
|
3343
|
+
/**
|
|
3344
|
+
* Shared field validation for a single task spec (used by the bulk
|
|
3345
|
+
* propose_tasks path). Returns an error string or null. Mirrors the
|
|
3346
|
+
* inline checks in proposeTask so both paths reject identically.
|
|
3347
|
+
*/
|
|
3348
|
+
function taskSpecError(t) {
|
|
3349
|
+
const titleErr = validateName(t.title, 5);
|
|
3350
|
+
if (titleErr) return titleErr;
|
|
3351
|
+
if (!t.effort)
|
|
3352
|
+
return "effort is required (one of XS, S, M, L, XL) on every task in the batch.";
|
|
3353
|
+
if (!VALID_EFFORTS.has(t.effort)) return `Invalid effort ${t.effort}.`;
|
|
3354
|
+
if (t.priority && !VALID_PRIORITIES.has(t.priority))
|
|
3355
|
+
return `Invalid priority ${t.priority}.`;
|
|
3356
|
+
if (t.kind && !VALID_KINDS.has(t.kind)) return `Invalid kind ${t.kind}.`;
|
|
3357
|
+
if (t.expectedPRs !== undefined && (typeof t.expectedPRs !== "number" || t.expectedPRs <= 0))
|
|
3358
|
+
return `expectedPRs must be a positive number, got ${t.expectedPRs}.`;
|
|
3359
|
+
if (t.expectedScope !== undefined && (typeof t.expectedScope !== "number" || t.expectedScope <= 0))
|
|
3360
|
+
return `expectedScope must be a positive number, got ${t.expectedScope}.`;
|
|
3361
|
+
return null;
|
|
3362
|
+
}
|
|
3363
|
+
|
|
3364
|
+
/** Build a task record from a spec + its pre-minted id. Mirrors the
|
|
3365
|
+
* object proposeTask constructs (minus the per-call skip warning). */
|
|
3366
|
+
function buildTaskRecord(t, cap, id) {
|
|
3367
|
+
const start = todayISO();
|
|
3368
|
+
const target = addDays(start, Math.max(1, Math.ceil(EFFORT_DAYS[t.effort])));
|
|
3369
|
+
return {
|
|
3370
|
+
id,
|
|
3371
|
+
capabilityId: cap.id,
|
|
3372
|
+
title: cleanText(t.title),
|
|
3373
|
+
summary: cleanText(t.summary),
|
|
3374
|
+
status: "planned",
|
|
3375
|
+
priority: t.priority ?? "P2",
|
|
3376
|
+
effort: t.effort,
|
|
3377
|
+
kind: t.kind ?? "feature",
|
|
3378
|
+
start,
|
|
3379
|
+
target,
|
|
3380
|
+
originalTarget: target,
|
|
3381
|
+
progress: 0,
|
|
3382
|
+
owner: t.owner?.trim() ?? "",
|
|
3383
|
+
team: cap.team ?? "",
|
|
3384
|
+
tags: [],
|
|
3385
|
+
prs: [],
|
|
3386
|
+
links: {},
|
|
3387
|
+
acceptance: t.acceptance ?? [],
|
|
3388
|
+
dependsOn: t.dependsOn ?? [],
|
|
3389
|
+
authorKind: "agent",
|
|
3390
|
+
...(t.expectedPRs !== undefined ? { expectedPRs: t.expectedPRs } : {}),
|
|
3391
|
+
...(t.expectedScope !== undefined ? { expectedScope: t.expectedScope } : {}),
|
|
3392
|
+
};
|
|
3393
|
+
}
|
|
3394
|
+
|
|
3395
|
+
/**
|
|
3396
|
+
* propose_tasks — file MANY tasks under one capability in a single
|
|
3397
|
+
* call. This is the token-efficient path: instead of N round trips
|
|
3398
|
+
* (each with its own tool-call framing + result), the agent sends the
|
|
3399
|
+
* whole batch once and gets back one compact array of {id, title}.
|
|
3400
|
+
*
|
|
3401
|
+
* Intra-batch dependencies: a task may carry a `ref` (a caller-chosen
|
|
3402
|
+
* alias) and other tasks may list that ref in `dependsOn`. We mint all
|
|
3403
|
+
* ids first, then rewrite any dependsOn entry that matches a sibling's
|
|
3404
|
+
* ref to the real TK id. dependsOn entries that aren't a known ref pass
|
|
3405
|
+
* through unchanged (assumed to be existing TK ids).
|
|
3406
|
+
*
|
|
3407
|
+
* Per-item RPC failures don't sink the batch: each result row carries ok
|
|
3408
|
+
* or error, mirroring move_tasks. Validation is all-or-nothing, though:
|
|
3409
|
+
* an invalid spec in any row rejects the whole call before any write.
|
|
3410
|
+
*/
|
|
3411
|
+
async function proposeTasks(args, projected, wsId) {
|
|
3412
|
+
const cap = projected.capabilities.find((c) => c.id === args.capabilityId);
|
|
3413
|
+
if (!cap) return errorResult(`Capability ${args.capabilityId} not found.`);
|
|
3414
|
+
const specs = Array.isArray(args.tasks) ? args.tasks : null;
|
|
3415
|
+
if (!specs || specs.length === 0)
|
|
3416
|
+
return errorResult("tasks must be a non-empty array of task specs.");
|
|
3417
|
+
if (specs.length > 100)
|
|
3418
|
+
return errorResult(`Too many tasks (${specs.length}); cap is 100 per call.`);
|
|
3419
|
+
|
|
3420
|
+
// Mint ids up front so intra-batch dependsOn refs can resolve.
|
|
3421
|
+
const minted = specs.map((t) => ({ spec: t, id: randomTaskId() }));
|
|
3422
|
+
const refToId = new Map();
|
|
3423
|
+
for (const m of minted) {
|
|
3424
|
+
if (typeof m.spec.ref === "string" && m.spec.ref.trim())
|
|
3425
|
+
refToId.set(m.spec.ref.trim(), m.id);
|
|
3426
|
+
}
|
|
3427
|
+
const resolveDeps = (deps) =>
|
|
3428
|
+
Array.isArray(deps) ? deps.map((d) => refToId.get(typeof d === "string" ? d.trim() : d) ?? d) : [];
|
|
3429
|
+
|
|
3430
|
+
// Validate everything first; a structural error in any row fails the
|
|
3431
|
+
// whole call (cheaper to fix the batch than to half-apply it). RPC
|
|
3432
|
+
// errors below are the per-row, partial-success case.
|
|
3433
|
+
for (let i = 0; i < minted.length; i++) {
|
|
3434
|
+
const err = taskSpecError(minted[i].spec);
|
|
3435
|
+
if (err) return errorResult(`tasks[${i}]: ${err}`);
|
|
3436
|
+
}
|
|
3437
|
+
|
|
3438
|
+
if (args.dryRun) {
|
|
3439
|
+
return textResult(
|
|
3440
|
+
JSON.stringify({
|
|
3441
|
+
ok: true,
|
|
3442
|
+
dryRun: true,
|
|
3443
|
+
capabilityId: cap.id,
|
|
3444
|
+
wouldCreate: minted.map(({ spec, id }) => ({
|
|
3445
|
+
id,
|
|
3446
|
+
title: cleanText(spec.title),
|
|
3447
|
+
effort: spec.effort,
|
|
3448
|
+
})),
|
|
3449
|
+
message: `Would create ${minted.length} task(s) under ${cap.id} (${cap.name}). No records written.`,
|
|
3450
|
+
}),
|
|
3451
|
+
);
|
|
3452
|
+
}
|
|
3453
|
+
|
|
3454
|
+
const results = [];
|
|
3455
|
+
let created = 0;
|
|
3456
|
+
for (const { spec, id } of minted) {
|
|
3457
|
+
const record = buildTaskRecord(
|
|
3458
|
+
{ ...spec, dependsOn: resolveDeps(spec.dependsOn) },
|
|
3459
|
+
cap,
|
|
3460
|
+
id
|
|
3461
|
+
);
|
|
3462
|
+
try {
|
|
3463
|
+
const rpcResult = await rpcCall("propose_task", {
|
|
3464
|
+
p_workspace_id: wsId,
|
|
3465
|
+
p_task: record,
|
|
3466
|
+
p_idempotency_key: spec.idempotencyKey ?? null,
|
|
3467
|
+
});
|
|
3468
|
+
const stored = rpcResult?.task ?? record;
|
|
3469
|
+
const idempotent = rpcResult?.idempotent === true;
|
|
3470
|
+
if (!idempotent) created += 1;
|
|
3471
|
+
results.push({ ok: true, id: stored.id, title: record.title, idempotent });
|
|
3472
|
+
} catch (e) {
|
|
3473
|
+
results.push({ ok: false, title: record.title, error: e.message });
|
|
3474
|
+
}
|
|
3475
|
+
}
|
|
3476
|
+
|
|
3477
|
+
const failed = results.filter((r) => !r.ok).length;
|
|
3478
|
+
return textResult(
|
|
3479
|
+
JSON.stringify({
|
|
3480
|
+
ok: failed === 0,
|
|
3481
|
+
capabilityId: cap.id,
|
|
3482
|
+
created,
|
|
3483
|
+
idempotent: results.filter((r) => r.ok && r.idempotent).length,
|
|
3484
|
+
failed,
|
|
3485
|
+
tasks: results,
|
|
3486
|
+
message:
|
|
3487
|
+
`Filed ${created} new task(s) under ${cap.id} (${cap.name})` +
|
|
3488
|
+
(failed ? `; ${failed} failed (see tasks[].error).` : "."),
|
|
3489
|
+
})
|
|
3490
|
+
);
|
|
3491
|
+
}
|
|
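The intra-batch `ref` resolution in `proposeTasks` is easier to follow in isolation. Below is a minimal sketch of the same two-pass idea: mint every id first, then rewrite `dependsOn`. It assumes a counter-based id generator (`mintTaskId`, hypothetical) in place of the server's `randomTaskId()`, and the helper name `resolveBatch` is also made up for illustration.

```javascript
// Hypothetical stand-in for the server's randomTaskId().
let nextId = 1;
function mintTaskId() {
  return `TK-${String(nextId++).padStart(4, "0")}`;
}

// Pass 1: mint an id per spec and index caller-chosen refs.
// Pass 2: swap known refs for real ids; anything else passes through.
function resolveBatch(specs) {
  const minted = specs.map((spec) => ({ spec, id: mintTaskId() }));
  const refToId = new Map();
  for (const { spec, id } of minted) {
    if (typeof spec.ref === "string" && spec.ref.trim()) {
      refToId.set(spec.ref.trim(), id);
    }
  }
  return minted.map(({ spec, id }) => ({
    id,
    title: spec.title,
    dependsOn: Array.isArray(spec.dependsOn)
      ? spec.dependsOn.map((d) => refToId.get(d) ?? d)
      : [],
  }));
}

const resolved = resolveBatch([
  { ref: "schema", title: "Design the schema" },
  { title: "Write the migration", dependsOn: ["schema", "TK-EXISTING"] },
]);
console.log(resolved[1].dependsOn); // [ 'TK-0001', 'TK-EXISTING' ]
```

Because all ids exist before any `dependsOn` is rewritten, a task may depend on a sibling that appears later in the array.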
3492
|
+
|
|
3493
|
+
async function proposeTheme(args, projected, wsId) {
|
|
3091
3494
|
const nameErr = validateName(args.name, 6);
|
|
3092
3495
|
if (nameErr) return errorResult(nameErr);
|
|
3093
3496
|
|
|
3094
3497
|
const name = cleanText(args.name);
|
|
3498
|
+
const description = cleanText(args.description);
|
|
3499
|
+
|
|
3500
|
+
// ── Sprawl control (always on, independent of autonomy) ──────────
|
|
3501
|
+
// Refuse a near-duplicate of an existing active theme. This is the
|
|
3502
|
+
// server-side replacement for the human gate: instead of asking a
|
|
3503
|
+
// person every time, we only stop the agent when it's about to mint
|
|
3504
|
+
// a theme that overlaps one that already exists. Reuse/update beats
|
|
3505
|
+
// a sibling. force:true is the deliberate override.
|
|
3506
|
+
const activeThemes = (projected?.themes ?? []).filter((t) => !t.archived);
|
|
3507
|
+
const proposedTokens = tokenize(`${name} ${description ?? ""}`);
|
|
3508
|
+
let nearest = null;
|
|
3509
|
+
let nearestScore = 0;
|
|
3510
|
+
for (const t of activeThemes) {
|
|
3511
|
+
const s = jaccardScore(proposedTokens, tokenize(`${t.name} ${t.description ?? ""}`));
|
|
3512
|
+
if (s > nearestScore) {
|
|
3513
|
+
nearestScore = s;
|
|
3514
|
+
nearest = t;
|
|
3515
|
+
}
|
|
3516
|
+
}
|
|
3517
|
+
if (nearest && nearestScore >= THEME_SPRAWL_BLOCK && args.force !== true) {
|
|
3518
|
+
return textResult(
|
|
3519
|
+
JSON.stringify(
|
|
3520
|
+
{
|
|
3521
|
+
error: "too_similar",
|
|
3522
|
+
message:
|
|
3523
|
+
`"${name}" overlaps the existing theme ${nearest.id} (${nearest.name}) ` +
|
|
3524
|
+
`at ${nearestScore.toFixed(2)} (block bar ${THEME_SPRAWL_BLOCK}). Themes are the ` +
|
|
3525
|
+
"small, years-stable top tier — a near-duplicate fragments the strategic view. " +
|
|
3526
|
+
"Reuse it: file your work as a capability under it (propose_capability with " +
|
|
3527
|
+
`pillarId: "${nearest.id}"), or broaden its scope with update_theme. If this is ` +
|
|
3528
|
+
"genuinely a distinct strategic pillar, retry with force:true.",
|
|
3529
|
+
nearestTheme: { id: nearest.id, name: nearest.name, score: Number(nearestScore.toFixed(3)) },
|
|
3530
|
+
fix: `propose_capability({ pillarId: "${nearest.id}", ... })`,
|
|
3531
|
+
},
|
|
3532
|
+
null,
|
|
3533
|
+
2
|
|
3534
|
+
),
|
|
3535
|
+
{ isError: true }
|
|
3536
|
+
);
|
|
3537
|
+
}
|
|
3538
|
+
|
|
3539
|
+
// ── Autonomy gate ────────────────────────────────────────────────
|
|
3540
|
+
// Default ON: agents create themes without confirmation. A workspace
|
|
3541
|
+
// that flips agent_theme_autonomy OFF re-imposes a human checkpoint —
|
|
3542
|
+
// propose_theme then refuses until the caller passes confirm:true
|
|
3543
|
+
// (the agent's signal that it surfaced the new theme to the user and
|
|
3544
|
+
// got an explicit yes). The sprawl block above still applies either way.
|
|
3545
|
+
const autonomy = projected?.settings?.agentThemeAutonomy !== false;
|
|
3546
|
+
if (!autonomy && args.confirm !== true && !args.dryRun) {
|
|
3547
|
+
return textResult(
|
|
3548
|
+
JSON.stringify(
|
|
3549
|
+
{
|
|
3550
|
+
error: "confirmation_required",
|
|
3551
|
+
message:
|
|
3552
|
+
`This workspace has agent theme-autonomy turned OFF, so a new theme ("${name}") ` +
|
|
3553
|
+
"needs explicit human sign-off. Surface the proposed theme to the user; if they " +
|
|
3554
|
+
"approve, retry with confirm:true. Otherwise file the work under an existing theme.",
|
|
3555
|
+
...(nearest
|
|
3556
|
+
? { closestExisting: { id: nearest.id, name: nearest.name, score: Number(nearestScore.toFixed(3)) } }
|
|
3557
|
+
: {}),
|
|
3558
|
+
fix: "propose_theme({ ...same args, confirm: true })",
|
|
3559
|
+
},
|
|
3560
|
+
null,
|
|
3561
|
+
2
|
|
3562
|
+
),
|
|
3563
|
+
{ isError: true }
|
|
3564
|
+
);
|
|
3565
|
+
}
|
|
3566
|
+
|
|
3095
3567
|
const id = randomThemeId();
|
|
3096
3568
|
const theme = {
|
|
3097
3569
|
id,
|
|
3098
3570
|
name,
|
|
3099
|
-
description
|
|
3571
|
+
description,
|
|
3100
3572
|
color: args.color || "#6366f1", // brand-indigo default; user can change
|
|
3101
3573
|
...(typeof args.targetRoi === "number" ? { targetRoi: args.targetRoi } : {}),
|
|
3102
3574
|
};
|
|
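The order of the two gates in `proposeTheme` matters: the sprawl block runs first and applies even to dry runs, while the autonomy gate is skipped on dry runs. The sketch below models only that ordering. `BLOCK_BAR` is an assumed value (the diff references `THEME_SPRAWL_BLOCK` but does not show it), and `themeDecision` is a hypothetical helper, not a function in the package.

```javascript
// Assumed value; the real THEME_SPRAWL_BLOCK constant is not shown in this diff.
const BLOCK_BAR = 0.6;

// Returns which branch propose_theme would take for a given best-overlap
// score and set of flags. Autonomy defaults to ON, as in the diff.
function themeDecision({
  nearestScore = 0,
  autonomy = true,
  force = false,
  confirm = false,
  dryRun = false,
}) {
  // 1. Sprawl control: always on, independent of autonomy and dryRun.
  if (nearestScore >= BLOCK_BAR && force !== true) return "too_similar";
  // 2. Autonomy gate: only when the workspace switched autonomy off.
  if (!autonomy && confirm !== true && !dryRun) return "confirmation_required";
  // 3. Otherwise the create (or its dry-run preview) proceeds.
  return dryRun ? "preview" : "create";
}

console.log(themeDecision({ nearestScore: 0.75, dryRun: true }));              // too_similar
console.log(themeDecision({ nearestScore: 0.75, force: true, dryRun: true })); // preview
console.log(themeDecision({ autonomy: false }));                               // confirmation_required
console.log(themeDecision({ autonomy: false, confirm: true }));                // create
```

The 0.75 score matches what a plain token-set Jaccard would give for "Data Intelligence Platform Core" against "Data Intelligence Platform" (three shared tokens out of four), which is the pair the selftests use.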
@@ -3461,18 +3933,25 @@ function suggestThemeFor(args, projected) {
|
|
|
3461
3933
|
score: Number(score.toFixed(3)),
|
|
3462
3934
|
}));
|
|
3463
3935
|
|
|
3464
|
-
//
|
|
3465
|
-
//
|
|
3936
|
+
// Autonomy-aware guidance. With agent_theme_autonomy ON (default),
|
|
3937
|
+
// the agent may create a theme on a weak/no match WITHOUT asking —
|
|
3938
|
+
// the server's too_similar block in propose_theme is the sprawl
|
|
3939
|
+
// guard, not a human checkpoint. With it OFF, fall back to the old
|
|
3940
|
+
// "confirm with the user first" framing.
|
|
3941
|
+
const autonomy = projected?.settings?.agentThemeAutonomy !== false;
|
|
3466
3942
|
const topScore = ranked[0]?.score ?? 0;
|
|
3467
3943
|
const meta =
|
|
3468
3944
|
topScore < 0.4
|
|
3469
3945
|
? {
|
|
3470
3946
|
_meta: {
|
|
3471
3947
|
roadmapper: {
|
|
3472
|
-
reminder:
|
|
3473
|
-
ranked.length === 0
|
|
3474
|
-
? "No existing theme overlaps
|
|
3475
|
-
: "No strong match (top score < 0.4).
|
|
3948
|
+
reminder: autonomy
|
|
3949
|
+
? ranked.length === 0
|
|
3950
|
+
? "No existing theme overlaps. Theme-autonomy is ON, so you may call propose_theme directly if this is a genuinely new strategic pillar — the server will refuse it only if it's a near-duplicate of an existing theme."
|
|
3951
|
+
: "No strong match (top score < 0.4). Prefer the closest existing theme if it fits; otherwise propose_theme is fine (autonomy is ON, sprawl is guarded server-side)."
|
|
3952
|
+
: ranked.length === 0
|
|
3953
|
+
? "No existing theme overlaps. Theme-autonomy is OFF for this workspace — verify with the user that this is a genuinely new strategic direction before propose_theme, and pass confirm:true."
|
|
3954
|
+
: "No strong match (top score < 0.4). Re-using a 'close-enough' theme is almost always right; theme-autonomy is OFF, so confirm with the user before propose_theme.",
|
|
3476
3955
|
},
|
|
3477
3956
|
},
|
|
3478
3957
|
}
|
|
@@ -3483,13 +3962,18 @@ function suggestThemeFor(args, projected) {
|
|
|
3483
3962
|
{
|
|
3484
3963
|
ok: true,
|
|
3485
3964
|
query: desc,
|
|
3965
|
+
themeAutonomy: autonomy,
|
|
3486
3966
|
matches: ranked,
|
|
3487
3967
|
hint:
|
|
3488
3968
|
ranked.length === 0
|
|
3489
|
-
?
|
|
3969
|
+
? autonomy
|
|
3970
|
+
? "No existing theme overlaps. propose_theme is appropriate if this is a distinct strategic pillar — autonomy is on; the server blocks only near-duplicates."
|
|
3971
|
+
: "No existing theme overlaps. propose_theme needs explicit user confirmation (autonomy off): pass confirm:true once the user approves."
|
|
3490
3972
|
: ranked[0].score > 0.4
|
|
3491
3973
|
? `Strong match: ${ranked[0].id} (${ranked[0].name}). Attach capabilities under this theme instead of creating a new one.`
|
|
3492
|
-
:
|
|
3974
|
+
: autonomy
|
|
3975
|
+
? `Weak overlap. The top match is often closer than it scores — prefer it if it fits; otherwise propose_theme is fine (sprawl guarded server-side).`
|
|
3976
|
+
: `Weak overlap. Prefer the top match over a new theme unless the user explicitly asks for a new strategic direction (autonomy off).`,
|
|
3493
3977
|
},
|
|
3494
3978
|
null,
|
|
3495
3979
|
2
|
|
@@ -4152,6 +4636,86 @@ function detectCapabilityGaps(args, projected) {
|
|
|
4152
4636
|
);
|
|
4153
4637
|
}
|
|
4154
4638
|
|
|
4639
|
+
/**
|
|
4640
|
+
* detect_theme_sprawl — the consolidation companion to
|
|
4641
|
+
* agent_theme_autonomy. Autonomy lets agents mint themes freely (with
|
|
4642
|
+
* the per-create too_similar block as a guard); over time, two themes
|
|
4643
|
+
* created from different sessions can still drift toward overlap. This
|
|
4644
|
+
* surfaces those pairs so a human can merge them.
|
|
4645
|
+
*
|
|
4646
|
+
* O(n^2) over active themes — fine; themes are the small top tier
|
|
4647
|
+
* (tens, not thousands). Deterministic: stable id sort, never random.
|
|
4648
|
+
*/
|
|
4649
|
+
function detectThemeSprawl(args, projected) {
|
|
4650
|
+
const threshold =
|
|
4651
|
+
typeof args?.threshold === "number" && Number.isFinite(args.threshold)
|
|
4652
|
+
? Math.min(1, Math.max(0, args.threshold))
|
|
4653
|
+
: THEME_SPRAWL_WARN;
|
|
4654
|
+
const includeArchived = args?.includeArchived === true;
|
|
4655
|
+
|
|
4656
|
+
const themes = (projected.themes ?? [])
|
|
4657
|
+
.filter((t) => includeArchived || !t.archived)
|
|
4658
|
+
.slice()
|
|
4659
|
+
.sort((a, b) => String(a.id).localeCompare(String(b.id)));
|
|
4660
|
+
|
|
4661
|
+
const capCountByTheme = new Map();
|
|
4662
|
+
for (const c of projected.capabilities ?? []) {
|
|
4663
|
+
if (c.archived) continue;
|
|
4664
|
+
capCountByTheme.set(c.pillarId, (capCountByTheme.get(c.pillarId) ?? 0) + 1);
|
|
4665
|
+
}
|
|
4666
|
+
|
|
4667
|
+
const toks = themes.map((t) => tokenize(`${t.name} ${t.description ?? ""}`));
|
|
4668
|
+
const pairs = [];
|
|
4669
|
+
for (let i = 0; i < themes.length; i++) {
|
|
4670
|
+
for (let j = i + 1; j < themes.length; j++) {
|
|
4671
|
+
const score = jaccardScore(toks[i], toks[j]);
|
|
4672
|
+
if (score < threshold) continue;
|
|
4673
|
+
// Suggest merging the lighter theme (fewer capabilities) INTO the
|
|
4674
|
+
// heavier one — the smaller bet is the cheaper thing to re-parent.
|
|
4675
|
+
const a = themes[i], b = themes[j];
|
|
4676
|
+
const aCaps = capCountByTheme.get(a.id) ?? 0;
|
|
4677
|
+
const bCaps = capCountByTheme.get(b.id) ?? 0;
|
|
4678
|
+
const [keep, fold] = aCaps >= bCaps ? [a, b] : [b, a];
|
|
4679
|
+
const foldCaps = keep === a ? bCaps : aCaps;
|
|
4680
|
+
pairs.push({
|
|
4681
|
+
score: Number(score.toFixed(3)),
|
|
4682
|
+
themes: [
|
|
4683
|
+
{ id: a.id, name: a.name, capabilities: aCaps },
|
|
4684
|
+
{ id: b.id, name: b.name, capabilities: bCaps },
|
|
4685
|
+
],
|
|
4686
|
+
suggestion:
|
|
4687
|
+
foldCaps > 0
|
|
4688
|
+
? `Likely duplicate. To consolidate: move_capabilities the ${foldCaps} capabilit${foldCaps === 1 ? "y" : "ies"} under ${fold.id} (${fold.name}) to ${keep.id} (${keep.name}), then archive_theme ${fold.id}.`
|
|
4689
|
+
: `Likely duplicate. ${fold.id} (${fold.name}) has no capabilities — archive_theme it and keep ${keep.id} (${keep.name}).`,
|
|
4690
|
+
});
|
|
4691
|
+
}
|
|
4692
|
+
}
|
|
4693
|
+
pairs.sort((x, y) => y.score - x.score);
|
|
4694
|
+
|
|
4695
|
+
const meta =
|
|
4696
|
+
pairs.length > 0
|
|
4697
|
+
? {
|
|
4698
|
+
_meta: {
|
|
4699
|
+
roadmapper: {
|
|
4700
|
+
reminder:
|
|
4701
|
+
`${pairs.length} theme pair(s) overlap at/above ${threshold} — candidate duplicates. ` +
|
|
4702
|
+
"Themes are the years-stable top tier; consolidating keeps the strategic view legible. A human should confirm each merge.",
|
|
4703
|
+
},
|
|
4704
|
+
},
|
|
4705
|
+
}
|
|
4706
|
+
: undefined;
|
|
4707
|
+
|
|
4708
|
+
return textResult(
|
|
4709
|
+
JSON.stringify({
|
|
4710
|
+
themesScanned: themes.length,
|
|
4711
|
+
threshold,
|
|
4712
|
+
sprawlPairCount: pairs.length,
|
|
4713
|
+
pairs,
|
|
4714
|
+
}),
|
|
4715
|
+
meta
|
|
4716
|
+
);
|
|
4717
|
+
}
|
|
4718
|
+
|
|
4155
4719
|
async function submitAcceptanceGrades(args, projected, wsId) {
|
|
4156
4720
|
const task = projected.tasks.find((t) => t.id === args.taskId);
|
|
4157
4721
|
if (!task) return errorResult(`Task ${args.taskId} not found.`);
|
|
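The pairwise scan in `detectThemeSprawl` relies on `tokenize` and `jaccardScore`, neither of which appears in this diff. The sketch below shows one plausible shape for them and for the scan itself. The tokenizer here (lowercase, split on non-alphanumerics) is an assumption, and the real one may also drop stop words or stem. `findOverlaps` is a hypothetical name, and the keep/fold suggestion logic is omitted.

```javascript
// Assumed tokenizer: lowercase, split on non-alphanumerics, drop empties.
function tokenize(text) {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Jaccard similarity |A ∩ B| / |A ∪ B|, defined as 0 for two empty sets.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const tok of a) if (b.has(tok)) shared += 1;
  return shared / (a.size + b.size - shared);
}

// Score every unordered pair once; keep those at or above the threshold,
// highest score first.
function findOverlaps(themes, threshold) {
  const toks = themes.map((t) => tokenize(`${t.name} ${t.description ?? ""}`));
  const pairs = [];
  for (let i = 0; i < themes.length; i++) {
    for (let j = i + 1; j < themes.length; j++) {
      const score = jaccard(toks[i], toks[j]);
      if (score >= threshold) {
        pairs.push({ ids: [themes[i].id, themes[j].id], score: Number(score.toFixed(3)) });
      }
    }
  }
  return pairs.sort((x, y) => y.score - x.score);
}

const overlaps = findOverlaps(
  [
    { id: "TH-A", name: "Data Intelligence" },
    { id: "TH-B", name: "Data Intelligence Platform" },
    { id: "TH-C", name: "Customer Onboarding Automation" },
  ],
  0.34 // the documented default for detect_theme_sprawl
);
console.log(overlaps); // one pair: TH-A / TH-B at 0.667
```

With n themes this performs n(n-1)/2 comparisons, which stays cheap as long as themes remain the small top tier the doc comment describes.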
@@ -4734,6 +5298,147 @@ async function runSelftest() {
|
|
|
4734
5298
|
},
|
|
4735
5299
|
pass: (r) => r?.themesListedAt !== null && r?.capsDiscoveredAt !== null,
|
|
4736
5300
|
},
|
|
5301
|
+
{
|
|
5302
|
+
// Sprawl control: a theme that overlaps an existing one above the
|
|
5303
|
+
// block bar is refused with too_similar, naming the match — even
|
|
5304
|
+
// on dryRun (the guard runs before the write/preview).
|
|
5305
|
+
name: "propose_theme blocks a near-duplicate theme (too_similar)",
|
|
5306
|
+
fn: () =>
|
|
5307
|
+
proposeTheme(
|
|
5308
|
+
{ name: "Data Intelligence Platform Core", dryRun: true },
|
|
5309
|
+
{
|
|
5310
|
+
themes: [
|
|
5311
|
+
{ id: "TH-DUP", name: "Data Intelligence Platform", description: "" },
|
|
5312
|
+
],
|
|
5313
|
+
settings: { agentThemeAutonomy: true },
|
|
5314
|
+
},
|
|
5315
|
+
"ws-test"
|
|
5316
|
+
),
|
|
5317
|
+
pass: (r) => {
|
|
5318
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5319
|
+
return t.includes("too_similar") && t.includes("TH-DUP");
|
|
5320
|
+
},
|
|
5321
|
+
},
|
|
5322
|
+
{
|
|
5323
|
+
// A distinct theme passes the sprawl guard, and with autonomy ON
|
|
5324
|
+
// (default) sails through to the (dryRun) create — no confirmation.
|
|
5325
|
+
name: "propose_theme allows a distinct theme when autonomy is on",
|
|
5326
|
+
fn: () =>
|
|
5327
|
+
proposeTheme(
|
|
5328
|
+
{ name: "Customer Onboarding Automation", dryRun: true },
|
|
5329
|
+
{
|
|
5330
|
+
themes: [
|
|
5331
|
+
{ id: "TH-DUP", name: "Data Intelligence Platform", description: "" },
|
|
5332
|
+
],
|
|
5333
|
+
settings: { agentThemeAutonomy: true },
|
|
5334
|
+
},
|
|
5335
|
+
"ws-test"
|
|
5336
|
+
),
|
|
5337
|
+
pass: (r) => {
|
|
5338
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5339
|
+
return t.includes("\"ok\": true") && t.includes("wouldCreate") && !t.includes("too_similar");
|
|
5340
|
+
},
|
|
5341
|
+
},
|
|
5342
|
+
{
|
|
5343
|
+
// force:true overrides a too_similar block for the rare genuine
|
|
5344
|
+
// false positive.
|
|
5345
|
+
name: "propose_theme force:true overrides the sprawl block",
|
|
5346
|
+
fn: () =>
|
|
5347
|
+
proposeTheme(
|
|
5348
|
+
{ name: "Data Intelligence Platform Core", force: true, dryRun: true },
|
|
5349
|
+
{
|
|
5350
|
+
themes: [
|
|
5351
|
+
{ id: "TH-DUP", name: "Data Intelligence Platform", description: "" },
|
|
5352
|
+
],
|
|
5353
|
+
settings: { agentThemeAutonomy: true },
|
|
5354
|
+
},
|
|
5355
|
+
"ws-test"
|
|
5356
|
+
),
|
|
5357
|
+
pass: (r) => {
|
|
5358
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5359
|
+
return t.includes("wouldCreate") && !t.includes("too_similar");
|
|
5360
|
+
},
|
|
5361
|
+
},
|
|
5362
|
+
{
|
|
5363
|
+
// With autonomy OFF, a brand-new theme needs confirm:true — the
|
|
5364
|
+
// server returns confirmation_required until the human signs off.
|
|
5365
|
+
name: "propose_theme requires confirm when autonomy is off",
|
|
5366
|
+
fn: () =>
|
|
5367
|
+
proposeTheme(
|
|
5368
|
+
{ name: "Brand New Distinct Strategic Pillar" },
|
|
5369
|
+
{ themes: [], settings: { agentThemeAutonomy: false } },
|
|
5370
|
+
"ws-test"
|
|
5371
|
+
),
|
|
5372
|
+
pass: (r) => {
|
|
5373
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5374
|
+
return t.includes("confirmation_required") && t.includes("confirm");
|
|
5375
|
+
},
|
|
5376
|
+
},
|
|
5377
|
+
{
|
|
5378
|
+
// detect_theme_sprawl surfaces overlapping existing themes as
|
|
5379
|
+
// consolidation candidates.
|
|
5380
|
+
name: "detect_theme_sprawl flags overlapping themes",
|
|
5381
|
+
fn: () =>
|
|
5382
|
+
detectThemeSprawl(
|
|
5383
|
+
{},
|
|
5384
|
+
{
|
|
5385
|
+
themes: [
|
|
5386
|
+
{ id: "TH-A", name: "Data Intelligence", description: "" },
|
|
5387
|
+
{ id: "TH-B", name: "Data Intelligence Platform", description: "" },
|
|
5388
|
+
],
|
|
5389
|
+
capabilities: [],
|
|
5390
|
+
}
|
|
5391
|
+
),
|
|
5392
|
+
pass: (r) => {
|
|
5393
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5394
|
+
return t.includes("\"sprawlPairCount\":1") || (t.includes("TH-A") && t.includes("TH-B"));
|
|
5395
|
+
},
|
|
5396
|
+
},
|
|
5397
|
+
{
|
|
5398
|
+
// propose_tasks bulk: dryRun previews all rows + mints an id each.
|
|
5399
|
+
name: "propose_tasks bulk previews the whole batch (dryRun)",
|
|
5400
|
+
fn: () =>
|
|
5401
|
+
proposeTasks(
|
|
5402
|
+
{
|
|
5403
|
+
capabilityId: "CAP-1",
|
|
5404
|
+
dryRun: true,
|
|
5405
|
+
tasks: [
|
|
5406
|
+
{ ref: "a", title: "First bulk task here", effort: "M" },
|
|
5407
|
+
{ title: "Second bulk task here", effort: "S", dependsOn: ["a"] },
|
|
5408
|
+
],
|
|
5409
|
+
},
|
|
5410
|
+
{ capabilities: [{ id: "CAP-1", name: "Test Cap" }], themes: [], tasks: [] },
|
|
5411
|
+
"ws-test"
|
|
5412
|
+
),
|
|
5413
|
+
pass: (r) => {
|
|
5414
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5415
|
+
try {
|
|
5416
|
+
const j = JSON.parse(t);
|
|
5417
|
+
return j.dryRun === true && Array.isArray(j.wouldCreate) && j.wouldCreate.length === 2;
|
|
5418
|
+
} catch {
|
|
5419
|
+
return false;
|
|
5420
|
+
}
|
|
5421
|
+
},
|
|
5422
|
+
},
|
|
5423
|
+
{
|
|
5424
|
+
// propose_tasks rejects the whole batch on a per-row validation
|
|
5425
|
+
// error (missing effort), naming the offending index.
|
|
5426
|
+
name: "propose_tasks rejects a batch with a missing-effort row",
|
|
5427
|
+
fn: () =>
|
|
5428
|
+
proposeTasks(
|
|
5429
|
+
{
|
|
5430
|
+
capabilityId: "CAP-1",
|
|
5431
|
+
tasks: [{ title: "No effort on this task" }],
|
|
5432
|
+
},
|
|
5433
|
+
{ capabilities: [{ id: "CAP-1", name: "Test Cap" }], themes: [], tasks: [] },
|
|
5434
|
+
"ws-test"
|
|
5435
|
+
),
|
|
5436
|
+
pass: (r) => {
|
|
5437
|
+
if (!r?.isError) return false;
|
|
5438
|
+
const t = r?.content?.[0]?.text ?? "";
|
|
5439
|
+
return t.includes("tasks[0]") && t.includes("effort");
|
|
5440
|
+
},
|
|
5441
|
+
},
|
|
4737
5442
|
{
|
|
4738
5443
|
name: "resources/list returns the three resources",
|
|
4739
5444
|
fn: () => handle({ id: 12, method: "resources/list", params: {} }),
|
|
@@ -6400,6 +7105,161 @@ async function runSelftest() {
|
|
|
6400
7105
|
}
|
|
6401
7106
|
},
|
|
6402
7107
|
},
|
|
7108
|
+
{
|
|
7109
|
+
// Repo-link gate: a mutator in an UNMAPPED git repo (slug resolves
|
|
7110
|
+
// but no repo_workspace_map row, so resolution falls to env source)
|
|
7111
|
+
// is blocked with repo_unmapped naming the slug + the link_repo fix.
|
|
7112
|
+
name: "mutator blocked when in an unmapped repo (would hit env default)",
|
|
7113
|
+
fn: async () => {
|
|
7114
|
+
const savedWs = process.env.ROADMAPPER_WORKSPACE_ID;
|
|
7115
|
+
const savedKey = process.env.ROADMAPPER_API_KEY;
|
|
7116
|
+
const savedUrl = process.env.ROADMAPPER_BACKEND_URL;
|
|
7117
|
+
try {
|
|
7118
|
+
// Writes must be enabled or the gate defers to set_credentials.
|
|
7119
|
+
process.env.ROADMAPPER_API_KEY = "rmpr_selftest";
|
|
7120
|
+
process.env.ROADMAPPER_BACKEND_URL = "https://selftest.local";
|
|
7121
|
+
process.env.ROADMAPPER_WORKSPACE_ID = "ws-envdefault";
|
|
7122
|
+
session.rubricFetchedAt = Date.now(); // past the rubric gate
|
|
7123
|
+
_clientRoots = ["/tmp/unmapped"];
|
|
7124
|
+
__setRepoSlugForTest("acme/unmapped");
|
|
7125
|
+
__setRootWorkspaceForTest(null); // no repo_workspace_map hit → env source
|
|
7126
|
+
return await handle({
|
|
7127
|
+
id: 94,
|
|
7128
|
+
method: "tools/call",
|
|
7129
|
+
params: {
|
|
7130
|
+
name: "archive_task",
|
|
7131
|
+
arguments: { taskId: aTask, reason: "unmapped-repo probe" },
|
|
7132
|
+
},
|
|
7133
|
+
});
|
|
7134
|
+
} finally {
|
|
7135
|
+
__setRepoSlugForTest(undefined);
|
|
7136
|
+
__setRootWorkspaceForTest(undefined);
|
|
7137
|
+
_clientRoots = [];
|
|
7138
|
+
if (savedWs === undefined) delete process.env.ROADMAPPER_WORKSPACE_ID;
|
|
7139
|
+
else process.env.ROADMAPPER_WORKSPACE_ID = savedWs;
|
|
7140
|
+
if (savedKey === undefined) delete process.env.ROADMAPPER_API_KEY;
|
|
7141
|
+
else process.env.ROADMAPPER_API_KEY = savedKey;
|
|
7142
|
+
if (savedUrl === undefined) delete process.env.ROADMAPPER_BACKEND_URL;
|
|
7143
|
+
else process.env.ROADMAPPER_BACKEND_URL = savedUrl;
|
|
7144
|
+
}
|
|
7145
|
+
},
|
|
7146
|
+
pass: (r) => {
|
|
7147
|
+
try {
|
|
7148
|
+
const out = JSON.parse(r?.result?.content?.[0]?.text ?? "{}");
|
|
7149
|
+
return (
|
|
7150
|
+
out.error === "repo_unmapped" &&
|
|
7151
|
+
out.repo === "acme/unmapped" &&
|
|
7152
|
+
out.fix === "link_repo()" &&
|
|
7153
|
+
out.envDefaultWorkspace === "ws-envdefault"
|
|
7154
|
+
);
|
|
7155
|
+
} catch {
|
|
7156
|
+
return false;
|
|
7157
|
+
}
|
|
7158
|
+
},
|
|
7159
|
+
},
|
|
7160
|
+
{
|
|
7161
|
+
// ESCAPE HATCH 1 (the multi-repo case): an explicit workspaceId arg
|
|
7162
|
+
// means the caller is intentionally targeting a workspace — the gate
|
|
7163
|
+
// must NOT fire even in an unmapped repo. Proves a developer juggling
|
|
7164
|
+
// several repos in one chat is never bricked: pass workspaceId and the
|
|
7165
|
+
// write proceeds (lands downstream on the missing-service-key error in
|
|
7166
|
+
// selftest, NOT the repo_unmapped block — that's the assertion).
|
|
7167
|
+
name: "repo-link gate skipped when workspaceId passed explicitly",
|
|
7168
|
+
fn: async () => {
|
|
7169
|
+
try {
|
|
7170
|
+
session.rubricFetchedAt = Date.now();
|
|
7171
|
+
_clientRoots = ["/tmp/unmapped"];
|
|
7172
|
+
__setRepoSlugForTest("acme/unmapped");
|
|
7173
|
+
__setRootWorkspaceForTest(null);
|
|
7174
|
+
return await handle({
|
|
7175
|
+
id: 95,
|
|
7176
|
+
method: "tools/call",
|
|
7177
|
+
params: {
|
|
7178
|
+
name: "archive_task",
|
|
7179
|
+
arguments: {
|
|
7180
|
+
taskId: aTask,
|
|
7181
|
+
reason: "explicit-ws probe",
|
|
7182
|
+
workspaceId: "ws-explicit",
|
|
7183
|
+
},
|
|
7184
|
+
},
|
|
7185
|
+
});
|
|
7186
|
+
} finally {
|
|
7187
|
+
__setRepoSlugForTest(undefined);
|
|
7188
|
+
__setRootWorkspaceForTest(undefined);
|
|
7189
|
+
_clientRoots = [];
|
|
7190
|
+
}
|
|
7191
|
+
},
|
|
7192
|
+
pass: (r) => {
|
|
7193
|
+
// Must be an error result (no service key downstream) but NOT the
|
|
7194
|
+
// repo_unmapped block — proves the gate let the explicit target through.
|
|
7195
|
+
if (!r?.result?.isError) return false;
|
|
7196
|
+
const txt = r.result.content?.[0]?.text ?? "";
|
|
7197
|
+
return !txt.includes("repo_unmapped");
|
|
7198
|
+
},
|
|
7199
|
+
},
|
|
7200
|
+
{
|
|
7201
|
+
// ESCAPE HATCH 2: a MAPPED repo (resolution returns source "repo")
|
|
7202
|
+
// never trips the gate — the whole point. Seeding a root workspace
|
|
7203
|
+
// makes resolveWorkspaceWithSource return source:"repo", not "env".
|
|
7204
|
+
name: "repo-link gate skipped when repo IS mapped (source=repo)",
|
|
7205
|
+
fn: async () => {
|
|
7206
|
+
try {
|
|
7207
|
+
session.rubricFetchedAt = Date.now();
|
|
7208
|
+
_clientRoots = ["/tmp/mapped"];
|
|
7209
|
+
__setRepoSlugForTest("acme/mapped");
|
|
7210
|
+
__setRootWorkspaceForTest("ws-mapped", "acme/mapped"); // mapped → source "repo"
|
|
7211
|
+
return await handle({
|
|
7212
|
+
id: 96,
|
|
7213
|
+
method: "tools/call",
|
|
7214
|
+
params: {
|
|
7215
|
+
name: "archive_task",
|
|
7216
|
+
arguments: { taskId: aTask, reason: "mapped-repo probe" },
|
|
7217
|
+
},
|
|
7218
|
+
});
|
|
7219
|
+
} finally {
|
|
7220
|
+
__setRepoSlugForTest(undefined);
|
|
7221
|
+
__setRootWorkspaceForTest(undefined);
|
|
7222
|
+
_clientRoots = [];
|
|
7223
|
+
}
|
|
7224
|
+
},
|
|
7225
|
+
pass: (r) => {
|
|
7226
|
+
if (!r?.result?.isError) return false;
|
|
7227
|
+
const txt = r.result.content?.[0]?.text ?? "";
|
|
7228
|
+
return !txt.includes("repo_unmapped");
|
|
7229
|
+
},
|
|
7230
|
+
},
|
|
7231
|
+
{
|
|
7232
|
+
// ESCAPE HATCH 3: not in a git repo at all (no client roots) — nothing
|
|
7233
|
+
// to link, so the gate must fall through to the env default rather than
|
|
7234
|
+
// deadlock. Asserts NOT repo_unmapped.
|
|
7235
|
+
name: "repo-link gate skipped when not in a git repo (no deadlock)",
|
|
7236
|
+
fn: async () => {
|
|
7237
|
+
const savedWs = process.env.ROADMAPPER_WORKSPACE_ID;
|
|
7238
|
+
try {
|
|
7239
|
+
process.env.ROADMAPPER_WORKSPACE_ID = "ws-envdefault";
|
|
7240
|
+
session.rubricFetchedAt = Date.now();
|
|
7241
|
+
_clientRoots = []; // not in a repo
|
|
7242
|
+
__setRootWorkspaceForTest(null);
|
|
7243
|
+
return await handle({
|
|
7244
|
+
id: 97,
|
|
7245
|
+
method: "tools/call",
|
|
7246
|
+
params: {
|
|
7247
|
+
name: "archive_task",
|
|
7248
|
+
arguments: { taskId: aTask, reason: "no-repo probe" },
|
|
7249
|
+
},
|
|
7250
|
+
});
|
|
7251
|
+
} finally {
|
|
7252
|
+
__setRootWorkspaceForTest(undefined);
|
|
7253
|
+
if (savedWs === undefined) delete process.env.ROADMAPPER_WORKSPACE_ID;
|
|
7254
|
+
else process.env.ROADMAPPER_WORKSPACE_ID = savedWs;
|
|
7255
|
+
}
|
|
7256
|
+
},
|
|
7257
|
+
pass: (r) => {
|
|
7258
|
+
if (!r?.result?.isError) return false;
|
|
7259
|
+
const txt = r.result.content?.[0]?.text ?? "";
|
|
7260
|
+
return !txt.includes("repo_unmapped");
|
|
7261
|
+
},
|
|
7262
|
+
},
|
|
6403
7263
|
];
|
|
6404
7264
|
|
|
6405
7265
|
let passed = 0;
|