npm - @exaudeus/workrail - Versions diffs - 3.46.0 → 3.48.0 - Mend

@exaudeus/workrail 3.46.0 → 3.48.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/dist/cli/commands/index.d.ts +1 -0
package/dist/cli/commands/index.js +3 -1
package/dist/cli/commands/worktrain-trigger-test.d.ts +21 -0
package/dist/cli/commands/worktrain-trigger-test.js +123 -0
package/dist/cli-worktrain.js +65 -0
package/dist/console-ui/assets/{index-BQFhoMcY.js → index-CecBgrR7.js} +1 -1
package/dist/console-ui/index.html +1 -1
package/dist/coordinators/modes/implement-shared.d.ts +2 -1
package/dist/coordinators/modes/implement-shared.js +7 -3
package/dist/manifest.json +44 -36
package/dist/mcp/output-schemas.d.ts +2 -2
package/dist/trigger/adapters/github-queue-poller.js +10 -7
package/dist/trigger/github-queue-config.d.ts +1 -0
package/dist/trigger/github-queue-config.js +9 -0
package/dist/trigger/polling-scheduler.js +8 -1
package/dist/trigger/trigger-listener.js +296 -1
package/dist/trigger/trigger-router.d.ts +4 -2
package/dist/trigger/trigger-router.js +19 -3
package/dist/trigger/trigger-store.js +10 -0
package/dist/trigger/types.d.ts +2 -0
package/dist/v2/durable-core/schemas/artifacts/review-verdict.d.ts +5 -0
package/dist/v2/durable-core/schemas/artifacts/review-verdict.js +12 -0
package/docs/design/connect-adaptive-dispatch-candidates.md +153 -0
package/docs/design/connect-adaptive-dispatch-design-review.md +88 -0
package/docs/design/connect-adaptive-dispatch-implementation-plan.md +209 -0
package/docs/design/queue-label-support-candidates.md +83 -0
package/docs/design/queue-label-support-design-review.md +62 -0
package/docs/design/queue-label-support-implementation-plan.md +158 -0
package/docs/ideas/backlog.md +147 -0
package/package.json +1 -1
package/workflows/mr-review-workflow.agentic.v2.json +1 -1

package/docs/design/queue-label-support-candidates.md ADDED

@@ -0,0 +1,83 @@
+# Design Candidates: Label-Based Queue Filter for github_queue_poll
+## Problem Understanding
+### Tensions
+1. **Trigger YAML config vs runtime config.json**: `queueType`/`queueLabel` appear in triggers.yml (per-trigger), but `GitHubQueueConfig` is loaded from `~/.workrail/config.json` (global daemon-level config). Both need to converge on the same data at poll time. The test surface validates at the poller level, which takes a `GitHubQueueConfig` directly - meaning the poller just needs to handle label type correctly, regardless of which source built the config.
+2. **Additive field vs type-safe discriminated union**: Adding `queueLabel?: string` (optional) to `GitHubQueueConfig` makes the illegal state (`type==='label'` without `queueLabel`) detectable only at load time. A discriminated union would catch it at compile time. The 3-file scope constraint makes the discriminated union approach infeasible.
+3. **Existing `name?` field vs new `queueLabel?` field**: `GitHubQueueConfig` already has `name?: string` (read from `q['name']` in `loadQueueConfig()`). Adding `queueLabel?` alongside it creates two optional label fields. This is a naming inconsistency from an earlier design iteration.
+### Likely Seam
+`pollGitHubQueueIssues()` in `github-queue-poller.ts` - the URL construction is where the actual behavior diverges between filter types.
+### What Makes This Hard
+- The existing test at line 126 (`returns not_implemented for non-assignee queue type`) tests that label returns `not_implemented`. After this change, label with `queueLabel` succeeds. This test must be replaced, not kept alongside new tests.
+- `polling-scheduler.ts` has a guard at line 393 that still checks `queueConfig.type !== 'assignee'`, and this file is out of scope. After our changes, the scheduler will keep blocking label-type configs loaded from config.json. Tests at the poller unit level bypass the scheduler.
+## Philosophy Constraints
+From `CLAUDE.md` and repo patterns:
+- **Result types, no throws**: all boundary functions return `Result<T, E>`. `err({ kind: 'not_implemented', ... })` is the established pattern for unimplemented branches.
+- **Validate at boundaries**: `loadQueueConfig()` must return `err` if `type === 'label'` and `queueLabel` is absent.
+- **Immutability by default**: new fields must be `readonly`.
+- **YAGNI**: no mention/query stubs, no new abstractions.
+- **Make illegal states unrepresentable**: satisfied at load time by validation; compile-time enforcement would require discriminated union (out of scope).
+No philosophy conflicts in this case.
+## Impact Surface
+- `polling-scheduler.ts` line 393: guard `queueConfig.type !== 'assignee'` remains. Out of scope. Label type from config.json will still be blocked by the scheduler even after our changes. This is an intentional stepping stone.
+- `tests/unit/github-queue-poller.test.ts` line 126: existing test must be replaced/updated (not just added to).
+- `src/trigger/types.ts`: `GitHubQueuePollingSource` needs `queueType?`/`queueLabel?` fields to store trigger YAML values. Not restricted by scope.
+## Candidates
+### Candidate 1: Additive optional field + load-time validation + poller branch (SELECTED)
+**Summary**: Add `readonly queueLabel?: string` to `GitHubQueueConfig`, validate it in `loadQueueConfig()` (err if `type==='label'` and `queueLabel` absent), add `labels=` branch in `pollGitHubQueueIssues()`, parse `queueType`/`queueLabel` in trigger-store, extend `GitHubQueuePollingSource` in types.ts.
+**Tensions resolved**: Label config is validated at load time; assignee path unchanged; no new abstractions.
+**Tension accepted**: `name?` field remains alongside `queueLabel?` (minor naming inconsistency).
+**Boundary**: `pollGitHubQueueIssues()` for URL construction; `loadQueueConfig()` for validation.
+**Failure mode**: Forgetting to update the existing `not_implemented` test for label type. The test at line 126 will fail if not replaced.
+**Repo pattern**: Follows exactly - mirrors how `user?` is handled for assignee type.
+**Gain**: Minimal change surface, no breaking changes.
+**Give up**: `name?` naming inconsistency stays.
+**Scope**: best-fit.
+**Philosophy fit**: Honors validate-at-boundaries, immutability, YAGNI, Result types.
+### Candidate 2: Discriminated union for GitHubQueueConfig (type-safe)
+**Summary**: Refactor `GitHubQueueConfig` into `AssigneeQueueConfig | LabelQueueConfig | MentionQueueConfig | QueryQueueConfig` so `queueLabel` is `string` (not optional) in `LabelQueueConfig`.
+**Tensions resolved**: Makes illegal states unrepresentable at compile time.
+**Tension accepted**: Breaks all existing callers; touches files outside scope.
+**Boundary**: Entire `GitHubQueueConfig` interface plus all callers.
+**Failure mode**: Cascading type errors in polling-scheduler.ts and tests.
+**Repo pattern**: Departs - no discriminated union for config types in this codebase.
+**Gain**: Compile-time safety.
+**Give up**: Simplicity, scope compliance.
+**Scope**: too broad.
+**Philosophy fit**: Honors make-illegal-states-unrepresentable but conflicts with YAGNI and scope constraint.
+## Comparison and Recommendation
+**Selected: Candidate 1.**
+All acceptance criteria are satisfied at minimum change surface. Load-time validation in `loadQueueConfig()` catches the illegal state (`type==='label'` without `queueLabel`) before any poll cycle runs. The `pollGitHubQueueIssues()` change is purely additive - the assignee branch is unchanged.
+## Self-Critique
+**Strongest counter-argument**: The `name?` field in `GitHubQueueConfig` was the original design for label support. Adding `queueLabel?` alongside it creates two ways to express the same concept. The cleaner fix would replace `name?` with `queueLabel?` outright, but that would be a breaking change for anyone who already reads `config.name` in code paths we're not touching. (The polling scheduler at line 471 uses `queueConfig.type` but not `queueConfig.name`.)
+**Pivot conditions**: If end-to-end testing (not just unit tests) is required, we'd need to update `polling-scheduler.ts` too. That's out of scope per the task description.
+**Invalidating assumption**: If `polling-scheduler.ts` is required for the feature to work at all, then the scope restriction is wrong. But the task explicitly says to verify with unit tests at the poller level.
+## Open Questions for the Main Agent
+1. Should the `name?` field be deprecated or removed? The task says to add `queueLabel?`, and leaving `name?` in place creates naming confusion. Recommend leaving it as-is (YAGNI, out of scope).
+2. Should `GitHubQueuePollingSource` in `types.ts` gain `queueType?`/`queueLabel?` fields? Yes - trigger-store.ts parses them from YAML and needs somewhere to store them.
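The trade-off between the two candidates in this file can be sketched in TypeScript. This is an illustration only: the `Result` helpers, the `FlatQueueConfig` and `UnionQueueConfig` names, and `validateFlatQueueConfig` are assumptions made for the example, not the package's actual source. Treating `user` as optional for the assignee type is also an assumption.

```typescript
// Hypothetical Result helpers; the package's real Result type may differ.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Candidate 1: flat interface with an optional field, checked at load time.
interface FlatQueueConfig {
  readonly type: 'assignee' | 'label';
  readonly user?: string;
  readonly queueLabel?: string;
}

function validateFlatQueueConfig(raw: Record<string, unknown>): Result<FlatQueueConfig, string> {
  const rawType = raw['type'];
  const user = raw['user'];
  const queueLabel = raw['queueLabel'];
  if (rawType === 'assignee') {
    return typeof user === 'string'
      ? ok<FlatQueueConfig>({ type: 'assignee', user })
      : ok<FlatQueueConfig>({ type: 'assignee' });
  }
  if (rawType === 'label') {
    // The illegal state (label without queueLabel) is rejected here, at the boundary.
    if (typeof queueLabel !== 'string' || queueLabel === '') {
      return err('config.queue.queueLabel is required when type is "label"');
    }
    return ok<FlatQueueConfig>({ type: 'label', queueLabel });
  }
  return err(`unsupported queue type: ${String(rawType)}`);
}

// Candidate 2: discriminated union; a label config without queueLabel does not type-check.
type UnionQueueConfig =
  | { readonly type: 'assignee'; readonly user: string }
  | { readonly type: 'label'; readonly queueLabel: string };

const labelConfig: UnionQueueConfig = { type: 'label', queueLabel: 'worktrain:ready' };
// const broken: UnionQueueConfig = { type: 'label' }; // compile error: queueLabel is missing

// A config that still uses the older `name` key fails at load time instead of silently passing.
const missing = validateFlatQueueConfig({ type: 'label', name: 'my-label' });
const present = validateFlatQueueConfig({ type: 'label', queueLabel: 'worktrain:ready' });
```

With the flat shape, the mistake surfaces when the config is loaded; with the union, it surfaces when the code is compiled, at the cost of touching every caller.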

package/docs/design/queue-label-support-design-review.md ADDED

@@ -0,0 +1,62 @@
+# Design Review: Label-Based Queue Filter for github_queue_poll
+## Tradeoff Review
+**T1: `name?` stays alongside `queueLabel?`**
+- Acceptable. Load-time validation in `loadQueueConfig()` requires `queueLabel` when `type==='label'`. A user who writes `"name": "my-label"` in config.json gets an error at load time (queueLabel absent), not a silent failure.
+- Fails if: user somehow bypasses `loadQueueConfig()` and constructs `GitHubQueueConfig` directly with `name` but not `queueLabel`. Poller returns `not_implemented` (else branch). Acceptable - direct construction is a test-only pattern.
+**T2: `polling-scheduler.ts` guard remains blocking**
+- Acceptable within task scope. Unit tests validate at the poller level. End-to-end production use requires a follow-up PR to update the scheduler guard. Document in PR description.
+- Fails if: task author requires end-to-end production functionality. Risk level: medium for production, zero for acceptance criteria.
+**T3: Optional `queueLabel?` (not required at type level)**
+- Acceptable. Mitigated by load-time validation. The `pollGitHubQueueIssues` else branch returns `not_implemented` if `queueLabel` is absent, which is correct defensive behavior.
+## Failure Mode Review
+**FM1 (highest risk): Existing test at line 126 must be replaced**
+- Test currently expects `not_implemented` for `{type: 'label', name: 'my-label'}`.
+- After change: test will fail. Must replace with: (a) label success test, (b) label missing-queueLabel error test, (c) assignee regression test.
+- Mitigation: explicit action in implementation plan.
+**FM2: URL encoding**
+- Handled automatically by `URLSearchParams.set()` (established pattern in codebase).
+**FM3: GitHub API filter behavior**
+- Not our responsibility. Tests verify the URL contains the correct `labels=` param.
+**FM4: Scheduler guard**
+- Filed as known limitation (T2 above). Document in PR.
+## Runner-Up / Simpler Alternative Review
+- Discriminated union: more type-safe but requires files outside scope. Nothing worth borrowing.
+- Skipping trigger-store changes: fails acceptance criteria. Not viable.
+- Skipping types.ts extension: TypeScript would flag unused parsed values. Necessary.
+## Philosophy Alignment
+- Result types: satisfied throughout
+- Validate at boundaries: satisfied - `loadQueueConfig()` validates label requires `queueLabel`
+- Immutability: satisfied - all new fields are `readonly`
+- YAGNI: satisfied - no new abstractions
+- Make illegal states unrepresentable: partially satisfied (load-time, not compile-time). Accepted tension.
+## Findings
+**YELLOW - polling-scheduler.ts guard**: Feature works at unit test level but not in production. The scheduler guard at line 393 (`queueConfig.type !== 'assignee'`) will still block label-type configs. This is a known, accepted, out-of-scope issue. Document in PR.
+**YELLOW - test replacement**: The test at line 126 uses `{type: 'label', name: 'my-label'}` and expects `not_implemented`. This test MUST be replaced or it will fail after the change. High likelihood of causing CI failure if missed.
+No RED findings.
+## Recommended Revisions
+1. In implementation: replace test at line 126 with three new tests (label success, label missing queueLabel, assignee regression).
+2. In PR description: note that `polling-scheduler.ts` guard is a follow-up item.
+## Residual Concerns
+- `name?` field in `GitHubQueueConfig` is now superseded by `queueLabel?`. Future cleanup: remove `name?` and update any remaining references. Out of scope for this PR.
+- No other residual concerns.
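The URL-encoding point in FM2 can be checked with a few lines of TypeScript. This is a minimal sketch: the owner and repo in the URL are placeholders, and the label containing spaces is a made-up example that only illustrates how the two encoders differ.

```typescript
// Placeholder owner/repo; only the query-string handling matters here.
const url = new URL('https://api.github.com/repos/example-owner/example-repo/issues');
url.searchParams.set('state', 'open');
url.searchParams.set('labels', 'worktrain:ready');
url.searchParams.set('per_page', '100');

// searchParams.set() percent-encodes the colon without any extra work.
const query = url.search; // "?state=open&labels=worktrain%3Aready&per_page=100"

// URLSearchParams uses form encoding, so a space becomes "+",
// whereas encodeURIComponent writes it as "%20".
const spaced = new URLSearchParams({ labels: 'good first issue' }).toString();
const manual = `labels=${encodeURIComponent('good first issue')}`;
```

The two encoders agree on `worktrain:ready`; they differ only for characters such as spaces, which is worth knowing if a test asserts on the raw query string.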

package/docs/design/queue-label-support-implementation-plan.md ADDED

@@ -0,0 +1,158 @@
+# Implementation Plan: Label-Based Queue Filter for github_queue_poll
+## Problem Statement
+The `github_queue_poll` trigger accepts `queueType: label` and `queueLabel: "worktrain:ready"` in `triggers.yml`, but these fields are silently ignored. The `GitHubQueueConfig` type allows `type: 'label'`, but the poller returns `not_implemented` for it at runtime. Label-based queue filtering lets WorkTrain pick up issues without a dedicated bot account -- any issue labeled `worktrain:ready` becomes a candidate.
+---
+## Acceptance Criteria
+1. `pollGitHubQueueIssues()` with `config.type === 'label'` and `config.queueLabel === 'worktrain:ready'` sends `GET /repos/:owner/:repo/issues?state=open&labels=worktrain%3Aready&per_page=100`
+2. `pollGitHubQueueIssues()` with `config.type === 'label'` and no `config.queueLabel` returns `err({ kind: 'not_implemented', ... })` (config validation error path)
+3. `pollGitHubQueueIssues()` with `config.type === 'assignee'` still works (regression)
+4. `loadQueueConfig()` with `type: 'label'` and no `queueLabel` field in config.json returns `err(...)` (not `ok`)
+5. `loadQueueConfig()` with `type: 'label'` and `queueLabel: 'worktrain:ready'` returns `ok(config)` with `config.queueLabel === 'worktrain:ready'`
+6. `trigger-store.ts` parses `queueType` and `queueLabel` from triggers.yml into `GitHubQueuePollingSource`
+7. `npm run build` succeeds (no TypeScript errors)
+8. `npx vitest run tests/unit/github-queue-poller.test.ts` -- all tests pass
+9. `npx vitest run` -- no regressions
+---
+## Non-Goals
+- Do not touch `src/mcp/`
+- Do not touch `polling-scheduler.ts`
+- Do not implement `mention` or `query` queue types
+- Do not refactor surrounding code
+- Do not remove the existing `name?` field from `GitHubQueueConfig`
+---
+## Philosophy-Driven Constraints
+- All boundary functions return `Result<T, E>` -- no throws
+- All new interface fields are `readonly`
+- Validation at load time (`loadQueueConfig`): err if `type==='label'` and `queueLabel` absent
+- Defensive else branch in `pollGitHubQueueIssues`: unknown types return `not_implemented`
+- `encodeURIComponent` / URLSearchParams for URL safety
+---
+## Invariants
+- I1: `pollGitHubQueueIssues` with `type='assignee'` and `config.user` sends `assignee=<user>` param (unchanged)
+- I2: `pollGitHubQueueIssues` with `type='label'` and `config.queueLabel` sends `labels=<encoded>` param
+- I3: `pollGitHubQueueIssues` with `type='label'` and no `config.queueLabel` returns `err({ kind: 'not_implemented' })`
+- I4: `pollGitHubQueueIssues` with any other type returns `err({ kind: 'not_implemented' })`
+- I5: `loadQueueConfig` with `type='label'` and no `queueLabel` key in config.json returns `err(string)`
+- I6: No throws at any boundary
+---
+## Selected Approach
+**Additive optional field + load-time validation + poller URL branch**
+Five changes:
+1. `GitHubQueueConfig` interface: add `readonly queueLabel?: string`
+2. `loadQueueConfig()`: parse `q['queueLabel']`, validate it when `type==='label'`
+3. `GitHubQueuePollingSource` (types.ts): add `readonly queueType?: string`, `readonly queueLabel?: string`
+4. `trigger-store.ts`: parse `queueType`/`queueLabel` in `ParsedTriggerRaw` + `setTriggerField()` + assembly block
+5. `pollGitHubQueueIssues()`: replace hard not_implemented guard with assignee/label/else branching
+Runner-up was discriminated union -- rejected due to scope constraint and unnecessary complexity for this bounded change.
+---
+## Vertical Slices
+### Slice 1: `github-queue-config.ts` -- add `queueLabel` field + validation
+**Files**: `src/trigger/github-queue-config.ts`
+**What**: Add `readonly queueLabel?: string` to `GitHubQueueConfig` interface. In `loadQueueConfig()`, parse `q['queueLabel']` and add validation: if `rawType === 'label'` and `!queueLabel`, return `err('config.queue.queueLabel is required when type is "label"')`. Build the return object with `queueLabel` when present.
+**Done when**: Interface has `queueLabel?`, validation fires correctly, `npm run build` clean.
+### Slice 2: `types.ts` -- extend `GitHubQueuePollingSource`
+**Files**: `src/trigger/types.ts`
+**What**: Add `readonly queueType?: string` and `readonly queueLabel?: string` to `GitHubQueuePollingSource`.
+**Done when**: Fields present in interface, build clean.
+### Slice 3: `trigger-store.ts` -- parse `queueType`/`queueLabel` from YAML
+**Files**: `src/trigger/trigger-store.ts`
+**What**:
+- Add `queueType?: string` and `queueLabel?: string` to `ParsedTriggerRaw` interface
+- Handle them in `setTriggerField()` switch
+- In the `github_queue_poll` assembly block, read `raw.queueType` and `raw.queueLabel` and include them in the `GitHubQueuePollingSource` object
+**Done when**: `triggers.yml` `self-improvement` trigger parses correctly with these fields; build clean.
+### Slice 4: `github-queue-poller.ts` -- implement label branch
+**Files**: `src/trigger/adapters/github-queue-poller.ts`
+**What**: Replace the current `if (config.type !== 'assignee') return err(not_implemented)` guard with:
+```typescript
+if (config.type === 'assignee' && config.user) {
+  url.searchParams.set('assignee', config.user);
+} else if (config.type === 'label' && config.queueLabel) {
+  url.searchParams.set('labels', config.queueLabel);
+} else {
+  return err({ kind: 'not_implemented', message: `Queue type "${config.type}" is not yet implemented` });
+}
+```
+Remove the old `if (config.user)` block that was AFTER the guard.
+Update JSDoc to reflect both supported types.
+**Done when**: Function correctly handles both assignee and label, build clean.
+### Slice 5: Tests -- update + add new tests
+**Files**: `tests/unit/github-queue-poller.test.ts`
+**What**:
+- REPLACE existing test at line 126 (`returns not_implemented for non-assignee queue type`) -- this test currently expects not_implemented for `{type: 'label', name: 'my-label'}`. It MUST be replaced, not kept.
+- ADD: `type: 'label'` with `queueLabel: 'worktrain:ready'` -> fetches with `labels=worktrain%3Aready` param
+- ADD: `type: 'label'` without `queueLabel` -> returns err with kind not_implemented (or config validation path)
+- KEEP as regression: `type: 'assignee'` test at line 105 (already tests assignee param)
+- Update `makeConfig()` helper: `name?` field can stay but add examples with `queueLabel`
+**Done when**: `npx vitest run tests/unit/github-queue-poller.test.ts` all pass.
+---
+## Test Design
+| Test | Expected behavior | Assertion |
+|------|-------------------|-----------|
+| label type + queueLabel | fetches with `labels=` param | URL contains `labels=worktrain%3Aready` |
+| label type + no queueLabel | returns not_implemented | `result.error.kind === 'not_implemented'` |
+| assignee type + user (regression) | fetches with `assignee=` param | URL contains `assignee=bob` |
+Existing tests for network_error, http_error, rate_limit, field mapping, maturity, idempotency: unchanged.
+---
+## Risk Register
+| Risk | Probability | Impact | Mitigation |
+|------|-------------|--------|------------|
+| Forgetting to replace test at line 126 | High | CI fails | Explicit: replace that test first |
+| `polling-scheduler.ts` guard still blocks end-to-end | Certain | Production feature blocked | Document in PR description as follow-up |
+| URL encoding of `:` in `worktrain:ready` | Low | Wrong API call | Use `url.searchParams.set()` which encodes automatically |
+---
+## PR Packaging Strategy
+Single PR: `feat/queue-label-support`
+Commit: `feat(trigger): implement label-based queue filter for github_queue_poll`
+PR description should note: the `polling-scheduler.ts` guard at line 393 still checks `queueConfig.type !== 'assignee'` and will need a follow-up update for end-to-end production use. This PR implements the foundational layer.
+---
+## Philosophy Alignment Per Slice
+| Slice | Principle | Status |
+|-------|-----------|--------|
+| 1 (config) | validate-at-boundaries | satisfied: err if type=label and queueLabel absent |
+| 1 (config) | immutability by default | satisfied: readonly field |
+| 2 (types) | explicit domain types | satisfied: typed fields not raw strings |
+| 3 (trigger-store) | validate-at-boundaries | satisfied: queueType parsed and stored |
+| 4 (poller) | Result types, no throws | satisfied: err() return, no throw |
+| 4 (poller) | exhaustiveness | tension: else branch catches unknowns but not exhaustive switch |
+| 5 (tests) | prefer fakes over mocks | satisfied: injectable fetchFn mock |
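The Slice 4 branching and the three rows of the Test Design table in this plan can be exercised together as a pure function. This is a sketch under stated assumptions: `buildQueueIssuesUrl`, `QueueConfig`, `PollError`, and the local `Result` type are hypothetical names, and the real `pollGitHubQueueIssues()` also performs the fetch, which is left out here.

```typescript
// Hypothetical types for the sketch; the package's own definitions may differ.
type PollError = { readonly kind: 'not_implemented'; readonly message: string };
type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

interface QueueConfig {
  readonly type: 'assignee' | 'label' | 'mention' | 'query';
  readonly user?: string;
  readonly queueLabel?: string;
  readonly name?: string;
}

// Builds the issues URL only; no network call is made.
function buildQueueIssuesUrl(owner: string, repo: string, config: QueueConfig): Result<URL, PollError> {
  const url = new URL(`https://api.github.com/repos/${owner}/${repo}/issues`);
  url.searchParams.set('state', 'open');
  if (config.type === 'assignee' && config.user) {
    url.searchParams.set('assignee', config.user);
  } else if (config.type === 'label' && config.queueLabel) {
    url.searchParams.set('labels', config.queueLabel);
  } else {
    // Covers unknown types and label configs that lack queueLabel.
    return {
      ok: false,
      error: { kind: 'not_implemented', message: `Queue type "${config.type}" is not yet implemented` },
    };
  }
  url.searchParams.set('per_page', '100');
  return { ok: true, value: url };
}

// One call per row of the test table; owner and repo are placeholders.
const labelOk = buildQueueIssuesUrl('example-owner', 'example-repo', { type: 'label', queueLabel: 'worktrain:ready' });
const labelMissing = buildQueueIssuesUrl('example-owner', 'example-repo', { type: 'label', name: 'my-label' });
const assigneeOk = buildQueueIssuesUrl('example-owner', 'example-repo', { type: 'assignee', user: 'bob' });
```

Keeping URL construction separate from the fetch is a choice made for this example so that each table row can be asserted without a fake `fetchFn`.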

package/docs/ideas/backlog.md CHANGED

@@ -6481,3 +6481,150 @@ All five top-priority autonomous pipeline items shipped:
 4. **Level 1 usage: run WorkTrain on its own backlog** -- Create `worktrain:ready` issues for the top 10 ready tasks, assign to `worktrain-etienneb`, observe one full queue → pipeline run. Collect data on misclassifications and weak PRs before designing the grooming loop.
 5. **`worktrain inbox --watch`** -- Close the notification loop. Outbox exists, just needs the polling implementation.
+---
+## WorkTrain identity model: act as the user, not as a bot (Apr 20, 2026)
+**Design decision:** WorkTrain acts as the configured user, not as a separate bot account.
+### Why bot accounts are the wrong default
+Most developers -- especially at companies -- cannot create separate bot GitHub accounts. Jira, GitLab, and other enterprise systems tie authentication to employee identity. Requiring a separate account creates friction that blocks adoption entirely.
+WorkTrain's attribution signal is the **work pattern**, not the identity:
+- Branch name: `worktrain/<sessionId>` -- immediately recognizable
+- PR body footer: "🤖 Automated by WorkTrain" + session ID + workflow name
+- Commit co-author: `Co-Authored-By: WorkTrain <worktrain@noreply>`
+Anyone reviewing a PR knows it was autonomous. The developer's name on the PR is not a lie -- they configured WorkTrain to do this work on their behalf.
+### Queue membership without a bot account
+Assignee-based opt-in only works with a dedicated bot account. Label-based opt-in works with any setup:
+- Apply `worktrain:ready` label to an issue → WorkTrain picks it up
+- The queue poll trigger uses `queueType: label` + `queueLabel: "worktrain:ready"`
+- No bot account, no special permissions, no friction
+`workOnAll: true` (future) processes any open issue -- also requires no bot account.
+### Token: use your own PAT
+`$GITHUB_TOKEN` (your personal token) or a fine-grained PAT scoped to the target repo. WorkTrain uses it for API calls; the commit identity (`git user.name`, `git user.email`) is set separately in the worktree and can be whatever you want.
+---
+## Jira + GitLab integration for WorkTrain (Apr 20, 2026)
+**Context:** Most enterprise developers use Jira for tickets and GitLab for code hosting. WorkTrain should work in this environment without requiring GitHub or a bot account.
+### What exists
+`gitlab_poll` trigger already exists -- polls GitLab MR list and dispatches sessions when new/updated MRs appear. WorkTrain can already do autonomous MR review on GitLab.
+### What's missing
+**`jira_poll` trigger:** Poll a Jira board/sprint/filter for issues in a specific status (e.g., "In Progress", "Ready for Dev") assigned to the configured user, and dispatch WorkTrain sessions for them. The developer labels Jira issues for WorkTrain the same way they'd assign to a teammate.
+Proposed `jira_poll` config:
+```yaml
+- id: jira-queue
+  provider: jira_poll
+  jiraBaseUrl: https://zillow.atlassian.net
+  token: $JIRA_API_TOKEN
+  project: ACEI
+  statusFilter: "Ready for Dev"
+  assigneeFilter: "$JIRA_USERNAME"
+  workspacePath: /path/to/repo
+  branchStrategy: worktree
+  autoCommit: true
+  autoOpenPR: true
+  agentConfig:
+    maxSessionMinutes: 90
+```
+**GitLab issue queue:** Same as `github_queue_poll` but for GitLab issues. Dispatch coding sessions for GitLab issues labeled `worktrain` or in a specific milestone.
+### Implementation notes
+- `jira_poll` follows the same `PollingSource` discriminated union pattern as `gitlab_poll` and `github_queue_poll`
+- Jira REST API v3: `GET /rest/api/3/search?jql=project=X+AND+status="Ready for Dev"+AND+assignee=currentUser()`
+- Token: Jira API token (not OAuth -- simpler for developer tools)
+- `jira_poll` should extract issue title + description as the goal, and the Jira issue URL as `upstreamSpecUrl` in `TaskCandidate`
+### Priority
+Medium. GitLab MR review already works. Jira issue queue is the next most impactful integration for enterprise users. Design alongside the label-based GitHub queue -- the patterns are identical, just different API shapes.
+---
+## Queue opt-in design: unresolved decisions (Apr 20, 2026)
+**Status: DO NOT IMPLEMENT until these questions are answered.**
+The self-improvement queue was partially implemented using label-based opt-in, then later walked back. This section records what's actually unresolved so future work starts from the right place.
+### What's wrong with the current state
+The `github_queue_poll` trigger now supports both `assignee` and `label` queue types. The code is correct. But `triggers.yml` has no active queue trigger because the opt-in mechanism isn't settled -- see below.
+The label approach was implemented as a practical fallback when "no bot account" ruled out assignee-based. But labels were what we explicitly rejected in the original design because they require humans to apply them per issue. Reversing that decision without acknowledging it was a mistake. The right answer isn't to pick one mechanism -- it's to keep the queue shape configurable (which we already designed) and pick the right shape per context.
+### The configurable queue shape (already designed, partially implemented)
+```
+{ "queue": { "type": "github_assignee", "user":  "worktrain-etienneb" } }
+{ "queue": { "type": "github_label",    "name":  "worktrain:ready" } }
+{ "queue": { "type": "github_query",    "search": "is:issue is:open ..." } }
+{ "queue": { "type": "jql",             "query": "assignee=currentUser() AND status='Ready for Dev'" } }
+{ "queue": { "type": "gitlab_label",    "name":  "worktrain" } }
+```
+For the workrail repo specifically: either `github_assignee` (accept the conflation between your personal assignments and WorkTrain's queue -- fine for a solo repo) or `github_label` (apply label per issue -- more discipline, more friction). Neither is wrong; pick based on preference.
+### Enterprise implications that must be resolved before Zillow work
+Three questions for the user to verify before designing any Zillow path:
+1. **Service account process**: Does Zillow have a ServiceDesk or security review process for requesting service accounts (`worktrain-etienneb@zillow`)? If yes, request one through proper channels rather than acting under your personal identity.
+2. **AUP check**: Does Zillow's Acceptable Use Policy permit automation acting under employee identities without an explicit security review? If not, "WorkTrain acts as you" is not viable -- a service account is required.
+3. **Self-approval rules**: Can you approve your own MRs in Zillow's GitLab? If "no self-approval" is enforced, every WorkTrain MR needs a human reviewer. That changes the pipeline (no auto-merge under personal identity).
+These three answers determine the entire architecture for Zillow. Do not design the Jira/GitLab path until they are known.
+### Enterprise identity risk (important)
+"WorkTrain acts as you" is different from "Dependabot acts as you." Dependabot does narrow, predictable operations (dependency bumps). WorkTrain does arbitrary LLM-driven code changes. Every autonomous action -- MR opened, commit pushed, comment posted -- is attributed to you in audit logs. If WorkTrain does something wrong under your identity, the audit trail points to you. Understand this risk before turning on autonomy against company repos.
+### Jira return path (missing from current jira_poll design)
+The `jira_poll` backlog entry describes pulling tickets from Jira. It does not describe writing back:
+- Moving the ticket to "In Review" when an MR is opened
+- Adding the MR URL to the Jira ticket (a Jira field or comment)
+- Reacting to Jira transitions mid-work (ticket moved back to "To Do" → WorkTrain stops)
+The full Jira integration is a round-trip, not just a poll. Design the return path before implementing `jira_poll`.
+---
+## Gate 2 follow-up: per-trigger gh CLI token for delivery (Apr 20, 2026)
+`delivery-action.ts` calls `gh pr create` using whatever `gh` CLI auth is configured globally -- it does not pass a per-trigger token. For single-identity (always acting as yourself) this is fine. For multi-identity (Zillow service account alongside personal trigger), the globally authenticated `gh` user handles all PR creation, silently using the wrong identity.
+**Fix when multi-identity is needed:** Pass `GH_TOKEN=<triggerToken>` env override to `execFn` when calling `gh pr create` and `gh pr merge`. Not a blocker for single-identity. Prerequisite for multi-identity support.
+---
+## Queue config discriminated union tightening (Apr 20, 2026)
+`GitHubQueueConfig` uses a flat interface with runtime validation. Should be a proper TypeScript discriminated union so `type: 'assignee'` requires `user` at compile time. Low priority but tracked per "make illegal states unrepresentable."
+---
+## Kill switch and commit signing (Apr 20, 2026)
+**Kill switch:** `worktrain kill-sessions` -- aborts all running daemon sessions immediately. Useful when WorkTrain is doing something unexpected. Sends abort signal to all active sessions, marks them user-killed in the event log.
+**Commit signing:** verify `git commit` honors existing `commit.gpgsign` config, or add explicit opt-out for bot identities that don't have signing keys. Empirically verify before declaring this solved.
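The Jira search request from the `jira_poll` implementation notes in the backlog hunk above is written there with raw spaces and quotes in the query string. A minimal sketch of building the same request with `URLSearchParams`, so the JQL is encoded, might look like this. `buildJiraSearchUrl` is a hypothetical helper, the base URL is a placeholder, and the sketch assumes the project and status values contain no double quotes (JQL escaping is not covered).

```typescript
// Hypothetical helper for a jira_poll-style trigger; not part of the package.
function buildJiraSearchUrl(jiraBaseUrl: string, project: string, status: string): URL {
  const url = new URL('/rest/api/3/search', jiraBaseUrl);
  // Assumption: values contain no double quotes, so plain quoting is enough.
  const jql = `project = "${project}" AND status = "${status}" AND assignee = currentUser()`;
  // searchParams.set() encodes spaces, quotes, and parentheses in the JQL.
  url.searchParams.set('jql', jql);
  return url;
}

// Placeholder base URL; project key and status are taken from the proposed config.
const jiraUrl = buildJiraSearchUrl('https://example.atlassian.net', 'ACEI', 'Ready for Dev');
```

Reading the parameter back with `jiraUrl.searchParams.get('jql')` returns the original JQL string, which makes the helper easy to assert on in a unit test.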

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@exaudeus/workrail",
-  "version": "3.46.0",
+  "version": "3.48.0",
   "description": "Step-by-step workflow enforcement for AI agents via MCP",
   "license": "MIT",
   "repository": {

package/workflows/mr-review-workflow.agentic.v2.json CHANGED

@@ -312,7 +312,7 @@
     {
       "id": "phase-6-final-handoff",
       "title": "Phase 6: Final Handoff",
-      "prompt": "Provide the final MR review handoff.\n\nInclude:\n- MR title and purpose\n- review mode used\n- final recommendation and confidence band\n- confidence assessment summary, including the most important reason confidence was capped if it was not High\n- counts of Critical / Major / Minor / Nit findings\n- top findings with rationale\n- strongest remaining areas of uncertainty, if any\n- summary of the coverage ledger, especially any still-uncertain domains\n- ready-to-post MR comments summary\n- any validation outcomes a human reviewer should see\n- review environment status:\n  - what review target/context sources were successfully used\n  - what important sources were missing or ambiguous\n  - boundary confidence and context confidence\n  - how those limits affected the review\n- path to the full human-facing review artifact (`reviewDocPath`) only if one was created\n\nRules:\n- the final recommendation assists a human reviewer; it does not replace them\n- if `reviewDocPath` exists, treat it as a human-facing companion artifact only\n- be explicit when missing PR/ticket/doc/boundary context limited confidence\n- do not post comments, approve, reject, or merge unless the user explicitly asks\n\nIMPORTANT: After writing your notes, emit a structured verdict via complete_step's artifacts[] parameter using EXACTLY this schema (no extra fields):\n{\n  \"kind\": \"wr.review_verdict\",\n  \"verdict\": \"clean\" | \"minor\" | \"blocking\",\n  \"confidence\": \"high\" | \"medium\" | \"low\",\n  \"findings\": [ { \"severity\": \"critical\" | \"major\" | \"minor\" | \"nit\", \"summary\": \"one-line description\" } ],\n  \"summary\": \"one-line overall verdict summary\"\n}\nFor a clean review with no findings, use findings: []. The verdict field maps to severity: clean = no blocking issues, minor = small issues only, blocking = critical or major issues found.",
+      "prompt": "Provide the final MR review handoff.\n\nInclude:\n- MR title and purpose\n- review mode used\n- final recommendation and confidence band\n- confidence assessment summary, including the most important reason confidence was capped if it was not High\n- counts of Critical / Major / Minor / Nit findings\n- top findings with rationale\n- strongest remaining areas of uncertainty, if any\n- summary of the coverage ledger, especially any still-uncertain domains\n- ready-to-post MR comments summary\n- any validation outcomes a human reviewer should see\n- review environment status:\n  - what review target/context sources were successfully used\n  - what important sources were missing or ambiguous\n  - boundary confidence and context confidence\n  - how those limits affected the review\n- path to the full human-facing review artifact (`reviewDocPath`) only if one was created\n\nRules:\n- the final recommendation assists a human reviewer; it does not replace them\n- if `reviewDocPath` exists, treat it as a human-facing companion artifact only\n- be explicit when missing PR/ticket/doc/boundary context limited confidence\n- do not post comments, approve, reject, or merge unless the user explicitly asks\n\nIMPORTANT: After writing your notes, emit a structured verdict via complete_step's artifacts[] parameter using EXACTLY this schema (no extra fields):\n{\n  \"kind\": \"wr.review_verdict\",\n  \"verdict\": \"clean\" | \"minor\" | \"blocking\",\n  \"confidence\": \"high\" | \"medium\" | \"low\",\n  \"findings\": [ { \"severity\": \"critical\" | \"major\" | \"minor\" | \"nit\", \"summary\": \"one-line description\", \"findingCategory\": \"correctness\" | \"security\" | \"architecture\" | \"ux\" | \"performance\" | \"testing\" | \"style\" } ],\n  \"summary\": \"one-line overall verdict summary\"\n}\nFor a clean review with no findings, use findings: []. The verdict field maps to severity: clean = no blocking issues, minor = small issues only, blocking = critical or major issues found. For findingCategory use: correctness = wrong behavior/logic errors, security = auth/authz issues/injection/data exposure, architecture = wrong abstraction/tight coupling/invariant violations, ux = usability/accessibility/interaction design, performance = inefficiency/N+1/blocking operations, testing = missing tests/wrong test approach, style = naming/formatting/conventions.",
       "outputContract": {
         "contractRef": "wr.contracts.review_verdict",
         "required": false