@exaudeus/workrail 3.46.0 → 3.47.0

@@ -0,0 +1,158 @@
1
+ # Implementation Plan: Label-Based Queue Filter for github_queue_poll
2
+
3
+ ## Problem Statement
4
+
5
+ The `github_queue_poll` trigger accepts `queueType: label` and `queueLabel: "worktrain:ready"` in `triggers.yml`, but these fields are silently ignored. The `GitHubQueueConfig` type allows `type: 'label'`, but the poller returns a `not_implemented` error at runtime. Label-based queue filtering lets WorkTrain pick up issues without a dedicated bot account -- any issue labeled `worktrain:ready` becomes a candidate.
6
+
7
+ ---
8
+
9
+ ## Acceptance Criteria
10
+
11
+ 1. `pollGitHubQueueIssues()` with `config.type === 'label'` and `config.queueLabel === 'worktrain:ready'` sends `GET /repos/:owner/:repo/issues?state=open&labels=worktrain%3Aready&per_page=100`
12
+ 2. `pollGitHubQueueIssues()` with `config.type === 'label'` and no `config.queueLabel` returns `err({ kind: 'not_implemented', ... })` (config validation error path)
13
+ 3. `pollGitHubQueueIssues()` with `config.type === 'assignee'` still works (regression)
14
+ 4. `loadQueueConfig()` with `type: 'label'` and no `queueLabel` field in config.json returns `err(...)` (not `ok`)
15
+ 5. `loadQueueConfig()` with `type: 'label'` and `queueLabel: 'worktrain:ready'` returns `ok(config)` with `config.queueLabel === 'worktrain:ready'`
16
+ 6. `trigger-store.ts` parses `queueType` and `queueLabel` from triggers.yml into `GitHubQueuePollingSource`
17
+ 7. `npm run build` succeeds (no TypeScript errors)
18
+ 8. `npx vitest run tests/unit/github-queue-poller.test.ts` -- all tests pass
19
+ 9. `npx vitest run` -- no regressions
20
+
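Criterion 1 can be checked in isolation with the standard `URL` API. The following is a minimal sketch, not the package's code: the helper name `buildLabelQueueUrl` and the `https://api.github.com` base are assumptions made for illustration.

```typescript
// Hypothetical helper: builds the issues-list URL for a label-based queue.
// The function name and the api.github.com base are assumptions.
function buildLabelQueueUrl(owner: string, repo: string, queueLabel: string): string {
  const url = new URL(`https://api.github.com/repos/${owner}/${repo}/issues`);
  url.searchParams.set("state", "open");
  // searchParams.set() percent-encodes the value, so ":" becomes "%3A".
  url.searchParams.set("labels", queueLabel);
  url.searchParams.set("per_page", "100");
  return url.toString();
}

console.log(buildLabelQueueUrl("acme", "widgets", "worktrain:ready"));
// prints "https://api.github.com/repos/acme/widgets/issues?state=open&labels=worktrain%3Aready&per_page=100"
```

Setting the parameters in the order the criterion lists them keeps the resulting query string comparable with a plain string assertion.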
21
+ ---
22
+
23
+ ## Non-Goals
24
+
25
+ - Do not touch `src/mcp/`
26
+ - Do not touch `polling-scheduler.ts`
27
+ - Do not implement `mention` or `query` queue types
28
+ - Do not refactor surrounding code
29
+ - Do not remove the existing `name?` field from `GitHubQueueConfig`
30
+
31
+ ---
32
+
33
+ ## Philosophy-Driven Constraints
34
+
35
+ - All boundary functions return `Result<T, E>` -- no throws
36
+ - All new interface fields are `readonly`
37
+ - Validation at load time (`loadQueueConfig`): err if `type==='label'` and `queueLabel` absent
38
+ - Defensive else branch in `pollGitHubQueueIssues`: unknown types return `not_implemented`
39
+ - `encodeURIComponent` / URLSearchParams for URL safety
40
+
41
+ ---
42
+
43
+ ## Invariants
44
+
45
+ - I1: `pollGitHubQueueIssues` with `type='assignee'` and `config.user` sends `assignee=<user>` param (unchanged)
46
+ - I2: `pollGitHubQueueIssues` with `type='label'` and `config.queueLabel` sends `labels=<encoded>` param
47
+ - I3: `pollGitHubQueueIssues` with `type='label'` and no `config.queueLabel` returns `err({ kind: 'not_implemented' })`
48
+ - I4: `pollGitHubQueueIssues` with any other type returns `err({ kind: 'not_implemented' })`
49
+ - I5: `loadQueueConfig` with `type='label'` and no `queueLabel` key in config.json returns `err(string)`
50
+ - I6: No throws at any boundary
51
+
52
+ ---
53
+
54
+ ## Selected Approach
55
+
56
+ **Additive optional field + load-time validation + poller URL branch**
57
+
58
+ Five changes:
59
+ 1. `GitHubQueueConfig` interface: add `readonly queueLabel?: string`
60
+ 2. `loadQueueConfig()`: parse `q['queueLabel']`, validate it when `type==='label'`
61
+ 3. `GitHubQueuePollingSource` (types.ts): add `readonly queueType?: string`, `readonly queueLabel?: string`
62
+ 4. `trigger-store.ts`: parse `queueType`/`queueLabel` in `ParsedTriggerRaw` + `setTriggerField()` + assembly block
63
+ 5. `pollGitHubQueueIssues()`: replace hard not_implemented guard with assignee/label/else branching
64
+
65
+ The runner-up was a discriminated union -- rejected because of the scope constraint and because it adds unnecessary complexity for this bounded change.
66
+
67
+ ---
68
+
69
+ ## Vertical Slices
70
+
71
+ ### Slice 1: `github-queue-config.ts` -- add `queueLabel` field + validation
72
+ **Files**: `src/trigger/github-queue-config.ts`
73
+ **What**: Add `readonly queueLabel?: string` to `GitHubQueueConfig` interface. In `loadQueueConfig()`, parse `q['queueLabel']` and add validation: if `rawType === 'label'` and `!queueLabel`, return `err('config.queue.queueLabel is required when type is "label"')`. Build the return object with `queueLabel` when present.
74
+ **Done when**: Interface has `queueLabel?`, validation fires correctly, `npm run build` clean.
75
+
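The load-time validation in Slice 1 might look roughly like the sketch below. It assumes a simple `Result` shape with `ok`/`err` helpers (the project's real helpers may differ), and `parseQueueSection` is a hypothetical name for the part of `loadQueueConfig()` that runs after the JSON has been read.

```typescript
// Minimal stand-ins for the project's Result helpers (assumed shape).
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Reduced version of the interface; the real one also keeps `name?`.
interface GitHubQueueConfig {
  readonly type: string;
  readonly user?: string;
  readonly queueLabel?: string;
}

// Hypothetical core of loadQueueConfig(): validates the parsed `queue` object.
function parseQueueSection(q: Record<string, unknown>): Result<GitHubQueueConfig, string> {
  const rawType = q["type"];
  if (typeof rawType !== "string") {
    return err("config.queue.type must be a string");
  }
  const rawLabel = q["queueLabel"];
  // Treat a missing, non-string, or blank label as absent.
  const queueLabel =
    typeof rawLabel === "string" && rawLabel.trim() !== "" ? rawLabel : undefined;
  if (rawType === "label" && !queueLabel) {
    return err('config.queue.queueLabel is required when type is "label"');
  }
  const rawUser = q["user"];
  return ok({
    type: rawType,
    // Only include optional fields when they were actually present.
    ...(typeof rawUser === "string" ? { user: rawUser } : {}),
    ...(queueLabel !== undefined ? { queueLabel } : {}),
  });
}
```

Rejecting the bad config here means the poller's own `queueLabel` check only acts as a defensive fallback.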
76
+ ### Slice 2: `types.ts` -- extend `GitHubQueuePollingSource`
77
+ **Files**: `src/trigger/types.ts`
78
+ **What**: Add `readonly queueType?: string` and `readonly queueLabel?: string` to `GitHubQueuePollingSource`.
79
+ **Done when**: Fields present in interface, build clean.
80
+
81
+ ### Slice 3: `trigger-store.ts` -- parse `queueType`/`queueLabel` from YAML
82
+ **Files**: `src/trigger/trigger-store.ts`
83
+ **What**:
84
+ - Add `queueType?: string` and `queueLabel?: string` to `ParsedTriggerRaw` interface
85
+ - Handle them in `setTriggerField()` switch
86
+ - In the `github_queue_poll` assembly block, read `raw.queueType` and `raw.queueLabel` and include them in the `GitHubQueuePollingSource` object
87
+ **Done when**: `triggers.yml` `self-improvement` trigger parses correctly with these fields; build clean.
88
+
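As a rough illustration of Slice 3, the sketch below shows the two new keys flowing from the raw parse result into the polling source. The interfaces are reduced to the fields this plan mentions, and `assembleQueueSource` is a hypothetical name for the assembly block; the real `setTriggerField()` signature may differ.

```typescript
// Hypothetical subset of ParsedTriggerRaw; the real interface has more fields.
interface ParsedTriggerRaw {
  id?: string;
  provider?: string;
  queueType?: string;
  queueLabel?: string;
}

// Reduced GitHubQueuePollingSource with only the two new fields.
interface GitHubQueuePollingSource {
  readonly queueType?: string;
  readonly queueLabel?: string;
}

// Copies recognized scalar keys onto the raw trigger.
// Returns false for keys this sketch does not know about.
function setTriggerField(raw: ParsedTriggerRaw, key: string, value: string): boolean {
  switch (key) {
    case "id":
      raw.id = value;
      return true;
    case "provider":
      raw.provider = value;
      return true;
    case "queueType":
      raw.queueType = value;
      return true;
    case "queueLabel":
      raw.queueLabel = value;
      return true;
    default:
      return false;
  }
}

// Assembly step: include the optional fields only when they were present.
function assembleQueueSource(raw: ParsedTriggerRaw): GitHubQueuePollingSource {
  return {
    ...(raw.queueType !== undefined ? { queueType: raw.queueType } : {}),
    ...(raw.queueLabel !== undefined ? { queueLabel: raw.queueLabel } : {}),
  };
}
```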
89
+ ### Slice 4: `github-queue-poller.ts` -- implement label branch
90
+ **Files**: `src/trigger/adapters/github-queue-poller.ts`
91
+ **What**: Replace the current `if (config.type !== 'assignee') return err(not_implemented)` guard with:
92
+ ```typescript
+ if (config.type === 'assignee' && config.user) {
+   url.searchParams.set('assignee', config.user);
+ } else if (config.type === 'label' && config.queueLabel) {
+   url.searchParams.set('labels', config.queueLabel);
+ } else {
+   return err({ kind: 'not_implemented', message: `Queue type "${config.type}" is not yet implemented` });
+ }
+ ```
101
+ Remove the old `if (config.user)` block that was AFTER the guard.
102
+ Update JSDoc to reflect both supported types.
103
+ **Done when**: Function correctly handles both assignee and label, build clean.
104
+
105
+ ### Slice 5: Tests -- update + add new tests
106
+ **Files**: `tests/unit/github-queue-poller.test.ts`
107
+ **What**:
108
+ - REPLACE existing test at line 126 (`returns not_implemented for non-assignee queue type`) -- this test currently expects not_implemented for `{type: 'label', name: 'my-label'}`. It MUST be replaced, not kept.
109
+ - ADD: `type: 'label'` with `queueLabel: 'worktrain:ready'` -> fetches with `labels=worktrain%3Aready` param
110
+ - ADD: `type: 'label'` without `queueLabel` -> returns err with kind not_implemented (or config validation path)
111
+ - KEEP as regression: `type: 'assignee'` test at line 105 (already tests assignee param)
112
+ - Update `makeConfig()` helper: `name?` field can stay but add examples with `queueLabel`
113
+ **Done when**: `npx vitest run tests/unit/github-queue-poller.test.ts` all pass.
114
+
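The tests in Slice 5 can be written against a hand-rolled fake rather than a mocking library. The sketch below is self-contained and illustrative only: `pollSketch` is a simplified stand-in for `pollGitHubQueueIssues()`, and the `FetchFn` shape, the `repo` parameter, and the error kinds other than `not_implemented` are assumptions.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
type PollError = { kind: "not_implemented" | "http_error"; message: string };

interface QueueConfig {
  readonly type: string;
  readonly user?: string;
  readonly queueLabel?: string;
}

// Minimal fetch shape, so the fake needs no network access or mocking library.
type FetchFn = (url: string) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Simplified stand-in for pollGitHubQueueIssues(); the real signature may differ.
async function pollSketch(
  config: QueueConfig,
  repo: string,
  fetchFn: FetchFn,
): Promise<Result<unknown, PollError>> {
  const url = new URL(`https://api.github.com/repos/${repo}/issues`);
  url.searchParams.set("state", "open");
  if (config.type === "assignee" && config.user) {
    url.searchParams.set("assignee", config.user);
  } else if (config.type === "label" && config.queueLabel) {
    url.searchParams.set("labels", config.queueLabel);
  } else {
    const message = `Queue type "${config.type}" is not yet implemented`;
    return { ok: false, error: { kind: "not_implemented", message } };
  }
  const res = await fetchFn(url.toString());
  if (!res.ok) {
    return { ok: false, error: { kind: "http_error", message: `HTTP ${res.status}` } };
  }
  return { ok: true, value: await res.json() };
}

// Fake fetch: records every requested URL and answers with an empty issue list.
function makeFakeFetch(): { fetchFn: FetchFn; calls: string[] } {
  const calls: string[] = [];
  const fetchFn: FetchFn = async (url) => {
    calls.push(url);
    return { ok: true, status: 200, json: async () => [] };
  };
  return { fetchFn, calls };
}
```

A test would pass `makeFakeFetch().fetchFn` into the poller, then assert on `calls[0]` for the label and assignee cases and on `result.error.kind` for the missing-label case, where no request should be recorded at all.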
115
+ ---
116
+
117
+ ## Test Design
118
+
119
+ | Test | Expected behavior | Assertion |
120
+ |------|-------------------|-----------|
121
+ | label type + queueLabel | fetches with `labels=` param | URL contains `labels=worktrain%3Aready` |
122
+ | label type + no queueLabel | returns not_implemented | `result.error.kind === 'not_implemented'` |
123
+ | assignee type + user (regression) | fetches with `assignee=` param | URL contains `assignee=bob` |
124
+
125
+ Existing tests for network_error, http_error, rate_limit, field mapping, maturity, idempotency: unchanged.
126
+
127
+ ---
128
+
129
+ ## Risk Register
130
+
131
+ | Risk | Probability | Impact | Mitigation |
132
+ |------|-------------|--------|------------|
133
+ | Forgetting to replace test at line 126 | High | CI fails | Explicit: replace that test first |
134
+ | `polling-scheduler.ts` guard still blocks end-to-end | Certain | Production feature blocked | Document in PR description as follow-up |
135
+ | URL encoding of `:` in `worktrain:ready` | Low | Wrong API call | Use `url.searchParams.set()` which encodes automatically |
136
+
137
+ ---
138
+
139
+ ## PR Packaging Strategy
140
+
141
+ Single PR: `feat/queue-label-support`
142
+ Commit: `feat(trigger): implement label-based queue filter for github_queue_poll`
143
+
144
+ PR description should note: the `polling-scheduler.ts` guard at line 393 still checks `queueConfig.type !== 'assignee'` and will need a follow-up update for end-to-end production use. This PR implements the foundational layer.
145
+
146
+ ---
147
+
148
+ ## Philosophy Alignment Per Slice
149
+
150
+ | Slice | Principle | Status |
151
+ |-------|-----------|--------|
152
+ | 1 (config) | validate-at-boundaries | satisfied: err if type=label and queueLabel absent |
153
+ | 1 (config) | immutability by default | satisfied: readonly field |
154
+ | 2 (types) | explicit domain types | satisfied: typed fields not raw strings |
155
+ | 3 (trigger-store) | validate-at-boundaries | satisfied: queueType parsed and stored |
156
+ | 4 (poller) | Result types, no throws | satisfied: err() return, no throw |
157
+ | 4 (poller) | exhaustiveness | tension: else branch catches unknowns but not exhaustive switch |
158
+ | 5 (tests) | prefer fakes over mocks | satisfied: injectable fetchFn mock |
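
The exhaustiveness tension noted for Slice 4 could be resolved later with a closed union and a `never` check. This is a sketch only: it assumes the queue types form the closed set named in this plan (`assignee`, `label`, `mention`, `query`), and `queueFilterParam` is a hypothetical helper.

```typescript
// Assumed closed set of queue types, taken from the types this plan names.
type QueueType = "assignee" | "label" | "mention" | "query";

// Maps a queue type to the GitHub query parameter it filters on,
// or null when the type is not implemented yet.
function queueFilterParam(type: QueueType): "assignee" | "labels" | null {
  switch (type) {
    case "assignee":
      return "assignee";
    case "label":
      return "labels";
    case "mention":
    case "query":
      return null; // caller would map this to err({ kind: 'not_implemented' })
    default: {
      // Compile-time guard: this assignment stops type-checking
      // if a new QueueType member is added without a case above.
      const unreachable: never = type;
      void unreachable;
      return null; // still no throw at runtime
    }
  }
}
```

Unlike the `else` branch, adding a fifth member to `QueueType` turns a forgotten case into a build failure rather than a runtime `not_implemented`.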
@@ -6481,3 +6481,77 @@ All five top-priority autonomous pipeline items shipped:
6481
6481
  4. **Level 1 usage: run WorkTrain on its own backlog** -- Create `worktrain:ready` issues for the top 10 ready tasks, assign to `worktrain-etienneb`, observe one full queue → pipeline run. Collect data on misclassifications and weak PRs before designing the grooming loop.
6482
6482
 
6483
6483
  5. **`worktrain inbox --watch`** -- Close the notification loop. Outbox exists, just needs the polling implementation.
6484
+
6485
+ ---
6486
+
6487
+ ## WorkTrain identity model: act as the user, not as a bot (Apr 20, 2026)
6488
+
6489
+ **Design decision:** WorkTrain acts as the configured user, not as a separate bot account.
6490
+
6491
+ ### Why bot accounts are the wrong default
6492
+
6493
+ Most developers -- especially at companies -- cannot create separate bot GitHub accounts. Jira, GitLab, and other enterprise systems tie authentication to employee identity. Requiring a separate account creates friction that blocks adoption entirely.
6494
+
6495
+ WorkTrain's attribution signal is the **work pattern**, not the identity:
6496
+ - Branch name: `worktrain/<sessionId>` -- immediately recognizable
6497
+ - PR body footer: "🤖 Automated by WorkTrain" + session ID + workflow name
6498
+ - Commit co-author: `Co-Authored-By: WorkTrain <worktrain@noreply>`
6499
+
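The three attribution signals above are plain strings derived from the session. A minimal sketch, assuming a hypothetical `buildAttribution` helper and an invented footer layout (only the branch pattern, the footer's opening line, and the co-author trailer come from the list above):

```typescript
// Hypothetical container for the three attribution strings.
interface Attribution {
  readonly branch: string;
  readonly prFooter: string;
  readonly commitTrailer: string;
}

// Builds the attribution strings for one session.
// The "Session:" / "Workflow:" footer lines are an assumed layout.
function buildAttribution(sessionId: string, workflowName: string): Attribution {
  return {
    branch: `worktrain/${sessionId}`,
    prFooter: [
      "🤖 Automated by WorkTrain",
      `Session: ${sessionId}`,
      `Workflow: ${workflowName}`,
    ].join("\n"),
    commitTrailer: "Co-Authored-By: WorkTrain <worktrain@noreply>",
  };
}
```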
6500
+ Anyone reviewing a PR knows it was autonomous. The developer's name on the PR is not a lie -- they configured WorkTrain to do this work on their behalf.
6501
+
6502
+ ### Queue membership without a bot account
6503
+
6504
+ Assignee-based opt-in only works with a dedicated bot account. Label-based opt-in works with any setup:
6505
+ - Apply `worktrain:ready` label to an issue → WorkTrain picks it up
6506
+ - The queue poll trigger uses `queueType: label` + `queueLabel: "worktrain:ready"`
6507
+ - No bot account, no special permissions, no friction
6508
+
6509
+ `workOnAll: true` (future) processes any open issue -- also requires no bot account.
6510
+
6511
+ ### Token: use your own PAT
6512
+
6513
+ Use `$GITHUB_TOKEN` (your personal token) or a fine-grained PAT scoped to the target repo. WorkTrain uses it for API calls; the commit identity (git's `user.name` and `user.email`) is set separately in the worktree and can be whatever you want.
6514
+
6515
+ ---
6516
+
6517
+ ## Jira + GitLab integration for WorkTrain (Apr 20, 2026)
6518
+
6519
+ **Context:** Most enterprise developers use Jira for tickets and GitLab for code hosting. WorkTrain should work in this environment without requiring GitHub or a bot account.
6520
+
6521
+ ### What exists
6522
+
6523
+ The `gitlab_poll` trigger already exists -- it polls the GitLab MR list and dispatches sessions when new or updated MRs appear. WorkTrain can already do autonomous MR review on GitLab.
6524
+
6525
+ ### What's missing
6526
+
6527
+ **`jira_poll` trigger:** Poll a Jira board/sprint/filter for issues in a specific status (e.g., "In Progress", "Ready for Dev") assigned to the configured user, and dispatch WorkTrain sessions for them. The developer marks Jira issues for WorkTrain (by status and assignee) much as they would hand work to a teammate.
6528
+
6529
+ Proposed `jira_poll` config:
6530
+ ```yaml
+ - id: jira-queue
+   provider: jira_poll
+   jiraBaseUrl: https://zillow.atlassian.net
+   token: $JIRA_API_TOKEN
+   project: ACEI
+   statusFilter: "Ready for Dev"
+   assigneeFilter: "$JIRA_USERNAME"
+   workspacePath: /path/to/repo
+   branchStrategy: worktree
+   autoCommit: true
+   autoOpenPR: true
+   agentConfig:
+     maxSessionMinutes: 90
+ ```
6545
+
6546
+ **GitLab issue queue:** Same as `github_queue_poll` but for GitLab issues. Dispatch coding sessions for GitLab issues labeled `worktrain` or in a specific milestone.
6547
+
6548
+ ### Implementation notes
6549
+
6550
+ - `jira_poll` follows the same `PollingSource` discriminated union pattern as `gitlab_poll` and `github_queue_poll`
6551
+ - Jira REST API v3: `GET /rest/api/3/search?jql=project=X+AND+status="Ready for Dev"+AND+assignee=currentUser()` (shown unencoded for readability; the `jql` value must be URL-encoded when sent)
6552
+ - Token: Jira API token (not OAuth -- simpler for developer tools)
6553
+ - `jira_poll` should extract issue title + description as the goal, and the Jira issue URL as `upstreamSpecUrl` in `TaskCandidate`
6554
+
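The JQL request in the notes above contains spaces, quotes, and parentheses, so it is safest to let `URLSearchParams` do the encoding. A minimal sketch, assuming a hypothetical `buildJiraSearchUrl` helper and assuming that backslash-escaping is the right way to protect quotes inside a JQL string literal:

```typescript
// Hypothetical helper: builds the search URL from the proposed jira_poll fields.
// `project` is interpolated as-is, so it is assumed to be a plain project key.
function buildJiraSearchUrl(jiraBaseUrl: string, project: string, status: string): string {
  // Assumed JQL escaping rule: backslash-escape quotes and backslashes
  // so the status stays a single quoted string literal.
  const escapedStatus = status.replace(/(["\\])/g, "\\$1");
  const jql = `project = ${project} AND status = "${escapedStatus}" AND assignee = currentUser()`;
  const url = new URL("/rest/api/3/search", jiraBaseUrl);
  // searchParams.set() encodes spaces, quotes, "=", and parentheses.
  url.searchParams.set("jql", jql);
  return url.toString();
}

const example = buildJiraSearchUrl("https://example.atlassian.net", "ACEI", "Ready for Dev");
// Decoding the parameter again recovers the readable JQL.
console.log(new URL(example).searchParams.get("jql"));
// prints: project = ACEI AND status = "Ready for Dev" AND assignee = currentUser()
```

Building the JQL as a readable string first and encoding it once at the URL boundary mirrors the `url.searchParams.set()` approach used for the GitHub label filter.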
6555
+ ### Priority
6556
+
6557
+ Medium. GitLab MR review already works. Jira issue queue is the next most impactful integration for enterprise users. Design alongside the label-based GitHub queue -- the patterns are identical, just different API shapes.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exaudeus/workrail",
3
- "version": "3.46.0",
3
+ "version": "3.47.0",
4
4
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
5
5
  "license": "MIT",
6
6
  "repository": {
@@ -312,7 +312,7 @@
312
312
  {
313
313
  "id": "phase-6-final-handoff",
314
314
  "title": "Phase 6: Final Handoff",
315
- "prompt": "Provide the final MR review handoff.\n\nInclude:\n- MR title and purpose\n- review mode used\n- final recommendation and confidence band\n- confidence assessment summary, including the most important reason confidence was capped if it was not High\n- counts of Critical / Major / Minor / Nit findings\n- top findings with rationale\n- strongest remaining areas of uncertainty, if any\n- summary of the coverage ledger, especially any still-uncertain domains\n- ready-to-post MR comments summary\n- any validation outcomes a human reviewer should see\n- review environment status:\n - what review target/context sources were successfully used\n - what important sources were missing or ambiguous\n - boundary confidence and context confidence\n - how those limits affected the review\n- path to the full human-facing review artifact (`reviewDocPath`) only if one was created\n\nRules:\n- the final recommendation assists a human reviewer; it does not replace them\n- if `reviewDocPath` exists, treat it as a human-facing companion artifact only\n- be explicit when missing PR/ticket/doc/boundary context limited confidence\n- do not post comments, approve, reject, or merge unless the user explicitly asks\n\nIMPORTANT: After writing your notes, emit a structured verdict via complete_step's artifacts[] parameter using EXACTLY this schema (no extra fields):\n{\n \"kind\": \"wr.review_verdict\",\n \"verdict\": \"clean\" | \"minor\" | \"blocking\",\n \"confidence\": \"high\" | \"medium\" | \"low\",\n \"findings\": [ { \"severity\": \"critical\" | \"major\" | \"minor\" | \"nit\", \"summary\": \"one-line description\" } ],\n \"summary\": \"one-line overall verdict summary\"\n}\nFor a clean review with no findings, use findings: []. The verdict field maps to severity: clean = no blocking issues, minor = small issues only, blocking = critical or major issues found.",
315
+ "prompt": "Provide the final MR review handoff.\n\nInclude:\n- MR title and purpose\n- review mode used\n- final recommendation and confidence band\n- confidence assessment summary, including the most important reason confidence was capped if it was not High\n- counts of Critical / Major / Minor / Nit findings\n- top findings with rationale\n- strongest remaining areas of uncertainty, if any\n- summary of the coverage ledger, especially any still-uncertain domains\n- ready-to-post MR comments summary\n- any validation outcomes a human reviewer should see\n- review environment status:\n - what review target/context sources were successfully used\n - what important sources were missing or ambiguous\n - boundary confidence and context confidence\n - how those limits affected the review\n- path to the full human-facing review artifact (`reviewDocPath`) only if one was created\n\nRules:\n- the final recommendation assists a human reviewer; it does not replace them\n- if `reviewDocPath` exists, treat it as a human-facing companion artifact only\n- be explicit when missing PR/ticket/doc/boundary context limited confidence\n- do not post comments, approve, reject, or merge unless the user explicitly asks\n\nIMPORTANT: After writing your notes, emit a structured verdict via complete_step's artifacts[] parameter using EXACTLY this schema (no extra fields):\n{\n \"kind\": \"wr.review_verdict\",\n \"verdict\": \"clean\" | \"minor\" | \"blocking\",\n \"confidence\": \"high\" | \"medium\" | \"low\",\n \"findings\": [ { \"severity\": \"critical\" | \"major\" | \"minor\" | \"nit\", \"summary\": \"one-line description\", \"findingCategory\": \"correctness\" | \"security\" | \"architecture\" | \"ux\" | \"performance\" | \"testing\" | \"style\" } ],\n \"summary\": \"one-line overall verdict summary\"\n}\nFor a clean review with no findings, use findings: []. 
The verdict field maps to severity: clean = no blocking issues, minor = small issues only, blocking = critical or major issues found. For findingCategory use: correctness = wrong behavior/logic errors, security = auth/authz issues/injection/data exposure, architecture = wrong abstraction/tight coupling/invariant violations, ux = usability/accessibility/interaction design, performance = inefficiency/N+1/blocking operations, testing = missing tests/wrong test approach, style = naming/formatting/conventions.",
316
316
  "outputContract": {
317
317
  "contractRef": "wr.contracts.review_verdict",
318
318
  "required": false