@exaudeus/workrail 3.67.0 → 3.68.1
- package/dist/application/services/compiler/template-registry.js +10 -1
- package/dist/cli/commands/worktrain-init.js +1 -1
- package/dist/console-ui/assets/{index-tOl8Vowf.js → index-DPdRJHMX.js} +1 -1
- package/dist/console-ui/index.html +1 -1
- package/dist/coordinators/modes/full-pipeline.js +4 -4
- package/dist/coordinators/modes/implement-shared.js +5 -5
- package/dist/coordinators/modes/implement.js +4 -4
- package/dist/coordinators/pr-review.js +4 -4
- package/dist/daemon/workflow-runner.d.ts +1 -0
- package/dist/daemon/workflow-runner.js +1 -0
- package/dist/manifest.json +31 -31
- package/dist/mcp/handlers/v2-context-budget.js +18 -0
- package/dist/mcp/handlers/v2-workflow.js +1 -1
- package/dist/mcp/workflow-protocol-contracts.js +2 -2
- package/dist/v2/durable-core/constants.d.ts +2 -0
- package/dist/v2/durable-core/constants.js +2 -1
- package/dist/v2/projections/session-metrics.js +1 -1
- package/docs/authoring-v2.md +4 -4
- package/docs/changelog-recent.md +3 -3
- package/docs/configuration.md +1 -1
- package/docs/design/adaptive-coordinator-context-candidates.md +1 -1
- package/docs/design/adaptive-coordinator-context.md +1 -1
- package/docs/design/adaptive-coordinator-routing-candidates.md +18 -18
- package/docs/design/adaptive-coordinator-routing-review.md +1 -1
- package/docs/design/adaptive-coordinator-routing.md +34 -34
- package/docs/design/agent-cascade-protocol.md +2 -2
- package/docs/design/console-daemon-separation-discovery.md +323 -0
- package/docs/design/context-assembly-design-candidates.md +1 -1
- package/docs/design/context-assembly-implementation-plan.md +1 -1
- package/docs/design/context-assembly-layer.md +2 -2
- package/docs/design/context-assembly-review-findings.md +1 -1
- package/docs/design/coordinator-access-audit.md +293 -0
- package/docs/design/coordinator-architecture-audit.md +62 -0
- package/docs/design/coordinator-error-handling-audit.md +240 -0
- package/docs/design/coordinator-testability-audit.md +426 -0
- package/docs/design/daemon-architecture-discovery.md +1 -1
- package/docs/design/daemon-console-separation-discovery.md +242 -0
- package/docs/design/daemon-memory-audit.md +203 -0
- package/docs/design/design-candidates-console-daemon-separation.md +256 -0
- package/docs/design/design-candidates-discovery-loop-fix.md +141 -0
- package/docs/design/design-review-findings-console-daemon-separation.md +106 -0
- package/docs/design/design-review-findings-discovery-loop-fix.md +81 -0
- package/docs/design/discovery-loop-fix-candidates.md +161 -0
- package/docs/design/discovery-loop-fix-design-review.md +106 -0
- package/docs/design/discovery-loop-fix-validation.md +258 -0
- package/docs/design/discovery-loop-investigation-A.md +188 -0
- package/docs/design/discovery-loop-investigation-B.md +287 -0
- package/docs/design/exploration-workflow-candidates.md +205 -0
- package/docs/design/exploration-workflow-design-review.md +166 -0
- package/docs/design/exploration-workflow-discovery.md +443 -0
- package/docs/design/ide-context-files-candidates.md +231 -0
- package/docs/design/ide-context-files-design-review.md +85 -0
- package/docs/design/ide-context-files.md +615 -0
- package/docs/design/implementation-plan-discovery-loop-fix.md +199 -0
- package/docs/design/implementation-plan-queue-poll-rotation.md +102 -0
- package/docs/design/in-process-http-audit.md +190 -0
- package/docs/design/layer3b-ghost-nodes-design-candidates.md +2 -2
- package/docs/design/loadSessionNotes-candidates.md +108 -0
- package/docs/design/loadSessionNotes-test-coverage-discovery.md +297 -0
- package/docs/design/loadSessionNotes-test-coverage-session4.md +209 -0
- package/docs/design/loadSessionNotes-test-coverage-v3.md +321 -0
- package/docs/design/probe-session-design-candidates.md +261 -0
- package/docs/design/probe-session-phase0.md +490 -0
- package/docs/design/routines-guide.md +7 -7
- package/docs/design/session-metrics-attribution-candidates.md +250 -0
- package/docs/design/session-metrics-attribution-design-review.md +115 -0
- package/docs/design/session-metrics-attribution-discovery.md +319 -0
- package/docs/design/session-metrics-candidates.md +227 -0
- package/docs/design/session-metrics-design-review.md +104 -0
- package/docs/design/session-metrics-discovery.md +454 -0
- package/docs/design/spawn-session-debug.md +202 -0
- package/docs/design/trigger-validator-candidates.md +214 -0
- package/docs/design/trigger-validator-review.md +109 -0
- package/docs/design/trigger-validator-shaping-phase0.md +239 -0
- package/docs/design/trigger-validator.md +454 -0
- package/docs/design/v2-core-design-locks.md +2 -2
- package/docs/design/workflow-extension-points.md +15 -15
- package/docs/design/workflow-id-validation-at-startup.md +1 -1
- package/docs/design/workflow-id-validation-implementation-plan.md +2 -2
- package/docs/design/workflow-trigger-lifecycle-audit.md +175 -0
- package/docs/design/worktrain-task-queue-candidates.md +5 -5
- package/docs/design/worktrain-task-queue.md +4 -4
- package/docs/discovery/coordinator-script-design.md +1 -1
- package/docs/discovery/coordinator-ux-discovery.md +3 -3
- package/docs/discovery/simulation-report.md +1 -1
- package/docs/discovery/workflow-modernization-discovery.md +326 -0
- package/docs/discovery/workflow-selection-for-discovery-tasks.md +33 -33
- package/docs/discovery/worktrain-status-briefing.md +1 -1
- package/docs/discovery/wr-discovery-goal-reframing.md +1 -1
- package/docs/docker.md +1 -1
- package/docs/ideas/backlog.md +227 -0
- package/docs/ideas/third-party-workflow-setup-design-thinking.md +1 -1
- package/docs/integrations/claude-code.md +5 -5
- package/docs/integrations/firebender.md +1 -1
- package/docs/plans/agentic-orchestration-roadmap.md +2 -2
- package/docs/plans/mr-review-workflow-redesign.md +9 -9
- package/docs/plans/ui-ux-workflow-design-candidates.md +4 -4
- package/docs/plans/ui-ux-workflow-discovery.md +2 -2
- package/docs/plans/workflow-categories-candidates.md +8 -8
- package/docs/plans/workflow-categories-discovery.md +4 -4
- package/docs/plans/workflow-modernization-design.md +430 -0
- package/docs/plans/workflow-staleness-detection-candidates.md +11 -11
- package/docs/plans/workflow-staleness-detection-review.md +4 -4
- package/docs/plans/workflow-staleness-detection.md +9 -9
- package/docs/plans/workrail-platform-vision.md +3 -3
- package/docs/reference/agent-context-cleaner-snippet.md +1 -1
- package/docs/reference/agent-context-guidance.md +4 -4
- package/docs/reference/context-optimization.md +2 -2
- package/docs/roadmap/now-next-later.md +2 -2
- package/docs/roadmap/open-work-inventory.md +16 -16
- package/docs/workflows.md +31 -31
- package/package.json +1 -1
- package/spec/workflow-tags.json +47 -47
- package/workflows/adaptive-ticket-creation.json +16 -16
- package/workflows/architecture-scalability-audit.json +22 -22
- package/workflows/bug-investigation.agentic.v2.json +3 -3
- package/workflows/classify-task-workflow.json +1 -1
- package/workflows/coding-task-workflow-agentic.json +6 -6
- package/workflows/cross-platform-code-conversion.v2.json +8 -8
- package/workflows/document-creation-workflow.json +8 -8
- package/workflows/documentation-update-workflow.json +8 -8
- package/workflows/intelligent-test-case-generation.json +2 -2
- package/workflows/learner-centered-course-workflow.json +2 -2
- package/workflows/mr-review-workflow.agentic.v2.json +4 -4
- package/workflows/personal-learning-materials-creation-branched.json +8 -8
- package/workflows/presentation-creation.json +5 -5
- package/workflows/production-readiness-audit.json +1 -1
- package/workflows/relocation-workflow-us.json +31 -31
- package/workflows/routines/context-gathering.json +1 -1
- package/workflows/routines/design-review.json +1 -1
- package/workflows/routines/execution-simulation.json +1 -1
- package/workflows/routines/feature-implementation.json +3 -3
- package/workflows/routines/final-verification.json +1 -1
- package/workflows/routines/hypothesis-challenge.json +1 -1
- package/workflows/routines/ideation.json +1 -1
- package/workflows/routines/parallel-work-partitioning.json +3 -3
- package/workflows/routines/philosophy-alignment.json +2 -2
- package/workflows/routines/plan-analysis.json +1 -1
- package/workflows/routines/plan-generation.json +1 -1
- package/workflows/routines/tension-driven-design.json +6 -6
- package/workflows/scoped-documentation-workflow.json +26 -26
- package/workflows/ui-ux-design-workflow.json +14 -14
- package/workflows/workflow-diagnose-environment.json +1 -1
- package/workflows/workflow-for-workflows.json +1 -1
|
@@ -0,0 +1,490 @@
|
|
|
1
|
+
# Design Session: Capability Probe (Phase 0)
|
|
2
|
+
|
|
3
|
+
> **Note (updated):** This doc has been updated by a second session triggered by the same probe pattern. Prior session findings are preserved below. New session additions are marked with `[Session 2]`.
|
|
4
|
+
|
|
5
|
+
## Context / Ask
|
|
6
|
+
|
|
7
|
+
This session was initiated as a **capability probe** (`probeOnly: true`). The trigger instruction was:
|
|
8
|
+
|
|
9
|
+
> "This is a delegation capability probe. Complete immediately. Return 'delegation_confirmed' in your step notes."
|
|
10
|
+
|
|
11
|
+
The workflow advanced past the probe step into Phase 0. Since no real human goal was provided, Phase 0 documents the probe outcome and the most natural candidate goal given the current project state.
|
|
12
|
+
|
|
13
|
+
**Stated goal (trigger, Session 1):** Confirm WorkRail Executor delegation is available.
|
|
14
|
+
**Stated goal (trigger, Session 2):** Same probe pattern, different phrasing -- "delegation_confirmed" string required.
|
|
15
|
+
|
|
16
|
+
**Reframed problem (Session 1):** The WorkRail project's highest-priority unstarted work is legacy workflow modernization (`exploration-workflow.json` and peers). The probe was a prerequisite check to determine whether parallel delegation could accelerate that work.
|
|
17
|
+
|
|
18
|
+
**Reframed problem (Session 2, from Phase -1 challenge step):** Before committing to a multi-step research path that may rely on optional runtime capabilities, determine which capabilities are actually available in this session so the execution path can be appropriately scoped or degraded without silent failure.
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## Path Recommendation
|
|
23
|
+
|
|
24
|
+
**`design_first`** -- because the trigger was a `solution_statement` (probe a specific capability) and the real underlying need (modernize legacy workflows efficiently) requires first confirming what tools are available and how they affect approach.
|
|
25
|
+
|
|
26
|
+
**Why not `landscape_first`:** The landscape (current workflow inventory, modernization criteria) is already well-documented in `docs/tickets/next-up.md` and `docs/roadmap/open-work-inventory.md`. We don't need another survey.
|
|
27
|
+
|
|
28
|
+
**Why not `full_spectrum`:** The problem definition is clear; the uncertainty is execution strategy (with vs. without delegation).
|
|
29
|
+
|
|
30
|
+
---
|
|
31
|
+
|
|
32
|
+
## Constraints / Anti-goals
|
|
33
|
+
|
|
34
|
+
**Constraints:**
|
|
35
|
+
- No `wr.executor` workflow exists -- delegation is unavailable
|
|
36
|
+
- All synthesis must remain with the main agent
|
|
37
|
+
- Protected files (`src/daemon/`, `src/v2/`, etc.) must not be touched
|
|
38
|
+
- Must not push directly to main
|
|
39
|
+
|
|
40
|
+
**Anti-goals:**
|
|
41
|
+
- Do not confabulate a problem to fill the workflow's structure
|
|
42
|
+
- Do not pretend delegation is available when it is not
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Capability Probe Results
|
|
47
|
+
|
|
48
|
+
| Capability | Status | Evidence | Session |
|---|---|---|---|
| `wr.executor` delegation | **UNAVAILABLE** | `spawn_agent` returned `workflow_not_found: wr.executor` | Session 1 |
| Web browsing tool | **UNAVAILABLE** | No web browsing MCP tool in agent tool set (Bash/Read/Write/Edit/Glob/Grep/spawn_agent/signal_coordinator/complete_step/report_issue only) | Session 1 |
| `wr.executor` delegation | **UNAVAILABLE** | Session 2 live probe: `spawn_agent(workflowId: "wr.executor")` returned `outcome: "error"`, `notes: "workflow_not_found -- wr.executor"` | Session 2 |
| Network reachability | **AVAILABLE** | `curl https://example.com` succeeded within 5s | Session 2 |
| Web browsing tool | **UNAVAILABLE** | No structured web tool (WebFetch, BrowseWeb, etc.) in Session 2 tool set; network is up but no tool bridges it | Session 2 |
|
|
55
|
+
|
|
56
|
+
**Fallback path (both sessions):** Main-agent-only execution, codebase-only research. All analysis, synthesis, and writing handled inline using local file tools and Bash.
|
|
57
|
+
|
|
58
|
+
**Limitations:** No independent parallel cognitive perspectives from subagents; no external research beyond repository contents.
|
|
59
|
+
|
|
60
|
+
**`retriageNeeded`:** `false` -- prior session findings + Session 2 live probes confirm same capability state. Path recommendation (`design_first`) and rigor mode (`STANDARD`) remain valid. No path change needed.
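
A minimal sketch of how the probe evidence above could be folded into a capability state and a fallback path. The type and function names (`SpawnResult`, `CapabilityState`, `classifyProbe`) are hypothetical and not taken from the WorkRail source; only the `outcome` / `notes` fields and the tool names come from the table. It assumes any outcome other than `"error"` means delegation is available.

```typescript
// Hypothetical shape: only `outcome` and `notes` are quoted in the probe evidence.
type SpawnResult = { outcome: string; notes: string };

interface CapabilityState {
  delegationAvailable: boolean;
  webBrowsingAvailable: boolean;
  fallbackPath: "delegated" | "main_agent_only";
  limitations: string[];
}

// Structured web tools named as examples in the probe table.
const WEB_TOOLS: string[] = ["WebFetch", "BrowseWeb"];

// Classify an already-completed spawn attempt plus the session's tool list.
// Assumption: any outcome other than "error" counts as delegation being available.
function classifyProbe(spawn: SpawnResult, toolNames: string[]): CapabilityState {
  const delegationAvailable = spawn.outcome !== "error";
  const webBrowsingAvailable = toolNames.some((name) => WEB_TOOLS.indexOf(name) !== -1);

  const limitations: string[] = [];
  if (!delegationAvailable) limitations.push("no parallel subagent perspectives");
  if (!webBrowsingAvailable) limitations.push("no external research beyond repository contents");

  return {
    delegationAvailable,
    webBrowsingAvailable,
    fallbackPath: delegationAvailable ? "delegated" : "main_agent_only",
    limitations,
  };
}
```

Keeping the classification pure (no I/O) means the same function can be fed the recorded Session 1 or Session 2 evidence and compared, which is roughly what the `retriageNeeded` check does by hand.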
|
|
61
|
+
|
|
62
|
+
---
|
|
63
|
+
|
|
64
|
+
## Artifact Strategy
|
|
65
|
+
|
|
66
|
+
This design doc (`docs/design/probe-session-phase0.md`) is a **human-readable artifact only**. It is not workflow memory.
|
|
67
|
+
|
|
68
|
+
| What | Where | Durable across rewind? |
|---|---|---|
| Execution truth (decisions, findings, rationale) | Step notes + context variables | Yes |
| Path state (goalType, reframedProblem, capabilities) | Context variables | Yes |
| Human-readable synthesis, design reasoning | This doc | No (sidecar -- may not survive rewind) |
| GitHub tickets / PR references | `gh` CLI + issue numbers in notes | Yes (external system) |
|
|
74
|
+
|
|
75
|
+
**What this doc is for:** Human review, design reasoning legibility, optional reference by the project owner.
|
|
76
|
+
|
|
77
|
+
**What this doc is NOT:** Required for the workflow to continue correctly. If this file is lost or diverges, the session notes and context variables are the authoritative source. The doc is reconstructable from step notes if needed.
|
|
78
|
+
|
|
79
|
+
**Update discipline:** Update this doc at each phase boundary. Do not let it get more than one step out of date. If a chat rewind occurs and the doc is stale, trust the step notes over the doc.
|
|
80
|
+
|
|
81
|
+
---
|
|
82
|
+
|
|
83
|
+
## Landscape Packet
|
|
84
|
+
|
|
85
|
+
> **[Session 2 update]:** Fresh scan confirms prior session findings. New findings added below with `[NEW]` markers.
|
|
86
|
+
|
|
87
|
+
### Current state summary
|
|
88
|
+
|
|
89
|
+
WorkRail has ~24 bundled workflows. All are unstamped (spec v3 staleness advisory fires for all of them -- `validate:registry` confirms). The registry validates (5 variants pass). No workflow named `exploration-workflow.json` exists on disk -- that ticket is stale.
|
|
90
|
+
|
|
91
|
+
**[NEW] `wr.shaping` itself is unstamped.** This workflow (the one currently running) is listed in the unstamped advisory output. This is not a blocker but is worth noting for planning hygiene.
|
|
92
|
+
|
|
93
|
+
### Modernization scoring (modern-feature markers out of 5)
|
|
94
|
+
|
|
95
|
+
| Score | Workflows |
|---|---|
| 0/5 | `test-session-persistence`, `wr.diagnose-environment` |
| 1/5 | `wr.adaptive-ticket-creation`, `wr.document-creation`, `wr.documentation-update`, `wr.intelligent-test-case-generation`, `learner-centered-course-workflow`, `wr.personal-learning-materials`, `wr.presentation-creation`, `wr.scoped-documentation`, `test-artifact-loop-control` |
| 2/5 | `bug-investigation`, `wr.classify-task`, `wr.coding-task`, `wr.cross-platform-code-conversion`, `wr.relocation-us`, `wr.shaping` |
| 3/5 | `wr.architecture-scalability-audit`, `mr-review-workflow`, `wr.ui-ux-design`, `wr.discovery` |
| 4/5 | `wr.production-readiness-audit`, `wr.workflow-for-workflows`, `wr.workflow-for-workflows.v2` |
|
|
102
|
+
|
|
103
|
+
Modern markers checked: `metaGuidance`, `references`, `features`, `recommendedPreferences`, `templateCalls`.
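
The scoring can be sketched as a count of which markers a workflow definition carries. This is an illustration, not the script that produced the table: it assumes each marker is a top-level key of the workflow JSON and that an empty array, object, or string does not count as present. `modernizationScore` and `isPresent` are hypothetical names.

```typescript
// The five markers listed above.
const MODERN_MARKERS = [
  "metaGuidance",
  "references",
  "features",
  "recommendedPreferences",
  "templateCalls",
] as const;

type WorkflowJson = Record<string, unknown>;

// Assumption: a marker is "present" only when its value is non-empty.
function isPresent(value: unknown): boolean {
  if (value === undefined || value === null) return false;
  if (Array.isArray(value)) return value.length > 0;
  if (typeof value === "object") return Object.keys(value as object).length > 0;
  if (typeof value === "string") return value.length > 0;
  return true;
}

// Score out of 5: the number of modern markers the workflow carries.
function modernizationScore(workflow: WorkflowJson): number {
  return MODERN_MARKERS.filter((marker) => isPresent(workflow[marker])).length;
}
```

If `templateCalls` actually lives inside individual steps rather than at the top level, the check for that marker would need to walk the steps instead.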
|
|
104
|
+
|
|
105
|
+
### Hard constraints
|
|
106
|
+
|
|
107
|
+
1. Protected files: `src/daemon/`, `src/v2/`, `src/trigger/`, `docs/ideas/backlog.md` -- cannot be touched
|
|
108
|
+
2. No push to main -- feature branch + PR required
|
|
109
|
+
3. Stamp a workflow only after running `wr.workflow-for-workflows` on it
|
|
110
|
+
4. Commit message format enforced by pre-commit hook
|
|
111
|
+
|
|
112
|
+
### Main existing approaches
|
|
113
|
+
|
|
114
|
+
- **`wr.workflow-for-workflows.v2.json`**: the canonical tool for modernizing other workflows; drives the authoring QA pass
|
|
115
|
+
- **`docs/authoring-v2.md` + `docs/authoring.md`**: authoritative references for what "modern" means
|
|
116
|
+
- **`stamp-workflow` script**: marks a workflow as conformant after modernization
|
|
117
|
+
|
|
118
|
+
### Obvious contradictions / gaps
|
|
119
|
+
|
|
120
|
+
1. **`exploration-workflow.json` is referenced in tickets but has already shipped.** Git history confirms it was modernized in #158 (Mar 27) and then consolidated into `wr.discovery.json` (Mar 29). The ticket in `docs/tickets/next-up.md` is stale -- the work is already done. The next modernization candidates are the score 1/5 workflows.
|
|
121
|
+
2. **`wr.executor` delegation referenced by 7 workflow files but the workflow doesn't exist.** Session 2 confirmed: `grep "WorkRail Executor" workflows/**` returns 7 files (ui-ux, mr-review x4, production-readiness x3, bug-investigation x5, coding-task, wr.architecture-scalability-audit, wr.discovery x6). All these references are in workflow step prompts -- not engine/daemon code. The gap is at the authoring layer, not the runtime layer. `spawn_agent` in `src/daemon/workflow-runner.ts` correctly returns `workflow_not_found` -- no silent failure at the engine level.
|
|
122
|
+
3. **[NEW] Graceful degradation is already built into most affected workflows.** `wr.discovery.json` has 14 references to `delegationAvailable` checks. `mr-review`, `wr.production-readiness-audit`, `bug-investigation` all have explicit fallback language. The workflows are already designed to degrade gracefully -- the gap is about *quality ceiling* (parallel perspectives unavailable in daemon mode), not about broken behavior.
|
|
123
|
+
4. **[NEW] No GitHub ticket tracks the `wr.executor` gap or the delegation contract question.** `gh issue list --search "executor delegation moderniz"` returns no matches. This is a known gap without any tracked resolution path.
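
The reference count in item 2 can be reproduced without a shell. A small sketch, assuming the `xN` suffixes in that item denote per-file match counts and that file contents have already been read into memory; `referencesByFile` is a hypothetical helper, and the sample data in the usage note is made up.

```typescript
// Count non-overlapping occurrences of `phrase` in `text`.
function countOccurrences(text: string, phrase: string): number {
  if (phrase.length === 0) return 0;
  return text.split(phrase).length - 1;
}

// Map each file name to its match count, omitting files with no matches.
function referencesByFile(files: Record<string, string>, phrase: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const name of Object.keys(files)) {
    const n = countOccurrences(files[name] ?? "", phrase);
    if (n > 0) counts[name] = n;
  }
  return counts;
}
```

For example, `referencesByFile({ "a.json": "spawn TWO WorkRail Executors", "b.json": "no delegation here" }, "WorkRail Executor")` reports a single match in `a.json` and omits `b.json`.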
|
|
124
|
+
|
|
125
|
+
### Evidence gaps
|
|
126
|
+
|
|
127
|
+
- `exploration-workflow.json`: resolved -- was consolidated into `wr.discovery.json` in commit a0ddaaac (Mar 29). Ticket in `docs/tickets/next-up.md` is stale.
|
|
128
|
+
- `wr.executor` gap: still untracked. No GitHub issue. No design doc section beyond this one.
|
|
129
|
+
- **[NEW] Gap closed:** Whether graceful degradation is already implemented. Confirmed: it is, in all 7 affected workflows.
|
|
130
|
+
- `wr.executor` daemon gap: not tracked as a known issue in any doc or GitHub ticket found. New finding.
|
|
131
|
+
|
|
132
|
+
---
|
|
133
|
+
|
|
134
|
+
## Problem Frame Packet
|
|
135
|
+
|
|
136
|
+
> **[Session 2 update]:** Deep framing pass. Prior session frame preserved; Session 2 additions in `[Session 2]` blocks.
|
|
137
|
+
|
|
138
|
+
### Primary stakeholder
|
|
139
|
+
|
|
140
|
+
**Project owner (etienneb)** -- sole developer and operator. Uses WorkRail both as a user (runs workflows to do engineering work) and as the author/maintainer (writes and ships workflows, maintains the daemon). Both roles are affected by the `wr.executor` gap.
|
|
141
|
+
|
|
142
|
+
**[Session 2] Two-role stakeholder model (prior session collapsed these):**
|
|
143
|
+
|
|
144
|
+
| Role | Job | Pain from delegation gap | Pain from stale planning |
|---|---|---|---|
| **Platform builder** | Publish workflows that work correctly for external users across all execution contexts (MCP + daemon) | External users who run bundled workflows in daemon sessions get lower-quality outputs without knowing why -- broken promise as product publisher | External users cannot rely on ticket queue or docs to understand what's current |
| **Internal operator** | Run daemon sessions for their own engineering work (coding tasks, MR reviews, investigation) | Daemon sessions on high-value workflows (mr-review, wr.production-readiness-audit) systematically lack parallel reviewer families -- THOROUGH rigor degrades to STANDARD quality | Stale tickets confuse future agent sessions reading the work queue |
|
|
148
|
+
|
|
149
|
+
**Secondary stakeholders:** external users who import bundled workflows via workflow repos and attempt daemon-mode operation.
|
|
150
|
+
|
|
151
|
+
### Jobs / outcomes the stakeholder cares about
|
|
152
|
+
|
|
153
|
+
1. **Run high-quality autonomous sessions** -- when the daemon runs, it should produce correct output without silent capability gaps. A workflow that says "spawn WorkRail Executors" but can't actually do it is a broken promise.
|
|
154
|
+
2. **Maintain an accurate work queue** -- tickets should reflect real remaining work, not work already done. Stale tickets waste planning attention.
|
|
155
|
+
3. **Ship modernized workflows** -- the 9 score-1/5 workflows represent real user-facing debt; agents running them get worse guidance than agents running the score-3/5+ workflows.
|
|
156
|
+
|
|
157
|
+
### Pains / tensions
|
|
158
|
+
|
|
159
|
+
**Tension 1: Daemon and MCP delegation are architecturally incompatible, but workflows don't distinguish them.**
|
|
160
|
+
- In Claude Code (MCP mode): "WorkRail Executor" = Task tool with `subagent_type: workrail-executor`. This runs a sub-agent inside the same Claude session. No workflow registry lookup happens.
|
|
161
|
+
- In the daemon: "WorkRail Executor" = `spawn_agent(workflowId: "wr.executor")`. This requires a workflow to exist in the registry.
|
|
162
|
+
- Current `wr.discovery.json` and `wr.shaping.json` use delegation language ("spawn TWO WorkRail Executors") written for the MCP/Claude Code mental model. When run as a daemon session, the instruction cannot be honored: the spawn attempt returns `workflow_not_found`, and the agent just records `delegationAvailable: false` and proceeds solo.
|
|
163
|
+
- **Pain:** delegation-conditional quality improvements (parallel perspectives, hypothesis challengers) are systematically unavailable in daemon mode.
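
The two delegation models in Tension 1 can be made explicit as a discriminated union. This is a sketch under stated assumptions: `ExecutionContext`, `DelegationPlan`, and `planDelegation` are hypothetical names, and the only facts taken from the text are the `workrail-executor` subagent type, the `wr.executor` workflow id, and the `workflow_not_found` outcome.

```typescript
type ExecutionContext = "mcp" | "daemon";

// One variant per delegation model, plus the solo fallback.
type DelegationPlan =
  | { kind: "task_tool"; subagentType: "workrail-executor" }
  | { kind: "spawn_agent"; workflowId: "wr.executor" }
  | { kind: "solo"; reason: string };

function planDelegation(context: ExecutionContext, registeredWorkflowIds: string[]): DelegationPlan {
  // MCP mode: the Task tool runs a sub-agent in the same session; no registry lookup.
  if (context === "mcp") {
    return { kind: "task_tool", subagentType: "workrail-executor" };
  }
  // Daemon mode: spawn_agent needs the workflow to exist in the registry.
  if (registeredWorkflowIds.indexOf("wr.executor") !== -1) {
    return { kind: "spawn_agent", workflowId: "wr.executor" };
  }
  // Otherwise degrade to solo execution and keep the reason for disclosure.
  return { kind: "solo", reason: "workflow_not_found -- wr.executor" };
}
```

Carrying a `reason` on the solo branch is one way to make the fallback visible rather than silent.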
|
|
164
|
+
|
|
165
|
+
**[Session 2] Tension 1 deeper layer: the graceful degradation already works, but the quality ceiling is invisible.**
|
|
166
|
+
- All 7 affected workflows already check `delegationAvailable` before spawning. `wr.discovery.json` has 14 such checks. The daemon-mode behavior is "do the passes yourself in sequence" (from `metaGuidance`). This is implemented and functional.
|
|
167
|
+
- But: users (human and agent) running THOROUGH rigor in daemon mode cannot tell they are getting structurally lower-quality output than MCP mode. There is no signal, no badge, no warning. The gap is *invisible*. This is a different problem from "it doesn't work" -- it's "the quality degradation is silent."
|
|
168
|
+
- **Reframe:** the pain isn't `workflow_not_found` failures (those are already caught and degraded gracefully); the pain is that THOROUGH-mode quality in daemon is structurally capped without disclosure.
|
|
169
|
+
|
|
170
|
+
**Tension 2: Planning docs lag behind shipped code.**
|
|
171
|
+
- `docs/tickets/next-up.md` Ticket 2 says "exploration-workflow.json is the highest-priority candidate" -- but this was shipped 3+ weeks ago (consolidated into wr.discovery.json). The ticket is zombie work.
|
|
172
|
+
- Planning rot compounds: developers (human or agent) reading the ticket queue will repeat work or be confused about status.
|
|
173
|
+
|
|
174
|
+
**[Session 2] Tension 2 deeper layer: this isn't just one stale ticket -- it's a signal about the planning system's feedback loop.**
|
|
175
|
+
- The planning system (`now-next-later.md`, `open-work-inventory.md`, `tickets/next-up.md`) is designed to be updated as work ships. `AGENTS.md` says "When completing a feature: mark it done, update status, note what was delivered." This process did not happen for the `exploration-workflow.json` consolidation.
|
|
176
|
+
- The gap is at the process level: there's no lightweight mechanism that automatically prompts "a thing you tracked got done -- go mark it." The only enforcement is the human/agent reading the doc and noticing.
|
|
177
|
+
- **Risk:** if multiple daemon sessions are running and none updates the planning docs, the docs drift increasingly far from reality.
|
|
178
|
+
|
|
179
|
+
**Tension 3: Modernization scoring has no agreed definition of "done".**
|
|
180
|
+
- Score 4/5 workflows (`wr.production-readiness-audit`, `wr.workflow-for-workflows`) aren't at 5/5 because no workflow has `templateCalls`. It's unclear whether `templateCalls` is aspirational or required for "modern." The spec says "prefer templateCall when the goal is reusable inline routine structure" -- it's advisory, not required.
|
|
181
|
+
- **Pain:** if "done" isn't defined concretely, modernization work can balloon or under-deliver.
|
|
182
|
+
|
|
183
|
+
**[Session 2] Tension 4 (new): `wr.executor` creation may be premature given the Phase 2 composition roadmap.**
|
|
184
|
+
- `docs/plans/agentic-orchestration-roadmap.md` documents "Phase 2: Composition & Middleware Engine" which would include **auto-injection** of capability-checking steps: "If workflow `requires: ['subagents']`, automatically prepend `routine-environment-handshake`." This auto-injection model would replace the current manual `delegationAvailable` pattern in every workflow.
|
|
185
|
+
- If Phase 2 ever ships, a `wr.executor` workflow created manually now to enable `spawn_agent` would be thrown away -- the whole delegation vocabulary would change. Creating `wr.executor` is work that bets against the roadmap direction.
|
|
186
|
+
- **Pain:** investing in `wr.executor` may produce throwaway work if Phase 2 composition is the actual intended architecture.
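
For orientation, the auto-injection rule quoted from the roadmap could look roughly like the following. Phase 2 is unimplemented, so this is purely illustrative: the `Workflow` and `Step` shapes and the `injectHandshake` function are assumptions, and only the `requires: ['subagents']` condition and the `routine-environment-handshake` name come from the roadmap quote.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Step { id: string }
interface Workflow { id: string; requires?: string[]; steps: Step[] }

const HANDSHAKE_STEP_ID = "routine-environment-handshake";

// Prepend the handshake step when the workflow declares it requires subagents.
// Idempotent: a workflow that already starts with the handshake is returned as-is.
function injectHandshake(workflow: Workflow): Workflow {
  const needsSubagents = (workflow.requires ?? []).indexOf("subagents") !== -1;
  const alreadyInjected = workflow.steps[0]?.id === HANDSHAKE_STEP_ID;
  if (!needsSubagents || alreadyInjected) return workflow;
  return { ...workflow, steps: [{ id: HANDSHAKE_STEP_ID }, ...workflow.steps] };
}
```

Under this model the capability check lives in one injected step, which is why per-workflow `delegationAvailable` checks (and a hand-built `wr.executor`) might become redundant.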
|
|
187
|
+
|
|
188
|
+
### Constraints that matter in lived use
|
|
189
|
+
|
|
190
|
+
1. All fixes must be workflow files or documentation -- no protected-file changes (`src/daemon/`, `src/v2/`, etc.)
|
|
191
|
+
2. No `wr.executor` workflow means no delegation; any solution that requires delegation to work is circular
|
|
192
|
+
3. Phase 2 composition roadmap is unimplemented -- cannot rely on auto-injection today
|
|
193
|
+
4. The daemon soul rules cannot be modified autonomously
|
|
194
|
+
5. `automationLevel: recommendation_only` -- this session should produce a decision frame for human approval, not autonomous implementation
|
|
195
|
+
|
|
196
|
+
### Success criteria (observable)
|
|
197
|
+
|
|
198
|
+
1. **Daemon sessions that invoke `spawn_agent` do not get `workflow_not_found`** -- either `wr.executor` workflow exists and routes correctly, or workflows authored for daemon sessions explicitly skip delegation steps rather than failing
|
|
199
|
+
2. **`docs/tickets/next-up.md` Ticket 2 is marked done or replaced** with an accurate next-up candidate (score-1/5 workflow list)
|
|
200
|
+
3. **`validate:registry` staleness advisory shrinks** -- at least 3 workflows move from unstamped to stamped after modernization passes
|
|
201
|
+
4. **A modernized score-1/5 workflow passes `wr.workflow-for-workflows` audit** without major revision requests
|
|
202
|
+
5. **[Session 2 addition] Daemon-mode quality degradation is disclosed** -- workflows running in daemon mode where delegation is unavailable either disclose the quality ceiling to the user or route to a structurally equivalent path. "Silent degradation" is eliminated.
|
|
203
|
+
|
|
204
|
+
### Assumptions surfaced
|
|
205
|
+
|
|
206
|
+
- A: `wr.executor` was intentionally not created because daemon delegation was never designed for this pattern
|
|
207
|
+
- B: Workflows that say "spawn WorkRail Executors" were always intended for MCP/Claude Code context only
|
|
208
|
+
- C: The ticket staleness is unknown to the project owner (they may know it's done)
|
|
209
|
+
- **[Session 2] D: Phase 2 composition is a live roadmap item, not abandoned** -- if it's actually abandoned, the auto-injection argument against `wr.executor` creation disappears. Evidence: `docs/plans/agentic-orchestration-roadmap.md` lists it as "Next Up" but there's no GitHub issue for it and no recent commits toward it.
|
|
210
|
+
- **[Session 2] E: The quality degradation from daemon-mode solo execution is significant enough to matter.** If the difference between parallel reviewer families and sequential solo execution is small in practice, the whole delegation gap problem shrinks. Evidence to verify: compare a real mr-review session output in daemon mode vs. MCP mode (not done in this session).
|
|
211
|
+
|
|
212
|
+
### Reframes / HMW questions
|
|
213
|
+
|
|
214
|
+
**Reframe 1: The `wr.executor` problem isn't a missing workflow -- it's a missing contract.**
|
|
215
|
+
The workflow authoring language ("spawn WorkRail Executors") was designed for Claude Code / MCP context. The daemon's `spawn_agent` tool was designed for something different: structured sub-workflows with their own step sequencing. These are two different delegation models with different tradeoffs. The right question isn't "should we create `wr.executor`?" but "should daemon-mode workflows have a different delegation vocabulary from MCP-mode workflows?"
|
|
216
|
+
|
|
217
|
+
**Reframe 2: The stale ticket problem is a signal about planning hygiene, not just one stale ticket.**
|
|
218
|
+
If Ticket 2 is stale by 3+ weeks, the planning system isn't getting updated as work ships. The fix isn't just updating the ticket -- it's establishing a habit or automation that marks tickets done when git history shows the work landed.
|
|
219
|
+
|
|
220
|
+
**[Session 2] Reframe 3: The delegation gap is primarily a disclosure problem, not an implementation gap.**
|
|
221
|
+
Every affected workflow already degrades gracefully. The problem is not that daemon sessions fail -- it's that users don't know they're getting structurally lower-quality output. A one-line `metaGuidance` addition to each affected workflow ("When run in daemon mode without delegation, parallel review families are unavailable; THOROUGH rigor will run at effective STANDARD depth") would eliminate the *invisible* quality gap without creating `wr.executor` at all. This is a weaker fix than actual delegation but costs nearly zero implementation.
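
A sketch of the disclosure fix from Reframe 3, assuming `metaGuidance` is an array of strings on the workflow definition (an assumption about the schema, not confirmed here). `WorkflowDoc` and `addDisclosure` are hypothetical names; the disclosure wording is the one quoted above.

```typescript
const DISCLOSURE =
  "When run in daemon mode without delegation, parallel review families are unavailable; " +
  "THOROUGH rigor will run at effective STANDARD depth";

// Hypothetical, simplified workflow shape.
interface WorkflowDoc { id: string; metaGuidance?: string[] }

// Append the disclosure line once; return the input unchanged if it is already present.
function addDisclosure(workflow: WorkflowDoc): WorkflowDoc {
  const guidance = workflow.metaGuidance ?? [];
  if (guidance.indexOf(DISCLOSURE) !== -1) return workflow;
  return { ...workflow, metaGuidance: [...guidance, DISCLOSURE] };
}
```

Because the function returns a new object and is idempotent, it could be applied across all affected workflows in one pass and the result reviewed as an ordinary diff.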
|
|
222
|
+
|
|
223
|
+
**[Session 2] Reframe 4: HMW keep the planning docs current without relying on agent discipline?**
|
|
224
|
+
The stale ticket is a symptom of a process gap, not a knowledge gap. Both agents and humans know the work is done; nobody updated the doc. How might we make the update happen automatically (trigger a planning hygiene session when a PR merges that touches tracked work)? This is a product design question, not just a ticket maintenance question.
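
One lightweight form such automation could take is a heuristic that flags open tickets whose referenced files no longer exist on disk, which is how the `exploration-workflow.json` ticket would have surfaced. The `Ticket` shape and `findStaleCandidates` are hypothetical, and a flagged ticket is only a candidate for human review, not proof that the work is done.

```typescript
// Hypothetical ticket shape for illustration.
interface Ticket {
  id: string;
  status: "open" | "done";
  referencedFiles: string[];
}

// Flag open tickets where every referenced file is missing from the working tree.
// Tickets that reference no files are never flagged.
function findStaleCandidates(tickets: Ticket[], filesOnDisk: string[]): Ticket[] {
  return tickets.filter(
    (ticket) =>
      ticket.status === "open" &&
      ticket.referencedFiles.length > 0 &&
      ticket.referencedFiles.every((file) => filesOnDisk.indexOf(file) === -1)
  );
}
```

A check like this could run when a PR merges; whether it belongs in a workflow, a CI job, or a daemon trigger is left open here.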
|
|
225
|
+
|
|
226
|
+
### Primary framing risk
|
|
227
|
+
|
|
228
|
+
**[Session 1]** If `wr.executor` was intentionally left undefined because daemon sessions are _never_ supposed to use the delegation-conditional code paths in `wr.discovery.json` / `wr.shaping.json`, then the "gap" is by design and the real problem is just documentation.
|
|
229
|
+
|
|
230
|
+
**[Session 2 -- sharper version]:** The primary framing risk is that **the entire problem is already solved by the graceful degradation contract**, and this session is spending energy on a quality ceiling that the project owner has consciously accepted. Evidence that this risk is live: (a) `delegationAvailable` checks exist in all affected workflows; (b) the `metaGuidance` says "If delegation is unavailable, do the passes yourself in sequence"; (c) no GitHub issue was ever created for this gap despite multiple sessions observing it. Three sessions have now noticed the `wr.executor` gap and none triggered a fix. This is strong evidence the owner knows and has accepted it.
|
|
231
|
+
|
|
232
|
+
**Specific condition that would make the framing wrong:** If the project owner confirms "yes, daemon-mode quality degradation is acceptable and graceful degradation is the intended contract," then there is no delegation problem to solve. The only remaining work is (a) mark Ticket 2 done and (b) modernize a score-1/5 workflow. The entire `wr.executor` discussion becomes noise.
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
## Synthesis
|
|
237
|
+
|
|
238
|
+
### The opportunity
|
|
239
|
+
|
|
240
|
+
WorkRail daemon sessions have a real capability gap: `wr.executor` doesn't exist, so every delegation-conditional workflow path silently degrades to solo execution. This isn't catastrophic (graceful fallback works), but it means THOROUGH rigor mode's parallel cognition value is entirely unavailable in daemon context. Given that the daemon is the primary autonomous execution environment, this is a meaningful gap.
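The degradation described above can be sketched as a planning function: with delegation, review passes run as one parallel batch; without it, the same passes run one at a time. Only the `delegationAvailable` flag comes from the document. The `PassPlan` type, the mode names, and the disclosure string are assumptions made for illustration, not WorkRail's implementation.

```typescript
// Hypothetical types: only the delegationAvailable flag is taken from the document.
type ExecutionMode = "parallel-executors" | "solo-sequential";

interface PassPlan {
  mode: ExecutionMode;
  // Each inner array is a batch of passes that run together.
  batches: string[][];
  // Note surfaced to the user so the degradation is not silent.
  disclosure: string | null;
}

function planReviewPasses(delegationAvailable: boolean, passes: string[]): PassPlan {
  if (delegationAvailable) {
    // All passes go out together as one parallel batch.
    return {
      mode: "parallel-executors",
      batches: passes.length > 0 ? [passes.slice()] : [],
      disclosure: null,
    };
  }
  // Fallback: the same passes, one per batch, run by the main agent in sequence.
  return {
    mode: "solo-sequential",
    batches: passes.map((p) => [p]),
    disclosure: "Delegation unavailable: running " + passes.length + " passes solo, in sequence.",
  };
}

const daemonPlan = planReviewPasses(false, ["correctness", "security"]);
const mcpPlan = planReviewPasses(true, ["correctness", "security"]);
```

Returning a non-null `disclosure` in the fallback branch is the design point: the work still completes, but the reduced depth is stated rather than hidden.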
|
|
241
|
+
|
|
242
|
+
Simultaneously, planning hygiene has drifted: the highest-priority modernization ticket references work completed weeks ago, and modernization "done" criteria are fuzzy.
|
|
243
|
+
|
|
244
|
+
The real opportunity is: clarify what daemon-mode delegation should look like, close the ticket hygiene gap, and pick the right first workflow to actually modernize — in that order.
|
|
245
|
+
|
|
246
|
+
### Decision criteria (what any good direction must satisfy)
|
|
247
|
+
|
|
248
|
+
1. **Does not introduce implementation scope that violates protected-file boundaries** -- `src/daemon/` is protected; any fix must be a workflow file or documentation change only
|
|
249
|
+
2. **Resolves whether `wr.executor` should be created or declared out-of-scope** -- a concrete yes/no answer, not "it depends"
|
|
250
|
+
3. **Leaves the planning queue in a state an agent or human can act on** -- Ticket 2 marked done, next candidate identified
|
|
251
|
+
4. **Defines modernization "done" concretely enough that the first score-1/5 candidate can be scoped** -- not perfect, but workable
|
|
252
|
+
5. **Is executable without delegation** -- since `wr.executor` is unavailable, the direction cannot require it to function
|
|
253
|
+
|
|
254
|
+
### Riskiest assumption
|
|
255
|
+
|
|
256
|
+
**That creating `wr.executor` is the right fix for the daemon delegation gap.** The alternative -- that the right fix is to annotate workflows as "MCP-preferred for delegation features" and accept graceful degradation as the daemon behavior -- requires zero implementation and is already mostly done (the `delegationAvailable` fallback works correctly). Creating `wr.executor` is only better if there's clear value in running an actual structured sub-workflow rather than just a free-form sub-agent. That value case hasn't been made yet.
|
|
257
|
+
|
|
258
|
+
### Remaining uncertainty type
|
|
259
|
+
|
|
260
|
+
**Recommendation uncertainty** -- not research uncertainty or prototype-learning uncertainty. All facts are known. The uncertainty is which of two coherent framings is right: (a) implement `wr.executor` as a real workflow to enable structured delegation, or (b) treat graceful degradation as the intended daemon behavior and document it as such. These lead to materially different next actions.
|
|
261
|
+
|
|
262
|
+
### Strongest challenge against the framing
|
|
263
|
+
|
|
264
|
+
The entire framing was derived from incidental observations made during a capability probe session that was explicitly told to "complete step 1 and stop." The owner triggered this to check a prerequisite, not to commission a design study. The "problems" identified (`wr.executor` gap, stale ticket, fuzzy modernization criteria) are real but may be at the wrong priority level relative to what the owner actually cares about right now. The framing implicitly promotes these observations to "work that should be done next" -- but the `now-next-later.md` says "nothing actively in progress" and the groomed ticket queue points to workflow modernization, not delegation infrastructure. **The framing may be solving problems the owner hasn't asked to solve and wouldn't prioritize if asked.** The safest output of this session is a clear memo of findings + recommendations, not a commitment to implement `wr.executor`.
|
|
265
|
+
|
|
266
|
+
**Resolution:** This challenge is valid and shapes the candidate directions. Any direction that involves implementing `wr.executor` should be framed as a recommendation for human approval, not autonomous execution. The design-first path is justified -- but the output should be a decision frame, not a build plan.
|
|
267
|
+
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
## Candidate Generation Expectations (set before generation runs)
|
|
271
|
+
|
|
272
|
+
**Path:** `design_first` | **Rigor:** STANDARD | **Target count:** 3
|
|
273
|
+
|
|
274
|
+
> **[Session 2 update]:** Expectations sharpened after Phase 1f deep framing and Phase 2 synthesis. The problem has *narrowed* -- `wr.executor` creation is off the table as a candidate direction for this session (out of scope, owner hasn't asked, `automationLevel: recommendation_only`). The candidate spread must reflect the smaller, more focused work set.
|
|
275
|
+
|
|
276
|
+
### [Session 1] Required emphases (preserved)
|
|
277
|
+
|
|
278
|
+
1. **At least one candidate must meaningfully reframe the problem** -- not just "create wr.executor" (obvious solution) but a direction that challenges the assumption that new implementation is needed at all. The framing challenge from Phase 2 is live: graceful degradation may already be the right behavior.
|
|
279
|
+
|
|
280
|
+
2. **At least one candidate must address the full problem space** -- the `wr.executor` gap, the stale ticket, and the fuzzy modernization "done" criteria are three separate tensions. At least one direction should treat them as a coherent system rather than independent fixes.
|
|
281
|
+
|
|
282
|
+
3. **Candidates must respect the protected-file constraint** -- no candidate should require touching `src/daemon/`, `src/v2/`, `src/trigger/`. All directions must be workflow-file or documentation changes only.
|
|
283
|
+
|
|
284
|
+
4. **Candidates must be scoped for human-approval output** -- per the Phase 2 framing challenge resolution, no candidate should be a build plan for autonomous execution. All candidates produce a recommendation memo or decision artifact, not shipped code.
|
|
285
|
+
|
|
286
|
+
5. **Candidates must diverge meaningfully** -- if all 3-4 candidates are variants of "create wr.executor with slightly different scopes," the spread is too narrow. There must be at least one direction that does NOT involve creating `wr.executor`.
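The protected-file constraint in item 3 is mechanical enough to check before a PR is opened. The following is a minimal sketch, assuming a candidate's footprint is available as a list of repo-relative paths. The three prefixes are the ones named above; the function name and the sample paths are illustrative.

```typescript
// Protected directories as listed in the constraint above.
const PROTECTED_PREFIXES = ["src/daemon/", "src/v2/", "src/trigger/"];

// Returns the changed paths that fall under a protected directory.
function findProtectedViolations(changedPaths: string[]): string[] {
  return changedPaths.filter((path) =>
    PROTECTED_PREFIXES.some((prefix) => path.slice(0, prefix.length) === prefix)
  );
}

// Illustrative footprints: a docs-only change and one that reaches into the daemon.
const docsOnly = findProtectedViolations(["docs/tickets/next-up.md"]);
const touchesDaemon = findProtectedViolations([
  "docs/tickets/next-up.md",
  "src/daemon/example.ts", // hypothetical file name
]);
```

An empty result means the candidate stays within workflow-file and documentation changes as far as this check can tell.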
|
|
287
|
+
|
|
288
|
+
### [Session 2] Updated emphases (override Session 1 where they conflict)
|
|
289
|
+
|
|
290
|
+
1. **Target count is 3, not 3-4.** Problem has narrowed; 3 meaningfully divergent directions is the right spread. A fourth direction would need to justify itself against the narrowing.
|
|
291
|
+
|
|
292
|
+
2. **`wr.executor` creation is NOT a candidate direction.** Phase 1f/2 challenge resolved: creating `wr.executor` is out of scope for this session (`automationLevel: recommendation_only`, owner hasn't asked, protected files involved). The candidates live in the space of hygiene, disclosure, and decision-framing -- not implementation.
|
|
293
|
+
|
|
294
|
+
3. **The genuine reframe requirement is now about scope, not solution type.** For `design_first`, at least one direction must challenge the assumption that all three tensions (delegation gap, stale ticket, modernization "done") need addressing in the same session. One candidate may argue: "only do the hygiene work; the delegation question belongs in a separate conversation."
|
|
295
|
+
|
|
296
|
+
4. **At least one direction must produce something executable in this session.** The output cannot be entirely deferred. At minimum, one candidate should result in a concrete artifact (updated ticket, named next candidate, or disclosure line) that can be produced and shipped as a PR without human approval first.
|
|
297
|
+
|
|
298
|
+
5. **At least one direction must include the structured owner decision frame on `wr.executor`.** Even though autonomous implementation is off the table, the question itself needs to be framed clearly enough that the owner can decide in a single reading. One candidate must include this framing.
|
|
299
|
+
|
|
300
|
+
6. **No candidate may require `wr.executor` to function.** All three directions must be fully executable in the current session without any delegation capability.
|
|
301
|
+
|
|
302
|
+
### Pre-shaped direction sketches (from Phase 2 synthesis)
|
|
303
|
+
|
|
304
|
+
These sketches are starting points for the generation routine, not final candidates:
|
|
305
|
+
|
|
306
|
+
- **Direction A: Minimal hygiene** -- close stale Ticket 2, name the first score-1/5 modernization candidate, produce no `wr.executor` framing. Argument: respects the owner's existing priority queue.
|
|
307
|
+
- **Direction B: Hygiene + disclosure** -- A, plus add a one-line delegation disclosure to `metaGuidance` of the 7 affected workflows. Argument: eliminates invisible quality ceiling at near-zero cost.
|
|
308
|
+
- **Direction C: Hygiene + disclosure + decision frame** -- B, plus produce a structured yes/no question for the owner on `wr.executor` architecture. Argument: makes the implicit accepted state explicit and gives the owner a clean trigger to revisit if priorities change.
|
|
309
|
+
|
|
310
|
+
### What the later synthesis will check
|
|
311
|
+
|
|
312
|
+
- Does the candidate set include a genuine reframe (scope challenge, not just solution re-packaging)?
|
|
313
|
+
- Are all three candidates executable without delegation and without `wr.executor`?
|
|
314
|
+
- Does no candidate require protected-file changes?
|
|
315
|
+
- Does at least one candidate produce something shippable in this session?
|
|
316
|
+
- Does at least one candidate include the structured owner decision frame?
|
|
317
|
+
- Is the spread wide enough to represent a real choice -- not just "how much to do"?
|
|
318
|
+
|
|
319
|
+
## Candidate Directions
|
|
320
|
+
|
|
321
|
+
> [Session 2 — generated in Step 3 of injected tension-driven-design routine]
|
|
322
|
+
|
|
323
|
+
---
|
|
324
|
+
|
|
325
|
+
### Candidate 1: Planning Hygiene Only (Simplest sufficient change)
|
|
326
|
+
|
|
327
|
+
**Summary:** Update `docs/tickets/next-up.md` to close stale Ticket 2, name the first concrete score-1/5 modernization candidate, and stop. Do not touch workflow files, delegation, or `wr.executor`.
|
|
328
|
+
|
|
329
|
+
**Tensions resolved:**
|
|
330
|
+
- Tension 3 (planning docs lag shipped code): fully resolved — Ticket 2 marked done, accurate next candidate named
|
|
331
|
+
|
|
332
|
+
**Tensions accepted (not resolved):**
|
|
333
|
+
- Tension 1 (single vocabulary, two environments): accepted as-is — no disclosure added
|
|
334
|
+
- Tension 2 (graceful degradation quality ceiling is invisible): accepted — quality ceiling stays silent
|
|
335
|
+
- Tension 4 (YAGNI vs. leaving gap undocumented): partially accepted — no permanent ADR, but the design doc here serves as a soft record
|
|
336
|
+
|
|
337
|
+
**Boundary solved at:** Planning/documentation layer only. No workflow files modified. No PR touches any `workflows/` or `src/` file.
|
|
338
|
+
|
|
339
|
+
**Specific failure mode:** Future daemon sessions (or human developers) read the corrected ticket queue, start modernizing a score-1/5 workflow, and when they encounter the delegation-unavailable situation they spin up their own analysis all over again — because there's no discovery path to this design doc. The planning hygiene improvement is real but doesn't prevent re-analysis waste.
|
|
340
|
+
|
|
341
|
+
**Relation to existing patterns:** Follows AGENTS.md exactly: "When completing a feature: mark it done, update status, note what was delivered." This is the explicitly prescribed behavior for completed work. No departure.
|
|
342
|
+
|
|
343
|
+
**What you gain:** Minimal scope, no risk of touching workflow files incorrectly, respects the owner's groomed priority queue exactly. One PR, one file change.
|
|
344
|
+
|
|
345
|
+
**What you give up:** The invisible quality ceiling remains. The architectural decision about MCP-vs-daemon delegation stays implicit. Every future session will re-discover it.
|
|
346
|
+
|
|
347
|
+
**Impact surface:** `docs/tickets/next-up.md` only. Zero impact on workflow behavior, engine behavior, or any user-visible surface.
|
|
348
|
+
|
|
349
|
+
**Scope judgment:** `too narrow` — it solves the most obvious hygiene problem but leaves both the observability gap and the architectural decision undocumented. The cost of addressing those is low (see Candidates 2 and 3), so stopping here wastes available opportunity.
|
|
350
|
+
|
|
351
|
+
**Philosophy principles honored:** YAGNI with discipline (does only what is explicitly prescribed), Prefer atomicity for correctness (one coherent doc update), Document "why" not "what" (updates the ticket with what shipped).
|
|
352
|
+
|
|
353
|
+
**Philosophy principles conflicted:** Observability as a constraint (quality ceiling stays invisible), Document "why" not "what" (the decision NOT to build `wr.executor` stays implicit).
|
|
354
|
+
|
|
355
|
+
---
|
|
356
|
+
|
|
357
|
+
### Candidate 2: Hygiene + Authoring-Layer Disclosure Patch (Repo pattern adaptation)
|
|
358
|
+
|
|
359
|
+
**Summary:** Close Ticket 2 (as Candidate 1), then add a single concise `metaGuidance` entry to the 7 affected workflows stating that parallel WorkRail Executor invocations require MCP/Claude Code context — making the quality ceiling visible at session start and resume.
|
|
360
|
+
|
|
361
|
+
**Concrete specification:**
|
|
362
|
+
- File changes: `docs/tickets/next-up.md` + 7 workflow JSON files: `wr.discovery.json`, `mr-review-workflow.agentic.v2.json`, `wr.production-readiness-audit.json`, `bug-investigation.agentic.v2.json`, `wr.coding-task.json`, `wr.architecture-scalability-audit.json`, `wr.ui-ux-design.json`
|
|
363
|
+
- Entry added to each: a single string ≤256 chars (schema constraint), appended to `metaGuidance` array
|
|
364
|
+
- Example text: `"WorkRail Executor delegation requires MCP/Claude Code context. In daemon mode, delegationAvailable=false; proceed solo and note that parallel reviewer families are unavailable."`
|
|
365
|
+
- Character count of example: 176 chars (within 256-char limit)
|
|
366
|
+
- Surface: `metaGuidance` is surfaced at session start and resume only (NOT repeated on every step advance) — confirmed from schema: "Persistent behavioral rules surfaced on start and resume."
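The specification above can be sketched as a small helper that validates the entry against the 256-character limit and appends it to `metaGuidance`. The `WorkflowLike` interface is a deliberately minimal, hypothetical view of a workflow file. Only the `metaGuidance` array, the length limit, and the example text come from the document, and the existing entry in the sample workflow is made up.

```typescript
// Schema limit quoted above: each metaGuidance entry is at most 256 characters.
const META_GUIDANCE_MAX_CHARS = 256;

// Hypothetical minimal view of a workflow file; real workflows carry many more fields.
interface WorkflowLike {
  id: string;
  metaGuidance: string[];
}

const DISCLOSURE =
  "WorkRail Executor delegation requires MCP/Claude Code context. " +
  "In daemon mode, delegationAvailable=false; proceed solo and note that " +
  "parallel reviewer families are unavailable.";

// Returns a copy of the workflow with the entry appended; does not mutate the input.
function addDisclosure(workflow: WorkflowLike, entry: string): WorkflowLike {
  if (entry.length > META_GUIDANCE_MAX_CHARS) {
    throw new Error("metaGuidance entry exceeds " + META_GUIDANCE_MAX_CHARS + " chars");
  }
  // Idempotent: applying the patch twice must not duplicate the entry.
  if (workflow.metaGuidance.indexOf(entry) !== -1) {
    return workflow;
  }
  return { ...workflow, metaGuidance: [...workflow.metaGuidance, entry] };
}

const before: WorkflowLike = { id: "wr.discovery", metaGuidance: ["Existing rule."] };
const patched = addDisclosure(before, DISCLOSURE);
const patchedTwice = addDisclosure(patched, DISCLOSURE);
```

Making the helper idempotent matters when the same patch is applied across 7 files, where a retried run should not add duplicate entries.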
|
|
367
|
+
|
|
368
|
+
**Tensions resolved:**
|
|
369
|
+
- Tension 2 (quality ceiling invisible): resolved — the quality ceiling is disclosed at session start and every resume
|
|
370
|
+
- Tension 3 (planning docs lag): resolved — Ticket 2 updated
|
|
371
|
+
- Tension 4 (YAGNI vs. leaving gap undocumented): partially resolved — the disclosure text serves as an in-band record of the delegation contract
|
|
372
|
+
|
|
373
|
+
**Tensions accepted (not resolved):**
|
|
374
|
+
- Tension 1 (single vocabulary, two environments): accepted — no schema-level distinction between MCP and daemon execution contexts; the disclosure is a patch, not an architectural fix
|
|
375
|
+
|
|
376
|
+
**Boundary solved at:** Workflow authoring layer. `metaGuidance` is the established mechanism for ambient behavioral rules — confirmed in `src/types/workflow-definition.ts` comments and `src/application/services/compiler/template-registry.ts` (routine `metaGuidance` injected as step-level guidance). This is exactly the right surface for this kind of rule.
|
|
377
|
+
|
|
378
|
+
**Specific failure mode:** An author adds a new high-value workflow with delegation instructions but forgets to add the disclosure `metaGuidance` entry. The disclosure coverage is only as good as authoring discipline. No enforcement mechanism exists.
|
|
379
|
+
|
|
380
|
+
**Relation to existing patterns:** Directly adapts the existing `metaGuidance` pattern. All 7 affected workflows already use `metaGuidance` for behavioral rules. Adding one more entry follows the established authoring pattern exactly. No new mechanisms invented.
|
|
381
|
+
|
|
382
|
+
**What you gain:** The quality ceiling becomes visible to agents and users at session start. Authoring discipline is the only new requirement. Zero engine changes. Schema-compliant. Shippable in one PR.
|
|
383
|
+
|
|
384
|
+
**What you give up:** `Make illegal states unrepresentable` is still violated — a workflow with delegation instructions can be run in daemon mode without any compile-time or startup warning. The disclosure is runtime advisory, not structural enforcement. Also: 7 workflow files touched means 7 files that could accidentally introduce JSON errors.
|
|
385
|
+
|
|
386
|
+
**Impact surface:** 7 workflow JSON files + 1 planning doc. The disclosure text affects what agents see at session start when running these workflows in any environment, including MCP, where delegation IS available and an unconditional disclosure would be misleading. The entry must therefore be phrased conditionally: not "delegation is unavailable" but "delegation requires MCP/Claude Code context; check `delegationAvailable` before spawning."
|
|
387
|
+
|
|
388
|
+
**Scope judgment:** `best-fit` — resolves the two concrete observable problems (invisible quality ceiling, stale ticket) without touching protected files or over-investing in an architectural solution the owner hasn't requested.
|
|
389
|
+
|
|
390
|
+
**Philosophy principles honored:** Observability as a constraint (quality ceiling now disclosed), Validate at boundaries, trust inside (disclosure at session-start boundary), Document "why" not "what" (explains the MCP-vs-daemon split), YAGNI with discipline (no speculative `wr.executor` creation).
|
|
391
|
+
|
|
392
|
+
**Philosophy principles conflicted:** Architectural fixes over patches (this is explicitly a patch — `metaGuidance` is a localized special-case, not a structural invariant change), Make illegal states unrepresentable (the illegal state — running a delegation-expecting workflow in daemon mode — remains representable).
|
|
393
|
+
|
|
394
|
+
---
|
|
395
|
+
|
|
396
|
+
### Candidate 3: Hygiene + ADR documenting the MCP/Daemon delegation contract (Different mechanism, different tension target)
|
|
397
|
+
|
|
398
|
+
**Summary:** Close Ticket 2 (as Candidate 1), then author ADR 011 in `docs/adrs/` formally recording the architectural decision that WorkRail workflows use a single vocabulary for two incompatible delegation models, with the MCP model primary and daemon graceful degradation as the explicitly accepted behavior — creating a permanent, discoverable decision record that prevents future re-analysis.
|
|
399
|
+
|
|
400
|
+
**Concrete specification:**
|
|
401
|
+
- File changes: `docs/tickets/next-up.md` + `docs/adrs/011-mcp-daemon-delegation-vocabulary.md`
|
|
402
|
+
- ADR structure follows existing pattern (ADRs 001–010): Status, Date, Context, Decision, Consequences
|
|
403
|
+
- Status: `Accepted`
|
|
404
|
+
- Decision content: (a) WorkRail daemon `spawn_agent` targets workflow IDs; (b) `wr.executor` workflow does not exist and will not be created until a concrete use case justifies structured delegation; (c) workflows authored with "spawn WorkRail Executors" language are MCP-context-primary; (d) daemon-mode graceful degradation (check `delegationAvailable`, proceed solo) is the explicitly accepted behavior; (e) this decision should be revisited when Phase 2 composition engine is designed
|
|
405
|
+
- Does NOT add `metaGuidance` entries to workflow files (leaves that as a follow-up if the owner wants it)
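To make the proposed ADR structure concrete, here is a sketch that renders a draft into the five sections named above. The section names, the number 011, the `Accepted` status, and the decision items come from the specification. The heading layout, the title wording, the date placeholder, and the consequences text are assumptions, since the document does not show an existing ADR file.

```typescript
// Section names follow the ADR pattern quoted above: Status, Date, Context, Decision, Consequences.
interface AdrDraft {
  number: number;
  title: string;
  status: string;
  date: string; // left to the author; the document does not fix a date
  context: string;
  decision: string[];
  consequences: string[];
}

// Renders the draft as Markdown. The heading layout is an assumption, not the repo's template.
function renderAdr(adr: AdrDraft): string {
  const id = ("000" + adr.number).slice(-3);
  const bullets = (items: string[]): string => items.map((item) => "- " + item).join("\n");
  return [
    "# ADR " + id + ": " + adr.title,
    "",
    "## Status",
    adr.status,
    "",
    "## Date",
    adr.date,
    "",
    "## Context",
    adr.context,
    "",
    "## Decision",
    bullets(adr.decision),
    "",
    "## Consequences",
    bullets(adr.consequences),
    "",
  ].join("\n");
}

const adr011 = renderAdr({
  number: 11,
  title: "MCP/daemon delegation vocabulary", // hypothetical title derived from the file name
  status: "Accepted",
  date: "YYYY-MM-DD", // placeholder
  context: "Workflows use one delegation vocabulary across MCP and daemon contexts.",
  decision: [
    "wr.executor is not created until a concrete use case justifies structured delegation.",
    "Daemon-mode graceful degradation (check delegationAvailable, proceed solo) is the accepted behavior.",
  ],
  consequences: ["To be written by the ADR author."], // placeholder
});
```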
|
|
406
|
+
|
|
407
|
+
**Tensions resolved:**
|
|
408
|
+
- Tension 4 (YAGNI vs. leaving gap undocumented): fully resolved — the decision NOT to build `wr.executor` is permanently recorded with rationale; future sessions find the ADR before re-deriving the analysis
|
|
409
|
+
- Tension 3 (planning docs lag): resolved — Ticket 2 updated
|
|
410
|
+
- Tension 1 (single vocabulary, two environments): explicitly accepted and documented as a deliberate decision, not an oversight
|
|
411
|
+
|
|
412
|
+
**Tensions accepted (not resolved):**
|
|
413
|
+
- Tension 2 (quality ceiling invisible): accepted — no runtime disclosure added; the ADR is for human and agent developers reading architecture docs, not for end-users at session start
|
|
414
|
+
|
|
415
|
+
**Boundary solved at:** Architecture decision record layer. An ADR is the right boundary for "this architectural question was considered and decided" — it creates a discoverable stop-point for future agents and human developers. `docs/adrs/` is an established, actively-maintained location (10 existing ADRs with clear format).
|
|
416
|
+
|
|
417
|
+
**Specific failure mode:** The ADR is only useful if future sessions (or humans) find it. There is no mechanism that says "before asking about `wr.executor`, read ADR 011." A daemon session given a task that involves delegation will not automatically look for ADRs unless the AGENTS.md or soul rules mention it. The ADR is discoverable via `grep` and `ls docs/adrs/` but not enforced.
|
|
418
|
+
|
|
419
|
+
**Relation to existing patterns:** Directly follows the existing ADR pattern. ADRs 001-010 in `docs/adrs/` establish the format: Status, Date, Context, Decision, Consequences. This would be ADR 011 with the same structure. No new mechanisms. The most recent ADR (010) is from 2026-04-xx (release pipeline), so the pattern is actively maintained.
|
|
420
|
+
|
|
421
|
+
**What you gain:** A permanent, versioned, discoverable architectural decision record. Future sessions running `ls docs/adrs/` or `grep "executor\|delegation" docs/adrs/` will find the decision. The analysis done across three sessions is captured once, permanently, in the canonical location for architectural decisions.
|
|
422
|
+
|
|
423
|
+
**What you give up:** The quality ceiling stays invisible to runtime agents — no `metaGuidance` disclosure. An agent running a daemon session won't see the ADR unless they explicitly look for architecture docs. The ADR serves human developers and future planning sessions, not live execution agents.
|
|
424
|
+
|
|
425
|
+
**Impact surface:** `docs/adrs/011-mcp-daemon-delegation-vocabulary.md` + `docs/tickets/next-up.md`. Zero impact on workflow behavior or engine behavior.
|
|
426
|
+
|
|
427
|
+
**Scope judgment:** `best-fit` (for a different audience than Candidate 2) — Candidate 2 serves live agents; Candidate 3 serves future architects and planning sessions. Both are appropriate but serve different primary tensions.
|
|
428
|
+
|
|
429
|
+
**Philosophy principles honored:** Document "why" not "what" (ADR explains the architectural decision and rationale, not just the state), YAGNI with discipline (explicitly records NOT building `wr.executor` as a conscious decision with clear conditions for revisiting), Architectural fixes over patches (an ADR is not a patch — it changes the constraint/invariant by making it explicit and discoverable).
|
|
430
|
+
|
|
431
|
+
**Philosophy principles conflicted:** Observability as a constraint (quality ceiling still invisible to runtime agents), Make illegal states unrepresentable (still violated at schema level).
|
|
432
|
+
|
|
433
|
+
---
|
|
434
|
+
|
|
435
|
+
### Convergence check (honest assessment)
|
|
436
|
+
|
|
437
|
+
All three candidates share: close Ticket 2, identify score-1/5 next modernization candidate. They diverge on how to handle the delegation architecture tension:
|
|
438
|
+
|
|
439
|
+
| Candidate | Primary tension addressed | Mechanism | Audience |
|
|
440
|
+
|---|---|---|---|
|
|
441
|
+
| 1 | Planning hygiene only | Doc update | Planning |
|
|
442
|
+
| 2 | Runtime observability | `metaGuidance` disclosure | Live agents |
|
|
443
|
+
| 3 | Architectural record | ADR | Future developers/agents |
|
|
444
|
+
|
|
445
|
+
**Candidates 2 and 3 are genuinely different**: different mechanisms, different audiences, different tensions resolved. They are also **combinable** (both could be done together), which is itself a signal that the two are not mutually exclusive. A combined Direction 2+3 would be the most complete resolution of all four tensions, at a cost of slightly higher scope (7 workflow files + 1 ADR file + 1 planning doc).
|
|
446
|
+
|
|
447
|
+
**If forced to pick one**: Candidate 2 provides the most value for the most immediate observable problem (quality ceiling visible to live agents). Candidate 3 provides the most value for long-term architectural hygiene. Candidate 1 alone is too narrow given the available opportunity.
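The coverage argument can be checked with a small set union over the "Tensions resolved" lists, writing T1 to T4 for Tensions 1 to 4. This is a sketch under one stated assumption: only full resolutions are counted, so Candidate 2's partial resolution of T4 and Candidate 3's documented acceptance of T1 are left out.

```typescript
type Tension = "T1" | "T2" | "T3" | "T4";

// Fully resolved tensions per candidate, read from the "Tensions resolved" lists above.
// Assumption: partial resolutions (C2 on T4) and documented acceptance (C3 on T1) are not counted.
const RESOLVED: { [candidate: string]: Tension[] } = {
  C1: ["T3"],
  C2: ["T2", "T3"],
  C3: ["T3", "T4"],
};

// Union of the tensions resolved by the chosen candidates, in T1..T4 order.
function combinedCoverage(candidates: string[]): Tension[] {
  const all: Tension[] = ["T1", "T2", "T3", "T4"];
  return all.filter((t) =>
    candidates.some((c) => (RESOLVED[c] || []).indexOf(t) !== -1)
  );
}

const combinedC2C3 = combinedCoverage(["C2", "C3"]);
const onlyC1 = combinedCoverage(["C1"]);
```

Under this counting the C2 + C3 combination covers T2, T3, and T4, while T1 remains an accepted and documented tension rather than a resolved one.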
|
|
448
|
+
|
|
449
|
+
---
|
|
450
|
+
|
|
451
|
+
## Challenge Notes
|
|
452
|
+
|
|
453
|
+
*(integrated into Synthesis section above)*
|
|
454
|
+
|
|
455
|
+
---
|
|
456
|
+
|
|
457
|
+
## Resolution Notes
|
|
458
|
+
|
|
459
|
+
*(to be populated if workflow continues)*
|
|
460
|
+
|
|
461
|
+
---
|
|
462
|
+
|
|
463
|
+
## Decision Log
|
|
464
|
+
|
|
465
|
+
- **2026-04-xx (Session 1)**: Probe session detected. Documented capability unavailability. Recommended `design_first` path for any follow-on modernization work.
|
|
466
|
+
- **2026-04-21 (Session 2)**: Second probe session. Phase -1 (goal challenge) classified trigger as `solution_statement`, reframed problem as capability-scoping question. Phase 0 confirmed `design_first` path. Prior session landscape and problem-framing findings adopted as prior art -- no repeated landscape scan needed. Primary uncertainty remains: recommendation uncertainty (graceful degradation vs. implement `wr.executor`). All three tensions (delegation gap, stale ticket, fuzzy modernization "done") still live. No GitHub ticket exists for `wr.executor` gap or delegation contract clarification.
|
|
467
|
+
- **2026-04-21 (Session 2, Phase 3d)**: Three candidates generated (C1: hygiene only; C2: hygiene + `metaGuidance` disclosure; C3: hygiene + ADR). Adversarial challenge run solo (delegation unavailable). **Winner: Combined C2 + C3.** Challenge findings: (a) C2's primary beneficiary is human observers reading session logs/workflow files, not runtime agents -- existing step-level `delegationAvailable` checks already handle agent decisions correctly; (b) C3's discoverability value is asserted, not demonstrated -- the ADR helps sessions that search `docs/adrs/`, but no mechanism forces that search. Neither challenge kills the direction. Combined C2+C3 satisfies all 5 decision criteria; C1 fails criterion 2 (no owner decision frame on `wr.executor`). **Why C1 lost:** satisfies only T3 (planning hygiene) and leaves T2 and T4 unaddressed even though addressing them costs little. **Why C2+C3 won:** non-exclusive candidates targeting different audiences (runtime observers vs. future planners); together resolve T2, T3, T4 within authoring+docs constraint. **Primary framing risk still live:** if owner confirms daemon-mode quality degradation is consciously accepted, switch to C1.
|
|
468
|
+
|
|
469
|
+
### [Session 2] Phase 0 Capture
|
|
470
|
+
|
|
471
|
+
| Field | Value |
|
|
472
|
+
|---|---|
|
|
473
|
+
| `problemStatement` | Before committing to research/execution paths that rely on optional capabilities, determine which capabilities are available and whether the project's real outstanding work (workflow modernization, delegation contract) is correctly framed |
|
|
474
|
+
| `desiredOutcome` | A concrete decision frame: either (a) create `wr.executor` to enable structured daemon delegation, or (b) annotate existing workflows as MCP-preferred for delegation and accept graceful degradation as the intended daemon behavior -- with planning docs updated to reflect the correct next modernization candidates |
|
|
475
|
+
| `coreConstraints` | Protected files (src/daemon/, src/v2/, src/trigger/) untouchable; no push to main; all work as feature branch + PR; delegation unavailable (wr.executor doesn't exist); no web access |
|
|
476
|
+
| `antiGoals` | Do not create implementation scope requiring protected-file changes; do not confabulate a problem; do not build `wr.executor` autonomously without human approval |
|
|
477
|
+
| `primaryUncertainty` | Recommendation uncertainty: graceful degradation as intended behavior vs. implement wr.executor for structured delegation -- two coherent framings with different consequences |
|
|
478
|
+
| `knownApproaches` | (A) Create `wr.executor` workflow; (B) Document daemon workflows as MCP-preferred and accept degradation; (C) Hybrid: add runtime context tag to workflow metadata indicating execution environment requirements |
|
|
479
|
+
| `importantStakeholders` | etienneb (sole developer, both user and maintainer role) |
|
|
480
|
+
| `rigorMode` | STANDARD |
|
|
481
|
+
| `automationLevel` | recommendation_only (human approval required before implementation) |
|
|
482
|
+
| `pathRecommendation` | design_first |
|
|
483
|
+
| `pathRationale` | Goal was solution_statement; landscape already known from Session 1; remaining uncertainty is recommendation uncertainty not research uncertainty; dominant risk is solving wrong problem (implementing wr.executor when graceful degradation is already correct); design_first surfaces the decision frame without committing to build |
|
|
484
|
+
| `designDocPath` | docs/design/probe-session-phase0.md |
|
|
485
|
+
|
|
486
|
+
---
|
|
487
|
+
|
|
488
|
+
## Final Summary
|
|
489
|
+
|
|
490
|
+
*(to be populated at workflow close)*
|
|
@@ -18,7 +18,7 @@ where the parent agent wants to continue working in parallel.
|
|
|
18
18
|
|
|
19
19
|
**Example** (in a workflow step prompt):
|
|
20
20
|
```
|
|
21
|
-
Spawn ONE WorkRail Executor running `routine-tension-driven-design` with your
|
|
21
|
+
Spawn ONE WorkRail Executor running `wr.routine-tension-driven-design` with your
|
|
22
22
|
tensions, philosophy sources, and problem understanding as input.
|
|
23
23
|
```
|
|
24
24
|
|
|
@@ -50,7 +50,7 @@ in confirmation gates, and be tracked individually in the session.
|
|
|
50
50
|
4. `{arg}` placeholders in prompts are substituted; `{{contextVar}}` is preserved for runtime
|
|
51
51
|
|
|
52
52
|
**Template ID convention**:
|
|
53
|
-
- Routine `routine-tension-driven-design` → template ID `wr.templates.routine.tension-driven-design`
|
|
53
|
+
- Routine `wr.routine-tension-driven-design` → template ID `wr.templates.routine.tension-driven-design`
|
|
54
54
|
- The `routine-` prefix is stripped automatically
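The convention above can be sketched as a pure mapping function. This is an illustration, not the code in `template-registry.js`: it assumes the optional `wr.` namespace is dropped before the `routine-` prefix, so that the old and new routine names shown in this diff map to the same template ID.

```typescript
// Sketch of the naming convention only; the function name is hypothetical.
function routineIdToTemplateId(routineId: string): string {
  let name = routineId;
  // Assumption: a leading "wr." namespace is removed first.
  if (name.slice(0, 3) === "wr.") {
    name = name.slice(3);
  }
  // The "routine-" prefix is stripped, as the convention states.
  if (name.slice(0, 8) === "routine-") {
    name = name.slice(8);
  }
  return "wr.templates.routine." + name;
}

const fromNewName = routineIdToTemplateId("wr.routine-tension-driven-design");
const fromOldName = routineIdToTemplateId("routine-tension-driven-design");
```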
|
|
55
55
|
|
|
56
56
|
**Example** (in workflow JSON):
|
|
@@ -170,11 +170,11 @@ This is often a better fit than executor-style delegation for:
|
|
|
170
170
|
|
|
171
171
|
The current routine catalog suggests these default uses:
|
|
172
172
|
|
|
173
|
-
- `routine-context-gathering`: completeness/depth audit or bounded context expansion
|
|
174
|
-
- `routine-hypothesis-challenge`: adversarial challenge against the current leading story
|
|
175
|
-
- `routine-execution-simulation`: bounded runtime/flow reasoning where mental execution adds value
|
|
176
|
-
- `routine-philosophy-alignment`: review against user/repo principles
|
|
177
|
-
- `routine-final-verification`: proof-oriented end-state validation
|
|
173
|
+
- `wr.routine-context-gathering`: completeness/depth audit or bounded context expansion
|
|
174
|
+
- `wr.routine-hypothesis-challenge`: adversarial challenge against the current leading story
|
|
175
|
+
- `wr.routine-execution-simulation`: bounded runtime/flow reasoning where mental execution adds value
|
|
176
|
+
- `wr.routine-philosophy-alignment`: review against user/repo principles
|
|
177
|
+
- `wr.routine-final-verification`: proof-oriented end-state validation
|
|
178
178
|
|
|
179
179
|
## Good and bad fits
|
|
180
180
|
|