@lumoai/cli 1.41.0 → 1.42.0

@@ -4,53 +4,50 @@
4
4
 
5
5
  ### Suggest-on-start from local git (no auto-bind)
6
6
 
7
- When a session starts **without** a bound task, the `session-start` hook tries to
8
- infer the task from local git so it can **suggest** one — it never binds for you
9
- (LUM-302):
10
-
11
- - It reads the **current branch name** first (e.g. `lumo/LUM-145-...`), then the
12
- **most recent commit subjects** (e.g. `... [LUM-145]`), extracting the first
13
- task identifier. Detection is **prefix-agnostic** (LUM-419): any team prefix
14
- matches — `SPEC-12` as much as `LUM-145` — using the same pattern the server
15
- uses to link PR branches to tasks. Well-known acronym-number tokens
16
- (`UTF-8`, `SHA-256`, `ISO-8601`, …) are skipped, never suggested.
17
- - On a hit it prints a single suggestion line and stops — the session stays
18
- **unbound** and no context is injected yet:
19
- `Detected LUM-145 (from branch name). Run lumo session attach LUM-145 to bind.`
20
- (the basis reads `from recent commits` when the hit came from commit subjects instead of the branch name)
21
- No task title is shown here because nothing was fetched; the title, memory,
22
- and PR-review todos appear only once you actually attach.
23
- - No match (detached HEAD, a branch with no identifier and no tagged commits,
24
- not a git repo) → it degrades to the normal unbound prompt.
25
-
26
- **Agent guidance:** when you see a suggestion line, confirm the inferred task is
27
- the one the user wants, then run `lumo session attach <LUM-N>` (followed by
28
- `lumo task context <LUM-N>` to load the background). `session attach` is the
29
- **only** path that binds — there is nothing to "undo" if the inference is wrong;
30
- just attach the correct task instead.
7
+ When a session starts **without** a bound task, the `session-start` hook infers a task from local git and **suggests** it — it never binds for you (LUM-302).
8
+
9
+ Detection order and rules:
10
+
11
+ - `current branch name` first (e.g. `lumo/LUM-145-...`), then the `most recent commit subjects` (e.g. `... [LUM-145]`), extracting the first task identifier.
12
+ - `prefix-agnostic` (LUM-419): any team prefix matches (`SPEC-12` as much as `LUM-145`), using the same pattern the server uses to link PR branches to tasks.
13
+ - `well-known acronym-number tokens` (`UTF-8`, `SHA-256`, `ISO-8601`, …) are skipped, never suggested.
14
+
15
+ On a hit it prints one suggestion line and stops — the session stays **unbound**, no context injected yet:
16
+
17
+ ```
18
+ Detected LUM-145 (from branch name). Run lumo session attach LUM-145 to bind.
19
+ ```
20
+
21
+ - The basis reads `from recent commits` when the hit came from commit subjects instead of the branch name.
22
+ - No task title is shown here (nothing was fetched); title, memory, and PR-review todos appear only once you attach.
23
+ - No match (detached HEAD; a branch with no identifier and no tagged commits; not a git repo) → degrades to the normal unbound prompt.
24
+
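The detection order and rules above can be sketched roughly as follows. This is a minimal illustration in Python, not the CLI's actual code: the regular expression, the skip list, and the names `first_task_id` and `suggest` are assumptions, since the shared server pattern is not shown in this document.

```python
import re

# Hypothetical pattern: an uppercase team prefix, a hyphen, then digits.
# The real pattern (shared with the server) is not shown in the docs.
TASK_ID = re.compile(r"\b([A-Z][A-Z0-9]*)-(\d+)\b")

# Acronym prefixes named in the text; the full skip list is an assumption.
SKIPPED_PREFIXES = {"UTF", "SHA", "ISO"}


def first_task_id(text):
    """Return the first task-like identifier in text, skipping known acronyms."""
    for match in TASK_ID.finditer(text):
        if match.group(1) not in SKIPPED_PREFIXES:
            return match.group(0)
    return None


def suggest(branch, commit_subjects):
    """Check the branch name first, then recent commit subjects.

    Returns the suggestion line, or None when nothing matches
    (the caller then falls back to the normal unbound prompt).
    """
    hit = first_task_id(branch) if branch else None
    basis = "from branch name"
    if hit is None:
        basis = "from recent commits"
        for subject in commit_subjects:
            hit = first_task_id(subject)
            if hit is not None:
                break
    if hit is None:
        return None
    return f"Detected {hit} ({basis}). Run lumo session attach {hit} to bind."
```

Note that the sketch only builds a suggestion string; nothing in it binds a session, which matches the rule that `session attach` is the only binding path.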
25
+ #### When to suggest
26
+
27
+ When you see a suggestion line, confirm the inferred task is the one the user wants, then run `lumo session attach <LUM-N>` followed by `lumo task context <LUM-N>` to load the background. `session attach` is the **only** path that binds, so there is nothing to "undo" if the inference is wrong; just attach the correct task instead.
31
28
 
32
29
  ### Layer 2 project-memory review at session start
33
30
 
34
31
  When the session is bound, session-start may inject a **"🆕 Review needed: project memories auto-consolidated by the previous session"** section alongside the memory / PR-review blocks (LUM-165). It lists the **PROJECT-scope** memories that the member's **immediately-preceding session** auto-consolidated (Layer 2 runs asynchronously when a task is marked `done`). Each item shows its `id`.
35
32
 
36
- - **Why it's here:** Layer 2 promotions land async (after the task hits DONE), so they're surfaced at the _next_ session-start instead, when they've definitely landed.
37
- - **Show-once:** the section appears only at the session that immediately follows the one that produced the memories. It does **not** re-nag on later sessions, so act on it now or it scrolls off.
38
- - **Agent guidance:** briefly sanity-check each listed memory against the codebase/context. If one is wrong or over-generalized, remove it with `lumo memory rm <id> --yes` (ideally confirm with the user first). If they all look right, ignore the section and continue.
33
+ - **Why async / next-session:** Layer 2 promotions land after the task hits DONE, so they surface at the _next_ session-start, when they've definitely landed.
34
+ - **Show-once:** the section appears only at the session immediately following the one that produced the memories. It does **not** re-nag later, so act now or it scrolls off.
35
+ - **Attribution:** `lumo task update <id> --status done` automatically sends `CLAUDE_CODE_SESSION_ID` (via `X-Lumo-Session-Id`) so the Layer 2 memories are attributed to the session. Marking done from the **web UI** leaves them unattributed (they won't surface for review); that's expected.
39
36
 
40
- Attribution requires the CC session id to reach the server: `lumo task update <id> --status done` automatically sends `CLAUDE_CODE_SESSION_ID` (via an `X-Lumo-Session-Id` header) so the resulting Layer 2 memories are attributed to the session. Marking a task done from the **web UI** leaves them unattributed (they won't surface for review) — that's expected.
37
+ #### When to suggest
38
+
39
+ Briefly sanity-check each listed memory against the codebase/context. If one is wrong or over-generalized, remove it with `lumo memory rm <id> --yes` (ideally confirm with the user first). If they all look right, ignore the section and continue.
41
40
 
42
41
  ### Blocker alert injected at `session attach` / session-start
43
42
 
44
- When the bound task has live blockers or SUGGESTED dependency candidates, the attach output and session-start hook context inject a dependency warning block. The warning is built by `buildBlockerWarningSectionForTask` and is **omitted entirely** (empty string) when nothing is actionable — no output appears when there are no live blockers and no SUGGESTED candidates.
43
+ When the bound task has live blockers or SUGGESTED dependency candidates, the attach output and session-start hook context inject a dependency warning block. Built by `buildBlockerWarningSectionForTask`; **omitted entirely** (empty string) when nothing is actionable — no output when there are no live blockers and no SUGGESTED candidates. The function never throws — on failure it silently returns an empty string, leaving session start unaffected.
45
44
 
46
- **Trigger conditions (OR — either or both may appear):**
45
+ Trigger conditions (OR — either or both may appear):
47
46
 
48
47
  1. At least one CONFIRMED "blocked by" edge where the blocking task's status is not DONE.
49
48
  2. At least one SUGGESTED (unconfirmed) dependency candidate on the task.
50
49
 
51
- **Output form A — live blockers exist (with or without SUGGESTED candidates):**
52
-
53
- Starts with the `## ⚠ Dependency alerts` header, followed by one line per live blocker, then the advice line. If there are also SUGGESTED candidates the candidate-hint line is appended after the advice line.
50
+ **Form A — live blockers exist (with or without SUGGESTED candidates):** `## ⚠ Dependency alerts` header, one line per live blocker, then the advice line. If SUGGESTED candidates also exist, the candidate-hint line is appended after the advice line.
54
51
 
55
52
  ```
56
53
  ## ⚠ Dependency alerts
@@ -62,28 +59,28 @@ Consider waiting for the blocker(s) to merge before starting work to avoid a was
62
59
  Detected 3 candidate dependencies awaiting confirmation: run `lumo task deps list LUM-42`.
63
60
  ```
64
61
 
65
- **Output form B — NO live blockers, but SUGGESTED candidates exist:**
66
-
67
- Output is **only** the hint line — no `## ⚠ Dependency alerts` header, no advice line. Design rationale: pure candidates are a hint, not an alert — they don't get the ⚠ header, so real alerts aren't diluted.
62
+ **Form B — NO live blockers, but SUGGESTED candidates exist:** the hint line **only** — no `## ⚠ Dependency alerts` header, no advice line. Rationale: pure candidates are a hint, not an alert, so they don't get the ⚠ header and real alerts aren't diluted.
68
63
 
69
64
  ```
70
65
  Detected 3 candidate dependencies awaiting confirmation: run `lumo task deps list LUM-42`.
71
66
  ```
72
67
 
73
- - Each blocker line shows: identifier, title, current status. If the blocking task has an open pull request, a `, PR #N not merged` note is appended — `, PR #N (draft) not merged` for draft PRs.
74
- - The guidance line ("Consider waiting for the blocker(s) to merge…") appears only when there is at least one live blocker (form A).
75
- - The function never throws — if it fails it silently returns an empty string, leaving the session start unaffected.
68
+ - Each blocker line shows: identifier, title, current status. An open PR on the blocking task appends `, PR #N not merged` — `, PR #N (draft) not merged` for draft PRs.
69
+ - The advice line ("Consider waiting for the blocker(s) to merge…") appears only when there is at least one live blocker (form A).
70
+
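The form A / form B selection described above can be sketched as below. This is an illustration under stated assumptions, not the body of `buildBlockerWarningSectionForTask`: the edge dictionary keys (`state`, `blocker_status`) and the function names are hypothetical, and the sketch returns the names of the output parts rather than the rendered lines.

```python
def live_blockers(edges):
    """Keep CONFIRMED "blocked by" edges whose blocking task is not DONE."""
    return [
        e for e in edges
        if e["state"] == "CONFIRMED" and e["blocker_status"] != "DONE"
    ]


def warning_parts(edges):
    """Return the ordered parts of the warning block; [] means no output."""
    try:
        blockers = live_blockers(edges)
        suggested = [e for e in edges if e["state"] == "SUGGESTED"]
        parts = []
        if blockers:
            # Form A: header, one line per live blocker, then the advice line.
            parts += ["header", "blocker-lines", "advice"]
        if suggested:
            # Appended after the advice line in form A; the only part in form B.
            parts.append("candidate-hint")
        return parts
    except Exception:
        # Never throw: a failure degrades to "nothing to show".
        return []
```

With no live blockers and no SUGGESTED candidates the list is empty, which corresponds to the block being omitted entirely.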
71
+ #### When to suggest
76
72
 
77
- **Agent guidance — watch for EITHER the `## ⚠ Dependency alerts` header (form A) OR the standalone hint line (form B):**
73
+ Watch for EITHER the `## ⚠ Dependency alerts` header (form A) OR the standalone hint line (form B).
78
74
 
79
- - **Form A live blockers:** Evaluate whether to wait. If the blocking task's work overlaps with yours (same files, same API surface), starting immediately risks rework. Read the blocker's status and open PR note before deciding.
80
- - **Stale or wrong edge?** Run `lumo task deps list <LUM-N>` to inspect the full edge list, then `lumo task deps rm <LUM-N> <edge> --yes` (if manually added and now obsolete) or `lumo task deps dismiss <LUM-N> <edge>` (if a false positive from detection).
81
- - **Candidate hint present (form A or B)?** Run `lumo task deps list <LUM-N>` and review each SUGGESTED edge — confirm real ones, dismiss false positives. Leaving SUGGESTED edges unreviewed means repeated hints every session.
75
+ - `lumo task deps list <LUM-N>`: inspect the full edge list. Run it for any candidate hint (form A or B), confirm real SUGGESTED edges, and dismiss false positives; unreviewed edges mean repeated hints every session.
76
+ - `lumo task deps rm <LUM-N> <edge> --yes`: drop a manually-added, now-obsolete edge.
77
+ - `lumo task deps dismiss <LUM-N> <edge>`: dismiss a false positive from detection.
78
+ - Form A live blockers: evaluate whether to wait — overlapping work (same files, same API surface) risks rework; read the blocker's status and open-PR note before deciding.
82
79
  - Do **not** blindly start work on a task whose live blocker is still IN_PROGRESS or IN_REVIEW unless the user explicitly decides to proceed in parallel.
83
80
 
84
81
  ### `lumo session attach <identifier>` — bind the current session to a task
85
82
 
86
- Use this whenever the user mentions a task ID. The command is the only way to bind a session to a task.
83
+ Use this whenever the user mentions a task ID. It is the only way to bind a session to a task.
87
84
 
88
85
  ```bash
89
86
  lumo session attach LUM-42
@@ -91,17 +88,22 @@ lumo session attach LUM-42
91
88
 
92
89
  What it does:
93
90
 
94
- - Reads `CLAUDE_CODE_SESSION_ID` from the environment (Claude Code sets this automatically). If it is not set, the command errors out: it must run from inside a Claude Code session.
95
- - Calls `POST /api/sessions/<session_id>/bind-task` on the Lumo server, which sets the Session row's `taskId` and re-tags previously-untagged HookEvent rows in this session.
91
+ - Reads `CLAUDE_CODE_SESSION_ID` from the environment (Claude Code sets it automatically); errors out if unset — must run from inside a Claude Code session.
92
+ - Calls `POST /api/sessions/<session_id>/bind-task`, which sets the Session row's `taskId` and re-tags previously-untagged HookEvent rows in this session.
96
93
  - The binding lives entirely on the server (`Session.taskId`); subsequent hooks read it back via the session row. The CLI keeps no local sentinel.
94
+ - Prints the task's **acceptance contract** (`## Acceptance criteria (contract)`, LUM-342) right after the bind confirmation — or, when a still-open task has none, the draft reminder to draft 3–7 criteria before the first line of code (see [criteria.md](criteria.md)). The same section is auto-injected at session start when already bound (highest priority in the injection budget, ahead of memory).
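The first two steps of the bind flow can be sketched as follows. This is a hedged illustration only: the function name `bind_task_request` and the error message are assumptions, and the request body is deliberately left out because its shape is not documented here.

```python
def bind_task_request(env):
    """Build the method and path for the bind call from the environment.

    Raises when CLAUDE_CODE_SESSION_ID is missing, mirroring the rule that
    the command must run from inside a Claude Code session.
    """
    session_id = env.get("CLAUDE_CODE_SESSION_ID")
    if not session_id:
        # Hypothetical wording; the CLI's real error text is not shown.
        raise RuntimeError("CLAUDE_CODE_SESSION_ID is not set")
    return "POST", f"/api/sessions/{session_id}/bind-task"
```

Because the binding lives on the server, a sketch like this needs no local state beyond the environment variable.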
97
95
 
98
- - Prints the task's **acceptance contract** (`## Acceptance criteria (contract)`, LUM-342) right after the bind confirmation — or, when a still-open task has none, the draft reminder telling you to draft 3–7 criteria before the first line of code (see [criteria.md](criteria.md)). The same section is auto-injected at session start when the session is already bound (highest priority in the injection budget, ahead of memory).
99
-
100
- **After attaching, always run `lumo task context <identifier>` to load the task background.**
96
+ After attaching, always run `lumo task context <identifier>` to load the task background.
101
97
 
102
98
  #### Auto-downsync on attach (LUM-551)
103
99
 
104
- A successful `session attach` also runs a **best-effort team-memory downsync** for the bound task's project — the same work as `lumo memory sync` (including the P4b code-anchor staleness check), so the team's memory lands in your local Claude Code memory store without a separate command. It is **throttle-gated**: the network is skipped entirely if this repo already synced within `LUMO_SYNC_THROTTLE_HOURS` (default **12h**), so re-attaches and same-day sessions are cheap. A sync failure is swallowed — it **never** changes the bind outcome. On a non-trivial sync the CLI prints one extra line (`memory sync → +N ~M -K`). Manual `lumo memory sync` stays unthrottled. Set `LUMO_DISABLE_MEMORY_AUTO=1` to turn the auto-downsync (and the hook auto-upsync — see [memory.md](memory.md)) off entirely; `--no-anchor-check` on `lumo memory sync` is unchanged.
100
+ A successful `session attach` also runs a **best-effort team-memory downsync** for the bound task's project — the same work as `lumo memory sync` (including the P4b code-anchor staleness check), landing the team's memory in your local Claude Code memory store without a separate command.
101
+
102
+ - **Throttle-gated:** network is skipped entirely if this repo already synced within `LUMO_SYNC_THROTTLE_HOURS` (default **12h**), so re-attaches and same-day sessions are cheap.
103
+ - **Non-blocking:** a sync failure is swallowed — it **never** changes the bind outcome.
104
+ - **Output:** on a non-trivial sync the CLI prints one extra line, `memory sync → +N ~M -K`.
105
+ - Manual `lumo memory sync` stays unthrottled.
106
+ - `LUMO_DISABLE_MEMORY_AUTO=1` turns off the auto-downsync (and the hook auto-upsync — see [memory.md](memory.md)) entirely; `--no-anchor-check` on `lumo memory sync` is unchanged.
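The throttle gate and the non-blocking behaviour can be sketched as below. This is a minimal model, assuming the last sync time is available as an epoch timestamp; the function names and the way the environment variables are parsed are hypothetical, and only the variable names and the 12h default come from the text.

```python
import os

DEFAULT_THROTTLE_HOURS = 12.0  # default stated in the text


def should_auto_downsync(last_sync_epoch, now_epoch, env=None):
    """Decide whether attach should touch the network for a downsync."""
    env = os.environ if env is None else env
    if env.get("LUMO_DISABLE_MEMORY_AUTO") == "1":
        return False  # auto-sync switched off entirely
    if last_sync_epoch is None:
        return True  # this repo has never synced
    hours = float(env.get("LUMO_SYNC_THROTTLE_HOURS", DEFAULT_THROTTLE_HOURS))
    return (now_epoch - last_sync_epoch) >= hours * 3600


def attach_then_sync(bind, sync, last_sync_epoch, now_epoch, env=None):
    """Bind first; a sync failure never changes the bind outcome."""
    result = bind()
    if should_auto_downsync(last_sync_epoch, now_epoch, env):
        try:
            sync()
        except Exception:
            pass  # best-effort: swallow the failure
    return result
```

Manual `lumo memory sync` would bypass `should_auto_downsync` altogether, since it stays unthrottled.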
105
107
 
106
108
  #### Lifetime lock (LUM-459)
107
109
 
@@ -113,63 +115,48 @@ Error: this session is permanently bound to LUM-7 "Other task". A session works
113
115
 
114
116
  There is no `--force` and no `session detach`. To work on a different task, open a new terminal / Claude Code session and run `lumo session attach <new-task>` there.
115
117
 
116
- **Agent guidance:** if `session attach` returns 409, do not retry or look for a workaround — start a fresh Claude Code session for the target task.
118
+ #### When to suggest
119
+
120
+ If `session attach` returns 409, do not retry or look for a workaround — start a fresh Claude Code session for the target task.
117
121
 
118
122
  ### Parallel sessions
119
123
 
120
- Each Claude Code session has its own `CLAUDE_CODE_SESSION_ID`. Two terminals running `claude code` and binding to different tasks will not interfere with each other: the bindings are scoped per session row server-side.
124
+ Each Claude Code session has its own `CLAUDE_CODE_SESSION_ID`. Two terminals running `claude code` and binding to different tasks will not interfere — bindings are scoped per session row server-side.
121
125
 
122
126
  ### `lumo session status` — show current binding
123
127
 
124
- Prints which task the current Claude Code session is bound to, or "(no task)" if none. Requires `$CLAUDE_CODE_SESSION_ID` (i.e. must run inside Claude Code).
125
-
126
128
  ```bash
127
129
  lumo session status
128
130
  ```
129
131
 
130
- When to suggest: the user asks "which task am I on", "what's this session bound to", or you need to decide whether to suggest `session attach` for a mentioned task ID.
132
+ Prints which task the current Claude Code session is bound to, or "(no task)" if none. Requires `$CLAUDE_CODE_SESSION_ID` (must run inside Claude Code).
133
+
134
+ #### When to suggest
135
+
136
+ The user asks "which task am I on", "what's this session bound to", or you need to decide whether to suggest `session attach` for a mentioned task ID.
131
137
 
132
138
  ### Automatic end-of-session housekeeping (no command — LUM-544)
133
139
 
134
- The old end-of-session command was **removed in LUM-544**. The three passes it
135
- used to run interactively now happen **automatically server-side**, all
136
- evidence-gated, best-effort, and silent: there is nothing for the agent to run
137
- or confirm. Two fire when the bound task reaches **DONE** (`lumo task update <id>
138
- --status done`, which threads `CLAUDE_CODE_SESSION_ID` so attribution lands), one
139
- runs continuously off the failure/progress hooks.
140
-
141
- **1. Layer-1 memory curation (on DONE).** An LLM judge reviews the Layer-1
142
- memories each of the task's sessions recorded, against that session's event log,
143
- and **soft-invalidates only the clearly-wrong / self-contradictory ones**
144
- (invalidate-not-delete: the row flips to `INVALIDATED` and is excluded from
145
- injection but **kept for audit — never hard-deleted**). **Uncertain memories are
146
- left untouched**; the judge defaults to keeping. Promotion to project scope is
147
- **not** done here; that stays with the Layer-2 flow (surfaced at the next
148
- session-start, see above).
149
-
150
- **2. Fragment-usage audit (LUM-314, on DONE).** An LLM judge sees the fragments
151
- this session consumed (its lineage edges) plus the session's event log and votes
152
- which were **actually used**: confidently-used edges → **`used=true`**,
153
- confidently-unused → **`used=false`**, and **genuinely-uncertain edges stay
154
- `null`** (honest "not voted", not "unused"). Already-voted sessions are skipped;
155
- a cron backstop drains any backlog. **Why:** it upgrades the flywheel signal from
156
- "co-loaded" (constant) to "actually used" (discriminative); `task context` then
157
- prefers each fragment's usage-based merge rate, falling back to the presence rate
158
- when usage samples are thin.
159
-
160
- **3. Blocked-tag automation (LUM-544 §3, server-side).** When a session crosses
161
- the same-tool failure threshold (**≥ 3** same-type failures, aggregated from
162
- `POST_TOOL_USE_FAILURE` grouped by tool name + `STOP_FAILURE` turn-level
163
- failures), the server **auto-applies the shared `blocked` tag** to the bound
164
- task. This **inverts the old LUM-153 manual `y` gate** — no prompt, no human in
165
- the loop — and is safe because of three safeguards: **(idempotent)** at most one
166
- active auto-block per task, so re-crossing is a no-op; **(auto-untag on
167
- progress)** the next observable progress on the task (a successful tool call or a
168
- non-failure turn end) removes the tag; **(attribution)** the triggering session +
169
- model are recorded. A **human-applied** `blocked` tag has no auto-block record and
170
- is **never** auto-removed by progress.
171
-
172
- When to suggest: nothing to suggest — these run on their own. If a teammate asks
173
- why a task shows `blocked`, explain it's the auto-block (repeated same-tool
174
- failures) and that it clears itself once the task makes progress; a human can also
175
- remove the tag manually from the board.
140
+ The old end-of-session command was **removed in LUM-544**. The three passes it ran interactively now happen **automatically server-side** — all evidence-gated, best-effort, and silent. There is nothing for the agent to run or confirm. Two fire when the bound task reaches **DONE** (`lumo task update <id> --status done`, which threads `CLAUDE_CODE_SESSION_ID` so attribution lands); one runs continuously off the failure/progress hooks.
141
+
142
+ **1. Layer-1 memory curation (on DONE).** An LLM judge reviews the Layer-1 memories each of the task's sessions recorded, against that session's event log, and **soft-invalidates only the clearly-wrong / self-contradictory ones**: the row flips to `INVALIDATED` and is excluded from injection but **kept for audit — never hard-deleted**. **Uncertain memories are left untouched** (the judge defaults to keeping). Promotion to project scope is **not** done here — that stays with the Layer-2 flow (surfaced at the next session-start, see above).
143
+
144
+ **2. Fragment-usage audit (LUM-314, on DONE).** An LLM judge sees the fragments this session consumed (its lineage edges) plus the session's event log and votes which were **actually used**:
145
+
146
+ - confidently-used edges → `used=true`
147
+ - confidently-unused edges → `used=false`
148
+ - genuinely-uncertain edges stay `null` (honest "not voted", not "unused")
149
+
150
+ Already-voted sessions are skipped; a cron backstop drains any backlog. **Why:** upgrades the flywheel signal from "co-loaded" (constant) to "actually used" (discriminative); `task context` then prefers each fragment's usage-based merge rate, falling back to the presence rate when usage samples are thin.
151
+
152
+ **3. Blocked-tag automation (LUM-544 §3, server-side).** When a session crosses the same-tool failure threshold (**≥ 3** same-type failures, aggregated from `POST_TOOL_USE_FAILURE` grouped by tool name + `STOP_FAILURE` turn-level failures), the server **auto-applies the shared `blocked` tag** to the bound task. This **inverts the old LUM-153 manual `y` gate** — no prompt, no human in the loop — and is safe via three safeguards:
153
+
154
+ - **idempotent:** at most one active auto-block per task, so re-crossing is a no-op.
155
+ - **auto-untag on progress:** the next observable progress (a successful tool call or a non-failure turn end) removes the tag.
156
+ - **attribution:** the triggering session + model are recorded.
157
+
158
+ A **human-applied** `blocked` tag has no auto-block record and is **never** auto-removed by progress.
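The three safeguards can be modelled with a small in-memory sketch. This is an illustration under stated assumptions, not the server's implementation: the class and method names are hypothetical, and two behaviours are guesses that the text does not confirm (resetting the counters on progress, and leaving an existing human-applied tag alone when the threshold is crossed).

```python
from collections import Counter

FAILURE_THRESHOLD = 3  # ">= 3 same-type failures" from the text


class Task:
    def __init__(self):
        self.tags = set()
        self.auto_block = None  # attribution record; None = no active auto-block


class SessionTracker:
    def __init__(self, session_id, model, task):
        self.session_id, self.model, self.task = session_id, model, task
        self.failures = Counter()

    def on_failure(self, event, tool_name=None):
        # Tool failures are grouped by tool name; turn-level failures share one key.
        key = ("tool", tool_name) if event == "POST_TOOL_USE_FAILURE" else ("turn", event)
        self.failures[key] += 1
        crossed = self.failures[key] >= FAILURE_THRESHOLD
        # Idempotent: at most one active auto-block per task.
        # Assumption: a tag that is already present (e.g. human-applied) is left alone.
        if crossed and self.task.auto_block is None and "blocked" not in self.task.tags:
            self.task.tags.add("blocked")
            self.task.auto_block = {"session": self.session_id, "model": self.model}

    def on_progress(self):
        self.failures.clear()  # assumption: progress resets the counters
        # Only an auto-applied tag is removed; human tags have no record.
        if self.task.auto_block is not None:
            self.task.tags.discard("blocked")
            self.task.auto_block = None
```

The `auto_block` record is what separates the two cases: progress removes the tag only when such a record exists.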
159
+
160
+ #### When to suggest
161
+
162
+ Nothing to suggest — these run on their own. If a teammate asks why a task shows `blocked`, explain it's the auto-block (repeated same-tool failures) and that it clears itself once the task makes progress; a human can also remove the tag manually from the board.
@@ -2,67 +2,56 @@
2
2
 
3
3
  ## Task Context Loading
4
4
 
5
- When the user mentions a task identifier or asks for task background, load the context:
5
+ Load context when the user mentions a task identifier or asks for task background:
6
6
 
7
7
  ```bash
8
- lumo task context <identifier>
8
+ lumo task context LUM-42
9
9
  ```
10
10
 
11
- Example: `lumo task context LUM-42`
12
-
13
11
  ### Reading the output
14
12
 
15
- The command prints a markdown document to stdout containing:
13
+ The command prints a markdown document to stdout with these sections, in order:
14
+
15
+ 1. **Task header** — identifier, title, status, description.
16
+ 2. **`## Acceptance criteria (contract)`** (LUM-342) — shown right after the header. Each line `[MACHINE|HUMAN] statement`, with a `↳ check:` line for MACHINE checkpointers; HUMAN_EDIT / REVIEW_ADDED provenance tagged inline. A still-open task with none shows a draft reminder instead — draft 3–7 criteria **before writing code** (see [criteria.md](criteria.md)).
17
+ 3. **Memory section** — cross-session learnings; trusted background context that persists, so you avoid re-learning decisions/constraints.
18
+ 4. **Inline source cards** — Slack / web / Figma / artifacts / documents / comments / Pull Requests (see "Context Retrieval" below).
19
+ 5. **`## PR review todos`** — mirrored PR review comments as a checkbox todo list. Each line-level comment shows `` `file:line` `` + reviewer's ask + GitHub comment link; each `changes_requested` review summary shows "🛑 Changes requested (whole PR)". Present only when the task's PR(s) have review comments. Each unchecked box is a TODO: resolve it, then reply on the PR (a Lumo comment mirrors back to GitHub).
20
+ 6. **`## Previous sessions`** — newest-first; each has a headline summary of what was done plus unresolved items (carry-over TODOs).
21
+ 7. **`## Flywheel signal · historical merge contribution`** — appended at the very end. Per context fragment with enough history, a one-line signal (e.g. `appeared in 5 resolved tasks · 4 merged (80%)`) computed from lineage edges. Denominator = distinct tasks where the fragment appeared with a resolved (non-UNKNOWN) outcome; only fragments with ≥3 such tasks show (cold-start gate), so the block is often absent early on. **Historical correlation, not causation** — don't read it as a prediction.
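The flywheel line in item 7 can be reproduced with a short sketch. It assumes each lineage edge reduces to a `(task_id, outcome)` pair for one fragment; the outcome label `MERGED` and the function name are hypothetical, while `UNKNOWN`, the distinct-task denominator, and the threshold of 3 come from the text.

```python
def flywheel_line(edges, min_tasks=3):
    """Build the one-line signal for a fragment, or None below the cold-start gate.

    edges: iterable of (task_id, outcome) pairs for a single fragment.
    """
    outcomes = {}
    for task_id, outcome in edges:
        if outcome != "UNKNOWN":  # only resolved outcomes count
            outcomes[task_id] = outcome  # distinct tasks: one entry per task
    resolved = len(outcomes)
    if resolved < min_tasks:
        return None  # cold-start gate: not enough history to show
    merged = sum(1 for o in outcomes.values() if o == "MERGED")
    pct = round(100 * merged / resolved)
    return f"appeared in {resolved} resolved tasks · {merged} merged ({pct}%)"
```

The result is a historical ratio over past tasks, which is why it should be read as correlation rather than as a prediction for the current task.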
16
22
 
17
- 1. **Task header**: identifier, title, status, description
18
- 2. **Acceptance criteria (contract)** — the task's acceptance criteria (LUM-342), shown right after the header as the `## Acceptance criteria (contract)` section. Each line: `[MACHINE|HUMAN] statement` with a `↳ check:` line for MACHINE checkpointers; HUMAN_EDIT / REVIEW_ADDED provenance is tagged inline. If a still-open task has none, a draft reminder appears instead — draft 3–7 criteria **before writing code** (see [criteria.md](criteria.md))
19
- 3. **Memory section** — cross-session learnings accumulated over prior sessions; treat as trusted background context
20
- 4. **Inline source cards**: Slack / web / Figma / artifacts / documents / comments / Pull Requests (see "Context Retrieval" below)
21
- 5. **PR review todos** — mirrored PR review comments as a checkbox todo list under the `## PR review todos` header: each line-level reviewer comment (shown as `` `file:line` `` + the reviewer's ask + a link to the GitHub comment) and each `changes_requested` review summary (shown as "🛑 Changes requested (whole PR)"). Present only when the task's PR(s) have review comments. This same block is **auto-injected at session start** (alongside the memory section) when the session is bound to a task — so reviewer asks surface without re-running `task context`.
22
- - The **inline source cards** (Slack / web / Figma / PR) are likewise **auto-injected at session start** when the session is bound, under a single global token budget shared with the memory section (priority: criteria > memory > PR > Slack > Figma > web). Cards that don't fit the budget are degraded to a one-line manifest carrying just the title and its `lumo task … show` retrieval command — so you still know they exist and can pull the full content on demand.
23
- 6. **Previous sessions** — ordered newest-first, each with:
24
- - A headline summary of what was done
25
- - Unresolved items (carry-over TODOs from that session)
26
- 7. **Flywheel signal · historical merge contribution** — appended at the very end as the `## Flywheel signal · historical merge contribution` section. For each context fragment with enough history, a one-line historical merge-contribution signal (e.g. `appeared in 5 resolved tasks · 4 merged (80%)`), computed from accumulated lineage edges. Denominator = distinct tasks where the fragment appeared with a resolved (non-UNKNOWN) outcome; only fragments with ≥3 such tasks are shown (cold-start gate), so the block is often absent early on. This is **historical correlation, not causation** — don't read it as a prediction.
23
+ **Auto-injected at session start** (when the session is bound to a task, so they surface without re-running `task context`):
24
+
25
+ - `## PR review todos` injected alongside the memory section.
26
+ - **Inline source cards** (Slack / web / Figma / PR) injected under a single global token budget shared with the memory section. Priority: criteria > memory > PR > Slack > Figma > web. Cards that don't fit degrade to a one-line manifest (title + its `lumo task … show` retrieval command) so you know they exist and can pull full content on demand.
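The shared budget and the manifest fallback can be sketched as a greedy pack in priority order. This is a simplified model with assumptions: the section dictionaries, their `tokens` field, and the function name `pack` are hypothetical, the cost of a manifest line is ignored, and every kind is allowed to degrade even though the text only describes this for cards.

```python
# Priority order stated in the text, highest first.
PRIORITY = ["criteria", "memory", "pr", "slack", "figma", "web"]


def pack(sections, budget):
    """Emit full bodies while they fit; degrade the rest to one-line manifests.

    sections: list of dicts with kind, title, body, command, tokens.
    """
    out = []
    remaining = budget
    for s in sorted(sections, key=lambda s: PRIORITY.index(s["kind"])):
        if s["tokens"] <= remaining:
            out.append(s["body"])
            remaining -= s["tokens"]
        else:
            # Keep the title and the retrieval command so the card stays discoverable.
            out.append(f'{s["title"]}: {s["command"]}')
    return out
```

A lower-priority card that is small enough can still be emitted in full after a larger one was degraded, because the loop keeps going with whatever budget remains.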
27
27
 
28
28
  ### How to use the context
29
29
 
30
- - **Unresolved items** from the most recent session are the highest-priority carry-overs — address them before starting new work unless the user says otherwise
31
- - **PR review todos** items are reviewer-requested changes: treat each unchecked box as a TODO to resolve, then reply on the PR (a Lumo comment mirrors back to GitHub)
32
- - **Memory section** provides validated context that persists across sessions; use it to avoid re-learning decisions or constraints
33
- - Focus on the **most recent 1–2 sessions** for relevant state; older sessions are for historical reference only
34
- - If there are **no prior sessions**, this is a fresh start — read the task description carefully and ask clarifying questions if needed
30
+ - **Unresolved items** from the most recent session are the highest-priority carry-overs — address them before new work unless the user says otherwise.
31
+ - Focus on the **most recent 1–2 sessions**; older sessions are historical reference only.
32
+ - **No prior sessions** = fresh start: read the task description carefully and ask clarifying questions if needed.
35
33
 
36
34
  ## Context Retrieval (full text on demand)
37
35
 
38
- LUM-122 split task context injection into tiers. `lumo task context <LUM-N>`
39
- now emits a **cheap inline card** for each source — a short summary or just
40
- metadata instead of dumping full bodies. Slack, docs, artifacts, and comments
41
- get an **LLM summary**; web links, Figma, and PRs get **metadata only**. Each
42
- card ends with the **retrieval command** you run to pull the heavy content on
43
- demand.
44
-
45
- **How to use it:** when the inline card is not enough and you need the full
46
- Slack thread, the web page body, the Figma metadata, the entire comment thread,
47
- or the PR detail, run the matching command below. Pass the same identifier
48
- (`LUM-N`) plus the id the card shows for that source (a Slack `contextId`, a web
49
- `linkId`, a Figma `linkId`, or a PR `number`).
50
-
51
- **Output budget (LUM-428):** the whole `task context` handoff is capped to the
52
- output-token budget (25,000 tokens). If a memory-rich task with a long thread
53
- overflows, the output is truncated and ends in a pointer to the precise
54
- sub-commands (`lumo task status` / `task comments list --full` / `task lineage`
55
- / `doc show`) to pull any dropped section just-in-time.
56
-
57
- All five are **read-only** (no live Slack/GitHub/Figma calls except the web body
58
- fetch). Web/Figma/PR are v1 metadata-degraded: they print a `note:` explaining
59
- that live content needs an external integration.
36
+ LUM-122 split context injection into tiers: `lumo task context <LUM-N>` emits a **cheap inline card** per source instead of dumping full bodies. Slack/docs/artifacts/comments get an **LLM summary**; web/Figma/PR get **metadata only**. Each card ends with the **retrieval command** for the heavy content.
37
+
38
+ Run the matching command below when the card isn't enough. Pass the same `LUM-N` plus the id the card shows for that source:
39
+
40
+ | source | retrieval command | id from card |
41
+ | -------------- | --------------------------------------- | ------------ |
42
+ | Slack thread | `lumo task slack show <id> <contextId>` | `contextId` |
43
+ | Web link body | `lumo task web show <id> <linkId>` | `linkId` |
44
+ | Figma metadata | `lumo task figma context <id> <linkId>` | `linkId` |
45
+ | Comment thread | `lumo task comments list <id>` | — |
46
+ | PR detail | `lumo task pr show <id> <number>` | `number` |
47
+
48
+ All five are **read-only** (no live Slack/GitHub/Figma calls except the web body fetch). Web/Figma/PR are v1 metadata-degraded: they print a `note:` saying live content needs an external integration.
49
+
50
+ **Output budget (LUM-428):** the whole `task context` handoff is capped to the output-token budget (25,000 tokens). On overflow, output is truncated and ends in a pointer to the precise sub-commands (`lumo task status` / `task comments list --full` / `task lineage` / `doc show`) to pull any dropped section just-in-time.
60
51
 
61
52
  ### `lumo task slack show <identifier> <contextId>` — full Slack thread snapshot
62
53
 
63
- Prints the **stored** thread snapshot (no live Slack call), one line per message
64
- as `author: text`. Author falls back to `@<userId>` when the display name is
65
- missing. Empty snapshot prints `(no messages in stored snapshot)`.
54
+ Prints the **stored** snapshot (no live Slack call), one line per message as `author: text`. Author falls back to `@<userId>` when the display name is missing. Empty snapshot prints `(no messages in stored snapshot)`.
66
55
 
67
56
  ```bash
68
57
  lumo task slack show LUM-42 ctx_abc123
@@ -70,9 +59,7 @@ lumo task slack show LUM-42 ctx_abc123
70
59
 
71
60
  ### `lumo task web show <identifier> <linkId>` — fetched web link body
72
61
 
73
- Fetches the page body on demand behind the SSRF guard (cached after first read)
74
- and prints it as plain text. Empty body prints `(empty body)`. Fetch failures
75
- (blocked host, timeout) print the server's error message.
62
+ Fetches the page body on demand behind the SSRF guard (cached after first read), printed as plain text. Empty body prints `(empty body)`. Fetch failures (blocked host, timeout) print the server's error message.
76
63
 
77
64
  ```bash
78
65
  lumo task web show LUM-42 wl_abc123
@@ -80,10 +67,7 @@ lumo task web show LUM-42 wl_abc123
80
67
 
81
68
  ### `lumo task figma context <identifier> <linkId>` — Figma link metadata
82
69
 
83
- **v1 metadata fallback.** Prints the cached design metadata as `file:` /
84
- `frame:` / `url:` / `synced:` (and `syncError:` if the last sync failed) lines.
85
- Live design context (layers, variables, code connect) requires the Figma MCP
86
- server, so the command ends with a `note:` saying so.
70
+ **v1 metadata fallback.** Prints cached design metadata as `file:` / `frame:` / `url:` / `synced:` lines (plus `syncError:` if the last sync failed). Live design context (layers, variables, code connect) requires the Figma MCP server, so the command ends with a `note:` saying so.
87
71
 
88
72
  ```bash
89
73
  lumo task figma context LUM-42 cfl_abc123
@@ -91,35 +75,21 @@ lumo task figma context LUM-42 cfl_abc123
91
75
 
92
76
  ### `lumo task comments list <identifier>` — comment thread
93
77
 
94
- Prints the comment thread: each comment as `author · createdAt` followed by its
95
- plain-text body (comment bodies are stored as HTML and stripped to text).
96
- Replies are indented two spaces under their parent. Author falls back to
97
- `unknown`. No comments prints `(no comments)`.
78
+ Prints the thread: each comment as `author · createdAt` then its plain-text body (bodies stored as HTML, stripped to text). Replies indent two spaces under their parent. Author falls back to `unknown`. No comments prints `(no comments)`.
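The rendering rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the CLI's code: the comment record shape (`author`, `createdAt`, `body`, `replies`) is an assumption, the regex tag stripper is a crude stand-in for whatever HTML-to-text step the CLI really uses, and the recursion over nested replies generalizes beyond what the text states.

```python
import html
import re


def render_thread(comments, depth=0):
    """Return output lines for a list of comment dicts (hypothetical shape)."""
    lines = []
    pad = "  " * depth  # replies indent two spaces under their parent
    for c in comments:
        author = c.get("author") or "unknown"  # documented fallback
        lines.append(f"{pad}{author} · {c['createdAt']}")
        # Crude HTML-to-text: drop tags, then decode entities (assumption).
        text = html.unescape(re.sub(r"<[^>]+>", "", c.get("body", ""))).strip()
        lines.append(f"{pad}{text}")
        lines.extend(render_thread(c.get("replies", []), depth + 1))
    return lines


def render(comments):
    """Join the thread, or print the documented empty-thread marker."""
    return "\n".join(render_thread(comments)) if comments else "(no comments)"
```

The output budget and the `--full` flag are left out of the sketch, since the text does not say how tokens are counted.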
98
79
 
99
- **Output budget (LUM-428).** By default the thread is capped to the
100
- output-token budget (25,000 tokens — every line you print spends from your
101
- context). When it overflows, the output is truncated to the budget and ends in
102
- a fetch-more pointer (`… +N more comments not shown (output capped at 25,000
103
- tokens) — read the whole thread with: lumo task comments list <id> --full`).
104
- Pass **`--full`** to print every comment uncapped — only when you actually need
105
- the whole thread.
80
+ - `lumo task comments list LUM-42` — capped to the output budget (LUM-428: 25,000 tokens; every printed line spends from your context). On overflow, truncates and ends in a fetch-more pointer: `… +N more comments not shown (output capped at 25,000 tokens) — read the whole thread with: lumo task comments list <id> --full`.
81
+ - `lumo task comments list LUM-42 --full` — every comment, uncapped. Use only when you actually need the whole thread.
106
82
 
107
83
  ```bash
108
84
  lumo task comments list LUM-42 # capped to the output budget
109
85
  lumo task comments list LUM-42 --full # every comment, no cap
110
86
  ```
111
87
 
112
- **Plural, and distinct from `task comment`.** `task comments list` _reads_ the
113
- whole thread (this retrieval command). `task comment <identifier> <body>`
114
- _writes_ a single new comment (see Task Management). Don't confuse the two —
115
- the plural `comments` is read-only.
88
+ **Plural, and distinct from `task comment`.** `task comments list` _reads_ the whole thread (this read-only retrieval command); `task comment <identifier> <body>` _writes_ a single new comment (see Task Management). Don't confuse the two.
116
89
 
117
90
  ### `lumo task pr show <identifier> <number>` — synced PR metadata
118
91
 
119
- **v1 metadata fallback.** Prints the synced PR record: a `#<number> (repo)
120
- title` header, then `state:` (with ` · draft` when draft), `ci:`, `author:`,
121
- `branch: <head> → <base>`, and `url:` lines. The live diff + review comments
122
- require the GitHub integration, so the command ends with a `note:` saying so.
92
+ **v1 metadata fallback.** Prints the synced PR record: a `#<number> (repo) title` header, then `state:` (` · draft` when draft), `ci:`, `author:`, `branch: <head> → <base>`, and `url:` lines. Live diff + review comments require the GitHub integration, so the command ends with a `note:` saying so.
123
93
 
124
94
  ```bash
125
95
  lumo task pr show LUM-42 128
@@ -127,71 +97,30 @@ lumo task pr show LUM-42 128
127
97
 
128
98
  ## `lumo task lineage <id>`
129
99
 
130
- Read-only audit view over the task's `LineageEdge` rows. Given a task
131
- identifier (`LUM-N`), prints the causal trail:
100
+ Read-only audit view over the task's `LineageEdge` rows. Entry point is the task identifier only (PR-number lookup is a future addition).
132
101
 
133
102
  ```bash
134
103
  lumo task lineage LUM-42 # per-session causal trail + cost
135
- lumo task lineage LUM-42 --signal # append workspace-level usage signal-health; used-vs-base merge rate uses iteration-taint fold (send-back/reopen/PR-close = negative class even if later merged); shows negative-class size per side; prints "metric cannot discriminate" when no failure outcomes exist yet
104
+ lumo task lineage LUM-42 --signal # append workspace-level usage signal-health
136
105
  ```
137
106
 
138
- - **Totals banner:** distinct sessions, fragment count, edge count,
139
- total tokens (input/output/cache split) and loops, and the outcome
140
- distribution. After the outcome summary, one funnel line is appended:
141
- `- Disclosure funnel: N impressions · M INDEX (X%) · K pulled (Y% of INDEX) · J used (Z%)`
142
- where impressions = edge count, INDEX% and used% are over impressions,
143
- pull% is over INDEX only (FULL fragments have no pull opportunity).
144
- Divide-by-zero is guarded (zero impressions or zero INDEX renders `0%`).
145
- When per-fragment token weights have been collected (LUM-522), the line also
146
- appends ~T tokens saved` = Σ(fullTokens indexTokens) over un-pulled INDEX
147
- edges (the token cost the index-only injection avoided); the suffix is omitted
148
- cleanly when no edge carries token data yet (older edges predate the columns).
149
- - **One block per session** — the group's cost shown **once** (token/loop),
150
- the date it consumed context, then each context fragment as
151
- `[OUTCOME] TYPE — <source label>`, plus a disclosure annotation suffix:
152
- `· INDEX pulled` (INDEX fragment, `pulledAt` is set) /
153
- `· INDEX not-pulled` (INDEX fragment, never pulled) /
154
- `· FULL` (injected in full at session-start).
155
- Per-group outcome summary follows.
156
-
157
- Cost is attributed once per session (a session that injected many fragments is
158
- not double-counted). Fragment ids are canonical — MEMORY fragments survive
159
- consolidation drift.
160
-
161
- **`--signal` workspace funnel:** the workspace-level usage signal-health block
162
- appended by `--signal` ends with a workspace-wide disclosure funnel in the
163
- same format: `- Disclosure funnel: N impressions · M INDEX (X%) · K pulled (Y% of INDEX) · J used (Z%)`
164
- aggregated over all edges in the workspace (not just this task) — including the
165
- same `· ~T tokens saved` suffix when token data exists (LUM-522).
166
-
167
- **Cold start:** a task with no edges prints a friendly note (lineage is captured
168
- when a session-bound run consumes the task's context), not an error.
169
-
170
- **When to suggest:** the user wants to audit "what context did the AI actually
171
- use, and what did it cost" for a task / merged PR — CFO / compliance / trust
172
- narratives.
173
-
174
- Entry point is the task identifier only; PR-number lookup is a future addition.
175
-
176
- **Top operations by token cost (LUM-523):** the lineage totals also append a
177
- per-task **"Top operations by token cost"** Top-5 — the most expensive tools by
178
- attributed token cost (`<tool> — N tokens`), ending with the pointer
179
- `(full breakdown: lumo cost --task <id>)`. The block is omitted when no
180
- per-operation cost has been attributed yet.
107
+ - `lumo task lineage <id> --signal` appends the workspace-level usage signal-health block. Used-vs-base merge rate uses iteration-taint fold (send-back / reopen / PR-close = negative class even if later merged); shows negative-class size per side; prints "metric cannot discriminate" when no failure outcomes exist yet. The block ends with a workspace-wide disclosure funnel (same format as the totals funnel below, aggregated over **all** workspace edges, not just this task), including the same `· ~T tokens saved` suffix when token data exists (LUM-522).
108
+
109
+ Output sections:
110
+
111
+ - **Totals banner** — distinct sessions, fragment count, edge count, total tokens (input/output/cache split), loops, and outcome distribution. After the outcome summary, one funnel line: `- Disclosure funnel: N impressions · M INDEX (X%) · K pulled (Y% of INDEX) · J used (Z%)`. Impressions = edge count; INDEX% and used% are over impressions; pull% is over INDEX only (FULL fragments have no pull opportunity). Divide-by-zero guarded (zero impressions or zero INDEX renders `0%`). With per-fragment token weights collected (LUM-522), appends `· ~T tokens saved` = Σ(fullTokens − indexTokens) over un-pulled INDEX edges (the token cost index-only injection avoided); omitted cleanly when no edge carries token data (older edges predate the columns).
112
+ - **One block per session** — the group's cost shown **once** (token/loop), the date it consumed context, then each fragment as `[OUTCOME] TYPE — <source label>` plus a disclosure suffix: `· INDEX pulled` (INDEX, `pulledAt` set) / `· INDEX not-pulled` (INDEX, never pulled) / `· FULL` (injected in full at session-start). Per-group outcome summary follows.
113
+ - **Top operations by token cost (LUM-523):** the totals also append a per-task Top-5 of the most expensive tools by attributed token cost (`<tool> — N tokens`), ending with `(full breakdown: lumo cost --task <id>)`. Omitted when no per-operation cost has been attributed yet.
114
+
115
+ Cost is attributed once per session (a session that injected many fragments is not double-counted). Fragment ids are canonical — MEMORY fragments survive consolidation drift. **Cold start:** a task with no edges prints a friendly note (lineage is captured when a session-bound run consumes the task's context), not an error.
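To make the funnel arithmetic concrete, here is a minimal Python sketch of how such a line could be computed from a list of edges. It is not the CLI's implementation. The field names `mode` and `used` are hypothetical, while `pulledAt`, `fullTokens`, and `indexTokens` follow the names in the text. Rounding to whole percentages, and counting only un-pulled INDEX edges that carry both token fields when deciding whether to print the suffix, are assumptions.

```python
def pct(part: int, whole: int) -> int:
    """Whole-number percentage; a zero denominator renders as 0."""
    return round(100 * part / whole) if whole else 0


def funnel_line(edges: list[dict]) -> str:
    """Build the disclosure funnel line from lineage edges (hypothetical shape)."""
    impressions = len(edges)  # impressions = edge count
    index = [e for e in edges if e["mode"] == "INDEX"]
    pulled = [e for e in index if e.get("pulledAt")]
    used = [e for e in edges if e.get("used")]
    line = (
        f"- Disclosure funnel: {impressions} impressions"
        f" · {len(index)} INDEX ({pct(len(index), impressions)}%)"
        # pull% is over INDEX only: FULL fragments have no pull opportunity
        f" · {len(pulled)} pulled ({pct(len(pulled), len(index))}% of INDEX)"
        f" · {len(used)} used ({pct(len(used), impressions)}%)"
    )
    # Tokens saved: un-pulled INDEX edges that carry both token weights.
    weighted = [
        e for e in index
        if not e.get("pulledAt")
        and e.get("fullTokens") is not None
        and e.get("indexTokens") is not None
    ]
    if weighted:
        saved = sum(e["fullTokens"] - e["indexTokens"] for e in weighted)
        line += f" · ~{saved} tokens saved"
    return line
```

With four edges (two INDEX, one of them pulled, plus two FULL) and one edge marked used, the sketch yields 50% INDEX, 50% of INDEX pulled, and 25% used; an empty edge list renders every percentage as `0%` and omits the suffix.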
116
+
117
+ ### When to suggest
118
+
119
+ The user wants to audit "what context did the AI actually use, and what did it cost" for a task / merged PR — CFO / compliance / trust narratives.
181
120
 
182
121
  ## `lumo cost`
183
122
 
184
- Per-operation (per-tool) token cost read-out. Where `task lineage` answers
185
- "what context fed this task and what did the run cost," `lumo cost` answers
186
- "which _operations_ (tools) burned the tokens." It attributes each model step's
187
- token delta to the tool(s) that ran in it — **per-step** where the
188
- `POST_TOOL_BATCH` hook captured the tool list, **per-turn fallback** otherwise
189
- (a parallel-tool step splits its tokens evenly across the tools, hence the
190
- "heuristic" note). Each ranking row breaks the cost into four columns —
191
- `output` (generation), `input`, `cache_create`, `cache_read` (~95%, structural —
192
- turns × context) — plus `total`. `output` and `cache_read` attribute cleanly
193
- per step; `input` and `cache_create` are bursty (prompt + cache checkpoints), so
194
- their per-tool figures are coarser.
123
+ Per-operation (per-tool) token cost read-out. Where `task lineage` answers "what context fed this task and what did the run cost," `lumo cost` answers "which _operations_ (tools) burned the tokens." It attributes each model step's token delta to the tool(s) that ran in it — **per-step** where the `POST_TOOL_BATCH` hook captured the tool list, **per-turn fallback** otherwise (a parallel-tool step splits its tokens evenly across the tools, hence the "heuristic" note). Each ranking row breaks cost into `output` (generation), `input`, `cache_create`, `cache_read` (~95%, structural — turns × context), plus `total`. `output` and `cache_read` attribute cleanly per step; `input` and `cache_create` are bursty (prompt + cache checkpoints), so their per-tool figures are coarser.
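The even-split heuristic for parallel-tool steps can be illustrated with a short Python sketch. It assumes each step is already reduced to a pair of tool names and a single token delta; the tool names in the usage note are made up, and skipping steps with no tools is an assumption rather than documented behavior.

```python
from collections import defaultdict


def attribute_steps(steps):
    """Attribute each step's token delta to the tools that ran in it.

    `steps` is a list of (tools, token_delta) pairs (hypothetical shape).
    A step that ran several tools in parallel splits its tokens evenly,
    which is why the per-tool figures are only a heuristic.
    """
    totals = defaultdict(float)
    for tools, tokens in steps:
        if not tools:
            continue  # assumption: steps with no tools are not attributed
        share = tokens / len(tools)
        for tool in tools:
            totals[tool] += share
    return dict(totals)
```

For example, a 100-token step that ran only `Read` followed by a 60-token step that ran `Read` and `Grep` in parallel attributes 130 tokens to `Read` and 30 to `Grep`.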
195
124
 
196
125
  ```bash
197
126
  lumo cost --task LUM-42
@@ -199,24 +128,11 @@ lumo cost --session <session-id> --by model
199
128
  lumo cost --since 2026-06-01 --by member --json
200
129
  ```
201
130
 
202
- - **Scope (mutually exclusive):** `--task <id>` scopes to one task,
203
- `--session <id>` to one Claude Code session, `--since <ISO-date>` to a
204
- workspace window from that date. With none given, the default is a
205
- **workspace last-30-days** window. (If more than one is passed, the CLI
206
- picks task > session > since.)
207
- - **`--by tool|model|member|session`** (default `tool`) — only changes which
208
- grouping is the **headline** table; the other groupings are still printed
209
- below when non-trivial (member/session tables appear only when there is more
210
- than one). Case-insensitive.
211
- - **`--json`** — emit the versioned payload (`version: 1`, scope, grandTotal,
212
- coverage, and `byTool` / `byModel` / `byMember` / `bySession` row arrays)
213
- instead of the rendered tables.
214
- - **Coverage line** — `Per-step attribution: X% of N tool-using turns` tells
215
- you how much of the report is precise per-step attribution vs the per-turn
216
- fallback. `n/a` when there were no tool-using turns.
217
-
218
- **When to suggest:** the user asks where their tokens went _by operation_ —
219
- "which tools are most expensive," cost attribution by model / teammate /
220
- session, or a workspace cost window. For the quick per-task Top-5 inline, point
221
- them at `lumo task lineage <id>` instead; reach for `lumo cost` for the full
222
- breakdown or any non-task scope.
131
+ - `--task <id>` / `--session <id>` / `--since <ISO-date>` — **mutually-exclusive** scope: one task / one Claude Code session / a workspace window from that date. None given = default **workspace last-30-days**. If more than one is passed, the CLI picks task > session > since.
132
+ - `--by tool|model|member|session` (default `tool`, case-insensitive) only changes which grouping is the **headline** table; other groupings still print below when non-trivial (member/session tables appear only when >1).
133
+ - `--json` emits the versioned payload (`version: 1`, scope, grandTotal, coverage, and `byTool` / `byModel` / `byMember` / `bySession` row arrays) instead of rendered tables.
134
+ - **Coverage line:** `Per-step attribution: X% of N tool-using turns` tells you how much of the report is precise per-step attribution vs per-turn fallback. `n/a` when there were no tool-using turns.
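The scope precedence described in the list above (task > session > since, with a workspace last-30-days default) can be sketched as follows. This is a hedged illustration, not the CLI's code: the function name, the tuple return shape, and treating "last 30 days" as exactly today minus 30 calendar days are all assumptions.

```python
from datetime import date, timedelta


def resolve_scope(task=None, session=None, since=None, today=None):
    """Pick one scope when several flags are passed: task > session > since."""
    if task:
        return ("task", task)
    if session:
        return ("session", session)
    if since:
        return ("since", since)
    # No scope flag given: fall back to a workspace window of the last 30 days.
    today = today or date.today()
    return ("since", (today - timedelta(days=30)).isoformat())
```

Passing `--task` together with `--session` therefore behaves as if only `--task` had been given.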
135
+
136
+ ### When to suggest
137
+
138
+ The user asks where their tokens went _by operation_ — "which tools are most expensive," cost attribution by model / teammate / session, or a workspace cost window. For the quick per-task Top-5 inline, point them at `lumo task lineage <id>` instead; reach for `lumo cost` for the full breakdown or any non-task scope.