@llblab/pi-telegram 0.3.0 → 0.5.0

package/README.md CHANGED
@@ -13,12 +13,12 @@ This repository is an actively maintained fork of [`badlogic/pi-telegram`](https
13
13
 
14
14
  ## Key Features
15
15
 
16
- - **Priority Command Queue**: Control commands such as `/status` and `/model` use a high-priority control queue, so they do not get stuck behind normal queued prompts when pi is busy.
16
+ - **Immediate Telegram Controls**: `/status` and `/model` respond immediately from Telegram, while model-switch continuation turns still use the control lane when a restart needs to resume safely.
17
17
  - **Interactive UI**: Manage your session directly from Telegram. Inline buttons allow you to switch models and adjust reasoning (thinking) levels on the fly.
18
18
  - **In-flight Model Switching**: Change the active model mid-generation. The agent gracefully pauses, applies the new model, and restarts its response without losing context.
19
19
  - **Smart Message Queue**: Messages sent while the agent is busy are queued and previewed in the pi status bar, and queued turns can be reprioritized or removed with Telegram reactions.
20
20
  - **Mobile-Optimized Rendering**: Tables and lists are formatted for narrow screens, table padding accounts for emoji grapheme and wide Unicode display width, and Telegram-originated runs prompt the assistant to prefer narrow table columns for phone readability. Markdown is correctly parsed and split to fit Telegram's limits without breaking HTML structures or code blocks, block spacing stays faithful to the original Markdown with readable heading separation, supported absolute links stay clickable, and unsupported link forms degrade safely.
21
- - **File Handling & Attachments**: Send images and files to the agent, or ask it to generate and return artifacts. Inbound downloads and outbound attachments are size-limited by default, and outbound files are delivered automatically via the `telegram_attach` tool.
21
+ - **File Handling & Attachments**: Send images and files to the agent, transcribe or transform inbound files with configured attachment handlers, or ask pi to generate and return artifacts. Inbound downloads and outbound attachments are size-limited by default, and outbound files are delivered automatically via the `telegram_attach` tool.
22
22
  - **Streaming Responses**: Closed Markdown blocks stream back as rich Telegram HTML while pi is generating, and the still-growing tail stays readable until the final fully rendered reply lands.
23
23
 
24
24
  ## Install
@@ -60,7 +60,7 @@ Paste your bot token when prompted. If a bot token is already saved in `~/.pi/ag
60
60
  /telegram-connect
61
61
  ```
62
62
 
63
- The bridge is session-local. Only one pi session should be connected to the bot at a time.
63
+ The bridge is session-local: only one pi instance polls Telegram at a time. `/telegram-connect` records polling ownership in `~/.pi/agent/locks.json`; live ownership moves require confirmation, while `/new` and same-`cwd` process restarts resume automatically.
64
64
 
65
65
  ### 4. Pair your account from Telegram
66
66
 
@@ -84,7 +84,7 @@ Use these inside the Telegram DM with your bot:
84
84
  - **`/compact`**: Start session compaction (only works when the session is idle).
85
85
  - **`/stop`**: Abort the active run.
86
86
 
87
- Telegram command admission is explicit: `/compact`, `/stop`, `/help`, and `/start` execute immediately; `/status` and `/model` enter the high-priority control lane so they can run before normal queued prompts when pi becomes safe to dispatch.
87
+ Telegram command admission is explicit: `/compact`, `/stop`, `/help`, `/start`, `/status`, and `/model` execute immediately. Synthetic model-switch continuation turns still enter the high-priority control lane so they can resume before normal queued prompts when pi becomes safe to dispatch.
88
88
 
89
89
  ### Pi Commands
90
90
 
@@ -92,19 +92,50 @@ Run these inside pi, not Telegram:
92
92
 
93
93
  - **`/telegram-setup`**: Configure or update the Telegram bot token.
94
94
  - **`/telegram-status`**: Check bridge status, connection, polling, execution, queue, and recent redacted runtime/API failure events.
95
- - **`/telegram-connect`**: Start polling Telegram updates in the current pi session.
96
- - **`/telegram-disconnect`**: Stop polling in the current pi session.
95
+ - **`/telegram-connect`**: Start polling Telegram updates in the current pi session, acquire the singleton lock, or interactively move ownership here from another live instance.
96
+ - **`/telegram-disconnect`**: Stop polling in the current pi session and release the singleton lock.
97
97
 
98
98
  ### Queue, Reactions, and Media
99
99
 
100
100
  - If you send more Telegram messages while pi is busy, they enter the default prompt queue and are processed in order.
101
101
  - `👍` moves a waiting prompt into the priority prompt queue, behind control actions but ahead of default prompts. Removing `👍` sends it back to its normal queue position, and adding `👍` again gives it a fresh priority position.
102
102
  - `👎` removes a waiting turn from the queue. Telegram Bot API does not expose ordinary DM message-deletion events through the polling path used here, so queue removal is bound to the dislike reaction.
103
- - For media groups, a reaction on any message in the group applies to the whole queued turn.
103
+ - Reactions apply to any waiting Telegram turn, including text, voice, files, images, and media groups. For media groups, a reaction on any message in the group applies to the whole queued turn.
104
104
  - If you edit a Telegram message while it is still waiting in the queue, the queued turn is updated instead of creating a duplicate prompt. Edits after a turn has already started may not affect the active run.
105
- - Inbound images, albums, and files are saved to `~/.pi/agent/tmp/telegram`, local file paths are included in the prompt, and inbound images are forwarded to pi as image inputs. Inbound downloads default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`.
105
+ - Inbound images, albums, and files are saved to `~/.pi/agent/tmp/telegram`. Unhandled local file paths are included in the prompt, handled attachment output is injected into the prompt text, and inbound images are forwarded to pi as image inputs. Inbound downloads default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`.
106
106
  - Queue reactions depend on Telegram delivering `message_reaction` updates for your bot and chat type.
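The reaction rules in the list above can be sketched as a small pure function. This is only an illustration under stated assumptions: `WaitingTurn`, `ReactionChange`, `applyReaction`, and the `priorityOrder` counter are hypothetical names, not the package's actual types, and the real bridge may track ordering differently.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
type PromptLane = "priority" | "default";

interface WaitingTurn {
  sourceMessageIds: number[]; // a media group contributes several message ids
  lane: PromptLane;
  priorityOrder?: number; // assumed counter, assigned on every promotion
}

type ReactionChange = "like-added" | "like-removed" | "dislike-added";

function applyReaction(
  queue: WaitingTurn[],
  messageId: number,
  change: ReactionChange,
  nextPriorityOrder: () => number,
): WaitingTurn[] {
  // A reaction on any message of a media group targets the whole queued turn.
  const target = queue.find((turn) => turn.sourceMessageIds.indexOf(messageId) !== -1);
  if (!target) return queue; // the message is not waiting in the queue

  switch (change) {
    case "dislike-added":
      // Dislike removes the waiting turn.
      return queue.filter((turn) => turn !== target);
    case "like-added":
      // Like promotes the turn and gives it a fresh priority position.
      return queue.map((turn): WaitingTurn =>
        turn === target
          ? { ...turn, lane: "priority", priorityOrder: nextPriorityOrder() }
          : turn,
      );
    case "like-removed":
      // Removing the like sends the turn back to the default lane.
      return queue.map((turn): WaitingTurn =>
        turn === target
          ? { sourceMessageIds: turn.sourceMessageIds, lane: "default" }
          : turn,
      );
  }
}
```

Returning a new array instead of mutating in place keeps the sketch easy to test; whether the real queue store is mutable is not stated in this diff.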
107
107
 
108
+ ### Inbound Attachment Handlers
109
+
110
+ `telegram.json` can define ordered `attachmentHandlers` for common preprocessing such as voice transcription. Matching handlers run after download and before the Telegram turn enters the pi queue. If a matching handler fails, the next matching handler is tried as a fallback.
111
+
112
+ ```json
113
+ {
114
+   "attachmentHandlers": [
115
+     {
116
+       "type": "voice",
117
+       "template": "~/.pi/agent/skills/mistral-stt/scripts/transcribe.mjs {file} {lang} {model}",
118
+       "args": ["file", "lang", "model"],
119
+       "defaults": {
120
+         "lang": "ru",
121
+         "model": "voxtral-mini-latest"
122
+       }
123
+     },
124
+     {
125
+       "mime": "audio/*",
126
+       "template": "~/.pi/agent/skills/groq-stt/scripts/transcribe.mjs {file} {lang} {model}",
127
+       "args": ["file", "lang", "model"],
128
+       "defaults": {
129
+         "lang": "ru",
130
+         "model": "whisper-large-v3-turbo"
131
+       }
132
+     }
133
+   ]
134
+ }
135
+ ```
136
+
137
+ Matching supports `mime`, `type`, or `match`; wildcards like `audio/*` are accepted. Template placeholders are substituted into command args, not shell text: `{file}` is the downloaded file path, `{mime}` is the MIME type, `{type}` is the Telegram attachment type, and `defaults` can provide additional values such as `{lang}` or `{model}`. Local attachments stay in the prompt under `[attachments] <directory>` with relative file entries; successful handler stdout is added under `[outputs]`; failed handlers record diagnostics and fall back to the next matching handler. The portable command-template contract is documented in [`docs/command-templates.md`](./docs/command-templates.md); Telegram-specific handler config is documented in [`docs/attachment-handlers.md`](./docs/attachment-handlers.md).
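A minimal sketch of the matching, placeholder substitution, and fallback behavior described above. Everything here is an assumption made for illustration: the names `matchesHandler`, `buildArgs`, `runHandlers`, and `Runner` are hypothetical, the `match` selector is omitted because its semantics are not shown in this diff, combining `type` and `mime` with AND is a guess, and the runner is injected so the example runs without spawning a process.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the package's real API.
interface AttachmentHandler {
  mime?: string; // e.g. "audio/*"
  type?: string; // e.g. "voice"
  template: string;
  defaults?: Record<string, string>;
}

interface DownloadedFile {
  file: string; // downloaded file path
  mime: string;
  type: string; // Telegram attachment type
}

// Injected command runner so the sketch stays free of process spawning.
type Runner = (argv: string[]) => { exitCode: number; stdout: string };

// Match an exact MIME type or a trailing wildcard such as "audio/*".
function mimeMatches(pattern: string, mime: string): boolean {
  if (pattern.endsWith("/*")) return mime.startsWith(pattern.slice(0, -1));
  return pattern === mime;
}

// Assumption: when both selectors are present, both must match.
function matchesHandler(h: AttachmentHandler, f: DownloadedFile): boolean {
  if (h.type !== undefined && h.type !== f.type) return false;
  if (h.mime !== undefined && !mimeMatches(h.mime, f.mime)) return false;
  return h.type !== undefined || h.mime !== undefined;
}

// Split the template into tokens first, then substitute per token, so a value
// containing spaces stays a single argument and never becomes shell text.
function buildArgs(h: AttachmentHandler, f: DownloadedFile): string[] {
  const values: Record<string, string> = {
    ...(h.defaults ?? {}),
    file: f.file,
    mime: f.mime,
    type: f.type,
  };
  return h.template
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) =>
      token.replace(/\{(\w+)\}/g, (whole: string, key: string) => values[key] ?? whole),
    );
}

// Try matching handlers in config order; the first zero exit code wins.
function runHandlers(
  handlers: AttachmentHandler[],
  f: DownloadedFile,
  run: Runner,
): string | undefined {
  for (const h of handlers) {
    if (!matchesHandler(h, f)) continue;
    const result = run(buildArgs(h, f));
    if (result.exitCode === 0) return result.stdout;
    // Non-zero exit: fall back to the next matching handler.
  }
  return undefined; // no handler succeeded; only the attachment entry remains
}
```

Tokenizing before substitution is what keeps a path like `/tmp/my voice.ogg` as one argument; a real implementation would presumably pass the resulting array to a process API that takes an argument vector rather than a shell string.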
138
+
108
139
  ### Requesting Files
109
140
 
110
141
  If you ask pi for a file or generated artifact (e.g., _"generate a shell script and attach it"_), pi will call the `telegram_attach` tool, and the extension will send the file alongside its next Telegram reply. Outbound attachments default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_OUTBOUND_ATTACHMENT_MAX_BYTES` or `TELEGRAM_MAX_ATTACHMENT_SIZE_BYTES`.
@@ -117,10 +148,10 @@ Rich previews are sent through editable messages because Telegram drafts are tex
117
148
 
118
149
  ## Status bar
119
150
 
120
- The pi status bar shows queued Telegram turns as compact previews, for example:
151
+ The pi status bar shows the current bridge state plus queued Telegram turns as compact previews. Busy labels distinguish states such as `active`, `dispatching`, `queued`, `tool running`, `model`, and `compacting`.
121
152
 
122
153
  ```text
123
- +3: [⬆ write a shell script…, summarize this image…, 📎 2 attachments]
154
+ telegram queued +3: [⬆ write a shell script…, summarize this image…, 📎 2 attachments]
124
155
  ```
125
156
 
126
157
  ## Notes
package/docs/README.md CHANGED
@@ -4,6 +4,7 @@ Living index of project documentation in `/docs`.
4
4
 
5
5
  ## Documents
6
6
 
7
- | Document | Description |
8
- | --- | --- |
9
- | [architecture.md](./architecture.md) | Overview of the Telegram bridge runtime, queueing model, rendering pipeline, and interactive controls |
7
+ - [architecture.md](./architecture.md): Overview of the Telegram bridge runtime, queueing model, rendering pipeline, and interactive controls
8
+ - [command-templates.md](./command-templates.md): Portable command-template standard core
9
+ - [attachment-handlers.md](./attachment-handlers.md): Local `pi-telegram` attachment-handler config, placeholders, and fallbacks
10
+ - [locks.md](./locks.md): Shared `locks.json` standard for singleton extension ownership
@@ -21,35 +21,33 @@ Interface consistency rule: when two modules mean the same runtime entity, they
21
21
 
22
22
  Naming rule: because the repository already scopes this codebase to Telegram, extracted module and test filenames use bare domain names such as `api.ts`, `queue.ts`, `updates.ts`, and `queue.test.ts` rather than repeating `telegram-*` in every filename.
23
23
 
24
- Current runtime areas include:
25
-
26
- - Telegram Bot API concrete transport shapes live with Telegram API helpers in `/lib/api.ts`, while persisted bot/session pairing state lives in `/lib/config.ts`; domain-owned runtime state types stay with their owners, such as queued/active turn state in `/lib/queue.ts` and preview state in `/lib/preview.ts`, while domain helpers prefer local structural `*Like` contracts instead of importing concrete wire DTOs
27
- - Direct pi SDK imports are centralized in `/lib/pi.ts`, which exposes concrete pi SDK type exports, bound extension API runtime ports, and narrow bridge-facing helpers such as settings-manager creation plus context model/idle/pending-message/compaction adapters; `index.ts` uses this adapter namespace instead of importing `@mariozechner/pi-coding-agent` directly
28
- - Session-local runtime primitives such as queue/control/priority ordering counters, lifecycle/dispatch flags, setup guard state, abort-handler storage/binding, typing-loop timer lifecycle, typing-loop starter binding, prompt-dispatch lifecycle/runtime adapters, and agent-end reset sequencing in `/lib/runtime.ts`; the runtime domain's essence is mutable cross-domain session coordination rather than business behavior. It exposes a grouped bridge runtime facade with named queue/lifecycle/setup/abort/typing ports that bind those primitives to one session state while remaining a cohesive state/runtime boundary, and `index.ts` still wires live Telegram API calls and status updates into those helpers. Preview-specific state, draft-support detection, and draft-id allocation live in `/lib/preview.ts`.
29
- - Constants live in their owning domains instead of a shared constants module: API paths/inbound limits and inbound file-size env parsing in `/lib/api.ts`, outbound attachment limits and outbound attachment-size env parsing in `/lib/attachments.ts`, media-group debounce in `/lib/media.ts`, menu cache/state bounds in `/lib/menu.ts`, preview throttle/draft bounds in `/lib/preview.ts`, typing cadence in `/lib/runtime.ts`, diagnostic ring limits in `/lib/status.ts`, Telegram prompt prefix in `/lib/turns.ts`, and system-prompt guidance in `/lib/registration.ts`.
30
- - Queueing, narrow Telegram prompt content contracts, queue-store contracts/state helpers, active-turn state helpers, dispatch-readiness adapters, queue append/mutation runtime/controller adapters, control enqueue controllers, queue dispatch readiness/controller/runtime adapters, prompt enqueue/history planning/runtime/controllers, queue-runtime, session state appliers plus lifecycle/runtime sequencing, session start/shutdown sequencing plus hook binding, agent-start/agent-end lifecycle handling plus hook/runtime binding, and tool lifecycle handling plus tool-execution hook/runtime binding in `/lib/queue.ts`
31
- - Model identity/thinking-level contracts, scoped model-pattern parsing/resolution/sorting, current-model store/update/runtime helpers, in-flight model-switch state helpers, restart eligibility, delayed abort decisions, Telegram-prefix defaulted continuation prompt construction, continuation queue adapters, and model-switch controller/runtime binding over queue-owned turns in `/lib/model.ts`
32
- - Preview transport-selection, assistant-message preview lifecycle hook binding/handling, preview-finalization, preview controller state/reset helpers, preview Bot API message/rendered-chunk transport adapters, preview-controller/assistant-preview runtime binding, reply-metadata defaulting through the replies-domain helper, and preview-runtime helpers in `/lib/preview.ts`
33
- - Reply-transport, rendered-message delivery runtime/binding, structural assistant-message extraction, reply-parameter construction over API-owned transport shapes, and plain/Markdown final-reply helpers in `/lib/replies.ts`
34
- - Preview appearance and snapshot derivation stay in `/lib/rendering.ts`, while `/lib/preview.ts` owns transport and lifecycle decisions, so richer preview strategies can evolve without entangling Markdown formatting with Telegram delivery state
35
- - Polling request, start/stop controller state orchestration, polling activity readers, stop-condition, structural config contract, long-poll loop helpers, and poll-loop/controller runtime wiring over Telegram transport ports in `/lib/polling.ts`
36
- - Telegram persisted config shape, config-path defaults, config file read/write helpers, mutable config-store accessors, single-user authorization, and first-user pairing side effects/runtime adapters in `/lib/config.ts`
37
- - Telegram API helpers, concrete Bot API transport shapes including reply parameters and send/edit message bodies, typed/default Bot API runtime helpers, bot identity fetch transport, chat-action sender adapters/runtime-bound typing action, lazy bot-token client wrappers, API runtime error-recording wrappers, temp paths, inbound file-size limits, and runtime-bound temp-directory preparation/default cleanup in `/lib/api.ts`
38
- - Telegram turn-building helpers, runtime turn-builder wiring over media download ports and media-owned downloaded-file metadata contracts, queued-prompt edit runtime binding, and Node-backed image-file reads for pi image inputs in `/lib/turns.ts`
39
- - Telegram media/text extraction, file-info normalization, downloaded-message-file metadata contracts, inbound file download assembly, media-group debounce helpers, media-group controller state, and media-group-aware authorized-message dispatch adapter wiring in `/lib/media.ts`
40
- - Telegram slash-command parsing, command-message target helpers/adapters, command control-enqueue adapters/runtime binding, command-action routing, command-handler/target-runtime and command-or-prompt dispatch binding, command runtime port orchestration, shared command-runtime reply/status/control adapter closures, stop/compact/status/model/help command side-effect branching, bound Bot API command registration, and Bot API command metadata helpers in `/lib/commands.ts`
41
- - Telegram updates extraction, paired update-runtime binding, flow, execution-planning, authorized reaction priority/removal handling, direct execute-from-update routing, update runtime adapters over queue/media/menu ports, and runtime helpers in `/lib/updates.ts`
42
- - Telegram attachment queueing, narrow structural attachment turn targets, queued-attachment sender runtime binding, delivery helpers, Node-backed file stat checks, outbound photo-vs-document classification, and outbound attachment limits/env parsing in `/lib/attachments.ts`
43
- - Telegram tool, command, before-agent prompt, and lifecycle-hook registration helpers in `/lib/registration.ts`
44
- - Setup/token prompt, environment fallback, guarded setup runtime adapter wiring, structural setup config contract, token validation, config persistence orchestration, and setup notification helpers in `/lib/setup.ts`
45
- - Markdown block scanning/rendering, inline-token/style rendering, text-piece rendering, stable-preview block scanning, final rendered-block chunk balancing, preview-snapshot derivation, HTML escaping, raw HTML tag-preserving chunking, and Telegram message rendering helpers in `/lib/rendering.ts`
46
- - Status-bar rendering/runtime adapters, bridge status state adapters, status-message rendering and status-HTML binding, structural queue-lane status view contracts, structured redacted runtime-event recording, recent-event recorder state, recent-event line formatting, and grouped pi-side diagnostics helpers in `/lib/status.ts`
47
- - Menu settings/model-registry access through structural ports, menu-state construction, menu runtime state/cache controller, menu-state storage pruning/refresh helpers, command open-flow branching, action runtime/state-builder adapters, menu callback handler adapters, stored callback entry/runtime routing, model-menu input-cache/state-building resolution, pure menu-page derivation, pure menu render-payload builders, menu-message runtime, callback parsing, callback mutation helpers, full model-callback planning and execution, interface-polished callback effect ports, status-thinking callback handling, and UI helpers in `/lib/menu.ts`
48
- - Telegram API-bound transport adapters and broader event-side orchestration in `index.ts`; direct Node file-operation imports stay in the owning domains rather than the entrypoint
49
- - Remaining `index.ts` wiring is intentionally cross-domain adapter code that closes over live extension state, pi callbacks, Telegram API ports, and status updates; keep repeated wiring DRY through small local adapter helpers or owning-domain contracts when that reduces duplication without obscuring live state, and extract more only when a boundary can move cohesive behavior into an owning domain instead of relocating one-off closures
50
- - Additional domains can be extracted into `/lib/*.ts` as the bridge grows, while keeping `index.ts` as the single entrypoint
51
- - `index.ts` uses namespace imports for local bridge domains so orchestration reads as domain-scoped calls such as `Queue.*`, `Turns.*`, and `Rendering.*` instead of long flat import lists
52
- - Mirrored domain regression coverage lives in `/tests/*.test.ts` using the same bare domain naming scheme, and architecture-invariant coverage in `/tests/invariants.test.ts` checks that the local `index.ts` plus `/lib/*.ts` import graph stays acyclic, shared bucket domains such as `lib/constants.ts` or `lib/types.ts` are not reintroduced, empty interface-extension shells stay collapsed into clearer type aliases, direct pi SDK imports stay centralized, `index.ts` source code stays free of direct Node runtime imports, local helper declarations, local arrow adapters, direct `process.env`, and direct `pi.*` receiver access, `/lib/runtime.ts` stays free of local domain imports, structural leaf domains stay free of local nominal imports, the menu domain stays on structural ports without re-exporting model, API transport stays decoupled from persisted config defaults, structural update/media domains stay decoupled from concrete API transport shapes, and attachment delivery stays decoupled from queue/inbound media/API helpers
24
+ Current runtime areas use these ownership boundaries:
25
+
26
+ | Domain | Owns |
27
+ | ------ | ---- |
28
+ | `index.ts` | Single composition root for live pi/Telegram ports, session state, API-bound transport adapters, and status updates |
29
+ | `api` | Bot API transport shapes/helpers, retries, file download, temp-dir lifecycle, inbound limits, chat actions, lazy bot-token clients, runtime error recording |
30
+ | `config` / `setup` | Persisted bot/session pairing state, authorization, first-user pairing, token prompting, env fallback, validation, config persistence |
31
+ | `locks` / `polling` | Singleton `locks.json` ownership, takeover/restart semantics, long-poll controller state, update offset persistence, poll-loop runtime wiring |
32
+ | `updates` / `routing` | Update classification/execution planning, paired authorization, reactions, edits, callbacks, and inbound route composition |
33
+ | `media` / `turns` / `handlers` | Text/media extraction, media-group debounce, inbound downloads, turn building/editing, image reads, attachment-handler matching/execution/fallback output |
34
+ | `queue` | Queue item contracts, lane admission/order, stores, mutations, dispatch readiness/runtime, prompt/control enqueueing, session and agent/tool lifecycle sequencing |
35
+ | `runtime` | Session-local coordination primitives: counters, lifecycle flags, setup guard, abort handler, typing-loop timers, prompt-dispatch flags, agent-end reset binding |
36
+ | `model` / `menu` / `commands` | Model identity/thinking levels, scoped model resolution, in-flight switching, inline status/model/thinking UI, slash commands, bot command registration |
37
+ | `preview` / `replies` / `rendering` | Preview lifecycle/transports, final reply delivery and reply parameters, Telegram HTML Markdown rendering, chunking, stable-preview snapshots |
38
+ | `attachments` | `telegram_attach` registration, outbound attachment queueing, stat/limit checks, photo/document delivery classification |
39
+ | `status` | Status-bar/status-message rendering, queue-lane status views, redacted runtime event ring, grouped pi diagnostics |
40
+ | `lifecycle` / `prompts` / `pi` | pi hook registration, Telegram-specific before-agent prompt injection, centralized direct pi SDK imports and context adapters |
41
+
42
+ Boundary invariants:
43
+
44
+ - Constants and state types live with their owning domains; do not reintroduce shared buckets such as `lib/constants.ts` or `lib/types.ts`
45
+ - Domain helpers use narrow structural projections when that avoids importing concrete wire DTOs or broader runtime objects unnecessarily
46
+ - Preview appearance stays in `rendering`; preview transport/lifecycle stays in `preview`
47
+ - Direct `node:*` file-operation imports stay in owning domains, not in `index.ts`
48
+ - `index.ts` uses namespace imports for local bridge domains so orchestration reads as `Queue.*`, `Turns.*`, and `Rendering.*`
49
+ - Architecture-invariant tests guard the acyclic import graph, pi SDK centralization, entrypoint purity, runtime-domain isolation, structural leaf-domain isolation, menu/model boundaries, API/config separation, media/update/API separation, and attachment boundary isolation
50
+ - Mirrored domain regression coverage lives in `/tests/*.test.ts`; test helpers stay local to the mirrored suite by default, and shared fixture folders are justified only by reuse across multiple domain suites
53
51
 
54
52
  ## Configuration UX
55
53
 
@@ -61,6 +59,10 @@ Current runtime areas include:
61
59
 
62
60
  Because `ctx.ui.input()` only exposes placeholder text, the bridge uses `ctx.ui.editor()` whenever a real default value must appear already filled in. The persisted `telegram.json` config is written with private `0600` permissions because it contains the bot token.
63
61
 
62
+ ## Runtime Ownership
63
+
64
+ Telegram bot configuration stays in `~/.pi/agent/telegram.json`; singleton runtime ownership lives separately in `~/.pi/agent/locks.json` under `@llblab/pi-telegram`. `/telegram-connect` acquires or moves that lock before polling starts, and `/telegram-disconnect` stops polling and releases it. Session start may read the existing lock and resume polling when the lock already points at the current `pid`/`cwd`; after a full pi process restart, it may also replace a stale lock from the same `cwd` and resume polling automatically. Session start does not create new ownership from an inactive lock, a live external lock, or a stale lock from another directory. Session replacement suspends polling and ownership watchers without releasing the lock, allowing the next session-start hook in the same `pid`/`cwd` to resume from the existing explicit ownership. When a live external owner exists, `/telegram-connect` asks whether to move singleton ownership to the current pi instance. Active owners poll the lock while running through a snapshotted ownership context, so long-lived timers do not touch stale pi contexts after `/new`; they stop local polling when `locks.json` no longer points at their own `pid`/`cwd`, without deleting the new owner lock. Deleting `locks.json` resets runtime ownership without deleting Telegram configuration.
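The session-start rules above can be read as a small decision function. The sketch below is a hedged illustration only: the `LockEntry` shape, the `decideOnSessionStart` name, the decision labels, and the injected `isAlive` probe are assumptions, and the real `locks.json` schema is defined in `docs/locks.md`, which is not shown in this diff.

```typescript
// Hypothetical lock entry; the real locks.json schema may carry more fields.
interface LockEntry {
  pid: number;
  cwd: string;
}

type SessionStartDecision =
  | "resume" // lock already points at this pid/cwd
  | "replace-stale-and-resume" // dead owner from the same cwd after a restart
  | "stay-disconnected"; // session start never creates new ownership

function decideOnSessionStart(
  lock: LockEntry | undefined,
  self: LockEntry,
  isAlive: (pid: number) => boolean, // assumed liveness probe, injected for testing
): SessionStartDecision {
  // No lock (inactive): only an explicit /telegram-connect may create ownership.
  if (!lock) return "stay-disconnected";

  // Same process and directory, for example after /new: resume polling.
  if (lock.pid === self.pid && lock.cwd === self.cwd) return "resume";

  // A live external owner keeps the lock; moving it requires confirmation.
  if (isAlive(lock.pid)) return "stay-disconnected";

  // Stale lock: take it over only when it came from the same directory.
  return lock.cwd === self.cwd ? "replace-stale-and-resume" : "stay-disconnected";
}
```

Keeping the liveness check as a parameter makes the three "does not create new ownership" cases easy to exercise in isolation.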
65
+
64
66
  ## Message And Queue Flow
65
67
 
66
68
  ### Inbound Path
@@ -70,9 +72,12 @@ Because `ctx.ui.input()` only exposes placeholder text, the bridge uses `ctx.ui.
70
72
  3. The bridge filters to the paired private user
71
73
  4. Media groups are coalesced into a single Telegram turn when needed
72
74
  5. Files are streamed into `~/.pi/agent/tmp/telegram` with a default 50 MiB size limit, partial-download cleanup on failures, and stale temp cleanup on session start; operators can tune the limit with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`
73
- 6. A `PendingTelegramTurn` is created and queued locally
74
- 7. Telegram `edited_message` updates are routed separately and update a matching queued turn when the original message has not been dispatched yet
75
- 8. The queue dispatcher sends the turn into pi only when dispatch is safe
75
+ 6. Configured inbound attachment handlers may run on downloaded files by MIME wildcard, Telegram attachment type, or generic match selector; command templates receive safe command-arg substitution for `{file}`/`{mime}`/`{type}`
76
+ 7. Matching handlers are tried in config order: a non-zero exit records diagnostics and falls back to the next matching handler, while the first successful handler stops the chain
77
+ 8. Local attachments stay visible under `[attachments] <directory>` with relative file entries, and handler stdout is appended under `[outputs]` before the agent sees the turn; failed handlers omit output while keeping the attachment entry
78
+ 9. A `PendingTelegramTurn` is created and queued locally
79
+ 10. Telegram `edited_message` updates are routed separately and update a matching queued turn when the original message has not been dispatched yet
80
+ 11. The queue dispatcher sends the turn into pi only when dispatch is safe
76
81
 
77
82
  ### Queue Safety Model
78
83
 
@@ -88,11 +93,11 @@ Admission contract:
88
93
  | Admission | Examples | Queue shape | Dispatch rank |
89
94
  | --------------------- | ---------------------------------------------------- | -------------------------------------------------------------------- | ------------- |
90
95
  | Immediate execution | `/compact`, `/stop`, `/help`, `/start` | Does not enter the Telegram queue | N/A |
91
- | Control queue | `/status`, `/model`, model-switch continuation turns | `queueLane: control`; accepts control items and continuation prompts | 0 |
96
+ | Control queue | Model-switch continuation turns and future deferred controls | `queueLane: control`; accepts control items and continuation prompts | 0 |
92
97
  | Priority prompt queue | A waiting prompt promoted by `👍` | `kind: prompt`, `queueLane: priority` | 1 |
93
98
  | Default prompt queue | Normal Telegram text/media turns | `kind: prompt`, `queueLane: default` | 2 |
94
99
 
95
- The command action itself carries its execution mode, and the queue domain exposes lane contracts for admission mode, dispatch rank, and allowed item kinds. Queue append and planning paths validate lane admission so a malformed control/default or other invalid lane pairing fails predictably instead of silently changing priority. This lets synthetic control actions and Telegram prompts share one stable ordering model while still rendering distinctly in status output. In the pi status bar queue preview, priority prompts are marked with `⬆` while control items keep their own control-specific summary markers such as `⚡`.
100
+ The command action itself carries its execution mode, and the queue domain exposes lane contracts for admission mode, dispatch rank, and allowed item kinds. Queue append and planning paths validate lane admission so a malformed control/default or other invalid lane pairing fails predictably instead of silently changing priority. This lets synthetic control actions and Telegram prompts share one stable ordering model while still rendering distinctly in status output. In the pi status bar, busy labels distinguish `active`, `dispatching`, `queued`, `tool running`, `model`, and `compacting`; priority prompts are marked with `⬆` while control items keep markers such as `⚡`.
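As a rough illustration of the lane contract in the admission table, the sketch below validates admission and picks the next item by dispatch rank. The names `QueuedItem`, `DISPATCH_RANK`, `admit`, and `nextToDispatch`, plus the `order` field used for first-in-first-out ordering within a lane, are assumptions for this example and not the queue domain's actual exports.

```typescript
// Hypothetical types for illustration; the real queue contracts may differ.
type QueueLane = "control" | "priority" | "default";

interface QueuedItem {
  id: number;
  kind: "control" | "prompt";
  queueLane: QueueLane;
  order: number; // assumed monotonically increasing admission counter
}

// Dispatch ranks taken from the admission table: lower rank dispatches first.
const DISPATCH_RANK: Record<QueueLane, number> = {
  control: 0,
  priority: 1,
  default: 2,
};

// Reject invalid lane pairings instead of silently changing priority.
// The control lane accepts control items and continuation prompts;
// the priority and default lanes accept prompts only.
function admit(queue: QueuedItem[], item: QueuedItem): void {
  if (item.kind === "control" && item.queueLane !== "control") {
    throw new Error(`control item cannot enter the ${item.queueLane} lane`);
  }
  queue.push(item);
}

// Pick the lowest dispatch rank, then the earliest admission within that lane.
function nextToDispatch(queue: readonly QueuedItem[]): QueuedItem | undefined {
  return [...queue].sort(
    (a, b) =>
      DISPATCH_RANK[a.queueLane] - DISPATCH_RANK[b.queueLane] || a.order - b.order,
  )[0];
}
```

Sorting a copy on every call is deliberately naive; it keeps the ordering rule visible, whereas a real queue would more likely maintain per-lane stores.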
96
101
 
97
102
  A dispatched prompt remains in the queue until `agent_start` consumes it. That keeps the active Telegram turn bound correctly for previews, attachments, abort handling, and final reply delivery.
98
103
 
@@ -104,7 +109,7 @@ Dispatch is gated by:
104
109
  - `ctx.isIdle()` being true
105
110
  - `ctx.hasPendingMessages()` being false
106
111
 
107
- This prevents queue races around rapid follow-ups, `/compact`, and mixed local plus Telegram activity. The dispatch controller also serializes asynchronous control items, so a queued `/status` or `/model` action must settle before the next queued action can dispatch.
112
+ This prevents queue races around rapid follow-ups, `/compact`, and mixed local plus Telegram activity. Telegram `/status` and `/model` execute immediately; the dispatch controller still serializes any deferred control items so a queued control action must settle before the next queued action can dispatch.
108
113
 
109
114
  ### Abort Behavior
110
115
 
@@ -153,13 +158,13 @@ The bridge exposes Telegram-side session controls in addition to regular chat fo
153
158
 
154
159
  Current operator controls include:
155
160
 
156
- - `/status` for model, usage, cost, and context visibility, queued as a high-priority control item when needed
161
+ - `/status` for model, usage, cost, and context visibility, executed immediately from Telegram even while generation is active
157
162
  - Inline status buttons for model and thinking adjustments, applying idle selections immediately while still respecting busy-run restart rules; model-menu inputs are cached briefly and stored inline-menu states are pruned by TTL/LRU so old keyboards expire predictably
158
- - `/model` for interactive model selection, queued as a high-priority control item when needed and supporting in-flight restart of the active Telegram-owned run on a newly selected model
163
+ - `/model` for interactive model selection, executed immediately from Telegram and supporting in-flight restart of the active Telegram-owned run on a newly selected model
159
164
  - `/compact` for Telegram-triggered pi session compaction when the bridge is idle
160
165
  - `/stop` for aborting the active Telegram-owned run
161
166
  - `/telegram-status` for pi-side diagnostics as grouped line-by-line sections separated by blank lines: connection, polling, execution, queue, and the recent redacted runtime/API event ring. These sections include polling state, last update id, active turn source ids, pending dispatch, compaction state, active tool count, pending model-switch state, total queue depth, and queue-lane counts. The event ring records transport/API, polling/update, prompt-dispatch, control-action, typing, compaction, setup, session-lifecycle, and attachment queue/delivery failures; benign unchanged edit responses and unsupported empty draft-clear attempts are filtered out so expected preview transport noise does not obscure real failures
162
- - Queue reactions using `👍` and `👎`, with `👎` acting as the canonical queue-removal path because ordinary Telegram DM message deletions are not exposed through the Bot API polling path this bridge uses
167
+ - Queue reactions using `👍` and `👎` apply to waiting text, voice, file, image, and media-group turns by matching the turn's source Telegram message ids; `👎` acts as the canonical queue-removal path because ordinary Telegram DM message deletions are not exposed through the Bot API polling path this bridge uses
163
168
 
164
169
  ## In-Flight Model Switching
165
170
 
@@ -0,0 +1,60 @@
1
+ # Attachment Handlers
2
+
3
+ `pi-telegram` can run ordered inbound attachment handlers after downloading files and before the Telegram turn enters the pi queue.
4
+
5
+ This document is the local adaptation of the portable [Command Template Standard](./command-templates.md).
6
+
7
+ ## Config Shape
8
+
9
+ `telegram.json` may define `attachmentHandlers`:
10
+
11
+ ```json
+ {
+   "attachmentHandlers": [
+     {
+       "type": "voice",
+       "template": "~/.pi/agent/skills/mistral-stt/scripts/transcribe.mjs {file} {lang} {model}",
+       "args": ["file", "lang", "model"],
+       "defaults": {
+         "lang": "ru",
+         "model": "voxtral-mini-latest"
+       }
+     },
+     {
+       "mime": "audio/*",
+       "template": "~/.pi/agent/skills/groq-stt/scripts/transcribe.mjs {file} {lang} {model}",
+       "args": ["file", "lang", "model"],
+       "defaults": {
+         "lang": "ru",
+         "model": "whisper-large-v3-turbo"
+       }
+     }
+   ]
+ }
+ ```
35
+
36
+ Handlers match by `type`, `mime`, or `match`. Wildcards such as `audio/*` are accepted. Each handler must provide a `template`; the optional `args` and `defaults` fields declare placeholders and supply default values for them.
37
+
38
+ ## Template Placeholders
39
+
40
+ Attachment handlers support these built-in placeholders:
41
+
42
+ | Placeholder | Value                                                            |
+ | ----------- | ---------------------------------------------------------------- |
+ | `{file}`    | Full local path to the downloaded file                           |
+ | `{mime}`    | MIME type if known                                               |
+ | `{type}`    | Attachment kind such as `voice`, `audio`, `document`, or `photo` |
47
+
48
+ `defaults` may provide additional placeholder values such as `{lang}` or `{model}`. `args` documents supported placeholders and may also encode defaults in compact form, for example `"file,lang=ru,model=voxtral-mini-latest"`.
49
+
50
+ If a template has no `{file}` placeholder, the downloaded file path is appended as the last command arg.
51
+
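The compact `args` form and the `{file}` fallback above can be sketched as follows. This is a minimal illustration, not the package's implementation: `parseArgsSpec`, `withFileArg`, and the `ArgsSpec` shape are hypothetical names, and treating array entries the same way as comma-separated entries is an assumption.

```typescript
// Hypothetical result shape: declared placeholder names plus any inline defaults.
interface ArgsSpec {
  names: string[];
  defaults: Record<string, string>;
}

// Parse either the array form or the compact string form,
// e.g. "file,lang=ru,model=voxtral-mini-latest".
function parseArgsSpec(args: string | string[]): ArgsSpec {
  const entries = typeof args === "string" ? args.split(",") : args;
  const spec: ArgsSpec = { names: [], defaults: {} };
  for (const raw of entries) {
    const entry = raw.trim();
    if (entry === "") continue;
    const eq = entry.indexOf("=");
    if (eq === -1) {
      spec.names.push(entry);
      continue;
    }
    const name = entry.slice(0, eq).trim();
    spec.names.push(name);
    spec.defaults[name] = entry.slice(eq + 1).trim();
  }
  return spec;
}

// If no template word mentions {file}, append the downloaded path as the last arg.
function withFileArg(templateWords: string[], filePath: string): string[] {
  const hasFile = templateWords.some((word) => word.includes("{file}"));
  return hasFile ? templateWords : [...templateWords, filePath];
}
```

How inline defaults from `args` combine with a separate `defaults` object is not specified here, so the sketch keeps them apart rather than guessing a precedence.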
52
+ ## Ordered Fallbacks
53
+
54
+ A handler list is ordered. For each attachment, matching handlers run in list order and stop after the first successful handler.
55
+
56
+ If a matching handler fails with a non-zero exit code, the runtime records diagnostics and tries the next matching handler. If every matching handler fails, the attachment remains visible in the prompt as a normal local file reference.
57
+
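The fallback order described above can be sketched like this. It is an illustration under stated assumptions: the `Handler`, `Attachment`, and `RunResult` shapes and the injected `run` callback are hypothetical, the `match` selector is left out because its semantics are not described here, and a real runtime would spawn the command asynchronously instead of calling a synchronous function.

```typescript
interface Handler { type?: string; mime?: string; template: string; }
interface Attachment { type: string; mime?: string; file: string; }
interface RunResult { code: number; stdout: string; }

// "audio/*" matches any "audio/..." MIME type; other patterns must match exactly.
function mimeMatches(pattern: string, mime: string | undefined): boolean {
  if (mime === undefined) return false;
  if (pattern.endsWith("/*")) return mime.startsWith(pattern.slice(0, -1));
  return pattern === mime;
}

function handlerMatches(handler: Handler, attachment: Attachment): boolean {
  if (handler.type !== undefined && handler.type === attachment.type) return true;
  if (handler.mime !== undefined && mimeMatches(handler.mime, attachment.mime)) return true;
  return false;
}

// Try matching handlers in list order and stop at the first exit code 0.
function runHandlers(
  handlers: Handler[],
  attachment: Attachment,
  run: (handler: Handler, attachment: Attachment) => RunResult,
): { output: string | undefined; failures: string[] } {
  const failures: string[] = [];
  for (const handler of handlers) {
    if (!handlerMatches(handler, attachment)) continue;
    const result = run(handler, attachment);
    if (result.code === 0) return { output: result.stdout, failures };
    // Non-zero exit: keep a diagnostic and fall through to the next match.
    failures.push(`${handler.template} exited with code ${result.code}`);
  }
  // Every matching handler failed (or none matched): no output is added.
  return { output: undefined, failures };
}
```

When `output` is `undefined`, the attachment would simply stay in the prompt as a normal local file reference.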
58
+ ## Prompt Output
59
+
60
+ Local attachments stay in the prompt under `[attachments] <directory>` with relative file entries. Successful handler stdout is added under `[outputs]`. Empty output and failed handler output are omitted from the prompt text.
@@ -0,0 +1,75 @@
1
+ # Command Template Standard
2
+
3
+ Command templates are the stable integration format for deterministic local automation.
4
+
5
+ This document is the portable core. Extensions may adapt local examples, placeholder sources, and config locations, but should preserve this contract to stay compatible with the shared command-template model.
6
+
7
+ ## Definition
8
+
9
+ A command template is a single command-line string with named placeholders:
10
+
11
+ ```text
+ ~/bin/transcribe {file} {lang}
+ ```
14
+
15
+ ## Execution Contract
16
+
17
+ The runtime must:
18
+
19
+ 1. Split the template into shell-like words, honoring simple single quotes, double quotes, and backslash escapes
20
+ 2. Substitute placeholders inside each split word
21
+ 3. Execute the first word as the command and the remaining words as args
22
+ 4. Avoid evaluating the template through a shell
23
+ 5. Treat exit code `0` as success and non-zero exit as failure
24
+ 6. Use stdout as the result channel
25
+ 7. Use stderr only for diagnostics
26
+
27
+ Implementations may expand `~` in the command position and may resolve relative command paths against the caller cwd.
28
+
29
+ ## Quoting Model
30
+
31
+ Placeholder values are not shell-escaped because templates are not executed through a shell. A value containing spaces remains one command arg when it replaces one split word:
32
+
33
+ ```text
+ template="echo {text}"
+ text="hello world"
+ args=["hello world"]
+ ```
38
+
39
+ A placeholder can also be embedded inside one word:
40
+
41
+ ```text
+ template="tool --file={file}"
+ file="/tmp/a b.ogg"
+ args=["--file=/tmp/a b.ogg"]
+ ```
46
+
47
+ Use quotes only for literal template words that should contain spaces before placeholder substitution:
48
+
49
+ ```text
+ template="echo 'literal words' {text}"
+ ```
52
+
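The split-then-substitute order from the execution contract and the quoting examples can be sketched as below. This is a simplified illustration, not the reference implementation: `splitTemplate`, `substitute`, and `buildArgv` are hypothetical names, a backslash is treated as escaping any following character, and leaving unknown placeholders untouched is an assumption.

```typescript
// Step 1: split into shell-like words, honoring simple quotes and backslash escapes.
function splitTemplate(template: string): string[] {
  const words: string[] = [];
  let current = "";
  let inWord = false;
  let quote: string | null = null;

  for (let i = 0; i < template.length; i++) {
    const ch = template.charAt(i);
    if (quote === "'") {
      // Single quotes: everything is literal until the closing quote.
      if (ch === "'") quote = null;
      else current += ch;
    } else if (quote === '"') {
      if (ch === '"') quote = null;
      else if (ch === "\\" && i + 1 < template.length) current += template.charAt(++i);
      else current += ch;
    } else if (ch === "'" || ch === '"') {
      quote = ch;
      inWord = true; // an empty quoted string still counts as a word
    } else if (ch === "\\" && i + 1 < template.length) {
      current += template.charAt(++i);
      inWord = true;
    } else if (/\s/.test(ch)) {
      if (inWord) words.push(current);
      current = "";
      inWord = false;
    } else {
      current += ch;
      inWord = true;
    }
  }
  if (quote !== null) throw new Error("unterminated quote in template");
  if (inWord) words.push(current);
  return words;
}

// Step 2: substitute placeholders inside one already-split word.
function substitute(word: string, values: Record<string, string>): string {
  return word.replace(/\{(\w+)\}/g, (whole: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(values, name) ? values[name] : undefined;
    return value ?? whole; // assumption: unknown placeholders stay as written
  });
}

// Step 3: the first word is the command, the rest are args; no shell is involved.
function buildArgv(template: string, values: Record<string, string>): string[] {
  return splitTemplate(template).map((word) => substitute(word, values));
}
```

Because substitution happens after splitting, `buildArgv("echo {text}", { text: "hello world" })` keeps `hello world` as one arg, which is why placeholder values need no shell escaping.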
53
+ ## Storage Vocabulary
54
+
55
+ JSON storage is part of the standard vocabulary, but not one universal schema. Extensions may store command templates in different config files and surrounding shapes.
56
+
57
+ Common field names:
58
+
59
+ | Field      | Meaning                                                                                    |
+ | ---------- | ------------------------------------------------------------------------------------------ |
+ | `template` | Command-line template string, usually attached to a named capability or handler            |
+ | `args`     | Declared placeholder names, represented as a string or array according to the local schema |
+ | `defaults` | Object mapping placeholder names to default values                                         |
64
+
65
+ Config file locations, selectors, labels, descriptions, and surrounding registry shapes belong to each extension's local adaptation.
66
+
67
+ ## Tool Boundary
68
+
69
+ Agent tools are a separate abstraction. A tool name is not a portable command template because the pi extension API currently exposes tool registration and metadata, but not a public extension-to-extension `executeTool(name, args)` call.
70
+
71
+ Until such an API exists, extensions should prefer command templates for deterministic local automation.
72
+
73
+ ## Compatibility
74
+
75
+ Consumers should share this template contract, not private registry fields or implementation details from any specific extension.
package/docs/locks.md ADDED
@@ -0,0 +1,136 @@
1
+ # Extension Locks Standard
2
+
3
+ `locks.json` is a shared registry for singleton pi extensions.
4
+
5
+ Path:
6
+
7
+ ```text
+ ~/.pi/agent/locks.json
+ ```
10
+
11
+ ## Shape
12
+
13
+ ```json
+ {
+   "@llblab/pi-telegram": {
+     "pid": 2590864,
+     "cwd": "/home/user/project"
+   }
+ }
+ ```
21
+
22
+ Top-level keys are extension identities. Values are JSON objects owned by that extension.
23
+
24
+ ## Identity key
25
+
26
+ Use the most stable available identity:
27
+
28
+ 1. `package.json/name` for npm-style pi packages
29
+ 2. Directory name when the extension entrypoint is `index.ts` but there is no package name
30
+ 3. File basename when the extension is a single file
31
+
32
+ For npm-style package extensions, the canonical value is the `package.json` `name`. Implementations may keep that value as a small local constant when it is clearer than runtime package introspection. The fallback rules are only for unpackaged extensions.
33
+
34
+ Examples:
35
+
36
+ ```text
+ extensions/pi-telegram/package.json name=@llblab/pi-telegram -> @llblab/pi-telegram
+ extensions/pi-telegram/index.ts without package.json         -> pi-telegram
+ extensions/pi-telegram.ts                                    -> pi-telegram
+ ```
41
+
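The three identity rules and the examples above can be sketched as a small pure function. `lockIdentity` is a hypothetical helper name, and the sketch assumes POSIX-style `/` separators and that the caller has already read `package.json` `name` when one exists.

```typescript
// Pick the most stable identity: package name, then directory name, then file basename.
function lockIdentity(entryPath: string, packageName?: string): string {
  // Rule 1: npm-style packages use their package.json name.
  if (packageName) return packageName;

  const parts = entryPath.split("/").filter((part) => part !== "");
  const file = parts[parts.length - 1] ?? "";
  const base = file.replace(/\.[^.]+$/, ""); // strip the extension

  // Rule 2: an index entrypoint is identified by its directory name.
  if (base === "index" && parts.length >= 2) return parts[parts.length - 2] ?? base;

  // Rule 3: a single-file extension is identified by its basename.
  return base;
}
```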
42
+ ## Required fields
43
+
44
+ ```json
+ {
+   "pid": 2590864
+ }
+ ```
49
+
50
+ `pid` is the process that currently owns the singleton runtime. `cwd` should be stored when ownership is tied to a pi session directory.
51
+
52
+ During a user-initiated start/connect event, an extension should:
53
+
54
+ 1. Read its lock entry
55
+ 2. If `pid` is stale, replace the entry
56
+ 3. If `pid` and `cwd` match the current pi instance, refresh or keep the entry
57
+ 4. If a live external owner exists, ask interactively whether to move singleton ownership here
58
+
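The start/connect decision above can be sketched as a classifier over the lock entry. The names `LockEntry`, `LockState`, and `classifyLock` are hypothetical, and liveness is injected as a predicate so the sketch stays testable; in Node.js, a common way to implement such a predicate is `process.kill(pid, 0)` inside a `try`/`catch`. Treating a live entry with the same `pid` but a different `cwd` as `active elsewhere` is an assumption.

```typescript
interface LockEntry { pid: number; cwd?: string; }
type LockState = "absent" | "stale" | "active here" | "active elsewhere";

function classifyLock(
  entry: LockEntry | undefined,
  self: { pid: number; cwd: string },
  isAlive: (pid: number) => boolean,
): LockState {
  if (entry === undefined) return "absent";
  // A dead owner means the entry can be replaced (step 2).
  if (!isAlive(entry.pid)) return "stale";
  // If cwd is stored, both pid and cwd must match the current pi instance (step 3).
  const samePid = entry.pid === self.pid;
  const sameCwd = entry.cwd === undefined || entry.cwd === self.cwd;
  // Otherwise a live external owner exists and the user should be asked (step 4).
  return samePid && sameCwd ? "active here" : "active elsewhere";
}
```

A start/connect command would replace the entry for `absent` or `stale`, keep it for `active here`, and prompt for takeover on `active elsewhere`.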
59
+ ## Acquisition timing
60
+
61
+ Lock writes must be caused by an explicit user-initiated runtime event, such as `/wakeup-start`, `/telegram-connect`, or a confirmed takeover prompt.
62
+
63
+ Extension initialization and session-start hooks may read `locks.json`, update local status, install ownership watchers, and resume local work when the existing lock already points at the current `pid`/`cwd`. After a full process restart, a session-start hook may replace a stale lock from the same `cwd` to restore explicitly requested ownership. These hooks must not, on their own, create ownership from an inactive lock, take over a live external owner, or replace a stale lock from another directory. Such locks should stay visible as state until the user runs the start/connect command. Session replacement should suspend local runtime work and ownership watchers without releasing the lock, so the next session in the same `pid`/`cwd` can resume from explicit ownership.
64
+
65
+ ## Optional fields
66
+
67
+ Extensions may add compact fields when useful:
68
+
69
+ ```json
+ {
+   "pid": 2590864,
+   "cwd": "/repo/project",
+   "mode": "connected",
+   "updatedAt": "2026-04-28T00:00:00.000Z"
+ }
+ ```
77
+
78
+ Do not print optional fields in normal UI unless they help the user act.
79
+
80
+ ## Ownership rules
81
+
82
+ - One top-level key per singleton extension
83
+ - An extension may only mutate its own key
84
+ - Other keys must be preserved exactly
85
+ - If `cwd` is present, active-here ownership means both `pid` and `cwd` match the current pi instance
86
+ - Human-readable diagnostics should say `active here`, `active elsewhere`, or `stale`
87
+ - Debug data belongs in `locks.json`, not in normal status output
88
+
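The "mutate only your own key" rule can be sketched as a pure transformation over the file contents. `writeOwnKey` is a hypothetical helper, and the sketch works on strings so that reading and writing `locks.json` stay with the caller. Note that re-serializing preserves other extensions' values but not necessarily their original byte-level formatting.

```typescript
interface OwnLockEntry { pid: number; cwd?: string; }

// Return new locks.json text with only `key` changed; all other keys pass through.
function writeOwnKey(json: string, key: string, entry: OwnLockEntry): string {
  const parsed: unknown = json.trim() === "" ? {} : JSON.parse(json);
  const registry: Record<string, unknown> =
    typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : {};
  return JSON.stringify({ ...registry, [key]: entry }, null, 2) + "\n";
}
```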
89
+ ## Runtime status
90
+
91
+ Singleton extensions with footer/status presence should expose quiet but explicit local state. For example, pi-wakeup uses:
92
+
93
+ - `wakeup off` when this pi instance does not own the singleton runtime
94
+ - `wakeup on` when this pi instance owns the runtime but has no pending wake-up detail to show
95
+ - `wakeup [16:32:39]` when the runtime owns scheduled work and can show the next countdown
96
+
97
+ ## Interactive takeover
98
+
99
+ Start/connect commands should make singleton moves easy:
100
+
101
+ 1. If no live owner exists, take ownership without an extra prompt
102
+ 2. If a live external owner exists, ask whether to move singleton ownership to this pi instance
103
+ 3. On confirmation, write the current `{ "pid": ..., "cwd": ... }` to this extension's key in `locks.json`
104
+ 4. The previous owner must notice that `locks.json` no longer points at its own `pid`/`cwd` and stop local runtime work without deleting the new lock
105
+
106
+ Takeover prompts should use the extension name as the dialog title, then the question, a blank line, and source/target lines:
107
+
108
+ ```text
+ pi-telegram
+ move singleton lock here?
+
+ from: pid 2590864, cwd /old
+ to: /new
+ ```
115
+
116
+ Avoid repeating the extension name in the body. Color is encouraged: an accent for the extension title, a warning color for the question, and muted text for the `from:`/`to:` lines.
117
+
118
+ The previous owner may use `fs.watch`, mtime polling, or an existing status/timer tick. Long-lived watchers should compare against a snapshotted `pid`/`cwd` identity rather than a live pi context object, because session replacement such as `/new` makes captured contexts stale. The important contract is graceful local shutdown after ownership mismatch.
119
+
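The mismatch check a previous owner runs on each watcher or timer tick can be sketched as below. `OwnerSnapshot` and `ownershipLost` are hypothetical names; the key point from the text is that the comparison uses a `pid`/`cwd` snapshot taken at acquisition time, not a live context object. Treating a missing entry as lost ownership is an assumption based on the reset rule.

```typescript
interface OwnerSnapshot { pid: number; cwd: string; }

// `entry` is this extension's current value in locks.json, or undefined if absent.
function ownershipLost(
  snapshot: OwnerSnapshot,
  entry: { pid?: number; cwd?: string } | undefined,
): boolean {
  // Key removed or file reset: stop local runtime work.
  if (entry === undefined) return true;
  if (entry.pid !== snapshot.pid) return true;
  // Only compare cwd when the lock actually stores one.
  return entry.cwd !== undefined && entry.cwd !== snapshot.cwd;
}
```

When this returns `true`, the previous owner should shut down its local runtime work and leave the new lock entry untouched.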
120
+ ## Reset
121
+
122
+ Delete `~/.pi/agent/locks.json` to reset singleton runtime ownership for all participating extensions without deleting their configuration files such as `telegram.json`.
123
+
124
+ ## Atomicity
125
+
126
+ The current baseline is a plain read-modify-write of the JSON file. This is enough for interactive pi singleton starts.
127
+
128
+ If multiple instances may start concurrently, a later revision should add an atomic helper:
129
+
130
+ - Lock file around `locks.json`, or
131
+ - Temp file + rename with conflict checks, or
132
+ - OS-level exclusive open for a short critical section
133
+
134
+ ## Migration
135
+
136
+ Migrations from legacy lock files or legacy keys should be one-off cleanup work. Runtime ownership should read and write only `locks.json` under the canonical identity key.