@llblab/pi-telegram 0.6.2 → 0.7.0
- package/AGENTS.md +152 -0
- package/BACKLOG.md +5 -0
- package/CHANGELOG.md +142 -0
- package/README.md +49 -42
- package/docs/architecture.md +49 -40
- package/docs/attachment-handlers.md +4 -6
- package/docs/command-templates.md +53 -9
- package/docs/locks.md +4 -0
- package/docs/outbound-handlers.md +20 -14
- package/index.ts +217 -153
- package/lib/api.ts +1 -0
- package/lib/attachment-handlers.ts +16 -9
- package/lib/attachments.ts +1 -0
- package/lib/command-templates.ts +32 -3
- package/lib/commands.ts +367 -80
- package/lib/config.ts +10 -8
- package/lib/keyboard.ts +14 -0
- package/lib/lifecycle.ts +26 -0
- package/lib/locks.ts +3 -2
- package/lib/media.ts +1 -0
- package/lib/menu-model.ts +881 -0
- package/lib/menu-queue.ts +608 -0
- package/lib/menu-status.ts +226 -0
- package/lib/menu-thinking.ts +171 -0
- package/lib/menu.ts +143 -1019
- package/lib/model.ts +1 -0
- package/lib/outbound-handlers.ts +120 -45
- package/lib/pi.ts +8 -0
- package/lib/polling.ts +1 -0
- package/lib/preview.ts +97 -50
- package/lib/prompt-templates.ts +150 -0
- package/lib/prompts.ts +17 -9
- package/lib/queue.ts +51 -15
- package/lib/rendering.ts +1 -0
- package/lib/replies.ts +86 -2
- package/lib/routing.ts +76 -14
- package/lib/runtime.ts +2 -0
- package/lib/setup.ts +1 -0
- package/lib/status.ts +15 -6
- package/lib/turns.ts +1 -0
- package/lib/updates.ts +36 -6
- package/package.json +4 -1
package/docs/architecture.md
CHANGED
|
@@ -2,18 +2,18 @@
|
|
|
2
2
|
|
|
3
3
|
## Overview
|
|
4
4
|
|
|
5
|
-
`pi-telegram` is a session-local
|
|
5
|
+
`pi-telegram` is a session-local π extension that binds one Telegram DM to one running π session. The bridge owns four main responsibilities:
|
|
6
6
|
|
|
7
7
|
- Poll Telegram updates and enforce single-user pairing
|
|
8
|
-
- Translate Telegram messages and media into
|
|
9
|
-
- Stream and deliver
|
|
10
|
-
- Manage Telegram-specific controls such as queue reactions, `/
|
|
8
|
+
- Translate Telegram messages and media into π inputs
|
|
9
|
+
- Stream and deliver π responses back to Telegram
|
|
10
|
+
- Manage Telegram-specific controls such as queue reactions, π prompt-template commands, `/start` application menu sections, `/compact`, `/next`, `/abort`, and `/stop`
|
|
11
11
|
|
|
12
12
|
## Runtime Structure
|
|
13
13
|
|
|
14
14
|
`index.ts` remains the extension entrypoint and composition root. Reusable runtime logic is split into flat domain files under `/lib` rather than into a deep local module tree.
|
|
15
15
|
|
|
16
|
-
Architecture shorthand: this repository uses a `Flat Domain DAG`: cohesive bridge domains live as flat `/lib/*.ts` modules, local imports must form a directed acyclic graph, shared buckets are avoided, and `index.ts` wires live
|
|
16
|
+
Architecture shorthand: this repository uses a `Flat Domain DAG`: cohesive bridge domains live as flat `/lib/*.ts` modules, local imports must form a directed acyclic graph, shared buckets are avoided, and `index.ts` wires live π/Telegram ports plus session state. Source-module opening comments include `Zones:` tags such as `telegram`, `pi agent`, `tui`, or `shared utils` so cross-cutting responsibility areas stay visible without folder nesting.
|
|
17
17
|
|
|
18
18
|
Domain grouping rule: prefer cohesive domain files over atomizing every helper into its own file. A `shared` domain is allowed only for types or constants that genuinely span multiple bridge domains.
|
|
19
19
|
|
|
@@ -23,27 +23,29 @@ Naming rule: because the repository already scopes this codebase to Telegram, ex
|
|
|
23
23
|
|
|
24
24
|
Current runtime areas use these ownership boundaries:
|
|
25
25
|
|
|
26
|
-
| Domain
|
|
27
|
-
|
|
|
28
|
-
| `index.ts`
|
|
29
|
-
| `api`
|
|
30
|
-
| `config` / `setup`
|
|
31
|
-
| `locks` / `polling`
|
|
32
|
-
| `updates` / `routing`
|
|
33
|
-
| `media` / `turns` / `attachment-handlers` | Text/media extraction, media-group debounce, inbound downloads, turn building/editing, image reads, attachment-handler matching/execution/fallback output
|
|
34
|
-
| `queue`
|
|
35
|
-
| `runtime`
|
|
36
|
-
| `model` / `menu` / `commands`
|
|
37
|
-
| `
|
|
38
|
-
| `
|
|
39
|
-
| `
|
|
40
|
-
| `
|
|
41
|
-
| `
|
|
42
|
-
| `
|
|
26
|
+
| Domain | Owns |
|
|
27
|
+
| ----------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
28
|
+
| `index.ts` | Single composition root for live π/Telegram ports, session state, API-bound transport adapters, and status updates |
|
|
29
|
+
| `api` | Bot API transport shapes/helpers, retries, file download, temp-dir lifecycle, inbound limits, chat actions, lazy bot-token clients, runtime error recording |
|
|
30
|
+
| `config` / `setup` | Persisted bot/session pairing state, authorization, first-user pairing, token prompting, env fallback, validation, config persistence |
|
|
31
|
+
| `locks` / `polling` | Singleton `locks.json` ownership, takeover/restart semantics, long-poll controller state, update offset persistence, poll-loop runtime wiring |
|
|
32
|
+
| `updates` / `routing` | Update classification/execution planning, paired authorization, reactions, edits, callbacks, and inbound route composition |
|
|
33
|
+
| `media` / `turns` / `attachment-handlers` | Text/media extraction, media-group debounce, inbound downloads, turn building/editing, image reads, attachment-handler matching/execution/fallback output |
|
|
34
|
+
| `queue` | Queue item contracts, lane admission/order, stores, mutations, dispatch readiness/runtime, prompt/control enqueueing, session and agent/tool lifecycle sequencing |
|
|
35
|
+
| `runtime` | Session-local coordination primitives: counters, lifecycle flags, setup guard, abort handler, typing-loop timers, prompt-dispatch flags, agent-end reset binding |
|
|
36
|
+
| `model` / `menu-model` / `menu-thinking` / `menu-status` / `menu` / `menu-queue` / `commands` | Model identity/thinking levels, scoped model resolution, in-flight switching, model-menu UI, thinking-menu UI, status-menu UI, inline application callback composition, queue-menu UI, slash commands, bot command registration |
|
|
37
|
+
| `keyboard` | Shared Telegram inline-keyboard reply-markup structure; feature domains own callback semantics and button construction |
|
|
38
|
+
| `preview` / `replies` / `rendering` | Preview lifecycle/transports, final reply delivery and reply parameters, Telegram HTML Markdown rendering, chunking, stable-preview snapshots |
|
|
39
|
+
| `outbound-handlers` | Assistant-authored outbound comments, generated reply artifacts, inline-keyboard callbacks, and post-`agent_end` outbound action delivery |
|
|
40
|
+
| `attachments` | `telegram_attach` registration, outbound attachment queueing, stat/limit checks, photo/document delivery classification |
|
|
41
|
+
| `status` | Status-bar/status-message rendering, queue-lane status views, redacted runtime event ring, grouped π diagnostics |
|
|
42
|
+
| `lifecycle` / `prompts` / `prompt-templates` / `pi` | π hook registration, Telegram-specific before-agent prompt injection, π prompt-template discovery/expansion, centralized direct pi SDK imports and context adapters |
|
|
43
|
+
| `command-templates` | Portable shell-free command-template standard helpers, composition expansion, placeholder substitution, and executable resolution |
|
|
43
44
|
|
|
44
45
|
Boundary invariants:
|
|
45
46
|
|
|
46
47
|
- Constants and state types live with their owning domains; do not reintroduce shared buckets such as `lib/constants.ts` or `lib/types.ts`
|
|
48
|
+
- Shared Telegram inline-keyboard structure belongs to `keyboard`; application-control labels, callback data, and callback behavior stay in `menu`/`menu-model`/`menu-thinking`/`menu-status`/`menu-queue`; core queue mechanics stay in `queue`
|
|
47
49
|
- Domain helpers use narrow structural projections when that avoids importing concrete wire DTOs or broader runtime objects unnecessarily
|
|
48
50
|
- Preview appearance stays in `rendering`; preview transport/lifecycle stays in `preview`
|
|
49
51
|
- Direct `node:*` file-operation imports stay in owning domains, not in `index.ts`
|
|
@@ -59,11 +61,11 @@ Boundary invariants:
|
|
|
59
61
|
2. Otherwise use the first configured environment variable from the supported Telegram token list
|
|
60
62
|
3. Fall back to the example placeholder when no real value exists
|
|
61
63
|
|
|
62
|
-
Because `ctx.ui.input()` only exposes placeholder text, the bridge uses `ctx.ui.editor()` whenever a real default value must appear already filled in. The persisted `telegram.json` config is written
|
|
64
|
+
Because `ctx.ui.input()` only exposes placeholder text, the bridge uses `ctx.ui.editor()` whenever a real default value must appear already filled in. The persisted `telegram.json` config is written through a private temp file plus atomic rename, then left with `0600` permissions because it contains the bot token.
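The temp-file-plus-rename write described above can be sketched as follows. This is a minimal illustration, not the bridge's actual code: `writeConfigAtomically`, the temp-file naming scheme, and the demo paths are all assumptions; only the `0600` mode and the rename step come from the text.

```typescript
import { chmodSync, mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper (assumption): the real bridge function is not shown in this diff.
function writeConfigAtomically(targetPath: string, config: unknown): void {
  // Keep the temp file beside the target so the rename stays on one filesystem.
  const tempPath = join(dirname(targetPath), `.telegram.json.${process.pid}.tmp`);
  // Request owner-only permissions when the temp file is created.
  writeFileSync(tempPath, JSON.stringify(config, null, 2) + "\n", { mode: 0o600 });
  // Re-apply the mode in case a leftover temp file already existed with wider bits.
  chmodSync(tempPath, 0o600);
  // rename() swaps the file into place in one step, so readers never see a partial write.
  renameSync(tempPath, targetPath);
}

// Usage against a throwaway directory; the path and token are placeholders.
const demoDir = mkdtempSync(join(tmpdir(), "pi-telegram-demo-"));
const demoTarget = join(demoDir, "telegram.json");
writeConfigAtomically(demoTarget, { botToken: "123456:placeholder" });
console.log(readFileSync(demoTarget, "utf8"));
console.log((statSync(demoTarget).mode & 0o777).toString(8));
```

Writing the temp file next to the target rather than in the system temp directory is a deliberate choice in this sketch: a rename across filesystems would not be atomic.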
|
|
63
65
|
|
|
64
66
|
## Runtime Ownership
|
|
65
67
|
|
|
66
|
-
Telegram bot configuration stays in `~/.pi/agent/telegram.json`; singleton runtime ownership lives separately in `~/.pi/agent/locks.json` under `@llblab/pi-telegram`. `/telegram-connect` acquires or moves that lock before polling starts, and `/telegram-disconnect` stops polling and releases it. Session start may read the existing lock and resume polling when the lock already points at the current `pid`/`cwd`; after a full
|
|
68
|
+
Telegram bot configuration stays in `~/.pi/agent/telegram.json`; singleton runtime ownership lives separately in `~/.pi/agent/locks.json` under `@llblab/pi-telegram`. `/telegram-connect` acquires or moves that lock before polling starts, and `/telegram-disconnect` stops polling and releases it. Session start may read the existing lock and resume polling when the lock already points at the current `pid`/`cwd`; after a full π process restart, it may also replace a stale lock from the same `cwd` and resume polling automatically. Session start does not create new ownership from an inactive lock, a live external lock, or a stale lock from another directory. Session replacement suspends polling and ownership watchers without releasing the lock, allowing the next session-start hook in the same `pid`/`cwd` to resume from the existing explicit ownership. When a live external owner exists, `/telegram-connect` asks whether to move singleton ownership to the current π instance. Active owners poll the lock while running through a snapshotted ownership context, so long-lived timers do not touch stale π contexts after `/new`; they stop local polling when `locks.json` no longer points at their own `pid`/`cwd`, without deleting the new owner lock. Deleting `locks.json` resets runtime ownership without deleting Telegram configuration.
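The session-start rules in the paragraph above reduce to a small decision function. The sketch below is only an illustration under assumptions: the `LockEntry` shape, the action names, and the injected `isProcessAlive` probe are hypothetical, and the real `locks.json` schema is not shown in this diff.

```typescript
// Hypothetical shapes (assumption): the real lock record may carry more fields.
interface LockEntry {
  pid: number;
  cwd: string;
}

type SessionStartAction = "resume" | "replace-stale-and-resume" | "stay-disconnected";

function planSessionStart(
  lock: LockEntry | undefined,
  self: LockEntry,
  isProcessAlive: (pid: number) => boolean,
): SessionStartAction {
  // Inactive lock: session start never creates ownership on its own.
  if (!lock) return "stay-disconnected";
  // The lock already points at this pid/cwd: resume polling.
  if (lock.pid === self.pid && lock.cwd === self.cwd) return "resume";
  // A live external owner keeps the lock; only /telegram-connect may move it.
  if (isProcessAlive(lock.pid)) return "stay-disconnected";
  // Stale lock: take over only when it came from the same working directory.
  return lock.cwd === self.cwd ? "replace-stale-and-resume" : "stay-disconnected";
}

const me: LockEntry = { pid: 200, cwd: "/work/project" };
const alive = (pid: number) => pid === 200 || pid === 300;
console.log(planSessionStart({ pid: 100, cwd: "/work/project" }, me, alive));
```

Passing the liveness probe in as a parameter keeps the decision pure, which makes each of the five cases easy to test in isolation.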
|
|
67
69
|
|
|
68
70
|
## Message And Queue Flow
|
|
69
71
|
|
|
@@ -80,11 +82,11 @@ Telegram bot configuration stays in `~/.pi/agent/telegram.json`; singleton runti
|
|
|
80
82
|
9. Local attachments stay visible under `[attachments] <directory>` with relative file entries, and handler stdout is appended under `[outputs]` before the agent sees the turn; failed handlers omit output while keeping the attachment entry
|
|
81
83
|
10. A `PendingTelegramTurn` is created and queued locally
|
|
82
84
|
11. Telegram `edited_message` updates are routed separately and update a matching queued turn when the original message has not been dispatched yet
|
|
83
|
-
12. The queue dispatcher sends the turn into
|
|
85
|
+
12. The queue dispatcher sends the turn into π only when dispatch is safe
|
|
84
86
|
|
|
85
87
|
### Queue Safety Model
|
|
86
88
|
|
|
87
|
-
The bridge keeps its own Telegram queue and does not rely only on
|
|
89
|
+
The bridge keeps its own Telegram queue and does not rely only on π's internal pending-message state.
|
|
88
90
|
|
|
89
91
|
Queued items now use two explicit dimensions:
|
|
90
92
|
|
|
@@ -95,24 +97,27 @@ Admission contract:
|
|
|
95
97
|
|
|
96
98
|
| Admission | Examples | Queue shape | Dispatch rank |
|
|
97
99
|
| --------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------- | ------------- |
|
|
98
|
-
| Immediate execution | `/compact`, `/stop`, `/help`, `/start`
|
|
100
|
+
| Immediate execution | `/compact`, `/queue`, `/stop`, `/help`, `/start` | Does not enter the Telegram queue; `/help` opens the same menu as `/start`; `/stop` also clears queued items | N/A |
|
|
101
|
+
| Queued prompt command | `/continue`, `/template_name args` | `/continue` enqueues a Telegram-owned `continue` prompt; prompt-template commands expand the matching π template before entering the normal prompt queue | priority for `/continue`, otherwise default |
|
|
99
102
|
| Control queue | Model-switch continuation turns and future deferred controls | `queueLane: control`; accepts control items and continuation prompts | 0 |
|
|
100
|
-
| Priority prompt queue | A waiting prompt promoted by
|
|
103
|
+
| Priority prompt queue | A waiting prompt promoted by `👍`, `⚡️`, `❤️`, or `🕊` | `kind: prompt`, `queueLane: priority` | 1 |
|
|
101
104
|
| Default prompt queue | Normal Telegram text/media turns | `kind: prompt`, `queueLane: default` | 2 |
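As a rough illustration of the dispatch ranks in the table above, the following sketch orders queued items by lane rank while keeping arrival order inside each lane. The `QueuedItem` shape and the `dispatchOrder` helper are assumptions for illustration; only the lane names and the ranks 0, 1, and 2 come from the table.

```typescript
type QueueLane = "control" | "priority" | "default";

// Ranks taken from the admission table: lower rank dispatches first.
const DISPATCH_RANK: Record<QueueLane, number> = { control: 0, priority: 1, default: 2 };

// Hypothetical item shape (assumption): the real queue items carry more fields.
interface QueuedItem {
  id: number;
  queueLane: QueueLane;
}

function dispatchOrder(items: QueuedItem[]): QueuedItem[] {
  // Array.prototype.sort is stable, so items in the same lane keep arrival order.
  return [...items].sort((a, b) => DISPATCH_RANK[a.queueLane] - DISPATCH_RANK[b.queueLane]);
}

const ordered = dispatchOrder([
  { id: 1, queueLane: "default" },
  { id: 2, queueLane: "priority" },
  { id: 3, queueLane: "control" },
  { id: 4, queueLane: "priority" },
]);
console.log(ordered.map((item) => item.id));
```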
|
|
102
105
|
|
|
103
|
-
The command action itself carries its execution mode, and the queue domain exposes lane contracts for admission mode, dispatch rank, and allowed item kinds. Queue append and planning paths validate lane admission so a malformed control/default or other invalid lane pairing fails predictably instead of silently changing priority. This lets synthetic control actions and Telegram prompts share one stable ordering model while still rendering distinctly in status output. In the
|
|
106
|
+
The command action itself carries its execution mode, and the queue domain exposes lane contracts for admission mode, dispatch rank, and allowed item kinds. Queue append and planning paths validate lane admission so a malformed control/default or other invalid lane pairing fails predictably instead of silently changing priority. This lets synthetic control actions and Telegram prompts share one stable ordering model while still rendering distinctly in status output. In the π status bar, busy labels distinguish `active`, `dispatching`, `queued`, `tool running`, `model`, and `compacting`; priority prompts and priority control items are marked with `⚡`.
|
|
104
107
|
|
|
105
108
|
A dispatched prompt remains in the queue until `agent_start` consumes it. That keeps the active Telegram turn bound correctly for previews, attachments, abort handling, and final reply delivery.
|
|
106
109
|
|
|
107
110
|
Dispatch is gated by:
|
|
108
111
|
|
|
109
112
|
- No active Telegram turn
|
|
110
|
-
- No pending Telegram dispatch already sent to
|
|
113
|
+
- No pending Telegram dispatch already sent to π
|
|
111
114
|
- No compaction in progress
|
|
112
115
|
- `ctx.isIdle()` being true
|
|
113
116
|
- `ctx.hasPendingMessages()` being false
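The five gate conditions above can be read as a single predicate. The sketch below assumes a plain snapshot object; the field names are hypothetical, with `isIdle` and `hasPendingMessages` standing in for the results of `ctx.isIdle()` and `ctx.hasPendingMessages()`.

```typescript
// Hypothetical snapshot of bridge state (assumption); the field names are illustrative.
interface DispatchGateState {
  hasActiveTurn: boolean;
  hasPendingDispatch: boolean;
  isCompacting: boolean;
  isIdle: boolean; // result of ctx.isIdle()
  hasPendingMessages: boolean; // result of ctx.hasPendingMessages()
}

function canDispatch(state: DispatchGateState): boolean {
  // Every condition must hold; one failing check blocks dispatch.
  return (
    !state.hasActiveTurn &&
    !state.hasPendingDispatch &&
    !state.isCompacting &&
    state.isIdle &&
    !state.hasPendingMessages
  );
}

const idleState: DispatchGateState = {
  hasActiveTurn: false,
  hasPendingDispatch: false,
  isCompacting: false,
  isIdle: true,
  hasPendingMessages: false,
};
console.log(canDispatch(idleState));
console.log(canDispatch({ ...idleState, isCompacting: true }));
```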
|
|
114
117
|
|
|
115
|
-
This prevents queue races around rapid follow-ups, `/compact`, and mixed local plus Telegram activity. Post-agent-end dispatch retries are scheduled through a session-bound deferred dispatcher that activates on session start, cancels timers on session shutdown, and skips callbacks from older generations before they touch `ExtensionContext`. Telegram `/
|
|
118
|
+
This prevents queue races around rapid follow-ups, `/compact`, and mixed local plus Telegram activity. Post-agent-end dispatch retries are scheduled through a session-bound deferred dispatcher that activates on session start, cancels timers on session shutdown, and skips callbacks from older generations before they touch `ExtensionContext`. Telegram `/start` and hidden compatibility shortcuts `/status`, `/model`, `/thinking`, and `/queue` execute immediately; the dispatch controller still serializes any deferred control items so a queued control action must settle before the next queued action can dispatch.
|
|
119
|
+
|
|
120
|
+
`/start` opens the main application menu: visible command help, compact command-only prompt-template rows when π exposes Telegram-compatible prompt-template names, status rows (`Status`, `Usage`, `Cost`, `Context`), and top-level buttons for model, thinking, and queue sections. The Queue button includes the current queued-item count. Hidden compatibility shortcuts `/help`, `/status`, `/model`, `/thinking`, and `/queue` jump directly to their corresponding menu screens. Command emoji come from the `commands` domain map so visible command descriptions and matching menu buttons share one fixed adornment source. Prompt-template commands use a fixed `🧩` marker, map π template names to Telegram-safe aliases such as `fix-tests` → `/fix_tests`, are registered in the Telegram bot command menu when the mapped command does not conflict with built-in bridge commands or hidden shortcuts, and expand before queueing because `ExtensionAPI.sendUserMessage()` intentionally bypasses π prompt-template expansion for extension-originated messages. Every submenu starts with a top Back row so navigation stays anchored near the original user message above the inline keyboard; model-menu scope and pagination controls sit directly under that top row before model choices, and tapping the pagination indicator opens a compact page picker headed by `<b>Choose a page:</b>`. `menu-model` owns model-menu state, scoped model pages, model callback planning, page-picker rendering, and model-menu rendering while `model` owns core model identity/switching semantics. `menu-thinking` owns thinking-menu text, reply markup, callback handling, and message rendering. `menu-status` owns status-menu payloads, status callback handling, and status-message rendering. `menu-queue` owns queue-menu UI only: queue items are rendered under a compact `<b>Queue:</b>` heading, top-to-bottom in dispatch order, numbered, and marked with `⚡` for priority prompts or `📎` for prompts with attachments. 
An empty queue renders bold message text plus the top Main menu button, not a disabled empty-state button. Selecting an item opens a submenu that displays the full queued prompt text with Back, priority toggle, and Cancel. If a callback targets an item that has already left the queue, the menu refreshes the list instead of applying a stale mutation.
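The template-name mapping mentioned above (`fix-tests` → `/fix_tests`) might look roughly like the sketch below. The normalization rule and the reserved list are assumptions: the reserved names are simply the commands this document mentions, and the 1 to 32 character limit with lowercase letters, digits, and underscores follows the Telegram Bot API rule for bot command names.

```typescript
// Commands named in this document; the bridge's real reserved list is an assumption.
const RESERVED_COMMANDS = new Set([
  "start", "help", "status", "model", "thinking", "queue",
  "compact", "next", "continue", "abort", "stop",
]);

// Hypothetical helper: returns a Telegram-safe command name, or undefined when
// the template cannot be registered.
function toTelegramCommand(templateName: string): string | undefined {
  const alias = templateName
    .toLowerCase()
    .replace(/[^a-z0-9_]+/g, "_") // collapse unsupported characters into underscores
    .replace(/^_+|_+$/g, ""); // trim underscores left at either end
  // Telegram command names must be 1-32 characters long.
  if (alias.length === 0 || alias.length > 32) return undefined;
  // Never shadow a built-in bridge command or hidden shortcut.
  if (RESERVED_COMMANDS.has(alias)) return undefined;
  return alias;
}

console.log(toTelegramCommand("fix-tests"));
console.log(toTelegramCommand("Stop"));
```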
|
|
116
121
|
|
|
117
122
|
### Abort Behavior
|
|
118
123
|
|
|
@@ -155,7 +160,7 @@ Telegram prompt responses use explicit delivery context to attach outbound text,
|
|
|
155
160
|
|
|
156
161
|
Outbound files are sent only after the active Telegram turn completes, must be staged through the `telegram_attach` tool, are staged atomically per tool call, are checked against a default 50 MiB limit configurable through `PI_TELEGRAM_OUTBOUND_ATTACHMENT_MAX_BYTES` or `TELEGRAM_MAX_ATTACHMENT_SIZE_BYTES`, and use file-backed multipart blobs so large sends do not require preloading whole files into memory.
|
|
157
162
|
|
|
158
|
-
Assistant-authored outbound actions use final-message markup instead of agent tool calls. Preview updates strip closed top-level HTML comments and currently open/partial top-level comment starts before rendering, so users do not see transient metadata even when streaming flushes happen after only `<`, `<!`, or `<!--`. On `agent_end`, the bridge removes top-level comments from the Markdown text reply, but treats column-zero top-level `<!-- telegram_voice ... -->` and `<!-- telegram_button ... -->` blocks specially before delivery; comments inside fenced code, quotes, lists, or indented examples stay literal, including fenced blocks with Markdown-valid indented closing fences. Voice maps to the first matching `outboundHandlers[]` entry with `type: "voice"`, synthesizes
|
|
163
|
+
Assistant-authored outbound actions use final-message markup instead of agent tool calls. Preview updates strip closed top-level HTML comments and currently open/partial top-level comment starts before rendering, so users do not see transient metadata even when streaming flushes happen after only `<`, `<!`, or `<!--`. On `agent_end`, the bridge removes top-level comments from the Markdown text reply, but treats column-zero top-level `<!-- telegram_voice ... -->` and `<!-- telegram_button ... -->` blocks specially before delivery; comments inside fenced code, quotes, lists, or indented examples stay literal, including fenced blocks with Markdown-valid indented closing fences. Voice maps to the first matching `outboundHandlers[]` entry with `type: "voice"`, synthesizes body text, `text="..."`, or colon shorthand through command-template execution, and uploads the generated OGG/Opus file via Telegram `sendVoice`; when no outbound voice handler is configured, it silently skips voice delivery. The `template: [...]` form can express TTS plus MP3-to-OGG conversion using configured templates and bridge-provided `{text}`, `{mp3}`, and `{ogg}` placeholders. Top-level `args` and `defaults` apply to all composed steps unless a step defines private values, the default command timeout applies automatically, and each step receives the previous step's stdout on stdin by default, without hard-coded filesystem defaults. Button blocks are built in: each `telegram_button` block becomes one inline-keyboard button on the final text, and callback clicks enqueue the configured prompt text as a normal Telegram prompt turn; the `telegram_button: Label` shorthand uses the same text for label and prompt, `prompt="..."` supports explicit one-line prompts, and body-form buttons use the body as the prompt. 
This keeps technical Markdown, code, tables, formulas, and numbered lists in the text channel when appropriate while allowing TTS-friendly voice messages and tappable continuations without invoking `telegram_attach` or extra transport tools.
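The preview-side stripping described above can be approximated with three passes. This is a simplified sketch under a stated assumption: it treats any comment starting at column zero as top-level and does not track fenced code blocks, which the real bridge keeps literal. The function name is hypothetical.

```typescript
// Hypothetical helper (assumption): fenced-code awareness is deliberately left out.
function stripPreviewComments(text: string): string {
  // 1. Drop closed comments that start at column zero, plus one trailing newline.
  let out = text.replace(/^<!--[\s\S]*?-->\n?/gm, "");
  // 2. Drop a still-open column-zero comment through the end of the streamed text.
  out = out.replace(/^<!--[\s\S]*$/m, "");
  // 3. Drop a final line holding only a partial comment start: "<", "<!", or "<!-".
  out = out.replace(/(^|\n)<(?:!-?)?$/, "$1");
  return out;
}

console.log(JSON.stringify(stripPreviewComments("Hello\n<!-- telegram_voice: hi -->\nWorld")));
console.log(JSON.stringify(stripPreviewComments("Hello\n<!-- telegram_voi")));
console.log(JSON.stringify(stripPreviewComments("Hello\n<!")));
```

Indented comments are left alone because every pattern is anchored at the start of a line, which matches the column-zero rule in the text.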
|
|
159
164
|
|
|
160
165
|
## Interactive Controls
|
|
161
166
|
|
|
@@ -163,17 +168,21 @@ The bridge exposes Telegram-side session controls in addition to regular chat fo
|
|
|
163
168
|
|
|
164
169
|
Current operator controls include:
|
|
165
170
|
|
|
166
|
-
- `/
|
|
167
|
-
- Inline
|
|
168
|
-
- `/model`
|
|
169
|
-
- `/compact` for Telegram-triggered
|
|
171
|
+
- `/start` for the main application menu: command help, prompt-template commands, model, usage, cost, context visibility, and inline controls, executed immediately from Telegram even while generation is active
|
|
172
|
+
- Inline application-menu buttons for model, thinking, and queue controls, applying idle selections immediately while still respecting busy-run restart rules; model-menu inputs are cached briefly and stored inline-menu states are pruned by TTL/LRU so old keyboards expire predictably
|
|
173
|
+
- Hidden `/model` and `/thinking` shortcuts for opening the model and thinking sections directly while keeping settings out of the visible bot command menu
|
|
174
|
+
- `/compact` for Telegram-triggered π session compaction when the bridge is idle
|
|
175
|
+
- `/queue` for opening the queue section of the inline application menu; the same section is reachable from the status/main menu and supports top-anchored Back navigation, priority toggling, and cancellation
|
|
176
|
+
- `/next` for dispatching the next queued turn, aborting the active run first when π is busy
|
|
177
|
+
- `/continue` for enqueueing a Telegram-owned `continue` prompt, without aborting the current turn or forcing the next queued item
|
|
178
|
+
- `/abort` for aborting the active Telegram-owned run while preserving queued items for manual continuation
|
|
170
179
|
- `/stop` for aborting the active Telegram-owned run and clearing waiting Telegram queue items
|
|
171
|
-
- `/telegram-status` for
|
|
172
|
-
- Queue reactions
|
|
180
|
+
- `/telegram-status` for π-side diagnostics as grouped line-by-line sections separated by blank lines: connection, polling, execution, queue, and the recent redacted runtime/API event ring. These sections include polling state, last update id, active turn source ids, pending dispatch, compaction state, active tool count, pending model-switch state, total queue depth, and queue-lane counts. The event ring records transport/API, polling/update, prompt-dispatch, control-action, typing, compaction, setup, session-lifecycle, and attachment queue/delivery failures; benign unchanged edit responses and unsupported empty draft-clear attempts are filtered out so expected preview transport noise does not obscure real failures
|
|
181
|
+
- Queue reactions apply to waiting text, voice, file, image, and media-group turns by matching the turn's source Telegram message ids: `👍`, `⚡️`, `❤️`, and `🕊` promote waiting prompts, while `👎`, `👻`, `💔`, and `💩` remove waiting turns because ordinary Telegram DM message deletions are not exposed through the Bot API polling path this bridge uses
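The reaction mapping in the last bullet above can be sketched as a lookup. The emoji sets come from the text; stripping the U+FE0F variation selector before comparing is an assumption about how the emoji arrive, and the function name is hypothetical.

```typescript
type ReactionAction = "promote" | "remove" | "ignore";

// Assumption: normalize away the variation selector so "⚡" and "⚡️" compare equal.
const stripVariation = (emoji: string): string => emoji.replace(/\uFE0F/g, "");

const PROMOTE_REACTIONS = new Set(["👍", "⚡️", "❤️", "🕊"].map(stripVariation));
const REMOVE_REACTIONS = new Set(["👎", "👻", "💔", "💩"].map(stripVariation));

// Hypothetical classifier for a single reaction emoji on a waiting turn.
function classifyReaction(emoji: string): ReactionAction {
  const normalized = stripVariation(emoji);
  if (PROMOTE_REACTIONS.has(normalized)) return "promote";
  if (REMOVE_REACTIONS.has(normalized)) return "remove";
  return "ignore";
}

console.log(classifyReaction("👍"));
console.log(classifyReaction("💩"));
console.log(classifyReaction("🎉"));
```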
|
|
173
182
|
|
|
174
183
|
## In-Flight Model Switching
|
|
175
184
|
|
|
176
|
-
When `/model` is used during an active Telegram-owned run, the bridge can emulate the interactive
|
|
185
|
+
When `/model` is used during an active Telegram-owned run, the bridge can emulate the interactive π workflow of stopping, switching model, and continuing.
|
|
177
186
|
|
|
178
187
|
The current implementation does this by:
|
|
179
188
|
|
|
@@ -182,7 +191,7 @@ The current implementation does this by:
|
|
|
182
191
|
3. Aborting the active Telegram turn immediately, or delaying the abort until the current tool finishes when a tool call is in flight
|
|
183
192
|
4. Dispatching the continuation turn after the abort completes
|
|
184
193
|
|
|
185
|
-
This behavior is intentionally limited to runs currently owned by the Telegram bridge. If
|
|
194
|
+
This behavior is intentionally limited to runs currently owned by the Telegram bridge. If π is busy with non-Telegram work, the bridge still refuses the switch instead of hijacking unrelated session activity.
|
|
186
195
|
|
|
187
196
|
## Related
|
|
188
197
|
|
package/docs/attachment-handlers.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Attachment Handlers
|
|
2
2
|
|
|
3
|
-
`pi-telegram` can run ordered inbound attachment handlers after downloading files and before the Telegram turn enters the
|
|
3
|
+
`pi-telegram` can run ordered inbound attachment handlers after downloading files and before the Telegram turn enters the π queue.
|
|
4
4
|
|
|
5
5
|
This document is the local adaptation of the portable [Command Template Standard](./command-templates.md).
|
|
6
6
|
|
|
@@ -13,19 +13,17 @@ This document is the local adaptation of the portable [Command Template Standard
|
|
|
13
13
|
"attachmentHandlers": [
|
|
14
14
|
{
|
|
15
15
|
"type": "voice",
|
|
16
|
-
"template": "/path/to/stt1 --file {file} --lang {lang=ru}"
|
|
17
|
-
"timeout": 30000
|
|
16
|
+
"template": "/path/to/stt1 --file {file} --lang {lang=ru}"
|
|
18
17
|
},
|
|
19
18
|
{
|
|
20
19
|
"mime": "audio/*",
|
|
21
|
-
"template": "/path/to/stt2 --file {file} --lang {lang=ru}"
|
|
22
|
-
"timeout": 30000
|
|
20
|
+
"template": "/path/to/stt2 --file {file} --lang {lang=ru}"
|
|
23
21
|
}
|
|
24
22
|
]
|
|
25
23
|
}
|
|
26
24
|
```
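To illustrate how a template such as `/path/to/stt1 --file {file} --lang {lang=ru}` could become an argument vector without a shell, here is a minimal sketch. The splitting rule (plain whitespace, no quoting) and the `expandTemplate` name are assumptions; the standard's actual parsing rules are not shown in this diff.

```typescript
// Hypothetical expander (assumption): splits on whitespace first, then substitutes,
// so a value containing spaces stays a single argument.
function expandTemplate(template: string, values: Record<string, string | undefined>): string[] {
  return template
    .trim()
    .split(/\s+/)
    .map((token) =>
      token.replace(/\{(\w+)(?:=([^}]*))?\}/g, (_match, name: string, inlineDefault?: string) => {
        // A provided value wins; otherwise fall back to the inline {name=default}.
        const value = values[name] ?? inlineDefault;
        if (value === undefined) throw new Error(`Missing value for placeholder {${name}}`);
        return value;
      }),
    );
}

const argv = expandTemplate("/path/to/stt1 --file {file} --lang {lang=ru}", {
  file: "/tmp/voice note.ogg",
});
console.log(argv);
```

Because no shell is involved, the first element is the executable and the rest are passed as literal arguments, so placeholder values cannot inject extra commands.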
|
|
27
25
|
|
|
28
|
-
Handlers match by `type`, `mime`, or `match`. Wildcards such as `audio/*` are accepted. Each matching handler must provide `template`; a string is one command, and an array is ordered composition. Top-level `args` and `defaults` apply to composed steps unless a step defines private values
|
|
26
|
+
Handlers match by `type`, `mime`, or `match`. Wildcards such as `audio/*` are accepted. Each matching handler must provide `template`; a string is one command, and an array is ordered composition. Top-level `args` and `defaults` apply to composed steps unless a step defines private values. The command-template default timeout applies automatically. Legacy configs may still use `pipe` as a local alias.
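A rough sketch of ordered handler selection with MIME wildcards follows. It makes two assumptions: a handler matches when either its `type` or its `mime` selector matches, and the `match` selector is left out because its semantics are not described in this hunk. All names are hypothetical.

```typescript
// Hypothetical shapes (assumption): real handlers also carry template, args, and defaults.
interface HandlerSelector {
  type?: string;
  mime?: string;
}

interface InboundAttachment {
  type: string;
  mime: string;
}

function mimeMatches(pattern: string, mime: string): boolean {
  const p = pattern.toLowerCase();
  const m = mime.toLowerCase();
  // "audio/*" matches any subtype under "audio/".
  if (p.endsWith("/*")) return m.startsWith(p.slice(0, -1));
  return p === m;
}

// Handlers are ordered, so the first match wins.
function firstMatchingHandler<T extends HandlerSelector>(
  handlers: T[],
  attachment: InboundAttachment,
): T | undefined {
  return handlers.find(
    (handler) =>
      (handler.type !== undefined && handler.type === attachment.type) ||
      (handler.mime !== undefined && mimeMatches(handler.mime, attachment.mime)),
  );
}

const handlers = [
  { type: "voice", template: "/path/to/stt1 --file {file} --lang {lang=ru}" },
  { mime: "audio/*", template: "/path/to/stt2 --file {file} --lang {lang=ru}" },
];
console.log(firstMatchingHandler(handlers, { type: "audio", mime: "audio/mpeg" })?.template);
```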
|
|
29
27
|
|
|
30
28
|
## Template Placeholders
|
|
31
29
|
|
package/docs/command-templates.md
CHANGED
|
@@ -1,6 +1,14 @@
|
|
|
1
1
|
# Command Template Standard
|
|
2
2
|
|
|
3
|
-
Command templates are the portable integration format for deterministic local automation.
|
|
3
|
+
Command templates are the portable integration format for deterministic local automation.
|
|
4
|
+
|
|
5
|
+
**Meta-contract:** transportable (bit-for-bit identical across projects), high-density (zero fluff), constant (evolve by crystallizing, not speculating), optimal minimum (add only when it hurts).
|
|
6
|
+
|
|
7
|
+
**Scope:** portable command execution format — shell-free exec, composition/pipes, default timeout, critical-step branching, output artifact selection, handler-level fallback. Single JSON standard; no platform lock-in.
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
Extensions may choose their own config files, selectors, placeholder sources, and examples, but should preserve this core contract.
|
|
4
12
|
|
|
5
13
|
## Shape
|
|
6
14
|
|
|
@@ -22,13 +30,14 @@ There is no portable `command` field. The command is derived from `template`: af
|
|
|
22
30
|
|
|
23
31
|
Common object fields:
|
|
24
32
|
|
|
25
|
-
| Field | Meaning
|
|
26
|
-
| ---------- |
|
|
27
|
-
| `template` | Required command string or ordered composition array
|
|
28
|
-
| `args` | Optional placeholder-name declarations only; never stores defaults
|
|
29
|
-
| `defaults` | Placeholder default values by name
|
|
30
|
-
| `timeout` | Optional execution timeout in milliseconds
|
|
31
|
-
| `output` | Optional result selector; default
|
|
33
|
+
| Field | Meaning |
|
|
34
|
+
| ---------- | ------------------------------------------------------------------------------------------ |
|
|
35
|
+
| `template` | Required command string or ordered composition array |
|
|
36
|
+
| `args` | Optional placeholder-name declarations only; never stores defaults |
|
|
37
|
+
| `defaults` | Placeholder default values by name |
|
|
38
|
+
| `timeout` | Optional execution timeout override in milliseconds; default `30000` (30s) |
|
|
39
|
+
| `output` | Optional result selector; default `"stdout"`, or a "runtime value", e.g. `"ogg"` |
|
|
40
|
+
| `critical` | Optional boolean; default `false`. When `true`, failure aborts the entire root composition |
|
|
32
41
|
|
|
33
42
|
Storage paths, labels, selectors, descriptions, and registry-specific metadata belong to each extension's local schema.
|
|
34
43
|
|
|
@@ -113,7 +122,7 @@ Composition rules:
|
|
|
113
122
|
- Treat the whole composition as one handler for selector matching and fallback
|
|
114
123
|
- Top-level `args` and `defaults` apply to every leaf unless the leaf defines private values
|
|
115
124
|
- Leaf `args` replace inherited `args`; leaf `defaults` merge over inherited defaults; `timeout` and `output` are not inherited into leaves
|
|
116
|
-
-
|
|
125
|
+
- Default `30000` (30s) timeout applies automatically; configure `timeout` only for exceptional long-running commands
|
|
117
126
|
- Each leaf receives the previous leaf's stdout on stdin by default, while the final leaf stdout remains the default composition result
|
|
118
127
|
- Each leaf still applies its own inline defaults
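The inheritance rules in the list above can be sketched as one merge step per leaf. The `TemplateNode` shape and `resolveLeaf` name are assumptions; the rules themselves (leaf `args` replace, leaf `defaults` merge, `timeout` and `output` are not inherited, default timeout `30000`) come from the text.

```typescript
// Hypothetical node shape (assumption) covering only the fields discussed here.
interface TemplateNode {
  template: string;
  args?: string[];
  defaults?: Record<string, string>;
  timeout?: number;
  output?: string;
}

const DEFAULT_TIMEOUT_MS = 30000;

function resolveLeaf(root: Omit<TemplateNode, "template">, leaf: TemplateNode): TemplateNode {
  return {
    template: leaf.template,
    // Leaf args replace inherited args wholesale.
    args: leaf.args ?? root.args,
    // Leaf defaults merge over inherited defaults, key by key.
    defaults: { ...root.defaults, ...leaf.defaults },
    // timeout and output are never inherited; timeout falls back to the standard default.
    timeout: leaf.timeout ?? DEFAULT_TIMEOUT_MS,
    output: leaf.output,
  };
}

const resolved = resolveLeaf(
  { args: ["text", "lang"], defaults: { lang: "ru", speed: "1" }, timeout: 120000, output: "ogg" },
  { template: "/path/to/tts {text}", defaults: { lang: "en" } },
);
console.log(resolved);
```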
|
|
119
128
|
|
|
@@ -136,6 +145,41 @@ Composition rules:
|
|
|
136
145
|
|
|
137
146
|
Legacy local schemas may accept `pipe` as an alias, but the portable standard is `template: [...]`.
|
|
138
147
|
|
|
148
|
+
## Fail-Open Default Policy
|
|
149
|
+
|
|
150
|
+
By default, composition continues on failure: the failed step is logged and the next step executes. This is analogous to `make -k` — the user sees all failures at once and decides what to fix.
|
|
151
|
+
|
|
152
|
+
## Critical Steps
|
|
153
|
+
|
|
154
|
+
Set `critical: true` on any leaf to abort the entire root composition on failure. One `critical` leaf can halt the whole pipeline.
|
|
155
|
+
|
|
156
|
+
```json
|
|
157
|
+
{
|
|
158
|
+
"template": [
|
|
159
|
+
{ "template": "cargo build" },
|
|
160
|
+
{ "template": "cargo fmt --check" },
|
|
161
|
+
{ "template": "cargo test", "critical": true }
|
|
162
|
+
]
|
|
163
|
+
}
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
A `cargo build` or `cargo fmt --check` failure is logged and execution continues. A `cargo test` failure aborts the root composition immediately.
|
|
167
|
+
|
|
168
|
+
A `critical` leaf in a nested composition still aborts the outermost root `template: [...]`. There is no per-branch scoping in the current standard.
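The fail-open policy and the root-wide effect of `critical` can be sketched as a small runner. This is an illustration under stated assumptions, not the package's executor: `Step`, `StepTree`, and `runComposition` are hypothetical names, and each step is modelled as a synchronous function that returns `true` on success.

```typescript
// Hypothetical step model: `run` returns true on success, false on failure.
interface Step {
  name: string;
  critical?: boolean;
  run: () => boolean;
}

// A composition is a list whose entries are leaves or nested compositions.
type StepTree = Step | StepTree[];

interface RunResult {
  aborted: boolean;
  executed: string[];
  failures: string[];
}

function runComposition(root: StepTree[]): RunResult {
  const result: RunResult = { aborted: false, executed: [], failures: [] };

  // Returns false when the whole root composition must stop.
  const visit = (node: StepTree): boolean => {
    if (Array.isArray(node)) {
      for (const child of node) {
        // A critical failure anywhere below propagates to the root:
        // there is no per-branch scoping.
        if (!visit(child)) return false;
      }
      return true;
    }
    result.executed.push(node.name);
    if (node.run()) return true;
    // Fail-open: record the failure and continue unless the leaf is critical.
    result.failures.push(node.name);
    return !node.critical;
  };

  result.aborted = !visit(root);
  return result;
}
```

In this sketch a failing non-critical step only adds to `failures`, while a failing critical step stops every remaining step, including steps outside its own nested branch.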
|
|
169
|
+
|
|
170
|
+
## Progressive Disclosure
|
|
171
|
+
|
|
172
|
+
The standard uses a single `template` field that grows with the user's needs:
|
|
173
|
+
|
|
174
|
+
```text
|
|
175
|
+
string → leaf command
|
|
176
|
+
string[] → sequential composition
|
|
177
|
+
{ template } → leaf with defaults
|
|
178
|
+
{ template, critical, output } → full leaf
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
Start with a string. Add composition when needed. Add `critical` when safety matters. Same contract, growing capability, no dead weight.
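One way to picture the growing `template` field is as a union type that is normalized into a uniform list of leaves. The sketch below is illustrative only: `TemplateField`, `LeafObject`, and `normalizeTemplate` are hypothetical names, nested compositions are left out for brevity, and the fallback values come from the field table (`critical` defaults to `false`, `output` to `"stdout"`).

```typescript
// Hypothetical types covering the shapes listed above (one nesting level only).
type LeafObject = { template: string; critical?: boolean; output?: string };
type TemplateField = string | Array<string | LeafObject>;

interface NormalizedLeaf {
  template: string;
  critical: boolean;
  output: string;
}

function normalizeTemplate(field: TemplateField): NormalizedLeaf[] {
  // A bare string is a single leaf command.
  const entries = typeof field === "string" ? [field] : field;
  return entries.map((entry) => {
    // A string inside an array is a leaf with every option at its default.
    const leaf: LeafObject = typeof entry === "string" ? { template: entry } : entry;
    return {
      template: leaf.template,
      critical: leaf.critical ?? false,
      output: leaf.output ?? "stdout",
    };
  });
}
```

Every input shape ends up as the same leaf record, which is why a config can start as a plain string and grow without changing its contract.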
|
|
182
|
+
|
|
139
183
|
## Tool Boundary
|
|
140
184
|
|
|
141
185
|
Agent tools are a separate abstraction. A tool name is not a portable command template because the pi extension API exposes tool registration metadata, not a public extension-to-extension `executeTool(name, args)` contract. Until such an API exists, extensions should use command templates for deterministic local automation.
|
package/docs/locks.md
CHANGED
|
@@ -1,5 +1,9 @@
|
|
|
1
1
|
# Extension Locks Standard
|
|
2
2
|
|
|
3
|
+
**Meta-contract:** transportable (bit-for-bit identical across projects), high-density (zero fluff), constant (evolve by crystallizing, not speculating), optimal minimum (add only when it hurts).
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
3
7
|
`locks.json` is a shared registry for singleton pi extensions.
|
|
4
8
|
|
|
5
9
|
Path:
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
`pi-telegram` maps hidden assistant-authored HTML comments to Telegram-native outbound actions.
|
|
4
4
|
|
|
5
|
-
This is intentionally prompt-driven: the agent writes normal Markdown plus small hidden top-level blocks, and the bridge performs the transport work after `agent_end`. `telegram_voice` and `telegram_button` are not
|
|
5
|
+
This is intentionally prompt-driven: the agent writes normal Markdown plus small hidden top-level blocks, and the bridge performs the transport work after `agent_end`. `telegram_voice` and `telegram_button` are not π tools. Outbound behavior is an emergent result of the assistant prompt, configured command-template handlers, generated artifacts, and reply delivery. That avoids extra agent-side tool calls, avoids fragile parameter plumbing inside the conversation, and minimizes latency because text, voice, and buttons are planned in one standard assistant reply.
|
|
6
6
|
|
|
7
7
|
This document is the local outbound adaptation of the portable [Command Template Standard](./command-templates.md).
|
|
8
8
|
|
|
@@ -15,7 +15,7 @@ An outbound handler is selected by `type`. Assistant markup maps to handler type
|
|
|
15
15
|
| `telegram_voice` | `voice` | Generate OGG/Opus and call `sendVoice` |
|
|
16
16
|
| `telegram_button` | Built-in | Attach an inline keyboard button to the final text |
|
|
17
17
|
|
|
18
|
-
Configured command-template handlers provide `template`. A string is one command; an array is ordered composition. Top-level `args
|
|
18
|
+
Configured command-template handlers provide `template`. A string is one command; an array is ordered composition. Top-level `args` and `defaults` apply to all composed steps unless a step defines private values. The command-template default timeout applies automatically. `output` selects the primary artifact path when the handler produces a file instead of stdout text. Legacy configs may still use `pipe`, but `template: [...]` is the preferred standard shape.
|
|
19
19
|
|
|
20
20
|
## Voice Handler Config
|
|
21
21
|
|
|
@@ -30,8 +30,7 @@ Configured command-template handlers provide `template`. A string is one command
|
|
|
30
30
|
"/path/to/tts --text {text} --lang {lang=ru} --rate {rate=+30%} --write-media {mp3}",
|
|
31
31
|
"ffmpeg -y -i {mp3} -c:a libopus -b:a 32k -ar 16000 -ac 1 -vbr on {ogg}"
|
|
32
32
|
],
|
|
33
|
-
"output": "ogg"
|
|
34
|
-
"timeout": 120000
|
|
33
|
+
"output": "ogg"
|
|
35
34
|
}
|
|
36
35
|
]
|
|
37
36
|
}
|
|
@@ -49,9 +48,13 @@ Full text answer stays here.
|
|
|
49
48
|
<!-- telegram_voice lang=ru rate=+30%
|
|
50
49
|
Text to synthesize as a Telegram voice message.
|
|
51
50
|
-->
|
|
51
|
+
|
|
52
|
+
<!-- telegram_voice lang=ru rate=+30% text="Short spoken companion summary." -->
|
|
53
|
+
|
|
54
|
+
<!-- telegram_voice: Short spoken companion summary. -->
|
|
52
55
|
```
|
|
53
56
|
|
|
54
|
-
The bridge strips the comment from Telegram text. On `agent_end`, it maps each `telegram_voice` block to `type: "voice"`, generates one file per block, and sends each file as an independent Telegram-native voice message. The opening `<!-- telegram_voice` marker must start at column zero on a top-level line outside fenced code, quotes, and lists; otherwise it is rendered as literal Markdown.
|
|
57
|
+
The bridge strips the comment from Telegram text. On `agent_end`, it maps each `telegram_voice` block to `type: "voice"`, generates one file per block, and sends each file as an independent Telegram-native voice message. The opening `<!-- telegram_voice` marker must start at column zero on a top-level line outside fenced code, quotes, and lists; otherwise it is rendered as literal Markdown. Body-form comments leave the opening line unclosed until the body-ending `-->`; closed heads can use `text="..."` for explicit one-line spoken text.
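The three accepted forms (body, closed head with `text="..."`, and colon shorthand) can be told apart with a small parser. The following is a rough sketch, not the bridge's actual parser: `parseVoiceBlock` and `VoiceBlock` are hypothetical names, the input is assumed to be one already-isolated comment block, and the column-zero and top-level checks described above are not modelled.

```typescript
interface VoiceBlock {
  text: string;
  attrs: Record<string, string>;
}

function parseVoiceBlock(block: string): VoiceBlock | null {
  const open = "<!-- telegram_voice";
  const trimmed = block.trim();
  if (!trimmed.startsWith(open) || !trimmed.endsWith("-->")) return null;

  const inner = trimmed.slice(open.length, -3);
  // The marker must be followed by a colon or whitespace.
  if (!/^[:\s]/.test(inner)) return null;

  // Colon shorthand: everything after the colon is the spoken text.
  if (inner.startsWith(":")) {
    const shorthand = inner.slice(1).trim();
    return shorthand ? { text: shorthand, attrs: {} } : null;
  }

  // Body form: the head line is left unclosed and the body follows it.
  const newline = inner.indexOf("\n");
  const head = newline === -1 ? inner : inner.slice(0, newline);
  const body = newline === -1 ? "" : inner.slice(newline + 1).trim();

  // Attributes are key=value pairs; values may be double-quoted.
  const parsed: Record<string, string> = {};
  const attrPattern = /([A-Za-z_][\w-]*)=(?:"([^"]*)"|(\S+))/g;
  let match: RegExpExecArray | null;
  while ((match = attrPattern.exec(head)) !== null) {
    parsed[match[1]] = match[2] ?? match[3];
  }

  // Closed-head form: `text="..."` supplies the text when there is no body.
  const { text: textAttr, ...attrs } = parsed;
  const text = body || textAttr || "";
  return text ? { text, attrs } : null;
}
```

Under these assumptions all three forms from the example resolve to a `text` value plus optional `lang` and `rate` overrides, which map onto the `{text}`, `{lang}`, and `{rate}` placeholders.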
|
|
55
58
|
|
|
56
59
|
## Built-In Voice Placeholders
|
|
57
60
|
|
|
@@ -59,7 +62,7 @@ Voice outbound handlers receive these runtime placeholders:
|
|
|
59
62
|
|
|
60
63
|
| Placeholder | Value |
|
|
61
64
|
| ----------- | -------------------------------------------------------- |
|
|
62
|
-
| `{text}` | Voice
|
|
65
|
+
| `{text}` | Voice text from body, `text="..."`, or colon shorthand |
|
|
63
66
|
| `{lang}` | Optional markup override such as `lang=ru` |
|
|
64
67
|
| `{rate}` | Optional markup override such as `rate=+30%` |
|
|
65
68
|
| `{mp3}` | Flat temp artifact path under `~/.pi/agent/tmp/telegram` |
|
|
@@ -73,28 +76,31 @@ For composed handlers, `output` selects the primary artifact after the compositi
|
|
|
73
76
|
|
|
74
77
|
For one-step `template` handlers, stdout remains the default result channel: the command should print the generated OGG/Opus path.
|
|
75
78
|
|
|
79
|
+
**Critical steps:** voice synthesis is a multi-step pipeline (TTS → ffmpeg → OGG). The ffmpeg conversion step is inherently critical — if it fails, the voice output is invalid. Mark it as `"critical": true` when a composed handler must abort after conversion failure instead of continuing to later non-critical steps. Keep the fallback chain (Mistral TTS → Groq TTS) as the safety net for persistent outages. See [Command Template Standard](./command-templates.md) for semantics.
|
|
80
|
+
|
|
76
81
|
## Buttons Markup
|
|
77
82
|
|
|
78
|
-
Assistant replies can include independent button blocks. The
|
|
83
|
+
Assistant replies can include independent button blocks. The prompt is sent back to π when the user taps the button; use the colon shorthand when the prompt should equal the label, `prompt="..."` for one-line prompts, or the body form for multiline prompts:
|
|
79
84
|
|
|
80
85
|
```md
|
|
81
86
|
I can continue.
|
|
82
87
|
|
|
83
|
-
<!-- telegram_button label="
|
|
84
|
-
Continue with the current plan.
|
|
85
|
-
-->
|
|
88
|
+
<!-- telegram_button label=Continue prompt="Continue with the current plan." -->
|
|
86
89
|
|
|
87
90
|
<!-- telegram_button label="Show risks"
|
|
88
91
|
List the main risks first.
|
|
89
92
|
-->
|
|
90
93
|
|
|
91
|
-
<!-- telegram_button
|
|
94
|
+
<!-- telegram_button: Done -->
|
|
92
95
|
```
|
|
93
96
|
|
|
94
97
|
Rules:
|
|
95
98
|
|
|
96
|
-
- `telegram_button
|
|
99
|
+
- `telegram_button: Label` creates one independent label-only button row whose prompt equals the label.
|
|
100
|
+
- `telegram_button label="Label" prompt="Prompt"` creates one independent button row whose prompt is the `prompt` attribute.
|
|
101
|
+
- `telegram_button label="Label"` with a body creates one independent button row whose prompt is the block body.
|
|
97
102
|
- The opening `<!-- telegram_button` marker must start at column zero on a top-level line outside fenced code, quotes, and lists; otherwise it is rendered as literal Markdown.
|
|
103
|
+
- Keep the canonical body form as `<!-- telegram_button label="Label"` + body + `-->`; closed heads must use `prompt="..."` or the colon shorthand to create a button.
|
|
98
104
|
- Use one block per button; this mirrors HTML's singular element model and avoids a nested button DSL inside comments.
|
|
99
105
|
- Button actions are stored in memory with short `callback_data`; Telegram never sees the full prompt in the button payload.
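The last rule can be pictured as an in-memory lookup table keyed by a short identifier. This is a minimal sketch under assumptions rather than the bridge's real storage: `ButtonRegistry` and the `btn:` key format are hypothetical, while `text` and `callback_data` are the Telegram Bot API `InlineKeyboardButton` fields (Telegram limits `callback_data` to 64 bytes).

```typescript
// Subset of the Telegram Bot API InlineKeyboardButton shape used here.
interface InlineButton {
  text: string;
  callback_data: string;
}

class ButtonRegistry {
  private readonly prompts = new Map<string, string>();
  private counter = 0;

  // Stores the full prompt locally and returns a button carrying only a short key.
  register(label: string, prompt: string): InlineButton {
    const key = `btn:${(this.counter++).toString(36)}`; // hypothetical key format
    this.prompts.set(key, prompt);
    return { text: label, callback_data: key };
  }

  // Looks up the prompt when Telegram reports a tap with this callback_data.
  resolve(callbackData: string): string | undefined {
    return this.prompts.get(callbackData);
  }
}
```

Because only the key travels in `callback_data`, long multiline prompts stay within Telegram's payload limit, at the cost that buttons stop resolving once the in-memory table is gone.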
|
|
100
106
|
|
|
@@ -105,8 +111,8 @@ Buttons are built in and do not need a command template because they are pure Te
|
|
|
105
111
|
The extension injects Telegram-specific system prompt guidance so agents know the fast path:
|
|
106
112
|
|
|
107
113
|
- Write the full technical answer as normal Markdown.
|
|
108
|
-
- Add `telegram_voice` when a Telegram-native voice message is useful;
|
|
109
|
-
- Add `telegram_button label="..."` for
|
|
114
|
+
- Add `telegram_voice` when a Telegram-native voice message is useful; use body text, `text="..."`, or colon shorthand for the text to synthesize. A companion summary is optional, and no specific summary format is required.
|
|
115
|
+
- Add `telegram_button: ...` when label equals prompt, `telegram_button label="..." prompt="..."` for one-line prompts, or `telegram_button label="..."` with a body for multiline prompts. If the reply contains only button/voice comment blocks, add a short visible marker (for example `Choose one:`) before them so Telegram always has a visible parent message for attachment.
|
|
110
116
|
- Do not call or register TTS/text-to-OGG/Telegram transport tools for voice or buttons; the bridge owns the configured outbound-handler pipeline and delivery.
|
|
111
117
|
|
|
112
118
|
This keeps the agent focused on semantics and lets the bridge handle low-latency Telegram adaptation.
|