@llblab/pi-telegram 0.7.1 → 0.8.0
- package/AGENTS.md +6 -5
- package/CHANGELOG.md +20 -1
- package/README.md +36 -11
- package/docs/README.md +2 -2
- package/docs/architecture.md +7 -7
- package/docs/callback-namespaces.md +1 -1
- package/docs/inbound-handlers.md +93 -0
- package/docs/outbound-handlers.md +40 -4
- package/index.ts +52 -32
- package/lib/config.ts +9 -3
- package/lib/inbound-handlers.ts +588 -0
- package/lib/menu-queue.ts +1 -1
- package/lib/menu-status.ts +4 -4
- package/lib/menu.ts +7 -3
- package/lib/{attachments.ts → outbound-attachments.ts} +34 -34
- package/lib/outbound-handlers.ts +245 -0
- package/lib/prompts.ts +1 -1
- package/lib/routing.ts +14 -4
- package/lib/text-groups.ts +203 -0
- package/package.json +1 -1
- package/docs/attachment-handlers.md +0 -50
- package/lib/attachment-handlers.ts +0 -423
package/AGENTS.md
CHANGED
|
@@ -23,7 +23,7 @@
|
|
|
23
23
|
## 3. Project Topology
|
|
24
24
|
|
|
25
25
|
- `/index.ts`: Main extension entrypoint and runtime composition layer for the bridge
|
|
26
|
-
- `/lib/*.ts`: Flat domain modules for reusable runtime logic. Favor domain files such as queueing/runtime, replies, polling, updates, attachments, commands, lifecycle hooks, prompts, prompt-templates, pi SDK adapter, Telegram API, config, turns, media, setup, rendering, app menu, menu-model, menu-thinking, menu-queue, status/model-resolution support, and other cohesive bridge subsystems; use `shared` only when a type or constant truly spans multiple domains
|
|
26
|
+
- `/lib/*.ts`: Flat domain modules for reusable runtime logic. Favor domain files such as queueing/runtime, replies, polling, updates, outbound-attachments, commands, lifecycle hooks, prompts, prompt-templates, pi SDK adapter, Telegram API, config, turns, media, setup, rendering, app menu, menu-model, menu-thinking, menu-queue, status/model-resolution support, and other cohesive bridge subsystems; use `shared` only when a type or constant truly spans multiple domains
|
|
27
27
|
- `/tests/*.test.ts`: Domain-mirrored regression suites that follow the same flat naming as `/lib`
|
|
28
28
|
- `/docs/README.md`: Documentation index for technical project docs
|
|
29
29
|
- `/docs/architecture.md`: Runtime and subsystem overview for the bridge
|
|
@@ -100,8 +100,8 @@
|
|
|
100
100
|
The canonical detailed ownership map lives in [`docs/architecture.md`](./docs/architecture.md). Keep this section as a compact agent-facing index, not a second copy of the full map.
|
|
101
101
|
|
|
102
102
|
- Scheduling and lifecycle: `queue`, `runtime`, `lifecycle`, `locks`
|
|
103
|
-
- Telegram transport and inbound flow: `api`, `polling`, `updates`, `routing`, `media`, `turns`, `
|
|
104
|
-
- Response surfaces: `preview`, `replies`, `rendering`, `keyboard`, `attachments`, `outbound-handlers`, `status`
|
|
103
|
+
- Telegram transport and inbound flow: `api`, `polling`, `updates`, `routing`, `media`, `turns`, `inbound-handlers`, `config`, `setup`
|
|
104
|
+
- Response surfaces: `preview`, `replies`, `rendering`, `keyboard`, `outbound-attachments`, `outbound-handlers`, `status`
|
|
105
105
|
- Controls and application menu UI: `commands`, `menu`, `menu-model`, `menu-thinking`, `menu-status`, `menu-queue`, `model`, `prompts`
|
|
106
106
|
- Pi SDK boundary: `pi` owns direct pi imports and bound extension API ports
|
|
107
107
|
|
|
@@ -131,8 +131,9 @@ The canonical detailed ownership map lives in [`docs/architecture.md`](./docs/ar
|
|
|
131
131
|
- For `/telegram-setup`, prefer the locally saved bot token over environment variables on repeat setup runs; env vars are the bootstrap path when no local token exists, and persisted `telegram.json` writes must remain atomic plus private because status/setup/polling paths may read it concurrently
|
|
132
132
|
- Command help plus prompt-template commands and status/model/thinking/queue controls are driven through `/start`'s Telegram inline application menu and callback queries; the Queue button shows the queued-item count, model-menu scope/pagination controls stay at the top under Main menu, the model pagination indicator opens a compact page picker, and thinking-menu text stays a compact heading because the current level is marked by button state; `/status`, `/model`, `/thinking`, and `/queue` are hidden compatibility shortcuts
|
|
133
133
|
- Shared inline-keyboard structure belongs to `keyboard`; application-control button labels, callback data, and callback behavior stay in `menu`/`menu-model`/`menu-thinking`/`menu-status`/`menu-queue` while core queue mechanics stay in `queue`
|
|
134
|
-
- Inbound
|
|
135
|
-
-
|
|
134
|
+
- Inbound text/media may be transformed through configured `inboundHandlers` before queueing; legacy `attachmentHandlers` are deprecated compatibility aliases appended after `inboundHandlers`; outbound files must flow through `telegram_attach`
|
|
135
|
+
- Long Telegram text split recovery belongs to `text-groups`: keep it conservative, short-debounced, same chat/user/message-id contiguous, and gated by near-limit human text so normal rapid follow-ups and slash commands stay separate
|
|
136
|
+
- Inbound handlers and command-backed outbound handlers use command templates as the standard integration contract; built-in outbound buttons use inline keyboards plus callback routing because no external command execution is needed
|
|
136
137
|
- Telegram prompt-template commands are discovered from π slash commands with `source: "prompt"`; π template names are mapped to Bot API-compatible aliases (`fix-tests` → `/fix_tests`), aliases that conflict with built-in bridge commands or hidden shortcuts are not displayed, prompt-template aliases stay out of the Telegram bot command menu, and the bridge expands template files before queueing because extension-originated `sendUserMessage()` bypasses π's interactive template expansion
|
|
137
138
|
- Unknown callback data not owned by pi-telegram prefixes (`tgbtn:`, `menu:`, `model:`, `thinking:`, `status:`, `queue:`) may be forwarded as `[callback] <data>` after built-in handlers decline it; external extensions should follow `docs/callback-namespaces.md` and must not poll the same bot independently
|
|
138
139
|
- Command templates stay compact and shell-free: no `command` field, no shell execution, inline defaults are allowed as `{name=default}`, `template` may be a string or an ordered composition array, only `args`/`defaults` inherit into leaves, top-level `timeout` wraps composed sequences, stdout pipes to the next step's stdin by default, and multi-step work should use `template: [...]` rather than provider-specific fields; `pipe` is only a legacy local alias
|
package/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,24 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.8.0: Handler Bus
|
|
4
|
+
|
|
5
|
+
- `[Inbound Handlers]` Added `inboundHandlers` as the provider-neutral Telegram → π transformation bus. Raw Telegram text can match `type: "text"`, `mime: "text/plain"`, or `mime: "text/*"`, receives text on stdin and `{text}`, and non-empty stdout replaces the prompt text before queueing; media/file handlers keep the existing `{file}`/`{mime}`/`{type}` behavior with optional independent selectors. Impact: translation, normalization, STT, OCR, and file extraction can share one command-template integration model.
|
|
6
|
+
- `[Text Attachments]` Attached `text/plain`/`text/*` files now have a built-in fail-open reader that injects UTF-8 content into `[outputs]` when no configured handler produced output. Impact: ordinary `.txt` and other text documents become readable to π without custom extraction config.
|
|
7
|
+
- `[Inbound Domain]` Renamed the implementation module and mirrored regression suite from `attachment-handlers` to `inbound-handlers`. Impact: file names now match the unified text/media preprocessing domain while legacy `attachmentHandlers` config remains supported.
|
|
8
|
+
- `[Outbound Attachment Domain]` Renamed the outbound file-delivery module and mirrored regression suite from `attachments` to `outbound-attachments`. Impact: `telegram_attach` ownership now reads as an outbound domain beside `outbound-handlers` while behavior stays unchanged.
|
|
9
|
+
- `[Inbound Docs]` Consolidated the deprecated `docs/attachment-handlers.md` page into `docs/inbound-handlers.md` and removed the old page. Impact: the inbound bus docs are now the canonical home for legacy `attachmentHandlers`, placeholders, ordered fallbacks, and prompt-output behavior without split documentation.
|
|
10
|
+
- `[Attachment Handlers]` `attachmentHandlers` is now deprecated but remains supported as a compatibility alias appended after `inboundHandlers`. Impact: existing voice/file preprocessing configs keep working while new configs can move to the unified inbound bus.
|
|
11
|
+
- `[Outbound Handlers]` Added `outboundHandlers` support for `type: "text"`; final text/Markdown replies can be transformed before Telegram rendering and delivery. Impact: translation-back or other outbound text normalization can be configured without hard-coded providers.
|
|
12
|
+
- `[Outbound Text Preview]` Finalized rich preview messages now pass through outbound `type: "text"` handlers before Telegram edit/delivery, with expanded README/docs examples for machine translation, final text rewrites, composed translated voice-over, and inline-button compatibility. Impact: outbound text transforms apply even when the final answer reuses an existing preview instead of falling back to a separate send path, while inline buttons remain attached and visible labels are transformed without changing callback prompts.
|
|
13
|
+
- `[Package]` Bumped package metadata to `0.8.0` through npm and kept the lockfile in sync.
|
|
14
|
+
|
|
15
|
+
## 0.7.2: Split Text Coalescing Hotfix
|
|
16
|
+
|
|
17
|
+
- `[Text Coalescing]` Telegram text messages that look like automatic splits of one near-limit human message are now short-debounced and forwarded to π as one prompt, using a conservative 3600-character near-limit threshold. Commands, bot messages, media groups, captions, non-contiguous messages, and normal short follow-ups bypass coalescing. Impact: long pasted logs/prompts are less likely to arrive as separate π turns when Telegram chunks them.
|
|
18
|
+
- `[Runtime Tests]` The media-group runtime regression now waits for the real debounce instead of mixing fake timers with the polling loop, and the reaction-priority runtime test flushes pending microtasks before ending the active turn. Impact: CI should stop failing on timing-only races around delayed dispatch and queued reaction mutations.
|
|
19
|
+
- `[Callback Namespaces]` Current status-screen navigation callbacks now use the canonical `menu:` namespace (`menu:model`, `menu:thinking`, `menu:queue`). `status:` remains reserved as an owned legacy prefix but is no longer emitted by current UI. Impact: new inline menu callbacks align with the unified app-menu model while old `status:` payloads still cannot leak to external fallback handlers.
|
|
20
|
+
- `[Package]` Bumped package metadata to `0.7.2` through npm and kept the lockfile in sync.
|
|
21
|
+
|
|
3
22
|
## 0.7.1: Layered Callback Interop
|
|
4
23
|
|
|
5
24
|
- `[Callback Interop]` Unknown Telegram inline-button callback data that does not belong to pi-telegram-owned prefixes (`tgbtn:`, `menu:`, `model:`, `thinking:`, `status:`, `queue:`) is now forwarded to π as `[callback] <data>` after assistant-button, queue-menu, and app-menu handlers decline it. `docs/callback-namespaces.md` defines the shared callback namespace standard for layered extensions. Impact: layered π extensions can namespace and handle their own Telegram inline buttons without polling the same bot or forking pi-telegram.
|
|
@@ -71,7 +90,7 @@
|
|
|
71
90
|
## 0.5.0: Command Templates, Domain Boundaries & Queue UX
|
|
72
91
|
|
|
73
92
|
- `[Queue UX]` Telegram `/status` and `/model` now execute immediately, post-agent-end queue dispatch retries after pi settles idle state, and the status bar shows specific busy labels (`active`, `dispatching`, `queued`, `tool running`, `model`). Reaction priority remains local and applies to text, voice, file, image, and media-group turns without introducing pi steering semantics. Impact: controls do not get stuck behind generation, queued work no longer needs a later Telegram update to unstick, and attachment turns keep predictable ordering.
|
|
74
|
-
- `[Attachment Handlers]` Inbound preprocessing now uses portable `template` configs with `args`/`defaults` and ordered fallback chains, documented in `docs/command-templates.md` and
|
|
93
|
+
- `[Attachment Handlers]` Inbound preprocessing now uses portable `template` configs with `args`/`defaults` and ordered fallback chains, documented in `docs/command-templates.md` and current inbound handler docs. Impact: voice/STT primary-fallback setups work from `telegram.json` without coupling pi-telegram to private auto-tool registry internals.
|
|
75
94
|
- `[Domain Boundaries]` Removed the broad `registration` domain and moved registration surfaces to owners: attachments register `telegram_attach`, commands register pi `/telegram-*` commands, lifecycle registers hooks, and prompts own Telegram-specific system prompt injection. Impact: entrypoint wiring is clearer and each registration surface has focused tests.
|
|
76
95
|
- `[telegram_attach]` The outbound attachment tool now lives in the attachments domain with outbound limits, queueing failure events, and pi-friendly tool-result formatting. Impact: outbound file delivery behavior is owned by the same domain that queues and sends Telegram attachments.
|
|
77
96
|
- `[Docs & Validation]` Updated README, docs, architecture/context maps, backlog, focused coverage, and removed vendored repository-local agent skills in favor of global validation tooling. Impact: user-facing docs, validation, and package-adjacent repo contents match the 0.5.0 code shape without stale skill copies.
|
package/README.md
CHANGED
|
@@ -18,7 +18,7 @@ This repository is an actively maintained fork of [`badlogic/pi-telegram`](https
|
|
|
18
18
|
- **In-flight Model Switching**: Change the active model mid-generation. The agent gracefully pauses, applies the new model, and restarts its response without losing context.
|
|
19
19
|
- **Smart Message Queue**: Messages sent while the agent is busy are queued and previewed in the π status bar, and queued turns can be reprioritized or removed with Telegram reactions or the queue section of the inline application menu.
|
|
20
20
|
- **Mobile-Optimized Rendering**: Tables and lists are formatted for narrow screens, table padding accounts for emoji grapheme and wide Unicode display width, and Telegram-originated runs prompt the assistant to prefer narrow table columns for phone readability. Markdown is correctly parsed and split to fit Telegram's limits without breaking HTML structures or code blocks, block spacing stays faithful to the original Markdown with readable heading separation, supported absolute links stay clickable, and unsupported link forms degrade safely.
|
|
21
|
-
- **File Handling & Attachments**: Send images and files to the agent, transcribe or transform inbound
|
|
21
|
+
- **File Handling & Attachments**: Send images and files to the agent, transcribe or transform inbound text/media with configured inbound handlers, or ask π to generate and return artifacts. Inbound downloads and outbound attachments are size-limited by default, and outbound files are delivered automatically via the `telegram_attach` tool.
|
|
22
22
|
- **Streaming Responses**: Closed Markdown blocks stream back as rich Telegram HTML while π is generating, and the still-growing tail stays readable until the final fully rendered reply lands.
|
|
23
23
|
|
|
24
24
|
## Install
|
|
@@ -77,7 +77,7 @@ Once paired, simply chat with your bot in Telegram. All text, images, and files
|
|
|
77
77
|
|
|
78
78
|
Use these inside the Telegram DM with your bot:
|
|
79
79
|
|
|
80
|
-
- **`/start`**: Pair the first Telegram user when needed, register
|
|
80
|
+
- **`/start`**: Pair the first Telegram user when needed, register bridge bot commands, and open the inline application menu with command help, available π prompt templates, status rows, and controls.
|
|
81
81
|
- **`/compact`**: Start session compaction (only works when the session is idle).
|
|
82
82
|
- **`/next`**: Dispatch the next queued turn (aborts π first if busy).
|
|
83
83
|
- **`/continue`**: Enqueue a priority `continue` prompt. It waits like normal Telegram work when π is busy and can trigger prompt/skill handling that listens for `continue`.
|
|
@@ -102,6 +102,7 @@ Run these inside π, not Telegram:
|
|
|
102
102
|
### Queue, Reactions, and Media
|
|
103
103
|
|
|
104
104
|
- If you send more Telegram messages while π is busy, they enter the default prompt queue and are processed in order.
|
|
105
|
+
- Very long text messages that Telegram appears to split automatically are coalesced through a short conservative debounce and forwarded to π as one prompt when the first chunk is near Telegram's text limit, currently using a 3600-character threshold. Commands, bot messages, media groups, and normal short follow-ups are not coalesced.
|
|
105
106
|
- `👍`, `⚡️`, `❤️`, and `🕊` move a waiting prompt into the priority prompt queue, behind control actions but ahead of default prompts. Removing the last priority reaction sends it back to its normal queue position, and adding a priority reaction again gives it a fresh priority position.
|
|
106
107
|
- `👎`, `👻`, `💔`, and `💩` remove a waiting turn from the queue. Telegram Bot API does not expose ordinary DM message-deletion events through the polling path used here, so queue removal is bound to removal reactions.
|
|
107
108
|
- Reactions apply to any waiting Telegram turn, including text, voice, files, images, and media groups. For media groups, a reaction on any message in the group applies to the whole queued turn.
|
|
@@ -110,26 +111,36 @@ Run these inside π, not Telegram:
|
|
|
110
111
|
- Inbound images, albums, and files are saved to `~/.pi/agent/tmp/telegram`. Unhandled local file paths are included in the prompt, handled attachment output is injected into the prompt text, and inbound images are forwarded to π as image inputs. Inbound downloads default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`.
|
|
111
112
|
- Queue reactions depend on Telegram delivering `message_reaction` updates for your bot and chat type.
|
|
112
113
|
|
|
113
|
-
### Inbound
|
|
114
|
+
### Inbound Handlers
|
|
114
115
|
|
|
115
|
-
`telegram.json` can define ordered `
|
|
116
|
+
`telegram.json` can define ordered `inboundHandlers` for Telegram → π preprocessing such as text translation, voice transcription, OCR, or PDF extraction. Matching handlers run before the Telegram turn enters the π queue. If a matching media/file handler fails, the next matching handler is tried as a fallback. Legacy `attachmentHandlers` still work as a deprecated compatibility alias and are appended after `inboundHandlers`.
|
|
116
117
|
|
|
117
118
|
```json
|
|
118
119
|
{
|
|
119
|
-
"
|
|
120
|
+
"inboundHandlers": [
|
|
121
|
+
{
|
|
122
|
+
"type": "text",
|
|
123
|
+
"template": "/path/to/translate --lang {lang=en} --text \"{text}\""
|
|
124
|
+
},
|
|
120
125
|
{
|
|
121
126
|
"type": "voice",
|
|
122
|
-
"template":
|
|
127
|
+
"template": [
|
|
128
|
+
"/path/to/stt --file {file} --lang {lang=ru}",
|
|
129
|
+
"/path/to/translate-stdin --lang {lang=en}"
|
|
130
|
+
]
|
|
123
131
|
},
|
|
124
132
|
{
|
|
125
133
|
"mime": "audio/*",
|
|
126
|
-
"template":
|
|
134
|
+
"template": [
|
|
135
|
+
"/path/to/stt-fallback --file {file} --lang {lang=ru}",
|
|
136
|
+
"/path/to/translate-stdin --lang {lang=en}"
|
|
137
|
+
]
|
|
127
138
|
}
|
|
128
139
|
]
|
|
129
140
|
}
|
|
130
141
|
```
|
|
131
142
|
|
|
132
|
-
Matching supports `mime`, `type`, or `match`;
|
|
143
|
+
Matching supports optional `mime`, `type`, or `match`; `mime` can be used without `type`, and wildcards like `audio/*` or `text/*` are accepted. Raw Telegram text can match `type: "text"`, `mime: "text/plain"`, or `mime: "text/*"`; it is passed on stdin and as `{text}`, and non-empty stdout replaces the prompt text. Media/file handlers receive `{file}`, `{mime}`, and `{type}`; local attachments stay in the prompt under `[attachments] <directory>` with relative file entries, and successful media/file handler stdout is added under `[outputs]`. Attached `text/plain`/`text/*` files have a built-in fail-open reader that injects UTF-8 content into `[outputs]` when no configured handler produced output. Failed handlers record diagnostics and fall back safely. The portable command-template contract is documented in [`docs/command-templates.md`](./docs/command-templates.md); Telegram-specific inbound config is documented in [`docs/inbound-handlers.md`](./docs/inbound-handlers.md).
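The selector rules described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `HandlerSelector` and `InboundItem` shapes and both function names are hypothetical, the `match` selector is left out, and the sketch assumes that every selector a handler provides must match.

```typescript
// Hypothetical shapes for illustration; the real config types may differ.
interface HandlerSelector {
  mime?: string;
  type?: string;
}

interface InboundItem {
  mime?: string;
  type?: string;
}

// Compares a MIME pattern such as "audio/*" or "text/plain" with a concrete MIME type.
function mimeMatches(pattern: string, mime: string): boolean {
  const p = pattern.toLowerCase();
  const m = mime.toLowerCase();
  if (p.slice(-2) === "/*") {
    // "audio/*" becomes the prefix "audio/".
    return m.indexOf(p.slice(0, -1)) === 0;
  }
  return p === m;
}

// Assumption: a handler with no selectors matches nothing, and all provided selectors must match.
function selectorMatches(selector: HandlerSelector, item: InboundItem): boolean {
  if (selector.mime === undefined && selector.type === undefined) return false;
  if (selector.mime !== undefined) {
    if (item.mime === undefined || !mimeMatches(selector.mime, item.mime)) return false;
  }
  if (selector.type !== undefined && selector.type !== item.type) return false;
  return true;
}
```

Under these assumptions, a handler with only `mime: "audio/*"` would match a voice note whose MIME type is `audio/ogg` without needing a `type` field.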
|
|
133
144
|
|
|
134
145
|
### Requesting Files
|
|
135
146
|
|
|
@@ -155,9 +166,22 @@ Text to synthesize as a Telegram voice message.
|
|
|
155
166
|
<!-- telegram_voice: Short spoken companion summary. -->
|
|
156
167
|
```
|
|
157
168
|
|
|
158
|
-
Outbound
|
|
169
|
+
Outbound `type: "text"` handlers can transform final text/Markdown before Telegram rendering and delivery, using stdin and `{text}` as input and non-empty stdout as replacement text. They are a good fit for machine translation, tone normalization, redaction, glossary expansion, or any other final text rewrite that should happen outside the agent prompt. The transform also applies when the bridge finalizes an already streamed rich preview, so Telegram may briefly show the pre-transform preview before the final edited message lands. Inline button labels are transformed too, while callback data and prompts stay unchanged.
|
|
170
|
+
|
|
171
|
+
```json
|
|
172
|
+
{
|
|
173
|
+
"outboundHandlers": [
|
|
174
|
+
{
|
|
175
|
+
"type": "text",
|
|
176
|
+
"template": "/path/to/translate --lang {lang=ru} --text {text}"
|
|
177
|
+
}
|
|
178
|
+
]
|
|
179
|
+
}
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
Outbound voice is disabled unless a matching `outboundHandlers[]` entry is configured. Multiple `telegram_voice` blocks in one reply are synthesized and sent independently, preserving each block's attributes. The bridge uses the same [command-template contract](./docs/command-templates.md) as inbound handlers: split the template into args, substitute placeholders, execute without a shell, and use stdout as the result channel for a single template.
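The "split, substitute, execute without a shell" contract can be illustrated with a small sketch. `expandTemplate` is a hypothetical helper, not the bridge's API, and it assumes plain whitespace splitting with no quote handling. The point it shows is the ordering: because substitution happens after splitting, a value containing spaces still ends up as a single argument.

```typescript
// Hypothetical helper, not the package's API. Expands `{name}` and `{name=default}`
// placeholders after splitting, so each substituted value stays one argument.
type TemplateValues = Record<string, string | undefined>;

function expandTemplate(template: string, values: TemplateValues): string[] {
  // Naive whitespace split; a real splitter would also need to handle quoting.
  const parts = template.trim().split(/\s+/);
  return parts.map((part) =>
    part.replace(
      /\{([A-Za-z_][A-Za-z0-9_]*)(?:=([^}]*))?\}/g,
      (_match: string, name: string, fallback?: string) => {
        const value = values[name];
        if (value !== undefined) return value;
        if (fallback !== undefined) return fallback;
        throw new Error("Missing value for placeholder {" + name + "}");
      },
    ),
  );
}

const argv = expandTemplate("/path/to/stt --file {file} --lang {lang=ru}", {
  file: "/tmp/voice note.ogg",
});
// argv: ["/path/to/stt", "--file", "/tmp/voice note.ogg", "--lang", "ru"]
```

The resulting array could then be handed to a process-spawning call that takes an executable plus an argument list, which is what keeps a shell out of the picture.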
|
|
159
183
|
|
|
160
|
-
A
|
|
184
|
+
A composed voice setup can translate the hidden `telegram_voice` text, synthesize it, and convert MP3 to Telegram-native OGG/Opus in one pipeline. The bridge provides `{text}`, `{mp3}`, and `{ogg}` to every step; top-level `args`/`defaults` apply to all steps unless a step defines private values, the default command timeout applies automatically, and each step's stdout is passed to the next step's stdin by default. Use `"output": "ogg"` when the artifact path should come from the generated `{ogg}` value instead of final stdout:
|
|
161
185
|
|
|
162
186
|
```json
|
|
163
187
|
{
|
|
@@ -165,7 +189,8 @@ A TTS plus MP3-to-OGG setup can be expressed as `template: [...]`. The bridge pr
|
|
|
165
189
|
{
|
|
166
190
|
"type": "voice",
|
|
167
191
|
"template": [
|
|
168
|
-
"/path/to/
|
|
192
|
+
"/path/to/translate-stdin --lang {lang=ru}",
|
|
193
|
+
"/path/to/tts-from-stdin --lang {lang=ru} --rate {rate=+30%} --write-media {mp3}",
|
|
169
194
|
"ffmpeg -y -i {mp3} -c:a libopus -b:a 32k -ar 16000 -ac 1 -vbr on {ogg}"
|
|
170
195
|
],
|
|
171
196
|
"output": "ogg"
|
package/docs/README.md
CHANGED
|
@@ -6,7 +6,7 @@ Living index of project documentation in `/docs`.
|
|
|
6
6
|
|
|
7
7
|
- [architecture.md](./architecture.md) — Overview of the Telegram bridge runtime, queueing model, rendering pipeline, and interactive controls
|
|
8
8
|
- [command-templates.md](./command-templates.md) — Portable command-template standard core
|
|
9
|
-
- [
|
|
10
|
-
- [outbound-handlers.md](./outbound-handlers.md) — Local `pi-telegram` outbound-handler config, voice/button
|
|
9
|
+
- [inbound-handlers.md](./inbound-handlers.md) — Local `pi-telegram` inbound text/media handler bus, legacy `attachmentHandlers` compatibility, placeholders, and fallbacks
|
|
10
|
+
- [outbound-handlers.md](./outbound-handlers.md) — Local `pi-telegram` outbound-handler config, text/voice/button behavior, artifact outputs, and callback routing
|
|
11
11
|
- [locks.md](./locks.md) — Shared `locks.json` standard for singleton extension ownership
|
|
12
12
|
- [callback-namespaces.md](./callback-namespaces.md) — Shared Telegram `callback_data` namespace standard for layered extensions
|
package/docs/architecture.md
CHANGED
|
@@ -30,14 +30,14 @@ Current runtime areas use these ownership boundaries:
|
|
|
30
30
|
| `config` / `setup` | Persisted bot/session pairing state, authorization, first-user pairing, token prompting, env fallback, validation, config persistence |
|
|
31
31
|
| `locks` / `polling` | Singleton `locks.json` ownership, takeover/restart semantics, long-poll controller state, update offset persistence, poll-loop runtime wiring |
|
|
32
32
|
| `updates` / `routing` | Update classification/execution planning, paired authorization, reactions, edits, callbacks, and inbound route composition |
|
|
33
|
-
| `media` / `turns` / `
|
|
33
|
+
| `media` / `text-groups` / `turns` / `inbound-handlers` | Text/media extraction, media-group debounce, long-text split coalescing, inbound downloads, inbound text/media handler execution, turn building/editing, image reads, legacy `attachmentHandlers` compatibility |
|
|
34
34
|
| `queue` | Queue item contracts, lane admission/order, stores, mutations, dispatch readiness/runtime, prompt/control enqueueing, session and agent/tool lifecycle sequencing |
|
|
35
35
|
| `runtime` | Session-local coordination primitives: counters, lifecycle flags, setup guard, abort handler, typing-loop timers, prompt-dispatch flags, agent-end reset binding |
|
|
36
36
|
| `model` / `menu-model` / `menu-thinking` / `menu-status` / `menu` / `menu-queue` / `commands` | Model identity/thinking levels, scoped model resolution, in-flight switching, model-menu UI, thinking-menu UI, status-menu UI, inline application callback composition, queue-menu UI, slash commands, bot command registration |
|
|
37
37
|
| `keyboard` | Shared Telegram inline-keyboard reply-markup structure; feature domains own callback semantics and button construction |
|
|
38
38
|
| `preview` / `replies` / `rendering` | Preview lifecycle/transports, final reply delivery and reply parameters, Telegram HTML Markdown rendering, chunking, stable-preview snapshots |
|
|
39
|
-
| `outbound-handlers` |
|
|
40
|
-
| `attachments`
|
|
39
|
+
| `outbound-handlers` | Outbound text transformation, assistant-authored outbound comments, generated reply artifacts, inline-keyboard callbacks, and post-`agent_end` outbound action delivery |
|
|
40
|
+
| `outbound-attachments` | `telegram_attach` registration, outbound attachment queueing, stat/limit checks, photo/document delivery classification |
|
|
41
41
|
| `status` | Status-bar/status-message rendering, queue-lane status views, redacted runtime event ring, grouped π diagnostics |
|
|
42
42
|
| `lifecycle` / `prompts` / `prompt-templates` / `pi` | π hook registration, Telegram-specific before-agent prompt injection, π prompt-template discovery/expansion, centralized direct pi SDK imports and context adapters |
|
|
43
43
|
| `command-templates` | Portable shell-free command-template standard helpers, composition expansion, placeholder substitution, and executable resolution |
|
|
@@ -50,7 +50,7 @@ Boundary invariants:
|
|
|
50
50
|
- Preview appearance stays in `rendering`; preview transport/lifecycle stays in `preview`
|
|
51
51
|
- Direct `node:*` file-operation imports stay in owning domains, not in `index.ts`
|
|
52
52
|
- `index.ts` uses namespace imports for local bridge domains so orchestration reads as `Queue.*`, `Turns.*`, and `Rendering.*`
|
|
53
|
-
- Architecture-invariant tests guard the acyclic import graph, pi SDK centralization, entrypoint purity, runtime-domain isolation, structural leaf-domain isolation, menu/model boundaries, API/config separation, media/update/API separation, and attachment boundary isolation
|
|
53
|
+
- Architecture-invariant tests guard the acyclic import graph, pi SDK centralization, entrypoint purity, runtime-domain isolation, structural leaf-domain isolation, menu/model boundaries, API/config separation, media/update/API separation, and outbound-attachment boundary isolation
|
|
54
54
|
- Mirrored domain regression coverage lives in `/tests/*.test.ts`; test helpers stay local to the mirrored suite by default, and shared fixture folders are justified only by reuse across multiple domain suites
|
|
55
55
|
|
|
56
56
|
## Configuration UX
|
|
@@ -77,8 +77,8 @@ Telegram bot configuration stays in `~/.pi/agent/telegram.json`; singleton runti
|
|
|
77
77
|
4. Media groups are coalesced into a single Telegram turn when needed
|
|
78
78
|
5. Slash command parsing uses only the new message text/caption, while Telegram `reply_to_message` text/caption is injected later as prompt-only `[reply]` context for normal queued turns
|
|
79
79
|
6. Files are streamed into `~/.pi/agent/tmp/telegram` with a default 50 MiB size limit, partial-download cleanup on failures, and stale temp cleanup on session start; operators can tune the limit with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`
|
|
80
|
-
7. Configured inbound
|
|
81
|
-
8. Matching handlers are tried in config order: a non-zero exit records diagnostics and falls back to the next matching handler, while the first successful handler stops the chain
|
|
80
|
+
7. Configured inbound handlers may run on raw text or downloaded files by MIME wildcard, Telegram attachment type, or generic match selector; command templates receive safe command-arg substitution for `{text}`, `{file}`, `{mime}`, and `{type}` where applicable
|
|
81
|
+
8. Matching media/file handlers are tried in config order: a non-zero exit records diagnostics and falls back to the next matching handler, while the first successful handler stops the chain
|
|
82
82
|
9. Local attachments stay visible under `[attachments] <directory>` with relative file entries, and handler stdout is appended under `[outputs]` before the agent sees the turn; failed handlers omit output while keeping the attachment entry
|
|
83
83
|
10. A `PendingTelegramTurn` is created and queued locally
|
|
84
84
|
11. Telegram `edited_message` updates are routed separately and update a matching queued turn when the original message has not been dispatched yet
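The ordered fallback described in step 8 can be sketched roughly as below. The `Handler` type is a stand-in for executing a command template, and `runFallbackChain`, `HandlerResult`, and `ChainOutcome` are hypothetical names introduced only for this illustration.

```typescript
// Illustrative sketch; a Handler stands in for running one command template.
interface HandlerResult {
  exitCode: number;
  stdout: string;
}

type Handler = (input: string) => HandlerResult;

interface ChainOutcome {
  output: string | undefined;
  diagnostics: string[];
}

// Tries handlers in config order: a non-zero exit records a diagnostic and
// falls through to the next handler; the first success stops the chain.
function runFallbackChain(handlers: Handler[], input: string): ChainOutcome {
  const diagnostics: string[] = [];
  let index = 0;
  for (const handler of handlers) {
    const result = handler(input);
    if (result.exitCode === 0) {
      return { output: result.stdout, diagnostics };
    }
    diagnostics.push("handler " + index + " exited with code " + result.exitCode);
    index += 1;
  }
  // Every handler failed: no output, but the diagnostics are kept.
  return { output: undefined, diagnostics };
}
```

When every handler fails, the sketch returns no output, which mirrors step 9: the attachment entry stays in the prompt while the `[outputs]` text is omitted.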
|
|
@@ -156,7 +156,7 @@ Preferred order:
|
|
|
156
156
|
|
|
157
157
|
Draft streaming can remain as a plain-text fallback path, but rich Telegram previews are driven through editable messages and stable-block snapshot selection.
|
|
158
158
|
|
|
159
|
-
Telegram prompt responses use explicit delivery context to attach outbound text, rich previews, errors, attachment notices, and uploads as Telegram replies to the source prompt when possible. Reply metadata is opt-in per delivery path, uses `reply_parameters` with `allow_sending_without_reply: true`, and is applied only to the first chunk of split long responses; continuation chunks are sent as normal adjacent messages. Media-group turns reply to the turn's representative `replyToMessageId`, not to every source message in the group.
|
|
159
|
+
Telegram prompt responses use explicit delivery context to attach outbound text, rich previews, errors, attachment notices, and uploads as Telegram replies to the source prompt when possible. Reply metadata is opt-in per delivery path, uses `reply_parameters` with `allow_sending_without_reply: true`, and is applied only to the first chunk of split long responses; continuation chunks are sent as normal adjacent messages. Media-group turns reply to the turn's representative `replyToMessageId`, not to every source message in the group. Long text split coalescing is intentionally conservative: only human text messages at or above the 3600-character near-limit threshold open the short debounce window, immediate same-chat/user contiguous text tails join that prompt, and commands, bot messages, captions, media groups, and normal short follow-ups bypass the coalescer.
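
The coalescing gate described in the added sentence can be sketched as a single predicate. This is only an illustration: the `IncomingMessage` shape, its field names, and the function name are hypothetical and not taken from the package, and treating a leading `/` as "command" and counting length in UTF-16 code units are both assumptions.

```typescript
// Hypothetical message shape; field names are illustrative, not the package's types.
interface IncomingMessage {
  text?: string;
  caption?: string;
  fromBot: boolean;
  mediaGroupId?: string;
}

// Near-limit threshold quoted in the paragraph above.
const NEAR_LIMIT_CHARS = 3600;

// Returns true when a message would open the short debounce window.
function opensCoalesceWindow(msg: IncomingMessage): boolean {
  // Bot messages, captions, and media groups bypass the coalescer.
  if (msg.fromBot) return false;
  if (msg.caption !== undefined || msg.mediaGroupId !== undefined) return false;

  const text = msg.text ?? "";
  // Assumption: a command is any text starting with "/".
  if (text.startsWith("/")) return false;

  // Only near-limit human text opens the window; short follow-ups do not.
  return text.length >= NEAR_LIMIT_CHARS;
}
```

A real implementation would also need the same-chat/user contiguity check for the tails that join the prompt, which this sketch leaves out.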
|
|
160
160
|
|
|
161
161
|
Outbound files are sent only after the active Telegram turn completes, must be staged through the `telegram_attach` tool, are staged atomically per tool call, are checked against a default 50 MiB limit configurable through `PI_TELEGRAM_OUTBOUND_ATTACHMENT_MAX_BYTES` or `TELEGRAM_MAX_ATTACHMENT_SIZE_BYTES`, and use file-backed multipart blobs so large sends do not require preloading whole files into memory.
|
|
162
162
|
|
|
@@ -20,7 +20,7 @@ myext:page:2
|
|
|
20
20
|
|
|
21
21
|
- Use a stable extension-owned namespace, preferably the package or extension name without scope punctuation.
|
|
22
22
|
- Keep the namespace lowercase ASCII: `a-z`, `0-9`, `_`, `-`.
|
|
23
|
-
- Do not use `pi-telegram` owned prefixes: `tgbtn:`, `menu:`, `model:`, `thinking:`, `status:`, `queue:`.
|
|
23
|
+
- Do not use `pi-telegram` owned prefixes: `tgbtn:`, `menu:`, `model:`, `thinking:`, `status:`, `queue:`. Current app navigation uses `menu:`; `status:` remains reserved for legacy/owned status callbacks but is not emitted by current UI.
|
|
24
24
|
- Keep the full `callback_data` within Telegram's 64-byte limit.
|
|
25
25
|
- Put only opaque ids or small enum values in payloads; do not store secrets, full prompts, or large state.
|
|
26
26
|
- Treat callbacks as untrusted input. Validate namespace, action, and payload before executing side effects.
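
A minimal sketch of how an extension might check its own `callback_data` against the rules listed above. The function name and the error strings are hypothetical; the reserved prefixes, the allowed namespace characters, and the 64-byte limit come from the list itself.

```typescript
// Prefixes owned by pi-telegram, as listed above (without the trailing colon).
const RESERVED_NAMESPACES = ["tgbtn", "menu", "model", "thinking", "status", "queue"];
const NAMESPACE_PATTERN = /^[a-z0-9_-]+$/;
const MAX_CALLBACK_BYTES = 64;

// Returns null when the value is acceptable, otherwise a short reason string.
function validateCallbackData(data: string): string | null {
  const separator = data.indexOf(":");
  if (separator <= 0) return "missing namespace";

  const namespace = data.slice(0, separator);
  if (!NAMESPACE_PATTERN.test(namespace)) return "namespace must be lowercase ASCII";
  if (RESERVED_NAMESPACES.includes(namespace)) return "namespace is reserved";

  // Telegram's limit is in bytes, so measure the UTF-8 encoding, not the string length.
  const byteLength = new TextEncoder().encode(data).length;
  if (byteLength > MAX_CALLBACK_BYTES) return "callback_data exceeds 64 bytes";

  return null;
}
```

For example, `validateCallbackData("myext:page:2")` returns `null`, while `validateCallbackData("menu:open")` is rejected as reserved.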
|
|
@@ -0,0 +1,93 @@
|
|
|
1
|
+
# Inbound Handlers
|
|
2
|
+
|
|
3
|
+
`pi-telegram` can run ordered inbound handlers before a Telegram turn enters the π queue. Inbound handlers are the provider-neutral Telegram → π transformation bus for raw text and downloaded media/files.
|
|
4
|
+
|
|
5
|
+
This document is the local inbound adaptation of the portable [Command Template Standard](./command-templates.md). It is also the canonical home for the legacy `attachmentHandlers` compatibility config.
|
|
6
|
+
|
|
7
|
+
## Config Shape
|
|
8
|
+
|
|
9
|
+
Prefer `inboundHandlers` for new configs:
|
|
10
|
+
|
|
11
|
+
```json
|
|
12
|
+
{
|
|
13
|
+
"inboundHandlers": [
|
|
14
|
+
{
|
|
15
|
+
"type": "text",
|
|
16
|
+
"template": "/path/to/translate --lang {lang=en} --text \"{text}\""
|
|
17
|
+
},
|
|
18
|
+
{
|
|
19
|
+
"type": "voice",
|
|
20
|
+
"template": [
|
|
21
|
+
"/path/to/stt --file {file} --lang {lang=ru}",
|
|
22
|
+
"/path/to/translate-stdin --lang {lang=en}"
|
|
23
|
+
]
|
|
24
|
+
},
|
|
25
|
+
{
|
|
26
|
+
"mime": "application/pdf",
|
|
27
|
+
"template": "/path/to/pdf-to-text --file {file}"
|
|
28
|
+
}
|
|
29
|
+
]
|
|
30
|
+
}
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
Legacy `telegram.json` files may still define `attachmentHandlers` for media/file preprocessing:
|
|
34
|
+
|
|
35
|
+
```json
|
|
36
|
+
{
|
|
37
|
+
"attachmentHandlers": [
|
|
38
|
+
{
|
|
39
|
+
"type": "voice",
|
|
40
|
+
"template": "/path/to/stt1 --file {file} --lang {lang=ru}"
|
|
41
|
+
},
|
|
42
|
+
{
|
|
43
|
+
"mime": "audio/*",
|
|
44
|
+
"template": "/path/to/stt2 --file {file} --lang {lang=ru}"
|
|
45
|
+
}
|
|
46
|
+
]
|
|
47
|
+
}
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
At runtime, `attachmentHandlers` is appended after `inboundHandlers`. Existing configs continue to work, while new configs should use `inboundHandlers`.
|
|
51
|
+
|
|
52
|
+
Handlers match by optional `type`, `mime`, or `match`. `mime` and `type` are independent selectors: if `mime` is present, `type` is not required. Wildcards such as `audio/*` or `text/*` are accepted. Each matching handler must provide `template`; a string is one command, and an array is ordered composition. Top-level `args` and `defaults` apply to composed steps unless a step defines private values. The command-template default timeout applies automatically. Legacy configs may still use `pipe` as a local alias.
|
|
53
|
+
|
|
54
|
+
`defaults` may provide additional placeholder values such as `{lang}` or `{model}`. `args` is only a string-array declaration of supported placeholders; defaults belong in `defaults` or inline placeholders such as `{lang=ru}`. Examples prefer explicit flag-style CLIs such as `--file {file}` and `--lang {lang=ru}` for readability, but positional forms such as `/path/to/stt {file} {lang=ru} {model=voxtral-mini-latest}` are equally valid when the target script supports them.
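
To make the placeholder rules concrete, here is a rough sketch of expanding a template into an argument array. It is not the package's implementation: `renderArgs` is a hypothetical name, splitting on whitespace and stripping surrounding double quotes is a simplification, and the precedence (runtime value, then `defaults`, then the inline `{name=default}`) is an assumption.

```typescript
type Values = Record<string, string | undefined>;

// Matches {name} and {name=inlineDefault}.
const PLACEHOLDER = /\{([A-Za-z_][A-Za-z0-9_]*)(?:=([^}]*))?\}/g;

// Expands a template into argv. Substitution happens per token, so a value
// containing spaces stays a single argument instead of being re-split.
function renderArgs(template: string, values: Values, defaults: Values = {}): string[] {
  return template
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) =>
      token
        // Simplification: drop double quotes that wrap a whole token.
        .replace(/^"(.*)"$/, "$1")
        // Assumed precedence: runtime value, then `defaults`, then inline default.
        .replace(PLACEHOLDER, (_match: string, name: string, inline?: string) =>
          values[name] ?? defaults[name] ?? inline ?? ""),
    );
}
```

Under these assumptions, `renderArgs("/path/to/stt --file {file} --lang {lang=ru}", { file: "/tmp/a b.ogg" })` yields five arguments, with the path kept intact as one argument and `ru` filled in from the inline default.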
|
|
55
|
+
|
|
56
|
+
## Text Handlers
|
|
57
|
+
|
|
58
|
+
`type: "text"` handlers transform raw Telegram text before prompt construction. Raw Telegram text also has synthetic `mime: "text/plain"`, so a handler can match it with `type: "text"`, `mime: "text/plain"`, `mime: "text/*"`, or `match: "text/plain"`. The source text is provided on stdin and as `{text}`. Successful non-empty stdout replaces the current text and is passed to the next matching text handler. Empty stdout, non-zero exit, or handler failure keeps the previous text and records diagnostics.
|
|
59
|
+
|
|
60
|
+
Built-in placeholders for text handlers:
|
|
61
|
+
|
|
62
|
+
| Placeholder | Value |
|
|
63
|
+
| ----------- | ------------ |
|
|
64
|
+
| `{text}` | Current text |
|
|
65
|
+
| `{mime}` | `text/plain` |
|
|
66
|
+
| `{type}` | `text` |
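
The text-handler chaining rule (successful non-empty stdout replaces the text; empty stdout or failure keeps the previous text and records diagnostics) can be sketched as follows. The handlers are modeled as plain synchronous functions standing in for spawned commands; `HandlerResult`, `TextHandler`, and `applyTextHandlers` are hypothetical names, and treating "empty" as a zero-length string is an assumption.

```typescript
interface HandlerResult {
  exitCode: number;
  stdout: string;
}

// Stand-in for a spawned command: receives the current text on stdin.
type TextHandler = (stdin: string) => HandlerResult;

// Runs every matching text handler in order, threading the text through.
function applyTextHandlers(
  text: string,
  handlers: TextHandler[],
  diagnostics: string[] = [],
): string {
  let current = text;
  handlers.forEach((handler, index) => {
    let result: HandlerResult;
    try {
      result = handler(current);
    } catch (error) {
      diagnostics.push(`handler ${index}: failed (${String(error)})`);
      return; // keep the previous text
    }
    if (result.exitCode !== 0) {
      diagnostics.push(`handler ${index}: exit code ${result.exitCode}`);
      return; // keep the previous text
    }
    if (result.stdout.length === 0) {
      diagnostics.push(`handler ${index}: empty stdout`);
      return; // keep the previous text
    }
    current = result.stdout; // successful output feeds the next handler
  });
  return current;
}
```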
|
|
67
|
+
|
|
68
|
+
## Media/File Handlers
|
|
69
|
+
|
|
70
|
+
Media/file handlers keep the legacy attachment-handler behavior: downloaded files are matched by `mime`, `type`, or `match`, then each file runs the first successful matching handler. Downloaded files with `mime: "text/plain"` or any `text/*` MIME type have a built-in fail-open handler that reads UTF-8 content into `[outputs]` when no configured handler produced output. Composition is useful for pipelines such as voice transcription followed by machine translation, so the agent receives translated `[outputs]` instead of the raw STT language.
|
|
71
|
+
|
|
72
|
+
Built-in placeholders for media/file handlers:
|
|
73
|
+
|
|
74
|
+
| Placeholder | Value |
|
|
75
|
+
| ----------- | ---------------------------------------------------------------- |
|
|
76
|
+
| `{file}` | Full local path to the downloaded file |
|
|
77
|
+
| `{mime}` | MIME type if known |
|
|
78
|
+
| `{type}` | Attachment kind such as `voice`, `audio`, `document`, or `photo` |
|
|
79
|
+
| `{text}` | Empty string |
|
|
80
|
+
|
|
81
|
+
If a top-level one-step media handler template has no `{file}` placeholder, the downloaded file path is appended as the last command arg as a convenience. Composition steps are plain command templates and do not receive implicit file-path args; include `{file}` explicitly where needed.
|
|
82
|
+
|
|
83
|
+
## Ordered Fallbacks
|
|
84
|
+
|
|
85
|
+
A handler list is ordered. For each downloaded file, matching media/file handlers run in list order and stop after the first successful handler. A composed handler counts as one handler for fallback purposes: if any step fails, the next matching handler is tried.
|
|
86
|
+
|
|
87
|
+
If a matching handler fails with a non-zero exit code, the runtime records diagnostics and tries the next matching handler. If every matching handler fails, the attachment remains visible in the prompt as a normal local file reference.
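
The ordered-fallback behavior for a single downloaded file can be sketched like this. All names (`InboundFile`, `MediaHandler`, `runFirstSuccessful`) are hypothetical, handlers are modeled as synchronous functions rather than spawned commands, the generic `match` selector is omitted, and treating either `mime` or `type` matching as sufficient is an assumption about how the independent selectors combine.

```typescript
interface InboundFile {
  path: string;
  type: string; // attachment kind such as "voice" or "document"
  mime?: string;
}

interface MediaHandler {
  mime?: string;
  type?: string;
  run: (file: InboundFile) => { exitCode: number; stdout: string };
}

// Supports exact MIME types and wildcards such as "audio/*".
function mimeMatches(pattern: string, mime: string | undefined): boolean {
  if (mime === undefined) return false;
  if (pattern.endsWith("/*")) return mime.startsWith(pattern.slice(0, -1));
  return pattern === mime;
}

function handlerMatches(handler: MediaHandler, file: InboundFile): boolean {
  if (handler.mime !== undefined && mimeMatches(handler.mime, file.mime)) return true;
  return handler.type !== undefined && handler.type === file.type;
}

// Tries matching handlers in list order; the first zero exit code wins.
// Returns undefined when every matching handler fails.
function runFirstSuccessful(
  file: InboundFile,
  handlers: MediaHandler[],
  diagnostics: string[] = [],
): string | undefined {
  for (const handler of handlers) {
    if (!handlerMatches(handler, file)) continue;
    const result = handler.run(file);
    if (result.exitCode === 0) return result.stdout;
    diagnostics.push(`handler for ${file.path} exited with ${result.exitCode}`);
  }
  return undefined;
}
```

When the function returns `undefined`, the caller would leave the attachment in the prompt as a plain local file reference, as described above.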
|
|
88
|
+
|
|
89
|
+
## Prompt Output
|
|
90
|
+
|
|
91
|
+
Local attachments stay in the prompt under `[attachments] <directory>` with relative file entries. Successful media/file handler stdout is added under `[outputs]`. For composed media/file handlers, each step receives the previous step's stdout on stdin by default, and stdout from the last successful step is used as the handler output. Empty output and failed handler output are omitted from the prompt text.
|
|
92
|
+
|
|
93
|
+
Text handler output replaces the prompt text directly and is not duplicated under `[outputs]`.
|
|
@@ -8,15 +8,50 @@ This document is the local outbound adaptation of the portable [Command Template
|
|
|
8
8
|
|
|
9
9
|
## Standard
|
|
10
10
|
|
|
11
|
-
An outbound handler is selected by `type`.
|
|
11
|
+
An outbound handler is selected by `type`. Text replies and assistant markup map to handler types:
|
|
12
12
|
|
|
13
|
-
|
|
|
13
|
+
| Source | Handler type | Telegram action |
|
|
14
14
|
| ----------------- | ------------ | -------------------------------------------------- |
|
|
15
|
+
| Final text reply | `text` | Transform text/Markdown before Telegram rendering |
|
|
15
16
|
| `telegram_voice` | `voice` | Generate OGG/Opus and call `sendVoice` |
|
|
16
17
|
| `telegram_button` | Built-in | Attach an inline keyboard button to the final text |
|
|
17
18
|
|
|
18
19
|
Configured command-template handlers provide `template`. A string is one command; an array is ordered composition. Top-level `args` and `defaults` apply to all composed steps unless a step defines private values. The command-template default timeout applies automatically. `output` selects the primary artifact path when the handler produces a file instead of stdout text. Legacy configs may still use `pipe`, but `template: [...]` is the preferred standard shape.
|
|
19
20
|
|
|
21
|
+
## Text Handler Config
|
|
22
|
+
|
|
23
|
+
`type: "text"` handlers transform final text replies before rendering and delivery. The source text is provided on stdin and as `{text}`. Successful non-empty stdout replaces the current text. Empty stdout or handler failure keeps the previous text and records diagnostics.
|
|
24
|
+
|
|
25
|
+
This is ideal for machine translation, tone normalization, redaction, glossary expansion, compliance footers, or any other final text rewrite that should be configured outside the agent prompt. Text handlers run before Markdown/HTML rendering, so a Markdown reply remains Markdown input to the handler. They also run when the bridge finalizes an already streamed rich preview; in that path Telegram can briefly show a pre-transform preview before the final edited message is replaced with the handler output. Inline buttons are built as reply markup: visible button labels pass through the same text handler, while callback data and callback prompts remain unchanged.
|
|
26
|
+
|
|
27
|
+
Simple machine-translation handler with explicit text placeholder:
|
|
28
|
+
|
|
29
|
+
```json
|
|
30
|
+
{
|
|
31
|
+
"outboundHandlers": [
|
|
32
|
+
{
|
|
33
|
+
"type": "text",
|
|
34
|
+
"template": "/path/to/translate --lang {lang=ru} --text \"{text}\""
|
|
35
|
+
}
|
|
36
|
+
]
|
|
37
|
+
}
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
Stdin-based or subagent-backed translation can omit `{text}` from the template because the bridge also provides the source reply on stdin:
|
|
41
|
+
|
|
42
|
+
```json
|
|
43
|
+
{
|
|
44
|
+
"outboundHandlers": [
|
|
45
|
+
{
|
|
46
|
+
"type": "text",
|
|
47
|
+
"template": "/path/to/translate-stdin --lang {lang=ru}"
|
|
48
|
+
}
|
|
49
|
+
]
|
|
50
|
+
}
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
A text handler should preserve the full message unless shortening is intentional; for translation prompts, explicitly ask the tool to keep Markdown, line breaks, and details unchanged.
|
|
54
|
+
|
|
20
55
|
## Voice Handler Config
|
|
21
56
|
|
|
22
57
|
`telegram.json` may define `outboundHandlers`:
|
|
@@ -27,7 +62,8 @@ Configured command-template handlers provide `template`. A string is one command
|
|
|
27
62
|
{
|
|
28
63
|
"type": "voice",
|
|
29
64
|
"template": [
|
|
30
|
-
"/path/to/
|
|
65
|
+
"/path/to/translate-stdin --lang {lang=ru}",
|
|
66
|
+
"/path/to/tts-from-stdin --lang {lang=ru} --rate {rate=+30%} --write-media {mp3}",
|
|
31
67
|
"ffmpeg -y -i {mp3} -c:a libopus -b:a 32k -ar 16000 -ac 1 -vbr on {ogg}"
|
|
32
68
|
],
|
|
33
69
|
"output": "ogg"
|
|
@@ -36,7 +72,7 @@ Configured command-template handlers provide `template`. A string is one command
|
|
|
36
72
|
}
|
|
37
73
|
```
|
|
38
74
|
|
|
39
|
-
If a matching voice handler fails, the bridge tries the next matching `type: "voice"` handler.
|
|
75
|
+
In this example, the first step receives the `telegram_voice` text on stdin and returns translated text; the second step reads that translated text from stdin and writes `{mp3}`; the final step converts `{mp3}` to Telegram-ready `{ogg}`. If you do not need voice translation, omit the first step and call a TTS command that accepts `{text}` directly. If a matching voice handler fails, the bridge tries the next matching `type: "voice"` handler.
|
|
40
76
|
|
|
41
77
|
## Voice Markup
|
|
42
78
|
|