@llblab/pi-telegram 0.6.0 → 0.6.1
- package/README.md +3 -1
- package/docs/architecture.md +1 -1
- package/docs/attachment-handlers.md +1 -1
- package/docs/outbound-handlers.md +4 -2
- package/lib/command-templates.ts +6 -3
- package/lib/outbound-handlers.ts +49 -24
- package/lib/prompts.ts +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -171,7 +171,7 @@ A TTS plus MP3-to-OGG setup can be expressed as `template: [...]`. The bridge pr

 #### Buttons

-Button blocks attach inline quick replies to the final text. Use one independent `telegram_button` block per action; its `label` is shown in Telegram and its body is sent back to pi when tapped:
+Button blocks attach inline quick replies to the final text. Use one independent `telegram_button` block per action; its `label` is shown in Telegram and its body is sent back to pi when tapped. If the prompt should equal the label, the body can be omitted:

 ```md
 I can continue.
@@ -179,6 +179,8 @@ I can continue.
 <!-- telegram_button label="Continue"
 Continue with the current plan.
 -->
+
+<!-- telegram_button label="OK" -->
 ```

 Button prompts are routed back into the normal Telegram queue as prompt turns. Outbound handler details are documented in [`docs/outbound-handlers.md`](./docs/outbound-handlers.md).
package/docs/architecture.md
CHANGED
@@ -155,7 +155,7 @@ Telegram prompt responses use explicit delivery context to attach outbound text,

 Outbound files are sent only after the active Telegram turn completes, must be staged through the `telegram_attach` tool, are staged atomically per tool call, are checked against a default 50 MiB limit configurable through `PI_TELEGRAM_OUTBOUND_ATTACHMENT_MAX_BYTES` or `TELEGRAM_MAX_ATTACHMENT_SIZE_BYTES`, and use file-backed multipart blobs so large sends do not require preloading whole files into memory.

-Assistant-authored outbound actions use final-message markup instead of agent tool calls. Preview updates strip closed top-level HTML comments and currently open/partial top-level comment starts before rendering, so users do not see transient metadata even when streaming flushes happen after only `<`, `<!`, or `<!--`. On `agent_end`, the bridge removes top-level comments from the Markdown text reply, but treats column-zero top-level `<!-- telegram_voice ... -->` and `<!-- telegram_button ... -->` blocks specially before delivery; comments inside fenced code, quotes, lists, or indented examples stay literal. Voice maps to the first matching `outboundHandlers[]` entry with `type: "voice"`, synthesizes the block body through command-template execution, and uploads the generated OGG/Opus file via Telegram `sendVoice`; when no outbound voice handler is configured, it silently skips voice delivery. The `template: [...]` form can express TTS plus MP3-to-OGG conversion using configured templates and bridge-provided `{text}`, `{mp3}`, and `{ogg}` placeholders. Top-level `args` and `defaults` apply to all composed steps unless a step defines private values, top-level `timeout` wraps the whole sequence, and each step receives the previous step's stdout on stdin by default, without hard-coded filesystem defaults. Button blocks are built in: each `telegram_button` block becomes one inline-keyboard button on the final text, and callback clicks enqueue the configured prompt text as a normal Telegram prompt turn. This keeps technical Markdown, code, tables, formulas, and numbered lists in the text channel when appropriate while allowing TTS-friendly voice messages and tappable continuations without invoking `telegram_attach` or extra transport tools.
+Assistant-authored outbound actions use final-message markup instead of agent tool calls. Preview updates strip closed top-level HTML comments and currently open/partial top-level comment starts before rendering, so users do not see transient metadata even when streaming flushes happen after only `<`, `<!`, or `<!--`. On `agent_end`, the bridge removes top-level comments from the Markdown text reply, but treats column-zero top-level `<!-- telegram_voice ... -->` and `<!-- telegram_button ... -->` blocks specially before delivery; comments inside fenced code, quotes, lists, or indented examples stay literal, including fenced blocks with Markdown-valid indented closing fences. Voice maps to the first matching `outboundHandlers[]` entry with `type: "voice"`, synthesizes the block body through command-template execution, and uploads the generated OGG/Opus file via Telegram `sendVoice`; when no outbound voice handler is configured, it silently skips voice delivery. The `template: [...]` form can express TTS plus MP3-to-OGG conversion using configured templates and bridge-provided `{text}`, `{mp3}`, and `{ogg}` placeholders. Top-level `args` and `defaults` apply to all composed steps unless a step defines private values, top-level `timeout` wraps the whole sequence, and each step receives the previous step's stdout on stdin by default, without hard-coded filesystem defaults. Button blocks are built in: each `telegram_button` block becomes one inline-keyboard button on the final text, and callback clicks enqueue the configured prompt text, or the button label when the body is omitted, as a normal Telegram prompt turn. This keeps technical Markdown, code, tables, formulas, and numbered lists in the text channel when appropriate while allowing TTS-friendly voice messages and tappable continuations without invoking `telegram_attach` or extra transport tools.

 ## Interactive Controls

package/docs/attachment-handlers.md
CHANGED
@@ -39,7 +39,7 @@ Attachment handlers support these built-in placeholders:

 `defaults` may provide additional placeholder values such as `{lang}` or `{model}`. `args` is only a string-array declaration of supported placeholders; defaults belong in `defaults` or inline placeholders such as `{lang=ru}`. Examples prefer explicit flag-style CLIs for readability, but positional forms such as `/path/to/stt {file} {lang=ru} {model=voxtral-mini-latest}` are equally valid when the target script supports them.

-If a top-level one-step handler template has no `{file}` placeholder, the downloaded file path is appended as the last command arg
+If a top-level one-step handler template has no `{file}` placeholder, the downloaded file path is appended as the last command arg as a one-step handler convenience. Composition steps are plain command templates and do not receive implicit file-path args; include `{file}` explicitly where needed.

 ## Ordered Fallbacks

package/docs/outbound-handlers.md
CHANGED
@@ -75,7 +75,7 @@ For one-step `template` handlers, stdout remains the default result channel: the

 ## Buttons Markup

-Assistant replies can include independent button blocks. The block body is the prompt sent back to pi when the user taps the button:
+Assistant replies can include independent button blocks. The block body is the prompt sent back to pi when the user taps the button; omit the body when the prompt should equal the label:

 ```md
 I can continue.
@@ -87,11 +87,13 @@ Continue with the current plan.
 <!-- telegram_button label="Show risks"
 List the main risks first.
 -->
+
+<!-- telegram_button label="Done" -->
 ```

 Rules:

-- `telegram_button label="Label"` creates one independent button row whose prompt is the block body.
+- `telegram_button label="Label"` creates one independent button row whose prompt is the block body, or the label itself when the body is omitted.
 - The opening `<!-- telegram_button` marker must start at column zero on a top-level line outside fenced code, quotes, and lists; otherwise it is rendered as literal Markdown.
 - Use one block per button; this mirrors HTML's singular element model and avoids a nested button DSL inside comments.
 - Button actions are stored in memory with short `callback_data`; Telegram never sees the full prompt in the button payload.
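The last rule above (prompts kept in memory, only a short key sent as `callback_data`) could be sketched roughly as follows. This is an illustration under assumptions, not the package's code: `ButtonActionStore`, the `btn:` prefix, and the counter-based keys are hypothetical. The one external fact relied on is that Telegram limits `callback_data` to 1-64 bytes.

```typescript
// Hypothetical sketch: keep full prompts in memory, give Telegram only a short key.
class ButtonActionStore {
  private readonly prompts = new Map<string, string>();
  private nextId = 0;

  // Store the prompt and return a short key suitable for callback_data.
  register(prompt: string): string {
    const key = `btn:${(this.nextId++).toString(36)}`;
    this.prompts.set(key, prompt);
    return key;
  }

  // Look up the prompt for a tapped button; undefined if the key is unknown.
  resolve(key: string): string | undefined {
    return this.prompts.get(key);
  }
}

const store = new ButtonActionStore();
const longPrompt = "List the main risks first. ".repeat(10);
const callbackData = store.register(longPrompt);
```

With this shape the prompt can be arbitrarily long, because only the key has to fit within the `callback_data` limit. A restart of the process would lose the mapping, which is the usual trade-off of in-memory storage.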
package/lib/command-templates.ts
CHANGED
@@ -33,6 +33,7 @@ export interface CommandTemplateExecOptions {
   timeout?: number;
   signal?: AbortSignal;
   stdin?: string;
+  killGrace?: number;
 }

 export interface CommandTemplateExecResult {
@@ -207,18 +208,20 @@ export function execCommandTemplate(
     let killed = false;
     let settled = false;
     let timeoutId: NodeJS.Timeout | undefined;
+    let killTimeoutId: NodeJS.Timeout | undefined;
     const killProcess = (): void => {
       if (killed) return;
       killed = true;
       proc.kill("SIGTERM");
-      setTimeout(() => {
-        if (!
-      }, 5000);
+      killTimeoutId = setTimeout(() => {
+        if (!settled) proc.kill("SIGKILL");
+      }, options.killGrace ?? 5000);
     };
     const settle = (code: number): void => {
       if (settled) return;
       settled = true;
       if (timeoutId) clearTimeout(timeoutId);
+      if (killTimeoutId) clearTimeout(killTimeoutId);
       if (options.signal)
         options.signal.removeEventListener("abort", killProcess);
       resolve({ stdout, stderr, code, killed });
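The escalation pattern in this hunk (send `SIGTERM`, follow with `SIGKILL` after a grace period, and cancel the pending timer once the process settles) can be tried out without spawning a process. The sketch below is a simplified model. `Killable`, `Schedule`, and `createKillEscalation` are hypothetical names, and the injectable scheduler exists only to make the behaviour easy to observe. The real code uses `setTimeout` directly on a Node child process.

```typescript
// Hypothetical minimal shapes; the package itself works with a Node child process.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
}
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by setTimeout/clearTimeout.
const timerSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

function createKillEscalation(
  proc: Killable,
  killGrace: number = 5000,
  schedule: Schedule = timerSchedule,
) {
  let killed = false;
  let settled = false;
  let cancelKillTimer: Cancel | undefined;

  const killProcess = (): void => {
    if (killed) return;
    killed = true;
    proc.kill("SIGTERM");
    // Escalate only if the process has not settled within the grace period.
    cancelKillTimer = schedule(() => {
      if (!settled) proc.kill("SIGKILL");
    }, killGrace);
  };

  const settle = (): void => {
    if (settled) return;
    settled = true;
    // Cancel the pending escalation so no timer outlives the finished process.
    if (cancelKillTimer) cancelKillTimer();
  };

  return { killProcess, settle };
}

// Usage with a fake scheduler that records what happens instead of waiting.
const signals: string[] = [];
const events: string[] = [];
const fakeSchedule: Schedule = (_fn, ms) => {
  events.push(`scheduled:${ms}`);
  return () => {
    events.push("cancelled");
  };
};
const escalation = createKillEscalation(
  { kill: (signal) => { signals.push(signal); } },
  100,
  fakeSchedule,
);
escalation.killProcess();
escalation.settle(); // the process exited in time, so the SIGKILL timer is cancelled
```

In Node, a pending `setTimeout` keeps the event loop alive, so clearing the escalation timer on settle avoids an idle wait of up to the grace period after the child has already exited.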
package/lib/outbound-handlers.ts
CHANGED
@@ -101,6 +101,11 @@ interface TelegramTopLevelHtmlComment {
   end: number;
 }

+interface TelegramTopLevelFenceState {
+  marker: "`" | "~";
+  length: number;
+}
+
 function getMarkdownLineEnd(markdown: string, offset: number): number {
   const newlineIndex = markdown.indexOf("\n", offset);
   return newlineIndex === -1 ? markdown.length : newlineIndex + 1;
@@ -114,9 +119,29 @@ function getMarkdownLineText(
   return markdown.slice(offset, end).replace(/\r?\n$/, "");
 }

-function
-
-
+function getTopLevelOpeningFence(
+  line: string,
+): TelegramTopLevelFenceState | undefined {
+  const match = line.match(/^(?: {0,3})(`{3,}|~{3,})/);
+  const sequence = match?.[1];
+  if (!sequence) return undefined;
+  return {
+    marker: sequence[0] as "`" | "~",
+    length: sequence.length,
+  };
+}
+
+function isTopLevelClosingFence(
+  line: string,
+  fence: TelegramTopLevelFenceState,
+): boolean {
+  const match = line.match(/^(?: {0,3})(`{3,}|~{3,})([ \t]*)$/);
+  const sequence = match?.[1];
+  return (
+    !!sequence &&
+    sequence[0] === fence.marker &&
+    sequence.length >= fence.length
+  );
 }

 function collectTopLevelHtmlComments(markdown: string): {
@@ -125,18 +150,18 @@ function collectTopLevelHtmlComments(markdown: string): {
 } {
   const comments: TelegramTopLevelHtmlComment[] = [];
   let offset = 0;
-  let
+  let fence: TelegramTopLevelFenceState | undefined;
   while (offset < markdown.length) {
     const lineEnd = getMarkdownLineEnd(markdown, offset);
     const line = getMarkdownLineText(markdown, offset, lineEnd);
-    if (
-      if (line
+    if (fence) {
+      if (isTopLevelClosingFence(line, fence)) fence = undefined;
       offset = lineEnd;
       continue;
     }
-    const
-    if (
-
+    const nextFence = getTopLevelOpeningFence(line);
+    if (nextFence) {
+      fence = nextFence;
       offset = lineEnd;
       continue;
     }
@@ -174,19 +199,19 @@ function findTopLevelOpenOrPartialHtmlCommentIndex(markdown: string): number {
   const { openCommentStart } = collectTopLevelHtmlComments(markdown);
   if (openCommentStart !== undefined) return openCommentStart;
   let offset = 0;
-  let
+  let fence: TelegramTopLevelFenceState | undefined;
   while (offset < markdown.length) {
     const lineEnd = getMarkdownLineEnd(markdown, offset);
     const line = getMarkdownLineText(markdown, offset, lineEnd);
     const isLastLine = lineEnd >= markdown.length;
-    if (
-      if (line
+    if (fence) {
+      if (isTopLevelClosingFence(line, fence)) fence = undefined;
       offset = lineEnd;
       continue;
     }
-    const
-    if (
-
+    const nextFence = getTopLevelOpeningFence(line);
+    if (nextFence) {
+      fence = nextFence;
       offset = lineEnd;
       continue;
     }
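The fence handling shared by the two loops above can be exercised in isolation. The sketch below reuses the two regular expressions from the diff. `FenceState`, `getOpeningFence`, `isClosingFence`, and `topLevelLineIndexes` are illustrative names, and the line-index helper is an assumption about how one might check the behaviour, not a function from the package.

```typescript
interface FenceState {
  marker: "`" | "~";
  length: number;
}

// Opening fence: up to three spaces of indent, then 3+ backticks or tildes.
function getOpeningFence(line: string): FenceState | undefined {
  const sequence = line.match(/^(?: {0,3})(`{3,}|~{3,})/)?.[1];
  if (!sequence) return undefined;
  return { marker: sequence[0] as "`" | "~", length: sequence.length };
}

// Closing fence: same marker, at least as long, only trailing whitespace after it.
function isClosingFence(line: string, fence: FenceState): boolean {
  const sequence = line.match(/^(?: {0,3})(`{3,}|~{3,})([ \t]*)$/)?.[1];
  return (
    !!sequence &&
    sequence[0] === fence.marker &&
    sequence.length >= fence.length
  );
}

// Returns the zero-based indexes of lines that sit outside any fenced block.
function topLevelLineIndexes(markdown: string): number[] {
  const lines = markdown.split("\n");
  const result: number[] = [];
  let fence: FenceState | undefined;
  for (let index = 0; index < lines.length; index++) {
    const line = lines[index] ?? "";
    if (fence) {
      if (isClosingFence(line, fence)) fence = undefined;
      continue;
    }
    const nextFence = getOpeningFence(line);
    if (nextFence) {
      fence = nextFence;
      continue;
    }
    result.push(index);
  }
  return result;
}

const ticks = "`".repeat(3);
const sample = [
  "Intro", // 0: top level
  ticks + "md", // 1: opening fence
  '<!-- telegram_button label="OK" -->', // 2: inside the fence, stays literal
  "  " + ticks, // 3: closing fence indented by two spaces
  '<!-- telegram_button label="OK" -->', // 4: top level again
].join("\n");
const outside = topLevelLineIndexes(sample);
```

Here `outside` contains indexes 0 and 4: the two-space-indented closing fence on line 3 ends the block, so only the second button comment counts as top-level markup.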
@@ -251,9 +276,8 @@ function normalizeMarkdownAfterVoiceExtraction(markdown: string): string {

 export function stripTelegramCommentMarkupForPreview(markdown: string): string {
   const withoutClosedBlocks = replaceTopLevelHtmlComments(markdown, () => "");
-  const openBlockIndex =
-    withoutClosedBlocks
-  );
+  const openBlockIndex =
+    findTopLevelOpenOrPartialHtmlCommentIndex(withoutClosedBlocks);
   const previewMarkdown =
     openBlockIndex >= 0
       ? withoutClosedBlocks.slice(0, openBlockIndex)
@@ -265,9 +289,8 @@ export function stripTelegramCommentMarkupForDelivery(
   markdown: string,
 ): string {
   const withoutClosedBlocks = replaceTopLevelHtmlComments(markdown, () => "");
-  const openBlockIndex =
-    withoutClosedBlocks
-  );
+  const openBlockIndex =
+    findTopLevelOpenOrPartialHtmlCommentIndex(withoutClosedBlocks);
   const deliveryMarkdown =
     openBlockIndex >= 0
       ? withoutClosedBlocks.slice(0, openBlockIndex)
@@ -343,7 +366,9 @@ function getVoiceReplyCompositionStepTimeout(
 ): number {
   const remaining = getRemainingVoiceReplyTimeout(handlerTimeout, startedAt);
   const stepTimeout = getVoiceReplyConfiguredTimeout(step);
-  return stepTimeout === undefined
+  return stepTimeout === undefined
+    ? remaining
+    : Math.min(stepTimeout, remaining);
 }

 function formatVoiceReplyExecutionFailure(
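The new return expression caps each composition step at whatever is left of the handler-level budget. A small worked example, with loud caveats: `remainingBudget` and `stepTimeoutFor` are hypothetical stand-ins, and the body of `getRemainingVoiceReplyTimeout` is not shown in the diff, so the "never negative" clamp below is an assumption.

```typescript
// Hypothetical helper: time left in the handler-level budget, never negative (assumed).
function remainingBudget(
  handlerTimeout: number,
  startedAt: number,
  now: number,
): number {
  return Math.max(0, handlerTimeout - (now - startedAt));
}

// A step may configure its own timeout, but it never outlives the overall budget.
function stepTimeoutFor(
  stepTimeout: number | undefined,
  handlerTimeout: number,
  startedAt: number,
  now: number,
): number {
  const remaining = remainingBudget(handlerTimeout, startedAt, now);
  return stepTimeout === undefined
    ? remaining
    : Math.min(stepTimeout, remaining);
}

// 30 s handler budget that started at t = 0 ms.
const noStepTimeout = stepTimeoutFor(undefined, 30000, 0, 5000); // 25000
const longStep = stepTimeoutFor(60000, 30000, 0, 20000); // 10000
const shortStep = stepTimeoutFor(2000, 30000, 0, 20000); // 2000
```

A step without its own timeout inherits the remaining budget, a step asking for more than is left is cut down to it, and a shorter step timeout is honoured as configured.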
@@ -704,8 +729,8 @@ function parseButtonsCommentRows(
   body: string | undefined,
 ): TelegramOutboundButtonAction[][] {
   const attributes = parseButtonsCommentAttributes(head);
-
-
+  if (!attributes.label) return [];
+  const prompt = body?.trim() || attributes.label;
   return [[{ text: attributes.label, prompt }]];
 }

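The label fallback added here (`body?.trim() || attributes.label`) can be demonstrated with a self-contained sketch. `parseButtonBlock` and its regular expression are hypothetical, since the package's own `parseButtonsCommentAttributes` is not shown in the diff. Only the fallback rule and the "no label, no button" check mirror the changed lines.

```typescript
interface ButtonAction {
  text: string;
  prompt: string;
}

// Hypothetical single-block parser; only the fallback logic follows the diff.
function parseButtonBlock(block: string): ButtonAction | undefined {
  const match = block.match(
    /^<!--\s*telegram_button\s+label="([^"]*)"\s*([\s\S]*?)-->$/,
  );
  if (!match) return undefined;
  const label = match[1];
  // A block without a usable label produces no button.
  if (!label) return undefined;
  // An empty or whitespace-only body falls back to the label.
  const prompt = match[2]?.trim() || label;
  return { text: label, prompt };
}

const bare = parseButtonBlock('<!-- telegram_button label="OK" -->');
const withBody = parseButtonBlock(
  '<!-- telegram_button label="Continue"\nContinue with the current plan.\n-->',
);
const unlabeled = parseButtonBlock('<!-- telegram_button label="" -->');
```

In this sketch `bare` yields the prompt `OK`, `withBody` yields `Continue with the current plan.`, and `unlabeled` yields no button at all.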
package/lib/prompts.ts
CHANGED
@@ -17,7 +17,7 @@ Telegram bridge extension is active.
 - Do not assume mentioning a local file path in plain text will send it to Telegram. Use telegram_attach.
 - For Telegram-native outbound actions, use hidden top-level Markdown comments instead of agent-side tool calls: write a normal answer plus correctly formatted column-zero \`telegram_voice\` or \`telegram_button\` blocks outside code, quotes, and lists. The bridge handles delivery after \`agent_end\`, so do not call or register transport/TTS/text-to-OGG tools for these actions.
 - A \`telegram_voice\` block body is the text to synthesize through the extension's configured outbound-handler pipeline. It may be a short companion summary when useful, but no specific summary format is required. Keep it TTS-friendly; avoid raw Markdown, code, formulas, tables, or long lists.
-- Button blocks should contain quick reply prompts the user can tap; use independent blocks like \`<!-- telegram_button label="OK"\nPrompt text\n
+- Button blocks should contain quick reply prompts the user can tap; use independent blocks like \`<!-- telegram_button label="OK"\nPrompt text\n-->\`, or \`<!-- telegram_button label="OK" -->\` when the prompt should equal the label. The callback prompt is routed back as a normal Telegram turn.`;

 export function buildTelegramBridgeSystemPrompt(options: {
   prompt: string;