arisa 3.0.3 → 3.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/AGENTS.md CHANGED
@@ -5,7 +5,7 @@
5
5
  - `src/core/agent/*`: Pi Agent sessions, one per authorized chat.
6
6
  - `src/core/artifacts/*`: every incoming or generated message/file becomes an artifact.
7
7
  - `src/core/tools/*`: CLI tool registry, help lookup, config writes, execution.
8
- - `cli/*`: isolated tools. Each tool has `package.json`, `config.js`, `tool.manifest.json`, and `index.js`.
8
+ - `tools/*`: isolated tools. Each tool has `package.json`, `config.js`, `tool.manifest.json`, and `index.js`.
9
9
 
10
10
  ## Main rule: everything is piped through artifacts
11
11
  A pipe transforms one input artifact into one output artifact.
@@ -70,20 +70,27 @@ Example manual pipe:
70
70
  ## Missing config flow
71
71
  If `run_tool` returns `missingConfig`, the agent should:
72
72
  1. ask the user naturally in Telegram for the missing value
73
- 2. write the value into `cli/<tool>/config.js` with `set_tool_config`
73
+ 2. write the value into `~/.arisa/tools/<tool>/config.js` with `set_tool_config`
74
74
  3. retry the tool
75
75
 
76
76
  Do not assume a rigid question/answer protocol. Continue the conversation naturally and infer the config value from the user reply when possible.
77
77
 
78
- ## Telegram security
79
- - The first chat that messages the bot is authorized if `telegram.maxChatIds` allows it.
80
- - Do not authorize more chats than configured.
81
- - Access control is based on chat ids, not usernames.
78
+ ## Long-running work
79
+ If a task is likely to take noticeable time (for example creating a new tool, editing multiple files, or doing multi-step work), the agent should first acknowledge the request briefly and naturally, then continue the work.
80
+
81
+ The acknowledgment should:
82
+ - be short and clear
83
+ - tell the user the work is starting
84
+ - mention when the task may take a while
85
+
86
+ Examples:
87
+ - "Understood. I'll build that tool now. This may take a couple of minutes."
88
+ - "Got it. I'll inspect the project and make the change now."
82
89
 
83
90
  ## Tool creation
84
91
  Do not assume specific future tools such as YouTube support exist.
85
92
  If the user asks for a capability that is not currently available, first check whether an existing registered tool can satisfy the task.
86
- If no existing tool can do it, the default attitude should be to propose creating a new CLI tool under `cli/<tool-name>` following the project conventions.
93
+ If no existing tool can do it, the default attitude should be to propose creating a new CLI tool under `tools/<tool-name>` following the project conventions.
87
94
  All newly created tools must document their help text, usage instructions, manifests, and user-facing operational strings in English.
88
95
  Do not stop at "I cannot do that" when the task is realistically implementable through a new tool.
89
96
  Prefer responses like:
@@ -101,7 +108,7 @@ When creating or editing tools, follow the shared path helpers in `src/runtime/p
101
108
  Consult the local skill for that workflow when building new tools.
102
109
 
103
110
  ## Safety
104
- - Do not install or run arbitrary tools outside registered `cli/*` manifests in V1.
111
+ - Do not install or run arbitrary tools outside registered `tools/*` manifests in V1.
105
112
  - Prefer tool manifests and CLI help over assumptions.
106
113
  - Keep tool configs inside `~/.arisa/tools/<tool>/config.js`.
107
114
  - Be proactive about extending capabilities, but do it through the project's tool architecture, not through ad hoc one-off behavior.
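The missing-config flow above (ask the user, persist with `set_tool_config`, retry) can be sketched as a small loop. This is a hedged sketch only: `runTool`, `askUser`, and `setToolConfig` are hypothetical stand-ins for the agent's `run_tool` call, its conversational reply, and `set_tool_config`, not actual Arisa APIs.

```javascript
// Hedged sketch of the missing-config flow described above.
// runTool, askUser, and setToolConfig are hypothetical stand-ins,
// not actual Arisa APIs.
async function runWithConfigRecovery({ runTool, askUser, setToolConfig }, name, request) {
  const first = await runTool({ name, request });
  if (!first.missingConfig?.length) return first;

  // Ask naturally for each missing value, persist it, then retry once.
  for (const key of first.missingConfig) {
    const value = await askUser(`I need ${key} to run ${name}. What should I use?`);
    await setToolConfig(name, key, value);
  }
  return runTool({ name, request });
}
```

The point of the loop is that the retry happens only after every missing key has been written, so a tool with several missing values is not re-run once per key.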
package/README.md CHANGED
@@ -6,11 +6,11 @@ Arisa is a personal Telegram assistant powered by Pi Agent.
6
6
 
7
7
  The initial inspiration was [OpenClaw](https://github.com/openclaw/openclaw). OpenClaw has interesting ideas but carries too much weight: when it generates tools they end up disorganized, and the overall framework feels overloaded for personal use.
8
8
 
9
- The real heart of OpenClaw is Pi Agent a [minimal terminal coding harness](https://www.youtube.com/watch?v=Dli5slNaJu0) that lets an AI agent reason and act with very little infrastructure. That part is genuinely good.
9
+ The real heart of OpenClaw is Pi Agent: a [minimal terminal coding harness](https://www.youtube.com/watch?v=Dli5slNaJu0) that lets an AI agent reason and act with very little infrastructure. That part is genuinely good.
10
10
 
11
11
  Telegram bots, on the other hand, work extremely well as a human interface. Simple, reliable, always in your pocket.
12
12
 
13
- So Arisa keeps exactly those two things Pi Agent and Telegram and nothing more. No pre-loaded opinions about what the agent should do or which tools it should have. The idea is that the agent builds itself around the user, not the other way around.
13
+ So Arisa keeps exactly those two things (Pi Agent & Telegram) and nothing more. No pre-loaded opinions about what the agent should do or which tools it should have. The idea is that the agent builds itself around the user, not the other way around.
14
14
 
15
15
  It is designed around a simple idea:
16
16
 
@@ -46,7 +46,7 @@ This distinction is important. Some transformations belong to the transport/inpu
46
46
  - media is stored as artifacts
47
47
 
48
48
  ### Tool model
49
- Each tool lives in its own folder under `cli/<tool-name>` and contains:
49
+ Each tool lives in its own folder under `tools/<tool-name>` and contains:
50
50
 
51
51
  - `package.json`
52
52
  - `config.js`
@@ -57,7 +57,7 @@ Each tool is isolated from the root project and from other tools.
57
57
  That isolation is part of the architecture:
58
58
 
59
59
  - each tool has its own folder
60
- - each tool keeps its own `config.js`
60
+ - each tool has a local `config.js` only for defaults/template values
61
61
  - each tool can have its own dependencies
62
62
  - one tool can be changed or replaced without tightly coupling the rest of the system
63
63
 
@@ -74,6 +74,7 @@ node index.js run --request-file <json>
74
74
  - artifact index is stored in `~/.arisa/state/artifacts.json`
75
75
  - incoming Telegram attachments are stored directly in `~/.arisa/artifacts/`
76
76
  - tool-specific secrets/config live in `~/.arisa/tools/<tool>/config.js`
77
+ - bundled tools and generated tools should both use the same source layout under `tools/<tool>/`
77
78
  - tool runtime temp files and generated outputs live in `~/.arisa/tmp/tools/<tool>/`
78
79
  - durable files should end up in `~/.arisa/artifacts/`
79
80
  - Pi authentication can use either:
@@ -92,6 +93,16 @@ Then run:
92
93
  arisa
93
94
  ```
94
95
 
96
+ Command modes:
97
+
98
+ ```bash
99
+ arisa # foreground, blocking
100
+ arisa start # start in background
101
+ arisa stop # stop background service
102
+ arisa status # show background service status
103
+ arisa flush # remove ~/.arisa
104
+ ```
105
+
95
106
  ## Bootstrap flow
96
107
 
97
108
  On first run, Arisa will:
@@ -146,7 +157,7 @@ src/
146
157
  runtime/ bootstrap + app startup
147
158
  transport/ Telegram integration
148
159
  core/ agent, tools, artifacts, config
149
- cli/
160
+ tools/
150
161
  openai-transcribe/
151
162
  openai-tts/
152
163
  ~/.arisa/
@@ -158,7 +169,9 @@ cli/
158
169
 
159
170
  ## Philosophy
160
171
 
161
- The agent should not come preloaded with vices or assumptions. It starts minimal and grows through real use shaped by the user, not by the framework.
172
+ The agent should not come preloaded with vices or assumptions. It starts minimal and grows through real use: shaped by the user, not by the framework.
173
+
174
+ For consistency, the entire Arisa codebase was built using Pi Agent itself, running on Codex: the model bundled with ChatGPT Plus. The goal was to see how far a model that most people already have access to could go when given a good harness. The experience was genuinely satisfying: having the agent reason about, extend, and improve its own system is exactly the kind of recursive loop the project is designed for.
162
175
 
163
176
  When a capability is missing:
164
177
 
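The run contract described in the README above (`node index.js run --request-file <json>`, one JSON result on stdout) implies a very small tool entry point. A hedged sketch, assuming an `{ ok, output }` result shape for illustration:

```javascript
// Hedged sketch of a minimal tool entry point for the contract above.
// The { ok, output } result shape is an assumption for illustration.
import { readFile } from "node:fs/promises";

export function handleRequest(request) {
  // Stand-in for real tool work: upper-case the incoming text.
  return { ok: true, output: { text: (request.text || "").toUpperCase() } };
}

// CLI entry: node index.js run --request-file <json>
const flagIndex = process.argv.indexOf("--request-file");
if (flagIndex !== -1) {
  const request = JSON.parse(await readFile(process.argv[flagIndex + 1], "utf8"));
  // A single JSON document on stdout is all the registry reads back.
  console.log(JSON.stringify(handleRequest(request)));
}
```

Keeping the work in a plain function and the CLI parsing in a thin wrapper makes the tool testable without spawning a process.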
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "arisa",
3
- "version": "3.0.3",
3
+ "version": "3.0.6",
4
4
  "description": "Telegram + Pi Agent modular assistant",
5
5
  "type": "module",
6
6
  "main": "src/index.js",
@@ -22,12 +22,15 @@
22
22
  "agent",
23
23
  "SOUL.md",
24
24
  "tinyclaw",
25
+ "nullclaw",
26
+ "picoclaw",
27
+ "zeroclaw",
25
28
  "jarvis",
26
29
  "AGENTS.md",
27
30
  "clasen"
28
31
  ],
29
32
  "author": "",
30
- "license": "ISC",
33
+ "license": "GLP",
31
34
  "packageManager": "pnpm@10.32.1",
32
35
  "dependencies": {
33
36
  "@mariozechner/pi-coding-agent": "^0.65.0",
package/src/core/agent/agent-manager.js CHANGED
@@ -3,14 +3,15 @@ import { mkdir, unlink } from "node:fs/promises";
3
3
  import { createAgentSession, SessionManager, defineTool } from "@mariozechner/pi-coding-agent";
4
4
  import { Type } from "@sinclair/typebox";
5
5
  import { createPiRuntime, hasProviderAuth } from "./pi-runtime.js";
6
-
7
- const agentDir = path.resolve("data/pi-agent");
6
+ import { loadProjectInstructions } from "./project-instructions.js";
7
+ import { getChatDir, piAgentDir as agentDir } from "../../runtime/paths.js";
8
8
 
9
9
  export class AgentManager {
10
- constructor({ config, artifactStore, toolRegistry }) {
10
+ constructor({ config, artifactStore, toolRegistry, logger }) {
11
11
  this.config = config;
12
12
  this.artifactStore = artifactStore;
13
13
  this.toolRegistry = toolRegistry;
14
+ this.logger = logger;
14
15
  this.sessions = new Map();
15
16
  }
16
17
 
@@ -20,6 +21,7 @@ export class AgentManager {
20
21
  }
21
22
 
22
23
  async validatePiAgent() {
24
+ this.logger?.log("agent", "validating Pi session");
23
25
  const { authStorage, modelRegistry } = createPiRuntime({
24
26
  provider: this.config.pi.provider,
25
27
  apiKey: this.config.pi.apiKey
@@ -42,7 +44,10 @@ export class AgentManager {
42
44
  }
43
45
 
44
46
  async getSessionContext(chatId, telegram) {
45
- if (this.sessions.has(chatId)) return this.sessions.get(chatId);
47
+ if (this.sessions.has(chatId)) {
48
+ this.logger?.log("agent", `reusing session for chat ${chatId}`);
49
+ return this.sessions.get(chatId);
50
+ }
46
51
 
47
52
  await mkdir(agentDir, { recursive: true });
48
53
  const { authStorage, modelRegistry } = createPiRuntime({
@@ -55,9 +60,10 @@ export class AgentManager {
55
60
  throw new Error(`No auth found for ${this.config.pi.provider}. Re-run bootstrap and complete login for this provider before Telegram starts.`);
56
61
  }
57
62
 
58
- const cwd = path.resolve("data/chats", String(chatId));
63
+ const cwd = getChatDir(chatId);
59
64
  await mkdir(cwd, { recursive: true });
60
65
 
66
+ this.logger?.log("agent", `creating session for chat ${chatId}`);
61
67
  const customTools = this.createTools(telegram);
62
68
  const { session } = await createAgentSession({
63
69
  cwd,
@@ -69,6 +75,10 @@ export class AgentManager {
69
75
  sessionManager: SessionManager.continueRecent(cwd)
70
76
  });
71
77
 
78
+ const instructions = await loadProjectInstructions();
79
+ this.logger?.log("agent", `injecting project instructions for chat ${chatId}`);
80
+ await session.prompt(`${instructions}\n\nAcknowledge with exactly: OK`);
81
+
72
82
  const ctx = { session };
73
83
  this.sessions.set(chatId, ctx);
74
84
  return ctx;
@@ -123,6 +133,7 @@ export class AgentManager {
123
133
  }),
124
134
  execute: async (_id, params) => {
125
135
  await this.toolRegistry.load();
136
+ this.logger?.log("agent", `run_tool ${params.name}`);
126
137
  let artifact = null;
127
138
  if (params.artifactId) {
128
139
  artifact = await this.artifactStore.get(params.artifactId);
@@ -168,13 +179,22 @@ export class AgentManager {
168
179
  }
169
180
  }),
170
181
  defineTool({
171
- name: "send_audio_reply",
172
- label: "Send audio reply",
173
- description: "Generate speech from text with a CLI tool and send it to the current Telegram chat.",
174
- parameters: Type.Object({ text: Type.String(), toolName: Type.Optional(Type.String()) }),
182
+ name: "send_media_reply",
183
+ label: "Send media reply",
184
+ description: "Run a CLI tool that generates a file and send it to the current Telegram chat using the tool's delivery hint or an explicit method.",
185
+ parameters: Type.Object({
186
+ text: Type.String(),
187
+ toolName: Type.Optional(Type.String()),
188
+ method: Type.Optional(Type.Union([
189
+ Type.Literal("voice"),
190
+ Type.Literal("audio"),
191
+ Type.Literal("document")
192
+ ]))
193
+ }),
175
194
  execute: async (_id, params) => {
176
195
  await this.toolRegistry.load();
177
196
  const toolName = params.toolName || "openai-tts";
197
+ this.logger?.log("agent", `send_media_reply via ${toolName}`);
178
198
  const result = await this.toolRegistry.run({
179
199
  name: toolName,
180
200
  request: { text: params.text, args: {} }
@@ -182,9 +202,13 @@ export class AgentManager {
182
202
  if (!result.ok || !result.output?.filePath) {
183
203
  return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }], details: result };
184
204
  }
185
- await telegram.sendAudio(result.output.filePath, params.text);
205
+ const method = params.method || result.output?.delivery?.method || "audio";
206
+ await telegram.sendMedia(result.output.filePath, { method, caption: params.text });
186
207
  await unlink(result.output.filePath).catch(() => {});
187
- return { content: [{ type: "text", text: "Audio enviado por Telegram." }], details: result };
208
+ return {
209
+ content: [{ type: "text", text: `Media sent to Telegram as ${method}.` }],
210
+ details: { ...result, sent: { method } }
211
+ };
188
212
  }
189
213
  })
190
214
  ];
package/src/core/agent/project-instructions.js ADDED
@@ -0,0 +1,11 @@
1
+ import { readFile } from "node:fs/promises";
2
+ import { fileURLToPath } from "node:url";
3
+
4
+ const instructionsPath = fileURLToPath(new URL("../../../AGENTS.md", import.meta.url));
5
+ let cachedInstructions = null;
6
+
7
+ export async function loadProjectInstructions() {
8
+ if (cachedInstructions !== null) return cachedInstructions;
9
+ cachedInstructions = await readFile(instructionsPath, "utf8");
10
+ return cachedInstructions;
11
+ }
package/src/core/tools/tool-registry.js CHANGED
@@ -4,7 +4,7 @@ import { spawn } from "node:child_process";
4
4
  import { getToolConfigPath, getToolTmpDir } from "../../runtime/paths.js";
5
5
  import { loadToolConfig, parseConfigModule, writeToolConfig } from "./tool-config.js";
6
6
 
7
- const cliRoot = path.resolve("cli");
7
+ const toolsRoot = path.resolve("tools");
8
8
 
9
9
  function runProcess(command, args, options = {}) {
10
10
  return new Promise((resolve) => {
@@ -18,22 +18,25 @@ function runProcess(command, args, options = {}) {
18
18
  }
19
19
 
20
20
  export class ToolRegistry {
21
- constructor() {
21
+ constructor({ logger } = {}) {
22
+ this.logger = logger;
22
23
  this.tools = new Map();
23
24
  }
24
25
 
25
26
  async load() {
26
27
  this.tools.clear();
28
+
27
29
  let entries = [];
28
30
  try {
29
- entries = await readdir(cliRoot, { withFileTypes: true });
31
+ entries = await readdir(toolsRoot, { withFileTypes: true });
30
32
  } catch {
33
+ this.logger?.log("tools", `tools directory not found: ${toolsRoot}`);
31
34
  return;
32
35
  }
33
36
 
34
37
  for (const entry of entries) {
35
38
  if (!entry.isDirectory()) continue;
36
- const toolDir = path.join(cliRoot, entry.name);
39
+ const toolDir = path.join(toolsRoot, entry.name);
37
40
  const manifestPath = path.join(toolDir, "tool.manifest.json");
38
41
  const configPath = path.join(toolDir, "config.js");
39
42
  try {
@@ -54,6 +57,8 @@ export class ToolRegistry {
54
57
  // ignore invalid tool dirs in v1
55
58
  }
56
59
  }
60
+
61
+ this.logger?.log("tools", `loaded ${this.tools.size} tool(s)`);
57
62
  }
58
63
 
59
64
  list() {
@@ -91,6 +96,7 @@ export class ToolRegistry {
91
96
  async run({ name, request }) {
92
97
  const tool = this.get(name);
93
98
  if (!tool) throw new Error(`Tool not found: ${name}`);
99
+ this.logger?.log("tools", `running ${name}`);
94
100
  const tmpDir = getToolTmpDir(name);
95
101
  await mkdir(tmpDir, { recursive: true });
96
102
  const requestFile = path.join(tmpDir, `.request-${Date.now()}.json`);
@@ -102,6 +108,7 @@ export class ToolRegistry {
102
108
  await unlink(requestFile).catch(() => {});
103
109
  try {
104
110
  const parsed = JSON.parse(result.stdout || result.stderr);
111
+ this.logger?.log("tools", `${name} -> ${parsed.ok === false ? "error" : "ok"}`);
105
112
  return parsed;
106
113
  } catch {
107
114
  return {
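The registry's result handling above treats a tool's stdout (falling back to stderr) as a single JSON document. A sketch of that parsing step; the non-JSON fallback shape here is an assumption, since the diff truncates the `catch` branch:

```javascript
// Sketch of the stdout-parsing step shown above. The fallback result
// shape is an assumption for illustration; the diff truncates the
// actual catch branch.
function parseToolOutput(stdout, stderr) {
  try {
    return JSON.parse(stdout || stderr);
  } catch {
    return { ok: false, error: "tool produced non-JSON output", stdout, stderr };
  }
}
```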
package/src/index.js CHANGED
@@ -2,13 +2,22 @@
2
2
 
3
3
  import { bootstrapIfNeeded } from "./runtime/bootstrap.js";
4
4
  import { createApp } from "./runtime/create-app.js";
5
+ import { createLogger } from "./runtime/logger.js";
6
+ import { getServiceStatus, registerServiceProcess, startService, stopService } from "./runtime/service-manager.js";
7
+ import { flushArisaHome } from "./runtime/flush.js";
5
8
 
6
- const forceBootstrap = process.argv.includes("--bootstrap");
9
+ const args = process.argv.slice(2);
10
+ const command = args.find((arg) => !arg.startsWith("--")) || "run";
11
+ const forceBootstrap = args.includes("--bootstrap");
12
+ const verbose = args.includes("--verbose");
13
+ const serviceRunner = args.includes("--service-runner");
14
+ const logger = createLogger({ verbose });
7
15
 
8
- async function main() {
16
+ async function runForeground() {
17
+ logger.log("app", `starting${verbose ? " in verbose mode" : ""}`);
9
18
  await bootstrapIfNeeded({ force: forceBootstrap });
10
19
  try {
11
- const app = await createApp();
20
+ const app = await createApp({ logger });
12
21
  await app.start();
13
22
  } catch (error) {
14
23
  const message = error instanceof Error ? error.message : String(error);
@@ -16,7 +25,7 @@ async function main() {
16
25
  console.log(`\n${message}\n`);
17
26
  console.log("Reopening bootstrap so you can provide a Pi API key or switch to a provider you already authenticated with.\n");
18
27
  await bootstrapIfNeeded({ force: true });
19
- const app = await createApp();
28
+ const app = await createApp({ logger });
20
29
  await app.start();
21
30
  return;
22
31
  }
@@ -24,4 +33,57 @@ async function main() {
24
33
  }
25
34
  }
26
35
 
36
+ async function main() {
37
+ if (serviceRunner) {
38
+ await registerServiceProcess();
39
+ await runForeground();
40
+ return;
41
+ }
42
+
43
+ if (command === "start") {
44
+ await bootstrapIfNeeded({ force: forceBootstrap });
45
+ const result = await startService({ verbose });
46
+ if (!result.ok) {
47
+ console.log(`Arisa is already running in background (pid ${result.pid}).`);
48
+ return;
49
+ }
50
+ console.log(`Arisa started in background (pid ${result.pid}).`);
51
+ console.log(`Log file: ${result.logFile}`);
52
+ return;
53
+ }
54
+
55
+ if (command === "stop") {
56
+ const result = await stopService();
57
+ if (!result.ok) {
58
+ console.log("Arisa is not running.");
59
+ return;
60
+ }
61
+ console.log(`Arisa stopped (pid ${result.pid}).`);
62
+ return;
63
+ }
64
+
65
+ if (command === "status") {
66
+ const status = await getServiceStatus();
67
+ if (!status.running) {
68
+ console.log("Arisa is not running.");
69
+ return;
70
+ }
71
+ console.log(`Arisa is running in background (pid ${status.pid}).`);
72
+ return;
73
+ }
74
+
75
+ if (command === "flush") {
76
+ const status = await getServiceStatus();
77
+ if (status.running) {
78
+ console.log(`Arisa is running (pid ${status.pid}). Stop it before flush.`);
79
+ return;
80
+ }
81
+ const result = await flushArisaHome();
82
+ console.log(`Arisa state removed: ${result.path}`);
83
+ return;
84
+ }
85
+
86
+ await runForeground();
87
+ }
88
+
27
89
  await main();
package/src/runtime/bootstrap.js CHANGED
@@ -15,6 +15,23 @@
15
15
  }
16
16
  }
17
17
 
18
+ function sortBootstrapModels(provider, models) {
19
+ const preferred = {
20
+ "openai-codex": ["gpt-5.4"]
21
+ };
22
+
23
+ const priority = preferred[provider] || [];
24
+ const positions = new Map(models.map((model, index) => [model.id, index]));
25
+
26
+ return [...models].sort((a, b) => {
27
+ const aIndex = priority.indexOf(a.id);
28
+ const bIndex = priority.indexOf(b.id);
29
+ const aRank = aIndex === -1 ? Number.MAX_SAFE_INTEGER : aIndex;
30
+ const bRank = bIndex === -1 ? Number.MAX_SAFE_INTEGER : bIndex;
31
+ if (aRank !== bRank) return aRank - bRank;
32
+ return (positions.get(b.id) || 0) - (positions.get(a.id) || 0);
33
+ });
34
+ }
18
35
 
19
36
  async function maybeOpenExternal(url) {
20
37
  if (!url) return;
@@ -105,7 +122,7 @@ export async function bootstrapIfNeeded({ force = false } = {}) {
105
122
 
106
123
  const selectedProviderIndex = Number(await ask("Select Pi provider by number", "1"));
107
124
  const selectedProvider = providers[Math.max(0, Math.min(providers.length - 1, selectedProviderIndex - 1))];
108
- const models = listProviderModels(selectedProvider.provider, runtime);
125
+ const models = sortBootstrapModels(selectedProvider.provider, listProviderModels(selectedProvider.provider, runtime));
109
126
  console.log(`\nAvailable models for ${selectedProvider.provider}:`);
110
127
  models.forEach((model, index) => {
111
128
  const capabilities = [model.reasoning ? "reasoning" : null, model.input?.includes("image") ? "image" : null].filter(Boolean).join(", ");
@@ -153,7 +170,8 @@ export async function bootstrapIfNeeded({ force = false } = {}) {
153
170
  telegram: {
154
171
  apiKey: telegramApiKey,
155
172
  maxChatIds: telegramMaxChatIds,
156
- authorizedChatIds: []
173
+ authorizedChatIds: [],
174
+ chatMeta: {}
157
175
  },
158
176
  pi: {
159
177
  provider: selectedProvider.provider,
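The `sortBootstrapModels` helper added above can be exercised standalone. Re-stating it from the diff shows the effect of its comparator: provider-preferred ids (`"gpt-5.4"` for `"openai-codex"` in the diff) sort first, and the remaining models come out in reverse registry order, because the tie-breaker subtracts positions as `b - a`. The `gpt-4o` and `o3` ids below are hypothetical registry entries for illustration.

```javascript
// Re-stated from the bootstrap diff above so its ordering can be
// checked standalone.
function sortBootstrapModels(provider, models) {
  const preferred = {
    "openai-codex": ["gpt-5.4"]
  };

  const priority = preferred[provider] || [];
  const positions = new Map(models.map((model, index) => [model.id, index]));

  return [...models].sort((a, b) => {
    const aIndex = priority.indexOf(a.id);
    const bIndex = priority.indexOf(b.id);
    const aRank = aIndex === -1 ? Number.MAX_SAFE_INTEGER : aIndex;
    const bRank = bIndex === -1 ? Number.MAX_SAFE_INTEGER : bIndex;
    if (aRank !== bRank) return aRank - bRank;
    // Tie-breaker reverses registry order for non-preferred models.
    return (positions.get(b.id) || 0) - (positions.get(a.id) || 0);
  });
}

// gpt-4o and o3 are hypothetical ids for illustration.
const ordered = sortBootstrapModels("openai-codex", [
  { id: "gpt-4o" },
  { id: "gpt-5.4" },
  { id: "o3" }
]);
// preferred id first, then remaining ids in reverse registry order
```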
package/src/runtime/create-app.js CHANGED
@@ -4,18 +4,22 @@ import { ToolRegistry } from "../core/tools/tool-registry.js";
4
4
  import { AgentManager } from "../core/agent/agent-manager.js";
5
5
  import { createTelegramBot } from "../transport/telegram/bot.js";
6
6
 
7
- export async function createApp() {
7
+ export async function createApp({ logger } = {}) {
8
+ logger?.log("app", "loading config");
8
9
  const config = await loadConfig();
9
10
  const artifactStore = new ArtifactStore();
10
- const toolRegistry = new ToolRegistry();
11
+ const toolRegistry = new ToolRegistry({ logger });
11
12
  await toolRegistry.load();
13
+ logger?.log("app", `loaded ${toolRegistry.list().length} tools`);
12
14
 
13
- const agentManager = new AgentManager({ config, artifactStore, toolRegistry });
14
- const bot = await createTelegramBot({ config, artifactStore, toolRegistry, agentManager, saveConfig, updateConfig });
15
+ const agentManager = new AgentManager({ config, artifactStore, toolRegistry, logger });
16
+ const bot = await createTelegramBot({ config, artifactStore, toolRegistry, agentManager, saveConfig, updateConfig, logger });
15
17
 
16
18
  return {
17
19
  async start() {
20
+ logger?.log("app", `validating Pi model ${config.pi.provider}/${config.pi.model}`);
18
21
  await agentManager.validatePiAgent();
22
+ logger?.log("app", "starting Telegram bot");
19
23
  await bot.start();
20
24
  }
21
25
  };
package/src/runtime/flush.js ADDED
@@ -0,0 +1,7 @@
1
+ import { rm } from "node:fs/promises";
2
+ import { arisaHomeDir } from "./paths.js";
3
+
4
+ export async function flushArisaHome() {
5
+ await rm(arisaHomeDir, { recursive: true, force: true });
6
+ return { ok: true, path: arisaHomeDir };
7
+ }
package/src/runtime/logger.js ADDED
@@ -0,0 +1,20 @@
1
+ export function createLogger({ verbose = false } = {}) {
2
+ function stamp() {
3
+ return new Date().toISOString().replace("T", " ").slice(0, 19);
4
+ }
5
+
6
+ function format(scope, message) {
7
+ return `[${stamp()}]${scope ? ` [${scope}]` : ""} ${message}`;
8
+ }
9
+
10
+ return {
11
+ verbose,
12
+ log(scope, message) {
13
+ if (!verbose) return;
14
+ console.log(format(scope, message));
15
+ },
16
+ error(scope, message) {
17
+ console.error(format(scope, message));
18
+ }
19
+ };
20
+ }
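The new logger above gates `log()` on the `--verbose` flag while `error()` always prints. Re-stating the factory from the diff shows that gating in use:

```javascript
// Re-stated from the logger diff above so it can run standalone:
// log() is gated on the verbose flag; error() always prints.
function createLogger({ verbose = false } = {}) {
  function stamp() {
    return new Date().toISOString().replace("T", " ").slice(0, 19);
  }

  function format(scope, message) {
    return `[${stamp()}]${scope ? ` [${scope}]` : ""} ${message}`;
  }

  return {
    verbose,
    log(scope, message) {
      if (!verbose) return;
      console.log(format(scope, message));
    },
    error(scope, message) {
      console.error(format(scope, message));
    }
  };
}

// Usage: a quiet (default) logger drops log() calls entirely.
const quiet = createLogger();
quiet.log("app", "never printed");
```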
package/src/runtime/paths.js CHANGED
@@ -5,11 +5,19 @@ import path from "node:path";
5
5
  export const arisaHomeDir = path.join(os.homedir(), ".arisa");
6
6
  export const stateDir = path.join(arisaHomeDir, "state");
7
7
  export const configFile = path.join(stateDir, "config.json");
8
+ export const servicePidFile = path.join(stateDir, "arisa.pid");
9
+ export const serviceLogFile = path.join(stateDir, "arisa.log");
8
10
  export const artifactsDir = path.join(arisaHomeDir, "artifacts");
9
11
  export const artifactsIndexFile = path.join(stateDir, "artifacts.json");
12
+ export const piAgentDir = path.join(stateDir, "pi-agent");
13
+ export const chatsDir = path.join(stateDir, "chats");
10
14
  export const toolsDir = path.join(arisaHomeDir, "tools");
11
15
  export const tmpDir = path.join(arisaHomeDir, "tmp");
12
16
 
17
+ export function getChatDir(chatId) {
18
+ return path.join(chatsDir, String(chatId));
19
+ }
20
+
13
21
  export function getToolDir(toolName) {
14
22
  return path.join(toolsDir, toolName);
15
23
  }
@@ -33,6 +41,8 @@ export function getToolTmpDir(toolName) {
33
41
  export async function ensureArisaHome() {
34
42
  await mkdir(stateDir, { recursive: true });
35
43
  await mkdir(artifactsDir, { recursive: true });
44
+ await mkdir(piAgentDir, { recursive: true });
45
+ await mkdir(chatsDir, { recursive: true });
36
46
  await mkdir(toolsDir, { recursive: true });
37
47
  await mkdir(tmpDir, { recursive: true });
38
48
  }
package/src/runtime/service-manager.js ADDED
@@ -0,0 +1,98 @@
1
+ import { open, readFile, rm, writeFile } from "node:fs/promises";
2
+ import { spawn } from "node:child_process";
3
+ import process from "node:process";
4
+ import { fileURLToPath } from "node:url";
5
+ import { ensureArisaHome, serviceLogFile, servicePidFile } from "./paths.js";
6
+
7
+ const entryFile = fileURLToPath(new URL("../index.js", import.meta.url));
8
+
9
+ function isProcessRunning(pid) {
10
+ try {
11
+ process.kill(pid, 0);
12
+ return true;
13
+ } catch {
14
+ return false;
15
+ }
16
+ }
17
+
18
+ async function readPid() {
19
+ try {
20
+ const raw = await readFile(servicePidFile, "utf8");
21
+ const pid = Number.parseInt(raw.trim(), 10);
22
+ return Number.isFinite(pid) ? pid : null;
23
+ } catch {
24
+ return null;
25
+ }
26
+ }
27
+
28
+ export async function getServiceStatus() {
29
+ await ensureArisaHome();
30
+ const pid = await readPid();
31
+ if (!pid) return { running: false, pid: null };
32
+ if (!isProcessRunning(pid)) {
33
+ await rm(servicePidFile, { force: true }).catch(() => {});
34
+ return { running: false, pid: null, stalePid: pid };
35
+ }
36
+ return { running: true, pid };
37
+ }
38
+
39
+ export async function startService({ verbose = false } = {}) {
40
+ await ensureArisaHome();
41
+ const status = await getServiceStatus();
42
+ if (status.running) {
43
+ return { ok: false, reason: "already-running", pid: status.pid };
44
+ }
45
+
46
+ const logHandle = await open(serviceLogFile, "a");
47
+ const args = [entryFile, "--service-runner"];
48
+ if (verbose) args.push("--verbose");
49
+
50
+ const child = spawn(process.execPath, args, {
51
+ detached: true,
52
+ stdio: ["ignore", logHandle.fd, logHandle.fd],
53
+ env: process.env
54
+ });
55
+
56
+ child.unref();
57
+ await logHandle.close();
58
+ return { ok: true, pid: child.pid, logFile: serviceLogFile };
59
+ }
60
+
61
+ export async function stopService() {
62
+ const status = await getServiceStatus();
63
+ if (!status.running) {
64
+ return { ok: false, reason: "not-running", pid: status.stalePid || null };
65
+ }
66
+
67
+ try {
68
+ process.kill(status.pid, "SIGTERM");
69
+ } catch {
70
+ await rm(servicePidFile, { force: true }).catch(() => {});
71
+ return { ok: false, reason: "not-running", pid: status.pid };
72
+ }
73
+
74
+ return { ok: true, pid: status.pid };
75
+ }
76
+
77
+ export async function registerServiceProcess() {
78
+ await ensureArisaHome();
79
+ await writeFile(servicePidFile, `${process.pid}\n`, "utf8");
80
+
81
+ const cleanup = async () => {
82
+ await rm(servicePidFile, { force: true }).catch(() => {});
83
+ };
84
+
85
+ process.on("SIGTERM", async () => {
86
+ await cleanup();
87
+ process.exit(0);
88
+ });
89
+
90
+ process.on("SIGINT", async () => {
91
+ await cleanup();
92
+ process.exit(0);
93
+ });
94
+
95
+ process.on("exit", () => {
96
+ rm(servicePidFile, { force: true }).catch(() => {});
97
+ });
98
+ }
package/src/transport/telegram/auth.js CHANGED
@@ -1,5 +1,15 @@
1
- export async function authorizeChat({ config, chatId, saveConfig }) {
1
+ export async function authorizeChat({ config, chatId, saveConfig, chatMeta = null }) {
2
+ config.telegram.chatMeta ||= {};
3
+
4
+ if (chatMeta) {
5
+ config.telegram.chatMeta[chatId] = {
6
+ ...(config.telegram.chatMeta[chatId] || {}),
7
+ ...chatMeta
8
+ };
9
+ }
10
+
2
11
  if (config.telegram.authorizedChatIds.includes(chatId)) {
12
+ if (chatMeta) await saveConfig(config);
3
13
  return { ok: true, firstTime: false };
4
14
  }
5
15
 
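The `chatMeta` handling added to `authorizeChat` above is a shallow merge: fields from the incoming update overwrite stored ones, while untouched stored fields survive. A standalone sketch of that merge:

```javascript
// Shallow merge as in the authorizeChat diff above: incoming metadata
// wins field-by-field; previously stored fields are preserved.
function mergeChatMeta(existing, incoming) {
  return { ...(existing || {}), ...incoming };
}

const merged = mergeChatMeta(
  { username: "old-name", languageCode: "en" },
  { username: "new-name" }
);
```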
package/src/transport/telegram/bot.js CHANGED
@@ -1,6 +1,35 @@
1
1
  import { Bot, InputFile } from "grammy";
2
2
  import { authorizeChat } from "./auth.js";
3
3
  import { captureIncomingArtifact } from "./media.js";
4
+ import { renderTelegramHtml, splitTelegramText } from "./text-format.js";
5
+
6
+ function quotedMessageSummary(message) {
7
+ if (!message) return [];
8
+
9
+ const fromName = message.from?.username
10
+ ? `@${message.from.username}`
11
+ : [message.from?.first_name, message.from?.last_name].filter(Boolean).join(" ") || "unknown";
12
+
13
+ const parts = [
14
+ `quotedMessageId: ${message.message_id}`,
15
+ `quotedFrom: ${fromName}`
16
+ ];
17
+
18
+ if (message.text) parts.push(`quotedText: ${message.text}`);
19
+ if (message.caption) parts.push(`quotedCaption: ${message.caption}`);
20
+ if (message.voice) parts.push(`quotedKind: voice`);
21
+ if (message.audio) parts.push(`quotedKind: audio`);
22
+ if (message.photo?.length) parts.push(`quotedKind: image`);
23
+ if (message.document) parts.push(`quotedKind: document`);
24
+ if (message.video) parts.push(`quotedKind: video`);
25
+ if (message.sticker) parts.push(`quotedKind: sticker`);
26
+
27
+ if (!message.text && !message.caption) {
28
+ parts.push(`Important: this message replies to a Telegram message with no textual body available in the update. Use the quoted kind and metadata as context.`);
29
+ }
30
+
31
+ return parts;
32
+ }
4
33
 
5
34
  function buildPrompt({ ctx, artifact, transcript }) {
6
35
  const parts = [
@@ -12,6 +41,7 @@ function buildPrompt({ ctx, artifact, transcript }) {
12
41
  ];
13
42
 
14
43
  if (ctx.message?.text) parts.push(`text: ${ctx.message.text}`);
44
+ parts.push(...quotedMessageSummary(ctx.message?.reply_to_message));
15
45
  if (artifact?.path) parts.push(`artifactPath: ${artifact.path}`);
16
46
  if (artifact?.id) parts.push(`artifactId: ${artifact.id}`);
17
47
  if (artifact?.mimeType) parts.push(`mimeType: ${artifact.mimeType}`);
@@ -24,7 +54,7 @@ function buildPrompt({ ctx, artifact, transcript }) {
24
54
 
25
55
  parts.push(`If you need a CLI tool, use list_tools/tool_help/run_tool.`);
26
56
  parts.push(`If a tool config is missing, ask the user naturally and then use set_tool_config.`);
27
- parts.push(`If the user wants audio output, use send_audio_reply.`);
57
+ parts.push(`If the user wants a generated media reply, use send_media_reply.`);
28
58
  return parts.join("\n");
29
59
  }
30
60
 
@@ -81,10 +111,19 @@ async function withTyping(ctx, work) {
   }
 }
 
-export async function createTelegramBot({ config, artifactStore, toolRegistry, agentManager, saveConfig, updateConfig }) {
+export async function createTelegramBot({ config, artifactStore, toolRegistry, agentManager, saveConfig, updateConfig, logger }) {
   const bot = new Bot(config.telegram.apiKey);
   const perChatState = new Map();
 
+  function getIncomingChatMeta(ctx) {
+    return {
+      languageCode: ctx.from?.language_code || "",
+      username: ctx.from?.username || "",
+      firstName: ctx.from?.first_name || "",
+      lastName: ctx.from?.last_name || ""
+    };
+  }
+
   function getChatState(chatId) {
     if (!perChatState.has(chatId)) {
       perChatState.set(chatId, { processing: false, nextPrompt: "" });
@@ -93,8 +132,11 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
   }
 
   async function buildIncomingPrompt(ctx) {
+    logger?.log("telegram", `message ${ctx.msg.message_id} in chat ${ctx.chat.id}`);
     const artifact = await captureIncomingArtifact(ctx, artifactStore);
+    if (artifact) logger?.log("telegram", `captured artifact ${artifact.kind}${artifact.id ? ` ${artifact.id}` : ""}`);
     const { transcript, toolResult } = await maybeTranscribeIncomingAudio({ artifact, toolRegistry, artifactStore });
+    if (transcript) logger?.log("telegram", `audio transcribed to artifact ${transcript.id}`);
     if (artifact?.kind === "audio" && !transcript) {
       if (toolResult?.missingConfig?.includes("OPENAI_API_KEY")) {
         throw new Error("I need the OpenAI API key for ~/.arisa/tools/openai-transcribe/config.js before I can transcribe incoming audio.");
@@ -104,14 +146,29 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
     return buildPrompt({ ctx, artifact, transcript });
   }
 
+  async function sendTextReply(send, chatId, text) {
+    logger?.log("telegram", `sending text reply for chat ${chatId}`);
+    for (const chunk of splitTelegramText(text)) {
+      await send(renderTelegramHtml(chunk), { parse_mode: "HTML" });
+    }
+  }
+
   async function processPrompt(ctx, prompt) {
     const telegram = {
-      sendAudio: async (filePath, caption) => ctx.replyWithAudio(new InputFile(filePath), { caption })
+      sendMedia: async (filePath, { method = "audio", caption } = {}) => {
+        logger?.log("telegram", `sending ${method} reply for chat ${ctx.chat.id}`);
+        const input = new InputFile(filePath);
+        if (method === "voice") return ctx.replyWithVoice(input, { caption });
+        if (method === "document") return ctx.replyWithDocument(input, { caption });
+        return ctx.replyWithAudio(input, { caption });
+      }
     };
     return withTyping(ctx, async () => {
       const { session } = await agentManager.getSessionContext(ctx.chat.id, telegram);
       const text = await collectText(session, prompt);
-      if (text) await ctx.reply(text.slice(0, 4000));
+      if (text) {
+        await sendTextReply((message, extra) => ctx.reply(message, extra), ctx.chat.id, text);
+      }
     });
   }
 
@@ -120,6 +177,7 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
     const incomingPrompt = await buildIncomingPrompt(ctx);
 
     if (chatState.processing) {
+      logger?.log("telegram", `chat ${ctx.chat.id} busy, queueing message ${ctx.msg.message_id}`);
       chatState.nextPrompt = chatState.nextPrompt
         ? `${chatState.nextPrompt}\n\n${incomingPrompt}`
        : incomingPrompt;
@@ -127,10 +185,12 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
     }
 
     chatState.processing = true;
+    logger?.log("telegram", `processing message ${ctx.msg.message_id} in chat ${ctx.chat.id}`);
     let currentPrompt = incomingPrompt;
 
     while (currentPrompt) {
       try {
+        logger?.log("telegram", `prompt dispatch for chat ${ctx.chat.id}`);
         await processPrompt(ctx, currentPrompt);
       } finally {
         if (chatState.nextPrompt) {
@@ -146,55 +206,19 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
   }
 
   bot.catch((error) => {
+    logger?.error("telegram", `bot error: ${error instanceof Error ? error.message : String(error)}`);
     console.error("Telegram bot error:", error);
   });
 
   bot.command("start", async (ctx) => {
-    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig });
-    if (!auth.ok) return ctx.reply("Private bot. Access denied.");
+    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig, chatMeta: getIncomingChatMeta(ctx) });
+    if (!auth.ok) return;
     return ctx.reply(auth.firstTime ? "This chat is now authorized for Arisa." : "Arisa is ready.");
   });
 
-  bot.command("pi_api_key", async (ctx) => {
-    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig });
-    if (!auth.ok) return ctx.reply("Private bot. Access denied.");
-
-    const apiKey = ctx.match?.trim();
-    if (!apiKey) {
-      return ctx.reply("Usage: /pi_api_key <your_api_key>");
-    }
-
-    const nextConfig = await updateConfig((current) => {
-      current.pi.apiKey = apiKey;
-    });
-    config.pi.apiKey = nextConfig.pi.apiKey;
-    agentManager.setConfig(nextConfig);
-    return ctx.reply(`Saved Pi API key for ${nextConfig.pi.provider}.`);
-  });
-
-  bot.command("pi_model", async (ctx) => {
-    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig });
-    if (!auth.ok) return ctx.reply("Private bot. Access denied.");
-
-    const value = ctx.match?.trim();
-    if (!value || !value.includes("/")) {
-      return ctx.reply("Usage: /pi_model <provider/model>");
-    }
-
-    const [provider, model] = value.split("/");
-    const nextConfig = await updateConfig((current) => {
-      current.pi.provider = provider.trim();
-      current.pi.model = model.trim();
-    });
-    config.pi.provider = nextConfig.pi.provider;
-    config.pi.model = nextConfig.pi.model;
-    agentManager.setConfig(nextConfig);
-    return ctx.reply(`Saved Pi model ${nextConfig.pi.provider}/${nextConfig.pi.model}.`);
-  });
-
   bot.on("message", async (ctx) => {
-    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig });
-    if (!auth.ok) return ctx.reply("Private bot. Access denied.");
+    const auth = await authorizeChat({ config, chatId: ctx.chat.id, saveConfig, chatMeta: getIncomingChatMeta(ctx) });
+    if (!auth.ok) return;
 
     try {
       await enqueueOrProcess(ctx);
@@ -208,6 +232,41 @@ export async function createTelegramBot({ config, artifactStore, toolRegistry, a
 
   return {
     async start() {
+      config.telegram.chatMeta ||= {};
+      for (const chatId of config.telegram.authorizedChatIds || []) {
+        try {
+          logger?.log("telegram", `generating startup message for chat ${chatId}`);
+          const chatMeta = config.telegram.chatMeta[chatId] || {};
+          const telegram = {
+            sendMedia: async (filePath, { method = "audio", caption } = {}) => {
+              logger?.log("telegram", `sending ${method} reply for chat ${chatId}`);
+              const input = new InputFile(filePath);
+              if (method === "voice") return bot.api.sendVoice(chatId, input, { caption });
+              if (method === "document") return bot.api.sendDocument(chatId, input, { caption });
+              return bot.api.sendAudio(chatId, input, { caption });
+            }
+          };
+          const { session } = await agentManager.getSessionContext(chatId, telegram);
+          const welcomePrompt = [
+            "System event: Arisa has just started.",
+            `chatId: ${chatId}`,
+            `preferredTelegramLanguageCode: ${chatMeta.languageCode || "unknown"}`,
+            chatMeta.username ? `username: ${chatMeta.username}` : null,
+            chatMeta.firstName ? `firstName: ${chatMeta.firstName}` : null,
+            "Send a short welcome-back message for Telegram.",
+            "Keep it brief, warm, and natural.",
+            "Use the user's Telegram language when possible.",
+            "Do not mention internal implementation details."
+          ].filter(Boolean).join("\n");
+          const text = await collectText(session, welcomePrompt);
+          if (text) {
+            await sendTextReply((message, extra) => bot.api.sendMessage(chatId, message, extra), chatId, text);
+          }
+        } catch (error) {
+          logger?.log("telegram", `startup message failed for chat ${chatId}: ${error instanceof Error ? error.message : String(error)}`);
+        }
+      }
+      logger?.log("telegram", "bot polling started");
       await bot.start();
     }
   };
@@ -0,0 +1,72 @@
+function escapeHtml(text = "") {
+  return text
+    .replace(/&/g, "&amp;")
+    .replace(/</g, "&lt;")
+    .replace(/>/g, "&gt;")
+    .replace(/"/g, "&quot;");
+}
+
+function formatInline(text) {
+  const escaped = escapeHtml(text);
+  return escaped
+    .replace(/\*\*(.+?)\*\*/gs, "<b>$1</b>")
+    .replace(/`([^`\n]+)`/g, "<code>$1</code>");
+}
+
+export function renderTelegramHtml(text = "") {
+  const source = String(text || "");
+  const parts = [];
+  let index = 0;
+
+  while (index < source.length) {
+    const start = source.indexOf("```", index);
+    if (start === -1) {
+      parts.push(formatInline(source.slice(index)));
+      break;
+    }
+
+    if (start > index) {
+      parts.push(formatInline(source.slice(index, start)));
+    }
+
+    const afterFence = start + 3;
+    const lineEnd = source.indexOf("\n", afterFence);
+    const languageLine = lineEnd === -1 ? source.slice(afterFence) : source.slice(afterFence, lineEnd);
+    const codeStart = lineEnd === -1 ? afterFence : lineEnd + 1;
+    const end = source.indexOf("```", codeStart);
+
+    if (end === -1) {
+      parts.push(formatInline(source.slice(start)));
+      break;
+    }
+
+    const language = languageLine.trim();
+    const code = source.slice(codeStart, end).replace(/\n$/, "");
+    const languageAttr = language ? ` language="${escapeHtml(language)}"` : "";
+    parts.push(`<pre><code${languageAttr}>${escapeHtml(code)}</code></pre>`);
+    index = end + 3;
+  }
+
+  return parts.join("");
+}
+
+export function splitTelegramText(text = "", maxLength = 3500) {
+  const source = String(text || "").trim();
+  if (!source) return [];
+  if (source.length <= maxLength) return [source];
+
+  const chunks = [];
+  let remaining = source;
+
+  while (remaining.length > maxLength) {
+    let cut = remaining.lastIndexOf("\n\n", maxLength);
+    if (cut < Math.floor(maxLength / 2)) cut = remaining.lastIndexOf("\n", maxLength);
+    if (cut < Math.floor(maxLength / 2)) cut = remaining.lastIndexOf(" ", maxLength);
+    if (cut <= 0) cut = maxLength;
+    chunks.push(remaining.slice(0, cut).trim());
+    remaining = remaining.slice(cut).trimStart();
+  }
+
+  if (remaining) chunks.push(remaining);
+  return chunks;
+}
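The chunking behavior of the new `splitTelegramText` helper can be sketched in isolation. This is a standalone copy of the function as added in this diff, plus a small demonstration; the 8000-character input is an arbitrary example, not from the package:

```javascript
// Standalone copy of splitTelegramText as introduced in this diff.
function splitTelegramText(text = "", maxLength = 3500) {
  const source = String(text || "").trim();
  if (!source) return [];
  if (source.length <= maxLength) return [source];

  const chunks = [];
  let remaining = source;

  while (remaining.length > maxLength) {
    // Prefer paragraph breaks, then line breaks, then spaces, then a hard cut.
    let cut = remaining.lastIndexOf("\n\n", maxLength);
    if (cut < Math.floor(maxLength / 2)) cut = remaining.lastIndexOf("\n", maxLength);
    if (cut < Math.floor(maxLength / 2)) cut = remaining.lastIndexOf(" ", maxLength);
    if (cut <= 0) cut = maxLength;
    chunks.push(remaining.slice(0, cut).trim());
    remaining = remaining.slice(cut).trimStart();
  }

  if (remaining) chunks.push(remaining);
  return chunks;
}

// A run of 8000 characters with no whitespace falls back to hard cuts:
console.log(splitTelegramText("a".repeat(8000)).map((c) => c.length)); // chunk lengths: 3500, 3500, 1000
```

The 3500 default leaves headroom under Telegram's 4096-character message limit for the HTML tags that `renderTelegramHtml` adds.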
@@ -0,0 +1,4 @@
+export default {
+  OPENAI_API_KEY: "",
+  MODEL: "gpt-4o-mini-transcribe"
+};
@@ -1,3 +1,4 @@
+import path from "node:path";
 import { readFile, stat } from "node:fs/promises";
 import defaults from "./config.js";
 import { loadToolConfig } from "../../src/core/tools/tool-config.js";
@@ -7,7 +8,7 @@ const toolName = "openai-transcribe";
 const config = await loadToolConfig(toolName, defaults);
 
 function printHelp() {
-  console.log(`openai-transcribe\n\nUso:\n node index.js --help\n node index.js run --request-file <json>\n\nInput esperado:\n {\n \"artifact\": { \"path\": \"/abs/audio.ogg\", \"mimeType\": \"audio/ogg\" },\n \"args\": {}\n }\n\nConfig en ${getToolConfigPath(toolName)}:\n OPENAI_API_KEY\n MODEL\n`);
+  console.log(`openai-transcribe\n\nUsage:\n node index.js --help\n node index.js run --request-file <json>\n\nExpected input:\n {\n \"artifact\": { \"path\": \"/abs/audio.ogg\", \"mimeType\": \"audio/ogg\" },\n \"args\": {}\n }\n\nConfig at ${getToolConfigPath(toolName)}:\n OPENAI_API_KEY\n MODEL\n`);
 }
 
 async function run(requestFile) {
@@ -9,7 +9,7 @@
       "type": "string",
       "required": true,
       "secret": true,
-      "prompt": "Necesito tu OPENAI_API_KEY para transcribir audio."
+      "prompt": "I need your OPENAI_API_KEY to transcribe audio."
     }
   }
 }
@@ -0,0 +1,5 @@
+export default {
+  OPENAI_API_KEY: "",
+  MODEL: "gpt-4o-mini-tts",
+  VOICE: "alloy"
+};
@@ -8,7 +8,7 @@ const toolName = "openai-tts";
 const config = await loadToolConfig(toolName, defaults);
 
 function printHelp() {
-  console.log(`openai-tts\n\nUso:\n node index.js --help\n node index.js run --request-file <json>\n\nInput esperado:\n {\n \"text\": \"hola\",\n \"artifact\": { \"text\": \"hola\" },\n \"args\": { \"voice\": \"alloy\" }\n }\n\nConfig en ${getToolConfigPath(toolName)}:\n OPENAI_API_KEY\n MODEL\n VOICE\n`);
+  console.log(`openai-tts\n\nUsage:\n node index.js --help\n node index.js run --request-file <json>\n\nExpected input:\n {\n \"text\": \"hello\",\n \"artifact\": { \"text\": \"hello\" },\n \"args\": { \"voice\": \"alloy\" }\n }\n\nOutput:\n - generates OGG/Opus audio\n - suggests Telegram voice-note delivery via output.delivery.method = \"voice\"\n\nConfig at ${getToolConfigPath(toolName)}:\n OPENAI_API_KEY\n MODEL\n VOICE\n`);
 }
 
 async function run(requestFile) {
@@ -34,7 +34,7 @@ async function run(requestFile) {
       model: config.MODEL,
       voice: request.args?.voice || config.VOICE,
       input: inputText,
-      format: "mp3"
+      format: "opus"
     })
   });
 
@@ -46,10 +46,19 @@ async function run(requestFile) {
 
   const outDir = getToolOutDir(toolName);
   await mkdir(outDir, { recursive: true });
-  const filePath = path.join(outDir, `speech-${Date.now()}.mp3`);
+  const filePath = path.join(outDir, `speech-${Date.now()}.ogg`);
   const buffer = Buffer.from(await response.arrayBuffer());
   await writeFile(filePath, buffer);
-  console.log(JSON.stringify({ ok: true, output: { filePath, fileName: path.basename(filePath), mimeType: "audio/mpeg", kind: "audio" } }));
+  console.log(JSON.stringify({
+    ok: true,
+    output: {
+      filePath,
+      fileName: path.basename(filePath),
+      mimeType: "audio/ogg",
+      kind: "audio",
+      delivery: { method: "voice" }
+    }
+  }));
 }
 
 const args = process.argv.slice(2);
@@ -1,20 +1,20 @@
 {
   "name": "openai-tts",
-  "description": "Convert text into MP3 audio using OpenAI speech API.",
+  "description": "Convert text into OGG/Opus speech audio using the OpenAI speech API.",
   "entry": "index.js",
   "input": ["text/plain"],
-  "output": ["audio/mpeg"],
+  "output": ["audio/ogg"],
   "configSchema": {
     "OPENAI_API_KEY": {
       "type": "string",
       "required": true,
       "secret": true,
-      "prompt": "Necesito tu OPENAI_API_KEY para generar audio."
+      "prompt": "I need your OPENAI_API_KEY to generate speech audio."
     },
     "VOICE": {
       "type": "string",
       "required": false,
-      "prompt": "Voz a usar, por ejemplo alloy."
+      "prompt": "Voice to use, for example alloy."
     }
   }
 }
@@ -0,0 +1 @@
+export default {};
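Taken together, the tool's JSON output and the bot-side `sendMedia` handler define a small delivery contract: a tool may suggest how Telegram should deliver its file via `output.delivery.method`, and the bot falls back to a plain audio reply otherwise. A minimal sketch of the consuming side, with an illustrative helper name (`pickDeliveryMethod` is not from the package):

```javascript
// Hypothetical sketch of the delivery contract between a tool's JSON output
// and the bot-side sendMedia handler added in this diff.
function pickDeliveryMethod(toolOutput) {
  // Tools may suggest a Telegram delivery method; default to "audio".
  return toolOutput?.delivery?.method || "audio";
}

// Shape matching what openai-tts now prints:
const ttsOutput = {
  filePath: "/tmp/speech-123.ogg",
  fileName: "speech-123.ogg",
  mimeType: "audio/ogg",
  kind: "audio",
  delivery: { method: "voice" }
};

console.log(pickDeliveryMethod(ttsOutput));        // "voice" → ctx.replyWithVoice
console.log(pickDeliveryMethod({ kind: "audio" })); // "audio" → ctx.replyWithAudio
```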
File without changes