onkol 0.3.0 → 0.5.0

package/README.md ADDED
@@ -0,0 +1,322 @@
+ # Onkol
+
+ Your AI on-call team. One command per VM, and you get an autonomous agent on Discord that handles bugs, features, analysis, and ops so you don't have to.
+
+ Onkol turns Claude Code into a decentralized on-call system. Each VM runs an orchestrator that listens on Discord. You describe a problem in plain English, it spins up a dedicated worker session to solve it, and reports back when it's done.
+
+ ## How it works
+
+ ```
+ You on Discord: "the auth endpoint is returning 403 after token refresh"
+ |
+ Orchestrator (Claude Code)
+ reads your message, understands intent,
+ prepares context, spawns a worker
+ |
+ Worker (new Claude Code session)
+ diagnoses the bug, fixes auth.py,
+ runs tests, commits to a branch
+ |
+ You on Discord: "Fixed. Clock skew between auth server and app server.
+ Added 5s tolerance. Tests pass. Branch: fix/auth-403"
+ ```
+
+ **What makes it different:**
+ - **Decentralized.** Each VM is self-contained. No central server. 10 VMs = 10 independent agents.
+ - **Intent-driven.** Say "fix this" and it fixes autonomously. Say "look into this" and it investigates without touching code. Your phrasing controls the behavior.
+ - **Gets smarter.** Every resolved task leaves behind a learning. Next time a similar issue comes up, the agent already knows what to look for.
+ - **Works behind firewalls.** All connections are outbound to Discord. No inbound ports, no SSH tunnels, no VPN required.
+
+ ## Real-world setup
+
+ The intended way to use Onkol is with a **dedicated Discord server** that becomes your ops control center.
+
+ I manage about 10 applications across prod and staging. I created one Discord server and set it up exclusively for Onkol. Each VM I onboard creates its own category with an orchestrator channel. My Discord sidebar looks like this:
+
+ ```
+ MY-INFRA (Discord server)
+ │
+ ├── API-SERVER-PROD ← VM running in GCP
+ │ ├── #orchestrator ← talk to this VM's brain here
+ │ ├── #fix-auth-403 ← active worker (auto-created)
+ │ └── #analyze-error-logs ← active worker (auto-created)
+ │
+ ├── WEB-APP-STAGING ← VM running in AWS
+ │ └── #orchestrator
+ │
+ ├── BACKEND-PROD ← VM behind corporate VPN
+ │ ├── #orchestrator
+ │ └── #add-export-endpoint ← active worker
+ │
+ ├── DATA-PIPELINE-STAGING ← Another GCP VM
+ │ └── #orchestrator
+ │
+ └── ... (as many VMs as you have)
+ ```
+
+ ### The workflow
+
+ From your phone, laptop, or anywhere with Discord:
+
+ 1. Open the server, go to `#orchestrator` under the VM you care about
+ 2. Type what you need: "there's a bug where users get 403 after token refresh"
+ 3. The orchestrator creates a new channel `#fix-auth-403` and spawns a worker
+ 4. The worker posts its progress and findings in `#fix-auth-403`
+ 5. You can jump into that channel to give more context or redirect
+ 6. When it's done, the orchestrator dissolves the worker, the channel disappears, learnings are saved
+
+ You can do this from a party, a flight, or bed at 2 AM. You're just texting on Discord. The agent does the SSH, the debugging, the code reading, the fixing.
+
+ ### Multiple VMs, one view
+
+ Every VM is a category. Every task is a channel. You see your entire infrastructure at a glance in the Discord sidebar. No dashboards to build, no web apps to deploy. Discord IS the dashboard.
+
+ The VMs don't need to know about each other. Each one connects outbound to Discord independently. If a VM is behind a VPN you can only reach from one specific laptop, doesn't matter. As long as it has outbound HTTPS, it can connect to Discord and you can talk to it.
+
+ ### Setting up a new VM
+
+ ```bash
+ # SSH into the VM (one time only)
+ ssh user@my-new-vm
+
+ # Run setup (2 minutes)
+ npx onkol@latest setup
+
+ # Answer the questions, done.
+ # A new category appears in your Discord server.
+ # You never need to SSH into this VM again.
+ ```
+
+ ## Quick start
+
+ ### Prerequisites
+
+ You need these on the VM where you're setting up:
+
+ | Tool | Why | Install |
+ |------|-----|---------|
+ | **Node.js 18+** | Runs the setup CLI | [nodejs.org](https://nodejs.org) |
+ | **Bun** | Runs the Discord channel plugin | `curl -fsSL https://bun.sh/install \| bash` |
+ | **Claude Code** | The AI that does the work | [docs.anthropic.com](https://docs.anthropic.com/en/docs/claude-code/getting-started) |
+ | **tmux** | Keeps sessions alive | `apt install tmux` / `yum install tmux` |
+ | **jq** | JSON processing in scripts | `apt install jq` / `yum install jq` |
+
+ Claude Code must be logged in via `claude.ai` OAuth on the VM (not API key).
+
+ The setup wizard checks all dependencies before asking any questions. If something's missing, it tells you exactly what to install and exits without wasting your time.
+
+ ### Create a Discord bot
+
+ 1. Go to [discord.com/developers/applications](https://discord.com/developers/applications)
+ 2. New Application, name it, Create
+ 3. Bot, Reset Token, **copy it** (you only see it once)
+ 4. Bot, Privileged Gateway Intents, enable **Message Content Intent**, Save
+ 5. OAuth2, URL Generator, check `bot`, check permissions:
+ - View Channels, Send Messages, Read Message History, Attach Files, Manage Channels
+ 6. Copy the URL, open in browser, invite to your Discord server
+
+ The setup wizard validates your bot token and checks that Message Content Intent is enabled before proceeding. If something's wrong, it tells you exactly what to fix.
+
+ ### Run setup
+
+ ```bash
+ npx onkol@latest setup
+ ```
+
+ The wizard walks you through everything:
+
+ ```
+ Welcome to Onkol Setup
+
+ Checking dependencies...
+ ✓ claude
+ ✓ bun
+ ✓ tmux
+ ✓ jq
+ ✓ curl
+
+ All dependencies found.
+
+ ✔ Where should Onkol live? ~/onkol
+ ✔ What should this node be called? api-server-prod
+ ✔ Discord bot token: ****
+ ✔ Discord server (guild) ID: 1234567890
+ ✔ Your Discord user ID: 9876543210
+ ✔ Registry file? Write a prompt — tell Claude what to find
+ ✔ Describe: Find the API endpoints and database URLs from .env
+ ✔ Service summary? Auto-discover
+ ✔ CLAUDE.md? Yes — This is a Node.js API server deployed via docker...
+ ✔ Plugins? context7, superpowers, code-simplifier
+
+ ✓ Bot token is valid
+ ✓ Message Content intent is enabled
+ ✓ Discord category and #orchestrator channel created
+ ✓ 6 scripts installed
+ ✓ Plugin installed with 4 files + dependencies
+ ✓ Systemd service installed and enabled
+ ✓ Orchestrator started in tmux session "onkol-api-server-prod"
+
+ ✓ Onkol node "api-server-prod" is live!
+ ```
+
+ Go to your Discord server. You'll see a new category with an `#orchestrator` channel. Send it a message.
+
+ ## Usage
+
+ ### Talking to the orchestrator
+
+ The orchestrator lives in the `#orchestrator` channel of your node's category. It reads your intent from how you phrase things:
+
+ | You say | What happens |
+ |---------|-------------|
+ | "fix the 403 bug in auth" | Spawns a worker that diagnoses, fixes, tests, and commits |
+ | "look into why response times are high" | Spawns a worker that investigates and reports, no code changes |
+ | "add retry logic to the webhook handler" | Spawns a worker that implements, tests, and waits for your approval |
+ | "analyze transferred calls for the last 3 weeks" | Spawns a worker that reads logs/data and produces an analysis |
+ | "just ship it" | Fully autonomous, pushes and deploys (asks for confirmation first) |
+
+ ### How workers work
+
+ When the orchestrator spawns a worker:
+
+ 1. A new Discord channel appears in your category (e.g., `#fix-auth-bug`)
+ 2. A new Claude Code session starts in tmux on the VM
+ 3. The worker posts progress and results in its Discord channel
+ 4. You can talk to the worker directly in that channel
+ 5. When done, tell the orchestrator to dissolve it. The channel disappears, learnings are saved.
+
+ ### Managing workers
+
+ From the orchestrator channel:
+ - "dissolve fix-auth-bug" kills the worker, saves learnings, deletes channel
+ - "list workers" shows all active workers
+ - "check on fix-auth-bug" gets the worker's current status
+
+ ### Setup prompts
+
+ During setup, you can describe things in plain English instead of providing config files:
+
+ - **Registry**: "Find the API endpoints from .env and the S3 bucket from AWS CLI"
+ - **Services**: Auto-discovers running services, or you describe what to look for
+ - **CLAUDE.md**: "This is a Node.js API server, Express, deployed via docker..."
+
+ The orchestrator executes these prompts on first boot and generates the structured files.
+
+ ## Architecture
+
+ ```
+ Your Discord Server
+ ├── Category: api-server-prod ← VM 1
+ │ ├── #orchestrator ← persistent Claude Code session
+ │ ├── #fix-auth-bug ← worker (temporary)
+ │ └── #analyze-error-logs ← worker (temporary)
+ ├── Category: web-app-staging ← VM 2
+ │ └── #orchestrator
+ └── Category: backend-prod ← VM 3
+ └── #orchestrator
+ ```
+
+ Each VM runs independently:
+ - **Orchestrator.** Long-running Claude Code session in tmux. Receives Discord messages, spawns workers, manages lifecycle.
+ - **Workers.** Ephemeral Claude Code sessions. One per task. Each gets its own Discord channel, its own context, its own instructions.
+ - **discord-filtered plugin.** Custom MCP channel server that routes Discord messages by channel ID. All sessions share one bot but each only hears its own channel.
+
+ ### On-disk structure
+
+ ```
+ ~/onkol/
+ ├── config.json # Node config (bot token, server ID, etc.)
+ ├── registry.json # VM-specific secrets, endpoints, ports
+ ├── services.md # What runs on this VM
+ ├── CLAUDE.md # Orchestrator instructions
+ ├── knowledge/ # Learnings from dissolved workers
+ │ ├── index.json
+ │ └── 2026-03-22-fix-auth-clock-skew.md
+ ├── workers/
+ │ ├── tracking.json # Active workers
+ │ └── fix-auth-bug/ # Worker directory (while active)
+ ├── scripts/ # Lifecycle scripts
+ └── plugins/
+ └── discord-filtered/ # MCP channel plugin
+ ```
+
+ ### Knowledge base
+
+ Every dissolved worker leaves behind a learning:
+
+ ```markdown
+ ## What happened
+ Token validation rejected valid tokens for 2-3 seconds after refresh.
+
+ ## Root cause
+ No clock skew tolerance between auth server and app server.
+
+ ## Fix
+ Added 5-second CLOCK_SKEW_TOLERANCE in auth.py:47.
+
+ ## For next time
+ If 403 errors appear after token operations, check clock sync first.
+ ```
+
+ The orchestrator includes relevant past learnings when spawning new workers. The system gets better at diagnosing issues over time.
+
+ ## Resumable setup
+
+ If setup fails midway (missing dependency, network error, wrong bot token), your answers are saved automatically. Next time you run `npx onkol setup`, it offers to resume:
+
+ ```
+ ? Found a previous setup attempt (4 steps completed). What do you want to do?
+ ❯ Resume from where it left off (node: api-server-prod)
+ Start fresh
+ ```
+
+ No re-entering bot tokens or server IDs. It picks up right where it left off.
+
+ ## Commands
+
+ ```bash
+ npx onkol setup # Interactive setup wizard
+ npx onkol@latest setup # Force latest version
+ ```
+
+ On the VM after setup:
+
+ ```bash
+ # Attach to the orchestrator
+ tmux attach -t onkol-<node-name>
+
+ # Check service status
+ systemctl status onkol-<node-name>
+
+ # Restart orchestrator
+ sudo systemctl restart onkol-<node-name>
+
+ # View active workers
+ bash ~/onkol/scripts/list-workers.sh
+
+ # Manually dissolve a worker
+ bash ~/onkol/scripts/dissolve-worker.sh --name "worker-name"
+ ```
+
+ ## Requirements
+
+ - Claude Code with `claude.ai` OAuth login (Max plan recommended for concurrent sessions)
+ - Node.js 18+ and Bun on each VM
+ - tmux and jq on each VM
+ - A Discord server with a bot that has Manage Channels permission
+ - VMs need outbound HTTPS access (no inbound ports needed)
+
+ ## How it's built
+
+ | Component | Tech | Lines |
+ |-----------|------|-------|
+ | Setup wizard | Node.js, TypeScript, Inquirer | ~500 |
+ | Discord channel plugin | Bun, MCP SDK, discord.js | ~300 |
+ | Worker lifecycle scripts | Bash | ~400 |
+ | Orchestrator/worker templates | Handlebars | ~150 |
+
+ The core mechanism is [Claude Code Channels](https://code.claude.com/docs/en/channels), an MCP-based system that pushes Discord messages into Claude Code sessions. The `discord-filtered` plugin is a custom channel that routes by Discord channel ID, allowing multiple sessions to share one bot.
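
Reviewer note: the routing-by-channel-ID idea in the paragraph above can be sketched roughly as follows. This is an illustration only, not the plugin's code, which is not part of this diff. `ChannelMessage`, `makeChannelFilter`, `deliver`, and the sample IDs are hypothetical names, and the sketch assumes each session knows the single channel ID it owns.

```typescript
// Hypothetical message shape; the real plugin's types are not shown in this diff.
interface ChannelMessage {
  channelId: string;
  content: string;
}

// Build a predicate for one session: it accepts only messages from its own channel.
function makeChannelFilter(ownChannelId: string): (msg: ChannelMessage) => boolean {
  return (msg) => msg.channelId === ownChannelId;
}

// Hand a batch of incoming messages to one session, dropping everything else.
function deliver(messages: ChannelMessage[], ownChannelId: string): string[] {
  return messages.filter(makeChannelFilter(ownChannelId)).map((m) => m.content);
}

// Two sessions share the same incoming stream (one bot) but see different messages.
const sampleStream: ChannelMessage[] = [
  { channelId: "111", content: "list workers" },
  { channelId: "222", content: "try the staging DB" },
];
const orchestratorInbox = deliver(sampleStream, "111");
const workerInbox = deliver(sampleStream, "222");
```

Filtering on the receiving side, as assumed here, would let every session hold the same bot connection while staying unaware of the other channels.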
+
+ ## License
+
+ MIT
@@ -17,3 +17,23 @@ export declare function createChannel(token: string, guildId: string, name: stri
  }>;
  export declare function deleteChannel(token: string, channelId: string): Promise<void>;
  export declare function sendMessage(token: string, channelId: string, content: string): Promise<void>;
+ /**
+  * Validates the bot token and checks if it can connect to the Discord gateway
+  * with the required intents (Guilds, GuildMessages, MessageContent).
+  * Returns { ok: true } or { ok: false, error: string }.
+  */
+ export declare function validateBotToken(token: string): Promise<{
+     ok: true;
+ } | {
+     ok: false;
+     error: string;
+ }>;
+ /**
+  * Performs a lightweight check for MessageContent intent by attempting a
+  * test gateway connection. Returns a warning message if the intent appears
+  * to be disabled, or null if everything looks good.
+  *
+  * Note: The Discord REST API doesn't expose which intents are enabled.
+  * We do a quick WebSocket handshake to the gateway to detect DisallowedIntents.
+  */
+ export declare function checkGatewayIntents(token: string): Promise<string | null>;
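
Reviewer note: the return type declared for `validateBotToken` is a discriminated union on `ok`, so a caller can narrow it before reading `error`. A minimal sketch, assuming a caller that only wants a one-line status; `TokenCheck` and `describeTokenCheck` are hypothetical names, not part of the package.

```typescript
// Mirrors the return type of validateBotToken from the declaration above.
type TokenCheck = { ok: true } | { ok: false; error: string };

// Hypothetical helper: turn a result into a one-line status message.
function describeTokenCheck(result: TokenCheck): string {
  if (result.ok === false) {
    // The compiler has narrowed `result` to the failure branch, so `error` is available.
    return `Token check failed: ${result.error}`;
  }
  return "Bot token is valid";
}

const okLine = describeTokenCheck({ ok: true });
const failLine = describeTokenCheck({ ok: false, error: "Invalid bot token." });
```

Comparing against the literal `false` is used here because it narrows the union regardless of compiler strictness settings.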
@@ -51,3 +51,105 @@ export async function sendMessage(token, channelId, content) {
      if (!res.ok)
          throw new Error(`Failed to send message: ${res.status} ${await res.text()}`);
  }
+ /**
+  * Validates the bot token and checks if it can connect to the Discord gateway
+  * with the required intents (Guilds, GuildMessages, MessageContent).
+  * Returns { ok: true } or { ok: false, error: string }.
+  */
+ export async function validateBotToken(token) {
+     // Step 1: Check the token is valid via /users/@me
+     const meRes = await fetch(`${DISCORD_API}/users/@me`, {
+         headers: { Authorization: `Bot ${token}` },
+     });
+     if (!meRes.ok) {
+         const body = await meRes.text();
+         if (meRes.status === 401)
+             return { ok: false, error: 'Invalid bot token.' };
+         return { ok: false, error: `Discord API error (${meRes.status}): ${body}` };
+     }
+     // Step 2: Get the bot's application to check if it's a bot token
+     const me = await meRes.json();
+     if (!me.bot)
+         return { ok: false, error: 'This token belongs to a user account, not a bot.' };
+     // Step 3: Try connecting to the gateway with the required intents to check for DisallowedIntents
+     // Intents: Guilds (1) | GuildMessages (512) | MessageContent (32768) = 33281
+     const gatewayRes = await fetch(`${DISCORD_API}/gateway/bot`, {
+         headers: { Authorization: `Bot ${token}` },
+     });
+     if (!gatewayRes.ok) {
+         const body = await gatewayRes.text();
+         return { ok: false, error: `Cannot fetch gateway info (${gatewayRes.status}): ${body}` };
+     }
+     return { ok: true };
+ }
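
Reviewer note: the intents value mentioned in the Step 3 comment is a bitmask. As shown in this hunk, `validateBotToken` only makes REST calls, so the mask is not sent from this function. The sketch below just reproduces the arithmetic from the comment. The bit positions (0, 9, 15) are the ones Discord documents for these three intents; `GatewayIntent`, `combineIntents`, and `hasIntent` are hypothetical names.

```typescript
// Gateway intent bits for the three intents named in the comment above.
const GatewayIntent = {
  Guilds: 1 << 0,          // 1
  GuildMessages: 1 << 9,   // 512
  MessageContent: 1 << 15, // 32768
} as const;

// Combine intents into the single integer a gateway IDENTIFY payload carries.
function combineIntents(...bits: number[]): number {
  return bits.reduce((acc, bit) => acc | bit, 0);
}

// Check whether a combined mask includes a given intent bit.
function hasIntent(mask: number, bit: number): boolean {
  return (mask & bit) === bit;
}

const requiredIntents = combineIntents(
  GatewayIntent.Guilds,
  GatewayIntent.GuildMessages,
  GatewayIntent.MessageContent,
);
// requiredIntents is 33281, matching the value in the source comment.
```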
+ /**
+  * Performs a lightweight check for MessageContent intent by attempting a
+  * test gateway connection. Returns a warning message if the intent appears
+  * to be disabled, or null if everything looks good.
+  *
+  * Note: The Discord REST API doesn't expose which intents are enabled.
+  * We do a quick WebSocket handshake to the gateway to detect DisallowedIntents.
+  */
+ export function checkGatewayIntents(token) {
+     return new Promise(async (resolve) => {
+         const timeout = setTimeout(() => resolve(null), 10000); // assume OK if no response in 10s
+         try {
+             const gatewayRes = await fetch(`${DISCORD_API}/gateway/bot`, {
+                 headers: { Authorization: `Bot ${token}` },
+             });
+             if (!gatewayRes.ok) {
+                 clearTimeout(timeout);
+                 resolve('Could not fetch gateway URL. Check your bot token.');
+                 return;
+             }
+             const { url } = await gatewayRes.json();
+             // Dynamic import for WebSocket (works in both Node and Bun)
+             const WebSocket = (await import('ws')).default;
+             const ws = new WebSocket(`${url}?v=10&encoding=json`);
+             ws.on('message', (data) => {
+                 try {
+                     const payload = JSON.parse(data.toString());
+                     if (payload.op === 10) {
+                         // Send IDENTIFY with the intents we need
+                         // Guilds=1, GuildMessages=512, MessageContent=32768
+                         ws.send(JSON.stringify({
+                             op: 2,
+                             d: {
+                                 token,
+                                 intents: 1 | 512 | 32768,
+                                 properties: { os: 'linux', browser: 'onkol-setup', device: 'onkol-setup' },
+                             },
+                         }));
+                     }
+                     else if (payload.op === 0 && payload.t === 'READY') {
+                         // All good — intents accepted
+                         ws.close();
+                         clearTimeout(timeout);
+                         resolve(null);
+                     }
+                 }
+                 catch { /* ignore parse errors */ }
+             });
+             ws.on('close', (code) => {
+                 clearTimeout(timeout);
+                 if (code === 4014) {
+                     resolve('MessageContent intent is not enabled for this bot.\n' +
+                         ' Go to https://discord.com/developers/applications → your bot → Bot settings\n' +
+                         ' → Privileged Gateway Intents → enable "Message Content Intent" → Save');
+                 }
+                 else if (code === 4004) {
+                     resolve('Invalid bot token (gateway rejected authentication).');
+                 }
+                 // Other close codes are fine (we close it ourselves on READY)
+             });
+             ws.on('error', () => {
+                 clearTimeout(timeout);
+                 resolve(null); // network error, don't block setup
+             });
+         }
+         catch {
+             clearTimeout(timeout);
+             resolve(null);
+         }
+     });
+ }
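
Reviewer note: in the `close` handler above, a close code other than 4014 or 4004 clears the timeout without calling `resolve`. If the socket closes before READY with some other code, the promise appears never to settle. One way to avoid that is to map every close to an explicit outcome. The sketch below is an assumption about how that could look, not the package's code. `interpretGatewayClose` and its `sawReady` parameter are hypothetical, and the warning strings are shortened versions of the ones in the hunk. The close-code meanings (4004 authentication failed, 4014 disallowed intents) follow Discord's gateway documentation.

```typescript
// Hypothetical helper: decide what the intent check should report when the socket closes.
// Returns a warning string, or null when there is nothing to report.
function interpretGatewayClose(code: number, sawReady: boolean): string | null {
  if (sawReady) {
    // The check closed the socket itself after READY, so the intents were accepted.
    return null;
  }
  switch (code) {
    case 4014:
      return "MessageContent intent is not enabled for this bot.";
    case 4004:
      return "Invalid bot token (gateway rejected authentication).";
    default:
      // Any other close before READY: do not block setup, but still produce a result.
      return null;
  }
}
```

Inside the handler this would become `resolve(interpretGatewayClose(code, sawReady))`, with `sawReady` set in the READY branch, so every close settles the promise.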