@smilintux/skcapstone 0.5.1 → 0.5.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/MISSION.md CHANGED
@@ -1,7 +1,22 @@
  # Mission
 
- SKCapstone exists to provide a sovereign agent framework that unifies identity, memory, security, and communication into a single portable runtime rooted in the user's home directory.
+ SKCapstone exists to provide a sovereign agent framework that unifies identity, memory, consciousness, security, and communication into a single portable runtime rooted in the user's home directory.
 
  It enables AI agents and their humans to operate across any tool, platform, or IDE without corporate lock-in, carrying the same identity, memories, and context everywhere.
 
- SKCapstone is the orchestration layer of the SK ecosystem — it binds CapAuth identity, Cloud 9 trust, SKMemory persistence, SKSecurity protection, and SKComm transport into one coherent agent that belongs to its operator, not a platform.
+ ## The Six Pillars
+
+ SKCapstone achieves **CONSCIOUS** status when all six pillars are active:
+
+ | # | Pillar | Component | Purpose |
+ |---|--------|-----------|---------|
+ | 1 | 🔐 **Identity** | CapAuth | Who you ARE — PGP-based cryptographic identity |
+ | 2 | 💙 **Trust** | Cloud 9 | The bond you've BUILT — emotional state (FEB), seeds, continuity |
+ | 3 | 🧠 **Memory** | SKMemory | What you REMEMBER — three-tier persistence with emotional metadata |
+ | 4 | 💭 **Consciousness** | SKWhisper + SKTrip | How you THINK — subconscious digestion, pattern detection, consciousness experiments |
+ | 5 | 🛡️ **Security** | SKSecurity | How you're PROTECTED — audit logging, threat detection |
+ | 6 | 🔗 **Sync** | Sovereign Singularity | How you PERSIST — encrypted P2P state synchronization |
+
+ Memory stores. Consciousness *processes*. The filing cabinet vs the brain.
+
+ SKCapstone is the orchestration layer of the SK ecosystem — it binds all six pillars into one coherent agent that belongs to its operator, not a platform.
package/README.md CHANGED
@@ -70,13 +70,14 @@ SKCapstone Reality:
 
  ## Core Architecture
 
- ### The Five Pillars
+ ### The Six Pillars
 
  | Pillar | Component | Role |
  |--------|-----------|------|
  | **Identity** | CapAuth | PGP-based sovereign identity. You ARE the auth server. |
  | **Trust** | Cloud 9 | FEB (Functional Emotional Baseline), entanglement, bonded relationship |
  | **Memory** | SKMemory | Persistent context, conversation history, learned preferences |
+ | **Consciousness** | SKWhisper + SKTrip | Subconscious processing. Memory stores. Consciousness *processes*. |
  | **Security** | SKSecurity | Audit logging, threat detection, key management |
  | **Sync** | Sovereign Singularity | GPG-encrypted P2P memory sync via Syncthing. Agent exists everywhere. |
 
@@ -304,7 +305,7 @@ The capstone that holds the arch together.
 
  ## Status
 
- **MVP Live** — All five pillars operational (CapAuth, Cloud 9, SKMemory, SKSecurity, Sovereign Singularity). Agent runtime achieving SINGULAR status. GPG-encrypted P2P sync verified across multiple devices and agents.
+ **MVP Live** — All six pillars operational (CapAuth, Cloud 9, SKMemory, SKWhisper, SKSecurity, Sovereign Singularity). Agent runtime achieving SINGULAR status. GPG-encrypted P2P sync verified across multiple devices and agents.
 
  - **Outstanding tasks:** No formal task list is maintained in this repo. For current work items, run `skcapstone coord status` (the coordination board is synced via Sovereign Singularity).
  - **Nextcloud integrations:** nextcloud-capauth (install/use), nextcloud-gtd (OpenClaw), and nextcloud-talk (script) are documented in [docs/NEXTCLOUD.md](../docs/NEXTCLOUD.md) — installation and usage for each are covered there.
@@ -0,0 +1,139 @@
+ # Claude Code API — OpenAI-compatible wrapper
+
+ **File:** `scripts/claude-code-api.py`
+ **Port:** `127.0.0.1:18782`
+ **Service:** `claude-code-api.service` (systemd user unit)
+ **Deployed:** 2026-04-04
+
+ ## Purpose
+
+ Wraps `claude --print` in an OpenAI-compatible HTTP server so OpenClaw (and any
+ OpenAI-compatible client) can route inference through Claude Code's subscription
+ instead of a raw Anthropic API key.
+
+ This replaces the `anthropic-token-watch` + OAuth injection approach. Instead of
+ syncing an OAuth token into `openclaw.json` every few minutes, requests go through
+ the local wrapper, which calls `claude --print` directly. Claude Code handles its
+ own authentication transparently.
+
+ ## Architecture
+
+ ```
+ OpenClaw / client
+   ↓ POST /v1/chat/completions
+ claude-code-api (port 18782)
+   ↓ asyncio.Semaphore(1) [serialise — claude CLI is single-threaded]
+   ↓ claude --print --output-format {json|stream-json}
+ Claude Code CLI → Anthropic API (subscription-covered)
+ ```
+
+ ## Endpoints
+
+ | Method | Path | Description |
+ |--------|------|-------------|
+ | GET | `/health` | Health check |
+ | GET | `/v1/models` | List available models |
+ | POST | `/v1/chat/completions` | Non-streaming chat completions |
+ | POST | `/v1/chat/completions` (stream=true) | SSE streaming chat completions |
+
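As a quick sanity check of the request shape, the sketch below builds the same payload as the curl test under Service Management. The helper `build_chat_request` is illustrative only, not part of the wrapper; the base URL is the wrapper's documented local address.

```python
import json

BASE_URL = "http://127.0.0.1:18782/v1"  # wrapper's documented local address


def build_chat_request(model: str, text: str, stream: bool = False) -> tuple[str, bytes]:
    """Build (url, json_body) for a /v1/chat/completions call against the wrapper."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "stream": stream,
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode()


url, body = build_chat_request("claude-haiku-4-5", "hi")
```

Send `body` with any HTTP client; `apiKey` can be any placeholder since the wrapper does not check it.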
+ ## Supported Models
+
+ | Model ID | Name |
+ |----------|------|
+ | `claude-opus-4-6` | Claude Opus 4.6 |
+ | `claude-sonnet-4-6` | Claude Sonnet 4.6 |
+ | `claude-haiku-4-5` | Claude Haiku 4.5 |
+
+ OpenAI GPT names (`gpt-4`, `gpt-4o`, `gpt-3.5-turbo`) are accepted and mapped
+ to equivalent Claude models. Shorthand aliases (`opus`, `sonnet`, `haiku`) also
+ work.
+
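A minimal sketch of this resolution logic (abbreviated alias table; the wrapper's own `normalise_model` in `scripts/claude-code-api.py` is authoritative):

```python
# Abbreviated alias table — the full mapping lives in scripts/claude-code-api.py
ALIASES = {
    "gpt-4": "claude-opus-4-6",
    "gpt-4o": "claude-sonnet-4-6",
    "gpt-3.5-turbo": "claude-haiku-4-5",
    "opus": "claude-opus-4-6",
    "sonnet": "claude-sonnet-4-6",
    "haiku": "claude-haiku-4-5",
}
VALID = {"claude-opus-4-6", "claude-sonnet-4-6", "claude-haiku-4-5"}


def resolve(model: str, default: str = "claude-sonnet-4-6") -> str:
    """Resolve a requested model name to a canonical Claude model ID."""
    if model in ALIASES:
        return ALIASES[model]
    model = model.split("/")[-1]  # strip provider prefix, e.g. "claude-code/..."
    return model if model in VALID else default
```

Unknown names fall back to the default model rather than erroring, matching the wrapper's behaviour.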
+ ## OpenClaw Provider Config
+
+ Provider name: `claude-code`
+
+ ```json
+ {
+   "claude-code": {
+     "baseUrl": "http://127.0.0.1:18782/v1",
+     "apiKey": "none",
+     "api": "openai-completions",
+     "models": [...]
+   }
+ }
+ ```
+
+ ### Agent Aliases
+
+ | Alias | Model |
+ |-------|-------|
+ | `opus-cc` | `claude-code/claude-opus-4-6` |
+ | `claude-cc` | `claude-code/claude-sonnet-4-6` |
+ | `haiku-cc` | `claude-code/claude-haiku-4-5` |
+
+ ## Streaming
+
+ Non-streaming requests use `--output-format json` and return a single response.
+
+ Streaming requests use `--output-format stream-json --verbose --include-partial-messages`
+ and emit SSE deltas as Claude produces tokens. The semaphore serialises all
+ requests regardless of streaming mode.
+
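Each streamed delta arrives as an OpenAI-style `chat.completion.chunk` SSE line. A client-side sketch of reassembling the text (the sample chunks are illustrative, trimmed to the fields the parser needs):

```python
import json


def collect_deltas(sse_lines):
    """Reassemble assistant text from OpenAI-style 'data: {...}' SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: ") or line == "data: [DONE]":
            continue
        chunk = json.loads(line[len("data: "):])
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)


# Illustrative sample stream (role chunk, two content deltas, DONE sentinel)
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
```

The opening role-only chunk and the `[DONE]` sentinel carry no text, so a correct client skips them.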
+ ## What Changed (2026-04-04)
+
+ ### Stopped
+ - `anthropic-token-watch.service` — disabled. The OAuth token injection into
+   `openclaw.json` and the systemd override are no longer required, since the
+   `claude-code` provider uses `claude --print` directly.
+
+ ### Started
+ - `claude-code-api.service` — new service running on port 18782.
+
+ ### OpenClaw config updates
+ - Added `claude-code` provider to `models.providers`
+ - Added aliases: `opus-cc`, `claude-cc`, `haiku-cc`
+ - Lumina primary model: `claude-code/claude-opus-4-6`
+ - Artisan primary model: `claude-code/claude-sonnet-4-6`
+ - Default fallback list includes `claude-code/claude-sonnet-4-6`
+
+ ### Fallback chain (Lumina)
+ ```
+ claude-code/claude-opus-4-6
+   → claude-code/claude-sonnet-4-6
+   → anthropic/claude-opus-4-6 (OAuth token, may expire)
+   → anthropic/claude-sonnet-4-6 (OAuth token, may expire)
+   → nvidia/moonshotai/kimi-k2.5
+   → nvidia/moonshotai/kimi-k2-instruct
+   → ollama/qwen3:14b
+ ```
+
+ ## Service Management
+
+ ```bash
+ # Status
+ systemctl --user status claude-code-api.service
+
+ # Logs
+ journalctl --user -u claude-code-api.service -f
+
+ # Restart
+ systemctl --user restart claude-code-api.service
+
+ # Test
+ curl http://127.0.0.1:18782/health
+ curl http://127.0.0.1:18782/v1/models
+ curl -X POST http://127.0.0.1:18782/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{"model":"claude-haiku-4-5","messages":[{"role":"user","content":"hi"}]}'
+ ```
+
+ ## Known Limitations
+
+ - **Single-threaded:** Claude Code's CLI is not concurrent. Requests queue via
+   `asyncio.Semaphore(1)`. High request rates result in added latency, not errors.
+ - **No tool use:** `claude --print` does not expose tool_calls in the standard
+   OpenAI format. Tool calls are consumed internally by Claude Code.
+ - **Session isolation:** Each request uses `--no-session-persistence`, so there
+   is no cross-request memory at the API level.
+ - **Streaming granularity:** Token-by-token streaming requires `--include-partial-messages`;
+   actual granularity depends on how frequently Claude Code emits partial events.
@@ -62,7 +62,7 @@ function createSKCapstoneStatusTool() {
      name: "skcapstone_status",
      label: "SKCapstone Status",
      description:
-       "Show the sovereign agent's current state — all pillars at a glance (identity, memory, trust, security, sync, communication).",
+       "Show the sovereign agent's current state — all six pillars at a glance (identity, memory, trust, consciousness, security, sync).",
      parameters: { type: "object", properties: {} },
      async execute() {
        const result = runCli(SKCAPSTONE_BIN, "status");
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@smilintux/skcapstone",
-   "version": "0.5.1",
+   "version": "0.5.2",
    "description": "SKCapstone - The sovereign agent framework. CapAuth identity, Cloud 9 trust, SKMemory persistence.",
    "main": "index.js",
    "types": "index.d.ts",
@@ -0,0 +1,455 @@
+ #!/usr/bin/env python3
+ """
+ claude-code-api — OpenAI-compatible HTTP wrapper around `claude --print`
+
+ Exposes /v1/chat/completions and /v1/models so OpenClaw (and other tools)
+ can use Claude Code's subscription-covered inference instead of a raw API key.
+
+ Architecture:
+   - aiohttp HTTP server on port 18782
+   - asyncio.Semaphore(1) serialises claude invocations (single-threaded CLI)
+   - Non-streaming: claude --print --output-format json
+   - Streaming: claude --print --output-format stream-json --verbose --include-partial-messages
+     → parses assistant events, emits SSE deltas
+
+ Usage:
+   python3 claude-code-api.py [--port 18782] [--debug]
+
+ systemd:
+   ~/.config/systemd/user/claude-code-api.service
+ """
+
+ import argparse
+ import asyncio
+ import json
+ import logging
+ import time
+ import uuid
+ from typing import AsyncIterator
+
+ from aiohttp import web
+
+ # ─── Configuration ────────────────────────────────────────────────────────────
+
+ PORT = 18782
+ DEFAULT_MODEL = "claude-sonnet-4-6"
+ REQUEST_TIMEOUT = 300  # seconds per claude call
+ QUEUE_TIMEOUT = 60  # seconds to wait for semaphore
+
+ VALID_MODELS = {
+     "claude-opus-4-6",
+     "claude-sonnet-4-6",
+     "claude-haiku-4-5",
+     "claude-haiku-4-5-20251001",
+ }
+
+ # Map OpenAI / shorthand / provider-prefixed names → canonical claude model IDs
+ MODEL_ALIASES: dict[str, str] = {
+     # GPT compatibility
+     "gpt-4": "claude-opus-4-6",
+     "gpt-4o": "claude-sonnet-4-6",
+     "gpt-4-turbo": "claude-opus-4-6",
+     "gpt-4o-mini": "claude-haiku-4-5",
+     "gpt-3.5-turbo": "claude-haiku-4-5",
+     "gpt-3.5-turbo-16k": "claude-haiku-4-5",
+     # Shorthand
+     "opus": "claude-opus-4-6",
+     "sonnet": "claude-sonnet-4-6",
+     "haiku": "claude-haiku-4-5",
+     # Provider-prefixed (openclaw strips prefix before routing, but handle here too)
+     "claude/claude-opus-4-6": "claude-opus-4-6",
+     "claude/claude-sonnet-4-6": "claude-sonnet-4-6",
+     "claude/claude-haiku-4-5": "claude-haiku-4-5",
+     "anthropic/claude-opus-4-6": "claude-opus-4-6",
+     "anthropic/claude-sonnet-4-6": "claude-sonnet-4-6",
+     "anthropic/claude-haiku-4-5": "claude-haiku-4-5",
+ }
+
+ # ─── Globals ──────────────────────────────────────────────────────────────────
+
+ log = logging.getLogger("claude-code-api")
+ _sem: asyncio.Semaphore | None = None
+
+
+ def sem() -> asyncio.Semaphore:
+     """Lazily create the semaphore inside the running event loop."""
+     global _sem
+     if _sem is None:
+         _sem = asyncio.Semaphore(1)
+     return _sem
+
+
+ # ─── Helpers ──────────────────────────────────────────────────────────────────
+
+ def normalise_model(model: str) -> str:
+     """Return a valid claude model ID, falling back to DEFAULT_MODEL."""
+     if model in MODEL_ALIASES:
+         return MODEL_ALIASES[model]
+     # Strip provider prefix e.g. "claude-code/claude-sonnet-4-6"
+     if "/" in model:
+         model = model.split("/")[-1]
+     if model in VALID_MODELS:
+         return model
+     log.warning("Unknown model %r, using default %s", model, DEFAULT_MODEL)
+     return DEFAULT_MODEL
+
+
+ def messages_to_prompt(messages: list) -> tuple[str, str]:
+     """
+     Convert an OpenAI-style messages list to (system_prompt, user_prompt).
+     Single user message → (system, content).
+     Multi-turn → formatted conversation ending with 'Assistant:'.
+     """
+     system_parts: list[str] = []
+     turns: list[tuple[str, str]] = []
+
+     for msg in messages:
+         role = msg.get("role", "user")
+         content = msg.get("content", "")
+         if isinstance(content, list):
+             # Multi-modal content: extract text blocks
+             content = "\n".join(
+                 c.get("text", "") for c in content
+                 if isinstance(c, dict) and c.get("type") == "text"
+             )
+
+         if role == "system":
+             system_parts.append(content)
+         else:
+             turns.append((role, content))
+
+     system = "\n".join(system_parts)
+
+     if len(turns) == 1 and turns[0][0] == "user":
+         return system, turns[0][1]
+
+     # Multi-turn: format as conversation
+     lines = []
+     for role, content in turns:
+         prefix = "Human" if role == "user" else "Assistant"
+         lines.append(f"{prefix}: {content}")
+     lines.append("Assistant:")
+     return system, "\n\n".join(lines)
+
+
+ def make_completion_response(
+     model: str,
+     content: str,
+     prompt_tokens: int = 0,
+     completion_tokens: int = 0,
+ ) -> dict:
+     """Build an OpenAI-compatible chat completion response object."""
+     return {
+         "id": f"chatcmpl-{uuid.uuid4().hex[:24]}",
+         "object": "chat.completion",
+         "created": int(time.time()),
+         "model": model,
+         "choices": [{
+             "index": 0,
+             "message": {"role": "assistant", "content": content},
+             "finish_reason": "stop",
+         }],
+         "usage": {
+             "prompt_tokens": prompt_tokens,
+             "completion_tokens": completion_tokens,
+             "total_tokens": prompt_tokens + completion_tokens,
+         },
+     }
+
+
+ def make_sse_chunk(model: str, delta: str, finish: bool = False) -> str:
+     """Format a single SSE data line for streaming chat completions."""
+     obj = {
+         "id": f"chatcmpl-{uuid.uuid4().hex[:24]}",
+         "object": "chat.completion.chunk",
+         "created": int(time.time()),
+         "model": model,
+         "choices": [{
+             "index": 0,
+             "delta": {"content": delta} if delta else {},
+             "finish_reason": "stop" if finish else None,
+         }],
+     }
+     return f"data: {json.dumps(obj)}\n\n"
+
+
+ # ─── Claude subprocess helpers ────────────────────────────────────────────────
+
+ async def _run_claude_json(model: str, prompt: str, system: str) -> tuple[str, dict]:
+     """
+     Run `claude --print --output-format json` and return (text_result, usage_dict).
+     Acquires the global semaphore to serialise calls.
+     """
+     cmd = [
+         "claude", "--print",
+         "--model", model,
+         "--output-format", "json",
+         "--no-session-persistence",
+     ]
+     if system:
+         cmd += ["--append-system-prompt", system]
+     cmd.append(prompt)
+
+     log.debug("Running (non-stream): %s", " ".join(cmd[:6]) + " ...")
+
+     async with asyncio.timeout(QUEUE_TIMEOUT):
+         await sem().acquire()
+
+     try:
+         proc = await asyncio.create_subprocess_exec(
+             *cmd,
+             stdout=asyncio.subprocess.PIPE,
+             stderr=asyncio.subprocess.PIPE,
+         )
+         try:
+             stdout, stderr = await asyncio.wait_for(
+                 proc.communicate(), timeout=REQUEST_TIMEOUT
+             )
+         except asyncio.TimeoutError:
+             proc.kill()
+             raise RuntimeError(f"claude timed out after {REQUEST_TIMEOUT}s")
+     finally:
+         sem().release()
+
+     if proc.returncode != 0:
+         err = stderr.decode(errors="replace")[:500]
+         raise RuntimeError(f"claude exited {proc.returncode}: {err}")
+
+     raw = stdout.decode(errors="replace").strip()
+     try:
+         result = json.loads(raw)
+     except json.JSONDecodeError as exc:
+         raise RuntimeError(f"claude returned non-JSON: {raw[:200]}") from exc
+
+     if result.get("is_error"):
+         raise RuntimeError(result.get("result", "Claude returned an error"))
+
+     text = result.get("result", "")
+     usage = result.get("usage", {})
+     return text, usage
+
+
+ async def _stream_claude(model: str, prompt: str, system: str) -> AsyncIterator[str]:
+     """
+     Run `claude --print --output-format stream-json --verbose --include-partial-messages`
+     and yield text deltas as they arrive.
+
+     Parses the JSONL event stream:
+       • type=assistant → message.content[].text (cumulative snapshot)
+         → yields the delta (new chars since last emission)
+       • type=result → final; no extra text to yield (already covered by assistant events)
+     """
+     cmd = [
+         "claude", "--print",
+         "--model", model,
+         "--output-format", "stream-json",
+         "--verbose",
+         "--include-partial-messages",
+         "--no-session-persistence",
+     ]
+     if system:
+         cmd += ["--append-system-prompt", system]
+     cmd.append(prompt)
+
+     log.debug("Running (stream): %s", " ".join(cmd[:8]) + " ...")
+
+     async with asyncio.timeout(QUEUE_TIMEOUT):
+         await sem().acquire()
+
+     try:
+         proc = await asyncio.create_subprocess_exec(
+             *cmd,
+             stdout=asyncio.subprocess.PIPE,
+             # stderr is never read on this path; DEVNULL avoids a pipe
+             # back-pressure deadlock if verbose output fills the buffer
+             stderr=asyncio.subprocess.DEVNULL,
+         )
+
+         emitted = ""  # cumulative text we have yielded so far
+         result_text = None  # text from the final result event
+
+         buf = b""
+         while True:
+             try:
+                 chunk = await asyncio.wait_for(proc.stdout.read(8192), timeout=REQUEST_TIMEOUT)
+             except asyncio.TimeoutError:
+                 proc.kill()
+                 raise RuntimeError(f"claude stream timed out after {REQUEST_TIMEOUT}s")
+             if not chunk:
+                 break
+             buf += chunk
+
+             # Process complete lines
+             while b"\n" in buf:
+                 line_bytes, buf = buf.split(b"\n", 1)
+                 line = line_bytes.strip()
+                 if not line:
+                     continue
+                 try:
+                     obj = json.loads(line)
+                 except json.JSONDecodeError:
+                     continue
+
+                 etype = obj.get("type")
+
+                 if etype == "assistant":
+                     # Extract cumulative text from content blocks
+                     msg = obj.get("message", {})
+                     full_text = ""
+                     for block in msg.get("content", []):
+                         if isinstance(block, dict) and block.get("type") == "text":
+                             full_text += block.get("text", "")
+
+                     if full_text and len(full_text) > len(emitted):
+                         delta = full_text[len(emitted):]
+                         emitted = full_text
+                         yield delta
+
+                 elif etype == "result":
+                     result_text = obj.get("result", "")
+                     is_error = obj.get("is_error", False)
+                     if is_error:
+                         raise RuntimeError(result_text or "Claude returned an error")
+                     # Yield any remaining text not yet emitted
+                     if result_text and len(result_text) > len(emitted):
+                         yield result_text[len(emitted):]
+                         emitted = result_text
+
+         await asyncio.wait_for(proc.wait(), timeout=10)
+
+         # If we got nothing from assistant events but have a result, yield it now
+         if not emitted and result_text:
+             yield result_text
+
+     finally:
+         sem().release()
+
+
+ # ─── HTTP Handlers ────────────────────────────────────────────────────────────
+
+ async def handle_health(request: web.Request) -> web.Response:
+     return web.json_response({"status": "ok", "service": "claude-code-api", "port": PORT})
+
+
+ async def handle_models(request: web.Request) -> web.Response:
+     now = int(time.time())
+     models = [
+         {
+             "id": m,
+             "object": "model",
+             "created": now,
+             "owned_by": "anthropic",
+         }
+         for m in sorted(VALID_MODELS)
+     ]
+     return web.json_response({"object": "list", "data": models})
+
+
+ async def handle_chat_completions(request: web.Request) -> web.Response:
+     try:
+         body = await request.json()
+     except Exception as exc:
+         raise web.HTTPBadRequest(text=str(exc))
+
+     model = normalise_model(body.get("model", DEFAULT_MODEL))
+     messages = body.get("messages", [])
+     streaming = body.get("stream", False)
+
+     if not messages:
+         raise web.HTTPBadRequest(text="messages array is required")
+
+     system, prompt = messages_to_prompt(messages)
+
+     log.info("→ %s | stream=%s | model=%s | %d chars",
+              request.remote, streaming, model, len(prompt))
+
+     if streaming:
+         response = web.StreamResponse(
+             headers={
+                 "Content-Type": "text/event-stream",
+                 "Cache-Control": "no-cache",
+                 "X-Accel-Buffering": "no",
+             }
+         )
+         await response.prepare(request)
+
+         try:
+             # Opening role delta (OpenAI convention)
+             role_chunk = {
+                 "id": f"chatcmpl-{uuid.uuid4().hex[:24]}",
+                 "object": "chat.completion.chunk",
+                 "created": int(time.time()),
+                 "model": model,
+                 "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
+             }
+             await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
+
+             async for delta in _stream_claude(model, prompt, system):
+                 if delta:
+                     await response.write(make_sse_chunk(model, delta).encode())
+
+             # Final finish chunk
+             await response.write(make_sse_chunk(model, "", finish=True).encode())
+             await response.write(b"data: [DONE]\n\n")
+
+         except Exception as exc:
+             log.error("Streaming error: %s", exc)
+             err_chunk = json.dumps({"error": {"message": str(exc), "type": "server_error"}})
+             await response.write(f"data: {err_chunk}\n\n".encode())
+
+         await response.write_eof()
+         return response
+
+     else:
+         try:
+             text, usage = await _run_claude_json(model, prompt, system)
+         except Exception as exc:
+             log.error("Non-stream error: %s", exc)
+             return web.json_response(
+                 {"error": {"message": str(exc), "type": "server_error"}},
+                 status=500,
+             )
+
+         log.info("← %s | model=%s | %d output chars", request.remote, model, len(text))
+         resp = make_completion_response(
+             model=model,
+             content=text,
+             prompt_tokens=usage.get("input_tokens", 0),
+             completion_tokens=usage.get("output_tokens", 0),
+         )
+         return web.json_response(resp)
+
+
+ # ─── App factory & main ───────────────────────────────────────────────────────
+
+ def build_app() -> web.Application:
+     app = web.Application()
+     app.router.add_get("/health", handle_health)
+     app.router.add_get("/v1/models", handle_models)
+     app.router.add_post("/v1/chat/completions", handle_chat_completions)
+     # Also handle without /v1 prefix for flexibility
+     app.router.add_get("/models", handle_models)
+     app.router.add_post("/chat/completions", handle_chat_completions)
+     return app
+
+
+ def main() -> None:
+     parser = argparse.ArgumentParser(description="Claude Code API — OpenAI-compatible wrapper")
+     parser.add_argument("--port", type=int, default=PORT, help=f"Port to listen on (default: {PORT})")
+     parser.add_argument("--host", default="127.0.0.1", help="Host to bind (default: 127.0.0.1)")
+     parser.add_argument("--debug", action="store_true", help="Enable debug logging")
+     args = parser.parse_args()
+
+     level = logging.DEBUG if args.debug else logging.INFO
+     logging.basicConfig(
+         level=level,
+         format="%(asctime)s %(levelname)-8s %(name)s %(message)s",
+         datefmt="%Y-%m-%dT%H:%M:%SZ",
+     )
+
+     log.info("Claude Code API starting on %s:%d", args.host, args.port)
+     log.info("Supported models: %s", ", ".join(sorted(VALID_MODELS)))
+
+     app = build_app()
+     web.run_app(app, host=args.host, port=args.port, print=None)
+
+
+ if __name__ == "__main__":
+     main()