PyPI - loreguard-cli - Versions diffs - 0.12.2__tar.gz → 0.13.0rc1__tar.gz - Mend

loreguard-cli 0.12.2 (tar.gz) → 0.13.0rc1 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (60)

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: loreguard-cli
-Version: 0.12.2
+Version: 0.13.0rc1
 Summary: Local inference client for Loreguard NPCs
 Project-URL: Homepage, https://loreguard.com
 Project-URL: Documentation, https://github.com/beyond-logic-labs/loreguard-cli#readme

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "loreguard-cli"
-version = "0.12.2"
+version = "0.13.0-rc.1"
 description = "Local inference client for Loreguard NPCs"
 readme = "README.md"
 license = "MIT"

loreguard_cli-0.13.0rc1/sdk/API.md ADDED

@@ -0,0 +1,249 @@
+# Loreguard Client API Reference
+Loreguard Client exposes a local HTTP API that any game can call. The server runs on `127.0.0.1` with a dynamic port written to `runtime.json`.
+## Service Discovery
+On startup, loreguard-client writes a `runtime.json` file:
+| Platform | Path |
+|----------|------|
+| macOS | `~/Library/Application Support/loreguard/runtime.json` |
+| Linux | `~/.local/share/loreguard/runtime.json` (or `$XDG_DATA_HOME/loreguard/`) |
+| Windows | `%APPDATA%/loreguard/runtime.json` |
+```json
+{
+  "port": 52341,
+  "pid": 12345,
+  "url": "http://127.0.0.1:52341",
+  "started_at": "2026-02-20T10:30:00Z",
+  "version": "0.7.0",
+  "backend_connected": true
+}
+```
+Read this file to discover the port, then make HTTP calls to `http://127.0.0.1:{port}`.
+---
+## Endpoints
+### `GET /health`
+Check if loreguard-client is running and connected to the backend.
+**Response:**
+```json
+{
+  "status": "ok",
+  "backend_connected": true
+}
+```
+Returns `500` if the server is in an error state.
+---
+### `GET /api/capabilities`
+Discover what features this bundle supports. Use this to feature-detect before sending requests with optional fields.
+**Response:**
+```json
+{
+  "streaming": true,
+  "chunk_modes": ["deberta", "sentence"],
+  "manages_history": false
+}
+```
+| Field | Type | Description |
+|-------|------|-------------|
+| `streaming` | bool | Whether SSE streaming is supported |
+| `chunk_modes` | string[] | Available chunk detection modes (e.g. `"deberta"`, `"sentence"`) |
+| `manages_history` | bool | Whether the bundle can manage conversation history internally |
+---
+### `POST /api/chat`
+Send a player message and get an NPC response. Supports both blocking JSON and SSE streaming.
+**Request Headers:**
+| Header | Value | Effect |
+|--------|-------|--------|
+| `Content-Type` | `application/json` | Required |
+| `Accept` | `text/event-stream` | Enables SSE streaming (optional) |
+| `Authorization` | `Bearer <token>` | API token for character access (optional) |
+**Request Body:**
+```json
+{
+  "character_id": "merchant-npc",
+  "message": "What do you have for sale?",
+  "player_handle": "player1",
+  "player_id": "uuid-here",
+  "current_context": "player is in the marketplace",
+  "scenario_id": "main-quest",
+  "history": [
+    {"role": "user", "content": "Hello!"},
+    {"role": "assistant", "content": "Welcome, traveler!"}
+  ],
+  "chunk_mode": "deberta",
+  "manage_history": false,
+  "max_speech_tokens": 150,
+  "verbose": false,
+  "enable_thinking": false
+}
+```
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `character_id` | string | Yes | NPC identifier |
+| `message` | string | Yes | Player's message |
+| `player_handle` | string | No | Player's display name |
+| `player_id` | string | No | Player's unique ID for per-player state. If empty, backend uses the developer's owner ID |
+| `current_context` | string | No | Game context (location, situation) |
+| `scenario_id` | string | No | Scenario identifier |
+| `history` | array | No | Conversation history. Omit if `manage_history` is true |
+| `chunk_mode` | string | No | `"deberta"` for ML-based chunk splitting, `"sentence"` for regex sentence splitting, `""` or omit for none |
+| `manage_history` | bool | No | If true, the backend manages conversation history per character+player pair |
+| `max_speech_tokens` | int | No | Maximum tokens in NPC speech (0 = default) |
+| `verbose` | bool | No | Include pipeline pass updates in response (for debugging) |
+| `enable_thinking` | bool | No | Include NPC internal monologue |
+All field names accept both `snake_case` and `camelCase` (e.g. `character_id` or `characterId`).
+#### JSON Response (default)
+When `Accept` is not `text/event-stream`:
+```json
+{
+  "response": "I have potions, swords, and shields. What interests you?",
+  "verified": true,
+  "citations": ["knowledge/inventory.md:5"],
+  "chunks": [
+    "I have potions, swords, and shields.",
+    "What interests you?"
+  ],
+  "pipeline_trace": []
+}
+```
+| Field | Type | Description |
+|-------|------|-------------|
+| `response` | string | Full NPC speech |
+| `verified` | bool | Whether NeMo verification passed |
+| `citations` | string[] | Knowledge sources used |
+| `chunks` | string[] | Sentence/chunk boundaries (only present when `chunk_mode` was set) |
+| `pipeline_trace` | array | Pipeline pass details (only present when `verbose` was true) |
+#### SSE Streaming Response
+When `Accept: text/event-stream` is set, the response is a stream of Server-Sent Events:
+```
+event: filler
+data: {"text": "Hmm...", "dialogueAct": "wh-question"}
+event: pass_update
+data: {"pass": "retrieval", "status": "complete", "latencyMs": 120}
+event: token
+data: {"t": "I"}
+event: token
+data: {"t": " have"}
+event: token
+data: {"t": " potions"}
+event: done
+data: {"speech": "I have potions...", "verified": true, "citations": [...], "chunks": [...]}
+event: follow_up
+data: {"speech": "By the way, new stock arrives tomorrow.", ...}
+```
+| Event | Data | Description |
+|-------|------|-------------|
+| `filler` | `{text, dialogueAct}` | Contextual filler message. Sent early (~100ms) before the pipeline completes. Display immediately for perceived responsiveness |
+| `token` | `{t}` | Single token from the LLM. Append to build the response incrementally |
+| `pass_update` | `{pass, status, ...}` | Pipeline pass progress (only when `verbose` is true). For debugging |
+| `done` | `{speech, verified, citations, chunks, ...}` | Final verified response. Contains the same fields as the JSON response |
+| `follow_up` | `{speech, ...}` | Unsolicited follow-up message from the NPC (may arrive after `done`) |
+| `error` | `{error}` | Error message. Stream ends after this |
+---
+## Integration Patterns
+### Minimal (any language)
+Just POST JSON and read the response:
+```
+POST http://127.0.0.1:{port}/api/chat
+Content-Type: application/json
+{"character_id": "merchant", "message": "Hello!"}
+```
+### With Chunks (for staggered display)
+Request chunked responses and display each chunk separately with delays:
+```json
+{
+  "character_id": "merchant",
+  "message": "Hello!",
+  "chunk_mode": "deberta"
+}
+```
+Response includes `chunks` array. Display each chunk as a separate message/bubble with ~500-700ms delays between them.
+### With Server-Managed History
+Let the backend track conversation history so your game doesn't have to:
+```json
+{
+  "character_id": "merchant",
+  "message": "Hello!",
+  "player_id": "player-uuid",
+  "manage_history": true
+}
+```
+No `history` field needed. The backend maintains a rolling conversation buffer per `(character_id, player_id)` pair.
+### With Streaming + Filler
+For games with real-time UX (speech bubbles, typing indicators):
+1. Set `Accept: text/event-stream`
+2. On `filler` event: show typing indicator or filler text ("Hmm...")
+3. On `token` events: append tokens to display
+4. On `done` event: finalize the response, show verified status
+---
+## SDK Files
+Pre-built SDK files for common engines are in the `sdk/` directory. Copy the relevant file into your project:
+| Engine | File | Notes |
+|--------|------|-------|
+| Python | `sdk/python/loreguard_sdk.py` | Requires `httpx`. Async + sync support |
+| JavaScript / Electron | `sdk/javascript/loreguard-sdk.js` | Node.js CommonJS. Uses `fetch` |
+| Unity / C# | `sdk/csharp/LoreguardSDK.cs` | Uses `UnityWebRequest`. Coroutine-based |
+| Godot 4 | `sdk/gdscript/LoreguardSDK.gd` | Signal-based. Supports streaming |
+These are thin HTTP wrappers around the endpoints documented above. You can also call the API directly from any language that supports HTTP.
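
Reviewer note on the service-discovery table in `sdk/API.md`: the lookup it describes can be sketched in a few lines of Python. This is a minimal sketch, not the SDK's own `get_base_url()` (whose body is not part of this diff). The names `runtime_json_path` and `read_base_url` are hypothetical, and the `AppData/Roaming` fallback used when `APPDATA` is unset is an assumption rather than documented behavior.

```python
from __future__ import annotations

import json
import os
import sys
from pathlib import Path


def runtime_json_path(platform: str = sys.platform,
                      env: dict[str, str] | None = None) -> Path:
    """Return the runtime.json location listed in sdk/API.md for a platform.

    Hypothetical helper: `platform` and `env` are injectable so the
    logic can be exercised without touching the real environment.
    """
    env = dict(os.environ) if env is None else env
    home = Path(env.get("HOME") or Path.home())
    if platform == "darwin":
        base = home / "Library" / "Application Support"
    elif platform.startswith("win"):
        # Assumption: fall back to the conventional roaming directory
        # when APPDATA is not set.
        base = Path(env.get("APPDATA") or home / "AppData" / "Roaming")
    else:
        # Linux and other POSIX systems: honor XDG_DATA_HOME if present.
        base = Path(env.get("XDG_DATA_HOME") or home / ".local" / "share")
    return base / "loreguard" / "runtime.json"


def read_base_url(path: Path) -> str:
    """Read runtime.json and return the base URL for HTTP calls."""
    info = json.loads(path.read_text(encoding="utf-8"))
    # Prefer the ready-made "url" field; otherwise build it from "port".
    return info.get("url") or f"http://127.0.0.1:{info['port']}"
```

A caller would typically use `read_base_url(runtime_json_path())` and treat a missing file as "client not running".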

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/sdk/python/loreguard_sdk.py RENAMED

@@ -145,7 +145,12 @@ async def chat(
     character_id: str,
     message: str,
     player_handle: str = "",
+    player_id: str = "",
     current_context: str = "",
+    history: list[dict[str, Any]] | None = None,
+    chunk_mode: str = "",
+    manage_history: bool = False,
+    max_speech_tokens: int = 0,
     stream: bool = True,
 ) -> AsyncIterator[dict[str, Any]]:
     """Chat with an NPC via loreguard-client.
@@ -154,12 +159,17 @@ async def chat(
         character_id: The NPC's ID
         message: Player's message to the NPC
         player_handle: Player's display name (optional)
+        player_id: Player's unique ID for per-player state (optional)
         current_context: Game context like "in a dark cave" (optional)
+        history: Conversation history as [{"role": "user"|"assistant", "content": "..."}] (optional)
+        chunk_mode: Chunk detection mode — "deberta" for ML-based, "" for none (optional)
+        manage_history: If True, bundle manages history internally per character+player (optional)
+        max_speech_tokens: Max tokens in NPC speech (optional, 0 = default)
         stream: If True, yields tokens as they arrive. If False, yields final response.
     Yields:
         For streaming: {"t": "token"} for each token, then {"speech": "...", "verified": True, ...}
-        For non-streaming: Single dict with complete response
+        For non-streaming: Single dict with complete response (includes "chunks" if chunk_mode set)
     Raises:
         RuntimeError: If loreguard-client is not running
@@ -181,12 +191,22 @@ async def chat(
     if stream:
         headers["Accept"] = "text/event-stream"
-    body = {
+    body: dict[str, Any] = {
         "character_id": character_id,
         "message": message,
         "player_handle": player_handle,
         "current_context": current_context,
     }
+    if player_id:
+        body["player_id"] = player_id
+    if history:
+        body["history"] = history
+    if chunk_mode:
+        body["chunk_mode"] = chunk_mode
+    if manage_history:
+        body["manage_history"] = True
+    if max_speech_tokens > 0:
+        body["max_speech_tokens"] = max_speech_tokens
     async with httpx.AsyncClient() as client:
         if stream:
@@ -214,6 +234,26 @@ async def chat(
             yield response.json()
+async def get_capabilities() -> dict[str, Any]:
+    """Get bundle capabilities.
+    Returns:
+        Capabilities dict with streaming, chunk_modes, manages_history.
+    Raises:
+        RuntimeError: If loreguard-client is not running
+        ImportError: If httpx is not installed
+    """
+    if httpx is None:
+        raise ImportError("httpx is required. Install with: pip install httpx")
+    url = f"{get_base_url()}/api/capabilities"
+    async with httpx.AsyncClient() as client:
+        response = await client.get(url, timeout=5.0)
+        response.raise_for_status()
+        return response.json()
 async def health_check() -> dict[str, Any]:
     """Check loreguard-client health.
@@ -239,7 +279,12 @@ def chat_sync(
     character_id: str,
     message: str,
     player_handle: str = "",
+    player_id: str = "",
     current_context: str = "",
+    history: list[dict[str, Any]] | None = None,
+    chunk_mode: str = "",
+    manage_history: bool = False,
+    max_speech_tokens: int = 0,
 ) -> dict[str, Any]:
     """Synchronous chat (non-streaming).
@@ -247,17 +292,28 @@ def chat_sync(
     Returns:
         Complete response dict with speech, verified, citations, etc.
+        Includes "chunks" list if chunk_mode was set.
     """
     if httpx is None:
         raise ImportError("httpx is required. Install with: pip install httpx")
     url = f"{get_base_url()}/api/chat"
-    body = {
+    body: dict[str, Any] = {
         "character_id": character_id,
         "message": message,
         "player_handle": player_handle,
         "current_context": current_context,
     }
+    if player_id:
+        body["player_id"] = player_id
+    if history:
+        body["history"] = history
+    if chunk_mode:
+        body["chunk_mode"] = chunk_mode
+    if manage_history:
+        body["manage_history"] = True
+    if max_speech_tokens > 0:
+        body["max_speech_tokens"] = max_speech_tokens
     with httpx.Client() as client:
         response = client.post(url, json=body, timeout=120.0)
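
Reviewer note on the SDK changes above: the new `get_capabilities()` call and the optional request fields are meant to be used together, feature-detecting first and then sending only the fields that are set. The sketch below illustrates that flow under stated assumptions. `choose_chunk_mode` and `build_chat_body` are hypothetical helpers and not part of the package; the body construction mirrors the conditional logic shown in the diff for `chat()` and `chat_sync()`.

```python
from __future__ import annotations

from typing import Any


def choose_chunk_mode(caps: dict[str, Any],
                      preferred: tuple[str, ...] = ("deberta", "sentence")) -> str:
    """Return the first preferred mode the bundle advertises, or "" for none."""
    available = caps.get("chunk_modes") or []
    for mode in preferred:
        if mode in available:
            return mode
    return ""


def build_chat_body(character_id: str, message: str, *,
                    player_handle: str = "", player_id: str = "",
                    current_context: str = "",
                    history: list[dict[str, Any]] | None = None,
                    chunk_mode: str = "", manage_history: bool = False,
                    max_speech_tokens: int = 0) -> dict[str, Any]:
    """Build a /api/chat body, omitting optional fields left at their defaults."""
    body: dict[str, Any] = {
        "character_id": character_id,
        "message": message,
        "player_handle": player_handle,
        "current_context": current_context,
    }
    # Same rule as the diff: only truthy / non-zero values are sent.
    if player_id:
        body["player_id"] = player_id
    if history:
        body["history"] = history
    if chunk_mode:
        body["chunk_mode"] = chunk_mode
    if manage_history:
        body["manage_history"] = True
    if max_speech_tokens > 0:
        body["max_speech_tokens"] = max_speech_tokens
    return body
```

With the dict returned by `get_capabilities()` in hand, a caller could pass `chunk_mode=choose_chunk_mode(caps)` so that `"deberta"` is requested only when the bundle reports it. Note that the capabilities handler in this release reports `manages_history` as `False`, so a cautious client may want to keep sending `history` itself.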

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/src/http_server.py RENAMED

@@ -266,6 +266,8 @@ class EmbeddedHTTPServer:
                         "verified": data.get("verified", False),
                         "citations": data.get("citations", []),
                     }
+                    if data.get("chunks"):
+                        result["chunks"] = data["chunks"]
                     if pipeline_trace:
                         result["pipeline_trace"] = pipeline_trace
                     return result
@@ -323,6 +325,18 @@ class EmbeddedHTTPServer:
                     content={"status": "error", "error": str(e)},
                 )
+        @app.get("/api/capabilities")
+        async def capabilities():
+            caps = {
+                "streaming": True,
+                "chunk_modes": ["sentence"],
+                "manages_history": False,
+            }
+            if server.tunnel:
+                if getattr(server.tunnel, "chunk_detector", None) and server.tunnel.chunk_detector.is_loaded:
+                    caps["chunk_modes"].append("deberta")
+            return caps
         @app.post("/api/chat")
         async def chat(request: Request):
             if not server.tunnel or not server.tunnel.connected:
@@ -340,6 +354,8 @@ class EmbeddedHTTPServer:
             scenario_id = body.get("scenario_id", body.get("scenarioId", ""))
             enable_thinking = body.get("enable_thinking", body.get("enableThinking", False))
             max_speech_tokens = body.get("max_speech_tokens", body.get("maxSpeechTokens", 0))
+            chunk_mode = body.get("chunk_mode", body.get("chunkMode", ""))
+            manage_history = body.get("manage_history", body.get("manageHistory", False))
             accept = request.headers.get("accept", "")
             streaming = "text/event-stream" in accept
@@ -366,6 +382,8 @@ class EmbeddedHTTPServer:
                             verbose=body.get("verbose", False),
                             api_token=api_token,
                             max_speech_tokens=max_speech_tokens,
+                            chunk_mode=chunk_mode,
+                            manage_history=manage_history,
                         )
                     )
                     # Wait for the result
@@ -389,6 +407,8 @@ class EmbeddedHTTPServer:
                     verbose=body.get("verbose", False),
                     api_token=api_token,
                     max_speech_tokens=max_speech_tokens,
+                    chunk_mode=chunk_mode,
+                    manage_history=manage_history,
                 )
             if streaming:
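
Reviewer note on the streaming branch: on the client side, the event stream documented in `sdk/API.md` (`token`, `done`, `error`, and so on) can be consumed roughly as follows. This is a sketch under assumptions, not the SDK's parser: it assumes standard Server-Sent Events framing (a blank line ends each event, which the flattened example in the docs does not show), and `iter_sse_events` and `consume` are hypothetical names. It also treats a missing or null `chunks` value as an empty list, since the tunnel's `done` payload uses `payload.get("chunks")` without a default.

```python
from __future__ import annotations

import json
from typing import Any, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, Any]]:
    """Yield (event_name, decoded_json_data) pairs from SSE text lines."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip(" "))
    if data_lines:
        # Flush a trailing event that was not followed by a blank line.
        yield event, json.loads("\n".join(data_lines))


def consume(lines: Iterable[str]) -> dict[str, Any]:
    """Accumulate tokens and return the final result at the `done` event."""
    tokens: list[str] = []
    for event, data in iter_sse_events(lines):
        if event == "token":
            tokens.append(data["t"])
        elif event == "done":
            return {
                "speech": data.get("speech") or "".join(tokens),
                "chunks": data.get("chunks") or [],
                "verified": bool(data.get("verified", False)),
            }
        elif event == "error":
            raise RuntimeError(data.get("error", "unknown error"))
        # filler / pass_update events are ignored in this sketch.
    return {"speech": "".join(tokens), "chunks": [], "verified": False}
```

Any iterable of text lines works as input, for example the line iterator of a streaming HTTP response. Because this sketch returns at `done`, a client that wants `follow_up` events would need to keep reading instead.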

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/src/tui/widgets/banner.py RENAMED

@@ -5,6 +5,7 @@ from rich.text import Text
 from rich.style import Style
 from ..styles import FG_DIM, PINK
+from ...runtime import get_version
 # Simple stylized logo
 LOGO = r"""
@@ -43,9 +44,9 @@ class LoreguardBanner(Static):
     }
     """
-    def __init__(self, version: str = "0.11.0") -> None:
+    def __init__(self, version: str = None) -> None:
         super().__init__()
-        self._version = version
+        self._version = version or get_version()
     def render(self) -> Text:
         """Render minimal banner."""

{loreguard_cli-0.12.2 → loreguard_cli-0.13.0rc1}/src/tunnel.py RENAMED

@@ -468,6 +468,7 @@ class BackendTunnel:
         elif msg_type == "pass_update":
             # Pipeline pass update (verbose mode)
             payload = data.get("payload", {})
+            self._log(f"[pass_update] received pass={payload.get('pass','?')} name={payload.get('name','?')}", "info")
             if self.on_pass_update:
                 self.on_pass_update(payload)
             # Also route to per-request queue for HTTP/SSE clients
@@ -1388,6 +1389,8 @@ class BackendTunnel:
         verbose: bool = False,
         api_token: str = "",
         max_speech_tokens: int = 0,
+        chunk_mode: str = "",
+        manage_history: bool = False,
     ) -> asyncio.Queue[dict[str, Any]]:
         """Send a chat request to the backend and return a queue for responses.
@@ -1434,6 +1437,10 @@ class BackendTunnel:
         # Only include maxSpeechTokens if explicitly set (non-zero)
         if max_speech_tokens > 0:
             payload["maxSpeechTokens"] = max_speech_tokens
+        if chunk_mode:
+            payload["chunkMode"] = chunk_mode
+        if manage_history:
+            payload["manageHistory"] = True
         await self._send({
             "id": self._generate_message_id(),
@@ -1498,6 +1505,7 @@ class BackendTunnel:
                 "type": "done",
                 "data": {
                     "speech": payload.get("speech", ""),
+                    "chunks": payload.get("chunks"),
                     "thoughts": payload.get("thoughts", ""),
                     "citations": payload.get("citations", []),
                     "verified": payload.get("verified", False),