loreguard-cli 0.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
loreguard_cli-0.3.0.dist-info/METADATA ADDED
@@ -0,0 +1,220 @@
+ Metadata-Version: 2.4
+ Name: loreguard-cli
+ Version: 0.3.0
+ Summary: Local inference client for Loreguard NPCs
+ Project-URL: Homepage, https://loreguard.com
+ Project-URL: Documentation, https://github.com/beyond-logic-labs/loreguard-cli#readme
+ Project-URL: Repository, https://github.com/beyond-logic-labs/loreguard-cli
+ Project-URL: Issues, https://github.com/beyond-logic-labs/loreguard-cli/issues
+ License-Expression: MIT
+ License-File: LICENSE
+ Keywords: gamedev,inference,llm,loreguard,npc
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Environment :: Console
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Games/Entertainment
+ Requires-Python: >=3.10
+ Requires-Dist: aiofiles>=24.1.0
+ Requires-Dist: httpx>=0.26.0
+ Requires-Dist: pydantic>=2.5.0
+ Requires-Dist: websockets>=12.0
+ Provides-Extra: build
+ Requires-Dist: pyinstaller>=6.0.0; extra == 'build'
+ Provides-Extra: dev
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
+ Requires-Dist: pytest>=7.4.0; extra == 'dev'
+ Requires-Dist: ruff>=0.1.0; extra == 'dev'
+ Description-Content-Type: text/markdown
+
+ # Loreguard
+
+ [![PyPI version](https://img.shields.io/pypi/v/loreguard-cli.svg)](https://pypi.org/project/loreguard-cli/)
+ [![Build](https://github.com/beyond-logic-labs/loreguard-cli/actions/workflows/release.yml/badge.svg)](https://github.com/beyond-logic-labs/loreguard-cli/actions/workflows/release.yml)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
+ [![GitHub release](https://img.shields.io/github/v/release/beyond-logic-labs/loreguard-cli)](https://github.com/beyond-logic-labs/loreguard-cli/releases)
+
+ ```
+ ┌────────────────────────────────────────────────────────────────────────────────┐
+ │                                                                                │
+ │  ██╗      ██████╗ ██████╗ ███████╗ ██████╗ ██╗   ██╗ █████╗ ██████╗ ██████╗    │
+ │  ██║     ██╔═══██╗██╔══██╗██╔════╝██╔════╝ ██║   ██║██╔══██╗██╔══██╗██╔══██╗   │
+ │  ██║     ██║   ██║██████╔╝█████╗  ██║  ███╗██║   ██║███████║██████╔╝██║  ██║   │
+ │  ██║     ██║   ██║██╔══██╗██╔══╝  ██║   ██║██║   ██║██╔══██║██╔══██╗██║  ██║   │
+ │  ███████╗╚██████╔╝██║  ██║███████╗╚██████╔╝╚██████╔╝██║  ██║██║  ██║██████╔╝   │
+ │  ╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚══════╝ ╚═════╝  ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚═════╝    │
+ │                                                                                │
+ │                      Local inference for your game NPCs                        │
+ │                               loreguard.com                                    │
+ │                                                                                │
+ └────────────────────────────────────────────────────────────────────────────────┘
+ ```
+
+ AI-powered NPCs on your own hardware (your servers or your players' machines).
+ Loreguard CLI connects local LLM inference to the Loreguard NPC system.
+
+ ## How It Works
+
+ ```
+ ┌─────────────────┐      wss://api.loreguard.com      ┌─────────────────┐
+ │   Your Game     │◄─────────────────────────────────►│  Loreguard API  │
+ │  (NPC Dialog)   │                                   │   (Backend)     │
+ └─────────────────┘                                   └────────┬────────┘
+                                                                │
+                                                                │ Routes inference
+                                                                │ to your worker
+                                                                │
+                                                       ┌─────────────────┐
+                                                       │  Loreguard CLI  │◄── You run this
+                                                       │   (This repo)   │
+                                                       └────────┬────────┘
+                                                                │
+                                                                │ Local inference
+                                                                │
+                                                       ┌─────────────────┐
+                                                       │   llama.cpp     │
+                                                       │  (Your GPU/CPU) │
+                                                       └─────────────────┘
+ ```
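The routing above can be sketched end to end. This is an illustrative asyncio simulation, not the real protocol: the message shape (`npc`, `prompt`, `reply`) and the stubbed `local_inference` function are hypothetical stand-ins for the actual WebSocket traffic and the llama.cpp call.

```python
import asyncio


async def local_inference(prompt: str) -> str:
    # Stand-in for the llama.cpp generation the worker performs locally.
    await asyncio.sleep(0)  # pretend to generate tokens
    return f"(NPC reply to: {prompt})"


async def worker_handle(request: dict) -> dict:
    # The CLI receives a routed request and answers with local inference.
    reply = await local_inference(request["prompt"])
    return {"npc": request["npc"], "reply": reply}


async def route_one_request() -> dict:
    # The backend forwards a game's NPC dialog request to your worker.
    request = {"npc": "blacksmith", "prompt": "Any work for me?"}
    return await worker_handle(request)


response = asyncio.run(route_one_request())
print(response["npc"])  # → blacksmith
```

The point is only the direction of traffic: the game never talks to your GPU directly; the backend relays requests to whichever worker you run.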
+
+ ## Installation
+
+ ### Option 1: Download Binary (Recommended)
+
+ Download standalone binaries from [Releases](https://github.com/beyond-logic-labs/loreguard-cli/releases):
+ - `loreguard-linux` - Linux x64
+ - `loreguard-macos` - macOS (Intel & Apple Silicon)
+ - `loreguard-windows.exe` - Windows x64
+
+ ### Option 2: Install from PyPI
+
+ ```bash
+ pip install loreguard-cli
+ ```
+
+ ### Option 3: Install from Source
+
+ ```bash
+ git clone https://github.com/beyond-logic-labs/loreguard-cli
+ cd loreguard-cli
+ pip install -e .
+ ```
+
+ ### Option 4: Build Your Own Binary
+
+ ```bash
+ git clone https://github.com/beyond-logic-labs/loreguard-cli
+ cd loreguard-cli
+ pip install -e ".[build]"
+ python scripts/build.py
+ # Output: dist/loreguard (or dist/loreguard.exe on Windows)
+ ```
+
+ ## Quick Start
+
+ ### Interactive Wizard
+
+ ```bash
+ loreguard
+ ```
+
+ The wizard guides you through:
+ 1. **Authentication** - Enter your worker token
+ 2. **Model Selection** - Choose or download a model
+ 3. **Running** - Starts llama-server and connects to the backend
+
+ ### Headless CLI
+
+ ```bash
+ loreguard-cli --token lg_worker_xxx --model /path/to/model.gguf
+ ```
+
+ Or auto-download a supported model:
+
+ ```bash
+ loreguard-cli --token lg_worker_xxx --model-id qwen3-4b-instruct
+ ```
+
+ **Environment variables:**
+ ```bash
+ export LOREGUARD_TOKEN=lg_worker_xxx
+ export LOREGUARD_MODEL=/path/to/model.gguf
+ loreguard-cli
+ ```
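Command-line flags take precedence over these environment variables, because the CLI wires each env var in as the argparse default (see `src/cli.py`). A minimal sketch of that pattern:

```python
import argparse
import os

# Simulate the environment for the example.
os.environ["LOREGUARD_PORT"] = "9090"

parser = argparse.ArgumentParser()
# The env var supplies the default; an explicit --port flag overrides it.
parser.add_argument("--port", type=int, default=int(os.getenv("LOREGUARD_PORT", "8080")))

print(parser.parse_args([]).port)                   # → 9090 (env value, no flag given)
print(parser.parse_args(["--port", "7000"]).port)   # → 7000 (flag overrides env)
```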
+
+ ## Supported Models
+
+ | Model ID | Name | Size | Notes |
+ |----------|------|------|-------|
+ | `qwen3-4b-instruct` | Qwen3 4B Instruct | 2.8 GB | **Recommended** |
+ | `llama-3.2-3b-instruct` | Llama 3.2 3B | 2.0 GB | Fast |
+ | `qwen3-8b` | Qwen3 8B | 5.2 GB | Higher quality |
+ | `meta-llama-3-8b-instruct` | Llama 3 8B | 4.9 GB | General purpose |
+
+ Or use any `.gguf` model with `--model /path/to/model.gguf`.
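With a model loaded, the CLI's dev mode leaves llama-server reachable at `http://127.0.0.1:8080` (the default port) for direct requests. llama.cpp's server exposes an OpenAI-compatible chat API, so a request can be sketched as below; the `/v1/chat/completions` path and payload shape follow that convention rather than anything this package defines, and the URL assumes the CLI's defaults.

```python
import json
import urllib.request


def build_chat_request(prompt: str, base_url: str = "http://127.0.0.1:8080") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for a local llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("Greet the player in character.")
print(req.full_url)  # → http://127.0.0.1:8080/v1/chat/completions

# To actually send it, start the worker first (e.g. with --dev), then:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```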
+
+ ## Use Cases
+
+ ### For Game Developers (Testing & Development)
+
+ Use Loreguard CLI during development to test NPC dialogs on your own hardware:
+
+ ```bash
+ # Start the worker
+ loreguard-cli --token $YOUR_DEV_TOKEN --model-id qwen3-4b-instruct
+
+ # Your game connects to the Loreguard API;
+ # NPC inference requests are routed to your local worker
+ ```
+
+ ### For Players (Coming Soon)
+
+ > **Note:** Player distribution support is in development. Currently, players would need their own Loreguard account and token.
+
+ We're working on a **Game Keys** system that will allow:
+ - Developers to register their game and get a Game API Key
+ - Players to run the CLI without needing a Loreguard account
+ - Automatic worker provisioning scoped to each game
+
+ **Interested in early access?** Contact us at [loreguard.com](https://loreguard.com)
+
+ ## Requirements
+
+ - **RAM**: 8 GB minimum (16 GB+ for larger models)
+ - **GPU**: Optional but recommended (NVIDIA CUDA or Apple Silicon)
+ - **Disk**: 2-6 GB depending on the model
+ - **Python**: 3.10+ (if installing from source)
+
+ ## Get Your Token
+
+ 1. Go to [loreguard.com/developers](https://loreguard.com/developers)
+ 2. Create a worker token
+ 3. Use it with `--token` or `LOREGUARD_TOKEN`
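The CLI rejects tokens that don't start with `lg_worker_` and derives a worker ID from the third underscore-separated field (format `lg_worker_<id>_<secret>`). A minimal sketch mirroring the checks in `src/cli.py`:

```python
def validate_token(token: str) -> str:
    """Return the worker ID embedded in a token, mirroring src/cli.py.

    Tokens look like lg_worker_<id>_<secret>; anything else is rejected.
    """
    if not token.startswith("lg_worker_"):
        raise ValueError("Invalid token format (must start with lg_worker_)")
    parts = token.split("_")
    # Same fallback the CLI uses when the token has no separate id field.
    return parts[2] if len(parts) >= 3 else "worker"


print(validate_token("lg_worker_abc123_s3cret"))  # → abc123
```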
+
+ ## Development
+
+ ```bash
+ git clone https://github.com/beyond-logic-labs/loreguard-cli
+ cd loreguard-cli
+ python -m venv .venv
+ source .venv/bin/activate  # Windows: .venv\Scripts\activate
+ pip install -e ".[dev]"
+
+ # Run interactive wizard
+ python -m src.wizard
+
+ # Run headless CLI
+ python -m src.cli --help
+
+ # Run tests
+ pytest
+ ```
+
+ ## License
+
+ MIT
loreguard_cli-0.3.0.dist-info/RECORD ADDED
@@ -0,0 +1,17 @@
+ src/__init__.py,sha256=ZMoriJzdvRs-iW4dMtpdNd1FldEG5NG0TGaLoU2lvAc,88
+ src/cli.py,sha256=gndjZfWb_YtfE8hK8rnHKx8jjDMVd6mtHtMNbN1sfU0,13044
+ src/config.py,sha256=5WWbcaC5IOayHRm9AR8SquKonhouF4wZfbWf7_RqHbs,2266
+ src/llama_server.py,sha256=68ufUY6kVyqOW7bdnNCIKGTisHFBjeFUioj6fRyHqR4,12322
+ src/llm.py,sha256=zIg0fBE1hZDuvsqVPbKvApdgqE-WRCUN0yzPcv80Qpw,18113
+ src/main.py,sha256=0s-zbBZRWOesfd6utVsqNc_DMSxSiJBCl4qjxt-6btQ,4628
+ src/models_registry.py,sha256=VQNrapqbraw2B8PdfoOuqVDrJK2NBEAArNzyMf9PR6Q,4824
+ src/npc_chat.py,sha256=qpd71DIDyNt5rE6VpwKk632X8T4b2I1bvRKUnMajYrA,16310
+ src/steam.py,sha256=hdjSrdi3dNHVo7Ck0MgIo2IHuXDIJp3vFYofhOptNxc,16330
+ src/term_ui.py,sha256=gA73hO9MMeGDU5f44UhqvYP92i5-euY2TibIKWTE3F8,27197
+ src/tunnel.py,sha256=g9VakpW61BwL9ToEot6w5Hz07fAvVvH7YKkQZc15dS4,15661
+ src/wizard.py,sha256=RuQGHasR16k2Mj5ALiNRovHWhQ3cmESVrMuNl2iXP00,21258
+ loreguard_cli-0.3.0.dist-info/METADATA,sha256=-7DskTrhB1UrEUe4MOzoWsCsAYjMvb3o7Lti2lb8yQ8,9416
+ loreguard_cli-0.3.0.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
+ loreguard_cli-0.3.0.dist-info/entry_points.txt,sha256=LofpwUo6uKd-9uLxeJIbkZx8RX0DNFz3UPxM1Wokm5w,75
+ loreguard_cli-0.3.0.dist-info/licenses/LICENSE,sha256=r39SrDcO4q8PEQTShsd2OWVcA9tj07oiBgoWlq-_x9c,1074
+ loreguard_cli-0.3.0.dist-info/RECORD,,
loreguard_cli-0.3.0.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: hatchling 1.28.0
+ Root-Is-Purelib: true
+ Tag: py3-none-any
loreguard_cli-0.3.0.dist-info/entry_points.txt ADDED
@@ -0,0 +1,3 @@
+ [console_scripts]
+ loreguard = src.wizard:main
+ loreguard-cli = src.cli:main
loreguard_cli-0.3.0.dist-info/licenses/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024 Beyond Logic Labs
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
src/__init__.py ADDED
@@ -0,0 +1,3 @@
+ """Lorekeeper Client Bridge - Local LLM to Backend connector."""
+
+ __version__ = "0.2.0"
src/cli.py ADDED
@@ -0,0 +1,409 @@
1
+ #!/usr/bin/env python3
2
+ """Loreguard CLI - Standalone headless mode for embedding in games.
3
+
4
+ Usage:
5
+ loreguard-cli --token lg_worker_xxx --model /path/to/model.gguf
6
+ loreguard-cli --token lg_worker_xxx --model-id qwen3-4b
7
+
8
+ Environment variables (alternative to args):
9
+ LOREGUARD_TOKEN Worker token
10
+ LOREGUARD_MODEL Path to model file
11
+ LOREGUARD_MODEL_ID Model ID to download (if not using custom model)
12
+ LOREGUARD_PORT Local llama-server port (default: 8080)
13
+ LOREGUARD_BACKEND Backend URL (default: wss://api.loreguard.com/workers)
14
+ """
15
+
16
+ import argparse
17
+ import asyncio
18
+ import logging
19
+ import os
20
+ import signal
21
+ import sys
22
+ from datetime import datetime
23
+ from pathlib import Path
24
+ from typing import Optional
25
+
26
+ # Setup logging
27
+ logging.basicConfig(
28
+ level=logging.INFO,
29
+ format="%(asctime)s [%(levelname)s] %(message)s",
30
+ datefmt="%Y-%m-%d %H:%M:%S",
31
+ )
32
+ log = logging.getLogger("loreguard")
33
+
34
+
35
+ class LoreguardCLI:
36
+ """Headless Loreguard client for embedding in games."""
37
+
38
+ def __init__(
39
+ self,
40
+ token: str,
41
+ model_path: Optional[Path] = None,
42
+ model_id: Optional[str] = None,
43
+ port: int = 8080,
44
+ backend_url: str = "wss://api.loreguard.com/workers",
45
+ ):
46
+ self.token = token
47
+ self.model_path = model_path
48
+ self.model_id = model_id
49
+ self.port = port
50
+ self.backend_url = backend_url
51
+
52
+ self._llama = None
53
+ self._tunnel = None
54
+ self._running = False
55
+
56
+ # Metrics
57
+ self._requests = 0
58
+ self._tokens = 0
59
+ self._start_time = None
60
+
+     async def run(self) -> int:
+         """Run the client. Returns exit code."""
+         self._start_time = datetime.now()
+
+         # Setup signal handlers
+         loop = asyncio.get_event_loop()
+         for sig in (signal.SIGINT, signal.SIGTERM):
+             loop.add_signal_handler(sig, lambda: asyncio.create_task(self._shutdown()))
+
+         log.info("=" * 50)
+         log.info("Loreguard CLI - Starting")
+         log.info("=" * 50)
+
+         try:
+             # Resolve model path
+             if not await self._resolve_model():
+                 return 1
+
+             # Start llama-server
+             if not await self._start_llama_server():
+                 return 1
+
+             # Connect to backend
+             if not await self._connect_backend():
+                 return 1
+
+             self._running = True
+             log.info("=" * 50)
+             log.info("Ready! Waiting for inference requests...")
+             log.info("Press Ctrl+C to stop")
+             log.info("=" * 50)
+
+             # Keep running until shutdown
+             while self._running:
+                 await asyncio.sleep(1)
+                 self._log_stats()
+
+             return 0
+
+         except Exception as e:
+             log.error(f"Fatal error: {e}")
+             return 1
+         finally:
+             await self._cleanup()
+
107
+ """Resolve model path, downloading if needed."""
108
+ if self.model_path:
109
+ if not self.model_path.exists():
110
+ log.error(f"Model not found: {self.model_path}")
111
+ return False
112
+ log.info(f"Using model: {self.model_path}")
113
+ return True
114
+
115
+ if self.model_id:
116
+ from .models_registry import SUPPORTED_MODELS
117
+ from .llama_server import get_models_dir, DownloadProgress
118
+
119
+ # Find model by ID
120
+ model = None
121
+ for m in SUPPORTED_MODELS:
122
+ if m.id == self.model_id:
123
+ model = m
124
+ break
125
+
126
+ if not model:
127
+ log.error(f"Unknown model ID: {self.model_id}")
128
+ log.info("Available models:")
129
+ for m in SUPPORTED_MODELS:
130
+ log.info(f" - {m.id}: {m.name}")
131
+ return False
132
+
133
+ models_dir = get_models_dir()
134
+ self.model_path = models_dir / model.filename
135
+
136
+ if self.model_path.exists():
137
+ log.info(f"Model already downloaded: {self.model_path}")
138
+ return True
139
+
140
+ # Download
141
+ log.info(f"Downloading {model.name} ({model.size_gb:.1f} GB)...")
142
+ try:
143
+ await self._download_model(model, self.model_path)
144
+ log.info(f"Download complete: {self.model_path}")
145
+ return True
146
+ except Exception as e:
147
+ log.error(f"Download failed: {e}")
148
+ return False
149
+
150
+ log.error("No model specified. Use --model or --model-id")
151
+ return False
152
+
+     async def _download_model(self, model, dest: Path) -> None:
+         """Download a model file with progress."""
+         import httpx
+
+         dest.parent.mkdir(parents=True, exist_ok=True)
+
+         async with httpx.AsyncClient(follow_redirects=True, timeout=None) as client:
+             async with client.stream("GET", model.url) as response:
+                 response.raise_for_status()
+                 total = model.size_bytes or int(response.headers.get("content-length", 0))
+                 downloaded = 0
+                 last_log = 0
+
+                 with open(dest, "wb") as f:
+                     async for chunk in response.aiter_bytes(chunk_size=1024 * 1024):
+                         f.write(chunk)
+                         downloaded += len(chunk)
+
+                         # Log progress every 10%
+                         pct = int(downloaded / total * 100) if total else 0
+                         if pct >= last_log + 10:
+                             last_log = pct
+                             log.info(f"  {pct}% ({downloaded // 1024 // 1024} MB)")
+
+     async def _start_llama_server(self) -> bool:
+         """Start llama-server."""
+         from .llama_server import (
+             LlamaServerProcess,
+             is_llama_server_installed,
+             download_llama_server,
+             DownloadProgress,
+         )
+
+         # Download llama-server if needed
+         if not is_llama_server_installed():
+             log.info("Downloading llama-server...")
+             try:
+                 def on_progress(msg: str, progress: DownloadProgress | None):
+                     if progress and int(progress.percent) % 20 == 0:
+                         log.info(f"  {int(progress.percent)}%")
+                 await download_llama_server(on_progress)
+                 log.info("llama-server downloaded")
+             except Exception as e:
+                 log.error(f"Failed to download llama-server: {e}")
+                 return False
+
+         # Start server
+         log.info(f"Starting llama-server on port {self.port}...")
+         try:
+             self._llama = LlamaServerProcess(self.model_path, port=self.port)
+             self._llama.start()
+
+             # Wait for ready
+             ready = await self._llama.wait_for_ready(timeout=120.0)
+             if not ready:
+                 log.error("llama-server failed to start (timeout)")
+                 return False
+
+             log.info("llama-server ready")
+             return True
+
+         except Exception as e:
+             log.error(f"Failed to start llama-server: {e}")
+             return False
+
+     async def _connect_backend(self) -> bool:
+         """Connect to Loreguard backend."""
+         # Dev mode - skip backend connection
+         if self.token == "dev_mock_token":
+             log.info("DEV MODE: Skipping backend connection")
+             log.info(f"llama-server running at http://127.0.0.1:{self.port}")
+             log.info("You can send requests directly to the llama-server")
+             return True
+
+         from .tunnel import BackendTunnel
+         from .llm import LLMProxy
+
+         log.info(f"Connecting to {self.backend_url}...")
+
+         try:
+             # Extract worker ID from token (format: lg_worker_<id>_<secret>)
+             parts = self.token.split("_")
+             worker_id = parts[2] if len(parts) >= 3 else "worker"
+
+             llm_proxy = LLMProxy(f"http://127.0.0.1:{self.port}")
+
+             self._tunnel = BackendTunnel(
+                 backend_url=self.backend_url,
+                 llm_proxy=llm_proxy,
+                 worker_id=worker_id,
+                 worker_token=self.token,
+                 model_id=self.model_path.stem if self.model_path else "unknown",
+             )
+
+             self._tunnel.on_request_complete = self._on_request_complete
+
+             # Start connection (runs in background)
+             asyncio.create_task(self._tunnel.connect())
+
+             # Wait a bit for connection
+             await asyncio.sleep(2)
+             log.info("Backend connection established")
+             return True
+
+         except Exception as e:
+             log.error(f"Failed to connect to backend: {e}")
+             return False
+
+     def _on_request_complete(
+         self, npc: str, tokens: int, ttft_ms: float, total_ms: float
+     ) -> None:
+         """Called when a request completes."""
+         self._requests += 1
+         self._tokens += tokens
+         tps = (tokens / total_ms * 1000) if total_ms > 0 else 0
+
+         log.info(
+             f"Request #{self._requests}: {npc} | "
+             f"{tokens} tokens | {ttft_ms:.0f}ms TTFT | "
+             f"{total_ms/1000:.1f}s total | {tps:.1f} tk/s"
+         )
+
+     def _log_stats(self) -> None:
+         """Log periodic stats (every 60 seconds)."""
+         if not self._start_time:
+             return
+
+         elapsed = (datetime.now() - self._start_time).total_seconds()
+         if int(elapsed) % 60 == 0 and int(elapsed) > 0:
+             mins = int(elapsed // 60)
+             log.info(
+                 f"Stats: {mins}m uptime | "
+                 f"{self._requests} requests | "
+                 f"{self._tokens:,} tokens"
+             )
+
+     async def _shutdown(self) -> None:
+         """Graceful shutdown."""
+         if not self._running:
+             return
+
+         log.info("Shutting down...")
+         self._running = False
+
+     async def _cleanup(self) -> None:
+         """Cleanup resources."""
+         if self._tunnel:
+             try:
+                 await self._tunnel.disconnect()
+             except Exception:
+                 pass
+
+         if self._llama:
+             try:
+                 self._llama.stop()
+             except Exception:
+                 pass
+
+         log.info("Goodbye!")
+
+
+ def main():
+     """CLI entry point."""
+     parser = argparse.ArgumentParser(
+         description="Loreguard CLI - Local inference for game NPCs",
+         formatter_class=argparse.RawDescriptionHelpFormatter,
+         epilog="""
+ Examples:
+   loreguard-cli --token lg_worker_xxx --model ./model.gguf
+   loreguard-cli --token lg_worker_xxx --model-id qwen3-4b
+   LOREGUARD_TOKEN=lg_worker_xxx loreguard-cli --model-id qwen3-4b
+
+ Available model IDs:
+   qwen3-4b-instruct   Qwen3 4B Instruct (recommended, 2.8 GB)
+   llama-3.2-3b        Llama 3.2 3B Instruct (2.0 GB)
+   qwen3-8b            Qwen3 8B (5.2 GB)
+   meta-llama-3-8b     Meta Llama 3 8B (4.9 GB)
+ """,
+     )
+
+     parser.add_argument(
+         "--token",
+         default=os.getenv("LOREGUARD_TOKEN", ""),
+         help="Worker token (or set LOREGUARD_TOKEN env var)",
+     )
+     parser.add_argument(
+         "--model",
+         type=Path,
+         default=os.getenv("LOREGUARD_MODEL"),
+         help="Path to .gguf model file",
+     )
+     parser.add_argument(
+         "--model-id",
+         default=os.getenv("LOREGUARD_MODEL_ID"),
+         help="Model ID to download (e.g., qwen3-4b-instruct)",
+     )
+     parser.add_argument(
+         "--port",
+         type=int,
+         default=int(os.getenv("LOREGUARD_PORT", "8080")),
+         help="Local llama-server port (default: 8080)",
+     )
+     parser.add_argument(
+         "--backend",
+         default=os.getenv("LOREGUARD_BACKEND", "wss://api.loreguard.com/workers"),
+         help="Backend WebSocket URL",
+     )
+     parser.add_argument(
+         "-v", "--verbose",
+         action="store_true",
+         help="Enable debug logging",
+     )
+     parser.add_argument(
+         "--dev",
+         action="store_true",
+         help="Dev mode - skip backend connection, just run llama-server",
+     )
+
+     args = parser.parse_args()
+
+     if args.verbose:
+         logging.getLogger().setLevel(logging.DEBUG)
+
+     # Dev mode - skip token validation
+     if args.dev:
+         args.token = "dev_mock_token"
+         log.info("Running in DEV MODE - no backend connection")
+     else:
+         # Validate token
+         if not args.token:
+             log.error("Token required. Use --token or set LOREGUARD_TOKEN (or use --dev)")
+             sys.exit(1)
+
+         if not args.token.startswith("lg_worker_"):
+             log.error("Invalid token format (must start with lg_worker_)")
+             sys.exit(1)
+
+     # Validate model
+     if not args.model and not args.model_id:
+         log.error("Model required. Use --model or --model-id")
+         sys.exit(1)
+
+     # Run
+     cli = LoreguardCLI(
+         token=args.token,
+         model_path=args.model,
+         model_id=args.model_id,
+         port=args.port,
+         backend_url=args.backend,
+     )
+
+     exit_code = asyncio.run(cli.run())
+     sys.exit(exit_code)
+
+
+ if __name__ == "__main__":
+     main()