preppergpt 0.1.0 → 0.1.2

package/README.md CHANGED
@@ -1,32 +1,45 @@
1
1
  # PrepperGPT
2
2
 
3
- PrepperGPT packages a local-first ChatGPT-like experience for Linux machines.
4
- It uses upstream OpenWebUI for the app shell and adds a hardware detector,
5
- model planner, Docker Compose runtime, local sidecars, and a practical
6
- PrepperGPT field-kit theme.
3
+ PrepperGPT packages a local-first ChatGPT-like experience for post-apocalyptic
4
+ or long-duration outage scenarios where hosted AI services are unavailable. It
5
+ uses upstream OpenWebUI for the app shell and adds a hardware detector, model
6
+ planner, Docker Compose runtime, local sidecars, and a practical PrepperGPT
7
+ field-kit theme.
7
8
 
8
9
  The first release targets Linux with NVIDIA GPUs first, with CPU fallback where
9
10
  possible. It is an online installer: model and container downloads require a
10
11
  working network during setup.
11
12
 
13
+ PrepperGPT optimizes for survivability over cloud-like latency. On very large
14
+ local models, very low tokens/sec is acceptable because the alternative in the
15
+ target scenario is no assistant at all.
16
+
12
17
  ## Install
13
18
 
14
- Until the npm package is published, install from GitHub:
19
+ Install from npm:
15
20
 
16
21
  ```bash
17
- git clone https://github.com/teamslop/preppergpt.git
18
- cd preppergpt
19
- node bin/preppergpt.js install --profile balanced
20
- node bin/preppergpt.js start
22
+ npx preppergpt install --profile balanced
23
+ preppergpt start
21
24
  ```
22
25
 
23
- After npm publication:
26
+ Or install globally:
24
27
 
25
28
  ```bash
26
- npx preppergpt install --profile balanced
29
+ npm install -g preppergpt
30
+ preppergpt install --profile balanced
27
31
  preppergpt start
28
32
  ```
29
33
 
34
+ GitHub source install:
35
+
36
+ ```bash
37
+ git clone https://github.com/teamslop/preppergpt.git
38
+ cd preppergpt
39
+ node bin/preppergpt.js install --profile balanced
40
+ node bin/preppergpt.js start
41
+ ```
42
+
30
43
  Other profiles:
31
44
 
32
45
  ```bash
@@ -54,12 +67,14 @@ preppergpt stop
54
67
  preppergpt status
55
68
  preppergpt doctor
56
69
  preppergpt switch-profile --profile speed
70
+ preppergpt bundle whisper
57
71
  ```
58
72
 
59
73
  ## Profiles
60
74
 
61
75
  - `intelligence`: chooses the strongest local reasoning route that fits the
62
- machine, preferring GLM 5.2 Q4 and long-context coding routes when available.
76
+ machine, preferring GLM 5.2 Q8 on enterprise hardware, then GLM 5.2 Q4, then
77
+ long-context coding routes when available.
63
78
  - `speed`: chooses smaller GPU-friendly routes and makes low-latency chat the
64
79
  default.
65
80
  - `balanced`: uses the local auto-router as the default and keeps reasoning,
@@ -70,11 +85,20 @@ and route ordering into the generated compose override.
70
85
 
71
86
  ## Model Assets
72
87
 
73
- Some routes can be pulled by the runtime, while very large routes such as GLM
74
- 5.2 Q4 and Flux weights are marked as manual or external in
88
+ PrepperGPT installs a bundled local Whisper Base STT cache during
89
+ `preppergpt install`. It is stored under `~/.preppergpt/data/models/whisper/base`
90
+ by default and mounted into OpenWebUI, so speech-to-text works from local files
91
+ after setup.
92
+
93
+ Some other routes can be pulled by the runtime, while very large routes such as
94
+ GLM 5.2 Q8/Q4 and Flux weights are marked as manual or external in
75
95
  `profiles/models.json`. `preppergpt doctor` reports which selected routes still
76
96
  need local files or endpoints.
77
97
 
98
+ The GLM 5.2 Q8 route is intended for an enterprise/off-grid bunker-class host:
99
+ large RAM, fast NVMe, and patience for slow local generation when no hosted
100
+ service remains available.
101
+
78
102
  ## Publishing
79
103
 
80
104
  The package is designed to be published as:
@@ -84,7 +108,7 @@ npm publish --access public
84
108
  ```
85
109
 
86
110
  Publishing requires an authenticated npm account with permission to publish the
87
- currently unclaimed `preppergpt` package name.
111
+ `preppergpt` package.
88
112
 
89
113
  The source repository is expected at:
90
114
 
@@ -13,6 +13,7 @@ services:
13
13
  hard: 1048576
14
14
  volumes:
15
15
  - ${PREPPERGPT_DATA_DIR:?set PREPPERGPT_DATA_DIR}/openwebui:/app/backend/data
16
+ - ${PREPPERGPT_MODELS_DIR:?set PREPPERGPT_MODELS_DIR}:/models:ro
16
17
  - ../services/comfyui:/app/backend/data/parity-comfyui:ro
17
18
  - ../themes/preppergpt/static/favicon.svg:/app/backend/open_webui/static/favicon.svg:ro
18
19
  - ../themes/preppergpt/static/logo.svg:/app/backend/open_webui/static/logo.svg:ro
@@ -26,8 +27,8 @@ services:
26
27
  ENABLE_OLLAMA_API: "True"
27
28
  OLLAMA_BASE_URLS: "${OLLAMA_BASE_URL:-http://127.0.0.1:11434}"
28
29
  ENABLE_OPENAI_API: "True"
29
- OPENAI_API_BASE_URLS: "${SLOCODE_BASE_URL:-http://127.0.0.1:11438/v1};${GLM52_BASE_URL:-http://127.0.0.1:11441/v1};http://127.0.0.1:18041/v1;http://127.0.0.1:18043/v1;http://127.0.0.1:18044/v1"
30
- OPENAI_API_KEYS: "slopcode;glm52;deep-research;local-agent;local-vision"
30
+ OPENAI_API_BASE_URLS: "${GLM52_Q8_BASE_URL:-http://127.0.0.1:11446/v1};${SLOCODE_BASE_URL:-http://127.0.0.1:11438/v1};${GLM52_BASE_URL:-http://127.0.0.1:11441/v1};http://127.0.0.1:18041/v1;http://127.0.0.1:18043/v1;http://127.0.0.1:18044/v1"
31
+ OPENAI_API_KEYS: "glm52-q8;slopcode;glm52;deep-research;local-agent;local-vision"
31
32
  ENABLE_DIRECT_CONNECTIONS: "True"
32
33
  DEFAULT_MODELS: "${PREPPERGPT_DEFAULT_MODEL:-local-chatgpt-auto}"
33
34
  MODEL_ORDER_LIST: "${PREPPERGPT_MODEL_ORDER_LIST:-[\"local-chatgpt-auto\"]}"
@@ -69,9 +70,10 @@ services:
69
70
  CODE_INTERPRETER_JUPYTER_URL: "http://127.0.0.1:8888"
70
71
  CODE_INTERPRETER_JUPYTER_AUTH: "token"
71
72
  CODE_INTERPRETER_JUPYTER_AUTH_TOKEN: "${JUPYTER_TOKEN:?set JUPYTER_TOKEN}"
72
- WHISPER_MODEL: "large-v3"
73
+ WHISPER_MODEL: "${PREPPERGPT_WHISPER_MODEL_PATH:-/models/whisper/base}"
74
+ WHISPER_MODEL_DIR: "/models/whisper"
73
75
  WHISPER_COMPUTE_TYPE: "int8"
74
- WHISPER_MODEL_AUTO_UPDATE: "True"
76
+ WHISPER_MODEL_AUTO_UPDATE: "False"
75
77
  WHISPER_VAD_FILTER: "True"
76
78
  WHISPER_MULTILINGUAL: "True"
77
79
  ENABLE_IMAGE_GENERATION: "True"
@@ -147,8 +149,8 @@ services:
147
149
  DEEP_RESEARCH_PORT: "18041"
148
150
  DEEP_RESEARCH_PUBLIC_BASE_URL: "http://127.0.0.1:18041"
149
151
  DEEP_RESEARCH_MODEL_ID: "deep-research-glm52"
150
- DEEP_RESEARCH_MODEL: "${DEEP_RESEARCH_MODEL:-glm52-q4-local}"
151
- DEEP_RESEARCH_GLM_BASE_URL: "${GLM52_BASE_URL:-http://127.0.0.1:11441/v1}"
152
+ DEEP_RESEARCH_MODEL: "${PREPPERGPT_GLM_MODEL:-glm52-q4-local}"
153
+ DEEP_RESEARCH_GLM_BASE_URL: "${PREPPERGPT_GLM_BASE_URL:-http://127.0.0.1:11441/v1}"
152
154
  DEEP_RESEARCH_SEARXNG_URL: "http://127.0.0.1:18080/search"
153
155
  DEEP_RESEARCH_TIKA_URL: "http://127.0.0.1:9998/tika"
154
156
  DEEP_RESEARCH_LOCAL_APP_CONNECTOR_URL: "http://127.0.0.1:18042"
@@ -193,8 +195,8 @@ services:
193
195
  LOCAL_AGENT_PORT: "18043"
194
196
  LOCAL_AGENT_PUBLIC_BASE_URL: "http://127.0.0.1:18043"
195
197
  LOCAL_AGENT_MODEL_ID: "local-agent-glm52"
196
- LOCAL_AGENT_GLM_MODEL: "glm52-q4-local"
197
- LOCAL_AGENT_GLM_BASE_URL: "${GLM52_BASE_URL:-http://127.0.0.1:11441/v1}"
198
+ LOCAL_AGENT_GLM_MODEL: "${PREPPERGPT_GLM_MODEL:-glm52-q4-local}"
199
+ LOCAL_AGENT_GLM_BASE_URL: "${PREPPERGPT_GLM_BASE_URL:-http://127.0.0.1:11441/v1}"
198
200
  LOCAL_AGENT_AUTO_ROUTER_MODEL_ID: "local-auto-router"
199
201
  LOCAL_AGENT_AUTO_ROUTER_FAST_MODEL: "gemma4:12b-256k-gpu"
200
202
  LOCAL_AGENT_AUTO_ROUTER_FAST_BASE_URL: "http://127.0.0.1:11434/v1"
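
The compose changes above rely on two Docker Compose interpolation forms: `${VAR:-default}` falls back to a default when the variable is unset or empty, and `${VAR:?message}` aborts with an error in the same case. The following is a minimal sketch of those two rules in JavaScript, for reading the hunks only. The `interpolate` helper is hypothetical, is not part of the package, and does not handle nested expressions.

```javascript
// Hypothetical helper that emulates two Compose interpolation forms:
//   ${VAR:-default}  -> default when VAR is unset or empty
//   ${VAR:?message}  -> throw when VAR is unset or empty
// A plain ${VAR} that is unset resolves to an empty string.
function interpolate(template, env) {
  return template.replace(/\$\{(\w+)(?::([-?])([^}]*))?\}/g, (_, name, op, arg) => {
    const value = env[name];
    const isSet = value !== undefined && value !== "";
    if (isSet) return value;
    if (op === "-") return arg;
    if (op === "?") throw new Error(`${name}: ${arg}`);
    return "";
  });
}

// With no override, the Whisper path falls back to the bundled location.
console.log(interpolate("${PREPPERGPT_WHISPER_MODEL_PATH:-/models/whisper/base}", {}));

// The models volume requires PREPPERGPT_MODELS_DIR to be set.
console.log(
  interpolate("${PREPPERGPT_MODELS_DIR:?set PREPPERGPT_MODELS_DIR}:/models:ro", {
    PREPPERGPT_MODELS_DIR: "/data/models"
  })
);
```

Under these rules the new `/models` mount fails fast when `PREPPERGPT_MODELS_DIR` is missing, while the Whisper model path still resolves without extra configuration.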
@@ -0,0 +1,30 @@
1
+ # Bundles
2
+
3
+ PrepperGPT keeps npm lightweight but installs small always-on local assets
4
+ during setup.
5
+
6
+ ## Whisper Base
7
+
8
+ `preppergpt install` downloads the MIT-licensed `Systran/faster-whisper-base`
9
+ CTranslate2 model into:
10
+
11
+ ```text
12
+ ~/.preppergpt/data/models/whisper/base
13
+ ```
14
+
15
+ OpenWebUI receives:
16
+
17
+ ```text
18
+ WHISPER_MODEL=/models/whisper/base
19
+ WHISPER_MODEL_DIR=/models/whisper
20
+ WHISPER_MODEL_AUTO_UPDATE=False
21
+ ```
22
+
23
+ To repair or refresh the bundle:
24
+
25
+ ```bash
26
+ preppergpt bundle whisper
27
+ preppergpt bundle whisper --force
28
+ ```
29
+
30
+ Source: https://huggingface.co/Systran/faster-whisper-base
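
As a rough illustration of what "repair or refresh" means for this bundle, the sketch below lists which required files are absent and builds the download URL for each. The file list, repository, revision, and URL pattern are copied from the `WHISPER_BUNDLE` constant and `ensureWhisperBundle` shown later in this diff. The helper names `missingBundleFiles` and `bundleFileUrl` are hypothetical and exist only for this example.

```javascript
// Values copied from WHISPER_BUNDLE in this diff.
const REQUIRED_FILES = ["config.json", "model.bin", "tokenizer.json", "vocabulary.txt", "README.md"];
const REPO = "Systran/faster-whisper-base";
const REVISION = "main";

// Hypothetical helper: which required files are absent from a directory listing?
function missingBundleFiles(presentFiles) {
  const present = new Set(presentFiles);
  return REQUIRED_FILES.filter((file) => !present.has(file));
}

// Hypothetical helper: the URL the installer requests for one bundle file.
function bundleFileUrl(file) {
  return `https://huggingface.co/${REPO}/resolve/${REVISION}/${file}`;
}

// Example: a partially downloaded bundle with only two files on disk.
const stillNeeded = missingBundleFiles(["config.json", "model.bin"]);
for (const file of stillNeeded) {
  console.log(bundleFileUrl(file));
}
```

Without `--force`, only the missing files would be fetched. With `--force`, every file in the list is downloaded again.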
package/docs/hardware.md CHANGED
@@ -1,7 +1,9 @@
1
1
  # Hardware Guide
2
2
 
3
3
  PrepperGPT works best on Linux with an NVIDIA GPU and enough NVMe space for
4
- model weights.
4
+ model weights. It is designed for post-apocalyptic or long-duration outage
5
+ scenarios, so the high-end GLM tiers deliberately favor local availability and
6
+ answer quality over hosted-service latency.
5
7
 
6
8
  Recommended starting points:
7
9
 
@@ -9,7 +11,22 @@ Recommended starting points:
9
11
  - Balanced profile: 32-64 GB RAM, 12-24 GB VRAM, 120 GB free disk.
10
12
  - Intelligence profile: 96 GB RAM or more, fast NVMe, and hundreds of GB free
11
13
  for GLM 5.2 Q4 or similar large weights.
14
+ - Enterprise 8-bit GLM tier: 256 GB RAM or more, 48-80 GB VRAM preferred,
15
+ and 1.5-2 TB of fast NVMe for GLM 5.2 Q8 plus working/cache room.
12
16
 
13
17
  The installer reserves about 15-20% VRAM headroom when deciding whether a model
14
18
  fits. If a large manual model is selected, `preppergpt doctor` explains the
15
19
  endpoint or file path that must be provided.
20
+
21
+ Very low tokens/sec is acceptable for the GLM 5.2 Q8 tier because that tier is
22
+ for situations where there is no cloud model to fall back to.
23
+
24
+ ## Hardware Matrix
25
+
26
+ | Tier | Typical specs | PrepperGPT routes |
27
+ | --- | --- | --- |
28
+ | Basic CPU laptop | 16 GB RAM, no GPU, 80 GB disk | `local-chatgpt-auto`, `llama3.1:8b`, `local-vision-moondream2`, bundled Whisper |
29
+ | Mid NVIDIA | 64 GB RAM, 12 GB usable VRAM, 250 GB disk | Gemma fast lane, Qwen coder fallback, local vision, bundled Whisper |
30
+ | High NVIDIA | 128 GB RAM, 24 GB VRAM, 750 GB NVMe | GLM 5.2 Q4 configured, Slopcode/Qwen configured, Gemma fast lane, Flux configured |
31
+ | Full PrepperGPT rig | 128+ GB RAM, 24+ GB VRAM, 1 TB NVMe, GLM/Slopcode/Flux files present | GLM 5.2 Q4 primary, Slopcode coding, Gemma fast lane, Deep Research, Agent, Vision, Flux, Whisper |
32
+ | Enterprise 8-bit GLM rig | 256+ GB RAM, 48-80+ GB VRAM preferred, 1.5-2 TB fast NVMe | `glm52-q8-local` primary for Max Intelligence, `glm52-q4-local` fallback, Slopcode/Qwen coding, Gemma fast lane, full sidecar stack |
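
One way to read the matrix is as a set of lower bounds checked from the largest tier down. The sketch below does that under stated assumptions: each row's figures are treated as minimums, the enterprise VRAM figure is treated as optional because the table marks it "preferred", and the "Full PrepperGPT rig" row is left out because it depends on model files being present rather than on hardware alone. The `hardwareTier` function is hypothetical and is not the installer's detection logic.

```javascript
// Hypothetical classifier that treats the matrix rows as minimum requirements.
// Tiers are checked from largest to smallest so the best match wins.
function hardwareTier({ ramGb, vramGb = 0, diskGb }) {
  if (ramGb >= 256 && diskGb >= 1500) return "Enterprise 8-bit GLM rig";
  if (ramGb >= 128 && vramGb >= 24 && diskGb >= 750) return "High NVIDIA";
  if (ramGb >= 64 && vramGb >= 12 && diskGb >= 250) return "Mid NVIDIA";
  if (ramGb >= 16 && diskGb >= 80) return "Basic CPU laptop";
  return "below the documented tiers";
}

console.log(hardwareTier({ ramGb: 32, vramGb: 0, diskGb: 100 }));
console.log(hardwareTier({ ramGb: 128, vramGb: 24, diskGb: 1000 }));
```

A machine that meets one tier's RAM but not its disk or VRAM figure drops to the next tier down in this sketch, which may be stricter than the actual planner.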
@@ -3,9 +3,10 @@
3
3
  PrepperGPT separates routing from model licensing and distribution.
4
4
 
5
5
  - Ollama models are pulled by the local Ollama runtime when available.
6
- - OpenWebUI STT models are downloaded by OpenWebUI/faster-whisper.
6
+ - Whisper Base STT is installer-cached from `Systran/faster-whisper-base`
7
+ under the local PrepperGPT model directory and mounted into OpenWebUI.
7
8
  - Hugging Face vision models are downloaded by the local vision sidecar.
8
- - Very large GLM, Slopcode, and Flux assets are marked as manual or external
9
+ - Very large GLM Q8/Q4, Slopcode, and Flux assets are marked as manual or external
9
10
  until a license-compatible public download source is configured.
10
11
 
11
12
  Manual routes are still added to OpenWebUI. They become live when their local
@@ -1,10 +1,12 @@
1
1
  # PrepperGPT Local Parity Map
2
2
 
3
- PrepperGPT packages the local ChatGPT-like stack around OpenWebUI:
3
+ PrepperGPT packages the local ChatGPT-like stack around OpenWebUI for resilient
4
+ local use when hosted AI services are unavailable:
4
5
 
5
6
  - OpenWebUI UI at `http://127.0.0.1:8080`
6
7
  - Ollama fast local models at `http://127.0.0.1:11434`
7
- - Optional GLM 5.2 route at `http://127.0.0.1:11441/v1`
8
+ - Optional GLM 5.2 Q8 route at `http://127.0.0.1:11446/v1`
9
+ - Optional GLM 5.2 Q4 route at `http://127.0.0.1:11441/v1`
8
10
  - Optional Slopcode/Qwen route at `http://127.0.0.1:11438/v1`
9
11
  - Deep research sidecar at `http://127.0.0.1:18041/v1`
10
12
  - Local scheduler connector at `http://127.0.0.1:18042`
package/installer/cli.mjs CHANGED
@@ -1,12 +1,13 @@
1
1
  import fs from "node:fs";
2
2
  import http from "node:http";
3
+ import { ensureWhisperBundle, modelDirs, whisperBundleStatus } from "./lib/bundles.mjs";
3
4
  import { detectMachine } from "./lib/detect.mjs";
4
5
  import { buildPlan, normalizeProfile } from "./lib/planner.mjs";
5
6
  import { packageRoot, runtimePaths } from "./lib/paths.mjs";
6
7
  import { renderInstall } from "./lib/render.mjs";
7
8
  import { commandResult, parseArgs, readJson, shellQuote } from "./lib/util.mjs";
8
9
 
9
- const VERSION = "0.1.0";
10
+ const VERSION = "0.1.2";
10
11
 
11
12
  function usage() {
12
13
  return `PrepperGPT ${VERSION}
@@ -14,11 +15,12 @@ function usage() {
14
15
  Usage:
15
16
  preppergpt detect [--json]
16
17
  preppergpt plan --profile balanced|intelligence|speed [--json]
17
- preppergpt install --profile balanced|intelligence|speed [--dry-run] [--home PATH]
18
+ preppergpt install --profile balanced|intelligence|speed [--dry-run] [--skip-bundles] [--home PATH]
18
19
  preppergpt start [--home PATH]
19
20
  preppergpt stop [--home PATH]
20
21
  preppergpt status [--home PATH] [--json]
21
22
  preppergpt doctor [--profile balanced|intelligence|speed] [--home PATH]
23
+ preppergpt bundle whisper [--home PATH] [--force]
22
24
  preppergpt switch-profile --profile balanced|intelligence|speed [--home PATH]
23
25
  preppergpt version
24
26
  `;
@@ -122,6 +124,11 @@ async function commandInstall(flags) {
122
124
  return;
123
125
  }
124
126
  const paths = renderInstall(plan, detection, { home });
127
+ if (!flags.skip_bundles) {
128
+ console.log("Installing bundled Whisper base STT model...");
129
+ const bundle = await ensureWhisperBundle(paths.whisperHostDir, { force: Boolean(flags.force_bundle) });
130
+ console.log(`Whisper bundle: ${bundle.ready ? "ready" : "not ready"} at ${paths.whisperHostDir}`);
131
+ }
125
132
  console.log(`Wrote ${paths.envFile}`);
126
133
  console.log(`Wrote ${paths.generatedCompose}`);
127
134
  console.log(`Wrote ${paths.modelPlan}`);
@@ -187,6 +194,7 @@ async function commandStatus(flags) {
187
194
  }
188
195
 
189
196
  async function commandDoctor(flags) {
197
+ const paths = runtimePaths(flags.home);
190
198
  const detection = await detectMachine();
191
199
  const plan = buildPlan(detection, profileFrom(flags));
192
200
  printPlan(plan);
@@ -200,6 +208,28 @@ async function commandDoctor(flags) {
200
208
  console.log(` port ${port}: occupied`);
201
209
  }
202
210
  }
211
+ const dirs = modelDirs(paths);
212
+ const whisper = whisperBundleStatus(dirs.whisperHostDir);
213
+ console.log(` whisper-base bundle: ${whisper.ready ? "ok" : `missing ${whisper.missing.length} files`} (${dirs.whisperHostDir})`);
214
+ }
215
+
216
+ async function commandBundle(flags, positional) {
217
+ const name = positional[1] || "whisper";
218
+ if (!["whisper", "whisper-base"].includes(name)) {
219
+ throw new Error(`Unknown bundle: ${name}`);
220
+ }
221
+ const paths = runtimePaths(flags.home);
222
+ const dirs = modelDirs(paths);
223
+ const bundle = await ensureWhisperBundle(dirs.whisperHostDir, {
224
+ force: Boolean(flags.force),
225
+ dryRun: Boolean(flags.dry_run)
226
+ });
227
+ console.log(`Whisper bundle ${bundle.ready ? "ready" : "not ready"} at ${dirs.whisperHostDir}`);
228
+ if (bundle.missing?.length) {
229
+ for (const file of bundle.missing) {
230
+ console.log(` missing ${file}`);
231
+ }
232
+ }
203
233
  }
204
234
 
205
235
  export async function runCli(argv) {
@@ -220,6 +250,7 @@ export async function runCli(argv) {
220
250
  if (command === "stop") return commandStop(flags);
221
251
  if (command === "status") return commandStatus(flags);
222
252
  if (command === "doctor") return commandDoctor(flags);
253
+ if (command === "bundle") return commandBundle(flags, positional);
223
254
  if (command === "switch-profile") return commandSwitchProfile(flags);
224
255
  throw new Error(`Unknown command: ${command}\n\n${usage()}`);
225
256
  }
@@ -0,0 +1,107 @@
1
+ import fs from "node:fs";
2
+ import path from "node:path";
3
+ import { Readable } from "node:stream";
4
+ import { pipeline } from "node:stream/promises";
5
+ import { readJson, writeJson } from "./util.mjs";
6
+
7
+ export const WHISPER_BUNDLE = {
8
+ id: "whisper-base",
9
+ name: "Whisper Base STT Bundle",
10
+ repo: "Systran/faster-whisper-base",
11
+ revision: "main",
12
+ license: "MIT",
13
+ modelPathInContainer: "/models/whisper/base",
14
+ files: ["config.json", "model.bin", "tokenizer.json", "vocabulary.txt", "README.md"],
15
+ description: "CTranslate2 faster-whisper conversion of openai/whisper-base for local OpenWebUI STT."
16
+ };
17
+
18
+ function parseEnvFile(file) {
19
+ if (!fs.existsSync(file)) {
20
+ return {};
21
+ }
22
+ const entries = {};
23
+ for (const line of fs.readFileSync(file, "utf8").split(/\r?\n/)) {
24
+ if (!line || line.trim().startsWith("#") || !line.includes("=")) {
25
+ continue;
26
+ }
27
+ const [key, ...valueParts] = line.split("=");
28
+ let value = valueParts.join("=");
29
+ if ((value.startsWith('"') && value.endsWith('"')) || (value.startsWith("'") && value.endsWith("'"))) {
30
+ value = value.slice(1, -1);
31
+ }
32
+ entries[key] = value;
33
+ }
34
+ return entries;
35
+ }
36
+
37
+ export function modelDirs(paths) {
38
+ const env = parseEnvFile(paths.envFile);
39
+ const modelsDir = process.env.PREPPERGPT_MODELS_DIR || env.PREPPERGPT_MODELS_DIR || path.join(paths.dataDir, "models");
40
+ const whisperHostDir = path.join(modelsDir, "whisper", "base");
41
+ return { modelsDir, whisperHostDir };
42
+ }
43
+
44
+ export function whisperBundleStatus(targetDir) {
45
+ const files = WHISPER_BUNDLE.files.map((file) => path.join(targetDir, file));
46
+ const missing = files.filter((file) => !fs.existsSync(file));
47
+ let manifest = null;
48
+ const manifestPath = path.join(targetDir, "preppergpt-bundle.json");
49
+ if (fs.existsSync(manifestPath)) {
50
+ try {
51
+ manifest = readJson(manifestPath);
52
+ } catch {
53
+ manifest = null;
54
+ }
55
+ }
56
+ return {
57
+ id: WHISPER_BUNDLE.id,
58
+ targetDir,
59
+ ready: missing.length === 0,
60
+ missing,
61
+ manifest
62
+ };
63
+ }
64
+
65
+ async function downloadFile(url, targetFile) {
66
+ const response = await fetch(url, {
67
+ headers: {
68
+ "User-Agent": "preppergpt/0.1"
69
+ },
70
+ redirect: "follow"
71
+ });
72
+ if (!response.ok || !response.body) {
73
+ throw new Error(`Failed to download ${url}: HTTP ${response.status}`);
74
+ }
75
+ fs.mkdirSync(path.dirname(targetFile), { recursive: true });
76
+ const tmp = `${targetFile}.tmp-${process.pid}`;
77
+ await pipeline(Readable.fromWeb(response.body), fs.createWriteStream(tmp));
78
+ fs.renameSync(tmp, targetFile);
79
+ }
80
+
81
+ export async function ensureWhisperBundle(targetDir, options = {}) {
82
+ const status = whisperBundleStatus(targetDir);
83
+ if (status.ready && !options.force) {
84
+ return { ...status, changed: false };
85
+ }
86
+ fs.mkdirSync(targetDir, { recursive: true });
87
+ if (options.dryRun) {
88
+ return { ...status, changed: false, dryRun: true };
89
+ }
90
+ for (const file of WHISPER_BUNDLE.files) {
91
+ const targetFile = path.join(targetDir, file);
92
+ if (fs.existsSync(targetFile) && !options.force) {
93
+ continue;
94
+ }
95
+ const url = `https://huggingface.co/${WHISPER_BUNDLE.repo}/resolve/${WHISPER_BUNDLE.revision}/${file}`;
96
+ if (!options.quiet) {
97
+ console.log(`Downloading ${WHISPER_BUNDLE.repo}/${file}`);
98
+ }
99
+ await downloadFile(url, targetFile);
100
+ }
101
+ writeJson(path.join(targetDir, "preppergpt-bundle.json"), {
102
+ ...WHISPER_BUNDLE,
103
+ installedAt: new Date().toISOString(),
104
+ source: `https://huggingface.co/${WHISPER_BUNDLE.repo}`
105
+ });
106
+ return { ...whisperBundleStatus(targetDir), changed: true };
107
+ }
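
The env-file parsing and directory precedence in this new file can be exercised without touching the filesystem. The sketch below restates the rules of `parseEnvFile` and `modelDirs` as pure functions. `parseEnvText` and `resolveModelsDir` are hypothetical names for this example, and the default path is built with string concatenation rather than `path.join` to keep the sketch import-free.

```javascript
// Same line rules as parseEnvFile, but takes the file text instead of a path.
function parseEnvText(text) {
  const entries = {};
  for (const line of text.split(/\r?\n/)) {
    // Skip blank lines, comments, and lines without an assignment.
    if (!line || line.trim().startsWith("#") || !line.includes("=")) continue;
    // Split on the first "=" only, so values may contain "=" themselves.
    const [key, ...valueParts] = line.split("=");
    let value = valueParts.join("=");
    const quoted =
      (value.startsWith('"') && value.endsWith('"')) ||
      (value.startsWith("'") && value.endsWith("'"));
    if (quoted) value = value.slice(1, -1);
    entries[key] = value;
  }
  return entries;
}

// Precedence used by modelDirs: process environment, then env file, then default.
function resolveModelsDir(processEnv, fileEnv, dataDir) {
  return processEnv.PREPPERGPT_MODELS_DIR || fileEnv.PREPPERGPT_MODELS_DIR || `${dataDir}/models`;
}

const sampleEnv = parseEnvText(
  '# generated\nPREPPERGPT_MODELS_DIR="/mnt/nvme/models"\nSEARCH_URL=http://127.0.0.1/?q=a=b\n'
);
console.log(resolveModelsDir({}, sampleEnv, "/home/user/.preppergpt/data"));
```

Keys are not trimmed by this parser, so a line such as ` KEY=value` with leading whitespace would be stored under a key that includes the space.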
@@ -62,7 +62,9 @@ function chooseFirst(candidates, models, detection) {
62
62
  continue;
63
63
  }
64
64
  const failures = requirementFailures(model, detection);
65
- if (failures.length === 0 || model.source?.type === "manual" || model.source?.type === "external") {
65
+ const canUseExternalFallback =
66
+ ["manual", "external"].includes(model.source?.type) && !model.source?.requiresHardwareFit;
67
+ if (failures.length === 0 || canUseExternalFallback) {
66
68
  return { model, skipped };
67
69
  }
68
70
  skipped.push({ id, reasons: failures });
@@ -91,8 +93,9 @@ export function buildPlan(detection, requestedProfile = "balanced", catalog = lo
91
93
  }
92
94
  }
93
95
 
96
+ const defaultModel = selected.chat?.id || priorities.defaultModel;
94
97
  const routeIds = unique([
95
- priorities.defaultModel,
98
+ defaultModel,
96
99
  selected.chat?.id,
97
100
  selected.fast?.id,
98
101
  selected.reasoning?.id,
@@ -136,7 +139,7 @@ export function buildPlan(detection, requestedProfile = "balanced", catalog = lo
136
139
  generatedAt: new Date().toISOString(),
137
140
  profile,
138
141
  profileLabel: priorities.label,
139
- defaultModel: priorities.defaultModel,
142
+ defaultModel,
140
143
  routeIds,
141
144
  selected,
142
145
  skipped,
@@ -144,7 +147,7 @@ export function buildPlan(detection, requestedProfile = "balanced", catalog = lo
144
147
  estimates: estimatePlan(profile, selected),
145
148
  env: {
146
149
  PREPPERGPT_PROFILE: profile,
147
- PREPPERGPT_DEFAULT_MODEL: priorities.defaultModel,
150
+ PREPPERGPT_DEFAULT_MODEL: defaultModel,
148
151
  PREPPERGPT_MODEL_ORDER_LIST: JSON.stringify(routeIds)
149
152
  },
150
153
  warnings
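
The `chooseFirst` change in this file narrows when a manual or external route may be selected despite failing hardware checks. The sketch below isolates that rule. The `canSelect` and `chooseFirstSketch` names, the `example-manual-route` id, and the failure strings are hypothetical, and the real function takes different arguments. The `glm52-q8-local` source fields match the `models.json` entry in this diff.

```javascript
// The acceptance rule from the updated chooseFirst, isolated as a predicate.
function canSelect(model, failures) {
  const canUseExternalFallback =
    ["manual", "external"].includes(model.source?.type) && !model.source?.requiresHardwareFit;
  return failures.length === 0 || canUseExternalFallback;
}

// Hypothetical walk over pre-evaluated candidates, in priority order.
function chooseFirstSketch(candidates) {
  const skipped = [];
  for (const { model, failures } of candidates) {
    if (canSelect(model, failures)) return { model, skipped };
    skipped.push({ id: model.id, reasons: failures });
  }
  return { model: null, skipped };
}

const choice = chooseFirstSketch([
  {
    model: { id: "glm52-q8-local", source: { type: "external", requiresHardwareFit: true } },
    failures: ["needs 192 GB RAM"]
  },
  {
    model: { id: "example-manual-route", source: { type: "manual" } },
    failures: ["needs 96 GB RAM"]
  }
]);
console.log(choice.model.id, choice.skipped);
```

In this sketch the Q8 route is skipped on an undersized host because it sets `requiresHardwareFit`, while a manual route without that flag is still accepted even though it reports a failure.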
@@ -1,5 +1,7 @@
1
1
  import crypto from "node:crypto";
2
2
  import fs from "node:fs";
3
+ import path from "node:path";
4
+ import { WHISPER_BUNDLE } from "./bundles.mjs";
3
5
  import { packagedPath, runtimePaths } from "./paths.mjs";
4
6
  import { envQuote, writeJson, writeText } from "./util.mjs";
5
7
 
@@ -10,6 +12,12 @@ function secret(bytes = 24) {
10
12
  function envFile(plan, paths, detection) {
11
13
  const dataDir = process.env.PREPPERGPT_DATA_DIR || paths.dataDir;
12
14
  const modelsDir = process.env.PREPPERGPT_MODELS_DIR || `${dataDir}/models`;
15
+ const whisperHostDir = path.join(modelsDir, "whisper", "base");
16
+ const selectedReasoningModel = plan.selected?.reasoning?.id || "glm52-q4-local";
17
+ const selectedGlmBaseUrl =
18
+ selectedReasoningModel === "glm52-q8-local"
19
+ ? process.env.GLM52_Q8_BASE_URL || "http://127.0.0.1:11446/v1"
20
+ : process.env.GLM52_BASE_URL || "http://127.0.0.1:11441/v1";
13
21
  const adminPassword = process.env.PREPPERGPT_ADMIN_PASSWORD || secret(18);
14
22
  const jupyterToken = process.env.JUPYTER_TOKEN || secret(18);
15
23
  const searxngSecret = process.env.SEARXNG_SECRET_KEY || secret(24);
@@ -17,9 +25,14 @@ function envFile(plan, paths, detection) {
17
25
  PREPPERGPT_PROFILE: plan.profile,
18
26
  PREPPERGPT_DATA_DIR: dataDir,
19
27
  PREPPERGPT_MODELS_DIR: modelsDir,
28
+ PREPPERGPT_WHISPER_HOST_DIR: whisperHostDir,
29
+ PREPPERGPT_WHISPER_MODEL: WHISPER_BUNDLE.id,
30
+ PREPPERGPT_WHISPER_MODEL_PATH: WHISPER_BUNDLE.modelPathInContainer,
20
31
  PREPPERGPT_PORT: process.env.PREPPERGPT_PORT || "8080",
21
32
  PREPPERGPT_DEFAULT_MODEL: plan.defaultModel,
22
33
  PREPPERGPT_MODEL_ORDER_LIST: JSON.stringify(plan.routeIds),
34
+ PREPPERGPT_GLM_MODEL: selectedReasoningModel,
35
+ PREPPERGPT_GLM_BASE_URL: selectedGlmBaseUrl,
23
36
  PREPPERGPT_DOCKER_GPUS: detection.gpus?.length ? "all" : "",
24
37
  WEBUI_NAME: "PrepperGPT",
25
38
  WEBUI_ADMIN_EMAIL: process.env.WEBUI_ADMIN_EMAIL || "admin@preppergpt.local",
@@ -29,6 +42,7 @@ function envFile(plan, paths, detection) {
29
42
  JUPYTER_TOKEN: jupyterToken,
30
43
  SEARXNG_SECRET_KEY: searxngSecret,
31
44
  GLM52_BASE_URL: process.env.GLM52_BASE_URL || "http://127.0.0.1:11441/v1",
45
+ GLM52_Q8_BASE_URL: process.env.GLM52_Q8_BASE_URL || "http://127.0.0.1:11446/v1",
32
46
  SLOCODE_BASE_URL: process.env.SLOCODE_BASE_URL || "http://127.0.0.1:11438/v1",
33
47
  OLLAMA_BASE_URL: process.env.OLLAMA_BASE_URL || "http://127.0.0.1:11434"
34
48
  };
@@ -60,17 +74,23 @@ function generatedCompose(plan, detection) {
60
74
 
61
75
  export function renderInstall(plan, detection, options = {}) {
62
76
  const paths = runtimePaths(options.home);
77
+ const dataDir = process.env.PREPPERGPT_DATA_DIR || paths.dataDir;
78
+ const modelsDir = process.env.PREPPERGPT_MODELS_DIR || `${dataDir}/models`;
79
+ const whisperHostDir = path.join(modelsDir, "whisper", "base");
63
80
  fs.mkdirSync(paths.root, { recursive: true });
64
81
  fs.mkdirSync(paths.dataDir, { recursive: true });
65
82
  fs.mkdirSync(paths.composeDir, { recursive: true });
66
83
  fs.mkdirSync(`${paths.dataDir}/preppergpt`, { recursive: true });
67
- fs.mkdirSync(`${paths.dataDir}/models`, { recursive: true });
84
+ fs.mkdirSync(modelsDir, { recursive: true });
85
+ fs.mkdirSync(whisperHostDir, { recursive: true });
68
86
  writeText(paths.envFile, envFile(plan, paths, detection), 0o600);
69
87
  writeText(paths.generatedCompose, generatedCompose(plan, detection));
70
88
  writeJson(paths.modelPlan, plan);
71
89
  writeJson(paths.detectReport, detection);
72
90
  return {
73
91
  ...paths,
92
+ modelsDir,
93
+ whisperHostDir,
74
94
  packageCompose: packagedPath("compose", "preppergpt.yaml")
75
95
  };
76
96
  }
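
The pairing of the selected reasoning model with a GLM endpoint in `envFile` can be summarised as follows. This is a sketch of the logic shown in the hunks above, with the environment passed in as a plain object instead of read from `process.env`. The `selectGlmRoute` name is hypothetical.

```javascript
// Mirrors the selection in envFile: the Q8 model uses the Q8 endpoint,
// and every other reasoning id uses the Q4 endpoint.
function selectGlmRoute(plan, env = {}) {
  const model = plan.selected?.reasoning?.id || "glm52-q4-local";
  const baseUrl =
    model === "glm52-q8-local"
      ? env.GLM52_Q8_BASE_URL || "http://127.0.0.1:11446/v1"
      : env.GLM52_BASE_URL || "http://127.0.0.1:11441/v1";
  return { PREPPERGPT_GLM_MODEL: model, PREPPERGPT_GLM_BASE_URL: baseUrl };
}

console.log(selectGlmRoute({ selected: { reasoning: { id: "glm52-q8-local" } } }));
console.log(selectGlmRoute({ selected: {} }));
```

Because the branch tests only for the Q8 id, a plan whose reasoning role resolves to a non-GLM model still receives the Q4 base URL in `PREPPERGPT_GLM_BASE_URL`.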
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "preppergpt",
3
- "version": "0.1.0",
4
- "description": "A local-first ChatGPT-like field kit built on OpenWebUI and local models.",
3
+ "version": "0.1.2",
4
+ "description": "A post-apocalyptic local AI field kit for running a ChatGPT-like experience when hosted services are unavailable.",
5
5
  "type": "module",
6
6
  "bin": {
7
7
  "preppergpt": "bin/preppergpt.js"
@@ -19,7 +19,7 @@
19
19
  ],
20
20
  "scripts": {
21
21
  "test": "node --test",
22
- "check": "npm run test && node bin/preppergpt.js plan --profile balanced --json >/dev/null && node bin/preppergpt.js install --profile balanced --dry-run",
22
+ "check": "npm run test && node bin/preppergpt.js plan --profile balanced --json >/dev/null && node bin/preppergpt.js install --profile balanced --dry-run && node bin/preppergpt.js bundle whisper --dry-run >/dev/null",
23
23
  "pack:dry-run": "npm pack --dry-run"
24
24
  },
25
25
  "keywords": [
@@ -28,7 +28,9 @@
28
28
  "llm",
29
29
  "ollama",
30
30
  "preppergpt",
31
- "offline-ai"
31
+ "offline-ai",
32
+ "survival",
33
+ "post-apocalyptic"
32
34
  ],
33
35
  "homepage": "https://github.com/teamslop/preppergpt#readme",
34
36
  "bugs": {
@@ -5,15 +5,15 @@
5
5
  "label": "Max intelligence",
6
6
  "defaultModel": "glm52-q4-local",
7
7
  "roles": {
8
- "chat": ["glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b", "llama3.1:8b"],
9
- "reasoning": ["glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
8
+ "chat": ["glm52-q8-local", "glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b", "llama3.1:8b"],
9
+ "reasoning": ["glm52-q8-local", "glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
10
10
  "fast": ["gemma4:12b-256k-gpu", "llama3.1:8b"],
11
11
  "coding": ["qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
12
12
  "research": ["deep-research-glm52"],
13
13
  "agent": ["local-agent-glm52"],
14
14
  "vision": ["local-vision-gemma4-12b", "local-vision-moondream2"],
15
15
  "image": ["flux-2-klein-9b-fp8"],
16
- "stt": ["whisper-large-v3"]
16
+ "stt": ["whisper-base-bundled"]
17
17
  }
18
18
  },
19
19
  "balanced": {
@@ -21,14 +21,14 @@
21
21
  "defaultModel": "local-chatgpt-auto",
22
22
  "roles": {
23
23
  "chat": ["local-chatgpt-auto", "gemma4:12b-256k-gpu", "glm52-q4-local", "llama3.1:8b"],
24
- "reasoning": ["glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
24
+ "reasoning": ["glm52-q8-local", "glm52-q4-local", "qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
25
25
  "fast": ["gemma4:12b-256k-gpu", "llama3.1:8b"],
26
26
  "coding": ["qwen3.6-35b-a3b:slopcode-cpu-64k", "qwen2.5-coder:14b"],
27
27
  "research": ["deep-research-glm52"],
28
28
  "agent": ["local-agent-glm52"],
29
29
  "vision": ["local-vision-gemma4-12b", "local-vision-moondream2"],
30
30
  "image": ["flux-2-klein-9b-fp8"],
31
- "stt": ["whisper-large-v3"]
31
+ "stt": ["whisper-base-bundled"]
32
32
  }
33
33
  },
34
34
  "speed": {
@@ -43,7 +43,7 @@
43
43
  "agent": ["local-agent-glm52"],
44
44
  "vision": ["local-vision-moondream2", "local-vision-gemma4-12b"],
45
45
  "image": ["flux-2-klein-9b-fp8"],
46
- "stt": ["whisper-large-v3"]
46
+ "stt": ["whisper-base-bundled"]
47
47
  }
48
48
  }
49
49
  },
@@ -66,6 +66,26 @@
66
66
  "description": "Virtual OpenAI-compatible route exposed by the local-agent sidecar."
67
67
  }
68
68
  },
69
+ {
70
+ "id": "glm52-q8-local",
71
+ "name": "GLM 5.2 Q8 Local",
72
+ "roles": ["chat", "reasoning"],
73
+ "backend": "llama.cpp",
74
+ "contextTokens": 65536,
75
+ "qualityScore": 104,
76
+ "speedScore": 15,
77
+ "tpsEstimate": "very low tokens/sec may be acceptable in disaster/off-grid use because no hosted model is available; benchmark locally",
78
+ "requires": {
79
+ "minRamGb": 192,
80
+ "diskGb": 1000,
81
+ "nvme": true
82
+ },
83
+ "source": {
84
+ "type": "external",
85
+ "requiresHardwareFit": true,
86
+ "description": "Run a GLM 5.2 Q8 OpenAI-compatible llama.cpp server at http://127.0.0.1:11446/v1 with weights on fast NVMe for maximum local quality when hosted services are unavailable."
87
+ }
88
+ },
69
89
  {
70
90
  "id": "glm52-q4-local",
71
91
  "name": "GLM 5.2 Q4 Local",
@@ -74,7 +94,7 @@
74
94
  "contextTokens": 65536,
75
95
  "qualityScore": 100,
76
96
  "speedScore": 25,
77
- "tpsEstimate": "0.4-3 completion tokens/sec on large CPU/NVMe builds; benchmark locally",
97
+ "tpsEstimate": "0.4-3 completion tokens/sec on large CPU/NVMe builds; acceptable for disaster/off-grid use when no hosted service is available; benchmark locally",
78
98
  "requires": {
79
99
  "minRamGb": 96,
80
100
  "diskGb": 520,
@@ -256,21 +276,23 @@
256
276
  }
257
277
  },
258
278
  {
259
- "id": "whisper-large-v3",
260
- "name": "Whisper Large v3 STT",
279
+ "id": "whisper-base-bundled",
280
+ "name": "Bundled Whisper Base STT",
261
281
  "roles": ["stt"],
262
282
  "backend": "openwebui-faster-whisper",
263
283
  "contextTokens": 0,
264
- "qualityScore": 82,
265
- "speedScore": 65,
266
- "tpsEstimate": "audio transcription speed depends on CPU/GPU and clip length",
284
+ "qualityScore": 72,
285
+ "speedScore": 82,
286
+ "tpsEstimate": "local speech-to-text speed depends on CPU/GPU and clip length",
267
287
  "requires": {
268
- "minRamGb": 16,
269
- "diskGb": 4
288
+ "minRamGb": 8,
289
+ "diskGb": 1
270
290
  },
271
291
  "source": {
272
- "type": "openwebui",
273
- "description": "OpenWebUI faster-whisper local STT model."
292
+ "type": "bundled-download",
293
+ "model": "Systran/faster-whisper-base",
294
+ "license": "MIT",
295
+ "description": "Installer-cached faster-whisper base model mounted into OpenWebUI for offline local STT after install."
274
296
  }
275
297
  }
276
298
  ]
@@ -3780,11 +3780,11 @@ def local_parity_recommended_model(feature_family: str, primary_models: list[str
3780
3780
  return models[0] if models else None
3781
3781
 
3782
3782
  if "codex" in family or "software" in family or "code" in family:
3783
- return first_available(["slopcode-qwen-coder-local", "local-agent-glm52", "glm52-q4-local"])
3783
+ return first_available(["slopcode-qwen-coder-local", "local-agent-glm52", "glm52-q8-local", "glm52-q4-local"])
3784
3784
  if "deep research" in family:
3785
- return first_available(["deep-research-glm52", "glm52-q4-local"])
3785
+ return first_available(["deep-research-glm52", "glm52-q8-local", "glm52-q4-local"])
3786
3786
  if "developer mode" in family or "mcp" in family:
3787
- return first_available(["local-agent-glm52", "local-chatgpt-auto", "glm52-q4-local"])
3787
+ return first_available(["local-agent-glm52", "local-chatgpt-auto", "glm52-q8-local", "glm52-q4-local"])
3788
3788
  if "image generation" in family:
3789
3789
  return first_available(["flux-2-klein-9b-fp8"])
3790
3790
  if "image editing" in family:
@@ -3792,18 +3792,18 @@ def local_parity_recommended_model(feature_family: str, primary_models: list[str
3792
3792
  if "image understanding" in family:
3793
3793
  return first_available(["local-vision-gemma4-12b", "local-vision-moondream2"])
3794
3794
  if "voice" in family or "record mode" in family:
3795
- return first_available(["whisper-large-v3", "local-agent-glm52"])
3795
+ return first_available(["whisper-base-bundled", "whisper-large-v3", "local-agent-glm52"])
3796
3796
  if "shopping" in family:
3797
- return first_available(["glm52-shopping-research-local", "glm52-q4-local"])
3797
+ return first_available(["glm52-shopping-research-local", "glm52-q8-local", "glm52-q4-local"])
3798
3798
  if "job search" in family or "resume" in family or "finance" in family:
3799
- return first_available(["local-agent-glm52", "local-chatgpt-auto", "glm52-q4-local"])
3799
+ return first_available(["local-agent-glm52", "local-chatgpt-auto", "glm52-q8-local", "glm52-q4-local"])
3800
3800
  if "study" in family:
3801
- return first_available(["glm52-study-coach-local", "glm52-q4-local"])
3801
+ return first_available(["glm52-study-coach-local", "glm52-q8-local", "glm52-q4-local"])
3802
3802
  if "advanced reasoning" in family or "long context" in family:
3803
- return first_available(["glm52-q4-local"])
3803
+ return first_available(["glm52-q8-local", "glm52-q4-local"])
3804
3804
  if "data analysis" in family or "canvas" in family or "memory" in family or "agent mode" in family:
3805
- return first_available(["local-agent-glm52", "glm52-q4-local"])
3806
- return first_available(["local-chatgpt-auto", "local-auto-router", "local-instant-gemma4-12b", "glm52-q4-local"])
3805
+ return first_available(["local-agent-glm52", "glm52-q8-local", "glm52-q4-local"])
3806
+ return first_available(["local-chatgpt-auto", "local-auto-router", "local-instant-gemma4-12b", "glm52-q8-local", "glm52-q4-local"])
3807
3807
 
3808
3808
 
3809
3809
  def local_parity_route_for_model(feature_family: str, model: str | None, profiles: dict) -> dict:
@@ -3833,10 +3833,10 @@ def local_parity_route_for_model(feature_family: str, model: str | None, profile
3833
3833
  route_id = "slopcode_tiny"
3834
3834
  route_type = "benchmarked_chat_route"
3835
3835
  action = "Select the Slopcode/Qwen coding model in OpenWebUI for local software work."
3836
- elif model_text == "glm52-q4-local" or "advanced reasoning" in family or "long context" in family:
3836
+ elif model_text in {"glm52-q8-local", "glm52-q4-local"} or "advanced reasoning" in family or "long context" in family:
3837
3837
  route_id = "glm_tiny"
3838
3838
  route_type = "benchmarked_chat_route"
3839
- action = "Select GLM 5.2 Q4 in OpenWebUI for private long-context reasoning."
3839
+ action = "Select the best available local GLM 5.2 route in OpenWebUI for private long-context reasoning."
3840
3840
  elif "shopping" in model_text:
3841
3841
  route_id = "glm52_shopping_research_preset"
3842
3842
  route_type = "chat_preset"
@@ -4462,9 +4462,9 @@ WORKFLOW_RECIPE_BLUEPRINTS = [
4462
4462
  "id": "private-long-context-workflow",
4463
4463
  "task_id": "private-long-context-reasoning",
4464
4464
  "title": "Private long-context reasoning with GLM 5.2",
4465
- "openwebui_entrypoint": "Model picker -> glm52-q4-local",
4465
+ "openwebui_entrypoint": "Model picker -> glm52-q8-local on enterprise rigs, otherwise glm52-q4-local",
4466
4466
  "steps": [
4467
- "Select glm52-q4-local when privacy and context length matter more than latency.",
4467
+ "Select glm52-q8-local on enterprise hardware when maximum local quality matters; otherwise select glm52-q4-local.",
4468
4468
  "Keep the prompt bounded when possible; use files/projects for reusable context.",
4469
4469
  "Use fast local routes for quick follow-ups when GLM latency is not needed.",
4470
4470
  ],
@@ -5293,6 +5293,7 @@ def local_parity_dashboard() -> dict:
5293
5293
  },
5294
5294
  "urls": {
5295
5295
  "openwebui": "http://127.0.0.1:8080",
5296
+ "glm52_q8_openai": "http://127.0.0.1:11446/v1",
5296
5297
  "glm52_openai": "http://127.0.0.1:11441/v1",
5297
5298
  "slopcode_openai": "http://127.0.0.1:11438/v1",
5298
5299
  "deep_research_openai": "http://127.0.0.1:18041/v1",
@@ -5306,6 +5307,7 @@ def local_parity_dashboard() -> dict:
5306
5307
  },
5307
5308
  "primary_models": [
5308
5309
  {"id": "local-chatgpt-auto", "route": "fast_router", "best_for": "default local ChatGPT-like routing"},
5310
+ {"id": "glm52-q8-local", "route": "glm_tiny", "context_tokens": 65536, "best_for": "enterprise 8-bit private long-context reasoning"},
5309
5311
  {"id": "glm52-q4-local", "route": "glm_tiny", "context_tokens": 65536, "best_for": "private long-context reasoning"},
5310
5312
  {
5311
5313
  "id": "qwen3.6-35b-a3b:slopcode-cpu-64k",
@@ -5317,7 +5319,7 @@ def local_parity_dashboard() -> dict:
5317
5319
  {"id": "local-agent-glm52", "route": "local_agent", "best_for": "tool and agent workflows"},
5318
5320
  {"id": "local-vision-gemma4-12b", "route": "local_vision", "best_for": "image understanding"},
5319
5321
  {"id": "flux-2-klein-9b-fp8", "route": "comfyui_flux", "best_for": "image generation"},
5320
- {"id": "whisper-large-v3", "route": "local_whisper_stt", "best_for": "speech-to-text"},
5322
+ {"id": "whisper-base-bundled", "route": "local_whisper_stt", "best_for": "speech-to-text"},
5321
5323
  ],
5322
5324
  "route_profiles": {
5323
5325
  key: {
@@ -6660,7 +6662,7 @@ def local_parity_audit_html() -> str:
6660
6662
  {metric("Starter prompts", f"{starter_summary.get('ready_starter_prompts')}/{starter_summary.get('starter_prompts')}", "prompt-library items")}
6661
6663
  {metric("Current release", f"{source_freshness_summary.get('current_release_covered_families')}/{source_freshness_summary.get('current_release_expected_families')}", "families")}
6662
6664
  {metric("Release evidence", f"{source_freshness_summary.get('current_release_covered_evidence_terms')}/{source_freshness_summary.get('current_release_expected_evidence_terms')}", "terms")}
6663
- {metric("Primary GLM route", "glm52-q4-local", "local long-context model")}
6665
+ {metric("Primary GLM route", "glm52-q8-local / glm52-q4-local", "local long-context model")}
6664
6666
  {metric("Scope exclusions", f"{frontier_summary.get('excluded_from_local_goal_items')}/{frontier_summary.get('boundary_items')}", "hosted capabilities")}
6665
6667
  {metric("Evidence artifacts", f"{evidence_summary.get('ready_artifacts')}/{evidence_summary.get('artifacts')}", "privacy-safe proof")}
6666
6668
  {metric("Quality evals", scorecard_summary.get('quality_evals'), "executable")}
@@ -9653,7 +9655,7 @@ def local_parity_gap_report_html() -> str:
9653
9655
  {metric("Quality evals", summary.get('quality_evals'), "executable catalog")}
9654
9656
  {metric("Continuity", summary.get('continuity_status'), "fallback status")}
9655
9657
  {metric("Sources", summary.get('source_entries'), "source snapshot")}
9656
- {metric("Primary GLM route", "glm52-q4-local", "local long-context model")}
9658
+ {metric("Primary GLM route", "glm52-q8-local / glm52-q4-local", "local long-context model")}
9657
9659
  {metric("GLM context", "65,536", "tokens")}
9658
9660
  {metric("Scope exclusions", f"{frontier_summary.get('excluded_from_local_goal_items')}/{frontier_summary.get('boundary_items')}", "hosted capabilities")}
9659
9661
  </section>
@@ -12797,7 +12799,7 @@ def local_model_route_recommendations() -> dict:
12797
12799
  "glm_tiny": {
12798
12800
  "title": "Private GLM 5.2 reasoning route",
12799
12801
  "benchmark_suite": "glm_tiny",
12800
- "default_model": "glm52-q4-local",
12802
+ "default_model": "glm52-q8-local or glm52-q4-local",
12801
12803
  "target_tps": 0.1,
12802
12804
  "best_for": [
12803
12805
  "private long-context reasoning",