sneakoscope 4.0.4 → 4.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (41)
  1. package/README.md +9 -9
  2. package/crates/sks-core/Cargo.lock +1 -1
  3. package/crates/sks-core/Cargo.toml +1 -1
  4. package/crates/sks-core/src/main.rs +1 -1
  5. package/dist/bin/sks.js +1 -1
  6. package/dist/core/codex-app/glm-profile-schema.js +5 -1
  7. package/dist/core/commands/glm-command.js +51 -6
  8. package/dist/core/commands/mad-sks-command.js +65 -9
  9. package/dist/core/fsx.js +1 -1
  10. package/dist/core/perf/lru-cache.js +33 -0
  11. package/dist/core/providers/glm/glm-52-profile.js +14 -7
  12. package/dist/core/providers/glm/glm-52-request.js +43 -12
  13. package/dist/core/providers/glm/glm-52-response-guard.js +1 -2
  14. package/dist/core/providers/glm/glm-52-settings.js +50 -8
  15. package/dist/core/providers/glm/glm-bench.js +127 -0
  16. package/dist/core/providers/glm/glm-context-budget.js +15 -0
  17. package/dist/core/providers/glm/glm-context-cache.js +9 -0
  18. package/dist/core/providers/glm/glm-direct-run.js +140 -0
  19. package/dist/core/providers/glm/glm-interactive-launch.js +5 -0
  20. package/dist/core/providers/glm/glm-latency-trace.js +40 -0
  21. package/dist/core/providers/glm/glm-loop-guard.js +31 -0
  22. package/dist/core/providers/glm/glm-mad-launch.js +18 -3
  23. package/dist/core/providers/glm/glm-mad-mode.js +48 -20
  24. package/dist/core/providers/glm/glm-model-meta-cache.js +19 -0
  25. package/dist/core/providers/glm/glm-patch-apply.js +58 -0
  26. package/dist/core/providers/glm/glm-patch-parser.js +19 -0
  27. package/dist/core/providers/glm/glm-profile-resolver.js +104 -0
  28. package/dist/core/providers/glm/glm-readiness.js +5 -0
  29. package/dist/core/providers/glm/glm-reasoning-policy.js +15 -0
  30. package/dist/core/providers/glm/glm-request-cache.js +64 -0
  31. package/dist/core/providers/glm/glm-run-controller.js +66 -0
  32. package/dist/core/providers/glm/glm-run-state.js +11 -0
  33. package/dist/core/providers/glm/glm-run-timeout.js +31 -0
  34. package/dist/core/providers/glm/glm-speed-context.js +82 -0
  35. package/dist/core/providers/glm/glm-speed-gate.js +40 -0
  36. package/dist/core/providers/glm/glm-speed-output-parser.js +40 -0
  37. package/dist/core/providers/glm/glm-tool-schema-cache.js +19 -0
  38. package/dist/core/providers/openrouter/openrouter-client.js +21 -1
  39. package/dist/core/providers/openrouter/openrouter-stream.js +94 -0
  40. package/dist/core/version.js +1 -1
  41. package/package.json +1 -1
package/README.md CHANGED
@@ -35,16 +35,16 @@ Set up this agent project with Sneakoscope Codex. Use [[mandarange/Sneakoscope-C

  ## 🚀 Current Release

- SKS **4.0.4** makes the GLM 5.2 MAD launch path real: `sks --mad --glm` now resolves the OpenRouter key, opens the MAD Zellij launch flow with Codex configured for `z-ai/glm-5.2`, blocks GPT fallback panes, and records launch proof; `sks --mad --glm --repair` still rotates the OpenRouter API key outside project files.
+ SKS **4.0.6** makes the GLM 5.2 MAD path bounded by default: `sks --mad --glm` now returns readiness/status and exits when no task is supplied, while task forms use a direct GLM-only speed path with loop guards, request timeouts, and deterministic patch gates. Ordinary `sks --mad`, Naruto/Team, and non-GLM Codex paths keep their existing defaults.

- What changed in 4.0.4:
+ What changed in 4.0.6:

- - **GLM MAD actually launches.** `sks --mad --glm` no longer stops after the readiness banner; it continues into the MAD launcher with a GLM/OpenRouter Codex profile.
- - **No leaked OpenRouter key.** The Zellij pane uses a mission-local `sks-glm-codex-wrapper.sh` that reads the key at runtime, so layout artifacts do not contain the raw secret.
- - **No GPT fallback panes.** GLM MAD disables the existing GPT/codex-sdk native swarm by default until a GLM worker backend exists, preserving the no-fallback guarantee.
- - **Launch proof.** Each GLM MAD launch writes `mad-glm-launch.json` with provider/model/profile/wrapper evidence and records disabled fallback status.
- - **OpenRouter key lifecycle.** Keys resolve from `OPENROUTER_API_KEY`, `SKS_OPENROUTER_API_KEY`, or the user SKS secret store; stored keys use private permissions and redacted metadata.
- - **4.0.3 GLM request safeguards remain.** GLM requests still use `provider.allow_fallbacks: false`, omit fallback `models`, and reject non-GLM response model ids before mutation.
+ - **No default long-lived GLM launch.** Bare `sks --mad --glm` no longer falls through to MAD/Zellij; `--interactive`, `--zellij`, or `session` is required for that path.
+ - **Fast GLM speed profile.** Speed mode keeps OpenRouter locked to `z-ai/glm-5.2`, disables GPT/model fallback, avoids high/xhigh reasoning by default, and uses `provider.require_parameters: false` with throughput-first routing.
+ - **Bounded direct task runs.** `sks --mad --glm run "task"` and `sks --mad --glm "task"` use a one-shot GLM speed run with max-turn, wall-clock, request-timeout, no-progress, repeated-output, and terminal-state guards.
+ - **Deterministic mutation gate.** GLM still returns patch envelopes; SKS parses the unified diff, blocks protected paths, runs `git apply --check`, and applies only after the gate passes.
+ - **OpenRouter speed plumbing.** Encoded request bodies are cached without Authorization headers, request timeout/abort is wired, streaming TTFT/usage capture is scaffolded, and synthetic `--bench` remains network-free by default.
+ - **Loop regression tests.** Routing, speed-profile, cache, loop-guard, patch-gate, and OpenRouter key handling are covered by targeted tests.

  SKS **3.1.16** was a launch-reliability patch on the 3.1.15 doctor-reliability release. It made `sks --mad` self-bootstrap a fresh project instead of dead-ending on a missing Codex config.
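The "bounded direct task runs" bullet in the 4.0.6 notes lists max-turn, wall-clock, and repeated-output guards. The contents of `glm-loop-guard.js` are not part of this excerpt, so the following is only a minimal sketch of how such a guard could work. `createLoopGuard`, its option names, its limits, and the reason strings are all hypothetical.

```javascript
// Hypothetical loop guard; names, defaults, and reason strings are illustrative,
// not taken from the SKS sources.
function createLoopGuard({ maxTurns = 8, maxWallClockMs = 60_000, maxRepeats = 2, now = Date.now } = {}) {
  const startedAt = now();
  let turns = 0;
  let lastOutput = null;
  let repeats = 0;
  return {
    // Record one model turn; return a termination reason, or null to continue.
    check(output) {
      turns += 1;
      if (output === lastOutput) {
        repeats += 1;
      } else {
        repeats = 0;
        lastOutput = output;
      }
      if (repeats >= maxRepeats) return 'repeated_output';
      if (turns >= maxTurns) return 'max_turns';
      if (now() - startedAt >= maxWallClockMs) return 'wall_clock_timeout';
      return null;
    }
  };
}

const guard = createLoopGuard({ maxTurns: 5, maxRepeats: 2 });
const reasons = ['plan', 'patch', 'patch', 'patch'].map((out) => guard.check(out));
console.log(reasons);
```

Here the fourth turn repeats the previous output for the second time, so the guard returns `'repeated_output'` and a caller would stop rather than issue another request.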
 
@@ -396,7 +396,7 @@ sks team open-zellij latest
  sks team attach-zellij latest
  ```

- Interactive SKS sessions use Zellij layouts. By default SKS launches Codex in Fast service tier with `--model gpt-5.5`, `-c service_tier="fast"`, the selected `model_reasoning_effort`, and `--no-alt-screen` for Zellij-backed interactive panes so terminal scrollback captures the conversation transcript. SKS always forces the model to `gpt-5.5`; `SKS_CODEX_MODEL` and `SKS_CODEX_FAST_HIGH=0` cannot downgrade or remove that model pin. You can still set `SKS_CODEX_REASONING` to change reasoning effort, and `SKS_ZELLIJ_CODEX_ALT_SCREEN=1` restores Codex's alternate-screen UI for the next launch. Use `sks --mad --workspace <name>` for an explicit MAD session and `sks help` for CLI help.
+ Interactive SKS sessions use Zellij layouts. By default SKS launches Codex in Fast service tier with `--model gpt-5.5`, `-c service_tier="fast"`, the selected `model_reasoning_effort`, and `--no-alt-screen` for Zellij-backed interactive panes so terminal scrollback captures the conversation transcript. Non-GLM SKS sessions force the model to `gpt-5.5`; `sks --mad --glm` is the OpenRouter GLM 5.2 exception. `SKS_CODEX_MODEL` and `SKS_CODEX_FAST_HIGH=0` cannot downgrade or remove the non-GLM model pin. You can still set `SKS_CODEX_REASONING` to change reasoning effort, and `SKS_ZELLIJ_CODEX_ALT_SCREEN=1` restores Codex's alternate-screen UI for the next launch. Use `sks --mad --workspace <name>` for an explicit MAD session and `sks help` for CLI help.

  Before opening the interactive runtime, SKS checks the installed Codex CLI against npm `@openai/codex@latest`. If a newer version exists, it asks `Y/n`; answering `y` updates automatically with `npm i -g @openai/codex@latest` and then opens the runtime with the updated Codex CLI.
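The 4.0.6 README notes describe a "deterministic mutation gate" that parses the unified diff and blocks protected paths before `git apply --check` runs. The patch parser itself is not shown in this excerpt, so the sketch below only illustrates the path-blocking step. `PROTECTED_PREFIXES`, `patchTargets`, `gatePatch`, and the blocker strings are hypothetical.

```javascript
// Hypothetical protected-path list; the list SKS actually uses is not shown in this diff.
const PROTECTED_PREFIXES = ['.git/', '.env', 'node_modules/'];

// Collect target paths from the "+++ b/<path>" headers of a unified diff.
// Simplification: an added content line that itself begins with "++ " would
// also match, so a real parser would track hunk boundaries.
function patchTargets(diffText) {
  const targets = [];
  for (const line of diffText.split('\n')) {
    const match = /^\+\+\+ (?:b\/)?(.+)$/.exec(line);
    if (match && match[1] !== '/dev/null') targets.push(match[1].trim());
  }
  return targets;
}

// Return blockers for any target that escapes the tree or touches a protected path.
function gatePatch(diffText) {
  const targets = patchTargets(diffText);
  const blockers = targets
    .filter((path) => path.includes('..') || PROTECTED_PREFIXES.some((prefix) => path.startsWith(prefix)))
    .map((path) => `protected_path:${path}`);
  if (targets.length === 0) blockers.push('no_targets');
  return { ok: blockers.length === 0, targets, blockers };
}

const sample = [
  '--- a/src/a.js', '+++ b/src/a.js', '@@ -1 +1 @@', '-old', '+new',
  '--- a/.env', '+++ b/.env', '@@ -1 +1 @@', '-A=1', '+A=2'
].join('\n');
const gateResult = gatePatch(sample);
console.log(gateResult.ok, gateResult.blockers);
```

In a flow like the one the README describes, `git apply --check` would run only when this kind of gate reports `ok: true`, and the patch would be applied only if that check also succeeds.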
 
package/crates/sks-core/Cargo.lock CHANGED
@@ -76,7 +76,7 @@ dependencies = [

  [[package]]
  name = "sks-core"
- version = "4.0.4"
+ version = "4.0.6"
  dependencies = [
  "serde_json",
  ]
package/crates/sks-core/Cargo.toml CHANGED
@@ -1,6 +1,6 @@
  [package]
  name = "sks-core"
- version = "4.0.4"
+ version = "4.0.6"
  edition = "2021"

  [dependencies]
package/crates/sks-core/src/main.rs CHANGED
@@ -4,7 +4,7 @@ use std::io::{self, Read, Seek, SeekFrom};
  fn main() {
  let mut args = std::env::args().skip(1);
  match args.next().as_deref() {
- Some("--version") => println!("sks-rs 4.0.4"),
+ Some("--version") => println!("sks-rs 4.0.6"),
  Some("compact-info") => {
  let mut input = String::new();
  let _ = io::stdin().read_to_string(&mut input);
package/dist/bin/sks.js CHANGED
@@ -1,5 +1,5 @@
  #!/usr/bin/env node
- const FAST_PACKAGE_VERSION = '4.0.4';
+ const FAST_PACKAGE_VERSION = '4.0.6';
  const args = process.argv.slice(2);
  try {
  if (args[0] === '--agent' && args[1] === 'worker') {
package/dist/core/codex-app/glm-profile-schema.js CHANGED
@@ -13,7 +13,11 @@ export function validateGlmCodexAppModelProfile(value) {
  profile.model === GLM_52_OPENROUTER_MODEL ? null : 'glm_codex_app_profile_invalid_model',
  profile.mode === GLM_MAD_MODE ? null : 'glm_codex_app_profile_invalid_mode',
  profile.strictModelLock === true ? null : 'glm_codex_app_profile_not_strict',
- profile.gptFallbackAllowed === false ? null : 'glm_codex_app_profile_allows_gpt_fallback'
+ profile.gptFallbackAllowed === false ? null : 'glm_codex_app_profile_allows_gpt_fallback',
+ profile.defaultProfile === 'speed' ? null : 'glm_codex_app_profile_default_not_speed',
+ profile.defaultSettings?.tool_choice === 'none' ? null : 'glm_codex_app_profile_default_tools_not_omitted',
+ profile.defaultSettings?.provider_require_parameters === true ? null : 'glm_codex_app_profile_default_does_not_require_parameters',
+ profile.defaultSettings?.provider_allow_fallbacks === false ? null : 'glm_codex_app_profile_allows_provider_fallback'
  ].filter((item) => Boolean(item));
  return {
  ok: blockers.length === 0,
package/dist/core/commands/glm-command.js CHANGED
@@ -1,10 +1,55 @@
- import { runMadGlmMode } from '../providers/glm/glm-mad-mode.js';
- import { flag } from '../../cli/args.js';
- import { madHighCommand } from './mad-sks-command.js';
+ import { flag, positionalArgs } from '../../cli/args.js';
+ import { runGlmBench } from '../providers/glm/glm-bench.js';
+ import { printJson } from '../../cli/output.js';
+ import { runGlmDirectSpeedRun } from '../providers/glm/glm-direct-run.js';
+ import { runGlmReadinessAndExit } from '../providers/glm/glm-readiness.js';
+ import { runGlmInteractiveLaunch } from '../providers/glm/glm-interactive-launch.js';
  export async function glmCommand(args = []) {
- const result = await runMadGlmMode(args);
- if (!result.ok || flag(args, '--repair') || flag(args, '--json'))
+ if (flag(args, '--bench')) {
+ const result = await runGlmBench(process.cwd(), args);
+ if (result.status === 'blocked')
+ process.exitCode = 1;
+ if (flag(args, '--json'))
+ printJson(result);
+ else if (result.status === 'blocked')
+ console.error(`GLM bench blocked: ${result.warnings.join(', ')}`);
+ else
+ console.log(`GLM bench: dry-run p50=${result.summary.speed_p50_total_ms}ms ratio=${result.summary.speed_vs_deep_ratio}`);
  return result;
- return madHighCommand(['--glm', ...args], { glmReadiness: result });
+ }
+ const task = extractGlmTask(args);
+ const interactive = flag(args, '--interactive') || flag(args, '--zellij') || positionalArgs(args)[0] === 'session';
+ if (interactive) {
+ const readiness = await runGlmReadinessAndExit(args);
+ if (!readiness.ok)
+ return readiness;
+ return runGlmInteractiveLaunch(args, readiness);
+ }
+ if (!task || flag(args, '--repair') || flag(args, '--status')) {
+ return runGlmReadinessAndExit(args);
+ }
+ const result = await runGlmDirectSpeedRun({
+ cwd: process.cwd(),
+ task,
+ args,
+ dryRun: flag(args, '--dry-run')
+ });
+ if (flag(args, '--json'))
+ printJson(result);
+ else if (result.ok)
+ console.log(`GLM direct run completed: ${result.termination_reason}`);
+ else
+ console.error(`GLM direct run ${result.status}: ${result.blockers.join(', ') || result.termination_reason}`);
+ if (!result.ok)
+ process.exitCode = 1;
+ return result;
+ }
+ function extractGlmTask(args) {
+ const positional = positionalArgs(args).map(String);
+ if (positional[0] === 'run')
+ return positional.slice(1).join(' ').trim() || null;
+ if (positional[0] === 'session')
+ return null;
+ return positional.join(' ').trim() || null;
  }
  //# sourceMappingURL=glm-command.js.map
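The new `glmCommand` picks one of four paths in a fixed order: bench, interactive launch, readiness/status, then direct run. The sketch below restates that ordering as a pure function so it can be traced by hand. `routeGlmArgs` and the route names are hypothetical. The `flag` and `positionalArgs` helpers are simplified stand-ins for the ones imported from `../../cli/args.js`, whose real behavior (for example around value flags) is not shown in this diff.

```javascript
// Assumption: positional arguments are simply those that do not start with "--".
const positionalArgs = (args) => args.map(String).filter((arg) => !arg.startsWith('--'));
// Assumption: a flag is present when it appears verbatim in the argument list.
const flag = (args, name) => args.map(String).includes(name);

// Mirrors extractGlmTask from the diff: "run <task...>", bare "<task...>", or no task.
function extractGlmTask(args) {
  const positional = positionalArgs(args);
  if (positional[0] === 'run') return positional.slice(1).join(' ').trim() || null;
  if (positional[0] === 'session') return null;
  return positional.join(' ').trim() || null;
}

// Hypothetical helper: returns which branch glmCommand would take.
function routeGlmArgs(args) {
  if (flag(args, '--bench')) return 'bench';
  const interactive = flag(args, '--interactive') || flag(args, '--zellij') || positionalArgs(args)[0] === 'session';
  if (interactive) return 'interactive';
  if (!extractGlmTask(args) || flag(args, '--repair') || flag(args, '--status')) return 'readiness';
  return 'direct-run';
}

console.log(routeGlmArgs([]));                      // readiness
console.log(routeGlmArgs(['run', 'fix', 'tests'])); // direct-run
console.log(routeGlmArgs(['session']));             // interactive
console.log(routeGlmArgs(['fix tests', '--status'])); // readiness
```

Under these assumptions a bare invocation never reaches the launcher, which matches the README's statement that `--interactive`, `--zellij`, or `session` is required for the long-lived path.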
package/dist/core/commands/mad-sks-command.js CHANGED
@@ -30,13 +30,31 @@ export async function madHighCommand(args = [], deps = {}) {
  const subcommand = firstSubcommand(args);
  if (subcommand)
  return madSksSubcommand(subcommand, args.filter((arg) => String(arg) !== subcommand));
- const cleanArgs = stripMadLaunchOnlyArgs(args);
  const rawArgs = (args || []).map((arg) => String(arg));
  const glmMadLaunch = isMadGlmLaunch(rawArgs, deps);
+ const glmOnlyFlagBlockers = findGlmOnlyMadFlagBlockers(rawArgs, glmMadLaunch);
+ if (glmOnlyFlagBlockers.length) {
+ const result = {
+ ok: false,
+ status: 'blocked',
+ blockers: glmOnlyFlagBlockers,
+ hint: 'GLM profile and diagnostics flags require sks --mad --glm.'
+ };
+ if (rawArgs.includes('--json'))
+ console.log(JSON.stringify(result, null, 2));
+ else {
+ console.error('SKS MAD launch blocked: GLM-only flags require --glm.');
+ for (const blocker of glmOnlyFlagBlockers)
+ console.error(`- ${blocker}`);
+ }
+ process.exitCode = 1;
+ return result;
+ }
+ const cleanArgs = stripMadLaunchOnlyArgs(args, { includeGlmFlags: glmMadLaunch });
  const madDbGrant = resolveMadLaunchMadDbGrant(rawArgs);
  const dryRun = rawArgs.includes('--dry-run');
  if (rawArgs.includes('--json') && !dryRun) {
- const profile = glmMadLaunch ? buildMadGlmLaunchProfileNoWrite() : buildMadHighLaunchProfileNoWrite();
+ const profile = glmMadLaunch ? buildMadGlmLaunchProfileNoWrite(rawArgs) : buildMadHighLaunchProfileNoWrite();
  return console.log(JSON.stringify(profile, null, 2));
  }
  const update = { status: 'notice_only', non_blocking: true };
@@ -176,7 +194,7 @@ export async function madHighCommand(args = [], deps = {}) {
  return launchPreflight;
  }
  const madLaunch = await activateMadZellijPermissionState(process.cwd(), args);
- const glmRuntime = glmMadLaunch ? await prepareMadGlmLaunchRuntime(madLaunch, deps) : null;
+ const glmRuntime = glmMadLaunch ? await prepareMadGlmLaunchRuntime(madLaunch, { ...deps, glmArgs: deps?.glmArgs || rawArgs }) : null;
  if (glmMadLaunch && !glmRuntime?.ok) {
  process.exitCode = 1;
  return glmRuntime;
@@ -325,7 +343,7 @@ function isMadGlmLaunch(args = [], deps = {}) {
  }
  async function prepareMadGlmLaunchRuntime(madLaunch, deps = {}) {
  const keyResolution = await resolveMadGlmLaunchKey(process.env);
- const profile = buildMadGlmLaunchProfileNoWrite();
+ const profile = buildMadGlmLaunchProfileNoWrite(deps?.glmArgs || []);
  if (!keyResolution.key) {
  const blocked = {
  schema: 'sks.glm-mad-launch.v1',
@@ -334,6 +352,9 @@ async function prepareMadGlmLaunchRuntime(madLaunch, deps = {}) {
  mission_id: madLaunch.mission_id,
  provider: profile.provider,
  model: profile.model,
+ glm_profile: profile.glm_profile,
+ glm_mode: profile.glm_mode,
+ model_reasoning_effort: profile.model_reasoning_effort,
  gpt_fallback_allowed: false,
  blockers: keyResolution.blockers,
  warnings: keyResolution.warnings
@@ -367,6 +388,9 @@ async function prepareMadGlmLaunchRuntime(madLaunch, deps = {}) {
  type: 'mad_sks.glm_launch_profile_ready',
  provider: profile.provider,
  model: profile.model,
+ glm_profile: profile.glm_profile,
+ glm_mode: profile.glm_mode,
+ model_reasoning_effort: profile.model_reasoning_effort,
  key_source: keyResolution.source || null,
  gpt_fallback_allowed: false
  });
@@ -713,7 +737,7 @@ async function activateMadZellijPermissionState(cwd = process.cwd(), args = [])
  });
  return { mission_id: id, dir, gate, root };
  }
- function madLaunchOnlyFlags() {
+ function baseMadLaunchOnlyFlags() {
  return new Set([
  '--mad',
  '--MAD',
@@ -760,8 +784,26 @@ function madLaunchOnlyFlags() {
  '--ack'
  ]);
  }
- function madLaunchValueFlags() {
+ function glmMadLaunchOnlyFlags() {
  return new Set([
+ '--deep',
+ '--xhigh',
+ '--strict',
+ '--trace',
+ '--ttft',
+ '--exact-provider'
+ ]);
+ }
+ function madLaunchOnlyFlags(includeGlmFlags = false) {
+ const flags = baseMadLaunchOnlyFlags();
+ if (includeGlmFlags) {
+ for (const flag of glmMadLaunchOnlyFlags())
+ flags.add(flag);
+ }
+ return flags;
+ }
+ function madLaunchValueFlags(includeGlmFlags = false) {
+ const flags = new Set([
  '--mad-agents',
  '--mad-swarm-agents',
  '--mad-swarm-work-items',
@@ -769,6 +811,20 @@ function madLaunchValueFlags() {
  '--mad-swarm-prompt',
  '--ack'
  ]);
+ if (includeGlmFlags)
+ flags.add('--exact-provider');
+ return flags;
+ }
+ export function findGlmOnlyMadFlagBlockers(args = [], glmMadLaunch = false) {
+ if (glmMadLaunch)
+ return [];
+ const blockers = [];
+ const glmOnly = new Set([...glmMadLaunchOnlyFlags(), '--bench']);
+ for (const arg of args) {
+ if (glmOnly.has(String(arg)))
+ blockers.push(`glm_flag_requires_--glm:${arg}`);
+ }
+ return blockers;
  }
  export function defaultMadSwarmBackend(args = [], opts = {}) {
  const list = (args || []).map((arg) => String(arg));
@@ -785,9 +841,9 @@ export function defaultMadSwarmBackend(args = [], opts = {}) {
  return 'codex-sdk';
  return 'zellij';
  }
- function stripMadLaunchOnlyArgs(args = []) {
- const flags = madLaunchOnlyFlags();
- const valueFlags = madLaunchValueFlags();
+ export function stripMadLaunchOnlyArgs(args = [], opts = {}) {
+ const flags = madLaunchOnlyFlags(Boolean(opts.includeGlmFlags));
+ const valueFlags = madLaunchValueFlags(Boolean(opts.includeGlmFlags));
  const out = [];
  for (let i = 0; i < args.length; i += 1) {
  const arg = String(args[i]);
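The hunks above make `madHighCommand` reject GLM-only flags when `--glm` is absent, via `findGlmOnlyMadFlagBlockers`. A standalone restatement of that check, using the flag list visible in the diff, might look like the sketch below. `GLM_ONLY_FLAGS` and `findGlmOnlyFlagBlockers` are illustrative names, not the module's exports.

```javascript
// Flag list taken from glmMadLaunchOnlyFlags() plus '--bench', as shown in the diff.
const GLM_ONLY_FLAGS = new Set(['--deep', '--xhigh', '--strict', '--trace', '--ttft', '--exact-provider', '--bench']);

// Sketch: one blocker string per GLM-only flag when this is not a GLM launch.
function findGlmOnlyFlagBlockers(args = [], glmMadLaunch = false) {
  if (glmMadLaunch) return [];
  return args
    .map(String)
    .filter((arg) => GLM_ONLY_FLAGS.has(arg))
    .map((arg) => `glm_flag_requires_--glm:${arg}`);
}

console.log(findGlmOnlyFlagBlockers(['--mad', '--deep', '--json'], false));
console.log(findGlmOnlyFlagBlockers(['--mad', '--glm', '--deep'], true));
```

The first call yields a single blocker for `--deep`; the second yields an empty list because the launch is already a GLM launch.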
package/dist/core/fsx.js CHANGED
@@ -5,7 +5,7 @@ import os from 'node:os';
  import crypto from 'node:crypto';
  import { spawn } from 'node:child_process';
  import { fileURLToPath } from 'node:url';
- export const PACKAGE_VERSION = '4.0.4';
+ export const PACKAGE_VERSION = '4.0.6';
  export const DEFAULT_PROCESS_TAIL_BYTES = 256 * 1024;
  export const DEFAULT_PROCESS_TIMEOUT_MS = 30 * 60 * 1000;
  export function nowIso() {
package/dist/core/perf/lru-cache.js ADDED
@@ -0,0 +1,33 @@
+ export class SksLruCache {
+ maxEntries;
+ map = new Map();
+ constructor(maxEntries = 128) {
+ this.maxEntries = Math.max(1, Math.floor(maxEntries));
+ }
+ get size() {
+ return this.map.size;
+ }
+ get(key) {
+ const entry = this.map.get(key);
+ if (!entry)
+ return null;
+ this.map.delete(key);
+ this.map.set(key, entry);
+ return entry.value;
+ }
+ set(key, value, createdAt = Date.now()) {
+ if (this.map.has(key))
+ this.map.delete(key);
+ this.map.set(key, { key, value, createdAt });
+ while (this.map.size > this.maxEntries) {
+ const oldest = this.map.keys().next().value;
+ if (!oldest)
+ break;
+ this.map.delete(oldest);
+ }
+ }
+ clear() {
+ this.map.clear();
+ }
+ }
+ //# sourceMappingURL=lru-cache.js.map
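`SksLruCache` relies on `Map` preserving insertion order: re-inserting a key on `get` moves it to the end, so the first key is always the least recently used. The sketch below is a condensed stand-in for illustration only (`TinyLru` is a hypothetical name, and the `createdAt` field is omitted); it keeps the same get/set semantics so the eviction order can be followed by hand.

```javascript
// Condensed stand-in for the cache in this diff; not the packaged class.
class TinyLru {
  constructor(maxEntries = 128) {
    this.maxEntries = Math.max(1, Math.floor(maxEntries));
    this.map = new Map();
  }
  get(key) {
    const entry = this.map.get(key);
    if (!entry) return null;
    // Delete and re-insert so the key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, { key, value });
    while (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value;
      if (!oldest) break;
      this.map.delete(oldest);
    }
  }
}

const cache = new TinyLru(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');      // touch 'a', so 'b' is now the oldest entry
cache.set('c', 3);   // evicts 'b'
const remainingKeys = [...cache.map.keys()];
console.log(remainingKeys, cache.get('b'));
```

After the final `set`, the cache holds `a` and `c`, and `get('b')` returns `null`. One consequence of the `if (!oldest) break;` guard, in both this sketch and the diffed class, is that a falsy key such as `''` or `0` in the oldest position stops eviction, so callers would presumably want non-empty string keys.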
package/dist/core/providers/glm/glm-52-profile.js CHANGED
@@ -1,7 +1,9 @@
- import { GLM_52_DEFAULT_REQUEST_SETTINGS, GLM_52_OPENROUTER_MODEL, GLM_MAD_MODE } from './glm-52-settings.js';
+ import { GLM_52_OPENROUTER_MODEL, GLM_MAD_MODE } from './glm-52-settings.js';
+ import { profileFromConst } from './glm-profile-resolver.js';
  export const GLM_CODEX_APP_PROFILE_ID = 'sks/glm-5.2-mad';
- export const GLM_CODEX_APP_PROFILE_LABEL = 'GLM 5.2 (MAD / OpenRouter)';
+ export const GLM_CODEX_APP_PROFILE_LABEL = 'GLM 5.2 (MAD Speed / OpenRouter)';
  export function buildGlmCodexAppModelProfile() {
+ const speed = profileFromConst('speed');
  return {
  schema: 'sks.codex-app-model-profile.v1',
  id: GLM_CODEX_APP_PROFILE_ID,
@@ -12,12 +14,17 @@ export function buildGlmCodexAppModelProfile() {
  strictModelLock: true,
  gptFallbackAllowed: false,
  requiresSecret: 'openrouter-api-key',
+ defaultProfile: 'speed',
  defaultSettings: {
- temperature: GLM_52_DEFAULT_REQUEST_SETTINGS.temperature,
- top_p: GLM_52_DEFAULT_REQUEST_SETTINGS.top_p,
- reasoning_effort: 'high',
- tool_choice: 'auto',
- parallel_tool_calls: false
+ temperature: speed.temperature,
+ top_p: speed.top_p,
+ reasoning_effort: speed.reasoning_effort || null,
+ tool_choice: speed.tool_choice,
+ parallel_tool_calls: speed.parallel_tool_calls,
+ max_tokens: speed.max_tokens,
+ provider_sort: speed.provider.sort || 'throughput',
+ provider_allow_fallbacks: false,
+ provider_require_parameters: speed.provider.require_parameters
  },
  codexCompatibility: {
  target: 'rust-v0.141.0',
package/dist/core/providers/glm/glm-52-request.js CHANGED
@@ -1,34 +1,65 @@
  import { GLM_52_DEFAULT_REQUEST_SETTINGS, GLM_52_OPENROUTER_MODEL, clampGlm52MaxTokens } from './glm-52-settings.js';
+ import { buildDeepReasoningConfig, buildFastReasoningConfig } from './glm-reasoning-policy.js';
+ import { profileFromConst, resolveGlmProfileFromArgs } from './glm-profile-resolver.js';
  export function buildGlm52Request(input) {
+ const profile = resolveInputProfile(input.profile, input.args, input.reasoningEffort);
+ if (profile.blockers.length) {
+ throw new Error(`GLM request profile blocked: ${profile.blockers.join(', ')}`);
+ }
+ const strictOrDeepEffort = profile.reasoning_effort || (input.reasoningEffort === 'high' || input.reasoningEffort === 'xhigh' ? input.reasoningEffort : undefined);
+ const reasoning = profile.name === 'speed'
+ ? buildFastReasoningConfig(input.reasoningMeta)
+ : buildDeepReasoningConfig(strictOrDeepEffort || 'high');
+ if (profile.name === 'speed' && (reasoning.effort === 'high' || reasoning.effort === 'xhigh')) {
+ throw new Error(`GLM speed profile invariant violated: forbidden reasoning effort ${reasoning.effort}`);
+ }
  const request = {
  model: GLM_52_OPENROUTER_MODEL,
  messages: input.messages,
- stream: input.stream ?? GLM_52_DEFAULT_REQUEST_SETTINGS.stream,
- temperature: GLM_52_DEFAULT_REQUEST_SETTINGS.temperature,
- top_p: GLM_52_DEFAULT_REQUEST_SETTINGS.top_p,
- reasoning: { effort: input.reasoningEffort ?? 'high' },
- max_tokens: clampGlm52MaxTokens(input.maxTokens),
- tool_choice: input.toolChoice ?? 'auto',
- parallel_tool_calls: input.parallelToolCalls ?? false,
+ stream: input.stream ?? profile.stream,
+ temperature: profile.temperature,
+ top_p: profile.top_p,
+ ...(reasoning ? { reasoning } : {}),
+ max_tokens: clampGlm52MaxTokens(input.maxTokens ?? profile.max_tokens),
+ tool_choice: input.toolChoice ?? profile.tool_choice,
+ parallel_tool_calls: input.parallelToolCalls ?? profile.parallel_tool_calls,
+ ...(profile.stop && profile.name === 'speed' ? { stop: profile.stop } : {}),
  provider: {
  allow_fallbacks: false,
- require_parameters: true,
- sort: input.providerSort ?? 'throughput'
- }
+ require_parameters: profile.provider.require_parameters,
+ ...(profile.provider.sort || input.providerSort ? { sort: input.providerSort ?? profile.provider.sort } : {}),
+ ...(profile.provider.preferred_min_throughput ? { preferred_min_throughput: profile.provider.preferred_min_throughput } : {}),
+ ...(profile.provider.preferred_max_latency ? { preferred_max_latency: profile.provider.preferred_max_latency } : {}),
+ ...(profile.provider.order ? { order: profile.provider.order } : {})
+ },
+ ...(input.responseFormat || profile.response_format ? { response_format: input.responseFormat ?? profile.response_format } : {})
  };
  return {
  ...request,
- ...(input.tools ? { tools: input.tools } : {}),
- ...(input.responseFormat ? { response_format: input.responseFormat } : {})
+ ...(input.tools && request.tool_choice !== 'none' ? { tools: input.tools } : {})
  };
  }
  export function buildGlm52KeyValidationRequest() {
  return buildGlm52Request({
  messages: [{ role: 'user', content: 'Reply with OK.' }],
+ profile: 'speed',
  stream: false,
  maxTokens: 1,
  toolChoice: 'none',
  parallelToolCalls: false
  });
  }
+ function resolveInputProfile(profile, args, reasoningEffort) {
+ if (profile && typeof profile === 'object')
+ return profile;
+ if (profile)
+ return profileFromConst(profile);
+ if (args)
+ return resolveGlmProfileFromArgs(args);
+ if (reasoningEffort === 'xhigh')
+ return profileFromConst('xhigh');
+ if (reasoningEffort === 'high')
+ return profileFromConst('deep');
+ return profileFromConst(GLM_52_DEFAULT_REQUEST_SETTINGS.mode === 'mad-glm-speed' ? 'speed' : 'speed');
+ }
  //# sourceMappingURL=glm-52-request.js.map
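`buildGlm52Request` now assembles the request with conditional spreads, so optional keys such as `sort` or `tools` are omitted entirely rather than sent as `undefined`, and `tools` is dropped whenever the effective `tool_choice` is `'none'`. The reduced sketch below isolates that pattern. `speedProfile` is a hypothetical, trimmed-down object whose field values are copied from the speed profile in this diff, and `buildRequestSketch` is an illustrative name.

```javascript
// Hypothetical trimmed profile; values copied from the speed profile in this diff.
const speedProfile = {
  name: 'speed',
  tool_choice: 'none',
  provider: { require_parameters: false, sort: 'throughput' }
};

// Sketch of the conditional-spread assembly; covers only a few of the real fields.
function buildRequestSketch(input, profile = speedProfile) {
  const toolChoice = input.toolChoice ?? profile.tool_choice;
  return {
    messages: input.messages,
    tool_choice: toolChoice,
    provider: {
      allow_fallbacks: false,
      require_parameters: profile.provider.require_parameters,
      // Only emit "sort" when either the profile or the caller supplies one.
      ...(profile.provider.sort || input.providerSort ? { sort: input.providerSort ?? profile.provider.sort } : {})
    },
    // Tools are attached only when tool use is actually enabled.
    ...(input.tools && toolChoice !== 'none' ? { tools: input.tools } : {})
  };
}

const tools = [{ type: 'function', function: { name: 'noop' } }];
const messages = [{ role: 'user', content: 'hi' }];
const withoutTools = buildRequestSketch({ messages, tools });
const withTools = buildRequestSketch({ messages, tools, toolChoice: 'auto' });
console.log('tools' in withoutTools, 'tools' in withTools);
```

With the speed default of `tool_choice: 'none'`, the first request carries no `tools` key even though tools were passed in; overriding `toolChoice` to `'auto'` restores them.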
package/dist/core/providers/glm/glm-52-response-guard.js CHANGED
@@ -11,8 +11,7 @@ export function assertGlm52ActualModel(responseModel) {
  }
  const normalized = responseModel.toLowerCase();
  if (normalized === GLM_52_OPENROUTER_MODEL ||
- normalized.startsWith(`${GLM_52_OPENROUTER_MODEL}-`) ||
- normalized.includes('glm-5.2')) {
+ normalized.startsWith(`${GLM_52_OPENROUTER_MODEL}-`)) {
  return {
  ok: true,
  code: 'ok',
package/dist/core/providers/glm/glm-52-settings.js CHANGED
@@ -1,24 +1,66 @@
  export { OPENROUTER_CHAT_COMPLETIONS_URL } from '../openrouter/openrouter-types.js';
  export const GLM_52_OPENROUTER_MODEL = 'z-ai/glm-5.2';
- export const GLM_MAD_MODE = 'mad-glm';
- export const GLM_52_MAX_TOKENS_DEFAULT = 32768;
+ export const GLM_52_MODEL = GLM_52_OPENROUTER_MODEL;
+ export const GLM_SPEED_MODE = 'mad-glm-speed';
+ export const GLM_DEEP_MODE = 'mad-glm-deep';
+ export const GLM_XHIGH_MODE = 'mad-glm-xhigh';
+ export const GLM_STRICT_MODE = 'mad-glm-strict';
+ export const GLM_MAD_MODE = GLM_SPEED_MODE;
+ export const GLM_52_MAX_TOKENS_SPEED = 4096;
+ export const GLM_52_MAX_TOKENS_DEFAULT = GLM_52_MAX_TOKENS_SPEED;
+ export const GLM_52_MAX_TOKENS_DEEP = 16384;
+ export const GLM_52_MAX_TOKENS_XHIGH = 32768;
  export const GLM_52_MAX_TOKENS_LONG = 65536;
  export const GLM_52_MAX_TOKENS_XLONG = 131072;
  export const GLM_52_TOP_PROVIDER_MAX_COMPLETION_TOKENS = 262144;
- export const GLM_52_DEFAULT_REQUEST_SETTINGS = {
+ export const GLM_SPEED_PROFILE = {
  model: GLM_52_OPENROUTER_MODEL,
- temperature: 1,
- top_p: 0.95,
- reasoning_effort: 'high',
+ mode: GLM_SPEED_MODE,
+ temperature: 0.2,
+ top_p: 0.85,
  stream: true,
  provider: {
  allow_fallbacks: false,
- require_parameters: true
+ require_parameters: false,
+ sort: 'throughput',
+ preferred_min_throughput: { p50: 80, p90: 40 },
+ preferred_max_latency: { p50: 2, p90: 5 }
+ },
+ tool_choice: 'none',
+ parallel_tool_calls: false,
+ max_tokens: GLM_52_MAX_TOKENS_SPEED,
+ reasoning_effort: null,
+ reasoning_default: 'off-or-minimal-speed'
+ };
+ export const GLM_DEEP_PROFILE = {
+ model: GLM_52_OPENROUTER_MODEL,
+ mode: GLM_DEEP_MODE,
+ temperature: 0.3,
+ top_p: 0.9,
+ stream: true,
+ provider: {
+ allow_fallbacks: false,
+ require_parameters: true,
+ sort: 'throughput'
  },
  tool_choice: 'auto',
  parallel_tool_calls: false,
- max_tokens: GLM_52_MAX_TOKENS_DEFAULT
+ max_tokens: GLM_52_MAX_TOKENS_DEEP,
+ reasoning_effort: 'high'
+ };
+ export const GLM_XHIGH_PROFILE = {
+ ...GLM_DEEP_PROFILE,
+ mode: GLM_XHIGH_MODE,
+ max_tokens: GLM_52_MAX_TOKENS_XHIGH,
+ reasoning_effort: 'xhigh'
+ };
+ export const GLM_STRICT_PROFILE = {
+ ...GLM_DEEP_PROFILE,
+ mode: GLM_STRICT_MODE,
+ structured_outputs: true,
+ response_format: 'json_schema'
  };
+ export const GLM_52_DEFAULT_REQUEST_SETTINGS = GLM_SPEED_PROFILE;
  export function clampGlm52MaxTokens(value) {
  const numeric = Number.isFinite(value) ? Math.floor(Number(value)) : GLM_52_MAX_TOKENS_DEFAULT;
  return Math.max(1, Math.min(numeric, GLM_52_TOP_PROVIDER_MAX_COMPLETION_TOKENS));