@kaitranntt/ccs 3.4.4 → 3.4.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -197,69 +197,98 @@ Commands and skills symlinked from `~/.ccs/shared/` - no duplication across prof
 
 ## GLM with Thinking (GLMT)
 
+ > **[!] WARNING: NOT PRODUCTION READY**
+ >
+ > **GLMT is experimental and requires extensive debugging**:
+ > - Streaming and tool support are still under active development
+ > - May produce unexpected errors, timeouts, or incomplete responses
+ > - Requires frequent debugging and manual intervention
+ > - **Not recommended for critical workflows or production use**
+ >
+ > **Alternative for GLM Thinking**: Consider routing through **Claude Code Router (CCR)** with Bedolla's **ZaiTransformer** for a more stable implementation.
+ >
 > **[!] Important**: GLMT requires npm installation (`npm install -g @kaitranntt/ccs`). Not available in native shell versions (requires Node.js HTTP server).
+ ### Acknowledgments: The Foundation That Made GLMT Possible
+
+ > **[i] Pioneering Work by [@Bedolla](https://github.com/Bedolla)**
+ >
+ > **CCS's GLMT implementation owes its existence to the groundbreaking work of [@Bedolla](https://github.com/Bedolla)**, who created [ZaiTransformer](https://github.com/Bedolla/ZaiTransformer/) - the **first integration** to bridge [Claude Code Router (CCR)](https://github.com/musistudio/claude-code-router) with Z.AI's reasoning capabilities.
+ >
+ > **Why this matters**: Before ZaiTransformer, no one had successfully integrated Z.AI's thinking mode with Claude Code's workflow. Bedolla's work wasn't just helpful - it was **foundational**. His implementation provided:
+ >
+ > - **Request/response transformation architecture** - The conceptual blueprint for bridging Anthropic and OpenAI formats
+ > - **Thinking mode control mechanisms** - The patterns for managing reasoning_content delivery
+ > - **Embedded proxy design** - The architecture that CCS's GLMT proxy is built upon
+ >
+ > These contributions directly inspired and enabled GLMT's design. **Without ZaiTransformer's pioneering work, GLMT wouldn't exist in its current form**. The technical patterns, transformation logic, and proxy architecture implemented in CCS are a direct evolution of the concepts Bedolla first proved viable.
+ >
+ > **Recognition**: If you benefit from GLMT's thinking capabilities, you're benefiting from Bedolla's vision and engineering. Please consider starring [ZaiTransformer](https://github.com/Bedolla/ZaiTransformer/) to support pioneering work in the Claude Code ecosystem.
+
+ ---
+
 ### GLM vs GLMT
 
 | Feature | GLM (`ccs glm`) | GLMT (`ccs glmt`) |
 |---------|-----------------|-------------------|
 | **Endpoint** | Anthropic-compatible | OpenAI-compatible |
- | **Thinking** | No | Yes (reasoning_content) |
- | **Tool Support** | Basic | **Full (v3.5+)** |
- | **MCP Tools** | Limited | **Working (v3.5+)** |
- | **Streaming** | Yes | **Yes (v3.4+)** |
- | **TTFB** | <500ms | <500ms (streaming), 2-10s (buffered) |
- | **Use Case** | Fast responses | Complex reasoning + tools |
+ | **Thinking** | No | Experimental (reasoning_content) |
+ | **Tool Support** | Basic | **Unstable (v3.5+)** |
+ | **MCP Tools** | Limited | **Buggy (v3.5+)** |
+ | **Streaming** | Stable | **Experimental (v3.4+)** |
+ | **TTFB** | <500ms | <500ms (sometimes), 2-10s+ (often) |
+ | **Use Case** | Reliable work | **Debugging experiments only** |
 
 ### Tool Support (v3.5)
 
- **GLMT now fully supports MCP tools and function calling**:
+ **GLMT attempts MCP tools and function calling (EXPERIMENTAL)**:
 
- - **Bidirectional Transformation**: Anthropic tools ↔ OpenAI function calling
- - **MCP Integration**: MCP tools execute correctly (no XML tag output)
- - **Streaming Tool Calls**: Real-time tool calls with input_json deltas
- - **Backward Compatible**: Works seamlessly with existing thinking support
- - **No Configuration**: Tool support works automatically
+ - **Bidirectional Transformation**: Anthropic tools ↔ OpenAI format (unstable)
+ - **MCP Integration**: MCP tools sometimes execute (often emit raw XML tags instead)
+ - **Streaming Tool Calls**: Real-time tool calls (when not crashing)
+ - **Backward Compatible**: May break existing thinking support
+ - **Configuration Required**: Frequent manual debugging needed
 
 ### Streaming Support (v3.4)
 
- **GLMT now supports real-time streaming** with incremental reasoning content delivery.
+ **GLMT attempts real-time streaming** with incremental reasoning content delivery (often fails).
 
- - **Default**: Streaming enabled (TTFB <500ms)
- - **Disable**: Set `CCS_GLMT_STREAMING=disabled` for buffered mode
- - **Force**: Set `CCS_GLMT_STREAMING=force` to override client preferences
- - **Thinking parameter**: Claude CLI `thinking` parameter support
-   - Respects `thinking.type` and `budget_tokens`
-   - Precedence: CLI parameter > message tags > default
+ - **Default**: Streaming enabled (TTFB <500ms when it works)
+ - **Auto-fallback**: Frequently switches to buffered mode due to errors
+ - **Thinking parameter**: Claude CLI `thinking` parameter sometimes works
+   - May ignore `thinking.type` and `budget_tokens`
+   - Precedence: CLI parameter > message tags > default (when not broken)
 
- **Confirmed working**: Z.AI (1498 reasoning chunks tested, tool calls verified)
+ **Partially working**: Z.AI (tested; tool calls frequently break and require constant debugging)
 
- ### How It Works
+ ### How It Works (When It Works)
 
- 1. CCS spawns embedded HTTP proxy on localhost
- 2. Proxy converts Anthropic format → OpenAI format (streaming or buffered)
- 3. Transforms Anthropic tools → OpenAI function calling format
- 4. Forwards to Z.AI with reasoning parameters and tools
- 5. Converts `reasoning_content` → thinking blocks (incremental or complete)
- 6. Converts OpenAI `tool_calls` → Anthropic tool_use blocks
- 7. Thinking and tool calls appear in Claude Code UI in real-time
+ 1. CCS spawns an embedded HTTP proxy on localhost (if not crashing)
+ 2. Proxy attempts to convert Anthropic format → OpenAI format (often fails)
+ 3. Tries to transform Anthropic tools → OpenAI function calling format (buggy)
+ 4. Forwards to Z.AI with reasoning parameters and tools (when not timing out)
+ 5. Attempts to convert `reasoning_content` → thinking blocks (partial or broken)
+ 6. Attempts to convert OpenAI `tool_calls` → Anthropic tool_use blocks (raw XML output is common)
+ 7. Thinking and tool calls sometimes appear in the Claude Code UI
 
- ### Control Tags
+ ### Control Tags & Keywords
 
+ **Control Tags**:
 - `<Thinking:On|Off>` - Enable/disable reasoning blocks (default: On)
 - `<Effort:Low|Medium|High>` - Control reasoning depth (deprecated - Z.AI only supports binary thinking)
 
+ **Thinking Keywords** (inconsistent activation):
+ - `think` - Sometimes enables reasoning (low effort)
+ - `think hard` - Sometimes enables reasoning (medium effort)
+ - `think harder` - Sometimes enables reasoning (high effort)
+ - `ultrathink` - Attempts maximum reasoning depth (often breaks)
+
 ### Environment Variables
 
- **GLMT-specific**:
- - `CCS_GLMT_FORCE_ENGLISH=true` - Force English output (default: true)
- - `CCS_GLMT_THINKING_BUDGET=8192` - Control thinking on/off based on task type
-   - 0 or "unlimited": Always enable thinking
-   - 1-2048: Disable thinking (fast execution)
-   - 2049-8192: Enable for reasoning tasks only (default)
-   - >8192: Always enable thinking
- - `CCS_GLMT_STREAMING=disabled` - Force buffered mode
- - `CCS_GLMT_STREAMING=force` - Force streaming (override client)
+ **GLMT features** (all experimental):
+ - Forced English output enforcement (sometimes works)
+ - Thinking-mode activation (unpredictable)
+ - Attempted streaming with frequent fallback to buffered mode
 
 **General**:
 - `CCS_DEBUG_LOG=1` - Enable debug file logging
@@ -301,10 +330,10 @@ ccs glmt --verbose "your prompt"
 # Logs: ~/.ccs/logs/
 ```
 
- **Check streaming mode**:
+ **GLMT debugging**:
 ```bash
- # Disable streaming for debugging
- CCS_GLMT_STREAMING=disabled ccs glmt "test"
+ # Verbose logging shows streaming status and reasoning details
+ ccs glmt --verbose "test"
 ```
 
 **Check reasoning content**:
package/VERSION CHANGED
@@ -1 +1 @@
- 3.4.4
+ 3.4.6
@@ -100,22 +100,32 @@ class DeltaAccumulator {
   */
  addDelta(delta) {
    const block = this.getCurrentBlock();
-   if (block) {
-     if (block.type === 'thinking') {
-       // C-02 Fix: Enforce buffer size limit
-       if (this.thinkingBuffer.length + delta.length > this.maxBufferSize) {
-         throw new Error(`Thinking buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
-       }
-       this.thinkingBuffer += delta;
-       block.content = this.thinkingBuffer;
-     } else if (block.type === 'text') {
-       // C-02 Fix: Enforce buffer size limit
-       if (this.textBuffer.length + delta.length > this.maxBufferSize) {
-         throw new Error(`Text buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
-       }
-       this.textBuffer += delta;
-       block.content = this.textBuffer;
+   if (!block) {
+     // FIX: Guard against null block (should never happen, but defensive)
+     console.error('[DeltaAccumulator] ERROR: addDelta called with no current block');
+     return;
+   }
+
+   if (block.type === 'thinking') {
+     // C-02 Fix: Enforce buffer size limit
+     if (this.thinkingBuffer.length + delta.length > this.maxBufferSize) {
+       throw new Error(`Thinking buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
      }
+     this.thinkingBuffer += delta;
+     block.content = this.thinkingBuffer;
+
+     // FIX: Verify assignment succeeded (paranoid check for race conditions)
+     if (block.content.length !== this.thinkingBuffer.length) {
+       console.error('[DeltaAccumulator] ERROR: Block content assignment failed');
+       console.error(`Expected: ${this.thinkingBuffer.length}, Got: ${block.content.length}`);
+     }
+   } else if (block.type === 'text') {
+     // C-02 Fix: Enforce buffer size limit
+     if (this.textBuffer.length + delta.length > this.maxBufferSize) {
+       throw new Error(`Text buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
+     }
+     this.textBuffer += delta;
+     block.content = this.textBuffer;
    }
  }
 
@@ -126,6 +136,11 @@ class DeltaAccumulator {
    const block = this.getCurrentBlock();
    if (block) {
      block.stopped = true;
+
+     // FIX: Log block closure for debugging (helps diagnose timing issues)
+     if (block.type === 'thinking' && process.env.CCS_DEBUG === '1') {
+       console.error(`[DeltaAccumulator] Stopped thinking block ${block.index}: ${block.content?.length || 0} chars`);
+     }
    }
  }
 
@@ -8,6 +8,7 @@ const os = require('os');
 const SSEParser = require('./sse-parser');
 const DeltaAccumulator = require('./delta-accumulator');
 const LocaleEnforcer = require('./locale-enforcer');
+ const ReasoningEnforcer = require('./reasoning-enforcer');
 
 /**
  * GlmtTransformer - Convert between Anthropic and OpenAI formats with thinking and tool support
@@ -17,7 +18,7 @@ const LocaleEnforcer = require('./locale-enforcer');
  * - Response: OpenAI reasoning_content → Anthropic thinking blocks
  * - Tool Support: Anthropic tools ↔ OpenAI function calling (bidirectional)
  * - Streaming: Real-time tool calls with input_json deltas
- * - Debug mode: Log raw data to ~/.ccs/logs/ (CCS_DEBUG_LOG=1)
+ * - Debug mode: Log raw data to ~/.ccs/logs/ (CCS_DEBUG=1)
  * - Verbose mode: Console logging with timestamps
  * - Validation: Self-test transformation results
  *
@@ -37,16 +38,10 @@ class GlmtTransformer {
    this.defaultThinking = config.defaultThinking ?? true;
    this.verbose = config.verbose || false;
 
-   // Support both CCS_DEBUG and CCS_DEBUG_LOG (with deprecation warning)
-   const oldVar = process.env.CCS_DEBUG_LOG === '1';
-   const newVar = process.env.CCS_DEBUG === '1';
-   this.debugLog = config.debugLog ?? (newVar || oldVar);
-
-   // Show deprecation warning once
-   if (oldVar && !newVar && !GlmtTransformer._warnedDeprecation) {
-     console.warn('[glmt] Warning: CCS_DEBUG_LOG is deprecated, use CCS_DEBUG instead');
-     GlmtTransformer._warnedDeprecation = true;
-   }
+   // CCS_DEBUG controls all debug logging (file + console)
+   const debugEnabled = process.env.CCS_DEBUG === '1';
+   this.debugLog = config.debugLog ?? debugEnabled;
+   this.debugMode = config.debugMode ?? debugEnabled;
 
    this.debugLogDir = config.debugLogDir || path.join(os.homedir(), '.ccs', 'logs');
    this.modelMaxTokens = {
@@ -60,6 +55,11 @@ class GlmtTransformer {
 
    // Initialize locale enforcer (always enforce English)
    this.localeEnforcer = new LocaleEnforcer();
+
+   // Initialize reasoning enforcer (enabled by default for all GLMT usage)
+   this.reasoningEnforcer = new ReasoningEnforcer({
+     enabled: config.explicitReasoning ?? true
+   });
  }
 
  /**
@@ -110,10 +110,16 @@ class GlmtTransformer {
      anthropicRequest.messages || []
    );
 
+   // 4.5. Inject reasoning instruction (if enabled or thinking requested)
+   const messagesWithReasoning = this.reasoningEnforcer.injectInstruction(
+     messagesWithLocale,
+     thinkingConfig
+   );
+
    // 5. Convert to OpenAI format
    const openaiRequest = {
      model: glmModel,
-     messages: this._sanitizeMessages(messagesWithLocale),
+     messages: this._sanitizeMessages(messagesWithReasoning),
      max_tokens: this._getMaxTokens(glmModel),
      stream: anthropicRequest.stream ?? false
    };
@@ -645,10 +651,20 @@ class GlmtTransformer {
    if (delta.reasoning_content) {
      const currentBlock = accumulator.getCurrentBlock();
 
+     // FIX: Enhanced debug logging for thinking block diagnostics
+     if (this.debugMode) {
+       console.error(`[GLMT-DEBUG] Reasoning delta: ${delta.reasoning_content.length} chars`);
+       console.error(`[GLMT-DEBUG] Current block: ${currentBlock?.type || 'none'}, index: ${currentBlock?.index ?? 'N/A'}`);
+     }
+
      if (!currentBlock || currentBlock.type !== 'thinking') {
        // Start thinking block
        const block = accumulator.startBlock('thinking');
        events.push(this._createContentBlockStartEvent(block));
+
+       if (this.debugMode) {
+         console.error(`[GLMT-DEBUG] Started new thinking block ${block.index}`);
+       }
      }
 
      accumulator.addDelta(delta.reasoning_content);
@@ -664,7 +680,10 @@ class GlmtTransformer {
 
    // Close thinking block if transitioning from thinking to text
    if (currentBlock && currentBlock.type === 'thinking' && !currentBlock.stopped) {
-     events.push(this._createSignatureDeltaEvent(currentBlock));
+     const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+     if (signatureEvent) { // FIX: Handle null return from signature race guard
+       events.push(signatureEvent);
+     }
      events.push(this._createContentBlockStopEvent(currentBlock));
      accumulator.stopCurrentBlock();
    }
@@ -691,7 +710,10 @@ class GlmtTransformer {
    const currentBlock = accumulator.getCurrentBlock();
    if (currentBlock && !currentBlock.stopped) {
      if (currentBlock.type === 'thinking') {
-       events.push(this._createSignatureDeltaEvent(currentBlock));
+       const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+       if (signatureEvent) { // FIX: Handle null return from signature race guard
+         events.push(signatureEvent);
+       }
      }
      events.push(this._createContentBlockStopEvent(currentBlock));
      accumulator.stopCurrentBlock();
@@ -794,7 +816,10 @@ class GlmtTransformer {
    const currentBlock = accumulator.getCurrentBlock();
    if (currentBlock && !currentBlock.stopped) {
      if (currentBlock.type === 'thinking') {
-       events.push(this._createSignatureDeltaEvent(currentBlock));
+       const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+       if (signatureEvent) { // FIX: Handle null return from signature race guard
+         events.push(signatureEvent);
+       }
      }
      events.push(this._createContentBlockStopEvent(currentBlock));
      accumulator.stopCurrentBlock();
@@ -914,7 +939,23 @@
   * @private
   */
  _createSignatureDeltaEvent(block) {
+   // FIX: Guard against empty content (signature timing race)
+   // In streaming mode, signature may be requested before content fully accumulated
+   if (!block.content || block.content.length === 0) {
+     if (this.verbose) {
+       this.log(`WARNING: Skipping signature for empty thinking block ${block.index}`);
+       this.log(`This indicates a race condition - signature requested before content accumulated`);
+     }
+     return null; // Return null instead of event
+   }
+
    const signature = this._generateThinkingSignature(block.content);
+
+   // Enhanced logging for debugging
+   if (this.verbose) {
+     this.log(`Generating signature for block ${block.index}: ${block.content.length} chars`);
+   }
+
    return {
      event: 'content_block_delta',
      data: {
@@ -0,0 +1,173 @@
+ #!/usr/bin/env node
+ 'use strict';
+
+ /**
+  * ReasoningEnforcer - Inject explicit reasoning instructions into prompts
+  *
+  * Purpose: Force GLM models to use structured reasoning output format (<reasoning_content>)
+  * This complements API parameters (reasoning: true) with explicit prompt instructions.
+  *
+  * Usage:
+  *   const enforcer = new ReasoningEnforcer({ enabled: true });
+  *   const modifiedMessages = enforcer.injectInstruction(messages, thinkingConfig);
+  *
+  * Strategy:
+  *   1. If system prompt exists: Prepend reasoning instruction
+  *   2. If no system prompt: Prepend to first user message
+  *   3. Select prompt template based on effort level (low/medium/high/max)
+  *   4. Preserve message structure (string vs array content)
+  */
+
+ class ReasoningEnforcer {
+   constructor(options = {}) {
+     this.enabled = options.enabled ?? false; // Opt-in by default
+     this.prompts = options.prompts || this._getDefaultPrompts();
+   }
+
+   /**
+    * Inject reasoning instruction into messages
+    * @param {Array} messages - Messages array to modify
+    * @param {Object} thinkingConfig - { thinking: boolean, effort: string }
+    * @returns {Array} Modified messages array
+    */
+   injectInstruction(messages, thinkingConfig = {}) {
+     // Only inject if enabled or thinking explicitly requested
+     if (!this.enabled && !thinkingConfig.thinking) {
+       return messages;
+     }
+
+     // Clone messages to avoid mutation
+     const modifiedMessages = JSON.parse(JSON.stringify(messages));
+
+     // Select prompt based on effort level
+     const prompt = this._selectPrompt(thinkingConfig.effort || 'medium');
+
+     // Strategy 1: Inject into system prompt (preferred)
+     const systemIndex = modifiedMessages.findIndex(m => m.role === 'system');
+     if (systemIndex >= 0) {
+       const systemMsg = modifiedMessages[systemIndex];
+
+       if (typeof systemMsg.content === 'string') {
+         systemMsg.content = `${prompt}\n\n${systemMsg.content}`;
+       } else if (Array.isArray(systemMsg.content)) {
+         systemMsg.content.unshift({
+           type: 'text',
+           text: prompt
+         });
+       }
+
+       return modifiedMessages;
+     }
+
+     // Strategy 2: Prepend to first user message
+     const userIndex = modifiedMessages.findIndex(m => m.role === 'user');
+     if (userIndex >= 0) {
+       const userMsg = modifiedMessages[userIndex];
+
+       if (typeof userMsg.content === 'string') {
+         userMsg.content = `${prompt}\n\n${userMsg.content}`;
+       } else if (Array.isArray(userMsg.content)) {
+         userMsg.content.unshift({
+           type: 'text',
+           text: prompt
+         });
+       }
+
+       return modifiedMessages;
+     }
+
+     // No system or user messages found (edge case)
+     return modifiedMessages;
+   }
+
+   /**
+    * Select prompt template based on effort level
+    * @param {string} effort - 'low', 'medium', 'high', or 'max'
+    * @returns {string} Prompt template
+    * @private
+    */
+   _selectPrompt(effort) {
+     const normalizedEffort = effort.toLowerCase();
+     return this.prompts[normalizedEffort] || this.prompts.medium;
+   }
+
+   /**
+    * Get default prompt templates
+    * @returns {Object} Map of effort levels to prompts
+    * @private
+    */
+   _getDefaultPrompts() {
+     return {
+       low: `You are an expert reasoning model using GLM-4.6 architecture.
+
+ CRITICAL: Before answering, write 2-3 sentences of reasoning in <reasoning_content> tags.
+
+ OUTPUT FORMAT:
+ <reasoning_content>
+ (Brief analysis: what is the problem? what's the approach?)
+ </reasoning_content>
+
+ (Write your final answer here)`,
+
+       medium: `You are an expert reasoning model using GLM-4.6 architecture.
+
+ CRITICAL REQUIREMENTS:
+ 1. Always think step-by-step before answering
+ 2. Write your reasoning process explicitly in <reasoning_content> tags
+ 3. Never skip your chain of thought, even for simple problems
+
+ OUTPUT FORMAT:
+ <reasoning_content>
+ (Write your detailed thinking here: analyze the problem, explore approaches,
+ evaluate trade-offs, and arrive at a conclusion)
+ </reasoning_content>
+
+ (Write your final answer here based on your reasoning above)`,
+
+       high: `You are an expert reasoning model using GLM-4.6 architecture.
+
+ CRITICAL REQUIREMENTS:
+ 1. Think deeply and systematically before answering
+ 2. Write comprehensive reasoning in <reasoning_content> tags
+ 3. Explore multiple approaches and evaluate trade-offs
+ 4. Show all steps in your problem-solving process
+
+ OUTPUT FORMAT:
+ <reasoning_content>
+ (Write exhaustive analysis here:
+ - Problem decomposition
+ - Multiple approach exploration
+ - Trade-off analysis for each approach
+ - Edge case consideration
+ - Final conclusion with justification)
+ </reasoning_content>
+
+ (Write your final answer here based on your systematic reasoning above)`,
+
+       max: `You are an expert reasoning model using GLM-4.6 architecture.
+
+ CRITICAL REQUIREMENTS:
+ 1. Think exhaustively from first principles
+ 2. Write extremely detailed reasoning in <reasoning_content> tags
+ 3. Analyze ALL possible angles, approaches, and edge cases
+ 4. Challenge your own assumptions and explore alternatives
+ 5. Provide rigorous justification for every claim
+
+ OUTPUT FORMAT:
+ <reasoning_content>
+ (Write comprehensive analysis here:
+ - First principles breakdown
+ - Exhaustive approach enumeration
+ - Comparative analysis of all approaches
+ - Edge case and failure mode analysis
+ - Assumption validation
+ - Counter-argument consideration
+ - Final conclusion with rigorous justification)
+ </reasoning_content>
+
+ (Write your final answer here based on your exhaustive reasoning above)`
+     };
+   }
+ }
+
+ module.exports = ReasoningEnforcer;
package/lib/ccs CHANGED
@@ -2,7 +2,7 @@
 set -euo pipefail
 
 # Version (updated by scripts/bump-version.sh)
- CCS_VERSION="3.4.3"
+ CCS_VERSION="3.4.6"
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 readonly CONFIG_FILE="${CCS_CONFIG:-$HOME/.ccs/config.json}"
 readonly PROFILES_JSON="$HOME/.ccs/profiles.json"
package/lib/ccs.ps1 CHANGED
@@ -12,7 +12,7 @@ param(
 $ErrorActionPreference = "Stop"
 
 # Version (updated by scripts/bump-version.sh)
- $CcsVersion = "3.4.3"
+ $CcsVersion = "3.4.6"
 $ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
 $ConfigFile = if ($env:CCS_CONFIG) { $env:CCS_CONFIG } else { "$env:USERPROFILE\.ccs\config.json" }
 $ProfilesJson = "$env:USERPROFILE\.ccs\profiles.json"
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@kaitranntt/ccs",
-  "version": "3.4.4",
+  "version": "3.4.6",
   "description": "Claude Code Switch - Instant profile switching between Claude Sonnet 4.5 and GLM 4.6",
   "keywords": [
     "cli",