@kaitranntt/ccs 3.4.4 → 3.4.6
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +70 -41
- package/VERSION +1 -1
- package/bin/glmt/delta-accumulator.js +30 -15
- package/bin/glmt/glmt-transformer.js +56 -15
- package/bin/glmt/reasoning-enforcer.js +173 -0
- package/lib/ccs +1 -1
- package/lib/ccs.ps1 +1 -1
- package/package.json +1 -1
package/README.md
CHANGED

@@ -197,69 +197,98 @@ Commands and skills symlinked from `~/.ccs/shared/` - no duplication across profiles
 
 ## GLM with Thinking (GLMT)
 
+> **[!] WARNING: NOT PRODUCTION READY**
+>
+> **GLMT is experimental and requires extensive debugging**:
+> - Streaming and tool support still under active development
+> - May experience unexpected errors, timeouts, or incomplete responses
+> - Requires frequent debugging and manual intervention
+> - **Not recommended for critical workflows or production use**
+>
+> **Alternative for GLM Thinking**: Consider going through the **CCR hustle** with the **Transformer of Bedolla** (ZaiTransformer) for a more stable implementation.
+>
 > **[!] Important**: GLMT requires npm installation (`npm install -g @kaitranntt/ccs`). Not available in native shell versions (requires Node.js HTTP server).
 
+### Acknowledgments: The Foundation That Made GLMT Possible
+
+> **[i] Pioneering Work by [@Bedolla](https://github.com/Bedolla)**
+>
+> **CCS's GLMT implementation owes its existence to the groundbreaking work of [@Bedolla](https://github.com/Bedolla)**, who created [ZaiTransformer](https://github.com/Bedolla/ZaiTransformer/) - the **first integration** to bridge [Claude Code Router (CCR)](https://github.com/musistudio/claude-code-router) with Z.AI's reasoning capabilities.
+>
+> **Why this matters**: Before ZaiTransformer, no one had successfully integrated Z.AI's thinking mode with Claude Code's workflow. Bedolla's work wasn't just helpful - it was **foundational**. His implementation of:
+>
+> - **Request/response transformation architecture** - The conceptual blueprint for how to bridge Anthropic and OpenAI formats
+> - **Thinking mode control mechanisms** - The patterns for managing reasoning_content delivery
+> - **Embedded proxy design** - The architecture that CCS's GLMT proxy is built upon
+>
+> These contributions directly inspired and enabled GLMT's design. **Without ZaiTransformer's pioneering work, GLMT wouldn't exist in its current form**. The technical patterns, transformation logic, and proxy architecture implemented in CCS are a direct evolution of the concepts Bedolla first proved viable.
+>
+> **Recognition**: If you benefit from GLMT's thinking capabilities, you're benefiting from Bedolla's vision and engineering. Please consider starring [ZaiTransformer](https://github.com/Bedolla/ZaiTransformer/) to support pioneering work in the Claude Code ecosystem.
+
+---
+
 ### GLM vs GLMT
 
 | Feature | GLM (`ccs glm`) | GLMT (`ccs glmt`) |
 |---------|-----------------|-------------------|
 | **Endpoint** | Anthropic-compatible | OpenAI-compatible |
-| **Thinking** | No |
-| **Tool Support** | Basic | **
-| **MCP Tools** | Limited | **
-| **Streaming** |
-| **TTFB** | <500ms | <500ms (
-| **Use Case** |
+| **Thinking** | No | Experimental (reasoning_content) |
+| **Tool Support** | Basic | **Unstable (v3.5+)** |
+| **MCP Tools** | Limited | **Buggy (v3.5+)** |
+| **Streaming** | Stable | **Experimental (v3.4+)** |
+| **TTFB** | <500ms | <500ms (sometimes), 2-10s+ (often) |
+| **Use Case** | Reliable work | **Debugging experiments only** |
 
 ### Tool Support (v3.5)
 
-**GLMT
+**GLMT attempts MCP tools and function calling (EXPERIMENTAL)**:
 
-- **Bidirectional Transformation**: Anthropic tools ↔ OpenAI
-- **MCP Integration**: MCP tools execute
-- **Streaming Tool Calls**: Real-time tool calls
-- **Backward Compatible**:
-- **
+- **Bidirectional Transformation**: Anthropic tools ↔ OpenAI format (unstable)
+- **MCP Integration**: MCP tools sometimes execute (often output XML garbage)
+- **Streaming Tool Calls**: Real-time tool calls (when not crashing)
+- **Backward Compatible**: May break existing thinking support
+- **Configuration Required**: Frequent manual debugging needed
 
 ### Streaming Support (v3.4)
 
-**GLMT
+**GLMT attempts real-time streaming** with incremental reasoning content delivery (OFTEN FAILS).
 
-- **Default**: Streaming enabled (TTFB <500ms)
-- **
-- **
-
-
-- Precedence: CLI parameter > message tags > default
+- **Default**: Streaming enabled (TTFB <500ms when it works)
+- **Auto-fallback**: Frequently switches to buffered mode due to errors
+- **Thinking parameter**: Claude CLI `thinking` parameter sometimes works
+  - May ignore `thinking.type` and `budget_tokens`
+- Precedence: CLI parameter > message tags > default (when not broken)
 
-**
+**Barely working**: Z.AI (tested, tool calls frequently break, requires constant debugging)
 
-### How It Works
+### How It Works (When It Works)
 
-1. CCS spawns embedded HTTP proxy on localhost
-2. Proxy
-3.
-4. Forwards to Z.AI with reasoning parameters and tools
-5.
-6.
-7. Thinking and tool calls appear in Claude Code UI
+1. CCS spawns embedded HTTP proxy on localhost (if not crashing)
+2. Proxy attempts to convert Anthropic format → OpenAI format (often fails)
+3. Tries to transform Anthropic tools → OpenAI function calling format (buggy)
+4. Forwards to Z.AI with reasoning parameters and tools (when not timing out)
+5. Attempts to convert `reasoning_content` → thinking blocks (partial or broken)
+6. Attempts to convert OpenAI `tool_calls` → Anthropic tool_use blocks (XML garbage common)
+7. Thinking and tool calls sometimes appear in Claude Code UI (when not broken)
 
-### Control Tags
+### Control Tags & Keywords
 
+**Control Tags**:
 - `<Thinking:On|Off>` - Enable/disable reasoning blocks (default: On)
 - `<Effort:Low|Medium|High>` - Control reasoning depth (deprecated - Z.AI only supports binary thinking)
 
+**Thinking Keywords** (inconsistent activation):
+- `think` - Sometimes enables reasoning (low effort)
+- `think hard` - Sometimes enables reasoning (medium effort)
+- `think harder` - Sometimes enables reasoning (high effort)
+- `ultrathink` - Attempts maximum reasoning depth (often breaks)
+
 ### Environment Variables
 
-**GLMT
-
-
-
-  - 1-2048: Disable thinking (fast execution)
-  - 2049-8192: Enable for reasoning tasks only (default)
-  - >8192: Always enable thinking
-- `CCS_GLMT_STREAMING=disabled` - Force buffered mode
-- `CCS_GLMT_STREAMING=force` - Force streaming (override client)
+**GLMT features** (all experimental):
+- Forced English output enforcement (sometimes works)
+- Random thinking mode activation (unpredictable)
+- Attempted streaming with frequent fallback to buffered mode
 
 **General**:
 - `CCS_DEBUG_LOG=1` - Enable debug file logging
@@ -301,10 +330,10 @@ ccs glmt --verbose "your prompt"
 # Logs: ~/.ccs/logs/
 ```
 
-**
+**GLMT debugging**:
 ```bash
-#
-
+# Verbose logging shows streaming status and reasoning details
+ccs glmt --verbose "test"
 ```
 
 **Check reasoning content**:
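The request-conversion step in "How It Works" (step 2, Anthropic format → OpenAI format) can be sketched as follows. This is an illustrative reduction, not the package's actual `glmt-transformer.js` code; in particular, flattening array content to plain text and the `1024` default for `max_tokens` are assumptions made for the sketch.

```javascript
// Hedged sketch: flatten an Anthropic-style request into an
// OpenAI chat-completions shape.
function anthropicToOpenAI(req) {
  const messages = [];
  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message.
  if (req.system) messages.push({ role: 'system', content: req.system });
  for (const m of req.messages || []) {
    // Content may be a string or an array of typed blocks.
    const text = Array.isArray(m.content)
      ? m.content.filter((c) => c.type === 'text').map((c) => c.text).join('\n')
      : m.content;
    messages.push({ role: m.role, content: text });
  }
  return {
    model: req.model,
    messages,
    max_tokens: req.max_tokens ?? 1024, // illustrative default
    stream: req.stream ?? false
  };
}

const out = anthropicToOpenAI({
  model: 'glm-4.6',
  system: 'Answer in English.',
  messages: [{ role: 'user', content: [{ type: 'text', text: 'hi' }] }]
});
console.log(out.messages.length); // 2
console.log(out.messages[1].content); // 'hi'
```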
package/VERSION
CHANGED

@@ -1 +1 @@
-3.4.4
+3.4.6
package/bin/glmt/delta-accumulator.js
CHANGED

@@ -100,22 +100,32 @@ class DeltaAccumulator {
    */
   addDelta(delta) {
     const block = this.getCurrentBlock();
-    if (block) {
-
-
-
-
-
-
-
-
-
-      if (this.textBuffer.length + delta.length > this.maxBufferSize) {
-        throw new Error(`Text buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
-      }
-      this.textBuffer += delta;
-      block.content = this.textBuffer;
+    if (!block) {
+      // FIX: Guard against null block (should never happen, but defensive)
+      console.error('[DeltaAccumulator] ERROR: addDelta called with no current block');
+      return;
+    }
+
+    if (block.type === 'thinking') {
+      // C-02 Fix: Enforce buffer size limit
+      if (this.thinkingBuffer.length + delta.length > this.maxBufferSize) {
+        throw new Error(`Thinking buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
       }
+      this.thinkingBuffer += delta;
+      block.content = this.thinkingBuffer;
+
+      // FIX: Verify assignment succeeded (paranoid check for race conditions)
+      if (block.content.length !== this.thinkingBuffer.length) {
+        console.error('[DeltaAccumulator] ERROR: Block content assignment failed');
+        console.error(`Expected: ${this.thinkingBuffer.length}, Got: ${block.content.length}`);
+      }
+    } else if (block.type === 'text') {
+      // C-02 Fix: Enforce buffer size limit
+      if (this.textBuffer.length + delta.length > this.maxBufferSize) {
+        throw new Error(`Text buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
+      }
+      this.textBuffer += delta;
+      block.content = this.textBuffer;
     }
   }
 
@@ -126,6 +136,11 @@ class DeltaAccumulator {
     const block = this.getCurrentBlock();
     if (block) {
       block.stopped = true;
+
+      // FIX: Log block closure for debugging (helps diagnose timing issues)
+      if (block.type === 'thinking' && process.env.CCS_DEBUG === '1') {
+        console.error(`[DeltaAccumulator] Stopped thinking block ${block.index}: ${block.content?.length || 0} chars`);
+      }
     }
   }
 
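The hunk above splits accumulation into per-type buffers, each capped to defend against unbounded deltas. The guarded-accumulation pattern can be exercised in miniature; the `MiniAccumulator` class below is an illustrative standalone sketch, not the package's `DeltaAccumulator` (which tracks its own current block and more state).

```javascript
// Illustrative sketch of the guarded accumulation pattern:
// separate buffer per block type, each capped at maxBufferSize.
class MiniAccumulator {
  constructor(maxBufferSize = 64) {
    this.maxBufferSize = maxBufferSize;
    this.thinkingBuffer = '';
    this.textBuffer = '';
  }

  addDelta(block, delta) {
    if (!block) return; // defensive guard, mirrors the null-block fix
    const key = block.type === 'thinking' ? 'thinkingBuffer' : 'textBuffer';
    if (this[key].length + delta.length > this.maxBufferSize) {
      throw new Error(`${block.type} buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
    }
    this[key] += delta;
    block.content = this[key];
  }
}

const acc = new MiniAccumulator(16);
const block = { type: 'thinking', content: '' };
acc.addDelta(block, 'step 1; ');
acc.addDelta(block, 'step 2;');
console.log(block.content); // 'step 1; step 2;'

let threw = false;
try { acc.addDelta(block, 'x'.repeat(32)); } catch (e) { threw = true; }
console.log(threw); // true
```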
package/bin/glmt/glmt-transformer.js
CHANGED

@@ -8,6 +8,7 @@ const os = require('os');
 const SSEParser = require('./sse-parser');
 const DeltaAccumulator = require('./delta-accumulator');
 const LocaleEnforcer = require('./locale-enforcer');
+const ReasoningEnforcer = require('./reasoning-enforcer');
 
 /**
  * GlmtTransformer - Convert between Anthropic and OpenAI formats with thinking and tool support
@@ -17,7 +18,7 @@ const LocaleEnforcer = require('./locale-enforcer');
  * - Response: OpenAI reasoning_content → Anthropic thinking blocks
  * - Tool Support: Anthropic tools ↔ OpenAI function calling (bidirectional)
  * - Streaming: Real-time tool calls with input_json deltas
- * - Debug mode: Log raw data to ~/.ccs/logs/ (
+ * - Debug mode: Log raw data to ~/.ccs/logs/ (CCS_DEBUG=1)
  * - Verbose mode: Console logging with timestamps
  * - Validation: Self-test transformation results
  *
@@ -37,16 +38,10 @@ class GlmtTransformer {
     this.defaultThinking = config.defaultThinking ?? true;
     this.verbose = config.verbose || false;
 
-    //
-    const
-
-    this.
-
-    // Show deprecation warning once
-    if (oldVar && !newVar && !GlmtTransformer._warnedDeprecation) {
-      console.warn('[glmt] Warning: CCS_DEBUG_LOG is deprecated, use CCS_DEBUG instead');
-      GlmtTransformer._warnedDeprecation = true;
-    }
+    // CCS_DEBUG controls all debug logging (file + console)
+    const debugEnabled = process.env.CCS_DEBUG === '1';
+    this.debugLog = config.debugLog ?? debugEnabled;
+    this.debugMode = config.debugMode ?? debugEnabled;
 
     this.debugLogDir = config.debugLogDir || path.join(os.homedir(), '.ccs', 'logs');
     this.modelMaxTokens = {
@@ -60,6 +55,11 @@ class GlmtTransformer {
 
     // Initialize locale enforcer (always enforce English)
     this.localeEnforcer = new LocaleEnforcer();
+
+    // Initialize reasoning enforcer (enabled by default for all GLMT usage)
+    this.reasoningEnforcer = new ReasoningEnforcer({
+      enabled: config.explicitReasoning ?? true
+    });
   }
 
   /**
@@ -110,10 +110,16 @@ class GlmtTransformer {
       anthropicRequest.messages || []
     );
 
+    // 4.5. Inject reasoning instruction (if enabled or thinking requested)
+    const messagesWithReasoning = this.reasoningEnforcer.injectInstruction(
+      messagesWithLocale,
+      thinkingConfig
+    );
+
     // 5. Convert to OpenAI format
     const openaiRequest = {
       model: glmModel,
-      messages: this._sanitizeMessages(
+      messages: this._sanitizeMessages(messagesWithReasoning),
       max_tokens: this._getMaxTokens(glmModel),
       stream: anthropicRequest.stream ?? false
     };
@@ -645,10 +651,20 @@ class GlmtTransformer {
     if (delta.reasoning_content) {
       const currentBlock = accumulator.getCurrentBlock();
 
+      // FIX: Enhanced debug logging for thinking block diagnostics
+      if (this.debugMode) {
+        console.error(`[GLMT-DEBUG] Reasoning delta: ${delta.reasoning_content.length} chars`);
+        console.error(`[GLMT-DEBUG] Current block: ${currentBlock?.type || 'none'}, index: ${currentBlock?.index ?? 'N/A'}`);
+      }
+
       if (!currentBlock || currentBlock.type !== 'thinking') {
         // Start thinking block
         const block = accumulator.startBlock('thinking');
         events.push(this._createContentBlockStartEvent(block));
+
+        if (this.debugMode) {
+          console.error(`[GLMT-DEBUG] Started new thinking block ${block.index}`);
+        }
       }
 
       accumulator.addDelta(delta.reasoning_content);
@@ -664,7 +680,10 @@ class GlmtTransformer {
 
       // Close thinking block if transitioning from thinking to text
       if (currentBlock && currentBlock.type === 'thinking' && !currentBlock.stopped) {
-
+        const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+        if (signatureEvent) { // FIX: Handle null return from signature race guard
+          events.push(signatureEvent);
+        }
         events.push(this._createContentBlockStopEvent(currentBlock));
         accumulator.stopCurrentBlock();
       }
@@ -691,7 +710,10 @@ class GlmtTransformer {
     const currentBlock = accumulator.getCurrentBlock();
     if (currentBlock && !currentBlock.stopped) {
       if (currentBlock.type === 'thinking') {
-
+        const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+        if (signatureEvent) { // FIX: Handle null return from signature race guard
+          events.push(signatureEvent);
+        }
       }
       events.push(this._createContentBlockStopEvent(currentBlock));
       accumulator.stopCurrentBlock();
@@ -794,7 +816,10 @@ class GlmtTransformer {
     const currentBlock = accumulator.getCurrentBlock();
     if (currentBlock && !currentBlock.stopped) {
       if (currentBlock.type === 'thinking') {
-
+        const signatureEvent = this._createSignatureDeltaEvent(currentBlock);
+        if (signatureEvent) { // FIX: Handle null return from signature race guard
+          events.push(signatureEvent);
+        }
       }
       events.push(this._createContentBlockStopEvent(currentBlock));
       accumulator.stopCurrentBlock();
@@ -914,7 +939,23 @@ class GlmtTransformer {
    * @private
    */
   _createSignatureDeltaEvent(block) {
+    // FIX: Guard against empty content (signature timing race)
+    // In streaming mode, signature may be requested before content fully accumulated
+    if (!block.content || block.content.length === 0) {
+      if (this.verbose) {
+        this.log(`WARNING: Skipping signature for empty thinking block ${block.index}`);
+        this.log(`This indicates a race condition - signature requested before content accumulated`);
+      }
+      return null; // Return null instead of event
+    }
+
     const signature = this._generateThinkingSignature(block.content);
+
+    // Enhanced logging for debugging
+    if (this.verbose) {
+      this.log(`Generating signature for block ${block.index}: ${block.content.length} chars`);
+    }
+
     return {
       event: 'content_block_delta',
       data: {
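The recurring fix in these hunks is a null-guard protocol: the event builder returns `null` when a thinking block has no accumulated content yet, and every caller must check before pushing. The pattern can be sketched standalone; the event shape here is simplified (the real event carries a signature computed by `_generateThinkingSignature`, not a `content_length` field).

```javascript
// Sketch of the signature race guard: builders may return null for
// empty blocks, and callers skip null events instead of pushing them.
function createSignatureDeltaEvent(block) {
  if (!block.content || block.content.length === 0) {
    return null; // race: signature requested before content accumulated
  }
  return {
    event: 'content_block_delta',
    data: { index: block.index, content_length: block.content.length }
  };
}

const events = [];
const blocks = [
  { index: 0, content: '' },          // empty: guard returns null
  { index: 1, content: 'reasoned' }   // non-empty: event emitted
];
for (const block of blocks) {
  const ev = createSignatureDeltaEvent(block);
  if (ev) events.push(ev); // only push non-null events
}
console.log(events.length); // 1
```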
package/bin/glmt/reasoning-enforcer.js
ADDED

@@ -0,0 +1,173 @@
+#!/usr/bin/env node
+'use strict';
+
+/**
+ * ReasoningEnforcer - Inject explicit reasoning instructions into prompts
+ *
+ * Purpose: Force GLM models to use structured reasoning output format (<reasoning_content>)
+ * This complements API parameters (reasoning: true) with explicit prompt instructions.
+ *
+ * Usage:
+ *   const enforcer = new ReasoningEnforcer({ enabled: true });
+ *   const modifiedMessages = enforcer.injectInstruction(messages, thinkingConfig);
+ *
+ * Strategy:
+ * 1. If system prompt exists: Prepend reasoning instruction
+ * 2. If no system prompt: Prepend to first user message
+ * 3. Select prompt template based on effort level (low/medium/high/max)
+ * 4. Preserve message structure (string vs array content)
+ */
+
+class ReasoningEnforcer {
+  constructor(options = {}) {
+    this.enabled = options.enabled ?? false; // Opt-in by default
+    this.prompts = options.prompts || this._getDefaultPrompts();
+  }
+
+  /**
+   * Inject reasoning instruction into messages
+   * @param {Array} messages - Messages array to modify
+   * @param {Object} thinkingConfig - { thinking: boolean, effort: string }
+   * @returns {Array} Modified messages array
+   */
+  injectInstruction(messages, thinkingConfig = {}) {
+    // Only inject if enabled or thinking explicitly requested
+    if (!this.enabled && !thinkingConfig.thinking) {
+      return messages;
+    }
+
+    // Clone messages to avoid mutation
+    const modifiedMessages = JSON.parse(JSON.stringify(messages));
+
+    // Select prompt based on effort level
+    const prompt = this._selectPrompt(thinkingConfig.effort || 'medium');
+
+    // Strategy 1: Inject into system prompt (preferred)
+    const systemIndex = modifiedMessages.findIndex(m => m.role === 'system');
+    if (systemIndex >= 0) {
+      const systemMsg = modifiedMessages[systemIndex];
+
+      if (typeof systemMsg.content === 'string') {
+        systemMsg.content = `${prompt}\n\n${systemMsg.content}`;
+      } else if (Array.isArray(systemMsg.content)) {
+        systemMsg.content.unshift({
+          type: 'text',
+          text: prompt
+        });
+      }
+
+      return modifiedMessages;
+    }
+
+    // Strategy 2: Prepend to first user message
+    const userIndex = modifiedMessages.findIndex(m => m.role === 'user');
+    if (userIndex >= 0) {
+      const userMsg = modifiedMessages[userIndex];
+
+      if (typeof userMsg.content === 'string') {
+        userMsg.content = `${prompt}\n\n${userMsg.content}`;
+      } else if (Array.isArray(userMsg.content)) {
+        userMsg.content.unshift({
+          type: 'text',
+          text: prompt
+        });
+      }
+
+      return modifiedMessages;
+    }
+
+    // No system or user messages found (edge case)
+    return modifiedMessages;
+  }
+
+  /**
+   * Select prompt template based on effort level
+   * @param {string} effort - 'low', 'medium', 'high', or 'max'
+   * @returns {string} Prompt template
+   * @private
+   */
+  _selectPrompt(effort) {
+    const normalizedEffort = effort.toLowerCase();
+    return this.prompts[normalizedEffort] || this.prompts.medium;
+  }
+
+  /**
+   * Get default prompt templates
+   * @returns {Object} Map of effort levels to prompts
+   * @private
+   */
+  _getDefaultPrompts() {
+    return {
+      low: `You are an expert reasoning model using GLM-4.6 architecture.
+
+CRITICAL: Before answering, write 2-3 sentences of reasoning in <reasoning_content> tags.
+
+OUTPUT FORMAT:
+<reasoning_content>
+(Brief analysis: what is the problem? what's the approach?)
+</reasoning_content>
+
+(Write your final answer here)`,
+
+      medium: `You are an expert reasoning model using GLM-4.6 architecture.
+
+CRITICAL REQUIREMENTS:
+1. Always think step-by-step before answering
+2. Write your reasoning process explicitly in <reasoning_content> tags
+3. Never skip your chain of thought, even for simple problems
+
+OUTPUT FORMAT:
+<reasoning_content>
+(Write your detailed thinking here: analyze the problem, explore approaches,
+evaluate trade-offs, and arrive at a conclusion)
+</reasoning_content>
+
+(Write your final answer here based on your reasoning above)`,
+
+      high: `You are an expert reasoning model using GLM-4.6 architecture.
+
+CRITICAL REQUIREMENTS:
+1. Think deeply and systematically before answering
+2. Write comprehensive reasoning in <reasoning_content> tags
+3. Explore multiple approaches and evaluate trade-offs
+4. Show all steps in your problem-solving process
+
+OUTPUT FORMAT:
+<reasoning_content>
+(Write exhaustive analysis here:
+- Problem decomposition
+- Multiple approach exploration
+- Trade-off analysis for each approach
+- Edge case consideration
+- Final conclusion with justification)
+</reasoning_content>
+
+(Write your final answer here based on your systematic reasoning above)`,
+
+      max: `You are an expert reasoning model using GLM-4.6 architecture.
+
+CRITICAL REQUIREMENTS:
+1. Think exhaustively from first principles
+2. Write extremely detailed reasoning in <reasoning_content> tags
+3. Analyze ALL possible angles, approaches, and edge cases
+4. Challenge your own assumptions and explore alternatives
+5. Provide rigorous justification for every claim
+
+OUTPUT FORMAT:
+<reasoning_content>
+(Write comprehensive analysis here:
+- First principles breakdown
+- Exhaustive approach enumeration
+- Comparative analysis of all approaches
+- Edge case and failure mode analysis
+- Assumption validation
+- Counter-argument consideration
+- Final conclusion with rigorous justification)
+</reasoning_content>
+
+(Write your final answer here based on your exhaustive reasoning above)`
+    };
+  }
+}
+
+module.exports = ReasoningEnforcer;
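The injection strategy in the new module (clone, prepend the reasoning prompt to the system message if present, otherwise to the first user message, preserving string vs array content) can be exercised in miniature. The snippet below re-implements the core of `injectInstruction` as a standalone function, which is a simplification of the class shown above:

```javascript
// Minimal sketch of the injection strategy from reasoning-enforcer.js.
function injectInstruction(messages, prompt) {
  // Clone so the caller's messages array is never mutated.
  const cloned = JSON.parse(JSON.stringify(messages));
  // Prefer the system message; fall back to the first user message.
  const target =
    cloned.find((m) => m.role === 'system') ||
    cloned.find((m) => m.role === 'user');
  if (!target) return cloned; // edge case: nothing to inject into
  if (typeof target.content === 'string') {
    target.content = `${prompt}\n\n${target.content}`;
  } else if (Array.isArray(target.content)) {
    target.content.unshift({ type: 'text', text: prompt });
  }
  return cloned;
}

const messages = [{ role: 'user', content: 'What is 2 + 2?' }];
const out = injectInstruction(
  messages,
  'Think step-by-step in <reasoning_content> tags.'
);
console.log(out[0].content.startsWith('Think step-by-step')); // true
console.log(messages[0].content); // original stays 'What is 2 + 2?'
```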
package/lib/ccs
CHANGED

@@ -2,7 +2,7 @@
 set -euo pipefail
 
 # Version (updated by scripts/bump-version.sh)
-CCS_VERSION="3.4.4"
+CCS_VERSION="3.4.6"
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 readonly CONFIG_FILE="${CCS_CONFIG:-$HOME/.ccs/config.json}"
 readonly PROFILES_JSON="$HOME/.ccs/profiles.json"
package/lib/ccs.ps1
CHANGED

@@ -12,7 +12,7 @@ param(
 $ErrorActionPreference = "Stop"
 
 # Version (updated by scripts/bump-version.sh)
-$CcsVersion = "3.4.4"
+$CcsVersion = "3.4.6"
 $ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
 $ConfigFile = if ($env:CCS_CONFIG) { $env:CCS_CONFIG } else { "$env:USERPROFILE\.ccs\config.json" }
 $ProfilesJson = "$env:USERPROFILE\.ccs\profiles.json"
|