@kaitranntt/ccs 3.3.0 → 3.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/README.md +66 -7
package/VERSION +1 -1
package/bin/{auth-commands.js → auth/auth-commands.js} +3 -3
package/bin/ccs.js +38 -19
package/bin/glmt/budget-calculator.js +114 -0
package/bin/glmt/delta-accumulator.js +261 -0
package/bin/glmt/glmt-proxy.js +488 -0
package/bin/glmt/glmt-transformer.js +919 -0
package/bin/glmt/locale-enforcer.js +80 -0
package/bin/glmt/sse-parser.js +96 -0
package/bin/glmt/task-classifier.js +162 -0
package/bin/{doctor.js → management/doctor.js} +2 -2
package/lib/ccs +1 -1
package/lib/ccs.ps1 +1 -1
package/package.json +1 -1
package/scripts/dev-install.sh +35 -0
package/bin/glmt-proxy.js +0 -307
package/bin/glmt-transformer.js +0 -437
package/bin/{profile-detector.js → auth/profile-detector.js} +0 -0
package/bin/{profile-registry.js → auth/profile-registry.js} +0 -0
package/bin/{instance-manager.js → management/instance-manager.js} +0 -0
package/bin/{recovery-manager.js → management/recovery-manager.js} +0 -0
package/bin/{shared-manager.js → management/shared-manager.js} +0 -0
package/bin/{claude-detector.js → utils/claude-detector.js} +0 -0
package/bin/{config-manager.js → utils/config-manager.js} +0 -0
package/bin/{error-manager.js → utils/error-manager.js} +0 -0
package/bin/{helpers.js → utils/helpers.js} +0 -0

package/README.md CHANGED

@@ -205,21 +205,65 @@ Commands and skills symlinked from `~/.ccs/shared/` - no duplication across prof
 |---------|-----------------|-------------------|
 | **Endpoint** | Anthropic-compatible | OpenAI-compatible |
 | **Thinking** | No | Yes (reasoning_content) |
-| **Streaming** | Yes | No (buffered) |
-| **Use Case** | Fast responses | Complex reasoning |
+| **Tool Support** | Basic | **Full (v3.5+)** |
+| **MCP Tools** | Limited | **Working (v3.5+)** |
+| **Streaming** | Yes | **Yes (v3.4+)** |
+| **TTFB** | <500ms | <500ms (streaming), 2-10s (buffered) |
+| **Use Case** | Fast responses | Complex reasoning + tools |
+### Tool Support (v3.5)
+**GLMT now fully supports MCP tools and function calling**:
+- **Bidirectional Transformation**: Anthropic tools ↔ OpenAI function calling
+- **MCP Integration**: MCP tools execute correctly (no XML tag output)
+- **Streaming Tool Calls**: Real-time tool calls with input_json deltas
+- **Backward Compatible**: Works seamlessly with existing thinking support
+- **No Configuration**: Tool support works automatically
+### Streaming Support (v3.4)
+**GLMT now supports real-time streaming** with incremental reasoning content delivery.
+- **Default**: Streaming enabled (TTFB <500ms)
+- **Disable**: Set `CCS_GLMT_STREAMING=disabled` for buffered mode
+- **Force**: Set `CCS_GLMT_STREAMING=force` to override client preferences
+- **Thinking parameter**: Claude CLI `thinking` parameter support
+  - Respects `thinking.type` and `budget_tokens`
+  - Precedence: CLI parameter > message tags > default
+**Confirmed working**: Z.AI (1498 reasoning chunks tested, tool calls verified)
 ### How It Works
 1. CCS spawns embedded HTTP proxy on localhost
-2. Proxy converts Anthropic format → OpenAI format
-3. Forwards to Z.AI with reasoning parameters
-4. Converts `reasoning_content` → thinking blocks
-5. Thinking appears in Claude Code UI
+2. Proxy converts Anthropic format → OpenAI format (streaming or buffered)
+3. Transforms Anthropic tools → OpenAI function calling format
+4. Forwards to Z.AI with reasoning parameters and tools
+5. Converts `reasoning_content` → thinking blocks (incremental or complete)
+6. Converts OpenAI `tool_calls` → Anthropic tool_use blocks
+7. Thinking and tool calls appear in Claude Code UI in real-time
 ### Control Tags
 - `<Thinking:On|Off>` - Enable/disable reasoning blocks (default: On)
-- `<Effort:Low|Medium|High>` - Control reasoning depth (default: Medium)
+- `<Effort:Low|Medium|High>` - Control reasoning depth (deprecated - Z.AI only supports binary thinking)
+### Environment Variables
+**GLMT-specific**:
+- `CCS_GLMT_FORCE_ENGLISH=true` - Force English output (default: true)
+- `CCS_GLMT_THINKING_BUDGET=8192` - Control thinking on/off based on task type
+  - 0 or "unlimited": Always enable thinking
+  - 1-2048: Disable thinking (fast execution)
+  - 2049-8192: Enable for reasoning tasks only (default)
+  - >8192: Always enable thinking
+- `CCS_GLMT_STREAMING=disabled` - Force buffered mode
+- `CCS_GLMT_STREAMING=force` - Force streaming (override client)
+**General**:
+- `CCS_DEBUG_LOG=1` - Enable debug file logging
+- `CCS_CLAUDE_PATH=/path/to/claude` - Custom Claude CLI path
 ### API Key Setup
@@ -235,6 +279,14 @@ nano ~/.ccs/glmt.settings.json
 }
 ```
+### Security Limits
+**DoS protection** (v3.4):
+- SSE buffer: 1MB max per event
+- Content buffer: 10MB max per block (thinking/text)
+- Content blocks: 100 max per message
+- Request timeout: 120s (both streaming and buffered)
 ### Debugging
 **Enable verbose logging**:
@@ -249,6 +301,12 @@ ccs glmt --verbose "your prompt"
 # Logs: ~/.ccs/logs/
 ```
+**Check streaming mode**:
+```bash
+# Disable streaming for debugging
+CCS_GLMT_STREAMING=disabled ccs glmt "test"
+```
 **Check reasoning content**:
 ```bash
 cat ~/.ccs/logs/*response-openai.json | jq '.choices[0].message.reasoning_content'
@@ -351,6 +409,7 @@ irm ccs.kaitran.ca/uninstall | iex
 - [Configuration](./docs/en/configuration.md)
 - [Usage Examples](./docs/en/usage.md)
 - [System Architecture](./docs/system-architecture.md)
+- [GLMT Control Mechanisms](./docs/glmt-controls.md)
 - [Troubleshooting](./docs/en/troubleshooting.md)
 - [Contributing](./CONTRIBUTING.md)
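
Steps 3 and 6 of the "How It Works" list in the README hunk above describe a two-way tool conversion. The transformer itself (`glmt-transformer.js`) is not shown in this part of the diff, so the following is only a minimal sketch of what such a mapping could look like, based on the public shapes of Anthropic tool definitions (`name`, `description`, `input_schema`) and OpenAI function calling (`type: 'function'`, `function.parameters`, stringified `arguments`). The function names `anthropicToolToOpenAI` and `openAIToolCallToAnthropic`, the `_raw` fallback field, and the sample `read_file` tool are assumptions made for illustration and are not taken from the package.

```javascript
// Sketch only: function names are hypothetical, not taken from the package.

// Anthropic tool definition -> OpenAI function-calling tool definition.
function anthropicToolToOpenAI(tool) {
  return {
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description || '',
      parameters: tool.input_schema || { type: 'object', properties: {} }
    }
  };
}

// OpenAI tool_call (arguments is a JSON string) -> Anthropic tool_use block.
function openAIToolCallToAnthropic(toolCall) {
  let input;
  try {
    input = JSON.parse(toolCall.function.arguments || '{}');
  } catch (err) {
    // Assumption: on malformed JSON, keep the raw text instead of throwing.
    input = { _raw: toolCall.function.arguments };
  }
  return { type: 'tool_use', id: toolCall.id, name: toolCall.function.name, input };
}

// Illustrative inputs (the read_file tool is made up for this example).
const openaiTool = anthropicToolToOpenAI({
  name: 'read_file',
  description: 'Read a file from disk',
  input_schema: {
    type: 'object',
    properties: { path: { type: 'string' } },
    required: ['path']
  }
});

const toolUse = openAIToolCallToAnthropic({
  id: 'call_1',
  type: 'function',
  function: { name: 'read_file', arguments: '{"path":"README.md"}' }
});

console.log(openaiTool.function.name, toolUse.input.path);
```

The main asymmetry is that OpenAI carries tool arguments as a JSON string while Anthropic `tool_use` blocks carry a parsed object, which is why the reverse direction needs a parse step and a decision about malformed input.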

package/VERSION CHANGED

@@ -1 +1 @@
-3.3.0
+3.4.1

package/bin/{auth-commands.js → auth/auth-commands.js} RENAMED

@@ -2,9 +2,9 @@
 const { spawn } = require('child_process');
 const ProfileRegistry = require('./profile-registry');
-const InstanceManager = require('./instance-manager');
-const { colored } = require('./helpers');
-const { detectClaudeCli } = require('./claude-detector');
+const InstanceManager = require('../management/instance-manager');
+const { colored } = require('../utils/helpers');
+const { detectClaudeCli } = require('../utils/claude-detector');
 /**
  * Auth Commands (Simplified)

package/bin/ccs.js CHANGED

@@ -5,11 +5,11 @@ const { spawn } = require('child_process');
 const path = require('path');
 const fs = require('fs');
 const os = require('os');
-const { error, colored } = require('./helpers');
-const { detectClaudeCli, showClaudeNotFoundError } = require('./claude-detector');
-const { getSettingsPath, getConfigPath } = require('./config-manager');
-const { ErrorManager } = require('./error-manager');
-const RecoveryManager = require('./recovery-manager');
+const { error, colored } = require('./utils/helpers');
+const { detectClaudeCli, showClaudeNotFoundError } = require('./utils/claude-detector');
+const { getSettingsPath, getConfigPath } = require('./utils/config-manager');
+const { ErrorManager } = require('./utils/error-manager');
+const RecoveryManager = require('./management/recovery-manager');
 // Version (sync with package.json)
 const CCS_VERSION = require('../package.json').version;
@@ -194,7 +194,7 @@ function handleUninstallCommand() {
 }
 async function handleDoctorCommand() {
-  const Doctor = require('./doctor');
+  const Doctor = require('./management/doctor');
   const doctor = new Doctor();
   await doctor.runAllChecks();
@@ -216,7 +216,7 @@ function detectProfile(args) {
 // Execute Claude CLI with embedded proxy (for GLMT profile)
 async function execClaudeWithProxy(claudeCli, profileName, args) {
-  const { getSettingsPath } = require('./config-manager');
+  const { getSettingsPath } = require('./utils/config-manager');
   // 1. Read settings to get API key
   const settingsPath = getSettingsPath(profileName);
@@ -233,9 +233,10 @@ async function execClaudeWithProxy(claudeCli, profileName, args) {
   const verbose = args.includes('--verbose') || args.includes('-v');
   // 2. Spawn embedded proxy with verbose flag
-  const proxyPath = path.join(__dirname, 'glmt-proxy.js');
+  const proxyPath = path.join(__dirname, 'glmt', 'glmt-proxy.js');
   const proxyArgs = verbose ? ['--verbose'] : [];
-  const proxy = spawn('node', [proxyPath, ...proxyArgs], {
+  // Use process.execPath for Windows compatibility (CVE-2024-27980)
+  const proxy = spawn(process.execPath, [proxyPath, ...proxyArgs], {
     stdio: ['ignore', 'pipe', verbose ? 'pipe' : 'inherit']
   });
@@ -286,16 +287,34 @@ async function execClaudeWithProxy(claudeCli, profileName, args) {
   // 4. Spawn Claude CLI with proxy URL
   const envVars = {
-    ...process.env,
     ANTHROPIC_BASE_URL: `http://127.0.0.1:${port}`,
     ANTHROPIC_AUTH_TOKEN: apiKey,
     ANTHROPIC_MODEL: 'glm-4.6'
   };
-  const claude = spawn(claudeCli, args, {
-    stdio: 'inherit',
-    env: envVars
-  });
+  // Use existing execClaude helper for consistent Windows handling
+  const isWindows = process.platform === 'win32';
+  const needsShell = isWindows && /\.(cmd|bat|ps1)$/i.test(claudeCli);
+  const env = { ...process.env, ...envVars };
+  let claude;
+  if (needsShell) {
+    // When shell needed: concatenate into string to avoid DEP0190 warning
+    const cmdString = [claudeCli, ...args].map(escapeShellArg).join(' ');
+    claude = spawn(cmdString, {
+      stdio: 'inherit',
+      windowsHide: true,
+      shell: true,
+      env
+    });
+  } else {
+    // When no shell needed: use array form (faster, no shell overhead)
+    claude = spawn(claudeCli, args, {
+      stdio: 'inherit',
+      windowsHide: true,
+      env
+    });
+  }
   // 5. Cleanup: kill proxy when Claude exits
   claude.on('exit', (code, signal) => {
@@ -358,7 +377,7 @@ async function main() {
   // Special case: auth command (multi-account management)
   if (firstArg === 'auth') {
-    const AuthCommands = require('./auth-commands');
+    const AuthCommands = require('./auth/auth-commands');
     const authCommands = new AuthCommands();
     await authCommands.route(args.slice(1));
     return;
@@ -383,10 +402,10 @@ async function main() {
   }
   // Use ProfileDetector to determine profile type
-  const ProfileDetector = require('./profile-detector');
-  const InstanceManager = require('./instance-manager');
-  const ProfileRegistry = require('./profile-registry');
-  const { getSettingsPath } = require('./config-manager');
+  const ProfileDetector = require('./auth/profile-detector');
+  const InstanceManager = require('./management/instance-manager');
+  const ProfileRegistry = require('./auth/profile-registry');
+  const { getSettingsPath } = require('./utils/config-manager');
   const detector = new ProfileDetector();
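
The `execClaudeWithProxy` hunk above changes two things at once: the proxy overrides are now merged over `process.env` in a separate step, and a shell is used only for Windows script wrappers. The sketch below isolates that decision so it can be checked without spawning anything. `planSpawn` is a hypothetical helper written for this note (the package inlines the logic), the port `8080` is a placeholder, and command-string escaping is left out because `escapeShellArg` is not shown in this diff.

```javascript
// Hypothetical helper for illustration; the package inlines this logic.
function planSpawn(claudeCli, platform, baseEnv, overrides) {
  // A shell is only needed for Windows script wrappers (.cmd, .bat, .ps1).
  const needsShell = platform === 'win32' && /\.(cmd|bat|ps1)$/i.test(claudeCli);

  // Later spreads win, so the proxy settings override inherited values.
  const env = { ...baseEnv, ...overrides };

  const options = { stdio: 'inherit', windowsHide: true, env };
  if (needsShell) options.shell = true;
  return { needsShell, options };
}

// Placeholder values; the real port is chosen by the proxy at runtime.
const overrides = {
  ANTHROPIC_BASE_URL: 'http://127.0.0.1:8080',
  ANTHROPIC_MODEL: 'glm-4.6'
};

const win = planSpawn(
  'C:\\tools\\claude.cmd', 'win32',
  { ANTHROPIC_MODEL: 'other', PATH: 'C:\\bin' },
  overrides
);
const nix = planSpawn('/usr/local/bin/claude', 'linux', { PATH: '/usr/bin' }, overrides);

console.log(win.needsShell, nix.needsShell, win.options.env.ANTHROPIC_MODEL);
```

Because the overrides are spread last, the effective environment matches the 3.3.0 behaviour, where `...process.env` came first inside `envVars`.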

package/bin/glmt/budget-calculator.js ADDED

@@ -0,0 +1,114 @@
+#!/usr/bin/env node
+'use strict';
+/**
+ * BudgetCalculator - Control thinking enable/disable based on task complexity
+ *
+ * Purpose: Z.AI API only supports binary thinking (on/off), not reasoning_effort levels.
+ * This module decides when to enable thinking based on task type and budget preferences.
+ *
+ * Usage:
+ *   const calculator = new BudgetCalculator();
+ *   const shouldThink = calculator.shouldEnableThinking(taskType, envBudget);
+ *
+ * Configuration:
+ *   CCS_GLMT_THINKING_BUDGET:
+ *     - 0 or "unlimited": Always enable thinking (power user mode)
+ *     - 1-2048: Disable thinking (fast execution, low budget)
+ *     - 2049-8192: Enable thinking for reasoning tasks only (default)
+ *     - >8192: Always enable thinking (high budget)
+ *
+ * Task type mapping:
+ *   - reasoning: Enable thinking (planning, design, analysis)
+ *   - execution: Disable thinking (fix, implement, debug) unless high budget
+ *   - mixed: Enable thinking if budget >= medium threshold
+ */
+class BudgetCalculator {
+  constructor(options = {}) {
+    this.budgetThresholds = {
+      low: 2048,      // Disable thinking (fast execution)
+      medium: 8192    // Enable thinking for reasoning tasks
+    };
+    this.defaultBudget = options.defaultBudget || 8192; // Default: enable thinking for reasoning
+  }
+  /**
+   * Determine if thinking should be enabled based on task type and budget
+   * @param {string} taskType - 'reasoning', 'execution', or 'mixed'
+   * @param {string|number} envBudget - CCS_GLMT_THINKING_BUDGET value
+   * @returns {boolean} True if thinking should be enabled
+   */
+  shouldEnableThinking(taskType, envBudget) {
+    const budget = this._parseBudget(envBudget);
+    // Unlimited budget (0): Always enable thinking
+    if (budget === 0) {
+      return true;
+    }
+    // Low budget (<= 2048): Disable thinking (fast execution mode)
+    if (budget <= this.budgetThresholds.low) {
+      return false;
+    }
+    // High budget (> 8192): Always enable thinking
+    if (budget > this.budgetThresholds.medium) {
+      return true;
+    }
+    // Medium budget (2049-8192): Task-aware decision
+    if (taskType === 'reasoning') {
+      return true;  // Enable thinking for planning/design tasks
+    } else if (taskType === 'execution') {
+      return false; // Disable thinking for quick fixes
+    } else {
+      return true;  // Enable for mixed/ambiguous tasks (default safe)
+    }
+  }
+  /**
+   * Parse budget from environment variable or use default
+   * @param {string|number} envBudget - Budget value
+   * @returns {number} Parsed budget (0 = unlimited)
+   * @private
+   */
+  _parseBudget(envBudget) {
+    // CRITICAL: Check for undefined/null explicitly, not falsy (0 is valid!)
+    if (envBudget === undefined || envBudget === null || envBudget === '') {
+      return this.defaultBudget;
+    }
+    // Handle string values
+    if (typeof envBudget === 'string') {
+      if (envBudget.toLowerCase() === 'unlimited') {
+        return 0;
+      }
+      const parsed = parseInt(envBudget, 10);
+      if (isNaN(parsed)) {
+        return this.defaultBudget;
+      }
+      return parsed < 0 ? 0 : parsed;
+    }
+    // Handle number values
+    if (typeof envBudget === 'number') {
+      return envBudget < 0 ? 0 : envBudget;
+    }
+    return this.defaultBudget;
+  }
+  /**
+   * Get human-readable budget description
+   * @param {number} budget - Budget value
+   * @returns {string} Description
+   */
+  getBudgetDescription(budget) {
+    if (budget === 0) return 'unlimited (always think)';
+    if (budget <= this.budgetThresholds.low) return 'low (fast execution, no thinking)';
+    if (budget <= this.budgetThresholds.medium) return 'medium (task-aware thinking)';
+    return 'high (always think)';
+  }
+}
+module.exports = BudgetCalculator;
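
To make the threshold table in the file above easier to check, here is a condensed paraphrase of `shouldEnableThinking` and `_parseBudget` as two plain functions. It is a sketch for reading along, not the shipped class: the names `parseBudget` and `shouldThink` are made up for this note, and a numeric `NaN` input falls back to the default here, whereas the class would pass it through.

```javascript
// Condensed paraphrase of the thresholds in the diff above (not the shipped class).
const LOW = 2048;    // at or below: thinking off
const MEDIUM = 8192; // above: thinking always on

function parseBudget(value, fallback = MEDIUM) {
  // Only undefined/null/'' mean "unset"; 0 is a valid value (unlimited).
  if (value === undefined || value === null || value === '') return fallback;
  if (typeof value === 'string' && value.toLowerCase() === 'unlimited') return 0;
  const n = typeof value === 'number' ? value : parseInt(value, 10);
  if (Number.isNaN(n)) return fallback;
  return n < 0 ? 0 : n; // negative values clamp to 0, which means unlimited
}

function shouldThink(taskType, envBudget) {
  const budget = parseBudget(envBudget);
  if (budget === 0) return true;       // unlimited
  if (budget <= LOW) return false;     // low budget
  if (budget > MEDIUM) return true;    // high budget
  return taskType !== 'execution';     // medium: task-aware
}

for (const [task, budget] of [
  ['execution', undefined],
  ['reasoning', undefined],
  ['reasoning', '1024'],
  ['execution', '16000'],
  ['execution', -5]
]) {
  console.log(task, String(budget), shouldThink(task, budget));
}
```

One consequence worth noticing: because negative budgets are clamped to `0` and `0` means unlimited, a value such as `-5` turns thinking on for every task type rather than off.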

package/bin/glmt/delta-accumulator.js ADDED

@@ -0,0 +1,261 @@
+#!/usr/bin/env node
+'use strict';
+/**
+ * DeltaAccumulator - Maintain state across streaming deltas
+ *
+ * Tracks:
+ * - Message metadata (id, model, role)
+ * - Content blocks (thinking, text)
+ * - Current block index
+ * - Accumulated content
+ *
+ * Usage:
+ *   const acc = new DeltaAccumulator(thinkingConfig);
+ *   const events = transformer.transformDelta(openaiEvent, acc);
+ */
+class DeltaAccumulator {
+  constructor(thinkingConfig = {}, options = {}) {
+    this.thinkingConfig = thinkingConfig;
+    this.messageId = 'msg_' + Date.now() + '_' + Math.random().toString(36).substring(7);
+    this.model = null;
+    this.role = 'assistant';
+    // Content blocks
+    this.contentBlocks = [];
+    this.currentBlockIndex = -1;
+    // Tool calls tracking
+    this.toolCalls = [];
+    this.toolCallsIndex = {};
+    // Buffers
+    this.thinkingBuffer = '';
+    this.textBuffer = '';
+    // C-02 Fix: Limits to prevent unbounded accumulation
+    this.maxBlocks = options.maxBlocks || 100;
+    this.maxBufferSize = options.maxBufferSize || 10 * 1024 * 1024; // 10MB
+    // Loop detection configuration
+    this.loopDetectionThreshold = options.loopDetectionThreshold || 3;
+    this.loopDetected = false;
+    // State flags
+    this.messageStarted = false;
+    this.finalized = false;
+    this.usageReceived = false; // Track if usage data has arrived
+    // Statistics
+    this.inputTokens = 0;
+    this.outputTokens = 0;
+    this.finishReason = null;
+  }
+  /**
+   * Get current content block
+   * @returns {Object|null} Current block or null
+   */
+  getCurrentBlock() {
+    if (this.currentBlockIndex >= 0 && this.currentBlockIndex < this.contentBlocks.length) {
+      return this.contentBlocks[this.currentBlockIndex];
+    }
+    return null;
+  }
+  /**
+   * Start new content block
+   * @param {string} type - Block type ('thinking', 'text', or 'tool_use')
+   * @returns {Object} New block
+   */
+  startBlock(type) {
+    // C-02 Fix: Enforce max blocks limit
+    if (this.contentBlocks.length >= this.maxBlocks) {
+      throw new Error(`Maximum ${this.maxBlocks} content blocks exceeded (DoS protection)`);
+    }
+    this.currentBlockIndex++;
+    const block = {
+      index: this.currentBlockIndex,
+      type: type,
+      content: '',
+      started: true,
+      stopped: false
+    };
+    this.contentBlocks.push(block);
+    // Reset buffer for new block (tool_use doesn't use buffers)
+    if (type === 'thinking') {
+      this.thinkingBuffer = '';
+    } else if (type === 'text') {
+      this.textBuffer = '';
+    }
+    return block;
+  }
+  /**
+   * Add delta to current block
+   * @param {string} delta - Content delta
+   */
+  addDelta(delta) {
+    const block = this.getCurrentBlock();
+    if (block) {
+      if (block.type === 'thinking') {
+        // C-02 Fix: Enforce buffer size limit
+        if (this.thinkingBuffer.length + delta.length > this.maxBufferSize) {
+          throw new Error(`Thinking buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
+        }
+        this.thinkingBuffer += delta;
+        block.content = this.thinkingBuffer;
+      } else if (block.type === 'text') {
+        // C-02 Fix: Enforce buffer size limit
+        if (this.textBuffer.length + delta.length > this.maxBufferSize) {
+          throw new Error(`Text buffer exceeded ${this.maxBufferSize} bytes (DoS protection)`);
+        }
+        this.textBuffer += delta;
+        block.content = this.textBuffer;
+      }
+    }
+  }
+  /**
+   * Mark current block as stopped
+   */
+  stopCurrentBlock() {
+    const block = this.getCurrentBlock();
+    if (block) {
+      block.stopped = true;
+    }
+  }
+  /**
+   * Update usage statistics
+   * @param {Object} usage - Usage object from OpenAI
+   */
+  updateUsage(usage) {
+    if (usage) {
+      this.inputTokens = usage.prompt_tokens || usage.input_tokens || 0;
+      this.outputTokens = usage.completion_tokens || usage.output_tokens || 0;
+      this.usageReceived = true; // Mark that we've received usage data
+    }
+  }
+  /**
+   * Add or update tool call delta
+   * @param {Object} toolCallDelta - Tool call delta from OpenAI
+   */
+  addToolCallDelta(toolCallDelta) {
+    const index = toolCallDelta.index;
+    // Initialize tool call if not exists
+    if (!this.toolCallsIndex[index]) {
+      const toolCall = {
+        index: index,
+        id: '',
+        type: 'function',
+        function: {
+          name: '',
+          arguments: ''
+        }
+      };
+      this.toolCalls.push(toolCall);
+      this.toolCallsIndex[index] = toolCall;
+    }
+    const toolCall = this.toolCallsIndex[index];
+    // Update id if present
+    if (toolCallDelta.id) {
+      toolCall.id = toolCallDelta.id;
+    }
+    // Update type if present
+    if (toolCallDelta.type) {
+      toolCall.type = toolCallDelta.type;
+    }
+    // Update function name if present
+    if (toolCallDelta.function?.name) {
+      toolCall.function.name += toolCallDelta.function.name;
+    }
+    // Update function arguments if present
+    if (toolCallDelta.function?.arguments) {
+      toolCall.function.arguments += toolCallDelta.function.arguments;
+    }
+  }
+  /**
+   * Get all tool calls
+   * @returns {Array} Tool calls array
+   */
+  getToolCalls() {
+    return this.toolCalls;
+  }
+  /**
+   * Check for planning loop pattern
+   * Loop = N consecutive thinking blocks with no tool calls
+   * @returns {boolean} True if loop detected
+   */
+  checkForLoop() {
+    // Already detected loop
+    if (this.loopDetected) {
+      return true;
+    }
+    // Need minimum blocks to detect pattern
+    if (this.contentBlocks.length < this.loopDetectionThreshold) {
+      return false;
+    }
+    // Get last N blocks
+    const recentBlocks = this.contentBlocks.slice(-this.loopDetectionThreshold);
+    // Check if all recent blocks are thinking blocks
+    const allThinking = recentBlocks.every(b => b.type === 'thinking');
+    // Check if no tool calls have been made at all
+    const noToolCalls = this.toolCalls.length === 0;
+    // Loop detected if: all recent blocks are thinking AND no tool calls yet
+    if (allThinking && noToolCalls) {
+      this.loopDetected = true;
+      return true;
+    }
+    return false;
+  }
+  /**
+   * Reset loop detection state (for testing)
+   */
+  resetLoopDetection() {
+    this.loopDetected = false;
+  }
+  /**
+   * Get summary of accumulated state
+   * @returns {Object} Summary
+   */
+  getSummary() {
+    return {
+      messageId: this.messageId,
+      model: this.model,
+      role: this.role,
+      blockCount: this.contentBlocks.length,
+      currentIndex: this.currentBlockIndex,
+      toolCallCount: this.toolCalls.length,
+      messageStarted: this.messageStarted,
+      finalized: this.finalized,
+      loopDetected: this.loopDetected,
+      usage: {
+        input_tokens: this.inputTokens,
+        output_tokens: this.outputTokens
+      }
+    };
+  }
+}
+module.exports = DeltaAccumulator;
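
The `addToolCallDelta` method above rebuilds complete tool calls from streamed fragments keyed by `index`. The standalone sketch below applies the same merge rules to a small, made-up sequence of deltas, so the effect on `function.arguments` is visible. `mergeToolCallDeltas` and the sample `read_file` call are illustrative names, not part of the package.

```javascript
// Standalone restatement of the merge rules in addToolCallDelta (illustrative only).
function mergeToolCallDeltas(deltas) {
  const calls = [];
  const byIndex = {};
  for (const d of deltas) {
    // First fragment for this index: create an empty tool call record.
    if (!byIndex[d.index]) {
      byIndex[d.index] = {
        index: d.index,
        id: '',
        type: 'function',
        function: { name: '', arguments: '' }
      };
      calls.push(byIndex[d.index]);
    }
    const call = byIndex[d.index];
    if (d.id) call.id = d.id;                 // id and type are replaced
    if (d.type) call.type = d.type;
    if (d.function?.name) call.function.name += d.function.name;           // name is appended
    if (d.function?.arguments) call.function.arguments += d.function.arguments; // JSON text is appended
  }
  return calls;
}

// Made-up stream: the argument JSON arrives split across two fragments.
const merged = mergeToolCallDeltas([
  { index: 0, id: 'call_1', type: 'function', function: { name: 'read_file', arguments: '' } },
  { index: 0, function: { arguments: '{"path":' } },
  { index: 0, function: { arguments: '"README.md"}' } }
]);

console.log(merged[0].function.name, JSON.parse(merged[0].function.arguments).path);
```

The argument string is only valid JSON once the last fragment has arrived, so any parsing has to wait until the tool call is complete.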