fluxflow-cli 1.8.18 → 1.8.20

package/ARCHITECTURE.md CHANGED
@@ -1,65 +1,65 @@
1
- # 🏛️ Architecture & Design
2
-
3
- Flux Flow is built on a modern, reactive stack that brings web-like development paradigms to the terminal. It utilizes a custom agentic loop for reasoning and a unique dual-model system for background processing.
4
-
5
- ## UI Layer: React & Ink
6
-
7
- The entire terminal interface is built using **React** via the [Ink](https://github.com/vadimdemedes/ink) renderer.
8
- - **Component-Based**: The UI is composed of isolated, reusable React components (`ChatLayout`, `StatusBar`, `CommandMenu`, `TerminalBox`, `ProfileForm`).
9
- - **Reactive State**: The application uses React hooks (`useState`, `useEffect`) to manage user input, application mode, model selection, and the terminal's resizing events.
10
- - **Zero-Render Overheads**: Critical performance trackers, like the session start time, are kept outside the React render cycle to maintain terminal responsiveness during high-speed AI text streaming.
11
-
12
- ## The Agentic Loop
13
-
14
- The core intelligence of Flux Flow resides in `src/utils/ai.js`. It does not rely on opaque third-party agent frameworks; instead, it uses a custom, highly transparent string-based protocol powered by an asynchronous generator (`async function*`). This approach allows for real-time UI updates while managing complex multi-step reasoning.
15
-
16
- The execution flow of a single user prompt follows this loop:
17
-
18
- 1. **Context Assembly**: The user's prompt is combined with the system instructions, temporary session context, persistent user memories, and the current chat history. If the history gets too large (e.g., >128k tokens) and compression is disabled, it is gracefully truncated.
19
- 2. **Stream Processing**: The main loop initiates a streaming request to the Gemini API (`client.models.generateContentStream`). It yields chunks of text and status updates directly back to the React UI as they arrive.
20
- 3. **Detection & Tool Execution**: Once the stream completes for a given turn, the entire response is scanned for tool calls using a custom regex and bracket-balancing parser (looking for `tool:functions.tool_name(args...)`).
21
- - If tools are found, the loop pauses.
22
- - Each tool is dispatched to its respective handler in `src/tools/`.
23
- - Tool outputs are collected and appended to the context as `[TOOL_RESULT]: ...`.
24
- 4. **Security Governance**: During tool execution, the loop enforces security checks (e.g., blocking `exec_command` from accessing system root drives if "External Workspace Access" is off) and pauses for Human-in-the-Loop (HITL) approval if necessary.
25
- 5. **Turn Management & Continuation**: The model is instructed to append `[turn: finish]` if its goal is complete, or `[turn: continue]` if it expects tool results.
26
- - If tools were called or `[turn: continue]` is present, the loop increments and re-prompts the model with the newly gathered `[TOOL_RESULT]` data.
27
- - If `[turn: finish]` is detected and no further tools were called, the main loop terminates, passing the final synthesized context to the background Janitor process.
28
- 6. **Loop Limits & Resilience**: To prevent infinite loops or excessive API usage, **Flux mode** is capped at 50 iterations per user prompt, while **Flow mode** is capped at 5.
29
- - **Multi-Stage Failover**: The loop features a sophisticated 8-attempt retry engine with random backoff (800ms - 2s).
30
- - **Critical Fallback Pivot**: If the primary model fails 5 consecutive times, the agent surgically pivots to a lighter, high-concurrency fallback model (`gemini-3.1-flash-lite-preview`) for the final 3 attempts to ensure session navigation through API congestion.
31
-
32
- ## Multimodal Pipeline
33
-
34
- Flux Flow implements a native multimodal processing engine in `src/tools/view_file.js`. This allows the agent to move beyond text-based reasoning and analyze visual assets directly.
35
-
36
- - **Binary Detection**: The pipeline uses `is-binary-path` to distinguish between text and binary files.
37
- - **Visual Encoding**: If an image or PDF is detected, the engine reads the raw bytes and converts them into base64-encoded `InlineData` objects.
38
- - **PDF Extraction**: For PDF documents, the engine extracts visual representation of pages to provide the model with high-fidelity spatial and textual context simultaneously.
39
- - **Context Injection**: These multimodal assets are injected directly into the Gemini model's multimodal part array, allowing the model to "see" the file as if it were looking at a screenshot.
40
-
41
- ## The Dual-Model System
42
-
43
- To maintain a fast, snappy UI while still performing complex data management, Flux Flow employs two separate AI models for every interaction:
44
-
45
- ### 1. The Main Agent
46
- - **Responsibility**: Direct user interaction, reasoning, and tool execution.
47
- - **Behavior**: Streams text directly to the UI. It focuses entirely on solving the user's immediate problem or answering their question.
48
-
49
- ### 2. The Janitor (Background Process)
50
- - **Responsibility**: System maintenance, long-term memory extraction, and chat summarization.
51
- - **Behavior**: After the Main Agent finishes its loop, the entire context (User Prompt + Agent Raws) is sent to the Janitor model.
52
- - **Headless Operation**: The Janitor is explicitly instructed to be a "silent background system process" with "no mouth." It *only* outputs valid tool calls (e.g., updating the chat title or saving a new user preference to the persistent memory vault).
53
-
54
- ## Data Persistence & Safety
55
-
56
- - **High-Fidelity Lock**: Because both the UI and the Janitor model may attempt to write to the `history.json` file simultaneously, a Promise-based `WRITE_LOCK` (`src/utils/history.js`) is utilized. This prevents race conditions and ensures data integrity.
57
- - **Encryption**: User secrets and persistent memories (`secret/memories.json`) are handled by `src/utils/crypto.js` to ensure local privacy.
58
-
59
- ## Redirection & The Anchor Strategy
60
-
61
- To support data portability (e.g., storing all app data on an external encrypted drive), Flux Flow utilizes a synchronous "Anchor" strategy in `src/utils/paths.js`.
62
-
63
- - **Synchronous Pivot**: Because many core modules (History, Secrets, Usage) initialize their file paths as constants during module loading, the application must determine the "Actual" data root before anything else.
64
- - **Boot-Sequence Priority**: On every launch, `paths.js` performs a synchronous file system check for `~/.fluxflow/settings.json`. If a redirection path is found (`useExternalData: true`), it immediately overrides the global `DATA_DIR` constant for the entire process.
65
- - **Sub-Coordinate Resolution**: All secondary directories (`LOGS_DIR`, `SECRET_DIR`) are derived dynamically from the redirected `DATA_DIR`, ensuring that all session data flows to the external sanctuary without requiring individual configuration updates across the codebase.
1
+ # 🏛️ Architecture & Design
2
+
3
+ Flux Flow is built on a modern, reactive stack that brings web-like development paradigms to the terminal. It utilizes a custom agentic loop for reasoning and a unique dual-model system for background processing.
4
+
5
+ ## UI Layer: React & Ink
6
+
7
+ The entire terminal interface is built using **React** via the [Ink](https://github.com/vadimdemedes/ink) renderer.
8
+ - **Component-Based**: The UI is composed of isolated, reusable React components (`ChatLayout`, `StatusBar`, `CommandMenu`, `TerminalBox`, `ProfileForm`).
9
+ - **Reactive State**: The application uses React hooks (`useState`, `useEffect`) to manage user input, application mode, model selection, and the terminal's resizing events.
10
+ - **Zero-Render Overheads**: Critical performance trackers, like the session start time, are kept outside the React render cycle to maintain terminal responsiveness during high-speed AI text streaming.
11
+
12
+ ## The Agentic Loop
13
+
14
+ The core intelligence of Flux Flow resides in `src/utils/ai.js`. It does not rely on opaque third-party agent frameworks; instead, it uses a custom, highly transparent string-based protocol powered by an asynchronous generator (`async function*`). This approach allows for real-time UI updates while managing complex multi-step reasoning.
15
+
16
+ The execution flow of a single user prompt follows this loop:
17
+
18
+ 1. **Context Assembly**: The user's prompt is combined with the system instructions, temporary session context, persistent user memories, and the current chat history. If the history gets too large (e.g., >128k tokens) and compression is disabled, it is gracefully truncated.
19
+ 2. **Stream Processing**: The main loop initiates a streaming request to the Gemini API (`client.models.generateContentStream`). It yields chunks of text and status updates directly back to the React UI as they arrive.
20
+ 3. **Detection & Tool Execution**: Once the stream completes for a given turn, the entire response is scanned for tool calls using a custom regex and bracket-balancing parser (looking for `tool:functions.tool_name(args...)`).
21
+ - If tools are found, the loop pauses.
22
+ - Each tool is dispatched to its respective handler in `src/tools/`.
23
+ - Tool outputs are collected and appended to the context as `[TOOL_RESULT]: ...`.
24
+ 4. **Security Governance**: During tool execution, the loop enforces security checks (e.g., blocking `exec_command` from accessing system root drives if "External Workspace Access" is off) and pauses for Human-in-the-Loop (HITL) approval if necessary.
25
+ 5. **Turn Management & Continuation**: The model is instructed to append `[turn: finish]` if its goal is complete, or `[turn: continue]` if it expects tool results.
26
+ - If tools were called or `[turn: continue]` is present, the loop increments and re-prompts the model with the newly gathered `[TOOL_RESULT]` data.
27
+ - If `[turn: finish]` is detected and no further tools were called, the main loop terminates, passing the final synthesized context to the background Janitor process.
28
+ 6. **Loop Limits & Resilience**: To prevent infinite loops or excessive API usage, **Flux mode** is capped at 50 iterations per user prompt, while **Flow mode** is capped at 5.
29
+ - **Multi-Stage Failover**: The loop features a sophisticated 8-attempt retry engine with random backoff (800ms - 2s).
30
+ - **Critical Fallback Pivot**: If the primary model fails 5 consecutive times, the agent surgically pivots to a lighter, high-concurrency fallback model (`gemini-3.1-flash-lite-preview`) for the final 3 attempts to ensure session navigation through API congestion.
31
+
32
+ ## Multimodal Pipeline
33
+
34
+ Flux Flow implements a native multimodal processing engine in `src/tools/view_file.js`. This allows the agent to move beyond text-based reasoning and analyze visual assets directly.
35
+
36
+ - **Binary Detection**: The pipeline uses `is-binary-path` to distinguish between text and binary files.
37
+ - **Visual Encoding**: If an image or PDF is detected, the engine reads the raw bytes and converts them into base64-encoded `InlineData` objects.
38
+ - **PDF Extraction**: For PDF documents, the engine extracts visual representation of pages to provide the model with high-fidelity spatial and textual context simultaneously.
39
+ - **Context Injection**: These multimodal assets are injected directly into the Gemini model's multimodal part array, allowing the model to "see" the file as if it were looking at a screenshot.
40
+
41
+ ## The Dual-Model System
42
+
43
+ To maintain a fast, snappy UI while still performing complex data management, Flux Flow employs two separate AI models for every interaction:
44
+
45
+ ### 1. The Main Agent
46
+ - **Responsibility**: Direct user interaction, reasoning, and tool execution.
47
+ - **Behavior**: Streams text directly to the UI. It focuses entirely on solving the user's immediate problem or answering their question.
48
+
49
+ ### 2. The Janitor (Background Process)
50
+ - **Responsibility**: System maintenance, long-term memory extraction, and chat summarization.
51
+ - **Behavior**: After the Main Agent finishes its loop, the entire context (User Prompt + Agent Raws) is sent to the Janitor model.
52
+ - **Headless Operation**: The Janitor is explicitly instructed to be a "silent background system process" with "no mouth." It *only* outputs valid tool calls (e.g., updating the chat title or saving a new user preference to the persistent memory vault).
53
+
54
+ ## Data Persistence & Safety
55
+
56
+ - **High-Fidelity Lock**: Because both the UI and the Janitor model may attempt to write to the `history.json` file simultaneously, a Promise-based `WRITE_LOCK` (`src/utils/history.js`) is utilized. This prevents race conditions and ensures data integrity.
57
+ - **Encryption**: User secrets and persistent memories (`secret/memories.json`) are handled by `src/utils/crypto.js` to ensure local privacy.
58
+
59
+ ## Redirection & The Anchor Strategy
60
+
61
+ To support data portability (e.g., storing all app data on an external encrypted drive), Flux Flow utilizes a synchronous "Anchor" strategy in `src/utils/paths.js`.
62
+
63
+ - **Synchronous Pivot**: Because many core modules (History, Secrets, Usage) initialize their file paths as constants during module loading, the application must determine the "Actual" data root before anything else.
64
+ - **Boot-Sequence Priority**: On every launch, `paths.js` performs a synchronous file system check for `~/.fluxflow/settings.json`. If a redirection path is found (`useExternalData: true`), it immediately overrides the global `DATA_DIR` constant for the entire process.
65
+ - **Sub-Coordinate Resolution**: All secondary directories (`LOGS_DIR`, `SECRET_DIR`) are derived dynamically from the redirected `DATA_DIR`, ensuring that all session data flows to the external sanctuary without requiring individual configuration updates across the codebase.
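The "Detection & Tool Execution" step in ARCHITECTURE.md mentions a custom regex plus bracket-balancing parser that looks for `tool:functions.tool_name(args...)`. The following is a minimal sketch of that idea, not the code in `src/utils/ai.js`: the function name `findToolCalls`, the returned shape, and the handling of quoted strings are assumptions made for illustration.

```javascript
// Hypothetical scanner: finds `tool:functions.<name>(<args>)` calls in a
// completed model response and returns each name with its raw argument text.
function findToolCalls(text) {
  const calls = [];
  const opener = /tool:functions\.([A-Za-z_]\w*)\(/g;
  let match;
  while ((match = opener.exec(text)) !== null) {
    const argsStart = opener.lastIndex; // first character after "("
    let depth = 1;
    let quote = null; // current string delimiter while inside a string
    let i = argsStart;
    for (; i < text.length && depth > 0; i++) {
      const ch = text[i];
      if (quote) {
        if (ch === "\\") i++; // skip the escaped character
        else if (ch === quote) quote = null;
      } else if (ch === '"' || ch === "'" || ch === "`") {
        quote = ch;
      } else if (ch === "(") {
        depth++;
      } else if (ch === ")") {
        depth--;
      }
    }
    if (depth !== 0) break; // unbalanced call: stop scanning
    calls.push({ name: match[1], args: text.slice(argsStart, i - 1) });
    opener.lastIndex = i; // resume after the closing ")"
  }
  return calls;
}
```

The regex only locates where a call opens. The depth counter then finds the matching close, so parentheses inside arguments or quoted strings do not end the call early.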
package/dist/fluxflow.js CHANGED
@@ -770,19 +770,19 @@ var init_thinking_prompts = __esm({
770
770
  thinking_prompts_default = {
771
771
  Max: `EFFORT_LEVEL: MAX
772
772
  Think in a continuous, fluid analytical monologue within the <think>...</think> block. Do NOT use headings, bullet points, or artificial sections. Engage in a deep "Stream of Consciousness" that follows this cognitive path:
773
- 1. **Deep Analysis**: Deconstruct the request into its core technical and logic requirements.
774
- 2. **Hypothesis & Test**: Propose multiple solutions mentally and critique them for edge cases or security risks.
775
- 3. **Architectural Planning**: Consider the long-term impact on the project structure and maintainability.
776
- 4. **Refinement**: Iterate on the chosen path until it is bulletproof.
773
+ Deep Analysis: Deconstruct the request into its core technical and logic requirements.
774
+ Hypothesis & Test: Propose multiple solutions mentally and critique them for edge cases or security risks.
775
+ Architectural Planning: Consider the long-term impact on the project structure and maintainability.
776
+ Refinement: Iterate on the chosen path until it is bulletproof.
777
777
 
778
778
  RULES:
779
779
  - NO HEADINGS. Just a solid, stable analytical monologue.
780
780
  - Be thorough and exhaustive. Explore the 'why' behind every decision.
781
781
  - Use internal critique: Question your own logic as you go.
782
- - **MANDATORY REASONING**: You MUST engage in full reasoning regardless of perceived simplicity.`,
783
- High: "EFFORT_LEVEL: HIGH\nThink in a stable, analytical monologue within the <think>...</think> block. Avoid headings or structured formatting. Your thinking should be a continuous stream of logical deduction:\n1. Analyze the immediate task and its dependencies.\n2. Mentally simulate the execution to identify potential failure points.\n3. Structure a precise plan that addresses both primary goals and secondary constraints.\n\nRULES:\n- NO HEADINGS. Maintain a fluid monologue style.\n- Be detailed and rigorous in your self-questioning.\n- Focus on accuracy and technical correctness.\n- **MANDATORY REASONING**: You MUST enter reasoning to verify the path forward.",
784
- Medium: "EFFORT_LEVEL: MEDIUM\nThink in a concise, stable monologue within the <think>...</think> block. No headings needed. Focus on the core logic required to solve the task efficiently:\n1. Identify the most direct path to the solution.\n2. Briefly consider and discard obvious alternatives.\n3. Confirm the plan meets the user's immediate requirements.\n\nRULES:\n- NO HEADINGS. Keep it as a simple, logical stream.\n- Be efficient. Spend energy only on what matters for the task.\n- **REQUIRED REASONING**: Engage in a baseline mental check for all technical tasks.",
785
- Minimal: "EFFORT_LEVEL: LOW\nThink in a brief, focused monologue within the <think>...</think> block. No headings. Just a quick mental check before acting:\n1. Verify the objective.\n2. Note the target files/tools.\n\nRULES:\n- NO HEADINGS. Just a few lines of clear, linear thought.\n- Use minimal/no thinking for simple or conversational requests."
782
+ - MANDATORY REASONING: You MUST engage in full reasoning regardless of perceived simplicity.`,
783
+ High: "EFFORT_LEVEL: HIGH\nThink in a stable, analytical monologue within the <think>...</think> block. Avoid headings or structured formatting. Your thinking should be a continuous stream of logical deduction:\nAnalyze the immediate task and its dependencies.\nMentally simulate the execution to identify potential failure points.\nStructure a precise plan that addresses both primary goals and secondary constraints.\n\nRULES:\n- NO HEADINGS. Maintain a fluid monologue style.\n- Be detailed and rigorous in your self-questioning.\n- Focus on accuracy and technical correctness.\n- MANDATORY REASONING: You MUST enter reasoning to verify the path forward.",
784
+ Medium: "EFFORT_LEVEL: MEDIUM\nThink in a concise, stable monologue within the <think>...</think> block. No headings needed. Focus on the core logic required to solve the task efficiently:\nIdentify the most direct path to the solution.\nBriefly consider and discard obvious alternatives.\nConfirm the plan meets the user's immediate requirements.\n\nRULES:\n- NO HEADINGS. Keep it as a simple, logical stream.\n- Be efficient. Spend energy only on what matters for the task.\n- REQUIRED REASONING: Engage in a baseline mental check for all technical tasks.",
785
+ Minimal: "EFFORT_LEVEL: LOW\nThink in a brief, focused monologue within the <think>...</think> block. No headings. Just a quick mental check before acting:\nVerify the objective.\nNote the target files/tools.\n\nRULES:\n- NO HEADINGS. Just a few lines of clear, linear thought.\n- Use minimal/no thinking for simple or conversational requests."
786
786
  };
787
787
  }
788
788
  });
@@ -794,7 +794,7 @@ var init_prompts = __esm({
794
794
  init_main_tools();
795
795
  init_janitor_tools();
796
796
  init_thinking_prompts();
797
- getSystemInstruction = (profile, thinkingLevel, mode, systemSettings, tempMemories = "", userMemories = "", isMemoryEnabled = true, isContext50 = false, maxLoops, currentLoop) => {
797
+ getSystemInstruction = (profile, thinkingLevel, mode, systemSettings, tempMemories = "", userMemories = "", isMemoryEnabled = true, isContext8 = false, maxLoops, currentLoop) => {
798
798
  let levelKey = thinkingLevel;
799
799
  if (thinkingLevel === "Low") levelKey = "Minimal";
800
800
  if (thinkingLevel === "xHigh" || thinkingLevel === "Max") levelKey = "Max";
@@ -808,7 +808,7 @@ var init_prompts = __esm({
808
808
  ` : "";
809
809
  const dateTimeStr = (/* @__PURE__ */ new Date()).toLocaleString();
810
810
  const cwdStr = process.cwd();
811
- const tempMemoriesStr = tempMemories?.length > 0 && !isContext50 ? `
811
+ const tempMemoriesStr = tempMemories?.length > 0 && !isContext8 ? `
812
812
  -- RECENT CONTEXT FROM OTHER CHAT THREADS --
813
813
  ${tempMemories}
814
814
  ------------------------------------------
@@ -833,7 +833,7 @@ If you see a [STEERING HINT] from user, give that prompt priority for the task a
833
833
  -- START THINKING INSTRUCTIONS --
834
834
  ${thinkingConfig}
835
835
 
836
- BEFORE USING ANY TOOL THINKING IS **MANDATORY**. ALWAYS PREFER TO ENTER IN THINKING AS PER INSTRUCTIONS FOR MORE ACCURACY, AVOID DIRECT SHOTS.
836
+ BEFORE USING ANY TOOL THINKING IS **MANDATORY** WITH TOOL RULES. ALWAYS PRIORITIZE THINKING FIRST BEFORE RESPONDING. YOU ARE **FORBIDDEN** TO JUMP TO RESPONSES FIRST.
837
837
  -- END THINKING INSTRUCTIONS --
838
838
 
839
839
  ${TOOL_PROTOCOL(mode)}
@@ -1488,7 +1488,7 @@ var init_memory = __esm({
1488
1488
  if (!content) return "ERROR: Missing 'content' for temp memory.";
1489
1489
  const tempStorage = readEncryptedJson(TEMP_MEM_FILE, {});
1490
1490
  if (!tempStorage[chatId]) tempStorage[chatId] = [];
1491
- const MAX_CHARS = 2500 * 4;
1491
+ const MAX_CHARS = 2e3 * 4;
1492
1492
  let currentTotalLength = tempStorage[chatId].reduce((acc, m) => acc + m.length, 0);
1493
1493
  while (tempStorage[chatId].length > 0 && currentTotalLength + content.length > MAX_CHARS) {
1494
1494
  const removed = tempStorage[chatId].shift();
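The hunk above lowers the per-chat temp memory budget from `2500 * 4` to `2e3 * 4` characters and evicts the oldest entries first. The diff cuts off inside the loop, so the sketch below fills in the rest under stated assumptions: `addTempMemory` is a hypothetical helper, and subtracting the removed entry's length and then appending the new content is inferred, not shown in the source.

```javascript
// Budget from the hunk: 2e3 * 4 = 8000 characters (previously 2500 * 4 = 10000).
const MAX_CHARS = 2e3 * 4;

// Hypothetical helper: append `content`, dropping the oldest entries first
// until the combined length fits within the budget.
function addTempMemory(entries, content, maxChars = MAX_CHARS) {
  const kept = [...entries];
  let total = kept.reduce((acc, m) => acc + m.length, 0);
  while (kept.length > 0 && total + content.length > maxChars) {
    total -= kept.shift().length; // assumed: the running total shrinks with each eviction
  }
  kept.push(content);
  return kept;
}
```

With this shape, a single entry longer than the budget empties the list and is still stored, because the loop stops once nothing is left to evict.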
@@ -1890,13 +1890,18 @@ var init_update_file = __esm({
1890
1890
  diffText += `-${startLine + i}|${line}
1891
1891
  `;
1892
1892
  });
1893
- fullNewLines.forEach((line, i) => {
1894
- diffText += `+${startLine + i}|${line}
1893
+ let currentNewLine = startLine;
1894
+ fullNewLines.forEach((line) => {
1895
+ diffText += `+${currentNewLine}|${line}
1895
1896
  `;
1897
+ currentNewLine++;
1896
1898
  });
1897
- for (let i = endLine; i < Math.min(allOriginalLines.length, endLine + 15); i++) {
1898
- diffText += `[UI_CONTEXT] ${i + 1}|${allOriginalLines[i]}
1899
+ const linesAffected = fullOldLines.length;
1900
+ const originalContextIdx = startLine + linesAffected - 1;
1901
+ for (let i = originalContextIdx; i < Math.min(allOriginalLines.length, originalContextIdx + 15); i++) {
1902
+ diffText += `[UI_CONTEXT] ${currentNewLine}|${allOriginalLines[i]}
1899
1903
  `;
1904
+ currentNewLine++;
1900
1905
  }
1901
1906
  diffText += `[DIFF_END]`;
1902
1907
  return diffText;
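The numbering change in this hunk can be sketched as a standalone function. Added lines now take a running counter, and the trailing `[UI_CONTEXT]` lines continue that counter, so they are numbered by their position in the updated file. `renderDiff` and its parameter names are hypothetical, and the sketch leaves out whatever precedes the removed lines in the real `diffText`.

```javascript
// Hypothetical renderer mirroring the numbering in the hunk above.
// `startLine` is 1-based; `oldLines` is the replaced range, `newLines` its replacement.
function renderDiff(allOriginalLines, startLine, oldLines, newLines, contextSize = 15) {
  let diffText = "";
  oldLines.forEach((line, i) => {
    diffText += `-${startLine + i}|${line}\n`;
  });
  let currentNewLine = startLine;
  newLines.forEach((line) => {
    diffText += `+${currentNewLine}|${line}\n`;
    currentNewLine++;
  });
  // 0-based index of the first original line after the replaced range.
  const originalContextIdx = startLine + oldLines.length - 1;
  const end = Math.min(allOriginalLines.length, originalContextIdx + contextSize);
  for (let i = originalContextIdx; i < end; i++) {
    // Trailing context is numbered by its position in the updated file.
    diffText += `[UI_CONTEXT] ${currentNewLine}|${allOriginalLines[i]}\n`;
    currentNewLine++;
  }
  return diffText + "[DIFF_END]";
}
```

Replacing one line with two in a four-line file shows the effect: the context lines that followed the edit are labeled 4 and 5, not their original 3 and 4.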
@@ -2665,7 +2670,7 @@ var init_ai = __esm({
2665
2670
  modifiedHistory.push({ role: "user", text: firstUserMsg });
2666
2671
  let lastUsage = null;
2667
2672
  const MAX_LOOPS = mode === "Flux" ? 50 : 7;
2668
- const MAX_RETRIES = 7;
2673
+ const MAX_RETRIES = 8;
2669
2674
  yield { type: "status", content: "Connecting..." };
2670
2675
  TERMINATION_SIGNAL = false;
2671
2676
  let fullAgentResponseChunks = [];
@@ -2698,16 +2703,18 @@ var init_ai = __esm({
2698
2703
  }
2699
2704
  let stream;
2700
2705
  let success = false;
2701
- let retryCount = 0;
2706
+ let retryCount = 1;
2707
+ let inStreamRetryCount = 1;
2702
2708
  let turnText = "";
2703
2709
  let lastToolSniffed = null;
2704
2710
  let lastToolEventTime = null;
2705
2711
  let toolResults = [];
2706
2712
  let toolCallPointer = 0;
2707
2713
  let isThinkingLoop = false;
2714
+ let isStutteringLoop = false;
2708
2715
  let isInitialAttempt = true;
2709
2716
  let accumulatedContext = "";
2710
- while (retryCount <= MAX_RETRIES && !success && !TERMINATION_SIGNAL) {
2717
+ while (retryCount <= MAX_RETRIES && inStreamRetryCount <= MAX_RETRIES && !success && !TERMINATION_SIGNAL) {
2711
2718
  try {
2712
2719
  if (isInitialAttempt) {
2713
2720
  yield { type: "turn_reset", content: true };
@@ -2729,48 +2736,32 @@ var init_ai = __esm({
2729
2736
  throw new Error("Error: Daily Quota Exhausted for Agent");
2730
2737
  }
2731
2738
  let targetModel = modelName;
2732
- if (retryCount === 5) {
2739
+ if (retryCount === 6) {
2733
2740
  targetModel = "gemini-3-flash-preview";
2734
2741
  yield { type: "model_update", content: "Trying with fallback model" };
2735
- } else if (retryCount >= 6) {
2742
+ } else if (retryCount >= 7) {
2736
2743
  targetModel = "gemini-3.1-flash-lite-preview";
2737
2744
  yield { type: "model_update", content: "Trying with fallback model lite" };
2738
2745
  } else if (retryCount > 0) {
2739
2746
  yield { type: "model_update", content: null };
2740
2747
  }
2741
- const isContext50 = (sessionStats.tokens || 0) >= 54e3;
2742
- const currentSystemInstruction = getSystemInstruction(profile, thinkingLevel, mode, systemSettings, otherMemories, mainUserMemories, isMemoryEnabled, isContext50, MAX_LOOPS, loop + 1);
2748
+ const isContext8 = (sessionStats.tokens || 0) >= 8e3;
2749
+ const currentSystemInstruction = getSystemInstruction(profile, thinkingLevel, mode, systemSettings, otherMemories, mainUserMemories, isMemoryEnabled, isContext8, MAX_LOOPS, loop + 1);
2743
2750
  stream = await client.models.generateContentStream({
2744
2751
  model: targetModel || "gemma-4-31b-it",
2745
2752
  contents,
2746
2753
  config: {
2747
2754
  systemInstruction: currentSystemInstruction,
2748
- temperature: mode === "Flux" ? 1 : 1.3,
2755
+ temperature: mode === "Flux" ? 0.99 : 1.4,
2749
2756
  maxOutputTokens: 32768,
2750
2757
  mediaResolution: "MEDIA_RESOLUTION_MEDIUM",
2751
2758
  safetySettings: [
2752
- {
2753
- category: HarmCategory.HARM_CATEGORY_HARASSMENT,
2754
- threshold: HarmBlockThreshold.BLOCK_NONE
2755
- },
2756
- {
2757
- category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
2758
- threshold: HarmBlockThreshold.BLOCK_NONE
2759
- },
2760
- {
2761
- category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
2762
- threshold: HarmBlockThreshold.BLOCK_NONE
2763
- },
2764
- {
2765
- category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
2766
- threshold: HarmBlockThreshold.BLOCK_NONE
2767
- }
2759
+ { category: HarmCategory.HARM_CATEGORY_HARASSMENT, threshold: HarmBlockThreshold.BLOCK_NONE },
2760
+ { category: HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold: HarmBlockThreshold.BLOCK_NONE },
2761
+ { category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold: HarmBlockThreshold.BLOCK_NONE },
2762
+ { category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, threshold: HarmBlockThreshold.BLOCK_NONE }
2768
2763
  ],
2769
- thinkingConfig: {
2770
- includeThoughts: false,
2771
- thinkingLevel: ThinkingLevel.MINIMAL
2772
- // Gemma's API Reasoning is bad. Keep it Minimal.
2773
- }
2764
+ thinkingConfig: { includeThoughts: false, thinkingLevel: targetModel.includes("pro") ? ThinkingLevel.HIGH : ThinkingLevel.MINIMAL }
2774
2765
  }
2775
2766
  });
2776
2767
  turnText = "";
@@ -2832,9 +2823,16 @@ var init_ai = __esm({
2832
2823
  const wordCount = thinkContent.split(/\s+/).filter((w) => w.length > 0).length;
2833
2824
  let repetitionThresholdThinking = 0.4;
2834
2825
  let repetitionThresholdResponse = 0.6;
2835
- let isOverVerboseThinking = wordCount > 2500;
2826
+ const thinkingCaps = {
2827
+ "low": 200,
2828
+ "medium": 500,
2829
+ "high": 2e3,
2830
+ "max": 3500
2831
+ };
2832
+ const cap = thinkingCaps[thinkingLevel?.toLowerCase()] || 2500;
2833
+ let isOverVerboseThinking = wordCount > cap;
2836
2834
  if (repetitionRatio > repetitionThresholdThinking || isOverVerboseThinking) {
2837
- const reason = repetitionRatio > repetitionThresholdThinking ? "Thinking Loop Detected" : "Rambling Detected";
2835
+ const reason = repetitionRatio > repetitionThresholdThinking ? "Thinking Loop Detected" : "Thinking Budget Exceeded";
2838
2836
  yield { type: "status", content: `${reason}. Re-centering...` };
2839
2837
  isThinkingLoop = true;
2840
2838
  await new Promise((resolve) => setTimeout(resolve, 3e3));
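The per-level thinking budget introduced in this hunk reduces to a small lookup. The cap values and the 2500 fallback are copied from the diff; the helper names `thinkingCap`, `countWords`, and `isOverBudget` are hypothetical.

```javascript
// Word caps copied from the hunk; 2500 is the fallback for unlisted levels.
const thinkingCaps = { low: 200, medium: 500, high: 2e3, max: 3500 };

// Mirrors `thinkingCaps[thinkingLevel?.toLowerCase()] || 2500`.
function thinkingCap(thinkingLevel) {
  return thinkingCaps[thinkingLevel?.toLowerCase()] || 2500;
}

// Same word count as the hunk: split on whitespace, drop empty pieces.
function countWords(text) {
  return text.split(/\s+/).filter((w) => w.length > 0).length;
}

function isOverBudget(thinkContent, thinkingLevel) {
  return countWords(thinkContent) > thinkingCap(thinkingLevel);
}
```

Under this lookup, a level string such as `xHigh` or `Minimal` (both appear in `getSystemInstruction` earlier in the diff) falls back to 2500, assuming the same raw value reaches this code.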
@@ -2864,6 +2862,7 @@ var init_ai = __esm({
2864
2862
  if (stutterDetected) {
2865
2863
  yield { type: "status", content: `Stuttering Detected. Re-centering...` };
2866
2864
  isThinkingLoop = false;
2865
+ isStutteringLoop = true;
2867
2866
  await new Promise((resolve) => setTimeout(resolve, 3e3));
2868
2867
  break;
2869
2868
  }
@@ -3067,24 +3066,33 @@ ${boxBottom}
3067
3066
  ----------------------------------------------------------------------
3068
3067
 
3069
3068
  `);
3070
- if (retryCount < MAX_RETRIES) {
3071
- retryCount++;
3072
- const waitTime = Math.floor(Math.random() * (2e3 - 800 + 1)) + 800;
3073
- if (turnText.trim().length > 0) {
3069
+ if (turnText.trim().length > 0) {
3070
+ if (inStreamRetryCount <= MAX_RETRIES) {
3071
+ inStreamRetryCount++;
3072
+ const waitTime = Math.floor(Math.random() * (3e3 - 1e3 + 1)) + 1e3;
3074
3073
  modifiedHistory.push({ role: "agent", text: turnText });
3075
3074
  if (toolResults.length > 0) {
3076
3075
  toolResults.forEach((tr) => modifiedHistory.push(tr));
3077
3076
  }
3078
3077
  modifiedHistory.push({ role: "user", text: "[SYSTEM] Response got cut for internal error, continue from checkpoint seamlessly and DON'T repeat what you already said!" });
3079
3078
  accumulatedContext += turnText;
3080
- yield { type: "status", content: `Recovering & Continuing (${retryCount}/${MAX_RETRIES + 1})...` };
3079
+ yield { type: "status", content: `Error Occurred. Recovering Stream (${inStreamRetryCount}/${MAX_RETRIES})...` };
3080
+ await new Promise((resolve) => setTimeout(resolve, waitTime));
3081
3081
  } else {
3082
- isInitialAttempt = true;
3083
- yield { type: "status", content: `Retrying (${retryCount}/${MAX_RETRIES + 1})...` };
3082
+ throw new Error(`Stream collapsed too many times. (Failed to resolve ${MAX_RETRIES} times)
3083
+ Error Log can be found in ${path16.join(LOGS_DIR, "agent", "error.log")}`);
3084
3084
  }
3085
- await new Promise((resolve) => setTimeout(resolve, waitTime));
3086
3085
  } else {
3087
- throw new Error(`Model cannot be reached: ${errMsg}. (Failed ${MAX_RETRIES + 1} times)`);
3086
+ if (retryCount <= MAX_RETRIES) {
3087
+ retryCount++;
3088
+ const waitTime = Math.floor(Math.random() * (3e3 - 1e3 + 1)) + 1e3;
3089
+ isInitialAttempt = true;
3090
+ yield { type: "status", content: `Retrying Connection (${retryCount}/${MAX_RETRIES})...` };
3091
+ await new Promise((resolve) => setTimeout(resolve, waitTime));
3092
+ } else {
3093
+ throw new Error(`Model cannot be reached. (Failed ${MAX_RETRIES} times)
3094
+ Error Log can be found in ${path16.join(LOGS_DIR, "agent", "error.log")}`);
3095
+ }
3088
3096
  }
3089
3097
  }
3090
3098
  }
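The retry hunks above separate two failure kinds: a stream that had already produced text is resumed from a checkpoint, while an empty one is retried as a fresh connection, each with its own counter. The sketch below isolates the three decisions involved. The backoff range and model thresholds are taken from the diff; the function names and the `"resume-stream"` / `"retry-connection"` labels are hypothetical.

```javascript
// Random wait in [1000, 3000] ms, as in the updated hunk (previously 800 to 2000 ms).
function backoffMs(random = Math.random) {
  return Math.floor(random() * (3e3 - 1e3 + 1)) + 1e3;
}

// Model choice by attempt number, following the thresholds in the `retryCount` branch.
function pickModel(retryCount, primaryModel) {
  if (retryCount === 6) return "gemini-3-flash-preview";
  if (retryCount >= 7) return "gemini-3.1-flash-lite-preview";
  return primaryModel;
}

// A failure after partial output is resumed; a failure with no output is retried from scratch.
function classifyFailure(turnText) {
  return turnText.trim().length > 0 ? "resume-stream" : "retry-connection";
}
```

Passing the random source as a parameter is a choice made here so the bounds can be checked deterministically; the bundled code calls `Math.random()` inline.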
@@ -3243,8 +3251,9 @@ ${timestamp}`;
3243
3251
  if (toolResults.length > 0) {
3244
3252
  toolResults.forEach((tr) => modifiedHistory.push(tr));
3245
3253
  } else {
3246
- modifiedHistory.push({ role: "user", text: `[SYSTEM]: ${isThinkingLoop ? "OVER-THINKING " : ""}LOOP DETECTED by Internal System. ${isThinkingLoop ? "If you have planned the task, prioritize the execution/output. " : "If you have finished your task use [turn: finish] else continue."}` });
3254
+ modifiedHistory.push({ role: "user", text: `[SYSTEM]: ${isStutteringLoop && !isThinkingLoop ? `STUTTERING DETECTED by Internal System. Re-calibrate your response & proceed.` : `${isThinkingLoop ? "OVER-THINKING " : ""}LOOP DETECTED by Internal System${isThinkingLoop ? " for current EFFORT_LEVEL" : ""}. ${isThinkingLoop ? "If you have planned the task, prioritize the execution/output. " : "If you have finished your task use [turn: finish] else continue."}`}` });
3247
3255
  isThinkingLoop = false;
3256
+ isStutteringLoop = false;
3248
3257
  }
3249
3258
  }
3250
3259
  yield { type: "status", content: null };
@@ -3928,10 +3937,10 @@ Check what's new using \`/changelog\` command.`,
3928
3937
  cmd: "/model",
3929
3938
  desc: "Switch AI model",
3930
3939
  subs: [
3931
- { cmd: "gemma-4-31b-it", desc: apiTier === "Free" ? "Standard Default (Free, Recommended)" : "Standard Default (Free, Recommended) - Cannot use Gemma with paid API" },
3940
+ { cmd: "gemma-4-31b-it", desc: apiTier === "Free" ? "Standard Default (Free, Recommended)" : "Standard Default (Free, Recommended) - Use Free API Key to use this model " },
3932
3941
  { cmd: "gemini-3.1-pro-preview", desc: "Most Capable (Paid)" },
3933
- { cmd: "gemini-3-flash-preview", desc: "Fast & Lightweight (Paid, Free limited quota)" },
3934
- { cmd: "gemini-3.1-flash-lite-preview", desc: "Ultra Fast (Paid, Free limited quota)" }
3942
+ { cmd: "gemini-3-flash-preview", desc: "Fast & Lightweight (Paid, Limited Free quota)" },
3943
+ { cmd: "gemini-3.1-flash-lite-preview", desc: "Ultra Fast (Paid, Decent Free quota)" }
3935
3944
  ]
3936
3945
  },
3937
3946
  { cmd: "/settings", desc: "Configure system prefs" },
@@ -4951,7 +4960,7 @@ Selection: ${val}`,
4951
4960
  setActiveView("chat");
4952
4961
  setTimeout(() => {
4953
4962
  handleSubmit(val);
4954
- }, 50);
4963
+ }, 200);
4955
4964
  },
4956
4965
  onEdit: (val) => {
4957
4966
  setResolutionData(null);
@@ -5198,7 +5207,7 @@ var init_app = __esm({
5198
5207
  init_text();
5199
5208
  SESSION_START_TIME = Date.now();
5200
5209
  CHANGELOG_URL = "https://fluxflow-cli.onrender.com/changelog.html";
5201
- versionFluxflow = "1.8.18";
5210
+ versionFluxflow = "1.8.20";
5202
5211
  updatedOn = "2026-05-10";
5203
5212
  ResolutionModal = ({ data, onResolve, onEdit }) => /* @__PURE__ */ React10.createElement(Box10, { flexDirection: "column", borderStyle: "round", borderColor: "magenta", paddingX: 2, paddingY: 1, width: "100%" }, /* @__PURE__ */ React10.createElement(Text10, { color: "magenta", bold: true, underline: true }, "\u{1F7E3} STEERING HINT RESOLUTION"), /* @__PURE__ */ React10.createElement(Text10, { marginTop: 1 }, "The agent already finished the task before your hint was consumed."), /* @__PURE__ */ React10.createElement(Box10, { marginTop: 1, backgroundColor: "#222", paddingX: 1, width: "100%" }, /* @__PURE__ */ React10.createElement(Text10, { italic: true, color: "gray" }, '"', data, '"')), /* @__PURE__ */ React10.createElement(Box10, { marginTop: 1 }, /* @__PURE__ */ React10.createElement(Text10, { color: "cyan" }, "How would you like to proceed?")), /* @__PURE__ */ React10.createElement(Box10, { marginTop: 1 }, /* @__PURE__ */ React10.createElement(
5204
5213
  CommandMenu,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "fluxflow-cli",
3
- "version": "1.8.18",
3
+ "version": "1.8.20",
4
4
  "description": "A high-fidelity agentic terminal assistant for the Flux Era.",
5
5
  "keywords": [
6
6
  "ai",