mini-coder 0.0.7 → 0.0.9

# Codex Autonomy Issues & Fix Analysis

## Behaviours
When using `zen/gpt-5.3-codex` as the agent, the model consistently exhibits "lazy" or permission-seeking behaviour. Specifically:
1. **Initial Compliance**: It starts by reading files or globbing the directory.
2. **Immediate Stall**: Instead of executing edits or implementing the plan, it outputs multi-paragraph text explaining what it *plans* to do and ends the turn.
3. **Permission Seeking**: It explicitly asks the user for permission (e.g., "Reply **'proceed'** and I'll start implementing batch 1").
4. **Ralph Mode Incompatibility**: In `/ralph` mode, the agent loops continuously. Because it restarts with a fresh context on each loop and stalls after gathering context, it never actually writes any files; it just repeats the same read-and-plan phase until it hits the max iteration limit.
5. **Model Differences**: Neither Claude nor Gemini models exhibit this behaviour; they have not been subjected to the same conversational RLHF that pushes the model to ask the user to double-check its work.

## Root Cause Analysis
An analysis of both OpenAI's open-source `codex-rs` client and the `opencode` source code reveals that Codex models (like `gpt-5.3-codex`) are highly RLHF-tuned for safety and collaborative pair-programming. By default, the model prefers to break tasks into chunks and explicitly ask for sign-off.

To override this, the model requires the following, which `mini-coder` was failing to provide correctly:

### 1. Dual-Anchored System Prompts (`system` + `instructions`)
`mini-coder` implemented a `useInstructions` check that placed the system prompt into the `instructions` field of the `/v1/responses` API payload. However, doing so stripped the `system`-role message from the conversation context (the `input` array).

Both `opencode` and `codex-rs` ensure that the context array *also* contains the system prompt:
- `opencode` maps its environment variables and system instructions to `role: "system"` (or `role: "developer"`) inside `input.messages`, **while also** passing behavioural instructions in the `instructions` field of the API payload.
- `codex-rs` directly injects `role: "developer"` into the message list (as seen in `codex-rs/core/src/compact.rs` and their memory-tracing implementations).

Without the `system`/`developer` message anchored at the start of the `input` array, the model deprioritized the standalone `instructions` field, allowing its base permission-seeking behaviour to take over.

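The dual-anchoring described above can be sketched in a few lines. This is a hypothetical illustration, not `mini-coder`'s actual code; the helper name and types are assumptions:

```typescript
// Hypothetical sketch: build a /v1/responses payload that anchors the system
// prompt as a `developer` message at the head of `input` AND mirrors it in the
// standalone `instructions` field, as opencode and codex-rs both do.
type InputMessage = { role: "developer" | "user"; content: unknown };

function buildResponsesPayload(
  systemPrompt: string,
  conversation: InputMessage[],
): { model: string; input: InputMessage[]; instructions: string } {
  return {
    model: "gpt-5.3-codex",
    // Anchor the prompt at the start of the conversation context...
    input: [{ role: "developer", content: systemPrompt }, ...conversation],
    // ...and duplicate it in the standalone instructions field.
    instructions: systemPrompt,
  };
}

const payload = buildResponsesPayload("You are mini-coder.", [
  { role: "user", content: [{ type: "input_text", text: "hello" }] },
]);
console.log(payload.input[0].role); // "developer"
```

The key point is that the same prompt text appears in both places; omitting either half reproduces the stalling behaviour.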
### 2. Explicit "Do Not Ask" Directives
Both `opencode` and `codex-rs` employ heavy anti-permission prompts.
- **opencode** (`session/prompt/codex_header.txt`):
  > "- Default: do the work without asking questions... Never ask permission questions like 'Should I proceed?' or 'Do you want me to run tests?'; proceed with the most reasonable option and mention what you did."
- **codex-rs** (`core/templates/model_instructions/gpt-5.2-codex_instructions_template.md`):
  > "Persist until the task is fully handled end-to-end within the current turn whenever feasible: do not stop at analysis or partial fixes; carry changes through implementation, verification, and a clear explanation of outcomes unless the user explicitly pauses or redirects you."

`mini-coder` introduced `CODEX_AUTONOMY` in a previous commit, but because of Issue #1, it was never adequately anchored in the `input` array.

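The wiring can be sketched as follows. The directive text below is paraphrased for illustration, and the helper name is an assumption; `mini-coder`'s actual `CODEX_AUTONOMY` contents may differ:

```typescript
// Illustrative sketch only: append anti-permission directives to the base
// system prompt, but only for codex-family models, which are the ones tuned
// to ask for sign-off.
const CODEX_AUTONOMY = [
  "Default: do the work without asking permission questions.",
  "Persist until the task is handled end-to-end within the current turn.",
  "Never end the turn with 'Should I proceed?'; act, then report what you did.",
].join("\n");

function composeSystemPrompt(base: string, modelId: string): string {
  // Non-codex models (Claude, Gemini) do not need the extra directives.
  return modelId.includes("codex") ? `${base}\n\n${CODEX_AUTONOMY}` : base;
}

console.log(composeSystemPrompt("You are mini-coder.", "gpt-5.3-codex").split("\n").length); // prints 5
```

Gating on the model id keeps the directives out of prompts for models that already behave autonomously.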
## Evidence & Tests
We introduced a fetch-wrapper interceptor in `src/llm-api/providers.ts` that logs the full outbound API requests to `~/.config/mini-coder/api.log`.

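The interceptor technique can be sketched as a self-contained illustration. The real wrapper in `providers.ts` wraps the provider's `fetch` and appends to `api.log`; here we log to an in-memory array and stub the network, so the names and types below are assumptions:

```typescript
// Hypothetical sketch of a fetch-wrapper interceptor: wrap any
// fetch-compatible function so every outbound request body is recorded
// before the call is forwarded unchanged.
type FetchLike = (url: string, init?: { body?: string }) => Promise<unknown>;

function withRequestLogging(fetchImpl: FetchLike, log: string[]): FetchLike {
  return async (url, init) => {
    // Record the outbound request, then forward it untouched.
    log.push(JSON.stringify({ url, body: init?.body ?? null }));
    return fetchImpl(url, init);
  };
}

// Usage with a stubbed fetch, so no network is needed:
const entries: string[] = [];
const stubFetch: FetchLike = async () => ({ ok: true });
const logged = withRequestLogging(stubFetch, entries);
logged("https://api.openai.com/v1/responses", { body: '{"model":"gpt-5.3-codex"}' })
  .then(() => console.log(entries.length)); // prints 1
```

Because the wrapper sees the final serialized body, it captures exactly what the AI SDK sends, which is what made the missing `developer` message visible.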
A test script, `test-turn.ts`, running a dummy turn showed the exact payload generated by the AI SDK before our fix:
```json
"body": {
  "model": "gpt-5.3-codex",
  "input": [
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "hello" }
      ]
    }
  ],
  "store": false,
  "instructions": "You are a test agent.",
  ...
```

After the fix, the same turn produces a dual-anchored payload, with the prompt both as a `developer` message and in `instructions`:
```json
"body": {
  "model": "gpt-5.3-codex",
  "input": [
    {
      "role": "developer",
      "content": "You are mini-coder, a small and fast CLI coding agent... [CODEX_AUTONOMY directives]"
    },
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "hello" }
      ]
    }
  ],
  "instructions": "You are mini-coder, a small and fast CLI coding agent... [CODEX_AUTONOMY directives]"
}
```
This perfectly mirrors the behaviour seen in `opencode` and `codex-rs`.

## Actions Taken
1. Added an `api.log` request interceptor in `providers.ts` to capture and inspect the exact JSON payloads sent to the OpenAI/AI SDK endpoints.
2. Cloned and analyzed both the `opencode` and `codex` repos to observe how they communicate with `gpt-5.*` codex endpoints.
3. Updated `src/llm-api/turn.ts` so that `system: systemPrompt` is *always* passed to the AI SDK, guaranteeing a `developer` message anchors the `input` array even when `instructions` is also used.
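
Sketched, the fix in step 3 amounts to making the `system` option unconditional. The names below are illustrative, not the actual `turn.ts` code:

```typescript
// Hypothetical sketch of the turn.ts fix: whatever else the turn builds,
// the system prompt is now set unconditionally, so the AI SDK always emits
// a developer-role message at the head of `input`.
interface TurnOptions {
  model: string;
  system?: string;
}

function withAnchoredSystem(opts: TurnOptions, systemPrompt: string): TurnOptions {
  // Before the fix, `system` was omitted when the prompt went into
  // `instructions`; now it is always present alongside it.
  return { ...opts, system: systemPrompt };
}

const opts = withAnchoredSystem({ model: "gpt-5.3-codex" }, "You are mini-coder.");
console.log(opts.system); // prints "You are mini-coder."
```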