npm - codemini-cli - Versions diffs - 0.4.2 → 0.4.3 - Mend

codemini-cli 0.4.2 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.md +4 -2
package/deployment.md +5 -5
package/package.json +1 -2
package/skills/grill-me/SKILL.md +30 -0
package/skills/project-requirements/SKILL.md +245 -0
package/skills/superpowers-lite/SKILL.md +5 -1
package/src/commands/run.js +5 -4
package/src/core/agent-loop.js +8 -8
package/src/core/chat-runtime.js +220 -31
package/src/core/config-store.js +6 -3
package/src/core/fff-adapter.js +1 -1
package/src/core/provider/anthropic.js +2 -2
package/src/core/provider/openai-compatible.js +2 -2
package/src/core/shell.js +1 -1
package/src/core/tools.js +116 -39
package/src/tui/chat-app.js +52 -22
package/src/tui/tool-activity/presenters/system.js +6 -0

package/README.md CHANGED

@@ -110,7 +110,8 @@ Skills are reusable workflow patterns that guide how the agent approaches differ
 | Skill | Trigger | Description |
 |-------|---------|-------------|
-| **superpowers-lite** | Default for all coding work | Lightweight operating style: prefer structured tools, keep context tight, use sub-agents, verify before claiming success |
+| **superpowers-lite** | Default for all coding work | Lightweight operating style: prefer structured tools, keep context tight, use sub-agents, verify before claiming success; asks 1-3 sharp questions only for high-risk decisions |
+| **grill-me** | Explicit pressure-test requests | Optional scrutiny mode for plans, PRs, launches, and ideas; challenges assumptions without changing the default workflow |
 | **brainstorm** | Multiple reasonable approaches exist | Explores options and tradeoffs before coding; asks one question at a time to resolve uncertainty |
 | **writing-plans** | Non-trivial implementation task | Creates a step-by-step plan with exact file paths, code, and verification steps before touching code |
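@@ -382,7 +383,8 @@ Skills are reusable workflow patterns that guide how the agent handles different types of
 | Skill | Trigger | Description |
 |-------|----------|------|
-| **superpowers-lite** | Default skill for all coding work | Lightweight operating style: prefer structured tools, keep context lean, use sub-agents, verify before reporting completion |
+| **superpowers-lite** | Default skill for all coding work | Lightweight operating style: prefer structured tools, keep context lean, use sub-agents, verify before reporting completion; raises 1-3 sharp questions only for high-risk decisions |
+| **grill-me** | When a pressure test or grilling is explicitly requested | Optional review mode for plans, PRs, releases, and ideas; challenges assumptions without changing the default collaborative workflow |
 | **brainstorm** | When multiple reasonable approaches exist | Explores options and tradeoffs before coding; asks only one question at a time to resolve uncertainty |
 | **writing-plans** | Non-trivial implementation tasks | Before touching code, creates a step-by-step plan with exact file paths, code, and verification steps |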

package/deployment.md CHANGED

@@ -13,13 +13,13 @@ npm pack
 Expected output:
 ```text
-codemini-cli-0.4.2.tgz
+codemini-cli-0.4.3.tgz
 ```
 If you want to verify the package contents:
 ```bash
-tar -tf codemini-cli-0.4.2.tgz
+tar -tf codemini-cli-0.4.3.tgz
 ```
 ## 2. Copy To The Target Machine
@@ -34,7 +34,7 @@ Copy the generated `.tgz` file to the Win10 machine by one of these methods:
 Recommended target path:
 ```powershell
-C:\temp\codemini-cli-0.4.2.tgz
+C:\temp\codemini-cli-0.4.3.tgz
 ```
 ## 3. Environment Requirements
@@ -58,7 +58,7 @@ npm -v
 Global install:
 ```powershell
-npm install -g C:\temp\codemini-cli-0.4.2.tgz
+npm install -g C:\temp\codemini-cli-0.4.3.tgz
 ```
 If global install is blocked by company policy, install in a working directory instead:
@@ -66,7 +66,7 @@ If global install is blocked by company policy, install in a working directory i
 ```powershell
 mkdir C:\temp\coder-test
 cd C:\temp\coder-test
-npm install C:\temp\codemini-cli-0.4.2.tgz
+npm install C:\temp\codemini-cli-0.4.3.tgz
 ```
 ## 5. Confirm Installation

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "codemini-cli",
-  "version": "0.4.2",
+  "version": "0.4.3",
   "description": "Coding CLI optimized for small-model workflows and Windows PowerShell",
   "keywords": [
     "cli",
@@ -49,7 +49,6 @@
     "@cursorless/tree-sitter-wasms": "^0.8.1",
     "cheerio": "^1.1.2",
     "cli-truncate": "^6.0.0",
-    "duck-duck-scrape": "^2.2.7",
     "ink": "^7.0.0",
     "react": "^19.2.5",
     "strip-ansi": "^7.2.0",

package/skills/grill-me/SKILL.md ADDED

@@ -0,0 +1,30 @@
+---
+name: grill-me
+description: Optional pressure-test mode for plans, architecture choices, PRs, launches, and product ideas: challenge assumptions without changing the default collaborative workflow.
+version: 0.1.0
+---
+Use this skill only when the user explicitly asks to be grilled, challenged, pressure-tested, stress-tested, or reviewed with unusually direct scrutiny.
+## Stance
+Be direct, but keep the target clear: challenge the work, not the person. The goal is better judgment, not dominance or theater.
+## Process
+1. Identify the claim, plan, design, PR, launch, or decision under review.
+2. State the highest-risk assumption first.
+3. Ask 3-7 pointed questions, ordered by risk.
+4. Call out missing evidence, weak verification, unclear ownership, rollback gaps, and hidden dependencies.
+5. End with a short verdict:
+   - `Ship`: risks are understood and verification is credible.
+   - `Revise`: the direction is good, but one or more issues should be fixed first.
+   - `Stop`: a core assumption is unproven or the blast radius is too high.
+## Boundaries
+- Do not insult, mock, or psychoanalyze the user.
+- Do not turn every normal coding task into a cross-examination.
+- Do not invent requirements. If context is missing, ask for the missing artifact or state the assumption.
+- Prefer concrete tests, rollback paths, and observable acceptance criteria over vague caution.
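
Note on the added skill: the verdict rule in step 5 and the risk ordering in step 3 can be read as a small decision function. The sketch below is illustrative only. The skill itself is prose guidance for the model, and `pickVerdict`, `orderQuestions`, and every field name here are hypothetical, not part of codemini-cli.

```javascript
// Hypothetical sketch of the grill-me verdict rule; none of these names exist in codemini-cli.
function pickVerdict({ coreAssumptionProven, blastRadiusTooHigh, openIssues }) {
  // `Stop`: a core assumption is unproven or the blast radius is too high.
  if (!coreAssumptionProven || blastRadiusTooHigh) return 'Stop';
  // `Revise`: the direction is good, but one or more issues should be fixed first.
  if (openIssues.length > 0) return 'Revise';
  // `Ship`: risks are understood and verification is credible.
  return 'Ship';
}

// Step 3: questions are asked in descending risk order (higher `risk` first).
function orderQuestions(questions) {
  return [...questions].sort((a, b) => b.risk - a.risk);
}

const questions = orderQuestions([
  { text: 'What is the rollback path?', risk: 2 },
  { text: 'Which assumption is unverified?', risk: 5 }
]);
console.log(questions[0].text);
console.log(pickVerdict({ coreAssumptionProven: true, blastRadiusTooHigh: false, openIssues: ['no rollback plan'] }));
```

Checking `Stop` first mirrors the skill's ordering: an unproven core assumption outweighs any number of fixable issues.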

package/skills/project-requirements/SKILL.md ADDED

@@ -0,0 +1,245 @@
+---
+name: project-requirements
+description: Generate an interactive project requirements report from an existing codebase. Use when the user asks for a PRD, requirements document, API-by-API breakdown, business flow, architecture map, dependency graph, flowchart, product requirements reverse-engineering, or detailed project demand analysis.
+version: 0.1.0
+---
+Use this skill to reverse-engineer a project into a requirements document that product, engineering, and QA can navigate.
+Default to an HTML report with lightweight interactions. Produce Markdown only when the user asks for a text-first artifact, a PR-friendly source document, or an additional companion file.
+User request:
+```text
+{{args}}
+```
+Honor any concrete user request above, such as output format, report path, focus area, API subset, diagram style, or language. If it is empty, generate the default HTML requirements report for the current workspace.
+## Output
+Create the primary report at:
+```text
+docs/requirements/YYYY-MM-DD-project-requirements.html
+```
+If a companion Markdown file is useful, create:
+```text
+docs/requirements/YYYY-MM-DD-project-requirements.md
+```
+The HTML should be self-contained: inline CSS, inline JavaScript, no build step, no required external assets.
+Diagrams must be visible when the HTML is opened directly from disk:
+- Prefer inline SVG for architecture maps, dependency graphs, sequence summaries, and state diagrams.
+- Use semantic SVG groups, `<title>`/`<desc>`, readable labels, arrow markers, and stable element ids so sections can link to diagram nodes.
+- For simple hierarchy diagrams, CSS grid/flex boxes with connector lines are also acceptable.
+- Do not rely on Mermaid rendering as the only visible diagram. Mermaid source may be included in a collapsible `<details>` block as an editable source-of-truth companion.
+- Use Mermaid CDN rendering only as optional progressive enhancement when the user accepts network access. The static inline SVG or CSS diagram must remain the fallback and primary offline view.
+- Avoid showing only raw Mermaid code blocks in the final HTML unless the user explicitly asks for source-only diagrams.
+For medium or large projects, do not generate the entire HTML document in one model response or one huge `write` call. Create the report incrementally:
+1. Write a complete HTML shell first: `doctype`, `<head>`, inline CSS, navigation container, empty main sections, inline script, and closing tags.
+2. Add each major section with smaller `edit` insertions before a stable marker such as `<!-- REQUIREMENTS_SECTIONS -->`.
+3. Keep each write/edit chunk focused: one section, one API group, or one diagram at a time.
+4. After each chunk, preserve valid HTML and keep the marker in place until the final cleanup.
+5. In the final pass, remove unused markers and verify the file can be opened directly from disk.
+This chunked approach is required for HTML reports because inline CSS, JavaScript, diagrams, and API cards can become much larger than Markdown. It also gives the user immediate visible tool progress instead of waiting for one giant generated tool call.
+## Process
+1. Inspect the project before writing:
+   - Read top-level docs such as `README.md`, `OPERATIONS.md`, `docs/`, and deployment notes.
+   - Identify the stack from package manifests, route files, command handlers, API clients, database modules, schemas, and tests.
+   - Search with `rg` for routes, handlers, controllers, commands, schemas, migrations, HTTP verbs, RPC methods, queue handlers, and CLI subcommands.
+2. Build an evidence map:
+   - `EXTRACTED`: behavior directly supported by source code, docs, tests, config, or schemas.
+   - `INFERRED`: reasonable product requirement inferred from code relationships.
+   - `UNKNOWN`: requirement, owner, actor, edge case, or business rule that needs user confirmation.
+3. Decompose by API or interface first:
+   - HTTP API endpoints.
+   - CLI commands and subcommands.
+   - Tool calls, MCP handlers, RPC methods, queue jobs, scheduled tasks, or exported SDK functions.
+   - UI flows only after the backend/interface layer is mapped, unless the project is frontend-only.
+4. Connect each API/interface to requirements:
+   - User goal and actor.
+   - Trigger and entry point.
+   - Request/input shape.
+   - Response/output shape.
+   - Validation and permission rules.
+   - Data read/write behavior.
+   - Internal modules called.
+   - External services or files touched.
+   - Error cases and retry/rollback behavior.
+   - Observability, audit, and security notes.
+   - Acceptance criteria.
+5. Generate diagrams:
+   - Product flowchart for the main user journey.
+   - API dependency graph linking endpoints/commands to modules, data stores, and external services.
+   - Sequence diagram for at least one high-value flow.
+   - State or lifecycle diagram when the domain has clear states.
+   - Render each diagram as static inline SVG or CSS boxes in the HTML, with optional Mermaid source hidden in a collapsible details block.
+6. Write the report and preserve traceability:
+   - Link sections with stable anchors.
+   - Include code file paths for evidence.
+   - Mark inferred or unknown content visibly.
+   - Avoid pretending uncertain requirements are confirmed.
+   - For HTML output, write the shell first, then append/insert sections incrementally instead of producing one large complete file in a single tool call.
+## HTML Structure
+Use this structure unless the project suggests a better one:
+1. Executive summary.
+2. System map with a high-level static SVG or CSS architecture diagram.
+3. API/interface inventory with filters or grouped navigation.
+4. Per-API requirement cards.
+5. Core user flows with diagrams.
+6. Domain model and data ownership.
+7. Permissions, security, and compliance notes.
+8. Error handling and edge cases.
+9. Non-functional requirements.
+10. Open questions and `UNKNOWN` items.
+11. Source evidence index.
+## Interaction Guidelines
+Implement useful interactions with plain JavaScript:
+- Sticky table of contents.
+- Search/filter input for APIs, modules, and tags.
+- Expand/collapse details for each API.
+- Anchor links for every API and flow.
+- Evidence tags: `EXTRACTED`, `INFERRED`, `UNKNOWN`.
+- Back-to-top links for long reports.
+- Optional "show only open questions" toggle.
+Keep interactions accessible:
+- Use semantic headings, buttons, tables, and lists.
+- Make controls keyboard reachable.
+- Do not hide critical content behind JavaScript-only rendering.
+- Ensure the document remains readable if JavaScript is disabled.
+## API Section Template
+For each API, command, handler, or externally visible interface, include:
+```text
+Name:
+Type:
+Route/command/function:
+Evidence:
+Actor:
+Goal:
+Inputs:
+Outputs:
+Preconditions:
+Main flow:
+Alternative flows:
+Validation:
+Permissions:
+Data reads:
+Data writes:
+Internal dependencies:
+External dependencies:
+Errors:
+Observability:
+Acceptance criteria:
+Open questions:
+```
+## Diagram Patterns
+Use static diagrams when diagrams help compress complexity. In HTML output, render the visible diagram as inline SVG or CSS boxes. Include Mermaid only as optional source text when it helps future editing.
+Inline SVG architecture map:
+```html
+<figure class="diagram" id="system-architecture">
+  <figcaption>System architecture</figcaption>
+  <svg viewBox="0 0 960 520" role="img" aria-labelledby="arch-title arch-desc">
+    <title id="arch-title">System architecture</title>
+    <desc id="arch-desc">CLI commands call runtime services, which use tools and data stores.</desc>
+    <defs>
+      <marker id="arrow" markerWidth="10" markerHeight="10" refX="8" refY="3" orient="auto">
+        <path d="M0,0 L0,6 L9,3 z"></path>
+      </marker>
+    </defs>
+    <g id="cli-layer">
+      <rect x="40" y="40" width="220" height="90" rx="8"></rect>
+      <text x="60" y="90">CLI Entry</text>
+    </g>
+    <g id="runtime-layer">
+      <rect x="370" y="40" width="240" height="90" rx="8"></rect>
+      <text x="390" y="90">Runtime</text>
+    </g>
+    <line x1="260" y1="85" x2="370" y2="85" marker-end="url(#arrow)"></line>
+  </svg>
+</figure>
+```
+CSS box architecture map:
+```html
+<section class="arch-map" aria-label="System architecture">
+  <a class="arch-node" href="#api-chat">Chat command</a>
+  <span class="arch-edge" aria-hidden="true">-></span>
+  <a class="arch-node" href="#runtime-agent-loop">Agent loop</a>
+  <span class="arch-edge" aria-hidden="true">-></span>
+  <a class="arch-node" href="#tools-write">Tools</a>
+</section>
+```
+Optional Mermaid companion:
+Product flow:
+```mermaid
+flowchart TD
+  A[User starts task] --> B[System validates input]
+  B --> C[System performs core action]
+  C --> D[User receives result]
+```
+API dependency map:
+```mermaid
+graph LR
+  API[API or command] --> Handler[Handler]
+  Handler --> Service[Service]
+  Service --> Store[(Data store)]
+  Service --> External[External service]
+```
+Sequence flow:
+```mermaid
+sequenceDiagram
+  participant User
+  participant API
+  participant Service
+  participant Store
+  User->>API: Request
+  API->>Service: Validate and execute
+  Service->>Store: Read/write data
+  Store-->>Service: Result
+  Service-->>API: Domain result
+  API-->>User: Response
+```
+## Quality Bar
+The report is complete when:
+- A reader can find every major API or user-facing interface from the navigation.
+- Each interface has at least one source evidence path.
+- Main flows and dependencies are represented both in text and diagrams.
+- Inferred requirements are labeled instead of stated as facts.
+- Open questions are grouped so the user can resolve them later.
+- The HTML can be opened directly from disk in a browser.
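
Note on the added skill: the chunked-write procedure (shell first, then insertions before `<!-- REQUIREMENTS_SECTIONS -->`, then cleanup) can be sketched as plain string operations. This is a minimal sketch under assumptions: the skill expects the agent to do this through its `write` and `edit` tools, and the helper names `buildShell`, `insertSection`, and `finalize` are hypothetical.

```javascript
// Hypothetical helpers illustrating the chunked-write steps; not part of codemini-cli.
const MARKER = '<!-- REQUIREMENTS_SECTIONS -->';

// Step 1: a complete, valid HTML shell that already contains the marker.
// `title` is assumed to be trusted text (no escaping is done in this sketch).
function buildShell(title) {
  return [
    '<!doctype html>',
    '<html lang="en">',
    `<head><meta charset="utf-8"><title>${title}</title></head>`,
    '<body>',
    '<main>',
    MARKER,
    '</main>',
    '</body>',
    '</html>'
  ].join('\n');
}

// Steps 2-4: insert one section before the marker; the marker stays in place,
// so the document is valid HTML after every chunk.
function insertSection(html, sectionHtml) {
  const at = html.indexOf(MARKER);
  if (at === -1) throw new Error('marker not found');
  return `${html.slice(0, at)}${sectionHtml}\n${html.slice(at)}`;
}

// Step 5: the final pass removes the marker line.
function finalize(html) {
  return html.replace(`${MARKER}\n`, '');
}

let report = buildShell('Project requirements');
report = insertSection(report, '<section id="summary"><h2>Executive summary</h2></section>');
report = insertSection(report, '<section id="api-inventory"><h2>API inventory</h2></section>');
console.log(finalize(report));
```

Because each insertion lands immediately before the marker, sections appear in the order they were added, and the file can be opened from disk at any intermediate step.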

package/skills/superpowers-lite/SKILL.md CHANGED

@@ -1,11 +1,13 @@
 ---
 name: superpowers-lite
 description: Concise workflow skill tuned for 30B-class models: prefer structured code tools first, keep context tight, use sub-agents for narrow tasks, and verify before claiming success.
-version: 0.2.0
+version: 0.3.0
 ---
 Use this skill as the default lightweight operating style for all coding work.
+This is the default, not an interrogation mode. Keep help calm and direct. For high-risk decisions only, add a light Grill Me pass: ask 1-3 sharp questions about assumptions, failure modes, or verification before proceeding. Challenge the plan, not the person.
 **Announce when using a skill:** Before following any route below, say "Using [skill name] to [purpose]" in your response. This signals intent and prevents silent skill skipping.
 ## Mandatory Skill Check
@@ -83,6 +85,8 @@ Evaluate the user's request and YOU MUST follow exactly one route:
 5. **Verify before claiming success.** Run the relevant test or command before saying work is done.
+6. **Use sharp questions sparingly.** For high-risk work, ask 1-3 sharp questions that expose assumptions or likely failure modes. For ordinary tasks, stay lightweight and keep moving.
 ## Sub-agent Guidance
 - `planner`: break work into steps, risks, and checks

package/src/commands/run.js CHANGED

@@ -11,6 +11,7 @@ import path from 'node:path';
 const ROLE_TOOL_POLICY = {
   planner: ['read', 'grep', 'list', 'query_project_index', 'tool_search', 'glob', 'ast_query', 'read_ast_node', 'read_plan', 'update_plan'],
+  advisor: ['read', 'grep', 'list', 'query_project_index', 'tool_search', 'read_plan'],
   coder: ['read', 'grep', 'list', 'edit', 'write', 'run', 'ast_query', 'read_ast_node', 'glob', 'tool_search', 'update_todos', 'read_plan', 'update_plan'],
   reviewer: ['read', 'grep', 'list', 'glob', 'tool_search', 'ast_query', 'read_ast_node', 'read_plan'],
   tester: ['read', 'grep', 'list', 'run', 'glob', 'tool_search', 'read_plan']
@@ -70,7 +71,7 @@ function makeCompletionFn(config) {
       model,
       messages,
       tools,
-      timeoutMs: config.gateway.timeout_ms || 90000,
+      timeoutMs: config.gateway.timeout_ms || 1800000,
       maxRetries: config.gateway.max_retries ?? 2
     });
 }
@@ -142,11 +143,11 @@ function normalizePlan(parsed, goal) {
 async function planPipeline({ goal, config, systemPrompt, model }) {
   const plannerPrompt = [
     'Create an execution plan and assign the best sub-agent role for each step.',
-    'Return strict JSON only with shape {"summary":"...","steps":[{"title":"...","role":"planner|coder|reviewer|tester","task":"..."}]}. No markdown.',
+    'Return strict JSON only with shape {"summary":"...","steps":[{"title":"...","role":"planner|advisor|coder|reviewer|tester","task":"..."}]}. No markdown.',
     `Available roles: ${HARNESS_ROLES.join(', ')}.`,
     'Prefer 3-5 steps total. The first step should usually inspect the target area.',
     'For implementation goals, include a reviewer or tester step near the end.',
-    'For advisory/analysis goals, keep it lean with planner/coder only.'
+    'For advisory/analysis goals, keep it lean with planner/advisor only; do not use coder unless code or files will be modified.'
   ].join('\n');
   const planning = await createChatCompletion({
@@ -158,7 +159,7 @@ async function planPipeline({ goal, config, systemPrompt, model }) {
       { role: 'system', content: `${systemPrompt}\n${plannerPrompt}` },
       { role: 'user', content: `Plan the following task:\n${goal}` }
     ],
-    timeoutMs: config.gateway.timeout_ms || 90000,
+    timeoutMs: config.gateway.timeout_ms || 1800000,
     maxRetries: config.gateway.max_retries ?? 2
   });
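
Note on this change: the default timeout fallback rises from 90000 ms (90 seconds) to 1800000 ms (30 minutes), and the two fallbacks use different operators. The sketch below copies both expressions from the diff to show how they treat a configured `0`. The wrapper `resolveGatewayOptions` and the `filterTools` helper are hypothetical names, and how `run.js` actually applies `ROLE_TOOL_POLICY` is not visible in this diff.

```javascript
// Fallback expressions copied from the 0.4.3 diff; `resolveGatewayOptions` is a hypothetical wrapper.
function resolveGatewayOptions(gateway = {}) {
  return {
    // `||` falls back on any falsy value, so timeout_ms: 0 also yields 1800000 ms (30 minutes).
    timeoutMs: gateway.timeout_ms || 1800000,
    // `??` falls back only on null/undefined, so max_retries: 0 is respected.
    maxRetries: gateway.max_retries ?? 2
  };
}

// Advisor entry copied from ROLE_TOOL_POLICY in the diff: read-only tools, no edit/write/run.
const ADVISOR_TOOLS = ['read', 'grep', 'list', 'query_project_index', 'tool_search', 'read_plan'];

// Hypothetical filter: keep only tool definitions whose name is allowed for the role.
function filterTools(tools, allowedNames) {
  return tools.filter((tool) => allowedNames.includes(tool.name));
}

console.log(resolveGatewayOptions({ timeout_ms: 0, max_retries: 0 }));
console.log(filterTools([{ name: 'read' }, { name: 'write' }], ADVISOR_TOOLS));
```

With these inputs, the timeout resolves to the 30-minute default while the retry count stays at `0`, and only the `read` tool survives the advisor filter.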

package/src/core/agent-loop.js CHANGED

@@ -179,6 +179,10 @@ const DREAM_AUTO_CAPTURE_TOOLS = new Set([
 const DREAM_AUTO_CAPTURE_COOLDOWN_MS = 60_000;
 const lastAutoCaptureByTool = new Map();
+function isAutoCaptureEnabled(config = {}) {
+  return config?.memory?.enabled !== false && config?.memory?.auto_capture !== false;
+}
 function shouldAutoCaptureError(toolName, message) {
   if (!DREAM_AUTO_CAPTURE_TOOLS.has(toolName)) return false;
   const now = Date.now();
@@ -196,10 +200,6 @@ function shouldAutoCaptureError(toolName, message) {
     /command not found/i,
     /permission denied/i,
     /args\?\s/i,
-    /Raw tool arguments/i,
-    /edit requires/i,
-    /write requires/i,
-    /requires file/i,
     /path.*outside workspace/i,
     /escapes workspace/i
   ];
@@ -209,7 +209,7 @@ function shouldAutoCaptureError(toolName, message) {
 }
 async function captureToolFailure(toolName, message, args, config = {}) {
-  if (config?.memory?.enabled === false || config?.memory?.auto_capture === false) return;
+  if (!isAutoCaptureEnabled(config)) return;
   const summary = `[${toolName}] ${String(message).slice(0, 120)}`;
   const details = args
     ? `Tool: ${toolName}\nError: ${message}\nArgs: ${JSON.stringify(args).slice(0, 300)}`
@@ -805,7 +805,7 @@ export async function runAgentLoop({
         if (onEvent) {
           onEvent({ type: 'tool:error', name: displayName, id: call.id, arguments: effectiveArgs, durationMs, summary: trimInline(message, 120) });
         }
-        if (shouldAutoCaptureError(toolName, message)) {
+        if (isAutoCaptureEnabled(config) && shouldAutoCaptureError(toolName, message)) {
           await captureToolFailure(toolName, message, effectiveArgs, config).catch(() => {});
         }
         return {
@@ -828,13 +828,13 @@ export async function runAgentLoop({
         const stderr = String(toolResult.stderr || '');
         if (typeof exitCode === 'number' && exitCode !== 0 && stderr) {
           const failMsg = `exit ${exitCode}: ${stderr.slice(0, 120)}`;
-          if (shouldAutoCaptureError(toolName, failMsg)) {
+          if (isAutoCaptureEnabled(config) && shouldAutoCaptureError(toolName, failMsg)) {
             await captureToolFailure(toolName, failMsg, effectiveArgs, config).catch(() => {});
           }
         }
         if (toolResult.error) {
           const errMsg = String(toolResult.error).slice(0, 120);
-          if (shouldAutoCaptureError(toolName, errMsg)) {
+          if (isAutoCaptureEnabled(config) && shouldAutoCaptureError(toolName, errMsg)) {
             await captureToolFailure(toolName, errMsg, effectiveArgs, config).catch(() => {});
           }
         }
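
Note on this change: `isAutoCaptureEnabled` treats auto-capture as on unless `memory.enabled` or `memory.auto_capture` is explicitly `false`, and the call sites now check it before calling `shouldAutoCaptureError`. The function body below is copied from the diff; the sample configs are illustrative.

```javascript
// Copied from the 0.4.3 diff of agent-loop.js.
function isAutoCaptureEnabled(config = {}) {
  return config?.memory?.enabled !== false && config?.memory?.auto_capture !== false;
}

// Illustrative configs: only an explicit `false` disables capture.
const samples = [
  {},                                  // no memory section: enabled
  { memory: {} },                      // empty memory section: enabled
  { memory: { enabled: false } },      // memory off: disabled
  { memory: { auto_capture: false } }, // auto-capture off: disabled
  { memory: { enabled: true, auto_capture: true } } // both on: enabled
];

for (const config of samples) {
  console.log(JSON.stringify(config), isAutoCaptureEnabled(config));
}
```

Because the checks use `!== false`, a missing or `undefined` setting counts as enabled. This matches the inline condition that `captureToolFailure` used in 0.4.2.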