npm - @link-assistant/hive-mind - Versions diffs - 1.37.3 → 1.38.0 - Mend

@link-assistant/hive-mind 1.37.3 → 1.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/CHANGELOG.md +19 -0
package/package.json +1 -1
package/src/agent.prompts.lib.mjs +21 -9
package/src/claude.budget-stats.lib.mjs +258 -0
package/src/claude.lib.mjs +60 -125
package/src/claude.prompts.lib.mjs +28 -12
package/src/codex.prompts.lib.mjs +18 -10
package/src/github.lib.mjs +9 -9
package/src/opencode.prompts.lib.mjs +16 -8
package/src/solve.mjs +7 -9
package/src/solve.results.lib.mjs +19 -1

package/CHANGELOG.md CHANGED

@@ -1,5 +1,24 @@
 # @link-assistant/hive-mind
+## 1.38.0
+### Minor Changes
+- ee331ef: Enhance --tokens-budget-stats with sub-session tracking, stream comparison, and GitHub comment display
+## 1.37.4
+### Patch Changes
+- 72bbb31: Add emphasis on reproducible automated testing in system prompts
+  - Add new "Reproducible testing" section to all prompt files (claude, agent, codex, opencode)
+  - Update "Solution development and testing" to emphasize test-first approach
+  - Enhance Playwright MCP guidelines with UI bug reproduction workflow
+  - Enhance Visual UI work section with before/after screenshot guidelines
+  - Fix spelling and grammar issues across all prompt files
+  - Soften forceful language to use recommendation style ("When x, do y.")
+  - Add comprehensive case study for issue #1179 documenting best practices
 ## 1.37.3
 ### Patch Changes

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@link-assistant/hive-mind",
-  "version": "1.37.3",
+  "version": "1.38.0",
   "description": "AI-powered issue solver and hive mind for collaborative problem solving",
   "main": "src/hive.mjs",
   "type": "module",

package/src/agent.prompts.lib.mjs CHANGED

@@ -144,7 +144,7 @@ ${getExperimentsExamplesSubPrompt(argv)}
 Initial research.
    - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
    - When you read issue, read all details and comments thoroughly.
-   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
+   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
    - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
    - When you need related code, use gh search code --owner ${owner} [keywords].
    - When you need repo context, read files in your working directory.${
@@ -157,16 +157,16 @@ Initial research.
    - When accessing GitHub Gists, use gh gist view command instead of direct URL fetching.
    - When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
    - When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
-   - When you need comments on a pull request, note that GitHub has THREE different comment types with different API endpoints:
+   - When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
       1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
       2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
       3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
-      IMPORTANT: The command "gh pr view --json comments" ONLY returns conversation comments and misses review comments!
+      Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
    - When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.
 Solution development and testing.
-   - When issue is solvable, implement code with tests.
-   - When coding, each atomic step that can be useful by itself should be commited to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
+   - When issue is solvable, first create a test that reproduces the problem, then implement the fix.
+   - When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
    - When you test:
       start from testing of small functions using separate scripts;
       write unit tests with mocks for easy and quick start.
@@ -175,9 +175,17 @@ Solution development and testing.
    - When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
    - When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
    - When issue is unclear, write comment on issue asking questions.
-   - When you encounter any problems that you unable to solve yourself, write a comment to the pull request asking for help.
+   - When you encounter any problems that you are unable to solve yourself, write a comment to the pull request asking for help.
    - When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
+Reproducible testing.
+   - When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
+   - When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
+   - When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
+   - When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
+   - When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
+   - When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
 Preparing pull request.
    - When you code, follow contributing guidelines.
    - When you commit, write clear message.
@@ -217,12 +225,12 @@ Self review.
    - When you finalize, confirm code, tests, and description are consistent.${
      argv && argv.promptEnsureAllRequirementsAreMet
        ? `
-   - When no explicit feedback or requirements is provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
+   - When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
        : ''
    }
 GitHub CLI command patterns.
-   - IMPORTANT: Always use --paginate flag when fetching lists from GitHub API to ensure all results are returned (GitHub returns max 30 per page by default).
+   - When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
    - When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
    - When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
    - When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
@@ -239,7 +247,11 @@ Visual UI work and screenshots.
    - When you need to show visual results, take a screenshot and save it to the repository (e.g., in a docs/screenshots/ or assets/ folder).
    - When you save screenshots to the repository, use permanent links in the pull request description markdown (e.g., https://github.com/${owner}/${repo}/blob/${branchName}/docs/screenshots/result.png?raw=true).
    - When uploading images, commit them to the branch first, then reference them using the GitHub blob URL format with ?raw=true suffix (works for both public and private repositories).
-   - When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.`
+   - When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.
+   - When fixing UI bugs, capture both the "before" (problem) and "after" (fixed) screenshots as evidence for human verification.
+   - When reporting UI bugs, include a screenshot of the problem state to enable visual verification of the fix.
+   - When the fix is visual, include side-by-side or sequential comparison of before/after states in the PR description.
+   - When possible, create automated visual regression tests to prevent the UI bug from recurring.`
        : ''
    }${ciExamples}${getArchitectureCareSubPrompt(argv)}`;
 };
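
Note on the image-download guideline in the "Initial research" hunk above: the prompt asks the agent to confirm with the `file` command that a downloaded file is really an image, not an HTML page. The same idea can be sketched in JavaScript by inspecting the leading bytes of the file. This is a swapped-in technique for illustration, not what the prompt prescribes, and `looksLikeImage` is a hypothetical helper that does not exist in hive-mind.

```javascript
// Hypothetical helper (not part of hive-mind): classify a downloaded file
// by its leading "magic" bytes instead of calling the `file` command.
const looksLikeImage = bytes => {
  // True when every byte of `sig` matches `bytes` starting at `offset`.
  const startsWith = (sig, offset = 0) => sig.every((b, i) => bytes[offset + i] === b);
  const ascii = s => [...s].map(c => c.charCodeAt(0));

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg';
  if (startsWith(ascii('GIF87a')) || startsWith(ascii('GIF89a'))) return 'gif';
  if (startsWith(ascii('RIFF')) && startsWith(ascii('WEBP'), 8)) return 'webp';
  return null; // e.g. an HTML login page saved under an image file name
};
```

A `null` result corresponds to the "download failed" case in the guideline, where the file should not be passed to the Read tool.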

package/src/claude.budget-stats.lib.mjs CHANGED

@@ -4,6 +4,135 @@
 import { formatNumber } from './claude.lib.mjs';
+/**
+ * Helper: creates a fresh sub-session usage object for tracking tokens between compactification events
+ * @returns {Object} Empty sub-session usage structure
+ */
+export const createEmptySubSessionUsage = () => ({
+  inputTokens: 0,
+  cacheCreationTokens: 0,
+  cacheReadTokens: 0,
+  outputTokens: 0,
+  messageCount: 0,
+});
+/**
+ * Helper: accumulates token usage from a JSONL entry into a model usage map
+ * @param {Object} modelUsageMap - Map of model ID to usage data
+ * @param {Object} entry - Parsed JSONL entry with message.usage and message.model
+ */
+export const accumulateModelUsage = (modelUsageMap, entry) => {
+  const model = entry.message.model;
+  if (model.startsWith('<') && model.endsWith('>')) return; // Issue #1486: skip <synthetic> etc.
+  const usage = entry.message.usage;
+  if (!modelUsageMap[model]) {
+    modelUsageMap[model] = {
+      inputTokens: 0,
+      cacheCreationTokens: 0,
+      cacheCreation5mTokens: 0,
+      cacheCreation1hTokens: 0,
+      cacheReadTokens: 0,
+      outputTokens: 0,
+      webSearchRequests: 0,
+    };
+  }
+  if (usage.input_tokens) modelUsageMap[model].inputTokens += usage.input_tokens;
+  if (usage.cache_creation_input_tokens) modelUsageMap[model].cacheCreationTokens += usage.cache_creation_input_tokens;
+  if (usage.cache_creation) {
+    if (usage.cache_creation.ephemeral_5m_input_tokens) modelUsageMap[model].cacheCreation5mTokens += usage.cache_creation.ephemeral_5m_input_tokens;
+    if (usage.cache_creation.ephemeral_1h_input_tokens) modelUsageMap[model].cacheCreation1hTokens += usage.cache_creation.ephemeral_1h_input_tokens;
+  }
+  if (usage.cache_read_input_tokens) modelUsageMap[model].cacheReadTokens += usage.cache_read_input_tokens;
+  if (usage.output_tokens) modelUsageMap[model].outputTokens += usage.output_tokens;
+};
+/**
+ * Display detailed model usage information
+ * @param {Object} usage - Usage data for a model
+ * @param {Function} log - Logging function
+ */
+export const displayModelUsage = async (usage, log) => {
+  // Show all model characteristics if available
+  if (usage.modelInfo) {
+    const info = usage.modelInfo;
+    const fields = [
+      { label: 'Model ID', value: info.id },
+      { label: 'Provider', value: info.provider || 'Unknown' },
+      { label: 'Context window', value: info.limit?.context ? `${formatNumber(info.limit.context)} tokens` : null },
+      { label: 'Max output', value: info.limit?.output ? `${formatNumber(info.limit.output)} tokens` : null },
+      { label: 'Input modalities', value: info.modalities?.input?.join(', ') || 'N/A' },
+      { label: 'Output modalities', value: info.modalities?.output?.join(', ') || 'N/A' },
+      { label: 'Knowledge cutoff', value: info.knowledge },
+      { label: 'Released', value: info.release_date },
+      {
+        label: 'Capabilities',
+        value: [info.attachment && 'Attachments', info.reasoning && 'Reasoning', info.temperature && 'Temperature', info.tool_call && 'Tool calls'].filter(Boolean).join(', ') || 'N/A',
+      },
+      { label: 'Open weights', value: info.open_weights ? 'Yes' : 'No' },
+    ];
+    for (const { label, value } of fields) {
+      if (value) await log(`      ${label}: ${value}`);
+    }
+    await log('');
+  } else {
+    await log('      ⚠️  Model info not available\n');
+  }
+  // Show usage data
+  await log('      Usage:');
+  await log(`        Input tokens: ${formatNumber(usage.inputTokens)}`);
+  if (usage.cacheCreationTokens > 0) {
+    await log(`        Cache creation tokens: ${formatNumber(usage.cacheCreationTokens)}`);
+  }
+  if (usage.cacheReadTokens > 0) {
+    await log(`        Cache read tokens: ${formatNumber(usage.cacheReadTokens)}`);
+  }
+  await log(`        Output tokens: ${formatNumber(usage.outputTokens)}`);
+  if (usage.webSearchRequests > 0) {
+    await log(`        Web search requests: ${usage.webSearchRequests}`);
+  }
+  // Show detailed cost calculation
+  if (usage.costUSD !== null && usage.costUSD !== undefined && usage.costBreakdown) {
+    await log('');
+    await log('      Cost Calculation (USD):');
+    const breakdown = usage.costBreakdown;
+    const types = [
+      { key: 'input', label: 'Input' },
+      { key: 'cacheWrite', label: 'Cache write' },
+      { key: 'cacheRead', label: 'Cache read' },
+      { key: 'output', label: 'Output' },
+    ];
+    for (const { key, label } of types) {
+      if (breakdown[key].tokens > 0) {
+        await log(`        ${label}: ${formatNumber(breakdown[key].tokens)} tokens × $${breakdown[key].costPerMillion}/M = $${breakdown[key].cost.toFixed(6)}`);
+      }
+    }
+    await log('        ─────────────────────────────────');
+    await log(`        Total: $${usage.costUSD.toFixed(6)}`);
+  } else if (usage.modelInfo === null) {
+    await log('');
+    await log('      Cost: Not available (could not fetch pricing)');
+  }
+};
+/**
+ * Display cost comparison between public pricing and Anthropic's official cost
+ * @param {number|null} publicCost - Public pricing estimate
+ * @param {number|null} anthropicCost - Anthropic's official cost
+ * @param {Function} log - Logging function
+ */
+export const displayCostComparison = async (publicCost, anthropicCost, log) => {
+  await log('\n   💰 Cost estimation:');
+  await log(`      Public pricing estimate: ${publicCost !== null && publicCost !== undefined ? `$${publicCost.toFixed(6)} USD` : 'unknown'}`);
+  await log(`      Calculated by Anthropic: ${anthropicCost !== null && anthropicCost !== undefined ? `$${anthropicCost.toFixed(6)} USD` : 'unknown'}`);
+  if (publicCost !== null && publicCost !== undefined && anthropicCost !== null && anthropicCost !== undefined) {
+    const difference = anthropicCost - publicCost;
+    const percentDiff = publicCost > 0 ? (difference / publicCost) * 100 : 0;
+    await log(`      Difference:              $${difference.toFixed(6)} (${percentDiff > 0 ? '+' : ''}${percentDiff.toFixed(2)}%)`);
+  } else {
+    await log('      Difference:              unknown');
+  }
+};
 /**
  * Display token budget statistics (context window usage and ratios)
  * @param {Object} usage - Usage data for a model
@@ -48,3 +177,132 @@ export const displayBudgetStats = async (usage, log) => {
   const totalSessionTokens = usage.inputTokens + usage.cacheCreationTokens + usage.outputTokens;
   await log(`        Total session tokens: ${formatNumber(totalSessionTokens)}`);
 };
+/**
+ * Display sub-session breakdown when compactification events occurred (Issue #1491)
+ * @param {Object} tokenUsage - Token usage data with subSessions and compactifications
+ * @param {Object} modelInfo - Model info with context/output limits
+ * @param {Function} log - Logging function
+ */
+export const displaySubSessionStats = async (tokenUsage, modelInfo, log) => {
+  if (!tokenUsage.subSessions || !tokenUsage.compactifications) return;
+  const contextLimit = modelInfo?.limit?.context;
+  await log(`\n      🔄 Compactification events: ${tokenUsage.compactifications.length}`);
+  for (let i = 0; i < tokenUsage.subSessions.length; i++) {
+    const sub = tokenUsage.subSessions[i];
+    const totalInput = sub.inputTokens + sub.cacheCreationTokens + sub.cacheReadTokens;
+    const label = i === 0 ? 'Initial session' : `After compactification #${i}`;
+    await log(`        Sub-session ${i + 1} (${label}):`);
+    await log(`          Messages: ${sub.messageCount}`);
+    await log(`          Context used: ${formatNumber(totalInput)} tokens`);
+    if (contextLimit) {
+      const pct = ((totalInput / contextLimit) * 100).toFixed(2);
+      await log(`          Context usage: ${pct}% of ${formatNumber(contextLimit)}`);
+    }
+    await log(`          Output: ${formatNumber(sub.outputTokens)} tokens`);
+  }
+  // Show compactification details
+  for (let i = 0; i < tokenUsage.compactifications.length; i++) {
+    const comp = tokenUsage.compactifications[i];
+    let detail = `        Compactification #${i + 1}: trigger=${comp.trigger}`;
+    if (comp.preTokens) detail += `, pre-compaction tokens=${formatNumber(comp.preTokens)}`;
+    await log(detail);
+  }
+};
+/**
+ * Display stream vs JSONL token comparison (Issue #1491)
+ * Shows independent calculation from stream events vs JSONL session file
+ * @param {Object} streamTokenUsage - Token usage accumulated from stream JSON events
+ * @param {Object} jsonlTokenUsage - Token usage calculated from JSONL session file
+ * @param {Function} log - Logging function
+ */
+export const displayTokenComparison = async (streamTokenUsage, jsonlTokenUsage, log) => {
+  if (!streamTokenUsage || !jsonlTokenUsage) return;
+  const streamTotal = streamTokenUsage.inputTokens + streamTokenUsage.cacheCreationTokens + streamTokenUsage.outputTokens;
+  const jsonlTotal = jsonlTokenUsage.inputTokens + jsonlTokenUsage.cacheCreationTokens + jsonlTokenUsage.outputTokens;
+  await log('\n      🔍 Token calculation comparison:');
+  await log(`        Stream JSON events: ${formatNumber(streamTotal)} tokens (${streamTokenUsage.eventCount} events)`);
+  await log(`        JSONL session file: ${formatNumber(jsonlTotal)} tokens`);
+  if (streamTotal !== jsonlTotal) {
+    const diff = jsonlTotal - streamTotal;
+    const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
+    await log(`        Difference: ${formatNumber(Math.abs(diff))} tokens (${diff > 0 ? '+' : ''}${pct}%)`);
+  } else {
+    await log('        Match: calculations are consistent');
+  }
+};
+/**
+ * Build budget stats string for GitHub PR comments (Issue #1491)
+ * Similar to buildCostInfoString but for token budget statistics
+ * @param {Object} tokenUsage - Token usage data from calculateSessionTokens
+ * @param {Object|null} streamTokenUsage - Token usage from stream JSON events
+ * @returns {string} Formatted markdown string for PR comment
+ */
+export const buildBudgetStatsString = (tokenUsage, streamTokenUsage) => {
+  if (!tokenUsage) return '';
+  let stats = '\n\n### 📊 **Token budget statistics:**';
+  // Per-model breakdown
+  if (tokenUsage.modelUsage) {
+    const modelIds = Object.keys(tokenUsage.modelUsage);
+    for (const modelId of modelIds) {
+      const usage = tokenUsage.modelUsage[modelId];
+      const modelName = usage.modelName || modelId;
+      const contextLimit = usage.modelInfo?.limit?.context;
+      const outputLimit = usage.modelInfo?.limit?.output;
+      const totalInput = usage.inputTokens + usage.cacheCreationTokens + usage.cacheReadTokens;
+      if (modelIds.length > 1) stats += `\n- **${modelName}**:`;
+      if (contextLimit) {
+        const contextPct = ((totalInput / contextLimit) * 100).toFixed(2);
+        stats += `\n- Context window: ${totalInput.toLocaleString()} / ${contextLimit.toLocaleString()} tokens (${contextPct}%)`;
+      } else {
+        stats += `\n- Context tokens used: ${totalInput.toLocaleString()}`;
+      }
+      if (outputLimit) {
+        const outputPct = ((usage.outputTokens / outputLimit) * 100).toFixed(2);
+        stats += `\n- Output tokens: ${usage.outputTokens.toLocaleString()} / ${outputLimit.toLocaleString()} tokens (${outputPct}%)`;
+      } else {
+        stats += `\n- Output tokens: ${usage.outputTokens.toLocaleString()}`;
+      }
+    }
+  }
+  // Sub-session breakdown if compactification occurred
+  if (tokenUsage.subSessions && tokenUsage.compactifications) {
+    stats += `\n- Compactifications: ${tokenUsage.compactifications.length}`;
+    for (let i = 0; i < tokenUsage.subSessions.length; i++) {
+      const sub = tokenUsage.subSessions[i];
+      const totalInput = sub.inputTokens + sub.cacheCreationTokens + sub.cacheReadTokens;
+      const label = i === 0 ? 'initial' : `after compactification #${i}`;
+      stats += `\n  - Sub-session ${i + 1} (${label}): ${totalInput.toLocaleString()} context, ${sub.outputTokens.toLocaleString()} output, ${sub.messageCount} messages`;
+    }
+  }
+  // Stream vs JSONL comparison
+  if (streamTokenUsage) {
+    const streamTotal = streamTokenUsage.inputTokens + streamTokenUsage.cacheCreationTokens + streamTokenUsage.outputTokens;
+    const jsonlTotal = tokenUsage.inputTokens + tokenUsage.cacheCreationTokens + tokenUsage.outputTokens;
+    stats += `\n- Own calculation (stream): ${streamTotal.toLocaleString()} tokens (${streamTokenUsage.eventCount} events)`;
+    stats += `\n- JSONL calculation: ${jsonlTotal.toLocaleString()} tokens`;
+    if (streamTotal !== jsonlTotal) {
+      const diff = jsonlTotal - streamTotal;
+      const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
+      stats += ` (diff: ${diff > 0 ? '+' : ''}${pct}%)`;
+    }
+  }
+  return stats;
+};
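
Note on the stream vs JSONL comparison added above: both `displayTokenComparison` and `buildBudgetStatsString` sum input, cache-creation, and output tokens (cache-read tokens are left out of these totals) and report the difference relative to the stream total. The sketch below repeats that arithmetic as a pure function. `totalOf` and `compareTotals` are hypothetical names and the sample numbers are invented.

```javascript
// Illustration only: mirrors the totals and percentage arithmetic from the diff.
// totalOf and compareTotals are hypothetical names; the sample usage is invented.
const totalOf = u => u.inputTokens + u.cacheCreationTokens + u.outputTokens;

const compareTotals = (streamUsage, jsonlUsage) => {
  const streamTotal = totalOf(streamUsage);
  const jsonlTotal = totalOf(jsonlUsage);
  const diff = jsonlTotal - streamTotal;
  // Same guard as in the diff: no percentage when the stream total is zero.
  const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
  return { streamTotal, jsonlTotal, diff, pct };
};

const result = compareTotals(
  { inputTokens: 1000, cacheCreationTokens: 500, outputTokens: 500 },
  { inputTokens: 1100, cacheCreationTokens: 500, outputTokens: 500 }
);
console.log(result); // { streamTotal: 2000, jsonlTotal: 2100, diff: 100, pct: '5.00' }
```

When the stream total is zero, `pct` is the string `'N/A'`, and the formatting code in the diff still appends a `%` sign after it.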

package/src/claude.lib.mjs CHANGED

@@ -12,7 +12,7 @@ import { timeouts, retryLimits, claudeCode, getClaudeEnv, getThinkingLevelToToke
 import { detectUsageLimit, formatUsageLimitMessage } from './usage-limit.lib.mjs';
 import { createInteractiveHandler } from './interactive-mode.lib.mjs';
 import { sanitizeObjectStrings } from './unicode-sanitization.lib.mjs';
-import { displayBudgetStats } from './claude.budget-stats.lib.mjs';
+import { displayBudgetStats, displaySubSessionStats, displayTokenComparison, createEmptySubSessionUsage, accumulateModelUsage, displayModelUsage, displayCostComparison } from './claude.budget-stats.lib.mjs';
 import { buildClaudeResumeCommand } from './claude.command-builder.lib.mjs';
 import { handleClaudeRuntimeSwitch } from './claude.runtime-switch.lib.mjs'; // see issue #1141
 import { CLAUDE_MODELS as availableModels } from './models/index.mjs'; // Issue #1221
@@ -480,91 +480,6 @@ export const calculateModelCost = (usage, modelInfo, includeBreakdown = false) =
   }
   return totalCost;
 };
-/**
- * Display detailed model usage information
- * @param {Object} usage - Usage data for a model
- * @param {Function} log - Logging function
- */
-const displayModelUsage = async (usage, log) => {
-  // Show all model characteristics if available
-  if (usage.modelInfo) {
-    const info = usage.modelInfo;
-    const fields = [
-      { label: 'Model ID', value: info.id },
-      { label: 'Provider', value: info.provider || 'Unknown' },
-      { label: 'Context window', value: info.limit?.context ? `${formatNumber(info.limit.context)} tokens` : null },
-      { label: 'Max output', value: info.limit?.output ? `${formatNumber(info.limit.output)} tokens` : null },
-      { label: 'Input modalities', value: info.modalities?.input?.join(', ') || 'N/A' },
-      { label: 'Output modalities', value: info.modalities?.output?.join(', ') || 'N/A' },
-      { label: 'Knowledge cutoff', value: info.knowledge },
-      { label: 'Released', value: info.release_date },
-      {
-        label: 'Capabilities',
-        value: [info.attachment && 'Attachments', info.reasoning && 'Reasoning', info.temperature && 'Temperature', info.tool_call && 'Tool calls'].filter(Boolean).join(', ') || 'N/A',
-      },
-      { label: 'Open weights', value: info.open_weights ? 'Yes' : 'No' },
-    ];
-    for (const { label, value } of fields) {
-      if (value) await log(`      ${label}: ${value}`);
-    }
-    await log('');
-  } else {
-    await log('      ⚠️  Model info not available\n');
-  }
-  // Show usage data
-  await log('      Usage:');
-  await log(`        Input tokens: ${formatNumber(usage.inputTokens)}`);
-  if (usage.cacheCreationTokens > 0) {
-    await log(`        Cache creation tokens: ${formatNumber(usage.cacheCreationTokens)}`);
-  }
-  if (usage.cacheReadTokens > 0) {
-    await log(`        Cache read tokens: ${formatNumber(usage.cacheReadTokens)}`);
-  }
-  await log(`        Output tokens: ${formatNumber(usage.outputTokens)}`);
-  if (usage.webSearchRequests > 0) {
-    await log(`        Web search requests: ${usage.webSearchRequests}`);
-  }
-  // Show detailed cost calculation
-  if (usage.costUSD !== null && usage.costUSD !== undefined && usage.costBreakdown) {
-    await log('');
-    await log('      Cost Calculation (USD):');
-    const breakdown = usage.costBreakdown;
-    const types = [
-      { key: 'input', label: 'Input' },
-      { key: 'cacheWrite', label: 'Cache write' },
-      { key: 'cacheRead', label: 'Cache read' },
-      { key: 'output', label: 'Output' },
-    ];
-    for (const { key, label } of types) {
-      if (breakdown[key].tokens > 0) {
-        await log(`        ${label}: ${formatNumber(breakdown[key].tokens)} tokens × $${breakdown[key].costPerMillion}/M = $${breakdown[key].cost.toFixed(6)}`);
-      }
-    }
-    await log('        ─────────────────────────────────');
-    await log(`        Total: $${usage.costUSD.toFixed(6)}`);
-  } else if (usage.modelInfo === null) {
-    await log('');
-    await log('      Cost: Not available (could not fetch pricing)');
-  }
-};
-/**
- * Display cost comparison between public pricing and Anthropic's official cost
- * @param {number|null} publicCost - Public pricing estimate
- * @param {number|null} anthropicCost - Anthropic's official cost
- * @param {Function} log - Logging function
- */
-const displayCostComparison = async (publicCost, anthropicCost, log) => {
-  await log('\n   💰 Cost estimation:');
-  await log(`      Public pricing estimate: ${publicCost !== null && publicCost !== undefined ? `$${publicCost.toFixed(6)} USD` : 'unknown'}`);
-  await log(`      Calculated by Anthropic: ${anthropicCost !== null && anthropicCost !== undefined ? `$${anthropicCost.toFixed(6)} USD` : 'unknown'}`);
-  if (publicCost !== null && publicCost !== undefined && anthropicCost !== null && anthropicCost !== undefined) {
-    const difference = anthropicCost - publicCost;
-    const percentDiff = publicCost > 0 ? (difference / publicCost) * 100 : 0;
-    await log(`      Difference:              $${difference.toFixed(6)} (${percentDiff > 0 ? '+' : ''}${percentDiff.toFixed(2)}%)`);
-  } else {
-    await log('      Difference:              unknown');
-  }
-};
 export const calculateSessionTokens = async (sessionId, tempDir) => {
   const os = (await use('os')).default;
   const homeDir = os.homedir();
@@ -582,6 +497,10 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
   }
   // Initialize per-model usage tracking
   const modelUsage = {};
+  // Issue #1491: Track sub-sessions between compactification events
+  const subSessions = [];
+  let currentSubSession = createEmptySubSessionUsage();
+  const compactifications = [];
   try {
     // Read the entire file
     const fileContent = await fs.readFile(sessionFile, 'utf8');
@@ -590,53 +509,39 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
       if (!line.trim()) continue;
       try {
         const entry = JSON.parse(line);
+        // Issue #1491: Detect compactification boundary events
+        if (entry.type === 'system' && entry.subtype === 'compact_boundary') {
+          // Save current sub-session and start a new one
+          if (currentSubSession.messageCount > 0) {
+            subSessions.push(currentSubSession);
+          }
+          compactifications.push({
+            timestamp: entry.timestamp || null,
+            preTokens: entry.compactMetadata?.preTokens || null,
+            trigger: entry.compactMetadata?.trigger || 'unknown',
+          });
+          currentSubSession = createEmptySubSessionUsage();
+          continue;
+        }
         if (entry.message && entry.message.usage && entry.message.model) {
-          const model = entry.message.model;
-          if (model.startsWith('<') && model.endsWith('>')) continue; // Issue #1486: skip <synthetic> etc.
+          accumulateModelUsage(modelUsage, entry);
+          // Issue #1491: Also track per-sub-session usage
           const usage = entry.message.usage;
-          // Initialize model entry if it doesn't exist
-          if (!modelUsage[model]) {
-            modelUsage[model] = {
-              inputTokens: 0,
-              cacheCreationTokens: 0,
-              cacheCreation5mTokens: 0,
-              cacheCreation1hTokens: 0,
-              cacheReadTokens: 0,
-              outputTokens: 0,
-              webSearchRequests: 0,
-            };
-          }
-          // Add input tokens
-          if (usage.input_tokens) {
-            modelUsage[model].inputTokens += usage.input_tokens;
-          }
-          // Add cache creation tokens (total)
-          if (usage.cache_creation_input_tokens) {
-            modelUsage[model].cacheCreationTokens += usage.cache_creation_input_tokens;
-          }
-          // Add cache creation tokens breakdown (5m and 1h)
-          if (usage.cache_creation) {
-            if (usage.cache_creation.ephemeral_5m_input_tokens) {
-              modelUsage[model].cacheCreation5mTokens += usage.cache_creation.ephemeral_5m_input_tokens;
-            }
-            if (usage.cache_creation.ephemeral_1h_input_tokens) {
-              modelUsage[model].cacheCreation1hTokens += usage.cache_creation.ephemeral_1h_input_tokens;
-            }
-          }
-          // Add cache read tokens
-          if (usage.cache_read_input_tokens) {
-            modelUsage[model].cacheReadTokens += usage.cache_read_input_tokens;
-          }
-          // Add output tokens
-          if (usage.output_tokens) {
-            modelUsage[model].outputTokens += usage.output_tokens;
-          }
+          if (usage.input_tokens) currentSubSession.inputTokens += usage.input_tokens;
+          if (usage.cache_creation_input_tokens) currentSubSession.cacheCreationTokens += usage.cache_creation_input_tokens;
+          if (usage.cache_read_input_tokens) currentSubSession.cacheReadTokens += usage.cache_read_input_tokens;
+          if (usage.output_tokens) currentSubSession.outputTokens += usage.output_tokens;
+          currentSubSession.messageCount++;
         }
       } catch {
         // Skip lines that aren't valid JSON
         continue;
       }
     }
+    // Push the final sub-session
+    if (currentSubSession.messageCount > 0) {
+      subSessions.push(currentSubSession);
+    }
     // If no usage data was found, return null
     if (Object.keys(modelUsage).length === 0) {
       return null;
@@ -699,6 +604,9 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
       outputTokens: totalOutputTokens,
       totalTokens,
       totalCostUSD: hasCostData ? totalCostUSD : null,
+      // Issue #1491: Sub-session and compactification data
+      subSessions: subSessions.length > 1 ? subSessions : null, // Only include if compactification occurred
+      compactifications: compactifications.length > 0 ? compactifications : null,
     };
   } catch (readError) {
     throw new Error(`Failed to read session file: ${readError.message}`);
@@ -832,6 +740,14 @@ export const executeClaudeCommand = async params => {
     let errorDuringExecution = false;
     let resultSummary = null;
     let resultModelUsage = null;
+    // Issue #1491: Track token usage from stream JSON events for independent calculation
+    const streamTokenUsage = {
+      inputTokens: 0,
+      cacheCreationTokens: 0,
+      cacheReadTokens: 0,
+      outputTokens: 0,
+      eventCount: 0,
+    };
     // Create interactive mode handler if enabled
     let interactiveHandler = null;
     if (argv.interactiveMode && owner && repo && prNumber) {
@@ -1054,6 +970,15 @@ export const executeClaudeCommand = async params => {
                 lastMessage = data.error || JSON.stringify(data);
                 if (lastMessage.includes('Internal server error')) isInternalServerError = true;
               }
+              // Issue #1491: Track token usage from stream events for independent calculation
+              if (data.type === 'assistant' && data.message && data.message.usage) {
+                const u = data.message.usage;
+                if (u.input_tokens) streamTokenUsage.inputTokens += u.input_tokens;
+                if (u.cache_creation_input_tokens) streamTokenUsage.cacheCreationTokens += u.cache_creation_input_tokens;
+                if (u.cache_read_input_tokens) streamTokenUsage.cacheReadTokens += u.cache_read_input_tokens;
+                if (u.output_tokens) streamTokenUsage.outputTokens += u.output_tokens;
+                streamTokenUsage.eventCount++;
+              }
               if (data.type === 'assistant' && data.message && data.message.content) {
                 const content = Array.isArray(data.message.content) ? data.message.content : [data.message.content];
                 for (const item of content) {
@@ -1336,6 +1261,15 @@ export const executeClaudeCommand = async params => {
                   await displayBudgetStats(usage, log);
                 }
               }
+              // Issue #1491: Display sub-session breakdown if compactification occurred
+              if (argv.tokensBudgetStats && tokenUsage.subSessions) {
+                const primaryModelInfo = Object.values(tokenUsage.modelUsage).find(u => u.modelInfo?.limit)?.modelInfo;
+                await displaySubSessionStats(tokenUsage, primaryModelInfo, log);
+              }
+              // Issue #1491: Display stream vs JSONL token comparison
+              if (argv.tokensBudgetStats && streamTokenUsage.eventCount > 0) {
+                await displayTokenComparison(streamTokenUsage, tokenUsage, log);
+              }
               // Show totals if multiple models were used
               if (modelIds.length > 1) {
                 await log('\n   📈 Total across all models:');
@@ -1381,6 +1315,7 @@ export const executeClaudeCommand = async params => {
         errorDuringExecution, // Issue #1088: Track if error_during_execution subtype occurred
         resultSummary, // Issue #1263: Include result summary for --attach-solution-summary
         resultModelUsage, // Issue #1454
+        streamTokenUsage: streamTokenUsage.eventCount > 0 ? streamTokenUsage : null, // Issue #1491
       };
     } catch (error) {
       reportError(error, {
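
The sub-session tracking added to `calculateSessionTokens` in this file can be pictured with the minimal sketch below. It is not the package's code. The helper name `splitIntoSubSessions` is hypothetical, the fields of `createEmptySubSessionUsage` are inferred from how the diff uses them, and the test for a compaction entry (presence of `compactMetadata`) is an assumption, since the hunk does not show the actual condition.

```javascript
// Sketch only: entry shape and the compaction check are assumptions, not the package's code.
const createEmptySubSessionUsage = () => ({
  inputTokens: 0,
  cacheCreationTokens: 0,
  cacheReadTokens: 0,
  outputTokens: 0,
  messageCount: 0,
});

// Hypothetical helper: walks JSONL lines and starts a new sub-session at each compaction entry.
function splitIntoSubSessions(jsonlLines) {
  const subSessions = [];
  const compactifications = [];
  let current = createEmptySubSessionUsage();

  for (const line of jsonlLines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (!entry || typeof entry !== 'object') continue;

    // Assumption: a compaction entry is recognizable by its compactMetadata field.
    if (entry.compactMetadata) {
      if (current.messageCount > 0) subSessions.push(current);
      compactifications.push({
        timestamp: entry.timestamp || null,
        preTokens: entry.compactMetadata.preTokens || null,
        trigger: entry.compactMetadata.trigger || 'unknown',
      });
      current = createEmptySubSessionUsage();
      continue;
    }

    const usage = entry.message && entry.message.usage;
    if (!usage) continue;
    current.inputTokens += usage.input_tokens || 0;
    current.cacheCreationTokens += usage.cache_creation_input_tokens || 0;
    current.cacheReadTokens += usage.cache_read_input_tokens || 0;
    current.outputTokens += usage.output_tokens || 0;
    current.messageCount++;
  }

  // Push the final sub-session, mirroring the diff above.
  if (current.messageCount > 0) subSessions.push(current);
  return { subSessions, compactifications };
}
```

In the diff, `subSessions` is only returned when more than one sub-session exists, so a session without compaction reports `null` for that field.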

package/src/claude.prompts.lib.mjs CHANGED

@@ -172,7 +172,7 @@ Initial research.
    - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
    - When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
    - When you read issue, read all details and comments thoroughly.
-   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) can cause "Could not process image" errors and will crash the AI solver process. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. Instead: (1) For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) If retry still fails, skip the image and note it was unavailable.
+   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) are read, they can cause "Could not process image" errors and crash the AI solver process. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. Instead: (1) When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) When the retry still fails, skip the image and note it was unavailable.
    - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
    - When you need related code, use gh search code --owner ${owner} [keywords].
    - When you need repo context, read files in your working directory.${
@@ -190,11 +190,11 @@ Initial research.
    - When accessing GitHub Gists (especially private ones), use gh gist view command instead of direct URL fetching to ensure proper authentication.
    - When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
    - When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
-   - When you need comments on a pull request, note that GitHub has THREE different comment types with different API endpoints:
+   - When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
       1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
       2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
       3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
-      IMPORTANT: The command "gh pr view --json comments" ONLY returns conversation comments and misses review comments!
+      Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
    - When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.${
      argv && argv.promptGeneralPurposeSubAgent
        ? `
@@ -208,9 +208,9 @@ Initial research.
    }
 Solution development and testing.
-   - When issue is solvable, implement code with tests.
+   - When issue is solvable, first create a test that reproduces the problem, then implement the fix.
    - When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
-   - When coding, each atomic step that can be useful by itself should be commited to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
+   - When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
    - When you test:
       start from testing of small functions using separate scripts;
       write unit tests with mocks for easy and quick start.
@@ -219,9 +219,17 @@ Solution development and testing.
    - When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
    - When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
    - When issue is unclear, write comment on issue asking questions.
-   - When you encounter any problems that you unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
+   - When you encounter any problems that you are unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
    - When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
+Reproducible testing.
+   - When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
+   - When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
+   - When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
+   - When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
+   - When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
+   - When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
 Preparing pull request.
    - When you code, follow contributing guidelines.
    - When you commit, write clear message.
@@ -229,7 +237,7 @@ Preparing pull request.
    - When you open pr, describe solution draft and include tests.
    - When there is a package with version and GitHub Actions workflows for automatic release, update the version (or other necessary release trigger) in your pull request to prepare for next release.
    - When you update existing pr ${prNumber}, use gh pr edit to modify title and description.
-   - When you are about to commit or push code, ALWAYS run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
+   - When you are about to commit or push code, run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
    - When you finalize the pull request:
       follow style from merged prs for code, title, and description,
       make sure no uncommitted changes corresponding to the original requirements are left behind,
@@ -237,7 +245,7 @@ Preparing pull request.
       make sure all CI checks passing if they exist before you finish,
       check for latest comments on the issue and pull request to ensure no recent feedback was missed,
       double-check that all changes in the pull request answer to original requirements of the issue,
-      make sure no new new bugs are introduced in pull request by carefully reading gh pr diff,
+      make sure no new bugs are introduced in pull request by carefully reading gh pr diff,
       make sure no previously existing features were removed without an explicit request from users via the issue description, issue comments, and/or pull request comments.
    - When you finish implementation, use gh pr ready ${prNumber}.
@@ -260,12 +268,12 @@ Self review.
    - When you finalize, confirm code, tests, and description are consistent.${
      argv && argv.promptEnsureAllRequirementsAreMet
        ? `
-   - When no explicit feedback or requirements is provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
+   - When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
        : ''
    }
 GitHub CLI command patterns.
-   - IMPORTANT: Always use --paginate flag when fetching lists from GitHub API to ensure all results are returned (GitHub returns max 30 per page by default).
+   - When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
    - When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
    - When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
    - When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
@@ -284,7 +292,11 @@ Playwright MCP usage (browser automation via mcp__playwright__* tools).
    - When you need to visually verify how a web page looks or take screenshots, use browser_take_screenshot from Playwright MCP.
    - When you need to fill forms, click buttons, or perform user interactions on web pages, use Playwright MCP tools (browser_click, browser_type, browser_fill_form).
    - When you need to test responsive design or different viewport sizes, use browser_resize from Playwright MCP.
-   - When you finish using the browser, always close it with browser_close to free resources.`
+   - When you finish using the browser, always close it with browser_close to free resources.
+   - When reproducing UI bugs, use browser_take_screenshot to capture the problem state before implementing any fix.
+   - When fixing UI bugs, take before/after screenshots to provide visual evidence of the fix for human verification.
+   - When creating UI tests, save baseline screenshots to the repository for visual regression testing.
+   - When verifying UI fixes, compare screenshots to ensure the fix doesn't introduce unintended visual changes.`
        : ''
    }${
      argv && argv.promptPlanSubAgent
@@ -329,7 +341,11 @@ Visual UI work and screenshots.
    - When you need to show visual results, take a screenshot and save it to the repository (e.g., in a docs/screenshots/ or assets/ folder).
    - When you save screenshots to the repository, use permanent links in the pull request description markdown (e.g., https://github.com/${owner}/${repo}/blob/${branchName}/docs/screenshots/result.png?raw=true).
    - When uploading images, commit them to the branch first, then reference them using the GitHub blob URL format with ?raw=true suffix (works for both public and private repositories).
-   - When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.`
+   - When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.
+   - When fixing UI bugs, capture both the "before" (problem) and "after" (fixed) screenshots as evidence for human verification.
+   - When reporting UI bugs, include a screenshot of the problem state to enable visual verification of the fix.
+   - When the fix is visual, include side-by-side or sequential comparison of before/after states in the PR description.
+   - When possible, create automated visual regression tests to prevent the UI bug from recurring.`
        : ''
    }${ciExamples}${getArchitectureCareSubPrompt(argv)}`;
 };
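
The image guidance in this prompt asks the agent to confirm that a downloaded file is really an image before reading it, using the `file` command. As an illustration of what such a check looks at, the sketch below inspects the leading bytes of the file directly. This is a stand-in technique, not what the prompt prescribes, and `detectImageType` is a hypothetical helper that only knows three common signatures.

```javascript
// Sketch only: a magic-byte check standing in for the `file` command.
// Hypothetical helper; returns 'png', 'jpeg', 'gif', or null for anything else.
function detectImageType(bytes) {
  const startsWith = signature =>
    bytes.length >= signature.length && signature.every((b, i) => bytes[i] === b);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'gif'; // ASCII "GIF8"
  return null; // e.g. an HTML "Not Found" page saved with a .png extension
}
```

A `null` result corresponds to the case the prompt describes as a failed download, where the file should not be passed to the Read tool.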

package/src/codex.prompts.lib.mjs CHANGED

@@ -152,7 +152,7 @@ Initial research.
    - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
    - When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
    - When you read issue, read all details and comments thoroughly.
-   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
+   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
    - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
    - When you need related code, use gh search code --owner ${owner} [keywords].
    - When you need repo context, read files in your working directory.${
@@ -165,17 +165,17 @@ Initial research.
    - When accessing GitHub Gists (especially private ones), use gh gist view command instead of direct URL fetching to ensure proper authentication.
    - When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
    - When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
-   - When you need comments on a pull request, note that GitHub has THREE different comment types with different API endpoints:
+   - When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
       1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
       2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
       3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
-      IMPORTANT: The command "gh pr view --json comments" ONLY returns conversation comments and misses review comments!
+      Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
    - When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.
 Solution development and testing.
-   - When issue is solvable, implement code with tests.
+   - When issue is solvable, first create a test that reproduces the problem, then implement the fix.
    - When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
-   - When coding, each atomic step that can be useful by itself should be commited to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
+   - When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
    - When you test:
       start from testing of small functions using separate scripts;
       write unit tests with mocks for easy and quick start.
@@ -184,9 +184,17 @@ Solution development and testing.
    - When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
    - When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
    - When issue is unclear, write comment on issue asking questions.
-   - When you encounter any problems that you unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
+   - When you encounter any problems that you are unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
    - When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
+Reproducible testing.
+   - When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
+   - When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
+   - When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
+   - When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
+   - When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
+   - When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
 Preparing pull request.
    - When you code, follow contributing guidelines.
    - When you commit, write clear message.
@@ -194,7 +202,7 @@ Preparing pull request.
    - When you open pr, describe solution draft and include tests.
    - When there is a package with version and GitHub Actions workflows for automatic release, update the version (or other necessary release trigger) in your pull request to prepare for next release.
    - When you update existing pr ${prNumber}, use gh pr edit to modify title and description.
-   - When you are about to commit or push code, ALWAYS run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
+   - When you are about to commit or push code, run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
    - When you finalize the pull request:
       check that pull request title and description are updated (the PR may start with a [WIP] prefix and placeholder description that should be replaced with actual title and description of the changes),
       follow style from merged prs for code, title, and description,
@@ -202,7 +210,7 @@ Preparing pull request.
       make sure the default branch is merged to the pull request's branch,
       make sure all CI checks passing if they exist before you finish,
       double-check that all changes in the pull request answer to original requirements of the issue,
-      make sure no new new bugs are introduced in pull request by carefully reading gh pr diff,
+      make sure no new bugs are introduced in pull request by carefully reading gh pr diff,
       make sure no previously existing features were removed without an explicit request from users via the issue description, issue comments, and/or pull request comments.
    - When you finish implementation, use gh pr ready ${prNumber}.
@@ -225,12 +233,12 @@ Self review.
    - When you finalize, confirm code, tests, and description are consistent.${
      argv && argv.promptEnsureAllRequirementsAreMet
        ? `
-   - When no explicit feedback or requirements is provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
+   - When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
        : ''
    }
 GitHub CLI command patterns.
-   - IMPORTANT: Always use --paginate flag when fetching lists from GitHub API to ensure all results are returned (GitHub returns max 30 per page by default).
+   - When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
    - When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
    - When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
    - When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
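
The `--paginate` guidance above exists because a single request returns only one page of results (30 items by default, per the prompt). The sketch below shows the general idea of collecting every page. It does not describe how `gh` implements `--paginate`; `fetchAllPages` and its `fetchPage` callback are hypothetical, and the callback is synchronous only to keep the example short.

```javascript
// Sketch only: fetchPage is a hypothetical synchronous callback (page, perPage) => Array
// standing in for one API request.
function fetchAllPages(fetchPage, perPage = 30) {
  const all = [];
  for (let page = 1; ; page++) {
    const items = fetchPage(page, perPage);
    all.push(...items);
    // A page shorter than perPage means there is nothing left to fetch.
    if (items.length < perPage) break;
  }
  return all;
}
```

Reading only the first page would silently drop every item past the thirtieth, which is the failure the prompt is guarding against.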

package/src/github.lib.mjs CHANGED

@@ -12,8 +12,8 @@ import { uploadLogWithGhUploadLog } from './log-upload.lib.mjs';
 import { formatResetTimeWithRelative } from './usage-limit.lib.mjs'; // See: https://github.com/link-assistant/hive-mind/issues/1236
 // Import model info helpers (Issue #1225)
 import { getToolDisplayName, getModelInfoForComment } from './models/index.mjs';
-// Re-export for use by other modules
-export { getToolDisplayName };
+export { getToolDisplayName }; // Re-export for use by other modules
+import { buildBudgetStatsString } from './claude.budget-stats.lib.mjs';
 /** Build cost estimation string for log comments (Issue #1250) */
 const buildCostInfoString = (totalCostUSD, anthropicTotalCostUSD, pricingInfo) => {
@@ -366,7 +366,9 @@ export async function attachLogToGitHub(options) {
     requestedModel = null, // Issue #1225: The --model flag value
     tool = null, // The tool used (claude, agent, opencode, codex)
     resultModelUsage = null, // Issue #1454
+    budgetStatsData = null, // Issue #1491: budget stats for comment
   } = options;
+  const budgetStats = budgetStatsData ? buildBudgetStatsString(budgetStatsData.tokenUsage, budgetStatsData.streamTokenUsage) : '';
   const targetName = targetType === 'pr' ? 'Pull Request' : 'Issue';
   const ghCommand = targetType === 'pr' ? 'pr' : 'issue';
   try {
@@ -552,7 +554,7 @@ ${logContent}
       // Issue #1088: "Finished with errors" format - work may have been completed but errors occurred
       const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
       logComment = `## ⚠️ Solution Draft Finished with Errors
-This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
+This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
 > **Note**: The session encountered errors during execution, but some work may have been completed. Please review the changes carefully.
@@ -568,10 +570,8 @@ ${logContent}
 ---
 *Now working session is ended, feel free to review and add any feedback on the solution draft.*`;
     } else {
-      // Success log format - use helper function for cost info
       const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
-      // Determine title based on session type
-      // See: https://github.com/link-assistant/hive-mind/issues/1152
+      // Determine title based on session type (Issue #1152)
       let title = customTitle;
       let sessionNote = '';
       if (sessionType === 'auto-resume') {
@@ -585,7 +585,7 @@ ${logContent}
         sessionNote = '\n\n**Note**: This session was manually resumed using the --resume flag.';
       }
       logComment = `## ${title}
-This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}${sessionNote}
+This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}${sessionNote}
 <details>
 <summary>Click to expand solution draft log (${Math.round(logStats.size / 1024)}KB)</summary>
@@ -733,7 +733,7 @@ ${errorMessage}
             // Issue #1088: "Finished with errors" format - work may have been completed but errors occurred
             const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
             logUploadComment = `## ⚠️ Solution Draft Finished with Errors
-This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
+This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
 > **Note**: The session encountered errors during execution, but some work may have been completed. Please review the changes carefully.
@@ -760,7 +760,7 @@ This log file contains the complete execution trace of the AI ${targetType === '
               sessionNote = '\n**Note**: This session was manually resumed using the --resume flag.\n';
             }
             logUploadComment = `## ${title}
-This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
+This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
 ${sessionNote}
 ### 📎 **Log file uploaded as ${uploadTypeLabel}${chunkInfo}** (${Math.round(logStats.size / 1024)}KB)
 - [View complete solution draft log](${logUrl})
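
The three template changes in this file all splice a `budgetStats` fragment between `costInfo` and `modelInfoString`. A minimal sketch of that pattern follows, assuming each fragment is either a preformatted string or empty. `buildLogIntro` and the sample fragment values are hypothetical and not part of the package.

```javascript
// Hypothetical helper, not part of @link-assistant/hive-mind.
// Each optional fragment defaults to '' so a missing one leaves no gap in the comment.
function buildLogIntro({ targetType, costInfo = '', budgetStats = '', modelInfoString = '', sessionNote = '' }) {
  const kind = targetType === 'pr' ? 'solution draft' : 'analysis';
  return `This log file contains the complete execution trace of the AI ${kind} process.${costInfo}${budgetStats}${modelInfoString}${sessionNote}`;
}

// Usage: fragments carry their own leading newlines, as sessionNote does in the diff.
const intro = buildLogIntro({
  targetType: 'pr',
  costInfo: '\n\nCost: (example)',
  budgetStats: '\n\nBudget: (example)',
});
console.log(intro);
```

Keeping the separators inside the fragments means the template itself does not need conditionals when budget stats are disabled.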

package/src/opencode.prompts.lib.mjs CHANGED

@@ -146,7 +146,7 @@ ${workspaceInstructions}
 Initial research.
    - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
    - When you read issue, read all details and comments thoroughly.
-   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) can cause "Could not process image" errors and will crash the AI solver process. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. Instead: (1) For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) If retry still fails, skip the image and note it was unavailable.
+   - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) are read, they can cause "Could not process image" errors and crash the AI solver process. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. Instead: (1) When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) When the retry still fails, skip the image and note it was unavailable.
    - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
    - When you need related code, use gh search code --owner ${owner} [keywords].
    - When you need repo context, read files in your working directory.${
@@ -159,17 +159,17 @@ Initial research.
    - When accessing GitHub Gists, use gh gist view command instead of direct URL fetching.
    - When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
    - When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
-   - When you need comments on a pull request, note that GitHub has THREE different comment types with different API endpoints:
+   - When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
       1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
       2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
       3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
-      IMPORTANT: The command "gh pr view --json comments" ONLY returns conversation comments and misses review comments!
+      Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
    - When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.
 Solution development and testing.
-   - When issue is solvable, implement code with tests.
+   - When issue is solvable, first create a test that reproduces the problem, then implement the fix.
    - When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
-   - When coding, each atomic step that can be useful by itself should be commited to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
+   - When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
    - When you test:
       start from testing of small functions using separate scripts;
       write unit tests with mocks for easy and quick start.
@@ -178,9 +178,17 @@ Solution development and testing.
    - When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
    - When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
    - When issue is unclear, write comment on issue asking questions.
-   - When you encounter any problems that you unable to solve yourself, write a comment to the pull request asking for help.
+   - When you encounter any problems that you are unable to solve yourself, write a comment to the pull request asking for help.
    - When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
+Reproducible testing.
+   - When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
+   - When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
+   - When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
+   - When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
+   - When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
+   - When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
 Preparing pull request.
    - When you code, follow contributing guidelines.
    - When you commit, write clear message.
@@ -218,12 +226,12 @@ Self review.
    - When you finalize, confirm code, tests, and description are consistent.${
      argv && argv.promptEnsureAllRequirementsAreMet
        ? `
-   - When no explicit feedback or requirements is provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
+   - When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
        : ''
    }
 GitHub CLI command patterns.
-   - IMPORTANT: Always use --paginate flag when fetching lists from GitHub API to ensure all results are returned (GitHub returns max 30 per page by default).
+   - When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
    - When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
    - When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
    - When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
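
The image guideline in the first hunk of this file relies on the `file` command to catch failed downloads. The same check can be sketched in JavaScript by inspecting the leading bytes of the downloaded data. `detectImageType` is a hypothetical name, and this sketch only knows the PNG, JPEG, and GIF signatures.

```javascript
// Hypothetical helper, not part of the package: classify downloaded bytes by magic number.
const SIGNATURES = [
  { type: 'png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] },
];

// Returns the matching type name, or null when no known signature matches
// (for example, an HTML "Not Found" page saved with a .png extension).
function detectImageType(data) {
  for (const { type, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return type;
    }
  }
  return null;
}

// Usage: an HTML error page is rejected, so it should not be passed on for image reading.
const htmlPage = Uint8Array.from('<!DOCTYPE html>', c => c.charCodeAt(0));
console.log(detectImageType(htmlPage));
```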

package/src/solve.mjs CHANGED

@@ -876,9 +876,10 @@ try {
   let anthropicTotalCostUSD = toolResult.anthropicTotalCostUSD;
   let publicPricingEstimate = toolResult.publicPricingEstimate; // Used by agent tool
   let pricingInfo = toolResult.pricingInfo; // Used by agent tool for detailed pricing
-  let errorDuringExecution = toolResult.errorDuringExecution || false; // Issue #1088: Track error_during_execution
-  let resultSummary = toolResult.resultSummary || null; // Issue #1263: Capture result summary for --attach-solution-summary
-  let resultModelUsage = toolResult.resultModelUsage || null; // Issue #1454: Capture modelUsage from result JSON
+  let errorDuringExecution = toolResult.errorDuringExecution || false;
+  let resultSummary = toolResult.resultSummary || null;
+  let resultModelUsage = toolResult.resultModelUsage || null;
+  let streamTokenUsage = toolResult.streamTokenUsage || null;
   limitReached = toolResult.limitReached;
   cleanupContext.limitReached = limitReached;
@@ -1216,7 +1217,7 @@ try {
   }
   // Search for newly created pull requests and comments
-  const verifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage);
+  const verifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage, streamTokenUsage);
   const logsAlreadyUploaded = verifyResult?.logUploadSuccess || false;
   // Issue #1162: Auto-restart when PR title/description still has placeholder content
@@ -1263,7 +1264,7 @@ try {
     await cleanupClaudeFile(tempDir, branchName, null, argv);
     // Re-verify results after restart (without auto-restart flag to prevent recursion)
-    const reVerifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, { ...argv, autoRestartOnNonUpdatedPullRequestDescription: false }, shouldAttachLogs, false, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage);
+    const reVerifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, { ...argv, autoRestartOnNonUpdatedPullRequestDescription: false }, shouldAttachLogs, false, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage, streamTokenUsage);
     if (reVerifyResult?.prTitleHasPlaceholder || reVerifyResult?.prBodyHasPlaceholder) {
       await log('⚠️  PR title/description still not updated after restart');
@@ -1492,9 +1493,6 @@ try {
   // drainHandles() inside safeExit() will unref/close these before process.exit().
   await logActiveHandles(msg => log(msg));
-  // Issue #1431: safeExit() calls drainHandles() to unref/close known handle types
-  // (process.stdin ReadStream, undici Socket pool, command-stream ChildProcess,
-  // process.stdout/stderr WriteStreams) so the event loop exits naturally, then
-  // calls process.exit(0) as a deterministic safety net.
+  // Issue #1431: safeExit() unrefs handles so the event loop exits naturally, then calls process.exit(0)
   await safeExit(0, 'Process completed');
 }

package/src/solve.results.lib.mjs CHANGED

@@ -494,9 +494,23 @@ export const showSessionSummary = async (sessionId, limitReached, argv, issueUrl
 };
 // Verify results by searching for new PRs and comments
-export const verifyResults = async (owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart = false, sessionId = null, tempDir = null, anthropicTotalCostUSD = null, publicPricingEstimate = null, pricingInfo = null, errorDuringExecution = false, sessionType = 'new', resultModelUsage = null) => {
+export const verifyResults = async (owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart = false, sessionId = null, tempDir = null, anthropicTotalCostUSD = null, publicPricingEstimate = null, pricingInfo = null, errorDuringExecution = false, sessionType = 'new', resultModelUsage = null, streamTokenUsage = null) => {
   await log('\n🔍 Searching for created pull requests or comments...');
+  // Issue #1491: Build budget stats data for GitHub comment (computed once, used in both PR and issue paths)
+  let budgetStatsData = null;
+  if (argv.tokensBudgetStats && sessionId && tempDir) {
+    try {
+      const { calculateSessionTokens } = await import('./claude.lib.mjs');
+      const tokenUsage = await calculateSessionTokens(sessionId, tempDir);
+      if (tokenUsage) {
+        budgetStatsData = { tokenUsage, streamTokenUsage };
+      }
+    } catch (budgetError) {
+      if (argv.verbose) await log(`  ⚠️  Could not calculate budget stats: ${budgetError.message}`, { verbose: true });
+    }
+  }
   try {
     // Get the current user's GitHub username
     const userResult = await $`gh api user --jq .login`;
@@ -713,6 +727,8 @@ Fixes ${issueRef}
             tool: argv.tool || 'claude',
             // Issue #1454: Pass resultModelUsage for accurate multi-model display
             resultModelUsage,
+            // Issue #1491: Pass budget stats for token budget display in comment
+            budgetStatsData,
           });
         }
@@ -797,6 +813,8 @@ Fixes ${issueRef}
           tool: argv.tool || 'claude',
           // Issue #1454: Pass resultModelUsage for accurate multi-model display
           resultModelUsage,
+          // Issue #1491: Pass budget stats for token budget display in comment
+          budgetStatsData,
         });
       }