@link-assistant/hive-mind 1.37.3 → 1.38.0
- package/CHANGELOG.md +19 -0
- package/package.json +1 -1
- package/src/agent.prompts.lib.mjs +21 -9
- package/src/claude.budget-stats.lib.mjs +258 -0
- package/src/claude.lib.mjs +60 -125
- package/src/claude.prompts.lib.mjs +28 -12
- package/src/codex.prompts.lib.mjs +18 -10
- package/src/github.lib.mjs +9 -9
- package/src/opencode.prompts.lib.mjs +16 -8
- package/src/solve.mjs +7 -9
- package/src/solve.results.lib.mjs +19 -1
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,24 @@
 # @link-assistant/hive-mind

+## 1.38.0
+
+### Minor Changes
+
+- ee331ef: Enhance --tokens-budget-stats with sub-session tracking, stream comparison, and GitHub comment display
+
+## 1.37.4
+
+### Patch Changes
+
+- 72bbb31: Add emphasis on reproducible automated testing in system prompts
+  - Add new "Reproducible testing" section to all prompt files (claude, agent, codex, opencode)
+  - Update "Solution development and testing" to emphasize test-first approach
+  - Enhance Playwright MCP guidelines with UI bug reproduction workflow
+  - Enhance Visual UI work section with before/after screenshot guidelines
+  - Fix spelling and grammar issues across all prompt files
+  - Soften forceful language to use recommendation style ("When x, do y.")
+  - Add comprehensive case study for issue #1179 documenting best practices
+
 ## 1.37.3

 ### Patch Changes
package/package.json
CHANGED
package/src/agent.prompts.lib.mjs
CHANGED
@@ -144,7 +144,7 @@ ${getExperimentsExamplesSubPrompt(argv)}
 Initial research.
 - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
 - When you read issue, read all details and comments thoroughly.
-- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it.
+- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
 - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
 - When you need related code, use gh search code --owner ${owner} [keywords].
 - When you need repo context, read files in your working directory.${
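The added guideline asks the agent to confirm that a download is really an image before reading it, and suggests the `file` command for that. Purely as an illustration, the same idea can be sketched in-process by inspecting the first bytes of the file instead of shelling out. The helper names `sniffImageType`, `looksLikeHtml` and `isUsableImage` are hypothetical and are not part of hive-mind; the byte signatures are the standard PNG, JPEG, GIF and WebP magic numbers.

```javascript
// Hypothetical helpers (not from the package): classify a downloaded file by its leading bytes.
// `bytes` is a Buffer or Uint8Array holding the start of the file.
function sniffImageType(bytes) {
  const startsWith = (sig, offset = 0) => sig.every((b, i) => bytes[offset + i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'gif'; // "GIF8"
  // WebP files start with "RIFF", four size bytes, then "WEBP".
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) return 'webp';
  return null;
}

// A failed authenticated download typically yields an HTML page instead of image data.
function looksLikeHtml(bytes) {
  const head = Buffer.from(bytes.slice(0, 256)).toString('latin1').trimStart().toLowerCase();
  return head.startsWith('<!doctype html') || head.startsWith('<html');
}

// Only hand the file to an image reader when it carries a known image signature.
function isUsableImage(bytes) {
  return sniffImageType(bytes) !== null && !looksLikeHtml(bytes);
}
```

When `isUsableImage` returns false, the download most likely produced a login or error page, and the authenticated `curl` command from the guideline is the next thing to try.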
@@ -157,16 +157,16 @@ Initial research.
 - When accessing GitHub Gists, use gh gist view command instead of direct URL fetching.
 - When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
 - When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
-- When you need comments on a pull request, note that GitHub has
+- When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
 1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
 2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
 3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
-
+Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
 - When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.

 Solution development and testing.
-- When issue is solvable, implement
-- When coding, each atomic step that can be useful by itself should be
+- When issue is solvable, first create a test that reproduces the problem, then implement the fix.
+- When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
 - When you test:
 start from testing of small functions using separate scripts;
 write unit tests with mocks for easy and quick start.
@@ -175,9 +175,17 @@ Solution development and testing.
 - When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
 - When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
 - When issue is unclear, write comment on issue asking questions.
-- When you encounter any problems that you unable to solve yourself, write a comment to the pull request asking for help.
+- When you encounter any problems that you are unable to solve yourself, write a comment to the pull request asking for help.
 - When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.

+Reproducible testing.
+- When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
+- When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
+- When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
+- When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
+- When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
+- When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
+
 Preparing pull request.
 - When you code, follow contributing guidelines.
 - When you commit, write clear message.
@@ -217,12 +225,12 @@ Self review.
 - When you finalize, confirm code, tests, and description are consistent.${
   argv && argv.promptEnsureAllRequirementsAreMet
     ? `
-- When no explicit feedback or requirements
+- When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
     : ''
 }

 GitHub CLI command patterns.
--
+- When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
 - When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
 - When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
 - When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
@@ -239,7 +247,11 @@ Visual UI work and screenshots.
 - When you need to show visual results, take a screenshot and save it to the repository (e.g., in a docs/screenshots/ or assets/ folder).
 - When you save screenshots to the repository, use permanent links in the pull request description markdown (e.g., https://github.com/${owner}/${repo}/blob/${branchName}/docs/screenshots/result.png?raw=true).
 - When uploading images, commit them to the branch first, then reference them using the GitHub blob URL format with ?raw=true suffix (works for both public and private repositories).
-- When the visual result is important for review, mention it explicitly in the pull request description with the embedded image
+- When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.
+- When fixing UI bugs, capture both the "before" (problem) and "after" (fixed) screenshots as evidence for human verification.
+- When reporting UI bugs, include a screenshot of the problem state to enable visual verification of the fix.
+- When the fix is visual, include side-by-side or sequential comparison of before/after states in the PR description.
+- When possible, create automated visual regression tests to prevent the UI bug from recurring.`
     : ''
 }${ciExamples}${getArchitectureCareSubPrompt(argv)}`;
 };
package/src/claude.budget-stats.lib.mjs
CHANGED
@@ -4,6 +4,135 @@

 import { formatNumber } from './claude.lib.mjs';

+/**
+ * Helper: creates a fresh sub-session usage object for tracking tokens between compactification events
+ * @returns {Object} Empty sub-session usage structure
+ */
+export const createEmptySubSessionUsage = () => ({
+  inputTokens: 0,
+  cacheCreationTokens: 0,
+  cacheReadTokens: 0,
+  outputTokens: 0,
+  messageCount: 0,
+});
+
+/**
+ * Helper: accumulates token usage from a JSONL entry into a model usage map
+ * @param {Object} modelUsageMap - Map of model ID to usage data
+ * @param {Object} entry - Parsed JSONL entry with message.usage and message.model
+ */
+export const accumulateModelUsage = (modelUsageMap, entry) => {
+  const model = entry.message.model;
+  if (model.startsWith('<') && model.endsWith('>')) return; // Issue #1486: skip <synthetic> etc.
+  const usage = entry.message.usage;
+  if (!modelUsageMap[model]) {
+    modelUsageMap[model] = {
+      inputTokens: 0,
+      cacheCreationTokens: 0,
+      cacheCreation5mTokens: 0,
+      cacheCreation1hTokens: 0,
+      cacheReadTokens: 0,
+      outputTokens: 0,
+      webSearchRequests: 0,
+    };
+  }
+  if (usage.input_tokens) modelUsageMap[model].inputTokens += usage.input_tokens;
+  if (usage.cache_creation_input_tokens) modelUsageMap[model].cacheCreationTokens += usage.cache_creation_input_tokens;
+  if (usage.cache_creation) {
+    if (usage.cache_creation.ephemeral_5m_input_tokens) modelUsageMap[model].cacheCreation5mTokens += usage.cache_creation.ephemeral_5m_input_tokens;
+    if (usage.cache_creation.ephemeral_1h_input_tokens) modelUsageMap[model].cacheCreation1hTokens += usage.cache_creation.ephemeral_1h_input_tokens;
+  }
+  if (usage.cache_read_input_tokens) modelUsageMap[model].cacheReadTokens += usage.cache_read_input_tokens;
+  if (usage.output_tokens) modelUsageMap[model].outputTokens += usage.output_tokens;
+};
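To make the accumulation rule concrete, here is a condensed restatement of `accumulateModelUsage` applied to a few sample entries. It is a sketch rather than the packaged code: the 5m/1h cache breakdown and the web search counter are left out, the function is renamed `accumulate`, and the model id `example-model` and all token counts are made up for illustration.

```javascript
// Condensed restatement of the helper above (assumption: simplified, not the shipped code).
function accumulate(modelUsageMap, entry) {
  const { model, usage } = entry.message;
  // Placeholder ids wrapped in angle brackets, such as <synthetic>, are skipped entirely.
  if (model.startsWith('<') && model.endsWith('>')) return;
  if (!modelUsageMap[model]) {
    modelUsageMap[model] = { inputTokens: 0, cacheCreationTokens: 0, cacheReadTokens: 0, outputTokens: 0 };
  }
  const totals = modelUsageMap[model];
  totals.inputTokens += usage.input_tokens || 0;
  totals.cacheCreationTokens += usage.cache_creation_input_tokens || 0;
  totals.cacheReadTokens += usage.cache_read_input_tokens || 0;
  totals.outputTokens += usage.output_tokens || 0;
}

// Hypothetical parsed JSONL entries.
const sampleEntries = [
  { message: { model: 'example-model', usage: { input_tokens: 100, cache_read_input_tokens: 2000, output_tokens: 50 } } },
  { message: { model: 'example-model', usage: { input_tokens: 10, cache_creation_input_tokens: 500, output_tokens: 5 } } },
  { message: { model: '<synthetic>', usage: { input_tokens: 999, output_tokens: 999 } } },
];

const usageByModel = {};
for (const entry of sampleEntries) accumulate(usageByModel, entry);
// usageByModel now has a single key, 'example-model':
// { inputTokens: 110, cacheCreationTokens: 500, cacheReadTokens: 2000, outputTokens: 55 }
```

The `<synthetic>` entry contributes nothing, which is the behavior the inline comment attributes to Issue #1486.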
+
+/**
+ * Display detailed model usage information
+ * @param {Object} usage - Usage data for a model
+ * @param {Function} log - Logging function
+ */
+export const displayModelUsage = async (usage, log) => {
+  // Show all model characteristics if available
+  if (usage.modelInfo) {
+    const info = usage.modelInfo;
+    const fields = [
+      { label: 'Model ID', value: info.id },
+      { label: 'Provider', value: info.provider || 'Unknown' },
+      { label: 'Context window', value: info.limit?.context ? `${formatNumber(info.limit.context)} tokens` : null },
+      { label: 'Max output', value: info.limit?.output ? `${formatNumber(info.limit.output)} tokens` : null },
+      { label: 'Input modalities', value: info.modalities?.input?.join(', ') || 'N/A' },
+      { label: 'Output modalities', value: info.modalities?.output?.join(', ') || 'N/A' },
+      { label: 'Knowledge cutoff', value: info.knowledge },
+      { label: 'Released', value: info.release_date },
+      {
+        label: 'Capabilities',
+        value: [info.attachment && 'Attachments', info.reasoning && 'Reasoning', info.temperature && 'Temperature', info.tool_call && 'Tool calls'].filter(Boolean).join(', ') || 'N/A',
+      },
+      { label: 'Open weights', value: info.open_weights ? 'Yes' : 'No' },
+    ];
+    for (const { label, value } of fields) {
+      if (value) await log(` ${label}: ${value}`);
+    }
+    await log('');
+  } else {
+    await log(' ⚠️ Model info not available\n');
+  }
+  // Show usage data
+  await log(' Usage:');
+  await log(` Input tokens: ${formatNumber(usage.inputTokens)}`);
+  if (usage.cacheCreationTokens > 0) {
+    await log(` Cache creation tokens: ${formatNumber(usage.cacheCreationTokens)}`);
+  }
+  if (usage.cacheReadTokens > 0) {
+    await log(` Cache read tokens: ${formatNumber(usage.cacheReadTokens)}`);
+  }
+  await log(` Output tokens: ${formatNumber(usage.outputTokens)}`);
+  if (usage.webSearchRequests > 0) {
+    await log(` Web search requests: ${usage.webSearchRequests}`);
+  }
+  // Show detailed cost calculation
+  if (usage.costUSD !== null && usage.costUSD !== undefined && usage.costBreakdown) {
+    await log('');
+    await log(' Cost Calculation (USD):');
+    const breakdown = usage.costBreakdown;
+    const types = [
+      { key: 'input', label: 'Input' },
+      { key: 'cacheWrite', label: 'Cache write' },
+      { key: 'cacheRead', label: 'Cache read' },
+      { key: 'output', label: 'Output' },
+    ];
+    for (const { key, label } of types) {
+      if (breakdown[key].tokens > 0) {
+        await log(` ${label}: ${formatNumber(breakdown[key].tokens)} tokens × $${breakdown[key].costPerMillion}/M = $${breakdown[key].cost.toFixed(6)}`);
+      }
+    }
+    await log(' ─────────────────────────────────');
+    await log(` Total: $${usage.costUSD.toFixed(6)}`);
+  } else if (usage.modelInfo === null) {
+    await log('');
+    await log(' Cost: Not available (could not fetch pricing)');
+  }
+};
+
+/**
+ * Display cost comparison between public pricing and Anthropic's official cost
+ * @param {number|null} publicCost - Public pricing estimate
+ * @param {number|null} anthropicCost - Anthropic's official cost
+ * @param {Function} log - Logging function
+ */
+export const displayCostComparison = async (publicCost, anthropicCost, log) => {
+  await log('\n 💰 Cost estimation:');
+  await log(` Public pricing estimate: ${publicCost !== null && publicCost !== undefined ? `$${publicCost.toFixed(6)} USD` : 'unknown'}`);
+  await log(` Calculated by Anthropic: ${anthropicCost !== null && anthropicCost !== undefined ? `$${anthropicCost.toFixed(6)} USD` : 'unknown'}`);
+  if (publicCost !== null && publicCost !== undefined && anthropicCost !== null && anthropicCost !== undefined) {
+    const difference = anthropicCost - publicCost;
+    const percentDiff = publicCost > 0 ? (difference / publicCost) * 100 : 0;
+    await log(` Difference: $${difference.toFixed(6)} (${percentDiff > 0 ? '+' : ''}${percentDiff.toFixed(2)}%)`);
+  } else {
+    await log(' Difference: unknown');
+  }
+};
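The difference line in `displayCostComparison` is easiest to follow with numbers. The sketch below isolates just that arithmetic in a hypothetical synchronous helper, `formatCostDifference`, which returns the text instead of logging it; the dollar amounts are invented.

```javascript
// Hypothetical helper mirroring the difference arithmetic shown above.
function formatCostDifference(publicCost, anthropicCost) {
  // `== null` matches both null and undefined, like the explicit checks in the diff.
  if (publicCost == null || anthropicCost == null) return 'unknown';
  const difference = anthropicCost - publicCost;
  // A zero public estimate would divide by zero, so the percentage falls back to 0.
  const percentDiff = publicCost > 0 ? (difference / publicCost) * 100 : 0;
  const sign = percentDiff > 0 ? '+' : '';
  return `$${difference.toFixed(6)} (${sign}${percentDiff.toFixed(2)}%)`;
}

const higher = formatCostDifference(0.5, 0.625);   // "$0.125000 (+25.00%)"
const lower = formatCostDifference(0.5, 0.375);    // "$-0.125000 (-25.00%)"
const noBaseline = formatCostDifference(0, 0.25);  // "$0.250000 (0.00%)"
const missing = formatCostDifference(null, 0.25);  // "unknown"
```

Two details carry over from the diffed code: a negative difference prints as `$-0.125000` because the sign follows the dollar symbol, and a public estimate of zero reports `0.00%` even when the absolute difference is non-zero.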
+
 /**
  * Display token budget statistics (context window usage and ratios)
  * @param {Object} usage - Usage data for a model
@@ -48,3 +177,132 @@ export const displayBudgetStats = async (usage, log) => {
   const totalSessionTokens = usage.inputTokens + usage.cacheCreationTokens + usage.outputTokens;
   await log(` Total session tokens: ${formatNumber(totalSessionTokens)}`);
 };
+
+/**
+ * Display sub-session breakdown when compactification events occurred (Issue #1491)
+ * @param {Object} tokenUsage - Token usage data with subSessions and compactifications
+ * @param {Object} modelInfo - Model info with context/output limits
+ * @param {Function} log - Logging function
+ */
+export const displaySubSessionStats = async (tokenUsage, modelInfo, log) => {
+  if (!tokenUsage.subSessions || !tokenUsage.compactifications) return;
+
+  const contextLimit = modelInfo?.limit?.context;
+  await log(`\n 🔄 Compactification events: ${tokenUsage.compactifications.length}`);
+
+  for (let i = 0; i < tokenUsage.subSessions.length; i++) {
+    const sub = tokenUsage.subSessions[i];
+    const totalInput = sub.inputTokens + sub.cacheCreationTokens + sub.cacheReadTokens;
+    const label = i === 0 ? 'Initial session' : `After compactification #${i}`;
+
+    await log(` Sub-session ${i + 1} (${label}):`);
+    await log(` Messages: ${sub.messageCount}`);
+    await log(` Context used: ${formatNumber(totalInput)} tokens`);
+    if (contextLimit) {
+      const pct = ((totalInput / contextLimit) * 100).toFixed(2);
+      await log(` Context usage: ${pct}% of ${formatNumber(contextLimit)}`);
+    }
+    await log(` Output: ${formatNumber(sub.outputTokens)} tokens`);
+  }
+
+  // Show compactification details
+  for (let i = 0; i < tokenUsage.compactifications.length; i++) {
+    const comp = tokenUsage.compactifications[i];
+    let detail = ` Compactification #${i + 1}: trigger=${comp.trigger}`;
+    if (comp.preTokens) detail += `, pre-compaction tokens=${formatNumber(comp.preTokens)}`;
+    await log(detail);
+  }
+};
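`displaySubSessionStats` expects `subSessions` and `compactifications` arrays, which the `claude.lib.mjs` changes later in this diff build by splitting the session JSONL at `compact_boundary` entries. The following is a minimal, self-contained sketch of that splitting step. It rests on stated assumptions: the per-message accumulation into the current sub-session, the skipping of malformed lines, and the final flush after the loop are not visible in this part of the diff and are guesses; `splitSubSessions`, the sample data, and the `auto` trigger value are hypothetical.

```javascript
// Sketch only: split JSONL text into sub-sessions at compact_boundary system entries.
function splitSubSessions(jsonlText) {
  const empty = () => ({ inputTokens: 0, cacheCreationTokens: 0, cacheReadTokens: 0, outputTokens: 0, messageCount: 0 });
  const subSessions = [];
  const compactifications = [];
  let current = empty();

  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // Assumption: malformed lines are ignored.
    }
    if (entry.type === 'system' && entry.subtype === 'compact_boundary') {
      // Close the running sub-session (if it saw any messages) and record the event.
      if (current.messageCount > 0) subSessions.push(current);
      compactifications.push({
        timestamp: entry.timestamp || null,
        preTokens: entry.compactMetadata?.preTokens || null,
        trigger: entry.compactMetadata?.trigger || 'unknown',
      });
      current = empty();
      continue;
    }
    const usage = entry.message?.usage;
    if (!usage || !entry.message.model) continue;
    // Assumption: each usage-bearing message adds to the current sub-session.
    current.inputTokens += usage.input_tokens || 0;
    current.cacheCreationTokens += usage.cache_creation_input_tokens || 0;
    current.cacheReadTokens += usage.cache_read_input_tokens || 0;
    current.outputTokens += usage.output_tokens || 0;
    current.messageCount += 1;
  }
  // Assumption: the trailing sub-session is flushed once the file ends.
  if (current.messageCount > 0) subSessions.push(current);
  return { subSessions, compactifications };
}

// Hypothetical three-line session: one message, a compaction, one more message.
const sampleJsonl = [
  JSON.stringify({ message: { model: 'example-model', usage: { input_tokens: 10, cache_read_input_tokens: 90, output_tokens: 5 } } }),
  JSON.stringify({ type: 'system', subtype: 'compact_boundary', compactMetadata: { trigger: 'auto', preTokens: 150000 } }),
  JSON.stringify({ message: { model: 'example-model', usage: { input_tokens: 20, output_tokens: 7 } } }),
].join('\n');

const split = splitSubSessions(sampleJsonl);
// split.subSessions has two entries; split.compactifications has one.
```

With one boundary in the input, the result holds two sub-sessions, which matches the labels `Initial session` and `After compactification #1` used by the display function.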
+
+/**
+ * Display stream vs JSONL token comparison (Issue #1491)
+ * Shows independent calculation from stream events vs JSONL session file
+ * @param {Object} streamTokenUsage - Token usage accumulated from stream JSON events
+ * @param {Object} jsonlTokenUsage - Token usage calculated from JSONL session file
+ * @param {Function} log - Logging function
+ */
+export const displayTokenComparison = async (streamTokenUsage, jsonlTokenUsage, log) => {
+  if (!streamTokenUsage || !jsonlTokenUsage) return;
+
+  const streamTotal = streamTokenUsage.inputTokens + streamTokenUsage.cacheCreationTokens + streamTokenUsage.outputTokens;
+  const jsonlTotal = jsonlTokenUsage.inputTokens + jsonlTokenUsage.cacheCreationTokens + jsonlTokenUsage.outputTokens;
+
+  await log('\n 🔍 Token calculation comparison:');
+  await log(` Stream JSON events: ${formatNumber(streamTotal)} tokens (${streamTokenUsage.eventCount} events)`);
+  await log(` JSONL session file: ${formatNumber(jsonlTotal)} tokens`);
+
+  if (streamTotal !== jsonlTotal) {
+    const diff = jsonlTotal - streamTotal;
+    const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
+    await log(` Difference: ${formatNumber(Math.abs(diff))} tokens (${diff > 0 ? '+' : ''}${pct}%)`);
+  } else {
+    await log(' Match: calculations are consistent');
+  }
+};
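A small worked example may help when reading the comparison output. The sketch below reproduces only the arithmetic of `displayTokenComparison` in a hypothetical helper, `compareTotals`, with invented token counts. As in the diffed code, cache read tokens are not part of either total.

```javascript
// Hypothetical helper: same totals and percentage rule as the function above.
function compareTotals(streamUsage, jsonlUsage) {
  const total = (u) => u.inputTokens + u.cacheCreationTokens + u.outputTokens;
  const streamTotal = total(streamUsage);
  const jsonlTotal = total(jsonlUsage);
  if (streamTotal === jsonlTotal) return { streamTotal, jsonlTotal, match: true };
  const diff = jsonlTotal - streamTotal;
  // With an empty stream total there is no baseline, so the percentage is reported as 'N/A'.
  const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
  return { streamTotal, jsonlTotal, match: false, diff, pct };
}

const streamSample = { inputTokens: 1000, cacheCreationTokens: 3000, outputTokens: 1000 };
const jsonlSample = { inputTokens: 1000, cacheCreationTokens: 3000, outputTokens: 1250 };

const mismatch = compareTotals(streamSample, jsonlSample);
// { streamTotal: 5000, jsonlTotal: 5250, match: false, diff: 250, pct: '5.00' }

const emptyStream = compareTotals({ inputTokens: 0, cacheCreationTokens: 0, outputTokens: 0 }, jsonlSample);
// pct is 'N/A' here
```

In the `N/A` case with a positive difference, the log line in the diffed code renders the suffix as `(+N/A%)`, since the sign and the percent symbol are appended regardless of whether `pct` is numeric.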
+
+/**
+ * Build budget stats string for GitHub PR comments (Issue #1491)
+ * Similar to buildCostInfoString but for token budget statistics
+ * @param {Object} tokenUsage - Token usage data from calculateSessionTokens
+ * @param {Object|null} streamTokenUsage - Token usage from stream JSON events
+ * @returns {string} Formatted markdown string for PR comment
+ */
+export const buildBudgetStatsString = (tokenUsage, streamTokenUsage) => {
+  if (!tokenUsage) return '';
+
+  let stats = '\n\n### 📊 **Token budget statistics:**';
+
+  // Per-model breakdown
+  if (tokenUsage.modelUsage) {
+    const modelIds = Object.keys(tokenUsage.modelUsage);
+    for (const modelId of modelIds) {
+      const usage = tokenUsage.modelUsage[modelId];
+      const modelName = usage.modelName || modelId;
+      const contextLimit = usage.modelInfo?.limit?.context;
+      const outputLimit = usage.modelInfo?.limit?.output;
+      const totalInput = usage.inputTokens + usage.cacheCreationTokens + usage.cacheReadTokens;
+
+      if (modelIds.length > 1) stats += `\n- **${modelName}**:`;
+
+      if (contextLimit) {
+        const contextPct = ((totalInput / contextLimit) * 100).toFixed(2);
+        stats += `\n- Context window: ${totalInput.toLocaleString()} / ${contextLimit.toLocaleString()} tokens (${contextPct}%)`;
+      } else {
+        stats += `\n- Context tokens used: ${totalInput.toLocaleString()}`;
+      }
+
+      if (outputLimit) {
+        const outputPct = ((usage.outputTokens / outputLimit) * 100).toFixed(2);
+        stats += `\n- Output tokens: ${usage.outputTokens.toLocaleString()} / ${outputLimit.toLocaleString()} tokens (${outputPct}%)`;
+      } else {
+        stats += `\n- Output tokens: ${usage.outputTokens.toLocaleString()}`;
+      }
+    }
+  }
+
+  // Sub-session breakdown if compactification occurred
+  if (tokenUsage.subSessions && tokenUsage.compactifications) {
+    stats += `\n- Compactifications: ${tokenUsage.compactifications.length}`;
+    for (let i = 0; i < tokenUsage.subSessions.length; i++) {
+      const sub = tokenUsage.subSessions[i];
+      const totalInput = sub.inputTokens + sub.cacheCreationTokens + sub.cacheReadTokens;
+      const label = i === 0 ? 'initial' : `after compactification #${i}`;
+      stats += `\n  - Sub-session ${i + 1} (${label}): ${totalInput.toLocaleString()} context, ${sub.outputTokens.toLocaleString()} output, ${sub.messageCount} messages`;
+    }
+  }
+
+  // Stream vs JSONL comparison
+  if (streamTokenUsage) {
+    const streamTotal = streamTokenUsage.inputTokens + streamTokenUsage.cacheCreationTokens + streamTokenUsage.outputTokens;
+    const jsonlTotal = tokenUsage.inputTokens + tokenUsage.cacheCreationTokens + tokenUsage.outputTokens;
+    stats += `\n- Own calculation (stream): ${streamTotal.toLocaleString()} tokens (${streamTokenUsage.eventCount} events)`;
+    stats += `\n- JSONL calculation: ${jsonlTotal.toLocaleString()} tokens`;
+    if (streamTotal !== jsonlTotal) {
+      const diff = jsonlTotal - streamTotal;
+      const pct = streamTotal > 0 ? ((diff / streamTotal) * 100).toFixed(2) : 'N/A';
+      stats += ` (diff: ${diff > 0 ? '+' : ''}${pct}%)`;
+    }
+  }
+
+  return stats;
+};
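To show what the per-model part of the PR comment looks like, here is a reduced sketch of the two bullet lines `buildBudgetStatsString` emits for one model. The helper name `modelBudgetLines`, the limits and the token counts are all hypothetical. One deliberate difference from the diffed code: the sketch pins the `en-US` locale, because a bare `toLocaleString()` formats digit grouping according to the runtime's default locale.

```javascript
// Hypothetical helper: builds the context-window and output-token bullets for one model.
function modelBudgetLines(usage) {
  const fmt = (n) => n.toLocaleString('en-US'); // Pinned locale for deterministic output (sketch-only choice).
  const contextLimit = usage.modelInfo?.limit?.context;
  const outputLimit = usage.modelInfo?.limit?.output;
  // Context use counts fresh input plus both cache-creation and cache-read tokens.
  const totalInput = usage.inputTokens + usage.cacheCreationTokens + usage.cacheReadTokens;

  const lines = [];
  if (contextLimit) {
    const pct = ((totalInput / contextLimit) * 100).toFixed(2);
    lines.push(`- Context window: ${fmt(totalInput)} / ${fmt(contextLimit)} tokens (${pct}%)`);
  } else {
    lines.push(`- Context tokens used: ${fmt(totalInput)}`);
  }
  if (outputLimit) {
    const pct = ((usage.outputTokens / outputLimit) * 100).toFixed(2);
    lines.push(`- Output tokens: ${fmt(usage.outputTokens)} / ${fmt(outputLimit)} tokens (${pct}%)`);
  } else {
    lines.push(`- Output tokens: ${fmt(usage.outputTokens)}`);
  }
  return lines;
}

// Invented numbers: 150,000 context tokens against a 200,000 limit, 16,000 output against 64,000.
const sampleUsage = { inputTokens: 1000, cacheCreationTokens: 49000, cacheReadTokens: 100000, outputTokens: 16000 };

const withLimits = modelBudgetLines({ ...sampleUsage, modelInfo: { limit: { context: 200000, output: 64000 } } });
// [ '- Context window: 150,000 / 200,000 tokens (75.00%)',
//   '- Output tokens: 16,000 / 64,000 tokens (25.00%)' ]

const withoutLimits = modelBudgetLines(sampleUsage);
// [ '- Context tokens used: 150,000', '- Output tokens: 16,000' ]
```

When model metadata is missing, the percentages disappear but the raw counts are still reported, so the comment degrades gracefully rather than being omitted.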
package/src/claude.lib.mjs
CHANGED
@@ -12,7 +12,7 @@ import { timeouts, retryLimits, claudeCode, getClaudeEnv, getThinkingLevelToToke
 import { detectUsageLimit, formatUsageLimitMessage } from './usage-limit.lib.mjs';
 import { createInteractiveHandler } from './interactive-mode.lib.mjs';
 import { sanitizeObjectStrings } from './unicode-sanitization.lib.mjs';
-import { displayBudgetStats } from './claude.budget-stats.lib.mjs';
+import { displayBudgetStats, displaySubSessionStats, displayTokenComparison, createEmptySubSessionUsage, accumulateModelUsage, displayModelUsage, displayCostComparison } from './claude.budget-stats.lib.mjs';
 import { buildClaudeResumeCommand } from './claude.command-builder.lib.mjs';
 import { handleClaudeRuntimeSwitch } from './claude.runtime-switch.lib.mjs'; // see issue #1141
 import { CLAUDE_MODELS as availableModels } from './models/index.mjs'; // Issue #1221
@@ -480,91 +480,6 @@ export const calculateModelCost = (usage, modelInfo, includeBreakdown = false) =
   }
   return totalCost;
 };
-/**
- * Display detailed model usage information
- * @param {Object} usage - Usage data for a model
- * @param {Function} log - Logging function
- */
-const displayModelUsage = async (usage, log) => {
-  // Show all model characteristics if available
-  if (usage.modelInfo) {
-    const info = usage.modelInfo;
-    const fields = [
-      { label: 'Model ID', value: info.id },
-      { label: 'Provider', value: info.provider || 'Unknown' },
-      { label: 'Context window', value: info.limit?.context ? `${formatNumber(info.limit.context)} tokens` : null },
-      { label: 'Max output', value: info.limit?.output ? `${formatNumber(info.limit.output)} tokens` : null },
-      { label: 'Input modalities', value: info.modalities?.input?.join(', ') || 'N/A' },
-      { label: 'Output modalities', value: info.modalities?.output?.join(', ') || 'N/A' },
-      { label: 'Knowledge cutoff', value: info.knowledge },
-      { label: 'Released', value: info.release_date },
-      {
-        label: 'Capabilities',
-        value: [info.attachment && 'Attachments', info.reasoning && 'Reasoning', info.temperature && 'Temperature', info.tool_call && 'Tool calls'].filter(Boolean).join(', ') || 'N/A',
-      },
-      { label: 'Open weights', value: info.open_weights ? 'Yes' : 'No' },
-    ];
-    for (const { label, value } of fields) {
-      if (value) await log(` ${label}: ${value}`);
-    }
-    await log('');
-  } else {
-    await log(' ⚠️ Model info not available\n');
-  }
-  // Show usage data
-  await log(' Usage:');
-  await log(` Input tokens: ${formatNumber(usage.inputTokens)}`);
-  if (usage.cacheCreationTokens > 0) {
-    await log(` Cache creation tokens: ${formatNumber(usage.cacheCreationTokens)}`);
-  }
-  if (usage.cacheReadTokens > 0) {
-    await log(` Cache read tokens: ${formatNumber(usage.cacheReadTokens)}`);
-  }
-  await log(` Output tokens: ${formatNumber(usage.outputTokens)}`);
-  if (usage.webSearchRequests > 0) {
-    await log(` Web search requests: ${usage.webSearchRequests}`);
-  }
-  // Show detailed cost calculation
-  if (usage.costUSD !== null && usage.costUSD !== undefined && usage.costBreakdown) {
-    await log('');
-    await log(' Cost Calculation (USD):');
-    const breakdown = usage.costBreakdown;
-    const types = [
-      { key: 'input', label: 'Input' },
-      { key: 'cacheWrite', label: 'Cache write' },
-      { key: 'cacheRead', label: 'Cache read' },
-      { key: 'output', label: 'Output' },
-    ];
-    for (const { key, label } of types) {
-      if (breakdown[key].tokens > 0) {
-        await log(` ${label}: ${formatNumber(breakdown[key].tokens)} tokens × $${breakdown[key].costPerMillion}/M = $${breakdown[key].cost.toFixed(6)}`);
-      }
-    }
-    await log(' ─────────────────────────────────');
-    await log(` Total: $${usage.costUSD.toFixed(6)}`);
-  } else if (usage.modelInfo === null) {
-    await log('');
-    await log(' Cost: Not available (could not fetch pricing)');
-  }
-};
-/**
- * Display cost comparison between public pricing and Anthropic's official cost
- * @param {number|null} publicCost - Public pricing estimate
- * @param {number|null} anthropicCost - Anthropic's official cost
- * @param {Function} log - Logging function
- */
-const displayCostComparison = async (publicCost, anthropicCost, log) => {
-  await log('\n 💰 Cost estimation:');
-  await log(` Public pricing estimate: ${publicCost !== null && publicCost !== undefined ? `$${publicCost.toFixed(6)} USD` : 'unknown'}`);
-  await log(` Calculated by Anthropic: ${anthropicCost !== null && anthropicCost !== undefined ? `$${anthropicCost.toFixed(6)} USD` : 'unknown'}`);
-  if (publicCost !== null && publicCost !== undefined && anthropicCost !== null && anthropicCost !== undefined) {
-    const difference = anthropicCost - publicCost;
-    const percentDiff = publicCost > 0 ? (difference / publicCost) * 100 : 0;
-    await log(` Difference: $${difference.toFixed(6)} (${percentDiff > 0 ? '+' : ''}${percentDiff.toFixed(2)}%)`);
-  } else {
-    await log(' Difference: unknown');
-  }
-};
 export const calculateSessionTokens = async (sessionId, tempDir) => {
   const os = (await use('os')).default;
   const homeDir = os.homedir();
@@ -582,6 +497,10 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
   }
   // Initialize per-model usage tracking
   const modelUsage = {};
+  // Issue #1491: Track sub-sessions between compactification events
+  const subSessions = [];
+  let currentSubSession = createEmptySubSessionUsage();
+  const compactifications = [];
   try {
     // Read the entire file
     const fileContent = await fs.readFile(sessionFile, 'utf8');
@@ -590,53 +509,39 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
       if (!line.trim()) continue;
       try {
         const entry = JSON.parse(line);
+        // Issue #1491: Detect compactification boundary events
+        if (entry.type === 'system' && entry.subtype === 'compact_boundary') {
+          // Save current sub-session and start a new one
+          if (currentSubSession.messageCount > 0) {
+            subSessions.push(currentSubSession);
+          }
+          compactifications.push({
+            timestamp: entry.timestamp || null,
+            preTokens: entry.compactMetadata?.preTokens || null,
+            trigger: entry.compactMetadata?.trigger || 'unknown',
+          });
+          currentSubSession = createEmptySubSessionUsage();
+          continue;
+        }
         if (entry.message && entry.message.usage && entry.message.model) {
-
-
+          accumulateModelUsage(modelUsage, entry);
+          // Issue #1491: Also track per-sub-session usage
           const usage = entry.message.usage;
-
-          if (
-
-
-
-              cacheCreation5mTokens: 0,
-              cacheCreation1hTokens: 0,
-              cacheReadTokens: 0,
-              outputTokens: 0,
-              webSearchRequests: 0,
-            };
-          }
-          // Add input tokens
-          if (usage.input_tokens) {
-            modelUsage[model].inputTokens += usage.input_tokens;
-          }
-          // Add cache creation tokens (total)
-          if (usage.cache_creation_input_tokens) {
-            modelUsage[model].cacheCreationTokens += usage.cache_creation_input_tokens;
-          }
-          // Add cache creation tokens breakdown (5m and 1h)
-          if (usage.cache_creation) {
-            if (usage.cache_creation.ephemeral_5m_input_tokens) {
-              modelUsage[model].cacheCreation5mTokens += usage.cache_creation.ephemeral_5m_input_tokens;
-            }
-            if (usage.cache_creation.ephemeral_1h_input_tokens) {
-              modelUsage[model].cacheCreation1hTokens += usage.cache_creation.ephemeral_1h_input_tokens;
-            }
-          }
-          // Add cache read tokens
-          if (usage.cache_read_input_tokens) {
-            modelUsage[model].cacheReadTokens += usage.cache_read_input_tokens;
-          }
-          // Add output tokens
-          if (usage.output_tokens) {
|
|
632
|
-
modelUsage[model].outputTokens += usage.output_tokens;
|
|
633
|
-
}
|
|
530
|
+
if (usage.input_tokens) currentSubSession.inputTokens += usage.input_tokens;
|
|
531
|
+
if (usage.cache_creation_input_tokens) currentSubSession.cacheCreationTokens += usage.cache_creation_input_tokens;
|
|
532
|
+
if (usage.cache_read_input_tokens) currentSubSession.cacheReadTokens += usage.cache_read_input_tokens;
|
|
533
|
+
if (usage.output_tokens) currentSubSession.outputTokens += usage.output_tokens;
|
|
534
|
+
currentSubSession.messageCount++;
|
|
634
535
|
}
|
|
635
536
|
} catch {
|
|
636
537
|
// Skip lines that aren't valid JSON
|
|
637
538
|
continue;
|
|
638
539
|
}
|
|
639
540
|
}
|
|
541
|
+
// Push the final sub-session
|
|
542
|
+
if (currentSubSession.messageCount > 0) {
|
|
543
|
+
subSessions.push(currentSubSession);
|
|
544
|
+
}
|
|
640
545
|
// If no usage data was found, return null
|
|
641
546
|
if (Object.keys(modelUsage).length === 0) {
|
|
642
547
|
return null;
|
|
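Illustrative sketch of the sub-session tracking this hunk adds: JSONL lines are split into sub-sessions at each `compact_boundary` event. The package's `createEmptySubSessionUsage` helper is not shown in this diff, so the stand-in below and the function name `splitIntoSubSessions` are assumptions. The field names follow the added lines above.

```javascript
// Hypothetical stand-in for the package's createEmptySubSessionUsage helper;
// the fields mirror those incremented in the hunk above.
const createEmptySubSessionUsage = () => ({
  inputTokens: 0,
  cacheCreationTokens: 0,
  cacheReadTokens: 0,
  outputTokens: 0,
  messageCount: 0,
});

// Split JSONL session lines into sub-sessions separated by compact_boundary events.
// (splitIntoSubSessions is an illustrative name, not part of the package.)
const splitIntoSubSessions = lines => {
  const subSessions = [];
  const compactifications = [];
  let current = createEmptySubSessionUsage();
  for (const line of lines) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (entry.type === 'system' && entry.subtype === 'compact_boundary') {
      // Close the running sub-session (if it saw any messages) and start a new one
      if (current.messageCount > 0) subSessions.push(current);
      compactifications.push({
        timestamp: entry.timestamp || null,
        preTokens: entry.compactMetadata?.preTokens || null,
        trigger: entry.compactMetadata?.trigger || 'unknown',
      });
      current = createEmptySubSessionUsage();
      continue;
    }
    const usage = entry.message?.usage;
    if (usage && entry.message.model) {
      current.inputTokens += usage.input_tokens || 0;
      current.cacheCreationTokens += usage.cache_creation_input_tokens || 0;
      current.cacheReadTokens += usage.cache_read_input_tokens || 0;
      current.outputTokens += usage.output_tokens || 0;
      current.messageCount++;
    }
  }
  // Push the final sub-session
  if (current.messageCount > 0) subSessions.push(current);
  return { subSessions, compactifications };
};
```

A session with no compactification yields a single sub-session, which is why the hunk that follows only reports `subSessions` when there is more than one.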
@@ -699,6 +604,9 @@ export const calculateSessionTokens = async (sessionId, tempDir) => {
|
|
|
699
604
|
outputTokens: totalOutputTokens,
|
|
700
605
|
totalTokens,
|
|
701
606
|
totalCostUSD: hasCostData ? totalCostUSD : null,
|
|
607
|
+
// Issue #1491: Sub-session and compactification data
|
|
608
|
+
subSessions: subSessions.length > 1 ? subSessions : null, // Only include if compactification occurred
|
|
609
|
+
compactifications: compactifications.length > 0 ? compactifications : null,
|
|
702
610
|
};
|
|
703
611
|
} catch (readError) {
|
|
704
612
|
throw new Error(`Failed to read session file: ${readError.message}`);
|
|
@@ -832,6 +740,14 @@ export const executeClaudeCommand = async params => {
|
|
|
832
740
|
let errorDuringExecution = false;
|
|
833
741
|
let resultSummary = null;
|
|
834
742
|
let resultModelUsage = null;
|
|
743
|
+
// Issue #1491: Track token usage from stream JSON events for independent calculation
|
|
744
|
+
const streamTokenUsage = {
|
|
745
|
+
inputTokens: 0,
|
|
746
|
+
cacheCreationTokens: 0,
|
|
747
|
+
cacheReadTokens: 0,
|
|
748
|
+
outputTokens: 0,
|
|
749
|
+
eventCount: 0,
|
|
750
|
+
};
|
|
835
751
|
// Create interactive mode handler if enabled
|
|
836
752
|
let interactiveHandler = null;
|
|
837
753
|
if (argv.interactiveMode && owner && repo && prNumber) {
|
|
@@ -1054,6 +970,15 @@ export const executeClaudeCommand = async params => {
|
|
|
1054
970
|
lastMessage = data.error || JSON.stringify(data);
|
|
1055
971
|
if (lastMessage.includes('Internal server error')) isInternalServerError = true;
|
|
1056
972
|
}
|
|
973
|
+
// Issue #1491: Track token usage from stream events for independent calculation
|
|
974
|
+
if (data.type === 'assistant' && data.message && data.message.usage) {
|
|
975
|
+
const u = data.message.usage;
|
|
976
|
+
if (u.input_tokens) streamTokenUsage.inputTokens += u.input_tokens;
|
|
977
|
+
if (u.cache_creation_input_tokens) streamTokenUsage.cacheCreationTokens += u.cache_creation_input_tokens;
|
|
978
|
+
if (u.cache_read_input_tokens) streamTokenUsage.cacheReadTokens += u.cache_read_input_tokens;
|
|
979
|
+
if (u.output_tokens) streamTokenUsage.outputTokens += u.output_tokens;
|
|
980
|
+
streamTokenUsage.eventCount++;
|
|
981
|
+
}
|
|
1057
982
|
if (data.type === 'assistant' && data.message && data.message.content) {
|
|
1058
983
|
const content = Array.isArray(data.message.content) ? data.message.content : [data.message.content];
|
|
1059
984
|
for (const item of content) {
|
|
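Illustrative sketch of the stream-side accounting this hunk adds. The diff inlines the logic in `executeClaudeCommand`; the names `createStreamTokenUsage` and `accumulateStreamUsage` below are hypothetical and exist only to make the logic testable in isolation.

```javascript
// Hypothetical factory; the fields match the streamTokenUsage object in the diff.
const createStreamTokenUsage = () => ({
  inputTokens: 0,
  cacheCreationTokens: 0,
  cacheReadTokens: 0,
  outputTokens: 0,
  eventCount: 0,
});

// Hypothetical helper: fold one parsed stream event into the running totals.
// Only assistant events that carry message.usage are counted.
const accumulateStreamUsage = (acc, data) => {
  if (data.type !== 'assistant' || !data.message || !data.message.usage) return acc;
  const u = data.message.usage;
  if (u.input_tokens) acc.inputTokens += u.input_tokens;
  if (u.cache_creation_input_tokens) acc.cacheCreationTokens += u.cache_creation_input_tokens;
  if (u.cache_read_input_tokens) acc.cacheReadTokens += u.cache_read_input_tokens;
  if (u.output_tokens) acc.outputTokens += u.output_tokens;
  acc.eventCount++;
  return acc;
};

// Example: fold a list of already-parsed stream events.
const exampleEvents = [
  { type: 'assistant', message: { usage: { input_tokens: 4, output_tokens: 6 } } },
  { type: 'result' }, // ignored: not an assistant event
  { type: 'assistant', message: { usage: { cache_creation_input_tokens: 10, cache_read_input_tokens: 20 } } },
];
const exampleTotals = exampleEvents.reduce(accumulateStreamUsage, createStreamTokenUsage());
```

Keeping `eventCount` alongside the totals lets the caller distinguish "no usage events seen" from "usage events that summed to zero", which is the condition the diff later uses before returning or displaying `streamTokenUsage`.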
@@ -1336,6 +1261,15 @@ export const executeClaudeCommand = async params => {
|
|
|
1336
1261
|
await displayBudgetStats(usage, log);
|
|
1337
1262
|
}
|
|
1338
1263
|
}
|
|
1264
|
+
// Issue #1491: Display sub-session breakdown if compactification occurred
|
|
1265
|
+
if (argv.tokensBudgetStats && tokenUsage.subSessions) {
|
|
1266
|
+
const primaryModelInfo = Object.values(tokenUsage.modelUsage).find(u => u.modelInfo?.limit)?.modelInfo;
|
|
1267
|
+
await displaySubSessionStats(tokenUsage, primaryModelInfo, log);
|
|
1268
|
+
}
|
|
1269
|
+
// Issue #1491: Display stream vs JSONL token comparison
|
|
1270
|
+
if (argv.tokensBudgetStats && streamTokenUsage.eventCount > 0) {
|
|
1271
|
+
await displayTokenComparison(streamTokenUsage, tokenUsage, log);
|
|
1272
|
+
}
|
|
1339
1273
|
// Show totals if multiple models were used
|
|
1340
1274
|
if (modelIds.length > 1) {
|
|
1341
1275
|
await log('\n 📈 Total across all models:');
|
|
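The implementation of `displayTokenComparison` lives in claude.budget-stats.lib.mjs and is not part of this hunk. As a rough idea of what a stream-versus-JSONL comparison could compute, here is a sketch under stated assumptions: `compareTokenTotals` is a hypothetical name, both inputs are assumed to expose the four token fields used elsewhere in this diff, and the percentage guard follows the removed `percentDiff` line earlier in the diff.

```javascript
// Token fields assumed to exist on both usage objects (assumption, see lead-in).
const TOKEN_FIELDS = ['inputTokens', 'cacheCreationTokens', 'cacheReadTokens', 'outputTokens'];

// Hypothetical helper: per-field difference between stream totals and JSONL totals.
// percentDiff is relative to the JSONL value and falls back to 0 when that value is 0.
const compareTokenTotals = (streamUsage, jsonlUsage) =>
  TOKEN_FIELDS.map(field => {
    const stream = streamUsage[field] || 0;
    const jsonl = jsonlUsage[field] || 0;
    const difference = stream - jsonl;
    const percentDiff = jsonl > 0 ? (difference / jsonl) * 100 : 0;
    return { field, stream, jsonl, difference, percentDiff };
  });

// Format one row in the same style as the cost difference log line.
const formatComparisonRow = row =>
  `${row.field}: ${row.difference} (${row.percentDiff > 0 ? '+' : ''}${row.percentDiff.toFixed(2)}%)`;
```

Returning plain row objects rather than logging directly keeps the arithmetic separate from output, so the same rows could feed both the console log and a GitHub comment.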
@@ -1381,6 +1315,7 @@ export const executeClaudeCommand = async params => {
|
|
|
1381
1315
|
errorDuringExecution, // Issue #1088: Track if error_during_execution subtype occurred
|
|
1382
1316
|
resultSummary, // Issue #1263: Include result summary for --attach-solution-summary
|
|
1383
1317
|
resultModelUsage, // Issue #1454
|
|
1318
|
+
streamTokenUsage: streamTokenUsage.eventCount > 0 ? streamTokenUsage : null, // Issue #1491
|
|
1384
1319
|
};
|
|
1385
1320
|
} catch (error) {
|
|
1386
1321
|
reportError(error, {
|
|
@@ -172,7 +172,7 @@ Initial research.
|
|
|
172
172
|
- When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
|
|
173
173
|
- When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
|
|
174
174
|
- When you read issue, read all details and comments thoroughly.
|
|
175
|
-
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it.
|
|
175
|
+
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) are read, they can cause "Could not process image" errors and crash the AI solver process. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. Instead: (1) When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) When the retry still fails, skip the image and note it was unavailable.
|
|
176
176
|
- When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
|
|
177
177
|
- When you need related code, use gh search code --owner ${owner} [keywords].
|
|
178
178
|
- When you need repo context, read files in your working directory.${
|
|
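The prompt change above asks the agent to verify downloads with the `file` command. For readers who want the same check inside Node code, the sketch below uses a different technique: comparing the file's leading bytes against well-known image signatures. `detectImageType` is a hypothetical helper, not part of the package, and it covers only PNG, JPEG, and GIF.

```javascript
// Leading-byte signatures for a few common image formats.
const IMAGE_SIGNATURES = [
  { type: 'png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
];

// Hypothetical helper: return the detected image type, or null when the bytes
// do not start with a known signature (for example, an HTML error page).
const detectImageType = bytes => {
  for (const { type, bytes: signature } of IMAGE_SIGNATURES) {
    if (bytes.length >= signature.length && signature.every((b, i) => bytes[i] === b)) {
      return type;
    }
  }
  return null;
};
```

In practice the bytes would come from the downloaded file (for instance the first few bytes of a buffer read from disk). A `null` result corresponds to the "do not call Read on this file" case in the prompt.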
@@ -190,11 +190,11 @@ Initial research.
|
|
|
190
190
|
- When accessing GitHub Gists (especially private ones), use gh gist view command instead of direct URL fetching to ensure proper authentication.
|
|
191
191
|
- When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
|
|
192
192
|
- When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
|
|
193
|
-
- When you need comments on a pull request, note that GitHub has
|
|
193
|
+
- When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
|
|
194
194
|
1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
|
|
195
195
|
2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
|
|
196
196
|
3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
|
|
197
|
-
|
|
197
|
+
Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
|
|
198
198
|
- When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.${
|
|
199
199
|
argv && argv.promptGeneralPurposeSubAgent
|
|
200
200
|
? `
|
|
@@ -208,9 +208,9 @@ Initial research.
|
|
|
208
208
|
}
|
|
209
209
|
|
|
210
210
|
Solution development and testing.
|
|
211
|
-
- When issue is solvable, implement
|
|
211
|
+
- When issue is solvable, first create a test that reproduces the problem, then implement the fix.
|
|
212
212
|
- When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
|
|
213
|
-
- When coding, each atomic step that can be useful by itself should be
|
|
213
|
+
- When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
|
|
214
214
|
- When you test:
|
|
215
215
|
start from testing of small functions using separate scripts;
|
|
216
216
|
write unit tests with mocks for easy and quick start.
|
|
@@ -219,9 +219,17 @@ Solution development and testing.
|
|
|
219
219
|
- When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
|
|
220
220
|
- When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
|
|
221
221
|
- When issue is unclear, write comment on issue asking questions.
|
|
222
|
-
- When you encounter any problems that you unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
|
|
222
|
+
- When you encounter any problems that you are unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
|
|
223
223
|
- When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
|
|
224
224
|
|
|
225
|
+
Reproducible testing.
|
|
226
|
+
- When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
|
|
227
|
+
- When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
|
|
228
|
+
- When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
|
|
229
|
+
- When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
|
|
230
|
+
- When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
|
|
231
|
+
- When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
|
|
232
|
+
|
|
225
233
|
Preparing pull request.
|
|
226
234
|
- When you code, follow contributing guidelines.
|
|
227
235
|
- When you commit, write clear message.
|
|
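To make the "Reproducible testing" guidance above concrete, here is a minimal sketch of a reproduce-first workflow. The bug is hypothetical and is not claimed to exist in the package: a percent-difference function that divides by a zero base. The reproducing check fails against the buggy version and passes against the fixed one, which uses the same guard as the removed `percentDiff` line earlier in this diff.

```javascript
// Hypothetical buggy version: divides without guarding against a zero base.
const percentDiffBuggy = (base, actual) => ((actual - base) / base) * 100;

// Fixed version: returns 0 when the base is not positive.
const percentDiffFixed = (base, actual) => (base > 0 ? ((actual - base) / base) * 100 : 0);

// Minimal reproducing check: true only when the implementation returns a
// finite number for a zero base.
const handlesZeroBase = impl => Number.isFinite(impl(0, 5));

// Step 1: the check reproduces the problem against the buggy version.
const reproducedBug = !handlesZeroBase(percentDiffBuggy);
// Step 2: the same check passes once the fix is in place.
const fixVerified = handlesZeroBase(percentDiffFixed);
```

The point of writing the check first is that `reproducedBug` gives evidence the test actually exercises the failure; a test that passes both before and after the fix verifies nothing.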
@@ -229,7 +237,7 @@ Preparing pull request.
|
|
|
229
237
|
- When you open pr, describe solution draft and include tests.
|
|
230
238
|
- When there is a package with version and GitHub Actions workflows for automatic release, update the version (or other necessary release trigger) in your pull request to prepare for next release.
|
|
231
239
|
- When you update existing pr ${prNumber}, use gh pr edit to modify title and description.
|
|
232
|
-
- When you are about to commit or push code,
|
|
240
|
+
- When you are about to commit or push code, run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
|
|
233
241
|
- When you finalize the pull request:
|
|
234
242
|
follow style from merged prs for code, title, and description,
|
|
235
243
|
make sure no uncommitted changes corresponding to the original requirements are left behind,
|
|
@@ -237,7 +245,7 @@ Preparing pull request.
|
|
|
237
245
|
make sure all CI checks passing if they exist before you finish,
|
|
238
246
|
check for latest comments on the issue and pull request to ensure no recent feedback was missed,
|
|
239
247
|
double-check that all changes in the pull request answer to original requirements of the issue,
|
|
240
|
-
make sure no new
|
|
248
|
+
make sure no new bugs are introduced in pull request by carefully reading gh pr diff,
|
|
241
249
|
make sure no previously existing features were removed without an explicit request from users via the issue description, issue comments, and/or pull request comments.
|
|
242
250
|
- When you finish implementation, use gh pr ready ${prNumber}.
|
|
243
251
|
|
|
@@ -260,12 +268,12 @@ Self review.
|
|
|
260
268
|
- When you finalize, confirm code, tests, and description are consistent.${
|
|
261
269
|
argv && argv.promptEnsureAllRequirementsAreMet
|
|
262
270
|
? `
|
|
263
|
-
- When no explicit feedback or requirements
|
|
271
|
+
- When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
|
|
264
272
|
: ''
|
|
265
273
|
}
|
|
266
274
|
|
|
267
275
|
GitHub CLI command patterns.
|
|
268
|
-
-
|
|
276
|
+
- When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
|
|
269
277
|
- When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
|
|
270
278
|
- When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
|
|
271
279
|
- When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
|
|
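The `--paginate` guidance above exists because a single request returns only one page. As a sketch of what pagination does conceptually (this is not how `gh` is implemented), the loop below keeps requesting pages until a short page signals the end. `collectAllPages` and the injected `getPage` callback are hypothetical; a real client would call the GitHub API asynchronously.

```javascript
// Hypothetical helper: gather every item by requesting successive pages.
// getPage(page, perPage) is an injected callback that returns an array of items.
const collectAllPages = (getPage, perPage = 30) => {
  const all = [];
  for (let page = 1; ; page++) {
    const items = getPage(page, perPage);
    all.push(...items);
    // A page shorter than perPage (including an empty one) is the last page.
    if (items.length < perPage) break;
  }
  return all;
};

// Example with an in-memory data source standing in for an API endpoint.
const allComments = Array.from({ length: 65 }, (_, i) => ({ id: i + 1 }));
const getCommentsPage = (page, perPage) => allComments.slice((page - 1) * perPage, page * perPage);

const firstPageOnly = getCommentsPage(1, 30); // what a single unpaginated request would see
const everything = collectAllPages(getCommentsPage);
```

With 65 items and a page size of 30, a single request sees fewer than half of them, which is the failure mode the prompt is guarding against.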
@@ -284,7 +292,11 @@ Playwright MCP usage (browser automation via mcp__playwright__* tools).
|
|
|
284
292
|
- When you need to visually verify how a web page looks or take screenshots, use browser_take_screenshot from Playwright MCP.
|
|
285
293
|
- When you need to fill forms, click buttons, or perform user interactions on web pages, use Playwright MCP tools (browser_click, browser_type, browser_fill_form).
|
|
286
294
|
- When you need to test responsive design or different viewport sizes, use browser_resize from Playwright MCP.
|
|
287
|
-
- When you finish using the browser, always close it with browser_close to free resources
|
|
295
|
+
- When you finish using the browser, always close it with browser_close to free resources.
|
|
296
|
+
- When reproducing UI bugs, use browser_take_screenshot to capture the problem state before implementing any fix.
|
|
297
|
+
- When fixing UI bugs, take before/after screenshots to provide visual evidence of the fix for human verification.
|
|
298
|
+
- When creating UI tests, save baseline screenshots to the repository for visual regression testing.
|
|
299
|
+
- When verifying UI fixes, compare screenshots to ensure the fix doesn't introduce unintended visual changes.`
|
|
288
300
|
: ''
|
|
289
301
|
}${
|
|
290
302
|
argv && argv.promptPlanSubAgent
|
|
@@ -329,7 +341,11 @@ Visual UI work and screenshots.
|
|
|
329
341
|
- When you need to show visual results, take a screenshot and save it to the repository (e.g., in a docs/screenshots/ or assets/ folder).
|
|
330
342
|
- When you save screenshots to the repository, use permanent links in the pull request description markdown (e.g., https://github.com/${owner}/${repo}/blob/${branchName}/docs/screenshots/result.png?raw=true).
|
|
331
343
|
- When uploading images, commit them to the branch first, then reference them using the GitHub blob URL format with ?raw=true suffix (works for both public and private repositories).
|
|
332
|
-
- When the visual result is important for review, mention it explicitly in the pull request description with the embedded image
|
|
344
|
+
- When the visual result is important for review, mention it explicitly in the pull request description with the embedded image.
|
|
345
|
+
- When fixing UI bugs, capture both the "before" (problem) and "after" (fixed) screenshots as evidence for human verification.
|
|
346
|
+
- When reporting UI bugs, include a screenshot of the problem state to enable visual verification of the fix.
|
|
347
|
+
- When the fix is visual, include side-by-side or sequential comparison of before/after states in the PR description.
|
|
348
|
+
- When possible, create automated visual regression tests to prevent the UI bug from recurring.`
|
|
333
349
|
: ''
|
|
334
350
|
}${ciExamples}${getArchitectureCareSubPrompt(argv)}`;
|
|
335
351
|
};
|
|
@@ -152,7 +152,7 @@ Initial research.
|
|
|
152
152
|
- When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
|
|
153
153
|
- When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
|
|
154
154
|
- When you read issue, read all details and comments thoroughly.
|
|
155
|
-
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it.
|
|
155
|
+
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
|
|
156
156
|
- When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
|
|
157
157
|
- When you need related code, use gh search code --owner ${owner} [keywords].
|
|
158
158
|
- When you need repo context, read files in your working directory.${
|
|
@@ -165,17 +165,17 @@ Initial research.
|
|
|
165
165
|
- When accessing GitHub Gists (especially private ones), use gh gist view command instead of direct URL fetching to ensure proper authentication.
|
|
166
166
|
- When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
|
|
167
167
|
- When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
|
|
168
|
-
- When you need comments on a pull request, note that GitHub has
|
|
168
|
+
- When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
|
|
169
169
|
1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
|
|
170
170
|
2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
|
|
171
171
|
3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
|
|
172
|
-
|
|
172
|
+
Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
|
|
173
173
|
- When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.
|
|
174
174
|
|
|
175
175
|
Solution development and testing.
|
|
176
|
-
- When issue is solvable, implement
|
|
176
|
+
- When issue is solvable, first create a test that reproduces the problem, then implement the fix.
|
|
177
177
|
- When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
|
|
178
|
-
- When coding, each atomic step that can be useful by itself should be
|
|
178
|
+
- When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
|
|
179
179
|
- When you test:
|
|
180
180
|
start from testing of small functions using separate scripts;
|
|
181
181
|
write unit tests with mocks for easy and quick start.
|
|
@@ -184,9 +184,17 @@ Solution development and testing.
|
|
|
184
184
|
- When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
|
|
185
185
|
- When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
|
|
186
186
|
- When issue is unclear, write comment on issue asking questions.
|
|
187
|
-
- When you encounter any problems that you unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
|
|
187
|
+
- When you encounter any problems that you are unable to solve yourself (any human feedback or help), write a comment to the pull request asking for help.
|
|
188
188
|
- When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
|
|
189
189
|
|
|
190
|
+
Reproducible testing.
|
|
191
|
+
- When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
|
|
192
|
+
- When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
|
|
193
|
+
- When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
|
|
194
|
+
- When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
|
|
195
|
+
- When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
|
|
196
|
+
- When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
|
|
197
|
+
|
|
190
198
|
Preparing pull request.
|
|
191
199
|
- When you code, follow contributing guidelines.
|
|
192
200
|
- When you commit, write clear message.
|
|
@@ -194,7 +202,7 @@ Preparing pull request.
|
|
|
194
202
|
- When you open pr, describe solution draft and include tests.
|
|
195
203
|
- When there is a package with version and GitHub Actions workflows for automatic release, update the version (or other necessary release trigger) in your pull request to prepare for next release.
|
|
196
204
|
- When you update existing pr ${prNumber}, use gh pr edit to modify title and description.
|
|
197
|
-
- When you are about to commit or push code,
|
|
205
|
+
- When you are about to commit or push code, run local CI checks first if they are available in contributing guidelines (like ruff check, mypy, eslint, etc.) to catch errors before pushing.
|
|
198
206
|
- When you finalize the pull request:
|
|
199
207
|
check that pull request title and description are updated (the PR may start with a [WIP] prefix and placeholder description that should be replaced with actual title and description of the changes),
|
|
200
208
|
follow style from merged prs for code, title, and description,
|
|
@@ -202,7 +210,7 @@ Preparing pull request.
|
|
|
202
210
|
make sure the default branch is merged to the pull request's branch,
|
|
203
211
|
make sure all CI checks passing if they exist before you finish,
|
|
204
212
|
double-check that all changes in the pull request answer to original requirements of the issue,
|
|
205
|
-
make sure no new
|
|
213
|
+
make sure no new bugs are introduced in pull request by carefully reading gh pr diff,
|
|
206
214
|
make sure no previously existing features were removed without an explicit request from users via the issue description, issue comments, and/or pull request comments.
|
|
207
215
|
- When you finish implementation, use gh pr ready ${prNumber}.
|
|
208
216
|
|
|
@@ -225,12 +233,12 @@ Self review.
|
|
|
225
233
|
- When you finalize, confirm code, tests, and description are consistent.${
|
|
226
234
|
argv && argv.promptEnsureAllRequirementsAreMet
|
|
227
235
|
? `
|
|
228
|
-
- When no explicit feedback or requirements
|
|
236
|
+
- When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
|
|
229
237
|
: ''
|
|
230
238
|
}
|
|
231
239
|
|
|
232
240
|
GitHub CLI command patterns.
|
|
233
|
-
-
|
|
241
|
+
- When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
|
|
234
242
|
- When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
|
|
235
243
|
- When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
|
|
236
244
|
- When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
|
package/src/github.lib.mjs
CHANGED
|
@@ -12,8 +12,8 @@ import { uploadLogWithGhUploadLog } from './log-upload.lib.mjs';
|
|
|
12
12
|
import { formatResetTimeWithRelative } from './usage-limit.lib.mjs'; // See: https://github.com/link-assistant/hive-mind/issues/1236
|
|
13
13
|
// Import model info helpers (Issue #1225)
|
|
14
14
|
import { getToolDisplayName, getModelInfoForComment } from './models/index.mjs';
|
|
15
|
-
// Re-export for use by other modules
|
|
16
|
-
|
|
15
|
+
export { getToolDisplayName }; // Re-export for use by other modules
|
|
16
|
+
import { buildBudgetStatsString } from './claude.budget-stats.lib.mjs';
|
|
17
17
|
|
|
18
18
|
/** Build cost estimation string for log comments (Issue #1250) */
|
|
19
19
|
const buildCostInfoString = (totalCostUSD, anthropicTotalCostUSD, pricingInfo) => {
|
|
@@ -366,7 +366,9 @@ export async function attachLogToGitHub(options) {
|
|
|
366
366
|
requestedModel = null, // Issue #1225: The --model flag value
|
|
367
367
|
tool = null, // The tool used (claude, agent, opencode, codex)
|
|
368
368
|
resultModelUsage = null, // Issue #1454
|
|
369
|
+
budgetStatsData = null, // Issue #1491: budget stats for comment
|
|
369
370
|
} = options;
|
|
371
|
+
const budgetStats = budgetStatsData ? buildBudgetStatsString(budgetStatsData.tokenUsage, budgetStatsData.streamTokenUsage) : '';
|
|
370
372
|
const targetName = targetType === 'pr' ? 'Pull Request' : 'Issue';
|
|
371
373
|
const ghCommand = targetType === 'pr' ? 'pr' : 'issue';
|
|
372
374
|
try {
|
|
@@ -552,7 +554,7 @@ ${logContent}
|
|
|
552
554
|
// Issue #1088: "Finished with errors" format - work may have been completed but errors occurred
|
|
553
555
|
const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
|
|
554
556
|
logComment = `## ⚠️ Solution Draft Finished with Errors
|
|
555
|
-
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
|
|
557
|
+
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
|
|
556
558
|
|
|
557
559
|
> **Note**: The session encountered errors during execution, but some work may have been completed. Please review the changes carefully.
|
|
558
560
|
|
|
@@ -568,10 +570,8 @@ ${logContent}
|
|
|
568
570
|
---
|
|
569
571
|
*Now working session is ended, feel free to review and add any feedback on the solution draft.*`;
|
|
570
572
|
} else {
|
|
571
|
-
// Success log format - use helper function for cost info
|
|
572
573
|
const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
|
|
573
|
-
// Determine title based on session type
|
|
574
|
-
// See: https://github.com/link-assistant/hive-mind/issues/1152
|
|
574
|
+
// Determine title based on session type (Issue #1152)
|
|
575
575
|
let title = customTitle;
|
|
576
576
|
let sessionNote = '';
|
|
577
577
|
if (sessionType === 'auto-resume') {
|
|
@@ -585,7 +585,7 @@ ${logContent}
|
|
|
585
585
|
sessionNote = '\n\n**Note**: This session was manually resumed using the --resume flag.';
|
|
586
586
|
}
|
|
587
587
|
logComment = `## ${title}
|
|
588
|
-
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}${sessionNote}
|
|
588
|
+
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}${sessionNote}
|
|
589
589
|
|
|
590
590
|
<details>
|
|
591
591
|
<summary>Click to expand solution draft log (${Math.round(logStats.size / 1024)}KB)</summary>
|
|
@@ -733,7 +733,7 @@ ${errorMessage}
|
|
|
733
733
|
// Issue #1088: "Finished with errors" format - work may have been completed but errors occurred
|
|
734
734
|
const costInfo = buildCostInfoString(totalCostUSD, anthropicTotalCostUSD, pricingInfo);
|
|
735
735
|
logUploadComment = `## ⚠️ Solution Draft Finished with Errors
|
|
736
|
-
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
|
|
736
|
+
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
|
|
737
737
|
|
|
738
738
|
> **Note**: The session encountered errors during execution, but some work may have been completed. Please review the changes carefully.
|
|
739
739
|
|
|
@@ -760,7 +760,7 @@ This log file contains the complete execution trace of the AI ${targetType === '
|
|
|
760
760
|
sessionNote = '\n**Note**: This session was manually resumed using the --resume flag.\n';
|
|
761
761
|
}
|
|
762
762
|
logUploadComment = `## ${title}
|
|
763
|
-
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${modelInfoString}
|
|
763
|
+
This log file contains the complete execution trace of the AI ${targetType === 'pr' ? 'solution draft' : 'analysis'} process.${costInfo}${budgetStats}${modelInfoString}
|
|
764
764
|
${sessionNote}
|
|
765
765
|
### 📎 **Log file uploaded as ${uploadTypeLabel}${chunkInfo}** (${Math.round(logStats.size / 1024)}KB)
|
|
766
766
|
- [View complete solution draft log](${logUrl})
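The hunks above splice a `${budgetStats}` fragment between `${costInfo}` and `${modelInfoString}` in each log comment template. A minimal sketch of that pattern follows, assuming every fragment is either an empty string or a pre-formatted Markdown snippet. `buildLogIntro` and the sample fragment values are hypothetical and are not part of the package.

```javascript
// Hypothetical helper: joins optional Markdown fragments in the order used by the templates above.
// Each fragment is expected to be '' when its feature is disabled, so the sentence stays intact.
function buildLogIntro({ targetType, costInfo = '', budgetStats = '', modelInfoString = '', sessionNote = '' }) {
  const subject = targetType === 'pr' ? 'solution draft' : 'analysis';
  return `This log file contains the complete execution trace of the AI ${subject} process.${costInfo}${budgetStats}${modelInfoString}${sessionNote}`;
}

// Sample fragments are made up for illustration.
const withoutStats = buildLogIntro({ targetType: 'pr', costInfo: '\n\nCost: $0.10' });
const withStats = buildLogIntro({
  targetType: 'issue',
  costInfo: '\n\nCost: $0.10',
  budgetStats: '\n\nTokens: 1234',
  modelInfoString: '\n\nModel: example',
});

console.log(withoutStats);
console.log(withStats);
```

When `budgetStats` is empty the result has the same shape as the removed template lines, which is why the fragment can be added without touching callers that do not enable the flag.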
|
package/src/opencode.prompts.lib.mjs
CHANGED
|
@@ -146,7 +146,7 @@ ${workspaceInstructions}
|
|
|
146
146
|
Initial research.
|
|
147
147
|
- When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
|
|
148
148
|
- When you read issue, read all details and comments thoroughly.
|
|
149
|
-
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it.
|
|
149
|
+
- When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML) using a CLI tool like the 'file' command to check the actual file format. When corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) are read, they can cause "Could not process image" errors and crash the AI solver process. When the file command shows "HTML", "text", or "ASCII text", the download failed — do not call Read on this file. Instead: (1) When images are from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) When the retry still fails, skip the image and note it was unavailable.
|
|
150
150
|
- When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
|
|
151
151
|
- When you need related code, use gh search code --owner ${owner} [keywords].
|
|
152
152
|
- When you need repo context, read files in your working directory.${
|
|
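The added guideline asks the solver to confirm that a downloaded file is really an image before reading it, using the `file` command. As an illustration only, the same idea can be sketched in JavaScript by comparing the leading bytes against well-known image signatures. `detectImageType` is a hypothetical helper; the package itself only adds the prompt text shown above.

```javascript
// Well-known file signatures (magic bytes) for common image formats.
const SIGNATURES = [
  { type: 'png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
];

// Hypothetical helper: returns the image type, or null when the bytes do not
// start with a known image signature (for example an HTML error page saved as .png).
function detectImageType(bytes) {
  for (const { type, bytes: sig } of SIGNATURES) {
    if (bytes.length >= sig.length && sig.every((b, i) => bytes[i] === b)) {
      return type;
    }
  }
  return null;
}

const pngHeader = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0, 0]);
const htmlPage = new TextEncoder().encode('<!DOCTYPE html><title>Not Found</title>');

console.log(detectImageType(pngHeader)); // png
console.log(detectImageType(htmlPage)); // null
```

A `null` result corresponds to the case where the guideline says not to call Read on the file and to retry the download with authentication instead.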
@@ -159,17 +159,17 @@ Initial research.
|
|
|
159
159
|
- When accessing GitHub Gists, use gh gist view command instead of direct URL fetching.
|
|
160
160
|
- When you are fixing a bug, please make sure you first find the actual root cause, do as many experiments as needed.
|
|
161
161
|
- When you are fixing a bug and code does not have enough tracing/logs, add them and make sure they stay in the code, but are switched off by default.
|
|
162
|
-
- When you need comments on a pull request, note that GitHub has
|
|
162
|
+
- When you need comments on a pull request, note that GitHub has three different comment types with different API endpoints:
|
|
163
163
|
1. PR review comments (inline code comments): gh api repos/${owner}/${repo}/pulls/${prNumber}/comments --paginate
|
|
164
164
|
2. PR conversation comments (general discussion): gh api repos/${owner}/${repo}/issues/${prNumber}/comments --paginate
|
|
165
165
|
3. PR reviews (approve/request changes): gh api repos/${owner}/${repo}/pulls/${prNumber}/reviews --paginate
|
|
166
|
-
|
|
166
|
+
Note: The command "gh pr view --json comments" only returns conversation comments and misses review comments.
|
|
167
167
|
- When you need latest comments on issue, use gh api repos/${owner}/${repo}/issues/${issueNumber}/comments --paginate.
|
|
168
168
|
|
|
169
169
|
Solution development and testing.
|
|
170
|
-
- When issue is solvable, implement
|
|
170
|
+
- When issue is solvable, first create a test that reproduces the problem, then implement the fix.
|
|
171
171
|
- When implementing features, search for similar existing implementations in the codebase and use them as examples instead of implementing everything from scratch.
|
|
172
|
-
- When coding, each atomic step that can be useful by itself should be
|
|
172
|
+
- When coding, each atomic step that can be useful by itself should be committed to the pull request's branch, meaning if work will be interrupted by any reason parts of solution will still be kept intact and safe in pull request.
|
|
173
173
|
- When you test:
|
|
174
174
|
start from testing of small functions using separate scripts;
|
|
175
175
|
write unit tests with mocks for easy and quick start.
|
|
@@ -178,9 +178,17 @@ Solution development and testing.
|
|
|
178
178
|
- When you write or modify tests, consider setting reasonable timeouts at test, suite, and CI job levels so failures surface quickly instead of hanging.
|
|
179
179
|
- When you see repeated test timeout patterns in CI, investigate the root cause rather than increasing timeouts.
|
|
180
180
|
- When issue is unclear, write comment on issue asking questions.
|
|
181
|
-
- When you encounter any problems that you unable to solve yourself, write a comment to the pull request asking for help.
|
|
181
|
+
- When you encounter any problems that you are unable to solve yourself, write a comment to the pull request asking for help.
|
|
182
182
|
- When you need human help, use gh pr comment ${prNumber} --body "your message" to comment on existing PR.
|
|
183
183
|
|
|
184
|
+
Reproducible testing.
|
|
185
|
+
- When fixing a bug, create a test that reproduces the problem before implementing the fix. When you cannot reproduce the problem, you cannot verify the fix.
|
|
186
|
+
- When encountering logic bugs, write an automated test that fails due to the bug, then implement the fix to make it pass.
|
|
187
|
+
- When encountering UI bugs, capture a screenshot showing the problem state, then create a visual regression test or manual verification screenshot after the fix.
|
|
188
|
+
- When creating tests, prefer minimum reproducible examples - the simplest test case that demonstrates the issue.
|
|
189
|
+
- When submitting a fix, include in the PR description: (1) how to reproduce the issue, (2) the automated test that verifies the fix, (3) before/after screenshots for UI issues.
|
|
190
|
+
- When a bug fix doesn't have a reproducing test, the fix is incomplete - regressions can silently occur later.
|
|
191
|
+
|
|
184
192
|
Preparing pull request.
|
|
185
193
|
- When you code, follow contributing guidelines.
|
|
186
194
|
- When you commit, write clear message.
|
|
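The "Reproducible testing" guidelines above describe a test-first loop: write a test that fails because of the bug, then change the code until it passes. A minimal sketch of that loop, assuming a made-up page-count function with an off-by-one bug. None of these names come from the package.

```javascript
// Hypothetical buggy implementation: forgets the final partial page.
function pageCountBuggy(totalItems, perPage) {
  return Math.floor(totalItems / perPage);
}

// Fixed implementation: rounds up so a partial page is counted.
function pageCountFixed(totalItems, perPage) {
  return Math.ceil(totalItems / perPage);
}

// Minimal reproducing check: 31 items at 30 per page need 2 pages.
function reproducesBug(pageCount) {
  return pageCount(31, 30) !== 2;
}

console.log(reproducesBug(pageCountBuggy)); // true: the check fails against the bug
console.log(reproducesBug(pageCountFixed)); // false: the fix makes it pass
```

Keeping the reproducing check in the test suite after the fix is what guards against the silent regressions the last guideline mentions.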
@@ -218,12 +226,12 @@ Self review.
|
|
|
218
226
|
- When you finalize, confirm code, tests, and description are consistent.${
|
|
219
227
|
argv && argv.promptEnsureAllRequirementsAreMet
|
|
220
228
|
? `
|
|
221
|
-
- When no explicit feedback or requirements
|
|
229
|
+
- When no explicit feedback or requirements are provided, ensure all changes are correct, consistent, validated, tested, logged and fully meet all discussed requirements (check issue description and all comments in issue and in pull request). Ensure all CI/CD checks pass.`
|
|
222
230
|
: ''
|
|
223
231
|
}
|
|
224
232
|
|
|
225
233
|
GitHub CLI command patterns.
|
|
226
|
-
-
|
|
234
|
+
- When fetching lists from GitHub API, use the --paginate flag to ensure all results are returned (GitHub returns max 30 per page by default).
|
|
227
235
|
- When listing PR review comments (inline code comments), use gh api repos/OWNER/REPO/pulls/NUMBER/comments --paginate.
|
|
228
236
|
- When listing PR conversation comments, use gh api repos/OWNER/REPO/issues/NUMBER/comments --paginate.
|
|
229
237
|
- When listing PR reviews, use gh api repos/OWNER/REPO/pulls/NUMBER/reviews --paginate.
|
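The command patterns above cover three separate comment endpoints, each fetched with `--paginate`. A small sketch that builds those command strings for one pull request follows. `prCommentCommands` is a hypothetical helper, and the owner, repo, and number values are placeholders.

```javascript
// Hypothetical helper: returns the three gh api commands listed above for one PR.
function prCommentCommands(owner, repo, number) {
  const base = `gh api repos/${owner}/${repo}`;
  return {
    reviewComments: `${base}/pulls/${number}/comments --paginate`,
    conversationComments: `${base}/issues/${number}/comments --paginate`,
    reviews: `${base}/pulls/${number}/reviews --paginate`,
  };
}

const commands = prCommentCommands('OWNER', 'REPO', 123);
for (const [kind, command] of Object.entries(commands)) {
  console.log(`${kind}: ${command}`);
}
```

Note that conversation comments live under the `issues` path even for pull requests, which is why a single endpoint cannot return all three kinds.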
package/src/solve.mjs
CHANGED
|
@@ -876,9 +876,10 @@ try {
|
|
|
876
876
|
let anthropicTotalCostUSD = toolResult.anthropicTotalCostUSD;
|
|
877
877
|
let publicPricingEstimate = toolResult.publicPricingEstimate; // Used by agent tool
|
|
878
878
|
let pricingInfo = toolResult.pricingInfo; // Used by agent tool for detailed pricing
|
|
879
|
-
let errorDuringExecution = toolResult.errorDuringExecution || false;
|
|
880
|
-
let resultSummary = toolResult.resultSummary || null;
|
|
881
|
-
let resultModelUsage = toolResult.resultModelUsage || null;
|
|
879
|
+
let errorDuringExecution = toolResult.errorDuringExecution || false;
|
|
880
|
+
let resultSummary = toolResult.resultSummary || null;
|
|
881
|
+
let resultModelUsage = toolResult.resultModelUsage || null;
|
|
882
|
+
let streamTokenUsage = toolResult.streamTokenUsage || null;
|
|
882
883
|
limitReached = toolResult.limitReached;
|
|
883
884
|
cleanupContext.limitReached = limitReached;
|
|
884
885
|
|
|
@@ -1216,7 +1217,7 @@ try {
|
|
|
1216
1217
|
}
|
|
1217
1218
|
|
|
1218
1219
|
// Search for newly created pull requests and comments
|
|
1219
|
-
const verifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage);
|
|
1220
|
+
const verifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage, streamTokenUsage);
|
|
1220
1221
|
const logsAlreadyUploaded = verifyResult?.logUploadSuccess || false;
|
|
1221
1222
|
|
|
1222
1223
|
// Issue #1162: Auto-restart when PR title/description still has placeholder content
|
|
@@ -1263,7 +1264,7 @@ try {
|
|
|
1263
1264
|
await cleanupClaudeFile(tempDir, branchName, null, argv);
|
|
1264
1265
|
|
|
1265
1266
|
// Re-verify results after restart (without auto-restart flag to prevent recursion)
|
|
1266
|
-
const reVerifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, { ...argv, autoRestartOnNonUpdatedPullRequestDescription: false }, shouldAttachLogs, false, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage);
|
|
1267
|
+
const reVerifyResult = await verifyResults(owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, { ...argv, autoRestartOnNonUpdatedPullRequestDescription: false }, shouldAttachLogs, false, sessionId, tempDir, anthropicTotalCostUSD, publicPricingEstimate, pricingInfo, errorDuringExecution, sessionType, resultModelUsage, streamTokenUsage);
|
|
1267
1268
|
|
|
1268
1269
|
if (reVerifyResult?.prTitleHasPlaceholder || reVerifyResult?.prBodyHasPlaceholder) {
|
|
1269
1270
|
await log('⚠️ PR title/description still not updated after restart');
|
|
@@ -1492,9 +1493,6 @@ try {
|
|
|
1492
1493
|
// drainHandles() inside safeExit() will unref/close these before process.exit().
|
|
1493
1494
|
await logActiveHandles(msg => log(msg));
|
|
1494
1495
|
|
|
1495
|
-
// Issue #1431: safeExit()
|
|
1496
|
-
// (process.stdin ReadStream, undici Socket pool, command-stream ChildProcess,
|
|
1497
|
-
// process.stdout/stderr WriteStreams) so the event loop exits naturally, then
|
|
1498
|
-
// calls process.exit(0) as a deterministic safety net.
|
|
1496
|
+
// Issue #1431: safeExit() unrefs handles so the event loop exits naturally, then calls process.exit(0)
|
|
1499
1497
|
await safeExit(0, 'Process completed');
|
|
1500
1498
|
}
|
package/src/solve.results.lib.mjs
CHANGED
|
@@ -494,9 +494,23 @@ export const showSessionSummary = async (sessionId, limitReached, argv, issueUrl
|
|
|
494
494
|
};
|
|
495
495
|
|
|
496
496
|
// Verify results by searching for new PRs and comments
|
|
497
|
-
export const verifyResults = async (owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart = false, sessionId = null, tempDir = null, anthropicTotalCostUSD = null, publicPricingEstimate = null, pricingInfo = null, errorDuringExecution = false, sessionType = 'new', resultModelUsage = null) => {
|
|
497
|
+
export const verifyResults = async (owner, repo, branchName, issueNumber, prNumber, prUrl, referenceTime, argv, shouldAttachLogs, shouldRestart = false, sessionId = null, tempDir = null, anthropicTotalCostUSD = null, publicPricingEstimate = null, pricingInfo = null, errorDuringExecution = false, sessionType = 'new', resultModelUsage = null, streamTokenUsage = null) => {
|
|
498
498
|
await log('\n🔍 Searching for created pull requests or comments...');
|
|
499
499
|
|
|
500
|
+
// Issue #1491: Build budget stats data for GitHub comment (computed once, used in both PR and issue paths)
|
|
501
|
+
let budgetStatsData = null;
|
|
502
|
+
if (argv.tokensBudgetStats && sessionId && tempDir) {
|
|
503
|
+
try {
|
|
504
|
+
const { calculateSessionTokens } = await import('./claude.lib.mjs');
|
|
505
|
+
const tokenUsage = await calculateSessionTokens(sessionId, tempDir);
|
|
506
|
+
if (tokenUsage) {
|
|
507
|
+
budgetStatsData = { tokenUsage, streamTokenUsage };
|
|
508
|
+
}
|
|
509
|
+
} catch (budgetError) {
|
|
510
|
+
if (argv.verbose) await log(` ⚠️ Could not calculate budget stats: ${budgetError.message}`, { verbose: true });
|
|
511
|
+
}
|
|
512
|
+
}
|
|
513
|
+
|
|
500
514
|
try {
|
|
501
515
|
// Get the current user's GitHub username
|
|
502
516
|
const userResult = await $`gh api user --jq .login`;
|
|
@@ -713,6 +727,8 @@ Fixes ${issueRef}
|
|
|
713
727
|
tool: argv.tool || 'claude',
|
|
714
728
|
// Issue #1454: Pass resultModelUsage for accurate multi-model display
|
|
715
729
|
resultModelUsage,
|
|
730
|
+
// Issue #1491: Pass budget stats for token budget display in comment
|
|
731
|
+
budgetStatsData,
|
|
716
732
|
});
|
|
717
733
|
}
|
|
718
734
|
|
|
@@ -797,6 +813,8 @@ Fixes ${issueRef}
|
|
|
797
813
|
tool: argv.tool || 'claude',
|
|
798
814
|
// Issue #1454: Pass resultModelUsage for accurate multi-model display
|
|
799
815
|
resultModelUsage,
|
|
816
|
+
// Issue #1491: Pass budget stats for token budget display in comment
|
|
817
|
+
budgetStatsData,
|
|
800
818
|
});
|
|
801
819
|
}
|
|
802
820
|
|
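The `verifyResults` change above computes `budgetStatsData` once, only when `argv.tokensBudgetStats`, `sessionId`, and `tempDir` are all set, and treats any failure as non-fatal. A self-contained sketch of that guard follows, with the token calculator injected so it runs without the package. `collectBudgetStats`, the stub calculators, and the token field names are hypothetical stand-ins; the real code imports `calculateSessionTokens` from `./claude.lib.mjs`.

```javascript
// Hypothetical stand-alone version of the guard added to verifyResults.
// `calculate` stands in for calculateSessionTokens(sessionId, tempDir).
async function collectBudgetStats({ argv, sessionId, tempDir, streamTokenUsage = null, calculate }) {
  if (!(argv.tokensBudgetStats && sessionId && tempDir)) return null;
  try {
    const tokenUsage = await calculate(sessionId, tempDir);
    return tokenUsage ? { tokenUsage, streamTokenUsage } : null;
  } catch (error) {
    // Stats are optional, so a failure must not block posting the comment.
    if (argv.verbose) console.log(`Could not calculate budget stats: ${error.message}`);
    return null;
  }
}

// Stub calculators with a made-up result shape, for illustration only.
const okCalculate = async () => ({ inputTokens: 10, outputTokens: 5 });
const failingCalculate = async () => {
  throw new Error('session file missing');
};

const demo = Promise.all([
  collectBudgetStats({ argv: { tokensBudgetStats: true }, sessionId: 's1', tempDir: '/tmp/x', calculate: okCalculate }),
  collectBudgetStats({ argv: { tokensBudgetStats: false }, sessionId: 's1', tempDir: '/tmp/x', calculate: okCalculate }),
  collectBudgetStats({ argv: { tokensBudgetStats: true }, sessionId: 's1', tempDir: '/tmp/x', calculate: failingCalculate }),
]);
demo.then(results => console.log(results));
```

Computing the value once and passing the same object to both the PR path and the issue path keeps the two comment formats consistent and avoids reading the session data twice.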