bingocode 1.1.153 → 1.1.155
- package/.claude/skills/leanchy/SKILL.md +1 -1
- package/.claude/skills/leanchypro/skill.md +59 -20
- package/package.json +1 -1
- package/src/skills/bundled/goal.ts +9 -2
- package/src/tools/FileEditTool/FileEditTool.ts +8 -1
- package/src/tools/FileEditTool/utils.ts +96 -1
- package/src/utils/goalEvaluator.ts +41 -12
|
@@ -19,4 +19,4 @@ description: Activate the Leanchy protocol: execution discipline, diagnostic rig
|
|
|
19
19
|
|
|
20
20
|
## Architecture
|
|
21
21
|
- Two duplications → abstract. Search the full codebase before modifying; reuse over reinvention.
|
|
22
|
-
- Module boundaries require explicit contracts. Semantic naming is the documentation.
|
|
22
|
+
- Module boundaries require explicit contracts. Semantic naming is the documentation.
|
|
@@ -1,31 +1,70 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: leanchypro
|
|
3
|
-
description:
|
|
3
|
+
description: Activate the Leanchy Pro protocol: context-density-first execution, delegation discipline, tool ownership, probe-driven delivery, and zero-hallucination engineering.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
# Leanchy Pro
|
|
6
|
+
# Leanchy Pro Protocol
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
Activated for complex tasks, large-scale refactors, and high-value deliveries. Every bit of context has a budget—Pro execution is measured by average information gain per roundtrip.
|
|
9
9
|
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## 0. Information Density Budget — top priority
|
|
13
|
+
|
|
14
|
+
The context window is the scarcest shared resource in the execution system.
|
|
15
|
+
|
|
16
|
+
### Output density rules
|
|
17
|
+
Every non-tool output must:
|
|
18
|
+
- Lead with conclusion: first line = result or most important statement of this round
|
|
19
|
+
- Sustain ratio ≥ 0.7: information gain / total output ≥ 70%. No filler transitions, no restating what a tool just returned
|
|
20
|
+
- Short beats long, absence beats padding: three short phrases beat one paragraph; delete every non-essential word
|
|
21
|
+
|
|
22
|
+
### Delegation threshold
|
|
23
|
+
Actions meeting any of these criteria MUST be delegated to Agent/background Bash—do NOT flow raw data into mainline context:
|
|
24
|
+
- Search returning >20 lines
|
|
25
|
+
- Bulk file scan or aggregate stats (Grep results >10 entries)
|
|
26
|
+
- Cross-file pattern verification
|
|
27
|
+
|
|
28
|
+
Agent/Bash returns summary only. Mainline receives anchor → finding → recommendation, never raw dump.
|
|
29
|
+
|
|
30
|
+
### Three low-density anti-patterns
|
|
31
|
+
|
|
32
|
+
Prohibited: "Let me explain what this code does" → state purpose and key logic point instead
|
|
33
|
+
Prohibited: pasting every Grep result → cherry-pick 2-3 representative samples
|
|
34
|
+
Prohibited: multi-paragraph reasoning → direct conclusion + optional one-line why
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## 1. Tool Ownership — truth via instrumentation
|
|
14
39
|
|
|
15
|
-
|
|
16
|
-
-
|
|
17
|
-
-
|
|
18
|
-
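- **Context freshness**: Use `TaskCreate` and `TaskUpdate` to maintain long-running execution state. Update the task status immediately after finishing the changes to each physical file.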
|
|
40
|
+
- Banned: "I think", "might be", "should be"
|
|
41
|
+
- Cross-validate: critical logic points confirmed from different tool dimensions (Grep + Read, Bash + Agent). Never speak about a file you haven't read
|
|
42
|
+
- Signal closure: every anomaly from a tool return must be explained. No skipping
|
|
19
43
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## 2. Delegation — offload low-density work
|
|
47
|
+
|
|
48
|
+
- Large searches → Agent. Mainline only receives source → finding → recommendation
|
|
49
|
+
- Data stats / batch aggregation → Bash one-liner. Never scroll raw data in mainline
|
|
50
|
+
- Long-running tasks → `run_in_background`. Never block mainline for polling loops
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## 3. Probe-Driven Delivery
|
|
24
55
|
|
|
25
|
-
|
|
26
|
-
-
|
|
27
|
-
-
|
|
28
|
-
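- **Protocol feedback**: High-value patterns discovered during execution must be preserved before the task ends, by using `Write` to record them in some form of `MEMO` or `CLAUDE.md`.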
|
|
56
|
+
- Pre-probe: minimal test script to verify logic-path coverage before refactoring
|
|
57
|
+
- Post-probe ghost scan: Grep/Agent to find hidden dependencies or broken chains after changes
|
|
58
|
+
- Rollback prep: ensure Git-clean state before risky operations
|
|
29
59
|
|
|
30
60
|
---
|
|
31
|
-
|
|
61
|
+
|
|
62
|
+
## 4. Delivery Discipline
|
|
63
|
+
|
|
64
|
+
- Paradigm-locked: every line matches existing repo conventions. Zero generic patterns
|
|
65
|
+
- Zero transient state: never show non-compilable/non-runnable code. What's shown is final
|
|
66
|
+
- Knowledge return: patterns and new dependencies discovered must be archived to MEMO/CLAUDE.md/ADR on completion
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
*Pro boils down to: triangulate with tools, offload low-density work, maximize information density in mainline context.*
|
package/package.json
CHANGED
|
@@ -59,10 +59,17 @@ export function registerGoalSkill(): void {
|
|
|
59
59
|
|
|
60
60
|
Goal condition: "${trimmed}"
|
|
61
61
|
|
|
62
|
-
This goal is now registered for this session.
|
|
62
|
+
This goal is now registered for this session. After each turn, an independent evaluator (Haiku 4.5, a weak model) will check whether the goal is satisfied. Maximum ${maxIter} iterations.
|
|
63
63
|
|
|
64
|
-
|
|
64
|
+
CRITICAL: The evaluator reads ONLY your text output. It cannot see code changes, tool results, or file contents — only the plain text you write.
|
|
65
|
+
|
|
66
|
+
At each turn toward the goal, output a short evaluation block like:
|
|
67
|
+
> EVAL: [metric1]: [value] / [target] → ✓ or ✗
|
|
65
68
|
|
|
69
|
+
This block is the ONLY signal the evaluator can reliably process. Make it short,
|
|
70
|
+
unambiguous, and quantitative. Do NOT expect the evaluator to infer success from narrative discussion.
|
|
71
|
+
|
|
72
|
+
Tell the user: Goal set — you will work autonomously until "${trimmed}" is achieved (max ${maxIter} turns). Send \`/goal clear\` to cancel.
|
|
66
73
|
Now begin: assess current state and take the first concrete action toward the goal.`,
|
|
67
74
|
},
|
|
68
75
|
]
|
|
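A minimal sketch of the evaluation line the new prompt asks the agent to emit each turn. `formatEvalLine` is a hypothetical helper that does not appear in the package, and treating a metric as met when `value >= target` is an assumption; the prompt only specifies the `EVAL: [metric]: [value] / [target] → ✓ or ✗` shape.

```typescript
// Hypothetical helper, not part of bingocode: builds one line in the
// "> EVAL: [metric]: [value] / [target] → ✓ or ✗" shape from the prompt.
function formatEvalLine(metric: string, value: number, target: number): string {
  // Assumption: the metric counts as satisfied once value reaches target.
  const mark = value >= target ? '✓' : '✗'
  return `> EVAL: ${metric}: ${value} / ${target} → ${mark}`
}

console.log(formatEvalLine('tests passing', 12, 12))
console.log(formatEvalLine('lint errors fixed', 3, 7))
```

Keeping the line short and numeric matches the prompt's stated reason for the block: the evaluator only sees plain text output.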
@@ -72,8 +72,10 @@ import {
|
|
|
72
72
|
import {
|
|
73
73
|
areFileEditsInputsEquivalent,
|
|
74
74
|
findActualString,
|
|
75
|
+
findClosestLines,
|
|
75
76
|
getPatchForEdit,
|
|
76
77
|
preserveQuoteStyle,
|
|
78
|
+
visibleWhitespace,
|
|
77
79
|
} from './utils.js'
|
|
78
80
|
|
|
79
81
|
// V8/Bun string length limit is ~2^30 characters (~1 billion). For typical
|
|
@@ -315,10 +317,15 @@ export const FileEditTool = buildTool({
|
|
|
315
317
|
// Use findActualString to handle quote normalization
|
|
316
318
|
const actualOldString = findActualString(file, old_string)
|
|
317
319
|
if (!actualOldString) {
|
|
320
|
+
const BASE = 'String to replace not found.'
|
|
321
|
+
const matches = findClosestLines(file, old_string)
|
|
322
|
+
const msg = matches.length
|
|
323
|
+
? `${BASE}\n→ = tab · = space\nProvided:\n${visibleWhitespace(old_string)}\nClosest matches:\n${matches.map(m => ` line ${m.lineNumber} (${m.diffType})\n ${visibleWhitespace(m.snippet)}`).join('\n')}\n↑ check visible whitespace markers above.`
|
|
324
|
+
: `${BASE}\n→ = tab · = space\nProvided:\n${visibleWhitespace(old_string)}\n↑ check visible whitespace markers above.`
|
|
318
325
|
return {
|
|
319
326
|
result: false,
|
|
320
327
|
behavior: 'ask',
|
|
321
|
-
message:
|
|
328
|
+
message: msg,
|
|
322
329
|
meta: {
|
|
323
330
|
isFilePathAbsolute: String(isAbsolute(file_path)),
|
|
324
331
|
},
|
|
@@ -70,6 +70,13 @@ export function stripTrailingWhitespace(str: string): string {
|
|
|
70
70
|
* @param searchString The string to search for
|
|
71
71
|
* @returns The actual string found in the file, or null if not found
|
|
72
72
|
*/
|
|
73
|
+
|
|
74
|
+
/** Normalizes Unicode dashes (em-dash, en-dash, horizontal bar) to standard ASCII dashes.
|
|
75
|
+
* Handles model-output ASCII dashes when file content contains Unicode dash variants.
|
|
76
|
+
* Fixes Edit tool matching failures from encoding discrepancies. */
|
|
77
|
+
export function normalizeDashes(str: string): string {
|
|
78
|
+
return str.replaceAll('—', '-').replaceAll('–', '-').replaceAll('―', '-')
|
|
79
|
+
}
|
|
73
80
|
export function findActualString(
|
|
74
81
|
fileContent: string,
|
|
75
82
|
searchString: string,
|
|
@@ -89,6 +96,14 @@ export function findActualString(
|
|
|
89
96
|
return fileContent.substring(searchIndex, searchIndex + searchString.length)
|
|
90
97
|
}
|
|
91
98
|
|
|
99
|
+
// Try with normalized dashes (em-dash, en-dash -> ASCII dash)
|
|
100
|
+
const dashedSearch = normalizeDashes(searchString)
|
|
101
|
+
const dashedFile = normalizeDashes(fileContent)
|
|
102
|
+
const dashIndex = dashedFile.indexOf(dashedSearch)
|
|
103
|
+
if (dashIndex !== -1) {
|
|
104
|
+
return fileContent.substring(dashIndex, dashIndex + searchString.length)
|
|
105
|
+
}
|
|
106
|
+
|
|
92
107
|
return null
|
|
93
108
|
}
|
|
94
109
|
|
|
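The dash fallback added to `findActualString` can be illustrated in isolation. This is a sketch under stated assumptions, not the package's code: `findWithDashFallback` is a hypothetical name, the exact-match and quote-normalization steps of the real function are reduced to a plain `indexOf`, and a regex with Unicode escapes stands in for the chained `replaceAll` calls. It relies on each Unicode dash being one UTF-16 code unit, so indices in the normalized text line up with the original.

```typescript
// Same mapping as normalizeDashes in the diff: em dash (U+2014),
// en dash (U+2013) and horizontal bar (U+2015) become ASCII '-'.
function normalizeDashes(str: string): string {
  return str.replace(/[\u2014\u2013\u2015]/g, '-')
}

// Hypothetical, simplified stand-in for findActualString.
function findWithDashFallback(fileContent: string, searchString: string): string | null {
  // Exact match first.
  if (fileContent.indexOf(searchString) !== -1) return searchString
  // Fallback: compare dash-normalized copies of both strings.
  const idx = normalizeDashes(fileContent).indexOf(normalizeDashes(searchString))
  if (idx === -1) return null
  // Lengths are preserved by normalization, so slice the ORIGINAL text.
  return fileContent.substring(idx, idx + searchString.length)
}

const file = 'range: 1\u20135 \u2014 inclusive'
console.log(findWithDashFallback(file, '1-5 - inclusive'))
```

Returning the slice of the original file, rather than the normalized search string, is what lets the later replacement target the text as it actually exists on disk.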
@@ -198,6 +213,75 @@ function applyCurlySingleQuotes(str: string): string {
|
|
|
198
213
|
return result.join('')
|
|
199
214
|
}
|
|
200
215
|
|
|
216
|
+
/**
|
|
217
|
+
* Error class for when an edit's old_string can't be found in the file.
|
|
218
|
+
* Carries diagnostics for better error reporting.
|
|
219
|
+
*/
|
|
220
|
+
export class EditNotFoundError extends Error {
|
|
221
|
+
diagnostics: {
|
|
222
|
+
searchString: string
|
|
223
|
+
visibleSearch: string
|
|
224
|
+
closestMatches: {
|
|
225
|
+
snippet: string
|
|
226
|
+
lineNumber: number
|
|
227
|
+
diffType: string
|
|
228
|
+
}[]
|
|
229
|
+
}
|
|
230
|
+
constructor(
|
|
231
|
+
message: string,
|
|
232
|
+
diagnostics: EditNotFoundError['diagnostics'],
|
|
233
|
+
) {
|
|
234
|
+
super(message)
|
|
235
|
+
this.name = 'EditNotFoundError'
|
|
236
|
+
this.diagnostics = diagnostics
|
|
237
|
+
}
|
|
238
|
+
}
|
|
239
|
+
|
|
240
|
+
/**
|
|
241
|
+
* Renders whitespace characters as visible Unicode equivalents:
|
|
242
|
+
* tab → '→', space → '·'
|
|
243
|
+
*/
|
|
244
|
+
export function visibleWhitespace(str: string): string {
|
|
245
|
+
return str.replace(/\t/g, '→').replace(/ /g, '·')
|
|
246
|
+
}
|
|
247
|
+
|
|
248
|
+
/**
|
|
249
|
+
* Finds up to 3 lines in fileContent whose content (ignoring leading whitespace)
|
|
250
|
+
* matches the content of the first line of searchString.
|
|
251
|
+
* Used for diagnostic purposes when findActualString returns null.
|
|
252
|
+
*
|
|
253
|
+
* Returns matches in file order; every match is currently labeled 'content match'.
|
|
254
|
+
*/
|
|
255
|
+
export function findClosestLines(
|
|
256
|
+
fileContent: string,
|
|
257
|
+
searchString: string,
|
|
258
|
+
): { snippet: string; lineNumber: number; diffType: string }[] {
|
|
259
|
+
const firstContent = searchString.split('\n')[0]!.replace(/^\s+/, '')
|
|
260
|
+
if (!firstContent) return []
|
|
261
|
+
|
|
262
|
+
const matches: { snippet: string; lineNumber: number; diffType: string }[] = []
|
|
263
|
+
const fileLines = fileContent.split('\n')
|
|
264
|
+
|
|
265
|
+
for (let i = 0; i < fileLines.length; i++) {
|
|
266
|
+
const line = fileLines[i]!
|
|
267
|
+
if (line.replace(/^\s+/, '') !== firstContent) continue
|
|
268
|
+
|
|
269
|
+
const snippet = line.replace(/\s+$/, '')
|
|
270
|
+
|
|
271
|
+
// Avoid duplicates
|
|
272
|
+
if (!matches.some(m => m.snippet === snippet)) {
|
|
273
|
+
matches.push({
|
|
274
|
+
snippet,
|
|
275
|
+
lineNumber: i + 1,
|
|
276
|
+
diffType: 'content match',
|
|
277
|
+
})
|
|
278
|
+
if (matches.length >= 3) break
|
|
279
|
+
}
|
|
280
|
+
}
|
|
281
|
+
|
|
282
|
+
return matches
|
|
283
|
+
}
|
|
284
|
+
|
|
201
285
|
/**
|
|
202
286
|
* Transform edits to ensure replace_all always has a boolean value
|
|
203
287
|
* @param edits Array of edits with optional replace_all
|
|
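To make the new diagnostics concrete, here is a small standalone sketch of an indentation mismatch (spaces supplied, tab in the file). The two functions are re-typed from the hunk above with minor adjustments (`?? ''` instead of non-null assertions) so the snippet compiles on its own; the sample file content is invented for illustration.

```typescript
type ClosestLine = { snippet: string; lineNumber: number; diffType: string }

// Tab becomes '→', space becomes '·', as in the diff.
function visibleWhitespace(str: string): string {
  return str.replace(/\t/g, '→').replace(/ /g, '·')
}

// Up to 3 lines whose text, ignoring leading whitespace, equals the
// first line of searchString (also ignoring leading whitespace).
function findClosestLines(fileContent: string, searchString: string): ClosestLine[] {
  const firstContent = (searchString.split('\n')[0] ?? '').replace(/^\s+/, '')
  if (!firstContent) return []
  const matches: ClosestLine[] = []
  const fileLines = fileContent.split('\n')
  for (let i = 0; i < fileLines.length; i++) {
    const line = fileLines[i] ?? ''
    if (line.replace(/^\s+/, '') !== firstContent) continue
    const snippet = line.replace(/\s+$/, '')
    if (!matches.some(m => m.snippet === snippet)) {
      matches.push({ snippet, lineNumber: i + 1, diffType: 'content match' })
      if (matches.length >= 3) break
    }
  }
  return matches
}

// Invented sample: the file indents with a tab, the edit used two spaces.
const sampleFile = 'function f() {\n\tconst x = 1\n}'
const provided = '  const x = 1'
console.log('Provided: ' + visibleWhitespace(provided))
for (const m of findClosestLines(sampleFile, provided)) {
  console.log(`line ${m.lineNumber}: ${visibleWhitespace(m.snippet)}`)
}
```

With the markers rendered, the `··` versus `→` difference is visible at a glance, which is the point of the new error message.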
@@ -323,7 +407,18 @@ export function getPatchForEdits({
|
|
|
323
407
|
|
|
324
408
|
// If this edit didn't change anything, throw an error
|
|
325
409
|
if (updatedFile === previousContent) {
|
|
326
|
-
|
|
410
|
+
const closest = findClosestLines(fileContents, edit.old_string)
|
|
411
|
+
throw new EditNotFoundError(
|
|
412
|
+
closest.length
|
|
413
|
+
? `Edit failed — closest match:
|
|
414
|
+
${closest.map(m => ` line ${m.lineNumber}: ${visibleWhitespace(m.snippet)} (${m.diffType})`).join('\n')}`
|
|
415
|
+
: 'Edit failed — string not found in file.',
|
|
416
|
+
{
|
|
417
|
+
searchString: edit.old_string,
|
|
418
|
+
visibleSearch: visibleWhitespace(edit.old_string),
|
|
419
|
+
closestMatches: closest,
|
|
420
|
+
},
|
|
421
|
+
)
|
|
327
422
|
}
|
|
328
423
|
|
|
329
424
|
// Track the new string that was applied
|
|
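One way a caller might consume the structured diagnostics carried by `EditNotFoundError` is sketched below. The class is re-declared from the hunk above so the snippet is self-contained; `describeFailure` is a hypothetical consumer, and nothing in this diff shows how the package itself handles the error. The check uses `err.name` rather than `instanceof EditNotFoundError` only so the sketch behaves the same under older compile targets.

```typescript
type ClosestMatch = { snippet: string; lineNumber: number; diffType: string }

// Re-declared from the diff for a standalone example.
class EditNotFoundError extends Error {
  diagnostics: {
    searchString: string
    visibleSearch: string
    closestMatches: ClosestMatch[]
  }
  constructor(message: string, diagnostics: EditNotFoundError['diagnostics']) {
    super(message)
    this.name = 'EditNotFoundError'
    this.diagnostics = diagnostics
  }
}

// Hypothetical consumer: turns the diagnostics into a one-line hint.
function describeFailure(err: unknown): string {
  if (err instanceof Error && err.name === 'EditNotFoundError') {
    const d = (err as EditNotFoundError).diagnostics
    const lines = d.closestMatches.map(m => m.lineNumber)
    return lines.length
      ? `retry near line(s) ${lines.join(', ')}`
      : 'no candidate lines found'
  }
  return 'unexpected error'
}

try {
  throw new EditNotFoundError('Edit failed', {
    searchString: '  const x = 1',
    visibleSearch: '··const·x·=·1',
    closestMatches: [{ snippet: '\tconst x = 1', lineNumber: 2, diffType: 'content match' }],
  })
} catch (e) {
  console.log(describeFailure(e))
}
```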
@@ -42,26 +42,55 @@ export async function evaluateGoal(
|
|
|
42
42
|
|
|
43
43
|
const prompt = `You are a goal completion evaluator. Determine if the goal has been fully achieved.
|
|
44
44
|
|
|
45
|
+
IMPORTANT: The agent may produce EVAL blocks intended for you. Parse them first.
|
|
46
|
+
|
|
45
47
|
Goal: "${goalCondition}"
|
|
46
48
|
|
|
47
49
|
Recent assistant output:
|
|
48
50
|
${recentAssistantTexts || '(none yet)'}
|
|
49
51
|
|
|
50
|
-
|
|
52
|
+
Evaluate:
|
|
53
|
+
1. Did the agent produce a final EVAL block? If so, use those values directly.
|
|
54
|
+
2. If no EVAL blocks found, infer based on any explicit declarations (e.g. "✓", "100%", "fixed", "complete").
|
|
55
|
+
3. Output ONLY valid JSON — no explanation or markdown.
|
|
56
|
+
|
|
57
|
+
Respond in:
|
|
51
58
|
{"satisfied": true|false, "reason": "<one sentence>", "gap": "<missing item or null>"}`
|
|
52
59
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
60
|
+
let text = ''
|
|
61
|
+
try {
|
|
62
|
+
const response = await client.messages.create({
|
|
63
|
+
model: GOAL_EVALUATOR_MODEL,
|
|
64
|
+
max_tokens: 256,
|
|
65
|
+
messages: [{ role: 'user', content: prompt }],
|
|
66
|
+
})
|
|
67
|
+
text = response.content.find((b: any) => b.type === 'text')?.text || ''
|
|
68
|
+
} catch (e) {
|
|
69
|
+
return {
|
|
70
|
+
satisfied: false,
|
|
71
|
+
reason: 'Evaluator API error',
|
|
72
|
+
gap: e instanceof Error ? e.message : String(e),
|
|
73
|
+
}
|
|
74
|
+
}
|
|
58
75
|
|
|
59
|
-
const text =
|
|
60
|
-
response.content.find((b: any) => b.type === 'text')?.text || ''
|
|
61
76
|
try {
|
|
62
|
-
|
|
77
|
+
// Strip markdown code fences and find JSON object bounds
|
|
78
|
+
let cleaned = text
|
|
79
|
+
.replace(/```(?:json)?\s*/gi, '')
|
|
80
|
+
.replace(/```/g, '')
|
|
81
|
+
.trim()
|
|
82
|
+
const start = cleaned.indexOf('{')
|
|
83
|
+
const end = cleaned.lastIndexOf('}')
|
|
84
|
+
if (start === -1 || end === -1 || end <= start) {
|
|
85
|
+
throw new Error('No JSON object found')
|
|
86
|
+
}
|
|
87
|
+
cleaned = cleaned.slice(start, end + 1)
|
|
63
88
|
return JSON.parse(cleaned) as GoalEvalResult
|
|
64
|
-
} catch {
|
|
65
|
-
return {
|
|
89
|
+
} catch (e) {
|
|
90
|
+
return {
|
|
91
|
+
satisfied: false,
|
|
92
|
+
reason: 'Evaluator parse error',
|
|
93
|
+
gap: `${e instanceof Error && e.message !== 'No JSON object found' ? e.message : 'raw output'}: ${text.slice(0, 200)}`,
|
|
94
|
+
}
|
|
66
95
|
}
|
|
67
|
-
}
|
|
96
|
+
}
|
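The fence-stripping and brace-bounding step added to `evaluateGoal` can be tried on its own. The following is a sketch, not the package's function: `extractJsonObject` is a hypothetical name, the `GoalEvalResult` shape is inferred from the prompt's response template rather than from the real type definition, and the fence marker is built by concatenation purely so the example contains no literal code fence.

```typescript
// Assumed shape, inferred from the evaluator prompt's response template.
interface GoalEvalResult {
  satisfied: boolean
  reason: string
  gap: string | null
}

// Three backticks, assembled so this example has no literal fence in it.
const FENCE = '`' + '`' + '`'

// Hypothetical helper mirroring the parsing steps in the diff.
function extractJsonObject(text: string): GoalEvalResult {
  // Drop opening fences (with optional "json" tag) and closing fences.
  const cleaned = text
    .replace(new RegExp(FENCE + '(?:json)?\\s*', 'gi'), '')
    .trim()
  // Keep only the span from the first '{' to the last '}'.
  const start = cleaned.indexOf('{')
  const end = cleaned.lastIndexOf('}')
  if (start === -1 || end === -1 || end <= start) {
    throw new Error('No JSON object found')
  }
  return JSON.parse(cleaned.slice(start, end + 1)) as GoalEvalResult
}

const raw =
  'Here is the verdict:\n' + FENCE + 'json\n' +
  '{"satisfied": true, "reason": "EVAL block shows 12 / 12", "gap": null}\n' + FENCE
console.log(extractJsonObject(raw).satisfied)
```

Bounding by the outermost braces tolerates leading chatter and trailing fences, but it would still fail on output containing two separate JSON objects, in which case `JSON.parse` throws and the caller's parse-error branch applies.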