yam-harness 0.1.2 → 0.1.3
- package/DECISIONS.md +10 -10
- package/ROADMAP.md +17 -15
- package/bin/yam.js +1 -6
- package/package.json +1 -1
- package/references/current-docs.md +4 -4
- package/references/db-supabase-safety-lite.md +4 -4
- package/references/honest-completion.md +4 -4
- package/references/hook-lite.md +5 -5
- package/references/markdown-management.md +4 -4
- package/references/memory.md +5 -5
- package/references/mission.md +4 -4
- package/references/quick.md +4 -4
- package/references/token-budget-reporter.md +4 -4
- package/references/tool-trust-layer.md +4 -4
- package/references/ueye.md +5 -5
- package/skills/ueye/SKILL.md +1 -1
- package/templates/tuning-log.md +3 -3
- package/yam.manifest.json +1 -1
package/DECISIONS.md
CHANGED
@@ -1,17 +1,17 @@
 # yam Decision Baseline
 
-Every `yam` change is evaluated against
+Every `yam` change is evaluated against strict proof, modular skill, and minimal-core harness principles.
 
 ## Fixed Questions
 
-1. What
-2. What
-3. What
+1. What needs concrete evidence before completion?
+2. What should stay selective or low-context?
+3. What can be removed to keep the core obeyable?
 4. What should `yam` keep light by default, and what should deepen deliberately?
 
 ## Borrow
 
-###
+### Strict Proof
 
 - Truthful completion language.
 - Risk escalation.
@@ -19,7 +19,7 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 - Fake versus real distinction.
 - Runtime/process proof only when explicitly requested.
 
-###
+### Modular Skills
 
 - Skills-first structure.
 - Selective install.
@@ -27,7 +27,7 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 - Token optimization.
 - Project-specific rules instead of global bloat.
 
-###
+### Minimal Core
 
 - Short core.
 - Few route names.
@@ -36,21 +36,21 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 
 ## Reject
 
-### From
+### From Strict Proof
 
 - Mandatory hooks.
 - Mandatory Team or subagent proof.
 - Always-on tmux/proof lifecycle.
 - Heavy memory systems for ordinary edits.
 
-### From
+### From Modular Skills
 
 - Full install by default.
 - Giant catalog context.
 - Hook runtime by default.
 - Too many always-on rules.
 
-### From Minimal
+### From Minimal Core
 
 - Under-verification.
 - Vague quality rules.
package/ROADMAP.md
CHANGED
@@ -97,12 +97,14 @@ Tasks:
 
 ### 8. Scout / Research Workflow
 
-Goal: give yam a research lane
+Goal: give yam a research lane that is evidence-bound, lightweight, and decision-oriented.
 
-
+Research reference points:
 
--
--
+- Evidence boundaries.
+- Source freshness.
+- Fact/inference/recommendation separation.
+- Decision-oriented summaries.
 
 Tasks:
 
@@ -143,13 +145,13 @@ Tasks:
 
 Goal: preserve durable lessons without turning yam into a heavy automatic memory system.
 
-
+Kept:
 
 - Sparse one-record-per-file storage.
 - Wrongness-style records for repeated mistakes and wrong decisions.
 - Deliberate forgetting via resolve instead of permanent prompt injection.
 
-
+Kept:
 
 - Evidence before recommendation.
 - Clear separation between observation and next action.
@@ -173,13 +175,13 @@ Tasks:
 
 Goal: prevent false runtime completion while keeping ordinary work fast.
 
-
+Kept:
 
 - Runtime truth vocabulary.
 - Cleanup must be backed by exit/closure evidence.
 - tmux physical proof idea, reduced to route-level evidence notes.
 
-
+Kept:
 
 - Evidence boundaries before recommendation.
 - Explicit partial/blocked/assumed language.
@@ -202,13 +204,13 @@ Tasks:
 
 Goal: provide one explicit heavy execution route without increasing total skill count.
 
-
+Kept:
 
 - Real Team/subagent route boundary.
 - Cross-verification before completion.
 - Runtime/tmux/browser proof when mission evidence needs it.
 
-
+Kept:
 
 - Role-specific work boundaries.
 - Evidence-first reporting.
@@ -234,21 +236,21 @@ Tasks:
 
 Goal: remove overlapping skill roles while preserving the best parts of the old routes.
 
-
+Kept:
 
 - Source screenshot inventory before visual claims.
 - P0-P3 issue ledger.
 - P0/P1-first fix loop.
 - Partial truth cap for text-only or missing-screenshot review.
 
-
+Kept:
 
 - Smallest useful verification command.
 - Group errors by file and root cause.
 - Fix one error class at a time.
 - Compact PASS/FAIL reporting.
 
-
+Kept:
 
 - Real preview/screenshot evidence.
 - Compact design direction.
@@ -275,14 +277,14 @@ Tasks:
 Goal: keep beginner momentum while creating a path toward professional proof-first work.
 The hook stays light, but the harness direction does not. `yam` should support a depth ladder: direction fit first, focused proof for ordinary work, strong proof for risky work, and real team proof for `$mission`.
 
-
+Kept:
 
 - Hook status and trust reporting.
 - Tool readiness as evidence.
 - DB/Supabase safety thinking.
 - Runtime/tmux/process cleanup truth.
 
-
+Kept:
 
 - Selective install and low-context operation.
 - Evidence boundaries instead of always-on gates.
package/bin/yam.js
CHANGED
@@ -859,9 +859,6 @@ async function buildYamLiteContext({ cwd, prompt }) {
   if (docsHint) lines.push(docsHint);
   const routeHint = yamLiteRouteHint(prompt);
   if (routeHint) lines.push(routeHint);
-  if (await exists(path.join(path.resolve(cwd), '.sneakoscope'))) {
-    lines.push('Caution: active .sneakoscope detected; avoid mixing proof gates unless the user explicitly wants it.');
-  }
   return lines.join('\n');
 }
 
@@ -1309,7 +1306,7 @@ async function inspectProjectPack(targetDir = process.cwd()) {
   const instructionSurfaces = await findInstructionSurfaces(resolved);
 
   if (missingSections.length) issues.push(`missing section(s): ${missingSections.join(', ')}`);
-  if (words > 1200) warnings.push(`pack is long (${words} words); keep the
+  if (words > 1200) warnings.push(`pack is long (${words} words); keep the core compact`);
   if (words < 80) warnings.push(`pack is very short (${words} words); direction may be too thin to reuse`);
   if (packAgeDays > PACK_STALE_DAYS) warnings.push(`pack is ${packAgeDays} days old; review whether direction or commands changed`);
   if (placeholderLines > 12) warnings.push(`${placeholderLines} placeholder lines are still blank`);
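The threshold checks in this hunk can be read in isolation as a small pure function. The following is only a sketch, not the package's code: `collectPackWarnings` is a hypothetical name, and the value of `PACK_STALE_DAYS` is not visible in the diff, so `30` is an assumed placeholder.

```javascript
// Assumed value: the diff references PACK_STALE_DAYS but never shows its definition.
const PACK_STALE_DAYS = 30;

// Hypothetical helper that mirrors the four warning checks on the 0.1.3 side of the hunk.
function collectPackWarnings({ words, packAgeDays, placeholderLines }) {
  const warnings = [];
  if (words > 1200) warnings.push(`pack is long (${words} words); keep the core compact`);
  if (words < 80) warnings.push(`pack is very short (${words} words); direction may be too thin to reuse`);
  if (packAgeDays > PACK_STALE_DAYS) warnings.push(`pack is ${packAgeDays} days old; review whether direction or commands changed`);
  if (placeholderLines > 12) warnings.push(`${placeholderLines} placeholder lines are still blank`);
  return warnings;
}
```

Keeping the checks free of file access makes the thresholds easy to exercise directly; the real `inspectProjectPack` presumably derives `words`, `packAgeDays`, and `placeholderLines` from the pack file first.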
@@ -1354,9 +1351,7 @@ async function findInstructionSurfaces(dir) {
     { path: 'CLAUDE.md', level: 'warning', note: 'active CLAUDE.md may carry non-yam instructions' },
     { path: 'RULES.md', level: 'warning', note: 'active RULES.md may carry non-yam instructions' },
     { path: '.codex/AGENTS.md', level: 'warning', note: 'active .codex/AGENTS.md may override project behavior' },
-    { path: '.codex/SNEAKOSCOPE.md', level: 'issue', note: 'active Sneakoscope instruction file detected' },
     { path: '.codex/hooks.json', level: 'issue', note: 'active Codex hook file detected' },
-    { path: '.sneakoscope', level: 'issue', note: 'active Sneakoscope directory detected' },
     { path: '.agents', level: 'warning', note: 'project-local .agents directory may add additional skills or instructions' }
   ];
   const found = [];
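To make the effect of the removed entries concrete, here is a minimal sketch of a surface scan over the candidates that remain visible after this hunk. It is an illustration under assumptions: `scanInstructionSurfaces`, `splitByLevel`, and the injected `existsFn` predicate are hypothetical names, the scan is synchronous for brevity, and any candidates defined before line 1354 of `yam.js` are not shown in the diff and therefore not included.

```javascript
// Only the entries visible on the 0.1.3 side of the hunk; earlier entries are not shown in the diff.
const VISIBLE_CANDIDATES = [
  { path: 'CLAUDE.md', level: 'warning', note: 'active CLAUDE.md may carry non-yam instructions' },
  { path: 'RULES.md', level: 'warning', note: 'active RULES.md may carry non-yam instructions' },
  { path: '.codex/AGENTS.md', level: 'warning', note: 'active .codex/AGENTS.md may override project behavior' },
  { path: '.codex/hooks.json', level: 'issue', note: 'active Codex hook file detected' },
  { path: '.agents', level: 'warning', note: 'project-local .agents directory may add additional skills or instructions' }
];

// Hypothetical scanner: existsFn is an injected predicate (path string -> boolean),
// so the sketch runs without touching the real filesystem.
function scanInstructionSurfaces(dir, existsFn, candidates = VISIBLE_CANDIDATES) {
  const found = [];
  for (const candidate of candidates) {
    const fullPath = `${dir}/${candidate.path}`;
    if (existsFn(fullPath)) found.push({ ...candidate, fullPath });
  }
  return found;
}

// Hypothetical helper: separate blocking issues from advisory warnings.
function splitByLevel(found) {
  return {
    issues: found.filter((entry) => entry.level === 'issue'),
    warnings: found.filter((entry) => entry.level === 'warning')
  };
}

// Usage with a fake filesystem represented as a Set of paths.
const present = new Set(['/proj/CLAUDE.md', '/proj/.codex/hooks.json']);
const report = splitByLevel(scanInstructionSurfaces('/proj', (p) => present.has(p)));
```

With the two Sneakoscope entries gone, a project containing only a `.sneakoscope` directory would no longer appear in such a report.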
package/package.json
CHANGED
package/references/current-docs.md
CHANGED
@@ -36,10 +36,10 @@ Or:
 Current-docs proof: skipped because this was stable/local/non-SDK work.
 ```
 
-##
+## Design Baseline
 
-
+Strict proof favors source-backed evidence for current tool behavior.
 
-
+Modular skill workflows keep research/context selective and low-context.
 
-
+Minimal-core design says the rule is useful only when it changes the answer.
package/references/db-supabase-safety-lite.md
CHANGED
@@ -31,10 +31,10 @@ Before claiming safe:
 - A successful migration command is not automatically safe; it only proves that command execution completed.
 - Do not claim production safety without environment evidence.
 
-##
+## Design Baseline
 
-
+Strict proof would gate destructive DB work more aggressively.
 
-
+Modular skill workflows keep the check selective and evidence-bound.
 
-
+Minimal-core design keeps this as a short rule and a small detector, not a full DB policy engine.
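The phrase "a short rule and a small detector" can be illustrated with a few lines of pattern matching. This is a sketch under stated assumptions, not the detector shipped in the package: the pattern list, the names `detectDestructive` and `safetyClaim`, and the returned status strings are all hypothetical. `supabase db reset` is used only as an example of a command that resets a database.

```javascript
// Hypothetical patterns; the package's real detection rules are not shown in this diff.
const DESTRUCTIVE_PATTERNS = [
  { name: 'drop', regex: /\bdrop\s+(table|schema|database)\b/i },
  { name: 'truncate', regex: /\btruncate\b/i },
  { name: 'delete-without-where', regex: /\bdelete\s+from\s+\S+\s*(;|$)/i },
  { name: 'supabase-db-reset', regex: /\bsupabase\s+db\s+reset\b/i }
];

// Return the names of every destructive pattern found in a command string.
function detectDestructive(command) {
  return DESTRUCTIVE_PATTERNS.filter((p) => p.regex.test(command)).map((p) => p.name);
}

// Hypothetical status wording: a clean exit only shows that execution completed,
// so "safe" is never claimed without environment evidence.
function safetyClaim({ command, exitCode, hasEnvironmentEvidence }) {
  const hits = detectDestructive(command);
  if (hits.length > 0) return `caution: destructive pattern(s) ${hits.join(', ')}`;
  if (exitCode !== 0) return 'failed: command did not complete';
  if (!hasEnvironmentEvidence) return 'executed: completion only, safety not shown';
  return 'executed with environment evidence';
}
```

The detector only labels commands; it does not block them, which matches the "not a full DB policy engine" framing.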
package/references/honest-completion.md
CHANGED
@@ -52,10 +52,10 @@ Runtime work needs stronger evidence because long-running processes can create f
 - No release-blocking runtime proof unless the user chooses `$deep` or `$mission`.
 - No full `$mission` claim without real subagent/team evidence; downgrade to `$deep`, or mark mission partial/blocked.
 
-
+Design baseline:
 
--
--
--
+- Strict proof collects stronger physical proof and gates completion more aggressively.
+- Modular skill workflows keep evidence boundaries and report what is known vs inferred.
+- Minimal-core design keeps the rule short and obeyable.
 
 `yam` keeps the guard explicit, cheap, and route-aware.
package/references/hook-lite.md
CHANGED
@@ -14,7 +14,7 @@ Allowed:
 - Remind the agent not to overclaim verification, cleanup, or visual evidence.
 - Suggest `$quick`, `$ueye`, `$question`, `$scout`, `$deep`, or `$mission` based on obvious prompt signals.
 - Mention a project pack or memory summary when present.
-- Warn when
+- Warn when conflicting proof-harness surfaces are active in the current project.
 
 Not allowed:
 
@@ -44,12 +44,12 @@ Project hooks write to `<project>/.codex/hooks.json`.
 
 `yam` backs up an existing hook file before enabling the lite hook.
 
-##
+## Design Baseline
 
-
+Broad hook systems often use route prep, tool evidence, permission gates, subagent evidence, and stop gates.
 
-
+Selective skill systems favor lower-context workflows.
 
-
+Minimal-core systems avoid hooks unless the rule is short and changes behavior.
 
 `yam` keeps this hook advisory-only so beginner momentum is preserved while the agent still receives a direction nudge. Deeper proof belongs in `$deep` and real team execution belongs in `$mission`, not in an always-on prompt hook.
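An advisory-only route hint of the kind described in hook-lite.md might look like the sketch below. Everything here is an assumption made for illustration: `yam.js` does call a function named `yamLiteRouteHint(prompt)`, but its keyword rules are not shown in this diff, so the signal table, the name `suggestRoute`, and the hint wording are hypothetical.

```javascript
// Hypothetical signal table; first match wins. The real keywords are not visible in the diff.
const ROUTE_SIGNALS = [
  { route: '$ueye', regex: /\b(ui|screenshot|layout|design)\b/i },
  { route: '$scout', regex: /\b(research|compare|investigate)\b/i },
  { route: '$mission', regex: /\b(subagents?|team)\b/i },
  { route: '$deep', regex: /\b(migration|production|risky)\b/i },
  { route: '$quick', regex: /\b(typo|lint|small fix)\b/i },
  { route: '$question', regex: /^(what|why|how)\b/i }
];

// Returns a one-line advisory string, or null when no obvious signal is present.
// It never blocks or rewrites the prompt, which is the point of an advisory hook.
function suggestRoute(prompt) {
  const hit = ROUTE_SIGNALS.find((signal) => signal.regex.test(prompt));
  return hit ? `Hint: this looks like ${hit.route} work; the choice stays with the user.` : null;
}
```

Returning `null` for prompts without a clear signal keeps the hook silent by default, in line with the low-context goal.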
package/references/markdown-management.md
CHANGED
@@ -2,21 +2,21 @@
 
 `yam` uses markdown as a small direction layer, not as an automatic control system.
 
-##
+## Design Baseline
 
-
+Strict proof systems:
 
 - Creates and manages more markdown surfaces for agent control, route instructions, proof, and dashboards.
 - Good for strict verification and anti-fake-work pressure.
 - Risk: too much generated context and too much automatic intervention.
 
-
+Modular skill systems:
 
 - Splits markdown into modular instructions, rules, skills, and commands.
 - Good for selective installation and low-context operation.
 - Risk: too many optional files can still become noisy if installed wholesale.
 
-
+Minimal-core systems:
 
 - Keeps the core instruction document short and human-readable.
 - Good for speed, obedience, and easy maintenance.
package/references/memory.md
CHANGED
@@ -2,12 +2,12 @@
 
 `yam memory` is an opt-in, project-local memory layer.
 
-It
+It keeps only the lightest useful parts from heavier harness patterns:
 
--
--
--
--
+- Sparse records, one file per durable claim, and deliberate forgetting instead of injecting every old claim.
+- Wrongness memory for repeated mistakes, wrong decisions, stale assumptions, and overconfident claims.
+- Separate evidence, inference, and recommendation.
+- Keep the mechanism small enough to obey.
 
 Storage:
 
package/references/mission.md
CHANGED
@@ -77,10 +77,10 @@ Doctor scan:
 Use `references/doctor-scan.md` before final completion.
 Keep the scan short, but cover direction fit, scope control, verification, runtime/cleanup, truth status, and fix-first items.
 
-
+Design baseline:
 
--
--
--
+- Strict proof would likely make this a team route with stronger gates and required agent evidence.
+- Modular skill workflows split role responsibilities and keep evidence boundaries.
+- Minimal-core design avoids adding this unless it clearly replaces a confusing middle route.
 
 `yam` uses mission to replace the old standalone runtime route with a clearer heavy execution route.
package/references/quick.md
CHANGED
@@ -2,15 +2,15 @@
 
 `quick` is the merged small-work route: fast patching, ordinary scoped implementation, and fast error scanning.
 
-##
+## Selected Principles
 
-
+Strict proof:
 
 - Honest completion language.
 - Real versus assumed verification.
 - Stop instead of claiming success when evidence is missing.
 
-
+Focused execution:
 
 - Detect the smallest useful command.
 - Group build/type/lint/test errors by file and root cause.
@@ -18,7 +18,7 @@ From ECC:
 - Re-run the same focused command after a fix.
 - Use a compact PASS/FAIL matrix.
 
-
+Minimal core:
 
 - Keep the instruction short enough to obey.
 - Read the smallest useful context.
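The "group errors by file and root cause" and "compact PASS/FAIL matrix" items in quick.md describe a reporting shape that is easy to sketch. The code below is an illustration only: the error record shape `{ file, rule, message }` and the function names `groupErrors` and `passFailMatrix` are hypothetical and do not come from the package.

```javascript
// Hypothetical error record shape: { file, rule, message }.
// Errors sharing a file and a rule are treated as one root-cause group.
function groupErrors(errors) {
  const groups = new Map();
  for (const err of errors) {
    const key = `${err.file}::${err.rule}`;
    if (!groups.has(key)) groups.set(key, { file: err.file, rule: err.rule, count: 0 });
    groups.get(key).count += 1;
  }
  return [...groups.values()];
}

// results maps a check name to a boolean; output is one "name: PASS|FAIL" line per check.
function passFailMatrix(results) {
  return Object.entries(results)
    .map(([name, ok]) => `${name}: ${ok ? 'PASS' : 'FAIL'}`)
    .join('\n');
}
```

Grouping first means a fix can target one error class at a time, and re-running the same focused command then updates a single row of the matrix.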
package/references/token-budget-reporter.md
CHANGED
@@ -33,12 +33,12 @@ yam measure ueye --files 7 --commands 2 --report-lines 18 --seconds 260
 - `$deep`: can exceed ordinary budgets, but the reason must be risk-tied; single-agent runtime/tmux/browser checks belong here when verification needs them.
 - `$mission`: can spend more context on real subagent/team lanes, cross-verification, doctor scan, and runtime evidence, but only for approved plans where real subagents are used or explicitly unavailable/partial.
 
-##
+## Design Baseline
 
-
+Strict proof would favor stronger automatic evidence collection.
 
-
+Modular skill workflows favor selective, low-context reporting.
 
-
+Minimal-core design removes the measurement unless it changes behavior.
 
 `yam` keeps manual measurement because it helps reduce over-reading without installing hooks.
package/references/tool-trust-layer.md
CHANGED
@@ -49,7 +49,7 @@ Default:
 Advisory:
 
 - `yam-lite` hook may suggest routes and warn about overclaiming.
-- `yam pack` may warn about stale project direction, command drift, active hooks, or
+- `yam pack` may warn about stale project direction, command drift, active hooks, or legacy proof surfaces.
 
 On demand:
 
@@ -60,7 +60,7 @@ On demand:
 - `yam tools doctor`: inspect tool readiness without changing project state.
 - `yam proof`: summarize actual evidence without running verification.
 
-##
+## Strict Proof Inputs
 
 - Tool readiness checks.
 - Hook status and trust reporting.
@@ -71,14 +71,14 @@ On demand:
 - Destructive DB/Supabase command detection and production-write caution.
 - Feature/release inventory as an optional doctor, not a default gate.
 
-##
+## Modular Skill Inputs
 
 - Selective install and profiles.
 - Evidence boundaries.
 - Low-context command detection.
 - Optional orchestration instead of always-on orchestration.
 
-##
+## Design Quality Inputs
 
 - Real preview/screenshot evidence.
 - Compact design direction.
package/references/ueye.md
CHANGED
@@ -2,9 +2,9 @@
 
 `ueye` is the merged UI/design route: design-heavy implementation, screenshot-led UX review, and visual QA.
 
-##
+## Selected Principles
 
-
+Visual proof:
 
 - Source-screen inventory before visual claims.
 - P0-P3 issue ledger.
@@ -12,21 +12,21 @@ From Sneakoscope image UX review:
 - Recheck changed or high-risk screens after fixes when feasible.
 - Cap text-only or missing-screenshot reviews as partial instead of fully verified.
 
-Kept out
+Kept out by design:
 
 - Mandatory generated annotated images.
 - Image voxel ledgers.
 - Release gates for every UI change.
 - Always-on proof loops.
 
-
+Design quality:
 
 - Real examples and previews matter more than abstract prose.
 - Design direction should be compact and searchable.
 - P0 gates should reject placeholder visuals, generic UI, and broken responsive states.
 - UI work should be self-contained enough to inspect.
 
-
+Evidence boundaries:
 
 - Separate evidence from judgment.
 - Keep review output compact.
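The P0-P3 issue ledger, the P0/P1-first fix loop, and the "partial" cap for text-only reviews listed in ueye.md can be pictured with a small data sketch. This is an assumption-heavy illustration: the issue shape `{ id, priority, fixed }`, the names `nextFixBatch` and `reviewStatus`, and the status strings are hypothetical and are not taken from the package.

```javascript
const PRIORITY_ORDER = ['P0', 'P1', 'P2', 'P3'];

// Pick the next issues to fix: open P0/P1 issues first, lower priorities only once none remain.
function nextFixBatch(ledger) {
  const open = ledger.filter((issue) => !issue.fixed);
  const urgent = open.filter((issue) => issue.priority === 'P0' || issue.priority === 'P1');
  const batch = urgent.length > 0 ? urgent : open;
  return [...batch].sort(
    (a, b) => PRIORITY_ORDER.indexOf(a.priority) - PRIORITY_ORDER.indexOf(b.priority)
  );
}

// Hypothetical status vocabulary: without screenshots the review is capped at 'partial',
// regardless of how clean the ledger looks.
function reviewStatus({ ledger, screenshotCount }) {
  if (screenshotCount === 0) return 'partial';
  const urgentOpen = ledger.some(
    (issue) => !issue.fixed && (issue.priority === 'P0' || issue.priority === 'P1')
  );
  return urgentOpen ? 'issues-open' : 'verified';
}
```

The cap is applied before any ledger inspection so that a text-only critique cannot be reported as fully verified.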
package/skills/ueye/SKILL.md
CHANGED
@@ -34,7 +34,7 @@ Do not use for:
 - Text-only visual critique cannot be reported as fully verified when screenshot evidence was required.
 - Generated annotated images are optional, not a default gate.
 - Image evidence should stay bounded: inspect the primary screen first, then only the states/images needed to support the claim.
--
+- Design quality judgment belongs after implementation/review: compare to the reference first, then judge whether the result is good design.
 
 ## Workflow
 
package/templates/tuning-log.md
CHANGED