yam-harness 0.1.0

Files changed (43)
  1. package/AGENTS.md +18 -0
  2. package/COMMANDS.md +144 -0
  3. package/DECISIONS.md +70 -0
  4. package/LICENSE +21 -0
  5. package/README.md +159 -0
  6. package/ROADMAP.md +308 -0
  7. package/bin/yam.js +1966 -0
  8. package/package.json +74 -0
  9. package/references/context-reuse.md +59 -0
  10. package/references/current-docs.md +45 -0
  11. package/references/db-supabase-safety-lite.md +40 -0
  12. package/references/doctor-scan.md +56 -0
  13. package/references/eye.md +30 -0
  14. package/references/final-report.md +61 -0
  15. package/references/honest-completion.md +61 -0
  16. package/references/hook-lite.md +55 -0
  17. package/references/markdown-management.md +56 -0
  18. package/references/memory.md +59 -0
  19. package/references/mission.md +86 -0
  20. package/references/question.md +25 -0
  21. package/references/quick.md +70 -0
  22. package/references/risk-escalation.md +27 -0
  23. package/references/runtime-orchestration.md +57 -0
  24. package/references/scout.md +38 -0
  25. package/references/token-budget-reporter.md +44 -0
  26. package/references/token-economy.md +61 -0
  27. package/references/tool-trust-layer.md +113 -0
  28. package/references/truth-matrix.md +44 -0
  29. package/references/ueye.md +83 -0
  30. package/references/ui-quality.md +23 -0
  31. package/references/verification-levels.md +53 -0
  32. package/skills/deep/SKILL.md +76 -0
  33. package/skills/mission/SKILL.md +105 -0
  34. package/skills/question/SKILL.md +45 -0
  35. package/skills/quick/SKILL.md +81 -0
  36. package/skills/scout/SKILL.md +71 -0
  37. package/skills/ueye/SKILL.md +90 -0
  38. package/templates/mission-plan.md +46 -0
  39. package/templates/runtime-proof.md +54 -0
  40. package/templates/tuning-log.md +39 -0
  41. package/templates/ueye-review.md +62 -0
  42. package/templates/yam.project.md +71 -0
  43. package/yam.manifest.json +48 -0
package/skills/deep/SKILL.md
@@ -0,0 +1,76 @@
1
+ ---
2
+ name: deep
3
+ description: Heavy verification route for risky or user-requested deep work. Use when the user invokes $deep or explicitly asks for strong verification.
4
+ ---
5
+
6
+ # yam Deep
7
+
8
+ Use for:
9
+
10
+ - Auth, payment, DB, security, deployment, or release work.
11
+ - Broad refactors.
12
+ - Regressions with unclear cause.
13
+ - User-requested heavy verification.
14
+ - Work where false completion would be costly.
15
+ - Long-running verification using dev servers, test watchers, browser QA, tmux, or process cleanup.
16
+ - Broad or risky work that does not require real subagent/team execution.
17
+ - `$mission` requests that cannot use real subagents; downgrade to `$deep` and report why.
18
+
19
+ ## Principles
20
+
21
+ - Direction before execution.
22
+ - Token economy still matters, even in deep mode.
23
+ - Reuse `yam.project.md` before broad context reading when present.
24
+ - Strong verification is allowed here, but still bounded.
25
+ - Prefer focused evidence over ceremony.
26
+ - Stay single-agent by default; if real team/subagent execution is required, use `$mission`.
27
+ - Do not claim verified without actual evidence.
28
+ - Use runtime/tmux/process orchestration only when the verification claim needs it.
29
+ - Do not start long-running processes just to make small work look more proven.
30
+ - Do not claim cleanup unless process exit, tmux pane/session closure, or intentional persistence is confirmed.
31
+
32
+ ## Workflow
33
+
34
+ 1. Define the risk surface.
35
+ 2. Identify acceptance criteria.
36
+ 3. Build a focused verification plan.
37
+ 4. Implement or inspect within scope.
38
+ 5. Start dev servers, test watchers, tmux panes, or browser QA only when they materially support verification.
39
+ 6. Run appropriate checks: tests, typecheck, build, browser QA, security, runtime, or data safety checks.
40
+ 7. Confirm cleanup when claiming cleanup.
41
+ 8. Classify truth status with `references/truth-matrix.md`.
42
+ 9. Report proof summary and remaining risk.
43
+
44
+ ## Verification
45
+
46
+ Use Level 4 from `references/verification-levels.md`.
47
+ Use `references/token-economy.md`; wider context is allowed only when tied to the risk surface.
48
+ Use `references/context-reuse.md`; update stale project packs narrowly when needed.
49
+ Use `references/markdown-management.md` before writing broad proof or direction markdown.
50
+ Use `references/runtime-orchestration.md` when long-running processes, tmux, browser QA, or cleanup proof are needed.
51
+ Use `references/db-supabase-safety-lite.md` for destructive DB/Supabase, migration, production-write, or RLS/policy work.
52
+ Use `references/current-docs.md` when SDK/API/cloud-service behavior may have changed recently or is version-sensitive.
53
+ Use `references/honest-completion.md`; do not overclaim verification, runtime, cleanup, or visual proof.
54
+ Use `references/final-report.md` to close with remaining tasks and fix-first items when useful.
55
+ Use `references/token-budget-reporter.md` when a run needs measured budget feedback.
56
+
57
+ Deep verification may include:
58
+
59
+ - Test suite or relevant suites.
60
+ - Build.
61
+ - Browser QA.
62
+ - Dev server, test watcher, tmux, or process lifecycle checks.
63
+ - Security or migration checks.
64
+ - Before/after screenshot.
65
+ - Risk-specific manual inspection.
66
+
67
+ ## Final Response
68
+
69
+ Include:
70
+
71
+ - What was changed or reviewed.
72
+ - Evidence gathered.
73
+ - Truth status.
74
+ - Blockers or remaining risk.
75
+ - Remaining tasks.
76
+ - Fix-first items before planned tasks.
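Reviewer note: the cleanup principle in this skill (do not claim cleanup unless process exit is confirmed) can be illustrated with a minimal Python sketch. The helper name `stop_and_confirm` and the sleeping child process are hypothetical stand-ins, not part of yam-harness; the sketch covers only plain process exit and does not model tmux pane/session closure.

```python
import subprocess
import sys


def stop_and_confirm(proc: subprocess.Popen, timeout: float = 5.0) -> bool:
    """Terminate proc and return True only if its exit was actually observed."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Escalate once; if the process still does not exit, make no cleanup claim.
        proc.kill()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            return False
    # poll() returns the exit code once the process has ended, else None.
    return proc.poll() is not None


# Hypothetical stand-in for a dev server or test watcher.
server = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
cleanup_confirmed = stop_and_confirm(server)
print("cleanup confirmed" if cleanup_confirmed else "cleanup NOT confirmed")
```

The point of the design is that the report is derived from an observed exit code rather than from the fact that a stop command was issued.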
package/skills/mission/SKILL.md
@@ -0,0 +1,105 @@
1
+ ---
2
+ name: mission
3
+ description: Explicit real-subagent/team execution route for approved implementation plans. Use when the user invokes $mission or asks for team implementation with real subagents, cross-verification, doctor scan, runtime/tmux/browser QA, and final proof summary.
4
+ ---
5
+
6
+ # yam Mission
7
+
8
+ Use for:
9
+
10
+ - Approved implementation plans or scenarios.
11
+ - Broad implementation where real subagent/team separation reduces risk.
12
+ - Team execution with real implementer, reviewer, verifier, and doctor lanes when tool support is available.
13
+ - Work that needs implementation plus cross-verification.
14
+ - Work that may need deep runtime/tmux/browser QA as part of final proof.
15
+
16
+ Do not use for:
17
+
18
+ - Tiny changes or ordinary scoped implementation. Use `$quick`.
19
+ - Design-heavy UI/UX implementation or screenshot-led review. Use `$ueye`.
20
+ - Pure investigation. Use `$scout`.
21
+ - Pure verification without real subagent/team execution. Use `$deep`.
22
+ - Heavy single-agent work, even with runtime/tmux/browser proof. Use `$deep`.
23
+ - Mission-shaped work when real subagents are unavailable or unsafe to use. Downgrade to `$deep` and report the downgrade.
24
+
25
+ ## Principles
26
+
27
+ - Direction before execution.
28
+ - Mission is explicit-only; never auto-escalate small tasks into mission.
29
+ - Start from the user's approved plan or ask for one if the plan is missing.
30
+ - Mission requires real subagent/team orchestration; role-only self-review belongs in `$deep`.
31
+ - Keep role separation real and useful, not theatrical.
32
+ - Token economy still matters.
33
+ - Use real subagents only when they are available and the work has separable, high-risk, or parallel lanes.
34
+ - If real subagents are not available, unsafe, or not worth using, downgrade to `$deep` by default.
35
+ - If the user explicitly insists on `$mission` despite unavailable subagents, mark the mission `partial` or `blocked` instead of pretending role-play is team execution.
36
+ - Use tmux/dev server/browser QA only when the mission needs runtime evidence.
37
+ - Cross-verify before claiming completion.
38
+ - Do not claim verified, cleaned up, or visually checked without evidence.
39
+ - Doctor scan is mandatory for mission finalization, but it should stay concise.
40
+
41
+ ## Workflow
42
+
43
+ 1. Restate the mission goal, scope, no-go rules, and acceptance criteria.
44
+ 2. Confirm real subagent/team availability and a meaningful split.
45
+ 3. If real subagents are unavailable, unsafe, or not useful, stop mission setup and downgrade to `$deep` unless the user explicitly asks to proceed with a partial/blocked mission.
46
+ 4. Split work into real lanes:
47
+ - Implementer: makes scoped changes.
48
+ - Reviewer: checks code, risk, and project direction.
49
+ - UX/browser verifier: checks screen behavior when relevant.
50
+ - Doctor/scanner: checks direction fit, scope control, verification, cleanup, stale context, and false-completion risk.
51
+ 5. Execute the implementation in bounded steps.
52
+ 6. Use `$deep`-style runtime verification when the mission needs dev server, tmux, test watcher, browser QA, cleanup, or before/after evidence.
53
+ 7. Cross-check findings and resolve contradictions.
54
+ 8. Run the smallest honest final verification set.
55
+ 9. Run doctor scan with `references/doctor-scan.md`.
56
+ 10. Confirm cleanup or explicitly report intentionally running processes.
57
+ 11. Produce final proof summary, truth status, remaining tasks, and fix-first items.
58
+
59
+ ## Proof Summary
60
+
61
+ Include:
62
+
63
+ - Mission goal.
64
+ - Role work completed.
65
+ - Subagent decision: used / downgraded_to_deep / unavailable_partial / blocked, with reason.
66
+ - Files or surfaces changed.
67
+ - Runtime/tmux/browser evidence when used.
68
+ - Cross-verification result.
69
+ - Doctor/scanner result.
70
+ - Cleanup status.
71
+ - Truth status: proven, verified, partial, fixture_only, fixture_instrumented_real, integration_optional, real_required_missing, skipped, blocked, or assumed.
72
+
73
+ Use `references/context-reuse.md`; read project pack before broad context.
74
+ Use `references/runtime-orchestration.md` only when runtime evidence is needed.
75
+ Use `references/doctor-scan.md` before final mission completion.
76
+ Use `references/db-supabase-safety-lite.md` for destructive DB/Supabase, migration, production-write, or RLS/policy lanes.
77
+ Use `references/current-docs.md` when any lane depends on current SDK/API/cloud-service behavior.
78
+ Use `references/honest-completion.md`; do not overclaim.
79
+ Use `references/final-report.md` to close with remaining tasks and fix-first items.
80
+ Use `references/token-budget-reporter.md` when budget drift matters.
81
+
82
+ ## Prompt Pattern
83
+
84
+ Good mission prompts include:
85
+
86
+ - Goal.
87
+ - Scope.
88
+ - No-go rules.
89
+ - Acceptance criteria.
90
+ - Runtime/browser needs.
91
+ - Required final checks.
92
+
93
+ If the user invokes `$mission` without an approved plan, ask for the plan or propose a compact plan first.
94
+
95
+ ## Final Response
96
+
97
+ Report:
98
+
99
+ - What the mission changed.
100
+ - Role/cross-verification summary.
101
+ - Evidence gathered.
102
+ - Truth status.
103
+ - Cleanup status.
104
+ - Remaining tasks.
105
+ - Fix-first items before planned tasks.
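Reviewer note: the subagent decision described in this skill (used / downgraded_to_deep / unavailable_partial / blocked) might be sketched as a small Python function. The function name and parameters are hypothetical. The skill text says only that an insisted-on mission without real subagents is marked `partial` or `blocked`; mapping the unsafe case to `blocked` and every other case to `unavailable_partial` is an assumption made for illustration.

```python
def subagent_decision(
    available: bool,
    safe: bool,
    useful_split: bool,
    user_insists_on_mission: bool = False,
) -> str:
    """Return one of the four decision labels from the Proof Summary."""
    if available and safe and useful_split:
        return "used"
    if not user_insists_on_mission:
        # Default path: real subagents are unavailable, unsafe, or not useful.
        return "downgraded_to_deep"
    # Assumption (not stated in the skill): unsafe maps to blocked,
    # anything else maps to unavailable_partial.
    return "blocked" if not safe else "unavailable_partial"


print(subagent_decision(available=False, safe=True, useful_split=True))
```

Role-play never yields `used` here: that label requires availability, safety, and a meaningful split at the same time.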
package/skills/question/SKILL.md
@@ -0,0 +1,45 @@
1
+ ---
2
+ name: question
3
+ description: Very-light Q&A route. Use when the user invokes $question or asks a direct conceptual or practical question that can be answered without code changes or broad research.
4
+ ---
5
+
6
+ # yam Question
7
+
8
+ Use for:
9
+ - Simple conceptual explanations.
10
+ - "What is X?" or "Can I do Y?" questions.
11
+ - Local harness/project usage questions.
12
+ - Clarifying a decision that does not need fresh source gathering.
13
+
14
+ Do not use for:
15
+ - Code changes. Use `$quick` or `$ueye`.
16
+ - Multi-option research or current market/tool comparisons. Use `$scout`.
17
+ - Risky verification or broad debugging. Use `$deep`.
18
+
19
+ Principles:
20
+ - Direction before execution.
21
+ - Answer the question first.
22
+ - Keep context tiny.
23
+ - Do not browse or inspect files unless freshness, local truth, or accuracy requires it.
24
+ - Separate fact, inference, and recommendation when the distinction matters.
25
+ - State uncertainty plainly.
26
+ - Token economy is part of quality.
27
+ - End with remaining tasks and fix-first items only when they are useful.
28
+
29
+ Workflow:
30
+ 1. Identify the exact question.
31
+ 2. Answer from stable knowledge or already-loaded context.
32
+ 3. Check local files only when the answer depends on installed/configured state.
33
+ 4. Browse only when the fact is current, niche, or likely to have changed.
34
+ 5. If the answer becomes comparison/research, switch to `$scout`.
35
+
36
+ Output:
37
+ - Direct answer.
38
+ - Short explanation.
39
+ - Practical next step, if any.
40
+ - Uncertainty or verification note, if relevant.
41
+
42
+ Token budget:
43
+ Use `references/token-budget-reporter.md`.
44
+ Use `references/current-docs.md` only when the question depends on current SDK/API/cloud-service behavior; otherwise answer directly.
45
+ Default: 0-2 files, 0 commands, short final answer.
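Reviewer note: the default budget stated for this route (0-2 files, 0 commands) could be checked with a sketch like the following. `RunUsage` and `within_question_default` are hypothetical names, and the "short final answer" part of the default is not modeled because the skill gives no numeric limit for it.

```python
from dataclasses import dataclass

# Limits taken from the "Default: 0-2 files, 0 commands" line.
QUESTION_MAX_FILES = 2
QUESTION_MAX_COMMANDS = 0


@dataclass
class RunUsage:
    """Hypothetical record of what a single run consumed."""
    files_read: int
    commands_run: int


def within_question_default(usage: RunUsage) -> bool:
    """True when the run stayed inside the question-route default budget."""
    return (
        usage.files_read <= QUESTION_MAX_FILES
        and usage.commands_run <= QUESTION_MAX_COMMANDS
    )


print(within_question_default(RunUsage(files_read=1, commands_run=0)))
```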
package/skills/quick/SKILL.md
@@ -0,0 +1,81 @@
1
+ ---
2
+ name: quick
3
+ description: Fast implementation route for small changes, focused bug fixes, and quick error scans. Use when the user invokes $quick or asks for a quick scoped fix.
4
+ ---
5
+
6
+ # yam Quick
7
+
8
+ Use for:
9
+
10
+ - Copy, label, spacing, color, small CSS, and small docs edits.
11
+ - Narrow bug fixes and ordinary scoped implementation.
12
+ - Fast error scans for build/type/lint/test failures.
13
+ - Small UI tweaks that do not need design exploration or visual review loops.
14
+
15
+ Do not use for:
16
+
17
+ - Design-heavy UI work or reference-image interpretation. Use `$ueye`.
18
+ - Risky, broad, or runtime-heavy work. Use `$deep`.
19
+ - DB/Supabase destructive commands, production writes, migrations, RLS, or schema changes. Recommend `$deep`.
20
+ - Real subagent/team implementation. Use `$mission`.
21
+ - Pure Q&A. Use `$question`.
22
+
23
+ ## Principles
24
+
25
+ - Direction before execution.
26
+ - Context-reuse first.
27
+ - Token economy is part of quality.
28
+ - Start with the smallest likely edit surface.
29
+ - Follow existing project architecture, naming, UX flow, and test style.
30
+ - Verify at the lightest level that honestly supports completion.
31
+ - Do not use teams, orchestration, structured proof, or tmux.
32
+ - Do not run broad test suites for tiny changes.
33
+
34
+ ## Lanes
35
+
36
+ Patch lane:
37
+
38
+ 1. Read `yam.project.md` or `.yam/memory/summary.md` only when present and useful.
39
+ 2. Inspect the smallest relevant file or nearby pattern.
40
+ 3. Make the minimal change.
41
+ 4. Re-read the changed snippet.
42
+ 5. Run at most one or two focused checks when useful.
43
+
44
+ Build-fix lane:
45
+
46
+ 1. Detect the smallest useful command from package scripts or project pack.
47
+ 2. Group errors by file and root cause.
48
+ 3. Fix one error class at a time.
49
+ 4. Read only the local error context, usually the file and nearby imports.
50
+ 5. Re-run the same focused command.
51
+ 6. Stop if the same error survives three attempts, errors expand, dependency installation is needed, or the fix implies architecture change.
52
+
53
+ Scan lane:
54
+
55
+ 1. Inspect the current error output or run the smallest detector.
56
+ 2. Report grouped issues and the safest first fix.
57
+ 3. Edit only when the user asked for implementation.
58
+
59
+ ## Verification
60
+
61
+ - Copy/CSS/docs: Level 0 is often enough after re-reading the changed snippet.
62
+ - TS/JS or app logic: prefer typecheck, related test, lint, or build in that order when available.
63
+ - Build-fix: use a compact PASS/FAIL matrix for command results.
64
+ - If verification is skipped, partial, blocked, or assumed, say that plainly.
65
+
66
+ Use `references/quick.md` for the merged fast/build rules.
67
+ Use `references/truth-matrix.md` for truth labels.
68
+ Use `references/db-supabase-safety-lite.md` when a command or prompt contains DB/Supabase mutation signals.
69
+ Use `references/token-economy.md` to keep context small.
70
+ Use `references/context-reuse.md` before broad project reading.
71
+ Use `references/final-report.md` to close with remaining tasks and fix-first items when useful.
72
+ Use `references/token-budget-reporter.md` when a run needs measured budget feedback.
73
+
74
+ ## Final Response
75
+
76
+ Keep it compact:
77
+
78
+ - What changed or what the scan found.
79
+ - What was checked.
80
+ - What was skipped, blocked, or still risky.
81
+ - Remaining tasks and fix-first items, only when useful.
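Reviewer note: the build-fix lane's stop rule (same error survives three attempts, errors expand, dependency installation is needed, or the fix implies architecture change) can be expressed as a small check. This is a sketch under assumptions: `BuildFixState` and `stop_reason` are hypothetical names, and "errors expand" is interpreted as the error count rising between two runs of the same command.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildFixState:
    """Hypothetical snapshot of one build-fix loop iteration."""
    attempts_on_same_error: int
    previous_error_count: int
    current_error_count: int
    needs_dependency_install: bool = False
    implies_architecture_change: bool = False


def stop_reason(state: BuildFixState) -> Optional[str]:
    """Return why the lane should stop, or None if it may continue."""
    if state.attempts_on_same_error >= 3:
        return "same error survived three attempts"
    if state.current_error_count > state.previous_error_count:
        return "errors expanded"
    if state.needs_dependency_install:
        return "dependency installation needed"
    if state.implies_architecture_change:
        return "fix implies architecture change"
    return None


print(stop_reason(BuildFixState(3, previous_error_count=4, current_error_count=4)))
```

A non-`None` result is a cue to hand off to a heavier route or report back instead of continuing to patch.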
package/skills/scout/SKILL.md
@@ -0,0 +1,71 @@
1
+ ---
2
+ name: scout
3
+ description: Lightweight investigation and judgment route. Use when the user invokes $scout or asks to find options, references, docs, tools, product direction, or third-party perspective before implementation.
4
+ ---
5
+
6
+ # yam Scout
7
+
8
+ Use for:
9
+
10
+ - Tool/library comparison.
11
+ - Current documentation lookup.
12
+ - Design or product references.
13
+ - Technical direction checks.
14
+ - Risk discovery before implementation.
15
+ - Third-party judgment before committing to a direction.
16
+ - Objective and subjective evaluation together.
17
+ - Macro, realistic, and future-facing judgment.
18
+
19
+ ## Principles
20
+
21
+ - Scout, do not sprawl.
22
+ - Token economy is part of quality.
23
+ - Reuse `yam.project.md` before broad context reading when present.
24
+ - Prefer official and primary sources.
25
+ - Use current docs proof only when SDK/API/cloud-service freshness matters.
26
+ - Keep the question narrow.
27
+ - Do not change code unless the user asks.
28
+ - Return a practical recommendation.
29
+ - Separate fact, inference, opinion, and recommendation.
30
+ - Treat "objective" as evidence plus constraints, not certainty theater.
31
+ - Treat "subjective" as named taste, product instinct, and likely user perception.
32
+ - Keep source count bounded by default; use `$deep` only when the user asks for heavy verification.
33
+
34
+ ## Workflow
35
+
36
+ 1. Clarify the decision being scouted.
37
+ 2. Choose the lane:
38
+ - quick lookup
39
+ - option comparison
40
+ - design/reference scan
41
+ - product/technical direction memo
42
+ - risk discovery
43
+ 3. Read `yam.project.md` first when project direction matters.
44
+ 4. Gather 3-7 high-signal sources by default.
45
+ 5. Compare options by fit, risk, cost, implementation effort, and durability.
46
+ 6. Give both objective and subjective judgment when useful.
47
+ 7. Recommend a direction.
48
+ 8. State uncertainty and what would change the recommendation.
49
+
50
+ ## Output
51
+
52
+ Use concise sections:
53
+
54
+ - Best pick.
55
+ - Objective judgment.
56
+ - Subjective judgment.
57
+ - Macro / realistic / future view when the decision benefits from it.
58
+ - Alternatives.
59
+ - Risks.
60
+ - Sources or local evidence.
61
+
62
+ Use `references/token-economy.md`; default to 3-7 high-signal sources.
63
+ Use `references/current-docs.md` for current SDK/API/cloud-service questions.
64
+ Use `references/context-reuse.md`; do not rescout known decisions unless they may be stale.
65
+ Use `references/markdown-management.md` before creating or updating project packs.
66
+ Use `references/final-report.md` to close with remaining tasks and fix-first items when useful.
67
+ Use `references/token-budget-reporter.md` when a run needs measured budget feedback.
68
+
69
+ ## Final Response
70
+
71
+ Mention the recommended direction, key tradeoffs, remaining tasks, and fix-first items before planned tasks when useful.
package/skills/ueye/SKILL.md
@@ -0,0 +1,90 @@
1
+ ---
2
+ name: ueye
3
+ description: UI/UX/design implementation and visual review route with screenshot evidence. Use when the user invokes $ueye or asks for design-heavy UI work, reference-image-based implementation, or visual UX review.
4
+ ---
5
+
6
+ # yam Ueye
7
+
8
+ Use for:
9
+
10
+ - Design-heavy UI/UX implementation.
11
+ - Reference-image-based UI direction.
12
+ - Screenshot, URL, or current-screen UX review.
13
+ - Pre-fix and post-fix visual QA.
14
+ - UI states, responsive behavior, hierarchy, CTA, contrast, alignment, spacing, and affordance.
15
+
16
+ Do not use for:
17
+
18
+ - Tiny UI tweaks. Use `$quick`.
19
+ - Pure review of non-visual code risk. Use `$deep` or `$mission` when risk is high.
20
+ - Broad implementation plans with role split, runtime proof, or doctor scan. Use `$mission`.
21
+
22
+ ## Principles
23
+
24
+ - Direction before execution.
25
+ - Visual evidence before visual claims.
26
+ - Context-reuse first.
27
+ - Token economy is part of quality.
28
+ - Product fit beats decoration.
29
+ - Use existing tokens, components, typography, icons, and layout patterns first.
30
+ - Build the actual usable screen, not a marketing detour.
31
+ - Text-only visual critique cannot be reported as fully verified when screenshot evidence was required.
32
+ - Generated annotated images are optional, not a default gate.
33
+ - Image evidence should stay bounded: inspect the primary screen first, then only the states/images needed to support the claim.
34
+
35
+ ## Workflow
36
+
37
+ 1. Read `yam.project.md` and `.yam/memory/summary.md` only when present and useful.
38
+ 2. Identify the target screen, product direction, audience, and reference image or URL.
39
+ 3. Build or capture a source-screen inventory:
40
+ - user-provided screenshot
41
+ - local/browser screenshot
42
+ - exported static artifact image
43
+ - URL/current screen, when accessible
44
+ 4. Inspect nearby UI patterns, tokens, styles, and state handling.
45
+ 5. Implement the smallest coherent design improvement when implementation is requested.
46
+ 6. Check default, loading, error, empty, disabled, hover/focus, and mobile states when relevant.
47
+ 7. Run browser/screenshot verification when feasible.
48
+ 8. Produce a P0-P3 visual issue ledger and fix path.
49
+ 9. Recheck changed/high-risk screens after fixes when feasible.
50
+
51
+ ## Visual Truth Caps
52
+
53
+ - Full visual verification requires real source-screen evidence.
54
+ - Reference images guide direction; they do not prove the implemented screen unless compared with real source-screen evidence.
55
+ - Generated or annotated images are derivative evidence; they cannot upgrade missing real screen evidence to `verified`.
56
+ - Inspect 1-3 primary images by default; expand only for P0/P1 risk, responsive breakage, or user-requested deep visual QA.
57
+ - Keep source-screen inventory to the 5 most important rows by default.
58
+ - Screenshot unavailable: cap the result at `partial` or `blocked`.
59
+ - Text-only review: cap the result at `partial`.
60
+ - Mock or invented screenshots: cap the result at `partial` and mark the source.
61
+ - Browser unavailable but source files reviewed: report implementation verification separately from visual verification.
62
+ - Generated callout images may improve review quality, but missing generated images must not block ordinary Ueye work.
63
+
64
+ ## Design Checks
65
+
66
+ Use `references/ueye.md` and `references/ui-quality.md`.
67
+ Always consider:
68
+
69
+ - Direction fit.
70
+ - Hierarchy.
71
+ - CTA clarity.
72
+ - Spacing and alignment.
73
+ - Contrast.
74
+ - Density.
75
+ - Text fit.
76
+ - Responsive behavior.
77
+ - State coverage.
78
+ - Interaction affordance.
79
+ - Consistency with the project's visual language.
80
+
81
+ ## Final Response
82
+
83
+ Report:
84
+
85
+ - What changed visually or what was reviewed.
86
+ - Source evidence used.
87
+ - P0-P3 issues or confirmation that no blockers were found.
88
+ - States/viewports checked.
89
+ - Truth status and visual verification cap.
90
+ - Remaining tasks and fix-first items before planned tasks when useful.
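Reviewer note: the Visual Truth Caps section reads like a lookup from evidence type to the highest label a review may claim. A minimal sketch, assuming hypothetical evidence keys (`real_source_screen`, `screenshot_unavailable`, `text_only`, `mock_or_invented`) and a hypothetical helper `apply_visual_cap`; it only downgrades an overclaimed `verified` and leaves every other label untouched.

```python
# Labels each evidence type may reach, per the Visual Truth Caps list.
# The dictionary keys are illustrative names, not identifiers from yam-harness.
VISUAL_CAPS = {
    "real_source_screen": ("verified",),
    "screenshot_unavailable": ("partial", "blocked"),
    "text_only": ("partial",),
    "mock_or_invented": ("partial",),
}


def apply_visual_cap(claimed: str, evidence: str) -> str:
    """Downgrade a 'verified' claim when the evidence type cannot support it."""
    allowed = VISUAL_CAPS[evidence]
    if claimed == "verified" and "verified" not in allowed:
        # Fall back to the first permitted cap for this evidence type.
        return allowed[0]
    return claimed


print(apply_visual_cap("verified", "text_only"))
```

Generated or annotated images are deliberately absent from the table: per the skill they are derivative evidence and cannot raise the cap.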
package/templates/mission-plan.md
@@ -0,0 +1,46 @@
1
+ # yam Mission Prompt
2
+
3
+ ```text
4
+ $mission
5
+ The implementation plan below is confirmed.
6
+
7
+ Goal:
8
+ -
9
+
10
+ Scope:
11
+ -
12
+
13
+ No-go rules:
14
+ -
15
+
16
+ Acceptance criteria:
17
+ -
18
+
19
+ Roles:
20
+ - Implementer: implement within scope
21
+ - Reviewer: review code, structure, risk, and direction
22
+ - UX/browser verifier: check screens, states, and browser flows
23
+ - Doctor/scanner: check stale context, over-/under-verification, false-completion risk, cleanup, and remaining fix-first items
24
+
25
+ Subagent assessment:
26
+ - Real subagent/team availability:
27
+ - meaningful split:
28
+ - decision: used / downgraded_to_deep / unavailable_partial / blocked
29
+ - Reason:
30
+ - If subagents are unavailable or unnecessary, downgrade to $deep by default:
31
+
32
+ Verification:
33
+ - Run the smallest necessary verification command first
34
+ - Use tmux/dev server/browser QA/process cleanup proof if needed
35
+ - Clearly report anything not verified as skipped/blocked/assumed
36
+
37
+ Final report:
38
+ - Implementation summary
39
+ - Cross-verification results by role
40
+ - subagent decision
41
+ - 실제 evidence
42
+ - truth status
43
+ - cleanup status
44
+ - fix-first items
45
+ - remaining tasks
46
+ ```
package/templates/runtime-proof.md
@@ -0,0 +1,54 @@
1
+ # yam Deep/Mission Runtime Proof Summary
2
+
3
+ ## Goal
4
+
5
+ -
6
+
7
+ ## Plan
8
+
9
+ -
10
+
11
+ ## Processes
12
+
13
+ - Command:
14
+ - PID/session/pane:
15
+ - Port:
16
+ - Started:
17
+ - Stopped:
18
+ - Exit/closure verified:
19
+
20
+ ## tmux
21
+
22
+ - Used: yes / no
23
+ - Session:
24
+ - Pane:
25
+ - Before-drain observation:
26
+ - After-drain observation:
27
+ - Final observation:
28
+
29
+ ## Evidence
30
+
31
+ - Before:
32
+ - During:
33
+ - After:
34
+
35
+ ## Verification
36
+
37
+ - Command/check:
38
+ - Result:
39
+ - Truth status:
40
+
41
+ ## Cleanup
42
+
43
+ - Cleanup performed:
44
+ - Process exit verified:
45
+ - tmux pane/session closed or intentionally left running:
46
+ - Remaining process intentionally left running:
47
+
48
+ ## Blockers
49
+
50
+ - None.
51
+
52
+ ## Final Truth Status
53
+
54
+ - proven / verified / partial / fixture_only / fixture_instrumented_real / integration_optional / real_required_missing / skipped / blocked / assumed:
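Reviewer note: the Final Truth Status line lists a closed set of ten labels, so a filled-in value can be checked mechanically. A minimal sketch, assuming a hypothetical helper `validate_truth_status`; the labels are copied from the template, while the trimming and lowercasing are illustrative choices.

```python
# The ten labels listed in the template's Final Truth Status line.
TRUTH_STATUSES = (
    "proven",
    "verified",
    "partial",
    "fixture_only",
    "fixture_instrumented_real",
    "integration_optional",
    "real_required_missing",
    "skipped",
    "blocked",
    "assumed",
)


def validate_truth_status(value: str) -> str:
    """Normalize a status string and reject anything outside the listed set."""
    status = value.strip().lower()
    if status not in TRUTH_STATUSES:
        raise ValueError(f"unknown truth status: {value!r}")
    return status


print(validate_truth_status(" Partial "))
```

Rejecting free-form values such as "done" keeps a report from carrying a label that the truth matrix does not define.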
package/templates/tuning-log.md
@@ -0,0 +1,39 @@
1
+ # yam Tuning Log
2
+
3
+ Use this to tune route wording from real use.
4
+
5
+ ## Entry
6
+
7
+ - Date:
8
+ - Project:
9
+ - Route:
10
+ - Task:
11
+
12
+ ## What Happened
13
+
14
+ - Over-read context:
15
+ - Under-verified:
16
+ - Report too long:
17
+ - Direction mismatch:
18
+ - Useful behavior:
19
+
20
+ ## Budget Measurement
21
+
22
+ - Files read:
23
+ - Commands run:
24
+ - Report lines:
25
+ - Seconds:
26
+ - Budget result:
27
+
28
+ ## Fix To Harness
29
+
30
+ - Skill to edit:
31
+ - Proposed wording:
32
+ - Keep/remove:
33
+
34
+ ## Compared Against
35
+
36
+ - Sneakoscope:
37
+ - ECC:
38
+ - Karpathy:
39
+ - yam decision:
package/templates/ueye-review.md
@@ -0,0 +1,62 @@
1
+ # yam Ueye Review
2
+
3
+ ## Input
4
+
5
+ - Screenshot/URL:
6
+ - Product or screen:
7
+ - Reference direction:
8
+ - Source evidence:
9
+
10
+ ## Direction Fit
11
+
12
+ - Fits project direction:
13
+ - Mismatch:
14
+
15
+ ## Source-Screen Inventory
16
+
17
+ - Evidence bound: 1-3 primary images, max 5 inventory rows by default.
18
+ - Source type:
19
+ - State:
20
+ - Viewport:
21
+ - Visual verification cap:
22
+ - Reference/generated image used only as direction or annotation:
23
+
24
+ ## P0-P3 Issues
25
+
26
+ ### P0 Blockers
27
+
28
+ - None.
29
+
30
+ ### P1 Major
31
+
32
+ - None.
33
+
34
+ ### P2 Quality
35
+
36
+ - None.
37
+
38
+ ### P3 Polish
39
+
40
+ - None.
41
+
42
+ ## Checks
43
+
44
+ - Hierarchy:
45
+ - CTA:
46
+ - Spacing:
47
+ - Alignment:
48
+ - Contrast:
49
+ - Density:
50
+ - Text fit:
51
+ - Mobile:
52
+ - Empty/loading/error/disabled/hover/focus states:
53
+
54
+ ## Safe Fix Path
55
+
56
+ 1.
57
+ 2.
58
+ 3.
59
+
60
+ ## Truth Status
61
+
62
+ - verified / partial / skipped / blocked / assumed: