npm - yam-harness - Versions diffs - 0.1.2 → 0.1.3 - Mend

yam-harness 0.1.2 → 0.1.3


Files changed (18)

package/DECISIONS.md +10 -10
package/ROADMAP.md +17 -15
package/bin/yam.js +1 -6
package/package.json +1 -1
package/references/current-docs.md +4 -4
package/references/db-supabase-safety-lite.md +4 -4
package/references/honest-completion.md +4 -4
package/references/hook-lite.md +5 -5
package/references/markdown-management.md +4 -4
package/references/memory.md +5 -5
package/references/mission.md +4 -4
package/references/quick.md +4 -4
package/references/token-budget-reporter.md +4 -4
package/references/tool-trust-layer.md +4 -4
package/references/ueye.md +5 -5
package/skills/ueye/SKILL.md +1 -1
package/templates/tuning-log.md +3 -3
package/yam.manifest.json +1 -1

package/DECISIONS.md CHANGED

@@ -1,17 +1,17 @@
 # yam Decision Baseline
-Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style minimal harness principles.
+Every `yam` change is evaluated against strict proof, modular skill, and minimal-core harness principles.
 ## Fixed Questions
-1. What would Sneakoscope verify?
-2. What would ECC make selective or low-context?
-3. What would Karpathy remove to keep the core obeyable?
+1. What needs concrete evidence before completion?
+2. What should stay selective or low-context?
+3. What can be removed to keep the core obeyable?
 4. What should `yam` keep light by default, and what should deepen deliberately?
 ## Borrow
-### Sneakoscope
+### Strict Proof
 - Truthful completion language.
 - Risk escalation.
@@ -19,7 +19,7 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 - Fake versus real distinction.
 - Runtime/process proof only when explicitly requested.
-### ECC
+### Modular Skills
 - Skills-first structure.
 - Selective install.
@@ -27,7 +27,7 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 - Token optimization.
 - Project-specific rules instead of global bloat.
-### Karpathy-Style Minimal Harness
+### Minimal Core
 - Short core.
 - Few route names.
@@ -36,21 +36,21 @@ Every `yam` change is evaluated against Sneakoscope, ECC, and Karpathy-style min
 ## Reject
-### From Sneakoscope
+### From Strict Proof
 - Mandatory hooks.
 - Mandatory Team or subagent proof.
 - Always-on tmux/proof lifecycle.
 - Heavy memory systems for ordinary edits.
-### From ECC
+### From Modular Skills
 - Full install by default.
 - Giant catalog context.
 - Hook runtime by default.
 - Too many always-on rules.
-### From Minimal Harness
+### From Minimal Core
 - Under-verification.
 - Vague quality rules.

package/ROADMAP.md CHANGED

@@ -97,12 +97,14 @@ Tasks:
 ### 8. Scout / Research Workflow
-Goal: give yam a research lane comparable to Sneakoscope research, but lighter and more decision-oriented.
+Goal: give yam a research lane that is evidence-bound, lightweight, and decision-oriented.
-ECC reference points:
+Research reference points:
-- ECC has `deep-research`, `market-research`, `research-ops`, and `contexts/research.md`.
-- Useful parts to borrow: evidence boundaries, source freshness, fact/inference/recommendation separation, and decision-oriented summaries.
+- Evidence boundaries.
+- Source freshness.
+- Fact/inference/recommendation separation.
+- Decision-oriented summaries.
 Tasks:
@@ -143,13 +145,13 @@ Tasks:
 Goal: preserve durable lessons without turning yam into a heavy automatic memory system.
-Borrowed from Sneakoscope:
+Kept:
 - Sparse one-record-per-file storage.
 - Wrongness-style records for repeated mistakes and wrong decisions.
 - Deliberate forgetting via resolve instead of permanent prompt injection.
-Borrowed from ECC:
+Kept:
 - Evidence before recommendation.
 - Clear separation between observation and next action.
@@ -173,13 +175,13 @@ Tasks:
 Goal: prevent false runtime completion while keeping ordinary work fast.
-Borrowed from Sneakoscope:
+Kept:
 - Runtime truth vocabulary.
 - Cleanup must be backed by exit/closure evidence.
 - tmux physical proof idea, reduced to route-level evidence notes.
-Borrowed from ECC:
+Kept:
 - Evidence boundaries before recommendation.
 - Explicit partial/blocked/assumed language.
@@ -202,13 +204,13 @@ Tasks:
 Goal: provide one explicit heavy execution route without increasing total skill count.
-Borrowed from Sneakoscope:
+Kept:
 - Real Team/subagent route boundary.
 - Cross-verification before completion.
 - Runtime/tmux/browser proof when mission evidence needs it.
-Borrowed from ECC:
+Kept:
 - Role-specific work boundaries.
 - Evidence-first reporting.
@@ -234,21 +236,21 @@ Tasks:
 Goal: remove overlapping skill roles while preserving the best parts of the old routes.
-Borrowed from Sneakoscope actual image UX code:
+Kept:
 - Source screenshot inventory before visual claims.
 - P0-P3 issue ledger.
 - P0/P1-first fix loop.
 - Partial truth cap for text-only or missing-screenshot review.
-Borrowed from ECC command docs:
+Kept:
 - Smallest useful verification command.
 - Group errors by file and root cause.
 - Fix one error class at a time.
 - Compact PASS/FAIL reporting.
-Borrowed from Open Design local code and contribution rules:
+Kept:
 - Real preview/screenshot evidence.
 - Compact design direction.
@@ -275,14 +277,14 @@ Tasks:
 Goal: keep beginner momentum while creating a path toward professional proof-first work.
 The hook stays light, but the harness direction does not. `yam` should support a depth ladder: direction fit first, focused proof for ordinary work, strong proof for risky work, and real team proof for `$mission`.
-Borrowed from Sneakoscope:
+Kept:
 - Hook status and trust reporting.
 - Tool readiness as evidence.
 - DB/Supabase safety thinking.
 - Runtime/tmux/process cleanup truth.
-Borrowed from ECC:
+Kept:
 - Selective install and low-context operation.
 - Evidence boundaries instead of always-on gates.

package/bin/yam.js CHANGED

@@ -859,9 +859,6 @@ async function buildYamLiteContext({ cwd, prompt }) {
   if (docsHint) lines.push(docsHint);
   const routeHint = yamLiteRouteHint(prompt);
   if (routeHint) lines.push(routeHint);
-  if (await exists(path.join(path.resolve(cwd), '.sneakoscope'))) {
-    lines.push('Caution: active .sneakoscope detected; avoid mixing proof gates unless the user explicitly wants it.');
-  }
   return lines.join('\n');
 }
@@ -1309,7 +1306,7 @@ async function inspectProjectPack(targetDir = process.cwd()) {
   const instructionSurfaces = await findInstructionSurfaces(resolved);
   if (missingSections.length) issues.push(`missing section(s): ${missingSections.join(', ')}`);
-  if (words > 1200) warnings.push(`pack is long (${words} words); keep the Karpathy-style core compact`);
+  if (words > 1200) warnings.push(`pack is long (${words} words); keep the core compact`);
   if (words < 80) warnings.push(`pack is very short (${words} words); direction may be too thin to reuse`);
   if (packAgeDays > PACK_STALE_DAYS) warnings.push(`pack is ${packAgeDays} days old; review whether direction or commands changed`);
   if (placeholderLines > 12) warnings.push(`${placeholderLines} placeholder lines are still blank`);
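
The warning thresholds in this hunk can be read as a small pure function. The following is a minimal sketch, not the package's actual code: `collectPackWarnings` is a hypothetical name, and `PACK_STALE_DAYS` is referenced in the diff without its value being shown, so the `30` here is only an assumed placeholder.

```javascript
// ASSUMPTION: the real value of PACK_STALE_DAYS is not visible in this diff.
const PACK_STALE_DAYS = 30;

// Hypothetical helper that mirrors the four warning checks shown in the hunk
// (using the 0.1.3 wording for the "pack is long" message).
function collectPackWarnings({ words, packAgeDays, placeholderLines }) {
  const warnings = [];
  if (words > 1200) warnings.push(`pack is long (${words} words); keep the core compact`);
  if (words < 80) warnings.push(`pack is very short (${words} words); direction may be too thin to reuse`);
  if (packAgeDays > PACK_STALE_DAYS) warnings.push(`pack is ${packAgeDays} days old; review whether direction or commands changed`);
  if (placeholderLines > 12) warnings.push(`${placeholderLines} placeholder lines are still blank`);
  return warnings;
}

// Example: only the length check fires for a long, fresh, fully filled-in pack.
const sample = collectPackWarnings({ words: 1500, packAgeDays: 3, placeholderLines: 0 });
// sample → ['pack is long (1500 words); keep the core compact']
```

Only the message text of the first check changes between 0.1.2 and 0.1.3; the thresholds themselves are untouched.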
@@ -1354,9 +1351,7 @@ async function findInstructionSurfaces(dir) {
     { path: 'CLAUDE.md', level: 'warning', note: 'active CLAUDE.md may carry non-yam instructions' },
     { path: 'RULES.md', level: 'warning', note: 'active RULES.md may carry non-yam instructions' },
     { path: '.codex/AGENTS.md', level: 'warning', note: 'active .codex/AGENTS.md may override project behavior' },
-    { path: '.codex/SNEAKOSCOPE.md', level: 'issue', note: 'active Sneakoscope instruction file detected' },
     { path: '.codex/hooks.json', level: 'issue', note: 'active Codex hook file detected' },
-    { path: '.sneakoscope', level: 'issue', note: 'active Sneakoscope directory detected' },
     { path: '.agents', level: 'warning', note: 'project-local .agents directory may add additional skills or instructions' }
   ];
   const found = [];
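
The body of `findInstructionSurfaces` is not shown in this diff, so the following is only a sketch of how a table like the one above might be scanned. `scanSurfaces` and `splitByLevel` are hypothetical names, and `exists` is an injected async predicate (an assumption) so that the example runs without touching the filesystem. The table lists only the entries visible in the hunk after the 0.1.3 change.

```javascript
// Entries visible in the post-change hunk; the real list may contain more.
const SURFACES = [
  { path: 'CLAUDE.md', level: 'warning', note: 'active CLAUDE.md may carry non-yam instructions' },
  { path: 'RULES.md', level: 'warning', note: 'active RULES.md may carry non-yam instructions' },
  { path: '.codex/AGENTS.md', level: 'warning', note: 'active .codex/AGENTS.md may override project behavior' },
  { path: '.codex/hooks.json', level: 'issue', note: 'active Codex hook file detected' },
  { path: '.agents', level: 'warning', note: 'project-local .agents directory may add additional skills or instructions' }
];

// Hypothetical scan: keep every surface whose path the predicate reports as present.
async function scanSurfaces(surfaces, exists) {
  const found = [];
  for (const surface of surfaces) {
    if (await exists(surface.path)) found.push(surface);
  }
  return found;
}

// Hypothetical split of hits into issues and warnings, keyed on the `level` field.
function splitByLevel(found) {
  return {
    issues: found.filter((s) => s.level === 'issue').map((s) => s.note),
    warnings: found.filter((s) => s.level === 'warning').map((s) => s.note)
  };
}
```

With the two Sneakoscope entries removed, `.codex/hooks.json` is the only `issue`-level surface left among the entries shown here.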

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "yam-harness",
-  "version": "0.1.2",
+  "version": "0.1.3",
   "description": "Progressive proof-first Codex harness: start fast, deepen deliberately, stay honest by design.",
   "type": "module",
   "author": "0kim0bos",

package/references/current-docs.md CHANGED

@@ -36,10 +36,10 @@ Or:
 Current-docs proof: skipped because this was stable/local/non-SDK work.
 ```
-## Compared Baseline
+## Design Baseline
-Sneakoscope favors source-intelligence proof for current tool behavior.
+Strict proof favors source-backed evidence for current tool behavior.
-ECC keeps research/context selective and low-context.
+Modular skill workflows keep research/context selective and low-context.
-Karpathy-style minimalism says the rule is useful only when it changes the answer.
+Minimal-core design says the rule is useful only when it changes the answer.

package/references/db-supabase-safety-lite.md CHANGED

@@ -31,10 +31,10 @@ Before claiming safe:
 - A successful migration command is not automatically safe; it only proves that command execution completed.
 - Do not claim production safety without environment evidence.
-## Compared Baseline
+## Design Baseline
-Sneakoscope would likely gate destructive DB work more aggressively.
+Strict proof would gate destructive DB work more aggressively.
-ECC would keep the check selective and evidence-bound.
+Modular skill workflows keep the check selective and evidence-bound.
-Karpathy-style minimalism keeps this as a short rule and a small detector, not a full DB policy engine.
+Minimal-core design keeps this as a short rule and a small detector, not a full DB policy engine.

package/references/honest-completion.md CHANGED

@@ -52,10 +52,10 @@ Runtime work needs stronger evidence because long-running processes can create f
 - No release-blocking runtime proof unless the user chooses `$deep` or `$mission`.
 - No full `$mission` claim without real subagent/team evidence; downgrade to `$deep`, or mark mission partial/blocked.
-Compared baseline:
+Design baseline:
-- Sneakoscope would collect stronger physical proof and gate completion more aggressively.
-- ECC would keep evidence boundaries and report what is known vs inferred.
-- Karpathy-style minimalism would keep the rule short and obeyable.
+- Strict proof collects stronger physical proof and gates completion more aggressively.
+- Modular skill workflows keep evidence boundaries and report what is known vs inferred.
+- Minimal-core design keeps the rule short and obeyable.
 `yam` keeps the guard explicit, cheap, and route-aware.

package/references/hook-lite.md CHANGED

@@ -14,7 +14,7 @@ Allowed:
 - Remind the agent not to overclaim verification, cleanup, or visual evidence.
 - Suggest `$quick`, `$ueye`, `$question`, `$scout`, `$deep`, or `$mission` based on obvious prompt signals.
 - Mention a project pack or memory summary when present.
-- Warn when `.sneakoscope` is active in the current project.
+- Warn when conflicting proof-harness surfaces are active in the current project.
 Not allowed:
@@ -44,12 +44,12 @@ Project hooks write to `<project>/.codex/hooks.json`.
 `yam` backs up an existing hook file before enabling the lite hook.
-## Compared Baseline
+## Design Baseline
-Sneakoscope uses hooks as a broad trust surface with route prep, tool evidence, permission gates, subagent evidence, and stop gates.
+Broad hook systems often use route prep, tool evidence, permission gates, subagent evidence, and stop gates.
-ECC favors selective setup and lower-context workflows.
+Selective skill systems favor lower-context workflows.
-Karpathy-style minimalism would avoid hooks unless the rule is short and changes behavior.
+Minimal-core systems avoid hooks unless the rule is short and changes behavior.
 `yam` keeps this hook advisory-only so beginner momentum is preserved while the agent still receives a direction nudge. Deeper proof belongs in `$deep` and real team execution belongs in `$mission`, not in an always-on prompt hook.

package/references/markdown-management.md CHANGED

@@ -2,21 +2,21 @@
 `yam` uses markdown as a small direction layer, not as an automatic control system.
-## Compared Baseline
+## Design Baseline
-Sneakoscope:
+Strict proof systems:
 - Creates and manages more markdown surfaces for agent control, route instructions, proof, and dashboards.
 - Good for strict verification and anti-fake-work pressure.
 - Risk: too much generated context and too much automatic intervention.
-ECC:
+Modular skill systems:
 - Splits markdown into modular instructions, rules, skills, and commands.
 - Good for selective installation and low-context operation.
 - Risk: too many optional files can still become noisy if installed wholesale.
-Karpathy-style minimal harness:
+Minimal-core systems:
 - Keeps the core instruction document short and human-readable.
 - Good for speed, obedience, and easy maintenance.

package/references/memory.md CHANGED

@@ -2,12 +2,12 @@
 `yam memory` is an opt-in, project-local memory layer.
-It borrows only the lightest useful parts from heavier harnesses:
+It keeps only the lightest useful parts from heavier harness patterns:
-- Sneakoscope TriWiki: sparse records, one file per claim, deliberate forgetting instead of injecting every old claim.
-- Sneakoscope wrongness memory: remember repeated mistakes, wrong decisions, stale assumptions, and overconfident claims.
-- ECC research style: separate evidence, inference, and recommendation.
-- Karpathy-style minimalism: keep the mechanism small enough to obey.
+- Sparse records, one file per durable claim, and deliberate forgetting instead of injecting every old claim.
+- Wrongness memory for repeated mistakes, wrong decisions, stale assumptions, and overconfident claims.
+- Separate evidence, inference, and recommendation.
+- Keep the mechanism small enough to obey.
 Storage:

package/references/mission.md CHANGED

@@ -77,10 +77,10 @@ Doctor scan:
 Use `references/doctor-scan.md` before final completion.
 Keep the scan short, but cover direction fit, scope control, verification, runtime/cleanup, truth status, and fix-first items.
-Compared baseline:
+Design baseline:
-- Sneakoscope would likely make this a Team route with stronger proof gates and required agent evidence.
-- ECC would split role responsibilities and keep evidence boundaries.
-- Karpathy-style minimalism would avoid adding this unless it clearly replaces a confusing middle route.
+- Strict proof would likely make this a team route with stronger gates and required agent evidence.
+- Modular skill workflows split role responsibilities and keep evidence boundaries.
+- Minimal-core design avoids adding this unless it clearly replaces a confusing middle route.
 `yam` uses mission to replace the old standalone runtime route with a clearer heavy execution route.

package/references/quick.md CHANGED

@@ -2,15 +2,15 @@
 `quick` is the merged small-work route: fast patching, ordinary scoped implementation, and fast error scanning.
-## Borrowed, With Weight Removed
+## Selected Principles
-From Sneakoscope:
+Strict proof:
 - Honest completion language.
 - Real versus assumed verification.
 - Stop instead of claiming success when evidence is missing.
-From ECC:
+Focused execution:
 - Detect the smallest useful command.
 - Group build/type/lint/test errors by file and root cause.
@@ -18,7 +18,7 @@ From ECC:
 - Re-run the same focused command after a fix.
 - Use a compact PASS/FAIL matrix.
-From Karpathy-style minimal harness:
+Minimal core:
 - Keep the instruction short enough to obey.
 - Read the smallest useful context.
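
The "group errors by file and root cause" item in this route can be illustrated with a small grouping function. This is a sketch under stated assumptions, not code from the package: `groupErrors` is a hypothetical name, and the `{ file, code, message }` error shape is invented for illustration, with `code` standing in for the root cause.

```javascript
// Hypothetical grouping: one bucket per (file, code) pair, in first-seen order.
function groupErrors(errors) {
  const groups = new Map();
  for (const err of errors) {
    const key = `${err.file}::${err.code}`;
    if (!groups.has(key)) {
      groups.set(key, { file: err.file, code: err.code, messages: [] });
    }
    groups.get(key).messages.push(err.message);
  }
  return [...groups.values()];
}

// Illustrative input only; these file names and codes are made up.
const grouped = groupErrors([
  { file: 'src/a.js', code: 'no-undef', message: "'foo' is not defined" },
  { file: 'src/a.js', code: 'no-undef', message: "'bar' is not defined" },
  { file: 'src/b.js', code: 'no-unused-vars', message: "'baz' is never used" }
]);
// grouped has two buckets: one for src/a.js (two messages), one for src/b.js.
```

Grouping this way makes it easier to fix one error class at a time and then re-run the same focused command.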

package/references/token-budget-reporter.md CHANGED

@@ -33,12 +33,12 @@ yam measure ueye --files 7 --commands 2 --report-lines 18 --seconds 260
 - `$deep`: can exceed ordinary budgets, but the reason must be risk-tied; single-agent runtime/tmux/browser checks belong here when verification needs them.
 - `$mission`: can spend more context on real subagent/team lanes, cross-verification, doctor scan, and runtime evidence, but only for approved plans where real subagents are used or explicitly unavailable/partial.
-## Compared Baseline
+## Design Baseline
-Sneakoscope would favor stronger automatic evidence collection.
+Strict proof would favor stronger automatic evidence collection.
-ECC would favor selective, low-context reporting.
+Modular skill workflows favor selective, low-context reporting.
-Karpathy-style minimal harness would remove the measurement unless it changes behavior.
+Minimal-core design removes the measurement unless it changes behavior.
 `yam` keeps manual measurement because it helps reduce over-reading without installing hooks.

package/references/tool-trust-layer.md CHANGED

@@ -49,7 +49,7 @@ Default:
 Advisory:
 - `yam-lite` hook may suggest routes and warn about overclaiming.
-- `yam pack` may warn about stale project direction, command drift, active hooks, or Sneakoscope surfaces.
+- `yam pack` may warn about stale project direction, command drift, active hooks, or legacy proof surfaces.
 On demand:
@@ -60,7 +60,7 @@ On demand:
 - `yam tools doctor`: inspect tool readiness without changing project state.
 - `yam proof`: summarize actual evidence without running verification.
-## Borrow From Sneakoscope
+## Strict Proof Inputs
 - Tool readiness checks.
 - Hook status and trust reporting.
@@ -71,14 +71,14 @@ On demand:
 - Destructive DB/Supabase command detection and production-write caution.
 - Feature/release inventory as an optional doctor, not a default gate.
-## Borrow From ECC
+## Modular Skill Inputs
 - Selective install and profiles.
 - Evidence boundaries.
 - Low-context command detection.
 - Optional orchestration instead of always-on orchestration.
-## Borrow From Open Design
+## Design Quality Inputs
 - Real preview/screenshot evidence.
 - Compact design direction.

package/references/ueye.md CHANGED

@@ -2,9 +2,9 @@
 `ueye` is the merged UI/design route: design-heavy implementation, screenshot-led UX review, and visual QA.
-## Borrowed, With Weight Removed
+## Selected Principles
-From Sneakoscope image UX review:
+Visual proof:
 - Source-screen inventory before visual claims.
 - P0-P3 issue ledger.
@@ -12,21 +12,21 @@ From Sneakoscope image UX review:
 - Recheck changed or high-risk screens after fixes when feasible.
 - Cap text-only or missing-screenshot reviews as partial instead of fully verified.
-Kept out from Sneakoscope by design:
+Kept out by design:
 - Mandatory generated annotated images.
 - Image voxel ledgers.
 - Release gates for every UI change.
 - Always-on proof loops.
-From Open Design:
+Design quality:
 - Real examples and previews matter more than abstract prose.
 - Design direction should be compact and searchable.
 - P0 gates should reject placeholder visuals, generic UI, and broken responsive states.
 - UI work should be self-contained enough to inspect.
-From ECC:
+Evidence boundaries:
 - Separate evidence from judgment.
 - Keep review output compact.

package/skills/ueye/SKILL.md CHANGED

@@ -34,7 +34,7 @@ Do not use for:
 - Text-only visual critique cannot be reported as fully verified when screenshot evidence was required.
 - Generated annotated images are optional, not a default gate.
 - Image evidence should stay bounded: inspect the primary screen first, then only the states/images needed to support the claim.
-- Open Design-style quality judgment belongs after implementation/review: compare to the reference first, then judge whether the result is good design.
+- Design quality judgment belongs after implementation/review: compare to the reference first, then judge whether the result is good design.
 ## Workflow

package/templates/tuning-log.md CHANGED

@@ -33,7 +33,7 @@ Use this to tune route wording from real use.
 ## Compared Against
-- Sneakoscope:
-- ECC:
-- Karpathy:
+- Strict proof:
+- Modular skills:
+- Minimal core:
 - yam decision:

package/yam.manifest.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "yam",
-  "version": "0.1.2",
+  "version": "0.1.3",
   "principles": [
     "Direction before execution.",
     "Start fast.",