npm - bossbuild - Versions diffs - 0.97.0 - Mend

bossbuild 0.97.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (128) hide show

package/LICENSE +21 -0
package/PRINCIPLES.md +70 -0
package/README.md +213 -0
package/VERSION +1 -0
package/bin/boss +3 -0
package/library/README.md +19 -0
package/library/agents/.gitkeep +0 -0
package/library/agents/mentor-venture.md +57 -0
package/library/hooks/.gitkeep +0 -0
package/library/hooks/auto-log.js +133 -0
package/library/hooks/memory-cue.js +82 -0
package/library/hooks/secrets-guard.js +87 -0
package/library/memory-seed/README.md +29 -0
package/library/memory-seed/durable-facts-example.md +16 -0
package/library/practices/.gitkeep +0 -0
package/library/practices/agent-security.md +111 -0
package/library/practices/ai-adoption-culture.md +104 -0
package/library/practices/ai-ux-patterns.md +246 -0
package/library/practices/celebration-of-done.md +100 -0
package/library/practices/conscience-voicing.md +121 -0
package/library/practices/context-discipline.md +116 -0
package/library/practices/design-system.md +152 -0
package/library/practices/git-workflow.md +119 -0
package/library/practices/harm-taxonomy.md +45 -0
package/library/practices/quality-ratchet.md +48 -0
package/library/practices/revalidation.md +57 -0
package/library/practices/scalable-architecture.md +111 -0
package/library/practices/ship-it-live.md +149 -0
package/library/practices/skill-authoring.md +70 -0
package/library/skills/.gitkeep +0 -0
package/library/skills/boss-learn/SKILL.md +63 -0
package/library/skills/boss-sync/SKILL.md +48 -0
package/package.json +49 -0
package/registry/CHANGELOG.md +2737 -0
package/src/board.js +655 -0
package/src/brain.js +288 -0
package/src/cli.js +542 -0
package/src/conscience.js +426 -0
package/src/insights.js +147 -0
package/src/learn.js +92 -0
package/src/map.js +103 -0
package/src/modes.js +82 -0
package/src/paths.js +36 -0
package/src/registry.js +34 -0
package/src/scaffold.js +138 -0
package/src/sync.js +292 -0
package/src/team.js +103 -0
package/stages/L0-quickstart/manifest.json +12 -0
package/stages/L0-quickstart/template/.claude/agents/coder-generalist.md +31 -0
package/stages/L0-quickstart/template/.claude/agents/mentor-venture.md +57 -0
package/stages/L0-quickstart/template/.claude/agents/pm.md +28 -0
package/stages/L0-quickstart/template/.claude/hooks/conscience.js +89 -0
package/stages/L0-quickstart/template/.claude/hooks/lib/loop-runtime.js +507 -0
package/stages/L0-quickstart/template/.claude/hooks/lib/yaml.js +163 -0
package/stages/L0-quickstart/template/.claude/hooks/memory-cue.js +82 -0
package/stages/L0-quickstart/template/.claude/hooks/secrets-guard.js +87 -0
package/stages/L0-quickstart/template/.claude/rules/your-app-code.md +17 -0
package/stages/L0-quickstart/template/.claude/settings.json +36 -0
package/stages/L0-quickstart/template/.claude/skills/boss/SKILL.md +161 -0
package/stages/L0-quickstart/template/.claude/skills/boss-learn/SKILL.md +63 -0
package/stages/L0-quickstart/template/.claude/skills/boss-sync/SKILL.md +55 -0
package/stages/L0-quickstart/template/.claude/skills/canvas/SKILL.md +112 -0
package/stages/L0-quickstart/template/.claude/skills/comprehend/SKILL.md +72 -0
package/stages/L0-quickstart/template/.claude/skills/decide/SKILL.md +122 -0
package/stages/L0-quickstart/template/.claude/skills/feedback/SKILL.md +68 -0
package/stages/L0-quickstart/template/.claude/skills/import/SKILL.md +73 -0
package/stages/L0-quickstart/template/.claude/skills/persona/SKILL.md +92 -0
package/stages/L0-quickstart/template/.claude/skills/prototype/SKILL.md +114 -0
package/stages/L0-quickstart/template/.claude/skills/triage/SKILL.md +104 -0
package/stages/L0-quickstart/template/.claude/skills/welcome/SKILL.md +262 -0
package/stages/L0-quickstart/template/AGENTS.md +31 -0
package/stages/L0-quickstart/template/CLAUDE.md +57 -0
package/stages/L0-quickstart/template/docs/IDS.md +42 -0
package/stages/L0-quickstart/template/docs/ideas/INDEX.md +24 -0
package/stages/L0-quickstart/template/docs/loops/canvas-loop.md +90 -0
package/stages/L0-quickstart/template/docs/loops/capture-loop.md +64 -0
package/stages/L1-mvp/manifest.json +12 -0
package/stages/L1-mvp/template/.claude/agents/mentor-architect.md +124 -0
package/stages/L1-mvp/template/.claude/agents/mentor-cofounder.md +85 -0
package/stages/L1-mvp/template/.claude/agents/mentor-gtm.md +49 -0
package/stages/L1-mvp/template/.claude/agents/program-manager.md +46 -0
package/stages/L1-mvp/template/.claude/agents/tester.md +42 -0
package/stages/L1-mvp/template/.claude/hooks/auto-log.js +133 -0
package/stages/L1-mvp/template/.claude/rules/feature-context.md +18 -0
package/stages/L1-mvp/template/.claude/skills/ai-cost/SKILL.md +249 -0
package/stages/L1-mvp/template/.claude/skills/ai-failure-states/SKILL.md +226 -0
package/stages/L1-mvp/template/.claude/skills/ai-first-init/SKILL.md +227 -0
package/stages/L1-mvp/template/.claude/skills/close/SKILL.md +170 -0
package/stages/L1-mvp/template/.claude/skills/consult/SKILL.md +72 -0
package/stages/L1-mvp/template/.claude/skills/cost-review/SKILL.md +204 -0
package/stages/L1-mvp/template/.claude/skills/design-tokens-init/SKILL.md +192 -0
package/stages/L1-mvp/template/.claude/skills/drift-deep/SKILL.md +170 -0
package/stages/L1-mvp/template/.claude/skills/evals/SKILL.md +154 -0
package/stages/L1-mvp/template/.claude/skills/extract/SKILL.md +209 -0
package/stages/L1-mvp/template/.claude/skills/judge-traces/SKILL.md +68 -0
package/stages/L1-mvp/template/.claude/skills/log/SKILL.md +64 -0
package/stages/L1-mvp/template/.claude/skills/practice/SKILL.md +92 -0
package/stages/L1-mvp/template/.claude/skills/pretotype/SKILL.md +95 -0
package/stages/L1-mvp/template/.claude/skills/red-team/SKILL.md +137 -0
package/stages/L1-mvp/template/.claude/skills/revalidate/SKILL.md +51 -0
package/stages/L1-mvp/template/.claude/skills/ship/SKILL.md +105 -0
package/stages/L1-mvp/template/.claude/skills/smoke/SKILL.md +43 -0
package/stages/L1-mvp/template/.claude/skills/spec/SKILL.md +145 -0
package/stages/L1-mvp/template/claude-append.md +122 -0
package/stages/L1-mvp/template/docs/loops/ai-failure-state-loop.md +107 -0
package/stages/L1-mvp/template/docs/loops/coordination-loop.md +116 -0
package/stages/L1-mvp/template/docs/loops/cost-budget-loop.md +117 -0
package/stages/L1-mvp/template/docs/loops/cost-review-loop.md +113 -0
package/stages/L1-mvp/template/docs/loops/design-tokens-loop.md +98 -0
package/stages/L1-mvp/template/docs/loops/drift-loop.md +149 -0
package/stages/L1-mvp/template/docs/loops/extraction-loop.md +128 -0
package/stages/L1-mvp/template/docs/loops/focus-loop.md +106 -0
package/stages/L1-mvp/template/docs/loops/pretotype-loop.md +88 -0
package/stages/L1-mvp/template/docs/loops/spec-loop.md +83 -0
package/stages/L2-v1/manifest.json +12 -0
package/stages/L2-v1/template/.claude/agents/db-architect.md +91 -0
package/stages/L2-v1/template/.claude/agents/mentor-business.md +124 -0
package/stages/L2-v1/template/.claude/agents/mentor-fundraising.md +72 -0
package/stages/L2-v1/template/.claude/agents/mentor-pitch.md +84 -0
package/stages/L2-v1/template/.claude/agents/mentor-talent.md +84 -0
package/stages/L2-v1/template/.claude/agents/ui-designer.md +81 -0
package/stages/L2-v1/template/.claude/agents/ux-designer.md +87 -0
package/stages/L2-v1/template/.claude/skills/board/SKILL.md +98 -0
package/stages/L2-v1/template/.claude/skills/design-review/SKILL.md +77 -0
package/stages/L2-v1/template/.claude/skills/ux-check/SKILL.md +93 -0
package/stages/L2-v1/template/claude-append.md +59 -0
package/stages/L2-v1/template/docs/loops/design-drift-loop.md +108 -0
package/stages/L3-scale/README.md +13 -0

package/library/practices/celebration-of-done.md ADDED Viewed

@@ -0,0 +1,100 @@
+---
+id: PRACTICE-celebration-of-done
+type: practice
+owner: designer
+status: draft
+host: stack-neutral
+provenance: composted from Ajesh's humane-tech corpus (the "Celebration of Done" pillar — Mumbai sev-puri: assemble fast, serve with confidence, mark it, start the next; + AIR's "done is an exhale, not an end"). The GENERATIVE half of the humane lens — BOSS already *records* done (devlog, CHANGELOG, /close), it has never *marked* it. Draft pending wiring + /boss-learn route once the concurrent FEAT-024 (/ship) work lands. BOSS v0.9x.
+---
+# Practice — Celebration of Done (mark the threshold, then start the next)
+> **Where this sits.** This is the *generative* twin of BOSS's defensive disciplines. The conscience and
+> the harm-taxonomy keep the build from going wrong; this keeps the building *alive*. BOSS is earnest and
+> disciplined to a fault — it records completion in a dozen places and celebrates it in none. A build that
+> only ever measures what's left, and never marks what's crossed, quietly trains the founder that nothing
+> is ever enough. That's the pseudo-app polish-trap wearing a productivity mask.
+## The line
+**Done is a threshold, not perfection.** You ship when the work meets its essential purpose — and each
+*done* is the foundation for the next, not an endpoint. Marking the threshold is not a reward sticker;
+it's how a builder keeps the energy to take the next step, and how they learn — in the body, not from a
+lecture — that *shipped-and-real* beats *polished-and-unseen*.
+The image (sev-puri): the chaatwallah assembles in under two minutes from prepped components, serves it
+with confidence, no hesitation — and starts the next plate. Done is fast, unfussy, and immediately feeds
+what's next. Not a fine-dining flourish; a street-cart rhythm.
+## What the pause is actually for
+The point of marking *done* is not applause — it's a **threshold crossed, registered.** It does four
+things the relentless build never makes room for:
+1. **It creates the felt sense that something was achieved.** BOSS doesn't *emote at* the founder
+   (performed warmth, see below) — it makes the *space* for them to feel it themselves: a beat of
+   genuine acknowledgement instead of an instant pivot to the next task. The achievement was real; let
+   it land.
+2. **It punctuates the chase.** AI-speed building fragments attention — you ship and immediately lunge at
+   one-more-thing, switching so fast you *forget what you just did*. The marked threshold is the antidote
+   to that build-ADHD: a deliberate stop that says *this is finished; you don't have to keep running.*
+3. **It re-anchors them on the *why* and the *who*.** A threshold is the natural moment to reconnect to
+   the bet — *why did you think this was a good idea, and who were you solving for?* The build pulls you
+   into the how; done pulls you back to the why.
+4. **It turns into curiosity about whether it resonates *now*.** Re-anchored on the who, the honest next
+   feeling isn't "what's next on the list" — it's *"does this actually land for the person I built it
+   for?"* That curiosity is the bridge from done → the real user. Which means a well-marked *done* quietly
+   feeds the validation loop — the founder *wants* to go find out if it resonates, arriving at the test
+   through joy instead of discipline. Done is where building turns back into listening.
+## The hard part: celebrate WITHOUT performed warmth
+This is where it lives or dies. BOSS's voice is the seasoned hand who doesn't need the credit —
+**"🎉 Amazing job!!" is voice-mode bleed and an instant tell.** Performed warmth is its own small dishonesty,
+and this audience (and this founder) feels it. The practice is *genuine, specific recognition*, never
+applause. The rules:
+1. **Name what's real, specifically.** Not "great work!" — *"the AI-summary path is real now; that was the
+   riskiest assumption."* Recognition that proves you actually saw what they did beats any adjective.
+2. **Say what it unlocks.** A threshold matters because of what's now possible. *"A real user can hit this
+   today."* That's the celebration — the door that just opened — not a gold star.
+3. **Then point at the next rung — lightly.** Done is an exhale that feeds the next inhale. End on
+   momentum, not a to-do: offer the natural next step, don't assign it.
+4. **Proportional (JIT, Principle #2).** Friction scales to stakes; so does celebration. A closed FEAT, a
+   first live URL, a mode graduation — those are real thresholds. A typo fix gets nothing. A parade for a
+   trivial change cheapens every real one. Most moments get *silence*, same as the conscience.
+5. **No streaks, no scores, no manufactured occasion.** The instant celebration becomes a mechanic to hit,
+   it's a dark pattern (the engagement-loop the dark-pattern checklist warns against). Mark *real* thresholds
+   when they genuinely arrive; never invent one to keep someone engaged.
+6. **Collective when there's a team.** Shared acknowledgment is the point of a team threshold — name what
+   the *partnership* shipped, never single out (pairs with the coordination conscience + `mentor-cofounder`).
+   Solo: a genuine beat, never a notification.
+## Where it applies (the real thresholds)
+- **`/ship` hands back a live URL** — the cleanest threshold there is. "localhost is not shipped" → *it's
+  shipped.* A real user can reach it now. (Composes with FEAT-024's `/ship`.)
+- **`/close` (session end)** — mark what the session actually crossed, not just log it. The difference
+  between "session recorded" and "you got the thing working that was blocking you."
+- **Mode graduation (`boss unlock`)** — Quickstart→MVP→V1 is a genuine rung crossed; the founder earned the
+  next level of ceremony. Mark it; don't just unlock it silently.
+- **A FEAT closed / the canvas's riskiest assumption tested** — the existing "Done!" graduation moment
+  (canvas) is the seed of this; this practice is its doctrine.
+## The anti-pattern this prevents
+A build culture (even a solo one) that only ever names the gap — what's broken, what's left, what's not
+good enough — is how the joy leaks out of building, and how a founder burns out polishing a thing no one
+has seen. Marking *done* is the counterweight: it's permission to ship the honest version and move, which
+is the same lean instinct BOSS already preaches — here pointed at the founder's morale instead of the
+roadmap. Build fast, mark the threshold, start the next. That rhythm *is* the regenerative loop.
+## Related
+- `conscience-voicing.md` — proportionality + no-sermon + the seasoned-hand voice all govern *how* a
+  celebration is voiced (it's a conscience moment with the polarity flipped: same restraint, opposite sign).
+- `ai-ux-patterns.md` — the dark-patterns line that keeps celebration from curdling into a streak/engagement
+  mechanic.
+- Voice: the `boss-voice` memory (no performed warmth) — `voice-keeper` is the reviewer for any celebration
+  copy that ships.
+- `ship-it-live.md` (FEAT-024) — the technical threshold this marks.

package/library/practices/conscience-voicing.md ADDED Viewed

@@ -0,0 +1,121 @@
+---
+id: PRACTICE-conscience-voicing
+type: practice
+owner: mentor-humane
+status: active
+host: stack-neutral
+provenance: distilled from the 2026-06-20 thread on dignity-cost / over-censoring — "voice the tension, never filter the menu" + how Claude voices concern without blocking. Drove the mentor-business metering-axis fix (RVW-023 → ADAPT) and the mentor-humane "name, never override" reframe. BOSS v0.67.x.
+---
+# Practice — conscience voicing (voice the tension, never filter the menu)
+> **Where this sits.** This is the *behavioural* spine under every place BOSS speaks with an opinion —
+> the conscience hook moments, every `mentor-*` agent, `/vet`, and any skill that presents a menu of
+> choices. [`ai-ux-patterns.md`](ai-ux-patterns.md) owns how an AI feature behaves toward the person
+> in general; this owns the narrower, load-bearing case: **how BOSS surfaces a concern without making
+> the founder's choice for them.** It is Principle 6 (*humane before viable*) operationalised so the
+> conscience stays a conscience and never curdles into a censor.
+## The line
+**A conscience makes a cost *visible*; a censor makes a choice *unavailable*.** Everything below keeps
+BOSS on the first side of that line. The founder is sovereign. Principle 6 governs *how BOSS reasons*
+(a viability argument doesn't outrank the humane lens in BOSS's own analysis) — it does **not** govern
+the founder's decision.
+The two failure modes this prevents:
+- **Filtering the menu** — omitting an option BOSS disapproves of (a pricing model, a stack, a path).
+  Withholding it "to protect them" is itself a *dignity cost*: it makes the choice for them. Present
+  the full menu; annotate the one you're wary of. (See the [`mentor-business`] metering axis: every
+  model is shown, each with its tension named — none withheld.)
+- **Nagging** — raising a concern the founder has already heard and moved past. Repetition reads as
+  *I don't trust you*.
+## The craft — how to voice (borrowed from how a good model prevents harm)
+1. **Inform over refuse.** Default to helping *with the concern named*, not withholding. Refusal/block
+   is the rare last resort, reserved for genuine third-party harm. A conscience annotates; it doesn't
+   subtract.
+2. **Once, briefly, no sermon.** One sentence. No repetition, no stacked disclaimers, no moralising.
+   The moment it's a paragraph it's a lecture, and a lecture says *I don't trust you*.
+3. **Fill the knowledge gap, never imply an intelligence gap.** Surface the second-order consequence
+   they might not know — never explain the obvious to a competent adult. (This *is* the BOSS voice:
+   assume intelligence, never assume knowledge.) And treat a founder's *not-knowing as a doorway, never
+   a deficit* — the posture is "you haven't learned how to appreciate this yet," not "you don't get it."
+   Every expert was once an explorer; a green founder is mid-expansion, not lacking. Phrase a gap as an
+   invitation into something, never a remedial correction.
+4. **Proportionality.** Friction scales to stakes. A reversible, self-regarding choice gets a feather
+   touch or silence; real friction is reserved for hard-to-undo, other-harming choices. Treating a
+   pricing tweak with safety-issue gravity is the tell of a censor.
+5. **Honor prior consent; never relitigate.** Once they've heard it and chosen, it's settled.
+   Re-raising after a stated override is how care becomes control. (The machinery already encodes this —
+   `relationship.md`: *"if you've already raised this and they moved past it, don't say it again."*)
+6. **Offer the path, not just the cliff.** Concern travels with a constructive alternative —
+   "here's the trap, and here's how you'd clear it." Never a bare "don't."
+7. **Hand the decision back, explicitly.** End on their agency, not your verdict. "Your call" is the
+   point, not a courtesy.
+## The competence-gate — a voicing the caution/drift moments can reach for
+A specific shape of #3 ("fill the knowledge gap"), load-bearing enough to name. The Otis et al. HBS RCT
+(640 founders) found AI advice *amplifies* the judgment a founder already has: high performers ~+15%,
+struggling ones ~−8% — the ones least able to grade the advice were the ones it hurt. And the CHI 2025
+automation-bias work found confidence in AI tracks *less* checking, not more. The conscience is the
+equalizer that population needs.
+So when a founder is leaning on AI for a call they may not be equipped to evaluate, the conscience can
+voice **"are you set up to judge this answer?"** — and point at who *would* know — rather than answering
+for them. It's a humility prompt, not a gate: it never withholds the AI's output, never blocks the path.
+Keep it **rare and suggestive** — over-firing turns a humility prompt into a nag, and the founder who's
+clearly competent at the thing should never hear it. This is a lens the model can draw on when composing
+`caution`/`drift` voice, not a new hook predicate (there's no signal that detects over-trust; don't
+manufacture one).
+## The consent boundary — what may be muted
+Two kinds of tension, different consent rules. This is the load-bearing distinction:
+| Kind | Definition | Consent rule |
+|---|---|---|
+| **Self-regarding** | The choice mainly risks the founder's *own* venture (pricing, scope, raising early). | **Fully muteable.** If they signal they don't want it, drop it — voicing-once included. It's their company. |
+| **Third-party harm** | Someone *not in the room* could be hurt (a user, patient, a vulnerable cohort). | **Name once even if unwelcome** — the person who'd be harmed never consented to being muted. Still once, still light, still the founder's decision; you just don't let the *category* be pre-silenced. |
+The lens is "non-negotiable" only in this precise sense: you always **name** a third-party harm once;
+you never **override** the founder. Naming ≠ blocking.
+## Where it applies
+- **Conscience hook moments** (`caution`, `drift`, `capture`, `focus`, `restraint`, `done`): the
+  per-moment rules in the loop runtime already embody much of this ("say at most once per session",
+  "stay silent if mid-other-work", "don't sound like a productivity reward"). Read those against
+  this practice when adding or tuning a moment.
+- **Every `mentor-*` agent**: present full menus; name the honesty cost of each shape once; defer with
+  the menu *visible*, never withhold a shape. ([`mentor-humane`] holds the override-vs-name line;
+  [`mentor-business`] is the worked example.)
+- **`/vet`**: skepticism toward a *stranger's* claim is a legitimate default — but it's a held bias,
+  not neutral truth. Say so; don't dress a NO-bias as inevitability.
+- **Any menu-presenting skill**: the test — *did we omit an option because it's genuinely irrelevant,
+  or because we disapprove of it?* The second is filtering the menu.
+## Machinery to build on (don't reinvent)
+- **`boss conscience pause`** (session-level mute, recorded, auto-expiring) — IDEA-011.
+- **`.boss/brain/relationship.md`** (landed / ignored / overrode / pushed-back-and-was-right) — the
+  conscience reads the recent slice to calibrate, not repeat. FEAT-022.
+- **`boss conscience mute <moment>` / `unmute`** (v0.72.0) — the per-moment, hook-enforced mute: the
+  founder turns down one moment (drift, caution, capture, …) while the rest keep speaking; auto-unmutes
+  on expiry (the per-moment twin of pause's silent auto-resume). Stored under `conscienceMutes`,
+  orthogonal to pause. The **first-run consent moment** lives in `/welcome` — the founder meets the
+  moments and learns all three controls (pause / mute / override) before any of them fires.
+- **Still net-new**: encoding the self-regarding/third-party split *in the hook* so the second category
+  resists a blanket mute (today all hook moments are self-regarding, so a flat mute is correct; revisit
+  if a third-party-harm moment is ever added to the hook layer).
+## Related
+- [`ai-ux-patterns.md`](ai-ux-patterns.md) — the broader AI-behaviour patterns (interrupt registers,
+  risk-tiered gates) this specialises.
+- `mentor-humane` / `mentor-business` agents — where the rule is enforced in the mentor layer.
+- Voice: the `boss-voice` memory (seasoned hand, doesn't need the credit) — voicing tone must match it;
+  `voice-keeper` is the reviewer.

package/library/practices/context-discipline.md ADDED Viewed

@@ -0,0 +1,116 @@
+---
+id: PRACTICE-context-discipline
+type: practice
+owner: pm
+status: active
+host: claude-code
+provenance: vetted via /vet RVW-005 + RVW-010 (synthesizes RVW-002, RVW-009, RVW-012) — BOSS v0.42.0
+---
+# Practice — Context discipline
+> **What every always-loaded token costs.** On Claude Code, your `CLAUDE.md`, memory, rules, MCP tool
+> schemas, and skill descriptions enter the context window at session start — paid on *every* turn.
+> Bloat doesn't just cost money on the API; it **dilutes the model's attention** (context distraction:
+> bigger context ≠ better answers). Context discipline keeps the always-loaded surface small,
+> scopes the rest to load only when relevant, and **enforces** secret/no-read boundaries in the
+> harness rather than trusting a prompt.
+> **Host-bound.** This practice targets the **Claude Code** host (syntax verified against current
+> behavior 2026-06-02). The *principles* (lean always-loaded context; scope-by-relevance; enforce-in-harness)
+> are host-neutral; the *mechanisms* (`permissions.deny`, `.claude/rules/`, hooks) are Claude-Code
+> specifics. On a different host/model, recalibrate — see the model-recalibration discipline. **Re-verify
+> the syntax below when the host changes**; flags and frontmatter formats drift.
+## Why (the failure modes it prevents)
+- **Context distraction** — past a threshold the model over-weights repeated/irrelevant context and
+  neglects its training; creativity and accuracy drop. Lean context is a *quality* lever, not just a
+  cost one.
+- **Secret leakage** — the model will sometimes read (or even edit) a `.env`/secrets file even when
+  told not to. A prompt is not a boundary. Beginners commit keys the model hardcoded.
+- **Stale always-on rules** — domain rules that load every session whether or not they're relevant
+  are pure overhead (and risk contradicting each other — context clash).
+## The four moves
+### 1. Keep the always-loaded docs lean
+- **`CLAUDE.md`**: only what would *genuinely surprise an experienced dev new to the repo* —
+  non-obvious build/test commands, against-default architecture decisions, project constraints. Cut
+  anything the model already knows from training (framework syntax, generic preambles) or could learn
+  by reading the code for 20 minutes. Rule of thumb: keep it tight (compliance drops past ~200 lines).
+- **Session-state docs** (e.g. a `RESUME.md`): keep a **recency window** of the most recent few
+  entries; let the full history live in the changelog it already maintains. Don't let an
+  append-forever log become the file you read at every session start.
+- `<!-- HTML comments -->` are stripped before injection (zero-token notes for humans).
+- `@path` imports are organizational only — all imported files still load at startup (no token saving).
+- `CLAUDE.local.md` (gitignored) holds personal/local notes. Edits to `CLAUDE.md` apply on
+  restart/`/compact`, not mid-session. Run `/context` and `/memory` to see what actually loaded.
+### 2. Scope rules to where they apply (`.claude/rules/`)
+Put domain-specific instructions in `.claude/rules/*.md` with `paths:` frontmatter so they load
+**only when the model touches a matching file** — JIT context instead of always-on:
+```markdown
+---
+paths:
+  - "{{SRC_GLOB}}"          # e.g. "src/api/**/*.ts"
+---
+# {{Area}} rules
+{{the rules that only matter for files under that path}}
+```
+Rules **without** `paths:` load at launch (a second always-loaded `CLAUDE.md`) — use that only for
+genuinely global rules. This is just-in-time support (Principle 2) applied to the context window.
+**Shipped instance (BOSS FEAT-020 Phase 1, v0.45.0):** the L0 and L1 templates now ship a
+`.claude/rules/` example so every `boss new` project is JIT-by-construction, not just
+deny-by-construction — L0 `your-app-code.md` (the basic path-scoped pattern), L1 `feature-context.md`
+(the live feature's working notes, which `/close` will later compress — FEAT-020 Phases 2-3). The
+durable-vs-working-state cut that decides what belongs here vs. always-loaded memory lives in
+`library/memory-seed/README.md`. Re-verified against the official Claude Code docs 2026-06-05: `paths:`
+is the correct key (not Cursor's `globs:`); path-scoped rules load when Claude reads a matching file,
+not at session start.
+### 3. Enforce no-read boundaries in the harness, not the prompt
+Secrets and noise get a **hard block** via `permissions.deny` in `.claude/settings.json` — verified
+Claude Code glob syntax (`./` = relative to cwd; `**` = any depth):
+```json
+{
+  "permissions": {
+    "deny": [
+      "Read(./.env)", "Read(./.env.*)", "Read(./secrets/**)",
+      "Bash(cat ./.env*)", "Bash(cat ./secrets/*)",
+      "Read(./node_modules/**)", "Read(./dist/**)", "Read(./build/**)", "Read(*.lock)"
+    ]
+  }
+}
+```
+- **A `Read(...)` deny does NOT block Bash** (`cat .env` still works) — add the `Bash(...)` rules too.
+- **There is no `.claudeignore` file** in Claude Code (a common myth). `permissions.deny` is the
+  mechanism; `.gitignore` is separate and only stops commits, not reads.
+- For coverage that also catches MCP tools and skills added later, a **PreToolUse hook** can reject
+  any tool call touching a secret path (exit code `2`, or JSON `permissionDecision: "deny"`). **But
+  weigh the cost:** a `PreToolUse` hook fires on *every tool call* (a process spawn per call — real
+  latency), where the deny-list is a zero-cost native check. So: the **deny-list is the universal
+  floor** (always ship it); a **secrets-guard hook is a high-stakes ceiling** — reserve it for
+  regulated/PHI work or make it opt-in, don't impose per-call overhead on every project by default.
+  A real secret manager is beyond both. (Cost discipline: don't add always-on machinery for marginal
+  coverage — the framework BOSS warns founders against becoming.)
+  - **BOSS ships this hook dormant** as `.claude/hooks/secrets-guard.js` (canonical in
+    `library/hooks/secrets-guard.js`): Read/Edit of a secrets file → **deny**, Bash/MCP referencing
+    one → **ask**, else allow; fail-open. It is **not registered by default** (an unregistered hook
+    costs nothing — registration is the on-switch). Turn it on by adding the `PreToolUse` block in the
+    file header. **Recommended for the `domain-expert` / regulated cohort.**
+### 4. Filter noisy tool output before it enters context
+A **PostToolUse hook** can compress a 10k-line build/test log to a short error summary before it
+reaches the model — the model reasons over the summary, not the firehose.
+## The test
+*Would this token survive an experienced dev asking "does the model actually need this, here, every
+turn?"* If not, cut it, scope it, or block it. Lean context is faster, cheaper, **and sharper**.
+## Sources / how this was vetted
+Vetted via `/vet` (the skeptical inbox), not adopted on popularity — see `docs/research/verdicts/`
+RVW-005 (deny secrets), RVW-010 (token optimization), with RVW-002 (lean session docs), RVW-009
+(context-engineering failure modes), RVW-012 (enforce-in-harness). The `.claudeignore` claim was
+**rejected at verification** — it does not exist. Re-verify all Claude Code syntax on host change.

package/library/practices/design-system.md ADDED Viewed

@@ -0,0 +1,152 @@
+# Practice: Design system — style never locked into code
+> Generalized from dhun's design system (DESIGN_TOKENS single source of truth, central badge/pill
+> style utils, "no raw Tailwind colors" enforcement hook, Rangoli generative styles, prototype
+> REGISTRY). De-dhuned for reuse. Lands in **V1 mode**; seeds the moment a project grows real UI.
+>
+> **v0.20.x update:** AI-failure-mode catalog added below. The classical practice survives;
+> what AI-assisted building adds is a set of failure modes that *happen by default* when a
+> founder asks Claude for UI work without design discipline. See [IDEA-010](../../docs/ideas/IDEA-010-scalable-ai-design.md)
+> for the BOSS-specific design (loops, cohort-aware scaffolding, prompt patterns) — that's the
+> live spec; this practice doc is the always-true ground.
+## AI-failure-mode catalog (added v0.20.x)
+When founders ask AI (Claude / Cursor / Lovable / v0) to "build me a UI" without design
+discipline, these failure modes appear by default. Naming them is half the fix:
+| Failure | What it looks like | Prevention |
+|---|---|---|
+| **Rudimentary first design** | AI generates generic-internet defaults (Tailwind blue-500, default spacing, no brand). "Looks ok" at one screen; falls apart by three. | Brand-anchor the first prompt from the canvas Promises cell, not from "make it look good." |
+| **The 47 blues** | Each new screen, AI derives slightly different colors. `bg-blue-500`, `#3B82F6`, `bg-blue-600`, custom variables — all in the same codebase. No single source of truth. | Tokens file FIRST, before second screen. Reference tokens *by name* in every prompt. |
+| **Pattern reinvention** | Each new component is a new file. `Button.tsx` → `CTAButton.tsx` → `PrimaryButton.tsx` — all near-identical. AI doesn't search for the existing pattern. | Prompt convention: *"search components/ for similar; reuse first, extend second, create last."* |
+| **Billion-line drift** | Code grows linearly with screens instead of approximately constant after primitives are built. AI never generalizes across requests. | Token system + reuse-first prompting *together*. Either alone is insufficient. |
+| **Missing states** | Default/hover/active/disabled/empty/loading — at least one always missing. Especially empty + loading (the most user-facing failures). | Five-state requirement enforced at prompt level — name the states before AI gets a chance to skip them. |
+| **Brand-default problem** | AI defaults to generic-internet aesthetics because that's the training data. Your brand voice never makes it in unless you bring it. | Canvas Promises cell becomes the design brief, not "make it pretty." |
+### The field's published understanding (2025-2026)
+The AI-design-failure-mode literature is more developed than founders typically realize:
+- **Boldare** — *Design System for AI-Assisted Development* — failure modes named: context
+  loss, token ignorance, brand-default problem.
+- **uxmagic.ai** — *Can AI Follow Design Tokens? The Honest Answer* — direct treatment.
+- **Mageswari (Medium)** — *AI Design Systems: Why Tokens, Schema & Generative Rules Matter
+  Now* — articulates the three-layer token architecture and the "semantic translator" AI
+  needs.
+- **W3C Design Tokens Community Group** — the canonical format spec.
+- **Brad Frost (Atomic Design)** + **Nathan Curtis (design tokens layer-cake)** — foundational
+  pre-AI work that the AI-failure analysis extends.
+### The minimum AI-tolerant architecture
+From the field consensus: **three-layer tokens** (primitives → semantic → component), not
+two. Two layers is fragile under AI generation — the AI takes the easier path and hex-codes
+escape. Three layers gives the AI a semantic name to grab (`color.action.primary` not
+`blue.500`) so the token system survives generation.
+### Cohort-aware scaffolding (added v0.20.x; aligns with v0.20 cohort-aware conscience)
+The intervention *shape* varies per cohort (per `.boss/config.json` cohort declaration):
+- `vibe-coder-newbie` / `first-product` — **SHOW**: scaffold a minimal `DESIGN_TOKENS.md` +
+  one example component refactored. The teaching IS the intervention.
+- `eng-builder` / `returning-founder` — **OFFER**: "want me to scaffold the three-layer
+  token system now or later?" Skip the 101.
+- `vibe-virtuoso` — **OVERRIDE-FRIENDLY**: "you know this; here's the override pattern."
+- `indie-hacker` — **RIGHT-SIZED**: minimum portable system; no stack lock-in.
+- `non-tech-founder` / `domain-expert` — **PLAIN-LANGUAGE COACH**: describe the failure
+  that's coming if you don't do this; offer the fix.
+## Aesthetic ambition — past the slop default (added v0.61.0)
+> Adapted from Anthropic's own `frontend-design` skill via [RVW-014](../../docs/research/verdicts/RVW-014-frontend-design-aesthetic-ambition.md).
+> The failure-mode catalog above is the *discipline* axis — don't drift. This is the *taste* axis —
+> don't be generic. They are different failures: "the 47 blues" is drift; "AI slop" is genericness.
+> A codebase can be perfectly token-disciplined and still look like every other AI-built app.
+AI defaults to the mean of its training data, so unprompted it ships the same interface everyone
+else gets: Inter or Roboto, a purple gradient, a centered card on a gray background, motion that
+isn't there. It reads as *fine* on one screen and as *forgettable* by the third. Naming the slop is
+half the cure — the founder has to *ask* for character, because the model won't volunteer it.
+**The sharper framing (RVW-052): AI-default isn't just generic — it's *indistinguishable from every
+competitor*.** When Tailwind shipped `bg-indigo-500` as a default, its own creator later apologized that
+"every AI-generated UI on earth" went indigo — because the models all converged on the same shadcn/Tailwind
+default. Ship that default and you ship something a user can't tell apart from the ten other tools they
+tried this week. *Build faster ≠ build sameness.* The move: **spend the time the AI just saved you on the
+~5% that's actually yours** — the brand, the voice, the one memorable thing — instead of banking the speed
+and shipping the mean. That 5% is the whole distinctiveness pass below; it's where the saved hours should go.
+**The load-bearing line:** *intentionality, not intensity.* Both bold maximalism and refined
+minimalism work — what fails is the absence of a decision. For a first-time founder, **minimalism
+done precisely is the safer bet than maximalism done loosely** — restraint hides fewer mistakes.
+**A design-thinking pre-pass, before the first UI prompt** (one paragraph, not a document): who is
+this for, what should it feel like, and what's the one thing that should make it memorable? Feed
+that — not "make it look good" — into the prompt. (It's the same brand-anchor move the failure-mode
+catalog prescribes for "rudimentary first design," pointed at taste instead of tokens.)
+Five dimensions worth a deliberate choice (each is a prompt instruction, not a vibe):
+| Dimension | The generic default to escape | The intentional move |
+|---|---|---|
+| **Typography** | Inter / Roboto / Arial, one weight | A distinctive pairing chosen for the product's tone; weight + scale as hierarchy |
+| **Color & theme** | Purple gradient; timid mid-grays | One committed palette in CSS variables; a dominant color with sharp accents |
+| **Motion** | None, or easing on everything | A few high-impact moments — staggered load reveal, scroll-triggered — not motion-everywhere |
+| **Spatial composition** | Centered card, even grid | Asymmetry, overlap, diagonal flow, deliberate grid-breaks |
+| **Visual detail** | Flat fills | Gradients, texture, atmosphere — *matched to* the aesthetic, not sprinkled on |
+**The restraint that bounds the ambition (non-negotiable, even maximalist):** the failure-mode
+catalog and the five-state requirement still hold. Accessibility (contrast, focus, reduced-motion),
+the five states, and performance are floors, not trade-offs — a striking interface that fails
+contrast or drops loading states is still broken. Ambition rides *on top of* the discipline; it
+never substitutes for it. (This bound is the BOSS-specific adaptation; the source skill leans
+maximalist, which is unsafe advice for a green founder.)
+**Cohort-aware, same as the discipline axis:** `first-product`/`vibe-coder-newbie` can't yet *see*
+the slop — SHOW them one before/after so the eye gets trained. `eng-builder`/`vibe-virtuoso` have
+the eye but skip the pre-pass — OFFER the design-thinking prompt, skip the lecture.
+`non-tech-founder`/`domain-expert` — translate "memorable" into their domain's language.
+Lands at **V1**, with the rest of the design layer — the moment a UI is worth keeping is the moment
+genericness starts to cost.
+## The principle (PRINCIPLES.md #3)
+Style is reusable structure, so it must not get buried in implementation. Extract it into a
+**single source of truth** the app *and* prototypes both consume. The test: could a prototype or a
+sibling project reuse this design approach without copy-pasting component code? If not, it's locked.
+## What the design layer establishes
+1. **Design tokens — one source of truth.** Color, type, spacing, radius, elevation, motion as
+   named tokens (`DESIGN_TOKENS.md` + a machine format the code imports). Code references token
+   names, never raw values. Renaming/retheming happens in one place.
+2. **Style guide.** How the tokens compose into patterns: components, states, density, voice. The
+   "why," not just the "what."
+3. **Central style utilities.** Shared helpers for recurring decorated elements (badges, pills,
+   chips) live in one util with a colour budget — never ad-hoc per surface. (dhun: `badgeStyles.ts`,
+   pill governance, 4-colour ceiling per surface.)
+4. **Five-state requirement.** Every component specifies default / hover / active / disabled /
+   empty (and loading where relevant). Missing states are the most common drift.
+5. **Prototype reuse.** Prototypes import the *same* tokens + a component registry, so a mockup
+   looks like the product and graduates to code cleanly. (dhun: prototype `REGISTRY.md`.)
+## Enforcement — just-in-time
+- **Quickstart / MVP:** no design enforcement. Hardcoded styles in a throwaway are fine; don't
+  impose ceremony unearned. But the *moment* a UI is worth keeping, create the tokens file so style
+  is decoupled from the very first commit that matters.
+- **V1:** enforcement turns on. A `PostToolUse` hook flags hardcoded style values (e.g. raw colour
+  classes) and points at the token instead. `/design-review` before code, `/ux-check` after.
+  Agents `ui-designer` (token/visual authority) + `ux-designer` (flows, the 5 states) unlock here.
+- **Scale:** design drift audits, token versioning, multi-surface theming.
+## To author (when V1 mode is built)
+- `template/docs/design/DESIGN_TOKENS.md` (+ a tokens file in the chosen stack's format)
+- `template/docs/design/STYLE_GUIDE.md`, component-audit + state checklist
+- `template/.claude/skills/design-review/`, `ux-check/`, hook for hardcoded-style detection
+- `ui-designer` + `ux-designer` agents
+- a prototype registry + the rule that prototypes consume the token system

package/library/practices/git-workflow.md ADDED Viewed

@@ -0,0 +1,119 @@
+---
+id: PRACTICE-git-workflow
+type: practice
+owner: mentor-architect
+status: active
+host: stack-neutral
+provenance: distilled from the 2026-06-20 founding-teams research (RESEARCH-COMPENDIUM-2026-06-20 Part B5 — dev process / git workflow) — DORA/Accelerate [EVIDENCE], Addy Osmani on AI code review, METR n=16 perception-gap [EVIDENCE], the worktree-as-parallelism-primitive practitioner pattern — BOSS v0.87.0, FEAT-023 thread 1
+---
+# Practice — Git workflow for AI-native building (trunk-based, review-bounded)
+> **The shape of the problem.** AI didn't change what good version control is — it changed which part
+> hurts. The bottleneck used to be *writing* the code; now an agent writes it ~4× faster and you deliver
+> maybe ~12% more, because the new bottleneck is **review**. So the whole discipline reorients around one
+> question: *how much can two humans actually read and stand behind in a day?* Everything below is in
+> service of keeping the batches small enough, and the ownership clear enough, that the answer stays
+> honest. (DORA's 2025 line: **AI amplifies what's already there** — install the fundamentals *before*
+> you point agents at the repo, or the agent amplifies the mess.)
+## Trunk-based is the default (and the highest-leverage fundamental)
+For 2–5 people, commit to `main` or branches that live **hours, not days**, and merge daily. DORA's
+[EVIDENCE]: trunk-based teams are ~2.3× more likely to be elite performers, and speed and stability are
+**not** a trade-off — the same practice buys both. The rule of thumb: **fewer than 3 active branches**,
+each short-lived, integrated continuously.
+- **`/smoke` is the gate that makes trunk-based safe.** Daily merges to a green `main` only work if "is
+  the app even alive" is one command away. Run it before every commit; a red smoke is information, not a
+  failure — fix it or document the regression.
+- **CI is a *practice*, not a platform.** A two-person team's `/smoke` check **is** its CI. Keep `main`
+  green and merge small — that's the whole discipline. Add the hosted pipeline when surface area grows,
+  not before (premature ceremony — Principle #2). Don't cargo-cult a 12-stage GitHub Actions matrix onto
+  a repo two people share.
+## Worktrees are the AI-parallelism primitive — capped at your review capacity
+The native way to run agents in parallel is **one git worktree per task**: each agent works in its own
+checkout of the same repo, you review and merge each to trunk as it lands. But the cap is the thing most
+people get wrong:
+> **Cap parallel agents at ≈2–4 — and that number is your *review* capacity, not your *agent* count.**
+You can spawn ten agents. You cannot read ten diffs well. The constraint that matters is how many
+parallel streams of agent output a human can actually review to the point of standing behind them — and
+that's small. Fan out into 2–4 worktrees, merge to trunk daily, and treat the review budget as the hard
+limit. More agents than you can review isn't throughput; it's unreviewed code with your name on the merge.
+- **Vertical slices keep the worktrees from colliding.** Give each agent a feature end-to-end — a `FEAT`
+  is a natural slice — so the worktrees touch different code and merge cleanly. Informal ownership ("you
+  take checkout, I take auth") beats a formal locking scheme at this size.
+## Risk-tiered review, not blanket gates
+Blanket multi-approver review is what *kills* small batches (DORA names it directly; agentic PRs already
+sit ~5.3× longer before pickup). The answer isn't less review — it's review **proportioned to risk**.
+- **Whoever clicks merge owns what the agent wrote** (Addy Osmani). This is the load-bearing line. The
+  agent is not accountable; the human who merged it is. That ownership is what keeps "the AI wrote it"
+  from becoming an excuse.
+- **Read the test diff *harder* than the code.** This is the AI-specific trap: an agent under pressure to
+  make tests pass will quietly **rewrite the assertions to match the broken behaviour**. The code looks
+  clean, the suite is green, and the test now certifies the bug. When you review an agent's change, read
+  the test changes first and ask *did the behaviour get fixed, or did the expectation get lowered?*
+- **The tiers.** Low-risk (copy, styling, an isolated pure function, a reversible config) — a single
+  glance, or let the gate carry it. High-risk (anything touching auth, money, data migrations, deletes,
+  deploys, or an AI-mediated control path) — **the *other* human reviews it**, not the one who prompted
+  it. BOSS's `/smoke` + `/evals` + `/red-team` **are** the high-risk tier: smoke proves it's alive, evals
+  prove the AI path is correct, red-team proves the defences hold. Wire them in as the gate for the
+  irreversible, and leave the cheap stuff cheap.
+## Mob the hard problems (the questioning reflex degrades)
+There's an [EVIDENCE] finding (partly student-population, so held loosely): with an AI as the pair,
+people **question the suggestion less** and accept subtly-wrong code more — the second set of eyes a human
+pair used to provide gets replaced by an agent that doesn't push back. So for genuinely hard or novel
+problems, **put the two humans *and* the agent on it together** rather than one founder + agent in a
+worktree. The conscience's nudge here is narrow and rare: at a **high-stakes accept-without-a-second-look**
+— an irreversible change merged on a single glance — surface the question, never gate the merge.
+## The honesty anchor (don't sell yourself the speedup)
+**METR, n=16 [EVIDENCE]:** experienced developers working on *mature* repositories they knew well were
+**19% slower** with AI — while *believing* they were 20% faster. The perception gap is the point. This is
+the *opposite* population to a greenfield startup (where AI's speedup is real and large), so the lesson
+is **not** "AI is slower." The lesson is that your *felt* sense of velocity is an unreliable instrument —
+which is exactly why the batch stays small and the review stays real. Trust the green `main` and the
+merged diff, not the feeling of moving fast. (This is the build-process echo of BOSS's whole reason to
+exist: faster building doesn't make being wrong any cheaper.)
+## Ownership = prompt-author intent + reviewer acceptance
+The blameless-but-accountable answer to *"who owns the AI's bug?"* — and the one principle this practice
+shares verbatim with the founding-team layer (FEAT-021's `/decide` and credit work):
+> **Ownership is the prompt-author's intent plus the reviewer's acceptance.** The agent is the
+> instrument. The person who asked for the change owns the *intent*; the person who merged it owns the
+> *acceptance*. When those are the same person (solo, or a self-merge), they own both. There is no third
+> party to blame — which is the entire point of writing it down.
+State it once; both this practice and the team layer reference it.
+## Altitude / JIT (don't front-load it)
+This is **not** a Quickstart lecture. A founder dropping an idea into `/prototype` doesn't need a
+branching policy. The discipline earns its place at **MVP**, when there's a real `main` to keep green and
+a `/smoke` gate to anchor it — which is why the DOWN of this practice is a tight section in the L1/MVP
+working rules, not a wall of git theory. The worktree cap surfaces the first time a founder reaches for
+parallel agents; the risk-tiered review surfaces the first time a change touches an irreversible surface.
+Right ceremony, right rung (Principle #2).
+## Relationship to BOSS
+BOSS already ships most of the *mechanism* — `/smoke` (the trunk-safe gate), `/evals` and `/red-team`
+(the high-risk tier), `quality-ratchet` (keep `main` from backsliding), and `FEAT-NNN` (the vertical
+slice). This practice is the **operating discipline that wires them into a daily rhythm**, plus the two
+AI-specific cautions a pre-2025 git guide wouldn't have: the **review cap** on parallelism and **reading
+the test diff harder than the code**. A CLI `boss worktree` helper is a *possible* DOWN later — only once
+the by-hand pattern has earned it (Principle #1). See [`quality-ratchet.md`](quality-ratchet.md),
+[`agent-security.md`](agent-security.md) (the irreversible-action gate), and FEAT-023.