bossbuild 0.97.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/PRINCIPLES.md +70 -0
- package/README.md +213 -0
- package/VERSION +1 -0
- package/bin/boss +3 -0
- package/library/README.md +19 -0
- package/library/agents/.gitkeep +0 -0
- package/library/agents/mentor-venture.md +57 -0
- package/library/hooks/.gitkeep +0 -0
- package/library/hooks/auto-log.js +133 -0
- package/library/hooks/memory-cue.js +82 -0
- package/library/hooks/secrets-guard.js +87 -0
- package/library/memory-seed/README.md +29 -0
- package/library/memory-seed/durable-facts-example.md +16 -0
- package/library/practices/.gitkeep +0 -0
- package/library/practices/agent-security.md +111 -0
- package/library/practices/ai-adoption-culture.md +104 -0
- package/library/practices/ai-ux-patterns.md +246 -0
- package/library/practices/celebration-of-done.md +100 -0
- package/library/practices/conscience-voicing.md +121 -0
- package/library/practices/context-discipline.md +116 -0
- package/library/practices/design-system.md +152 -0
- package/library/practices/git-workflow.md +119 -0
- package/library/practices/harm-taxonomy.md +45 -0
- package/library/practices/quality-ratchet.md +48 -0
- package/library/practices/revalidation.md +57 -0
- package/library/practices/scalable-architecture.md +111 -0
- package/library/practices/ship-it-live.md +149 -0
- package/library/practices/skill-authoring.md +70 -0
- package/library/skills/.gitkeep +0 -0
- package/library/skills/boss-learn/SKILL.md +63 -0
- package/library/skills/boss-sync/SKILL.md +48 -0
- package/package.json +49 -0
- package/registry/CHANGELOG.md +2737 -0
- package/src/board.js +655 -0
- package/src/brain.js +288 -0
- package/src/cli.js +542 -0
- package/src/conscience.js +426 -0
- package/src/insights.js +147 -0
- package/src/learn.js +92 -0
- package/src/map.js +103 -0
- package/src/modes.js +82 -0
- package/src/paths.js +36 -0
- package/src/registry.js +34 -0
- package/src/scaffold.js +138 -0
- package/src/sync.js +292 -0
- package/src/team.js +103 -0
- package/stages/L0-quickstart/manifest.json +12 -0
- package/stages/L0-quickstart/template/.claude/agents/coder-generalist.md +31 -0
- package/stages/L0-quickstart/template/.claude/agents/mentor-venture.md +57 -0
- package/stages/L0-quickstart/template/.claude/agents/pm.md +28 -0
- package/stages/L0-quickstart/template/.claude/hooks/conscience.js +89 -0
- package/stages/L0-quickstart/template/.claude/hooks/lib/loop-runtime.js +507 -0
- package/stages/L0-quickstart/template/.claude/hooks/lib/yaml.js +163 -0
- package/stages/L0-quickstart/template/.claude/hooks/memory-cue.js +82 -0
- package/stages/L0-quickstart/template/.claude/hooks/secrets-guard.js +87 -0
- package/stages/L0-quickstart/template/.claude/rules/your-app-code.md +17 -0
- package/stages/L0-quickstart/template/.claude/settings.json +36 -0
- package/stages/L0-quickstart/template/.claude/skills/boss/SKILL.md +161 -0
- package/stages/L0-quickstart/template/.claude/skills/boss-learn/SKILL.md +63 -0
- package/stages/L0-quickstart/template/.claude/skills/boss-sync/SKILL.md +55 -0
- package/stages/L0-quickstart/template/.claude/skills/canvas/SKILL.md +112 -0
- package/stages/L0-quickstart/template/.claude/skills/comprehend/SKILL.md +72 -0
- package/stages/L0-quickstart/template/.claude/skills/decide/SKILL.md +122 -0
- package/stages/L0-quickstart/template/.claude/skills/feedback/SKILL.md +68 -0
- package/stages/L0-quickstart/template/.claude/skills/import/SKILL.md +73 -0
- package/stages/L0-quickstart/template/.claude/skills/persona/SKILL.md +92 -0
- package/stages/L0-quickstart/template/.claude/skills/prototype/SKILL.md +114 -0
- package/stages/L0-quickstart/template/.claude/skills/triage/SKILL.md +104 -0
- package/stages/L0-quickstart/template/.claude/skills/welcome/SKILL.md +262 -0
- package/stages/L0-quickstart/template/AGENTS.md +31 -0
- package/stages/L0-quickstart/template/CLAUDE.md +57 -0
- package/stages/L0-quickstart/template/docs/IDS.md +42 -0
- package/stages/L0-quickstart/template/docs/ideas/INDEX.md +24 -0
- package/stages/L0-quickstart/template/docs/loops/canvas-loop.md +90 -0
- package/stages/L0-quickstart/template/docs/loops/capture-loop.md +64 -0
- package/stages/L1-mvp/manifest.json +12 -0
- package/stages/L1-mvp/template/.claude/agents/mentor-architect.md +124 -0
- package/stages/L1-mvp/template/.claude/agents/mentor-cofounder.md +85 -0
- package/stages/L1-mvp/template/.claude/agents/mentor-gtm.md +49 -0
- package/stages/L1-mvp/template/.claude/agents/program-manager.md +46 -0
- package/stages/L1-mvp/template/.claude/agents/tester.md +42 -0
- package/stages/L1-mvp/template/.claude/hooks/auto-log.js +133 -0
- package/stages/L1-mvp/template/.claude/rules/feature-context.md +18 -0
- package/stages/L1-mvp/template/.claude/skills/ai-cost/SKILL.md +249 -0
- package/stages/L1-mvp/template/.claude/skills/ai-failure-states/SKILL.md +226 -0
- package/stages/L1-mvp/template/.claude/skills/ai-first-init/SKILL.md +227 -0
- package/stages/L1-mvp/template/.claude/skills/close/SKILL.md +170 -0
- package/stages/L1-mvp/template/.claude/skills/consult/SKILL.md +72 -0
- package/stages/L1-mvp/template/.claude/skills/cost-review/SKILL.md +204 -0
- package/stages/L1-mvp/template/.claude/skills/design-tokens-init/SKILL.md +192 -0
- package/stages/L1-mvp/template/.claude/skills/drift-deep/SKILL.md +170 -0
- package/stages/L1-mvp/template/.claude/skills/evals/SKILL.md +154 -0
- package/stages/L1-mvp/template/.claude/skills/extract/SKILL.md +209 -0
- package/stages/L1-mvp/template/.claude/skills/judge-traces/SKILL.md +68 -0
- package/stages/L1-mvp/template/.claude/skills/log/SKILL.md +64 -0
- package/stages/L1-mvp/template/.claude/skills/practice/SKILL.md +92 -0
- package/stages/L1-mvp/template/.claude/skills/pretotype/SKILL.md +95 -0
- package/stages/L1-mvp/template/.claude/skills/red-team/SKILL.md +137 -0
- package/stages/L1-mvp/template/.claude/skills/revalidate/SKILL.md +51 -0
- package/stages/L1-mvp/template/.claude/skills/ship/SKILL.md +105 -0
- package/stages/L1-mvp/template/.claude/skills/smoke/SKILL.md +43 -0
- package/stages/L1-mvp/template/.claude/skills/spec/SKILL.md +145 -0
- package/stages/L1-mvp/template/claude-append.md +122 -0
- package/stages/L1-mvp/template/docs/loops/ai-failure-state-loop.md +107 -0
- package/stages/L1-mvp/template/docs/loops/coordination-loop.md +116 -0
- package/stages/L1-mvp/template/docs/loops/cost-budget-loop.md +117 -0
- package/stages/L1-mvp/template/docs/loops/cost-review-loop.md +113 -0
- package/stages/L1-mvp/template/docs/loops/design-tokens-loop.md +98 -0
- package/stages/L1-mvp/template/docs/loops/drift-loop.md +149 -0
- package/stages/L1-mvp/template/docs/loops/extraction-loop.md +128 -0
- package/stages/L1-mvp/template/docs/loops/focus-loop.md +106 -0
- package/stages/L1-mvp/template/docs/loops/pretotype-loop.md +88 -0
- package/stages/L1-mvp/template/docs/loops/spec-loop.md +83 -0
- package/stages/L2-v1/manifest.json +12 -0
- package/stages/L2-v1/template/.claude/agents/db-architect.md +91 -0
- package/stages/L2-v1/template/.claude/agents/mentor-business.md +124 -0
- package/stages/L2-v1/template/.claude/agents/mentor-fundraising.md +72 -0
- package/stages/L2-v1/template/.claude/agents/mentor-pitch.md +84 -0
- package/stages/L2-v1/template/.claude/agents/mentor-talent.md +84 -0
- package/stages/L2-v1/template/.claude/agents/ui-designer.md +81 -0
- package/stages/L2-v1/template/.claude/agents/ux-designer.md +87 -0
- package/stages/L2-v1/template/.claude/skills/board/SKILL.md +98 -0
- package/stages/L2-v1/template/.claude/skills/design-review/SKILL.md +77 -0
- package/stages/L2-v1/template/.claude/skills/ux-check/SKILL.md +93 -0
- package/stages/L2-v1/template/claude-append.md +59 -0
- package/stages/L2-v1/template/docs/loops/design-drift-loop.md +108 -0
- package/stages/L3-scale/README.md +13 -0
|
@@ -0,0 +1,2737 @@
|
|
|
1
|
+
# BOSS Changelog
|
|
2
|
+
|
|
3
|
+
Each entry = a BOSS version. `/boss-sync` reads this to tell a project what's new since its pin.
|
|
4
|
+
|
|
5
|
+
## 0.97.0 — 2026-06-23
|
|
6
|
+
|
|
7
|
+
- **Rebrand: "BlueprintOS" → BOSS (Build Out Solid Stuff).** The name "BlueprintOS" collided with
|
|
8
|
+
[gj1342/blueprint-os](https://github.com/gj1342/blueprint-os) (an "OS for AI agents" using the same
|
|
9
|
+
skills/specs-as-markdown vocabulary), so the long form is retired and the earned **BOSS** wordmark
|
|
10
|
+
re-founded: **B.O.S.S. = Build Out Solid Stuff**, slogan **"Make it real."** (the anti-pseudo-app thesis
|
|
11
|
+
in three words). Dropped the "Operating System" framing entirely — BOSS is an incubator with a conscience,
|
|
12
|
+
not an OS. **Renamed across the shipped surface:** npm package `blueprintos` → **`bossbuild`**, repo URLs →
|
|
13
|
+
`github.com/ajeshh/bossbuild`, README / PRINCIPLES / docs headers, scaffold templates ("Scaffolded by
|
|
14
|
+
BOSS"), and CLI version/working-rules strings. **The `boss` command, the voice, the `mentor-*` roster, and
|
|
15
|
+
PRINCIPLES are unchanged** (package ≠ command — the bin name is always ours). Decision recorded in DEC-002;
|
|
16
|
+
full identity in `docs/design/BRAND.md`. Easter-egg full forms: *Builds, Or Stays Silent* · *Bosses Only
|
|
17
|
+
Self-Sabotage*. **Next:** GitHub repo rename, npm publish as `bossbuild`, Homebrew tap, domain `boss.build`.
|
|
18
|
+
|
|
19
|
+
## 0.96.0 — 2026-06-21
|
|
20
|
+
|
|
21
|
+
- **First `/humane-refresh` sweep, pass 2 → the cohort & frontier patterns (RVW-059–064).** The second
|
|
22
|
+
`/deep-research` pass (111 agents, 24/25 claims 3-vote verified) covered the families the AI-chatbot lens
|
|
23
|
+
and the classic-web checklist both miss. Folded into
|
|
24
|
+
[`ai-ux-patterns.md`](../library/practices/ai-ux-patterns.md) as a new **"cohort & frontier patterns"**
|
|
25
|
+
block (conditional — surfaces when the product touches that surface, not on every Quickstart):
|
|
26
|
+
- **Accessibility / exclusion-by-design (RVW-059)** — an inaccessible flow (unlabeled element, inaccessible
|
|
27
|
+
CAPTCHA, screen-reader-invisible pre-ticked box) is *"effectively deceptive"* to anyone who can't perceive
|
|
28
|
+
or escape it, **even unintentionally** — the sharpest case of "effect, not intent." WCAG is the floor, not
|
|
29
|
+
the ceiling; test deceptiveness *under assistive tech*. (CHI/CSCW '25.)
|
|
30
|
+
- **Minors & vulnerable-by-design (RVW-061)** — price in real currency + disclose odds (loot-box/gacha
|
|
31
|
+
currency-laundering; FTC's $20M Genshin action), ship addictive-design OFF by default for minors
|
|
32
|
+
(streaks/autoplay/push — EU DSA), age assurance not age-gate theater. (UK Children's Code, statutory.)
|
|
33
|
+
- **Agentic dark patterns (RVW-060)** — *the keystone for BOSS, which builds agents.* Two directions: your
|
|
34
|
+
agent as **perpetrator** (commit/purchase without consent, over-broad permissions → scope + confirm, §4's
|
|
35
|
+
risk gate) and your agent as **victim** (manipulated by UI dark patterns **70%+ vs 31% for humans, worse
|
|
36
|
+
as models scale** — Stanford DECEPTICON / OWASP Agentic 2026). The victim half also hardens
|
|
37
|
+
[`agent-security.md`](../library/practices/agent-security.md): UI dark patterns are an injection surface,
|
|
38
|
+
and **recognition ≠ protection** (in-context prompting, guardrail models, *and human oversight* were each
|
|
39
|
+
shown insufficient — don't oversell a fix).
|
|
40
|
+
- **Algorithmic management (RVW-063)** — when a product scores/ranks/pays people, opaque unpredictable
|
|
41
|
+
scoring + "algorithmic gamblification" is the worker-facing dark pattern → transparent predictable
|
|
42
|
+
formula. (HRW *The Gig Trap* 2025.)
|
|
43
|
+
- **Junk fees (RVW-062)** sharpens the v0.95 drip-pricing line — the all-in total must be the *most
|
|
44
|
+
prominent* price (FTC Junk Fees Rule). Regulatory-teeth pointer extended (Genshin, EU DSA minors, UK
|
|
45
|
+
Children's Code, ADA/EAA) **with an honest caveat that the FTC click-to-cancel rule was vacated (8th Cir.
|
|
46
|
+
2025).**
|
|
47
|
+
- **Sludge (RVW-064) held NOT-YET** — Family 1 returned zero verified claims (its *mechanics* already
|
|
48
|
+
landed under Obstruction in v0.95); needs a dedicated pass, alongside the click-to-cancel status. Watchlist
|
|
49
|
+
edges updated.
|
|
50
|
+
- *Founder-facing, project-neutral; pulled via `/boss-sync`.*
|
|
51
|
+
|
|
52
|
+
## 0.95.0 — 2026-06-21
|
|
53
|
+
|
|
54
|
+
- **`/humane-refresh` — the humane lens stops being a snapshot (IDEA-042).** BOSS's dark-pattern catalog was
|
|
55
|
+
adopted once (the CDT 37, v0.82) and frozen. Dark patterns are an arms race — new research and regulation
|
|
56
|
+
name them, new models open new emergent surfaces, new tools (generative UI, agents-on-your-behalf) create
|
|
57
|
+
surfaces nobody's named. So the catalog now has a **standing freshness discipline** instead of a one-time
|
|
58
|
+
read. The staleness-twin of model recalibration ([IDEA-014](../docs/ideas/IDEA-014-model-recalibration-discipline.md));
|
|
59
|
+
they share trigger events.
|
|
60
|
+
- **New skill [`.claude/skills/humane-refresh`](../.claude/skills/humane-refresh/SKILL.md)** (a
|
|
61
|
+
BOSS-curating-BOSS meta-skill, sits with `/vet`/`/boss-learn`/`/boss-sync`). It *orchestrates skills that
|
|
62
|
+
already exist* — `/deep-research` **finds** (scoped since last run) → diff against the live catalog →
|
|
63
|
+
`/vet` **judges** (default NO) → `/boss-learn` **routes**. Three triggers: on-demand, quarterly via
|
|
64
|
+
`/schedule`, or `--event` for a new model / tool / regulation. Honest about what it can't auto-detect
|
|
65
|
+
(the event trigger is you running it, not performed magic). **Re-runs research, never freezes a list**
|
|
66
|
+
(the RVW-001 anti-rot guard made into a skill).
|
|
67
|
+
- **New [watchlist](../docs/research/watchlists/humane-lens.md)** — the focused half of `SOURCES.md`: the
|
|
68
|
+
regulators + canonical taxonomies it lacked (FTC, EDPB, CPPA, DSA, Brignull, Mathur, CHI/SOUPS/FAccT),
|
|
69
|
+
the standing query, and a `last_refresh` marker so each sweep scopes to "since last time."
|
|
70
|
+
- **First sweep, pass 1 → the classic-web patterns the AI-chatbot lens omits (RVW-056/057).** Two
|
|
71
|
+
`/deep-research` passes, adversarially verified (3-vote). Folded into
|
|
72
|
+
[`ai-ux-patterns.md`](../library/practices/ai-ux-patterns.md):
|
|
73
|
+
- **Four named families an AI product inherits the moment it has an account, a paywall, or a checkout** —
|
|
74
|
+
**Obstruction** (roach-motel / un-deletable accounts, hard-to-cancel), **Sneaking** (sneak-into-basket,
|
|
75
|
+
hidden costs, drip pricing), **Manufactured urgency & scarcity** (fake timers/low-stock), **Interface
|
|
76
|
+
interference** (visual misdirection, trick questions, bad defaults) — each with its humane alternative.
|
|
77
|
+
Canon pinned (Brignull / Mathur / Gray), not enumerated (anti-freeze).
|
|
78
|
+
- **"Effect, not intent"** — a dark pattern needs no malice (CCPA judges by effect); you can ship one by
|
|
79
|
+
accident, which is exactly where the conscience earns its keep.
|
|
80
|
+
- **"Symmetry in choice"** as the concrete, testable bar under the existing asymmetry principle (CCPA
|
|
81
|
+
§ 7004: the good door can't take more clicks than the bad one).
|
|
82
|
+
- **Regulatory teeth as a reference pointer** (CCPA, EU AI Act Art. 5, EDPB, FTC) — "is this regulated?",
|
|
83
|
+
explicitly *not* legal advice, *not* a gate.
|
|
84
|
+
- Pass-1 *under-sourced* claims (token-cost surprise, AI-washing, deepfake social proof, FTC
|
|
85
|
+
click-to-cancel specifics) held at **NOT-YET** (RVW-058) — named in blogs, not yet verifiable; the
|
|
86
|
+
watchlist re-sweeps them. Pass 2 (sludge, accessibility, minors, agentic-AI, financial, attention/labor)
|
|
87
|
+
folds in next.
|
|
88
|
+
- *Founder-facing, project-neutral; pulled via `/boss-sync`.*
|
|
89
|
+
|
|
90
|
+
## 0.94.0 — 2026-06-21
|
|
91
|
+
|
|
92
|
+
- **The generative half of the humane lens — first additions (composted from the humane-tech corpus).**
|
|
93
|
+
BOSS's humane lens has been *defensive* (harm-taxonomy, dark-patterns, conscience-not-censor = *don't
|
|
94
|
+
harm*). This batch begins the *generative* half — *cultivate flourishing* — drawn from a founder
|
|
95
|
+
humane-tech corpus (the Humane Product Canvas lineage, permaculture, the "Celebration of Done," play,
|
|
96
|
+
kinship). All judgment/voice/default touches, no new gates.
|
|
97
|
+
- **New [`library/practices/celebration-of-done.md`](../library/practices/celebration-of-done.md) +
|
|
98
|
+
wired into `/close`.** BOSS *records* done everywhere and *marks* it nowhere. Done is a threshold, not
|
|
99
|
+
perfection: a pause that registers what was crossed (against AI-speed build-amnesia), re-anchors on the
|
|
100
|
+
*why* and the *who*, and turns into curiosity about whether it resonates *now* — the bridge back to the
|
|
101
|
+
real user. The hard rule: celebrate **without performed warmth** ("🎉 great job!" is voice-mode bleed) —
|
|
102
|
+
genuine, specific, proportional, no streaks. It's a conscience moment with the polarity flipped.
|
|
103
|
+
- **`/prototype` gains the *magic circle*.** Named BOSS's one licensed *play* space — bounded,
|
|
104
|
+
safe-to-fail, nothing precious, the conscience waits outside. The generative register BOSS lacked.
|
|
105
|
+
- **`conscience-voicing` — not-knowing as a doorway.** Folded into the never-imply-an-intelligence-gap
|
|
106
|
+
rule: a founder's gap is an invitation ("haven't learned to appreciate it *yet*"), never a deficit.
|
|
107
|
+
- **`/canvas` Metrics cell — regenerative metrics, weighted as seriously as commercial:** what people
|
|
108
|
+
*learn* / gain in well-being, what makes the venture *resilient* — growth that renews, not just extracts.
|
|
109
|
+
- **`ai-ux-patterns` — "Humane defaults" (the build-time inverse of the dark-pattern checklist).**
|
|
110
|
+
Friction-as-ethics, done right: ship the humane choice as the default (remove friction from the good
|
|
111
|
+
path), **never** sabotage the other path (that's a dark pattern aimed at a goal you like — means matter,
|
|
112
|
+
sovereignty is non-negotiable, keep every door open). When the founder crosses anyway: name the cost
|
|
113
|
+
once, then **record the crossing** as a `DEC-NNN` — accountability, not a gate, so "why did we do that,
|
|
114
|
+
and when?" stays traceable. The antidote to humanity eroding *invisibly* through a thousand small
|
|
115
|
+
decisions.
|
|
116
|
+
- *Founder-facing, project-neutral; pulled via `/boss-sync`. The defensive + generative halves now both
|
|
117
|
+
ship.*
|
|
118
|
+
|
|
119
|
+
## 0.93.0 — 2026-06-21
|
|
120
|
+
|
|
121
|
+
- **Distribution as a voicing leg — `/ship` now asks "who finds it?" (IDEA-041, the FEAT-024 sequel).**
|
|
122
|
+
Closes a real asymmetry: `PRINCIPLES.md` names *path to distribution* as a co-equal leg of a real-value
|
|
123
|
+
app, but it was the only leg the conscience never voiced — *"will anyone pay?"* fires in the flow (moment
|
|
124
|
+
#1 + `mentor-business`), *"will anyone ever find it?"* didn't. **Decided the leg-vs-mentor question with
|
|
125
|
+
Ajesh:** not an auto-firing hook (its predicate needs the reachability detection deliberately parked in
|
|
126
|
+
FEAT-024 slice 3 — building it would be the over-fire nag that repels the anti-growth-hacking cohort
|
|
127
|
+
IDEA-019 protects), and not left as mentor-only (the gap stays). Instead a **third answer**: a single
|
|
128
|
+
**demand voicing wired into `/ship`** at the live moment — *"it's reachable now; who's the first real user,
|
|
129
|
+
and how do they hit this?"* Leg-like (fires in the flow, at the right moment, names the cost) without a
|
|
130
|
+
hook (rides the founder's deliberate `/ship` invocation → zero over-fire). Governed by IDEA-019's
|
|
131
|
+
**situation-not-person** rule (about the work's path to a user, never a judgment of the founder), kept to
|
|
132
|
+
the *demand* question (not a Product-Hunt checklist — that's the growth-hacking nag), once + suggestive +
|
|
133
|
+
points at `mentor-gtm` for depth, never a gate. Cohort-aware (lightest touch for `indie-hacker` /
|
|
134
|
+
anti-growth-hacking; skip entirely if they've a first user or a deliberate no-distribution stance).
|
|
135
|
+
`ship-it-live` practice gains a short "reachable → discoverable" note. `/tmp`-verified (skill scaffolds
|
|
136
|
+
clean, 0 placeholders). **The full distribution *hook* stays deferred** — earned only if a reliable
|
|
137
|
+
reachable-but-undiscovered predicate ever exists (shares FEAT-024 slice 3's parked substrate question).
|
|
138
|
+
|
|
139
|
+
## 0.92.0 — 2026-06-21
|
|
140
|
+
|
|
141
|
+
- **`ship-it-live` practice + `/ship` skill — the CD half of building (FEAT-024 slices 1+2).** `git-workflow`
|
|
142
|
+
shipped the CI half ("is `main` green"); this ships the **CD** half: *is this where a real user can hit it,
|
|
143
|
+
or just you?* An app that only runs on `localhost` is a pseudo app — the validation conscience has no teeth
|
|
144
|
+
if the artifact was never put where someone could prove it. Shaped past "starter spec" by a `/deep-research`
|
|
145
|
+
pass (21 sources, 25 claims adversarially verified 3-vote, 22 confirmed / 3 killed).
|
|
146
|
+
- **UP — new [`library/practices/ship-it-live.md`](../library/practices/ship-it-live.md)** (stack-neutral,
|
|
147
|
+
Principle #4 — no baked-in deploy target): **"localhost is not shipped"**; **deploy early/cheap/reversible**
|
|
148
|
+
(the *"reliability is premature at MVP"* counter was killed 0-3 — the headline stands un-hedged);
|
|
149
|
+
**secrets & authz at the boundary = the leg with teeth** — the signature vibe-coded failure is a
|
|
150
|
+
client-bundled DB key + RLS the AI never configured (**CVE-2025-48757 / Lovable** — 170+ apps, ~10.3%
|
|
151
|
+
leaking PII + keys; **MoltBook** — 1.5M credentials, founder wrote no code) [EVIDENCE]; **rollback ≠
|
|
152
|
+
reversible** (instant rollback restores the *app*, not the *database* — Vercel documents this against its
|
|
153
|
+
own feature → expand-migrate-contract, Fowler); a **deploy honesty anchor** (DORA 2024: AI ↔ *worse*
|
|
154
|
+
stability + throughput — the METR twin; small-batches-offset-it was killed 1-2, so it's stated not
|
|
155
|
+
oversold); **preview-per-branch demoted** from "the review primitive" to a JIT-gated judgment (its
|
|
156
|
+
loop-tightening efficacy is vendor-positioning only — the one verified counter-argument; over-ceremony for
|
|
157
|
+
a 2-person team).
|
|
158
|
+
- **DOWN — new `/ship` skill (L1/MVP)** — the deterministic "put it where a real user can hit it" runner:
|
|
159
|
+
detect the stack → **deploy-time pre-flight** (no client-bundled secrets; server-side authz/RLS actually
|
|
160
|
+
on) → cheapest reversible host → hand back the **live URL** + the rollback path → capture the recipe as a
|
|
161
|
+
`PRAC-NNN`. The pre-flight is a **check, not a gate** (conscience-not-censor) and **points at** `/red-team`'s
|
|
162
|
+
pre-ship pass + `agent-security` rather than restating them (the secret-scan surface `secrets-guard`
|
|
163
|
+
deliberately doesn't cover). Cohort-aware (the pre-flight is non-negotiable for `first-product` /
|
|
164
|
+
`non-tech-founder` — they can't spot a leaked key). Plus a tight **"Shipping (localhost is not shipped)"**
|
|
165
|
+
DOWN section in the L1/MVP working rules. Skill-layer (predicate/runner split — the model parses the stack
|
|
166
|
+
+ shells out, so `src/` stays zero-dep, Principle #4 / working-rule #4).
|
|
167
|
+
- **Slice 3 (the `reachable?` conscience moment) deliberately deferred** — voicing-first; a hook needs a real
|
|
168
|
+
reachability predicate, and manufacturing one is the over-fire trap BOSS guards against. Ties to
|
|
169
|
+
IDEA-041 (reachable → discoverable: *"does anyone know it exists?"*).
|
|
170
|
+
|
|
171
|
+
## 0.91.0 — 2026-06-20
|
|
172
|
+
|
|
173
|
+
- **Team wayfinding — close the gap between the shipped founder layer and the docs (IDEA-037 / FEAT-021
|
|
174
|
+
follow-on).** FEAT-021 shipped 12 releases of team functionality (v0.74→v0.85: `boss team`, `/decide`,
|
|
175
|
+
`/practice`, `mentor-cofounder`, the `coordination` moment, the state cut) but the *wayfinding* lagged —
|
|
176
|
+
exactly the README-drift pattern IDEA-018/035 exist to catch. Closed it three ways: (1) **`boss map` now
|
|
177
|
+
lists `boss team`** in its standing controls, so the command is discoverable in a live install (it was
|
|
178
|
+
invisible there before); (2) **new [`docs/GUIDE-teams.md`](../docs/GUIDE-teams.md)** — a founder-facing
|
|
179
|
+
guide to working with a cofounder: how to add a team, the loop with two of you (`/decide` · `/practice` ·
|
|
180
|
+
`boss board --mine` · `mentor-cofounder`), what the `coordination` conscience moment watches, **what's
|
|
181
|
+
shared vs. private** (push = backup + keep-in-the-loop; the conscience's relationship notes stay yours),
|
|
182
|
+
the lightweight working agreement (Driver/Approver + consent), and the hard lines (never arbitrates,
|
|
183
|
+
never scores equity); (3) **`docs/GUIDE.md` + `README.md`** gained a "Building with a cofounder?" pointer
|
|
184
|
+
+ `mentor-cofounder` in the mentor table / advisor list. Docs + one `src/map.js` line; eval gate **120/0**;
|
|
185
|
+
`/tmp`-verified (`boss map` shows `boss team`). All dormant-solo framing preserved (solo founders see no
|
|
186
|
+
team noise).
|
|
187
|
+
|
|
188
|
+
## 0.90.0 — 2026-06-20
|
|
189
|
+
|
|
190
|
+
- **The rest of the `/vet` routing sweep — reliability, generative-UI/memory, anti-sameness, calibrated
|
|
191
|
+
pitch (RVW-040/043 + RVW-051/052 + RVW-037 → `/boss-learn` UP).** Final batch of the queued ADAPT/ADOPT
|
|
192
|
+
sweep, three surfaces, all judgment-aids — no new gates or hook predicates. (Bundled into one release to
|
|
193
|
+
minimize version churn during the concurrent FEAT-023 practice stream.)
|
|
194
|
+
- **`/evals` — correctness ≠ safety + pass^k (RVW-040/043, L1).** New "Correctness ≠ safety — the
|
|
195
|
+
adversarial half" section: a clean `/evals` pass isn't *done* until `/red-team` runs the adversarial
|
|
196
|
+
half (safety degrades under jailbreak across every model — Stanford HAI AI Index 2026; OWASP-Agentic is
|
|
197
|
+
what to probe). Sharpening gains **pass^k** (τ-bench): non-deterministic ≠ run-once — run each
|
|
198
|
+
load-bearing case k times, count how often it *all* succeeds; zero-dependency, a loop around the case
|
|
199
|
+
you already wrote. Two new Rules lines.
|
|
200
|
+
- **`ai-ux-patterns` — two net-new patterns (RVW-051, practice).** **§9 generative-UI control spectrum**
|
|
201
|
+
(static → declarative → open-ended; open-ended sits in the §4 irreversibility tier — an injected prompt
|
|
202
|
+
can redraw what the user sees) and **§10 memory-as-a-reviewable-object** (view/edit/correct/delete/scope
|
|
203
|
+
— the footprints principle extended from *what the agent did* to *what the system knows*; BOSS dogfoods
|
|
204
|
+
it as a memory-carrying tool).
|
|
205
|
+
- **`design-system` — AI-default = indistinguishable (RVW-052, practice).** Sharpened the "aesthetic
|
|
206
|
+
ambition" section: AI-default isn't just generic, it's *indistinguishable from every competitor* (the
|
|
207
|
+
Tailwind `bg-indigo-500` apology as the hook); *build faster ≠ build sameness*; **spend the time the AI
|
|
208
|
+
saved on the ~5% that's yours** — pointed at the existing distinctiveness pass, no new mechanism.
|
|
209
|
+
- **`mentor-pitch` — calibrate claims to evidence (RVW-037, L2/V1).** One attributed heuristic:
|
|
210
|
+
overclaiming *measurably lowers what founders raise* (HBR 2025); match the verb to the proof
|
|
211
|
+
(shows/suggests/we believe/we're testing). Calibration, not suppression — tied to the existing anti-hype
|
|
212
|
+
voice. The `/vet` routing sweep is now complete; remaining verdicts are NOT-YET watches (034/055/035).
|
|
213
|
+
- *Pulled via `/boss-sync`; `/evals`+`mentor-pitch` arrive at MVP/V1 unlock, practices are BOSS-canonical.*
|
|
214
|
+
|
|
215
|
+
## 0.89.0 — 2026-06-20
|
|
216
|
+
|
|
217
|
+
- **`scalable-architecture` practice — architecture that survives the climb (FEAT-023 thread 2,
|
|
218
|
+
`/boss-learn` UP).** Second slice of the AI-native build-process track (solo-applicable). The spine:
|
|
219
|
+
*the value is never the rule — it's the **enforcement loop**.* An agent re-derives a codebase's
|
|
220
|
+
conventions every session, so anything you only *wrote down* drifts and anything you *encoded as a check*
|
|
221
|
+
compounds (Factory.ai: "documented conventions rot; enforced conventions compound"). Two moves — **defer
|
|
222
|
+
the architecture tax you can defer, encode the conventions you can't afford to lose**:
|
|
223
|
+
- **Modular-monolith-first, extract when forced** (Fowler MonolithFirst; Shopify's 2.8M-line modular
|
|
224
|
+
monolith [EVIDENCE]; microservice envy on Hold). The *modular* half is load-bearing — clear module
|
|
225
|
+
seams inside one deployable so the eventual extraction is cheap; a `FEAT` is a natural module.
|
|
226
|
+
- **Spend rigor on the one-way doors — the schema is the one.** Migrations-as-code from the first table
|
|
227
|
+
(Bezos doors + evodb), the single thing genuinely expensive to retrofit. **The migration log is also
|
|
228
|
+
the guardrail against AI schema drift** — wire schema changes into git-workflow's high-risk review tier.
|
|
229
|
+
- **Conventions as code, enforced not remembered** — formatting-as-law (Biome's single binary is
|
|
230
|
+
agent-friendly), boundaries-as-lint + architectural fitness functions (principle + writable lint now;
|
|
231
|
+
heavy custom-plugin tooling NOT-YET), strict types at the boundary.
|
|
232
|
+
- **The ratchet holds the line, not the reviewer** — extends the existing
|
|
233
|
+
[`quality-ratchet`](../library/practices/quality-ratchet.md) (no-new-violations baseline gated in
|
|
234
|
+
`/smoke`), pointed at an architecture metric; doesn't restate it. Plus a pointer to
|
|
235
|
+
[`context-discipline`](../library/practices/context-discipline.md) for the one-canonical-context-file
|
|
236
|
+
finding (failure mode is over-length, not under-spec).
|
|
237
|
+
- **UP-only this slice** (no template DOWN): the practice is the deliverable and mentors cite practices by
|
|
238
|
+
reference; the natural `mentor-architect` DOWN is **deferred to avoid a live edit collision** with a
|
|
239
|
+
concurrent session, not on judgment. Altitude/JIT held — modular seams at MVP, migrations at the first
|
|
240
|
+
table, conventions-as-code at the second author (a second *agent* counts), services at V1→Scale. Zero-dep;
|
|
241
|
+
library only; eval gate **120/0**. **FEAT-023 thread 3 (V1→Scale org rung) remains — deferred until a
|
|
242
|
+
real V1-stage project.**
|
|
243
|
+
|
|
244
|
+
## 0.88.0 — 2026-06-20
|
|
245
|
+
|
|
246
|
+
- **`git-workflow` practice — trunk-based, review-bounded AI building (FEAT-023 thread 1, `/boss-learn` UP
|
|
247
|
+
+ a DOWN into L1/MVP).** The first slice of the AI-native build-process track (spun out of the
|
|
248
|
+
founding-teams research; **solo-applicable**, not team-specific). The reframe: AI didn't change what good
|
|
249
|
+
version control is — it changed which part *hurts*. The agent writes code ~4× faster; **review** is the
|
|
250
|
+
new bottleneck (~12% delivered), so the whole discipline reorients around *how much two humans can stand
|
|
251
|
+
behind in a day.*
|
|
252
|
+
- **UP — new [`library/practices/git-workflow.md`](../library/practices/git-workflow.md).** Trunk-based
|
|
253
|
+
default (DORA [EVIDENCE]: ~2.3× elite; <3 active branches, merge daily; `/smoke` is the gate that makes
|
|
254
|
+
it safe; **CI is a practice, not a platform** — a 2-person smoke check *is* its CI). **Git worktrees =
|
|
255
|
+
the AI-parallelism primitive, CAPPED at ≈2–4 = your *review* capacity, not your agent count** (you can
|
|
256
|
+
spawn ten agents; you can't read ten diffs — more agents than you can review is unreviewed code with
|
|
257
|
+
your name on the merge). **Risk-tiered review, not blanket gates** — low-risk gets a glance, high-risk
|
|
258
|
+
(auth/money/migrations/deletes/deploys/AI-paths) gets the *other* human, and **BOSS's `/smoke` +
|
|
259
|
+
`/evals` + `/red-team` ARE that high-risk tier.** **Read the test diff harder than the code** (agents
|
|
260
|
+
rewrite assertions to match broken behaviour — *did the behaviour get fixed, or the expectation
|
|
261
|
+
lowered?*). **Whoever clicks merge owns what the agent wrote** (Osmani); **mob the humans+agent on hard
|
|
262
|
+
problems** (the questioning reflex degrades with an AI pair). **Honesty anchor: METR (n=16)** —
|
|
263
|
+
experienced devs on mature repos 19% *slower* with AI while *believing* 20% faster (the perception gap;
|
|
264
|
+
opposite population to a greenfield startup). **Ownership = prompt-author intent + reviewer acceptance**
|
|
265
|
+
— stated once, shared verbatim with FEAT-021's founder layer.
|
|
266
|
+
- **DOWN — folded into the L1/MVP template** (`stages/L1-mvp/template/claude-append.md`): a tight
|
|
267
|
+
"Git workflow (trunk-based, review-bounded)" section in the MVP working rules, so a scaffolded venture
|
|
268
|
+
inherits the discipline at the rung where there's a real `main` to keep green — kept lean (the practice
|
|
269
|
+
holds the depth; CLAUDE.md bloat is the failure mode the research itself names).
|
|
270
|
+
- **Altitude/JIT held** (not a Quickstart lecture — earns its place at MVP). Zero-dep; library + template
|
|
271
|
+
only; pulled via `/boss-sync`. **FEAT-023 threads 2 (`scalable-architecture`) + 3 (V1→Scale rung)
|
|
272
|
+
remain** — thread 2 next-if-earned, thread 3 deferred until a real V1-stage project.
|
|
273
|
+
|
|
274
|
+
## 0.87.0 — 2026-06-20
|
|
275
|
+
|
|
276
|
+
- **The mentor-architect bundle — the jagged frontier, the model underneath, evals-as-spec (`/vet` sweep:
|
|
277
|
+
RVW-046 + RVW-041 + RVW-053 + RVW-050 → `/boss-learn` UP, MVP/L1).** Four findings sharpen the AI-native
|
|
278
|
+
architecture mentor, all as judgment-aids — no new gates, no scoring apparatus imported.
|
|
279
|
+
- **New section "the jagged frontier, and the model underneath"** — two judgments that sit above every
|
|
280
|
+
AI-native build call and **move with each model release** (the model-recalibration discipline, IDEA-014):
|
|
281
|
+
- **Inside/outside the frontier (RVW-046, Dell'Acqua/Mollick 758-consultant RCT).** AI is sharply
|
|
282
|
+
additive inside the suitable set, ~19pp *less* likely correct outside it — judge which side a task is
|
|
283
|
+
on, pick **centaur** vs. **cyborg**, and **re-ask with every model jump**. Heuristic, not law.
|
|
284
|
+
- **The 70% problem (RVW-053, Osmani + GitClear telemetry).** AI gets you ~70% (what you understand) and
|
|
285
|
+
stalls on the last 30% (what you don't) — the marker between `/prototype` (sketch freely) and
|
|
286
|
+
`/spec`/MVP (now you must understand what ships). Can't-shape-the-30% = slow down, not ship.
|
|
287
|
+
- **Non-default-model transparency (RVW-041, Stanford FMTI).** When deliberately *not* defaulting to the
|
|
288
|
+
host model, weigh transparency alongside cost/capability — a four-item "what to ask" list (data
|
|
289
|
+
provenance, known limits, deprecation/retirement policy, change cadence). Indicators-as-questions; the
|
|
290
|
+
Index's scoring left out.
|
|
291
|
+
- **Reliability bullet gains evals-as-spec (RVW-050, Husain/Shankar).** For an AI product the **eval *is*
|
|
292
|
+
the spec** — writing it drags you across the *Gulf of Specification* (loose intent vs. what the model
|
|
293
|
+
actually does); define the quality bar before building. `/evals` is the machinery; this is the judgment
|
|
294
|
+
above it. The standalone `evals-driven-development` practice stays a NOT-YET-until-MVP→V1 follow-on
|
|
295
|
+
(Quickstart founders don't need eval ceremony, Principle #2).
|
|
296
|
+
- *MVP-mode mentor (`stages/L1-mvp/`); arrives at `boss unlock mvp`; pulled via `/boss-sync`.*
|
|
297
|
+
|
|
298
|
+
## 0.86.0 — 2026-06-20
|
|
299
|
+
|
|
300
|
+
- **The thesis bundle — BOSS's own claim, made honest, plus two coaching blades (`/vet` sweep:
|
|
301
|
+
RVW-047 + RVW-049 + RVW-039 → `/boss-learn` UP).** Three [EVIDENCE]/[THOUGHT-LEAD] findings routed into
|
|
302
|
+
the surfaces where BOSS makes its case — no new machinery, exactly as each verdict scoped.
|
|
303
|
+
- **`PRINCIPLES.md` "why" reframe (RVW-047, Camuffo 759-firm RCT [EVIDENCE]).** The honest version of the
|
|
304
|
+
thesis: disciplined validation doesn't *guarantee a win* — it makes you **decide faster, including
|
|
305
|
+
quitting faster**. Cheap AI lowers the cost of *building*, not of *being wrong* — so the cheaper
|
|
306
|
+
building gets, the more the loop matters, not less. Named because BOSS's conscience exists to stop
|
|
307
|
+
self-fooling, and that has to include BOSS not overclaiming its own promise.
|
|
308
|
+
- **`mentor-venture` gains "the evidence you carry" (RVW-047/049/039).** Three coaching lines, judgment
|
|
309
|
+
first / cite-on-request: (1) validation buys faster decisions, not wins; (2) **the demoware test** —
|
|
310
|
+
*"if your AI product only replaces a prompt plus a copy/paste, it's a demo, not a product"* (Cagan/SVPG),
|
|
311
|
+
anchored at the `/prototype`→`/spec` graduation; (3) **the competence-gate** — AI advice *amplifies* the
|
|
312
|
+
judgment a founder already has and can harm the one least able to grade it (Otis et al. HBS RCT, 640
|
|
313
|
+
founders: high performers ~+15%, struggling ~−8%), so at a call they may not be equipped to judge, ask
|
|
314
|
+
*"are you set up to judge this answer?"* and point at who'd know.
|
|
315
|
+
- **`conscience-voicing.md` gains the competence-gate as a voicing (RVW-039).** A lens the `caution`/`drift`
|
|
316
|
+
moments can reach for — a humility prompt, rare and suggestive, never a gate, **explicitly not a new
|
|
317
|
+
hook predicate** (no signal detects over-trust; don't manufacture one). Cites Otis + CHI 2025
|
|
318
|
+
automation-bias (confidence in AI tracks *less* checking, not more).
|
|
319
|
+
- *Founder-facing, project-neutral; pulled via `/boss-sync`.* mentor-venture ships from Quickstart;
|
|
320
|
+
PRINCIPLES + the practice are BOSS-canonical, synced.
|
|
321
|
+
|
|
322
|
+
## 0.85.0 — 2026-06-20
|
|
323
|
+
|
|
324
|
+
- **The `coordination` conscience moment — the founding-team seam, watched via artifacts (founder layer
|
|
325
|
+
slice 5b; IDEA-037 / FEAT-021).** BOSS's first *team-aware* conscience moment, and the careful gate-piece
|
|
326
|
+
of the program. Built on the most-replicated human-AI-teaming finding: **AI accelerates each individual
|
|
327
|
+
but erodes the human-to-human seam invisibly** (Ju & Aral RCT: social/emotional comms −27% while
|
|
328
|
+
*perceived* teamwork stayed flat — so a "how's teamwork?" prompt is *proven blind*; you have to watch the
|
|
329
|
+
artifact channel). New `coordination-loop` (L1): **entry** = it's a team (`.boss/config.json` has a
|
|
330
|
+
cofounder `"handle"`) **AND** real work has happened (`docs/devlog.md` ≥3 entries); **exit** = ≥1
|
|
331
|
+
`DEC-NNN` recorded together. Open → the conscience reads a **bounded** slice (decisions dir · `boss board`
|
|
332
|
+
· `boss team`) and judges: is work flowing through one founder's agent *around* the cofounder, or is the
|
|
333
|
+
deciding just happening off-repo on a call? Fires only on a real seam; **stays silent on a quiet log**
|
|
334
|
+
(the evidence is weak-transfer — over-firing would punish a healthy team that talks on calls). **Dormant
|
|
335
|
+
by construction when solo** (no `"handle"` → unopenable); **fires at most once/session**; **serves the
|
|
336
|
+
partnership as the unit and NEVER takes a side** (surfaces the seam, never whose fault); points at
|
|
337
|
+
`/decide` + `mentor-cofounder`; mutable (`boss conscience mute coordination`). The **Red-Light** moment
|
|
338
|
+
from the ai-adoption handoff was deliberately **NOT** built as a hook (it's not predicate-gateable —
|
|
339
|
+
there's no artifact that says "about to automate a teammate's task"; forcing it = the over-fire failure
|
|
340
|
+
mode) — it lives as a framing in `mentor-cofounder` instead. **Conscience eval gate: 120/0** (113 + 7 new
|
|
341
|
+
`moment-coordination` cases — fire / dormant-solo / decided-together / too-early / no-team / empty);
|
|
342
|
+
`/tmp`-verified end-to-end (fires on the seam, goes silent once a `DEC` is recorded). Zero-dep.
|
|
343
|
+
**FEAT-021 slice 5 (mentor-the-team) now complete** (5a `mentor-cofounder` + 5b the moment). **Next:
|
|
344
|
+
slice 6 — `boss credits` + the ownership/equity moment (reshaped by the research: refuse-to-score,
|
|
345
|
+
CRediT/IBM-validated, contribution kinds, structural-trigger equity — the worst solo-test profile, build
|
|
346
|
+
the evidence substrate only and get it in front of a real team).**
|
|
347
|
+
|
|
348
|
+
## 0.84.0 — 2026-06-20
|
|
349
|
+
|
|
350
|
+
- **Humane harm-taxonomy as a shippable practice + the mentor internal/shipped boundary made explicit
|
|
351
|
+
(RVW-045 re-homed; `mentor-architect` verdict).** Caught while routing: RVW-045's harm-taxonomy had landed
|
|
352
|
+
in BOSS's *gitignored* `mentor-humane` agent — shipping to nobody. `mentor-architect` reframed it — the
|
|
353
|
+
humane lens is **cross-cutting**, so it belongs in a practice every mentor + the conscience can cite, not
|
|
354
|
+
inside one agent ("the agent was the wrong container, not just the wrong location").
|
|
355
|
+
- New **`library/practices/harm-taxonomy.md`** — Anthropic's 5 harm dimensions + Ada Lovelace's 4
|
|
356
|
+
relationship-harms (manipulation / dependence / anthropomorphism / overreliance), the shared vocabulary
|
|
357
|
+
the conscience, every mentor, and `/canvas` §3 reason against. Pairs with the dark-pattern checklist +
|
|
358
|
+
`/red-team --humane`.
|
|
359
|
+
- **`docs/MENTORS.md`** gains an **internal-vs-shipped boundary table** on the real axis ("artifact a
|
|
360
|
+
founder's *project* consumes" vs. "tooling for authoring BOSS itself") — the thing that would have caught
|
|
361
|
+
the misroute. Clarifies: the humane *lens* ships from Quickstart; the standalone `mentor-humane` *agent*
|
|
362
|
+
stays Scale (ethics wants a hook / cross-cutting shape, not an opt-in door).
|
|
363
|
+
- Captured, not built: **IDEA-038** (`library/` as the canonical managed-artifact shelf; templates become
|
|
364
|
+
thin manifests — deferred, to be *pulled* by the learning loop, not pushed by a routing bug) +
|
|
365
|
+
**IDEA-039** (humane lens as conscience-moments + practice vs. a standalone agent — decided by a future
|
|
366
|
+
humane-moment eval set, `tester` territory) + an architecture decision record. Practice + doc only;
|
|
367
|
+
zero CLI / dependency change.
|
|
368
|
+
|
|
369
|
+
## 0.83.0 — 2026-06-20
|
|
370
|
+
|
|
371
|
+
- **`/decide` gains a cheap falsifier + AI-decision provenance (founder layer slice 1 iteration; IDEA-037 /
|
|
372
|
+
FEAT-021, from the research realignment).** The single highest-leverage change the founding-teams research
|
|
373
|
+
surfaced: a decision log that only assigns blame **fails — verification cost is the bottleneck**, not
|
|
374
|
+
accountability. So `DEC-NNN` now carries:
|
|
375
|
+
- **A `## Falsifier` field — *"what would prove this wrong, and by when?"*** (*"if signups don't move by
|
|
376
|
+
July, X was wrong"*), mirrored into an optional `revisit_by:` date. **Required for `costly`/`one-way`
|
|
377
|
+
calls, encouraged for reversible** — it makes finding out you were wrong *cheap and scheduled*.
|
|
378
|
+
- **`decided_by:`** — `founder` / `ai-suggested-ratified` / `ai-autonomous`, so over-delegation of
|
|
379
|
+
load-bearing calls to the model is *visible* (where automation bias enters).
|
|
380
|
+
- **Reversibility-scaled ceremony** — reversible = a line + the **consent** question (*"is it safe enough
|
|
381
|
+
to try?"*, which gives a non-technical cofounder honest language to agree without fully evaluating a
|
|
382
|
+
technical call); `costly`/`one-way` = require the falsifier + weigh the alternatives.
|
|
383
|
+
- **A mild skeptic prompt at irreversible *AI-suggested* decisions** (*"what's the one thing you'd check
|
|
384
|
+
before you can't undo it?"*) — **a prompt, never a gate** (forced verification *backfires*; disposition
|
|
385
|
+
beats process). BOSS's own `DEC-001` updated to dogfood the new format.
|
|
386
|
+
Evidence: automation-bias review + "Bias in the Loop" (accountability-alone insufficient; forced
|
|
387
|
+
verification backfires; skepticism beats every design factor), "Don't Vibe" (over-delegation seam),
|
|
388
|
+
sociocratic consent. Skill-layer only, zero-dep; eval gate **113/0**; `/tmp`-verified. **Also triaged the
|
|
389
|
+
research:** FEAT-021 sharpened across all slices; build-process/scaling work **spun out to FEAT-023**
|
|
390
|
+
(`git-workflow` + scalable-architecture + the V1→Scale rung — not team-specific). **Next: slice 5b (the
|
|
391
|
+
teams-conscience family — Red-Light + watch-the-seam + moral-crumple-zone, gate-evalled).**
|
|
392
|
+
|
|
393
|
+
## 0.82.0 — 2026-06-20
|
|
394
|
+
|
|
395
|
+
- **The humane lens, operationalized — dark-pattern checklist + `/red-team --humane` + a harm taxonomy
|
|
396
|
+
(RVW-031 + RVW-045, via `/vet` → `/boss-learn`).** Third routed bundle from the research session, turning
|
|
397
|
+
the humane lens from a vibe into named, checkable artifacts:
|
|
398
|
+
- **`library/practices/ai-ux-patterns.md`** gains a **dark-patterns** section — CDT's *Dark Patterns in AI
|
|
399
|
+
Chatbots* (2026, CC-BY) 37 patterns in 5 families (data/memory exploitation · misleading design ·
|
|
400
|
+
autonomy-for-engagement · false social/emotional connection · coercive monetization), framed as
|
|
401
|
+
*recognize-as-you-build*, plus CDT's **constructive "humane alternative"** (default conversations to
|
|
402
|
+
end, opt-in social layer, genuine delete controls, no emotional language near a purchase). Key insight:
|
|
403
|
+
these can **emerge from the model** (sycophancy), not just be designed — so test the *built* product.
|
|
404
|
+
- **`/red-team` gains a `--humane` dimension** — probes the founder's *own* AI product for the dark-pattern
|
|
405
|
+
families, weighted toward emergent ones (sycophancy, engagement-prolonging, emotional-manipulation-near-
|
|
406
|
+
money, misrepresentation). Binary pass/fail; **suggestive, never blocking** (conscience-not-censor).
|
|
407
|
+
- *(BOSS-internal, not shipped)* BOSS's own self-hosted `mentor-humane` also gained a **harm taxonomy** —
|
|
408
|
+
Anthropic's 5 harm dimensions + Ada Lovelace's 4 relationship-harms (manipulation / dependence /
|
|
409
|
+
anthropomorphism / overreliance), reflexively disciplining BOSS's own voice. RVW-045 had no founder
|
|
410
|
+
template to route into (mentor-humane is BOSS-only), so this is dogfood, not a shipped capability.
|
|
411
|
+
- Held the conscience-not-censor line throughout (names the cost + the humane alternative; never makes a
|
|
412
|
+
choice unavailable). Practice + skill + agent text only; zero CLI/dependency change.
|
|
413
|
+
|
|
414
|
+
## 0.81.0 — 2026-06-20
|
|
415
|
+
|
|
416
|
+
- **`mentor-cofounder` — mentor the TEAM (founder layer slice 5a; IDEA-037 / FEAT-021).** Built on the
|
|
417
|
+
founders' Pain B (*"different founding skill sets — an engineer who's a first-time CTO, or a non-tech
|
|
418
|
+
person who doesn't know how to work with an engineer cofounder on building together"*). Every other
|
|
419
|
+
mentor coaches *a* founder; this new L1 mentor coaches the **relationship between founders** — the
|
|
420
|
+
differentiated "mentor the team" value. It helps a team **divide work across skill sets** (record
|
|
421
|
+
who-owns-what as a `DEC`), **bridge the skill gap both directions**, **surface the hard conversations**
|
|
422
|
+
(roles/pace/equity — the talks that kill teams by never happening), and **name decision rights** (one
|
|
423
|
+
DRI; equal equity fine, 50-50 control-deadlock the trap). **Folds in the inherited AI-adoption-culture
|
|
424
|
+
knowledge** (from the v0.80.0 practice, the handoff source): the **Human Agency Scale / Red-Light** zone
|
|
425
|
+
(don't automate a teammate's task they don't want automated), **psychological safety paired with high
|
|
426
|
+
standards** (Edmondson / RVW-035 — not niceness), killing the **secret-cyborg** dynamic (reward honest AI
|
|
427
|
+
use, shared via `/practice`), and the **no-workslop norm** (*"would I be proud to hand this to my
|
|
428
|
+
cofounder?"*). **Wires the AI consent + norms conversation into the `boss team` flow** — a one-time nudge
|
|
429
|
+
on the solo→team transition + a standing pointer in the team view. **The bright line is built in:** it
|
|
430
|
+
serves the **partnership-as-the-unit** and **NEVER takes a side** between cofounders (facilitate, name the
|
|
431
|
+
tradeoff, they decide; the only override is harm, not preference; not a lawyer or cap table). **Dormant
|
|
432
|
+
when solo.** Registered in the L1 manifest + `docs/MENTORS.md` (arrives at MVP). Zero-dep; eval gate
|
|
433
|
+
**113/0**; `/tmp`-verified. **Next: slice 5b — the conscience Red-Light *moment* (new hook detector +
|
|
434
|
+
eval cases through the gate; the deliberate, careful piece) · then slice 6 (`boss credits` + the
|
|
435
|
+
ownership moment).**
|
|
436
|
+
|
|
437
|
+
## 0.80.0 — 2026-06-20
|
|
438
|
+
|
|
439
|
+
- **New practice: `ai-adoption-culture.md` — bring AI to a team without breeding resentment (RVW-038,
|
|
440
|
+
via `/vet` → `/boss-learn`; feeds IDEA-037).** The second routed bundle from the research-feeds session,
|
|
441
|
+
and the most net-new: how a small founding team adopts AI so people **opt in** rather than comply.
|
|
442
|
+
Distilled from Stanford's **Human Agency Scale** (automate what's *wanted*; name the "Red Light"
|
|
443
|
+
capable-but-unwanted zone), **Edmondson** psychological safety (make it safe to admit "I don't know AI";
|
|
444
|
+
pair safety with high standards), Mollick's **"secret cyborgs"** (reward honest AI use, don't punish it
|
|
445
|
+
into hiding), and the **"workslop"** finding (sloppy AI output erodes trust *between cofounders* — own
|
|
446
|
+
your draft before you forward it). Ships a concrete **cofounder AI consent + norms conversation** and
|
|
447
|
+
stays JIT — dormant solo, surfaces when a second person joins, **never a mandate** (the conscience names
|
|
448
|
+
the Red-Light tension, never picks the answer). **Authored as a reviewable starting draft for the
|
|
449
|
+
IDEA-037 founding-teams build** — the handoff names the recommended wiring (a conscience Red-Light
|
|
450
|
+
moment, the consent conversation in `boss team`, mentor-the-team citing it) and leaves the hook/gate
|
|
451
|
+
changes to that build. Library practice, zero-dep (not scaffolded); folds in RVW-035 (Edmondson) when
|
|
452
|
+
slice 5 lands.
|
|
453
|
+
|
|
454
|
+
## 0.79.0 — 2026-06-20
|
|
455
|
+
|
|
456
|
+
- **Agent-security hardened for the 2026 agent-native surface (RVW-032/042/044/054, via `/vet` →
|
|
457
|
+
`/boss-learn`).** The first routed bundle from a research-feeds mining session (new
|
|
458
|
+
`docs/research/SOURCES.md` — a ~40-org institutional roster — plus a 25-verdict `/vet` sweep,
|
|
459
|
+
RVW-031…055). `library/practices/agent-security.md` + the L1 `/red-team` skill grow beyond the
|
|
460
|
+
stateless lethal-trifecta / LLM-Top-10 baseline to cover what a founder shipping an *agent* actually
|
|
461
|
+
faces:
|
|
462
|
+
- **Agent-native threat model** — the **OWASP Agentic ASI Top 10 (Dec 2025)** (goal hijack, tool
|
|
463
|
+
misuse, agentic supply chain, memory/context poisoning, identity/privilege abuse, rogue agents…)
|
|
464
|
+
added to `/red-team` for any target with tools + memory + autonomy; the stateless LLM Top 10 still
|
|
465
|
+
covers a plain prompt-in/text-out path. Plus **agentic misalignment** named as a *measured* failure
|
|
466
|
+
mode (Anthropic) — bound autonomy + sensitive access; don't assume good behaviour.
|
|
467
|
+
- **Containment defaults** beneath the Rule of Two (Anthropic "how we contain Claude" + Redwood
|
|
468
|
+
control): **match isolation to oversight** — read-only / read-write-no-delete mount tiers, **egress
|
|
469
|
+
allowlists**, inspect-tool-returns-before-context, and **gate the irreversible** behind a human or a
|
|
470
|
+
cheaper trusted check.
|
|
471
|
+
- **The shipped app is its own surface** — AI defaults to insecure code (Veracode: ~45% of
|
|
472
|
+
AI-generated code ships an OWASP-Top-10 vuln) and the classic vibe-coded leak is **client-side key
|
|
473
|
+
exposure** (the Tea breach; ~25k secrets found across vibe-coded sites). A new **pre-ship
|
|
474
|
+
app-security pass** in `/red-team` (no secrets/keys in the bundle + OWASP web basics),
|
|
475
|
+
**non-negotiable for `first-product` / `vibe-coder-newbie`**. Names the honest gap: **`secrets-guard`
|
|
476
|
+
does NOT cover this** — it stops the *agent* reading secrets into context; it does nothing about a
|
|
477
|
+
*shipped app* exposing a key.
|
|
478
|
+
- Held the JIT line (Principle #2): surfaces one trigger at a time — trifecta on first untrusted read,
|
|
479
|
+
the ASI list on first agent ship, the pre-ship scan on first deploy, the full battery for regulated
|
|
480
|
+
cohorts. Practice-doc + skill-text only (zero CLI / dependency change); provenance carries the RVW
|
|
481
|
+
trail.
|
|
482
|
+
|
|
483
|
+
## 0.78.0 — 2026-06-20
|
|
484
|
+
|
|
485
|
+
- **`/practice` + `PRAC-NNN` — the shared craft commons (founder layer slice 4; IDEA-037 / FEAT-021).**
|
|
486
|
+
The most *differentiated* slice, built on the founders' Pain C: *"either of us could be discovering best
|
|
487
|
+
new ways to use agentic AI — it changes so fast — and we want to keep sharing + staying current, using
|
|
488
|
+
best practices, so we can focus on building, with mentorship so we're not worrying whether we're outdated
|
|
489
|
+
or expensive."* New L1 skill `/practice` captures a **craft learning** (a better/cheaper/newer way to
|
|
490
|
+
build with AI) as a shared, attributed `docs/practices/PRAC-NNN-<slug>.md` (What we learned / Why it
|
|
491
|
+
works / How to apply), stamped with **who learned it** (`@github-username`) so a teammate gets it *and*
|
|
492
|
+
knows who to ask. **Shared by construction** (commits with the repo per the slice-3a cut — push backs it
|
|
493
|
+
up, a cofounder who clones has it). **Staleness-aware — the part that keeps you *current*, not just
|
|
494
|
+
documented:** a practice tied to a specific model/price/tool can carry `review_by:`, and when it passes,
|
|
495
|
+
`/revalidate` asks *"still the best way? anything changed?"* → keep / update / retire — the team's quiet
|
|
496
|
+
defense against being outdated or overspending, the model-recalibration discipline ([[IDEA-014]])
|
|
497
|
+
team-scoped. **Held the humane line:** attribution is recognition + a pointer, **never a scoreboard**
|
|
498
|
+
("measures what the team knows, never who contributed more"). New `PRAC-NNN` ID type (`docs/IDS.md`);
|
|
499
|
+
registered in the L1 manifest. Zero-dep, skill-layer only; eval gate **113/0**; `/tmp`-verified (ships,
|
|
500
|
+
`boss map` lists it, `PRAC-001` parses with attribution). **Next (FEAT-021): slice 5 mentor-the-team ·
|
|
501
|
+
slice 6 `boss credits` + the ownership moment.**
|
|
502
|
+
|
|
503
|
+
## 0.77.0 — 2026-06-20
|
|
504
|
+
|
|
505
|
+
- **The founder-layer state cut — back up + share, keep the conscience private (slice 3a; IDEA-037 /
|
|
506
|
+
FEAT-021 / DEC-001).** Occasioned by Ajesh's insight that BOSS taught us the *released-app vs.
|
|
507
|
+
dev-codebase* split: BOSS gitignores its *own* docs (don't ship private strategy in a public OSS package),
|
|
508
|
+
but a **venture's `docs/` already commit** — so a scaffolded team's ideas, canvas, `DEC-NNN` decisions,
|
|
509
|
+
research, RESUME, and the venture brain's `read.md` are **already backed up (push = backup) and shared
|
|
510
|
+
(a cofounder who clones is in the loop)**. This release names that win and **plugs the one leak**: the
|
|
511
|
+
template now gitignores the **per-person** conscience state — `.boss/brain/relationship.md` (what the
|
|
512
|
+
conscience said to *you* and what you did with it) + `.boss/trace.jsonl` — so one founder's private nudge
|
|
513
|
+
history never travels to the other (Contextual Integrity). The backup/share framing is surfaced in
|
|
514
|
+
`/welcome` (first-run) and the L0 `AGENTS.md` conventions. **The decision is recorded as BOSS's own first
|
|
515
|
+
`/decide`** — [`DEC-001`](../docs/decisions/DEC-001-founder-layer-brain-cut.md): *venture brain `read.md`
|
|
516
|
+
SHARED (the hive-mind read, also seeds a joining cofounder's brain for free), conscience `relationship.md`
|
|
517
|
+
PER-PERSON* — drafted with the research- + 3-mentor-backed recommendation, **Ajesh to confirm or supersede**.
|
|
518
|
+
Reversible at a cost (the leak fix *prevents* the irreversible accident; sharing `read.md` confirms the
|
|
519
|
+
status quo). Deferred: per-founder namespacing (`.boss/founders/<handle>/`) — the structural fix, needs
|
|
520
|
+
`brain.js` path changes. Template-only + docs; `/tmp`-verified (per-person ignored, shared/backup commit).
|
|
521
|
+
**Next (FEAT-021): slice 4 — shared craft commons (Pain C, the most differentiated) · slice 5
|
|
522
|
+
mentor-the-team · slice 6 `boss credits` + the ownership moment.**
|
|
523
|
+
|
|
524
|
+
## 0.76.0 — 2026-06-20
|
|
525
|
+
|
|
526
|
+
- **`boss board` owner lens — `owner:`-as-person + `--mine` (founder layer slice 2b; IDEA-037 / FEAT-021).**
|
|
527
|
+
The board (already a pure projection of frontmatter) now reads `owner:` and, **when the venture is a
|
|
528
|
+
team**, shows the founder who owns each card (`@handle`) — provenance of *who's the DRI*, surfaced as a
|
|
529
|
+
quiet suffix. `boss board --mine` narrows to the cards you own ("what am I on the hook for"); the JSON
|
|
530
|
+
projection (`--json`) carries `owner` per card for agent-readability. **Dormant-solo:** a solo founder
|
|
531
|
+
sees no owners and nothing changes — the lens only lights up once `boss team` has a cofounder. **Held the
|
|
532
|
+
humane line:** only a `@handle` counts (role owners like `pm` are ignored), and owners are shown as
|
|
533
|
+
per-card provenance, **never aggregated into a per-person count/leaderboard** (the credit-score line
|
|
534
|
+
mentor-humane drew). Quote-tolerant (`owner: "@handle"` — a leading `@` is reserved in YAML). Pure
|
|
535
|
+
projection preserved (no new state); zero-dep; eval gate **113/0**; `/tmp`-verified (team-shown /
|
|
536
|
+
solo-hidden / `--mine` filter / JSON field). *Deferred:* owner in the HTML board + `--next`/`--blocked`
|
|
537
|
+
views. **Next (FEAT-021): slice 3 — keep-in-the-loop + the shared/personal state cut** (the one
|
|
538
|
+
costly-to-reverse decision; to be recorded as BOSS's own `DEC` once the cut is chosen).
|
|
539
|
+
|
|
540
|
+
## 0.75.0 — 2026-06-20
|
|
541
|
+
|
|
542
|
+
- **`boss team` — the team-aware foundation (founder layer slice 2; IDEA-037 / FEAT-021).** Makes BOSS
|
|
543
|
+
*know* whether a venture is solo or a founding team, keyed on **GitHub identity** (`@username` via
|
|
544
|
+
`gh api user` → `git config`, never fabricated — the principal id the whole team layer builds on).
|
|
545
|
+
`boss team` shows the venture's people; `boss team add @handle "Name"` / `remove @handle` manage the
|
|
546
|
+
roster (stored in `.boss/config.json`). **Dormant-solo by design** (the mentor-pass guardrail that
|
|
547
|
+
clears the solo test): with no cofounder declared, a solo founder sees nothing new and *nothing
|
|
548
|
+
changes* — the team layer only lights up when someone joins, so it's inert, never overhead. Guards
|
|
549
|
+
against adding yourself (you're already "you"). `/welcome` now asks the light, optional "solo or with
|
|
550
|
+
someone?" question at first run (flippable anytime); L0 `CLAUDE.md`/`AGENTS.md` wayfinding refreshed for
|
|
551
|
+
`/decide` + `DEC-NNN` + `boss team` (no drift). `src/team.js`, zero-dep; conscience eval gate **113/0**;
|
|
552
|
+
`/tmp`-verified (solo→add→team→remove + self-add rejection). **Where the roster lives is deliberately
|
|
553
|
+
LOCAL for now** — whether it should travel via git (so a cofounder sees it) is the shared-vs-personal
|
|
554
|
+
**state cut**, slice 3, which will be recorded as BOSS's own `DEC` (dogfooding `/decide`). **Next
|
|
555
|
+
(FEAT-021):** slice 2b `owner:`-as-person + board-by-owner · slice 3 keep-in-the-loop + the state cut.
|
|
556
|
+
|
|
557
|
+
## 0.74.0 — 2026-06-20
|
|
558
|
+
|
|
559
|
+
- **`/decide` + `DEC-NNN` — the decision log (founder layer, slice 1; IDEA-037 → FEAT-021).** First build
|
|
560
|
+
of "BOSS for founding teams," green-lit on **real-founder demand** (past-pain stories from founder
|
|
561
|
+
conversations — the first field signal against the canvas's n=0 demand risk, Risk #6 antidote). New L0
|
|
562
|
+
skill `/decide` records a **load-bearing or hard-to-reverse** choice as a durable ADR-lite record in
|
|
563
|
+
`docs/decisions/DEC-NNN-<slug>.md`: Context / Decision / **Why** / Consequences, stamped with the
|
|
564
|
+
**decider** (`@github-username`, resolved from `gh api user` → `git config user.name`, never fabricated)
|
|
565
|
+
and a **`reversibility:`** flag (`reversible | costly | one-way`, Bezos two-way/one-way doors).
|
|
566
|
+
**Supersede-don't-edit** — a changed mind writes a new `DEC` with `supersedes:` and flips the old to
|
|
567
|
+
`status: superseded`, so the chain *is* the story of how thinking evolved. The rationale future-you (and
|
|
568
|
+
a cofounder who wasn't in the room) can read instead of guessing — and in a team, the artifact both can
|
|
569
|
+
point at instead of misremembering. New `DEC-NNN` ID type in `docs/IDS.md`; registered in the L0
|
|
570
|
+
manifest (decisions happen from day one). **Bright lines held** (mentor-vetted, IDEA-037): records a
|
|
571
|
+
decision, never gates one; never a cap table or legal advice (points at a real attorney for
|
|
572
|
+
equity/vesting); the conscience may surface a tension but **never picks a side between cofounders**.
|
|
573
|
+
Zero-dep, skill-layer only (no `src/` change). **Slice 1 of a 6-slice program** (team-aware foundation →
|
|
574
|
+
keep-in-the-loop → shared craft commons → mentor-the-team → credit + the ownership moment) — each later
|
|
575
|
+
slice records *into* this one. See [`FEAT-021`](../docs/ideas/FEAT-021-founder-layer-decision-log.md).
|
|
576
|
+
|
|
577
|
+
## 0.73.0 — 2026-06-20
|
|
578
|
+
|
|
579
|
+
- **`mentor-business` (V1) gains the on-ramp + tier-design layers — the founder's pricing menu, completed
|
|
580
|
+
(RVW-030, ADAPT).** RVW-023 (v0.66) gave the V1 business mentor its *metering-basis* axis (per-seat →
|
|
581
|
+
usage → hybrid → outcome). This adds the two questions that axis left open, sourced from a `/deep-research`
|
|
582
|
+
pass (27 sources, 22/25 claims adversarially confirmed; `docs/research/pricing-and-tiers-playbook-2026.md`):
|
|
583
|
+
- **The on-ramp** — freemium / free-trial / reverse-trial / no-free-tier, chosen by *available traffic*
|
|
584
|
+
and *per-user cost*, with the AI catch made explicit (every free interaction burns real compute, so
|
|
585
|
+
freemium needs cheap/hard-capped free or a reverse trial instead of a crippled tier).
|
|
586
|
+
- **The tiers** — the ~3-tiers-plus-enterprise default, gating each tier on the axis matching the value
|
|
587
|
+
metric and *only where a tier's absence blocks a segment's core job*, free→paid line at a real aha.
|
|
588
|
+
- Carries three cautionary cases (Cursor's credit-repricing apology, "customers don't think in tokens,"
|
|
589
|
+
vague outcome-units backlash) and an explicit **numbers-are-contested** rule: coach the ordering and
|
|
590
|
+
trade-offs, never quote a precise conversion rate as fact.
|
|
591
|
+
- **Template only**, not BOSS's own `mentor-business` instance — BOSS has zero paying users, so its own
|
|
592
|
+
agent stays deliberately lean (Principle #2). Inherits the agent's existing defer-discipline and
|
|
593
|
+
"voice the tension, never filter the menu" humane rule. Connected projects pull it via `/boss-sync`.
|
|
594
|
+
|
|
595
|
+
## 0.72.0 — 2026-06-20
|
|
596
|
+
|
|
597
|
+
- **Per-moment mute + first-run consent — "don't voice it if I don't want it," at the granularity of
|
|
598
|
+
the moment (v0.71.0 conscience-voicing follow-on).** Pause (v0.23) silenced the *whole* conscience for
|
|
599
|
+
a bounded session; this adds the surgical half the founder can't get from pause:
|
|
600
|
+
- **`boss conscience mute <moment> [--for 7d | --until-resume] [--reason]`** + **`unmute <moment>`
|
|
601
|
+
/ `--all`** — turn down ONE moment (drift, caution, capture, focus…) while the rest keep speaking.
|
|
602
|
+
Hook-enforced (the conscience filters muted signals after detection, then exits silent if nothing's
|
|
603
|
+
left — same shape as pause). **Auto-unmutes on expiry**, the per-moment twin of pause's silent
|
|
604
|
+
auto-resume. Stored under a separate `conscienceMutes` key in `.boss/config.json` so pause/resume
|
|
605
|
+
(which overwrite `cfg.conscience`) can never clobber a mute — the two controls are orthogonal by
|
|
606
|
+
construction. Moment names validate against the project's actual loops (typo → the available list).
|
|
607
|
+
- **`boss conscience status` now surfaces live mutes** (like it surfaces an active pause), so a
|
|
608
|
+
forgotten mute can't silently swallow a moment forever; the over-fire-smell hint in
|
|
609
|
+
`boss conscience activity` now points at mute as the surgical alternative to pause.
|
|
610
|
+
- **First-run consent moment** — `/welcome` now introduces the moments as a set and names all three
|
|
611
|
+
controls (pause / mute / override) *before* any fires, cohort-aware (full tour gets the three-control
|
|
612
|
+
walk; the 30-second version gets pause + mute in one breath). The founder meets the conscience and
|
|
613
|
+
learns they can dial each moment, rather than discovering the controls only after being nudged.
|
|
614
|
+
- This operationalizes the `conscience-voicing` practice's consent boundary: all current hook moments
|
|
615
|
+
are *self-regarding* (about the founder's own venture discipline), so a flat per-moment mute is the
|
|
616
|
+
correct, fully-honored control. Zero-dep; single source (the L0 hook + runtime feed both the hook and
|
|
617
|
+
the `boss conscience` CLI).
|
|
618
|
+
|
|
619
|
+
## 0.71.0 — 2026-06-20
|
|
620
|
+
|
|
621
|
+
- **Conscience voicing — name the tension, never filter the menu.** Closes a paternalism seam found
|
|
622
|
+
while auditing what BOSS recommends to founders: the conscience is *suggestive*, but in a few places
|
|
623
|
+
it had drifted toward *withholding*. Three moves, one principle (a conscience makes a cost **visible**;
|
|
624
|
+
a censor makes a choice **unavailable**):
|
|
625
|
+
- **`mentor-business` (V1 template) gained a second axis.** The model menu was structure/licensing
|
|
626
|
+
only (OSS, open-core, patronage, cohort, SaaS…); it now also names the **metering basis** the AI era
|
|
627
|
+
runs on — per-seat → usage → hybrid → outcome → service-as-software → agent-to-agent — each with its
|
|
628
|
+
humane tension as an *overridable note*. Founders see the full menu including the models BOSS is wary
|
|
629
|
+
of; omitting them "to protect" is itself a dignity cost. New rule: **voice the tension once, then
|
|
630
|
+
yield.** (Re-opens **RVW-023** NOT-YET → **ADAPT**: per-seat assumptions are now demonstrably stale —
|
|
631
|
+
Intercom/Zendesk/SAP/Adobe live — and the verdict had conflated *should BOSS adopt this for itself*
|
|
632
|
+
(still NOT-YET) with *should BOSS tell founders it exists* (yes).)
|
|
633
|
+
- **`mentor-humane` override authority clarified — over other mentors, never over the founder.** The
|
|
634
|
+
"lens is non-negotiable" language was the one place the conscience read like a censor. Reframed: the
|
|
635
|
+
lens makes a harm *un-ignorable*, not a choice *unavailable*. Encodes the consent boundary —
|
|
636
|
+
**self-regarding** tension (the founder's own venture) is fully muteable; **third-party harm**
|
|
637
|
+
(someone not in the room) is named once even if unwelcome, because the harmed party never consented
|
|
638
|
+
to being silenced. Always *name*; never *override*.
|
|
639
|
+
- **New library practice `conscience-voicing.md`** (UP via `/boss-learn`) — the inheritable spine:
|
|
640
|
+
the conscience-vs-censor line, the 7-habit craft of voicing concern without blocking, the
|
|
641
|
+
consent-boundary table, where it applies (hook moments, every mentor, `/vet`, any menu), and the
|
|
642
|
+
existing machinery (`conscience pause`, `relationship.md`) to build on rather than reinvent.
|
|
643
|
+
|
|
644
|
+
## 0.70.0 — 2026-06-20
|
|
645
|
+
|
|
646
|
+
- **Wayfinding-drift check — the de-rot pass becomes a standing guard (IDEA-035, built).** v0.68 was a
|
|
647
|
+
*manual* catch-up after the hand-authored prose lagged 20 releases of new capability. The generated
|
|
648
|
+
surfaces (`boss map`, `CHEATSHEET.md`, `SKILLS.md`) can't rot — they're rebuilt from `src/modes.js`;
|
|
649
|
+
the curated prose can, and nothing flagged it. New dev-only `scripts/check-wayfinding-drift.js`
|
|
650
|
+
(`npm run check:wayfinding`, and a courtesy nudge at the end of `gen:docs`) greps `GUIDE.md` — the one
|
|
651
|
+
prose doc *meant* to walk the whole ladder — against the manifest skill lists and warns on any skill
|
|
652
|
+
named in **no** rung. **The trap, designed around:** README/`/welcome` deliberately don't enumerate
|
|
653
|
+
skills (they point at `boss map`), so the check guards *only* `GUIDE.md`, never blanket coverage —
|
|
654
|
+
and it **nudges, never blocks** (exit 0 always; a drift check that fails a commit is the ceremony BOSS
|
|
655
|
+
refuses). Internal/meta skills carry a *printed* exempt list (`boss-learn`, `design-tokens-init`,
|
|
656
|
+
`extract`, `drift-deep`), never silent. Cleared the drift it found on first run: `GUIDE.md` now names
|
|
657
|
+
`/import` + `/cost-review` in their rungs and `/boss-sync` + `/feedback` as standing utilities.
|
|
658
|
+
Dev-only (not shipped in the package); the founder-facing `boss sync` generalization stays a NOT-YET
|
|
659
|
+
UP candidate (PRINCIPLE #2). Zero-dep.
|
|
660
|
+
|
|
661
|
+
## 0.69.0 — 2026-06-20
|
|
662
|
+
|
|
663
|
+
- **`shipped_on:` — a true date-windowed Shipped archive (IDEA-034 follow-on).** The board's Shipped
|
|
664
|
+
column was bounded only by a recent-*count* cap (v0.66). This adds the *date* half the founder asked
|
|
665
|
+
for: stamp `shipped_on: <date>` when a FEAT ships (in `/spec`'s lifecycle note, alongside
|
|
666
|
+
`building_since:`), and `boss board` folds any ship older than ~30 days into the "+N shipped earlier"
|
|
667
|
+
`<details>` — so the column shows what landed *lately*, not every ship forever. **Frontmatter-true,
|
|
668
|
+
never guessed:** no `shipped_on:` → graceful fallback to the count cap (legacy ships still bounded).
|
|
669
|
+
Shipped now sorts newest-first by ship date (dated ahead of undated, then id). `--all` still reveals
|
|
670
|
+
everything; `--json` carries `archived` + `shippedAgeDays`. Zero-dep; verified in `/tmp` (a 30+-day
|
|
671
|
+
ship archives, recent ones show, undated legacy falls back to the cap; HTML folds the same).
|
|
672
|
+
|
|
673
|
+
## 0.68.0 — 2026-06-20
|
|
674
|
+
|
|
675
|
+
- **De-rot pass — the hand-authored wayfinding caught up to 20 releases of new capability (IDEA-018).**
|
|
676
|
+
The generated surfaces (`boss map`, `CHEATSHEET.md`, `SKILLS.md`) stayed current automatically; the
|
|
677
|
+
*prose* a founder actually reads had drifted — the recurrence of the exact "19-release README drift"
|
|
678
|
+
IDEA-018 was built to catch. Surgical, scope-correct fixes (each surface names what fits *its* level,
|
|
679
|
+
and still points at `boss map` for the live list — no re-enumerating):
|
|
680
|
+
- **README** — the install flow now shows `/prototype` ("hit go") and, crucially, **`boss adopt`** for
|
|
681
|
+
an already-started repo (a stranger would otherwise read BOSS as greenfield-only — the exact gap
|
|
682
|
+
IDEA-005 closed). `--ai` mentioned.
|
|
683
|
+
- **`/welcome`** (the first-run orientation) — `/prototype` is now offered as a third path ("if you'd
|
|
684
|
+
rather *see* the idea than describe it"), and `/persona` is in the skill list.
|
|
685
|
+
- **L0 `CLAUDE.md`** — `/comprehend` added to the skills line.
|
|
686
|
+
- **`docs/GUIDE.md`** — the rung-by-rung walk now includes `/prototype` + `/persona` (Quickstart),
|
|
687
|
+
`/revalidate` + `/judge-traces` + `/red-team` + `/consult` (MVP), and a new "you already started —
|
|
688
|
+
there's a repo" entry for `boss adopt`.
|
|
689
|
+
No code change; eval 113/0. The lesson re-learned: generated wayfinding is drift-proof; curated prose
|
|
690
|
+
needs a deliberate de-rot pass after a big build run — this was it.
|
|
691
|
+
|
|
692
|
+
## 0.67.0 — 2026-06-20
|
|
693
|
+
|
|
694
|
+
- **The `/vet` sweep's six ADAPTs, routed via `/boss-learn` (RVW-015→026).** A 12-claim skeptical sweep
|
|
695
|
+
of the research inbox (2 REJECT · 4 NOT-YET · 6 ADAPT · 0 ADOPT — a good skeptic's spread) produced six
|
|
696
|
+
scoped adaptations, now landed. The headline two sharpen what just shipped:
|
|
697
|
+
- **Outcome ledger (RVW-021) — the humane alternative to a notification cap.** `boss conscience
|
|
698
|
+
activity` now reads `.boss/brain/relationship.md` and reports an **acted-on rate** ("75% of nudges
|
|
699
|
+
landed or were engaged · N landed · N overrode · N ignored"). A persistently-low acted-on rate is the
|
|
700
|
+
*real* over-fire smell — better than a raw count, and it **never muzzles a load-bearing warning** the
|
|
701
|
+
way an arbitrary daily cap would (BOSS's fires are predicate-earned, not engagement push). **Completes
|
|
702
|
+
IDEA-013's deferred self-throttle by outcome, not by fiat.**
|
|
703
|
+
- **Brain staleness is a write-side job (RVW-026).** `/close`'s brain-write step now re-checks the
|
|
704
|
+
standing summary *before* appending and **revises/retires stale claims** — "the brain evolves, it
|
|
705
|
+
doesn't just accrete; the most dangerous brain cites yesterday's truth with today's confidence."
|
|
706
|
+
Hardens the v0.65 venture brain.
|
|
707
|
+
- Four lighter skill sharpenings: **`/prototype`** gains the build-to-learn / build-to-earn frame
|
|
708
|
+
(Cagan; RVW-016); **`/canvas`** gains a scoped humane *build-or-buy?* cell for tool-shaped ideas
|
|
709
|
+
(Fried; RVW-018); **`/spec`** gains a delegation line — *what will you verify + what's out of the
|
|
710
|
+
agent's authority* (Mollick's checklist kernel, minus the "know what good looks like" platitude;
|
|
711
|
+
RVW-020); **`/evals`** gains an AISI-Inspect pointer for trajectory eval (RVW-025; the principle was
|
|
712
|
+
already there). The 2 REJECTs (AI-runs-your-interviews → fails #6; single-strong-agent → confirms
|
|
713
|
+
IDEA-028) and 4 NOT-YETs (constrained-decoding, MCP-publishing, OTel-GenAI, outcome-pricing) are
|
|
714
|
+
recorded with re-open conditions. eval **113/0** + GRADED 24 (the conscience change is to the
|
|
715
|
+
*activity readout*, not the hook firing). Research hygiene: inbox cleared to a clean drop-zone, 12
|
|
716
|
+
new verdicts written, sources archived to `reviewed/`.
|
|
717
|
+
|
|
718
|
+
## 0.66.0 — 2026-06-20
|
|
719
|
+
|
|
720
|
+
- **Board intelligence — the board stops being a mirror and becomes something you (and the agent)
|
|
721
|
+
steer by (IDEA-034).** The board was already ahead of most "AI board" advice — it's a *pure
|
|
722
|
+
projection* of `status:` frontmatter, never a maintained doc (IDEA-015). So this pass doesn't add
|
|
723
|
+
richer kanban (drag-drop, swimlanes, story points, a `board.json` are all **refused** — each
|
|
724
|
+
reintroduces a second source of truth or premature ceremony); it makes the projection answer harder
|
|
725
|
+
questions and feed the conscience. Four tracks:
|
|
726
|
+
- **A — Agent-readable board.** `boss board --next` (an ordered "what to pick up," finish-before-you-
|
|
727
|
+
start), `--blocked` (everything not moving — blocked + aging + review-due, in one place), and
|
|
728
|
+
`--json` (the machine-readable projection — the actual agent-readability contract; the agent reads
|
|
729
|
+
the board as task-queue instead of re-deriving state). CLI-level, every mode; a lighter cousin of
|
|
730
|
+
the V1 `/board` skill.
|
|
731
|
+
- **B — Time-in-build aging + a bounded Shipped column.** A FEAT that's sat in Building past ~3 weeks
|
|
732
|
+
now flags `⌛ Nw in build` (the zombie-feature smell `/revalidate` targets) — **frontmatter-true,
|
|
733
|
+
never guessed**: `/spec` stamps `building_since:` when it sets `status: building`; no date → no
|
|
734
|
+
flag. Building sorts longest-running-first (finish what's been open longest). The otherwise-unbounded
|
|
735
|
+
Shipped column caps to the most recent few (`--all` / a `<details>` expander to see the rest).
|
|
736
|
+
- **C — Honest flow in `boss insights`.** Idea→build cycle time (median, from recorded `created:`
|
|
737
|
+
dates only — omitted, never guessed, when absent). Loop-closure cycle time, **never throughput /
|
|
738
|
+
velocity** (the vanity metric BOSS refuses to expose).
|
|
739
|
+
- **D — Board → conscience `focus` moment.** ≥4 FEATs in Building with nothing Shipped opens a new
|
|
740
|
+
`focus-loop` (L1-mvp): the "stop starting, start finishing" smell. Judge-style (the model reads the
|
|
741
|
+
board and distinguishes scattered abandonment from honest parallel work before voicing), at most
|
|
742
|
+
once per session, **never a gate** — and it auto-silences the moment anything ships (exit = ≥1
|
|
743
|
+
shipped). Conservative threshold + auto-silence are the over-fire guards.
|
|
744
|
+
- **E — Lightweight priority.** Optional `priority: high` in FEAT/IDEA frontmatter floats a card to
|
|
745
|
+
the top of its column (a `⬆` marker) and leads `--next` — **one level, frontmatter-true, never a
|
|
746
|
+
drag-to-reorder.** Deliberately no P0/P1/P2 ladder (that turns the board into a planning surface you
|
|
747
|
+
tend instead of ship); the caveat — *re-prioritizing isn't progress; finishing is* — ships in
|
|
748
|
+
`/spec`. Priority is the one explicit ordering signal layered over the default finish-first sort.
|
|
749
|
+
- **HTML kanban — a real visual refresh, then a sharpening pass** (the founder asked if it could
|
|
750
|
+
"feel more legit," then for stronger hierarchy): owned accent + signature dot + uppercase kicker,
|
|
751
|
+
count pills, card depth/hover, **bold titles with a quiet monospace id**, **tinted backgrounds so
|
|
752
|
+
aging/blocked/review cards pull the eye first**, a priority pill, the Shipped `<details>` fold —
|
|
753
|
+
calm and crafted, still zero-dep / single-file / light-dark (not a startup-bro dashboard).
|
|
754
|
+
- **Discipline held:** zero-dep; the model owns judgment, the CLI owns the projection. Conscience gate
|
|
755
|
+
**113/0** (105 + 8 new `moment-focus` cases) and the 24 GRADED judgment evals stay green; verified
|
|
756
|
+
end-to-end in `/tmp`.
|
|
757
|
+
|
|
758
|
+
## 0.65.0 — 2026-06-20
|
|
759
|
+
|
|
760
|
+
- **`relationship.md` — the venture brain's missing half: the conscience learns whether its nudges
|
|
761
|
+
land (IDEA-022 / FEAT-022).** The architecture designed two model-owned files — `read.md` (the POV,
|
|
762
|
+
shipped) **and** `relationship.md` (what the conscience *said* and what the founder *did* with it).
|
|
763
|
+
Only the read existed; this builds the relationship log, which closes the loop the frequency ledger
|
|
764
|
+
(IDEA-013) only *counts*: did the nudge **land**?
|
|
765
|
+
- **`/close` writes it** — *only when the conscience actually fired this session* — a dated entry:
|
|
766
|
+
what was flagged + what the founder did, tagged honestly (*landed* / *ignored* / *overrode, with the
|
|
767
|
+
reason* / *pushed back and was right* — the last being the most valuable: it's how the conscience
|
|
768
|
+
learns to fire better). The must-nots carry over: it logs the conscience's *own* hit rate, never
|
|
769
|
+
scores the founder.
|
|
770
|
+
- **The conscience reads it back to calibrate (the payoff).** When a moment fires, the hook hands the
|
|
771
|
+
model a **bounded** slice of the recent log (last ~1-2 sessions) so it *adjusts instead of repeats*:
|
|
772
|
+
if it's raised a point and the founder moved past it for a stated reason, it says it lighter or stays
|
|
773
|
+
silent; if a past nudge landed, it builds on it. This is what makes the conscience feel like it
|
|
774
|
+
*remembers the conversation*, not just the venture.
|
|
775
|
+
- **CLI:** `boss brain --relationship` (view the log), `boss brain record --kind relationship`
|
|
776
|
+
(stamp it; the index now carries `kind` to distinguish read vs relationship), `boss brain` surfaces
|
|
777
|
+
a one-line pointer, and `boss brain forget --before <date>` prunes **both** files symmetrically
|
|
778
|
+
(living memory across both).
|
|
779
|
+
- **Cost + safety held:** read only when a moment is firing, bounded (~900 chars), **byte-identical
|
|
780
|
+
when no log exists** → 105 gate + 24 GRADED judgment evals stay green (verified). Zero-dep; the
|
|
781
|
+
model owns the prose, the CLI owns the index. **FEAT-022 (the venture brain) is now complete —
|
|
782
|
+
read + relationship + index + living memory.**
|
|
783
|
+
|
|
784
|
+
## 0.64.0 — 2026-06-20
|
|
785
|
+
|
|
786
|
+
- **AI-native scaffolder — `--ai` + `/comprehend` (IDEA-022 Track 3; the last track, most guarded).**
|
|
787
|
+
Scaffold from what BOSS *understands*, not just a fixed template copy — built exactly to the
|
|
788
|
+
guardrail: **additive, behind a flag, the deterministic template stays the default.** `boss new --ai`
|
|
789
|
+
/ `boss adopt --ai` do the normal reversible scaffold, then set `aiNative` in `.boss/config.json` and
|
|
790
|
+
point at a new model-driven **`/comprehend`** skill (L0). The CLI never calls a model (zero-dep, layer
|
|
791
|
+
1) — the comprehension is the skill's job (same predicate/runner split as `/import`). `/comprehend`
|
|
792
|
+
reads what BOSS can honestly understand (an adopted repo with the wide context · the captured idea +
|
|
793
|
+
`docs/source/` · or nothing-yet → it says so and stops), then **non-destructively**: fills the
|
|
794
|
+
`AGENTS.md` overview with a real read, **seeds the venture brain** with an honest first dated read
|
|
795
|
+
(so the conscience has continuity from day one — connects Track 3 → Tracks 0/4) + stamps the index,
|
|
796
|
+
and **recommends** (never auto-applies) the disciplines that fit (`/ai-first-init`, `/design-tokens-init`,
|
|
797
|
+
`secrets-guard`+`/red-team`, `/persona`). **The guardrail is in the skill itself:** everything is a
|
|
798
|
+
plain-text, diffable, revertable write in the working tree — *"if it can't be diffed, it doesn't
|
|
799
|
+
ship"*; it never rewrites the deterministic scaffold. Verified: `--ai` sets the flag + surfaces
|
|
800
|
+
`/comprehend` (and plain `boss new` is unchanged, `aiNative:false`); eval 105/0, GRADED 24.
|
|
801
|
+
**IDEA-022 is now complete — all four tracks + the spine shipped.** (The fuller presence/identity
|
|
802
|
+
design + the proactive presence-moment stay deliberately deferred: a new unprompted trigger is the
|
|
803
|
+
over-fire risk the conscience guards against; the v0.63 voicing *is* the presence.)
|
|
804
|
+
|
|
805
|
+
## 0.63.0 — 2026-06-20
|
|
806
|
+
|
|
807
|
+
- **The conscience now voices *with* the venture brain (IDEA-022 Track 4 — "the brain, voiced").** The
|
|
808
|
+
spine (Track 0) gave the brain a read; this makes the conscience **speak with it**. When a moment
|
|
809
|
+
fires, the hook reads a **bounded** slice of `.boss/brain/read.md` (the standing summary + the single
|
|
810
|
+
most recent dated read — continuity, not the whole history) and hands it to the model as a
|
|
811
|
+
*Continuity* frame, so the nudge is **specific to what the conscience already understands** — the
|
|
812
|
+
"how did it know that" that earns trust ("you've rebuilt onboarding three times and still haven't
|
|
813
|
+
talked to a user") instead of a generic line. The instruction is explicit: voice *with* the read,
|
|
814
|
+
don't read it back as fact, and **trust what you see now over the brain** (the founder can correct
|
|
815
|
+
it). **Cost + safety held:** the brain is read **only when a moment is already firing** (past the
|
|
816
|
+
silent early-exit — never every prompt), bounded to ~1400 chars, and **byte-identical output when no
|
|
817
|
+
brain exists** — so the **105 gate evals + 24 GRADED judgment evals stay green** (verified; the
|
|
818
|
+
judgment fixtures have no brain, so they're unaffected). The proactive "presence moment" (the
|
|
819
|
+
conscience surfacing its read unprompted) is deliberately **not** built — a new always-on trigger is
|
|
820
|
+
the over-fire risk the conscience itself guards against; the voicing *is* the presence. **IDEA-022:
|
|
821
|
+
Tracks 0, 1, 2, 4 now shipped; Track 3 (AI-native scaffolder, flag-guarded) is the last.**
|
|
822
|
+
|
|
823
|
+
## 0.62.0 — 2026-06-20
|
|
824
|
+
|
|
825
|
+
- **`/red-team` — turn BOSS's defenses into evidence (IDEA-033 #3; the "defense → measured" Anthropic
|
|
826
|
+
move).** `agent-security` is *prevention* (the deny-list floor, secrets-guard ceiling, Rule of Two);
|
|
827
|
+
`/red-team` is *proof*. A new **L1 skill** that adversarially tests an AI-mediated FEAT (or BOSS's own
|
|
828
|
+
conscience hook, `--self`) against the **OWASP 2025 LLM Top 10** — prompt injection (direct +
|
|
829
|
+
indirect), sensitive-info disclosure, improper output handling, excessive agency, system-prompt
|
|
830
|
+
leakage, vector weaknesses, misinformation, unbounded consumption, supply chain, data poisoning. Each
|
|
831
|
+
category gets a **binary pass/fail + the attack that proved it**; **failures become `/evals` cases**
|
|
832
|
+
(defense → test → regression-proof), pairing with `/evals` (correctness) and the agent-security
|
|
833
|
+
practice (prevention). Cohort-aware (domain-expert gets the full battery + escalation route;
|
|
834
|
+
first-product gets the high-value subset in plain language). Honest scope line every run: red-teaming
|
|
835
|
+
lowers risk, it doesn't certify safety; the deterministic deny-list floor stays the load-bearing
|
|
836
|
+
prevention. Registered in L1; `boss map` + cheatsheet updated; eval 105/0.
|
|
837
|
+
|
|
838
|
+
## 0.61.0 — 2026-06-20
|
|
839
|
+
|
|
840
|
+
- **Scouted skillsmp.com; routed two Anthropic skills through `/vet` → `/boss-learn` (dogfood).** Five
|
|
841
|
+
marketplace skills reviewed against BOSS's own machinery; the on-principle move was to *vet, not
|
|
842
|
+
hand-absorb*. Two earned an ADAPT, three did not. **What landed:**
|
|
843
|
+
- **`library/practices/skill-authoring.md`** (new) — from Anthropic's `skill-creator` ([RVW-013](../docs/research/verdicts/RVW-013-skill-creator-authoring-discipline.md),
|
|
844
|
+
ADAPT). Fills a real void: BOSS authors skills as its core motion but had *no* written authoring
|
|
845
|
+
discipline. Captures the three transferable principles — explanatory-over-prescriptive (the
|
|
846
|
+
IDEA-014 / Principle #2 stance applied to *how we write skills*), progressive disclosure, and
|
|
847
|
+
descriptions that earn their triggers — plus a ship-time self-check. The heavy with/without
|
|
848
|
+
**eval-harness is deliberately left out** (duplicates `/vet` + `conscience-evals/`); deferred to IDEA-033.
|
|
849
|
+
- **`design-system.md` → "Aesthetic ambition — past the slop default"** — from Anthropic's
|
|
850
|
+
`frontend-design` ([RVW-014](../docs/research/verdicts/RVW-014-frontend-design-aesthetic-ambition.md),
|
|
851
|
+
ADAPT). The practice owned the *discipline* axis (tokens, the 47 blues, missing states) but was
|
|
852
|
+
silent on the *taste* axis. Adds the anti-AI-slop stance, the five aesthetic dimensions, and a
|
|
853
|
+
one-paragraph design-thinking pre-pass — **bounded by BOSS's restraint** (a11y + five states + perf
|
|
854
|
+
are floors; minimalism is the safer default for a green founder, against the source's maximalist lean).
|
|
855
|
+
- **Three rejected, recorded:** `ui-ux-pro-max` (checklist mined into IDEA-033; CLI+DB machinery
|
|
856
|
+
rejected — zero-dep ethos), obra/superpowers `brainstorming` (its "every project, no exceptions"
|
|
857
|
+
is the literal anti-thesis of BOSS's JIT bet; two micro-techniques harvested to IDEA-033),
|
|
858
|
+
`code-reviewer` (already dominated by Claude Code's own `/code-review`).
|
|
859
|
+
- IDEA-033 backlog extended (items 6–8: skill-eval harness, UI/UX pre-delivery checklist, `/spec`+`/consult`
|
|
860
|
+
question-discipline audit) — each earn-it-gated, not green-lit.
|
|
861
|
+
|
|
862
|
+
## 0.60.0 — 2026-06-20
|
|
863
|
+
|
|
864
|
+
- **`docs/PATTERNS.md` — the patterns writeup (the packaging "cool" move, documented).** A public-facing,
|
|
865
|
+
builder-audience doc that names the engineering patterns BOSS is built on, **framed in Anthropic's own
|
|
866
|
+
2026 vocabulary with real numbers** — the highest-resonance, lowest-effort move from the Anthropic-appeal
|
|
867
|
+
research (it's *packaging* what BOSS already has, not new engineering):
|
|
868
|
+
- **The conscience separates the doer from the judge** (their #1 2026 motif) — a deterministic
|
|
869
|
+
`UserPromptSubmit` hook (438 lines, **zero model calls of its own**) that gates, then hands the model
|
|
870
|
+
a *bounded* read in a *fresh* context. Unprompted + isolated.
|
|
871
|
+
- **Two eval surfaces, with numbers:** 105 deterministic gate cases / 0 failures + **24 GRADED**
|
|
872
|
+
LLM-judge cases (separate pass, transcripts read).
|
|
873
|
+
- **Progressive-disclosure skills:** 29 skills, **~1.7k tokens avg** (< Anthropic's 5k guidance),
|
|
874
|
+
loading JIT (~100-token description until invoked).
|
|
875
|
+
- Dormant-by-default hooks + frequency-not-tokens cost ledger; security lineage (deny floor →
|
|
876
|
+
secrets-guard ceiling → Rule-of-Two) stated honestly; AGENTS.md portability; persona-as-both.
|
|
877
|
+
- **The honest limits are not buried:** zero real founders (its own canvas's 100%-risk), the conscience
|
|
878
|
+
is Claude-bound, a synthetic judge is still synthetic. (Anthropic rewards honest framing over theater.)
|
|
879
|
+
Linked from the README ("building agent tooling yourself?"). Ships (tracked doc). No behavior change.
|
|
880
|
+
|
|
881
|
+
## 0.59.0 — 2026-06-20
|
|
882
|
+
|
|
883
|
+
- **Venture-brain living memory — the write/evict side (IDEA-022 Track 0; the research's #1 capability
|
|
884
|
+
gap).** The brain spine (`boss brain` + `record`, model-owned `read.md` + CLI-owned `index.json`)
|
|
885
|
+
had the *read* side; this adds the **write/evict** side the 2026 gaps research named as the biggest
|
|
886
|
+
upgrade (Anthropic's memory tool + context editing: +39% / 84% fewer tokens) — and which `brain.js`
|
|
887
|
+
itself named as next. **Living memory ≠ infinite memory:**
|
|
888
|
+
- **`boss brain --diff`** — the read's *evolution* (date + headline per session, from the index):
|
|
889
|
+
continuity made visible without dumping the whole prose.
|
|
890
|
+
- **`boss brain forget --before <date>`** (or `--id <bN>`) — the **evict** side: drops dated reads
|
|
891
|
+
older than the date from `read.md` + prunes matching index entries, **keeping the standing summary
|
|
892
|
+
(preamble) + recent reads**. Founder-invoked, *never automatic* — it's an opinion about a person,
|
|
893
|
+
so only the human prunes it (on-principle).
|
|
894
|
+
- **Recency-window gate** (8 sessions) — `boss brain` nudges toward compression/eviction when the
|
|
895
|
+
read gets long, so the always-loaded surface stays lean (the bloat the conscience itself warns
|
|
896
|
+
against).
|
|
897
|
+
- **`/close` pairs the model side:** when the read spans many sessions, fold the oldest reads'
|
|
898
|
+
lasting conclusions into the standing summary at the top, drop the verbatim old blocks. The
|
|
899
|
+
standing summary survives; dated blocks are working history that ages out.
|
|
900
|
+
Additive to the parallel-session spine (no existing behavior changed); zero-dep, format-based block
|
|
901
|
+
handling (CLI owns boundaries, model owns content); verified in `/tmp` (diff, evict-preserving-
|
|
902
|
+
preamble, gate); eval 105/0. **IDEA-022: Tracks 1+2 + the Track-0 spine's living-memory increment now
|
|
903
|
+
shipped; Track 4 (fuller voicing) + Track 3 (AI-native scaffolder) remain.**
|
|
904
|
+
|
|
905
|
+
## 0.58.0 — 2026-06-20
|
|
906
|
+
|
|
907
|
+
- **Scaffold `AGENTS.md` — host-neutral, fixes a self-contradiction (IDEA-032).** The clearest cheapest
|
|
908
|
+
miss from the 2026 gaps research: BOSS scaffolded **Claude-only `CLAUDE.md`**, locking every venture
|
|
909
|
+
to one host — against its own [IDEA-006](../docs/ideas/IDEA-006-conscience-host-portability.md)
|
|
910
|
+
host-portability principle + Principle #5 (optionality). Fixed the way **Anthropic's own docs
|
|
911
|
+
recommend** (verified via `code.claude.com/docs/en/memory`): `AGENTS.md` carries the host-neutral
|
|
912
|
+
working rules + conventions (read directly by Codex/Cursor/Copilot/Devin/…); `CLAUDE.md` is a thin
|
|
913
|
+
Claude-specific layer that **imports it via `@AGENTS.md`** (loads into context at launch — no
|
|
914
|
+
duplication, no drift) and adds the skills/conscience/mode-ladder below. `boss new` emits both;
|
|
915
|
+
**`boss adopt` handles every case non-destructively** — bare repo → both copied; existing CLAUDE.md
|
|
916
|
+
→ preserved + an adopt block that `@AGENTS.md`-imports the (now-present) rules; existing AGENTS.md →
|
|
917
|
+
preserved + BOSS's working rules appended as a marked block. New reusable `appendMarkedBlock` helper.
|
|
918
|
+
Verified all four cases in `/tmp`; eval 105/0; `npm pack` ships `AGENTS.md`. **A concrete
|
|
919
|
+
down-payment on IDEA-006 + an Anthropic-appeal signal (portability is the property they evangelize).**
|
|
920
|
+
|
|
921
|
+
## 0.57.0 — 2026-06-20
|
|
922
|
+
|
|
923
|
+
- **`/consult` — convene the mentor board on a cross-cutting question (IDEA-022 Track 2).** Some
|
|
924
|
+
decisions don't belong to one mentor (raise-vs-bootstrap touches fundraising + business + GTM +
|
|
925
|
+
humane at once). A new **L1 skill** that orchestrates the individual `mentor-*` agents: route the
|
|
926
|
+
question to **only the mentors with a stake** (read the installed board from `.boss/manifest.json` —
|
|
927
|
+
it grows by mode), get each take **in its own lens** (grounded in the canvas/RESUME/FEAT, pushback
|
|
928
|
+
included), then **synthesize with the disagreement kept visible** — *where seasoned advisors disagree
|
|
929
|
+
is where the real decision lives; never average them into mush.* `mentor-humane` keeps its standing
|
|
930
|
+
**override authority**; the synthesis ties back to the canvas's riskiest assumption and **hands the
|
|
931
|
+
call back to the founder** (advisory, never a gate; record which lens you followed + why). Reads/
|
|
932
|
+
writes the venture brain (`.boss/brain/`) when present (IDEA-022). Registered in L1; `boss map` +
|
|
933
|
+
cheatsheet updated; eval 105/0. **IDEA-022 progress:** Track 1 (`boss adopt`, v0.56) + Track 2
|
|
934
|
+
(`/consult`, v0.57) now shipped; Tracks 3-4 (AI-native scaffolder, venture-brain voicing) remain.
|
|
935
|
+
|
|
936
|
+
## 0.56.0 — 2026-06-20
|
|
937
|
+
|
|
938
|
+
- **`boss adopt` — bring BOSS into an already-started repo, non-destructively (IDEA-005; Track 1 of
|
|
939
|
+
IDEA-022).** The largest realistic adoption path — most founders have a repo *before* they hear about
|
|
940
|
+
BOSS — and the one people kept asking for. `boss adopt [--mode <m>]` in an existing dir: copies only
|
|
941
|
+
what doesn't collide (a new `cpSafe`/`applyStageSafe` — **never `cpSync`'s clobber**), merges the
|
|
942
|
+
conscience-hook registration into an existing `settings.json` *additively* (the founder's permissions
|
|
943
|
+
+ their own hooks preserved), appends a small marked block to an existing `CLAUDE.md` (or copies the
|
|
944
|
+
template's if there is none), stamps `.boss/` as **not-self-hosted + adopted**, and registers it.
|
|
945
|
+
**"Lite BOSS" is the design, not a fallback (Principle 2):** defaults to the lightest register
|
|
946
|
+
(Quickstart); `--mode mvp` adopts the *full chain* (lays down Quickstart's foundation too, exactly as
|
|
947
|
+
`boss new` + `boss unlock mvp` would) for a brownfield app that's already earned it. Idempotent
|
|
948
|
+
(refuses re-adopt → points at `boss sync` / `boss unlock`). Once adopted it's a normal registered
|
|
949
|
+
project on the usual sync loop — no special-casing. Verified end-to-end in `/tmp`: a real repo's
|
|
950
|
+
`CLAUDE.md`, `settings.json` (custom permission + `Stop` hook), `README`, and `src/` all preserved;
|
|
951
|
+
conscience loops + hook land so it can fire; `boss map` works; `--mode mvp` lays L0+L1. eval 105/0.
|
|
952
|
+
Wired into `boss --help`. **This is also IDEA-022 Track 1** — the living conscience needs a venture
|
|
953
|
+
to read, and adopt is how an existing one gets a venture brain.
|
|
954
|
+
|
|
955
|
+
## 0.55.0 — 2026-06-20
|
|
956
|
+
|
|
957
|
+
- **`/persona` — your app's target-user as a consultable agent voice (IDEA-031).** Occasioned by
|
|
958
|
+
Ajesh: *"if I wanna build an app for moms to track chores, the first persona is moms… we can do a
|
|
959
|
+
Q&A with the builder, online research, or a UX researcher drops their research in… the app uses it
|
|
960
|
+
as an agent voice to guide product decisions"* — *"its also the QA, its both right?"* A new **L0
|
|
961
|
+
skill**: **derive** the primary target-user persona from the idea → `docs/personas/<slug>.md`
|
|
962
|
+
(who · context · jobs · pains · values · *what we don't know yet* · synthetic/real evidence ledger);
|
|
963
|
+
**enrich** from four sources (builder Q&A · `deep-research` online · drop-in real research via
|
|
964
|
+
`/import` · passive read of idea/canvas); **consult** in voice in **both directions** — *guidance*
|
|
965
|
+
("would she want X?") and *QA* (Husain-discipline structured reactions on a build, comparable across
|
|
966
|
+
versions). **The discipline is the product:** every consult is framed as a *pre-filter, never
|
|
967
|
+
validation* — balances interest with concerns, names its blind spots, and closes with the
|
|
968
|
+
go-ask-a-real-one caveat (Fitzpatrick/Mom Test); synthetic shrinks as real grows, visibly. Reuses
|
|
969
|
+
[IDEA-009](../docs/ideas/IDEA-009-proto-personas-as-evolving-instruments.md)'s evolving-instrument
|
|
970
|
+
methodology pointed at the *founder's* users (vs. BOSS's internal cohort instruments). Registered in
|
|
971
|
+
L0; `boss map` + cheatsheet updated; eval 105/0.
|
|
972
|
+
- **`/prototype` refined by its own persona-reactions pass (IDEA-009 instrument working).** Ran the 4
|
|
973
|
+
BOSS-internal personas on the new features (`/prototype`, kanban, upstream conscience); the
|
|
974
|
+
convergences drove real fixes: (1) the after-run nudge now leads with a **concrete plain-language
|
|
975
|
+
action** and lets *"I don't know yet"* be a fine answer (beginners bounced on `/canvas` /
|
|
976
|
+
"pressure-test" jargon); (2) the **5-token pass is skipped on throwaway sketches** by default
|
|
977
|
+
(eng-builder caught it contradicting "tangible beats pretty"); (3) **`--stack=<x>`** is now a
|
|
978
|
+
first-class option + the stack pick is narrated in plain words for beginners; (4) trimmed the
|
|
979
|
+
defensive over-explanation (dropped the "not vibe coding" protest; sketch-vs-MVP named once; the
|
|
980
|
+
"becomes the MVP" rule now names the *real* failure — bolting auth onto throwaway code). The
|
|
981
|
+
build-integrated eval channel caught design issues before a real founder hit them — first evidence
|
|
982
|
+
for IDEA-009's claim #1.
|
|
983
|
+
|
|
984
|
+
## 0.54.0 — 2026-06-20
|
|
985
|
+
|
|
986
|
+
- **Upstream conscience — `/spec` now asks "is it worth building?" not just "is it built right?"
|
|
987
|
+
(IDEA-026 Part A, closes IDEA-026).** The biggest *conceptual* delta from the 2026 leader scan
|
|
988
|
+
(Ng's "PM bottleneck," Appleton's "align before the agent runs"), shipped the small + safe way the
|
|
989
|
+
host-subtraction audit found: **not a new always-on loop, a voice-sharpening of the existing
|
|
990
|
+
`spec-loop` restraint** (which already fires, skill-invoked, when the founder reaches for `/spec`
|
|
991
|
+
before the canvas closes). The restraint frame now surfaces the *substantive* gap — *who is this
|
|
992
|
+
for, and what's the bet that could sink it?* — not a checklist. **Respects `/prototype`:** a
|
|
993
|
+
throwaway sketch needs none of this (build-first is legitimate); the question fires only when
|
|
994
|
+
committing to build *for real*. eval gate 105/0 (skill-voice change, no predicate change).
|
|
995
|
+
- **Positioning reframe — README opener now leads with the judgment gap (option C).** *"Everyone can
|
|
996
|
+
build now. Almost no one can tell a real business from a convincing demo. BOSS is the conscience
|
|
997
|
+
that keeps you honest while you move fast."* + the floor/ceiling subline (*vibe coding gets you a
|
|
998
|
+
demo; the discipline on top gets you a business*). The 2026-vocabulary update to positioning
|
|
999
|
+
(harness-is-the-moat / agentic-engineering-is-the-discipline), in BOSS's voice. Full draft +
|
|
1000
|
+
rationale in the gitignored `docs/dossier/positioning-reframe-2026.md`.
|
|
1001
|
+
|
|
1002
|
+
## 0.53.0 — 2026-06-20
|
|
1003
|
+
|
|
1004
|
+
- **`/judge-traces` — the deliberate reader for the trace substrate (IDEA-025 Phase 2).** v0.48 shipped
|
|
1005
|
+
`auto-log` (collects `.boss/trace.jsonl`); this is the skill that *reads* it — completing a
|
|
1006
|
+
shipped-but-inert capability (a substrate nothing reads has no purpose). A new **L1 skill** applying
|
|
1007
|
+
Hamel/Shankar's 2026 discipline to the founder's *own* sessions: read real traces → factual shape
|
|
1008
|
+
first (cheap, deterministic) → sort failures into a **binary** pass/fail taxonomy (`wrong-files`,
|
|
1009
|
+
`thrash`, `silent-scope-creep`, `no-trace-of-the-point`, or your own) → route *recurring* modes to
|
|
1010
|
+
`/boss-learn`. **One expert not a committee; don't grade its own homework (judge the trajectory);
|
|
1011
|
+
counts are the signal.** Graceful degrade: no trace → honest "nothing to judge yet, turn on
|
|
1012
|
+
`auto-log`," never a fabricated taxonomy. **Collection ≠ judgment** stays load-bearing: `auto-log`
|
|
1013
|
+
collects passively, `/judge-traces` judges deliberately — never fused into an always-on auto-grader.
|
|
1014
|
+
Registered in L1; `boss map` + cheatsheet updated; eval 105/0.
|
|
1015
|
+
- **Host-subtraction audit drafted (IDEA-028) + an IDEA-026 Part A finding (workspace docs).** The
|
|
1016
|
+
`mentor-architect` pass concluded **retire nothing**: BOSS's conscience fires *unprompted/event-driven*,
|
|
1017
|
+
and native `/goal`/`/loop` are *user-invoked* — they can't replace a conscience that speaks when you
|
|
1018
|
+
wouldn't have asked, so the host did *not* absorb the loop runtime (the moat). Sit-on the host for
|
|
1019
|
+
future orchestration (dynamic Workflows) + the secrets ceiling (auto-mode hard-deny); keep the
|
|
1020
|
+
`permissions.deny` floor (more portable). **Bonus finding:** `spec-loop` already implements most of
|
|
1021
|
+
the "upstream conscience" (IDEA-026 Part A) — it fires restraint when the founder reaches for `/spec`
|
|
1022
|
+
without the canvas closed — so Part A is a *voice sharpening of spec-loop*, not a new `worth-building-loop`
|
|
1023
|
+
(which would redundantly overlap spec-loop + caution + drift). Left for Ajesh's eye on the exact
|
|
1024
|
+
voice (it touches conscience tone). No code retired; decision-input only.
|
|
1025
|
+
|
|
1026
|
+
## 0.52.0 — 2026-06-20
|
|
1027
|
+
|
|
1028
|
+
- **`/prototype` — drop an idea, hit go, see something tangible (IDEA-030).** Ajesh's framing settled
|
|
1029
|
+
the design: *"not just vibe coding, but not gatekeeping until so much thought… building first in a
|
|
1030
|
+
lean cycle is a place to start, not waiting until the other two are clear. People can fill in the
|
|
1031
|
+
missing pieces after they get the gist out of their head and see something tangible."* A new **L0
|
|
1032
|
+
skill** that builds the smallest runnable, clickable version of an idea — the ONE core interaction,
|
|
1033
|
+
in whatever stack gets to "click it" fastest, mock data freely, the 5-token distinctiveness pass so
|
|
1034
|
+
it doesn't look generic — then runs it. **The load-bearing call:** the conscience fires **AFTER** the
|
|
1035
|
+
thing runs ("there it is — does seeing it change the idea? when you're ready: `/canvas`"), **never a
|
|
1036
|
+
gate before** — building first *is* a legitimate first move in the loop, not a skip of it. Cohort-
|
|
1037
|
+
aware (first-product gets the magic moment; eng-builder gets stack control; domain-expert gets the
|
|
1038
|
+
"this is a sketch, not a regulated tool" guardrail up front). Honest framing held: it's a *sketch to
|
|
1039
|
+
think with, named once, not your MVP* — and the graduate ladder (`unlock mvp` → `/spec` → `/evals`)
|
|
1040
|
+
is what keeps a fast prototype from becoming a pseudo-app (PRINCIPLES). Registered in the L0
|
|
1041
|
+
manifest; `boss map` + cheatsheet updated; eval 105/0.
|
|
1042
|
+
|
|
1043
|
+
## 0.51.0 — 2026-06-20
|
|
1044
|
+
|
|
1045
|
+
- **Visual kanban (`boss board --html`) + a voice-tightening pass.** Two founder-experience asks.
|
|
1046
|
+
- **`boss board --html` — the board, as a visual kanban.** Ajesh: *"having a visual kanban state
|
|
1047
|
+
that can be updated when the board is is super helpful."* Promotes the HTML view IDEA-015 deferred
|
|
1048
|
+
behind an earn-it gate. Same **pure projection** as the terminal board (`collectBoard`) rendered to
|
|
1049
|
+
a self-contained `.boss/board.html` — zero deps, no server, no JS framework; four columns
|
|
1050
|
+
(Captured / Taking shape / Building / Shipped), cards with id + title, `↻ review due` + `blocked`
|
|
1051
|
+
flags, the evidence line + riskiest-assumption framing on top, light/dark, responsive. Opens in
|
|
1052
|
+
the default browser (best-effort; the printed path is the contract). "Updated when the board is" =
|
|
1053
|
+
re-run it — it's a read of the files, never a maintained doc (the IDEA-015 discipline holds). Wired
|
|
1054
|
+
into `boss --help` + `boss map`.
|
|
1055
|
+
- **Voice pass (voice-keeper full audit — verdict: "in remarkably good shape").** Applied the
|
|
1056
|
+
high-leverage fixes: dropped the `🎯` emoji that shipped into the founder's own canvas template;
|
|
1057
|
+
unified the cohort question wording so `/welcome` and `/boss` are *actually* identical (they
|
|
1058
|
+
claimed to be); settled the self-description on **"build tool"** (not "build companion" — the
|
|
1059
|
+
ethos is conscience/tool, not coach); collapsed the most-read `boss new` first-run lines to one
|
|
1060
|
+
bold path + demoted fallbacks (matching `/welcome`'s own discipline); collapsed the `boss insights`
|
|
1061
|
+
triple-stated telemetry footer; dropped an inside-baseball `/ai-cost` wink in `boss conscience
|
|
1062
|
+
cost`; de-"coach"-ed the README mentor line against the settled ethos; added `boss brain` +
|
|
1063
|
+
`boss insights` to the `boss map` standing-controls so the live cheatsheet agrees with `--help`.
|
|
1064
|
+
(Kept the maintainer-only `Learned … UP` vocabulary deliberately — UP/DOWN is load-bearing
|
|
1065
|
+
Principle-1 language and the reader there is always the maintainer.)
|
|
1066
|
+
- **Captured (not built): [IDEA-030 "drop an idea and hit go"](../docs/ideas/IDEA-030-drop-an-idea-hit-go.md)** —
|
|
1067
|
+
a fast path to a runnable prototype. Philosophically loaded (the pseudo-app trap is *why BOSS
|
|
1068
|
+
exists*), so captured with a concrete `/prototype` proposal + the resolution (see-it-not-sell-it,
|
|
1069
|
+
conscience-attached, the graduate ladder as the honesty mechanism) and four shape questions for
|
|
1070
|
+
Ajesh — a decision to take together before building.
|
|
1071
|
+
- Eval gate 105/0; judgment GRADED; HTML render + voice fixes verified end-to-end in `/tmp`.
|
|
1072
|
+
|
|
1073
|
+
## 0.50.0 — 2026-06-20
|
|
1074
|
+
|
|
1075
|
+
- **Close the trends-pass loose ends (IDEA-027 #4 + IDEA-026 Part B wiring).** Ajesh's audit prompt —
|
|
1076
|
+
*"did we miss anything else to implement? you did a lot of lookup."* Two genuine loose ends from the
|
|
1077
|
+
research, both now closed:
|
|
1078
|
+
- **`boss board` surfaces "review due" — the trigger half of `/revalidate`.** v0.48 shipped the
|
|
1079
|
+
revalidation *gate* but nothing *surfaced what to revalidate* — a half-feature. The board now reads
|
|
1080
|
+
each item's `next_review:` frontmatter and flags any whose date has passed (`· ↻ review due`) plus
|
|
1081
|
+
a footer line `↻ N past review — run /revalidate <id>`. **Frontmatter-true, never guessed:** no
|
|
1082
|
+
`next_review` date → not flagged (an age-inferred "stale" signal would be noise the founder learns
|
|
1083
|
+
to ignore). Completes the dhun-ported revalidation lifecycle; `boss board` stays a pure projection
|
|
1084
|
+
(no new state).
|
|
1085
|
+
- **Agent security wired JIT into `/ai-first-init` (Step 5.5).** v0.49 shipped `agent-security.md` as
|
|
1086
|
+
an UP practice, but founders don't read `library/` — so the lethal-trifecta / Rule-of-Two framing
|
|
1087
|
+
now surfaces *in the AI-native day-one sequence* (one sentence on a Quickstart; more ceremony as
|
|
1088
|
+
the app reads untrusted input / handles regulated data). Names the surface, points at the
|
|
1089
|
+
`permissions.deny` floor + `secrets-guard` ceiling + the deterministic-guard rule.
|
|
1090
|
+
- Eval gate 105/0; judgment GRADED + no blocking failures; board staleness verified end-to-end in
|
|
1091
|
+
`/tmp` (past-review flagged, future-review not). **Honest remaining map (deliberately staged, not
|
|
1092
|
+
missed):** IDEA-025 P2/P3 (`/judge-traces` + trace-fed learn loop — need accumulated traces),
|
|
1093
|
+
IDEA-026 Part A (upstream conscience — needs the IDEA-028 host-mount), IDEA-028 host-subtraction
|
|
1094
|
+
audit (a decision to take *with* Ajesh, not autonomously), AGENT_DOC_MAP (until two agents
|
|
1095
|
+
collide), the SDD-vocabulary + "agentic-engineering" positioning reframes (mentor-pitch territory),
|
|
1096
|
+
and publishing the library on the open Skills standard (IDEA-006 distribution).
|
|
1097
|
+
|
|
1098
|
+
## 0.49.0 — 2026-06-20
|
|
1099
|
+
|
|
1100
|
+
- **Embed the 2026 best-practices — design, evals, security feel current across the board (IDEA-029 +
|
|
1101
|
+
IDEA-026 + the trends pass).** Ajesh's follow-on to v0.48: *"on all research you surfaced (not just
|
|
1102
|
+
design), let's ID all critical to add to our current work, and just go and embed it… make BOSS feel
|
|
1103
|
+
very up to date with best practices."* Where v0.48 shipped the dhun *machinery*, this embeds the
|
|
1104
|
+
*thinking* into the surfaces founders actually touch. Provenance:
|
|
1105
|
+
[SESSION-2026-06-20-ui-design-scan](../docs/research/sessions/SESSION-2026-06-20-ui-design-scan.md).
|
|
1106
|
+
- **`/design-tokens-init` — the 5-token distinctiveness pass + DTCG/semantic naming (the design
|
|
1107
|
+
win).** Three layers already prevented *drift*; the new section prevents *sameness* — the
|
|
1108
|
+
"shadcn trap" (slate/Inter/8px/indigo = generic-AI-app look) broken by five deliberate overrides
|
|
1109
|
+
(warm neutral · intentional radius · type pairing · one owned accent · **one "signature token"**),
|
|
1110
|
+
cohort-aware. Plus: name semantic tokens by **purpose not hue**, emit **W3C DTCG** where the stack
|
|
1111
|
+
allows (portability, no lock-in), and **inline a semantic→primitive map into CLAUDE.md** so the
|
|
1112
|
+
agent inherits the brand on every turn. (freedesignmd · Vercel · Curtis · W3C DTCG.)
|
|
1113
|
+
- **`/evals` — the 2026 Hamel/Shankar sharpening.** Error analysis on **real traces** first (and a
|
|
1114
|
+
pointer to v0.48's `.boss/trace.jsonl` as that raw material); **binary pass/fail, not 1–5 scores**;
|
|
1115
|
+
**one expert, not a committee**; **don't let the model grade its own homework** (separate verifier
|
|
1116
|
+
+ trajectory-not-just-endpoint); a **60/30/10** deterministic/judge/human cost hierarchy.
|
|
1117
|
+
- **New UP practice `library/practices/ai-ux-patterns.md`** — the interaction layer (where
|
|
1118
|
+
`design-system.md` owns the look): "why this" rationale grounded in the user's own inputs ·
|
|
1119
|
+
confidence-as-register · **Notify/Question/Review** interrupt registers · **edit-before-execute +
|
|
1120
|
+
risk-tiered gates** (four decision verbs; gate by loss type) · **trust-repair after a miss**
|
|
1121
|
+
(asymmetric recovery) · progressive disclosure · **discernment — knowing when not to speak — as a
|
|
1122
|
+
first-class fundamental** · pinned canonical refs (Shape of AI / HAX / IBM Carbon; community
|
|
1123
|
+
catalogs via `/vet`). Captured as [IDEA-029](../docs/ideas/IDEA-029-ai-native-interface-patterns.md),
|
|
1124
|
+
extends IDEA-010.
|
|
1125
|
+
- **`docs/mentor-practitioners.md`** AI-UX heuristics block updated with the 2026 additions +
|
|
1126
|
+
pointer to the new practice.
|
|
1127
|
+
- Library-layer + skill-content changes (no new template skill, no CLI change); eval gate 105/0,
|
|
1128
|
+
judgment GRADED + no blocking failures; `boss new` + `unlock mvp` verified the upgraded skills
|
|
1129
|
+
ship. **Staged from the design scan:** planning-as-collaboration / mid-run steering (Appleton —
|
|
1130
|
+
twin of IDEA-026 Part A), RESUME-as-agent-inbox, wrapper-vs-flatten-per-cohort.
|
|
1131
|
+
|
|
1132
|
+
## 0.48.0 — 2026-06-20
|
|
1133
|
+
|
|
1134
|
+
- **The 2026-trends + dhun-scan pass — first builds (IDEA-025/027/026).** Occasioned by Ajesh:
|
|
1135
|
+
*"with all the latest trends since we built BOSS… as well as potential new methods inside dhun, let's
|
|
1136
|
+
assess what BOSS can better improve upon."* Three research passes (external 2026 trends; a 2026-only
|
|
1137
|
+
scan of the named practitioners in `docs/mentor-practitioners.md`; a full catalog of the sibling
|
|
1138
|
+
**dhun** project's working-method machinery) → captured as
|
|
1139
|
+
[IDEA-025](../docs/ideas/IDEA-025-trace-native-conscience-and-evals.md)…028 with full provenance in
|
|
1140
|
+
[SESSION-2026-06-20](../docs/research/sessions/SESSION-2026-06-20-trends-and-dhun-scan.md). This
|
|
1141
|
+
release ships the concrete, proven, low-risk slice; the conceptual reframes (upstream conscience,
|
|
1142
|
+
host-subtraction) stay specced-and-staged. **Three new dormant/UP capabilities — no behavior change
|
|
1143
|
+
until opted in:**
|
|
1144
|
+
- **`auto-log` trace substrate — the keystone, shipped dormant (IDEA-025 Phase 1).** A zero-dep Node
|
|
1145
|
+
SubagentStop hook (`library/hooks/auto-log.js` + L1 template) that appends one honest line to
|
|
1146
|
+
`.boss/trace.jsonl` per writer-subagent — session, agent, files actually changed (reads
|
|
1147
|
+
`git status --porcelain`, so it catches *new* files too, not just tracked diffs), with last-line
|
|
1148
|
+
dedup and read-only-agent skip. **The within-session complement to v0.47's `boss insights`**
|
|
1149
|
+
(cross-project registry): both honest-trace, local-only, append-only, measure-don't-instrument
|
|
1150
|
+
(inherits the IDEA-021/013 contract). It's the raw material a future trace-native judge
|
|
1151
|
+
(`/judge-traces`) + sleep-time learn loop read — Hamel ("error analysis on real traces") + Chase
|
|
1152
|
+
("traces, not code, are the source of truth"). **Dormant** (a SubagentStop hook costs per-subagent
|
|
1153
|
+
latency — registration is the opt-in, same as `secrets-guard`).
|
|
1154
|
+
- **`memory-cue` hook — feedback→memory nudge, shipped dormant (IDEA-027 #1).** Ported UP from the
|
|
1155
|
+
dhun dogfood, Node-ported for the zero-dep rule (`library/hooks/memory-cue.js` + L0 template). A
|
|
1156
|
+
UserPromptSubmit hook that regex-detects a feedback signal (directive / corrective / confirmation)
|
|
1157
|
+
and *nudges* the model to save it to memory — never auto-writes (wording needs reasoning), silent
|
|
1158
|
+
on no-match (zero token cost). Serves the `library/memory-seed/` ambition.
|
|
1159
|
+
- **`/revalidate` — the 3-line gate against zombie features (IDEA-027 #2).** A new L1 skill (+
|
|
1160
|
+
`library/practices/revalidation.md`) ported UP from dhun's REVALIDATION lifecycle: before paused
|
|
1161
|
+
work re-enters the build, answer *still relevant? still aligned? anything changed?* → revive /
|
|
1162
|
+
rescope / kill / re-pause. BOSS eats it first — `docs/RESUME.md` carries a long deferred list.
|
|
1163
|
+
- **Two new UP practices** distilled from the same scan: `library/practices/quality-ratchet.md`
|
|
1164
|
+
(dhun's `.ratchet` one-way-baseline pattern, stack-neutral) and `library/practices/agent-security.md`
|
|
1165
|
+
(Simon Willison's 2026 lethal-trifecta / "Agents Rule of Two" / "deterministic guard around a
|
|
1166
|
+
non-deterministic model" — IDEA-026 Part B, the next ring after `secrets-guard`).
|
|
1167
|
+
- Zero-dep held (Node built-ins only; `npm pack` verified — all new files ship). Hooks dormant
|
|
1168
|
+
(settings.json unchanged; only `conscience.js` registered). `boss map` shows `/revalidate`;
|
|
1169
|
+
eval gate clean (no blocking failures); judgment GRADED 7/7; tested end-to-end in `/tmp`
|
|
1170
|
+
(`boss new` + `boss unlock mvp`, both hooks present + dormant, skill present).
|
|
1171
|
+
|
|
1172
|
+
## 0.47.0 — 2026-06-19
|
|
1173
|
+
|
|
1174
|
+
- **Humane two-way learning channel — built the moment BOSS went public, the humane way (IDEA-024).**
|
|
1175
|
+
Going public (MIT, github.com/ajeshh/bossbuild) turned a private dogfood into a thing strangers
|
|
1176
|
+
run, which needs a way to learn + pivot. Ajesh asked for "feedback from end-users, and learn
|
|
1177
|
+
*passively* how users use it." The second half is the exact surveillance line BOSS exists not to
|
|
1178
|
+
cross ([IDEA-021](../docs/ideas/IDEA-021-passive-instrumentation-and-fleet-learning.md)). Applied
|
|
1179
|
+
BOSS's own conscience (mentor-humane fork; PRINCIPLE: humane before viable) and built the honest-trace
|
|
1180
|
+
version instead — **no silent telemetry.**
|
|
1181
|
+
- **`/feedback` (direct, user-initiated).** A founder-facing skill in L0: send a bug / confusion /
|
|
1182
|
+
wish back upstream. Shows the founder the exact title + body + the *one* line of context
|
|
1183
|
+
(`BOSS <ver> · <mode> · <OS>`) before anything leaves the machine; files a GitHub issue via
|
|
1184
|
+
`gh issue create`, or falls back to a prefilled issue link to paste. Public-repo warning stated.
|
|
1185
|
+
Never automatic, never a hook.
|
|
1186
|
+
- **`boss insights` (passive, the humane way).** Reads the trace your own work *already leaves* —
|
|
1187
|
+
your registered projects on *this machine* — and reports where each venture's loop stands
|
|
1188
|
+
(idea → canvas → build), flagging empty / untested / stale. **Measures graduation + loop-closure,
|
|
1189
|
+
never activity/engagement** (the vanity metric BOSS refuses to expose). Local-only; nothing is
|
|
1190
|
+
sent. Zero-dep (`src/insights.js`).
|
|
1191
|
+
- **Opt-in share-up contract.** New `shareUp: false` default in `.boss/config.json` — any future
|
|
1192
|
+
cross-user learning is gated on this flag being true *and* a per-send confirmation, by construction.
|
|
1193
|
+
Telemetry is never a default.
|
|
1194
|
+
- **Not built (deferred, named in IDEA-024):** silent cross-user telemetry (the line — not crossing
|
|
1195
|
+
it); a full `npm publish` / auto-update pipeline (premature at n≈1 — `boss status` already nudges
|
|
1196
|
+
on version drift; the real risk is demand, not distribution). Captured in
|
|
1197
|
+
[SESSION-2026-06-19-founder-test](../docs/research/sessions/SESSION-2026-06-19-founder-test.md).
|
|
1198
|
+
|
|
1199
|
+
## 0.46.0 — 2026-06-19
|
|
1200
|
+
|
|
1201
|
+
- **Bring-your-own-material import — the on-ramp from "I jotted it somewhere" (IDEA-023).**
|
|
1202
|
+
Occasioned by a live founder-test: a founder ran `boss new`, then stalled — the idea lived in a
|
|
1203
|
+
Word doc / Google Doc / Obsidian note / PDF / deck / URL, and there was **no way to bring it in**.
|
|
1204
|
+
A correctly-scaffolded but idea-less project read as *"empty… I'm stuck."* This closes that
|
|
1205
|
+
first-run dead-end.
|
|
1206
|
+
- **Load-bearing decision:** import lives in the **skill layer, not the zero-dep CLI** — the CLI
|
|
1207
|
+
(Node built-ins only, Principle 4) can't parse PDF/docx or fetch a URL, but the model already
|
|
1208
|
+
reads heterogeneous formats natively. Same predicate/runner split as the loop runtime (IDEA-008):
|
|
1209
|
+
deterministic core stays deterministic; the model does the parsing + shaping.
|
|
1210
|
+
- **What ships:** (1) `/boss` §1 now ingests **one-or-more sources** — local files *and* URLs, in
|
|
1211
|
+
any mix — snapshots a durable copy of each into `docs/source/` ("the project owns a copy"), and
|
|
1212
|
+
synthesizes across all of them before shaping the idea. (2) A new **`/import`** skill (registered
|
|
1213
|
+
in the L0 manifest) for adding material to an *already-captured* idea, or as an alternate spin-up
|
|
1214
|
+
door. (3) **Discoverability fix** — the part that actually stuck the founder: `boss new`'s "Next:"
|
|
1215
|
+
block now shows `code <name>` (editor handoff) + the file/url/import options; `/welcome` (both the
|
|
1216
|
+
full tour and the 30-second version) and the L0 `CLAUDE.md` template all advertise bring-your-own
|
|
1217
|
+
material.
|
|
1218
|
+
- **Deferred (named in [IDEA-023](../docs/ideas/IDEA-023-bring-your-own-material-import.md)):**
|
|
1219
|
+
material-first ordering (point at material → BOSS names the folder), binary/OCR formats
|
|
1220
|
+
(`.pptx` text, image-only PDFs), live-source re-pull vs. one-time snapshot, and a CLI
|
|
1221
|
+
`boss import` second door (only if the skill path proves it's wanted outside Claude).
|
|
1222
|
+
- Dogfood target: `~/Projects/fraands` (the project that surfaced the gap). Surfaced + captured in
|
|
1223
|
+
[SESSION-2026-06-19-founder-test](../docs/research/sessions/SESSION-2026-06-19-founder-test.md)
|
|
1224
|
+
(OBS-002/003/005).
|
|
1225
|
+
- **`/welcome` closes on the action — long content no longer buries the next step (OBS-001).**
|
|
1226
|
+
Same founder-test: *"the welcome message is a bit too long… I forget what I'm supposed to do next."*
|
|
1227
|
+
The skill already had an "end on one next step" rule, but the beginner tour printed three full
|
|
1228
|
+
reference sections (conscience / modes / help) *after* the next step, walling it off. Fix: a new
|
|
1229
|
+
voice rule ("close on the action — long content must tie back to the next step"), and a structural
|
|
1230
|
+
**pivot** — after the shape + next step, the three reference sections are now **offered, not dumped**
|
|
1231
|
+
(tagged `reference — expand only if asked`); the founder leaves on `/boss` or `/triage`, not on a
|
|
1232
|
+
wall. (OBS-004 — host-switching across Claude app/VSCode/Cursor — logged, deferred to IDEA-006.)
|
|
1233
|
+
|
|
1234
|
+
## 0.45.0 — 2026-06-05
|
|
1235
|
+
|
|
1236
|
+
- **JIT working-context, Phase 1 — every `boss new` project is now JIT-by-construction (FEAT-020).**
|
|
1237
|
+
The deny-list (v0.42) made projects secrets-safe by default; this makes them *context-lean* by
|
|
1238
|
+
default. The principle was already vetted (`context-discipline` practice, RVW-005/010) but only
|
|
1239
|
+
*described* path-scoped rules — now the templates **ship** them. This is Phase 1 of a 4-phase
|
|
1240
|
+
lifecycle ([`docs/ideas/FEAT-020`](../docs/ideas/FEAT-020-jit-working-context-lifecycle.md)); Phases
|
|
1241
|
+
2-4 (`/close` GC, promote-on-evict via `/extract`, a freshness moment) are specced + deferred with
|
|
1242
|
+
triggers.
|
|
1243
|
+
- **What ships:** `.claude/rules/` examples with `paths:` frontmatter that load **only when Claude
|
|
1244
|
+
opens a matching file** (not at session start) — L0 `your-app-code.md` (the basic path-scoped
|
|
1245
|
+
pattern), L1 `feature-context.md` (the live feature's working notes, which Phase 2's `/close` will
|
|
1246
|
+
later compress to a one-liner). The two-memory cut that decides what goes where is documented in
|
|
1247
|
+
the new `library/memory-seed/` shelf (README + an example durable-facts seed). L0 `CLAUDE.md`'s
|
|
1248
|
+
Memory line now names both halves (durable → auto-memory; working-state → path-scoped rules).
|
|
1249
|
+
- **Verified before building:** confirmed against the official Claude Code docs (2026-06-05) that
|
|
1250
|
+
`.claude/rules/` with a `paths:` key is real and JIT-loaded — *not* a confusion with Cursor's
|
|
1251
|
+
`.cursor/rules` (`globs:`). The `context-discipline` practice's "re-verify host syntax on every
|
|
1252
|
+
build" rule, honored.
|
|
1253
|
+
- **The restraint line (carried from the IDEA):** Phases 2-4 risk being BOSS gold-plating its own
|
|
1254
|
+
substrate; the recency-window-by-hand is currently enough. First dogfood is BOSS's own repo — a
|
|
1255
|
+
Phase-1 slice going stale and misinforming a session is the cleanest Phase-2 re-open signal.
|
|
1256
|
+
- Zero-dep held; tested end-to-end in `/tmp` (`boss new` → rule present; `boss unlock mvp` → feature
|
|
1257
|
+
rule added). `mentor-architect` pass + 4 forks decided with Ajesh recorded in the FEAT.
|
|
1258
|
+
|
|
1259
|
+
## 0.44.0 — 2026-06-02
|
|
1260
|
+
|
|
1261
|
+
- **`secrets-guard` PreToolUse hook — the high-stakes ceiling, shipped opt-in (closes the RVW-005
|
|
1262
|
+
follow-on, the principled way).** The v0.42 deny-list is the universal zero-cost floor; this hook is
|
|
1263
|
+
the broader-coverage ceiling — and the v0.42.1 reconsideration said it must NOT be a universal
|
|
1264
|
+
default (a `PreToolUse` hook spawns a process on *every* tool call). So it ships **dormant**:
|
|
1265
|
+
`library/hooks/secrets-guard.js` (canonical) + `.claude/hooks/secrets-guard.js` in the L0 template,
|
|
1266
|
+
**not registered by default.** Registration is the on-switch (an unregistered hook costs nothing),
|
|
1267
|
+
recommended for the `domain-expert` / regulated cohort.
|
|
1268
|
+
- **Behavior:** Read/Edit/NotebookEdit of a secrets file (`.env`/`.env.*`/`secrets/**`) → **deny**
|
|
1269
|
+
(reading secret contents into context is the leak); Bash or MCP referencing a secrets path →
|
|
1270
|
+
**ask** (don't hard-block legit `.env` *creation* — surface it to the human); else allow.
|
|
1271
|
+
**Fail-open** (any parse/runtime surprise → allow; a guard that breaks the session is worse than
|
|
1272
|
+
one that occasionally misses, and the deny-list floor still hard-blocks the common vectors).
|
|
1273
|
+
- **Tested** (10 cases, piped PreToolUse events): denies Read/Edit of `.env`/`.env.local`/
|
|
1274
|
+
`secrets/`; allows `src/app.js`, `npm test`, and `.environment.ts` (no false positive); asks on
|
|
1275
|
+
`cat .env` + MCP-with-secrets; fail-open on malformed input. Zero-dep Node, output per the Claude
|
|
1276
|
+
Code PreToolUse contract (JSON `permissionDecision`, exit 0).
|
|
1277
|
+
- **Cohort auto-registration deferred** (a clean follow-on): wiring `domain-expert` cohort setup to
|
|
1278
|
+
register it automatically. Today it's documented opt-in (snippet in the file header + the
|
|
1279
|
+
`context-discipline` practice).
|
|
1280
|
+
|
|
1281
|
+
## 0.43.0 — 2026-06-02
|
|
1282
|
+
|
|
1283
|
+
- **Wayfinding, Pass 1 (IDEA-018) — `boss map` + a doc generator that can't rot.** Occasioned by a
|
|
1284
|
+
docs-health pass that found the README **19 releases stale** and, worse, that there was **no "how to
|
|
1285
|
+
use BOSS" guide at all**. The fix is shaped like BOSS itself: wayfinding, not a manual. Decisions
|
|
1286
|
+
locked with Ajesh — **mode ladder is the spine** (persona = entry filter, aspect = which mentor to
|
|
1287
|
+
ask, never chapters); **split audience** (a command ships to founders, the prose stays in the BOSS
|
|
1288
|
+
repo); **durable core first**.
|
|
1289
|
+
- **`boss map`** (CLI, ships to every project) — the *live* cheatsheet. A pure render of state the
|
|
1290
|
+
project already holds (the `.boss` stamp + installed `SKILL.md` files), in the `boss board`
|
|
1291
|
+
spirit: *You are here · available now (grouped by the rung that unlocked each skill) · one unlock
|
|
1292
|
+
away (the next rung's skills, read from the package, with the real project name substituted in) ·
|
|
1293
|
+
standing controls.* Nothing to maintain, nothing to drift.
|
|
1294
|
+
- **The de-rot mechanism** — `src/modes.js` is the single source both `boss map` and the generator
|
|
1295
|
+
read (manifests + `SKILL.md` frontmatter), so the live map and the static docs can never disagree.
|
|
1296
|
+
**`scripts/gen-docs.js`** (`npm run gen:docs`) emits **`docs/CHEATSHEET.md`** (the whole ladder,
|
|
1297
|
+
the wall-poster) and **`docs/SKILLS.md`** (one line per skill, grouped by mode) — both carry a
|
|
1298
|
+
GENERATED banner and are derived, never hand-typed. This is the actual fix for what bit the README:
|
|
1299
|
+
the per-mode lists become a build artifact, not a memory test.
|
|
1300
|
+
- Zero-dep held — `src/map.js` + `src/modes.js` ship; `scripts/` and the generated `docs/` stay
|
|
1301
|
+
dev-only (verified via `npm pack`). Eval suite regression-clean (no blocking failures, GRADED 7).
|
|
1302
|
+
End-to-end tested in `/tmp` (map at Quickstart → unlock mvp → map regroups + previews V1; placeholder
|
|
1303
|
+
substitution + width-capping confirmed). **Pass 2 (the hand-authored `docs/GUIDE.md` walkthrough)
|
|
1304
|
+
is the next session** — written against these generated surfaces, per the agreed sequence.
|
|
1305
|
+
|
|
1306
|
+
## 0.42.1 — 2026-06-02
|
|
1307
|
+
|
|
1308
|
+
- **BOSS eats its own context-discipline dogfood + sharpens the practice (the learning loop in real
|
|
1309
|
+
time).** Minutes after shipping `context-discipline` (v0.42.0), applied it to BOSS itself and
|
|
1310
|
+
refined it with what doing so taught.
|
|
1311
|
+
- **Dogfood (DOWN):** the recency-window rule (RVW-002) applied to BOSS's own `docs/RESUME.md` — the
|
|
1312
|
+
"State" section was a 25-entry append log read at every session start. Trimmed to the **5 most
|
|
1313
|
+
recent** entries; older versions point to `registry/CHANGELOG.md` (which carries all 43 versions —
|
|
1314
|
+
confirmed before cutting, so it's non-destructive). RESUME dropped **727 → 346 lines.** BOSS now
|
|
1315
|
+
practices the leanness it prescribes.
|
|
1316
|
+
- **Practice sharpening (the part that matters):** reconsidered the deferred PreToolUse secrets-guard
|
|
1317
|
+
hook and recorded *why* it's deferred, not just *that* it is. A `PreToolUse` hook fires a process
|
|
1318
|
+
on **every tool call** (real latency), where `permissions.deny` is a zero-cost native check. So the
|
|
1319
|
+
practice now states: the **deny-list is the universal floor** (always ship); a **secrets-guard hook
|
|
1320
|
+
is a high-stakes/opt-in ceiling** (regulated/PHI cohorts), **not** a universal default — adding
|
|
1321
|
+
always-on per-call machinery for marginal coverage is the framework-bloat BOSS warns founders
|
|
1322
|
+
against (R&H #1 / IDEA-013 cost discipline). This is `/vet`'s skepticism turned inward: even an
|
|
1323
|
+
ADOPT's "ceiling" gets cost-weighed before it ships to everyone.
|
|
1324
|
+
- No template/CLI behavior change beyond the doc + RESUME; the v0.42.0 deny-default still stands as
|
|
1325
|
+
the shipped safe-default.
|
|
1326
|
+
|
|
1327
|
+
## 0.42.0 — 2026-06-02
|
|
1328
|
+
|
|
1329
|
+
- **`/boss-learn` routes the sweep's first ADOPT — a "context discipline" practice, UP + a DOWN
|
|
1330
|
+
safe-default.** Acting on the v0.41 `/vet` sweep: the two ADOPTs (RVW-005 deny-secrets, RVW-010
|
|
1331
|
+
token-optimization) plus RVW-002 (lean session docs) collapsed into **one** pattern, routed two ways
|
|
1332
|
+
per PRINCIPLE #1.
|
|
1333
|
+
- **Verify-before-encode (the gate both verdicts set).** Before promoting, the version-bound Claude
|
|
1334
|
+
Code claims were checked against current behavior. One was **FALSE — `.claudeignore` does not
|
|
1335
|
+
exist** (the source post conflated it with `permissions.deny`); it was struck from the practice
|
|
1336
|
+
rather than shipped to every project. Confirmed: `permissions.deny` glob syntax (and that a
|
|
1337
|
+
`Read(...)` deny does **not** cover Bash — needs a separate `Bash(...)` rule), PreToolUse hard-block,
|
|
1338
|
+
`.claude/rules/` `paths:` frontmatter, CLAUDE.md load behavior. The `/vet` thesis applied to BOSS
|
|
1339
|
+
itself: popularity ≠ correctness, even when *BOSS* is the one adopting.
|
|
1340
|
+
- **UP** → `library/practices/context-discipline.md`: lean always-loaded docs (CLAUDE.md +
|
|
1341
|
+
RESUME recency-window), path-scoped `.claude/rules/`, `permissions.deny` for secrets *and* bloat,
|
|
1342
|
+
PreToolUse/PostToolUse hooks as the enforcement ceiling. Host-tagged `claude-code` with an explicit
|
|
1343
|
+
"re-verify syntax on host change" note (IDEA-014 recalibration). Provenance cites the RVWs — vetted,
|
|
1344
|
+
not adopted on stars.
|
|
1345
|
+
- **DOWN (product safe-default)** → the L0 Quickstart template now ships a `permissions.deny` block
|
|
1346
|
+
for `.env`/`.env.*`/`secrets/**` (Read + Bash) in `.claude/settings.json`, and `.gitignore` covers
|
|
1347
|
+
`.env.*` + `secrets/`. Every new `boss new` project is secrets-safe by default — the
|
|
1348
|
+
enforce-in-harness principle (RVW-012) made concrete, not left as advice.
|
|
1349
|
+
- **Deferred follow-ons (named, not crammed in — PRINCIPLE #2 / small steps):** a `library/hooks/`
|
|
1350
|
+
PreToolUse secrets-guard (catches Bash + MCP + future skills — code+test, its own step);
|
|
1351
|
+
mode/cohort-scoped `.claude/rules/` in the template; BOSS's own root CLAUDE.md/RESUME trim
|
|
1352
|
+
(RVW-002 — awaiting Ajesh's recency-window size). ADAPTs RVW-007/008 remain founder-facing/scope-gated.
|
|
1353
|
+
|
|
1354
|
+
## 0.41.0 — 2026-06-02
|
|
1355
|
+
|
|
1356
|
+
- **First `/vet --all` sweep — 10 verdicts (RVW-003…012), and the skill earned its keep.** Ajesh
|
|
1357
|
+
dropped a 10-item pile of AI/Claude-Code "best practices" (Reddit threads + Lenny's-newsletter posts)
|
|
1358
|
+
and swept them in one pass. The distribution is the proof the skill routes on merits, not reflex:
|
|
1359
|
+
**2 ADOPT · 2 ADAPT · 3 NOT-YET · 3 REJECT.** Only 2 clean adopts out of 10 — a skeptic, not a
|
|
1360
|
+
bookmark folder.
|
|
1361
|
+
- **The two ADOPTs collapse into ONE action — a "BOSS context discipline" practice** (the value of
|
|
1362
|
+
synthesizing a sweep instead of N independent verdicts). RVW-005 (hard-deny `.env`/secrets, don't
|
|
1363
|
+
trust prompting), RVW-010 (lean CLAUDE.md <500 tok, path-scoped `.claude/rules/`, `permissions.deny`
|
|
1364
|
+
for bloat, hook noise-filtering), and the earlier RVW-002 (RESUME recency-window) are facets of one
|
|
1365
|
+
practice; RVW-009 (context-engineering failure modes) is its research rationale and RVW-012 (Agent
|
|
1366
|
+
= Model + Harness, "safety lives in the harness not the model") its backing principle. Queued for
|
|
1367
|
+
`/boss-learn` (UP `library/practices/` + `library/hooks/` secrets-guard + DOWN BOSS's own doc trim
|
|
1368
|
+
+ template defaults), **gated on verifying version-bound Claude Code specifics first** (IDEA-014
|
|
1369
|
+
recalibration territory).
|
|
1370
|
+
- **Two ADAPTs, both founder-facing + scope-gated:** RVW-007 (Couch-to-5K — adopt the
|
|
1371
|
+
smallest-next-step *philosophy* for `/welcome` + beginner-cohort nudges; **reject the daily-streak
|
|
1372
|
+
mechanic** as the gamification/pressure trap the canvas already refuses — a guardrail recorded so a
|
|
1373
|
+
future session isn't tempted); RVW-008 (categorize-agents / start-simplest — a modest
|
|
1374
|
+
`mentor-architect` framing, strip the enterprise + stack taxonomy per PRINCIPLE #4).
|
|
1375
|
+
- **Three NOT-YET:** RVW-003 (plumbing-awareness — strong founder-facing candidate, re-open when the
|
|
1376
|
+
founder-facing build lands), RVW-009 + RVW-012 (reference pieces — re-open at conscience
|
|
1377
|
+
context-injection review / IDEA-006 host-contract work respectively).
|
|
1378
|
+
- **Three REJECT, recorded with reasons:** RVW-004 (`/remote-control` — out of scope/low-evidence,
|
|
1379
|
+
but kept a humane stance on always-on agent work), RVW-006 (21-hacks listicle — wrong altitude),
|
|
1380
|
+
RVW-011 (n8n tutorial — PRINCIPLE #4, folded into RVW-008 as an example).
|
|
1381
|
+
- **A real skeptical catch:** 6 of the 10 drops were by the **same author** (one newsletter
|
|
1382
|
+
corpus). The sweep applied an **author-concentration discount** — cross-confirmation *within* one
|
|
1383
|
+
voice isn't independent evidence; the "respected practitioner" rubric rung counts once, and the
|
|
1384
|
+
real evidence is the *distinct* sources (the deny-secrets PSA, slaorta, StokeJar, and the
|
|
1385
|
+
DeepMind/Microsoft/Salesforce research cited *inside* the pieces).
|
|
1386
|
+
- Records: `docs/research/verdicts/RVW-003…012`; all 10 inbox items marked `resolved:`. Zero-dep held
|
|
1387
|
+
(`npm pack` ships 0 — all under `docs/`). No code/skill change → gate + judgment suites unaffected.
|
|
1388
|
+
**Nothing is built yet** — ADOPT/ADAPT hand-offs await Ajesh's go (the skill decides *whether*;
|
|
1389
|
+
`/boss-learn` decides *where*).
|
|
1390
|
+
|
|
1391
|
+
## 0.40.1 — 2026-06-02
|
|
1392
|
+
|
|
1393
|
+
- **`/vet` gains batch sweep — drop a pile, vet once.** From dogfooding `/vet` on real drops (RVW-001,
|
|
1394
|
+
RVW-002): the natural rhythm is *accumulate, then sweep*, not vet-on-arrival. `/vet --all` now vets
|
|
1395
|
+
every un-vetted inbox item — **each as its own full skeptical pass with its own `RVW-NNN` verdict**
|
|
1396
|
+
— then prints one summary table + the ADOPT/ADAPT hand-off list. No-arg `/vet` lists un-vetted items
|
|
1397
|
+
oldest-first and offers the sweep. Already-vetted items (a `resolved:` line or an existing verdict)
|
|
1398
|
+
are skipped. Clarified the old "one claim per run" rule → **"one claim per verdict"**: it always
|
|
1399
|
+
protected *depth-per-claim* (never collapse several claims into one shallow verdict), never forbade
|
|
1400
|
+
vetting many in sequence — the sweep is many full passes, not one pass over many.
|
|
1401
|
+
- **First two verdicts on the record (dogfood):** `RVW-001` — the four-rule "Karpathy" CLAUDE.md →
|
|
1402
|
+
**REJECT** (BOSS already encodes all four as principles + the cohort-aware conscience; a static file
|
|
1403
|
+
would regress toward the frozen-rules brittleness the thread's own top critique names — which is
|
|
1404
|
+
IDEA-014's thesis). `RVW-002` — slaorta's lean/modular CLAUDE.md → **ADAPT** (apply the
|
|
1405
|
+
recency-window to `RESUME.md`'s State section, which duplicates `registry/CHANGELOG.md` and grows
|
|
1406
|
+
unbounded; generalizable shape is a `library/practices/` UP candidate). The REJECT/ADAPT split is
|
|
1407
|
+
the evidence `/vet` routes on merits, not reflex. RVW-001 also surfaced **external confirmation of
|
|
1408
|
+
IDEA-014** (a stranger reasoning to the recalibration thesis) — now cited in that idea.
|
|
1409
|
+
|
|
1410
|
+
## 0.40.0 — 2026-06-02
|
|
1411
|
+
|
|
1412
|
+
- **`/vet` — the skeptical inbox. The inverse of `/boss-learn`.** From Ajesh's seed: *"if i have new
|
|
1413
|
+
research or best practices, we should have a way where i can just drop it in, and then our mentors
|
|
1414
|
+
and such review and see what we should integrate. reddit is full of best practices, but that doesnt
|
|
1415
|
+
mean all are good ideas."* The last sentence is the whole design.
|
|
1416
|
+
- **Why it's the inverse, not a fork.** `/boss-learn` routes a pattern *you already proved* (built
|
|
1417
|
+
it, it worked, it repeated) UP into `library/` or DOWN into the app — its input has earned trust.
|
|
1418
|
+
`/vet` takes a claim *from a stranger* (a Reddit thread, an HN comment, a blog post, a paper, a
|
|
1419
|
+
"you must do X" tweet) that has earned **nothing**. Its job is the part `/boss-learn` never has to
|
|
1420
|
+
do: decide whether an unproven outside claim deserves to become practice **at all**. ADOPT *hands
|
|
1421
|
+
to* `/boss-learn` (whether → where); it never reimplements it.
|
|
1422
|
+
- **The filter is the product.** A drop folder with no judgment is a bookmark pile. The value is the
|
|
1423
|
+
skeptical read. The skill is **biased toward NO** — most internet best practices don't apply to
|
|
1424
|
+
BOSS, at its stage, for its thesis — and makes a claim *earn* an ADOPT.
|
|
1425
|
+
- **The NO-biased rubric (any one question can sink the claim):** (1) does it contradict a PRINCIPLE?
|
|
1426
|
+
(#6 / `mentor-humane` can veto outright); (2) evidence grade — n=1 vibe vs. pattern-with-data vs.
|
|
1427
|
+
respected practitioner (most claims die here); (3) duplicate or genuinely sharpen?; (4) who does it
|
|
1428
|
+
serve **and harm** (great for `eng-builder`, toxic for `first-product` → ADAPT-with-scoping at
|
|
1429
|
+
best); (5) cost/ceremony (does it make BOSS heavier — R&H #1).
|
|
1430
|
+
- **Four honest verdicts**, mirroring `/extract`'s UP/DOWN/NOT-YET: **ADOPT** (→ `/boss-learn`),
|
|
1431
|
+
**ADAPT** (modified, reasoned), **REJECT — with reason, recorded** (the quietly important one — so
|
|
1432
|
+
the same thread isn't re-litigated next month; the verdict log is BOSS's memory of what it
|
|
1433
|
+
*deliberately didn't* adopt), **NOT-YET** (with a re-open condition). Before vetting, `/vet` reads
|
|
1434
|
+
prior verdicts and won't re-litigate.
|
|
1435
|
+
- **Restraint by design (PRINCIPLE #2):** deliberate-invoke, like `/extract` and `/drift-deep` —
|
|
1436
|
+
**no `vet-loop`, no hook moment, no nudge to "review your inbox."** An automatic research-review
|
|
1437
|
+
obligation would be the ceremony BOSS exists to refuse. It also doesn't *find* research (that's
|
|
1438
|
+
`/deep-research`) — it judges what you bring it.
|
|
1439
|
+
- **Scope: internal-curation first.** `/vet` is a **BOSS-local meta-skill** (lives with `/boss-learn`
|
|
1440
|
+
+ `/boss-sync` in `.claude/skills/`, **not** in the founder template) — it vets against
|
|
1441
|
+
`PRINCIPLES.md` + BOSS's own `library/` and routes ADOPT into the BOSS source. The founder-facing
|
|
1442
|
+
version (founder drops a thread → BOSS reads it against *their* canvas/stage/cohort) is the named
|
|
1443
|
+
**UP candidate** (IDEA-016), deferred until the internal version earns it.
|
|
1444
|
+
- Shipped: `.claude/skills/vet/SKILL.md`; drop zone `docs/research/inbox/` + verdict log
|
|
1445
|
+
`docs/research/verdicts/` (each with a README); new **`RVW-NNN`** ID type in `docs/IDS.md`;
|
|
1446
|
+
IDEA-016 captured + the two design forks decided (internal-first; single skeptical pass — the
|
|
1447
|
+
mentor+persona panel is the upgrade if the single pass proves too shallow). Zero-dep held (`npm
|
|
1448
|
+
pack` ships **0** of these — BOSS-local + `docs/`, neither in the `files` allowlist). Gate +
|
|
1449
|
+
judgment suites unchanged (no hook moment, no predicate change).
|
|
1450
|
+
|
|
1451
|
+
## 0.39.0 — 2026-06-02
|
|
1452
|
+
|
|
1453
|
+
- **`capture` goes judge-backed — the third model-judgment moment, and it ships GRADED from day one.**
|
|
1454
|
+
capture (moment #3, PRINCIPLE #1's own) fired structurally on `≥3 devlog entries + no extraction
|
|
1455
|
+
record` — but a count can't tell a real extraction candidate from three entries of normal in-progress
|
|
1456
|
+
work. That's the exact crude-predicate problem drift (v0.31) and caution (v0.33) already solved with a
|
|
1457
|
+
bounded-read model judgment. v0.39 gives capture the same upgrade.
|
|
1458
|
+
- **The judgment (strictly more restraint — capture can now only fire LESS):** the gate still opens,
|
|
1459
|
+
but before voicing, the model silently reads the ~5 most recent devlog entries and fires ONLY if
|
|
1460
|
+
there's a real candidate — a pattern built **twice** (reusable practice → UP into `library/`), a
|
|
1461
|
+
fix/guard hand-applied in **several places** (hardening → DOWN into core), or a manual **ritual
|
|
1462
|
+
repeated** enough to deserve a skill/loop. If the recent work is one-off distinct features, deep
|
|
1463
|
+
focus on a single still-in-progress thing, or early throwaway spikes — nothing has generalized;
|
|
1464
|
+
**stay silent.** The silent class is trust-critical: nudging `/extract` with nothing to extract
|
|
1465
|
+
earns a NOT-YET every time and trains the founder to tune the conscience out — the premature
|
|
1466
|
+
ceremony PRINCIPLE #2 warns against. Same shape as drift/caution: no model call in the hook, no new
|
|
1467
|
+
state, no predicate change — a bounded-read voicing instruction the model executes in the live turn.
|
|
1468
|
+
- **Shipped:** upgraded `capture` voice frame in the hook lib; `capture` added to `JUDGE_MOMENTS` (so
|
|
1469
|
+
the conscience-frequency ledger logs it as a judge-moment) and to `MOMENT_SIGNALS` (voice-hash
|
|
1470
|
+
source of truth); **`capture.judgment.yml`** (7 labeled cases — 3 should-fire-extractable
|
|
1471
|
+
[practice-twice / guard-in-3-places / repeated-ritual], 3 should-not-fire-nothing-yet
|
|
1472
|
+
[one-off / single-in-progress / spikes], 1 ambiguous [written-twice-maybe]);
|
|
1473
|
+
**`fixtures-devlog-extract.js`** (extractability-focused devlog corpus, distinct from drift's
|
|
1474
|
+
risk-focused one); `replay.js` + `regrade.js` extended with a `capture` row (the MOMENTS registry
|
|
1475
|
+
proves it generalizes — third moment, same engine).
|
|
1476
|
+
- **Graded the free way (per v0.38):** all 7 cases run through isolated reasoning-required Opus 4.8
|
|
1477
|
+
sub-agents; **all 7 agree with the human labels.** `replay.js` reads **GRADED 7/7** for capture
|
|
1478
|
+
(24/24 across drift+caution+capture). Transcripts stamped `generated_via:
|
|
1479
|
+
in-session-subagent-reasoned` + `harness_note`; a real `npm run regrade capture` overwrites them.
|
|
1480
|
+
- Zero-dep held: `npm pack` ships **0** judgment/transcript/extract-fixture files; no `src/` ref.
|
|
1481
|
+
Gate suite **105/0/41** (the predicate is unchanged — `moment-capture.yml` still covers detection).
|
|
1482
|
+
The judgment channel now covers **3 of the conscience's moments**; the remaining structural moments
|
|
1483
|
+
(cost / failure-mode / cost-stale) are binary facts and correctly stay non-judge (a model judge
|
|
1484
|
+
there would be the v0.34 cost trap).
|
|
1485
|
+
|
|
1486
|
+
## 0.38.0 — 2026-06-02
|
|
1487
|
+
|
|
1488
|
+
- **The conscience's judgment is now MODEL-VERIFIED — `drift` + `caution` read `GRADED 17/17`, not
|
|
1489
|
+
`NEVER_GRADED`. The hole `regrade.js` was built to close (v0.35) is closed — without an API key.**
|
|
1490
|
+
Since v0.32 the judgment surface (`replay.js`) shipped a labeled set + voice-hash tripwire + coverage
|
|
1491
|
+
floors but printed `NEVER_GRADED` loudly: the model had never actually been tested against the labels,
|
|
1492
|
+
so every judge-moment was structurally-checked vibes. Closing it normally needs a paid out-of-band
|
|
1493
|
+
`regrade.js` run (Node `fetch` → Anthropic API). We closed it the free way: `regrade.js` runs two
|
|
1494
|
+
model calls per case *because it executes with no model present* — but a live session **is** Opus 4.8,
|
|
1495
|
+
the same model it would call. Each of the 17 cases (10 drift + 7 caution) was run through an **isolated
|
|
1496
|
+
sub-agent** seeing only the exact voice frame + bounded read the hook injects, and the decisions
|
|
1497
|
+
written as transcripts in `regrade.js`'s own format.
|
|
1498
|
+
- **Result: all 17 decisions agree with the human labels.** The frame and the labels are
|
|
1499
|
+
well-calibrated; the model nails the trust-critical silent class (the on-aim cases where firing
|
|
1500
|
+
would be the false positive that erodes the conscience) — e.g. it reads a missing canvas
|
|
1501
|
+
"Experiment this week" line as a *bookkeeping* gap, not a *validation* gap, when the devlog shows
|
|
1502
|
+
the experiment is already running.
|
|
1503
|
+
- **A real methodology finding, recorded honestly:** a first, terse "output only SILENT or the nudge"
|
|
1504
|
+
harness mislabelled **3 of 17** — one spurious fire (on-aim drift) and two spurious silences
|
|
1505
|
+
(textbook feature-piling / competitor-watching caution cases). Requiring the model to do the
|
|
1506
|
+
"silently read… then judge" reasoning the voice frame *explicitly demands* flipped all three to
|
|
1507
|
+
agree with the label. The lesson: the frame's reasoning instruction is load-bearing, and how you
|
|
1508
|
+
elicit a judgment changes it — exactly the kind of thing the recalibration discipline exists to catch.
|
|
1509
|
+
- **`regrade.js` made importable** — `main()` now runs only on direct invocation; `decisionPrompts`,
|
|
1510
|
+
`MOMENTS`, `loadCases` are exported (so the prompt assembly is reusable/testable and can't drift
|
|
1511
|
+
from the paid path). `--dry-run` still green; importing the module no longer spends.
|
|
1512
|
+
- **Honest provenance, not a masquerade:** every transcript carries `generated_via:
|
|
1513
|
+
in-session-subagent-reasoned` + a `harness_note` stating it was NOT the clean `fetch` harness and
|
|
1514
|
+
that `ANTHROPIC_API_KEY=… npm run regrade` overwrites it as the canonical instrument. The interim
|
|
1515
|
+
grading is real (same model, same frame, same isolation) without overclaiming the provenance.
|
|
1516
|
+
- Zero-dep line held: `npm pack` ships **0** judgment/transcript files; no `src/` reference. Gate
|
|
1517
|
+
suite **105/0/41**; judgment **GRADED 17/17** (was NEVER_GRADED 17). The loud "not yet
|
|
1518
|
+
model-verified" banner is gone.
|
|
1519
|
+
|
|
1520
|
+
## 0.37.0 — 2026-06-01
|
|
1521
|
+
|
|
1522
|
+
- **`/drift-deep` — the deep, whole-project drift audit. The biggest unused 4.8 lever, now built.**
|
|
1523
|
+
The hook `drift` moment (v0.31) is a cheap always-on tripwire: it reads ~5 recent entries and
|
|
1524
|
+
asks "you named a risk, you're piling work, nothing tests it — is the recent work on-aim?" This
|
|
1525
|
+
is the **deliberate, founder-invoked counterpart** that a bounded read can't do: *read EVERYTHING
|
|
1526
|
+
I've built and tell me, across the whole body of work, whether I'm validating my riskiest bet or
|
|
1527
|
+
building around it.* The 1M-context "am I fooling myself across everything" check — the original
|
|
1528
|
+
finding from the very first 4.8 pass.
|
|
1529
|
+
- **Why a skill, not a hook moment:** a whole-project read can't fire per-prompt — that's the
|
|
1530
|
+
expensive-AI-app trap the v0.34 cost discipline guards against. So the cheap moment stays the
|
|
1531
|
+
everyday tripwire; this is the audit you *invoke* when you want the truth, not a glance. The
|
|
1532
|
+
restraint (no loop, no nudge to "run your audit") is the design — making it a recurring
|
|
1533
|
+
obligation would be the premature ceremony BOSS avoids.
|
|
1534
|
+
- **Broader than the gate** in two ways: it runs even when a validation plan exists (did you
|
|
1535
|
+
*execute* the experiment, or write the plan-line and drift from running it?), and it reads the
|
|
1536
|
+
**actual `src/` code** (what you built is the truest record of what you bet on), not just the
|
|
1537
|
+
devlog tail.
|
|
1538
|
+
- **`/drift-deep` (L1-mvp skill)** — reads the canvas (bet + plan + cells) + ALL devlog + every
|
|
1539
|
+
FEAT spec + `src/` structurally + the ideas; judges each body of work against the bet ("does
|
|
1540
|
+
this *test* the risk or build *around* it?"); reaches a verdict (on-aim / drifting / mixed) with
|
|
1541
|
+
confidence + named gaps + the smallest re-aim; writes `docs/drift-audits/DRIFT-YYYY-MM-DD.md`.
|
|
1542
|
+
Cohort-aware (vibe-virtuoso served most — ships a lot, validates little; domain-expert gets the
|
|
1543
|
+
who-could-be-harmed humane lens on an un-validated risk). Routes back to `/canvas` / `/pretotype`.
|
|
1544
|
+
- **Integration:** the cheap `drift` hook moment now points at `/drift-deep` for the full audit
|
|
1545
|
+
(one terse clause — the cheap nudge stays cheap). Follows the `/extract` precedent — a deliberate
|
|
1546
|
+
skill judgment, tested in use, not by the hook-judgment-eval surface (noted in the skill).
|
|
1547
|
+
- L1-mvp now ships 14 skills. Gate + judgment suites 105/0/41 (the drift voice-hash shifted from
|
|
1548
|
+
the pointer — the tripwire working as designed; no transcripts, so no STALE). The 4.8 leverage
|
|
1549
|
+
arc's last deferred item, landed.
|
|
1550
|
+
|
|
1551
|
+
## 0.36.0 — 2026-06-01
|
|
1552
|
+
|
|
1553
|
+
- **`boss board` — a live read of what's in flight (IDEA-015, Phase 1).** Occasioned by Ajesh's
|
|
1554
|
+
"internal kanban / fire a html site / Obsidian / almost a Trello board" idea. Convened six advisors
|
|
1555
|
+
(venture, architect, humane, designer + vibe-virtuoso & indie-hacker persona reactions); the result
|
|
1556
|
+
was **unanimous and collapses to one fork: build the *view*, refuse the *app*.** A board BOSS
|
|
1557
|
+
*renders* from state it already holds externalizes the arc for a tired brain; a board BOSS *becomes*
|
|
1558
|
+
(log in, drag cards, keep in sync) is the photo-negative of BOSS and Canvas R&H #1 wearing a UI.
|
|
1559
|
+
**The founder never touches the board — they change the work and it re-renders.**
|
|
1560
|
+
- **`boss board` (new CLI subcommand, [src/board.js](../src/board.js))** — derives four columns
|
|
1561
|
+
(Captured → Taking shape → Building → Shipped) from files that already exist. **Frontmatter is
|
|
1562
|
+
truth — reads each IDEA-*/FEAT-* file's `status`, never `docs/ideas/INDEX.md`** (a hand-maintained
|
|
1563
|
+
table that drifts; a board that trusts a drifting source lies). Pure projection: no
|
|
1564
|
+
`.boss/board.json`, no second source of truth, nothing to sync — so concurrent / out-of-order /
|
|
1565
|
+
agent edits can't corrupt a render (the answer to Ajesh's "picks something out of order" worry is
|
|
1566
|
+
*statelessness*, not merge logic). A promoted idea is represented by its FEAT card (no
|
|
1567
|
+
double-count); blocked FEATs flag `· blocked`.
|
|
1568
|
+
- **Humane constraint honored (mentor-humane override):** the riskiest-assumption status sits
|
|
1569
|
+
*above* the columns — when there's capture but nothing pressure-tested, the evidence line says so
|
|
1570
|
+
plainly and points at `/canvas`. Empty columns are shown, not hidden (the empty cell is the
|
|
1571
|
+
diagnostic). Plain factual copy — no completion-celebration, no gamification, no notifications.
|
|
1572
|
+
- **Deterministic projection, no model in the loop** → it's a CLI verb, not a skill (spending model
|
|
1573
|
+
tokens on `readFile` + string-template would be the anti-pattern the v0.34 frequency-ledger work
|
|
1574
|
+
fought). Ships with the binary → available in every project automatically; **no manifest change**.
|
|
1575
|
+
Lands in IDEA-006's already-portable Layer 1 (zero host contract).
|
|
1576
|
+
- **Earned its keep on first run:** reading BOSS's own repo, it surfaced real INDEX-vs-files drift
|
|
1577
|
+
(IDEA-003 / IDEA-014 are `building` in their frontmatter while INDEX still said `exploring`) — the
|
|
1578
|
+
exact failure mode that justified reading frontmatter over the table.
|
|
1579
|
+
- **Deferred by design (the discipline BOSS preaches, applied to itself):** `--html` (read-only,
|
|
1580
|
+
generate-on-demand, same data model) is gated behind *earn it first* — if `boss board` gets run
|
|
1581
|
+
unprompted each session, build the render; if not, the gate saved the work. Obsidian is mostly
|
|
1582
|
+
documentation (`docs/` is already a vault), not a build. Both captured in IDEA-015 with the
|
|
1583
|
+
write-back caveats. End-to-end tested in `/tmp` (empty / captured-only caution banner / canvas →
|
|
1584
|
+
Taking shape / FEAT supersession / blocked / shipped / placeholder-canvas negative test).
|
|
1585
|
+
|
|
1586
|
+
## 0.35.0 — 2026-06-01
|
|
1587
|
+
|
|
1588
|
+
- **The recalibration engine — `regrade.js` built + run-ready, and model-recalibration named as a
|
|
1589
|
+
standing discipline (IDEA-014 Phase 1).** Occasioned by Ajesh's direction: adapting to new/
|
|
1590
|
+
different models should be a *standing capability*, not the ad-hoc reaction the v0.31–v0.34 arc
|
|
1591
|
+
was. Picked as the move that *helps BOSS keep improving easily* — because both judge-moments
|
|
1592
|
+
(drift, caution) were `NEVER_GRADED`: the labeled sets existed but the model had never been tested
|
|
1593
|
+
against them. Every future judge-moment would ship as vibes until that closed. `regrade.js` is the
|
|
1594
|
+
keystone that makes all future conscience work *measurable*.
|
|
1595
|
+
- **`regrade.js` (zero-dep, env-gated, out-of-band)** — the calibrator promoted from spec-stub to
|
|
1596
|
+
real harness. Per case: a **decision call** (give the live model the exact voice frame the hook
|
|
1597
|
+
injects + the bounded read the instruction names + a neutral founder turn → fire or stay
|
|
1598
|
+
silent?) + a **judge call** (grade the nudge against the case rubric → structured verdict),
|
|
1599
|
+
writing `transcripts/<moment>/<id>.json` stamped with the current voice-hash. Uses Node built-in
|
|
1600
|
+
`fetch` (no SDK). **`--dry-run`** verifies the whole pipeline (prompt assembly across all 17
|
|
1601
|
+
cases + the judge parser) with no API and no spend — verified green; the live `fetch` is the
|
|
1602
|
+
only unexercised line (a standard POST). `npm run regrade`; `BOSS_REGRADE_MODEL` to point at a
|
|
1603
|
+
different model.
|
|
1604
|
+
- **`moments.js`** — shared voice-hash source of truth so `replay.js` and `regrade.js` *cannot*
|
|
1605
|
+
disagree on the fingerprint (a mismatch would mark every transcript STALE forever). `replay.js`
|
|
1606
|
+
refactored onto it (same hashes; suites regression-clean).
|
|
1607
|
+
- **`docs/architecture/MODEL-RECALIBRATION.md`** — the standing checklist (IDEA-014's earned
|
|
1608
|
+
slice). Two triggers: a new same-vendor model (*leverage more* — re-grade, revisit each
|
|
1609
|
+
boundary for "can this move UP now?", read the frequency ledger, check context headroom) and a
|
|
1610
|
+
new host running a different model (*degrade gracefully* — judge-moments fall back to predicate-
|
|
1611
|
+
only). `regrade.js` is its engine; re-running it on a new model **is** recalibration. Named as a
|
|
1612
|
+
PRINCIPLE-#1 **UP** candidate (founders face the same model-migration problem — `/claude-api`
|
|
1613
|
+
already migrates their apps).
|
|
1614
|
+
- **Deferred, loudly:** the per-host model-capability profile + a `/recalibrate` skill — until a
|
|
1615
|
+
second model/host exists (a `/recalibrate` with nothing to recalibrate *to* is the premature
|
|
1616
|
+
ceremony v0.34 dodged). Token accounting stays host-contract territory (IDEA-006).
|
|
1617
|
+
- Gate + judgment suites 105/0/41; `npm pack` ships 0 tooling files. The judgment is now
|
|
1618
|
+
*run-ready* to become model-verified — one `ANTHROPIC_API_KEY=… npm run regrade` away (no key in
|
|
1619
|
+
the build env, so the live grade is the founder's to trigger).
|
|
1620
|
+
|
|
1621
|
+
## 0.34.0 — 2026-06-01
|
|
1622
|
+
|
|
1623
|
+
- **Conscience frequency ledger — BOSS eats its own `/ai-cost` dogfood, honestly. The last build
|
|
1624
|
+
of the 4.8 arc.** As judge-moments multiplied (v0.31 drift, v0.33 caution now do model judgment
|
|
1625
|
+
in the live turn), the conscience began costing real tokens per prompt — while BOSS preaches
|
|
1626
|
+
cost discipline and never measured its own. The first 4.8 pass flagged the trap: *BOSS becomes
|
|
1627
|
+
the expensive-AI app it warns against.* (IDEA-013.)
|
|
1628
|
+
- **The reframe is the decision (mentor-architect):** the build started as "conscience *cost*
|
|
1629
|
+
instrumentation" and was reframed to **frequency — facts, not estimates.** The hook never
|
|
1630
|
+
calls a model, so a char→token estimate would manufacture a billable-looking number while
|
|
1631
|
+
blind to its dominant term (the induced bounded reads judge-moments trigger in the main turn).
|
|
1632
|
+
That's *lying with numbers* — the exact cost-theater BOSS warns against; PRINCIPLE #2 vetoes
|
|
1633
|
+
it as premature ceremony. The *real* problem under the cost framing is **over-firing** — the
|
|
1634
|
+
actual way a conscience becomes costly/annoying, and a number you'd act on. So: measure
|
|
1635
|
+
frequency honestly; defer the token question to where honest token data lives (host-side,
|
|
1636
|
+
IDEA-006).
|
|
1637
|
+
- **`.boss/conscience-log.jsonl`** (gitignored) — the hook appends one line per fire:
|
|
1638
|
+
`{ts, moments[{moment,confidence}], judge(bool), injected_chars, cohort}`. Facts only — **no
|
|
1639
|
+
token/dollar estimate.** A **separate BOSS-meta ledger**, never the founder's
|
|
1640
|
+
`.boss/cost-log.jsonl` (BOSS's overhead ≠ the founder's app cost).
|
|
1641
|
+
- **First correctness-invisible fire-path side effect.** The hook goes from pure-emit to
|
|
1642
|
+
has-a-side-effect; `logActivity` runs only past the silent early-exit, append-only, single
|
|
1643
|
+
write, in its own swallowing try/catch. **Verified byte-identical** hook output with and
|
|
1644
|
+
without the ledger writable — delete it and the conscience behaves the same.
|
|
1645
|
+
- **`boss conscience activity`** (alias `cost`, which prints the honest frequency-not-tokens
|
|
1646
|
+
reframe) — fires, judge-moment share, median injected chars, per-moment counts, and the
|
|
1647
|
+
**over-fire smell** (clustering: a moment firing ≥4×/hour or ≥8×/24h — flagged because no
|
|
1648
|
+
per-prompt denominator exists without logging non-fires, which would break the instant
|
|
1649
|
+
property). Plus a one-line activity summary in `boss status --conscience`.
|
|
1650
|
+
- **Measure-only; self-throttle deferred indefinitely.** A throttle would gag the conscience
|
|
1651
|
+
exactly when a drifting founder generates more prompts and needs it most — **humane before
|
|
1652
|
+
viable**, and a one-way door (every "why didn't it speak?" becomes unfalsifiable). This ledger
|
|
1653
|
+
is the *evidence* that would earn that conversation, not a step toward it. Token/dollar
|
|
1654
|
+
estimation also deferred — host-contract territory (the only honest token count comes from the
|
|
1655
|
+
host, not the hook). `JUDGE_MOMENTS` set added to the hook lib. Gate + judgment suites
|
|
1656
|
+
regression-clean (105/0/41; drift + caution covered).
|
|
1657
|
+
|
|
1658
|
+
## 0.33.0 — 2026-06-01
|
|
1659
|
+
|
|
1660
|
+
- **`caution` goes judge-backed — depth vs. avoidance, and the first reuse of the v0.32 judgment
|
|
1661
|
+
machinery on an *existing* moment.** Moment #1 (the conscience's flagship "what does this
|
|
1662
|
+
prove?") fires when ≥3 captures exist with no filled riskiest assumption. But the predicate
|
|
1663
|
+
counts *total* captures — it can't tell **depth** (one idea getting sharper, converging toward a
|
|
1664
|
+
canvas) from **avoidance** (capturing-lots / validating-nothing). That ambiguity is `m1-snf-021`,
|
|
1665
|
+
which the gate runner has *skipped since v0.16* because no predicate can make the call. v0.33
|
|
1666
|
+
makes caution judge-backed and resolves it.
|
|
1667
|
+
- **Voice-frame upgrade** (`signalAsContext`, `caution` branch): the gate still opens, but the
|
|
1668
|
+
model now silently reads the active idea's capture log and judges before voicing — *one idea
|
|
1669
|
+
sharpening (narrowing the user, finding the real pain, wrestling the same question) = stay
|
|
1670
|
+
silent; idea-hopping / feature-piling / market-notes-without-a-bet = fire and name the specific
|
|
1671
|
+
pattern.* This is **strictly more restraint** — caution can now only fire *less*, never more.
|
|
1672
|
+
- **First reuse of the judgment surface (v0.32) on an existing moment** — proves the machinery
|
|
1673
|
+
isn't drift-specific. `replay.js` is now **multi-moment**: a `MOMENTS` registry (drift +
|
|
1674
|
+
caution) drives one shared engine (per-moment voice-hash tripwire + well-formedness + coverage
|
|
1675
|
+
+ grading status). Adding a judgment moment = one registry row + a `<moment>.judgment.yml`.
|
|
1676
|
+
- **`caution.judgment.yml`** — 7 labeled cases: 3 should-fire-avoidance (idea-hopping,
|
|
1677
|
+
feature-piling, market-notes), 3 should-not-fire-depth (the m1-snf-021 case made concrete —
|
|
1678
|
+
narrowing, wrestling, converging), 1 ambiguous (depth-with-a-detour, `acceptable:
|
|
1679
|
+
[fires, silent]`). Content-rich capture-log fixtures (`fixtures-capturelog.js`) — the prose
|
|
1680
|
+
matters because the judgment is semantic.
|
|
1681
|
+
- **m1-snf-021 closed at the right layer:** the gate runner keeps skipping it (correctly — no
|
|
1682
|
+
predicate can make the call), now annotated "RESOLVED in v0.33; covered by the judgment
|
|
1683
|
+
surface," not "unimplemented." The case the gate *couldn't* test is exactly what the judgment
|
|
1684
|
+
channel was built for — the clearest proof v0.32 earned its keep.
|
|
1685
|
+
- Gate suite regression-clean **105/0/41**; both judgment moments well-formed + covered; caution
|
|
1686
|
+
voice-hash updated (tripwire reads the live frame). End-to-end tested via `boss new`: caution
|
|
1687
|
+
fires on scattered captures and ships the new depth-vs-avoidance instruction. The judgment
|
|
1688
|
+
remains **not yet model-verified** (replay prints NEVER_GRADED loudly) — first STALE builds
|
|
1689
|
+
`regrade.js`.
|
|
1690
|
+
|
|
1691
|
+
## 0.32.0 — 2026-06-01
|
|
1692
|
+
|
|
1693
|
+
- **Judgment-quality eval channel — closing the hole `drift-loop` (v0.31) opened.** drift was the
|
|
1694
|
+
first moment whose *detection is a model judgment*, not a predicate. The existing eval runner
|
|
1695
|
+
tests only the **gate** (does the hook fire on the right structural state); it stops at the
|
|
1696
|
+
door. Whether the model correctly calls drift-vs-on-aim *past* the open gate was unevaluated —
|
|
1697
|
+
a named crack in the "no moment ships without evals" floor, in exactly the spot 4.8 made
|
|
1698
|
+
load-bearing (`eval.md`: shipping detection logic with no way to know if it's right = "vibes-
|
|
1699
|
+
based AI in BOSS"). v0.32 builds the complement.
|
|
1700
|
+
- **The method (mentor-architect pass):** two surfaces, two cadences, on purpose. The gate-eval
|
|
1701
|
+
(`runner.js`, every commit, $0) stays. The **judgment surface** (`conscience-evals/judgment/`)
|
|
1702
|
+
is **golden transcripts gated by a voice-hash tripwire** — *not* pure LLM-as-judge (breaks the
|
|
1703
|
+
free/deterministic/CI property) and *not* pure golden transcripts (rot silently when the voice
|
|
1704
|
+
frame changes). Golden transcripts are the committed dataset; the tripwire makes their
|
|
1705
|
+
staleness *loud*; LLM-as-judge regenerates them out-of-band.
|
|
1706
|
+
- **`judgment/replay.js` — zero-dep, every commit.** (1) well-formedness — every case is a
|
|
1707
|
+
genuine open-gate state (filled risk, devlog ≥3 entries, no experiment line) with a coherent
|
|
1708
|
+
label; (2) **voice-hash tripwire** — fingerprints the exact `drift` instruction the model runs
|
|
1709
|
+
(`composeContext` for a drift signal); a transcript recorded against a different hash is
|
|
1710
|
+
`STALE` and replay says so loudly (golden transcripts that can't detect their own staleness
|
|
1711
|
+
are an eval that lies); (3) coverage floors (no silent caps; the on-aim/should-not-fire class
|
|
1712
|
+
is trust-critical and meant to *grow* toward ≥10); (4) grading status `GRADED/STALE/
|
|
1713
|
+
NEVER_GRADED/REGRESSION`. Exit 1 on malformed/coverage/regression; exit 0 + loud warnings on
|
|
1714
|
+
never-graded/stale. **All four states proven** (GRADED/STALE/REGRESSION/NEVER_GRADED tested).
|
|
1715
|
+
- **`judgment/drift.judgment.yml` — the labeled ground truth (Husain: build the set first).**
|
|
1716
|
+
10 cases: 4 should-fire-and-name-gap (incl. the sharpest — building the integration the risk
|
|
1717
|
+
explicitly *defers*), 5 should-not-fire-on-aim (the trust-critical class, incl. the hard
|
|
1718
|
+
"on-aim but informal — no experiment line, but the work IS the test" case), 1 ambiguous
|
|
1719
|
+
(`acceptable: [fires, silent]` so the grader doesn't punish a defensible call). Content-rich
|
|
1720
|
+
paired devlog fixtures (`fixtures-devlog.js`) — the *prose* matters because the judgment is
|
|
1721
|
+
semantic. `must_reference`/`must_not` rubric per case for the future LLM-judge.
|
|
1722
|
+
- **`judgment/regrade.js` — paid, out-of-band, DEFERRED by design.** The calibrator (decision
|
|
1723
|
+
call → judge call → write transcript stamped with the current voice-hash). Per the architect's
|
|
1724
|
+
staged cut: ship the labeled set + replay + tripwire first; build the API half *the week the
|
|
1725
|
+
first STALE tripwire fires*. Runnable spec-stub documents the full algorithm + env-checks. When
|
|
1726
|
+
built: zero-dep (Node `fetch`, no SDK), env-gated on `ANTHROPIC_API_KEY`, never on the commit
|
|
1727
|
+
path, never imported by `src/`, never in the `files` allowlist (lives under `docs/`).
|
|
1728
|
+
- **The zero-dep line, pinned:** the rule is *shipped surface (`src/`, `files`) stays
|
|
1729
|
+
dependency-free* — not "dev tooling can't call a model." Confirmed: `npm pack --dry-run` ships
|
|
1730
|
+
**0** judgment files; no `src/`/`bin/` reference to the eval dirs.
|
|
1731
|
+
- **Dogfooded in `eval.md`:** a model-judgment moment cannot ship its detection with only a
|
|
1732
|
+
gate-eval — the drift signal + exit predicates now require a judgment set + replay coverage for
|
|
1733
|
+
`drift` and every successor. Shared YAML parser extracted to `conscience-evals/lib/yaml-eval.js`
|
|
1734
|
+
(gate runner + replay use one copy; gate suite regression-clean at 105/0/41). New npm scripts:
|
|
1735
|
+
`eval:gate`, `eval:judgment`, `eval`.
|
|
1736
|
+
- **Honest scope:** v0.32 ships the labeled set + the staleness machinery + coverage discipline.
|
|
1737
|
+
The judgment is **not yet model-verified** — replay prints `NEVER_GRADED` loudly so a green run
|
|
1738
|
+
is never mistaken for a graded judgment. The first `STALE` is the trigger to build `regrade.js`.
|
|
1739
|
+
|
|
1740
|
+
## 0.31.0 — 2026-06-01
|
|
1741
|
+
|
|
1742
|
+
- **`drift-loop` — the closest loop to why BOSS exists, and the first moment that fronts a
|
|
1743
|
+
*model judgment* the predicate can't make.** PRINCIPLES.md opens with *"build faster without
|
|
1744
|
+
fooling themselves… continuously checking the work against real pain, real buyers."* For 30
|
|
1745
|
+
releases the conscience caught *structural* gaps (no canvas, no budget, no failure-states)
|
|
1746
|
+
but never the gap it names first: **you named the bet that could sink this, then spent your
|
|
1747
|
+
sessions building something else.** v0.31 closes it — occasioned by a "what does Opus 4.8
|
|
1748
|
+
change about BOSS" pass whose load-bearing finding was *the hook=detection / model=voice
|
|
1749
|
+
boundary has moved*: a stronger model can do the semantic judgment regex can't, so the
|
|
1750
|
+
conscience's first real "are you fooling yourself" check becomes buildable.
|
|
1751
|
+
- **`drift-loop` (L1-mvp, hook-runner)** — fires the new `drift` moment when (a) a canvas has
|
|
1752
|
+
a real **Riskiest assumption** (same predicate `canvas-loop` / `spec-loop` use) AND (b)
|
|
1753
|
+
`docs/devlog.md` has ≥3 dated entries (work accumulating, same threshold as
|
|
1754
|
+
`extraction-loop`) AND (c) no real **Experiment this week** validation plan exists yet.
|
|
1755
|
+
Sits in the gap *between* `caution` (no risk named) and the `done` graduation (risk has a
|
|
1756
|
+
plan). Confidence scales on devlog overshoot: 3 → low, 4–5 → medium, 6+ → high.
|
|
1757
|
+
- **The architecture, pinned (mentor-architect pass):** this is **not a new detector** — it's
|
|
1758
|
+
a *predicate-gated, bounded-read voicing instruction on the existing `UserPromptSubmit` →
|
|
1759
|
+
`additionalContext` channel*. The cheap Node predicate is the gate (and the cost control);
|
|
1760
|
+
the model does the stated-vs-actual comparison *in the live turn*, reading a **bounded** set
|
|
1761
|
+
(riskiest-assumption line + ~5 recent devlog entries + the open FEAT — never the whole
|
|
1762
|
+
project). No model call in the hook, **no new host primitive** (IDEA-006 contract untouched),
|
|
1763
|
+
**no new state**, no new predicate vocabulary. Same shape as `extraction-loop`/`/extract`,
|
|
1764
|
+
but **hook-fired not skill-invoked** — because a founder who has drifted from their own
|
|
1765
|
+
stated bet is exactly the founder who won't think to *ask* whether they've drifted.
|
|
1766
|
+
- **New `drift` voice frame** in `signalAsContext` (loop-runtime.js): instructs silent bounded
|
|
1767
|
+
read → judge → name the *specific* gap ("you said X is the bet; the last sessions built Y, Z;
|
|
1768
|
+
neither tests X") → point at `/canvas` or `/pretotype`. **Stay silent when on-aim** (silence
|
|
1769
|
+
is the correct output). Not a "you've been productive!" reward, not a generic "you should
|
|
1770
|
+
validate" line — the value is the specific comparison. Cohort-aware (returning-founder gets
|
|
1771
|
+
the harder conviction cut; first-product gets "test your riskiest bet" taught plainly;
|
|
1772
|
+
domain-expert gets the who-could-be-harmed lens on the named risk).
|
|
1773
|
+
- **Eval coverage from the start** — `moment-drift.yml` (14 cases: 4 should-fire incl. the
|
|
1774
|
+
placeholder-experiment edge + the drift/capture co-fire; 10 should-not incl. plan-recorded,
|
|
1775
|
+
under-threshold, unnamed-risk, dropped-idea, pre-MVP, and the documented `any_file_matches`
|
|
1776
|
+
masking limitation). Runner extended: `buildCanvasFile` now writes the experiment line.
|
|
1777
|
+
Suite **105/0/41** (up from 91). *No moment ships without evals* — held. End-to-end tested
|
|
1778
|
+
in `/tmp`: fires low→high as devlog grows, goes silent when the validation plan is recorded.
|
|
1779
|
+
- L1-mvp now ships 12 skills + **8 loops**. Honest scope note: hook evals test the *gate*;
|
|
1780
|
+
judgment-quality (does the model correctly call drift vs. on-aim) is the model's layer,
|
|
1781
|
+
tested the way `/extract`'s judgment is — not this runner.
|
|
1782
|
+
|
|
1783
|
+
## 0.30.0 — 2026-05-27
|
|
1784
|
+
|
|
1785
|
+
- **Closing the two remaining audit gaps — cost-log read cadence + failure-state stub
|
|
1786
|
+
loophole.** Same shape as v0.27: no new product axis; the discipline tightens around
|
|
1787
|
+
surface already shipped. The two gaps the audit named after v0.26 — *"the weekly review
|
|
1788
|
+
cadence is unenforced"* and *"failure-state handlers can be stubs forever"* — were real
|
|
1789
|
+
holes. v0.30 closes both.
|
|
1790
|
+
- **`/cost-review` skill (L1-mvp)** — reads `.boss/cost-log.jsonl`, aggregates by FEAT +
|
|
1791
|
+
user + cohort, compares against `docs/ai-cost-budget.md`, flags surprises (cost
|
|
1792
|
+
outliers > 10× median; user outliers; FEAT skew; quiet upward drift), writes
|
|
1793
|
+
`docs/cost-reviews/REVIEW-YYYY-MM-DD.md` with headline / numbers / variance / surprises /
|
|
1794
|
+
actions / next-review fields. Cohort-aware framing (indie-hacker gets %-of-revenue;
|
|
1795
|
+
returning-founder gets unit economics; domain-expert gets privacy-first confirmation
|
|
1796
|
+
before showing any review data). Surfaces mentor handoffs (architect for cost-shape;
|
|
1797
|
+
business for unit-economics) when overages exceed 10% of monthly cap.
|
|
1798
|
+
- **`cost-review-loop` (L1-mvp, hook-runner)** — **second time-of-work entry pattern**
|
|
1799
|
+
(after extraction-loop). Entry: `docs/ai-cost-budget.md` exists (the deontic moment —
|
|
1800
|
+
declaring the budget commits you to reading it). Exit: ≥1 file in `docs/cost-reviews/
|
|
1801
|
+
REVIEW-*.md` with a `^- \*\*Total spend:\*\*` line. The first review closes the loop;
|
|
1802
|
+
recurring re-opening waits on time-aware predicate vocabulary (same dependency as
|
|
1803
|
+
extraction-loop). New **`cost-stale` moment** added to `signalAsContext` voice frame.
|
|
1804
|
+
- **`/ai-failure-states` template — Eval-tested column added.** Each failure-state row
|
|
1805
|
+
(garbage / refusal / hallucination / timeout / cost-spike) now carries an **Eval-tested**
|
|
1806
|
+
field naming which eval case actually exercises the handler, OR marked **STUB** with an
|
|
1807
|
+
override required per IDEA-008. *Closes the stub-forever loophole at the declaration
|
|
1808
|
+
layer.* Voice note: stubs are still legitimate; *stubs without an override-with-re-open-
|
|
1809
|
+
condition* are not.
|
|
1810
|
+
- **`/evals` skill — failure-mode coverage requirement.** For AI-mediated FEATs, the eval
|
|
1811
|
+
set must include at least one `should-fail` case per declared failure state, categorized
|
|
1812
|
+
by `failure_mode` matching the canonical names (`garbage`/`refusal`/`hallucination`/
|
|
1813
|
+
`timeout`/`cost-spike`/`other`). *Closes the stub-forever loophole at the test layer.*
|
|
1814
|
+
The two layers (declaration + test) compose: handler stubs in code → declaration in
|
|
1815
|
+
docs → eval cases that exercise the declarations → real coverage.
|
|
1816
|
+
- **Eval coverage from the start (v0.27 rule held).** `moment-cost-stale.yml` ships with 9
|
|
1817
|
+
cases: should-fire (no review; empty directory; stub file without canonical line); should-
|
|
1818
|
+
fire-multi (cost + cost-stale together when logger isn't wired but budget is); should-not-
|
|
1819
|
+
fire (full closure with review on record; pre-budget projects; LLM calls without budget
|
|
1820
|
+
declared — confirms entry is gated on budget doc, not LLM calls; Quickstart-shape projects).
|
|
1821
|
+
3 existing "full discipline declared" cases (m-cost-101, m-fm-101, m-cap-202) updated to
|
|
1822
|
+
also satisfy cost-review-loop's exit predicate, since they assert all-loops-closed.
|
|
1823
|
+
- **Suite count: 91 passed / 0 failed / 41 skipped (133 loaded)** — up from 83/0/41. Every
|
|
1824
|
+
hook-emitted moment now has eval coverage. The "no moment ships without evals" discipline
|
|
1825
|
+
is the new floor.
|
|
1826
|
+
- **What v0.30 does NOT do:** auto-enforce the failure-mode eval coverage (the `/evals`
|
|
1827
|
+
skill names the requirement; nothing fails the build if the eval set lacks coverage; that
|
|
1828
|
+
would be a v0.31+ tightening if founders ship handlers-without-evals systematically);
|
|
1829
|
+
recurring cost-review-loop (needs time-aware predicate vocabulary); pull `.boss/cost-
|
|
1830
|
+
log.jsonl` into the skill review output automatically (the parsing logic lives in the
|
|
1831
|
+
skill's instructions; Claude does the reading per-call to keep zero-dep CLI).
|
|
1832
|
+
- **The audit is closed.** The three load-bearing gaps from the post-v0.26 audit (eval
|
|
1833
|
+
coverage, /welcome onboarding, moment #3 capture) shipped in v0.27 / v0.28 / v0.29; the
|
|
1834
|
+
two discipline-hole gaps in already-shipped surface (cost-review cadence, failure-state
|
|
1835
|
+
stub loophole) shipped in v0.30. The right-to-defer items (brownfield, /consult, AI-first
|
|
1836
|
+
archetype template, IDEA-003 finish, etc.) remain on the v0.31+ backlog. Audit-driven
|
|
1837
|
+
hardening complete; the next move is feature work.
|
|
1838
|
+
|
|
1839
|
+
## 0.29.0 — 2026-05-27
|
|
1840
|
+
|
|
1841
|
+
- **Moment #3 lands — PRINCIPLE #1's own discipline, encoded (finally).** For 28 releases the
|
|
1842
|
+
conscience surfaced 6 moments (caution / done / restraint / coherence / cost / failure-mode)
|
|
1843
|
+
but never one for PRINCIPLE #1 itself — the *pause to extract patterns UP or DOWN* rule that
|
|
1844
|
+
defines what BOSS is. *"Capture"* was deferred 5 consecutive releases as "needs LLM-as-judge
|
|
1845
|
+
or heuristic design." v0.29 ships the heuristic-plus-judgment version: a hook-runner loop
|
|
1846
|
+
on a simple devlog heuristic + a skill that does the real judging.
|
|
1847
|
+
- **`extraction-loop` (L1-mvp, hook-runner)** — entry: `docs/devlog.md` has ≥3 dated
|
|
1848
|
+
entries (regex `^## \d{4}-\d{2}-\d{2}`); exit: ≥1 `docs/extractions/EXTR-*.md` file has
|
|
1849
|
+
a `- **Route:**` line. **First hook-runner loop whose entry is *time-of-work* (devlog
|
|
1850
|
+
count) rather than *file-state predicate*.** Sets the precedent for future judgment-
|
|
1851
|
+
required moments that can't be detected by file regex alone. Threshold of 3 mirrors
|
|
1852
|
+
PRINCIPLE #1's *"the third time the same work repeats"* signal.
|
|
1853
|
+
- **`/extract` skill (L1-mvp)** — the LLM-as-judge counterpart. Reads recent commits, the
|
|
1854
|
+
last 3-5 devlog entries, the current FEAT, `library/`, and `src/`. Identifies 1-3
|
|
1855
|
+
candidate patterns by three signals (Signal A: same work repeated; Signal B: named-and-
|
|
1856
|
+
stable shape; Signal C: load-bearing decision). Routes each candidate **UP** (into BOSS's
|
|
1857
|
+
`library/<cat>/` via `boss learn`) or **DOWN** (into the app's `src/` as a named refactor
|
|
1858
|
+
target) or honest **NOT-YET** (the third legitimate answer; recording the *not yet* IS
|
|
1859
|
+
the discipline). Writes `docs/extractions/EXTR-NNN-<slug>.md` with full routing record +
|
|
1860
|
+
"what didn't make the cut" section. Cohort-aware (returning-founder gets the seasoned
|
|
1861
|
+
`"what did you do twice?"` prompt; first-product gets gentler framing; domain-expert
|
|
1862
|
+
leans NOT-YET-with-caveats for regulated logic).
|
|
1863
|
+
- **New `capture` moment in the conscience** — added to `signalAsContext` in
|
|
1864
|
+
`lib/loop-runtime.js`. Voice frame names the inflection in plain language; explicitly
|
|
1865
|
+
rejects *"you've been productive!"* reward framing (PRINCIPLE #1 is the discipline, not
|
|
1866
|
+
the dopamine). Cohort decides the specific framing.
|
|
1867
|
+
- **`EXTR-NNN` ID type added** — `docs/IDS.md` template updates to name it under MVP-mode
|
|
1868
|
+
unlocks alongside `FEAT-NNN` / `FIX-NNN` / `BUG-NNN`.
|
|
1869
|
+
- **L1-mvp manifest** — now ships 11 skills + 6 loops. `claude-append.md` names `/extract`
|
|
1870
|
+
and `extraction-loop` alongside the others; calls out the *"two destinations, not one"*
|
|
1871
|
+
framing per PRINCIPLE #1.
|
|
1872
|
+
- **Eval coverage from the start (closes the v0.27 hole going forward).** `moment-capture.yml`
|
|
1873
|
+
ships with 11 cases: should-fire (3, 5 entries; empty extractions/ dir doesn't close); should-
|
|
1874
|
+
not-fire (≤2 entries; closed by UP route; closed by NOT-YET route; empty project; empty
|
|
1875
|
+
devlog file); Quickstart-shape sanity (no devlog → no fire); multi-loop isolation (all four
|
|
1876
|
+
AI-first/extraction loops closed independently). The discipline applied: **no moment ships
|
|
1877
|
+
without its evals.**
|
|
1878
|
+
- **Runner upgrades** — new fixtures: `devlog_3_entries`, `devlog_2_entries`, `devlog_5_entries`,
|
|
1879
|
+
`extraction_record_up`, `extraction_record_not_yet`. Loads `moment-capture.yml` in main().
|
|
1880
|
+
**Suite count: 83 passed / 0 failed / 41 skipped (124 loaded)** — up from 73/0/41.
|
|
1881
|
+
- **What v0.29 does NOT do:** auto-recurring extraction-loop (closes after one extraction;
|
|
1882
|
+
re-opening requires founder action or future predicate vocabulary with time-aware checks
|
|
1883
|
+
like *"N devlog entries since last extraction"*); execute the actual extraction (the skill
|
|
1884
|
+
proposes UP/DOWN routes; `boss learn` handles UP execution; refactors are the founder's
|
|
1885
|
+
call); LLM-call-out detector (some "is this reusable?" judgments would benefit from an
|
|
1886
|
+
external model call, but Claude-running-the-skill IS the LLM-as-judge — keeping it in-
|
|
1887
|
+
context is simpler and cheaper).
|
|
1888
|
+
- **The architectural precedent.** extraction-loop is the **first time-of-work entry**
|
|
1889
|
+
pattern. Future loops that need heuristics like "X has accumulated since Y" can reuse this
|
|
1890
|
+
shape (count files of one type; gate by absence of files of another type). Documented in
|
|
1891
|
+
the loop spec for future authors. Tightens the v0.18 runtime's vocabulary without changing
|
|
1892
|
+
the runner.
|
|
1893
|
+
|
|
1894
|
+
## 0.28.0 — 2026-05-27
|
|
1895
|
+
|
|
1896
|
+
- **`/welcome` — the gentle first-run orientation (closes the v0.19 cohort gap).** The v0.19
|
|
1897
|
+
personas pass surfaced that `first-product`, `vibe-coder-newbie`, and `non-tech-founder`
|
|
1898
|
+
cohorts bounce off BOSS because the first thing they see is *"drop your PRD."* Power-user
|
|
1899
|
+
framing, beginner cohort. v0.28 adds the gentler door: a Quickstart skill that orients in
|
|
1900
|
+
~1 minute, defines terms inline, names the conscience as a *nudge never a block*, surfaces
|
|
1901
|
+
override + pause as discoverable affordances, and routes to `/boss` or `/triage` based on
|
|
1902
|
+
what the founder is ready for.
|
|
1903
|
+
- **Cohort-aware depth** — the skill branches on `.boss/config.json` cohort:
|
|
1904
|
+
- `first-product` / `vibe-coder-newbie` / `non-tech-founder` / null → **full tour**: what
|
|
1905
|
+
BOSS is, what's in the folder, what to do next, how the conscience works, how modes level
|
|
1906
|
+
up, where to find help. Each section 2-3 sentences; terms defined inline; plain language.
|
|
1907
|
+
- `eng-builder` / `vibe-virtuoso` / `indie-hacker` / `returning-founder` → **30-second
|
|
1908
|
+
version**: one paragraph, pointer at `/boss`. *"You probably don't need the tour."* Then
|
|
1909
|
+
stops; doesn't elaborate.
|
|
1910
|
+
- `domain-expert` → **middle path**: full tour with high-stakes framing inline (privacy-
|
|
1911
|
+
first defaults, human-in-the-loop on hallucination, conscience errs toward speaking).
|
|
1912
|
+
- **Discoverability fix for IDEA-011** — the skill explicitly surfaces `boss conscience pause`
|
|
1913
|
+
+ the override grammar (*"deviation conscious, recorded, never blocked, never forgotten"*).
|
|
1914
|
+
Previously these were documented in CHANGELOG / docs/ideas; now they're surfaced in the
|
|
1915
|
+
cohort's first 5 minutes with BOSS. Partial close on IDEA-011 Phase 2's override-
|
|
1916
|
+
discoverability sub-gap.
|
|
1917
|
+
- **Cohort question moved upstream** — `/welcome` asks the cohort question first (when unset),
|
|
1918
|
+
before any orientation; `/boss` step 6 stays as the backup path for founders who go
|
|
1919
|
+
directly to `/boss`. The cohort persists in `.boss/config.json` for the conscience hook.
|
|
1920
|
+
- **`boss new` output updated** — the post-scaffold "Next:" block now names BOTH paths:
|
|
1921
|
+
`/welcome` for the first-time door, `/boss <idea|PRD>` for the power-user door. Same
|
|
1922
|
+
language as CLAUDE.md (template) top.
|
|
1923
|
+
- **L0-quickstart manifest** — adds `welcome` to the skills list (now 6 skills). No new
|
|
1924
|
+
agents, no new loops, no new hooks. Quickstart still ships a tiny agentic footprint.
|
|
1925
|
+
- **L0 CLAUDE.md (template)** — adds a top-line nudge: *"First time? Run `/welcome` — gentle
|
|
1926
|
+
orientation, takes a minute, defines terms inline. Already familiar with BOSS? Skip to
|
|
1927
|
+
`/boss <your idea or PRD path>` to spin up."* The first thing a fresh founder reads.
|
|
1928
|
+
- **End-to-end tested in /tmp** — `boss new welcome-test` → 6 skills land including welcome;
|
|
1929
|
+
CLAUDE.md surfaces /welcome at top; 73/73 conscience evals still pass.
|
|
1930
|
+
- **What v0.28 does NOT do:** auto-trigger `/welcome` on first Claude prompt (the CLAUDE.md
|
|
1931
|
+
surfacing is enough; auto-trigger via hook would conflict with cohort `eng-builder` /
|
|
1932
|
+
`returning-founder` who explicitly *don't* want it); ship an end-user onboarding flow (that
|
|
1933
|
+
was named in IDEA-012 as a separate v0.27+ candidate; this skill is founder-onboarding,
|
|
1934
|
+
not end-user-onboarding); add a `/welcome` for V1 / Scale modes (the lift is once, not per-
|
|
1935
|
+
mode — re-running in MVP+ is supported but the orientation doesn't change much).
|
|
1936
|
+
|
|
1937
|
+
## 0.27.0 — 2026-05-24
|
|
1938
|
+
|
|
1939
|
+
- **Conscience evals — closing the discipline-on-the-discipline-tool hole.** Four moments
|
|
1940
|
+
(restraint / coherence / cost / failure-mode) had shipped without eval coverage; the brake
|
|
1941
|
+
the discipline named (*"43/43 pass"*) was eroding silently. v0.27 closes the three
|
|
1942
|
+
hook-emitted ones; restraint is skill-side and stays out of the hook eval suite by design.
|
|
1943
|
+
- **`moment-cost.yml`** — 12 cases covering: Anthropic / OpenAI / Vercel AI SDK
|
|
1944
|
+
(generateText, streamText) entry detection; partial closure (budget doc alone, logger ref
|
|
1945
|
+
alone); full closure; multi-moment co-firing with failure-mode at the first-LLM-call
|
|
1946
|
+
inflection; should-not-fire on no-LLM and empty projects.
|
|
1947
|
+
- **`moment-failure-mode.yml`** — 8 cases isolating the failure-mode signal (close the cost
|
|
1948
|
+
loop to single out failure-mode); covers partial closure (doc alone, handlers alone),
|
|
1949
|
+
full closure, and the no-entry case.
|
|
1950
|
+
- **`moment-coherence.yml`** — 10 cases covering: className= / styled-components / inline
|
|
1951
|
+
style={} entry detection; confidence scaling (3 declarations = low; 12+ = high); partial
|
|
1952
|
+
closure (tokens doc alone, refs alone); full closure; under-threshold (2 declarations)
|
|
1953
|
+
and no-UI cases.
|
|
1954
|
+
- **Runner upgrades:**
|
|
1955
|
+
- Loads both L0-quickstart AND L1-mvp loops (L1 loops live in `stages/L1-mvp/template/
|
|
1956
|
+
docs/loops/`; previous runner only loaded L0, which is why three moments could ship
|
|
1957
|
+
without coverage).
|
|
1958
|
+
- **`src_files` / `docs_files` / `other_files` in `project_state`** — materializes
|
|
1959
|
+
arbitrary file paths via the new `FIXTURES` registry (`runner.js`). The minimal YAML
|
|
1960
|
+
parser doesn't support multi-line scalars, so file content lives in the runner as
|
|
1961
|
+
named fixtures: `anthropic_call`, `openai_call`, `vercel_ai_call`, `vercel_ai_stream`,
|
|
1962
|
+
`cost_budget_doc`, `cost_logger_ref`, `failure_states_doc`, `failure_handlers_ref`,
|
|
1963
|
+
`style_decls_low/high/two`, `style_styled_components`, `style_inline_objects`,
|
|
1964
|
+
`design_tokens_doc`, `token_refs`, `no_llm_code`, `empty`.
|
|
1965
|
+
- **Multi-moment assertions** — `expected_detection.moments: [cost, failure-mode]`
|
|
1966
|
+
asserts set inclusion across the actual signals list (not just the first signal).
|
|
1967
|
+
Useful for cases where multiple loops fire simultaneously by design.
|
|
1968
|
+
- **Single-moment fallback** — when `expected_detection.moment` is specified, the
|
|
1969
|
+
assertion also checks the moments list (not just the first signal), since signal
|
|
1970
|
+
ordering is filesystem-dependent.
|
|
1971
|
+
- **README updated** — current-cut section names all five tested moments; documents the
|
|
1972
|
+
multi-moment format and the FIXTURES registry pattern.
|
|
1973
|
+
- **Suite count: 73 passed / 0 failed / 41 skipped (114 loaded)** — up from 43/43. The 41
|
|
1974
|
+
skipped are unchanged: 20 moment-2 cases that live in `/canvas` skill prompt (no hook
|
|
1975
|
+
detector), 6 signal-text evals (separate runner), 15 cases gated on suppress_if / devlog
|
|
1976
|
+
awareness / session-state tracking that aren't yet implemented.
|
|
1977
|
+
- **What v0.27 does NOT do:** sharpen the first-cut wording in any YAML (Ajesh's sharpening
|
|
1978
|
+
pass remains pending — flagged in each file's frontmatter); ship a skill-eval runner for
|
|
1979
|
+
`spec-loop` / `pretotype-loop` (those are skill-side; need a different runner pattern);
|
|
1980
|
+
add a sixth moment for "capture" (PRINCIPLE #1 — gap #1 still on the table; needs an
|
|
1981
|
+
LLM-as-judge or heuristic detector design).
|
|
1982
|
+
- **Why this isn't a feature release:** no skills, agents, loops, or hooks shipped. The
|
|
1983
|
+
discipline tightened around what's already shipped. Same shape as v0.24 (positioning
|
|
1984
|
+
pass): not the most exciting release, plausibly the highest-leverage one this stretch.
|
|
1985
|
+
|
|
1986
|
+
## 0.26.0 — 2026-05-24
|
|
1987
|
+
|
|
1988
|
+
- **AI-first product template — BOSS's home turf, made first-class.** The v0.24 positioning
|
|
1989
|
+
named BOSS as *"the thinking layer for AI-native founders."* v0.26 ships the concrete
|
|
1990
|
+
artifact that earns the name: a conductor skill that walks the founder through the AI-first
|
|
1991
|
+
discipline **from day one**, plus the missing piece (failure-states design) that completes
|
|
1992
|
+
the spine. Cursor + a folder is "AI-native." BOSS + `/ai-first-init` is *"AI-first with
|
|
1993
|
+
discipline from day one."*
|
|
1994
|
+
- **`/ai-first-init` skill (L1-mvp)** — the **conductor**. Walks the founder through five
|
|
1995
|
+
steps: (1) declare what's AI-mediated → `docs/ai-first.md`; (2) seed structured outputs
|
|
1996
|
+
(Liu) → `docs/schemas/`; (3) seed eval set early (Husain) → `/evals --new`; (4) declare
|
|
1997
|
+
cost budget upfront → `/ai-cost`; (5) design failure states → `/ai-failure-states`. The
|
|
1998
|
+
"from day one" framing: declare the discipline *before* the first AI-mediated FEAT ships,
|
|
1999
|
+
not after the first bill / hallucination / refusal.
|
|
2000
|
+
- **`/ai-failure-states` skill (L1-mvp)** — the **missing piece**. Walks the founder through
|
|
2001
|
+
five guaranteed failure modes (garbage / refusal / hallucination / timeout / cost-spike),
|
|
2002
|
+
each with a project-specific declared response + stub fallback handler in code (so the
|
|
2003
|
+
discipline is wired before the founder forgets). Writes `docs/ai-failure-states.md`.
|
|
2004
|
+
Cohort-aware: `first-product` gets named patterns; `eng-builder` gets the unhandled-path
|
|
2005
|
+
angle; **`domain-expert` defaults to human-in-the-loop on hallucination** (no retry-loop
|
|
2006
|
+
on medical/legal/financial output).
|
|
2007
|
+
- **`ai-failure-state-loop` (L1-mvp, hook-runner)** — entry: ≥1 LLM SDK call site
|
|
2008
|
+
(parity with `cost-budget-loop` — the two failure modes always coexist at the
|
|
2009
|
+
AI-mediated boundary). Exit: `docs/ai-failure-states.md` exists AND code references at
|
|
2010
|
+
least one fallback handler (`handleGarbageResponse` / `handleRefusal` /
|
|
2011
|
+
`handleHallucination` / `handleTimeout` / `handleCostSpike` or snake_case Python
|
|
2012
|
+
equivalents). New `failure-mode` moment added to `signalAsContext` voice frame.
|
|
2013
|
+
- **`/spec` upgrade** — for AI-mediated FEATs, the spec template now includes a **Failure
|
|
2014
|
+
states** section alongside the existing evals + validated-learning fields. Names which
|
|
2015
|
+
of the five failure states the FEAT must handle + which fallback handler it routes to.
|
|
2016
|
+
Acceptance criteria are asked to reference at least one failure-state path.
|
|
2017
|
+
- **`/boss` AI-native nudge (L0-quickstart)** — when the founder names the model as
|
|
2018
|
+
load-bearing during spin-up (the product doesn't work without it), `/boss` names it
|
|
2019
|
+
explicitly back and recommends *"after `boss unlock mvp`, run `/ai-first-init`."* The
|
|
2020
|
+
recommendation is the artifact; `/boss` never runs it for the founder.
|
|
2021
|
+
- **`docs/ai-first.md` as the declaration contract.** Future FEATs read this doc; future
|
|
2022
|
+
`/spec` runs reference its cross-reference fields. If the doc says "deterministic" for a
|
|
2023
|
+
feature and a PR puts an LLM call in there, that's a real change worth a re-spec —
|
|
2024
|
+
promotes design-by-archeology into design-by-declaration.
|
|
2025
|
+
- **L1-mvp manifest now ships 10 skills + 5 loops + 4 agents.** `claude-append.md` names
|
|
2026
|
+
`/ai-first-init`, `/ai-failure-states`, and `ai-failure-state-loop` alongside the others.
|
|
2027
|
+
- **The lineage cited.** Husain (failure-mode categorization extended from evals → UX), Liu
|
|
2028
|
+
(structured outputs as the contract that makes failure-detection mechanical), Karpathy
|
|
2029
|
+
(the failure surface IS the design surface — designing the happy path is the easy 20%),
|
|
2030
|
+
Mollick (cost-as-design-input continuing from v0.25). No new mentors added; the discipline
|
|
2031
|
+
is the existing roster applied to the AI-mediated boundary.
|
|
2032
|
+
- **Conscience regression-clean.** The new loop lives in L1-mvp; the eval runner only loads
|
|
2033
|
+
L0-quickstart loops; no eval fixtures fire `failure-mode`. End-to-end tested in `/tmp`:
|
|
2034
|
+
fresh project → `boss unlock mvp` → drop LLM SDK call in `src/` → hook fires BOTH `cost`
|
|
2035
|
+
AND `failure-mode` simultaneously (the two loops share entry but track different exits).
|
|
2036
|
+
|
|
2037
|
+
## 0.25.0 — 2026-05-24
|
|
2038
|
+
|
|
2039
|
+
- **AI cost discipline — the universal-cohort feature lands.** Per IDEA-012's persona overlay,
|
|
2040
|
+
AI cost was the only candidate every cohort cared about. Now it's first-class in MVP mode:
|
|
2041
|
+
the founder gets nudged to declare the bill at the first LLM SDK call, not after the first
|
|
2042
|
+
surprise invoice.
|
|
2043
|
+
- **`/ai-cost` skill (L1-mvp)** — walks the founder through declaring `docs/ai-cost-budget.md`
|
|
2044
|
+
(per-user/day + monthly cap + model rationale + review cadence + breach grammar), suggests
|
|
2045
|
+
a ~30-line cost-logger wrapper (TypeScript + Python examples) writing to
|
|
2046
|
+
`.boss/cost-log.jsonl`, and surfaces mentor handoffs (`mentor-architect` for cost-shape →
|
|
2047
|
+
architecture; `mentor-business` for cost-per-user → pricing). **Cohort-aware defaults:**
|
|
2048
|
+
`first-product` $5/user/day strict; `vibe-virtuoso` inspect-only; `eng-builder` BYO;
|
|
2049
|
+
`indie-hacker` cost-as-%-of-revenue framing; `domain-expert` $20/user/day + **privacy-
|
|
2050
|
+
first logging (no PII, no prompt body)** + regulatory caveats; etc. The founder edits to
|
|
2051
|
+
fit the bet; the cohort sets the starting frame.
|
|
2052
|
+
- **`cost-budget-loop` (L1-mvp, hook-runner)** — entry: `src/**` contains ≥1 LLM SDK call
|
|
2053
|
+
site (regex covers `anthropic` / `@anthropic-ai/sdk` / `openai` / `OpenAI(` / `Anthropic(` /
|
|
2054
|
+
`messages.create` / `chat.completions.create` / Vercel AI SDK `generateText` /
|
|
2055
|
+
`streamText`); exit: `docs/ai-cost-budget.md` exists AND code references the logger
|
|
2056
|
+
wrapper (≥1 occurrence). Threshold of 1 (not 3 like design-tokens-loop) because cost
|
|
2057
|
+
discipline is **deontic at the first call** — there's no "exploratory" version of token
|
|
2058
|
+
spend. The first call hits a real billing meter.
|
|
2059
|
+
- **New `cost` moment in the conscience** — added to `signalAsContext` in
|
|
2060
|
+
`lib/loop-runtime.js`. Voice frame: name the bill exists in one line (cohort decides the
|
|
2061
|
+
framing — first-product wants a number, vibe-virtuoso wants the inspect affordance,
|
|
2062
|
+
domain-expert wants the privacy posture), point at `/ai-cost`, hand the decision back.
|
|
2063
|
+
Never blocks. Override via devlog per IDEA-008.
|
|
2064
|
+
- **`.gitignore`** updated to exclude `.boss/cost-log.jsonl` (local ledger; ship to a real
|
|
2065
|
+
datastore when you have real users — the skill's review-cadence step says so).
|
|
2066
|
+
- **L1-mvp manifest** now ships 8 skills + 4 loops + same 4 agents. `claude-append.md`
|
|
2067
|
+
names the new skill and loop alongside the others.
|
|
2068
|
+
- **Pairs (not auto-invoked) with two existing mentors.** The handoff lines in `/ai-cost`:
|
|
2069
|
+
*"`mentor-architect`, the cost shape says X — what architecture decisions does that imply?"*
|
|
2070
|
+
and *"`mentor-business`, our cost-per-active-user is X; what should the pricing carry?"*
|
|
2071
|
+
Cost discipline is the load-bearing connector between architecture (caching, batching,
|
|
2072
|
+
cheaper-fallback) and business (unit economics, willingness-to-pay).
|
|
2073
|
+
- **The discipline reads as Husain-applied-to-spend.** *"Almost all AI cost problems are
|
|
2074
|
+
visible in the ledger, and almost no one keeps one."* Liu cited for structured-outputs-as-
|
|
2075
|
+
cost-lever; Mollick for cost-as-design-input. No new mentor or practitioner added — just the
|
|
2076
|
+
application of existing discipline to a different artifact.
|
|
2077
|
+
- **What v0.25 does NOT do:** auto-instrument the user's code (judgment call: the founder
|
|
2078
|
+
knows their call sites better; the skill *suggests* the wrapper, doesn't write it). Auto-
|
|
2079
|
+
detect non-mainstream SDKs (LangChain wrappers, Replicate, Cohere, Bedrock). Ship a budget-
|
|
2080
|
+
enforcement layer (this is a *nudge*, never a gate — IDEA-011 discipline applies).
|
|
2081
|
+
- **Conscience regression-clean.** Existing 43/43 conscience evals still pass — the new loop
|
|
2082
|
+
lives in L1-mvp, the eval runner only loads L0-quickstart loops, so the new moment has no
|
|
2083
|
+
way to fire against the existing eval fixtures. End-to-end tested in `/tmp`.
|
|
2084
|
+
|
|
2085
|
+
## 0.24.0 — 2026-05-23
|
|
2086
|
+
|
|
2087
|
+
- **Positioning pass — first non-feature release in BOSS's history.** The deliverable is *the
|
|
2088
|
+
thinking* (positioning + README edit), not code. Per IDEA-012's revised roadmap, v0.24 was
|
|
2089
|
+
explicitly the positioning pass, not a feature release. The Dunford exercise was first
|
|
2090
|
+
recommended in the v0.15 advisory pass and deferred through 8 capability releases; finally
|
|
2091
|
+
executed.
|
|
2092
|
+
- **Output:** [`docs/dossier/positioning-pass-001.md`](docs/dossier/positioning-pass-001.md) —
|
|
2093
|
+
full Dunford exercise: target cohort (the founder using Cursor/Claude Code 3+ months with
|
|
2094
|
+
2+ unfinished projects), 8 competitive alternatives, 8 unique attributes that survive
|
|
2095
|
+
scrutiny, attributes mapped to value, 7 market-frame options, **8 candidate killer one-
|
|
2096
|
+
sentences stranger-tested**, 8 cohort-tailored variants per persona, the trend layer
|
|
2097
|
+
(*AI raised the speed limit; almost nothing raised the discipline limit*), 6 decisions to
|
|
2098
|
+
act on, downstream consequences.
|
|
2099
|
+
- **Decisions on the record:**
|
|
2100
|
+
- **Lead sentence (elevator):** *"BOSS is the just-in-time conscience for AI-native
|
|
2101
|
+
founders. Pause it any time."* (13 words; killer for verbal intros.)
|
|
2102
|
+
- **README opening sentence:** *"For founders building with AI — the thinking layer that
|
|
2103
|
+
nudges when you're drifting and pauses on command. No growth-hacking pressure. Override-
|
|
2104
|
+
friendly."* (22 words; killer for the README + landing-page hero.)
|
|
2105
|
+
- **Category frame:** *"the thinking layer for AI-native founders"* — drops "incubator" as
|
|
2106
|
+
the primary descriptor because it reads YC-shaped to strangers. "Incubator" stays valid
|
|
2107
|
+
as a secondary descriptor.
|
|
2108
|
+
- **Cohort-tailored variants:** 8 versions (one per persona archetype). Pattern: cohort-
|
|
2109
|
+
naming first phrase + feature-that-lands-hardest second.
|
|
2110
|
+
- **What BOSS doesn't compete on, named explicitly:** code generation (Lovable / v0 / Bolt).
|
|
2111
|
+
BOSS is *complementary* to those — a founder could use Lovable to scaffold the app + BOSS
|
|
2112
|
+
to scaffold the thinking about it.
|
|
2113
|
+
- **README updated** — the TL;DR replaced with the v0.24 positioning. Old: *"a just-in-time
|
|
2114
|
+
incubator for AI-native projects."* New: the candidate #8 framing above.
|
|
2115
|
+
- **What this positioning is NOT** (recorded in the dossier):
|
|
2116
|
+
- Not validated (synthetic-tested only until real founder reads + articulates back)
|
|
2117
|
+
- Not permanent (evolves with the product)
|
|
2118
|
+
- Not the only frame BOSS could use (picked AI-native-founder audience; the anti-VC
|
|
2119
|
+
indie-hacker frame is a legitimate secondary positioning that could split into its own
|
|
2120
|
+
surface later)
|
|
2121
|
+
- **Discipline applied to BOSS itself:** the Dunford exercise was on the record since v0.15.
|
|
2122
|
+
Eight releases of capability shipped before it landed. The positioning pass finally
|
|
2123
|
+
shipping is the discipline-on-the-discipline-tool move applied again — *not the most
|
|
2124
|
+
exciting release; probably the highest-leverage one in months*.
|
|
2125
|
+
|
|
2126
|
+
## 0.23.0 — 2026-05-23
|
|
2127
|
+
|
|
2128
|
+
- **Conscience pause primitive — the discipline-on-the-discipline-tool move (IDEA-011
|
|
2129
|
+
Phase 1).** Single-purpose release. Closes the canvas R&H #1 gap operationally: the founder
|
|
2130
|
+
can now silence the conscience for a bounded session without having to edit settings.json
|
|
2131
|
+
or rip out hooks. The four other items originally queued for v0.23 (Scale mode authoring,
|
|
2132
|
+
moment #3 detector, PostToolUse hook plumbing, IDEA-010 Phase 4 `/design-prompt`) are
|
|
2133
|
+
deferred to v0.24+ — see RESUME's restructured roadmap. **Discipline applied to BOSS
|
|
2134
|
+
itself: ship only the one most-load-bearing thing in a release that *could* have been
|
|
2135
|
+
bigger.**
|
|
2136
|
+
- **`boss conscience pause [--for 8h | --until-resume] [--reason "..."]`** — records a
|
|
2137
|
+
pause in `.boss/config.json`'s `conscience` block: `{ mode, since, expires, reason }`.
|
|
2138
|
+
Default duration: 8h (a build session). `--until-resume` for indefinite pauses.
|
|
2139
|
+
- **`boss conscience resume`** — explicit un-pause. Also happens automatically when the
|
|
2140
|
+
recorded expiry passes (the hook auto-clears the state).
|
|
2141
|
+
- **Hook reads pause state FIRST.** When `mode: paused` and `expires > now` → exit silent.
|
|
2142
|
+
When `mode: paused` and `expires <= now` → call `clearPauseState` and continue normally.
|
|
2143
|
+
The auto-resume IS the kindness; no special "your pause expired" signal (would be
|
|
2144
|
+
performative noise; IDEA-011 explicitly chose silent auto-resume).
|
|
2145
|
+
- **`boss status --conscience` surfaces pause state prominently** — at the TOP of the
|
|
2146
|
+
output, marked with `⏸ PAUSED`, showing since/expires/reason. When expired but not yet
|
|
2147
|
+
auto-cleared, shows `⏸ PAUSED (EXPIRED — will auto-resume on next prompt)`.
|
|
2148
|
+
- **Help text updated** to include the new commands.
|
|
2149
|
+
- **`readPauseState` + `clearPauseState`** added to `lib/loop-runtime.js` (template) —
|
|
2150
|
+
the canonical version. BOSS's CLI imports from there so there's one source of truth.
|
|
2151
|
+
- **The architectural principle the pause demonstrates** (worth naming): **fractal-consistent
|
|
2152
|
+
override discipline.** The same IDEA-008 grammar applied at two scales — per-loop overrides
|
|
2153
|
+
in devlog (micro), and whole-conscience pause in config (macro). Same shape: deviation
|
|
2154
|
+
conscious, recorded, never blocked, never forgotten (auto-resume is the kindness). Same
|
|
2155
|
+
recipe, different scope. *Not novel as a pattern* (Focus modes are OS-level table stakes);
|
|
2156
|
+
*worth claiming if anything* as the fractal-consistent application — Phase 3 externalization
|
|
2157
|
+
may turn this into a publishable practice if BOSS gets used on > 1 project.
|
|
2158
|
+
- **End-to-end tested in /tmp — 5 flows verified:**
|
|
2159
|
+
1. Baseline drifted state → hook fires (`caution low`)
|
|
2160
|
+
2. `boss conscience pause --for 8h --reason ...` → state recorded
|
|
2161
|
+
3. Hook fires after pause → silent ✓
|
|
2162
|
+
4. `boss status --conscience` → shows ⏸ PAUSED prominently with since/expires/reason
|
|
2163
|
+
5. `boss conscience resume` → hook fires again next prompt ✓
|
|
2164
|
+
Plus expired-auto-clear: hook reads expired pause → clears it → emits signal normally next
|
|
2165
|
+
prompt; config returns to `{ mode: 'active' }`.
|
|
2166
|
+
- **43/43 evals regression-clean.**
|
|
2167
|
+
- **Why a release for one feature?** Because it's the right discipline applied to BOSS itself.
|
|
2168
|
+
The other four queued v0.23 items are real and queued for v0.24+. The pause primitive's
|
|
2169
|
+
ratio of "leverage : code-size" is the highest of anything left in the roadmap (~50 lines
|
|
2170
|
+
of code + tests; closes the canvas R&H #1 gap directly). MVP discipline: minimum experiment
|
|
2171
|
+
that produces validated learning. Validated learning here: *did the pause primitive
|
|
2172
|
+
actually make BOSS feel nimble?* Founder using BOSS to do an all-night build will tell us.
|
|
2173
|
+
|
|
2174
|
+
## 0.22.0 — 2026-05-23
|
|
2175
|
+
|
|
2176
|
+
- **V1 mode authored — `boss unlock v1` works for real.** The third major mode arrives. Same
|
|
2177
|
+
playbook as MVP authoring (v0.14.0): manifest + template + claude-append + agents + skills +
|
|
2178
|
+
loops. V1 is the *real shippable release* mode — design layer turns on, the second-tier
|
|
2179
|
+
mentors arrive, data shape becomes a first-class decision, and a cross-FEAT sequencing
|
|
2180
|
+
surface appears.
|
|
2181
|
+
- **3 new builder agents:**
|
|
2182
|
+
- `ui-designer` — token + visual authority. Reads `DESIGN_TOKENS.md` as truth; refuses raw
|
|
2183
|
+
hex; three-layer architecture enforced (primitives → semantic → component). Cites Brad
|
|
2184
|
+
Frost (Atomic Design), Nathan Curtis (layer-cake), Jina Anne (W3C DTCG), Diana Mounter
|
|
2185
|
+
(Primer), Aarron Walter (emotional design). Holds the WCAG AA floor.
|
|
2186
|
+
- `ux-designer` — flow + state + interaction authority. **5-state requirement** non-
|
|
2187
|
+
optional (default / hover / active / disabled / empty + loading + error). Nielsen 10
|
|
2188
|
+
heuristics; Krug clarity; Norman affordances; Luke Wroblewski forms; Christopher
|
|
2189
|
+
Noessel for agentive patterns; Erika Hall for just-enough-research.
|
|
2190
|
+
- `db-architect` — schema + data-shape authority. Schema before code, even solo. Cites
|
|
2191
|
+
Codd, Date, Stonebraker, Kleppmann + AI-native data voices (Liu structured outputs,
|
|
2192
|
+
Husain data quality, Huyen production). Flags AI-data failure modes (unstructured-LLM-
|
|
2193
|
+
output in control flow; hallucinated-data pollution; eval-data-isn't-user-data).
|
|
2194
|
+
- **4 template mentor copies** (promoted from BOSS-local v0.15.0 to scaffolded-project
|
|
2195
|
+
templates — `{{PROJECT_NAME}}` / `{{MODE}}` placeholders): `mentor-business`,
|
|
2196
|
+
`mentor-fundraising`, `mentor-pitch`, `mentor-talent`. Same source practitioners as the
|
|
2197
|
+
BOSS-local versions; phrased to coach the founder of a generic scaffolded project. All
|
|
2198
|
+
advisory; never binding legal/financial/medical. Default position for mentor-fundraising
|
|
2199
|
+
+ mentor-talent: *don't raise / don't hire yet, possibly never*.
|
|
2200
|
+
- **3 new skills:**
|
|
2201
|
+
- `/board` — cross-FEAT sequencing surface. Live read computed from FEAT frontmatter +
|
|
2202
|
+
smoke/evals state + override entries. Flags: `--next`, `--blocked`, `--by-cohort`,
|
|
2203
|
+
`--deferred`, `--evals`. Owned by `program-manager`.
|
|
2204
|
+
- `/design-review` — before-code design review. Runs `ui-designer` (token + visual) +
|
|
2205
|
+
`ux-designer` (flows + 5 states) sequentially against the proposed UI. Reads tokens
|
|
2206
|
+
file, style guide, canvas Promises cell. Outputs concrete diffs. Cohort-aware delivery.
|
|
2207
|
+
- `/ux-check` — after-code UX review. Walks the *shipped* experience (not the spec),
|
|
2208
|
+
checks states are real, runs accessibility heuristics, applies AI-specific UX where
|
|
2209
|
+
relevant. Pairs with `/design-review`.
|
|
2210
|
+
- **1 new loop:** `design-drift-loop` (V1-stage, runner_type: hook). The V1-stage counterpart
|
|
2211
|
+
to MVP's `design-tokens-loop` — gates *whether tokens are still authoritative*, not just
|
|
2212
|
+
*whether they exist*. **Subtle pattern worth naming:** this loop's exit predicate is the
|
|
2213
|
+
*bad signal* (≥1 raw hex code in code, excluding the tokens file) — the loop is
|
|
2214
|
+
"drift-emitting" when the bad signal is present. The IDEA-008 primitive supports this
|
|
2215
|
+
without modification. Emits the `coherence` moment (introduced v0.21).
|
|
2216
|
+
- **L2-v1 manifest** declares 7 agents + 3 skills + 1 loop; `boss sync` carries them via the
|
|
2217
|
+
managed-file kinds (agent, skill, loop). `.boss` stamp tracks them. claude-append.md
|
|
2218
|
+
reads as a clean V1-working-rules catalog.
|
|
2219
|
+
- **End-to-end tested in /tmp:** `boss new` → `boss unlock mvp` → `boss unlock v1` lands all
|
|
2220
|
+
three modes' files cleanly. Final stamp: 14 agents + 15 skills + 6 loops + 1 hook + 3
|
|
2221
|
+
installed layers. `boss status --conscience` shows all 6 loops in correct states on a fresh
|
|
2222
|
+
project. 43/43 evals still pass (regression-clean).
|
|
2223
|
+
- **Deferred to v0.23:** moment #3 (capture — reusable value at breakpoint, needs LLM-as-judge
|
|
2224
|
+
or heuristic detector — not predicate-based); PostToolUse hook for hardcoded-style detection
|
|
2225
|
+
(new hook-type plumbing — its own concern); Scale mode authoring (mentor-humane template
|
|
2226
|
+
promotion, PM org, code-health, product council); `/design-prompt` skill or fold into
|
|
2227
|
+
`/design-review`.
|
|
2228
|
+
|
|
2229
|
+
## 0.21.0 — 2026-05-23
|
|
2230
|
+
|
|
2231
|
+
- **MVP discipline upgrades + IDEA-010 Phase 2 (design-tokens-loop) — all in one release.**
|
|
2232
|
+
Three new skills, three new loops, one upgraded skill, moment #4 (restraint) lands skill-side.
|
|
2233
|
+
Moment #3 (capture — reusable value at breakpoint) deferred to v0.22 — it needs a different
|
|
2234
|
+
detector design (predicate evaluation doesn't fit "noticing this artifact is more general
|
|
2235
|
+
than its loop").
|
|
2236
|
+
- **`/spec` upgraded** (the smallest cut, highest leverage per the v1 playbook):
|
|
2237
|
+
- Adds **validated-learning field** (Ries, *The Lean Startup*): "If this FEAT works
|
|
2238
|
+
perfectly, what do we learn?" If the answer is "users like it" or "the feature works,"
|
|
2239
|
+
don't build it. The MVP is the minimum experiment that produces validated learning, not
|
|
2240
|
+
the minimum product to polish.
|
|
2241
|
+
- Adds **evals field** (Husain): when a FEAT puts an LLM in control flow, the eval set
|
|
2242
|
+
lives at `docs/evals/FEAT-NNN.yml`. Schema'd output (Liu) strongly recommended.
|
|
2243
|
+
- **Moment #4 restraint check** (IDEA-008's collapsed-moments architecture): `/spec`
|
|
2244
|
+
reads canvas-loop state before creating a FEAT spec; if canvas-loop isn't closed for
|
|
2245
|
+
the active idea, the skill surfaces a Fitzpatrick-plain restraint nudge (cohort-aware
|
|
2246
|
+
via v0.20's framing). Override grammar lives in devlog. Never blocks; always records.
|
|
2247
|
+
- **`/evals` skill (new)** — Husain discipline as a first-class MVP skill paired with
|
|
2248
|
+
`/smoke`. Smoke answers "is it alive"; evals answers "is it correct." Eval set first
|
|
2249
|
+
(20 cases beats 0). Failure modes categorized by mode (Husain: failure modes are more
|
|
2250
|
+
valuable than success modes). Structured outputs recommended (Liu: Pydantic-first).
|
|
2251
|
+
Cites Husain + Liu + LLM-as-judge caveats.
|
|
2252
|
+
- **`/pretotype` skill (new)** — Savoia's discipline as a first-class MVP skill. The
|
|
2253
|
+
demand-test between `/canvas` and `/spec` — "make sure you're building the right It
|
|
2254
|
+
before building It right." Six patterns named (fake door / WoZ / Mechanical Turk /
|
|
2255
|
+
Pinocchio / YouTube test / impresario). The TRI metric (tangible / real-time /
|
|
2256
|
+
imminent). Threshold-before-running (Ries pivot discipline). YODA (your-own-data >
|
|
2257
|
+
anything).
|
|
2258
|
+
- **`/design-tokens-init` skill (new)** — IDEA-010 Phase 2. Scaffolds the three-layer
|
|
2259
|
+
token system at the first-UI-commit inflection. **Cohort-aware delivery** (v0.20's
|
|
2260
|
+
framing): vibe-coder-newbie gets SHOWING; eng-builder gets OFFERING; vibe-virtuoso gets
|
|
2261
|
+
OVERRIDE-FRIENDLY; first-product gets DEFINE-TERMS; indie-hacker gets RIGHT-SIZED;
|
|
2262
|
+
returning-founder gets SKIP-THE-101; non-tech-founder + domain-expert get PLAIN-
|
|
2263
|
+
LANGUAGE-COACH; unspecified gets neutral. The three-layer architecture (primitives →
|
|
2264
|
+
semantic → component) is Curtis's layer-cake; the AI-tolerance argument is that
|
|
2265
|
+
two-layer systems are fragile under AI generation (the field's consensus). Reads canvas
|
|
2266
|
+
Promises cell to brand-anchor the primitives (prevents the brand-default problem from
|
|
2267
|
+
IDEA-010).
|
|
2268
|
+
- **Three new MVP-stage loops** on the v0.18 generic loop primitive:
|
|
2269
|
+
- `spec-loop` (runner_type: skill) — encodes moment #4 restraint. Entry: canvas-loop
|
|
2270
|
+
closed for some active idea. Exit: `FEAT-NNN-<slug>.md` exists. `/spec` is the
|
|
2271
|
+
detector + runner.
|
|
2272
|
+
- `pretotype-loop` (runner_type: skill) — structural by default (no drift_moment to
|
|
2273
|
+
avoid over-firing). Records that demand-testing happened before significant build.
|
|
2274
|
+
`/pretotype` is the runner.
|
|
2275
|
+
- `design-tokens-loop` (runner_type: hook) — JIT, *only opens once UI starts
|
|
2276
|
+
accumulating* (≥3 style declarations across src/). Drift moment: **`coherence`**
|
|
2277
|
+
(new — system-vs-code drift; a flavor of caution specific to design-system mismatch).
|
|
2278
|
+
Stack-agnostic regex catches common React/Vue/Svelte/Solid patterns; founders can
|
|
2279
|
+
edit the spec for their stack.
|
|
2280
|
+
- **L1-mvp manifest** declares the new 3 skills + 3 loops; `boss sync` carries them; `.boss`
|
|
2281
|
+
stamp tracks them. The `claude-append.md` reads as a clean catalog of what MVP offers.
|
|
2282
|
+
- **End-to-end tested in /tmp:** scaffold → unlock mvp → all 7 skills + 5 loops land + stamp
|
|
2283
|
+
merges correctly + status --conscience shows the 5 loops in their right states (capture
|
|
2284
|
+
open-structural, canvas/spec/pretotype/design-tokens all unopenable on a fresh project, no
|
|
2285
|
+
spurious hook fires). 43/43 evals still pass (regression-clean).
|
|
2286
|
+
- **Deferred to v0.22:** moment #3 (capture — reusable value at breakpoint), V1 mode authoring
|
|
2287
|
+
including the design-drift-loop (IDEA-010 Phase 3) + ui-designer + ux-designer agents +
|
|
2288
|
+
/design-review + /ux-check + PostToolUse hook for hardcoded-style detection.
|
|
2289
|
+
|
|
2290
|
+
## 0.20.0 — 2026-05-23
|
|
2291
|
+
|
|
2292
|
+
- **The three design changes from v0.19's persona-reactions pass — landed.** Closes the loop on
|
|
2293
|
+
the reactions: the personas flagged it, v0.20 shipped it. Moments #3 + #4 (the other v0.20
|
|
2294
|
+
candidates per the published roadmap) deferred to v0.21 — they need their own design pass
|
|
2295
|
+
(moment #3's "reusable value at breakpoint" detection is harder than predicate evaluation;
|
|
2296
|
+
moment #4's "premature ceremony" needs `/spec`-skill-aware detection plus a spec-loop). The
|
|
2297
|
+
three things shipping in v0.20 are the *persona-driven* design changes; the moment work is a
|
|
2298
|
+
separate concern.
|
|
2299
|
+
- **`boss status --conscience`** — the inspect affordance (asked-for by `eng-builder`,
|
|
2300
|
+
`indie-hacker`, `vibe-virtuoso` personas in the v0.19 reactions doc). Loads
|
|
2301
|
+
`docs/loops/*.md`, classifies each loop (open / closed / unopenable), shows what would
|
|
2302
|
+
close the open ones (concrete predicate evidence: count/threshold, file matches, what's
|
|
2303
|
+
missing), reads recent override entries from `docs/devlog.md` per IDEA-008's grammar,
|
|
2304
|
+
shows the project's declared cohort. New `src/conscience.js` module formats the output;
|
|
2305
|
+
the loop runtime is imported from the canonical Quickstart-template path so there's one
|
|
2306
|
+
source of truth.
|
|
2307
|
+
- **Cohort-aware conscience** — `.boss/config.json` carries an optional `cohort` field
|
|
2308
|
+
(one of: vibe-coder-newbie | eng-builder | non-tech-founder | first-product |
|
|
2309
|
+
vibe-virtuoso | indie-hacker | returning-founder | domain-expert | null). The
|
|
2310
|
+
`loop-runtime.js`'s `composeContext` now reads the cohort and appends a **cohort framing
|
|
2311
|
+
directive** to `additionalContext` — same signal, different voice. Each cohort gets a
|
|
2312
|
+
distinct framing sentence: first-product needs *teaching*; returning-founder wants a
|
|
2313
|
+
*harder cohort-aware question*; vibe-virtuoso gets *friction over praise*; indie-hacker
|
|
2314
|
+
gets *plain Fitzpatrick language, not jargon*. The `/boss` spin-up skill now asks
|
|
2315
|
+
one optional question to set the cohort; user can always edit `.boss/config.json` later.
|
|
2316
|
+
- **Voice lineage decision: Fitzpatrick consistently.** The indie-hacker persona caught
|
|
2317
|
+
that the prior conscience voice mixed Fitzpatrick ("who would you ask first") with
|
|
2318
|
+
Maurya ("riskiest assumption" / "riskiest bet") in one breath. v0.20 picks Fitzpatrick-
|
|
2319
|
+
plain ("what they'd want to learn"; "who they'd ask first") consistently. Updates the
|
|
2320
|
+
`/triage` skill's exemplar text + the `composeContext` framing in `loop-runtime.js`. The
|
|
2321
|
+
canvas itself still uses "riskiest assumption" (that's the canvas's frame, established);
|
|
2322
|
+
the *conscience nudge* now speaks Fitzpatrick.
|
|
2323
|
+
- **Test coverage:** 43/43 evals still pass (regression-clean). End-to-end test in /tmp
|
|
2324
|
+
verified: fresh project → capture-loop open (no signal, structural); 3 captures + no canvas →
|
|
2325
|
+
canvas-loop drifts, hook fires with cohort framing in additionalContext; override in devlog →
|
|
2326
|
+
appears in `boss status --conscience` output.
|
|
2327
|
+
- **What's NOT in v0.20** (now queued for v0.21): conscience moments #3 (capture — reusable
|
|
2328
|
+
value) and #4 (restraint — premature ceremony). Moment #3 needs a different detector design
|
|
2329
|
+
(predicate evaluation doesn't fit "noticing this artifact is generalizable"). Moment #4
|
|
2330
|
+
needs a spec-loop authored AND skill-aware detection in `/spec`. Both warrant their own
|
|
2331
|
+
release.
|
|
2332
|
+
|
|
2333
|
+
## 0.19.0 — 2026-05-23
|
|
2334
|
+
|
|
2335
|
+
- **Proto-personas layer + first reactions pass — the founder-experience eval channel.** 8
|
|
2336
|
+
synthetic-founder agents now seated in BOSS's `.claude/agents/` with `persona-` prefix
|
|
2337
|
+
(parallel to `mentor-`, parallel to the builder team). They REACT to BOSS features (not
|
|
2338
|
+
advise, not mentor) so BOSS gets cheap pre-filter signal on how it lands across cohorts
|
|
2339
|
+
*before* spending the expensive real-founder Mom Test call (which remains explicitly
|
|
2340
|
+
overridden per advisory-pass #1; the override's re-open condition includes "persona
|
|
2341
|
+
reactions surface a coherent product story" — this layer is how that signal arrives).
|
|
2342
|
+
- The 8 personas: `vibe-coder-newbie` (no eng/startup background, picked up Cursor 3-6
|
|
2343
|
+
months ago), `eng-builder` (10+ years eng, first-time founder, skeptical of magic),
|
|
2344
|
+
`non-tech-founder` (deep domain expertise, can't code, AI is the bridge),
|
|
2345
|
+
`first-product` (absolute beginner — to building, to vibe coding, to everything),
|
|
2346
|
+
`vibe-virtuoso` (50+ shipped projects, zero sustained products, expert at idea-to-demo,
|
|
2347
|
+
bad at company-building), `indie-hacker` (right-sized lens — Walling/Fried/Jarvis;
|
|
2348
|
+
anti-VC; suspicious of venture-shaped language), `returning-founder` (has shipped before;
|
|
2349
|
+
intolerant of 101 content; wants depth), `domain-expert` (medical/legal/financial;
|
|
2350
|
+
stakes are real; humane lens applies hard).
|
|
2351
|
+
- **`persona-reactions-loop`** authored in `docs/loops/` — captures the discipline (runner_
|
|
2352
|
+
type: manual; uses the v0.18 loop primitive). Entry: persona agents exist. Exit: a
|
|
2353
|
+
`docs/dossier/persona-reactions/<feature>.md` doc with structured reactions + synthesis +
|
|
2354
|
+
design changes + real-founder questions the reactions sharpen.
|
|
2355
|
+
- **First reactions pass complete** at
|
|
2356
|
+
[`docs/dossier/persona-reactions/conscience-moment-1.md`](docs/dossier/persona-reactions/conscience-moment-1.md).
|
|
2357
|
+
All 8 personas reacting to the conscience moment-1 firing scenario.
|
|
2358
|
+
- **Three concrete design changes the reactions argue for** (ordered, with priorities):
|
|
2359
|
+
1. **Cohort-aware conscience** — the model composing the conscience voice should know what
|
|
2360
|
+
cohort the founder is in (`.boss/config.json` declares it, set by `/boss` during
|
|
2361
|
+
scaffold). First-product needs *teaching*; returning-founder needs *a harder
|
|
2362
|
+
question*; vibe-virtuoso needs *sharper architecture*. Same signal; different voice.
|
|
2363
|
+
2. **Inspect affordance** — `boss status --conscience` or `boss conscience --explain` so
|
|
2364
|
+
humans can see what loops are open / what would close them / what overrides exist.
|
|
2365
|
+
Engineers + indie-hackers + vibe-virtuosos all asked for this. Plausibly v0.20
|
|
2366
|
+
alongside moments #3/#4.
|
|
2367
|
+
3. **Pick the lineage in the conscience voice** — current voice mixes Fitzpatrick
|
|
2368
|
+
(talk-to-someone) and Maurya (riskiest-assumption) in one breath. Indie-hacker caught
|
|
2369
|
+
this; the eval set didn't. Lean one direction consistently.
|
|
2370
|
+
- **Three real-founder interview questions** the reactions sharpen for the eventual call:
|
|
2371
|
+
(1) read-aloud comprehension test; (2) curious-vs-defensive-vs-confused for first-timer
|
|
2372
|
+
cohort; (3) cohort-aware variant test for returning-founder cohort.
|
|
2373
|
+
- **Two surprises from the reactions** that hadn't surfaced via any other discipline:
|
|
2374
|
+
returning-founder wants a HARDER question, not a softer one (suggesting cohort-aware
|
|
2375
|
+
direction); indie-hacker noticed the voice-fights-itself issue (Fitzpatrick vs Maurya
|
|
2376
|
+
lineage mix) that 84 eval examples missed. **Argument for routine persona-eval passes on
|
|
2377
|
+
every user-facing text.**
|
|
2378
|
+
- **Roadmap status:** v0.17 (builder team) + v0.18 (loop primitive) + v0.19 (personas) all
|
|
2379
|
+
shipped this push. Next up: v0.20 (moments #3+#4 via the generic detector, leveraging the
|
|
2380
|
+
v0.19 reactions for cohort-aware language) + v0.21 (MVP discipline upgrades).
|
|
2381
|
+
|
|
2382
|
+
## 0.18.0 — 2026-05-22
|
|
2383
|
+
|
|
2384
|
+
- **IDEA-008 → FEAT-001: generic loop runtime in Node; bash hook retired.** The biggest
|
|
2385
|
+
architectural release since v0.8.0 (learning loop). The conscience hook used to be hand-coded
|
|
2386
|
+
bash with a single hard-wired detector for moment-1. It's now a generic Node runtime that
|
|
2387
|
+
reads `docs/loops/*.md` from the project, evaluates entry/exit predicates, and emits
|
|
2388
|
+
structured signals — *any* loop drifting fires *its* moment, no per-moment code. Moments
|
|
2389
|
+
#3 and #4 will be authored as loops (not detectors) in v0.20.
|
|
2390
|
+
- **`conscience.js`** (replaces `conscience.sh`) — zero-dep Node, ~50 lines. Loads loops,
|
|
2391
|
+
classifies state, composes signals, prints JSON. Fails silent — never blocks a prompt.
|
|
2392
|
+
- **`lib/loop-runtime.js`** — the engine. Loads loops from `docs/loops/*.md`, evaluates a
|
|
2393
|
+
closed predicate vocabulary (`exists`, `count_at_least`, `any_file_matches`) against the
|
|
2394
|
+
live filesystem, classifies each loop as `unopenable | open | closed`, returns structured
|
|
2395
|
+
signals only for hook-runner loops with a `drift_moment`. Loops without `drift_moment` are
|
|
2396
|
+
*structural* (express dependencies but don't drift — caught the over-fires-on-fresh-project
|
|
2397
|
+
bug live during the build; capture-loop is the canonical structural loop).
|
|
2398
|
+
- **`lib/yaml.js`** — zero-dep YAML parser lifted from the eval runner so the same code parses
|
|
2399
|
+
loop specs at hook time and eval examples at test time.
|
|
2400
|
+
- **Two named loops authored** in the Quickstart template (`docs/loops/`):
|
|
2401
|
+
- `capture-loop` — structural; expresses "at least one captured idea exists." Downstream
|
|
2402
|
+
loops (canvas-loop) check this via their own entry predicates.
|
|
2403
|
+
- `canvas-loop` — encodes moment-1's full logic *declaratively* via predicates. Entry: ≥3
|
|
2404
|
+
dated capture-log entries across active (non-dropped) idea files. Exit: at least one
|
|
2405
|
+
canvas tied to an active idea has a real (≥3-char alphanumeric) riskiest assumption.
|
|
2406
|
+
Drift = open + not closed = moment-1 caution. The hand-coded bash logic is now expressed
|
|
2407
|
+
in YAML.
|
|
2408
|
+
- **`manifest.json` gains a `loops` array.** `src/scaffold.js` + `src/sync.js` handle loops
|
|
2409
|
+
as a new managed-file kind (`kind: loop`, `rel: docs/loops/<name>.md`). Hook-library files
|
|
2410
|
+
(`lib/*.js`) auto-discovered from the template and synced alongside the hook script. The
|
|
2411
|
+
`.boss` stamp tracks `loops` (alongside agents/skills/hooks).
|
|
2412
|
+
- **`runner_type` field on loop specs** — resolves the moment-2 shape question from v0.16's
|
|
2413
|
+
meta-learnings. Today only `hook` is honored by the conscience runtime; `skill`, `manual`,
|
|
2414
|
+
`external` will be honored by future runners (skill-prompt eval, manual review, external
|
|
2415
|
+
detector — e.g. CI).
|
|
2416
|
+
- **Settings migration (bash → node).** Existing projects pinned at <= 0.13.0 have
|
|
2417
|
+
`bash …conscience.sh` in their settings.json. The merge logic now applies hook-command
|
|
2418
|
+
*migrations* before the additive merge: it drops stale bash entries before adding the new
|
|
2419
|
+
node entry. Tested with a synthetic pre-0.18 project: bash entry dropped, node entry added,
|
|
2420
|
+
user's permissions preserved, idempotent on re-sync.
|
|
2421
|
+
- **Eval set unchanged; 43/43 still pass against the generic runtime** (regression-coverage
|
|
2422
|
+
in place). The runner now invokes `node conscience.js`, materializes `docs/loops/` into
|
|
2423
|
+
the test project before each example (so the runtime has loops to load), and the existing
|
|
2424
|
+
examples test the *generic detector* end-to-end. One bug found during the build (POSIX
|
|
2425
|
+
regex char classes — `[[:space:]]` doesn't work in JS) fixed by switching loop specs to
|
|
2426
|
+
JS-native regex (`\s+`, `[a-zA-Z0-9]{3,}`). Eval-first discipline catching itself.
|
|
2427
|
+
- **BOSS dogfoods:** `docs/loops/capture-loop.md` + `docs/loops/canvas-loop.md` now live in
|
|
2428
|
+
BOSS itself (alongside the existing `docs/loops/eval.md`). BOSS-the-project runs the same
|
|
2429
|
+
runtime against its own state.
|
|
2430
|
+
- **Verdict on the primitive (from v0.16's meta-learnings): confirmed under contact with
|
|
2431
|
+
real Node implementation.** The four-field shape (entry / purpose / exit / drift) held up.
|
|
2432
|
+
Predicate vocabulary covered everything moment-1 needed declaratively. The structural-loop
|
|
2433
|
+
pattern (no `drift_moment`) handles dependencies-only cases cleanly. Net code change:
|
|
2434
|
+
`conscience.sh` (40 lines bash) replaced by `conscience.js` + `lib/` (~250 lines Node) —
|
|
2435
|
+
bigger surface, but moments #3+ will be loop specs (~30 lines YAML each), not new detectors.
|
|
2436
|
+
|
|
2437
|
+
## 0.17.0 — 2026-05-22
|
|
2438
|
+
|
|
2439
|
+
- **Builder team seated alongside the mentor board.** Three new builder agents in BOSS's own
|
|
2440
|
+
`.claude/agents/` — they make BOSS *feel right*, parallel to the mentor board which makes
|
|
2441
|
+
BOSS *be right*. Builders, not mentors: they propose concrete diffs, not advice. Cover the
|
|
2442
|
+
interaction-design layer that wasn't covered by either the mentor board or the existing builder
|
|
2443
|
+
agents (pm, coder-generalist, tester, program-manager).
|
|
2444
|
+
- **`designer`** — owns the UX of the entire BOSS interaction experience. Not "the visual
|
|
2445
|
+
designer" (BOSS is a CLI + Claude Code experience, not a webapp). Owns: when BOSS speaks vs
|
|
2446
|
+
stays quiet, what a skill *feels* like when run, the rhythm of mode unlocks, the surprise
|
|
2447
|
+
vs predictability of conscience moments, what the founder is being asked to *do* vs read at
|
|
2448
|
+
every step. Sources: Norman, Krug, Nielsen, Spool, Wroblewski, Walter + AI-specific UX
|
|
2449
|
+
heuristics (options-not-truth; visible confidence; deliberate failure states).
|
|
2450
|
+
- **`voice-keeper`** — guards what BOSS *sounds like*. Reviews skill text, agent system
|
|
2451
|
+
prompts, hook signal language, README, CHANGELOG, error messages. Catches performed warmth,
|
|
2452
|
+
scolding tone, voice-mode bleed, framework-jargon leaking into user-facing text, assumed
|
|
2453
|
+
knowledge, hedging. Proposes concrete edits side-by-side. Inward-facing language guardian.
|
|
2454
|
+
Sources: `boss-voice` memory (canonical spec), Strunk & White, Raskin + Neumeier (pitch =
|
|
2455
|
+
product voice), Godin (write to one person).
|
|
2456
|
+
- **`prompt-coach`** — helps the founder write better prompts (to BOSS, to Claude, to AI in
|
|
2457
|
+
general) and teaches the craft over time. Outward-facing counterpart to voice-keeper.
|
|
2458
|
+
Builds a per-founder pattern library in `docs/dossier/founder-prompt-patterns.md`. Catches
|
|
2459
|
+
vague / multi-prompt / missing-constraint / missing-output-format / leading-question /
|
|
2460
|
+
missing-context / wrong-role-assignment / stop-word-missing failure modes. Sources:
|
|
2461
|
+
Karpathy (think in distributions), Mollick (AI-as-different-roles), Willison (prompt-as-
|
|
2462
|
+
code), Liu (Pydantic-first), Husain (look at the output), Fitzpatrick (Mom Test discipline
|
|
2463
|
+
applied to interview prompts).
|
|
2464
|
+
- **Advisory-pass #1 (real-founder Mom Test calls) explicitly overridden — first real use of
|
|
2465
|
+
IDEA-008's override grammar.** Ajesh's call: at zero users + product still defining its shape,
|
|
2466
|
+
expensive real-founder calls are premature; cheap synthetic signal from proto-personas (v0.19
|
|
2467
|
+
work) is the right move now. Override recorded in `docs/dossier/boss-advisory-pass-001.md`
|
|
2468
|
+
with explicit re-open conditions (persona reactions surface a coherent product story, OR a
|
|
2469
|
+
non-Ajesh user starts using BOSS in earnest, OR the eval set catches something only real-
|
|
2470
|
+
founder feedback could surface). The recommendation isn't deleted — it's deferred under the
|
|
2471
|
+
IDEA-008 contract: deviation made conscious, recorded, re-openable.
|
|
2472
|
+
- **Roadmap published** (in RESUME): v0.17 builder team → v0.18 generic loop primitive (IDEA-008
|
|
2473
|
+
to FEAT) → v0.19 proto-personas as named loops → v0.20+ moments #3/#4 + MVP discipline
|
|
2474
|
+
upgrades + V1/Scale mode authoring + IDEA-003 finish + externalization + backlog. Ten releases
|
|
2475
|
+
sequenced for build-on-build; the discipline rails (evals, structured output, loops, personas)
|
|
2476
|
+
make each one buildable in a focused session.
|
|
2477
|
+
|
|
2478
|
+
## 0.16.0 — 2026-05-22
|
|
2479
|
+
|
|
2480
|
+
- **The eval-loop closed — conscience now has evals + structured output (proves IDEA-008's
|
|
2481
|
+
primitive on its first real run).** Two ladders climbed at once: produced the conscience eval
|
|
2482
|
+
set (the artifact the advisory pass / playbook said BOSS most needed) AND validated the loop
|
|
2483
|
+
primitive from IDEA-008 by running ONE loop end-to-end. Both succeeded.
|
|
2484
|
+
- **84 labeled eval examples** (43 moment-1 caution + 41 moment-2 Done!) in
|
|
2485
|
+
`docs/architecture/conscience-evals/{moment-1-caution,moment-2-done}.yml`. Should-fire +
|
|
2486
|
+
categorized should-NOT-fire entries (failure_mode taxonomy per Husain's discipline:
|
|
2487
|
+
`over-fires-on-fresh-project`, `fires-mid-other-work`, `repeats-itself`, `shame-toned`,
|
|
2488
|
+
`false-positive-canvas-exists`, `false-positive-not-drift`, `acknowledged-already`,
|
|
2489
|
+
`fires-too-early`, `performed-warmth`, `removes-agency`, `riskiest-assumption-stale`,
|
|
2490
|
+
`triggered-by-trivial-change`). Each example has structured `project_state` (the synthetic
|
|
2491
|
+
docs/ideas/ tree the runner builds in a temp dir) + `expected_detection`.
|
|
2492
|
+
- **Zero-dep Node runner** at `docs/architecture/conscience-evals/runner.js` — includes a
|
|
2493
|
+
minimal YAML parser for the subset our eval files use (so the data stays human-readable
|
|
2494
|
+
without breaking the zero-dep rule). Constructs synthetic state, invokes the actual hook,
|
|
2495
|
+
parses output, asserts. Reports per-example + a categorized summary table.
|
|
2496
|
+
- **Hook refactored to structured output** (Liu's discipline): now ships
|
|
2497
|
+
`{moment, confidence, evidence: {capture_count, canvases_with_filled_assumption,
|
|
2498
|
+
active_idea_count}, suppress_if}` in addition to the `additionalContext` string. Model still
|
|
2499
|
+
composes the voice; hook ships a schema.
|
|
2500
|
+
- **3 real bugs caught and fixed by the eval set itself** (Husain's "look at your data"
|
|
2501
|
+
discipline in action):
|
|
2502
|
+
1. Single-char placeholders like `?` slipped through the riskiest-assumption regex.
|
|
2503
|
+
Tightened to require ≥3 alphanumeric chars (rejects `?`, `??`, `_TBD_`, empty, etc.).
|
|
2504
|
+
2. Hook counted captures across *all* ideas — including `status: dropped`. Drift signal
|
|
2505
|
+
from already-walked-away ideas is meaningless. Hook now filters active vs. dropped.
|
|
2506
|
+
3. Filled canvases on dropped ideas were stopping the "validated" check. Same fix:
|
|
2507
|
+
active-status filtering on canvas checks too.
|
|
2508
|
+
- **Runner results: 43/43 pass on every runnable case. 41 skipped as categorized future-work**
|
|
2509
|
+
(moment-2 lives in `/canvas` skill prompt not a hook; suppress_if cases need session-state /
|
|
2510
|
+
devlog awareness; signal-text violations need a separate runner — all explicitly tracked).
|
|
2511
|
+
- **IDEA-008 primitive verdict — ready to promote to FEAT.** The four-field shape (entry,
|
|
2512
|
+
purpose, exit, drift) held up. The predicate vocabulary (`exists`, `contains`,
|
|
2513
|
+
`count_at_least`, `recorded_at`) survived contact with reality. Multi-part exit artifacts
|
|
2514
|
+
work. Skip-with-reason is the right runner pattern. ~half of examples test future features
|
|
2515
|
+
(suppress_if + devlog + skill-based detection), which is the right ratio — the eval set
|
|
2516
|
+
is forward-looking, not just current-implementation snapshot. One real shape-question
|
|
2517
|
+
surfaced (moment-2 isn't hook-detected — argues for a `runner_type` field on loop specs).
|
|
2518
|
+
- First loop authored: `docs/loops/eval.md` — the eval-loop spec using the four-field primitive
|
|
2519
|
+
with predicates in YAML frontmatter. The template for every future loop.
|
|
2520
|
+
|
|
2521
|
+
## 0.15.0 — 2026-05-22
|
|
2522
|
+
|
|
2523
|
+
- **BOSS now has its full mentor board seated, and the board has had its first session on BOSS
|
|
2524
|
+
itself.** Step back before the next build axis: identify and instantiate all the experts we'd
|
|
2525
|
+
want consulting on how to build an AI-native MVP right, then have them actually review BOSS.
|
|
2526
|
+
- **8 mentor agents live in this repo's `.claude/agents/`** (BOSS-local; the project-as-its-own-
|
|
2527
|
+
founder): `mentor-venture` (cornerstone), `mentor-architect`, `mentor-gtm` (the three that
|
|
2528
|
+
already existed in stage templates, now BOSS-tuned), plus 5 new: `mentor-business`,
|
|
2529
|
+
`mentor-fundraising`, `mentor-pitch`, `mentor-talent`, `mentor-humane`. Each cites the
|
|
2530
|
+
practitioners from `docs/mentor-practitioners.md` it draws on — no agent impersonates a
|
|
2531
|
+
person; mentors cite named practices (per the encoding decision in `docs/MENTORS.md`).
|
|
2532
|
+
- **`mentor-humane` carries explicit override authority** when a humane concern is on the table
|
|
2533
|
+
— the conscience's conscience. Seated from day one despite the Scale-level slot in the
|
|
2534
|
+
roster, because BOSS itself is the special case (it has to *be* humane in its construction,
|
|
2535
|
+
not just preach it).
|
|
2536
|
+
- **`mentor-architect` retuned for the AI-native MVP era** (both BOSS-local and the MVP template
|
|
2537
|
+
in `stages/L1-mvp/template/.claude/agents/`). Leads with AI as the modality; classical-stack
|
|
2538
|
+
choices are supporting cast. Names the load-bearing AI questions: surface, eval strategy,
|
|
2539
|
+
prompt vs. fine-tune vs. RAG, structured outputs, human-in-loop boundaries, cost/latency
|
|
2540
|
+
budgets, fallback. Cites Karpathy, Willison, Husain (evals — load-bearing), Liu (structured
|
|
2541
|
+
outputs — load-bearing), Mollick, Huyen.
|
|
2542
|
+
- **First advisory pass captured at `docs/dossier/boss-advisory-pass-001.md`.** Honest, not
|
|
2543
|
+
flattering. Each mentor's read on BOSS as-of v0.14.0, citing their practitioners. Five
|
|
2544
|
+
cross-cutting themes converged: (1) pause "more features" to earn founder contact; (2) evals
|
|
2545
|
+
are the next architecture investment (moments #3/#4 should be eval-set-first); (3)
|
|
2546
|
+
right-sized is the default shape (calm-company / OSS / patronage, not venture); (4) interior
|
|
2547
|
+
story rich, exterior story missing (strangers can't read the README); (5) the conscience is
|
|
2548
|
+
the moat *and* the most under-validated thing — plug this gap first.
|
|
2549
|
+
- **Next moves (re-ordered by the pass — supersedes the prior queue):** conscience-evals doc +
|
|
2550
|
+
structured hook output, *then* moments #3/#4; 5 real-founder Mom-Test interviews; Dunford
|
|
2551
|
+
positioning exercise + strangers-can-read-it README; humane upgrades to the conscience spec
|
|
2552
|
+
(cumulative-pressure check, BOSS.DK exemplar lines, cross-link humane practitioners into
|
|
2553
|
+
architect's lens); name the right-sized shape on the canvas.
|
|
2554
|
+
|
|
2555
|
+
## 0.14.0 — 2026-05-22
|
|
2556
|
+
|
|
2557
|
+
- **MVP mode (L1-mvp) is authored — `boss unlock mvp` works for real (closes IDEA-002).** Until now
|
|
2558
|
+
the L1 stage was a placeholder README; unlocking MVP errored out as "not authored yet." This release
|
|
2559
|
+
extracts this repo's own MVP practice UP into a real `stages/L1-mvp/{manifest.json,template/}`:
|
|
2560
|
+
- **Skills:** `/spec` (promote `IDEA-NNN` → `FEAT-NNN` with goal, acceptance criteria, smoke check,
|
|
2561
|
+
out-of-scope), `/smoke` (stack-configured build-health gate; saves the command to `.boss/smoke.json`
|
|
2562
|
+
on first use), `/log` (append-only `docs/devlog.md` — newest at the top, dated, FEAT-tagged),
|
|
2563
|
+
`/close` (session-end ritual that rewrites `docs/RESUME.md` and runs `/log`).
|
|
2564
|
+
- **Builder agents:** `tester` (owns `/smoke` + FEAT acceptance — surfaces, doesn't fix),
|
|
2565
|
+
`program-manager` (the *when* — sequences FEATs, names blocks, distinct from `pm`'s *what*).
|
|
2566
|
+
- **Mentor agents (advisory, never code):** `mentor-architect` (load-bearing decisions: data shape,
|
|
2567
|
+
boundaries, what to defer — the calibration against over-architecting an MVP) and `mentor-gtm`
|
|
2568
|
+
(first 100 users, channels, message — earned-when-needed, humane before viable).
|
|
2569
|
+
- **Additive CLAUDE.md** via `claude-append.md` — the mechanism shipped in v0.8.0 finally has its
|
|
2570
|
+
first real consumer. MVP's working rules (spec → smoke → log → close loop, conscience still runs)
|
|
2571
|
+
fold into the existing CLAUDE.md under the `boss:L1-mvp` marker; never overwrites Quickstart's rules.
|
|
2572
|
+
- **`boss sync` carries it for free** — the manifest's agents/skills are picked up by `managedFiles`
|
|
2573
|
+
in `src/sync.js`, so projects pinned at older versions get the MVP files via `/boss-sync` once they
|
|
2574
|
+
unlock the layer. (Hooks list is empty in this manifest — moments #3/#4 remain TBD.)
|
|
2575
|
+
- Tested in `/tmp`: scaffolded a Quickstart project, `boss unlock mvp` added 4 skills + 4 agents,
|
|
2576
|
+
appended the MVP working-rules block to CLAUDE.md, updated the stamp (`L0-quickstart → L1-mvp`,
|
|
2577
|
+
merged agents/skills), kept Quickstart files untouched, re-running unlock no-ops cleanly, `boss
|
|
2578
|
+
sync` recognizes everything as up-to-date.
|
|
2579
|
+
|
|
2580
|
+
## 0.13.0 — 2026-05-21
|
|
2581
|
+
|
|
2582
|
+
- **`boss sync` now carries hooks + settings (closes the conscience's reach gap).** Until now the
|
|
2583
|
+
conscience hook (v0.12.0) reached *new* projects only — `boss sync` carried skills/agents but not
|
|
2584
|
+
hooks or `settings.json`, so existing projects (e.g. `betabeta`) couldn't get it. Fixed:
|
|
2585
|
+
- **Hook scripts sync like any managed file** — `manifest.hooks` → `.claude/hooks/<name>.sh`, shown as
|
|
2586
|
+
`new`/`changed`/`ok` in the preview, written on `--apply`.
|
|
2587
|
+
- **Hook registrations merge into `.claude/settings.json` additively** — `boss sync` adds the
|
|
2588
|
+
`UserPromptSubmit` (etc.) entries BOSS ships, **matched by command so it's idempotent**, and
|
|
2589
|
+
**preserves the user's permissions and their own hooks.** It's the one user-editable file sync
|
|
2590
|
+
touches, and only the `hooks` block. (`computeSettingsMerge` in `src/sync.js`.)
|
|
2591
|
+
- The `.boss` stamp now tracks `hooks` (alongside agents/skills); `boss new`/`unlock` record them.
|
|
2592
|
+
- `/boss-sync` skill narrates the new behavior. Tested in `/tmp`: an "old" project (no hook,
|
|
2593
|
+
permissions-only settings, pinned 0.10.0) → sync adds the hook + merges settings (permissions
|
|
2594
|
+
preserved) + bumps the pin; idempotent on re-run.
|
|
2595
|
+
- **Existing projects can now `boss sync --apply` (or `/boss-sync`) to receive the conscience.**
|
|
2596
|
+
|
|
2597
|
+
## 0.12.0 — 2026-05-21
|
|
2598
|
+
|
|
2599
|
+
- **The conscience can now speak *unprompted* (spike → shipped).** Until now both moments only fired
|
|
2600
|
+
when you ran a skill (`/triage`, `/canvas`). A new **`UserPromptSubmit` hook** lets moment #1
|
|
2601
|
+
("what does this prove?") surface on its own:
|
|
2602
|
+
- `.claude/hooks/conscience.sh` — detection only: if ≥3 dated capture-log entries exist across
|
|
2603
|
+
`docs/ideas/` and no canvas has a *filled* riskiest assumption (capturing-lots / validating-nothing),
|
|
2604
|
+
it returns `additionalContext` — a **signal**, not canned copy. Claude keeps the voice and the
|
|
2605
|
+
judgment: it decides whether the moment fits, says it once in BOSS's register, or stays silent.
|
|
2606
|
+
- Registered in the template `.claude/settings.json`; invoked via `bash …` so it needs no execute bit.
|
|
2607
|
+
`manifest.json` hooks: `["conscience"]`.
|
|
2608
|
+
- Confirms the architecture for the remaining moments: **hook = detection, model = tact + voice.**
|
|
2609
|
+
- _Caveats:_ reaches **new** projects only (`boss sync` doesn't carry `settings.json`/hooks yet); the
|
|
2610
|
+
*feel* (wise vs. naggy) still needs live validation, like moment #1 in `betabeta`.
|
|
2611
|
+
|
|
2612
|
+
## 0.11.0 — 2026-05-21
|
|
2613
|
+
|
|
2614
|
+
- **The conscience — second moment: "Done!" (`/canvas` graduation).** The *affirming* register,
|
|
2615
|
+
counterpart to moment #1's caution. When the canvas holds up (most cells real + riskiest assumption
|
|
2616
|
+
has a validation plan), `/canvas` no longer just offers `boss unlock mvp` — it marks the threshold in
|
|
2617
|
+
two beats:
|
|
2618
|
+
- **Arrival** — names what became real (started with a hunch → now a specific person, real tension,
|
|
2619
|
+
sharp promise, a testable riskiest bet). Said plainly, let to land. Acknowledgment, not praise.
|
|
2620
|
+
- **Next doorway** — points at `boss unlock mvp` (build tools + next mentors) without rushing; the
|
|
2621
|
+
canvas keeps.
|
|
2622
|
+
- A threshold, not a finish line; never forced — the celebration is for when it's genuinely earned.
|
|
2623
|
+
- Completes the conscience's *two registers* (caution + completion) in Quickstart.
|
|
2624
|
+
|
|
2625
|
+
## 0.10.0 — 2026-05-21
|
|
2626
|
+
|
|
2627
|
+
- **The conscience — first moment lands (`/triage` validation check).** BOSS starts behaving like the
|
|
2628
|
+
*build's conscience*, not just a set of skills you invoke. The first of four conscience moments —
|
|
2629
|
+
**"what does this prove?"** — is now baked into `/triage`:
|
|
2630
|
+
- **Fires when** the active idea has ≥3 capture-log entries and no canvas with a filled riskiest
|
|
2631
|
+
assumption (the "capturing lots, validating nothing" drift) — and *only* then.
|
|
2632
|
+
- **Says one spare line** in BOSS's voice (the seasoned hand): names the drift, asks what would make
|
|
2633
|
+
it real, points at `/canvas`, hands the decision back. Never blocks a capture, never nags.
|
|
2634
|
+
- Turns the validation thinking that already lived in `/canvas` + `mentor-venture` (invoke-only) into
|
|
2635
|
+
a *moment that fires* in the flow where drift actually happens.
|
|
2636
|
+
- Template `CLAUDE.md` names the conscience in the Quickstart arc.
|
|
2637
|
+
- Existing projects pick it up via `boss sync` / `/boss-sync`.
|
|
2638
|
+
|
|
2639
|
+
## 0.9.0 — 2026-05-21
|
|
2640
|
+
|
|
2641
|
+
- **Mentor layer — structure + cornerstone (IDEA-003).** BOSS's second agent class lands.
|
|
2642
|
+
- **`docs/MENTORS.md`** — the design: two classes (builders make the app, `mentor-*` coach the
|
|
2643
|
+
founder), the roster + JIT-per-mode mapping (venture → architect/GTM → fundraising/pitch/talent/
|
|
2644
|
+
business → humane), the founder **dossier** (canvas → proposal → architecture brief → pitch →
|
|
2645
|
+
hiring plan → data room), and the hard line (no binding legal/financial advice; humane before viable).
|
|
2646
|
+
- **`mentor-venture` agent** seeded into Quickstart (`library/agents/` + template + manifest). The
|
|
2647
|
+
cornerstone mentor: pressure-tests whether an idea is worth it, names the riskiest assumption,
|
|
2648
|
+
points at the next real step, owns the canvas conversation. Advisory only — never writes code/specs.
|
|
2649
|
+
- Existing projects pull `mentor-venture` + the new skills via `boss sync` / `/boss-sync`.
|
|
2650
|
+
- _Still open:_ encoding real practitioners' best-practices UP into `practices/` + `memory-seed/`
|
|
2651
|
+
(awaiting the list); authoring the rest of the roster as their modes get built.
|
|
2652
|
+
|
|
2653
|
+
## 0.8.0 — 2026-05-21
|
|
2654
|
+
|
|
2655
|
+
- **The learning loop (IDEA-001).** PRINCIPLES #1 made operational, in both directions:
|
|
2656
|
+
- **`boss sync [--apply]`** — brings a project's BOSS-managed skills/agents (across all its
|
|
2657
|
+
installed modes) up to the current version. Previews as `new` / `changed (N lines)` / up-to-date,
|
|
2658
|
+
reconciles stale mode labels (an old `L0-sketch` pin → `L0-quickstart`), and on `--apply` writes
|
|
2659
|
+
the files + bumps the project's `.boss` pin. Syncs BOSS-managed skills/agents only; user-editable
|
|
2660
|
+
files (`CLAUDE.md`, `settings.json`) are left for hand-merge.
|
|
2661
|
+
- **`boss learn <path> --as <category>`** — promotes a proven pattern UP into
|
|
2662
|
+
`library/<category>/` (`agents|skills|hooks|practices|memory-seed`), bumps `VERSION` +
|
|
2663
|
+
`package.json` in sync, and prepends a CHANGELOG stub. Finds the BOSS **source** repo via the
|
|
2664
|
+
registry's self-hosted entry (or `$BOSS_SRC`), so it works when `boss` runs from a global install.
|
|
2665
|
+
- **`/boss-sync` + `/boss-learn` skills** — the judgment layer. `/boss-learn` is a two-destination
|
|
2666
|
+
**router** (UP = BOSS superset practice; DOWN = harden into the app's own core), never one-way.
|
|
2667
|
+
`/boss-sync` narrates the diff from the CHANGELOG and flags local edits before overwriting.
|
|
2668
|
+
- **`claude-append.md` support in `boss unlock`.** A mode template can carry a `claude-append.md`
|
|
2669
|
+
whose contents are *appended* to the project's CLAUDE.md under an idempotent marker — additive
|
|
2670
|
+
working rules, never an overwrite. (Needed by the MVP mode next.)
|
|
2671
|
+
|
|
2672
|
+
## 0.7.0 — 2026-05-21
|
|
2673
|
+
|
|
2674
|
+
- **Package / dogfood separation.** Clean boundary between the shippable package and BOSS's own
|
|
2675
|
+
incubation layer:
|
|
2676
|
+
- `package.json` `files` allowlist — only `bin/ src/ stages/ library/ VERSION PRINCIPLES.md
|
|
2677
|
+
registry/CHANGELOG.md` (+ README/LICENSE) ship. `docs/`, `.boss/`, root `CLAUDE.md`, and the
|
|
2678
|
+
local registry are never published. (`npm run pack:preview` to verify.)
|
|
2679
|
+
- **Registry moved out of the repo** to `~/.boss/registry.json` — machine-local state (your
|
|
2680
|
+
project list + absolute paths) no longer lives in (or leaks into) the package/repo.
|
|
2681
|
+
- `VERSION` and `package.json` version synced to 0.7.0.
|
|
2682
|
+
|
|
2683
|
+
## 0.6.0 — 2026-05-21
|
|
2684
|
+
|
|
2685
|
+
- **BOSS now dogfoods itself.** BOSS is its own first registered project (`.boss/` stamp,
|
|
2686
|
+
mode MVP, self-hosted) — retrofitted ahead of the MVP-mode template, which will be *extracted UP*
|
|
2687
|
+
from this repo's working practice (Principle 1).
|
|
2688
|
+
- Added BOSS's own dogfooded docs: root `CLAUDE.md`, `docs/IDS.md`, `docs/ideas/` (IDEA-001 learning
|
|
2689
|
+
loop, IDEA-002 MVP mode, IDEA-003 mentor layer), `docs/ideas/CANVAS.md` (BOSS's own Humane
|
|
2690
|
+
Product Canvas), and `docs/RESUME.md` (multi-session continuity).
|
|
2691
|
+
- Recorded the **mentor layer** vision: two agent classes — builders (make the app) and mentors
|
|
2692
|
+
(coach the founder); mentors accumulate a founder dossier toward funding/hiring.
|
|
2693
|
+
|
|
2694
|
+
## 0.5.0 — 2026-05-21
|
|
2695
|
+
|
|
2696
|
+
- **PRINCIPLES.md** — BOSS's six operating principles. #1: always scaffolding, but pause to sort
|
|
2697
|
+
patterns UP (BOSS superset practice) or DOWN (app core); `/boss-learn` becomes a two-way router.
|
|
2698
|
+
#3: nothing valuable gets locked into code (style → tokens, prototypes reuse the same system).
|
|
2699
|
+
- **Design-system practice** (`library/practices/design-system.md`) — generalized from dhun:
|
|
2700
|
+
tokens as single source of truth, central style utils, 5-state rule, prototype reuse, JIT
|
|
2701
|
+
enforcement (turns on at V1; seed tokens the moment real UI appears).
|
|
2702
|
+
- V1 mode stub fleshed out with the design layer + enforcement timing.
|
|
2703
|
+
|
|
2704
|
+
## 0.4.0 — 2026-05-21
|
|
2705
|
+
|
|
2706
|
+
- **Quickstart becomes a tiny incubator:** capture → keep adding → canvas → unlock MVP.
|
|
2707
|
+
- `/triage` rewritten as **living idea capture** — one evolving doc per idea (sharpening
|
|
2708
|
+
"current shape" + append-only capture log). Re-run anytime to keep adding.
|
|
2709
|
+
- New `/canvas` skill: a **humane business pressure-test** — Ajesh Shah's Humane Product Canvas
|
|
2710
|
+
(Human Foundation / Product Expression / Stewardship) as the spine, with Lean + Lenny-style
|
|
2711
|
+
prompts folded into each cell, plus the incubation heartbeat (riskiest assumption + one
|
|
2712
|
+
experiment this week). Acts as the Quickstart→MVP graduation gate.
|
|
2713
|
+
- North star recorded: BOSS is a just-in-time startup incubator — the right support shows up per mode.
|
|
2714
|
+
|
|
2715
|
+
## 0.3.0 — 2026-05-20
|
|
2716
|
+
|
|
2717
|
+
- Stages renamed to **modes** (the user's vocabulary): Quickstart → MVP → V1 → Scale
|
|
2718
|
+
(folder ids `L0-quickstart` / `L1-mvp` / `L2-v1` / `L3-scale`).
|
|
2719
|
+
- `boss unlock` accepts mode names, levels, or full ids (`mvp`, `L1`, `L1-mvp`).
|
|
2720
|
+
- `boss new` / `status` / `list` display the mode name; `.boss/manifest.json` records `mode`.
|
|
2721
|
+
- _Migration note:_ projects created on ≤0.2.0 carry the old `L0-sketch` stage label in their
|
|
2722
|
+
stamp; cosmetic only — `unlock`/`status` still work. A future `/boss-sync` will reconcile labels.
|
|
2723
|
+
|
|
2724
|
+
## 0.2.0 — 2026-05-20
|
|
2725
|
+
|
|
2726
|
+
- `/boss` spin-up skill (L0): reads a PRD/idea → shapes via pm lens → captures IDEA-001 →
|
|
2727
|
+
recommends stack + stage → optionally creates a **private** GitHub repo with a license.
|
|
2728
|
+
- Repo-creation defaults (in `.boss/config.json`): `github: ask`, `visibility: private`,
|
|
2729
|
+
`license: proprietary` (All Rights Reserved — keeps paid + OSS options open; relicense later).
|
|
2730
|
+
- Auto-sets repo-local GitHub noreply email to avoid the GH007 email-privacy push block.
|
|
2731
|
+
|
|
2732
|
+
## 0.1.0 — 2026-05-20
|
|
2733
|
+
|
|
2734
|
+
- Walking skeleton: `boss new` / `unlock` / `status` / `list` / `version`.
|
|
2735
|
+
- L0 · Quickstart stage authored (CLAUDE.md, pm + coder-generalist agents, /triage, ideas pool, IDS).
|
|
2736
|
+
- Registry + `.boss/` project stamp + version-pinning.
|
|
2737
|
+
- L1–L3 stages stubbed (manifests/templates not yet authored).
|