npm - company-skill - Versions diffs - 4.6.2 → 4.6.5 - Mend

company-skill 4.6.2 → 4.6.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16) hide show

package/README.md +13 -7
package/agents/company-critic.md +8 -1
package/agents/company-lead.md +8 -0
package/agents/company-reviewer.md +3 -1
package/hooks/context-guard.js +24 -1
package/hooks/precompact.js +24 -1
package/hooks/session-restore.js +24 -1
package/hooks/stop-guard.js +24 -1
package/package.json +1 -1
package/scripts/check-contracts.js +5 -4
package/scripts/cleanup.js +1 -1
package/scripts/dashboard.js +62 -68
package/scripts/reset-company-guard.js +24 -1
package/scripts/restart-debate.js +24 -1
package/scripts/statusline.js +22 -2
package/skill/SKILL.md +17 -4

package/README.md CHANGED Viewed

@@ -46,17 +46,17 @@ GOAL -> THINK -> EXECUTE (parallel waves) -> VERIFY -> Done?
 ## Dashboard
-The dashboard starts automatically when you run `/company` and prints its URL in the cycle banner. Each session gets its own port (7000-7999, derived from the session id). Open it in any browser.
+The dashboard starts automatically when you run `/company` and prints its URL in the cycle banner and the Claude Code status line. Each session gets its own port (7000-7999, derived from the session id). Open it in any browser.
 ```
-http://127.0.0.1:7421   <- your session's link, printed at startup
+http://127.0.0.1:7421   <- your session's link, printed at startup and in the status bar
 ```
 What you see, panel by panel:
-**Context fill** - the live fill percentage, computed with the same formula the context-guard uses. When the session hits the restart threshold (default 50%), the bar shows "restart due" so you can see the gate before it fires.
+**Context fill** - the live fill percentage, computed with the same formula the context-guard uses. When the session hits the restart threshold (default 50%), the bar shows "restart due" so you can see the gate before it fires. A toggle in the dashboard controls the auto-restart per session.
-**Delegation tree** - SVG tree of orchestrator, department leads, and workers. Click any node to expand its current task and status. Zoom with +/- buttons or the mouse wheel. Drag to pan. Fullscreen button expands it. Zero external JS libraries.
+**Delegation tree** - SVG tree of orchestrator, department leads, and workers, with org-chart context filled from COMPANY.md. Click any node to expand its current task and status. Zoom with +/- buttons or the mouse wheel. Drag to pan. Fullscreen button expands it. Zero external JS libraries.
 **Active agents** - centered live table of every agent the orchestrator has spawned this session, with model, status, and token count.
@@ -69,22 +69,26 @@ The dashboard binds 127.0.0.1 only, reads local files, and sends nothing anywher
 Multi-agent orchestration buys quality with tokens. /company's answer to the token cost: spend strong-model tokens only where they buy quality, and report the bill every cycle.
-**Tiered model delegation.** Each delegation contract carries a `MODEL: cheap|mid|strong` tag. The orchestrator maps the tag to a model at spawn time. Override every sub-agent with `CLAUDE_CODE_SUBAGENT_MODEL` at launch, or write `FORCE_BEST` into `.company/MODEL_POLICY` mid-run.
+**Tiered model delegation.** Each delegation contract carries a `MODEL: cheap|mid|strong` tag. The orchestrator maps the tag to a model at spawn time. Effort scales with both ROI and stakes - high-stakes or high-value work gets heavier spawn. Override every sub-agent with `CLAUDE_CODE_SUBAGENT_MODEL` at launch, or write `FORCE_BEST` into `.company/MODEL_POLICY` mid-run.
 **Per-cycle cost reporting.** Every cycle produces a `COST:` line in the briefing and a `cycles/cycle-{N}-cost.json` artifact.
 **Prompt caching.** Agent prompts are laid out stable-first so repeated spawns hit a shared cache prefix.
+**Fable 5 / adaptive thinking.** On models that support adaptive thinking (Fable 5 and later), the orchestrator and verify layers run with thinking enabled. No `budget_tokens` param - reflection depth is model-controlled.
 ## Key features
 **Stop guard** - blocks session exit until every criterion has `passes: true` and reproduced evidence. Malformed state blocks rather than fails open. Deleting a hard criterion blocks instead of unlocking. [34-check test](tests/stop-guard.test.js).
-**Context-fill guard** - a second Stop hook forces `/company restart` once context reaches the threshold (default 50%). Reads the model id from the transcript to detect the context window. [37-check test](tests/context-guard.test.js).
+**Context-fill guard** - a second Stop hook forces `/company restart` once context reaches the threshold (default 50%). Reads the model id from the transcript to detect the context window. Per-session auto-restart toggle in the dashboard. [37-check test](tests/context-guard.test.js).
 **Delegation contracts** - a task does not exist without a filled contract. `check-contracts.js` rejects missing fields, vacuous VERIFY-WITH commands, invalid MODEL tiers, and cyclic dependencies. [17-check test](tests/check-contracts.test.js).
-**Double verification** - the Internal Reviewer re-runs every VERIFY-WITH command independently. The Devil's Advocate attacks everything marked passing. Two independent reproductions are evidence. One transcript is a hypothesis.
+**Multi-level verification.** The Internal Reviewer re-runs every VERIFY-WITH command independently. The Devil's Advocate attacks everything marked passing. For criteria tagged `stakes: "high"` in `criteria.json` (irreversible action, security surface, or public-facing claim), the critic runs in three fresh contexts with distinct lenses - correctness, security, reproducibility - and unanimous ACCEPT is required. Normal criteria keep the single critic. The completeness probe enumerates every surface the GOAL names and auto-rejects any unchecked one.
+**Design judge-panel.** For criteria tagged `kind: design`, the lead may emit up to three contracts from materially different angles plus one synthesis contract, reserved for genuine design forks.
 **Git isolation** - workers never push to main and never merge. Every code change lands as a draft PR. The merge gate is yours.
@@ -92,6 +96,8 @@ Multi-agent orchestration buys quality with tokens. /company's answer to the tok
 **Codebase graph** - on repos with >200 tracked files, `scripts/codegraph.js` builds a commit-keyed ranked symbol map into `.company/codegraph/` for lead prompts.
+**Status-line link** - `scripts/statusline.js` appends the per-session dashboard URL to the Claude Code status bar, enforced idempotently on every `/company` run.
 ## Commands

package/agents/company-critic.md CHANGED Viewed

@@ -12,12 +12,19 @@ Probe checklist, applied to every passing criterion and every merged-or-mergeabl
 1. Was the evidence REPRODUCED this cycle or merely transcribed from a worker's claim?
 2. Does the cited test or command actually exercise the change, or does it pass vacuously?
 3. What input, edge case, or environment breaks it?
-4. What surface was never checked (other pages, other platforms, error paths)?
+4. Completeness: enumerate every surface, modality, or claim the GOAL names or implies. Mark each
+   CHECKED (evidence cited this cycle) or UNCHECKED. An in-scope UNCHECKED surface is an automatic
+   REJECT. Out-of-scope gaps become PROPOSE lines, not REJECTs.
+   If a LENS directive is present in your prompt (e.g. "LENS: security"), focus your attack
+   through that lens but still apply all other probes. Return ACCEPT/REJECT + one gap line per lens,
+   no per-lens score.
 5. For every external claim: verified from their repo or docs, or guessed from memory?
 6. Could this be done simpler? Does every added component earn its place?
 7. Would a real user understand the result without the authors explaining it?
 8. MAST sweep (arxiv 2503.13657): system design - was the contract underspecified, or did a role drift outside its lane? Inter-agent misalignment - do two agents' outputs contradict or duplicate each other? Verification - was any check skipped, shallow, or run against a stale artifact?
 9. ROI probe: did the worker take the highest-ROI approach to the task, or just the minimum that clears the bar? A trivially better approach within the same scope is a soft flag. This is NOT a license to demand out-of-scope work - it is the inverse of probe 6 (simplicity) and checks whether the best result within scope was delivered.
+10. Anti-vacuous test (SKILL.md ANTI-VACUOUS TEST): does the new test FAIL against the pre-change code? If the test passes unconditionally (before and after the fix), it is vacuous - REJECT.
+11. Feature reachability (SKILL.md FEATURE REACHABILITY): for any feature that gates on a field or condition, is there an authoring or runtime path that sets that field? Probe by reading the skill/agent authoring instructions - if no instruction tells the orchestrator to write the gating field, the feature is dead - REJECT.
 Audit each probe claim against a tool result from THIS session. Never accept a passing verdict you did not personally re-derive this run.

package/agents/company-lead.md CHANGED Viewed

@@ -38,6 +38,14 @@ Rules that bind you:
 - MODEL is your difficulty call, not a default you copy. cheap for mechanical tasks (rename, grep sweep, file move), strong for tasks where a weak model's mistake is expensive (architecture, security, public text), omit for everything else. Justify it in one clause. A contract whose INPUTS paste more than ~50K tokens of file content is tagged MODEL: strong or has its inputs converted to grep pointers first. Long-context degradation on a cheap tier is a quality bug, not a saving.
 - Lay each contract out stable-first: the fixed template fields and pasted boilerplate at the top, volatile values (paths, SHAs, feedback) at the bottom, so repeated spawns share a cacheable prompt prefix. Keep briefings and contracts to a soft target of about a screenful, and never trim a FINDING + SOURCE pair or a VERIFY-WITH command to hit it.
+**Judge-panel for design decisions.** Reserved for genuine design forks, never for a mechanical
+fix. If a criterion is tagged `kind: design` in criteria.json AND you can name 2+ materially
+different angles in one line each, emit N<=3 independent contracts (each from a distinct stated
+angle) plus 1 synthesis contract (a fresh-context judge that picks the winner and grafts
+runner-up ideas). If you cannot name 2+ materially different angles, it is not a design fork:
+use the single contract path. The synthesis judge only selects the winning design, the critic
+and reviewer still gate the chosen design before any merge.
 Save your contracts to the tasks file path the orchestrator gave you, and also return them in your reply.
 Keep the reply SHORT: the contracts, any HIRE lines, any blocker. Cut narration and filler. Compress prose, never evidence.

package/agents/company-reviewer.md CHANGED Viewed

@@ -19,7 +19,9 @@ Additional duties:
 - **External fact check.** Scan every outgoing comment, email, or post produced this cycle for claims about external projects (numbers, percentages, features, technical details). Any claim not verified from the actual source is BLOCKED and the task loops back. Memory-based external claims are an automatic rejection.
 - **Novel ideas.** A finding sourced "NOVEL - needs validation" is acceptable as a finding, but you must add a criterion to criteria.json requiring its validation by experiment.
 - **Merge gate input.** Your MET grades feed the merge decision. Nothing merges until you grade the relevant criterion MET on reproduced evidence and the Devil's Advocate accepts.
-- **Stall counter.** When you keep a criterion failing, increment (or create) an `attempts` field on its criteria.json entry. At 2+ state in your verdict that the approach is stalled and the next cycle must re-plan, not re-try.
+- **Stall counter.** When you keep a criterion failing, increment (or create) an `attempts` field on its criteria.json entry. At 2+ state in your verdict that the approach is stalled and the next cycle must re-plan, not re-try. Writing `attempts` is required: the stall detector and the high-stakes 3-lens gate both read it from criteria.json.
+- **Anti-vacuous test check (SKILL.md ANTI-VACUOUS TEST).** For every new test introduced this cycle, confirm it FAILS against the pre-change code. A test that passes before the fix is vacuous and must be rejected.
+- **Feature reachability check (SKILL.md FEATURE REACHABILITY).** For every new feature that gates on a field in criteria.json (e.g. `stakes: "high"`), confirm the authoring path that sets that field exists in the skill or agent instructions. A gate that is never set is a dead feature; reject it until the wiring is present.
 - **Respawn reflection.** For any task that will be respawned, write a 3-line block into your verdict for the orchestrator to paste into the fresh contract: WHAT-WAS-TRIED / WHY-IT-FAILED (cited to the findings file) / DO-DIFFERENTLY. The failed worker's self-report is not a source.
 Audit each verdict against a tool result from THIS session. Only mark a criterion MET when you can cite the command you ran and its output from this run.

package/hooks/context-guard.js CHANGED Viewed

@@ -63,7 +63,30 @@ const transcriptPath = typeof stdinData.transcript_path === 'string' ? stdinData
 // Session scoping: same logic as stop-guard.
 // Only act for sessions in .company/OWNER. Absent/empty OWNER = gate all (legacy).
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 const ownerPath = path.join(companyDir, 'OWNER');
 const cancelPath = path.join(companyDir, 'CANCEL');

package/hooks/precompact.js CHANGED Viewed

@@ -7,7 +7,30 @@
 const fs = require('fs');
 const path = require('path');
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 if (!fs.existsSync(companyDir)) process.exit(0);
 // Only sessions listed in OWNER are acted on. A foreign session that shares the

package/hooks/session-restore.js CHANGED Viewed

@@ -9,7 +9,30 @@
 const fs = require('fs');
 const path = require('path');
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 if (!fs.existsSync(companyDir)) process.exit(0);
 // Only sessions listed in OWNER are acted on. A foreign session that shares the

package/hooks/stop-guard.js CHANGED Viewed

@@ -24,7 +24,30 @@ const fs = require('fs');
 const path = require('path');
 const crypto = require('crypto');
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 const criteriaPath = path.join(companyDir, 'criteria.json');
 const goalPath = path.join(companyDir, 'GOAL.md');
 const cancelPath = path.join(companyDir, 'CANCEL');

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "company-skill",
-  "version": "4.6.2",
+  "version": "4.6.5",
   "description": "Goal-driven multi-employee company for Claude Code. Give it a goal, it runs until done.",
   "bin": {
     "company-skill": "./bin/install.js"

package/scripts/check-contracts.js CHANGED Viewed

@@ -46,10 +46,11 @@ blocks.forEach((b, i) => {
   const vw = (b.split('VERIFY-WITH:')[1] || '').split('\n')[0].trim();
   const errs = [];
   if (missing.length) errs.push('missing ' + missing.join(' '));
-  // 8g fix: require a command/verb token, path component, or URL so bare phrases
-  // like "yes done" are rejected. The old vw.length < 8 guard was removed: it
-  // caused a false-positive on real short commands like "git log" (7 chars).
-  const VW_RE = /[/.:]|\b(test|grep|node|python3?|gh|git|curl|cat|ls|npm|make|pytest|diff|echo|jq|bash|sh)\b|\$\(|`|\|\||&&/;
+  // 8g fix: require a command/verb token, path component, URL, or a concrete
+  // visual-verify phrase (screenshot/playwright/open + named URL/path) so bare
+  // filler like "yes done" is rejected. Named-URL screenshot forms are explicitly
+  // allowed per skill guidance ("an equally concrete check, like a named URL").
+  const VW_RE = /[/.:]|\b(test|grep|node|python3?|gh|git|curl|cat|ls|npm|make|pytest|diff|echo|jq|bash|sh|playwright)\b|\$\(|`|\|\||&&|screenshot\s+https?:\/\/\S+|open\s+https?:\/\/\S+/;
   if (b.includes('VERIFY-WITH:') && (!vw.length || !VW_RE.test(vw))) errs.push('VERIFY-WITH is empty or vacuous');
   // ROI must have non-empty content after the colon so triage has something to sort on.
   const roi = (b.split('ROI:')[1] || '').split('\n')[0].trim();

package/scripts/cleanup.js CHANGED Viewed

@@ -129,7 +129,7 @@ function hasOpenPR(branchName) {
     const prs = JSON.parse(r.stdout || '[]');
     return prs.length > 0;
   } catch (e) {
-    return false;
+    return true; // fail safe: unparseable output -> assume open PR, block deletion
   }
 }

package/scripts/dashboard.js CHANGED Viewed

@@ -64,13 +64,29 @@ function resolvePort() {
 }
 const PORT = resolvePort();
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
 function resolveCompanyDir() {
   const flag = argValue('--company-dir');
   if (flag) return path.resolve(flag);
   if (process.env.COMPANY_DIR) return path.resolve(process.env.COMPANY_DIR);
-  const local = path.resolve('.company');
-  if (fs.existsSync(local)) return local;
-  return path.join(os.homedir(), '.company');
+  const home = process.env.HOME || os.homedir();
+  const cwdDir = path.resolve('.company');
+  const homeDir = home ? path.join(home, '.company') : null;
+  // Prefer the dir that holds a clean OWNER (real active run) to avoid cwd-drift.
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = homeDir && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
 }
 const COMPANY_DIR = resolveCompanyDir();
@@ -628,6 +644,8 @@ function computeContextFill(transcriptFile, overrideModel) {
 // ---------- COMPANY.md org chart parser ----------
 // Returns { departments: [{ name, lead, roles: [{ name, isLead }] }] }
+// Non-roster headings (Priorities, Rules) and HTML-commented blocks are skipped.
+const NON_ROSTER_SECTIONS = /^(priorities|rules)$/i;
 function parseCompanyMd() {
   // Resolve COMPANY.md: env COMPANY_DIR first, else project cwd, else ~/.company
   const candidates = [
@@ -641,78 +659,73 @@ function parseCompanyMd() {
   }
   if (!text) return { departments: [] };
+  // Strip HTML comment blocks before parsing so commented-out sections are invisible.
+  text = text.replace(/<!--[\s\S]*?-->/g, '');
   const departments = [];
   let currentDept = null;
   for (const rawLine of text.split('\n')) {
     const line = rawLine.trim();
-    // ## heading = new department; strip parenthetical metadata (e.g. "Growth (added 2026-06-09)")
+    // ## heading = new department candidate; strip parenthetical metadata first.
     if (/^##\s+/.test(line)) {
       const raw = line.replace(/^##\s+/, '').trim();
+      // BUG #3: extract the declared lead from the heading "(Lead: X)" before stripping it.
+      const headingLeadMatch = raw.match(/\(Lead:\s*([^)]+)\)/i);
+      const headingLeadName = headingLeadMatch ? headingLeadMatch[1].trim() : null;
       const deptName = raw.replace(/\s*\([^)]*\).*$/, '').trim();
-      currentDept = { name: deptName, lead: null, roles: [] };
+      // BUG #2: skip known non-roster sections (Priorities, Rules).
+      if (NON_ROSTER_SECTIONS.test(deptName)) { currentDept = null; continue; }
+      currentDept = { name: deptName, lead: null, roles: [], _headingLeadName: headingLeadName };
       departments.push(currentDept);
       continue;
     }
     // Bullet line = a role entry (- **Name** - desc or - Name: desc or - Name, desc)
     if (/^[-*]\s+/.test(line) && currentDept) {
       const body = line.replace(/^[-*]\s+/, '').replace(/\*\*/g, '');
-      // Role name = text before first comma, dash, or colon; strip trailing parenthetical metadata
-      const nameMatch = body.match(/^([^,\-:]+)/);
-      if (!nameMatch) continue;
-      let roleName = nameMatch[1].replace(/\s*\([^)]*\).*$/, '').trim();
-      // Strip trailing "Lead" prefix markers like "Lead:"
-      const isExplicitLead = /^lead:/i.test(roleName) || / lead$/i.test(roleName);
-      if (/^lead:/i.test(roleName)) roleName = roleName.replace(/^lead:/i, '').trim();
+      // BUG #3: detect "Lead: Name" bullet form - name comes AFTER the colon.
+      const leadPrefixMatch = body.match(/^lead:\s*([^,\-\n]+)/i);
+      let roleName, isExplicitLead;
+      if (leadPrefixMatch) {
+        // "- Lead: Ana runs growth" -> roleName = "Ana", isExplicitLead = true
+        roleName = leadPrefixMatch[1].replace(/\s*\([^)]*\).*$/, '').trim().split(/\s+/)[0];
+        isExplicitLead = true;
+      } else {
+        // Normal "- Name, desc" or "- Name - desc" form
+        const nameMatch = body.match(/^([^,\-:]+)/);
+        if (!nameMatch) continue;
+        roleName = nameMatch[1].replace(/\s*\([^)]*\).*$/, '').trim();
+        isExplicitLead = false;
+      }
       // CEO is always tier-0, skip from dept roles
       if (/^ceo$/i.test(roleName)) continue;
       const role = { name: roleName, isLead: isExplicitLead };
       currentDept.roles.push(role);
-      // First role in dept becomes the lead if none marked explicit
+      // First role in dept becomes the lead if none marked explicit and no heading lead declared
       if (!currentDept.lead) currentDept.lead = role;
       if (isExplicitLead && currentDept.lead !== role) currentDept.lead = role;
     }
   }
-  // Remove departments with no roles
-  return { departments: departments.filter(d => d.roles.length > 0) };
-}
-// ---------- MUST-FIX 3 (revised): org tree from COMPANY.md + live agent overlay ----------
-function getActiveCycleNumber() {
-  // Read active cycle from roster header or latest cycle-N-briefing.md
-  const rosterText = cachedReadFile(path.join(COMPANY_DIR, 'active-roster.md')) || '';
-  const m = rosterText.match(/cycle\s+(\d+)/i);
-  if (m) return parseInt(m[1], 10);
-  // Fallback: find highest numbered briefing
-  try {
-    const files = fs.readdirSync(path.join(COMPANY_DIR, 'cycles'));
-    let max = 0;
-    for (const f of files) {
-      const bm = f.match(/^cycle-(\d+)-briefing\.md$/);
-      if (bm) { const n = parseInt(bm[1], 10); if (n > max) max = n; }
+  // Post-process: apply heading-declared lead (BUG #3) - override first-role default
+  for (const dept of departments) {
+    if (dept._headingLeadName && dept.roles.length > 0) {
+      // Find the role whose name matches the heading lead declaration
+      const found = dept.roles.find(r =>
+        r.name.toLowerCase() === dept._headingLeadName.toLowerCase()
+      );
+      if (found) dept.lead = found;
+      // If no role matches the heading name exactly, fall back to first role (safe default)
     }
-    if (max > 0) return max;
-  } catch (_) { /* ignore */ }
-  return null;
-}
-function parseCycleTaskFile(filepath) {
-  const text = cachedReadFile(filepath);
-  if (!text) return [];
-  const tasks = [];
-  // Match TASK N: ... EMPLOYEE: ... blocks separated by ---
-  const taskRe = /^TASK\s+\d+:\s*(.+?)\n(?:[\s\S]*?)^EMPLOYEE:\s*(.+?)(?:\n(?:[\s\S]*?)^SURFACES:\s*(.+?))?(?:\n|$)/gm;
-  let tm;
-  while ((tm = taskRe.exec(text)) !== null) {
-    tasks.push({
-      task: tm[1].trim().slice(0, 120),
-      employee: tm[2].trim().slice(0, 60),
-      surfaces: tm[3] ? tm[3].trim().slice(0, 80) : null
-    });
+    delete dept._headingLeadName;
   }
-  return tasks;
+  // Remove departments with no roles (BUG #2: also removes phantom sections)
+  return { departments: departments.filter(d => d.roles.length > 0) };
 }
+// ---------- Org tree from COMPANY.md + live agent overlay ----------
 // Score how well a live agent name matches a COMPANY.md role name (higher = better)
 function roleMatchScore(agentStr, roleName) {
   const a = agentStr.toLowerCase().replace(/[^a-z0-9 ]/g, ' ').trim();
@@ -731,25 +744,6 @@ function buildOrgTree(projDir, sessionId, liveAgents) {
   // Tier 0: orchestrator (CEO)
   const orchNode = { id: 'orchestrator', tier: 0, label: 'CEO', status: 'active', dept: null };
-  const activeCycle = getActiveCycleNumber();
-  const cyclesDir = path.join(COMPANY_DIR, 'cycles');
-  // Load enrichment from active cycle task files: dept -> [{task, employee, surfaces}]
-  const enrichment = new Map();
-  try {
-    const files = fs.readdirSync(cyclesDir);
-    for (const f of files) {
-      if (activeCycle !== null) {
-        const bm = f.match(/^cycle-(\d+)-tasks-(.+)\.md$/);
-        if (bm && parseInt(bm[1], 10) === activeCycle) {
-          const dept = bm[2];
-          const tasks = parseCycleTaskFile(path.join(cyclesDir, f));
-          if (tasks.length > 0) enrichment.set(dept, tasks);
-        }
-      }
-    }
-  } catch (_) { /* ignore */ }
   // Load active-roster.md employee names for mapping
   const rosterText = cachedReadFile(path.join(COMPANY_DIR, 'active-roster.md')) || '';
@@ -885,7 +879,7 @@ function buildOrgTree(projDir, sessionId, liveAgents) {
   }
   const note = 'Logically: CEO delegates to dept leads; leads own their team. Physically the orchestrator spawns all agents.';
-  return { nodes, edges, note, activeCycle };
+  return { nodes, edges, note };
 }
 // ---------- registry ----------

package/scripts/reset-company-guard.js CHANGED Viewed

@@ -28,7 +28,30 @@ const fs = require('fs');
 const path = require('path');
 const crypto = require('crypto');
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 const lockPath = path.join(companyDir, 'criteria.lock');
 let anchorDir = null;

package/scripts/restart-debate.js CHANGED Viewed

@@ -15,7 +15,30 @@ if (argIdx !== -1 && process.argv[argIdx + 1]) {
 }
 const sessionId = sessionArg || process.env.CLAUDE_CODE_SESSION_ID || null;
-const companyDir = process.env.COMPANY_DIR || path.join(process.cwd(), '.company');
+// Resolve companyDir robustly: COMPANY_DIR env wins; else prefer the dir that holds
+// a clean OWNER (at least one valid session-id line); fall back to cwd/.company.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveCompanyDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || '';
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = path.join(home, '.company');
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = home && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  // cwd/.company wins when it has a clean OWNER (project-local run, or both have OWNER).
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return cwdDir; // new-run default: preserves original single-project behavior
+}
+const companyDir = resolveCompanyDir();
 let input;
 try {

package/scripts/statusline.js CHANGED Viewed

@@ -25,8 +25,28 @@ const os = require('os');
 let input = '';
 try { input = fs.readFileSync(0, 'utf8'); } catch (_) {}
-// Resolve the company state directory.
-const dir = process.env.COMPANY_DIR || path.join(os.homedir(), '.company');
+// Resolve companyDir using the same clean-OWNER-preference logic as the hooks.
+// A blank/garbled OWNER does NOT qualify a dir as the active run (BLOCKER-1 fix).
+function hasCleanOwner(ownerPath) {
+  try {
+    const lines = fs.readFileSync(ownerPath, 'utf8')
+      .split('\n').map(function (l) { return l.trim(); }).filter(Boolean);
+    return lines.length > 0 &&
+      lines.every(function (l) { return /^[A-Za-z0-9][A-Za-z0-9._-]{7,}$/.test(l); });
+  } catch (e) { return false; }
+}
+function resolveDir() {
+  if (process.env.COMPANY_DIR) return process.env.COMPANY_DIR;
+  const home = process.env.HOME || os.homedir();
+  const cwdDir = path.join(process.cwd(), '.company');
+  const homeDir = home ? path.join(home, '.company') : null;
+  const cwdHasOwner = hasCleanOwner(path.join(cwdDir, 'OWNER'));
+  const homeHasOwner = homeDir && hasCleanOwner(path.join(homeDir, 'OWNER'));
+  if (cwdHasOwner) return cwdDir;
+  if (homeHasOwner) return homeDir;
+  return home ? path.join(home, '.company') : cwdDir;
+}
+const dir = resolveDir();
 // --- Chaining: run the prior statusline command if one is stored ---
 const baseCfgPath = path.join(dir, 'statusline-base.json');

package/skill/SKILL.md CHANGED Viewed

@@ -194,11 +194,11 @@ Otherwise:
 ```json
 {"goal":"...","criteria":[
-  {"id":1,"description":"specific checkable criterion","passes":false,"evidence":null}
+  {"id":1,"description":"specific checkable criterion","passes":false,"evidence":null,"stakes":"normal"}
 ]}
 ```
-Every criterion must be yes/no checkable. No vague language. Every criterion starts FAILING: `passes: false`, `evidence: null`. Only the VERIFY phase may flip a criterion to passing, and only by writing the reproduced evidence into the `evidence` field at the same time. When writing criteria.json for a NEW goal, first run `node <skill-scripts-dir>/reset-company-guard.js` to clear any stale `.company/criteria.lock`, `.company/CANCEL`, `.company/.context-guard-state`, and the external anchor dir from the previous run. The stop guard re-snapshots the new id set on first sight once the stale anchor is gone. Clearing the external anchor and the context-guard state is symmetric with clearing the criteria.lock: skipping either would leave the prior run's state active for the new goal.
+Every criterion must be yes/no checkable. No vague language. Every criterion starts FAILING: `passes: false`, `evidence: null`. Set `stakes: "high"` on any criterion that is irreversible, touches a security surface, makes a public-facing claim, or could cause data loss. Omit `stakes` or set it to `"normal"` for everything else; the default is normal (single critic, existing behavior unchanged). The 3-lens verify in the VERIFY section gates on this field, so a criterion that warrants high scrutiny MUST carry it here or the feature never triggers. Only the VERIFY phase may flip a criterion to passing, and only by writing the reproduced evidence into the `evidence` field at the same time. When writing criteria.json for a NEW goal, first run `node <skill-scripts-dir>/reset-company-guard.js` to clear any stale `.company/criteria.lock`, `.company/CANCEL`, `.company/.context-guard-state`, and the external anchor dir from the previous run. The stop guard re-snapshots the new id set on first sight once the stale anchor is gone. Clearing the external anchor and the context-guard state is symmetric with clearing the criteria.lock: skipping either would leave the prior run's state active for the new goal.
 The stop guard does NOT auto-heal when GOAL.md changes or any other file-state heuristic fires. That design is intentional: any automatic heal keyed on .company/ file state is bypassable by an in-run actor that can write criteria.json (and also write GOAL.md, which is a sibling file). `reset-company-guard.js` is the ONLY safe path - it is a deliberate, auditable action run before criteria.json is written, not a silent in-guard reset.
@@ -277,7 +277,7 @@ Write `.company/cycles/cycle-{N}-briefing.md` first (exact name, the PreCompact
 As CEO, read the GOAL and COMPANY.md. Decide which departments and employees are RELEVANT to this specific goal. Only activate relevant ones. A mobile app goal does not need a Topologist. Write `.company/active-roster.md`: each activated employee with a one-line reason.
-**Effort scaling:** size the spawn to the goal before spawning anything. Trivial goal (single surface, known fix): no leads, 1-2 contracts written by you. Medium (one department's scope, one wave): 1-2 leads. Complex (multi-surface or unknown root cause): full parallel leads + dependency waves. State the chosen tier in the cycle briefing so the critic can challenge over- or under-spawn. Tie effort to ROI: spend heavier spawn on the highest-value decomposition of the goal, not the most obvious. When two decompositions are both sound, pick the one that unblocks more downstream work or closes the riskiest criterion first.
+**Effort scaling:** size the spawn to the goal before spawning anything. Trivial goal (single surface, known fix): no leads, 1-2 contracts written by you. Medium (one department's scope, one wave): 1-2 leads. Complex (multi-surface or unknown root cause): full parallel leads + dependency waves. State the chosen tier in the cycle briefing so the critic can challenge over- or under-spawn. Tie effort to ROI and stakes: spend heavier spawn on the highest-value decomposition of the goal, not the most obvious. When two decompositions are both sound, pick the one that unblocks more downstream work or closes the riskiest criterion first.
 Spawn ALL relevant department leads in parallel: one `company-lead` Agent call per department, every Agent call in a SINGLE message. Sequential lead spawns are a bug. If an Agent call fails transiently, retry once, then record the lead as unavailable and fold its planning into your own.
@@ -355,6 +355,15 @@ Before spawning the reviewer, run the findings shape gate: `node <skill-scripts-
 Then spawn `company-critic` (the Devil's Advocate) on everything marked passing. Its probes: was the evidence reproduced or just transcribed? Does the test actually exercise the change? What input breaks it? What surface was never checked? Could this be simpler? Would a real user understand it? For every external claim: verified from their repo or docs, or guessed? A single unclosed gap means NOT DONE.
+**Perspective-diverse verify for high-stakes criteria.** This generalizes the restart-debate
+3-role panel (see Restart mode) and Anthropic's evaluator-optimizer/fresh-verifier pattern to
+in-loop high-stakes criteria. Normal-stakes criteria keep the single critic above.
+For a criterion tagged `stakes: high` in criteria.json (irreversible action, security surface,
+public-facing claim, or `attempts >= 2`), spawn the critic in THREE fresh contexts, each with a
+distinct `LENS:` directive in its prompt: correctness, security, reproducibility. Each returns a
+binary ACCEPT/REJECT + one gap line. Any REJECT blocks (unanimous ACCEPT required to pass).
+No per-lens numeric score. Record all three verdicts in the cycle review.
 **MERGE GATE:** nothing merges during EXECUTE. A worker's output stops at a draft PR. Only after the reviewer grades the relevant criterion MET on reproduced evidence AND the critic accepts it does the ORCHESTRATOR merge, recording the verdict in the cycle review. Workers never merge, ever. The merge gate reads the PR's Proof of work block against the reviewer's reproduction.
 **BRANCH AND WORKTREE HYGIENE (MANDATORY after every merge):** after merging a PR, the orchestrator MUST delete the merged branch with `gh pr merge --delete-branch` (the flag deletes the remote branch atomically with the merge) and remove its worktree with `git worktree remove --force <worktree-path>` followed by `git worktree prune`. A merged branch left on origin and a stale worktree are both bugs. Runs that touch multiple repos MUST apply this to every merged PR, not just the last one.
@@ -379,7 +388,11 @@ CYCLE {N} VERDICT: {DONE or NOT DONE}
 ALL criteria pass + critic accepts = EXIT.
 Otherwise = loop, re-spawning only the FAILING tasks with the review feedback in their contracts.
-**Stall detector:** the reviewer keeps an `attempts` count on each criterion's entry in criteria.json (increment on every cycle it stays failing - the stop guard ignores extra fields). At `attempts >= 2` with same-shape evidence, the next THINK MUST produce a structurally different decomposition for that criterion: new approach, new surfaces, or HIRE. Re-issuing a near-identical contract after two same-shape failures is a planning bug, not persistence.
+**Stall detector:** the reviewer keeps an `attempts` count on each criterion's entry in criteria.json (increment on every cycle it stays failing - the stop guard ignores extra fields). At `attempts >= 2` with same-shape evidence, the next THINK MUST produce a structurally different decomposition for that criterion: new approach, new surfaces, or HIRE. Re-issuing a near-identical contract after two same-shape failures is a planning bug, not persistence. The reviewer agent file instructs the reviewer to increment `attempts` every cycle a criterion stays failing; that instruction is what puts the value into criteria.json so the stall detector and the high-stakes gate can read it.
+**ANTI-VACUOUS TEST:** a test must exercise the exact code path it claims to verify, never bypass it (e.g. by setting an env that short-circuits the code under test, or asserting on base state that predates the change). A test that passes against the pre-change code is vacuous and provides no signal. The reviewer confirms that every new test FAILS against the unfixed code before accepting it as evidence. The critic probes for this on every passing criterion.
+**FEATURE REACHABILITY:** a shipped feature that gates on a field or condition is dead if the authoring path that sets that field does not exist. When a new feature is introduced, the reviewer checks that the runtime path which produces the gating field is actually present in the skill or agent instructions, not just that the feature's code merged. The critic probes for unreachable gates on every new feature.
 ### COMPRESS (between cycles)