company-skill 1.4.0 → 2.1.0
- package/package.json +1 -1
- package/skill/SKILL.md +72 -447
package/package.json
CHANGED
package/skill/SKILL.md
CHANGED
|
@@ -17,502 +17,127 @@ allowed-tools:
|
|
|
17
17
|
- Skill
|
|
18
18
|
---
|
|
19
19
|
|
|
20
|
-
# /company
|
|
20
|
+
# /company
|
|
21
21
|
|
|
22
|
-
|
|
22
|
+
Give it a goal. Run every employee. Loop until verified done.
|
|
23
23
|
|
|
24
|
-
|
|
25
|
-
echo "════════════════════════════════════════════════" && echo " 🏢 COMPANY SKILL ACTIVE" && echo "════════════════════════════════════════════════" && echo "" && ([ -f COMPANY.md ] && echo "$(grep -c '^[0-9]\|^- ' COMPANY.md 2>/dev/null) roles found" || echo "No COMPANY.md, will auto-create")
|
|
26
|
-
```
|
|
27
|
-
|
|
28
|
-
Give it a goal. It runs the entire company in loops until the goal is verified done.
|
|
29
|
-
|
|
30
|
-
```
|
|
31
|
-
/company "Build the user authentication system with OAuth2"
|
|
32
|
-
/company "Research all competitors and write a competitive analysis"
|
|
33
|
-
/company "Fix the payment processing bug and deploy"
|
|
34
|
-
```
|
|
35
|
-
|
|
36
|
-
You set the goal. Walk away. Come back to a STATUS.md with everything accomplished.
|
|
37
|
-
|
|
38
|
-
## Core Loop
|
|
39
|
-
|
|
40
|
-
```
|
|
41
|
-
GOAL: "{what the user asked for}"
|
|
42
|
-
|
|
|
43
|
-
THINK ── Leads break the goal into tasks, assign to employees
|
|
44
|
-
|
|
|
45
|
-
EXECUTE ── Employees do the work (code, research, scan, test)
|
|
46
|
-
|
|
|
47
|
-
VERIFY ── Built-in Reviewer + Critic check if the goal is actually met
|
|
48
|
-
|
|
|
49
|
-
Done? ── NO: loop back to THINK with feedback on what's missing
|
|
50
|
-
── YES: write STATUS.md, report to user
|
|
51
|
-
```
|
|
52
|
-
|
|
53
|
-
The loop does NOT stop until VERIFY confirms the goal is met.
|
|
54
|
-
|
|
55
|
-
## Built-In Roles (Always Active)
|
|
56
|
-
|
|
57
|
-
These exist in EVERY company, regardless of COMPANY.md. They are the backbone.
|
|
58
|
-
If the user already defined any of these roles, the built-in version merges with theirs (no duplicates).
|
|
59
|
-
|
|
60
|
-
### Leadership (THINK phase)
|
|
61
|
-
**CEO**, Reads the goal, sets priorities, resolves conflicts between departments. If the user defined a CEO, that definition is used. If not, auto-created.
|
|
62
|
-
|
|
63
|
-
**CTO**, Technical decisions, architecture review, code quality standards. Auto-created if not defined.
|
|
64
|
-
|
|
65
|
-
### Quality Gate (VERIFY phase)
|
|
66
|
-
**Internal Reviewer**, Reviews all work after each cycle. Checks: is it correct? Does it address the goal? Are there bugs or gaps? Grades each success criterion as MET / NOT MET / PARTIALLY MET.
|
|
67
|
-
|
|
68
|
-
**User Advocate**, "Would a real user understand this? Is the API ugly? Does the README make sense in 10 seconds?" Represents the end user who doesn't care about internals.
|
|
69
|
-
|
|
70
|
-
**Devil's Advocate**, Attacks every result. "What could go wrong? What did we miss? Would an external expert accept this?" Only satisfied when there are zero remaining holes.
|
|
71
|
-
|
|
72
|
-
**Elegance Enforcer**, "Can this be simpler? Is there unnecessary complexity? Does every component justify its existence?" Prevents over-engineering.
|
|
73
|
-
|
|
74
|
-
### Deduplication Rule
|
|
75
|
-
When parsing COMPANY.md, check if the user defined roles matching these built-ins:
|
|
76
|
-
- Match by name similarity: "CEO", "Chief Executive", "CTO", "Tech Lead", "Reviewer", "Critic", "Advocate", etc.
|
|
77
|
-
- If match found: use the USER'S description but ensure the role runs in the correct phase (CEO in THINK, Reviewer in VERIFY)
|
|
78
|
-
- If no match: auto-create with default description
|
|
79
|
-
- Never duplicate: one CEO, one CTO, one Reviewer, one Advocate, one Elegance Enforcer, one User Advocate
|
|
80
|
-
|
|
81
|
-
This means a 2-person COMPANY.md (just "Backend Dev" + "Frontend Dev") automatically gets: CEO + CTO + Backend Dev + Frontend Dev + Internal Reviewer + Devil's Advocate + Elegance Enforcer + User Advocate = 8 employees running.
|
|
82
|
-
|
|
83
|
-
## Step 1: Parse Goal + Company
|
|
84
|
-
|
|
85
|
-
Read the user's goal from the command argument or their message.
|
|
24
|
+
## Preamble
|
|
86
25
|
|
|
87
26
|
```bash
|
|
88
|
-
|
|
27
|
+
echo "════════════════════════════════════════════════" && echo " 🏢 COMPANY SKILL ACTIVE" && echo "════════════════════════════════════════════════"
|
|
28
|
+
[ -f COMPANY.md ] && echo "$(grep -c '^[0-9]\|^- ' COMPANY.md 2>/dev/null) roles" || echo "No COMPANY.md"
|
|
29
|
+
[ -f .company/playbook.md ] && echo "Playbook loaded" || echo "First run"
|
|
89
30
|
```
|
|
90
31
|
|
|
91
|
-
|
|
92
|
-
1. If a goal was given, create a COMPANY.md with a minimal team tailored to the goal (engineering dept for code goals, research dept for analysis goals, etc.) plus the built-in roles. Write it to disk so the user can edit it later.
|
|
93
|
-
2. If no goal given, create COMPANY.md from the template with placeholder departments and tell the user to fill it in before running again.
|
|
94
|
-
|
|
95
|
-
**Parsing rules:**
|
|
96
|
-
- Roles are `- ` lines that appear UNDER department headers (`## Department Name`)
|
|
97
|
-
- Lines under `## Priorities`, `## Rules`, `## Communication`, `## Protocol` are NOT roles
|
|
98
|
-
- Stop collecting roles when hitting the next `##` header
|
|
99
|
-
- A role line typically has a name followed by a comma and description: `- Role Name, description`
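The parsing rules listed here can be sketched in a few lines of awk. This is only an illustration of the stated rules, not code from the package: `list_roles` is a hypothetical name, and the sketch assumes department headers always start with `## ` and that the four excluded section names appear exactly as written in the rules.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print role lines from COMPANY.md, following the rules above.
list_roles() {
  awk '
    /^## / {
      # Every "##" header ends the previous section and starts a new one.
      # Sections named in the rules as non-role sections are skipped.
      skip = ($0 ~ /^## (Priorities|Rules|Communication|Protocol)/)
      in_dept = !skip
      next
    }
    # Inside a department section, each "- " line is one role.
    in_dept && /^- / { sub(/^- /, ""); print }
  ' "${1:-COMPANY.md}"
}
```

Called as `list_roles COMPANY.md | wc -l`, this would count only roles under department headers. Bullet lines under `## Rules` or before the first header are ignored.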
|
|
32
|
+
## Parse
|
|
100
33
|
|
|
101
|
-
|
|
102
|
-
- **THINK (Opus):** lead, director, chief, CEO, CTO, principal, architect, strategist
|
|
103
|
-
- **EXECUTE (Sonnet):** engineer, researcher, scientist, developer, specialist, analyst, scout, designer, writer
|
|
104
|
-
- **VERIFY (Opus):** Internal Reviewer + Devil's Advocate (always Opus, verification needs deep reasoning)
|
|
105
|
-
- **COMPRESS (Haiku):** auto-created digest writer
|
|
34
|
+
Read COMPANY.md (or create minimal team if missing). Read the user's goal.
|
|
106
35
|
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
## Step 2: Install Skills
|
|
110
|
-
|
|
111
|
-
```bash
|
|
112
|
-
INSTALLED=$(for d in ~/.claude/skills/*/SKILL.md .claude/skills/*/SKILL.md; do
|
|
113
|
-
[ -f "$d" ] && basename "$(dirname "$d")"
|
|
114
|
-
done 2>/dev/null | sort -u)
|
|
115
|
-
|
|
116
|
-
# Auto-install skill packs (failures don't block)
|
|
117
|
-
echo "$INSTALLED" | grep -q "gstack" || npx gstack@latest install 2>/dev/null || true
|
|
118
|
-
echo "$INSTALLED" | grep -q "gsd" || npx -y get-shit-done-cc@latest install 2>/dev/null || true
|
|
119
|
-
echo "$INSTALLED" | grep -q "trailofbits" || (git clone --depth 1 https://github.com/trailofbits/skills.git /tmp/tob-skills 2>/dev/null && cp -r /tmp/tob-skills/.claude/skills/* ~/.claude/skills/ 2>/dev/null && rm -rf /tmp/tob-skills) || true
|
|
120
|
-
# More skill packs via npm
|
|
121
|
-
echo "$INSTALLED" | grep -q "claude-mem\|thedotmack" || npm i -g claude-mem@latest 2>/dev/null || true
|
|
122
|
-
echo "$INSTALLED" | grep -q "oh-my-claude\|sisyphus" || npm i -g oh-my-claude-sisyphus@latest 2>/dev/null || true
|
|
123
|
-
# Marketplace plugins (detect if user added them, can't auto-install)
|
|
124
|
-
# /plugin marketplace add obra/superpowers-marketplace
|
|
125
|
-
# /plugin marketplace add wshobson/agents
|
|
126
|
-
# /plugin marketplace add alirezarezvani/claude-skills
|
|
127
|
-
|
|
128
|
-
for d in ~/.claude/skills/*/SKILL.md .claude/skills/*/SKILL.md; do
|
|
129
|
-
[ -f "$d" ] && basename "$(dirname "$d")"
|
|
130
|
-
done 2>/dev/null | sort -u
|
|
131
|
-
```
|
|
132
|
-
|
|
133
|
-
Build {DETECTED_SKILLS} list. When installed, skills are MANDATORY.
|
|
134
|
-
|
|
135
|
-
Users can install additional skill packs manually for more capabilities:
|
|
136
|
-
```
|
|
137
|
-
/plugin marketplace add wshobson/agents
|
|
138
|
-
/plugin marketplace add alirezarezvani/claude-skills
|
|
139
|
-
/plugin marketplace add obra/superpowers-marketplace
|
|
140
|
-
```
|
|
141
|
-
|
|
142
|
-
## Step 3: Initialize
|
|
143
|
-
|
|
144
|
-
```bash
|
|
145
|
-
mkdir -p .company/cycles .company/messages .company/memory
|
|
146
|
-
grep -q "^\.company/" .gitignore 2>/dev/null || echo ".company/" >> .gitignore 2>/dev/null || true
|
|
147
|
-
```
|
|
148
|
-
|
|
149
|
-
Write `.company/GOAL.md` with STRUCTURED checkable criteria (not free text):
|
|
150
|
-
```markdown
|
|
151
|
-
# Goal
|
|
152
|
-
{the user's goal}
|
|
153
|
-
|
|
154
|
-
# Success Criteria
|
|
155
|
-
Each criterion MUST be a yes/no checkable statement. No vague language.
|
|
156
|
-
|
|
157
|
-
- [ ] {specific measurable criterion 1}
|
|
158
|
-
- [ ] {specific measurable criterion 2}
|
|
159
|
-
- [ ] {specific measurable criterion 3}
|
|
160
|
-
|
|
161
|
-
Bad examples (REJECT these):
|
|
162
|
-
- "Code is clean" (vague)
|
|
163
|
-
- "Performance is good" (not measurable)
|
|
164
|
-
- "Implementation is complete" (not checkable)
|
|
165
|
-
|
|
166
|
-
Good examples:
|
|
167
|
-
- "Function X returns Y when given input Z"
|
|
168
|
-
- "Compression ratio > 20x measured with honest byte counting"
|
|
169
|
-
- "All tests pass with 0 failures"
|
|
170
|
-
- "README has install instructions that work on a fresh machine"
|
|
171
|
-
```
|
|
172
|
-
|
|
173
|
-
Write `.company/criteria.json` (machine-checkable state):
|
|
36
|
+
Write `.company/criteria.json`:
|
|
174
37
|
```json
|
|
175
|
-
{
|
|
176
|
-
"
|
|
177
|
-
|
|
178
|
-
{"id": 1, "description": "{criterion 1}", "passes": false, "evidence": null},
|
|
179
|
-
{"id": 2, "description": "{criterion 2}", "passes": false, "evidence": null},
|
|
180
|
-
{"id": 3, "description": "{criterion 3}", "passes": false, "evidence": null}
|
|
181
|
-
]
|
|
182
|
-
}
|
|
38
|
+
{"goal":"...","criteria":[
|
|
39
|
+
{"id":1,"description":"specific checkable criterion","passes":false,"evidence":null}
|
|
40
|
+
]}
|
|
183
41
|
```
|
|
42
|
+
Every criterion must be yes/no checkable. No vague language.
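Because each criterion carries a boolean `passes` field, the remaining work can be counted from the shell. The following is a minimal sketch, assuming each criterion object sits on its own line as in the example above. `count_failing` is a hypothetical helper and not something the package ships.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count criteria still marked "passes": false.
# Assumes one criterion object per line, as in the criteria.json example.
count_failing() {
  local file="${1:-.company/criteria.json}"
  [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }
  # grep -c prints the number of matching lines and exits 1 when that
  # number is 0, so "|| true" keeps the function usable under "set -e".
  grep -Ec '"passes"[[:space:]]*:[[:space:]]*false' "$file" || true
}
```

A line-based count is fragile if the JSON is reformatted so that several criteria share one line. A real JSON parser would be the safer choice where one is available.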
|
|
184
43
|
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
Write `.company/cycles/cycle-0-briefing.md` with: goal, criteria, team structure, rules, any previous state.
|
|
44
|
+
Read `.company/playbook.md` if it exists (accumulated knowledge from past sessions).
|
|
188
45
|
|
|
189
|
-
##
|
|
190
|
-
|
|
191
|
-
At the START of each cycle, run this Bash command (replace {N} with the cycle number):
|
|
46
|
+
## Loop
|
|
192
47
|
|
|
193
48
|
```bash
|
|
194
|
-
echo "
|
|
49
|
+
echo "════════════════════════════════════════════════" && echo "🏢 CYCLE {N} - THINK > EXECUTE > VERIFY" && echo "════════════════════════════════════════════════"
|
|
195
50
|
```
|
|
196
51
|
|
|
197
|
-
|
|
52
|
+
### THINK (Opus, all leads parallel)
|
|
198
53
|
|
|
199
|
-
|
|
200
|
-
echo "" && echo "📋 CYCLE {N} VERDICT: {DONE or NOT DONE}" && echo "{one line reason from Reviewer}"
|
|
201
|
-
```
|
|
54
|
+
First, CEO reads the GOAL and COMPANY.md. Decide which departments and employees are RELEVANT to this specific goal. Only activate relevant ones. A mobile app goal doesn't need a Topologist. A math research goal doesn't need DevOps.
|
|
202
55
|
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
**MANDATORY: Launch EVERY department lead from COMPANY.md in parallel. Not just CEO and CTO. ALL of them.** Parse COMPANY.md for every `## Department (Lead: Role)` header and launch that lead. Also launch the built-in roles (Internal Reviewer, Devil's Advocate, Elegance Enforcer, User Advocate) if they exist as separate departments.
|
|
206
|
-
|
|
207
|
-
If COMPANY.md has 8 departments, launch 8 leads. If it has 3, launch 3. Never skip a department.
|
|
208
|
-
|
|
209
|
-
Each lead gets:
|
|
210
|
-
|
|
211
|
-
```
|
|
212
|
-
You are {ROLE} ({DEPT} department). Cycle {N}.
|
|
213
|
-
|
|
214
|
-
GOAL: {contents of .company/GOAL.md}
|
|
215
|
-
BRIEFING: {contents of .company/cycles/cycle-{N}-briefing.md}
|
|
216
|
-
YOUR TEAM: {list of employees in your department}
|
|
217
|
-
INSTALLED SKILLS (MANDATORY, use these instead of doing work manually):
|
|
218
|
-
{DETECTED_SKILLS}
|
|
219
|
-
|
|
220
|
-
SKILL RULES (YOU MUST FOLLOW):
|
|
221
|
-
- Code review tasks: MUST use /review
|
|
222
|
-
- Bug investigation: MUST use /investigate
|
|
223
|
-
- Browser testing: MUST use /qa or /browse
|
|
224
|
-
- PR/shipping: MUST use /ship
|
|
225
|
-
- Architecture review: MUST use /plan-eng-review
|
|
226
|
-
- Strategy review: MUST use /plan-ceo-review or /office-hours
|
|
227
|
-
- Security audit: MUST use trailofbits skills if installed
|
|
228
|
-
- Project planning: MUST use /gsd:plan-phase if installed
|
|
229
|
-
- Web research: MUST use WebSearch
|
|
230
|
-
- Only use raw tools (Read/Write/Bash) when NO installed skill matches the task
|
|
231
|
-
|
|
232
|
-
{If cycle > 0:}
|
|
233
|
-
FEEDBACK FROM LAST VERIFICATION:
|
|
234
|
-
{what the Reviewer and Critic said was missing}
|
|
235
|
-
|
|
236
|
-
INSTRUCTIONS:
|
|
237
|
-
1. What does your department need to do to achieve the GOAL?
|
|
238
|
-
2. {If cycle > 0:} Address the verification feedback specifically.
|
|
239
|
-
3. Assign tasks to your employees. For EVERY task, check the SKILL RULES above first:
|
|
240
|
-
|
|
241
|
-
TASK: {one sentence}
|
|
242
|
-
ASSIGN: {employee role}
|
|
243
|
-
SKILL: {MANDATORY skill from rules above, or "raw" ONLY if no skill matches}
|
|
244
|
-
CONTEXT: {max 200 words}
|
|
245
|
-
|
|
246
|
-
4. Write to .company/cycles/cycle-{N}-think-{dept}.md
|
|
247
|
-
```
|
|
56
|
+
Write `.company/active-roster.md`: list of employees activated for THIS goal with a one-line reason why each is relevant.
|
|
248
57
|
|
|
249
|
-
|
|
58
|
+
Then launch ALL relevant department leads in parallel (skip departments with zero relevant employees).
|
|
250
59
|
|
|
251
|
-
|
|
60
|
+
Each lead gets: goal, criteria, playbook (if exists), active roster, previous cycle feedback, their team list, installed skills list.
|
|
252
61
|
|
|
253
|
-
|
|
254
|
-
You are {ROLE}. Cycle {N}.
|
|
255
|
-
|
|
256
|
-
GOAL: {the company's goal, so every employee knows WHY}
|
|
257
|
-
TASK: {from lead's assignment}
|
|
258
|
-
CONTEXT: {from lead}
|
|
259
|
-
SKILL: {assigned skill, MANDATORY}
|
|
260
|
-
PREVIOUS WORK: {.company/{dept}/{worker-slug}.md if exists}
|
|
261
|
-
|
|
262
|
-
INSTRUCTIONS:
|
|
263
|
-
1. If a skill was assigned (not "raw"), you MUST invoke it via the Skill tool. Do NOT do the work manually.
|
|
264
|
-
2. ONLY if skill is "raw", use raw tools (Read, Write, Bash, WebSearch).
|
|
265
|
-
3. Write findings to .company/{dept}/{worker-slug}.md with MANDATORY format:
|
|
266
|
-
FINDING: {what you found}
|
|
267
|
-
SOURCE: {file path, URL, experiment output, or exact command that proves this}
|
|
268
|
-
CONFIDENCE: HIGH (measured), MEDIUM (extrapolated), LOW (estimated)
|
|
269
|
-
Any claim without a SOURCE will be REJECTED by reviewers.
|
|
270
|
-
4. Rate finding 1-5.
|
|
271
|
-
5. Append to .company/messages/{dept}.jsonl:
|
|
272
|
-
{"type":"finding","from":"{ROLE}","priority":N,"source":"{proof}","confidence":"{H/M/L}","content":"summary"}
|
|
273
|
-
```
|
|
62
|
+
Leads assign tasks. For each task: one sentence, one employee, one skill (if installed).
|
|
274
63
|
|
|
275
|
-
|
|
64
|
+
If a lead sees a skill gap: write `HIRE: {role}, {why}` and CEO adds it to COMPANY.md and active roster.
|
|
65
|
+
If a lead realizes an idle employee is needed after all: add them to active roster.
|
|
276
66
|
|
|
277
|
-
|
|
67
|
+
### EXECUTE (Sonnet, all workers parallel)
|
|
278
68
|
|
|
279
|
-
|
|
280
|
-
```
|
|
281
|
-
You are the Internal Reviewer. Cycle {N} just completed.
|
|
282
|
-
|
|
283
|
-
Read .company/criteria.json and ALL .company/messages/*.jsonl and .company/{dept}/*.md findings.
|
|
284
|
-
|
|
285
|
-
VERIFICATION RULES:
|
|
286
|
-
- For EACH criterion in criteria.json with "passes": false, check if this cycle produced evidence.
|
|
287
|
-
- Every finding MUST have a SOURCE field. REJECT any finding without one.
|
|
288
|
-
- For priority 4-5 findings, RE-RUN the experiment or RE-CHECK the source file yourself.
|
|
289
|
-
- If a number is claimed, read the actual output file that produced it.
|
|
290
|
-
- Any claim you cannot independently verify: mark UNVERIFIED.
|
|
291
|
-
|
|
292
|
-
UPDATE .company/criteria.json:
|
|
293
|
-
- For each criterion where evidence now exists, set "passes": true, fill "evidence" with proof.
|
|
294
|
-
- If no evidence, keep "passes": false.
|
|
295
|
-
|
|
296
|
-
Write to .company/cycles/cycle-{N}-review.md:
|
|
297
|
-
For each criterion:
|
|
298
|
-
ID: {from criteria.json}
|
|
299
|
-
DESCRIPTION: {criterion}
|
|
300
|
-
PASSES: true/false
|
|
301
|
-
EVIDENCE: {file path, URL, or command output}
|
|
302
|
-
VERIFIED: YES/NO
|
|
303
|
-
|
|
304
|
-
FINAL VERDICT:
|
|
305
|
-
- If ALL criteria in criteria.json have "passes": true with non-null evidence: DONE
|
|
306
|
-
- If ANY has "passes": false: NOT DONE, list exactly what next cycle must do
|
|
307
|
-
```
|
|
69
|
+
Each worker gets: their task, their previous findings file, failed approaches from playbook.
|
|
308
70
|
|
|
309
|
-
|
|
71
|
+
Every finding MUST have:
|
|
310
72
|
```
|
|
311
|
-
|
|
312
|
-
|
|
313
|
-
GOAL: {from .company/GOAL.md}
|
|
314
|
-
REVIEWER VERDICT: {from cycle-{N}-review.md}
|
|
315
|
-
|
|
316
|
-
If the Reviewer said DONE, challenge it:
|
|
317
|
-
- Is the work ACTUALLY complete or just surface-level?
|
|
318
|
-
- What edge cases were missed?
|
|
319
|
-
- Would this survive scrutiny from an external expert?
|
|
320
|
-
- Is there anything we're fooling ourselves about?
|
|
321
|
-
|
|
322
|
-
If the Reviewer said NOT DONE, amplify:
|
|
323
|
-
- What's the REAL blocker?
|
|
324
|
-
- Is the team approaching this the right way or wasting cycles?
|
|
325
|
-
|
|
326
|
-
Write to .company/cycles/cycle-{N}-advocate.md:
|
|
327
|
-
VERDICT: ACCEPT (goal truly met) or CHALLENGE (not yet)
|
|
328
|
-
REASON: {honest assessment}
|
|
329
|
-
GAPS: {what's missing, if any}
|
|
73
|
+
FINDING: what
|
|
74
|
+
SOURCE: file/URL/command that proves it
|
|
330
75
|
```
|
|
76
|
+
No source = rejected by reviewer.
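The "no source, no finding" rule can be approximated mechanically before a reviewer reads anything. This sketch assumes findings files use the `FINDING:` / `SOURCE:` line prefixes shown above, one field per line. `has_sources` is a hypothetical name, and comparing counts is a simplification that does not pair each finding with its own source.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed only if the file has at least one FINDING:
# line and as many non-empty SOURCE: lines as FINDING: lines.
has_sources() {
  local file="$1" findings sources
  findings=$(grep -c '^FINDING:' "$file" || true)
  # Require at least one non-space character after "SOURCE:".
  sources=$(grep -Ec '^SOURCE:[[:space:]]*[^[:space:]]' "$file" || true)
  [ "$findings" -gt 0 ] && [ "$findings" -eq "$sources" ]
}
```

For example, `has_sources .company/engineering/backend-dev.md || echo "rejected"` would flag a file whose findings lack sources. The path in that call is illustrative only.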
|
|
331
77
|
|
|
332
|
-
###
|
|
333
|
-
|
|
334
|
-
Launch Haiku digest writer to compress cycle output into next briefing.
|
|
335
|
-
|
|
336
|
-
Then check:
|
|
337
|
-
- Reviewer says DONE **AND** Advocate says ACCEPT → **EXIT LOOP**
|
|
338
|
-
- Either says NOT DONE / CHALLENGE → **inject their feedback into next cycle's briefing, continue**
|
|
339
|
-
|
|
340
|
-
No arbitrary iteration limit. The loop runs until the goal is verified done.
|
|
341
|
-
The only reasons to stop early:
|
|
342
|
-
- Context window approaching limit → compact and continue
|
|
343
|
-
- User presses Ctrl+C → save state to STATUS.md for /company resume
|
|
344
|
-
|
|
345
|
-
## Step 5: Final Report
|
|
346
|
-
|
|
347
|
-
Write `.company/STATUS.md`:
|
|
348
|
-
|
|
349
|
-
```markdown
|
|
350
|
-
# Company Status, {DATE}
|
|
351
|
-
## Goal
|
|
352
|
-
{the original goal}
|
|
353
|
-
|
|
354
|
-
## Verdict: {ACHIEVED / PARTIALLY ACHIEVED / IN PROGRESS}
|
|
78
|
+
### VERIFY (Opus)
|
|
355
79
|
|
|
356
|
-
|
|
357
|
-
|
|
80
|
+
Internal Reviewer reads criteria.json + all findings. For each criterion:
|
|
81
|
+
- Evidence exists with source? Set passes: true in criteria.json
|
|
82
|
+
- No evidence or source missing? Keep passes: false
|
|
358
83
|
|
|
359
|
-
|
|
360
|
-
{Reviewer's final assessment}
|
|
361
|
-
{Advocate's final assessment}
|
|
84
|
+
Devil's Advocate attacks anything marked as passing.
|
|
362
85
|
|
|
363
|
-
|
|
364
|
-
{
|
|
365
|
-
|
|
366
|
-
## Cycles: {N} of 5 max
|
|
86
|
+
```bash
|
|
87
|
+
echo "📋 CYCLE {N} VERDICT: {DONE or NOT DONE}" && echo "{reason}"
|
|
367
88
|
```
|
|
368
89
|
|
|
369
|
-
|
|
90
|
+
ALL criteria pass + Advocate accepts = EXIT.
|
|
91
|
+
Otherwise = loop with feedback.
|
|
370
92
|
|
|
371
|
-
##
|
|
93
|
+
## After Done
|
|
372
94
|
|
|
373
|
-
|
|
95
|
+
Write STATUS.md. Then update `.company/playbook.md`:
|
|
374
96
|
|
|
375
|
-
**Update COMPANY.md:**
|
|
376
|
-
- Employees with 3+ cycles of zero findings: add `[inactive]` tag to their role line
|
|
377
|
-
- Employees with consistently high-priority findings: add `[priority]` tag
|
|
378
|
-
- If a new role is needed (discovered during this session), ADD it to COMPANY.md
|
|
379
|
-
- If a department produced nothing useful, add a note: `<!-- Consider removing if inactive next session -->`
|
|
380
|
-
|
|
381
|
-
**Update .company/playbook.md** (new file, accumulates across ALL sessions):
|
|
382
97
|
```markdown
|
|
383
|
-
|
|
384
|
-
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
## Best Employees for Each Task Type
|
|
392
|
-
- Research tasks: {employee who produces most priority 4-5 findings}
|
|
393
|
-
- Code tasks: {employee who ships most working code}
|
|
394
|
-
- Review tasks: {employee who catches most real issues}
|
|
395
|
-
|
|
396
|
-
## Strategy Patterns
|
|
397
|
-
- {pattern}: {when to use it, based on past success}
|
|
98
|
+
## Session {date}
|
|
99
|
+
WORKED: {what succeeded, linked to evidence}
|
|
100
|
+
FAILED: {what failed} → USE INSTEAD: {what works} — WHY: {the difference}
|
|
101
|
+
INEFFICIENT: {what worked but was slow} → FASTER: {better approach}
|
|
102
|
+
HIRE: {roles added this session and why}
|
|
103
|
+
FIRE: {roles that produced nothing, marked [inactive] in COMPANY.md}
|
|
104
|
+
TOP: {employees with best findings, for priority activation next time}
|
|
398
105
|
```
|
|
399
106
|
|
|
400
|
-
|
|
107
|
+
The playbook is the ONLY self-improvement file. It accumulates across sessions. Leads read it before every THINK phase. One file, all lessons.
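Since the playbook only ever grows, updating it amounts to appending one dated section per session. A minimal sketch, covering only the `WORKED` and `FAILED` fields of the template above. `append_session` and its argument order are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append a dated session entry to the playbook.
# Usage: append_session FILE "what worked" "what failed"
append_session() {
  local file="${1:-.company/playbook.md}"
  mkdir -p "$(dirname "$file")"
  {
    # ">>" appends, so entries from earlier sessions are never overwritten.
    printf '\n## Session %s\n' "$(date +%F)"
    printf 'WORKED: %s\n' "$2"
    printf 'FAILED: %s\n' "$3"
  } >> "$file"
}
```

The remaining template fields (`INEFFICIENT`, `HIRE`, `FIRE`, `TOP`) would follow the same pattern.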
|
|
401
108
|
|
|
402
|
-
|
|
403
|
-
Write `.company/lead-overrides.md` with per-department adjustments:
|
|
404
|
-
```markdown
|
|
405
|
-
## Research Department
|
|
406
|
-
- Activate first: {top performer from performance.json}
|
|
407
|
-
- Skip unless needed: {employees with 0 findings last 2 sessions}
|
|
408
|
-
- Priority approach: {from playbook.md "Always Do First"}
|
|
109
|
+
CEO updates COMPANY.md: tag `[inactive]` on zero-contribution roles, `[priority]` on top performers, add any hired roles.
|
|
409
110
|
|
|
410
|
-
##
|
|
411
|
-
- ...
|
|
412
|
-
```
|
|
111
|
+
## Built-In Roles (always exist)
|
|
413
112
|
|
|
414
|
-
|
|
113
|
+
CEO, CTO, Internal Reviewer, User Advocate, Devil's Advocate, Elegance Enforcer.
|
|
114
|
+
Deduplicated if user defines them in COMPANY.md.
|
|
415
115
|
|
|
416
|
-
##
|
|
116
|
+
## Skills
|
|
417
117
|
|
|
418
|
-
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
422
|
-
|
|
423
|
-
|
|
424
|
-
```
|
|
425
|
-
|
|
426
|
-
The CEO reads all hire requests and:
|
|
427
|
-
1. If valid, ADDS the role to COMPANY.md under the right department
|
|
428
|
-
2. Creates the employee in the NEXT cycle's EXECUTE phase
|
|
429
|
-
3. Logs the hire in `.company/hires.md`
|
|
430
|
-
|
|
431
|
-
**After VERIFY phase, CEO can fire/deactivate:**
|
|
432
|
-
Based on performance.json:
|
|
433
|
-
- Employees with 0 findings for 3+ consecutive cycles: mark `[inactive]` in COMPANY.md
|
|
434
|
-
- Employees whose work was REJECTED by reviewers 2+ times: mark `[inactive]`
|
|
435
|
-
- `[inactive]` employees don't get spawned unless explicitly needed
|
|
436
|
-
|
|
437
|
-
**Dynamic reallocation:**
|
|
438
|
-
Between cycles, the CEO checks: which criteria are FAILING? What skills are missing?
|
|
439
|
-
Then reassigns employees:
|
|
440
|
-
- If compression research is stuck, pull the Outside-the-Box Thinker into that problem
|
|
441
|
-
- If code quality is failing, pull more reviewers in
|
|
442
|
-
- If a deadline is approaching, reduce research and increase engineering
|
|
443
|
-
|
|
444
|
-
This means the company structure CHANGES during execution, not just between sessions.
|
|
445
|
-
COMPANY.md gets updated in real-time as the company learns what it needs.
|
|
446
|
-
|
|
447
|
-
### Contrastive Insights
|
|
448
|
-
|
|
449
|
-
Every FAILED approach MUST be linked to a WORKING alternative in lessons:
|
|
450
|
-
```
|
|
451
|
-
FAILED: Variable-rate E8 (+3.67% PPL)
|
|
452
|
-
WORKS INSTEAD: Fixed-rate E8 (-0.05% PPL)
|
|
453
|
-
WHY: Energy is not a reliable proxy for importance
|
|
454
|
-
```
|
|
455
|
-
Not just "don't do X" but "do Y instead because Z." The VERIFY phase writes these.
|
|
456
|
-
|
|
457
|
-
### Optimization Tips
|
|
458
|
-
|
|
459
|
-
Track inefficient successes, not just failures:
|
|
460
|
-
```
|
|
461
|
-
INEFFICIENT: Spent 3 cycles on synthetic data before trying real KV
|
|
462
|
-
LESSON: Test on real data first, synthetic second
|
|
118
|
+
```bash
|
|
119
|
+
INSTALLED=$(for d in ~/.claude/skills/*/SKILL.md .claude/skills/*/SKILL.md; do [ -f "$d" ] && basename "$(dirname "$d")"; done 2>/dev/null | sort -u)
|
|
120
|
+
echo "$INSTALLED" | grep -q "gstack" || npx gstack@latest install 2>/dev/null || true
|
|
121
|
+
echo "$INSTALLED" | grep -q "gsd" || npx -y get-shit-done-cc@latest install 2>/dev/null || true
|
|
122
|
+
echo "$INSTALLED" | grep -q "trailofbits" || (git clone --depth 1 https://github.com/trailofbits/skills.git /tmp/tob-skills 2>/dev/null && cp -r /tmp/tob-skills/.claude/skills/* ~/.claude/skills/ 2>/dev/null && rm -rf /tmp/tob-skills) || true
|
|
123
|
+
for d in ~/.claude/skills/*/SKILL.md .claude/skills/*/SKILL.md; do [ -f "$d" ] && basename "$(dirname "$d")"; done 2>/dev/null | sort -u
|
|
463
124
|
```
|
|
464
|
-
CEO adds 1-3 optimization tips to playbook.md after each session.
|
|
465
|
-
|
|
466
|
-
### Meta-Audit
|
|
467
|
-
|
|
468
|
-
At END of every session, CEO asks:
|
|
469
|
-
1. Did Reviewer catch a real issue? If not, review process needs fixing.
|
|
470
|
-
2. Did Devil's Advocate find a genuine hole? If not, sharper prompts needed.
|
|
471
|
-
3. Did anyone read lessons.md before starting? If not, self-improvement isn't working.
|
|
472
|
-
4. Did any employee get hired/fired? If not, company isn't adapting.
|
|
473
|
-
Write results to `.company/meta-audit.md`.
|
|
474
|
-
|
|
475
|
-
### Skill Evolution
|
|
476
125
|
|
|
477
|
-
|
|
478
|
-
- If "Lattice Mathematician" keeps producing breakthroughs on entropy coding, update their description to include "entropy coding specialist"
|
|
479
|
-
- If "CTO" keeps finding bugs in experiments, add "experiment validation" to their role
|
|
480
|
-
- Employee skills evolve to match what the goal actually needs
|
|
126
|
+
When installed: MUST use /review for code review, /investigate for bugs, /qa for testing, /ship for PRs.
|
|
481
127
|
|
|
482
|
-
|
|
128
|
+
## Stop Hook
|
|
483
129
|
|
|
484
|
-
|
|
130
|
+
Claude cannot stop until ALL criteria.json entries have passes: true.
|
|
131
|
+
Cancel: `touch .company/CANCEL`
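The stop decision described here reduces to two file checks. The sketch below shows only that decision logic, assuming the one-criterion-per-line layout of `criteria.json`. The diff does not show how the package wires this into a hook, and `may_stop` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical stop check: exit status 0 means "may stop", 1 means "keep looping".
may_stop() {
  local dir="${1:-.company}"
  # A CANCEL marker always allows stopping.
  [ -f "$dir/CANCEL" ] && return 0
  # Without a criteria file nothing has been verified, so keep going.
  [ -f "$dir/criteria.json" ] || return 1
  # Stop only when no criterion is still marked as failing.
  ! grep -Eq '"passes"[[:space:]]*:[[:space:]]*false' "$dir/criteria.json"
}
```

Note that this sketch treats a criteria file with an empty `criteria` list as done. A stricter check would also require at least one criterion.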
|
|
485
132
|
|
|
486
|
-
|
|
133
|
+
## Files
|
|
487
134
|
|
|
488
135
|
```
|
|
489
|
-
/
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
|
|
136
|
+
.company/
|
|
137
|
+
criteria.json ← machine-checkable goal state
|
|
138
|
+
playbook.md ← accumulated lessons (THE self-improvement file)
|
|
139
|
+
STATUS.md ← final report
|
|
140
|
+
cycles/ ← per-cycle briefings and reviews
|
|
141
|
+
messages/ ← typed findings per department
|
|
142
|
+
{dept}/ ← per-employee findings (persist across sessions)
|
|
493
143
|
```
|
|
494
|
-
|
|
495
|
-
## Without COMPANY.md
|
|
496
|
-
|
|
497
|
-
If no COMPANY.md exists, the skill creates a minimal team based on the goal:
|
|
498
|
-
- Code/build/fix goals → CEO + CTO + 2 engineers + Internal Reviewer + Devil's Advocate
|
|
499
|
-
- Research/analyze goals → CEO + Research Director + 2 researchers + Internal Reviewer + Devil's Advocate
|
|
500
|
-
- Review/audit goals → CEO + Lead Reviewer + 2 reviewers + Internal Reviewer + Devil's Advocate
|
|
501
|
-
|
|
502
|
-
The user doesn't need to configure anything. Just `/company "do this thing"`.
|
|
503
|
-
|
|
504
|
-
## Namespaced Memory
|
|
505
|
-
|
|
506
|
-
`.company/memory/{dept}.json` persists findings across sessions. Employees check memory before re-researching.
|
|
507
|
-
|
|
508
|
-
## Health Monitoring
|
|
509
|
-
|
|
510
|
-
Failed employees get logged and skipped. After 3 consecutive failures, flagged to user.
|
|
511
|
-
|
|
512
|
-
## Safety
|
|
513
|
-
|
|
514
|
-
- No arbitrary loop limit, runs until the objective is done
|
|
515
|
-
- Built-in Reviewer + Advocate prevent premature "done" claims
|
|
516
|
-
- If context window gets tight, compact and continue
|
|
517
|
-
- If user stops (Ctrl+C), state saves to STATUS.md for `/company resume`
|
|
518
|
-
- All work persists in `.company/`, nothing is lost
|