@fro.bot/systematic 1.23.2 → 2.0.0

Files changed (55)
  1. package/README.md +75 -61
  2. package/agents/research/best-practices-researcher.md +2 -3
  3. package/agents/research/issue-intelligence-analyst.md +2 -3
  4. package/package.json +2 -3
  5. package/skills/ce-brainstorm/SKILL.md +10 -11
  6. package/skills/ce-compound/SKILL.md +11 -11
  7. package/skills/ce-compound-refresh/SKILL.md +2 -2
  8. package/skills/ce-ideate/SKILL.md +3 -4
  9. package/skills/ce-plan/SKILL.md +8 -8
  10. package/skills/ce-plan-beta/SKILL.md +9 -10
  11. package/skills/ce-review/SKILL.md +7 -7
  12. package/skills/ce-work/SKILL.md +4 -4
  13. package/skills/ce-work-beta/SKILL.md +556 -0
  14. package/skills/claude-permissions-optimizer/SKILL.md +161 -0
  15. package/skills/claude-permissions-optimizer/scripts/extract-commands.mjs +805 -0
  16. package/skills/deepen-plan/SKILL.md +15 -15
  17. package/skills/deepen-plan-beta/SKILL.md +3 -3
  18. package/skills/deploy-docs/SKILL.md +8 -8
  19. package/skills/file-todos/SKILL.md +2 -1
  20. package/skills/generate_command/SKILL.md +1 -1
  21. package/skills/{report-bug → report-bug-ce}/SKILL.md +38 -33
  22. package/skills/resolve-todo-parallel/SKILL.md +65 -0
  23. package/skills/setup/SKILL.md +3 -3
  24. package/skills/test-browser/SKILL.md +3 -4
  25. package/commands/.gitkeep +0 -0
  26. package/skills/create-agent-skill/SKILL.md +0 -10
  27. package/skills/create-agent-skills/SKILL.md +0 -265
  28. package/skills/create-agent-skills/references/api-security.md +0 -226
  29. package/skills/create-agent-skills/references/be-clear-and-direct.md +0 -531
  30. package/skills/create-agent-skills/references/best-practices.md +0 -404
  31. package/skills/create-agent-skills/references/common-patterns.md +0 -595
  32. package/skills/create-agent-skills/references/core-principles.md +0 -437
  33. package/skills/create-agent-skills/references/executable-code.md +0 -175
  34. package/skills/create-agent-skills/references/iteration-and-testing.md +0 -474
  35. package/skills/create-agent-skills/references/official-spec.md +0 -134
  36. package/skills/create-agent-skills/references/recommended-structure.md +0 -168
  37. package/skills/create-agent-skills/references/skill-structure.md +0 -152
  38. package/skills/create-agent-skills/references/using-scripts.md +0 -113
  39. package/skills/create-agent-skills/references/using-templates.md +0 -112
  40. package/skills/create-agent-skills/references/workflows-and-validation.md +0 -510
  41. package/skills/create-agent-skills/templates/router-skill.md +0 -73
  42. package/skills/create-agent-skills/templates/simple-skill.md +0 -33
  43. package/skills/create-agent-skills/workflows/add-reference.md +0 -96
  44. package/skills/create-agent-skills/workflows/add-script.md +0 -93
  45. package/skills/create-agent-skills/workflows/add-template.md +0 -74
  46. package/skills/create-agent-skills/workflows/add-workflow.md +0 -126
  47. package/skills/create-agent-skills/workflows/audit-skill.md +0 -138
  48. package/skills/create-agent-skills/workflows/create-domain-expertise-skill.md +0 -605
  49. package/skills/create-agent-skills/workflows/create-new-skill.md +0 -197
  50. package/skills/create-agent-skills/workflows/get-guidance.md +0 -121
  51. package/skills/create-agent-skills/workflows/upgrade-to-router.md +0 -161
  52. package/skills/create-agent-skills/workflows/verify-skill.md +0 -204
  53. package/skills/heal-skill/SKILL.md +0 -148
  54. package/skills/resolve_parallel/SKILL.md +0 -36
  55. package/skills/resolve_todo_parallel/SKILL.md +0 -38
@@ -51,10 +51,10 @@ Ensure that the code is ready for analysis (either in worktree or on current bra
51
51
  #### Protected Artifacts
52
52
 
53
53
  <protected_artifacts>
54
- The following paths are compound-engineering pipeline artifacts and must never be flagged for deletion, removal, or gitignore by any review agent:
54
+ The following paths are systematic pipeline artifacts and must never be flagged for deletion, removal, or gitignore by any review agent:
55
55
 
56
- - `docs/brainstorms/*-requirements.md` — Requirements documents created by `/systematic:ce-brainstorm`. These are the product-definition artifacts that planning depends on.
57
- - `docs/plans/*.md` — Plan files created by `/systematic:ce-plan`. These are living documents that track implementation progress (checkboxes are checked off by `/systematic:ce-work`).
56
+ - `docs/brainstorms/*-requirements.md` — Requirements documents created by `/ce:brainstorm`. These are the product-definition artifacts that planning depends on.
57
+ - `docs/plans/*.md` — Plan files created by `/ce:plan`. These are living documents that track implementation progress (checkboxes are checked off by `/ce:work`).
58
58
  - `docs/solutions/*.md` — Solution documents created during the pipeline.
59
59
 
60
60
  If a review agent flags any file in these directories for cleanup or removal, discard that finding during synthesis. Do not create a todo for it.
@@ -62,7 +62,7 @@ If a review agent flags any file in these directories for cleanup or removal, di
62
62
 
63
63
  #### Load Review Agents
64
64
 
65
- Read `compound-engineering.local.md` in the project root. If found, use `review_agents` from YAML frontmatter. If the markdown body contains review context, pass it to each agent as additional instructions.
65
+ Read `systematic.local.md` in the project root. If found, use `review_agents` from YAML frontmatter. If the markdown body contains review context, pass it to each agent as additional instructions.
66
66
 
67
67
  If no settings file exists, invoke the `setup` skill to create one. Then read the newly created file and continue.
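
The settings lookup described above can be sketched in shell. This is a minimal illustration, not the skill's actual implementation: the helper name `print_frontmatter` is hypothetical, and the sketch only prints the frontmatter block so that `review_agents` can be read from it, without assuming how that key's value is laid out.

```bash
# Hypothetical helper: print the YAML frontmatter of a markdown file,
# i.e. the lines between a leading `---` and the next `---`.
print_frontmatter() {
  awk '
    NR == 1 && $0 == "---" { in_frontmatter = 1; next }
    in_frontmatter && $0 == "---" { exit }
    in_frontmatter { print }
  ' "$1"
}

settings_file="systematic.local.md"
if [ -f "$settings_file" ]; then
  # The markdown body after the frontmatter would be passed to agents separately.
  print_frontmatter "$settings_file"
else
  echo "No settings file found - invoke the setup skill."
fi
```

A file that does not start with `---` yields no output, which can be treated the same as a missing `review_agents` key.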
68
68
 
@@ -279,9 +279,9 @@ Remove duplicates, prioritize by severity and impact.
279
279
 
280
280
  ```bash
281
281
  # Launch multiple finding-creator agents in parallel
282
- Task() - Create todos for first finding
283
- Task() - Create todos for second finding
284
- Task() - Create todos for third finding
282
+ task() - Create todos for first finding
283
+ task() - Create todos for second finding
284
+ task() - Create todos for third finding
285
285
  etc. for each finding.
286
286
  ```
287
287
 
@@ -220,13 +220,13 @@ This command takes a work document (plan, specification, or todo file) and execu
220
220
  # Run full test suite (use project's test command)
221
221
  # Examples: bin/rails test, npm test, pytest, go test, etc.
222
222
 
223
- # Run linting (per CLAUDE.md)
223
+ # Run linting (per AGENTS.md)
224
224
  # Use linting-agent before pushing to origin
225
225
  ```
226
226
 
227
227
  2. **Consider Reviewer Agents** (Optional)
228
228
 
229
- Use for complex, risky, or large changes. Read agents from `compound-engineering.local.md` frontmatter (`review_agents`). If no settings file, invoke the `setup` skill to create one.
229
+ Use for complex, risky, or large changes. Read agents from `systematic.local.md` frontmatter (`review_agents`). If no settings file, invoke the `setup` skill to create one.
230
230
 
231
231
  Run configured agents in parallel with task tool. Present findings and address critical issues.
232
232
 
@@ -265,7 +265,7 @@ This command takes a work document (plan, specification, or todo file) and execu
265
265
 
266
266
  Brief explanation if needed.
267
267
 
268
- 🤖 Generated with [MODEL] via [HARNESS](HARNESS_URL) + Compound Engineering v[VERSION]
268
+ 🤖 Generated with [MODEL] via [HARNESS](HARNESS_URL) + Systematic v[VERSION]
269
269
 
270
270
  Co-Authored-By: [MODEL] ([CONTEXT] context, [THINKING]) <noreply@anthropic.com>
271
271
  EOF
@@ -360,7 +360,7 @@ This command takes a work document (plan, specification, or todo file) and execu
360
360
 
361
361
  ---
362
362
 
363
- [![Compound Engineering v[VERSION]](https://img.shields.io/badge/Compound_Engineering-v[VERSION]-6366f1)](https://github.com/EveryInc/compound-engineering-plugin)
363
+ [![Systematic v[VERSION]](https://img.shields.io/badge/Systematic-v[VERSION]-6366f1)](https://github.com/EveryInc/systematic)
364
364
  🤖 Generated with [MODEL] ([CONTEXT] context, [THINKING]) via [HARNESS](HARNESS_URL)
365
365
  EOF
366
366
  )"
@@ -0,0 +1,556 @@
1
+ ---
2
+ name: ce:work-beta
3
+ description: 'Use this skill when executing a plan with the ce:work workflow but you also want optional external delegate execution for implementation-heavy tasks. Ideal for large tasks where token conservation matters and acceptance criteria are already clear.'
4
+ argument-hint: '[plan file, specification, or todo file path]'
5
+ disable-model-invocation: true
6
+ ---
7
+
8
+ # Work Plan Execution Command
9
+
10
+ Execute a work plan efficiently while maintaining quality and finishing features.
11
+
12
+ ## Introduction
13
+
14
+ This command takes a work document (plan, specification, or todo file) and executes it systematically. The focus is on **shipping complete features** by understanding requirements quickly, following existing patterns, and maintaining quality throughout.
15
+
16
+ ## Input Document
17
+
18
+ <input_document> #$ARGUMENTS </input_document>
19
+
20
+ ## Execution Workflow
21
+
22
+ ### Phase 1: Quick Start
23
+
24
+ 1. **Read Plan and Clarify**
25
+
26
+ - Read the work document completely
27
+ - Treat the plan as a decision artifact, not an execution script
28
+ - If the plan includes sections such as `Implementation Units`, `Work Breakdown`, `Requirements Trace`, `Files`, `Test Scenarios`, or `Verification`, use those as the primary source material for execution
29
+ - Check for `Execution note` on each implementation unit — these carry the plan's execution posture signal for that unit (for example, test-first or characterization-first). Note them when creating tasks.
30
+ - Check for a `Deferred to Implementation` or `Implementation-Time Unknowns` section — these are questions the planner intentionally left for you to resolve during execution. Note them before starting so they inform your approach rather than surprising you mid-task
31
+ - Check for a `Scope Boundaries` section — these are explicit non-goals. Refer back to them if implementation starts pulling you toward adjacent work
32
+ - Review any references or links provided in the plan
33
+ - If the user explicitly asks for TDD, test-first, or characterization-first execution in this session, honor that request even if the plan has no `Execution note`
34
+ - If anything is unclear or ambiguous, ask clarifying questions now
35
+ - Get user approval to proceed
36
+ - **Do not skip this** - better to ask questions now than build the wrong thing
37
+
38
+ 2. **Setup Environment**
39
+
40
+ First, check the current branch:
41
+
42
+ ```bash
43
+ current_branch=$(git branch --show-current)
44
+ default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@')
45
+
46
+ # Fallback if remote HEAD isn't set
47
+ if [ -z "$default_branch" ]; then
48
+ default_branch=$(git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master")
49
+ fi
50
+ ```
51
+
52
+ **If already on a feature branch** (not the default branch):
53
+ - Ask: "Continue working on `[current_branch]`, or create a new branch?"
54
+ - If continuing, proceed to step 3
55
+ - If creating new, follow Option A or B below
56
+
57
+ **If on the default branch**, choose how to proceed:
58
+
59
+ **Option A: Create a new branch**
60
+ ```bash
61
+ git pull origin [default_branch]
62
+ git checkout -b feature-branch-name
63
+ ```
64
+ Use a meaningful name based on the work (e.g., `feat/user-authentication`, `fix/email-validation`).
65
+
66
+ **Option B: Use a worktree (recommended for parallel development)**
67
+ ```bash
68
+ skill: git-worktree
69
+ # The skill will create a new branch from the default branch in an isolated worktree
70
+ ```
71
+
72
+ **Option C: Continue on the default branch**
73
+ - Requires explicit user confirmation
74
+ - Only proceed after user explicitly says "yes, commit to [default_branch]"
75
+ - Never commit directly to the default branch without explicit permission
76
+
77
+ **Recommendation**: Use worktree if:
78
+ - You want to work on multiple features simultaneously
79
+ - You want to keep the default branch clean while experimenting
80
+ - You plan to switch between branches frequently
81
+
82
+ 3. **Create Todo List**
83
+ - Use your available task tracking tool (e.g., todowrite, task lists) to break the plan into actionable tasks
84
+ - Derive tasks from the plan's implementation units, dependencies, files, test targets, and verification criteria
85
+ - Carry each unit's `Execution note` into the task when present
86
+ - For each unit, read the `Patterns to follow` field before implementing — these point to specific files or conventions to mirror
87
+ - Use each unit's `Verification` field as the primary "done" signal for that task
88
+ - Do not expect the plan to contain implementation code, micro-step TDD instructions, or exact shell commands
89
+ - Include dependencies between tasks
90
+ - Prioritize based on what needs to be done first
91
+ - Include testing and quality check tasks
92
+ - Keep tasks specific and completable
93
+
94
+ 4. **Choose Execution Strategy**
95
+
96
+ After creating the task list, decide how to execute based on the plan's size and dependency structure:
97
+
98
+ | Strategy | When to use |
99
+ |----------|-------------|
100
+ | **Inline** | 1-2 small tasks, or tasks needing user interaction mid-flight |
101
+ | **Serial subagents** | 3+ tasks with dependencies between them. Each subagent gets a fresh context window focused on one unit — prevents context degradation across many tasks |
102
+ | **Parallel subagents** | 3+ tasks where some units have no shared dependencies and touch non-overlapping files. Dispatch independent units simultaneously, run dependent units after their prerequisites complete |
103
+
104
+ **Subagent dispatch** uses your available subagent or task spawning mechanism. For each unit, give the subagent:
105
+ - The full plan file path (for overall context)
106
+ - The specific unit's Goal, Files, Approach, Execution note, Patterns, Test scenarios, and Verification
107
+ - Any resolved deferred questions relevant to that unit
108
+
109
+ After each subagent completes, update the plan checkboxes and task list before dispatching the next dependent unit.
110
+
111
+ For genuinely large plans needing persistent inter-agent communication (agents challenging each other's approaches, shared coordination across 10+ tasks), see Swarm Mode below which uses Agent Teams.
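
The strategy table in step 4 can be read as a small decision function. The sketch below is only an illustration under stated assumptions: the helper name `choose_strategy` and its three arguments are hypothetical, and the skill itself leaves this judgment to the agent.

```bash
# Hypothetical helper: map the table's criteria to a strategy name.
# Arguments (all illustrative):
#   $1  number of tasks
#   $2  "yes" if tasks need user interaction mid-flight
#   $3  "yes" if some units share no dependencies and touch non-overlapping files
choose_strategy() {
  local task_count="$1" needs_interaction="$2" has_independent_units="$3"

  if [ "$task_count" -le 2 ] || [ "$needs_interaction" = "yes" ]; then
    echo "inline"
  elif [ "$has_independent_units" = "yes" ]; then
    echo "parallel-subagents"
  else
    echo "serial-subagents"
  fi
}

choose_strategy 2 no no    # prints: inline
choose_strategy 5 no yes   # prints: parallel-subagents
```

Interaction needs are checked first because the table routes any task that needs mid-flight user input to inline execution regardless of plan size.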
112
+
113
+ ### Phase 2: Execute
114
+
115
+ 1. **Task Execution Loop**
116
+
117
+ For each task in priority order:
118
+
119
+ ```
120
+ while (tasks remain):
121
+ - Mark task as in-progress
122
+ - Read any referenced files from the plan
123
+ - Look for similar patterns in codebase
124
+ - Implement following existing conventions
125
+ - Write tests for new functionality
126
+ - Run System-Wide Test Check (see below)
127
+ - Run tests after changes
128
+ - Mark task as completed
129
+ - Evaluate for incremental commit (see below)
130
+ ```
131
+
132
+ When a unit carries an `Execution note`, honor it. For test-first units, write the failing test before implementation for that unit. For characterization-first units, capture existing behavior before changing it. For units without an `Execution note`, proceed pragmatically.
133
+
134
+ Guardrails for execution posture:
135
+ - Do not write the test and implementation in the same step when working test-first
136
+ - Do not skip verifying that a new test fails before implementing the fix or feature
137
+ - Do not over-implement beyond the current behavior slice when working test-first
138
+ - Skip test-first discipline for trivial renames, pure configuration, and pure styling work
139
+
140
+ **System-Wide Test Check** — Before marking a task done, pause and ask:
141
+
142
+ | Question | What to do |
143
+ |----------|------------|
144
+ | **What fires when this runs?** Callbacks, middleware, observers, event handlers — trace two levels out from your change. | Read the actual code (not docs) for callbacks on models you touch, middleware in the request chain, `after_*` hooks. |
145
+ | **Do my tests exercise the real chain?** If every dependency is mocked, the test proves your logic works *in isolation* — it says nothing about the interaction. | Write at least one integration test that uses real objects through the full callback/middleware chain. No mocks for the layers that interact. |
146
+ | **Can failure leave orphaned state?** If your code persists state (DB row, cache, file) before calling an external service, what happens when the service fails? Does retry create duplicates? | Trace the failure path with real objects. If state is created before the risky call, test that failure cleans up or that retry is idempotent. |
147
+ | **What other interfaces expose this?** Mixins, DSLs, alternative entry points (Agent vs Chat vs ChatMethods). | Grep for the method/behavior in related classes. If parity is needed, add it now — not as a follow-up. |
148
+ | **Do error strategies align across layers?** Retry middleware + application fallback + framework error handling — do they conflict or create double execution? | List the specific error classes at each layer. Verify your rescue list matches what the lower layer actually raises. |
149
+
150
+ **When to skip:** Leaf-node changes with no callbacks, no state persistence, no parallel interfaces. If the change is purely additive (new helper method, new view partial), the check takes 10 seconds and the answer is "nothing fires, skip."
151
+
152
+ **When this matters most:** Any change that touches models with callbacks, error handling with fallback/retry, or functionality exposed through multiple interfaces.
153
+
154
+
155
+ 2. **Incremental Commits**
156
+
157
+ After completing each task, evaluate whether to create an incremental commit:
158
+
159
+ | Commit when... | Don't commit when... |
160
+ |----------------|---------------------|
161
+ | Logical unit complete (model, service, component) | Small part of a larger unit |
162
+ | Tests pass + meaningful progress | Tests failing |
163
+ | About to switch contexts (backend → frontend) | Purely scaffolding with no behavior |
164
+ | About to attempt risky/uncertain changes | Would need a "WIP" commit message |
165
+
166
+ **Heuristic:** "Can I write a commit message that describes a complete, valuable change? If yes, commit. If the message would be 'WIP' or 'partial X', wait."
167
+
168
+ If the plan has Implementation Units, use them as a starting guide for commit boundaries — but adapt based on what you find during implementation. A unit might need multiple commits if it's larger than expected, or small related units might land together. Use each unit's Goal to inform the commit message.
169
+
170
+ **Commit workflow:**
171
+ ```bash
172
+ # 1. Verify tests pass (use project's test command)
173
+ # Examples: bin/rails test, npm test, pytest, go test, etc.
174
+
175
+ # 2. Stage only files related to this logical unit (not `git add .`)
176
+ git add <files related to this logical unit>
177
+
178
+ # 3. Commit with conventional message
179
+ git commit -m "feat(scope): description of this unit"
180
+ ```
181
+
182
+ **Handling merge conflicts:** If conflicts arise during rebasing or merging, resolve them immediately. Incremental commits make conflict resolution easier since each commit is small and focused.
183
+
184
+ **Note:** Incremental commits use clean conventional messages without attribution footers. The final Phase 4 commit/PR includes the full attribution.
185
+
186
+ 3. **Follow Existing Patterns**
187
+
188
+ - The plan should reference similar code - read those files first
189
+ - Match naming conventions exactly
190
+ - Reuse existing components where possible
191
+ - Follow project coding standards (see AGENTS.md; use CLAUDE.md only if the repo still keeps a compatibility shim)
192
+ - When in doubt, grep for similar implementations
193
+
194
+ 4. **Test Continuously**
195
+
196
+ - Run relevant tests after each significant change
197
+ - Don't wait until the end to test
198
+ - Fix failures immediately
199
+ - Add new tests for new functionality
200
+ - **Unit tests with mocks prove logic in isolation. Integration tests with real objects prove the layers work together.** If your change touches callbacks, middleware, or error handling — you need both.
201
+
202
+ 5. **Simplify as You Go**
203
+
204
+ After completing a cluster of related implementation units (or every 2-3 units), review recently changed files for simplification opportunities — consolidate duplicated patterns, extract shared helpers, and improve code reuse and efficiency. This is especially valuable when using subagents, since each agent works with isolated context and can't see patterns emerging across units.
205
+
206
+ Don't simplify after every single unit — early patterns may look duplicated but diverge intentionally in later units. Wait for a natural phase boundary or when you notice accumulated complexity.
207
+
208
+ If a `/simplify` skill or equivalent is available, use it. Otherwise, review the changed files yourself for reuse and consolidation opportunities.
209
+
210
+ 6. **Figma Design Sync** (if applicable)
211
+
212
+ For UI work with Figma designs:
213
+
214
+ - Implement components following design specs
215
+ - Use figma-design-sync agent iteratively to compare
216
+ - Fix visual differences identified
217
+ - Repeat until implementation matches design
218
+
219
+ 7. **Track Progress**
220
+ - Keep the task list updated as you complete tasks
221
+ - Note any blockers or unexpected discoveries
222
+ - Create new tasks if scope expands
223
+ - Keep user informed of major milestones
224
+
225
+ ### Phase 3: Quality Check
226
+
227
+ 1. **Run Core Quality Checks**
228
+
229
+ Always run before submitting:
230
+
231
+ ```bash
232
+ # Run full test suite (use project's test command)
233
+ # Examples: bin/rails test, npm test, pytest, go test, etc.
234
+
235
+ # Run linting (per AGENTS.md)
236
+ # Use linting-agent before pushing to origin
237
+ ```
238
+
239
+ 2. **Consider Reviewer Agents** (Optional)
240
+
241
+ Use for complex, risky, or large changes. Read agents from your local workflow settings frontmatter (`review_agents`). If no settings file exists, invoke the `setup` skill to create one.
242
+
243
+ Run configured agents in parallel with task tool. Present findings and address critical issues.
244
+
245
+ 3. **Final Validation**
246
+ - All tasks marked completed
247
+ - All tests pass
248
+ - Linting passes
249
+ - Code follows existing patterns
250
+ - Figma designs match (if applicable)
251
+ - No console errors or warnings
252
+ - If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
253
+ - If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
254
+
255
+ 4. **Prepare Operational Validation Plan** (REQUIRED)
256
+ - Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
257
+ - Include concrete:
258
+ - Log queries/search terms
259
+ - Metrics or dashboards to watch
260
+ - Expected healthy signals
261
+ - Failure signals and rollback/mitigation trigger
262
+ - Validation window and owner
263
+ - If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
264
+
265
+ ### Phase 4: Ship It
266
+
267
+ 1. **Create Commit**
268
+
269
+ ```bash
270
+ git add .
271
+ git status # Review what's being committed
272
+ git diff --staged # Check the changes
273
+
274
+ # Commit with conventional format
275
+ git commit -m "$(cat <<'EOF'
276
+ feat(scope): description of what and why
277
+
278
+ Brief explanation if needed.
279
+
280
+ 🤖 Generated with [MODEL] via [HARNESS](HARNESS_URL) + Systematic v[VERSION]
281
+
282
+ Co-Authored-By: [MODEL] ([CONTEXT] context, [THINKING]) <noreply@anthropic.com>
283
+ EOF
284
+ )"
285
+ ```
286
+
287
+ **Fill in at commit/PR time:**
288
+
289
+ | Placeholder | Value | Example |
290
+ |-------------|-------|---------|
293
+ | `[MODEL]` | Model name | Claude Opus 4.6, GPT-5.4 |
294
+ | `[CONTEXT]` | Context window (if known) | 200K, 1M |
295
+ | `[THINKING]` | Thinking level (if known) | extended thinking |
296
+ | `[HARNESS]` | Tool running you | OpenCode, Codex, Gemini CLI |
297
+ | `[HARNESS_URL]` | Link to that tool | `https://claude.com/claude-code` |
298
+ | `[VERSION]` | `plugin.json` → `version` | 2.40.0 |
299
+
300
+ Subagents creating commits/PRs are equally responsible for accurate attribution.
301
+
302
+ 2. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
303
+
304
+ For **any** design changes, new views, or UI modifications, you MUST capture and upload screenshots:
305
+
306
+ **Step 1: Start dev server** (if not running)
307
+ ```bash
308
+ bin/dev # Run in background
309
+ ```
310
+
311
+ **Step 2: Capture screenshots with agent-browser CLI**
312
+ ```bash
313
+ agent-browser open http://localhost:3000/[route]
314
+ agent-browser snapshot -i
315
+ agent-browser screenshot output.png
316
+ ```
317
+ See the `agent-browser` skill for detailed usage.
318
+
319
+ **Step 3: Upload using imgup skill**
320
+ ```bash
321
+ skill: imgup
322
+ # Then upload each screenshot:
323
+ imgup -h pixhost screenshot.png # pixhost works without API key
324
+ # Alternative hosts: catbox, imagebin, beeimg
325
+ ```
326
+
327
+ **What to capture:**
328
+ - **New screens**: Screenshot of the new UI
329
+ - **Modified screens**: Before AND after screenshots
330
+ - **Design implementation**: Screenshot showing Figma design match
331
+
332
+ **IMPORTANT**: Always include uploaded image URLs in PR description. This provides visual context for reviewers and documents the change.
333
+
334
+ 3. **Create Pull Request**
335
+
336
+ ```bash
337
+ git push -u origin feature-branch-name
338
+
339
+ gh pr create --title "Feature: [Description]" --body "$(cat <<'EOF'
340
+ ## Summary
341
+ - What was built
342
+ - Why it was needed
343
+ - Key decisions made
344
+
345
+ ## Testing
346
+ - Tests added/modified
347
+ - Manual testing performed
348
+
349
+ ## Post-Deploy Monitoring & Validation
350
+ - **What to monitor/search**
351
+ - Logs:
352
+ - Metrics/Dashboards:
353
+ - **Validation checks (queries/commands)**
354
+ - `command or query here`
355
+ - **Expected healthy behavior**
356
+ - Expected signal(s)
357
+ - **Failure signal(s) / rollback trigger**
358
+ - Trigger + immediate action
359
+ - **Validation window & owner**
360
+ - Window:
361
+ - Owner:
362
+ - **If no operational impact**
363
+ - `No additional operational monitoring required: <reason>`
364
+
365
+ ## Before / After Screenshots
366
+ | Before | After |
367
+ |--------|-------|
368
+ | ![before](URL) | ![after](URL) |
369
+
370
+ ## Figma Design
371
+ [Link if applicable]
372
+
373
+ ---
374
+
375
+ [![Systematic v[VERSION]](https://img.shields.io/badge/Systematic-v[VERSION]-6366f1)](https://github.com/marcusrbrown/systematic)
376
+ 🤖 Generated with [MODEL] ([CONTEXT] context, [THINKING]) via [HARNESS](HARNESS_URL)
377
+ EOF
378
+ )"
379
+ ```
380
+
381
+ 4. **Update Plan Status**
382
+
383
+ If the input document has YAML frontmatter with a `status` field, update it to `completed`:
384
+ ```
385
+ status: active → status: completed
386
+ ```
387
+
388
+ 5. **Notify User**
389
+ - Summarize what was completed
390
+ - Link to PR
391
+ - Note any follow-up work needed
392
+ - Suggest next steps if applicable
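
The "Update Plan Status" step above can be sketched as follows. This is a minimal illustration: the helper name `mark_plan_completed` is hypothetical. It rewrites a `status:` key only inside the leading frontmatter block and leaves files without frontmatter, or without a `status` field, unchanged.

```bash
# Hypothetical helper: set `status: completed` in a plan's YAML frontmatter.
# Lines outside the block delimited by the first two `---` lines are untouched.
mark_plan_completed() {
  local plan_file="$1" tmp_file
  tmp_file="$(mktemp)" || return 1

  awk '
    NR == 1 && $0 == "---" { in_frontmatter = 1; print; next }
    in_frontmatter && $0 == "---" { in_frontmatter = 0; print; next }
    in_frontmatter && /^status:/ { print "status: completed"; next }
    { print }
  ' "$plan_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }

  # Write back in place so the plan file keeps its original permissions.
  cat "$tmp_file" > "$plan_file"
  rm -f "$tmp_file"
}

# Usage (the path is illustrative):
# mark_plan_completed docs/plans/example-plan.md
```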
393
+
394
+ ---
395
+
396
+ ## Swarm Mode with Agent Teams (Optional)
397
+
398
+ For genuinely large plans where agents need to communicate with each other, challenge approaches, or coordinate across 10+ tasks with persistent specialized roles, use agent team capabilities if available (e.g., Agent Teams in OpenCode, multi-agent workflows in Codex).
399
+
400
+ **Agent teams are typically experimental and require opt-in.** Do not attempt to use agent teams unless the user explicitly requests swarm mode or agent teams, and the platform supports it.
401
+
402
+ ### When to Use Agent Teams vs Subagents
403
+
404
+ | Agent Teams | Subagents (standard mode) |
405
+ |-------------|---------------------------|
406
+ | Agents need to discuss and challenge each other's approaches | Each task is independent — only the result matters |
407
+ | Persistent specialized roles (e.g., dedicated tester running continuously) | Workers report back and finish |
408
+ | 10+ tasks with complex cross-cutting coordination | 3-8 tasks with clear dependency chains |
409
+ | User explicitly requests "swarm mode" or "agent teams" | Default for most plans |
410
+
411
+ Most plans should use subagent dispatch from standard mode. Agent teams add significant token cost and coordination overhead — use them when the inter-agent communication genuinely improves the outcome.
412
+
413
+ ### Agent Teams Workflow
414
+
415
+ 1. **Create team** — use your available team creation mechanism
416
+ 2. **Create task list** — parse Implementation Units into tasks with dependency relationships
417
+ 3. **Spawn teammates** — assign specialized roles (implementer, tester, reviewer) based on the plan's needs. Give each teammate the plan file path and their specific task assignments
418
+ 4. **Coordinate** — the lead monitors task completion, reassigns work if someone gets stuck, and spawns additional workers as phases unblock
419
+ 5. **Cleanup** — shut down all teammates, then clean up the team resources
420
+
421
+ ---
422
+
423
+ ## External Delegate Mode (Optional)
424
+
425
+ For plans where token conservation matters, delegate code implementation to an external delegate (currently Codex CLI) while keeping planning, review, and git operations in the current agent.
426
+
427
+ This mode integrates with the existing Phase 1 Step 4 strategy selection as a **task-level modifier** - the strategy (inline/serial/parallel) still applies, but the implementation step within each tagged task delegates to the external tool instead of executing directly.
428
+
429
+ ### When to Use External Delegation
430
+
431
+ | External Delegation | Standard Mode |
432
+ |---------------------|---------------|
433
+ | Task is pure code implementation | Task requires research or exploration |
434
+ | Plan has clear acceptance criteria | Task is ambiguous or needs iteration |
435
+ | Token conservation matters (e.g., Max20 plan) | Unlimited plan or small task |
436
+ | Files to change are well-scoped | Changes span many interconnected files |
437
+
438
+ ### Enabling External Delegation
439
+
440
+ External delegation activates when any of these conditions are met:
441
+ - The user says "use codex for this work", "delegate to codex", or "delegate mode"
442
+ - A plan implementation unit contains `Execution target: external-delegate` in its Execution note (set by ce:plan-beta or ce:plan)
443
+
444
+ The specific delegate tool is resolved at execution time. Currently the only supported delegate is Codex CLI. Future delegates can be added without changing plan files.
445
+
446
+ ### Environment Guard
447
+
448
+ Before attempting delegation, check whether the current agent is already running inside a delegate's sandbox. Delegation from within a sandbox will fail silently or recurse.
449
+
450
+ Check for known sandbox indicators:
451
+ - `CODEX_SANDBOX` environment variable is set
452
+ - `CODEX_SESSION_ID` environment variable is set
453
+ - The filesystem is read-only at `.git/` (Codex sandbox blocks git writes)
454
+
455
+ If any indicator is detected, print "Already running inside a delegate sandbox - using standard mode." and proceed with standard execution for that task.
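A rough sketch of the guard, assuming the three indicators listed above. The helper name `inside_delegate_sandbox` is hypothetical, and the writability probe via `os.access` is only an approximation of "read-only at `.git/`" (it can report writable for privileged users, for example).

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Environment variables named in the section as sandbox indicators.
SANDBOX_ENV_VARS = ("CODEX_SANDBOX", "CODEX_SESSION_ID")


def inside_delegate_sandbox(repo_root: Path, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if any known sandbox indicator is present (hypothetical helper)."""
    env = os.environ if env is None else env
    # "Is set" is treated as "the variable exists", even if its value is empty.
    if any(name in env for name in SANDBOX_ENV_VARS):
        return True
    git_dir = repo_root / ".git"
    # An existing .git directory that cannot be written to suggests a sandbox.
    return git_dir.is_dir() and not os.access(git_dir, os.W_OK)


if __name__ == "__main__":
    if inside_delegate_sandbox(Path.cwd()):
        print("Already running inside a delegate sandbox - using standard mode.")
```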
456
+
457
+ ### External Delegation Workflow
458
+
459
+ When external delegation is active, follow this workflow for each tagged task. Do not skip delegation because a task seems "small", "simple", or "faster inline". The user or plan explicitly requested delegation.
460
+
461
+ 1. **Check availability**
462
+
463
+ Verify the delegate CLI is installed. If not found, print "Delegate CLI not installed - continuing with standard mode." and proceed normally.
464
+
465
+ 2. **Build prompt** — For each task, assemble a prompt from the plan's implementation unit (Goal, Files, Approach, and project conventions). Include rules: no git commits, no PRs, run `git status` and `git diff --stat` when done. Never embed credentials or tokens in the prompt - pass auth through environment variables.
466
+
467
+ 3. **Write prompt to file** — Save the assembled prompt to a temporary file with a unique name per task. This avoids shell quoting issues and cross-task races.
468
+
469
+ 4. **Delegate** — Run the delegate CLI, piping the prompt file via stdin (not argv expansion, which can exceed `ARG_MAX` on large prompts). Omit the model flag to use the delegate's default model, which stays current without manual updates.
470
+
471
+ 5. **Review diff** — After the delegate finishes, verify the diff is non-empty and in-scope. Run the project's test/lint commands. If the diff is empty or out-of-scope, fall back to standard mode for that task.
472
+
473
+ 6. **Commit** — The current agent handles all git operations. The delegate's sandbox blocks `.git/index.lock` writes, so the delegate cannot commit. Stage changes and commit with a conventional message.
474
+
475
+ 7. **Error handling** — On any delegate failure (rate limit, error, empty diff), fall back to standard mode for that task. Track consecutive failures - after 3 consecutive failures, disable delegation for remaining tasks and print "Delegate disabled after 3 consecutive failures - completing remaining tasks in standard mode."
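The availability check, prompt file, stdin piping, and failure counter from the steps above might fit together roughly as follows. This is a sketch under stated assumptions: `DelegateState` and `run_delegate` are hypothetical names, the section does not give the delegate's command line, so `command` is a caller-supplied placeholder, and success is judged by exit code only (the diff review in step 5 would also feed `record`).

```python
import shutil
import subprocess
import tempfile
from dataclasses import dataclass
from typing import Sequence

MAX_CONSECUTIVE_FAILURES = 3  # threshold stated in the section


@dataclass
class DelegateState:
    """Tracks consecutive delegate failures across tasks (hypothetical helper)."""
    consecutive_failures: int = 0
    disabled: bool = False

    def record(self, succeeded: bool) -> None:
        if succeeded:
            self.consecutive_failures = 0  # any success resets the streak
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            self.disabled = True
            print("Delegate disabled after 3 consecutive failures - "
                  "completing remaining tasks in standard mode.")


def run_delegate(command: Sequence[str], prompt: str, state: DelegateState) -> bool:
    """Return True if the delegate exited cleanly; False means use standard mode."""
    if state.disabled:
        return False
    if shutil.which(command[0]) is None:
        print("Delegate CLI not installed - continuing with standard mode.")
        return False
    # A fresh temporary file per call gives each task its own prompt file.
    with tempfile.NamedTemporaryFile("w+", suffix=".prompt.md", encoding="utf-8") as prompt_file:
        prompt_file.write(prompt)
        prompt_file.flush()
        prompt_file.seek(0)
        # The prompt reaches the delegate on stdin, so it never passes through argv.
        result = subprocess.run(list(command), stdin=prompt_file,
                                capture_output=True, text=True)
    succeeded = result.returncode == 0
    state.record(succeeded)
    return succeeded
```

In this sketch a missing CLI is not counted as a delegate failure, since the text treats it as a simple fallback rather than an error.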
476
+
477
+ ### Mixed-Model Attribution
478
+
479
+ When some tasks are executed by the delegate and others by the current agent, use the following attribution in Phase 4:
480
+
481
+ - If all tasks used the delegate: attribute to the delegate model
482
+ - If all tasks used standard mode: attribute to the current agent's model
483
+ - If mixed: use `Generated with [CURRENT_MODEL] + [DELEGATE_MODEL] via [HARNESS]` and note which tasks were delegated in the PR description
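The three attribution cases can be expressed as a small function. This is an illustrative sketch: `attribution_models` is a hypothetical name, and only the mixed-case format string comes from the text above; the wording used for the single-model cases in Phase 4 is not shown here, so the function returns just the model portion.

```python
from typing import Sequence


def attribution_models(delegated: Sequence[bool], current_model: str, delegate_model: str) -> str:
    """Return the model portion of the attribution line.

    `delegated` holds one flag per task: True if the delegate implemented it.
    """
    if not any(delegated):
        return current_model  # all standard mode (or no tasks at all)
    if all(delegated):
        return delegate_model  # every task was delegated
    return f"{current_model} + {delegate_model}"  # mixed


def mixed_attribution_line(current_model: str, delegate_model: str, harness: str) -> str:
    """Format the mixed-case line using the template given in the section."""
    return f"Generated with {current_model} + {delegate_model} via {harness}"
```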
484
+
485
+ ---
486
+
487
+ ## Key Principles
488
+
489
+ ### Start Fast, Execute Faster
490
+
491
+ - Get clarification once at the start, then execute
492
+ - Don't wait for perfect understanding - ask questions and move
493
+ - The goal is to **finish the feature**, not to create a perfect process
494
+
495
+ ### The Plan is Your Guide
496
+
497
+ - Work documents should reference similar code and patterns
498
+ - Load those references and follow them
499
+ - Don't reinvent - match what exists
500
+
501
+ ### Test As You Go
502
+
503
+ - Run tests after each change, not at the end
504
+ - Fix failures immediately
505
+ - Continuous testing prevents big surprises
506
+
507
+ ### Quality is Built In
508
+
509
+ - Follow existing patterns
510
+ - Write tests for new code
511
+ - Run linting before pushing
512
+ - Use reviewer agents for complex/risky changes only
513
+
514
+ ### Ship Complete Features
515
+
516
+ - Mark all tasks completed before moving on
517
+ - Don't leave features 80% done
518
+ - A finished feature that ships beats a perfect feature that doesn't
519
+
520
+ ## Quality Checklist
521
+
522
+ Before creating the PR, verify:
523
+
524
+ - [ ] All clarifying questions asked and answered
525
+ - [ ] All tasks marked completed
526
+ - [ ] Tests pass (run project's test command)
527
+ - [ ] Linting passes (use linting-agent)
528
+ - [ ] Code follows existing patterns
529
+ - [ ] Figma designs match implementation (if applicable)
530
+ - [ ] Before/after screenshots captured and uploaded (for UI changes)
531
+ - [ ] Commit messages follow conventional format
532
+ - [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
533
+ - [ ] PR description includes summary, testing notes, and screenshots
534
+ - [ ] PR description includes Compound Engineered badge with accurate model, harness, and version
535
+
536
+ ## When to Use Reviewer Agents
537
+
538
+ **Don't use by default.** Use reviewer agents only when:
539
+
540
+ - Large refactor affecting many files (10+)
541
+ - Security-sensitive changes (authentication, permissions, data access)
542
+ - Performance-critical code paths
543
+ - Complex algorithms or business logic
544
+ - User explicitly requests thorough review
545
+
546
+ For most features: tests + linting + following patterns is sufficient.
547
+
548
+ ## Common Pitfalls to Avoid
549
+
550
+ - **Analysis paralysis** - Don't overthink; read the plan and execute
551
+ - **Skipping clarifying questions** - Ask now, not after building the wrong thing
552
+ - **Ignoring plan references** - The plan has links for a reason
553
+ - **Testing at the end** - Test continuously or suffer later
554
+ - **Forgetting to track progress** - Update task status as you go or lose track of what's done
555
+ - **80% done syndrome** - Finish the feature; don't move on early
556
+ - **Over-reviewing simple changes** - Save reviewer agents for complex work