forge-orkes 0.1.0

Files changed (38)
  1. package/bin/create-forge.js +103 -0
  2. package/package.json +28 -0
  3. package/template/.claude/agents/executor.md +177 -0
  4. package/template/.claude/agents/planner.md +148 -0
  5. package/template/.claude/agents/researcher.md +111 -0
  6. package/template/.claude/agents/reviewer.md +211 -0
  7. package/template/.claude/agents/verifier.md +210 -0
  8. package/template/.claude/settings.json +40 -0
  9. package/template/.claude/skills/architecting/SKILL.md +121 -0
  10. package/template/.claude/skills/auditing/SKILL.md +302 -0
  11. package/template/.claude/skills/beads-integration/SKILL.md +125 -0
  12. package/template/.claude/skills/debugging/SKILL.md +130 -0
  13. package/template/.claude/skills/designing/SKILL.md +134 -0
  14. package/template/.claude/skills/discussing/SKILL.md +229 -0
  15. package/template/.claude/skills/executing/SKILL.md +154 -0
  16. package/template/.claude/skills/forge/SKILL.md +524 -0
  17. package/template/.claude/skills/planning/SKILL.md +225 -0
  18. package/template/.claude/skills/quick-tasking/SKILL.md +74 -0
  19. package/template/.claude/skills/refactoring/SKILL.md +168 -0
  20. package/template/.claude/skills/researching/SKILL.md +117 -0
  21. package/template/.claude/skills/securing/SKILL.md +104 -0
  22. package/template/.claude/skills/verifying/SKILL.md +201 -0
  23. package/template/.forge/templates/constitution.md +123 -0
  24. package/template/.forge/templates/context.md +53 -0
  25. package/template/.forge/templates/design-systems/material-ui.md +44 -0
  26. package/template/.forge/templates/design-systems/primereact.md +46 -0
  27. package/template/.forge/templates/design-systems/shadcn-ui.md +47 -0
  28. package/template/.forge/templates/framework-absorption/generic.md +52 -0
  29. package/template/.forge/templates/framework-absorption/gsd.md +174 -0
  30. package/template/.forge/templates/framework-absorption/spec-kit.md +52 -0
  31. package/template/.forge/templates/plan.md +84 -0
  32. package/template/.forge/templates/project.yml +40 -0
  33. package/template/.forge/templates/refactor-backlog.yml +16 -0
  34. package/template/.forge/templates/requirements.yml +49 -0
  35. package/template/.forge/templates/roadmap.yml +44 -0
  36. package/template/.forge/templates/state/index.yml +51 -0
  37. package/template/.forge/templates/state/milestone.yml +42 -0
  38. package/template/CLAUDE.md +150 -0
@@ -0,0 +1,229 @@
1
+ ---
2
+ name: discussing
3
+ description: "Use when you need to talk through approach, trade-offs, or concerns before committing to action. Trigger between researching and planning (pre-plan discussion), or invoke on any existing phase/plan to revisit decisions. This skill facilitates conversation — it doesn't write plans or code."
4
+ ---
5
+
6
+ # Discussing
7
+
8
+ Facilitate a structured conversation about approach, trade-offs, and decisions. This skill produces clarity, not artifacts.
9
+
10
+ ## When to Invoke
11
+
12
+ ### Pre-Planning (Default Position in Workflow)
13
+ After `researching`, before `planning`. The agent has findings — now help the user decide what to do with them before locking anything into a plan.
14
+
15
+ ### Post-Planning (On-Demand)
16
+ User says "discuss Phase 2" or "let's talk through plan 03" or "I want to rethink the auth approach." The plan exists but the user wants to revisit it before (or instead of) executing.
17
+
18
+ ### Mid-Workflow
19
+ User feels uncertain, wants to step back, or says things like "wait," "I'm not sure about this," "what are the alternatives?" Route here from any skill — discussion is always available.
20
+
21
+ ## What This Skill Does NOT Do
22
+
23
+ - **Does not write plans.** That's the `planning` skill.
24
+ - **Does not write code.** That's the `executing` skill.
25
+ - **Does not modify `.forge/` files.** Discussion produces decisions — the next skill writes them down.
26
+ - **Does not require a phase or plan to exist.** You can discuss a vague idea.
27
+
28
+ The only output is the conversation itself and, at the end, a summary of decisions made that the next skill should honor.
29
+
30
+ ## Pre-Planning Discussion
31
+
32
+ When entering from `researching` with findings in hand:
33
+
34
+ ### Step 1: Present the Landscape
35
+
36
+ Summarize what research found, structured around decisions the user needs to make — not a data dump.
37
+
38
+ *"Based on what I found, here are the key decisions before we plan:"*
39
+
40
+ For each decision point:
41
+ - **What needs deciding** — the question, stated plainly
42
+ - **Options** — 2-3 realistic approaches (not exhaustive lists)
43
+ - **Trade-offs** — what you gain and what you lose with each
44
+ - **Recommendation** — if you have one, say so and say why. If you don't, say that too.
45
+
46
+ Keep it conversational. Don't present a 20-item matrix. Surface the 3-5 decisions that actually matter for this work.
47
+
48
+ ### Step 2: Facilitate, Don't Dictate
49
+
50
+ Your role is to help the user think, not to push them toward your preference.
51
+
52
+ Good facilitation patterns:
53
+ - *"The main tension here is between X and Y. Which matters more for your project?"*
54
+ - *"Option A is simpler now but harder to change later. Option B is more work upfront but more flexible. What's your timeline pressure like?"*
55
+ - *"I'd lean toward X because [reason], but Y makes sense if [condition]. What's your read?"*
56
+ - *"You mentioned [earlier decision] — that makes Option B a more natural fit. Does that match your thinking?"*
57
+
58
+ Bad facilitation patterns:
59
+ - Presenting options without trade-offs (just a list)
60
+ - Asking "what do you think?" without giving the user something to react to
61
+ - Overwhelming with edge cases before the main path is clear
62
+ - Treating every decision as equally important
63
+
64
+ ### Step 3: Probe for Hidden Constraints
65
+
66
+ Research often misses things the user knows but hasn't mentioned. Ask about:
67
+ - **Timeline pressure** — does this need to ship by a date?
68
+ - **Audience/users** — who actually uses this? (affects complexity trade-offs)
69
+ - **Future direction** — is this a throwaway or the foundation for more?
70
+ - **Past experience** — have they tried something similar before? What went wrong?
71
+ - **Strong preferences** — anything they definitely want or definitely don't want?
72
+
73
+ One or two questions at a time. Don't interrogate.
74
+
75
+ ### Step 4: Functionality Distillation
76
+
77
+ This is the heart of the discussing skill. Walk through each major feature or requirement and ask targeted questions that force clarity about *how the system should actually behave*. The goal is to surface the implicit assumptions and edge cases that the user hasn't articulated yet — but will need answers to during implementation.
78
+
79
+ **For each feature/requirement, work through these question layers:**
80
+
81
+ **Layer 1: The Happy Path**
82
+ What does success look like from the user's perspective? Walk through the ideal scenario step by step.
83
+ - *"When a user [triggers this feature], what should they see/experience first?"*
84
+ - *"Walk me through what happens next — what's the sequence?"*
85
+ - *"When it's done, what confirms to the user that it worked?"*
86
+
87
+ **Layer 2: Boundaries and Rules**
88
+ What are the constraints? What's allowed and what isn't?
89
+ - *"Who can do this? Everyone, or only certain roles/states?"*
90
+ - *"Is there a limit — how many, how often, how large?"*
91
+ - *"What triggers this — user action, scheduled event, or reaction to something else?"*
92
+ - *"Does this depend on anything else being true first?"*
93
+
94
+ **Layer 3: When Things Go Wrong**
95
+ What happens when the expected path breaks?
96
+ - *"What if [the input is invalid / the service is down / the user cancels halfway]?"*
97
+ - *"Should the system retry, fail gracefully, or alert someone?"*
98
+ - *"If two users do this simultaneously, what should happen?"*
99
+ - *"What's the worst thing that could happen if this feature has a bug?"*
100
+
101
+ **Layer 4: Interactions and Side Effects**
102
+ How does this feature affect the rest of the system?
103
+ - *"When this happens, does anything else need to update?"*
104
+ - *"Should other users be notified? How?"*
105
+ - *"Does this affect [related feature the research uncovered]? How?"*
106
+ - *"Can this be undone? Should it be?"*
107
+
108
+ **Layer 5: Evolution**
109
+ How might this change over time?
110
+ - *"Is this a v1 that you expect to expand later, or is this the final shape?"*
111
+ - *"If you had to cut scope, which parts of this are essential vs. nice-to-have?"*
112
+ - *"What's the most likely thing you'd want to change about this in 3 months?"*
113
+
114
+ **How to use the layers:**
115
+
116
+ Don't mechanically walk through all 5 layers for every requirement — that would take hours. Instead:
117
+
118
+ - **Always do Layer 1** — if the happy path isn't clear, nothing else matters.
119
+ - **Do Layer 2 for anything with rules** — permissions, limits, validations, workflows.
120
+ - **Do Layer 3 for anything critical** — payments, data mutations, auth, external integrations.
121
+ - **Do Layer 4 when features interact** — if research found shared state or dependencies between features.
122
+ - **Do Layer 5 when scope feels uncertain** — if the user seems unsure how much to build now.
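
The layer-selection rules above can be read as a small lookup. The following is only an illustrative sketch: `FeatureTraits` and `layers_to_cover` are hypothetical names that are not part of the skill, and the boolean flags are an assumed simplification of judgments the facilitator makes in conversation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureTraits:
    """Hypothetical flags describing one feature or requirement."""
    has_rules: bool = False        # permissions, limits, validations, workflows
    is_critical: bool = False      # payments, data mutations, auth, external integrations
    interacts: bool = False        # research found shared state or dependencies
    scope_uncertain: bool = False  # user seems unsure how much to build now


def layers_to_cover(traits: FeatureTraits) -> List[int]:
    """Return the question layers worth walking through for one feature."""
    layers = [1]  # Layer 1 (happy path) is always covered
    if traits.has_rules:
        layers.append(2)
    if traits.is_critical:
        layers.append(3)
    if traits.interacts:
        layers.append(4)
    if traits.scope_uncertain:
        layers.append(5)
    return layers


# Example: a payment feature whose scope is still fuzzy
print(layers_to_cover(FeatureTraits(is_critical=True, scope_uncertain=True)))
```

In practice these flags are a matter of judgment, so the sketch is best read as a summary of the rules rather than something to automate.
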
123
+
124
+ Ask 2-3 questions at a time, let the user respond, then go deeper where their answers reveal uncertainty. The conversation should feel like a collaborative design session, not an interrogation.
125
+
126
+ **What you're listening for:**
127
+
128
+ - **Contradictions** — "It should be simple" but also "it needs to handle 12 different states." Surface these gently.
129
+ - **Vague handwaving** — "It should just work normally." Push for specifics: *"What does 'normally' mean here? Can you describe one concrete example?"*
130
+ - **Assumed knowledge** — "Like how Stripe does it." Confirm you share the same mental model: *"I want to make sure I understand — you mean [specific behavior], right?"*
131
+ - **Energy shifts** — When the user gets excited about a detail, that's signal. When they get bored or dismissive, that's also signal. The things they care about should get more attention in the plan.
132
+
133
+ ### Step 5: Converge on Decisions
134
+
135
+ When the conversation has covered the key points, summarize what's been decided:
136
+
137
+ *"Here's where I think we've landed:*
138
+ - *[Decision 1]: [what was decided and why]*
139
+ - *[Decision 2]: [what was decided and why]*
140
+ - *[Open question]: [what's still unresolved and how to handle it]*
141
+
142
+ *Does this match your understanding? If so, I'll carry these into planning."*
143
+
144
+ These decisions flow into `context.md` as **Locked Decisions** when the `planning` skill runs next.
145
+
146
+ ## Post-Planning Discussion
147
+
148
+ When invoked on an existing phase or plan:
149
+
150
+ ### Step 1: Load Context
151
+
152
+ Read the relevant files:
153
+ - The plan being discussed (`.forge/phases/{N}-{name}/plan-{NN}.md`)
154
+ - Requirements it's based on (`.forge/requirements.yml`)
155
+ - Locked decisions (`.forge/context.md`)
156
+ - Constitution (`.forge/constitution.md`)
157
+
158
+ ### Step 2: Present the Plan in Plain Language
159
+
160
+ Don't just recite the plan back. Translate it into what it means:
161
+
162
+ *"Phase 2 has 3 plans with 8 tasks total. Here's what it actually does:*
163
+
164
+ *Plan 01 builds [feature A] — it creates [these things] and wires them to [these things]. The main risk is [risk]. Estimated at [time].*
165
+
166
+ *Plan 02 builds [feature B] — [same pattern].*
167
+
168
+ *The key assumption is [assumption]. If that's wrong, plans 02 and 03 would need rework."*
169
+
170
+ ### Step 3: Surface What's Worth Discussing
171
+
172
+ Don't wait for the user to spot issues. Proactively surface:
173
+
174
+ - **Assumptions you're not confident about** — "Plan 01 assumes the API returns paginated results. I didn't verify this."
175
+ - **Decisions that could go either way** — "I split this into 3 plans for parallelism, but you could also do it as 2 larger plans if you prefer fewer context switches."
176
+ - **Risks the plan doesn't address** — "There's no fallback if the external API is slow. Worth adding, or accept the risk?"
177
+ - **Scope questions** — "Plan 03 includes admin-only features. Ship those in v1, or defer?"
178
+
179
+ ### Step 4: Drill into Functionality
180
+
181
+ Use the same **Functionality Distillation** question layers from Pre-Planning Step 4, but now grounded in the concrete plan. Instead of asking about abstract requirements, ask about the specific tasks and how they'll behave:
182
+
183
+ - *"Task 2 creates the notification service. When a notification fails to send, should the system retry or just log it?"*
184
+ - *"Plan 01 wires the dashboard to the API. Should the dashboard poll for updates, or should changes push in real-time?"*
185
+ - *"The plan has user roles as a task, but the requirements don't specify what each role can do. Can we walk through the permissions?"*
186
+
187
+ This is where post-planning discussion earns its keep — the plan makes the feature concrete enough to ask questions that were impossible during pre-planning.
188
+
189
+ ### Step 5: Discuss and Revise Direction
190
+
191
+ The user may want to:
192
+ - **Change approach** — "Let's use WebSockets instead of polling." → Note this. Planning skill will rebuild the affected plans.
193
+ - **Adjust scope** — "Defer the admin features." → Note this for deferred items.
194
+ - **Reorder priorities** — "Do the dashboard before the settings page." → Note the new wave order.
195
+ - **Ask questions** — "What happens if we skip the caching layer?" → Discuss implications honestly.
196
+ - **Approve as-is** — "Looks good, proceed." → Move to executing.
197
+
198
+ ### Step 6: Summarize Changes
199
+
200
+ If the discussion produced changes to the plan direction:
201
+
202
+ *"Based on our discussion:*
203
+ - *[Change 1]: [what changed and why]*
204
+ - *[Change 2]: [what changed and why]*
205
+ - *[Unchanged]: [what stays the same]*
206
+
207
+ *Next step: I'll update the plans to reflect this. Want me to proceed with re-planning, or is there more to discuss?"*
208
+
209
+ If re-planning is needed, route back to the `planning` skill with the discussion summary as input. The planning skill will update plans, requirements, and context.md accordingly.
210
+
211
+ ## Facilitation Principles
212
+
213
+ 1. **Lead with the interesting question, not the obvious one.** "Should we use React?" is boring if the project already uses React. "Should the dashboard update in real-time or on refresh?" is the real decision.
214
+
215
+ 2. **Name the trade-off, don't hide it.** Every decision has a cost. Saying "Option A is better" without saying "but it takes 2x longer" is dishonest facilitation.
216
+
217
+ 3. **Reference what the user has already told you.** Good facilitators remember. If the user said "I want this done by Friday" three messages ago, factor that into every recommendation.
218
+
219
+ 4. **Know when to stop discussing.** If the user is clear and confident, don't manufacture uncertainty. Some discussions take 2 minutes. That's fine.
220
+
221
+ 5. **Separate "must decide now" from "can decide later."** Not every question needs an answer before planning starts. Flag what can be deferred and what can't.
222
+
223
+ ## Anti-Patterns
224
+
225
+ - **Analysis paralysis** — discussing for 30 minutes what could be decided in 5. If the user is going in circles, name it: "I think we have enough to move forward. We can revisit after seeing the first implementation."
226
+ - **False facilitation** — asking questions you already know the answer to. If the research clearly points one way, say so.
227
+ - **Premature convergence** — locking decisions before the user has had a chance to think. Don't rush the summary.
228
+ - **Scope creep via discussion** — "While we're at it, should we also..." Keep discussion focused on the work at hand.
229
+ - **Discussion as procrastination** — if the user keeps wanting to discuss but never approves a plan, gently surface the pattern.
@@ -0,0 +1,154 @@
1
+ ---
2
+ name: executing
3
+ description: "Use when building: writing code, creating files, running tests. Trigger when you have an approved plan and need to implement it. This skill enforces atomic commits, deviation rules, context engineering, and TDD when applicable."
4
+ ---
5
+
6
+ # Executing
7
+
8
+ Build to plan with atomic commits, smart deviation handling, and context engineering.
9
+
10
+ ## Pre-Execution Checklist
11
+
12
+ Before writing any code:
13
+ - [ ] Plan exists (`.forge/phases/{N}-{name}/plan-{NN}.md`)
14
+ - [ ] `context.md` reviewed (locked decisions noted)
15
+ - [ ] `constitution.md` gates satisfied (from planning phase)
16
+ - [ ] Current milestone state (`.forge/state/milestone-{id}.yml`) updated to `status: executing`
17
+
18
+ ## Deviation Rules
19
+
20
+ When you encounter issues while building, follow this decision tree:
21
+
22
+ ### Rule 4 Check First (Architectural)
23
+ Does the issue require a new database table, service layer, major schema change, library swap, or infrastructure change?
24
+ - **YES** → **STOP.** Create checkpoint. Document: what you found, proposed change, why needed, impact, alternatives. Wait for user decision.
25
+ - **NO** → Continue to Rules 1-3.
26
+
27
+ ### Rule 1: Auto-Fix Bugs
28
+ Issue is a simple bug blocking the current task (broken behavior, wrong output, error).
29
+ → Fix inline. Add test if applicable. Document in summary: "Rule 1: Fixed [bug] because [reason]."
30
+
31
+ ### Rule 2: Auto-Add Critical Functionality
32
+ Missing error handling, input validation, null checks, auth checks, CSRF protection, rate limiting, logging.
33
+ → Add it. Document in summary: "Rule 2: Added [functionality] because [reason]."
34
+
35
+ ### Rule 3: Auto-Fix Blocking Infrastructure
36
+ Missing dependency, wrong types, broken imports, env var misconfiguration, build config issue.
37
+ → Fix it. Document in summary: "Rule 3: Fixed [issue] because [reason]."
38
+
39
+ ### 3-Strike Limit
40
+ After 3 auto-fix attempts on a single task → STOP fixing. Document remaining issues in summary. Move to next task.
41
+
42
+ ### Scope Boundary
43
+ Only fix issues DIRECTLY caused by the current task. Pre-existing warnings, tech debt, unrelated bugs → log to `.forge/deferred-issues.md`, don't fix.
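
The decision tree above can be sketched as a single function. This is a minimal illustration, not part of the skill: the issue-kind strings, the `Action` enum, and `decide` are hypothetical names. The text only fixes that the Rule 4 check comes first, so the relative order of the scope-boundary and 3-strike checks here is an assumption.

```python
from enum import Enum


class Action(Enum):
    STOP_FOR_USER = "create checkpoint and wait for user decision"
    AUTO_FIX = "fix and document in summary"
    DEFER = "log to .forge/deferred-issues.md, do not fix"
    STOP_FIXING = "document remaining issues, move to next task"


# Hypothetical labels for the issue kinds named in the rules.
ARCHITECTURAL = {"new_table", "service_layer", "schema_change",
                 "library_swap", "infrastructure_change"}
RULE_BY_KIND = {"bug": 1, "missing_critical": 2, "blocking_infra": 3}


def decide(kind, caused_by_current_task, fix_attempts):
    """Return (action, rule_number_or_None) for one issue."""
    if kind in ARCHITECTURAL:          # Rule 4 is checked first
        return Action.STOP_FOR_USER, 4
    if not caused_by_current_task:     # scope boundary (assumed order)
        return Action.DEFER, None
    if fix_attempts >= 3:              # 3-strike limit (assumed order)
        return Action.STOP_FIXING, None
    rule = RULE_BY_KIND.get(kind)
    if rule is None:
        raise ValueError(f"unknown issue kind: {kind}")
    return Action.AUTO_FIX, rule


print(decide("bug", caused_by_current_task=True, fix_attempts=0))
```
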
44
+
45
+ ## Task Execution Flow
46
+
47
+ For each task in the plan:
48
+
49
+ 1. **Read** the task XML (name, files, action, verify, done)
50
+ 2. **Check** context.md — does this task touch a locked decision? Honor it exactly.
51
+ 3. **Implement** following the action instructions
52
+ 4. **Verify** using the verify step (run tests, inspect output)
53
+ 5. **Confirm** done criteria are met
54
+ 6. **Commit** atomically
55
+
56
+ ## TDD Flow (When task type="tdd")
57
+
58
+ ### With Test Spec (from planning Step 7)
59
+ When the task has a `<spec>` field, test specs already exist:
60
+ 1. **Copy spec to test location:** Copy it from `.forge/phases/` into the project's test directory
61
+ 2. **RED:** Remove `skip` markers from the first test. Confirm it fails. Commit: `test({scope}): activate spec tests for {feature}`
62
+ 3. **GREEN:** Write minimal code to make it pass. Repeat for each test in the spec.
63
+ 4. **REFACTOR:** Clean up. Commit: `feat({scope}): implement {feature}`
64
+
65
+ ### Without Test Spec (standard TDD)
66
+ 1. **RED:** Write failing tests first. Commit: `test({scope}): add failing tests for {feature}`
67
+ 2. **GREEN:** Write minimal code to pass tests. Commit: `feat({scope}): implement {feature}`
68
+ 3. **REFACTOR:** Clean up if needed. Commit: `refactor({scope}): clean up {feature}`
69
+
70
+ ## Atomic Commit Protocol
71
+
72
+ After each task completes:
73
+
74
+ 1. Stage only task-related files individually: `git add src/specific/file.ts`
75
+ 2. **NEVER** use `git add .` or `git add -A`
76
+ 3. Commit with format: `{type}({phase}-{plan}): {description}`
77
+ 4. Include bullet-point body listing key changes
78
+ 5. Verify commit was created: `git log -1`
79
+
80
+ ```
81
+ feat(auth-01): implement JWT-based login
82
+
83
+ - Add LoginForm component using PrimeReact Dialog and InputText
84
+ - Create useAuth hook with token refresh logic
85
+ - Add /api/auth/login endpoint with bcrypt verification
86
+ - Include integration test for login flow
87
+ ```
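
As a rough sketch of the protocol above, the helpers below build a commit message in the `{type}({phase}-{plan}): {description}` format and produce one `git add` command per file. The function names are hypothetical, and rejecting `.` and `-A` as arguments is just one assumed way to enforce the "never stage everything" rule.

```python
def format_commit(commit_type, phase, plan, description, changes):
    """Build a commit message: header line, blank line, bullet-point body."""
    header = f"{commit_type}({phase}-{plan}): {description}"
    body = "\n".join(f"- {change}" for change in changes)
    return f"{header}\n\n{body}" if body else header


def staging_commands(files):
    """Return one `git add` argument list per file; refuse bulk staging."""
    commands = []
    for path in files:
        if path in {".", "-A", "--all"}:
            raise ValueError("stage task-related files individually")
        commands.append(["git", "add", "--", path])
    return commands


message = format_commit(
    "feat", "auth", "01", "implement JWT-based login",
    ["Add LoginForm component using PrimeReact Dialog and InputText",
     "Create useAuth hook with token refresh logic"],
)
print(message)
print(staging_commands(["src/hooks/useAuth.ts"]))
```

The argument lists could be passed to `subprocess.run` in a real script; they are returned here rather than executed so the sketch has no side effects.
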
88
+
89
+ ## Context Engineering
90
+
91
+ ### When to Spawn Fresh Agent
92
+ - Task touches 20+ files
93
+ - Task involves complex subsystem with deep dependency tree
94
+ - Current context exceeds ~60% utilization
95
+ - Switching between unrelated subsystems
96
+
97
+ ### How to Spawn
98
+ Use the Task tool with a focused prompt:
99
+ - Include only relevant plan details
100
+ - Reference specific files to modify
101
+ - Pass locked decisions from context.md
102
+ - Set clear success criteria
103
+
104
+ ### Size Monitoring
105
+ After each task, mentally assess context usage:
106
+ - Under 40%: continue normally
107
+ - 40-60%: consider fresh agent for next task
108
+ - Over 60%: spawn fresh agent
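
The thresholds above amount to a three-way branch. A minimal sketch follows, assuming utilization is expressed as a fraction between 0 and 1 and that exactly 60% still falls in the "consider" band. `context_recommendation` is a hypothetical name, and the skill itself describes this as a mental assessment rather than a measured value.

```python
def context_recommendation(utilization):
    """Map estimated context utilization (0.0 to 1.0) to the suggested action."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization > 0.60:
        return "spawn fresh agent"
    if utilization >= 0.40:
        return "consider fresh agent for next task"
    return "continue normally"


for estimate in (0.25, 0.50, 0.75):
    print(estimate, "->", context_recommendation(estimate))
```
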
109
+
110
+ ## Execution Summary
111
+
112
+ After completing all tasks in a plan, create a summary:
113
+
114
+ ```markdown
115
+ # Execution Summary: Phase {N}, Plan {NN}
116
+
117
+ ## Completed Tasks
118
+ 1. [Task name] — [one-line result]
119
+ 2. [Task name] — [one-line result]
120
+
121
+ ## Deviations
122
+ - Rule 1: Fixed [bug] in [file] because [reason]
123
+ - Rule 2: Added [functionality] to [file] because [reason]
124
+
125
+ ## Commits
126
+ - abc1234: feat(auth-01): implement login form
127
+ - def5678: test(auth-01): add login integration tests
128
+
129
+ ## Files Modified
130
+ - src/components/Login.tsx (created)
131
+ - src/hooks/useAuth.ts (created)
132
+ - src/api/auth/login.ts (created)
133
+
134
+ ## Notes
135
+ [Anything the verifier should know]
136
+ ```
137
+
138
+ ## State Updates
139
+
140
+ After plan execution:
141
+ 1. Update `.forge/state/milestone-{id}.yml` — advance plan counter, update progress
142
+ 2. Record deviations in the milestone state file
143
+ 3. Update `.forge/state/index.yml` — set milestone `last_updated` timestamp
144
+ 4. If all plans in phase complete → transition to `verifying`
145
+
146
+ ## Desire Path Signals
147
+
148
+ While executing, watch for these patterns and log them under `desire_paths` in `.forge/state/index.yml` (desire paths are global, not per-milestone):
149
+
150
+ - **Repeated deviations**: If you apply the same deviation rule for the same reason more than once (e.g., adding null checks in every API handler), log it as a `deviation_pattern`. Don't just record each deviation individually — notice when they form a pattern.
151
+ - **User corrections**: If the user corrects your output and the correction matches something they've corrected before (e.g., "use the Card component, not a div"), log it as a `user_correction` and increment the count.
152
+ - **Agent struggles**: If you need multiple attempts to get something right, or the user has to guide you through it, log the task type as an `agent_struggle`.
153
+
154
+ This takes seconds per signal. Don't skip it — this data drives framework evolution.
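
The signal bookkeeping described above could look roughly like this in memory. It is a sketch under stated assumptions: the actual layout of `desire_paths` in `index.yml` is not shown in this file, so the `Counter` keyed by `(signal_type, description)` and the helper names are hypothetical.

```python
from collections import Counter

# The three signal types named in the skill text.
SIGNAL_TYPES = {"deviation_pattern", "user_correction", "agent_struggle"}


def record_signal(desire_paths, signal_type, description):
    """Increment and return the count for one signal (hypothetical structure)."""
    if signal_type not in SIGNAL_TYPES:
        raise ValueError(f"unknown signal type: {signal_type}")
    desire_paths[(signal_type, description)] += 1
    return desire_paths[(signal_type, description)]


def deviation_patterns(deviations):
    """Return (rule, reason) pairs applied more than once, in first-seen order."""
    counts = Counter(deviations)
    return [key for key, n in counts.items() if n > 1]


paths = Counter()
record_signal(paths, "user_correction", "use the Card component, not a div")
record_signal(paths, "user_correction", "use the Card component, not a div")
print(paths)
print(deviation_patterns([(2, "null check"), (1, "typo"), (2, "null check")]))
```
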