forge-orkes 0.1.0
- package/bin/create-forge.js +103 -0
- package/package.json +28 -0
- package/template/.claude/agents/executor.md +177 -0
- package/template/.claude/agents/planner.md +148 -0
- package/template/.claude/agents/researcher.md +111 -0
- package/template/.claude/agents/reviewer.md +211 -0
- package/template/.claude/agents/verifier.md +210 -0
- package/template/.claude/settings.json +40 -0
- package/template/.claude/skills/architecting/SKILL.md +121 -0
- package/template/.claude/skills/auditing/SKILL.md +302 -0
- package/template/.claude/skills/beads-integration/SKILL.md +125 -0
- package/template/.claude/skills/debugging/SKILL.md +130 -0
- package/template/.claude/skills/designing/SKILL.md +134 -0
- package/template/.claude/skills/discussing/SKILL.md +229 -0
- package/template/.claude/skills/executing/SKILL.md +154 -0
- package/template/.claude/skills/forge/SKILL.md +524 -0
- package/template/.claude/skills/planning/SKILL.md +225 -0
- package/template/.claude/skills/quick-tasking/SKILL.md +74 -0
- package/template/.claude/skills/refactoring/SKILL.md +168 -0
- package/template/.claude/skills/researching/SKILL.md +117 -0
- package/template/.claude/skills/securing/SKILL.md +104 -0
- package/template/.claude/skills/verifying/SKILL.md +201 -0
- package/template/.forge/templates/constitution.md +123 -0
- package/template/.forge/templates/context.md +53 -0
- package/template/.forge/templates/design-systems/material-ui.md +44 -0
- package/template/.forge/templates/design-systems/primereact.md +46 -0
- package/template/.forge/templates/design-systems/shadcn-ui.md +47 -0
- package/template/.forge/templates/framework-absorption/generic.md +52 -0
- package/template/.forge/templates/framework-absorption/gsd.md +174 -0
- package/template/.forge/templates/framework-absorption/spec-kit.md +52 -0
- package/template/.forge/templates/plan.md +84 -0
- package/template/.forge/templates/project.yml +40 -0
- package/template/.forge/templates/refactor-backlog.yml +16 -0
- package/template/.forge/templates/requirements.yml +49 -0
- package/template/.forge/templates/roadmap.yml +44 -0
- package/template/.forge/templates/state/index.yml +51 -0
- package/template/.forge/templates/state/milestone.yml +42 -0
- package/template/CLAUDE.md +150 -0
@@ -0,0 +1,225 @@
---
name: planning
description: "Use when you need to break work into executable tasks with verification gates. Trigger after researching, when you have enough context to plan but haven't started building. This skill enforces constitutional gates, structures requirements, decomposes tasks, and sets up goal-backward verification."
---

# Planning

Turn research and requirements into executable, verifiable plans.

## Step 1: Resolution Gate

Read `.forge/context.md`. Check the **Needs Resolution** section.

If any unresolved items exist (unchecked `- [ ]` items under "Needs Resolution"):

**Pause and triage with the user.** This is a quick conversation, not a planning phase. Present all unresolved items and ask the user to make a call on each one.

Present each item clearly:
*"Before we plan, there are {N} items that need your input: [list items]. For each one, should we lock it as a decision, defer it, treat it as a bug to fix, or drop it?"*

**For each unresolved item, the user picks one:**
- **→ Lock it**: The answer is clear. Record in Locked Decisions section.
- **→ Defer it**: Not now. Move to Deferred Ideas with revisit date.
- **→ Fix it**: This is a gap/bug. Add as a requirement (FR-xxx) in `.forge/requirements.yml` — it flows into the plan being created.
- **→ Drop it**: No longer relevant. Note in Amendment Log.

Check off each resolved item `- [x]` and move it to the appropriate section.

**This should take 2-5 minutes, not a whole planning cycle.** The goal is to get quick human decisions so planning can proceed with accurate information.

Also scan the **Carried Forward** section — items here that affect the current phase should be triaged the same way.
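
The unchecked-item scan in this step could be automated roughly as follows. This is a minimal sketch, assuming the section is introduced by a Markdown heading whose text is exactly "Needs Resolution"; `findUnresolvedItems` is a hypothetical helper and not part of the package.

```typescript
// Hypothetical helper: the heading text and the function name are
// assumptions, not taken from the Forge templates.
function findUnresolvedItems(contextMarkdown: string): string[] {
  const lines = contextMarkdown.split("\n");
  const unresolved: string[] = [];
  let inSection = false;

  for (const line of lines) {
    const heading = /^#{1,6}\s+(.*)$/.exec(line.trim());
    if (heading) {
      // Enter the section on its heading; leave it on any other heading.
      inSection = (heading[1] || "").trim().toLowerCase() === "needs resolution";
      continue;
    }
    // Only unchecked boxes ("- [ ]") count as unresolved.
    const unchecked = /^\s*- \[ \]\s+(.*)$/.exec(line);
    if (inSection && unchecked) {
      unresolved.push((unchecked[1] || "").trim());
    }
  }
  return unresolved;
}

const sampleContext = [
  "## Locked Decisions",
  "- [x] Use PostgreSQL",
  "## Needs Resolution",
  "- [ ] Pick an auth provider",
  "- [x] Choose a CSS framework",
  "- [ ] Decide on pagination style",
  "## Carried Forward",
  "- [ ] Revisit caching",
].join("\n");

const unresolvedItems = findUnresolvedItems(sampleContext);
console.log(unresolvedItems.length + " items need your input");
```

The same function could be pointed at the Carried Forward section by changing the heading it looks for; the sketch deliberately ignores it so the two triage lists stay separate.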

## Step 2: Constitutional Gate Check

Read `.forge/constitution.md`. For each article relevant to this phase:
- Check the gate checkboxes
- If any gate fails → **STOP.** Resolve before planning.
- If an article needs amendment → flag for user decision (discuss phase)

Document gate results in the plan frontmatter.

## Step 3: Lock User Decisions

If not already done, create `.forge/context.md` from template:

1. Ask user about key decisions (framework choices, constraints, preferences)
2. Record as **Locked Decisions** — these are contracts
3. Record **Deferred Ideas** — explicitly out of scope
4. Record **Discretion Areas** — agent picks best approach

**Critical:** Do this BEFORE creating plans. Plans must reference context.md.

## Step 4: Structure Requirements

If `.forge/requirements.yml` doesn't exist, create from template:

1. Extract functional requirements from user description + research
2. Assign IDs: FR-001, FR-002, etc.
3. Write acceptance criteria in Given/When/Then format
4. Mark uncertain items: `[NEEDS CLARIFICATION]`
5. Separate P1 (must-have) from P2 (should-have) and P3 (nice-to-have)
6. List deferred items explicitly (DEF-001, etc.)

**Planning blocks until all P1 `[NEEDS CLARIFICATION]` items are resolved.**
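
The ID scheme and the P1 blocking rule can be illustrated with a short sketch. The `RequirementEntry` shape and both function names are assumptions made for this example; the real `.forge/requirements.yml` template may use different fields.

```typescript
// Illustrative shapes only: field names are assumptions, not the template's.
type RequirementPriority = "P1" | "P2" | "P3";

interface RequirementEntry {
  id: string; // e.g. "FR-001"
  priority: RequirementPriority;
  text: string; // may contain the [NEEDS CLARIFICATION] marker
}

const CLARIFICATION_MARKER = "[NEEDS CLARIFICATION]";

// Format a sequence number as FR-001, FR-002, ...
function formatRequirementId(sequence: number): string {
  let digits = String(sequence);
  while (digits.length < 3) {
    digits = "0" + digits;
  }
  return "FR-" + digits;
}

// Return the IDs of P1 requirements that still carry the marker.
function findBlockingRequirements(entries: RequirementEntry[]): string[] {
  return entries
    .filter((e) => e.priority === "P1" && e.text.indexOf(CLARIFICATION_MARKER) !== -1)
    .map((e) => e.id);
}

const draftRequirements: RequirementEntry[] = [
  { id: formatRequirementId(1), priority: "P1", text: "User can log in with email and password" },
  { id: formatRequirementId(2), priority: "P1", text: "Session length [NEEDS CLARIFICATION]" },
  { id: formatRequirementId(3), priority: "P3", text: "Dark mode [NEEDS CLARIFICATION]" },
];

const blockingIds = findBlockingRequirements(draftRequirements);
console.log(
  blockingIds.length === 0 ? "Planning may proceed" : "Blocked by: " + blockingIds.join(", ")
);
```

Only the P1 item blocks here; the P3 item keeps its marker without stopping planning, which matches the rule above.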

## Step 5: Create Roadmap

If `.forge/roadmap.yml` doesn't exist (Full tier only):

1. Group requirements by natural delivery boundaries
2. Identify dependencies between groups
3. Create phases (coherent, verifiable capabilities)
4. Assign requirements to phases (every FR → exactly one phase, no orphans)
5. Analyze waves (independent phases = Wave 1, dependencies = Wave 2+)
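
One way to read the wave rule is "a phase's wave is one more than the highest wave among its dependencies". The sketch below implements that reading; the input shape (phase name mapped to the names it depends on) and the function name `assignWaves` are assumptions for illustration, and the phase names are invented.

```typescript
// Hypothetical input shape: phase names mapped to the phases they depend on.
type PhaseDependencies = Record<string, string[]>;

// Wave 1 for phases with no dependencies, otherwise one more than the
// highest wave among the dependencies. Throws on unknown phases and cycles.
function assignWaves(deps: PhaseDependencies): Record<string, number> {
  const waves: Record<string, number> = {};
  const visiting: Record<string, boolean> = {};

  function waveOf(phase: string): number {
    if (!Object.prototype.hasOwnProperty.call(deps, phase)) {
      throw new Error("Unknown phase: " + phase);
    }
    const known = waves[phase];
    if (known !== undefined) return known;
    if (visiting[phase]) throw new Error("Circular dependency at: " + phase);

    visiting[phase] = true;
    let wave = 1;
    for (const dep of deps[phase] || []) {
      wave = Math.max(wave, waveOf(dep) + 1);
    }
    visiting[phase] = false;
    waves[phase] = wave;
    return wave;
  }

  for (const phase of Object.keys(deps)) {
    waveOf(phase);
  }
  return waves;
}

const exampleWaves = assignWaves({
  auth: [],
  catalog: [],
  checkout: ["auth", "catalog"],
  receipts: ["checkout"],
});
console.log(exampleWaves);
```

Here `auth` and `catalog` land in Wave 1, `checkout` in Wave 2, and `receipts` in Wave 3. The cycle check doubles as a basic test that the dependency graph is a valid DAG.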

## Step 6: Decompose into Tasks

For each phase (or the single feature in Standard tier):

1. Copy `.forge/templates/plan.md` → `.forge/phases/{N}-{name}/plan-{NN}.md`
2. Fill in frontmatter (phase, plan number, wave, dependencies)
3. Create goal-backward must_haves:
   - **Truths:** Observable from user perspective when done (3-7)
   - **Artifacts:** Files that must exist and be substantive, not stubs
   - **Key Links:** Critical connections between artifacts
4. Decompose into XML tasks (2-3 per plan, 15-60 min each):

```xml
<task type="auto">
  <name>Action-oriented name</name>
  <files>Exact paths created/modified</files>
  <action>What to build, how, what to avoid (with WHY)</action>
  <verify>Command or check to prove completion</verify>
  <done>Measurable acceptance criteria</done>
</task>
```

### Task Sizing
- Under 15 min → too small, combine with adjacent task
- 15-60 min → right size
- Over 60 min → too large, split into a separate plan

### Task Types
- `auto` — Fully autonomous (90% of tasks)
- `checkpoint:human-verify` — Pause for visual/functional check
- `checkpoint:decision` — Pause for user choice between options
- `checkpoint:human-action` — Pause for truly manual action (email verification, 2FA)

### Vertical Slices (Preferred)
```
Plan 01: User feature (model + API + UI) → Wave 1
Plan 02: Product feature (model + API + UI) → Wave 1
```
Independent plans run in parallel.

### Avoid Horizontal Layers
```
Plan 01: All models → Wave 1
Plan 02: All APIs → Wave 2 (depends on 01)
Plan 03: All UI → Wave 3 (depends on 02)
```
Forces sequential execution. Only use when architecturally necessary.

## Step 7: Test Spec Generation (Optional)

**When to use:** Invoke this step when the work involves testable contracts — APIs with defined endpoints, libraries with public interfaces, data transformations with known inputs/outputs, or services with observable behavior. Skip for UI-heavy work where the existing must_haves approach is sufficient.

**The user can also explicitly request this:** *"Generate test specs for this plan."*

### Detect Testable Contracts

Scan the tasks from Step 6. A task is a testable contract candidate if it involves:
- **API endpoints** — defined request/response shapes, status codes, error cases
- **Library/module interfaces** — public functions with typed inputs and expected outputs
- **Data transformations** — parsers, formatters, validators, converters with known input→output mappings
- **Service integrations** — external API calls with expected payloads and error handling
- **Business logic** — calculations, rules engines, state machines with deterministic behavior

If no testable contracts are found, skip to Step 8.

### Generate Test Specifications

For each testable contract, create a test spec file alongside the plan:

**Location:** `.forge/phases/{N}-{name}/specs/`

**Format:** Language-appropriate test files (e.g., `.test.ts`, `_test.go`, `test_*.py`) that:

1. **Import the module that will exist** (the path from the task's `<files>` field)
2. **Describe expected behavior** derived from requirements, NOT from implementation:
   - Happy path cases from acceptance criteria
   - Edge cases from requirements (empty input, max limits, invalid data)
   - Error cases from the `<done>` criteria
3. **Assert outcomes, not internals** — test what the function returns or what side effects occur, not how it's implemented
4. **Mark as pending/skipped** — tests should be valid syntax but marked to skip (e.g., `it.skip()`, `@pytest.mark.skip`, `t.Skip()`) so they don't break CI before implementation

```typescript
// Example: .forge/phases/1-auth/specs/login-endpoint.test.ts
describe('POST /api/auth/login', () => {
  it.skip('returns 200 with valid token for correct credentials', () => {
    // From FR-001: "User can log in with email and password"
  });

  it.skip('returns 401 for invalid password', () => {
    // From FR-001 acceptance: "Invalid credentials show error"
  });

  it.skip('returns 422 for missing email field', () => {
    // Edge case: required field validation
  });

  it.skip('rate-limits after 5 failed attempts', () => {
    // From NFR-002: "Brute force protection"
  });
});
```

### Update Plan Tasks

For tasks with generated test specs, update the task XML to reference the spec:

```xml
<task type="tdd">
  <name>Implement login endpoint</name>
  <files>src/api/auth/login.ts</files>
  <spec>.forge/phases/1-auth/specs/login-endpoint.test.ts</spec>
  <action>Make all tests in the spec file pass. Remove skip markers as you implement.</action>
  <verify>All spec tests pass: npm test -- login-endpoint</verify>
  <done>All skip markers removed, all tests green</done>
</task>
```

Note: The task type changes to `tdd` when a spec exists. The executor's TDD flow (RED → GREEN → REFACTOR) kicks in automatically.

### What NOT to Spec

- UI component rendering (use must_haves truths instead)
- Visual layout and styling (use human verification)
- Integration flows spanning many components (use key_links verification)
- Anything where the interface isn't defined yet (spec after architecting, not before)

## Step 8: Verify Plans (Plan Checker)

Before passing to executor, verify across 8 dimensions:

1. **Requirement Coverage** — Every phase requirement has task(s)
2. **Task Completeness** — All tasks have files + action + verify + done
3. **Dependency Correctness** — Valid DAG, no circular deps
4. **Key Links Planned** — Artifacts will be wired together (not orphaned)
5. **Scope Sanity** — 2-3 tasks per plan, targeting ~50% context
6. **Verification Derivation** — must_haves trace to phase goal
7. **Context Compliance** — Plans honor locked decisions, exclude deferred ideas
8. **Spec Validity** (when Step 7 was used) — Test specs are valid syntax, reference correct paths, derive from requirements not assumptions

If issues found → fix and re-verify. Max 3 revision cycles.
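
Two of these dimensions, Task Completeness and Scope Sanity, are mechanical enough to sketch in code. The `PlannedTask` shape below is a hypothetical in-memory form of a `<task>` element, and `checkPlanTasks` is an invented name; parsing the XML itself is left out.

```typescript
// Hypothetical in-memory form of a plan's <task> elements. The real plans
// store tasks as XML; parsing is out of scope for this sketch.
interface PlannedTask {
  title: string;
  files: string;
  action: string;
  verify: string;
  done: string;
}

// Checks Task Completeness (all four fields filled) and Scope Sanity
// (2-3 tasks per plan). Returns a list of human-readable issues.
function checkPlanTasks(planTasks: PlannedTask[]): string[] {
  const issues: string[] = [];

  if (planTasks.length < 2 || planTasks.length > 3) {
    issues.push("Scope Sanity: expected 2-3 tasks, found " + planTasks.length);
  }

  for (const task of planTasks) {
    const missing: string[] = [];
    if (task.files.trim() === "") missing.push("files");
    if (task.action.trim() === "") missing.push("action");
    if (task.verify.trim() === "") missing.push("verify");
    if (task.done.trim() === "") missing.push("done");
    if (missing.length > 0) {
      issues.push('Task Completeness: "' + task.title + '" is missing ' + missing.join(", "));
    }
  }
  return issues;
}

const planIssues = checkPlanTasks([
  {
    title: "Implement login endpoint",
    files: "src/api/auth/login.ts",
    action: "Build the handler",
    verify: "",
    done: "All tests green",
  },
]);
for (const issue of planIssues) {
  console.log(issue);
}
```

An empty result would mean both checks pass; the other six dimensions need judgment about requirements and context, so they are not attempted here.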

## Step 9: Present to User

Show the user:
1. Requirements summary (P1/P2/P3 counts, clarifications needed)
2. Phase/plan structure with wave analysis
3. Estimated effort per phase
4. Ask: "Does this plan match your expectations? Any changes?"

Planning is complete when user approves.
@@ -0,0 +1,74 @@
---
name: quick-tasking
description: "Use for small, scoped changes: typo fixes, config updates, minor bug fixes, dependency bumps, documentation tweaks. Trigger when the change is under 50 lines, touches 1-2 files, and requires no architectural decisions. This is the Quick tier — skip planning ceremony, just do it right."
---

# Quick-Tasking

Small change? Skip the ceremony. Do it right, do it fast.

## Qualifies as Quick

- Single file change, under 50 lines
- Typo, grammar, or formatting fix
- Config or environment variable update
- Dependency version bump
- Simple bug fix with obvious root cause
- Documentation update
- CSS/style tweak within design system
- Adding a missing import or export

## Does NOT Qualify as Quick

Upgrade to Standard tier if any of these are true:
- Change touches 3+ files
- Requires new component, function, or module
- Involves logic changes affecting multiple features
- Needs a new dependency
- Requires architectural decision
- Estimated at more than 30 minutes
- You're unsure about the right approach

## Quick Workflow

### 1. Identify
What exactly needs to change? Be specific: which file, which line, what's wrong.

### 2. Validate Scope
Confirm it's truly quick:
- [ ] Under 50 lines of changes
- [ ] 1-2 files maximum
- [ ] No architectural decisions needed
- [ ] Root cause is obvious (or bug fix is straightforward)

If any check fails → escalate to Standard tier.
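
The four scope checks amount to a small decision function. The sketch below is illustrative only: `ChangeEstimate` and `chooseTier` are hypothetical names, and the inputs are estimates a person or agent would supply.

```typescript
// Hypothetical estimate of a change; all field names are assumptions.
interface ChangeEstimate {
  linesChanged: number;
  filesTouched: number;
  needsArchitecturalDecision: boolean;
  rootCauseObvious: boolean;
}

type ForgeTier = "quick" | "standard";

// Applies the four checklist items; any failure escalates to Standard.
function chooseTier(change: ChangeEstimate): { tier: ForgeTier; failedChecks: string[] } {
  const failedChecks: string[] = [];
  if (change.linesChanged >= 50) failedChecks.push("Under 50 lines of changes");
  if (change.filesTouched > 2) failedChecks.push("1-2 files maximum");
  if (change.needsArchitecturalDecision) failedChecks.push("No architectural decisions needed");
  if (!change.rootCauseObvious) failedChecks.push("Root cause is obvious");
  const tier: ForgeTier = failedChecks.length === 0 ? "quick" : "standard";
  return { tier, failedChecks };
}

const typoFix = chooseTier({
  linesChanged: 3,
  filesTouched: 1,
  needsArchitecturalDecision: false,
  rootCauseObvious: true,
});
const sprawlingFix = chooseTier({
  linesChanged: 80,
  filesTouched: 4,
  needsArchitecturalDecision: false,
  rootCauseObvious: true,
});
console.log(typoFix.tier, sprawlingFix.tier, sprawlingFix.failedChecks);
```

Returning the failed checks alongside the tier gives the escalation note something concrete to say about why the task left the Quick tier.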

### 3. Execute
Make the change. Follow existing patterns in the codebase.

### 4. Test
- Run relevant tests: `npm test -- --filter={related}` (adjust the filter flag to whatever your test runner accepts)
- If no tests exist for this area, at minimum verify it compiles/builds
- For UI changes, visually confirm the fix

### 5. Commit
Atomic commit with proper format:
```
{type}({scope}): {description}
```
Examples:
- `fix(docs): correct typo in API reference`
- `chore(deps): bump react to 19.1.0`
- `fix(ui): align header padding with design system`
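
A commit subject in this shape can be checked with a single regular expression. The sketch below is an assumption-laden illustration: the skill only fixes the overall `{type}({scope}): {description}` shape, so the allowed characters for type and scope are my own choice, and `parseCommitMessage` is a hypothetical helper.

```typescript
// Matches "{type}({scope}): {description}". The character classes are an
// assumption; the skill itself only specifies the overall shape.
const COMMIT_PATTERN = /^([a-z]+)\(([a-z0-9-]+)\): (\S.*)$/;

interface ParsedCommit {
  type: string;
  scope: string;
  description: string;
}

// Returns the three parts, or null when the subject does not fit the format.
function parseCommitMessage(message: string): ParsedCommit | null {
  const match = COMMIT_PATTERN.exec(message);
  if (!match) return null;
  return {
    type: match[1] || "",
    scope: match[2] || "",
    description: match[3] || "",
  };
}

const goodCommit = parseCommitMessage("chore(deps): bump react to 19.1.0");
const badCommit = parseCommitMessage("fixed the typo");
console.log(goodCommit, badCommit);
```

The first call yields the parts `chore`, `deps`, and the description; the second returns `null` because it has no type or scope.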

### 6. Done
No verification ceremony needed. Move on.

## Scope Creep Detection

While executing a quick task, if you discover:
- The fix is bigger than expected → **STOP**. Escalate to Standard.
- Related issues that should also be fixed → **LOG** to `.forge/deferred-issues.md`. Don't fix now.
- An architectural question → **STOP**. Escalate to Standard with a note about what triggered it.

Quick-tasking is about discipline: do the small thing, only the small thing, and move on.
@@ -0,0 +1,168 @@
---
name: refactoring
description: "Review code built during a milestone for refactoring opportunities. Runs after auditing passes. Produces a structured backlog of improvements the user can work through via quick-tasking. Soft gate — review items, add to backlog, complete milestone."
---

# Refactoring: Post-Milestone Code Review for Improvement Opportunities

You review the code built during a milestone and catalog opportunities for improvement. This is not about correctness (that's `verifying`) or health (that's `auditing`) — it's about identifying code that works but could be cleaner, simpler, or more consistent.

## When to Trigger

- **Automatically** after `auditing` completes (all three paths: HEALTHY, NEEDS ATTENTION after fix, ACCEPTABLE WITH CAVEATS after accept)
- **On-demand** at any time via user request

## Step 1: Read Context

```
Read: .forge/project.yml → tech stack, conventions
Read: .forge/state/milestone-{id}.yml → milestone ID, name, phases completed
Read: .forge/audits/milestone-{id}-health-report.md → health findings (avoid overlap)
Read: .forge/refactor-backlog.yml → existing backlog items (if any)
Read: .forge/constitution.md → active architectural gates (if exists)
```

Determine the milestone's starting point for the git diff:
- Check git log for the commit tagged or noted as the milestone start
- If unavailable, use the first commit after the previous milestone's completion date
- Fallback: ask the user for the starting commit or branch

## Step 2: Scan Milestone Code

Spawn a fresh agent with isolated context. Pass it:
- The explicit list of files changed during the milestone (from `git diff --name-only {start}..HEAD`)
- The tech stack from `project.yml`
- The health report findings (so it doesn't duplicate auditing's work)
- The constitution (so it respects intentional decisions)

**The agent scans for 6 categories:**

| # | Category | What to Look For |
|---|----------|-----------------|
| 1 | **Duplication** | Similar logic in 2+ places that could be extracted into a shared function, hook, or utility |
| 2 | **Complexity hotspots** | Functions >50 lines, nesting >3 levels deep, high cyclomatic complexity, overly long files |
| 3 | **Naming & clarity** | Unclear variable/function names, misleading abstractions, functions that do more than their name suggests |
| 4 | **Pattern inconsistency** | Same thing done differently across the milestone's files (e.g., error handling, data fetching, state management) |
| 5 | **Dead code** | Unused functions, unreachable branches, commented-out code left behind, unused imports |
| 6 | **Abstraction issues** | Over-engineered helpers used once, repeated inline code that warrants extraction, premature or missing abstractions |

**Agent behavior rules:**
- Read every file in the diff. No sampling.
- Every finding must reference a specific file and line range.
- Understand context — don't flag intentional patterns documented in the constitution.
- Don't duplicate findings already in the health report from auditing.
- Estimate effort for each item: `quick` (< 30 min, under 50 lines) or `standard` (needs planning).
- Suggest a concrete approach for each finding, not just "refactor this."
- Prefer fewer high-quality findings over many low-signal ones.

**Output format** (return to orchestrator):

```yaml
findings:
  - category: duplication
    file: "src/api/users.ts"
    lines: "42-67"
    description: "Duplicate validation logic — same email check in createUser and updateUser"
    effort: quick
    suggested_approach: "Extract shared validateEmail() helper to src/utils/validation.ts"
  - category: complexity
    file: "src/components/Dashboard.tsx"
    lines: "120-245"
    description: "Dashboard render function is 125 lines with 4 levels of nesting"
    effort: standard
    suggested_approach: "Extract stat cards, chart section, and filter bar into subcomponents"
```

## Step 3: Present Findings to User

Group findings by category. Show each with:
- File and line range
- What the issue is
- Estimated effort
- Suggested approach

Present top findings (max 10 initially). If there are more, mention the count.

*"I found {N} refactoring opportunities in the code built during this milestone:"*

Then for each category with findings:

*"**Duplication** ({N} items):*
*1. `src/api/users.ts:42-67` — Duplicate email validation in createUser and updateUser. Quick fix: extract shared helper. [Accept / Dismiss]*
*2. ...*"
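
The grouping and the ten-item cap in this step could look roughly like the sketch below. It assumes the scanning agent has already ranked its findings, which the skill does not state; `RefactorFinding` mirrors the YAML fields above, and `groupTopFindings` is a hypothetical name.

```typescript
// Mirrors the fields of the YAML output format; the type name is invented.
interface RefactorFinding {
  category: string;
  file: string;
  lines: string;
  description: string;
  effort: "quick" | "standard";
}

interface GroupedFindings {
  byCategory: Record<string, RefactorFinding[]>;
  remaining: number; // findings held back by the limit
}

// Keeps the first `limit` findings (assumed to be pre-ranked), groups them
// by category, and reports how many were not shown.
function groupTopFindings(findings: RefactorFinding[], limit: number): GroupedFindings {
  const shown = findings.slice(0, limit);
  const byCategory: Record<string, RefactorFinding[]> = {};
  for (const finding of shown) {
    const bucket = byCategory[finding.category] || [];
    bucket.push(finding);
    byCategory[finding.category] = bucket;
  }
  return { byCategory, remaining: findings.length - shown.length };
}

const sampleFindings: RefactorFinding[] = [];
for (let i = 0; i < 12; i++) {
  sampleFindings.push({
    category: i % 2 === 0 ? "duplication" : "dead-code",
    file: "src/example" + i + ".ts", // placeholder paths for illustration
    lines: "1-10",
    description: "placeholder finding " + i,
    effort: "quick",
  });
}

const groupedFindings = groupTopFindings(sampleFindings, 10);
console.log(Object.keys(groupedFindings.byCategory), groupedFindings.remaining);
```

With twelve findings and a limit of ten, two are held back, which is the count the presenter would mention to the user.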

## Step 4: User Triage

The user can respond with:
- **Accept** (individual item) → add to backlog
- **Dismiss** (individual item) → skip, not a real issue or intentional
- **Accept all** → bulk add all remaining items to backlog
- **Dismiss all** → skip everything, no backlog items added

For dismissed items, optionally ask for a brief reason (helps calibrate future scans).

## Step 5: Write Backlog

Read existing `.forge/refactor-backlog.yml` (if any). Determine the next item ID by incrementing from the highest existing ID.

Append accepted items to `.forge/refactor-backlog.yml`:

```yaml
items:
  - id: R001
    milestone: 1
    category: duplication
    file: "src/api/users.ts"
    lines: "42-67"
    description: "Duplicate validation logic — same email check in createUser and updateUser"
    effort: quick
    suggested_approach: "Extract shared validateEmail() helper"
    status: pending
    added: "2026-03-18"
    completed: null
```

If the file doesn't exist yet, create it from the template at `.forge/templates/refactor-backlog.yml`.

Present summary:
*"Added {N} items to the refactor backlog. {M} dismissed. You can work these anytime — pending items with `effort: quick` will show up as available Quick tasks when you start a session."*
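
The ID rule in Step 5 ("increment from the highest existing ID") can be sketched as below. It assumes IDs follow the `R001` pattern from the example entry, that is, the letter R followed by a zero-padded number; `nextBacklogId` is a hypothetical helper.

```typescript
// Assumes IDs look like "R001": the letter R plus a zero-padded number,
// as in the example backlog entry. Non-matching IDs are ignored.
function nextBacklogId(existingIds: string[]): string {
  let highest = 0;
  for (const id of existingIds) {
    const match = /^R(\d+)$/.exec(id);
    if (match) {
      highest = Math.max(highest, parseInt(match[1] || "0", 10));
    }
  }
  // Pad to at least three digits; larger numbers simply grow wider.
  let digits = String(highest + 1);
  while (digits.length < 3) {
    digits = "0" + digits;
  }
  return "R" + digits;
}

console.log(nextBacklogId([]));
console.log(nextBacklogId(["R001", "R007", "R003"]));
```

Taking the maximum instead of the count keeps IDs unique even when earlier items have been removed from the file.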

## Step 6: Route

Update `.forge/state/milestone-{id}.yml`:
- Set `current.status` to `complete`

Update `.forge/state/index.yml`:
- Set milestone status to `complete`
- Update `last_updated` timestamp

Present to user:
*"Milestone [{name}] is complete. {N} refactoring items are in the backlog for whenever you want to tackle them."*

If Beads is installed, run `bd complete` to update the dependency graph.

## Gate Type: Soft Gate

This is a soft gate — it presents opportunities but never blocks milestone completion. Rationale:
- Refactoring is improvement, not correctness. The code already works (verified) and is healthy (audited).
- Users should review opportunities but aren't forced to act on them immediately.
- Backlog items persist across sessions and can be worked whenever it makes sense.
- Some items may become irrelevant as the codebase evolves — that's fine.

## Backlog Lifecycle

Backlog items follow this lifecycle:

```
pending → in_progress → done
pending → dismissed (during triage or later review)
```

Items with `effort: quick` can be picked up directly via `quick-tasking`.
Items with `effort: standard` should go through the Standard tier flow.

When working a backlog item:
1. `forge` surfaces it as an available task
2. User selects it
3. Route to `quick-tasking` or Standard tier based on effort
4. On completion, update the item's `status` to `done` and set `completed` date
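
The backlog lifecycle described in the refactoring skill is a small state machine, and a sketch of it may make the allowed moves easier to see. The type and function names here are hypothetical, and the date is passed in as a plain string so the example stays deterministic.

```typescript
type BacklogStatus = "pending" | "in_progress" | "done" | "dismissed";

// Allowed moves, taken from the lifecycle diagram:
// pending -> in_progress -> done, and pending -> dismissed.
const ALLOWED_TRANSITIONS: Record<BacklogStatus, BacklogStatus[]> = {
  pending: ["in_progress", "dismissed"],
  in_progress: ["done"],
  done: [],
  dismissed: [],
};

// Hypothetical subset of a backlog item's fields.
interface BacklogItem {
  id: string;
  status: BacklogStatus;
  completed: string | null;
}

// Returns an updated copy, setting `completed` when the item reaches done.
// Throws when the move is not part of the lifecycle.
function advanceBacklogItem(item: BacklogItem, next: BacklogStatus, today: string): BacklogItem {
  if (ALLOWED_TRANSITIONS[item.status].indexOf(next) === -1) {
    throw new Error("Cannot move " + item.id + " from " + item.status + " to " + next);
  }
  return {
    id: item.id,
    status: next,
    completed: next === "done" ? today : item.completed,
  };
}

const pendingItem: BacklogItem = { id: "R001", status: "pending", completed: null };
const startedItem = advanceBacklogItem(pendingItem, "in_progress", "2026-03-18");
const finishedItem = advanceBacklogItem(startedItem, "done", "2026-03-18");
console.log(finishedItem);
```

Treating `done` and `dismissed` as terminal states is a reading of the diagram, which shows no arrows leaving either one.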
@@ -0,0 +1,117 @@
---
name: researching
description: "Use when you need to investigate before building: understand the codebase, evaluate technologies, clarify requirements, or assess feasibility. Trigger before any Standard or Full tier planning. Also trigger when joining an existing project or when blocked on 'how should this work?'"
---

# Researching

Gather context before planning. Bad research → bad plans → wasted time.

## Research Types

### 1. Codebase Research
**When:** Joining existing project, planning changes to existing code, brownfield work.

Steps:
1. Map directory structure (`ls`, `find`, `tree`)
2. Identify key files: entry points, config, routes, models, components
3. Read conventions: naming, imports, error handling, testing patterns
4. Identify tech stack: framework, libraries, versions
5. Find existing patterns similar to planned work (reuse, don't reinvent)
6. Document concerns: tech debt, fragile areas, missing tests

Output: Research summary with file paths, patterns found, reuse opportunities.

### 2. Requirements Research
**When:** Requirements are vague, conflicting, or incomplete.

Steps:
1. Read all available requirements/specs/user stories
2. Identify gaps — what's not specified?
3. Mark uncertainties with `[NEEDS CLARIFICATION]`
4. Draft acceptance criteria (Given/When/Then)
5. Present clarification questions to user

Output: Structured requirements draft with uncertainty markers.

### 3. Technology Research
**When:** Evaluating libraries, frameworks, APIs, or integration approaches.

Steps:
1. Check MCP servers first (structured data beats web scraping)
2. Read official documentation (not blog posts)
3. Check version compatibility with existing stack
4. Look for known issues, migration guides, breaking changes
5. Find working code examples from official sources

Output: Technology assessment with recommendation and confidence level.

### 4. Feasibility Research
**When:** Unsure if approach will work, need to spike a risky idea.

Steps:
1. Identify the riskiest assumption
2. Build minimal proof-of-concept (< 30 minutes)
3. Test the critical path only
4. Document: works / doesn't work / works with caveats

Output: Feasibility verdict with evidence.

## Tool Priority

Use tools in this order (highest confidence first):

1. **Codebase itself** — Read actual code. Highest confidence.
2. **MCP servers** — Structured data from Notion, GitHub, Linear, etc.
3. **Official docs** — Framework/library documentation via WebFetch.
4. **Web search** — Community solutions, Stack Overflow, blog posts.
5. **Training data** — Use as hypothesis only. Verify before trusting.

## Confidence Levels

Tag every finding:

| Level | Meaning | Source Required |
|-------|---------|----------------|
| **HIGH** | Verified in codebase or official docs | Direct observation or official source |
| **MEDIUM** | Found in multiple sources, not directly verified | 2+ independent sources agree |
| **LOW** | Single source or inferred from patterns | Flag for verification |

## Parallel Research

When investigating independent topics, research them simultaneously:
- Spawn separate research agents for each topic
- Each agent gets a focused scope
- Collect and synthesize results

Do NOT research sequentially when topics are independent.

## Output Template

```markdown
# Research: [Topic]

## Summary
[2-3 sentence executive summary of findings]

## Key Findings
1. [Finding] — Confidence: HIGH/MEDIUM/LOW — Source: [source]
2. [Finding] — Confidence: HIGH/MEDIUM/LOW — Source: [source]

## Recommendations
- [Recommended approach with rationale]

## Uncertainties
- [NEEDS CLARIFICATION]: [What we don't know and why it matters]

## Sources
- [Source 1]: [URL or file path]
- [Source 2]: [URL or file path]
```

## Size Gate

Research output should be under 500 lines. If larger, split into focused documents:
- `research-codebase.md` for codebase findings
- `research-tech.md` for technology evaluation
- `research-requirements.md` for requirements analysis
@@ -0,0 +1,104 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: securing
|
|
3
|
+
description: "Use when building features that touch authentication, user data, external APIs, or secrets. Trigger before shipping anything that handles sensitive data, manages sessions, calls third-party services, or processes user input. Also use for periodic security reviews of existing code."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Securing
|
|
7
|
+
|
|
8
|
+
Security is a requirement, not a feature. Review before shipping.
|
|
9
|
+
|
|
10
|
+
## When This Skill Triggers
|
|
11
|
+
|
|
12
|
+
Ask these 4 questions. If YES to any, run this skill:
|
|
13
|
+
|
|
14
|
+
1. Does this feature touch **authentication or authorization**?
|
|
15
|
+
2. Does it **store or transmit user data**?
|
|
16
|
+
3. Does it **call external APIs** or accept webhooks?
|
|
17
|
+
4. Does it **handle secrets, keys, or tokens**?
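
The four questions amount to a logical OR. A minimal sketch, assuming a hypothetical feature descriptor with one boolean per question (none of these field names come from Forge):

```javascript
// Hypothetical feature descriptor: one boolean per trigger question.
function shouldRunSecuring(feature) {
  return Boolean(
    feature.touchesAuth ||
      feature.handlesUserData ||
      feature.callsExternalApis ||
      feature.handlesSecrets
  );
}

console.log(shouldRunSecuring({ callsExternalApis: true }));
```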
|
|
18
|
+
|
|
19
|
+
## Security Checklist
|
|
20
|
+
|
|
21
|
+
### Authentication & Authorization
|
|
22
|
+
- [ ] Passwords hashed with bcrypt/scrypt/argon2 (never MD5 or a plain SHA-family hash)
|
|
23
|
+
- [ ] Sessions/tokens expire (configurable, reasonable default)
|
|
24
|
+
- [ ] Token refresh mechanism exists
|
|
25
|
+
- [ ] Authorization checked server-side before every data access
|
|
26
|
+
- [ ] Failed login attempts rate-limited
|
|
27
|
+
- [ ] Password reset flow doesn't reveal account existence
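
To make the password-hashing item concrete, here is a minimal sketch using scrypt from Node's built-in `crypto` module. The `salt:hash` storage format and the function names are assumptions for illustration; a real project may prefer a maintained bcrypt or argon2 library.

```javascript
const { scryptSync, randomBytes, timingSafeEqual } = require("node:crypto");

// Hash a password with a fresh random salt; returns "salt:hash" in hex.
function hashPassword(password) {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 64);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}
```

The constant-time comparison avoids leaking how many leading bytes of the hash matched.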
|
|
28
|
+
|
|
29
|
+
### Input Validation & Injection
|
|
30
|
+
- [ ] All user input validated server-side (not just client-side)
|
|
31
|
+
- [ ] SQL queries use parameterized statements (never string concatenation)
|
|
32
|
+
- [ ] HTML output escaped to prevent XSS
|
|
33
|
+
- [ ] File uploads validated: type, size, content (not just extension)
|
|
34
|
+
- [ ] URL parameters sanitized before use
|
|
35
|
+
- [ ] JSON parsing has size limits
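
Two of the items above, output escaping and JSON size limits, can be sketched with no dependencies. Both helper names are hypothetical, and the 100 KB default is an assumed value, not one this checklist prescribes.

```javascript
// Escape the five HTML-significant characters before inserting text into markup.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Reject oversized payloads before parsing (default limit is an assumption).
function parseJsonWithLimit(raw, maxBytes = 100 * 1024) {
  if (Buffer.byteLength(raw, "utf8") > maxBytes) {
    throw new Error("JSON payload too large");
  }
  return JSON.parse(raw);
}
```

Most web frameworks and templating engines already provide both behaviors; the point of the review is to confirm they are enabled rather than to reimplement them.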
|
|
36
|
+
|
|
37
|
+
### Secrets Management
|
|
38
|
+
- [ ] Secrets stored in environment variables, never in code
|
|
39
|
+
- [ ] `.env` file in `.gitignore`
|
|
40
|
+
- [ ] No secrets in logs, error messages, or stack traces
|
|
41
|
+
- [ ] API keys scoped to minimum required permissions
|
|
42
|
+
- [ ] Secrets rotated periodically (documented rotation process)
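
A minimal sketch of the first and third items: read secrets from the environment, fail fast when one is missing, and mask values before they can reach a log line. The names `requireSecret` and `redact` are hypothetical.

```javascript
// Read a required secret from the environment; fail fast if it is missing.
function requireSecret(name, env = process.env) {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Mask a secret for logging; only the last 4 characters stay visible.
function redact(secret) {
  if (secret.length <= 4) return "****";
  return "****" + secret.slice(-4);
}
```

The error message names the variable but never includes its value, so it is safe to surface in startup logs.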
|
|
43
|
+
|
|
44
|
+
### Data Protection
|
|
45
|
+
- [ ] Sensitive data encrypted at rest (if stored)
|
|
46
|
+
- [ ] HTTPS enforced for all data in transit
|
|
47
|
+
- [ ] PII not logged (names, emails, addresses, IPs)
|
|
48
|
+
- [ ] Data retention policy defined (how long, when deleted)
|
|
49
|
+
- [ ] CORS configured to allow only known origins
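
For the CORS item, an allowlist check might look like the sketch below. The origins are placeholders and `corsHeadersFor` is a hypothetical helper; frameworks usually expose an equivalent option.

```javascript
// Hypothetical allowlist; replace with the origins the app actually serves.
const ALLOWED_ORIGINS = new Set(["https://app.example.com", "https://admin.example.com"]);

// Return CORS headers only for known origins; unknown origins get none.
function corsHeadersFor(origin) {
  if (!origin || !ALLOWED_ORIGINS.has(origin)) return {};
  return { "Access-Control-Allow-Origin": origin, Vary: "Origin" };
}

console.log(corsHeadersFor("https://app.example.com"));
```

Echoing the matched origin together with `Vary: Origin` keeps shared caches from serving one origin's response to another.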
|
|
50
|
+
|
|
51
|
+
### Dependencies
|
|
52
|
+
- [ ] `npm audit` (or equivalent) passes, or any remaining vulnerabilities are documented
|
|
53
|
+
- [ ] No known vulnerable versions of critical dependencies
|
|
54
|
+
- [ ] Lock file committed (package-lock.json, yarn.lock)
|
|
55
|
+
- [ ] Dependencies from trusted registries only
|
|
56
|
+
|
|
57
|
+
### Error Handling
|
|
58
|
+
- [ ] Error messages don't leak internal details (stack traces, DB schema, file paths)
|
|
59
|
+
- [ ] Unhandled exceptions caught and logged (not displayed to user)
|
|
60
|
+
- [ ] Rate limiting on all public endpoints
|
|
61
|
+
- [ ] Graceful degradation when external services fail
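
The first two items can be combined in one place: log the full error server-side and hand the client only a generic message plus a reference id. This is a sketch under assumed names (`toClientError`, the `errorId` field), not a prescribed Forge API.

```javascript
const { randomUUID } = require("node:crypto");

// Log full details server-side; return only a generic message and a reference id.
function toClientError(err, log = console.error) {
  const errorId = randomUUID();
  log({ errorId, message: err.message, stack: err.stack });
  return { status: 500, body: { error: "Internal server error", errorId } };
}
```

The `errorId` lets support staff correlate a user report with the server log without exposing stack traces, schema details, or file paths.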
|
|
62
|
+
|
|
63
|
+
## Output
|
|
64
|
+
|
|
65
|
+
Create `security-review.md`:
|
|
66
|
+
|
|
67
|
+
```markdown
|
|
68
|
+
# Security Review: [Feature Name]
|
|
69
|
+
|
|
70
|
+
Date: [YYYY-MM-DD]
|
|
71
|
+
Reviewer: Claude (Forge security skill)
|
|
72
|
+
|
|
73
|
+
## Scope
|
|
74
|
+
[What was reviewed: files, endpoints, data flows]
|
|
75
|
+
|
|
76
|
+
## Findings
|
|
77
|
+
|
|
78
|
+
### Critical (Must fix before shipping)
|
|
79
|
+
- [Finding with file path and line reference]
|
|
80
|
+
|
|
81
|
+
### Warning (Should fix soon)
|
|
82
|
+
- [Finding with file path and line reference]
|
|
83
|
+
|
|
84
|
+
### Info (Improvement opportunity)
|
|
85
|
+
- [Finding with file path and line reference]
|
|
86
|
+
|
|
87
|
+
## Checklist Results
|
|
88
|
+
[Copy checklist with pass/fail marks]
|
|
89
|
+
|
|
90
|
+
## Recommendations
|
|
91
|
+
[Prioritized list of actions]
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
## Common Patterns to Flag
|
|
95
|
+
|
|
96
|
+
| Pattern | Risk | Fix |
|
|
97
|
+
|---------|------|-----|
|
|
98
|
+
| `eval()` or `new Function()` | Code injection | Remove; use safe alternatives |
|
|
99
|
+
| `dangerouslySetInnerHTML` | XSS | Sanitize with DOMPurify first |
|
|
100
|
+
| `SELECT * FROM users WHERE id = '${id}'` | SQL injection | Use parameterized queries |
|
|
101
|
+
| `console.log(user)` with PII | Data leak | Log `user.id` only |
|
|
102
|
+
| Hardcoded `Bearer sk-...` | Secret leak | Move to env var |
|
|
103
|
+
| `cors({ origin: '*' })` | Any origin can read responses | Allowlist specific origins |
|
|
104
|
+
| Missing `httpOnly` on auth cookie | Cookie theft | Add `httpOnly: true, secure: true` |
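
A few rows of the table can be approximated with a line-based scan. The regexes below are illustrative assumptions and will produce both false positives and misses; they are a first-pass aid, not a replacement for review or a real static analyzer.

```javascript
// Illustrative regexes for some rows of the table above.
const RISKY_PATTERNS = [
  { name: "eval or new Function", regex: /\beval\s*\(|\bnew\s+Function\s*\(/ },
  { name: "dangerouslySetInnerHTML", regex: /dangerouslySetInnerHTML/ },
  { name: "wildcard CORS origin", regex: /origin\s*:\s*['"]\*['"]/ },
  { name: "hardcoded bearer token", regex: /Bearer\s+sk-[A-Za-z0-9]+/ },
];

// Return one entry per (line, pattern) match, with a 1-based line number.
function scanSource(source) {
  const hits = [];
  source.split("\n").forEach((line, index) => {
    for (const { name, regex } of RISKY_PATTERNS) {
      if (regex.test(line)) hits.push({ line: index + 1, name });
    }
  });
  return hits;
}
```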
|