@pdlc-os/pdlc 0.1.0
- package/.claude/commands/brainstorm.md +360 -0
- package/.claude/commands/build.md +383 -0
- package/.claude/commands/init.md +371 -0
- package/.claude/commands/ship.md +349 -0
- package/.claude/settings.json +40 -0
- package/CLAUDE.md +179 -0
- package/README.md +452 -0
- package/agents/bolt.md +84 -0
- package/agents/echo.md +87 -0
- package/agents/friday.md +83 -0
- package/agents/jarvis.md +87 -0
- package/agents/muse.md +87 -0
- package/agents/neo.md +78 -0
- package/agents/oracle.md +81 -0
- package/agents/phantom.md +85 -0
- package/agents/pulse.md +95 -0
- package/bin/pdlc.js +221 -0
- package/hooks/pdlc-context-monitor.js +129 -0
- package/hooks/pdlc-guardrails.js +307 -0
- package/hooks/pdlc-session-start.sh +73 -0
- package/hooks/pdlc-statusline.js +183 -0
- package/package.json +48 -0
- package/scripts/frame-template.html +332 -0
- package/scripts/helper.js +88 -0
- package/scripts/server.cjs +357 -0
- package/scripts/start-server.sh +173 -0
- package/scripts/stop-server.sh +54 -0
- package/skills/reflect.md +189 -0
- package/skills/repo-scan.md +266 -0
- package/skills/review.md +156 -0
- package/skills/safety-guardrails.md +168 -0
- package/skills/ship.md +148 -0
- package/skills/tdd.md +88 -0
- package/skills/test.md +153 -0
- package/templates/CONSTITUTION.md +254 -0
- package/templates/INTENT.md +120 -0
- package/templates/OVERVIEW.md +93 -0
- package/templates/PRD.md +212 -0
- package/templates/STATE.md +113 -0
- package/templates/episode.md +182 -0
- package/templates/review.md +215 -0
package/skills/reflect.md
ADDED

@@ -0,0 +1,189 @@

# Retrospective Protocol

## When this skill activates

Activate at the start of the **Reflect sub-phase** of Operation, after the Verify sub-phase is complete and the human has signed off on smoke tests. This skill generates the full retrospective for the completed feature cycle — from Inception through Ship.

Before starting, gather all required inputs:
- The active episode file: `docs/pdlc/memory/episodes/[episode-id].md`
- The episode index: `docs/pdlc/memory/episodes/index.md`
- The PRD: `docs/pdlc/prds/PRD_[feature-name]_[YYYY-MM-DD].md`
- All review files generated during Construction: `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md`
- `docs/pdlc/memory/STATE.md` — for the guardrail event log and loop-breaker escalation count
- `docs/pdlc/memory/DECISIONS.md` — for any tech debt deferred this cycle

---

## Protocol

### Step 1 — Per-agent contributions

List every agent that participated in this feature cycle. For each agent, record:
- Their role name and display name (e.g. "Neo — Architect")
- What they contributed: specific tasks, findings surfaced, decisions influenced
- Any notable findings they raised (e.g. a Phantom security finding that led to a code change, an Echo coverage gap that revealed a missing test)

Pull this information from: the review files, the episode file, STATE.md history, and the Beads task records (`bd show [task-id]` for each completed task).

Agents to check: Neo, Echo, Phantom, Jarvis (always-on), plus any auto-selected agents (Bolt, Friday, Muse, Oracle, Pulse) that participated based on task labels.

### Step 2 — Shipping streak

1. Read `docs/pdlc/memory/episodes/index.md`.
2. Count consecutive successfully delivered features ending with the current episode (i.e. episodes where the Ship sub-phase completed without a rollback or an abandonment).
3. A streak is broken by: a rollback, an explicitly abandoned feature, or a feature that did not reach Ship.
4. Record the current streak count.

Display format: "Shipping streak: X consecutive features delivered"
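The streak can be computed mechanically rather than estimated. A minimal sketch, assuming a hypothetical index layout where each episode line ends in a status word (`shipped`, `rolled-back`, `abandoned`) — adapt the parsing to the real `index.md` format:

```shell
# Build a sample index (hypothetical format — the real index.md may differ).
index=$(mktemp)
cat > "$index" <<'EOF'
EP-001 auth-login shipped
EP-002 billing shipped
EP-003 search rolled-back
EP-004 export shipped
EP-005 reporting shipped
EOF

# Walk from the newest episode backwards; stop at the first non-shipped one.
streak=$(awk '{s[NR]=$NF} END {c=0; for (i=NR; i>=1; i--) {if (s[i]=="shipped") c++; else break}; print c}' "$index")
echo "Shipping streak: $streak consecutive features delivered"
```

Here the rolled-back EP-003 breaks the streak, so only the two most recent episodes count.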

### Step 3 — Metrics snapshot

Collect the following metrics from the episode file, review files, and STATE.md:

**Test pass rate by layer:**
Read from the Test Summary in the episode file. For each layer, record: passed / total. Compute the pass rate percentage.

**Cycle time:**
- Start date: the date the Inception phase began for this feature (read from the PRD file date or STATE.md history)
- End date: the date the Ship sub-phase completed (today's date or the Ship timestamp in STATE.md)
- Cycle time = end date − start date, in calendar days
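The date subtraction can be sketched with `date` arithmetic (GNU `date` shown; BSD/macOS needs `date -j -f` instead). The dates below are hypothetical placeholders:

```shell
start_date="2024-03-01"   # hypothetical: Inception start, from the PRD filename
end_date="2024-03-11"     # hypothetical: Ship completion, from STATE.md

# Convert both to epoch seconds in UTC, then divide by seconds-per-day.
start_s=$(date -u -d "$start_date" +%s)
end_s=$(date -u -d "$end_date" +%s)
cycle_days=$(( (end_s - start_s) / 86400 ))
echo "Cycle time: $cycle_days calendar days"
```

Using `-u` avoids off-by-one results when a DST transition falls inside the cycle.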

**Review rounds:**
Count the number of times the Review sub-phase was run for this feature (i.e. how many times a review file was written or regenerated). Read from the review files — check for multiple files with the same task-id, if they exist.
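One way to count rounds, sketched against a temp directory with a hypothetical task id `BD-123` (substitute the real reviews directory and task id):

```shell
reviews=$(mktemp -d)
# Two review files for the same task = two review rounds (sample data).
touch "$reviews/REVIEW_BD-123_2024-03-05.md" "$reviews/REVIEW_BD-123_2024-03-06.md"

rounds=$(ls "$reviews"/REVIEW_BD-123_*.md 2>/dev/null | wc -l | tr -d ' ')
echo "Review rounds for BD-123: $rounds"
```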

**Guardrail triggers by tier:**
Read `docs/pdlc/memory/STATE.md` for logged guardrail events.
- Tier 1 events: hard blocks that required double-RED confirmation
- Tier 2 events: pause-and-confirm events
- Tier 3 events: logged warnings (skipped layers, accepted warnings, overrides)
- Count each tier separately.
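If STATE.md logs one line per guardrail event, the per-tier tally reduces to a `grep -c` per tier. The log format below is a hypothetical illustration:

```shell
state=$(mktemp)
cat > "$state" <<'EOF'
GUARDRAIL tier=1 blocked force-push to main
GUARDRAIL tier=3 layer skipped: visual regression (no baseline)
GUARDRAIL tier=3 Echo coverage gap accepted
GUARDRAIL tier=2 paused before destructive migration
EOF

# One count per tier; grep -c exits non-zero on zero matches, hence || true.
for tier in 1 2 3; do
  count=$(grep -c "tier=$tier" "$state") || true
  echo "Tier $tier events: $count"
done
```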

**Loop-breaker escalations:**
Count the number of times the 3-attempt auto-fix limit was hit and the human was asked to intervene. Read from STATE.md and the episode file.

### Step 4 — What went well

Write 3–5 bullet points drawn from the episode's actual history. Be specific — reference actual events, not generic platitudes.

Inputs to draw from:
- Beads tasks completed without blockers or loop-breakers → "Task [X] was implemented cleanly in one TDD cycle"
- Review findings that led to measurable improvements → "Phantom surfaced an injection risk in [module] that was fixed before ship"
- Test layers that caught regressions early → "Integration tests caught a broken contract between [service A] and [service B]"
- A smooth human approval gate → "PRD and design docs approved in one round without revisions"
- CI/CD or tooling that worked well → "Playwright E2E suite ran against Chromium with zero flakes"

### Step 5 — What broke or was harder than expected

Write 3–5 bullet points. Be specific and honest. Reference actual incidents from the episode.

Inputs to draw from:
- Loop-breaker escalations (3-attempt limit hits) → what the root cause turned out to be
- Approval rounds that required multiple revisions → what caused the back-and-forth
- Test layers that had failures → what the failures were
- Guardrail Tier 1 or Tier 2 events → what triggered them and how they were resolved
- Merge conflicts or CI/CD failures during Ship
- Scope that expanded beyond the PRD → what crept in and why

### Step 6 — What to improve next time

Write 2–3 actionable improvement suggestions. These must be concrete and implementable in the next cycle.

Each suggestion should follow the format:
- **What**: the specific change to make
- **Why**: what problem it solves (trace back to this cycle's friction)
- **How**: a concrete first step to implement it

Examples of the kind of specificity required:
- "Add a perf benchmark for [endpoint] to the E2E suite before next iteration, so Layer 4 has automated coverage rather than manual timing"
- "Pre-populate the CONSTITUTION.md test gates section during Init — it was left blank this cycle, which caused ambiguity during the Test sub-phase"
- "Establish a baseline visual regression screenshot set before the next frontend task, to make Layer 6 actionable rather than advisory"

### Step 7 — Tech debt log

Read `docs/pdlc/memory/DECISIONS.md` for any tech debt deferred during this cycle. Also check review files for findings marked "Defer to tech debt."

For each item:
- Name the component or module affected
- Describe the debt (what was cut, why it was deferred)
- Propose a concrete remediation approach and a suggested future episode to address it

If no tech debt was introduced this cycle, state that explicitly: "No tech debt introduced this cycle."

### Step 8 — Write the retrospective into the episode file

Append the retrospective to the active episode file at `docs/pdlc/memory/episodes/[episode-id].md` under a "Reflect Notes" section.

The section must contain all seven elements from Steps 1–7:
- Per-agent contributions
- Shipping streak
- Metrics snapshot
- What went well
- What broke / was harder than expected
- What to improve next time
- Tech debt log

### Step 9 — Update OVERVIEW.md

Read `docs/pdlc/memory/OVERVIEW.md`. This is the aggregated view of all functionality delivered across every iteration.

Add an entry for this feature cycle:
- Feature name and episode ID
- What was built (2–3 sentence summary)
- Key decisions made
- Version tag shipped (v[X.Y.Z])
- Links to the episode file and PRD

Do not overwrite previous entries. Append only.

### Step 10 — Present for human approval

Present the human with:
1. The path to the updated episode file: `docs/pdlc/memory/episodes/[episode-id].md`
2. The path to the updated OVERVIEW.md: `docs/pdlc/memory/OVERVIEW.md`
3. A brief summary: shipping streak, cycle time, total tests passed, and the top "what to improve" item.

State: "Retrospective complete. Please review the episode file and OVERVIEW.md. Approve to close the episode and commit the final state."

Wait for human approval. Do not commit until the human approves.

### Step 11 — Commit and close

After human approval:

1. Stage and commit the episode file and OVERVIEW.md:

```bash
git add docs/pdlc/memory/episodes/[episode-id].md
git add docs/pdlc/memory/OVERVIEW.md
git commit -m "reflect: [feature-name] episode [episode-id] retrospective"
```

2. Push to main.

3. Update `docs/pdlc/memory/STATE.md`:
   - Phase: Initialization (ready for next `/pdlc brainstorm`)
   - Last completed episode: [episode-id]
   - Active feature: none

4. Report to human: "Episode [episode-id] closed. Ready for the next feature. Run `/pdlc brainstorm` to begin."

---

## Rules

- The retrospective must be grounded in actual events from the episode — not generic observations. Every bullet point in "what went well" and "what broke" must be traceable to a specific event in the episode file, review files, or STATE.md.
- Shipping streak must be calculated from the full episode index, not estimated.
- Cycle time is calculated from Inception start to Ship completion — not from the first commit.
- Tech debt must be explicitly stated as either present (with specifics) or absent ("no tech debt introduced this cycle"). Silence is not acceptable.
- Do not commit the retrospective without explicit human approval.
- OVERVIEW.md is append-only. Never modify or remove previous entries.
- Improvement suggestions must be actionable: no "communicate better" or "be more careful." Each suggestion needs a concrete first step.

---

## Output

- "Reflect Notes" section appended to `docs/pdlc/memory/episodes/[episode-id].md` covering all seven elements.
- `docs/pdlc/memory/OVERVIEW.md` updated with a new entry for this feature cycle.
- Episode file and OVERVIEW.md committed to main after human approval.
- `docs/pdlc/memory/STATE.md` updated: phase reset to Initialization, active feature cleared.
- Human informed the feature cycle is closed and the system is ready for the next `/pdlc brainstorm`.
package/skills/repo-scan.md
ADDED

@@ -0,0 +1,266 @@

# Repo Scan — Brownfield Initialization

## When this skill activates

During `/pdlc init` when the repository contains existing source code (brownfield project). The goal is to deeply review the existing codebase and produce pre-populated drafts of all memory bank files, so initialization reflects reality rather than starting from blank templates.

---

## Protocol

Execute every step below in order. Do not skip any step. Collect all findings before writing any memory files.

---

### Step 1 — Map the repository structure

Run the following to understand the top-level layout:

```bash
git ls-files | head -200
```

Also run:
```bash
find . -maxdepth 3 \
  -not -path './.git/*' \
  -not -path './node_modules/*' \
  -not -path './.beads/*' \
  -not -path './docs/pdlc/*' \
  -not -name '.DS_Store' \
  | sort
```

From this output, identify:
- Primary language(s) (file extensions)
- Framework indicators (`package.json`, `Gemfile`, `pyproject.toml`, `go.mod`, `Cargo.toml`, `pom.xml`, etc.)
- Entry points (`main.*`, `index.*`, `app.*`, `server.*`, `cmd/`)
- Test directories (`__tests__/`, `spec/`, `test/`, `tests/`, `*.test.*`, `*.spec.*`)
- Config files (`.env.example`, `docker-compose.yml`, `Dockerfile`, CI/CD configs)
- Existing documentation (`README*`, `docs/` excluding `docs/pdlc/`)
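The primary-language check can be done by tallying file extensions. A sketch over a sample file list — in the repo, pipe `git ls-files` in instead:

```shell
# Sample file list (hypothetical); replace with: git ls-files
files='src/app.ts
src/db.ts
src/auth.ts
test/app.test.ts
README.md'

# Strip everything up to the last dot, then count occurrences per extension.
top=$(printf '%s\n' "$files" \
  | sed -n 's/.*\.\([A-Za-z0-9]*\)$/\1/p' \
  | sort | uniq -c | sort -rn \
  | head -1 | awk '{print $2}')
echo "Most common extension: $top"
```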

---

### Step 2 — Read key manifest and config files

Read every file from this list that exists:

- `package.json` / `package-lock.json` — for Node.js projects
- `Gemfile` — for Ruby projects
- `pyproject.toml` / `requirements.txt` / `setup.py` — for Python projects
- `go.mod` — for Go projects
- `Cargo.toml` — for Rust projects
- `pom.xml` / `build.gradle` — for Java/Kotlin projects
- `docker-compose.yml` / `Dockerfile` — for containerised stacks
- `.github/workflows/*.yml` — for CI/CD pipelines
- `README.md` / `README.rst` / `README.txt` — for existing documentation
- `.env.example` — for environment variable hints
- Any existing `ARCHITECTURE.md`, `CONTRIBUTING.md`, `DECISIONS.md`, or `ADR/` directory

Extract from these files:
- **Tech stack**: languages, frameworks, databases, cloud providers, key libraries
- **Scripts**: what `test`, `build`, `start`, `deploy` commands exist
- **Dependencies**: categorise into frontend, backend, testing, dev tools
- **Environment variables**: what external services are configured
- **CI/CD pipeline**: what stages run on merge/push
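For the environment-variable extraction, a `.env.example` usually lists one `KEY=` per line, so a grep suffices. Sketch against an inline sample file:

```shell
envfile=$(mktemp)
cat > "$envfile" <<'EOF'
# Database
DATABASE_URL=postgres://localhost/app
STRIPE_API_KEY=
# Optional monitoring
SENTRY_DSN=
EOF

# Keep only KEY=... lines and drop the values; comments are filtered out.
grep -E '^[A-Z_]+=' "$envfile" | cut -d= -f1
```

Each surviving key hints at an external service (here: a database, a payments provider, an error tracker).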

---

### Step 3 — Read entry points and core source files

Identify and read (or skim) up to 10 of the most important source files. Priority order:

1. Main entry point (`index.js`, `main.py`, `app.rb`, `main.go`, `src/main.*`, etc.)
2. Router or route definitions (`routes/`, `router.*`, `urls.py`, `routes.rb`)
3. Core models or data layer (`models/`, `schema.*`, `prisma/schema.prisma`, `db/schema.rb`)
4. Primary controllers or handlers (`controllers/`, `handlers/`, `views/`, `resolvers/`)
5. Auth layer if present (`auth.*`, `middleware/auth.*`, `lib/auth/`)
6. Existing API contract files (`openapi.yaml`, `swagger.json`, `graphql/schema.graphql`)

From these files, identify:
- **Core features**: what the application already does (be specific — list each distinct feature)
- **Data model**: main entities and their relationships
- **API surface**: existing endpoints or mutations
- **Business logic patterns**: where decisions are made, how data flows
- **Architectural style**: MVC, hexagonal, serverless functions, monolith, microservices

---

### Step 4 — Read existing tests

Find and skim up to 10 test files across different test types:

```bash
git ls-files | grep -E '\.(test|spec)\.' | head -20
git ls-files | grep -E '^(test|tests|spec|__tests__)/' | head -20
```

From test files, identify:
- What features are already covered by tests
- What testing libraries / frameworks are used
- Approximate test coverage (many tests = good coverage, few = sparse)
- Whether tests are unit, integration, or E2E style
- Any test conventions (naming, file co-location, fixtures)

---

### Step 5 — Read git history

Run the following to understand the project's timeline:

```bash
git log --oneline --no-merges -50
```

Also run:
```bash
git log --format="%ai %s" --no-merges -20
```

From the git log, identify:
- When the project started (first commit date)
- The main feature areas worked on (infer from commit messages)
- Recent areas of activity (last 10 commits)
- Any architectural pivots (e.g. "migrate to X", "replace Y with Z", "rewrite")
- Recurring contributors (for the team context)
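The start date and contributor ranking fall out of two more `git log` invocations. Shown here against captured sample output so the sketch is self-contained; in the repo, substitute the real commands noted in the comments:

```shell
# In the repo: authors=$(git log --format='%an')
authors='alice
bob
alice
alice
carol'

# Rank contributors by commit count, most frequent first.
printf '%s\n' "$authors" | sort | uniq -c | sort -rn

# In the repo: first_commit=$(git log --reverse --format='%ai' | head -1)
first_commit='2023-06-01 10:00:00 +0000'
echo "Project started: ${first_commit%% *}"   # keep only the date part
```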

---

### Step 6 — Synthesise findings

Before writing any file, compose a structured internal summary with these sections. You will use this summary to write all memory files:

```
REPO SCAN SUMMARY
=================

PROJECT
Name: [inferred from package.json/README/directory name]
Description: [1–2 sentence description of what it does]
Started: [date of first commit]
Primary language: [language]
Tech stack: [framework + DB + infra]

EXISTING FEATURES (list each concrete feature the app currently has)
1. [Feature name] — [1-sentence description]
2. ...

DATA MODEL (main entities)
- [Entity]: [fields / relationships in plain English]
- ...

API SURFACE (if applicable)
- [METHOD /path] — [what it does]
- ...

ARCHITECTURAL PATTERNS
- [Pattern observed, e.g. "MVC via Rails conventions", "Service objects for business logic"]

TEST COVERAGE
- Frameworks: [list]
- Covered: [features with tests]
- Gaps: [features with little or no test coverage]

CI/CD
- [What pipeline stages exist, what triggers them]

KEY DECISIONS (inferred from code and git history)
1. [Decision inferred, e.g. "Chose PostgreSQL over MongoDB — evidenced by ActiveRecord schemas"]
2. ...

TECH DEBT SIGNALS (code patterns suggesting debt)
- [e.g. "TODO/FIXME comments found in N files", "No tests for auth module", "Deprecated dep X"]

RECENT ACTIVITY (last 10 commits summary)
- [Area of focus, date range]
```

Print this summary to the user before proceeding, and ask: **"Does this look accurate? Any corrections before I generate the memory files?"** Wait for the user's response. Incorporate any corrections.
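One tech-debt signal from the summary — TODO/FIXME density — can be gathered mechanically. Sketch over sample files; in the repo, `git grep -lE 'TODO|FIXME'` does the same over tracked files:

```shell
src=$(mktemp -d)
printf '// TODO: handle retries\n'    > "$src/client.js"
printf '// FIXME: race on shutdown\n' > "$src/queue.js"
printf '// no markers here\n'         > "$src/util.js"

# Count files (not lines) that contain at least one marker.
count=$(grep -rlE 'TODO|FIXME' "$src" | wc -l | tr -d ' ')
echo "TODO/FIXME markers found in $count files"
```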

---

### Step 7 — Generate memory files from scan findings

Use the verified summary to write pre-populated versions of all memory files. Do not use blank template stubs — fill in real content wherever the scan produced findings.

#### `CONSTITUTION.md`

- **Tech Stack table**: fill in actual stack from scan
- **Architectural Constraints**: list observed patterns as constraints (e.g. "Service layer separates business logic from controllers — maintain this separation")
- **Coding Standards**: infer from code (e.g. linter config, consistent naming patterns found)
- **Test Gates**: check layers that already have tests; suggest enabling them
- Leave other sections as instructed defaults

#### `INTENT.md`

- **Project name**: from scan
- **Problem statement**: infer from README + features list. Write 2–4 sentences describing what problem the app solves
- **Target user**: infer from README or feature names. Mark clearly as "(inferred — please verify)"
- **Core value proposition**: draft from features and problem statement. Mark as "(inferred)"
- Leave success metrics and out-of-scope as placeholders — these need human input

#### `OVERVIEW.md`

This is the most important file to populate thoroughly:

- **Project Summary**: 2–3 sentences about what the app does today
- **Active Functionality**: list every feature identified in Step 3 as a bullet point with a 1-sentence description
- **Architecture Summary**: describe the architectural style and key layers from Step 3
- **Known Tech Debt**: list all tech debt signals from Step 6
- **Shipped Features table**: leave empty (no PDLC episodes yet, but note "Pre-PDLC functionality documented above")

#### `DECISIONS.md`

- Record each key decision from Step 6 as a lightweight ADR entry:

```markdown
## ADR-001 — [Decision title] *(pre-PDLC, inferred)*

**Date:** [inferred from git log or "unknown"]
**Status:** Accepted

**Decision:** [What was decided]

**Context:** [Why this decision makes sense given the codebase]

**Inferred from:** [git log / package.json / schema file / etc.]

---
```

Mark all pre-PDLC entries as `*(pre-PDLC, inferred)*` so the team knows these were reverse-engineered.

#### `CHANGELOG.md`

- Add a single entry for the pre-PDLC state:

```markdown
## Pre-PDLC baseline — [first commit date] to [today]

### Existing functionality
[List the features from the scan as bullet points under "Added"]

*Note: This entry documents the state of the repository before PDLC was introduced.
Future entries will be generated by Jarvis during each Ship sub-phase.*
```

#### `ROADMAP.md` and `STATE.md`

- Populate with project name but leave feature planning sections as stubs
- Set STATE.md to `Initialization Complete — Ready for /pdlc brainstorm`

---

## Rules

- **Never invent functionality** that isn't evidenced in the code, README, or git history. If you're uncertain, mark findings with "(inferred — please verify)".
- **Prefer specificity over generality**. "User authentication with JWT via Devise" is better than "authentication exists".
- **Respect existing conventions**. If the codebase uses a specific naming convention or architecture, document it in CONSTITUTION.md as a constraint to preserve.
- **Flag gaps explicitly**. If a feature exists but has no tests, say so. If there's no README, say so. If the git history is sparse, say so.
- **Mark all inferred content clearly** so the team can verify and correct.

---

## Output

Seven fully populated memory files under `docs/pdlc/memory/`, derived from real codebase analysis rather than blank templates.
package/skills/review.md
ADDED
@@ -0,0 +1,156 @@

# Multi-Agent Code Review

## When this skill activates

Activate at the start of the **Review sub-phase** of Construction, immediately after all tests for the active Beads task have passed. Do not run review before tests pass — this is a hard ordering constraint.

This skill governs one full review cycle per task. If the human requests revisions after reading the review file, re-run only the affected reviewer domains (or all, if the change is broad), regenerate the review file, and re-present for approval.

---

## Protocol

### Step 1 — Establish context

Before any reviewer begins, load the following into context:

1. The active Beads task: `bd show [task-id]` — read title, description, acceptance criteria.
2. The PRD: `docs/pdlc/prds/PRD_[feature-name]_[YYYY-MM-DD].md` — check requirements, BDD stories, non-functional requirements, out-of-scope list.
3. `docs/pdlc/memory/CONSTITUTION.md` — rules, standards, definition of done.
4. `docs/pdlc/memory/DECISIONS.md` — architectural decisions already made; any deviation is a finding.
5. The design docs at `docs/pdlc/design/[feature-name]/` — ARCHITECTURE.md, data-model.md, api-contracts.md.
6. The full diff of all files changed in this task on the feature branch.

### Step 2 — Independent reviewer passes

Each reviewer operates independently within their domain. They do not wait for others. Run all four in parallel where possible. Each reviewer produces a list of findings — each finding has a title, description, affected file/line, and a severity note (Advisory / Recommended / Important — all are soft warnings; none are hard blocks).

**Neo — Architecture & PRD conformance**

Neo checks:
- Does the implementation match the architecture described in `docs/pdlc/design/[feature-name]/ARCHITECTURE.md`? Flag any divergence.
- Does the code implement what the PRD requires, and only what the PRD requires? Flag scope creep or missing requirements.
- Are any decisions recorded in `docs/pdlc/memory/DECISIONS.md` being violated or ignored?
- Is new tech debt being introduced? If so, is it intentional and documented?
- Are cross-cutting concerns (logging, error handling, config, auth) handled consistently with the rest of the codebase?
- Are module boundaries respected? Does new code reach across layers it should not?

**Phantom — Security**

Phantom checks against the OWASP Top 10 and general security hygiene:
- Injection: Is all user input validated and sanitized before use in queries, shell commands, or rendered output?
- Broken authentication: Are auth tokens validated correctly? Are session fixation, replay, and privilege escalation risks addressed?
- Sensitive data exposure: Are secrets, tokens, or PII exposed in logs, error messages, or API responses?
- Security misconfiguration: Are default credentials used? Are error pages leaking stack traces?
- Broken access control: Does the code enforce authorization at every access point, not just at the route level?
- Trust boundaries: Is data from external sources treated as untrusted until validated?
- Dependency risk: Are any new packages introduced that have known CVEs or unusual permissions?
- Are there any hardcoded secrets, API keys, or credentials anywhere in the diff?
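The hardcoded-secret check can be seeded with a crude pattern grep over the diff — a first pass only, not a substitute for a dedicated secret scanner. Sketch against a sample diff fragment:

```shell
diff_file=$(mktemp)
cat > "$diff_file" <<'EOF'
+const apiKey = "sk_live_abc123";
+const retries = 3;
+password = "hunter2"
EOF

# Case-insensitive count of lines naming common secret-ish identifiers.
hits=$(grep -icE 'api[_-]?key|secret|password|token' "$diff_file")
echo "Suspicious lines to inspect: $hits"
```

A non-zero count only flags lines for Phantom to read; it does not prove a leak.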

**Echo — Test coverage & quality**

Echo checks:
- Does every acceptance criterion from the Beads task have at least one test?
- Are the Given/When/Then test names traceable to specific PRD user stories?
- Are edge cases covered: empty inputs, null values, boundary conditions, concurrent access, network failures?
- Are integration boundaries tested (not just mocked at every layer)?
- What is the regression risk? Have changes to shared utilities, base classes, or middleware been tested for downstream effects?
- Are there tests that are structurally present but not actually asserting meaningful behavior (i.e. tests that always pass regardless of implementation)?

**Jarvis — Documentation & API contracts**

Jarvis checks:
- Are all new public functions, methods, components, and APIs documented with inline comments or JSDoc/docstrings?
- If an API endpoint was added or changed: is `docs/pdlc/design/[feature-name]/api-contracts.md` up to date?
- Is the CHANGELOG entry for this task ready to be written? (Jarvis prepares a draft entry.)
- Are README or setup instructions impacted by this change? If so, are they updated?
- Are type signatures, return values, and error states documented accurately?

### Step 3 — Consolidate findings

After all four reviewers complete their passes:

1. Collect all findings into a single list.
2. Group by reviewer (Neo / Phantom / Echo / Jarvis).
3. Within each group, order by severity: Important → Recommended → Advisory.
4. Include the builder agent(s) who implemented the task as named participants. They do not generate separate findings but are listed as contributors for traceability.

### Step 4 — Write the review file

Write the review file to:
```
docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md
```

The file must contain:

```
# Review: [task-id] — [task title]
Date: [YYYY-MM-DD]
Feature: [feature-name]
Reviewers: Neo, Phantom, Echo, Jarvis + [builder agent name(s)]

## Summary
[2–4 sentence summary of overall code quality and readiness]

## Neo — Architecture & PRD Conformance
[Findings, or "No findings."]

## Phantom — Security
[Findings, or "No findings."]

## Echo — Test Coverage & Quality
[Findings, or "No findings."]

## Jarvis — Documentation & API Contracts
[Findings, or "No findings." + draft CHANGELOG entry]

## Consolidated Finding Count
Important: X | Recommended: Y | Advisory: Z

## Human Decision Required
For each Important or Recommended finding, list:
- Finding title
- Proposed resolution
- Options: [ ] Fix now  [ ] Accept and move on  [ ] Defer to tech debt
```

### Step 5 — Human approval gate

Present the review file path to the human. State: "Review complete. Please read `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md` and approve, or request changes."

Wait. Do not proceed to the Test sub-phase or push PR comments until the human explicitly approves.

If the human requests changes: address them, regenerate the review file, and re-present.

### Step 6 — Post-approval actions

After human approval:

1. If GitHub integration is active: push findings as PR comments via the GitHub integration. Only push findings the human has not marked "Accept and move on."
2. For any finding marked "Defer to tech debt": add an entry to `docs/pdlc/memory/DECISIONS.md` under a "Tech Debt" section with the finding, the rationale for deferral, and a suggested remediation approach.
3. Update `docs/pdlc/memory/STATE.md`: mark the review as approved for this task.
4. Proceed to the Test sub-phase.

---

## Rules

- Review runs only after all tests pass. Never before.
- All findings are soft warnings. No finding hard-blocks the build. The human decides: fix, accept, or defer.
- The human must approve the review file before PR comments are pushed. Never push PR comments automatically without approval.
- Severity labels — Important, Recommended, Advisory — are not severity scores for automation. They are signals to help the human prioritize decisions.
- The builder agent(s) are always listed in the review file as participants. This is for traceability, not blame.
- Phantom security findings marked "Accept" must be logged as Tier 3 guardrail events in `docs/pdlc/memory/STATE.md`.
- Echo test coverage gaps marked "Accept" must also be logged as Tier 3 guardrail events in `docs/pdlc/memory/STATE.md`.
- Do not re-run the full review cycle for trivial fixes (e.g. a variable rename). Use judgment: re-run only the affected reviewer domain(s) when the human requests a change.

---

## Output

- `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md` — the full review file, approved by the human.
- PR comments pushed (if GitHub integration is active) for non-accepted findings.
- Any deferred tech debt recorded in `docs/pdlc/memory/DECISIONS.md`.
- `docs/pdlc/memory/STATE.md` updated to reflect review approval.
- Task ready to proceed to the Test sub-phase.