@pdlc-os/pdlc 0.1.0

# Retrospective Protocol

## When this skill activates

Activate at the start of the **Reflect sub-phase** of Operation, after the Verify sub-phase is complete and the human has signed off on smoke tests. This skill generates the full retrospective for the completed feature cycle — from Inception through Ship.

Before starting, gather all required inputs:
- The active episode file: `docs/pdlc/memory/episodes/[episode-id].md`
- The episode index: `docs/pdlc/memory/episodes/index.md`
- The PRD: `docs/pdlc/prds/PRD_[feature-name]_[YYYY-MM-DD].md`
- All review files generated during Construction: `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md`
- `docs/pdlc/memory/STATE.md` — for guardrail event log and loop-breaker escalation count
- `docs/pdlc/memory/DECISIONS.md` — for any tech debt deferred this cycle

---

## Protocol

### Step 1 — Per-agent contributions

List every agent that participated in this feature cycle. For each agent, record:
- Their role name and display name (e.g. "Neo — Architect")
- What they contributed: specific tasks, findings surfaced, decisions influenced
- Any notable findings they raised (e.g. a Phantom security finding that led to a code change, an Echo coverage gap that revealed a missing test)

Pull this information from: the review files, the episode file, STATE.md history, and the Beads task records (`bd show [task-id]` for each completed task).

Agents to check: Neo, Echo, Phantom, Jarvis (always-on), plus any auto-selected agents (Bolt, Friday, Muse, Oracle, Pulse) that participated based on task labels.

### Step 2 — Shipping streak

1. Read `docs/pdlc/memory/episodes/index.md`.
2. Count consecutive successfully delivered features ending with the current episode (i.e. episodes where the Ship sub-phase completed without a rollback or abandon).
3. A streak is broken by: a rollback, an explicitly abandoned feature, or a feature that did not reach Ship.
4. Record the current streak count.

Display format: "Shipping streak: X consecutive features delivered"

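The count above can be scripted rather than eyeballed. A minimal sketch, assuming the index lists one episode per line, oldest first, with a status word (`shipped`, `rolled-back`, `abandoned`) as the final field (the real index layout may differ):

```bash
# shipping_streak INDEX_FILE
# Counts consecutive trailing episodes whose final field is "shipped".
# Any other status resets the count; an empty index yields 0.
shipping_streak() {
  awk '{ if ($NF == "shipped") streak++; else streak = 0 }
       END { print streak + 0 }' "$1"
}
```

For an index whose last four episodes read `shipped`, `rolled-back`, `shipped`, `shipped`, this prints `2`.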
### Step 3 — Metrics snapshot

Collect the following metrics from the episode file, review files, and STATE.md:

**Test pass rate by layer:**
Read from the Test Summary in the episode file. For each layer, record: passed / total. Compute pass rate percentage.

**Cycle time:**
- Start date: the date the Inception phase began for this feature (read from the PRD file date or STATE.md history)
- End date: the date the Ship sub-phase completed (today's date or the Ship timestamp in STATE.md)
- Cycle time = end date − start date, in calendar days

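Calendar-day arithmetic is easy to get wrong by hand. A small helper keeps it mechanical (GNU `date`, computed in UTC so daylight-saving shifts cannot skew the division):

```bash
# cycle_days START_DATE END_DATE   (ISO dates, YYYY-MM-DD)
# Prints the cycle time in whole calendar days.
cycle_days() {
  echo $(( ( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ) / 86400 ))
}
```

`cycle_days 2025-03-03 2025-03-12` prints `9`.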
**Review rounds:**
Count the number of times the Review sub-phase was run for this feature (i.e. how many times a review file was written or regenerated). Read from the review files — check for multiple files with the same task-id.

**Guardrail triggers by tier:**
Read `docs/pdlc/memory/STATE.md` for logged guardrail events.
- Tier 1 events: hard blocks that required double-RED confirmation
- Tier 2 events: pause-and-confirm events
- Tier 3 events: logged warnings (skipped layers, accepted warnings, overrides)
- Count each tier separately.

**Loop-breaker escalations:**
Count the number of times the 3-attempt auto-fix limit was hit and the human was asked to intervene. Read from STATE.md and the episode file.

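If STATE.md logs each guardrail event on its own line containing the literal tier label (an assumption; adjust the pattern to the actual log format), the per-tier counts reduce to a grep:

```bash
# tier_counts STATE_FILE
# Prints one "Tier N: count" line per tier. Assumes one event per line
# containing the literal text "Tier 1", "Tier 2", or "Tier 3".
tier_counts() {
  for tier in 1 2 3; do
    printf 'Tier %s: %s\n' "$tier" "$(grep -c "Tier $tier" "$1" || true)"
  done
}
```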
### Step 4 — What went well

Write 3–5 bullet points drawn from the episode's actual history. Be specific — reference actual events, not generic platitudes.

Inputs to draw from:
- Beads tasks completed without blockers or loop-breakers → "Task [X] was implemented cleanly in one TDD cycle"
- Review findings that led to measurable improvements → "Phantom surfaced an injection risk in [module] that was fixed before ship"
- Test layers that caught regressions early → "Integration tests caught a broken contract between [service A] and [service B]"
- A smooth human approval gate → "PRD and design docs approved in one round without revisions"
- CI/CD or tooling that worked well → "Playwright E2E suite ran against Chromium with zero flakes"

### Step 5 — What broke or was harder than expected

Write 3–5 bullet points. Be specific and honest. Reference actual incidents from the episode.

Inputs to draw from:
- Loop-breaker escalations (3-attempt limit hits) → what the root cause turned out to be
- Approval rounds that required multiple revisions → what caused the back-and-forth
- Test layers that had failures → what the failures were
- Guardrail Tier 1 or Tier 2 events → what triggered them and how they were resolved
- Merge conflicts or CI/CD failures during Ship
- Scope that expanded beyond the PRD → what crept in and why

### Step 6 — What to improve next time

Write 2–3 actionable improvement suggestions. These must be concrete and implementable in the next cycle.

Each suggestion should follow the format:
- **What**: the specific change to make
- **Why**: what problem it solves (trace back to this cycle's friction)
- **How**: a concrete first step to implement it

Examples of the kind of specificity required:
- "Add a perf benchmark for [endpoint] to the E2E suite before next iteration, so Layer 4 has automated coverage rather than manual timing"
- "Pre-populate the CONSTITUTION.md test gates section during Init — it was left blank this cycle, which caused ambiguity during the Test sub-phase"
- "Establish a baseline visual regression screenshot set before the next frontend task, to make Layer 6 actionable rather than advisory"

### Step 7 — Tech debt log

Read `docs/pdlc/memory/DECISIONS.md` for any tech debt deferred during this cycle. Also check review files for findings marked "Defer to tech debt."

For each item:
- Name the component or module affected
- Describe the debt (what was cut, why it was deferred)
- Propose a concrete remediation approach and a suggested future episode to address it

If no tech debt was introduced this cycle, state that explicitly: "No tech debt introduced this cycle."

### Step 8 — Write the retrospective into the episode file

Append the retrospective to the active episode file at `docs/pdlc/memory/episodes/[episode-id].md` under a "Reflect Notes" section.

The section must contain all seven elements from Steps 1–7:
- Per-agent contributions
- Shipping streak
- Metrics snapshot
- What went well
- What broke / was harder than expected
- What to improve next time
- Tech debt log

### Step 9 — Update OVERVIEW.md

Read `docs/pdlc/memory/OVERVIEW.md`. This is the aggregated view of all functionality delivered across every iteration.

Add an entry for this feature cycle:
- Feature name and episode ID
- What was built (2–3 sentence summary)
- Key decisions made
- Version tag shipped (v[X.Y.Z])
- Links to the episode file and PRD

Do not overwrite previous entries. Append only.

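A sketch of what an appended entry could look like — the field names mirror the list above; the exact layout is a suggestion, not a fixed format:

```markdown
## [Feature name] — episode [episode-id]

[2–3 sentence summary of what was built.]

- **Key decisions:** [decision titles, or ADR references]
- **Version shipped:** v[X.Y.Z]
- **Links:** [episode file path] · [PRD path]
```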
### Step 10 — Present for human approval

Present the human with:
1. The path to the updated episode file: `docs/pdlc/memory/episodes/[episode-id].md`
2. The path to the updated OVERVIEW.md: `docs/pdlc/memory/OVERVIEW.md`
3. A brief summary: shipping streak, cycle time, total tests passed, and the top "what to improve" item.

State: "Retrospective complete. Please review the episode file and OVERVIEW.md. Approve to close the episode and commit the final state."

Wait for human approval. Do not commit until the human approves.

### Step 11 — Commit and close

After human approval:

1. Stage and commit the episode file and OVERVIEW.md:

```bash
git add docs/pdlc/memory/episodes/[episode-id].md
git add docs/pdlc/memory/OVERVIEW.md
git commit -m "reflect: [feature-name] episode [episode-id] retrospective"
```

2. Push to main.

3. Update `docs/pdlc/memory/STATE.md`:
   - Phase: Initialization (ready for next `/pdlc brainstorm`)
   - Last completed episode: [episode-id]
   - Active feature: none

4. Report to human: "Episode [episode-id] closed. Ready for the next feature. Run `/pdlc brainstorm` to begin."

---

## Rules

- The retrospective must be grounded in actual events from the episode — not generic observations. Every bullet point in "what went well" and "what broke" must be traceable to a specific event in the episode file, review files, or STATE.md.
- Shipping streak must be calculated from the full episode index, not estimated.
- Cycle time is calculated from Inception start to Ship completion — not from the first commit.
- Tech debt must be explicitly stated as either present (with specifics) or absent ("no tech debt introduced this cycle"). Silence is not acceptable.
- Do not commit the retrospective without explicit human approval.
- OVERVIEW.md is append-only. Never modify or remove previous entries.
- Improvement suggestions must be actionable: no "communicate better" or "be more careful." Each suggestion needs a concrete first step.

---

## Output

- "Reflect Notes" section appended to `docs/pdlc/memory/episodes/[episode-id].md` covering all seven elements.
- `docs/pdlc/memory/OVERVIEW.md` updated with a new entry for this feature cycle.
- Episode file and OVERVIEW.md committed to main after human approval.
- `docs/pdlc/memory/STATE.md` updated: phase reset to Initialization, active feature cleared.
- Human informed the feature cycle is closed and the system is ready for the next `/pdlc brainstorm`.

---
# Repo Scan — Brownfield Initialization

## When this skill activates

During `/pdlc init` when the repository contains existing source code (brownfield project). The goal is to deeply review the existing codebase and produce pre-populated drafts of all memory bank files, so initialization reflects reality rather than starting from blank templates.

---

## Protocol

Execute every step below in order. Do not skip any step. Collect all findings before writing any memory files.

---

### Step 1 — Map the repository structure

Run the following to understand the top-level layout:

```bash
git ls-files | head -200
```

Also run:
```bash
find . -maxdepth 3 \
  -not -path './.git/*' \
  -not -path './node_modules/*' \
  -not -path './.beads/*' \
  -not -path './docs/pdlc/*' \
  -not -name '.DS_Store' \
  | sort
```

From this output, identify:
- Primary language(s) (file extensions)
- Framework indicators (`package.json`, `Gemfile`, `pyproject.toml`, `go.mod`, `Cargo.toml`, `pom.xml`, etc.)
- Entry points (`main.*`, `index.*`, `app.*`, `server.*`, `cmd/`)
- Test directories (`__tests__/`, `spec/`, `test/`, `tests/`, `*.test.*`, `*.spec.*`)
- Config files (`.env.example`, `docker-compose.yml`, `Dockerfile`, CI/CD configs)
- Existing documentation (`README*`, `docs/` excluding `docs/pdlc/`)

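For the primary-language question, a histogram of tracked-file extensions is usually decisive. A sketch:

```bash
# ext_histogram: count git-tracked files by extension, most common first.
ext_histogram() {
  git ls-files | grep -oE '\.[A-Za-z0-9]+$' | sort | uniq -c | sort -rn | head -10
}
```

In a repo whose tracked files are mostly `.ts` with a handful of `.md`, the top line shows the `.ts` count.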
---

### Step 2 — Read key manifest and config files

Read every file from this list that exists:

- `package.json` / `package-lock.json` — for Node.js projects
- `Gemfile` — for Ruby projects
- `pyproject.toml` / `requirements.txt` / `setup.py` — for Python projects
- `go.mod` — for Go projects
- `Cargo.toml` — for Rust projects
- `pom.xml` / `build.gradle` — for Java/Kotlin projects
- `docker-compose.yml` / `Dockerfile` — for containerised stacks
- `.github/workflows/*.yml` — for CI/CD pipelines
- `README.md` / `README.rst` / `README.txt` — for existing documentation
- `.env.example` — for environment variable hints
- Any existing `ARCHITECTURE.md`, `CONTRIBUTING.md`, `DECISIONS.md`, or `ADR/` directory

Extract from these files:
- **Tech stack**: languages, frameworks, databases, cloud providers, key libraries
- **Scripts**: what `test`, `build`, `start`, `deploy` commands exist
- **Dependencies**: categorise into frontend, backend, testing, dev tools
- **Environment variables**: what external services are configured
- **CI/CD pipeline**: what stages run on merge/push

---

### Step 3 — Read entry points and core source files

Identify and read (or skim) up to 10 of the most important source files. Priority order:

1. Main entry point (`index.js`, `main.py`, `app.rb`, `main.go`, `src/main.*`, etc.)
2. Router or route definitions (`routes/`, `router.*`, `urls.py`, `routes.rb`)
3. Core models or data layer (`models/`, `schema.*`, `prisma/schema.prisma`, `db/schema.rb`)
4. Primary controllers or handlers (`controllers/`, `handlers/`, `views/`, `resolvers/`)
5. Auth layer if present (`auth.*`, `middleware/auth.*`, `lib/auth/`)
6. Existing API contract files (`openapi.yaml`, `swagger.json`, `graphql/schema.graphql`)

From these files, identify:
- **Core features**: what the application already does (be specific — list each distinct feature)
- **Data model**: main entities and their relationships
- **API surface**: existing endpoints or mutations
- **Business logic patterns**: where decisions are made, how data flows
- **Architectural style**: MVC, hexagonal, serverless functions, monolith, microservices

---

### Step 4 — Read existing tests

Find and skim up to 10 test files across different test types:

```bash
git ls-files | grep -E '\.(test|spec)\.' | head -20
git ls-files | grep -E '^(test|tests|spec|__tests__)/' | head -20
```

From test files, identify:
- What features are already covered by tests
- What testing libraries / frameworks are used
- A rough sense of coverage (many test files suggests broad coverage; few suggests sparse)
- Whether tests are unit, integration, or E2E style
- Any test conventions (naming, file co-location, fixtures)

---

### Step 5 — Read git history

Run the following to understand the project's timeline:

```bash
git log --oneline --no-merges -50
```

Also run:
```bash
git log --format="%ai %s" --no-merges -20
```

From the git log, identify:
- When the project started (first commit date)
- The main feature areas worked on (infer from commit messages)
- Recent areas of activity (last 10 commits)
- Any architectural pivots (e.g. "migrate to X", "replace Y with Z", "rewrite")
- Recurring contributors (for the team context)

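Note that `git log -50` only reaches back 50 commits, so the project start date needs the oldest commit explicitly. A sketch for both ends of the timeline (`%as` is the short author date, available in git 2.21+):

```bash
# project_timeline: first and most recent commit dates (YYYY-MM-DD).
project_timeline() {
  echo "started: $(git log --reverse --format=%as | head -1)"
  echo "latest:  $(git log -1 --format=%as)"
}
```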
---

### Step 6 — Synthesise findings

Before writing any file, compose a structured internal summary with these sections. You will use this summary to write all memory files:

```
REPO SCAN SUMMARY
=================

PROJECT
  Name: [inferred from package.json/README/directory name]
  Description: [1–2 sentence description of what it does]
  Started: [date of first commit]
  Primary language: [language]
  Tech stack: [framework + DB + infra]

EXISTING FEATURES (list each concrete feature the app currently has)
  1. [Feature name] — [1-sentence description]
  2. ...

DATA MODEL (main entities)
  - [Entity]: [fields / relationships in plain English]
  - ...

API SURFACE (if applicable)
  - [METHOD /path] — [what it does]
  - ...

ARCHITECTURAL PATTERNS
  - [Pattern observed, e.g. "MVC via Rails conventions", "Service objects for business logic"]

TEST COVERAGE
  - Frameworks: [list]
  - Covered: [features with tests]
  - Gaps: [features with little or no test coverage]

CI/CD
  - [What pipeline stages exist, what triggers them]

KEY DECISIONS (inferred from code and git history)
  1. [Decision inferred, e.g. "Chose PostgreSQL over MongoDB — evidenced by ActiveRecord schemas"]
  2. ...

TECH DEBT SIGNALS (code patterns suggesting debt)
  - [e.g. "TODO/FIXME comments found in N files", "No tests for auth module", "Deprecated dep X"]

RECENT ACTIVITY (last 10 commits summary)
  - [Area of focus, date range]
```

Print this summary to the user before proceeding, and ask: **"Does this look accurate? Any corrections before I generate the memory files?"** Wait for the user's response. Incorporate any corrections.

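One of the tech-debt signals above, TODO/FIXME density, can be measured rather than estimated. A sketch:

```bash
# debt_signal: number of text files carrying TODO or FIXME markers.
# -R recurse, -I skip binaries, -l list matching files once each.
debt_signal() {
  grep -RIl -E 'TODO|FIXME' --exclude-dir=.git --exclude-dir=node_modules . | wc -l
}
```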
---

### Step 7 — Generate memory files from scan findings

Use the verified summary to write pre-populated versions of all memory files. Do not use blank template stubs — fill in real content wherever the scan produced findings.

#### `CONSTITUTION.md`

- **Tech Stack table**: fill in actual stack from scan
- **Architectural Constraints**: list observed patterns as constraints (e.g. "Service layer separates business logic from controllers — maintain this separation")
- **Coding Standards**: infer from code (e.g. linter config, consistent naming patterns found)
- **Test Gates**: check layers that already have tests; suggest enabling them
- Leave other sections as instructed defaults

#### `INTENT.md`

- **Project name**: from scan
- **Problem statement**: infer from README + features list. Write 2–4 sentences describing what problem the app solves
- **Target user**: infer from README or feature names. Mark clearly as "(inferred — please verify)"
- **Core value proposition**: draft from features and problem statement. Mark as "(inferred)"
- Leave success metrics and out-of-scope as placeholders — these need human input

#### `OVERVIEW.md`

This is the most important file to populate thoroughly:

- **Project Summary**: 2–3 sentences about what the app does today
- **Active Functionality**: list every feature identified in Step 3 as a bullet point with a 1-sentence description
- **Architecture Summary**: describe the architectural style and key layers from Step 3
- **Known Tech Debt**: list all tech debt signals from Step 6
- **Shipped Features table**: leave empty (there are no PDLC episodes yet), but add the note "Pre-PDLC functionality documented above"

#### `DECISIONS.md`

- Record each key decision from Step 6 as a lightweight ADR entry:

```markdown
## ADR-001 — [Decision title] *(pre-PDLC, inferred)*

**Date:** [inferred from git log or "unknown"]
**Status:** Accepted

**Decision:** [What was decided]

**Context:** [Why this decision makes sense given the codebase]

**Inferred from:** [git log / package.json / schema file / etc.]

---
```

Mark all pre-PDLC entries as `*(pre-PDLC, inferred)*` so the team knows these were reverse-engineered.

#### `CHANGELOG.md`

- Add a single entry for the pre-PDLC state:

```markdown
## Pre-PDLC baseline — [first commit date] to [today]

### Existing functionality
[List the features from the scan as bullet points]

*Note: This entry documents the state of the repository before PDLC was introduced.
Future entries will be generated by Jarvis during each Ship sub-phase.*
```

#### `ROADMAP.md` and `STATE.md`

- Populate with the project name but leave feature planning sections as stubs
- Set STATE.md to `Initialization Complete — Ready for /pdlc brainstorm`

---

## Rules

- **Never invent functionality** that isn't evidenced in the code, README, or git history. If you're uncertain, mark findings with "(inferred — please verify)".
- **Prefer specificity over generality**. "User authentication with JWT via Devise" is better than "authentication exists".
- **Respect existing conventions**. If the codebase uses a specific naming convention or architecture, document it in CONSTITUTION.md as a constraint to preserve.
- **Flag gaps explicitly**. If a feature exists but has no tests, say so. If there's no README, say so. If the git history is sparse, say so.
- **Mark all inferred content clearly** so the team can verify and correct.

---

## Output

Seven fully populated memory files under `docs/pdlc/memory/`, derived from real codebase analysis rather than blank templates: CONSTITUTION.md, INTENT.md, OVERVIEW.md, DECISIONS.md, CHANGELOG.md, ROADMAP.md, and STATE.md.

---
# Multi-Agent Code Review

## When this skill activates

Activate at the start of the **Review sub-phase** of Construction, immediately after all tests for the active Beads task have passed. Do not run review before tests pass — this is a hard ordering constraint.

This skill governs one full review cycle per task. If the human requests revisions after reading the review file, re-run only the affected reviewer domains (or all, if the change is broad), regenerate the review file, and re-present for approval.

---

## Protocol

### Step 1 — Establish context

Before any reviewer begins, load the following into context:

1. The active Beads task: `bd show [task-id]` — read the title, description, and acceptance criteria.
2. The PRD: `docs/pdlc/prds/PRD_[feature-name]_[YYYY-MM-DD].md` — check the requirements, BDD stories, non-functional requirements, and out-of-scope list.
3. `docs/pdlc/memory/CONSTITUTION.md` — rules, standards, definition of done.
4. `docs/pdlc/memory/DECISIONS.md` — architectural decisions already made; any deviation is a finding.
5. The design docs at `docs/pdlc/design/[feature-name]/` — ARCHITECTURE.md, data-model.md, api-contracts.md.
6. The full diff of all files changed in this task on the feature branch.

### Step 2 — Independent reviewer passes

Each reviewer operates independently within their domain. They do not wait for others. Run all four in parallel where possible. Each reviewer produces a list of findings; each finding has a title, a description, the affected file/line, and a severity note (Advisory / Recommended / Important — all are soft warnings; none are hard blocks).

**Neo — Architecture & PRD conformance**

Neo checks:
- Does the implementation match the architecture described in `docs/pdlc/design/[feature-name]/ARCHITECTURE.md`? Flag any divergence.
- Does the code implement what the PRD requires, and only what the PRD requires? Flag scope creep or missing requirements.
- Are any decisions recorded in `docs/pdlc/memory/DECISIONS.md` being violated or ignored?
- Is new tech debt being introduced? If so, is it intentional and documented?
- Are cross-cutting concerns (logging, error handling, config, auth) handled consistently with the rest of the codebase?
- Are module boundaries respected? Does new code reach across layers it should not?

**Phantom — Security**

Phantom checks against the OWASP Top 10 and general security hygiene:
- Injection: Is all user input validated and sanitized before use in queries, shell commands, or rendered output?
- Broken authentication: Are auth tokens validated correctly? Are session fixation, replay, and privilege escalation risks addressed?
- Sensitive data exposure: Are secrets, tokens, or PII exposed in logs, error messages, or API responses?
- Security misconfiguration: Are default credentials used? Are error pages leaking stack traces?
- Broken access control: Does the code enforce authorization at every access point, not just at the route level?
- Trust boundaries: Is data from external sources treated as untrusted until validated?
- Dependency risk: Are any new packages introduced that have known CVEs or unusual permissions?
- Are there any hardcoded secrets, API keys, or credentials anywhere in the diff?

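For the last check, a mechanical first pass can narrow the search before the human review. An illustrative pattern only: it will miss plenty and flag some noise, and is no substitute for a dedicated secret scanner:

```bash
# scan_secrets FILE...
# Flags lines that look like hardcoded credentials. The name list and
# pattern are illustrative, not exhaustive.
scan_secrets() {
  grep -nEi '(api[_-]?key|secret|password|token)[[:space:]]*[:=]' "$@"
}
```

Pass `-` as the filename to scan the feature-branch diff from stdin.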
**Echo — Test coverage & quality**

Echo checks:
- Does every acceptance criterion from the Beads task have at least one test?
- Are the Given/When/Then test names traceable to specific PRD user stories?
- Are edge cases covered: empty inputs, null values, boundary conditions, concurrent access, network failures?
- Are integration boundaries tested (not just mocked at every layer)?
- What is the regression risk? Have changes to shared utilities, base classes, or middleware been tested for downstream effects?
- Are there tests that are structurally present but not actually asserting meaningful behavior (i.e. tests that always pass regardless of implementation)?

**Jarvis — Documentation & API contracts**

Jarvis checks:
- Are all new public functions, methods, components, and APIs documented with inline comments or JSDoc/docstrings?
- If an API endpoint was added or changed: is `docs/pdlc/design/[feature-name]/api-contracts.md` up to date?
- Is the CHANGELOG entry for this task ready to be written? (Jarvis prepares a draft entry.)
- Are README or setup instructions impacted by this change? If so, are they updated?
- Are type signatures, return values, and error states documented accurately?

### Step 3 — Consolidate findings

After all four reviewers complete their passes:

1. Collect all findings into a single list.
2. Group by reviewer (Neo / Phantom / Echo / Jarvis).
3. Within each group, order by severity: Important → Recommended → Advisory.
4. Include the builder agent(s) who implemented the task as named participants. They do not generate separate findings but are listed as contributors for traceability.

### Step 4 — Write the review file

Write the review file to:
```
docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md
```

The file must contain:

```
# Review: [task-id] — [task title]
Date: [YYYY-MM-DD]
Feature: [feature-name]
Reviewers: Neo, Phantom, Echo, Jarvis + [builder agent name(s)]

## Summary
[2–4 sentence summary of overall code quality and readiness]

## Neo — Architecture & PRD Conformance
[Findings, or "No findings."]

## Phantom — Security
[Findings, or "No findings."]

## Echo — Test Coverage & Quality
[Findings, or "No findings."]

## Jarvis — Documentation & API Contracts
[Findings, or "No findings." + draft CHANGELOG entry]

## Consolidated Finding Count
Important: X | Recommended: Y | Advisory: Z

## Human Decision Required
For each Important or Recommended finding, list:
- Finding title
- Proposed resolution
- Options: [ ] Fix now [ ] Accept and move on [ ] Defer to tech debt
```

### Step 5 — Human approval gate

Present the review file path to the human. State: "Review complete. Please read `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md` and approve, or request changes."

Wait. Do not proceed to the Test sub-phase or push PR comments until the human explicitly approves.

If the human requests changes: address them, regenerate the review file, and re-present.

### Step 6 — Post-approval actions

After human approval:

1. If GitHub integration is active: push findings as PR comments via the GitHub integration. Only push findings the human has not marked "Accept and move on."
2. For any finding marked "Defer to tech debt": add an entry to `docs/pdlc/memory/DECISIONS.md` under a "Tech Debt" section with the finding, the rationale for deferral, and a suggested remediation approach.
3. Update `docs/pdlc/memory/STATE.md`: mark the review as approved for this task.
4. Proceed to the Test sub-phase.

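For action 2, a deferred-finding entry might look like the sketch below. The "Tech Debt" section name comes from the step above; the field layout is a suggestion, not a fixed format:

```markdown
### [Finding title] *(deferred [YYYY-MM-DD])*

**Source:** `REVIEW_[task-id]_[YYYY-MM-DD].md` ([reviewer])
**Finding:** [what was flagged, and where]
**Rationale for deferral:** [why it was accepted as debt rather than fixed now]
**Suggested remediation:** [concrete approach and when to revisit]
```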
---

## Rules

- Review runs only after all tests pass. Never before.
- All findings are soft warnings. No finding hard-blocks the build. The human decides: fix, accept, or defer.
- The human must approve the review file before PR comments are pushed. Never push PR comments automatically without approval.
- Severity labels — Important, Recommended, Advisory — are not severity scores for automation. They are signals that help the human prioritize decisions.
- The builder agent(s) are always listed in the review file as participants. This is for traceability, not blame.
- Phantom security findings marked "Accept" must be logged as Tier 3 guardrail events in `docs/pdlc/memory/STATE.md`.
- Echo test coverage gaps marked "Accept" must also be logged as Tier 3 guardrail events in `docs/pdlc/memory/STATE.md`.
- Do not re-run the full review cycle for trivial fixes (e.g. a variable rename). Use judgment: re-run only the affected reviewer domain(s) when the human requests a change.

---

## Output

- `docs/pdlc/reviews/REVIEW_[task-id]_[YYYY-MM-DD].md` — the full review file, approved by the human.
- PR comments pushed (if GitHub integration is active) for non-accepted findings.
- Any deferred tech debt recorded in `docs/pdlc/memory/DECISIONS.md`.
- `docs/pdlc/memory/STATE.md` updated to reflect review approval.
- Task ready to proceed to the Test sub-phase.