waypoint-codex 0.8.0 → 0.9.0 (npm)


Files changed (30)

package/templates/.agents/skills/backend-ship-audit/references/report-template.md ADDED

@@ -0,0 +1,92 @@
+# Backend audit report template
+Use this template for `.waypoint/audit/dd-mm-yyyy-hh-mm-backend-audit.md`.
+```markdown
+# Backend Ship-Readiness Audit: <scope name>
+- Timestamp: <dd-mm-yyyy hh:mm>
+- Requested scope: <user request>
+- Assumed audit scope: <narrowed reviewable unit>
+- Ship recommendation: <Ready to ship | Ready to ship with explicit risk acceptance | Not ready to ship>
+## Scope
+### In scope
+- <paths, services, APIs, workers, migrations, docs>
+### Adjacent dependencies and boundaries
+- <datastores, queues, auth layers, partner APIs, shared libraries>
+### Out of scope
+- <explicit exclusions>
+## What was read
+- `<path>`: <why it mattered>
+- `<path>`: <why it mattered>
+## Open questions
+- <only unresolved questions that materially affect readiness>
+## Assumptions used in this audit
+- <assumption>
+- <assumption>
+## System understanding
+Provide a concise explanation of the scoped backend:
+- primary entry points
+- data flow
+- trust boundaries
+- transaction and async boundaries
+- external dependencies
+- operational controls
+## Priority summary
+- P0: <count>
+- P1: <count>
+- P2: <count>
+- P3: <count>
+- P4: <count>
+## Findings
+### BA-001: <title>
+- Priority: <P0-P4>
+- Why it matters: <plain-language impact>
+- Evidence:
+  - `<path>:<line-range>` <concise fact>
+  - `<path>:<line-range>` <concise fact>
+- Affected area: <service, endpoint, worker, migration, table, client>
+- Risk if shipped as-is: <practical release risk>
+- Recommended fix: <specific fix or mitigation>
+- Confidence: <High | Medium | Low>
+### BA-002: <title>
+- Priority: <P0-P4>
+- Why it matters: <plain-language impact>
+- Evidence:
+  - `<path>:<line-range>` <concise fact>
+- Affected area: <service, endpoint, worker, migration, table, client>
+- Risk if shipped as-is: <practical release risk>
+- Recommended fix: <specific fix or mitigation>
+- Confidence: <High | Medium | Low>
+## Release conditions / next actions
+List only the conditions that matter for shipment.
+1. <required fix, mitigation, or explicit risk acceptance>
+2. <required fix, mitigation, or explicit risk acceptance>
+## Notes
+Include only brief context that materially helps a future reviewer.
+```
+Guidance:
+- Keep the summary short.
+- Prefer fewer findings with stronger evidence.
+- Include no finding that is unsupported by the repository or an explicit unanswered question.
+- Use stable IDs in the form `BA-001`, `BA-002`, and so on.
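The template above pairs a `## Priority summary` block with `### BA-NNN` findings that each carry a `- Priority:` bullet. As a minimal sketch of how the summary counts could be derived from the findings rather than maintained by hand, the following assumes the heading and bullet shapes shown in the template. The function name `summarize_priorities` and the sample finding titles are hypothetical and not part of the package.

```python
import re
from collections import Counter

# Shapes taken from the template: "### BA-001: <title>" and "- Priority: <P0-P4>".
FINDING_HEADING = re.compile(r"^### BA-\d{3}: .+$")
PRIORITY_LINE = re.compile(r"^- Priority: (P[0-4])$")


def summarize_priorities(report: str) -> dict[str, int]:
    """Count findings per priority level, always returning keys P0 through P4."""
    counts: Counter[str] = Counter()
    in_finding = False
    for line in report.splitlines():
        if FINDING_HEADING.match(line):
            in_finding = True
            continue
        priority = PRIORITY_LINE.match(line)
        if priority and in_finding:
            counts[priority.group(1)] += 1
            in_finding = False  # count each finding only once
    return {f"P{n}": counts.get(f"P{n}", 0) for n in range(5)}


# Made-up sample report fragment, for illustration only.
SAMPLE = """\
## Findings
### BA-001: Payment retry lacks an idempotency key
- Priority: P1
### BA-002: Queue consumer has no backpressure
- Priority: P3
"""

if __name__ == "__main__":
    print(summarize_priorities(SAMPLE))
```

A priority bullet that appears before any finding heading is ignored, so stray bullets elsewhere in the report do not inflate the counts.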

package/templates/.agents/skills/frontend-context-interview/SKILL.md ADDED

@@ -0,0 +1,60 @@
+---
+name: frontend-context-interview
+description: Gather and persist durable frontend project context when missing or insufficient for implementation or review work. Use when frontend decisions depend on product type, audience, support matrix, accessibility, SEO, localization, design-system constraints, or similar context that is not clearly documented.
+---
+# Frontend Context Interview
+Use this skill when relevant frontend context is missing, stale, contradictory, or too weak to support correct implementation or review decisions.
+## Goals
+1. identify the missing frontend context that materially affects the work
+2. ask only high-leverage questions that cannot be answered from the repo or guidance files
+3. persist durable context into the project root guidance file
+4. avoid repeated questioning in future tasks
+## When to use
+Use this skill when the current task depends on context such as:
+- internal tool vs customer-facing product vs public marketing site
+- expected scale or traffic patterns
+- browser and device support requirements
+- accessibility targets
+- SEO requirements
+- localization or internationalization requirements
+- analytics or experimentation requirements
+- design-system or branding constraints
+- auth or role-based UI expectations
+- security or privacy expectations that change frontend behavior
+Do not use this skill when the answer is already clearly present in `AGENTS.md`, product docs, or the task itself.
+## Workflow
+### 1. Check persisted context first
+Inspect the project root guidance files.
+Priority:
+1. `AGENTS.md`
+Look for:
+- `## Project Context`
+- `## Frontend Context`
+- equivalent sections with the same intent
+If the existing section is accurate and sufficient, do not interview the user.
+### 2. Determine what is actually missing
+Only ask questions that materially affect implementation or review choices.
+Good triggers:
+- the right browser support changes implementation or QA expectations
+- accessibility bar changes component and interaction requirements
+- public marketing surface vs internal tool changes polish, SEO, and content expectations
+- localization changes copy, layout, and component design
+Do not ask broad or low-value questions.
+Do not ask generic discovery questions that do not affect implementation.
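Step 1 of the workflow above ("Check persisted context first") can be sketched as a heading lookup in the root `AGENTS.md`. This is only an illustration: the helper names are hypothetical, and the check covers only the two exact headings named in the skill. It cannot recognize "equivalent sections with the same intent" or judge whether an existing section is accurate and sufficient, both of which still need a reader.

```python
from pathlib import Path

# The two headings the skill names explicitly.
CONTEXT_HEADINGS = ("## Project Context", "## Frontend Context")


def find_context_sections(guidance_text: str) -> list[str]:
    """Return the known context headings that appear as whole lines in the text."""
    lines = {line.strip() for line in guidance_text.splitlines()}
    return [heading for heading in CONTEXT_HEADINGS if heading in lines]


def has_persisted_context(project_root: Path) -> bool:
    """True when AGENTS.md exists in the root and contains a known context heading."""
    guidance = project_root / "AGENTS.md"
    if not guidance.is_file():
        return False
    return bool(find_context_sections(guidance.read_text(encoding="utf-8")))
```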

package/templates/.agents/skills/frontend-ship-audit/SKILL.md ADDED

@@ -0,0 +1,87 @@
+---
+name: frontend-ship-audit
+description: Audit a defined frontend scope for ship-readiness with a strong focus on real product risk, user-facing correctness, and evidence from the repository. Use when Codex needs to review an app, route group, feature, page set, component area, PR, or frontend directory to decide whether it is ready to ship; resolve the actual reviewable frontend scope from the user request and repository structure; read all relevant frontend code and docs completely; ask a concise high-leverage interview only for missing context that materially changes the release bar; persist durable frontend deployment context into the project root AGENTS.md under a Frontend Context section; and write an evidence-based audit with prioritized P0-P4 findings at .waypoint/audit/dd-mm-yyyy-hh-mm-frontend-audit.md.
+---
+Audit ship-readiness like a strong frontend reviewer. Optimize for user impact, release risk, and production correctness. Do not optimize for style policing.
+Use this workflow:
+1. Resolve the scope.
+   - Infer the most defensible reviewable unit from the user request and repository structure.
+   - State the assumed scope when the request is broad or ambiguous.
+   - List what is directly in scope, what important dependencies matter, and what is explicitly out of scope.
+   - Include dependent APIs, design systems, auth flows, platform constraints, analytics, SEO, localization, and accessibility requirements when they materially affect the scoped experience.
+2. Build repository understanding before judging readiness.
+   - Read the project root guidance file first: `AGENTS.md`.
+   - Read package manifests, router entry points, route definitions, layouts, pages, screens, composition layers, state containers, API clients, validation logic, design-system primitives, styling and theming files, accessibility helpers, tests, specs, design docs, runbooks, and architecture docs when they matter to the scoped frontend.
+   - Read complete files for all relevant materials. Do not rely on grep hits, single matched lines, or truncated snippets for anything that informs architecture or a finding.
+   - Ignore clearly irrelevant material such as vendored dependencies, generated outputs, caches, and unrelated subsystems.
+3. Model the real user experience.
+   - Trace primary and secondary user journeys across entry points, route transitions, loading states, empty states, errors, retries, mutations, auth boundaries, success states, and exits.
+   - Identify frontend boundary assumptions: API contracts, feature flags, experiments, permissions, browser support, device classes, SEO rules, localization rules, analytics expectations, and privacy constraints.
+   - Distinguish proven behavior from assumed behavior.
+4. Ask only the questions that materially change the release bar.
+   - Ask the interview after repository exploration.
+   - Group questions by topic.
+   - Keep them concise and high leverage.
+   - Skip questions that the codebase or docs already answer.
+   - If answers are unavailable, proceed with explicit assumptions and label them clearly in the audit.
+5. Persist durable frontend context.
+   - Prefer the project root `AGENTS.md`.
+   - If it does not exist, do not create a new guidance file unless the user explicitly asks.
+   - Update an existing `## Frontend Context` section when present.
+   - Otherwise add a new `## Frontend Context` section.
+   - Preserve surrounding content exactly.
+   - Do not overwrite unrelated sections.
+   - Do not duplicate existing context.
+   - Do not persist transient findings.
+   - Persist only stable deployment context and durable product constraints such as audience, browser support, device classes, accessibility targets, performance expectations, SEO expectations, localization requirements, analytics obligations, design-system constraints, auth expectations, and privacy or security expectations.
+   - Make this edit manually and preserve surrounding content exactly.
+   - Read `references/guidance-file-updates.md` before editing the guidance file.
+6. Produce the audit.
+   - Write the audit to `.waypoint/audit/dd-mm-yyyy-hh-mm-frontend-audit.md`.
+   - Create directories if needed.
+   - Use current local execution time for the timestamp unless the project or task specifies a different timezone convention.
+   - Use `scripts/create_frontend_audit.py` to create the timestamped audit file path and scaffold when helpful.
+   - Read `references/report-template.md` before writing the final report.
+7. Evaluate with practical release judgment.
+   - Judge the scoped frontend across architecture fit, boundary clarity, user journey completeness, loading and failure handling, form correctness, validation, API integration robustness, state management correctness, rendering behavior, responsiveness, accessibility, focus management, keyboard support, visual consistency, design-system usage, interaction quality, auth and authorization exposure, client-side security and privacy, performance risks, hydration risks, SEO and metadata correctness, analytics correctness, observability, future legibility, and cross-browser or cross-device risk when relevant.
+   - Do not limit the audit to that list. Apply specialist judgment.
+   - Read `references/review-framework.md` for the detailed audit lenses.
+8. Keep findings evidence-based and severity-calibrated.
+   - Do not include stylistic preferences, generic best-practice commentary, or trivial refactors without ship impact.
+   - Tie every finding to repository evidence.
+   - Use the smallest severity that honestly reflects the risk.
+   - Mark confidence when evidence is incomplete.
+Use this priority model consistently:
+- P0: clear ship blocker; likely severe production breakage, critical accessibility or security failure, or fundamentally unsafe release
+- P1: serious issue that should usually be fixed before shipping; substantial user, reliability, accessibility, security, or operational risk
+- P2: important issue that may be acceptable only with conscious acceptance of risk; not an immediate blocker in all contexts
+- P3: moderate weakness or gap; should be addressed soon but not necessarily before launch
+- P4: minor improvement with limited near-term impact
+Every finding must include:
+- ID
+- title
+- priority
+- why it matters
+- evidence
+- affected area
+- risk if shipped as-is
+- recommended fix
+- confidence level if evidence is incomplete
+When evidence is partial:
+- say what you verified
+- say what remains assumed
+- lower confidence instead of overstating certainty
+- ask only the missing questions that would change the release decision
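The required finding fields and the P0-P4 model listed above map naturally onto a small record type. The sketch below is one possible encoding, not something the package ships: the `Finding` class, `validate`, and `sort_by_priority` are hypothetical names, and the `High` / `Medium` / `Low` confidence values are borrowed from the report templates elsewhere in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, fields

PRIORITIES = ("P0", "P1", "P2", "P3", "P4")
CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass(frozen=True)
class Finding:
    id: str
    title: str
    priority: str
    why_it_matters: str
    evidence: tuple[str, ...]
    affected_area: str
    risk_if_shipped: str
    recommended_fix: str
    # The skill asks for confidence only when evidence is incomplete.
    confidence: str | None = None


def validate(finding: Finding) -> list[str]:
    """Return a list of problems; an empty list means the finding is well formed."""
    problems = []
    if finding.priority not in PRIORITIES:
        problems.append(f"unknown priority: {finding.priority}")
    if finding.confidence is not None and finding.confidence not in CONFIDENCE_LEVELS:
        problems.append(f"unknown confidence: {finding.confidence}")
    if not finding.evidence:
        problems.append("no repository evidence cited")
    for field in fields(finding):
        value = getattr(finding, field.name)
        if isinstance(value, str) and not value.strip():
            problems.append(f"empty field: {field.name}")
    return problems


def sort_by_priority(findings: list[Finding]) -> list[Finding]:
    """Order findings from P0 to P4, keeping the input order within a level."""
    return sorted(findings, key=lambda f: PRIORITIES.index(f.priority))
```

`sort_by_priority` assumes every finding has already passed `validate`; an unknown priority would raise `ValueError` from `PRIORITIES.index`.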

package/templates/.agents/skills/frontend-ship-audit/agents/openai.yaml ADDED

@@ -0,0 +1,3 @@
+display_name: Frontend Ship Audit
+short_description: Audit a scoped frontend surface for ship-readiness with evidence-based findings and durable deployment context.
+default_prompt: Audit the ship-readiness of the requested frontend scope. Resolve the reviewable unit from the repo, read all relevant frontend files completely, ask only missing high-leverage questions, persist durable Frontend Context in the project root guidance file when present, and write a prioritized audit at .waypoint/audit/dd-mm-yyyy-hh-mm-frontend-audit.md.

package/templates/.agents/skills/frontend-ship-audit/references/guidance-file-updates.md ADDED

@@ -0,0 +1,57 @@
+# Frontend Context guidance file update rules
+Use these rules when persisting durable frontend context.
+## File selection
+1. Prefer `AGENTS.md` in the project root.
+2. If it does not exist, do not create a new guidance file unless the user explicitly asked for it.
+## Section rules
+- Target the exact heading `## Frontend Context`.
+- Update the existing section when present.
+- Otherwise append a new `## Frontend Context` section.
+- Preserve all surrounding content exactly.
+- Do not alter unrelated sections.
+- Do not duplicate facts that already exist in accurate form.
+- Do not persist audit findings, one-off bugs, or transient release notes.
+## Good content for this section
+Persist stable context such as:
+- deployment surface and audience
+- internal, partner, customer-facing, or public marketing classification
+- required browsers and device classes
+- accessibility target or compliance expectation
+- performance budget or latency expectation
+- SEO requirements
+- localization requirements
+- analytics or experimentation obligations
+- design-system or brand constraints
+- auth and role-based UI expectations
+- privacy and client-side security expectations
+## Bad content for this section
+Do not add:
+- current audit findings
+- temporary workarounds
+- one-time release decisions
+- generic engineering principles unrelated to the frontend deployment context
+## Suggested format
+Use concise bullets under `## Frontend Context`. Prefer facts and defaults over prose.
+Example:
+## Frontend Context
+- Surface: Public customer-facing web app.
+- Devices: Mobile and desktop must both work.
+- Browser support: Latest Chrome, Safari, Firefox, and Edge.
+- Accessibility: Keyboard-accessible flows and screen-reader-compatible forms are required.
+- Performance: Primary routes should remain responsive on mid-range mobile devices.
+- SEO: Product and marketing routes require accurate metadata and indexable content.
+- Localization: English only for now.
+- Analytics: Core conversion events must remain instrumented.

package/templates/.agents/skills/frontend-ship-audit/references/report-template.md ADDED

@@ -0,0 +1,51 @@
+# Frontend audit report template
+Use this structure for `.waypoint/audit/dd-mm-yyyy-hh-mm-frontend-audit.md`.
+# Frontend Ship-Readiness Audit
+Generated: DD-MM-YYYY HH:MM
+## Scope
+- Requested scope:
+- Assumed reviewable unit:
+- In scope:
+- Important dependencies:
+- Explicitly out of scope:
+## Deployment Context
+- Established context:
+- Missing context that affects the bar:
+- Assumptions used for this audit:
+## Repository Coverage
+- Files and docs read completely:
+- Areas intentionally skipped as irrelevant:
+## Summary
+- Verdict: Ready / Ready with accepted risk / Not ready
+- Highest-risk themes:
+- What would need to change before shipping, if not ready:
+## Findings
+### F-001: Title
+- Priority: P1
+- Why it matters:
+- Evidence:
+- Affected area:
+- Risk if shipped as-is:
+- Recommended fix:
+- Confidence: High / Medium / Low
+Repeat for each finding in priority order.
+## Positive evidence
+- Note behaviors that reduce release risk when they are directly supported by repository evidence.
+## Open questions
+- List only unanswered questions that would materially change the release decision.
+## Release recommendation
+- State the release recommendation in one concise paragraph.
+- If the scope can ship only with accepted risk, name the exact accepted risks.

package/templates/.agents/skills/frontend-ship-audit/references/review-framework.md ADDED

@@ -0,0 +1,83 @@
+# Frontend ship-readiness review framework
+Use these lenses to decide whether the scoped frontend is safe and complete enough to ship.
+## Reading order
+1. Read root guidance and product docs.
+2. Read route entry points, layouts, and page or screen composition.
+3. Read state, data fetching, API clients, validation, and mutation paths.
+4. Read design-system, styling, accessibility, metadata, analytics, and auth helpers that affect the scope.
+5. Read tests that exercise the scoped behavior.
+6. Read adjacent docs or runbooks when they explain production expectations.
+## Core evaluation lenses
+### Scope and architecture fit
+- Check whether the implementation boundary matches the requested surface.
+- Check whether route, page, and component boundaries are legible and coherent.
+- Check whether critical behavior is spread across too many layers or hidden behind implicit defaults.
+### User journey completeness
+- Trace main happy paths, edge paths, entry paths, exits, and return paths.
+- Check loading, empty, error, retry, disabled, unauthorized, and success states.
+- Check whether failures are visible and actionable rather than silent.
+### Boundary correctness
+- Verify request and response assumptions at frontend boundaries.
+- Check parsing, validation, null handling, optimistic updates, stale data handling, retries, and race conditions.
+- Check whether the UI assumes fields, permissions, or states that the backend does not guarantee.
+### State and rendering correctness
+- Check whether state ownership is clear.
+- Check whether derived state duplicates server state or causes drift.
+- Check whether effects, memoization, and conditional rendering create stale UI, loops, or hydration mismatches.
+### Forms and input correctness
+- Verify validation rules, error surfacing, submission gating, retry behavior, and server error handling.
+- Check whether defaults, formatting, and field constraints match product expectations.
+### Responsiveness and device fit
+- Check whether layouts, interactions, and content density hold across required breakpoints and device classes.
+- Check tap target sizes, overflow, sticky elements, keyboard overlap, and modal behavior on smaller screens.
+### Accessibility
+- Check semantic structure, labels, accessible names, focus order, focus visibility, keyboard access, screen-reader announcements, and dialog or popover behavior.
+- Check whether error messaging and status changes are perceivable.
+- Treat critical accessibility failures as real ship risk.
+### Visual and interaction quality
+- Check whether design-system primitives are used consistently where required.
+- Check whether states provide clear feedback and whether destructive or irreversible actions are appropriately signaled.
+- Do not flag visual issues that are purely stylistic unless they affect usability, consistency, or release confidence.
+### Auth, authorization, security, and privacy
+- Check whether privileged UI states are exposed to the wrong roles.
+- Check whether secrets, tokens, PII, or internal data are exposed in client code, storage, logs, analytics payloads, or rendered markup.
+- Check whether client-side behavior could mislead users about authorization.
+### Performance and delivery risk
+- Check route-level loading strategy, bundle pressure, unnecessary client rendering, hydration risk, redundant requests, and expensive re-renders where relevant.
+- Check whether performance expectations or budgets are violated for the target surface.
+### SEO, metadata, analytics, and observability
+- Check metadata, canonical handling, structured data, crawlability, and rendering mode when the scope is indexable.
+- Check event wiring, experiment exposure, and required tracking for key journeys.
+- Check whether failures have enough logging or observability to support release confidence when relevant.
+### Maintainability as ship risk
+- Flag overengineering, underengineering, hidden coupling, or silent fallback behavior when they create near-term release risk.
+- Ignore refactor ideas that do not materially affect shipping confidence.
+## Risk heuristics
+Raise priority when the issue is likely to:
+- break the primary journey
+- mis-handle auth or roles
+- hide errors or create silent failure
+- expose private or unsafe data
+- strand keyboard or screen-reader users
+- fail on a required browser or device class
+- create high-probability production regressions due to unclear ownership or boundary assumptions
+Lower priority when the issue is isolated, recoverable, obvious to users, or only affects non-critical polish.

package/templates/.agents/skills/frontend-ship-audit/scripts/create_frontend_audit.py ADDED

@@ -0,0 +1,81 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+import argparse
+from datetime import datetime
+from pathlib import Path
+import sys
+TEMPLATE = """# Frontend Ship-Readiness Audit
+Generated: {generated}
+## Scope
+- Requested scope: {requested_scope}
+- Assumed reviewable unit:
+- In scope:
+- Important dependencies:
+- Explicitly out of scope:
+## Deployment Context
+- Established context:
+- Missing context that affects the bar:
+- Assumptions used for this audit:
+## Repository Coverage
+- Files and docs read completely:
+- Areas intentionally skipped as irrelevant:
+## Summary
+- Verdict: Ready / Ready with accepted risk / Not ready
+- Highest-risk themes:
+- What would need to change before shipping, if not ready:
+## Findings
+## Positive evidence
+## Open questions
+## Release recommendation
+"""
+def main() -> int:
+    parser = argparse.ArgumentParser(description="Create a timestamped frontend audit file in .waypoint/audit.")
+    parser.add_argument("--project-root", default=".", help="Path to the repository root.")
+    parser.add_argument("--requested-scope", default="", help="Original requested scope for the audit.")
+    parser.add_argument("--timestamp", help="Override timestamp in dd-mm-yyyy-hh-mm format.")
+    parser.add_argument("--stdout-path-only", action="store_true", help="Print the output path without creating the file.")
+    parser.add_argument("--force", action="store_true", help="Overwrite the file if it already exists.")
+    args = parser.parse_args()
+    project_root = Path(args.project_root).resolve()
+    if args.timestamp:
+        stamp = args.timestamp
+        if len(stamp) == 16:
+            generated = f"{stamp[:10]} {stamp[11:13]}:{stamp[14:16]}"
+        else:
+            generated = stamp
+    else:
+        now = datetime.now()
+        stamp = now.strftime("%d-%m-%Y-%H-%M")
+        generated = now.strftime("%d-%m-%Y %H:%M")
+    out_path = project_root / ".waypoint" / "audit" / f"{stamp}-frontend-audit.md"
+    if args.stdout_path_only:
+        print(out_path)
+        return 0
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    if out_path.exists() and not args.force:
+        raise SystemExit(f"Refusing to overwrite existing file: {out_path}")
+    content = TEMPLATE.format(generated=generated, requested_scope=args.requested_scope)
+    out_path.write_text(content, encoding="utf-8")
+    print(out_path)
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
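In the script above, a `--timestamp` override is used verbatim in the filename and is only checked for length: any 16-character string is sliced into the `Generated:` label, and any other length is passed through unchanged. If stricter handling were wanted, one option is to parse the override with `datetime.strptime` using the same `%d-%m-%Y-%H-%M` format the script uses for the default stamp. The helper below is a sketch of that idea, not part of the package, and `parse_stamp` is a hypothetical name.

```python
from datetime import datetime

# Same format string the script passes to strftime for the default stamp.
STAMP_FORMAT = "%d-%m-%Y-%H-%M"


def parse_stamp(stamp: str) -> tuple[str, str]:
    """Validate a dd-mm-yyyy-hh-mm stamp.

    Returns (normalized_file_stamp, generated_label).
    Raises ValueError when the stamp does not describe a real date and time.
    """
    parsed = datetime.strptime(stamp, STAMP_FORMAT)
    return parsed.strftime(STAMP_FORMAT), parsed.strftime("%d-%m-%Y %H:%M")


if __name__ == "__main__":
    file_stamp, generated = parse_stamp("05-03-2025-14-30")
    print(f"{file_stamp}-frontend-audit.md", "|", generated)
```

Rejecting malformed stamps changes behavior for callers that currently pass free-form values, so whether this is desirable depends on how the override is used in practice.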

package/templates/.codex/agents/plan-reviewer.toml CHANGED

@@ -4,11 +4,10 @@ developer_instructions = """
 Read these files in order before doing anything else:
 1. .waypoint/SOUL.md
 2. .waypoint/agent-operating-manual.md
-3. WORKSPACE.md
+3. .waypoint/WORKSPACE.md
 4. .waypoint/context/MANIFEST.md
 5. every file listed in that manifest
 6. .waypoint/agents/plan-reviewer.md
 After reading them, follow .waypoint/agents/plan-reviewer.md as your operating instructions.
 """

package/templates/.waypoint/README.md CHANGED

@@ -1,6 +1,6 @@
 # .waypoint
-Repo-local Waypoint configuration and optional integration sources.
+Repo-local Waypoint configuration and project memory files.
 - `config.toml` — Waypoint feature toggles and file locations
 - `WORKSPACE.md` — live operational state; new or materially revised entries in multi-topic sections are timestamped
@@ -8,9 +8,6 @@ Repo-local Waypoint configuration and optional integration sources.
 - `SOUL.md` — agent identity and working values
 - `agent-operating-manual.md` — required session workflow
 - `docs/` — Waypoint-managed project memory (architecture, decisions, debugging knowledge, durable plans); routable docs use `summary`, `last_updated`, and `read_when` frontmatter
-- `agents/` — agent prompt files that optional Codex roles can read and follow
-- `automations/` — optional automation source specs
+- `agents/` — agent prompt files that Waypoint's reviewer agents can read and follow
 - `context/` — generated session context bundle
-- `rules/` — optional rule source files
 - `scripts/` — repo-local Waypoint helper scripts
-- `state/` — local sync state and tooling metadata

package/templates/.waypoint/SOUL.md CHANGED

@@ -30,7 +30,7 @@ You're direct, opinionated, and evidence-driven. You read before you write. You
 **Update the durable record.** When behavior changes, update docs. When state changes, update `WORKSPACE.md`. When a better pattern emerges, encode it in the repo contract instead of rediscovering it later.
-**Close the loop after commits.** If Waypoint's reviewer roles are available, launch `code-reviewer` and `code-health-reviewer` after your own commits and address the real findings before you call the work finished.
+**Close the loop before complete.** Run `code-reviewer` before considering any non-trivial implementation slice complete. Run `code-health-reviewer` before considering medium or large changes complete, especially when they add structure, duplicate logic, or introduce new abstractions.
 **Prefer small, reviewable changes.** Keep work scoped and comprehensible.

package/templates/.waypoint/agent-operating-manual.md CHANGED

@@ -48,7 +48,7 @@ If something important lives only in your head or in the chat transcript, the re
 - Update `.waypoint/docs/` when durable knowledge changes, and refresh each changed routable doc's `last_updated` field.
 - Rebuild `.waypoint/DOCS_INDEX.md` whenever routable docs change.
 - Rebuild `.waypoint/TRACKS_INDEX.md` whenever tracker files change.
-- Use the repo-local skills and optional reviewer agents instead of improvising from scratch.
+- Use the repo-local skills and reviewer agents instead of improvising from scratch.
 - Do not kill long-running subagents or reviewer agents just because they are slow. Wait unless they are clearly stuck, failed, or the user redirects the work.
 ## Documentation expectations
@@ -71,13 +71,14 @@ Do not document every trivial implementation detail. Document the non-obvious, d
 - `docs-sync` when routed docs may be stale, missing, or inconsistent with the codebase
 - `code-guide-audit` when a specific feature or file set needs a targeted coding-guide compliance check
 - `break-it-qa` when a browser-facing feature should be attacked with invalid inputs, refreshes, repeated clicks, wrong action order, or other adversarial manual QA
+- `frontend-ship-audit` and `backend-ship-audit` only when the user explicitly requests a ship-readiness audit; do not trigger them autonomously as part of the default Waypoint workflow
 - `workspace-compress` after meaningful chunks, before stopping, and before review when the live handoff needs compression
 - `pre-pr-hygiene` before pushing or opening/updating a PR for substantial work
 - `pr-review` once a PR has active review comments or automated review in progress
-## When to use the optional reviewer agents
+## When to use the reviewer agents
-If the repo was initialized with Waypoint roles enabled, use them as focused second-pass specialists:
+Waypoint scaffolds these focused second-pass specialists by default:
 - `code-reviewer` for correctness and regression review
 - `code-health-reviewer` for maintainability drift
@@ -85,13 +86,15 @@ If the repo was initialized with Waypoint roles enabled, use them as focused sec
 ## Review Loop
-If Waypoint's optional roles are enabled, run the reviewer pair after a meaningful reviewable implementation chunk, not just as a reflex after every tiny commit.
+Use reviewer agents before considering the work complete, not just as a reflex after every tiny commit.
-1. Launch `code-reviewer` and `code-health-reviewer` in parallel as background, read-only reviewers once there is a coherent slice of work worth reviewing.
-2. If you have a recent self-authored commit that cleanly represents that slice, use it as the default review scope anchor. Otherwise scope the reviewers to the current changed slice.
-3. Widen only when surrounding files are needed to validate a finding.
-4. Do not call the work finished before you read both reviewer results.
-5. Fix real findings, rerun the relevant verification, update workspace/docs if needed, and make a follow-up commit when fixes change the repo.
+1. Run `code-reviewer` before considering any non-trivial implementation slice complete.
+2. Run `code-health-reviewer` before considering medium or large changes complete, especially when they add structure, duplicate logic, or introduce new abstractions.
+3. If both apply, launch `code-reviewer` and `code-health-reviewer` in parallel as background, read-only reviewers.
+4. If you have a recent self-authored commit that cleanly represents the reviewable slice, use it as the default review scope anchor. Otherwise scope the reviewers to the current changed slice.
+5. Widen only when surrounding files are needed to validate a finding.
+6. Do not call the work finished before you read the required reviewer results.
+7. Fix real findings, rerun the relevant verification, update workspace/docs if needed, and make a follow-up commit when fixes change the repo.
 ## Quality bar
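The revised Review Loop in the hunk above ties each reviewer to the size of the change: `code-reviewer` for any non-trivial slice, `code-health-reviewer` for medium or large changes. A minimal sketch of that selection rule follows. The manual does not define size thresholds, so the `ChangeSize` categories and the function name are assumptions made for illustration.

```python
from enum import Enum


class ChangeSize(Enum):
    # Hypothetical buckets; the manual leaves the thresholds to judgment.
    TRIVIAL = 0
    SMALL = 1
    MEDIUM = 2
    LARGE = 3


def required_reviewers(size: ChangeSize) -> list[str]:
    """Return the reviewer agents the Review Loop calls for at a given change size."""
    reviewers = []
    if size is not ChangeSize.TRIVIAL:
        reviewers.append("code-reviewer")
    if size in (ChangeSize.MEDIUM, ChangeSize.LARGE):
        reviewers.append("code-health-reviewer")
    return reviewers
```

When the list has two entries, the manual says to launch both in parallel as background, read-only reviewers.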

package/templates/.waypoint/agents/code-health-reviewer.md CHANGED

@@ -24,6 +24,8 @@ Find code that works but should be refactored. You're not looking for bugs (`cod
 **Explore what exists.** Search for existing helpers, utilities, and patterns that could be reused instead of duplicated.
+**Stay practical.** Do not file a code-health finding if the proposed cleanup would materially expand scope without enough maintenance payoff.
 ## What You're Looking For
 Code that works but hurts maintainability. Examples:
@@ -34,7 +36,14 @@ Code that works but hurts maintainability. Examples:
 - pattern drift
 - over-engineering
-Use your judgment — these are examples, not a checklist.
+Use these operational lenses to make findings concrete and defensible:
+- makes future changes harder than necessary
+- hides important behavior or state transitions
+- duplicates business logic that is likely to diverge
+- introduces abstraction without enough concrete reuse
+- spreads one responsibility across too many files or layers
+- leaves dead or transitional code that obscures current truth
 ## What You're NOT Looking For