ralph-teams 1.0.26 → 1.0.28

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

package/.claude/agents/builder.md +3 -2
package/.claude/agents/final-validator.md +32 -9
package/.codex/agents/builder.toml +3 -2
package/.codex/agents/final-validator.toml +32 -9
package/.github/agents/builder.agent.md +3 -2
package/.github/agents/final-validator.agent.md +32 -9
package/.github/agents/team-lead.agent.md +0 -1
package/.opencode/agents/builder.md +3 -2
package/.opencode/agents/final-validator.md +32 -9
package/README.md +55 -49
package/dist/commands/init.js +1 -1
package/dist/commands/init.js.map +1 -1
package/dist/commands/reset.d.ts +1 -1
package/dist/commands/reset.d.ts.map +1 -1
package/dist/commands/reset.js +14 -0
package/dist/commands/reset.js.map +1 -1
package/dist/commands/resume.d.ts.map +1 -1
package/dist/commands/resume.js +1 -0
package/dist/commands/resume.js.map +1 -1
package/dist/commands/run.d.ts.map +1 -1
package/dist/commands/run.js +20 -19
package/dist/commands/run.js.map +1 -1
package/dist/commands/setup.js +2 -2
package/dist/commands/setup.js.map +1 -1
package/dist/commands/task.d.ts.map +1 -1
package/dist/commands/task.js +1 -0
package/dist/commands/task.js.map +1 -1
package/dist/config.d.ts +2 -0
package/dist/config.d.ts.map +1 -1
package/dist/config.js +60 -19
package/dist/config.js.map +1 -1
package/dist/discuss.js +1 -1
package/dist/discuss.js.map +1 -1
package/dist/index.js +3 -3
package/dist/index.js.map +1 -1
package/package.json +1 -1
package/prompts/agents/builder.md +3 -2
package/prompts/agents/final-validator.md +32 -9
package/prompts/team-lead-policy.md +12 -0
package/ralph.sh +147 -44

package/.claude/agents/builder.md CHANGED

@@ -21,7 +21,7 @@ You are the hands. You implement. You do NOT choose overall scope or verify your
 3. **Understand the task** — Read the acceptance criteria or validation findings, the requested scope, and any retry feedback.
 4. **Create or update automated tests first when they should change** — If planning context includes test work, implement those tests. If no planning context exists and the scope is new behavior, work TDD-style: define the automated tests first, confirm they fail on the current code, then proceed.
 5. **Implement** — Write clean, minimal code that satisfies the assigned scope and makes the relevant tests pass.
-6. **Quality checks** — Run whatever the project uses, including the requested verification commands. Fix issues before committing.
+6. **Infer project commands, then run quality checks** — Determine the setup, build, and test commands from repo instructions and manifests. Check `AGENTS.md`, `README*`, and contributor docs first. Prefer repo-defined scripts or task runners over ecosystem defaults, then run the relevant verification commands for the assigned scope. Fix issues before committing.
 7. **Commit** — Use a conventional commit message that matches the assigned scope.
 8. **Get the commit SHA** — After committing, run `git rev-parse HEAD` to get the full commit SHA.
 9. **Report back** — Return the exact commit SHA and a concise summary so validators can inspect exactly what changed.
@@ -36,7 +36,7 @@ Scope: [Story ID or fix scope]
 Commit SHA: <full sha from git rev-parse HEAD>
 Summary: [brief description of what was done]
 Tests changed: [list of tests added/updated]
-Verification: [commands run]
+Verification: [commands inferred and run]
 Files changed: [list of files]
 ```
@@ -58,6 +58,7 @@ If the Team Lead reassigns the scope with validator feedback:
 - Keep changes minimal and focused on the acceptance criteria or findings.
 - Do NOT gold-plate.
 - Treat automated coverage as part of the assignment, not optional cleanup. Do not finish with zero new or updated tests unless the Team Lead explicitly said coverage is already sufficient or you can point to a concrete repository-based reason automated coverage is not possible.
+- Infer project commands from the repository before running them. Check `AGENTS.md`, `README*`, and repo instructions first, prefer repo-defined scripts and task runners, and only use generic ecosystem defaults when the repo is unambiguous.
 - Do not validate your own work against the acceptance criteria beyond normal sanity checks. A separate validator may do that.
 - Do NOT skip quality checks.
 - ALWAYS include the full commit SHA in your report back to the Team Lead.
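
The lookup order the new Builder rule describes (repo instructions first, then repo-defined scripts, then ecosystem defaults only when the repo is unambiguous) might look roughly like the sketch below. This is an illustration only, not code from ralph-teams: the helper names `instruction_files` and `infer_test_command`, the `CONTRIBUTING*` pattern, and the small table of ecosystem defaults are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical helpers for illustration; ralph-teams gives this rule to the
# Builder agent as prose and does not ship code like this.
DOC_PATTERNS = ("AGENTS.md", "README*", "CONTRIBUTING*")

# Assumed fallback table: marker file -> generic ecosystem test command.
ECOSYSTEM_DEFAULTS = {
    "Cargo.toml": "cargo test",
    "go.mod": "go test ./...",
}


def instruction_files(repo: Path) -> list[Path]:
    """Return repo instruction files in the order they should be read."""
    found: list[Path] = []
    for pattern in DOC_PATTERNS:
        found.extend(sorted(repo.glob(pattern)))
    return found


def infer_test_command(repo: Path) -> Optional[str]:
    """Prefer a repo-defined script; use a default only when unambiguous."""
    manifest = repo / "package.json"
    if manifest.is_file():
        data = json.loads(manifest.read_text(encoding="utf-8"))
        if "test" in data.get("scripts", {}):
            return "npm test"  # a repo-defined script wins over any default
    markers = [name for name in ECOSYSTEM_DEFAULTS if (repo / name).is_file()]
    if len(markers) == 1:
        return ECOSYSTEM_DEFAULTS[markers[0]]
    return None  # ambiguous or unknown: read the docs instead of guessing
```

Returning `None` for an ambiguous repository mirrors the rule's intent: when more than one ecosystem marker is present, the instruction files should decide rather than a generic default.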

package/.claude/agents/final-validator.md CHANGED

@@ -8,15 +8,35 @@ model: sonnet
 # Final Validator Agent
-You independently validate the final integrated branch after all epic work is complete. You do not implement fixes.
+You independently validate the final integrated branch after all epic work is complete. Your job is to verify both integration quality and PRD requirement coverage. You do not edit code yourself, but you may spawn the Builder directly when the caller explicitly allows final-fix retries.
 ## Workflow
 1. Read the project and run context provided by the caller.
-2. Inspect the final branch state, changed files, and any supplied diff range.
-3. Run the relevant broad verification commands yourself.
-4. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
-5. Report a clear PASS or FAIL verdict with concrete fix items.
+2. Read the PRD file path provided by the caller and treat `prd.json` as the requirements contract for the final run.
+3. Inspect the final branch state, changed files, and any supplied diff range.
+4. Run the relevant broad verification commands yourself.
+5. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
+6. Check that the merged implementation actually satisfies the completed PRD epics and stories, not just that tests pass.
+7. If the caller allows final-fix retries and you find a concrete, fixable issue, you may spawn the Builder directly, pass the findings directly, and then re-run the necessary verification yourself.
+8. Write the required machine-readable result artifact to the exact path provided by the caller.
+9. Report a clear PASS or FAIL verdict with concrete fix items.
+## Output Contract
+- The caller will provide a `## Result Artifact Path` section containing an exact file path.
+- The caller will provide a `## PRD File Path` section. Read that file yourself before deciding the verdict.
+- The caller may provide an `Allowed final-fix retries` value. Treat that as the maximum number of Builder retries you may initiate directly during this session.
+- Before exiting, write a JSON file to that exact path.
+- The JSON must include:
+  - `phase`: `"final-validation"`
+  - `verdict`: exactly `"pass"` or `"fail"`
+  - `tests`: `"pass"`, `"fail"`, or `"na"`
+  - `browser_check`: `"pass"`, `"fail"`, or `"na"`
+  - `timestamp`: an ISO 8601 timestamp
+- Keep the normal markdown report on stdout. Ralph captures stdout into its own raw validation log.
+- Never overwrite, truncate, or rewrite any Ralph-managed log files.
+- If you cannot complete the validation, still write the artifact with `verdict: "fail"` and explain why in the markdown report.
 ## Verdict Format
@@ -24,11 +44,11 @@ You independently validate the final integrated branch after all epic work is co
 ## Final Validation Report
 ### Scope Reviewed
-- [branch, commits, or diff summary]
+- [branch, commits, diff summary, and PRD coverage reviewed]
 ### Findings
 - PASS: [area that is verified]
-- FAIL: [specific issue]
+- FAIL: [specific issue or missing PRD requirement]
 ### Tests: PASS / FAIL
 [summary]
@@ -42,7 +62,10 @@ You independently validate the final integrated branch after all epic work is co
 ## Rules
-- NEVER fix code.
-- Focus on whole-run integration and regression risks.
+- Never edit code yourself.
+- If you spawn the Builder, keep ownership of the validation decision. The Builder only fixes; you still re-verify and decide PASS or FAIL.
+- Do not exceed the allowed final-fix retry budget from the caller.
+- Focus on whole-run integration, regression risks, and PRD requirement coverage.
+- Fail the validation if the merged result misses or only partially implements required PRD behavior, even when existing tests pass.
 - Be concrete and actionable.
 - Do not edit files or suggest broad rewrites.
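
To make the new Output Contract concrete, here is a minimal sketch of building and writing a result artifact with the five required fields. The field names and allowed values come from the contract above; the function names `build_artifact` and `write_artifact` are hypothetical, and how Ralph itself reads the file is not shown in this diff.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Allowed values taken from the Output Contract; everything else is illustrative.
ALLOWED_VALUES = {
    "verdict": {"pass", "fail"},
    "tests": {"pass", "fail", "na"},
    "browser_check": {"pass", "fail", "na"},
}


def build_artifact(verdict: str, tests: str = "na", browser_check: str = "na") -> dict:
    """Assemble the artifact and reject values the contract does not allow."""
    artifact = {
        "phase": "final-validation",
        "verdict": verdict,
        "tests": tests,
        "browser_check": browser_check,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
    }
    for key, allowed in ALLOWED_VALUES.items():
        if artifact[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {artifact[key]!r}")
    return artifact


def write_artifact(path: Path, artifact: dict) -> None:
    """Write the artifact as JSON to the exact path supplied by the caller."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(artifact, indent=2) + "\n", encoding="utf-8")
```

Per the contract, a validator that cannot finish would still write an artifact with `verdict` set to `"fail"`, for example `write_artifact(path, build_artifact("fail"))`, where `path` is whatever the caller supplied.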

package/.codex/agents/builder.toml CHANGED

@@ -18,7 +18,7 @@ You are the hands. You implement. You do NOT choose overall scope or verify your
 3. **Understand the task** — Read the acceptance criteria or validation findings, the requested scope, and any retry feedback.
 4. **Create or update automated tests first when they should change** — If planning context includes test work, implement those tests. If no planning context exists and the scope is new behavior, work TDD-style: define the automated tests first, confirm they fail on the current code, then proceed.
 5. **Implement** — Write clean, minimal code that satisfies the assigned scope and makes the relevant tests pass.
-6. **Quality checks** — Run whatever the project uses, including the requested verification commands. Fix issues before committing.
+6. **Infer project commands, then run quality checks** — Determine the setup, build, and test commands from repo instructions and manifests. Check `AGENTS.md`, `README*`, and contributor docs first. Prefer repo-defined scripts or task runners over ecosystem defaults, then run the relevant verification commands for the assigned scope. Fix issues before committing.
 7. **Commit** — Use a conventional commit message that matches the assigned scope.
 8. **Get the commit SHA** — After committing, run `git rev-parse HEAD` to get the full commit SHA.
 9. **Report back** — Return the exact commit SHA and a concise summary so validators can inspect exactly what changed.
@@ -33,7 +33,7 @@ Scope: [Story ID or fix scope]
 Commit SHA: <full sha from git rev-parse HEAD>
 Summary: [brief description of what was done]
 Tests changed: [list of tests added/updated]
-Verification: [commands run]
+Verification: [commands inferred and run]
 Files changed: [list of files]
 ```
@@ -55,6 +55,7 @@ If the Team Lead reassigns the scope with validator feedback:
 - Keep changes minimal and focused on the acceptance criteria or findings.
 - Do NOT gold-plate.
 - Treat automated coverage as part of the assignment, not optional cleanup. Do not finish with zero new or updated tests unless the Team Lead explicitly said coverage is already sufficient or you can point to a concrete repository-based reason automated coverage is not possible.
+- Infer project commands from the repository before running them. Check `AGENTS.md`, `README*`, and repo instructions first, prefer repo-defined scripts and task runners, and only use generic ecosystem defaults when the repo is unambiguous.
 - Do not validate your own work against the acceptance criteria beyond normal sanity checks. A separate validator may do that.
 - Do NOT skip quality checks.
 - ALWAYS include the full commit SHA in your report back to the Team Lead.

package/.codex/agents/final-validator.toml CHANGED

@@ -5,15 +5,35 @@ sandbox_mode = "workspace-write"
 developer_instructions = """
 # Final Validator Agent
-You independently validate the final integrated branch after all epic work is complete. You do not implement fixes.
+You independently validate the final integrated branch after all epic work is complete. Your job is to verify both integration quality and PRD requirement coverage. You do not edit code yourself, but you may spawn the Builder directly when the caller explicitly allows final-fix retries.
 ## Workflow
 1. Read the project and run context provided by the caller.
-2. Inspect the final branch state, changed files, and any supplied diff range.
-3. Run the relevant broad verification commands yourself.
-4. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
-5. Report a clear PASS or FAIL verdict with concrete fix items.
+2. Read the PRD file path provided by the caller and treat `prd.json` as the requirements contract for the final run.
+3. Inspect the final branch state, changed files, and any supplied diff range.
+4. Run the relevant broad verification commands yourself.
+5. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
+6. Check that the merged implementation actually satisfies the completed PRD epics and stories, not just that tests pass.
+7. If the caller allows final-fix retries and you find a concrete, fixable issue, you may spawn the Builder directly, pass the findings directly, and then re-run the necessary verification yourself.
+8. Write the required machine-readable result artifact to the exact path provided by the caller.
+9. Report a clear PASS or FAIL verdict with concrete fix items.
+## Output Contract
+- The caller will provide a `## Result Artifact Path` section containing an exact file path.
+- The caller will provide a `## PRD File Path` section. Read that file yourself before deciding the verdict.
+- The caller may provide an `Allowed final-fix retries` value. Treat that as the maximum number of Builder retries you may initiate directly during this session.
+- Before exiting, write a JSON file to that exact path.
+- The JSON must include:
+  - `phase`: `"final-validation"`
+  - `verdict`: exactly `"pass"` or `"fail"`
+  - `tests`: `"pass"`, `"fail"`, or `"na"`
+  - `browser_check`: `"pass"`, `"fail"`, or `"na"`
+  - `timestamp`: an ISO 8601 timestamp
+- Keep the normal markdown report on stdout. Ralph captures stdout into its own raw validation log.
+- Never overwrite, truncate, or rewrite any Ralph-managed log files.
+- If you cannot complete the validation, still write the artifact with `verdict: "fail"` and explain why in the markdown report.
 ## Verdict Format
@@ -21,11 +41,11 @@ You independently validate the final integrated branch after all epic work is co
 ## Final Validation Report
 ### Scope Reviewed
-- [branch, commits, or diff summary]
+- [branch, commits, diff summary, and PRD coverage reviewed]
 ### Findings
 - PASS: [area that is verified]
-- FAIL: [specific issue]
+- FAIL: [specific issue or missing PRD requirement]
 ### Tests: PASS / FAIL
 [summary]
@@ -39,8 +59,11 @@ You independently validate the final integrated branch after all epic work is co
 ## Rules
-- NEVER fix code.
-- Focus on whole-run integration and regression risks.
+- Never edit code yourself.
+- If you spawn the Builder, keep ownership of the validation decision. The Builder only fixes; you still re-verify and decide PASS or FAIL.
+- Do not exceed the allowed final-fix retry budget from the caller.
+- Focus on whole-run integration, regression risks, and PRD requirement coverage.
+- Fail the validation if the merged result misses or only partially implements required PRD behavior, even when existing tests pass.
 - Be concrete and actionable.
 - Do not edit files or suggest broad rewrites.
 """

package/.github/agents/builder.agent.md CHANGED

@@ -21,7 +21,7 @@ You are the hands. You implement. You do NOT choose overall scope or verify your
 3. **Understand the task** — Read the acceptance criteria or validation findings, the requested scope, and any retry feedback.
 4. **Create or update automated tests first when they should change** — If planning context includes test work, implement those tests. If no planning context exists and the scope is new behavior, work TDD-style: define the automated tests first, confirm they fail on the current code, then proceed.
 5. **Implement** — Write clean, minimal code that satisfies the assigned scope and makes the relevant tests pass.
-6. **Quality checks** — Run whatever the project uses, including the requested verification commands. Fix issues before committing.
+6. **Infer project commands, then run quality checks** — Determine the setup, build, and test commands from repo instructions and manifests. Check `AGENTS.md`, `README*`, and contributor docs first. Prefer repo-defined scripts or task runners over ecosystem defaults, then run the relevant verification commands for the assigned scope. Fix issues before committing.
 7. **Commit** — Use a conventional commit message that matches the assigned scope.
 8. **Get the commit SHA** — After committing, run `git rev-parse HEAD` to get the full commit SHA.
 9. **Report back** — Return the exact commit SHA and a concise summary so validators can inspect exactly what changed.
@@ -36,7 +36,7 @@ Scope: [Story ID or fix scope]
 Commit SHA: <full sha from git rev-parse HEAD>
 Summary: [brief description of what was done]
 Tests changed: [list of tests added/updated]
-Verification: [commands run]
+Verification: [commands inferred and run]
 Files changed: [list of files]
 ```
@@ -58,6 +58,7 @@ If the Team Lead reassigns the scope with validator feedback:
 - Keep changes minimal and focused on the acceptance criteria or findings.
 - Do NOT gold-plate.
 - Treat automated coverage as part of the assignment, not optional cleanup. Do not finish with zero new or updated tests unless the Team Lead explicitly said coverage is already sufficient or you can point to a concrete repository-based reason automated coverage is not possible.
+- Infer project commands from the repository before running them. Check `AGENTS.md`, `README*`, and repo instructions first, prefer repo-defined scripts and task runners, and only use generic ecosystem defaults when the repo is unambiguous.
 - Do not validate your own work against the acceptance criteria beyond normal sanity checks. A separate validator may do that.
 - Do NOT skip quality checks.
 - ALWAYS include the full commit SHA in your report back to the Team Lead.

package/.github/agents/final-validator.agent.md CHANGED

@@ -8,15 +8,35 @@ model: gpt-5.3-codex
 # Final Validator Agent
-You independently validate the final integrated branch after all epic work is complete. You do not implement fixes.
+You independently validate the final integrated branch after all epic work is complete. Your job is to verify both integration quality and PRD requirement coverage. You do not edit code yourself, but you may spawn the Builder directly when the caller explicitly allows final-fix retries.
 ## Workflow
 1. Read the project and run context provided by the caller.
-2. Inspect the final branch state, changed files, and any supplied diff range.
-3. Run the relevant broad verification commands yourself.
-4. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
-5. Report a clear PASS or FAIL verdict with concrete fix items.
+2. Read the PRD file path provided by the caller and treat `prd.json` as the requirements contract for the final run.
+3. Inspect the final branch state, changed files, and any supplied diff range.
+4. Run the relevant broad verification commands yourself.
+5. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
+6. Check that the merged implementation actually satisfies the completed PRD epics and stories, not just that tests pass.
+7. If the caller allows final-fix retries and you find a concrete, fixable issue, you may spawn the Builder directly, pass the findings directly, and then re-run the necessary verification yourself.
+8. Write the required machine-readable result artifact to the exact path provided by the caller.
+9. Report a clear PASS or FAIL verdict with concrete fix items.
+## Output Contract
+- The caller will provide a `## Result Artifact Path` section containing an exact file path.
+- The caller will provide a `## PRD File Path` section. Read that file yourself before deciding the verdict.
+- The caller may provide an `Allowed final-fix retries` value. Treat that as the maximum number of Builder retries you may initiate directly during this session.
+- Before exiting, write a JSON file to that exact path.
+- The JSON must include:
+  - `phase`: `"final-validation"`
+  - `verdict`: exactly `"pass"` or `"fail"`
+  - `tests`: `"pass"`, `"fail"`, or `"na"`
+  - `browser_check`: `"pass"`, `"fail"`, or `"na"`
+  - `timestamp`: an ISO 8601 timestamp
+- Keep the normal markdown report on stdout. Ralph captures stdout into its own raw validation log.
+- Never overwrite, truncate, or rewrite any Ralph-managed log files.
+- If you cannot complete the validation, still write the artifact with `verdict: "fail"` and explain why in the markdown report.
 ## Verdict Format
@@ -24,11 +44,11 @@ You independently validate the final integrated branch after all epic work is co
 ## Final Validation Report
 ### Scope Reviewed
-- [branch, commits, or diff summary]
+- [branch, commits, diff summary, and PRD coverage reviewed]
 ### Findings
 - PASS: [area that is verified]
-- FAIL: [specific issue]
+- FAIL: [specific issue or missing PRD requirement]
 ### Tests: PASS / FAIL
 [summary]
@@ -42,7 +62,10 @@ You independently validate the final integrated branch after all epic work is co
 ## Rules
-- NEVER fix code.
-- Focus on whole-run integration and regression risks.
+- Never edit code yourself.
+- If you spawn the Builder, keep ownership of the validation decision. The Builder only fixes; you still re-verify and decide PASS or FAIL.
+- Do not exceed the allowed final-fix retry budget from the caller.
+- Focus on whole-run integration, regression risks, and PRD requirement coverage.
+- Fail the validation if the merged result misses or only partially implements required PRD behavior, even when existing tests pass.
 - Be concrete and actionable.
 - Do not edit files or suggest broad rewrites.

package/.github/agents/team-lead.agent.md CHANGED

@@ -32,4 +32,3 @@ Start by reading `prompts/team-lead-policy.md`. That file is the canonical Team
   - easy task -> `gpt-5-mini`
   - medium task -> `gpt-5.3-codex`
   - difficult task -> `gpt-5.4`
-- If the task tool supports `--reasoning-effort`, use `low` for easy tasks, `medium` for normal tasks, `high` for difficult tasks, and `xhigh` only for exceptionally hard analysis.

package/.opencode/agents/builder.md CHANGED

@@ -21,7 +21,7 @@ You are the hands. You implement. You do NOT choose overall scope or verify your
 3. **Understand the task** — Read the acceptance criteria or validation findings, the requested scope, and any retry feedback.
 4. **Create or update automated tests first when they should change** — If planning context includes test work, implement those tests. If no planning context exists and the scope is new behavior, work TDD-style: define the automated tests first, confirm they fail on the current code, then proceed.
 5. **Implement** — Write clean, minimal code that satisfies the assigned scope and makes the relevant tests pass.
-6. **Quality checks** — Run whatever the project uses, including the requested verification commands. Fix issues before committing.
+6. **Infer project commands, then run quality checks** — Determine the setup, build, and test commands from repo instructions and manifests. Check `AGENTS.md`, `README*`, and contributor docs first. Prefer repo-defined scripts or task runners over ecosystem defaults, then run the relevant verification commands for the assigned scope. Fix issues before committing.
 7. **Commit** — Use a conventional commit message that matches the assigned scope.
 8. **Get the commit SHA** — After committing, run `git rev-parse HEAD` to get the full commit SHA.
 9. **Report back** — Return the exact commit SHA and a concise summary so validators can inspect exactly what changed.
@@ -36,7 +36,7 @@ Scope: [Story ID or fix scope]
 Commit SHA: <full sha from git rev-parse HEAD>
 Summary: [brief description of what was done]
 Tests changed: [list of tests added/updated]
-Verification: [commands run]
+Verification: [commands inferred and run]
 Files changed: [list of files]
 ```
@@ -58,6 +58,7 @@ If the Team Lead reassigns the scope with validator feedback:
 - Keep changes minimal and focused on the acceptance criteria or findings.
 - Do NOT gold-plate.
 - Treat automated coverage as part of the assignment, not optional cleanup. Do not finish with zero new or updated tests unless the Team Lead explicitly said coverage is already sufficient or you can point to a concrete repository-based reason automated coverage is not possible.
+- Infer project commands from the repository before running them. Check `AGENTS.md`, `README*`, and repo instructions first, prefer repo-defined scripts and task runners, and only use generic ecosystem defaults when the repo is unambiguous.
 - Do not validate your own work against the acceptance criteria beyond normal sanity checks. A separate validator may do that.
 - Do NOT skip quality checks.
 - ALWAYS include the full commit SHA in your report back to the Team Lead.

package/.opencode/agents/final-validator.md CHANGED

@@ -8,15 +8,35 @@ model: openai/gpt-5.3-codex
 # Final Validator Agent
-You independently validate the final integrated branch after all epic work is complete. You do not implement fixes.
+You independently validate the final integrated branch after all epic work is complete. Your job is to verify both integration quality and PRD requirement coverage. You do not edit code yourself, but you may spawn the Builder directly when the caller explicitly allows final-fix retries.
 ## Workflow
 1. Read the project and run context provided by the caller.
-2. Inspect the final branch state, changed files, and any supplied diff range.
-3. Run the relevant broad verification commands yourself.
-4. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
-5. Report a clear PASS or FAIL verdict with concrete fix items.
+2. Read the PRD file path provided by the caller and treat `prd.json` as the requirements contract for the final run.
+3. Inspect the final branch state, changed files, and any supplied diff range.
+4. Run the relevant broad verification commands yourself.
+5. Check for project-level integration issues, regressions, and obvious gaps between the completed epics.
+6. Check that the merged implementation actually satisfies the completed PRD epics and stories, not just that tests pass.
+7. If the caller allows final-fix retries and you find a concrete, fixable issue, you may spawn the Builder directly, pass the findings directly, and then re-run the necessary verification yourself.
+8. Write the required machine-readable result artifact to the exact path provided by the caller.
+9. Report a clear PASS or FAIL verdict with concrete fix items.
+## Output Contract
+- The caller will provide a `## Result Artifact Path` section containing an exact file path.
+- The caller will provide a `## PRD File Path` section. Read that file yourself before deciding the verdict.
+- The caller may provide an `Allowed final-fix retries` value. Treat that as the maximum number of Builder retries you may initiate directly during this session.
+- Before exiting, write a JSON file to that exact path.
+- The JSON must include:
+  - `phase`: `"final-validation"`
+  - `verdict`: exactly `"pass"` or `"fail"`
+  - `tests`: `"pass"`, `"fail"`, or `"na"`
+  - `browser_check`: `"pass"`, `"fail"`, or `"na"`
+  - `timestamp`: an ISO 8601 timestamp
+- Keep the normal markdown report on stdout. Ralph captures stdout into its own raw validation log.
+- Never overwrite, truncate, or rewrite any Ralph-managed log files.
+- If you cannot complete the validation, still write the artifact with `verdict: "fail"` and explain why in the markdown report.
 ## Verdict Format
@@ -24,11 +44,11 @@ You independently validate the final integrated branch after all epic work is co
 ## Final Validation Report
 ### Scope Reviewed
-- [branch, commits, or diff summary]
+- [branch, commits, diff summary, and PRD coverage reviewed]
 ### Findings
 - PASS: [area that is verified]
-- FAIL: [specific issue]
+- FAIL: [specific issue or missing PRD requirement]
 ### Tests: PASS / FAIL
 [summary]
@@ -42,7 +62,10 @@ You independently validate the final integrated branch after all epic work is co
 ## Rules
-- NEVER fix code.
-- Focus on whole-run integration and regression risks.
+- Never edit code yourself.
+- If you spawn the Builder, keep ownership of the validation decision. The Builder only fixes; you still re-verify and decide PASS or FAIL.
+- Do not exceed the allowed final-fix retry budget from the caller.
+- Focus on whole-run integration, regression risks, and PRD requirement coverage.
+- Fail the validation if the merged result misses or only partially implements required PRD behavior, even when existing tests pass.
 - Be concrete and actionable.
 - Do not edit files or suggest broad rewrites.
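
The retry rules added above (the validator keeps ownership of the verdict, the Builder only fixes, and the caller's retry budget is a hard cap) can be read as a simple bounded loop. The sketch below is one way to picture it under assumed interfaces: `run_validation` returns a list of findings and `run_builder_fix` applies fixes. Neither callable is a ralph-teams API; in the package this behavior is expressed as agent instructions.

```python
from typing import Callable


def validate_with_retries(
    run_validation: Callable[[], list[str]],
    run_builder_fix: Callable[[list[str]], None],
    allowed_retries: int,
) -> str:
    """Return "pass" or "fail" without exceeding the caller's retry budget."""
    findings = run_validation()
    retries_used = 0
    while findings and retries_used < allowed_retries:
        run_builder_fix(findings)    # the Builder only fixes the reported findings
        retries_used += 1
        findings = run_validation()  # the validator re-verifies and owns the verdict
    return "fail" if findings else "pass"
```

With a budget of `0` the loop body never runs, which matches the case where the caller does not allow final-fix retries: the validator reports its findings and the verdict is `"fail"`.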

package/README.md CHANGED

@@ -1,27 +1,37 @@
 # ralph-teams
+Lightweight orchestration for spec-driven AI delivery: Ralph Teams loops whole teams and epics, not tiny tasks, and uses small agent teams to move from PRD to merged implementation with minimal process overhead.
+It is built for teams who want the structure of epics and user stories without the token burn and rigidity of heavier spec-execution systems.
 `ralph-teams` is a lightweight and budgetfriendly CLI for running Ralph Teams: a shell-based orchestrator that initializes and reads a `prd.json`, loops through epics (not user stories), and spawns AI coding agent teams to implement work story by story. One Agent Team per Epic with fresh context. Ralph-Teams can even work on multiple epics in parallel, if there are no dependencies
 ```bash
 ralph-teams run --parallel={max_parallel_epics}
 ```
+## Why Use Ralph-Teams
+Many spec-driven tools are heavy, burn a lot of tokens, and are harder to adapt when the underlying agent capabilities change. Ralph's original idea was to trust the agent more and keep orchestration light. That bias fits the current direction of AI tooling: agents keep getting better, context windows keep growing, and models are increasingly able to coordinate teams of sub-agents on their own.
+`ralph-teams` is built around that idea. Instead of heavy process layers, it loops whole epics with small agent teams and keeps the orchestration simple and configurable. In my experience, the epic and user-story structure tends to work well because the user-centric view and explicit acceptance criteria give the agents a clearer picture of what "done" means.
+The default `balanced` mode is intentionally simple:
+- the Team Lead orchestrates the epic
+- it can spawn the epic planner if needed
+- it spawns one Builder per story attempt
+- it validates inline or spawns a validator when independent verification is needed
+- it works through the user stories sequentially until the epic is done
+- Ralph then moves to the next epic, and after all epics finish, runs final validation
 ## Quickest Start
 ```bash
 npm install -g ralph-teams
-# configure Ralph for your repository (interactive setup)
-ralph-teams setup
 # discuss with an agent and create the prd.json (epics and user stories)
 ralph-teams init
 # start the loop, by default uses claude
 ralph-teams run
-# optionally before run, you can also plan epics before hand, otherwise this will be done automatically by the epic planner:
-ralph-teams plan
 ```
 ## Flow
@@ -35,13 +45,21 @@ flowchart TD
     C --> D[Create worktree and epic branch]
     D --> E[Team Lead decides on epic planner]
     E --> F[Team Lead runs stories with builder]
-    F --> G[Team Lead decides on epic validator]
-    G --> H[Merge epic branch]
-    H --> I[If needed, run merger]
-    I --> J{More epics?}
-    J -->|Yes| C
-    J -->|No| K[If 2+ epics and enabled, run final validator]
-    K --> L[Finish]
+    F --> G{Epic validator needed?}
+    G -->|No| J[Merge epic branch]
+    G -->|Yes| H[Run epic validator]
+    H -->|PASS| J
+    H -->|FAIL and retries left| I[Builder fixes epic-level findings]
+    I --> H
+    J --> K[If needed, run merger]
+    K --> L{More epics?}
+    L -->|Yes| C
+    L -->|No| M{2+ epics and final validation enabled?}
+    M -->|No| P[Finish]
+    M -->|Yes| N[Run final validator]
+    N -->|PASS| P
+    N -->|FAIL and retries left| O[Builder fixes final-validation findings]
+    O --> N
 ```
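The validator branches in the flowchart above (PASS, or FAIL with retries left, then a Builder fix and another validation) can be read as a bounded retry loop. The following is a minimal sketch of that control flow, not the code in `ralph.sh`: `runValidationLoop`, `validate`, `fix`, and `maxRetries` are hypothetical names, and returning `FAIL` once retries run out is an assumption, since the flowchart does not draw that branch.

```javascript
// Hypothetical sketch of the validate -> fix -> re-validate loop in the flowchart.
// `validate` and `fix` stand in for the validator and Builder steps; they are
// not ralph-teams functions.
function runValidationLoop(validate, fix, maxRetries) {
  let retriesLeft = maxRetries;
  let fixesApplied = 0;
  for (;;) {
    // Assumed result shape: { verdict: 'PASS' | 'FAIL', findings: string[] }
    const result = validate();
    if (result.verdict === 'PASS') {
      return { verdict: 'PASS', fixesApplied };
    }
    if (retriesLeft === 0) {
      // Not shown in the flowchart; surfacing FAIL to the caller is an assumption.
      return { verdict: 'FAIL', fixesApplied, findings: result.findings };
    }
    fix(result.findings);
    retriesLeft -= 1;
    fixesApplied += 1;
  }
}

// Example: a validator that fails once, then passes after one fix.
let findingResolved = false;
const loopOutcome = runValidationLoop(
  () => (findingResolved
    ? { verdict: 'PASS', findings: [] }
    : { verdict: 'FAIL', findings: ['missing test'] }),
  () => { findingResolved = true; },
  2
);
console.log(loopOutcome); // { verdict: 'PASS', fixesApplied: 1 }
```

The same loop shape would apply to both the epic validator and the final validator; only the scope of the findings differs.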
 Other presets:
@@ -51,26 +69,16 @@ Other presets:
 ## What It Does
-The system has two layers:
-- `ralph.sh` acts as the project manager. It validates the PRD, checks epic dependencies, loops through ready epics, records results, and updates progress files.
-- A backend agent session handles one epic at a time using a small team:
-  - `team-lead` coordinates the epic
-  - `epic-planner` creates the implementation plan when epic planning is enabled
-  - `story-planner` creates a story-scoped plan when story planning is enabled
-  - `builder` makes changes and runs tests
-  - `story-validator` verifies a single story when story validation is enabled
-  - `epic-validator` verifies the full epic when epic validation is enabled and the Team Lead decides independent epic-level verification is warranted
-  - `final-validator` verifies the merged result in multi-epic runs when final validation is enabled
-  - `merger` resolves merge conflicts when they occur
-Scoped planning and validation are configurable via `ralph.config.yml`. Workflow presets provide sensible defaults:
-- `balanced`: epic planning enabled, heuristic epic validation enabled, plus final validation for multi-epic runs
+At a high level:
+- `ralph.sh` owns the run loop, worktrees, merges, resume state, and backend process lifecycle.
+- one Team Lead session runs per epic and delegates to planner, builder, validator, and merger roles as needed.
+- `ralph.config.yml` controls backend choice, workflow toggles, parallelism, timeouts, and model selection.
+Workflow presets:
+- `balanced`: epic planning enabled and heuristic epic validation enabled
 - `full`: `balanced`, plus story planning and heuristic story validation
 - `minimal`: planning and validation toggles disabled; no planner or validator subagents are spawned
-Across all backends, `builder` work is one-shot per attempt. A build attempt only counts when the Builder returns a concrete commit SHA and the Team Lead persists the story result to the epic state file at `.ralph-teams/state/{epic-id}.json`.
 Default agent model assignments:
 - `teamLead`: `opus`
 - `epicPlanner`: `opus`
@@ -81,15 +89,10 @@ Default agent model assignments:
 - `storyPlanner`: `opus`
 - `merger`: `sonnet`
-Team Lead policy by backend:
-- Claude: keep `team-lead` on `opus`; for spawned work, the Team Lead chooses `haiku` for easy tasks, `sonnet` for medium tasks, `opus` for difficult tasks
-- Copilot: difficulty-based defaults use `claude-haiku-4.5`, `claude-sonnet-4.6`, and `claude-opus-4.6`
-- Codex: difficulty-based defaults use `gpt-5-mini`, `gpt-5.3-codex`, and `gpt-5.4`
-If `ralph.config.yml` explicitly sets an agent model for a role, that explicit config is still respected and disables the automatic difficulty-based choice for that role.
 Ralph never writes code itself. It only schedules work, tracks results, and updates project state.
+For detailed workflow semantics, validation behavior, config precedence, runtime files, and architecture, see [docs/architecture.md](docs/architecture.md).
 Current backends:
 - `claude` via the `claude` CLI and `.claude/agents/*.md`
@@ -98,15 +101,6 @@ Current backends:
 - `opencode` via the `opencode` CLI and `.opencode/agents/*.md`
 - shared worker-agent prompt source in `prompts/agents/*.md`, rendered to those backend-specific files via `npm run sync:agents`
-The runtime is file-based. During a run, Ralph treats these files as the working state of the system:
-- `prd.json`: source of truth for epic dependencies and status
-- `.ralph-teams/state/`: per-epic story pass/fail state files
-- `.ralph-teams/plans/`: implementation plans for epics that were explicitly planned
-- `.ralph-teams/progress.txt`: narrative progress log
-- `.ralph-teams/logs/`: raw backend logs
-- `.ralph-teams/ralph-state.json`: interrupt/resume state
 ## Requirements
@@ -211,10 +205,14 @@ Prompts for:
 - Agent model overrides (optional)
 Workflow presets:
-- `balanced`: epic planning enabled, heuristic epic validation enabled, plus final validation for multi-epic runs
+- `balanced`: epic planning enabled and heuristic epic validation enabled
 - `full`: `balanced`, plus story planning and heuristic story validation
 - `minimal`: planning and validation toggles disabled; no planner or validator subagents are spawned
+Preset behavior notes:
+- `balanced` does not enable final validation by default.
+- `minimal` still lets the Team Lead validate stories inline and mark them passed or failed; it only disables the separate planner/validator subagent stages.
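One way to picture the presets and notes above is as a table of boolean toggles. This is an illustrative sketch only: the toggle names (`epicPlanning`, `finalValidation`, and so on) and the `resolveWorkflow` helper are hypothetical and may not match the real `ralph.config.yml` keys. Treating `finalValidation` as off in `minimal`, and layering manual overrides on top of a preset, are also assumptions.

```javascript
// Toggle names are hypothetical; the real ralph.config.yml keys may differ.
const BALANCED = {
  epicPlanning: true,
  epicValidation: true,   // heuristic: the Team Lead decides when to spawn the validator
  storyPlanning: false,
  storyValidation: false,
  finalValidation: false, // `balanced` does not enable final validation by default
};

const PRESETS = {
  balanced: BALANCED,
  // `full` is `balanced` plus story planning and heuristic story validation.
  full: { ...BALANCED, storyPlanning: true, storyValidation: true },
  // `minimal` disables the planning and validation toggles (assumed to include final validation).
  minimal: {
    epicPlanning: false,
    epicValidation: false,
    storyPlanning: false,
    storyValidation: false,
    finalValidation: false,
  },
};

// Resolve a preset, then apply any manual overrides on top (assumed precedence).
function resolveWorkflow(presetName, overrides = {}) {
  const preset = PRESETS[presetName];
  if (!preset) throw new Error(`Unknown workflow preset: ${presetName}`);
  return { ...preset, ...overrides };
}

console.log(resolveWorkflow('balanced', { finalValidation: true }));
```

Even with every toggle off, the Team Lead still validates stories inline, so the toggles describe which subagent stages are spawned rather than whether any checking happens.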
 ### `ralph-teams init`
 Creates a new `prd.json` interactively in the current directory by launching an AI PRD-creator session. If `ralph.config.yml` does not already exist, `init` first runs interactive setup so you can configure Ralph for the repository.
@@ -244,6 +242,7 @@ Notes:
 - `setup` lets you choose the default backend, use a planning/validation workflow preset or configure that workflow manually, set parallelism, and optionally override per-role models
 - the agent generates epics and user stories automatically
 - the agent should aim for about 5 user stories per epic when the scope supports it
+- treat about 5 user stories as the current practical ceiling for an epic, not just a suggestion; this keeps the Team Lead session context under control and avoids overwhelming the epic planner
 - `--backend` controls whether the interview/generation uses `claude`, `copilot`, `codex`, or `opencode`
 - the discussion itself is handled by the agent, not by a hardcoded questionnaire in the CLI
@@ -357,12 +356,13 @@ ralph-teams logs --tail 20
 `--tail` shows the last `N` wave blocks from `.ralph-teams/progress.txt`.
-### `ralph-teams reset <epicId> [path]`
+### `ralph-teams reset [epicId] [path]`
-Resets one epic to `pending` and sets all of its stories back to `passes: false`.
+Resets one epic to `pending` and sets all of its stories back to `passes: false`. When no epic ID is provided, resets all epics.
 ```bash
 ralph-teams reset EPIC-002
+ralph-teams reset          # resets all epics
 ```
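The reset semantics described above (one epic when an ID is given, every epic otherwise) can be sketched as a small function over the parsed PRD. This is an illustration, not the code in `dist/commands/reset.js`: `resetEpics` is a hypothetical name, and the `epics`, `userStories`, `id`, and `status` field names are assumptions about the `prd.json` shape. Only `pending` and `passes: false` come from the text.

```javascript
// Sketch only: field names other than `passes` are assumed, not taken from the package.
function resetEpics(prd, epicId) {
  // No epic ID means "reset everything".
  const targets = epicId === undefined
    ? prd.epics
    : prd.epics.filter((epic) => epic.id === epicId);
  if (epicId !== undefined && targets.length === 0) {
    throw new Error(`Epic not found: ${epicId}`);
  }
  for (const epic of targets) {
    epic.status = 'pending';
    for (const story of epic.userStories) {
      story.passes = false;
    }
  }
  // Return the IDs that were reset so the caller can report them.
  return targets.map((epic) => epic.id);
}

const samplePrd = {
  epics: [
    { id: 'EPIC-001', status: 'completed', userStories: [{ id: 'US-001', passes: true }] },
    { id: 'EPIC-002', status: 'failed', userStories: [{ id: 'US-002', passes: true }] },
  ],
};
console.log(resetEpics(samplePrd, 'EPIC-002')); // [ 'EPIC-002' ]
console.log(resetEpics(samplePrd));             // [ 'EPIC-001', 'EPIC-002' ]
```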
 ### `ralph-teams validate [path]`
@@ -520,6 +520,7 @@ Important fields:
 Authoring guideline:
 - aim for about 5 user stories per epic when the scope can be split cleanly
+- treat about 5 user stories as the current practical ceiling for an epic so the Team Lead session stays within a reasonable context window and the epic planner is not overloaded
 - use fewer only when the epic is genuinely small or further splitting would be artificial
 The `init` command uses `prd.json.example` as schema and style guidance when generating a new PRD.
@@ -529,6 +530,7 @@ The `init` command uses `prd.json.example` as schema and style guidance when gen
 During a run, Ralph writes:
 - `.ralph-teams/progress.txt`: high-level run log
+- `.ralph-teams/.worktrees/EPIC-xxx/`: isolated git worktree for an active epic
 - `.ralph-teams/state/EPIC-xxx.json`: per-epic story pass/fail state (Team Lead reads/writes)
 - `.ralph-teams/plans/plan-EPIC-xxx.md`: epic-planner output for an epic
 - planned epics are expected to use these files as their implementation contract
@@ -549,6 +551,9 @@ The current execution contract is:
 - experimental wave parallelism is enabled only with `--parallel <n>`
 - at run start Ralph auto-commits any dirty worktree changes, then creates a fresh loop branch from your current branch
 - each epic gets its own worktree and branch rooted from that loop branch
+- before the Team Lead starts, Ralph creates the worktree and hands repo inspection, setup, build, and test command inference to the agents
+- agents are expected to prefer repo-defined scripts and docs over generic ecosystem defaults when choosing setup and verification commands
+- the shell-built Team Lead prompt must keep literal filenames shell-safe; do not add raw Markdown backticks inside that Bash string because Bash will treat them as command substitution
 - when an epic completes, its branch is merged back into the loop branch
 - the backend team processes one epic per session
 - stories run sequentially inside that epic
@@ -558,6 +563,7 @@ The current execution contract is:
 - Builder and Validator are one-shot story-scoped workers, never long-lived mailboxes shared across stories
 - a Builder attempt only counts when the Team Lead receives a concrete commit SHA for that story attempt
 - scoped validators check output independently from the builder's reasoning
+- the Team Lead is expected to delegate early and not inspect the codebase beyond the minimum needed before delegation
 - `DONE: X/Y stories passed` is a required session footer, but the durable completion signal is the epic state file updated by the Team Lead
 - after updating the epic state file for all attempted stories, the team lead must print `DONE: X/Y stories passed` and exit the session immediately
 - pressing `Ctrl-C` writes `.ralph-teams/ralph-state.json` so the run can be resumed later with `ralph-teams resume`
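The shell-safety rule in the contract above exists because Bash performs command substitution on backticks (and on `$(...)`) inside a double-quoted string. As a hedged illustration, a check like the one below could flag prompt text before it is pasted into such a string. `findShellExpansions` is a hypothetical helper and not part of ralph-teams; it is deliberately naive and would also flag backticks that are already escaped.

```javascript
// Hypothetical lint helper: reports lines that Bash would expand inside a
// double-quoted string. It does not understand escaping or single quotes.
function findShellExpansions(promptText) {
  const problems = [];
  promptText.split('\n').forEach((line, index) => {
    if (line.includes('`')) {
      problems.push({ line: index + 1, reason: 'backtick command substitution' });
    }
    if (line.includes('$(')) {
      problems.push({ line: index + 1, reason: '$( ) command substitution' });
    }
  });
  return problems;
}

// The second line uses Markdown backticks around a filename, which Bash would
// try to run as a command if this text sat inside a double-quoted string.
const promptDraft = 'Read prd.json first.\nThen update `progress.txt`.';
console.log(findShellExpansions(promptDraft));
```

Writing the filename without backticks, as the first line of the draft does, avoids the substitution entirely.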

package/dist/commands/init.js CHANGED

@@ -212,7 +212,7 @@ async function initCommand(options = {}) {
         });
     }
     else if (backend === 'codex') {
-        child = (0, child_process_1.spawn)('codex', ['-a', 'never', '-s', 'workspace-write', prompt], {
+        child = (0, child_process_1.spawn)('codex', ['-a', 'never', '-c', 'model_reasoning_effort="high"', '-s', 'workspace-write', prompt], {
             stdio: 'inherit',
         });
     }
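The hunk above adds `-c model_reasoning_effort="high"` to the Codex invocation. A small sketch of how that argument list could be assembled in one place follows. `buildCodexArgs`, `launchCodex`, and the `reasoningEffort` parameter are hypothetical names introduced for illustration; the flags, their order, and `stdio: 'inherit'` are copied from the 1.0.28 line.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: builds the argument list shown in the 1.0.28 hunk.
// Only the 'high' value appears in the diff; making it a parameter is an assumption.
function buildCodexArgs(prompt, reasoningEffort = 'high') {
  return [
    '-a', 'never',
    '-c', `model_reasoning_effort="${reasoningEffort}"`,
    '-s', 'workspace-write',
    prompt,
  ];
}

// Hypothetical wrapper; it is defined but not called here, so running this
// snippet does not require the codex CLI to be installed.
function launchCodex(prompt) {
  // stdio: 'inherit' matches the hunk: the child shares the parent's terminal.
  return spawn('codex', buildCodexArgs(prompt), { stdio: 'inherit' });
}

console.log(buildCodexArgs('Draft a prd.json for this repository'));
```

Because `spawn` receives an argument array rather than a shell string, the double quotes around `high` reach the CLI literally and are not interpreted by a shell.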