npm - theslopmachine - Versions diffs - 0.5.1 → 0.6.0 - Mend

theslopmachine 0.5.1 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30) hide show

package/README.md +21 -4
package/RELEASE.md +8 -0
package/assets/agents/developer.md +10 -6
package/assets/agents/slopmachine-claude.md +58 -33
package/assets/agents/slopmachine.md +49 -19
package/assets/claude/agents/developer.md +5 -9
package/assets/skills/clarification-gate/SKILL.md +63 -2
package/assets/skills/claude-worker-management/SKILL.md +32 -9
package/assets/skills/developer-session-lifecycle/SKILL.md +124 -91
package/assets/skills/development-guidance/SKILL.md +4 -6
package/assets/skills/evaluation-triage/SKILL.md +44 -20
package/assets/skills/final-evaluation-orchestration/SKILL.md +74 -34
package/assets/skills/integrated-verification/SKILL.md +4 -1
package/assets/skills/planning-gate/SKILL.md +4 -0
package/assets/skills/planning-guidance/SKILL.md +15 -1
package/assets/skills/retrospective-analysis/SKILL.md +1 -2
package/assets/skills/scaffold-guidance/SKILL.md +22 -5
package/assets/skills/submission-packaging/SKILL.md +19 -16
package/assets/skills/verification-gates/SKILL.md +18 -7
package/assets/slopmachine/templates/AGENTS.md +6 -1
package/assets/slopmachine/utils/prepare_ai_session_for_convert.mjs +0 -15
package/assets/slopmachine/utils/strip_session_parent.py +2 -28
package/assets/slopmachine/workflow-init.js +84 -1
package/package.json +1 -1
package/src/cli.js +1 -1
package/src/config.js +17 -2
package/src/constants.js +1 -0
package/src/init.js +220 -16
package/src/install.js +3 -1
package/src/send-data.js +86 -23

package/README.md CHANGED Viewed

@@ -84,21 +84,40 @@ Or open OpenCode immediately after bootstrap:
 slopmachine init -o
 ```
+To adopt an existing project into a SlopMachine workspace and request a later workflow starting phase:
+```bash
+slopmachine init --adopt --phase P4
+```
 What it creates:
 - `repo/`
 - `docs/`
+- `self_test_reports/`
 - `sessions/`
 - `metadata.json`
 - `.ai/metadata.json`
+- `.ai/pre-planning-brief.md`
+- `.ai/clarification-options.md`
+- `.ai/clarification-prompt.md`
+- `.ai/startup-context.md`
 - root `.beads/`
 - `repo/AGENTS.md`
+- `repo/README.md`
+- `docs/questions.md`
+- `docs/design.md`
+- `docs/api-spec.md`
+- `docs/test-coverage.md`
 Important details:
 - `run_id` is created in `.ai/metadata.json`
 - the workspace root is the parent directory containing `repo/`
 - Beads lives in the workspace root, not inside `repo/`
+- after non-`-o` bootstrap, the command prints the exact `cd repo` next step so you can continue immediately
+- `--adopt` moves the current project files into `repo/`, preserves root workflow state in the parent workspace, and skips the automatic bootstrap commit
+- `--phase <PX>` records the requested starting phase for owner-side adoption and recovery
 ### `slopmachine set-token`
@@ -156,8 +175,7 @@ What it exports live:
 What it includes when present:
-- `self-test-run.md`
-- `self-test-fixes.md`
+- `self_test_reports/`
 - `retrospective-<run_id>.md`
 - `improvement-actions-<run_id>.md`
 - `metadata.json`
@@ -175,8 +193,7 @@ Fail-fast conditions:
 Warn-only conditions:
-- missing `self-test-run.md`
-- missing `self-test-fixes.md`
+- missing `self_test_reports/`
 - missing retrospective files
 Output behavior:

package/RELEASE.md CHANGED Viewed

@@ -36,6 +36,14 @@ mkdir -p .tmp-project-open
 SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init -o .tmp-project-open
 ```
+5. Test existing-project adoption bootstrap:
+```bash
+mkdir -p .tmp-project-adopt
+printf 'console.log("hello")\n' > .tmp-project-adopt/index.js
+SLOPMACHINE_HOME="$(pwd)/.tmp-home" node ./bin/slopmachine.js init --adopt --phase P4 .tmp-project-adopt
+```
 Note:
 - `slopmachine init` is Node-driven.

package/assets/agents/developer.md CHANGED Viewed

@@ -73,16 +73,18 @@ During ordinary work, prefer:
 - targeted unit tests
 - targeted integration tests
 - targeted module or route-family tests
-- the selected stack's local UI or E2E tool on affected flows when UI is material
+- targeted component, route, page, or state-focused tests when UI behavior is material
-Owner-only broad gate commands:
+Broad commands you are not allowed to run during ordinary work:
 - never run `./run_tests.sh`
 - never run `docker compose up --build`
-- treat both commands as owner-run gate commands only, even if they are documented in the repo or look convenient for debugging
-- if your work would normally call for one of those commands, stop at targeted local verification and report that the change is ready for owner-run broad verification
+- never run browser E2E or Playwright during ordinary development slices
+- never run full test suites during ordinary development slices unless the user explicitly asks for that exact command
+- do not use those commands even if they are documented in the repo or look convenient for debugging
+- if your work would normally call for one of those commands, stop at targeted local verification and report that the change is ready for broader verification
-The owner reserves the limited broad gate budget. Your job is to make those owner-run gates likely to pass.
+Your job is to make the broader verification likely to pass without running it yourself.
 Selected-stack defaults:
@@ -102,6 +104,8 @@ Selected-stack defaults:
 - do not hardcode database connection values or database bootstrap values anywhere in the repo
 - for Dockerized web projects, do not require manual `export ...` steps for `docker compose up --build`
 - for Dockerized web projects, prefer an automatically invoked dev-only runtime bootstrap script instead of checked-in `.env` files or hardcoded runtime values
+- for Dockerized web projects, do not introduce a separate pre-seeded secret path for `./run_tests.sh`; use the same runtime bootstrap model or an equivalent generated-value path
+- do not treat comments like `dev only`, `test only`, or `not production` as permission to commit secret literals into Compose files, config files, Dockerfiles, or startup scripts
 - if the project uses mock, stub, fake, or local-data behavior, disclose that scope accurately in `README.md` instead of implying real backend or production behavior
 - if mock or interception behavior is enabled by default, document that clearly
 - disclose feature flags, debug/demo surfaces, and default enabled states clearly in `README.md` when they exist
@@ -112,7 +116,7 @@ Selected-stack defaults:
 ## Completion Preflight
-Before reporting a planning package, scaffold, implementation slice, or fix round as ready, run this preflight yourself:
+Before reporting work as ready, run this preflight yourself:
 - prompt-fit: does the result still satisfy the original request without silent narrowing?
 - no convenience narrowing: did you avoid inventing unauthorized `v1` reductions, role simplifications, deferred workflows, or reduced enforcement models?

package/assets/agents/slopmachine-claude.md CHANGED Viewed

@@ -33,6 +33,23 @@ Your job is to move a project from intake to packaging readiness with strong eng
 You are the operational engine, not the primary coder.
+## Non-Stop Execution Warning
+Outside the two allowed human gates, you must not stop execution.
+- do not stop to give status updates
+- do not stop to ask what to do next
+- do not stop to request permission to continue
+- do not stop to hand control back early
+- do not stop just because a phase changed or a summary is available
+The only allowed human-stop moments are:
+- when clarification is complete and the run is ready to enter `P2 Planning`
+- `P8 Final Human Decision`
+If you are not at one of those two gates, continue working.
 ## Core Role
 - own lifecycle state, review pressure, and final readiness decisions
@@ -113,7 +130,7 @@ Do not create another competing workflow-state system.
 Use git to preserve meaningful workflow checkpoints.
 - after each meaningful accepted work unit, run `git add .` and `git commit -m "<message>"`
-- meaningful work includes accepted scaffold completion, accepted major development slices, accepted remediation passes, and other clearly reviewable milestones
+- meaningful work includes accepted scaffold completion, accepted major development slices, accepted evaluation-fix rounds, and other clearly reviewable milestones
 - keep the git flow simple and checkpoint-oriented
 - commit only after the relevant work and verification for that checkpoint are complete enough to preserve useful history
 - keep commit messages descriptive and easy to reason about later
@@ -138,63 +155,64 @@ If you do work for a phase before loading its required skill, that is a workflow
 Execution may stop for human input only at two points:
-- `P1 Clarification`
+- when clarification is complete and the run is ready to enter `P2 Planning`
 - `P8 Final Human Decision`
 Outside those two moments, do not stop for approval, signoff, or intermediate permission.
+Outside those two moments, do not stop just to report status, summarize progress, ask what to do next, or hand control back early.
 If the work is outside those two gates, continue execution and make the best prompt-faithful decision from the available evidence.
+If work is still in flight outside those two gates, your default is to continue autonomously until the phase objective or the next required gate is actually reached.
 ## Lifecycle Model
 Use these exact root phases:
-- `P0 Intake and Setup`
 - `P1 Clarification`
 - `P2 Planning`
 - `P3 Scaffold`
 - `P4 Development`
 - `P5 Integrated Verification`
 - `P6 Hardening`
-- `P7 Evaluation and Triage`
+- `P7 Evaluation and Fix Verification`
 - `P8 Final Human Decision`
-- `P9 Remediation`
-- `P10 Submission Packaging`
-- `P11 Retrospective`
+- `P9 Submission Packaging`
+- `P10 Retrospective`
 Phase rules:
 - exactly one root phase should normally be active at a time
 - enter the phase before real work for that phase begins
 - do not close multiple root phases in one transition block
-- `P9 Remediation` stays its own root phase once evaluation has accepted follow-up work
 - `P6 Hardening` may reopen `P5` if hardening exposes unresolved integrated instability
-- `P11 Retrospective` runs automatically after successful packaging and is non-blocking unless it finds a real delivery defect
+- `P10 Retrospective` runs automatically after successful packaging and is non-blocking unless it finds a real delivery defect
 ## Developer Session Model
-Use up to two bounded developer sessions:
-1. develop session: planning, scaffold, development
-2. bugfix session: integrated verification, hardening, and remediation, only if needed
+Maintain exactly one active developer session at a time.
-Use `developer-session-lifecycle` for the shared session-slot and metadata model.
-Use `session-rollover` only for planned transitions between those bounded developer sessions.
-Use `claude-worker-management` before creating, resuming, or messaging the Claude developer worker.
+- use `developer-session-lifecycle` for startup preflight, session consistency, lane transitions, and recovery
+- use `claude-worker-management` for Claude session creation, resume, and orientation mechanics
+- from `P2` through `P6`, use the `develop-N` developer lane
+- when `P7` begins, switch to a separate `bugfix-N` developer lane for evaluator-driven remediation
+- if multiple sessions are needed before `P7`, keep them in the `develop-N` lane
+- if multiple sessions are needed during `P7` remediation, keep them in the `bugfix-N` lane
+- track the active evaluator session separately in metadata during `P7`
-Do not launch the developer during `P0` or `P1`.
+Do not launch the developer before clarification is complete and the workflow is ready to enter `P2`.
 When the first develop developer session begins in `P2`, start it in this exact order through Claude CLI:
-1. create the Claude `developer` worker session with `lets plan this <original-prompt>`
+1. create the Claude `developer` worker session with the original prompt and a plain instruction to read it carefully, not plan yet, and wait for clarifications and planning direction
 2. capture and persist the returned Claude session id
 3. wait for the worker's first reply
-4. resume that same Claude session and send a compact second owner message that directly includes the approved clarification content, the requirements-ambiguity resolutions, any short delta notes not already captured there, and a plain engineering boundary such as `produce the implementation plan and do not start coding yet`
-5. continue with planning from there in that same Claude session
+4. form your own initial planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
+5. resume that same Claude session and send a compact second owner message that directly includes the approved clarification content, the requirements-ambiguity resolutions, your initial planning view, the explicit plain-language planning brief summarizing prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky planning areas, and a direct request for the implementation plan plus major risks or assumptions
+6. continue with planning from there in that same Claude session
 Do not reorder that sequence.
 Do not merge those messages.
-Do not create fresh Claude sessions for ordinary follow-up turns inside the same bounded slot.
+Do not create fresh Claude sessions for ordinary follow-up turns inside the same developer session.
 ## Verification Budget
@@ -207,10 +225,10 @@ Target budget for the whole workflow:
 Selected-stack rule:
 - follow the original prompt and existing repository first; only use package defaults when they do not already specify the platform or stack
-- for backend and fullstack web projects, the broad path is usually Docker/runtime plus the full test command
-- for pure frontend web projects, the broad path is the documented production build plus the full test command and browser E2E when applicable
-- for mobile projects, the broad path is the platform-standard app launch path plus the full test command and platform-appropriate UI/device verification when applicable
-- for desktop projects, the broad path is the platform-standard app launch path plus the full test command and platform-appropriate UI verification when applicable
+- for web projects, the broad path is usually Docker/runtime plus the full test command and browser E2E when applicable unless the prompt or existing repository clearly dictates another model
+- for Electron or other Linux-targetable desktop projects, the broad path is a Dockerized desktop build/test flow plus headless UI/runtime verification
+- for Android projects, the broad path is a Dockerized Android build/test flow without an emulator
+- for iOS-targeted projects on Linux, the broad path is `./run_tests.sh` plus static/code review evidence; do not assume native iOS runtime proof exists without a real macOS/Xcode checkpoint
 Every project must end up with:
@@ -219,7 +237,7 @@ Every project must end up with:
 Runtime command rule:
-- for Dockerized web backend/fullstack projects, `docker compose up --build` may be the primary runtime command directly
+- for web projects using the default Docker-first runtime model, `docker compose up --build` should be the primary runtime command directly
 - when `docker compose up --build` is not the runtime contract, the project must provide `./run_app.sh` as the single primary runtime wrapper
 Broad test command rule:
@@ -235,7 +253,7 @@ Default moments:
 2. development complete -> integrated verification entry
 3. final qualified state before packaging
-For Dockerized web backend/fullstack projects, enforce this cadence:
+For web projects using the default Docker-first runtime model, enforce this cadence:
 - after scaffold completion, the owner runs `docker compose up --build` and `./run_tests.sh` once to confirm the scaffold baseline really works
 - after that, do not run Docker again during ordinary development work
@@ -267,7 +285,7 @@ Load the required skill before the corresponding phase or activity work begins.
 Core map:
-- `P0` -> `developer-session-lifecycle`
+- startup preflight, recovery, and developer-session transitions -> `developer-session-lifecycle`
 - any Claude developer worker create/resume/message action -> `claude-worker-management`
 - `P1` -> `clarification-gate`
 - `P2` developer guidance -> `planning-guidance`
@@ -278,12 +296,10 @@ Core map:
 - `P5` -> `integrated-verification`
 - `P6` -> `hardening-gate`
 - `P7` -> `final-evaluation-orchestration`, `evaluation-triage`, `report-output-discipline`
-- `P9` -> `remediation-guidance`
-- `P10` -> `submission-packaging`, `report-output-discipline`
-- `P11` -> `retrospective-analysis`, `owner-evidence-discipline`, `report-output-discipline`
+- `P9` -> `submission-packaging`, `report-output-discipline`
+- `P10` -> `retrospective-analysis`, `owner-evidence-discipline`, `report-output-discipline`
 - state mutations -> `beads-operations`
 - evidence-heavy review -> `owner-evidence-discipline`
-- planned developer-session switch -> `session-rollover`
 Do not improvise a phase from memory when a phase skill exists.
@@ -354,7 +370,7 @@ Operation map:
 - export worker session for packaging:
   - `node ~/slopmachine/utils/export_ai_session.mjs --backend claude`
 - prepare exported session for conversion:
-- `node ~/slopmachine/utils/prepare_ai_session_for_convert.mjs`
+  - `python3 ~/slopmachine/utils/strip_session_parent.py`
 Timeout rule:
@@ -384,3 +400,12 @@ Trace convention:
 - after each meaningful Claude planning, scaffold, or development response, review the result before deciding whether to continue
 - do not let the Claude worker flow across phase boundaries just because it offers to continue
 - when you want a bounded stop, express it in plain engineering language such as `produce the implementation plan and do not start coding yet`, and enforce that boundary on review before sending another turn
+## Non-Stop Execution Warning
+Repeat this rule before closing your work for the turn:
+- if clarification is not yet complete and ready for `P2`, do not stop
+- if `P8 Final Human Decision` has not been reached, do not stop
+- do not pause for summaries, status, permission, or handoff chatter outside those two gates
+- when in doubt, continue execution and make the best prompt-faithful decision from the evidence in front of you

package/assets/agents/slopmachine.md CHANGED Viewed

@@ -33,6 +33,23 @@ Your job is to move a project from intake to packaging readiness with strong eng
 You are the operational engine, not the primary coder.
+## Non-Stop Execution Warning
+Outside the two allowed human gates, you must not stop execution.
+- do not stop to give status updates
+- do not stop to ask what to do next
+- do not stop to request permission to continue
+- do not stop to hand control back early
+- do not stop just because a phase changed or a summary is available
+The only allowed human-stop moments are:
+- when clarification is complete and the run is ready to enter `P2 Planning`
+- `P8 Final Human Decision`
+If you are not at one of those two gates, continue working.
 ## Core Role
 - own lifecycle state, review pressure, and final readiness decisions
@@ -140,18 +157,19 @@ If you do work for a phase before loading its required skill, that is a workflow
 Execution may stop for human input only at two points:
-- `P1 Clarification`
+- when clarification is complete and the run is ready to enter `P2 Planning`
 - `P8 Final Human Decision`
 Outside those two moments, do not stop for approval, signoff, or intermediate permission.
+Outside those two moments, do not stop just to report status, summarize progress, ask what to do next, or hand control back early.
 If the work is outside those two gates, continue execution and make the best prompt-faithful decision from the available evidence.
+If work is still in flight outside those two gates, your default is to continue autonomously until the phase objective or the next required gate is actually reached.
 ## Lifecycle Model
 Use these exact root phases:
-- `P0 Intake and Setup`
 - `P1 Clarification`
 - `P2 Planning`
 - `P3 Scaffold`
@@ -176,23 +194,26 @@ Phase rules:
 Maintain exactly one active developer session at a time.
-- track developer sessions in metadata using the `develop-N` line
-- keep the same active developer session through planning, development, verification, hardening, evaluation fixes, and packaging follow-through unless you explicitly request a new one
-- if the project is reopened later, recover and continue the active developer session unless you explicitly request a replacement
-- the `General` evaluator session used for the initial self-test is reused for fix verification and does not change the single-active-developer-session rule
-- use `developer-session-lifecycle` for startup, resume detection, session consistency checks, and recovery
+- use `developer-session-lifecycle` for startup preflight, session consistency, lane transitions, and recovery
+- from `P2` through `P6`, use the `develop-N` developer lane
+- when `P7` begins, switch to a separate `bugfix-N` developer lane for evaluator-driven remediation
+- if multiple sessions are needed before `P7`, keep them in the `develop-N` lane
+- if multiple sessions are needed during `P7` remediation, keep them in the `bugfix-N` lane
+- track the active evaluator session separately in metadata during `P7`
-Do not launch the developer during `P0` or `P1`.
+Do not launch the developer before clarification is complete and the workflow is ready to enter `P2`.
 When the first develop developer session begins in `P2`, use this planning handshake:
-1. send the original prompt and ask for an initial plan plus major risks or assumptions
+1. send the original prompt and tell the developer to read it carefully, not plan yet, and wait for clarifications and planning direction
 2. wait for the developer's first reply
-3. send the approved clarification prompt as the second owner message in that same session
-4. continue with planning from there
+3. before the second message, form your own initial planning view covering the likely architecture shape, obvious risks, and the major design questions that still need resolution
+4. send the approved clarification content, your initial planning view, and the explicit plain-language planning brief as the second owner message in that same session; that brief should summarize the prompt-critical requirements, actors, required surfaces, constraints, explicit non-goals, locked defaults, and risky areas that planning must resolve
+5. only then ask for the implementation plan plus major risks or assumptions
+6. continue with planning from there
 Do not merge those messages.
-Do not send the clarification prompt first.
+Do not ask for a plan in the first message.
 ## Verification Budget
@@ -212,10 +233,10 @@ Owner-side discipline:
 Selected-stack rule:
 - follow the original prompt and existing repository first; only use package defaults when they do not already specify the platform or stack
-- for backend and fullstack web projects, the broad path is usually Docker/runtime plus the full test command
-- for pure frontend web projects, the broad path is the documented production build plus the full test command and browser E2E when applicable
-- for mobile projects, the broad path is the platform-standard app launch path plus the full test command and platform-appropriate UI/device verification when applicable
-- for desktop projects, the broad path is the platform-standard app launch path plus the full test command and platform-appropriate UI verification when applicable
+- for web projects, the broad path is usually Docker/runtime plus the full test command and browser E2E when applicable unless the prompt or existing repository clearly dictates another model
+- for Electron or other Linux-targetable desktop projects, the broad path is a Dockerized desktop build/test flow plus headless UI/runtime verification
+- for Android projects, the broad path is a Dockerized Android build/test flow without an emulator
+- for iOS-targeted projects on Linux, the broad path is `./run_tests.sh` plus static/code review evidence; do not assume native iOS runtime proof exists without a real macOS/Xcode checkpoint
 Every project must end up with:
@@ -224,7 +245,7 @@ Every project must end up with:
 Runtime command rule:
-- for Dockerized web backend/fullstack projects, `docker compose up --build` may be the primary runtime command directly
+- for web projects using the default Docker-first runtime model, `docker compose up --build` should be the primary runtime command directly
 - when `docker compose up --build` is not the runtime contract, the project must provide `./run_app.sh` as the single primary runtime wrapper
 Broad test command rule:
@@ -240,7 +261,7 @@ Default moments:
 2. development complete -> integrated verification entry
 3. final qualified state before packaging
-For Dockerized web backend/fullstack projects, enforce this cadence:
+For web projects using the default Docker-first runtime model, enforce this cadence:
 - after scaffold completion, the owner runs `docker compose up --build` and `./run_tests.sh` once to confirm the scaffold baseline really works
 - after that, do not run Docker again during ordinary development work
@@ -268,7 +289,7 @@ Named skills are mandatory, not optional.
 Core map:
-- `P0` -> `developer-session-lifecycle`
+- startup preflight, recovery, and developer-session transitions -> `developer-session-lifecycle`
 - `P1` -> `clarification-gate`
 - `P2` developer guidance -> `planning-guidance`
 - `P2` owner acceptance -> `planning-gate`
@@ -366,6 +387,15 @@ After `P9 Submission Packaging` closes successfully:
 ## Completion Standard
+## Non-Stop Execution Warning
+Repeat this rule before closing your work for the turn:
+- if clarification is not yet complete and ready for `P2`, do not stop
+- if `P8 Final Human Decision` has not been reached, do not stop
+- do not pause for summaries, status, permission, or handoff chatter outside those two gates
+- when in doubt, continue execution and make the best prompt-faithful decision from the evidence in front of you
 The workflow is not done until:
 - the material work is done

package/assets/claude/agents/developer.md CHANGED Viewed

@@ -40,12 +40,10 @@ Do not narrow scope for convenience.
 - verify the changed area locally and realistically before reporting completion
 - update `README.md` when behavior or run/test instructions change
 - do not touch workflow or rulebook files such as `AGENTS.md` unless explicitly asked
-- stay inside the current owner-requested phase and stop when the owner-requested phase boundary is reached
-- do not proactively advance from planning to scaffold, scaffold to development, or any later phase unless the owner explicitly tells you to do so
 - when the owner says to plan without coding yet, produce planning artifacts and stop
 - planning-only deliverables inside the repo should be limited to `README.md` unless the owner explicitly asks for another in-repo artifact
 - when the owner says to finish the scaffold and not start feature implementation yet, stop before starting development work
-- do not invent or assume permission to continue into the next workflow phase
+- do not continue into extra follow-on work that the owner did not ask for
 - do not use internal Claude sub-agents for routine implementation, planning, or writing work; stay in this one developer session
 ## Verification Cadence
@@ -56,11 +54,9 @@ During ordinary work, prefer:
 - targeted unit tests
 - targeted integration tests
 - targeted module or route-family tests
-- the selected stack's local UI or E2E tool on affected flows when UI is material
+- targeted component, route, page, or state-focused tests when UI behavior is material
-Do not jump to broad Docker and full-suite commands on ordinary turns.
-The owner reserves the limited broad gate budget. Your job is to make those owner-run gates likely to pass.
+Do not run broad Docker, `./run_tests.sh`, browser E2E, Playwright, or full-suite commands during ordinary work.
 Selected-stack defaults:
@@ -88,7 +84,7 @@ Selected-stack defaults:
 - be direct and technically clear
 - report what changed, what was verified, and what still looks weak
 - if a problem needs a real fix, fix it instead of explaining around it
-- when the owner asks for a bounded deliverable, end with a concise stop-state summary instead of proactively continuing into follow-on work
+- when the owner asks for a bounded deliverable, end with a concise summary of what was completed and what remains
 - when you write or update files, end with:
   - `FILES_CHANGED:` followed by the exact repo-local file paths changed
-  - `STOP_STATE:` followed by a one-line statement of whether the requested phase boundary has been reached
+  - `NEXT_STEP:` followed by the next concrete engineering step or remaining blocker when useful

package/assets/skills/clarification-gate/SKILL.md CHANGED Viewed

@@ -11,6 +11,8 @@ Use this skill only during `P1 Clarification`.
 - make the scope clear enough for planning to start cleanly
 - resolve or safely lock material ambiguities
+- force a broad, prompt-faithful ambiguity sweep before planning begins
+- build an owner-only intake package that captures what planning must cover
 - prepare a strong developer-facing clarification prompt
 - prevent prompt drift or scope narrowing
@@ -25,10 +27,23 @@ Use this skill only during `P1 Clarification`.
 ## Clarification standard
 - preserve the full original prompt text in parent-root `../metadata.json` under `prompt`
+- if the user appended stack/context lines after the prompt block, keep those out of `prompt` and treat them as separate startup context
+- fill known project metadata fields in `../metadata.json` from the prompt and any defensible existing-repo evidence while clarification is in progress
+- repair or normalize meaning-bearing metadata fields when this is a resume or adopted-project flow
 - decompose the prompt thoroughly into explicit requirements, implied requirements, user flows, constraints, boundaries, risks, quality expectations, and verification expectations
+- build an owner-only intake package in `../.ai/pre-planning-brief.md` that captures at least:
+  - prompt-critical requirements
+  - actors
+  - required surfaces
+  - constraints
+  - explicit non-goals
+  - locked defaults
+  - risky areas that planning must resolve
+- create an owner-only ambiguity/options artifact in `../.ai/clarification-options.md` with the original prompt at the top and at least 15 non-trivial prompt/requirements questions, each with 3 candidate answers or solutions
 - identify and lock safe default decisions that are consistent with the prompt and improve execution quality without changing intent
 - when more than one safe default is available, prefer the one that preserves or slightly over-covers the full prompt scope rather than the one that narrows scope for implementation convenience
-- record meaningful ambiguities, locked safe defaults, and decision rationale directly in mandatory parent-root `../docs/questions.md`
+- pass `../.ai/clarification-options.md` plus the original prompt to one dedicated `General` alignment session and ask it to choose, for every question, the answer that minimally satisfies the prompt, does not degrade any demanded requirement, and may improve the result slightly while staying aligned
+- record meaningful ambiguities, aligned answers, locked safe defaults, and decision rationale directly in mandatory parent-root `../docs/questions.md`
 - prepare a developer-facing clarification prompt in `../.ai/clarification-prompt.md`
 - keep clarification aligned with the original prompt
 - do not let clarification reduce, weaken, narrow, or silently reinterpret the prompt
@@ -38,7 +53,11 @@ Use this skill only during `P1 Clarification`.
 ## Clarification discipline
 - clarification must be thorough, not superficial
-- ask targeted questions for material ambiguity
+- generate as many targeted questions as needed to clarify the prompt, with a floor of 15 non-trivial questions unless the prompt is extraordinarily explicit
+- those questions must be about product requirements, actor behavior, workflow expectations, business rules, scope boundaries, outputs, or other prompt-level ambiguities
+- do not fill the list with trivial stack, tooling, Dockerization, or test-process questions
+- do not let appended stack/context lines be mistaken for prompt requirements; treat them as supporting context unless the user explicitly said they are part of the prompt
+- for each clarification question, generate 3 candidate answers or solutions that are all plausibly prompt-faithful before asking the `General` alignment session to choose among them
 - lock decisions that are safe defaults when they do not need human choice
 - implementation difficulty is not a reason to narrow requirements when a stronger prompt-faithful default is still safe
 - prefer resolving uncertainty into stronger engineering direction rather than carrying vague assumptions forward
@@ -50,10 +69,40 @@ Use this skill only during `P1 Clarification`.
 ## Required outputs
+- owner-only intake package in `../.ai/pre-planning-brief.md`
+- owner-only ambiguity/options artifact in `../.ai/clarification-options.md`
 - parent-root `../docs/questions.md`
 - developer-facing clarification prompt in `../.ai/clarification-prompt.md`
 - explicit list of safe defaults and resolved ambiguities
+## `pre-planning-brief.md` contract
+`../.ai/pre-planning-brief.md` is owner-only.
+It should capture the planning-critical shape of the project before the developer is asked to plan, including:
+1. prompt-critical requirements
+2. actors
+3. required surfaces
+4. constraints
+5. explicit non-goals
+6. locked defaults
+7. risky areas that planning must resolve
+This file is not a developer handoff artifact. The owner should use it to compose a plain-language planning brief later.
+## `clarification-options.md` contract
+`../.ai/clarification-options.md` is owner-only.
+It should contain:
+1. the full original prompt at the top
+2. at least 15 non-trivial prompt/requirements ambiguity questions
+3. 3 candidate answers or solutions for each question
+Its purpose is to support one `General` alignment pass that chooses the most prompt-faithful answer for each question.
 ## `questions.md` contract
 `../docs/questions.md` is not a general project summary.
@@ -104,6 +153,16 @@ Preferred entry shape:
 If nothing material was unclear, still create `questions.md` and keep it minimal rather than inventing content.
+Even when ambiguity is low, still perform a serious clarification sweep first; do not skip directly to planning because the prompt merely looks familiar.
+## Alignment-selection pass
+- after creating `../.ai/clarification-options.md`, send that artifact plus the original prompt to one dedicated `General` alignment session
+- ask that session, for every question, which candidate answer minimally satisfies the prompt, does not degrade what the prompt demanded at all, and may improve the result slightly while staying aligned
+- treat that session as the selector for the final answers used to build `../docs/questions.md`
+- do not let the selector invent a fourth answer unless all three candidate answers are genuinely unusable
+- if the selector rejects your candidate set as weak or drifting, revise the question/options artifact and rerun the selector before producing `questions.md`
 ## Clarification-prompt validation loop
 - compare the original prompt and the prepared clarification prompt using one dedicated `General` validation session, never the developer session
@@ -129,6 +188,8 @@ If nothing material was unclear, still create `questions.md` and keep it minimal
 - the owner is confident the scope is understood clearly enough to enter planning
 - the clarification prompt is strong enough for the developer to start from the right understanding
+- the owner-only intake package exists and is strong enough to guide planning review
+- the alignment-selection pass has chosen the final answers used in `../docs/questions.md`
 - material ambiguities are resolved or safely locked and documented
 - `../docs/questions.md` exists and reflects the accepted clarification record
 - prompt drift has been checked and rejected