npm - theslopmachine - Versions diffs - 0.6.2 → 0.7.0 - Mend

theslopmachine 0.6.2 → 0.7.0

This diff shows the content of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.

Files changed (76)

package/assets/skills/scaffold-guidance/SKILL.md CHANGED

@@ -7,145 +7,150 @@ description: Developer-facing scaffold guidance for slopmachine.
 Use this skill during `P3 Scaffold` before prompting the developer.
-## Scaffold standard
+## Core idea
-- create real foundations, not decorative boilerplate
-- establish the real runtime contract
-- establish the local verification path and the standardized gate path
-- make prompt-critical baseline behavior real where required
-- keep repo-local `README.md` honest from the start
-- make the selected-stack primary runtime command and the universal `./run_tests.sh` broad test command real from the scaffold stage
-- make the first scaffold pass strong enough that owner scaffold acceptance can rely on a narrow checklist rather than rereading the whole scaffold broadly
+Scaffold is a simple baseline bootstrap step.
-For web projects using the default runtime model, scaffold must make these commands real and working before scaffold can pass:
+- do not treat scaffold as early product implementation
+- do not require prompt-specific business flows during scaffold unless they are necessary to choose the stack, runtime contract, required technologies, or baseline verification path
+- the goal is a safe baseline that boots, tests, and documents itself honestly
+- for web projects, that baseline must always include working `docker compose up --build` and containerized `./run_tests.sh`
+- for Android, iOS, and desktop projects, that baseline must also include working `docker compose up --build` plus containerized `./run_tests.sh`, even when the app's true product proof is platform-specific and differs from web runtime semantics
-- `docker compose up --build`
-- `./run_tests.sh`
+## Scaffold contract
-## Scaffold and foundation guidance
+Scaffold should deliver only the baseline needed to start real implementation safely:
-- create the initial project structure intentionally
-- follow the original prompt and existing repository first; only use the package defaults below when they do not already specify the platform or stack
-- when the selected stack has an official or clearly best-known bootstrap command, prefer using that bootstrap path instead of hand-creating the project tree from scratch
-- prefer the modern recommended bootstrap variant for the chosen stack rather than an outdated minimal skeleton when the prompt leaves room
-- use Context7 first and targeted web research second to choose the strongest bootstrap command, starter, or tooling combination when the stack has multiple viable setup paths
-- treat the generator-created project as the starting foundation, then adapt and harden it to satisfy the runtime, test, E2E, security, infra, and documentation rules below
-- when using an official or best-known bootstrap path, make sure the resulting scaffold also installs and wires the local verification tooling needed for ordinary iteration rather than leaving test/tool setup half-finished
-- only hand-build the scaffold from scratch when no credible bootstrap path exists, the existing repository already dictates a different structure, or the generated scaffold would clearly fight the prompt requirements
-- when the prompt leaves stack or framework details open, choose defaults that maximize execution speed, ecosystem stability, onboarding quality, and compatibility with the required runtime and verification model
-- when the prompt leaves stack or framework details open, make the chosen default stack and bootstrap command explicit before generating the scaffold
-- create `./run_tests.sh` during scaffold for every project as the single broad test entrypoint
-- for web projects, default scaffold to Docker-first runtime foundations unless the prompt or existing repository clearly dictates another model
-- for web projects, make `docker compose up --build` real as the primary runtime command during scaffold unless the prompt or existing repository clearly dictates otherwise
-- for Dockerized web projects, create a dev-only runtime bootstrap script during scaffold and make the Docker startup path call it automatically
-- for Dockerized web projects, the local user should not need any manual `export ...` step before `docker compose up --build`
-- for Dockerized web projects, do not use `.env` files or hardcoded runtime values to satisfy local startup
-- for Dockerized web projects, use the bootstrap script to generate or inject local-development runtime values during startup instead of committing them to the repo
-- for Dockerized web projects, do not pre-seed secret literals in Compose files, config files, Dockerfiles, or startup scripts even if they are labeled dev-only, test-only, or non-production
-- for Dockerized web projects, if runtime values must persist across restarts, persist them only in Docker-managed runtime state rather than committed repo files
-- for Dockerized web projects, place a clear comment in that bootstrap script that it is for local development bootstrap only and is not the production secret-management path
-- when `docker compose up --build` is not the runtime contract, create `./run_app.sh` during scaffold as the single primary runtime wrapper
-- make `./run_tests.sh` self-sufficient from a clean Linux VM that only has Docker and curl available by default
-- do not rely on host package managers or preinstalled host language runtimes for the broad test path when Docker can provide the execution environment instead
-- for web projects using the default Docker-first runtime model, `./run_tests.sh` must execute the broad test path through Docker and should own that Dockerized test flow directly instead of requiring separate manual pre-setup
-- for Dockerized web projects, `./run_tests.sh` must use the same bootstrap path or an equivalent path with the same generated-value rules as `docker compose up --build`
-- when host-level setup would otherwise be required, prefer a Dockerized `./run_tests.sh` path so the broad test remains portable on a bare machine
-- for non-web or non-Docker projects, `./run_tests.sh` must execute the selected stack's platform-equivalent broad test flow while preserving the same single-command interface
-- local non-Docker test commands should still be installed and working for normal development iteration
-- for Electron or other Linux-targetable desktop projects, make `./run_tests.sh` own a Dockerized broad path that covers build, tests, packaging smoke checks, and headless UI/runtime verification through Xvfb or an equivalent Linux-capable desktop harness
-- for Android projects, make `./run_tests.sh` own a Dockerized broad path that covers Gradle build, lint, unit tests, and local Android JVM-side tests such as Robolectric without depending on an emulator
-- for iOS-targeted projects, still create `./run_tests.sh`, but make it the portable Linux verification wrapper for lint, typecheck, shared logic tests, JS/UI-level tests when applicable, and static config/build-shape validation rather than fake native iOS runtime proof
-- create required testing directories and baseline docs structure
-- create the owner-maintained external docs structure under parent-root `../docs/`
-- do not create documentation files inside `repo/` beyond `README.md`
-- put baseline config, logging, validation, and error-normalization structure in place
-- install and configure the local test tooling needed for ordinary iteration during scaffold rather than deferring local testing setup to later phases
-- create baseline test structure intentionally during scaffold so the project can grow toward at least 90 percent meaningful coverage instead of retrofitting tests late
-- when API tests are material, scaffold them so they hit real endpoints and print simple useful response evidence such as status codes and message/body summaries instead of hiding the real API behavior behind helper-only checks
-- for frontend-bearing web projects, install the local browser E2E tooling plus the component/page-or-route frontend test layer during scaffold when the project will need them
-- for mobile projects, install the local mobile testing layer during scaffold, defaulting to Jest plus React Native Testing Library for Expo/React Native work
-- for desktop projects, install the local desktop testing layer during scaffold, defaulting to the selected project test runner and Playwright Electron support or an equivalent desktop UI/E2E tool when UI verification is required
-- for Android projects, install the local Android testing layers needed for the Dockerized broad path during scaffold without relying on an emulator
-- for iOS-targeted projects, install the portable test layers that can run on Linux during scaffold and document any native iOS runtime/build gap honestly instead of hiding it
-- put migrations, worker/job foundation, and real runtime health surfaces in place when the project needs them
-- when the project has database dependencies, create `./init_db.sh` during scaffold as the only project-standard database initialization path
-- if the project has database dependencies, make `./init_db.sh` handle the real database setup already known at scaffold time rather than leaving a placeholder shell
-- if the project has database dependencies, wire the runtime and test entrypoints to call `./init_db.sh` whenever database preparation is required
-- if the project has database dependencies, treat `./init_db.sh` as a living project artifact that must be expanded as migrations, schema setup, bootstrap data, and other database dependencies become real through implementation
-- do not hardcode database connection values or database bootstrap values in the repo; drive database setup through `./init_db.sh`
-- when the project has database dependencies, do not package local database dependency files or local database state as part of delivery; the delivery should rely on the initialization-script path instead
-- treat prompt-critical security controls as real baseline runtime behavior, not placeholder checks or visual wiring
-- if a requirement implies enforcement, persistence, statefulness, or rejection behavior, make that behavior real in the scaffold unless the prompt clearly scopes it down
-- do not accept shape-only security implementations such as header presence checks, passive constants, or partially wired middleware when the requirement implies real protection
-- when applicable at scaffold time, require platform-appropriate real security baselines rather than shape-only placeholders
-- for web service flows, examples include nonce replay rejection, real lockout behavior, CSRF rejection on protected mutations, and meaningful server-side state when the protection model depends on it
-- for mobile flows, examples include safe auth-state handling, no bundled secrets, secure token storage when runtime tokens must be persisted, and protected deep-link or privileged screen behavior when applicable
-- for desktop flows, examples include secure preload/contextBridge boundaries, `contextIsolation` preserved, `nodeIntegration` not exposed by default in the renderer, and privileged actions kept out of untrusted renderer reach
-- remove prototype residue from runtime foundations: no placeholder titles, hidden setup, fake defaults, or seeded live-path assumptions
-- make prompt-critical runtime behavior visible in the scaffold instead of hand-waving it for later, especially offline, worker, backup, or HTTPS requirements
-- for Dockerized web projects, keep runtime isolation clean in shared environments: use self-contained Compose namespacing, avoid fragile generic project names, and prefer Compose-managed service naming over unnecessary hardcoded `container_name` values
-- for Dockerized web projects, derive a unique `COMPOSE_PROJECT_NAME` from the repo or worktree identity for runtime wrappers, and use a separate unique test namespace for `./run_tests.sh` so parallel local projects do not collide
-- for Dockerized web projects, expose only the primary app-facing port to the host by default, keep databases/cache/internal services off host ports unless the prompt truly requires exposure, and bind exposed ports to `127.0.0.1`
-- for Dockerized web projects, prefer Docker-assigned random host ports for the default host binding so plain `docker compose up --build` can run without host-port collisions; if the prompt requires a fixed host port, support an overrideable host-port variable and make the runtime or test wrapper fall back to a free port automatically when needed
-- for Dockerized web projects, keep image, network, and volume naming under Compose project scoping; if explicit image names are needed, namespace them with the Compose project name instead of using generic shared names
-- for Dockerized web projects, add healthchecks and make runtime or test wrappers wait for service readiness before proceeding so startup is reliable on slower machines
-- require reproducible build and tooling foundations: prefer lockfile-driven installs where the stack supports them, keep source and build outputs clearly separated, and do not allow generated runtime artifacts to drift back into source directories
-- for typed build pipelines, keep source-of-truth boundaries clean so compiled output does not create TS/JS or similar dual-source drift in the working tree
-- establish README structure early instead of leaving it until the end
-- ensure `README.md` clearly documents the primary runtime command and the broad `./run_tests.sh` contract for the selected stack
-- ensure `README.md` focuses on what the project does, how to run it, how to test it, the main repo contents, and any important new-developer information rather than trying to replace the full API catalog
-- ensure `README.md` also explains the delivered architecture and major implementation structure clearly enough for code review and handoff
-- ensure `README.md` stands on its own and does not tell users or reviewers to rely on parent-root docs for core repo understanding
-- for Dockerized web projects, ensure `README.md` explains that local runtime values are bootstrapped automatically by the development startup path and that this is local-development behavior rather than production secret management
-- maintain the seeded parent-root `../docs/design.md` as the owner-maintained planning/design contract from the start
-- maintain the seeded parent-root `../docs/test-coverage.md` as the owner-maintained evaluator-facing test matrix from the start
-- when API surfaces are material, fill in the seeded parent-root `../docs/api-spec.md` instead of treating it as a later add-on
-- if the project uses mock, stub, fake, interception, or local-data behavior, disclose that scope in `README.md` during scaffold instead of waiting until late phases
-- if mock or interception behavior is enabled by default, disclose that default state in `README.md` during scaffold
-- if feature flags, debug or demo surfaces, or default-enabled config toggles exist, disclose them in `README.md` during scaffold and record the fuller reference in parent-root docs when needed
-- establish a shared logging path during scaffold with meaningful categories, redaction expectations, and no reliance on random print statements as the long-term logging surface
-- establish a shared validation path during scaffold so forms, requests, boundary checks, and normalized error behavior do not get invented ad hoc later
-- prove the scaffold in a clean state before deeper feature work
-- verify clean startup and teardown behavior under the selected stack's runtime contract
-- make the scaffold handoff compact and checklist-driven: the developer should be able to state runtime proof, test proof, docs honesty, and required repo-surface proof without a long narrative dump
-- for Dockerized web projects, verify clean startup and teardown behavior under the chosen project namespace
-- when the architecture materially depends on infrastructure capabilities such as rate limiting, encryption, offline support, or browser-storage policy, put the baseline framework and policy in place during scaffold rather than deferring it to late implementation
-- for backend integration paths, prefer production-equivalent test infrastructure when practical rather than silently substituting a weaker database or runtime model that can hide real defects
-- do not treat scaffold as placeholder boilerplate or rely on hidden setup
-## Current policy
+- chosen stack and bootstrap path are explicit
+- requested platform and required technologies are installed and wired
+- the primary runtime command is real
+- `./run_tests.sh` is real as the portable broad test wrapper
+- `docker compose up --build` is real and working
+- `./run_tests.sh` is containerized and working
+- `./run_app.sh` may exist as an additional platform helper when useful, but it does not replace the Docker-based baseline contract
+- minimal real tests exist so the scaffold is not mostly `NO-SOURCE`
+- `README.md` is honest about scaffold status, runtime, tests, and main module layout
+- `README.md` already includes the baseline section shape needed for the final README audit, even if many sections are still marked scaffold-level
+## What scaffold is not
+Do not turn scaffold into feature delivery.
+At scaffold time, do not require:
+- prompt-specific business workflows
+- full auth/role/product enforcement unless the runtime baseline genuinely depends on it
+- final module completion
+- integrated product polish
+- deep domain behavior beyond what the chosen stack/runtime/tests need immediately
+## Runtime selection rule
+- follow the original prompt and existing repository first
+- for web projects, `docker compose up --build` is always required as the runtime baseline contract
+- for Android, iOS, and desktop projects, `docker compose up --build` is also required, but it may start a meaningful containerized build, artifact, preview, or support environment rather than pretending to be native runtime proof
+- `./run_tests.sh` must exist for every project and be containerized
+- when helpful, non-web projects may also provide `./run_app.sh` for host-side convenience or platform-specific local flow, but that does not replace the required Docker commands
+## Playbook rule
+- use the packaged scaffold docs under `~/slopmachine/scaffold-playbooks/` as owner-side source material
+- start with `~/slopmachine/scaffold-playbooks/selection-matrix.md`
+- use `docker-shared-contract.md` as the common Docker/runtime/test contract
+- when no exact stack playbook exists yet, use `generic-unknown-tech-guide.md` before falling back to nearest-family improvisation
+- use the family matrices to resolve placeholders, language-only prompts, and open-ended stack selections before choosing a concrete playbook
+- do not tell the developer to read those files directly if they are outside `repo/`; restate the relevant directives in the developer prompt
+- when a matching playbook exists, prefer following it over inventing a new scaffold contract from scratch
+- if no exact playbook exists, choose the nearest platform family playbook plus the framework-specific bootstrap command that best matches the prompt
+## Bootstrap rule
+- when the selected stack has an official or clearly best-known bootstrap command, prefer it
+- prefer the current recommended bootstrap variant rather than an outdated minimal skeleton
+- only hand-build from scratch when no credible bootstrap path exists or the existing repo already dictates a different structure
+- after generation, adapt only enough to make the baseline runtime, tests, wrapper scripts, and README contract real
+## Safe defaults
 - no `.env` files or env-file variants in the repo
-- do not edit `AGENTS.md` or other workflow/rulebook files unless explicitly asked
+- no committed secrets or plaintext bootstrap credentials
 - keep generated artifacts out of source-of-truth paths
-- keep real secrets out of the repository and use the selected stack's runtime/platform mechanism for sensitive values
-- if the stack requires env-file format at runtime, generate it ephemerally from the selected runtime environment rather than storing it in the repo or package
-- for Dockerized web projects, `docker compose up --build` must work as a single command without user-side exports
-- for Dockerized web projects, `./run_tests.sh` must work under the same no-export, no-`.env`, no-pre-seeded-secret-literals model
-- if the project has database dependencies, `./init_db.sh` must exist from scaffold onward and stay current with the real database setup
-- if the project uses mock, stub, fake, interception, or local-data behavior, the scaffold must already make those boundaries statically visible in docs and code structure
-- the repo must not depend on parent-root docs or sibling artifacts for startup, testing, build/preview, config, security understanding, or evaluator traceability
+- make the repo self-sufficient through code plus `README.md`
+- do not create extra in-repo docs beyond `README.md`
+- use one bootstrap source of truth when seed/setup data is needed
+- keep runtime/test/bootstrap paths aligned instead of inventing separate hidden setup flows
-## Acceptance target
+## Baseline requirements
+- create the initial project structure intentionally
+- wire the required technologies named by the prompt or selected by the default stack
+- install local targeted test tooling needed for ordinary development iteration
+- create baseline test structure intentionally during scaffold
+- when backend or fullstack API surfaces already exist at scaffold time, create the test structure in a way that later endpoint-by-endpoint mapping and true no-mock HTTP testing can grow without churn
+- create `./run_tests.sh` during scaffold for every project
+- create `./run_app.sh` during scaffold for non-web platforms when it helps expose the host-side or platform-specific local flow, but keep `docker compose up --build` and containerized `./run_tests.sh` as required baseline commands
+- if the project has database dependencies, create `./init_db.sh` during scaffold as the only project-standard database initialization path
+- make the scaffold handoff compact and checklist-driven rather than a long narrative dump
+## Minimal real test floor
+Do not leave the broad path mostly empty.
+At scaffold, require at least:
+- one or more real tests for a core baseline contract or rules/helper layer
+- one real runtime/build verification path in `./run_tests.sh`
+- one real smoke-level proof that the chosen stack is wired correctly
+- any static verification tasks that the selected platform needs to keep the baseline honest, such as schema export verification, lint, typecheck, or package/build health
+- one real Docker-backed runtime/build path behind `docker compose up --build`
+## README floor
+`README.md` inside `repo/` must already state:
+- project type near the top using one of `backend`, `fullstack`, `web`, `android`, `ios`, or `desktop`
+- scaffold status versus implemented product scope
+- primary runtime command
+- required Docker command: `docker compose up --build`
+- broad test command
+- startup instructions
+- access method
+- verification method
+- authentication section with demo credentials for known roles or the exact statement `No authentication required`
+- tech stack clarity and brief architecture explanation
+- important roles or workflows when relevant
+- major module or repo layout
+- important bootstrap/setup notes
+- any major honesty boundaries such as local-only, mock-only, offline-only, no-emulator, or no-native-runtime proof when those limits apply
+When the project type requires extra README shape for the final audit:
+- backend, fullstack, and web projects should include the exact legacy compatibility string `docker-compose up` somewhere in startup guidance in addition to the canonical `docker compose up --build` contract
+- Android projects should include a host-side build or emulator or device guidance section even though Docker remains required for the final broad contract
+- iOS projects should include Xcode or simulator or device guidance when applicable even though Docker remains part of the broader SlopMachine contract
+- desktop projects should include host-side run or build guidance when applicable even though Docker remains part of the broader SlopMachine contract
+## Parallelism rule
+- once the stack, runtime contract, and wrapper-script shape are fixed, use parallel branches for independent scaffold foundations when that materially reduces elapsed time
+- good scaffold parallel candidates include app shell, test harness, config wiring, and README baseline work that touch different areas
+- do not parallelize overlapping generator churn or shared foundation invention before the baseline contract is stable
-Scaffold should make later slices easier, not force them to retrofit missing fundamentals.
+## Acceptance target
-Before scaffold is handed back for owner acceptance, the developer should already have a compact answer for these scaffold checklist items:
+Scaffold is acceptable when:
-- runtime bootstrap works
-- database/bootstrap path works when relevant
-- `./run_tests.sh` works at the broad-scaffold level
-- frontend/backend wiring shape is real
-- config/env/bootstrap path is honest
-- `README.md` and scaffold docs are honest about what is and is not implemented
-- required scaffold files and directories exist
+- `docker compose up --build` works
+- `./run_tests.sh` works at the scaffold baseline level
+- required wrapper scripts exist and are honest
+- required technologies are wired
+- minimal real tests exist
+- `README.md` is honest and traceable
 - prohibited shortcuts or residue are not present
 ## Verification cadence
 - use local and narrow checks while correcting scaffold work
 - reserve one broad owner-run scaffold gate for actual scaffold acceptance
-- do not spend extra broad reruns once the acceptance question is already answered
-- for web projects using the default Docker-first runtime model, the owner must run `docker compose up --build` and `./run_tests.sh` once after scaffold completion to confirm the baseline actually works
-- after that single scaffold proof, ordinary development should use only local fast verification until development-complete / integrated-verification entry unless a real blocker forces earlier broad reruns
-- after that scaffold confirmation, do not run Docker again during ordinary development work; the next Docker-based run should be at development completion when integrated behavior is checked
+- for web projects, the owner should run `docker compose up --build` and `./run_tests.sh` once after scaffold completion
+- after scaffold acceptance, return to local fast verification until the next major gate unless a real blocker forces earlier escalation
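
The "README floor" added in this file lends itself to a mechanical pre-check. The sketch below is not part of the package: `check_readme_floor` is a hypothetical helper, it covers only four of the listed items, and both the 15-line window used for "near the top" and the `demo credentials` keyword are assumptions made for illustration.

```python
import re

# Project types named in the README floor of the 0.7.0 skill text.
PROJECT_TYPES = ("backend", "fullstack", "web", "android", "ios", "desktop")


def check_readme_floor(text):
    """Return the README floor items that appear to be missing.

    Hypothetical helper: it checks only a subset of the floor, and the
    heuristics (first 15 lines, 'demo credentials' keyword) are assumptions.
    """
    missing = []

    # "Near the top" is interpreted here as the first 15 lines (assumption).
    head = "\n".join(text.splitlines()[:15]).lower()
    if not any(re.search(rf"\b{t}\b", head) for t in PROJECT_TYPES):
        missing.append("project type near the top")

    if "docker compose up --build" not in text:
        missing.append("required Docker command")

    if "./run_tests.sh" not in text:
        missing.append("broad test command")

    # Either the exact no-auth statement or some mention of demo credentials.
    has_statement = "No authentication required" in text
    has_credentials = re.search(r"demo credentials", text, re.IGNORECASE) is not None
    if not (has_statement or has_credentials):
        missing.append("authentication section")

    return missing
```

An empty list means only that these four string-level checks passed. It does not show that the README is honest about scaffold status, which still needs a human read.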

package/assets/skills/submission-packaging/SKILL.md CHANGED

@@ -28,15 +28,21 @@ The final delivery layout in the parent project root must be:
   - `api-spec.md` when applicable
   - `test-coverage.md`
   - `questions.md`
-- `sessions/`
-  - `<label>.json` for every tracked developer session, including `develop-N.json` and `bugfix-N.json` when present
-- `session-<label>.json` for every tracked developer session
+- for non-Claude developer sessions:
+  - `sessions/`
+  - `sessions/<label>.json` for every tracked developer session, including `develop-N.json` and `bugfix-N.json` when present
+- for Claude-backed developer sessions:
+  - `claude-sessions.zip` in the parent root containing the whole Claude project session folder once
+  - no `sessions/` directory is required when all tracked developer sessions are Claude-backed
 - `metadata.json`
-- `self_test_reports/`
-  - `cycle-0/`
-  - `cycle-1/`
+- `.tmp/`
+  - `audit_report-<N>.md`
+  - `audit_report-<N>-fix_check-<M>.md` when present
+  - `test_coverage_and_readme_audit_report.md`
 - `repo/`
+In the clean two-bugfix path, `.tmp/` should end with at least 5 required markdown reports once the final coverage/README audit is included, though extra fresh audits or extra fix checks may legitimately increase that count.
 Inside the delivered `repo/`, the repository must remain self-sufficient:
 - `README.md`
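
The `.tmp/` report naming introduced in this hunk can be tallied with a small classifier. This is a rough sketch, not something shipped in the package: `summarize_tmp_reports` is a hypothetical name, and treating `<N>` and `<M>` as decimal integers is an assumption, since the diff does not define them.

```python
import re

# File-name shapes taken from the 0.7.0 delivery layout; <N> and <M> are
# assumed to be decimal integers for the purpose of this sketch.
AUDIT = re.compile(r"^audit_report-(\d+)\.md$")
FIX_CHECK = re.compile(r"^audit_report-(\d+)-fix_check-(\d+)\.md$")
FINAL = "test_coverage_and_readme_audit_report.md"


def summarize_tmp_reports(names):
    """Classify file names found in `.tmp/` and count the recognized reports."""
    summary = {"audits": 0, "fix_checks": 0, "final": False, "other": []}
    for name in names:
        if AUDIT.match(name):
            summary["audits"] += 1
        elif FIX_CHECK.match(name):
            summary["fix_checks"] += 1
        elif name == FINAL:
            summary["final"] = True
        else:
            summary["other"].append(name)
    # Total of recognized markdown reports, including the final audit if present.
    summary["total_recognized"] = (
        summary["audits"] + summary["fix_checks"] + int(summary["final"])
    )
    return summary
```

A caller could compare `total_recognized` against the "at least 5" figure quoted for the clean two-bugfix path. Which mix of audits and fix checks makes up that count is not spelled out in the diff, so the sketch only counts and does not judge.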
@@ -64,34 +70,32 @@ No screenshots are required as packaging artifacts.
 - include `./run_tests.sh` and any supporting runner logic it needs to execute the project's broad test path from a clean environment
 - when the project has database dependencies, include `./init_db.sh` and ensure it reflects the final delivered database setup rather than an earlier scaffold placeholder
 - when the project has database dependencies, package the initialization-script path rather than raw environment-specific database dependency artifacts or local database state
-- verify parent-root `../self_test_reports/` exists and contains the required counted cycle directories
+- verify parent-root `../.tmp/` exists and contains the required audit and fix-check reports
+- verify parent-root `../.tmp/test_coverage_and_readme_audit_report.md` exists from the final post-bugfix coverage/README audit
 - export all tracked developer sessions before closing packaging
 - when packaging succeeds, update workflow metadata to mark `packaging_completed` as true
 ## Session export sequence
-Export every tracked developer session from metadata, keep a numbered cleaned root export, and convert each session into its lane-aware trajectory file.
-Use the tracked lane labels for converted developer sessions, for example:
+Export tracked developer sessions from metadata using the tracked lane labels, for example:
 - `develop-1`
 - `bugfix-1`
-For each tracked developer session:
+For session export:
-1. if `<backend>` is `claude`, run `node ~/slopmachine/utils/export_ai_session.mjs --backend claude --cwd "$PWD" --session-id <session-id> --output ../session-<N>.json`
-2. if `<backend>` is not `claude`, run `opencode export <session-id> > ../session-export-<label>.raw`
-3. if `<backend>` is not `claude`, run `python3 ~/slopmachine/utils/strip_session_parent.py ../session-export-<label>.raw --output ../session-<N>.json`
-4. `node ~/slopmachine/utils/convert_exported_ai_session.mjs --converter-script ~/slopmachine/utils/convert_ai_session.py --input ../session-<N>.json --output ../sessions/<label>.json`
+1. if at least one tracked developer session backend is `claude` or `claude-live`, run `node ~/slopmachine/utils/package_claude_session.mjs --cwd "$PWD" --session-id <any-claude-session-id> --label claude-sessions --output ../claude-sessions.zip`
+2. if `<backend>` is neither `claude` nor `claude-live`, run `opencode export <session-id> > ../session-export-<label>.raw`
+3. if `<backend>` is neither `claude` nor `claude-live`, run `python3 ~/slopmachine/utils/strip_session_parent.py ../session-export-<label>.raw --output ../sessions/<label>.json`
 Where `<backend>` comes from the tracked developer session record in metadata.
-Use `opencode` when no explicit backend field exists.
-Use the tracked developer-session order to assign `<N>`.
+Use `opencode` when no explicit backend field exists or when the backend is not Claude-backed.
+For Claude-backed sessions, the package helper resolves the Claude project folder under `~/.claude/projects/` from a tracked `session_id` plus the current project `cwd` and packages that folder once.
 After those steps:
-- verify every tracked developer session has been exported and converted into `../sessions/` before continuing
-- keep `../session-<N>.json` in the parent root as the cleaned or direct exported session artifact
+- verify every non-Claude developer session has been exported into `../sessions/<label>.json`
+- verify Claude-backed sessions have been packaged once into `../claude-sessions.zip`
 - treat only the raw `../session-export-<label>.raw` files as temporary packaging intermediates
 - remove the raw `../session-export-<label>.raw` files before closing packaging
 - if the required utilities, metadata session ids, or output files are missing, packaging is not ready to continue
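The 0.7.0 export routing in this hunk can be sketched as a small planner that only builds the command strings. This is a minimal illustration, not part of the package: the function name `plan_session_exports` and the metadata shape (dicts with `label`, `session_id`, and an optional `backend` key) are assumptions, while the command lines themselves are copied from the added lines above.

```python
from typing import Dict, List

# Backends treated as Claude-backed in the 0.7.0 text.
CLAUDE_BACKENDS = {"claude", "claude-live"}


def plan_session_exports(sessions: List[Dict[str, str]]) -> List[str]:
    """Return the shell commands for session export, in order (hypothetical helper)."""
    commands: List[str] = []

    # Claude-backed sessions are packaged once, using any one Claude session id.
    claude_ids = [
        s["session_id"]
        for s in sessions
        if s.get("backend", "opencode") in CLAUDE_BACKENDS
    ]
    if claude_ids:
        commands.append(
            "node ~/slopmachine/utils/package_claude_session.mjs "
            f'--cwd "$PWD" --session-id {claude_ids[0]} '
            "--label claude-sessions --output ../claude-sessions.zip"
        )

    # Every other session (including records with no backend field) uses opencode.
    for s in sessions:
        if s.get("backend", "opencode") in CLAUDE_BACKENDS:
            continue
        label, sid = s["label"], s["session_id"]
        raw = f"../session-export-{label}.raw"
        commands.append(f"opencode export {sid} > {raw}")
        commands.append(
            "python3 ~/slopmachine/utils/strip_session_parent.py "
            f"{raw} --output ../sessions/{label}.json"
        )
    return commands
```

Returning strings instead of executing them keeps the sketch side-effect free; the raw `.raw` intermediates would still need to be removed afterwards, as the list above requires.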
@@ -101,11 +105,12 @@ After those steps:
 - run `python3 ~/slopmachine/utils/cleanup_delivery_artifacts.py .` once near the end of packaging to remove known recursive cleanup targets from the delivered repo tree
 - remove runtime, editor, cache, tooling noise, generated artifacts, and environment junk recursively anywhere in the delivered repo tree
 - do not remove required delivery artifacts just because they look noisy
-- remove `.opencode/`, `.codex/`, `.vscode/`, env-file variants, caches, `node_modules/`, `.venv/`, `.net/`, build outputs not part of delivery, raw test artifact directories, `__pycache__/`, `.pytest_cache/`, repo-local `AGENTS.md`, and accidental in-repo docs directories or extra documentation files beyond `README.md`
+- the cleanup helper removes common known targets such as `.opencode/`, `.codex/`, `.vscode/`, env-file variants, `node_modules/`, `.venv/`, `.net/`, common build/cache directories, `__pycache__/`, `.pytest_cache/`, and repo-local `AGENTS.md` / `CLAUDE.md`
 - remove environment-dependent content, local dependency trees, editor state, package-manager caches, and runtime caches anywhere in the delivery tree
 - do not package database dependency files or local database state when the delivered database setup is supposed to be injected through initialization scripts
-- do not package AI session conversion scripts or similar workflow utility scripts inside the delivered product attachment
-- remove repo-local `.tmp/` or parent-root `../.tmp/` if they exist; they are not part of the final delivery contract
+- manually review for accidental in-repo docs directories or workflow utility scripts that the helper does not remove automatically, and remove them before closing packaging
+- remove repo-local `.tmp/` if it exists and is not part of the delivered product
+- do not remove parent-root `../.tmp/`; it now holds the required `P7` audit and fix-check artifacts
 - the cleanup is recursive; do not leave forbidden directories or generated junk buried deeper in the repo hierarchy after cleanup
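A read-only check for leftover cleanup targets might look like the sketch below. It is not the `cleanup_delivery_artifacts.py` helper and does not claim to match its behavior: the name `find_cleanup_leftovers` is hypothetical, only the directory and file names spelled out in the list above are covered (env-file variants and "common build/cache directories" are left out because the diff does not enumerate them), and flagging `AGENTS.md` / `CLAUDE.md` at any depth is an assumption.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Names taken from the cleanup list above; anything not named there is omitted.
FORBIDDEN_DIRS = {
    ".opencode", ".codex", ".vscode", "node_modules",
    ".venv", ".net", "__pycache__", ".pytest_cache",
}
FORBIDDEN_FILES = {"AGENTS.md", "CLAUDE.md"}


def find_cleanup_leftovers(paths: Iterable[str]) -> List[str]:
    """Return the repo-relative paths that still sit in or match a cleanup target."""
    leftovers = []
    for p in paths:
        parts = PurePosixPath(p).parts
        in_forbidden_dir = bool(FORBIDDEN_DIRS.intersection(parts))
        is_forbidden_file = bool(parts) and parts[-1] in FORBIDDEN_FILES
        if in_forbidden_dir or is_forbidden_file:
            leftovers.append(p)
    return sorted(leftovers)
```

The function takes a plain list of relative paths, so it can be fed from a directory walk or a file listing; an empty result corresponds to the "no known recursive cleanup targets remain" confirmation.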
 ## Validation checklist
@@ -120,11 +125,12 @@ After those steps:
 - when the project has database dependencies, confirm database setup is injected through initialization scripts rather than packaged local database dependency artifacts
 - confirm the cleanup helper has been run and that no known recursive cleanup targets remain in the delivered repo tree
 - confirm no environment-dependent dependency directories, editor-state folders, runtime caches, or workflow utility scripts are packaged into the delivered product
-- confirm parent-root `../self_test_reports/` exists and contains the required counted cycle directories
-- confirm each counted cycle directory contains the initial audit report plus any fix-check reports generated for that cycle
+- confirm parent-root `../.tmp/` exists and contains the required `audit_report-<N>.md` files
+- confirm every bugfix-triggering audit number has its matching `audit_report-<N>-fix_check-<M>.md` files when fix checks were required
+- confirm parent-root `../.tmp/test_coverage_and_readme_audit_report.md` exists and is the final replaced copy rather than a numbered variant
 - confirm parent-root `../docs/test-coverage.md` explains the tested flows, mapped tests, and coverage boundaries
-- confirm exported developer sessions exist under parent-root `../sessions/` using the tracked `<label>.json` names
-- confirm cleaned session exports exist in the parent root as numbered `../session-<N>.json` files
+- confirm every non-Claude developer session exists under parent-root `../sessions/` using the tracked `<label>.json` names
+- confirm Claude-backed developer sessions exist in the parent root as `claude-sessions.zip`
 - confirm parent-root `../docs/` remains consistent as an external reference set when workflow policy still requires it, but the delivered repo does not depend on it
 - confirm parent-root metadata fields are populated correctly
 - confirm workflow metadata marks `packaging_completed` as true
@@ -137,4 +143,4 @@ After those steps:
 - confirm the delivered project is actually runnable in the promised startup model, the documented tests are runnable, frontend behavior is usable when applicable, UI quality is acceptable, core logic is complete, and Docker startup works when Docker is the runtime contract
 - confirm the final git checkpoint can be created cleanly for the packaged state when a checkpoint is needed
 - if packaging reveals a real defect or missing artifact, fix it before closing the phase
-- do not close packaging until all required docs, session exports, self-test files, cleanup conditions, and final structure checks are satisfied
+- do not close packaging until all required docs, session exports, audit/fix-check files, cleanup conditions, and final structure checks are satisfied
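The new `../.tmp/` artifact checks from the validation checklist can be sketched as a filename validator. This is an illustration under stated assumptions: `check_audit_artifacts` is a hypothetical name, the caller supplies which audit numbers triggered a bugfix session, and the sketch assumes each of those audits needs at least one fix-check file, whereas the checklist only requires them "when fix checks were required".

```python
import re
from typing import Iterable, List, Set

AUDIT_RE = re.compile(r"^audit_report-(\d+)\.md$")
FIX_CHECK_RE = re.compile(r"^audit_report-(\d+)-fix_check-(\d+)\.md$")
FINAL_REPORT = "test_coverage_and_readme_audit_report.md"


def check_audit_artifacts(filenames: Iterable[str], bugfix_audits: Set[int]) -> List[str]:
    """Return a list of problems found in the ../.tmp/ file listing (empty if none)."""
    names = list(filenames)
    audits: Set[int] = set()
    fix_checked: Set[int] = set()
    for name in names:
        audit_match = AUDIT_RE.match(name)
        if audit_match:
            audits.add(int(audit_match.group(1)))
        fix_match = FIX_CHECK_RE.match(name)
        if fix_match:
            fix_checked.add(int(fix_match.group(1)))

    problems: List[str] = []
    if not audits:
        problems.append("no audit_report-<N>.md files found")
    for number in sorted(bugfix_audits):
        if number not in audits:
            problems.append(f"missing audit_report-{number}.md")
        if number not in fix_checked:
            problems.append(f"audit {number} has no fix_check report")
    # The final report must be the unnumbered file, not a numbered variant.
    if FINAL_REPORT not in names:
        problems.append(f"missing {FINAL_REPORT}")
    return problems
```

A typical call would pass `os.listdir("../.tmp")` as `filenames`.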

package/assets/skills/verification-gates/SKILL.md CHANGED

@@ -25,6 +25,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - require the README to explain what the project does, how to run it, how to test it, the main repo contents, and any important new-developer information
 - require the README to show the correct primary runtime command and `./run_tests.sh` as the primary broad test command
 - do not require the README to carry a full API catalog
+- require the README to include the strict audit sections when they are relevant to the project shape: project type near the top, startup instructions, access method, verification method, and demo credentials for every role or the exact statement `No authentication required`
 - do not allow the repo to depend on parent-root docs or sibling artifacts for startup, build/preview, configuration, evaluator traceability, or basic project understanding
 - require the delivered repo to be statically reviewable: README, scripts, entry points, routes, config, and test commands must be traceably consistent
 - if the project uses mock, stub, fake, interception, or local-data behavior, require the README and visible code boundaries to disclose that scope accurately
@@ -33,7 +34,8 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - require parent-root `../docs/test-coverage.md` to be evaluator-shaped rather than generic: requirement or risk point, mapped test evidence, coverage status, major gap, and minimum test addition
 - when auth or access-control behavior is relevant, require static security-boundary evidence that a fresh reviewer can trace for auth entry points, route authorization, object authorization, function-level authorization, admin/internal/debug surfaces, and tenant or user isolation when applicable
 - require logging structure and validation or error-handling structure to be statically traceable from repo artifacts and, when needed, owner-maintained external docs
-- for web projects, default the runtime command to `docker compose up --build` unless the prompt or existing repository clearly dictates another model
+- for web projects, require the runtime command to be `docker compose up --build`
+- for backend, fullstack, and web projects, allow and expect an additional README compatibility note containing the exact string `docker-compose up` for the strict README audit, but do not treat that as a replacement for the canonical `docker compose up --build` contract
 - for Dockerized web projects, require a dev-only runtime bootstrap script or equivalent startup path so `docker compose up --build` works without user exports or `.env`
 - do not accept Dockerized web startup that depends on manual export steps before the runtime command
 - do not accept Dockerized web startup that relies on checked-in `.env` files or hardcoded runtime values to satisfy local startup
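The exact-string README requirements added in these hunks lend themselves to a quick pre-check. The sketch below is an assumption-laden illustration for a web, backend, or fullstack project: `missing_readme_strings` and the `has_auth` flag are hypothetical, and only the literal strings quoted in the diff are tested. Project type placement, access method, verification method, and per-role demo credentials still need human review.

```python
from typing import List

REQUIRED_STRINGS = [
    "docker compose up --build",  # canonical runtime command
    "docker-compose up",          # compatibility note for the strict README audit
    "./run_tests.sh",             # primary broad test command
]
NO_AUTH_STATEMENT = "No authentication required"


def missing_readme_strings(readme_text: str, has_auth: bool) -> List[str]:
    """Return the required literal strings that the README does not contain."""
    missing = [s for s in REQUIRED_STRINGS if s not in readme_text]
    # Without auth, the exact no-auth statement stands in for demo credentials.
    if not has_auth and NO_AUTH_STATEMENT not in readme_text:
        missing.append(NO_AUTH_STATEMENT)
    return missing
```

Because `docker-compose up` (hyphen) is not a substring of `docker compose up --build` (space), the two checks are independent, which mirrors the rule that the compatibility note does not replace the canonical command.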
@@ -41,12 +43,13 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - require `./run_tests.sh` to use the same runtime bootstrap model or an equivalent model with the same generated-value rules as `docker compose up --build`
 - if runtime values persist across restarts, require them to live in Docker-managed runtime state rather than committed repo files
 - require README disclosure that the bootstrap path is local-development-only behavior rather than the production secret-management path
-- when `docker compose up --build` is not the runtime contract, require `./run_app.sh` to be the documented primary runtime wrapper
+- for Android, mobile, desktop, and iOS-targeted projects, require a meaningful `docker compose up --build` command even when platform-specific runtime proof differs from web semantics
+- for Android, mobile, desktop, and iOS-targeted projects, allow `./run_app.sh` as an additional platform helper but not as a replacement for the required Docker command
 - require `./run_tests.sh` to be self-sufficient enough to run from a clean Linux VM that only has Docker and curl available by default
 - do not accept a broad test path that depends on host package managers or preinstalled host language runtimes when Docker can provide the execution environment instead
-- for web projects using the default Docker-first runtime model, require `./run_tests.sh` to be the Dockerized broad test path used only for the limited broad verification moments rather than as the ordinary development verification path
+- for web projects, require `./run_tests.sh` to be the Dockerized broad test path used only for the limited broad verification moments rather than as the ordinary development verification path
 - when host-level setup would otherwise be required, prefer a Dockerized `./run_tests.sh` path even outside traditional web stacks so the broad verification remains portable
-- for non-web or non-Docker projects, require `./run_tests.sh` to be the platform-equivalent broad test path used for final broad verification
+- for non-web projects, require `./run_tests.sh` to remain containerized and usable as the platform-equivalent broad test path used for final broad verification
 ## Review standard
@@ -67,7 +70,11 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - do not accept fake-success paths that materially hide missing failure handling
 - do not accept frontend/backend drift in fullstack work
 - do not accept missing end-to-end coverage for major fullstack flows
-- do not accept coverage posture that clearly falls short of roughly 90 percent meaningful coverage of the relevant behavior surface without a prompt-faithful reason
+- do not accept coverage posture that falls short of the minimum 90 percent coverage threshold for the relevant behavior surface without an explicit prompt-faithful exception
+- when backend or fullstack APIs exist, do not accept missing endpoint inventory or missing API-test mapping for the important `METHOD + PATH` surfaces
+- when backend or fullstack APIs exist, do not accept mocked or indirect tests being presented as equivalent to true no-mock HTTP endpoint coverage
+- do not accept a README that is missing project type, startup instructions, access method, verification method, or auth disclosure when the strict README audit would expect them
+- do not accept final delivered docs or wrapper flows that still depend on `npm install`, `pip install`, `apt-get`, manual DB setup, or other host-only setup assumptions after development is complete
 - do not accept a repo that only becomes understandable by reading parent-root docs or sibling workflow artifacts
 - do not accept frontend-bearing work that lacks repo-local build/preview/config guidance when those commands or surfaces are material to the product
 - do not accept frontend-bearing work that lacks a credible state model for prompt-critical flows
@@ -84,6 +91,16 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - do not accept module completion that ignores integration seams or cross-cutting consistency with the existing system
 - do not accept end-to-end evidence that bypasses a required user-facing or admin-facing surface with direct API shortcuts
+## Gate-demand rule
+- when setting a planning, scaffold, development, integrated-verification, hardening, or evaluation gate, reference the relevant accepted plan sections and then give an explicit stage-exclusive checklist for that gate
+- the gate checklist should name:
+  - the exact outcomes that must now be true
+  - the exact evidence that must now exist
+  - the important shortcuts, omissions, or future-work excuses that are not acceptable for this gate
+- do not re-dump the whole plan; isolate the exact subset of plan-backed expectations that must now be closed
+- at gate moments, prefer more explicit owner messages over ultra-short prompts so the developer cannot plausibly misread what acceptance depends on
 ## Cadence rule
 - use targeted local verification as the default during scaffold corrections, development, hardening, and evaluation fix loops
@@ -91,7 +108,7 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - do not turn ordinary acceptance into repeated integrated-style gate runs
 - do not run `./run_tests.sh` casually on the owner side
 - do not run `docker compose up --build` casually on the owner side
-- for web projects using the default Docker-first runtime model, the owner must run `docker compose up --build` and `./run_tests.sh` once after scaffold completion to confirm the scaffold baseline
+- for web projects, the owner must run `docker compose up --build` and `./run_tests.sh` once after scaffold completion to confirm the scaffold baseline
 - after that scaffold confirmation, the next Docker-based run should be at development completion or integrated-verification entry unless a real blocker forces earlier escalation
 - in between those two broad checks, ordinary development should rely on local fast verification only
 - ordinary in-phase verification should not invoke `docker compose up --build` or `./run_tests.sh` unless the workflow is explicitly at one of those broad gate moments or a blocker justifies an earlier escalation
@@ -101,8 +118,10 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - inspect the result and evidence, not just the developer claim
 - review technical quality, prompt alignment, architecture impact, and verification depth of the current work
 - after planning is accepted, treat the accepted plan and its relevant section as the default slice baseline instead of restating the full slice contract in every owner prompt
-- for ordinary slice work after planning, keep the owner prompt to one short paragraph plus a small checklist of slice-specific guardrails, review concerns, or deltas that are not already clear from the accepted plan
+- for ordinary slice work after planning, keep the owner prompt anchored to the relevant accepted plan sections and use an explicit checklist of slice-specific required outcomes, verification expectations, and review concerns that are not already clear from the accepted plan
+- when the current step is a real gate or phase-exit decision, be more explicit than ordinary slice prompts and enumerate the full stage-exclusive acceptance checklist
 - during normal implementation iteration, always prefer fast local language-native or framework-native verification for the changed area instead of the selected stack's broad gate path
+- during normal implementation iteration, fast local tooling setup is allowed when it helps iteration speed, but treat it as temporary engineering scaffolding rather than part of the final delivered runtime or test contract
 - require the developer to set up and use the project-appropriate local test environment in the current working directory when normal local verification is needed
 - require the developer to report the exact verification commands that were run and the concrete results they produced
 - when API tests are used as evidence, require them to hit real endpoints and expose simple useful response evidence such as status codes and message/body summaries
@@ -126,11 +145,11 @@ Use this skill after development begins whenever you are reviewing work, decidin
 - the evaluator-session cycles required inside `P7` are not part of the ordinary owner-run broad-gate budget; they are the formal final evaluation model for that phase
 - for Electron or other Linux-targetable desktop projects, the broad gate should use the Dockerized desktop build/test path plus headless UI/runtime verification rather than pretending web-style Docker runtime semantics apply
 - for Android projects, the broad gate should use the Dockerized Android build/test path without depending on an emulator
-- for iOS-targeted projects on Linux, the broad gate should rely on `./run_tests.sh` plus static/code review evidence and should not claim native iOS runtime proof unless a real macOS/Xcode checkpoint exists
+- for iOS-targeted projects on Linux, the broad gate should include `docker compose up --build` plus `./run_tests.sh` and static/code review evidence, and should not claim native iOS runtime proof unless a real macOS/Xcode checkpoint exists
 - the workflow target is at most 3 broad owner-run verification moments across the whole cycle
 - ordinary planning, ordinary slice acceptance, and routine in-phase verification are not broad gates by default and should rely on targeted local verification unless the risk profile says otherwise
-For web projects using the default Docker-first runtime model, the default Docker cadence is:
+For web projects, the default Docker cadence is:
 1. one owner-run `docker compose up --build` plus one owner-run `./run_tests.sh` after scaffold completion
 2. no more Docker-based runs during ordinary development work
@@ -144,24 +163,34 @@ Use evidence such as internal metadata files, structured Beads comments, verific
 - clarification requires the `clarification-gate` conditions plus explicit approval record
 - planning requires the `developer-session-lifecycle` and planning-gate conditions plus a fresh planning-oriented start and the required documentation and repo hygiene state when relevant
+- planning exit also requires explicit owner review that the accepted planning artifacts cover the section-addressable contract deeply enough for later implementation: in-scope and out-of-scope, actors and success paths, modules, business rules, state machines, permissions, validation, verification strategy, checkpoints, and definition of done when applicable
+- planning exit does not pass if those sections exist only nominally or remain too vague to drive implementation without broad reinvention
+- planning exit also requires that the accepted plan covers the final README hard-gate shape and, when backend or fullstack APIs exist, the endpoint-inventory and API-test mapping strategy needed for the strict coverage audit
 - scaffold requires evidence for the bounded scaffold gate, baseline logging/config, and when relevant the chosen frontend stack and UI approach being set intentionally
 - scaffold also requires safe env/config handling, no persisted local secrets, real migration/runtime foundations, a usable local test environment in the current working directory, and the correct primary runtime command plus `./run_tests.sh` documented and working when practical
-- for web projects, scaffold normally requires Docker-first runtime foundations unless the prompt or existing repository clearly dictates another model
+- for web projects, scaffold requires Docker runtime foundations
+- for Android, mobile, desktop, and iOS-targeted projects, scaffold also requires a meaningful `docker compose up --build` path plus containerized `./run_tests.sh`
 - for Dockerized web projects, scaffold also requires the dev-only runtime bootstrap path to be wired so `docker compose up --build` works without manual exports or `.env`
 - for Dockerized web projects, scaffold also requires owner review of Compose files, runtime bootstrap scripts, entrypoints or wrappers, and `./run_tests.sh` to confirm the no-export, no-`.env`, no-pre-seeded-secret-literals model is actually implemented
 - when the project has database dependencies, scaffold also requires a real `./init_db.sh` created during scaffold, wired into the runtime/test flow when needed, and populated with the database setup already known at that stage
 - scaffold also requires `./run_tests.sh` to handle its own required setup from a clean Linux VM that only has Docker and curl available by default
 - local tests should still exist for ordinary development work even when the primary broad test command is Dockerized
+- scaffold also requires `README.md` to have the baseline section shape needed for the final README audit, even when many sections are still scaffold-level placeholders
 - when scaffold includes prompt-critical security controls, acceptance requires real runtime or endpoint verification of the protection rather than helper-only or shape-only proof
 - for security-bearing scaffolds, require applicable rejection evidence such as stale replay rejection, nonce reuse rejection, CSRF rejection on protected mutations, lockout triggering when lockout is in scope, or equivalent proof that the control is truly enforced
 - scaffold acceptance also requires clean startup and teardown behavior in the selected runtime model; for Dockerized web projects this includes self-contained Compose namespacing and no unnecessary fragile `container_name` usage
 - for Dockerized web projects, scaffold acceptance also requires collision-resistant shared-machine defaults: only the primary app-facing port exposed to host by default, internal services not bound to host without prompt need, default host binding on `127.0.0.1`, and either random host-port assignment or a real free-port fallback when fixed ports are required
-- for web projects using the default Docker-first runtime model, scaffold acceptance is not complete until the owner has actually run `docker compose up --build` and `./run_tests.sh` once successfully after scaffold completion
+- for web projects, scaffold acceptance is not complete until the owner has actually run `docker compose up --build` and `./run_tests.sh` once successfully after scaffold completion
+- for Android, mobile, desktop, and iOS-targeted projects, scaffold acceptance is not complete until the owner has also run `docker compose up --build` and `./run_tests.sh` once successfully after scaffold completion
 - module implementation requires targeted local verification only; browser E2E and other broad gate evidence belong to owner-run major checkpoints rather than ordinary slice acceptance
+- module implementation acceptance requires explicit checking against the relevant accepted plan sections and the current stage-exclusive checklist, not just a loose sense that the feature exists
 - module implementation acceptance should challenge tenant isolation, path confinement, sanitized error behavior, prototype residue, integration seams, and cross-cutting consistency when those concerns are in scope
 - module implementation acceptance should use a narrow slice-close checklist: required behavior present, adjacent high-risk seams checked, docs or contract honesty preserved, exact verification evidence supplied, and no known release-facing regression left behind
+- when backend or fullstack APIs are touched, module implementation acceptance should also check that endpoint-oriented coverage notes and true no-mock HTTP tests are moving with the code instead of being deferred indefinitely
 - integrated verification entry requires one of the limited owner-run broad gate moments once development is complete; this is the normal next place where `docker compose up --build` and `./run_tests.sh` are expected after scaffold acceptance
-- module implementation acceptance should also challenge whether the slice is advancing toward the planned module contract and the planned 90 percent meaningful coverage target instead of accumulating test debt
+- module implementation acceptance should also challenge whether the slice is advancing toward the planned module contract and the hard minimum 90 percent coverage threshold instead of accumulating test debt
+- before leaving development, require explicit proof that the planned development outcomes for the relevant modules or slices are actually closed, not merely started, and that the targeted verification evidence covers the important happy path, failure path, and security or ownership path where relevant
+- before leaving development, require cleanup of local-iteration residue from the delivered contract: final README, wrapper scripts, and declared run/test flows should no longer depend on host-only setup conveniences
 - integrated verification completion requires explicit full-system evidence before the phase can close
 - integrated verification completion also requires explicit evidence that the delivered startup path is runnable, the documented tests are real and runnable, frontend behavior is usable when applicable, UI quality is acceptable, core logic is complete, and Docker startup works when Docker is the runtime contract
 - web fullstack integrated verification must include owner-run Playwright coverage for every major flow, plus screenshots used to evaluate frontend behavior and UI quality along the flow using `frontend-design`
@@ -174,14 +203,15 @@ Use evidence such as internal metadata files, structured Beads comments, verific
 - hardening must explicitly re-check secret handling, redaction, and frontend/backend observability hygiene
 - hardening must explicitly satisfy the documentation and repo hygiene policy in this file before final evaluation can begin
 - hardening must leave the repo statically reviewable enough that the final static evaluator can trace startup, tests, entry points, routes, config, and mock/local-data boundaries without rewriting core code
-- hardening must explicitly challenge any remaining gaps against the intended 90 percent meaningful coverage target and require justification or fixes before `P7`
+- hardening must explicitly challenge any remaining gaps against the minimum 90 percent coverage threshold and require proof, fixes, or an explicit prompt-faithful exception before `P7`
 - before `P7`, require that parent-root `../docs/test-coverage.md` is detailed enough for the owner to map major requirement and risk points to tests and gaps without inference work
 - before `P7`, require that security-bearing projects present traceable static evidence for auth entry points, route authorization, object authorization, function-level authorization, admin/internal/debug protection, and tenant or user isolation when those dimensions apply
 - before `P7`, for non-trivial frontend work, require meaningful static frontend test evidence for major state transitions or failure paths rather than relying only on runtime screenshots or E2E confidence
 - before `P7`, require repo-local build/preview/config traceability plus disclosure in `README.md` of feature flags, debug/demo surfaces, and mock defaults when those surfaces exist
 - before `P7`, require logging and validation contracts to be statically traceable enough that the owner can review them from the repo plus external references when needed
-- final evaluation readiness requires the cycle-based `P7` self-test model under `../self_test_reports/`; failed initial audits trigger non-counted remediation, counted cycles begin only from a `pass` or `partial pass` initial audit, cycle fix loops stay scoped to that cycle's initial issue list, and 2 successful fresh-session counted cycles are required before final human decision
+- final evaluation readiness requires the audit-numbered `P7` model under `../.tmp/`; every fresh evaluation produces `audit_report-<N>.md`, `fail` audits route back to the latest `develop-N` session, `partial pass` audits open scoped `bugfix-N` sessions whose fix checks are stored as `audit_report-<N>-fix_check-<M>.md`, clean `pass` audits before the required bugfix sessions are discarded and rerun, and `P7` cannot finish until 2 bugfix sessions have been completed plus a clean `test_coverage_and_readme_audit_report.md`
 - if the `P7` issue-fix loop materially reopens the integrated verification boundary, route it back through integrated verification before continuing with follow-up fix verification
+- before leaving `P7`, require a clean parent-root `../.tmp/test_coverage_and_readme_audit_report.md`; if it finds any issue, route the fixes to the currently active recoverable developer session, replace the report, and rerun the audit until clean
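The audit-numbered `P7` routing described in the two added bullets above can be read as a small decision function. This is one interpretation, not the package's implementation: `route_audit` and its return strings are hypothetical, and the diff does not say what a `partial pass` should do once both required bugfix sessions are already complete, so the sketch simply opens another scoped bugfix session in that case.

```python
REQUIRED_BUGFIX_SESSIONS = 2  # "2 bugfix sessions" from the final evaluation rule


def route_audit(result: str, completed_bugfix_sessions: int) -> str:
    """Map one fresh audit result to the next workflow action (illustrative only)."""
    if result == "fail":
        return "return to latest develop-N session"
    if result == "partial pass":
        # Fix checks are stored as audit_report-<N>-fix_check-<M>.md.
        return "open scoped bugfix-N session"
    if result == "pass":
        if completed_bugfix_sessions < REQUIRED_BUGFIX_SESSIONS:
            # A clean pass before the required bugfix sessions does not count.
            return "discard audit and rerun"
        return "run test coverage and README audit"
    raise ValueError(f"unknown audit result: {result!r}")
```

The last branch leads into the `test_coverage_and_readme_audit_report.md` loop, which per the final bullet repeats until the report is clean.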
 ## Acceptance rule