theslopmachine 0.3.7 → 0.4.0
- package/MANUAL.md +13 -9
- package/README.md +163 -3
- package/RELEASE.md +11 -3
- package/assets/agents/developer-v2.md +86 -0
- package/assets/agents/developer.md +21 -23
- package/assets/agents/slopmachine-v2.md +219 -0
- package/assets/agents/slopmachine.md +56 -38
- package/assets/skills/beads-operations/SKILL.md +32 -31
- package/assets/skills/beads-operations-v2/SKILL.md +82 -0
- package/assets/skills/clarification-gate/SKILL.md +8 -1
- package/assets/skills/clarification-gate-v2/SKILL.md +74 -0
- package/assets/skills/developer-session-lifecycle/SKILL.md +45 -14
- package/assets/skills/developer-session-lifecycle-v2/SKILL.md +148 -0
- package/assets/skills/development-guidance-v2/SKILL.md +60 -0
- package/assets/skills/evaluation-triage-v2/SKILL.md +38 -0
- package/assets/skills/final-evaluation-orchestration/SKILL.md +9 -11
- package/assets/skills/final-evaluation-orchestration-v2/SKILL.md +57 -0
- package/assets/skills/get-overlays/SKILL.md +77 -6
- package/assets/skills/hardening-gate-v2/SKILL.md +64 -0
- package/assets/skills/integrated-verification-v2/SKILL.md +47 -0
- package/assets/skills/owner-evidence-discipline-v2/SKILL.md +15 -0
- package/assets/skills/planning-gate/SKILL.md +6 -4
- package/assets/skills/planning-gate-v2/SKILL.md +91 -0
- package/assets/skills/planning-guidance-v2/SKILL.md +100 -0
- package/assets/skills/remediation-guidance-v2/SKILL.md +31 -0
- package/assets/skills/report-output-discipline-v2/SKILL.md +15 -0
- package/assets/skills/scaffold-guidance-v2/SKILL.md +57 -0
- package/assets/skills/session-rollover-v2/SKILL.md +41 -0
- package/assets/skills/submission-packaging/SKILL.md +147 -115
- package/assets/skills/submission-packaging-v2/SKILL.md +142 -0
- package/assets/skills/verification-gates/SKILL.md +44 -16
- package/assets/skills/verification-gates-v2/SKILL.md +102 -0
- package/assets/slopmachine/backend-evaluation-prompt.md +9 -2
- package/assets/slopmachine/frontend-evaluation-prompt.md +9 -2
- package/assets/slopmachine/templates/AGENTS-v2.md +55 -0
- package/assets/slopmachine/templates/AGENTS.md +20 -17
- package/assets/slopmachine/tracker-init.js +104 -0
- package/assets/slopmachine/workflow-init-v2.js +99 -0
- package/package.json +1 -1
- package/src/constants.js +22 -3
- package/src/init.js +33 -28
- package/src/install.js +186 -140
- package/src/utils.js +19 -0
- package/assets/slopmachine/beads-init.js +0 -439
package/assets/skills/hardening-gate-v2/SKILL.md
@@ -0,0 +1,64 @@
+---
+name: hardening-gate-v2
+description: Release-readiness hardening rules for slopmachine-v2.
+---
+
+# Hardening Gate v2
+
+Use this skill only during `P6 Hardening`.
+
+## Hardening audit priorities
+
+The hardening phase should explicitly prepare the project to pass the final audit in these priority areas:
+
+1. prompt-fit
+2. security-critical flaws
+3. test sufficiency
+4. major engineering quality
+
+Hardening should treat these as the main review buckets before final evaluation begins.
+
+## Hardening scope
+
+- dependency hygiene
+- secret and config hygiene
+- prototype residue cleanup
+- docs honesty
+- observability and redaction hygiene
+- fragile-test and release-readiness cleanup
+
+## Hardening guidance
+
+- run a prompt-fit sweep for silent requirement substitution, partially delivered hard requirements, frontend/backend mismatch, and business-flow drift
+- audit security boundaries, validation, ownership, and secret handling
+- prioritize authentication, authorization, object ownership, tenant isolation, admin/debug exposure, and secret leakage risk over style issues
+- audit whether the current tests are sufficient to catch major issues in the core business flow, major failure paths, security-critical areas, and obvious high-risk boundaries
+- audit env/config paths so sensitive values are injected safely and are not baked into committed files or images
+- inspect architecture, coupling, file size, and maintainability risks
+- focus engineering review on the major maintainability and architecture concerns that materially affect delivery confidence
+- check for bad engineering practices that accumulated during implementation
+- tighten weak tests, weak docs, and weak operational instructions
+- run exploratory testing around awkward states, repeated actions, and realistic edge behavior
+- re-check frontend and backend observability, redaction, and operator visibility paths
+- run a prototype-residue sweep for hardcoded preview values, placeholder text, seeded defaults, hidden fallbacks, and computed-but-unrendered behavior
+- enforce env-file discipline during hardening
+- run documentation verification against the real codebase and runtime behavior, not just document existence
+- re-check prompt-critical operational obligations such as scheduled jobs, retention, backups, worker behavior, privacy/accountability logging, and admin controls
+- enter release-candidate mode: stop feature work and focus only on fixes, verification, docs, and packaging preparation
+- make sure the system is genuinely reviewable and reproducible
+
+## Required hardening output
+
+Before `P6` can close, the owner should have a clear answer for each of these:
+
+- prompt-fit: does the delivered project still match the business goal, core flows, and implicit constraints?
+- security-critical flaws: are there any unresolved auth, authorization, isolation, exposure, or secret-handling defects?
+- test sufficiency: are the current tests strong enough to rule out most major issues, and if not, what was added or strengthened?
+- major engineering quality: is the project structurally credible and maintainable, rather than piled-up or demo-grade?
+
+## Rules
+
+- do not start hardening until integrated verification is explicitly stable
+- hardening is not a disguised second integrated phase
+- if hardening exposes unresolved integrated instability, reopen the earlier phase cleanly
+- do not use hardening for broad feature work
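Reviewer note on `hardening-gate-v2`: the "Required hardening output" section amounts to a closing condition for `P6`, with one clear answer per audit bucket. A minimal sketch of that condition follows. The four bucket names are taken from the skill text; the `HardeningReport` class and its methods are hypothetical and not part of the package.

```python
from dataclasses import dataclass, field

# Bucket names are copied from the skill; the priority order matches its numbered list.
AUDIT_BUCKETS = (
    "prompt-fit",
    "security-critical flaws",
    "test sufficiency",
    "major engineering quality",
)


@dataclass
class HardeningReport:
    """Hypothetical container for the owner's per-bucket answers."""

    answers: dict = field(default_factory=dict)  # bucket name -> written answer

    def record(self, bucket: str, answer: str) -> None:
        # Reject anything that is not one of the four named buckets.
        if bucket not in AUDIT_BUCKETS:
            raise ValueError(f"unknown audit bucket: {bucket}")
        self.answers[bucket] = answer.strip()

    def missing(self) -> list:
        """Buckets that still lack a non-empty answer, in priority order."""
        return [b for b in AUDIT_BUCKETS if not self.answers.get(b)]

    def can_close_p6(self) -> bool:
        # P6 closes only when every bucket has an answer.
        return not self.missing()
```

An empty or whitespace-only answer counts as missing here, which is one possible reading of "a clear answer".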
package/assets/skills/integrated-verification-v2/SKILL.md
@@ -0,0 +1,47 @@
+---
+name: integrated-verification-v2
+description: Integrated verification convergence and rerun-ladder rules for slopmachine-v2.
+---
+
+# Integrated Verification v2
+
+Use this skill only during `P5 Integrated Verification`.
+
+## Core model
+
+Treat the first broad integrated run as a discovery pass.
+
+Once a failure class is known:
+
+- classify it
+- isolate the likely affected module, route family, spec family, helper family, or shared flow
+- direct narrow proof and narrow repairs first
+- escalate back to a broad rerun only when the narrower evidence is no longer sufficient
+
+## Verification and review guidance
+
+- run the relevant tests for the changed behavior
+- during in-phase verification, prefer the fastest meaningful local test commands for the known failure class
+- use local verification to prepare for the next owner-run broad gate rather than duplicating it casually
+- for applicable fullstack or UI-bearing work, run Playwright for the affected flows in-phase, capture screenshots, and verify the UI behavior and quality directly
+- verify requirement closure, not just feature existence
+- verify behavior against the current plan, the actual requirements, and any settled project decisions that affect the change
+- verify end-to-end flow behavior where the change affects real workflows
+- for fullstack work, run Playwright coverage for major flows and review screenshots for real UI behavior and regressions
+- end-to-end coverage must use the real intended user-facing or admin-facing surfaces for the flow; if the flow cannot be exercised that way, treat the missing surface as incomplete work
+- verify important failure, conflict, stale-state, negative-auth, and cross-user-isolation paths where relevant
+- verify security-sensitive behavior where applicable
+- verify multi-tenant and cross-user isolation where applicable, including negative checks rather than single-actor happy paths only
+- verify file/path safety for file-bearing flows where applicable, including traversal-style negative cases
+- verify secrets are not committed, hardcoded, or leaking through logs/config/docs
+- verify error surfaces and auth-related failures are sanitized for users and operators appropriately
+- trace the changed tests and verification back to the prompt-critical risks, not just the easiest happy paths
+- challenge integration seams and adjacent-module behavior, not just the changed module local path
+
+## Rules
+
+- keep integrated verification as a real full-system gate
+- do not rerun the whole heavy suite after every single failure by default
+- generalize shared failure classes early instead of rediscovering them slice by slice
+- do not allow hardening to start before integrated stability is explicit
+- if a broad rerun is not answering a new question, stop and go back to narrow proof
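Reviewer note on `integrated-verification-v2`: the "Core model" section describes a rerun ladder (broad discovery pass, then narrow proof and repairs per failure class, then a broad rerun only when narrow evidence runs out). The sketch below is one possible reading of that ladder, not the package's implementation. `FailureClass`, `plan_next_run`, and the returned strings are hypothetical. "Narrow evidence is no longer sufficient" is approximated, as an assumption, by "every narrow check now passes".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FailureClass:
    name: str                          # hypothetical label, e.g. "auth-guard"
    narrow_check: Callable[[], bool]   # fast targeted proof; True means it passes


def plan_next_run(failure_classes: List[FailureClass], broad_run_done: bool) -> str:
    """Choose the next P5 verification step under the assumptions stated above."""
    if not broad_run_done:
        # The first broad integrated run is treated as a discovery pass.
        return "broad: discovery pass"
    failing = [fc.name for fc in failure_classes if not fc.narrow_check()]
    if failing:
        # Known failure classes get narrow proof and narrow repairs first.
        return "narrow: repair " + ", ".join(failing)
    # Narrow checks can no longer tell us anything new, so escalate.
    return "broad: rerun after narrow proof passed"
```

The point of the shape is that a broad rerun is only reachable once every known failure class has passing narrow proof, which matches the rule against rerunning the heavy suite after every single failure.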
package/assets/skills/owner-evidence-discipline-v2/SKILL.md
@@ -0,0 +1,15 @@
+---
+name: owner-evidence-discipline-v2
+description: Owner evidence-ingestion rules for lower-token slopmachine-v2 review loops.
+---
+
+# Owner Evidence Discipline v2
+
+Use this skill when owner review risks rereading too much evidence.
+
+## Rules
+
+- read evidence once, summarize the decision-relevant facts, and reopen only on material change
+- avoid rereading the same large artifact sets late in the run unless something actually changed
+- keep evidence notes concrete so later decisions do not require large re-ingestion passes
+- prefer reading only the file sections needed to answer the current question
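Reviewer note on `owner-evidence-discipline-v2`: "read evidence once ... reopen only on material change" can be approximated by keeping a short summary per artifact together with a content digest. The sketch below assumes that "material change" means "the bytes changed", which is stricter than the skill requires. The `EvidenceNotes` class is hypothetical and only uses the standard-library `hashlib` module.

```python
import hashlib
from typing import Dict, Optional, Tuple


class EvidenceNotes:
    """Hypothetical store: one summary per artifact, re-read only when content changes."""

    def __init__(self) -> None:
        # artifact id -> (sha256 hex digest, decision-relevant summary)
        self._seen: Dict[str, Tuple[str, str]] = {}

    @staticmethod
    def _digest(content: bytes) -> str:
        return hashlib.sha256(content).hexdigest()

    def needs_read(self, artifact_id: str, content: bytes) -> bool:
        """True if the artifact was never summarized or its bytes changed."""
        entry = self._seen.get(artifact_id)
        return entry is None or entry[0] != self._digest(content)

    def record(self, artifact_id: str, content: bytes, summary: str) -> None:
        self._seen[artifact_id] = (self._digest(content), summary)

    def summary(self, artifact_id: str) -> Optional[str]:
        entry = self._seen.get(artifact_id)
        return entry[1] if entry else None
```

Later decisions would consult `summary()` first and fall back to the artifact itself only when `needs_read()` returns true.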
package/assets/skills/planning-gate/SKILL.md
@@ -9,22 +9,24 @@ Use this skill during `P2 Development Bootstrap and Planning` when reviewing, ti
 
 ## Usage rules
 
-- Load this skill before accepting planning, before declaring the plan sufficient, and before creating deep execution sub-
+- Load this skill before accepting planning, before declaring the plan sufficient, and before creating deep execution sub-items from the plan.
 - Treat it as owner-side planning gate guidance, not developer-visible text.
 - Use `get-overlays` as the source of truth for developer-facing planning guidance.
 - Use this skill as the source of truth for owner-side planning acceptance and decomposition readiness.
+- do not pause for planning approval or any other human check-in while using this skill; planning must continue until it is accepted internally or sent back to the developer for fixes
+- maintain the owner-managed external planning docs in parent-root `../docs/` from the accepted plan rather than treating them as developer-owned files
 
 ## Core planning gate
 
 - the developer should produce the first in-depth technical plan
-- do not create deep execution sub-
+- do not create deep execution sub-items before the technical plan is accepted
 - do not accept planning that reduces, weakens, narrows, or silently reinterprets the original prompt
 - declare prompt-critical planning acceptance criteria before accepting the first planning pass when those criteria are already visible from the prompt
 - require relevant cross-cutting system contracts to be explicitly planned rather than left to per-module invention
 
 ## Cross-document discipline
 
-- require
+- require owner-maintained planning docs under parent-root `../docs/` when relevant, especially `../docs/design.md`, `../docs/api-spec.md`, and `../docs/test-coverage.md`
 - require cross-document consistency so design, API/spec, and test-planning artifacts do not drift on lifecycle/state models, permissions, flow coverage, or operational behavior
 - if planning docs disagree on core system behavior, planning is still in progress
 
@@ -65,4 +67,4 @@ Before accepting planning, apply this checklist when relevant:
 
 - the first real technical plan is accepted against this gate
 - planning artifacts are internally consistent enough to guide implementation
-- deep execution sub-
+- deep execution sub-items can be created from the accepted plan without guesswork
package/assets/skills/planning-gate-v2/SKILL.md
@@ -0,0 +1,91 @@
+---
+name: planning-gate-v2
+description: Owner-side planning acceptance and correction rules for slopmachine-v2.
+---
+
+# Planning Gate v2
+
+Use this skill during `P2 Planning` when reviewing or accepting the plan.
+
+## Planning gate priorities
+
+Before planning is accepted, the owner should explicitly review the plan against these later audit buckets:
+
+1. prompt-fit
+2. security-critical flaws
+3. test sufficiency
+4. major engineering quality
+
+Planning should not pass if these have been ignored in a way that is likely to create expensive late failures.
+
+## Core rule
+
+If the owner notices a concrete role, contract, or scope mismatch, planning does not pass until one of these is true:
+
+- the mismatch is corrected
+- an explicit disposition note explains why acceptance is still valid
+
+## Usage rules
+
+- keep the existing high bar for implementation-grade planning
+- treat this as owner-side planning gate guidance, not developer-visible text
+- do not create deep execution decomposition before the plan is accepted
+- keep planning as a cheap correction point rather than pushing known ambiguity into execution
+
+## Core planning gate
+
+- the developer should produce the first in-depth technical plan
+- do not create deep execution sub-items before the technical plan is accepted
+- do not accept planning that reduces, weakens, narrows, or silently reinterprets the original prompt
+- declare prompt-critical planning acceptance criteria before accepting the first planning pass when those criteria are already visible from the prompt
+- require relevant cross-cutting system contracts to be explicitly planned rather than left to per-module invention
+
+## Cross-document discipline
+
+- require owner-maintained planning docs under parent-root `../docs/` when relevant, especially `../docs/design.md`, `../docs/api-spec.md`, and `../docs/test-coverage.md`
+- require cross-document consistency so design, API/spec, and test-planning artifacts do not drift on lifecycle/state models, permissions, flow coverage, or operational behavior
+- if planning docs disagree on core system behavior, planning is still in progress
+
+## Cross-cutting planning requirements
+
+- require shared lifecycle and state models to be aligned across planning artifacts when the product has meaningful workflow state
+- require explicit cross-cutting system contracts when relevant, especially:
+  - error normalization and user-visible error behavior
+  - audit/logging and redaction patterns
+  - permission alignment across UI, route guards, and API enforcement
+  - state-transition and context-switch behavior
+  - auth/session edge cases such as expiry, refresh, or clock skew tolerance
+- when the prompt says behavior is configurable, require the real configuration surface, permissions, operator flow, and backend support to be planned explicitly
+- when a feature must be admin-manageable or operator-manageable, require a real usable UI surface for that management flow, not just API endpoints or data-model notes
+
+## Architecture-depth requirements
+
+- for complex security, offline, sync, authorization, storage, or data-governance features, define what `done` means across all prompt-promised dimensions rather than accepting a partial foundation or hook layer
+- define infrastructure requirements early when they are material to correctness, such as rate limiting, encryption boundaries, production-equivalent test infrastructure, and browser-storage rules for sensitive data
+- define frontend validation and accessibility expectations when the product surface materially depends on them
+- if the prompt names literal storage, indexing, partitioning, retention, or performance dimensions, represent them literally in the planning artifacts rather than abstracting them away
+
+## Acceptance checklist
+
+- scope is still prompt-faithful
+- the plan has explicitly addressed prompt-fit risks and requirement drift
+- major user-facing flows are mapped to backend support and verification targets
+- security-critical areas are planned early enough that they will not be left to accidental late cleanup
+- test sufficiency has been considered at the level of core happy path, major failure paths, security-critical paths, and obvious high-risk boundaries
+- major engineering quality has been addressed through maintainable boundaries, clear decomposition, and shared contracts
+- frontend route, page, component, and state boundaries are planned when the UI is material
+- configurable behaviors are concretely planned where the prompt requires configurability
+- lifecycle and state models are aligned across design and API/spec artifacts
+- prompt-critical operational obligations and operator visibility paths are concretely planned
+- prompt-literal storage, partitioning, indexing, retention, or performance requirements are explicitly represented
+- relevant cross-cutting system contracts are explicitly defined rather than left to per-module invention
+- each major module has a clear integration contract with existing modules and shared patterns
+- verification plans include cross-module seam checks, not just isolated feature tests
+- visible mismatches are corrected or explicitly dispositioned
+- planning comments and artifacts reflect current policy truth
+
+## Exit conditions
+
+- the first real technical plan is accepted against this gate
+- planning artifacts are internally consistent enough to guide implementation
+- deep execution sub-items can be created from the accepted plan without guesswork
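Reviewer note on `planning-gate-v2`: the "Core rule" is a simple predicate over noticed mismatches. Each one must be either corrected or covered by an explicit disposition note before planning passes. A minimal sketch follows, with the hypothetical names `Mismatch` and `planning_can_pass`. It covers only the core rule, not the wider acceptance checklist.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mismatch:
    """A concrete role, contract, or scope mismatch noticed by the owner."""

    description: str
    corrected: bool = False
    disposition_note: Optional[str] = None  # why acceptance is still valid

    def resolved(self) -> bool:
        # Either the mismatch was fixed, or a non-empty note dispositions it.
        has_note = bool(self.disposition_note and self.disposition_note.strip())
        return self.corrected or has_note


def planning_can_pass(mismatches: List[Mismatch]) -> bool:
    """Core rule only: every noticed mismatch is corrected or dispositioned."""
    return all(m.resolved() for m in mismatches)
```

A blank disposition note is treated as no note at all, on the assumption that "explicit" implies actual written reasoning.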
package/assets/skills/planning-guidance-v2/SKILL.md
@@ -0,0 +1,100 @@
+---
+name: planning-guidance-v2
+description: Developer-facing planning guidance for slopmachine-v2.
+---
+
+# Planning Guidance v2
+
+Use this skill during `P2 Planning` to compose the developer-facing planning prompt.
+
+## Planning audit priorities
+
+Planning should explicitly shape the project against these later audit buckets before implementation begins:
+
+1. prompt-fit
+2. security-critical flaws
+3. test sufficiency
+4. major engineering quality
+
+The goal is to reduce late audit failures by designing for these concerns up front instead of discovering them only in hardening or evaluation.
+
+## Usage rules
+
+- use this skill for planning guidance only
+- keep the developer message natural and focused; do not dump the whole skill verbatim
+- combine this guidance with the approved clarification and current policy truth
+
+## Cross-cutting documentation discipline
+
+- the owner maintains external docs under parent-root `../docs/`
+- the developer should keep `README.md` and any codebase-local docs accurate for repo-local use
+- planning guidance should stay explicit enough that owner-maintained external docs can later be updated accurately
+- `README.md` must stay codebase-specific and must not become an index or explanation of the external docs set
+
+## Cross-cutting env-file discipline
+
+- never create or keep `.env` files anywhere in the repo tree
+- do not allow committed `.env` files even as placeholders or examples
+- keep real secrets out of the repository and rely on Docker-provided runtime variables for sensitive values
+- if the stack requires env-file format at runtime, generate it ephemerally from Docker-provided runtime variables rather than storing it in the repo or package
+- verify the delivered project can start from scratch without any preexisting `.env` file in the repo or package
+
+## Planning guidance
+
+- require implementation-grade planning, not brainstorming
+- start from the actual project prompt and build the plan from there
+- carry the settled project requirements forward consistently as you plan
+- identify the hard non-negotiable requirements early and do not quietly trade them away for implementation convenience
+- explicitly check that the plan still fits the business goal, main flows, and implicit constraints from the prompt
+- when planning technical items that depend on a library, framework, API, or tool, check Context7 documentation first for authoritative usage details
+- when planning needs targeted outside research beyond direct documentation, use Exa web search next
+- use technical research to strengthen concrete planning decisions, interfaces, constraints, and verification strategy rather than leaving them vague
+- break the problem into explicit requirements, constraints, flows, boundaries, and edge cases
+- map requirements to real modules, surfaces, contracts, and verification targets
+- make the planning explicit enough that the owner can maintain external design notes and API/spec docs accurately when relevant
+- keep the spec focused on required behavior rather than turning it into a progress or completion narrative
+- define major modules as meaningful delivery units, not arbitrary folders
+- make frontend/backend crosswalks explicit when the project is fullstack
+- for fullstack work, map frontend surfaces, routes, components, and state boundaries to the backend modules and contracts that support them
+- for fullstack work, make the frontend-to-backend crosswalk explicit enough that each major route, page, component group, or state boundary has a defined supporting backend module, endpoint, and data shape
+- define cross-cutting contracts early instead of leaving them to per-slice invention
+- plan security-critical areas early instead of treating them as cleanup work, especially authentication, authorization, object ownership, tenant isolation, admin/debug exposure, secret handling, file/path safety, and sensitive-data exposure
+- when the prompt says behavior is configurable, plan the real configuration surface, data model, permissions, and operator flow rather than treating configurability as an implementation detail to invent later
+- when a feature must be admin-manageable or operator-manageable, plan the real usable UI surface for that management flow, not just the backing API or data model
+- define failure paths, permissions, validation, logging, runtime assumptions, and test strategy before coding
+- for complex security, offline, sync, authorization, or data-governance features, define what `done` means across all prompt-promised dimensions rather than stopping at a partial foundation or hook layer
+- define shared lifecycle and state models when the product has meaningful workflow state, and keep those models aligned across design notes and API/spec notes
+- require cross-document consistency so design, API/spec, and test-planning artifacts do not drift on lifecycle/state models, flow coverage, permissions, or operational behavior
+- define logging and observability expectations for both frontend and backend
+- define operator visibility and operator workflow expectations when the prompt implies admin, operational, audit, backup, or support responsibilities
+- when the system has meaningful cross-cutting behavior, define shared implementation contracts early rather than leaving each module to invent its own pattern
+- define error-handling contracts when relevant, including normalization patterns for user-visible errors and backend error-shape expectations
+- define audit contracts when relevant, including centralized helper or service expectations and redaction rules
+- define permission contracts when relevant so navigation visibility, route guards, and API enforcement stay aligned
+- define state-lifecycle contracts when relevant, including context-switch or tenant-switch cleanup expectations
+- define auth edge-case expectations when relevant, such as token refresh, session expiry, or clock-skew tolerance
+- call out operational obligations early when they are prompt-critical, such as scheduling, retention, backups, workers, auditability, or offline behavior
+- define infrastructure requirements early when they are material to correctness, such as rate limiting, encryption boundaries, production-equivalent test infrastructure, and browser-storage rules for sensitive data
+- define frontend validation and accessibility expectations when the product surface materially depends on them, including keyboard, focus, feedback, and other user-interaction quality requirements where relevant
+- if backup or recovery behavior is prompt-critical, plan the designated media, operator drill flow, visibility, and verification expectations explicitly
+- if the prompt names literal storage, indexing, partitioning, retention, or performance dimensions, represent them literally in the planning artifacts rather than abstracting them away
+- for frontend work, unless the prompt, existing repository, or established stack clearly dictates otherwise, default to Tailwind CSS for styling and `shadcn/ui` for component primitives
+- if the existing project already uses a different UI system, preserve and extend that system instead of forcing Tailwind CSS or `shadcn/ui` into it
+- define end-to-end coverage for major user flows before coding
+- define enough test coverage up front to catch major issues later, especially core happy path, important failure paths, security-critical paths, and obvious high-risk boundaries
+- for fullstack work, explicitly plan Playwright coverage for the synchronized frontend/backend flows when end-to-end testing is applicable
+- when UI-bearing flows are material, explicitly plan screenshot review as part of Playwright verification so UI correctness is checked, not just browser success
+- aim for at least 90 percent meaningful coverage of the relevant behavior surface
+- define verification strategy, Docker expectations, and documentation implications before coding
+- make major engineering quality a planning concern by defining maintainable boundaries, separation of concerns, shared patterns, extension points, and anti-chaos constraints before coding begins
+- for each major module, define how it integrates with existing modules and which shared contracts it must follow consistently
+- define verification plans that include cross-module scenarios and seam checks, not just isolated feature checks
+- surface real unresolved risks honestly
+- keep the plan aligned with current policy: owner-managed external docs, no `.env` files, junior-friendly repo-local README, and the v2 verification cadence
+
+## Exit target
+
+- make the plan detailed enough to guide real implementation and later verification
+- review the module map and make sure it is stable before deeper implementation begins
+- do not move into deeper implementation with vague architecture or unstable module boundaries
+- make sure the plan has explicitly considered prompt-fit, security-critical flaws, test sufficiency, and major engineering quality before implementation starts
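Reviewer note on `planning-guidance-v2`: the env-file discipline section ("never create or keep `.env` files anywhere in the repo tree", including placeholders or examples) lends itself to a mechanical check. The sketch below is a hypothetical helper, not something shipped in the package. It assumes that files named `.env` or `.env.<suffix>` (for example `.env.example`) are the ones in scope and that the `.git` directory should be skipped.

```python
from pathlib import Path
from typing import List


def find_env_files(repo_root: str) -> List[str]:
    """Return sorted repo-relative paths of `.env` and `.env.*` files."""
    root = Path(repo_root)
    hits = []
    for path in root.rglob(".env*"):
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue  # ignore anything inside version-control metadata
        # Keep `.env` and `.env.<suffix>`; skip lookalikes such as `.envrc`.
        is_env_name = path.name == ".env" or path.name.startswith(".env.")
        if path.is_file() and is_env_name:
            hits.append(rel.as_posix())
    return sorted(hits)
```

A non-empty result would indicate a violation of the policy as read here. The check says nothing about whether secrets are baked into other files or images, which the skill treats as a separate audit.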
package/assets/skills/remediation-guidance-v2/SKILL.md
@@ -0,0 +1,31 @@
+---
+name: remediation-guidance-v2
+description: Focused remediation rules for slopmachine-v2.
+---
+
+# Remediation Guidance v2
+
+Use this skill only during `P9 Remediation`.
+
+## Remediation model
+
+- fix only accepted evaluation findings and the directly related proof gaps
+- keep the work bounded and concrete
+- require owner-reproducible proof before phase closure
+
+## Remediation guidance
+
+- focus only on accepted defects and the work needed to fix them cleanly
+- fix the issue completely instead of layering hacks on top
+- trace the fix back to the original requirement so the remediation restores fidelity instead of only hiding the symptom
+- rerun the relevant verification after each fix
+- if the issue exposed drift, docs overclaim, or missing acceptance coverage, repair that too before closing the issue
+- update docs if behavior or instructions changed
+- report exactly what was fixed, what was rerun, and what still looks risky if anything remains
+
+## Rules
+
+- do not reopen broad feature development inside remediation
+- do not patch over symptoms with hacks
+- if the fix changes behavior or instructions, update repo-local docs too
+- keep remediation proof strong on the first pass whenever possible
@@ -0,0 +1,15 @@
+---
+name: report-output-discipline-v2
+description: File-backed reporting discipline for slopmachine-v2.
+---
+
+# Report Output Discipline v2
+
+Use this skill whenever an owner-side review or evaluation task would otherwise dump a large report into chat.
+
+## Rules
+
+- write the full report to a file whenever the output would be long or reused later
+- return only a short decision-oriented summary in chat
+- keep chat focused on findings, decisions, and next actions
+- use this for evaluation reports, large audits, artifact inspections, and packaging summaries when appropriate
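The file-backed reporting rules in `report-output-discipline-v2` could be sketched roughly as follows. This is only an illustration, not code from the package: the function name `emit_report`, its parameters, and the summary wording are all assumptions made for the example.

```python
from pathlib import Path
from typing import List


def emit_report(full_report: str, findings: List[str], report_path: Path,
                max_findings: int = 3) -> str:
    """Write the full report to a file and return a short chat summary.

    Hypothetical helper: the skill only states the rule, not an API.
    """
    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text(full_report, encoding="utf-8")

    # Only a pointer to the file and the top findings go back to chat.
    shown = findings[:max_findings]
    lines = [f"Full report: {report_path}"]
    lines += [f"- {item}" for item in shown]
    hidden = len(findings) - len(shown)
    if hidden > 0:
        lines.append(f"(+{hidden} more in the report file)")
    return "\n".join(lines)
```

The returned string is what would be posted in chat, while the long body stays on disk where it can be reused later.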
@@ -0,0 +1,57 @@
+---
+name: scaffold-guidance-v2
+description: Developer-facing scaffold guidance for slopmachine-v2.
+---
+
+# Scaffold Guidance v2
+
+Use this skill during `P3 Scaffold` before prompting the developer.
+
+## Scaffold standard
+
+- create real foundations, not decorative boilerplate
+- establish the real runtime contract
+- establish the local verification path and the standardized gate path
+- make prompt-critical baseline behavior real where required
+- keep repo-local `README.md` honest from the start
+
+## Scaffold and foundation guidance
+
+- create the initial project structure intentionally
+- create `run_tests.sh` as the standard broad test entrypoint when the stack needs it
+- create required testing directories and baseline docs structure
+- put baseline config and logging structure in place
+- put migrations, worker/job foundation, and real runtime health surfaces in place when the project needs them
+- treat prompt-critical security controls as real baseline runtime behavior, not placeholder checks or visual wiring
+- if a requirement implies enforcement, persistence, statefulness, or rejection behavior, make that behavior real in the scaffold unless the prompt clearly scopes it down
+- do not accept shape-only security implementations such as header presence checks, passive constants, or partially wired middleware when the requirement implies real protection
+- when applicable at scaffold time, require real security baselines such as nonce reuse rejection rather than nonce-header presence, real lockout behavior rather than config-only lockout values, CSRF rejection on protected mutations, and meaningful server-side state when the protection model depends on it
+- remove prototype residue from runtime foundations: no placeholder titles, hidden setup, fake defaults, or seeded live-path assumptions
+- make prompt-critical runtime behavior visible in the scaffold instead of hand-waving it for later, especially offline, worker, backup, or HTTPS requirements
+- keep Docker runtime isolation clean in shared environments: use self-contained Compose namespacing, avoid fragile generic project names, and prefer Compose-managed service naming over unnecessary hardcoded `container_name` values
+- require reproducible build and tooling foundations: prefer lockfile-driven installs where the stack supports them, keep source and build outputs clearly separated, and do not allow generated runtime artifacts to drift back into source directories
+- for typed build pipelines, keep source-of-truth boundaries clean so compiled output does not create TS/JS or similar dual-source drift in the working tree
+- establish README structure early instead of leaving it until the end
+- prove the scaffold in a clean state before deeper feature work
+- verify clean startup and teardown behavior under the chosen project namespace when Dockerized execution is in scope
+- when the architecture materially depends on infrastructure capabilities such as rate limiting, encryption, offline support, or browser-storage policy, put the baseline framework and policy in place during scaffold rather than deferring it to late implementation
+- for backend integration paths, prefer production-equivalent test infrastructure when practical rather than silently substituting a weaker database or runtime model that can hide real defects
+- do not treat scaffold as placeholder boilerplate or rely on hidden setup
+
+## Current policy
+
+- no `.env` files or env-file variants in the repo
+- do not edit `AGENTS.md` or other workflow/rulebook files unless explicitly asked
+- keep generated artifacts out of source-of-truth paths
+- keep real secrets out of the repository and rely on Docker-provided runtime variables for sensitive values
+- if the stack requires env-file format at runtime, generate it ephemerally from Docker-provided runtime variables rather than storing it in the repo or package
+
+## Acceptance target
+
+Scaffold should make later slices easier, not force them to retrofit missing fundamentals.
+
+## Verification cadence
+
+- use local and narrow checks while correcting scaffold work
+- reserve one broad owner-run scaffold gate for actual scaffold acceptance
+- do not spend extra broad reruns once the acceptance question is already answered
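The scaffold guidance above contrasts real security baselines with shape-only ones, for example nonce reuse rejection versus a nonce-header presence check. A minimal sketch of the "real" side might look like the following. The class name `NonceStore`, the in-memory dictionary, and the five-minute window are assumptions for illustration only; a real scaffold would likely back this with shared server-side state.

```python
import time
from typing import Dict, Optional


class NonceStore:
    """Remembers nonces for a time window so replays can be rejected.

    Hypothetical in-memory sketch; not taken from the package.
    """

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._seen: Dict[str, float] = {}  # nonce -> expiry timestamp

    def accept(self, nonce: Optional[str], now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Drop expired entries so the store does not grow without bound.
        self._seen = {n: exp for n, exp in self._seen.items() if exp > now}
        if not nonce:
            return False  # a missing or empty nonce is rejected
        if nonce in self._seen:
            return False  # a replay inside the window is rejected
        self._seen[nonce] = now + self.ttl_seconds
        return True
```

A presence-only check would accept the second request with the same nonce. Here the server-side state is what makes the rejection real, which is the distinction the guidance draws.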
@@ -0,0 +1,41 @@
+---
+name: session-rollover-v2
+description: Planned developer-session handoff and rollover rules for bounded slopmachine-v2 developer sessions.
+---
+
+# Session Rollover v2
+
+Use this skill only when intentionally moving from one planned developer session slot to the next.
+
+## Typical uses
+
+- build session -> stabilization session
+- stabilization session -> remediation session
+
+## Rules
+
+- rollover is planned, not a recovery event
+- do not open the next developer session until the current session has a clear handoff out
+- record the new session id and status immediately after the new session is created
+
+## Required handoff contents
+
+- current phase and why rollover is happening
+- what is complete
+- what is still active or risky
+- what verification status currently exists
+- what the next session should focus on first
+
+## Metadata updates
+
+When rollover succeeds, update metadata so it is obvious:
+
+- which prior session is now completed or inactive
+- which new session is active
+- where the handoff artifact lives
+- which phase group the new session now owns
+
+## Avoid
+
+- carrying old session context longer than needed once the work mode changed materially
+- reopening a new developer session without a handoff artifact
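The handoff and metadata rules in `session-rollover-v2` can be read as a small state update guarded by a completeness check. The sketch below is one possible shape under stated assumptions: the `Handoff` and `SessionMetadata` types, their field names, the status strings, and the `rollover` function are all hypothetical and do not reflect how slopmachine actually stores its metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Handoff:
    """Required handoff contents, plus where the artifact lives (hypothetical)."""
    phase: str
    reason: str
    complete: str
    active_or_risky: str
    verification_status: str
    next_focus: str
    artifact_path: str

    def missing_fields(self) -> List[str]:
        # Any blank field means the handoff is not yet clear enough.
        return [name for name, value in vars(self).items() if not value.strip()]


@dataclass
class SessionMetadata:
    active_session: Optional[str] = None
    sessions: Dict[str, str] = field(default_factory=dict)  # session id -> status
    handoff_artifact: Optional[str] = None
    phase_group: Optional[str] = None


def rollover(meta: SessionMetadata, handoff: Optional[Handoff],
             new_session_id: str, phase_group: str) -> SessionMetadata:
    """Open the next session only when a complete handoff exists."""
    if handoff is None:
        raise ValueError("no handoff artifact; refusing to open a new session")
    missing = handoff.missing_fields()
    if missing:
        raise ValueError(f"handoff incomplete: {', '.join(missing)}")
    if meta.active_session is not None:
        meta.sessions[meta.active_session] = "completed"
    # Record the new session id and status right away.
    meta.sessions[new_session_id] = "active"
    meta.active_session = new_session_id
    meta.handoff_artifact = handoff.artifact_path
    meta.phase_group = phase_group
    return meta
```

Raising on a missing or incomplete handoff mirrors the "Avoid" list: a new session is never opened without a handoff artifact.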