create-issflow 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +41 -0
- package/bin/cli.js +96 -0
- package/package.json +23 -0
- package/template/.claude/agents/debugger.md +47 -0
- package/template/.claude/agents/e2e-runner.md +56 -0
- package/template/.claude/agents/implementer.md +75 -0
- package/template/.claude/agents/planner.md +65 -0
- package/template/.claude/agents/researcher.md +103 -0
- package/template/.claude/agents/synthesizer.md +72 -0
- package/template/.claude/agents/test-author.md +70 -0
- package/template/.claude/commands/log-decision.md +33 -0
- package/template/.claude/commands/log-issue.md +28 -0
- package/template/.claude/commands/overview.md +98 -0
- package/template/.claude/commands/phase.md +191 -0
- package/template/.claude/commands/quick.md +30 -0
- package/template/.claude/commands/replan.md +63 -0
- package/template/.claude/commands/store-wisdom.md +194 -0
- package/template/.claude/commands/synthesize.md +26 -0
- package/template/.claude/commands/unstuck.md +40 -0
- package/template/.claude/hooks/pre-compact.sh +25 -0
- package/template/.claude/hooks/session-start.sh +120 -0
- package/template/.claude/hooks/subagent-stop.sh +11 -0
- package/template/.claude/istartsoft-flow/METHODOLOGY.md +214 -0
- package/template/.claude/skills/caveman/SKILL.md +39 -0
- package/template/.claude/skills/grill-me/SKILL.md +10 -0
- package/template/.claude/skills/karpathy-guidelines/SKILL.md +34 -0
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Forced re-research after a circuit breaker. Stops flailing, re-routes to deep research with full memory of dead ends.
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
Caveman ULTRA mode.
|
|
6
|
+
|
|
7
|
+
Trigger: I chose "re-research" at a circuit breaker (see /phase step 5).
|
|
8
|
+
|
|
9
|
+
Steps:
|
|
10
|
+
|
|
11
|
+
1. WRITE IT DOWN. Append to docs/ISSUES.md as OPEN:
|
|
12
|
+
```
|
|
13
|
+
|
|
14
|
+
### <error title>
|
|
15
|
+
|
|
16
|
+
- [ ] open - stuck after 3 attempts
|
|
17
|
+
- symptom: <…>
|
|
18
|
+
- attempts that FAILED: <hypothesis 1>, <2>, <3>
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
Reference the existing debug-<slug>.md.
|
|
22
|
+
|
|
23
|
+
2. RESET FRAME. The 3 failed hypotheses are probably all wrong. Discard them.
|
|
24
|
+
|
|
25
|
+
3. DEEP RESEARCH. Dispatch `researcher` in IMPL mode WIDE:
|
|
26
|
+
- Read existing debug-<slug>.md and ISSUES.md failed-attempts FIRST.
|
|
27
|
+
- Re-read the actual error from scratch.
|
|
28
|
+
- Check real external service contract / docs.
|
|
29
|
+
- Look one layer below: config? env? version? data shape?
|
|
30
|
+
- Return fresh HYPOTHESIS backed by NEW evidence.
|
|
31
|
+
|
|
32
|
+
4. RE-PLAN if needed. Research shows phase design was wrong -> dispatch planner.
|
|
33
|
+
|
|
34
|
+
5. RESUME. Hand fresh hypothesis to `debugger`. It reads the prior debug file
|
|
35
|
+
(already knows what's ruled out). Budget = 3, NEW hypotheses only.
|
|
36
|
+
|
|
37
|
+
6. This counts as the path chosen at the first breaker. If STUCK again ->
|
|
38
|
+
/phase step 5 SECOND STUCK. Do not loop further.
|
|
39
|
+
|
|
40
|
+
Report each step.
|
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
#!/usr/bin/env bash
|
|
2
|
+
# PreCompact hook. Fires before auto/manual /compact.
|
|
3
|
+
set -euo pipefail
|
|
4
|
+
cd "${CLAUDE_PROJECT_DIR:-.}"
|
|
5
|
+
|
|
6
|
+
mkdir -p docs/.snapshots
|
|
7
|
+
STAMP=$(date +%Y%m%d-%H%M%S)
|
|
8
|
+
SNAP="docs/.snapshots/precompact-${STAMP}.md"
|
|
9
|
+
|
|
10
|
+
{
|
|
11
|
+
echo "# Pre-compact snapshot ${STAMP}"
|
|
12
|
+
echo
|
|
13
|
+
echo "## Git"
|
|
14
|
+
git status --short 2>/dev/null || true
|
|
15
|
+
git diff --stat 2>/dev/null || true
|
|
16
|
+
echo
|
|
17
|
+
echo "## STATE.md at compact time"
|
|
18
|
+
[ -f docs/STATE.md ] && cat docs/STATE.md || echo "(no STATE.md)"
|
|
19
|
+
} > "$SNAP"
|
|
20
|
+
|
|
21
|
+
ls -1t docs/.snapshots/precompact-*.md 2>/dev/null | tail -n +6 | xargs -r rm -f
|
|
22
|
+
|
|
23
|
+
echo "Context was compacted. Recovery snapshot saved at ${SNAP}."
|
|
24
|
+
echo "STATE.md and ISSUES.md were re-injected by the SessionStart hook - trust those."
|
|
25
|
+
exit 0
|
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
#!/usr/bin/env bash
|
|
2
|
+
# SessionStart hook. stdout is injected into Claude's context every session.
|
|
3
|
+
set -euo pipefail
|
|
4
|
+
cd "${CLAUDE_PROJECT_DIR:-.}"
|
|
5
|
+
|
|
6
|
+
emit () { printf '%s\n' "$1"; }
|
|
7
|
+
|
|
8
|
+
emit "=== iStartSoftFlow AUTO-CONTEXT (injected by hook, NOT optional) ==="
|
|
9
|
+
emit ""
|
|
10
|
+
|
|
11
|
+
# 1. git state
|
|
12
|
+
emit "## Git"
|
|
13
|
+
emit "branch: $(git branch --show-current 2>/dev/null || echo n/a)"
|
|
14
|
+
emit "uncommitted: $(git status --short 2>/dev/null | wc -l | tr -d ' ') file(s)"
|
|
15
|
+
git log --oneline -3 2>/dev/null | sed 's/^/ /' || true
|
|
16
|
+
emit ""
|
|
17
|
+
|
|
18
|
+
# 2. active state
|
|
19
|
+
if [ -f docs/STATE.md ]; then
|
|
20
|
+
emit "## STATE.md (current position - READ THIS FIRST)"
|
|
21
|
+
cat docs/STATE.md
|
|
22
|
+
emit ""
|
|
23
|
+
else
|
|
24
|
+
emit "## STATE.md missing -> run /overview to bootstrap the project."
|
|
25
|
+
emit ""
|
|
26
|
+
fi
|
|
27
|
+
|
|
28
|
+
# 3. issue log
|
|
29
|
+
if [ -f docs/ISSUES.md ]; then
|
|
30
|
+
OPEN=$(grep -c '^- \[ \]' docs/ISSUES.md 2>/dev/null || true)
|
|
31
|
+
OPEN=${OPEN:-0}
|
|
32
|
+
emit "## ISSUES.md (${OPEN} open) - check this before debugging anything"
|
|
33
|
+
awk '/^### / {p=1} p' docs/ISSUES.md 2>/dev/null | head -100
|
|
34
|
+
emit ""
|
|
35
|
+
fi
|
|
36
|
+
|
|
37
|
+
# 3b. research index
|
|
38
|
+
if [ -f docs/research/INDEX.md ]; then
|
|
39
|
+
RCOUNT=$(grep -c '^[0-9]' docs/research/INDEX.md 2>/dev/null || true)
|
|
40
|
+
RCOUNT=${RCOUNT:-0}
|
|
41
|
+
emit "## research/INDEX.md (${RCOUNT} prior investigations)"
|
|
42
|
+
emit "grep this before any new research or debugging."
|
|
43
|
+
grep '^[0-9]' docs/research/INDEX.md 2>/dev/null | tail -15 | sed 's/^/ /' || true
|
|
44
|
+
emit ""
|
|
45
|
+
fi
|
|
46
|
+
|
|
47
|
+
|
|
48
|
+
# 3d. shared KB — pull latest + load snapshot
|
|
49
|
+
KB_CONFIG=".claude/kb-config.json"
|
|
50
|
+
if [ -f "$KB_CONFIG" ]; then
|
|
51
|
+
KB_PATH=$(python3 -c "import json; print(json.load(open('$KB_CONFIG')).get('kb_path',''))" 2>/dev/null || echo "")
|
|
52
|
+
KB_PATH_EXPANDED="${KB_PATH/#\~/$HOME}"
|
|
53
|
+
|
|
54
|
+
if [ -n "$KB_PATH_EXPANDED" ] && [ -d "$KB_PATH_EXPANDED" ]; then
|
|
55
|
+
emit "## Shared KB"
|
|
56
|
+
|
|
57
|
+
# Pull latest (fail silently — offline or no remote shouldn't break session start)
|
|
58
|
+
if git -C "$KB_PATH_EXPANDED" pull --ff-only --quiet 2>/dev/null; then
|
|
59
|
+
emit "KB pulled: OK"
|
|
60
|
+
else
|
|
61
|
+
emit "KB pull skipped (offline or conflict — using local copy)"
|
|
62
|
+
fi
|
|
63
|
+
|
|
64
|
+
# Load INDEX.md into session snapshot
|
|
65
|
+
KB_INDEX="${KB_PATH_EXPANDED}/INDEX.md"
|
|
66
|
+
SNAPSHOT="docs/.kb-snapshot.md"
|
|
67
|
+
|
|
68
|
+
if [ -f "$KB_INDEX" ]; then
|
|
69
|
+
TODAY=$(date +%Y-%m-%d)
|
|
70
|
+
CUTOFF=$(date -d "6 months ago" +%Y-%m-%d 2>/dev/null \
|
|
71
|
+
|| python3 -c "from datetime import date, timedelta; print((date.today() - timedelta(days=180)).isoformat())" 2>/dev/null \
|
|
72
|
+
|| echo "2000-01-01")
|
|
73
|
+
|
|
74
|
+
{
|
|
75
|
+
echo "# KB snapshot — loaded $(date +%Y-%m-%d)"
|
|
76
|
+
echo "# Stale = created date older than ${CUTOFF}"
|
|
77
|
+
echo ""
|
|
78
|
+
while IFS='|' read -r entry_date domain slug summary rest; do
|
|
79
|
+
entry_date=$(echo "$entry_date" | tr -d ' ')
|
|
80
|
+
# Mark stale if date field present and older than cutoff
|
|
81
|
+
if [ -n "$entry_date" ] && [[ "$entry_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
|
|
82
|
+
if [[ "$entry_date" < "$CUTOFF" ]]; then
|
|
83
|
+
echo "[STALE] ${entry_date} |${domain} |${slug} |${summary} |${rest}"
|
|
84
|
+
else
|
|
85
|
+
echo "${entry_date} |${domain} |${slug} |${summary} |${rest}"
|
|
86
|
+
fi
|
|
87
|
+
else
|
|
88
|
+
echo "${entry_date} |${domain} |${slug} |${summary} |${rest}"
|
|
89
|
+
fi
|
|
90
|
+
done < <(grep -v '^#' "$KB_INDEX" | grep -v '^$')
|
|
91
|
+
} > "$SNAPSHOT"
|
|
92
|
+
|
|
93
|
+
TOTAL=$(grep -c '|' "$SNAPSHOT" 2>/dev/null || echo 0)
|
|
94
|
+
STALE=$(grep -c '^\[STALE\]' "$SNAPSHOT" 2>/dev/null || echo 0)
|
|
95
|
+
emit "KB snapshot loaded: ${TOTAL} entries (${STALE} stale — researcher will re-research these)"
|
|
96
|
+
emit "Snapshot at docs/.kb-snapshot.md — researcher reads this before web search."
|
|
97
|
+
else
|
|
98
|
+
emit "KB INDEX.md not found at ${KB_PATH_EXPANDED} — run /store-wisdom to populate it."
|
|
99
|
+
fi
|
|
100
|
+
emit ""
|
|
101
|
+
else
|
|
102
|
+
emit "## Shared KB: configured but path not found (${KB_PATH})"
|
|
103
|
+
emit "Re-run setup.sh to fix the KB path."
|
|
104
|
+
emit ""
|
|
105
|
+
fi
|
|
106
|
+
fi
|
|
107
|
+
|
|
108
|
+
# 4. hard rule reminder
|
|
109
|
+
emit "## RULES (enforced this session)"
|
|
110
|
+
emit "- caveman ULTRA mode is active."
|
|
111
|
+
emit "- before debugging ANY error: grep ISSUES.md AND research/INDEX.md first."
|
|
112
|
+
emit "- debug attempts: WARN at 2; first hard-stop at 3 STOPS and asks you."
|
|
113
|
+
emit "- end of every phase: run /synthesize, then /clear."
|
|
114
|
+
emit "- small obvious change? use /quick, not /phase."
|
|
115
|
+
if [ -f "$KB_CONFIG" ]; then
|
|
116
|
+
emit "- KB active: researcher checks docs/.kb-snapshot.md before web search."
|
|
117
|
+
emit "- learned something worth keeping? run /store-wisdom."
|
|
118
|
+
fi
|
|
119
|
+
emit "=== END AUTO-CONTEXT ==="
|
|
120
|
+
exit 0
|
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
#!/usr/bin/env bash
|
|
2
|
+
# SubagentStop hook.
|
|
3
|
+
set -euo pipefail
|
|
4
|
+
cd "${CLAUDE_PROJECT_DIR:-.}"
|
|
5
|
+
|
|
6
|
+
mkdir -p docs/.snapshots
|
|
7
|
+
LOG="docs/.snapshots/agent-trace.log"
|
|
8
|
+
echo "$(date +%H:%M:%S) subagent finished" >> "$LOG"
|
|
9
|
+
|
|
10
|
+
tail -n 50 "$LOG" > "$LOG.tmp" 2>/dev/null && mv "$LOG.tmp" "$LOG" || true
|
|
11
|
+
exit 0
|
|
@@ -0,0 +1,214 @@
|
|
|
1
|
+
# iStartSoftFlow — portable agent methodology (single source of truth)
|
|
2
|
+
|
|
3
|
+
> **Adapted for Saku POS (from the open "anpunkit" workflow by MetheeS).**
|
|
4
|
+
> Namespaced under `.claude/istartsoft-flow/` to coexist with the repo's own
|
|
5
|
+
> `CLAUDE.md` / `AGENTS.md` (Saku project docs) — it does NOT replace them.
|
|
6
|
+
> Cursor support and the Azure infra phase were removed: this repo is Claude-only
|
|
7
|
+
> and infra is **managed (Vercel + Supabase)**, so **Phase 0 (infra) is N/A** —
|
|
8
|
+
> phases begin at the first vertical slice. Planning source of truth stays in
|
|
9
|
+
> **iSSM/BMAD** (PRD/architecture/stories); iStartSoftFlow is the execution loop on top.
|
|
10
|
+
|
|
11
|
+
<!-- ISTARTSOFTFLOW-AGENTS-SENTINEL-v2.0 -->
|
|
12
|
+
> **SENTINEL.** The HTML comment above (`ISTARTSOFTFLOW-AGENTS-SENTINEL-v2.0`) is a
|
|
13
|
+
> load-bearing marker. `setup.sh` greps for it to confirm this file resolved on
|
|
14
|
+
> disk and was not clobbered. Do not remove or rename it.
|
|
15
|
+
|
|
16
|
+
> **What this file is.** The complete, tool-agnostic methodology for the iStartSoftFlow
|
|
17
|
+
> workflow: the loop, the roles, the procedures, the rituals, and the hard rules.
|
|
18
|
+
> This is the ONE place every rule lives. Claude Code, Cursor, and any tool that
|
|
19
|
+
> reads the open `AGENTS.md` standard get the full methodology from here.
|
|
20
|
+
|
|
21
|
+
> **Anti-drift invariant (load-bearing).** Every rule lives in exactly ONE place:
|
|
22
|
+
> this file. `CLAUDE.md` restates NO rule — it only maps roles to Claude-native
|
|
23
|
+
> files and says "native mechanism X performs ritual Y automatically." A rule and
|
|
24
|
+
> its automation may never contradict. Duplication between files is an
|
|
25
|
+
> architectural defect, not a convenience.
|
|
26
|
+
|
|
27
|
+
Caveman ULTRA mode always on. Apply the `karpathy-guidelines` skill (engineering
|
|
28
|
+
discipline) on every coding and debugging task.
|
|
29
|
+
|
|
30
|
+
-----
|
|
31
|
+
|
|
32
|
+
## The loop
|
|
33
|
+
|
|
34
|
+
`design-research -> grill (×2) -> plan -> implement -> test -> deploy`, one
|
|
35
|
+
VERTICAL SLICE per phase. Implement AND test a phase before the next. The last
|
|
36
|
+
phase always includes deployment.
|
|
37
|
+
|
|
38
|
+
A phase runs in one of two orders, chosen at RESEARCH time by the TDD
|
|
39
|
+
APPLICABILITY check (see Procedures → `phase`):
|
|
40
|
+
|
|
41
|
+
- **TDD phase** (`TDD_PHASE=true`):
|
|
42
|
+
`RESEARCH -> SCAFFOLD -> RED -> GREEN -> TEST(e2e) -> FIX -> CLOSE`
|
|
43
|
+
- **Non-TDD phase** (pure infra / config / doc — `TDD_PHASE=false`):
|
|
44
|
+
`RESEARCH -> IMPLEMENT -> TEST -> FIX -> CLOSE`
|
|
45
|
+
|
|
46
|
+
`TDD_PHASE` = "the phase adds or changes a public callable surface (an endpoint,
|
|
47
|
+
exported function/class, CLI command, or message contract) that is assertable
|
|
48
|
+
from the acceptance spec." Size is NOT the criterion. On ambiguity, default
|
|
49
|
+
`TDD_PHASE=true` AND state the classification + reason so a human can override to
|
|
50
|
+
non-TDD before SCAFFOLD fires.
|
|
51
|
+
|
|
52
|
+
-----
|
|
53
|
+
|
|
54
|
+
## Roles (fresh-context workers)
|
|
55
|
+
|
|
56
|
+
Each role is a fresh-context worker. Where the host tool supports named subagents
|
|
57
|
+
(Claude Code natively; Cursor reads the same `.claude/agents/` files via its
|
|
58
|
+
Claude-compatibility path), each maps to a definition file. Where it does not, the role
|
|
59
|
+
degrades to a bounded sub-task that dumps its noise to a file and returns only a
|
|
60
|
+
terse summary + path. Workers cannot address the user — only the orchestrator
|
|
61
|
+
can. Escalation is at most two hops.
|
|
62
|
+
|
|
63
|
+
- **researcher** — two modes. DESIGN: domain/constraint research before planning
|
|
64
|
+
(service limits, API contracts, architectural constraints, cost surprises).
|
|
65
|
+
IMPL: per-phase codebase + service investigation. Checks the shared KB snapshot
|
|
66
|
+
first (step 0). Writes findings to `docs/research/`; returns terse summary + path.
|
|
67
|
+
- **planner** — research → vertical-slice `docs/PLAN.md`. Phase 0 (infra) always
|
|
68
|
+
first; the last phase always contains the deploy task.
|
|
69
|
+
- **implementer** — builds ONE phase. Two MODES for TDD phases (SCAFFOLD: stubs
|
|
70
|
+
only; FILL: logic to green) plus a legacy full-build mode for non-TDD phases.
|
|
71
|
+
Writes code, never tests. Maintains `docs/ENDPOINTS.md` each phase.
|
|
72
|
+
- **test-author** — writes tests BLIND (never reads implementation logic). On TDD
|
|
73
|
+
phases it is dispatched BEFORE logic exists (RED-first), so blindness is
|
|
74
|
+
structural, not honor-system. Writes a MOCK suite + a REAL API suite.
|
|
75
|
+
- **e2e-runner** — writes/runs functional browser E2E (Playwright) BLIND. Reads
|
|
76
|
+
- **debugger** — debugs in an ISOLATED context. Writes a trace to
|
|
77
|
+
`docs/research/debug-<slug>.md`; returns a summary.
|
|
78
|
+
- **synthesizer** — compresses `docs/STATE.md` / `docs/ISSUES.md`, prunes
|
|
79
|
+
snapshots. On the final phase, also updates `README.md` + `docs/OVERVIEW.md`.
|
|
80
|
+
|
|
81
|
+
The orchestrator ROUTES. It does not implement or debug.
|
|
82
|
+
|
|
83
|
+
-----
|
|
84
|
+
|
|
85
|
+
## Procedures (the slash-command set)
|
|
86
|
+
|
|
87
|
+
Named procedures, each with a canonical body in `commands.src/<name>.md` and a
|
|
88
|
+
generated copy per tool (`.claude/commands/`, `.cursor/commands/`).
|
|
89
|
+
|
|
90
|
+
- **overview** — bootstrap a project: design-research → grill r1 → design-research
|
|
91
|
+
→ re-grill r2 → `OVERVIEW.md` → planner → `PLAN.md` (Phase 0 always first).
|
|
92
|
+
- **phase [n]** — run one phase end-to-end with the circuit breaker. Chooses the
|
|
93
|
+
TDD or non-TDD order at RESEARCH. CLOSE runs the regression guard + ENDPOINTS
|
|
94
|
+
coverage gate.
|
|
95
|
+
- **quick [change]** — small, obvious, non-phase change; no agent chain. Stays
|
|
96
|
+
non-TDD. Runs the mock regression corpus after the change.
|
|
97
|
+
- **unstuck** — deep re-research after a circuit breaker (human-triggered).
|
|
98
|
+
- **synthesize** — compress STATE.md, dedup ISSUES.md, prune snapshots. Run
|
|
99
|
+
before a context reset.
|
|
100
|
+
- **replan** — revise `PLAN.md` (add/cut/split/merge/reorder pending phases) and
|
|
101
|
+
reconcile the regression corpus in step.
|
|
102
|
+
- **log-issue** — append an error to `ISSUES.md` with root cause + failed attempts.
|
|
103
|
+
- **log-decision** — record an architectural change in `docs/DESIGN_LOG.md`.
|
|
104
|
+
- **store-wisdom** — promote resolved issues + research to the shared KB.
|
|
105
|
+
|
|
106
|
+
-----
|
|
107
|
+
|
|
108
|
+
## Rituals (model-run fallback for hooks)
|
|
109
|
+
|
|
110
|
+
Where the host tool can run lifecycle hooks (Claude Code, Cursor), these rituals
|
|
111
|
+
are AUTOMATED by hook scripts and must NOT be run by hand. Where the tool cannot
|
|
112
|
+
inject context, the model performs them itself.
|
|
113
|
+
|
|
114
|
+
### SESSION-OPEN (start / clear / compact-resume)
|
|
115
|
+
|
|
116
|
+
At the start of every session, before any other work, surface:
|
|
117
|
+
1. git state (branch, uncommitted count, last 3 commits).
|
|
118
|
+
2. `docs/STATE.md` — the current position. READ THIS FIRST.
|
|
119
|
+
3. open items in `docs/ISSUES.md`.
|
|
120
|
+
4. `docs/research/INDEX.md` (research map) + infra status + auth nudge.
|
|
121
|
+
5. shared KB: pull latest + load `docs/.kb-snapshot.md` if `.claude/kb-config.json`
|
|
122
|
+
exists.
|
|
123
|
+
6. a one-line reminder of the hard rules below.
|
|
124
|
+
|
|
125
|
+
### COMPRESS (before a context compaction)
|
|
126
|
+
|
|
127
|
+
Snapshot the live position to `docs/.snapshots/` so a post-compact session can
|
|
128
|
+
recover: current phase, next action, open blocker.
|
|
129
|
+
|
|
130
|
+
-----
|
|
131
|
+
|
|
132
|
+
## Hard rules (1–11)
|
|
133
|
+
|
|
134
|
+
1. Before debugging ANY error: grep `docs/ISSUES.md` AND `docs/research/INDEX.md`.
|
|
135
|
+
The SESSION-OPEN ritual surfaces ISSUES.md — there is no excuse to miss it.
|
|
136
|
+
2. Debug attempt cap = 3: WARN the user at attempt 2; the FIRST hard-stop at 3
|
|
137
|
+
STOPS and asks the user. No 4th in-place attempt.
|
|
138
|
+
3. Every resolved error -> logged to `docs/ISSUES.md` with root cause + failed
|
|
139
|
+
attempts.
|
|
140
|
+
4. End of phase -> synthesize -> context reset -> next phase.
|
|
141
|
+
5. **PHASE GATE** = the current-phase REAL API suite passes AND (frontend phase)
|
|
142
|
+
the E2E suite passes AND the accumulated mock regression corpus stays green AND
|
|
143
|
+
every `docs/ENDPOINTS.md` entry has at least one test in `tests/regression/`.
|
|
144
|
+
The final phase additionally runs the full REAL regression corpus. A green
|
|
145
|
+
mock suite alone can never close a phase.
|
|
146
|
+
6. Tests are written by `test-author`, which never sees the implementation logic
|
|
147
|
+
(unbiased). On TDD phases the suite is written before the logic (RED-first).
|
|
148
|
+
`STACK NOT READY` / `FLAKE` do not spend the debug budget. Only `LOGIC FAIL`
|
|
149
|
+
reaches the debugger.
|
|
150
|
+
8. E2E auth = ROPC with a dedicated MFA-excluded test account. NEVER script the
|
|
151
|
+
Microsoft login UI.
|
|
152
|
+
9. Architectural change (new/removed agent, hook, command, or a changed workflow
|
|
153
|
+
rule)? -> run `log-decision` before closing.
|
|
154
|
+
work. Never debug an auth error without checking this first.
|
|
155
|
+
11. **No-rationalization (scoped).** Do not downgrade a TDD phase to non-TDD to
|
|
156
|
+
dodge the RED gate, and do not route phase-worthy work through `quick` to
|
|
157
|
+
dodge it. (Scoped deliberately to these two seams; this is not a broad
|
|
158
|
+
"never make excuses" rule.)
|
|
159
|
+
|
|
160
|
+
-----
|
|
161
|
+
|
|
162
|
+
## Shared KB (optional)
|
|
163
|
+
|
|
164
|
+
If `.claude/kb-config.json` exists, the SESSION-OPEN ritual pulls the KB and loads
|
|
165
|
+
a snapshot to `docs/.kb-snapshot.md`. The researcher checks the snapshot (step 0)
|
|
166
|
+
before any web search. Run `store-wisdom` to promote resolved issues + research to
|
|
167
|
+
the KB. The kit works normally without a KB.
|
|
168
|
+
|
|
169
|
+
-----
|
|
170
|
+
|
|
171
|
+
## File contract
|
|
172
|
+
|
|
173
|
+
- `docs/STATE.md` — current position. Small. Rewritten, not appended.
|
|
174
|
+
- `docs/ISSUES.md` — error log. Deduped by synthesizer.
|
|
175
|
+
- `docs/PLAN.md` — the phase plan. Phase 0 (infra) always first; last phase has
|
|
176
|
+
the deploy task.
|
|
177
|
+
- `docs/HISTORY.md` — one line per finished phase.
|
|
178
|
+
- `docs/DESIGN_LOG.md` — kit architectural rationale (§5.x decision log).
|
|
179
|
+
- `docs/OVERVIEW.md` — project scope. Written after the double-grill in `overview`.
|
|
180
|
+
E2E target.
|
|
181
|
+
- `docs/ENDPOINTS.md` — API/service endpoint catalogue. Maintained by implementer
|
|
182
|
+
each phase. Drives the CLOSE coverage gate.
|
|
183
|
+
- `docs/research/` — full research + debug files. `INDEX.md` is the searchable map.
|
|
184
|
+
`design-<slug>.md` (design research), `<slug>.md` (impl research),
|
|
185
|
+
`debug-<slug>.md` (debugger traces).
|
|
186
|
+
- `docs/.snapshots/` — pre-compact recovery markers (auto-pruned, gitignored).
|
|
187
|
+
holds no secrets.
|
|
188
|
+
- `e2e/`, `scripts/e2e-stack.sh`, `docker-compose.test.yml`, `playwright.config.ts`
|
|
189
|
+
— the E2E stack.
|
|
190
|
+
- `tests/phase-<n>/` — phase-local test suites.
|
|
191
|
+
- `tests/regression/` — cross-phase contract tests (the regression corpus). Run by
|
|
192
|
+
`scripts/regression.sh` (default mock; `--real` runs the real corpus).
|
|
193
|
+
- `.claude/kb-config.json` — shared KB path + remote (optional, written by setup.sh).
|
|
194
|
+
- `docs/.kb-snapshot.md` — KB INDEX loaded this session (auto-generated, gitignored).
|
|
195
|
+
|
|
196
|
+
-----
|
|
197
|
+
|
|
198
|
+
## Capability matrix (which tools get what)
|
|
199
|
+
|
|
200
|
+
- **Claude Code — full (reference implementation).** Generated commands; all three
|
|
201
|
+
lifecycle hooks WITH context injection (SessionStart / PreCompact / SubagentStop);
|
|
202
|
+
named subagents (`.claude/agents/*.md`); `@AGENTS.md` import; shared KB.
|
|
203
|
+
- **Cursor — full (verified against cursor.com/docs, 2026-06).** Generated
|
|
204
|
+
commands (`.cursor/commands/`); the three lifecycle hooks via `.cursor/hooks.json`
|
|
205
|
+
(sessionStart / preCompact / subagentStop) wired to the SAME shared hook scripts —
|
|
206
|
+
sessionStart injects context via a JSON-envelope wrapper
|
|
207
|
+
(`cursor-session-start.sh`); named subagents work natively because Cursor reads
|
|
208
|
+
`.claude/agents/*.md` directly (no duplication); methodology via a
|
|
209
|
+
`.cursor/rules/` pointer at this file. Known degradations: `model: haiku/opus`
|
|
210
|
+
tiers are Claude-specific (Cursor falls back to inherit/compatible), and Cursor's
|
|
211
|
+
preCompact is observational (the snapshot side-effect still runs; no context
|
|
212
|
+
modification).
|
|
213
|
+
- **Everything else.** Reads this `AGENTS.md` if it supports the open standard.
|
|
214
|
+
No generated adapters. Not claimed as supported in v2.0.
|
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: caveman
|
|
3
|
+
description: >
|
|
4
|
+
Ultra-compressed communication mode. Cuts token usage ~75% by dropping
|
|
5
|
+
filler, articles, and pleasantries while keeping full technical accuracy.
|
|
6
|
+
Use when user says "caveman mode", "talk like caveman", "use caveman",
|
|
7
|
+
"less tokens", "be brief", or invokes /caveman.
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
Respond terse like smart caveman. All technical substance stay. Only fluff die.
|
|
11
|
+
|
|
12
|
+
## Persistence
|
|
13
|
+
|
|
14
|
+
ACTIVE EVERY RESPONSE once triggered. No revert after many turns. No filler drift. Still active if unsure. Off only when user says "stop caveman" or "normal mode".
|
|
15
|
+
|
|
16
|
+
## Rules
|
|
17
|
+
|
|
18
|
+
Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Abbreviate common terms (DB/auth/config/req/res/fn/impl). Strip conjunctions. Use arrows for causality (X -> Y). One word when one word enough.
|
|
19
|
+
|
|
20
|
+
Technical terms stay exact. Code blocks unchanged. Errors quoted exact.
|
|
21
|
+
|
|
22
|
+
Pattern: `[thing] [action] [reason]. [next step].`
|
|
23
|
+
|
|
24
|
+
Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
|
|
25
|
+
Yes: "Bug in auth middleware. Token expiry check use `<` not `<=`. Fix:"
|
|
26
|
+
|
|
27
|
+
### Examples
|
|
28
|
+
|
|
29
|
+
**"Why React component re-render?"**
|
|
30
|
+
|
|
31
|
+
> Inline obj prop -> new ref -> re-render. `useMemo`.
|
|
32
|
+
|
|
33
|
+
**"Explain database connection pooling."**
|
|
34
|
+
|
|
35
|
+
> Pool = reuse DB conn. Skip handshake -> fast under load.
|
|
36
|
+
|
|
37
|
+
## Auto-Clarity Exception
|
|
38
|
+
|
|
39
|
+
Drop caveman temporarily for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after.
|
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: grill-me
|
|
3
|
+
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
|
|
7
|
+
|
|
8
|
+
Ask the questions one at a time.
|
|
9
|
+
|
|
10
|
+
If a question can be answered by exploring the codebase, explore the codebase instead.
|
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: karpathy-guidelines
|
|
3
|
+
description: Coding and debugging discipline. Apply on every coding and debug task.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# karpathy-guidelines
|
|
7
|
+
|
|
8
|
+
Apply on every coding + debug task. Caveman ULTRA mode.
|
|
9
|
+
|
|
10
|
+
## Coding
|
|
11
|
+
- Smallest change that works. No speculative abstraction. No "while I'm here".
|
|
12
|
+
- Make it run, then make it right, then make it fast — in that order.
|
|
13
|
+
- One concern per change. If the diff does two things, split it.
|
|
14
|
+
- Read before write — understand the existing code path before editing.
|
|
15
|
+
- Name things for what they are. Delete dead code, don't comment it out.
|
|
16
|
+
- Prefer boring, obvious solutions over clever ones.
|
|
17
|
+
|
|
18
|
+
## Debugging
|
|
19
|
+
- Reproduce first.
|
|
20
|
+
- One hypothesis at a time. State it before changing anything.
|
|
21
|
+
- Change one variable, observe, conclude. No shotgun edits.
|
|
22
|
+
- Read the actual error and stack — top to bottom — before theorizing.
|
|
23
|
+
- When stuck: the bug is somewhere you're sure it isn't. Check assumptions.
|
|
24
|
+
- 3 failed attempts -> stop poking, go research.
|
|
25
|
+
|
|
26
|
+
## Honesty
|
|
27
|
+
- Don't claim it works until you ran it.
|
|
28
|
+
- "I don't know yet" is valid — say it instead of guessing.
|
|
29
|
+
- A failing test is information. Don't edit the test to hide it.
|
|
30
|
+
- Surface uncertainty to the orchestrator; don't paper over it.
|
|
31
|
+
|
|
32
|
+
## Verification
|
|
33
|
+
- Every change gets run. Lint/typecheck/smoke at minimum.
|
|
34
|
+
- The real test is written blind by test-author — your own run is just sanity.
|