@canivel/ralph 0.2.1 → 0.2.3
- package/.agents/ralph/PROMPT_build.md +126 -126
- package/.agents/ralph/config.sh +25 -25
- package/.agents/ralph/log-activity.sh +15 -15
- package/.agents/ralph/loop.sh +0 -0
- package/.agents/ralph/references/CONTEXT_ENGINEERING.md +126 -126
- package/.agents/ralph/references/GUARDRAILS.md +174 -174
- package/AGENTS.md +20 -20
- package/README.md +270 -270
- package/bin/ralph +767 -767
- package/diagram.svg +55 -55
- package/examples/commands.md +46 -46
- package/package.json +39 -39
- package/skills/commit/SKILL.md +219 -219
- package/skills/commit/references/commit_examples.md +292 -292
- package/skills/dev-browser/SKILL.md +211 -211
- package/skills/dev-browser/bun.lock +443 -443
- package/skills/dev-browser/package-lock.json +2988 -2988
- package/skills/dev-browser/package.json +31 -31
- package/skills/dev-browser/references/scraping.md +155 -155
- package/skills/dev-browser/scripts/start-relay.ts +32 -32
- package/skills/dev-browser/scripts/start-server.ts +117 -117
- package/skills/dev-browser/server.sh +24 -24
- package/skills/dev-browser/src/client.ts +474 -474
- package/skills/dev-browser/src/index.ts +287 -287
- package/skills/dev-browser/src/relay.ts +731 -731
- package/skills/dev-browser/src/snapshot/__tests__/snapshot.test.ts +223 -223
- package/skills/dev-browser/src/snapshot/browser-script.ts +877 -877
- package/skills/dev-browser/src/snapshot/index.ts +14 -14
- package/skills/dev-browser/src/snapshot/inject.ts +13 -13
- package/skills/dev-browser/src/types.ts +34 -34
- package/skills/dev-browser/tsconfig.json +36 -36
- package/skills/dev-browser/vitest.config.ts +12 -12
- package/skills/prd/SKILL.md +235 -235
- package/tests/agent-loops.mjs +79 -79
- package/tests/agent-ping.mjs +39 -39
- package/tests/audit.md +56 -56
- package/tests/cli-smoke.mjs +47 -47
- package/tests/real-agents.mjs +127 -127
package/.agents/ralph/PROMPT_build.md
CHANGED
@@ -1,126 +1,126 @@
-# Build
-
-You are an autonomous coding agent. Your task is to complete the work for exactly one story and record the outcome.
-
-## Paths
-- PRD: {{PRD_PATH}}
-- AGENTS (optional): {{AGENTS_PATH}}
-- Progress Log: {{PROGRESS_PATH}}
-- Guardrails: {{GUARDRAILS_PATH}}
-- Guardrails Reference: {{GUARDRAILS_REF}}
-- Context Reference: {{CONTEXT_REF}}
-- Errors Log: {{ERRORS_LOG_PATH}}
-- Activity Log: {{ACTIVITY_LOG_PATH}}
-- Activity Logger: {{ACTIVITY_CMD}}
-- No-commit: {{NO_COMMIT}}
-- Repo Root: {{REPO_ROOT}}
-- Run ID: {{RUN_ID}}
-- Iteration: {{ITERATION}}
-- Run Log: {{RUN_LOG_PATH}}
-- Run Summary: {{RUN_META_PATH}}
-
-## Global Quality Gates (apply to every story)
-{{QUALITY_GATES}}
-
-## Selected Story (Do not change scope)
-ID: {{STORY_ID}}
-Title: {{STORY_TITLE}}
-
-Story details:
-{{STORY_BLOCK}}
-
-If the story details are empty or missing, STOP and report that the PRD story format could not be parsed.
-
-## Rules (Non-Negotiable)
-- Implement **only** the work required to complete the selected story.
-- Complete all tasks associated with this story (and only this story).
-- Do NOT ask the user questions.
-- Do NOT change unrelated code.
-- Do NOT assume something is unimplemented — confirm by reading code.
-- Implement completely; no placeholders or stubs.
-- If No-commit is true, do NOT commit or push changes.
-- Do NOT edit the PRD JSON (status is handled by the loop).
-- All changes made during the run must be committed (including updates to progress/logs).
-- Before committing, perform a final **security**, **performance**, and **regression** review of your changes.
-
-## Your Task (Do this in order)
-1. Read {{GUARDRAILS_PATH}} before any code changes.
-2. Read {{ERRORS_LOG_PATH}} for repeated failures to avoid.
-3. Read {{PRD_PATH}} for global context (do not edit).
-4. Fully audit and read all necessary files to understand the task end-to-end before implementing. Do not assume missing functionality.
-5. If {{AGENTS_PATH}} exists, follow its build/test instructions.
-6. Implement only the tasks that belong to {{STORY_ID}}.
-7. Run verification commands listed in the story, the global quality gates, and in {{AGENTS_PATH}} (if required).
-8. If the project has a build or dev workflow, run what applies:
-- Build step (e.g., `npm run build`) if defined.
-- Dev server (e.g., `npm run dev`, `wrangler dev`) if it is the normal validation path.
-- Confirm no runtime/build errors in the console.
-9. Perform a brief audit before committing:
-- **Security:** check for obvious vulnerabilities or unsafe handling introduced by your changes.
-- **Performance:** check for avoidable regressions (extra queries, heavy loops, unnecessary re-renders).
-- **Regression:** verify existing behavior that could be impacted still works.
-10. If No-commit is false, commit changes using the `$commit` skill.
-- Stage everything: `git add -A`
-- Confirm a clean working tree after commit: `git status --porcelain` should be empty.
-- After committing, capture the commit hash and subject using:
-`git show -s --format="%h %s" HEAD`.
-11. Append a progress entry to {{PROGRESS_PATH}} with run/commit/test details (format below).
-If No-commit is true, skip committing and note it in the progress entry.
-
-## Progress Entry Format (Append Only)
-```
-## [Date/Time] - {{STORY_ID}}: {{STORY_TITLE}}
-Thread: [codex exec session id if available, otherwise leave blank]
-Run: {{RUN_ID}} (iteration {{ITERATION}})
-Run log: {{RUN_LOG_PATH}}
-Run summary: {{RUN_META_PATH}}
-- Guardrails reviewed: yes
-- No-commit run: {{NO_COMMIT}}
-- Commit: <hash> <subject> (or `none` + reason)
-- Post-commit status: `clean` or list remaining files
-- Verification:
-- Command: <exact command> -> PASS/FAIL
-- Command: <exact command> -> PASS/FAIL
-- Files changed:
-- <file path>
-- <file path>
-- What was implemented
-- **Learnings for future iterations:**
-- Patterns discovered
-- Gotchas encountered
-- Useful context
----
-```
-
-## Completion Signal
-Only output the completion signal when the **selected story** is fully complete and verified.
-When the selected story is complete, output:
-<promise>COMPLETE</promise>
-
-Otherwise, end normally without the signal.
-
-## Additional Guardrails
-- When authoring documentation, capture the why (tests + implementation intent).
-- If you learn how to run/build/test the project, update {{AGENTS_PATH}} briefly (operational only).
-- Keep AGENTS operational only; progress notes belong in {{PROGRESS_PATH}}.
-- If you hit repeated errors, log them in {{ERRORS_LOG_PATH}} and add a Sign to {{GUARDRAILS_PATH}} using {{GUARDRAILS_REF}} as the template.
-
-## Activity Logging (Required)
-Log major actions to {{ACTIVITY_LOG_PATH}} using the helper:
-```
-{{ACTIVITY_CMD}} "message"
-```
-Log at least:
-- Start of work on the story
-- After major code changes
-- After tests/verification
-- After updating progress log
-
-## Browser Testing (Required for Frontend Stories)
-If the selected story changes UI, you MUST verify it in the browser:
-1. Load the `dev-browser` skill.
-2. Navigate to the relevant page.
-3. Verify the UI changes work as expected.
-4. Take a screenshot if helpful for the progress log.
-
-A frontend story is NOT complete until browser verification passes.
+# Build
+
+You are an autonomous coding agent. Your task is to complete the work for exactly one story and record the outcome.
+
+## Paths
+- PRD: {{PRD_PATH}}
+- AGENTS (optional): {{AGENTS_PATH}}
+- Progress Log: {{PROGRESS_PATH}}
+- Guardrails: {{GUARDRAILS_PATH}}
+- Guardrails Reference: {{GUARDRAILS_REF}}
+- Context Reference: {{CONTEXT_REF}}
+- Errors Log: {{ERRORS_LOG_PATH}}
+- Activity Log: {{ACTIVITY_LOG_PATH}}
+- Activity Logger: {{ACTIVITY_CMD}}
+- No-commit: {{NO_COMMIT}}
+- Repo Root: {{REPO_ROOT}}
+- Run ID: {{RUN_ID}}
+- Iteration: {{ITERATION}}
+- Run Log: {{RUN_LOG_PATH}}
+- Run Summary: {{RUN_META_PATH}}
+
+## Global Quality Gates (apply to every story)
+{{QUALITY_GATES}}
+
+## Selected Story (Do not change scope)
+ID: {{STORY_ID}}
+Title: {{STORY_TITLE}}
+
+Story details:
+{{STORY_BLOCK}}
+
+If the story details are empty or missing, STOP and report that the PRD story format could not be parsed.
+
+## Rules (Non-Negotiable)
+- Implement **only** the work required to complete the selected story.
+- Complete all tasks associated with this story (and only this story).
+- Do NOT ask the user questions.
+- Do NOT change unrelated code.
+- Do NOT assume something is unimplemented — confirm by reading code.
+- Implement completely; no placeholders or stubs.
+- If No-commit is true, do NOT commit or push changes.
+- Do NOT edit the PRD JSON (status is handled by the loop).
+- All changes made during the run must be committed (including updates to progress/logs).
+- Before committing, perform a final **security**, **performance**, and **regression** review of your changes.
+
+## Your Task (Do this in order)
+1. Read {{GUARDRAILS_PATH}} before any code changes.
+2. Read {{ERRORS_LOG_PATH}} for repeated failures to avoid.
+3. Read {{PRD_PATH}} for global context (do not edit).
+4. Fully audit and read all necessary files to understand the task end-to-end before implementing. Do not assume missing functionality.
+5. If {{AGENTS_PATH}} exists, follow its build/test instructions.
+6. Implement only the tasks that belong to {{STORY_ID}}.
+7. Run verification commands listed in the story, the global quality gates, and in {{AGENTS_PATH}} (if required).
+8. If the project has a build or dev workflow, run what applies:
+- Build step (e.g., `npm run build`) if defined.
+- Dev server (e.g., `npm run dev`, `wrangler dev`) if it is the normal validation path.
+- Confirm no runtime/build errors in the console.
+9. Perform a brief audit before committing:
+- **Security:** check for obvious vulnerabilities or unsafe handling introduced by your changes.
+- **Performance:** check for avoidable regressions (extra queries, heavy loops, unnecessary re-renders).
+- **Regression:** verify existing behavior that could be impacted still works.
+10. If No-commit is false, commit changes using the `$commit` skill.
+- Stage everything: `git add -A`
+- Confirm a clean working tree after commit: `git status --porcelain` should be empty.
+- After committing, capture the commit hash and subject using:
+`git show -s --format="%h %s" HEAD`.
+11. Append a progress entry to {{PROGRESS_PATH}} with run/commit/test details (format below).
+If No-commit is true, skip committing and note it in the progress entry.
+
+## Progress Entry Format (Append Only)
+```
+## [Date/Time] - {{STORY_ID}}: {{STORY_TITLE}}
+Thread: [codex exec session id if available, otherwise leave blank]
+Run: {{RUN_ID}} (iteration {{ITERATION}})
+Run log: {{RUN_LOG_PATH}}
+Run summary: {{RUN_META_PATH}}
+- Guardrails reviewed: yes
+- No-commit run: {{NO_COMMIT}}
+- Commit: <hash> <subject> (or `none` + reason)
+- Post-commit status: `clean` or list remaining files
+- Verification:
+- Command: <exact command> -> PASS/FAIL
+- Command: <exact command> -> PASS/FAIL
+- Files changed:
+- <file path>
+- <file path>
+- What was implemented
+- **Learnings for future iterations:**
+- Patterns discovered
+- Gotchas encountered
+- Useful context
+---
+```
+
+## Completion Signal
+Only output the completion signal when the **selected story** is fully complete and verified.
+When the selected story is complete, output:
+<promise>COMPLETE</promise>
+
+Otherwise, end normally without the signal.
+
+## Additional Guardrails
+- When authoring documentation, capture the why (tests + implementation intent).
+- If you learn how to run/build/test the project, update {{AGENTS_PATH}} briefly (operational only).
+- Keep AGENTS operational only; progress notes belong in {{PROGRESS_PATH}}.
+- If you hit repeated errors, log them in {{ERRORS_LOG_PATH}} and add a Sign to {{GUARDRAILS_PATH}} using {{GUARDRAILS_REF}} as the template.
+
+## Activity Logging (Required)
+Log major actions to {{ACTIVITY_LOG_PATH}} using the helper:
+```
+{{ACTIVITY_CMD}} "message"
+```
+Log at least:
+- Start of work on the story
+- After major code changes
+- After tests/verification
+- After updating progress log
+
+## Browser Testing (Required for Frontend Stories)
+If the selected story changes UI, you MUST verify it in the browser:
+1. Load the `dev-browser` skill.
+2. Navigate to the relevant page.
+3. Verify the UI changes work as expected.
+4. Take a screenshot if helpful for the progress log.
+
+A frontend story is NOT complete until browser verification passes.
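The Completion Signal section of PROMPT_build.md asks the agent to print `<promise>COMPLETE</promise>` only once the selected story is done and verified. As a minimal sketch, a wrapper script could check captured agent output for that marker as shown here. The function names `has_completion_signal` and `report_iteration` are hypothetical, and the actual check performed by loop.sh is not visible in this diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): succeeds when the text
# passed as $1 contains the completion marker from PROMPT_build.md.
has_completion_signal() {
  # -F matches the marker as a fixed string; -q only sets the exit status.
  printf '%s\n' "$1" | grep -Fq '<promise>COMPLETE</promise>'
}

# Hypothetical example: summarize one iteration's captured output.
report_iteration() {
  if has_completion_signal "$1"; then
    echo "story complete"
  else
    echo "story not complete"
  fi
}
```

Matching the marker as a fixed string avoids treating `<`, `>` or `/` as pattern syntax, and an iteration that ends without the marker is simply reported as not complete.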
package/.agents/ralph/config.sh
CHANGED
@@ -1,25 +1,25 @@
-# Optional Ralph config overrides.
-# All paths are relative to repo root unless absolute.
-# Uncomment and edit as needed.
-
-# PRD_PATH=".agents/tasks/prd.json"
-# PROGRESS_PATH=".ralph/progress.md"
-# GUARDRAILS_PATH=".ralph/guardrails.md"
-# ERRORS_LOG_PATH=".ralph/errors.log"
-# ACTIVITY_LOG_PATH=".ralph/activity.log"
-# TMP_DIR=".ralph/.tmp"
-# RUNS_DIR=".ralph/runs"
-# GUARDRAILS_REF=".agents/ralph/references/GUARDRAILS.md"
-# CONTEXT_REF=".agents/ralph/references/CONTEXT_ENGINEERING.md"
-# ACTIVITY_CMD=".agents/ralph/log-activity.sh"
-# AGENT_CMD defaults are defined in agents.sh. Override here if needed.
-# AGENT_CMD="codex exec --yolo --skip-git-repo-check -"
-# PRD_AGENT_CMD defaults are defined in agents.sh (interactive).
-# PRD_AGENT_CMD="codex --yolo --skip-git-repo-check {prompt}"
-# AGENT_CMD="claude -p --dangerously-skip-permissions \"\$(cat {prompt})\""
-# AGENT_CMD="droid exec --skip-permissions-unsafe -f {prompt}"
-# AGENTS_PATH="AGENTS.md"
-# PROMPT_BUILD=".agents/ralph/PROMPT_build.md"
-# NO_COMMIT=false
-# MAX_ITERATIONS=25
-# STALE_SECONDS=0
+# Optional Ralph config overrides.
+# All paths are relative to repo root unless absolute.
+# Uncomment and edit as needed.
+
+# PRD_PATH=".agents/tasks/prd.json"
+# PROGRESS_PATH=".ralph/progress.md"
+# GUARDRAILS_PATH=".ralph/guardrails.md"
+# ERRORS_LOG_PATH=".ralph/errors.log"
+# ACTIVITY_LOG_PATH=".ralph/activity.log"
+# TMP_DIR=".ralph/.tmp"
+# RUNS_DIR=".ralph/runs"
+# GUARDRAILS_REF=".agents/ralph/references/GUARDRAILS.md"
+# CONTEXT_REF=".agents/ralph/references/CONTEXT_ENGINEERING.md"
+# ACTIVITY_CMD=".agents/ralph/log-activity.sh"
+# AGENT_CMD defaults are defined in agents.sh. Override here if needed.
+# AGENT_CMD="codex exec --yolo --skip-git-repo-check -"
+# PRD_AGENT_CMD defaults are defined in agents.sh (interactive).
+# PRD_AGENT_CMD="codex --yolo --skip-git-repo-check {prompt}"
+# AGENT_CMD="claude -p --dangerously-skip-permissions \"\$(cat {prompt})\""
+# AGENT_CMD="droid exec --skip-permissions-unsafe -f {prompt}"
+# AGENTS_PATH="AGENTS.md"
+# PROMPT_BUILD=".agents/ralph/PROMPT_build.md"
+# NO_COMMIT=false
+# MAX_ITERATIONS=25
+# STALE_SECONDS=0
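config.sh describes itself as a set of optional overrides, with every assignment commented out. One way such a file could be consumed is sketched below: source it if present, then fall back to defaults for anything left unset. The function name `load_ralph_config` is hypothetical, and treating the commented-out values as the defaults is an assumption; how loop.sh actually loads the file is not shown in this diff.

```shell
#!/bin/sh
# Hypothetical loader (not part of the package): apply optional overrides,
# then fill in fallbacks. The fallback values are ASSUMED to match the
# commented-out examples in config.sh.
load_ralph_config() {
  config_file="$1"
  if [ -f "$config_file" ]; then
    # Sourcing lets the file set plain shell variables such as MAX_ITERATIONS.
    . "$config_file"
  fi
  # ${VAR:-default} keeps an override if one was set, else uses the fallback.
  PRD_PATH="${PRD_PATH:-.agents/tasks/prd.json}"
  PROGRESS_PATH="${PROGRESS_PATH:-.ralph/progress.md}"
  NO_COMMIT="${NO_COMMIT:-false}"
  MAX_ITERATIONS="${MAX_ITERATIONS:-25}"
}
```

With this shape, a config file in which every line stays commented out behaves the same as a missing file, which fits the "uncomment and edit as needed" note at the top of config.sh.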
package/.agents/ralph/log-activity.sh
CHANGED
@@ -1,15 +1,15 @@
-#!/bin/bash
-set -euo pipefail
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-ROOT_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)"
-ACTIVITY_LOG="$ROOT_DIR/.ralph/activity.log"
-
-if [ $# -lt 1 ]; then
-echo "Usage: $0 \"message\""
-exit 1
-fi
-
-mkdir -p "$(dirname "$ACTIVITY_LOG")"
-TS=$(date '+%Y-%m-%d %H:%M:%S')
-echo "[$TS] $*" >> "$ACTIVITY_LOG"
+#!/bin/bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ROOT_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)"
+ACTIVITY_LOG="$ROOT_DIR/.ralph/activity.log"
+
+if [ $# -lt 1 ]; then
+echo "Usage: $0 \"message\""
+exit 1
+fi
+
+mkdir -p "$(dirname "$ACTIVITY_LOG")"
+TS=$(date '+%Y-%m-%d %H:%M:%S')
+echo "[$TS] $*" >> "$ACTIVITY_LOG"
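To make the resulting log format concrete, here is a small sketch of the same append logic as a shell function. It is an illustration only: the name `log_activity` is hypothetical, and the log path is passed explicitly as the first argument, whereas the script above derives it from its own location.

```shell
#!/bin/sh
# Hypothetical function form of log-activity.sh (not part of the package).
# $1 is the log file; the remaining arguments form the message.
log_activity() {
  log_file="$1"
  shift
  # Create the log directory on first use, as the script does.
  mkdir -p "$(dirname "$log_file")"
  ts=$(date '+%Y-%m-%d %H:%M:%S')
  # "$*" joins the remaining arguments with single spaces.
  echo "[$ts] $*" >> "$log_file"
}
```

Each call appends one line of the form `[YYYY-MM-DD HH:MM:SS] message`, so the log stays append-only and easy to search by timestamp.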
package/.agents/ralph/loop.sh
CHANGED
File without changes
package/.agents/ralph/references/CONTEXT_ENGINEERING.md
CHANGED
@@ -1,126 +1,126 @@
-# Context Engineering Reference
-
-This document explains the malloc/free metaphor for LLM context management that underlies the Ralph technique.
-
-## The malloc() Metaphor
-
-In traditional programming:
-- `malloc()` allocates memory
-- `free()` releases memory
-- Memory leaks occur when you allocate without freeing
-
-In LLM context:
-- Reading files, receiving responses, tool outputs = `malloc()`
-- **There is no `free()`** - context cannot be released
-- The only way to "free" is to start a new conversation
-
-## Why This Matters
-
-### Context Pollution
-
-When you work on multiple unrelated tasks in the same context:
-
-```
-Task 1: Build authentication → context contains auth code, JWT docs, security patterns
-Task 2: Build UI components → context now ALSO contains auth stuff
-
-Result: LLM might suggest auth-related patterns when building UI
-or mix concerns inappropriately
-```
-
-### Autoregressive Failure
-
-LLMs predict the next token based on ALL context. When context contains:
-- Unrelated information
-- Failed attempts
-- Mixed concerns
-
-The model can "spiral" into wrong territory, generating increasingly off-base responses.
-
-### The Gutter Metaphor
-
-> "If the bowling ball is in the gutter, there's no saving it."
-
-Once context is polluted with failed attempts or mixed concerns, the model will keep referencing that pollution. Starting fresh is often faster than trying to correct course.
-
-## Context Health Indicators
-
-### 🟢 Healthy Context
-- Single focused task
-- Relevant files only
-- Clear progress
-- Under 60% capacity
-
-### 🟡 Warning Signs
-- Multiple unrelated topics discussed
-- Several failed attempts in history
-- Approaching 80% capacity
-- Repeated similar errors
-
-### 🔴 Critical / Gutter
-- Mixed concerns throughout
-- Circular failure patterns
-- Over 90% capacity
-- Model suggesting irrelevant solutions
-
-## Best Practices
-
-### 1. One Task Per Context
-
-Don't ask "fix the auth bug AND add the new feature". Do them in separate conversations.
-
-### 2. Fresh Start on Topic Change
-
-Finished auth? Start a new conversation for the next feature.
-
-### 3. Don't Redline
-
-Stay under 80% of context capacity. Quality degrades as you approach limits.
-
-### 4. Recognize the Gutter
-
-If you're seeing:
-- Same error 3+ times
-- Solutions that don't match the problem
-- Circular suggestions
-
-Start fresh. Your progress is in the files.
-
-### 5. State in Files, Not Context
-
-Write progress to files. The next conversation can read them. Context is ephemeral; files are permanent.
-
-## Ralph's Approach
-
-The original Ralph technique (`while :; do cat PROMPT.md | agent ; done`) naturally implements these principles:
-
-1. **Each iteration is a fresh process** - Context is freed
-2. **State persists in files** - Progress survives context resets
-3. **Same prompt each time** - Focused, single-task context
-4. **Failures inform guardrails** - Learning without context pollution
-
-This Cursor implementation aims to bring these benefits while working within Cursor's session model.
-
-## Measuring Context
-
-Rough estimates:
-- 1 token ≈ 4 characters
-- Average code file: 500-2000 tokens
-- Large file: 5000+ tokens
-- Conversation history: 100-500 tokens per exchange
-
-Track allocations in `.ralph/context-log.md` to stay aware.
-
-## When to Start Fresh
-
-**Definitely start fresh when:**
-- Switching to unrelated task
-- Context over 90% full
-- Same error 3+ times
-- Model suggestions are off-topic
-
-**Consider starting fresh when:**
-- Context over 70% full
-- Significant topic shift within task
-- Feeling "stuck"
-- Multiple failed approaches in history
+# Context Engineering Reference
+
+This document explains the malloc/free metaphor for LLM context management that underlies the Ralph technique.
+
+## The malloc() Metaphor
+
+In traditional programming:
+- `malloc()` allocates memory
+- `free()` releases memory
+- Memory leaks occur when you allocate without freeing
+
+In LLM context:
+- Reading files, receiving responses, tool outputs = `malloc()`
+- **There is no `free()`** - context cannot be released
+- The only way to "free" is to start a new conversation
+
+## Why This Matters
+
+### Context Pollution
+
+When you work on multiple unrelated tasks in the same context:
+
+```
+Task 1: Build authentication → context contains auth code, JWT docs, security patterns
+Task 2: Build UI components → context now ALSO contains auth stuff
+
+Result: LLM might suggest auth-related patterns when building UI
+or mix concerns inappropriately
+```
+
+### Autoregressive Failure
+
+LLMs predict the next token based on ALL context. When context contains:
+- Unrelated information
+- Failed attempts
+- Mixed concerns
+
+The model can "spiral" into wrong territory, generating increasingly off-base responses.
+
+### The Gutter Metaphor
+
+> "If the bowling ball is in the gutter, there's no saving it."
+
+Once context is polluted with failed attempts or mixed concerns, the model will keep referencing that pollution. Starting fresh is often faster than trying to correct course.
+
+## Context Health Indicators
+
+### 🟢 Healthy Context
+- Single focused task
+- Relevant files only
+- Clear progress
+- Under 60% capacity
+
+### 🟡 Warning Signs
+- Multiple unrelated topics discussed
+- Several failed attempts in history
+- Approaching 80% capacity
+- Repeated similar errors
+
+### 🔴 Critical / Gutter
+- Mixed concerns throughout
+- Circular failure patterns
+- Over 90% capacity
+- Model suggesting irrelevant solutions
+
+## Best Practices
+
+### 1. One Task Per Context
+
+Don't ask "fix the auth bug AND add the new feature". Do them in separate conversations.
+
+### 2. Fresh Start on Topic Change
+
+Finished auth? Start a new conversation for the next feature.
+
+### 3. Don't Redline
+
+Stay under 80% of context capacity. Quality degrades as you approach limits.
+
+### 4. Recognize the Gutter
+
+If you're seeing:
+- Same error 3+ times
+- Solutions that don't match the problem
+- Circular suggestions
+
+Start fresh. Your progress is in the files.
+
+### 5. State in Files, Not Context
+
+Write progress to files. The next conversation can read them. Context is ephemeral; files are permanent.
+
+## Ralph's Approach
+
+The original Ralph technique (`while :; do cat PROMPT.md | agent ; done`) naturally implements these principles:
+
+1. **Each iteration is a fresh process** - Context is freed
+2. **State persists in files** - Progress survives context resets
+3. **Same prompt each time** - Focused, single-task context
+4. **Failures inform guardrails** - Learning without context pollution
+
+This Cursor implementation aims to bring these benefits while working within Cursor's session model.
+
+## Measuring Context
+
+Rough estimates:
+- 1 token ≈ 4 characters
+- Average code file: 500-2000 tokens
+- Large file: 5000+ tokens
+- Conversation history: 100-500 tokens per exchange
+
+Track allocations in `.ralph/context-log.md` to stay aware.
+
+## When to Start Fresh
+
+**Definitely start fresh when:**
+- Switching to unrelated task
+- Context over 90% full
+- Same error 3+ times
+- Model suggestions are off-topic
+
+**Consider starting fresh when:**
+- Context over 70% full
+- Significant topic shift within task
+- Feeling "stuck"
+- Multiple failed approaches in history
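The rough estimates and health thresholds in CONTEXT_ENGINEERING.md can be turned into a quick back-of-the-envelope check. The sketch below assumes the stated rule of thumb (1 token ≈ 4 characters) and the 60% and 90% thresholds from the health indicators. The function names `estimate_tokens` and `context_health` are hypothetical, and lumping everything between 60% and 90% into a single "warning" band is a simplification, not something the reference prescribes.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the package).

# Estimate tokens from a character count using 1 token ≈ 4 characters,
# rounding up so short inputs are not reported as zero tokens.
estimate_tokens() {
  chars="$1"
  echo $(( (chars + 3) / 4 ))
}

# Classify usage given tokens used and total capacity (both integers).
# Under 60% is healthy, over 90% is critical, anything else is warning.
context_health() {
  used="$1"
  capacity="$2"
  pct=$(( used * 100 / capacity ))
  if [ "$pct" -lt 60 ]; then
    echo "healthy"
  elif [ "$pct" -gt 90 ]; then
    echo "critical"
  else
    echo "warning"
  fi
}
```

For example, a file's character count from `wc -c` could be passed to `estimate_tokens`, and a running total compared against whatever capacity the model in use provides. The capacity figure is supplied by the caller and is not specified anywhere in this diff.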