@automagik/genie 4.260331.20 → 4.260401.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -10,7 +10,7 @@
   "plugins": [
     {
       "name": "genie",
-      "version": "4.260331.20",
+      "version": "4.260401.2",
       "source": "./plugins/genie",
       "description": "Human-AI partnership for Claude Code. Share a terminal, orchestrate workers, evolve together. Brainstorm ideas, wish them into plans, make with parallel agents, ship as one team. A coding genie that grows with your project."
     }
@@ -40,7 +40,7 @@ jobs:
 
   - uses: oven-sh/setup-bun@v2
     with:
-     bun-version: "1.3.10"
+     bun-version: "1.3.11"
 
   - name: Cache bun packages
     uses: actions/cache@v4
@@ -79,7 +79,7 @@ jobs:
 
   - uses: oven-sh/setup-bun@v2
     with:
-     bun-version: "1.3.10"
+     bun-version: "1.3.11"
 
   - name: Install dependencies
     run: bun install --frozen-lockfile
package/.rlmx/CRITERIA.md ADDED
@@ -0,0 +1,15 @@
+ <!-- CRITERIA.md — Output Criteria for RLM Final Answers
+ These criteria are appended to the system prompt and guide
+ how the LLM composes its FINAL answer. They do NOT affect
+ the iterative REPL reasoning — only the final output.
+
+ Customize these to control format, length, style, and quality
+ of the answers rlmx produces. For example:
+ - Add "Output as JSON" for structured responses
+ - Add "Include code examples" for technical documentation
+ - Add "Limit to 200 words" for concise summaries
+ - Remove criteria you don't need -->
+
+ Provide a clear, well-structured answer that directly addresses the query.
+ Include relevant references to source material when available.
+ Be concise but thorough.
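Following the customization notes in that file, a CRITERIA.md tuned for structured, bounded output might look like this (hypothetical contents for illustration, not part of the package):

```markdown
Output the final answer as a single JSON object.
Limit the answer to 200 words.
Include relevant references to source material when available.
```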
package/.rlmx/SYSTEM.md ADDED
@@ -0,0 +1,89 @@
+ <!-- SYSTEM.md — RLM Algorithm System Prompt
+ This file is the system prompt sent to the LLM at the start of each RLM session.
+ It defines the REPL environment, available functions, and behavioral guidelines.
+ Edit this file to customize the LLM's behavior, persona, or instructions.
+ Place this file in your project's .rlmx/ directory. -->
+
+ <!-- ── Role & Context ─────────────────────────────────────────────────
+ This opening block establishes the LLM's role: answering a query
+ with associated context using an interactive REPL environment.
+ The LLM is encouraged to use the REPL iteratively rather than
+ answering from memory alone. -->
+
+ You are tasked with answering a query with associated context. You can access, transform, and analyze this context interactively in a REPL environment that can recursively query sub-LLMs, which you are strongly encouraged to use as much as possible. You will be queried iteratively until you provide a final answer.
+
+ <!-- ── REPL Environment & Functions ───────────────────────────────────
+ These are the built-in functions available in the Python REPL.
+ The LLM can write and execute Python code using these functions.
+ - context: the loaded context data (files, text, etc.)
+ - llm_query: one-shot LLM call for simple tasks
+ - llm_query_batched: parallel one-shot calls
+ - rlm_query: recursive RLM sub-call for complex subtasks
+ - rlm_query_batched: parallel recursive sub-calls
+ - SHOW_VARS: inspect REPL state
+ - print(): view REPL output and continue reasoning -->
+
+ The REPL environment is initialized with:
+ 1. A `context` variable that contains extremely important information about your query. You should check the content of the `context` variable to understand what you are working with. Make sure you look through it sufficiently as you answer your query.
+ 2. An `llm_query(prompt, model=None)` function that makes a single LLM completion call (no REPL, no iteration). Fast and lightweight -- use this for simple extraction, summarization, or Q&A over a chunk of text. The sub-LLM can handle around 500K chars.
+ 3. An `llm_query_batched(prompts, model=None)` function that runs multiple `llm_query` calls concurrently and returns `List[str]` in the same order as the input prompts. Much faster than sequential `llm_query` calls for independent queries.
+ 4. An `rlm_query(prompt, model=None)` function that spawns a **recursive RLM sub-call** for deeper-thinking subtasks. The child gets its own REPL environment and can reason iteratively over the prompt, just like you. Use this when a subtask requires multi-step reasoning, code execution, or its own iterative problem-solving -- not just a simple one-shot answer. Falls back to `llm_query` if recursion is not available.
+ 5. An `rlm_query_batched(prompts, model=None)` function that spawns multiple recursive RLM sub-calls. Each prompt gets its own child RLM. Falls back to `llm_query_batched` if recursion is not available.
+ 6. A `SHOW_VARS()` function that returns all variables you have created in the REPL. Use this to check what variables exist before using FINAL_VAR.
+ 7. The ability to use `print()` statements to view the output of your REPL code and continue your reasoning.
+ {custom_tools_section}
+
+ <!-- ── When to Use llm_query vs rlm_query ─────────────────────────────
+ This section guides the LLM on choosing between simple one-shot
+ calls (llm_query) and recursive multi-step calls (rlm_query).
+ Simple tasks → llm_query. Complex reasoning → rlm_query. -->
+
+ **When to use `llm_query` vs `rlm_query`:**
+ - Use `llm_query` for simple, one-shot tasks: extracting info from a chunk, summarizing text, answering a factual question, classifying content. These are fast single LLM calls.
+ - Use `rlm_query` when the subtask itself requires deeper thinking: multi-step reasoning, solving a sub-problem that needs its own REPL and iteration, or tasks where a single LLM call might not be enough. The child RLM can write and run code, query further sub-LLMs, and iterate to find the answer.
+
+ <!-- ── Problem Decomposition Strategy ─────────────────────────────────
+ This section instructs the LLM to break problems into smaller
+ pieces and use the REPL programmatically — chunking context,
+ delegating to sub-LLMs, and combining results in code. -->
+
+ **Breaking down problems:** You must break problems into more digestible components—whether that means chunking or summarizing a large context, or decomposing a hard task into easier sub-problems and delegating them via `llm_query` / `rlm_query`. Use the REPL to write a **programmatic strategy** that uses these LLM calls to solve the problem, as if you were building an agent: plan steps, branch on results, combine answers in code.
+
+ **REPL for computation:** You can also use the REPL to compute programmatic steps (e.g. `math.sin(x)`, distances, physics formulas) and then chain those results into an LLM call.
+
+ <!-- ── Context Handling & Output ───────────────────────────────────────
+ Guidelines for working with large contexts: use sub-LLMs as
+ buffers, chunk strategically, and leverage the ~500K char
+ capacity of sub-LLM calls. -->
+
+ You will only be able to see truncated outputs from the REPL environment, so you should use the LLM query functions on variables you want to analyze. You will find these functions especially useful when you have to analyze the semantics of the context. Use these variables as buffers to build up your final answer.
+ Make sure to explicitly look through the entire context in the REPL before answering your query. Break the context and the problem into digestible pieces: e.g. figure out a chunking strategy, break up the context into smart chunks, query an LLM per chunk and save the answers to a buffer, then query an LLM over the buffers to produce your final answer.
+
+ You can use the REPL environment to help you understand your context, especially if it is huge. Remember that your sub-LLMs are powerful -- they can fit around 500K characters in their context window, so don't be afraid to put a lot of context into them. For example, a viable strategy is to feed 10 documents per sub-LLM query. Analyze your input data and see if it is sufficient to just fit it in a few sub-LLM calls!
+
+ <!-- ── REPL Code Execution Format ─────────────────────────────────────
+ The LLM must wrap Python code in triple backticks with the 'repl'
+ language identifier. This triggers execution in the sandboxed
+ Python REPL environment. -->
+
+ When you want to execute Python code in the REPL environment, wrap it in triple backticks with the 'repl' language identifier. For example:
+ ```repl
+ chunk = context[:10000]
+ answer = llm_query(f"What is the magic number in the context? Here is the chunk: {chunk}")
+ print(answer)
+ ```
+
+ <!-- ── Final Answer Format ────────────────────────────────────────────
+ CRITICAL: The LLM must use FINAL() or FINAL_VAR() to submit
+ its answer. FINAL_VAR requires the variable to already exist
+ in the REPL — create it in a ```repl``` block first. -->
+
+ IMPORTANT: When you are done with the iterative process, you MUST provide your final answer inside a FINAL function, NOT in code. You have two options:
+ 1. Use FINAL(your final answer here) to provide the answer directly
+ 2. Use FINAL_VAR(variable_name) to return a variable you have created in the REPL environment as your final output
+
+ WARNING - COMMON MISTAKE: FINAL_VAR retrieves an EXISTING variable. You MUST create and assign the variable in a ```repl``` block FIRST, then call FINAL_VAR in a SEPARATE step.
+
+ If you're unsure what variables exist, you can call SHOW_VARS() in a repl block to see all available variables.
+
+ Think step by step carefully, plan, and execute the plan immediately in your response -- do not just say "I will do this" or "I will do that". Output to the REPL environment and recursive LLMs as much as possible. Remember to explicitly answer the original query in your final answer.
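The chunk-buffer-synthesize strategy that SYSTEM.md describes can be sketched in plain Python. Note that `llm_query` is stubbed here purely for illustration; inside the real rlmx REPL the function is provided, and the 400K chunk size is an assumption chosen to stay under the ~500K-char sub-LLM capacity the prompt mentions.

```python
# Sketch of the strategy SYSTEM.md describes: chunk the context, query a
# sub-LLM per chunk into buffers, then synthesize a final answer over the
# buffers. llm_query is a stand-in stub; the real REPL provides it.

def llm_query(prompt, model=None):
    """Stub standing in for the real one-shot sub-LLM call."""
    return f"[answer over {len(prompt)} chars]"

def answer_over_chunks(context, question, chunk_size=400_000):
    # Keep each chunk well under the ~500K-char sub-LLM capacity.
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    # One buffer per chunk (llm_query_batched would run these concurrently).
    buffers = [llm_query(f"{question}\n\nChunk:\n{c}") for c in chunks]
    # Synthesize the final answer over the much smaller buffers.
    return llm_query(f"{question}\n\nPer-chunk notes:\n" + "\n".join(buffers))

result = answer_over_chunks("x" * 1_000_000, "What is the magic number?")
print(result)
```

In the real REPL the result would then be assigned to a variable and returned via FINAL_VAR, as the prompt instructs.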
package/.rlmx/TOOLS.md ADDED
@@ -0,0 +1,27 @@
+ <!-- TOOLS.md — Custom Python Tools for the RLM REPL
+ Define custom Python functions that get injected into the REPL
+ namespace and described in the system prompt. The LLM can call
+ these functions during its iterative reasoning.
+
+ Format: each tool is a level-2 heading (## name) followed by
+ a python code block with the function definition.
+
+ Example — a summarization helper:
+
+ ## summarize_chunk
+
+ ```python
+ def summarize_chunk(text, max_words=100):
+     """Summarize a chunk of text to max_words."""
+     return llm_query(f"Summarize in {max_words} words:\n{text}")
+ ```
+
+ Tools have access to all REPL built-ins:
+ - llm_query(prompt, model=None)
+ - llm_query_batched(prompts, model=None)
+ - rlm_query(prompt, model=None)
+ - rlm_query_batched(prompts, model=None)
+ - context (the loaded context data)
+ - Any standard Python library
+
+ Add your tools below this comment block. -->
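A hypothetical second entry in that same `## name` + python-block format could be a pure-stdlib chunking helper (no REPL built-ins needed, so it also runs standalone); the name `chunk_text` and its defaults are illustrative assumptions, not part of the package:

```python
# Hypothetical TOOLS.md tool: split text into overlapping chunks for
# per-chunk sub-LLM queries. The overlap preserves context that would
# otherwise be cut at chunk boundaries.

def chunk_text(text, chunk_size=400_000, overlap=1_000):
    """Split text into overlapping chunks of at most chunk_size chars."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

print(chunk_text("abcdef", chunk_size=4, overlap=1))  # → ['abcd', 'def']
```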
package/.rlmx/rlmx.yaml ADDED
@@ -0,0 +1,99 @@
+ # rlmx.yaml — Config for rlmx (RLM algorithm CLI)
+ # Docs: https://github.com/automagik-dev/rlmx
+ #
+ # Prompts live in separate .md files in this directory:
+ # SYSTEM.md — system prompt (LLM behavior & REPL instructions)
+ # CRITERIA.md — output criteria (appended to system prompt for FINAL answers)
+ # TOOLS.md — custom Python tools (injected into REPL namespace)
+
+ # ─── Model ────────────────────────────────────────────────
+ # LLM provider and model selection.
+ # provider: any pi/ai provider (anthropic, openai, google, etc.)
+ # model: model ID for the main RLM loop
+ # sub-call-model: model for llm_query() sub-calls (cheaper/faster)
+ model:
+   provider: google
+   model: gemini-3.1-flash-lite-preview
+   sub-call-model: gemini-3.1-flash-lite-preview
+
+ # ─── Context ──────────────────────────────────────────────
+ # Configure which files are loaded when --context points to a directory.
+ # extensions: file extensions to include (default: [.md])
+ # exclude: directories/patterns to skip
+ context:
+   extensions:
+     - .md
+   exclude:
+     - node_modules
+     - .git
+     - dist
+
+ # ─── Budget ───────────────────────────────────────────────
+ # Cost and resource limits. null = unlimited.
+ # max-cost: maximum USD spend per run
+ # max-tokens: maximum total tokens (input + output) per run
+ # max-depth: maximum recursive rlm_query depth
+ budget:
+   max-cost: null
+   max-tokens: null
+   max-depth: null
+
+ # ─── Tools Level ──────────────────────────────────────────
+ # Controls which built-in functions are available in the REPL.
+ # core — 6 paper functions only (default, paper-faithful)
+ # standard — core + batteries.py convenience functions + gemini batteries (if google)
+ # full — standard + auto-detected package info in system prompt
+ tools-level: core
+
+ # ─── Storage ─────────────────────────────────────────────
+ # pgserve-backed large context handling.
+ # enabled: auto (activate when context exceeds model limit) | always | never
+ # mode: persistent (disk) | memory (in-memory only)
+ # data-dir: where pgserve stores data
+ # port: 0 = auto-assign
+ # chunk-size: null = auto-calculate from model context window
+ # chunk-utilization: fraction of context window to use per chunk (0-1)
+ # chars-per-token: character-to-token ratio for estimation
+ storage:
+   enabled: auto
+   mode: persistent
+   data-dir: ~/.rlmx/data
+   port: 0
+   chunk-size: null
+   chunk-utilization: 0.6
+   chars-per-token: 4
+
+ # ─── Cache (CAG Mode) ───────────────────────────────────
+ # Enable cache-augmented generation. Full context is baked into the
+ # system prompt and cached at the provider for subsequent queries.
+ # cache:
+ #   enabled: false      # true to enable CAG mode (or use --cache flag)
+ #   retention: long     # short|long — maps to pi/ai cacheRetention
+ #   ttl: 3600           # seconds — provider-specific TTL
+ #   expire-time: ""     # ISO 8601 — for Google explicit caching
+ #   session-prefix: ""  # prepended to content hash for sessionId
+
+ # ─── Gemini 3 Native ────────────────────────────────────
+ # Gemini-specific features. Silently ignored for non-Google providers.
+ # All features are opt-in and additive — no breaking changes.
+ # gemini:
+ #   thinking-level: null   # minimal|low|medium|high — controls thinking depth
+ #   google-search: false   # Enable Google Search grounding (web_search() battery)
+ #   url-context: false     # Enable URL Context (fetch_url() battery)
+ #   code-execution: false  # Enable server-side Python execution
+ #   media-resolution:      # Control media token costs per type
+ #     images: auto         # low|medium|high|auto
+ #     pdfs: auto           # low|medium|high|auto
+ #     video: auto          # low|medium|high|auto
+ #   computer-use: false    # Planned for v0.5
+ #   maps-grounding: false  # Planned for v0.5
+ #   file-search: false     # Planned for v0.5
+
+ # ─── Output ─────────────────────────────────────────────
+ # Structured output configuration.
+ # output:
+ #   schema:          # JSON Schema for structured output (Gemini only)
+ #     type: object   # When set, model output is guaranteed to match schema
+ #     properties:    # Falls back to FINAL() text parsing on non-Google
+ #       answer:
+ #         type: string