rlmgrep 0.1.0__tar.gz → 0.1.2__tar.gz
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/PKG-INFO +23 -5
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/README.md +22 -4
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/pyproject.toml +1 -1
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/cli.py +71 -5
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/config.py +4 -5
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/ingest.py +24 -10
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/PKG-INFO +23 -5
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/__init__.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/__main__.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/file_map.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/interpreter.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/render.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/rlm.py +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/SOURCES.txt +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/dependency_links.txt +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/entry_points.txt +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/requires.txt +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/top_level.txt +0 -0
- {rlmgrep-0.1.0 → rlmgrep-0.1.2}/setup.cfg +0 -0
{rlmgrep-0.1.0 → rlmgrep-0.1.2}/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: rlmgrep
-Version: 0.1.0
+Version: 0.1.2
 Summary: Grep-shaped CLI search powered by DSPy RLM
 Author: rlmgrep
 License: MIT
@@ -17,7 +17,7 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart

 ```sh
-uv tool install --python 3.11
+uv tool install --python 3.11 rlmgrep
 # or from GitHub:
 # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git

@@ -71,6 +71,8 @@ Common options:
 - `--type T` include file types (repeatable, comma-separated)
 - `--no-recursive` do not recurse directories
 - `-a`, `--text` treat binary files as text
+- `-y`, `--yes` skip file count confirmation
+- `--stdin-files` treat stdin as newline-delimited file paths
 - `--model`, `--sub-model` override model names
 - `--api-key`, `--api-base`, `--model-type` override provider settings
 - `--max-iterations`, `--max-llm-calls` cap RLM search effort
@@ -90,6 +92,9 @@ rlmgrep "error handling" -g "**/*.py" -g "**/*.md" .

 # Read from stdin (only when no paths are provided)
 cat README.md | rlmgrep "install"
+
+# Use rg/grep to find candidate files, then rlmgrep over that list
+rg -l "token" . | rlmgrep --stdin-files --answer "what does this token control?"
 ```

 ## Input selection
@@ -99,6 +104,7 @@ cat README.md | rlmgrep "install"
 - `-g/--glob` matches path globs against normalized paths (forward slashes).
 - Paths are printed relative to the current working directory when possible.
 - If no paths are provided, rlmgrep reads from stdin and uses the synthetic path `<stdin>`; if stdin is empty, it exits with code 2.
+- rlmgrep asks for confirmation when more than 200 files would be loaded (use `-y/--yes` to skip), and aborts when more than 1000 files would be loaded.

 ## Output contract (stable for agents)

@@ -116,6 +122,18 @@ cat README.md | rlmgrep "install"

 Agent tip: use `-n -H` and no context for parse-friendly output, then key off exit codes.

+## Regex-style queries (best effort)
+
+rlmgrep can interpret traditional regex-style patterns inside a natural-language prompt. The RLM may use Python (including `re`) in its internal REPL to approximate regex logic, but it is **not guaranteed** to behave exactly like `grep`/`rg`.
+
+Example (best-effort regex semantics + extra context):
+
+```sh
+rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+```
+
+If you need strict, deterministic regex behavior, use `rg`/`grep`.
+
 ## Configuration

 rlmgrep creates a default config automatically if missing. The config path is:
@@ -133,6 +151,8 @@ temperature = 1.0
 max_tokens = 64000
 max_iterations = 10
 max_llm_calls = 20
+file_warn_threshold = 200
+file_hard_max = 1000
 # markitdown_enable_images = false
 # markitdown_image_llm_model = "gpt-5-mini"
 # markitdown_image_llm_provider = "openai"
@@ -168,10 +188,8 @@ If more than one provider key is set and the model does not make the provider ob

 - Prefer narrow corpora (globs/types) to reduce token usage.
 - Use `--max-llm-calls` to cap costs; combine with small `--max-iterations` for safety.
-- Always read stderr for warnings (skipped files, config issues, ambiguous API keys).
 - For reproducible parsing, use `-n -H` and avoid context (`-C/-A/-B`).
-
-
+
 ## Development

 - Install locally: `pip install -e .` or `uv tool install .`
{rlmgrep-0.1.0 → rlmgrep-0.1.2}/README.md
@@ -5,7 +5,7 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart

 ```sh
-uv tool install --python 3.11
+uv tool install --python 3.11 rlmgrep
 # or from GitHub:
 # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git

@@ -59,6 +59,8 @@ Common options:
 - `--type T` include file types (repeatable, comma-separated)
 - `--no-recursive` do not recurse directories
 - `-a`, `--text` treat binary files as text
+- `-y`, `--yes` skip file count confirmation
+- `--stdin-files` treat stdin as newline-delimited file paths
 - `--model`, `--sub-model` override model names
 - `--api-key`, `--api-base`, `--model-type` override provider settings
 - `--max-iterations`, `--max-llm-calls` cap RLM search effort
@@ -78,6 +80,9 @@ rlmgrep "error handling" -g "**/*.py" -g "**/*.md" .

 # Read from stdin (only when no paths are provided)
 cat README.md | rlmgrep "install"
+
+# Use rg/grep to find candidate files, then rlmgrep over that list
+rg -l "token" . | rlmgrep --stdin-files --answer "what does this token control?"
 ```

 ## Input selection
@@ -87,6 +92,7 @@ cat README.md | rlmgrep "install"
 - `-g/--glob` matches path globs against normalized paths (forward slashes).
 - Paths are printed relative to the current working directory when possible.
 - If no paths are provided, rlmgrep reads from stdin and uses the synthetic path `<stdin>`; if stdin is empty, it exits with code 2.
+- rlmgrep asks for confirmation when more than 200 files would be loaded (use `-y/--yes` to skip), and aborts when more than 1000 files would be loaded.

 ## Output contract (stable for agents)

@@ -104,6 +110,18 @@ cat README.md | rlmgrep "install"

 Agent tip: use `-n -H` and no context for parse-friendly output, then key off exit codes.

+## Regex-style queries (best effort)
+
+rlmgrep can interpret traditional regex-style patterns inside a natural-language prompt. The RLM may use Python (including `re`) in its internal REPL to approximate regex logic, but it is **not guaranteed** to behave exactly like `grep`/`rg`.
+
+Example (best-effort regex semantics + extra context):
+
+```sh
+rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+```
+
+If you need strict, deterministic regex behavior, use `rg`/`grep`.
+
 ## Configuration

 rlmgrep creates a default config automatically if missing. The config path is:
@@ -121,6 +139,8 @@ temperature = 1.0
 max_tokens = 64000
 max_iterations = 10
 max_llm_calls = 20
+file_warn_threshold = 200
+file_hard_max = 1000
 # markitdown_enable_images = false
 # markitdown_image_llm_model = "gpt-5-mini"
 # markitdown_image_llm_provider = "openai"
@@ -156,10 +176,8 @@ If more than one provider key is set and the model does not make the provider ob

 - Prefer narrow corpora (globs/types) to reduce token usage.
 - Use `--max-llm-calls` to cap costs; combine with small `--max-iterations` for safety.
-- Always read stderr for warnings (skipped files, config issues, ambiguous API keys).
 - For reproducible parsing, use `-n -H` and avoid context (`-C/-A/-B`).
-
-
+
 ## Development

 - Install locally: `pip install -e .` or `uv tool install .`
{rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/cli.py
@@ -8,7 +8,7 @@ from pathlib import Path
 import dspy
 from .config import ensure_default_config, load_config
 from .file_map import build_file_map
-from .ingest import FileRecord, load_files, resolve_type_exts
+from .ingest import FileRecord, collect_candidates, load_files, resolve_type_exts
 from .rlm import Match, build_lm, run_rlm
 from .render import render_matches

@@ -17,6 +17,23 @@ def _warn(msg: str) -> None:
     print(f"rlmgrep: {msg}", file=sys.stderr)


+def _confirm_over_limit(count: int, threshold: int) -> bool:
+    prompt = (
+        f"rlmgrep: {count} files to load (over {threshold}). Continue? [y/N] "
+    )
+    try:
+        with open("/dev/tty", "r+") as tty:
+            print(prompt, file=tty, end="", flush=True)
+            response = tty.readline()
+    except Exception:
+        if not sys.stdin.isatty():
+            _warn("refusing to prompt for confirmation; use --yes to proceed")
+            return False
+        print(prompt, file=sys.stderr, end="", flush=True)
+        response = sys.stdin.readline()
+    return response.strip().lower() in {"y", "yes"}
+
+
 def verify_matches(
     matches: list[Match],
     files: dict[str, FileRecord],
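The new `_confirm_over_limit` helper tries `/dev/tty` first, which lets the prompt work even when stdin is a pipe (as with `--stdin-files`), and falls back to stdin only when stdin is a terminal. A minimal sketch of the same decision, with the I/O injected so it can be exercised without a terminal, might look like the following. The names `is_yes` and `confirm` are hypothetical and do not appear in the package.

```python
import sys
from typing import Callable, TextIO


def is_yes(response: str) -> bool:
    """Only an explicit 'y' or 'yes' counts; empty input falls back to the default 'N'."""
    return response.strip().lower() in {"y", "yes"}


def confirm(prompt: str, read_line: Callable[[], str], out: TextIO = sys.stderr) -> bool:
    """Write the prompt to `out`, read one line through `read_line`, and interpret it."""
    print(prompt, file=out, end="", flush=True)
    return is_yes(read_line())


# Simulated answers instead of a real terminal.
print(confirm("250 files to load (over 200). Continue? [y/N] ", lambda: "Yes\n"))  # True
print(confirm("250 files to load (over 200). Continue? [y/N] ", lambda: "\n"))     # False
```

Keeping the answer parsing separate from the terminal handling makes the default-to-no behavior easy to check in isolation.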
@@ -65,6 +82,12 @@ def _parse_args(argv: list[str]) -> argparse.Namespace:
     parser.add_argument("-m", dest="max_count", type=int, default=None, help="Max matching lines per file")
     parser.add_argument("-a", "--text", dest="binary_as_text", action="store_true", help="Search binary files as text")
     parser.add_argument("--answer", action="store_true", help="Print a narrative answer before grep output")
+    parser.add_argument("-y", "--yes", action="store_true", help="Skip file count confirmation")
+    parser.add_argument(
+        "--stdin-files",
+        action="store_true",
+        help="Treat stdin as newline-delimited file paths",
+    )

     parser.add_argument("-g", "--glob", dest="globs", action="append", default=[], help="Include files matching glob (may repeat)")
     parser.add_argument("--type", dest="types", action="append", default=[], help="Include file types (py, js, md, etc.). May repeat")
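To show how the two new flags surface on the parsed namespace, here is a small self-contained sketch. The parser below is a hypothetical cut-down stand-in (the real `_parse_args` defines many more options), and `paths_from_stdin` is an illustrative name for the newline splitting that `main()` performs inline.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical minimal parser; only the two new flags plus query and paths.
    parser = argparse.ArgumentParser(prog="rlmgrep-sketch")
    parser.add_argument("query")
    parser.add_argument("paths", nargs="*")
    parser.add_argument("-y", "--yes", action="store_true")
    parser.add_argument("--stdin-files", action="store_true")
    return parser


def paths_from_stdin(raw: str) -> list[str]:
    """Split newline-delimited input into paths, trimming whitespace and dropping blank lines."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


args = build_parser().parse_args(["--stdin-files", "-y", "what does this token control?"])
print(args.stdin_files, args.yes, args.paths)            # True True []
print(paths_from_stdin("src/a.py\n\n  docs/b.md  \n"))   # ['src/a.py', 'docs/b.md']
```

argparse derives the attribute name `stdin_files` from `--stdin-files` by replacing the hyphen with an underscore, which is why `main()` reads `args.stdin_files`.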
@@ -318,22 +341,65 @@ def main(argv: list[str] | None = None) -> int:
     for w in md_warnings:
         _warn(w)

-
+    input_paths: list[str] | None = None
+    stdin_text: str | None = None
+    if args.paths:
+        input_paths = list(args.paths)
+    elif args.stdin_files:
+        if sys.stdin.isatty():
+            _warn("no input paths and stdin is empty")
+            return 2
+        raw = sys.stdin.read()
+        input_paths = [line.strip() for line in raw.splitlines() if line.strip()]
+        if not input_paths:
+            _warn("stdin contained no file paths")
+            return 2
+    else:
         if sys.stdin.isatty():
             _warn("no input paths and stdin is empty")
             return 2
-
+        stdin_text = sys.stdin.read()
+
+    if input_paths is None:
+        text = stdin_text or ""
         files = {
             "<stdin>": FileRecord(path="<stdin>", text=text, lines=text.split("\n"))
         }
         warnings: list[str] = []
     else:
-
-
+        warn_threshold = _parse_num(
+            _pick(None, config, "file_warn_threshold", 200), int
+        )
+        hard_max = _parse_num(_pick(None, config, "file_hard_max", 1000), int)
+        if warn_threshold is not None and warn_threshold <= 0:
+            warn_threshold = None
+        if hard_max is not None and hard_max <= 0:
+            hard_max = None
+
+        candidates = collect_candidates(
+            input_paths,
             cwd=cwd,
             recursive=args.recursive,
             include_globs=globs,
             type_exts=type_exts,
+        )
+        candidate_count = len(candidates)
+        if hard_max is not None and candidate_count > hard_max:
+            _warn(
+                f"{candidate_count} files to load (over {hard_max}); aborting"
+            )
+            return 2
+        if (
+            warn_threshold is not None
+            and candidate_count > warn_threshold
+            and not args.yes
+        ):
+            if not _confirm_over_limit(candidate_count, warn_threshold):
+                return 2
+
+        files, warnings = load_files(
+            candidates,
+            cwd=cwd,
             markitdown=markitdown,
             enable_images=md_enable_images,
             enable_audio=md_enable_audio,
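The file-count gate in this hunk can be read as a pure decision function: the hard maximum is checked first and is not bypassed by `--yes`, the warn threshold triggers a confirmation unless `--yes` is set, and a non-positive limit disables that check. The sketch below restates that logic under those readings; `Gate`, `normalize_limit`, and `file_count_gate` are hypothetical names, not functions from the package.

```python
from __future__ import annotations

from enum import Enum


class Gate(Enum):
    PROCEED = "proceed"
    CONFIRM = "confirm"
    ABORT = "abort"


def normalize_limit(value: int | None) -> int | None:
    """Treat a missing or non-positive limit as disabled, as the hunk does."""
    return value if value is not None and value > 0 else None


def file_count_gate(
    count: int,
    warn_threshold: int | None = 200,
    hard_max: int | None = 1000,
    assume_yes: bool = False,
) -> Gate:
    warn_threshold = normalize_limit(warn_threshold)
    hard_max = normalize_limit(hard_max)
    # The hard limit is evaluated before the --yes check, so --yes cannot override it.
    if hard_max is not None and count > hard_max:
        return Gate.ABORT
    if warn_threshold is not None and count > warn_threshold and not assume_yes:
        return Gate.CONFIRM
    return Gate.PROCEED


print(file_count_gate(250))                     # Gate.CONFIRM
print(file_count_gate(250, assume_yes=True))    # Gate.PROCEED
print(file_count_gate(1001, assume_yes=True))   # Gate.ABORT
```

Both comparisons are strict, so exactly 200 files proceed without a prompt and exactly 1000 files are still allowed.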
{rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/config.py
@@ -3,10 +3,7 @@ from __future__ import annotations
 from pathlib import Path
 from typing import Any

-try:
-    import tomllib as _tomllib  # type: ignore
-except Exception:  # pragma: no cover - fallback
-    import tomli as _tomllib  # type: ignore
+import tomllib


 DEFAULT_CONFIG_TEXT = "\n".join(
@@ -19,6 +16,8 @@ DEFAULT_CONFIG_TEXT = "\n".join(
         "max_tokens = 64000",
         "max_iterations = 10",
         "max_llm_calls = 20",
+        "file_warn_threshold = 200",
+        "file_hard_max = 1000",
         "# markitdown_enable_images = false",
         "# markitdown_image_llm_model = \"gpt-5-mini\"",
         "# markitdown_image_llm_provider = \"openai\"",
@@ -65,7 +64,7 @@ def load_config(path: Path | None = None) -> tuple[dict[str, Any], list[str]]:
         return {}, [f"config path is not a file: {config_path}"]

     try:
-        data =
+        data = tomllib.loads(config_path.read_text())
     except Exception as exc:  # pragma: no cover - defensive
         return {}, [f"failed to read config {config_path}: {exc}"]

{rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep/ingest.py
@@ -237,12 +237,34 @@ def _matches_globs(path: str, globs: list[str]) -> bool:
     return False


-def
+def collect_candidates(
     paths: Iterable[str],
     cwd: Path,
     recursive: bool = True,
     include_globs: list[str] | None = None,
     type_exts: set[str] | None = None,
+) -> list[Path]:
+    files = collect_files(paths, recursive=recursive)
+    candidates: list[Path] = []
+    for fp in files:
+        try:
+            key = fp.relative_to(cwd).as_posix()
+        except ValueError:
+            key = fp.as_posix()
+
+        if include_globs and not _matches_globs(key, include_globs):
+            continue
+
+        if type_exts and fp.suffix.lower() not in type_exts:
+            continue
+
+        candidates.append(fp)
+    return candidates
+
+
+def load_files(
+    candidates: Iterable[Path],
+    cwd: Path,
     markitdown: Any | None = None,
     enable_images: bool = False,
     enable_audio: bool = False,
@@ -254,20 +276,12 @@ def load_files(
     image_convert_count = 0
     audio_convert_count = 0

-
-    for fp in files:
+    for fp in candidates:
         try:
             key = fp.relative_to(cwd).as_posix()
         except ValueError:
             key = fp.as_posix()

-        if include_globs and not _matches_globs(key, include_globs):
-            continue
-
-        if type_exts:
-            if fp.suffix.lower() not in type_exts:
-                continue
-
         suffix = fp.suffix.lower()
         if markitdown is not None and not binary_as_text:
             if enable_images and suffix in IMAGE_EXTS:
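The ingest changes split the old single pass into two phases: `collect_candidates` filters paths without opening any file, so the caller can count the result before `load_files` does the expensive reading and conversion. The filtering phase can be sketched as below. `filter_candidates` is a hypothetical name, `fnmatch` is a stand-in assumption for the package's own `_matches_globs` (whose body is not shown), and `PurePosixPath` is used only so the example behaves the same on every platform.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def filter_candidates(files, cwd, include_globs=None, type_exts=None):
    """Cheap filtering pass: no file is opened, so the result can be counted first."""
    kept = []
    for fp in files:
        try:
            key = fp.relative_to(cwd).as_posix()
        except ValueError:  # fp is not located under cwd
            key = fp.as_posix()
        # fnmatch is an assumed substitute for the package's _matches_globs.
        if include_globs and not any(fnmatch(key, g) for g in include_globs):
            continue
        # Extensions are compared case-insensitively, as in the hunk.
        if type_exts and fp.suffix.lower() not in type_exts:
            continue
        kept.append(fp)
    return kept


cwd = PurePosixPath("/repo")
files = [
    PurePosixPath("/repo/src/app.py"),
    PurePosixPath("/repo/README.md"),
    PurePosixPath("/tmp/x.PY"),
]

print(len(filter_candidates(files, cwd, type_exts={".py"})))       # 2
print(len(filter_candidates(files, cwd, include_globs=["src/*"])))  # 1
```

Because the candidate list is built before any content is read, the file-count limits in `cli.py` can reject an oversized run without paying for ingestion.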
{rlmgrep-0.1.0 → rlmgrep-0.1.2}/rlmgrep.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: rlmgrep
-Version: 0.1.0
+Version: 0.1.2
 Summary: Grep-shaped CLI search powered by DSPy RLM
 Author: rlmgrep
 License: MIT
@@ -17,7 +17,7 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart

 ```sh
-uv tool install --python 3.11
+uv tool install --python 3.11 rlmgrep
 # or from GitHub:
 # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git

@@ -71,6 +71,8 @@ Common options:
 - `--type T` include file types (repeatable, comma-separated)
 - `--no-recursive` do not recurse directories
 - `-a`, `--text` treat binary files as text
+- `-y`, `--yes` skip file count confirmation
+- `--stdin-files` treat stdin as newline-delimited file paths
 - `--model`, `--sub-model` override model names
 - `--api-key`, `--api-base`, `--model-type` override provider settings
 - `--max-iterations`, `--max-llm-calls` cap RLM search effort
@@ -90,6 +92,9 @@ rlmgrep "error handling" -g "**/*.py" -g "**/*.md" .

 # Read from stdin (only when no paths are provided)
 cat README.md | rlmgrep "install"
+
+# Use rg/grep to find candidate files, then rlmgrep over that list
+rg -l "token" . | rlmgrep --stdin-files --answer "what does this token control?"
 ```

 ## Input selection
@@ -99,6 +104,7 @@ cat README.md | rlmgrep "install"
 - `-g/--glob` matches path globs against normalized paths (forward slashes).
 - Paths are printed relative to the current working directory when possible.
 - If no paths are provided, rlmgrep reads from stdin and uses the synthetic path `<stdin>`; if stdin is empty, it exits with code 2.
+- rlmgrep asks for confirmation when more than 200 files would be loaded (use `-y/--yes` to skip), and aborts when more than 1000 files would be loaded.

 ## Output contract (stable for agents)

@@ -116,6 +122,18 @@ cat README.md | rlmgrep "install"

 Agent tip: use `-n -H` and no context for parse-friendly output, then key off exit codes.

+## Regex-style queries (best effort)
+
+rlmgrep can interpret traditional regex-style patterns inside a natural-language prompt. The RLM may use Python (including `re`) in its internal REPL to approximate regex logic, but it is **not guaranteed** to behave exactly like `grep`/`rg`.
+
+Example (best-effort regex semantics + extra context):
+
+```sh
+rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+```
+
+If you need strict, deterministic regex behavior, use `rg`/`grep`.
+
 ## Configuration

 rlmgrep creates a default config automatically if missing. The config path is:
@@ -133,6 +151,8 @@ temperature = 1.0
 max_tokens = 64000
 max_iterations = 10
 max_llm_calls = 20
+file_warn_threshold = 200
+file_hard_max = 1000
 # markitdown_enable_images = false
 # markitdown_image_llm_model = "gpt-5-mini"
 # markitdown_image_llm_provider = "openai"
@@ -168,10 +188,8 @@ If more than one provider key is set and the model does not make the provider ob

 - Prefer narrow corpora (globs/types) to reduce token usage.
 - Use `--max-llm-calls` to cap costs; combine with small `--max-iterations` for safety.
-- Always read stderr for warnings (skipped files, config issues, ambiguous API keys).
 - For reproducible parsing, use `-n -H` and avoid context (`-C/-A/-B`).
-
-
+
 ## Development

 - Install locally: `pip install -e .` or `uv tool install .`