rlmgrep 0.1.8__tar.gz → 0.1.17__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: rlmgrep
- Version: 0.1.8
+ Version: 0.1.17
  Summary: Grep-shaped CLI search powered by DSPy RLM
  Author: rlmgrep
  License: MIT
@@ -8,11 +8,12 @@ Requires-Python: >=3.11
  Description-Content-Type: text/markdown
  Requires-Dist: dspy>=3.1.1
  Requires-Dist: markitdown[all]>=0.1.4
+ Requires-Dist: pathspec>=0.12.1
  Requires-Dist: pypdf>=4.0.0

  # rlmgrep

- Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format.
+ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format. Use `--answer` to get a narrative response grounded in the selected files/directories.

  ## Quickstart

@@ -22,9 +23,20 @@ uv tool install rlmgrep
  # uv tool install git+https://github.com/halfprice06/rlmgrep.git

  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
- rlmgrep "where are API keys read" rlmgrep/
  ```

+ ```sh
+ rlmgrep --answer "What does this repo do and where are the entry points?" .
+ ```
+
+ ![Quickstart answer mode](docs/images/quickstart-answer.png)
+
+ ```sh
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ ```
+
+ ![Quickstart context mode](docs/images/quickstart-context.png)
+
  ## Requirements

  - Python 3.11+
@@ -38,8 +50,8 @@ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Off
  How it works:
  - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
  - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
- - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
- - **Audio** transcription is supported through OpenAI when enabled.
+ - **Images** can be described by a vision model and then searched through MarkItDown (OpenAI/Anthropic/Gemini), enable and configure in config.toml.
+ - **Audio** transcription is supported through OpenAI when enabled, configure in config.toml.

  Sidecar caching:
  - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
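A minimal sketch of the PDF page-marker scheme described under "How it works" in this hunk, assuming the per-page text has already been extracted (for example with pypdf's `PdfReader(path).pages` and `page.extract_text()`). The names `flatten_pages` and `format_hit`, and the exact layout of the `page=N` suffix, are illustrative assumptions rather than rlmgrep's actual code.

```python
from __future__ import annotations

PAGE_MARKER = "===== Page {n} ====="


def flatten_pages(page_texts: list[str]) -> tuple[list[str], dict[int, int]]:
    """Concatenate per-page text and record which page each line came from."""
    lines: list[str] = []
    line_to_page: dict[int, int] = {}
    for page_no, text in enumerate(page_texts, start=1):
        # The marker line itself counts as a line of the extracted text.
        for line in [PAGE_MARKER.format(n=page_no), *text.splitlines()]:
            lines.append(line)
            line_to_page[len(lines)] = page_no  # line numbers are 1-based
    return lines, line_to_page


def format_hit(line_no: int, lines: list[str], line_to_page: dict[int, int]) -> str:
    """Render one match with a hypothetical page suffix."""
    return f"{line_no}:\t{lines[line_no - 1]}\tpage={line_to_page[line_no]}"


pages = ["Intro\nScope of work", "Payment terms\nNet 30"]
lines, line_to_page = flatten_pages(pages)
print(format_hit(5, lines, line_to_page))
```

Because the markers are part of the flattened text, the reported line numbers index the extracted text rather than any position inside the PDF, which matches the caveat in the README.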
@@ -47,7 +59,7 @@ Sidecar caching:

  ## Install Deno

- DSPy requires the Deno runtime. Install it with the official scripts:
+ DSPy's default implementation of RLM requires the Deno runtime. Install it with the official scripts:

  macOS/Linux:

@@ -75,12 +87,15 @@ rlmgrep [options] "query" [paths...]

  Common options:

+ - `--answer` return a narrative answer before the grep output
  - `-C N` context lines before/after (grep-style)
  - `-A N` context lines after
  - `-B N` context lines before
  - `-m N` max matching lines per file
  - `-g GLOB` include files matching glob (repeatable, comma-separated)
  - `--type T` include file types (repeatable, comma-separated)
+ - `--hidden` include hidden files and directories
+ - `--no-ignore` do not respect `.gitignore`
  - `--no-recursive` do not recurse directories
  - `-a`, `--text` treat binary files as text
  - `-y`, `--yes` skip file count confirmation
@@ -95,7 +110,7 @@ Examples:

  ```sh
  # Natural-language query over a repo
- rlmgrep -n -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .

  # Restrict to Python files
  rlmgrep "Where do we parse JWTs and enforce expiration?" --type py .
@@ -113,6 +128,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  ## Input selection

  - Directories are searched recursively by default. Use `--no-recursive` to stop recursion.
+ - Hidden files and ignore files (`.gitignore`, `.ignore`, `.rgignore`) are respected by default. Use `--hidden` or `--no-ignore` to include them.
  - `--type` uses built-in type mappings (e.g., `py`, `js`, `md`); unknown values are treated as file extensions.
  - `-g/--glob` matches path globs against normalized paths (forward slashes).
  - Paths are printed relative to the current working directory when possible.
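The hidden-file rule in this list can be sketched as follows. The check mirrors the `_is_hidden_path` helper that appears in the ingest hunk later in this diff; the `visible` wrapper and the use of `PurePosixPath` (to keep the example independent of the filesystem) are assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import PurePosixPath as P


def is_hidden_path(path: P) -> bool:
    """A path is hidden if any component starts with a dot."""
    return any(part.startswith(".") for part in path.parts if part)


def visible(files: list[P], explicit: set[P], include_hidden: bool = False) -> list[P]:
    """Drop hidden paths unless --hidden is set or the file was named explicitly."""
    kept: list[P] = []
    for fp in files:
        if not include_hidden and fp not in explicit and is_hidden_path(fp):
            continue
        kept.append(fp)
    return kept


files = [P("src/app.py"), P(".github/ci.yml"), P("src/.env"), P(".env.local")]
explicit = {P(".env.local")}  # passed directly on the command line
print([p.as_posix() for p in visible(files, explicit)])
```

Since the check is purely lexical, a `..` component also counts as hidden in this sketch, so whether paths are normalized before the check matters.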
@@ -125,7 +141,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  - Output uses rg-style headings by default:
  - A file header line like `./path/to/file`
  - Then `line:\ttext` for matches, `line-\ttext` for context lines
- - Line numbers are 1-based.
+ - Line numbers are always included and are 1-based.
  - When context ranges are disjoint, a `--` line separates groups.
  - Exit codes:
  - `0` = at least one match
@@ -140,7 +156,7 @@ rlmgrep can interpret traditional regex-style patterns inside a natural-language
  Example (best-effort regex semantics + extra context):

  ```sh
- rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+ rlmgrep "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
  ```

  If you need strict, deterministic regex behavior, use `rg`/`grep`.
@@ -1,6 +1,6 @@
  # rlmgrep

- Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format.
+ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format. Use `--answer` to get a narrative response grounded in the selected files/directories.

  ## Quickstart

@@ -10,9 +10,20 @@ uv tool install rlmgrep
  # uv tool install git+https://github.com/halfprice06/rlmgrep.git

  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
- rlmgrep "where are API keys read" rlmgrep/
  ```

+ ```sh
+ rlmgrep --answer "What does this repo do and where are the entry points?" .
+ ```
+
+ ![Quickstart answer mode](docs/images/quickstart-answer.png)
+
+ ```sh
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ ```
+
+ ![Quickstart context mode](docs/images/quickstart-context.png)
+
  ## Requirements

  - Python 3.11+
@@ -26,8 +37,8 @@ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Off
  How it works:
  - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
  - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
- - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
- - **Audio** transcription is supported through OpenAI when enabled.
+ - **Images** can be described by a vision model and then searched through MarkItDown (OpenAI/Anthropic/Gemini), enable and configure in config.toml.
+ - **Audio** transcription is supported through OpenAI when enabled, configure in config.toml.

  Sidecar caching:
  - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
@@ -35,7 +46,7 @@ Sidecar caching:

  ## Install Deno

- DSPy requires the Deno runtime. Install it with the official scripts:
+ DSPy's default implementation of RLM requires the Deno runtime. Install it with the official scripts:

  macOS/Linux:

@@ -63,12 +74,15 @@ rlmgrep [options] "query" [paths...]

  Common options:

+ - `--answer` return a narrative answer before the grep output
  - `-C N` context lines before/after (grep-style)
  - `-A N` context lines after
  - `-B N` context lines before
  - `-m N` max matching lines per file
  - `-g GLOB` include files matching glob (repeatable, comma-separated)
  - `--type T` include file types (repeatable, comma-separated)
+ - `--hidden` include hidden files and directories
+ - `--no-ignore` do not respect `.gitignore`
  - `--no-recursive` do not recurse directories
  - `-a`, `--text` treat binary files as text
  - `-y`, `--yes` skip file count confirmation
@@ -83,7 +97,7 @@ Examples:

  ```sh
  # Natural-language query over a repo
- rlmgrep -n -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .

  # Restrict to Python files
  rlmgrep "Where do we parse JWTs and enforce expiration?" --type py .
@@ -101,6 +115,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  ## Input selection

  - Directories are searched recursively by default. Use `--no-recursive` to stop recursion.
+ - Hidden files and ignore files (`.gitignore`, `.ignore`, `.rgignore`) are respected by default. Use `--hidden` or `--no-ignore` to include them.
  - `--type` uses built-in type mappings (e.g., `py`, `js`, `md`); unknown values are treated as file extensions.
  - `-g/--glob` matches path globs against normalized paths (forward slashes).
  - Paths are printed relative to the current working directory when possible.
@@ -113,7 +128,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  - Output uses rg-style headings by default:
  - A file header line like `./path/to/file`
  - Then `line:\ttext` for matches, `line-\ttext` for context lines
- - Line numbers are 1-based.
+ - Line numbers are always included and are 1-based.
  - When context ranges are disjoint, a `--` line separates groups.
  - Exit codes:
  - `0` = at least one match
@@ -128,7 +143,7 @@ rlmgrep can interpret traditional regex-style patterns inside a natural-language
  Example (best-effort regex semantics + extra context):

  ```sh
- rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+ rlmgrep "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
  ```

  If you need strict, deterministic regex behavior, use `rg`/`grep`.
@@ -1,6 +1,6 @@
  [project]
  name = "rlmgrep"
- version = "0.1.8"
+ version = "0.1.17"
  description = "Grep-shaped CLI search powered by DSPy RLM"
  readme = "README.md"
  requires-python = ">=3.11"
@@ -9,6 +9,7 @@ license = { text = "MIT" }
  dependencies = [
      "dspy>=3.1.1",
      "markitdown[all]>=0.1.4",
+     "pathspec>=0.12.1",
      "pypdf>=4.0.0",
  ]

@@ -1,2 +1,2 @@
  __all__ = ["__version__"]
- __version__ = "0.1.8"
+ __version__ = "0.1.17"
@@ -3,13 +3,21 @@ from __future__ import annotations
  import argparse
  import os
  import sys
+ import shutil
+ import subprocess
  from pathlib import Path

  import dspy
  from . import __version__
  from .config import ensure_default_config, load_config
  from .file_map import build_file_map
- from .ingest import FileRecord, collect_candidates, load_files, resolve_type_exts
+ from .ingest import (
+     FileRecord,
+     build_ignore_spec,
+     collect_candidates,
+     load_files,
+     resolve_type_exts,
+ )
  from .rlm import Match, build_lm, run_rlm
  from .render import render_matches

@@ -72,16 +80,22 @@ def _parse_args(argv: list[str]) -> argparse.Namespace:
      parser.add_argument("pattern", nargs="?", help="Query string (interpreted by RLM)")
      parser.add_argument("paths", nargs="*", help="Files or directories")

-     parser.add_argument("-n", dest="line_numbers", action="store_true", help="Show line numbers (default)")
      parser.add_argument("-r", dest="recursive", action="store_true", help="Recursive (directories are searched recursively by default)")
      parser.add_argument("--no-recursive", dest="recursive", action="store_false", help="Do not recurse directories")
-     parser.set_defaults(recursive=True, line_numbers=True)
+     parser.set_defaults(recursive=True)

      parser.add_argument("-C", dest="context", type=int, default=0, help="Context lines before/after")
      parser.add_argument("-A", dest="after", type=int, default=None, help="Context lines after")
      parser.add_argument("-B", dest="before", type=int, default=None, help="Context lines before")
      parser.add_argument("-m", dest="max_count", type=int, default=None, help="Max matching lines per file")
      parser.add_argument("-a", "--text", dest="binary_as_text", action="store_true", help="Search binary files as text")
+     parser.add_argument("--hidden", action="store_true", help="Include hidden files and directories")
+     parser.add_argument(
+         "--no-ignore",
+         dest="no_ignore",
+         action="store_true",
+         help="Do not respect ignore files (.gitignore/.ignore/.rgignore)",
+     )
      parser.add_argument("--answer", action="store_true", help="Print a narrative answer before grep output")
      parser.add_argument("-y", "--yes", action="store_true", help="Skip file count confirmation")
      parser.add_argument(
@@ -140,6 +154,67 @@ def _pick(cli_value, config: dict, key: str, default=None):
      return default


+ def _find_git_root(start: Path) -> tuple[Path | None, Path | None]:
+     for p in [start, *start.parents]:
+         git_path = p / ".git"
+         if git_path.is_dir():
+             return p, git_path
+         if git_path.is_file():
+             try:
+                 raw = git_path.read_text(encoding="utf-8", errors="ignore").strip()
+             except Exception:
+                 raw = ""
+             if raw.startswith("gitdir:"):
+                 git_dir = raw.split(":", 1)[1].strip()
+                 git_dir_path = Path(git_dir)
+                 if not git_dir_path.is_absolute():
+                     git_dir_path = (p / git_dir_path).resolve()
+                 return p, git_dir_path
+             return p, None
+     return None, None
+
+
+ def _global_ignore_paths(cwd: Path | None = None) -> list[Path]:
+     paths: list[Path] = []
+     cwd = cwd or Path.cwd()
+     if shutil.which("git"):
+         try:
+             result = subprocess.run(
+                 ["git", "config", "--get", "--path", "core.excludesfile"],
+                 cwd=cwd,
+                 capture_output=True,
+                 text=True,
+                 check=False,
+             )
+             value = (result.stdout or "").strip()
+         except Exception:
+             value = ""
+         if value:
+             candidate = Path(value).expanduser()
+             if candidate.exists():
+                 paths.append(candidate)
+
+     xdg_config = os.getenv("XDG_CONFIG_HOME")
+     if xdg_config:
+         default_path = Path(xdg_config) / "git" / "ignore"
+     else:
+         default_path = Path.home() / ".config" / "git" / "ignore"
+     if default_path.exists():
+         paths.append(default_path)
+
+     legacy = Path.home() / ".gitignore_global"
+     if legacy.exists():
+         paths.append(legacy)
+
+     seen: set[Path] = set()
+     unique: list[Path] = []
+     for p in paths:
+         if p not in seen:
+             seen.add(p)
+             unique.append(p)
+     return unique
+
+
  def _env_value(name: str) -> str | None:
      val = os.getenv(name)
      if val is None:
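The `.git`-as-file branch of `_find_git_root` covers checkouts such as submodules and linked worktrees, where `.git` is a text file containing a `gitdir: <path>` pointer instead of a directory. The sketch below exercises that branch against a throwaway directory layout; `find_git_root` is a condensed copy of the logic in this hunk, and the `main`/`wt` layout is a hypothetical example, not something taken from the package.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def find_git_root(start: Path) -> tuple[Path | None, Path | None]:
    """Return (work tree root, git dir), walking upward from start."""
    for p in [start, *start.parents]:
        git_path = p / ".git"
        if git_path.is_dir():
            return p, git_path
        if git_path.is_file():
            raw = git_path.read_text(encoding="utf-8", errors="ignore").strip()
            if raw.startswith("gitdir:"):
                target = Path(raw.split(":", 1)[1].strip())
                if not target.is_absolute():
                    # Relative pointers are resolved against the work tree root.
                    target = (p / target).resolve()
                return p, target
            return p, None
    return None, None


def demo() -> tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp).resolve()
        real_git_dir = base / "main" / ".git" / "worktrees" / "wt"
        real_git_dir.mkdir(parents=True)
        worktree = base / "wt"
        (worktree / "src").mkdir(parents=True)
        (worktree / ".git").write_text(
            "gitdir: ../main/.git/worktrees/wt\n", encoding="utf-8"
        )
        root, git_dir = find_git_root(worktree / "src")
        return root.relative_to(base).as_posix(), git_dir.relative_to(base).as_posix()


print(demo())
```

Returning the resolved git directory separately from the work tree root is what lets the caller look for `info/exclude` in the right place even when `.git` is only a pointer file.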
@@ -425,12 +500,26 @@ def main(argv: list[str] | None = None) -> int:
      if hard_max is not None and hard_max <= 0:
          hard_max = None

+     ignore_spec = None
+     ignore_root = None
+     if not args.no_ignore:
+         git_root, git_dir = _find_git_root(cwd)
+         ignore_root = git_root or cwd
+         extra_ignores: list[Path] = []
+         if git_dir is not None:
+             extra_ignores.append(git_dir / "info" / "exclude")
+         extra_ignores.extend(_global_ignore_paths(ignore_root))
+         ignore_spec = build_ignore_spec(ignore_root, extra_paths=extra_ignores)
+
      candidates = collect_candidates(
          input_paths,
          cwd=cwd,
          recursive=args.recursive,
          include_globs=globs,
          type_exts=type_exts,
+         include_hidden=args.hidden,
+         ignore_spec=ignore_spec,
+         ignore_root=ignore_root,
      )
      candidate_count = len(candidates)
      if hard_max is not None and candidate_count > hard_max:
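`build_ignore_spec`, called in this hunk and defined in the ingest hunk later in this diff, compiles every ignore file under the root into a single pattern list. For that to work, patterns from nested ignore files have to be re-scoped to their own directory. The sketch below isolates that per-line rewrite so it can be checked without `pathspec` installed; `rescope_pattern` is a hypothetical name, and the function is a condensed restatement of the logic in the diff, not a separate API.

```python
from __future__ import annotations


def rescope_pattern(rel_dir: str, raw: str) -> str | None:
    """Rewrite one ignore-file line so it can be matched from the root.

    rel_dir is the ignore file's directory relative to the root ("" for the
    root itself). Returns None for blank lines and comments.
    """
    line = raw
    if not line:
        return None
    escaped = line.startswith("\\#") or line.startswith("\\!")
    if escaped:
        line = line[1:]  # drop the backslash, keep the literal # or !
    elif line.startswith("#"):
        return None  # comment

    negated = not escaped and line.startswith("!")
    if negated:
        line = line[1:]
    anchored = line.startswith("/")
    if anchored:
        line = line[1:]
    if not line:
        return None

    if rel_dir:
        if anchored or "/" in line:
            line = f"{rel_dir}/{line}"  # relative to the ignore file's directory
        else:
            line = f"{rel_dir}/**/{line}"  # any depth below that directory
    elif anchored:
        line = f"/{line}"
    return "!" + line if negated else line


for raw in ["*.log", "/build", "docs/tmp", "!keep.log", "# note", "build/"]:
    print(repr(raw), "->", repr(rescope_pattern("pkg", raw)))
```

In the diff, the resulting list is compiled with `pathspec.PathSpec.from_lines("gitwildmatch", patterns)`. One consequence of the rule as written: a nested pattern whose only slash is a trailing one, such as `build/`, contains a `/` and is therefore rewritten to `pkg/build/`. gitignore itself treats a trailing-only slash as unanchored, so results may differ from git for such patterns.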
@@ -565,7 +654,6 @@ def main(argv: list[str] | None = None) -> int:
      output_lines = render_matches(
          files=files,
          matches=verified,
-         show_line_numbers=args.line_numbers,
          before=before,
          after=after,
          use_color=use_color,
@@ -2,8 +2,11 @@ from __future__ import annotations

  from dataclasses import dataclass
  from fnmatch import fnmatch
+ import os
  from pathlib import Path, PurePosixPath
- from typing import Iterable, Any, Callable
+ from typing import Any, Callable, Iterable
+
+ import pathspec

  from pypdf import PdfReader

@@ -161,6 +164,97 @@ def collect_files(paths: Iterable[str], recursive: bool = True) -> list[Path]:
      return files


+ IGNORE_FILENAMES = {".gitignore", ".ignore", ".rgignore"}
+
+
+ def build_ignore_spec(
+     root: Path, extra_paths: Iterable[Path] | None = None
+ ) -> "pathspec.PathSpec | None":
+     root = root.resolve()
+     ignore_paths: list[Path] = []
+     extra_paths = list(extra_paths or [])
+
+     for dirpath, dirnames, filenames in os.walk(root):
+         if ".git" in dirnames:
+             dirnames.remove(".git")
+         for name in filenames:
+             if name in IGNORE_FILENAMES:
+                 ignore_paths.append(Path(dirpath) / name)
+
+     for extra in extra_paths:
+         if extra.exists():
+             ignore_paths.append(extra)
+
+     if not ignore_paths:
+         return None
+
+     def _sort_key(p: Path) -> tuple[int, str]:
+         try:
+             rel = p.parent.relative_to(root)
+             depth = len(rel.parts)
+             return depth, rel.as_posix()
+         except ValueError:
+             return 0, p.as_posix()
+
+     ignore_paths.sort(key=_sort_key)
+
+     patterns: list[str] = []
+     for gi in ignore_paths:
+         try:
+             rel_dir = gi.parent.relative_to(root).as_posix()
+         except ValueError:
+             rel_dir = ""
+         if rel_dir in {".", ""}:
+             rel_dir = ""
+         try:
+             raw_lines = gi.read_text(encoding="utf-8", errors="ignore").splitlines()
+         except Exception:
+             continue
+         for raw in raw_lines:
+             line = raw.rstrip("\n")
+             if not line:
+                 continue
+             escaped = False
+             if line.startswith("\\#") or line.startswith("\\!"):
+                 line = line[1:]
+                 escaped = True
+             if not escaped and line.startswith("#"):
+                 continue
+
+             negated = False
+             if not escaped and line.startswith("!"):
+                 negated = True
+                 line = line[1:]
+             if not line:
+                 continue
+
+             anchored = False
+             if line.startswith("/"):
+                 anchored = True
+                 line = line[1:]
+             if not line:
+                 continue
+
+             if rel_dir:
+                 if anchored:
+                     line = f"{rel_dir}/{line}"
+                 elif "/" in line:
+                     line = f"{rel_dir}/{line}"
+                 else:
+                     line = f"{rel_dir}/**/{line}"
+             else:
+                 if anchored:
+                     line = f"/{line}"
+
+             if negated:
+                 line = "!" + line
+             patterns.append(line)
+
+     if not patterns:
+         return None
+     return pathspec.PathSpec.from_lines("gitwildmatch", patterns)
+
+
  TYPE_EXTS = {
      "bash": {".bash"},
      "c": {".c", ".h"},
@@ -237,21 +331,46 @@ def _matches_globs(path: str, globs: list[str]) -> bool:
      return False


+ def _is_hidden_path(path: Path) -> bool:
+     return any(part.startswith(".") for part in path.parts if part)
+
+
  def collect_candidates(
      paths: Iterable[str],
      cwd: Path,
      recursive: bool = True,
      include_globs: list[str] | None = None,
      type_exts: set[str] | None = None,
+     include_hidden: bool = False,
+     ignore_spec: "pathspec.PathSpec | None" = None,
+     ignore_root: Path | None = None,
  ) -> list[Path]:
      files = collect_files(paths, recursive=recursive)
+     explicit_files: set[Path] = set()
+     for raw in paths:
+         p = Path(raw)
+         if p.exists() and p.is_file():
+             explicit_files.add(p.resolve())
      candidates: list[Path] = []
      for fp in files:
+         fp_resolved = fp.resolve()
+         is_explicit = fp_resolved in explicit_files
+         if not include_hidden and not is_explicit and _is_hidden_path(fp):
+             continue
+
          try:
              key = fp.relative_to(cwd).as_posix()
          except ValueError:
              key = fp.as_posix()

+         if ignore_spec is not None and ignore_root is not None and not is_explicit:
+             try:
+                 rel = fp.relative_to(ignore_root).as_posix()
+             except ValueError:
+                 rel = None
+             if rel and ignore_spec.match_file(rel):
+                 continue
+
          if include_globs and not _matches_globs(key, include_globs):
              continue

@@ -23,13 +23,10 @@ def _format_line(
      line_no: int,
      text: str,
      is_match: bool,
-     show_line_numbers: bool,
      use_color: bool,
      heading: bool,
  ) -> str:
      delim = ":" if is_match else "-"
-     if not show_line_numbers:
-         return text
      prefix = _colorize(str(line_no), COLOR_LINE_NO, use_color)
      sep = "\t" if heading else ""
      return f"{prefix}{delim}{sep}{text}"
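With the `show_line_numbers` switch gone, every rendered line carries its number, using `:` for matches and `-` for context. The sketch below combines that formatting with the `--` group separator described in the README. `merge_ranges` and `render_file` are hypothetical stand-ins (the body of the real `_merge_ranges` is not part of this diff), and treating adjacent ranges as one group is an assumption borrowed from grep's behavior.

```python
from __future__ import annotations


def merge_ranges(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping or adjacent (start, end) line ranges."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def render_file(
    lines: list[str], match_lines: list[int], before: int = 0, after: int = 0
) -> list[str]:
    """Render matches with context in heading mode (tab after the delimiter)."""
    matches = set(match_lines)
    ranges = merge_ranges(
        [(max(1, m - before), min(len(lines), m + after)) for m in matches]
    )
    out: list[str] = []
    for i, (start, end) in enumerate(ranges):
        if i:
            out.append("--")  # separates disjoint context groups
        for n in range(start, end + 1):
            delim = ":" if n in matches else "-"
            out.append(f"{n}{delim}\t{lines[n - 1]}")
    return out


text = ["a", "b", "c", "d", "e", "f", "g", "h"]
print("\n".join(render_file(text, [2, 7], before=1, after=1)))
```

Matches on lines 2 and 7 with one line of context produce two groups split by `--`; matches on lines 2 and 4 would merge into a single group with no separator.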
@@ -52,7 +49,6 @@ def _merge_ranges(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
  def render_matches(
      files: dict[str, FileRecord],
      matches: dict[str, list[int]],
-     show_line_numbers: bool,
      before: int,
      after: int,
      use_color: bool = False,
@@ -86,7 +82,6 @@ def render_matches(
  line_no,
  text,
  True,
- show_line_numbers,
  use_color,
  heading,
  )
@@ -111,7 +106,6 @@ def render_matches(
  line_no,
  text,
  is_match,
- show_line_numbers,
  use_color,
  heading,
  )
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: rlmgrep
- Version: 0.1.8
+ Version: 0.1.17
  Summary: Grep-shaped CLI search powered by DSPy RLM
  Author: rlmgrep
  License: MIT
@@ -8,11 +8,12 @@ Requires-Python: >=3.11
  Description-Content-Type: text/markdown
  Requires-Dist: dspy>=3.1.1
  Requires-Dist: markitdown[all]>=0.1.4
+ Requires-Dist: pathspec>=0.12.1
  Requires-Dist: pypdf>=4.0.0

  # rlmgrep

- Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format.
+ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format. Use `--answer` to get a narrative response grounded in the selected files/directories.

  ## Quickstart

@@ -22,9 +23,20 @@ uv tool install rlmgrep
  # uv tool install git+https://github.com/halfprice06/rlmgrep.git

  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
- rlmgrep "where are API keys read" rlmgrep/
  ```

+ ```sh
+ rlmgrep --answer "What does this repo do and where are the entry points?" .
+ ```
+
+ ![Quickstart answer mode](docs/images/quickstart-answer.png)
+
+ ```sh
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ ```
+
+ ![Quickstart context mode](docs/images/quickstart-context.png)
+
  ## Requirements

  - Python 3.11+
@@ -38,8 +50,8 @@ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Off
  How it works:
  - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
  - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
- - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
- - **Audio** transcription is supported through OpenAI when enabled.
+ - **Images** can be described by a vision model and then searched through MarkItDown (OpenAI/Anthropic/Gemini), enable and configure in config.toml.
+ - **Audio** transcription is supported through OpenAI when enabled, configure in config.toml.

  Sidecar caching:
  - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
@@ -47,7 +59,7 @@ Sidecar caching:

  ## Install Deno

- DSPy requires the Deno runtime. Install it with the official scripts:
+ DSPy's default implementation of RLM requires the Deno runtime. Install it with the official scripts:

  macOS/Linux:

@@ -75,12 +87,15 @@ rlmgrep [options] "query" [paths...]

  Common options:

+ - `--answer` return a narrative answer before the grep output
  - `-C N` context lines before/after (grep-style)
  - `-A N` context lines after
  - `-B N` context lines before
  - `-m N` max matching lines per file
  - `-g GLOB` include files matching glob (repeatable, comma-separated)
  - `--type T` include file types (repeatable, comma-separated)
+ - `--hidden` include hidden files and directories
+ - `--no-ignore` do not respect `.gitignore`
  - `--no-recursive` do not recurse directories
  - `-a`, `--text` treat binary files as text
  - `-y`, `--yes` skip file count confirmation
@@ -95,7 +110,7 @@ Examples:

  ```sh
  # Natural-language query over a repo
- rlmgrep -n -C 2 "Where is retry/backoff configured and what are the defaults?" .
+ rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .

  # Restrict to Python files
  rlmgrep "Where do we parse JWTs and enforce expiration?" --type py .
@@ -113,6 +128,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  ## Input selection

  - Directories are searched recursively by default. Use `--no-recursive` to stop recursion.
+ - Hidden files and ignore files (`.gitignore`, `.ignore`, `.rgignore`) are respected by default. Use `--hidden` or `--no-ignore` to include them.
  - `--type` uses built-in type mappings (e.g., `py`, `js`, `md`); unknown values are treated as file extensions.
  - `-g/--glob` matches path globs against normalized paths (forward slashes).
  - Paths are printed relative to the current working directory when possible.
@@ -125,7 +141,7 @@ rg -l "token" . | rlmgrep --files-from-stdin --answer "What does this token cont
  - Output uses rg-style headings by default:
  - A file header line like `./path/to/file`
  - Then `line:\ttext` for matches, `line-\ttext` for context lines
- - Line numbers are 1-based.
+ - Line numbers are always included and are 1-based.
  - When context ranges are disjoint, a `--` line separates groups.
  - Exit codes:
  - `0` = at least one match
@@ -140,7 +156,7 @@ rlmgrep can interpret traditional regex-style patterns inside a natural-language
  Example (best-effort regex semantics + extra context):

  ```sh
- rlmgrep -n "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
+ rlmgrep "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .
  ```

  If you need strict, deterministic regex behavior, use `rg`/`grep`.
@@ -1,3 +1,4 @@
  dspy>=3.1.1
  markitdown[all]>=0.1.4
+ pathspec>=0.12.1
  pypdf>=4.0.0
File without changes (6 files)