rlmgrep 0.1.7__tar.gz → 0.1.8__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: rlmgrep
- Version: 0.1.7
+ Version: 0.1.8
  Summary: Grep-shaped CLI search powered by DSPy RLM
  Author: rlmgrep
  License: MIT
@@ -17,9 +17,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
  ## Quickstart
 
  ```sh
- uv tool install --python 3.11 rlmgrep
+ uv tool install rlmgrep
  # or from GitHub:
- # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+ # uv tool install git+https://github.com/halfprice06/rlmgrep.git
 
  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
  rlmgrep "where are API keys read" rlmgrep/
@@ -31,6 +31,20 @@ rlmgrep "where are API keys read" rlmgrep/
  - Deno runtime (DSPy RLM uses a Deno-based interpreter)
  - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
 
+ ## Non-text Files (PDF + Office + Media)
+
+ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+
+ How it works:
+ - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+ - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+ - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+ - **Audio** transcription is supported through OpenAI when enabled.
+
+ Sidecar caching:
+ - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+ - Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
+
  ## Install Deno
 
  DSPy requires the Deno runtime. Install it with the official scripts:
@@ -174,13 +188,6 @@ CLI flags override config values. Model keys are resolved as:
 
  If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
 
- ## Non-text files (PDF, images, audio)
-
- - PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
- - Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
- - Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
- - Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
-
  ## Skill (Anthropic-style)
 
  A ready-to-copy skill lives in:
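
The PDF bullet in the new README section describes three behaviors: a `===== Page N =====` marker per page, line numbers that index the extracted text, and a `page=N` suffix on output lines. The sketch below illustrates that scheme under stated assumptions; it is not rlmgrep's actual code. `PdfLine`, `flatten_pages`, and `format_match` are hypothetical names, the `path:line:text` output layout is invented for illustration, and the page texts are assumed to have been extracted already (for example with pypdf's `PdfReader(path).pages[i].extract_text()`).

```python
from typing import Iterable, Iterator, NamedTuple


class PdfLine(NamedTuple):
    line_no: int  # 1-based index into the extracted text, not a PDF coordinate
    page: int     # 1-based page number
    text: str


def flatten_pages(page_texts: Iterable[str]) -> Iterator[PdfLine]:
    """Yield one marker line per page, followed by that page's text lines."""
    line_no = 0
    for page, text in enumerate(page_texts, start=1):
        line_no += 1
        yield PdfLine(line_no, page, f"===== Page {page} =====")
        for raw in text.splitlines():
            line_no += 1
            yield PdfLine(line_no, page, raw)


def format_match(path: str, line: PdfLine) -> str:
    # Hypothetical layout: only the page=N suffix is taken from the README.
    return f"{path}:{line.line_no}:{line.text}\tpage={line.page}"
```

Because the marker lines are counted too, a reported line number identifies a position in the flattened text stream rather than a position on the PDF page, which is consistent with the README's note about PDF coordinates.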
@@ -5,9 +5,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
  ## Quickstart
 
  ```sh
- uv tool install --python 3.11 rlmgrep
+ uv tool install rlmgrep
  # or from GitHub:
- # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+ # uv tool install git+https://github.com/halfprice06/rlmgrep.git
 
  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
  rlmgrep "where are API keys read" rlmgrep/
@@ -19,6 +19,20 @@ rlmgrep "where are API keys read" rlmgrep/
  - Deno runtime (DSPy RLM uses a Deno-based interpreter)
  - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
 
+ ## Non-text Files (PDF + Office + Media)
+
+ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+
+ How it works:
+ - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+ - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+ - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+ - **Audio** transcription is supported through OpenAI when enabled.
+
+ Sidecar caching:
+ - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+ - Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
+
  ## Install Deno
 
  DSPy requires the Deno runtime. Install it with the official scripts:
@@ -162,13 +176,6 @@ CLI flags override config values. Model keys are resolved as:
 
  If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
 
- ## Non-text files (PDF, images, audio)
-
- - PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
- - Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
- - Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
- - Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
-
  ## Skill (Anthropic-style)
 
  A ready-to-copy skill lives in:
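
Both the old and new README text say that converted image/audio text is cached in a sidecar named `<original>.<ext>.md` next to the original file and reused on later runs. A minimal sketch of that caching pattern follows, assuming a caller-supplied `convert` callable that stands in for the MarkItDown conversion. `sidecar_path` and `cached_convert` are hypothetical helper names, not functions from the package.

```python
from pathlib import Path
from typing import Callable


def sidecar_path(original: Path) -> Path:
    # photo.png -> photo.png.md, in the same directory as the original
    return original.with_name(original.name + ".md")


def cached_convert(original: Path, convert: Callable[[Path], str]) -> str:
    """Return cached text if a sidecar exists; otherwise convert and cache."""
    sidecar = sidecar_path(original)
    if sidecar.exists():
        return sidecar.read_text(encoding="utf-8")
    text = convert(original)  # assumed to be the expensive model call
    sidecar.write_text(text, encoding="utf-8")
    return text
```

Appending `.md` to the full file name, rather than swapping the extension, keeps `photo.png` and `photo.jpg` from sharing one sidecar. This sketch never invalidates a stale sidecar; whether rlmgrep does so is not stated in the diff.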
@@ -1,6 +1,6 @@
  [project]
  name = "rlmgrep"
- version = "0.1.7"
+ version = "0.1.8"
  description = "Grep-shaped CLI search powered by DSPy RLM"
  readme = "README.md"
  requires-python = ">=3.11"
@@ -1,2 +1,2 @@
  __all__ = ["__version__"]
- __version__ = "0.1.7"
+ __version__ = "0.1.8"
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: rlmgrep
- Version: 0.1.7
+ Version: 0.1.8
  Summary: Grep-shaped CLI search powered by DSPy RLM
  Author: rlmgrep
  License: MIT
@@ -17,9 +17,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
  ## Quickstart
 
  ```sh
- uv tool install --python 3.11 rlmgrep
+ uv tool install rlmgrep
  # or from GitHub:
- # uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+ # uv tool install git+https://github.com/halfprice06/rlmgrep.git
 
  export OPENAI_API_KEY=... # or set keys in ~/.rlmgrep
  rlmgrep "where are API keys read" rlmgrep/
@@ -31,6 +31,20 @@ rlmgrep "where are API keys read" rlmgrep/
  - Deno runtime (DSPy RLM uses a Deno-based interpreter)
  - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
 
+ ## Non-text Files (PDF + Office + Media)
+
+ One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+
+ How it works:
+ - **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+ - **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+ - **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+ - **Audio** transcription is supported through OpenAI when enabled.
+
+ Sidecar caching:
+ - For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+ - Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
+
  ## Install Deno
 
  DSPy requires the Deno runtime. Install it with the official scripts:
@@ -174,13 +188,6 @@ CLI flags override config values. Model keys are resolved as:
 
  If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
 
- ## Non-text files (PDF, images, audio)
-
- - PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
- - Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
- - Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
- - Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
-
  ## Skill (Anthropic-style)
 
  A ready-to-copy skill lives in:
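
The `-a/--text` bullet mentions reading binary files as "UTF-8 with replacement". In Python that phrase usually corresponds to decoding with the `replace` error handler, which swaps undecodable bytes for U+FFFD instead of raising. The helper below is a small illustration of that behavior; `read_as_text` is a hypothetical name, and the diff does not show how rlmgrep implements the flag.

```python
def read_as_text(data: bytes) -> str:
    # Invalid UTF-8 sequences become U+FFFD rather than raising UnicodeDecodeError.
    return data.decode("utf-8", errors="replace")
```

Decoding this way never fails, so any file can be searched, though output from truly binary content will be noisy.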
File without changes