PyPI - rlmgrep - Versions diffs - 0.1.7__tar.gz → 0.1.8__tar.gz - Mend

rlmgrep 0.1.7.tar.gz → 0.1.8.tar.gz


Files changed (19)
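The README hunks in this diff describe two conventions for non-text files: PDF text is flattened with a `===== Page N =====` marker line before each page, and converted image/audio text is cached in a sidecar named `<original>.<ext>.md`. The following is a minimal sketch of those two conventions, not rlmgrep's actual code. The helper names `flatten_pages` and `sidecar_path` are hypothetical, the per-page strings stand in for what `pypdf`'s `page.extract_text()` would return, and counting marker lines in the line numbering is an assumption.

```python
from pathlib import Path

PAGE_MARKER = "===== Page {n} ====="


def flatten_pages(pages: list[str]) -> list[tuple[int, int, str]]:
    """Flatten per-page text into (line_no, page_no, text) records.

    A marker line precedes each page. Line numbers count lines of the
    flattened text, markers included (an assumption of this sketch).
    """
    records: list[tuple[int, int, str]] = []
    line_no = 0
    for page_no, text in enumerate(pages, start=1):
        for line in [PAGE_MARKER.format(n=page_no), *text.splitlines()]:
            line_no += 1
            records.append((line_no, page_no, line))
    return records


def sidecar_path(original: Path) -> Path:
    """Return the cache path: the full file name plus a trailing ".md"."""
    return original.with_name(original.name + ".md")


# Stand-in for text extracted from a two-page PDF.
pages = ["Intro\nAPI keys are read from env", "Appendix"]
records = flatten_pages(pages)

# Illustrative print format only; rlmgrep's real output layout may differ.
for line_no, page_no, text in records:
    print(f"{line_no}:{text}\tpage={page_no}")

print(sidecar_path(Path("scans/photo.png")))
```

Because the line numbers index the flattened text, a match on page 2 is reported relative to everything extracted before it, which is why the README notes that line numbers are not PDF coordinates.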

{rlmgrep-0.1.7 → rlmgrep-0.1.8}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: rlmgrep
-Version: 0.1.7
+Version: 0.1.8
 Summary: Grep-shaped CLI search powered by DSPy RLM
 Author: rlmgrep
 License: MIT
@@ -17,9 +17,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart
 ```sh
-uv tool install --python 3.11 rlmgrep
+uv tool install rlmgrep
 # or from GitHub:
-# uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+# uv tool install git+https://github.com/halfprice06/rlmgrep.git
 export OPENAI_API_KEY=...  # or set keys in ~/.rlmgrep
 rlmgrep "where are API keys read" rlmgrep/
@@ -31,6 +31,20 @@ rlmgrep "where are API keys read" rlmgrep/
 - Deno runtime (DSPy RLM uses a Deno-based interpreter)
 - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
+## Non-text Files (PDF + Office + Media)
+One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+How it works:
+- **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+- **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+- **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+- **Audio** transcription is supported through OpenAI when enabled.
+Sidecar caching:
+- For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+- Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
 ## Install Deno
 DSPy requires the Deno runtime. Install it with the official scripts:
@@ -174,13 +188,6 @@ CLI flags override config values. Model keys are resolved as:
 If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
-## Non-text files (PDF, images, audio)
-- PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
-- Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
-- Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
-- Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
 ## Skill (Anthropic-style)
 A ready-to-copy skill lives in:

{rlmgrep-0.1.7 → rlmgrep-0.1.8}/README.md RENAMED

@@ -5,9 +5,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart
 ```sh
-uv tool install --python 3.11 rlmgrep
+uv tool install rlmgrep
 # or from GitHub:
-# uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+# uv tool install git+https://github.com/halfprice06/rlmgrep.git
 export OPENAI_API_KEY=...  # or set keys in ~/.rlmgrep
 rlmgrep "where are API keys read" rlmgrep/
@@ -19,6 +19,20 @@ rlmgrep "where are API keys read" rlmgrep/
 - Deno runtime (DSPy RLM uses a Deno-based interpreter)
 - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
+## Non-text Files (PDF + Office + Media)
+One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+How it works:
+- **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+- **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+- **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+- **Audio** transcription is supported through OpenAI when enabled.
+Sidecar caching:
+- For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+- Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
 ## Install Deno
 DSPy requires the Deno runtime. Install it with the official scripts:
@@ -162,13 +176,6 @@ CLI flags override config values. Model keys are resolved as:
 If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
-## Non-text files (PDF, images, audio)
-- PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
-- Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
-- Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
-- Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
 ## Skill (Anthropic-style)
 A ready-to-copy skill lives in:

{rlmgrep-0.1.7 → rlmgrep-0.1.8}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "rlmgrep"
-version = "0.1.7"
+version = "0.1.8"
 description = "Grep-shaped CLI search powered by DSPy RLM"
 readme = "README.md"
 requires-python = ">=3.11"

{rlmgrep-0.1.7 → rlmgrep-0.1.8}/rlmgrep/__init__.py RENAMED

@@ -1,2 +1,2 @@
 __all__ = ["__version__"]
-__version__ = "0.1.7"
+__version__ = "0.1.8"

{rlmgrep-0.1.7 → rlmgrep-0.1.8}/rlmgrep.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: rlmgrep
-Version: 0.1.7
+Version: 0.1.8
 Summary: Grep-shaped CLI search powered by DSPy RLM
 Author: rlmgrep
 License: MIT
@@ -17,9 +17,9 @@ Grep-shaped search powered by DSPy RLM. It accepts a natural-language query, sca
 ## Quickstart
 ```sh
-uv tool install --python 3.11 rlmgrep
+uv tool install rlmgrep
 # or from GitHub:
-# uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git
+# uv tool install git+https://github.com/halfprice06/rlmgrep.git
 export OPENAI_API_KEY=...  # or set keys in ~/.rlmgrep
 rlmgrep "where are API keys read" rlmgrep/
@@ -31,6 +31,20 @@ rlmgrep "where are API keys read" rlmgrep/
 - Deno runtime (DSPy RLM uses a Deno-based interpreter)
 - API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)
+## Non-text Files (PDF + Office + Media)
+One of rlmgrep’s most useful features is that it can “grep” **PDFs and Office files** by converting them into text before the RLM search runs.
+How it works:
+- **PDFs** are parsed with `pypdf`. Each page gets a marker line like `===== Page N =====`, and output lines include a `page=N` suffix. Line numbers refer to the extracted text (not PDF coordinates).
+- **Office & binary docs** (`.docx`, `.pptx`, `.xlsx`, `.html`, `.zip`, etc.) are converted to Markdown via **MarkItDown**. This happens during ingestion, so rlmgrep can search them like any other text file.
+- **Images** can be described by a vision model through MarkItDown (OpenAI/Anthropic/Gemini).
+- **Audio** transcription is supported through OpenAI when enabled.
+Sidecar caching:
+- For images/audio, converted text is cached next to the original file as `<original>.<ext>.md` and reused on later runs.
+- Use `-a/--text` if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
 ## Install Deno
 DSPy requires the Deno runtime. Install it with the official scripts:
@@ -174,13 +188,6 @@ CLI flags override config values. Model keys are resolved as:
 If more than one provider key is set and the model does not make the provider obvious, rlmgrep emits a warning and requires an explicit `--api-key`.
-## Non-text files (PDF, images, audio)
-- PDF files are parsed with `pypdf`. Each page gets a marker line `===== Page N =====`, and output lines include a `page=N` suffix.
-- Images and audio are converted via `markitdown` when enabled in config. Image conversion supports `openai`, `anthropic`, and `gemini` providers; audio conversion currently supports `openai` only.
-- Converted image/audio text is cached in sidecar files named `<original>.<ext>.md` next to the original file and reused on subsequent runs.
-- Use `-a/--text` to force binary files to be read as text (UTF-8 with replacement).
 ## Skill (Anthropic-style)
 A ready-to-copy skill lives in: