paperpipe 0.1.0__tar.gz

@@ -0,0 +1,23 @@
1
+ __pycache__/
2
+ *.py[cod]
3
+ *$py.class
4
+ .pytest_cache/
5
+ .ruff_cache/
6
+ .ruff_cache_tmp/
7
+ .claude/
8
+ .venv/
9
+ .uv-cache/
10
+ env/
11
+ venv/
12
+ build/
13
+ dist/
14
+ *.egg-info/
15
+ .coverage
16
+ .coverage.*
17
+ coverage.xml
18
+ htmlcov/
19
+ .python-version
20
+ uv.lock
21
+ CLAUDE.md
22
+ GEMINI.md
23
+ QWEN.md
@@ -0,0 +1,84 @@
1
+ # Agent Integration Snippet (PaperPipe)
2
+
3
+ Add this section to your project's agent instructions file:
4
+ - Preferred: `AGENTS.md`
5
+ - Also works: `CLAUDE.md`, `GEMINI.md`, or your agent’s equivalent
6
+
7
+ ---
8
+
9
+ ## Paper References (PaperPipe)
10
+
11
+ This project implements methods from scientific papers. Papers are managed via `papi` (paperpipe).
12
+
13
+ ### Paper Database Location
14
+
15
+ Default database root is `~/.paperpipe/`, but it may be overridden (e.g. via `PAPER_DB_PATH`).
16
+ Prefer discovering the active location with:
17
+
18
+ ```bash
19
+ papi path
20
+ ```
21
+
22
+ Per-paper files live at: `<paper_db>/papers/{paper}/`
23
+
24
+ - `meta.json` — metadata + tags
25
+ - `summary.md` — coding-context overview
26
+ - `equations.md` — key equations + explanations (best for implementation verification)
27
+ - `source.tex` — full LaTeX (if available)
28
+ - `paper.pdf` — PDF (used by PaperQA2)
29
+
30
+ ### When to Use What
31
+
32
+ | Task | Best source |
33
+ |------|-------------|
34
+ | “Does my code match the paper?” | Read `{paper}/equations.md` (and/or `{paper}/source.tex`) |
35
+ | “What’s the high-level approach?” | Read `{paper}/summary.md` |
36
+ | “Find the exact formulation / definitions” | Read `{paper}/source.tex` |
37
+ | “Which papers discuss X?” | Run `papi search "X"` (fast) or `papi ask "X"` (PaperQA2) |
38
+ | “Compare methods across papers” | Load multiple `{paper}/equations.md` files |
39
+ | “Do the generated summaries/equations look sane?” | Run `papi audit` (and optionally regenerate flagged papers) |
40
+
41
+ ### Useful Commands
42
+
43
+ ```bash
44
+ # List papers and tags
45
+ papi list
46
+ papi tags
47
+
48
+ # Search by title, tag, or content
49
+ papi search "sdf loss"
50
+
51
+ # Export equations/summaries into the repo for a coding session
52
+ papi export neuralangelo neus --level equations --to ./paper-context/
53
+
54
+ # Or print directly to stdout for pasting into a terminal agent session
55
+ papi show neuralangelo neus --level eq
56
+
57
+ # Add papers (arXiv) / regenerate; use --no-llm to avoid LLM calls
58
+ papi add 2303.13476 # name auto-generated
59
+ papi add 2303.13476 --name neuralangelo # or explicit name
60
+ papi add 2303.13476 --update # refresh existing paper in-place
61
+ papi add 2303.13476 --duplicate # add a second copy (-2/-3 suffix)
62
+ papi regenerate neuralangelo --no-llm
63
+
64
+ # Audit generated content for obvious issues (and optionally regenerate flagged papers)
65
+ papi audit
66
+ papi audit --limit 5 --seed 0
67
+ papi audit --regenerate --no-llm -o summary,equations,tags
68
+ ```
69
+
70
+ ### LLM Configuration (Optional)
71
+
72
+ ```bash
73
+ export PAPERPIPE_LLM_MODEL="gemini/gemini-3-flash-preview" # any LiteLLM identifier
74
+ export PAPERPIPE_LLM_TEMPERATURE=0.3 # default: 0.3
75
+ ```
76
+
77
+ Without an LLM, paperpipe falls back to metadata + section headings + regex equation extraction.
78
+
79
+ ### Code Verification Workflow
80
+
81
+ 1. Identify the referenced paper(s) (comments, function names, README, etc.)
82
+ 2. Read `{paper}/equations.md` and compare symbol-by-symbol with the implementation
83
+ 3. If ambiguous, confirm definitions/assumptions in `{paper}/source.tex`
84
+ 4. If the question is broad or spans multiple papers, run `papi ask "..."` (requires PaperQA2)
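
Steps 2 and 3 of the workflow above could be scripted roughly as follows. This is a sketch, not part of paperpipe: `verification_files` is a hypothetical helper name, and it only assumes the per-paper layout described earlier (`<paper_db>/papers/{paper}/`). It prints the files an agent should read for each paper, `equations.md` first and `source.tex` second, and silently skips files that do not exist.

```bash
# Hypothetical helper (not shipped with paperpipe).
# Usage: verification_files <paper_db_root> <paper>...
verification_files() {
  local root="$1"
  shift
  local paper file
  for paper in "$@"; do
    # equations.md is the primary source; source.tex resolves ambiguities.
    for file in equations.md source.tex; do
      if [ -f "$root/papers/$paper/$file" ]; then
        printf '%s\n' "$root/papers/$paper/$file"
      fi
    done
  done
  return 0
}

# Example (requires papi on PATH):
#   verification_files "$(papi path)" neuralangelo neus
```

Passing the root explicitly keeps the function independent of how the database location was configured; `papi path` remains the authoritative way to find it.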
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Matthias Humt
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,459 @@
1
+ Metadata-Version: 2.4
2
+ Name: paperpipe
3
+ Version: 0.1.0
4
+ Summary: Unified paper database for coding agents + PaperQA2
5
+ Project-URL: Homepage, https://github.com/hummat/paperpipe
6
+ Project-URL: Documentation, https://github.com/hummat/paperpipe#readme
7
+ Project-URL: Repository, https://github.com/hummat/paperpipe
8
+ Author: Matthias Humt
9
+ License: MIT License
10
+
11
+ Copyright (c) 2025 Matthias Humt
12
+
13
+ Permission is hereby granted, free of charge, to any person obtaining a copy
14
+ of this software and associated documentation files (the "Software"), to deal
15
+ in the Software without restriction, including without limitation the rights
16
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
17
+ copies of the Software, and to permit persons to whom the Software is
18
+ furnished to do so, subject to the following conditions:
19
+
20
+ The above copyright notice and this permission notice shall be included in all
21
+ copies or substantial portions of the Software.
22
+
23
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
24
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
25
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
26
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
27
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
28
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
29
+ SOFTWARE.
30
+ License-File: LICENSE
31
+ Keywords: arxiv,coding-agent,llm,paperqa,papers,research
32
+ Classifier: Development Status :: 3 - Alpha
33
+ Classifier: Intended Audience :: Developers
34
+ Classifier: Intended Audience :: Science/Research
35
+ Classifier: License :: OSI Approved :: MIT License
36
+ Classifier: Programming Language :: Python :: 3
37
+ Classifier: Programming Language :: Python :: 3.10
38
+ Classifier: Programming Language :: Python :: 3.11
39
+ Classifier: Programming Language :: Python :: 3.12
40
+ Requires-Python: >=3.10
41
+ Requires-Dist: arxiv>=2.0.0
42
+ Requires-Dist: click>=8.0.0
43
+ Requires-Dist: requests>=2.28.0
44
+ Provides-Extra: all
45
+ Requires-Dist: litellm>=1.0.0; extra == 'all'
46
+ Requires-Dist: paper-qa>=5.0.0; (python_version >= '3.11') and extra == 'all'
47
+ Requires-Dist: paper-qa[pypdf-media]>=5.0.0; (python_version >= '3.11') and extra == 'all'
48
+ Provides-Extra: dev
49
+ Requires-Dist: build>=1.0.0; extra == 'dev'
50
+ Requires-Dist: pyright>=1.1.385; extra == 'dev'
51
+ Requires-Dist: pytest>=7.0.0; extra == 'dev'
52
+ Requires-Dist: ruff>=0.1.0; extra == 'dev'
53
+ Requires-Dist: twine>=5.0.0; extra == 'dev'
54
+ Provides-Extra: llm
55
+ Requires-Dist: litellm>=1.0.0; extra == 'llm'
56
+ Provides-Extra: paperqa
57
+ Requires-Dist: paper-qa>=5.0.0; (python_version >= '3.11') and extra == 'paperqa'
58
+ Provides-Extra: paperqa-media
59
+ Requires-Dist: paper-qa[pypdf-media]>=5.0.0; (python_version >= '3.11') and extra == 'paperqa-media'
60
+ Description-Content-Type: text/markdown
61
+
62
+ # paperpipe
63
+
64
+ A unified paper database for coding agents + [PaperQA2](https://github.com/Future-House/paper-qa).
65
+
66
+ **The problem:** You want AI coding assistants (Claude Code, Codex CLI, Gemini CLI) to reference scientific papers while implementing algorithms. But:
67
+ - PDFs are token-heavy and lose equation fidelity
68
+ - PaperQA2 is great for research but not optimized for code verification
69
+ - No simple way to ask "does my code match equation 7?"
70
+
71
+ **The solution:** A local database that stores:
72
+ - PDFs (for PaperQA2 RAG queries)
73
+ - LaTeX source (for exact equation comparison)
74
+ - Summaries optimized for coding context
75
+ - Extracted equations with explanations
76
+
77
+ ## Installation
78
+
79
+ ### With uv (recommended)
80
+
81
+ ```bash
82
+ # Basic installation
83
+ uv pip install paperpipe
84
+
85
+ # With LLM support (for better summaries/equations)
86
+ uv pip install 'paperpipe[llm]'
87
+
88
+ # With PaperQA2 integration (paper-qa is only installed on Python 3.11+)
89
+ uv pip install 'paperpipe[paperqa]'
90
+
91
+ # Everything
92
+ uv pip install 'paperpipe[all]'
93
+ ```
94
+
95
+ Or install from source:
96
+ ```bash
97
+ git clone https://github.com/hummat/paperpipe
98
+ cd paperpipe
99
+ uv pip install -e ".[all]"
100
+ ```
101
+
102
+ ### With pip
103
+
104
+ ```bash
105
+ # Basic installation
106
+ pip install paperpipe
107
+
108
+ # With LLM support (for better summaries/equations)
109
+ pip install 'paperpipe[llm]'
110
+
111
+ # With PaperQA2 integration (paper-qa is only installed on Python 3.11+)
112
+ pip install 'paperpipe[paperqa]'
113
+
114
+ # With PaperQA2 + multimodal PDF parsing (images/tables; installs Pillow)
115
+ pip install 'paperpipe[paperqa-media]'
116
+
117
+ # Everything
118
+ pip install 'paperpipe[all]'
119
+ ```
120
+
121
+ Or install from source:
122
+ ```bash
123
+ git clone https://github.com/hummat/paperpipe
124
+ cd paperpipe
125
+ pip install -e ".[all]"
126
+ ```
127
+
128
+ ## Development
129
+
130
+ ```bash
131
+ # Install app + dev tooling (ruff, pyright, pytest)
132
+ uv sync --group dev
133
+
134
+ uv run ruff check .
135
+ uv run pyright
136
+ uv run pytest -m "not integration"
137
+ ```
138
+
139
+ ## Quick Start
140
+
141
+ ```bash
142
+ # Add papers (names auto-generated from title; auto-tags from arXiv + LLM)
143
+ papi add 2303.13476 2106.10689 2112.03907
144
+
145
+ # Override auto-generated name with --name (single paper only):
146
+ papi add https://arxiv.org/abs/1706.03762 --name attention
147
+
148
+ # Re-adding the same arXiv ID is idempotent (skips). Use --update to refresh, or --duplicate for another copy:
149
+ papi add 1706.03762
150
+ papi add 1706.03762 --update --name attention
151
+ papi add 1706.03762 --duplicate
152
+
153
+ # List papers
154
+ papi list
155
+ papi list --tag sdf
156
+
157
+ # Search
158
+ papi search "surface reconstruction"
159
+
160
+ # Export for coding session
161
+ papi export neuralangelo neus --level equations --to ./paper-context/
162
+
163
+ # Query with PaperQA2 (if installed)
164
+ papi ask "What are the key differences between NeuS and Neuralangelo loss functions?"
165
+ ```
166
+
167
+ ## Database Structure
168
+
169
+ Default database root is `~/.paperpipe/` (override with `PAPER_DB_PATH`; see `papi path`).
170
+
171
+ ```
172
+ <paper_db>/
173
+ ├── index.json              # Quick lookup index
174
+ ├── papers/
175
+ │   ├── neuralangelo/
176
+ │   │   ├── meta.json       # Metadata + tags
177
+ │   │   ├── paper.pdf       # For PaperQA2
178
+ │   │   ├── source.tex      # Full LaTeX (if available)
179
+ │   │   ├── summary.md      # Coding-context summary
180
+ │   │   └── equations.md    # Key equations extracted
181
+ │   └── neus/
182
+ │       └── ...
183
+ ```
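
To inspect this layout from a script, something like the following may help. It is a minimal sketch under stated assumptions: `paper_db_root` and `list_paper_files` are hypothetical helper names, and the fallback to `~/.paperpipe` simply mirrors the documented default and `PAPER_DB_PATH` override rather than paperpipe's actual resolution logic. When `papi` is available, `papi path` is the more reliable source.

```bash
# Hypothetical helper: documented override first, documented default otherwise.
paper_db_root() {
  printf '%s\n' "${PAPER_DB_PATH:-$HOME/.paperpipe}"
}

# Hypothetical helper: print "<paper>: <files present>" for each paper directory.
list_paper_files() {
  local root="$1" dir file line
  for dir in "$root"/papers/*/; do
    # An unmatched glob stays literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    line="$(basename "$dir"):"
    for file in meta.json paper.pdf source.tex summary.md equations.md; do
      if [ -f "$dir$file" ]; then
        line="$line $file"
      fi
    done
    printf '%s\n' "$line"
  done
  return 0
}

# Example:
#   list_paper_files "$(paper_db_root)"
```

A paper that lists `paper.pdf` but no `source.tex` is one where only PDF-based answers via `papi ask` are possible, so exact equation comparison has to rely on `equations.md`.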
184
+
185
+ ## Integration with Coding Agents
186
+
187
+ > **Tip:** See [AGENT_INTEGRATION.md](AGENT_INTEGRATION.md) for a ready-to-use snippet you can append to your
188
+ > repo's agent instructions file (for example `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`).
189
+
190
+ ### Claude Code / Codex CLI Skill
191
+
192
+ paperpipe includes a skill that automatically activates when you ask about papers,
193
+ verification, or equations. Install it for Claude Code and/or Codex CLI:
194
+
195
+ ```bash
196
+ # Install for both Claude Code and Codex CLI
197
+ papi install-skill
198
+
199
+ # Or install for a specific CLI only
200
+ papi install-skill --claude
201
+ papi install-skill --codex
202
+ ```
203
+
204
+ Restart your CLI after installing the skill.
205
+
206
+ Most coding-agent CLIs can read local files directly. The best workflow is:
207
+
208
+ 1. Use `papi` to build/manage your paper collection.
209
+ 2. For code verification, have the agent read `{paper}/equations.md` (and `source.tex` when needed).
210
+ 3. For research-y questions across many papers, use `papi ask` (PaperQA2).
211
+
212
+ Minimal snippet to add to your agent instructions:
213
+
214
+ ```markdown
215
+ ## Paper References (PaperPipe)
216
+
217
+ PaperPipe manages papers via `papi`. Find the active database root with:
218
+ `papi path`
219
+
220
+ Per-paper files are under `<paper_db>/papers/{paper}/`:
221
+ - `equations.md` — best for implementation verification
222
+ - `summary.md` — high-level overview
223
+ - `source.tex` — exact definitions (if available)
224
+
225
+ Use `papi search "query"` to find papers/tags quickly.
226
+ Use `papi ask "question"` for PaperQA2 multi-paper queries (if installed).
227
+ ```
228
+
229
+ If you want paper context inside your repo (useful for agents that can’t access `~`), export it:
230
+
231
+ ```bash
232
+ papi export neuralangelo neus --level equations --to ./paper-context/
233
+ ```
234
+
235
+ If you want to paste context directly into a terminal agent session, print to stdout:
236
+
237
+ ```bash
238
+ papi show neuralangelo neus --level eq
239
+ ```
240
+
241
+ ## Commands
242
+
243
+ | Command | Description |
244
+ |---------|-------------|
245
+ | `papi add <ids-or-urls...>` | Add one or more papers (idempotent by arXiv ID; use `--update`/`--duplicate` for existing) |
246
+ | `papi regenerate <papers...>` | Regenerate summary/equations/tags (use `--overwrite name` to rename) |
247
+ | `papi regenerate --all` | Regenerate for all papers |
248
+ | `papi audit [papers...]` | Audit generated summaries/equations and optionally regenerate flagged papers |
249
+ | `papi remove <papers...>` | Remove one or more papers (by name or arXiv ID/URL) |
250
+ | `papi list [--tag TAG]` | List papers, optionally filtered by tag |
251
+ | `papi search <query>` | Exact search (with fuzzy fallback if no exact matches) across title/tags/metadata + local summaries/equations (use `--exact` to disable fallback; `--tex` includes LaTeX) |
252
+ | `papi show <papers...>` | Show paper details or print stored content |
253
+ | `papi export <papers...>` | Export context files to a directory |
254
+ | `papi ask <query> [args]` | Query papers via PaperQA2 (supports all pqa args) |
255
+ | `papi models` | Probe which models work with your API keys |
256
+ | `papi tags` | List all tags with counts |
257
+ | `papi path` | Print database location |
258
+ | `papi install-skill` | Install the papi skill for Claude Code / Codex CLI |
259
+ | `--quiet/-q` | Suppress progress messages |
260
+ | `--verbose/-v` | Enable debug output |
261
+
262
+ ## Tagging
263
+
264
+ Tags come from three sources; the first two are applied automatically:
265
+
266
+ 1. **arXiv categories** → human-readable tags (cs.CV → computer-vision)
267
+ 2. **LLM-generated** → semantic tags from title/abstract
268
+ 3. **User-provided** → via `--tags` flag
269
+
270
+ ```bash
271
+ # Auto-tags from arXiv + LLM
272
+ papi add 2303.13476
273
+ # → name: neuralangelo, tags: computer-vision, graphics, neural-radiance-field, sdf, hash-encoding
274
+
275
+ # Add custom tags (and override auto-name)
276
+ papi add 2303.13476 --name my-neuralangelo --tags my-project,priority
277
+ ```
278
+
279
+ ## Export Levels
280
+
281
+ ```bash
282
+ # Just summaries (smallest, good for overview)
283
+ papi export neuralangelo neus --level summary
284
+
285
+ # Equations only (best for code verification)
286
+ papi export neuralangelo neus --level equations
287
+
288
+ # Full LaTeX source (most complete)
289
+ papi export neuralangelo neus --level full
290
+ ```
291
+
292
+ ## Show Levels (stdout)
293
+
294
+ ```bash
295
+ # Metadata (default)
296
+ papi show neuralangelo
297
+
298
+ # Print equations (for piping into agent sessions)
299
+ papi show neuralangelo neus --level eq
300
+
301
+ # Print summary / LaTeX
302
+ papi show neuralangelo --level summary
303
+ papi show neuralangelo --level tex
304
+ ```
305
+
306
+ ## Workflow Example
307
+
308
+ ```bash
309
+ # 1. Build your paper collection (names auto-generated)
310
+ papi add 2303.13476 2106.10689 2104.06405
311
+ # → neuralangelo, neus, volsdf
312
+
313
+ # 2. Research phase: use PaperQA2
314
+ papi ask "Compare the volume rendering approaches in NeuS, VolSDF, and Neuralangelo"
315
+
316
+ # 3. Implementation phase: export equations to project
317
+ cd ~/my-neural-surface-project
318
+ papi export neuralangelo neus volsdf --level equations --to ./paper-context/
319
+
320
+ # 4. In Claude Code / Codex / Gemini:
321
+ # "Compare my eikonal_loss() implementation with the formulations in paper-context/"
322
+
323
+ # 5. Clean up: remove papers you no longer need
324
+ papi remove volsdf neus
325
+ ```
326
+
327
+ ## Configuration
328
+
329
+ Set custom database location:
330
+ ```bash
331
+ export PAPER_DB_PATH=/path/to/your/papers
332
+ ```
333
+
334
+ ## Environment Setup
335
+
336
+ To use PaperQA2 via `papi ask` with the built-in default models, set the environment variables for your
337
+ chosen provider (PaperQA2 uses LiteLLM identifiers for `--llm` and `--embedding`).
338
+
339
+ | Provider | Required Env Var | Used For |
340
+ |----------|------------------|----------|
341
+ | **Google** | `GEMINI_API_KEY` | Gemini models & embeddings |
342
+ | **Anthropic** | `ANTHROPIC_API_KEY` | Claude models |
343
+ | **Voyage AI** | `VOYAGE_API_KEY` | Embeddings (recommended when using Claude) |
344
+ | **OpenAI** | `OPENAI_API_KEY` | GPT models & embeddings |
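
Before running `papi ask`, it can be useful to check which of the variables in the table are actually exported. The function below is a sketch only: `configured_providers` is a hypothetical name, and it checks nothing beyond whether each variable is present and non-empty in the environment (it does not validate the keys; `papi models` does live probing).

```bash
# Hypothetical helper: print the provider key variables that are exported and non-empty.
configured_providers() {
  local var
  for var in GEMINI_API_KEY ANTHROPIC_API_KEY VOYAGE_API_KEY OPENAI_API_KEY; do
    # printenv only sees exported variables, which is what child processes inherit.
    if [ -n "$(printenv "$var")" ]; then
      printf '%s\n' "$var"
    fi
  done
  return 0
}

# Example:
#   configured_providers
```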
345
+
346
+ ## LLM Support
347
+
348
+ For better summaries and equation extraction, install with LLM support:
349
+
350
+ ```bash
351
+ pip install 'paperpipe[llm]'
352
+ # or with uv:
353
+ uv pip install 'paperpipe[llm]'
354
+ ```
355
+
356
+ This installs LiteLLM, which supports many providers. Set the appropriate API key:
357
+
358
+ ```bash
359
+ export GEMINI_API_KEY=... # For Gemini (default)
360
+ export OPENAI_API_KEY=... # For OpenAI/GPT
361
+ export ANTHROPIC_API_KEY=... # For Claude
362
+ ```
363
+
364
+ paperpipe defaults to `gemini/gemini-3-flash-preview`. Override via:
365
+ ```bash
366
+ export PAPERPIPE_LLM_MODEL=gpt-4o # or any LiteLLM model identifier
367
+ ```
368
+
369
+ You can also tune LLM generation:
370
+ ```bash
371
+ export PAPERPIPE_LLM_TEMPERATURE=0.3 # default: 0.3
372
+ ```
373
+
374
+ Without LLM support, paperpipe falls back to:
375
+ - Metadata + section headings from LaTeX
376
+ - Regex-based equation extraction
377
+
378
+ ## PaperQA2 Integration
379
+
380
+ When both paperpipe and [PaperQA2](https://github.com/Future-House/paper-qa) are installed, they share the same PDFs:
381
+
382
+ ```bash
383
+ # paperpipe stores PDFs in <paper_db>/papers/*/paper.pdf (see `papi path`)
384
+ # `papi ask` routes complex queries to PaperQA2
385
+
386
+ papi ask "What optimizer settings do these papers recommend?"
387
+
388
+ # PaperQA uses LiteLLM model identifiers for `--llm` and `--embedding`.
389
+ # You can also pass through any other `pqa ask` flags after the query/options.
390
+ # By default, `papi ask` uses `pqa --settings default` to avoid failures caused by stale user
391
+ # settings files; pass `-s/--settings <name>` to use a specific PaperQA2 settings profile.
392
+ # `papi ask` also defaults to `--llm gemini/gemini-3-flash-preview` and `--embedding gemini/gemini-embedding-001`
393
+ # unless you pick a PaperQA2 settings profile with `-s/--settings` (in that case, the profile controls).
394
+ # If Pillow is not installed, `papi ask` also forces `--parsing.multimodal OFF` to avoid PDF
395
+ # image extraction errors; pass your own `--parsing...` args to override.
396
+ #
397
+ # Examples (specify LLM + embedding):
398
+ # Gemini 3 Flash + Google Embeddings
399
+ papi ask "Explain the architecture" --llm "gemini/gemini-3-flash-preview" --embedding "gemini/gemini-embedding-001"
400
+
401
+ # Gemini 3 Pro + Google Embeddings
402
+ papi ask "Give a detailed derivation of eq. 4 and explain implementation pitfalls" --llm "gemini/gemini-3-pro-preview" --embedding "gemini/gemini-embedding-001"
403
+
404
+ # Claude Sonnet 4.5 + Voyage AI Embeddings
405
+ papi ask "Compare the loss functions" --llm "claude-sonnet-4-5" --embedding "voyage/voyage-3-large"
406
+
407
+ # GPT-5.2 + OpenAI Embeddings
408
+ papi ask "How to implement eq 4?" --llm "gpt-5.2" --embedding "text-embedding-3-large"
409
+
410
+ # Pass any arbitrary PaperQA2 arguments (e.g., temperature, verbosity)
411
+ papi ask "Summarize the methods" --summary-llm gpt-4o-mini --temperature 0.2 --verbosity 2
412
+ ```
413
+
414
+ ### Model Probing
415
+
416
+ To see which model IDs work with your currently configured API keys (this makes small live API calls):
417
+
418
+ ```bash
419
+ papi models
420
+ # (default: probes one "latest" completion model and one embedding model per provider for
421
+ # which you have an API key set; pass `latest` (or `--preset latest`) to probe a broader list.)
422
+ # or probe specific models only:
423
+ papi models --kind completion --model gemini/gemini-3-flash-preview --model gemini/gemini-2.5-flash --model gpt-4o-mini
424
+ papi models --kind embedding --model gemini/gemini-embedding-001 --model text-embedding-3-small
425
+ # probe "latest" defaults (gpt-5.2/5.1, gemini 3 preview, claude-sonnet-4-5; plus text-embedding-3-large if enabled):
426
+ papi models latest
427
+ # probe "last-gen" defaults (gpt-4.1/4o, gemini 2.5, older/smaller embeddings; Claude 3.5 is retired):
428
+ papi models last-gen
429
+ # probe a broader superset:
430
+ papi models all
431
+ # show underlying provider errors (noisy):
432
+ papi models --verbose
433
+ ```
434
+
435
+ ## Non-arXiv Papers
436
+
437
+ PaperPipe currently focuses on arXiv ingestion (`papi add <arxiv-id-or-url>`). For papers not on arXiv you can still
438
+ store files for agents to read, but they will not show up in `papi list/search` unless you also add index/meta
439
+ entries.
440
+
441
+ ```bash
442
+ PAPER_DB="$(papi path)"
443
+ mkdir -p "$PAPER_DB/papers/my-paper"
444
+ cp /path/to/paper.pdf "$PAPER_DB/papers/my-paper/paper.pdf"
445
+ # Create:
446
+ # - "$PAPER_DB/papers/my-paper/summary.md"
447
+ # - "$PAPER_DB/papers/my-paper/equations.md"
448
+ # (optional) "$PAPER_DB/papers/my-paper/source.tex"
449
+ ```
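
The manual steps above can be wrapped in a small function. This is a sketch, not a paperpipe feature: `scaffold_paper` is a hypothetical name, and it only creates the directory, copies the PDF, and adds empty `summary.md` and `equations.md` files for you to fill in. It does not write `index.json` or `meta.json` entries, so the paper still will not appear in `papi list` or `papi search`.

```bash
# Hypothetical helper (not shipped with paperpipe).
# Usage: scaffold_paper <paper_db_root> <paper-name> <path-to-pdf>
scaffold_paper() {
  local root="$1" name="$2" pdf="$3"
  local dir="$root/papers/$name"
  local file
  mkdir -p "$dir" || return 1
  cp "$pdf" "$dir/paper.pdf" || return 1
  for file in summary.md equations.md; do
    # Never overwrite notes that already exist; only create missing files.
    if [ ! -e "$dir/$file" ]; then
      : > "$dir/$file"
    fi
  done
  return 0
}

# Example (requires papi on PATH):
#   scaffold_paper "$(papi path)" my-paper /path/to/paper.pdf
```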
450
+
451
+ ## Credits
452
+
453
+ - **[PaperQA2](https://github.com/Future-House/paper-qa)** by Future House — the RAG engine powering `papi ask`.
454
+ *Skarlinski et al., "Language Agents Achieve Superhuman Synthesis of Scientific Knowledge", 2024.*
455
+ [arXiv:2409.13740](https://arxiv.org/abs/2409.13740)
456
+
457
+ ## License
458
+
459
+ MIT (see [LICENSE](LICENSE))