arxiv-to-prompt 0.1.1__tar.gz → 0.2.0__tar.gz

@@ -1,6 +1,6 @@
- Metadata-Version: 2.2
+ Metadata-Version: 2.4
  Name: arxiv-to-prompt
- Version: 0.1.1
+ Version: 0.2.0
  Summary: transform arXiv papers into a single latex prompt for LLMs
  Author: Takashi Ishida
  License: MIT
@@ -15,15 +15,16 @@ Requires-Dist: requests>=2.25.0
  Provides-Extra: test
  Requires-Dist: pytest>=7.0.0; extra == "test"
  Requires-Dist: pytest-cov>=4.0.0; extra == "test"
+ Dynamic: license-file

  # arxiv-to-prompt

- [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250202)](https://pypi.org/project/arxiv-to-prompt/)
+ [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250307)](https://pypi.org/project/arxiv-to-prompt/)
  [![Tests](https://github.com/takashiishida/arxiv-to-prompt/actions/workflows/tests.yml/badge.svg)](https://github.com/takashiishida/arxiv-to-prompt/actions)
  [![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Changelog](https://img.shields.io/github/v/release/takashiishida/arxiv-to-prompt?label=changelog)](https://github.com/takashiishida/arxiv-to-prompt/releases)

- A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides an option to remove LaTeX comments from the output (which can be useful to shorten the prompt).
+ A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides options to remove LaTeX comments and appendix sections from the output (which can be useful to shorten the prompt).

  ### Installation

@@ -41,6 +42,12 @@ arxiv-to-prompt 2303.08774
  # Display LaTeX source without comments
  arxiv-to-prompt 2303.08774 --no-comments

+ # Display LaTeX source without appendix sections
+ arxiv-to-prompt 2303.08774 --no-appendix
+
+ # Combine options (no comments and no appendix)
+ arxiv-to-prompt 2303.08774 --no-comments --no-appendix
+
  # Copy to clipboard
  arxiv-to-prompt 2303.08774 | pbcopy

@@ -62,8 +69,23 @@ latex_source = process_latex_source("2303.08774")

  # Get LaTeX source without comments
  latex_source = process_latex_source("2303.08774", keep_comments=False)
+
+ # Get LaTeX source without appendix sections
+ latex_source = process_latex_source("2303.08774", remove_appendix_section=True)
+
+ # Combine options (no comments and no appendix)
+ latex_source = process_latex_source("2303.08774", keep_comments=False, remove_appendix_section=True)
  ```

+ ### Projects Using arxiv-to-prompt
+
+ Here are some projects and use cases that leverage arxiv-to-prompt:
+
+ - [arxiv-latex-mcp](https://github.com/takashiishida/arxiv-latex-mcp): MCP server that uses arxiv-to-prompt to fetch and process arXiv LaTeX sources for precise interpretation of mathematical expressions in scientific papers.
+ - [arxiv-tex-ui](https://github.com/takashiishida/arxiv-tex-ui): chat with an llm about an arxiv paper by using the latex source.
+
+ If you're using arxiv-to-prompt in your project, please submit a pull request to add it to this list!
+
  ### References

  - Inspired by [files-to-prompt](https://github.com/simonw/files-to-prompt).
@@ -1,11 +1,11 @@
  # arxiv-to-prompt

- [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250202)](https://pypi.org/project/arxiv-to-prompt/)
+ [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250307)](https://pypi.org/project/arxiv-to-prompt/)
  [![Tests](https://github.com/takashiishida/arxiv-to-prompt/actions/workflows/tests.yml/badge.svg)](https://github.com/takashiishida/arxiv-to-prompt/actions)
  [![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Changelog](https://img.shields.io/github/v/release/takashiishida/arxiv-to-prompt?label=changelog)](https://github.com/takashiishida/arxiv-to-prompt/releases)

- A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides an option to remove LaTeX comments from the output (which can be useful to shorten the prompt).
+ A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides options to remove LaTeX comments and appendix sections from the output (which can be useful to shorten the prompt).

  ### Installation

@@ -23,6 +23,12 @@ arxiv-to-prompt 2303.08774
  # Display LaTeX source without comments
  arxiv-to-prompt 2303.08774 --no-comments

+ # Display LaTeX source without appendix sections
+ arxiv-to-prompt 2303.08774 --no-appendix
+
+ # Combine options (no comments and no appendix)
+ arxiv-to-prompt 2303.08774 --no-comments --no-appendix
+
  # Copy to clipboard
  arxiv-to-prompt 2303.08774 | pbcopy

@@ -44,9 +50,24 @@ latex_source = process_latex_source("2303.08774")

  # Get LaTeX source without comments
  latex_source = process_latex_source("2303.08774", keep_comments=False)
+
+ # Get LaTeX source without appendix sections
+ latex_source = process_latex_source("2303.08774", remove_appendix_section=True)
+
+ # Combine options (no comments and no appendix)
+ latex_source = process_latex_source("2303.08774", keep_comments=False, remove_appendix_section=True)
  ```

+ ### Projects Using arxiv-to-prompt
+
+ Here are some projects and use cases that leverage arxiv-to-prompt:
+
+ - [arxiv-latex-mcp](https://github.com/takashiishida/arxiv-latex-mcp): MCP server that uses arxiv-to-prompt to fetch and process arXiv LaTeX sources for precise interpretation of mathematical expressions in scientific papers.
+ - [arxiv-tex-ui](https://github.com/takashiishida/arxiv-tex-ui): chat with an llm about an arxiv paper by using the latex source.
+
+ If you're using arxiv-to-prompt in your project, please submit a pull request to add it to this list!
+
  ### References

  - Inspired by [files-to-prompt](https://github.com/simonw/files-to-prompt).
- - Reused some code from [paper2slides](https://github.com/takashiishida/paper2slides).
+ - Reused some code from [paper2slides](https://github.com/takashiishida/paper2slides).
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "arxiv-to-prompt"
- version = "0.1.1"
+ version = "0.2.0"
  description = "transform arXiv papers into a single latex prompt for LLMs"
  readme = "README.md"
  authors = [{ name = "Takashi Ishida" }]
@@ -22,13 +22,19 @@ def main():
          help=f"Custom directory to store downloaded files (default: {default_cache})",
          default=None
      )
+     parser.add_argument(
+         "--no-appendix",
+         action="store_true",
+         help="Remove the appendix section and everything after it"
+     )

      args = parser.parse_args()

      content = process_latex_source(
          args.arxiv_id,
          keep_comments=not args.no_comments,
-         cache_dir=args.cache_dir
+         cache_dir=args.cache_dir,
+         remove_appendix_section=args.no_appendix
      )
      if content:
          print(content)
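
The path from the new flag to the keyword argument can be sketched in isolation. This is a minimal sketch, not the package's actual `main()`: only the `--no-appendix` definition and the keyword names come from the hunk above. The positional `arxiv_id` and the `--no-comments` definition are assumptions inferred from how `args` is used, and `build_kwargs` is a hypothetical helper.

```python
import argparse


def build_kwargs(argv):
    """Hypothetical helper: parse CLI args and map them to keyword arguments."""
    parser = argparse.ArgumentParser(prog="arxiv-to-prompt")
    # Assumed definitions, inferred from args.arxiv_id and args.no_comments
    parser.add_argument("arxiv_id")
    parser.add_argument("--no-comments", action="store_true")
    # Definition taken from the hunk above
    parser.add_argument(
        "--no-appendix",
        action="store_true",
        help="Remove the appendix section and everything after it",
    )
    args = parser.parse_args(argv)
    # argparse exposes "--no-appendix" as args.no_appendix (hyphens become underscores)
    return {
        "arxiv_id": args.arxiv_id,
        "keep_comments": not args.no_comments,
        "remove_appendix_section": args.no_appendix,
    }


print(build_kwargs(["2303.08774", "--no-appendix"]))
```

Because both flags use `store_true`, omitting them leaves comments and the appendix in the output.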
@@ -140,6 +140,14 @@ def remove_comments_from_lines(text: str) -> str:
          result.append(''.join(cleaned_line).rstrip())
      return '\n'.join(result)

+ def remove_appendix(text: str) -> str:
+     """Remove appendix section and everything after it."""
+     # Find the position of \appendix command
+     appendix_match = re.search(r'\\appendix\b', text)
+     if appendix_match:
+         return text[:appendix_match.start()].rstrip()
+     return text
+
  def flatten_tex(directory: str, main_file: str) -> str:
      """Combine all tex files into one, resolving inputs."""
      def process_file(file_path: str, processed_files: set) -> str:
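
Two consequences of truncating at the first `\appendix` match may be worth checking when reviewing this hunk. The sketch below copies `remove_appendix` from the hunk above so it runs on its own. The sample strings are made up for illustration.

```python
import re


def remove_appendix(text: str) -> str:
    """Remove appendix section and everything after it (copied from the hunk above)."""
    appendix_match = re.search(r'\\appendix\b', text)
    if appendix_match:
        return text[:appendix_match.start()].rstrip()
    return text


# 1. Everything after \appendix is dropped, including \end{document}.
doc = "\\begin{document}\nBody\n\\appendix\nExtra\n\\end{document}"
trimmed = remove_appendix(doc)
assert trimmed == "\\begin{document}\nBody"

# 2. The search runs on raw text, so a commented-out \appendix also triggers
#    truncation when comments have not been stripped beforehand.
commented = "Body\n% \\appendix\nMore body"
cut = remove_appendix(commented)
assert cut == "Body\n%"
```

Since the result is meant to be used as an LLM prompt rather than compiled, losing `\end{document}` is likely harmless. The second case only arises when comments are kept.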
@@ -201,7 +209,7 @@ def flatten_tex(directory: str, main_file: str) -> str:

  def process_latex_source(arxiv_id: str, keep_comments: bool = True,
                           cache_dir: Optional[str] = None,
-                          use_cache: bool = False) -> Optional[str]:
+                          use_cache: bool = False, remove_appendix_section: bool = False) -> Optional[str]:
      """
      Process LaTeX source files from arXiv and return the combined content.

@@ -210,6 +218,7 @@ def process_latex_source(arxiv_id: str, keep_comments: bool = True,
          keep_comments: Whether to keep LaTeX comments in the output
          cache_dir: Custom directory to store downloaded files
          use_cache: Whether to use cached files if they exist (default: False)
+         remove_appendix_section: Whether to remove the appendix section and everything after it

      Returns:
          The processed LaTeX content or None if processing fails
@@ -234,6 +243,10 @@ def process_latex_source(arxiv_id: str, keep_comments: bool = True,
      if not keep_comments:
          content = remove_comments_from_lines(content)

+     # Remove appendix if requested
+     if remove_appendix_section:
+         content = remove_appendix(content)
+
      return content

  def check_source_available(arxiv_id: str) -> bool:
@@ -1,6 +1,6 @@
- Metadata-Version: 2.2
+ Metadata-Version: 2.4
  Name: arxiv-to-prompt
- Version: 0.1.1
+ Version: 0.2.0
  Summary: transform arXiv papers into a single latex prompt for LLMs
  Author: Takashi Ishida
  License: MIT
@@ -15,15 +15,16 @@ Requires-Dist: requests>=2.25.0
  Provides-Extra: test
  Requires-Dist: pytest>=7.0.0; extra == "test"
  Requires-Dist: pytest-cov>=4.0.0; extra == "test"
+ Dynamic: license-file

  # arxiv-to-prompt

- [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250202)](https://pypi.org/project/arxiv-to-prompt/)
+ [![PyPI version](https://badge.fury.io/py/arxiv-to-prompt.svg?update=20250307)](https://pypi.org/project/arxiv-to-prompt/)
  [![Tests](https://github.com/takashiishida/arxiv-to-prompt/actions/workflows/tests.yml/badge.svg)](https://github.com/takashiishida/arxiv-to-prompt/actions)
  [![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Changelog](https://img.shields.io/github/v/release/takashiishida/arxiv-to-prompt?label=changelog)](https://github.com/takashiishida/arxiv-to-prompt/releases)

- A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides an option to remove LaTeX comments from the output (which can be useful to shorten the prompt).
+ A command-line tool to transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper. It downloads the source files, automatically finds the main tex file containing `\documentclass`, and flattens multiple files into a single coherent source by resolving `\input` and `\include` commands. The tool also provides options to remove LaTeX comments and appendix sections from the output (which can be useful to shorten the prompt).

  ### Installation

@@ -41,6 +42,12 @@ arxiv-to-prompt 2303.08774
  # Display LaTeX source without comments
  arxiv-to-prompt 2303.08774 --no-comments

+ # Display LaTeX source without appendix sections
+ arxiv-to-prompt 2303.08774 --no-appendix
+
+ # Combine options (no comments and no appendix)
+ arxiv-to-prompt 2303.08774 --no-comments --no-appendix
+
  # Copy to clipboard
  arxiv-to-prompt 2303.08774 | pbcopy

@@ -62,8 +69,23 @@ latex_source = process_latex_source("2303.08774")

  # Get LaTeX source without comments
  latex_source = process_latex_source("2303.08774", keep_comments=False)
+
+ # Get LaTeX source without appendix sections
+ latex_source = process_latex_source("2303.08774", remove_appendix_section=True)
+
+ # Combine options (no comments and no appendix)
+ latex_source = process_latex_source("2303.08774", keep_comments=False, remove_appendix_section=True)
  ```

+ ### Projects Using arxiv-to-prompt
+
+ Here are some projects and use cases that leverage arxiv-to-prompt:
+
+ - [arxiv-latex-mcp](https://github.com/takashiishida/arxiv-latex-mcp): MCP server that uses arxiv-to-prompt to fetch and process arXiv LaTeX sources for precise interpretation of mathematical expressions in scientific papers.
+ - [arxiv-tex-ui](https://github.com/takashiishida/arxiv-tex-ui): chat with an llm about an arxiv paper by using the latex source.
+
+ If you're using arxiv-to-prompt in your project, please submit a pull request to add it to this list!
+
  ### References

  - Inspired by [files-to-prompt](https://github.com/simonw/files-to-prompt).
@@ -9,6 +9,7 @@ from arxiv_to_prompt.core import (
      remove_comments_from_lines,
      check_source_available,
      flatten_tex,
+     remove_appendix,
  )

  # Test fixtures
@@ -176,3 +177,54 @@ Text with escaped \\% and then % \\input{commented_file3}
      assert "\\include{commented_file2}" in result
      assert "\\input{commented_file3}" in result
      assert "\\input{nonexistent_file}" in result
+
+
+ def test_remove_appendix():
+     """Test appendix removal functionality."""
+     test_cases = [
+         # Basic appendix removal
+         (
+             "Main content\n\n\\appendix\nAppendix content",
+             "Main content"
+         ),
+         # No appendix to remove
+         (
+             "Main content only",
+             "Main content only"
+         ),
+         # Appendix with sections
+         (
+             "Introduction\n\\section{Method}\nContent\n\\appendix\n\\section{Additional Info}\nMore stuff",
+             "Introduction\n\\section{Method}\nContent"
+         ),
+         # Multiple appendix commands (should remove from first one)
+         (
+             "Content\n\\appendix\nFirst appendix\n\\appendix\nSecond appendix",
+             "Content"
+         ),
+         # Appendix at the beginning
+         (
+             "\\appendix\nAll appendix content",
+             ""
+         ),
+     ]
+
+     for input_text, expected in test_cases:
+         result = remove_appendix(input_text)
+         assert result == expected, f"Failed for input: {input_text}"
+
+
+ def test_process_latex_with_appendix_removal(sample_arxiv_id, temp_cache_dir):
+     """Test processing LaTeX source with appendix removal."""
+     # Test with appendix removal
+     result = process_latex_source(
+         sample_arxiv_id,
+         keep_comments=True,
+         cache_dir=str(temp_cache_dir),
+         remove_appendix_section=True
+     )
+     assert result is not None
+     assert "\\documentclass" in result
+
+     # Check that appendix was removed (if it existed)
+     assert "\\appendix" not in result
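
The final substring check in the integration test is stricter than the pattern `remove_appendix` uses: `"\\appendix" not in result` also rejects longer command names such as `\appendixname`, which the `\b` in the regex deliberately skips. A small sketch of the difference follows. `has_appendix_command` is a hypothetical helper and the preamble string is invented for illustration.

```python
import re

# Same pattern as in remove_appendix
APPENDIX_CMD = re.compile(r'\\appendix\b')


def has_appendix_command(text: str) -> bool:
    """Hypothetical helper: True only if a standalone \\appendix command is present."""
    return APPENDIX_CMD.search(text) is not None


# A preamble that renames the appendix label but never starts an appendix
preamble_only = "\\renewcommand{\\appendixname}{Supplement}\n\\begin{document}\nBody"

substring_hit = "\\appendix" in preamble_only      # plain substring check
regex_hit = has_appendix_command(preamble_only)    # word-boundary check

assert substring_hit is True
assert regex_hit is False
```

If the sample paper used by the `sample_arxiv_id` fixture ever contains such a command before its appendix, the substring assertion would fail even though removal worked as intended. Asserting with the same regex would avoid that.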
File without changes