scientify 1.0.0

@@ -0,0 +1,138 @@
+ ---
+ name: arxiv
+ description: "Search arXiv.org for academic papers using the built-in arxiv tool. Use for literature search, finding related work, and downloading paper sources."
+ metadata:
+   {
+     "openclaw":
+       {
+         "emoji": "📚",
+       },
+   }
+ ---
+
+ # ArXiv Search
+
+ Use the built-in `arxiv` tool to search for academic papers on arXiv.org.
+
+ ## Basic Search
+
+ ```
+ arxiv query:"graph neural network" max_results:10
+ ```
+
+ ## Parameters
+
+ | Parameter | Type | Description |
+ |-----------|------|-------------|
+ | `query` | string | Search query (required) |
+ | `max_results` | number | Max papers to return (default: 10, max: 50) |
+ | `sort_by` | string | Sort by: "relevance", "lastUpdatedDate", "submittedDate" |
+ | `date_from` | string | Filter papers after this date (YYYY-MM-DD) |
+ | `download` | boolean | Download .tex source files (default: false) |
+ | `output_dir` | string | Directory for downloads (default: ~/.openclaw/workspace/papers/) |
+
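The limits in the table can be checked before the tool is called. The following is a minimal sketch in POSIX shell, not the tool's own validation: the function names `clamp_max_results`, `is_valid_sort_by`, and `is_iso_date` are hypothetical, and how the real `arxiv` tool treats out-of-range values is not stated in this document.

```bash
# Hypothetical pre-flight checks based on the parameter table above.
# The arxiv tool itself may validate these values differently.

# Clamp max_results into 1..50; use the default (10) for missing or non-numeric input.
clamp_max_results() (
  n="${1:-10}"
  case "$n" in
    *[!0-9]*) n=10 ;;
  esac
  if [ "$n" -lt 1 ]; then
    echo 1
  elif [ "$n" -gt 50 ]; then
    echo 50
  else
    echo "$n"
  fi
)

# Succeed only for the three documented sort_by values.
is_valid_sort_by() {
  case "$1" in
    relevance|lastUpdatedDate|submittedDate) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed when the argument has the YYYY-MM-DD shape.
# This checks the shape only, not whether the date exists on the calendar.
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `clamp_max_results 200` prints `50`, and `is_iso_date 2024-1-1` fails because the month and day are not zero-padded.
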
+ ## Examples
+
+ Search for recent transformer papers:
+ ```
+ arxiv query:"transformer attention mechanism" sort_by:"submittedDate" max_results:5
+ ```
+
+ Search with date filter:
+ ```
+ arxiv query:"diffusion models" date_from:"2024-01-01"
+ ```
+
+ Search and download .tex sources:
+ ```
+ arxiv query:"transformer attention" max_results:5 download:true
+ ```
+
+ ## Output
+
+ Returns JSON with:
+ - `query`: The search query
+ - `total`: Number of results
+ - `papers`: Array of paper objects with:
+   - `title`: Paper title
+   - `authors`: List of authors
+   - `abstract`: Paper abstract
+   - `arxiv_id`: ArXiv ID (e.g., "2401.12345")
+   - `pdf_url`: Direct PDF link
+   - `published`: Publication date
+   - `categories`: ArXiv categories (e.g., "cs.LG", "cs.AI")
+
+ When `download: true`:
+ - `downloads`: Array of download results with:
+   - `arxiv_id`: Paper ID
+   - `format`: "tex" or "pdf" (fallback)
+   - `files`: List of downloaded files
+   - `error`: Error message if download failed
+ - `output_dir`: Directory where files were saved
+
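If the JSON result has been saved to a file, the fields listed above can be pulled out with `jq`. This is a sketch under assumptions: `jq` is installed, the response was written to a file whose path you pass in, and `papers` and `downloads` sit at the top level of the object as the list suggests. The function names are hypothetical.

```bash
# Print "arxiv_id<TAB>title" for every paper in a saved response file.
list_papers() {
  jq -r '.papers[] | [.arxiv_id, .title] | @tsv' "$1"
}

# Print the arxiv_id of every download entry that carries a non-empty `error`.
# `.downloads[]?` yields nothing when the key is absent (download was false).
failed_downloads() {
  jq -r '.downloads[]? | select((.error // "") != "") | .arxiv_id' "$1"
}
```

Called as `list_papers response.json`, where `response.json` is a placeholder name, this gives one line per paper that is easy to feed into a loop or `cut`.
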
+ ---
+
+ ## Downloading Paper Source (.tex)
+
+ **IMPORTANT**: Prefer downloading .tex source over PDF for better AI readability.
+
+ ### Download .tex Source (Recommended - Use Tool)
+
+ The easiest way is to use the `arxiv` tool with `download: true`:
+
+ ```
+ arxiv query:"your search" max_results:5 download:true output_dir:"~/.openclaw/workspace/papers"
+ ```
+
+ This automatically:
+ 1. Downloads .tex source from `https://arxiv.org/src/{arxiv_id}`
+ 2. Extracts tar.gz archives
+ 3. Falls back to PDF if .tex source is unavailable
+
+ ### Manual Download (Bash)
+
+ If you need to download specific papers manually:
+
+ ```bash
+ # Replace {arxiv_id} with the paper's ID, e.g. 2401.12345
+ mkdir -p ~/.openclaw/workspace/papers/{arxiv_id}
+ cd ~/.openclaw/workspace/papers/{arxiv_id}
+ # -f makes curl fail on HTTP errors (such as 404) instead of saving the error page
+ curl -fL "https://arxiv.org/src/{arxiv_id}" -o source.tar.gz
+ tar -xzf source.tar.gz
+ ```
+
+ ### Why .tex over PDF?
+
+ | Format | AI Readability | Formulas | Structure |
+ |--------|---------------|----------|-----------|
+ | **.tex** | Excellent | Full LaTeX | Preserved |
+ | .pdf | Poor (needs text extraction or OCR) | Lost/garbled | Lost |
+
+ ### Fallback to PDF
+
+ If .tex source is unavailable (404 error), fall back to PDF:
+ ```bash
+ curl -fL "https://arxiv.org/pdf/{arxiv_id}.pdf" -o ~/.openclaw/workspace/papers/{arxiv_id}.pdf
+ ```
+
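The source-first, PDF-second order can be wrapped in one function. This is a sketch of that order using the two URLs shown in this document, not the tool's actual implementation. The names `src_url`, `pdf_url`, and `fetch_paper` are hypothetical, and the printed `tex` or `pdf` only mirrors the `format` values listed under Output.

```bash
# URL builders for the two endpoints used in this document.
src_url() { printf 'https://arxiv.org/src/%s\n' "$1"; }
pdf_url() { printf 'https://arxiv.org/pdf/%s.pdf\n' "$1"; }

# fetch_paper ARXIV_ID [DEST_DIR]
# Tries the source archive first and falls back to the PDF on any curl failure.
# Prints "tex" or "pdf" to report which format was saved.
fetch_paper() (
  id="${1:-}"
  dest="${2:-$HOME/.openclaw/workspace/papers}"
  [ -n "$id" ] || { echo "usage: fetch_paper ARXIV_ID [DEST_DIR]" >&2; exit 2; }
  mkdir -p "$dest/$id"
  # -f makes curl exit non-zero on HTTP errors such as 404.
  if curl -fsSL "$(src_url "$id")" -o "$dest/$id/source.tar.gz"; then
    echo tex
  else
    # Clean up the empty per-paper directory before saving the PDF beside it.
    rm -f "$dest/$id/source.tar.gz"
    rmdir "$dest/$id" 2>/dev/null || true
    curl -fsSL "$(pdf_url "$id")" -o "$dest/$id.pdf" && echo pdf
  fi
)
```

The function body runs in a subshell, so its variables do not leak into the calling shell. The saved archive still has to be unpacked afterwards.
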
+ ---
+
+ ## Workspace Integration
+
+ **Project-based workspace**: When using with `idea-generation` or `research-pipeline` skills, papers are stored per-project:
+
+ ```
+ ~/.openclaw/workspace/projects/
+ ├── .active                  # Current project ID
+ ├── {project-id}/            # e.g., nlp-summarization, cv-segmentation
+ │   ├── project.json         # Project metadata
+ │   ├── papers/              # Downloaded papers for THIS project
+ │   │   ├── 2401.12345/      # Extracted .tex source
+ │   │   │   ├── main.tex
+ │   │   │   └── ...
+ │   │   └── 2402.67890.pdf   # PDF fallback
+ │   └── ...
+ ```
+
+ **When called from idea-generation**: Use `output_dir: "$WORKSPACE/papers"` where `$WORKSPACE` is the active project directory.
+
+ **Standalone usage**: Default `output_dir` is `~/.openclaw/workspace/papers/` (flat structure).
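
The choice between the per-project directory and the flat default can be sketched as a small lookup. This assumes that `.active` holds the project ID as plain text, which the tree comment suggests but does not spell out. The function name `resolve_papers_dir` and the `OPENCLAW_WORKSPACE` override are hypothetical and exist only to make the sketch easy to try outside a real workspace.

```bash
# Print the papers directory for the active project, or the flat default.
# OPENCLAW_WORKSPACE is a hypothetical override, not a documented variable.
resolve_papers_dir() (
  root="${OPENCLAW_WORKSPACE:-$HOME/.openclaw/workspace}"
  active="$root/projects/.active"
  project_id=""
  # Read the project ID and drop any whitespace, including the trailing newline.
  [ -f "$active" ] && project_id=$(tr -d '[:space:]' < "$active")
  if [ -n "$project_id" ]; then
    echo "$root/projects/$project_id/papers"
  else
    echo "$root/papers"
  fi
)
```

The printed path can then be passed as `output_dir` in an `arxiv` call.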