mindforge-cc 11.5.1 → 11.7.0
- package/.agent/mindforge/skill-tdd.md +53 -0
- package/.agent/mindforge/skills-index.md +118 -0
- package/.agent/mindforge/systematic-debug.md +60 -0
- package/.agent/mindforge/wf-catalog.md +37 -0
- package/.agent/mindforge/wf-code-audit.md +31 -0
- package/.agent/mindforge/wf-competitive-analysis.md +31 -0
- package/.agent/mindforge/wf-deep-research.md +32 -0
- package/.agent/mindforge/wf-feature-planner.md +31 -0
- package/.agent/mindforge/wf-incident-response.md +31 -0
- package/.agent/mindforge/wf-onboard-codebase.md +31 -0
- package/.agent/mindforge/wf-perf-optimize.md +31 -0
- package/.agent/mindforge/wf-pr-review.md +31 -0
- package/.agent/mindforge/wf-refactor-plan.md +31 -0
- package/.agent/mindforge/wf-release-prep.md +31 -0
- package/.agent/mindforge/wf-tdd-sprint.md +31 -0
- package/.agent/mindforge/wf-tech-evaluation.md +31 -0
- package/.agent/skills/1password-skill/SKILL.md +156 -0
- package/.agent/skills/1password-skill/references/cli-examples.md +31 -0
- package/.agent/skills/1password-skill/references/get-started.md +21 -0
- package/.agent/skills/article-illustrator/SKILL.md +199 -0
- package/.agent/skills/article-illustrator/references/prompt-construction.md +426 -0
- package/.agent/skills/article-illustrator/references/style-presets.md +80 -0
- package/.agent/skills/article-illustrator/references/styles.md +224 -0
- package/.agent/skills/article-illustrator/references/usage.md +50 -0
- package/.agent/skills/article-illustrator/references/workflow.md +332 -0
- package/.agent/skills/arxiv/SKILL.md +275 -0
- package/.agent/skills/blogwatcher/SKILL.md +130 -0
- package/.agent/skills/code-wiki/SKILL.md +438 -0
- package/.agent/skills/code-wiki/templates/README.md +31 -0
- package/.agent/skills/code-wiki/templates/architecture.md +30 -0
- package/.agent/skills/code-wiki/templates/getting-started.md +47 -0
- package/.agent/skills/code-wiki/templates/module.md +38 -0
- package/.agent/skills/codebase-inspection/SKILL.md +109 -0
- package/.agent/skills/comic-creator/SKILL.md +240 -0
- package/.agent/skills/comic-creator/references/analysis-framework.md +176 -0
- package/.agent/skills/comic-creator/references/auto-selection.md +71 -0
- package/.agent/skills/comic-creator/references/base-prompt.md +98 -0
- package/.agent/skills/comic-creator/references/character-template.md +180 -0
- package/.agent/skills/comic-creator/references/ohmsha-guide.md +85 -0
- package/.agent/skills/comic-creator/references/partial-workflows.md +106 -0
- package/.agent/skills/comic-creator/references/storyboard-template.md +143 -0
- package/.agent/skills/comic-creator/references/workflow.md +401 -0
- package/.agent/skills/concept-diagrams/SKILL.md +355 -0
- package/.agent/skills/concept-diagrams/references/dashboard-patterns.md +43 -0
- package/.agent/skills/concept-diagrams/references/infrastructure-patterns.md +144 -0
- package/.agent/skills/concept-diagrams/references/physical-shape-cookbook.md +42 -0
- package/.agent/skills/creative-ideation/SKILL.md +144 -0
- package/.agent/skills/creative-ideation/references/full-prompt-library.md +110 -0
- package/.agent/skills/devops-cli/SKILL.md +149 -0
- package/.agent/skills/devops-cli/references/app-discovery.md +112 -0
- package/.agent/skills/devops-cli/references/authentication.md +59 -0
- package/.agent/skills/devops-cli/references/cli-reference.md +104 -0
- package/.agent/skills/devops-cli/references/running-apps.md +171 -0
- package/.agent/skills/devops-watchers/SKILL.md +103 -0
- package/.agent/skills/docker-management/SKILL.md +273 -0
- package/.agent/skills/domain-intel/SKILL.md +96 -0
- package/.agent/skills/duckduckgo-search/SKILL.md +230 -0
- package/.agent/skills/github-auth/SKILL.md +240 -0
- package/.agent/skills/github-code-review/SKILL.md +474 -0
- package/.agent/skills/github-code-review/references/review-output-template.md +74 -0
- package/.agent/skills/github-issues/SKILL.md +363 -0
- package/.agent/skills/github-issues/templates/bug-report.md +35 -0
- package/.agent/skills/github-issues/templates/feature-request.md +31 -0
- package/.agent/skills/github-pr-workflow/SKILL.md +360 -0
- package/.agent/skills/github-pr-workflow/references/ci-troubleshooting.md +183 -0
- package/.agent/skills/github-pr-workflow/references/conventional-commits.md +71 -0
- package/.agent/skills/github-pr-workflow/templates/pr-body-bugfix.md +35 -0
- package/.agent/skills/github-pr-workflow/templates/pr-body-feature.md +33 -0
- package/.agent/skills/github-repo-management/SKILL.md +509 -0
- package/.agent/skills/github-repo-management/references/github-api-cheatsheet.md +161 -0
- package/.agent/skills/godmode/SKILL.md +396 -0
- package/.agent/skills/godmode/references/jailbreak-templates.md +128 -0
- package/.agent/skills/godmode/references/refusal-detection.md +142 -0
- package/.agent/skills/hyperframes/SKILL.md +182 -0
- package/.agent/skills/hyperframes/references/cli.md +185 -0
- package/.agent/skills/hyperframes/references/composition.md +129 -0
- package/.agent/skills/hyperframes/references/features.md +289 -0
- package/.agent/skills/hyperframes/references/gsap.md +136 -0
- package/.agent/skills/hyperframes/references/troubleshooting.md +137 -0
- package/.agent/skills/hyperframes/references/website-to-video.md +145 -0
- package/.agent/skills/jupyter-live-kernel/SKILL.md +160 -0
- package/.agent/skills/kanban-orchestrator/SKILL.md +209 -0
- package/.agent/skills/kanban-worker/SKILL.md +188 -0
- package/.agent/skills/llm-wiki/SKILL.md +499 -0
- package/.agent/skills/meme-generation/SKILL.md +122 -0
- package/.agent/skills/node-inspect-debugger/SKILL.md +312 -0
- package/.agent/skills/obsidian/SKILL.md +60 -0
- package/.agent/skills/osint-investigation/SKILL.md +269 -0
- package/.agent/skills/osint-investigation/templates/source-template.md +59 -0
- package/.agent/skills/oss-forensics/SKILL.md +422 -0
- package/.agent/skills/oss-forensics/references/evidence-types.md +89 -0
- package/.agent/skills/oss-forensics/references/github-archive-guide.md +184 -0
- package/.agent/skills/oss-forensics/references/investigation-templates.md +131 -0
- package/.agent/skills/oss-forensics/references/recovery-techniques.md +164 -0
- package/.agent/skills/oss-forensics/templates/forensic-report.md +151 -0
- package/.agent/skills/oss-forensics/templates/malicious-package-report.md +43 -0
- package/.agent/skills/parallel-cli/SKILL.md +384 -0
- package/.agent/skills/pinggy-tunnel/SKILL.md +302 -0
- package/.agent/skills/pixel-art/SKILL.md +209 -0
- package/.agent/skills/pixel-art/references/palettes.md +49 -0
- package/.agent/skills/plan/SKILL.md +331 -0
- package/.agent/skills/polymarket/SKILL.md +75 -0
- package/.agent/skills/polymarket/references/api-endpoints.md +220 -0
- package/.agent/skills/python-debugpy/SKILL.md +368 -0
- package/.agent/skills/requesting-code-review/SKILL.md +273 -0
- package/.agent/skills/research-paper-writing/SKILL.md +2367 -0
- package/.agent/skills/research-paper-writing/references/autoreason-methodology.md +394 -0
- package/.agent/skills/research-paper-writing/references/checklists.md +434 -0
- package/.agent/skills/research-paper-writing/references/citation-workflow.md +563 -0
- package/.agent/skills/research-paper-writing/references/experiment-patterns.md +728 -0
- package/.agent/skills/research-paper-writing/references/human-evaluation.md +476 -0
- package/.agent/skills/research-paper-writing/references/paper-types.md +481 -0
- package/.agent/skills/research-paper-writing/references/reviewer-guidelines.md +433 -0
- package/.agent/skills/research-paper-writing/references/sources.md +191 -0
- package/.agent/skills/research-paper-writing/references/writing-guide.md +474 -0
- package/.agent/skills/research-paper-writing/templates/README.md +251 -0
- package/.agent/skills/rest-graphql-debug/SKILL.md +507 -0
- package/.agent/skills/s6-container-supervision/SKILL.md +171 -0
- package/.agent/skills/scrapling/SKILL.md +328 -0
- package/.agent/skills/sherlock/SKILL.md +186 -0
- package/.agent/skills/simplify-code/SKILL.md +168 -0
- package/.agent/skills/skill-authoring/SKILL.md +158 -0
- package/.agent/skills/spike/SKILL.md +190 -0
- package/.agent/skills/subagent-driven-development/SKILL.md +345 -0
- package/.agent/skills/subagent-driven-development/references/context-budget-discipline.md +53 -0
- package/.agent/skills/subagent-driven-development/references/gates-taxonomy.md +93 -0
- package/.agent/skills/systematic-debugging/SKILL.md +360 -0
- package/.agent/skills/test-driven-development/SKILL.md +336 -0
- package/.agent/skills/video-orchestrator/SKILL.md +194 -0
- package/.agent/skills/video-orchestrator/references/examples.md +227 -0
- package/.agent/skills/video-orchestrator/references/intake.md +166 -0
- package/.agent/skills/video-orchestrator/references/kanban-setup.md +278 -0
- package/.agent/skills/video-orchestrator/references/monitoring.md +180 -0
- package/.agent/skills/video-orchestrator/references/role-archetypes.md +298 -0
- package/.agent/skills/video-orchestrator/references/tool-matrix.md +317 -0
- package/.agent/skills/web-pentest/SKILL.md +332 -0
- package/.agent/skills/web-pentest/references/bypass-techniques.md +133 -0
- package/.agent/skills/web-pentest/references/exploitation-techniques.md +204 -0
- package/.agent/skills/web-pentest/references/scope-enforcement.md +110 -0
- package/.agent/skills/web-pentest/references/vuln-taxonomy.md +81 -0
- package/.agent/skills/web-pentest/templates/authorization.md +69 -0
- package/.agent/skills/web-pentest/templates/pentest-report.md +178 -0
- package/.claude/commands/mindforge/skill-tdd.md +53 -0
- package/.claude/commands/mindforge/skills-index.md +118 -0
- package/.claude/commands/mindforge/systematic-debug.md +60 -0
- package/.claude/commands/mindforge/wf-catalog.md +37 -0
- package/.claude/commands/mindforge/wf-code-audit.md +31 -0
- package/.claude/commands/mindforge/wf-competitive-analysis.md +31 -0
- package/.claude/commands/mindforge/wf-deep-research.md +32 -0
- package/.claude/commands/mindforge/wf-feature-planner.md +31 -0
- package/.claude/commands/mindforge/wf-incident-response.md +31 -0
- package/.claude/commands/mindforge/wf-onboard-codebase.md +31 -0
- package/.claude/commands/mindforge/wf-perf-optimize.md +31 -0
- package/.claude/commands/mindforge/wf-pr-review.md +31 -0
- package/.claude/commands/mindforge/wf-refactor-plan.md +31 -0
- package/.claude/commands/mindforge/wf-release-prep.md +31 -0
- package/.claude/commands/mindforge/wf-tdd-sprint.md +31 -0
- package/.claude/commands/mindforge/wf-tech-evaluation.md +31 -0
- package/.mindforge/config.json +2 -2
- package/.mindforge/dynamic-workflows/REGISTRY.md +65 -0
- package/.mindforge/dynamic-workflows/index.json +171 -0
- package/.mindforge/dynamic-workflows/scripts/code-audit.js +103 -0
- package/.mindforge/dynamic-workflows/scripts/competitive-analysis.js +85 -0
- package/.mindforge/dynamic-workflows/scripts/deep-research.js +151 -0
- package/.mindforge/dynamic-workflows/scripts/feature-planner.js +104 -0
- package/.mindforge/dynamic-workflows/scripts/incident-response.js +106 -0
- package/.mindforge/dynamic-workflows/scripts/onboard-codebase.js +102 -0
- package/.mindforge/dynamic-workflows/scripts/perf-optimize.js +128 -0
- package/.mindforge/dynamic-workflows/scripts/pr-review.js +87 -0
- package/.mindforge/dynamic-workflows/scripts/refactor-plan.js +121 -0
- package/.mindforge/dynamic-workflows/scripts/release-prep.js +102 -0
- package/.mindforge/dynamic-workflows/scripts/tdd-sprint.js +103 -0
- package/.mindforge/dynamic-workflows/scripts/tech-evaluation.js +72 -0
- package/.mindforge/memory/sync-manifest.json +1 -1
- package/.mindforge/skills/arxiv/SKILL.md +294 -0
- package/.mindforge/skills/blogwatcher/SKILL.md +147 -0
- package/.mindforge/skills/code-wiki/SKILL.md +457 -0
- package/.mindforge/skills/codebase-inspection/SKILL.md +126 -0
- package/.mindforge/skills/concept-diagrams/SKILL.md +373 -0
- package/.mindforge/skills/creative-ideation/SKILL.md +162 -0
- package/.mindforge/skills/domain-intel/SKILL.md +116 -0
- package/.mindforge/skills/duckduckgo-search/SKILL.md +249 -0
- package/.mindforge/skills/github-code-review/SKILL.md +493 -0
- package/.mindforge/skills/github-issues/SKILL.md +382 -0
- package/.mindforge/skills/github-pr-workflow/SKILL.md +379 -0
- package/.mindforge/skills/jupyter-live-kernel/SKILL.md +179 -0
- package/.mindforge/skills/kanban-orchestrator/SKILL.md +227 -0
- package/.mindforge/skills/kanban-worker/SKILL.md +206 -0
- package/.mindforge/skills/meme-generation/SKILL.md +141 -0
- package/.mindforge/skills/obsidian/SKILL.md +80 -0
- package/.mindforge/skills/osint-investigation/SKILL.md +288 -0
- package/.mindforge/skills/oss-forensics/SKILL.md +421 -0
- package/.mindforge/skills/pixel-art/SKILL.md +228 -0
- package/.mindforge/skills/plan/SKILL.md +350 -0
- package/.mindforge/skills/requesting-code-review/SKILL.md +292 -0
- package/.mindforge/skills/research-paper-writing/SKILL.md +2384 -0
- package/.mindforge/skills/scrapling/SKILL.md +345 -0
- package/.mindforge/skills/sherlock/SKILL.md +203 -0
- package/.mindforge/skills/simplify-code/SKILL.md +187 -0
- package/.mindforge/skills/spike/SKILL.md +209 -0
- package/.mindforge/skills/subagent-driven-development/SKILL.md +364 -0
- package/.mindforge/skills/systematic-debugging/SKILL.md +379 -0
- package/.mindforge/skills/test-driven-development/SKILL.md +355 -0
- package/.mindforge/skills/web-pentest/SKILL.md +327 -0
- package/CHANGELOG.md +71 -0
- package/MINDFORGE.md +2 -2
- package/README.md +72 -3
- package/RELEASENOTES.md +109 -0
- package/bin/installer-core.js +6 -2
- package/bin/mindforge-cli.js +7 -0
- package/bin/workflows/workflow-runner.js +110 -0
- package/docs/commands-reference.md +25 -0
- package/docs/getting-started.md +42 -5
- package/package.json +2 -1
|
@@ -0,0 +1,275 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: arxiv
|
|
3
|
+
description: "Search arXiv papers by keyword, author, category, or ID."
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# arXiv Research
|
|
8
|
+
|
|
9
|
+
Search and retrieve academic papers from arXiv via their free REST API. No API key, no dependencies — just curl.
|
|
10
|
+
|
|
11
|
+
## Quick Reference
|
|
12
|
+
|
|
13
|
+
| Action | Command |
|--------|---------|
| Search papers | `curl "https://export.arxiv.org/api/query?search_query=all:QUERY&max_results=5"` |
| Get specific paper | `curl "https://export.arxiv.org/api/query?id_list=2402.03300"` |
| Read abstract (web) | `web_extract(urls=["https://arxiv.org/abs/2402.03300"])` |
| Read full paper (PDF) | `web_extract(urls=["https://arxiv.org/pdf/2402.03300"])` |
|
|
19
|
+
|
|
20
|
+
## Searching Papers
|
|
21
|
+
|
|
22
|
+
The API returns Atom XML. Parse with `grep`/`sed` or pipe through `python3` for clean output.
|
|
23
|
+
|
|
24
|
+
### Basic search
|
|
25
|
+
|
|
26
|
+
```bash
|
|
27
|
+
curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5"
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
### Clean output (parse XML to readable format)
|
|
31
|
+
|
|
32
|
+
```bash
curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5&sortBy=submittedDate&sortOrder=descending" | python3 -c "
import sys, xml.etree.ElementTree as ET
# Atom namespace used by every element in the feed
ns = {'a': 'http://www.w3.org/2005/Atom'}
root = ET.parse(sys.stdin).getroot()
for i, entry in enumerate(root.findall('a:entry', ns)):
    # Titles can wrap across lines in the feed; collapse them
    title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
    # The <id> element is a URL; keep only the part after /abs/
    arxiv_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
    published = entry.find('a:published', ns).text[:10]
    authors = ', '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
    # Truncate the abstract to its first 200 characters
    summary = entry.find('a:summary', ns).text.strip()[:200]
    cats = ', '.join(c.get('term') for c in entry.findall('a:category', ns))
    print(f'{i+1}. [{arxiv_id}] {title}')
    print(f'   Authors: {authors}')
    print(f'   Published: {published} | Categories: {cats}')
    print(f'   Abstract: {summary}...')
    print(f'   PDF: https://arxiv.org/pdf/{arxiv_id}')
    print()
"
```
|
|
52
|
+
|
|
53
|
+
## Search Query Syntax
|
|
54
|
+
|
|
55
|
+
| Prefix | Searches | Example |
|--------|----------|---------|
| `all:` | All fields | `all:transformer+attention` |
| `ti:` | Title | `ti:large+language+models` |
| `au:` | Author | `au:vaswani` |
| `abs:` | Abstract | `abs:reinforcement+learning` |
| `cat:` | Category | `cat:cs.AI` |
| `co:` | Comment | `co:accepted+NeurIPS` |
|
|
63
|
+
|
|
64
|
+
### Boolean operators
|
|
65
|
+
|
|
66
|
+
```
|
|
67
|
+
# AND (default when using +)
|
|
68
|
+
search_query=all:transformer+attention
|
|
69
|
+
|
|
70
|
+
# OR
|
|
71
|
+
search_query=all:GPT+OR+all:BERT
|
|
72
|
+
|
|
73
|
+
# AND NOT
|
|
74
|
+
search_query=all:language+model+ANDNOT+all:vision
|
|
75
|
+
|
|
76
|
+
# Exact phrase (double quotes URL-encoded as %22 so they survive inside a quoted curl URL)
search_query=ti:%22chain+of+thought%22
|
|
78
|
+
|
|
79
|
+
# Combined
|
|
80
|
+
search_query=au:hinton+AND+cat:cs.LG
|
|
81
|
+
```
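
The query syntax above can also be assembled programmatically instead of by hand. The following is a minimal sketch, assuming Python's standard `urllib.parse`; the helper name `build_search_url` and its parameters are hypothetical and not part of this skill. Multi-word values are wrapped in double quotes, which `urlencode` emits as `%22`, and spaces become `+`.

```python
from urllib.parse import urlencode

ARXIV_API = "https://export.arxiv.org/api/query"


def build_search_url(terms, operator="AND", max_results=5):
    """Join (prefix, text) pairs with a Boolean operator and URL-encode them.

    Hypothetical helper for illustration; `terms` is a list such as
    [("au", "hinton"), ("cat", "cs.LG")].
    """
    clauses = []
    for prefix, text in terms:
        # Quote multi-word values so they are searched as an exact phrase.
        value = f'"{text}"' if " " in text else text
        clauses.append(f"{prefix}:{value}")
    query = f" {operator} ".join(clauses)
    # urlencode turns spaces into '+', ':' into %3A and '"' into %22.
    params = urlencode({"search_query": query, "max_results": max_results})
    return f"{ARXIV_API}?{params}"


if __name__ == "__main__":
    print(build_search_url([("au", "hinton"), ("cat", "cs.LG")]))
    print(build_search_url([("ti", "chain of thought")]))
```

The colon is percent-encoded as `%3A` here, which the server decodes back to `:`; the hand-written examples leave it unencoded, and both forms describe the same query.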
|
|
82
|
+
|
|
83
|
+
## Sort and Pagination
|
|
84
|
+
|
|
85
|
+
| Parameter | Options |
|-----------|---------|
| `sortBy` | `relevance`, `lastUpdatedDate`, `submittedDate` |
| `sortOrder` | `ascending`, `descending` |
| `start` | Result offset (0-based) |
| `max_results` | Number of results (default 10; max 30000 overall, fetched in slices of at most 2000 per request) |
|
|
91
|
+
|
|
92
|
+
```bash
|
|
93
|
+
# Latest 10 papers in cs.AI
|
|
94
|
+
curl -s "https://export.arxiv.org/api/query?search_query=cat:cs.AI&sortBy=submittedDate&sortOrder=descending&max_results=10"
|
|
95
|
+
```
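
To walk through a larger result set, `start` and `max_results` are advanced together. The sketch below only builds the page URLs and performs no network calls; the function name `page_urls` and the page size of 100 are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

ARXIV_API = "https://export.arxiv.org/api/query"


def page_urls(search_query, total, page_size=100):
    """Yield one query URL per page, advancing the 0-based `start` offset.

    Hypothetical helper: `total` is how many results the caller wants overall.
    """
    for start in range(0, total, page_size):
        params = {
            "search_query": search_query,
            "sortBy": "submittedDate",
            "sortOrder": "descending",
            "start": start,
            # The last page may be shorter than page_size.
            "max_results": min(page_size, total - start),
        }
        yield f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    for url in page_urls("cat:cs.AI", total=250):
        print(url)
```

Each URL can then be fetched in turn, pausing between requests to stay within the arXiv rate limit.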
|
|
96
|
+
|
|
97
|
+
## Fetching Specific Papers
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
# By arXiv ID
|
|
101
|
+
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300"
|
|
102
|
+
|
|
103
|
+
# Multiple papers
|
|
104
|
+
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300,2401.12345,2403.00001"
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
## BibTeX Generation
|
|
108
|
+
|
|
109
|
+
After fetching metadata for a paper, generate a BibTeX entry:
|
|
110
|
+
|
|
111
|
+
{% raw %}
|
|
112
|
+
```bash
curl -s "https://export.arxiv.org/api/query?id_list=1706.03762" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom', 'arxiv': 'http://arxiv.org/schemas/atom'}
root = ET.parse(sys.stdin).getroot()
entry = root.find('a:entry', ns)
if entry is None: sys.exit('Paper not found')
title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
authors = ' and '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
year = entry.find('a:published', ns).text[:4]
raw_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
cat = entry.find('arxiv:primary_category', ns)
primary = cat.get('term') if cat is not None else 'cs.LG'
last_name = entry.find('a:author', ns).find('a:name', ns).text.split()[-1]
print(f'@article{{{last_name}{year}_{raw_id.replace(\".\", \"\")},')
print(f'  title = {{{title}}},')
print(f'  author = {{{authors}}},')
print(f'  year = {{{year}}},')
print(f'  eprint = {{{raw_id}}},')
print(f'  archivePrefix = {{arXiv}},')
print(f'  primaryClass = {{{primary}}},')
print(f'  url = {{https://arxiv.org/abs/{raw_id}}}')
print('}')
"
```
|
|
137
|
+
{% endraw %}
|
|
138
|
+
|
|
139
|
+
## Reading Paper Content
|
|
140
|
+
|
|
141
|
+
After finding a paper, read it:
|
|
142
|
+
|
|
143
|
+
```
|
|
144
|
+
# Abstract page (fast, metadata + abstract)
|
|
145
|
+
web_extract(urls=["https://arxiv.org/abs/2402.03300"])
|
|
146
|
+
|
|
147
|
+
# Full paper (PDF → markdown via Firecrawl)
|
|
148
|
+
web_extract(urls=["https://arxiv.org/pdf/2402.03300"])
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
For local PDF processing, see the `ocr-and-documents` skill.
|
|
152
|
+
|
|
153
|
+
## Common Categories
|
|
154
|
+
|
|
155
|
+
| Category | Field |
|----------|-------|
| `cs.AI` | Artificial Intelligence |
| `cs.CL` | Computation and Language (NLP) |
| `cs.CV` | Computer Vision |
| `cs.LG` | Machine Learning |
| `cs.CR` | Cryptography and Security |
| `stat.ML` | Machine Learning (Statistics) |
| `math.OC` | Optimization and Control |
| `physics.comp-ph` | Computational Physics |
|
|
165
|
+
|
|
166
|
+
Full list: https://arxiv.org/category_taxonomy
|
|
167
|
+
|
|
168
|
+
## Helper Script
|
|
169
|
+
|
|
170
|
+
The `scripts/search_arxiv.py` script handles XML parsing and provides clean output:
|
|
171
|
+
|
|
172
|
+
```bash
|
|
173
|
+
python scripts/search_arxiv.py "GRPO reinforcement learning"
|
|
174
|
+
python scripts/search_arxiv.py "transformer attention" --max 10 --sort date
|
|
175
|
+
python scripts/search_arxiv.py --author "Yann LeCun" --max 5
|
|
176
|
+
python scripts/search_arxiv.py --category cs.AI --sort date
|
|
177
|
+
python scripts/search_arxiv.py --id 2402.03300
|
|
178
|
+
python scripts/search_arxiv.py --id 2402.03300,2401.12345
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
No dependencies — uses only Python stdlib.
|
|
182
|
+
|
|
183
|
+
---
|
|
184
|
+
|
|
185
|
+
## Semantic Scholar (Citations, Related Papers, Author Profiles)
|
|
186
|
+
|
|
187
|
+
arXiv doesn't provide citation data or recommendations. Use the **Semantic Scholar API** for that — free, no key needed for basic use (1 req/sec), returns JSON.
|
|
188
|
+
|
|
189
|
+
### Get paper details + citations
|
|
190
|
+
|
|
191
|
+
```bash
|
|
192
|
+
# By arXiv ID
|
|
193
|
+
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300?fields=title,authors,citationCount,referenceCount,influentialCitationCount,year,abstract" | python3 -m json.tool
|
|
194
|
+
|
|
195
|
+
# By Semantic Scholar paper ID or DOI
|
|
196
|
+
curl -s "https://api.semanticscholar.org/graph/v1/paper/DOI:10.1234/example?fields=title,citationCount"
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
### Get citations OF a paper (who cited it)
|
|
200
|
+
|
|
201
|
+
```bash
|
|
202
|
+
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/citations?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
### Get references FROM a paper (what it cites)
|
|
206
|
+
|
|
207
|
+
```bash
|
|
208
|
+
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/references?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool
|
|
209
|
+
```
|
|
210
|
+
|
|
211
|
+
### Search papers (alternative to arXiv search, returns JSON)
|
|
212
|
+
|
|
213
|
+
```bash
|
|
214
|
+
curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=GRPO+reinforcement+learning&limit=5&fields=title,authors,year,citationCount,externalIds" | python3 -m json.tool
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
### Get paper recommendations
|
|
218
|
+
|
|
219
|
+
```bash
curl -s -X POST "https://api.semanticscholar.org/recommendations/v1/papers/" \
  -H "Content-Type: application/json" \
  -d '{"positivePaperIds": ["arXiv:2402.03300"], "negativePaperIds": []}' | python3 -m json.tool
```
|
|
224
|
+
|
|
225
|
+
### Author profile
|
|
226
|
+
|
|
227
|
+
```bash
|
|
228
|
+
curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=Yann+LeCun&fields=name,hIndex,citationCount,paperCount" | python3 -m json.tool
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
### Useful Semantic Scholar fields
|
|
232
|
+
|
|
233
|
+
`title`, `authors`, `year`, `abstract`, `citationCount`, `referenceCount`, `influentialCitationCount`, `isOpenAccess`, `openAccessPdf`, `fieldsOfStudy`, `publicationVenue`, `externalIds` (contains arXiv ID, DOI, etc.)
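
The field list is passed as a comma-separated `fields` parameter, as in the curl examples in this section. A small sketch of assembling those URLs follows; the helper name `paper_url` and its arguments are hypothetical, and only the endpoint paths already shown in this section are used.

```python
from urllib.parse import urlencode

S2_GRAPH = "https://api.semanticscholar.org/graph/v1"


def paper_url(arxiv_id, fields, relation=None, limit=None):
    """Build a Graph API URL for a paper identified by its arXiv ID.

    Hypothetical helper: `relation` may be "citations" or "references".
    """
    path = f"{S2_GRAPH}/paper/arXiv:{arxiv_id}"
    if relation:
        path += f"/{relation}"
    params = {"fields": ",".join(fields)}
    if limit is not None:
        params["limit"] = limit
    # safe="," keeps the commas in the field list readable.
    return f"{path}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(paper_url("2402.03300", ["title", "citationCount"]))
    print(paper_url("2402.03300", ["title", "year"], relation="citations", limit=10))
```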
|
|
234
|
+
|
|
235
|
+
---
|
|
236
|
+
|
|
237
|
+
## Complete Research Workflow
|
|
238
|
+
|
|
239
|
+
1. **Discover**: `python scripts/search_arxiv.py "your topic" --sort date --max 10`
|
|
240
|
+
2. **Assess impact**: `curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID?fields=citationCount,influentialCitationCount"`
|
|
241
|
+
3. **Read abstract**: `web_extract(urls=["https://arxiv.org/abs/ID"])`
|
|
242
|
+
4. **Read full paper**: `web_extract(urls=["https://arxiv.org/pdf/ID"])`
|
|
243
|
+
5. **Find related work**: `curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID/references?fields=title,citationCount&limit=20"`
|
|
244
|
+
6. **Get recommendations**: POST to Semantic Scholar recommendations endpoint
|
|
245
|
+
7. **Track authors**: `curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=NAME"`
|
|
246
|
+
|
|
247
|
+
## Rate Limits
|
|
248
|
+
|
|
249
|
+
| API | Rate | Auth |
|
|
250
|
+
|-----|------|------|
|
|
251
|
+
| arXiv | ~1 req / 3 seconds | None needed |
|
|
252
|
+
| Semantic Scholar | 1 req / second | None (100/sec with API key) |
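
One way to respect these limits in a script is a small client-side throttle that sleeps until the minimum interval has passed. This is a sketch under stated assumptions: the `Throttle` class is hypothetical, and the intervals (3 seconds for arXiv, 1 second for Semantic Scholar) are taken from the table above.

```python
import time


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Intervals taken from the rate-limit table.
arxiv_throttle = Throttle(3.0)
semantic_scholar_throttle = Throttle(1.0)
```

Calling `arxiv_throttle.wait()` immediately before each arXiv request keeps consecutive requests at least three seconds apart.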
|
|
253
|
+
|
|
254
|
+
## Notes
|
|
255
|
+
|
|
256
|
+
- arXiv returns Atom XML — use the helper script or parsing snippet for clean output
|
|
257
|
+
- Semantic Scholar returns JSON — pipe through `python3 -m json.tool` for readability
|
|
258
|
+
- arXiv IDs: old format (`hep-th/0601001`) vs new (`2402.03300`)
|
|
259
|
+
- PDF: `https://arxiv.org/pdf/{id}` — Abstract: `https://arxiv.org/abs/{id}`
|
|
260
|
+
- HTML (when available): `https://arxiv.org/html/{id}`
|
|
261
|
+
- For local PDF processing, see the `ocr-and-documents` skill
|
|
262
|
+
|
|
263
|
+
## ID Versioning
|
|
264
|
+
|
|
265
|
+
- `arxiv.org/abs/1706.03762` always resolves to the **latest** version
|
|
266
|
+
- `arxiv.org/abs/1706.03762v1` points to a **specific** immutable version
|
|
267
|
+
- When generating citations, preserve the version suffix you actually read to prevent citation drift (a later version may substantially change content)
|
|
268
|
+
- The API `<id>` field returns the versioned URL (e.g., `http://arxiv.org/abs/1706.03762v7`)
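
To keep the version you actually read, the versioned URL from the `<id>` field can be split into a base ID and a version number. A minimal sketch, assuming the URL shapes shown above; the function name `split_arxiv_id` is hypothetical.

```python
import re


def split_arxiv_id(id_url):
    """Return (base_id, version) from an Atom <id> URL.

    `version` is an int, or None when the URL carries no `vN` suffix.
    """
    raw = id_url.strip().split("/abs/")[-1]
    # Lazy base match followed by an optional trailing "v<digits>" suffix.
    match = re.fullmatch(r"(.+?)(?:v(\d+))?", raw)
    base, version = match.group(1), match.group(2)
    return base, (int(version) if version else None)


if __name__ == "__main__":
    print(split_arxiv_id("http://arxiv.org/abs/1706.03762v7"))
    print(split_arxiv_id("http://arxiv.org/abs/hep-th/0601001v2"))
```

Storing the pair lets a citation use `1706.03762v7` while lookups for the latest version use the bare `1706.03762`.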
|
|
269
|
+
|
|
270
|
+
## Withdrawn Papers
|
|
271
|
+
|
|
272
|
+
Papers can be withdrawn after submission. When this happens:
|
|
273
|
+
- The `<summary>` field contains a withdrawal notice (look for "withdrawn" or "retracted")
|
|
274
|
+
- Metadata fields may be incomplete
|
|
275
|
+
- Always check the summary before treating a result as a valid paper
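
The summary check can be automated when filtering parsed entries. The sketch below is a plain keyword heuristic built on the two markers named above, so it may also flag a legitimate paper whose abstract discusses retractions; the function name `looks_withdrawn` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"a": "http://www.w3.org/2005/Atom"}
MARKERS = ("withdrawn", "retracted")


def looks_withdrawn(entry):
    """Return True if the entry's <summary> mentions a withdrawal marker."""
    # findtext returns the default when the <summary> element is missing.
    summary = entry.findtext("a:summary", default="", namespaces=NS)
    return any(marker in summary.lower() for marker in MARKERS)


if __name__ == "__main__":
    feed = ET.fromstring(
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><summary>This paper has been withdrawn by the author.</summary></entry>"
        "<entry><summary>We study attention mechanisms.</summary></entry>"
        "</feed>"
    )
    for entry in feed.findall("a:entry", NS):
        print(looks_withdrawn(entry))
```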
|
|
@@ -0,0 +1,130 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: blogwatcher
|
|
3
|
+
description: "Monitor blogs and RSS/Atom feeds via blogwatcher-cli tool."
|
|
4
|
+
version: 2.0.0
|
|
5
|
+
prerequisites:
|
|
6
|
+
commands: [blogwatcher-cli]
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Blogwatcher
|
|
10
|
+
|
|
11
|
+
Track blog and RSS/Atom feed updates with the `blogwatcher-cli` tool. Supports automatic feed discovery, HTML scraping fallback, OPML import, and read/unread article management.
|
|
12
|
+
|
|
13
|
+
## Installation
|
|
14
|
+
|
|
15
|
+
Pick one method:
|
|
16
|
+
|
|
17
|
+
- **Go:** `go install github.com/JulienTant/blogwatcher-cli/cmd/blogwatcher-cli@latest`
|
|
18
|
+
- **Docker:** `docker run --rm -v blogwatcher-cli:/data ghcr.io/julientant/blogwatcher-cli`
|
|
19
|
+
- **Binary (Linux amd64):** `curl -sL https://github.com/JulienTant/blogwatcher-cli/releases/latest/download/blogwatcher-cli_linux_amd64.tar.gz | tar xz -C /usr/local/bin blogwatcher-cli`
|
|
20
|
+
- **Binary (Linux arm64):** `curl -sL https://github.com/JulienTant/blogwatcher-cli/releases/latest/download/blogwatcher-cli_linux_arm64.tar.gz | tar xz -C /usr/local/bin blogwatcher-cli`
|
|
21
|
+
- **Binary (macOS Apple Silicon):** `curl -sL https://github.com/JulienTant/blogwatcher-cli/releases/latest/download/blogwatcher-cli_darwin_arm64.tar.gz | tar xz -C /usr/local/bin blogwatcher-cli`
|
|
22
|
+
- **Binary (macOS Intel):** `curl -sL https://github.com/JulienTant/blogwatcher-cli/releases/latest/download/blogwatcher-cli_darwin_amd64.tar.gz | tar xz -C /usr/local/bin blogwatcher-cli`
|
|
23
|
+
|
|
24
|
+
All releases: https://github.com/JulienTant/blogwatcher-cli/releases
|
|
25
|
+
|
|
26
|
+
### Docker with persistent storage
|
|
27
|
+
|
|
28
|
+
By default the database lives at `~/.blogwatcher-cli/blogwatcher-cli.db`. In Docker this is lost on container restart. Use `BLOGWATCHER_DB` or a volume mount to persist it:
|
|
29
|
+
|
|
30
|
+
```bash
|
|
31
|
+
# Named volume (simplest)
|
|
32
|
+
docker run --rm -v blogwatcher-cli:/data -e BLOGWATCHER_DB=/data/blogwatcher-cli.db ghcr.io/julientant/blogwatcher-cli scan
|
|
33
|
+
|
|
34
|
+
# Host bind mount
|
|
35
|
+
docker run --rm -v /path/on/host:/data -e BLOGWATCHER_DB=/data/blogwatcher-cli.db ghcr.io/julientant/blogwatcher-cli scan
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
### Migrating from the original blogwatcher
|
|
39
|
+
|
|
40
|
+
If upgrading from `Hyaxia/blogwatcher`, move your database:
|
|
41
|
+
|
|
42
|
+
```bash
|
|
43
|
+
mv ~/.blogwatcher/blogwatcher.db ~/.blogwatcher-cli/blogwatcher-cli.db
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
The binary name changed from `blogwatcher` to `blogwatcher-cli`.
|
|
47
|
+
|
|
48
|
+
## Common Commands
|
|
49
|
+
|
|
50
|
+
### Managing blogs
|
|
51
|
+
|
|
52
|
+
- Add a blog: `blogwatcher-cli add "My Blog" https://example.com`
|
|
53
|
+
- Add with explicit feed: `blogwatcher-cli add "My Blog" https://example.com --feed-url https://example.com/feed.xml`
|
|
54
|
+
- Add with HTML scraping: `blogwatcher-cli add "My Blog" https://example.com --scrape-selector "article h2 a"`
|
|
55
|
+
- List tracked blogs: `blogwatcher-cli blogs`
|
|
56
|
+
- Remove a blog: `blogwatcher-cli remove "My Blog" --yes`
|
|
57
|
+
- Import from OPML: `blogwatcher-cli import subscriptions.opml`
|
|
58
|
+
|
|
59
|
+
### Scanning and reading
|
|
60
|
+
|
|
61
|
+
- Scan all blogs: `blogwatcher-cli scan`
|
|
62
|
+
- Scan one blog: `blogwatcher-cli scan "My Blog"`
|
|
63
|
+
- List unread articles: `blogwatcher-cli articles`
|
|
64
|
+
- List all articles: `blogwatcher-cli articles --all`
|
|
65
|
+
- Filter by blog: `blogwatcher-cli articles --blog "My Blog"`
|
|
66
|
+
- Filter by category: `blogwatcher-cli articles --category "Engineering"`
|
|
67
|
+
- Mark article read: `blogwatcher-cli read 1`
|
|
68
|
+
- Mark article unread: `blogwatcher-cli unread 1`
|
|
69
|
+
- Mark all read: `blogwatcher-cli read-all`
|
|
70
|
+
- Mark all read for a blog: `blogwatcher-cli read-all --blog "My Blog" --yes`
|
|
71
|
+
|
|
72
|
+
## Environment Variables
|
|
73
|
+
|
|
74
|
+
All flags can be set via environment variables with the `BLOGWATCHER_` prefix:
|
|
75
|
+
|
|
76
|
+
| Variable | Description |
|
|
77
|
+
|---|---|
|
|
78
|
+
| `BLOGWATCHER_DB` | Path to SQLite database file |
|
|
79
|
+
| `BLOGWATCHER_WORKERS` | Number of concurrent scan workers (default: 8) |
|
|
80
|
+
| `BLOGWATCHER_SILENT` | Only output "scan done" when scanning |
|
|
81
|
+
| `BLOGWATCHER_YES` | Skip confirmation prompts |
|
|
82
|
+
| `BLOGWATCHER_CATEGORY` | Default filter for articles by category |
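
When `blogwatcher-cli` is driven from a script, these variables can be set on the child process rather than exported globally. A minimal sketch, assuming Python's `os` module; the helper name `blogwatcher_env` is hypothetical, and only `BLOGWATCHER_DB` and `BLOGWATCHER_WORKERS` from the table above are used.

```python
import os


def blogwatcher_env(db_path=None, workers=None, base_env=None):
    """Return a copy of the environment with blogwatcher-cli settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    if db_path is not None:
        env["BLOGWATCHER_DB"] = str(db_path)
    if workers is not None:
        # Environment values must be strings.
        env["BLOGWATCHER_WORKERS"] = str(workers)
    return env


if __name__ == "__main__":
    env = blogwatcher_env("/data/blogwatcher-cli.db", workers=4)
    print(env["BLOGWATCHER_DB"], env["BLOGWATCHER_WORKERS"])
```

The returned dictionary can be passed as `env=` to `subprocess.run(["blogwatcher-cli", "scan"], env=env)`.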
|
|
83
|
+
|
|
84
|
+
## Example Output
|
|
85
|
+
|
|
86
|
+
```
|
|
87
|
+
$ blogwatcher-cli blogs
|
|
88
|
+
Tracked blogs (1):
|
|
89
|
+
|
|
90
|
+
xkcd
|
|
91
|
+
URL: https://xkcd.com
|
|
92
|
+
Feed: https://xkcd.com/atom.xml
|
|
93
|
+
Last scanned: 2026-04-03 10:30
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
```
|
|
97
|
+
$ blogwatcher-cli scan
|
|
98
|
+
Scanning 1 blog(s)...
|
|
99
|
+
|
|
100
|
+
xkcd
|
|
101
|
+
Source: RSS | Found: 4 | New: 4
|
|
102
|
+
|
|
103
|
+
Found 4 new article(s) total!
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
```
|
|
107
|
+
$ blogwatcher-cli articles
|
|
108
|
+
Unread articles (2):
|
|
109
|
+
|
|
110
|
+
[1] [new] Barrel - Part 13
|
|
111
|
+
Blog: xkcd
|
|
112
|
+
URL: https://xkcd.com/3095/
|
|
113
|
+
Published: 2026-04-02
|
|
114
|
+
Categories: Comics, Science
|
|
115
|
+
|
|
116
|
+
[2] [new] Volcano Fact
|
|
117
|
+
Blog: xkcd
|
|
118
|
+
URL: https://xkcd.com/3094/
|
|
119
|
+
Published: 2026-04-01
|
|
120
|
+
Categories: Comics
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
## Notes
|
|
124
|
+
|
|
125
|
+
- Auto-discovers RSS/Atom feeds from blog homepages when no `--feed-url` is provided.
|
|
126
|
+
- Falls back to HTML scraping if RSS fails and `--scrape-selector` is configured.
|
|
127
|
+
- Categories from RSS/Atom feeds are stored and can be used to filter articles.
|
|
128
|
+
- Import blogs in bulk from OPML files exported by Feedly, Inoreader, NewsBlur, etc.
|
|
129
|
+
- Database stored at `~/.blogwatcher-cli/blogwatcher-cli.db` by default (override with `--db` or `BLOGWATCHER_DB`).
|
|
130
|
+
- Use `blogwatcher-cli <command> --help` to discover all flags and options.
|