methods-mcp 0.1.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- methods_mcp-0.1.0/.claude/napkin.md +96 -0
- methods_mcp-0.1.0/.claude/settings.local.json +10 -0
- methods_mcp-0.1.0/.github/workflows/ci.yml +39 -0
- methods_mcp-0.1.0/.gitignore +43 -0
- methods_mcp-0.1.0/.python-version +1 -0
- methods_mcp-0.1.0/CHANGELOG.md +22 -0
- methods_mcp-0.1.0/LICENSE +21 -0
- methods_mcp-0.1.0/PKG-INFO +177 -0
- methods_mcp-0.1.0/README.md +136 -0
- methods_mcp-0.1.0/examples/__init__.py +0 -0
- methods_mcp-0.1.0/examples/claude_code_demo.md +54 -0
- methods_mcp-0.1.0/examples/reflexive_demo.py +134 -0
- methods_mcp-0.1.0/project-thoughts.md +74 -0
- methods_mcp-0.1.0/pyproject.toml +108 -0
- methods_mcp-0.1.0/src/methods_mcp/__init__.py +3 -0
- methods_mcp-0.1.0/src/methods_mcp/extraction/__init__.py +0 -0
- methods_mcp-0.1.0/src/methods_mcp/extraction/html.py +116 -0
- methods_mcp-0.1.0/src/methods_mcp/extraction/pdf.py +81 -0
- methods_mcp-0.1.0/src/methods_mcp/identifiers.py +91 -0
- methods_mcp-0.1.0/src/methods_mcp/llm.py +133 -0
- methods_mcp-0.1.0/src/methods_mcp/py.typed +0 -0
- methods_mcp-0.1.0/src/methods_mcp/schemas.py +167 -0
- methods_mcp-0.1.0/src/methods_mcp/server.py +223 -0
- methods_mcp-0.1.0/src/methods_mcp/sources/__init__.py +0 -0
- methods_mcp-0.1.0/src/methods_mcp/sources/arxiv.py +57 -0
- methods_mcp-0.1.0/src/methods_mcp/tools/__init__.py +0 -0
- methods_mcp-0.1.0/src/methods_mcp/tools/fetch.py +109 -0
- methods_mcp-0.1.0/src/methods_mcp/tools/methods.py +78 -0
- methods_mcp-0.1.0/src/methods_mcp/tools/repo.py +349 -0
- methods_mcp-0.1.0/src/methods_mcp/tools/summarize.py +60 -0
- methods_mcp-0.1.0/tests/__init__.py +0 -0
- methods_mcp-0.1.0/tests/conftest.py +39 -0
- methods_mcp-0.1.0/tests/fixtures/ar5iv_minimal.html +36 -0
- methods_mcp-0.1.0/tests/fixtures/arxiv_2410_01234.xml +20 -0
- methods_mcp-0.1.0/tests/fixtures/github_readme.json +6 -0
- methods_mcp-0.1.0/tests/fixtures/github_repo_meta.json +10 -0
- methods_mcp-0.1.0/tests/fixtures/github_tree.json +17 -0
- methods_mcp-0.1.0/tests/test_extraction_html.py +25 -0
- methods_mcp-0.1.0/tests/test_fetch_tools.py +52 -0
- methods_mcp-0.1.0/tests/test_identifiers.py +52 -0
- methods_mcp-0.1.0/tests/test_pdf_extraction.py +28 -0
- methods_mcp-0.1.0/tests/test_repo_heuristics.py +119 -0
- methods_mcp-0.1.0/tests/test_server_smoke.py +31 -0
- methods_mcp-0.1.0/uv.lock +1850 -0
@@ -0,0 +1,96 @@
# WWSF Challenge Napkin

Living doc for the Worldwide AI Science Fellowship (WWSF) Build Challenge — Flynn's tryout for the inaugural AI for Science fellowship (cohort runs Apr 15 → Jul 15, 2026).

---

## The Ask (one-screen)

**Challenge:** Build something new in a week using the latest AI tools. Ship a real demo. Share it in public.

**Deliverables (submit by email to Michael Raspuzzi):**
1. Build-in-public post on socials (X / LinkedIn / YouTube — long or short form, Flynn's choice)
2. Public GitHub link (code + docs)
3. 2–3 min Loom walking through process and links

**What they're grading on:**
1. Real demo — does it actually work
2. Try new things — reaching into unfamiliar tools / hardware / territory
3. Process + narration — how Flynn thinks and explains it

**Deadline:** 2026-04-20, 23:59 PT (received Apr 14 ~00:44 PT, so effectively 6.5 days)
**Time budget:** ≥6 hrs recommended; average fellowship cadence is 8–10 hrs/wk

**Response owed:** reply "Challenge accepted." to lock it in.

## Example prompts they floated
- OpenClaw agent on a Digital Ocean droplet ($12) + dashboard, or LabClaw literature review
- Claude Code designing a robotic protocol (Opentrons Flex sim)
- Arduino/ESP32 + sensor (temp, pH, turbidity) → MQTT/serial → live dashboard with threshold alerts
- Fork an OpenSCAD/STEP file from Open Labware, parametrically modify, render, publish

(These are illustrative, not prescriptive — "figure it out" is the vibe.)

## Flynn's edge — use this
- **Dr. Flynn Lachendro, PhD in Biomedical Engineering.** This is literally "AI for Science" — lean into the science side, not just the engineering side.
- Already ships AI products fast (Arbor = research workbench; Lattice = AI research intel dashboard; Layers = pop-sci mobile; Model Collapse = prompt-engineering roguelike).
- Strongest stack: **Python + FastAPI + SQLAlchemy async + Pydantic + React/Next + TS**. Uses Anthropic / OpenAI / Gemini / OpenRouter regularly.
- Background in research pipelines (arXiv, Semantic Scholar, paper summarization, ingestion crons) → natural fit for LabClaw-style science agents.

## What "try new things" means here
To avoid this being "Flynn ships another web app," reach into at least one of:
- Real lab hardware / simulators (Opentrons Flex sim, OpenClaw, OT-2 protocol API)
- Physical-world I/O (ESP32 / Arduino / Pi, sensors, MQTT)
- Agentic lab workflows (LabClaw skills, MCP servers for scientific tooling)
- Parametric CAD / Open Labware (OpenSCAD, STEP manipulation)
- World models or VLA (vision-language-action) perception

---

## Flynn Preferences (general, durable)

**Identity**
- Dr. Flynn Lachendro — PhD Biomedical Engineering, AI engineer/researcher, Founding Engineer at Nur Opus (Prism). Faculty / OpenKit background. Focus: multi-agent systems, LLM eval, AI safety.

**Collaboration style**
- **Always wait for explicit go-ahead** before git commit/push, PRs, or any irreversible action. "Should I do X?" ≠ "do X."
- No destructive-first instincts. `rm -rf`, recreating environments, `--no-verify` — only as a last resort, after alternatives fail.
- No legacy wrappers / backwards-compat shims for internal refactors. Update call sites directly.
- Commit messages: short, single-line. No `Co-Authored-By`, no multi-paragraph bodies.

**Explanation style (two modes)**
- **Flynn style:** super concise, 2 ways to explain, simple analogies, minimal code, get the knack across. Used when Flynn says "explain Flynn style."
- **Claude style:** full clean written explanation in the response, then call the `say_tts` MCP (default voice) to read it aloud. Used when Flynn says "explain Claude style" or "read out Claude style."

**Package / tooling**
- `uv` for all Python. Avoid venvs.
- `ruff format .` → `ruff check . --fix` → `mypy <pkg>` → `pytest -v` is the "Linting, Typing and Checking" sequence.
- Stack defaults: FastAPI + SQLAlchemy 2.0 async + Pydantic + loguru (BE). Next.js + React + TS + Tailwind (web). Expo + RN + Reanimated (mobile). Supabase (auth + Postgres). Vercel (FE) + Railway (BE).

**Code snippets**
- When showing DB models / architectural primitives: full type annotations, both sides of relationships, `__tablename__` included. No pseudocode for real DB models.

**Root-markdown refresh**
- When Flynn says "refresh the root markdowns": review root `.md`s against current code, surgical updates (preserve style/tone), don't create new ones, report what changed.

**Granola**
- "Look through my Granola" → read `~/Library/Application Support/Granola/cache-v6.json`. `cache.state.documents` = meetings, `cache.state.transcripts` = full text keyed by doc ID.

---

## Build Log
*(fill in as we go — what was tried, what worked, what got cut, links to commits / tweets / Loom)*

- 2026-04-14 — Challenge received. Napkin set up. Planning session in progress.
- 2026-04-14 — Plan locked: build a FastMCP server primitive, demo via Claude Code live, audience = AI-for-science community (cross-domain).
- 2026-04-14 — Pivot: `paper-mcp` already exists on PyPI (Bhvaik). Paper2Agent (Stanford) does batch methods+repro. **Wedge: lightweight, on-demand, real-time MCP for methods extraction + reproducibility *heuristics*** — no full re-execution required. Package name: **`methods-mcp`**.

## Open Questions / Decisions
*(TBD after the plan-mode interview)*

## Submission Checklist
- [ ] Reply "Challenge accepted." to Michael Raspuzzi
- [ ] GitHub repo public with README / process docs
- [ ] Build-in-public post drafted + posted (X / LinkedIn / YouTube)
- [ ] 2–3 min Loom recorded + link ready
- [ ] Submission email sent to Michael by **2026-04-20 23:59 PT**
@@ -0,0 +1,39 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
          enable-cache: true

      - name: Set up Python ${{ matrix.python-version }}
        run: uv python install ${{ matrix.python-version }}

      - name: Install dependencies
        run: uv sync --extra dev --extra agent --python ${{ matrix.python-version }}

      - name: Ruff format check
        run: uv run ruff format --check .

      - name: Ruff lint
        run: uv run ruff check .

      - name: Mypy
        run: uv run mypy src

      - name: Pytest
        run: uv run pytest tests -v
@@ -0,0 +1,43 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
*.egg
build/
dist/
wheels/
.venv/
venv/
env/

# uv
uv.lock.bak

# Testing / coverage
.pytest_cache/
.coverage
htmlcov/
.tox/
.nox/
.mypy_cache/
.ruff_cache/

# IDE
.vscode/
.idea/
*.swp
*.swo
.DS_Store

# Env / secrets
.env
.env.local
.env.*.local
*.pem

# Project-specific
fixtures_cache/
*.local.md
@@ -0,0 +1 @@
3.11
@@ -0,0 +1,22 @@
# Changelog

## 0.1.0 — 2026-04-14

Initial release. Built for the Worldwide AI Science Fellowship build challenge.

### Tools shipped

- `health` — server liveness / config check
- `get_paper_metadata` — URL/ID/DOI → canonical metadata
- `fetch_paper_text` — ar5iv HTML primary, pypdf fallback
- `extract_methods` — LLM + Pydantic structured methods
- `find_code_repo` — paper-text → abstract → Papers With Code
- `assess_repo_reproducibility` — heuristic, no-clone, GitHub REST API
- `summarize_paper` — three modes (abstract / tldr / exec)
- `methods_repro_review` — composite tool

### Notes

- Default model: `claude-sonnet-4-6` (override with `METHODS_MCP_MODEL` env var).
- All tool returns are Pydantic v2 models.
- 29 offline tests via respx mocks, mypy strict, ruff lint clean.
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Flynn Lachendro

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,177 @@
Metadata-Version: 2.4
Name: methods-mcp
Version: 0.1.0
Summary: Lightweight, on-demand MCP server for structured methods extraction and reproducibility heuristics on academic papers.
Project-URL: Homepage, https://github.com/FlynnLachendro/methods-mcp
Project-URL: Repository, https://github.com/FlynnLachendro/methods-mcp
Project-URL: Issues, https://github.com/FlynnLachendro/methods-mcp/issues
Author-email: Flynn Lachendro <flynnlachendro@hotmail.co.uk>
Maintainer-email: Flynn Lachendro <flynnlachendro@hotmail.co.uk>
License-Expression: MIT
License-File: LICENSE
Keywords: academic-papers,ai-for-science,anthropic,claude,fastmcp,mcp,model-context-protocol,reproducibility
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: anthropic>=0.40.0
Requires-Dist: fastmcp>=2.0.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: loguru>=0.7.0
Requires-Dist: pydantic>=2.6.0
Requires-Dist: pypdf>=4.0.0
Requires-Dist: selectolax>=0.3.20
Provides-Extra: agent
Requires-Dist: claude-agent-sdk>=0.1.0; extra == 'agent'
Provides-Extra: dev
Requires-Dist: mypy>=1.10.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: respx>=0.21.0; extra == 'dev'
Requires-Dist: ruff>=0.6.0; extra == 'dev'
Description-Content-Type: text/markdown

# methods-mcp

[![PyPI version](https://img.shields.io/pypi/v/methods-mcp)](https://pypi.org/project/methods-mcp/)
[![Python versions](https://img.shields.io/pypi/pyversions/methods-mcp)](https://pypi.org/project/methods-mcp/)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue)](LICENSE)

> Lightweight, on-demand MCP server for **structured methods extraction** + **reproducibility heuristics** on academic papers. Built for the [Worldwide AI Science Fellowship](https://www.aisciencesummit.com/) build challenge.

`methods-mcp` is a small, sharply scoped [Model Context Protocol](https://modelcontextprotocol.io) server. It gives any AI agent (Claude Code, Claude Desktop, your Agent SDK script, etc.) eight tools that turn an academic paper URL into:

- canonical metadata,
- best-effort full text + section split,
- a **Pydantic-validated structured methods object** (steps / reagents / equipment / analyses),
- the paper's associated **code repository** (best-effort discovery),
- a **no-execution-required reproducibility verdict** for that repo, and
- a multi-mode summary.

The wedge: heavyweight pipelines like [Paper2Agent](https://arxiv.org/abs/2509.06917) (Stanford) take 30 minutes to hours to digest a paper into agent-ready tools. `methods-mcp` is the **agent-callable, on-demand** complement — every tool returns in seconds, no clone, no execution.

---

## Install

```bash
uv add methods-mcp
# or, install globally:
uv tool install methods-mcp
# or, classic pip:
pip install methods-mcp
```

Set your Anthropic API key (used by the LLM-driven extraction tools):

```bash
export ANTHROPIC_API_KEY=sk-ant-...
# Optional — raises GitHub REST API rate limit for repro assessment:
export GITHUB_TOKEN=ghp_...
```

## Use it from Claude Code

```
/mcp add methods-mcp methods-mcp
```

Then in any Claude Code chat:

> Take https://arxiv.org/abs/2509.06917 and run `methods_repro_review`. Summarise what the paper does, the methods steps, and how reproducible the repo looks.

See [`examples/claude_code_demo.md`](examples/claude_code_demo.md) for more session prompts.

## Use it from the Claude Agent SDK

```python
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

options = ClaudeAgentOptions(
    mcp_servers={
        "methods-mcp": {
            "type": "stdio",
            "command": "methods-mcp",
            "args": [],
        }
    },
    allowed_tools=["mcp__methods-mcp__methods_repro_review"],
)


async def main() -> None:
    async with ClaudeSDKClient(options=options) as client:
        await client.query(
            "Run methods_repro_review on https://arxiv.org/abs/2509.06917 "
            "and tell me whether the repo looks reproducible."
        )
        async for msg in client.receive_response():
            print(msg)


asyncio.run(main())
```

A complete runnable example lives in [`examples/reflexive_demo.py`](examples/reflexive_demo.py).

## Tools

| Tool | What it does |
|---|---|
| `health` | Server liveness + config check. |
| `get_paper_metadata(input_str)` | Resolve URL / arXiv ID / DOI to canonical metadata. arXiv inputs hit the arXiv export API for title/authors/abstract. |
| `fetch_paper_text(input_str, prefer="auto"\|"html"\|"pdf")` | Full text + section split. Defaults to ar5iv HTML for arXiv papers (cheap, structured), PDF fallback otherwise. |
| `extract_methods(input_str, model=None)` | LLM-driven, Pydantic-validated structured methods extraction. Returns `{steps, reagents, equipment, analyses, confidence}`. |
| `find_code_repo(input_str)` | Discover the paper's code repo via paper text → abstract → Papers With Code. |
| `assess_repo_reproducibility(repo_url, paper_id=None)` | Heuristic, no-clone reproducibility assessment via the GitHub REST API. Weighted signals (README, deps, fixtures, notebooks, figure scripts, recent maintenance, license) → `{verdict, score, recommended_entrypoint}`. |
| `summarize_paper(input_str, mode="tldr"\|"abstract"\|"exec")` | LLM summary in three depths. |
| `methods_repro_review(input_str)` | Composite — metadata + methods + repo + repro in one call. |

All tools return Pydantic v2 models (validated, JSON-serialisable). See [`src/methods_mcp/schemas.py`](src/methods_mcp/schemas.py) for the full type surface.

## Design notes

- **`extract_methods` uses Anthropic tool-use to coerce the model into emitting an instance of the `MethodsStructured` Pydantic schema.** On validation failure we send one repair message with the validation error and try again before raising.
- **`assess_repo_reproducibility` does not clone or execute anything.** It scores the repo from publicly-readable GitHub metadata + the recursive tree listing. This is the deliberate wedge against batch tools that try to actually rerun the paper.
- **`fetch_paper_text` prefers ar5iv HTML over PDF parsing for arXiv papers.** Falls back to `pypdf` for non-arXiv inputs.
- **The default model is `claude-sonnet-4-6`.** Override via `METHODS_MCP_MODEL` env var or per-call `model=` arg.

## Pair with `paper-mcp`

For broader paper search / citation graph tooling, run [`paper-mcp` (Bhvaik)](https://pypi.org/project/paper-mcp/) alongside in the same Claude Code session. `paper-mcp` does title-keyed search, full-text fetch, citations, and references; `methods-mcp` adds the structured-methods + reproducibility layer on top. The two were intentionally designed to compose.

## Develop locally

```bash
git clone https://github.com/FlynnLachendro/methods-mcp
cd methods-mcp
uv sync --extra dev --extra agent

uv run pytest        # 29 tests, offline
uv run ruff format .
uv run ruff check . --fix
uv run mypy src

uv run methods-mcp --help
```

## Project notes

`project-thoughts.md` (in this repo) is a running log of what we tried, what stuck, and what we cut while building this. Honest write-up for the WWSF Loom narration.

## License

MIT — see [`LICENSE`](LICENSE).

## Acknowledgements

Built for the [Worldwide AI Science Fellowship](https://www.aisciencesummit.com/) inaugural cohort. Thanks to Michael Raspuzzi for the open-ended brief.

Built on:
- [FastMCP](https://github.com/jlowin/fastmcp) — the MCP server scaffold.
- [Claude Agent SDK](https://github.com/anthropics/claude-agent-sdk-python) — the agent loop in the demo.
- [ar5iv.labs.arxiv.org](https://ar5iv.labs.arxiv.org/) — clean HTML for arXiv papers.
- [Anthropic Claude](https://platform.claude.com) — the LLM behind structured extraction.
@@ -0,0 +1,136 @@
# methods-mcp

[![PyPI version](https://img.shields.io/pypi/v/methods-mcp)](https://pypi.org/project/methods-mcp/)
[![Python versions](https://img.shields.io/pypi/pyversions/methods-mcp)](https://pypi.org/project/methods-mcp/)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue)](LICENSE)

> Lightweight, on-demand MCP server for **structured methods extraction** + **reproducibility heuristics** on academic papers. Built for the [Worldwide AI Science Fellowship](https://www.aisciencesummit.com/) build challenge.

`methods-mcp` is a small, sharply scoped [Model Context Protocol](https://modelcontextprotocol.io) server. It gives any AI agent (Claude Code, Claude Desktop, your Agent SDK script, etc.) eight tools that turn an academic paper URL into:

- canonical metadata,
- best-effort full text + section split,
- a **Pydantic-validated structured methods object** (steps / reagents / equipment / analyses),
- the paper's associated **code repository** (best-effort discovery),
- a **no-execution-required reproducibility verdict** for that repo, and
- a multi-mode summary.

The wedge: heavyweight pipelines like [Paper2Agent](https://arxiv.org/abs/2509.06917) (Stanford) take 30 minutes to hours to digest a paper into agent-ready tools. `methods-mcp` is the **agent-callable, on-demand** complement — every tool returns in seconds, no clone, no execution.

---

## Install

```bash
uv add methods-mcp
# or, install globally:
uv tool install methods-mcp
# or, classic pip:
pip install methods-mcp
```

Set your Anthropic API key (used by the LLM-driven extraction tools):

```bash
export ANTHROPIC_API_KEY=sk-ant-...
# Optional — raises GitHub REST API rate limit for repro assessment:
export GITHUB_TOKEN=ghp_...
```

## Use it from Claude Code

```
/mcp add methods-mcp methods-mcp
```

Then in any Claude Code chat:

> Take https://arxiv.org/abs/2509.06917 and run `methods_repro_review`. Summarise what the paper does, the methods steps, and how reproducible the repo looks.

See [`examples/claude_code_demo.md`](examples/claude_code_demo.md) for more session prompts.

## Use it from the Claude Agent SDK

```python
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

options = ClaudeAgentOptions(
    mcp_servers={
        "methods-mcp": {
            "type": "stdio",
            "command": "methods-mcp",
            "args": [],
        }
    },
    allowed_tools=["mcp__methods-mcp__methods_repro_review"],
)


async def main() -> None:
    async with ClaudeSDKClient(options=options) as client:
        await client.query(
            "Run methods_repro_review on https://arxiv.org/abs/2509.06917 "
            "and tell me whether the repo looks reproducible."
        )
        async for msg in client.receive_response():
            print(msg)


asyncio.run(main())
```

A complete runnable example lives in [`examples/reflexive_demo.py`](examples/reflexive_demo.py).

## Tools

| Tool | What it does |
|---|---|
| `health` | Server liveness + config check. |
| `get_paper_metadata(input_str)` | Resolve URL / arXiv ID / DOI to canonical metadata. arXiv inputs hit the arXiv export API for title/authors/abstract. |
| `fetch_paper_text(input_str, prefer="auto"\|"html"\|"pdf")` | Full text + section split. Defaults to ar5iv HTML for arXiv papers (cheap, structured), PDF fallback otherwise. |
| `extract_methods(input_str, model=None)` | LLM-driven, Pydantic-validated structured methods extraction. Returns `{steps, reagents, equipment, analyses, confidence}`. |
| `find_code_repo(input_str)` | Discover the paper's code repo via paper text → abstract → Papers With Code. |
| `assess_repo_reproducibility(repo_url, paper_id=None)` | Heuristic, no-clone reproducibility assessment via the GitHub REST API. Weighted signals (README, deps, fixtures, notebooks, figure scripts, recent maintenance, license) → `{verdict, score, recommended_entrypoint}`. |
| `summarize_paper(input_str, mode="tldr"\|"abstract"\|"exec")` | LLM summary in three depths. |
| `methods_repro_review(input_str)` | Composite — metadata + methods + repo + repro in one call. |

All tools return Pydantic v2 models (validated, JSON-serialisable). See [`src/methods_mcp/schemas.py`](src/methods_mcp/schemas.py) for the full type surface.
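The URL / arXiv ID / DOI resolution behind `get_paper_metadata` can be sketched roughly like this (a stdlib-only sketch, not the actual `identifiers.py` implementation; the patterns cover only the common modern arXiv ID and DOI shapes):

```python
import re
from urllib.parse import urlparse

ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?$")
DOI = re.compile(r"(10\.\d{4,9}/\S+)$")


def resolve(input_str: str) -> tuple[str, str]:
    """Classify an input as ('arxiv' | 'doi' | 'url', canonical identifier)."""
    s = input_str.strip()
    if s.startswith(("http://", "https://")):
        if "arxiv.org" in s:
            # Works for /abs/<id>, /pdf/<id>, and versioned IDs.
            m = ARXIV_ID.search(urlparse(s).path.removesuffix(".pdf"))
            if m:
                return "arxiv", m.group(1)
        if "doi.org" in s:
            m = DOI.search(s)
            if m:
                return "doi", m.group(1)
        return "url", s
    m = ARXIV_ID.fullmatch(s)
    if m:
        return "arxiv", m.group(1)  # bare ID, version suffix stripped
    if DOI.fullmatch(s):
        return "doi", s
    return "url", s
```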

## Design notes

- **`extract_methods` uses Anthropic tool-use to coerce the model into emitting an instance of the `MethodsStructured` Pydantic schema.** On validation failure we send one repair message with the validation error and try again before raising.
- **`assess_repo_reproducibility` does not clone or execute anything.** It scores the repo from publicly-readable GitHub metadata + the recursive tree listing. This is the deliberate wedge against batch tools that try to actually rerun the paper.
- **`fetch_paper_text` prefers ar5iv HTML over PDF parsing for arXiv papers.** Falls back to `pypdf` for non-arXiv inputs.
- **The default model is `claude-sonnet-4-6`.** Override via `METHODS_MCP_MODEL` env var or per-call `model=` arg.
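The validate-then-repair loop in the first note can be sketched generically (a minimal sketch: `call_llm` and `validate` stand in for the Anthropic tool-use call and the Pydantic validation; the single-retry policy is the one described above):

```python
from collections.abc import Callable
from typing import Any


def extract_with_repair(
    call_llm: Callable[[list[dict[str, str]]], dict[str, Any]],
    validate: Callable[[dict[str, Any]], Any],
    prompt: str,
) -> Any:
    """Ask once; on validation failure, send one repair message, then raise."""
    messages = [{"role": "user", "content": prompt}]
    raw = call_llm(messages)
    try:
        # Pydantic's ValidationError subclasses ValueError, so this catch works.
        return validate(raw)
    except ValueError as err:
        # One repair round: feed the validation error back to the model.
        messages += [
            {"role": "assistant", "content": str(raw)},
            {"role": "user", "content": f"Schema validation failed: {err}. Emit a corrected object."},
        ]
        return validate(call_llm(messages))  # a second failure propagates
```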

## Pair with `paper-mcp`

For broader paper search / citation graph tooling, run [`paper-mcp` (Bhvaik)](https://pypi.org/project/paper-mcp/) alongside in the same Claude Code session. `paper-mcp` does title-keyed search, full-text fetch, citations, and references; `methods-mcp` adds the structured-methods + reproducibility layer on top. The two were intentionally designed to compose.

## Develop locally

```bash
git clone https://github.com/FlynnLachendro/methods-mcp
cd methods-mcp
uv sync --extra dev --extra agent

uv run pytest        # 29 tests, offline
uv run ruff format .
uv run ruff check . --fix
uv run mypy src

uv run methods-mcp --help
```

## Project notes

`project-thoughts.md` (in this repo) is a running log of what we tried, what stuck, and what we cut while building this. Honest write-up for the WWSF Loom narration.

## License

MIT — see [`LICENSE`](LICENSE).

## Acknowledgements

Built for the [Worldwide AI Science Fellowship](https://www.aisciencesummit.com/) inaugural cohort. Thanks to Michael Raspuzzi for the open-ended brief.

Built on:
- [FastMCP](https://github.com/jlowin/fastmcp) — the MCP server scaffold.
- [Claude Agent SDK](https://github.com/anthropics/claude-agent-sdk-python) — the agent loop in the demo.
- [ar5iv.labs.arxiv.org](https://ar5iv.labs.arxiv.org/) — clean HTML for arXiv papers.
- [Anthropic Claude](https://platform.claude.com) — the LLM behind structured extraction.

File without changes
@@ -0,0 +1,54 @@
# Using methods-mcp from Claude Code

This is the path the WWSF Loom demos: install the server, point Claude Code at it, ask a science question.

## 1. Install

```bash
uv add methods-mcp    # in any uv-managed project
# or globally:
uv tool install methods-mcp
```

## 2. Tell Claude Code about the server

In a Claude Code session:

```
/mcp add methods-mcp methods-mcp
```

(Form: `/mcp add <name> <command>`. The second `methods-mcp` is the console-script binary, which lives on your PATH after `uv tool install` or inside any project that ran `uv add methods-mcp` + `uv sync`.)

Set your Anthropic key (used for the LLM-driven extraction tools):

```bash
export ANTHROPIC_API_KEY=sk-ant-...
# Optional, raises GitHub API rate limit for repro assessment:
export GITHUB_TOKEN=ghp_...
```

## 3. Ask Claude Code a question that uses the tools

Try one of these:

> Take https://arxiv.org/abs/2509.06917 and run methods_repro_review. Summarise what the paper does, the methods steps, and how reproducible the repo looks.

> Compare the reproducibility of the code repos for these three papers: <url1>, <url2>, <url3>. Which has the strongest reproducibility signals?

> Pull the structured methods from this protocol paper [URL] and turn it into a checklist I could hand to a colleague.

Watch the tool calls fire in the Claude Code panel — that's the demo.

## 4. Tools exposed

| Tool | What it does |
|---|---|
| `health` | Liveness + config check. |
| `get_paper_metadata` | Resolve URL/ID/DOI to canonical metadata. |
| `fetch_paper_text` | Best-effort full text + section split. |
| `extract_methods` | LLM + Pydantic structured methods extraction. |
| `find_code_repo` | Discover the paper's GitHub repo. |
| `assess_repo_reproducibility` | No-clone heuristic repro assessment. |
| `summarize_paper` | Three modes: abstract / tldr / exec. |
| `methods_repro_review` | Composite: metadata + methods + repo + repro in one call. |
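The no-clone assessment behind `assess_repo_reproducibility` amounts to scoring weighted presence signals over the repo's file listing. A minimal sketch (the signal names follow the tool description; the weights, thresholds, and the `score_repo` helper are illustrative, not the shipped heuristic):

```python
# Weighted presence signals; weights sum to 1.0 (illustrative values).
WEIGHTS = {
    "readme": 0.25,
    "deps": 0.25,        # pyproject.toml / requirements.txt / environment.yml
    "license": 0.10,
    "notebooks": 0.10,
    "fixtures": 0.15,    # a tests/ directory
    "figure_scripts": 0.15,
}


def score_repo(tree_paths: list[str]) -> tuple[float, str]:
    """Score a repo from its file listing alone; no clone, no execution."""
    paths = [p.lower() for p in tree_paths]
    signals = {
        "readme": any(p.startswith("readme") for p in paths),
        "deps": any(p in ("pyproject.toml", "requirements.txt", "environment.yml") for p in paths),
        "license": any(p.startswith("license") for p in paths),
        "notebooks": any(p.endswith(".ipynb") for p in paths),
        "fixtures": any(p.startswith("tests/") for p in paths),
        "figure_scripts": any("figure" in p or "plot" in p for p in paths),
    }
    score = sum(WEIGHTS[name] for name, hit in signals.items() if hit)
    verdict = "likely" if score >= 0.7 else "partial" if score >= 0.4 else "unlikely"
    return round(score, 2), verdict
```

The file listing itself would come from the GitHub git/trees endpoint with recursive listing, which is what lets the whole check stay clone-free.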