@roomi-fields/notebooklm-mcp 1.5.9 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,10 +1,10 @@
  <div align="center">

- # NotebookLM MCP + HTTP REST API
+ # NotebookLM REST API + MCP server

- **Google NotebookLM over MCP + a local HTTP REST API — Q&A with citations, audio podcasts, video generation, multi-account rotation. Works with Claude Code, Codex, Cursor, n8n, Zapier, Make.**
+ **Automate Google NotebookLM at scale. 33-endpoint HTTP REST API for n8n / Zapier / Make / curl, plus an MCP server for Claude Code / Cursor / Codex. Citation-backed Q&A, full Studio generation (audio · video · infographic · report · presentation · data table), multi-account rotation with auto-reauth.**

- > 🟢 **Actively maintained fork** of [PleasePrompto/notebooklm-mcp](https://github.com/PleasePrompto/notebooklm-mcp) (upstream last push: 2025-12-27). This fork ships v1.5.8 (2026-04-19) with 2026 NotebookLM UI selectors, HTTP REST API for n8n / Zapier / Make, multi-account rotation, and documented install on Windows / WSL / Docker.
+ > v1.7.0 is production-grade and batch-tested on overnight runs of 1 000+ questions. New: `batch_to_vault` is now a first-class MCP tool (no HTTP server required) on top of the existing `POST /batch-to-vault` endpoint. See [RTFM integration](./deployment/docs/14-RTFM-INTEGRATION.md) for the full pattern. [Compare with `PleasePrompto/notebooklm-mcp` v2.0.0](https://roomi-fields.github.io/notebooklm-mcp/compare) to see when this project is the right pick (REST API, full Studio, auto-reauth) and when the MCP-only upstream is the better fit.

  <!-- Badges -->

@@ -66,15 +66,35 @@ Generate multiple content types from your notebook sources:
  - **MCP Protocol** — Claude Code, Cursor, Codex, any MCP client
  - **HTTP REST API** — n8n, Zapier, Make.com, custom integrations
  - **Docker** — Isolated deployment with Docker or Docker Compose
+ - **[RTFM](https://github.com/roomi-fields/rtfm) retrieval layer** — `/batch-to-vault` writes citation-backed answers as markdown + JSON sidecars (`nblm-answer-v1` schema), indexable by [RTFM](https://github.com/roomi-fields/rtfm) (FTS5 + semantic) for unlimited offline queries. Ideal for academic / SOTA workflows. [Guide](./deployment/docs/14-RTFM-INTEGRATION.md).

  ---
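To make the sidecar idea above concrete, here is a minimal sketch of what one `/batch-to-vault` answer pair could carry. Field names are inferred from this README's `/ask` output (`question`, `answer`, `citations` of `{id, source, excerpt}`); the authoritative contract is the published `nblm-answer-v1` JSON Schema, not this sketch.

```python
import json

# Hypothetical sidecar for one cached answer. The markdown file holds the
# prose; the .json sidecar holds the structure RTFM indexes. Real field set
# is defined by the nblm-answer-v1 schema, not by this illustration.
sidecar = {
    "schema": "nblm-answer-v1",
    "question": "Summarize chapter 3",
    "answer": "Chapter 3 argues that ...",
    "citations": [
        {"id": 1, "source": "thesis.pdf", "excerpt": "..."},  # shape per /ask
    ],
}

serialized = json.dumps(sidecar, ensure_ascii=False)
```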

  ## Quick Start

- ### Option 1: MCP Mode (Claude Code, Cursor, Codex)
+ ### Option 1 — HTTP REST API (n8n, Zapier, Make, curl, any HTTP client)

  ```bash
- # Clone and build locally
+ git clone https://github.com/roomi-fields/notebooklm-mcp.git
+ cd notebooklm-mcp
+ npm install && npm run build
+ npm run setup-auth  # One-time Google login
+ npm run start:http  # Start REST API on port 3000
+ ```
+
+ ```bash
+ # Citation-backed Q&A, single curl, JSON response
+ curl -X POST http://localhost:3000/ask \
+   -H 'Content-Type: application/json' \
+   -d '{"question": "Summarize chapter 3", "notebook_id": "your-id", "source_format": "json"}'
+ ```
+
+ The full surface is **33 documented endpoints** — see the [REST API reference](https://roomi-fields.github.io/notebooklm-mcp/notebooklm-rest-api). For overnight batches of 1 000+ questions, see the [batch pattern](https://roomi-fields.github.io/notebooklm-mcp/batch-1000-questions).
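The same call from Python, for readers scripting outside the shell — a minimal stdlib-only sketch, assuming the request fields shown in the curl example above (the response shape is documented in the REST API reference):

```python
import json
from urllib import request

API = "http://localhost:3000"  # default port from `npm run start:http`

def ask_payload(question: str, notebook_id: str, source_format: str = "json") -> dict:
    """Request body for POST /ask, mirroring the curl example."""
    return {"question": question, "notebook_id": notebook_id,
            "source_format": source_format}

def ask(question: str, notebook_id: str) -> dict:
    """Send one question to the local REST API and return the parsed JSON."""
    body = json.dumps(ask_payload(question, notebook_id)).encode()
    req = request.Request(f"{API}/ask", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=180) as resp:
        return json.load(resp)
```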
+ ### Option 2 — MCP Mode (Claude Code, Cursor, Codex)
+
+ ```bash
+ # Build (same package, MCP transport)
  git clone https://github.com/roomi-fields/notebooklm-mcp.git
  cd notebooklm-mcp
  npm install && npm run build
@@ -82,7 +102,7 @@ npm install && npm run build
  # Claude Code
  claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js

- # Cursor - add to ~/.cursor/mcp.json
+ # Cursor — add to ~/.cursor/mcp.json
  {
    "mcpServers": {
      "notebooklm": {
@@ -95,24 +115,7 @@ claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js

  Then say: _"Log me in to NotebookLM"_ → Chrome opens → log in with Google.

- ### Option 2: HTTP REST API (n8n, Zapier, Make.com)
-
- ```bash
- git clone https://github.com/roomi-fields/notebooklm-mcp.git
- cd notebooklm-mcp
- npm install && npm run build
- npm run setup-auth  # One-time Google login
- npm run start:http  # Start server on port 3000
- ```
-
- ```bash
- # Query the API
- curl -X POST http://localhost:3000/ask \
-   -H "Content-Type: application/json" \
-   -d '{"question": "Explain X", "notebook_id": "my-notebook"}'
- ```
-
- ### Option 3: Docker (NAS, Server)
+ ### Option 3 — Docker (NAS, server, headless)

  ```bash
  # Build and run
@@ -131,19 +134,26 @@ See [Docker Guide](./deployment/docs/08-DOCKER.md) for NAS deployment (Synology,

  ## Documentation

- | Guide | Description |
- | ----- | ----------- |
- | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
- | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
- | [API Reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation |
- | [n8n Integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
- | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
- | [Notebook Library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
- | [Auto-Discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
- | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
- | [Multi-Interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
- | [Chrome Limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
- | [Adding a Language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |
+ Full docs site: **<https://roomi-fields.github.io/notebooklm-mcp/>** · [OpenAPI 3.1 spec](./deployment/docs/openapi.yaml)
+
+ | Guide | Description |
+ | ----- | ----------- |
+ | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
+ | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
+ | [REST API reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation (33 endpoints) |
+ | [Run 1 000 questions overnight](./deployment/docs/12-BATCH-1000.md) | Production batch pattern with auto-reauth and rotation |
+ | [**RTFM integration — cache as searchable vault**](./deployment/docs/14-RTFM-INTEGRATION.md) | Pipeline pattern: NotebookLM as one-shot ingestion, RTFM as retrieval layer. `/batch-to-vault` endpoint, `nblm-answer-v1` schema. |
+ | [n8n integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
+ | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
+ | [Notebook library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
+ | [Auto-discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
+ | [Content management](./deployment/docs/10-CONTENT-MANAGEMENT.md) | Audio, video, infographic, report, presentation |
+ | [Multi-account rotation](./deployment/docs/11-MULTI-ACCOUNT.md) | Multiple accounts with TOTP auto-reauth |
+ | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
+ | [Multi-interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
+ | [**Compare with PleasePrompto v2.0.0**](./deployment/docs/13-COMPARE.md) | Feature matrix vs the upstream MCP-only server |
+ | [Chrome profile limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
+ | [Adding a language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |

  ---

@@ -153,6 +163,14 @@ See [ROADMAP.md](./ROADMAP.md) for planned features and version history.

  **Latest releases:**

+ - **v1.7.0** — `batch_to_vault` exposed as a first-class MCP tool (parity with the HTTP endpoint, no localhost server required); shared `runBatchToVault` helper deduplicates the loop across both transports
+ - **v1.6.0** — `/batch-to-vault` endpoint + RTFM integration (`nblm-answer-v1` JSON Schema published at [schemas.roomi-fields.com/nblm-answer-v1.json](https://schemas.roomi-fields.com/nblm-answer-v1.json)) for caching NotebookLM answers as a searchable markdown vault
+ - **v1.5.9** — Restore `mcpName` field for MCP Registry npm-package ownership verification
+ - **v1.5.8** — NotebookLM 2026 UI adaptations (icon-label sanitization, Discussion-panel recovery, count-based source detection) — PR #5 by @KhizarJamshaidIqbal
+ - **v1.5.7** — Citation extraction selector fix (`.highlighted`) and Docker multi-stage build — PR #1 by @JulienCANTONI
+ - **v1.5.6** — Citation extraction major rewrite (97% success rate), browser-verified auth at startup, profile auto-sync
+ - **v1.5.5** — Multi-account state-path bug fix, Windows startup scripts, hidden-window MCP proxy
+ - **v1.5.4** — Mid-session auto-reauth with stored credentials, TOTP support
  - **v1.5.3** — Docker deployment with noVNC for visual authentication + NAS support (Synology, QNAP)
  - **v1.5.2** — Notebook scraping from NotebookLM + Bulk delete + Bug fixes
  - **v1.5.1** — Multilingual UI support (FR/EN) with i18n selector system + E2E tests (76 tests)
@@ -1,6 +1,15 @@
- # API Documentation - NotebookLM MCP HTTP Server
-
- > Complete reference for all REST endpoints
+ # NotebookLM REST API reference — 33 HTTP endpoints
+
+ > Complete reference for the NotebookLM HTTP REST API. Citation-backed Q&A,
+ > Studio content generation, notebook library, multi-account, sessions.
+ >
+ > **OpenAPI 3.1 spec:** [openapi.yaml](pathname:///openapi.yaml) (importable in
+ > Postman, Insomnia, Bruno, Swagger UI). Source in the repo at
+ > [`deployment/docs/openapi.yaml`](https://github.com/roomi-fields/notebooklm-mcp/blob/main/deployment/docs/openapi.yaml).
+ >
+ > **Companion guides:** [REST batch pattern (1 000 questions overnight)](/batch-1000-questions) ·
+ > [n8n integration](/notebooklm-n8n) · [Multi-account rotation](/notebooklm-multi-account) ·
+ > [Compare with PleasePrompto/notebooklm-mcp v2.0.0](/compare)

  ---

@@ -0,0 +1,165 @@
+ # Run 1 000 NotebookLM questions overnight
+
+ This is the pattern we use to run very long batches of citation-backed Q&A against Google NotebookLM — from PhD literature reviews to market intelligence pipelines. Single laptop, single account, eight hours, one thousand structured answers in a JSONL file.
+
+ The whole thing fits in **one shell loop**, because the project exposes a plain REST API on `http://localhost:3000`. There is no SDK to learn, no agent harness to configure, no MCP client to wire.
+
+ ## What you need
+
+ - This project running locally: `npm run setup-auth` (one-time Google login), then `npm run start:http`. [Install guide](/install).
+ - A list of questions in a text file, one per line.
+ - A notebook id. Either pick one from `GET /notebooks/scrape` or set a default with `PUT /notebooks/:id/activate`.
+ - Optionally: a second Google account for rotation. [Multi-account guide](/notebooklm-multi-account).
+
+ ## The minimum viable batch (10 lines of bash)
+
+ ```bash
+ NOTEBOOK_ID="paste-your-id-here"
+ INPUT="questions.txt"
+ OUTPUT="answers.jsonl"
+
+ while IFS= read -r question; do
+   curl -sS -X POST http://localhost:3000/ask \
+     -H 'Content-Type: application/json' \
+     -d "$(jq -n --arg q "$question" --arg n "$NOTEBOOK_ID" \
+       '{question: $q, notebook_id: $n, source_format: "json"}')" \
+     >> "$OUTPUT"
+   echo >> "$OUTPUT"
+ done < "$INPUT"
+ ```
+
+ That works. It does not survive a session expiry at hour 4, it does not throttle, it does not resume after a network blip, and it makes 1 000 sequential blocking calls. So for real batches we wrap it.
+
+ ## The production pattern
+
+ ```python
+ #!/usr/bin/env python3
+ """Run a batch of NotebookLM questions through the local REST API.
+
+ Resumes safely on restart, handles re-auth, rotates accounts, throttles to
+ respect rate limits, and writes one JSON line per answer with citations.
+ """
+
+ import json, time, sys
+ from pathlib import Path
+ import httpx
+
+ API = "http://localhost:3000"
+ NOTEBOOK_ID = "paste-your-id-here"
+ INPUT = Path("questions.txt")
+ OUTPUT = Path("answers.jsonl")
+ THROTTLE_SECONDS = 8  # average pace; tune to your account's quota
+ MAX_RETRIES = 3
+ ACCOUNTS = ["primary", "backup"]  # registered via `npm run accounts add`
+ account_idx = 0  # current position in ACCOUNTS; advances on rate limits
+
+ def already_done() -> set[str]:
+     """Resume support: skip questions already answered."""
+     if not OUTPUT.exists():
+         return set()
+     done = set()
+     for line in OUTPUT.read_text().splitlines():
+         try:
+             done.add(json.loads(line)["question"])
+         except (json.JSONDecodeError, KeyError):
+             continue
+     return done
+
+ def switch_account(name: str) -> None:
+     httpx.post(f"{API}/re-auth", json={"account": name}, timeout=120).raise_for_status()
+
+ def ask(question: str) -> dict:
+     global account_idx  # persist rotation state across questions
+     payload = {
+         "question": question,
+         "notebook_id": NOTEBOOK_ID,
+         "source_format": "json",
+     }
+     for attempt in range(1, MAX_RETRIES + 1):
+         try:
+             r = httpx.post(f"{API}/ask", json=payload, timeout=180)
+             r.raise_for_status()
+             data = r.json()
+             if data.get("success"):
+                 return data
+             # rate-limited or quota — try the next account
+             if "rate" in str(data.get("error", "")).lower():
+                 account_idx = (account_idx + 1) % len(ACCOUNTS)
+                 switch_account(ACCOUNTS[account_idx])
+                 continue
+         except httpx.HTTPError as e:
+             print(f"  attempt {attempt}: {e}", file=sys.stderr)
+             time.sleep(2 ** attempt)
+     raise RuntimeError(f"failed after {MAX_RETRIES} retries: {question}")
+
+ def main() -> None:
+     done = already_done()
+     questions = [q.strip() for q in INPUT.read_text().splitlines() if q.strip()]
+     todo = [q for q in questions if q not in done]
+     print(f"{len(done)} already answered · {len(todo)} to go")
+
+     with OUTPUT.open("a") as f:
+         for i, question in enumerate(todo, 1):
+             t0 = time.time()
+             answer = ask(question)
+             row = {
+                 "question": question,
+                 "answer": answer["answer"],
+                 "citations": answer.get("citations", []),
+                 "session_id": answer.get("session_id"),
+                 "elapsed_s": round(time.time() - t0, 1),
+             }
+             f.write(json.dumps(row, ensure_ascii=False) + "\n")
+             f.flush()
+             print(f"[{i}/{len(todo)}] {row['elapsed_s']}s · {len(row['citations'])} cites")
+             time.sleep(THROTTLE_SECONDS)
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ Save it as `batch.py`, drop your questions in `questions.txt`, run `python batch.py`. Kill it any time and re-run — it picks up where it left off.
+
+ ## Why this pattern works
+
+ ### One file in, one file out
+
+ Both ends are plain text. Your input is a `questions.txt` you can edit in any tool. Your output is `answers.jsonl` — JSON Lines, one answer per line, trivially loadable into pandas, jq, BigQuery, or another LLM for further processing:
+
+ ```bash
+ jq '.answer' answers.jsonl | wc -l
+ jq -c '{q: .question, n: (.citations | length)}' answers.jsonl
+ ```
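The same "load it anywhere" point, sketched in Python — reading the JSONL back in with row fields as written by the batch script above:

```python
import json
from pathlib import Path

def load_answers(path: str = "answers.jsonl") -> list[dict]:
    """One JSON object per line: question, answer, citations, session_id, elapsed_s."""
    return [json.loads(line)
            for line in Path(path).read_text().splitlines() if line.strip()]

def citation_counts(rows: list[dict]) -> dict[str, int]:
    """Map each question to how many citations its answer carried — a quick quality signal."""
    return {r["question"]: len(r.get("citations", [])) for r in rows}
```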
+
+ ### Resume on restart
+
+ Network drops, OS updates, batch scripts that get killed at 3am — they all happen. The first thing the script does is re-read the output file and skip questions whose text is already there. You lose the in-flight question and nothing else.
+
+ ### Auto-reauth across multi-hour runs
+
+ Google sessions don't survive the night. On every call the REST API checks the browser URL — the ground truth for session state — and logs back in automatically using credentials stored in the AES-256-GCM vault (`npm run setup-auth` puts them there). TOTP codes are computed on the fly, so 2FA-protected accounts work transparently. [Multi-account configuration](/notebooklm-multi-account).
+
+ ### Account rotation when one quota saturates
+
+ Free Google accounts hit a daily NotebookLM Q&A quota. The script flips to the next registered account on rate-limit errors via `POST /re-auth`. With two accounts you can typically push 1 500–2 000 questions in a 24-hour window without manual intervention.
+
+ ### Citations come back structured
+
+ `source_format: "json"` returns a `citations` array of `{id, source, excerpt}` objects directly attached to the answer. You can join citations back to your sources for downstream processing — fact-checking, page-number resolution, LaTeX `\cite{}` generation for a thesis bibliography.
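As a sketch of that downstream join — turning a `citations` array (shape `{id, source, excerpt}` per above) into a deduplicated `\cite{}` string. The key-derivation rule here is a hypothetical illustration, not part of the API:

```python
import re

def cite_key(source: str) -> str:
    """Derive a BibTeX-ish key from a citation's source name (illustrative rule)."""
    return re.sub(r"[^a-z0-9]+", "-", source.lower()).strip("-")

def latex_cites(citations: list[dict]) -> str:
    """Collapse an /ask citations array into one deduplicated \\cite{...} command."""
    keys: list[str] = []
    for c in citations:
        k = cite_key(c["source"])
        if k not in keys:
            keys.append(k)
    return "\\cite{" + ",".join(keys) + "}" if keys else ""
```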
+
+ ## Sizing your throttle
+
+ NotebookLM does not document a public rate limit, so we picked **8 seconds between calls** based on hundreds of overnight runs. That pacing alone caps you at ~450 questions an hour and ~3 600 in eight hours per account, before answer latency. If you see `rate limit` errors before that, raise the throttle to 12–15 seconds or add a third account. If you can sustain 5 seconds without hitting limits, go for it.
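To size a run before you start it, the arithmetic above can be wrapped in a helper. The default answer latency is an assumption — calibrate it from the `elapsed_s` column in your own `answers.jsonl`:

```python
def batch_hours(n_questions: int, throttle_s: float = 8,
                avg_answer_s: float = 20) -> float:
    """Wall-clock estimate: each question costs its answer latency plus the throttle.

    avg_answer_s=20 is an assumed figure; with the guide's 8 s throttle it puts
    1 000 questions at roughly eight hours, matching the overnight envelope.
    """
    return n_questions * (throttle_s + avg_answer_s) / 3600
```

For example, `batch_hours(1000)` comes out just under eight hours, and dropping the throttle to 5 seconds shaves under an hour off — useful for deciding whether a second account is worth registering.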
+
+ ## When to switch to MCP mode instead
+
+ If your driver is a coding agent (Claude Code, Cursor, Codex) rather than a script, the same operations are exposed as MCP tools. Use that surface when you want the agent to reason about which question to ask next; use the REST API when you have a flat list to grind through. [Both modes ship from the same package](/install).
+
+ ## What this gives you in practice
+
+ For one of our PhD use cases we run 100–200 questions per chapter across a thirty-chapter thesis library. That's 5 000+ structured answers with citations, computed overnight on a laptop, indexed back into the thesis as `\cite{}`-ready snippets. Total cost: zero (it uses your own NotebookLM account); total infrastructure: one Node process and one Python script.
+
+ ## Next steps
+
+ - [HTTP API reference](/notebooklm-rest-api) — every endpoint, every parameter.
+ - [n8n integration guide](/notebooklm-n8n) — same pattern but as a visual workflow.
+ - [Multi-account guide](/notebooklm-multi-account) — register a second Google account for rotation.
+ - [Compare with PleasePrompto](/compare) — when this project is the right pick over the upstream MCP-only server.
@@ -0,0 +1,12 @@
+ # Comparing NotebookLM MCP servers
+
+ There are a handful of community projects that automate Google NotebookLM. The two most active TypeScript implementations are **`PleasePrompto/notebooklm-mcp`** (the original, MCP-only) and **`@roomi-fields/notebooklm-mcp`** (this project — REST API + MCP, Studio-complete, batch-oriented).
+
+ This page is an honest, side-by-side comparison so you can pick the right one for your workflow. Both are MIT-licensed and actively maintained.
+
+ > **TL;DR**
+ >
+ > - You want a quick **MCP server** for Claude Code / Cursor / Codex with citations and audio overviews → either works; PleasePrompto v2 is a clean, MCP-spec-first build.
+ > - You need a **REST API** to call from n8n / Zapier / Make / curl / any non-MCP client → use this project. PleasePrompto v2 ships MCP-over-HTTP (Streamable HTTP transport), which is **not** a REST API.
+ > - You generate **video / infographic / presentation / data table** content from NotebookLM Studio → use this project. PleasePrompto v2 ships audio only; the rest is deferred to a follow-up.
+ > - You run **long batches** (PhD research, market reports, content pipelines) and need **auto-reauth with TOTP** → use this project. PleasePrompto v2 explicitly does not store credentials.