@roomi-fields/notebooklm-mcp 1.5.9 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,10 +1,10 @@
1
1
  <div align="center">
2
2
 
3
- # NotebookLM MCP + HTTP REST API
3
+ # NotebookLM REST API + MCP server
4
4
 
5
- **Google NotebookLM over MCP + a local HTTP REST API Q&A with citations, audio podcasts, video generation, multi-account rotation. Works with Claude Code, Codex, Cursor, n8n, Zapier, Make.**
5
+ **Automate Google NotebookLM at scale. 33-endpoint HTTP REST API for n8n / Zapier / Make / curl, plus an MCP server for Claude Code / Cursor / Codex. Citation-backed Q&A, full Studio generation (audio · video · infographic · report · presentation · data table), multi-account rotation with auto-reauth.**
6
6
 
7
- > 🟢 **Actively maintained fork** of [PleasePrompto/notebooklm-mcp](https://github.com/PleasePrompto/notebooklm-mcp) (upstream last push: 2025-12-27). This fork ships v1.5.8 (2026-04-19) with 2026 NotebookLM UI selectors, HTTP REST API for n8n / Zapier / Make, multi-account rotation, and documented install on Windows / WSL / Docker.
7
+ > v1.6.0 is production-grade and batch-tested on overnight runs of 1 000+ questions. New: `/batch-to-vault` endpoint and [RTFM integration](./deployment/docs/14-RTFM-INTEGRATION.md) for offline retrieval at scale. [Compare with `PleasePrompto/notebooklm-mcp` v2.0.0](https://roomi-fields.github.io/notebooklm-mcp/compare) to see when this project is the right pick (REST API, full Studio, auto-reauth) and when the MCP-only upstream is the better fit.
8
8
 
9
9
  <!-- Badges -->
10
10
 
@@ -66,15 +66,35 @@ Generate multiple content types from your notebook sources:
66
66
  - **MCP Protocol** — Claude Code, Cursor, Codex, any MCP client
67
67
  - **HTTP REST API** — n8n, Zapier, Make.com, custom integrations
68
68
  - **Docker** — Isolated deployment with Docker or Docker Compose
69
+ - **[RTFM](https://github.com/roomi-fields/rtfm) retrieval layer** — `/batch-to-vault` writes citation-backed answers as markdown + JSON sidecars (`nblm-answer-v1` schema), indexable by [RTFM](https://github.com/roomi-fields/rtfm) (FTS5 + semantic) for unlimited offline queries. Ideal for academic / SOTA workflows. [Guide](./deployment/docs/14-RTFM-INTEGRATION.md).
69
70
 
70
71
  ---
71
72
 
72
73
  ## Quick Start
73
74
 
74
- ### Option 1: MCP Mode (Claude Code, Cursor, Codex)
75
+ ### Option 1 — HTTP REST API (n8n, Zapier, Make, curl, any HTTP client)
75
76
 
76
77
  ```bash
77
- # Clone and build locally
78
+ git clone https://github.com/roomi-fields/notebooklm-mcp.git
79
+ cd notebooklm-mcp
80
+ npm install && npm run build
81
+ npm run setup-auth # One-time Google login
82
+ npm run start:http # Start REST API on port 3000
83
+ ```
84
+
85
+ ```bash
86
+ # Citation-backed Q&A, single curl, JSON response
87
+ curl -X POST http://localhost:3000/ask \
88
+ -H 'Content-Type: application/json' \
89
+ -d '{"question": "Summarize chapter 3", "notebook_id": "your-id", "source_format": "json"}'
90
+ ```
91
+
92
+ The full surface is **33 documented endpoints** — see the [REST API reference](https://roomi-fields.github.io/notebooklm-mcp/notebooklm-rest-api). For overnight batches of 1 000+ questions, see the [batch pattern](https://roomi-fields.github.io/notebooklm-mcp/batch-1000-questions).
93
+
94
+ ### Option 2 — MCP Mode (Claude Code, Cursor, Codex)
95
+
96
+ ```bash
97
+ # Build (same package, MCP transport)
78
98
  git clone https://github.com/roomi-fields/notebooklm-mcp.git
79
99
  cd notebooklm-mcp
80
100
  npm install && npm run build
@@ -82,7 +102,7 @@ npm install && npm run build
82
102
  # Claude Code
83
103
  claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js
84
104
 
85
- # Cursor - add to ~/.cursor/mcp.json
105
+ # Cursor — add to ~/.cursor/mcp.json
86
106
  {
87
107
  "mcpServers": {
88
108
  "notebooklm": {
@@ -95,24 +115,7 @@ claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js
95
115
 
96
116
  Then say: _"Log me in to NotebookLM"_ → Chrome opens → log in with Google.
97
117
 
98
- ### Option 2: HTTP REST API (n8n, Zapier, Make.com)
99
-
100
- ```bash
101
- git clone https://github.com/roomi-fields/notebooklm-mcp.git
102
- cd notebooklm-mcp
103
- npm install && npm run build
104
- npm run setup-auth # One-time Google login
105
- npm run start:http # Start server on port 3000
106
- ```
107
-
108
- ```bash
109
- # Query the API
110
- curl -X POST http://localhost:3000/ask \
111
- -H "Content-Type: application/json" \
112
- -d '{"question": "Explain X", "notebook_id": "my-notebook"}'
113
- ```
114
-
115
- ### Option 3: Docker (NAS, Server)
118
+ ### Option 3 — Docker (NAS, server, headless)
116
119
 
117
120
  ```bash
118
121
  # Build and run
@@ -131,19 +134,26 @@ See [Docker Guide](./deployment/docs/08-DOCKER.md) for NAS deployment (Synology,
131
134
 
132
135
  ## Documentation
133
136
 
134
- | Guide | Description |
135
- | ------------------------------------------------------------ | ----------------------------------------- |
136
- | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
137
- | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
138
- | [API Reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation |
139
- | [n8n Integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
140
- | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
141
- | [Notebook Library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
142
- | [Auto-Discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
143
- | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
144
- | [Multi-Interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
145
- | [Chrome Limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
146
- | [Adding a Language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |
137
+ Full docs site: **<https://roomi-fields.github.io/notebooklm-mcp/>** · [OpenAPI 3.1 spec](./deployment/docs/openapi.yaml)
138
+
139
+ | Guide | Description |
140
+ | -------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
141
+ | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
142
+ | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
143
+ | [REST API reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation (33 endpoints) |
144
+ | [Run 1 000 questions overnight](./deployment/docs/12-BATCH-1000.md) | Production batch pattern with auto-reauth and rotation |
145
+ | [**RTFM integration — cache as searchable vault**](./deployment/docs/14-RTFM-INTEGRATION.md) | Pipeline pattern: NotebookLM as one-shot ingestion, RTFM as retrieval layer. `/batch-to-vault` endpoint, `nblm-answer-v1` schema. |
146
+ | [n8n integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
147
+ | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
148
+ | [Notebook library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
149
+ | [Auto-discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
150
+ | [Content management](./deployment/docs/10-CONTENT-MANAGEMENT.md) | Audio, video, infographic, report, presentation |
151
+ | [Multi-account rotation](./deployment/docs/11-MULTI-ACCOUNT.md) | Multiple accounts with TOTP auto-reauth |
152
+ | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
153
+ | [Multi-interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
154
+ | [**Compare with PleasePrompto v2.0.0**](./deployment/docs/13-COMPARE.md) | Feature matrix vs the upstream MCP-only server |
155
+ | [Chrome profile limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
156
+ | [Adding a language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |
147
157
 
148
158
  ---
149
159
 
@@ -153,6 +163,13 @@ See [ROADMAP.md](./ROADMAP.md) for planned features and version history.
153
163
 
154
164
  **Latest releases:**
155
165
 
166
+ - **v1.6.0** — `/batch-to-vault` endpoint + RTFM integration (`nblm-answer-v1` JSON Schema published at <https://schemas.roomi-fields.com/nblm-answer-v1.json>) for caching NotebookLM answers as a searchable markdown vault
167
+ - **v1.5.9** — Restore `mcpName` field for MCP Registry npm-package ownership verification
168
+ - **v1.5.8** — NotebookLM 2026 UI adaptations (icon-label sanitization, Discussion-panel recovery, count-based source detection) — PR #5 by @KhizarJamshaidIqbal
169
+ - **v1.5.7** — Citation extraction selector fix (`.highlighted`) and Docker multi-stage build — PR #1 by @JulienCANTONI
170
+ - **v1.5.6** — Citation extraction major rewrite (97% success rate), browser-verified auth at startup, profile auto-sync
171
+ - **v1.5.5** — Multi-account state-path bug fix, Windows startup scripts, hidden-window MCP proxy
172
+ - **v1.5.4** — Mid-session auto-reauth with stored credentials, TOTP support
156
173
  - **v1.5.3** — Docker deployment with noVNC for visual authentication + NAS support (Synology, QNAP)
157
174
  - **v1.5.2** — Notebook scraping from NotebookLM + Bulk delete + Bug fixes
158
175
  - **v1.5.1** — Multilingual UI support (FR/EN) with i18n selector system + E2E tests (76 tests)
@@ -1,6 +1,15 @@
1
- # API Documentation - NotebookLM MCP HTTP Server
2
-
3
- > Complete reference for all REST endpoints
1
+ # NotebookLM REST API reference — 33 HTTP endpoints
2
+
3
+ > Complete reference for the NotebookLM HTTP REST API. Citation-backed Q&A,
4
+ > Studio content generation, notebook library, multi-account, sessions.
5
+ >
6
+ > **OpenAPI 3.1 spec:** [openapi.yaml](pathname:///openapi.yaml) (importable in
7
+ > Postman, Insomnia, Bruno, Swagger UI). Source in the repo at
8
+ > [`deployment/docs/openapi.yaml`](https://github.com/roomi-fields/notebooklm-mcp/blob/main/deployment/docs/openapi.yaml).
9
+ >
10
+ > **Companion guides:** [REST batch pattern (1 000 questions overnight)](/batch-1000-questions) ·
11
+ > [n8n integration](/notebooklm-n8n) · [Multi-account rotation](/notebooklm-multi-account) ·
12
+ > [Compare with PleasePrompto/notebooklm-mcp v2.0.0](/compare)
4
13
 
5
14
  ---
6
15
 
@@ -0,0 +1,165 @@
1
+ # Run 1 000 NotebookLM questions overnight
2
+
3
+ This is the pattern we use to run very long batches of citation-backed Q&A against Google NotebookLM — from PhD literature reviews to market intelligence pipelines. Single laptop, single account, eight hours, one thousand structured answers in a JSONL file.
4
+
5
+ The whole thing fits in **one shell loop**, because the project exposes a plain REST API on `http://localhost:3000`. There is no SDK to learn, no agent harness to configure, no MCP client to wire.
6
+
7
+ ## What you need
8
+
9
+ - This project running locally: `npm run setup-auth` (one-time Google login), then `npm run start:http`. [Install guide](/install).
10
+ - A list of questions in a text file, one per line.
11
+ - A notebook id. Either pick one from `GET /notebooks/scrape` or set a default with `PUT /notebooks/:id/activate`.
12
+ - Optionally: a second Google account for rotation. [Multi-account guide](/notebooklm-multi-account).
13
+
14
+ ## The minimum viable batch (10 lines of bash)
15
+
16
+ ```bash
17
+ NOTEBOOK_ID="paste-your-id-here"
18
+ INPUT="questions.txt"
19
+ OUTPUT="answers.jsonl"
20
+
21
+ while IFS= read -r question; do
22
+ curl -sS -X POST http://localhost:3000/ask \
23
+ -H 'Content-Type: application/json' \
24
+ -d "$(jq -n --arg q "$question" --arg n "$NOTEBOOK_ID" \
25
+ '{question: $q, notebook_id: $n, source_format: "json"}')" \
26
+ >> "$OUTPUT"
27
+ echo >> "$OUTPUT"
28
+ done < "$INPUT"
29
+ ```
30
+
31
+ That works. It does not survive a session expiry at hour 4, it does not throttle, it does not resume after a network blip, and it makes 1 000 sequential blocking calls. So for real batches we wrap it.
32
+
33
+ ## The production pattern
34
+
35
+ ```python
36
+ #!/usr/bin/env python3
37
+ """Run a batch of NotebookLM questions through the local REST API.
38
+
39
+ Resumes safely on restart, handles re-auth, rotates accounts, throttles to
40
+ respect rate limits, and writes one JSON line per answer with citations.
41
+ """
42
+
43
+ import json, time, sys
44
+ from pathlib import Path
45
+ import httpx  # third-party: pip install httpx
46
+
47
+ API = "http://localhost:3000"
48
+ NOTEBOOK_ID = "paste-your-id-here"
49
+ INPUT = Path("questions.txt")
50
+ OUTPUT = Path("answers.jsonl")
51
+ THROTTLE_SECONDS = 8 # average pace; tune to your account's quota
52
+ MAX_RETRIES = 3
53
+ ACCOUNTS = ["primary", "backup"] # registered via `npm run accounts add`
54
+
55
+ def already_done() -> set[str]:
56
+ """Resume support: skip questions already answered."""
57
+ if not OUTPUT.exists():
58
+ return set()
59
+ done = set()
60
+ for line in OUTPUT.read_text().splitlines():
61
+ try:
62
+ done.add(json.loads(line)["question"])
63
+ except (json.JSONDecodeError, KeyError):
64
+ continue
65
+ return done
66
+
67
+ def switch_account(name: str) -> None:
68
+ httpx.post(f"{API}/re-auth", json={"account": name}, timeout=120).raise_for_status()
69
+
70
+ def ask(question: str, account_idx: int = 0) -> dict:
71
+ payload = {
72
+ "question": question,
73
+ "notebook_id": NOTEBOOK_ID,
74
+ "source_format": "json",
75
+ }
76
+ for attempt in range(1, MAX_RETRIES + 1):
77
+ try:
78
+ r = httpx.post(f"{API}/ask", json=payload, timeout=180)
79
+ r.raise_for_status()
80
+ data = r.json()
81
+ if data.get("success"):
82
+ return data
83
+ # rate-limited or quota — try the next account
84
+ if "rate" in str(data.get("error", "")).lower():
85
+ account_idx = (account_idx + 1) % len(ACCOUNTS)
86
+ switch_account(ACCOUNTS[account_idx])
87
+ continue
88
+ except httpx.HTTPError as e:
89
+ print(f" attempt {attempt}: {e}", file=sys.stderr)
90
+ time.sleep(2 ** attempt)
91
+ raise RuntimeError(f"failed after {MAX_RETRIES} retries: {question}")
92
+
93
+ def main() -> None:
94
+ done = already_done()
95
+ questions = [q.strip() for q in INPUT.read_text().splitlines() if q.strip()]
96
+ todo = [q for q in questions if q not in done]
97
+ print(f"{len(done)} already answered · {len(todo)} to go")
98
+
99
+ with OUTPUT.open("a") as f:
100
+ for i, question in enumerate(todo, 1):
101
+ t0 = time.time()
102
+ answer = ask(question)
103
+ row = {
104
+ "question": question,
105
+ "answer": answer["answer"],
106
+ "citations": answer.get("citations", []),
107
+ "session_id": answer.get("session_id"),
108
+ "elapsed_s": round(time.time() - t0, 1),
109
+ }
110
+ f.write(json.dumps(row, ensure_ascii=False) + "\n")
111
+ f.flush()
112
+ print(f"[{i}/{len(todo)}] {row['elapsed_s']}s · {len(row['citations'])} cites")
113
+ time.sleep(THROTTLE_SECONDS)
114
+
115
+ if __name__ == "__main__":
116
+ main()
117
+ ```
118
+
119
+ Save it as `batch.py`, drop your questions in `questions.txt`, and run `python batch.py`. Kill it at any time; on re-run it picks up where it left off.
120
+
121
+ ## Why this pattern works
122
+
123
+ ### One file in, one file out
124
+
125
+ Both ends are plain text. Your input is a `questions.txt` you can edit in any tool. Your output is `answers.jsonl` — JSON Lines, one answer per line, trivially loadable into pandas, jq, BigQuery, or another LLM for further processing:
126
+
127
+ ```bash
128
+ jq '.answer' answers.jsonl | wc -l
129
+ jq -c '{q: .question, n: (.citations | length)}' answers.jsonl
130
+ ```
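The same file loads into Python just as easily. A minimal sketch with pandas (assuming pandas is installed; the inline string stands in for `answers.jsonl`, with illustrative values):

```python
import io
import pandas as pd

# Two sample rows in the shape the batch script writes (values are made up).
jsonl = io.StringIO(
    '{"question": "Q1", "answer": "A1", "citations": [{"id": 1}], "elapsed_s": 9.2}\n'
    '{"question": "Q2", "answer": "A2", "citations": [], "elapsed_s": 7.8}\n'
)
df = pd.read_json(jsonl, lines=True)          # one DataFrame row per answer
df["n_citations"] = df["citations"].map(len)  # citation count per question
print(df[["question", "n_citations"]])
```

In a real run you would pass `"answers.jsonl"` instead of the `StringIO` buffer.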
131
+
132
+ ### Resume on restart
133
+
134
+ Network drops, OS updates, batch scripts that get killed at 3am — they all happen. The first thing the script does is re-read the output file and skip questions whose text is already there. You lose the in-flight question and nothing else.
135
+
136
+ ### Auto-reauth across multi-hour runs
137
+
138
+ Google sessions don't survive the night. The REST API verifies the session against the live page URL on every call and logs back in automatically using credentials stored in the AES-256-GCM vault (`npm run setup-auth` puts them there). TOTP codes are computed on the fly, so 2FA-protected accounts work transparently. [Multi-account configuration](/notebooklm-multi-account).
139
+
140
+ ### Account rotation when one quota saturates
141
+
142
+ Free Google accounts hit a daily NotebookLM Q&A quota. The script flips to the next registered account on rate-limit errors via `POST /re-auth`. With two accounts you can typically push 1 500–2 000 questions in a 24-hour window without manual intervention.
143
+
144
+ ### Citations come back structured
145
+
146
+ `source_format: "json"` returns a `citations` array of `{id, source, excerpt}` objects directly attached to the answer. You can join citations back to your sources for downstream processing — fact-checking, page-number resolution, LaTeX `\cite{}` generation for a thesis bibliography.
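A sketch of that downstream step, turning one `answers.jsonl` row into `\cite{}`-ready keys. The `{id, source, excerpt}` shape is the one described above; the slugging rule is a hypothetical convention, not something the API prescribes:

```python
def cite_keys(row: dict) -> list[str]:
    """Map one answers.jsonl row to LaTeX-ready \\cite commands (illustrative slugging)."""
    keys = []
    for c in row.get("citations", []):
        # Slug the source title into a BibTeX-style key: lowercase, alnum runs joined by "-".
        slug = "".join(ch if ch.isalnum() else "-" for ch in c["source"].lower()).strip("-")
        while "--" in slug:
            slug = slug.replace("--", "-")
        keys.append(f"\\cite{{{slug}}}")
    return keys

row = {"citations": [{"id": 1, "source": "Keller CNV.pdf", "excerpt": "..."}]}
print(cite_keys(row))  # one \cite command per citation
```

Map the slugs onto your actual BibTeX keys before pasting into a thesis; the point is that the join is mechanical once citations arrive structured.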
147
+
148
+ ## Sizing your throttle
149
+
150
+ NotebookLM does not document a public rate limit, so we picked **8 seconds between calls** based on hundreds of overnight runs. That gives you ~450 questions in an hour and ~3 600 in eight hours per account. If you see `rate limit` errors before that, raise the throttle to 12-15 seconds or add a third account. If you can sustain 5 seconds without hitting limits, go for it.
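The arithmetic behind those numbers, if you want to size a different throttle. Note that the per-hour figure above counts only the sleep; NotebookLM's own 10–30 s answer time lowers real throughput, which the optional parameter models:

```python
def throughput(throttle_s: float, mean_answer_s: float = 0.0) -> dict:
    """Questions per hour and per 8-hour night for a given pause between calls.

    mean_answer_s optionally accounts for NotebookLM's own response time
    on top of the sleep between calls.
    """
    per_hour = 3600 / (throttle_s + mean_answer_s)
    return {"per_hour": round(per_hour), "per_night": round(per_hour * 8)}

print(throughput(8))                     # pure 8s pacing: 450/h, 3600/night
print(throughput(8, mean_answer_s=20))   # with a 20s average answer time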
151
+
152
+ ## When to switch to MCP mode instead
153
+
154
+ If your driver is a coding agent (Claude Code, Cursor, Codex) rather than a script, the same operations are exposed as MCP tools. Use that surface when you want the agent to reason about which question to ask next; use the REST API when you have a flat list to grind through. [Both modes ship from the same package](/install).
155
+
156
+ ## What this gives you in practice
157
+
158
+ For one of our PhD use cases we run 100-200 questions per chapter across a thirty-chapter thesis library. That's 5 000+ structured answers with citations, computed overnight on a laptop, indexed back into the thesis as `\cite{}`-ready snippets. Total cost: zero (uses the user's own NotebookLM account), total infrastructure: one Node process and one Python script.
159
+
160
+ ## Next steps
161
+
162
+ - [HTTP API reference](/notebooklm-rest-api) — every endpoint, every parameter.
163
+ - [n8n integration guide](/notebooklm-n8n) — same pattern but as a visual workflow.
164
+ - [Multi-account guide](/notebooklm-multi-account) — register a second Google account for rotation.
165
+ - [Compare with PleasePrompto](/compare) — when this project is the right pick over the upstream MCP-only server.
@@ -0,0 +1,12 @@
1
+ # Comparing NotebookLM MCP servers
2
+
3
+ There are a handful of community projects that automate Google NotebookLM. The two most active TypeScript implementations are **`PleasePrompto/notebooklm-mcp`** (the original, MCP-only) and **`@roomi-fields/notebooklm-mcp`** (this project — REST API + MCP, Studio-complete, batch-oriented).
4
+
5
+ This page is an honest, side-by-side comparison so you can pick the right one for your workflow. Both are MIT-licensed and actively maintained.
6
+
7
+ > **TL;DR**
8
+ >
9
+ > - You want a quick **MCP server** for Claude Code / Cursor / Codex with citations and audio overviews → either works; PleasePrompto v2 is a clean, MCP-spec-first build.
10
+ > - You need a **REST API** to call from n8n / Zapier / Make / curl / any non-MCP client → use this project. PleasePrompto v2 ships MCP-over-HTTP (Streamable HTTP transport) which is **not** a REST API.
11
+ > - You generate **video / infographic / presentation / data table** from NotebookLM Studio → use this project. PleasePrompto v2 ships audio only; the rest is deferred to a follow-up.
12
+ > - You run **long batches** (PhD research, market reports, content pipelines) and need **auto-reauth with TOTP** → use this project. PleasePrompto v2 explicitly does not store credentials.
@@ -0,0 +1,288 @@
1
+ # NotebookLM + RTFM — cache batch outputs as a searchable markdown vault
2
+
3
+ NotebookLM is brilliant at producing citation-backed answers, but it's slow (~10–30s per query) and rate-limited (50 queries per day per Google account on the free tier). For any workflow that re-asks similar questions over time — academic literature reviews, competitive intelligence pipelines, internal knowledge bases — querying NotebookLM live every time is the wrong architecture.
4
+
5
+ The pattern that scales: **NotebookLM as a one-shot ingestion layer, [RTFM](https://github.com/roomi-fields/rtfm) as the retrieval layer.** Run an exhaustive question set once, persist every answer (with citations, source titles, and excerpts) as markdown, then point your CLI agent at the vault for unlimited offline queries.
6
+
7
+ ```
8
+ [Once per notebook, periodic]
9
+ CLI agent generates an exhaustive question set
10
+ → POST /batch-to-vault (titles + excerpts, citations preserved)
11
+ → vault/*.md + vault/*.json (RTFM-ingestable)
12
+
13
+ [At will, unlimited, ~ms, offline]
14
+ Agent → rtfm_search → rtfm_expand → answer
15
+ ```
16
+
17
+ This page shows how to wire the two together.
18
+
19
+ ## Why this beats querying NotebookLM live
20
+
21
+ | Concern | Live NotebookLM | NotebookLM → vault → RTFM |
22
+ | ----------------- | --------------------------- | ------------------------------ |
23
+ | Latency per query | 10–30s | ~milliseconds |
24
+ | Quota | 50/day per Google account | Unlimited after one-shot batch |
25
+ | Repeat queries | Cost a quota slot each time | Free |
26
+ | Offline | No | Yes |
27
+ | Source citations | Yes (titles + excerpts) | Yes (preserved in markdown) |
28
+ | Best for | Fresh interpretation | Re-querying ingested knowledge |
29
+
30
+ ## What you need
31
+
32
+ - This project running locally: `npm run start:http` after `npm run setup-auth`. [Install guide](/install).
33
+ - [RTFM](https://github.com/roomi-fields/rtfm) installed and configured to point at your vault directory.
34
+ - A notebook with sources already attached. List them with `GET /notebooks/scrape`.
35
+ - A list of questions you want answered against that notebook.
36
+
37
+ ## The endpoint
38
+
39
+ `POST /batch-to-vault` runs a list of questions and writes each answer as two artifacts in a vault directory:
40
+
41
+ - `{slug}.md` — markdown with YAML frontmatter, the answer body, and a "Sources" section with quoted excerpts. Indexable by any markdown vault tool (RTFM, Obsidian, Foam, Dendron…).
42
+ - `{slug}.json` — a structured payload conforming to the `nblm-answer-v1` schema (see [schema below](#nblm-answer-v1-json-schema)) for richer ingestion.
43
+
44
+ ### Request
45
+
46
+ ```bash
47
+ curl -X POST http://localhost:3000/batch-to-vault \
48
+ -H 'Content-Type: application/json' \
49
+ -d '{
50
+ "questions": [
51
+ "What is the OSBD process?",
52
+ "How does NVC differentiate a need from a strategy?",
53
+ "What is empathic listening in NVC?"
54
+ ],
55
+ "notebook_id": "notebook-1",
56
+ "vault_dir": "/path/to/your/vault/cnv",
57
+ "slug_prefix": "sota",
58
+ "source_format": "json",
59
+ "sleep_between_ms": 2000
60
+ }'
61
+ ```
62
+
63
+ ### Parameters
64
+
65
+ | Field | Required | Default | Description |
66
+ | ------------------ | -------- | -------- | --------------------------------------------------------------------------------------------- |
67
+ | `questions` | yes | — | Non-empty array of strings. Each question becomes one `.md` + one `.json` file. |
68
+ | `vault_dir` | yes | — | Destination directory. Created with `mkdir -p` if missing. |
69
+ | `notebook_id` | no | active | Library notebook id to query. |
70
+ | `notebook_url` | no | — | Direct NotebookLM URL (alternative to `notebook_id`). |
71
+ | `slug_prefix` | no | `""` | Prepended to each filename. Use to namespace per topic, e.g. `"sota"`, `"market-2026q2"`. |
72
+ | `source_format` | no | `"json"` | Citation extraction mode. `"json"` is recommended for vault output (keeps titles + excerpts). |
73
+ | `sleep_between_ms` | no | `0` | Pause between questions to avoid hammering NotebookLM. 1500–3000ms is sane for batches > 20. |
74
+ | `session_id` | no | new | Reuse an existing session for context continuity across the batch. |
75
+
76
+ ### Response
77
+
78
+ ```json
79
+ {
80
+ "success": true,
81
+ "data": {
82
+ "vault_dir": "/path/to/your/vault/cnv",
83
+ "total": 3,
84
+ "succeeded": 3,
85
+ "failed": 0,
86
+ "session_id": "5f1d8731",
87
+ "notebook": {
88
+ "id": "notebook-1",
89
+ "url": "https://notebooklm.google.com/notebook/74912e55-..."
90
+ },
91
+ "files": [
92
+ {
93
+ "question": "What is the OSBD process?",
94
+ "md_path": "/path/to/your/vault/cnv/sota-001-what-is-the-osbd-process.md",
95
+ "json_path": "/path/to/your/vault/cnv/sota-001-what-is-the-osbd-process.json",
96
+ "success": true,
97
+ "citations_count": 16
98
+ }
99
+ ]
100
+ }
101
+ }
102
+ ```
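Per-question status lives in the `files` array, so retry logic is a one-line filter over the response shown above. A minimal sketch (the sample response is abbreviated):

```python
def failed_questions(resp: dict) -> list[str]:
    """Questions whose vault write failed, from a /batch-to-vault response."""
    return [f["question"] for f in resp["data"]["files"] if not f["success"]]

resp = {"success": True, "data": {"total": 3, "succeeded": 2, "failed": 1, "files": [
    {"question": "Q1", "success": True, "citations_count": 16},
    {"question": "Q2", "success": False, "citations_count": 0},
]}}
print(failed_questions(resp))  # → ['Q2']
```

Feed the returned list straight back into a second `POST /batch-to-vault` call to fill the gaps.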
103
+
104
+ ## What gets written
105
+
106
+ ### `{slug}.md`
107
+
108
+ ```markdown
109
+ ---
110
+ title: 'What is the OSBD process?'
111
+ type: nblm-answer
112
+ asked_at: 2026-05-04T13:30:00.000Z
113
+ notebook_id: 'notebook-1'
114
+ notebook_url: 'https://notebooklm.google.com/notebook/74912e55-...'
115
+ session_id: '5f1d8731'
116
+ citations_count: 16
117
+ sources:
118
+ - 'Pratiquer la Communication NonViolente_F.Keller.pdf'
119
+ - 'CNV et OSBD : outils pour pratiquer la communication bienveillante'
120
+ - "Rapport d'analyse systémique sur les cursus de formation en CNV"
121
+ ---
122
+
123
+ # What is the OSBD process?
124
+
125
+ > Asked on 2026-05-04T13:30:00.000Z against [CNV - Communication NonViolente](https://notebooklm.google.com/notebook/...)
126
+
127
+ ## Answer
128
+
129
+ OSBD is the four-step acronym at the core of Nonviolent Communication...
130
+
131
+ ## Sources
132
+
133
+ ### [1] CNV et OSBD : outils pour pratiquer la communication bienveillante
134
+
135
+ > Ce mode de communication est un choix conscient...
136
+
137
+ ### [2] Pratiquer la Communication NonViolente_F.Keller.pdf
138
+
139
+ > Observation Je décris, de manière neutre, la situation...
140
+ ```
141
+
142
+ The frontmatter is standard YAML — every markdown indexer (RTFM, Obsidian, Foam) reads it natively. The body has stable section headings (`## Answer`, `## Sources`) so a parser can lift the answer text and citation excerpts independently.
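Because the headings are stable, lifting the parts needs nothing fancier than string splits. A minimal stdlib sketch, assuming exactly the layout shown above (the sample string mirrors it):

```python
def parse_vault_md(text: str) -> dict:
    """Split a {slug}.md into frontmatter, answer body, and sources block."""
    _, frontmatter, body = text.split("---\n", 2)  # YAML fence always comes first
    answer = body.split("## Answer\n", 1)[1].split("## Sources", 1)[0].strip()
    sources = body.split("## Sources", 1)[1].strip()
    return {"frontmatter": frontmatter.strip(), "answer": answer, "sources": sources}

sample = (
    "---\ntitle: 'What is the OSBD process?'\ntype: nblm-answer\n---\n"
    "# What is the OSBD process?\n\n## Answer\n\nOSBD is the four-step acronym...\n\n"
    "## Sources\n\n### [1] CNV et OSBD\n\n> Ce mode de communication...\n"
)
parts = parse_vault_md(sample)
```

For anything beyond a quick script, parse the frontmatter with a real YAML loader instead of treating it as opaque text.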
143
+
144
+ ### `{slug}.json`
145
+
146
+ A structured sidecar conforming to [`nblm-answer-v1`](#nblm-answer-v1-json-schema). Use it when your indexer wants typed access to citations, source positions, or session metadata without re-parsing the markdown.
147
+
148
+ ## Pointing RTFM at the vault
149
+
150
+ [RTFM](https://github.com/roomi-fields/rtfm) is an MCP-native retrieval layer with FTS5 + semantic search over markdown vaults, wikilink resolution, and progressive disclosure for AI agents. It speaks the same markdown convention `/batch-to-vault` writes, so wiring is essentially "point and index":
151
+
152
+ ```bash
153
+ # 1. Generate the vault from NotebookLM
154
+ curl -X POST http://localhost:3000/batch-to-vault -d '{...}'
155
+
156
+ # 2. Index it with RTFM
157
+ rtfm index /path/to/your/vault/cnv
158
+
159
+ # 3. Search from your CLI agent (or any MCP client)
160
+ rtfm search "OSBD process" --top 5
161
+ rtfm expand sota-001-what-is-the-osbd-process
162
+ ```
163
+
164
+ Inside an MCP client (Claude Code, Cursor, Codex), the same flow becomes a two-tool pattern: `rtfm_search` to surface the relevant cached answer, `rtfm_expand` to read the full markdown with citations preserved. No NotebookLM call needed for repeat queries.
165
+
166
+ When new sources land in the notebook, re-run `/batch-to-vault` to refresh the cache.
167
+
168
+ ## Recommended layout for academic / SOTA workflows
169
+
170
+ ```
171
+ ~/research-vault/
172
+ ├── cnv/ # one notebook → one folder
173
+ │ ├── sota-001-...md
174
+ │ ├── sota-001-...json
175
+ │ ├── sota-002-...md
176
+ │ └── sota-002-...json
177
+ ├── ifs-therapy/
178
+ │ ├── sota-001-...md
179
+ │ └── ...
180
+ └── attachment-theory/
181
+ └── ...
182
+ ```
183
+
184
+ Each folder maps to one NotebookLM notebook. `slug_prefix` per topic keeps filenames sortable and unique. RTFM indexes the whole tree and resolves cross-folder wikilinks if you add them.
185
+
186
+ ## Question generation
187
+
188
+ The matching pattern on the input side: ask Claude (or any LLM) to generate an exhaustive question set for a topic before you batch them.
189
+
190
+ ```
191
+ You are preparing a SOTA (state of the art) document on {topic} from a NotebookLM
192
+ notebook containing {N sources}. Generate {K} questions that, taken together,
193
+ extract everything a domain expert would want to know:
194
+
195
+ - Foundational definitions and key concepts
196
+ - Historical context and lineage
197
+ - Core mechanisms / processes
198
+ - Distinctions vs adjacent fields
199
+ - Empirical evidence and limitations
200
+ - Practical applications
201
+ - Open debates and research gaps
202
+
203
+ Output as a JSON array of strings, no commentary.
204
+ ```
205
+
206
+ Save the output as `questions.json`, then:
207
+
208
+ ```bash
209
+ curl -X POST http://localhost:3000/batch-to-vault \
210
+ -H 'Content-Type: application/json' \
211
+ -d "$(jq -n --slurpfile q questions.json --arg dir ~/research-vault/cnv \
212
+ '{questions: $q[0], notebook_id: "notebook-1", vault_dir: $dir, slug_prefix: "sota", sleep_between_ms: 2000}')"
213
+ ```
214
+
215
+ For batches above ~50 questions, multi-account rotation kicks in automatically when a quota is hit. See [Multi-account rotation](/notebooklm-multi-account).
216
+
217
+ ## `nblm-answer-v1` JSON schema
218
+
219
+ Sidecar `{slug}.json` files conform to this schema. Stable across releases under SemVer; breaking changes will bump the major version.
220
+
221
+ > **Canonical URL** (resolvable, served as `application/schema+json` with CORS): <https://schemas.roomi-fields.com/nblm-answer-v1.json> — fetch from any JSON Schema validator. The version below mirrors the canonical document.
222
+
223
+ ```json
224
+ {
225
+ "$schema": "https://json-schema.org/draft/2020-12/schema",
226
+ "$id": "https://schemas.roomi-fields.com/nblm-answer-v1.json",
227
+ "title": "NotebookLM Answer (nblm-answer-v1)",
228
+ "description": "Structured sidecar payload produced by notebooklm-mcp /batch-to-vault. Encodes a single NotebookLM answer with citations, source positions, and session metadata for typed ingestion by retrieval systems (e.g. RTFM).",
229
+ "type": "object",
230
+ "required": ["type", "version", "asked_at", "question", "answer", "citations", "metadata"],
231
+ "properties": {
232
+ "$schema": { "type": "string", "format": "uri" },
233
+ "type": { "const": "nblm-answer" },
234
+ "version": { "const": "1.0" },
235
+ "asked_at": { "type": "string", "format": "date-time" },
236
+ "session_id": { "type": ["string", "null"] },
237
+ "notebook": {
238
+ "type": "object",
239
+ "properties": {
240
+ "id": { "type": ["string", "null"] },
241
+ "name": { "type": ["string", "null"] },
242
+ "url": { "type": ["string", "null"] }
243
+ }
244
+ },
245
+ "question": { "type": "string" },
246
+ "answer": {
247
+ "type": "object",
248
+ "required": ["text", "format"],
249
+ "properties": {
250
+ "text": { "type": "string" },
251
+ "format": { "const": "markdown" }
252
+ }
253
+ },
254
+ "citations": {
255
+ "type": "array",
256
+ "items": {
257
+ "type": "object",
258
+ "required": ["marker", "number"],
259
+ "properties": {
260
+ "marker": { "type": "string", "description": "Display marker, e.g. \"[1]\"" },
261
+ "number": { "type": "integer", "minimum": 1 },
262
+ "source_name": { "type": ["string", "null"] },
263
+ "source_text": {
264
+ "type": ["string", "null"],
265
+ "description": "Highlighted excerpt from the cited source"
266
+ }
267
+ }
268
+ }
269
+ },
270
+ "metadata": {
271
+ "type": "object",
272
+ "properties": {
273
+ "tags": { "type": "array", "items": { "type": "string" } },
274
+ "extraction_success": { "type": ["boolean", "null"] },
275
+ "citations_count": { "type": "integer", "minimum": 0 },
276
+ "source_names": { "type": "array", "items": { "type": "string" } }
277
+ }
278
+ }
279
+ }
280
+ }
281
+ ```
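A lightweight way to sanity-check a sidecar against this contract without pulling in a validator. This sketch covers only the required keys and `const` fields above; full validation would run a JSON Schema library against the canonical URL:

```python
REQUIRED = {"type", "version", "asked_at", "question", "answer", "citations", "metadata"}

def check_sidecar(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the sidecar looks conformant."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED - payload.keys())]
    if payload.get("type") != "nblm-answer":
        problems.append("type must be 'nblm-answer'")
    if payload.get("version") != "1.0":
        problems.append("version must be '1.0'")
    for i, c in enumerate(payload.get("citations", [])):
        if "marker" not in c or "number" not in c:
            problems.append(f"citation {i}: marker and number are required")
    return problems

sidecar = {
    "type": "nblm-answer", "version": "1.0",
    "asked_at": "2026-05-04T13:30:00Z", "question": "What is the OSBD process?",
    "answer": {"text": "OSBD is...", "format": "markdown"},
    "citations": [{"marker": "[1]", "number": 1}], "metadata": {},
}
print(check_sidecar(sidecar))  # → []
```

Useful as a pre-flight check in an ingestion pipeline before handing files to an indexer.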
282
+
283
+ ## See also
284
+
285
+ - [Run 1 000 questions overnight](/batch-1000-questions) — the larger batch pattern with auto-reauth and rotation
286
+ - [Multi-account rotation](/notebooklm-multi-account) — how quotas and TOTP auto-reauth work
287
+ - [REST API reference](/notebooklm-rest-api) — full endpoint surface (33 endpoints + `/batch-to-vault`)
288
+ - [RTFM on GitHub](https://github.com/roomi-fields/rtfm) — the retrieval layer